
Introduction

The first two parts of the course concern propositional logic. This is, roughly, the logic of combining statements. Here, the defining property of a statement is that it is either true or false. We’ll use statement and proposition interchangeably. So, for instance, ‘7 is a prime’ is a statement. So is ‘the moon is made of cheese’. But ‘is the moon made of cheese?’ and ‘add 8 and 7’ are not: the first is a question, the second an instruction. What about ‘x is prime’? This isn’t a statement, unless we also specify x, in a way that makes sense. (E.g. x = 7, but perhaps not x = the moon.)

In propositional logic, we will have propositional variables to represent statements; these might be called, say, p or q. We combine our propositional variables to form more complicated statements, called propositional terms. We combine them using connectives: ‘and’ (∧), ‘or’ (∨), ‘not’ (¬), ‘implies’ (→), and ‘if and only if’ (↔). When we do this, at first it will be purely syntactic (that is, just formal manipulation of strings of symbols, with nothing to give meaning). When we do add meaning (using valuations), the crucial thing to note is that the truth or falsity of a combined statement, say ‘7 is prime and the moon is made of cheese’, depends only on the truth or falsity of the two statements that are combined. It doesn’t depend in any way on what those two statements actually say.

In the next section, we start making this all precise, with the definition of propositional terms.

1. Propositional terms

Definition 1.1. A propositional language is a nonempty set L of symbols, called propositional variables, usually denoted p, q, r, . . . , p1, p2, . . .. We abbreviate propositional variable by p.v.. We define the set of propositional terms (or propositional formulas) of L, which we write SL, by induction:

• S0L = L,

• Sn+1L = SnL ∪ {(s ∧ t), (s ∨ t), (¬s), (s → t), (s ↔ t): s, t ∈ SnL}.

And then we define

• SL = ⋃_{n≥0} SnL.

We will always assume that L ∩ {∧, ∨, →, ↔, ¬, (, )} = ∅.

Let’s consider the simplest case L = {p}. Then we have S0L = {p}. And S1L = {p, (p∧p), (p∨p), (p → p), (p ↔ p), (¬p)}. And, for instance ((p∧p)∧(p∧p)) ∈ S2L. And so on.

And for instance, with L = {p, q, r}, we have (p ∧ q), (¬r) ∈ S1L (among other things) and ((p ∧ q) ∨ (¬r)) ∈ S2L and so on.
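The inductive construction of the sets SnL can be sketched in code. The following is my own illustration, not part of the notes; terms are represented literally as strings, and the helper name next_level is invented.

```python
# A sketch (not from the notes) of the inductive step in Definition 1.1.
from itertools import product

def next_level(terms):
    """Given S_nL as a set of strings, return S_{n+1}L."""
    new = set(terms)
    for s, t in product(terms, repeat=2):
        for op in "∧∨→↔":
            new.add(f"({s}{op}{t})")
    for s in terms:
        new.add(f"(¬{s})")
    return new

S0 = {"p"}              # the language L = {p}
S1 = next_level(S0)     # the six terms listed above
S2 = next_level(S1)
print(sorted(S1))
print("((p∧p)∧(p∧p))" in S2)   # True
```

Note that the sets grow very quickly: already S2L for L = {p} has well over a hundred terms.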

For the time being, this is completely formal. We will read (p ∧ q) as ‘p and q’, and so on, but so far we haven’t done anything to give these formal strings meaning.

Note that the definition is inductive. And correspondingly, it is often helpful to prove results about terms by induction (which we call induction on complexity of terms).


There are many brackets. This is to avoid any ambiguity (for instance, how should we read p ∨ q ∧ r?). But there is still some possible ambiguity. In simple terms such as (p∧q) or ((p∧q)∨r) we can see which connective is introduced last. This is called the principal connective. And we have a clear sense of how these terms are built up from simpler terms. We can draw trees to represent this. For (p ∧ q) we have:

(p ∧ q)

p q

And for ((p ∧ q) ∨ r) we have:

((p ∧ q) ∨ r)

(p ∧ q) r

p q

For more complicated terms, it isn’t immediately clear that we can tell which connective is the principal connective. For instance, consider the term

(((¬p) → ((p → (¬q)) → (¬r))) → (p → r)).

How can we tell which is the principal connective? We can count brackets! So for the example above we have:

(1(2(3¬p)2 → (3(4p → (5¬q)4)3 → (4¬r)3)2)1→(p → r))

(1(2¬p)1→((p → (¬q)) → (¬r))) (p → r)

(¬p) (1(2p → (3¬q)2)1→(¬r)) p r

p (p → (¬q)) (¬r)

p (¬q) r

q

Here, we keep count of the number of open left brackets, until we get to the connective after we have one open bracket. This will be the principal connective (this will fall out of our proof below). In the tree, the principal connective is underlined.
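The bracket-counting rule just described is easy to mechanise. The following sketch (my own, not from the notes) scans a fully bracketed term and reports the first connective met while exactly one bracket is open.

```python
# A sketch of the bracket-counting rule: track the number of currently open
# brackets; the connective seen at depth exactly 1 is the principal connective.
def principal_connective(term):
    """Return (index, symbol) of the principal connective, or None for a p.v."""
    depth = 0
    for i, c in enumerate(term):
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
        elif depth == 1 and c in "∧∨→↔¬":
            return i, c
    return None

print(principal_connective("((p∧q)∨r)"))                      # (6, '∨')
print(principal_connective("(((¬p)→((p→(¬q))→(¬r)))→(p→r))")) # (23, '→')
```

For a term of the form (¬t) the symbol found is ¬, matching the underlined connectives in the trees above.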

Here is the example from class:

(1(2(3(4¬p)3 ∧ q)2 ∧ (3(4¬r)3 ∨ (4r ∧ p)3)2)1∨((p ∧ r) ∨ (¬q)))

(1(2(3¬p)2 ∧ q)1∧((¬r) ∨ (r ∧ p))) (1(2p ∧ r)1∨(¬q))

(1(2¬p)1∧q) (1(2¬r)1∨(r ∧ p)) (p ∧ r) (¬q)

(¬p) q (¬r) (r ∧ p) p r q

p r r p

We now prove that there is no ambiguity. (Note, for those who used truth tables in 10101 or 10111, you’ve been implicitly assuming the following theorem!)

Theorem 1.2 (Unique readability). Let s ∈ SL. Then exactly one of the following holds.
(a) s ∈ L (that is, s is a p.v.);
(b) s is (¬t) for some t ∈ SL;
(c) s is (t ∧ u) for some t, u ∈ SL;
(d) s is (t ∨ u) for some t, u ∈ SL;
(e) s is (t → u) for some t, u ∈ SL;
(f) s is (t ↔ u) for some t, u ∈ SL.
Moreover, in (b), the term t is unique and in (c)-(f) both t and u are unique.

Proof. To begin, we prove that every term can be written in at least one of these forms. We do this by induction on complexity of terms. First, note that, by definition, every term in S0L has one of the required forms (as they are all p.v.s). This gives the base case of our induction. For the inductive step, suppose that n ≥ 0 and that every s ∈ SnL has one of the above forms. Let s ∈ Sn+1L. Then either s ∈ SnL, in which case we’re done by our inductive hypothesis, or s has one of the required forms by the definition of Sn+1L. This completes the induction and shows that every term is in one of these forms.

The uniqueness step is more difficult. To prove it, we introduce some notation. Suppose that s ∈ SL. We let l(s) = the number of left brackets ( in s, and r(s) = the number of right brackets ) in s. We need the following.

Lemma 1.3. Suppose that s ∈ SL. Then l(s) = r(s).

Proof. This is again by induction on complexity. First suppose that s ∈ S0L. Then s is a p.v. and so l(s) = 0 and r(s) = 0. So the result holds in this case. Suppose that n ≥ 0 and that for all s ∈ SnL we have l(s) = r(s). Suppose that s ∈ Sn+1L \ SnL. Then s has one of the forms (b)-(f).

If s is (¬t) for some t ∈ SnL, then l(s) = l(t) + 1 and r(s) = r(t) + 1. And by our inductive hypothesis, l(t) = r(t), so we have l(s) = r(s).

If s is (t ∧ u), for some t, u ∈ SnL then we have l(s) = l(t) + l(u) + 1 and r(s) = r(t) + r(u) + 1. And again by the inductive hypothesis, we have l(t) = r(t) and l(u) = r(u). So l(s) = r(s). And then there are similar cases for ∨, → and ↔. 

Before continuing with the proof of the unique readability theorem, we need some further terminology. Every propositional term is a string of symbols, each of which is either in L, or in {∧, ∨, ↔, →, ¬}, or ( or ). The terms have a certain particular form, and we refer to a general finite string of these symbols as a word. And if x, y, z are words then xyz is the word obtained by first writing x, then y and then z. If a word x has the form yz then we say that y is a left subword of x, and we say y is a proper left subword of x if z is nonempty. Similarly, z is a right subword of x and is said to be proper if y is nonempty.

Lemma 1.4. Suppose that s is a propositional term and that x is a proper left subword of s. Then either x is empty or l(x) > r(x). In particular, x is not a propositional term.

Proof. This is again by induction on the complexity of terms. Suppose that s ∈ S0L. Then s is a p.v. so if x is a proper left subword of s then x must be empty. So the lemma holds for all s in S0L.

Suppose that the lemma holds for all s ∈ SnL. Let s ∈ Sn+1L \ SnL. If s is (¬t) for some t ∈ SnL and x is a proper left subword of s then either x is empty, x is (, x is (¬, or x is (¬y, for a nonempty left subword y of t. We’re clearly done in the first three of these cases, so suppose that x is (¬y, with y a nonempty left subword of t. By 1.3 and the inductive hypothesis, we have l(y) ≥ r(y). So l(x) = l(y) + 1 ≥ r(y) + 1 > r(y) = r(x) as required.

Next suppose that s is (t∗u) where ∗ ∈ {∧, ∨, →, ↔} and t, u ∈ SnL. And suppose that x is a proper left subword of s. If x is empty we’re done, and if x is ( we’re done. Next suppose that x is (y where y is a nonempty left subword of t. By lemma 1.3 and the inductive hypothesis, l(y) ≥ r(y). And so l(x) = l(y) + 1 > r(y) = r(x), and we’re done. The same argument also works if x is (t∗. Finally, suppose that x is (t ∗ y where y is a nonempty left subword of u. By lemma 1.3 we have l(t) = r(t). And by 1.3 and the inductive hypothesis, we have l(x) = 1 + l(t) + l(y) > r(t) + r(y) = r(x). And this completes the proof of the lemma. 

Similarly, we have a statement for proper right subwords.

Lemma 1.5. Suppose that s is a propositional term and that x is a proper right subword of s. Then either x is empty or r(x) > l(x). In particular, x is not a propositional term.

The proof is very similar to the previous proof, so we omit it. The following result completes the proof of the unique readability theorem.

Proposition 1.6. Suppose that s ∈ SL. Then exactly one of the following holds: s is a p.v., or there is exactly one way of writing s as (¬t), with a unique t ∈ SL, or there is exactly one way of writing s as (t ∗ u) where t, u ∈ SL are unique and ∗ ∈ {∧, ∨, ↔, →} is unique.

Proof. We’re done if s is a p.v., so suppose it isn’t. If s has the form (¬t) for some t ∈ SL then s has (¬ as a left subword. If s has the form (t ∗ u) for some t, u ∈ SL then s has either (p as a left subword, for some p.v. p, or (( as a left subword. So only one of these two cases can happen for a given s, and we can treat the two cases separately. In the first case, it is clear that t is unique. So suppose that s is (t ∗ u) and is (t′ ∗′ u′) for some t, u, t′, u′ ∈ SL and ∗, ∗′ ∈ {∧, ∨, →, ↔}. Suppose for a contradiction that these are not identical ways of writing s. As (t ∗ u) and (t′ ∗′ u′) are both s, we must have that either t is a proper left subword of t′ or that t′ is a proper left subword of t (if t, t′ are identical, then we’re done). Suppose that the first of these happens (the other case is the same). As t is nonempty, we have l(t) > r(t) by lemma 1.4. But t ∈ SL, so by lemma 1.3, l(t) = r(t). This is a contradiction, and completes the proof of unique readability. 
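Unique readability is exactly what makes recursive parsing of terms well-defined: splitting at the principal connective produces one and only one parse tree. Here is a sketch (my own illustration, not part of the notes; the nested-tuple output format is my own convention).

```python
# A sketch of a parser justified by Theorem 1.2: by unique readability, the
# principal connective (found by bracket counting) splits a term in exactly
# one way, so the recursion below is well-defined.
def parse(term):
    """Return a nested-tuple parse tree; a p.v. parses as itself."""
    if not term.startswith("("):
        return term                          # case (a): a propositional variable
    depth = 0
    for i, c in enumerate(term):
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
        elif depth == 1 and c == "¬":
            return ("¬", parse(term[2:-1]))  # case (b): s is (¬t)
        elif depth == 1 and c in "∧∨→↔":
            # cases (c)-(f): split (t ∗ u) at the principal connective
            return (c, parse(term[1:i]), parse(term[i + 1:-1]))
    raise ValueError("not a propositional term")

print(parse("((p∧q)∨(¬r))"))   # ('∨', ('∧', 'p', 'q'), ('¬', 'r'))
```

Lemma 1.4 is what guarantees that no connective can appear at depth one before the principal one, so the loop above never splits in the wrong place.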


2. Valuations

Definition 2.1. Fix a propositional language L, and the corresponding set SL of propositional terms. A valuation on the set of propositional terms is a function v : SL → {T, F} such that for all terms s, t ∈ SL we have

(1) v(¬s) = T if and only if v(s) = F,
(2) v(s ∧ t) = T if and only if v(s) = T and v(t) = T,
(3) v(s ∨ t) = T if and only if v(s) = T or v(t) = T,
(4) v(s → t) = T if and only if v(s) = F or v(t) = T,
(5) v(s ↔ t) = T if and only if v(s) = v(t).

In fact, valuations are determined by the values that they take on the propositional variables. (This is another result you may have implicitly used in the past, for instance when computing truth tables.)

Proposition 2.2. Let L be a propositional language.

(i) If v0 : S0L → {T, F} is any function then there is a unique valuation v : SL → {T, F} such that v(p) = v0(p) for all p ∈ S0L.
(ii) If t is any propositional term, and v, w are valuations agreeing on all the p.v. which occur in t, then v(t) = w(t).

Proof. For (i), let v0 : S0L → {T, F} be any function. We prove, by induction on n, that there is a unique vn : SnL → {T, F} satisfying all the requirements of being a valuation for terms in SnL, such that for all p ∈ S0L we have

vn(p) = v0(p). This is enough, as then we can define v : SL → {T, F} by

v(t) = vn(t) where t ∈ SnL. (This will be well-defined since if t ∈ SnL and t ∈ SmL, and without loss of generality n ≤ m, we have vn(t) = vm(t) by uniqueness.) This v will be a valuation and will satisfy v(p) = v0(p) for p ∈ S0L. It is easy to see that since each vn is unique, the resulting v is also unique.

We now define the vn. For n = 0 there is nothing to do. So suppose that we’re given vn. We need to define vn+1(s) for s ∈ Sn+1L. If s ∈ SnL then we put vn+1(s) = vn(s), so suppose that s ∈ Sn+1L \ SnL. Then s is (¬t) for some t ∈ SnL or (t ∗ u) for some t, u ∈ SnL and ∗ ∈ {∧, ∨, →, ↔}. If s is (¬t) then we put vn+1(s) = T if vn(t) = F and vn+1(s) = F if vn(t) = T. If s is (t ∨ u) we put vn+1(s) = T if and only if vn(t) = T or vn(u) = T. And similarly for the other connectives: we just use the rules of being a valuation. Because we have no choices to make, the resulting vn+1 is unique.

The second part is an exercise. First try proving the following, by induction on complexity of terms: if L′ ⊆ L are sets of p.v. then for every n we have SnL′ ⊆ SnL. Then show that if v′ is a valuation on SL′ and v is a valuation on SL such that v′(p) = v(p) for every p ∈ L′, then v′(t) = v(t) for every t ∈ SL′. From this, the second part follows easily. 

Example. Suppose that L = {p, q}. How many valuations are there on SL? To work this out, note that by the proposition it suffices to count functions L → {T, F}. As L has two elements, there are four such functions and so four valuations.
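Proposition 2.2(i) says that a valuation is nothing more than its restriction to the propositional variables, extended by recursion. The sketch below (my own encoding: a p.v. is a string, (¬s) is ('¬', s), and (s ∗ t) is ('∗', s, t); none of this is from the notes) computes the unique extension and confirms the count of four valuations for L = {p, q}.

```python
# A sketch of Proposition 2.2(i): extend an assignment on the p.v. to all terms.
from itertools import product

def value(term, v0):
    """Extend v0 (a dict assigning True/False to each p.v.) to all terms."""
    if isinstance(term, str):
        return v0[term]
    if term[0] == '¬':
        return not value(term[1], v0)
    op, s, t = term
    a, b = value(s, v0), value(t, v0)
    return {'∧': a and b, '∨': a or b, '→': (not a) or b, '↔': a == b}[op]

# With L = {p, q} there are four functions L -> {T, F}, hence four valuations:
assignments = [dict(zip(['p', 'q'], bits))
               for bits in product([True, False], repeat=2)]
print(len(assignments))                      # 4
t = ('∧', 'p', ('¬', 'q'))                   # the term (p ∧ (¬q))
print([value(t, v0) for v0 in assignments])  # [False, True, False, False]
```

The recursion in value mirrors the inductive construction of vn+1 from vn in the proof above: there are no choices to make, so the extension is unique.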

Definition 2.3. A propositional term t is a tautology if v(t) = T for every valuation v. On the other hand, t is unsatisfiable if v(t) = F for every valuation v.

The idea here is that a tautology is a term that is true ‘in all possible worlds’. Truth tables are a convenient way of summarizing the effect of valuations on a propositional term. Suppose that t is a term in which only propositional variables p1, . . . , pn occur. Then the truth table for t will be of the form

p1 p2 · · · pn t
T  T  · · · T  ?
T  T  · · · F  ?
·  ·  · · · ·  ?
F  F  · · · F  ?

When computing the truth table for a given term t, it is helpful to record the effect of the valuations on the terms that occur in the construction of t. It is also very helpful to be systematic. For a first example, let’s compute the truth table of the term (p → (q → p)):

p q (q → p) (p → (q → p))
T T T       T
T F T       T
F T F       T
F F T       T

So v(p → (q → p)) = T for every valuation v, and hence (p → (q → p)) is a tautology. For another example, suppose we want to compute the truth table of ((¬(p ∨ q)) → (p ∧ q)). Following the advice given above, we find the following.

p q (p ∨ q) (¬(p ∨ q)) (p ∧ q) ((¬(p ∨ q)) → (p ∧ q))
T T T       F          T       T
T F T       F          F       T
F T T       F          F       T
F F F       T          F       F

Notice that the third column and the final column of this truth table are identical. That is, the terms (p ∨ q) and ((¬(p ∨ q)) → (p ∧ q)) are true under exactly the same valuations. When this happens, we say that the terms are logically equivalent.

For a more complicated example, we compute the truth table of the term t given by ((p → (q ∧ r)) ↔ (¬(p ∨ r))). Here it is.

p q r (q ∧ r) (p → (q ∧ r)) (p ∨ r) (¬(p ∨ r)) t
T T T T       T             T       F          F
T T F F       F             T       F          T
T F T F       F             T       F          T
T F F F       F             T       F          T
F T T T       T             T       F          F
F T F F       T             F       T          T
F F T F       T             T       F          F
F F F F       T             F       T          T

Here is the formal definition of logical equivalence.

Definition 2.4. Two propositional terms s, t are said to be logically equivalent if v(s) = v(t) for every valuation v. We write s ≡ t to indicate that s and t are logically equivalent.

For example, p ≡ p ∧ p.

Proposition 2.5. Let s, t be propositional terms. Then s ≡ t if and only if (s ↔ t) is a tautology.

Proof. Suppose s ≡ t. If v is any valuation, then v(s) = v(t) since s ≡ t. And then v(s ↔ t) = T, by the definition of valuations. But v was arbitrary, so (s ↔ t) is a tautology. On the other hand, if (s ↔ t) is a tautology, and v is any valuation, then we have v(s ↔ t) = T. By the definition of valuations, v(s) = v(t). And as v was arbitrary, s ≡ t. 

Definition 2.6. Let S be a set of propositional terms and t a propositional term. We write S |= t and say that t is a logical consequence of S (or that S logically implies t, or even just t follows from S) if, for every valuation v such that v(s) = T for all s ∈ S, we have v(t) = T. (Informally, t follows from S if whenever S is true, t is true.)

For example, let us show that {(p → q), (¬q)} |= (¬p). We can do this using a truth table (though it can be done by other means too).

p q (p → q) (¬q) (¬p)
T T T       F    F
T F F       T    F
F T T       F    T
F F T       T    T

We see that whenever v is a valuation such that v(p → q) = T and v(¬q) = T we have v(¬p) = T (the bottom row is the only relevant valuation). Hence {(p → q), (¬q)} |= (¬p).

For another example, what does ∅ |= t mean? (Recall that ∅ denotes the empty set.) Well, in this case, any valuation at all makes ∅ true, as there are no terms to make true. So ∅ |= t holds if every valuation makes t true, that is, if t is a tautology.

Definition 2.7. Let S be a set of propositional terms. We write v(S) = T if v(s) = T for all s ∈ S. We say that S is tautologous if v(S) = T for every valuation v. And we say that S is satisfiable if v(S) = T for some valuation v. Finally, S is unsatisfiable if for every valuation v there is some s ∈ S such that v(s) = F, i.e. there is no v such that v(S) = T.
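Definitions 2.6 and 2.7 can be checked by brute force over all relevant assignments. The sketch below is my own illustration, not part of the notes; the nested-tuple encoding of terms (a p.v. is a string, (¬s) is ('¬', s), (s ∗ t) is ('∗', s, t)) and the helper names are my own conventions.

```python
# A sketch checking S |= t by brute force: every assignment making all of S
# true must make t true.
from itertools import product

def value(term, v0):
    if isinstance(term, str):
        return v0[term]
    if term[0] == '¬':
        return not value(term[1], v0)
    op, s, t = term
    a, b = value(s, v0), value(t, v0)
    return {'∧': a and b, '∨': a or b, '→': (not a) or b, '↔': a == b}[op]

def variables(term):
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(x) for x in term[1:]))

def entails(S, t):
    """S |= t per Definition 2.6."""
    vs = sorted(set(variables(t)).union(*(variables(s) for s in S)))
    return all(value(t, dict(zip(vs, bits)))
               for bits in product([True, False], repeat=len(vs))
               if all(value(s, dict(zip(vs, bits))) for s in S))

# {(p → q), (¬q)} |= (¬p), as in the truth table above:
print(entails([('→', 'p', 'q'), ('¬', 'q')], ('¬', 'p')))   # True
# ∅ |= t holds exactly when t is a tautology:
print(entails([], ('∨', 'p', ('¬', 'p'))))                  # True
```

Note that entails([], t) implements the observation about ∅ |= t: with no terms in S, every assignment counts, so t must be a tautology.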

We also need some notation. We will abbreviate expressions such as S ∪ {t1, . . . , tn} |= u by S, t1, . . . , tn |= u.

Proposition 2.8. Let S ⊆ SL and t, t′, u ∈ SL. Then
(i) S |= t if and only if S ∪ {¬t} is unsatisfiable;
(ii) S, t |= u if and only if S |= (t → u);
(iii) S, t, t′ |= u if and only if S, (t ∧ t′) |= u.

Proof. (i) We have: S ∪ {¬t} is unsatisfiable if and only if for all valuations v we have v(s) = F for some s ∈ S or v(¬t) = F. This holds if and only if for all valuations v, if v(s) = T for all s ∈ S then v(¬t) = F, which is equivalent to: for all valuations v, if v(s) = T for all s ∈ S then v(t) = T, and this is the definition of S |= t.

(ii) S, t |= u if and only if for all valuations v, if v(S) = T and v(t) = T then v(u) = T. This holds if and only if for all valuations v with v(S) = T we have v(t → u) = T, which is the definition of S |= (t → u).
(iii) S, t ∧ t′ |= u if and only if for all valuations v, if v(S) = T and v(t ∧ t′) = T then v(u) = T. This holds if and only if for all valuations v, if v(S) = T and v(t) = T and v(t′) = T then v(u) = T, which is the definition of S, t, t′ |= u. 

It can often be quicker to reason directly with valuations than to use truth tables. Above, we showed using a truth table that (p → q), (¬q) |= (¬p). We now do this again, arguing directly using valuations. So suppose that v(p → q) = T and v(¬q) = T. Then (v(p) = F or v(q) = T) and v(¬q) = T. So v(p) = F and so v(¬p) = T.

Before the next section we do some more examples of reasoning with valuations. First, from the 2018 paper: is the following statement true for all propositional terms s and t? If s ∧ t |= ¬t and s ∧ ¬t |= t then s is unsatisfiable. To see whether this is true, we try to prove it and see if we get stuck. Suppose, in the hope of finding a contradiction, that s ∧ t |= ¬t and s ∧ ¬t |= t and that s is satisfiable. Then there is a valuation v such that v(s) = T. We have either v(t) = T or v(t) = F. If v(t) = T then since s ∧ t |= ¬t we have v(¬t) = T, so that v(t) = F, a contradiction. So we must have v(t) = F. But then v(¬t) = T and since s ∧ ¬t |= t we have v(t) = T, another contradiction. So our original assumption that v(s) = T is wrong, and there is no such v, so s is indeed unsatisfiable. So the statement is true.

Next consider the following. If s ∧ t |= ¬t and s ∧ ¬t |= ¬t then s is unsatisfiable. We start as before, and assume that this hypothesis holds and that s is satisfiable. So v(s) = T for some valuation v. As before we get a contradiction if v(t) = T. Now though there is no obvious contradiction when v(t) = F. This suggests looking for a counterexample.
Trying simple terms is a good idea. So let s be a propositional variable p. We know that if v(s) = T then we can’t have v(t) = T (if our assumptions are to hold), so let’s try taking t to be ¬p. Then we do have s ∧ t |= ¬t. And clearly we have s ∧ ¬t |= ¬t. But s is satisfiable. So the statement is false, and taking s as a p.v. p and t as ¬p gives a counterexample.

Notation. In order to keep the notation under control, we now drop some of the brackets. We omit:
• The (·) associated to (¬p) for p ∈ L.
• The (·) associated to the principal connective.
So for example we now write (¬p ∧ q) → (¬r ∨ ¬q) in place of (((¬p) ∧ q) → ((¬r) ∨ (¬q))). Here are some more examples:
(1) (¬p ∧ q) → (p ∧ ¬q) is really (((¬p) ∧ q) → (p ∧ (¬q)))
(2) (¬(¬p ∧ q)) → (¬r ∨ q) is really ((¬((¬p) ∧ q)) → ((¬r) ∨ q))

3. Normal forms

Recall that we say that terms s, t ∈ SL are logically equivalent if, for all valuations v, we have v(s) = v(t).

Lemma 3.1. Suppose that s, t ∈ SL. The following are equivalent.
(i) s ≡ t,
(ii) s |= t and t |= s,
(iii) ∅ |= s ↔ t,
(iv) s ↔ t is a tautology.

We’ve proved these already. But if you’re at all unsure, it is worth having a go at doing them yourself as an exercise.

The relation ≡ has the following properties:
(1) ≡ is reflexive, that is, s ≡ s for all s ∈ SL,
(2) ≡ is symmetric, that is, if s ≡ t then t ≡ s,
(3) ≡ is transitive, that is, if s ≡ t and t ≡ u then s ≡ u.
This means that logical equivalence is an equivalence relation and so SL can be partitioned into equivalence classes. So we can write SL = E1 ∪ E2 ∪ · · · , with a possibly infinite union, where for each i we have s, t ∈ Ei if and only if s ≡ t, and for i ≠ j we have Ei ∩ Ej = ∅. Note that it makes sense to talk about the value of a valuation on one of these equivalence classes.

For example, suppose that L = {p}, a language with exactly one propositional variable. Since two terms are logically equivalent if and only if they have the same column in their truth tables, finding all the logical equivalence classes of SL is the same as finding the possible truth tables. For our L the possible truth tables are given by the columns t1, . . . , t4 below.

p t1 t2 t3 t4
T T  T  F  F
F T  F  T  F

We can find terms in SL with these columns, for instance p ∨ ¬p, p, ¬p, and p ∧ ¬p respectively. So there are exactly four equivalence classes, and these terms are representatives of these four equivalence classes.

Here are some basic logical equivalences, where s, t, u ∈ SL.
• s ∧ t ≡ t ∧ s,
• s ∨ t ≡ t ∨ s,
• ¬(s ∨ t) ≡ ¬s ∧ ¬t,
• ¬(s ∧ t) ≡ ¬s ∨ ¬t,
• ¬¬s ≡ s,
• s → t ≡ ¬s ∨ t,
• (s ∧ t) ∧ u ≡ s ∧ (t ∧ u) (so we can unambiguously write s ∧ t ∧ u),
• (s ∨ t) ∨ u ≡ s ∨ (t ∨ u) (so we can unambiguously write s ∨ t ∨ u),
• (s ∧ t) ∨ u ≡ (s ∨ u) ∧ (t ∨ u),
• (s ∨ t) ∧ u ≡ (s ∧ u) ∨ (t ∧ u),
• s ∧ s ≡ s ∨ s ≡ s,
• (s ∧ t) ∨ (¬s ∧ t) ≡ t,
• (s ∧ t) ∨ s ≡ s.
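Each equivalence in the list amounts to the claim that both sides take the same truth value under every assignment, so each can be machine-checked by exhausting the assignments. A small sketch (mine, not part of the notes) for a few of them:

```python
# A brute-force check of some of the listed equivalences over all assignments
# to s, t, u (True/False standing in for T/F).
from itertools import product

def check_equivalences():
    for s, t, u in product([True, False], repeat=3):
        assert (not (s or t)) == ((not s) and (not t))       # ¬(s∨t) ≡ ¬s∧¬t
        assert ((s and t) or u) == ((s or u) and (t or u))   # (s∧t)∨u ≡ (s∨u)∧(t∨u)
        assert ((s and t) or ((not s) and t)) == t           # (s∧t)∨(¬s∧t) ≡ t
    return True

print(check_equivalences())  # True
```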

These can all be checked directly by arguing with valuations. We just do the last one. If v(s) = T then clearly v((s ∧ t) ∨ s) = T. So suppose that v((s ∧ t) ∨ s) = T. Then either v(s) = T, and we’re done, or v(s ∧ t) = T, so that v(s) = T and we’re done.

The following proposition says that we can ‘substitute in’ logically equivalent terms without changing the logical equivalence class.

Proposition 3.2. Suppose that s, s′, t, t′ ∈ SL and that s ≡ s′ and t ≡ t′. Then
(1) ¬s ≡ ¬s′,
(2) s ∧ t ≡ s′ ∧ t′,
(3) s ∨ t ≡ s′ ∨ t′,
(4) s → t ≡ s′ → t′.

Proof. We prove the last one (the others are left to you). Suppose v is a valuation. We have v(s → t) = F if and only if v(s) = T and v(t) = F. Since s ≡ s′ and t ≡ t′, this holds if and only if v(s′) = T and v(t′) = F. And this is equivalent to v(s′ → t′) = F. So s → t ≡ s′ → t′. 

We introduce some notation, which we will use a fair bit. We define ⋀_{i=1}^{n} and ⋁_{i=1}^{n} by induction on n, as follows. We set ⋀_{i=1}^{1} si = s1 and
⋀_{i=1}^{k+1} si = (⋀_{i=1}^{k} si) ∧ s_{k+1},
and ⋁_{i=1}^{1} si = s1 and
⋁_{i=1}^{k+1} si = (⋁_{i=1}^{k} si) ∨ s_{k+1}.
These behave exactly as we might expect.

Proposition 3.3. Suppose that s1, . . . , sn ∈ SL and that v is a valuation. Then
(1) v(⋀_{i=1}^{n} si) = T if and only if v(si) = T for all i = 1, . . . , n,
(2) v(⋁_{i=1}^{n} si) = T if and only if v(si) = T for some i = 1, . . . , n,
(3) ⋀_{i=1}^{n} si ≡ ¬⋁_{i=1}^{n} ¬si,
(4) ⋁_{i=1}^{n} si ≡ ¬⋀_{i=1}^{n} ¬si.

Proof. (1) By induction on n. The case n = 1 is trivial, so suppose that v(⋀_{i=1}^{n} si) = T if and only if v(si) = T for all i = 1, . . . , n. By the definition of ⋀ we have v(⋀_{i=1}^{n+1} si) = v((⋀_{i=1}^{n} si) ∧ s_{n+1}). So v(⋀_{i=1}^{n+1} si) = T if and only if v((⋀_{i=1}^{n} si) ∧ s_{n+1}) = T. This holds if and only if v(⋀_{i=1}^{n} si) = T and v(s_{n+1}) = T, by the definition of valuations. And this holds if and only if v(si) = T for i = 1, . . . , n and v(s_{n+1}) = T, by the induction hypothesis. And that is what we wanted.
(2) Very similar to the previous one.
(3) For any valuation v we have v(⋀_{i=1}^{n} si) = T
if and only if v(si) = T for i = 1, . . . , n, by (1),
if and only if v(¬si) = F for i = 1, . . . , n,
if and only if v(⋁_{i=1}^{n} ¬si) = F, by (2),
if and only if v(¬⋁_{i=1}^{n} ¬si) = T.
(4) Similar to the previous one. 

From this, we can easily conclude the following.

Corollary 3.4. Suppose that s1, . . . , sn, t1, . . . , tm ∈ SL are such that {s1, . . . , sn} = {t1, . . . , tm} (think about what this means). Then
⋁_{i=1}^{n} si ≡ ⋁_{j=1}^{m} tj
and
⋀_{i=1}^{n} si ≡ ⋀_{j=1}^{m} tj.

Definition 3.5. (1) We call any p.v. p, or the negation ¬p of a p.v., a literal.
(2) A term t is in disjunctive normal form (which we abbreviate dnf) if it has the form
⋁_{i=1}^{n} ⋀_{j=1}^{mi} gi,j

where each gi,j is a literal.

Theorem 3.6 (The disjunctive normal form theorem). Suppose that t is a propositional term. Then there is a propositional term s in dnf such that t ≡ s.

Moreover, if {p1, . . . , pk} are the p.v. occurring in t then we may take s to have the form
⋁_{i=1}^{n} ⋀_{j=1}^{mi} gi,j
where n ≤ 2^k and mi ≤ k.

Proof. Let v1, . . . , vn be all the valuations v on SL, where L = {p1, . . . , pk}, such that v(t) = T. We can assume that n ≥ 1, for if not then t is unsatisfiable, and logically equivalent to p ∧ ¬p, which is in disjunctive normal form. For each i = 1, . . . , n and j = 1, . . . , k let
gi,j = pj if vi(pj) = T, and gi,j = ¬pj if vi(pj) = F.
Then for any valuation v, we have v(⋀_{j=1}^{k} gi,j) = T if and only if v = vi. Now let s be the term
⋁_{i=1}^{n} ⋀_{j=1}^{k} gi,j.
Then if v is any valuation on SL, we have v(s) = T if and only if v(⋀_{j=1}^{k} gi,j) = T for some i, and this holds if and only if v = vi for some i, which holds if and only if v(t) = T.

So s ≡ t, as we wanted. Finally, we evidently have mi ≤ k (as in the term we’ve found they are in fact all equal to k), and n ≤ 2^k since there are only 2^k valuations on SL. 

The proof of the dnf theorem gives us a method to find a dnf for a term t: first we find all valuations v such that v(t) = T, then we take conjunctions of literals as in the proof of the theorem. Our dnf is given by taking disjunctions of these. For example, consider the term t given by (p → (q∧r))∧(p∨r). To use this method to find a dnf for t, we first compute all the valuations v on p, q, r such that v(t) = T. To do this, we can use a truth table.

p q r (p → (q ∧ r)) (p ∨ r) t
T T T T             T       T
T T F F             T       F
T F T F             T       F
T F F F             T       F
F T T T             T       T
F T F T             F       F
F F T T             T       T
F F F T             F       F

Now for each valuation v such that v(t) = T, we write down the corresponding conjunction of literals. So, for the first row, we take p ∧ q ∧ r, for the fifth row, we take ¬p ∧ q ∧ r and for the seventh row we take ¬p ∧ ¬q ∧ r. Our dnf is then the disjunction of all of these, that is, our dnf for t is
(p ∧ q ∧ r) ∨ (¬p ∧ q ∧ r) ∨ (¬p ∧ ¬q ∧ r).

An alternative way to compute a dnf for t is to directly manipulate the term using the various logical equivalences. We have:
t ≡ (p → (q ∧ r)) ∧ (p ∨ r)
≡ (¬p ∨ (q ∧ r)) ∧ (p ∨ r)    [use s → t ≡ ¬s ∨ t]
≡ (¬p ∧ (p ∨ r)) ∨ ((q ∧ r) ∧ (p ∨ r))    [use (s ∨ t) ∧ u ≡ (s ∧ u) ∨ (t ∧ u)]
≡ (¬p ∧ p) ∨ (¬p ∧ r) ∨ (q ∧ r ∧ p) ∨ (q ∧ r ∧ r)    [and again]
≡ (¬p ∧ r) ∨ (q ∧ r ∧ p) ∨ (q ∧ r ∧ r)    [use (p ∧ ¬p) ∨ t ≡ t]
≡ (¬p ∧ r) ∨ (q ∧ r ∧ p) ∨ (q ∧ r)    [use t ∧ t ≡ t]
≡ (¬p ∧ r) ∨ (q ∧ r)    [use (t ∧ s) ∨ s ≡ s]
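The truth-table method from the proof of Theorem 3.6 can be sketched in code: one conjunction of literals per satisfying row. The tuple encoding of terms and the helper names below are my own conventions, not the notes’.

```python
# A sketch of the dnf method: list, for each assignment making the term true,
# the conjunction of literals describing that assignment.
from itertools import product

def value(term, v0):
    if isinstance(term, str):
        return v0[term]
    if term[0] == '¬':
        return not value(term[1], v0)
    op, s, t = term
    a, b = value(s, v0), value(t, v0)
    return {'∧': a and b, '∨': a or b, '→': (not a) or b, '↔': a == b}[op]

def dnf(term, pvs):
    """Return a dnf as a list of conjuncts; each conjunct is a list of literals."""
    rows = []
    for bits in product([True, False], repeat=len(pvs)):
        v0 = dict(zip(pvs, bits))
        if value(term, v0):
            rows.append([p if v0[p] else '¬' + p for p in pvs])
    return rows

t = ('∧', ('→', 'p', ('∧', 'q', 'r')), ('∨', 'p', 'r'))  # (p → (q∧r)) ∧ (p∨r)
for conj in dnf(t, ['p', 'q', 'r']):
    print(' ∧ '.join(conj))
# p ∧ q ∧ r
# ¬p ∧ q ∧ r
# ¬p ∧ ¬q ∧ r
```

The three printed conjuncts are exactly the first, fifth and seventh rows of the truth table, recovering the dnf found there.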

We can also go directly from the dnf we found using the truth table to this one, as follows:
t ≡ (p ∧ q ∧ r) ∨ (¬p ∧ q ∧ r) ∨ (¬p ∧ ¬q ∧ r)    [from the truth table]
≡ (p ∧ q ∧ r) ∨ (¬p ∧ r)    [using (s ∧ t) ∨ (¬s ∧ t) ≡ t]
≡ (q ∧ r) ∨ (¬p ∧ r)    [using (s ∧ t ∧ u) ∨ (¬s ∧ t) ≡ (t ∧ u) ∨ (¬s ∧ t)]

We’ve now found lots of different dnfs for t. In particular, normal forms are not unique!

For another example, suppose that t is the term (p ∧ q) → (¬p ∨ r). Again we first compute the truth table for t.

p q r (p ∧ q) (¬p) (¬p ∨ r) t
T T T T       F    T        T
T T F T       F    F        F
T F T F       F    T        T
T F F F       F    F        T
F T T F       T    T        T
F T F F       T    T        T
F F T F       T    T        T
F F F F       T    T        T

From the truth table, we see that only one valuation v makes t false, namely the valuation determined by v(p) = T, v(q) = T and v(r) = F. So for each other valuation, we write down a conjunction of literals as in the proof of the theorem. For instance, for the valuation determined by v(p) = v(q) = v(r) = T we take the conjunction p ∧ q ∧ r.

And for v determined by v(p) = v(r) = T and v(q) = F we take p ∧ ¬q ∧ r. Doing this for each of the seven valuations involved, and taking disjunctions, we get the following dnf for t:
(p∧q∧r)∨(p∧¬q∧r)∨(p∧¬q∧¬r)∨(¬p∧q∧r)∨(¬p∧q∧¬r)∨(¬p∧¬q∧r)∨(¬p∧¬q∧¬r).
We can simplify this a lot. First, using the rule (s ∧ t) ∨ (¬s ∧ t) ≡ t three times, on the last four terms, t is equivalent to
(p ∧ q ∧ r) ∨ (p ∧ ¬q ∧ r) ∨ (p ∧ ¬q ∧ ¬r) ∨ ¬p.
Using the same rule on the second and third terms, this is equivalent to
(p ∧ q ∧ r) ∨ (p ∧ ¬q) ∨ ¬p.
Then using s ∨ (¬s ∧ t) ≡ s ∨ t (on the second and third terms), this is equivalent to
(p ∧ q ∧ r) ∨ ¬q ∨ ¬p
and a similar argument shows that this is equivalent to r ∨ ¬q ∨ ¬p.

Another way to see this, directly from the truth table, is to note that ¬t ≡ p ∧ q ∧ ¬r. So ¬¬t ≡ ¬(p ∧ q ∧ ¬r), and this is equivalent to ¬p ∨ ¬q ∨ r. Similar ideas will be used in proving the next result.

Definition 3.7. A term s is in conjunctive normal form (cnf) if s has the form
⋀_{i=1}^{n} ⋁_{j=1}^{mi} gi,j
where each gi,j is a literal.

Theorem 3.8. Suppose that t is a propositional term. Then there is a propositional term s in cnf such that s ≡ t. If {p1, . . . , pk} are the p.v. occurring in t then we can take s in cnf (as in the definition) with n ≤ 2^k and mi ≤ k.

Proof. This follows from the dnf theorem. By that theorem, there is a dnf for ¬t, so we can write
¬t ≡ ⋁_{i=1}^{n} ⋀_{j=1}^{mi} gi,j
with literals gi,j. Now,
t ≡ ¬¬t ≡ ¬(⋁_{i=1}^{n} ⋀_{j=1}^{mi} gi,j).
Using the various equivalences listed at the start of this section, we see that this is equivalent to
⋀_{i=1}^{n} ⋁_{j=1}^{mi} ¬gi,j.

And each ¬gi,j is either a literal, or we can cancel a ¬¬ to make it a literal. So we’re done. 

As with the dnf theorem, the proof of the cnf theorem gives us a method to find cnfs. For instance, suppose that t is the term (p ∧ q) → (¬p ∨ r). Following the proof of the cnf theorem, we first find a dnf for ¬t. We do this by direct manipulation using logical equivalences, but it could also be done using a truth table. We have
¬t ≡ ¬((p ∧ q) → (¬p ∨ r))
≡ ¬((¬(p ∧ q)) ∨ (¬p ∨ r))
≡ (¬¬(p ∧ q)) ∧ (¬(¬p ∨ r))
≡ (p ∧ q) ∧ (p ∧ ¬r)
≡ p ∧ q ∧ ¬r.
So ¬t ≡ p ∧ q ∧ ¬r. Now, again following the proof of the cnf theorem, we negate both sides of this:
¬¬t ≡ ¬(p ∧ q ∧ ¬r) ≡ ¬p ∨ ¬q ∨ ¬¬r ≡ ¬p ∨ ¬q ∨ r
and we have found a cnf for t.

Finally, we find a cnf for the term t given by (p → (q ∧ r)) ∧ (p ∨ r) (which we considered in the examples of dnfs). First we find a dnf for ¬t:
¬t ≡ ¬((p → (q ∧ r)) ∧ (p ∨ r))
≡ (¬(p → (q ∧ r))) ∨ (¬(p ∨ r))
≡ (p ∧ ¬(q ∧ r)) ∨ (¬p ∧ ¬r)
≡ (p ∧ (¬q ∨ ¬r)) ∨ (¬p ∧ ¬r)
≡ (p ∧ ¬q) ∨ (p ∧ ¬r) ∨ (¬p ∧ ¬r).
This is a dnf, so
t ≡ ¬¬t ≡ ¬((p ∧ ¬q) ∨ (p ∧ ¬r) ∨ (¬p ∧ ¬r))
≡ (¬(p ∧ ¬q)) ∧ (¬(p ∧ ¬r)) ∧ (¬(¬p ∧ ¬r))
≡ (¬p ∨ q) ∧ (¬p ∨ r) ∧ (p ∨ r),
which is in cnf. Alternatively, we could use the truth table for t (see earlier) to compute a dnf for ¬t, and then get the following cnf for t:
(¬p ∨ ¬q ∨ r) ∧ (¬p ∨ q ∨ ¬r) ∧ (¬p ∨ q ∨ r) ∧ (p ∨ ¬q ∨ r) ∧ (p ∨ q ∨ r).
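The truth-table route to a cnf can be sketched in code: each falsifying row of t gives one conjunct of literals for ¬t, and negating that conjunct with De Morgan flips every literal, giving a disjunctive clause. The encoding and helper names below are my own, not the notes’.

```python
# A sketch of the cnf method from the proof of Theorem 3.8: one disjunction
# of (flipped) literals per falsifying row of the term.
from itertools import product

def value(term, v0):
    if isinstance(term, str):
        return v0[term]
    if term[0] == '¬':
        return not value(term[1], v0)
    op, s, t = term
    a, b = value(s, v0), value(t, v0)
    return {'∧': a and b, '∨': a or b, '→': (not a) or b, '↔': a == b}[op]

def cnf(term, pvs):
    """Return a cnf as a list of clauses; each clause is a list of literals."""
    clauses = []
    for bits in product([True, False], repeat=len(pvs)):
        v0 = dict(zip(pvs, bits))
        if not value(term, v0):
            # negate the conjunction describing this row: De Morgan flips literals
            clauses.append(['¬' + p if v0[p] else p for p in pvs])
    return clauses

t = ('→', ('∧', 'p', 'q'), ('∨', ('¬', 'p'), 'r'))  # (p ∧ q) → (¬p ∨ r)
for cl in cnf(t, ['p', 'q', 'r']):
    print(' ∨ '.join(cl))
# ¬p ∨ ¬q ∨ r
```

The single falsifying row p = q = T, r = F yields the single clause ¬p ∨ ¬q ∨ r, matching the cnf computed above by direct manipulation.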

4. Adequate sets of connectives

The proof of the dnf theorem shows more than might appear to be the case. Let L = {p1, . . . , pk} and let Val_{p1,...,pk} be the set of all valuations on L. We can think of a truth table as a function Val_{p1,...,pk} → {T, F}. We can also associate such a function to a term t ∈ SL via v ↦ v(t). The proof of the dnf theorem shows that all functions Val_{p1,...,pk} → {T, F} can be obtained as the functions induced by terms.

We can also ask what happens if we have fewer connectives, or perhaps new connectives. Can we still represent all functions Val_{p1,...,pk} → {T, F}?

Definition 4.1. We say that a set S of propositional connectives is adequate if, for every propositional term t (in any number of propositional variables), there is a term t′, constructed only using the connectives in S, which is logically equivalent to t.

Here, by logically equivalent we mean: defines the same function Val_{p1,...,pk} → {T, F} (with p1, . . . , pk the p.v. that occur in t). Informally, we think of this as meaning ‘has the same truth table’. Note that we will use connectives differing from those we’ve already met. To proceed properly we should define the set of terms for these new connectives, prove unique readability and so on. We won’t do this, but it is important to know that it could be done! Here are some examples.

(1) The set {∧, ∨, ¬} is adequate. This follows immediately from the dnf theorem.

(2) The set {∧, ¬} is adequate. We can use s ∨ t ≡ ¬(¬s ∧ ¬t) to eliminate each occurrence of ∨ from a dnf for a given term, and thus find a logically equivalent term in which only ¬ and ∧ occur.

(3) Similarly, the set {∨, ¬} is adequate.

(4) Next we introduce a new connective. We define the binary connective | by:

p q | p|q
T T |  F
T F |  T
F T |  T
F F |  T

This is called NAND, or the Sheffer stroke. To show that {|} is adequate, we show that we can represent ∧ and ¬. This is enough, as we already know that {∧, ¬} is adequate. First, note that the term p|p is equivalent to ¬p:

p | p|p
T |  F
F |  T

But from the truth table for |, we can see that p|q ≡ ¬(p ∧ q). So p ∧ q ≡ ¬(p|q) ≡ (p|q)|(p|q), and the latter term uses only |. So {|} represents both ∧ and ¬, and so is adequate.

Proving that a set of connectives is not adequate is more difficult. The idea is to experiment, and try to spot a pattern; then prove that this pattern does indeed hold, by induction on complexity of terms; then exhibit a term that doesn’t obey this pattern, and so can’t be obtained from the set of connectives in question. Hopefully an example will help. Suppose that we take S = {∧, ∨}. Experimenting with terms formed using only these connectives, we might get the feeling that these connectives are positive in some way. And in fact, we can’t represent negation using these connectives.

To show this, we first show that if t is a term using only {∧, ∨} and v is the valuation giving all propositional variables the value T, then v(t) = T. This is ‘the pattern’, and we prove it by induction on complexity. In the base case, t is a propositional variable, and v(t) = T by the choice of v. Inductively, suppose that t is either s ∧ u or s ∨ u, and v(s) = v(u) = T, where v is the valuation giving all propositional variables the value T. Then, by the definition of valuation, v(t) = T, and that completes the proof by induction. Now we take a term not fitting this pattern: ¬p. If v is the valuation giving all propositional variables the value T, then v(¬p) = F, so by the pattern ¬p can’t be represented using only ∧ and ∨. Hence {∧, ∨} is not adequate.
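Both the adequacy of {|} and the non-adequacy of {∧, ∨} can be checked by brute force. The Python sketch below (my own encoding) verifies the NAND equivalences on all valuations, then computes every truth function in two variables buildable from {∧, ∨}, confirming that each outputs T on the all-T valuation, so that ¬p is not among them.

```python
from itertools import product

def nand(a, b):
    return not (a and b)

# p|p ≡ ¬p, (p|q)|(p|q) ≡ p ∧ q, and also (p|p)|(q|q) ≡ p ∨ q
for p, q in product([True, False], repeat=2):
    assert nand(p, p) == (not p)
    assert nand(nand(p, q), nand(p, q)) == (p and q)
    assert nand(nand(p, p), nand(q, q)) == (p or q)

# all truth functions over (p, q) obtainable from {∧, ∨}, as tuples of values
rows = list(product([True, False], repeat=2))    # rows[0] is the all-T valuation
funcs = {tuple(p for p, q in rows), tuple(q for p, q in rows)}
while True:
    new = set(funcs)
    for f in funcs:
        for g in funcs:
            new.add(tuple(a and b for a, b in zip(f, g)))
            new.add(tuple(a or b for a, b in zip(f, g)))
    if new == funcs:
        break
    funcs = new

assert all(f[0] for f in funcs)                   # 'the pattern': T on the all-T row
assert tuple(not p for p, q in rows) not in funcs # so ¬p is not representable
print(len(funcs), "functions from {∧,∨}, none computing ¬p")
```

The closure stabilises at just four functions (p, q, p ∧ q, p ∨ q), all true on the all-T row.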

5. Interpolation

Given that

¬((p1 ∧ p2) → p3) |= p4 → p1,

we might wonder whether there is a term t in which only p1 occurs such that

¬((p1 ∧ p2) → p3) |= t and t |= p4 → p1.

Such a term is called an interpolant, and the interpolation theorem tells us that these exist, apart from in trivial cases.

In our particular example, it is easy to see that we can take t to be p1.

Theorem 5.1 (The interpolation theorem). Let L1,L2 be propositional languages. Suppose that s ∈ SL1 and t ∈ SL2 and that s |= t. Then either s is unsatisfiable, or t is a tautology, or there is a u ∈ SL3, where L3 = L1 ∩ L2, such that s |= u and u |= t.

Proof. We suppose that s is not unsatisfiable, and that t is not a tautology, and that s |= t, and find a u as in the conclusion. Let v1 be a valuation on SL1 such that v1(s) = T and let v2 be a valuation on SL2 such that v2(t) = F. Suppose, for a contradiction, that L3 = L1 ∩ L2 is empty. Then we can define a valuation v on S(L1 ∪ L2) by

v(p) = v1(p) if p ∈ L1, and v(p) = v2(p) if p ∈ L2.

Then we would have v(s) = v1(s) = T and v(t) = v2(t) = F. But we have assumed that s |= t, so there is no valuation w such that w(s) = T and w(t) = F, and we have a contradiction. So L3 ≠ ∅. By the dnf theorem, we can find a term in dnf which is logically equivalent to s, say

n  l m  _ ^i ^i  gi,j ∧ hi,k i=1 j=1 k=1 where gi,j ∈ SL1 \ SL3 and hi,k ∈ SL3 are literals. Without loss of generality, none of the disjuncts are unsatisfiable (any that are can be removed, and nothing changes). Let u be the term n m _ ^i hi,k. i=1 k=1

Then u ∈ SL3. Let v be a valuation on SL1 such that v(s) = T. Then for some i we have v( ⋀_{j=1}^{l_i} g_{i,j} ∧ ⋀_{k=1}^{m_i} h_{i,k} ) = T. So v( ⋀_{k=1}^{m_i} h_{i,k} ) = T for this i, and so v(u) = T. Hence s |= u.

Now we show that u |= t. Suppose that v is a valuation on SL2 such that v(u) = T. By definition of u, we have some i0 such that v( ⋀_{k=1}^{m_{i0}} h_{i0,k} ) = T. Define a valuation w on L1 ∪ L2 by:

w(p) = v(p) if p ∈ L2;
w(p) = T if p is g_{i0,j} for some j;
w(p) = F if ¬p is g_{i0,j} for some j;
w(p) = T if p ∈ L1 \ L2 and p doesn’t occur in the i0th disjunct.

Then  li mi  ^0 ^0 w  gi0,j ∧ hi0,k = T. j=1 k=1

So w(s) = T, and hence w(t) = T. But w agrees with v on all terms in SL2, so v(t) = T, and u |= t. ∎

Again the proof gives a method for finding interpolants. First we find a dnf for the left hand side, then we remove the literals whose p.v. is not in L3. For instance, given L = {p, q, r, s} and told that

((p ∨ q) → r) ∧ (r → p) |= s → (p ∨ ¬q)

we can try to find an interpolant involving only p, q. (In the setting of the theorem, L1 = {p, q, r}, L2 = {p, q, s}, so L3 = {p, q}, and it is easy to check that we are not in the trivial cases.) Following the proof of the theorem, we first find a dnf for the left hand side (any method for finding this is fine!). We have

((p ∨ q) → r) ∧ (r → p) ≡ (¬(p ∨ q) ∨ r) ∧ (¬r ∨ p)
≡ ((¬p ∧ ¬q) ∨ r) ∧ (¬r ∨ p)
≡ (¬p ∨ r) ∧ (¬q ∨ r) ∧ (¬r ∨ p)
≡ (¬p ∨ r) ∧ ((¬q ∧ ¬r) ∨ (¬q ∧ p) ∨ (r ∧ ¬r) ∨ (r ∧ p))
≡ (¬p ∧ ¬q ∧ ¬r) ∨ (¬p ∧ ¬q ∧ p) ∨ (¬p ∧ r ∧ p) ∨ (r ∧ ¬q ∧ ¬r) ∨ (r ∧ ¬q ∧ p) ∨ (r ∧ p)
≡ (¬p ∧ ¬q ∧ ¬r) ∨ (r ∧ p).

Now that we’ve found the dnf, the proof of interpolation tells us to just remove the literals involving r. So we get p ∨ (¬p ∧ ¬q) as an interpolant. (Note that this is equivalent to p ∨ ¬q, so this latter term also works. Interpolants, like dnfs, are not unique.)

For a second example, suppose that we know

(p1 → (¬p2 ∧ p3)) ∧ (p1 ∨ (p2 ∧ ¬p3)) |= ((p3 → p2) → p4) ∨ (¬p4 → (p2 ∧ ¬p3)).

Then we can find an interpolant involving only p2 and p3. We first find a dnf for the left hand side. We could do this using truth tables, but instead we compute directly. We have

(p1 → (¬p2 ∧ p3)) ∧ (p1 ∨ (p2 ∧ ¬p3)) ≡ (¬p1 ∨ (¬p2 ∧ p3)) ∧ (p1 ∨ (p2 ∧ ¬p3))

≡ ((¬p1 ∨ (¬p2 ∧ p3)) ∧ p1) ∨ ((¬p1 ∨ (¬p2 ∧ p3)) ∧ (p2 ∧ ¬p3))

≡ (¬p1 ∧ p1) ∨ (¬p2 ∧ p3 ∧ p1) ∨ (¬p1 ∧ p2 ∧ ¬p3) ∨ (¬p2 ∧ p3 ∧ p2 ∧ ¬p3)

≡ (¬p2 ∧ p3 ∧ p1) ∨ (¬p1 ∧ p2 ∧ ¬p3)

To get our interpolant, we remove the literals not in L3, so those involving neither p2 nor p3, and we get (¬p2 ∧ p3) ∨ (p2 ∧ ¬p3).
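Interpolants can be sanity-checked by brute force over all valuations of the variables involved. The Python sketch below (encodings mine) verifies the second example: the left hand side entails the candidate interpolant, which in turn entails the right hand side.

```python
from itertools import product

def imp(a, b):
    return (not a) or b

VARS = ['p1', 'p2', 'p3', 'p4']
# lhs: (p1 → (¬p2 ∧ p3)) ∧ (p1 ∨ (p2 ∧ ¬p3))
lhs = lambda v: imp(v['p1'], (not v['p2']) and v['p3']) and \
                (v['p1'] or (v['p2'] and not v['p3']))
# rhs: ((p3 → p2) → p4) ∨ (¬p4 → (p2 ∧ ¬p3))
rhs = lambda v: imp(imp(v['p3'], v['p2']), v['p4']) or \
                imp(not v['p4'], v['p2'] and not v['p3'])
# the interpolant found above: (¬p2 ∧ p3) ∨ (p2 ∧ ¬p3)
u = lambda v: ((not v['p2']) and v['p3']) or (v['p2'] and not v['p3'])

for vals in product([True, False], repeat=4):
    v = dict(zip(VARS, vals))
    assert imp(lhs(v), u(v))   # lhs |= u
    assert imp(u(v), rhs(v))   # u |= rhs
print("(¬p2 ∧ p3) ∨ (p2 ∧ ¬p3) is an interpolant")
```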

6. Deductive systems

In the first part of the course, we have introduced (among other things) a notion of ‘follows’: S |= t. This was semantic, that is, involving valuations (so assigning meaning). In this section, we will introduce a purely syntactic notion S ⊢ t. We will do this using a formal system (or calculus) for deductions (i.e. formal proofs). This can be thought of as a kind of toy model of real proofs. We will prove two theorems, the Soundness Theorem (easy) and the Completeness Theorem (hard!), which show that the semantic and syntactic notions coincide.

The idea of a formal calculus for deductions is that a term t is derivable from a set of terms S (perhaps empty) if there is a sequence of ‘proof steps’, each of which has one of the following forms:

(1) we use an axiom, one of a list of terms we agree are always allowed;
(2) we state an assumption from our set S;
(3) we use a rule of inference on some of the earlier steps.

There are many possible choices for setting this up. Our system is a ‘Hilbert-style’ system, with lots of axioms and few rules. We first describe the axioms.

Definition 6.1. The axiom schemas of our calculus are all propositional terms of one of the following forms:

(i) s → (t → s)
(ii) (r → (s → t)) → ((r → s) → (r → t))
(iii) ¬¬s → s
(iv) (¬s → ¬t) → (t → s)

where r, s, t are any propositional terms (in the propositional variables of our language). An axiom is an instance of an axiom schema.

For instance, the term (p ∧ q) → ((¬p ∧ q) → (p ∧ q)) is an axiom, an instance of the first schema, with s equal to p ∧ q and t equal to ¬p ∧ q. The term ((p ∨ q) → (((p ∧ q) → r) → (r ∧ ¬q))) → (((p ∨ q) → ((p ∧ q) → r)) → ((p ∨ q) → (r ∧ ¬q))) is an axiom, an instance of the second schema (with r the term p ∨ q, s the term (p ∧ q) → r, and t the term r ∧ ¬q). Finally, for instance, ¬¬(p ∨ p ∨ q) → (p ∨ p ∨ q) is an instance of the third schema.

Definition 6.2. A deduction is a finite sequence S ⊢ t1, . . . , S ⊢ tn, where S ⊆ SL and t1, . . . , tn ∈ SL are such that for each i = 1, . . . , n one of the following holds:

(1) ti is an axiom,
(2) ti ∈ S,
(3) there are j, k < i such that tj is tk → ti.

We will write S ⊢ t if there is a deduction S ⊢ t1, . . . , S ⊢ tn, where t is tn. We read S ⊢ t as ‘S proves t’.

When we give deductions we label them, to explain why they are deductions. We number each line, and write abbreviations to the side to indicate the justification for each line. We write LA followed by (i), (ii), (iii) or (iv) to indicate that we’ve used a logical axiom. We write NLA to indicate that we’ve used a non-logical axiom (so something from the set S), and we write MPj,k to indicate that we’ve used modus ponens (item (3) on the list) with lines j and k. For instance, the following is a deduction. (Here S is the empty set.)

(1) ⊢ p → ((p → p) → p)  LA(i)
(2) ⊢ (p → ((p → p) → p)) → ((p → (p → p)) → (p → p))  LA(ii)
(3) ⊢ (p → (p → p)) → (p → p)  MP1,2
(4) ⊢ p → (p → p)  LA(i)
(5) ⊢ p → p  MP3,4

Here is a second example of a deduction, this time with S = {p → q, q → r}.

(1) S ⊢ q → r  NLA
(2) S ⊢ (q → r) → (p → (q → r))  LA(i)
(3) S ⊢ p → (q → r)  MP1,2
(4) S ⊢ (p → (q → r)) → ((p → q) → (p → r))  LA(ii)
(5) S ⊢ (p → q) → (p → r)  MP3,4
(6) S ⊢ p → q  NLA
(7) S ⊢ p → r  MP5,6

Note that it can be very difficult to find deductions in this calculus. I don’t expect you to be able to find deductions, but I do expect you to be able to label correct deductions.

Lemma 6.3. Suppose that T is a set of propositional terms, S ⊆ T and that t is a propositional term such that S ⊢ t. Then T ⊢ t.

Proof. By assumption, we have a deduction S ⊢ t1, . . . , S ⊢ tn, where tn is t. It follows that T ⊢ t1, . . . , T ⊢ tn is also a deduction, since S ⊆ T. So T ⊢ t. ∎
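Definition 6.2 is completely mechanical, so a small program can check labelled deductions. Below is a Python sketch (the encoding of terms as nested tuples, and all function names, are mine): it pattern-matches the four axiom schemas, checks each modus ponens step, and is run on the second example deduction above.

```python
def Imp(a, b): return ('imp', a, b)
def Var(x): return ('var', x)

def is_axiom(u):
    """Does u match one of the four axiom schemas?"""
    if u[0] != 'imp':
        return False
    a, b = u[1], u[2]
    if b[0] == 'imp' and b[2] == a:                          # (i)  s → (t → s)
        return True
    if (a[0] == 'imp' and a[2][0] == 'imp' and               # (ii)
            b[0] == 'imp' and b[1][0] == 'imp' and b[2][0] == 'imp' and
            a[1] == b[1][1] == b[2][1] and
            a[2][1] == b[1][2] and a[2][2] == b[2][2]):
        return True
    if a[0] == 'not' and a[1][0] == 'not' and a[1][1] == b:  # (iii) ¬¬s → s
        return True
    if (a[0] == 'imp' and a[1][0] == 'not' and a[2][0] == 'not' and  # (iv)
            b[0] == 'imp' and b[1] == a[2][1] and b[2] == a[1][1]):
        return True
    return False

def check(S, lines):
    """lines: list of (term, label); label is 'LA', 'NLA' or ('MP', j, k)."""
    for i, (t, label) in enumerate(lines, start=1):
        if label == 'LA':
            assert is_axiom(t), f"line {i}: not an axiom"
        elif label == 'NLA':
            assert t in S, f"line {i}: not a nonlogical axiom"
        else:
            _, j, k = label
            tj, tk = lines[j - 1][0], lines[k - 1][0]
            assert j < i and k < i and (tj == Imp(tk, t) or tk == Imp(tj, t)), \
                f"line {i}: bad modus ponens"
    return True

p, q, r = Var('p'), Var('q'), Var('r')
S = {Imp(p, q), Imp(q, r)}
deduction = [
    (Imp(q, r), 'NLA'),
    (Imp(Imp(q, r), Imp(p, Imp(q, r))), 'LA'),
    (Imp(p, Imp(q, r)), ('MP', 1, 2)),
    (Imp(Imp(p, Imp(q, r)), Imp(Imp(p, q), Imp(p, r))), 'LA'),
    (Imp(Imp(p, q), Imp(p, r)), ('MP', 3, 4)),
    (Imp(p, q), 'NLA'),
    (Imp(p, r), ('MP', 5, 6)),
]
assert check(S, deduction)
print("the deduction S ⊢ p → r checks out")
```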

We now move on to the easier of the two theorems explaining the relationship between S ⊢ t and S |= t.

Theorem 6.4 (The soundness theorem). Suppose that S is a set of propositional terms and that t is a propositional term. If S ⊢ t then S |= t.

Note the special case that S = ∅. Before we prove the soundness theorem, we prove a special case of that special case: Lemma 6.5. Suppose that u is a logical axiom. Then u is a tautology.

Proof. We have four cases, corresponding to the four axiom schemas. We do the most involved case. (We’ve done the second and fourth in tutorials, the first in lectures, and the third is easy. Here we redo the second.) Suppose that u is (r → (s → t)) → ((r → s) → (r → t)) where r, s, t are propositional terms. For a contradiction, suppose that u is not a tautology. Then there is a valuation, v say, such that v(u) = F. So v(r → (s → t)) = T and v((r → s) → (r → t)) = F. From the latter, v(r → s) = T and v(r → t) = F; from v(r → t) = F, v(r) = T and v(t) = F, and then v(s) = T. But then v(s → t) = F, so v(r → (s → t)) = F, contradicting v(r → (s → t)) = T. So u is indeed a tautology. ∎
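Since the truth value of a schema instance depends only on the truth values of r, s, t, Lemma 6.5 can also be confirmed by a brute-force truth-table check, one row per assignment to r, s, t. A quick sketch:

```python
from itertools import product

def imp(a, b):
    return (not a) or b

schemas = [
    lambda r, s, t: imp(s, imp(t, s)),                                  # (i)
    lambda r, s, t: imp(imp(r, imp(s, t)), imp(imp(r, s), imp(r, t))),  # (ii)
    lambda r, s, t: imp(not (not s), s),                                # (iii)
    lambda r, s, t: imp(imp(not s, not t), imp(t, s)),                  # (iv)
]
for f in schemas:
    assert all(f(r, s, t) for r, s, t in product([True, False], repeat=3))
print("all four axiom schemas are tautologies")
```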

Proof of the Soundness Theorem. Suppose that S ⊢ t, and that v is a valuation such that v(S) = T. We need to prove that v(t) = T, so that we can conclude S |= t. By our assumption, we have a deduction S ⊢ t1, . . . , S ⊢ tn, where tn is t. We will prove by induction on i that v(ti) = T.

First, suppose that i = 1. The first line of our deduction, S ⊢ t1, must be either a logical axiom or a nonlogical axiom. In the first case, by Lemma 6.5, t1 is a tautology, so we’re done. In the second case t1 ∈ S, so we’re done.

For the inductive step, suppose that v(tj) = T for all j < i. If the ith step is a logical axiom or a nonlogical axiom, we can proceed as above. So suppose that the ith step is an application of modus ponens. Then there are j, k < i such that tj is tk → ti. Inductively, we have v(tk) = T and v(tk → ti) = T. It then follows that v(ti) = T. This completes the inductive step, and so the proof of the Soundness Theorem. ∎

7. The deduction theorem

Before proving the completeness theorem, we prove a result which is useful for constructing deductions. We will need a lemma.

Lemma 7.1. For any set S of propositional terms and any term s we have S ⊢ s → s.

Proof. Let s be any term. The following deduction shows that ⊢ s → s.

1. ⊢ s → ((s → s) → s)  LA(i)
2. ⊢ (s → ((s → s) → s)) → ((s → (s → s)) → (s → s))  LA(ii)
3. ⊢ (s → (s → s)) → (s → s)  MP1,2
4. ⊢ s → (s → s)  LA(i)
5. ⊢ s → s  MP3,4

Since the empty set is a subset of any set, we then have S ⊢ s → s by Lemma 6.3. ∎

Theorem 7.2 (The Deduction Theorem). Suppose that S is a set of propositional terms and s and t are propositional terms. Then S ∪ {s} ⊢ t if and only if S ⊢ s → t.

Note that this captures a property of informal proofs: if we want to prove that some proposition P implies another Q, we would often assume P and show Q. The deduction theorem tells us that this is captured in our formal system.

Proof. First, the easy direction. Suppose S ⊢ s → t. Then there is a deduction S ⊢ t1, . . . , S ⊢ tn where tn is s → t. We can change this to a deduction S ∪ {s} ⊢ t1, . . . , S ∪ {s} ⊢ tn as in the proof of Lemma 6.3. We then add two lines at the end of this deduction:

n. S ∪ {s} ⊢ s → t
n+1. S ∪ {s} ⊢ s  NLA
n+2. S ∪ {s} ⊢ t  MPn,n+1

To show that if S ∪ {s} ⊢ t then S ⊢ s → t, we suppose that S ∪ {s} ⊢ t. So we have a deduction S ∪ {s} ⊢ t1, . . . , S ∪ {s} ⊢ tn, where tn is t. We will show, by induction on i, that there is a deduction of S ⊢ s → ti.

Suppose that i = 1. The line S ∪ {s} ⊢ t1 is either a logical axiom or a nonlogical axiom. Suppose that it is a logical axiom. Then we can use the following deduction.

1. S ⊢ t1  LA

2. S ⊢ t1 → (s → t1)  LA(i)

3. S ⊢ s → t1  MP1,2

Here, LA is whichever logical axiom we used in line 1 of the original deduction. If the first line is a nonlogical axiom, that is t1 ∈ S ∪ {s}, then there are two cases: t1 ∈ S, and t1 is s. In the first case, we use the following.

1. S ⊢ t1  NLA

2. S ⊢ t1 → (s → t1)  LA(i)

3. S ⊢ s → t1  MP1,2

In the second case we use the previous lemma, which showed that S ⊢ s → s, to supply the deduction.

Now suppose that i > 1 and that for each j < i we have a deduction with last line S ⊢ s → tj. We need to find a deduction of S ⊢ s → ti. The ith line in our original deduction is either a logical axiom, a nonlogical axiom or an application of modus ponens. In the first two cases, we can proceed as in the case i = 1. So suppose that there are j, k < i such that tj is tk → ti. We have

⋮
j. S ∪ {s} ⊢ tk → ti
⋮
k. S ∪ {s} ⊢ tk
⋮
i. S ∪ {s} ⊢ ti  MPj,k

By our induction hypothesis, we can find a deduction

⋮
nj. S ⊢ s → (tk → ti)
⋮
nk. S ⊢ s → tk
⋮
ni−1. S ⊢ s → ti−1

We add the following lines at the end:

ni−1+1. S ⊢ (s → (tk → ti)) → ((s → tk) → (s → ti))  LA(ii)
ni−1+2. S ⊢ (s → tk) → (s → ti)  MPnj,ni−1+1
ni−1+3. S ⊢ s → ti  MPnk,ni−1+2

This completes the proof. ∎
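The hard direction of the proof is effective: it tells us how to rewrite a deduction of S ∪ {s} ⊢ t, line by line, into a deduction of S ⊢ s → t. The Python sketch below (encoding and names are mine; a label ('MP', j, k) means line j is (line k) → (current line)) implements exactly the three cases of the proof, and is run on the deduction of {p → q, q → r} ⊢ p → r from Section 6, moving q → r across the turnstile.

```python
def Imp(a, b): return ('imp', a, b)

def proof_of_self_imp(s):
    """The five lines proving ⊢ s → s (Lemma 7.1)."""
    ss = Imp(s, s)
    l1 = Imp(s, Imp(ss, s))
    return [(l1, 'LA(i)'),
            (Imp(l1, Imp(Imp(s, ss), ss)), 'LA(ii)'),
            (Imp(Imp(s, ss), ss), ('MP', 2, 1)),
            (Imp(s, ss), 'LA(i)'),
            (ss, ('MP', 3, 4))]

def deduction_theorem(s, lines):
    """Turn a deduction of S ∪ {s} ⊢ t into one of S ⊢ s → t."""
    out, loc = [], {}        # loc[i] = new line number proving s → t_i
    def emit(term, label):
        out.append((term, label))
        return len(out)
    for i, (t, label) in enumerate(lines, start=1):
        st = Imp(s, t)
        if label == 'NLA' and t == s:           # t_i is s itself: use ⊢ s → s
            base = len(out)
            for term, lab in proof_of_self_imp(s):
                if isinstance(lab, tuple):
                    lab = ('MP', lab[1] + base, lab[2] + base)
                emit(term, lab)
            loc[i] = len(out)
        elif label in ('LA', 'NLA'):            # axiom cases
            a = emit(t, label)
            b = emit(Imp(t, st), 'LA(i)')
            loc[i] = emit(st, ('MP', b, a))
        else:                                   # modus ponens case
            _, j, k = label
            tk = lines[k - 1][0]
            a = emit(Imp(Imp(s, Imp(tk, t)), Imp(Imp(s, tk), st)), 'LA(ii)')
            b = emit(Imp(Imp(s, tk), st), ('MP', a, loc[j]))
            loc[i] = emit(st, ('MP', b, loc[k]))
    return out

p, q, r = 'p', 'q', 'r'
pq, qr = Imp(p, q), Imp(q, r)
old = [
    (qr, 'NLA'),
    (Imp(qr, Imp(p, qr)), 'LA'),
    (Imp(p, qr), ('MP', 2, 1)),
    (Imp(Imp(p, qr), Imp(pq, Imp(p, r))), 'LA'),
    (Imp(pq, Imp(p, r)), ('MP', 4, 3)),
    (pq, 'NLA'),
    (Imp(p, r), ('MP', 5, 6)),
]
new = deduction_theorem(qr, old)
for i, (t, lab) in enumerate(new, start=1):     # every MP step is valid
    if isinstance(lab, tuple):
        _, j, k = lab
        assert j < i and k < i and new[j - 1][0] == Imp(new[k - 1][0], t)
assert new[-1][0] == Imp(qr, Imp(p, r))
print(len(new), "lines, ending in S ⊢ (q → r) → (p → r)")
```

The 7-line input becomes a 23-line output, which is why deductions built via the deduction theorem are so much shorter to write down.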



We now allow ourselves to use the deduction theorem in deductions. We indicate this by

n. S ⊢ s → t
n+1. S ∪ {s} ⊢ t  DT

or

n. S ∪ {s} ⊢ t
n+1. S ⊢ s → t  DT

Note that formally these are not deductions. But the deduction theorem tells us that the required deductions exist, so we can proceed as though they are. For the next few lemmas, we fix some propositional terms s and t.

Lemma 7.3. ⊢ s → (¬s → t)

Proof. Here is the required deduction.

1. {s, ¬s} ⊢ ¬s → (¬t → ¬s)  LA(i)
2. {s, ¬s} ⊢ ¬s  NLA
3. {s, ¬s} ⊢ ¬t → ¬s  MP1,2
4. {s, ¬s} ⊢ (¬t → ¬s) → (s → t)  LA(iv)
5. {s, ¬s} ⊢ s → t  MP3,4
6. {s, ¬s} ⊢ s  NLA
7. {s, ¬s} ⊢ t  MP5,6
8. {s} ⊢ ¬s → t  DT
9. ⊢ s → (¬s → t)  DT  ∎

Lemma 7.4. ⊢ (s → ¬s) → ¬s.

Proof. Here is the required deduction.

1. {s → ¬s} ⊢ ¬¬s → s  LA(iii)
2. {s → ¬s, ¬¬s} ⊢ s  DT
3. {s → ¬s, ¬¬s} ⊢ s → ¬s  NLA
4. {s → ¬s, ¬¬s} ⊢ ¬s  MP2,3
5. {s → ¬s, ¬¬s} ⊢ s → (¬s → ¬(s → s))  Lemma 7.3
6. {s → ¬s, ¬¬s} ⊢ ¬s → ¬(s → s)  MP2,5
7. {s → ¬s, ¬¬s} ⊢ ¬(s → s)  MP4,6
8. {s → ¬s} ⊢ ¬¬s → ¬(s → s)  DT
9. {s → ¬s} ⊢ (¬¬s → ¬(s → s)) → ((s → s) → ¬s)  LA(iv)
10. {s → ¬s} ⊢ (s → s) → ¬s  MP8,9
11. {s → ¬s} ⊢ s → s  Lemma 7.1
12. {s → ¬s} ⊢ ¬s  MP10,11
13. ⊢ (s → ¬s) → ¬s  DT  ∎

Lemma 7.5. ⊢ s → ¬¬s.

Proof. Here is the deduction.

1. ⊢ ¬¬¬s → ¬s  LA(iii)
2. ⊢ (¬¬¬s → ¬s) → (s → ¬¬s)  LA(iv)
3. ⊢ s → ¬¬s  MP1,2  ∎

Lemma 7.6. ⊢ ¬s → (s → t).

Lemma 7.7. ⊢ s → (¬t → ¬(s → t)).

8. The completeness theorem

Definition 8.1. A set S of propositional terms is consistent if there is some propositional term t such that there is no deduction of t from S. And S is inconsistent if S ⊢ t for every term t.

Lemma 8.2. Let S be a set of propositional terms and let s be a term. Then S ∪ {s} is inconsistent if and only if S ⊢ ¬s.

Proof. Suppose S ∪ {s} is inconsistent. Then there is a deduction showing S ∪ {s} ⊢ ¬s. We extend this as follows:

n. S ∪ {s} ⊢ ¬s
n+1. S ⊢ s → ¬s  DT
n+2. S ⊢ (s → ¬s) → ¬s  Lemma 7.4
n+3. S ⊢ ¬s  MPn+1,n+2

For the other direction, suppose that S ⊢ ¬s and let t be any term. We extend a deduction showing S ⊢ ¬s as follows.

n. S ∪ {s} ⊢ ¬s
n+1. S ∪ {s} ⊢ s  NLA
n+2. S ∪ {s} ⊢ s → (¬s → t)  Lemma 7.3
n+3. S ∪ {s} ⊢ ¬s → t  MPn+1,n+2
n+4. S ∪ {s} ⊢ t  MPn,n+3  ∎

We will prove completeness in the following form. Theorem 8.3 (Completeness Theorem v.1). Suppose that S is a consistent set of propositional terms. Then there is a valuation v such that v(S) = T.

We show how this implies the following.

Theorem 8.4 (Completeness Theorem v.2). Let S be a set of propositional terms and t a term. If S |= t then S ⊢ t.

Proof of v.2 from v.1. Suppose that it is not the case that S ⊢ t, which we write as S ⊬ t. We show that S ⊭ t (that is, it is not the case that S |= t). First, we have that S ⊬ ¬¬t (if S ⊢ ¬¬t then S ⊢ t, using the axiom ¬¬t → t and modus ponens). By Lemma 8.2 the set S ∪ {¬t} is consistent. By version 1 of Completeness, there is a valuation v, say, such that v(S ∪ {¬t}) = T. Then v(S) = T and v(t) = F, so S ⊭ t, as needed. ∎

The two versions of Completeness are equivalent. To see this we prove version 1 assuming version 2.

Proof of v.1 from v.2. Suppose that S is a consistent set of terms but that there is no valuation v such that v(S) = T. Let t be an arbitrary term. Since S is unsatisfiable, we have S |= t. By version 2 of completeness, S ⊢ t. But t was arbitrary, so S is inconsistent, a contradiction. ∎

Before we prove version 1, we give an important application of completeness. First we need some lemmas.

Lemma 8.5. A set S of propositional terms is inconsistent if and only if there is a term s such that S ⊢ ¬(s → s).

Proof. For the nontrivial direction, suppose that S ⊢ ¬(s → s) for some s and let t be any term. Then we have a deduction as follows.

n. S ⊢ ¬(s → s)
n+1. S ⊢ s → s  Lemma 7.1
n+2. S ⊢ (s → s) → (¬(s → s) → t)  Lemma 7.3
n+3. S ⊢ ¬(s → s) → t  MPn+1,n+2
n+4. S ⊢ t  MPn,n+3  ∎

Lemma 8.6. Suppose that S is a set of terms, t a term and that S ⊢ t. Then there is a finite set S′ ⊆ S such that S′ ⊢ t.

Proof. Let S ⊢ t1, . . . , S ⊢ tn be a deduction of S ⊢ t. Let S′ be the set of s ∈ S that occur as a nonlogical axiom. Then S′ is finite and S′ ⊢ t1, . . . , S′ ⊢ tn is a valid deduction. ∎

Theorem 8.7 (The compactness theorem, v.1). Suppose that S is a set of propositional terms. Then S is satisfiable if and only if every finite subset S′ ⊆ S is satisfiable.

Proof. For the nontrivial direction, suppose that S is unsatisfiable. By version 1 of the completeness theorem, S is inconsistent. Let s be any term. Then S ⊢ ¬(s → s), since S is inconsistent. By the previous lemma, there is a finite set S′ ⊆ S such that S′ ⊢ ¬(s → s). By Lemma 8.5 the set S′ is inconsistent. So by the soundness theorem, S′ is unsatisfiable. (To see this last step, suppose that S′ were satisfiable. Then there would be a valuation v such that v(S′) = T. Now, S′ ⊢ ¬(s → s), so by Soundness, S′ |= ¬(s → s), and then v(¬(s → s)) = T, which is impossible.) ∎

Theorem 8.8 (The compactness theorem, v.2). Suppose that S is a set of terms and that t is a term. Then S |= t if and only if there is a finite set S′ ⊆ S such that S′ |= t.

Again this is equivalent to v.1.

Proof of Compactness v.2 from v.1. For the difficult direction, suppose that S |= t but that for every finite S′ ⊆ S we have S′ ⊭ t. Then for every finite S′ ⊆ S there is a valuation v such that v(S′) = T and v(t) = F. So every finite subset of S ∪ {¬t} is satisfiable. By compactness version 1, S ∪ {¬t} is satisfiable. So there is a valuation v such that v(S ∪ {¬t}) = T, i.e., v(S) = T and v(t) = F, contradicting S |= t. ∎

For the converse, see the exercise sheets. For an exercise, prove this using completeness (both versions).

Here is an example of an application of compactness. Suppose that L, L′ are propositional languages and that S ⊆ SL and S′ ⊆ SL′ are such that S ∪ S′ is unsatisfiable. Show that there exist t ∈ SL and t′ ∈ SL′ such that S |= t and S′ |= t′ and t |= ¬t′.

To see this, we first apply compactness v.1 to the set S ∪ S′. This gives us a finite set T ⊆ S ∪ S′ such that T is unsatisfiable. Let T = {s1, . . . , sn, s′1, . . . , s′m} where si ∈ S and s′j ∈ S′ and n, m ≥ 1. (If there were none from, say, S′, we could always add a term from S′.) Then there is no valuation v such that v(si) = T and v(s′j) = T for all i, j. So the set {s1 ∧ · · · ∧ sn, s′1 ∧ · · · ∧ s′m} is unsatisfiable. So if v(s1 ∧ · · · ∧ sn) = T then v(s′1 ∧ · · · ∧ s′m) = F. Hence with t taken as s1 ∧ · · · ∧ sn and t′ as s′1 ∧ · · · ∧ s′m, we have t |= ¬t′. And clearly S |= t and S′ |= t′, so we’re done.

For a second example, suppose that S = {si : i ∈ N} is a set of terms such that si |= sj if and only if j ≤ i. Show that S is satisfiable.

To prove this, suppose that S is unsatisfiable. By compactness v.1 there is a finite S′ ⊆ S which is unsatisfiable. Let n be the largest integer such that sn ∈ S′. If v is a valuation such that v(sn) = T then by our assumption we have v(si) = T for all i ≤ n, and then v(S′) = T, a contradiction. So there is no valuation v such that v(sn) = T. But then sn |= sn+1, contradicting our assumption. So S is satisfiable.

We now return to proving completeness. We need one final lemma.

Lemma 8.9. Suppose that S is a set of terms and that s is a term. If S ⊢ s and S ⊢ ¬s then S is inconsistent.

Proof. Let t be any term. By Lemma 7.3 we have S ⊢ s → (¬s → t). By our assumption, S ⊢ s and S ⊢ ¬s. We then get S ⊢ t by applying modus ponens twice. ∎

Proof of v.1 of the Completeness theorem. We want to show that if S is consistent then there is a valuation v such that v(S) = T. Suppose that a consistent set of terms S is given. We will assume that the set L of p.v. is countable. It follows that the set SL of terms is countably infinite, so we can write SL = {t0, t1, . . .}.

We inductively define sets Tn of terms, for n ≥ 0, as follows:

T0 = S, and

Tn+1 = Tn ∪ {tn} if Tn ⊢ tn, and Tn+1 = Tn ∪ {¬tn} otherwise.

Let T = ⋃_{n≥0} Tn. We will show that T is a maximal consistent set of terms. That is, T is consistent and any set T′ ≠ T such that T ⊆ T′ is inconsistent.

First, we show that each Tn is consistent. For T0 this is our assumption, so suppose that Tn is consistent. If Tn ⊢ tn then by Lemma 8.9 we have Tn ⊬ ¬tn, so by Lemma 8.2 the set Tn+1 = Tn ∪ {tn} is consistent. If Tn ⊬ tn then Tn ⊬ ¬¬tn (from Tn ⊢ ¬¬tn we would get Tn ⊢ tn, using the axiom ¬¬tn → tn and modus ponens), so by Lemma 8.2 the set Tn+1 = Tn ∪ {¬tn} is consistent. So each Tn is consistent.

If T is inconsistent, then T ⊢ ¬(t → t) for some term t. By Lemma 8.6 there is a finite T′ ⊆ T such that T′ ⊢ ¬(t → t). Since T = ⋃ Tn, there is some n such that T′ ⊆ Tn, and so Tn ⊢ ¬(t → t). By Lemma 8.5 the set Tn is inconsistent, a contradiction. So T is consistent after all.

To show that T is maximal consistent, let t be any term. Then t is tn for some n, so either t ∈ T or ¬t ∈ T. So if t ∉ T then ¬t ∈ T, and then by Lemma 8.9 the set T ∪ {t} is inconsistent. So T is a maximal consistent set of terms.

We aim to define a valuation v by v(t) = T if and only if t ∈ T. But we need some claims. First, we show that if T ⊢ t then t ∈ T. To see this, suppose that T ⊢ t but that t ∉ T. Then T ∪ {t} is inconsistent, so by Lemma 8.2 we have T ⊢ ¬t. But then by Lemma 8.9, T is inconsistent, a contradiction.

Second, we show that s → t ∈ T if and only if ¬s ∈ T or t ∈ T. To see this, first suppose that ¬s ∈ T. Then by Lemma 7.6 we have T ⊢ s → t, and then by the first claim s → t ∈ T. If t ∈ T then it is an exercise using the first claim to show that s → t ∈ T. For the other direction, suppose that ¬s ∉ T and that t ∉ T. Since for every term r either r ∈ T or ¬r ∈ T, we have s ∈ T and ¬t ∈ T. By Lemma 7.7 we have T ⊢ ¬(s → t), so ¬(s → t) ∈ T by the first claim, and so s → t ∉ T.

Finally, we define a function v on the set of terms by v(t) = T if and only if t ∈ T. We want to show that v is a valuation. It is sufficient to check the ¬ and → clauses, since the set {¬, →} is adequate. The clause for → is exactly the second claim above, so we only need to check the clause for ¬. If v(t) = T then t ∈ T, so ¬t ∉ T, and then v(¬t) = F. And if v(t) = F then t ∉ T, so ¬t ∈ T and v(¬t) = T. This valuation is such that v(T) = T. And since S ⊆ T, we have v(S) = T, as we wanted. ∎
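The construction of T can be watched in miniature. In the Python sketch below (the setup is entirely my own), we work over the two variables p, q, replace the test ‘Tn ⊢ tn’ by the equivalent satisfiability test ‘Tn ∪ {¬tn} is unsatisfiable’ (legitimate once soundness and completeness are known), and use a short list of terms in place of the enumeration of all of SL. Since the list includes each variable, the resulting valuation can be read off, and it satisfies every term placed in T.

```python
from itertools import product

VARS = ['p', 'q']

def ev(t, v):                        # terms built from variables, ¬ and →
    if isinstance(t, str):
        return v[t]
    if t[0] == 'not':
        return not ev(t[1], v)
    return (not ev(t[1], v)) or ev(t[2], v)       # ('imp', a, b)

def satisfiable(ts):
    return any(all(ev(t, dict(zip(VARS, row))) for t in ts)
               for row in product([True, False], repeat=len(VARS)))

# a small stand-in for the enumeration t0, t1, ... of SL
terms = ['p', 'q', ('imp', 'p', 'q'), ('not', 'q'),
         ('imp', ('not', 'p'), 'q'), ('not', ('imp', 'q', 'p'))]

T = [('imp', 'p', ('not', 'q'))]                  # T0 = S, a consistent set
for t in terms:
    # Tn ⊢ tn  iff  Tn ∪ {¬tn} is unsatisfiable (soundness + completeness)
    T.append(t if not satisfiable(T + [('not', t)]) else ('not', t))

v = {x: x in T for x in VARS}                     # v(t) = T iff t ∈ T, on the variables
assert all(ev(t, v) for t in T)
print("the valuation read off from T satisfies all of T")
```

Here the construction decides p and q negatively (neither is forced by S = {p → ¬q}, so ¬p and then ¬q are added), and the final valuation makes every member of T true, just as in the proof.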

9. Predicate logic: the basic language

We begin by giving an overview, which is meant to give ideas rather than to be precise. Predicate logic has much more expressive power than the propositional logic we studied earlier. We will be able to form statements about ‘sets with structure’. In particular, we will be able to talk about elements of sets. We will have variable symbols x, y, x1, x2, . . . , y1, y2, . . . which, when interpreted, range over a given set. We will also have quantifiers ∀, ∃ (‘for all’ and ‘there exists’). So a formula in predicate logic could, for example, begin ∀x∃y · · · . What we are allowed to include in the formula will depend on the structure we are studying. We will have different languages for different types of structure. We give a language by choosing certain symbols which stand for constants, certain symbols standing for relations and certain symbols standing for functions.

Let’s give some examples. Our sets-with-structure will always include = (a 2-ary, or binary, relation) and we will always have a binary relation symbol in our language, also denoted by =. We might also have, for instance, an ordering, e.g. ≤ in the integers Z or the real numbers R. This is another binary relation, and if we wanted to study it we could include another binary relation symbol in our language. If we wanted to talk about arithmetic in Z or R we could add binary function symbols, to be interpreted as + and ×, and perhaps constant symbols to be interpreted as 0 and 1.

Before getting onto these, we start with the basic language. So, with the overview over, the basic language L0 has the following symbols:

(1) the propositional connectives ∧, ∨, →, ↔, ¬ and brackets (, );
(2) countably many variable symbols x, y, z, . . . , v0, v1, . . .;
(3) quantifier symbols ∃, ∀;
(4) the equality symbol =.

(To be more formal, we should distinguish the relation symbol = from the actual relation =, but we will not do this.)

A term of L0 is just a variable, and the free variable of a term is the variable that makes up the term: fv(x) = {x} for any variable x.

An atomic formula of L0 is an expression s = t where s, t are terms. We define fv(s = t) = fv(s) ∪ fv(t) [note the different roles of the two equals signs here].

We define the formulas of L0, and their free variables, as follows:

(1) Every atomic formula is a formula.
(2) If φ is a formula then so is ¬φ, and fv(¬φ) = fv(φ).
(3) If φ, ψ are formulas then so are φ ∧ ψ, φ ∨ ψ, φ → ψ and φ ↔ ψ, and fv(φ ∧ ψ) = fv(φ ∨ ψ) = fv(φ → ψ) = fv(φ ↔ ψ) = fv(φ) ∪ fv(ψ).
(4) If φ is a formula and x is a variable symbol then ∃xφ and ∀xφ are formulas, and fv(∃xφ) = fv(∀xφ) = fv(φ) \ {x}.
(5) Nothing else is a formula of L0.

A sentence of L0 is a formula σ of L0 such that fv(σ) = ∅. We use brackets to help readability. To be more formal, we should be careful with brackets as we were with propositional logic (at the beginning), but we won’t do this.

Here are some examples of formulas of L0.

• x = y is a formula, with free variables x, y.
• ∀x(x = y) is a formula, with free variable y.
• ∃x∀y(x = y ∧ y ≠ z) is a formula with free variable z. (Note that we use y ≠ z to abbreviate ¬(y = z).)

• ∃x(x = y ∧ y ≠ z) is a formula with free variables y, z.
• ∃w∀x∀y∀z(x = w ∨ y = w ∨ z = w) is a formula with no free variables, that is, it is a sentence. (Try to figure out what this sentence says. Although we haven’t yet defined meaning for formulas, we can and will later, and you can try to guess at what this says.)

The definition of formulas is inductive, just like the definition for propositional terms, and each formula has a corresponding construction tree. Any formula that occurs in the tree for φ is called a subformula of φ. Just like with propositional terms, each formula has a principal connective (and there is a unique readability theorem, but we won’t study that). Here is an example.

∃x∀y(x = y ∧ y ≠ z)

∀y(x = y ∧ y ≠ z)

x = y ∧ y ≠ z

x = y ¬(y = z)

y = z

And we can see that, for instance, y = z and ∀y(x = y ∧ y ≠ z) are subformulas of ∃x∀y(x = y ∧ y ≠ z), but that ∃x(x = y ∧ y ≠ z) is not.

We define the free occurrences of a variable x in a formula as follows:

(1) Every occurrence of x in any atomic formula is free.
(2) The free occurrences of x in ¬φ are the free occurrences of x in φ.
(3) If ∗ is one of ∧, ∨, →, ↔ then the free occurrences of x in φ ∗ ψ are the free occurrences of x in φ together with the free occurrences of x in ψ.
(4) There are no free occurrences of x in ∃xφ, and no free occurrences of x in ∀xφ.

In a formula of the form Qxφ, where Q is either ∀ or ∃, we refer to φ as the scope of Qx. Any free occurrence of x in φ is said to be bound by Qx. For instance, in the formula ∃x(x = y ∧ y ≠ z) ∧ ∀z(x = z), the variable y is free. The first occurrence of z is free, the second is bound by ∀z. The second occurrence of x is free, the first is bound by ∃x. The scope of the ∃x is the subformula (x = y ∧ y ≠ z); the scope of the ∀z is (x = z).

Now consider the formula ∃x((∀x(x = x)) → x = x). The scope of the quantifier ∃x is ((∀x(x = x)) → x = x). But the first x = x here lies in the scope of the ∀x, so these occurrences of x are bound by ∀x, while the occurrences in the second x = x are bound by ∃x. Instead of this formula, we could also write ∃x((∀y(y = y)) → x = x).

This formula is different, but it has the same meaning (it is logically equivalent, though we won’t formally define this). It is also much easier to read!
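The inductive definition of fv translates directly into a short recursive function. In the Python sketch below (the tuple encoding of formulas is mine), the examples above are recomputed.

```python
def fv(phi):
    op = phi[0]
    if op == 'eq':                          # atomic formula: x = y
        return {phi[1], phi[2]}
    if op == 'not':
        return fv(phi[1])
    if op in ('and', 'or', 'imp', 'iff'):
        return fv(phi[1]) | fv(phi[2])
    if op in ('forall', 'exists'):          # ('forall', x, phi)
        return fv(phi[2]) - {phi[1]}
    raise ValueError(op)

assert fv(('eq', 'x', 'y')) == {'x', 'y'}
assert fv(('forall', 'x', ('eq', 'x', 'y'))) == {'y'}
# ∃x∀y(x = y ∧ y ≠ z) has free variable z
assert fv(('exists', 'x', ('forall', 'y',
          ('and', ('eq', 'x', 'y'), ('not', ('eq', 'y', 'z')))))) == {'z'}
print("free variables agree with the examples")
```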

10. Predicate logic: enriching the language

Here are some more examples of formulas in our basic language.

• ∀x(x = x)
• ∀x∀y(x = y → y = x)
• ∀x∀y∀z((x = y ∧ y = z) → x = z)
• ∃x∃y∃z(x ≠ y ∧ x ≠ z ∧ y ≠ z ∧ ∀w(w = x ∨ w = y ∨ w = z))
• ∃x(x ≠ x)
• ∀x∃y(x ≠ y)

Try to think about what these say, when we interpret them in a set. For instance, is the first one always true? For which sets is the last one true?

There isn’t a lot we can say with our basic language. To get a more expressive logic, we now add symbols for constants, relations and functions. So, suppose that S is a collection of constant symbols, function symbols (perhaps of different arities) and relation symbols (perhaps of different arities), where we might have, say, no relation symbols. We write L for L0 ∨ S (we explain this notation below). The terms of L, and their free variables, are defined as follows.

(1) Each variable symbol x is a term, and fv(x) = {x},
(2) Each constant symbol c is a term, and fv(c) = ∅ (the empty set),
(3) If f is an n-ary function symbol and t1, . . . , tn are terms, then f(t1, . . . , tn) is a term, and fv(f(t1, . . . , tn)) = fv(t1) ∪ · · · ∪ fv(tn),
(4) Nothing else is a term.

Suppose for example that our collection S includes three constant symbols c1, c2, c3, a 2-ary function symbol f and a 3-ary function symbol g. We give some examples of terms of L and their free variables:

Term | Free variables
c1 | ∅
c2 | ∅
g(c1, c2, c3) | ∅
g(x, c1, c2) | x
g(x, y, z) | x, y, z
f(g(c1, x, y), c2) | x, y
f(g(x, y, c1), f(x, c2)) | x, y

We define the atomic formulas of L as follows.

(1) If s, t are terms then s = t is an atomic formula, and fv(s = t) = fv(s) ∪ fv(t).
(2) If R is an n-ary relation symbol, and t1, . . . , tn are terms, then R(t1, . . . , tn) is an atomic formula, and fv(R(t1, . . . , tn)) = fv(t1) ∪ · · · ∪ fv(tn).

The formulas of L are then defined as in the case of the basic language, but with the more general definitions of terms and atomic formulas. So

(1) Every atomic formula is a formula.
(2) If φ is a formula then so is ¬φ, and fv(¬φ) = fv(φ).
(3) If φ, ψ are formulas then so are φ ∧ ψ, φ ∨ ψ, φ → ψ and φ ↔ ψ, and fv(φ ∧ ψ) = fv(φ ∨ ψ) = fv(φ → ψ) = fv(φ ↔ ψ) = fv(φ) ∪ fv(ψ).
(4) If φ is a formula and x is a variable symbol then ∃xφ and ∀xφ are formulas, and fv(∃xφ) = fv(∀xφ) = fv(φ) \ {x}.
(5) Nothing else is a formula of L.

A sentence of L is a formula with no free variables.

We define free occurrences, scope, subformulas and so on as before.

The idea with the notation L0 ∨ S is that our language is some sort of join of the basic language and S, rather than literally the union.

To see what the definitions of terms and formulas mean we write them out again in a special case. We take our extra symbols S to be:

• two constant symbols c, d, • two binary function symbols f, g, • a binary relation symbol R.

Then the terms of L are the strings of symbols formed by finitely many applications of the following rules:

(1) all the variable symbols x, y, z, v1, v2, . . . are terms,
(2) the constant symbols c, d are terms,
(3) if t1, t2 are terms, then f(t1, t2) is a term and g(t1, t2) is a term,
(4) nothing else is a term.

To see whether a string is a term, we can try to form a construction tree.

For example,

f(x, c)

x c

At the bottom of the tree, we have x and c, both of which are terms. Then above, we have f(x, c), which is formed by applying one of the rules. So we get a term.

Here is a non-example. We can see that g(c, f(y)) is not a term, by trying to draw a tree, and seeing what happens:

g(c, f(y))

c f(y)

y

At the bottom of the tree we have y, which is a term. But above it we have f(y), and f is 2-ary, so f(y) is not a term according to the rules. Hence we can’t use it to form other terms, and so g(c, f(y)) is not a term.
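Trying to build a construction tree is exactly a recursive check, and it can be sketched in code (the tuple encoding of terms is our own, not the notes’): a term is either a bare symbol or a tuple ("f", t1, t2), and we check heads against declared arities.

```python
ARITIES = {"f": 2, "g": 2}             # the function symbols and their arities
CONSTANTS = {"c", "d"}
VARIABLES = {"x", "y", "z", "w"}       # a finite stand-in for the variable symbols

def is_term(t):
    """Mimics building a construction tree: check each node against the rules."""
    if isinstance(t, str):
        return t in VARIABLES or t in CONSTANTS
    head, *args = t
    # relation symbols like R never appear: only heads with a declared arity
    return head in ARITIES and len(args) == ARITIES[head] and all(map(is_term, args))

print(is_term(("f", "x", "c")))           # True, as in the first tree
print(is_term(("g", "c", ("f", "y"))))    # False: f is 2-ary
```

The failure happens exactly where the tree gets stuck: at the node f(y), the arity check fails and False propagates up.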

Similarly, we can see that g(g(g(x, y), d), f(x, g(c, d))) is a term. Again, we try to draw the tree:

g(g(g(x, y), d), f(x, g(c, d)))

g(g(x, y), d) f(x, g(c, d))

g(x, y) d x g(c, d)

x y c d

We can see from the tree that this is a term. We do one more example.

f(g(c, R(x, y)), g(x, d))

g(c, R(x, y)) g(x, d)

c R(x, y) x d

On the right hand side, x and d are terms, and then g(x, d) is a term following the rules. But on the left, at the bottom we have R(x, y). This is not a term: there is no way to introduce the relation symbol R in the definition of terms.

The atomic formulas of L are strings of symbols of one of the following forms:

• t1 = t2 where t1, t2 are terms of L,
• R(t1, t2) where t1, t2 are terms of L.

We then define the formulas as before. We now determine whether some strings of symbols are formulas of L. We do this by trying to draw a construction tree, and seeing what happens.

∃x∀y(R(x, y) → R(y, R(y, x)))

∀y(R(x, y) → R(y, R(y, x)))

R(x, y) → R(y, R(y, x))

R(x, y) R(y, R(y, x))

This is not a formula, as at the end of the tree on the right we have R(y, R(y, x)), which is not an atomic formula, since R(y, x) isn’t a term (as in the last nonexample of a term above). The next example is a formula, and we underline the free variables at each point.

∃x∀y(R(z, v) → f(c, y) = g(x, f(z, v)))

∀y(R(z, v) → f(c, y) = g(x, f(z, v)))

R(z, v) → f(c, y) = g(x, f(z, v))

R(z, v) f(c, y) = g(x, f(z, v))

At the bottom here we have atomic formulas. (To be more precise, we should also check that f(c, y) and g(x, f(z, v)) are terms. This can be done as in the term examples above.) Then everything above is formed following the rules, so what we get is a formula. (After the tree is completed, we can go back up underlining the free variables.) For a final example, consider ∃x(∀x(R(x, y) → f(x, y) = v) ∧ g(x, c) = d). Let’s check that this is a formula (it is!) using a tree.

∃x(∀x(R(x, y) → f(x, y) = v) ∧ g(x, c) = d)

∀x(R(x, y) → f(x, y) = v) ∧ g(x, c) = d

R(x, y) → f(x, y) = v g(x, c) = d

R(x, y) f(x, y) = v

Once we’ve found that this is a formula, we can go back up underlining the free variables.

11. L-structures

We interpret the terms and formulas of a language L in certain structures, called L-structures. In propositional logic, to interpret terms, we only needed to give a value T or F to each p.v., and this determined everything. Here we need more. We need a set for the variables to range over, and a meaning for the constant symbols, function symbols and relation symbols. So, an L-structure M consists of a nonempty set M together with

(1) a distinguished element cM of M for each constant symbol c of L,
(2) a subset RM of M^n for each n-ary relation symbol R of L, and each n,
(3) a function f M : M^n → M for each n-ary function symbol f of L, and each n.

We call cM, RM and f M the interpretations of the symbols c, R and f, respectively. We make this explicit in the notation by writing M = (M, {cM}, {RM}, {f M}). For an example, suppose that we have the language considered in the previous section, with two constant symbols c, d, two binary function symbols f, g and a binary relation symbol R. Then the following are examples of L-structures:

(1) Take M = R, RM = {(x, y) ∈ R^2 : x < y}, f M(x, y) = x + y, gM(x, y) = x · y, cM = 0, dM = 1.
(2) Same, but with M = Q.
(3) Or we could take M = R, RM = {(x, y) ∈ R^2 : x^2 + y^2 = 1}, f M(x, y) = x^2 + y^2, gM(x, y) = x^3 + y^3, cM = −1, dM = 1.
(4) Or we could take M = {a1, a2, a3, b1, b2, b3}, cM = a1, dM = b1, RM = {(ai, bj) : i, j = 1, 2, 3} and

f M(x, y) = a1 if x ≠ y, and b1 if x = y;
gM(x, y) = a2 if x ≠ y, and b2 if x = y.

Note how different these are. All they have in common is their ‘structure’: each has two distinguished elements, a distinguished subset of the square of the universe, and two distinguished binary functions.

Suppose that our language includes a binary relation symbol R. We can draw pictures of finite L-structures M, by viewing RM as the edge relation of a directed graph. That is, we have vertices, one for each x ∈ M, and an edge from x to y if and only if (x, y) ∈ RM. We do a few examples. Suppose that we have a single binary relation symbol R in our language. Then the following are L-structures (pictures below).

(a) M = {a, b, c, d}, RM = {(a, b), (b, c), (c, d), (d, a)}.

(b) M = {a, b, c, d},RM = {(a, b), (a, c), (a, d), (b, a), (b, c), (b, d)}.

(c) M = {a, b, c, d, e},RM = {(a, a), (a, b), (a, c), (b, b), (b, d), (c, d), (d, d)}.
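Finite structures like these are easy to represent directly. Here is a sketch (our own encoding, not the notes’ notation) of example (c): a universe plus the interpretation of the binary relation symbol R as a set of ordered pairs.

```python
# The finite L-structure of example (c)
M = {"a", "b", "c", "d", "e"}
R_M = {("a", "a"), ("a", "b"), ("a", "c"),
       ("b", "b"), ("b", "d"), ("c", "d"), ("d", "d")}

# The directed-graph picture: an edge x -> y iff (x, y) is in R_M
for x, y in sorted(R_M):
    print(f"{x} -> {y}")
```

Loops like a -> a in the printout correspond to pairs (x, x) in RM, which the hand-drawn pictures show as arrows from a vertex to itself.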


12. Interpreting terms in L-structures

We can interpret terms of L in our L-structures. Informally, terms define functions simply by interpreting constant symbols and n-ary function symbols as elements of M and functions M n → M, respectively, and by interpreting variable symbols as variables ranging over M.

For instance, suppose we have a language L = L0 ∨ {f, g, c} where f is a unary function symbol, g is a binary function symbol and c is a constant symbol. Consider the L-structure M = (N, f M, gM, cM), where M = N, f M(x) = x^2, gM(x, y) = x + y and cM = 5. Then the term g(c, f(x)) gives the function N → N defined by x 7→ gM(cM, f M(x)) = cM + f M(x) = 5 + x^2. That is, the interpretation of this term in this structure gives the function x 7→ 5 + x^2. For a second example, consider the term f(g(c, g(x, f(y)))).

We can unravel this as follows:

f M(gM(cM, gM(x, f M(y)))) = (gM(cM, gM(x, f M(y))))^2
= (cM + gM(x, f M(y)))^2
= (5 + x + f M(y))^2
= (5 + x + y^2)^2.

So this term, interpreted in M, gives the function N^2 → N defined by (x, y) 7→ f M(gM(cM, gM(x, f M(y)))) = (5 + x + y^2)^2.
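Informally, interpreting a term is just composing the interpreting functions. Here is a sketch of the two examples above, with Python functions standing in for f M, gM and cM (the names are ours):

```python
f_M = lambda x: x * x        # f^M(x) = x^2
g_M = lambda x, y: x + y     # g^M(x, y) = x + y
c_M = 5                      # c^M = 5

t1 = lambda x: g_M(c_M, f_M(x))                  # g(c, f(x)): 5 + x^2
t2 = lambda x, y: f_M(g_M(c_M, g_M(x, f_M(y))))  # f(g(c, g(x, f(y)))): (5 + x + y^2)^2

print(t1(3))     # 5 + 3^2 = 14
print(t2(1, 2))  # (5 + 1 + 2^2)^2 = 100
```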

Now we give the formal definition. This is not covered in lectures, and the informal way above is the right way to think about it. But I include the formal definition for completeness.

As these examples might suggest, the definition is inductive. It is easiest to formulate with only variable symbols v1, v2, v3, . . . . (In practice, we will also use x, y, z, w and so on, and it should be clear how to do this.) Fix a language L and an L-structure M. Given a term t of L, we define a function tM : M^∞ → M, where M^∞ = {m̄ : m̄ = (m1, m2, m3, . . . , mp, . . .), mp ∈ M, p ≥ 1}, that is, M^∞ is the set of all infinite sequences of elements of M. For m̄ ∈ M^∞, we define tM(m̄) by induction as follows:

(1) if t is a variable symbol vp then vpM(m̄) = mp,
(2) if t is a constant symbol c then cM(m̄) = cM,
(3) if t is f(t1, . . . , tn) where t1, . . . , tn are terms of L and f is an n-ary function symbol, then tM(m̄) = f M(t1M(m̄), . . . , tnM(m̄)).

It is easy to show that tM(m̄) only depends on the elements mp such that vp occurs in t. This means that in practice, we can think of tM as a function of finitely many variables, rather than infinitely many variables. This is the formal definition, but in practice, we continue to work out examples as we did with the example above.

Here is another example, where now L = L0 ∨ {f, g, c, d}, with binary function symbols f, g and constant symbols c, d. Suppose our L-structure is given by M = Z, f M(x, y) = x + y, gM(x, y) = x − y, cM = 17, dM = 19. Consider the term t given by f(g(g(f(x, x), g(x, y)), c), d). We have

f M(gM(gM(f M(x, x), gM(x, y)), cM), dM) = gM(gM(f M(x, x), gM(x, y)), cM) + dM
= gM(f M(x, x), gM(x, y)) − cM + 19
= f M(x, x) − gM(x, y) − 17 + 19
= 2x − (x − y) + 2
= x + y + 2.

So the function tM : M^2 → M is given by (x, y) 7→ x + y + 2.

For another example, continue with this L and take M given by M = R, f M(x, y) = x + y, gM(x, y) = xy, cM = 0, dM = 1. Consider the L-term t as above, so given by f(g(g(f(x, x), g(x, y)), c), d). Then in this new M, we have

tM(x, y) = f M(gM(gM(f M(x, x), gM(x, y)), cM), dM)
= gM(gM(f M(x, x), gM(x, y)), cM) + dM
= gM(f M(x, x), gM(x, y)) · cM + 1
= gM(f M(x, x), gM(x, y)) · 0 + 1
= 1.

So in this structure, the term t is just a complicated way of writing the function with constant value 1.
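We can sanity-check the two calculations by coding the term once and plugging in the two interpretations (a sketch; the helper name is ours):

```python
def interpret_t(f, g, c, d):
    """The term f(g(g(f(x,x), g(x,y)), c), d), read as a function of (x, y)."""
    return lambda x, y: f(g(g(f(x, x), g(x, y)), c), d)

# The Z-structure: f = +, g = -, c = 17, d = 19
t_M = interpret_t(lambda a, b: a + b, lambda a, b: a - b, 17, 19)
# The R-structure: f = +, g = *, c = 0, d = 1
t_N = interpret_t(lambda a, b: a + b, lambda a, b: a * b, 0, 1)

print(t_M(3, 4))   # 3 + 4 + 2 = 9
print(t_N(3, 4))   # 1, whatever the inputs
```

The same syntactic term gives entirely different functions in the two structures, which is the point of the example.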

13. Interpreting formulas in L-structures

When we interpret the symbols of L in an L-structure M and interpret the quantifiers as ranging over M and the propositional connectives as we did in propositional logic, a sentence of L makes an assertion about M. When this assertion is true, we say that the sentence is true in M. When the assertion is false, we say that the sentence is false in M. As with the interpretation of terms, there is a complicated formal definition, which we give below for completeness. But it is much easier to get the idea by looking at examples.

For instance let L = L0 ∨ {R, f, c} where f is a unary function symbol, c a constant symbol and R a binary relation symbol. Let M be given by M = R, RM = {(x, y) ∈ R^2 : x < y}, f M(x) = x^2 and cM = 0. Let φ be the sentence of L given by ∀x(R(c, x) → ∃y(x = f(y))). Interpreting this in M, we have

∀x ∈ R(0 < x → ∃y ∈ R(x = y^2)).

And this is true in M, so we say that φ is true in M. If we instead consider M given by M = Q and RM = {(x, y) ∈ Q^2 : x < y}, f M(x) = x^2 and cM = 0, then in this new M our sentence φ says

∀x ∈ Q(0 < x → ∃y ∈ Q(x = y^2)).

So φ is false in M, as for instance 0 < 2 but 2 ≠ y^2 for any rational y. Next consider this M again, but with f M(x) = 2x. Then in this M our sentence φ says

∀x ∈ Q(0 < x → ∃y ∈ Q(x = 2y))

which is true, so we say that φ is true in this M.

For another example, let L have a binary relation symbol R, a unary function symbol f and constant symbols c, d. Let M be given by M = Q, RM = {(x, y) ∈ Q^2 : x < y}, f M(x) = x^2, cM = 0, dM = 1. Let φ be the sentence ∃x(R(x, f(x)) → (x = c ∧ x = d)). Interpreted in M this says

∃x ∈ Q((x < x^2) → (x = 0 ∧ x = 1)).

Now the (x = 0 ∧ x = 1) is always false. So the sentence will only be true if there is some x ∈ Q such that x < x^2 is false. (Then we will have ‘false implies false’, which is true.) There are such x, e.g. x = 0. So the sentence is true in M.

If instead of a sentence φ, we consider an L-formula φ with a free variable x, which we indicate by φ(x), then φ doesn’t make an assertion about an L-structure M. But if we are also given an element m ∈ M, then φ(m), that is, ‘φ with m substituted in for all free occurrences of x’, does make an assertion, about m in M. For instance, continue with the L from the start of this section, with a binary relation symbol R, a unary function symbol f and a constant symbol c. Let M = Q, RM = {(x, y) ∈ Q^2 : x < y}, f M(x) = x^2 and cM = 0. Consider the formula R(c, x) → ∃y(x = f(y)). Then if we interpret in M and ‘substitute 2 for the free occurrences of x’ we get

0 < 2 → ∃y ∈ Q(2 = y^2).

And this is false (as the 0 < 2 is true and the ∃y ∈ Q(2 = y^2) is false), so we say φ(2) is false in M. On the other hand, if we interpret in M and ‘substitute 4 for the free occurrences of x’ we get

0 < 4 → ∃y ∈ Q(4 = y^2).

And this is true. So we say that φ(4) is true in M.

For the formal definition (again not given in lectures), which is due to Tarski, we will again work only with variables v1, v2, v3, . . . (and again, we’ll apply the result more generally, see later). Suppose that φ is an L-formula, that M is an L-structure and that m̄ ∈ M^∞. We write M |= φ(m̄), and say ‘φ is true of m̄ in M’, if:

(1) φ is an atomic formula of the form t1 = t2, for terms t1 and t2, and then M |= φ(m̄) if and only if t1M(m̄) = t2M(m̄), or,
(2) φ is an atomic formula of the form R(t1, . . . , tn) and then M |= φ(m̄) if and only if (t1M(m̄), . . . , tnM(m̄)) ∈ RM, or,
(3) φ is ψ ∧ θ and then M |= φ(m̄) if and only if M |= ψ(m̄) and M |= θ(m̄), or,
(4) φ is ¬ψ and then M |= φ(m̄) if and only if it is not the case that M |= ψ(m̄), or,
(5) φ is ∃vpψ and then M |= φ(m̄) if and only if there exists n ∈ M such that M |= ψ(m̄(p/n)), where m̄(p/n) denotes the sequence (m1, . . . , mp−1, n, mp+1, . . .) in M^∞.

Note that whether or not M |= φ(m̄) only depends on those mp such that vp occurs free in φ. Also, note that we haven’t defined what to do if φ is ψ → θ, ψ ∨ θ, ψ ↔ θ or ∀vpψ. For the first three, we use that {¬, ∧} is adequate, to reduce to something covered by our definition. And for the last, we define M |= ∀vpψ if and only if M |= ¬∃vp¬ψ. So these cases can all be covered by our definition. Finally, if φ is a sentence of L, we say that M is a model of φ if M |= φ(m̄) for some (or equivalently for all) m̄ ∈ M^∞. So informally, M is a model of φ if φ is true in M.

Let’s work through an example of this definition, just to see how it works. Normally, we proceed without it (as we did in the examples before the definition). For the example, we take L = L0 ∨ {P, f, g, c} where P is a unary relation symbol, f, g are binary function symbols and c is a constant symbol. We take M to be given by M = {1, 2, 3, 4, . . .}, the positive integers, P M = {2, 3, 5, 7, 11, 13, . . .}, the primes, f M(m, n) = m · n, gM(m, n) = m + n and cM = 2. Let φ be the sentence ∀x∃y∃z(g(f(c, x), c) = g(y, z) ∧ P (y) ∧ P (z)), and let θ(x, y, z) be the formula g(f(c, x), c) = g(y, z), ψ(y) be the formula P (y) and χ(z) be the formula P (z). Note that the term g(f(c, x), c) interpreted in M is the function x 7→ 2x + 2. We use a sequence m̄ = (mx, my, mz, m1, m2, . . .), indicating which elements correspond to x, y, z and which to the variables v1, v2, . . . . And we write m̄(x/n) for the sequence (n, my, mz, m1, m2, . . .), and similarly for m̄(y/n) etc. Then we have

M |= φ(m̄)
iff for every n ∈ M, M |= (∃y∃z(θ ∧ ψ ∧ χ))(m̄(x/n))
iff for every n ∈ M there exist p, q ∈ M s.t. M |= (θ ∧ ψ ∧ χ)(m̄(x/n)(y/p)(z/q))
iff for every n ∈ M there exist p, q ∈ M s.t. M |= θ(m̄(x/n)(y/p)(z/q)) and M |= ψ(m̄(x/n)(y/p)(z/q)) and M |= χ(m̄(x/n)(y/p)(z/q))
iff for every n ∈ M there exist p, q ∈ M s.t. 2n + 2 = p + q and p is prime and q is prime
iff every even natural number greater than 2 is the sum of two primes.

And this is Goldbach’s conjecture; no one knows whether or not it is true.

Important notation. If φ is a sentence of L and M is an L-structure, we write M |= φ if φ is true in M. As well as saying φ is true in M, we say that M is a model of φ, or that M models φ. We will use this phrase ‘M models φ’ even though the formal definition isn’t seen in lectures. In any given case, it is clear what is meant, working informally as we did earlier.
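Since the structure M in the Goldbach example is infinite, we cannot decide M |= φ by computation, but we can test its instances for small n. A sketch (the helper names are ours; that the test passes for a finite range of course proves nothing, since the conjecture is open):

```python
def is_prime(k):
    # trial division is fine for small k
    return k >= 2 and all(k % i for i in range(2, int(k ** 0.5) + 1))

def phi_instance(n):
    """Is 2n + 2 a sum p + q of two primes, p and q positive integers?"""
    target = 2 * n + 2
    return any(is_prime(p) and is_prime(target - p) for p in range(2, target))

print(all(phi_instance(n) for n in range(1, 200)))   # True for this range
```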

For further examples, we work informally. Take L = L0 ∨ {R, P } where R is a binary relation symbol and P a unary relation symbol (or predicate symbol). Let M1 be given by M = {1, 2, 3, . . .}, the positive integers, RM1 = ≤, i.e. RM1 = {(x, y) ∈ M^2 : x ≤ y}, and P M1 = {x ∈ M : x is even}. And let M2 be given by M = R, the real numbers, RM2 = {(x, y) ∈ R^2 : x^2 = y} and P M2 = Q. Let φ be the L-sentence ∃x∀y∃z((P (x) → R(x, y)) ∧ P (y) ∧ ¬R(x, y)). This is true in an L-structure M if there is an a ∈ M such that for all b ∈ M there is a c ∈ M such that each of the conjuncts is true in M with a substituted in for x, b substituted in for y and c substituted in for z. Let’s consider each of M1 and M2.

In M1. Fix any x ∈ M. If we choose y odd, then no matter what z is, P (y) will be false. So the sentence φ is false in M1, and M1 ⊭ φ. Similarly in M2, if we fix any x ∈ R, and choose y ∈ R irrational, then no matter what z is, P (y) will be false. So φ is false in M2, i.e. M2 ⊭ φ.

Next consider the sentence ψ given by ∀y∃x(¬P (x) ∧ R(x, y)). What does this say in M1 and M2? In M1, this says ‘for all positive integers y there is a positive integer x such that (x is odd and x ≤ y)’. This is true: no matter what y is we can take x = 1. So M1 |= ψ. And in fact the sentence ∃x∀y(¬P (x) ∧ R(x, y)) is true in M1.

On the other hand, in M2, the sentence ψ says ‘for all reals y there is an irrational x such that x^2 = y’. This is false. For instance, if we take y = 4 then x^2 = y if and only if x = ±2, and neither of these values of x is irrational. So ψ is false in M2, i.e. M2 ⊭ ψ.

Now we do an example in which L has a single binary relation symbol R. Let X be the L-structure given by X = {a, b, c, d, e, f} and RX = {(a, b), (a, c), (a, d), (a, e), (a, f), (b, d), (c, e), (c, f)}. First, in such an example, you should draw the directed graph (see the examples in the section on L-structures). Here it can be imagined as having a sitting on top, with b, c, d, e, f arranged in a line below, arrows from a to each of those below, and then arrows from b to d, and from c to e and f.

Let φ be the sentence ∃x∀y((¬∃zR(z, y)) → y = x). To examine this in X , first we consider the subformula (¬∃zR(z, y)) → y = x. This subformula says ‘if there is no z pointing to y then y = x’. So the whole sentence φ is true in X if there is some x ∈ X such that for all y ∈ X, if there is no z pointing to y then y = x. To see that this is indeed true in X , take x = a. Now, if y ≠ a then ¬∃zR(z, y) is false (since something points to y), and the implication is true. On the other hand if y = a then the x = y on the right is true, and the implication is true. So with the choice x = a the implication is true for all y. So X is a model of φ.

Next let φ be the sentence ∃x∀y((¬∃zR(y, z)) → y = x). (So the y and z are swapped from the previous example.) Now ¬∃zR(y, z) says ‘y does not point to anything’. So the whole sentence says ‘there is an x such that for all y, if y doesn’t point to anything then y = x’. There are three elements, d, e, f, of X that don’t point to anything. If we choose x to be one of these, then taking y to be another of these elements, y = x is false, and ¬∃zR(y, z) is true. So the implication is false. (And if we choose x to be one of the other elements, then taking y to be d the implication is again false.) So the sentence is false in X .
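On a finite structure like X , quantifiers become loops over the universe, so sentences can be checked by brute force. A sketch (our encoding) of the two sentences just discussed, rewriting each implication p → q as (not p) or q:

```python
X = {"a", "b", "c", "d", "e", "f"}
R = {("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"), ("a", "f"),
     ("b", "d"), ("c", "e"), ("c", "f")}

def points_to(y):      # ∃z R(z, y): something points to y
    return any((z, y) in R for z in X)

def points_from(y):    # ∃z R(y, z): y points to something
    return any((y, z) in R for z in X)

# ∃x∀y((¬∃z R(z,y)) → y = x), i.e. ∃x∀y(∃z R(z,y) ∨ y = x)
phi1 = any(all(points_to(y) or y == x for y in X) for x in X)
# ∃x∀y((¬∃z R(y,z)) → y = x)
phi2 = any(all(points_from(y) or y == x for y in X) for x in X)

print(phi1, phi2)   # True False
```

The witness for phi1 is x = a, and phi2 fails because d, e and f all point nowhere, exactly as argued above.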

14. Definable sets

Suppose that L = L0 ∨ S is a language as before and that φ is an L-formula with one free variable x (which we indicate by φ(x)). Given an L-structure M, we can look at ‘the solution set’ of φ in M: {m ∈ M : M |= φ(x/m)} = {m ∈ M : M |= φ(m)} = {m ∈ M : φ is true in M with m substituted in for the free occurrences of x}. This is called the set defined by φ in M, and written φ(M). Such sets are called definable.

For example, take L = L0 ∨ {f} where f is a unary function symbol. Let φ(x) be the formula ∃y(f(y) = x). We examine the set defined by φ in several L-structures.

Let M1 be given by M1 = N, the natural numbers, and f M1(x) = x^2. Then

φ(M1) = {m ∈ M1 : M1 |= φ(m)} = {m ∈ N : there exists n ∈ N s.t. m = n^2} = the square numbers in N.

Let M2 be given by M2 = N, the natural numbers, and f M2(x) = 2x. Then

φ(M2) = {m ∈ M2 : M2 |= φ(m)} = {m ∈ N : there exists n ∈ N s.t. m = 2n} = the even numbers in N.

Next, let M3 be given by M3 = R, the real numbers, and f M3(x) = x^2. Then

φ(M3) = {m ∈ M3 : M3 |= φ(m)} = {a ∈ R : there exists b ∈ R s.t. a = b^2} = {a ∈ R : a ≥ 0} = [0, ∞).

Finally, consider M4 given by M4 = R, the real numbers, and f M4(x) = 2x. Then

φ(M4) = {m ∈ M4 : M4 |= φ(m)} = {a ∈ R : there exists b ∈ R s.t. a = 2b} = R.

Here are some further examples. For these, suppose that L = L0 ∨ {R, f, c} where R is a binary relation symbol, f a unary function symbol and c a constant symbol. Let M be the L-structure given by M = R, RM = {(x, y) ∈ R^2 : x ≤ y}, f M(x) = cos x and cM = π. Here is a table of formulas with a free variable x and the set they define in M.

Formula φ(x)                      φ(M)
R(c, x)                           [π, ∞)
∃y f(y) = x                       [−1, 1]
∃y f(x) = y                       R
f(x) = π                          ∅ (i.e. the empty set)
∃y(R(c, x) ∧ f(y) = x)            ∅
∃y(R(c, y) ∧ f(y) = x)            [−1, 1]
∀yR(x, f(y))                      (−∞, −1]
∀yR(f(x), f(y))                   {(2k + 1)π : k ∈ Z}
∀y∃z(R(y, z) ∧ f(z) = x)          [−1, 1]

Now let’s take N as the M above, but with RN = {(x, y) ∈ R^2 : x < y} (so we change to strict inequality). Let’s look at the same formulas, and the sets they define now. The ones without an R in them don’t change. For the rest we have:

Formula φ(x)                      φ(N )
R(c, x)                           (π, ∞)
∃y(R(c, x) ∧ f(y) = x)            ∅
∃y(R(c, y) ∧ f(y) = x)            [−1, 1]
∀yR(x, f(y))                      (−∞, −1)
∀yR(f(x), f(y))                   ∅
∀y∃z(R(y, z) ∧ f(z) = x)          [−1, 1]

As a further example of definable sets we do an example with a directed graph. We consider L with one binary relation symbol R, and let X be the L-structure given by X = {a, b, c, d, e, f} and RX = {(a, b), (a, c), (a, d), (a, e), (a, f), (b, d), (c, e), (c, f)}, the example from the end of the last section. As before, it is worth drawing a picture of the directed graph. First consider the formula ρ(x) given by ∀yR(x, y). This says ‘for every y, x points to y’. But in our graph, there is no x pointing to every y (note that a doesn’t point to itself). So ρ(X ) = ∅.

Let’s consider the formula φ(x) given by ∀y¬R(x, y). So φ(x) says ‘for every y, ¬R(x, y)’, that is ‘for every y it is not the case that x points to y’, i.e. ‘x doesn’t point to anything’. So we have φ(X ) = {x ∈ X : x doesn’t point to anything}. Now looking at your picture, or considering the definition of RX , we see that φ(X ) = {d, e, f}. Next let us consider the formula ψ(x) given by ∃y∃z(y ≠ z ∧ R(x, y) ∧ R(x, z)). So ψ(x) says ‘there are y and z such that y ≠ z and x points to y and x points to z’. That is, ψ(x) says ‘x points to at least two elements’. So ψ(X ) = {x ∈ X : x points to at least two elements} and so ψ(X ) = {a, c}.
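Definable subsets of a finite structure can likewise be computed by brute force, as set comprehensions over the universe. A sketch (our encoding) for the three formulas above:

```python
X = {"a", "b", "c", "d", "e", "f"}
R = {("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"), ("a", "f"),
     ("b", "d"), ("c", "e"), ("c", "f")}

rho = {x for x in X if all((x, y) in R for y in X)}        # ∀y R(x, y)
phi = {x for x in X if all((x, y) not in R for y in X)}    # ∀y ¬R(x, y)
psi = {x for x in X                                        # ∃y∃z(y ≠ z ∧ R(x,y) ∧ R(x,z))
       if any(y != z and (x, y) in R and (x, z) in R
              for y in X for z in X)}

print(sorted(rho), sorted(phi), sorted(psi))   # [] ['d', 'e', 'f'] ['a', 'c']
```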

For some more examples, let L = L0 ∨ {c, d, f, g, R} where c and d are constant symbols, f and g are binary function symbols and R is a binary relation symbol. Let M be the L-structure which has underlying set M = R, the set of real numbers, cM = 0, dM = 1, f M(x, y) = x+y, gM(x, y) = x·y and RM = {(x, y) ∈ R2 : x < y}. For each of the following sentences of L we check whether it is true or false in M.

(i) ∀x∃yR(x, y)
This is true in M. To see this, we translate the sentence. It says ‘For all x ∈ R there is a y ∈ R such that x < y’. And this is true: given x, we could take y = x + 1, say.

(ii) ∃x∀yR(x, y)
This is false in M. Why? Well, it says ‘there is an x ∈ R such that for all y ∈ R we have x < y’, and this is clearly false: given any x we have x = x, so it is not the case that x < x.

(iii) ∃y∀x(R(c, x) → R(x, f(y, g(x, x))))
This is true in M. It says ‘there is a y ∈ R such that for all x ∈ R, if 0 < x then we have x < y + x^2’. This is true, with y = 1 say.

(iv) ∀x∃y(R(y, c) → R(g(y, g(x, x)), c))
This is true in M. It says ‘for all x ∈ R there exists y ∈ R such that (y < 0 → yx^2 < 0)’. To see that it is true, no matter what x is we take y = 1. Then y < 0 is false, so the implication is true.

(v) ∀x∃y(R(y, d) ∧ R(g(y, g(x, x)), x))
This is false in M. It says ‘for all x ∈ R there is a y ∈ R such that (y < 1 and yx^2 < x)’. To see that it is false, note that for x = 0, we have yx^2 = x, no matter what y is, so the second conjunct will always be false for x = 0 and any y, hence the sentence is false in M.

Next we take some formulas with a free variable, φ(x), and write down the subset φ(M) of M which φ defines in M.

(i) R(g(f(d, d), g(f(x, f(d, d)), f(f(d, d), x))), c)
We have φ(M) = {a ∈ R : M |= φ(a)} = {a ∈ R : M |= R(g(f(d, d), g(f(a, f(d, d)), f(f(d, d), a))), c)} = {a ∈ R : 2(a + 2)^2 < 0}. But for any a ∈ R we have 2(a + 2)^2 ≥ 0, so this set is the empty set ∅.

(ii) ¬(x = c ∧ x ≠ c) → (¬R(g(x, x), x))
To see what set this defines in M, first note that (x = c ∧ x ≠ c) translates as ‘x = 0 and x ≠ 0’. This is always false, so the negation of it is always true. For the formula to be true, we thus need the second part of the implication to be true. So the set the formula defines is just the set of a ∈ R such that M |= ψ(a), where ψ(x) is ¬R(g(x, x), x). So the set is the set of reals a such that a^2 ≥ a, that is, (−∞, 0] ∪ [1, ∞).

(iii) (∃yR(g(y, y), y)) → (x = c ∧ x ≠ c)
Again, the (x = c ∧ x ≠ c) is always false. So the set this defines will be the set of x which make the first part false.
But the first part ∃yR(g(y, y), y) is a sentence: x doesn’t occur (free) in it. And it is a sentence that is true in M, as it says ‘there is a y such that y^2 < y’, which is true, for instance with y = 1/2. So this formula defines the empty set in M.

15. Posets and equivalence relations and constructing models

A different situation is as follows. We are given a language L and some sentences σ1, . . . , σl of L, and we try to find, if possible, an L-structure M such that M |= σi for each i (i.e. M |= σ1 ∧ · · · ∧ σl).

(For some choices of the σi this might be impossible. For instance, if σ1 is ∃x(x ≠ x) then we won’t be able to find a model.)

Suppose that L = L0 ∨ {R} where R is a binary relation. Consider the following three L-sentences:

σ, which is ∀xR(x, x),
ρ, which is ∀x∀y∀z((R(x, y) ∧ R(y, z)) → R(x, z)),
τ, which is ∀x∀y((R(x, y) ∧ R(y, x)) → x = y).

Suppose that M is an L-structure. We say that RM is

• reflexive if M |= σ,
• transitive if M |= ρ,
• antisymmetric if M |= τ.

As an example, let us write down an L-structure M such that RM is reflexive and transitive but not antisymmetric. So we need to give a set M and a subset RM of M^2 such that M |= σ and M |= ρ, but such that it is not the case that M |= τ. One way is to let M = {a, b} be a two element set and then just put RM = M^2. Clearly σ and ρ are true here, but (a, b) ∈ RM and (b, a) ∈ RM, and a ≠ b, so τ is false in M. As an exercise, find structures which model other combinations of these sentences, but not all three.

If RM is reflexive, transitive and antisymmetric, we say that M is a partially ordered set (or poset). For instance, we could take as an example a set with three elements {a, b, c} and define RM = {(a, a), (b, b), (c, c)}, a rather trivial poset. Or we could define RM, with the same M, by RM = {(a, a), (b, b), (c, c), (a, b), (a, c)}. Or, for instance, we could take M = N and RM = {(m, n) ∈ N^2 : m ≤ n}. And so on.

Now let θ be the L-sentence ∀x∀y(R(x, y) → R(y, x)). We call RM symmetric if M |= θ. And then if RM is reflexive, transitive and symmetric, we call RM an equivalence relation on M. As an exercise, write down an L-structure M such that RM is an equivalence relation on M.
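On a finite structure, all three sentences can be checked mechanically. A sketch (the function names are ours) verifying the two-element example above:

```python
def reflexive(M, R):
    return all((x, x) in R for x in M)

def transitive(M, R):
    return all((x, z) in R for x in M for y in M for z in M
               if (x, y) in R and (y, z) in R)

def antisymmetric(M, R):
    return all(x == y for x in M for y in M
               if (x, y) in R and (y, x) in R)

M = {"a", "b"}
R = {(x, y) for x in M for y in M}   # R^M = M^2

# reflexive and transitive, but not antisymmetric
print(reflexive(M, R), transitive(M, R), antisymmetric(M, R))   # True True False
```

These helpers also make the suggested exercises easy to experiment with: just swap in a different M and R.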

For a different kind of example let L = L0 ∨ {f} where f is a unary function symbol. Can we find an L-structure M such that M is a model of

σ, which is ∃x, y, z∀w(w = x ∨ w = y ∨ w = z),
ρ, which is ∀x(x ≠ f(x)),
τ, which is ∃x(x ≠ f(f(x)))?

Let’s first study σ and try to see what it says, that is, when it will be true in a given structure. In fact we could first study the simpler sentence ∃x∀w(w = x). When is this true in a structure? We must have an element x such that every element is equal to x. I.e. our set M must have cardinality 1. What about ∃x∃y∀w(w = x ∨ w = y)?

This says that there exist x and y such that every element is either x or y. I.e., our set M must have at most 2 elements (we don’t say x ≠ y). Similarly, M |= σ if and only if M has at most three elements. The other two are a bit easier: we just have to make sure f M doesn’t send any element to itself, and that there is some element x such that x ≠ f(f(x)). Here is an example: we can take M = {a, b, c}, a set with three elements, and define f M(a) = b, f M(b) = c and f M(c) = a. Can you find another example? What about a model with only two elements?
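A quick sketch (our encoding) confirming that this three-element structure models all of σ, ρ and τ:

```python
M = {"a", "b", "c"}
f = {"a": "b", "b": "c", "c": "a"}   # f^M cycles a -> b -> c -> a

sigma = len(M) <= 3                  # σ: M has at most three elements
rho = all(x != f[x] for x in M)      # ρ: ∀x (x ≠ f(x)), no fixed points
tau = any(x != f[f[x]] for x in M)   # τ: ∃x (x ≠ f(f(x))), e.g. f(f(a)) = c ≠ a

print(sigma, rho, tau)   # True True True
```

For the final question, try replacing M and f with a two-element universe and checking which of the three sentences survive.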