
2. First Order

2.1. Expressions.

Definition 2.1. A language L consists of a set LF of function symbols, a set LR of relation symbols disjoint from LF, and a function arity : LF ∪ LR → N. We will sometimes distinguish a special binary relation symbol =. 0-ary function symbols are called constant symbols. We will always assume that both LF and LR are countable.

First-order logic will involve expressions built from symbols of our language together with additional symbols:

• Infinitely many first-order variables, x0, x1, . . .,
• The logical connectives ∧, ∨, →, ⊥,
• Quantifiers ∀ and ∃,
• Parentheses ( and ).

We usually write x, y, z, . . . for first-order variables.

Definition 2.2. The terms of L are given inductively by:

• The variables x0, x1, . . . are all terms,
• If F ∈ LF, so F is a function symbol, arity(F) = n, and t1, . . . , tn are terms, then F t1 ··· tn is a term.

Note that 0-ary function symbols are permitted, in which case each 0-ary function symbol is itself a term; we call these constants.

If R ∈ LR, so R is a relation symbol, arity(R) = n, and t1, . . . , tn are terms, then Rt1 ··· tn is an atomic formula. The formulas of L are given inductively by:
• Every atomic formula is a formula,
• ⊥ is a formula,
• If φ and ψ are formulas then so are (φ ∧ ψ), (φ ∨ ψ), (φ → ψ),
• If x is a variable and φ is a formula then ∀xφ and ∃xφ are formulas.

Formulas and terms are collectively called expressions.

When R is binary, we often informally write with infix notation: t1 < t2, t1 = t2, etc. We sometimes write Q, Q′, etc. for an arbitrary quantifier, and φ ~ ψ for an arbitrary binary connective among ∧, ∨, →.

2.2. Free Variables.

Definition 2.3. If e is an expression, we define the free variables of e, free(e), recursively by:
• free(x) = {x},
• free(F t1 ··· tn) = ⋃i≤n free(ti),
• free(Rt1 ··· tn) = ⋃i≤n free(ti),
• free(⊥) = ∅,
• free(φ ~ ψ) = free(φ) ∪ free(ψ),
• free(Qxφ) = free(φ) \ {x}.
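The recursive definitions of expressions and free variables (and the substitution operations of the next definitions) can be prototyped directly in Python. The representation below is our own sketch, not part of the text; the class and function names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

# --- Terms (Definition 2.2) ---
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Fn:                      # F t1 ... tn; a 0-ary Fn is a constant
    name: str
    args: Tuple = ()

# --- Formulas ---
@dataclass(frozen=True)
class Rel:                     # atomic formula R t1 ... tn
    name: str
    args: Tuple = ()

@dataclass(frozen=True)
class Bot:                     # ⊥
    pass

@dataclass(frozen=True)
class Bin:                     # φ ~ ψ for ~ among ∧, ∨, →
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Quant:                   # Qxφ for Q among ∀, ∃
    q: str
    var: Var
    body: object

def free(e):
    """Free variables of an expression (Definition 2.3)."""
    if isinstance(e, Var):
        return {e}
    if isinstance(e, (Fn, Rel)):
        return set().union(*(free(t) for t in e.args))
    if isinstance(e, Bot):
        return set()
    if isinstance(e, Bin):
        return free(e.left) | free(e.right)
    if isinstance(e, Quant):
        return free(e.body) - {e.var}
    raise TypeError(e)

def variables(t):
    """All variables occurring in a term."""
    if isinstance(t, Var):
        return {t}
    return set().union(*(variables(s) for s in t.args))

def substitutable(t, x, phi):
    """Is the term t substitutable for x in phi (Definition 2.4)?
    We read "x does not occur in Qyφ" as "x does not occur free in Qyφ"."""
    if isinstance(phi, (Rel, Bot)):
        return True
    if isinstance(phi, Bin):
        return substitutable(t, x, phi.left) and substitutable(t, x, phi.right)
    if isinstance(phi, Quant):
        if x not in free(phi):
            return True
        return phi.var not in variables(t) and substitutable(t, x, phi.body)
    raise TypeError(phi)

def subst(e, t, x):
    """e[t/x]; on formulas, only meaningful when t is substitutable for x."""
    if isinstance(e, Var):
        return t if e == x else e
    if isinstance(e, (Fn, Rel)):
        return type(e)(e.name, tuple(subst(s, t, x) for s in e.args))
    if isinstance(e, Bot):
        return e
    if isinstance(e, Bin):
        return Bin(e.op, subst(e.left, t, x), subst(e.right, t, x))
    if isinstance(e, Quant):
        if e.var == x:                 # (Qxφ)[t/x] is Qxφ
            return e
        return Quant(e.q, e.var, subst(e.body, t, x))
    raise TypeError(e)
```

For example, y is not substitutable for x in ∀yR(x, y): substituting would capture the free occurrence of y.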

Definition 2.4. Given t, we define the formulas φ such that t is substitutable for x recursively by:
• If φ is atomic or ⊥ then t is substitutable for x in φ,
• t is substitutable for x in φ ~ ψ iff t is substitutable for x in both φ and ψ,
• t is substitutable for x in ∀yφ, ∃yφ iff either
  – x does not occur in ∀yφ, or
  – y does not occur in t and t is substitutable for x in φ.

If s is a term, we define s[t/x] by:
• If y ≠ x, y[t/x] is y,
• x[t/x] is t,
• (F t1 ··· tn)[t/x] is F t1[t/x] ··· tn[t/x].

When t is substitutable for x in φ, we define φ[t/x] by:

• (Rt1 ··· tn)[t/x] is Rt1[t/x] ··· tn[t/x],
• ⊥[t/x] is ⊥,
• (φ ~ ψ)[t/x] is φ[t/x] ~ ψ[t/x],
• If y ≠ x, (Qyφ)[t/x] is Qy(φ[t/x]),
• (Qxφ)[t/x] is Qxφ.

We only write φ[t/x] when it is the case that t is substitutable for x in φ. If we have distinguished some variable x, we write φ(t) for φ[t/x].

Definition 2.5. We define the alphabetic variants of a formula φ recursively by:
• If φ is atomic, φ is its only alphabetic variant,
• φ′ ~ ψ′ is an alphabetic variant of φ ~ ψ iff φ′ is an alphabetic variant of φ and ψ′ is an alphabetic variant of ψ,
• Qxφ′ is an alphabetic variant of Qyφ iff there is a φ″ such that φ″ is an alphabetic variant of φ and φ′ is φ″[x/y].

Definition 2.6. We define the substitution instances of φ inductively by:
• φ is a substitution instance of φ,
• If ψ is a substitution instance of φ and t is substitutable for x in ψ then ψ[t/x] is a substitution instance of φ.

2.3. Sequent Calculus.

The sequent calculus for first-order logic is a direct extension of the calculus for propositional logic. We add four new rules:

L∃:
  Γ, φ[y/x] ⇒ Σ
  ──────────────
  Γ, ∃xφ ⇒ Σ

R∃:
  Γ ⇒ φ[t/x], Σ
  ──────────────
  Γ ⇒ ∃xφ, Σ

where, in L∃, y does not appear free in ΓΣ.

L∀:
  Γ, φ[t/x] ⇒ Σ
  ──────────────
  Γ, ∀xφ ⇒ Σ

R∀:
  Γ ⇒ φ[y/x], Σ
  ──────────────
  Γ ⇒ ∀xφ, Σ

where, in R∀, y does not appear free in ΓΣ.

In L∃ and R∀, the variable y is called the eigenvariable, and the condition that y not appear free in ΓΣ is the eigenvariable condition. When φ is a formula with a known distinguished variable x, it will be convenient to write φ(t) for φ[t/x].

Definition 2.7. Fc is the system consisting of the nine rules of Pc together with the four rules above. As before, Fi is the fragment of Fc where the right-hand part of each sequent consists of 1 or 0 formulas, Fm is the fragment of Fi omitting L⊥, and F^cf is the fragment of F omitting Cut.

Taking for granted that Fm ⊢ φ ⇒ φ for any formula, we can consider some examples. (In each example, every sequent follows from the sequent or sequents immediately above it.)

Example 2.8.
  φ(y) ⇒ φ(y)
  φ(y) ⇒ ∃xφ(x)
  ∀yφ(y) ⇒ ∃xφ(x)
  ⇒ ∀yφ(y) → ∃xφ(x)

Example 2.9.
  ¬φ(x), φ(x) ⇒ ⊥
  ¬φ(x), ∀yφ(y) ⇒ ⊥
  ∃x¬φ(x), ∀yφ(y) ⇒ ⊥
  ∃x¬φ(x) ⇒ ¬∀yφ(y)
  ⇒ ∃x¬φ(x) → ¬∀yφ(y)

Example 2.10.
  ⇒ φ, ¬φ
  ⇒ ∃xφ, ¬φ
  ⇒ ∃xφ, ∀x¬φ        ⊥ ⇒ ∃xφ
  ¬∀x¬φ ⇒ ∃xφ
  ⇒ ¬∀x¬φ → ∃xφ

Example 2.11.
  φ(x), φ(y) ⇒ φ(y), ∀yφ(y)
  φ(x) ⇒ φ(y), φ(y) → ∀yφ(y)
  φ(x) ⇒ φ(y), ∃x(φ(x) → ∀yφ(y))
  φ(x) ⇒ ∀yφ(y), ∃x(φ(x) → ∀yφ(y))
  ⇒ φ(x) → ∀yφ(y), ∃x(φ(x) → ∀yφ(y))
  ⇒ ∃x(φ(x) → ∀yφ(y))

Lemma 2.12. For any φ, Fm^cf ⊢ φ ⇒ φ.

Proof. For atomic φ this is an axiom, and for φ ~ ψ we have covered this case in propositional logic. The only new cases are the two quantifiers. In the ∃ case, we have:

  φ ⇒ φ  (IH)
  φ ⇒ ∃xφ
  ∃xφ ⇒ ∃xφ

Similarly, in the ∀ case, we have:

  φ ⇒ φ  (IH)
  ∀xφ ⇒ φ
  ∀xφ ⇒ ∀xφ  □

2.4. Properties of Intuitionistic Logic.

Theorem 2.13 (Generalized Subformula Property). If F^cf ⊢ Γ ⇒ Σ then every formula appearing in the deduction is a substitution instance of a subformula of some formula in either Γ or Σ.

Theorem 2.14 (Existence Property). If Fi^cf ⊢ ∃xφ then there is a term t such that Fi^cf ⊢ φ[t/x].

Proof. The last inference rule of this deduction must be R∃, and therefore its premise must have been φ[t/x] for some term t. □

Theorem 2.15. If Fi^cf ⊢ ∀x∃yφ then there is a term t (possibly containing x free) such that Fi^cf ⊢ ∀xφ[t/y].

Proof. The last inference rule of this deduction must be R∀, with premise ∃yφ, and the last inference of that subdeduction must be R∃, so the line before must have been φ[t/y] for some term t. Applying R∀ to that line instead gives ∀xφ[t/y]. □

2.5. Double Negation.

Definition 2.16. As before, we define φ* recursively by:
• (Rt1 ··· tn)* is (Rt1 ··· tn) ∨ ⊥,
• ⊥* is ⊥,
• (φ ~ ψ)* is φ* ~ ψ*,
• (Qxφ)* is Qx(φ*).

Theorem 2.17. For any φ,
(1) Fi ⊢ φ ↔ φ*,
(2) Fm ⊢ ⊥ → φ*,
(3) If Fi ⊢ φ then Fm ⊢ φ*.

Proof. Again, we prove only the third part. We show by induction on deductions:

  Suppose Fi ⊢ Γ ⇒ Σ. If Σ = {φ} then Fm ⊢ Γ* ⇒ φ*, and if Σ = ∅ then Fm ⊢ Γ* ⇒ ⊥.

The induction from the propositional case works for all rules of Pi, and the four new rules are all rules of Fm which go through unchanged. □

Definition 2.18. We define the double negation of φ, φN, inductively by:
• ⊥N is ⊥,
• pN is ¬¬p,
• (φ0 ∧ φ1)N is φ0N ∧ φ1N,
• (φ0 ∨ φ1)N is ¬(¬φ0N ∧ ¬φ1N),
• (φ0 → φ1)N is φ0N → φ1N,
• (∀xφ)N is ∀xφN,
• (∃xφ)N is ¬∀x¬φN.

Again ΓN = {γN | γ ∈ Γ}.

Lemma 2.19. Fm ⊢ ¬¬φN ⇒ φN.

Proof. By induction on φ. We have already handled all cases except when φ is Qxψ. For φ = ∀xψ, we have

  ψN ⇒ ψN
  ∀xψN ⇒ ψN
  ¬ψN, ∀xψN ⇒ ⊥
  ¬ψN ⇒ ¬∀xψN
  ¬¬∀xψN ⇒ ¬¬ψN        ¬¬ψN ⇒ ψN  (IH)
  ¬¬∀xψN ⇒ ψN
  ¬¬∀xψN ⇒ ∀xψN

For φ = ∃xψ, we have

  ∀x¬ψN, ¬∀x¬ψN ⇒ ⊥
  ∀x¬ψN ⇒ ¬¬∀x¬ψN
  ¬¬¬∀x¬ψN, ∀x¬ψN ⇒ ⊥
  ¬¬¬∀x¬ψN ⇒ ¬∀x¬ψN  □

(Recall that (∃xψ)N is ¬∀x¬ψN, so ¬¬(∃xψ)N is ¬¬¬∀x¬ψN.)

Lemma 2.20. If Fm ⊢ Γ, ¬φN ⇒ ⊥ then Fm ⊢ Γ ⇒ φN.

Theorem 2.21. If Fc ⊢ φ then Fm ⊢ φN.

Proof. Again, we prove by induction on deductions:

  If Fc ⊢ Γ ⇒ Σ then Fm ⊢ ΓN, ¬ΣN ⇒ ⊥.

The inductive steps from the propositional case work unchanged, so we need only deal with the four new rules. If the last inference of the original deduction was R∀ then we have ¬∀xφN in ¬ΣN, so we have:

  ΓN, ¬ΣN, ¬φN(y) ⇒ ⊥  (IH)
  ΓN, ¬ΣN ⇒ φN(y)  (2.20)
  ΓN, ¬ΣN ⇒ ∀xφN
  ΓN, ¬ΣN, ¬∀xφN ⇒ ⊥

If the last inference of the original deduction was L∀ then we have ∀xφN in ΓN, and so:

  ΓN, φN(t), ¬ΣN ⇒ ⊥  (IH)
  ΓN, ∀xφN, ¬ΣN ⇒ ⊥

If the last inference is R∃ then we have ¬¬∀x¬φN in ¬ΣN, and so:

  ΓN, ¬ΣN, ¬φN(t) ⇒ ⊥  (IH)
  ΓN, ¬ΣN, ∀x¬φN ⇒ ⊥        ¬¬∀x¬φN ⇒ ∀x¬φN  (2.19)
  ΓN, ¬ΣN, ¬¬∀x¬φN ⇒ ⊥

If the last inference is L∃ then we have ¬∀x¬φN in ΓN, and so:

  ΓN, φN(y), ¬ΣN ⇒ ⊥  (IH)
  ΓN, ¬ΣN ⇒ ¬φN(y)
  ΓN, ¬ΣN ⇒ ∀x¬φN        ∀x¬φN, ¬∀x¬φN ⇒ ⊥
  ΓN, ¬∀x¬φN, ¬ΣN ⇒ ⊥  □

We also have:

Theorem 2.22.
(1) Fm ⊢ φ → φN, and
(2) Fc ⊢ φ ↔ φN.
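The double negation translation of Definition 2.18 is a straightforward structural recursion, which we can sketch in Python. The nested-tuple encoding of formulas below is our own (with "bot" for ⊥ and ("rel", …) for atomic formulas), not part of the text.

```python
def N(phi):
    """The double negation translation of Definition 2.18.

    Formulas are nested tuples: ("rel", name, *args), "bot",
    ("and"|"or"|"imp", left, right), ("forall"|"exists", var, body).
    """
    def neg(a):
        # ¬a is shorthand for a → ⊥
        return ("imp", a, "bot")

    if phi == "bot":                      # ⊥N is ⊥
        return "bot"
    tag = phi[0]
    if tag == "rel":                      # pN is ¬¬p
        return neg(neg(phi))
    if tag == "and":                      # (φ0 ∧ φ1)N is φ0N ∧ φ1N
        return ("and", N(phi[1]), N(phi[2]))
    if tag == "or":                       # (φ0 ∨ φ1)N is ¬(¬φ0N ∧ ¬φ1N)
        return neg(("and", neg(N(phi[1])), neg(N(phi[2]))))
    if tag == "imp":                      # (φ0 → φ1)N is φ0N → φ1N
        return ("imp", N(phi[1]), N(phi[2]))
    if tag == "forall":                   # (∀xφ)N is ∀xφN
        return ("forall", phi[1], N(phi[2]))
    if tag == "exists":                   # (∃xφ)N is ¬∀x¬φN
        return neg(("forall", phi[1], neg(N(phi[2]))))
    raise ValueError(phi)
```

For example, applying N to ∃xPx yields ¬∀x¬¬¬Px, since (Px)N is ¬¬Px.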

However, Glivenko's Theorem is false for first-order logic: Fc ⊢ ∀x(Px ∨ ¬Px) but Fi ⊬ ¬¬∀x(Px ∨ ¬Px). Checking the latter requires a bit of effort; in the next section we will show that Fi is conservative over Fi^cf, so we will only show here that Fi^cf ⊬ ¬¬∀x(Px ∨ ¬Px). To show this, we need to show something slightly more general:

Lemma 2.23. There is no deduction in Fi^cf of

  ¬∀x(Px ∨ ¬Px), Py1, . . . , Pyn ⇒ ∀x(Px ∨ ¬Px)

for any n and any choice of variables y1, . . . , yn.

Proof. Suppose there were such a deduction; then there must be a shortest such deduction. Consider the last inference rule of the shortest such deduction; it can only be L → or R∀.

If the last inference rule were an application of L → then one of the immediate subdeductions would actually be a deduction of the same sequent, contradicting the choice of the shortest deduction. So the last inference rule must be R∀ applied to a deduction of

  ¬∀x(Px ∨ ¬Px), Py1, . . . , Pyn ⇒ Pyn+1 ∨ ¬Pyn+1

where, by the eigenvariable condition, yn+1 ≠ yi for i ≤ n. The last inference rule in the deduction of this sequent must be either L → or R∨, and, by the same argument as before, cannot be L →. The sequent

  ¬∀x(Px ∨ ¬Px), Py1, . . . , Pyn ⇒ Pyn+1

can only be deduced using L →, again contradicting our choice of the shortest deduction. On the other hand,

  ¬∀x(Px ∨ ¬Px), Py1, . . . , Pyn ⇒ ¬Pyn+1

can only be deduced by applying R → to a deduction of

  ¬∀x(Px ∨ ¬Px), Py1, . . . , Pyn, Pyn+1 ⇒ ⊥,

which in turn can only be deduced by applying L → to

  ¬∀x(Px ∨ ¬Px), Py1, . . . , Pyn, Pyn+1 ⇒ ∀x(Px ∨ ¬Px),

once again contradicting our choice of the shortest such deduction. So no matter how we attempt to derive a sequent of the form

  ¬∀x(Px ∨ ¬Px), Py1, . . . , Pyn ⇒ ∀x(Px ∨ ¬Px),

we must have used another sequent of the same form earlier in the deduction; we conclude that there can be no such deduction. □

By a similar analysis of what the final inference rule must be, any deduction of ¬¬∀x(Px ∨ ¬Px) in Fi^cf must look like this:

(*)
  ¬∀x(Px ∨ ¬Px) ⇒ ∀x(Px ∨ ¬Px)
  ¬∀x(Px ∨ ¬Px) ⇒ ⊥
  ⇒ ¬¬∀x(Px ∨ ¬Px)

and we have just seen that there is no deduction of the top sequent of (*).

2.6. Cut-Elimination.

We extend the notion of the rank of a formula:

Definition 2.24. We define the rank of a formula inductively by:
• rk(p) = rk(⊥) = 0,
• rk(φ ~ ψ) = max{rk(φ), rk(ψ)} + 1,
• rk(Qxφ) = rk(φ) + 1.

We write F ⊢r Γ ⇒ Σ if there is a deduction of Γ ⇒ Σ in F such that all cut-formulas have rank < r.
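Using the same kind of nested-tuple encoding of formulas (our own, not the text's), the rank function of Definition 2.24 is a short recursion:

```python
def rank(phi):
    """rk from Definition 2.24: atomic formulas and ⊥ have rank 0; a binary
    connective adds one to the maximum rank of its parts; a quantifier adds
    one to the rank of its body."""
    if phi == "bot" or phi[0] == "rel":
        return 0
    if phi[0] in ("and", "or", "imp"):
        return max(rank(phi[1]), rank(phi[2])) + 1
    if phi[0] in ("forall", "exists"):
        return rank(phi[2]) + 1
    raise ValueError(phi)
```

So, for instance, ∀x(p → p) has rank 2: one for the implication, one for the quantifier.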

Lemma 2.25 (Substitution). If F ⊢r Γ ⇒ Σ then for any x, t, F ⊢r Γ[t/x] ⇒ Σ[t/x].

Proof. By induction on deductions. We have to show inductively that for any deduction d with conclusion Γ ⇒ Σ and any sequence of substitutions x1, . . . , xn, t1, . . . , tn, there is a deduction with the same cut-rank of Γ[t1/x1] ··· [tn/xn] ⇒ Σ[t1/x1] ··· [tn/xn]. The only non-trivial case is when the last inference rule is either L∃ or R∀, since in this case it might be that the eigenvariable y appears in some ti. In this case, let z be some new variable not appearing anywhere in the deduction d, not in any ti, and not equal to any xi, and apply IH to the sequences y, x1, . . . , xn and z, t1, . . . , tn. Then we may use z as an eigenvariable and apply the same inference rule. □

Lemma 2.26 (Inversion).

• If F ⊢r Γ ⇒ Σ, ∀xφ then for every term t, F ⊢r Γ ⇒ Σ, φ[t/x].
• If F ⊢r Γ, ∃xφ ⇒ Σ then for every term t, F ⊢r Γ, φ[t/x] ⇒ Σ.

Proof. We prove the first part by induction on deductions; the second part is similar. The only non-trivial case is when ∀xφ is the main formula of the last inference rule. In this case we must have

  Γ ⇒ Σ, ∀xφ, φ[y/x]
  Γ ⇒ Σ, ∀xφ

By IH, there is a deduction of Γ ⇒ Σ, φ[t/x], φ[y/x], and substituting t for the eigenvariable y (using the Substitution Lemma) gives a deduction of Γ ⇒ Σ, φ[t/x]. □

Lemma 2.27.
• Suppose F⋆ ⊢r Γ ⇒ Σ, ∀xφ and F⋆ ⊢r Γ′, ∀xφ ⇒ Σ′, rk(∀xφ) ≤ r, and if ⋆ ∈ {i, m} then |ΣΣ′| ≤ 1. Then F⋆ ⊢r ΓΓ′ ⇒ ΣΣ′.
• Suppose F⋆ ⊢r Γ, ∃xφ ⇒ Σ and F⋆ ⊢r Γ′ ⇒ Σ′, ∃xφ, rk(∃xφ) ≤ r, and if ⋆ ∈ {i, m} then |ΣΣ′| ≤ 1. Then F⋆ ⊢r ΓΓ′ ⇒ ΣΣ′.

Proof. We again prove the first part; the other is similar. By induction on the deduction of Γ′, ∀xφ ⇒ Σ′. The only non-trivial case is when ∀xφ is the main formula of the last inference of this deduction. In this case we have

  Γ′, ∀xφ, φ[t/x] ⇒ Σ′
  Γ′, ∀xφ ⇒ Σ′

By IH, F⋆ ⊢r ΓΓ′, φ[t/x] ⇒ ΣΣ′. By Inversion, we know that F⋆ ⊢r Γ ⇒ Σ, φ[t/x], so an application of Cut to the formula φ[t/x], which has rank < r, gives a deduction of ΓΓ′ ⇒ ΣΣ′. □

Theorem 2.28. Suppose F ⊢r+1 Γ ⇒ Σ. Then F ⊢r Γ ⇒ Σ.

Proof. By induction on deductions. If the last inference is anything other than a cut over a formula of rank r, the claim follows immediately from IH. If the last inference is a cut over a formula of rank r, we have

  Γ ⇒ Σ, φ        Γ′, φ ⇒ Σ′
  ΓΓ′ ⇒ ΣΣ′

Therefore F ⊢r+1 Γ ⇒ Σ, φ and F ⊢r+1 Γ′, φ ⇒ Σ′, so by IH, F ⊢r Γ ⇒ Σ, φ and F ⊢r Γ′, φ ⇒ Σ′. If φ is φ0 ~ φ1, we proceed as in the propositional case. If φ is ∀xψ or ∃xψ, we may apply the preceding lemma. □

Theorem 2.29. Suppose F ⊢r Γ ⇒ Σ. Then F ⊢0 Γ ⇒ Σ.

Proof. By induction on r, applying the previous theorem repeatedly. □

2.7. Consequences of Cut-Elimination.

Theorem 2.30 (Craig Interpolation Theorem). Suppose F ⊢ Γ ⇒ Σ. Then there is a formula ψ such that:

• F ⊢ Γ ⇒ ψ,
• F ⊢ ψ ⇒ Σ,
• Every function or relation symbol and every free variable appearing in ψ appears in both Γ and Σ.

Proof. The proof is essentially the same as that of the interpolation theorem we gave for propositional logic. The only new cases are the quantifier rules. Recall that we need the more general inductive claim:

  Suppose F⋆ ⊢ ΓΓ′ ⇒ ΣΣ′ where if ⋆ ∈ {i, m} then Σ = ∅. Then there is a formula ψ such that:
  • F⋆ ⊢ Γ ⇒ Σ, ψ,
  • F⋆ ⊢ Γ′, ψ ⇒ Σ′,
  • Every function or relation symbol and every free variable appearing in ψ appears in both ΓΣ and Γ′Σ′.

We consider just one quantifier rule, L∃. Importantly, we should make sure that the eigenvariable condition is unharmed. There are two subcases, depending on whether the formula ∃xφ is considered part of Γ or part of Γ′, and since they are quite similar, we consider only one:

  Γ, φ; Γ′ ⇒ ΣΣ′
  Γ, ∃xφ; Γ′ ⇒ ΣΣ′

By IH there is a ψ so that F⋆ ⊢ Γ, φ ⇒ Σ, ψ, F⋆ ⊢ Γ′, ψ ⇒ Σ′, and the function symbols, relation symbols, and free variables in ψ appear in both ΓΣ, φ and Γ′Σ′. In particular, the eigenvariable condition means that the eigenvariable in φ does not appear in Γ′Σ′, and therefore does not appear in ψ, so

  Γ, φ ⇒ Σ, ψ
  Γ, ∃xφ ⇒ Σ, ψ

as needed. □

The formula ψ is known as the interpolant. Note that the choice of ψ depends both on Γ and Σ. Even when Γ is finite, ψ may depend on the particular choice of conclusion Σ.

Example 2.31. Consider the language with two binary predicates, = and <. We can write down finitely many axioms stating that < is a dense linear order:
• ∀x∀y∀z(x < y ∧ y < z → x < z),
• ∀x∀y(x < y → ∃z(x < z ∧ z < y)),
• ∀x∀y(x < y → x ≠ y),
• ∃x∃y x < y.

From these, we can easily deduce, for any n,

  ∃x1∃x2 ··· ∃xn(x1 ≠ x2 ∧ x1 ≠ x3 ∧ ··· ∧ x1 ≠ xn ∧ x2 ≠ x3 ∧ ···),

that is, the conjunction of xi ≠ xj over all i < j. In other words, the finite list of axioms above implies that the model is infinite. However no single formula involving only equality can imply that the model is infinite.

Theorem 2.32 (Herbrand's Theorem). Suppose Fc ⊢ ∃xφ where φ is quantifier-free. Then there are terms t1, . . . , tn such that

  Fc ⊢ φ(t1), . . . , φ(tn),

that is, the sequent ⇒ φ(t1), . . . , φ(tn) is derivable.

We don't bother to state this for Fi or Fm since the existence property gives a much stronger result for those systems. Also, the same statement is true, by the same proof but with more complicated notation, if we replace the variable x by several variables. Note that φ may contain free variables, as may the terms t1, . . . , tn.

Proof. We prove by induction on cut-free deductions:

  Suppose d is a cut-free deduction of Γ ⇒ ∆Σ where the formulas in Γ∆ are quantifier-free and the formulas in Σ consist only of formulas of the form ∃xψ with ψ quantifier-free. Then there is a deduction of Γ ⇒ ∆Σ′ where every element of Σ′ has the form ψ[t/x] for some ∃xψ ∈ Σ.

The idea is simply to remove all R∃ inferences from the deduction. If the last inference of d is anything other than R∃, the claim follows immediately from IH. If the last inference of d is R∃, we have

  Γ ⇒ ∆, ψ[t/x], Σ
  Γ ⇒ ∆Σ

By IH, there is a deduction of Γ ⇒ ∆, ψ[t/x], Σ′, which proves the claim. □

Actually, it's useful to have a stronger theorem.

Definition 2.33. The prenex formulas are defined inductively by:
• Every quantifier-free formula is prenex,
• If φ is prenex, so are ∀xφ and ∃xφ.

In other words, the prenex formulas are the ones that have all their quantifiers out in front.
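With the same nested-tuple encoding of formulas used earlier (our own convention), Definition 2.33 amounts to: strip leading quantifiers, then check that what remains is quantifier-free.

```python
def quantifier_free(phi):
    """True if phi contains no quantifiers."""
    if phi == "bot" or phi[0] == "rel":
        return True
    if phi[0] in ("and", "or", "imp"):
        return quantifier_free(phi[1]) and quantifier_free(phi[2])
    return False  # phi is a quantified formula

def is_prenex(phi):
    """Definition 2.33: all quantifiers out in front."""
    while phi != "bot" and phi[0] in ("forall", "exists"):
        phi = phi[2]          # strip the leading quantifier block
    return quantifier_free(phi)
```

For example, ∀x∃yR(x, y) is prenex, while (∀xR(x, y)) ∧ R(x, y) is not, since a quantifier occurs under a connective.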

Theorem 2.34 (Middle Sequent Theorem). If Fc ⊢ Γ ⇒ Σ where ΓΣ consists of prenex formulas then there is a deduction of Γ ⇒ Σ in which the quantifier rules (R∀, R∃, L∀, L∃) appear below every other inference rule.

We begin with a cut-free deduction of Γ ⇒ Σ. The idea is to take the (or a, since the proof might branch) bottom-most quantifier rule in the proof and "permute" it towards the bottom: repeatedly reorder rules so the quantifier rule gets lower and lower. Once that rule is the last one in the proof, we turn to the next quantifier rule, and repeat until we have pulled all quantifier rules to the bottom.

Lemma 2.35. Suppose there is a cut-free deduction of Γ ⇒ Σ where ΓΣ consists of prenex formulas with n quantifier rules appearing in the deduction. Then there is such a deduction where the final rule is a quantifier rule.

Proof. Let m be the number of inference rules appearing below the lowest quantifier rule in the deduction. We proceed by induction on m. If m = 0, there is nothing to do. If m > 1, the claim follows by using IH to reduce to the case where m = 1. So the only interesting case is when m = 1. That means the final inference rule of d is not a quantifier rule, and one of its immediate subdeductions does end in a quantifier rule. It suffices to show that we can swap the order of these rules without increasing the total number of quantifier rules. For example, suppose the deduction looks like this:

  Γ ⇒ Σ, φ, θ
  Γ ⇒ Σ, φ, ∃xθ
  Γ ⇒ Σ, φ ∨ ψ, ∃xθ

Then we can simply exchange the order:

  Γ ⇒ Σ, φ, θ
  Γ ⇒ Σ, φ ∨ ψ, θ
  Γ ⇒ Σ, φ ∨ ψ, ∃xθ

We need to worry that the main formula of the quantifier rule might be a subformula of the main formula of the non-quantifier rule; since all formulas in Γ ⇒ Σ are prenex, this cannot happen: the main formula of the quantifier rule has a quantifier in it, and therefore no non-quantifier rule may later be applied to it, since that would produce a non-prenex formula. □
We need to worry about the eigenvariable condition when we permute appropriate rules. For instance, in the example, it could be that the eigenvariable appears in ψ. In this case we apply the substitution lemma, replacing the eigenvariable with some other variable for which this is not a problem. Note that the substitution lemma does not change the number of quantifier rules in the deduction. Also, note that we might permute the rule over a two-premise rule:

  Γ′ ⇒ Σ′, ψ, θ
  Γ ⇒ Σ, φ        Γ′ ⇒ Σ′, ψ, ∃xθ
  ΓΓ′ ⇒ ΣΣ′, φ ∧ ψ, ∃xθ

The same argument works:

  Γ ⇒ Σ, φ        Γ′ ⇒ Σ′, ψ, θ
  ΓΓ′ ⇒ ΣΣ′, φ ∧ ψ, θ
  ΓΓ′ ⇒ ΣΣ′, φ ∧ ψ, ∃xθ  □

Proof of Middle Sequent Theorem. We take a cut-free deduction and prove the middle sequent theorem by induction on the number of quantifier rules in the deduction. If there are no quantifier rules, there is nothing to do. If there are n + 1 quantifier rules, the lemma gives us a deduction in which the last rule is a quantifier rule. We then apply IH to the immediate subdeduction, which has n quantifier rules. □

The Middle Sequent Theorem immediately gives another proof of Herbrand's Theorem, and also of generalizations of Herbrand's Theorem for more complicated formulas. We give one example:

Corollary 2.36. Suppose Fc ⊢ ∃x∀yφ where φ is quantifier-free. Then there are variables y1, . . . , yn and terms t1, . . . , tn, where yi does not appear in tj for j ≤ i, such that

  Fc ⊢ φ(t1, y1), . . . , φ(tn, yn).

Proof. By the Middle Sequent Theorem, there is a deduction of ∃x∀yφ with all quantifier rules at the end. Suppose we have a deduction of some subset of

  ∃x∀yφ, ∀yφ(s1, y), . . . , ∀yφ(sm, y), φ(t1, y1), . . . , φ(tn, yn)

with all quantifier rules at the end and where yi does not appear in tj for j ≤ i. We show by induction on the number of quantifier rules that there is a deduction of a sequent satisfying the statement of the theorem. If the last rule is R∃, the immediate subdeduction is of a subset of

∃x∀yφ, ∀yφ(s1, y),..., ∀yφ(sm, y), ∀yφ(sm+1, y), φ(t1, y1), . . . , φ(tn, yn) and the claim follows by IH. If the last rule is R∀, we may assume without loss of generality that the main formula is ∀yφ(sm, y). Then the immediate subdeduction is of a subset of

  ∃x∀yφ, ∀yφ(s1, y), . . . , ∀yφ(sm, y), φ(t1, y1), . . . , φ(tn, yn), φ(sm, ym)

where, by the eigenvariable condition, ym cannot appear free in t1, . . . , tn, and the claim follows by IH. □

There are further generalizations of Herbrand's Theorem for formulas with more quantifiers in front.

2.8. Optimality of Cut-Elimination.

We show that in some instances cut-elimination for first-order logic must drastically increase the height of proofs. The idea is that we will construct some sequents and exhibit short proofs involving cut, but then prove that any cut-free proofs must be very large.

Recall that the size of a deduction is the number of inference rules appearing in the deduction. We work in a language with a ternary relation R, a constant 0, and a unary function S. The intuition is that 0 represents the number 0, S the successor, and R(x, y, z) holds when z = x + 2^y. We implement this with two formulas:

• ∀y R(y, 0, Sy),
• ∀x, y, z, z′ (R(y, x, z) → R(z, x, z′) → R(y, Sx, z′)).

Note that these are properties we expect to hold: the first says that for every y, y + 1 = y + 2^0, while the second says that for any x, y,

  y + 2^(x+1) = (y + 2^x) + 2^x.
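Under the intended reading R(x, y, z) ⟺ z = x + 2^y (our reconstruction of the semantics, with the first argument the base and the second the exponent), both axioms can be checked numerically:

```python
def R(x, y, z):
    # intended interpretation: z = x + 2^y
    return z == x + 2 ** y

# First axiom: for every y, R(y, 0, Sy), i.e. y + 1 = y + 2^0.
for y in range(50):
    assert R(y, 0, y + 1)

# Second axiom: R(y, x, z) and R(z, x, z') imply R(y, Sx, z'),
# i.e. y + 2^(x+1) = (y + 2^x) + 2^x.
for x in range(8):
    for y in range(20):
        z = y + 2 ** x          # the unique z with R(y, x, z)
        z2 = z + 2 ** x         # the unique z' with R(z, x, z')
        assert R(y, x, z) and R(z, x, z2) and R(y, x + 1, z2)
```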

Definition 2.37. 2_0 = 1, 2_(k+1) = 2^(2_k).
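Definition 2.37 is the usual tower-of-exponentials function; a direct Python transcription (the name `tower` is ours):

```python
def tower(k):
    """2_k from Definition 2.37: 2_0 = 1 and 2_(k+1) = 2^(2_k)."""
    return 1 if k == 0 else 2 ** tower(k - 1)
```

The values grow as 1, 2, 4, 16, 65536, 2^65536, . . ., which is the rate of growth that cut-elimination will be shown to require.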

Let us take Λ to be the set consisting of these two axioms. We now define some formulas we wish to give deductions of:

• A0(x, y) is the formula ∃zR(y, x, z),
• A0(x) is the formula ∀yA0(x, y),
• Ai+1(x, y) is the formula Aiy → ∃z(Aiz ∧ R(y, x, z)),
• Ai+1(x) is the formula ∀yAi+1(x, y).

Lemma 2.38. There is a constant c so that for every i there is a deduction of Λ ⇒ Ai(0) in Fc with size ≤ c·i.

Proof. For i = 0, 1 we have ad-hoc deductions:

  Λ, R(y, 0, Sy) ⇒ R(y, 0, Sy)
  Λ ⇒ R(y, 0, Sy)
  Λ ⇒ ∃zR(y, 0, z)
  Λ ⇒ ∀y∃zR(y, 0, z)

We'll leave the deduction of A1 as an exercise. We first give a deduction of Λ, Aiy ⇒ Ai(Sy).

  Λ, R(y′, Sy, z′) ⇒ R(y′, Sy, z′)
  Λ, R(z, y, z′), R(z, y, z′) → R(y′, Sy, z′) ⇒ R(y′, Sy, z′)
  Λ, R(z, y, z′), R(y′, y, z), R(y′, y, z) → R(z, y, z′) → R(y′, Sy, z′) ⇒ R(y′, Sy, z′)
  Aiz′ ⇒ Aiz′        Λ, R(z, y, z′), R(y′, y, z) ⇒ R(y′, Sy, z′)
  Λ, Aiz′, R(z, y, z′), R(y′, y, z) ⇒ Aiz′ ∧ R(y′, Sy, z′)
  Λ, Aiz′, R(z, y, z′), R(y′, y, z) ⇒ ∃z(Aiz ∧ R(y′, Sy, z))
  Λ, Aiz′ ∧ R(z, y, z′), R(y′, y, z) ⇒ ∃z(Aiz ∧ R(y′, Sy, z))
  Λ, ∃z′(Aiz′ ∧ R(z, y, z′)), R(y′, y, z) ⇒ ∃z(Aiz ∧ R(y′, Sy, z))
  Λ, Aiz → ∃z′(Aiz′ ∧ R(z, y, z′)), Aiz, R(y′, y, z) ⇒ ∃z(Aiz ∧ R(y′, Sy, z))
  Λ, ∀z(Aiz → ∃z′(Aiz′ ∧ R(z, y, z′))), Aiz, R(y′, y, z) ⇒ ∃z(Aiz ∧ R(y′, Sy, z))
  Λ, Ai+1y, Aiz, R(y′, y, z) ⇒ ∃z(Aiz ∧ R(y′, Sy, z))
  Λ, Ai+1y, Aiz ∧ R(y′, y, z) ⇒ ∃z(Aiz ∧ R(y′, Sy, z))
  Λ, Ai+1y, ∃z(Aiz ∧ R(y′, y, z)) ⇒ ∃z(Aiz ∧ R(y′, Sy, z))
  Λ, Ai+1y, Aiy′ → ∃z(Aiz ∧ R(y′, y, z)), Aiy′ ⇒ ∃z(Aiz ∧ R(y′, Sy, z))
  Λ, Ai+1y, ∀y′(Aiy′ → ∃z(Aiz ∧ R(y′, y, z))), Aiy′ ⇒ ∃z(Aiz ∧ R(y′, Sy, z))
  Λ, Ai+1y, Aiy′ ⇒ ∃z(Aiz ∧ R(y′, Sy, z))
  Λ, Ai+1y ⇒ Aiy′ → ∃z(Aiz ∧ R(y′, Sy, z))
  Λ, Ai+1y ⇒ ∀y′(Aiy′ → ∃z(Aiz ∧ R(y′, Sy, z)))
  Λ, Ai+1y ⇒ Ai+1(Sy)

Then the main deduction is:

  Λ, R(y, 0, Sy) ⇒ R(y, 0, Sy)
  Λ ⇒ R(y, 0, Sy)        Λ, Ai+1y ⇒ Ai+1(Sy)
  Λ, Ai+1y ⇒ Ai+1(Sy) ∧ R(y, 0, Sy)
  Λ, Ai+1y ⇒ ∃z(Ai+1z ∧ R(y, 0, z))
  Λ ⇒ Ai+1y → ∃z(Ai+1z ∧ R(y, 0, z))
  Λ ⇒ Ai+2(0)  □

Now we define the formula we intend to use to illustrate the growth of cut-elimination: the formula Ck is given by

  ∃z0, z1, . . . , zk (R(0, 0, zk) ∧ R(0, zk, zk−1) ∧ ··· ∧ R(0, z1, z0)).

Note that in the intended interpretation, where terms are natural numbers and R(x, y, z) means z = x + 2^y, the only solution is to inductively have

  zk = 1 = 2_0, zk−1 = 2 = 2_1, zk−2 = 2^(2_1) = 2_2, . . . , z0 = 2_k.
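We can check this chain of values numerically. Reading R(x, y, z) as z = x + 2^y (our reconstruction of the intended semantics), the conjuncts R(0, 0, zk), R(0, zk, zk−1), . . . , R(0, z1, z0) force z0 to be the tower 2_k of Definition 2.37; the function names below are ours.

```python
def tower(k):
    # 2_k of Definition 2.37: 2_0 = 1, 2_(k+1) = 2^(2_k)
    return 1 if k == 0 else 2 ** tower(k - 1)

def chain_solution(k):
    """Follow the conjuncts of C_k: R(0, 0, z_k) forces z_k = 0 + 2^0,
    and each R(0, z_i, z_(i-1)) forces z_(i-1) = 0 + 2^(z_i)."""
    z = 0 + 2 ** 0            # z_k = 1
    for _ in range(k):
        z = 0 + 2 ** z        # z_(i-1) = 2^(z_i)
    return z                  # this is z_0

# the unique solution of the chain is exactly the tower function
for k in range(5):
    assert chain_solution(k) == tower(k)
```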

Lemma 2.39. There is a constant c and, for every k, a deduction of Λ ⇒ Ck with size ≤ c·k.

Proof. Let B0 be the formula R(0, z1, z0), and let Bi(z) for i > 0 be the formula Ai−1(z) ∧ R(0, zi+1, z).

  Λ, R(0, 0, zk), R(0, zk, zk−1), . . . , R(0, z2, z1), R(0, z1, z0) ⇒ Ck
  Λ, R(0, 0, zk), R(0, zk, zk−1), . . . , R(0, z2, z1), ∃zR(0, z1, z) ⇒ Ck
  Λ, R(0, 0, zk), R(0, zk, zk−1), . . . , R(0, z2, z1), A0z1 ⇒ Ck
  Λ, R(0, 0, zk), R(0, zk, zk−1), . . . , A0z1 ∧ R(0, z2, z1) ⇒ Ck
  Λ, R(0, 0, zk), R(0, zk, zk−1), . . . , ∃z(A0z ∧ R(0, z2, z)) ⇒ Ck
  Λ ⇒ A00        Λ, R(0, 0, zk), R(0, zk, zk−1), . . . , A00 → ∃z(A0z ∧ R(0, z2, z)) ⇒ Ck
  ⋮
  Λ, R(0, 0, zk), R(0, zk, zk−1), . . . , A1z2 ⇒ Ck

  Λ, R(0, 0, zk), Ak−1zk ⇒ Ck
  Λ, Ak−1zk ∧ R(0, 0, zk) ⇒ Ck
  Λ, ∃z(Ak−1z ∧ R(0, 0, z)) ⇒ Ck
  Λ ⇒ Ak−10        Λ, Ak−10 → ∃z(Ak−1z ∧ R(0, 0, z)) ⇒ Ck
  ⋮
  Λ ⇒ Ak0        Λ, Ak0 ⇒ Ck
  Λ ⇒ Ck  □

We note the following helpful lemma:

Lemma 2.40. Suppose we have a cut-free deduction of Γ ⇒ Σ, ∆ where ∆ consists of atomic formulas, Γ does not contain ⊥, and no Ax rule in the deduction includes a formula from ∆. Then there is a cut-free deduction of Γ ⇒ Σ.

Proof. By a straightforward induction we may simply delete the formulas of ∆ from every sequent in the deduction; ∆ consists of atomic formulas, so no inference rule has a main formula in ∆, and no Ax rule depends on the formulas in ∆, so ∆ is never needed. □

We also need the following, which can be proven using the method used to prove the middle sequent theorem:

Lemma 2.41. If there is a cut-free deduction of Γ, φ → ψ ⇒ Σ then there is a cut-free deduction of at most the same size in which the final inference is an L → rule introducing φ → ψ.

Theorem 2.42. Any cut-free deduction of Λ ⇒ Ck has size ≥ 2_(k−1).

Proof. These are prenex formulas, so by the middle sequent theorem, we may assume that there is a subdeduction of some sequent Γ ⇒ Σ where Γ consists entirely of instances of the axioms in Λ while Σ consists entirely of conjunctions of atomic formulas. We may replace all free variables appearing in this sequent by the term 0, and so assume that all formulas are closed.

For any natural number n, let us write n for the term consisting of n applications of S followed by 0. The only formulas we will have to deal with are formulas of the form R(m, n, k) and conjunctions or implications built from such formulas. We call a formula R(m, n, k) accurate if k = m + 2^n; we call a conjunction of such formulas accurate if each conjunct is accurate, and we call an implication accurate if its premise is not accurate or its conclusion is accurate. (In other words, a formula is accurate if it is true in the intended model.)

Note that all the elements of Γ must be accurate, since they are instances of the axioms. As a consequence, we must have the (unique) accurate instance of Ck appearing on the right-hand side, since we know there is a model (namely, the natural numbers with R(x, y, z) meaning z = x + 2^y) in which this is the unique solution, and anything provable must be true in each model. (We may, however, have ended up with extraneous, inaccurate solutions on the right-hand side as well.)

Let R(0, 0, nk) ∧ R(0, nk, nk−1) ∧ ··· ∧ R(0, n1, n0) be the unique accurate solution and let Σ′ be the result of removing this from Σ. By inversion, we have deductions of Γ ⇒ Σ′, R(0, ni, ni−1) for each i ≤ k + 1 (where nk+1 = 0). Further, by applying inversion to the other conjunctions, we may arrange for Σ′ to consist entirely of inaccurate formulas. To summarize, we have obtained a deduction of

  Γ ⇒ Σ′, R(0, n1, n0)

where n0 = 2_k, n1 = 2_(k−1), all formulas in Σ′ are atomic and inaccurate, and Γ consists of instances of axioms from Λ.

We give one more definition before getting to the heart of our proof: we define the scale of R(m, n, k) to be n, and if φ is a propositional combination of atomic formulas, the scale of φ is the largest scale of an atomic formula appearing in φ. We now prove the following by induction on n:

  Suppose there is a cut-free deduction of Γ ⇒ Σ where Γ consists of instances of axioms from Λ, Σ consists of atomic formulas, and every accurate formula in Σ has scale ≥ n. Then there are at least n formulas in Γ of scale ≤ n.

If n = 0, this is trivial, so assume n > 0. Without loss of generality, we may assume that every element of Γ is the conclusion of some inference rule in the deduction and that every element of Σ appears as the main formula of some Ax rule in this deduction. We proceed by side induction on the size of the deduction.

Choose some accurate formula R(m, n′, k) in Σ. R(m, n′, k) appears as the main formula of some Ax rule, and therefore is a subformula of some formula on the left side. Since n′ ≥ n > 0, this must be an instance of the second axiom, and so has the form φ → φ′ → ψ.

By the previous lemma, we may assume that the final inference is L → introducing this instance. One of these subdeductions has a further implication in the left-hand side, so we may apply the same argument again. That is, we may assume the deduction has the form

  Γ, φ′ → ψ, φ → φ′ → ψ ⇒ Σ, φ′        Γ, ψ, φ → φ′ → ψ ⇒ Σ
  Γ, φ → φ′ → ψ ⇒ Σ, φ        Γ, φ′ → ψ, φ → φ′ → ψ ⇒ Σ
  Γ, φ → φ′ → ψ ⇒ Σ
  Γ ⇒ Σ

We split into cases. If R(m, n′, k) is either φ or φ′ then there are subdeductions to which the side inductive hypothesis applies (that is, subdeductions which are necessarily shorter where the accurate elements of the consequent have scale at least n). So we consider the case where R(m, n′, k) is ψ.

In this case, we may consider the subdeduction of Γ, φ → φ′ → ψ ⇒ Σ, φ, where φ necessarily has scale n′ − 1. If n′ > n then the side inductive hypothesis applies to Γ, φ → φ′ → ψ ⇒ Σ, φ. So suppose n′ = n; then the main inductive hypothesis applies to Γ, φ → φ′ → ψ ⇒ Σ, φ, and therefore Γ, φ → φ′ → ψ contains at least n − 1 elements of scale ≤ n − 1. Furthermore, these must be distinct from φ → φ′ → ψ, since the latter has scale n. So Γ contains at least n elements of scale ≤ n.

Now we return to our original deduction. The accurate solution includes a conjunct R(0, n1, n0) where we must have n1 = 2_(k−1). Therefore |Γ| ≥ 2_(k−1), and there must be at least 2_(k−1) inferences to convert all these formulas into instances of Λ. Therefore the size of the original deduction was at least 2_(k−1). □