<<

Highlights in Pro of



Solomon Feferman

Abstract

This is a survey of some of the principal developments in pro of

theory from its inception in the 1920s, at the hands of David Hilb ert,

up to the 1960s. Hilb ert's aim was to use this as a to ol in his nitary

program to eliminate the \actual in nite" in

from pro ofs of purely nitary statements. One of the main approaches

that turned out to b e the most useful in pursuit of this program was

that due to , in the 1930s, via his calculi of \"

and his Cut-Elimination for them. Following that wetrace

how and why prima facie in nitary concepts, such as ordinals, and

in nitary metho ds, such as the use of in nitely long pro ofs, gradually

came to dominate pro of-theoretical developments.

In this rst lecture I will giveanoverview of the developments in pro of

theory since Hilb ert's initiative in establishing the sub ject in the 1920s. For

this purp ose I am following the rst part of a series of exp ository lectures

that I gave for the Collo quium `94 held in Clermont-Ferrand 21-23

July 1994, but haven't published. The theme of my lectures there was that

although Hilb ert established his theory of pro ofs as a part of his foundational

program and, for philosophical reasons whichwe shall get into, aimed to have

it develop ed in a completely nitistic way, the actual work in pro of theory



This is the rst of three lectures that I delivered at the conference, Pro of Theory: His-

tory and Philosophical Signi cance, held at the University of Roskilde, Denmark, Oct. 31{

Nov. 1, 1997. I wish to thank the organizers, Prof. Stig Andur Pedersen and Prof. Vincent

F. Hendricks for inviting me to present these lectures, and esp ecially for their substantial

work in preparing them for publication with the assistance of Mr. Klaus Frovin Jorgen-

son, from their audio tap es and my transparencies. I have edited the resulting articles for

coherence and readability, but otherwise have maintained their character as lectures, as

originally given. 1

has moved steadily away from that towards rather in nitary metho ds. So

the question p osed in the title of my Clermont-Ferrand lectures was \howdid

this happ en, how is it that nitary pro of theory b ecame in nitary?". We will

get some idea of this. But what I want to concentrate on here are some of

the technical asp ects of that work, to try to giveyou an idea of what are the

notions involved in the actual development of pro of theory, and what are some

of the main results, at least up to the 1960s. The references b elow should b e

consulted for more detailed technical exp ositions that bring the sub ject up

to the present, and for a more complete history of its development.

1. Review of Hilb ert's Program and nitary pro of theory. Let us

start with Hilb ert's program itself and his conception of it. He was very

concerned foundationally ab out the problems of the in nite in mathematics.

Those were of two kinds. On the one hand you had explicit uses of the in nite

in Cantorian theory, that featured in some sense the completed in nite,

the trans nite. Then he also saw as a problem implicit uses of the in nite,

already in classical numb er theory with the use of rst order predicate cal-

culus and ordinary where, by the law of the excluded middle,

one assumes the natural numb ers as in some sense a completed totality.For,

one would havetobeabletodecidebetween alternatives of the form: either

all integers have a prop erty R or there is some integer which fails to have the

prop erty R,i.e.

8xR(x) _9x:R(x): (1)

In general you would not b e able to decide this. That kind of reasoning is

essential in various numb er-theoretical . In order to eliminate the

in nite in mathematics, Hilb ert's idea was that somehow the actual in nite

should only as an ideal , and is to b e eliminated in favor

of the p otential in nite. So nitism, in his conception of it, is to b e the

mathematics of the p otential in nite.

Wehave a contrast here b etween nitary (or nitistic) metho ds and in-

nitary (or in nitistic) metho ds. Paradigmatic for the former is the estab-

lishment of universal prop ositions of the form: that each natural numb er, x,

has an e ectively decidable prop erty R(x) (or even more basically, a primi-

tive recursive prop erty). A nitistic pro of of this has the character that for

each individual natural number x you can establish R at x,justby running

though a nite p ortion of the integers. In the pro of itself one mighthaveto

go far b eyond x in order to establish that, but, so to sp eak, the nitary char- 2

acter of the statementthat R holds of all natural numbers x is that at each

individual natural number you will only need a nite p ortion of the natural

numb ers to verify the statement. If A on the other hand were a prop osition

of the form 8xR(x), i.e. that all integers have a certain prop erty, there would

b e no way in principle in which that could b e veri ed by nite means as a

whole.

How then was pro of theory to b e used as a to ol in Hilb ert's program? How

was pro of theory to b e used in order to eliminate the actual in nite in systems

likenumb er theory, as re ected in prop ositions like (1)|to reduce pro ofs in

systems whichinvolve implicitly or explicitly the actual in nite, to pro ofs of

nitary, or apparently nitary, prop ositions, and to reduce these pro ofs to

actual nitary veri cations? The idea was: you would show nitistically that

if you have a S in which a b o dy of mathematics is represented,

and if S proves a nitarily meaningful prop osition R(x), then you would want

to b e able to showthatateach natural numb er, x, R(x) holds. Formally

that takes the form: If u is a pro of in S of R(x) then R(x)holds,i.e.

P r oof (u; pR(x)q) ! R(x) (2)

S

In order to do this it turns out that for a system S which is able to es-

tablish elementary facts ab out primitive recursive functions and relations, it

is sucient to establish the consistency of S in a nitistic way.For supp ose

you have a pro of of R(x)inS (with free variable x)butR(x) do es not hold

(for a sp eci c x). Then within S you could provethatR(x) do es not hold (at

that x) and therefore you would have a contradiction; if you assumed consis-

tency this could not happ en. That is why Hilb ert's program concentrated on

the consistency problem. The essence of the program was just this so-called

re ection principle (2) for nitary statements: that if you have a pro of of

R(x) in the system then in fact R(x) holds. The elimination of the actual or

potential in nite in S would b e accomplished if the re ection principle could

b e established and, for that, if the consistency of S , Con(S ), could b e proved

nitistically.Infreevariable form, Con(S ) is the formula

: (u; p0=1q) (3)

S

anditisthus a candidate for a nitistic pro of.

Thereareanumb er of metho ds whichwere develop ed in pro of theory in

order to b e able to establish this program for the elimination of the actual

in nite (explicit or implicit) in mathematics, and I just wanttomention 3

several b efore turning to what has come to b e the dominantmethod. I

divide these into two groups, one group having a certain kind of functional

character and a second group having a more syntactic character.

Functional Metho ds. The ones having functional character start with

Hilb ert's "-calculus. `"' there had nothing to do with the memb ership rela-

tion, but is simply a symb ol he used to write "xA(x), whichisinterpreted

informally as an x such that the prop erty A(x) holds, if there is anysuch

x at all. So you would say there is an x suchthat A(x) holds if A holds of

"xA(x), that is, if A("xA(x)). You have here a formal elimination of quan-

ti ersinfavor of such "-terms. Then you wanttoshow that "-terms can b e

eliminated, and thence reduce predicate calculus to prop ositional calculus,

which is unproblematic from a nitary p oint of view. I think of these "-terms

as functions, b ecause what they really do is to provide choices of an x,such

that A(x) holds; they are functions of the other arguments in A(x). Hilb ert

himself initiated this approach; he prop osed some that oughttobe

established. Wilhelm Ackermann, his collab orator, continued the work, and

it went a certain distance, but then after the 1930s not muchwas done until

William Tait to ok it up again in the late 1950s and early 1960s. In more

recentyears it is my colleague Grigori Mints who has really pursued this

quite systematically and has extended the metho d to rather strong systems.

That is work which is currently in progress.

Jacques Herbrand had in some resp ects a related approach, and again

the idea was to reduce in the predicate calculus in some wayto

validity in the prop ositional calculus. Basically what you do is that you in-

tro duce functions which are in a sense dual to Skolem functions, which are

ob jects that are appropriate to eliminate quanti ers for satis ability. Her-

brand used functions which are appropriate for eliminating quanti ers for

validity.Formally you can reduce questions of validity of arbitrary formulas

in the predicate calculus to questions of validity of purely existential for-

mulas byintro ducing many new function symb ols in them. Then Herbrand

showed that if an existential formulaisprovable then some nite disjunction

of instances of that formula is prop ositionally provable; moreover you can

derive the original formula, once you havesuch a nite substitution instance.

This approachwas initiated by Herbrand around 1930; there were technical

problems in Herbrand's own work, whichwere dealt with in the 1950s by

Burton Dreb en and his (then) studentWarren Goldfarb, among others.

Related workthatcontinues this into number theory was carried on by 4

Georg Kreisel, under the rubric of \the no-counter-example ".

That designation comes from the fact that if a formula is not valid, Skolem

functions would give an example for the negation of the formula. So if you

say that the formula is derivable, there is no counter-example, and that leads

you to the idea of the \no-counter-example interpretation".

Kurt Godel did something quite di erent where functionals enter very

clearly, in the use of the so-called \Dialectica" (functional) interpretation.

He formally reduced classical systems to intuitionistic systems, and then

showed that pro ofs of intuitionistic prop ositions have a natural functional

interpretation which he provided.

Syntactical Metho ds. NowIwant to turn to what are more syntactic

or purely logical approaches. In the early 1930s Gerhard Gentzen intro duced

two kinds of calculi: First, the Calculi of , which are

reasonably close in formal terms to the actual wayinwhichwe carry out

pro ofs. He tried to use this kind of calculus in order to carry out Hilb ert's

program. But there were some technical problems there, and he put it aside

in favor of what are called Calculi of Sequents, which I will describ e in more

detail. However, in later years the Natural Deduction Calculi were taken up

again by Dag Prawitz, and these have b ecome very imp ortant, b oth in pro of

theory and in applications of pro of theory to computer science.

Gentzen intro duced two kinds of Calculi for the rst order pred-

icate calculus: LJ (intuitionistic predicate calculus) and LK (classical pred-

icate calculus). L stands for pure predicate logic, J for intuitionistic, K for

classical. The calculi have rules which are sp eci c to each logical op eration,

and they separate the use of implication from its use in in a way

that we will see in a moment.

2. Results of nitary pro of theory via Gentzen's L-Calculi. Let

; ;  b e nite sequences (or multi-sets, or sets|di erent p eople take dif-

ferentchoices) of formulas. Instead of using set notation we write A ; ;A

1 n

for the sequence (multi-set or set), i.e. = A ; ;A .;A simply means

1 n

adjoin A to the sequence , i.e. ;A = A ; ;A ;A. Gentzen considers

1 n

derivations of formal expressions of the form ` A where the sequence

represents the hyp otheses or premises and A the conclusion of an .

An expression of the form ` A is called a sequent. Gentzen also allows that

there maybeanempty conclusion, that is, expressions of the form `. In-

stead of thinking of ` as an argument without conclusion, you can think of 5

it as an argument with a contradiction as conclusion by, for instance, putting

as conclusion a formula for a contradiction, like0 = 1 or A ^:A for some

sp eci c formula A. The character of these rules is that they tell you how

something is to b e intro duced in a conclusion and they also tell you how

something is to b e used as a hyp othesis. So in each of these rules you have

a left rule and a right rule. They either intro duce a formula with a sp eci c

principal logical op erator as conclusion|those are the right rules|or they

intro duce in a similar way a formula in the hyp otheses (or antecedent of the

sequent)|those are the left rules.

The calculus LJ has the following formal rules

 Rules for logical op erations

Right Left

;A ` ` A

:

`:A ; :A `

;A ` B ` A ;B ` C

!

` A ! B ;A ! B ` C

` A ` B ;A ` C ;B ` C

_

` A _ B ` A _ B ;A _ B ` C

;A ` C ;B ` C ` A ` B

^

` A ^ B ;A ^ B ` C ;A ^ B ` C

;A(x) ` B ` A(t)

9 restriction on x

; 9xA(x) ` B `9xA(x)

;A(t) ` B ` A(x)

8 restriction on x

; 8xA(x) ` B `8xA(x)

 Cut rule

0

` A ;A ` B

0

; ` B

 Structural rules

` A

0

; 

0

` A 6

Let's rst of all lo ok at disjunction, whichisvery characteristic. You

can infer A _ B as the conclusion if you have either inferred A or you have

inferred B .TouseA _ B you argue as follows: if I knew A and I obtain C ,

that would b e one way, if I knew B and I obtain C that would b e another

way. I do not knowwhichof A or B holds, but if I have A _ B , either A then

C or B then C , and therefore we can conclude C . The rules for conjunction

are dual to those for disjunction.

With negation it go es as follows. To infer that establishes :A you need

to know that together with A reaches a contradiction. On the other hand

to establish that together with :A reaches a contradiction you establish

that A follows from .

Now, p erhaps the most natural rule, although they all are quite natural, is

the one with implication-intro duction on the right. In order to infer A ! B

from you simply say: let us take A as an additional hyp othesis with , use

it and then infer B .

For quanti cation, let us just lo ok at existential quanti cation: in order

to infer 9xA(x)you infer that establishes A(t) for a sp eci c t. In order

to establish that together with 9xA(x) has as consequence B you say:

supp ose 9xA(x), let x be anything with that prop erty and showthatB

is a consequence of together with A(x). We need x there to b e a fresh

variable that has not b een named otherwise in the sequentinvolved; that is

the restriction on x in the quanti er rules.

There are some and rules b esides these that are not sp eci c to

the connectives. Axioms are simply of the form A ` A, i.e. that from A, A

follows; that is clear. The structural rules say that the set of premises in an

argument can b e expanded: if from a set of hyp otheses you have obtained

0

A and you takeany larger set that also gives A.

The most signi cant rule here is the Cut Rule whichsays: if from you

0

have inferred A and from another with A you have inferred B then from

0

b oth and you can infer B . This rule has some of the character of Mo dus

Ponens, but you can also think of it as a rule which corresp onds to the use

of lemmas or theorems in the pro cess of a derivation: in that pro cess you

establish some particular lemma or theorem A as an intermediate step and

then by use of A you establish the nal conclusion B .

The Cut Rule di ers from the other rules in an imp ortant resp ect. With

the rules for intro duction of a connective on the left or the right, you see that

every formula that o ccurs ab ove the line o ccurs b elow the line either directly,

or as a sub-formula of a formula b elow the line, and that is also true for the 7

structural rules. (We count A(t) as a subformula, in a slightly extended

sense, of b oth 9xA(x)and 8xA(x).) But in the case of the Cut Rule, the

cut formula A disapp ears. So you have a kind of detour, p erhaps, through

a formula whichmay b e more complicated than formulas which o ccur in the

nal sequent.

Now, a crucial question would b e: under what conditions can we establish

a direct derivation from a set of hyp otheses, , to a conclusion, A, without

use of the Cut Rule, b ecause the Cut Rule means that we are making some

detours?

Before wegetinto this; what wehave describ ed so far is the Gentzen

calculus for . The Gentzen calculus for classical logic is

formally similar, except that on the righthandsidewemayalsohave a nite

set or sequence of formulas (= B ;:::;B ), and the rules lo ok exactly the

1 n

same as they did with the intuitionistic calculus allowing these sets of side

formulas, , throughout.

 The calculus LK has the following formal rules:

 Rules for logical op erations

Right Left

;A `  ` A; 

:

`:A;  ; :A ` 

;A ` B;  ` A; ;B ` C; 

!

` A ! B;  ; ;A ! B ` C; ; 

` B;  ;A ` C; ;B ` C;  ` A; 

_

` A _ B;  ` A _ B;  ;A _ B ` C; 

` A; ` B;  ;A ` C;  ;B ` C; 

^

` A ^ B;  ;A ^ B ` C;  ;A ^ B ` C; 

;A(x) ` B;  ` A(t); 

9 restriction on x

; 9xA(x) ` B;  `9xA(x); 

` A(x);  ;A(t) ` B; 

8 restriction on x

; 8xA(x) ` B;  `8xA(x);  8

 Cut rule

0 0

` A;  ;A ` 

0 0

; ` 

 Structural rules

` 

0 0

;  ;   

0 0

` 

In the classical case, Cut informally takes the form: if from you have

0 0 0 0

obtained ;A and from ;A you get  then from ; you obtain ,  .

In a momentwe will lo ok at an interpretation of LK in more usual Hilb ert

style terms, but to see how this form of the calculus gives us something that

the intuitionistic calculus do es not give, let us just lo ok at a pro of of the Law

of the Excluded Middle in LK. A formal pro of of this law takes the following

form:

A ` A

` A; :A

` A _:A; :A

` A _:A; A _:A

` A _:A

From the A ` A,you can bring negation over to the righthandside

and then, using the disjunction rule applied to the rst formula A on the

right hand side you get A_:A, and then using the disjunction rule once

more for intro ducing disjunction on the rightyou again obtain A _:A.Now

the set of formulas to the rightof` just consists of two exemplars of A _:A

which simply collapses to the set A _:A.From the way this argumentgoes

we do not knowwhy A _:A holds, i.e. we do not know whichof A; :A is

true, but by the mechanics of this form of Gentzen's classical calculus we are

able to derive it in that form. Note that weareblocked at the very outset

from carrying out this derivation in LJ, where at most one formula app ears

on the right hand side of a sequent. In LJ if ` A _ B then ` A or ` B , but

this do esn't hold in LK. In particular ` A _:A in LK even when neither

` A nor `:A.

The interpretation of LK is simply that a sequent ` isderivable

in that calculus just in case the conjunction of formulas in implies the

disjunction of formulas in , i.e.

A ^  ^ A ! B __B (4)

1 n 1 m 9

is valid. If you lo ok back at the rules with this interpretation in mind you

see that each of the rules preserves validity in that sense.

So the Cut Rule is a kind of generalization of Mo dus Ponens and takes

over its roles. The principal theorem that Gentzen obtained is the Cut-

Elimination Theorem, and this works b oth for the classical and the intu-

itionistic calculi. It says that for each derivation d of a sequent ` we



can asso ciate a cut-free derivation d of the same sequent, i.e. one whichhas

no applications of the cut rule in it. Everything that is used in it is either

an axiom, a or one of the rules for intro ducing one of the

connectives either on the left or on the right. There is a price, though, that

you havetopay for this, and that is that the length of the cut-free derivation

asso ciated with the original derivation is much longer in the worst case, and

it can growatahyp er-exp onential rate, which can b e measured as follows:

Asso ciated with each derivation is its cut-rank, which is the maximum of

the complexities of the cut-formulas A which are eliminated by the use of

the cut rule. You have a natural measure of the complexity of those for-

mulas and you also have a measure of the length of the derivation jdj and



the length of the new derivation jd j. The length of the cut-free derivation,

compared to the length of the original derivation, is b ounded bya stackof

2's ab ove which the highest exp onentisjdj, and the length of the stackis



the cut-rank of the original derivation. In symbols: jd j2 (jdj)where

r

2 (a)

n

r = cut-rank(d); 2 (a)= a and 2 (a)= 2 . Basically each reduction

0 n+1

of the cut-rank by 1 corresp onds to an increase in the length of the mo di ed

derivation, by exp onent to the base 2. By suitable examples this is in general

the b est p ossible; you can't do much b etter than this.

Though not feasible op erations in general, b ecause of the p ossible hyp er-

exp onential rate of growth, these are e ective transformations of the orig-

inal derivations into new cut-free derivations. The cut-free derivations are

signi cant, as I say, b ecause they have the sub-formula prop erty. In par-

ticular, if you ask whether you can derive the emptysequent, whichwould

simply b e a derivation of a contradiction, the answer is no, b ecause by the

sub-formula prop ertyeverything in such a derivation would havetobea

sub-formulaofthiseventual conclusion, but you have to start with axioms

so that can't b e p ossible. So this proves that the classical predicate calculus

is consistent, which is not surprising|but this is a simple example of how

the Cut-Elimination Theorem might b e useful in establishing consistency of

stronger systems S . 10

Applications. As with the disjunction prop ertyforLJ, that if ` A _ B

then ` A or ` B ,wehave an existential instantiation prop ertyfor LJ:if

`9xA(x), then for some t; ` A(t). This cannot b e done in LK,even for

quanti er-free A. Herbrand's Theorem, whichIhave already mentioned, tells

us the next b est thing we can do in LK. It runs as follows:

If R is a quanti er-free formula and `9xR(x)inLK then

there exist terms t ;  ;t suchthat ` R(t ) _  _ R(t ). And,

1 n 1 n

more generally, when is purely universal,ifin LK,`9xR(x)

then ` R(t ) _  _ R(t ) for some t ;:::;t :

1 n 1 n

A pro of of Herbrand's Theorem using LK go es along the following lines:

If you have a derivation d of an existential statement, `9xR(x), where R

is quanti er free, then by the Cut-Elimination Theorem we can transform

 

it into a cut-free derivation d of `9xR(x). Nowind , this nal sequent

would have had to have b een established by the right rule for existential

quanti cation, whichwould mean that you had a substitution instance R(t)

which brought that in. This 9xR(x)might still have b een there, b ecause by

the structural rules the set of formulas would collapse if I proved 9xR(x),

R(t) and then established 9xR(x), 9xR(x); and since the set of those are

the same as 9xR(x)I would still have that formula 9xR(x) one step back.

So wewould continue up the of that derivation in that way,eachtime

having p erhaps some new substitution instance. But eventually wehaveto

stop with that, and we will simply havea bunch of formulas on the rightof

the form: R(t );R(t );:::;R(t ), and of course that is the same as having a

1 2 n

pro of of the disjunction R(t ) _ R(t ) _ ::: _ R(t ).

1 2 n

What Herbrand's Theorem tells us|and what comes out of the pro of in

the Gentzen calculus|is that: classically,ifyou have a pro of of an existen-

tial statement, you do not necessarily know of one sp eci c instance which

realizes that statement, but you will always have a nite set of instances,

by means of whichyou can say at least one of those realizes the statement.

More generally,ifyou have a purely universal set of hyp otheses and that

proves an existential statement then there will b e a nite set of witnesses

whichproves the corresp onding disjunction. One way of seeing that is: take

the universal statements in , use the negation rule to bring them over to

the right side, they then b ecome existential statements, and we can apply

Herbrand's Theorem and then go back and wewillhave the conclusion.

The reason this is useful for Hilb ert's Program is that some formal systems

of interest to us have particularly simple axioms which are purely universal. 11

In particular, in Peano Arithmetic the axioms for 0, successor, addition, mul-

tiplication, and p erhaps other primitive recursive functions are all universal.

The only thing which mightnotbeuniversal is the Axiom of Induction, or the

scheme of induction with various formulas. Let us take the simplest form of

the scheme of induction, namely,Quanti er Free Induction Axiom (QF-IA)

0

R(0) ^8x[R(x) ! R(x )] !8xR(x) (5)

In this, R(x)isaquanti er-free formula, with p erhaps additional free vari-

0

ables. It expresses that if R(0) holds and if R(x) implies R(x )forany x,

then 8xR(x)holds.

Assuming you have some elementary prop erties of primitive recursive

functions and relations|in particular the \less than" relation, then you can

show that (5) is equivalent to the statement

0

8x[R(0) ^8y

which is purely universal with primitive recursive body. To see this, it is

sucient to know that the induction hyp othesis|that R transmits from y

0

to y |holds for numb ers under x in order to get up to x itself. In primitive

0

recursive arithmetic the b ounded quanti er formula 8y

is equivalent to a quanti er-free formula. So, the universal closure of (6)

now b ecomes a purely universal statement. Therefore, if you are lo oking

at consequences of that particular subsystem of Peano Arithmetic which

just uses QF-IA,byGentzen's Cut-Elimination Theorem (or if one prefers,

Herbrand's Theorem, or suitable theorems for Hilb ert's "-calculus), you are

able to establish that if you prove an existential statement in that system

you will have a disjunction of a nite numb er of instances provable there. In

particular, the consistency of this system is an immediate consequence.

That result for QF-IA was established byAckermann as the rst con-

tribution to Hilb ert's consistency program for a system of any mathematical

interest. Though there is not very muchyou can do mathematicallywithin

that system, it is non-trivial at the same time. Much later there was an

extension of this result to the instances of the induction axiom scheme which

0 0

IA): formulas denoted ( use what are called 

1 1

0

A(0) ^8x[A(x) ! A(x )] !8xA(x); (7)

0

where A is a  formula, i.e. it is of the form 9xR(x), where R itself is quanti-

1

er free. (The sup er `0' simply means you havejustnumerical quanti cation, 12

the sub `1' means one quanti er and the  means that it is existential.) The

0

result for  IA was rst proved by Charles Parsons using an adaptation

1

of Godel's functional interpretation. Wilfried Sieg later gave a new pro of,

using Herbrand-Gentzen style metho ds, which happ ens to b e useful for other

things.

0

It turns out from the work of Parsons that the system based on the 

1

Induction-Axiom is conservativeover the system of quanti er-free Primitive

Recursive Arithmetic, PRA. Being conservative means that anyformula

0

formulated in the language of PRA,which is provable with  induction

1

0

( IA), is already provable in PRA.

1

PRA itself is a system based on entirely quanti er-free axioms and rules

for primitive recursive functions, including a Rule of Induction rather than

an Axiom of Induction. It has b een argued byTait, and is generally agreed,

that everything that is obtained in PRA is nitistically justi able, at least

in principle (Tait has also argued the converse). Assuming this, Parsons'

result is, therefore, again a contribution to Hilb ert's program: you eliminate

the use of the in nite as re ected in the use of classical predicate calculus,

0

together with the axioms for this fragment of arithmetic  IA in favor of

1

purely nitary pro ofs, as represented in PRA.

If you are a radical nitist you only talk ab out things that are feasibly

computable, and this result would not cover that b ecause the primitive re-

cursive functions, b eginning with exp onentiation, go far b eyond the feasibly

computable functions. But if you are nitist \in principle" then conservativ-

ityover PRA should certainly satisfy you.

3. Shifting paradigms. How far can Hilb ert's program b e carried out?

Godel's second incompleteness theorem told us that one would not b e able

to prove the consistency of full rst order arithmetic, PA (whose induction

axiom scheme applies to arbitrary formulas) by means which could b e for-

malized within PA itself. Although Hilb ert never made precise what exactly

nitist arguments were to consist in, all the nitary arguments that had b een

carried out up to Godel's incompleteness theorem were evidently formaliz-

able in very weak subsystems of PA, and in fact in PRA itself. Thus, one

had to rethink the Hilb ert program at that p oint, and ask whether there is

some reasonable mo di cation of it which could establish the consistency of

PA and yet stronger systems. For example, instead of reducing in nitary

systems to nitary systems, one could try to reduce non-constructive sys- 13

tems to constructive systems, or, as we will see, even seek other kinds of

reductions.

If one accepts intuitionistic logic as b eing a formal expression of con-

structive ideas, then one could say: supp ose we replace classical logic in PA,

which is b ehind the rst implicit use of the actual in nite (as in, e.g.

8xR(x) _9x:R(x), which cannot b e inferred in intuitionistic logic). Supp ose

weleave out the law of the excluded middle; then we obtain a system which

is called Heyting Arithmetic, HA,which lo oks just like PA with the di er-

ence that it is based on intuitionistic rather than classical logic. But there is

a simple translation of classical logic into intuitionistic logic and of classical

arithmetic into intuitionistic arithmetic|obtained indep endently byGodel

and Gentzen. It is not clear that Hilb ert would have b een satis ed with

this reduction in favor of a system where there is no implicit app eal to the

actual in nite. But this is evidence of a kind of thing that could b e done if

you replaced Hilb ert's program by this mo di cation where you say: let us

just see what can b e obtained by reducing non-constructive formal systems

involving app earances of the in nite into constructive formal systems which

do not contain such app earances.

But to carry out something like Hilb ert's original program for the full

system of arithmetic, PA, more mustbedoneifyou are going to try to push

a pro of of its consistency as far down to nitist arguments as p ossible. That

again was accomplished byGentzen, who broughtelements of the trans nite

into the picture, with the use of ordinals. The ordinals here are those which

are b elow Cantor's ordinal " . That is the limit of the sequence of ordinals

0

!

! !

!; ! ;! 

These ordinals can b e represented in nite form, in what is called Cantor-

normal form to base ! , and when represented symb olically in that form,

these simply lo ok like certain nite symb olic con gurations

m

0

; (8) ! +  + !

where the exp onents are again of such forms. The ordering relation b etween

these nite symb olic con gurations turns out to b e primitive recursive, in

avery easy way: you can decide whether one con guration, representing

an ordinal less than " , is less than another in a primitive recursiveway.

0

Consequently,you could say: though I am talking ab out certain trans nite

ob jects here|trans nite ordinals|I am representing them in a nitary way,

using this primitive recursive relation. 14

What Gentzen did was to asso ciate with each derivation in elementary

numb er theory a derivation of sequents using the induction rule (Ind-Rule)

0

` A(0);  ;A(x) ` ;A(x )

(9)

` A(x); 

(A arbitrary) to supplement the logical rules of LK, to see whether something

like the Cut-Elimination Theorem could hold.

It turns out that wedonothave full cut-elimination for this extension of

LK. But what wedohavebyGentzen's workisthatifyou have a derivation,

d, which is a p ossible derivation of the empty sequent, then with eachsuch

derivation you can assign an ordinal or d(d)lessthan" , such that certain

0

reductions like cut-elimination can b e applied to that derivation to lower the

complexity. If a derivation were a derivation of the empty sequentyou would

b e able to successively lower its complexity. That is, to each derivation d of

0 0

the empty sequent is asso ciated another one d ,suchthator d(d )

" . But if that were the case, then you would have a descending sequence in

0

the ab ove-mentioned order relation of order typ e " .So,ifyou assume that

0

that is a well-founded relation, or equivalently that you can apply trans nite

induction up to the ordinal " , then you can verify that there cannot b e

0

any such descending sequence of ordinals, and, therefore, there cannot b e

such a reduction sequence: so you cannot have a derivation of the empty

sequent. Consequently,wehave a consistency pro of of the full rst order

1

Peano Arithmetic, PA .

What principle of trans nite induction do weneedhere? It turns out

that we only need trans nite induction applied to quanti er-free formulas,

and that the rest of the argument can b e carried out with just things that

are purely within primitive recursive arithmetic PRA. The statement that

PA itself is consistent is a statement of purely universal character, it says

no derivation d is a pro of of 0 6= 1. All of this can b e done in a purely,so

to sp eak, nitary way except for the assumption of quanti er free trans nite

induction up to " .Quanti er-free trans nite induction up to " proves the

0 0

consistency of PA,

QFTI(" ) ` Con(PA ) (10)

0

nitistically (and certainly over PRA). Gentzen showed that this was the

b est p ossible in the following sense. For each ordinal less than " you can

0

1

Gentzen's work actually establishes something stronger, namely the re ection principle

0

for  -formulas 9xR(x)inPA.

1 15

prove trans nite induction up to , TI( ), in PA:

PA ` TI( ) for each <" : (11)

0

In some sense, " was thus attached as the ordinal of Peano arithmetic.

0

Further prop osed mo di cations of Hilb ert's program. That is

one way in which Hilb ert's program was extended. Much further work in this

direction was carried on in the 1950s by the Munichscho ol of Kurt Schutte

and in the scho ol of Gaisi Takeuti. Though the workofthesescho ols had sub-

stantial technical di erences, they agreed on a general extension of Hilb ert's

consistency program. The principal aim of p eople within this program has

b een to prove the consistency of stronger and stronger formal systems, S .

And the way that is to b e done is to asso ciate an ordinal, ,whichcanbe

represented primitive recursively,with S , and then provetwo things. First,

that by just using nitary metho ds and trans nite induction up to you are

able to prove the consistency of S ; and, second, that this is b est p ossible in

the sense that for each ordinal smaller than ,you can prove the trans nite

induction principle up to , TI( ), for < ,inS itself. But also the

trans nite induction principle up to itself must somehow b e recognized in

some constructiveway.Soyou have a kind of curious combination here of

getting larger and larger ordinals attached to formal systems; you try to b e

as nitary as p ossible, but there is this one non- nitary element, the trans -

nite induction up to the ordinal asso ciated with the system. You mayhave

to use very strong constructive metho ds in order to establish that trans nite

induction principle.

A quite di erentway of lo oking at what pro of theory oughttodo,was

prop osed by Kreisel and continued by me in the article \Hilb ert's program

relativized" (see the References). Instead of saying, as in the initial Hilb ert's

program, that what we are trying to do is reduce in nitary systems to nitary

systems, let us say: well, one thing we can do is to p erform various other

kinds of reductions. For instance, reduce

 non-constructive systems to constructive systems

 impredicative systems to predicative systems,

and so on, and then obtain related conservation results. There the reduction

do es not mention ordinals at all, but the pro of of the reduction might actually 16

have to use the techniques of Gentzen and the extensions bySchutte, Takeuti

and others b ehind the scenes in order to establish such conservation results.

That is still another direction that one might hop e to pursue with extensions

of pro of theory. This is something I will talk more ab out in the following

lectures.

Kreisel himself also promoted what you might call a non-ideological at-

tempt at using pro of theory as a kind of extension of Herbrand's Theorem,

dealing with the question: if you prove some existential statements, what

kinds of information can you obtain ab out witnesses to those existential

statements? In the case of numb er theory typically statements of interest

are of the form

8x9yA(x; y ) (12)

that is, for every natural numb er there exists another natural number having

a certain prop erty, A.Whenproving such statements non-constructively you

would like to extract the constructive or recursive content, i.e., you would

liketoknowhow y is obtained as a function of x. In the case that A is

primitive recursive, y is obtained as a recursive function of x. Then you talk

ab out what are called the provably recursive functions of the system S . One

way of thinking of that is that you mayhave a non-constructive pro of of the

existence of a recursivemany-one relation, but the question is actually to

pro duce that recursive relation, y as a function of x, or as a program which

realizes that function. Again, that can b e carried out. So, for example, in

the case of arithmetic p eople have pro duced what are called hierarchies of

recursive functions, or sub-recursive hierarchies, starting with the primitive

recursive functions, going a little farther with the Ackermann function, which

uses a certain bit of ordinal , and then farther up to the ordinal " .

0

Kreisel characterized the provably recursive functions of PA in terms of a

sub-recursive hierarchyupto " .For stronger systems the asso ciated ordinal

0

also leads you to hierarchies of provably recursive functions. That is another

way in which pro of theory is used to obtain, in principle useful, mathematical

information, although in fact it do es not go muchbeyond pro ducing such

hierarchies.

There have b een di erentways that p eople have talked ab out the ordinal

of a formal system. One is the Gentzen-Schutte-Takeuti way, the least ordinal

that you can use to prove the consistency of the system,

the least : PRA +(QFTI( )) ` Con(S ): (13) 17

Another is that it is the least provably recursivewell-ordering of the system,

i.e.

= supfjj:  is a provably recursivewell-ordering of S g: (14)

Still another is that it is the ordinal of the least hierarchywhich cannot b e

established in the system

= supf : e ective trans nite recursion up to is justi ed in S g: (15)

There are no very go o d robust de nitions of these concepts although they

agree in practice. Unfortunately,wedonothave a theory that tells us exactly

what we are doing when we obtain the ordinal of a formal system, though it

is clear that we are doing something of interest.

4. Countably in nitary metho ds (getting the most out of logic). In

the 1950s there was a shift in techniques to the use of in nitary formal and

2

semi-formal systems .Various p eople were involved in this but the p erson

who promoted it most vigorously was Kurt Schutte in Munich. He had a

di erentway of getting the ordinal asso ciated with Peano Arithmetic out of

a direct extension of logic to an in nitary system where, instead of the usual

right universal quanti er rule, we use the ! -rule. It says that if you proved

A(n) for eachnumeraln  , then you are allowed to prove 8xA(x), where A

here is an arbitrary formula. In sequent form it takes the following form

 ` ;A(n)  (n =0; 1; 2;:::)

(16)

` ; 8xA(x)

The system obtained by adding this rule to LK is denoted LK(! ).

Derivations here are in general going to b e in nite, since you have in nite

branching at the ! -rule. But going up along any branchyou will eventually

come to an axiom. If you invert the derivation trees you can see that these

are well-founded b ecause, going down, every branch comes to an end. Being

well-founded in nite trees in this sense, they have a natural asso ciated ordinal

length, namely: the height of the tree as an ordinal. The ordinal asso ciated

with an ! - is the sup of the ordinals asso ciated with the hyp otheses

plus 1, that is, it is the least ordinal greater than all the ordinals asso ciated

with the sub-derivations of each instance ` ;A(n).

2

The history here is incomplete. In particular, P.S.Novikov develop ed cut-elimination

for in nitary derivations with ordinal b ounds already in the late 1930s-early 1940s, but

this was largely unknown in the West for some time; cf. Mints (1991), 387-389. 18

You can replace the use of the induction axiom in Peano arithmetic by

applications of the ! -rule. There is an immediate of derivations in PA

into derivations, d,inLK(! ) preserving cut-rank. So these derivations in

LK(! )have cut-rank less than ! , but, due to the ! -rule, the length of d is

now going to b e in nite, but not to o large, namely, less than ! + ! .

Now, lowering the cut-rank in d by1 gives us an exp onential cost of j d j

to base ! . That is,

0 0 jdj

d 7! d ; j d j ! (17)

By rep eating this r times, where r is the cut-rank of d, r = rnk (d), we end



up with a cut-free derivation d of the same sequent, whose length is at most

astackof ! 's r times up to the original length of the derivation d,i.e.

 

d 7! d ; j d j ! (jdj); (18)

r

! ( )

n

where ! ( )= and w ( )=! .

0 n+1

If we do things rightyou can take base 2 instead of base ! ,justaswedidin

the nitary case, but when we are in ordinals the e ect of that is essentially

!

the same, since 2 = ! . The rst ordinal which is bigger than all those

ordinals that we get via (18) is simply " . So, " falls out of this directly,

0 0

and nowthisisaful l Cut-Elimination Theorem in ! -logic, of derivations

with nite cut-rank, whereas in the Gentzen-treatmentof PA you only had

0

a partial Cut-Elimination Theorem of derivations of  -sequents.

1

Here wehave what p eople regarded as an essential metho dological or

conceptual improvement, and that is that ordinals here were asso ciated in

a canonical way with in nite derivations in contrast to the original work of

Gentzen, where the asso ciation of ordinals lo oked rather ad ho c. It turns

out in recentwork by Wilfried Buchholz that in fact there is a much closer

intrinsic connection b etween the way that Gentzen assigned the ordinal "

0

to derivations in PA and the way that Schutte assigned ordinals to the

asso ciated derivations in LK (! ), than was realized for manyyears. In fact,

this same connection had b een prettymuch established back in the mid 1970s

by Mints, but it was not well-known. What Buchholz has done is essentially

to incorp orate the work of Mints and to simplify it in a waynow that it can

b e extended to much stronger systems, so that it turns out that there is a

much closer intrinsic connection than had b een previously susp ected b etween

ordinal assignments in Schutte-style pro of theory using in nite derivations

and Takeuti-style pro of theory using nitary derivations. 19

The ordinal " came out as a natural stopping p oint, b ecause it is the rst

0

ordinal closed under exp onentiation. If you take the function  ( )= ! ,

0

and if you de ne  as the function whichenumerates the xed-p oints of  ,

1 0

and continue this pro cedure into the trans nite, then you get a hierarchyof

ordinal functions which is called the Veblen hierarchy  of critical functions,

de ned by:

 ( )=!

0

 enumeratesf j  ( )= g (19)

+1

\

 enumerates Rng ( ); when  is a limit ordinal.



<

So, e.g.  enumerates the "-numb ers, i.e.,  ( )=" . The ordinals

1 1

 ( ) app ear in further extensions into in nitary pro of theory,aswe shall

see next.

The calculus LK . The basic simpli cation here is due to Tait

! ;!

1

who said instead of just using the ! -rule|which seems rather sp ecial to

3

arithmetic|let us use countably in nite conjunctions and disjunctions .There

are natural rules which generalize those for nite disjunctions and conjunc-

tions, and which lo ok exactly like the ones we had b efore in LK, namely

V

A with the rules  countable in nite conjunctions

n

n

 ` ;A (n

n k

V V

(20)

` ; ; A A ` 

n n

n n

W

A with the rules  countable in nite disjunctions

n

n

` ;A ;A `   (n

k n

W W

(21)

A A `  ` ; ;

n n

n n

Let LK b e Gentzen's calculus LK extended with these in nitary rules.

! ;!

1

(The sub '! ' here means that all conjunctions and disjunctions have length

1

less than the rst uncountable ordinal ! , while the sub `! ' means that we

1

only allow nite strings of quanti ers at each o ccurrence.) Tait established

a Cut-Elimination Theorem for this language. Derivations mightnowhave

3

Again, this was anticipated byNovikov; cf. Mints (1991), pp. 387-389. 20

trans nite cut-rank instead of nite cut-rank; roughly sp eaking, you can

b ound the length of a cut-free derivation obtained from an original derivation

of cut-rank and length in the Veblen hierarchyby iterating places out

in the  hierarchy. That is, for a derivation d with arbitrary cut-rank and



length , where ; < ! , there exists a cut-free derivation d of the same

1

conclusion with



j d j  ( ) (22)

Actually this is not b est p ossible but this is prettymuch the order of ordinal

complexityofwhatwe get.

Applications. This gets applied to Rami ed Analysis, whichisofvery

great signi cance in the pro of theory of predicativity, and which is something

I am going to b e talking ab out a fair amountinmy next lecture. Predicative

analysis, or rami ed analysis, is essentially Godel's notion of constructibility

restricted to sets of natural numb ers, where you have sets indexed by ordi-

nals at di erent levels and basically eachlevel corresp onds to sets which are

de nable using only quanti cation over sets of lower levels. By RA one

means the system using levels up to ,andinRA ,you use variables only

<

of levels less than .

Schutte established that the pro of theoretical ordinal of RA is  (0)

<

when ! = . That can b e proved quite directly as Tait did bycut-

elimination in the in nitary system LK that I describ ed ab ove. Now,

! ;!

1

Kreisel had prop osed that to explicate the notion of predicativityweuse

Rami ed Analysis, not through arbitrary ordinals, but only through those

ordinals which are accessible from b elow. That is, we only consider au-

tonomously generated RA which means that there is a , < ,ina

previously obtained RA with

RA ` WO( ); (23)

where  is a primitive recursive ordering of order typ e and WO( )

expresses that it is a well ordering. So that is what we call a b o ot-strapping

or autonomy pro cedure.

Schutte and I, indep endently, established in 1964 that the least impred-

icative ordinal is the least ordinal not obtained through the Veblen pro cess.

That is called and it is the least ordinal with the prop erty

0

; < )  ( ) < (24)

0 0

By (24), one has complete cut-elimination in the in nitary system up to .

0 21

This is basically where we came to in the 1960s, and is the b eginning

of what is called predicative pro of theory; further extensions make use of

prima facie uncountable in nitary derivations and much more complicated

ordinal notation systems. For an intro duction to predicative pro of theory,

see the monograph byWolfram Pohlers in the References b elow. Forasurvey

of many further extensions, see also, among other sources, his article \Sub-

systems of and second order numb er theory" in the Handbook of

Mathematical Logic edited bySamuel R. Buss. This material gets extremely

complicated technically, and is b eyond the scop e of this lecture to try to

explain, even in the roughest terms.

References

TEXTS:

Girard, Jean-Yves, Proof Theory and Logical Complexity. Naples, 1987.

Hilb ert, David and Bernays, Paul, Grund lagen der Mathematik,Vol. I.

2nd edn. Berlin, 1968. Ibid.Vol. I I, 2nd edn. Berlin, 1970.

Prawitz, Dag, Natural Deduction. Sto ckholm, 1965.

Schutte, Kurt, Beweistheorie. Berlin, 1960.

Schutte, Kurt, Proof Theory. Berlin, 1977.

Takeuti, Gaisi, Proof Theory, 2nd edn. Amsterdam, 1987.

Tro elstra, A. S. and Schwichtenb erg, Helmut, Basic Proof Theory.

Cambridge, 1996.

COLLECTIONS:

Aczel, Peter; Simmons, Harold; and Wainer, Stanley (eds.) Proof

Theory,Cambridge, 1992.

Buss, Samuel R., (ed.) Handbook of Proof Theory, Amsterdam, 1998.

Buchholz, Wilfried; Feferman, Solomon; Pohlers, Wolfram; and

Sieg, Wilfried, Iterated Inductive De nitions and Subsytems of Anal-

ysis. Recent Proof-theoretical Studies, Lecture Notes in Mathematics

897 (1981). 22

Gentzen, Gerhard, The Col lectedPapers of Gerhard Gentzen (M. Szab o,

ed.). Amsterdam, 1969.

Mints, Grigori E., SelectedPapers in Proof Theory. Naples, 1992.

Kino, Akiko; Myhill, John; and Vesley,Richard E. (eds.), Intuition-

ism and Proof Theory. Amsterdam, 1970.

MONOGRAPHS:

Buchholz, Wilfried and Schutte, Kurt, Proof Theory of Impredicative

Systems. Naples, 1988.

Jager, Gerhard, for Admissible Sets. A Unifying approach to

Proof Theory. Naples, 1986.

Pohlers, Wolfram, Proof Theory. AnIntroduction. Lecture Notes in

Mathematics 1407, 1989.

RESEARCH AND EXPOSITORYARTICLES:

Avigad, Jeremy and Feferman, Solomon,\Godel's functional (`Dialec-

tica') interpretation", in Buss (1998) pp. 337-405.

Buchholz, Wilfried, \Explaining Gentzen's consistency pro of within in-

nitary pro of theory", in G. Gottlob et al. (eds.), Computational Logic

and Proof Theory, KGC `97, Lecture Notes in Computer Science 1289

(1997).

Buss, Samuel R., \An intro duction to pro of theory", in Buss (1998), 1-78.

Buss, Samuel R., \First-order pro of theory of arithmetic", in Buss (1998),

79-147.

Dreb en, Burton and Denton, John, \Herbrand-style consistency pro ofs",

in Myhill et al (1970), 419-433.

Fairtlough, Matt and Wainer, Stanley S., \Hierarchies of provably

recursive functions", in Buss (1998), 149-207. 23

Feferman, Solomon, \Pro of theory. A p ersonal rep ort", App endix to

Takeuti (1987), 447-485.

Feferman, Solomon, \Hilb ert's program relativized. Pro of-theoretical

and foundational reductions". J. Symbolic Logic 53 (1988), 364-384.

Kreisel, Georg, \A survey of pro of theory", J. Symbolic Logic 33 (1965)

321-388.

Mints, Grigori E., \Pro of theory in the USSR 1925-1969", J. Symbolic

Logic 56 (1991), 385-423.

Mints, Grigori E., \Gentzen-typ e systems and Hilb ert's epsilon substitu-

tion metho d. I", in Logic, Methodology and Philosophy of ScienceIX

(D. Prawitz et al, eds., 1994), 91-122.

Pohlers, Wolfram,\Contributions of the Schutte scho ol in Munichto

pro of theory", App endix to Takeuti (1987), 406-431.

Pohlers, Wolfram, \Subsystems of set theory and second order number

theory", in Buss (1998), 209-335.

Q

1

Rathjen, Michael, \Recentadvances in : -CA and

2

related systems", Bul letin of Symbolic Logic 1 (1995), 468-485.

Schwichtenb erg, Helmut, \Pro of theory: some applications of cut-elimination",

in Handbook of (J. Barwise, ed., 1997) 867-895.

Sieg, Wilfried, \Hilb ert's program sixtyyears later", J. Symbolic Logic 53

(1998) 338-348.

Simpson, Steven G.,\Partial realizations of Hilb ert's program", J. Sym-

bolic Logic 53 (1988) 349-363.

Tait, William, \The substitution metho d", J. Symbolic Logic 30 (1965)

175-192.

Tait, William, \Normal derivability in classical logic", in The and

Semantics of In nitary Languages (J. Barwise, ed.), Lecture Notes in

Mathematics 72 (1968) 204-236. 24