A fine structure for the hereditarily finite sets

Laurence Kirby

Figure 1: A5.

1 1 Introduction.

What does theory tell us about the finite sets? This may seem an odd question, because the explication of the infinite is the raison d’ˆetre of . That’s how it originated in Cantor’s work. The universalist, or reductionist, claim of set theory — its claim to provide a foundation for all of — came later. Nevertheless I propose to take seriously the picture that set theory provides of the (or a) and apply it to the finite sets. In any case, you cannot reach the infinite without building it upon the finite. And perhaps the finite can tell us something about the infinite? After all, the finite is all we have to go by, at least in a communicable form. The set-theoretic view of the universe of sets has different aspects that apply to the hereditarily finite sets: 1. The universalist claim, inasmuch as it presumably says that the hereditar- ily finite sets provide sufficient means to express all of finite mathematics. 2. In particular, the subsuming of arithmetic within finite set theory by the identification of the natural numbers with the finite (von Neumann) or- dinals. Although this is, in theory, not the only representation one could choose, it is hegemonic because it is the most practical and graceful. 3. The cumulative hierarchy which starts with the and generates all sets by iterating the operator. I shall take for granted the first two items, as well as the general idea of gener- ating all sets from the empty set, but propose a different generating principle. In infinitary set theory, the cumulative hierarchy and its many variants (from G¨odelonwards) are basic to the picture of the set-theoretic universe. The power set operator ensures explosive growth in the cumulative hierarchy and is really a crude tool. Indeed, many of the related hierarchies, such as the constructible hierarchy, can be viewed as ways of putting brakes on the growth in order to see some fine structure. At the finite level, the power set is also fast and crude. I shall consider a natural candidate for a slower growing hierarchy whose gen- erating principle is the binary operation of adduction

hx, yi7→x ∪{y} which adds a single new to an already existing set. I shall suggest some things that are naturally expressed in this new way of describing sets. Adduction, as a generator, will be seen to be intrinsically finitist in character. An interesting contrast between arithmetic and set theory is that the former is based upon functions (successor, addition, multiplication) whereas the latter, in its usual formalization, is based upon a relation (the membership relation). By making the notion of adduction fundamental, we can base set theory upon a function which is a generalization of the successor function of arithmetic. This brings out in a more direct way the parallels between finite set theory and arithmetic.

2 In fact, the fundamental generator for arithmetic, the successor function, will be seen as the diagonalization of the adduction operator. §2 introduces the appropriate formal language for the purpose, with a binary function symbol to represent adduction instead of the usual binary relation symbol for membership, and defines and establishes some basic properties of the adductive hierarchy. §3, which can be omitted without prejudice to the succeeding sections, de- fines a canonical ordering on the hereditarily finite sets and bases upon it some visualizations of the small classes in the adductive hierarchy. It then calculates the sizes of small classes and subclasses. In §4, building on work of Flavio Previale, I define a first order version of an induction principle based on adduction which is due to Alfred Tarski and Steven Givant. I then define in our language Peano set theory PS which is equivalent to both Peano arithmetic and to ZF with the of infinity replaced by its negation. §5 is concerned with some more philosophical arguments for basing consider- ation of finiteness upon the idea of adduction, bearing in mind Hilbert’s concept of finitism as applying to systems of signs. Any actual communication consists of a finite series of adductions, of bringing things to your attention. I also argue for the provisional nature of the empty set. And I am led, via W. W. Tait’s thesis that finitist reasoning is essentially primitive recursive reasoning, to the primitive recursive set functions. In §6 I survey the primitive recursive set functions, establish their basic properties and give some primitive recursive definitions of basic set-theoretic operators. §7 introduces primitive recursive generalizations to sets of the arithmetical operations of addition and multiplication. This suggests an arithmetic of sets.I shall also provide evidence that the adduction operator, as generating principle, is intrinsically more finitist than the power set operator. Once we have started tinkering with the language of set theory, we can go further and question the role of the constant symbol for the empty set. Drop- ping this allows us to construct an extension of a model of finite set theory in which the empty set of the original model is no longer empty in the extension. In §8 I shall show that every model M of PS has such an extension which is isomorphic to M. The proof uses addition of sets.

I am grateful to Arthur Apter and Akihiro Kanamori for their helpful com- ments.

2 Basics.

We work in a language L(0 ;) for set theory which differs from the usual language L(0 ∈). Instead of the membership relation ∈ we employ as basic operation ad- duction, a binary operation denoted by [x; y]. So L(0 ;) has a binary function symbol for adduction as well as a constant symbol 0. The intended interpreta-

3 tion of [a; p]isa ∪{p}. Frequently we shall drop the brackets, especially the outermost, where no ambiguity results. Notation. a; p, q =[a; p]; q and thus a; p1,...,pn =[...[[a; p1]; p2]; ...]; pn. This is a notational abbreviation for convenience, and not in the formal lan- guage. The symbol 0 will be used in two ways: as the constant symbol in the formal language, and for the empty set which is the standard interpretation of this symbol. Likewise the semi-colon does duty both as the function symbol and as its interpretation in sets. Thus 0; p1,p2,...,pn = {p1,p2,...,pn}. Also p ∈ a is defined to mean a; p = a. So in fact L(0 ∈) and L(0 ;) are equivalent: each of the symbols is definable in terms of the other in a common extension L(0 ∈ ;). As a background theory, assume a familiar set theory: ZF will do. As W. W. Tait [12] emphasizes, when we describe a finitist operator in general terms, we are doing so from an infinitist standpoint. Following the usual development of set theory, the finite (von Neumann) ordinals are defined by iterating the operator n 7→ [n; n] and we temporarily and informally use the notation n + 1 for [n; n]. The successor operation for numbers has become the diagonalization of the adduction operator. Vω is the set of hereditarily finite sets, defined as usual by means of the cumulative hierarchy

V0 =0,Vn+1 = PVn,Vω = [{Vn | n ∈ ω}, where Px is the power set of x. In the iterative conception of set theory, the hierarchy of the Vn is the way of building up the hereditarily finite sets starting from 0. But if we examine this process of generation in detail, especially taking into account considerations of feasibility, it becomes evident that it does not conform with Ockham’s razor. The number 5, for example, doesn’t occur until V6, by which time we have also obtained 265536 − 1 other sets, including many complicated structures. This number is far more than the number of protons in the universe of physics, indeed in 265270 such universes. This seems a grossly inefficient way to generate sets, in as much as long before we get to small, simple, useful sets such as 6 we have generated more sets than we can ever possibly use or name in practice. Of course this is a version of a familiar complaint about exponentiation which, especially along with iteration of functions, results in unfeasible growth too soon and too fast. In the language L(0 ;) we can define a natural, slower-growing hierarchy for the hereditarily finite sets, the adductive hierarchy:

A0 = [0; 0] = {0} =1,An+1 = An ∪{[x; y] | x, y ∈ An}.

(The subscripts n and the names An are here in the metalanguage.)

4 It is intuitively clear that this hierarchy generates all hereditarily finite sets: Vω = S{An | n ∈ ω}. And it conforms well with the following intuitive picture of the hereditarily finite sets given by Flavio Previale ([9], p. 214): Hered. finite sets are exactly what can be obtained by starting from the empty set and then iterating the operation of adding, to a set already obtained (by the procedure itself), an element taken in its turn among the sets already obtained.

Note that 5 ∈ A5, and we shall see that the of A5 is 11680, so we have a more reasonable amount of baggage along with the number 5. The situation has not improved sufficiently, however, to be called feasible. A6 has become unwieldy at 135717904 elements. How fast does the cardinality of An grow? Clearly an upper bound for 2 |An+1| is |An| + |An| , so that the growth of the size of An, as a function of n n, is double-exponential (i. e., of order 22 ), rather than iterated exponential as is the case for Vn. A heuristic argument suggests that this upper bound is not a bad approximation for the size of An since one would expect relatively few repetitions among the new sets [x; y] that are added at stage n + 1. This argument is somewhat borne out by the results of §3. Some basic properties of the An can be proved by induction on n:

Lemma 2.1 (i) For each n ∈ ω, n ∈ An. (ii) x ∈ An →|x|≤n. (iii) x ∈ An → x −{y}∈An, for any y. (Notation: |x| is the cardinality of x. In this paper, the symbol ‘−’ will be used for the set theoretic difference operation, and only occasionally for arith- metical subtraction in clearly arithmetical contexts such as within the subscript of An−1.) Proof of Lemma 2.1(iii).0−{y} =0, so (iii) is true for n = 0. Suppose true for n, and let x ∈ An+1:sayx = a; p with a, p ∈ An.Ifp = y then x −{y} = a −{y} which is in An by the induction hypothesis. If p =6 y and y ∈ a, then x −{y} =(a −{y}); p and, again using the induction hypothesis, this is in An+1.Ifp =6 y and y/∈ a, then x −{y} = x ∈ An+1.

The first part of the next Lemma strengthens (iii) of the previous Lemma. J. C. Shepherdson ([11], §2.4) uses a version of the property stated in the first part of the next Lemma as a condition for absoluteness of cardinals in models of GB set theory. A. R. D. Mathias [7] uses the name supertransitive for the conjunction of properties (i) and (iii).

Lemma 2.2 (i) x ∈ An → Px ⊆ An. (ii) If x ∈ An+1 and y ∈ x then y ∈ An. (iii) An is transitive.

Proof. (i) Inductively, assume (i) proven for n. Suppose x ∈ An+1 and y ⊆ x. We need to show that y ∈ An+1. Say x = a; p with a, p ∈ An. By Lemma 2.1(iii),

5 a−{p}∈An. Since y−{p}⊆a−{p}, by the induction hypothesis y−{p}∈An. Now y is equal to either y −{p} or [(y −{p}); p], depending on whether p/∈ y or p ∈ y. In either case, y ∈ An+1. (ii) The case n = 0 is easy to verify since A1 = {0, {0}} = {0, 1} =2. Inductively, suppose y ∈ x = a; p with a, p ∈ An+1. Then either y ∈ a in which case the inductive hypothesis supplies y ∈ An,ory = p so y ∈ An+1. (iii) follows from (ii).

Each closed term of L(0 ;) represents a hereditarily finite set. But a given hereditarily finite set has, in general, more than one such representation. We shall need to consider when two closed terms represent the same set. For closed terms s, t of L(0 ;), define s ≡ t ⇔ sVω = tVω , where tVω is the interpretation of the closed term t in Vω (considered as a structure for L(0 ;)). Equivalently, s ≡ t ⇔ Vω |= s = t. There will be a syntactical equivalent for this concept in Theorem 2.3 below. Let PS0 be the theory consisting of the universal closures of the following ax- ioms: 0; x =06 . (1) x; y, y = x; y. (2) x; y, z = x; z,y. (3) x; y, z = x; y ↔ x; z = x ∨ z = y. (4)

PS0 is true in Vω. The name PS0 is to suggest a restricted form of Peano set theory, following Flavio Previale [9]. In section 4, a form of induction will be added to PS0 to get the full theory PS. Meanwhile, for Theorem 2.3 we shall only have need of a weaker theory, which I shall denote by PS00. PS00 consists of (2), (3), and 0; 0 =0.6 Franco Montagna and Antonella Mancini [8] have shown how to interpret Robinson’s Q in a system equivalent to (1) and (4). We also need a hierarchy on the set T of closed terms of L(0 ;), corresponding to the An hierarchy, defined by

T0 = {0}, Tn+1 = Tn ∪{[s; t] | s, t ∈Tn}.

V So a ∈ An ↔ for some t ∈Tn,tω = a. Part (i) of the next Theorem is a weak completeness result. Part (ii) is a refinement of Lemma 2.1(iii) needed for the proof:

Theorem 2.3 (i) For closed terms s, t, s ≡ t if and only if PS00 ` s = t. V (ii) Let n ≥ 1. If t ∈Tn and p ∈ t ω then there exist t0 ∈Tn and s ∈Tn−1 such Vω Vω that PS00 ` [t0; s]=t, s = p, and p/∈ t0 . Proof. We prove both parts together by a joint induction. As inductive hypothesis assume that (i) is proven for s, t ∈Tn, and (ii) is proven for n. The case n = 1 is straightforward and uses the instance 0; 0 =6 0 of axiom (1). In the proof, the symbol ‘`’ will refer to provability in PS00.

6 First we prove (ii) with n replaced by n + 1. So assume t ∈Tn+1 and V p ∈ t ω .Sayt = u; v with u, v ∈Tn. We need t0 ∈Tn+1 and s ∈Tn satisfying V V the conditions in (ii). If p = v ω and p/∈ u ω , then we can take t0 = u, s = v. If p = vVω and p ∈ uVω , then by the inductive hypothesis for (ii) applied to Vω Vω u, take t0 ∈Tn,s ∈Tn−1 such that ` t0; s = u, s = p, and p/∈ t0 . By the inductive hypothesis for (i), ` s = v, and hence, using axiom (2),

` t0; s = t0; s, s = t0; s, v = u; v = t.

If p =6 vVω , then p ∈ uVω and as before use the inductive hypothesis for (ii) Vω Vω to obtain u0 ∈Tn and s ∈Tn−1 such that ` u0; s = u, s = p, and p/∈ u0 . Then ` u; v = u0; s, v = u0; v, s =[u0; v]; s (using axiom (3)) V and p/∈ [u0; v] ω , so we can take t0 to be u0; v. V V To complete the inductive step, we prove, given s, t ∈Tn+1 with s ω = t ω , that ` s = t. Say s = q; r and t = u; v with q,r,u,v ∈Tn. Case 1: r ≡ v. Working in Vω, we have q; r = u; r and hence one of three subcases must occur: either Vω |= q = u,orVω |= q; r = u ∧ q; r =6 q,or Vω |= u; r = q ∧ u; r =6 u. In the first subcase, by the inductive hypothesis for (i), ` q = u and hence ` s = q; r = u; r = t. In the second subcase, use the inductive hypothesis for (ii) to obtain s ∈Tn−1 and u0 ∈Tn with ` u0; s = u, Vω Vω Vω Vω Vω Vω Vω Vω s = r , and r ∈/ u0 . Now q = u0 = u −{r }. By the inductive hypothesis for (i), ` s = r and ` q = u0. Hence ` q; r = u. Using axiom (2), ` u; r = q; r, r = q; r. The third subcase is symmetrical with the second. Case 2: r 6≡ v. We have rVω ∈ uVω and vVω ∈ qVω . Using the inductive hypothesis for (ii), obtain u0 ∈Tn and r0 ∈Tn−1 such that ` u = u0; r0 and Vω Vω r0 = r . By the inductive hypothesis for (i), ` r0 = r. In the same way obtain q0 ∈Tn and v0 ∈Tn−1 such that ` q = q0; v0 ∧ v0 = v.Now

Vω Vω Vω Vω Vω Vω q0 = q −{v } = u −{r } = u0 , and by the inductive hypothesis for (i), ` q0 = u0. Hence

` q; r = q; r0 = q0; v0,r0 = q0; r0,v0 = u0; r0,v0 = u; v0 = u; v.

3 The small adductive classes.

In this section we shall explore the classes An for n ≤ 6, first through some visu- alizations and then with the aid of an analysis of the sizes of certain subclasses. The succeeding sections will be independent of this one. The idea of adduction, and the adductive hierarchy, give rise to a natural algorithm for generating a list s0,s1, ···sn, ··· of all the hereditarily finite sets. { } { ··· } The idea is to start with s0 = 0 and A0 = 0 . Given An = s0, ,s|An|−1 , generate An+1 by adding in all sets of form [x; y] with x and y in An, ignoring repetitions. Here are more details. It is convenient to work with terms of L(0 ;) T { } T { ··· } rather than sets, so let t0 = 0 and 0 = 0 . Given n = t0, ,t|Tn|−1 ,

7 generate Tn+1 as follows. Consider in turn each of the terms of form [ti; tj ] with i, j < |Tn|, ordering these terms by the lexicographic order on hi, ji.For each term check whether it is equivalent to (i.e. represents the same set as, see Section 2) any of the terms already in the list, and add it to the end of the list if not. The resulting list is the adductive-lexicographic ordering of Vω, and each set in Vω is uniquely represented in the list by its canonical term. An imple- mentation of this algorithm using Mathematica software was used to produce the visualizations and empirical calculations of this section. Without computer assistance one can start the process:

A1 =0;0, 1=2

A2 = 2; [0; 1], 2

A3 = 2; [0; 1], 2, [0; [0; 1]], [0; 2], [1; [0; 1], [1; 2], [[0; 1]; [0; 1]]; [[0; 1]; 2], [2; [0; 1]], 3 or in traditional setbuilder notation:

A1 = {0, 1} =2

A2 = {0, 1, {1}, 2}

A3 = {0, 1, {1}, 2, {{1}}, {2}, {0, {1}}, {0, 2}, {1, {1}}, {1, 2}, {0, 1, {1}}, 3}.

|A1| =2, |A2| =4, |A3| =12. It will be proved that

|A4| = 112, |A5| = 11680, |A6| = 135717904.

Before proceeding to some visualizations based on the adductive-lexico- graphic ordering of Vω, a note about pictures. Stephen Wolfram, in A New Kind of Science [16], shows many pictures that vividly describe the complex behav- iour of systems based on simple (finite) programs such as cellular automata. In accordance with the universalist claims of set theory, all such systems can be represented using hereditarily finite sets. So it is reasonable to expect com- plex behaviour to manifest itself among the hereditarily finite sets too. But Wolfram’s book, in its very format, constitutes also an argument that the be- haviour of complex systems with sizes in the order of hundreds to hundreds of thousands of units is often best perceived visually. The adductive hierarchy gives an opportunity to study the generation of the hereditarily finite sets on a visualizable scale. Figure 2 is a 3-dimensional bar chart showing the generation of A4 from A3, using the canonical terms s0,s1,s2, ···. It is a visualization of the “multiplica- tion table” for the adduction operation: for 0 ≤ i, j ≤ 11, if [si; sj ]=sk, then the height of the bar at position (i, j)isk. Thus the tallest bar, corresponding to the last element to be added to A4 in the adductive-lexicographic ordering, has height 111 and represents the set s111 =[s11; s11] = [3; 3] = 4. Figure 2 also indicates those adductions that give rise to 1, 2, and 3. All occurrences of 3 have

8 Figure 2: A4. height 11. The first occurrences of 1, 2, and 3 (corresponding to their canonical terms) are indicated in boldface. The empty set itself is not represented in this and the other visualizations. Figure 1 similarly shows A5 generated from the 112 elements of A4. The tallest bar, with height 11679, represents the set 5. A4 occupies the front 12×12 square, flattened compared with Figure 2 because the vertical scale is com- pressed.

Figure 3: s9 =[s2; s3] (bottom) is built up from s2 = [0; [0; 0]] = {1} (top left) and s3 = [[0; 0]; [0; 0]] = 2 (top right).

A different kind of visualization is obtained by giving each canonical term a visual form. I explain the process by an example: Figure 3 shows how the

9 Figure 4: A3. Left, the sets s1 through s11, from right to left, centre aligned. Right, the same sets are glued together.

visualization of s9 = [[0; 1]; [1; 1]] is constructed from visualizations of [0; 1] and [1; 1]. Figure 4 then illustrates the construction of a visualization of A3 by gluing together visualizations of its elements, aligning them at their central semi-colons. Figure 5 shows A4 in the same way.

Figure 5: A4, centre aligned.

10 Figure 6: A4, left aligned.

Figure 6 shows a different aspect of A4, and Figure 7 extends the view as far as s311, from yet another viewpoint. In these two figures, the terms are aligned at their left ends rather than at their central semi-colons as in Figures 4 and 5. Since Figure 6 is viewed from the rear, the representation of the last set in A4, viz. 4 = [3; 3], is clearly visible. The nest depth of a term of L(0 ;) is the nest depth of its brackets, for example s9 = [[0; [0; 0]]; [[0; 0]; [0; 0]]] has nest depth 3. So the heights of the visualizations in Figures 3 through 7 are equal to the nest depths of the terms represented. With this in mind, the following characterization of the classes An, easily proved by induction on n, is visible in the figures:

Proposition 3.1 For any set a ∈ Vω, a ∈ An if and only if the canonical term for a has nest depth ≤ n. Density plots, wherein the increasing heights in Figures 3 through 7 are rendered instead by darker shades, give another kind of visualization: Figures 8 and 9. A6 is too big to be feasibly visualized in these ways. A little of its structure can be discerned by investigating some subclasses of the An, in the following results which are valid for larger classes as well. Towards the end of §6 it will be mentioned that these results can be formalized in PS, and for this reason I shall give a few indications of the of PS0 used in the proofs. Definition. (i) xΓy = x ∪ y ∪{[u; v] | u ∈ x ∧ v ∈ y}. (ii) Am,n = AmΓAn.

11 Figure 7: s1 to s311, left aligned. The first range is A4 (cf. Figure 6), and the first 200 sets of A5 follow.

Figure 8: Density plot of A4, corresponding to Figure 5.

12 Figure 9: Three density plots of A5. Right, centrally aligned. Middle, left aligned. Left, using representations of sets derived from the traditional set- builder notation.

13 Figure 10: A4,3. (Compare Fig. 1.)

Thus AMax(m,n) ⊆ Am,n ⊆ AMax(m,n)+1 and An,n = An+1. Figure 10 shows a bar chart for A4,3.

Lemma 3.2 (i) For n ≥ 1,An,0 = An. (ii) For n ≥ 2,An,1 = An.

Proof. (i) A1,0 = A1 because 0; 0 = 1; 0 = 1. Inductively, suppose An,0 = An and show An+1,0 = An+1. Let u ∈ An+1. Then u = x; y for some x, y ∈ An. Hence u;0 = x; y, 0=[x; 0]; y by Axiom (3). By the inductive hypothesis, x;0∈ An and hence u;0∈ An+1. (ii) Once one has verified that A2,1 = A2, the inductive proof is like that for (i).

0 0 0 Proposition 3.3 If n>0,x; y = x ; y ,x∈ An,x∈ An,y6∈ An−1, and 0 0 0 y 6∈ An−1, then x = x and y = y . Proof. x; y = x0; y0 → y0 ∈ x ∨ y0 = y. (This can be proved using Axioms (2) and (4).) But Lemma 2.2(ii) tells us that y0 6∈ x, hence y0 = y. Since symmetrically y 6∈ x0, it follows that x = x0.

Corollary 3.4 (i) For n ≥ 1, |An+1 − An,n−1| = |An|·|An − An−1| . (ii) For 0 ≤ m

Proof. (i) First note that if u = x; y with x ∈ An and y ∈ An − An−1, then u is not an element of An,n−1. For suppose it were: then for some v ∈ An and

14 w ∈ An−1, u = x; y = v; w. This implies that y = w or y ∈ v. The former contradicts the suppositions that y 6∈ An−1 and w ∈ An−1, and the latter is impossible by Lemma 2.2(ii). It follows that An+1 − An,n−1 = An,n − An,n−1 consists exactly of all sets of form x; y with x ∈ An and y ∈ An − An−1. Proposition 3.3 says that these are all distinct for distinct pairs hx, yi. (ii) If u ∈ Am+1,n − Am,n,sayu =[x; y] with x ∈ Am+1 and y ∈ An, then of course x 6∈ Am. Also y 6∈ An−1: for suppose y ∈ An−1. Then u ∈ Am+1,n−1 ⊆ An−1,n−1 = An ⊆ Am,n. The proof concludes like that of (i).

A manifestation of 3.3 and 3.4(i) with n = 4 can be discerned by examining Figures 1 and 10.

Lemma 3.5 Let n ≥ k ≥ 1.Ifx ∈ An then x has at most k − 1 elements which are not in An−k.

Proof by induction on k that the Lemma holds for all n ≥ k. The case k =1 is Lemma 2.2(ii). If true for k and x ∈ An with n ≥ k + 1, say x = y; z with y, z in An−1. By inductive hypothesis at most k − 1 elements of y are not in An−(k−1) ( and a fortiori not in An−k) and the only additional candidate for an element of x not in An−k is z.

Proposition 3.6 Let n ≥ 2. (i) a ∈ An,n−1−An,n−2 if and only if there exist u ∈ An−1, and distinct elements v and w of An−1 − An−2, such that a = u; v, w. 0 0 0 (ii) If u; v, w = u ; v ,w with u ∈ An−1, and v =6 w with v, w ∈ An−1 − An−2, and likewise for u0,v0,w0, then u = u0 and {v, w} = {v0,w0}.

Proof. (i) Let a ∈ An,n−1 − An,n−2,saya = x; w with x ∈ An and w ∈ An−1 − An−2. In turn let x = u; v with u and v in An−1.Soa = u; v, w and one direction of the proof is complete if we can show that v 6∈ An−2. But suppose v were in An−2: then a =[u; w]; v would be in An,n−2 because [u; w] ∈ An. Conversely, given a = u; v, w as specified in (i), then a ∈ An,n−1 because u; v ∈ An. We need to show that a 6∈ An,n−2. But suppose it were: a = u; v, w = x; y with x ∈ An, y ∈ An−2. Since v and w cannot equal y, they must both be in x, contradicting Lemma 3.5 with k =2. (ii) follows from the fact that none of v,w,v0,w0 can be in u nor in u0 by Lemma 2.2(ii).

≥ | − | | |· |An−1−An−2| Corollary 3.7 For n 3, An,n−1 An,n−2 = An−1 2 . Proposition 3.8 Let n ≥ 3. (i) a ∈ An,n−2 − An,n−3 if and only if there exist u ∈ An−2, and distinct elements v,w,x such that v, w ∈ An−2 − An−3 and x ∈ An−1 − An−3, such that a = u; v,w,x. (ii) If u; v,w,x = u0; v0,w0,x0 with u,v,w,x as in the statement of (i), and likewise for u0,v0,w0,x0, then u = u0 and {v,w,x} = {v0,w0,x0}.

15 Proof. (i) For “only if”, suppose a ∈ An,n−2 − An,n−3. Write a = b; w with b ∈ An and w ∈ An−2 − An−3, and in turn write b = c; x with c and x in An−1, and c = u; v with u and v in An−2. Thus a = u; v,w,x and it remains to show that v,w,x are distinct and none of them are in An−3. For distinctness, if (say) w = x then a = u; v, w would be in An. And suppose that (say) v ∈ An−3: then a =[u; w, x]; v would be in An,n−3 since [u; w, x] ∈ An. For “if”, suppose a = u; v,w,x as given. It is straightforward that a ∈ An,n−2.Ifa ∈ An,n−3,saya = u; v,w,x = y; z with y ∈ An and z ∈ An−3, then v, w and x cannot equal z so they must all be in y, contradicting Lemma 3.5 with k =3. (ii) follows from the fact that none of v,w,x,v0,w0,x0 can be in u or u0.

Corollary 3.9 For n ≥ 3, |An,n−2 − An,n−3| =

|An−2 − An−3| |An−2 − An−3| |A − |· ·|A − − A − | + . n 2 2 n 1 n 2 3

This corollary is obtained by considering two cases in the characterization of An,n−2 given in Proposition 3.8, depending upon whether x is in An−1 − An−2 or in An−2 − An−3.

The same methods can be used to obtain:

Proposition 3.10 Let n ≥ 4. (i) a ∈ An,n−3−An,n−4 if and only if there exist u ∈ An−3, and distinct elements v,w,x,y such that v, w ∈ An−3 −An−4, x ∈ An−2 −An−4, and y ∈ An−1 −An−4, such that a = u; v,w,x,y. (ii) If u; v,w,x,y = u0; v0,w0,x0,y0 with u,v,w,x,y as in the statement of (i), and likewise for u0,v0,w0,x0,y0 then u = u0 and {v,w,x,y} = {v0,w0,x0,y0}.

Corollary 3.11 For n ≥ 4, |An,n−3 − An,n−4| =

|An−3 − An−4| |A − |· · n 3 2  |An−2 − An−3| |A − − A − |·|A − − A − | + n 2 n 3 n 1 n 2 2 |An−3 − An−4| |An−3 − An−4| + ·|A − − A − | + . 3 n 2 n 3 4

Using these results one can fill in the numbers in Figures 11 and 12. In these figures the lines represent inclusions between classes, and the number adjacent to a line is the number of elements of the on the right which are not in the class on the left. All the numbers as far as A5,2 have been confirmed empirically. The only number not obtained directly as a case of one of the above results is the size of A6 − A4,5. By summing the upper numbers we obtain |A6| = 135717904, and from this the value indicated for A6 − A4,5 is deduced.

16 Figure 11: Inclusions between the small classes and subclasses.

Figure 12: Continuation of Figure 11.

4 Peano set theory.

In 1909 [17] proposed a form of induction on finite sets. In 1924 Alfred Tarski [13] gave a different formulation, in terms of what I am calling the adduction operator, that was closer in spirit to Peano’s induction axiom. As a schema, and in the notation of the present paper, Tarski’s 1924 induction is this:

Weak induction. ϕ(0) ∧∀xy(ϕ(x) → ϕ([x; y])) →∀xϕ(x), (5) where ϕ may also contain parameters. But what turns out to be a more fruitful form of induction was proposed half a century later by Steven Givant and Tarski [3]: Induction. ϕ(0) ∧∀xy(ϕ(x) ∧ ϕ(y) → ϕ([x; y])) →∀xϕ(x). (6) As Givant and Tarski pointed out, this version of induction incorporates foun- dation (see Proposition 4.3 below). Previale, in the passage quoted above in §2, was providing a justification for this induction principle: namely, it formal- izes the intuitive idea that we build up, and verify properties of, hereditarily finite sets using adduction, at each stage adding to a set already obtained a new element chosen from among the sets already obtained.

17 Following Previale [9], PS (for Peano set theory) will denote the axioms (1) - (4) together with the induction schema (6) for first order formulae. After Lemma 4.2 I shall discuss the equivalence of my version of PS with Previale’s. PS proves the existence of a −{p} for any a, p: Lemma 4.1 PS `∀xy∃z((y/∈ x ∧ z = x) ∨ (y ∈ x ∧ z; y = x ∧ y/∈ z)). Proof. For given p, show by weak induction on a that ∃z((p/∈ a ∧ z = a) ∨ (p ∈ a ∧ z; p = a ∧ p/∈ z)), or as we may informally write, z = a −{p} exists. If a =0 then z = 0. If z = a −{p} is given, then we can define z0 =[a; q] −{p}:ifp = q then z0 = z, and otherwise z0 = z; q.

Extensionality follows from the particular instances of it in axioms (1) and (4):

Lemma 4.2 The universal closures of the following statements are provable in PS: (i) a; p =06 . (ii) ∃y(y ∈ a) ↔ a =06 . (iii) ∀y(y ∈ a ↔ y ∈ b) → a = b.

Proof. (i) If a; p = 0, then using axiom (2), 0; p = a; p, p = a; p = 0, contradicting axiom (1). (ii) If p ∈ a then a = a; p so by (i) a =06 . An induction on a shows that ∀y(y/∈ a) → a = 0: the case a = 0 is immediate, and ∀y(y/∈ a; p) is false since p ∈ a; p. (iii) The fact that extensionality follows from (ii) and axiom (4) was proved by Previale ([9], §5). We prove by weak induction on a that for every z, ∀y(y ∈ a ↔ y ∈ z) → a = z. The case a = 0 follows from (ii). We suppose the result is proven for a and show it for a; p.Ifp ∈ a then a; p = a so there is nothing to prove, so assume p/∈ a. Assume for arbitrary z: ∀y(y ∈ a; p ↔ y ∈ z). Take z0 = z −{p} given by Lemma 4.1. So p/∈ z0 and z0; p = z. Thus for any y,

z; y = z ↔ z0; p, y = z0; p ↔ z0; y = z0 ∨ y = p (axiom (4)), i.e., y ∈ z ↔ y ∈ z0 ∨ y = p. Since y ∈ a; p ↔ y ∈ a ∨ y = p, and recalling that p/∈ a and p/∈ z0, it follows that y ∈ z0 ↔ y ∈ a. By inductive hypothesis, z0 = a and hence z = z0; p = a; p using (2).

Now that extensionality has guaranteed the uniqueness of x −{y},wemay choose, as Previale [9] does, to include in our language a binary function symbol for x −{y}, with as its definition

x −{y} = z ↔ (y/∈ x ∧ z = x) ∨ (y ∈ x ∧ z; y = x ∧ y/∈ z)

(cf. Lemma 4.1). The theory PS in Previale’s extended language is a conserv- ative extension of our PS, i.e. our PS is equivalent to Previale’s PS.

18 Proposition 4.3 (Previale) (i) (∈-induction.) PS `∀x((∀y ∈ xϕ(y) → ϕ(x)) →∀xϕ(x). (ii) (Foundation schema.) PS `∃xϕ(x) →∃x(ϕ(x) ∧∀y ∈ x ¬ϕ(y)). (iii) PS ` ZF + ¬∞, i.e., Zermelo-Fr¨ankelset theory with the axiom of infinity replaced by its negation. Here ZF + ¬∞ may just as well be considered as formulated in L(0 ;) rather than in the usual language of set theory. For the proof of this Proposition the reader is referred to [9]. Note that (ii) is just the contrapositive of (i) with ϕ replaced by ¬ϕ. ZF + ¬∞ in turn is mutually faithfully interpretable with first order Peano arithmetic PA, by a folklore result. The argument can be summarized as follows, in two converse directions: On the one hand, a coding first introduced by Wilhelm Ackermann [1] gives an interpretation of finite set theory in number theory, which is exposed in detail for example by Hao Wang [14]. Essentially, if we already have natural numbers n1, ··· ,nk coding the finite sets a1, ··· ,ak, then the set {a1, ··· ,ak} is coded by 2n1 + ···+2nk . And on the other hand, in a model of ZF + ¬∞ the predicate ‘x is a natural number’ can be defined as ‘x is transitive as are all elements of x’, and mathe- matical induction can be formulated, and thereby the operations of arithmetic can be defined on the finite ordinals of the model of ZF + ¬∞, to form a model of PA. We shall avail ourselves of standard coding mechanisms available in PA.

5 Adduction, finiteness, and actual objects.

One reason to study finiteness is to study information. The finite sets may be taken to be (or to represent) finite packages of information. The finitism of Hilbert was based on signs, because finite strings of signs can be apprehended surely. Quasi-empirical thesis: Any story is finite. By this I mean that any item of information transmitted from one person’s mind to another, or from one computer’s memory to another, consists of a finite string or some other finite structure of phonemes or letters or gestures or muscle movements or synapses or bytes. This thesis can be interpreted in various ways. The most conservative interpretation of the quasi-empirical thesis would be to define a story to be a finite string of 0s and 1s, or even just of 1s (Tait [12]). This would place us at the foundations of theoretical computer science. It would also eleiminate the empirical content from the thesis. A less conservative, and more empirical, interpretation would be that any actual communication between actual computers is finite. Actual here has a technical sense: an object is actual if I have actualized it by adducing it: bringing it forward for your consideration. I might point to it, for example, or I might say, for example, “Look! This email message”, or I might provide a description which is enough for you to apprehend the specific object.

19 The objects which are capable of being actualized in this way are the natural objects I considered in an earlier paper [5]. As you read “Look! This email message” without any other clue from me in another medium — such as the gesture of pointing to it, or simply from context — the object referred to is, for you, an abstract object. An act of adduction, of bringing forward for your consideration, is needed in order for you to have in your mind an actual object. What is adduced, in the usual use of the word “adduce”, tends to be a fact. In this paper I have broadened the context of this word to allow the adduction of objects, of sets. By the universalist claim of set theory, facts or statements of facts are a special case of sets, or at any rate can be simulated by sets. Alfred North Whitehead [15] has a related concept: he uses the word indi- cation for “the unique determination of a thing by the specification of some of its relationships to the functionings of some human body, and of some human intellect.” He symbolizes the indication of an object x, abstracted from any intensional statement about the properties of x, by “Ec ! x”, from the Latin “Ecce” meaning “Behold”. A broader interpretation of the quasi-empirical thesis has “story” to mean any actual communication between, say, humans, of whom there has only been - and will only be - a finite number during the finitely long history of the species. In an even broader interpretation, “story” could mean the biography of some individual(s), or representation of an event. Such an interpretation could be justified by appealing to actuality: any actual biography, that I can point to, like all known stories is finite. Even more broadly, it can be argued that all conceivable stories are finite. One can appeal to those theories and speculations of modern physics that state that space and time themselves are quantized and thus that the universe consists of finite number of events. Even the biography of the universe of physics, its description in a given language, must be finite. Take a finite object in the sense of this section, say a particular email message from me to you, a string of words. For you to apprehend it and take in the information it contains, you need a context: a framework which primes you to be ready to identify and process the words. The same message, which at one level I can refer to simply by a gesture “Ec ! x”, can be parsed in different contexts, for example as a string of bits. On this level the message consists of a string of adductions of form: “This is a 1. This is a 0. This is a 1. ...” Each adduction adds an item to your already existng knowledge base. To apprehend the message (at this level) you need the ability to make out the 1s and 0s and distinguish between them. Further analysis is always possible at another level: each 1 or 0 is in its turn a particular kind of conformation of ink molecules or pixels, etc. Thus any story, while finite, can be parsed in a richer language in such a way that the atomic units of the original story are no longer atomic and have structure of their own. Advances in knowledge often involve expanding to such a wider context. This leads me to think that the empty set (the sole atomic unit of set theory)

20 is provisional in nature. There is always an extension of the universe in which the previously empty set is discovered to have some structure of its own, in which the atom is split. The empty set is a token for an atomic unit which can, in some extension of the universe, become non-atomic. This idea will be revisited in §8. W. W. Tait ([12], p. 529) bases his discussion of finitism on a similar analysis of experience: “We discern finite sequences in our experience — sequences of words on a page, of peals of a bell, of people in a room ordered by age or size or simply by counting.” Tait calls the “form” of finite sequences “Number”, and continues: ... we are taking Number in its ordinal sense. If we wished to take it in the sense of cardinals we would only need to replace ‘finite sequence’ in the discussion with ‘finite set’ without any essential difference. It is upon the basis of Number that Tait formulates and defends his thesis that “finitist reasoning is essentially primitive recursive reasoning in the sense of Skolem.” But, according to his own remark, his thesis applies equally well to finitist reasoning in set theory. The primitive recursive set functions will occur naturally in what follows.

6 Primitive recursive set functions.

Our language L(0 ;) is natural for expressing primitive recursive set functions on Vω using recursive definitions of form

f(0,~z)=g(~z ) ,f([a; p],~z)=h(a, p, f(a, ~z),f(p, ~z),~z). (7)

In this section I shall survey the primitive recursive set functions from this point of view. D. R¨odding[10] defines the primitive recursive set functions as the smallest k class of functions from Vω to Vω containing as initial functions the constant function 0(˜ ~x ) = 0, the projections Pn,i(x1,...,xn)=xi, operator x 7→ {x}, x ∪ y, and intersection x ∩ y, and closed under substitutions

f(~x )=g(h1(~x ), ··· ,hk(~x )) and recursive definitions of form

f(0,~z)=g(~z ),

f({a},~z)=h1(a, f(a, ~z)),~z),

f(a ∪ b, ~z)=h2(a, b, f(a, ~z),f(b, ~z),~z).

It is not hard to see that R¨odding’stwo generating operators union and singleton are together interchangeable with the single adduction operator [x; y]. Ockham’s razor suggests that the latter is preferable. In addition, F.-K. Mahn [6] shows

21 that the intersection operator may be omitted since it is definable, using the same scheme, from the other initial functions. In this section I shall examine the primitive recursive set functions as defined using adduction. Ronald Jensen and Carol Karp [4] define a more general class of (transfinite) primitive recursive set functions on a broad class of transitive classes (so that Vω is a special case). Following Gandy, they use adduction among their initial functions in place of singleton and union, and instead of intersection they use the cases function: (x if u ∈ v, C(x, y, u, v)= y otherwise. And Jensen and Karp have yet another recursion schema:

f(x, ~z)=g([{f(u, ~z)| u ∈ x},x,~z).

I omit the technical proof that this schema is equivalent to R¨odding’s,and to (7), and the easy proof that the cases function is mutually interdefinable with intersection, so can likewise be omitted by Mahn’s result. Informally, since Ackermann’s coding is itself a primitive recursive transla- tion procedure between sets and numbers, primitive recursive set theory is the same as primitive recursive arithmetic. This is formally proved by R¨odding [10] who shows that the primitive recursive set functions are identical, modulo Ackermann’s coding, with the number-theoretic primitive recursive functions. (See also [4], §3.) And since the number-theoretic primitive recursive functions are those which are provably total in IΣ1, it will be straightforward to see that the primitive recursive set functions are those provable in IΣ1S, in which the induction schema of PS is restricted to Σ1 formulae.

In the above primitive recursive schemas, both (7) and the R¨oddingver- sion, I glossed over a problem with primitive on sets — a problem which does not occur when defining primitive recursive functions on ω (i. e., the traditional number-theoretic primitive recursive functions). The problem is that there are many ways to build up a hereditarily finite set using adduction (or using singleton and union), i. e., for non-empty a ∈ Vω there are many closed terms t ∈T such that tVω = a, and we need to make sure that the result of the recursive procedure does not depend on the order in which the set is built up. R¨oddingdeals with this problem by stipulating that the three clauses of the recursive definition are not mutually contradictory (nicht untereinander widerspruchsvoll) and leaving it at that. One aim of this section is to make this more precise and provide a formal criterion for such consistency (Lemma 6.1). In order to do so, we shall consider recursions not on sets but on closed terms of L(0 ;), and we need to ensure that if f is a primitive recursive function, s ≡ t ⇒ f(s) ≡ f(t).

k+4 Definition. Let h(x, y, u, v, ~z) be a function from Vω to Vω, where k is the arity of ~z . h is respectful if the universal closures of the following statements

22 are true:

h([x; y],y,h(x, y, u, v, ~z),v,~z)=h(x, y, u, v, ~z) and h([x; y],y0,h(x, y, u, v, ~z),v0,~z)=h([x; y0],y,h(x, y0,u,v0,~z),v,~z).

The set PR of primitive recursive functions on Vω is the smallest set of k functions from Vω to Vω containing as initial functions the constant function 0(˜ ~x) = 0, the projections Pn,i(x1,...,xn)=xi, and adduction f(x, y)=[x; y], and closed under substitutions

f(~x )=g(h1(~x ), ··· ,hk(~x )) where g and hi are primitive recursive, and recursion of form (7) where g and h are primitive recursive and h is respectful.

We need to justify this definition by showing that such an f is well-defined. k Suppose f : Vω → Vω is a primitive recursive function. We use the definition of k f to obtain, in a primitive recursive manner, a function f∗ : T →T, as follows: 0˜∗ = 0,˜ Pn,i∗ is the corresponding projection function on T ,[x; y]∗ =[x∗; y∗], and substitutions are dealt with by, for example, if f(x)=g(h(x)) then f∗(t)= g∗(h∗(t)). Finally, if f is defined recursively from g and h by the schema (7), and g∗ and h∗ are already obtained, then

f∗(0,~u)=g∗(~u ),f∗([s; t],~u)=h∗(s, t, f∗(s, ~u),f∗(t, ~u),~u). V V So f∗ exactly mimics the calculations needed to find the value of f(t ω ,~u ω ) for given closed terms t, ~u. The following Lemma will assure us that primitive recursive functions are unambiguously defined:

Lemma 6.1 Let f ∈ PR and let f∗ be defined as above. If s ≡ t and ui ≡ vi for each i, then f∗(s, ~u) ≡ f∗(t, ~v).

Proof. The initial functions and substitutions are easily dealt with; I show that if f is defined from g and h by the recursion schema (7) and the Lemma holds for g and h, then it holds for f. We prove by induction on n that if s, t ∈Tn, ~z and ~w are sequences of terms in T , s ≡ t, and ~z ≡ ~w , then f∗(s, ~z) ≡ f∗(t, ~w). The case n = 0 reduces to the assumption that the Lemma holds for g. Suppose s ≡ t ∈Tn+1,says = q; r and t = u; v, with q,r,u,v ∈Tn. Case 1. r 6≡ v. As in the proof of Theorem 2.3(ii) find q0,u0 ∈Tn such that

23 q0; v ≡ q and u0; r ≡ u. From this we deduce q0 ≡ u0. It follows that

f∗(s, ~z)=h∗(q, r, f∗(q,~z),f∗(r, ~z),~z)

≡ h∗(q, r, f∗([q0; v],~z),f∗(r, ~z),~z) using the inductive hypothesis

= h∗([q0; v],r,h∗(q0,v,f∗(q0,~z),f∗(v,~z),~z),f∗(r, ~z),~z)

≡ h∗([u0; v],r,h∗(u0,v,f∗(q0,~z),f∗(v,~z),~z),f∗(r, ~z),~z) since the Lemma holds for h

≡ h∗([u0; v],r,h∗(u0,v,f∗(u0,~w),f∗(v, ~w),~w),f∗(r, ~w),~w) using the inductive hypothesis

≡ h∗([u0; r],v,h∗(u0,r,f∗(u0,~w),f∗(r, ~w),~w),f∗(v, ~w),~w) since h is respectful

≡ h∗(u, v, f∗(u, ~w),f∗(v, ~w),~w)

= f∗(t, ~w).

Case 2. r ≡ v. As in the proof of Theorem 2.3(ii) there are three subcases: either Vω |= q = u,orVω |= q; r = u,orVω |= u; r = q. In the first subcase the inductive hypothesis gives f∗(q,~z) ≡ f∗(u, ~z), and hence using both the inductive hypothesis and the assumption on h:

f∗([q; r],~z)=h∗(q, r, f∗(q,~z),f∗(r, ~z),~z)

≡ h∗(u,r,f∗(u, ~w),f∗(r, ~w),~w)

≡ h∗(u, v, f∗(u, ~w),f∗(v, ~w),~w)

= f∗([u; v],~w).

In the second subcase,

f∗([u; v],~w)=h∗(u, v, f∗(u, ~w),f∗(v, ~w),~w)

≡ h∗([q; r],r,h∗(q, r, f∗(q,~z),f∗(r, ~z),~z),f∗(r, ~z),~z)

≡ h∗(q, r, f∗(q,~z),f∗(r, ~z),~z) since h is respectful

= f∗([q; r],~z).

The third subcase is similar.

Some examples of primitive recursive functions. In the remainder of this section, I shall run through the primitive recursive definitions of some familiar basic set operations. By standard techniques, given a definition of a primitive re- cursive function f, we can add a function symbol for f to the language together with the definition of f as an axiom, and the resulting theory is a conservative extension of PS. (See for example [2], §1.5.) Thus we can, and shall, consider primitive recursive functions as defined on models of PS, and avail ourselves of the convenience of function symbols for various primitive recursive functions once they have been defined.

24 If t ∈T is a closed term, then the constant function to tVω is in PR.Any open term t(~x )ofL(0; ) gives rise to a function tVω in PR, and we simply write t for this function. The binary union operator f(x, y)=x ∪ y is defined by a ∪ 0=a, a ∪ [b; p]=[(a ∪ b); p]. Here the recursion step of (7) is carried out using the function h(x, y, u, v)=u; y which is respectful because, working in PS0, h([x; y],y,h(x, y, u, v),v)=h([x; y],y,[u; y],v)=[u; y]; y = u; y = h(x, y, u, v) and similarly h([x; y],y0,h(x, y, u, v, ),v0)=u; y, y0 = u; y0,y = h([x; y0],y,h(x, y0,u,v0),v). When using a primitive recursive definition of a standard operation in PS, we need to justify the name by verifying the properties of the function in PS. Thus for the union we note: Lemma 6.2 The universal closures of the following statements are provable in PS: (i) 0 ∪ a = a. (ii) x ∈ a ∪ b ↔ x ∈ a ∨ x ∈ b. Proof. (i) By induction on a: 0 ∪ [a; p]=(0∪ a); p by the definition of ∪ = a; p by inductive hypothesis. (ii) By induction on b for fixed a: x ∈ a ∪ [b; p]=(a ∪ b); p ↔ x ∈ a ∪ b ∨ x = p by Axiom (4) ↔ x ∈ a ∨ x ∈ b ∨ x = p by inductive hypothesis ↔ x ∈ a ∨ x ∈ [b; p] by Axiom (4) again. Standard properties of the binary union operator, such as commutativity and associativity, follow. The unary union operator S x is defined by [ 0=0, [[a; p]=[ a ∪ p. Here we need to check that the defining function h(x, y, u, v)=u∪y is respectful. This is because (a ∪ p) ∪ p = a ∪ p and (a ∪ p) ∪ q =(a ∪ q) ∪ p. In the following examples of standard operations verification of respect, and of the standard properties, is left to the reader unless it is not straightforward. The transitive closure is defined by TC(0) = 0,TC(a; p)=[(TC(a) ∪ TC(p)); p]. I omit proofs that a ⊆ TC(a), y ∈ x ∈ TC(a) → y ∈ TC(a), and TC(a)= S{TC(x) | x ∈ a}∪a. The following strengthening of the axiom of foundation is proved by Previale ([9], §3, Proposition 8):

25 Lemma 6.3 PS `∀x(x/∈ TC(x)).

The power set operator is defined in two stages: first define an operation a.b= {[x; b] | x ∈ a} by

0 .b=0, [a; p] .b=(a.b); [p; b].

Now P 0=1,P[a; p]=Pa∪ (Pa.p). y The set x of functions from y to x: first define a/p b = {[a; hp, yi] | y ∈ b} by a/p 0=0,a/p [b; q]=(a/p b); [a; hp, qi]. (The hp, qi is defined as usual: hp, qi = 0; [0; p], [0; p, q].) Then

0 [a;p] a x =1, x = [{z/p x | z ∈ x}.

The a × b: first define a ×{b} by

0 ×{b} =0, [a; p] ×{b} =(a ×{b}); hp, bi.

Now a × 0=0,a× [b; p]=(a × b) ∪ (a ×{p}). Finally, the operation xΓy = x ∪ y ∪{[u; v] | u ∈ x ∧ v ∈ y} introduced in Section 3 is in PR: first define a?q by

0 ?q=0, [a; p] ?q=(a?q); [p; q].

Then aΓ0=0,aΓ[b; p]=(aΓb ∪ a?p); p. The hierarchy of the An can now be defined and the results concerning the An in Sections 1 and 3 can be formalized as proofs in PS. If ϕ(~x ) is a formula, we write χϕ(~x ) for the characteristic function of ϕ: χϕ(~x )=1ifVω |= ϕ(~x ), = 0 if not. A set or relation is said to be primitive re- cursive iff its characteristic function is. The function χx=06 (x) can be primitively recursively defined by

χx=06 (0) = 0,χx=06 ([a; p]) = 1 and is in PR since the constant function h(x, y, u, v) = 1 is easily seen to be respectful. Likewise for χx=0(x). Mahn [6] showed that χx=y(x, y) is primitive recursive. To show that the relation x ∈ y is primitive recursive, observe that

χx∈y(a, 0)=0,χx∈y(a, [b; p]) = χx∈y(a, b) ∪ χx=y(b, p).

7 Generalizing arithmetical operations.

R¨odding’sresult identifying set-theoretic and number-theoretic primitive re- cursive functions means in particular that the basic operations of arithmetic (addition and multiplication) have extensions to the hereditarily finite sets. In

26 this section I shall exhibit natural set versions of the basic functions of arith- metic, more transparently related to the originals than what would be afforded by R¨odding’sproof. This will constitute the beginning of an arithmetic of finite sets, generalizing in a natural way the arithmetic of natural numbers. k Suppose that f : Vω → Vω is a primitive recursive set function defined by k k the recursion schema (7). If f“ω ⊆ ω then the restriction fA of f to ω is a number-theoretic primitive recursive function, i.e. a primitive recursive function in the usual sense of the phrase, defined by the recursion schema

fA(0,~z)=g(~z ),fA(n +1,~z)=h(n, n, f(n, ~z),f(n, ~z),~z).

In this circumstance we shall say that f is a generalization of fA. I shall define primitive recursive generalizations of the operations of arithmetic. Addition of sets is defined as follows: a +0=a, a +[b; p]=[(a + b); (a + p)]. The notation is suitable because when a, b ∈ ω, a+b under this definition agrees with the usual addition of finite ordinals, so that it generalizes arithmetical addition. For example, 2 + 1 = 2 + [0; 0] = (2 + 0); (2 + 0) = 2; 2 = 3 and 2 + 2 = 2 + [1; 1] = (2 + 1); (2 + 1) = 3; 3 = 4. For any a, note that a +1=[a; a]. This retrospectively justifies the earlier informal use of the notation a +1. We need to check that the defining function for the recursion step in the definition of addition, namely h(x, y, u, v)=[u; v], is respectful: h([x; y],y,h(x, y, u, v),v)=[h(x, y, u, v); v]=[u; v]; v = h(x, y, u, v) and similarly for the second clause needed for respectfulness, using axiom (3). Notice that addition of sets is not commutative: for example, 1+{1} = 1 + [0; 1] = 1; 2 = {0, 2} whereas {1} + 1 = [0; 1] + 1 = [0; 1]; [0; 1] = {1, {1}}.

A closely related function is the lift function λa(b) defined by

λa(0) = 0,λa([b; p]) = [λa(b); (a + p)].

Thus λa(1) = [0; a], and it is straightforward to prove by induction that λ0(a)= a. Here are some basic properties of addition of sets: Proposition 7.1 The universal closures of the following are provable in PS: (i) 0+a = a. (ii) Addition is associative. (iii) a + b = a ∪ λa(b). (iv) λa(b)={a + x | x ∈ b}. (v) a + b/∈ TC(a). (vi) TC(a) ∩ λa(b)=0. (Hence a ∩ λa(b)=0.) (vii) λa(λb(c)) = λa+b(c).

27 In particular, a + b is the of a and λa(b). For example,

5=3+2=3∪ λ3(2) = {0, 1, 2}∪{3, 4}.

Proof. (i) By induction on a. (ii) By induction on c, for fixed a and b, that a +(b + c)=(a + b)+c. The case c = 0 is easy. Suppose the desired conclusion holds for both c and p. Then

(a + b)+[c; p]=((a + b)+c); ((a + b)+p)=(a +(b + c)); (a +(b + p)) = a +[(b + c); (b + p)] = a +(b +[c; p]).

(iii) This is proved by induction on b: the induction step is a+[b; p]=(a+b); (a+p)=(a∪λa(b)); (a+p)=a∪[λa(b); (a+p)] = a∪λa([b; p]).

The penultimate equality uses the definition of ∪. (iv) Note that the statement is really a shorthand for

∀y(y ∈ λa(b) ↔∃x ∈ b(y = a + x)).

The case b = 0 is easy. The induction step is

y ∈ λa([b; p]) ↔ y ∈ [λa(b); (a + p)]

↔ y ∈ λa(b) ∨ y = a + p ↔∃x ∈ b(y = a + x) ∨ y = a + p ↔∃x ∈ [b; p](y = a + x).

(v) The case b = 0 is Lemma 6.3. Assume that neither a + b nor a + p is in TC(a). Then a +[b; p]=(a + b); (a + p) cannot be in TC(a) since this would imply a + p ∈ TC(a) by the transitivity of TC(a). (vi) follows from (iv) and (v). I note that the mention of intersection here is an informal one used to paraphrase a formal statement of L(0 ;). Recall that Mahn [6] provides a primitive recursive definition for intersection. A similar remark will apply to mention of cardinality in Lemma 7.3 below. (vii) follows from (iv).

Cancellation of addition will be used in §8:

Proposition 7.2 (i) PS `∀xyz(λx(y)=λx(z) → y = z). (ii) PS `∀xyz(x + y = x + z → y = z). Proof. It suffices to prove (i), since (ii) then follows using Proposition 7.1(iii) and (vi). For fixed a, prove by induction on b that for all z, λa(b)=λa(z) → z = b. For b = 0, Proposition 7.1(iv) is enough. Assume the inductive hypothesis that for all z, λa(b)=λa(z) → z = b and λa(p)=λa(z) → z = p. Suppose λa(z)=λa([b; p]) = λa(b); (a + p). We need to show that z = b; p. If p ∈ b this is easy so assume p/∈ b.Nowa + p ∈ λa(z) so by Proposition 7.1(iv) there exists x ∈ z such that a + x = a + p. By Proposition 7.1(iii) and (vi)

28 it follows that λa(x)=λa(p) and hence by the inductive hypothesis x = p. So p ∈ z. Now apply Lemma 4.1 to obtain y = z −{p}. So z = y; p and λa(z)=λa(y); (a + p)=λa(b); (a + p). Since p is an element of neither b nor y, Proposition 7.1(iv) tells us that a + p is an element of neither λa(b) nor λa(y) and we can deduce that λa(y)=λa(b) and hence by inductive hypothesis that y = b. Hence z = b; p.

The following results are also provable in PS.

Lemma 7.3 (i) |λa(b)| = |b| . (ii) |a + b| = |a| + |b| .

Proof. (i) follows from proposition 7.2(ii) and Proposition 7.1(iv), and (ii) fol- lows from (i) and Proposition 7.1(iii) and (vi).

Multiplication: Define

a · 0=0,a· [b; p]=(a · b) ∪ λa·p(a).

The defining function h(x,y,u,v,a)=u ∪ λv(a) is straightforwardly respectful. To see that this agrees with the usual multiplication on finite ordinals, observe that for any a and b,

a · (b +1)=a · [b; b]=(a · b) ∪ λa·b(a)=a · b + a, using Proposition 7.1(iii). It is immediate that a · 1=a and a · 2=a + a. Multiplication is not commutative: for example,

{1}·2 = [0; 1] · 2 = [0; 1] + [0; 1] = [0; 1]; ([0; 1] + 1) = {1, {1, {1}}}, whereas 2 ·{1} =2· [0; 1] = (2 · 0) ∪ λ2·1(2) = λ2(2) = {2, 3}. Before proving the properties of multiplication, note this simple consequence of Proposition 7.1(iv):

Lemma 7.4 λx(y ∪ z)=λx(y) ∪ λx(z).

Proposition 7.5 The universal closures of the following are provable in PS: (i) 0 · a =0. (ii) 1 · a = a. (iii) a · b = S{λa·x(a) | x ∈ b}. (iv) a · (b ∪ c)=a · b ∪ a · c. (v) a · λb(c)=λa·b(a · c). (vi) Left distributivity: a · (b + c)=a · b + a · c. (vii) Multiplication is associative.

29 As an example to illustrate (iii):

6=3· 2=λ3·0(3) ∪ λ3·1(3) = {0, 1, 2}∪{3, 4, 5}. Proof. (i) is a straightforward induction on a. (ii) Induction on a:If1· a = a then 1 · [a; p]=1· a ∪ λ1·p(1) = a ∪ λp(1) = a ∪ [0; p]=[a; p]. (iii) Induction on b. (iv) Induction on c: a · (b ∪ [c; p]) = a · ([(b ∪ c); p])

= a · (b ∪ c) ∪ λa·p(a)

= a · b ∪ a · c ∪ λa·p(a) by inductive hypothesis = a · b ∪ a · [c; p]. In order to prove (v) and (vi) I first note that for any particular a, b, c, (v) implies (vi) because given (v),

a · (b + c)=a · (b ∪ λb(c))

= a · b ∪ a · λb(c) by (iv)

= a · b ∪ λa·b(a · c) using (v) = a · b + a · c. Now I prove (v) by induction on c. The induction step assumes (v) (and hence also (vi)) for c and for p, and proceeds thus:

a · λb([c; p]) = a · ([λb(c); (b + p)])

= a · λb(c) ∪ λa·(b+p)(a)

= λa·b(a · c) ∪ λa·b+a·p(a) by inductive hypothesis

= λa·b(a · c) ∪ λa·b(λa·p(a)) by Proposition 7.1(vii)

= λa·b(a · c ∪ λa·p(a)) by Lemma 7.4

= λa·b(a · [c; p]). (vii) Assume inductively that (a · b) · c = a · (b · c) and (a · b) · p = a · (b · p). Then

(a · b) · [c; p]=(a · b) · c ∪ λ(a·b)·p(a · b)

= a · (b · c) ∪ λa·(b·p)(a · b) by inductive hypothesis

= a · (b · c) ∪ a · λb·p(b) by (v)

= a · (b · c ∪ λb·p(b)) by (iv) = a · (b · [c; p]). The situation for exponentiation is less clear. Here is an adductive recursive generalization of the arithmetical function 2x:

0 [a;p] a p 2 =1, 2 =2 ∪ λ2p (2 ).

30 a+1 a a a a This is respectfully defined, and 2 =2 ∪ λ2a (2 )=2 +2 , so it agrees with 2x on ω. What is odd about it is that it doesn’t preserve : for [0;1] 0 1 [0;1] example, 2 =2 ∪ λ 1 (2 )=1∪ λ2(2) = [1; 2, 3] = {0, 2, 3},so 2 =6 2 2|[0;1]|.

x p Lemma 7.6 PS ` 2 =1∪ S{λ2p (2 )| p ∈ x}. At least some of the “high school algebra” properties of exponentiation carry through under this generalization: Lemma 7.7 PS ` 2x+y =2x · 2y. Proof by induction on y, for fixed x. The case y = 0 is easy. Suppose 2x+y = 2x · 2y and 2x+p =2x · 2p. Then

x+[y;p] (x+y);(x+p) x+y x+p 2 =2 =2 ∪ λ2x+p (2 ) x y x p =2 · 2 ∪ λ2x·2p (2 · 2 ) by inductive hypothesis x y x p =2 · 2 ∪ 2 · λ2p (2 ) by Proposition 7.5(v) x y p =2 · (2 ∪ λ2p (2 )) by Proposition 7.5(iv) =2x · 2[y;p]. Question. Is there a natural, cardinality preserving generalization of expo- nentiation to sets?

I conclude this section with two general remarks about this arithmetic of sets. First, it allows a straightforward generalization of a standard result about number-theoretic primitive recursive functions:

Lemma 7.8 ϕ(~x ) is primitive recursive for every ∆0 formula ϕ(~x ). The case where ϕ is atomic was discussed at the end of §6. The inductive proof on the structure of ϕ follows the number-theoretic case. Thus for exam- ple given a function f(x, ~y) we can use primitive recursion to define a product function Qi∈x f(i, ~y), and this is used to obtain χ∀z∈xϕ(z,x,~y) from χϕ(x,~y).

My second remark is that the addition and multiplication of sets introduced here can be straightforwardly extended to infinite sets in such a way that they extend ordinal addition and multiplication. (Note that the non-commutativity of these operations for infinite ordinals has been prefigured in the finite sets.) This is despite the fact that the adductive hierarchy stalls at stage ω, because

Proposition 7.9 VωΓVω = Vω. So adduction alone is not enough to generate the transfinite sets, even with S as an additional generator. I reserve fuller discussion of the arithmetic of infinite sets for a subsequent article. But Proposition 7.9 is evidence that adduction as a generating operator is intrinsically finitist in character, in marked contrast with the power set operator, which together with S, generates the whole cumulative hierarchy.

31 8 Inward extensions of models of PS.

In considering models of PS, we can use the languages L(0 ;) and L(0 ∈) in- terchangeably. In what follows, the former language may be replaced, mutatis mutandis, by the latter. One might think that it is similarly a matter of indifference if we drop the constant symbol for 0 from the language. After all, in the presence of the other axioms for PS, we can replace Axiom (1) by ∃z∀xz; x =6 z. We can then prove the uniqueness of this z and eliminate all subsequent ref- erences to 0, including in the induction schema, by using circumlocutions of type: replace ϕ(0) by ∀z(∀xz; x =6 z → ϕ(z)) or, equivalently, ∃z(∀xz; x =6 z ∧ ϕ(z)).

This ∆1 manner of speaking does not increase quantifier complexities. However there is one situation in which the restricted language makes a difference. Suppose M and M 0 are structures for L(0 ;) and M is a of M 0. I shall say that M is a weak substructure of M 0,orM 0 is a weak extension of M, if the reduct of M to L( ; ) is a substructure of the reduct of M 0 to L(;). Dropping the constant symbol means that the zero element is free to move in 0 0 the extension, so that 0M need not be the same element as 0M .If0M =06 M , I call M 0 an inward extension of M because the empty set in M turns out, in the extension, not to be empty after all: new sets have been discovered within the empty set. Every model of PS has an isomorphic inward extension: Theorem 8.1 Let M ` PS and 0M =6 c ∈ M. There exists an inward ∼ extension µc(M) of M and an isomorphism π : M = µc(M) such that π(c)= 0M . Proof. In analogy with Proposition 7.1(iv), define

λc(M)={c + x | x ∈ M}.

λc(M) is closed under adduction since (c + x); (c + y)=c +[x; y], so λc(M)isa weak substructure of M. Define αc : λc(M) → M by αc(c + x)=x. This is well defined by Proposition 7.2, and αc([(c + x); (c + y)]) = αc(c +[x; y)]) = x; y = αc(c + x); αc(c + y), so αc is an isomorphism between λc(M) and M. To sum- marize, we now have a weak substructure λc(M)ofM and an isomorphism αc from the weak substructure to M. We can now turn this around by identifying M with this weak substructure, and by adding to M a new constant for each element of M − λc(M), we can form µc(M) and π.

If M ⊆ M 0 and γ : M → M 0, then γ is definable in M 0 if there is a formula 0 0 θγ (with parameters from M ) such that for all a and b in M ,γ(a)=b iff 0 0 M |= θγ(a, b). Note that if such γ is definable in M , so is M.

32 The π of Theorem 8.1 is definable in µc(M). And with definability as an extra condition on π, we have a converse for 8.1:

Proposition 8.2 Suppose M and M 0 are models of PS, M 0 is a weak extension of M, and π is an isomorphism from M to M 0 which is definable in M 0. Then 0 ∼ there exists c ∈ M such that π  λc(M)=αc and M = µc(M).

Thus the following diagram commutes:

⊆ λc(M) −−−−→ M   αc π y y ⊆ M −−−−→ M 0 Proof. Let c = π−1(0M ). Note that the isomorphism π, while preserving adductions, does not necessarily preserve the operation +. However we can define, in M 0, the addition +M of M. We need to show that π(c +M x)=x for x ∈ M. The definability of π allows a proof by induction in M 0 that this is in fact true in M 0: π(c) = 0, and if π(c +M x)=x and π(c +M y)=y then M M M 0 π(c + [x; y]) = π([(c + x); (c + y)]) = x; y. Thus M = λc(M ).

These results, like those of §7, have infinitary versions for models of ZF.

References

[1] Wilhelm Ackermann, Die Widerspruchsfreiheit der allgemeinen Mengen- lehre, Math. Annalen 114 (1937), pp. 305-315. [2] Jon Barwise, Admissible Sets and Structures, Perspectives in Mathematical Logic, Springer-Verlag, 1975. [3] Steven Givant and Alfred Tarski, Peano arithmetic and the Zermelo-like theory of sets with finite ranks, Notices Amer. Math. Soc. 77T-E51 (1977), p. A-437. [4] Ronald B. Jensen and Carol Karp, Primitive recursive set functions, Sym- posia in Pure Mathematics XIII(1) (1967), pp. 143-176. [5] Laurence Kirby, Steps towards a logic of natural objects, Epistemologia XXV (2002), pp. 225-244. [6] F.-K. Mahn, Zu den Primitiv-rekursive Funktionen ¨uber einem Bereich endlicher Mengen, Arch. Math. Logik u. Grundlagenforschung 10 (1967), pp. 30-33. [7] A. R. D. Mathias, Slim models of , J. Symbolic Logic 66(2) (2001), pp. 487-496.

33 [8] Franco Montagna and Antonella Mancini, A minimal predicative set theory, Notre Dame J. Formal Logic 35(2) (1994), pp. 186-203. [9] Flavio Previale, Induction and foundation in the theory of the hereditarily finite sets, Arch. Math. Logic 33 (1994), pp.213-241. [10] D. R¨odding, Primitiv-rekursive Funktionen ¨uber einem Bereich endlicher Mengen, Arch. Math. Logik u. Grundlagenforschung 10 (1967), pp. 13-29. [11] J. C. Shepherdson, Inner models for set theory — Part I, J. Symbolic Logic 16(3), (1951), pp. 161-190. [12] W. W. Tait, Finitism, Journal of Philosophy 78(9) (1981), pp. 524-546. [13] Alfred Tarski, Sur les ensembles finis, Fundamenta Mathematicae 6 (1924), pp. 45-95. [14] Hao Wang, Between number theory and set theory, Math. Annalen 126 (1953), pp. 385-409, reprinted in Logic, Computers, and Sets, Chelsea Pub- lishing Co., New York, 1970, pp. 478-506. [15] Alfred North Whitehead, Indication, classes, numbers, validation, Mind (July 1934), reprinted in Essays in Science and Philosophy, Greenwood Press, 1968, pp. 313-331. [16] Stephen Wolfram, A New Kind of Science, Wolfram Media, 2002. [17] Ernst Zermelo, Sur les ensembles finis et le principe de l’induction compl`ete, Acta Mathematica 32 (1909), pp. 185-193.

Department of Mathematics Baruch College, City University of New York 1 Bernard Baruch Way New York, NY10010 laurence [email protected]

34