Infinity Goes up on Trial (V55.0341)


Notes by Professor Melvin Hausner New York University

Fall, 2005

Introduction

These notes offer a description of various forms of infinity as used by today’s mathematician. The history of this subject is at least as old as the ancient Greeks. For example, areas of figures, which today we usually compute by integration, were regarded as the sum of infinitely many infinitesimal rectangles. A circle was regarded as a regular polygon with infinitely many sides. Zeno proposed various paradoxes to show that motion is impossible, or that if it is possible, you can never arrive at your destination. These paradoxes depend on the notion that time is infinitely divisible.

Mathematicians thought they had tamed the subject by using limits. Here, infinity was not anything you arrived at – it was a useful manner of speaking. You “approached infinity” but you never arrived there. Infinity was “well understood.” Who has not heard a mathematics teacher say that infinity is not a number? The teacher was right, and wrong. Right, because in many uses of the term, it is not a number. Wrong, because in many other uses of the term it is a number.

Georg Cantor, in the late 19th century, developed a remarkable and profound theory of infinite numbers. (Note the plural.) It has led to many insights as well as surprises. It proved so controversial that eminent mathematicians of the time were deeply divided on its use and meaning. The following quote is taken from the book “The Mystery of the Aleph” by Amir D. Aczel.

The famous French mathematician Henri Poincaré (1854-1912) said that Cantor’s set theory was a malady, a perverse illness from which someday mathematics would be cured. In response, however, the eminent German mathematician David Hilbert said that “no one would expel us from the paradise that Georg Cantor has opened for us.”

The arguments persist in one form or another to this day, but it seems (to me) that Hilbert won. At any rate, in this course we shall study the Cantor theory, as well as various other approaches to infinity.

Incidentally, the title of the course comes from a Bob Dylan song, “Visions of Johanna.” Dylan never said what the charges were.

Contents

1 Sets and Functions
2 Finite Sets
3 Infinite Sets: ℵ0, the Smallest Infinite Cardinal
4 The Algebra of Cardinals
5 Larger Infinite Cardinals
6 Order Relations on Cardinal Numbers
7 Ordered Sets
8 Order Types
9 Zorn’s Lemma and Applications
10 Peano’s Postulates

1 Sets and Functions

We shall take a naive point of view about sets, and not attempt to axiomatize the theory.

Sets. A set is a collection of things, called elements. These elements are chosen from some ﬁxed set U, called the universal set. For example, we may wish to talk about sets of integers. In that case, we would choose the set of all integers as the universal set. Some important examples of sets are:

1. The set N is the set of natural numbers {0, 1, 2, ...}. Some people define a natural number to be a positive integer. We find it more convenient to include 0.

2. The set Q is the set of rational numbers. The set Q+ is the set of positive rational numbers.

3. The set R is the set of real numbers. The set R+ is the set of positive real numbers.

4. The set Z is the set of integers, positive, negative or 0.

5. The set P of all prime numbers. This is an infinite set, but this is by no means obvious. It was proven by none other than Euclid in his Elements.

These are all inﬁnite sets. Some examples of ﬁnite sets are:

6. The set N5 consists of all natural numbers between 1 and 5 inclusive. This can be given explicitly: N5 = {1, 2, 3, 4, 5}. Note that N5 has 5 elements.

7. The set P100 of prime numbers less than 100. This too can be given explicitly though with a little more difficulty:

P100 = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97}

A count shows that P100 consists of 25 elements.

Sets can be given implicitly by a defining condition. In some cases it may be difficult to determine if such a set is finite or infinite. For example,

8. The set of twin primes. This is the set of numbers¹ p such that both p and p + 2 are prime numbers. In this case, (p, p + 2) is called a twin prime pair. The first few twin prime pairs are (3, 5), (5, 7), (11, 13), (17, 19), (29, 31). The set of twin primes under 100 is {3, 5, 11, 17, 29, 41, 59, 71}.

Here is an interesting fact. To this day (September 2005) nobody knows if the set of twin primes is finite or infinite. (But nearly everyone I speak to is pretty sure it is infinite!) If you prove this one way or the other (finite or infinite), I promise you will be famous and be written up on the front page of the New York Times.
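The two finite examples above (the 25 primes below 100, and the twin primes below 100) can be checked mechanically. The following is only a sketch: the function names are made up for this illustration, and trial division is chosen for simplicity, not speed. It says nothing, of course, about whether the full set of twin primes is finite.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def primes_below(limit: int) -> list[int]:
    """All primes p with p < limit, in increasing order."""
    return [n for n in range(limit) if is_prime(n)]


def twin_primes_below(limit: int) -> list[int]:
    """All p < limit such that both p and p + 2 are prime."""
    return [p for p in range(limit) if is_prime(p) and is_prime(p + 2)]


print(len(primes_below(100)))   # number of elements of P100
print(twin_primes_below(100))   # twin primes under 100
```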

¹ A number is understood to be a natural number unless otherwise described.

Here are a few other examples of sets.

9. The set of numbers x such that x² + 1 is a prime.

10. The set of pairs (m, n) of positive integers such that 2n² = m² + 1. For example (7, 5) is in this set. Can you find two more elements of this set? Is it infinite?

Set Operations And Relations. We now review some of the basic operations on sets.

1. Membership. If S is a set, we write x ∈ S (read: x is in S) to mean that x is an element of S. We write x ∉ S (read: x is not in S) to mean that x is not an element of S. For example 4 ∈ N5, and −1 ∉ N.

2. Equality of Sets. We say that sets S and T are equal (S = T ) if any member of one set is also a member of the other. Using logical symbolism,

(S = T ) ↔ (∀x)((x ∈ S → x ∈ T ) ∧ (x ∈ T → x ∈ S)).

Here is the list of logical symbols and their meaning used in this complicated looking formula.

Symbol    Meaning
↔         if and only if
∀x        for any x
∈         is in
→         implies; A → B is read “If A then B”
∧         and

3. Union and Intersection. If S and T are sets, S ∪ T (read: S union T) is the set of elements which are in S or in T.² The intersection of sets S and T, written S ∩ T (read: S intersect T), is defined as the set of elements in both S and T. In symbols:

(∀x)(x ∈ (S ∪ T ) ↔ (x ∈ S) ∨ (x ∈ T ))

(∀x)(x ∈ (S ∩ T ) ↔ (x ∈ S) ∧ (x ∈ T ))

For example, take the universal set to be N. Deﬁne S = E, the even numbers and T = N6, the numbers between 1 and 6 inclusive. Then S ∪ T = the even numbers together with 1, 3, and 5. S ∩ T = {2, 4, 6}. Here we have used an additional logical symbol:

Symbol    Meaning
∨         or

² This is a mathematical “or”. When a mathematician says “A or B”, he or she means “A or B or both”. This is called the inclusive or.

One further example: If N is taken as the universal set, let E be the set of even numbers, and let O be the set of odd numbers. Then a little thought shows that E ∪ O = N, but there are no elements in E ∩ O. Since we always want the intersection of sets to be defined, we introduce the empty set ∅ as the set with no elements. In this case, we have

E ∩ O = ∅.

We deﬁne the empty set by the condition

(∀x)¬(x ∈ ∅); [or (∀x)(x ∉ ∅)].

Here is the last list of logical symbols (for a while).

Symbol    Meaning
¬         not
∉         is not in

4. Subsets. If A and B are sets, we write A ⊆ B if all elements of A are also in B. Using logical symbols:

A ⊆ B ↔ (∀x)(x ∈ A → x ∈ B)

Comparing this to the definition of equality of sets, we see that A = B if and only if A ⊆ B and B ⊆ A. Typically, a proof that two sets are equal will have two parts: the proof of the inclusion of each set in the other.

An important property of inclusion is transitivity: If A ⊆ B and B ⊆ C then A ⊆ C. If A ⊆ B and B ⊆ C, we customarily write A ⊆ B ⊆ C, in analogy with notation used for inequalities of numbers.

We point out that for any set A, ∅ ⊆ A ⊆ U. The second part of this inclusion is straightforward because of our agreement about the universal set. But what about ∅ ⊆ A? Technically, this means

(∀x)(x ∈ ∅ → x ∈ A)

We can argue verbally that this is true: If (you can find an) x ∈ ∅, then (indeed, that) x will be in A. An implication with a false hypothesis is said to be vacuously true.

5. Complements and Differences. The complement of a set consists of all elements not in the set. Recall that we have agreed that all sets in a given discussion are understood to consist of elements taken from the universal set U. The complement of a set A, written A′, is thus defined by the condition

(∀x)(x ∈ A′ ↔ (x ∉ A ∧ x ∈ U))

Because of our convention, the clause x ∈ U is redundant. It is placed as a reminder to us that elements come from U. For example, let E be the set of even natural numbers. One might be tempted to say that E′ is the set of odd natural numbers. But by definition, E′ consists of elements which are not even. Why not take √2 ∈ E′? Or even 1/2 ∈ E′? It all depends on our universal set. If we take U = N, then E′ is in fact the set of odd numbers. But if U = Q then in fact 1/2 ∈ E′. Thus the universal set is crucial when defining complements.

If A and B are sets, we can define A − B as the set of elements in A but not in B. Thus,

(∀x)(x ∈ A − B ↔ (x ∈ A ∧ x ∉ B))

Equivalently, A − B = A ∩ B′. Some authors use A \ B rather than A − B.

6. The Algebra of Sets. We have introduced several operations and relations on sets. These include ∅, U, ∩, ∪, ∈, ⊆, A′. Not surprisingly, there is a substantial algebra allowing us to work with these concepts. We list a few of the basic results. In most cases, the proofs are straightforward.

i. Empty Set and Universal Set Properties.

∅ ⊆ A ⊆ U    (1)
∅′ = U; U′ = ∅    (2)
A ∩ A′ = ∅; A ∪ A′ = U    (3)
A ∪ ∅ = A; A ∩ ∅ = ∅    (4)
A ∪ U = U; A ∩ U = A    (5)

ii. Union and Intersection Properties.

A ∩ A = A; A ∪ A = A    (6)
A ∪ B = B ∪ A    (7)
A ∩ B = B ∩ A    (8)
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)    (9)
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)    (10)

Formulas 7 to 9 are easy to remember since they are analogous to similar formulas from algebra with ∩ replaced by multiplication and ∪ replaced by addition. However, this analogy is not perfect. For example, the equation a + x = b will have a unique real solution x for given real numbers a and b, but the analogous equation A ∪ X = B will only have a solution if A ⊆ B, and then the solution will not be unique.

iii. Inclusion Properties.

A ⊆ A ∪ B; A ∩ B ⊆ A    (11)
If A ⊆ B then A ∩ C ⊆ B ∩ C    (12)
A ⊆ B if and only if A ∩ B = A    (13)
A ⊆ B if and only if A ∪ B = B    (14)
If A ⊆ B and B ⊆ C then A ⊆ C (transitivity)    (15)

iv. Complement Properties.

(A′)′ = A    (16)
(A ∪ B)′ = A′ ∩ B′    (17)
(A ∩ B)′ = A′ ∪ B′    (18)
If A ⊆ B then B′ ⊆ A′    (19)

Equations 17 and 18 are called the De Morgan laws.
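A few of these identities can be spot-checked on a small example. The sketch below is not a proof: it only tests one choice of sets, and the universal set `U` (the numbers 0 through 19), the sets `A`, `B`, `C`, and the helper `complement` are all assumptions made for this illustration.

```python
# Assumed universal set: a small finite stand-in for N.
U = set(range(20))
A = {n for n in U if n % 2 == 0}   # even numbers in U
B = {n for n in U if n % 3 == 0}   # multiples of 3 in U
C = {1, 2, 3, 4, 5, 6}


def complement(S: set) -> set:
    """Complement relative to the universal set U."""
    return U - S


checks = {
    "(3)  A and its complement": (A & complement(A) == set()) and (A | complement(A) == U),
    "(9)  distributive law": A & (B | C) == (A & B) | (A & C),
    "(10) distributive law": A | (B & C) == (A | B) & (A | C),
    "(17) De Morgan": complement(A | B) == complement(A) & complement(B),
    "(18) De Morgan": complement(A & B) == complement(A) | complement(B),
    "difference": A - B == A & complement(B),
}

for name, holds in checks.items():
    print(name, holds)
```

Changing `U` changes every complement, which is the point made above about the universal set being crucial.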

Counting. What do we mean when we say that there are 45 students in a class? In simple terms, it means that we can count the students off, one by one, ending with 45. Thus: 1, Mary Kowalski; 2, Mike Jacobs; . . . , 44, Anthony Jiminez; 45, Alice Lazaroff. Diagrammatically, we set up a correspondence between the numbers from 1 through 45 and the students in the classroom:

1 ↔ Mary Kowalski
2 ↔ Mike Jacobs
⋮
44 ↔ Anthony Jiminez
45 ↔ Alice Lazaroff

This correspondence is called a 1-1 correspondence from the integers between 1 and 45 inclusive onto the set of students in the classroom. Every student is counted and counted only once.

We can generalize this procedure to any ﬁnite set as follows:

Definition. If n is a positive integer, we define Nn to be the set of integers between 1 and n inclusive. Thus, Nn = {1, 2, ..., n}. More formally, Nn is the set of x ∈ N such that 1 ≤ x ≤ n.

Deﬁnition. Let n ∈ N, and n>0. A set S has n elements if there is a 1-1 correspondence from Nn onto S. By deﬁnition, we also say that the null set has 0 elements. The number of elements of a set S is written |S|.

Of course not all sets have a number using this definition. For example, the set of even numbers cannot be counted using this definition.

Definition. If |S| = n ∈ N or S = ∅, S is called a finite set. If not, it is called infinite.

If a set S has n ≥ 1 elements, then by deﬁnition we can list the set as S = {s1,s2,...,sn} where there are no repetitions in this listing.

Functions. We have stated that a finite non-empty set is one which can be put into a 1-1 correspondence with one of the sets Nn for some n ∈ N. It is possible to define correspondences between arbitrary sets A and B using the notion of a function or mapping. Let A and B be two sets. Suppose that each element x of A is assigned some element y of B. We give this assignment a name, traditionally a single letter, say f. Such a correspondence is written as f : A → B, and read as “f maps A into B.” In this case we write y = f(x) if the element x ∈ A is made to correspond to the element y ∈ B. A correspondence is also called a function or a mapping. The set A is called the domain of the function f. The set B is called its co-domain. This is a very general concept. As an illustration, take A = {1, 3, 5, 7, 9, 11} and B = {A, B, C, D}. Then an example of a mapping g:A → B is given by the following table:

x       1   3   5   7   9   11
g(x)    B   A   C   B   A   A

We can also give this function as follows:

1 ↦ B; 3 ↦ A; 5 ↦ C; 7 ↦ B; 9 ↦ A; 11 ↦ A

In this example, some letters are not covered (D in this case), and some are covered more than once (A and B here.). Our definition of a function states that every element of A must correspond to some element of B, but it is not required that every element of B come from some element of A, and the definition allows some elements to be repeated. We make the following definition:

A function f:A → B is said to be onto if for any y ∈ B there is an x ∈ A such that y = f(x).

A function f:A → B is said to be 1-1 if different elements of A map into different elements of B. This may be stated in two different, but equivalent ways:

1. If x1, x2 ∈ A and x1 ≠ x2, then f(x1) ≠ f(x2).

2. If x1, x2 ∈ A and f(x1) = f(x2), then x1 = x2.

The two statements are logically equivalent. (Technically, the second is the contrapositive of the first.) The first is probably the more intuitive way of looking at the definition, but the second is often easier to handle mathematically, because equalities are usually easier to handle than inequalities.

If f:A → B is 1-1 and onto, each element of A is assigned a unique element of B and each element of B is the assignment of one and only one element of A.
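For functions on small finite sets, both definitions can be tested directly. As a sketch, the mapping g from the table above is stored as a Python dictionary; the helper names `is_onto` and `is_one_to_one` are invented for this illustration.

```python
# The mapping g from the table: domain {1, 3, 5, 7, 9, 11}, co-domain {A, B, C, D}.
g = {1: "B", 3: "A", 5: "C", 7: "B", 9: "A", 11: "A"}
codomain = {"A", "B", "C", "D"}


def is_onto(f: dict, codomain: set) -> bool:
    """Every element of the co-domain is f(x) for some x."""
    return set(f.values()) == codomain


def is_one_to_one(f: dict) -> bool:
    """Different elements of the domain have different images."""
    return len(set(f.values())) == len(f)


print(is_onto(g, codomain))   # D is never reached
print(is_one_to_one(g))       # A and B are each reached more than once
```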

If f:A → B and g:B → C, then we can put these together to form a function from A to C. The resulting function is called g ◦ f and is defined by the formula (g ◦ f)(x) = g(f(x)). g ◦ f is called the composition of g and f. If f and g are 1-1 and onto, then so is g ◦ f.

We illustrate with a simple counting example. A = {1, 2, 3, 4}, B = {a, b, c, d}, and C = {5, 8, 10, 14}. Let g:A → B and f:B → C be given by the assignments (note that in this example g is the map that is applied first):

g: 1 ↦ b; 2 ↦ c; 3 ↦ a; 4 ↦ d
f: a ↦ 10; b ↦ 5; c ↦ 14; d ↦ 8

Then f ◦ g simply follows these mappings one after the other (beginning with g!):

1 ↦ b ↦ 5; 2 ↦ c ↦ 14; 3 ↦ a ↦ 10; 4 ↦ d ↦ 8

so f ◦ g: 1 ↦ 5; 2 ↦ 14; 3 ↦ 10; 4 ↦ 8

If f:A → B is 1-1 and onto, then the relationship between A and B is symmetric: corresponding to each element y ∈ B there is a unique element x ∈ A such that y = f(x). Thus we have an inverse map of B into A. This map is denoted f⁻¹. By definition,

x = f⁻¹(y) if and only if y = f(x)

In the above example, we can easily compute g⁻¹:

g: 1 ↦ b; 2 ↦ c; 3 ↦ a; 4 ↦ d
g⁻¹: a ↦ 3; b ↦ 1; c ↦ 2; d ↦ 4

Note: If a function f is 1-1 and onto, so is f⁻¹.
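Composition and inversion of finite maps are easy to carry out by machine. The sketch below uses the maps from the counting example (g sends 1, 2, 3, 4 to b, c, a, d, and f sends a, b, c, d to 10, 5, 14, 8); the helper names `compose` and `inverse` are assumptions made for this illustration.

```python
g = {1: "b", 2: "c", 3: "a", 4: "d"}        # g: A -> B
f = {"a": 10, "b": 5, "c": 14, "d": 8}      # f: B -> C


def compose(outer: dict, inner: dict) -> dict:
    """Return outer ∘ inner, that is x -> outer(inner(x))."""
    return {x: outer[inner[x]] for x in inner}


def inverse(h: dict) -> dict:
    """Inverse of h as a map onto its set of values; h must be 1-1."""
    inv = {y: x for x, y in h.items()}
    if len(inv) != len(h):
        raise ValueError("map is not 1-1, so it has no inverse")
    return inv


f_after_g = compose(f, g)   # apply g first, then f
g_inv = inverse(g)

print(f_after_g)
print(g_inv)
```

Applying `compose(g_inv, g)` gives back the identity map on {1, 2, 3, 4}, which is one way to confirm that the inverse was computed correctly.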

2 Finite Sets

Before discussing numbers in general, and infinite numbers in particular, we will have to discuss the familiar finite sets.

I. Equivalence of Sets. Imagine a classroom with 39 seats in it. The class files in and each student sits in a seat. It is observed that each chair is occupied, and that no two people occupy the same seat. We can now positively state that there are 39 people in the class. We can interpret what has happened by stating that the seating effected a one-one correspondence between students and seats.

This same principle applies for much larger “classrooms.” Imagine a stadium with a large number of seats. On the night of a big concert, every seat is occupied, and there’s no doubling up. (Again a 1-1 correspondence.) What we can say now is that the number of concertgoers is equal to the number of seats even though we have not “counted” either. We now generalize these ideas.

Deﬁnition. Sets A and B are said to be equivalent, written A ∼ B if they can be placed into 1-1 correspondence. More precisely,

A ∼ B ↔ there is a 1-1 onto map f : A → B

The relation ∼ on sets is an equivalence relation. This means:

1. A ∼ A (The reflexive property)

2. If A ∼ B then B ∼ A (The symmetric property)

3. If A ∼ B and B ∼ C then A ∼ C (The transitive property)

Properties 1, 2, and 3 are easy to prove. To prove 1, note that I:A → A defined by the condition I(x) = x is a 1-1 onto correspondence between A and itself. (The mapping I is called the identity transformation.) To prove 2, note that if f:A → B is 1-1 and onto, so is f⁻¹:B → A. To prove 3, note that if f:A → B and g:B → C are 1-1 and onto, then so is g ◦ f:A → C.

The definition of equivalence applies to all sets, including infinite ones. For example, if A = N, the natural numbers, and B = {10, 11,...} is the set of all integers which are greater than or equal to 10, then we have A ∼ B because the function f:A → B defined by the equation f(x)=x + 10 is 1-1 and onto.

The definition of equivalence of sets does not give the 1-1 correspondence f:A → B. It is necessary to find f or at least prove that one can be found. This definition depends on the existence of the function f.

We have the following simple result alluded to at the beginning of this section:

Theorem: If A ∼ B and A is finite, then B is also finite and |A| = |B|.

For the proof, suppose that |A| = n. Then by definition, Nn ∼ A. But we are given that A ∼ B. Therefore, using symmetry and transitivity, we find Nn ∼ B. So B is finite and |B| = n. This is the result.

II. Uniqueness of cardinality. Up to now, we took it for granted that the cardinal number of a finite set is unique. To take a concrete example, this states that we cannot count 45 elements in a set, and then by recounting in another way, find only 43 elements.¹ We now prove it, by contradiction.

¹ This is a result that we have learned to take for granted since childhood, so it may seem difficult to believe that it actually needs a proof!

We will assume that both counts are possible, and then arrive at a contradiction. For, suppose we managed a count of 45 and a count of 43 for the same set. Then, we would have a 1-1 correspondence between N45 and N43 looking something like:

1 ↔ 12
2 ↔ 5
⋮
21 ↔ 43
⋮
44 ↔ 13
45 ↔ 17

Now switch the 21st and the 45th correspondences to get

        before    after
1   ↔   12        12
2   ↔   5         5
⋮
21  ↔   43        17
⋮
44  ↔   13        13
45  ↔   17        43

Now drop the last correspondence from 45 to 43 to get

1 ↔ 12
2 ↔ 5
⋮
21 ↔ 17
⋮
44 ↔ 13

The numbers on the left are all the numbers up through 44, while on the right we have all the numbers up through 42. So starting with a 1-1 correspondence between N45 and N43, we arrive at a 1-1 correspondence between N44 and N42. We have reduced these counts by 1. We can now repeat this process 41 more times to get a 1-1 correspondence between N3 and N1. And this is clearly impossible!
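The switch-and-drop step can be written out as a small procedure. Since no 1-1 correspondence between N45 and N43 actually exists, the sketch below demonstrates the step on a genuine correspondence of N5 with itself; the function name `descend` and the sample data are assumptions made for this illustration.

```python
def descend(corr: dict[int, int]) -> dict[int, int]:
    """One step of the descent.

    corr is a 1-1 correspondence from {1, ..., m} onto {1, ..., k}.
    The result is a 1-1 correspondence from {1, ..., m-1} onto {1, ..., k-1}.
    """
    m = len(corr)                  # largest number on the left
    k = max(corr.values())         # largest number on the right
    new = dict(corr)
    # Find the left-hand number j that is matched with k.
    j = next(x for x, y in new.items() if y == k)
    # Switch the partners of j and m, so that m is now matched with k.
    new[j], new[m] = new[m], new[j]
    # Drop the correspondence m <-> k.
    del new[m]
    return new


corr = {1: 3, 2: 5, 3: 1, 4: 2, 5: 4}
print(descend(corr))
```

Each call lowers both counts by one, which is exactly what drives the contradiction in the argument above.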

This method of descent is based on the fundamental well ordering property of N. This states the following:

The Well Ordering Property of N. If S is a non-empty subset of N, then S contains a least element. That is, there is an element s0 ∈ S such that s0 ≤ s for all s ∈ S.

Here is how the well ordering property may be used to prove that every finite set has a unique count. Say that a natural number n is weird if there is a set S whose count is n but whose count by some other 1-1 method is m > n. Note that it is easy to see that 1 is not weird. We want to show that no weird numbers exist; namely that the set of weird numbers is empty. Assume not. Then by the well ordering property there is a smallest weird number n0. Using the process illustrated above (where 43 was assumed weird), we can show that n0 − 1 is also weird. But this is a contradiction, since n0 was the smallest weird number and n0 − 1 is smaller than n0. This completes the proof.

III. Additivity of Cardinals. In grade school, addition of integers was often defined in terms of sets. “Here are 3 apples, here are 5 more. Put them together and you have 5 + 3 = 8 apples.” In general, we expect the same for any finite sets.

Theorem: If A and B are disjoint ﬁnite sets (A ∩ B = ∅), then |A ∪ B| = |A| + |B|.

Note that the result is trivial if A or B is the empty set. For the proof, suppose |A| = m > 0 and |B| = n > 0. We have 1-1 onto maps f:A → Nm and g:B → Nn. Now we “lift up” Nm by defining h(x) = x + n for all x ∈ Nm. It is an easy matter to show that h maps Nm 1-1 onto the set {1 + n, 2 + n, ..., m + n} = Nm+n − Nn. So h ◦ f:A → Nn+m − Nn. Now we put together the two functions h ◦ f and g to get a function F:A ∪ B → (Nn+m − Nn) ∪ Nn = Nn+m. It is easy to show that F is 1-1 and onto. This shows that |A ∪ B| = n + m = |A| + |B|.
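The construction of F in this proof can be followed step by step on a small example. This is a sketch only: the function name `count_union` and the sample sets are assumptions, and the counts f and g are simply taken from the order in which the elements are listed.

```python
def count_union(A: list, B: list) -> dict:
    """Build the map F from the proof for disjoint finite A and B.

    A and B are given as lists without repetition.
    """
    if set(A) & set(B):
        raise ValueError("A and B must be disjoint")
    n = len(B)
    f = {a: i + 1 for i, a in enumerate(A)}   # f: A -> N_m
    g = {b: j + 1 for j, b in enumerate(B)}   # g: B -> N_n
    F = dict(g)                               # B keeps the counts 1, ..., n
    for a in A:
        F[a] = f[a] + n                       # h(f(a)) = f(a) + n, lifted above n
    return F


F = count_union(["x", "y", "z"], ["p", "q"])
print(F)
```

The values of `F` run through 1, ..., n + m exactly once, so F is a 1-1 correspondence from A ∪ B onto Nn+m.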

As a consequence we have the following theorem.

Theorem (Preservation of Inequalities). If B is finite and A ⊆ B, then |A| ≤ |B|. More precisely, if A is a proper subset² of B, then |A| < |B|.

Proof: Assuming that A is finite (it is a subset of a finite set!), we can write B = A ∪ (B − A). Since A and B − A are disjoint, we have |B| = |A| + |B − A|. This proves that |A| ≤ |B| since |B − A| ≥ 0. Also if A ⊂ B, we have B − A ≠ ∅, and so |B − A| > 0 and this proves the strict inequality when A is a proper subset of B.

The proof does depend on the result that a subset of a ﬁnite set is ﬁnite. This is another result that is so familiar we almost have to ask if it isn’t obvious. Before we give a formal proof, we start with a simple lemma:

Replacement Lemma. If |A| = n, and x0 ∉ A and a ∈ A, then |(A − {a}) ∪ {x0}| = n. Note that (A − {a}) ∪ {x0} is the set A with a replaced by x0.

Proof: There is a 1-1 correspondence f from A onto Nn. Define f′ : (A − {a}) ∪ {x0} → Nn by defining f′(x) = f(x) if x ≠ x0, and define f′(x0) = f(a). This is a 1-1 onto correspondence between (A − {a}) ∪ {x0} and Nn.

Theorem: A subset of a ﬁnite set is ﬁnite.

Proof: We do this again by the method of descent. Call a natural number n strange if there is a finite set A, with |A| = n, which has a non-finite subset. We will show that there are no strange numbers. For if there are strange numbers, we let m be the least strange number. In that case, it is easy to show that Nm itself has a non-finite subset B. B itself consists of some (but not all) of the numbers of Nm. There are now two cases.

In case 1, m ∉ B. In this case, B ⊆ Nm−1. Because of how m was chosen, m − 1 is not strange, and so B is finite, a contradiction.

In case 2, we have m ∈ B. Since B is a proper subset of Nm there is an element x0 ∈ Nm − B. Now, using the replacement lemma, replace m by x0 in the set B. The resulting set B′ is also non-finite but does not contain m. This brings us back to case 1 for a contradiction. This completes the proof.

² This means that A ⊆ B, but A ≠ B. It is written A ⊂ B.

3 Infinite Sets: ℵ0, the Smallest Infinite Cardinal

An infinite set S is one which is not finite. Although this gives the full story, we expect more. We want lots of elements. In particular, we want to be able to pick off some of its elements one by one forever. We start by choosing the first element, and call it s1. This cannot exhaust the set (if it did, we would have a finite set), so we choose a different second element and call it s2. This also cannot exhaust the set. So we can continue indefinitely in this way to find a subset S′ = {s1, s2, ..., sn, ...}.

The mapping f:N → S′ in which n ↦ sn is 1-1. (This is another way of saying that the various sn are all different.) However, this mapping is not necessarily onto S. Conversely, if such a map f exists, S must be infinite. Thus we have the theorem:

Theorem. A set S is infinite if and only if there is a 1-1 function f:N → S.

Let us take a few examples.

Example 1. S = Q, the rational numbers and f(n)=1/n. f is 1-1, but not onto. Q is, of course, inﬁnite.

Example 2. S = R, the real numbers and f(n)=n. f is 1-1, but not onto. R is, of course, inﬁnite.

Example 3. S = E, the positive even numbers and f(n)=2n. f is 1-1, and onto. E is, of course, inﬁnite.

We can picture the map in Example 3 as follows:

1   2   3   ...   n    ...
↓   ↓   ↓         ↓
2   4   6   ...   2n   ...

Here, the even numbers E are put into 1-1 correspondence with the natural numbers. Using the terminology of Section 2, we have N ∼ E. In analogy with ﬁnite sets, Georg Cantor took the giant step of counting N and assigning it a number ℵ0 (read aleph null):

|N| = ℵ0

Following the finite counting results, we say that two sets have the same number if and only if they are equivalent. That is, if A ∼ B, we have |A| = |B|. This permits any set to have a number, and two sets have the same number if and only if they can be put into 1-1 correspondence with each other. Using Example 3 above, we can now say that there are ℵ0 even numbers. If |S| = ℵ0, we say that the set S is denumerable or countable. The count of any set is called a cardinal number. The count of an infinite set is called a transfinite number. The theorem at the beginning of this section may be stated:

Theorem: Any inﬁnite set has a denumerable subset.

We also have the following theorem, illustrating that ℵ0 is the smallest inﬁnite cardinal.

Theorem: A subset of a denumerable set is either ﬁnite or denumerable.

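The idea of the proof is to pick out the elements, one by one. For a proof, it is enough to consider a subset S ⊆ N. If it is non-empty, it has a least element, which we call n0. Note that n0 ≥ 0. Now consider the set S − {n0}. If this is empty, then S consists of just one element n0 and the result is proved. If not empty, we choose the least element n1 ∈ S − {n0}. Note that n1 ≥ 1. We proceed in this way to get a sequence n0 < n1 < n2 < ··· of elements of S. If the process stops after finitely many steps, S is finite. Otherwise, since nk ≥ k for every k, each element of S is eventually picked, so k ↦ nk is a 1-1 map of N onto S and S is denumerable.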

In the 17th Century, Galileo noted that the square numbers 1, 4, 9, . . . are in 1-1 correspondence with the natural numbers using the simple correspondence n ↦ n². This led to considerable confusion, since the whole was supposed to be “greater than its parts.” We now say that there are ℵ0 perfect squares, the same number as the positive integers. This was one of the initial paradoxes concerning infinite sets. Most people might guess that there are fewer even numbers than whole numbers; E is a proper subset of N. For finite sets, a proper subset has fewer elements than the given set. (See the theorem on the preservation of inequalities in Section 2.) But here |E| = |N| = ℵ0.

In Examples 1 and 2 above, we found 1-1 mappings into Q and R which were not onto. This does not imply that these sets fail to have cardinality ℵ0. All that we have found is that the given maps were not appropriate to prove that the sets in question were denumerable. (In fact, we shall prove that |Q| = ℵ0, but |R| ≠ ℵ0.)

Once Cantor discovered (invented?) ℵ0, he went about trying to get counts of various sets; for example, real numbers, a line segment, the plane and 3-space, the rational numbers and so on. And although the strict theorem on preservation of inequalities failed, what can be said of the non-strict version using ⊆ and ≤? Also what about arithmetic? Can you add, subtract, divide, multiply, or exponentiate cardinal numbers?

Addition of Cardinals. For finite cardinals, we know that for disjoint sets A and B, we can add cardinalities by taking the union of the sets: |A ∪ B| = |A| + |B|. (See the theorem on additivity of cardinals in Section 2.)

We take this as the definition of the sum of cardinals, infinite or otherwise:

If A and B are disjoint, |A| + |B| = |A ∪ B|

Let us apply this definition to compute 1 + ℵ0. In order to do this, we have to find a set with 1 element, and get its union with a set with ℵ0 elements. We try the single element set {0} consisting of 0 alone, and the set N+ of positive natural numbers. The mapping f:N → N+ given by the formula f(n) = n + 1 is clearly 1-1 and onto, so N+ ∼ N and we have |N+| = |N| = ℵ0. Finally, since N = {0} ∪ N+, we get

1+ℵ0 = ℵ0

We can also compute ℵ0 + ℵ0 using the fact that the even numbers are denumerable. We need a disjoint set to add to it. Let’s take the odd numbers O. By deﬁnition an odd number is of the form 2n + 1. Therefore the map f:N → O deﬁned by f(n)=2n + 1 is a 1-1 map of N onto the odd numbers. We have |O| = ℵ0. Finally, since N = E ∪ O, we have:

ℵ0 + ℵ0 = ℵ0

These last two equations are enough to tell us that subtraction is not going to be possible. For, we have the equation 1 + ℵ0 = ℵ0 + ℵ0, and if we were to allow subtraction, this would lead to 1 = ℵ0, an unpalatable (and untrue) result.

However, we do have the commutative and associative laws for sets, using the union operation:

A ∪ B = B ∪ A; A ∪ (B ∪ C) = (A ∪ B) ∪ C

This implies the commutative and associative laws for addition of cardinal numbers. So we can compute:

ℵ0 + 1 = 1 + ℵ0 = ℵ0

ℵ0 + ℵ0 + ℵ0 = (ℵ0 + ℵ0) + ℵ0 = ℵ0 + ℵ0 = ℵ0

2 + ℵ0 = 1 + 1 + ℵ0 = 1 + ℵ0 = ℵ0

Of course this generalizes to the equations n + ℵ0 = ℵ0 + n = ℵ0 for n ∈ N and n · ℵ0 = ℵ0 for positive n ∈ N. In this last equation n · ℵ0 denotes the sum ℵ0 + ℵ0 + ··· + ℵ0, with n summands.

Using these results, we can show that the integers Z (positive, negative and zero) are denumerable. We have Z = N+ ∪ {0} ∪ N−, where N+ and N− are the sets of positive and negative integers respectively. The map n ↦ −n shows that N+ ∼ N−, so |N−| = |N+| = ℵ0. So

|Z| = |N+ ∪ {0} ∪ N−| = |N+| + |{0}| + |N−| = ℵ0 + 1 + ℵ0 = ℵ0

A more direct approach is to enumerate (count off) the integers as follows: 0, 1, −1, 2, −2, 3, −3, ...
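The direct enumeration 0, 1, −1, 2, −2, ... can be written as an explicit 1-1 correspondence between N and Z. The formulas below are one possible way to express it; the function names `nth_integer` and `position` are assumptions made for this sketch.

```python
def nth_integer(k: int) -> int:
    """The integer in position k (k = 0, 1, 2, ...) of the list 0, 1, -1, 2, -2, ..."""
    if k % 2 == 1:
        return (k + 1) // 2    # odd positions carry the positive integers
    return -(k // 2)           # even positions carry 0 and the negative integers


def position(z: int) -> int:
    """Inverse map: the position at which the integer z appears."""
    if z > 0:
        return 2 * z - 1
    return -2 * z


print([nth_integer(k) for k in range(7)])
```

Because `position` undoes `nth_integer` (and the other way around), the enumeration is both 1-1 and onto, so every integer is counted exactly once.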

An interesting consequence of the above arithmetic is that adding ℵ0 to an inﬁnite cardinal does not change the value of that cardinal:

Theorem: If a is any infinite cardinal, then ℵ0 + a = a and n + a = a for any n ∈ N.

Proof: Let A be a set for which |A| = a. Then by the theorem earlier in this section, A has a denumerable subset D. Let C = A − D. Then A = D ∪ C where C and D are disjoint. Therefore,

a = |A| = |D| + |C| = ℵ0 + |C|

Therefore ℵ0 + a = ℵ0 + ℵ0 + |C| = ℵ0 + |C| = a. We also have n + a = n + ℵ0 + a = ℵ0 + a = a.

The following theorem is a handy tool to show that certain sets are denumerable.

Theorem. The union of denumerably many finite sets is finite or denumerable. Thus, if

T = T1 ∪ T2 ∪ ··· ∪ Tn ∪ ···

where each Tn is finite, then T is finite or denumerable.

Proof: We count T systematically, by starting with T1, going on to T2 and continuing indefinitely, being sure not to count any element twice. It is possible that we reach a point where there are no new elements. In this case, T will be finite. Otherwise |T| = ℵ0.

Using this theorem, we can prove that the set Q+ of positive rationals is denumerable. Define the size of a positive rational m/n as the number m + n. (The fraction is understood to be in lowest terms.) Let Qn be the set of positive rationals having size n. For example, Q8 = {1/7, 3/5, 5/3, 7/1}. Each Qn is finite, and Q+ is clearly the union of the Qn. Since Q+ is infinite, we conclude that Q+ is denumerable. This is an astonishing result because it would appear that there are many more rationals than integers. The count is as follows:

size 2: 1/1
size 3: 1/2, 2/1
size 4: 1/3, 3/1
size 5: 1/4, 2/3, 3/2, 4/1
...

The set of all (negative, positive, and zero) rationals is also denumerable, using the same technique used to show that the set Z of integers is denumerable.
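The count by size can be turned into a procedure that lists the positive rationals one after another. In the sketch below a rational m/n in lowest terms is represented by the pair (m, n); the names `rationals_of_size` and `positive_rationals` are assumptions made for this illustration.

```python
from itertools import count, islice
from math import gcd


def rationals_of_size(s: int) -> list[tuple[int, int]]:
    """All pairs (m, n) with m + n = s, m, n >= 1, and m/n in lowest terms."""
    return [(m, s - m) for m in range(1, s) if gcd(m, s - m) == 1]


def positive_rationals():
    """Yield every positive rational exactly once, ordered by size."""
    for s in count(2):
        yield from rationals_of_size(s)


print(rationals_of_size(8))
print(list(islice(positive_rationals(), 9)))
```

Requiring lowest terms is what prevents repeats: 2/2 is skipped at size 4 because it already appeared as 1/1 at size 2.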

Another application, done by Cantor, shows that the cardinal number of the algebraic numbers is ℵ0. An algebraic number is one which satisfies some polynomial equation of the form

f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ... + a₁x + a₀ = 0, aᵢ ∈ Z, aₙ ≠ 0

For example, √5 is algebraic since it satisfies the equation x² − 5 = 0. So is 5√2 − 3∛4.

The method for showing the algebraic numbers are denumerable is similar to the one we used above for the rationals. Define the weight or size of f(x) to be |aₙ| + |aₙ₋₁| + ... + |a₁| + |a₀| + n.

This is an oversimplified way of giving the complexity of a polynomial. For example, the weight of the polynomial x³ − 4x + 7 is 1 + 4 + 7 + 3 = 15, and the weight of x⁷ − 1 is similarly 1 + 1 + 7 = 9. There are finitely many polynomials (with integer coefficients) of a given weight. And each polynomial, by a theorem of algebra, has finitely many roots. So there are finitely many algebraic numbers that are roots of polynomials of a given size. Since the algebraic numbers are the union of algebraic numbers coming from polynomials of weight 1, 2, etc., this proves the result. We will use this result in a striking way in a later section.
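The weight is simple to compute, and for small weights the polynomials can be listed by brute force, which makes the finiteness claim concrete. In this sketch a polynomial is represented by its list of coefficients from the leading one down to the constant term; the function names are assumptions, and the enumeration is restricted to degree at least 1, since constant polynomials contribute no roots.

```python
from itertools import product


def weight(coeffs: list[int]) -> int:
    """Weight of a polynomial given as [a_n, ..., a_1, a_0] with a_n != 0."""
    if not coeffs or coeffs[0] == 0:
        raise ValueError("leading coefficient must be non-zero")
    degree = len(coeffs) - 1
    return sum(abs(a) for a in coeffs) + degree


def polynomials_of_weight(w: int) -> list[tuple[int, ...]]:
    """All integer polynomials of degree >= 1 having weight w (brute force)."""
    found = []
    for n in range(1, w):              # n is the degree
        budget = w - n                 # what is left for the sum of |a_i|
        for coeffs in product(range(-budget, budget + 1), repeat=n + 1):
            if coeffs[0] != 0 and sum(map(abs, coeffs)) == budget:
                found.append(coeffs)
    return found


print(weight([1, 0, -4, 7]))                  # x^3 - 4x + 7
print(weight([1, 0, 0, 0, 0, 0, 0, -1]))      # x^7 - 1
print(polynomials_of_weight(2))
```

The brute-force search is only practical for very small weights, but it terminates for every weight, which is all the argument needs.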

4 The Algebra of Cardinals

We have so far considered addition of cardinals, including transfinite cardinals, and have applied the results to ℵ0. We now consider the usual algebra of addition, multiplication and exponentiation as these operations apply to transfinite cardinal numbers. In all cases, our definitions are natural extensions of these operations as they apply to finite sets.

We recall the definition of addition.

Addition of Cardinals. For finite cardinals, we know that for disjoint sets A and B, we can add cardinalities by taking the union of the sets: |A ∪ B| = |A| + |B|. (See page 10.) We take this as the definition of the sum of cardinals, infinite or otherwise:

If A and B are disjoint, |A| + |B| = |A ∪ B|

We point out that this is well deﬁned. Namely,

If 1) |A| = |C| and |B| = |D|, and 2) A and B are disjoint, and 3) C and D are disjoint, then |A| + |B| = |C| + |D|.

For a proof, note that A ∼ C and B ∼ D. Therefore, there are 1-1 onto functions f:A → C and g:B → D. Now define h:A ∪ B → C ∪ D by taking h(a) = f(a) for a ∈ A and h(b) = g(b) for b ∈ B. Then h is well defined because A and B are disjoint. h is onto C ∪ D because both f and g are onto. h is 1-1 because f and g are, and because C and D are disjoint.¹ This proves the result. This result shows that to compute addition of cardinals, we may take any representative sets for the computation. We did this to compute ℵ0 + ℵ0 by choosing appropriate disjoint sets (the evens and the odds) for the two occurrences of ℵ0.

We have already noted that the associative law a + (b + c) = (a + b) + c is valid because the associative law holds for unions of sets. Similarly, the commutative law a + b = b + a is valid.

Multiplication of Cardinals. For ﬁnite cardinals, we often think of multiplication as repeated addition. Thus, 3 · 4=4+4+4=12. This method does not easily generalize for inﬁnite cardinals. One way that does generalize is to use arrays. Here is how 3 · 4 would look using a 3 by 4 array. Take a set with 3 elements, say {a,b,c} and one with 4 elements, say {1, 2, 3, 4}. Now make a 3 by 4 array (3 rows and 4 columns) as follows

      1      2      3      4
a   (a,1)  (a,2)  (a,3)  (a,4)
b   (b,1)  (b,2)  (b,3)  (b,4)
c   (c,1)  (c,2)  (c,3)  (c,4)

¹ The reader should check these statements and not take them for granted.

Here, the element in the row corresponding to x and the column corresponding to y is the ordered couple (x, y).² The number of elements in this array is of course 3 · 4 = 12. We now generalize this procedure to arbitrary sets A and B as follows.

Definition. If A and B are sets, the product set A × B is the set of ordered couples (a, b) where a ∈ A and b ∈ B.

You have seen product sets before. In analytic geometry, the plane consists of all pairs (x, y) where x and y are real numbers. Thus, the plane is the product R × R.

We are now ready to deﬁne the product of cardinal numbers. As for sums, we deﬁne this in terms of sets having these as cardinal numbers.

If A and B are any sets, we deﬁne |A|·|B| = |A × B|

As with addition, it can be shown that multiplication of cardinals is well-defined, and not dependent on the sets used to represent the cardinal numbers.

The familiar algebra applies:

1. (The associative law) a(bc) = (ab)c.
2. (The distributive law) a(b + c) = ab + ac.

The associative law is valid, because A × (B × C) ∼ (A × B) × C under the correspondence (x, (y, z)) ↦ ((x, y), z). The distributive law is handled similarly.

To compute ℵ0 ·ℵ0, we need to ﬁnd |N × N|. We can illustrate N × N as follows:

        1        2       ...      n       ...
1     (1, 1)   (1, 2)    ...    (1, n)    ...
2     (2, 1)   (2, 2)    ...    (2, n)    ...
...
m     (m, 1)   (m, 2)    ...    (m, n)    ...
...

The number of entries in this (infinite) table is by definition |N × N| = ℵ0 · ℵ0. We can show that even this cardinal is also ℵ0. We consider the set S of all numbers of the form 2^m 3^n where m, n ≥ 1. This is an infinite set {6, 12, 18, 24, 36, 48, ...}. The first six terms here correspond to (m, n) = (1, 1), (2, 1), (1, 2), (3, 1), (2, 2) and (4, 1). |S| = ℵ0 since it is an infinite subset of N. (See the theorem on page 13.) But the mapping (m, n) ↦ 2^m 3^n is 1-1 from N × N onto S. It is 1-1 by the unique factorization theorem for integers, which states that every natural number greater than 1 can be written uniquely as a product of prime powers, and it is onto by the definition of S. So S ∼ N × N. Therefore

ℵ0 = |S| = |N × N| = ℵ0 · ℵ0 = ℵ0^2.

Similarly, by multiplying both sides by ℵ0 we get ℵ0^3 = ℵ0 and continuing, ℵ0^n = ℵ0 for every n ∈ N. We can now give an alternative proof that the rational numbers are denumerable.

² An ordered couple (x, y) is not the same as the set {x, y}. In the ordered couple x is the first element, and y the second. In the set, it makes no sense to talk about the first or the second element, since the elements are unordered. For sets {x, y} = {y, x}. For ordered couples (x, y) = (y, x) only if x = y.
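The correspondence (m, n) ↦ 2^m 3^n can be tried out directly. The sketch below is illustrative only; `encode` and `decode` are our names, and `decode` recovers (m, n) by dividing out the factors 2 and 3, which is unique factorization in action.

```python
def encode(m, n):
    """The map (m, n) -> 2^m * 3^n for m, n >= 1."""
    return 2**m * 3**n

def decode(s):
    """Recover (m, n) from s = 2^m * 3^n; raise ValueError if s is not in S."""
    if s <= 0:
        raise ValueError("not in S")
    m = n = 0
    while s % 2 == 0:      # strip the factors of 2
        s //= 2
        m += 1
    while s % 3 == 0:      # strip the factors of 3
        s //= 3
        n += 1
    if s != 1 or m == 0 or n == 0:
        raise ValueError("not in S")
    return m, n

pairs = [(m, n) for m in range(1, 6) for n in range(1, 6)]
print(sorted(encode(m, n) for m, n in pairs)[:6])
print(all(decode(encode(m, n)) == (m, n) for m, n in pairs))
```

The round trip `decode(encode(m, n)) == (m, n)` is exactly the statement that the map is 1-1.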

Theorem. The rational numbers Q are denumerable.

Proof: We first consider the positive rational numbers Q+. These have the form m/n where m and n are relatively prime (i.e. have no prime factor in common). Thus, the set of all ordered couples (m, n) where m and n are relatively prime is in 1-1 correspondence with the positive rationals. But this set is an infinite subset of N × N, so it is denumerable as an infinite subset of a denumerable set. Therefore the set of positive rationals is denumerable, since it is similar to a denumerable set.

The rational numbers Q consist of the positive rationals, the negative rationals and 0. But using the same proof that the integers Z are denumerable, it follows that the rationals are denumerable.

This is an amazing result because to the naked eye, the rationals “ﬁll up” the line. Between any two rationals (no matter how close) there are inﬁnitely many rationals. Yet they are countable. In the next section, we consider cardinal numbers larger than ℵ0.

Exponentiation. We are all familiar with exponents indicating the number of times a number is multiplied by itself. Thus, a^3 = a · a · a, and in general, a^n = a · a ··· a (n factors). However, this does not generalize easily when n is a transfinite cardinal. Instead, we consider the question of determining how many functions there are from a set S to a set T.

Definition. If S and T are sets, we define T^S as the set of functions from S to T.

For finite sets, suppose |S| = m and |T| = n. Suppose S = {s1, ..., sm}. To define a function from S to T, we must decide on the image of s1 (there are n choices in T), and once this is decided there are n choices for the image of s2, and so on to the n choices for the image of sm. By the basic counting principle,³ there are a total of n · n ··· n (m factors) = n^m functions from S to T. In a formula, |T^S| = |T|^|S|. (This formula is what made us define T^S as above.) This suggests the following definition, valid for all sets and cardinal numbers.

³ The principle states that if a process can be done in m steps, and there are n1 choices for the first step, and once this step is performed, there are n2 choices for the second step, etc., then the process can be done in a total of n1 n2 ··· nm ways in all.

Definition. If S and T are sets, we define |T|^|S| = |T^S|.
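For finite sets the count |T^S| = |T|^|S| can be confirmed by listing every function. In this sketch a function from S to T is represented as a Python dict, which is our choice of encoding; `all_functions` is a name introduced here.

```python
from itertools import product

def all_functions(S, T):
    """Every function from S to T, each represented as a dict {s: image of s}."""
    S = list(S)
    T = list(T)
    # Choose an image in T independently for each element of S.
    return [dict(zip(S, images)) for images in product(T, repeat=len(S))]

print(len(all_functions("abc", [0, 1])))   # 2^3 functions
print(len(all_functions([1, 2], "xyz")))   # 3^2 functions
```

Note that when S is empty the list contains exactly one function (the empty one), which matches n^0 = 1.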

As with our previous definitions of addition and multiplication, it is straightforward to show that the definition is independent of the sets used to define exponentiation. Namely,

If S ∼ S1 and T ∼ T1 then S^T ∼ S1^T1 and |S|^|T| = |S1|^|T1|.

We now show that “the usual rules of algebra” apply.

Theorem: If a, b, c are cardinals, then

(a^b)(a^c) = a^(b+c) (1)

Proof: Choose sets A, B, and C with B and C disjoint such that |A| = a, |B| = b, and |C| = c. We will prove that A^B × A^C ∼ A^(B∪C). Let (f, g) be an element of A^B × A^C. Thus f:B → A and g:C → A. Since B and C are disjoint, we may create a function h:B ∪ C → A. Simply define h(x) = f(x) if x ∈ B and h(x) = g(x) if x ∈ C. Conversely, any h:B ∪ C → A gives rise to an f and a g by restricting h to B or C respectively. This correspondence shows that A^B × A^C ∼ A^(B∪C), proving the theorem.

In a similar manner, we may prove that (A^B)^C ∼ A^(B×C), and so prove

(a^b)^c = a^(bc) (2)

valid for any cardinals a, b, c. Similarly we have (A × B)^C ∼ A^C × B^C so that

(ab)^c = a^c b^c (3)

for any cardinals a, b, c.

5 Larger Infinite Cardinals

So far, all of the infinite sets we have analyzed have been denumerable. Cantor discovered infinite sets whose count was greater than ℵ0. In particular the set R of real numbers was shown by him to be non-denumerable. That is, no matter how you try to count the reals, one by one, you will never be able to count all of the reals. This is not unlike a set with 40 elements compared to one with 41. No matter how you count the 41 element set in any order from 1 to 40, there will always be an element not counted. The remarkable thing is that this phenomenon can apply to infinite sets as well.

Deﬁnition. If S is any set, the power set P(S) is the set of all subsets of S.

For example, if S = {1, 2, 3}, then P(S) consists of

∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}

For typographic reasons, we write these simply as ∅, 1, 2, 3, 12, 13, 23, 123. Note that in this case, |S| = 3, and |P(S)| = 8 = 2^3.

For finite n it is not hard to show that if |S| = n, then |P(S)| = 2^n. We reason as follows. To form a subset of S, we have to decide, for every s ∈ S, whether or not s is in the subset. So for each s ∈ S, we have 2 choices (in or out).¹ A basic counting principle tells us that the total number of choices is obtained by multiplying the number of choices at each stage. Therefore, the number of subsets is 2^n. We illustrate for the case S = {1, 2, 3}. We use 0 and 1 as the code names for out or in the subset. The possibilities are given in the following table.

1 2 3   subset
0 0 0   ∅
1 0 0   1
0 1 0   2
1 1 0   12
0 0 1   3
1 0 1   13
0 1 1   23
1 1 1   123

We can see why the number of subsets will double if we add one more element, 4, to the set S. For we would keep the same 8 subsets using choice 0 for 4, and we would add 4 to each of these subsets, by using choice 1 for 4. This clarifies the general formula 2^n for the number of subsets of a set with n elements.

¹ That is, a subset of S is determined by a function from S into the 2 element set {0, 1}.

It is easy to see that in general (for transfinite cardinals) we have |P(S)| = 2^|S|. To see this we will show how the elements of 2^S correspond in a 1-1 onto way with the subsets of S. For the 2 element set we choose {0, 1}. If f:S → {0, 1}, let T be the subset of all x for which f(x) = 1. Conversely, if T is any subset of S, define f(x) = 1 for x ∈ T and f(x) = 0 for x ∈ T′, the complement of T in S. This sets up a 1-1 onto correspondence between {0, 1}^S and P(S). Thus {0, 1}^S ∼ P(S), so |P(S)| = 2^|S|.
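For a finite set the correspondence between functions S → {0, 1} and subsets of S can be carried out explicitly. The following sketch represents a function as a dict; the helper names are ours.

```python
from itertools import product

def subset_from_indicator(S, f):
    """The subset T of all x in S with f(x) == 1."""
    return frozenset(x for x in S if f[x] == 1)

def indicator_from_subset(S, T):
    """The function f with f(x) = 1 for x in T and f(x) = 0 otherwise."""
    return {x: (1 if x in T else 0) for x in S}

def power_set(S):
    """P(S), built by running through every function S -> {0, 1}."""
    S = list(S)
    return [subset_from_indicator(S, dict(zip(S, bits)))
            for bits in product((0, 1), repeat=len(S))]

S = [1, 2, 3]
subsets = power_set(S)
print(len(subsets), len(set(subsets)))   # 2^3 subsets, all different
```

Going from a subset to its indicator function and back returns the same subset, which is the 1-1 onto property in the finite case.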

Inequalities among cardinals. We have defined equality of cardinals: |S| = |T | if and only if S ∼ T . That is, S and T can be put into a 1-1 onto correspondence. We have also seen that proper set inclusion is not enough to define strict inequality. For example, the even numbers and the natural numbers have the same cardinality, as do the integers and the rational numbers. So we must exercise care in defining strict inequality between cardinal numbers.

Deﬁnition. We say that |S|≤|T | provided there is a 1-1 correspondence from S into T . (Equivalently, S ∼ T1 where T1 ⊂ T .) Naturally we also write |T |≥|S| in this case.

Definition. We say that |S| < |T| if |S| ≤ |T| and if there is no 1-1 correspondence from S onto T. (That is, S ≁ T.) Naturally we also write |T| > |S| in this case.

This deﬁnition is clearly consistent with the usual deﬁnition of inequality among the natural numbers. Also (not surprisingly!) n<ℵ0 for any n ∈ N.

The following important result shows how, for every inﬁnite cardinal, to ﬁnd a larger cardinal.

Theorem: (Cantor) If S is any set, |S| < |P(S)|.

So Cantor’s theorem states that for any cardinal number a, we have 2^a > a. For example, 2^ℵ0 > ℵ0. Thus, there are more subsets of integers than integers.

Proof: It is easy enough to show that |S| ≤ |P(S)|. The map s ↦ {s}, mapping an element into the set consisting of just that element, is clearly 1-1. It is now necessary to show that there is no 1-1 map of S onto P(S). We do this by contradiction.

We assume there is a 1-1 map f of S onto P(S). For any s ∈ S, f(s) is a subset of S, so we can ask whether or not s ∈ f(s). Let us call s contrary if s ∉ f(s). We now let T be the set of all contrary elements. Thus,

(∀s ∈ S) s ∈ T if and only if s ∉ f(s).

Since T is a subset of S, and f is onto, T must be the image of some s0 ∈ S: T = f(s0). So we may write the above condition as

(∀s ∈ S) s ∈ f(s0) if and only if s ∉ f(s).

We now ask whether or not s0 ∈ T. Using the above condition, we have

s0 ∈ f(s0) if and only if s0 ∉ f(s0).

This is a contradiction. If s0 is not in T , then it is. If it is in T , then it isn’t!
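For a small finite set the argument can be checked exhaustively: whatever map f from S to P(S) we pick, the set of contrary elements is never one of the values of f. The sketch below does this for a 3-element set; the names `power_set` and `contrary_set` and the dict encoding of f are ours.

```python
from itertools import combinations, product

def power_set(S):
    """All subsets of S, as frozensets."""
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def contrary_set(S, f):
    """T = the set of all s in S with s not in f(s)."""
    return frozenset(s for s in S if s not in f[s])

S = [0, 1, 2]
subsets = power_set(S)

# Try every one of the 8^3 maps f: S -> P(S).
missed_every_time = all(
    contrary_set(S, dict(zip(S, images))) not in images
    for images in product(subsets, repeat=len(S))
)
print(missed_every_time)
```

Of course a finite check is no proof for infinite sets; the point is only that the set T is built the same way in every case.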

This proof has many ancestors which were classical paradoxes. For example, suppose a person says, “I am lying.” Is he lying or not? If he is lying, then he is telling the truth. If he’s telling the truth, he is lying.

Here is a similar proof for 2^ℵ0 > ℵ0, using S = N. We suppose we have an onto map n ↦ Sn from N onto P(N). We may represent a subset T ⊆ N as a sequence s0, s1, ..., sn, ... where sn = 0 or 1. We choose sn = 1 if and only if n ∈ T. For example, the set of even numbers is represented by the sequence 1, 0, 1, 0, 1, 0, ..., and the set of prime numbers corresponds to 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, ... Our assumption says that all such sequences of 0’s and 1’s can be counted. We list them in an infinite array as follows. For purposes of illustration, we start the counting with the even numbers, the primes, and the odd numbers.

Term:          0     1     2     3    ...
Sequence 0:   [1]    0     1     0    ...
Sequence 1:    0    [0]    1     1    ...
Sequence 2:    0     1    [0]    1    ...
...

We now go down the main diagonal (bracketed in the diagram) and change all 0’s to 1’s and vice versa: 1, 0, 0, ... ⟹ 0, 1, 1, ... The resulting sequence 0, 1, 1, ... cannot be any of the enumerated sequences. For it differs from the 0-th sequence in the 0-th place, and similarly, for any n, it differs from the n-th sequence in the n-th place. Therefore the constructed sequence is not in the list, and the list has not enumerated all sequences of 0’s and 1’s. This is a contradiction.
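The diagonal step is mechanical, and a finite truncation of it can be written down directly. In this sketch `rows` holds the first few terms of the first few listed sequences, and `diagonal_flip` is our name for the construction.

```python
def diagonal_flip(rows):
    """Flip the n-th term of the n-th row, for each row in the (finite) list."""
    return [1 - rows[n][n] for n in range(len(rows))]

rows = [
    [1, 0, 1, 0],   # the even numbers
    [0, 0, 1, 1],   # the primes
    [0, 1, 0, 1],   # the odd numbers
]
new_sequence = diagonal_flip(rows)
print(new_sequence)

# The new sequence disagrees with row n in place n, for every n.
print(all(new_sequence[n] != rows[n][n] for n in range(len(rows))))
```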

The method outlined here is famous. It is called the Cantor Diagonal Process. The technique has many mathematical applications, especially in logic and computer science.

We can now show that the set R of reals is not countable. Before doing so, it is convenient to prove the following theorem which will make the computation easier.

Theorem. On the Irrelevancy of a Denumerable Subset. Let S be a set with |S| = 2^ℵ0, and let D ⊆ S with |D| = ℵ0. Then |S − D| = 2^ℵ0.

Proof: We have S = D ∪ (S − D). Since D and S − D are disjoint, we have

|S| = |D| + |S − D| = ℵ0 + |S − D|

But S − D is infinite (otherwise S = D ∪ (S − D) would be denumerable, contradicting 2^ℵ0 > ℵ0), so by the theorem on page 15, ℵ0 + |S − D| = |S − D|. Therefore,

2^ℵ0 = |S| = ℵ0 + |S − D| = |S − D|,

proving the theorem.

It is traditional to set |R| = C.²

Theorem: C = 2^ℵ0.

Proof: We first consider the interval I = (0, 1) of numbers t strictly between 0 and 1. Any such real number has a binary expansion t = .a1a2a3..., where each ai = 0 or 1. Thus, it would appear that the interval I is in 1-1 correspondence with 2^N, and so |I| = |2^N| = 2^ℵ0. The difficulty is that some numbers have two different representations as infinite binary decimals. Just as in the decimal representation of reals, a number such as .34 can also be written as .33999..., a finite binary decimal can also be written with a string of 1’s at the end. Thus, in binary, .011 = .010111.... So the reals can be written uniquely as a binary decimal which does not terminate in a sequence of 1’s. Now working with the set 2^N of all sequences of 0’s and 1’s, we let S be the set of sequences of 0’s and 1’s which terminate in an infinite sequence of 1’s, and let T be the set of sequences which do not terminate in an infinite sequence of 1’s, and finally, to eliminate the real number 0 = .000..., let U be the set consisting of the one sequence 0, 0, 0, .... Then this analysis shows that

2^N = S ∪ (T − U) ∪ U;   I ∼ (T − U)

Since S, T − U, and U are disjoint, it follows that

2^ℵ0 = |S| + |T − U| + |U| = |S| + |I| + 1 = |S| + |I|

since I is an inﬁnite set. (See the theorem on page 15.)

We now show that |S| = ℵ0. This will show that 2^ℵ0 = ℵ0 + |I| = |I|, again by the theorem on page 15. To show that S is countable, we break S into denumerably many finite pieces:

S = S1 ∪ S2 ∪···∪Sn ∪···.

Here S1 consists of those sequences which have 1’s from the first place on; namely 1, 1, 1, .... S2 consists of those sequences which are 1 from the second place on, namely 0, 1, 1, ... and 1, 1, 1, .... In general, Sn consists of those sequences which are 1 from the n-th place on. (There are 2^(n−1) such sequences because the first n−1 terms can be either 0 or 1.) Therefore |S| = ℵ0 by the theorem on page 15. Thus, we have the result |I| = 2^ℵ0.

² C is used because the reals are often referred to as the continuum. So C is the cardinality of the continuum.

As a consequence, we can easily show that any open interval (a, b) with a < b has cardinality 2^ℵ0: the map t ↦ a + (b − a)t sends I = (0, 1) 1-1 onto (a, b).

Finally, we show that I ∼ R, and this will prove the result. There are many simple functions that will do this. One such function is f:(−1, 1) → R, defined by f(x) = x/(1 − x^2). The reader may show that this is 1-1 onto. Another, perhaps more familiar example comes from trigonometry: f(x) = tan x maps (−π/2, π/2) 1-1 and onto R.
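One way to convince oneself that f(x) = x/(1 − x^2) is 1-1 onto is to write down its inverse. The formula used in `f_inverse` below is not in the notes; it comes from solving y·x^2 + x − y = 0 for the root lying in (−1, 1), and the numerical check uses floating point, so it is a sanity check rather than a proof.

```python
import math

def f(x):
    """The map (-1, 1) -> R from the text."""
    return x / (1 - x * x)

def f_inverse(y):
    """The x in (-1, 1) with f(x) == y (our derivation, via the quadratic formula)."""
    if y == 0:
        return 0.0
    return (-1 + math.sqrt(1 + 4 * y * y)) / (2 * y)

for y in [-1000.0, -2 / 3, 0.0, 0.1, 5.0, 1000.0]:
    x = f_inverse(y)
    print(y, x, f(x))
```

Each real y comes from exactly one x in (−1, 1), and large values of |y| correspond to x near ±1.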

We can now state, using the theorem on page 23 that the cardinality of the irrationals is C, since the set of irrationals is R − Q.

A transcendental number is deﬁned as a real number which is not algebraic. The above argument also shows that the set of transcendental numbers has the cardinality of the continuum.

This is an indirect proof that there are transcendental numbers. We can say that there are way more transcendental numbers than algebraic numbers. Curiously this is an existence proof without exhibiting even one transcendental number. When the result was announced by Cantor, it was regarded by many as extremely suspect: an existence proof without the slightest hope of using the proof to actually find a transcendental number! It is still regarded as suspect by some mathematicians.

6 Order Relations on Cardinal Numbers

We have defined (page 22) the relations < and ≤ on cardinal numbers, but we have not discussed any properties these order relations satisfy. We have enough experience to suspect that we may not easily generalize from the finite case. However, we can start with a simple result.

Theorem: (Transitivity) If a, b, c are cardinal numbers and a ≤ b and b ≤ c, then a ≤ c.

For a proof, suppose A, B, C are sets with |A| = a, etc. By deﬁnition, we have 1-1 functions f:A → B, and g:B → C. Therefore g ◦ f:A → C is also 1-1, and this implies

a = |A|≤|C| = c.

An extremely important result, taken for granted for finite numbers, is that if a ≤ b and b ≤ a, then a = b. This is not trivial for infinite cardinals, because the hypotheses involve two 1-1 functions and the conclusion calls for a single 1-1 onto function. The result is true, and it is important and non-trivial enough to be attached to three mathematicians’ names.

Theorem: (The Cantor-Schröder-Bernstein Theorem.) Suppose that A and B are sets, and that |A| ≤ |B| and |B| ≤ |A|. Then |A| = |B|.

Proof. Since |A| ≤ |B| and |B| ≤ |A|, there are 1-1 mappings f:A → B and g:B → A. For each a ∈ A, exactly one of the following holds:

1. For every b ∈ B, g(b) ≠ a. In this case, we shall say that a has no predecessor.
2. There is a unique b1 ∈ B such that g(b1) = a. Here uniqueness follows because g is 1-1. In this case, we shall say that b1 is the predecessor of a.

Similarly, for each b ∈ B, exactly one of the following holds:

1. For every a ∈ A, f(a) ≠ b. In this case, we shall say that b has no predecessor.
2. There is a unique a1 ∈ A such that f(a1) = b. Here again, uniqueness follows because f is 1-1. In this case, we shall say that a1 is the predecessor of b.

Note that every a ∈ A is the predecessor of f(a) in B, and that every b ∈ B is the predecessor of g(b) in A. It follows that for every element a ∈ A, we can construct a string as follows:

... -f-> b2 -g-> a1 -f-> b1 -g-> a -f-> f(a) -g-> g(f(a)) -f-> ...

Here b1 is the predecessor of a, a1 is the predecessor of b1, and so on. Note that the string does not terminate on the right, but may terminate on the left at an element with no predecessor.

Similarly, for every element b ∈ B, we can construct a similar string:

... -g-> r2 -f-> s1 -g-> r1 -f-> b -g-> g(b) -f-> f(g(b)) -g-> ...

Here r1 is the predecessor of b, s1 is the predecessor of r1, and so on.

Again the string does not terminate on the right, but may terminate on the left at an element with no predecessor.

Now comes a crucial statement. If any element of A or B occurs in two strings, the strings are identical. This is so because any one element in the string uniquely determines the full string. Otherwise put, the strings are disjoint.

The strings are of 3 types:
I: Those which continue indefinitely on the left.
A: Those which end at the left with an element a of A.
B: Those which end at the left with an element b of B.

We are now ready to deﬁne a 1-1 map from A onto B, proving the result. Deﬁne h:A → B as follows:

1. For any a ∈ A in a string of type I or A, we define h(a) = f(a).
2. For any a ∈ A in a string of type B, we let h(a) be the predecessor of a.

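The function h:A → B is well defined by this definition: in case 2, a is not the left end of its string (which ends in B), so a does have a predecessor. Let us show that h is onto and 1-1.

ONTO: If b is in a string of type B, then g(b) is in the same string and b = h(g(b)). Otherwise, b is in a string of type I or A, so b will have a predecessor a1, and b = f(a1) = h(a1). So h is onto.

ONE-ONE: Suppose b = h(a1) = h(a2). Then a1, a2 and b all lie in the same string. If b is in a string of type B, then b is the predecessor of a1 and of a2. So a1 = g(b) and also a2 = g(b). So a1 = a2. But if b is in a string of type I or A, then b = f(a1) = f(a2), so a1 and a2 are both the unique predecessor of b, and again a1 = a2. So h is 1-1.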

This completes the proof. We’ll call this the CBS theorem.
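The construction of h in the proof can be followed step by step on a concrete example. The sketch below is ours, not part of the notes: it takes the hypothetical case A = B = {0, 1, 2, ...} with f(a) = a + 1 and g(b) = b + 1, traces each string to the left, and applies the rule for h. It assumes every string ends on the left (type A or B); a string of type I would make the loop run forever.

```python
def cbs_map(a, f, f_pred, g_pred):
    """Return h(a), following the rule in the proof.

    f      : the 1-1 map A -> B
    f_pred : b -> the a1 with f(a1) == b, or None if b has no predecessor
    g_pred : a -> the b1 with g(b1) == a, or None if a has no predecessor
    """
    x, in_A = a, True
    while True:                       # walk to the left end of the string
        pred = g_pred(x) if in_A else f_pred(x)
        if pred is None:
            break
        x, in_A = pred, not in_A
    if in_A:                          # string of type A: use f
        return f(a)
    return g_pred(a)                  # string of type B: use the predecessor

def f(a):
    return a + 1

def shift_pred(y):
    # Predecessor under y -> y + 1; 0 has no predecessor.
    return y - 1 if y >= 1 else None

h = [cbs_map(a, f, shift_pred, shift_pred) for a in range(10)]
print(h)
```

In this example neither f nor g is onto, yet h pairs off the two sets by swapping 0 with 1, 2 with 3, and so on.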

As a corollary, we have the following non-surprising result.

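Corollary. (Transitivity of <.) If a, b, c are cardinals, and a < b and b < c, then a < c.¹ Proof: By transitivity of ≤ we have a ≤ c. If a = c, then c ≤ b and b ≤ c, so b = c by the CBS theorem, contradicting b < c. Therefore a < c.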

Corollary. Let S be a subset of R which contains some open interval I. Then |S| = C. Proof: Since I ⊆ S ⊆ R, we have |I| ≤ |S| ≤ |R|, or C ≤ |S| ≤ C. By the CBS theorem, |S| = C.

¹ We went through a lot to get this result. Can you find an easier proof? I can’t.

The following results show that order relations are preserved under the usual algebraic operations. Note that the inequalities here are not strict ones. The proofs are all straightforward.

Theorem: If a, b, c, d are cardinals, and a ≤ b, c ≤ d, then

a + c ≤ b + d (1)
ac ≤ bd (2)
a^c ≤ b^d (3)

(For (3), assume a ≥ 1.)

We conclude this section with computations involving powers of cardinal numbers.

Theorem: C^ℵ0 = C.
Proof: C^ℵ0 = (2^ℵ0)^ℵ0 = 2^(ℵ0·ℵ0) = 2^ℵ0 = C.

As a consequence, we have

Theorem. (ℵ0 as exponent.) If 2 ≤ a ≤ C, then a^ℵ0 = C.
Proof: By the previous results, we have

C = 2^ℵ0 ≤ a^ℵ0 ≤ C^ℵ0 = C.

Now use the CBS theorem.

Finally, we have

Theorem. (Powers of C.) If 1 ≤ a ≤ ℵ0 then C^a = C.
Proof: We have C^ℵ0 = C by the above theorem. Therefore

C = C^1 ≤ C^a ≤ C^ℵ0 = C.

Now use the CBS theorem.

This result shows that the Cartesian plane R^2 and space R^3 have cardinality C. Thus there are as many points in the plane or space as there are points on a line segment!

7 Ordered Sets

We consider the less than relation < on the real numbers R (or the integers) and try to isolate those properties which do not depend on arithmetic operations such as +, −, and ×. It is more convenient to consider the relation ≤ (less than or equal to). In order to make clear that we are not necessarily talking about real numbers, we shall use the notation ⪯ instead of the usual ≤. The first important property is transitivity:

Transitivity: If a ⪯ b and b ⪯ c then a ⪯ c.

Using certain logical symbols, transitivity can also be expressed as

(∀a, b, c)((a ⪯ b) ∧ (b ⪯ c) → (a ⪯ c)). (1)

Here ∀ can be read “for any” or “for all”. It is called the universal quantiﬁer. The symbol ∧ is the logical symbol for “and”, and the symbol → is the logical symbol for “implies.” Thus p → q can be read as “p implies q” or as “If p then q.” Equation (1) expresses the transitivity property of the order relation in logical notation.

We write a ⋠ b (read: a is not less than or equal to b) if a ⪯ b is false. We also have

Reflexivity: a ⪯ a.

In logical notation, this is simply

(∀a)(a ⪯ a). (2)

Finally, we have the antisymmetric property:

Antisymmetry: If a ⪯ b and b ⪯ a then a = b.

In logical notation, this is written

(∀a, b)((a ⪯ b) ∧ (b ⪯ a) → (a = b)). (3)

Equation (3) expresses the antisymmetry property of the order relation in logical notation.

It is also convenient to introduce the “greater than or equal to” symbol ⪰. Thus

a ⪰ b is defined as b ⪯ a.

Definition. A partially ordered set (S, ⪯) is a set S together with a transitive, reflexive, and antisymmetric relation ⪯ defined on it. These properties are given respectively by equations (1), (2), and (3).

The relation ⪯ is called a non-strict inequality. As for transfinite cardinals, we introduce a strict inequality ≺, defined by the condition

a ≺ b means a ⪯ b and a ≠ b.

In logical symbols (∀a, b)((a ≺ b) ↔ (a ⪯ b) ∧ (a ≠ b)). Here, ↔ is the symbol for “if and only if.” The symbol ≻ is treated similarly.

The following result can now be proved and will be left as an exercise.

1. In a partially ordered set (S, ⪯), the relation ≺ is transitive: a ≺ b and b ≺ c implies a ≺ c. Further, if a ≺ b then b ≺ a is false. (This second condition is called non-reflexivity.)

We summarize: A partially ordered set (S, ⪯) is a set S, with a transitive, reflexive, and antisymmetric relation ⪯ on it. Equivalently, a partially ordered set (S, ≺) is a set S, with a transitive, non-reflexive relation ≺ on it.

Another property of inequality for the reals is that for any two different numbers a and b one is less than the other. In logical notation this is given by

(∀a, b)((a ⪯ b) ∨ (b ⪯ a)). (4)

It is easy to show that for strict inequalities, this may be written

(∀a, b)((a = b) ∨ (a ≺ b) ∨ (b ≺ a)). (5)

Equation (4) or (5) is called the trichotomy law.

A totally ordered set (S, ⪯) is a partially ordered set in which the trichotomy law (4) is true. Totally ordered sets are also called linearly ordered or, simply, ordered sets.

We now give some examples to illustrate these concepts.

I. The real numbers are totally ordered. So are the rational numbers, the positive integers, and the integers from 1 through 100.

II. N10, the integers from 1 through 10, where a ⪯ b is interpreted to mean that a divides b. This is written a|b. For example 2|6 and 3 ∤ 5. We can verify that (N10, |) is a partially ordered set, but not linearly ordered. For example, 2 and 5 are not comparable, since neither divides the other. But as in I above, (N10, ≤) is totally ordered.
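For a finite set such as N10 the three defining properties, and the trichotomy law, can be verified by brute force. The following sketch is illustrative; the function names are ours, and a relation is passed in as a two-argument Python function.

```python
def is_partial_order(S, leq):
    """Check reflexivity, antisymmetry and transitivity of leq on the finite set S."""
    S = list(S)
    reflexive = all(leq(a, a) for a in S)
    antisymmetric = all(a == b or not (leq(a, b) and leq(b, a))
                        for a in S for b in S)
    transitive = all(leq(a, c)
                     for a in S for b in S for c in S
                     if leq(a, b) and leq(b, c))
    return reflexive and antisymmetric and transitive

def is_total(S, leq):
    """Check the trichotomy law (4): any two elements are comparable."""
    S = list(S)
    return all(leq(a, b) or leq(b, a) for a in S for b in S)

def divides(a, b):
    return b % a == 0

N10 = range(1, 11)
print(is_partial_order(N10, divides), is_total(N10, divides))
print(divides(2, 5), divides(5, 2))   # 2 and 5 are not comparable
```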

III. If S is any set, the subsets of S, which make up the power set P(S), form a partially ordered set (P(S), ⊆), using inclusion as the order relation. The three conditions:

Transitivity: If A ⊆ B and B ⊆ C, then A ⊆ C.
Reflexivity: A ⊆ A.
Antisymmetry: If A ⊆ B and B ⊆ A, then A = B.

are clearly valid. This is not a total ordering on P(S).¹ For example, if S = {0, 1}, neither of the two sets {0} and {1} is included in the other.

We can show orderings graphically, at least for finite sets. For example, consider the set N3 = {1, 2, 3}. Its subsets are ∅, 1, 2, 3, 12, 13, 23, 123.² The following diagram gives the structure of the inclusion order on the subsets of 123:

[Figure: Graph of the Subsets of {1, 2, 3}. The vertex 123 is at the top; 12, 13, 23 lie in the row below it; 1, 3, 2 lie in the next row; ∅ is at the bottom. Arrows point downward.]

The subsets are indicated by points. An arrow points from a subset to a smaller subset (that is, a set properly included in it). Finally, one subset A is properly contained in another subset B if there is a path, in the direction of the arrows, connecting B to A. From the diagram, 123 can be connected to any set, and any set can be connected to ∅. This translates into ∅ ⊆ A ⊆ 123 in this example.

Using this technique, we can draw the graph of any finite partially ordered set S. For example, the linearly ordered set {1, 2, 3, 4, 5}, using the usual inequality ordering, has the following graph:

¹ Except for the trivial case where S is empty or has only one element.
² For simplicity, we have written, for example, {1, 3} as 13.

[Figure: Graph of the Numbers from 1 through 5. Five vertices labeled 1, 2, 3, 4, 5 arranged in a single line.]

The construction of the graph of an ordered set (S, ⪯) is done by defining the relation “just precedes.” To define this, we introduce the notation a ≺≺ b (a just precedes b) as follows:

a ≺≺ b means that
1. a ≺ b, and
2. There is no x satisfying a ≺ x and x ≺ b.

Condition 2 can also be stated: If a ⪯ x ⪯ b then either x = a or x = b.

The above graphs can be explained as follows. Draw a vertex for each element of the ordered set S, and draw an arrow from vertex a to vertex b if and only if b ≺≺ a.

Note that in N, we have a ≺≺ b if and only if b = a + 1. However in Q, it is not possible to have a ≺≺ b because you could choose c as the average of a and b, and we would have a ≺ c ≺ b.

The following result will be left as an exercise. In a finite partially ordered set S, if a ≺ b, then
1. There is an element c such that a ⪯ c and c ≺≺ b.
2. There are finitely many (possibly no) elements c1, c2, ..., cn such that

a ≺≺ c1 ≺≺ c2 ≺≺ ... ≺≺ cn ≺≺ b

This shows that, as in the above examples, a ≺ b if and only if there is a path in the graph from b to a.
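The relation "just precedes" is easy to compute for a finite ordered set, and it gives exactly the arrows of the graphs described above (each pair (a, b) with a ≺≺ b is an arrow from b to a). This sketch is ours; `covers` is a name we introduce, and the strict relation ≺ is passed in as a function.

```python
from itertools import combinations

def covers(S, less):
    """All pairs (a, b) with a 'just preceding' b under the strict order `less`."""
    S = list(S)
    return [(a, b) for a in S for b in S
            if less(a, b) and not any(less(a, x) and less(x, b) for x in S)]

# Subsets of {1, 2, 3} under proper inclusion (frozenset's < is proper subset).
subsets = [frozenset(c) for r in range(4) for c in combinations([1, 2, 3], r)]
subset_edges = covers(subsets, lambda a, b: a < b)
print(len(subset_edges))

# The numbers 1 through 5 under the usual order.
chain_edges = covers(range(1, 6), lambda a, b: a < b)
print(chain_edges)
```

For the subsets of {1, 2, 3} the pairs found are those in which one set is obtained from the other by adding a single element.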

8 Order Types

We now restrict ourselves to linearly ordered sets (S, ≺). In analogy with the theory of cardinal numbers, we define an order equivalence between two such sets S and T. For simplicity, we use the same symbol < to denote the order relation in each of the sets.

Definition. We write S ≅ T if there is a 1-1 onto correspondence f:S → T which preserves inequalities. That is: If a < b then f(a) < f(b).

In analogy to the theory of transﬁnite numbers, we have the following results for ordered sets S, T and U.

S ≅ S
If S ≅ T then T ≅ S
If S ≅ T and T ≅ U then S ≅ U

Definition. (Order types). Following the procedure used to define transfinite cardinals, we let S̄ (S with a bar over it) denote the order type of a linearly ordered set S. Thus, if S and T are linearly ordered sets, we have S̄ = T̄ if and only if S ≅ T.

Remark. If S̄ = T̄, then |S| = |T|. This is so because there is a 1-1 map of S onto T. But as we shall soon see, the converse, though true for finite sets, is not true for infinite sets.

Theorem: All finite ordered sets of size n ∈ N have the same order type. Proof: We take the n-element finite set Un = {0, 1, ..., n−1} using the usual order relation < as one of the sets. Now let S be any n-element ordered set. We set up an inequality preserving map from S to Un as follows: Because S is finite, it has a unique least element s0. Let s0 ↦ 0. Now there is a unique least element in S which is greater than s0. Call it s1 and let s1 ↦ 1. Continue in this way till we find sn−1 ↦ n − 1. The result is a 1-1 order preserving map of S onto Un. So S has the same order type as Un. Similarly, if |T| = n, then T has the same order type as Un. So S̄ = T̄.

Note. We will continue to use Un = {0, 1, ..., n−1} as a representative of the ordinal number n. As in the theory of cardinal numbers, if S is an ordered n-element set, we write S̄ = n.
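The proof that a finite ordered set has the order type of Un is really an algorithm: repeatedly remove the least remaining element. A minimal sketch follows; the function name and the example set (three words ordered by length, an ordering chosen purely for illustration) are ours.

```python
def order_isomorphism(S, less):
    """Map each element of the finite linearly ordered set S to its position in U_n."""
    remaining = list(S)
    mapping = {}
    k = 0
    while remaining:
        least = remaining[0]
        for x in remaining[1:]:       # find the least remaining element
            if less(x, least):
                least = x
        mapping[least] = k            # s_k -> k
        remaining.remove(least)
        k += 1
    return mapping

words = ["pear", "fig", "apple"]

def shorter(x, y):
    return len(x) < len(y)

m = order_isomorphism(words, shorter)
print(m)
```

The resulting map is 1-1, onto {0, 1, 2}, and preserves the order, as the theorem requires.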

Things change for inﬁnite sets. For example, the rational numbers and the positive integers, though both denumerable, have very diﬀerent order types.

We now consider two inﬁnite ordered sets, very similar to each other. The ﬁrst is simply N, and the second, N+, adjoins ℵ0 to it. In both cases, the order relationship is that of the cardinal numbers. We can illustrate these simply as

N = {0 < 1 < 2 < ...}     N+ = {0 < 1 < 2 < ... < ℵ0}

Following Cantor, we write ω for the order type of N and ω + 1 for the order type of N+. The addition of 1 to ω reminds us that one element was adjoined to N and put at the end. Note that ω + 1 ≠ ω because N does not have a last element, but N+ does. More generally, we can define addition of order types similar to addition of cardinal numbers, except we have to be careful about order in the union set. The definition is as follows.

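Definition (Addition of Order Types). If α = $\overline{A}$ and β = $\overline{B}$, and A and B are disjoint, then α + β = $\overline{A ∪ B}$, where the order in A ∪ B is defined as follows. If x, y ∈ A ∪ B then x < y if and only if (a) x, y ∈ A and x < y in A, or (b) x, y ∈ B and x < y in B, or (c) x ∈ A and y ∈ B.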

We denote the ordered set A ∪ B by A ⊕ B. Thus

$\overline{A ⊕ B} = \overline{A} + \overline{B}$

In brief, α + β puts the elements of B after the elements of A. This explains why $\overline{N^+} = \omega + 1$. But notice that we do not have the commutative law. The following diagram shows why 1 + ω = ω ≠ ω + 1. Take {x} as a representative of 1 and N as a representative of ω.

{x} ⊕ {0 < 1 < 2 < ...} = {x < 0 < 1 < 2 < ...} ≅ {0 < 1 < 2 < ...}

Here we use the order preserving map x ↦ 0 and n ↦ n + 1 for n ∈ N.

Although we do not have the commutative law, we still have the associative law:

α +(β + γ)=(α + β)+γ

This permits writing α + β + γ for either of these equal formulas.
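As a small sanity check of these laws, here is a sketch in Python of addition for the ordinals below ω · ω. The encoding is an assumption made for this example: the pair (a, b) stands for ω · a + b, that is, a copies of ω followed by b further elements. The names `add`, `ONE` and `OMEGA` are hypothetical.

```python
def add(x, y):
    """Add two ordinals below omega*omega, encoded as (a, b) = omega*a + b."""
    a, b = x
    c, d = y
    if c > 0:
        # The finite tail b of x is absorbed by the copies of omega in y,
        # just as 1 + omega = omega.
        return (a + c, d)
    # y is finite, so it is simply appended to the finite tail of x.
    return (a, b + d)

ONE = (0, 1)
OMEGA = (1, 0)

print(add(ONE, OMEGA))   # 1 + omega
print(add(OMEGA, ONE))   # omega + 1

# Brute-force check of the associative law on a small range of ordinals.
small = [(a, b) for a in range(3) for b in range(3)]
associative = all(
    add(x, add(y, z)) == add(add(x, y), z)
    for x in small for y in small for z in small
)
print(associative)
```

The first result is the encoding of ω and the second is the encoding of ω + 1, so the commutative law fails while the associative law holds on the tested range.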

Definition (Multiplication of Order Types). If α = $\overline{A}$ and β = $\overline{B}$, then αβ = $\overline{A × B}$, where the order in A × B is defined as follows. If (a, b) and (a₁, b₁) ∈ A × B, then (a, b) < (a₁, b₁) if and only if (a) b < b₁, or (b) b = b₁ and a < a₁.

We denote the ordered set A × B by A ⊗ B. Thus

$\overline{A ⊗ B} = \overline{A}\,\overline{B}$

In brief, the order in A × B is determined by the order in B. But in the event of a tie (b = b₁), the order in A kicks in. It is easy to prove that for any order type α, we have

1α = α1 = α. (1)

We illustrate the definition of product by determining the order in ω · 2. Since ω = $\overline{N}$ and 2 = $\overline{\{0, 1\}}$, the ordering is

(0, 0) < (1, 0) < (2, 0) <...<(n, 0) <...<(0, 1) < (1, 1) < (2, 1) <...<(n, 1) <...

This order type is seen to be ω + ω. Thus ω · 2 = ω + ω, a result we might anticipate by writing ω · 2 = ω(1 + 1). (We shall shortly show that α(β + γ) = αβ + αγ.) Another way of thinking of the product αβ is to take the ordering of β and replace each element in this ordering by all of α. In the above computation of ω · 2, we first take the ordering of 2: 0 < 1. Now replace 0 and 1 by the ordering of ω:

$$\underbrace{0 < 1 < 2 < \cdots}_{0} \;<\; \underbrace{0 < 1 < 2 < \cdots}_{1}$$

Of course the two occurrences of 0, and of 1, etc., are regarded as different. This is why we use ordered couples in the formal definition. If we use this technique to compute 2 · ω, we arrive at the ordering

$$\underbrace{0 < 1}_{0} \;<\; \underbrace{0 < 1}_{1} \;<\; \underbrace{0 < 1}_{2} \;<\; \cdots$$

This yields 2 · ω = ω, which can of course be formally proved using the definition. This illustrates that the commutative law is not valid for the multiplication of order types. It is easy to see that the associative law is valid for multiplication.
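The rule "compare second coordinates first, then first coordinates" is easy to try out on finite sets. The following Python sketch lists A × B in the order of A ⊗ B; the function name `product_order` is an assumption for this example, and finite ranges stand in for the infinite sets.

```python
from itertools import product

def product_order(A, B):
    """List the pairs of A x B in the order of the ordered product of A and B."""
    # Sort by the B-coordinate first; ties are broken by the A-coordinate.
    return sorted(product(A, B), key=lambda pair: (pair[1], pair[0]))

# A finite stand-in for omega * 2: two copies of {0, 1, 2}, one after the other.
print(product_order(range(3), range(2)))

# A finite stand-in for 2 * omega: copies of {0, 1} laid end to end.
print(product_order(range(2), range(3)))
```

In the first list all pairs with second coordinate 0 come before all pairs with second coordinate 1, matching the picture of one copy of the first factor for each element of the second.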

As indicated above, the distributive law is valid.

Theorem. (The Distributive Law for Order Types.) If α, β and γ are order types, then

α(β + γ) = αβ + αγ (2)

Proof: The proof is straightforward though tedious. We find linearly ordered sets A, B, C with B and C disjoint such that $\overline{A}$ = α, etc. We let a, a′, ... represent elements of A, and similarly for b, b′, ... etc. We now compare the ordering of

X = A ⊗ (B ⊕ C) and Y =(A ⊗ B) ⊕ (A ⊗ C).

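Both have elements of the type (a, b) or (a, c). We now compare the ordering in X and Y. (a) Comparing (a, b) and (a′, c): since b < c in B ⊕ C, we have (a, b) < (a′, c) in X, and since every element of A ⊗ B precedes every element of A ⊗ C, we also have (a, b) < (a′, c) in Y. (b) Comparing (a, b) and (a′, b′), or (a, c) and (a′, c′): in both X and Y the order is the one given by the definition of ⊗ within A ⊗ B, respectively A ⊗ C. Thus X and Y have the same ordering, which proves the result.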

Note that we do not have the rule (β + γ)α = βα + γα. For example, ω = 2 · ω = (1 + 1)ω ≠ 1 · ω + 1 · ω = ω + ω.

We do not consider exponentiation at this time.

Well Ordering. A linearly ordered set S is well ordered if every non-empty subset T ⊆ S has a least element. We have already remarked (page 10) that N is well ordered.

Deﬁnition. An ordinal number is the order type of a well ordered set.

We have seen that ω is an ordinal number, since ω = $\overline{N}$. A little analysis will show that ω + 1 and ω + ω are also ordinals. Before delving into the structure of ordinal numbers, it is useful to give an equivalent criterion for a well ordered set.

Theorem: A linearly ordered set S is well ordered if and only if there is no (denumerably) infinite strictly decreasing sequence s₁ > s₂ > ... > sₙ > ... with sᵢ ∈ S.

Proof: If the set S is well ordered then there cannot be such a sequence, since the set of its terms clearly has no least element. If S is not well ordered, then there is a non-empty subset T ⊆ S such that T has no least element. Now choose an element s₁ ∈ T. Since it is not the least element of T, we can find an element s₂ ∈ T such that s₁ > s₂. Similarly, s₂ is not the least element of T, so we can continue the process to find s₃ ∈ T with s₂ > s₃. This process can be continued indefinitely, and so we can find an infinite decreasing sequence in S. This proves the converse.¹

Theorem: If α and β are ordinals, so are α + β and αβ.

Proof: (a) (The sum.) Let A and B be disjoint well ordered sets with $\overline{A}$ = α and $\overline{B}$ = β. Now let S be a non-empty subset of A ⊕ B. If S ∩ A is non-empty, then it will have a least element a, which will also be less than all elements in B. So a will be the least element of S. On the other hand, if S ∩ A = ∅, then S ⊆ B and since B is well ordered, S will have a least element.

(b) (The product.) Let A and B be well ordered sets with $\overline{A}$ = α and $\overline{B}$ = β. Now let S ⊆ A ⊗ B be non-empty. S is a set of ordered couples (a, b). Let S′ be the set of all b’s such that (a, b) ∈ S for some a. Then, since B is well ordered, S′ has a least element b₀. Now let S″ be the set of all a ∈ A such that (a, b₀) ∈ S. Since A is well ordered, S″ has a least element a₀. It is now an easy matter to show that (a₀, b₀) is the least element in S. For if (a, b) ∈ S, we have b₀ ≤ b. If b₀ < b, then (a₀, b₀) < (a, b). If b₀ = b, then a ∈ S″, so a₀ ≤ a and again (a₀, b₀) ≤ (a, b).

¹ Actually this proves the contrapositive of the result.

In ordinary language, an ordinal number gives a ranking: first, second, third, etc. If S is a well ordered set, we can similarly rank the elements. If S is non-empty, there has to be a least element, which we call s₀ (the zero-th element). Then there has to be a least element greater than s₀, which we call s₁. If S is infinite, we can continue in this way to find elements s₀ < s₁ < s₂ < .... If there are still more elements in S, we call the next one s_ω. (s_ω is the least element of the set S − {s₀, s₁, s₂, ...}.) So we have the first ω + 1 elements² of S: s₀ < s₁ < s₂ < ... < s_ω. Continuing in this way, the indices used are

0 < 1 < 2 < ... < ω < ω + 1 < ... < ω · 2 < ω · 2 + 1 < ... < ω · 3 < ... < ω² < ...

How long can this go on? The answer is: for as long as there are more elements in the set. We need to prove this. We do note that in the sequence of ordinals 0, 1, 2, 3, ..., ω, ω + 1, ..., ω², ..., the (ordered) set of ordinals strictly preceding any of the ordinals α in the set has ordinal number α. This is why it is convenient to start the ordinal numbers with 0. Thus, preceding 3, we have 0 < 1 < 2, an ordered set with 3 elements. And preceding ω, we have the set of natural numbers (as an ordered set), which has ordinal number ω. Thus, in constructing ordinal numbers in this piecemeal way, as long as we have more ordinal numbers to consider (i.e. more elements in the set S), we can get the next ordinal by taking the ordinal number of the ordinals already constructed. Note that we do not simply add 1 to the last ordinal found. There might not be a last, as for example when we have constructed all finite ordinals.

But what happens if the set S is non-denumerable? Can it be well ordered? Is there a super inductive method of dealing in this way with any ordered set? We discuss this in the next section. But first we define inequality of ordinals.

Definition. A segment of a well ordered set S is the set of all x ∈ S with x < x₀, where x₀ is a fixed element of S. We denote this segment by seg_S(x₀).

Note that a segment of a well ordered set is also a well ordered set. Also, by definition, if x ∈ seg_S(x₀), then x < x₀.

Definition. If α = $\overline{A}$ and β = $\overline{B}$, we say that α < β if and only if there is some segment C = seg_B(b₀) of B such that A ≅ C.

It is straightforward to show that transitivity holds for inequality of ordinals. It is less straightforward to show that irreflexivity holds: α ≮ α.

Theorem: If A is well ordered and α = $\overline{A}$ then α ≮ α.

The idea of the following proof is as follows: If α < α, then A ≅ seg_A(a₀), so A would have the same order type as a smaller set. This smaller set would then have the same order type as a yet smaller set, and so on. This will lead to a contradiction for a well ordered set. A sketch of the proof follows.

Proof: (By contradiction.) Suppose α < α. Then there is an element a₀ ∈ A such that A ≅ seg_A(a₀). Set A₀ = seg_A(a₀). Then there is a 1-1 onto order preserving function f: A → A₀. Let a₁ = f(a₀). Since a₁ ∈ A₀ = seg_A(a₀), we have a₁ < a₀. Applying f, which preserves order, gives a₂ = f(a₁) < f(a₀) = a₁. Continuing in this way, we find a₀ > a₁ > a₂ > .... This is an infinite descending sequence, which is impossible in a well ordered set. (See page 36.) This contradiction proves the theorem.

² We have not yet defined inequality of ordinals, so we are working intuitively here.

9 Zorn’s Lemma and Applications

Infinite sets have always been viewed suspiciously by mathematicians because too many paradoxes arose when studying them. The famous Russell paradox notes that the set of all sets is itself a set, so it is an example of a set which belongs to itself. Russell then considers the set of all sets which do not belong to themselves. If this set belongs to itself, then of course it does not belong to itself. But if it does not belong to itself, then it does belong to itself. This example, and many more, peppered the mathematical landscape in the 20th century and before. So mathematicians studied sets, and especially infinite sets, with a great deal of caution. Still, the subject itself was regarded as so juicy that most mathematicians proceeded with caution, fixing the foundations along the way. To some extent, we seem to be in this uncertain phase today.

Here is an example of a possible foundational problem. Suppose we have an onto function f: A → B, where A and B are infinite. Since the function is onto, if b ∈ B there is a non-empty set A_b of elements of A which map into b. Now we wish to form a “pseudo inverse” function g: B → A where, for any b ∈ B, g(b) ∈ A_b. How do we do this? For any b ∈ B, we know that A_b is non-empty. So let g(b) equal any one of the elements in A_b. Although this may sound reasonable, questions were asked. Which element of A_b? If the answer is “any one”, the objection was that a function has to be well defined, so “any one” does not cut it. So even if f is well defined, the pseudo inverse is not. So mathematicians (or logicians) invented the Axiom of Choice, which states that a choice function exists, and therefore that there is such a pseudo inverse. Today, many mathematicians will seek proofs which do not use the Axiom of Choice. And if they have to use it, they will often explicitly state this.
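For finite sets no axiom is needed, because we can give an explicit rule for the choice. The Python sketch below builds a pseudo inverse by always picking the least element of A_b; the rule "take the minimum", the function name `pseudo_inverse`, and the sample sets are assumptions made for this illustration. The difficulty discussed above arises when no such uniform rule is available.

```python
def pseudo_inverse(f, A, B):
    """Return g with f(g[b]) == b for all b in B, choosing min(A_b) each time."""
    g = {}
    for b in B:
        fiber = [a for a in A if f(a) == b]  # the set A_b
        if not fiber:
            raise ValueError(f"f is not onto: nothing maps to {b!r}")
        g[b] = min(fiber)  # an explicit, well defined choice
    return g

A = range(-3, 4)        # {-3, ..., 3}

def f(a):
    return a * a        # maps A onto B

B = {0, 1, 4, 9}
g = pseudo_inverse(f, A, B)
print(g)
```

Every b in B satisfies f(g(b)) = b, so g is a pseudo inverse in the sense described above.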

Zorn’s Lemma, given below, is logically equivalent to the Axiom of Choice, although this is not immediately apparent. It can do for infinite sets, not necessarily denumerable, what recursion does for finite or denumerable sets. For example, if we want to construct a function on a denumerable set, we can go through its elements, one by one, and construct the function recursively. It is not clear how this technique can extend to arbitrary infinite sets. Zorn’s lemma will show how this can be done. We will state Zorn’s lemma¹ below in an abstract setting. But first, some definitions.

Definition. Let P be a partially ordered set², ordered by ≺. A maximal element x₀ ∈ P is an element which cannot be exceeded by any element of P. That is, x₀ ⊀ x for any x ∈ P. If S ⊆ P, an upper bound for S is an element s₀ ∈ P such that s ⪯ s₀ for all s ∈ S. A subset S ⊆ P is called a chain if any two elements of S are comparable. That is, if s, t ∈ S, then either s ⪯ t or t ⪯ s.

Do not confuse “upper bound” with maximal element. Here are some examples which may clarify these definitions. Look them over carefully to fix ideas.

¹ After Max Zorn. The lemma was first publicly stated and ascribed to Zorn in 1922.
² See page 29.

Example 1. (P, ⊆) is the set of all subsets of a given set S. Then S itself is a maximal element for P, and is an upper bound for any set of subsets.

Example 2. P is the set of positive integers, and the order relation is “divides”. Thus a ⪯ b means that a divides b, written a|b. Here, any finite set F = {a₁, ..., aₙ} has an upper bound a₁a₂···aₙ. No infinite set can have an upper bound. P does not have a maximal element. An example of a chain is the set of all non-negative powers of 2: 1, 2, 4, 8, .... Another example of a chain is the set of factorials: {1, 2, 6, 24, ..., n!, ...}.

Example 3. P is the set of all inﬁnite proper subsets of the real numbers R, under inclusion. P has many maximal elements: Choose any real number a. Then R −{a} is maximal. An example of a chain is the set of all open intervals of the form (−a, a) where a is an arbitrary positive real. This chain has no upper bound. Why? The set of open intervals of the form (0,a) is also a chain, and it has an upper bound. Name one! The set of intervals of the form (a, 2a) is not a chain. Why?
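The claims in Example 2 can be checked mechanically on finite pieces of the sets involved. The following Python sketch tests the chain and upper bound conditions for the "divides" order; the helper names `divides`, `is_chain` and `is_upper_bound` are assumptions for this illustration.

```python
from math import factorial, prod

def divides(a, b):
    """The order relation of Example 2: a precedes b when a divides b."""
    return b % a == 0

def is_chain(elements):
    """True if any two elements are comparable under 'divides'."""
    return all(divides(s, t) or divides(t, s)
               for s in elements for t in elements)

def is_upper_bound(u, elements):
    """True if every element divides u."""
    return all(divides(s, u) for s in elements)

powers_of_two = [2 ** k for k in range(10)]
factorials = [factorial(n) for n in range(1, 10)]
F = [4, 6, 10]

print(is_chain(powers_of_two))      # a finite piece of the chain 1, 2, 4, 8, ...
print(is_chain(factorials))         # a finite piece of the chain of factorials
print(is_chain([2, 3]))             # 2 and 3 are not comparable
print(is_upper_bound(prod(F), F))   # the product bounds a finite set
```

Note that the product is an upper bound but usually not the least one; for F = {4, 6, 10} the least common multiple 60 is also an upper bound.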

We can now state Zorn’s Lemma.

Zorn’s Lemma. Let (P, ⪯) be a partially ordered set such that every chain has an upper bound. Then P has a maximal element.

Is that all? What’s the idea behind this? Suppose we are trying to find a maximal element in P. So we start somewhere and go higher and higher (larger and larger). But we may be nowhere near our goal. But we are guaranteed that any chain has an upper bound. So wherever we are in our attempt, we can always continue from this upper bound and go higher and higher, etc. For a set with a huge cardinality, this attempt may seem futile. But Zorn’s lemma tells us not to worry: there is a maximal element. Zorn’s lemma is not constructive. It states that there is a maximal element but does not state how to get it.

As a ﬁrst application, we prove that any two cardinal numbers are comparable.

Theorem. (Comparability of Cardinal Numbers.) If α and β are cardinal numbers, then either α ≤ β or β ≤ α.

The idea of the proof is as follows. If A and B are sets with |A| = α and |B| = β, we attempt a 1-1 correspondence between part of A and part of B. If this doesn’t give the required correspondence because neither A nor B is fully used, simply extend and continue this indefinitely. This extending and continuing indefinitely is what brings Zorn’s lemma to mind. The proof follows.

Proof: We say that a function f: A₁ → B₁ is eligible if f is 1-1 and onto, and A₁ ⊆ A and B₁ ⊆ B. We let P be the set of eligible functions. We make P into a partially ordered set by defining f ⪯ g to mean that g extends f. This means that if f: A₁ → B₁ and g: A₂ → B₂ then A₁ ⊆ A₂, B₁ ⊆ B₂, and g(x) = f(x) for all x ∈ A₁.

Now suppose that f: A₁ → B₁ and A₁ ≠ A and B₁ ≠ B. Then it is possible to extend f to f′ as follows. Choose a′ ∈ A − A₁ and b′ ∈ B − B₁, and define f′: A₁ ∪ {a′} → B₁ ∪ {b′} by the conditions: f′(a) = f(a) for a ∈ A₁ and f′(a′) = b′. Then f′ ∈ P and f ≺ f′. This shows that f is not a maximal element.

Thus, if f is a maximal element in P then either A₁ = A or B₁ = B. In the former case, f sends A 1-1 onto B₁. In the latter case, f⁻¹ sends B 1-1 onto A₁. This means that |A| ≤ |B| or |B| ≤ |A| respectively. So either α ≤ β or β ≤ α and we have the result.

It remains to find a maximal element in P. Here, we use Zorn’s Lemma. Let S be any chain in P. Let A₀ be the union of the domains of functions in S. Thus, x ∈ A₀ if and only if x is in one of the domains of functions in S. Similarly define B₀ as the union of the ranges of functions in S. Finally, for x ∈ A₀, we define F(x) = f(x) where f is one of the functions in S defined on a set containing x. Then F: A₀ → B₀. The definition is unique, because S is a chain: given two functions in S, one extends the other. It is not hard to show that F is 1-1 and onto, so F ∈ P. The resulting F is an extension of all the functions in S, so it is an upper bound for S. We have shown that every chain in P has an upper bound. Thus by Zorn’s Lemma, P has a maximal element. As we have seen above, this proves the theorem.
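For finite sets the "extend until stuck" idea needs no Zorn's Lemma, since plain recursion suffices. The Python sketch below keeps extending an eligible function by one pair (a′, b′) at a time until A or B is used up. It is a finite analogue only, and the name `maximal_eligible` is an assumption for this example.

```python
def maximal_eligible(A, B):
    """Extend a 1-1 partial map from A into B until A or B is exhausted."""
    f = {}
    unused_A = list(A)
    unused_B = list(B)
    while unused_A and unused_B:
        # Choose a' in A - A1 and b' in B - B1, and extend f by a' -> b'.
        f[unused_A.pop()] = unused_B.pop()
    return f

A = ["a", "b", "c"]
B = [1, 2, 3, 4, 5]
f = maximal_eligible(A, B)

domain_is_all_of_A = set(f) == set(A)
is_one_to_one = len(set(f.values())) == len(f)
print(f, domain_is_all_of_A, is_one_to_one)
```

Here the loop stops because A is exhausted, so f maps A 1-1 into B, which is the finite version of |A| ≤ |B|.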

A similar argument proves the analogous theorem for ordinal numbers:

Theorem. (Comparability of Ordinal Numbers.) If α and β are ordinal numbers, then either α ≤ β or β ≤ α.

If α is an ordinal number, it makes sense to find its cardinal number |α|. We simply define it as the cardinal number of any well ordered set with ordinal number α. That is, if α = $\overline{A}$, define |α| = |A|. Clearly, this definition is independent of the set used to determine the cardinal number |α|, since if A ≅ B then we also have A ∼ B and so |A| = |B|. Cantor wanted to have ordinal numbers with any cardinality. In order to do this, it is necessary to give a well ordering on any set. For example, imagine a well ordering of the real numbers.³ Then we would have an ordering on a huge set (cardinality 2^ℵ₀) in which every descending sequence must be finite!

Theorem. (The Well Ordering Theorem.) Any set can be well ordered.

The idea of the proof is to build up well-ordered subsets of a set S until no further buildup is possible. At that point we will have well-ordered all of S. The method is to use Zorn’s Lemma.

³ This ordering need not have any relationship to the usual ordering of the reals. For example, we might have π ≺ 2 ≺ 10 ≺ e, etc.

Proof: We consider the class T of subsets A of S together with a well ordering on A. We will use ≺ as the “less than” symbol in A. We make T into a partially ordered set by defining A₁ ⊑ A₂ if and only if (i) A₁ ⊆ A₂, and (ii) A₁ ≤ A₂. (Intuitively, A₂ is obtained from A₁ by adjoining new elements at the end of A₁ and keeping it well ordered.) If A is a proper subset of S, then A cannot be a maximal element of T. To see this, take any element s₀ ∈ S − A and adjoin it to the end of A. The resulting well ordered set A ∪ {s₀} will then be strictly larger than A. Finally, it remains to be shown that any chain in T has an upper bound. We omit the details here. Zorn’s Lemma then gives a maximal element of T, which by the above must be a well ordering of all of S.

Summarizing, with the help of Zorn’s lemma we are able to obtain many powerful results in the transﬁnite range. We have shown that any two cardinal numbers are comparable as are any two ordinals. We have also shown that there are ordinals with arbitrary cardinality.

10 Peano’s Postulates

Giuseppe Peano (1858-1932) was a famous Italian mathematician who did his work in analysis, geometry and logic. He is probably most remembered for his axiomatic treatment of the natural numbers N (called numbers in what follows). This axiom system, commonly called Peano’s Postulates or Peano’s Axioms, is as follows. It takes for granted two undefined terms¹: 0, and x′ (the successor of x). He wanted to show how all of arithmetic can be developed and proved, using these two basic concepts.

Peano’s Postulates

Ax. 1. 0 is a number. (0 ∈ N.)
Ax. 2. 0 is not the successor of any number. (∀x)(0 ≠ x′)
Ax. 3. If x is a number, then x′ is a number. (∀x)(x ∈ N → x′ ∈ N)
Ax. 4. If x′ = y′, then x = y. (The map x ↦ x′ is 1-1.)
Ax. 5. If S is a set of numbers, and if (1) 0 ∈ S, and (2) x ∈ S → x′ ∈ S, then S = N.

Some comments. Axioms 1 and 2 start the natural numbers at 0. Axiom 3 postulates the method of getting the next number, once you have a number. Axiom 4 prevents the progression of numbers from going around in a circle.

Axiom 5 is the famous principle of mathematical induction and is called the induction axiom. It is sometimes stated as

Axiom 5′: If a statement P(n) about a number n is true for n = 0, and if, on the assumption that P(n) is true for some n, it is also true for n + 1, then the statement is true for all numbers n.²

Axiom 5 implies Axiom 5′. Assume we have Axiom 5, and a statement for which the hypotheses of Axiom 5′ are true. Let S be the set of n for which the statement is true. Then conditions (1) and (2) of Axiom 5 will be true, so S = N and so the statement is true for all n ∈ N. It is also easy to show that Axiom 5′ implies Axiom 5.

Axiom 5 is the set version of induction, while 5′ is the statement version.

Before using induction (Axiom 5′), be sure to keep in mind what the statement P(n) is. To use the axiom, you must first prove that P(0) is true. Then, to prove that P(n) → P(n + 1), you can assume that P(n) is true³ and from this, you must prove that P(n + 1) is also true. Once this is done, you have the complete proof that P(n) is true for all n. This is often confusing for beginning students. It looks like you are assuming the result! In fact, when you are assuming P(n) is true, you are assuming that it is true for one particular value of n, not all n. You then show it must also be true for the next value of n. The power of the method is that it gives us a powerful starting point: we are able to assume P(n) is true.

¹ In much the same way, point and line are undefined terms in geometry. Peano wanted to define numbers ‘from scratch’.
² We take n′ = n + 1 here. This will be justified below.
³ This is called the induction hypothesis.

We illustrate the method with a typical high school example. We wish to prove that 1 + 3 + ... + (2n + 1) = (n + 1)². The proof is as follows.
(1) For n = 0 this is the statement that 1 = (0 + 1)², which is true. So the statement is true for n = 0.
(2) We now assume the result for n: 1 + 3 + ... + (2n + 1) = (n + 1)². This is the induction hypothesis.

We now proceed to prove the result for n + 1, namely that 1 + 3 + ... + (2(n + 1) + 1) = (n + 1 + 1)², or 1 + 3 + ... + (2n + 3) = (n + 2)². We do this by starting with our induction hypothesis

1 + 3 + ... + (2n + 1) = (n + 1)² = n² + 2n + 1.

Now add 2n + 3 to both sides of this equation:

1 + 3 + ... + (2n + 1) + (2n + 3) = (n + 1)² + (2n + 3) = n² + 2n + 1 + 2n + 3

or 1 + 3 + ... + (2n + 3) = n² + 4n + 4 = (n + 2)², which is the result for n + 1. This proves the result by mathematical induction.
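A numerical check is no substitute for the induction proof, but it is a useful way to catch a wrongly stated formula before trying to prove it. The short Python sketch below tests both the identity and the inductive step for the first few hundred values of n; the function name `odd_sum` is an assumption for this example.

```python
def odd_sum(n):
    """Return 1 + 3 + ... + (2n + 1)."""
    return sum(2 * k + 1 for k in range(n + 1))

# The statement P(n): the sum equals (n + 1)^2.
identity_holds = all(odd_sum(n) == (n + 1) ** 2 for n in range(300))

# The inductive step: adding 2n + 3 to the sum for n gives (n + 2)^2.
step_holds = all(odd_sum(n) + (2 * n + 3) == (n + 2) ** 2 for n in range(300))

print(identity_holds, step_holds)
```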

We now start with the formal development of arithmetic, using Peano’s postulates. We start by “deﬁning” addition and multiplication of numbers by recursion formulas.

a + 0 = a;  a + b′ = (a + b)′  (1)

a · 0 = 0;  ab′ = ab + a  (2)

Formula (1) defines a + b by induction on b. The first part gives its value for b = 0. Once we know its value at any b, the second part of this system gives its value at the next value b′ of b. This is not quite induction, but we take it as a recursive definition of a + b. Similarly, formula (2) for multiplication defines ab by induction on b, using the previous definition of a + b. These formulas form the basis for arithmetic. They are easy to remember: just think of b′ as b + 1. We start with some non-surprising definitions:

1 = 0′; 2 = 1′ = 0′′; 3 = 2′ = 0′′′; ...
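The recursion formulas (1) and (2) translate almost word for word into a program. In the Python sketch below, the encoding of numbers as nested tuples (0 is the empty tuple, and x′ is a tuple containing x) is an assumption made purely for illustration, as are the names `succ`, `add`, `mul`, `from_int` and `to_int`.

```python
ZERO = ()

def succ(x):
    """The successor x' of x."""
    return (x,)

def add(a, b):
    """a + 0 = a;  a + b' = (a + b)'."""
    if b == ZERO:
        return a
    return succ(add(a, b[0]))      # b[0] is the number whose successor is b

def mul(a, b):
    """a * 0 = 0;  a * b' = a * b + a."""
    if b == ZERO:
        return ZERO
    return add(mul(a, b[0]), a)

def from_int(n):
    """Build the numeral 0'...' with n successor marks."""
    x = ZERO
    for _ in range(n):
        x = succ(x)
    return x

def to_int(x):
    """Count the successor marks in a numeral."""
    n = 0
    while x != ZERO:
        x = x[0]
        n += 1
    return n

ONE = succ(ZERO)
TWO = succ(ONE)
print(add(ONE, ONE) == TWO)
print(to_int(mul(from_int(3), from_int(4))))
```

Only `succ` and the test against `ZERO` are used to define `add` and `mul`, mirroring the way the notes build arithmetic from 0 and the successor alone.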

The Peano Axiom 2 states that the number 0 has no predecessor. We can now show that every other number has a predecessor.

Theorem: (The Predecessor Theorem.) Every non-zero number is the successor of some number.

Proof: We want to show that if x ≠ 0 then there is a y such that x = y′. We shall show this by induction on x. It is true for x = 0, since the statement “If 0 ≠ 0 then there is a y such that 0 = y′” is true.⁴ If the statement is true for x, it is certainly true for x′ since x′ clearly has x as its predecessor. This completes the proof.⁵

As a consequence, we have the following useful result.

Theorem: If x + y = 0 then x = 0 and y = 0.

Suppose, for example, that y ≠ 0. Then we have y = z′ for some z by the predecessor theorem. Therefore, (x + z)′ = x + z′ = x + y = 0. But 0 has no predecessor by Axiom 2. This contradiction shows that y = 0, and then x = x + 0 = x + y = 0.

We now show how ordinary algebra can be derived from the axioms.

Theorem: 1 + 1 = 2. (Are you surprised at this?)
Proof: 1 + 1 = 1 + 0′ = (1 + 0)′ = 1′ = 2.

Theorem: For any number n, n′ = n + 1.
Proof: n + 1 = n + 0′ = (n + 0)′ = n′.

Theorem: (a + b) + c = a + (b + c). (The associative law for addition.)
Proof: By induction on c. It is true for c = 0 since (a + b) + 0 = a + b, and a + (b + 0) = a + b using the first part of definition (1). Now assuming the result for c, we get (a + b) + c′ = ((a + b) + c)′ = (a + (b + c))′ = a + (b + c)′ = a + (b + c′). This is the result for c′. So the result is proved by induction.

Theorem: 0 + n = n.
Proof: The result is true for n = 0, since 0 + 0 = 0. Assume the result is true for n. So 0 + n = n. Then 0 + n′ = (0 + n)′ = n′. This is the result for n′, so the result is true for all n.

Similarly, we have

Theorem: 1 + n = n′. We leave the easy proof (by induction) to the reader.

⁴ This is a vacuously true statement, since the hypothesis is false. For example, “If 0 ≠ 0, then 18 = 3” is a true statement.
⁵ If you are uneasy about the case x = 0, one way of expressing the statement “If x ≠ 0 then there is a y such that x = y′” is “x = 0 or x has a predecessor.” The case x = 0 is then clearly true, without fretting about vacuously true statements. In general, we can always replace p → q by ¬p ∨ q.

Theorem: a + b = b + a. (The commutative law for addition.)
Proof: By induction on a. It is true for a = 0, since b + 0 = b = 0 + b. Now assume it true for a. Then by the above results,

b + a′ = (b + a)′ = (a + b)′ = a + b′ = a + (1 + b) = (a + 1) + b = a′ + b.

This is the result for a′ and the result is proved.

In the same way, it is possible to verify the usual laws of algebra for multiplication on natural numbers. We list these results, and prove one of them.

The Associative Law: (ab)c = a(bc)
The Commutative Law: ab = ba
The Left Distributive Law: a(b + c) = ab + ac
The Right Distributive Law: (a + b)c = ac + bc
Property of 1: 1a = a
Property of 0: 0a = 0

For example, let’s prove the left distributive law a(b + c) = ab + ac by induction on c. For c = 0, this is a(b + 0) = ab + a0. But this follows from the definitions: a(b + 0) = ab and ab + a0 = ab + 0 = ab. So, assuming the result for c, we get

a(b + c′) = a(b + c)′ = a(b + c) + a = (ab + ac) + a = ab + (ac + a) = ab + ac′.

This is the result for c′. Note that we have used, respectively, the definition of addition, the definition of multiplication, the inductive hypothesis, the associative law for addition, and the definition of multiplication.

We leave the proofs of the remaining algebraic identities to the reader. From now on, we shall take for granted these algebraic results. We will also use expressions such as a + b + c + d, because their values are independent of how parentheses are introduced. Similarly the commutative law for addition will show that, for example, a + b + c + d = c + a + d + b. Similar remarks apply for multiplication.

An important property of numbers is the cancellation law for addition:

Cancellation Law for Addition: If a + c = b + c, then a = b.

Note that we cannot simply add −c to both sides of the equation, since we have no negative numbers in this system.

The proof is (of course) by induction on c. It is clearly true if c = 0. Now suppose it is true for some c, and we have a + c′ = b + c′. Then we have (a + c)′ = (b + c)′. Then by Peano’s 4th axiom, this implies a + c = b + c, and so we have a = b by the induction hypothesis. We also have a cancellation law for multiplication. See below.

Inequalities are introduced in a natural way. We write a ≤ b to mean that b = a + x for some x ∈ N. Thus, a ≤ b if and only if (∃x)(b = a + x). As usual, we write a < b to mean that a ≤ b and a ≠ b.⁶

It is a simple algebraic result that the relations ≤ and < are transitive. For if a ≤ b and b ≤ c, then we have b = a + x and c = b + y. Thus, c = a + x + y and so a ≤ c. If one of the inequalities is strict, we have either x ≠ 0 or y ≠ 0. Therefore by the theorem following the predecessor theorem (page 45), x + y ≠ 0, and a < c.

We also have a cancellation law for inequalities: Theorem: If a + x ≤ b + x then a ≤ b, and similarly for strict inequalities. For a proof, we have b + x = a + x + w for some w. Now cancel x using the cancellation law for addition. This gives b = a + w and the result.

The following result follows from the predecessor theorem:

Theorem: If x ≠ 0 then x ≥ 1. To see this, note that x has a predecessor: x = z′ = z + 1. This shows that x ≥ 1. We can rephrase this as “1 is the smallest positive number.”

As a corollary, we have

Corollary: There is no number strictly between x and x′. To see this, suppose x < y. Then y = x + u with u ≠ 0, so u = 1 + w for some w, and y = x + 1 + w = x′ + w. Thus x′ ≤ y, so y is not strictly less than x′.

We can also show that we can algebraically manipulate inequalities.

Theorem: If a ≤ b and c ≤ d then a + c ≤ b + d and ac ≤ bd. If the inequalities are strict⁷, then the conclusions will be strict inequalities.

Proof: Writing b = a + x and d = c + y, add to get b + d = a + c + x + y, which shows a + c ≤ b + d. For strict inequalities, x + y ≠ 0 and we have the strict conclusion. Multiplying gives bd = ac + ay + xc + xy. A case analysis will give the result.

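We now prove the important

Theorem. (The comparison theorem.) For any a, b we either have a ≤ b or b ≤ a.

Proof: We prove this by induction on b. For b = 0, we have b = 0 ≤ a, so we have the result. Now suppose this is true for a given b and we now want to compare a and b′. We have two cases:

Case 1. a ≤ b. Then since b < b′, we have a ≤ b′ by transitivity.

Case 2. b < a. Then a = b + x with x ≠ 0, so x = 1 + w for some w and a = b + 1 + w = b′ + w. Thus b′ ≤ a.

In either case a and b′ are comparable, which completes the induction.

⁶ As usual, we write x > y and x ≥ y to mean, respectively, y < x and y ≤ x.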

This can be used to prove the

Cancellation Law for Multiplication: If ac = bc and c ≠ 0, then a = b.

To prove this, suppose a < b. Then b = a + x with x ≠ 0, so bc = ac + xc with xc ≠ 0, and hence ac < bc, contrary to ac = bc. Similarly, b < a is impossible. By the comparison theorem, a = b.

We have no intention of further developing number theory from Peano’s postulates. However, since we have used the fact that N is well ordered in the preceding sections, we now give a proof of this using Peano’s Axiom 5.

Theorem. (The Well Ordering Theorem for N.) Let S be a non-empty subset of N. Then S contains a least element.

Proof: Our strategy is to creep up on the least element of S from below, building up from 0 until we first hit an element of S. Let B be the set of numbers below S. That is, b ∈ B if and only if b ≤ s for every s ∈ S. Clearly 0 ∈ B. If B were closed under the successor operation⁸, it would follow that B = N by Axiom 5 (the set version of induction). But this is not possible because S is non-empty. For if s ∈ S then clearly s′ ∉ B. Therefore there is a b ∈ B such that b′ is not in B. So b′ is not below all of S, so there must be some s₀ ∈ S such that s₀ < b′. Since b ∈ B, we also have b ≤ s₀. As there is no number strictly between b and b′, we get s₀ = b. Thus b ∈ S and b ≤ s for every s ∈ S, so b is the least element of S.
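The "creep up from below" strategy of this proof is also how one would search for the least element in practice. The Python sketch below assumes that S is given by a membership test `in_S` and that it has an element below a search limit; both the function name `least_element` and the `limit` parameter are assumptions added so that the loop always terminates.

```python
def least_element(in_S, limit=10**6):
    """Return the least n with in_S(n) true, searching upward from 0."""
    b = 0
    while b < limit:
        # Every number below b has already been rejected, so b <= s for all s in S.
        if in_S(b):
            return b           # b is in S and below all of S: the least element
        b += 1                 # pass to the successor b'
    raise ValueError("no element of S found below the search limit")

print(least_element(lambda n: n * n > 50))
print(least_element(lambda n: n > 10 and n % 7 == 3))
```

For a non-empty S the theorem guarantees that such a search succeeds once the limit is large enough; the limit only protects against an empty S.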

⁸ That is, b′ ∈ B whenever b ∈ B.