20 LARRY SUSANKA

When G ⊂ P(X), we will use S(G) to denote the set of simple functions constructed from the sets in G.

A function that has constant range value t on its whole domain will sometimes be denoted t, with this usage (and the domain) taken from context. Thus, for example, χ_X is sometimes denoted by 1 and 0χ_X by 0, in yet another use of each of those symbols.

When H is a subset of [−∞, ∞]^X, we will use B(H) to denote the bounded members of H; f ∈ B(H) ⇔ f ∈ H and ∃ a ∈ R with 0 ≤ a < ∞ and −a ≤ f ≤ a.

If X is a topological space, let C(X) denote the continuous functions from X to R.

R^X, B(R^X) and, when X has a topology, C(X), are all vector lattices: real vector spaces and lattices. They are also commutative rings with multiplicative identity χ_X.

7.1. Exercise. S(G) is obviously a (possibly empty) . Give conditions on G under which S(G) is a vector lattice and a commutative ring with multiplicative identity χ_X.

8. The Axiom of Choice

In this section we will discuss an axiom of set theory, the Axiom of Choice.

Every human language has grammar and vocabulary, and people communicate by arranging the objects of the language in patterns. We imagine that our communications evoke similar, or at least related, mental states in others. We also use these patterns to elicit mental states in our "future selves," as reminders of past imaginings so that we can start at a higher level in an ongoing project and not have to recreate each concept from scratch should we return to a task. It is apparent that our brains are built to do this.

But words are all defined in terms of each other. Ultimate meaning, if there is any to be found, is derived from pointing out the window at instances in the world, or from introspection.

Very often ambiguity or multiple meaning of a phrase is the point of a given communication, and provides the richness and subtlety characteristic of poetry, for instance, or the beguiling power of political speech.

Set theory is a language mathematicians have invented to encode mathematics. But unlike most human languages, this language does everything possible to avoid blended meaning, to expose the logical structure of statements and to keep the vocabulary of undefined terms to an absolute minimum.

Many mathematicians believe what they do is "art." But ambiguity and internal discord are not part of our particular esthetic ensemble.

Most mathematicians believe that, though set theory may be unfinished, it serves its purpose well. Virtually all mathematical structures can be successfully modeled in set theory, to the extent that most mathematicians never think of any other way of speaking or writing.

CARDINALS AND ORDINALS 21

Together, the collection of axioms (which, along with logical conventions, defines the language) normally used by most mathematicians is called the Zermelo-Fraenkel Axioms, or simply ZF, and the set theory that arises from these axioms is called Zermelo-Fraenkel Set Theory.

You saw explicit mention of two axioms from ZF, the Axiom of Infinity and the Axiom of the , in Section 5. We have used others without mention on every page. For example we have formed power sets.

The Axiom of the Power Set: For any set A there is a set P(A) consisting of all, and only, the subsets of A.

Asserting the existence of a set with this feature is a dramatic and "non-constructive" thing to do, particularly when the underlying set is infinite. We are not told how to create this set. We just have a means of recognizing if a set we have in hand is a member of this power set, or not.

And where, exactly, did that first infinite set come from? The Axiom of Infinity brings it into existence, out of nothing, simply because mathematicians want infinite sets and this seems to be a logically consistent way to produce them.

There is another extremely useful (and arguably even less constructive) axiom which we discuss now.

We will present and presume to be true, wherever convenient, the four equivalent and useful statements below, one of which is called the Axiom of Choice. This axiom is frequently abbreviated to AC. The collection of the axioms of standard set theory plus this axiom is frequently denoted ZFC.

The discussion regarding equivalence of the Axiom of Choice and the other three statements, and the history associated with them, is a fascinating story which deserves study by every serious student of mathematics.

The Axiom of Choice: If J and X are sets and A: J → P(X) is an indexed collection of nonempty sets then there is a function f : J → X such that f(β) ∈ A_β ∀ β ∈ J. A function with this property is called a choice function for A.

Essentially, this axiom states that given any generic set S of nonempty sets, there is a way of selecting one element from each member of S. The other axioms do not imply that such a selection can be made, unless every member of S has an element with some unique property, which would allow it to be singled out.

Zorn's Lemma: If S is a set with a partial order and if every chain in S possesses an upper bound in S, then S has a maximal member.

Zermelo's Theorem: Every nonempty set can be well ordered.

Kuratowski's Lemma: Each chain in a partially ordered set S is contained in a maximal chain in S (that is, a chain in S not contained in any other chain in S).

Kuratowski's Lemma is also often called The Hausdorff Maximal Principle.

That Zorn's Lemma implies Kuratowski's Lemma is immediate. Suppose S is a set with a partial order and C is a chain in S. Let W denote the set of all chains in S which contain the chain C, ordered by containment. Any chain in W is bounded above by the union of the chain, so Zorn's Lemma implies that W contains a maximal member. That maximal member is a chain in S not properly contained in any other chain in S.

On the other hand, assuming Kuratowski's Lemma to be true, suppose S is a set with a partial order and that every chain in S possesses an upper bound in S. This time let W denote the set of all chains in S. Let X denote a maximal member of W. So X is a chain in S not contained in any other chain. Let M be any upper bound for X. By maximality of X, M must actually be in X and cannot be less than any other member of S: that is, M is maximal in S. So Zorn's Lemma is true.

In the last two paragraphs we have shown that Zorn's Lemma and Kuratowski's Lemma are equivalent statements.

We will now show that Zorn's Lemma implies AC. Suppose S is any nonempty set of nonempty sets and X is the union of all the sets in S. Let B = S × X.

Now let Q denote the set of all members of P(B) which are choice functions on their domains: that is, T ∈ Q exactly when T is nonempty and there is at most one ordered pair in T whose first component is any particular member of S, and also s ∈ A whenever (A, s) ∈ T. These are called "partial choice functions." Order Q by containment. The union of any chain in Q is a member of Q so Zorn's Lemma implies that Q has a maximal member. This maximal member is a choice function on its domain, which must by maximality be all of S.

The fact that Zermelo's Theorem implies AC is also straightforward: given any nonempty set S of nonempty sets, well order the set X = ⋃_{A ∈ S} A. For each A ∈ S let f(A) be the least element of A with respect to this ordering. Then f is the requisite choice function.

The opposite implication is a bit trickier. It involves using a choice function to create the well order.

Suppose set A has more than one element and f : P(A) − {∅} → A is a choice function: that is, f(E) ∈ E whenever ∅ ≠ E ⊂ A.

Let B denote the set of all nonempty containment-chains in P(A) − {∅} which are well ordered and satisfy the condition:

    Whenever I_K is an initial segment of one of these chains and if J is the union of all the sets in I_K then J ≠ A and K = J ∪ {f(A − J)}.

B is nonempty: for example, { {f(A)}, {f(A), f(A − {f(A)})} } is in B.

The condition above implies that each of the chains in B must start with the set {f(A)}, and the successor to any set K in such a chain (if, of course, K is not the last set in the chain) has exactly one more member than does K.

It also implies directly that if two different chains X and W of this kind have a common initial segment, so that I_K ⊂ X and I_G ⊂ W and I_K = I_G, then K = G. In other words, the least successor of an initial segment is determined by the sets in the initial segment, and not by the specific chain within which the initial segment sits.

Suppose that X is one of these chains. We will call S a "starting chunk" of X if ∅ ≠ S ⊂ X and whenever C, D ∈ X the condition D ∈ S and C ⊂ D implies C ∈ S. Now it might be that a starting chunk is as short as { {f(A)} } or it could, possibly, be all of X. But if it is not all of X then because X is well ordered there is a least member K of X not in S and so S contains all members of X less than K. That is, S = I_K for some K ∈ X. So starting chunks are either initial segments or the entire chain.

Now suppose X and W are unequal chains, members of B. Then one, say X, would contain a least set K not in the other. The initial segment I_K of X is contained in W. If there were a set in W not in I_K but less than some member of I_K then there would be a least member of W of this kind. Call that least member G. But then the initial segment I_G of W would be a starting chunk of X and by the above remark we would have G ∈ X, contrary to its definition.

So there are no missing members of W between members of I_K, which is therefore a starting chunk of W. Since K ∉ W we must have I_K = W, and conclude that W is an initial segment of X.

To reiterate: for each pair of members of B, one is an initial segment of the other.
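
When A is finite, the chains described above can be generated mechanically, which may help in picturing the condition K = J ∪ {f(A − J)}. The following Python sketch is only an illustration under stated assumptions: the names `chain_from_choice` and `induced_order` are hypothetical, and the choice function used (Python's built-in `max`) is just one convenient example. For finite sets no axiom is needed, so this shows the shape of the construction and not its logical content.

```python
from typing import Callable, FrozenSet, List


def chain_from_choice(A: FrozenSet[int],
                      f: Callable[[FrozenSet[int]], int]) -> List[FrozenSet[int]]:
    """Return the chain whose members K satisfy K = J ∪ {f(A − J)},
    where J is the union of the sets that come before K."""
    chain: List[FrozenSet[int]] = []
    J: FrozenSet[int] = frozenset()      # union of the sets chosen so far
    while J != A:
        picked = f(A - J)                # f chooses from the nonempty set A − J
        if picked not in A - J:
            raise ValueError("f is not a choice function")
        K = J | {picked}                 # exactly one more member than J
        chain.append(K)
        J = K
    return chain


def induced_order(chain: List[FrozenSet[int]]) -> List[int]:
    """List the elements in the order in which they first appear in the chain."""
    order: List[int] = []
    J: FrozenSet[int] = frozenset()
    for K in chain:
        (new_element,) = K - J           # each step adds a single element
        order.append(new_element)
        J = K
    return order


# Example with an assumed choice function: always pick the largest element.
A = frozenset({1, 3, 4, 5})
chain = chain_from_choice(A, max)
order = induced_order(chain)
```

In this example the first set of the chain is {f(A)} = {5}, each later set has exactly one more member than its predecessor, and the order read off from the chain is 5, 4, 3, 1.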
Now let S be the union of all the members of B. Each set in S comes from a member of B and since one of any pair of members of B is an initial segment of the other we conclude that S itself is a chain, and well ordered too. Let J denote the union of all the sets in S. If J ≠ A then we could extend S to S ∪ {J ∪ {f(A − J)}} which satisfies the conditions for membership in B but is strictly longer than its longest member, a contradiction. We conclude that J = A.

So we can use S to create an order on A. If a and b are members of A there is a least member S_a of S containing a and a least member S_b containing b. Declare a ≤ b precisely when S_a ⊂ S_b. If J is the union of the sets in the initial segment determined by S_a then a ∉ J so it must be that a = f(A − J). So this makes A into a totally ordered set. Further, if ∅ ≠ T ⊂ A then the collection of all of the S_t with t ∈ T has a least member, which produces a least member of T. So the order on A is a well order.

We conclude that the existence of a choice function on P(A) − {∅} implies that A can be well ordered. So AC implies Zermelo's Theorem.

Once we accept the Axiom of Choice, as we will do throughout this book, well ordered sets are plentiful and can be used.

At this point we have shown the following implications among the conditions which we claim to be equivalent to the Axiom of Choice.

        Zorn  ⇐⇒  Kuratowski
         ⇓
         AC   ⇐⇒  Zermelo

The Principles of Induction and Recursive Definition are incredibly powerful and useful techniques, extending the idea of Induction on the natural numbers to many more well ordered sets and situations more varied than merely checking if an indexed set of statements are all true. The methods are detailed in Section 14. It is important to note, and the reader should check, that the proof of the version of Recursive Definition we use here does not require AC.

We will now use Induction and Recursive Definition to show that Zermelo's Theorem implies Kuratowski's Lemma, thereby proving that any of the four conditions listed above implies the others.

The discussion below is a typical usage of this type of argument. It uses first a recursive definition to deduce that a certain function exists, and then induction to confirm various properties of that function.

We suppose we have a chain in a partially ordered set. We will line up the members of the set not already in the chain and test them one at a time. When it is an element's turn, if it can be added to yield a bigger chain than we have up to that point we select it. Otherwise we discard it. Then we go on to the next element and repeat until we have exhausted the possibilities. The result is a maximal chain. A rigorous justification can be produced after digesting the result in 14.3.

Assume Zermelo's Theorem to be true, and suppose H is a nonempty chain in set K with partial order ⪯. Suppose B = K − H is nonempty. There is a well order ≤ for B. Since we have two orders in hand, we will use prefixes to describe which order is in use. We will let I_β stand for a ≤-initial segment for any β ∈ B.

Suppose y is a fixed element of H. For the ≤-first element α of B, let P(α) equal α if H ∪ {α} is a ⪯-chain, and let P(α) be y otherwise.

Having found P(β) for all β ∈ B with α ≤ β < γ for some γ ∈ B, define P(γ) to be γ if H ∪ {γ} ∪ P(I_γ) is a ⪯-chain, and let P(γ) be y otherwise.

This serves to define P(γ) for each γ ∈ B.

H ∪ P(B) must be a ⪯-chain: if not it must contain two ⪯-incomparable members s and t, which cannot both be in H. If one of the two, say s, is in H then there is a β ∈ B with P(β) = t = β. But then H ∪ {β} ∪ P(I_β) is not a chain, violating the defining condition for P(β). A similar contradiction occurs if neither s nor t is in H, by examining the point at which the second of the two points would have been added. So in fact H ∪ P(B) must be a ⪯-chain.

No additional members of K can be added to H ∪ P(B) without causing the resulting set to fail to be a ⪯-chain: once again, letting γ be the ≤-least member of B which could be added, if any, yields a contradiction. That member would have been added at stage γ.

So H ∪ P(B) is a maximal ⪯-chain in K, and Kuratowski's Lemma holds.

8.1. Exercise. Fill in the details of a direct proof using Induction and Recursive Definition that Zermelo's Theorem implies Zorn's Lemma. We assume that K is a set with partial order ⪯ for which every chain has an upper bound. We assume also that K has a well order ≤ with ≤-first member α.

We would like to conclude that K has a ⪯-maximal element.

Let α denote the ≤-first member of K and define G(α) = α. Having defined G on the ≤-initial segment I_β for β > α, let G(β) = β if β is a ⪯-upper bound for G(I_β), and otherwise let G(β) = α.

Show that G(K) is a chain and that G(K) has a ⪯-last member, which is ⪯-maximal in K.
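
The "test the elements one at a time" construction used above to extend a chain H to a maximal chain can be tried out on a finite partially ordered set, where the well order is just the order of a list. The sketch below is a hedged illustration, not part of the proof: the function names `extend_to_maximal_chain` and `divides` and the example data are assumptions introduced here. A discarded element is simply skipped, whereas the text sends it to a fixed member y of H; the resulting chain is the same.

```python
from typing import Callable, Iterable, Set


def extend_to_maximal_chain(well_ordered: Iterable[int],
                            leq: Callable[[int, int], bool],
                            H: Set[int]) -> Set[int]:
    """Test the elements one at a time, in the given order, keeping an
    element only if it is comparable with everything kept so far."""
    chain = set(H)
    for x in well_ordered:
        if x in chain:
            continue
        if all(leq(x, c) or leq(c, x) for c in chain):
            chain.add(x)                 # x extends the chain, so select it
        # otherwise x is discarded (the text maps it to a fixed y in H)
    return chain


def divides(a: int, b: int) -> bool:
    """Partial order used in the example: a precedes b when a divides b."""
    return b % a == 0


# Example: divisors of 12 ordered by divisibility, starting chain H = {2}.
K = [1, 2, 3, 4, 6, 12]                  # list order plays the role of the well order
maximal = extend_to_maximal_chain(K, divides, {2})
```

With this testing order the elements 3 and 6 are discarded and the chain {1, 2, 4, 12} results. A different testing order can produce a different maximal chain containing H, such as {1, 2, 6, 12}.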

8.2. Exercise. (i) An axiom equivalent to our Axiom of Choice is produced if we add to that axiom the condition that A_α ∩ A_β = ∅ whenever α ≠ β.

(ii) Consider the statement: "Whenever B is a nonempty set of nonempty pairwise disjoint sets, there is a set S for which S ∩ x contains a single element for each x ∈ B." Show that this statement is equivalent to the Axiom of Choice.

(iii) Let B be a (nonempty) set of sets. B is said to have finite character provided that A ∈ B ⇔ every finite subset of A is in B. Tukey's Lemma states that every set of sets of finite character has a maximal member: a set not contained in any other member. Show that Tukey's Lemma is equivalent to the Axiom of Choice. (hint: To prove that Tukey's Lemma implies the Axiom of Choice examine the set of partial choice functions and note that it satisfies the conditions of Tukey's Lemma.)

The use of AC in the formation of mathematical arguments has historically been the subject of controversy centered around the nebulous nature of the objects whose existence is being asserted in each case. In applications the axioms of set theory are usually used to affirm the existence of one precisely defined set whose elements share an explicit property. That is less obviously the case when AC is invoked.

Applications which require less than the full strength of AC are common. In an effort to control, or at least record, how the axiom is being used, weaker variants have been created. Some mathematicians award "style points" to proofs using one of these, or which avoid AC altogether. We list two of these weaker versions of AC below.

The Axiom of Dependent Choice: If X is a nonempty set and R ⊂ X × X is a relation with domain all of X, then there is a sequence r : N → X for which (r_k, r_{k+1}) ∈ R ∀ k ∈ N.

The Axiom of Countable Choice: If X is a nonempty set and A : N → P(X) is a sequence of nonempty subsets of X then there is a sequence f : N → X such that f(n) ∈ A_n ∀ n ∈ N.

These axioms are frequently abbreviated to DC and ACω, respectively.

8.3. Exercise. (i) Prove the implications AC ⇒ DC ⇒ ACω.

(ii) Suppose X is infinite. For each k ∈ N let S_k denote the set of all subsets of X which have 2^k elements. ZF alone implies that S_k is nonempty for each k, and you may assume this. Let S denote the set of all the S_k. Use ACω twice to prove that there is a one-to-one function f : Y → N for an infinite subset Y of X. Any set Y (infinite or not) for which there is a function of this kind is called countable, and the result here may be paraphrased as "Any infinite set has an infinite countable subset in ZF+ACω."

(iii) Sometimes the use of an axiom, particularly a variant of the Axiom of Choice, is hard to spot in an argument. It seems so reasonable, it is hard to see you are assuming anything. The theorem that "The union of a countable set of countable sets is countable." is an example.

Suppose A is a countable set of countable sets, and let B denote the union of all the members of A. Because A is countable, there exists one-to-one T : A → N. Because each member of A is countable, for each nonempty set S ∈ A there is a nonempty set F_S consisting of all one-to-one functions from S to N. Using T, this collection of sets of functions is seen to be countable, so ACω guarantees that we can pick a function from each. It is easy to overlook this step, and merely assert "Because each member of A is countable there exists one-to-one G_S : S → N for each S ∈ A." and get on with the discussion using these selected functions. But it is ACω which endorses this selection.

To finish the argument, for each x ∈ B we let A_x = {S ∈ A | x ∈ S} and define i_x to be the least integer in T(A_x). We define W_x to be that member of A_x with T(W_x) = i_x. The function H : B → N given by

    H(x) = 2^{i_x} · 3^{G_{W_x}(x)}

is one-to-one, so B is countable.
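
The formula for H can be checked by hand on a small finite family, which also makes the roles of T, W_x and G_S concrete. The Python sketch below is illustrative only: the function name `union_injection`, the dictionary encoding of T and of the selected functions G_S, and the sample sets are all assumptions made for this example, and indices are taken to start at 0.

```python
from typing import Dict, FrozenSet, List


def union_injection(family: List[FrozenSet[str]],
                    T: Dict[FrozenSet[str], int],
                    G: Dict[FrozenSet[str], Dict[str, int]]) -> Dict[str, int]:
    """Compute H(x) = 2^(i_x) * 3^(G_{W_x}(x)) for every x in the union."""
    H: Dict[str, int] = {}
    B = set().union(*family)                   # union of all members of the family
    for x in B:
        A_x = [S for S in family if x in S]    # the members that contain x
        W_x = min(A_x, key=lambda S: T[S])     # the member with least index T(S)
        i_x = T[W_x]
        H[x] = 2 ** i_x * 3 ** G[W_x][x]
    return H


# Example family of two overlapping sets.
S0 = frozenset({"a", "b"})
S1 = frozenset({"b", "c"})
T = {S0: 0, S1: 1}                             # one-to-one indexing of the family
G = {S0: {"a": 0, "b": 1},                     # one selected injection per member
     S1: {"b": 0, "c": 1}}
H = union_injection([S0, S1], T, G)
```

Here H sends a, b, c to 1, 3, 6. Distinct elements receive distinct values because the exponent of 2 identifies W_x and the exponent of 3 identifies x within W_x.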

8.4. Exercise. (i) Any chain in a tree is contained in a branch.

(ii) Prove König's Tree Lemma: If S is an infinite rooted tree but each t ∈ S is the immediate predecessor of only finitely many members of S then S has an infinite branch. (hint: Let K denote those members of S with an infinite number of successors and for each t ∈ K let M_t = T_t ∩ K − {t}. Let f denote a choice function for these sets: f(t) ∈ M_t ∀ t ∈ K. Use induction on N to create an infinite chain.)

The next section contains another important consequence of the Axiom of Choice. Many more can be found scattered in appendices and chapters throughout this book. Those who want a slightly more detailed look at the ZF axioms can find them listed in Sections 11 and 13. The discussions there are rudimentary but, I hope, a practical guide providing a taste of modern set theory.

9. Nets and Filters

Suppose r : J → X is a net. Recall that this means that J is preordered and there is an upper bound in J for each two-element subset of J.

If A ⊂ X, r is said to be in A if r(J) ⊂ A. r is said to be eventually in A if there is a terminal segment T_α ⊂ J such that r(T_α) ⊂ A. r is said to be frequently in A if r(T_α) ∩ A ≠ ∅ ∀ terminal segments T_α in J. Obviously, if r is eventually in A then r is frequently in A.

A subnet of r is a net s : K → X such that ∃ f : K → J for which s = r ∘ f and ∀ m ∈ J ∃ n ∈ K such that f(T_n) ⊂ T_m. Note that f is not presumed to be nondecreasing. It is simply eventually in any terminal segment of J. A subnet of a subnet is also a subnet of the original net.

A net in a set X is called universal if the net is eventually in A or eventually in A^c for all A ∈ P(X).

9.1. . Each net r : D → X has a universal subnet.