20 LARRY SUSANKA

When $G \subset P(X)$, we will use $S(G)$ to denote the set of simple functions constructed from the sets in $G$.

A function that has constant range value $t$ on its whole domain will sometimes be denoted $t$, with this usage (and the domain) taken from context. Thus, for example, $\chi_X$ is sometimes denoted by $1$ and $0\chi_X$ by $0$, in yet another use of each of those symbols.

When $H$ is a subset of $[-\infty, \infty]^X$, we will use $B(H)$ to denote the bounded members of $H$:
$f \in B(H) \Leftrightarrow f \in H$ and $\exists\, a \in \mathbb{R}$ with $0 \le a < \infty$ and $-a \le f \le a$.

If $X$ is a topological space, let $C(X)$ denote the continuous functions from $X$ to $\mathbb{R}$.

$\mathbb{R}^X$, $B(\mathbb{R}^X)$ and, when $X$ has a topology, $C(X)$, are all vector lattices: real vector spaces and lattices. They are also commutative rings with multiplicative identity $\chi_X$.

7.1. Exercise. $S(G)$ is obviously a (possibly empty) vector space. Give conditions on $G$ under which $S(G)$ is a vector lattice and a commutative ring with multiplicative identity $\chi_X$.

8. The Axiom of Choice

In this section we will discuss an axiom of set theory, the Axiom of Choice.

Every human language has grammar and vocabulary, and people communicate by arranging the objects of the language in patterns. We imagine that our communications evoke similar, or at least related, mental states in others. We also use these patterns to elicit mental states in our “future selves,” as reminders of past imaginings, so that we can start at a higher level in an ongoing project and not have to recreate each concept from scratch should we return to a task. It is apparent that our brains are built to do this.

But words are all defined in terms of each other. Ultimate meaning, if there is any to be found, is derived from pointing out the window at instances in the world, or from introspection. Very often ambiguity or multiple meaning of a phrase is the point of a given communication, and provides the richness and subtlety characteristic of poetry, for instance, or the beguiling power of political speech.
Set theory is a language mathematicians have invented to encode mathematics. But unlike most human languages, this language does everything possible to avoid blended meaning, to expose the logical structure of statements and keep the vocabulary of undefined terms to an absolute minimum. Many mathematicians believe what they do is “art.” But ambiguity and internal discord are not part of our particular esthetic ensemble.

Most mathematicians believe that, though set theory may be unfinished, it serves its purpose well. Virtually all mathematical structures can be successfully modeled in set theory, to the extent that most mathematicians never think of any other way of speaking or writing.

CARDINALS AND ORDINALS 21

Together, the collection of axioms (which, along with logical conventions, defines the language) normally used by most mathematicians is called the Zermelo-Fraenkel Axioms, or simply ZF, and the set theory that arises from these axioms is called Zermelo-Fraenkel Set Theory. You saw explicit mention of two axioms from ZF, the Axiom of Infinity and the Axiom of the Empty Set, in Section 5. We have used others without mention on almost every page. For example, we have formed power sets.

The Axiom of the Power Set: For any set $A$ there is a set $P(A)$ consisting of all, and only, the subsets of $A$.

Asserting the existence of a set with this feature is a dramatic and “non-constructive” thing to do, particularly when the underlying set is infinite. We are not told how to create this set. We just have a means of recognizing whether a set we have in hand is a member of this power set or not.

And where, exactly, did that first infinite set come from? The Axiom of Infinity brings it into existence, out of nothing, simply because mathematicians want infinite sets and this seems to be a logically consistent way to produce them.

There is another extremely useful, and arguably even less constructive, axiom which we discuss now.
We will present and presume to be true, wherever convenient, the four equivalent and useful statements below, one of which is called the Axiom of Choice. This axiom is frequently abbreviated to AC. The collection of the axioms of standard set theory plus this axiom is frequently denoted ZFC.

The equivalence of the Axiom of Choice and the other three statements, together with the history associated with them, is a fascinating story which deserves study by every serious student of mathematics.

The Axiom of Choice: If $J$ and $X$ are sets and $A\colon J \to P(X)$ is an indexed collection of nonempty sets then there is a function $f\colon J \to X$ such that $f(\beta) \in A_\beta$ $\forall\, \beta \in J$. A function with this property is called a choice function for $A$.

Essentially, this axiom states that given any generic set $S$ of nonempty sets, there is a way of selecting one element from each member of $S$. The other axioms do not imply that such a selection can be made, unless every member of $S$ has an element with some unique property, which would allow it to be singled out.

Zorn’s Lemma: If $S$ is a set with a partial order and if every chain in $S$ possesses an upper bound in $S$, then $S$ has a maximal member.

Zermelo’s Theorem: Every nonempty set can be well ordered.

Kuratowski’s Lemma: Each chain in a partially ordered set $S$ is contained in a maximal chain in $S$ (that is, a chain in $S$ not properly contained in any other chain in $S$).

Kuratowski’s Lemma is also often called The Hausdorff Maximal Principle.

That Zorn’s Lemma implies Kuratowski’s Lemma is immediate. Suppose $S$ is a set with a partial order and $C$ is a chain in $S$. Let $W$ denote the set of all chains in $S$ which contain the chain $C$, ordered by containment. Any nonempty chain in $W$ is bounded above by the union of its members, which is again a chain in $S$ containing $C$ (and the empty chain in $W$ is bounded above by $C$ itself), so Zorn’s Lemma implies that $W$ contains a maximal member. That maximal member is a chain in $S$ not properly contained in any other chain in $S$.
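For a finite partially ordered set the conclusion of Kuratowski’s Lemma can be checked directly, with no appeal to Zorn’s Lemma or to choice. The following is only a minimal sketch under that finiteness assumption; the names `is_chain`, `extend_to_maximal_chain` and `divides`, and the example of the divisors of 12 ordered by divisibility, are illustrative and do not come from the text.

```python
def is_chain(elems, leq):
    """True when every two members of elems are comparable under leq."""
    return all(leq(a, b) or leq(b, a) for a in elems for b in elems)


def extend_to_maximal_chain(S, leq, C):
    """Greedily extend the chain C to a maximal chain in the finite poset S.

    One pass suffices: an element rejected once is incomparable to some
    member of the chain, and the chain only grows afterwards.
    """
    chain = set(C)
    for x in S:
        if is_chain(chain | {x}, leq):
            chain.add(x)
    return chain


def divides(a, b):
    """Hypothetical partial order for the example: a <= b when a divides b."""
    return b % a == 0


S = [1, 2, 3, 4, 6, 12]          # divisors of 12
C = {2}                          # a chain we want to extend
maximal = extend_to_maximal_chain(S, divides, C)
print(sorted(maximal))
```

The loop visits the elements in the order in which the list `S` presents them, so a different listing may produce a different maximal chain containing `C`. For an infinite set there is in general no such listing to walk through, which is where Zorn’s Lemma does the work in the argument above.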
On the other hand, assuming Kuratowski’s Lemma to be true, suppose $S$ is a set with a partial order and that every chain in $S$ possesses an upper bound in $S$. This time let $W$ denote the set of all chains in $S$. Let $X$ denote a maximal member of $W$. So $X$ is a chain in $S$ not properly contained in any other chain. Let $M$ be any upper bound for $X$. By maximality of $X$, $M$ must actually be in $X$ and cannot be less than any other member of $S$: that is, $M$ is maximal in $S$. So Zorn’s Lemma is true.

In the last two paragraphs we have shown that Zorn’s Lemma and Kuratowski’s Lemma are equivalent statements.

We will now show that Zorn’s Lemma implies AC. Suppose $S$ is any nonempty set of nonempty sets and $X$ is the union of all the sets in $S$. Let $B = S \times X$. Now let $Q$ denote the set of all members of $P(B)$ which are choice functions on their domains: that is, $T \in Q$ exactly when $T$ is nonempty and there is at most one ordered pair in $T$ whose first component is any particular member of $S$, and also $s \in A$ whenever $(A, s) \in T$. These are called “partial choice functions.” Order $Q$ by containment. The union of any nonempty chain in $Q$ is a member of $Q$, so Zorn’s Lemma implies that $Q$ has a maximal member. This maximal member is a choice function on its domain, which must by maximality be all of $S$ (otherwise a pair $(A, s)$, with $A$ outside the domain and $s \in A$, could be added to it).

The fact that Zermelo’s Theorem implies AC is also straightforward: given any nonempty set $\mathcal{S}$ of nonempty sets, well order the set $X = \bigcup_{S \in \mathcal{S}} S$. For each $S \in \mathcal{S}$ let $f(S)$ be the least element of $S$ with respect to this ordering. Then $f$ is the requisite choice function.

The opposite implication is a bit trickier. It involves using a choice function to create the well order.

Suppose the set $A$ has more than one element and $f\colon P(A) - \{\emptyset\} \to A$ is a choice function: that is, $f(B) \in B$ whenever $\emptyset \ne B \subset A$.

Let $\mathcal{B}$ denote the set of all nonempty containment-chains in $P(A) - \{\emptyset\}$ which are well ordered and satisfy the condition:

Whenever $I_K$ is an initial segment of one of these chains and $J$ is the union of all the sets in $I_K$, then $J \ne A$ and $K = J \cup \{ f(A - J) \}$.

$\mathcal{B}$ is nonempty: for example, $\{\, \{f(A)\},\ \{f(A),\, f(A - \{f(A)\})\} \,\}$ is in $\mathcal{B}$.

The condition above implies that each of the chains in $\mathcal{B}$ must start with the set $\{f(A)\}$, and the successor to any set $K$ in such a chain (if, of course, $K$ is not the last set in the chain) has exactly one more member than does $K$. It also implies directly that if two different chains $X$ and $W$ of this kind have a common initial segment, so that $I_K \subset X$ and $I_G \subset W$ and $I_K = I_G$, then $K = G$. In other words, the least successor of an initial segment is determined by the sets in the initial segment, and not by the specific chain within which the initial segment sits.

Suppose that $X$ is one of these chains. We will call $S$ a “starting chunk” of $X$ if $\emptyset \ne S \subset X$ and whenever $B, C \in X$ the condition $B \in S$ and $C \subset B$ implies $C \in S$.
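The rule $K = J \cup \{f(A - J)\}$ can be traced by hand on a small finite set. The sketch below is only an illustration of that rule, not the argument of the text: it assumes $A$ is finite, and the choice function `f` (taking the largest element) and the helper names `build_chain` and `induced_order` are hypothetical.

```python
def build_chain(A, f):
    """Return the chain of sets obtained by repeatedly adjoining f(A - J).

    J is the union of the sets produced so far (empty at the start), and
    each new set is K = J | {f(A - J)}. Assumes A is finite, so the loop
    stops when J reaches A.
    """
    A = frozenset(A)
    chain = []
    J = frozenset()
    while J != A:
        K = J | {f(A - J)}
        chain.append(K)
        J = K
    return chain


def induced_order(chain):
    """List the elements of A in the order in which the chain adjoins them."""
    order = []
    seen = frozenset()
    for K in chain:
        (new,) = K - seen        # each set adds exactly one new element
        order.append(new)
        seen = K
    return order


def f(B):
    """Hypothetical choice function on nonempty finite sets of integers."""
    return max(B)


A = {3, 1, 4, 5, 9}
chain = build_chain(A, f)
order = induced_order(chain)
print([sorted(K) for K in chain])
print(order)
```

The first set produced is `{f(A)}` and each later set has exactly one more member than its predecessor, matching the two consequences of the condition noted above. Every initial piece of the list `chain` is a “starting chunk” in the sense just defined.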