
Appendix A Set Theory

A.1 Sets

The purpose of this appendix is to introduce some of the basic ideas and terminology from set theory which are essential to our present work. In this naive treatment, we do not intend to give a complete and precise analysis of set theory, which belongs to the foundations of mathematics and to mathematical logic. Rather, we shall deal with sets on an intuitive basis. We remark that this usage can be formally justified.

Intuitively, a set is a collection of objects called members of the set. The terms “collection” and “family” are used as synonyms for “set”. If an object x is a member of a set X, we write x ∈ X to express this fact. Implicit in the idea of a set is the notion that a given object either belongs or does not belong to the set. The statement “x is not a member of X” is indicated by x ∉ X. The objects which make up a set are usually called the elements or points of the set.

Given two sets A and X, we say that A is a subset of X if every element of A is also an element of X. In this case, we write A ⊆ X (read “A is contained in X”) or X ⊇ A (read “X contains A”). The sets A and X are said to be equal, written A = X, if A ⊆ X and X ⊆ A. We call A a proper subset of X (written A ⊂ X) if A ⊆ X but A ≠ X. The empty or null set, which has no element, is denoted by ∅. We have the inclusion ∅ ⊆ X for every set X. The set whose only element is x is called a singleton, denoted by {x}. Note that ∅ ≠ {∅}. If P(x) is a statement about elements in X that is either true or false for a given element of X, then the subset of all the x ∈ X for which P(x) is true is denoted by {x ∈ X | P(x)} or {x ∈ X : P(x)}.

In a particular mathematical discussion, there is usually a set which consists of all primary elements under consideration. This set is referred to as the “universe”. To avoid any logical difficulties, all the sets we consider in this section are assumed to be subsets of the universe X.

The difference of two sets A and B, denoted by A − B, is the set {x ∈ A | x ∉ B}. If B ⊆ A, the complement of B in A is A − B. Notice that the complement operation is defined only when one set is contained in the other, whereas the difference operation does not have such a restriction. The union of two sets A and B is the set A ∪ B = {x | x belongs to at least one of A, B}. The intersection of two sets A and B is the set A ∩ B = {x | x belongs to both A and B}. When A ∩ B = ∅, the sets A and B are called disjoint; otherwise we say that they intersect.

© Springer Nature Singapore Pte Ltd. 2019. T. B. Singh, Introduction to Topology, https://doi.org/10.1007/978-981-13-6954-4

Proposition A.1.1 (a) X − (X − A) = A. (b) B ⊆ A ⇔ X − A ⊆ X − B. (c) A = B ⇔ X − A = X − B.

Proposition A.1.2 A ∪ A = A = A ∩ A.

Proposition A.1.3 A ∪ B = B ∪ A, A ∩ B = B ∩ A.

Proposition A.1.4 A ⊆ B ⇔ A = A ∩ B ⇔ B = A ∪ B.

Proposition A.1.5 (a) A ∪ (B ∪ C) = (A ∪ B) ∪ C. (b) A ∩ (B ∩ C) = (A ∩ B) ∩ C.

Proposition A.1.6 (a) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C). (b) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

Proposition A.1.7 (a) A − B = A ∩ (X − B). (b) (De Morgan’s laws) X − (A ∪ B) = (X − A) ∩ (X − B) and X − (A ∩ B) = (X − A) ∪ (X − B). (c) (A − B) ∪ (B − A) = (A ∪ B) − (A ∩ B). (d) If X = A ∪ B and A ∩ B = ∅, then B = X − A.
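The identities of Proposition A.1.7 (a)-(c) can be checked mechanically on a small universe. The following Python sketch is only an illustration, not a proof: it tests every pair of subsets of a finite set, and the helper names `subsets` and `identities_hold` are hypothetical, introduced here for the example.

```python
from itertools import combinations

def subsets(universe):
    """Return every subset of a finite universe as a frozenset."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def identities_hold(universe):
    """Check Proposition A.1.7 (a)-(c) for all pairs A, B of subsets of X."""
    X = frozenset(universe)
    for A in subsets(X):
        for B in subsets(X):
            if A - B != A & (X - B):                      # (a)
                return False
            if X - (A | B) != (X - A) & (X - B):          # (b) De Morgan
                return False
            if X - (A & B) != (X - A) | (X - B):          # (b) De Morgan
                return False
            if (A - B) | (B - A) != (A | B) - (A & B):    # (c)
                return False
    return True

print(identities_hold({1, 2, 3, 4}))
```

A finite check of this kind only confirms the identities for the chosen universe; the general statements still follow from the definitions of union, intersection, and difference.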

Let J be a nonempty set, and suppose that a set A_j is given for each j ∈ J. Then the collection of sets {A_j | j ∈ J}, also written as {A_j}_{j∈J}, is called an indexed family of sets, and J is called an indexing set for the family. The union of an indexed family {A_j | j ∈ J} of subsets of a set X is the set

⋃_{j∈J} A_j = {x ∈ X | x ∈ A_j for some j in J},

and the intersection is the set

⋂_{j∈J} A_j = {x ∈ X | x ∈ A_j for every j in J}.

The union of the sets A_j is also denoted by ⋃{A_j | j ∈ J}, and their intersection by ⋂{A_j | j ∈ J}. If there is no ambiguity about the indexing set, we simply use ⋃A_j for the union and ⋂A_j for the intersection. If J = {1, ..., n}, n > 0 an integer, then we write ⋃_{j=1}^{n} A_j for the union of A_1, ..., A_n, and ⋂_{j=1}^{n} A_j for their intersection.

Observe that any nonempty collection C of sets can be considered an indexed family of sets by “self-indexing”: the indexing set is C itself and one assigns to each S ∈ C the set S. Accordingly, the foregoing definitions become

⋃{S : S ∈ C} = {x | x ∈ S for some S in C}

and

⋂{S : S ∈ C} = {x | x ∈ S for every S in C}.

If we allow the collection C to be the empty set, then, by convention, ⋃{S : S ∈ C} = ∅ and ⋂{S : S ∈ C} = X, the specified universe of the discourse.

Proposition A.1.8 Let A_j, j ∈ J, be a family of subsets of a set X. Then we have
(a) X − ⋃_j A_j = ⋂_j (X − A_j), and X − ⋂_j A_j = ⋃_j (X − A_j).
(b) If K ⊂ J, then ⋃_{k∈K} A_k ⊆ ⋃_{j∈J} A_j and ⋂_{k∈K} A_k ⊇ ⋂_{j∈J} A_j.
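As a small illustration of indexed families, the sketch below models a family as a Python dictionary from indices to sets and checks Proposition A.1.8 (a) on one example. The function names and the sample family are assumptions made for this example only; the empty-family convention from the text (empty union is ∅, empty intersection is the universe) is built in.

```python
from functools import reduce

def family_union(family):
    """Union of an indexed family given as {index: set}; empty family -> empty set."""
    return frozenset().union(*family.values())

def family_intersection(family, universe):
    """Intersection of an indexed family; empty family -> the universe (by convention)."""
    return reduce(lambda acc, s: acc & s, family.values(), frozenset(universe))

X = frozenset(range(10))
family = {"evens": {0, 2, 4, 6, 8}, "small": {0, 1, 2, 3}, "threes": {0, 3, 6, 9}}

# Proposition A.1.8 (a): the complement of the union is the intersection of the complements.
lhs = X - family_union(family)
rhs = family_intersection({j: X - A for j, A in family.items()}, X)
print(sorted(lhs), sorted(rhs))
```

Passing the universe explicitly to `family_intersection` is a design choice that makes the empty-intersection convention well defined.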

A.2 Functions

With each two objects x, y, there corresponds a new object (x, y), called their ordered pair. This is another primitive notion that we will use without a formal definition. Ordered pairs are subject to the condition: (x, y) = (x′, y′) ⇔ x = x′ and y = y′. Accordingly, (x, y) = (y, x) ⇔ x = y. The first (resp. second) element of an ordered pair is called the first (resp. second) coordinate. Given two sets X_1 and X_2, their Cartesian product X_1 × X_2 is defined to be the set of all ordered pairs (x_1, x_2), where x_j ∈ X_j for j = 1, 2. Thus X_1 × X_2 = {(x_1, x_2) | x_j ∈ X_j for every j = 1, 2}. Note that X_1 × X_2 = ∅ if and only if X_1 = ∅ or X_2 = ∅. When both X_1 and X_2 are nonempty, X_1 × X_2 = X_2 × X_1 ⇔ X_1 = X_2.

Let X and Y be two sets. A function f from the set X to Y (written f : X → Y) is a subset of X × Y with the following property: for each x ∈ X, there is one and only one y ∈ Y such that (x, y) ∈ f. A function is also referred to as a mapping (or briefly, a map). We write y = f(x) to denote (x, y) ∈ f, and say that y is the image of x under f or the value of f at x. We also say that f maps (or carries) x into y or f sends (or takes) x to y. A function f from X to Y is usually defined by specifying its value at each x ∈ X, and if the value at a typical point x ∈ X is f(x), we write x ↦ f(x) to give f. We refer to X as the domain, and Y as the codomain of f. The set f(X) = {f(x) | x ∈ X}, also denoted by im(f), is referred to as the range of f.

The identity function on X, which sends every element of X to itself, is denoted by 1 or 1_X. A map c : X → Y which sends every element of X to a single element of Y is called a constant function. Notice that the range of a constant function consists of just one element. If A ⊂ X, the function i : A → X, a ↦ a, is called the inclusion map of A into X. If f : X → Y and A ⊂ X, then the restriction of f to A is the function f|A : A → Y defined by (f|A)(a) = f(a) ∀ a ∈ A. The other way around, if A ⊂ X and g : A → Y is a function, then an extension of g over X is a function G : X → Y such that G|A = g. The inclusion map i : A → X is the restriction of the identity map 1 on X to A.

Proposition A.2.1 Let X be a set and {A_j | j ∈ J} a family of subsets of X with X = ⋃A_j. If, for each j ∈ J, f_j : A_j → Y is a function such that f_j|(A_j ∩ A_k) = f_k|(A_j ∩ A_k) for all j, k ∈ J, then there exists a unique function F : X → Y which extends each f_j.

Proof Given x ∈ X, there exists an index j ∈ J such that x ∈ A_j. We put F(x) = f_j(x) if x ∈ A_j. If x ∈ A_j ∩ A_k, then f_j(x) = f_k(x), by our hypothesis. Thus F(x) is uniquely determined by x, and we have a single-valued function F : X → Y. It is clearly an extension of each f_j. The uniqueness of F follows from the fact that each x ∈ X belongs to some A_j; consequently, any function X → Y which agrees with each f_j on A_j will have to assume the value f_j(x) at x. ♦

We say that a function f : X → Y is surjective (or a surjection or onto) if Y = f(X). If f(x) ≠ f(x′) for every x ≠ x′, then we say that f is injective (or an injection or one-to-one). A function is bijective (or a bijection or a one-to-one correspondence) if it is both injective and surjective. If f : X → Y and g : Y → Z are functions, then the function X → Z which maps x into g(f(x)) is called their composition and denoted by g ◦ f or simply gf.

Proposition A.2.2 Suppose that f : X → Y and g : Y → X satisfy gf = 1_X. Then f is injective and g is surjective.

Given a function f : X → Y, a function g : Y → X such that gf = 1_X and fg = 1_Y is called an inverse of f. If such a function g exists, then it is unique and we denote it by f⁻¹. It is clear from the preceding result that f has an inverse if and only if it is a bijection. Also, it is evident that (f⁻¹)⁻¹ = f.

Let f : X → Y be a function. For a set A ⊆ X, the subset

f(A) = {f(x) | x ∈ A}

of Y is called the image of A under f and, for a set B ⊆ Y, the subset

f⁻¹(B) = {x ∈ X | f(x) ∈ B}

of X is called the inverse image of B in X under f. The power set of a set X is the family P(X) of all subsets of X. A function f : X → Y induces a function P(X) → P(Y), A ↦ f(A). It also induces a function P(Y) → P(X), B ↦ f⁻¹(B). The main properties of these functions are described in the following.

Proposition A.2.3 Let f : X → Y be a function.
(a) A_1 ⊆ A_2 ⇒ f(A_1) ⊆ f(A_2),
(b) f(⋃_j A_j) = ⋃_j f(A_j),
(c) f(⋂_j A_j) ⊆ ⋂_j f(A_j), and
(d) Y − f(A) ⊆ f(X − A) ⇔ f is surjective.

Proposition A.2.4 Let f : X → Y be a function.
(a) B_1 ⊆ B_2 ⇒ f⁻¹(B_1) ⊆ f⁻¹(B_2),
(b) f⁻¹(⋃_j B_j) = ⋃_j f⁻¹(B_j),
(c) f⁻¹(⋂_j B_j) = ⋂_j f⁻¹(B_j), and
(d) f⁻¹(Y − B) = X − f⁻¹(B).

Proposition A.2.5 Let f : X → Y be a function. Then:
(a) For each A ⊆ X, f⁻¹(f(A)) ⊇ A; in particular, f⁻¹(f(A)) = A if f is injective.
(b) For each B ⊆ Y, f(f⁻¹(B)) = B ∩ f(X); in particular, f(f⁻¹(B)) = B if f is surjective.
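The behaviour of images and inverse images can be seen on a tiny example. In the Python sketch below, a function between finite sets is represented as a dictionary, and the helpers `image` and `preimage` are hypothetical names chosen for this illustration. The sample function is deliberately neither injective nor surjective, so the inclusions in Propositions A.2.3 (c) and A.2.5 turn out to be strict.

```python
def image(f, A):
    """f(A) = {f(x) | x in A} for a function given as a dict."""
    return {f[x] for x in A}

def preimage(f, B):
    """f^{-1}(B) = {x in X | f(x) in B}, where X is the set of keys of f."""
    return {x for x in f if f[x] in B}

# f : X -> Y with X = {1, 2, 3} and Y = {"a", "b", "c"}; not injective, not surjective.
f = {1: "a", 2: "a", 3: "b"}

A1, A2 = {1}, {2}
print(image(f, A1 & A2), image(f, A1) & image(f, A2))   # A.2.3 (c): strict inclusion
print(preimage(f, image(f, {1})))                       # A.2.5 (a): contains {1} properly
print(image(f, preimage(f, {"b", "c"})))                # A.2.5 (b): equals B ∩ f(X)
```

Because inverse images are computed by filtering the domain, they commute with unions, intersections, and complements, which is the content of Proposition A.2.4.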

A.3 Cartesian Products

In the previous section, we have defined the Cartesian product of two sets. For this operation, the following statements are easily proved.

Proposition A.3.1 (a) X_1 × (X_2 ∪ X_3) = (X_1 × X_2) ∪ (X_1 × X_3). (b) X_1 × (X_2 ∩ X_3) = (X_1 × X_2) ∩ (X_1 × X_3). (c) X_1 × (X_2 − X_3) = (X_1 × X_2) − (X_1 × X_3). (d) For Y_i ⊆ X_i, i = 1, 2,

(X_1 × X_2) − (Y_1 × Y_2) = [X_1 × (X_2 − Y_2)] ∪ [(X_1 − Y_1) × Y_2] = [(X_1 − Y_1) × (X_2 − Y_2)] ∪ [Y_1 × (X_2 − Y_2)] ∪ [(X_1 − Y_1) × Y_2].
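The decomposition in Proposition A.3.1 (d) is easy to confirm on a concrete case. The following Python sketch uses `itertools.product` to build finite Cartesian products; the sets `X1`, `X2`, `Y1`, `Y2` and the helper `cart` are sample choices for illustration only.

```python
from itertools import product

def cart(A, B):
    """Cartesian product of two finite sets as a set of ordered pairs."""
    return set(product(A, B))

X1, X2 = {1, 2, 3}, {"a", "b", "c"}
Y1, Y2 = {1, 2}, {"a"}

lhs = cart(X1, X2) - cart(Y1, Y2)
mid = cart(X1, X2 - Y2) | cart(X1 - Y1, Y2)
rhs = cart(X1 - Y1, X2 - Y2) | cart(Y1, X2 - Y2) | cart(X1 - Y1, Y2)

print(lhs == mid == rhs, len(lhs))
```

The three pieces on the right-hand side are pairwise disjoint, so their sizes add up to the size of the difference.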

Proposition A.3.2 Let X_j, j ∈ J, be a family of subsets of a set X, and let Y_k, k ∈ K, be a family of subsets of a set Y. Then
(a) (⋃_j X_j) × (⋃_k Y_k) = ⋃_{j,k} (X_j × Y_k);
(b) (⋂_j X_j) × (⋂_k Y_k) = ⋂_{j,k} (X_j × Y_k).
With J = K,
(c) (⋂_j X_j) × (⋂_j Y_j) = ⋂_j (X_j × Y_j), and
(d) (⋃_j X_j) × (⋃_j Y_j) ⊇ ⋃_j (X_j × Y_j).

It is obvious that (X_1 × X_2) × X_3 ≠ X_1 × (X_2 × X_3) when the sets X_1, X_2, and X_3 are all nonempty. However, there is a canonical bijection

(X_1 × X_2) × X_3 ←→ X_1 × (X_2 × X_3)

given by ((x_1, x_2), x_3) ↔ (x_1, (x_2, x_3)). Therefore, the Cartesian product of n sets X_1, ..., X_n (n > 2) may be defined by induction as X_1 × ··· × X_n = (X_1 × ··· × X_{n−1}) × X_n. A typical element x of X_1 × ··· × X_n is written as x = (x_1, ..., x_n), where x_i ∈ X_i for every i = 1, ..., n and is referred to as the ith coordinate of x. Observe that two elements x = (x_1, ..., x_n) and y = (y_1, ..., y_n) in X_1 × ··· × X_n are equal if and only if x_i = y_i for every i. Accordingly, an element of X_1 × ··· × X_n is called an ordered n-tuple.

We now extend the definition of the Cartesian product to an arbitrary indexed family of sets {X_α | α ∈ A}. Of course, the new definition of the product of the sets X_α for α ∈ {1, 2} must reduce to the earlier notion of the Cartesian product of two sets. Observe that each ordered pair (x_1, x_2) in X_1 × X_2 may be considered as defining a function x : {1, 2} → X_1 ∪ X_2 with x(1) = x_1 and x(2) = x_2. Accordingly, the Cartesian product of the X_α is defined to be the set ∏_α X_α (or simply written as ∏X_α) of all functions x : A → ⋃_α X_α such that x(α) ∈ X_α for each α ∈ A. Occasionally, we denote the product ∏_α X_α by ∏_{α∈A} X_α or ∏{X_α | α ∈ A} to avoid any ambiguity about the indexing set. We call X_α the αth factor (or coordinate set) of ∏X_α. If x ∈ ∏X_α, then x(α) is called the αth coordinate of x, and is often denoted by x_α.

If the family {X_α} has n sets, n a positive integer, then it may be indexed by the set {1, ..., n}. In this case, we have two definitions of its Cartesian product; the one in the present sense will be denoted by ∏_{i=1}^{n} X_i. Notice that the function ∏_{i=1}^{2} X_i → X_1 × X_2, x ↦ (x_1, x_2), is a bijection. This shows that the notion of the product ∏_α X_α is a reasonable generalization of the notion of the Cartesian product of two sets. Similarly, the mapping x ↦ (x_1, ..., x_n) gives a bijection ∏_{i=1}^{n} X_i → X_1 × ··· × X_n, and allows us to identify the two products of X_1, ..., X_n. We usually call an element x ∈ ∏_{i=1}^{n} X_i an ordered n-tuple. In general, an element x ∈ ∏_{α∈A} X_α is referred to as an A-tuple and written as (x_α). It is obvious that two elements x, y in ∏_{α∈A} X_α are equal if and only if x_α = y_α for all α ∈ A.

If one of the sets X_α is empty, then so is ∏X_α. On the other hand, if {X_α} is a nonempty family of nonempty sets, it is not quite obvious that ∏_α X_α ≠ ∅. In fact, a positive answer to the question of the existence of an element in ∏_α X_α is one of the set-theoretic axioms, known as follows.

Theorem A.3.3 (The Axiom of Choice) If {X_α | α ∈ A} is a nonempty family of nonempty sets, then there exists a function

c : A → ⋃_α X_α

such that c(α) ∈ X_α for each α ∈ A (c is called a choice function for the family {X_α}).

This axiom is logically equivalent to a number of interesting propositions; one such proposition is the following.

Theorem A.3.4 (Zermelo’s Postulate) Let {X_α | α ∈ A} be a family of nonempty pairwise disjoint sets. Then there exists a set C consisting of exactly one element from each X_α.

For, if c is a choice function for the family {X_α | α ∈ A} in Theorem A.3.4, then the set C = c(A) is a desired set. Conversely, if {X_α | α ∈ A} is a family of nonempty sets, then Y_α = {α} × X_α is nonempty for every α and Y_α ∩ Y_β = ∅ for all α ≠ β in A. By the above postulate, there exists a set C which consists of exactly one element from each Y_α. Accordingly, for each α ∈ A, we have a unique x_α ∈ X_α such that (α, x_α) ∈ C. Obviously, C ⊆ ⋃_α Y_α ⊆ A × ⋃_α X_α, and we have a function c : A → ⋃_α X_α defined by c(α) = x_α for all α ∈ A. Later, we will discuss some more statements equivalent to the axiom of choice.

It should be noted that the sets X_α in the definition of the Cartesian product ∏_{α∈A} X_α need not be different from one another; indeed, it may happen that they are all the same set X. In this case, the product ∏_{α∈A} X_α may be called the Cartesian product of A copies of X or the Cartesian Ath power of X, and denoted by X^A. Notice that X^A is just the set of all functions A → X.
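For a finite family of finite sets, the definition of ∏X_α as a set of functions can be enumerated directly. The Python sketch below represents each element x of the product as a dictionary α ↦ x(α); the names `general_product` and `power`, and the sample family, are hypothetical. For finite families a choice function can be written down explicitly (here by taking minima), so no appeal to the axiom of choice is involved in this illustration.

```python
from itertools import product

def general_product(family):
    """All functions x on the index set with x(a) in family[a], each as a dict."""
    indices = list(family)
    factors = [sorted(family[a]) for a in indices]
    return [dict(zip(indices, values)) for values in product(*factors)]

def power(X, A):
    """The Cartesian power X^A: all functions A -> X."""
    return general_product({a: X for a in A})

family = {"alpha": {0, 1}, "beta": {7, 8, 9}}
points = general_product(family)
print(len(points))                              # 2 * 3 = 6 functions
print(len(power({0, 1}, {"a", "b", "c"})))      # 2 ** 3 = 8 functions

# An explicit choice function for this finite family: pick the least element.
choice = {a: min(S) for a, S in family.items()}
print(choice in points)
```

If any factor is empty, `general_product` returns an empty list, matching the remark that the product is empty as soon as one X_α is.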

Proposition A.3.5 If Y_α ⊆ X_α for every α ∈ A, then ∏_α Y_α ⊆ ∏_α X_α. Conversely, if each Y_α ≠ ∅ and ∏_α Y_α ⊆ ∏_α X_α, then Y_α ⊆ X_α for every α.

Proposition A.3.6 Let {X_α | α ∈ A} be a family of nonempty sets, and let U_α, V_α ⊆ X_α for every α. Then we have
(a) ∏U_α ∪ ∏V_α ⊆ ∏(U_α ∪ V_α).
(b) ∏U_α ∩ ∏V_α = ∏(U_α ∩ V_α).

If {X_α | α ∈ A} is a family of nonempty sets, then for each β ∈ A, we have the mapping p_β : ∏_{α∈A} X_α → X_β given by x ↦ x_β. It is easy to see that each of these maps is surjective. The map p_β is referred to as the projection onto the βth factor. For U_β ⊆ X_β, p_β⁻¹(U_β) is the product U_β × ∏_{α≠β} X_α, referred to as a slab in ∏X_α.

Proposition A.3.7 If B ⊂ A, then ⋂_{β∈B} p_β⁻¹(U_β) = ∏Y_α, where Y_α = X_α if α ∈ A − B while Y_α = U_α for α ∈ B.

A.4 Equivalence Relations

A relation on a set X is a subset R ⊆ X × X. If (x, y) ∈ R, we write xRy. The relation Δ = {(x, x) | x ∈ X} is called the identity relation on X. This is also referred to as the diagonal. If R is a relation on X, and Y ⊂ X, then R ∩ (Y × Y) is called the relation induced by R on Y. Given a set X, a relation R on X is
(a) reflexive if xRx ∀ x ∈ X (equivalently, Δ ⊆ R),
(b) symmetric if xRy ⇒ yRx ∀ x, y ∈ X, and
(c) transitive if xRy and yRz ⇒ xRz, ∀ x, y, z ∈ X.

An equivalence relation on the set X is a relation which is reflexive, symmetric, and transitive. An equivalence relation is usually denoted by the symbol ∼, read “tilde.” Suppose that ∼ is an equivalence relation on X. Given an element x ∈ X, the set [x] = {y ∈ X | y ∼ x} is called the equivalence class of x. It is clear that X equals the union of all the equivalence classes, and two equivalence classes are either disjoint or identical. A partition of a set X is a collection of nonempty, pairwise disjoint subsets of X whose union is X. With this terminology, the family of equivalence classes of X determined by ∼ is a partition of X. Conversely, given a partition E of X, there is an equivalence relation ∼ on X such that the equivalence classes of ∼ are precisely the sets of E. This relation is obtained by declaring x ∼ y if both x and y belong to the same partition set.

Given an equivalence relation ∼ on X, the set of all equivalence classes [x], x ∈ X, is called the quotient set of X by ∼, and is denoted by X/∼. Thus we have another important method of forming new sets from old ones. The map π : X → X/∼ defined by π(x) = [x] is called the projection of X onto X/∼.

If R is a binary relation on a set X, then there is an equivalence relation ∼ on X defined by x ∼ y if and only if one of the following is true: x = y, xRy, yRx, or there exist finitely many points z_1, ..., z_{n+1} in X such that z_1 = x, z_{n+1} = y and either z_i R z_{i+1} or z_{i+1} R z_i for all 1 ≤ i ≤ n. It is called the equivalence relation generated by R.
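On a finite set, the equivalence relation generated by a relation R can be computed by merging the classes of x and y whenever xRy. The Python sketch below does this with a simple union-find structure; this is one convenient technique chosen for the illustration (not something taken from the text), and the name `generated_classes` is hypothetical. The result is the quotient set X/∼ as a set of classes.

```python
def generated_classes(X, R):
    """Equivalence classes of the equivalence relation on X generated by R.

    X is a finite collection of points, R a collection of pairs (x, y) from X.
    """
    parent = {x: x for x in X}

    def find(x):
        # Follow parent links to the representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x, y in R:
        parent[find(x)] = find(y)   # x R y forces x ~ y: merge the two classes

    classes = {}
    for x in X:
        classes.setdefault(find(x), set()).add(x)
    return {frozenset(c) for c in classes.values()}

X = range(1, 7)
R = {(1, 2), (3, 2), (4, 5)}
print(sorted(sorted(c) for c in generated_classes(X, R)))
```

Here 1 ∼ 3 holds through the chain z_1 = 1, z_2 = 2, z_3 = 3, with z_1 R z_2 and z_3 R z_2, exactly as in the definition above. The classes form a partition of X.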

A.5 Finite and Countable Sets

For counting objects in a set, we use the natural numbers (or positive integers) 1, 2, 3, .... When it is feasible, the process of counting a set X requires putting it in one-to-one correspondence with a set {1, 2, ..., n} consisting of the natural numbers from 1 to n. Also, counting of sets in essence serves to determine if one of two given sets has more elements than the other. For this purpose, it may be easier to pair off each member of one set with a member of the other and see if any elements are left over in one of the sets rather than count each. The following definition makes precise the notion of “sets having the same size.”

We say that two sets X and Y are equipotent (or have the same cardinality) if there exists a bijection between them. Clearly, the relation of equipotence between sets is an equivalence relation on any given collection of sets. We shall assume familiarity with the set of natural numbers N = {1, 2, 3, ...} (also denoted by Z+), and the usual arithmetic operations of addition and multiplication in N. We shall also assume the notion of the order relation “less than” < on N and the “well-ordering property”: every nonempty subset of N has a smallest element.

For any n ∈ N, the set {1, 2, ..., n} consisting of the natural numbers from 1 to n will be denoted by N_n. The set N_n is referred to as an initial segment of N. A set X is finite if it is either empty or equipotent to a set N_n for some n ∈ N. X is called infinite if it is not finite. When X is equipotent to a set N_n, we say that X contains n elements or the cardinality of X is n. The cardinality of the empty set is 0. Of course, we must justify that the cardinality of a finite set X is uniquely determined by it. To see this, we first prove the following.

Proposition A.5.1 For any natural numbers m < n, there is no injection from N_n to N_m.

Proof We use induction on n to establish the proposition. If n = 2, then m = 1, and the proposition is obviously true. Now, let n > 2 and assume that the proposition is true for n − 1. We show that it is true for n. If possible, suppose that there is an injection f : N_n → N_m, where m < n. If m is not in the image of N_{n−1}, then f|N_{n−1} is an injection N_{n−1} → N_{m−1}, contrary to our inductive assumption. So we may further assume that f(j) = m, j ≠ n. Then f(n) ≠ m. We define a mapping g : N_{n−1} → N_{m−1} by

g(i) = f(i) for i ≠ j, and g(i) = f(n) for i = j.

It is clear that g is an injection which, again, contradicts our inductive hypothesis. Therefore, there is no injection N_n → N_m. ♦

The preceding proposition is sometimes called the “Pigeonhole Principle.” The contrapositive of the above proposition states that if there is an injection N_m → N_n, then m ≤ n. As an immediate consequence of this, we see that the cardinality of a finite set X is uniquely determined. For, if f : X → N_n and g : X → N_m are bijections, then the composite fg⁻¹ : N_m → N_n is a bijection. Hence, m ≤ n and n ≤ m, which implies that m = n. It is now evident that two finite sets X and Y are equipotent if and only if they have the same number of elements. Notice that for infinite sets, the idea of having the same “number of elements” becomes quite vague, whereas the notion of equipotence retains its clarity. Next, we observe the following useful property of finite sets.

Proposition A.5.2 An injective mapping from a finite set to itself is also surjective.

Proof Let X be a nonempty finite set, and f : X → X be injective. Let x ∈ X be an arbitrary point. Write x = x_0, f(x_0) = x_1, f(x_1) = x_2, and so on. Since X is finite, we must have x_m = x_n for some positive integers m < n. Because f is injective, we obtain x_0 = x_{n−m} ⇒ x = f(x_{n−m−1}). Thus f is surjective. ♦

Proposition A.5.3 A proper subset of a finite set X is finite and has cardinality less than that of X.

Proof If X has n elements, then there is a bijection X → N_n. Consequently, each proper subset of X is equipotent to a proper subset of N_n, and thus it suffices to prove that every proper subset of N_n is finite and has fewer than n elements. We use induction on n to show that any proper subset of N_n is finite and has at most n − 1 elements. If n = 1, then N_1 = {1} and its only proper subset is the empty set ∅. Obviously, the proposition is true in this case. Suppose now that every proper subset of N_k (k ≥ 1) has less than k elements. We show that each proper subset Y of N_{k+1} has at most k elements. If k + 1 does not belong to Y, then either Y = N_k or it is a proper subset of N_k; in the latter case, Y has at most k elements, by the induction assumption. If k + 1 is in Y, then Y − {k + 1} ⊂ N_k. Therefore, either Y = {k + 1} or there exists an integer m < k and a bijection f : Y − {k + 1} → N_m, by the induction assumption again. In the latter case, we define g : Y → N_{m+1} by setting g(y) = f(y) for all y ≠ k + 1, and g(k + 1) = m + 1. Clearly g is a bijection. So Y is finite and has m + 1 ≤ k elements. ♦

The contrapositive of this proposition states that if a subset Y ⊆ X is infinite, then so is X.

Theorem A.5.4 For any nonempty set X, the following statements are equivalent: (a) X is finite. (b) There is a surjection N_n → X for some n ∈ N. (c) There is an injection X → N_n for some n ∈ N.

Proof (a) ⇒ (b): Obvious.
(b) ⇒ (c): Let f : N_n → X be a surjection. Then f⁻¹(x) ≠ ∅ for every x ∈ X. So, for each x ∈ X, we can choose a number c(x) ∈ f⁻¹(x). Since {f⁻¹(x) | x ∈ X} is a partition of N_n, the function x ↦ c(x) is an injection from X to N_n.
(c) ⇒ (a): If there is an injection f : X → N_n, then X is equipotent to f(X) ⊆ N_n. By the preceding proposition, there exists a positive integer m ≤ n and a bijection f(X) → N_m. It follows that X is equipotent to N_m and hence finite. ♦

Proposition A.5.5 If X is a finite set, then there is no one-to-one mapping of X onto any of its proper subsets.

Proof Let X be a finite set and Y a proper subset of X. Suppose that X has n elements. Then there is a bijection f : X → N_n. Also, there is a positive integer m < n and a bijection g : Y → N_m, by Proposition A.5.3. If h : X → Y is a bijection, then the composite ghf⁻¹ : N_n → N_m would be an injection, which contradicts Proposition A.5.1. ♦
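The finite counting facts used above (Propositions A.5.1 and A.5.2) can be confirmed by brute force for small n. The Python sketch below lists all injections N_n → N_m as tuples (f(1), ..., f(n)); the function name `injections` is a hypothetical helper for this illustration, and the exhaustive search is only feasible for very small n and m.

```python
from itertools import product

def injections(n, m):
    """All injective maps N_n -> N_m, each written as the tuple (f(1), ..., f(n))."""
    codomain = range(1, m + 1)
    return [f for f in product(codomain, repeat=n) if len(set(f)) == n]

# Proposition A.5.1 (Pigeonhole Principle): no injection N_n -> N_m when m < n.
print(all(not injections(n, m) for n in range(2, 6) for m in range(1, n)))

# Proposition A.5.2: every injection N_3 -> N_3 is also surjective.
print(all(set(f) == {1, 2, 3} for f in injections(3, 3)))
```

A finite search of this kind is no substitute for the induction proofs, but it makes the statements concrete.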

By the preceding proposition, a finite set cannot be equipotent to one of its proper subsets. As n ↦ n + 1 is a bijection between N and N − {1}, the set N is not finite. In fact, this is the characteristic property of infinite sets. To establish this, we need the following.

Theorem A.5.6 (Principle of Recursive Definition) Let X be a set and f : X → X be a function. Given a point x0 ∈ X, there is a unique function g : N → X such that g (1) = x0 and g (n + 1) = f (g (n)) for all n ∈ N.

This is one of the most useful ways of defining a function on N, which we will take for granted.

Theorem A.5.7 Let X be a set. The following statements are equivalent: (a) X is infinite. (b) There exists an injection N → X. (c) X is equipotent to one of its proper subsets.

Proof (a) ⇒ (b): Let ℱ be the family of all finite subsets of X. By the axiom of choice, there exists a function c from the collection of all nonempty subsets Y ⊆ X to X such that c(Y) ∈ Y. Since X is not finite, X − F ≠ ∅ for all F ∈ ℱ. So c(X − F) ∈ X − F. It is obvious that F ∪ {c(X − F)} is a member of ℱ for every F ∈ ℱ. Accordingly, F ↦ F ∪ {c(X − F)} is a function of ℱ into itself. Set G(1) = {c(X)} and G(n + 1) = G(n) ∪ {c(X − G(n))} for every n ∈ N. By the principle of recursive definition, G is a function N → ℱ. We show that the function φ : N → X defined by φ(n) = c(X − G(n)) is an injection. It is easily seen, by induction, that G(m) ⊆ G(n) for m ≤ n. Therefore, for m < n, we have φ(m) = c(X − G(m)) ∈ G(m + 1) ⊆ G(n), while φ(n) ∈ X − G(n), so φ(m) ≠ φ(n).
(b) ⇒ (c): Let φ : N → X be an injection, and put Y = φ(N). Consider the mapping ψ : X → X defined by

ψ(x) = φ(1 + φ⁻¹(x)) if x ∈ Y, and ψ(x) = x if x ∉ Y.

We assert that ψ is a bijection between X and X − {φ(1)}. The surjectivity of ψ is clear. To show that it is injective, assume that ψ(x) = ψ(x′). Then both x and x′ belong to either Y or X − Y. If x, x′ ∈ Y, then we have φ(1 + φ⁻¹(x)) = φ(1 + φ⁻¹(x′)), and it follows from the injectivity of φ that x = x′. And, if x, x′ ∈ X − Y, then x = x′, by the definition of ψ.
(c) ⇒ (a): This is the contrapositive of Proposition A.5.5. ♦

A set X is called countably infinite (or denumerable, or enumerable) if X is equipotent to the set N of natural numbers. X is called countable if it is either finite or denumerable; otherwise, it is called uncountable. For example, the set Z of all integers is countably infinite, since f : N → Z, defined by f(n) = n/2 if n is even, and f(n) = −(n − 1)/2 if n is odd, is a bijection. We conclude from Theorem A.5.7 that every infinite set contains a countably infinite subset. The following proposition shows that no uncountable set can be a subset of a countable set.

Proposition A.5.8 Every subset of N is countable.

Proof Let M ⊆ N. If M is finite, then it is countable, by definition. If M = N, then M is a countably infinite set, since the identity function from N to M is a bijection. So assume that M is an infinite proper subset of N. Then we define a function from N to M as follows: Let f(1) be the smallest integer in M, which exists by the well-ordering property of N. Suppose that we have chosen integers f(1), f(2), ..., f(k). Since M is infinite, M − f({1, 2, ..., k}) is nonempty for every k ∈ N. We define f(k + 1) to be the smallest integer of this set. By the principle of recursive definition, f(k) is defined for all k ∈ N. Clearly, f(1) < f(2) < ··· so that f is injective. Now, we see that f is surjective, too. For each m ∈ M, the set {1, 2, ..., m} is finite. Since the set {f(k) | k ∈ N} is infinite, there is a k ∈ N such that f(k) > m. So we can find the smallest integer l ∈ N such that f(l) ≥ m. Then f(j) < m for each j < l; accordingly, m ∉ f({1, ..., l − 1}). By the definition of f(l), we have f(l) ≤ m, and the equality m = f(l) follows. Thus f : N → M is a bijection. ♦

It follows that a subset of N is either finite or countably infinite. Note that any set which is equipotent to a countable set is countable. Accordingly, any subset of a countable set is countable.
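Two constructions from this passage are easy to make concrete: the bijection N → Z given earlier, and the enumeration f(1) < f(2) < ··· of an infinite subset M ⊆ N used in the proof of Proposition A.5.8. The Python sketch below is a minimal illustration; the names `nat_to_int` and `enumerate_subset` are hypothetical, and M is described by a membership test that is assumed to hold for infinitely many n.

```python
from itertools import count, islice

def nat_to_int(n):
    """The bijection N -> Z: n/2 for even n, -(n - 1)/2 for odd n."""
    return n // 2 if n % 2 == 0 else -((n - 1) // 2)

def enumerate_subset(member):
    """Yield f(1) < f(2) < ... for M = {n in N : member(n)}.

    Scanning N in increasing order picks, at each step, the smallest element
    of M not chosen so far, as in the proof of Proposition A.5.8.
    Assumes M is infinite; otherwise the generator never finishes.
    """
    for n in count(1):
        if member(n):
            yield n

print([nat_to_int(n) for n in range(1, 8)])
print(list(islice(enumerate_subset(lambda n: n % 3 == 0), 5)))
```

The first line of output lists the integers in the order 0, 1, −1, 2, −2, 3, −3, and the second lists the first five multiples of 3.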

Theorem A.5.9 For any nonempty set X, the following statements are equivalent: (a) X is countable. (b) There is a surjection N → X. (c) There is an injection X → N.

Proof (a) ⇒ (b): If X is countably infinite, then there is a bijection between N and X, and we are done. If X is finite, then there exists an n ∈ N and a bijection f : {1, ..., n} → X. We define g : N → X by setting

g(m) = f(m) for 1 ≤ m ≤ n, and g(m) = f(n) for m > n.

Clearly, g is a surjection.
(b) ⇒ (c): Let g : N → X be a surjection. Then g⁻¹(x) is nonempty for every x ∈ X. So it contains a unique smallest integer h(x), say. Then x ↦ h(x) is a mapping h : X → N, which is injective, since g⁻¹(x) ∩ g⁻¹(y) = ∅ whenever x ≠ y.
(c) ⇒ (a): Suppose that h : X → N is an injection. Then h : X → h(X) is a bijection. By Proposition A.5.8, h(X) is countable, and therefore X is countable. ♦

By definition, a countable set is the range of a (finite or infinite) sequence, and the converse follows from the preceding theorem. Thus the elements of a countable set X can be listed as x_1, x_2, ..., and such a listing is called an enumeration of X. Observe that the image of a countable set is countable, and the domain of an injection into a countable set is countable.

Lemma A.5.10 For any finite number of factors, N × ··· × N is countably infinite.

Proof Let 2, 3, ..., p_k be the first k prime numbers, where k is the number of factors in N × ··· × N. Define a mapping f : N × ··· × N → N by f(n_1, n_2, ..., n_k) = 2^{n_1} 3^{n_2} ··· p_k^{n_k}. By the fundamental theorem of arithmetic, f is injective, and hence N × ··· × N is countable. It is obvious that n ↦ (n, 1, ..., 1) is an injection N → N × ··· × N so that N × ··· × N is infinite. ♦
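The prime-power encoding in the proof of Lemma A.5.10 can be tried out directly. The Python sketch below implements f(n_1, ..., n_k) = 2^{n_1} 3^{n_2} ··· p_k^{n_k} together with a decoder that recovers the exponents by repeated division, which is how unique factorization makes f injective. The names `encode`, `decode`, and the fixed list `PRIMES` are assumptions for this illustration; the sketch supports at most six factors.

```python
from itertools import product

PRIMES = [2, 3, 5, 7, 11, 13]   # enough primes for up to six factors in this sketch

def encode(ns):
    """f(n_1, ..., n_k) = 2^n_1 * 3^n_2 * ... * p_k^n_k."""
    if len(ns) > len(PRIMES):
        raise ValueError("this sketch supports at most six factors")
    code = 1
    for p, n in zip(PRIMES, ns):
        code *= p ** n
    return code

def decode(code, k):
    """Recover (n_1, ..., n_k) from a value of encode by counting prime factors."""
    ns = []
    for p in PRIMES[:k]:
        n = 0
        while code % p == 0:
            code //= p
            n += 1
        ns.append(n)
    return tuple(ns)

print(encode((1, 2, 3)), decode(2250, 3))

# Injectivity on a finite box: all 64 triples with entries in {1, ..., 4} get distinct codes.
codes = {encode(t) for t in product(range(1, 5), repeat=3)}
print(len(codes))
```

The map is injective but far from surjective (for instance, 1 is not a value when all n_i ≥ 1), which is all the lemma needs.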

It is immediate from the preceding lemma that a finite product of countable sets X_i is countable, for if f_i : N → X_i are surjections for 1 ≤ i ≤ k, then so is the mapping (n_1, ..., n_k) ↦ (f_1(n_1), ..., f_k(n_k)) of N × ··· × N into X_1 × ··· × X_k. Notice that if each X_i is nonempty and some X_j is countably infinite, then ∏_{i=1}^{k} X_i is also countably infinite, since x_j ↦ (x_1^0, ..., x_{j−1}^0, x_j, x_{j+1}^0, ..., x_k^0), where x_i^0 ∈ X_i are fixed elements for i ≠ j, is an injection X_j → ∏_{i=1}^{k} X_i. We also see that if each X_i is finite, then so is ∏_{i=1}^{k} X_i. In fact, if any X_j = ∅, then ∏_{i=1}^{k} X_i = ∅, and, if X_i has m_i > 0 elements, then ∏_{i=1}^{k} X_i contains m_1 ··· m_k elements. By induction on k, it suffices to establish this proposition in the case k = 2. Let N_{m_i} = {1, 2, ..., m_i} and f_i : N_{m_i} → X_i be bijections, i = 1, 2. Then f_1 × f_2 : N_{m_1} × N_{m_2} → X_1 × X_2, (a, b) ↦ (f_1(a), f_2(b)), is a bijection. Define a mapping g : N_{m_1} × N_{m_2} → N_{m_1 m_2} by g(a, b) = (a − 1)m_2 + b. It is easily verified that g is a bijection, so X_1 × X_2 has m_1 m_2 elements.

However, an infinite product of even finite sets need not be countable. For example, let X = {0, 1}, and consider the countable product X^N = ∏_{n=1}^{∞} X_n, where each X_n = X. Note that an element in X^N is a sequence s : N → X. Let f : N → X^N be any function. Write t_n = 1 − f(n)_n. Then t = ⟨t_n⟩ ∈ X^N and t ≠ f(n) for all n ∈ N. Thus f is not surjective. Since f is an arbitrary function N → X^N, X^N cannot be countable, by Theorem A.5.9. This method of proof was first used by G. Cantor, and is known as Cantor’s diagonal process. We use this technique to prove the following.

Example A.5.1 The set R of all real numbers is uncountable. Since an uncountable set cannot be a subset of a countable set, it is enough to show that the unit interval I ⊂ R is uncountable. By Theorem A.5.9, it suffices to show that there is no surjective mapping f : N → I. Let f : N → I be any function. We use the decimal representation of real numbers to write f(n) = 0.a_{n1}a_{n2} ···. Here each a_{ni} is a digit between 0 and 9. This representation of a number is not necessarily unique, but if a number has two different decimal representations, then one of these representations repeats 9s from some place onward and the other repeats 0s from some place onward. We define a new real number r whose decimal representation is 0.b_1b_2 ···, where b_n = 3 if a_{nn} ≠ 3, and b_n = 5 otherwise. It is clear that r has a unique decimal representation and differs from f(n) in the nth decimal place for every n ∈ N. So r ≠ f(n) for all n ∈ N. Obviously, r belongs to I, and thus f is not surjective.
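Cantor's diagonal process can be imitated on finite truncations. The Python sketch below works with a finite list of rows, each at least as long as the list, standing in for the first few terms f(1), f(2), ... of a purported enumeration; this truncation is an assumption of the illustration, since the real argument concerns infinite sequences. The function names are hypothetical. `flip_diagonal` follows the rule t_n = 1 − f(n)_n for 0/1 sequences, and `decimal_diagonal` follows the rule of Example A.5.1 (digit 3 unless the diagonal digit is 3, in which case 5).

```python
def flip_diagonal(rows):
    """Binary diagonal: the n-th entry is 1 minus the n-th entry of the n-th row."""
    return [1 - row[n] for n, row in enumerate(rows)]

def decimal_diagonal(digit_rows):
    """Decimal diagonal: the n-th digit is 3 unless the n-th digit of the n-th row is 3, then 5."""
    return [3 if row[n] != 3 else 5 for n, row in enumerate(digit_rows)]

rows = [[0, 1, 0],
        [1, 1, 1],
        [0, 0, 0]]
t = flip_diagonal(rows)
print(t, all(t[n] != rows[n][n] for n in range(len(rows))))

digits = [[1, 4, 1],     # stands for 0.141...
          [3, 3, 3],     # stands for 0.333...
          [7, 1, 8]]     # stands for 0.718...
print(decimal_diagonal(digits))
```

In both cases the constructed sequence differs from the n-th row in the n-th position, so it cannot coincide with any row of the list. Using only the digits 3 and 5 avoids the ambiguity of decimal expansions ending in repeated 0s or 9s.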

Proposition A.5.11 The union of a countable family of countable sets is countable.

Proof Let A be a countable set, and suppose that for every α ∈ A, E_α is a countable set. Put X = ⋃E_α. We show that X is countable. If X = ∅, there is nothing to prove. So we assume that X ≠ ∅. Then A ≠ ∅, and we may further assume that E_α ≠ ∅ for every α ∈ A, since the empty set contributes nothing to the union of the E_α. Since A is countable, there is a surjection N → A, n ↦ α_n. Since E_{α_n} is a nonempty countable set, there is a surjection f_n : N → E_{α_n}. Write f_n(m) = x_{nm} for every m ∈ N. Then we have a mapping φ : N × N → X defined by φ(n, m) = x_{nm}. Clearly, φ is surjective, for X = ⋃_n E_{α_n}. By Lemma A.5.10, N × N is countable, and hence X is countable. ♦

If the indexing set A and the sets E_α in the preceding proposition are all finite, then it is easily verified that the union ⋃_α E_α is finite.
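The proof of Proposition A.5.11 composes the surjection φ(n, m) = x_{nm} with an enumeration of N × N. A minimal sketch of this idea follows; it enumerates N × N along antidiagonals, which is one possible enumeration and not necessarily the one used in Lemma A.5.10. The names pairs and enumerate_union are illustrative.

```python
from fractions import Fraction
from itertools import count, islice


def pairs():
    """Enumerate N x N, with N = {1, 2, ...}, along the antidiagonals n + m = s."""
    for s in count(2):
        for n in range(1, s):
            yield n, s - n


def enumerate_union(family):
    """Yield x_{nm} = family(n, m) for every (n, m); repetitions are allowed."""
    for n, m in pairs():
        yield family(n, m)


# Illustration: E_n = {n/1, n/2, n/3, ...}, so the union is the set of positive rationals.
first_values = list(islice(enumerate_union(lambda n, m: Fraction(n, m)), 10))
```

Every element x_{nm} of the union appears after finitely many steps, which is exactly the surjectivity used in the proof; the enumeration need not be injective.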

Example A.5.2 The set Q of all rational numbers is countably infinite. Let Q^+ denote the set of all positive rationals, and Q^− denote the set of all negative rationals. Then Q = Q^+ ∪ Q^− ∪ {0}. Obviously, Q^+ and Q^− are equipotent, and therefore it suffices to prove that Q^+ is countably infinite. As N ⊂ Q^+, Q^+ is infinite. There is a surjective mapping g : N × N → Q^+ defined by g(m, n) = m/n. By Lemma A.5.10, there is a bijection f : N → N × N, so the composition g ∘ f : N → Q^+ is surjective. By Theorem A.5.9, Q^+ is countable.

Since R is uncountable, we see that the set R − Q of all irrational numbers is uncountable.

Theorem A.5.12 The family of all finite subsets of a countable set is countable.

Proof Let X be a countable set, and ℱ(X) be the family of all finite subsets of X. If X is a finite set having n elements, then every subset of X is finite, and ℱ(X) has 2^n members. If X is countably infinite, then there is a bijection between ℱ(X) and the family ℱ(N) of all finite subsets of N. So it suffices to prove that ℱ(N) is countably infinite. It is obvious that n ↦ {n} is an injection N → ℱ(N), so ℱ(N) is infinite. To see that it is countable, consider the sequence 2, 3, ..., p_k, ... of prime numbers. If F = {n_1, n_2, ..., n_k} ⊂ N, then the n_i can be indexed so that n_1 < n_2 < ··· < n_k. Put ν(F) = 2^{n_1} 3^{n_2} ··· p_k^{n_k}. Then F ↦ ν(F) defines an injection ℱ(N) − {∅} → N, by the fundamental theorem of arithmetic. By Theorem A.5.9, ℱ(N) − {∅} is countable, and so ℱ(N) is countable. ♦

It must be noted that the family of all subsets of a countably infinite set is not countable. This follows from the following.

Theorem A.5.13 (Cantor (1883)) For any set X, there is no surjection X → P(X).

Proof Assume that there is a surjection f : X → P(X) and consider the set S = {x ∈ X | x ∉ f(x)}. Then S ⊆ X, so there exists an x ∈ X such that S = f(x). Now, if x ∈ S, then x ∉ f(x) = S, and if x ∉ S, then x ∈ f(x) = S. Thus, in either case, we obtain a contradiction, and hence the theorem. ♦

We end this section with the following theorem, which will be proved later in Sect. A.8.

Theorem A.5.14 (Bernstein–Schröder) Let X and Y be sets. If there exist injections X → Y and Y → X, then there exists a bijection X → Y.
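The encoding ν(F) = 2^{n_1} 3^{n_2} ··· p_k^{n_k} used in the proof of Theorem A.5.12 can be computed directly. The sketch below assumes a simple trial-division prime generator, which is adequate only for small demonstrations; the helper names are hypothetical.

```python
from itertools import combinations


def primes():
    """Yield 2, 3, 5, ... by trial division against the primes found so far."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def nu(F):
    """Return 2^{n_1} 3^{n_2} ... p_k^{n_k} for a nonempty F = {n_1 < ... < n_k}."""
    result = 1
    for p, n in zip(primes(), sorted(F)):
        result *= p ** n
    return result


def nu_is_injective_on_subsets(max_n):
    """Check injectivity of nu on all nonempty subsets of {1, ..., max_n}."""
    universe = range(1, max_n + 1)
    values = [nu(F) for k in range(1, max_n + 1) for F in combinations(universe, k)]
    return len(values) == len(set(values))
```

For example, ν({1, 2}) = 2^1 · 3^2 = 18 and ν({1, 3}) = 2^1 · 3^3 = 54. Distinct finite sets give distinct exponent sequences, so unique factorization yields distinct values.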

A.6 Orderings

Definition A.6.1 Let X be a set. An order (or a simple order) on X is a binary relation, denoted by ≺, such that

(a) if x, y ∈ X, then one and only one of the statements x = y, x ≺ y, y ≺ x is true, and (b) x ≺ y and y ≺ z ⇒ x ≺ z. X together with a definite order relation defined in it is called an ordered set.

The statement “x ≺ y” is read as “x precedes y” or “y follows x.” We also say that “x is less than y” or “y is greater than x.” It is sometimes convenient to write y ≻ x to mean x ≺ y. The notation x ⪯ y is used to indicate x ≺ y or x = y, that is, the negation of y ≺ x. It should be noticed that an order relation is irreflexive, that is, x ≺ x holds for no x. However, if ≺ is an order on X, then the relation ⪯ satisfies the following conditions: (a) (Reflexivity) x ⪯ x ∀ x ∈ X; (b) (Antisymmetry) x ⪯ y and y ⪯ x ⇒ x = y; (c) (Transitivity) x ⪯ y and y ⪯ z ⇒ x ⪯ z; and (d) (Comparability) x ⪯ y or y ⪯ x ∀ x, y ∈ X. A relation having the properties (a)–(c) is called a partial ordering. A set together with a definite partial ordering is called a partially ordered set. Two elements x, y of a partially ordered set (X, ⪯) are called comparable if x ⪯ y or y ⪯ x, and a subset Y ⊆ X in which any two elements are comparable is called a chain in X. A partially ordered set that is also a chain is called a linearly or totally ordered set. Thus we have associated with each simple order a total (or linear) order relation. Conversely, there is a simple order relation associated with each total ordering. In fact, to any partial order relation ⪯ on X, there is associated a unique relation ≺ given by x ≺ y ⇔ x ⪯ y and x ≠ y.

It is transitive and has the property that (a) for no x ∈ X, the relation x ≺ x holds, and (b) if x ≺ y, then y ⊀ x for every x, y ∈ X. A transitive relation with this property is called a strict partial ordering. If ⪯ totally orders the set X, then ≺ is clearly a simple order (or a strict total ordering) on X. It follows that the notions of a total (resp. partial) order relation and a strict total (resp. strict partial) order relation are interchangeable. If ⪯ is a partial order, we use the notation ≺ to denote the associated strict partial order, and conversely. The relation ≤ is a total ordering on the set R of real numbers, while the inclusion relation ⊆ on P(X) is not if X has more than one element. Note that the set inclusion is always a partial ordering on any family of subsets of a set X. It is obvious that the induced ordering on a subset of a partially ordered set is a partial ordering on that subset. If (X_1, ⪯_1) and (X_2, ⪯_2) are partially (totally) ordered sets, then the dictionary (or lexicographic) order relation on X_1 × X_2 defined by

(x_1, x_2) ⪯ (y_1, y_2) if x_1 ≺_1 y_1 or if x_1 = y_1 and x_2 ⪯_2 y_2

is a partial (total) ordering.
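The dictionary order can be written as a small predicate that takes the two component orderings as parameters. This is a sketch under the assumption that each component ordering is supplied as a Python function leq(a, b); the names lex_leq and divides are illustrative.

```python
from operator import le


def divides(a: int, b: int) -> bool:
    """Divisibility, a partial ordering on the positive integers."""
    return b % a == 0


def lex_leq(p, q, leq1=le, leq2=le) -> bool:
    """Dictionary order: (x1, x2) precedes or equals (y1, y2) iff x1 is strictly
    below y1 in the first order, or x1 = y1 and x2 is below or equal to y2."""
    (x1, x2), (y1, y2) = p, q
    strictly_below = leq1(x1, y1) and x1 != y1
    return strictly_below or (x1 == y1 and leq2(x2, y2))
```

With the usual order ≤ on both factors, any two pairs are comparable, so the result is a total ordering. With divisibility on both factors, the pairs (2, 1) and (3, 1) are incomparable, so the dictionary order is only partial.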

Let (X, ⪯) be a partially ordered set. If Y ⊆ X, then an element b ∈ X is an upper bound of Y if for each y ∈ Y, y ⪯ b. If there exists an upper bound of Y, then we say that Y is bounded above. A least upper bound or supremum of Y is an element b_0 ∈ X such that b_0 is an upper bound of Y and b_0 ⪯ c for every upper bound c of Y; in particular, if c ≺ b_0, then c is not an upper bound of Y. Observe that there is at most one such b_0, by antisymmetry, and it may or may not exist. We write b_0 = sup Y when it exists. Notice that sup Y may or may not belong to Y. If sup Y ∈ Y, it is referred to as the last (or largest or greatest) element of Y. Analogously, an element a ∈ X is a lower bound of Y if a ⪯ y for all y ∈ Y. If there exists a lower bound of Y, we say that Y is bounded below. A greatest lower bound or infimum of Y is an element a_0 ∈ X which is a lower bound of Y and satisfies c ⪯ a_0 for every lower bound c of Y; in particular, if a_0 ≺ c, then c is not a lower bound of Y. It is clear that there is at most one such a_0, and it may or may not exist. When it exists, we write a_0 = inf Y. If inf Y ∈ Y, we call it the first (or smallest or least) element of Y. It is obvious that sup Y is the first element of the set {x ∈ X | y ⪯ x for all y ∈ Y}, and inf Y is the last element of the set {x ∈ X | x ⪯ y for all y ∈ Y}. A partially ordered set X is said to have the least upper bound property if each nonempty subset Y ⊆ X with an upper bound has a least upper bound. Analogously, X is said to have the greatest lower bound property if each nonempty subset Y ⊆ X which is bounded below has a greatest lower bound. If X is a set, then P(X) ordered by inclusion has the least upper bound property.
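For a finite partially ordered set, sup Y can be computed as the first element of the set of upper bounds of Y, as described above. The following is a minimal sketch, assuming the ordering is given as a function leq(a, b); divisibility on a few integers is used only as an illustration.

```python
def divides(a: int, b: int) -> bool:
    """Divisibility, a partial ordering on the positive integers."""
    return b % a == 0


def supremum(X, Y, leq):
    """Return sup Y in the finite poset (X, leq), or None if it does not exist."""
    upper = [b for b in X if all(leq(y, b) for y in Y)]
    # sup Y is the first element of the set of upper bounds, if there is one.
    least = [b for b in upper if all(leq(b, c) for c in upper)]
    return least[0] if least else None


def infimum(X, Y, leq):
    """Return inf Y, computed as the supremum for the reversed ordering."""
    return supremum(X, Y, lambda a, b: leq(b, a))
```

In the divisors of 12 ordered by divisibility, sup {2, 3} = 6 and inf {4, 6} = 2. In the set {2, 3, 12, 18} the subset {2, 3} has the two upper bounds 12 and 18, neither of which precedes the other, so no supremum exists.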

Proposition A.6.2 If an ordered set X has the least upper bound property, then it also has the greatest lower bound property.

An ordered set X which has the least upper bound (equivalently, the greatest lower bound) property is called order complete. Let (X, ⪯) be a partially ordered set. An element a ∈ X is called a minimal element of X if for every x ∈ X, x ⪯ a ⇒ x = a, that is, no x ∈ X which is distinct from a precedes a. Similarly, an element b ∈ X is called a maximal element of X if for every x ∈ X, b ⪯ x ⇒ x = b, that is, no x ∈ X which is distinct from b follows b. It should be noted that if X has a last (first) element, then that element is the unique maximal (minimal) element of X. If ⪯ is a total ordering, a maximal element is the last element, but there are partially ordered sets with unique maximal elements which are not last elements. If x, y ∈ X and x ≺ y, we say that x is a predecessor of y (or y is a successor of x). If x ≺ y and there is no z ∈ X such that x ≺ z ≺ y, then we say that x is an immediate predecessor of y (or y an immediate successor of x).

Ordinals

A partially ordered set X is called well ordered (or an ordinal) if each nonempty subset A ⊆ X has a first element. The set N of the positive integers is well ordered by its natural ordering ≤. On the other hand, the set R of real numbers is not well ordered by the usual ordering ≤. And, if X has more than one element, then the partial ordering ⊆ in P(X) is not a well ordering.
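The difference between minimal or maximal elements and first or last elements can be seen in a small finite example. The sketch below again assumes the ordering is passed as a function leq(a, b); the set {2, 3, 4, 5, 6} under divisibility is chosen only for illustration.

```python
def divides(a: int, b: int) -> bool:
    """Divisibility, a partial ordering on the positive integers."""
    return b % a == 0


def maximal_elements(X, leq):
    """Elements b such that leq(b, x) forces x == b."""
    return [b for b in X if all(x == b for x in X if leq(b, x))]


def minimal_elements(X, leq):
    """Elements a such that leq(x, a) forces x == a."""
    return [a for a in X if all(x == a for x in X if leq(x, a))]


def first_element(X, leq):
    """Return the element preceding every element of X, or None if there is none."""
    firsts = [a for a in X if all(leq(a, x) for x in X)]
    return firsts[0] if firsts else None
```

Under divisibility, the set {2, 3, 4, 5, 6} has minimal elements 2, 3, 5 and maximal elements 4, 5, 6, but no first element; after adjoining 1, the element 1 is the first element and the unique minimal element.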

Clearly, any subset of a well-ordered set is well ordered in the induced ordering. The empty set ∅ is considered to be a well-ordered set. If (X_1, ⪯_1) and (X_2, ⪯_2) are well-ordered sets, then the dictionary order relation on X_1 × X_2 is a well ordering. It was E. Zermelo who first formulated the axiom of choice (though it was being used by mathematicians without explicit formulation) and established that every set can be well ordered (1904). In fact, we have the following.

Theorem A.6.3 The following statements are equivalent: (a) The Axiom of Choice. (b) Well-Ordering Principle: Every set can be well ordered. (c) Zorn’s Lemma: Let X be a partially ordered set. If each chain in X has an upper bound, then X has a maximal element. (d) Hausdorff Maximal Principle: If X is a partially ordered set, then each chain in X is contained in a maximal chain, that is, for each chain C in X, there exists a chain M in X such that C ⊆ M and M is not properly contained in any other chain which contains C.

We do not wish to prove this theorem and refer the interested reader to the texts by Dugundji [3] and Kelley [6]. We remark that there are several other statements, each equivalent to the axiom of choice. We note that a well-ordered set (W, ⪯) is totally ordered. For, if x, y ∈ W, then the subset {x, y} ⊆ W has a first element; accordingly, either x ⪯ y or y ⪯ x. Moreover, W is order complete. In fact, given a nonempty set X ⊆ W with an upper bound, the first element of the set of all upper bounds of X in W is sup X. Each element x of W that is not the last element of W has an immediate successor. This is the first element of the set {y ∈ W | x ≺ y}, usually denoted by x + 1. However, x need not have an immediate predecessor. For example, consider the set N of positive integers with the usual ordering ≤. Choose an element ∞ ∉ N, and put W = N ∪ {∞}. Define a relation ⪯ on W by x ⪯ y if x, y ∈ N and x ≤ y, x ⪯ ∞ for all x ∈ N, and ∞ ⪯ ∞.
Then (W, ⪯) is a well-ordered set, and ∞ has no immediate predecessor in W. It is obvious that ∞ is the last element of W. We say that N ∪ {∞} has been obtained from N by adjoining ∞ as the last element. For each x ∈ W, the set S(x) = {w ∈ W | w ≺ x} is called the section (or initial segment) determined by x. The first element of W is usually denoted by 0; it is obvious that S(0) = ∅. Each section S(x) in W obviously satisfies the condition: y ∈ S(x) and z ⪯ y ⇒ z ∈ S(x). Conversely, if S is a proper subset of W satisfying this condition, then S is a section in W. For, if x is the least element of W − S, then y ≺ x ⇔ y ∈ S. So S = S(x). It is clear that a union of sections in W is either a section in W or equals W. Also, it is immediate that an intersection of sections in W is a section in W.

Theorem A.6.4 (Principle of Transfinite Induction) Let (W, ⪯) be a well-ordered set. If X is a subset of W such that S(y) ⊆ X implies y ∈ X for every y ∈ W, then X = W.

Proof The first element 0 of W is in X, since ∅ = S(0) ⊆ X. If W − X ≠ ∅, then it has a first element y_0, say. So S(y_0) ⊆ X, which forces y_0 ∈ X, by our hypothesis. This contradiction establishes the theorem. ♦

The preceding theorem is generally used in the following form: For each x ∈ W, let P(x) be a proposition. Suppose that (a) P(0) is true, and (b) for each x ∈ W, P(x) is also true whenever P(y) is true for all y ∈ S(x). Then P(x) is true for every x ∈ W. Since each element of N other than 1 has an immediate predecessor, the induction principle on N is equivalent to the following statement. Let P(n) be a proposition defined for each n ∈ N. If P(1) is true, and for each n > 1, the hypothesis “P(n − 1) is true” implies that P(n) is true, then P(n) is true for every n ∈ N.

Let (W, ⪯) and (W′, ⪯′) be well-ordered sets. A mapping f : W → W′ is said to be order-preserving if f(x) ⪯′ f(y) whenever x ⪯ y in W. We call W and W′ isomorphic (or of the same order type) if there is a bijective order-preserving map f : W → W′; such a map f is called an isomorphism. An order-preserving injection f : W → W′ is called a monomorphism. It should be noted that an isomorphism of a partially ordered set X onto a partially ordered set X′ is a bijection f : X → X′ such that x ⪯ y ⇔ f(x) ⪯′ f(y). However, if X and X′ are totally ordered, one needs only the implication x ⪯ y ⇒ f(x) ⪯′ f(y). For, this also implies the reverse implication f(x) ⪯′ f(y) ⇒ x ⪯ y: if f(x) ⪯′ f(y) but x ⋠ y, then y ≺ x ⇒ f(y) ≺′ f(x), contradicting the antisymmetry of ⪯′.

The ordinal N is isomorphic to the well-ordered set ω = {0, 1, 2, ...} of nonnegative integers in its natural order. We note that a well-ordered set may be isomorphic to one of its proper subsets. For example, the well-ordered set N is isomorphic to the set E of even positive integers; n ↦ 2n is a desired isomorphism. However, no well-ordered set W can be isomorphic to a section in it; this is immediate from the following fact.

Proposition A.6.5 If f is a monomorphism of a well-ordered set W into itself, then f(w) ⪰ w for every w ∈ W.

Proof Let X = {x ∈ W | f(x) ≺ x}. If X ≠ ∅, then it has a first element, say x_0. As x_0 ∈ X, we have f(x_0) ≺ x_0 whence f(f(x_0)) ≺ f(x_0). This implies that f(x_0) ∈ X, which contradicts the definition of x_0. Therefore X = ∅. ♦

Corollary A.6.6 Two sections in a well-ordered set W are isomorphic if and only if they are identical.

Proof Suppose that S(x) and S(x′) are isomorphic sections in W and let f : S(x) → S(x′) be an isomorphism. If S(x) ≠ S(x′), then we have either x ≺ x′ or x′ ≺ x. By interchanging S(x) and S(x′), if necessary, we may assume that x′ ≺ x. Then x′ ∈ S(x) which implies that f(x′) ≺ x′. This contradicts Proposition A.6.5. ♦

Corollary A.6.7 The only one-to-one order-preserving mapping of a well-ordered set onto itself is the identity mapping.

Proof Suppose that W is a well-ordered set and f : W → W is an isomorphism. If the set S = {x ∈ W | f(x) ≠ x} is nonempty, then S has a least element a, say. By Proposition A.6.5, we have a ≺ f(a). Since f is surjective, a = f(x) for some x ∈ W. So f(x) ≺ f(a) ⇒ x ≺ a. Then, by the definition of a, f(x) = x and we obtain a = x ≺ a, a contradiction. Therefore S = ∅, and f is the identity mapping. ♦

Theorem A.6.8 If X and Y are well-ordered sets, then exactly one of the following statements holds. (a) X is isomorphic to Y. (b) X is isomorphic to a section in Y. (c) Y is isomorphic to a section in X.

Proof We first show that the three possibilities are mutually exclusive. If (a) and (b) occur together, then Y is isomorphic to a section in it, a contradiction. For the same reason, (a) and (c) cannot occur together. If f : X → S(y) and g : Y → S(x) are isomorphisms, then g ∘ f : X → S(x) is a monomorphism with (g ∘ f)(x) ≺ x. This contradicts Proposition A.6.5, so (b) and (c) also cannot occur together.

Now, we prove that one of the above three possibilities does occur. Suppose that neither of (a) and (b) holds, and let Σ be the family of all sections S(x_α) in X such that there exists an isomorphism f_α of S(x_α) onto a section in Y or onto Y itself. For every pair of indices α and β, S(x_α) ∩ S(x_β) is a section in X, say S(x_γ), and f_α(S(x_γ)) and f_β(S(x_γ)) are clearly isomorphic. If α ≠ β, then one of these is certainly a section in Y, and hence the other is also a section in Y. Therefore, by Corollary A.6.6, we have f_α(S(x_γ)) = f_β(S(x_γ)), and then Corollary A.6.7 shows that f_α(x) = f_β(x) for all x ∈ S(x_γ). Hence, there is a function

f : ⋃_α S(x_α) → ⋃_α f_α(S(x_α))

defined by f(x) = f_α(x) if x ∈ S(x_α). It is easily checked that f is an isomorphism. Moreover, ⋃_α S(x_α) is either X or a section in it, and ⋃_α f_α(S(x_α)) is either Y or a section in Y. If ⋃_α S(x_α) = X, then we have alternative (a) or (b), contrary to our assumption. So ⋃_α S(x_α) = S(x_0) for some x_0 ∈ X, and S(x_0) ∈ Σ. We assert that f : S(x_0) → Y is surjective. Assume otherwise. Then we have f(S(x_0)) = S(y_0) for some y_0 ∈ Y. Obviously, we can extend f to an isomorphism S(x_0) ∪ {x_0} → S(y_0) ∪ {y_0} by defining f(x_0) = y_0. Note that S(y_0) ∪ {y_0} is either a section in Y or coincides with Y (if y_0 is the last element of Y). If x_0 were the last element of X, then we would have alternative (a) or (b). Therefore S(x_0) ∪ {x_0} is a member of the family Σ. Then, from the definition of S(x_0), it follows that x_0 ∈ S(x_0), a contradiction. Hence our assertion holds, and so does the alternative (c). ♦

Corollary A.6.9 Any subset of a well-ordered set W is isomorphic to either a section in W or W itself.

Proof Let X be a subset of W. Then X is a well-ordered set under the induced order. If there is an isomorphism f of W onto a section S(x_0) in X, then there exists a monomorphism g : X → X defined by f, which satisfies g(x_0) ≺ x_0. This contradicts Proposition A.6.5. Therefore, one of the two possibilities (a) and (b) in Theorem A.6.8 must hold. ♦

A.7 Ordinal Numbers

It is evident that the relation of isomorphism between ordinals is an equivalence relation. Ordinal numbers are objects uniquely associated with the isomorphism classes of well-ordered sets. A natural way to define them is to consider these isomorphism classes themselves as the ordinal numbers. The main drawback with this definition is that isomorphism classes of well-ordered sets are unfortunately not sets; accordingly, logical contradictions arise when such large classes are collected into sets. To avoid such difficulties, ordinal numbers are defined to be well-ordered sets such that each isomorphism class of well-ordered sets contains exactly one ordinal number.

Definition A.7.1 An ordinal number is a well-ordered set (W, ⪯) such that (a) if X ∈ W and x ∈ X, then x ∈ W, (b) for every X, Y ∈ W, one of the possibilities: X = Y, X ∈ Y or Y ∈ X holds, and (c) X ⪯ Y ⇔ X ∈ Y or X = Y.

The fact that the ordering ⪯ in the above definition is actually a well ordering follows from one of the axioms in set theory: Every nonempty set X contains an element y such that x ∉ y for all x ∈ X. (∗) As a consequence of this axiom, we see that no nonempty set can be a member of itself, and both the relations X ∈ Y and Y ∈ X cannot hold simultaneously for any two sets X, Y. So only one of the three alternatives described in the condition (b) can occur. It is now obvious that ⪯ is reflexive and antisymmetric. To see the transitivity of this relation, suppose that X ⪯ Y and Y ⪯ Z in W. Then we must have one of the relations X = Z, X ∈ Z or Z ∈ X. If Z ∈ X, then we cannot have X = Y or Y = Z, for Y ∉ X and Z ∉ Y. So X ∈ Y and Y ∈ Z. But, this leads to the violation of Axiom (∗) by the set {X, Y, Z}. Hence X ⪯ Z. Using the above axiom, we also see that every nonempty subset of W has a least member. The empty set ∅ is obviously an ordinal number. To construct a few more ordinal numbers, observe that if W is an ordinal number, then so is W ∪ {W}. Thus {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, etc. are the first few ordinal numbers. Henceforth, we will generally denote ordinal numbers by small Greek letters.
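The first few ordinal numbers can be modelled with nested frozensets, which makes conditions (a) and (b) of Definition A.7.1 directly checkable. This is only a finite sketch; the function names are hypothetical, and the construction grows exponentially, so it is meant for small n.

```python
def successor(alpha: frozenset) -> frozenset:
    """Return alpha ∪ {alpha}."""
    return alpha | frozenset([alpha])


def finite_ordinal(n: int) -> frozenset:
    """Build the ordinal obtained from the empty set by n successor steps."""
    alpha = frozenset()
    for _ in range(n):
        alpha = successor(alpha)
    return alpha


def satisfies_a(W: frozenset) -> bool:
    """Condition (a): X ∈ W and x ∈ X imply x ∈ W."""
    return all(x in W for X in W for x in X)


def satisfies_b(W: frozenset) -> bool:
    """Condition (b): for X, Y ∈ W, one of X = Y, X ∈ Y, Y ∈ X holds."""
    return all(X == Y or X in Y or Y in X for X in W for Y in W)
```

For example, finite_ordinal(3) is the set {∅, {∅}, {∅, {∅}}} with three members, and finite_ordinal(2) is both a member and a proper subset of it. The set {{∅}} fails condition (a), since ∅ ∈ {∅} but ∅ is not a member of {{∅}}.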

Proposition A.7.2 Every element of an ordinal number is an ordinal number.

Proof Let β be an ordinal number and α ∈ β. We show that α is also an ordinal number. Suppose that x ∈ y and y ∈ α. Then we have either x = α or x ∈ α or α ∈ x. If x = α, then we would have x ∈ y and y ∈ x, a contradiction. Now, if α ∈ x, then the subset {x, y, α} ⊂ β fails to have a least element. So x ∈ α, and the condition (a) in Definition A.7.1 holds. Next, suppose that x, y ∈ α. Then both x, y are members of β. So one of the possibilities x = y, x ∈ y or y ∈ x must hold. Thus the condition (b) in Definition A.7.1 is satisfied by α. ♦

Lemma A.7.3 If α and β are ordinal numbers, then α ∈ β if and only if α ⊂ β.

Proof If α ∈ β, then x ∈ α ⇒ x ∈ β, by Definition A.7.1. Moreover, α = β ⇒ α ∈ α, a contradiction. So we have α ⊂ β. Conversely, assume that α ⊂ β. Then β − α is nonempty and has a least element λ, say. We observe that α = λ. Suppose that x ∈ λ. Then x ∈ β, for λ ∈ β. If x ∉ α, then we have λ ⪯ x. This implies that either λ = x or λ ∈ x, and contradicts the fact that λ is an ordinal number, by Proposition A.7.2. On the other hand, if x ∈ α, then x ≠ λ and λ ∉ x, for λ ∉ α. Since both x and λ are members of β, we have x ∈ λ. Thus α = λ ∈ β. ♦

Proposition A.7.4 If α and β are any two ordinal numbers, then α ⊆ β or β ⊆ α.

Proof Suppose that α and β are two ordinal numbers. Then γ = α ∩ β = {x | x ∈ α and x ∈ β} is an ordinal number. If γ ≠ α, β, then γ ⊂ α and γ ⊂ β. By the preceding lemma, we have γ ∈ α and γ ∈ β. Thus γ ∈ α ∩ β = γ, a contradiction. So γ = α or γ = β; accordingly, we have α ⊆ β or β ⊆ α. ♦

Theorem A.7.5 (a) Every nonempty set of ordinal numbers has a least element with respect to inclusion. (b) The union of a set of ordinal numbers is an ordinal number.

Proof (a): Let Σ be a nonempty set of ordinal numbers. Choose α ∈ Σ. If T = α ∩ Σ is empty, then σ ∈ Σ implies that σ ⊄ α, by Lemma A.7.3. Hence α ⊆ σ, by Proposition A.7.4, and α is the least element of Σ. If T ≠ ∅, then T contains an element β such that β ∈ τ or β = τ for every τ ∈ T, since α is well ordered. Thus β ⊆ τ for every τ ∈ T. And, if σ ∈ Σ − T, then we clearly have β ⊂ α ⊆ σ. Thus, in this case, β is the least element of Σ.

(b): Let Σ be a set of ordinal numbers. Then X = ⋃_{α∈Σ} α is a set. It is obvious that the condition (a) of Definition A.7.1 holds in X. Furthermore, if x, y ∈ X, then there exist ordinal numbers α, β ∈ Σ with x ∈ α and y ∈ β. By Proposition A.7.4, we have α ⊆ β or β ⊆ α. To be specific, suppose that α ⊆ β. Then x, y ∈ β and, therefore, they satisfy the condition (b) of Definition A.7.1. Lastly, we show that X is well ordered by the relation x ⪯ y ⇔ x ∈ y or x = y.
It is obvious that ⪯ is a linear ordering on X and, by Proposition A.7.2, every element of X is an ordinal number. Then, for each nonempty subset Y of X, the part (a) of the theorem shows that there is an element z ∈ Y such that z ⊆ y for all y ∈ Y. By Lemma A.7.3, we deduce that z is the least element of Y. Thus (b) holds. ♦

Given two ordinal numbers α, β, we write α ≤ β if and only if α ⊆ β. Then, by Proposition A.7.4, we have either α ≤ β or β ≤ α. Moreover, it is immediate from Theorem A.7.5 that any nonempty set of ordinal numbers is well ordered by the relation ≤. We also see, by Proposition A.7.2 and Lemma A.7.3, that an ordinal number α consists of all those ordinal numbers which precede it. It follows that any two distinct ordinal numbers cannot be isomorphic. For, if α < β, then α is the section {γ ∈ β | γ < α} in β. Therefore, by Proposition A.6.5, α cannot be isomorphic to β. Notice that the class O of all ordinal numbers is not a set. For, if O were a set, then X = ⋃_{α∈O} α would be an ordinal number, by the preceding theorem. Consequently, T = X ∪ {X} is an ordinal number, and we obtain the contradiction X < T ≤ X.

Theorem A.7.6 Every well-ordered set W is isomorphic to a unique ordinal number.

Proof Clearly, if W is isomorphic to two ordinal numbers α and β, then they themselves are isomorphic, and we must have α = β. To find the desired ordinal number, let X be the collection of all elements x ∈ W for which there are ordinal numbers α_x (depending upon x) such that the section S(x) in W is isomorphic to α_x. Put Ψ(x) = α_x. If x, y ∈ X and x = y, then the ordinal numbers α_x and α_y are isomorphic. Hence α_x = α_y, and it follows that Ψ is single-valued on X. By an axiom in set theory, we see that X is a set, and so is im(Ψ). Observe that Ψ is injective, too. For, if Ψ(x) = Ψ(y), then S(x) and S(y) are isomorphic. By Corollary A.6.6, we deduce that S(x) = S(y), which forces x = y. Thus Ψ : X → im(Ψ) is a bijection. We show that it is, in fact, an isomorphism. Denote the ordering in W by ⪯, and suppose that x, y ∈ X and y ≺ x. Let f : S(x) → Ψ(x) be an isomorphism. Then f(y) < Ψ(x), and f induces an isomorphism of S(y) onto {α ∈ Ψ(x) | α < f(y)} = f(y). By the definition of Ψ, Ψ(y) = f(y) < Ψ(x). Conversely, suppose that β = Ψ(y) < Ψ(x) and let g : Ψ(x) → S(x) be an isomorphism. Then β ∈ Ψ(x) and g|β is an isomorphism of β onto S(g(β)). So Ψ(g(β)) = Ψ(y). Since Ψ is injective, we have y = g(β) ≺ x. This proves our claim.

We next show that im(Ψ) is an ordinal number. Since the class O of all ordinal numbers is not a set, we have O ≠ im(Ψ). So there exists an ordinal number ν ∉ im(Ψ). If ν contains elements which do not belong to im(Ψ), then we have such a least element λ, by Theorem A.7.5(a); otherwise, we put λ = ν. We observe that im(Ψ) = {α ∈ O | α < λ} = λ. Clearly, α < λ ⇒ α ∈ im(Ψ). On the other hand, if α ∈ im(Ψ), then there exists an x ∈ W and an isomorphism g : Ψ(x) = α → S(x). Consequently, for every β < α, y = g(β) ≺ x and g induces an isomorphism between β and S(y). So β = Ψ(y) ∈ im(Ψ), and this implies that α < λ. Thus we have im(Ψ) = λ.
To finish the proof, we need to show that X = W. If X ≠ W, then W − X has a least element, say w_0. We observe that X = S(w_0). Suppose x ∈ X, w ∈ W and w ≺ x. Then there exists an isomorphism f : S(x) → Ψ(x), which clearly induces an isomorphism between S(w) and the ordinal number f(w). Therefore, w ∈ X and we conclude that X is the section S(w_0) in W. Since Ψ : X → λ is an isomorphism, we have w_0 ∈ X, a contradiction. ♦

By the preceding theorem, it is clear that each well-ordered set determines a unique ordinal number. If an ordinal number α is nonempty, then the least element a of α must be ∅, for x ∈ a would imply that x ∈ α and x < a. It follows that ∅ is the least ordinal number, denoted by 0. We denote the ordinal number {∅} by 1, {∅, {∅}} by 2, {∅, {∅}, {∅, {∅}}} by 3, and so on. Notice that the ordinal number denoted by a positive integer n is determined by the well-ordered set {0, 1, ..., n − 1}. Moreover, we see that 1 is the immediate successor of 0, 2 is the immediate successor of 1, etc. In general, for each ordinal number α, α ∪ {α} is an ordinal number, which is the least ordinal number > α. Thus α ∪ {α} is the immediate successor of α, usually denoted by α + 1. However, there are ordinal numbers, such as ω = ⋃_{n=1}^{∞} n, which do not have immediate predecessors, and they are called limit ordinal numbers. Observe that ω is isomorphic to the well-ordered set of all nonnegative integers with the usual ordering, and each ordinal number α < ω contains a finite number of elements. We refer to an ordinal number containing a finite number of elements as a finite ordinal number. It is obvious that a finite ordinal number coincides with some ordinal number n, and thus ω is the smallest ordinal number greater than every finite ordinal number. Accordingly, ω is called the first infinite ordinal number. Using the construction of successors, we obtain the sequence of ordinal numbers ω + 1, ω + 2, .... By Theorem A.7.5(b), ⋃_n (ω + n) is an ordinal number, denoted by 2ω = ω + ω. Thus, there is a sequence of ordinal numbers 0, 1, ..., ω, ω + 1, ..., 2ω, 2ω + 1, ..., 2ω + ω = 3ω, .... The ordinal number ⋃_n nω is denoted by ω^2, ⋃_n (ω^2 + nω) is denoted by 2ω^2, and so on. Notice that these are all countable ordinal numbers. We conclude this section by showing the existence of an uncountable ordinal number.
To this end, it suffices to construct an uncountable well-ordered set, since every well-ordered set determines a unique ordinal number.

Proposition A.7.7 There is an uncountable ordinal (W, ⪯) with the last element ω_1 such that, for each w ∈ W other than ω_1, the section S(w) = {x ∈ W | x ≺ w} is countable.

Proof Let X be any uncountable set. By the well-ordering principle, there is a well-ordering ⪯ for X. If X does not have a last element, we choose an element ∞ ∉ X and construct a well-ordered set X ∪ {∞} by adjoining ∞ to X as the last element. This is done by extending the relation on X to X ∪ {∞}: Define x ⪯ ∞ for all x ∈ X. So we can assume that the well-ordered set (X, ⪯) has a last element. Let Y be the set of elements y ∈ X such that the set {x ∈ X | x ⪯ y} is uncountable. Then Y is nonempty, and therefore has a least element. If ω_1 denotes the least element of Y, then W = {x ∈ X | x ⪯ ω_1} is the desired set. ♦

The ordinal W in the preceding proposition is unique in the sense that, if W′ is any ordinal with the same properties, then there is an isomorphism of W onto W′. For, by Theorem A.6.8, we can assume that there is a monomorphism f : W → W′. Then f(W) is clearly uncountable, and f(ω_1) is its last element. If f(ω_1) is distinct from the last element ω′_1 of W′, then f(W) would be countable, being a section S(w′) for some w′ < ω′_1. This contradiction shows that f(ω_1) = ω′_1, and so f is an isomorphism of W onto W′. Also, observe that the section S(ω_1) in W is uncountable, and any countable ordinal is isomorphic to a section S(x) in W for some x ≺ ω_1.

Another notable fact about W is that if A ⊂ W is countable and does not contain ω_1, then sup A ≺ ω_1. To see this, note that the set X_a = {x ∈ W | x ⪯ a} is countable for each a ∈ A. Since A is countable, so is Y = ⋃_a X_a. Clearly, W ≠ Y, for W is uncountable. Let w_0 be the least element of W − Y. Then we have y ∈ Y ⇔ y ≺ w_0. It follows that w_0 has only a countable number of predecessors, and therefore w_0 ≺ ω_1. Clearly, w_0 is an upper bound for A, so sup A ≺ ω_1.

The ordinal number determined by the ordinal W in the preceding proposition is called the first (or least) uncountable ordinal number, and is denoted by Ω. The section {α ∈ Ω | α < Ω}, usually denoted by [0, Ω), is the set of all countable ordinal numbers and has no largest element. For, if α ∈ [0, Ω), then its immediate successor α + 1 is countable. So α + 1 < Ω, since [0, Ω) is uncountable. Accordingly, Ω is also a limit ordinal; in fact, there are uncountably many limit ordinals in [0, Ω].

A.8 Cardinal Numbers

As seen in Sect. A.5, two finite sets X and Y are equipotent if and only if they have the same number of elements. We associate with each set X an object |X| such that two sets X and Y are equipotent (that is, X and Y have the same cardinality) if and only if |X| = |Y|. Isomorphic ordinals are obviously equipotent, and we observe that the converse is also true for finite ordinals. We first show that an ordinal X consisting of n elements, n any nonnegative integer, is isomorphic to a section S(n) in ω. For, if X = ∅, then it is S(0). So assume that X ≠ ∅, and define a mapping f : X → ω as follows. Map the first element x_1 of X into the integer 0. If X − {x_1} ≠ ∅, denote the first element of X − {x_1} by x_2 and put f(x_2) = 1. If X ≠ {x_1, x_2}, denote the first element of X − {x_1, x_2} by x_3 and put f(x_3) = 2. Since X consists of finitely many elements only, this process terminates after finitely many steps. Thus, we obtain a positive integer n such that X = {x_1, ..., x_n}, and there is a mapping f : X → ω given by f(x_i) = i − 1, i = 1, 2, ..., n. Clearly, x_1 < ··· < x_n so that f is an isomorphism of X onto the section S(n) in ω. It follows that any two finite ordinals both having the same number of elements are isomorphic. However, this may not be true for ordinals having infinitely many elements. For example, if ω ∪ {q} is the ordinal obtained from the ordinal ω by adjoining q as the last element, then ω ∪ {q} is not isomorphic to ω, by Proposition A.6.5. But φ : ω ∪ {q} → ω defined by φ(n) = n + 1, φ(q) = 0 is a bijection.

By the well-ordering principle, every set X can be well-ordered and, by Theorem A.7.6, it has the cardinality of an ordinal number. We define |X| to be the least ordinal number such that X is equipotent to |X|. Clearly, the object |X| is uniquely determined by X, and is called the cardinal number of X.
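The bijection φ : ω ∪ {q} → ω mentioned above can be examined on a finite truncation. In the sketch below the adjoined last element q is represented by the string "q", which is purely an encoding choice for illustration, and precedes is a hypothetical helper for the strict order on ω ∪ {q}.

```python
Q = "q"  # stands for the element q adjoined to ω as the last element


def phi(x):
    """phi(n) = n + 1 for n in ω, and phi(q) = 0."""
    return 0 if x == Q else x + 1


def precedes(x, y) -> bool:
    """Strict order on ω ∪ {q}: the usual order on integers, with q after all of them."""
    if x == Q:
        return False
    if y == Q:
        return True
    return x < y


domain = list(range(10)) + [Q]
images = sorted(phi(x) for x in domain)
```

On the truncation {0, ..., 9, q} the values of φ are exactly 0, ..., 10, each taken once. The map is not order-preserving, since 5 precedes q while φ(5) = 6 does not precede φ(q) = 0, in agreement with the fact that ω ∪ {q} and ω are equipotent but not isomorphic.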

If X is finite and has n elements, then |X| is the ordinal number n. The cardinal number |N| is denoted by ℵ_0, and the cardinal number |R| is denoted by ℵ_1 and also by c, called the cardinality of the continuum. (In the standard usage, ℵ_1 denotes the least uncountable cardinal number, and the equality ℵ_1 = c is the continuum hypothesis, which is independent of the usual axioms of set theory.) The mapping x ↦ x/(1 − |x|), where |x| denotes the absolute value of x, is a bijection between the open interval (−1, 1) and R, and the mapping x ↦ (2x − a − b)/(b − a) is a bijection between an open interval (a, b) and (−1, 1). Therefore the cardinality of any open interval (a, b) is c. By Theorem A.5.14, we see that the cardinality of a closed interval or a half open interval is also c.
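The two maps used above are easy to evaluate numerically. The sketch below also includes the map y ↦ y/(1 + |y|), which is not stated in the text; it is offered here as the inverse of x ↦ x/(1 − |x|), as a direct computation confirms. All function names are illustrative.

```python
def to_real(x: float) -> float:
    """x -> x / (1 - |x|), defined for -1 < x < 1."""
    return x / (1 - abs(x))


def from_real(y: float) -> float:
    """y -> y / (1 + |y|), the inverse of to_real (not stated in the text)."""
    return y / (1 + abs(y))


def to_unit(x: float, a: float, b: float) -> float:
    """x -> (2x - a - b) / (b - a), mapping the interval (a, b) onto (-1, 1)."""
    return (2 * x - a - b) / (b - a)
```

For instance, the midpoint of (2, 4) is sent to 0 by to_unit, and to_real sends 0.5 to 1; composing the two maps gives a bijection of (a, b) onto R.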

Proof of Theorem A.5.14: Let f : X → Y be an injection. By the definition of the cardinal number |Y|, there is a bijection ψ : Y → |Y|. Then the composition ψf is an injection of X into |Y|; consequently, there is a well ordering, say, ⪯ on X such that ψf is an isomorphism of (X, ⪯) into |Y|. If o(X, ⪯) denotes the ordinal number determined by the well-ordered set (X, ⪯), then, by Corollary A.6.9, we conclude that o(X, ⪯) ≤ |Y|. On the other hand, |X| ≤ o(X, ⪯), by the definition of |X|. Since the relation ≤ between ordinal numbers is transitive, we have |X| ≤ |Y|. Similarly, we also have |Y| ≤ |X| if there is an injection Y → X. It follows that |X| = |Y|, and hence X and Y are equipotent.

Given two sets X and Y, we have |X| < |Y| if and only if there exists an injection X → Y but there is no bijection between X and Y. Accordingly, it is logical to say that X has fewer elements than Y when |X| < |Y|. As seen in §A.5, there is no surjection from N to I while there is an injection N → I, n ↦ 1/n. So ℵ0 < c.

Cardinal Arithmetic

The sum of two cardinal numbers α and β, denoted by α + β, is the cardinality of the disjoint union of two sets A and B, where |A| = α and |B| = β. Obviously, n + ℵ0 = ℵ0, for N = {1, 2, ..., n} ∪ {n + 1, n + 2, ...}. Since N is the union of disjoint sets {1, 3, 5, ...} and {2, 4, 6, ...}, we obtain ℵ0 + ℵ0 = ℵ0. Similarly, we derive ℵ0 + ··· + ℵ0 = ℵ0. Considering the interval [0, 2) as the union of the intervals [0, 1) and [1, 2), we see that c + c = c. Therefore, for any integer n ≥ 0,

c ≤ n + c ≤ ℵ0 + c ≤ c + c = c,

which implies that n + c = ℵ0 + c = c + c = c.

In fact, for any infinite cardinal number α, we have α + ℵ0 = α. To prove this, we first observe that every infinite set X contains a countably infinite subset. Choose an element x1 ∈ X. Since X ≠ {x1}, we can choose an element x2 ∈ X such that x2 ≠ x1. We still have X − {x1, x2} ≠ ∅. Consequently, we can choose an element x3 ∈ X such that x3 ≠ x1, x2. Assume that we have chosen n distinct elements x1, x2, ..., xn of X. Since X is infinite, it is possible to choose an xn+1 ∈ X that is distinct from each of the xi. Repeating this process indefinitely, we obtain a sequence x1, x2, ... of distinct points of X. The set {x1, x2, ...} constructed in this way is obviously countably infinite. Now, given an infinite cardinal number α, find a set X with α = |X|. Then there exists a countably infinite set Y ⊆ X. Writing β = |X − Y|, we have α = |Y| + |X − Y| = ℵ0 + β, so α + ℵ0 = β + ℵ0 + ℵ0 = β + ℵ0 = α.

Let M be a set and suppose that, for each m ∈ M, αm is a cardinal number. Then $\sum_{m \in M} \alpha_m$ denotes the cardinal number of the union of pairwise disjoint sets Am, m ∈ M, where |Am| = αm. Clearly, the set N can be written as the disjoint union of countably many countable sets:

    1    3    5    7   ···
    2    6   10   14   ···
    4   12   20   28   ···
    8   24   40   56   ···
    ···

So ℵ0 + ℵ0 + ··· = ℵ0. If α1, α2, α3, ... are cardinal numbers such that 1 ≤ αn ≤ ℵ0 for every n = 1, 2, 3, ..., then, by Proposition A.5.11, we derive

α1 + α2 + ··· = ℵ0.
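The array displayed earlier appears to list, in its kth row, the numbers 2^(k−1)(2m − 1) for m = 1, 2, ...; this reading of the pattern is our assumption, based on the entries shown. Under that assumption, the Python sketch below (with hypothetical function names) locates each positive integer in exactly one row and column, which is what makes the rows a partition of N into countably many countably infinite sets.

```python
def position(n):
    """Return (row, column), both counted from 1, of the positive integer n
    in the array whose k-th row lists 2**(k-1) * (2m - 1), m = 1, 2, ..."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    row = 1
    while n % 2 == 0:  # strip factors of 2; each one moves down a row
        n //= 2
        row += 1
    return row, (n + 1) // 2  # the remaining odd part 2m - 1 gives column m


def entry(row, column):
    """Return the array entry in the given row and column (counted from 1)."""
    return 2 ** (row - 1) * (2 * column - 1)


# Every integer below 1000 sits in exactly one cell of the array.
cells = [position(n) for n in range(1, 1000)]
assert len(set(cells)) == len(cells)
assert all(entry(*position(n)) == n for n in range(1, 1000))
```

Since `position` and `entry` are mutually inverse, they give a bijection between N and N × N, in agreement with ℵ0ℵ0 = ℵ0.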

The decomposition of the interval [0, ∞) into the intervals [n − 1, n), where n = 1, 2, ..., shows that c + c + ··· = c.

The product of two cardinal numbers α and β, denoted by αβ, is the cardinality of the Cartesian product of two sets A and B, where |A| = α and |B| = β. If μ is a cardinal number, then the μth power of α, denoted by $\alpha^{\mu}$, is the cardinal number of the set $A^M$, where M is a set with |M| = μ.

Suppose that for every m ∈ M, there is the same cardinal number αm = α. Then $\sum_{m \in M} \alpha_m = \alpha\mu$ and $\prod_{m \in M} \alpha_m = \alpha^{\mu}$, where μ = |M|.

For the first formula, let A be a set with |A| = α and note that $A \times M = \bigcup_{m \in M} A_m$, where $A_m = A \times \{m\}$. Obviously, Am is equivalent to A, and the family {Am | m ∈ M} is pairwise disjoint. So $|A \times M| = \bigl|\bigcup_{m \in M} A_m\bigr|$ and we have $\sum_{m \in M} \alpha_m = \alpha\mu$. With Am = A for all m ∈ M, we have $\prod_{m \in M} A_m = A^M$, and the second formula follows. Using the above results, we obtain

nℵ0 = ℵ0 + ··· + ℵ0 = ℵ0,    ℵ0ℵ0 = ℵ0 + ℵ0 + ··· = ℵ0,
nc = c + ··· + c = c,    ℵ0c = c + c + ··· = c,

where n is any positive integer.

Proposition A.8.1 $c = 2^{\aleph_0}$.

Proof By definition, $2^{\aleph_0}$ is the cardinality of $2^{N}$, the set of all sequences whose terms are the digits 0 and 1 only (that is, the dyadic sequences). Interestingly, only these two digits are required to write the binary (or dyadic) representation of a real number x in [0, 1]; specifically, as the binary expansion, the representation x = 0.a1a2···an··· means the number $x = \sum_{n} a_n/2^n$, where each an is either 0 or 1. Note that, as in the case of decimal expansion, certain numbers in [0, 1] have two binary expansions (e.g., 1/2 = 0.100··· = 0.011···). Hence we deduce that $c \le |2^{N}| \le 2c = c$. ♦

Note that $\aleph_0^2 = \aleph_0\aleph_0 = \aleph_0$ and $\aleph_0^n = \aleph_0 \cdots \aleph_0 = \aleph_0$, by induction on n. Similar results can be proved for any transfinite cardinal number α. With this end in view, we first establish the following.

Proposition A.8.2 For any infinite cardinal number α, 2α = α.

Proof Let X be an infinite set with |X| = α. Denote the two-point set {0, 1} by 2. Then, for any set A, 2 × A is the union of disjoint sets {0} × A and {1} × A and so we have 2|A| = |A| + |A| = |2 × A|. Now, consider the family F of all pairs (A, f) such that A ⊆ X and f : A → 2 × A is a bijection. By Theorem A.5.7, X contains a countably infinite set A and, by Proposition A.5.11, the set 2 × A is also countable. Hence, there is a bijection f : A → 2 × A, and (A, f) belongs to F. Consider the binary relation ≤ on F defined by (A, f) ≤ (B, g) if A ⊆ B and f = g|A. It is easily verified that ≤ is a partial ordering and every chain in F has an upper bound in F. So Zorn's lemma applies, and we get a maximal member (M, h) in F. Then |M| + |M| = |M|. We show that |M| = |X|. If X − M is infinite, then it contains a countably infinite set B. As above, there exists a bijection g between B and 2 × B. Then we have a bijection k : B ∪ M → 2 × (B ∪ M) defined by k|B = g and k|M = h. Thus (B ∪ M, k) belongs to F, and this contradicts the maximality of M. Therefore, X − M must be finite; accordingly, M is infinite. If Y ⊆ M is a countably infinite set (which does exist), then |Y ∪ (X − M)| = |Y| + |X − M| = |Y| and we obtain

|X| = |Y ∪ (X − M) ∪ (M − Y)| = |Y ∪ (X − M)| + |M − Y| = |Y| + |M − Y| = |M|. ♦

As an immediate consequence of this result, we see that if α is an infinite cardinal number, then, by induction, nα = α for every integer n > 0. And, for a cardinal number β ≤ α, α + β = α, since α ≤ α + β ≤ 2α = α.

Proposition A.8.3 For any infinite cardinal number α, α² = αα = α.

Proof Let X be an infinite set with |X| = α and let F be the family of all pairs (A, f) such that A ⊆ X and f : A → A × A is a bijection. Clearly, X contains a countably infinite set A and, by Lemma A.5.10, the set A × A is also countable. So there is a bijection f between A and A × A, and thus the pair (A, f) belongs to F. We partially order F by (A, f) ≤ (B, g) if A ⊆ B and f = g|A. Given a chain C = {(Ai, fi)} in F, put B = ⋃ Ai and define f : B → B × B by setting f(x) = fi(x) if x ∈ Ai. Then f is clearly a bijection so that the pair (B, f) belongs to F. It is obvious that (B, f) is an upper bound for the chain C. By Zorn's lemma, we have a maximal member (M, h) in F. Since h : M → M × M is a bijection, |M| = |M||M|. We observe that |X| = |M|. As M ⊆ X, |M| ≤ |X|. If |M| < |X|, then we find that |M| < |X − M|. For, |X − M| ≤ |M| implies that

|X| = |M| + |X − M| ≤ |M| + |M| = |M|,

by the preceding proposition. This contradicts our assumption that |M| < |X|. Let j : M → X − M be an injection and put Y = j(M). Then |Y| = |M|. Since Y is infinite, we have 3|Y| = |Y|, and it follows that there are three disjoint sets A, B, and C contained in Y such that Y = A ∪ B ∪ C and |Y| = |A| = |B| = |C|. Since |M × Y| = |Y × M| = |Y × Y| = |M||M| = |M| = |Y|, we have |A| = |M × Y|, |B| = |Y × M| and |C| = |Y × Y|; accordingly, there is a bijection k : Y → (M × Y) ∪ (Y × M) ∪ (Y × Y). Since M ∩ Y = ∅, we have a bijection M ∪ Y → (M ∪ Y) × (M ∪ Y) which extends h (and k). This contradicts the maximality of M, and hence |M| = |X|. ♦

If α is an infinite cardinal number, then, by induction on n, we see that $\alpha^n = \alpha$ for every integer n > 0. Moreover, for any cardinal number β with 1 ≤ β ≤ α, we have α ≤ αβ ≤ αα = α, and therefore αβ = α. In particular, notice that ℵ0α = α. We use these results to prove a generalization of Theorem A.5.12.

Theorem A.8.4 Let X be an infinite set and F be the family of all finite subsets of X. Then |F| = |X|.

Proof Since the mapping X → F, x ↦ {x}, is an injection, we have |X| ≤ |F|. To see the opposite inequality, for each integer n > 0, let Fn denote the family of all those subsets of X which contain exactly n elements. Now, for each set F ∈ Fn, choose an element φn(F) in $X^n$ = X × ··· × X (n copies) whose coordinates are exactly the n elements of F, in some order. Of course, there are several choices for φn(F); we pick any one of these. Then φn : Fn → $X^n$ is an injection, and so $|F_n| \le |X^n| = |X|^n = |X|$. Obviously, we have F = {∅} ∪ ⋃n Fn and therefore

$$|F| \le 1 + \sum_{n} |F_n| \le 1 + \sum_{n} |X| = 1 + \aleph_0 |X| = |X|.$$ ♦

Let X be a set and A ⊆ X. The function fA : X → {0, 1} defined by

$$f_A(x) = \begin{cases} 0 & \text{for } x \notin A, \\ 1 & \text{for } x \in A \end{cases}$$

is called the characteristic function of A. The function which is zero everywhere is the characteristic function of the empty set, and the function which is identically 1 on X is the characteristic function of X. The set of all functions X → {0, 1} is denoted by $2^X$. Obviously, every element of $2^X$ is a characteristic function on X.

Proposition A.8.5 For any set X, there is a bijection between the set P(X) of all its subsets and $2^X$.

Proof For a set A ⊆ X, let fA denote the characteristic function of A. Then the mapping φ : $2^X$ → P(X), f ↦ $f^{-1}(1)$, is clearly surjective, since φ(fA) = A. It is also injective. For, if f ≠ g are in $2^X$, then we have f(x) ≠ g(x) for some x ∈ X. It follows that x belongs to one of the sets $f^{-1}(1)$ and $g^{-1}(1)$, but not to the other. So φ(f) ≠ φ(g), and φ is an injection. ♦
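For a finite set, the correspondence between functions X → {0, 1} and subsets of X can be enumerated directly. The Python sketch below is a small illustration only; the three-element set and the helper names `phi` and `characteristic` are our own choices and do not come from the text.

```python
from itertools import product


def phi(f):
    """Send a function f: X -> {0, 1}, stored as a dict, to the subset f^{-1}(1)."""
    return frozenset(x for x, value in f.items() if value == 1)


def characteristic(A, X):
    """Return the characteristic function of A as a dict defined on X."""
    return {x: (1 if x in A else 0) for x in X}


X = ["a", "b", "c"]

# All 2**3 functions X -> {0, 1}, one per tuple of values.
functions = [dict(zip(X, values)) for values in product((0, 1), repeat=len(X))]
subsets = {phi(f) for f in functions}

# phi is injective (no two functions share an image) ...
assert len(subsets) == len(functions) == 2 ** len(X)
# ... and surjective, because phi(f_A) = A for every subset A.
assert all(phi(characteristic(A, X)) == A for A in subsets)
```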

From the above proposition, it is clear that $|P(X)| = 2^{|X|}$. The function x ↦ {x} is obviously an injection from X into P(X), but there is no bijection between these sets, by Theorem A.5.13. Therefore |X| < |P(X)|, and thus we have established the following.

Proposition A.8.6 For any set X, $|X| < 2^{|X|}$.

Appendix B Fields R, C, and H

© Springer Nature Singapore Pte Ltd. 2019. T. B. Singh, Introduction to Topology, https://doi.org/10.1007/978-981-13-6954-4

B.1 The Real Numbers

We shall not concern ourselves here with the construction of the real number system on the basis of a more primitive concept such as the positive integers or the rational numbers. Instead, we assume familiarity with the system R of real numbers as an ordered field which is complete (that is, it has the least upper bound property). For the reader's convenience, however, we review and list the essential properties of R. With the usual addition and multiplication, the set R has the following properties.

Theorem B.1.1
(a) R is an abelian group under addition; the number 0 acts as the neutral element.
(b) R − {0} is an abelian group under multiplication; the number 1 acts as the multiplicative identity (unit element).
(c) For all a, b, c ∈ R, a(b + c) = ab + ac.

A field is a set F containing at least two elements 1 ≠ 0 together with two binary operations called addition and multiplication, denoted by + and · (or juxtaposition), respectively, which satisfy conditions (a), (b), and (c) of Theorem B.1.1 with F in place of R. The element 0 is the identity element for addition, and 1 acts as the multiplicative identity.

With this terminology, the set R is a field under the usual addition and multiplication. This field has an order relation < which satisfies
(a) a + b < a + c if b < c, and
(b) ab > 0 if a > 0 and b > 0.

A field which also has an order relation satisfying these two conditions is called an ordered field. The set Q of all rational numbers is another example of an ordered field. We call an element a of an ordered field positive if a > 0, and negative if a < 0.

Theorem B.1.2 The following statements are true in every ordered field:
(a) a > 0 ⇒ −a < 0, and vice versa.
(b) a > 0 and b < c ⇒ ab < ac; a < 0 and b < c ⇒ ab > ac.
(c) a ≠ 0 ⇒ a² > 0. In particular, 1 > 0.
(d) a > 0 ⇒ a⁻¹ > 0; a < 0 ⇒ a⁻¹ < 0.
(e) ab > 0 ⇒ either both a > 0 and b > 0 or both a < 0 and b < 0.
(f) ab < 0 ⇒ either both a < 0 and b > 0 or both a > 0 and b < 0.

The ordered field R has the least upper bound property: If S ⊆ R is nonempty and bounded above, then sup S exists in R. This is also called the completeness property of R. It follows that R is a complete ordered field. Moreover, R has no gaps: If a < b, then c = (a + b)/2 satisfies a < c < b. Thus we see that R is a linear continuum (refer to Exercise 3.1.11). Using the completeness property, it can be shown that every nonempty set of real numbers with a lower bound has an infimum. Another important consequence of the completeness property of R is described by the following.

Theorem B.1.3 (Archimedean Property) If x > 0 is a real number, then, given any real number y, there exists a positive integer n such that nx > y.

Using this property of R, one can prove that, for any two real numbers x < y, there is a rational number r ∈ Q such that x < r < y. This fact is usually stated by saying that Q is dense in R.

The absolute value of a real number x, denoted by |x|, is defined by |x| = x for x ≥ 0, and |x| = −x for x < 0.

Proposition B.1.4 For any x, y ∈ R, we have
(a) |x| ≥ 0.
(b) |x| = 0 ⇔ x = 0.
(c) |−x| = |x|.
(d) |xy| = |x||y|.
(e) |x| ≤ y ⇔ −y ≤ x ≤ y.
(f) −|x| ≤ x ≤ |x|.
(g) |x + y| ≤ |x| + |y|.
(h) ||x| − |y|| ≤ |x − y|.
(i) |x − y| ≤ |x| + |y|.

By (d), $|x| = \sqrt{x^2}$.

B.2 The Complex Numbers

The set R² of all ordered pairs (x, y) of real numbers turns into a field under the following addition and multiplication:

(x, y) + (x′, y′) = (x + x′, y + y′),
(x, y)(x′, y′) = (xx′ − yy′, xy′ + yx′).

The element (0, 0) acts as the neutral element for addition, and the element (1, 0) plays the role of multiplicative identity. It is routine to check that R² is a field under these definitions. It is usually denoted by C, and its elements are referred to as the complex numbers. It is readily verified that (x, 0) + (y, 0) = (x + y, 0), and (x, 0)(y, 0) = (xy, 0). This shows that the complex numbers of the form (x, 0) form a subfield of C, which is isomorphic to R under the correspondence x ↦ (x, 0). We can therefore identify this subfield of C with the real field and regard R ⊂ C. Writing ı = (0, 1), we have ı² = −1 and (x, y) = (x, 0) + (y, 0)(0, 1) = x + yı, using the identification x ↔ (x, 0). Thus C = {x + yı | x, y ∈ R}, where ı² = −1. If z = x + yı is a complex number, then we call x the real part of z (denoted by Re(z)), and y the imaginary part of z (denoted by Im(z)).

If z = x + yı ∈ C, its conjugate is defined to be the complex number $\bar{z}$ = x − yı. For any complex numbers z and w, we have

$$\overline{z + w} = \bar{z} + \bar{w}, \qquad \overline{zw} = \bar{z}\,\bar{w}.$$

Observe that $z\bar{z}$ is a positive real number unless z = 0. The absolute value |z| of a complex number z is defined to be $\sqrt{z\bar{z}}$ (the nonnegative square root). Clearly, |z| > 0 except when z = 0, and |0| = 0. It is also obvious that |Re(z)| ≤ |z| and |z| = |$\bar{z}$|. If z and w are any two complex numbers, then it is easily checked that |zw| = |z||w| and |z + w| ≤ |z| + |w|.
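The ordered-pair description of C can be tried out directly. The following Python sketch implements the addition and multiplication rules for pairs of integers as given above; the function names and the sample values are ours, chosen only for illustration.

```python
def add(z, w):
    """(x, y) + (x', y') = (x + x', y + y')."""
    return (z[0] + w[0], z[1] + w[1])


def mul(z, w):
    """(x, y)(x', y') = (xx' - yy', xy' + yx')."""
    x, y = z
    u, v = w
    return (x * u - y * v, x * v + y * u)


def conj(z):
    """Conjugate of x + yi, as a pair: (x, -y)."""
    return (z[0], -z[1])


i = (0, 1)
z, w = (3, 4), (1, -2)

assert mul(i, i) == (-1, 0)                         # i^2 = -1
assert mul((1, 0), z) == z                          # (1, 0) is the identity
assert conj(add(z, w)) == add(conj(z), conj(w))     # conjugation is additive
assert conj(mul(z, w)) == mul(conj(z), conj(w))     # and multiplicative
assert mul(z, conj(z)) == (3 * 3 + 4 * 4, 0)        # z * conj(z) = |z|^2, a real
```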

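B.2.1 (Schwarz Inequality) If z1, ..., zn and w1, ..., wn are complex numbers, then we have

$$\Bigl|\sum_{j=1}^{n} z_j \bar{w}_j\Bigr|^2 \le \sum_{j=1}^{n} |z_j|^2 \, \sum_{j=1}^{n} |w_j|^2 .$$

Proof Put $\alpha = \sum_{j=1}^{n} |z_j|^2$, $\beta = \sum_{j=1}^{n} |w_j|^2$ and $\gamma = \sum_{j=1}^{n} z_j \bar{w}_j$. We need to show that $\alpha\beta - |\gamma|^2 \ge 0$. If α = 0 or β = 0, this is trivial. We therefore assume that α ≠ 0 ≠ β. Then α, β > 0 and we have

$$\sum_{j=1}^{n} |\beta z_j - \gamma w_j|^2 = \sum_{j=1}^{n} (\beta z_j - \gamma w_j)(\beta \bar{z}_j - \bar{\gamma}\bar{w}_j) = \beta^2\alpha - \beta|\gamma|^2 = \beta\bigl(\alpha\beta - |\gamma|^2\bigr).$$

The left-hand side of the first equality is obviously nonnegative and β > 0, and therefore we must have $\alpha\beta - |\gamma|^2 \ge 0$. ♦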
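The identity used in the proof can be verified numerically on a small example. The Python sketch below is only a sanity check under our own choice of sample vectors; `abs2` and `schwarz_terms` are hypothetical helper names, and the squared modulus is computed from real and imaginary parts to avoid taking square roots.

```python
def abs2(z):
    """Squared modulus |z|^2 of a complex number, without a square root."""
    return z.real ** 2 + z.imag ** 2


def schwarz_terms(z, w):
    """Return (alpha, beta, gamma) as defined in the proof."""
    alpha = sum(abs2(a) for a in z)
    beta = sum(abs2(b) for b in w)
    gamma = sum(a * b.conjugate() for a, b in zip(z, w))
    return alpha, beta, gamma


z = [1 + 2j, 3 - 1j, -2 + 0j]
w = [2 - 1j, 0 + 1j, 1 + 1j]
alpha, beta, gamma = schwarz_terms(z, w)

# Left-hand and right-hand sides of the identity in the proof.
lhs = sum(abs2(beta * a - gamma * b) for a, b in zip(z, w))
rhs = beta * (alpha * beta - abs2(gamma))

assert abs(lhs - rhs) < 1e-9
assert abs2(gamma) <= alpha * beta  # the Schwarz inequality itself
```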

B.3 The Quaternions

By using the usual scalar and (nonassociative) vector product in R³, Hamilton defined (in 1843) a multiplication in R⁴, which together with componentwise addition makes it into a skew field. This skew field has proven to be fundamental in several areas of mathematics and physics. We intend to discuss here some basic facts about it. Recall that a skew field (also called a division ring) satisfies all field axioms except the commutativity of multiplication.

The mapping R⁴ → R × R³, (x0, x1, x2, x3) ↦ (x0, (x1, x2, x3)), is a bijection. If we define the vector space structure on R × R³ over R componentwise:

(a, x) + (b, y) = (a + b, x + y) and c(a, x) = (ca, cx),

then this mapping becomes an isomorphism. Consequently, we can identify R × R³ = H with R⁴, and call its elements quaternions. If q = (a, x), we refer to a as the real part of q and x as the vector part of q. There are canonical monomorphisms R → H, a ↦ (a, 0), and R³ → H, x ↦ (0, x), of vector spaces. Hence, we identify the real number a with the quaternion (a, 0), and the vector x with the quaternion (0, x). Then a quaternion (a, x) can be written as a + x. Accordingly, for any two quaternions q = a + x and r = b + y, we have q + r = a + b + x + y, and if c is a real, then cq = ca + cx.

If x, y ∈ R³ ⊂ H, we first define xy = −x · y + x × y, where · is the usual scalar product, and × is the usual vector product in R³. Notice that xy is in general an element of H. It is easily checked that this multiplication of vectors in H is associative. As the multiplication in H ought to be distributive, we set

qr = ab + ay + bx + xy

for q = a + x and r = b + y. We leave it to the reader to verify the following conditions for all quaternions q, r, s, and real c:

q(cr) = c(qr) = (cq)r,    q(rs) = (qr)s,
q(r + s) = qr + qs,    (r + s)q = rq + sq.

The quaternion 1 = 1 + 0 acts as the multiplicative identity: 1q = q = q1 for every q ∈ H.

To complete the proof that H is a skew field with the above addition and multiplication, it remains to verify that every nonzero quaternion q has a multiplicative inverse. With this end in view, we define the conjugate of q = a + x as $\bar{q}$ = a − x. Observe that $\overline{q + r} = \bar{q} + \bar{r}$, $\overline{cq} = c\bar{q}$, $\overline{qr} = \bar{r}\,\bar{q}$ for any quaternions q, r, and real c. Also, it is straightforward to see that $q\bar{q} = \bar{q}q = a^2 + x \cdot x$ (a real). We define the modulus of q to be $|q| = \sqrt{q\bar{q}}$. Notice that |q| is the Euclidean norm of q when it is considered as an element of R⁴. For q, r ∈ H, we have

$$|qr|^2 = (qr)\overline{(qr)} = q r \bar{r}\,\bar{q} = q|r|^2\bar{q} = |r|^2 q\bar{q} = |r|^2|q|^2,$$

which implies that |qr| = |q||r|. Clearly, |q| = 0 ⇔ q = 0 and, if q ≠ 0, then $q\,(\bar{q}/|q|^2) = 1 = (\bar{q}/|q|^2)\,q$. Thus $\bar{q}/|q|^2$ is the inverse of q in H, usually denoted by q⁻¹.

Observe that x ∈ R³ is a unit vector ⇔ x · x = 1 ⇔ x² = −1, and two vectors x, y ∈ R³ are orthogonal ⇔ x · y = 0 ⇔ xy = −yx. A right-handed orthonormal system in R³ is an ordered triple ı, j, k of vectors in R³ such that ı, j, k are of unit length, mutually orthogonal and ı × j = k. So, if ı, j, k form a right-handed orthonormal system, then ı² = j² = k² = −1 and ıjk = −1. Conversely, these conditions imply that ı, j, k are of unit length, and ıj = k whence ı · j = 0 and ı × j = k. Thus ı, j, k form a right-handed orthonormal system.

Suppose now that ı, j, k is a right-handed orthonormal system in R³. Then any vector x ∈ R³ can be written uniquely as x = x1ı + x2j + x3k, xi ∈ R; accordingly, any quaternion q can be expressed uniquely as

q = q0 + q1ı + q2j + q3k, qi ∈ R.

Clearly, we have ıj = k = −jı, jk = ı = −kj, kı = j = −ık. Using these rules, we obtain the following formula for the product of two elements q = q0 + q1ı + q2j + q3k and q′ = q0′ + q1′ı + q2′j + q3′k in H:

$$
\begin{aligned}
qq' ={}& (q_0q_0' - q_1q_1' - q_2q_2' - q_3q_3') + (q_0q_1' + q_1q_0' + q_2q_3' - q_3q_2')\,\imath \\
&+ (q_0q_2' + q_2q_0' + q_3q_1' - q_1q_3')\,j + (q_0q_3' + q_3q_0' + q_1q_2' - q_2q_1')\,k.
\end{aligned}
$$

We also note that $\bar{q} = q_0 - q_1\imath - q_2 j - q_3 k$ and $|q| = \sqrt{q_0^2 + q_1^2 + q_2^2 + q_3^2}$.

For any unit vector x ∈ R³, the set of quaternions a + bx, a, b ∈ R, is a subfield of H isomorphic to C under the mapping a + bx ↦ a + bı. In particular, the subfield of quaternions with no j and k components is identified with C, and we regard C as a subfield of H. Thus, we have field inclusions R ⊂ C ⊂ H. It is obvious that any real number commutes with every element of H. Conversely, if a quaternion q commutes with every element of H, then q ∈ R. We emphasize, however, that the elements of C do not, in general, commute with the elements of H; for instance, ıj = −jı.
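The product formula for quaternions lends itself to a direct check. The Python sketch below represents q0 + q1ı + q2j + q3k as the 4-tuple (q0, q1, q2, q3) of integers; this representation and the function names are our own assumptions, made only to illustrate the rules stated in this section.

```python
def qmul(q, r):
    """Product of two quaternions given as 4-tuples (real, i, j, k)."""
    a0, a1, a2, a3 = q
    b0, b1, b2, b3 = r
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 + a2 * b0 + a3 * b1 - a1 * b3,
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,
    )


def qconj(q):
    """Conjugate: negate the vector part."""
    return (q[0], -q[1], -q[2], -q[3])


def norm2(q):
    """Squared modulus |q|^2 = q0^2 + q1^2 + q2^2 + q3^2."""
    return sum(c * c for c in q)


I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
MINUS_ONE = (-1, 0, 0, 0)

assert qmul(I, I) == qmul(J, J) == qmul(K, K) == MINUS_ONE
assert qmul(qmul(I, J), K) == MINUS_ONE            # ijk = -1
assert qmul(I, J) == K and qmul(J, I) == (0, 0, 0, -1)  # ij = k = -ji

q, r = (1, 2, 3, 4), (5, 6, 7, 8)
assert qmul(q, qconj(q)) == (norm2(q), 0, 0, 0)    # q * conj(q) = |q|^2
assert norm2(qmul(q, r)) == norm2(q) * norm2(r)    # |qr| = |q||r|
assert qconj(qmul(q, r)) == qmul(qconj(r), qconj(q))  # conj(qr) = conj(r) conj(q)
```

The check `qmul(I, J) != qmul(J, I)` is a concrete instance of the failure of commutativity that distinguishes H from a field.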


T R Television topology, 24 Rank Tietze extension theorem, 191 of a free abelian group, 353 Topological invariant, 35 of a free group, 355 Topologically complete space, 220 Reduced suspension, 154 , 7 Refinement, 205 Topological sum, 149 Reflection, 313 Topologist’s sine curve, 57 Regular covering, 398 Topology, 7 Regular space, 181 admissible, 243 Relative homotopy, 321 compact-open, 244 Relative topology, 25 of compact convergence, 258 Retract, 321 of pointwise convergence, 236 Right translation, 270 of uniform convergence, 238 Rotation, 311 of uniform convergence on compacta, 258 Torus, 41, 128 S Totally bounded space, 109 Saturated set, 126, 137 Totally disconnected space, 60 Second countable space, 169 Tychonoff theorem, 99 Semilocally simply connected, 405 Tychonoff topology, 43 , 175 , 53 Sequence, 77 U Sequentially compact space, 107 Ultrafilter, 87 Sequential space, 174 Uniform homeomorphism, 228 Set Uniformly continuous map, 113 countable, 421 Uniformly convergent sequence, 80 of first category, 231 Uniformly equivalent metric, 228 of second category, 231 Uniform metric, 50 order complete, 426 Unitary space, 3 orthonormal, 277 Unit n-cube, 5 well ordered, 426 Unit n-disc, 5 Sheet, 374 Unit n-sphere, 5 Sierpinski space, 7 Universal covering space, 388 Simple ordering, 424 Universal (ultra) net, 84 Simply connected space, 334 Upper limit topology, 20 Skew field, 443 Urysohn embedding theorem, 191 Smash product, 153 Urysohn lemma, 188 Stereographic projection, 34 Urysohn metrization theorem, 190 Stone–Cechˇ compactification, 199 Strong deformation retract, 321 Subbasis, 16 W Subcovering, 95 Weak topology, 164 Subnet, 83 Wedge, 151 Subsequence, 78 Winding number, 341