
© COPYRIGHT

by

Amanda Purcell

2013

ALL RIGHTS RESERVED

ULTRAPRODUCTS AND THEIR APPLICATIONS

BY

Amanda Purcell

ABSTRACT

An ultraproduct is a mathematical construction used primarily in abstract algebra and model theory to create a new structure by reducing a product of a family of existing structures using a class of objects referred to as filters. This thesis provides a rigorous construction of ultraproducts and investigates some of their applications in the fields of mathematical logic, nonstandard analysis, and complex analysis. An introduction to basic set theory is included and used as a foundation for the ultraproduct construction. It is shown how to use this method on a family of models of first order logic to construct a new model of first order logic, with which one can produce a proof of the Compactness Theorem that is both elegant and robust. Next, an ultraproduct is used to offer a bridge between intuition and the formalization of nonstandard analysis by providing concrete infinite and infinitesimal elements. Finally, a proof of the Ax-Grothendieck Theorem is provided in which the ultraproduct and other previous results play a critical role. Rather than examining one in-depth application, this text features ultraproducts as tools to solve problems across various disciplines.


ACKNOWLEDGMENTS

I would like to give very special thanks to my advisor, Professor Ali Enayat, whose expertise, understanding, and patience were invaluable to my pursuit of this degree. His vast knowledge and passion for teaching inspired me throughout my collegiate and graduate career and will continue to do so.


TABLE OF CONTENTS

ABSTRACT

ACKNOWLEDGMENTS

Chapter

1. INTRODUCTION

2. FILTERS, ULTRAFILTERS, AND REDUCED PRODUCTS

3. ULTRAPRODUCTS AND COMPACTNESS

4. ULTRAPRODUCTS AND NONSTANDARD UNIVERSES

5. THE AX-GROTHENDIECK THEOREM

BIBLIOGRAPHY

CHAPTER 1

INTRODUCTION

The ultraproduct construction is a method used primarily in model theory and abstract algebra that creates a new structure by reducing a product of existing structures. Its value stems from the ability to determine properties of the ultraproduct based solely on the properties of the structures within the reduced product and the equivalence relation by which it is reduced. The ultraproduct is attractive in its algebraic nature and in the fact that it can be constructed using only basic set theoretic concepts.

This thesis begins (Chapter 2) with an outline of the set theory required for the construction of ultraproducts, beginning with filters (specific sets of sets). It includes important observations on the existence and uses of certain filters before providing the step-by-step construction of reduced products and describing their elements. This chapter provides the foundation for all applications of ultraproducts discussed ahead.

To be able to define the notion of an ultraproduct, and to be able to understand its properties, it is necessary to provide a brief introduction to model theory, which comes in the beginning of Chapter 3. Chapter 3 then defines an ultraproduct of models and the elements, functions, and relations on this new object. Next, this chapter shows how to interpret sentences of first order logic in the ultraproduct model by proving the ever-important and applicable Łoś’s Theorem, which will recur throughout the text. Ultimately, Chapter 3 provides a proof of the Compactness Theorem, one of the most fundamental results in first order logic.

Chapter 4 then changes gears to provide an examination of basic nonstandard analysis using the tools and elements of ultraproducts developed in earlier sections. The ultraproduct construction allows us to build ordered "nonarchimedean" fields, i.e., fields that contain, as concrete objects, infinitely small quantities as well as infinitely large ones. Moreover, these nonarchimedean fields are shown to satisfy the same first order sentences as those that are true in the ordered field of real numbers. In Chapter 4 we employ the aforementioned nonarchimedean fields to provide a new way of establishing a number of classical results in Calculus in an intuitively clear manner, with the explicit, unabashed use of infinitesimals.

The final chapter (5) offers a proof of the Ax-Grothendieck Theorem, alternative to the proof from Walter Rudin [1] utilizing techniques of complex analysis. This final part highlights the rather metamathematical method of proof used a great deal in disciplines of advanced mathematics: the method in which one takes a problem from some specific area, maps the relevant pieces to a different setting in which the problem can be solved, and maps back to complete the proof.

In the end, it should be clear to the reader that the ultraproduct model is not only interesting in and of itself, but is a very powerful tool that can provide answers, or methods to obtain them, in many other disciplines outside of model theory.

CHAPTER 2

FILTERS, ULTRAFILTERS, AND REDUCED PRODUCTS

Set Theory and Construction of Filters

We begin with a few basic set theoretic concepts that will be useful in our construction of filters, and ultimately in the construction of ultraproducts.

• The powerset of a set X, denoted P(X), is the set of all possible subsets of X. For example, for X = {A,B}, the powerset of X would be written P(X) = {∅,{A},{B},{A,B}}, where ∅ denotes the “empty set,” the unique set containing no elements.

• A is a subset of a set X, or X contains A (written A ⊆ X), if every element of A is also an element of X.

• The union of two sets, A and B, denoted A ∪ B, is the collection of elements that are in A, in B, or in both A and B.

• The intersection of two sets A and B, denoted A ∩ B, is the collection of elements that are in both A and B.

• The complement of a set A in relation to a background set X where A ⊆ X, denoted X\A, is the set of all elements of X that are not in the subset A.

For purposes that arise in a later section, it is also important to understand the concept of cardinality. The cardinality of a set X, denoted |X|, is the number of elements in the set; numbers representing the cardinality of sets are called “cardinal numbers.” For example, the cardinality of the natural numbers (N) is ℵ0, the first infinite cardinal. The cardinality of the real numbers (R) is 2^ℵ0, where 2^ℵ0 > ℵ0 by a classical theorem of Cantor.

Armed with these basic set theoretic concepts, we are now equipped to discuss the first class of objects important to our construction of the ultraproduct: filters.

Definition 1. Let X be a nonempty set. A filter F over X is a nonempty family of sets F ⊆ P(X) such that:

(i) if A, B ∈ F , then A ∩ B ∈ F , and (ii) if A ∈ F and A ⊆ B ⊆ X, then B ∈ F .

In other words, F is a set of subsets of X that is closed under intersections and supersets. For example, F = P(X) is a filter over X; this is called the “improper filter.” The set containing only X itself, {X}, also forms a filter, called the “trivial filter.” All other filters are referred to as “proper nontrivial filters.”
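For a finite background set, the two closure conditions of Definition 1 can be checked mechanically. The following is a minimal sketch under that finiteness assumption; the helper names `powerset` and `is_filter` are illustrative and not part of the text.

```python
from itertools import combinations

def powerset(X):
    """Return all subsets of the finite set X as a set of frozensets."""
    items = sorted(X)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def is_filter(X, F):
    """Check Definition 1 for a family F of subsets of a finite set X."""
    X = frozenset(X)
    F = {frozenset(A) for A in F}
    if not F or not all(A <= X for A in F):
        return False  # F must be a nonempty family of subsets of X
    closed_under_intersections = all(A & B in F for A in F for B in F)
    closed_under_supersets = all(B in F for A in F
                                 for B in powerset(X) if A <= B)
    return closed_under_intersections and closed_under_supersets

X = {1, 2, 3}
improper = powerset(X)                          # F = P(X)
trivial = {frozenset(X)}                        # F = {X}
not_a_filter = {frozenset({1}), frozenset(X)}   # lacks {1,2} and {1,3}
```

Here `is_filter(X, improper)` and `is_filter(X, trivial)` both hold, while `not_a_filter` fails upward closure.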

Proposition 2. A filter F is proper if and only if ∅ ∉ F.

Proof. We prove the ⇒ direction by way of contradiction. Suppose that F is a proper filter over X and that ∅ ∈ F. As every subset A ⊆ X contains ∅,

(∅ ⊆ A ⊆ X) ⇒ (A ∈ F), for all A ⊆ X,

and every possible subset of X is a member of F by upward closure. Therefore F = P(X), and F must be improper, contradicting our assumption. Hence, for any proper filter F over a set X, we have ∅ ∉ F.

The ⇐ direction is immediate: if ∅ ∉ F, then F ≠ P(X), so F is proper.

Also note that for all filters F over X, the background set X will be a member of the filter F . Given that F is nonempty by definition, let A ∈ F for some A ⊆ X. By upward closure, we have

(A ⊆ X) ⇒ (X ∈ F),

and X ∈ F for all filters F over X.

Some examples of proper filters include:

• The principal filter: Fix some element i0 ∈ X, and consider the family of subsets {A ⊆ X | i0 ∈ A}. This defines a filter over X, and is referred to as the principal filter generated by i0. Any filter that is not principal is referred to as free.

• The Fréchet filter: Let X be an infinite set, and let

F = {A ⊆ X | X\A is finite},

i.e., the set of all co-finite subsets of X. This filter is called the Fréchet filter, a very important filter that will be utilized later.

Before we can discuss more interesting properties about filters, we must first define the following property:

Definition 3. A set F of subsets of X is said to have the finite intersection property, or f.i.p., if the intersection of any finite number of those subsets is nonempty, i.e.,

A1 ∩ A2 ∩ ... ∩ An ≠ ∅ for Ai ∈ F, 1 ≤ i ≤ n, and n ∈ N.

We claim that any proper filter has f.i.p., and for any set of subsets that has f.i.p., there exists a proper filter containing that set. Before we prove our claim, consider our examples of proper filters above. The principal filter generated by i0 clearly has f.i.p. as i0 will be a member of every set in the filter, and thus a member of any finite intersection. The Fréchet filter also clearly has f.i.p. as any finite intersection will be the complement of the union of a finite number of finite sets (which will also be finite) in relation to an infinite background set.

Claim 4. Let F be a filter over X. Then the following statements are equivalent:

(i) F is a proper filter.

(ii) ∅ ∉ F.

(iii) F has f.i.p.

Proof. To prove that the above statements are equivalent, it is sufficient to show that (i) implies (ii), (ii) implies (iii), and (iii) implies (i).

(i)→(ii). By Proposition 2.

(ii)→(iii). Suppose, by way of contradiction, that ∅ ∉ F and that F does not have f.i.p. Then there exist finitely many elements A1,...,An of F such that A1 ∩ ... ∩ An = ∅. Since filters are closed under finite intersections, this intersection must also be a member of F, contradicting our assumption that ∅ ∉ F.

(iii)→(i). Suppose, by way of contradiction, that F has f.i.p. and that F is improper. As F is improper, given any subset A ⊂ X (where A ≠ X), we have both A ∈ F and X\A ∈ F, and therefore, by closure under intersections, A ∩ (X\A) = ∅ ∈ F, contradicting our assumption.

Lemma 5. Suppose G is a set of subsets of the background set X such that G has f.i.p. Then G can be extended to a proper filter that contains G .

Proof. Let G be a set of subsets that has f.i.p. and define:

F = {S ⊆ X | S ⊇ A1 ∩ ... ∩ An, where Ai ∈ G for 1 ≤ i ≤ n, for some n ∈ N},

the collection of all sets S that are supersets of some finite intersection of sets in G. We claim this set is a proper filter containing G. By definition of a proper filter, we must show that (i) for A, B ∈ F, we have A ∩ B ∈ F, (ii) for A ∈ F and A ⊆ B ⊆ X, we have B ∈ F, and (iii) F is proper.

(i) Let A, B ∈ F. Then by definition, A and B both contain finite intersections of elements from G. Thus, the intersection of A and B will also contain an intersection of finitely many elements from G, and A ∩ B ∈ F.

(ii) Let A ∈ F and A ⊆ B ⊆ X. Since A contains a finite intersection of elements from G and B contains A, B must also contain this same finite intersection and therefore B ∈ F.

(iii) Suppose F were improper and thus contained ∅ (by Proposition 2). Then, by the definition of F, we would have ∅ ⊇ A1 ∩ ... ∩ An for some A1,...,An ∈ G, that is, A1 ∩ ... ∩ An = ∅. This contradicts the fact that G has f.i.p.

Note that for all g ∈ G , we have g ∈ F since for S = g we have S ⊇ g.

Therefore F is a proper filter such that G ⊆ F . 
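The construction in the proof of Lemma 5 can be carried out explicitly when X is finite. The sketch below is an illustration under that assumption: `has_fip` tests the finite intersection property by brute force, and `generated_filter` collects the supersets of all finite intersections of members of G. Both names are hypothetical helpers, not notation from the text.

```python
from itertools import combinations

def powerset(X):
    """Return all subsets of the finite set X as a list of frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def has_fip(G):
    """True if every finite nonempty subfamily of G has nonempty intersection."""
    G = [frozenset(A) for A in G]
    return all(frozenset.intersection(*sub)
               for r in range(1, len(G) + 1)
               for sub in combinations(G, r))

def generated_filter(X, G):
    """All S ⊆ X that contain some finite intersection of members of G."""
    G = [frozenset(A) for A in G]
    meets = {frozenset.intersection(*sub)
             for r in range(1, len(G) + 1)
             for sub in combinations(G, r)}
    return {S for S in powerset(X) if any(M <= S for M in meets)}

X = {1, 2, 3, 4}
G = [{1, 2, 3}, {2, 3, 4}]
F = generated_filter(X, G)   # the supersets of {2, 3}
```

In this example G has f.i.p., the resulting F consists of the four supersets of {2, 3}, and the empty set is not among them, so F is a proper filter containing G.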

Ultrafilters

Let F be a filter over N. We say that a function f has some property P in reference to F if and only if {n ∈ N | f(n) has property P} ∈ F. That is, the set of indices on which the function has the property is a member of the filter. Consider the function

$$f(n) = \begin{cases} 1 & n \text{ is even} \\ -1 & n \text{ is odd} \end{cases}$$

and consider the property P of the values of a function being always greater than or equal to zero (or, conversely, always less than zero). We would like for there to be a clear choice whether function f has property P or not P in reference to F (and not both, and not neither). Unfortunately, in reference to the Fréchet filter, neither is true in this case as f oscillates: neither the set of even numbers nor the set of odd numbers is co-finite. Therefore, we need to obtain a finer filter to make this choice. To do so, we introduce another stipulation to our definition of a filter to create the ultrafilter. This new stipulation provides the tool to decide which of the sets, the “evens” or the “odds,” is in the filter on N above, and hence whether property P or not P holds.

Definition 6. A filter F is called an ultrafilter if, in addition to (i) and (ii) of Definition 1,

(iii) For any subset A ⊆ X, either A ∈ F or (X\A) ∈ F , and not both.

Proposition 7. A filter F over a set X is an ultrafilter if and only if it is maximal, i.e., the only proper filter containing F is F itself.

Proof. (⇒) Let F be an ultrafilter; then for any set A ⊆ X, we have either A ∈ F or (X\A) ∈ F. Now suppose G is a proper filter over X that contains F, G ⊇ F. Suppose, by way of contradiction, that there is a subset B of X such that B ∈ G and B ∉ F. Then, since F is an ultrafilter, we have X\B ∈ F. As F is contained in G,

(X\B ∈ F) ⇒ (X\B ∈ G), and

(B ∈ G and (X\B) ∈ G) ⇒ (B ∩ (X\B) = ∅ ∈ G),

contradicting the fact that G is proper. Therefore, G cannot strictly contain F, and we must have F = G. Hence F is maximal.

(⇐) Now let F be a maximal filter over X. It suffices to show that for any subset A ⊆ X, A ∉ F implies X\A ∈ F. Let A be a subset of X such that A ∉ F. As F is a proper filter, F has f.i.p. by Claim 4. The set F ∪ {X\A} maintains f.i.p. since A ∉ F. This is because an empty intersection could only be obtained from X\A together with a member of F that is A or a subset of A. Clearly A ∉ F, and for any subset B ⊂ A, if B ∈ F, then by upward closure A must also be a member of F. By Lemma 5, there exists a proper filter F̃ that contains F ∪ {X\A}, which obviously contains F. Thus by the maximality of F, we must have:

F ⊆ F ∪ {X\A} ⊆ F̃ ⊆ F,

which implies that F = F ∪ {X\A}, and X\A ∈ F.

A principal ultrafilter is an ultrafilter that is principal (as introduced after Proposition 2). If F is an ultrafilter over X, and {i0} ∈ F for some i0 ∈ X, then F is the principal ultrafilter generated by i0. The following statements about principal ultrafilters are equivalent:

• A subset A of X is a member of the principal ultrafilter F ;

• A ∩ {i0} ∈ F ;

• i0 ∈ A.

Additionally, if F is a principal ultrafilter, then F must be generated by a single element. Otherwise, if F were generated by some subset A containing more than one element of X (i.e., F would be the set of all subsets of X containing A), then for any B with ∅ ≠ B ⊂ A ⊆ X, the filter generated by B would be a proper filter strictly containing F, contradicting the maximality of F.
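On a small finite set one can enumerate every family of subsets and test Definition 6 directly. The sketch below does this for a three-element set; it is only an illustration (the names `is_ultrafilter`, `ultrafilters`, and `generators` are hypothetical), and it relies on the well-known fact that every ultrafilter over a finite set is principal.

```python
from itertools import combinations

def powerset(X):
    """Return all subsets of the finite set X as a list of frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_ultrafilter(X, F):
    """Check (i), (ii) of Definition 1 and (iii) of Definition 6 on a finite X."""
    X = frozenset(X)
    subsets = powerset(X)
    if not F:
        return False
    meets = all(A & B in F for A in F for B in F)
    upward = all(B in F for A in F for B in subsets if A <= B)
    decides = all((A in F) != ((X - A) in F) for A in subsets)
    return meets and upward and decides

X = {0, 1, 2}
subsets = powerset(X)

# Brute force over all 2^8 families of subsets of X.
ultrafilters = [set(family)
                for r in range(len(subsets) + 1)
                for family in combinations(subsets, r)
                if is_ultrafilter(X, set(family))]

# The smallest member of each ultrafilter is a singleton {i0}.
generators = sorted(next(iter(min(F, key=len))) for F in ultrafilters)
```

The search finds exactly three ultrafilters over X, one for each element, and each is the family of all subsets containing its generator.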

Proposition 8. Let F be an ultrafilter. If F is not principal (i.e., F is free), then F contains the Fréchet filter.

Proof. By the ultrafilter “decision” property (Definition 6), it suffices to show that

(S ⊆ X and S is finite) ⇒ (S ∉ F),

i.e., if no finite set S can be in the ultrafilter F, then all co-finite sets X\S must be. We proceed by induction on the size of S. Let F be a non-principal ultrafilter over X and define S_n := {i_1,...,i_n} where i_j ∈ X for 1 ≤ j ≤ n. We wish to show that S_n ∉ F.

For the case of n = 1, we clearly have a contradiction as S_1 ∈ F would imply that F is principal, as defined above.

Now suppose that for n = k, we have S_k ∉ F and hence X\S_k ∈ F. We now show that S_{k+1} ∉ F. Observe,

S_{k+1} = {i_1,...,i_k,i_{k+1}} = {i_1,...,i_k} ∪ {i_{k+1}},

where S_k = {i_1,...,i_k} ∉ F, and {i_{k+1}} ∉ F by the n = 1 case. Therefore, we have X\{i_1,...,i_k} ∈ F and X\{i_{k+1}} ∈ F. Since F is closed under intersections,

(X\{i_1,...,i_k}) ∩ (X\{i_{k+1}}) = X\({i_1,...,i_k} ∪ {i_{k+1}}) ∈ F

⇒ X\S_{k+1} ∈ F.

Thus, S_{k+1} ∉ F, and by induction, we have S_n ∉ F for any n ∈ N. Since F contains no finite set and F is an ultrafilter, it must contain the complement of every finite set. In other words, F contains the Fréchet filter.

The Fréchet filter is clearly not an ultrafilter (for example, by the existence of the infinite/co-infinite sets of the even and odd numbers of N as mentioned at the beginning of this section). However, we have seen that it is possible to have a principal filter generated by i0 that is an ultrafilter. Are there other examples of ultrafilters? The answer to this question is yes; in fact, we will show any proper filter can be extended to an ultrafilter. Before we comment on this and make a few other important observations on ultrafilters, we first need to introduce the Axiom of Choice in the form of Zorn’s Lemma, and, in order to do so, we first define the notion of a poset.

Definition 9. A partially ordered set, or poset, is a pair (X,≤) where X is a nonempty set and ≤ is a binary relation on X that is: (i) reflexive, (i.e., x ≤ x for all x ∈ X); (ii) antisymmetric, (i.e., if x ≤ y and y ≤ x then x = y); (iii) transitive, (i.e., if x ≤ y and y ≤ z then x ≤ z).

Any subset C of X, where (X,≤) is a poset, that can be ordered such that x ≤ y or y ≤ x for all x,y ∈ C is called a chain. An element b ∈ X is an upper bound for C if for all x ∈ C we have x ≤ b. An element m is maximal if for any x ∈ X, m ≤ x implies x = m.

Lemma 10. (Zorn’s Lemma) Let (X,≤) be a partially ordered set. If each chain in X has an upper bound, then X has at least one maximal element.

Zorn’s Lemma is equivalent to the Axiom of Choice, which states that there is a “choice function,” f , on any (infinite) set A of nonempty sets such that f (x) ∈ x for each x ∈ A [2]. The Axiom of Choice and Zorn’s Lemma will be referenced when applied in later proofs. We use Zorn’s Lemma in the proof of the following theorem on the existence of ultrafilters, referred to as the “Ultrafilter Theorem.”

Theorem 11. (Ultrafilter Theorem) If F is a proper filter on X, then there is an ultrafilter E on X such that F ⊆ E .

Proof. Let F be a proper filter over X and define F̃ to be the set of all proper filters over X that contain F. F̃ is clearly nonempty as F ∈ F̃. We claim that (F̃,⊆) forms a poset.

(i) Clearly, D ⊆ D for all D ∈ F̃ by property of inclusion, and ⊆ is reflexive.

(ii) Similarly, for D1,D2 ∈ F̃, if D1 ⊆ D2 and D2 ⊆ D1, then D1 = D2, and ⊆ is antisymmetric.

(iii) Finally, for D1,D2,D3 ∈ F̃, if D1 ⊆ D2 and D2 ⊆ D3, then D1 ⊆ D3, and ⊆ is transitive.

Thus, (F̃,⊆) forms a poset. Now let C be a chain in F̃. In order to be able to apply Zorn’s Lemma, we must show that every chain has an upper bound. Consider that any chain C of F̃ is a collection of elements, or filters, and consider the union of such elements: ⋃C. We claim that this is an upper bound for C. Since ∅ ∉ D for each D ∈ C (each D in the chain is a proper filter), we are assured that ∅ ∉ ⋃C. Given A ∈ ⋃C, we know that A ∈ D1 for some proper filter D1 ∈ C. Thus for any superset B of A, A ⊆ B ⊆ X, we have B ∈ D1 and therefore B ∈ ⋃C. Now given A,B ∈ ⋃C, we know that A,B ∈ D2 for some proper filter D2 ∈ C, as the chain is totally ordered by inclusion (of the two filters in C containing A and B respectively, one contains the other). Therefore A ∩ B ∈ D2 and A ∩ B ∈ ⋃C. Hence ⋃C ∈ F̃. By construction, we have D ⊆ ⋃C for each D ∈ C, and ⋃C is an upper bound for the chain C.

Therefore, since every chain has an upper bound, we know by Zorn’s Lemma that F̃ has at least one maximal element, i.e., a maximal filter that contains F. Since this filter is maximal, it is an ultrafilter by Proposition 7.
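When X is finite, no appeal to Zorn’s Lemma is needed: a proper filter can be extended to an ultrafilter by deciding the undecided subsets one at a time. The sketch below illustrates this finite analogue only, not the proof above. The helper names are hypothetical, and the step `add_set` relies on the observation that if neither A nor X\A is in a proper filter F, then every member of F meets A, so adjoining A keeps the filter proper.

```python
from itertools import combinations

def powerset(X):
    """Return all subsets of the finite set X as a list of frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def add_set(X, F, A):
    """Smallest filter over X containing the filter F and the set A."""
    return {S for S in powerset(X) if any((B & A) <= S for B in F)}

def extend_to_ultrafilter(X, F):
    """Extend a proper filter F over a finite set X to an ultrafilter."""
    X = frozenset(X)
    F = {frozenset(A) for A in F}
    for A in powerset(X):
        # If F decides neither A nor its complement, adjoin A.
        if A not in F and (X - A) not in F:
            F = add_set(X, F, A)
    return F

X = {0, 1, 2, 3}
F = {S for S in powerset(X) if {0, 1} <= S}   # filter generated by {0, 1}
U = extend_to_ultrafilter(X, F)
```

With the subset ordering used by `powerset`, the first undecided set is {0}, so U is the principal ultrafilter generated by 0; a different ordering could just as well produce the one generated by 1.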

Construction of Reduced Products

Now that we are familiar with the notions of filters and ultrafilters, we may begin our construction of reduced products. We shall see that the ultraproduct is just a specific case of the reduced product.

Suppose X is a non-empty set and D is a proper filter over X, and for each i ∈ X, Ai is a non-empty set. Let C = ∏i∈X Ai be the Cartesian product (in the usual sense) of these sets. Elements of C are thus functions f : X → ⋃i∈X Ai such that f(i) ∈ Ai for each i ∈ X, or strings of elements where the ith element comes from Ai.

Given two functions f , g ∈ C, we say f and g are D-equivalent, written f =D g, if and only if {i ∈ X | f (i) = g(i)} ∈ D.

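That is, the set of indices on which f and g agree is in D. Note that when D is the Fréchet filter, this amounts to the fact that f and g agree on all but a finite number of indices. For a non-principal ultrafilter D, Proposition 8 shows that agreement on all but a finite number of indices is sufficient (though not necessary) for f =D g. In either case we think of f and g as agreeing “almost everywhere.” We claim that C divides naturally into equivalence classes modulo the filter D.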

Proposition 12. The relation =D is an equivalence relation over C.

Proof. To show that =D is an equivalence relation, we must show that reflexivity, symmetry, and transitivity hold.

(i) Reflexivity: Given f ∈ C, we have f =D f if and only if

{i ∈ X | f (i) = f (i)} ∈ D.

However, {i ∈ X | f (i) = f (i)} = X where X is necessarily in D, and therefore =D is reflexive.

(ii) Symmetry: Given f ,g ∈ C, we have f =D g if and only if

{i ∈ X | f (i) = g(i)} ∈ D.

However, {i ∈ X | f (i) = g(i)} = {i ∈ X |g(i) = f (i)}, and thus

{i ∈ X |g(i) = f (i)} ∈ D,

which implies g =D f , and =D is symmetric.

(iii) Transitivity: Given f ,g,h ∈ C, f =D g and g =D h if and only if

{i ∈ X | f (i) = g(i)} ∈ D and {i ∈ X |g(i) = h(i)} ∈ D.

Because D is a filter, the intersection of these sets must also be in D, and:

{i ∈ X | f (i) = g(i)} ∩ {i ∈ X |g(i) = h(i)} ∈ D

⇐⇒ {i ∈ X | ( f (i) = g(i)) ∧ (g(i) = h(i))} ∈ D.

Since {i ∈ X | ( f (i) = g(i)) ∧ (g(i) = h(i))} ⊆ {i ∈ X | f (i) = h(i)}, we have:

{i ∈ X | f (i) = g(i)} ∩ {i ∈ X |g(i) = h(i)} ∈ D ⇒ {i ∈ X | f (i) = h(i)} ∈ D, by upward closure of D.

Therefore =D is also transitive, and hence an equivalence relation over C. 

As =D is an equivalence relation, it can be used to divide C into equivalence classes, which, for convenience, we will denote:

f^D = {g ∈ C | {i ∈ X | g(i) = f(i)} ∈ D}.

The set of these equivalence classes is referred to as a reduced product, defined specifically as follows:

Definition 13. The reduced product of Ai modulo D is the set of all equivalence classes {f^D | f ∈ ∏i∈X Ai}, denoted by ∏D Ai, where the Ai are nonempty sets. The set X is the index set for ∏D Ai. In the case that D is an ultrafilter over X, the reduced product ∏D Ai is an ultraproduct. Additionally, if in that case all sets Ai = A are the same, ∏D Ai may be written as ∏D A and is referred to as an ultrapower.
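For a finite index set and finite factors, the reduced product can be computed by brute force: list the Cartesian product and group its elements by D-equivalence. The following is a small sketch under those finiteness assumptions; `equivalence_classes` and the three example filters are illustrative names, and each filter is written out by hand.

```python
from itertools import product

def equivalence_classes(index_set, factors, D):
    """Partition the Cartesian product of the factors by D-equivalence.

    index_set: list of indices; factors: dict mapping index -> list of elements;
    D: a filter over index_set, given as a set of frozensets.
    """
    classes = []
    for f in product(*(factors[i] for i in index_set)):
        for cls in classes:
            g = cls[0]  # one representative suffices, as =D is an equivalence
            agree = frozenset(i for k, i in enumerate(index_set) if f[k] == g[k])
            if agree in D:
                cls.append(f)
                break
        else:
            classes.append([f])
    return classes

X = [0, 1, 2]
factors = {i: ['a', 'b'] for i in X}
full = frozenset(X)

trivial = {full}                                   # the trivial filter {X}
gen_01 = {frozenset({0, 1}), full}                 # filter generated by {0, 1}
principal_0 = {frozenset({0}), frozenset({0, 1}),
               frozenset({0, 2}), full}            # principal ultrafilter at 0
```

The trivial filter identifies nothing (8 classes), the filter generated by {0, 1} ignores the last coordinate (4 classes), and the principal ultrafilter at 0 leaves only the value at index 0 (2 classes).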

Now that we have constructed this interesting object, we examine what it means to be a reduced product of models, but first, we provide a brief overview of elementary model theory to give context for this new structure.

CHAPTER 3

ULTRAPRODUCTS AND COMPACTNESS

A Brief Introduction to Model Theory

Before we prove the Compactness Theorem, we must first discuss the model construction that we will utilize in the proof, and, before we discuss the model construction, it is necessary to provide a brief introduction to model theory. Model theory is the branch of mathematical logic that concerns the relation between a formal language (syntax) and its interpretations in different structures (semantics) [3].

Syntax. The first order language L consists of finitary relation symbols: R, function symbols: F, and constant symbols: c. To formalize L, we must also use the following logical symbols: connectives: ∧ (“and”), ∨ (“or”), ¬ (“not”), → (“if _ then _”), ⇐⇒ (“_ if and only if _”); quantifiers: ∃ (“there exists”), ∀ (“for all”); variables: x0,x1,...,xn,...; and parentheses: (, ).

In addition to these logical symbols, we also have the binary relation for equality, ≡. Note that none of these symbols are part of the language L. In order to meaningfully refer to objects in our domain of discourse, we need to use strings of logical symbols combined with the symbols of L. These strings are called terms. Terms are defined inductively as follows:

(i) Constants and variables are terms.

(ii) If F is an n-ary function symbol and τ1,...,τn are terms, then F(τ1,...,τn) is a term.

(iii) A string of symbols is a term if and only if it can be arrived at by a finite number of applications of (i) and (ii).
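The inductive definition of terms translates directly into a recursive data structure, and interpreting a term in a structure is a recursion on that structure. The sketch below is one possible encoding, chosen for illustration: terms are nested tuples, and the symbol names `plus`, `times`, and `one` are hypothetical examples of a language L.

```python
def evaluate(term, functions, constants, assignment):
    """Evaluate a term in a structure.

    A term is ('var', name), ('const', name), or ('app', fname, [subterms]).
    functions/constants interpret the symbols; assignment gives variable values.
    """
    kind = term[0]
    if kind == 'var':
        return assignment[term[1]]
    if kind == 'const':
        return constants[term[1]]
    if kind == 'app':
        args = [evaluate(t, functions, constants, assignment) for t in term[2]]
        return functions[term[1]](*args)
    raise ValueError(f"not a term: {term!r}")

# The term plus(times(x, x), one), built by clauses (i) and (ii).
term = ('app', 'plus',
        [('app', 'times', [('var', 'x'), ('var', 'x')]), ('const', 'one')])

# Two structures interpreting the same symbols: the integers, and integers mod 5.
integers = {'functions': {'plus': lambda a, b: a + b,
                          'times': lambda a, b: a * b},
            'constants': {'one': 1}}
mod5 = {'functions': {'plus': lambda a, b: (a + b) % 5,
                      'times': lambda a, b: (a * b) % 5},
        'constants': {'one': 1}}

value_in_integers = evaluate(term, integers['functions'],
                             integers['constants'], {'x': 3})
value_in_mod5 = evaluate(term, mod5['functions'], mod5['constants'], {'x': 3})
```

The same string of symbols denotes 10 in the integers and 0 in the integers mod 5, which is the sense in which a term has no value until a structure interprets it.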

To be able to say meaningful things about terms (e.g., how they “relate”), we introduce formulas. The simplest formulas are referred to as atomic and are defined as follows:

(i) τ1 ≡ τ2 is an atomic formula, where τ1,τ2 are terms of L .

(ii) If R is an n-ary relation symbol and τ1,...,τn are terms, then R(τ1,...,τn) is an atomic formula.

Formulas of L are defined inductively as follows:

(i) An atomic formula is a formula.

(ii) If φ and ψ are formulas, then (φ ∧ ψ), (φ ∨ ψ), (φ → ψ), (φ ⇐⇒ ψ), and (¬φ) are formulas.

(iii) If x is a variable and φ is a formula, then ∃x(φ) and ∀x(φ) are formulas.

(iv) A sequence of symbols is a formula only if it can be shown to be a formula by a finite number of applications of (i), (ii), and (iii).

Formulas with no free variables (that is, formulas in which every occurrence of a variable is bound by a quantifier) are called sentences. Syntax is very important, but without instruction on how to interpret it in different structures (semantics), it is quite literally meaningless.

Semantics. Sentences of L as developed above can be true in one structure and not in another. Consider the case of the existence of a greatest element. A sentence of L representing this idea is true in finite structures that can be ordered, but not, for example, in the typical structure of the real numbers. Consider also a first order statement of density, that between any two elements exists another. This sentence is true in Q and R, but not in N and Z. Sentences like these are true or false depending on the structure (on the context) in which they are being considered. The structures in which we consider the truth or falsity of a sentence of L are called models, a class of mathematical structures (e.g., groups, fields, graphs, etc.) in which one can apply the tools of mathematical logic. A sentence is satisfiable if it is possible to find a model whose interpretation of that sentence is true in that model. The following section outlines how we can use the truth interpretation of sentences of first order logic in each model Ai in our ultraproduct to determine what is true in the structure of the ultraproduct itself.
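Over a finite domain, the truth of a sentence can be decided by exhaustive search. The sketch below checks the sentence ∃x∀y (y ≤ x), “there is a greatest element,” in three small structures that interpret ≤ differently. The function name `has_greatest` and the choice of structures are illustrative assumptions, not examples taken from the text.

```python
def has_greatest(domain, leq):
    """Decide ∃x ∀y (y ≤ x) in the finite structure (domain, leq)."""
    return any(all(leq(y, x) for y in domain) for x in domain)

def usual_order(y, x):
    """Interpret y ≤ x as the usual order on integers."""
    return y <= x

def divides(y, x):
    """Interpret y ≤ x as 'y divides x'."""
    return x % y == 0

in_finite_chain = has_greatest(range(5), usual_order)        # {0,...,4} with ≤
in_divisors_of_6 = has_greatest([1, 2, 3, 6], divides)       # 6 is greatest
in_two_primes = has_greatest([2, 3], divides)                # neither divides the other
```

The same sentence is true in the first two structures and false in the third, so its truth value depends on the model in which it is interpreted.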

Interpretations and Łoś’s Theorem

The Compactness Theorem of first order logic states that if you have an infinite set Σ of sentences of L, with the property that each finite subset i ⊂ Σ is satisfied in some model Mi (which is dependent on the finite subset i specified), then there is a single model M in which all of Σ is satisfied. The typical mathematical logic proof of this theorem requires the notion of Soundness, as well as Gödel’s Completeness Theorem: two ideas which, themselves, require much careful consideration. It is a proof by contrapositive, the idea being to use the finiteness of proofs and the relationship between satisfiability and deducibility to show that if the infinite set of sentences is not satisfiable, then it has a finite subset of sentences which can lead to the proof of a contradiction (and therefore, the premise is impossible).

The proof of the Compactness Theorem we present utilizing ultraproducts will ultimately be completed in two main steps. First we will prove Łoś’s Theorem to show how true formulas in each model Ai are interpreted in the ultraproduct. We then use filter properties to show that finite satisfiability also transfers to the ultraproduct from each of the elements in the product. In the end, we not only prove the Compactness Theorem, but we are left with a tangible model that actually fulfills this duty of satisfying the infinite set of sentences.

In order to provide some clarity in nomenclature, symbols of the language as interpreted in each model are represented using the same symbol of L with a superscript denoting the model (just as our equivalence classes, f^D, use superscripts to denote the filter). Multiple symbols are indexed in the subscript.

Let X be a nonempty set, D a proper filter over X, and for each i ∈ X, let Ai be a model for the language L whose underlying set is also written Ai. Constants c of the language L are interpreted by $c^{A_i} \in A_i$, n-ary relations R by $R^{A_i} \subseteq A_i^n$, and n-ary functions F by $F^{A_i} : A_i^n \to A_i$.

Given ⟨Ai : i ∈ X⟩, we construct a new model of L. The following is an extension of the previously provided definition of reduced products (Definition 13) and specifies how each object of the language L is interpreted component-wise.

Definition 14. The reduced product B = ∏D Ai is the model for L described as follows:

(i) Set: The background set for B is ∏D Ai (the reduced product as constructed above, where elements are equivalence classes, f^D, defined by =D).

(ii) Relations: Let R be an n-ary relation symbol of L. The interpretation of R in B = ∏D Ai is the relation such that

$$R^{B}(f_1^D, \ldots, f_n^D) \iff \{i \in X \mid R^{A_i}(f_1(i), \ldots, f_n(i))\} \in D,$$

i.e., elements (equivalence classes) $f_k^D$, 1 ≤ k ≤ n, are related in B if and only if the set of indices of elements of the equivalence classes that are related by R in the ith model, Ai, is in D.

(iii) Functions: Let F be an n-ary function symbol of L. Then F is interpreted in B by the function

$$F^{B}(f_1^D, \ldots, f_n^D) = \langle F^{A_i}(f_1(i), \ldots, f_n(i)) : i \in X \rangle_D,$$

i.e., $F^{B}$ is a function on n elements (equivalence classes) of B whose output is a sequence of elements where the ith member of the sequence is the value of the function F interpreted in the ith model (as $F^{A_i}$), modulo D.

(iv) Constants: Let c be a constant symbol of L. Then c is interpreted in B by $c^{B} \in \prod_D A_i$, where

$$c^{B} = \langle c^{A_i} : i \in X \rangle_D,$$

i.e., the equivalence class of a sequence of constants from the Ai’s where the ith term in the sequence is the constant interpreted in the corresponding Ai.

The following proposition shows that ultraproducts over principal ultrafilters are somewhat less interesting in that they mirror pre-existing components of the product, motivating our examination of products over solely non-principal ultrafilters in the future (so we can form objects that yield new information and do not simply extrapolate information from one of their respective members).

Proposition 15. Suppose D is a principal ultrafilter over X generated by i0. Then ∏D Ai ≅ Ai0.

Proof. Given that D is a principal ultrafilter generated by i0, we know {i0} ∈ D. Elements of ∏D Ai can then be grouped into equivalence classes based on their agreement evaluated only on the i0th element. Let F : ∏D Ai → Ai0 be given by F(f^D) = f(i0). Note that F is well-defined since every element in each equivalence class f^D agrees on the value at i0. Given f^D, g^D ∈ ∏D Ai, we have

$$F(f^D) = F(g^D) \iff f(i_0) = g(i_0),$$

which happens only if f and g are in the same equivalence class, and therefore F is one-to-one.

Let y ∈ Ai0. Since Ai is nonempty for all i ∈ X, we are assured that there exists at least one element ai in each Ai. Define the function h ∈ ∏i∈X Ai by

$$h(i) := \begin{cases} a_i & i \neq i_0 \\ y & i = i_0 \end{cases}$$

Then, for h^D = {f ∈ ∏i∈X Ai | f(i0) = y}, we have F(h^D) = y, and F is onto. To show that F is operation-preserving, we must see that it preserves relations. By definition,

$$R^{B}(f_1^D, \ldots, f_n^D) \iff \{i \in X \mid R^{A_i}(f_1(i), \ldots, f_n(i))\} \in D$$

$$\iff R^{A_{i_0}}(f_1(i_0), \ldots, f_n(i_0)) \iff R^{A_{i_0}}(F(f_1^D), \ldots, F(f_n^D)).$$

In other words, elements f^D relate if and only if they relate on their values at i0, which is how F is defined. Function and constant symbols are handled in the same way, since their interpretations in B are computed component-wise.

Therefore, ∏D Ai ≅ Ai0.

From here forward, we will focus on the use of ultraproducts over non-principal ultrafilters, as the above proposition shows that ultraproducts over principal ultrafilters yield nothing new. The following lemma expands our understanding of the interpretations of constants, relations, and functions in order to interpret terms in the reduced product (before we are able to use Łoś’s Theorem to interpret full formulas and sentences of first order logic).

Lemma 16. Let τ be a term of L on n variables, τ(x1,x2,...,xn), and f1, f2,..., fn ∈ ∏i∈X Ai. Then

$$(\ast) \quad \tau^{B}(f_1^D, f_2^D, \ldots, f_n^D) = f^D, \quad \text{where } f(i) = \tau^{A_i}(f_1(i), \ldots, f_n(i)).$$

In other words, terms, like constants, variables, and functions, are elements of ∏D Ai formed by component-wise interpretation.

Proof. We prove this lemma using induction on terms. It is clear that this theorem follows directly for variables (τB = f D) and constants (τB = cB), since they are defined component-wise in Definition 14. We now wish to show that (∗) holds true for a term τB when defined as a function on terms.

Let F be an m-ary function symbol of L , i.e., F(τ1,...,τm), and τ1,...,τm be terms with free variables among x1,...,xn, i.e., τi(x1,...,xn) for 1 ≤ i ≤ m. We wish to show that (∗) holds for

τ(x1,...,xn) = F(τ1(x1,...,xn),...,τm(x1,...,xn)).

Assume that (∗) holds for τ1,...,τm. By the interpretation of terms we have,

\[ \tau^{B}(f_1^D, f_2^D, \ldots, f_n^D) = F^{B}\big(\tau_1^{B}(f_1^D, \ldots, f_n^D), \ldots, \tau_m^{B}(f_1^D, \ldots, f_n^D)\big). \]

By $(\ast)$ for $\tau_1, \ldots, \tau_m$, we have:

\[ \tau_k^{B}(f_1^D, \ldots, f_n^D) = g_k^D, \quad \text{for } k = 1, \ldots, m, \]

where

\[ g_k(i) = \tau_k^{A_i}(f_1(i), \ldots, f_n(i)), \]

and, more explicitly,

\[ g_k^D = \big\langle \tau_k^{A_i}(f_1(i), \ldots, f_n(i)) : i \in X \big\rangle_D, \]

by Definition 14. We may then write:

\[ \tau^{B}(f_1^D, f_2^D, \ldots, f_n^D) = F^{B}(g_1^D, \ldots, g_m^D). \]

Interpreting the functions in the reduced product, we have:

\[ F^{B}(g_1^D, \ldots, g_m^D) = \big\langle F^{A_i}(g_1(i), \ldots, g_m(i)) : i \in X \big\rangle_D. \]

Using the interpretation of terms in each $A_i$,

\[ \tau^{A_i}(f_1(i), \ldots, f_n(i)) = F^{A_i}(g_1(i), \ldots, g_m(i)), \]

and together we obtain

\[ \tau^{B}(f_1^D, \ldots, f_n^D) = F^{B}(g_1^D, \ldots, g_m^D) = \big\langle \tau^{A_i}(f_1(i), \ldots, f_n(i)) : i \in X \big\rangle_D, \]

and therefore $\tau = F(\tau_1, \ldots, \tau_m)$, our term written as a function on terms, satisfies $(\ast)$, as desired. □

Now that we have developed an interpretation of terms in the ultraproduct, we can move on to prove Łoś's Theorem, also called the “Fundamental Theorem of Ultraproducts,” which will provide a way to determine the validity of formulas and sentences in the ultraproduct from only our knowledge of the individual models.

Theorem 17. (Łoś's Theorem / Fundamental Theorem of Ultraproducts). Let $\langle A_i : i \in X \rangle$ be a family of models for $\mathcal{L}$, and let $D$ be an ultrafilter over $X$. Then for any formula $\alpha(x_1, \ldots, x_n)$ of $\mathcal{L}$ and elements $f_1^D, \ldots, f_n^D$ of $\prod_D A_i$,

\[ (\ast\ast) \qquad \prod_D A_i \models \alpha(f_1^D, \ldots, f_n^D) \iff \{ i \in X \mid A_i \models \alpha(f_1(i), \ldots, f_n(i)) \} \in D. \]

That is, formulas are true as interpreted in the ultraproduct if and only if they are true as interpreted in most of the models, in the sense of $D$. We prove this theorem by induction on the length of formulas.

Proof. The first step is to consider atomic formulas.

Case 1: Let $\tau_1$ and $\tau_2$ be terms of $\mathcal{L}$ whose variables are among $x_1, x_2, \ldots, x_n$, and let $\alpha \equiv (\tau_1 = \tau_2)$, the simplest of formulas (involving no relations). Then,

\[ \prod_D A_i \models \tau_1(f_1^D, \ldots, f_n^D) = \tau_2(f_1^D, \ldots, f_n^D) \]

if and only if, by Lemma 16,

\[ \{ i \in X \mid \tau_1^{A_i}(f_1(i), \ldots, f_n(i)) = \tau_2^{A_i}(f_1(i), \ldots, f_n(i)) \} \in D, \]

that is,

\[ \{ i \in X \mid A_i \models \alpha(f_1(i), \ldots, f_n(i)) \} \in D, \]

as desired.

Case 2: Next, consider the atomic case in which $\alpha$ is an $m$-ary relation, $R$, applied to terms $\tau_1, \ldots, \tau_m$ whose free variables are among $x_1, x_2, \ldots, x_n$, i.e.,

\[ \alpha \equiv R(\tau_1(x_1, \ldots, x_n), \ldots, \tau_m(x_1, \ldots, x_n)). \]

We have:

\[ \prod_D A_i \models R^{B}\big(\tau_1^{B}(f_1^D, \ldots, f_n^D), \ldots, \tau_m^{B}(f_1^D, \ldots, f_n^D)\big) \]

if and only if

\[ \prod_D A_i \models R^{B}(g_1^D, \ldots, g_m^D), \]

where $g_k(i) = \tau_k^{A_i}(f_1(i), \ldots, f_n(i))$ for $1 \le k \le m$ by Lemma 16. By definition, $R^{B}(g_1^D, \ldots, g_m^D)$ if and only if

\[ \{ i \in X \mid R^{A_i}(g_1(i), \ldots, g_m(i)) \} \in D, \]

which is equivalent to:

\[ \{ i \in X \mid R^{A_i}\big(\tau_1^{A_i}(f_1(i), \ldots, f_n(i)), \ldots, \tau_m^{A_i}(f_1(i), \ldots, f_n(i))\big) \} \in D, \]

and therefore

\[ \{ i \in X \mid A_i \models \alpha(f_1(i), \ldots, f_n(i)) \} \in D \]

as desired.

Now we consider formulas using logical connectives. Since $\wedge$ and $\neg$ form an adequate set of connectives, i.e., all other connectives can be expressed using just these two, we examine only two cases (Cases 3 and 4 below) to cover all possible formula combinations (with only free variables).

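Case 3: Suppose that $(\ast\ast)$ holds for $\beta(x_1, x_2, \ldots, x_n)$, and we wish to show that it also holds for $\alpha \equiv \neg\beta$. Observe,

\[
\begin{aligned}
\prod_D A_i \models \neg\beta(f_1^D, f_2^D, \ldots, f_n^D)
&\iff \prod_D A_i \not\models \beta(f_1^D, f_2^D, \ldots, f_n^D) \\
&\iff \{ i \in X \mid A_i \models \beta(f_1(i), \ldots, f_n(i)) \} \notin D
\end{aligned}
\]

as $(\ast\ast)$ holds for $\beta$. Since $D$ is an ultrafilter, the above set of indices is not in $D$ if and only if its complement, the set of indices for which $\beta$ does not hold, is in $D$, and we have:

\[
\{ i \in X \mid A_i \models \neg\beta(f_1(i), \ldots, f_n(i)) \} \in D
\iff \{ i \in X \mid A_i \models \alpha(f_1(i), \ldots, f_n(i)) \} \in D
\]

as desired.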

Case 4: Suppose (∗∗) holds for β1(x1,...,xn) and β2(x1,...,xn), and we wish to show that it holds for α ≡ β1 ∧ β2.

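(⇒) First, suppose $\prod_D A_i \models \alpha$. Then we have:

\[ \Big( \prod_D A_i \models \beta_1 \wedge \beta_2 \Big) \iff \Big( \prod_D A_i \models \beta_1 \ \text{and} \ \prod_D A_i \models \beta_2 \Big). \]

Since the statement holds for $\beta_1$ and $\beta_2$ individually,

\[ \Big( \prod_D A_i \models \beta_1(f_1^D, \ldots, f_n^D) \Big) \iff \{ i \in X \mid A_i \models \beta_1(f_1(i), \ldots, f_n(i)) \} \in D, \]

and,

\[ \Big( \prod_D A_i \models \beta_2(f_1^D, \ldots, f_n^D) \Big) \iff \{ i \in X \mid A_i \models \beta_2(f_1(i), \ldots, f_n(i)) \} \in D. \]

By the above, we have two sets in $D$ whose intersection must also then be in $D$ (since $D$ is a filter):

\[
\begin{aligned}
& \{ i \in X \mid A_i \models \beta_1(f_1(i), \ldots, f_n(i)) \} \cap \{ i \in X \mid A_i \models \beta_2(f_1(i), \ldots, f_n(i)) \} \in D \\
& \iff \{ i \in X \mid (A_i \models \beta_1) \text{ and } (A_i \models \beta_2) \} \in D \\
& \iff \{ i \in X \mid A_i \models \beta_1 \wedge \beta_2 \} \in D.
\end{aligned}
\]

(⇐) Now suppose we have $\{ i \in X \mid A_i \models \beta_1 \wedge \beta_2 \} \in D$. The set of indices where both $\beta_1$ and $\beta_2$ are true is a subset of the set of indices where $\beta_1$ is true, and likewise of the set where $\beta_2$ is true. Thus, we have:

\[ \{ i \in X \mid A_i \models \beta_1 \wedge \beta_2 \} \subseteq \{ i \in X \mid A_i \models \beta_1 \}, \]

and,

\[ \{ i \in X \mid A_i \models \beta_1 \wedge \beta_2 \} \subseteq \{ i \in X \mid A_i \models \beta_2 \}, \]

so both sets on the right are in $D$ by the upward closure property of filters. Since $(\ast\ast)$ holds for $\beta_1$ and $\beta_2$, this implies:

\[ \prod_D A_i \models \beta_1 \ \text{and} \ \prod_D A_i \models \beta_2 \iff \prod_D A_i \models \beta_1 \wedge \beta_2, \]

as desired.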

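Case 5: Lastly, we consider the case that the statement holds for some formula $\beta(x, x_1, \ldots, x_n)$, and that $\alpha \equiv \exists x\, \beta(x, x_1, \ldots, x_n)$.

(⇒) By definition, $\prod_D A_i \models \alpha$ if and only if there exists an $f^D$ in the ultraproduct such that $\prod_D A_i \models \beta(f^D, f_1^D, \ldots, f_n^D)$. Since $(\ast\ast)$ holds for $\beta$, this is equivalent to:

\[ (\ast) \qquad \exists f^D \in \prod_D A_i \ \text{such that} \ \{ i \in X \mid A_i \models \beta(f(i), f_1(i), \ldots, f_n(i)) \} \in D. \]

Notice that

\[ \{ i \in X \mid A_i \models \beta(f(i), f_1(i), \ldots, f_n(i)) \} \subseteq \{ i \in X \mid A_i \models \alpha(f_1(i), \ldots, f_n(i)) \} \]

for any $f^D$, and thus $(\ast)$ implies

\[ \{ i \in X \mid A_i \models \alpha(f_1(i), \ldots, f_n(i)) \} \in D, \]

by the upward closure of filters.

(⇐) Now we assume

\[ \{ i \in X \mid A_i \models \alpha(f_1(i), \ldots, f_n(i)) \} \in D, \]

for $\alpha(x_1, \ldots, x_n) = \exists x\, \beta(x, x_1, \ldots, x_n)$. For each $i \in X$, let $Y_i = \{ a' \in A_i \mid A_i \models \beta(a', f_1(i), \ldots, f_n(i)) \}$, and define

\[ X_i = \begin{cases} Y_i & \text{if } Y_i \neq \emptyset, \\ A_i & \text{otherwise.} \end{cases} \]

We may then use the Axiom of Choice to deduce that, since $X_i \neq \emptyset$ for all $i \in X$, we have $\prod_{i \in X} X_i \neq \emptyset$. That is, we are assured the product of the $X_i$'s is non-empty because we are able to specify a choice function (as none of the $X_i$'s are empty). We may then conclude that there exists some $g \in \prod_{i \in X} X_i \subseteq \prod_{i \in X} A_i$. For every index $i$ at which $A_i \models \alpha(f_1(i), \ldots, f_n(i))$, the set $Y_i$ is nonempty, so $g(i) \in Y_i$ and $A_i \models \beta(g(i), f_1(i), \ldots, f_n(i))$. Hence

\[ \{ i \in X \mid A_i \models \alpha(f_1(i), \ldots, f_n(i)) \} \subseteq \{ i \in X \mid A_i \models \beta(g(i), f_1(i), \ldots, f_n(i)) \}, \]

and since the set on the left is in $D$, the set on the right must also be in $D$ by the upward closure of filters, i.e.,

\[ \{ i \in X \mid A_i \models \beta(g(i), f_1(i), \ldots, f_n(i)) \} \in D. \]

Since $(\ast\ast)$ holds for $\beta$, there exists a $g^D \in \prod_D A_i$ such that $\prod_D A_i \models \beta(g^D, f_1^D, \ldots, f_n^D)$, so $\prod_D A_i \models \alpha(f_1^D, \ldots, f_n^D)$, and our proof of Case 5 and this theorem is complete. □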

This theorem provides a powerful result – one which we will rely upon not only for the proof of the Compactness Theorem, but also in the next section on nonstandard analysis, as well as in the final section on the Ax-Grothendieck Theorem. Finally, we have all of the tools we need for the proof of the Compactness Theorem of first order logic using ultraproducts.

The Compactness Theorem

Theorem 18. (The Compactness Theorem) Let $\Sigma$ be a set of sentences of $\mathcal{L}$ and let $X$ be the set of all finite subsets of $\Sigma$. Suppose that for each $i \in X$ (each finite set of sentences) there is a model $A_i$ of $i$. Then there exists an ultrafilter $D$ over $X$ such that $\prod_D A_i$ is a model of $\Sigma$.

As stated previously, you will notice that not only does the following version of the proof of Compactness Theorem state the existence of a model to satisfy Σ, but with the ultraproduct it actually provides a concrete structure that will fulfill this role.

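Proof. For each sentence $\sigma \in \Sigma$, let $\hat{\sigma} := \{ i \in X \mid \sigma \in i \}$, i.e., the set of all finite subsets of sentences that contain $\sigma$. The set $E = \{ \hat{\sigma} \mid \sigma \in \Sigma \}$ has f.i.p. since

\[ \{\sigma_1, \sigma_2, \ldots, \sigma_n\} \in \hat{\sigma}_1 \cap \hat{\sigma}_2 \cap \cdots \cap \hat{\sigma}_n, \]

because $\{\sigma_1, \sigma_2, \ldots, \sigma_n\}$ is a finite subset of sentences that contains $\sigma_k$ for $1 \le k \le n$. Since $E \subseteq \mathcal{P}(X)$ and has f.i.p., by Lemma 5 there exists a proper filter over $X$ containing $E$, and by the Ultrafilter Theorem (Theorem 11), an ultrafilter $D$ over $X$ extending that filter. Note that for each $\sigma \in \Sigma$, $i \in \hat{\sigma}$ implies $\sigma \in i$ (i.e., if a certain finite subset of sentences $i$ is an element of the set of all finite subsets containing $\sigma$, then $\sigma$ is clearly a member of $i$), and since $A_i$ is a model for $i$, we have $A_i \models \sigma$. Thus for each $\sigma \in \Sigma$, $\hat{\sigma} \in D$ and $\hat{\sigma} \subseteq \{ i \in X \mid A_i \models \sigma \}$, and so by upward closure $\{ i \in X \mid A_i \models \sigma \} \in D$. By Łoś's Theorem (Theorem 17), we have:

\[ \{ i \in X \mid A_i \models \sigma \} \in D \iff \prod_D A_i \models \sigma, \]

and $\prod_D A_i$ is a model for all $\sigma \in \Sigma$ as desired. □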
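The finite intersection property of $E$ can be checked directly when $\Sigma$ is finite. The sketch below is only an illustration under that assumption: the sentence names in `Sigma` are hypothetical placeholders (only their identity matters), and the helper names `hat` and `has_fip` are introduced here for the example.

```python
from itertools import combinations

# Hypothetical stand-ins for sentences; only their identity matters here.
Sigma = ["s1", "s2", "s3", "s4"]

# X: the set of all finite subsets of Sigma (the index set of the ultraproduct).
X = [frozenset(c) for r in range(len(Sigma) + 1)
     for c in combinations(Sigma, r)]


def hat(sigma):
    """sigma-hat: all indices i in X (finite subsets) that contain sigma."""
    return {i for i in X if sigma in i}


def has_fip(sets):
    """Check that every nonempty finite subfamily has nonempty intersection."""
    for r in range(1, len(sets) + 1):
        for family in combinations(sets, r):
            if not set.intersection(*family):
                return False
    return True


E = [hat(s) for s in Sigma]

# The witness used in the proof: {s1, s2, s3} lies in hat(s1), hat(s2), hat(s3).
witness = frozenset(Sigma[:3])
print(has_fip(E), all(witness in hat(s) for s in Sigma[:3]))
```

For an infinite $\Sigma$ the same witness argument applies to each finite subfamily, which is exactly the step the proof uses before invoking Lemma 5.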

CHAPTER 4

ULTRAPRODUCTS AND NONSTANDARD UNIVERSES

Motivation

In this section, we examine a very different application of ultraproducts. We will use the ultrapower construction method on $\mathbb{R}$ to form a non-archimedean structure extending the real numbers with the addition of new elements that we will call hyperreals. Using this new structure, we reexamine some fundamentals of basic calculus and standard analysis. The ideas of infinitely large and infinitely small quantities have been present in the minds of philosophers and mathematicians for more than two thousand years (infinitesimals since Archimedes: c.287 BC–c.212 BC, and infinitely large numbers since Zeno of Elea: c.490 BC–c.430 BC). In the development of Leibniz’s and Newton’s Calculus in the late 1600s, infinitesimals were treated like ideal numbers in that they behaved as real numbers do under the rules of arithmetic, but were categorically not real. That infinitesimals were not defined concretely gave many people problems. For instance, Bishop Berkeley famously criticized these infinitesimals by calling them “ghosts of departed quantities” in 1734. It was not until approximately 150 years later that Cauchy and Weierstrass found formalizations, the epsilon-delta representation and the concept of the limit, respectively, to sidestep the problem presented by infinitesimals and create the foundation for a rigorous development of calculus. These infinitesimal representations ($\varepsilon$-$\delta$ and infinite limits) can be hard for students to grasp precisely because they are not real and not concrete. However, a rigorous formulation of these quantities was provided by Abraham Robinson in the 1960s [4]. He used elements of an ultrapower of $\mathbb{R}$ to represent the “hyperreal numbers” – including those infinitely large and small – and showed that $\mathbb{R}$ can be embedded as a strict subset into this larger set of numbers. Nonstandard analysis is the branch of mathematics that studies calculus through the use of these infinite and infinitesimal quantities.
As we saw in the previous section, we can extrapolate many things about the ultrapower of $\mathbb{R}$ using Łoś's Theorem to lift properties of the reals to this new non-archimedean structure that contains both reals and hyperreal numbers. In this chapter, we shall construct such a structure (the extended set of reals), which we shall denote ${}^*\mathbb{R}$, as an ultrapower of $\mathbb{R}$ using the method as outlined in the previous section. We then define typical algebraic operations on this new number system and show that ${}^*\mathbb{R}$ is linearly ordered. We will also show how the real numbers (or an isomorphic copy of the reals) are embedded into ${}^*\mathbb{R}$ with added concrete elements of infinite numbers and infinitesimals that are strictly not real. With this expanded number system, we then provide a few examples of how nonstandard analysis can be used to bolster some basic ideas of standard analysis.

Construction of Hyperreal Numbers

Algebraic Structure: How $\mathbb{R}$ relates to ${}^*\mathbb{R}$. Let $\mathcal{R}$ denote the linearly ordered field $(\mathbb{R}, +, \cdot, <)$, where $+$, $\cdot$, and $<$ are the typical addition, multiplication, and linear ordering on $\mathbb{R}$. Let $D$ be a non-principal ultrafilter over the index set $\mathbb{N}$, and consider the ultrapower ${}^*\mathbb{R} = \prod_D \mathcal{R}$. Note that the first-order properties of ${}^*\mathbb{R}$ used below do not depend on our choice of $D$; since every non-principal ultrafilter contains the Fréchet filter, these properties depend only on the “almost everywhere” behavior of the product. The elements of this ultrapower are equivalence classes of sequences of real numbers, e.g., $r^D = [\langle r_1, r_2, \ldots \rangle]$ where each $r_i \in \mathbb{R}$. Using Łoś's Theorem to determine how to lift the usual algebraic operations of $\mathbb{R}$, we form the following definitions:

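Definition 19. Let $r^D = [\langle r_i \rangle]$ and $s^D = [\langle s_i \rangle]$ be elements of ${}^*\mathbb{R}$, as defined above. Then,

(i) $r^D +^{{}^*\mathbb{R}} s^D = [\langle r_i + s_i \rangle]$,
(ii) $r^D \cdot^{{}^*\mathbb{R}} s^D = [\langle r_i \cdot s_i \rangle]$,
(iii) $r^D <^{{}^*\mathbb{R}} s^D$ if and only if $\{ i \in \mathbb{N} \mid r_i < s_i \} \in D$, and $r^D \le^{{}^*\mathbb{R}} s^D$ if and only if $r^D <^{{}^*\mathbb{R}} s^D$ or $r^D = s^D$.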

Theorem 20. The structure ${}^*\mathbb{R}$ is a linearly ordered field.

Proof. To show that ${}^*\mathbb{R}$ is a linearly ordered field, we must show that the following properties hold: (1) the field axioms, (2) the axiom of linear order, (3) if $x \le y$ and $z \ge 0$, then $x \cdot z \le y \cdot z$, and (4) if $x \le y$ then $x + z \le y + z$.

(1) The field axioms (associativity, commutativity, distributivity, identity, and inverses for both addition and multiplication) may each be written as a sentence of first order logic. Consider, for example, associativity:

\[ \forall q\, \forall r\, \forall s\, ((q + r) + s = q + (r + s)). \]

Since each sentence representing these axioms is true in every copy of $\mathbb{R}$ in the ultrapower, by Łoś's Theorem, the interpretation of each sentence holds in the ultrapower as well, and ${}^*\mathbb{R}$ is a field.

(2) The axiom of linear orders can also be written as a sentence of first order logic:

\[ \forall x\, \forall y\, \forall z\, \{ (xPx) \wedge (xPy \wedge yPx \to x = y) \wedge (xPy \wedge yPz \to xPz) \wedge (xPy \vee yPx) \}, \]

which is equivalent to the statement: “P is reflexive, antisymmetric, transitive, and any two elements are related in one way or the other.” Again, since this sentence is true in every structure of the product, by Łoś's Theorem, the interpretation of this sentence holds in the ultrapower. Similarly for (3) and (4), we may write

\[ \forall x\, \forall y\, \forall z\, ((x \le y \wedge z \ge 0) \to (x \cdot z \le y \cdot z)) \quad \text{and} \quad \forall x\, \forall y\, \forall z\, (x \le y \to (x + z \le y + z)), \]

both of which are true in each copy of $\mathbb{R}$ in our ultrapower, and thus true as interpreted in the ultrapower by Łoś's Theorem.

Therefore, ${}^*\mathbb{R}$ is a linearly ordered field. □

We now show that $\mathbb{R}$ can be embedded into ${}^*\mathbb{R}$.

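Definition 21. For all $r \in \mathbb{R}$, define $*(r) = {}^*r$, where ${}^*r = [\langle r, r, \ldots \rangle] \in {}^*\mathbb{R}$, i.e., the class of the constant sequence with value $r$.

Theorem 22. The map $* : \mathbb{R} \to {}^*\mathbb{R}$ is an isomorphism from the ordered field $\mathbb{R}$ into the ordered field ${}^*\mathbb{R}$.

Proof. Suppose ${}^*r = {}^*s$. Then $[\langle r, r, \ldots \rangle] = [\langle s, s, \ldots \rangle]$, and as these elements are only equal if the set of indices on which they agree is a member of the ultrafilter, that set is nonempty (since $\emptyset \notin D$); hence $r = s$, and $*$ is one-to-one. That addition, multiplication, and linear order are preserved follows directly from the definitions. Observe:

\[ {}^*r +^{{}^*\mathbb{R}} {}^*s = [\langle r, r, \ldots \rangle] +^{{}^*\mathbb{R}} [\langle s, s, \ldots \rangle] = [\langle r + s, r + s, \ldots \rangle] = {}^*(r + s), \]

and similarly for multiplication and linear order. □

By this map, $\mathbb{R}$ is embedded into ${}^*\mathbb{R}$.

Definition 23. If $A \subseteq \mathbb{R}$, then $(A)^*$ is the set $\{ {}^*a \mid a \in A \}$; $(\mathbb{R})^*$ is the set of standard numbers in ${}^*\mathbb{R}$.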

It is fairly obvious that the set of standard numbers is strictly contained in our new structure. Consider the element $r = [\langle 1, 2, 3, \ldots, n, \ldots \rangle]$. This element is strictly greater than any element ${}^*x$ in $(\mathbb{R})^*$, since at some point the value of the sequence $r_i$ will exceed $x \in \mathbb{R}$, and will then be greater than $x$ for the remainder of the infinite sequence. Therefore the set of indices on which $r$ is greater than ${}^*x$ is cofinite, so it is a member of the Fréchet filter and hence of $D$, and $r >^{{}^*\mathbb{R}} {}^*x$. This element is greater than any finite real number, and hence, infinite. Similarly, consider the element $s = [\langle 1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots, \tfrac{1}{n}, \ldots \rangle]$. Following the same reasoning as above, this element is smaller than every positive real, yet not equal to ${}^*0$. Thus, this element is infinitely small, and hence, an infinitesimal number. Notice that both of these quantities can be operated on by the same functions as the reals and obey the same rules through interpretation in our ultrapower, though they are certainly not real.
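The "from some point on" reasoning above can be made concrete by computing the index beyond which the comparison holds; everything before that index is a finite exceptional set, so the set of good indices is cofinite. This is only a sketch: the function names are hypothetical, and it illustrates the cofiniteness claim rather than constructing an ultrafilter.

```python
import math
from fractions import Fraction


def first_index_exceeding(x):
    """Smallest n >= 1 with n > x, so that r_n = n exceeds x from n onward."""
    return max(1, math.floor(x) + 1)


def first_index_below(eps):
    """Smallest n >= 1 with 1/n < eps (eps > 0), so s_n = 1/n stays below eps."""
    return math.floor(1 / eps) + 1


def exceptional_indices(x):
    """The finite set of indices n >= 1 at which r_n = n fails to exceed x."""
    return list(range(1, first_index_exceeding(x)))


# The exceptions are finite for any real bound, so the complement is cofinite.
print(len(exceptional_indices(1000)), first_index_below(Fraction(1, 10)))
```

Because the exceptional set is finite for every standard bound, its complement belongs to the Fréchet filter, which is the only property of $D$ the argument uses.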

New Vocabulary

Before diving into analysis in our new structure, we first provide the following definitions.

Members of ${}^*\mathbb{R}$ are called “hyperreal numbers”; members of $(\mathbb{R})^*$ are “real” or “standard,” as Definition 23 indicates. Members of ${}^*\mathbb{Q} = \prod_D \mathbb{Q}$ are hyperrationals, of ${}^*\mathbb{Z} = \prod_D \mathbb{Z}$ are hyper-integers, and of ${}^*\mathbb{N} = \prod_D \mathbb{N}$ are hypernaturals. For the remainder of the section, for simplicity's sake we drop the equivalence class-sequence notation, $[\langle r_i \rangle]$, for elements of ${}^*\mathbb{R}$ and use $r \in {}^*\mathbb{R}$, $n \in {}^*\mathbb{N}$, etc. instead. Similarly, we drop the $*()$ notation, so now, for example, $0 = {}^*0 = [\langle 0, 0, \ldots \rangle]$. Keep in mind that these are infinite sequences and all functions and relations on ${}^*\mathbb{R}$ are applied as such. Hyperreal numbers can be divided into four categories as follows.

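Definition 24. A hyperreal number $b$ is:

(a) limited if and only if $|b| < n$ for some $n \in \mathbb{N}$,
(b) unlimited if and only if $|b| > n$ for all $n \in \mathbb{N}$,
(c) infinitesimal if and only if $|b| < \frac{1}{n}$ for all $n \in \mathbb{N}$ with $n \ge 1$,
(d) appreciable if and only if $\frac{1}{n} < |b| < n$ for some $n \in \mathbb{N}$ with $n \ge 1$.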

All real numbers and infinitesimals are limited. The only real infinitesimal is 0.

We denote the set of unlimited hyperreal numbers and hypernaturals by ${}^*\mathbb{R}_\infty$ and ${}^*\mathbb{N}_\infty$, respectively.

Definition 25. Let $x$ and $y$ be elements of ${}^*\mathbb{R}$. Then,

(i) $x$ and $y$ are infinitely close if $|x - y|$ is infinitesimal, written $x \simeq y$. This defines an equivalence relation on ${}^*\mathbb{R}$; the $\simeq$-equivalence class of $x$ is called the halo, $\mathrm{hal}(x) = \{ y \in {}^*\mathbb{R} \mid x \simeq y \}$.
(ii) $x$ and $y$ are finitely close if $|x - y|$ is limited, written $x \sim y$. The galaxy of $x$ is the $\sim$-equivalence class $\mathrm{gal}(x) = \{ y \in {}^*\mathbb{R} \mid x \sim y \}$.

Lemma 26. Let $x, y \in {}^*\mathbb{R}$. If $x$ and $y$ are not infinitely close and at least one is limited, then there is a standard $q \in \mathbb{R}$ strictly between $x$ and $y$.

Proof. Without loss of generality, let 0

(m − 1) · b ≤R x ⇒ m · b ≤R x + b together with 0

Theorem 27. Every limited $x \in {}^*\mathbb{R}$ is infinitely close to a unique $c \in \mathbb{R}$.

Proof. Let $A = \{ r \in \mathbb{R} \mid r < x \}$. Since $x$ is limited, there exist $r, s \in \mathbb{R}$ such that $r < x < s$, and $A$ is nonempty and bounded above in $\mathbb{R}$ by $s$. By the completeness of $\mathbb{R}$, $A$ has a least upper bound, $c \in \mathbb{R}$. We claim $x \simeq c$. Let $\varepsilon \in \mathbb{R}$ such that $\varepsilon > 0$. Since $c$ is an upper bound for $A$, $c + \varepsilon \notin A$, and thus $x \le c + \varepsilon$. We also know that $x \not\le c - \varepsilon$, otherwise $c - \varepsilon$ would be an upper bound for $A$, contradicting that $c$ is the least upper bound. Thus,

\[ c - \varepsilon < x \le c + \varepsilon \ \Rightarrow\ |x - c| \le \varepsilon. \]

This inequality holds for all positive reals $\varepsilon$, and therefore $x \simeq c$.

To show that this $c$ is unique, consider $x \simeq c'$ where $c' \in \mathbb{R}$ and $c' \neq c$. However, $x \simeq c$ and $x \simeq c'$ imply that $c \simeq c'$, and as both $c, c' \in \mathbb{R}$, we must have $c = c'$. □

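The particular $c$ in the proof of the previous theorem is referred to as the shadow of $x$. Thus, we have a function $\mathrm{sh}(x)$ from the limited numbers of ${}^*\mathbb{R}$ to $\mathbb{R}$.

Corollary 28. Every limited $x \in {}^*\mathbb{R}$ has a unique decomposition of the form $x = \mathrm{sh}(x) + i$ where $i$ is an infinitesimal.

Sometimes the shadow function is also referred to as the “standard part map.” This function has the following properties (stated, but not proven):

Theorem 29. For limited $x, y \in {}^*\mathbb{R}$:

(a) $\mathrm{sh}$ maps the limited elements of ${}^*\mathbb{R}$ onto $\mathbb{R}$.
(b) $\mathrm{sh}(x) = 0$ if and only if $x$ is infinitesimal.
(c) $\mathrm{sh}(x +^{{}^*\mathbb{R}} y) = \mathrm{sh}(x) + \mathrm{sh}(y)$.
(d) $\mathrm{sh}(x \cdot^{{}^*\mathbb{R}} y) = \mathrm{sh}(x) \cdot \mathrm{sh}(y)$.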

Examples

In calculus, the ideas of convergence and infinitely large and small quantities are inseparable. As typically taught, convergence and other fundamental ideas of differential and integral calculus (e.g., the continuity of functions) are described in terms of ε’s and δ’s, which are defined not by what they are, but more so by what they are not. However, in the previous sections we developed concrete objects and a new structure in which to apply them that would satisfy these roles. Naturally, the next step is then to reexamine our old calculus techniques with these new objects.

Convergence. A real-valued sequence $\langle S_n : n \in \mathbb{N} \rangle$ is a real-valued function $S : \mathbb{N} \to \mathbb{R}$. Using our ultraproduct construction tools, we can extend this sequence $S$ to a “hypersequence” from ${}^*\mathbb{N}$ to ${}^*\mathbb{R}$, ultrapowers of $\mathbb{N}$ and $\mathbb{R}$, respectively. Terms $S_n$ then become defined on the unlimited hypernaturals, ${}^*\mathbb{N}_\infty$; we define the collection $\{ S_n \mid n \in {}^*\mathbb{N}_\infty \}$ to be the extended tail of $S$. It is the behavior of this extended tail that really differentiates our new method from the old. We state the equivalence of convergence of a sequence in our new structure to that of a sequence in $\mathbb{R}$ as a theorem.

Theorem 30. A sequence $S : \mathbb{N} \to \mathbb{R}$ converges to $L \in \mathbb{R}$ if and only if for each $n \in {}^*\mathbb{N}_\infty$, $S_n \simeq L$.

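Proof. (⇒) Suppose the sequence $\langle S_n \mid n \in \mathbb{N} \rangle$ converges to the limit $L \in \mathbb{R}$, and let $\varepsilon \in \mathbb{R}^+$ be given. From the standard convergence condition, we know that there exists an $m \in \mathbb{N}$ such that $|S_n - L| < \varepsilon$ for all $n \in \mathbb{N}$ where $n > m$. Using Łoś's Theorem, we can transfer this criterion to the extended tail,

\[ \forall n \in {}^*\mathbb{N}\ (n > m \to |S_n - L| < \varepsilon). \]

Now fix some $N \in {}^*\mathbb{N}_\infty$. Since $N$ is unlimited, we have $N > m$, and the above sentence implies

\[ |S_N - L| < \varepsilon. \]

As $\varepsilon$ was an arbitrary positive real, $S_N \simeq L$ as desired.

(⇐) Conversely, suppose that $S_n \simeq L$ for all $n \in {}^*\mathbb{N}_\infty$, and let $\varepsilon \in \mathbb{R}^+$ be given. Let $N \in {}^*\mathbb{N}_\infty$. For any $z > N$, it follows that $z$ is also unlimited, and thus $S_z \simeq L$, so that $|S_z - L| < \varepsilon$. Equivalently,

\[ \forall z \in {}^*\mathbb{N}\ (z > N \to |S_z - L| < \varepsilon). \]

Therefore the sentence

\[ (\ast) \qquad \exists n \in {}^*\mathbb{N}\ \forall z \in {}^*\mathbb{N}\ (z > n \to |S_z - L| < \varepsilon) \]

is true in ${}^*\mathbb{R}$. However, this sentence is the interpretation in the ultrapower of

\[ (\ast\ast) \qquad \exists n \in \mathbb{N}\ \forall z \in \mathbb{N}\ (z > n \to |S_z - L| < \varepsilon) \]

as interpreted in each structure $\mathcal{R} = (\mathbb{R}, +, \cdot, <)$ in our ultrapower ${}^*\mathbb{R}$. Thus if $(\ast)$ is true in the ultrapower, $(\ast\ast)$ is true in $\mathbb{R}$ by Łoś's Theorem, and we have traditional convergence to $L$ on $\mathbb{R}$. □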

Series. A real-valued infinite series $\sum_{i=1}^{\infty} a_i$ is convergent if and only if the sequence $s = \langle s_n : n \in \mathbb{N} \rangle$ of partial sums, $s_n = a_1 + \cdots + a_n$, is convergent. Applying the convergence results on sequences from above, we have:

• $\sum_{i=1}^{\infty} a_i = L$ in $\mathbb{R}$ if and only if $\sum_{i=1}^{n} a_i \simeq L$ for all $n \in {}^*\mathbb{N}_\infty$.
• $\sum_{i=1}^{\infty} a_i$ converges in $\mathbb{R}$ if and only if $\sum_{i=m}^{n} a_i \simeq 0$ for all $n, m \in {}^*\mathbb{N}_\infty$ where $m < n$.

The latter is given by the Cauchy Convergence Criterion which was not discussed explicitly above. To show how these ideas can be used more concretely than in traditional real analysis, we provide the following example. Note an additional fact used for this example is as follows: for limited $b$ and unlimited $n$, the quantity $\frac{b}{n}$ is infinitesimal [5].

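Example 31. Show that $\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1$.

Let $S(b) = \sum_{n=1}^{b} \frac{1}{n(n+1)}$ for $b \in \mathbb{N}$. This extends to the unlimited hypernaturals: for $x \in {}^*\mathbb{N}_\infty$,

\[
\begin{aligned}
S(x) - 1 &= \sum_{n=1}^{x} \frac{1}{n(n+1)} - 1 \\
&= \sum_{n=1}^{x} \left( \frac{1}{n} - \frac{1}{n+1} \right) - 1 \\
&= \sum_{n=1}^{x} \frac{1}{n} - \sum_{n=1}^{x} \frac{1}{n+1} - 1 \\
&= -\frac{1}{x+1},
\end{aligned}
\]

which is infinitesimal as desired, and the limit of $S$ is 1.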
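The telescoping identity behind this example can be checked with exact rational arithmetic for ordinary finite $b$; by transfer it is the same identity that is then applied at unlimited $x$. This is only a sanity-check sketch, with `S` mirroring the partial sum defined above.

```python
from fractions import Fraction


def S(b):
    """Partial sum S(b) = sum_{n=1}^{b} 1/(n(n+1)), computed exactly."""
    return sum(Fraction(1, n * (n + 1)) for n in range(1, b + 1))


for b in (1, 10, 1000):
    # The telescoping identity S(b) - 1 = -1/(b+1) from the example.
    assert S(b) - 1 == Fraction(-1, b + 1)
```

Exact fractions are used instead of floating point so that the identity is verified rather than approximated.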

Continuity.

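Theorem 32. The function $f$ is continuous at the point $c \in \mathbb{R}$ if and only if $f(x) \simeq f(c)$ for all $x \in {}^*\mathbb{R}$ such that $x \simeq c$.

Proof. (⇐) By the standard definition of continuity at a point $c$, we wish to show that

\[ (\ast) \qquad \forall \varepsilon \in \mathbb{R}^+\ \exists \delta \in \mathbb{R}^+\ \forall x \in \mathbb{R}\ (|x - c| < \delta \to |f(x) - f(c)| < \varepsilon). \]

Suppose that $x \simeq c \to f(x) \simeq f(c)$ and let $\varepsilon \in \mathbb{R}^+$ be given. For any positive infinitesimal $d \in {}^*\mathbb{R}^+$, we know that for any $x \in {}^*\mathbb{R}$,

\[ |x - c| < d \to x \simeq c \to f(x) \simeq f(c) \]

by assumption, and therefore $|f(x) - f(c)| < \varepsilon$ for the given real $\varepsilon$. Hence, taking $\delta = d$,

\[ \exists \delta \in {}^*\mathbb{R}^+\ \forall x \in {}^*\mathbb{R}\ (|x - c| < \delta \to |f(x) - f(c)| < \varepsilon). \]

By Łoś's Theorem, this implies

\[ \exists \delta \in \mathbb{R}^+\ \forall x \in \mathbb{R}\ (|x - c| < \delta \to |f(x) - f(c)| < \varepsilon). \]

(⇒) Conversely, assume that $(\ast)$ holds and let $\varepsilon \in \mathbb{R}^+$. Then, by $(\ast)$, there exists a $\delta \in \mathbb{R}^+$ such that

\[ \forall x \in \mathbb{R}\ (|x - c| < \delta \to |f(x) - f(c)| < \varepsilon). \]

By Łoś's Theorem, we have

\[ \forall x \in {}^*\mathbb{R}\ (|x - c| < \delta \to |f(x) - f(c)| < \varepsilon) \]

for the same $\varepsilon$ and $\delta$. If $x \simeq c$, then $|x - c| < \delta$ for any positive real $\delta$, and so specifically for the one chosen above, and thus $|f(x) - f(c)| < \varepsilon$. As this holds for all $\varepsilon \in \mathbb{R}^+$, $f(x) \simeq f(c)$. □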

Derivative. As is hopefully clear at this point, the system of real numbers is inadequate for analyzing infinitesimal behavior in a concrete way. The derivative, the cornerstone of calculus, is perhaps where this inadequacy is most obvious. Leibniz’s $dx$ was less than any assignable quantity, but strictly greater than zero to allow for certain calculations, a quantity that does not exist on the real number line. In our new system, however, this quantity is clearly defined as an infinitesimal. In the standard definition, the derivative of a function $f$ at a real number $x$ is the real number representing the rate of change of the function $f$ as you get very close to $x$. Again, it is this notion of “very close” that can be problematic in traditional real analysis. We hope to offer some relief here.

Theorem 33. If $f$ is defined at $x \in \mathbb{R}$, then the real number $L$ is the derivative of $f$ at $x$, $f'(x) = L$, if and only if for every nonzero infinitesimal $\varepsilon$, $f(x + \varepsilon)$ is defined and

\[ \frac{f(x + \varepsilon) - f(x)}{\varepsilon} \simeq L. \]

Proof. (⇒) Assuming that $f$ is defined at $x \in \mathbb{R}$, and we have $f'(x) = L$, we may write

\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = L \]

in the standard definition of a derivative on $\mathbb{R}$. Let $g(h) = \frac{f(x + h) - f(x)}{h}$. That $\lim_{h \to 0} g(h) = L$ is characterized in our new structure by: $\lim_{h \to 0} g(h) = L$ if and only if $g(h) \simeq L$ for all $h \in {}^*\mathbb{R}$ where $h \simeq 0$ and $h \neq 0$, which can be obtained by interpreting the standard $\varepsilon$-$\delta$ definition of limit in ${}^*\mathbb{R}$ as we did for sequences and series. By this characterization of limit,

\[ f'(x) = L \iff \lim_{h \to 0} g(h) = L \iff g(h) = \frac{f(x + h) - f(x)}{h} \simeq L, \]

for all $h \simeq 0$ where $h \neq 0$. The proof of the ⇐ direction follows from the “if and only if” nature of the limit characterization. □

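Example 34. Let $f(x) = x^2$. Then $f'(a) = 2a$, since for every nonzero infinitesimal $\varepsilon$,

\[ \frac{f(a + \varepsilon) - f(a)}{\varepsilon} = \frac{(a + \varepsilon)^2 - a^2}{\varepsilon} = \frac{a^2 + 2a\varepsilon + \varepsilon^2 - a^2}{\varepsilon} = \frac{2a\varepsilon + \varepsilon^2}{\varepsilon} = 2a + \varepsilon \simeq 2a. \]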
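The algebra in this example is an identity that holds for every nonzero increment, infinitesimal or not, so it can be sanity-checked with ordinary small rationals standing in for $\varepsilon$. The sketch below assumes nothing beyond $f(x) = x^2$; the helper name `difference_quotient` is introduced here for illustration, and the rational increments are stand-ins, not true infinitesimals.

```python
from fractions import Fraction


def f(x):
    return x * x


def difference_quotient(a, eps):
    """(f(a + eps) - f(a)) / eps for a nonzero increment eps."""
    return (f(a + eps) - f(a)) / eps


a = Fraction(3)
for k in (1, 5, 20):
    eps = Fraction(1, 10 ** k)
    # The quotient differs from 2a by exactly eps, matching 2a + eps above.
    assert difference_quotient(a, eps) - 2 * a == eps
```

In the hyperreal setting the same identity is applied with $\varepsilon$ infinitesimal, and the leftover term $\varepsilon$ is then discarded by passing to the shadow.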

We now introduce another notation that will be useful in our next proof of the chain rule. Let ∆x denote an arbitrary nonzero infinitesimal representing a change or increment in the value of x. We can then define ∆ f = f (x + ∆x) − f (x) to be the corresponding increment in the value of the function f at x.

If the derivative of $f$ exists at $x \in \mathbb{R}$, then by the previous theorem we may write $\frac{\Delta f}{\Delta x} \simeq f'(x)$, reminiscent of Newton’s quotient for derivatives. Also, by the previous theorem, we see that $\frac{\Delta f}{\Delta x}$ is limited, and $\Delta f$ is infinitesimal (as $\Delta f = \frac{\Delta f}{\Delta x} \Delta x$ and a limited quantity multiplied by an infinitesimal is always infinitesimal [5]). Therefore, $f(x + \Delta x) \simeq f(x)$ for all infinitesimal $\Delta x$. Using this notation, and the following theorem, the proof of the chain rule will be rather straightforward.

Theorem 35. (Incremental Equation) If $f'(x)$ exists at $x \in \mathbb{R}$ and $\Delta x$ is infinitesimal, then $\Delta f$ is infinitesimal, and there is an infinitesimal $\varepsilon$, dependent on $x$ and $\Delta x$, such that:

\[ \Delta f = f'(x)\Delta x + \varepsilon \Delta x, \]

and so $f(x + \Delta x) = f(x) + f'(x)\Delta x + \varepsilon \Delta x$.

Now we may proceed with the proof of the chain rule.

Theorem 36. (Chain Rule) If $f$ is differentiable at $x \in \mathbb{R}$, and $g$ is differentiable at $f(x)$, then $g \circ f$ is differentiable at $x$ with derivative $g'(f(x)) f'(x)$.

Proof. Let $\Delta x$ be a nonzero infinitesimal. Since $f$ is differentiable at $x$, we may write $f(x + \Delta x) \simeq f(x)$ as shown above. Also, since $g'(f(x))$ exists, $g$ must be defined at all points $y$ such that $f(x) \simeq y$. Thus $(g \circ f)(x + \Delta x) = g(f(x + \Delta x))$ is defined. Using our definitions above, we have:

\[ \Delta f = f(x + \Delta x) - f(x) \quad \text{and} \quad \Delta(g \circ f) = g(f(x + \Delta x)) - g(f(x)), \]

the increments of $f$ and $g \circ f$ corresponding to the increment $\Delta x$ of $x$. Since $\Delta f$ is then also an infinitesimal, we may write

\[ \Delta(g \circ f) = g(f(x) + \Delta f) - g(f(x)), \]

which shows that $\Delta(g \circ f)$ is the increment of $g$ at $f(x)$ corresponding to the increment $\Delta f$ of $f$. Then, by the incremental equation for $g$ (Theorem 35), there exists an infinitesimal $\varepsilon$ such that $\Delta(g \circ f) = g'(f(x))\Delta f + \varepsilon \Delta f$, and hence,

\[
\begin{aligned}
\frac{\Delta(g \circ f)}{\Delta x} &= g'(f(x)) \frac{\Delta f}{\Delta x} + \varepsilon \frac{\Delta f}{\Delta x} \\
&\simeq g'(f(x)) f'(x) + 0,
\end{aligned}
\]

and $g'(f(x)) f'(x)$ is the derivative of $g \circ f$ at $x$. □

An Observation: Completion by Enlargement. The process of enlarging a structure by taking an ultraproduct is a kind of completion. Consider that the system of rational numbers $\mathbb{Q}$ can be viewed as incomplete, in that it does not contain certain elements – such as the limits of Cauchy sequences, the sums of infinite series, etc. It can be shown that the enlargement of $\mathbb{Q}$ in a nonstandard framework can complete it, though this process leads to some redundancies, so simply taking an ultraproduct is not the final step. An outline of the process is provided below. In essence, the reals are obtained from the rationals by taking the quotient ring ${}^*\mathbb{Q}_{\mathrm{lim}} / {}^*\mathbb{Q}_{\mathrm{inf}}$, where ${}^*\mathbb{Q}_{\mathrm{lim}}$ is the set of limited hyperrationals and ${}^*\mathbb{Q}_{\mathrm{inf}}$ is the set of infinitesimal hyperrationals. Elements of ${}^*\mathbb{Q}_{\mathrm{lim}} / {}^*\mathbb{Q}_{\mathrm{inf}}$ are cosets: ${}^*\mathbb{Q}_{\mathrm{inf}} + x = \{ q + x : q \in {}^*\mathbb{Q}_{\mathrm{inf}} \}$ where $x \in {}^*\mathbb{Q}_{\mathrm{lim}}$, which are the same as equivalence classes of ${}^*\mathbb{Q}_{\mathrm{lim}}$ under the relation $\simeq$ (the relation of infinite closeness). The isomorphism with $\mathbb{R}$ is then given by the map ${}^*\mathbb{Q}_{\mathrm{inf}} + x \mapsto \mathrm{sh}(x)$, where $\mathrm{sh}(x)$ is the shadow function described previously. A more detailed explanation can be found in Goldblatt, 1998 [5].

CHAPTER 5

THE AX-GROTHENDIECK THEOREM

Motivation

The case of the Ax-Grothendieck Theorem for which we will provide a proof is a very important result in complex analysis that states that any function from $\mathbb{C}^n$ to $\mathbb{C}^n$ that is polynomial in every coordinate and one-to-one is necessarily onto. The only readily accessible complex analytic proof of this theorem was provided by Walter Rudin in 1995 [1]. The proof provided here utilizes an ultraproduct of the algebraic closures of $\mathbb{Z}_p$ (where $p$ is prime). Though this method may not be simple, it clearly shows the value that the ultraproduct can add as a tool with which one can prove results in a given discipline from outside that discipline. In the same sense that isomorphisms offer a way to extend phenomena from one structure to another in group or ring theory, the ultraproduct construction offers a way to tackle this proof from a realm almost entirely outside that of complex analysis. The only complex analysis, per se, involved in the following proof is the fact that the cardinality of $\mathbb{C}$ is $2^{\aleph_0}$ and the fact that $\mathbb{C}$ is algebraically closed.

Statement of the Theorem

Theorem 37. (Ax-Grothendieck) Given $F : \mathbb{C}^n \to \mathbb{C}^n$ such that $F$ is polynomial in every coordinate, if $F$ is one-to-one, then $F$ is onto.

It is important to note that in this theorem, “onto” and “one-to-one” cannot be interchanged (even though they can be in the pigeonhole principle) since, for example, $f : \mathbb{C} \to \mathbb{C}$ given by $f(x) = x^2$ is onto, but it is not one-to-one. Before we present the proof, we should first define what it means for a function to be polynomial in every coordinate. This simply means that for $f : \mathbb{C}^n \to \mathbb{C}^n$, we have $f(x_1, \ldots, x_n) = (p_1(x_1, \ldots, x_n), \ldots, p_n(x_1, \ldots, x_n))$, where each $p_i$ $(1 \le i \le n)$ is a polynomial map from $\mathbb{C}^n$ into $\mathbb{C}$.
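The pigeonhole behavior mentioned above, where one-to-one and onto coincide on a finite set, can be checked exhaustively for polynomial maps on a small finite field. This is only an illustrative sketch: the choice $p = 5$, the degree bound, and the helper names are assumptions made for the example, and it covers the one-variable case only.

```python
from itertools import product


def poly_map(coeffs, p):
    """Return the table x -> c0 + c1*x + c2*x^2 + ... (mod p) on Z_p."""
    return [sum(c * pow(x, k, p) for k, c in enumerate(coeffs)) % p
            for x in range(p)]


def is_injective(table):
    return len(set(table)) == len(table)


def is_surjective(table, p):
    return set(table) == set(range(p))


p = 5
# On the finite set Z_p, injective and surjective coincide for every map.
for coeffs in product(range(p), repeat=4):
    table = poly_map(coeffs, p)
    assert is_injective(table) == is_surjective(table, p)

# x^2 on Z_5 is neither one-to-one nor onto; x^3 on Z_5 is both.
print(poly_map((0, 0, 1), p), poly_map((0, 0, 0, 1), p))
```

Over $\mathbb{C}$ the two properties separate, as the example $f(x) = x^2$ shows, which is why only one direction survives in the theorem.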

Proof of the Theorem

We present the proof of this theorem in four parts. In part one, we show that $\overline{\mathbb{Z}_p}$, the algebraic closure of the structure $\mathbb{Z}_p = (\mathbb{Z}_p, +, \cdot)$, where $\mathbb{Z}_p$ is the set of integers modulo $p$ for prime $p$, may be written as $\overline{\mathbb{Z}_p} = \bigcup_{n \in \mathbb{N}} K_n$ where each $K_n$ is a finite field, i.e., that each $\overline{\mathbb{Z}_p}$ is a “tower of finite fields.” In part two, we check that any $f : (\overline{\mathbb{Z}_p})^k \to (\overline{\mathbb{Z}_p})^k$ that is polynomial in every coordinate and one-to-one must necessarily be onto. In part three, we show that for the ultraproduct $\prod_D \overline{\mathbb{Z}_p}$, where $D$ is a non-principal ultrafilter on $P$, the set of all primes, any such $f : (\prod_D \overline{\mathbb{Z}_p})^k \to (\prod_D \overline{\mathbb{Z}_p})^k$ that is one-to-one must, again, also be onto. Finally, in part four we show that the ultraproduct $\prod_D \overline{\mathbb{Z}_p}$ is isomorphic to $\mathbb{C}$, thus completing our proof. Note that the $n = 1$ case of the theorem, i.e., functions $f : \mathbb{C} \to \mathbb{C}$, is trivial since the only one-to-one polynomial functions are linear functions, all of which are onto $\mathbb{C}$.

Proof. PART ONE. To prove that $\overline{\mathbb{Z}_p} = \bigcup_{n \in \mathbb{N}} K_n$ where each $K_n$ is a finite field, we utilize the following theorem.

Theorem A. Given a finite field $F$, there exists an infinite sequence of finite fields $\langle F_n : n \in \mathbb{N} \rangle$ such that:

(a) $F_0 = F$;
(b) $F_n$ is a subfield of $F_{n+1}$;
(c) $\bigcup_{n \in \mathbb{N}} F_n$ is algebraically closed.

However, in order to prove Theorem A, we first require the following lemma:

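Lemma B. There is an infinite sequence $\langle k_n : n \in \mathbb{N} \rangle$, where each $k_n \in \mathbb{N}$, such that:

(a) Each natural number appears infinitely often in $\langle k_n : n \in \mathbb{N} \rangle$, i.e., $\{ n : k_n = j \}$ is infinite for any $j \in \mathbb{N}$.
(b) $k_n \le n$ for each $n \in \mathbb{N}$.

PROOF OF LEMMA B. Consider the $\mathbb{N}$ by $\mathbb{N}$ matrix, $A$, all of whose rows are the same and given by $a_{ij} = j$,

\[
A = \begin{pmatrix}
0 & 1 & 2 & 3 & \cdots \\
0 & 1 & 2 & 3 & \cdots \\
0 & 1 & 2 & 3 & \cdots \\
0 & 1 & 2 & 3 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}.
\]

Let $\langle P_n : n \in \mathbb{N} \rangle$ be the “zig-zag” enumeration of the positions of $A$ (reminiscent of the counting argument used to show $\aleph_0 \times \aleph_0 = \aleph_0$), where $P_n = \langle a_n, b_n \rangle$ as follows:

$P_0 = \langle 0, 0 \rangle$,

$P_1 = \langle 0, 1 \rangle$,

$P_2 = \langle 1, 0 \rangle$,

$P_3 = \langle 0, 2 \rangle$,

$P_4 = \langle 1, 1 \rangle$,

$P_5 = \langle 2, 0 \rangle$, ...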
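Taking the first coordinates of this zig-zag enumeration gives a candidate for the sequence of Lemma B, and both conditions can be spot-checked on a finite prefix. This is only a sketch: the generator `zigzag` and the prefix length are choices made for illustration, and a finite check of course cannot prove that each value occurs infinitely often.

```python
from itertools import count, islice


def zigzag():
    """Yield index pairs (a, b) diagonal by diagonal: (0,0), (0,1), (1,0), ..."""
    for d in count():
        for a in range(d + 1):
            yield (a, d - a)


def k_sequence(length):
    """First `length` terms of k_n := a_n, the first coordinate of P_n."""
    return [a for a, _ in islice(zigzag(), length)]


ks = k_sequence(5000)

# Condition (b): k_n <= n on the whole prefix.
assert all(k <= n for n, k in enumerate(ks))

# Condition (a), finitely: a fixed value keeps recurring, once per later diagonal.
print(ks[:10], ks.count(3))
```

The value $j$ first appears on diagonal $j$ and then once on every later diagonal, which is the reason each natural number recurs infinitely often.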

The desired sequence $\langle k_n : n \in \mathbb{N} \rangle$ can be obtained simply by setting $k_n := a_n$ for all $n \in \mathbb{N}$ (or $k_n := b_n$ for all $n \in \mathbb{N}$). As each row and column of our matrix $A$ is infinite, each value $j \in \mathbb{N}$ appears infinitely often in this sequence, and since $P_n$ lies on an anti-diagonal whose coordinates sum to at most $n$, we have $k_n \le n$. This satisfies both conditions and completes the proof of Lemma B.

The proof of Theorem A will also utilize Kronecker’s Theorem, provided below.

Theorem C. (Kronecker’s Theorem) Given a field $K$ and a non-constant polynomial $f(x) \in K[x]$, there exists an extension field $K' \supset K$ such that $f(x)$ has at least one root in $K'$. Moreover, if $K$ is a finite field, then $K'$ can also be arranged to be a finite field.

For a proof of this theorem, see Fraleigh, 2003 [6]. Observe that to extend $K$, $K'$ can be taken to be the quotient ring $K[x]/(p(x))$, where $p(x)$ is an irreducible polynomial that divides $f(x)$. The existence of $p(x)$ follows from the fact that $K[x]$ is a unique factorization domain [6]. Also note that $|K'| = |K|^n$, where $n$ is the degree of the polynomial $p(x)$, and therefore for finite $K$ and any degree $n$, $|K'|$ will also be finite. We are now equipped to prove Theorem A.
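To make the quotient construction concrete, here is a small Python sketch for one specific case chosen for illustration (it does not appear in the thesis): $K = \mathbb{Z}_2$ and $f(x) = x^2 + x + 1$, which is irreducible over $\mathbb{Z}_2$, so $p(x) = f(x)$. Elements of $K' = K[x]/(p(x))$ are represented as pairs `(c0, c1)` standing for $c_0 + c_1 t$, where $t$ is the class of $x$; the helper names are hypothetical.

```python
from itertools import product

P = 2  # K = Z_2; K' = K[x]/(x^2 + x + 1), elements are pairs (c0, c1) = c0 + c1*t


def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)


def mul(u, v):
    # (u0 + u1 t)(v0 + v1 t), reduced using t^2 = t + 1 (valid in characteristic 2)
    c0 = u[0] * v[0]
    c1 = u[0] * v[1] + u[1] * v[0]
    c2 = u[1] * v[1]
    return ((c0 + c2) % P, (c1 + c2) % P)


def f(z):
    """Evaluate f(x) = x^2 + x + 1 at an element z of K'."""
    return add(add(mul(z, z), z), (1, 0))


ZERO, ONE = (0, 0), (1, 0)
elements = list(product(range(P), repeat=2))

# f has no root in K itself (K sits inside K' as the pairs (c, 0)) ...
roots_in_K = [c for c in range(P) if f((c, 0)) == ZERO]
# ... but it does have roots in the extension K'.
roots_in_K_prime = [z for z in elements if f(z) == ZERO]
# K' is a field: every nonzero element has a multiplicative inverse.
nonzero_have_inverses = all(
    any(mul(z, w) == ONE for w in elements) for z in elements if z != ZERO
)

print(len(elements), roots_in_K, roots_in_K_prime, nonzero_have_inverses)
```

Here $|K'| = 4 = |K|^2$, matching the count $|K'| = |K|^n$ with $n = \deg p = 2$.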

PROOF OF THEOREM A. We construct the desired sequence $\langle F_n : n \in \mathbb{N} \rangle$ by recursion.

Let $F_0 = F$, the original finite field. For the successor step, suppose that we have already constructed the sequence up to some fixed natural number $j$, i.e., $\langle F_n : n \le j \rangle$. Since each $F_n[x]$ is countable, we are able to enumerate the non-constant elements (polynomials) of $F_0[x], F_1[x], \ldots, F_j[x]$ as $\langle f_{i,n}(x) : i \in \mathbb{N} \rangle$ for each $n \le j$, respectively.

Applying Lemma B, we consider $k_j$, where $j$ is the same fixed natural number from above. By condition (b) of Lemma B, $k_j \le j$ and therefore the element $F_{k_j}$ of our sequence has already been constructed at this stage. Consider then $F_{k_j}[x]$ and, writing $n = k_j$, let $i_0$ be the first $i \in \mathbb{N}$ such that $f_{i,n}(x)$ has no root in $F_j$. Define $F_{j+1}$ to be a finite field extension of $F_j$ which has a root for $f_{i_0,n}(x)$, as provided by Kronecker’s theorem, and therefore we have that $F_j$ is a subfield of $F_{j+1}$.

We now have an infinite sequence of finite fields (see above on $|K'|$), $\langle F_n : n \in \mathbb{N} \rangle$, such that conditions (a) and (b) of Theorem A are met.

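Since each natural number appears infinitely often in $\langle k_n : n \in \mathbb{N} \rangle$, by the time $\langle F_n : n \in \mathbb{N} \rangle$ is constructed we have assured that every non-constant polynomial in $F_n[x]$, for each $n$, has a root (since we revisit infinitely often whether the polynomials in $F_n[x]$ have roots), thereby ensuring that $\bigcup_{n \in \mathbb{N}} F_n$ is algebraically closed.

Therefore, for the finite field $\mathbb{Z}_p$, we can write $\overline{\mathbb{Z}_p} = \bigcup_{n \in \mathbb{N}} F_n$ where each $F_n$ is a finite field (the union is algebraic over $\mathbb{Z}_p$, since each $F_n$ is finite).

PART TWO. Let $F$ be a field that can be written $F = \bigcup_{n \in \mathbb{N}} F_n$ where each $F_n$ is a finite field and $F_n \subseteq F_{n+1}$ (as in Part One). We claim that every $f : F^k \xrightarrow{1-1} F^k$, where $k \in \mathbb{N}$, such that $f$ is polynomial in every coordinate must be onto.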

Let $f$ be one-to-one, where $f(x_1,\ldots,x_k) = (p_1(x_1,\ldots,x_k),\ldots,p_k(x_1,\ldots,x_k))$, and let $C_i$, for $1 \le i \le k$, be the set of coefficients of the polynomial $p_i$. Thus $C_i \subseteq F$ for all $1 \le i \le k$, and each $C_i$ is finite since each $p_i$ has finitely many terms. Then $\bigcup_{i=1}^{k} C_i$ (the set of all coefficients used in all of the polynomial coordinates of $f$) is also finite. Since $F$ is a union of finite fields $F_n$, there is some $n_0 \in \mathbb{N}$ such that $\bigcup_{i=1}^{n_0} F_i$ contains $\bigcup_{i=1}^{k} C_i$ (all of the coefficients used in $f$). Since $\bigcup_{i=1}^{n_0} F_i$ is a finite union of finite fields, it is itself finite, and we shall define it as $\hat{F} := \bigcup_{i=1}^{n_0} F_i$. Because the $F_i$ are nested as in Part One, $\hat{F}$ is simply the field $F_{n_0}$.

Now consider that for this same $f$, we may write $f : \hat{F}^k \xrightarrow{1-1} \hat{F}^k$, since the coefficients of $f$ lie in $\hat{F}$ and $\hat{F}$ is closed under the field operations. Here $f$ is a one-to-one function between two finite sets of the same cardinality. Hence by the pigeonhole principle, $f$ must also be onto $\hat{F}^k$ [2].

Note that the finite field $\hat{F} = \bigcup_{i=1}^{n_0} F_i$ satisfies $\hat{F} \subseteq F$. Suppose that there exists a function $g : F^k \xrightarrow{1-1} F^k$, polynomial in each coordinate, that is not onto. Then there exists an element $y \in F^k$, $y = (y_1,\ldots,y_k)$ where $y_i \in F$ for $1 \le i \le k$, such that no element $x \in F^k$ maps to $y$. That is, there is no $x = (x_1,\ldots,x_k)$, $x_i \in F$ for $1 \le i \le k$, such that

g(x1,...,xk) = (p1(x1,...,xk),..., pk(x1,...,xk)) = y

where y = (y1,...,yk), that is, such that

p1(x1,...,xk) = y1,

p2(x1,...,xk) = y2,

...

pk(x1,...,xk) = yk.

Given that $F = \bigcup_{n \in \mathbb{N}} F_n$, we must have $y_1,\ldots,y_k \in F_{m_0}$ for some $m_0 \in \mathbb{N}$, where $F_{m_0}$ is a finite field. Therefore we may consider $g : B^k \to B^k$ where

$$B = \left( \bigcup_{i=1}^{n_0} F_i \right) \cup \left( \bigcup_{i=1}^{m_0} F_i \right) = \bigcup_{i=1}^{\max\{n_0, m_0\}} F_i$$

and $\bigcup_{i=1}^{n_0} F_i$ is the set that contains all of the coefficients used in $g$ (constructed as above for $f$). Since $g$ is one-to-one on the finite set $B^k$, by the pigeonhole principle this map must be onto, and so there exists an element $x \in B^k$ such that $g(x) = y$. However, since $B \subseteq F$ by its construction,

$$x \in B^k \Rightarrow x \in F^k,$$

contradicting our assumption that no such $x$ can exist in $F^k$.

Therefore every $f : F^k \xrightarrow{1-1} F^k$, where $k \in \mathbb{N}$, such that $f$ is polynomial in each coordinate must be onto.
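The finite step of this argument can be checked by brute force. The Python sketch below uses the single finite field $\mathbb{Z}_5$ with $k = 2$ and three example maps; the function name `check` and the particular maps are illustrative assumptions, not taken from the thesis. It computes injectivity and surjectivity separately so that their agreement on a finite set is visible.

```python
from itertools import product


def check(f, p, k):
    """Return (injective, surjective) for f on the finite set (Z_p)^k."""
    points = list(product(range(p), repeat=k))
    images = [f(*x) for x in points]
    injective = len(set(images)) == len(images)   # no two points collide
    surjective = set(images) == set(points)       # every point is hit
    return injective, surjective


p = 5
maps = {
    # (x, y) -> (x + y^2, y): invertible, since x = u - v^2 recovers the input
    "shear": lambda x, y: ((x + y * y) % p, y % p),
    # (x, y) -> (x^3, y): cubing permutes Z_5 because gcd(3, 5 - 1) = 1
    "cube": lambda x, y: (pow(x, 3, p), y % p),
    # (x, y) -> (x^2, y): squaring identifies x and -x, so it is not one-to-one
    "square": lambda x, y: ((x * x) % p, y % p),
}
results = {name: check(f, p, 2) for name, f in maps.items()}
print(results)
```

In every case the two flags agree, which is exactly the pigeonhole principle for a map from a finite set to itself; the content of Part Two is that this passes to the infinite union $F$.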

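PART THREE. Consider the ultraproduct $\prod_D \overline{\mathbb{Z}_p}$, where $D$ is a non-principal ultrafilter over the index set $P$, the set of primes. As shown in Part One, each $\overline{\mathbb{Z}_p}$ in our product can be written as a union of finite fields and hence, by Part Two, any polynomial map $f : (\overline{\mathbb{Z}_p})^k \xrightarrow{1-1} (\overline{\mathbb{Z}_p})^k$ must be onto, for all $p \in P$. Because the notions of a function being one-to-one and onto are expressible in first order logic, we may write the following sentence:

$$\begin{aligned} \Psi := {} & \Big[ \forall x_1 \ldots \forall x_k \, \forall x'_1 \ldots \forall x'_k \, \big( f(x_1,\ldots,x_k) = f(x'_1,\ldots,x'_k) \to \langle x_1,\ldots,x_k \rangle = \langle x'_1,\ldots,x'_k \rangle \big) \Big] \\ & \to \forall y_1 \ldots \forall y_k \, \exists x_1 \ldots \exists x_k \, \big( f(x_1,\ldots,x_k) = (y_1,\ldots,y_k) \big) \end{aligned}$$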

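Here $f$ is best read as a $k$-tuple of polynomials of some fixed degree bound whose coefficients are universally quantified, so that $\Psi$ is a genuine sentence in the language of fields. As this sentence holds in $\overline{\mathbb{Z}_p}$ for all $p \in P$, by Łoś’s Theorem it holds as interpreted in the ultraproduct as well, i.e.,

$$\{p \in P \mid \overline{\mathbb{Z}_p} \models \Psi\} = P \in D$$

and therefore

$$\prod_D \overline{\mathbb{Z}_p} \models \Psi.$$

PART FOUR. By a theorem of Steinitz [7], any two uncountable algebraically closed fields of the same characteristic and the same cardinality are isomorphic. Thus, to show that $\mathbb{C}$ and $\prod_D \overline{\mathbb{Z}_p}$ are isomorphic, it is sufficient to show that (a) both are algebraically closed fields, (b) both have the same characteristic, and (c) both have the same (uncountable) cardinality.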

(a) Closure: C is algebraically closed by the Fundamental Theorem of Algebra, that is, C contains a root for every non-constant polynomial in C[x]. Consider the fact that the notion of algebraic closure can be written as a family of sentences of first order logic:

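$$\sigma_1 := \forall a_0 \forall a_1 \, \exists x \, (a_1 \neq 0 \to a_1 x + a_0 = 0),$$

$$\sigma_2 := \forall a_0 \forall a_1 \forall a_2 \, \exists x \, (a_2 \neq 0 \to a_2 x^2 + a_1 x + a_0 = 0),$$

$$\vdots$$

$$\sigma_n := \forall a_0 \forall a_1 \ldots \forall a_n \, \exists x \, (a_n \neq 0 \to a_n x^n + \ldots + a_1 x + a_0 = 0),$$

$$\sigma_{n+1} := \forall a_0 \forall a_1 \ldots \forall a_{n+1} \, \exists x \, (a_{n+1} \neq 0 \to a_{n+1} x^{n+1} + \ldots + a_1 x + a_0 = 0),$$

$$\vdots$$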

That is, for each n ∈ N, there is a single first order language sentence that expresses that all polynomials of degree n have a solution. We write Σ = {σn : n ∈ N} to express algebraic closure.

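Therefore, since $\overline{\mathbb{Z}_p}$ is algebraically closed for all $p \in P$, we have

$$\{p \in P \mid \overline{\mathbb{Z}_p} \models \Sigma\} \in D$$

and thus by Łoś’s Theorem (applied to each $\sigma_n \in \Sigma$), we have

$$\prod_D \overline{\mathbb{Z}_p} \models \Sigma$$

and $\prod_D \overline{\mathbb{Z}_p}$ is also algebraically closed.

(b) Characteristic: As $\mathbb{C}$ has characteristic 0, we claim that the characteristic of $\prod_D \overline{\mathbb{Z}_p}$ is also 0.

Suppose, by way of contradiction, that $\mathrm{char}(\prod_D \overline{\mathbb{Z}_p}) = k$ for some $k \in \mathbb{N}^+$. Let $p_0$ be the first prime such that $p_0 > k$. Then for all $\overline{\mathbb{Z}_p}$ such that $p > p_0$, $\mathrm{char}(\overline{\mathbb{Z}_p}) = p > p_0$.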

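Given that $p_0$ is a finite number, for all but a finite number of the $\overline{\mathbb{Z}_p}$’s we can guarantee that the characteristic is not $k$. Since a non-principal ultrafilter contains every cofinite set,

$$\{p \in P \mid \mathrm{char}(\overline{\mathbb{Z}_p}) \neq k\} \in D.$$

Therefore, the first order sentence

$$\varphi := \forall x \, \big( \underbrace{x + x + \ldots + x}_{k \text{ times}} = 0 \big)$$

is false in all but a finite number of models in the product $\prod \overline{\mathbb{Z}_p}$. Equivalently, for $\psi = \neg\varphi$, $\psi$ is true in cofinitely many models in the product and we have

$$\{p \in P \mid \overline{\mathbb{Z}_p} \models \psi\} \in D.$$

Thus, by Łoś,

$$\prod_D \overline{\mathbb{Z}_p} \models \psi,$$

contradicting $\mathrm{char}(\prod_D \overline{\mathbb{Z}_p}) = k$. Since $k$ was arbitrary, the characteristic of $\prod_D \overline{\mathbb{Z}_p}$ is equal to no finite number, and is therefore 0.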
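The finiteness at the heart of this argument is easy to see numerically: for a fixed $k$, the sum of $k$ ones vanishes in $\mathbb{Z}_p$ only for the primes $p$ dividing $k$. The Python sketch below checks this for the illustrative value $k = 12$ over the primes up to 200; both numbers and the helper names are assumptions made for the example. Since $\overline{\mathbb{Z}_p}$ has the same characteristic as $\mathbb{Z}_p$, testing in $\mathbb{Z}_p$ suffices.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for m in range(i * i, n + 1, i):
                sieve[m] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def k_times_one_is_zero(k, p):
    """Does 1 + 1 + ... + 1 (k summands) equal 0 in Z_p?"""
    return sum(1 for _ in range(k)) % p == 0


k = 12
primes = primes_up_to(200)
# Primes p for which the k-fold sum of 1 vanishes in Z_p.
exceptional = [p for p in primes if k_times_one_is_zero(k, p)]
print(exceptional)
```

Every exceptional prime is at most $k$, so the set of primes on which the sentence fails to distinguish characteristic $k$ is finite and therefore excluded by any non-principal ultrafilter.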

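(c) Cardinality: We know that $|\mathbb{C}| = 2^{\aleph_0}$; we claim that $\prod_D \overline{\mathbb{Z}_p}$ also has cardinality $2^{\aleph_0}$ by showing that

$$2^{\aleph_0} \le \Big| \prod_D \overline{\mathbb{Z}_p} \Big| \le 2^{\aleph_0}.$$

First we show $|\prod_D \overline{\mathbb{Z}_p}| \le 2^{\aleph_0}$. In Part One, we saw that $\overline{\mathbb{Z}_p}$ is built as a union of finite fields for each prime $p$, the end result of which is a countably infinite structure.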

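Thus each structure $\overline{\mathbb{Z}_p}$ is equipollent to $\mathbb{N}$ (that is, there exists a bijection between these two structures). For each $p \in P$, let $\theta_p$ be a bijection from $\overline{\mathbb{Z}_p}$ to $\mathbb{N}$. We claim then that our ultraproduct $\prod_D \overline{\mathbb{Z}_p}$ is equipollent to the ultrapower $\prod_D \mathbb{N}$. Let $\Theta : \prod_D \overline{\mathbb{Z}_p} \to \prod_D \mathbb{N}$ be defined as $\Theta(f^D) = g^D$, where $g(p) = \theta_p(f(p))$.

Let $f_1^D, f_2^D \in \prod_D \overline{\mathbb{Z}_p}$ and suppose $\Theta(f_1^D) = \Theta(f_2^D)$. Then $g_1^D = g_2^D$ and equivalently,

$$\{p \in P : g_1(p) = g_2(p)\} \in D$$

$$\iff \{p \in P : \theta_p(f_1(p)) = \theta_p(f_2(p))\} \in D,$$

and since $\theta_p$ is a bijection for each $p \in P$, we have

$$\{p \in P : f_1(p) = f_2(p)\} \in D$$

and $f_1^D = f_2^D$. Thus $\Theta$ is one-to-one. (Reading the same equivalences in reverse shows that $\Theta$ is well defined.)

Let $g^D \in \prod_D \mathbb{N}$. We wish to find an $f^D \in \prod_D \overline{\mathbb{Z}_p}$ such that $\Theta(f^D) = g^D$.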

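Given any $p \in P$, consider $g(p) \in \mathbb{N}$. Because $\theta_p$ is a bijection for each $p$, there is a unique $m \in \overline{\mathbb{Z}_p}$ such that $\theta_p(m) = g(p)$. Define $f(p) = m$. Repeating this process for every $p$, we build the element $f \in \prod_{p \in P} \overline{\mathbb{Z}_p}$ such that $\theta_p(f(p)) = g(p)$ for all $p \in P$ (where $P \in D$), so $\Theta(f^D) = g^D$ and $\Theta$ is onto.

Therefore, $\Theta$ is a bijection from $\prod_D \overline{\mathbb{Z}_p}$ to $\prod_D \mathbb{N}$.

Since the sets are equipollent, we have $|\prod_D \overline{\mathbb{Z}_p}| = |\prod_D \mathbb{N}|$.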

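By the Axiom of Choice, we are assured the existence of a function $F : \prod_D \mathbb{N} \to \mathbb{N}^{\mathbb{N}}$ that selects a witness from each equivalence class $f^D \in \prod_D \mathbb{N}$. Since $F$ is injective (equivalence classes are disjoint), we know that $|\prod_D \mathbb{N}| \le |\mathbb{N}^{\mathbb{N}}|$. By elementary set theory,

$$2^{\aleph_0} \le \aleph_0^{\aleph_0} \le \left( 2^{\aleph_0} \right)^{\aleph_0} = 2^{\aleph_0 \cdot \aleph_0} = 2^{\aleph_0},$$

and hence $|\mathbb{N}^{\mathbb{N}}| = \aleph_0^{\aleph_0} = 2^{\aleph_0}$. Therefore, combining this result with the above, we have

$$\Big| \prod_D \overline{\mathbb{Z}_p} \Big| = \Big| \prod_D \mathbb{N} \Big| \le |\mathbb{N}^{\mathbb{N}}| = 2^{\aleph_0}.$$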

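We now wish to show that $2^{\aleph_0} \le |\prod_D \overline{\mathbb{Z}_p}|$. As shown above, $|\prod_D \overline{\mathbb{Z}_p}| = |\prod_D \mathbb{N}|$ since the two are equipollent; we will show that $2^{\aleph_0} \le |\prod_D \mathbb{N}|$.

Claim: There exists a family of functions $\langle f_s : s \in 2^{\mathbb{N}} \rangle$ such that:

(i) $f_s : \mathbb{N} \to \mathbb{N}$, and

(ii) given $s \neq t$ where $s, t \in 2^{\mathbb{N}}$, there exists a $k \in \mathbb{N}$ such that $f_s(i) \neq f_t(i)$ for all $i \ge k$ (i.e., $f_s$ and $f_t$ are eventually different and can agree on only a finite number of indices).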

Let $B$ be the set of all functions $b : n \to \{0,1\}$ for $n \in \mathbb{N}$, i.e., the set of all finite sequences of 0’s and 1’s, where a sequence of length $n$ is a function on $n$. We can visualize this family by creating a binary “tree” where each branch splits at successive levels, doubling the number of branches each time. The first “node” at the base of our tree is the empty sequence $\langle\rangle$, where $n = 0$. The first level has nodes $\{\langle 0 \rangle, \langle 1 \rangle\}$, the second level $\{\langle 0,0 \rangle, \langle 0,1 \rangle, \langle 1,0 \rangle, \langle 1,1 \rangle\}$, the third

$$\{\langle 0,0,0 \rangle, \langle 0,0,1 \rangle, \langle 0,1,0 \rangle, \langle 0,1,1 \rangle, \langle 1,0,0 \rangle, \langle 1,0,1 \rangle, \langle 1,1,0 \rangle, \langle 1,1,1 \rangle\},$$

and so on, on successive levels splitting each node into two, one with an additional 0 appended to the sequence and one with an additional 1. By this process, we create every possible sequence of 0’s and 1’s of length $n$ for each natural $n$. We consider each $n$ as the set $\{0,1,\ldots,n-1\}$ so that, for instance, $\langle 1,0,1 \rangle$ denotes the function $b : 3 \to \{0,1\}$ where $b(0) = 1$, $b(1) = 0$, $b(2) = 1$.

Next, we define the map $a(h) = \sum_{h(i)=1} 2^i$ where $h : n \to \{0,1\}$. For example, $a(\langle 1,0,1 \rangle) = 2^0 + 2^2 = 5$ and $a(\langle 0,0,1,1 \rangle) = 2^2 + 2^3 = 12$. The map $a$ alone is not one-to-one on $B$ (trailing 0’s do not change its value), so in order for each $b \in B$ to receive a unique value, we apply the function $G : B \to B$ where $G$ simply appends a 1 to the end of any sequence $b$. That is, for $b = \langle 1,0,1,0 \rangle$, $G(b) = \langle 1,0,1,0,1 \rangle$. Since each sequence in $G(B)$ ends in 1, its $a$-value determines it uniquely (by the placement of the 1’s in the sequence). As every positive natural number is obtained in this way (by binary representation), $a \circ G$ is a bijection from $B$ onto the positive natural numbers, and in particular a one-to-one map from $B$ into $\mathbb{N}$.

Next, we consider again the binary tree. We now have a function that assigns to each node a natural number (a unique “address”), and therefore each infinite branch $b$ can be rewritten as the sequence of addresses of the nodes it passes through (giving unique “directions” to follow for each branch). Define the function $f(b)$ as this resequencing, e.g., for the branch $b$ that begins $b = \langle 0,1,1,\ldots \rangle$, we find addresses for the nodes by $a(G(\langle\rangle))$, $a(G(\langle 0 \rangle))$, $a(G(\langle 0,1 \rangle))$, $a(G(\langle 0,1,1 \rangle))$, ..., and we obtain $f(b) = f_b$ by $f_b(0) = a(G(\langle\rangle))$, $f_b(1) = a(G(\langle 0 \rangle))$, $f_b(2) = a(G(\langle 0,1 \rangle))$, $f_b(3) = a(G(\langle 0,1,1 \rangle))$, etc. In this way, each $f_b$ is a function from $\mathbb{N}$ to $\mathbb{N}$, and given $s \neq t$ where $s, t \in 2^{\mathbb{N}}$, there exists a $k \in \mathbb{N}$ such that $f_s(i) \neq f_t(i)$ for all $i \ge k$ (since distinct branches split at some level and pass through different nodes from then on), satisfying the two conditions of our claim. Note that the cardinality of the set of infinite branches is the cardinality of the set of functions from $\mathbb{N}$ to $\{0,1\}$, namely $2^{|\mathbb{N}|} = 2^{\aleph_0}$.
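The address map and the resulting branch functions can be tried out directly. In the Python sketch below, finite 0/1 sequences are tuples, `a` and `G` follow the definitions in the text, and `address` and `f_prefix` are hypothetical helper names introduced for the example; infinite branches are represented only by finite prefixes, so the check of condition (ii) is an illustration rather than a proof.

```python
from itertools import product


def a(h):
    """a(h) = sum of 2**i over the positions i where h(i) == 1."""
    return sum(2 ** i for i, bit in enumerate(h) if bit == 1)


def G(b):
    """Append a 1 to the end of the finite 0/1 sequence b."""
    return tuple(b) + (1,)


def address(b):
    """The address a(G(b)) of the tree node b."""
    return a(G(b))


def f_prefix(branch, n):
    """First n values of f_b: addresses of the first n nodes along the branch."""
    return [address(branch[:i]) for i in range(n)]


# All nodes of length 0, 1, 2, 3: 1 + 2 + 4 + 8 = 15 finite sequences.
nodes = [seq for n in range(4) for seq in product((0, 1), repeat=n)]
addresses = sorted(address(b) for b in nodes)

s = (0, 1, 1, 0, 1, 0)  # prefixes of two branches that
t = (0, 1, 0, 0, 1, 0)  # first differ at index 2
fs, ft = f_prefix(s, 7), f_prefix(t, 7)
agree = [i for i in range(7) if fs[i] == ft[i]]

print(a((1, 0, 1)), a((0, 0, 1, 1)), addresses, fs, agree)
```

The 15 nodes receive 15 distinct addresses, and the two sample branches agree only on the indices before they split, after which their address sequences never coincide again.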

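The functions in the family $\langle f_s : s \in 2^{\mathbb{N}} \rangle$ developed above are elements of $\prod_{n \in \mathbb{N}} \mathbb{N}$, and since the second condition of our claim is satisfied, for any non-principal ultrafilter $D$ over $\mathbb{N}$ we have $f_s^D \neq f_t^D$ in $\prod_D \mathbb{N}$ for $s \neq t$: the set on which $f_s$ and $f_t$ agree is finite, and so is not in $D$. (The index set $P$ of primes is countably infinite, so we may identify it with $\mathbb{N}$.) Therefore we have an injective map from the family of infinite sequences of 0’s and 1’s, whose cardinality is $2^{\aleph_0}$, to $\prod_D \mathbb{N}$, and $2^{\aleph_0} \le |\prod_D \mathbb{N}| = |\prod_D \overline{\mathbb{Z}_p}|$.

Therefore, by Steinitz, $\prod_D \overline{\mathbb{Z}_p} \cong \mathbb{C}$. Since isomorphisms preserve the truth of sentences of first order logic, the fact that all one-to-one polynomial maps are onto in the ultraproduct assures us that all one-to-one polynomial maps are onto in $\mathbb{C}$. □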

BIBLIOGRAPHY

[1] Rudin, W. (1995) Injective Polynomial Maps are Automorphisms, American Mathematical Monthly, 102, 6:540-543.

[2] Goldrei, D. (1996) Classical Set Theory: For Guided Independent Study. Boca Raton, Florida: Chapman & Hall/CRC Press.

[3] Chang, C. & Keisler, J. (2012) Model Theory, Third Edition. Mineola, New York: Dover Publications.

[4] Robinson, A. (1966) Non-Standard Analysis. Princeton, New Jersey: Princeton University Press.

[5] Goldblatt, R. (1998) Lectures on the Hyperreals: An Introduction to Nonstandard Analysis. New York, New York: Springer-Verlag Berlin Heidelberg.

[6] Fraleigh, J. B. (2003) A First Course in Abstract Algebra. Boston, Massachusetts: Addison-Wesley.

[7] Steinitz, E. (1910) Algebraische Theorie der Körper. J. Reine Angew. Math., 137:167-309.