
Introduction to

by

Bashir Abdel-Fattah

Abstract

The basic idea of nonstandard analysis is to extend the reals to a field that includes "infinite" and "infinitesimal" elements in order to simplify proofs and concepts by replacing limits and epsilon-delta proofs by expressions involving infinitesimals. This paper discusses the construction and properties of the hyperreals and some basic results in differential calculus before moving on to logic and culminating in Łoś's Theorem. The contents are largely based on the papers (Davis, 2009) and (Rayo, 2015), with lesser contributions from (Fletcher et al., 2017), (Claassens, 2016), (Marker, 2010), (Keef and Guichard, 2015), and (Murnaghan, 2015). Also note that this paper assumes little background knowledge outside of analysis (the first section does begin with a brief high-level overview in terms of abstract algebra, but if the reader is unfamiliar with that material they can safely proceed knowing that everything else they need will be introduced throughout the paper).

Contents

1 Basics of Nonstandard Analysis
  1.1 Constructing Infinitesimals
    1.1.1 Definitions and Properties of Filters
    1.1.2 Ultrafilter Construction of the Hyperreals
    1.1.3 Properties of the Hyperreals
    1.1.4 Enriching Sets and Functions to the Hyperreal Setting
  1.2 Basic Analysis using the Hyperreals

2 The Transfer Principle
  2.1 Formal Language and First-Order Logic
  2.2 Basics of Transfer
    2.2.1 and Ultrapowers
    2.2.2 Łoś's Theorem

References

1 Basics of Nonstandard Analysis

1.1 Constructing Infinitesimals

Although there are several different schemes of varying degrees of sophistication and utility for rigorously introducing infinite and infinitesimal elements to analysis, in this paper we will focus on the ultrafilter construction developed by Robinson in the 1960s. Robinson's method, being both one of the oldest nonstandard schemes and relatively straightforward, is well developed in the mathematical literature. The basic idea of the method is as follows: for any set $X$ and ring $R$, the set $R^X$ of all functions $f : X \to R$ is a ring under pointwise addition and multiplication

$$(f + g)(x) = f(x) + g(x), \qquad (f \cdot g)(x) = f(x) \cdot g(x)$$

However, even if $R$ is an integral domain or a field, $R^X$ won't be an integral domain in general due to the presence of zero divisors. For example, let $X$ be any set and let $R$ be any ring. Then, for any $S \subseteq X$, let $\chi_S$ denote the characteristic function

$$\chi_S(x) = \begin{cases} 1_R & \text{if } x \in S \\ 0_R & \text{if } x \notin S \end{cases}$$

Then, letting $A$ and $B$ be two disjoint nonempty subsets of $X$, neither $\chi_A$ nor $\chi_B$ is identically zero, yet $(\chi_A \cdot \chi_B)(x) = \chi_A(x) \cdot \chi_B(x) = \chi_{A \cap B}(x) = \chi_{\emptyset}(x)$ is identically zero. In the case of the hyperreals ${}^*\mathbb{R}$, our aim is to quotient the ring $\mathbb{R}^{\mathbb{N}}$ of real-valued sequences by a maximal ideal $\mathrm{MAX}$ so that $\mathbb{R}^{\mathbb{N}}/\mathrm{MAX}$ is a field. In a heuristic sense, this amounts to finding an equivalence relation $=_{\mathrm{MAX}}$ that declares two sequences $\langle a_n \rangle$ and $\langle b_n \rangle$ equal if the set $\{n \in \mathbb{N} : a_n = b_n\}$ is "large enough", so that if $\langle a_n \rangle \cdot \langle b_n \rangle = \langle a_n b_n \rangle =_{\mathrm{MAX}} \langle 0 \rangle$, then at least one of $\langle a_n \rangle$ and $\langle b_n \rangle$ contained "enough" zeros to be equivalent to zero to begin with. This elimination of zero divisors is accomplished with the notion of filters.
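The zero-divisor phenomenon described above can be checked concretely. The following is a minimal sketch, assuming the small index set X = {0, ..., 5} and R = the integers (both choices, and the helper names `chi` and `pointwise_product`, are illustrative and do not come from the text). Functions X → R are stored as tuples of their values.

```python
# Illustrative sketch: zero divisors in the ring R^X of functions X -> R.
# Assumptions (not from the text): X = {0, ..., 5} and R = the integers.

X = range(6)

def chi(S):
    """Characteristic function of S, stored as the tuple of its values on X."""
    return tuple(1 if x in S else 0 for x in X)

def pointwise_product(f, g):
    """Pointwise product (f . g)(x) = f(x) * g(x)."""
    return tuple(a * b for a, b in zip(f, g))

A = {0, 1}   # two disjoint nonempty subsets of X
B = {4, 5}

chi_A, chi_B = chi(A), chi(B)
product = pointwise_product(chi_A, chi_B)
zero = chi(set())   # the zero function is the characteristic function of the empty set

print(chi_A)     # (1, 1, 0, 0, 0, 0)
print(chi_B)     # (0, 0, 0, 0, 1, 1)
print(product)   # (0, 0, 0, 0, 0, 0)
```

Neither factor is the zero function, yet their product is, which is exactly the obstruction the quotient construction is meant to remove.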

1.1.1 Definitions and Properties of Filters

We begin with a definition of filters in terms of their properties. For many of the properties below, there are several equivalent definitions, many of which are more general or more sophisticated than those stated here, but we shall proceed with the following in the interest of simplicity.

Definition 1.1.1. A filter on a set X is a subset F of the power set P(X) satisfying the following properties:

1. Proper Filter: ∅ ∉ F

2. Finite Intersection Property: If A, B ∈ F, then A ∩ B ∈ F

3. Superset Property: If A ∈ F and B ⊃ A, then B ∈ F

Additionally, F is said to be an ultrafilter if it also satisfies:

4. Maximality: For any A ⊆ X, either A ∈ F or X \ A ∈ F

F is further said to be a free ultrafilter if it satisfies:

5. Freeness: F contains no finite subsets of X

Note that ultrafilters actually obey a stronger version of the maximality property (of which the above definition is a special case), which we prove below in order to familiarize the reader with the style of reasoning associated with filters:

Lemma 1.1.1. Let $U$ be an ultrafilter on a set $X$, and let $\{A_1, A_2, \ldots, A_n\}$ be a finite collection of pairwise disjoint subsets of $X$ such that $\bigcup_{i=1}^{n} A_i = X$. Then there is exactly one set $A_j \in \{A_1, A_2, \ldots, A_n\}$ such that $A_j \in U$.

Proof. First, suppose that $U$ contains none of the sets $A_1, A_2, \ldots, A_n$. Then, by the maximality property of ultrafilters, $(X \setminus A_1), (X \setminus A_2), \ldots, (X \setminus A_n) \in U$, so by the finite intersection property

$$\bigcap_{m=1}^{n} (X \setminus A_m) = X \setminus \Big( \bigcup_{m=1}^{n} A_m \Big) = X \setminus X = \emptyset$$

is an element of $U$. This is a contradiction since $U$ must be a proper filter, so at least one of the sets $A_1, A_2, \ldots, A_n$ is in $U$. Now suppose that $U$ contains more than one of the sets $A_1, A_2, \ldots, A_n$. Let $A_i$ and $A_j$ be two distinct sets such that $A_i, A_j \in U$. Then, by the finite intersection property and the disjointness of the sets, we have

$$A_i \cap A_j = \emptyset \in U$$

This is also a contradiction, so $U$ contains at most one of the sets $A_1, A_2, \ldots, A_n$. Therefore $U$ contains exactly one of the sets $A_1, A_2, \ldots, A_n$. ∎

Examples of filters on a set $X$ include:

1. The trivial filter $F = \{X\}$.

2. The principal filter $F_A$, which is defined in terms of any nonempty set $A \subset X$ as

$$F_A = \{Y \subset X : Y \supset A\}$$

In the case that $A = \{a\}$ contains only a single element, the principal filter $F_a$ is in fact an ultrafilter.

3. The cofinite (Fréchet) filter $F^{co} = \{Y \subset X : X \setminus Y \text{ is finite}\}$ on an infinite set $X$. Fréchet filters are non-principal.
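On a finite set the filter axioms can be verified by brute force, which makes the principal-filter example above easy to experiment with. The sketch below is only illustrative: the set X = {0, 1, 2, 3} and the function names `powerset`, `principal_filter`, `is_filter`, and `is_ultrafilter` are assumptions introduced here, not notation from the text. Free ultrafilters cannot be explored this way, since they only exist on infinite sets.

```python
from itertools import combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    xs = sorted(X)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def principal_filter(X, A):
    """F_A = {Y subset of X : Y contains A}."""
    A = frozenset(A)
    return {Y for Y in powerset(X) if A <= Y}

def is_filter(X, F):
    """Check properties 1-3 of Definition 1.1.1 by exhaustive search."""
    subsets = powerset(X)
    proper = frozenset() not in F
    closed_under_meets = all((P & Q) in F for P in F for Q in F)
    closed_upward = all(Q in F for P in F for Q in subsets if P <= Q)
    return proper and closed_under_meets and closed_upward

def is_ultrafilter(X, F):
    """Check the filter axioms together with maximality (property 4)."""
    X = frozenset(X)
    maximal = all((A in F) or ((X - A) in F) for A in powerset(X))
    return is_filter(X, F) and maximal

X = frozenset(range(4))
F_A = principal_filter(X, {0, 1})   # generated by a two-element set
F_a = principal_filter(X, {0})      # generated by a single point

print(is_filter(X, F_A), is_ultrafilter(X, F_A))   # True False
print(is_filter(X, F_a), is_ultrafilter(X, F_a))   # True True
```

The filter generated by {0, 1} fails maximality because it contains neither {0} nor its complement {1, 2, 3}.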

All of the above examples are relatively simple and uninteresting for our purposes. It turns out that every free ultrafilter is non-principal and in fact cannot be explicitly constructed. One might then doubt whether any collection of sets that obeys all five of the given properties even exists, but we can prove the existence of free ultrafilters on any infinite set using Zorn's Lemma. We will proceed to that proof in short order, but first we need the following helpful lemma.

Lemma 1.1.2. Let $F$ be a filter on a set $X$, and let $A \subset X$ be a set such that $A \notin F$ and $(X \setminus A) \notin F$. Then $F$ can be extended to a filter $F'$ on $X$ containing both $F$ and $A$.

Proof. To begin with, assume $F \neq \emptyset$ (if $F = \emptyset$, then we can just take $F' = F_A$, the principal filter generated by the set $A$, and the result follows trivially), and consider the collection

$$F' = \{Y' \subset X : \text{there exists some } Y \in F \text{ such that } Y' \supset (Y \cap A)\}$$

We will now verify that $F'$ satisfies each of the properties of a filter in order.

1. Proper Filter: First, note that $X \in F$ by the superset property (given any $Y \in F$, we have $X \supset Y$, so $X \in F$). This means that $A \neq \emptyset$, because in the case that $A = \emptyset$ we have $(X \setminus A) = X \in F$, which contradicts the hypotheses of the lemma. Furthermore, for any $Y \in F$ we have $Y \cap A \neq \emptyset$, because if $Y \cap A = \emptyset$ then $(X \setminus A) \supset Y$, which implies that $(X \setminus A) \in F$ by the superset property of $F$, a contradiction. Therefore, given any $Y' \in F'$, there is some $Y \in F$ such that $Y' \supset (Y \cap A) \neq \emptyset$, thus $Y' \neq \emptyset$.

2. Finite Intersection Property: Suppose $Y_1', Y_2' \in F'$. Then there exist some sets $Y_1, Y_2 \in F$ such that $Y_1' \supset (Y_1 \cap A)$ and $Y_2' \supset (Y_2 \cap A)$. Since $Y_1 \cap Y_2 \in F$ by the finite intersection property of $F$, the set $Y_1' \cap Y_2' \supset (Y_1 \cap Y_2) \cap A$ is an element of $F'$ by definition.

3. Superset Property: This follows fairly trivially. Suppose $Y_1' \in F'$ and that $Y_2' \supset Y_1'$. There exists some $Y_1 \in F$ such that $Y_1' \supset (Y_1 \cap A)$, in which case $Y_2' \supset (Y_1 \cap A)$ as well, so $Y_2' \in F'$ by definition.

Therefore $F'$ is a filter, and we can readily verify that $A \in F'$ and $F \subset F'$: for any $Y \in F$ we have that $Y, A \supset (Y \cap A)$, hence $Y, A \in F'$ by definition. ∎

Note that the above lemma also justifies the choice of terminology for the maximality property: any filter that does not obey the maximality property of ultrafilters is not maximal, in the sense that there is a strictly larger filter containing it. We will prove in the following theorem that a maximal filter (in the sense that there is no strictly larger filter containing it) obeys the maximality property of ultrafilters, and thus the two notions are equivalent.
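The construction in the proof of Lemma 1.1.2 can be run directly on a small finite set. The following sketch assumes X = {0, 1, 2, 3}, takes F to be the principal filter generated by {0, 1}, and adjoins A = {1, 2}; these choices and the helper names (`powerset`, `is_filter`, `extend_filter`) are hypothetical and serve only to illustrate the definition of the extended filter.

```python
from itertools import combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    xs = sorted(X)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def is_filter(X, F):
    """Exhaustively check the three filter properties of Definition 1.1.1."""
    subsets = powerset(X)
    proper = frozenset() not in F
    closed_under_meets = all((P & Q) in F for P in F for Q in F)
    closed_upward = all(Q in F for P in F for Q in subsets if P <= Q)
    return proper and closed_under_meets and closed_upward

def extend_filter(X, F, A):
    """F' = {Y' : Y' contains (Y & A) for some Y in F}, as in the proof above."""
    A = frozenset(A)
    return {Yp for Yp in powerset(X) if any((Y & A) <= Yp for Y in F)}

X = frozenset(range(4))
F = {Y for Y in powerset(X) if frozenset({0, 1}) <= Y}   # principal filter of {0, 1}
A = frozenset({1, 2})

# Hypotheses of the lemma: neither A nor its complement belongs to F.
print(A in F, (X - A) in F)   # False False

F_ext = extend_filter(X, F, A)
print(is_filter(X, F_ext))    # True
print(F <= F_ext, A in F_ext) # True True
```

In this instance the extended filter is the principal filter generated by {0, 1} ∩ {1, 2} = {1}, which happens to be an ultrafilter already.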

Theorem 1.1.3 (Ultrafilter Lemma). Let F be a filter on the set X. Then F can be extended to an ultrafilter U on X.

Proof. In this proof, we will first show that there is a maximal filter $U$ on $X$ containing $F$, then demonstrate that $U$ is an ultrafilter. The first step will be accomplished using Zorn's Lemma, which is equivalent to the Axiom of Choice:

Let $S$ be a family of sets. If for each chain $C \subset S$ there exists a member of $S$ that contains all members of $C$, then $S$ contains a maximal member.

Recall that a chain is a collection of sets $C$ such that for any pair of distinct sets $A, B \in C$, either $A \subset B$ or $B \subset A$ (thus a chain can be written $\ldots \subset A \subset B \subset C \subset \ldots$, justifying the terminology). Then let $\Phi$ be the set of filters on $X$ containing $F$, and let $C$ be any chain of filters $F_0 = F \subset F_1 \subset F_2 \subset \ldots$ in $\Phi$. Then we claim that

$$G = \bigcup_{n=0}^{\infty} F_n$$

is a filter in $\Phi$ that contains every member of $C$. We will now verify the properties of filters in order:

1. Proper Filter: This follows trivially, given that $\emptyset \notin F_n$ for all $n \geq 0$ (since each filter $F_n$ is proper), and hence $\emptyset \notin G$.

2. Finite Intersection Property: Given any $Y_1, Y_2 \in G$, there must exist some filters $F_{n_1}, F_{n_2} \in C$ such that $Y_1 \in F_{n_1}$ and $Y_2 \in F_{n_2}$. Suppose without loss of generality that $n_2 \leq n_1$. Then $F_{n_2} \subset F_{n_1}$, so both $Y_1$ and $Y_2$ are elements of $F_{n_1}$. By the finite intersection property of $F_{n_1}$, we have that $(Y_1 \cap Y_2) \in F_{n_1} \subset G$.

3. Superset Property: Suppose that $Y_1 \in G$ and that $Y_2 \supset Y_1$. Since $Y_1 \in F_n$ for some $F_n \in C$, we have $Y_2 \in F_n \subset G$ by the superset property of $F_n$.

Thus $G$ is indeed a filter, and $G$ contains both $F$ and every element of $C$, so $G$ is an element of $\Phi$ that contains every element of $C$. Since such a filter can be constructed for every chain in $\Phi$, by Zorn's Lemma $\Phi$ must contain a maximal member, which we denote $U$. Since $U$ is an element of $\Phi$, $U$ is a filter containing $F$, and it is also clear that $U$ satisfies the maximality property: for any set $A \subset X$, if $A \notin U$ and $(X \setminus A) \notin U$ then $U$ can be extended to a filter $U' \supset (U \cup \{A\})$ by Lemma 1.1.2, contradicting the maximality of $U$. Therefore $U$ must contain either $A$ or $(X \setminus A)$ for any $A \subset X$ (it cannot contain both because then by the finite intersection property $U$ would also contain the empty set, contradicting the fact that $U$ is a proper filter), thus $U$ is an ultrafilter on $X$. ∎

Given that any filter can be extended to an ultrafilter, we can also demonstrate the existence of a free ultrafilter $U$ on an infinite set $X$ by extending the cofinite filter $F^{co}$ on $X$. For any finite set $A \subset X$, $F^{co}$ contains $(X \setminus A)$ because its complement $A$ is finite, and hence $(X \setminus A) \in U$ because $U \supset F^{co}$. In particular, $U$ does not contain $A$ (as before, it cannot contain both $A$ and $(X \setminus A)$ because then $U$ would also contain the empty set, a contradiction), so $U$ cannot contain any finite sets.

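1.1.2 Ultrafilter Construction of the Hyperreals

Definition 1.1.2 (Equivalence Modulo an Ultrafilter). Given an ultrafilter $U$ on $\mathbb{N}$, define an equivalence relation $\equiv_U$ on $\mathbb{R}^{\mathbb{N}}$ as follows: given any real-valued sequences $\langle a_n \rangle, \langle b_n \rangle \in \mathbb{R}^{\mathbb{N}}$,

$$\langle a_n \rangle \equiv_U \langle b_n \rangle \text{ if and only if } \{n \in \mathbb{N} : a_n = b_n\} \in U$$

Proof. Recall that a relation $\sim$ on a set $X$ is an equivalence relation if it satisfies the following three properties:

1. Reflexivity: $(\forall x \in X) : x \sim x$

2. Symmetry: $(\forall x, y \in X) : x \sim y \implies y \sim x$

3. Transitivity: $(\forall x, y, z \in X) : (x \sim y \text{ and } y \sim z) \implies x \sim z$

One can readily verify the reflexivity, symmetry, and transitivity of the relation $\equiv_U$ on $\mathbb{R}^{\mathbb{N}}$. Given any sequence $\langle a_n \rangle \in \mathbb{R}^{\mathbb{N}}$, the set $\{n \in \mathbb{N} : a_n = a_n\} = \mathbb{N}$ is an element of the ultrafilter $U$ by the maximality principle (either $\mathbb{N}$ or $\mathbb{N} \setminus \mathbb{N} = \emptyset$ is an element of $U$, but $\emptyset \notin U$ because $U$ is a proper filter), so $\langle a_n \rangle \equiv_U \langle a_n \rangle$. For symmetry, if $\langle a_n \rangle \equiv_U \langle b_n \rangle$, then $\{n \in \mathbb{N} : a_n = b_n\} = \{n \in \mathbb{N} : b_n = a_n\} \in U$, so $\langle b_n \rangle \equiv_U \langle a_n \rangle$ trivially. Finally, suppose $\langle a_n \rangle \equiv_U \langle b_n \rangle$ and $\langle b_n \rangle \equiv_U \langle c_n \rangle$. Then $\{n \in \mathbb{N} : a_n = b_n\}$ and $\{n \in \mathbb{N} : b_n = c_n\}$ are both elements of the ultrafilter $U$, so $\{n \in \mathbb{N} : a_n = b_n \text{ and } b_n = c_n\} = \{n \in \mathbb{N} : a_n = b_n\} \cap \{n \in \mathbb{N} : b_n = c_n\} \in U$ by the finite intersection property. Since $\{n \in \mathbb{N} : a_n = c_n\} \supset \{n \in \mathbb{N} : a_n = b_n \text{ and } b_n = c_n\}$, we have $\{n \in \mathbb{N} : a_n = c_n\} \in U$ by the superset property, so $\langle a_n \rangle \equiv_U \langle c_n \rangle$. Therefore the given relation is indeed a valid equivalence relation. ∎

Recall that the equivalence class of an element $a$ of a set $X$ equipped with an equivalence relation $\sim$ is the set of all elements $b \in X$ such that $b \sim a$. Going forwards we will use $[\langle a_n \rangle]$ to denote the equivalence class of the sequence $\langle a_n \rangle \in \mathbb{R}^{\mathbb{N}}$ under ultrafilter equivalence, where $[\langle a_n \rangle] = [\langle b_n \rangle]$ if and only if $\langle a_n \rangle \equiv_U \langle b_n \rangle$ (i.e., if and only if $\{n \in \mathbb{N} : a_n = b_n\} \in U$). We will also denote the set of all such equivalence classes by $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$.

Lemma 1.1.4. The operations of pointwise addition and multiplication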

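$$[\langle a_n \rangle] + [\langle b_n \rangle] = [\langle a_n + b_n \rangle]$$

$$[\langle a_n \rangle] * [\langle b_n \rangle] = [\langle a_n b_n \rangle]$$

are well-defined binary operations on $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$.

Proof. Let $\langle a_n \rangle, \langle \alpha_n \rangle, \langle b_n \rangle, \langle \beta_n \rangle \in \mathbb{R}^{\mathbb{N}}$ be real-valued sequences such that $[\langle a_n \rangle] = [\langle \alpha_n \rangle]$ and $[\langle b_n \rangle] = [\langle \beta_n \rangle]$. Then $\{n \in \mathbb{N} : a_n = \alpha_n\}$ and $\{n \in \mathbb{N} : b_n = \beta_n\}$ are both elements of $U$, so $\{n \in \mathbb{N} : a_n = \alpha_n \text{ and } b_n = \beta_n\} = \{n \in \mathbb{N} : a_n = \alpha_n\} \cap \{n \in \mathbb{N} : b_n = \beta_n\} \in U$ by the finite intersection property. The sets $\{n \in \mathbb{N} : a_n + b_n = \alpha_n + \beta_n\}$ and $\{n \in \mathbb{N} : a_n b_n = \alpha_n \beta_n\}$ are both supersets of $\{n \in \mathbb{N} : a_n = \alpha_n \text{ and } b_n = \beta_n\}$, so both are also elements of $U$ by the superset property. Therefore

$$[\langle a_n \rangle] + [\langle b_n \rangle] = [\langle a_n + b_n \rangle] = [\langle \alpha_n + \beta_n \rangle] = [\langle \alpha_n \rangle] + [\langle \beta_n \rangle]$$

$$[\langle a_n \rangle] * [\langle b_n \rangle] = [\langle a_n b_n \rangle] = [\langle \alpha_n \beta_n \rangle] = [\langle \alpha_n \rangle] * [\langle \beta_n \rangle]$$

Thus the results of addition and multiplication as defined are independent of which representatives are chosen for each equivalence class, and hence the given operations are well defined. ∎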

Theorem 1.1.5. Given an ultrafilter $U$ on $\mathbb{N}$, the set $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$ is a field under pointwise addition and multiplication.

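Proof. Perhaps the most concrete definition of a field is the following: a field is a set $F$ equipped with binary operations $+$ and $\cdot$ such that for all $x, y, z \in F$

1. $x + y = y + x$ (commutativity of addition)
2. $(x + y) + z = x + (y + z)$ (associativity of addition)
3. There exists some element $0 \in F$ such that $x + 0 = x$ for all $x \in F$ (existence of an additive identity)
4. For each $x \in F$ there is some element $-x \in F$ such that $x + (-x) = 0$ (existence of additive inverses)
5. $x \cdot y = y \cdot x$ (commutativity of multiplication)
6. $(x \cdot y) \cdot z = x \cdot (y \cdot z)$ (associativity of multiplication)
7. There exists some element $1 \in F$ such that $x \cdot 1 = x$ for all $x \in F$ (existence of a multiplicative identity)
8. For each nonzero element $x \in F$ there exists some element $x^{-1} \in F$ such that $x \cdot x^{-1} = 1$ (existence of multiplicative inverses)
9. $(x + y) \cdot z = x \cdot z + y \cdot z$ (distributivity)

Although most of the above properties are evident in the case of $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$ (for instance, commutativity and associativity of addition and multiplication are inherited directly from the corresponding properties of the field $\mathbb{R}$), we will nonetheless show each property in order for definiteness.

1. $[\langle a_n \rangle] + [\langle b_n \rangle] = [\langle a_n + b_n \rangle] = [\langle b_n + a_n \rangle] = [\langle b_n \rangle] + [\langle a_n \rangle]$

2. $$\begin{aligned} \big([\langle a_n \rangle] + [\langle b_n \rangle]\big) + [\langle c_n \rangle] &= [\langle a_n + b_n \rangle] + [\langle c_n \rangle] = [\langle (a_n + b_n) + c_n \rangle] \\ &= [\langle a_n + (b_n + c_n) \rangle] = [\langle a_n \rangle] + [\langle b_n + c_n \rangle] \\ &= [\langle a_n \rangle] + \big([\langle b_n \rangle] + [\langle c_n \rangle]\big) \end{aligned}$$

3. Consider the element $[\langle 0 \rangle]$, the equivalence class of the sequence that is identically zero. Then for any $[\langle a_n \rangle] \in \mathbb{R}^{\mathbb{N}}/{\equiv_U}$,

$$[\langle a_n \rangle] + [\langle 0 \rangle] = [\langle a_n + 0 \rangle] = [\langle a_n \rangle]$$

Thus $[\langle 0 \rangle]$ is the additive identity in $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$.

4. For each $[\langle a_n \rangle] \in \mathbb{R}^{\mathbb{N}}/{\equiv_U}$, define $-[\langle a_n \rangle] = [\langle -a_n \rangle]$. Then

$$[\langle a_n \rangle] + \big(-[\langle a_n \rangle]\big) = [\langle a_n \rangle] + [\langle -a_n \rangle] = [\langle a_n - a_n \rangle] = [\langle 0 \rangle]$$

5. $[\langle a_n \rangle] * [\langle b_n \rangle] = [\langle a_n b_n \rangle] = [\langle b_n a_n \rangle] = [\langle b_n \rangle] * [\langle a_n \rangle]$

6. $$\begin{aligned} \big([\langle a_n \rangle] * [\langle b_n \rangle]\big) * [\langle c_n \rangle] &= [\langle a_n b_n \rangle] * [\langle c_n \rangle] = [\langle (a_n b_n) c_n \rangle] \\ &= [\langle a_n (b_n c_n) \rangle] = [\langle a_n \rangle] * [\langle b_n c_n \rangle] \\ &= [\langle a_n \rangle] * \big([\langle b_n \rangle] * [\langle c_n \rangle]\big) \end{aligned}$$

7. Consider the element $[\langle 1 \rangle]$, the equivalence class of the sequence of all ones. Then for any $[\langle a_n \rangle] \in \mathbb{R}^{\mathbb{N}}/{\equiv_U}$,

$$[\langle a_n \rangle] * [\langle 1 \rangle] = [\langle a_n \cdot 1 \rangle] = [\langle a_n \rangle]$$

Thus $[\langle 1 \rangle]$ is the multiplicative identity in $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$.

8. For any equivalence class $[\langle a_n \rangle] \neq [\langle 0 \rangle]$, since we would have $[\langle a_n \rangle] = [\langle 0 \rangle]$ if $\{n \in \mathbb{N} : a_n = 0\} \in U$, it must be true that $\{n \in \mathbb{N} : a_n = 0\} \notin U$. By the maximality property, this means that $\{n \in \mathbb{N} : a_n \neq 0\} = \mathbb{N} \setminus \{n \in \mathbb{N} : a_n = 0\} \in U$. Then consider, for instance, the real-valued sequence defined by

$$\alpha_n = \begin{cases} a_n & \text{if } a_n \neq 0 \\ 1 & \text{if } a_n = 0 \end{cases}$$

Since $\{n \in \mathbb{N} : \alpha_n = a_n\} = \{n \in \mathbb{N} : a_n \neq 0\} \in U$, we have $[\langle \alpha_n \rangle] = [\langle a_n \rangle]$. Because $\alpha_n \neq 0$ for all $n \in \mathbb{N}$, we can define the inverse

$$[\langle \alpha_n \rangle]^{-1} = [\langle \alpha_n^{-1} \rangle]$$

with the property

$$[\langle a_n \rangle] * [\langle \alpha_n \rangle]^{-1} = [\langle \alpha_n \rangle] * [\langle \alpha_n \rangle]^{-1} = [\langle \alpha_n \rangle] * [\langle \alpha_n^{-1} \rangle] = [\langle \alpha_n \alpha_n^{-1} \rangle] = [\langle 1 \rangle]$$

9. $$\begin{aligned} \big([\langle a_n \rangle] + [\langle b_n \rangle]\big) * [\langle c_n \rangle] &= [\langle a_n + b_n \rangle] * [\langle c_n \rangle] = [\langle (a_n + b_n) c_n \rangle] \\ &= [\langle a_n c_n + b_n c_n \rangle] = [\langle a_n c_n \rangle] + [\langle b_n c_n \rangle] \\ &= [\langle a_n \rangle] * [\langle c_n \rangle] + [\langle b_n \rangle] * [\langle c_n \rangle] \end{aligned}$$

∎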



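1.1.3 Properties of the Hyperreals

In order to show that this newly constructed field contains "infinite" and "infinitesimal" elements, we first need some notion of size on $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$. We accomplish this in analogy to the definition of equality on $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$:

Definition 1.1.3 (Inequality Modulo an Ultrafilter). Given an ultrafilter $U$ on $\mathbb{N}$, define the relation $\leq_U$ on $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$ as follows: given any two elements $[\langle a_n \rangle], [\langle b_n \rangle] \in \mathbb{R}^{\mathbb{N}}/{\equiv_U}$,

$$[\langle a_n \rangle] \leq_U [\langle b_n \rangle] \text{ if and only if } \{n \in \mathbb{N} : a_n \leq b_n\} \in U$$

Then $\leq_U$ is a total ordering on $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$.

Proof. Recall that a relation $\preceq$ on a set $X$ is a (partial) ordering of $X$ if it is

1. Reflexive: $(\forall x \in X) : x \preceq x$

2. Anti-symmetric: $(\forall x, y \in X) : (x \preceq y \text{ and } y \preceq x) \implies x = y$

3. Transitive: $(\forall x, y, z \in X) : (x \preceq y \text{ and } y \preceq z) \implies x \preceq z$

Additionally, $\preceq$ is a total ordering on $X$ if it has the additional property that for all $x, y \in X$, either $x \preceq y$ or $y \preceq x$.

The proof that $\leq_U$ is a partial ordering proceeds analogously to the case of equality modulo an ultrafilter. Given any element $[\langle a_n \rangle] \in \mathbb{R}^{\mathbb{N}}/{\equiv_U}$, the set $\{n \in \mathbb{N} : a_n \leq a_n\} = \mathbb{N}$ is an element of the ultrafilter $U$ by the maximality principle (either $\mathbb{N}$ or $\mathbb{N} \setminus \mathbb{N} = \emptyset$ is an element of $U$, but $\emptyset \notin U$ because $U$ is a proper filter), so $[\langle a_n \rangle] \leq_U [\langle a_n \rangle]$. For transitivity, suppose $[\langle a_n \rangle] \leq_U [\langle b_n \rangle]$ and $[\langle b_n \rangle] \leq_U [\langle c_n \rangle]$. Then $\{n \in \mathbb{N} : a_n \leq b_n\}$ and $\{n \in \mathbb{N} : b_n \leq c_n\}$ are both elements of the ultrafilter $U$, so $\{n \in \mathbb{N} : a_n \leq b_n \text{ and } b_n \leq c_n\} = \{n \in \mathbb{N} : a_n \leq b_n\} \cap \{n \in \mathbb{N} : b_n \leq c_n\} \in U$ by the finite intersection property. Since $\{n \in \mathbb{N} : a_n \leq c_n\} \supset \{n \in \mathbb{N} : a_n \leq b_n \text{ and } b_n \leq c_n\}$, we have $\{n \in \mathbb{N} : a_n \leq c_n\} \in U$ by the superset property, so $[\langle a_n \rangle] \leq_U [\langle c_n \rangle]$. To demonstrate that $\leq_U$ is anti-symmetric, suppose $[\langle a_n \rangle] \leq_U [\langle b_n \rangle]$ and $[\langle b_n \rangle] \leq_U [\langle a_n \rangle]$. Then $\{n \in \mathbb{N} : a_n \leq b_n\}$ and $\{n \in \mathbb{N} : b_n \leq a_n\}$ are both elements of $U$, so $\{n \in \mathbb{N} : a_n = b_n\} = \{n \in \mathbb{N} : a_n \leq b_n \text{ and } b_n \leq a_n\} = \{n \in \mathbb{N} : a_n \leq b_n\} \cap \{n \in \mathbb{N} : b_n \leq a_n\} \in U$ by the finite intersection property. Therefore $[\langle a_n \rangle] = [\langle b_n \rangle]$.

Finally, to show that $\leq_U$ is a total ordering, consider any two elements $[\langle a_n \rangle]$ and $[\langle b_n \rangle]$ in $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$. If $\{n \in \mathbb{N} : a_n \leq b_n\} \in U$, then $[\langle a_n \rangle] \leq_U [\langle b_n \rangle]$ and we are done. If $\{n \in \mathbb{N} : a_n \leq b_n\} \notin U$, then $\{n \in \mathbb{N} : a_n > b_n\} = \mathbb{N} \setminus \{n \in \mathbb{N} : a_n \leq b_n\} \in U$ by the maximality property, and $\{n \in \mathbb{N} : a_n \geq b_n\} \supset \{n \in \mathbb{N} : a_n > b_n\}$ is an element of $U$ by the superset property. Therefore $[\langle b_n \rangle] \leq_U [\langle a_n \rangle]$. ∎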

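Now that we know that $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$ is an ordered field, we will suppress the details of its construction when not necessary going forward. We define ${}^*\mathbb{R} = \mathbb{R}^{\mathbb{N}}/{\equiv_U}$ to be the set of hyperreals, and we will write, for instance,

$$(\exists\, a, b \in {}^*\mathbb{R}) : a \leq b$$

instead of

$$(\exists\, [\langle a_n \rangle], [\langle b_n \rangle] \in \mathbb{R}^{\mathbb{N}}/{\equiv_U}) : [\langle a_n \rangle] \leq_U [\langle b_n \rangle]$$

Note that we now take $U$ to be a free ultrafilter, although none of the properties of ${}^*\mathbb{R}$ that we have demonstrated thus far rely on freeness. We also introduce the notation ${}^{\sigma}\mathbb{R}$ for the set of equivalence classes of constant-valued sequences in $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$, which is an embedding of the standard real numbers $\mathbb{R}$ into the hyperreals ${}^*\mathbb{R}$ that is naturally isomorphic to $\mathbb{R}$:

$$a \in \mathbb{R} \longleftrightarrow [\langle a \rangle] = [\langle a, a, a, \ldots \rangle] \in \mathbb{R}^{\mathbb{N}}/{\equiv_U}$$

We can now proceed to state the definition of infinite and infinitesimal numbers in ${}^*\mathbb{R}$ in terms of the absolute value function

$$|a| = \max(a, -a) = \begin{cases} a & \text{if } a \geq -a \\ -a & \text{if } a \leq -a \end{cases}$$

which is well-defined on ${}^*\mathbb{R}$ (if $a \leq -a$ and $-a \leq a$, then $\{n \in \mathbb{N} : a_n \leq -a_n\} \in U$ and $\{n \in \mathbb{N} : -a_n \leq a_n\} \in U$, so $\{n \in \mathbb{N} : a_n = -a_n\} = \{n \in \mathbb{N} : a_n = 0\} = \{n \in \mathbb{N} : a_n \leq -a_n\} \cap \{n \in \mathbb{N} : -a_n \leq a_n\}$ is an element of $U$ by the finite intersection property, and hence $a = -a = 0$).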

Definition 1.1.4 (Infinite and Infinitesimal Numbers). Let ${}^{\sigma}\mathbb{R}^{+}$ denote the embedding of $\mathbb{R}^{+} = \{r \in \mathbb{R} : r > 0\}$ into ${}^*\mathbb{R}$. Then a hyperreal $a \in {}^*\mathbb{R}$ is finite if there exists some $r \in {}^{\sigma}\mathbb{R}^{+}$ such that $|a| \leq r$, infinitesimal if $|a| \leq r$ for every $r \in {}^{\sigma}\mathbb{R}^{+}$, and infinite if $|a| \geq r$ for every $r \in {}^{\sigma}\mathbb{R}^{+}$. We denote the set of all finite numbers in ${}^*\mathbb{R}$ by $O$, and the set of all infinitesimal numbers of ${}^*\mathbb{R}$ by $\vartheta$ (which is a proper subset of $O$).

Note that the only infinitesimal in the standard reals ${}^{\sigma}\mathbb{R} \cong \mathbb{R}$ is $0$, and ${}^{\sigma}\mathbb{R}$ doesn't contain any infinite elements. In order to demonstrate the existence of nonzero infinitesimals and infinite numbers in the hyperreals, consider the elements $[\langle n \rangle] = [\langle 1, 2, 3, 4, \ldots \rangle]$ and $[\langle \tfrac{1}{n} \rangle] = [\langle 1, \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{4}, \ldots \rangle]$. Then for any $[\langle r \rangle] \in {}^{\sigma}\mathbb{R}^{+}$, there are only finitely many terms such that $n < r$ and finitely many terms such that $\tfrac{1}{n} > r$, so $\{n \in \mathbb{N} : n < r\}$ and $\{n \in \mathbb{N} : \tfrac{1}{n} > r\}$ are not elements of $U$ because $U$ is free (that is, it contains no finite sets). Therefore $\{n \in \mathbb{N} : n \geq r\}$ and $\{n \in \mathbb{N} : \tfrac{1}{n} \leq r\}$ are elements of $U$ by the maximality property of ultrafilters, and hence $[\langle n \rangle] \geq_U [\langle r \rangle]$ and $[\langle \tfrac{1}{n} \rangle] \leq_U [\langle r \rangle]$ for every $[\langle r \rangle] \in {}^{\sigma}\mathbb{R}^{+}$. Thus $[\langle \tfrac{1}{n} \rangle]$ is infinitesimal and $[\langle n \rangle]$ is infinite.

Definition 1.1.5 (Infinitesimal Closeness). Two finite hyperreals $a, b \in O$ are said to be infinitesimally close if $a - b \in \vartheta$, which we denote $a \approx b$. The relation $\approx$ is an equivalence relation.

Proof. The reflexivity of $\approx$ is clear: for all $a \in O$, $a - a = 0 \in \vartheta$. That the relation is symmetric and transitive is also clear from the definition of an infinitesimal, since the negative of an infinitesimal is also infinitesimal and the sum of two infinitesimals is also infinitesimal (these assertions follow trivially from the definition of an infinitesimal). For any $a, b \in O$, if $a - b \in \vartheta$, then $b - a = -(a - b) \in \vartheta$, so $a \approx b \implies b \approx a$. Similarly, for any $a, b, c \in O$, if $a - b \in \vartheta$ and $b - c \in \vartheta$, then $a - c = (a - b) + (b - c) \in \vartheta$, so if $a \approx b$ and $b \approx c$, then $a \approx c$. ∎
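The argument that $[\langle \tfrac{1}{n} \rangle]$ is infinitesimal and $[\langle n \rangle]$ is infinite rests on the fact that, for each fixed positive real r, the "bad" index sets are finite. A minimal sketch of that finiteness check follows, assuming sequences indexed from n = 1 and restricting to rational r so the arithmetic is exact; the function names are hypothetical.

```python
from fractions import Fraction
from math import ceil

def indices_where_reciprocal_exceeds(r):
    """All n >= 1 with 1/n > r, for a positive rational r.

    Since 1/n > r is equivalent to n < 1/r, only indices below ceil(1/r)
    need to be examined, so the returned list is always finite.
    """
    bound = ceil(1 / Fraction(r))
    return [n for n in range(1, bound) if Fraction(1, n) > r]

def indices_below(r):
    """All n >= 1 with n < r, for a positive rational r (again a finite list)."""
    return [n for n in range(1, ceil(Fraction(r))) if n < r]

print(indices_where_reciprocal_exceeds(Fraction(3, 10)))       # [1, 2, 3]
print(len(indices_where_reciprocal_exceeds(Fraction(1, 100)))) # 99
print(indices_below(Fraction(7, 2)))                           # [1, 2, 3]
```

Because each exceptional set is finite, a free ultrafilter excludes it and therefore contains its complement, which is what the comparison with $[\langle r \rangle]$ requires. The code only illustrates the finiteness step; it does not (and cannot) decide membership in a free ultrafilter.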

Definition 1.1.6. Every finite hyperreal $a \in O$ is infinitesimally close to a unique standard real number in ${}^{\sigma}\mathbb{R}$, which we call the standard part of $a$ and denote $\mathrm{st}(a)$.

Proof. The existence of such a standard real number is perhaps intuitive, but the proof requires results from abstract algebra and hence is beyond the scope of this paper. Instead, we shall demonstrate the uniqueness of such a number. Suppose that $c, c' \in {}^{\sigma}\mathbb{R}$ are such that $c \approx a$ and $c' \approx a$. Then, by symmetry and transitivity, $c \approx c'$ and hence $c - c' \in \vartheta$. Since $c$ and $c'$ are both standard real numbers, their difference is also a standard real number, and because there is only one standard real number in $\vartheta$, namely $0$, it follows that $c - c' = 0$. ∎

Note that there are some drawbacks to the ultrafilter construction of the hyperreals that later methods have sought to rectify, the foremost issue being that it is not possible to explicitly exhibit any free ultrafilter on $\mathbb{N}$ (the best we can do is demonstrate their existence). As a result, the order relation on ${}^*\mathbb{R}$ is not explicitly known, and one might puzzle over trying to order sequences such as the following:

$$\langle 1, 0, 1, 0, 1, 0, \ldots \rangle$$

$$\langle 0, 1, 0, 1, 0, 1, \ldots \rangle$$

$$\langle 1, \tfrac{1}{2}, 3, \tfrac{1}{4}, 5, \tfrac{1}{6}, 7, \tfrac{1}{8}, \ldots \rangle$$

For the first two sequences above, one must be equivalent to $\langle 1 \rangle$ modulo $U$ and the other must be equivalent to $\langle 0 \rangle$ modulo $U$, but it is impossible to tell which one is which. For the third sequence, one might wonder whether the sequence represents an infinite number, an infinitesimal, or something in between. Also of concern is that different ultrafilters may result in distinct fields, and the question of whether or not these fields are isomorphic turns out to depend on set-theoretic assumptions such as the continuum hypothesis. However, in the section on transfer, we will find that the results that hold in these fields don't depend on the ultrafilter, because any two such fields are so-called elementarily equivalent.

1.1.4 Enriching Sets and Functions to the Hyperreal Setting

For every set $A \subset \mathbb{R}$, we can associate the natural extension ${}^*A \subset \mathbb{R}^{\mathbb{N}}/{\equiv_U}$ by

$$[\langle x_n \rangle] \in {}^*A \text{ if and only if } \{n \in \mathbb{N} : x_n \in A\} \in U$$

The above definition can be readily extended to Cartesian products: given sequences $\langle x_n^1 \rangle, \langle x_n^2 \rangle, \ldots, \langle x_n^m \rangle$ (where it is understood that the superscripts index the sequences in this case and do not denote exponentiation) and a set $A \subset \mathbb{R}^m$, we define

$$\big([\langle x_n^1 \rangle], [\langle x_n^2 \rangle], \ldots, [\langle x_n^m \rangle]\big) \in {}^*A \text{ if and only if } \{n \in \mathbb{N} : (x_n^1, x_n^2, \ldots, x_n^m) \in A\} \in U$$

This allows us to readily extend $m$-ary functions $f(x_1, x_2, \ldots, x_m)$ and predicates/relations $P(x_1, x_2, \ldots, x_m)$ to the nonstandard domain (note that a predicate is essentially a Boolean-valued function). One can view an $m$-ary relation $P$ on a set $X$ as a subset $\tilde{P}$ of the Cartesian product space $X^m$, where we define

$$P(x_1, x_2, \ldots, x_m) \text{ if and only if } (x_1, x_2, \ldots, x_m) \in \tilde{P}$$

and a function $f : X^m \to X^k$ can be viewed as a subset $\Gamma_f$ of $X^{m+k}$, where

$$f(x_1, \ldots, x_m) = (y_1, \ldots, y_k) \text{ if and only if } (x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_k) \in \Gamma_f$$

In the case of $X = \mathbb{R}$, the natural extensions of $\tilde{P}$ and $\Gamma_f$ thus give us the corresponding relation ${}^*P$ and function ${}^*f$ in the nonstandard domain:

$${}^*P\big([\langle x_n^1 \rangle], [\langle x_n^2 \rangle], \ldots, [\langle x_n^m \rangle]\big) \iff \{n \in \mathbb{N} : P(x_n^1, x_n^2, \ldots, x_n^m)\} \in U$$

$${}^*f\big([\langle x_n^1 \rangle], \ldots, [\langle x_n^m \rangle]\big) = \big([\langle y_n^1 \rangle], \ldots, [\langle y_n^k \rangle]\big) \iff \{n \in \mathbb{N} : f(x_n^1, \ldots, x_n^m) = (y_n^1, \ldots, y_n^k)\} \in U$$

The extension of a function $f : \mathbb{R}^m \to \mathbb{R}^k$ can be written more naturally in terms of the extension of the equality relation of $\mathbb{R}^k$:

$${}^*f\big([\langle x_n^1 \rangle], [\langle x_n^2 \rangle], \ldots, [\langle x_n^m \rangle]\big) = [\langle f(x_n^1, x_n^2, \ldots, x_n^m) \rangle]$$

However, we will also want more sophisticated sets in nonstandard analysis than just copies of subsets of $\mathbb{R}$, such as those obtained by iterating the power set construction. In order to transfer the idea of the power set $P(\mathbb{R})$ to the nonstandard domain, we introduce the idea of the ultrapower of $P(\mathbb{R})$, which is the set of equivalence classes of sequences of subsets of $\mathbb{R}$ under ultrafilter equivalence (in analogy to $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$, we now have $P(\mathbb{R})^{\mathbb{N}}/{\equiv_U}$, if you will).

Definition 1.1.7. Given two sequences $\langle A_n \rangle$ and $\langle B_n \rangle$ of subsets $A_n, B_n \subset \mathbb{R}$, we define

$$\langle A_n \rangle \equiv_U \langle B_n \rangle \text{ if and only if } \{n \in \mathbb{N} : A_n = B_n\} \in U$$

The proof that this constitutes a valid equivalence relation is essentially identical to the proof in the case of equivalence modulo an ultrafilter for sequences of real numbers, and hence is omitted. We can also define a notion of containment for the equivalence classes in the ultrapower of $P(\mathbb{R})$:

Definition 1.1.8. Given an element $[\langle a_n \rangle] \in \mathbb{R}^{\mathbb{N}}/{\equiv_U}$ and an element $[\langle A_n \rangle]$ of the ultrapower of $P(\mathbb{R})$, we define a relation ${}^*{\in}$ by

$$[\langle a_n \rangle] \mathrel{{}^*{\in}} [\langle A_n \rangle] \text{ if and only if } \{n \in \mathbb{N} : a_n \in A_n\} \in U$$

Although elements of the ultrapower of $P(\mathbb{R})$ are not subsets of $\mathbb{R}^{\mathbb{N}}/{\equiv_U}$ and ${}^*{\in}$ is not the usual definition of membership, we can define a subset ${}^*A \subset {}^*\mathbb{R}$ corresponding to each element $[\langle A_n \rangle]$ in the ultrapower of $P(\mathbb{R})$ by

$$[\langle x_n \rangle] \in {}^*A \text{ if and only if } [\langle x_n \rangle] \mathrel{{}^*{\in}} [\langle A_n \rangle]$$

Subsets of ${}^*\mathbb{R}$ that are associated with members of the ultrapower of $P(\mathbb{R})$ in this way are called internal sets. The collection of all internal sets in ${}^*\mathbb{R}$ is denoted ${}^*P(\mathbb{R})$, which is a proper subset of the power set $P({}^*\mathbb{R})$. One immediate consequence of this is that the natural extension of $A \subset \mathbb{R}$ from before is the internal set associated to the constant sequence $A_n = A$ for all $n \in \mathbb{N}$. These ideas will be revisited in a more general setting in the section on the transfer principle.

1.2 Basic Analysis using the Hyperreals

In this section, we discuss some basic results from standard real analysis and differential calculus (of course, integrals and other more sophisticated mathematical objects can be extended to the nonstandard domain, but that is unfortunately beyond the scope of this paper). This section is only meant to illustrate how analysis can function in the nonstandard domain and the facility with which results can be proved. We will not expend much effort in proving such results, however, because the transfer principle discussed in later sections will make it unnecessary to re-prove results in the nonstandard domain.

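Definition 1.2.1 (Infinitesimal Continuity). The function ${}^*f : {}^*A \to {}^*\mathbb{R}$ is continuous at a point $a \in {}^*A$ if ${}^*f(a + \varepsilon) \approx {}^*f(a)$ for every infinitesimal $\varepsilon \in \vartheta$.

Example 1.2.1. Let ${}^*f : {}^*\mathbb{R} \to {}^*\mathbb{R}$ be the extension ${}^*f(x) = x^2$. Then for any finite hyperreal $a \in O$, we have for any infinitesimal $\varepsilon \in \vartheta$

$${}^*f(a + \varepsilon) = (a + \varepsilon)^2 = a^2 + 2a\varepsilon + \varepsilon^2$$

Since $a$ is finite and $\varepsilon$ is infinitesimal, $2a\varepsilon$ is infinitesimal (and clearly $\varepsilon^2$ is also infinitesimal), so

$${}^*f(a + \varepsilon) = a^2 + 2a\varepsilon + \varepsilon^2 \approx a^2 = {}^*f(a)$$

Since this holds for any $\varepsilon \in \vartheta$ for each $a \in O$, ${}^*f$ is continuous at all finite $x$ by definition. However, ${}^*f$ is not continuous at infinite values of $x$. For instance, given any infinite number $\omega$, then for the infinitesimal $1/\omega$ we have

$${}^*f\Big(\omega + \frac{1}{\omega}\Big) = \Big(\omega + \frac{1}{\omega}\Big)^2 = \omega^2 + 2\omega \cdot \frac{1}{\omega} + \frac{1}{\omega^2} = \omega^2 + 2 + \frac{1}{\omega^2} \not\approx \omega^2 = {}^*f(\omega)$$

so ${}^*f$ is not continuous at any infinite $\omega$.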

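Definition 1.2.2 (Infinitesimal Differentiability). The function ${}^*f : {}^*A \to {}^*\mathbb{R}$ is differentiable at a point $a \in {}^*A$ if there exists a finite $b \in {}^{\sigma}\mathbb{R}$ such that

$$\frac{{}^*f(a + \varepsilon) - {}^*f(a)}{\varepsilon} \approx b$$

for every nonzero infinitesimal $\varepsilon \in \vartheta$. In this case, we define ${}^*f'(a) = b$.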
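To see the difference quotient in action, the computation for f(x) = x² can be mimicked symbolically. The sketch below is only a stand-in for the hyperreal calculation: it assumes that expressions are polynomials in a single formal positive infinitesimal `eps` with real coefficients (stored as coefficient lists, lowest degree first), so that the standard part of such a polynomial is its constant coefficient. It does not model the ultrapower itself, and all function names are hypothetical.

```python
# Formal-polynomial stand-in for infinitesimal arithmetic (an assumption of this
# sketch, not the ultrapower construction). [c0, c1, c2] means c0 + c1*eps + c2*eps^2.

def add(p, q):
    """Coefficient-wise sum of two polynomials in eps."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [x + y for x, y in zip(p, q)]

def sub(p, q):
    """Difference p - q."""
    return add(p, [-c for c in q])

def mul(p, q):
    """Product of two polynomials in eps."""
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def divide_by_eps(p):
    """Divide by eps; only valid when the constant coefficient vanishes."""
    if p[0] != 0:
        raise ValueError("polynomial is not divisible by eps")
    return p[1:]

def standard_part(p):
    """Every term containing eps is infinitesimal, so only the constant term survives."""
    return p[0]

def f(p):
    """f(x) = x^2 applied to a polynomial in eps."""
    return mul(p, p)

a = 3
quotient = divide_by_eps(sub(f([a, 1]), f([a])))   # (f(a + eps) - f(a)) / eps

print(quotient)                  # [6, 1], i.e. 6 + eps
print(standard_part(quotient))   # 6
```

The quotient 2a + ε is infinitesimally close to 2a, matching the value b = 2a that the definition assigns to the derivative of x² at a = 3.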

Theorem 1.2.1. If a function ${}^*f : {}^*A \to {}^*\mathbb{R}$ is differentiable at $a \in {}^*A$, then ${}^*f$ is continuous at $a$.

Proof. Since ${}^*f$ is differentiable at $a$ with derivative $b$, for any nonzero infinitesimal $\varepsilon \in \vartheta$ we have

$${}^*f(a + \varepsilon) - {}^*f(a) \approx \varepsilon \cdot b$$

Since $b$ is finite and $\varepsilon$ is infinitesimal, $\varepsilon \cdot b$ is also infinitesimal, so

$${}^*f(a + \varepsilon) - {}^*f(a) \approx \varepsilon \cdot b \approx 0 \implies {}^*f(a + \varepsilon) \approx {}^*f(a)$$

for any $\varepsilon \in \vartheta$ (the case $\varepsilon = 0$ being trivial). Therefore ${}^*f$ is continuous at $a$ by definition. ∎

Theorem 1.2.2 (Chain Rule). Let $f : {}^*\mathbb{R} \to {}^*\mathbb{R}$ and $g : {}^*\mathbb{R} \to {}^*\mathbb{R}$ be two functions such that $g$ is differentiable at $a \in {}^*\mathbb{R}$ and $f$ is differentiable at $g(a)$. Then the function $f \circ g : {}^*\mathbb{R} \to {}^*\mathbb{R}$ is differentiable at $a$, and $(f \circ g)'(a) = f'(g(a))\, g'(a)$.

Proof. For any $x \neq a$ such that $x \approx a$, consider the expression

$$\frac{f(g(x)) - f(g(a))}{x - a}$$

If $g(x) = g(a)$, then

$$\frac{g(x) - g(a)}{x - a} = 0 \quad \text{and} \quad \frac{f(g(x)) - f(g(a))}{x - a} = 0$$

so $g'(a) = 0$ (since $g'(a) \approx 0$ and $g'(a)$ is standard), and the difference quotient of $f \circ g$ equals $0 = f'(g(a))\, g'(a)$. If $g(x) \neq g(a)$, then we can write

$$\frac{f(g(x)) - f(g(a))}{x - a} = \frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \cdot \frac{g(x) - g(a)}{x - a}$$

Since $g$ is differentiable at $a$,

$$\frac{g(x) - g(a)}{x - a} \approx g'(a)$$

and since $g$ is continuous at $a$, we have $g(x) \approx g(a)$, so

$$\frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \approx f'(g(a))$$

Therefore

$$\frac{f(g(x)) - f(g(a))}{x - a} = \frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \cdot \frac{g(x) - g(a)}{x - a} \approx f'(g(a))\, g'(a)$$

whenever $x \approx a$ with $x \neq a$, so $f \circ g$ is differentiable at $a$ with $(f \circ g)'(a) = f'(g(a))\, g'(a)$ by definition. ∎

Note that this proof of the Chain Rule is essentially trivial, and follows from a small bit of algebra that one might naively, albeit incorrectly, attempt when first learning calculus. We can also find incredible utility in the introduction of infinite numbers when modeling probabilistic behaviors such as the physics of a large collection of particles, for example, where we often want to consider an extremely large number of particles. It is also possible to introduce exotic mathematical objects in the nonstandard domain that can be (and have been) used to solve open problems in analysis.

15 2

The Transfer Principle The basic idea of this section is that, rather than developing two distinct systems of analysis using either the reals or the hyperreals and having to manually prove whether or not results derived in one system hold in the other, we want a principle that guarantees that statements of certain forms hold in standard analysis if and only if they hold in the nonstandard setting. This is somewhat analogous to, for instance, the principle of permanence of functional equations in complex analysis, which, if one forgives the imprecise statement, roughly gives the following:

Given two functions $g, h : \mathbb{R} \to \mathbb{R}$ and some relationship $F : \mathbb{R}^2 \to \mathbb{R}$ such that $F(g(x), h(x)) = 0$ for all $x \in \mathbb{R}$, if $g$, $h$, and $F$ permit extensions $g^*, h^* : \mathbb{C} \to \mathbb{C}$ and $F^* : \mathbb{C}^2 \to \mathbb{C}$ such that $g^*(z)$ and $h^*(z)$ are analytic and $F^*(z, w)$ is analytic in $z$ for each fixed $w$ and analytic in $w$ for each fixed $z$, then $g^*(z)$ and $h^*(z)$ satisfy

$$F^*(g^*(z), h^*(z)) = 0 \quad \text{for all } z \in \mathbb{C}$$

Although more sophisticated results have since been developed that allow for the transfer of more complicated statements, this section is intended to discuss Łoś' Theorem on the transfer of first-order logical formulae.

2.1 Formal Language and First-Order Logic

To begin with, we first want to define a language with which we construct logical formulae. This language will contain the following logical symbols

¬              not
∧              and
∨              or
⇒              implies
⇔              is equivalent to (if and only if)
∀              universal quantifier (for all)
∃              existential quantifier (there exists)
x1, x2, ...    variables
(              left bracket
)              right bracket
,              comma

In addition to logical symbols, there are also constants, function symbols, and predicates.

Example 2.1.1. Let $P(x)$ denote the predicate $(x \equiv 0 \bmod 2\pi)$, and define the functions $f_1(x) = \cos(x)$ and $f_2(x) = 2 - \cos(x)$. Then we can form the statement

$$(\forall x \in \mathbb{R})(P(x) \implies f_1(x) = f_2(x))$$

Definition 2.1.1. A language $\mathcal{L}$ is a set of symbols containing

1. The logical symbols

2. A set of constant symbols $\mathcal{C}$

3. A set of functions $\mathcal{F}$ and positive integers $n_f$ for each $f \in \mathcal{F}$ indicating that $f$ is a function of $n_f$ variables.

4. A set of predicates/relations $\mathcal{R}$ and positive integers $n_R$ for each $R \in \mathcal{R}$ indicating that $R$ is an $n_R$-ary relation.

Technically, the inclusion of constants above is superfluous, because they can be regarded as nullary functions, that is, functions with no inputs and one output. Since the logical symbols are a part of every language, we usually omit them when defining and discussing languages. For instance, we would denote the language that only contains the logical symbols by $\mathcal{L} = \emptyset$.

Given a language, one can construct formulas, which are combinations of symbols arranged with the proper syntax (of course, the reader should already be familiar with the syntax of logical and mathematical expressions).

Definition 2.1.2. Given a language $\mathcal{L}$, let $M$ be a nonempty set and $V \subset \mathcal{L}$ be the set of all variables in $\mathcal{L}$. A variable assignment is a mapping $\beta : V \to M$ which assigns an element of $M$ to each variable in $V$.

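Definition 2.1.3. Let $\mathcal{L}$ be a language. An $\mathcal{L}$-structure $\mathcal{M}$ is given by the following data:

1. A nonempty set $M$ called the domain or underlying set of $\mathcal{M}$

2. A function $f^{\mathcal{M}} : M^{n_f} \to M$ for each $f \in \mathcal{F}$

3. A set $R^{\mathcal{M}} \subseteq M^{n_R}$ for each $R \in \mathcal{R}$, where we interpret $R(x_1, x_2, \ldots, x_{n_R})$ as "true" if and only if $(x_1, x_2, \ldots, x_{n_R}) \in R^{\mathcal{M}}$

4. An element $c^{\mathcal{M}} \in M$ for every $c \in \mathcal{C}$

We refer to $f^{\mathcal{M}}$, $R^{\mathcal{M}}$, and $c^{\mathcal{M}}$ as interpretations of the symbols $f$, $R$, and $c$. We often write the structure as $\mathcal{M} = (M, f^{\mathcal{M}}, R^{\mathcal{M}}, c^{\mathcal{M}} : f \in \mathcal{F}, R \in \mathcal{R}, \text{ and } c \in \mathcal{C})$, or we sometimes denote it $\mathcal{M} = (M, I, \beta)$, where $\beta$ is a variable assignment function and $I$ is an interpretation function whose domain is the set of all constants, relations, and functions in $\mathcal{L}$ (i.e., $I(c) = c^{\mathcal{M}}$, $I(f) = f^{\mathcal{M}}$, and $I(R) = R^{\mathcal{M}}$ for all $c \in \mathcal{C}$, $f \in \mathcal{F}$, $R \in \mathcal{R}$).

Closely related is the concept of a relational structure:

Definition 2.1.4. A relational structure $\mathcal{S} = \{M, \mathcal{R}, \mathcal{F}\}$ consists of a set $M$, a set $\mathcal{R}$ of finitary relations on $M$, and a set $\mathcal{F}$ of functions on $M$.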

Of special interest will be the relational structures $\mathfrak{R}$, consisting of $\mathbb{R}$ and all possible relations and functions on $\mathbb{R}$, and ${}^*\mathfrak{R}$, which consists of ${}^*\mathbb{R}$ and the extensions of the relations and functions in $\mathfrak{R}$.

Definition 2.1.5. A term is a string of symbols from a language $\mathcal{L}$ that is defined recursively as follows:

• Every constant and every variable of L is a term.

• If $\tau_1, \tau_2, \ldots, \tau_{n_f}$ are terms and $f$ is an $n_f$-ary function, then $f(\tau_1, \tau_2, \ldots, \tau_{n_f})$ is a term.

• A string of symbols is a term if it can be constructed by the application of finitely many of the above steps.

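Definition 2.1.6. Let $\mathcal{L}$ be a language and $\mathcal{M} = (M, I, \beta)$ be an $\mathcal{L}$-structure. Then the interpretation $(\tau)^{I,\beta}$ of any term $\tau$ of symbols in $\mathcal{L}$ is defined as follows:

• If $\tau = c$ for some constant $c \in \mathcal{C}$, then $(\tau)^{I,\beta} = I(c) = c^{\mathcal{M}}$

• If $\tau = x$ for some variable $x$, then $(\tau)^{I,\beta} = \beta(x)$

• If $\tau = f(\tau_1, \tau_2, \ldots, \tau_{n_f})$ for some $n_f$-ary function $f \in \mathcal{F}$, then $(\tau)^{I,\beta} = I(f)\bigl((\tau_1)^{I,\beta}, (\tau_2)^{I,\beta}, \ldots, (\tau_{n_f})^{I,\beta}\bigr) = f^{\mathcal{M}}\bigl((\tau_1)^{I,\beta}, \ldots, (\tau_{n_f})^{I,\beta}\bigr)$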
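The recursive interpretation of terms can be sketched in code. The following is a minimal illustration, not part of the original development: the class names `Var`, `Const`, `App`, the function `interpret`, and the example structure (integers with symbols `one`, `plus`, `times`) are all hypothetical choices. The interpretation function $I$ and the variable assignment $\beta$ are modelled as plain dictionaries.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str  # a variable symbol such as x1


@dataclass(frozen=True)
class Const:
    name: str  # a constant symbol c


@dataclass(frozen=True)
class App:
    func: str                 # a function symbol f
    args: Tuple["Term", ...]  # the argument terms tau_1, ..., tau_{n_f}


Term = Union[Var, Const, App]


def interpret(term: Term, I: Dict[str, object], beta: Dict[str, object]) -> object:
    """Compute (term)^{I, beta} by recursion on the structure of the term."""
    if isinstance(term, Const):
        return I[term.name]        # (c)^{I,beta} = I(c)
    if isinstance(term, Var):
        return beta[term.name]     # (x)^{I,beta} = beta(x)
    if isinstance(term, App):
        f = I[term.func]           # I(f), a Python callable
        return f(*(interpret(t, I, beta) for t in term.args))
    raise TypeError(f"not a term: {term!r}")


# Hypothetical structure on the integers (an assumption for illustration).
I = {"one": 1, "plus": lambda a, b: a + b, "times": lambda a, b: a * b}
beta = {"x1": 4, "x2": 5}

# The term plus(times(x1, x2), one)
tau = App("plus", (App("times", (Var("x1"), Var("x2"))), Const("one")))
value = interpret(tau, I, beta)  # 4 * 5 + 1
```

Changing `beta` changes the value of the same term, which is exactly the role the variable assignment plays in the definition.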

Definition 2.1.7. A formula is a string of symbols from a language L that is defined recursively as follows:

1. If $\tau_1$ and $\tau_2$ are terms, then $(\tau_1 = \tau_2)$ is a formula

2. If $R$ is an $n_R$-ary relation and $\tau_1, \tau_2, \ldots, \tau_{n_R}$ are terms, then $R(\tau_1, \tau_2, \ldots, \tau_{n_R})$ is a formula

3. If ϕ is a formula then so is (¬ϕ)

4. If ϕ and ψ are formulas, then so is (ϕ ∧ ψ)

5. If ϕ is a formula and x is a variable, then (∃x)(ϕ) is a formula

6. A string of symbols is a formula if it can be constructed by the application of finitely many of the above steps.

Formulas given in the form of #1 and #2 above are called atomic formulae.

Note that the inclusion of #1 above is not strictly necessary to the definition, since it is just a special case of #2. Also note that it is only necessary to consider the symbols ∧, ¬, and ∃ both here and in proofs later on because all other logical symbols can be written in terms of these three. For instance, (ϕ ∨ ψ) = ¬((¬ϕ) ∧ (¬ψ)), (∀x)(ϕ) = ¬((∃x)(¬ϕ)), and (ϕ ⇒ ψ) = ¬(ϕ ∧ (¬ψ)).
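The claim that ∨, ⇒, and ∀ can be expressed using only ∧, ¬, and ∃ can be checked mechanically. The sketch below is only an illustration: the helper names `lor`, `implies`, `exists`, and `forall` are hypothetical, and the quantifier check is restricted to a small finite domain chosen for the example.

```python
from itertools import product


def lor(p: bool, q: bool) -> bool:
    """(p or q) written as not((not p) and (not q))."""
    return not ((not p) and (not q))


def implies(p: bool, q: bool) -> bool:
    """(p implies q) written as not(p and (not q))."""
    return not (p and (not q))


# Exhaustive truth-table check of both propositional identities.
for p, q in product([False, True], repeat=2):
    assert lor(p, q) == (p or q)
    assert implies(p, q) == ((not p) or q)

# A finite domain chosen purely for illustration.
domain = range(5)


def exists(pred) -> bool:
    return any(pred(x) for x in domain)


def forall(pred) -> bool:
    """(for all x) pred(x) written as not (exists x) not pred(x)."""
    return not exists(lambda x: not pred(x))


assert forall(lambda x: x >= 0) == all(x >= 0 for x in domain)
assert forall(lambda x: x < 4) == all(x < 4 for x in domain)
```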

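Definition 2.1.8. Let $\mathcal{L}$ be a language and $\mathcal{M} = (M, I, \beta)$ be an $\mathcal{L}$-structure, and let $\varphi$ be a formula in $\mathcal{L}$. Then we say that $\mathcal{M}$ satisfies $\varphi$ and write $\mathcal{M} \models \varphi$ according to the following recursive rules:

• If $\varphi = R(\tau_1, \ldots, \tau_{n_R})$ for some $n_R$-ary relation $R \in \mathcal{R}$, meaning that $\varphi$ is atomic, then $\mathcal{M} \models \varphi$ if $\bigl((\tau_1)^{I,\beta}, \ldots, (\tau_{n_R})^{I,\beta}\bigr) \in I(R) = R^{\mathcal{M}}$

• If $\varphi = \neg\psi$ for some formula $\psi$, then $\mathcal{M} \models \varphi$ if $\mathcal{M}$ does not satisfy $\psi$ (that is, if $\mathcal{M} \not\models \psi$)

• If $\varphi = (\mu \wedge \nu)$ for some formulae $\mu$ and $\nu$, then $\mathcal{M} \models \varphi$ if $\mathcal{M} \models \mu$ and $\mathcal{M} \models \nu$.

• If $\varphi = (\exists x)(\psi)$ for some formula $\psi$, then $\mathcal{M} = (M, I, \beta) \models \varphi$ if there exists some $c \in M$ such that $(M, I, \beta[x, c]) \models \psi$, where

$$\beta[x, c](y) = \begin{cases} c & \text{if } y = x \\ \beta(y) & \text{if } y \neq x \end{cases}$$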
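For a finite domain the satisfaction relation can be evaluated directly, applying each clause recursively to subformulas. The sketch below is a hedged illustration only: the tuple encoding of formulas, the names `eval_term` and `satisfies`, and the example structure on $\{0, 1, 2, 3\}$ are all assumptions introduced here. Constants are treated as nullary functions, and the modified assignment $\beta[x, c]$ is modelled by the dictionary update `{**beta, x: c}`.

```python
def eval_term(t, I, beta):
    """Interpret a term: a variable name (str) or ("app", f, (args...))."""
    if isinstance(t, str):           # a variable
        return beta[t]
    _, fname, args = t               # constants are nullary applications
    return I[fname](*(eval_term(a, I, beta) for a in args))


def satisfies(M, I, beta, phi):
    """Decide whether (M, I, beta) satisfies phi, for a finite domain M."""
    tag = phi[0]
    if tag == "rel":                 # atomic: R(tau_1, ..., tau_n)
        _, rname, terms = phi
        return tuple(eval_term(t, I, beta) for t in terms) in I[rname]
    if tag == "not":
        return not satisfies(M, I, beta, phi[1])
    if tag == "and":
        return satisfies(M, I, beta, phi[1]) and satisfies(M, I, beta, phi[2])
    if tag == "exists":              # try every c in M with beta[x, c]
        _, x, psi = phi
        return any(satisfies(M, I, {**beta, x: c}, psi) for c in M)
    raise ValueError(f"unknown formula tag: {tag!r}")


# Hypothetical finite structure (an assumption for illustration).
M = {0, 1, 2, 3}
I = {
    "zero": lambda: 0,
    "lt": {(a, b) for a in M for b in M if a < b},
}

# (exists y)(lt(x, y)): truth depends on the assignment of the free variable x.
phi1 = ("exists", "y", ("rel", "lt", ("x", "y")))
# not (exists x)(lt(x, zero)): a sentence, so the assignment is irrelevant.
phi2 = ("not", ("exists", "x", ("rel", "lt", ("x", ("app", "zero", ())))))

holds_at_2 = satisfies(M, I, {"x": 2}, phi1)
holds_at_3 = satisfies(M, I, {"x": 3}, phi1)
sentence_holds = satisfies(M, I, {}, phi2)
```

The first formula has a free variable, so its truth value changes with the assignment; the second is a sentence and can be evaluated with an empty assignment.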

Definition 2.1.9. Given a language $\mathcal{L}$, a set $T$ of non-empty formulas in $\mathcal{L}$ is called a theory. We say that an $\mathcal{L}$-structure $\mathcal{M}$ is a model of $T$ if $\mathcal{M} \models \varphi$ for all $\varphi \in T$, and write $\mathcal{M} \models T$. The theory of $\mathcal{M}$, denoted $Th(\mathcal{M})$, is the set of all sentences $\varphi$ of $\mathcal{L}$ such that $\mathcal{M} \models \varphi$.

Definition 2.1.10. Let $\varphi$ be a formula. In the expressions $(\forall x)(\varphi)$ and $(\exists x)(\varphi)$, we call $\varphi$ the range of the quantifier.

Definition 2.1.11. An occurrence of a variable $x$ is called bound if it lies in the range of a universal or existential quantifier, and free otherwise. A formula in which all variables are bound is called a sentence, and is said to be closed.

Thus concludes the basic overview of the relevant topics in first-order logic. In the next section, we discuss ultraproducts and conclude with Łoś' Theorem.

2.2 Basics of Transfer

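2.2.1 Ultraproducts and Ultrapowers

Definition 2.2.1. Let $\mathcal{U}$ be a free ultrafilter on some (infinite) indexing set $J$, and let $\{M_j\}_{j \in J}$ be a collection of nonempty sets. Then we define the arbitrary product of the collection as

$$\prod_{j \in J} M_j = \Bigl\{ f : J \to \bigcup_{j \in J} M_j \;\Big|\; f(j) \in M_j \text{ for all } j \in J \Bigr\}$$

Two functions $f, g \in \prod_{j \in J} M_j$ are modulo equivalent if $\{j \in J : f(j) = g(j)\} \in \mathcal{U}$, in which case we write $f =_{\mathcal{U}} g$ or, in terms of equivalence classes, $[f]_{\mathcal{U}} = [g]_{\mathcal{U}}$. Note that the proof that modulo equivalence is an equivalence relation is entirely analogous to the proof that $\equiv_{\mathcal{U}}$ is an equivalence relation on $\mathbb{R}^{\mathbb{N}}$.

Definition 2.2.2. The ultraproduct of $\{M_j\}_{j \in J}$ modulo $\mathcal{U}$ is the set of equivalence classes

$$\Bigl(\prod_{j \in J} M_j\Bigr) \Big/ \mathcal{U} = \Bigl\{ [f]_{\mathcal{U}} : f \in \prod_{j \in J} M_j \Bigr\}$$

The ultrapower of a set $M$ is the ultraproduct of the constant sequence $M_j = M$ for all $j \in J$:

$$\Bigl(\prod_{j \in J} M\Bigr) \Big/ \mathcal{U} = \Bigl\{ [f]_{\mathcal{U}} : f \in \prod_{j \in J} M \Bigr\}$$

Using the above notation, we would denote the hyperreals by ${}^*\mathbb{R} = \bigl(\prod_{n \in \mathbb{N}} \mathbb{R}\bigr) / \mathcal{U}$.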

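Definition 2.2.3. Let $J$ be an index set with some ultrafilter $\mathcal{U}$ on $J$, and let $\mathcal{M}_j = (M_j, I_j, \beta_j)$ be an $\mathcal{L}$-structure for some language $\mathcal{L}$ for all $j \in J$. Then the ultraproduct ${}^*\mathcal{M} = \bigl(\prod_{j \in J} M_j / \mathcal{U}, {}^*I, {}^*\beta\bigr)$ is a model of $\mathcal{L}$ with an interpretation function ${}^*I$ and a variable assignment function ${}^*\beta$ defined as follows:

• If $x$ is a variable in $\mathcal{L}$, then ${}^*\beta(x) = [\beta_j(x)]_{\mathcal{U}}$

• If $c \in \mathcal{C}$ is a constant in $\mathcal{L}$, then ${}^*I(c) = [I_j(c)]_{\mathcal{U}}$

• If $f \in \mathcal{F}$ is an $n_f$-ary function, then

$${}^*I(f)\bigl([g_1]_{\mathcal{U}}, [g_2]_{\mathcal{U}}, \ldots, [g_{n_f}]_{\mathcal{U}}\bigr) = \bigl[I_j(f)(g_1(j), g_2(j), \ldots, g_{n_f}(j))\bigr]_{\mathcal{U}}$$

• If $R$ is an $n_R$-ary relation, then

$$\bigl([g_1]_{\mathcal{U}}, \ldots, [g_{n_R}]_{\mathcal{U}}\bigr) \in {}^*I(R) \iff \{j \in J : (g_1(j), \ldots, g_{n_R}(j)) \in I_j(R)\} \in \mathcal{U}$$

Here $[h_j]_{\mathcal{U}}$ denotes the equivalence class of the function $j \mapsto h_j$.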

The above definitions are well-defined.

Proof. To demonstrate that the above is well-defined, we want to show that the definitions of ${}^*I(f)$ and ${}^*I(R)$ for each given argument $([g_1]_{\mathcal{U}}, \ldots, [g_n]_{\mathcal{U}})$ are independent of the choice of representative of each equivalence class $[g_m]_{\mathcal{U}}$.

In the case of the relation $R$, fix $g_1, g_2, \ldots, g_{n_R}, g'_1, g'_2, \ldots, g'_{n_R} \in \prod_{j \in J} M_j$ such that $g_m =_{\mathcal{U}} g'_m$ for all $1 \le m \le n_R$. Then $\{j \in J : g_m(j) = g'_m(j)\} \in \mathcal{U}$ for all $1 \le m \le n_R$, so by the finite intersection property

$$\{j \in J : (g_1(j), \ldots, g_{n_R}(j)) = (g'_1(j), \ldots, g'_{n_R}(j))\} = \bigcap_{m=1}^{n_R} \{j \in J : g_m(j) = g'_m(j)\} \in \mathcal{U}$$

If $([g_1]_{\mathcal{U}}, \ldots, [g_{n_R}]_{\mathcal{U}}) \in {}^*I(R)$, then $\{j \in J : (g_1(j), \ldots, g_{n_R}(j)) \in I_j(R)\} \in \mathcal{U}$ by definition, so by the finite intersection property again the set below, which is the intersection of the previous two sets, is also an element of $\mathcal{U}$.

$$\{j \in J : (g_1(j), \ldots, g_{n_R}(j)) = (g'_1(j), \ldots, g'_{n_R}(j)) \text{ and } (g_1(j), \ldots, g_{n_R}(j)) \in I_j(R)\}$$

Then by the superset property, the set below, which is a superset of the set above, is an element of $\mathcal{U}$.

$$\{j \in J : (g'_1(j), \ldots, g'_{n_R}(j)) \in I_j(R)\}$$

Therefore $([g_1]_{\mathcal{U}}, \ldots, [g_{n_R}]_{\mathcal{U}}) \in {}^*I(R) \implies ([g'_1]_{\mathcal{U}}, \ldots, [g'_{n_R}]_{\mathcal{U}}) \in {}^*I(R)$, and the same reasoning with each $g_m$ and $g'_m$ switched demonstrates the converse implication, so

$$([g_1]_{\mathcal{U}}, \ldots, [g_{n_R}]_{\mathcal{U}}) \in {}^*I(R) \iff ([g'_1]_{\mathcal{U}}, \ldots, [g'_{n_R}]_{\mathcal{U}}) \in {}^*I(R)$$

and hence ${}^*I(R)$ is well-defined.

The case of functions proceeds similarly. Given $g_1, g_2, \ldots, g_{n_f}, g'_1, g'_2, \ldots, g'_{n_f} \in \prod_{j \in J} M_j$ such that $g_m =_{\mathcal{U}} g'_m$ for all $1 \le m \le n_f$, then

$$\{j \in J : (g_1(j), \ldots, g_{n_f}(j)) = (g'_1(j), \ldots, g'_{n_f}(j))\} = \bigcap_{m=1}^{n_f} \{j \in J : g_m(j) = g'_m(j)\} \in \mathcal{U}$$

as before. By the superset property, the set below, which is a superset of the set above, is also an element of $\mathcal{U}$.

$$\{j \in J : I_j(f)(g_1(j), \ldots, g_{n_f}(j)) = I_j(f)(g'_1(j), \ldots, g'_{n_f}(j))\}$$

Therefore $[I_j(f)(g_1(j), \ldots, g_{n_f}(j))]_{\mathcal{U}} = [I_j(f)(g'_1(j), \ldots, g'_{n_f}(j))]_{\mathcal{U}}$, so ${}^*I(f)$ is well-defined. □
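The well-definedness argument can be exercised on a toy case. A free ultrafilter on an infinite set cannot be written down explicitly, so the sketch below makes a loud simplifying assumption: it uses a principal ultrafilter on the three-element index set $J = \{0, 1, 2\}$, generated by the hypothetical index `j0 = 1`, with every factor equal to the integers. This is not the setting used to build the hyperreals; it only illustrates that the filter properties used in the proof (finite intersection, superset, maximality) make operations and relations independent of the chosen representatives. All names in the code are hypothetical.

```python
from itertools import combinations

J = (0, 1, 2)
j0 = 1  # generator of the principal ultrafilter (assumption for illustration)


def powerset(s):
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]


# U consists of every subset of J that contains j0.
U = {A for A in powerset(J) if j0 in A}


def is_ultrafilter(U, J):
    """Check the filter axioms and maximality by brute force."""
    full = frozenset(J)
    subsets = powerset(J)
    if frozenset() in U or full not in U:
        return False
    for A in subsets:
        if (A in U) == ((full - A) in U):   # maximality: exactly one of A, J \ A
            return False
        for B in subsets:
            if A in U and B in U and (A & B) not in U:   # finite intersection
                return False
            if A in U and A <= B and B not in U:         # superset property
                return False
    return True


def equivalent(f, g):
    """f =_U g: the agreement set of the two representatives lies in U."""
    return frozenset(j for j in J if f[j] == g[j]) in U


def lift(op):
    """Apply an operation on the factors pointwise to representatives."""
    return lambda *fs: tuple(op(*(f[j] for f in fs)) for j in J)


def rel_holds(rel, *fs):
    """The lifted relation holds if it holds on a set of indices in U."""
    return frozenset(j for j in J if rel(*(f[j] for f in fs))) in U


# Two pairs of representatives that agree at j0, hence are equivalent.
g1, g1p = (5, 7, 9), (0, 7, 1)
g2, g2p = (1, 2, 3), (8, 2, 8)
add = lift(lambda a, b: a + b)

same_sum_class = equivalent(add(g1, g2), add(g1p, g2p))
same_relation = rel_holds(lambda a, b: a > b, g1, g2) == rel_holds(
    lambda a, b: a > b, g1p, g2p
)
```

With a principal ultrafilter the ultraproduct simply reproduces the factor at `j0`, which is why free ultrafilters are needed to obtain genuinely new elements.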

2.2.2 Łoś' Theorem

Theorem 2.2.1 (Łoś' Theorem). Let $\mathcal{L}$ be a language, $J$ be a set with some ultrafilter $\mathcal{U}$ on $J$, and $\mathcal{M}_j = (M_j, I_j, \beta_j)$ be an $\mathcal{L}$-structure for all $j \in J$. Then for any formula $\varphi$ of $\mathcal{L}$ we have that ${}^*\mathcal{M} = \bigl(\prod_{j \in J} M_j / \mathcal{U}, {}^*I, {}^*\beta\bigr) \models \varphi$ if and only if $\{j \in J : \mathcal{M}_j \models \varphi\} \in \mathcal{U}$.

Proof. Since every formula is defined recursively, we proceed with the proof by induction on the complexity of the formula. First, as the base case, we consider atomic formulae.

• If $\varphi$ is an atomic formula (i.e., a relation $R(\tau_1, \ldots, \tau_{n_R})$ between the terms $\tau_1, \ldots, \tau_{n_R}$), then the statement holds by definition of relations on the ultraproduct (revisit Definition 2.2.3 if this is unclear).

Now, given any formula $\varphi$, suppose as an induction hypothesis that the result holds for any formula that can be constructed in strictly fewer steps than $\varphi$.

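• If $\varphi = (\mu \wedge \nu)$ for some formulas $\mu$ and $\nu$, then suppose for the forwards direction of the proof that ${}^*\mathcal{M} \models (\mu \wedge \nu)$. Then ${}^*\mathcal{M} \models \mu$ and ${}^*\mathcal{M} \models \nu$ by definition. Since $\mu$ and $\nu$ can each be constructed in strictly fewer steps than $\varphi$, the induction hypothesis gives ${}^*\mathcal{M} \models \mu \implies \{j \in J : \mathcal{M}_j \models \mu\} \in \mathcal{U}$ and ${}^*\mathcal{M} \models \nu \implies \{j \in J : \mathcal{M}_j \models \nu\} \in \mathcal{U}$. Then, by the finite intersection property,

$$\{j \in J : \mathcal{M}_j \models (\mu \wedge \nu)\} = \{j \in J : \mathcal{M}_j \models \mu \text{ and } \mathcal{M}_j \models \nu\} = \{j \in J : \mathcal{M}_j \models \mu\} \cap \{j \in J : \mathcal{M}_j \models \nu\}$$

is an element of $\mathcal{U}$.

For the reverse direction, suppose that $\{j \in J : \mathcal{M}_j \models (\mu \wedge \nu)\} \in \mathcal{U}$. Then, since $\{j \in J : \mathcal{M}_j \models (\mu \wedge \nu)\} = \{j \in J : \mathcal{M}_j \models \mu \text{ and } \mathcal{M}_j \models \nu\}$, we have that

$$\{j \in J : \mathcal{M}_j \models \mu\} \supset \{j \in J : \mathcal{M}_j \models \mu \text{ and } \mathcal{M}_j \models \nu\}$$

$$\{j \in J : \mathcal{M}_j \models \nu\} \supset \{j \in J : \mathcal{M}_j \models \mu \text{ and } \mathcal{M}_j \models \nu\}$$

are both elements of $\mathcal{U}$ by the superset property. Therefore ${}^*\mathcal{M} \models \mu$ and ${}^*\mathcal{M} \models \nu$ by the induction hypothesis, so ${}^*\mathcal{M} \models (\mu \wedge \nu)$ by definition.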

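• If $\varphi = (\neg\psi)$ for some formula $\psi$, then ${}^*\mathcal{M} \models \varphi$ if and only if ${}^*\mathcal{M} \not\models \psi$ by definition. Since $\psi$ can be constructed in one fewer step than $\varphi$, it follows from the induction hypothesis that ${}^*\mathcal{M} \not\models \psi \iff \{j \in J : \mathcal{M}_j \models \psi\} \notin \mathcal{U}$. By the maximality property, $\{j \in J : \mathcal{M}_j \models \psi\} \notin \mathcal{U}$ if and only if

$$J \setminus \{j \in J : \mathcal{M}_j \models \psi\} = \{j \in J : \mathcal{M}_j \not\models \psi\} = \{j \in J : \mathcal{M}_j \models \varphi\} \in \mathcal{U}$$

Therefore ${}^*\mathcal{M} \models \varphi \iff \{j \in J : \mathcal{M}_j \models \varphi\} \in \mathcal{U}$ in this case.

• If $\varphi = (\exists x)(\psi)$ for some formula $\psi$ with a free variable $x$, then suppose for the forwards direction that ${}^*\mathcal{M} \models (\exists x)(\psi)$. By definition, this means that there exists some $[g]_{\mathcal{U}} \in \prod_{j \in J} M_j / \mathcal{U}$ such that $\bigl(\prod_{j \in J} M_j / \mathcal{U}, {}^*I, {}^*\beta[x, [g]_{\mathcal{U}}]\bigr) \models \psi$. Since $\psi$ can be constructed in one fewer step than $\varphi$, the induction hypothesis implies that $\mathcal{U}$ contains $\{j \in J : (M_j, I_j, \beta_j[x, g(j)]) \models \psi\}$, and hence by the superset property contains

$$\{j \in J : \mathcal{M}_j = (M_j, I_j, \beta_j) \models (\exists x)(\psi)\} \supset \{j \in J : (M_j, I_j, \beta_j[x, g(j)]) \models \psi\}$$

For the reverse direction, suppose $\{j \in J : \mathcal{M}_j \models (\exists x)(\psi)\} \in \mathcal{U}$. Then, using the axiom of choice, define a function $g \in \prod_{j \in J} M_j$ such that for all $j \in \{j \in J : \mathcal{M}_j = (M_j, I_j, \beta_j) \models (\exists x)(\psi)\}$ we have that $g(j)$ is chosen so that $(M_j, I_j, \beta_j[x, g(j)]) \models \psi$ (that is, $g(j)$ is an element in $M_j$ that satisfies $\psi$ for each $j$ where such an element exists). Then by the induction hypothesis we have that $\{j \in J : (M_j, I_j, \beta_j[x, g(j)]) \models \psi\} \in \mathcal{U} \implies \bigl(\prod_{j \in J} M_j / \mathcal{U}, {}^*I, {}^*\beta[x, [g]_{\mathcal{U}}]\bigr) \models \psi$, so ${}^*\mathcal{M} = \bigl(\prod_{j \in J} M_j / \mathcal{U}, {}^*I, {}^*\beta\bigr) \models (\exists x)(\psi)$ by definition.

Since every formula in $\mathcal{L}$ can be obtained by the application of finitely many of the above steps, the result holds by induction. □

If one returns to the notion that a filter roughly represents a notion of "largeness" on a set, then in a heuristic sense Łoś' Theorem essentially says that a statement holds in the ultraproduct ${}^*\mathcal{M}$ if and only if it holds in a "large enough" proportion of the $\mathcal{L}$-structures $\mathcal{M}_j$.

Corollary 2.2.1.1 (Transfer Principle). Let $\mathcal{L}$ be a language, $J$ be a set with some ultrafilter $\mathcal{U}$ on $J$, and $\mathcal{M} = (M, I, \beta)$ be an $\mathcal{L}$-structure. Then for all formulas $\varphi$ of $\mathcal{L}$ we have that ${}^*\mathcal{M} = \bigl(\prod_{j \in J} M / \mathcal{U}, {}^*I, {}^*\beta\bigr) \models \varphi$ if and only if $\mathcal{M} = (M, I, \beta) \models \varphi$. That is, ${}^*\mathcal{M} \models Th(\mathcal{M})$.

Proof. If ${}^*\mathcal{M} \models \varphi$, then $\{j \in J : \mathcal{M}_j \models \varphi\} \in \mathcal{U}$ by Łoś' Theorem. However, since $\mathcal{M}_j = \mathcal{M}$ for all $j \in J$, either $\{j \in J : \mathcal{M}_j \models \varphi\} = J$ or $\{j \in J : \mathcal{M}_j \models \varphi\} = \emptyset$. Since $\emptyset \notin \mathcal{U}$, it follows that $\{j \in J : \mathcal{M}_j \models \varphi\} = \{j \in J : \mathcal{M} \models \varphi\} = J$ and hence $\mathcal{M} \models \varphi$. On the other hand, if $\mathcal{M} \models \varphi$, then $\{j \in J : \mathcal{M}_j \models \varphi\} = \{j \in J : \mathcal{M} \models \varphi\} = J \in \mathcal{U}$, so ${}^*\mathcal{M} \models \varphi$ by Łoś' Theorem. □

In the case of the hyperreals ${}^*\mathbb{R}$, we accomplish transfer by way of the ∗-transform:

Definition 2.2.4 (∗-Transform). If a term $\tau$ is a variable or a constant, then ${}^*\tau$ is the embedding of $\tau$ into ${}^\sigma\mathbb{R}$. If $\tau$ has the form $f(\tau_1, \ldots, \tau_{n_f})$ for an $n_f$-ary function $f$ and terms $(\tau_1, \ldots, \tau_{n_f})$, then ${}^*\tau = {}^*f({}^*\tau_1, \ldots, {}^*\tau_{n_f})$, where ${}^*f$ is the extension of $f$ to the hyperreals.

In the case of formulas, we define the ∗-transform inductively on the construction of the formula: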

• For atomic formulae, ${}^*(R(\tau_1, \ldots, \tau_{n_R})) = {}^*R({}^*\tau_1, \ldots, {}^*\tau_{n_R})$

• ${}^*(\neg\varphi) = \neg({}^*\varphi)$

• ${}^*(\varphi \wedge \psi) = ({}^*\varphi) \wedge ({}^*\psi)$

• ${}^*((\exists x \in A)(\varphi)) = (\exists x \in {}^*A)({}^*\varphi)$

The ∗-transform essentially consists of putting a ∗ on every term, relation symbol, function symbol, and set acting as a bound on a variable.

Also note that the ∗-transforms of the relations $=$ and $\le$ are $\equiv_{\mathcal{U}}$ and $\le_{\mathcal{U}}$, respectively.

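Example 2.2.1. The ∗-transform of

$$(\forall \epsilon \in \mathbb{R}^+)(\exists N \in \mathbb{N})(\forall n, m \in \mathbb{N})(n, m \ge N \implies |s_n - s_m| < \epsilon)$$

is given by

$$(\forall \epsilon \in {}^*\mathbb{R}^+)(\exists N \in {}^*\mathbb{N})(\forall n, m \in {}^*\mathbb{N})(n, m \ge_{\mathcal{U}} N \implies |{}^*s_n - {}^*s_m| <_{\mathcal{U}} \epsilon)$$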

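Theorem 2.2.2. A function $f : \mathbb{R} \to \mathbb{R}$ satisfies the standard epsilon-delta definition of continuity at some point $a \in \mathbb{R}$ if and only if its extension ${}^*f : {}^*\mathbb{R} \to {}^*\mathbb{R}$ satisfies the infinitesimal definition of continuity (Definition 1.2.1) at $a \in {}^\sigma\mathbb{R}$.

Proof. First, suppose that $f$ satisfies the epsilon-delta definition of continuity at $a$. Then for any fixed $\epsilon \in \mathbb{R}^+$, there exists a $\delta \in \mathbb{R}^+$ such that

$$(\forall x \in \mathbb{R})(|x - a| < \delta \implies |f(x) - f(a)| < \epsilon)$$

Taking the ∗-transform of the above sentence, we have

$$(\forall x \in {}^*\mathbb{R})(|x - a| <_{\mathcal{U}} \delta \implies |{}^*f(x) - {}^*f(a)| <_{\mathcal{U}} \epsilon)$$

In particular, if $x \approx a$ then $|x - a| <_{\mathcal{U}} \delta$, so $|{}^*f(x) - {}^*f(a)| <_{\mathcal{U}} \epsilon$. Since this is true for any standard $\epsilon > 0$, we must have ${}^*f(x) \approx {}^*f(a)$ whenever $x \approx a$.

For the reverse direction, suppose that ${}^*f$ satisfies the infinitesimal definition of continuity at $a$, and fix some $\epsilon \in \mathbb{R}^+$. Then for any positive infinitesimal $\delta$, whenever $|x - a| <_{\mathcal{U}} \delta$ we have $x \approx a$, so ${}^*f(x) \approx {}^*f(a)$ and hence $|{}^*f(x) - {}^*f(a)| <_{\mathcal{U}} \epsilon$. Therefore we have demonstrated

$$(\exists \delta \in {}^*\mathbb{R}^+)(\forall x \in {}^*\mathbb{R})(|x - a| <_{\mathcal{U}} \delta \implies |{}^*f(x) - {}^*f(a)| <_{\mathcal{U}} \epsilon)$$

and hence, because the sentence above is the ∗-transform of the sentence below, it follows from the transfer principle that

$$(\exists \delta \in \mathbb{R}^+)(\forall x \in \mathbb{R})(|x - a| < \delta \implies |f(x) - f(a)| < \epsilon)$$

Since the above holds for any $\epsilon > 0$, we have demonstrated the standard continuity of $f$ at $a$. □

Theorem 2.2.3. A function $f : \mathbb{R} \to \mathbb{R}$ is (standard) differentiable at $a \in \mathbb{R}$ with derivative $f'(a)$ if and only if its extension ${}^*f : {}^*\mathbb{R} \to {}^*\mathbb{R}$ satisfies

$$\frac{{}^*f(a + h) - {}^*f(a)}{h} \approx f'(a)$$

for every nonzero infinitesimal $h \in \vartheta$.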

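Proof. First, suppose that $f$ is (standard) differentiable at $a$ with derivative $f'(a)$, so

$$\lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = f'(a)$$

⇓

$$(\forall \epsilon > 0)(\exists \delta > 0)(\forall h \in \mathbb{R})\left(|h| < \delta \implies \left|\frac{f(a + h) - f(a)}{h} - f'(a)\right| < \epsilon\right)$$

Then, fixing some $\epsilon > 0$ and the corresponding $\delta$, we have by taking the ∗-transform that

$$(\forall h \in {}^*\mathbb{R})\left(|h| <_{\mathcal{U}} \delta \implies \left|\frac{{}^*f(a + h) - {}^*f(a)}{h} - f'(a)\right| <_{\mathcal{U}} \epsilon\right)$$

In particular, every nonzero infinitesimal $h$ satisfies $|h| <_{\mathcal{U}} \delta$, so

$$\left|\frac{{}^*f(a + h) - {}^*f(a)}{h} - f'(a)\right| <_{\mathcal{U}} \epsilon$$

Since this is true for any standard $\epsilon > 0$, we must have that $\frac{{}^*f(a+h) - {}^*f(a)}{h} - f'(a)$ is infinitesimal, and hence

$$\frac{{}^*f(a + h) - {}^*f(a)}{h} \approx f'(a)$$

for any nonzero infinitesimal $h$.

For the reverse direction, suppose that ${}^*f$ satisfies the infinitesimal differentiability property at $a$, and fix some $\epsilon \in \mathbb{R}^+$. Then for any positive infinitesimal $\delta$, whenever $|h| <_{\mathcal{U}} \delta$ we have that

$$\frac{{}^*f(a + h) - {}^*f(a)}{h} \approx f'(a)$$

and hence

$$\left|\frac{{}^*f(a + h) - {}^*f(a)}{h} - f'(a)\right| <_{\mathcal{U}} \epsilon$$

Therefore we have demonstrated

$$(\exists \delta \in {}^*\mathbb{R}^+)(\forall h \in {}^*\mathbb{R})\left(|h| <_{\mathcal{U}} \delta \implies \left|\frac{{}^*f(a + h) - {}^*f(a)}{h} - f'(a)\right| <_{\mathcal{U}} \epsilon\right)$$

and hence, since the above statement is the ∗-transform of the statement below, we must have

$$(\exists \delta \in \mathbb{R}^+)(\forall h \in \mathbb{R})\left(|h| < \delta \implies \left|\frac{f(a + h) - f(a)}{h} - f'(a)\right| < \epsilon\right)$$

by the transfer principle. Since this holds for any $\epsilon \in \mathbb{R}^+$, it follows from the epsilon-delta definition of a limit that

$$\lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = f'(a)$$

thus $f$ is (standard) differentiable at $a$. □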
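As a brief illustration of the infinitesimal criterion for differentiability, consider the hypothetical example $f(x) = x^3$ at a standard point $a \in \mathbb{R}$ (the choice of function is an assumption made here for illustration only). For any nonzero infinitesimal $h$:

```latex
\begin{align*}
\frac{{}^*f(a + h) - {}^*f(a)}{h}
  &= \frac{(a + h)^3 - a^3}{h} \\
  &= \frac{3a^2 h + 3a h^2 + h^3}{h} \\
  &= 3a^2 + 3ah + h^2 \\
  &\approx 3a^2 .
\end{align*}
```

The last step uses that $a$ is standard, hence finite, so $3ah$ and $h^2$ are both infinitesimal. This matches the familiar derivative $f'(a) = 3a^2$ without taking any limit.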

Bibliography

F. C. Claassens. Non-standard analysis. Web page, 2016. URL https://www.universiteitleiden.nl/binaries/content/assets/science/mi/scripties/masterclaassens.pdf. Accessed 2020/06/04.

I. Davis. An introduction to nonstandard analysis. Web page, 2009. URL https://www.math.uchicago.edu/~may/VIGRE/VIGRE2009/REUPapers/Davis.pdf. Accessed 2020/06/04.

P. Fletcher, K. Hrbacek, V. Kanovei, M. G. Katz, C. Lobry, and S. Sanders. Approaches to analysis with infinitesimals following Robinson, Nelson, and others, 2017. URL https://arxiv.org/pdf/1703.00425.pdf.

P. Keef and D. Guichard. Introduction to higher mathematics. Web page, 2015. URL https://www.whitman.edu/mathematics/higher_math_online/. Accessed 2020/06/04.

D. Marker. for algebra and algebraic geometry. Web page, 2010. URL http://homepages.math.uic.edu/~marker/orsay/orsay1.pdf. Accessed 2020/06/04.

F. Murnaghan. Mat 240 - algebra i: Fields. Web page, 2015. URL http://www.math.toronto.edu/fiona/courses/mat240/field.pdf. Accessed 2020/06/04.

D. A. B. Rayo. Introduction to non-standard analysis. Web page, 2015. URL http://math.uchicago.edu/~may/REU2015/REUPapers/Rayo.pdf. Accessed 2020/06/04.
