Numbers: from Theoretical Foundations to Practical Applications
AM30MP Mathematics Project
Calum Horrobin Aston University 6th May 2014
Abstract
The rewards of forming a well-established mathematical theory are unbounded. Such a theory provides a limitless source of philosophical knowledge and practical understanding about how our universe works. Furthermore, any practical application of mathematics cannot be appreciated in isolation and solely by its real-world implications, but rather as a consequence of forming a well-established theory. In this project we have analysed the theoretical foundations of numbers and demonstrated their value to a practical application of cryptography. A coherent theory of numbers is valuable to mathematics since numbers are a key component in many pure areas such as analysis, topology, number theory and geometry.
We have demonstrated how to establish a theory of numbers based on set theory, specifically using John Von Neumann’s inductive sets to model the natural numbers. We first prove that this construction of the natural numbers is valid under the Zermelo-Fraenkel axioms. We also prove that various properties which are essential to the natural numbers are satisfied, namely Peano’s postulates and the principles of induction and well-ordering. Ultimately, establishing the foundations of mathematics relies on being satisfied with an initial object or premise which is based on metamathematical grounds. The philosophy of using set theory as the foundations of mathematics is based on the acceptance of metamathematical concepts relating to the existence and nature of sets in the first place. Logical issues about the completeness and consistency of our set theory axiom system have implications for the completeness and consistency of our theory of numbers. We address the relevant philosophical and logical matters as appropriate.
A rigorous treatment of the foundations gives way to an effective and well-rounded mathematical theory. We broaden the theoretical foundations to analyse and prove some results in number theory which give way to a practical application of cryptography. Thus, we appreciate the value of establishing the foundations of numbers and recognise how further developments in theoretical components would improve the results in applied areas.
Preface
This project is a continuation of my Mathematics Report [1]. The Zermelo-Fraenkel axioms will be the starting point of my mathematical analysis; for completeness and reference, a list of these axioms is given below:
Axiom of Extensionality: If two sets, X and Y, have the same elements then they are equal, X = Y.
∀x(x ∈ X ⇔ x ∈ Y) ⇒ (X = Y).
Axiom of Empty Set: There exists a set, E, such that no set, X, is a member of E.
∃E∀X : X ∉ E.
Axiom of Pairs: For any pair of sets, X and Y, there exists a set, Z, whose elements are exactly X and Y.
∀X∀Y(∃Z : ∀z(z ∈ Z ⇔ (z = X ∨ z = Y))).
Axiom of Union: For any set, X, there exists a set, Y, which is the ‘union’ of all elements of X.
∀X∃Y : ∀y(y ∈ Y ⇔ ∃z : (z ∈ X ∧ y ∈ z)).
Axiom of Power Set: For any set, X, there exists a set, Y, which is the set of all subsets of X.
∀X∃Y : ∀y(y ∈ Y ⇔ y ⊆ X).
Axiom of Foundation: For all non-empty sets, X, there is an element, w, of X such that X and w have no common elements.
∀X(X ≠ ∅ ⇒ (∃w ∈ X : (∄y : (y ∈ X ∧ y ∈ w)))).
Axiom of Infinity: There exists a set, X, such that ∅ ∈ X and such that if y ∈ X then {y} ∈ X.
∃X : ((∅ ∈ X) ∧ ∀y(y ∈ X ⇒ {y} ∈ X)).
Axiom of Subsets: Let X be a set and let L(w) be a logical property of sets which depends on the variable w, then there exists a set, Y, which is the set of all the elements, w, in X which satisfy L(w).
∃Y : ∀w(w ∈ Y ⇔ (w ∈ X ∧ L(w))).
The following fundamental theorems were proved in my previous report and will be needed for my work concerning set theory:
Theorem 0.1. There is no set x such that it is an element of itself: ∀x(x ∉ x).
Cantor’s Theorem. Let S be a set. The cardinality of the power set of S, P(S), is strictly greater than the cardinality of S: card(S) < card(P(S)).
Contents
1 Introduction and Context
2 The Foundations: from Set Theory to Number Theory
  2.1 The Natural Numbers
    2.1.1 John Von Neumann’s Ordinals
    2.1.2 Philosophical and Logical Implications
    2.1.3 Induction and Ordering of the Natural Numbers
    2.1.4 Reflections and the Infinitude of the Natural Numbers
  2.2 Extending the Natural Numbers
    2.2.1 Arithmetic
    2.2.2 The Sets of Integer and Rational Numbers
  2.3 The Real Numbers
    2.3.1 Analysis of Dedekind Cuts and Cauchy Sequences
    2.3.2 Decimal Representation
3 Broadening the Theory of Numbers
  3.1 Fundamental Results
    3.1.1 Factoring Natural Numbers
    3.1.2 Analytic Study of Primes
    3.1.3 Theorems with Applicable Results to RSA Cryptography
  3.2 Relevant Issues
    3.2.1 Computing with Number Theory
    3.2.2 Integer Factorisation Difficulties
  3.3 Primality Testing
    3.3.1 Fermat’s Test
    3.3.2 Miller’s Test
4 Applying our Theoretical Results
  4.1 The RSA Cryptosystem
    4.1.1 How Encryption and Decryption Works
    4.1.2 Security of the RSA Method
  4.2 Implementing Fermat’s and Miller’s tests
  4.3 Generating Random Large Primes
    4.3.1 Theory and Method
    4.3.2 Analysis of Results
5 Conclusion
Appendix: Transcript of Codes
References
1 Introduction and Context
My previous report [1] demonstrated the importance of setting a sensible basis to underlie our mathematical theory, which consists of logic, undefined terms and axioms. Logically and philosophically speaking, practical applications of mathematics rely on these foundations as they form a consistent and effective theory. The objective of this work is to gain a deeper understanding of the foundations of mathematics and to demonstrate their fundamental importance to practical situations. To achieve this we first rigorously establish the foundations of numbers by using set theory. Building on this, we then develop the theory of numbers further to give way to a practical application of cryptography.
As motivation for studying the foundations of the natural numbers, it is in fact logically impossible to define the set of natural numbers in such a way that every natural number is a member of that set. This is contrary to what we might expect. The reason is that a ‘set’ has to be well-defined: as analysed in my previous report, it must satisfy one of the Zermelo-Fraenkel axioms of set theory. In Section 2.1 we discuss how Georg Cantor was aware of the contradictions caused by the set of all ordinal numbers and in Section 2.1.1 we examine this further using John Von Neumann’s definition of an ordinal number. It is of fundamental importance that the definition of an ordinal number deals with the problem of there being infinitely many natural numbers. Also, a valid definition should preserve the ordering properties of trichotomy and well-ordering; we will analyse order in Section 2.1.3. As given in my previous conclusion, an example of how an unsophisticated treatment of the natural numbers can lead to awkward problems is the question, ‘does the set N = {0, 1, 2,...} have more elements than the set N\{1} = {0, 2, 3,...}?’
The nature of our work for this project has a strong theoretical component. In order to show its value to an applied area we have focussed on its application to RSA (Rivest, Shamir and Adleman) cryptography. Cryptography is fundamental to the safety of electronic communication and is heavily relied upon in today’s world. In Section 3.1.3 we prove various results with direct applications to RSA cryptography and in Section 4 we show how the mathematical theory can be implemented in order to ensure the security of encryption.
Statement of Contributions
1. Through rigorous analysis of set theory, I have demonstrated how to establish a coherent theory of numbers based on the Zermelo-Fraenkel axioms;
2. I have made original reflections which delve deeper into the subject;
3. I have illustrated the application of fundamental results by programming a reasonably functional random prime generator;
4. This project is the result of much research and reflection; it is oriented to enhance the connection between purely theoretical foundations and practical applications;
5. This project has been written to invite the reader to reflect on the value of studying abstract foundations of mathematical theories.
2 The Foundations: from Set Theory to Number Theory
Our objective here is to establish the foundations upon which we can base a theory of numbers using set theory. We first study how various properties which determine the nature of numbers can be understood and derived, beginning with the natural numbers. Extending this, we then reflect on techniques of constructing the real numbers from the natural numbers and discuss some interesting observations of how we are able to represent numbers. In particular, representing numbers in digit form has some curious consequences and is intimately related to remaining unsolved challenges in number theory. Our work has significant value to the broad mathematical theory as a rigorous study of numbers gives way to topics such as number theory, topology and analysis. This Section of the report is inspired by the readings of: Stewart & Tall, Greenberg, Wilder, Dalen & Monna, Hamilton, Nievergelt, Faticoni and Potter [2–9] respectively.
2.1 The Natural Numbers
“If we are to do any worthwhile mathematics, it will certainly be necessary to have the natural numbers (or at any rate adequate proxies for them) available to us.” Potter p67 [9].
The natural numbers are fundamental to mathematics. To define the set of natural numbers is not a trivial task; an obvious difficulty is that there are infinitely many of them, “we cannot write down a complete list of elements: they just go on forever” p146 [2]. The existence and essence of the natural numbers can be analysed in great depth, for example Whitehead and Russell based their approach on fundamental principles in logic and algebra in their work Principia Mathematica [10]. For our work, taking a set theory approach, we find that the crux of the matter in defining the natural numbers lies in how to deal with their infinitude, which has previously been considered as “inaccessible to human knowledge” [Preface to] [10].
It would seem intuitive that we should be able to define the set of natural numbers, N, in such a way that every natural number is an element of N, however, this is logically impossible. As Georg Cantor was aware, the set of all ordinal numbers causes inconsistencies, “it would have an ordinal δ, which would be greater than all ordinals α, in particular δ would be greater than δ, which is absurd” p22 [5]. We state and reflect on Cantor’s definition of an ordinal number in Section 2.1.4 and in Section 2.1.1 we analyse how Von Neumann’s definition of an ordinal number also does not permit the existence of a set of all ordinal numbers. Essentially, based on Von Neumann’s definition, the set of all ordinals causes an analogy to Russell’s paradox and Cantor’s paradox whereby we are able to claim the truth of two opposing statements. To ensure logical consistency in mathematics is one of our reasons for rigorously establishing the foundations of the natural numbers.
A second reason for establishing the foundations of the natural numbers is, from a purely theoretical perspective, that we would like to minimise the number of terms we leave undefined in our mathematical theory. By doing so, we gain a deeper understanding of what numbers are and also appreciate what their most fundamental properties are, those which we should be able to derive and those which characterise or ‘define’ them. Part of the value of analysing axioms and investigating redundancies is, as Allenby writes, “to discover which features of a given system are essential and which only incidental” p12 [11]. Conclusions from my previous report show that it is impossible to define all terms and we should regard axioms as primitive assumptions which, in some sense, define objects by characterising all of their essential properties. The characteristic properties of the natural numbers can be precisely summarised through Giuseppe Peano’s postulates, which are stated below:
Peano’s Postulates
1. 0 is a natural number.
2. For each natural number n, there is another natural number n′.
3. For no natural number n is n′ equal to 0.
4. For any natural numbers m and n, if m′ = n′ then m = n.
5. For any set A of natural numbers containing 0, if n′ ∈ A whenever n ∈ A, then A contains every natural number.
These postulates can be found on p114 [6], where n′ is effectively in place of n + 1.
Set theory is a very powerful branch of mathematics. Since we are familiar and satisfied with sets, the objective of mathematicians who study these foundations is to define numbers in terms of sets. We should then be able to derive all properties and theorems which are possessed by numbers. We discuss the philosophy and logical implications of defining numbers in terms of sets in Section 2.1.2. Our approach is to follow John Von Neumann’s “brainwave” p160 [2] of inductive sets to form a set-theoretical model. We claim the existence of the natural numbers by proving Peano’s postulates are satisfied by this model. Thus, following techniques in set theory, there is no need for such postulates because they can in fact be derived. We note that this procedure of defining new objects in terms of old ones is an essential part of the foundations of mathematics, by constructing a valid model which is able to deduce a list of essential properties about the new objects.
2.1.1 John Von Neumann’s Ordinals
In an axiomatic set theory approach, Von Neumann based his definition of ordinal numbers on the principle of well-ordering. Namely, a set is an ordinal if it satisfies the well-ordering property and also every element of that set is the union of all its predecessors, p45 [5]. The principle of well-ordering will be formally stated and proved in Section 2.1.3, for now it will suffice to say that a set is well-ordered if every subset of that set has a least element with respect to some ordering relation. By inspection, then, we see that ∅ satisfies this condition to be an ordinal, as does ∅ ∪ {∅} = {∅} and {∅} ∪ {{∅}} = {∅, {∅}} and so on. In general, if α is an ordinal then α ∪ {α} is also an ordinal. Thus, Von Neumann developed his theory by introducing the concept of an inductive set, which is defined as a set which satisfies the following modification of the Zermelo-Fraenkel axiom of infinity:
∅ ∈ X ∧ ∀x(x ∈ X ⇒ x ∪ {x} ∈ X).
This modification of the axiom of infinity merely permits the existence of an inductive set. One can see that such an inductive set follows a fairly straightforward pattern, for example, the following set is an inductive set:
ω = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}},...}.
We call the set ω, as above, the formal set of natural numbers because it effectively models “the intuitive idea of counting” p146 [2]. When counting, we begin with the number zero and recursively add the number 1 to obtain the next element. Observe, with ω, we begin with the element ∅ and recursively construct the successor element of x as x ∪ {x}. Also note that this overcomes the difficulty that there are infinitely many natural numbers.
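The recursive construction of successor elements can be mirrored on a computer for the first few elements of ω. The following is a minimal sketch, assuming Python's built-in frozenset as a stand-in for a finite set; the names successor, von_neumann and show are illustrative only and are not taken from the codes in the Appendix.

```python
def successor(x: frozenset) -> frozenset:
    """Return the successor element x ∪ {x}."""
    return x | frozenset([x])


def von_neumann(n: int) -> frozenset:
    """Build the set which models the natural number n, starting from ∅."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


def show(x: frozenset) -> str:
    """Write a finite set in brace notation, smallest elements first."""
    if not x:
        return "∅"
    return "{" + ", ".join(show(e) for e in sorted(x, key=len)) + "}"


for n in range(4):
    print(n, show(von_neumann(n)))
# 0 ∅
# 1 {∅}
# 2 {∅, {∅}}
# 3 {∅, {∅}, {∅, {∅}}}
```

Only finitely many elements can ever be built this way, so the sketch illustrates the pattern of ω rather than ω itself, whose existence rests on the axiom of infinity.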
Before we begin our analysis, we extend our motivation of studying these foundations by returning to the problem of defining the set of natural numbers as the set of all ordinal numbers. Using Von Neumann’s above definition, consider a collection of well-ordered ordinals, ∅, {∅}, {∅, {∅}},.... Observe, if we take the union of these ordinals then we generate a new ordinal, since the well-ordering property is preserved and every element would be the union of all its predecessors. Now suppose there exists a set, N, which is the set of all ordinals: α is an ordinal ⇔ α ∈ N. By the previous remark, since we take the union of all ordinals, N is itself a new ordinal, so that if we ask if N is an element of itself we have an analogy to Russell’s paradox. Observe, N cannot be a member of itself, N ∉ N, by Theorem 0.1 in my previous report; on the other hand N must be a member of itself, N ∈ N, since it is defined as the set of all ordinals. As analysed in my previous report, the ZF axioms do not allow the existence of a set, x, with the property x ∈ x; such a set x would be analogous to the universal set as in Cantor’s paradox. We conclude it is impossible to define the set of all ordinal numbers because the set of all ordinals would itself be a new ordinal which cannot be a member of itself. This raises an interesting question of how to correctly define the set of natural numbers. We will now analyse how Von Neumann’s concept of an inductive set, in particular ω, is a valid model for the set of natural numbers, under the ZF axioms, by proving Peano’s postulates are satisfied.
The unique existence of ω can be proved through the following observations. Consider an arbitrary family of inductive sets, F = {f1, f2,...} where all fi are inductive sets, and consider the intersection Y = ⋂_{f∈F} f; then the set Y is itself an inductive set and is in fact unique for any family of inductive sets F. Since all f ∈ F are inductive sets it follows by definition that,
∅ ∈ f and ∀x(x ∈ f ⇒ x ∪ {x} ∈ f).
Then, by definition of intersection, it immediately follows that Y is also inductive since the elements of Y are the shared elements of all sets f ∈ F. The fact that the intersection of a family of inductive sets is unique requires a little more effort. A proof of uniqueness can be based on the following Lemma:
Lemma 2.1. Let S be an inductive set, B be an inductive subset of S and Y be the intersection of all inductive subsets of S, then Y is a subset of B. That is, let S and B be inductive sets such that B ⊆ S and Y = ⋂{x ⊆ S : x is inductive}, then Y ⊆ B.
Let F1 ⊆ P(S1) and F2 ⊆ P(S2) be families of inductive sets (where S1 and S2 are also inductive sets) and consider their intersections, Y1 = ⋂_{f∈F1} f and Y2 = ⋂_{f∈F2} f. Now, we observe the following:
1. Y1 ∩ S2 is an inductive set, since Y1 and S2 are both inductive sets;
2. Y1 ∩ S2 ⊆ Y1, trivially by definitions of ∩ and ⊆;
3. Y1 ∩ S2 ⊆ S2, again, by definitions of ∩ and ⊆.
Combining 1. and 3. together implies Y2 ⊆ Y1 ∩ S2; this is the crucial step which is justified by Lemma 2.1, since B = Y1 ∩ S2 is an inductive subset of S2 and since Y2 is the intersection of all inductive subsets of S2. Combining this with result 2., we have: Y2 ⊆ Y1 ∩ S2 ⊆ Y1. We follow the exact same reasoning to deduce the converse: Y1 ⊆ Y2 and so conclude Y1 = Y2.
This brings us to our formal definition of the natural numbers in terms of set theory. Let F be an arbitrary family of inductive sets; the formal set of natural numbers is defined as ω = ⋂_{f∈F} f. Thus, in an axiomatic fashion, we have been able to claim the unique existence of ω = {∅, {∅}, {∅, {∅}},...}. We now turn our attention to proving Peano’s postulates are satisfied by ω; we discuss the different interpretation of certain terms in Section 2.1.2.
Theorem 2.1. The formal set of natural numbers, ω, satisfies the following properties:
1. ∅ ∈ ω;
2. ∀x(x ∈ ω ⇒ x ∪ {x} ∈ ω);
3. ∄x : (x ∪ {x} = ∅);
4. ∀x, y ∈ ω(x ∪ {x} = y ∪ {y} ⇒ x = y);
5. (A ⊆ ω ∧ ∅ ∈ A ∧ ∀x(x ∈ A ⇒ x ∪ {x} ∈ A)) ⇒ A = ω.
Proof. Properties 1. and 2. follow by definition of an inductive set. To prove 3. we employ a proof by contradiction, whereby we assume there exists a set x such that, x ∪ {x} = ∅.
Recall the definition of ∅, which satisfies the axiom of the empty set: ∀X(X ∉ ∅). So, by our hypothesis, we have: ∀X(X ∉ x ∪ {x}). Or equivalently, using the definition of ∪, ∀X(X ∉ x ∧ X ∉ {x}). This statement is false, for example consider X = x:
x ∉ x ∧ x ∉ {x}.
This is clearly a contradiction since x ∈ {x} is true for all x. We conclude that the hypothesis used to reach this contradiction is false, so the result is proved: ∄x : (x ∪ {x} = ∅).
Property 4. is the most arduous to prove. We assume the hypothesis and let x, y ∈ ω such that x ∪ {x} = y ∪ {y}, that is, by the axiom of extensionality,
∀z(z ∈ x ∪ {x} ⇔ z ∈ y ∪ {y}).
Using the definition of ∪, this is equivalently written as:
∀z((z ∈ x ∨ z ∈ {x}) ⇔ (z ∈ y ∨ z ∈ {y})). (1)
Note that z ∈ {x} and z ∈ {y} are equivalent to z = x and z = y respectively.
The above hypothesis, statement (1), is true for all z. We prove the result x = y by considering the following two cases separately: firstly, (z = x ∨ z = y) and, secondly, (z ≠ x ∧ z ≠ y). In the first case, if we have z = x then the hypothesis becomes:
(x ∈ x ∨ x ∈ {x}) ⇔ (x ∈ y ∨ x ∈ {y}).
Recall that P ⇔ Q is true if and only if P and Q are both true or are both false. Since x ∈ {x} is true for all x and since this is an ‘if and only if’ statement, the right hand side must also be true: (x ∈ y ∨ x ∈ {y}).
Alternatively, if z = y, then we have:
(y ∈ x ∨ y ∈ {x}) ⇔ (y ∈ y ∨ y ∈ {y}).
By the same reasoning, y ∈ {y} is true and this is an ‘if and only if’ statement, so the left hand side must also be true: (y ∈ x ∨ y ∈ {x}). Therefore, in the first case, when z = x or z = y, we have the following statements are true:
(x ∈ y ∨ x ∈ {y}) and (y ∈ x ∨ y ∈ {x}).
We now examine these statements to exhaust the possible combinations of true and false statements.
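Recall that x ∈ {y} is equivalent to x = y, so the two statements read (x ∈ y ∨ x = y) and (y ∈ x ∨ y = x). First suppose x ∈ y: if y ∈ x, this contradicts the axiom of foundation (applied to the set {x, y}); if y = x, then x ∈ x, which contradicts Theorem 0.1. Now suppose x = y: if y ∈ x, then y ∈ y, which also contradicts Theorem 0.1; if y = x, there is no contradiction.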
By exhaustion of possibilities, we have found that the only case in which we do not have a contradiction is when x = y. That is, if z = x or z = y in (1) then x = y follows. Now we consider the second case, when z ≠ x and z ≠ y; our hypothesis (1) becomes: (z ∈ x ∨ z ∈ {x}) ⇔ (z ∈ y ∨ z ∈ {y}). Since z ≠ x and z ≠ y, this is equivalently: (z ∈ x ⇔ z ∈ y), which is the condition for x = y by the axiom of extensionality. Therefore, by exhausting the possible combinations of z in the hypothesis, we conclude the result x = y always holds.
Finally, for property 5. we assume the hypothesis and let A be an inductive subset of ω. Observe the following:
1. ω ∩ A = ω since ω is defined to be the intersection of any inductive sets and also ω is unique;
2. ω ∩ A ⊆ A this follows directly from the definitions of ∩ and ⊆.
Combining these results, we have ω ∩ A = ω ⊆ A. Since we also have A ⊆ ω, by our hypothesis, we conclude ω = A.
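Properties 1. to 4. of Theorem 2.1 can be checked mechanically on an initial segment of ω. The sketch below is only a finite sanity check under the assumption that Python frozensets stand in for sets; it is not a substitute for the proof, and the helper names successor and first_naturals are hypothetical.

```python
def successor(x: frozenset) -> frozenset:
    """Return the successor element x ∪ {x}."""
    return x | frozenset([x])


def first_naturals(count: int) -> list:
    """Return the first `count` elements of ω as a list: ∅, {∅}, ..."""
    elements = []
    x = frozenset()
    for _ in range(count):
        elements.append(x)
        x = successor(x)
    return elements


naturals = first_naturals(8)
empty = frozenset()

# Property 1: the empty set is the first element.
prop1 = naturals[0] == empty
# Property 2: each successor is the next element of the list.
prop2 = all(successor(naturals[i]) == naturals[i + 1] for i in range(7))
# Property 3: no successor is the empty set.
prop3 = all(successor(x) != empty for x in naturals)
# Property 4: equal successors force equal elements.
prop4 = all(x == y
            for x in naturals for y in naturals
            if successor(x) == successor(y))

print(prop1, prop2, prop3, prop4)  # True True True True
```

Property 5. quantifies over all inductive subsets of ω and so cannot be tested on a finite segment in this way.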
2.1.2 Philosophical and Logical Implications
We have now achieved a significant result whereby we can claim the unique existence of the natural numbers, as based on ZF set theory. We may take a different interpretation of set theory terms to effectively model the natural numbers. Observe, the set ∅ may represent the number 0 and the successor element x ∪ {x} of x may represent the successor element n + 1 of n. Therefore, results 1. - 5. in Theorem 2.1 become translated into the form of Peano’s postulates and, by analysing Von Neumann’s theory of inductive sets, we have proved that ω indeed satisfies these postulates. Thus, under these new interpretations, the set of natural numbers, ω, exists and is unique.
In an abstract sense, as Greenberg p72 [3] discusses, a model is simply a different way of interpreting the undefined terms in a mathematical theory. The correct results which we are able to prove in set theory are regarded as axioms, neither correct nor incorrect, which characterise the newly interpreted terms of number theory. We have been able to prove the results in Theorem 2.1 as based on the ZF axioms and so our model for the natural numbers is a valid one. However, as discussed in my previous report, we do not regard axioms as ‘correct’ or ‘self-evident truths’, but rather, as primitive assumptions about a mathematical theory. With respect to the newly interpreted terms, the results in Theorem 2.1 are Peano’s postulates and act as axioms which characterise the natural numbers. In this sense, one can interpret that “ZF ‘contains’ N” p121 [6] and, Hamilton continues, “when we discuss ZF we have, so to speak, reached the end of the line”. This underlines the philosophy of our work: the ZF axioms require metamathematical concepts, an appreciation of terms such as ‘set’ and ‘element’ and so on. On the concept of using ZF set theory as a base to deduce new mathematical theories and results, Potter writes, “in order to believe a result whose proof uses set-theoretical methods one must believe something about the conception of sets on which those methods are based” p132 [9]. This is the philosophy of analysing the foundations of numbers from set theory. In particular with Von Neumann’s approach, the axiom of infinity is crucial in order to accept the existence of the natural numbers.
Modelling the natural numbers in terms of set theory also has important logical implications, namely regarding consistency and completeness. Though a rigorous study of logic is outside the scope of this report, we address these issues here because they are relevant to topics discussed in my previous report. If ZF is consistent, then the set of natural numbers, based on Peano’s postulates, also forms a consistent system; this is a sophisticated result in logic, of which Hamilton p122 [6] sketches a proof. Conversely, if a logical inconsistency is found in the set of natural numbers then this would reveal an inconsistency with ZF set theory. However, no such logical problem has been found and so set theory appears to offer an excellent foundation on which to base a theory of numbers. Reflecting on the construction of the natural numbers from ZF set theory and then the real numbers from the natural numbers, Bloch writes, “Hence, if we accept the ZF axioms, then we have at our disposal the familiar number systems with their standard properties, and a variety of branches of mathematics [such as real analysis and Euclidean geometry], all constructed completely rigorously.” p119-120 [13].
There is a second logical matter which should be addressed: do Peano’s postulates form a complete mathematical system? That is, can every statement which can be formulated in terms of numbers be either proved or disproved? In short, the answer is no: we know there exist statements whose truth or falsity cannot be decided by Peano’s postulates because of Kurt Gödel’s Incompleteness Theorems. Gödel’s Incompleteness Theorems tell us that no consistent, effectively axiomatised system which is able to express elementary arithmetic can be complete. Therefore, assuming they are consistent, neither the ZF axioms nor Peano’s postulates are complete. In Section 2.1.4 we will further discuss the incompleteness of the ZF axioms and, generally, mathematical theory as a whole.
2.1.3 Induction and Ordering of the Natural Numbers
In the previous Sections we have focussed on theoretical analysis concerning the existence of the natural numbers under set theory. We now prove some important properties of the natural numbers, namely the induction principle and some ordering properties. These are fundamental properties which a valid construction of the natural numbers should be able to prove. The induction principle is crucial to proving many results about the natural numbers and is at the heart of how arithmetic can be defined. Intuitively, we can see Von Neumann’s approach somewhat lends itself to the induction principle, since ω is defined recursively in terms of successor elements. In fact, we have already proved the induction principle in property 5. of Theorem 2.1, however not in the formal statement as below:
Theorem (Mathematical Induction). Let L(x) be a logical property whose truth or falsity is determined by the set x. If L(x) is true for x = ∅, and from the assumption that L(x) is true we can prove that L(x ∪ {x}) is true, then L(x) is true for all x in ω.
(L(∅) ∧ ∀x(L(x) ⇒ L(x ∪ {x}))) ⇒ ∀x(x ∈ ω ⇒ L(x)).
Consider the subset A of ω such that a ∈ A if and only if L(a) is true. By the hypothesis given about the logical property L(x), it is clear that A is an inductive set. Therefore, by property 5. in Theorem 2.1, A = ω. Thus, Peano’s fifth postulate, in our previous list, is effectively a version of mathematical induction. It is in the nature of how Von Neumann’s inductive set models the natural numbers, by using a recursion pattern, that the induction principle of the natural numbers comes so easily.
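To illustrate the two hypotheses of the induction principle, the base case L(∅) and the step L(x) ⇒ L(x ∪ {x}), the following sketch tests them for one example property on the first few elements of ω. The property chosen here, that every element of x is also a subset of x, is an assumption made purely for illustration, as are the names successor and is_transitive; a finite check of this kind does not replace the induction proof.

```python
def successor(x: frozenset) -> frozenset:
    """Return the successor element x ∪ {x}."""
    return x | frozenset([x])


def is_transitive(x: frozenset) -> bool:
    """Example property L(x): every element of x is also a subset of x."""
    return all(element <= x for element in x)


# Base case: L(∅) holds vacuously, since ∅ has no elements.
base_case = is_transitive(frozenset())

# Inductive step, checked on the first eight elements of ω:
# whenever L(x) holds, L(x ∪ {x}) must hold as well.
step_holds = True
x = frozenset()
for _ in range(8):
    if is_transitive(x) and not is_transitive(successor(x)):
        step_holds = False
    x = successor(x)

print(base_case, step_holds)  # True True
```

With both hypotheses established in general, the theorem would give L(x) for every x in ω; the code only confirms them on a finite initial segment.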
We now turn our attention to the ordering of the natural numbers, namely the trichotomous property and the well-ordering principle. The trichotomous property of order is fundamental to the abstract concept of ‘size’ and the well-ordering property is “the fundamental notion ... upon which the concept of ordinal number depends” p119 [4]. We can intuitively see that Von Neumann’s approach maintains some order because every element of ω is uniquely related to all of its predecessors, specifically every element is the union of all its predecessors. What we mean by order is that we can define a relation, say ≥, on a set, N × N, so that the truth of a ≥ b depends on the order of the elements in the pair (a, b) ∈ N × N. For example, to establish an ordering of the natural numbers, Stewart and Tall define the relation ≥ by:
m ≥ n ⇔ ∃r ∈ N : m = r + n p155 [2]
where m, n ∈ N and ‘+’ means addition (we will analyse arithmetic in Section 2.2.1). Here, the truth or falsity of a ≥ b is strictly determined by the order (a, b) or (b, a), assuming a and b are distinct, a ≠ b. While discussing axioms for simple order, a term which Wilder uses to describe ordering is ‘precedes’. Following Wilder’s discussion, given two distinct points p and q in a set we may say either p ‘precedes’ q or q ‘precedes’ p, but the two statements cannot both be true. As we should expect from the philosophical results of my previous report, Wilder continues to write, “... precedes is adopted as an undefined technical term.” p46 [4].
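The relation m ≥ n ⇔ ∃r ∈ N : m = r + n can be evaluated directly for small natural numbers. The sketch below assumes Python's built-in non-negative integers and their addition as a stand-in for N and ‘+’; the function name geq is hypothetical. Since r + n ≤ m forces r ≤ m, it is enough to search r in the range 0 to m.

```python
def geq(m: int, n: int) -> bool:
    """Decide m ≥ n by searching for a natural number r with m = r + n."""
    if m < 0 or n < 0:
        raise ValueError("m and n must be natural numbers")
    return any(m == r + n for r in range(m + 1))


print(geq(5, 3))  # True, witnessed by r = 2
print(geq(3, 5))  # False, no natural number r satisfies 3 = r + 5
print(geq(4, 4))  # True, witnessed by r = 0
```

For distinct a and b exactly one of geq(a, b) and geq(b, a) is true, which is the sense in which the truth of the relation depends on the order of the pair.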
In order to prove that the formal set of natural numbers, ω, satisfies these properties of order, we use the set-theoretical concept of a proper subset, which is defined below.
Definition 2.1. A proper subset, x, of a set y is defined such that it satisfies:
x ⊂ y ≡ x ⊆ y ∧ x ≠ y ≡ ∀a(a ∈ x ⇒ a ∈ y) ∧ ∃a(a ∈ y ∧ a ∉ x).
We note that, traditionally, Von Neumann’s approach is to use the ordering relation of membership, ∈. Despite this, our approach is equivalently valid because of the following result, which is provable by induction and definition of subset:
Proposition 2.1. For all x, y ∈ ω, x ∈ y ⇔ x ⊂ y.
Our proofs of the order properties rely on various other results which we have formally stated below as Lemmas. Although these Lemmas are important, we have omitted the full details of their proofs since these are lengthy and would distract from our report. The results in these Lemmas can be proved based on Peano’s postulates (Theorem 2.1) and the Induction principle.
Lemma 2.2. Let x, y ∈ ω, then we have x ⊂ y ⇒ x ∪ {x} ⊂ y ∪ {y}.
Lemma 2.3. Let x ∈ ω, then we have x ⊂ x ∪ {x}.
Theorem (Trichotomy of the order property ⊂). Let x, y ∈ ω, then exactly one of the following holds: x ⊂ y, x = y or y ⊂ x.
Proof. We first show that at most one of the above statements can be true. If x ⊂ y, then x = y is false by Definition 2.1 of ⊂. We would also have that y ⊂ x is false, since ∃a(a ∈ y ∧ a ∉ x). By symmetry of x and y, if we have y ⊂ x then we again conclude the other two statements are false. Finally, if we have x = y then x ⊂ y and y ⊂ x would both be false, since x ⊂ x would require x ≠ x, which is false.
Now, for the second part of this proof we show that at least one of the statements always holds. We use induction on y. Let x ∈ ω and y = ∅. By definition of the empty set, x ⊂ ∅ is false. If we have x = ∅ then we are finished. Alternatively, if x ≠ ∅ then ∅ ⊂ x is true: a ∈ ∅ is false for every a, so ∀a(a ∈ ∅ ⇒ a ∈ x) holds vacuously, and hence ∀a(a ∈ ∅ ⇒ a ∈ x) ∧ x ≠ ∅ holds.
For the next step of induction we let y = z ∈ ω and assume exactly one of the following holds: x ⊂ z, x = z or z ⊂ x. We now consider y = z ∪ {z}. If we have x ⊂ z or x = z then x ⊂ z ∪ {z} immediately follows, by Lemma 2.3 since z ⊂ z ∪ {z}. Alternatively, if we have z ⊂ x,
⇒ z ∪ {z} ⊂ x ∪ {x} by Lemma 2.2 ⇒ z ∪ {z} ⊂ x ∨ z ∪ {z} = x.
This completes the induction proof and so we conclude ∀x, y ∈ ω exactly one of the following statements holds: x ⊂ y, x = y or y ⊂ x.
This is a significant result as it shows the trichotomous property of the order relation < on the natural numbers, commonly taken as an axiom of order. Observe that in our set-theoretical model, the relation ⊂ can be equivalently interpreted as <, so that the above theorem can be translated into more-familiar language as: let m and n be natural numbers, then exactly one of m < n, n < m or m = n holds.
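As a small computational illustration of the theorem (not a substitute for the proof), the following Python sketch models the first few elements of ω as nested frozensets and checks that exactly one of the three statements holds for each pair. The helper names `successor`, `numeral` and `holding_statements` are our own illustrative choices, and Python's `<` on frozensets is used as the proper-subset relation ⊂.

```python
def successor(x: frozenset) -> frozenset:
    """Return x ∪ {x}, the Von Neumann successor of x."""
    return x | frozenset([x])


def numeral(n: int) -> frozenset:
    """Build the set that models the natural number n (illustrative helper)."""
    x = frozenset()  # the empty set models 0
    for _ in range(n):
        x = successor(x)
    return x


def holding_statements(x: frozenset, y: frozenset) -> list:
    """List which of x ⊂ y, x = y, y ⊂ x hold; `<` on frozensets is proper subset."""
    checks = {"x ⊂ y": x < y, "x = y": x == y, "y ⊂ x": y < x}
    return [name for name, holds in checks.items() if holds]


naturals = [numeral(n) for n in range(8)]
# Exactly one statement should hold for every pair drawn from 0..7.
assert all(len(holding_statements(x, y)) == 1 for x in naturals for y in naturals)
print(holding_statements(numeral(2), numeral(5)))
```

The check only covers finitely many elements, so it illustrates the statement rather than establishing it; the induction argument above is what covers all of ω.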
A second important ordering property of the natural numbers is the well-ordering principle. We will see in Section 2.2.2 how the well-ordering property is not possessed by all sets of numbers, for example rational numbers. To prove the well-ordering principle we make use of the following Lemma, whose proof we omit, in order to keep focus on more-significant results.
Lemma 2.4. Let x, y ∈ ω, then we have x ⊂ y ⇒ x ∪ {x} ⊆ y.
Definition 2.2. Let X be a non empty set. A least element of X, denoted as x, is defined such that x satisfies: (x ∈ X) ∧ ∀y(y ∈ X ⇒ x ⊆ y).
It is easy to see that a least element of a set, if it exists, must be unique. However, this remark is worth a formal proof as the least element of a set is an important concept for the well-ordering principle.
Proposition 2.2. Let X be a non empty set, then X has at most one least element.
Proof. Suppose there are two elements of X, x1 and x2, which respectively satisfy:
∀y(y ∈ X ⇒ x1 ⊆ y) and ∀y(y ∈ X ⇒ x2 ⊆ y).
Since x2 ∈ X, we have x1 ⊆ x2. Similarly, x1 ∈ X and so x2 ⊆ x1. We conclude x1 = x2.
Theorem (Well-Ordering). Let X be a non empty subset of ω, then X has a least element.
Proof. Our proof follows the same fundamental concepts as Stewart and Tall’s proof of this theorem p163 [2]. However, we keep with our set-theoretical foundations and use the subset property ⊆ in place of the relation ≤. With the intention of deducing a contradiction, we suppose that a least element does not exist and define the set Y by:
∀y(y ∈ Y ⇔ (y ∈ ω ∧ y∈ / X)).
By the axiom of subsets, this is a valid construction of a set, Y ⊆ ω. Furthermore, note that we have X ∪ Y = ω and X ∩ Y = ∅.
Now we use induction. Firstly, we see ∅ ∈ Y , since if ∅ ∈ X then ∅ would be a least element of X (∀a(∅ ⊆ a) holds by definition of the empty set). As the next step, we let z ∈ Y and consider the element z ∪ {z}. Observe, by our assumption that X has no least element we have, ∀y(y ⊆ z ⇒ y ∈ Y ). By definition of Y , this is equivalently:
∀y(y ⊆ z ⇒ y∈ / X).
Using simple rules of logic, this can be rewritten as: ∀y(y ∈ X ⇒ z ⊂ y). So if we consider z ∪ {z} ∈ X then z ∪ {z} would be a least element of X, since (z ⊂ y ⇒ z ∪ {z} ⊆ y) by Lemma 2.4.
This would contradict our assumption that X has no least element, so our only option is to conclude z∪{z} ∈ Y . We are now finished since, by induction, this implies Y = ω, or equivalently X = ∅. This contradicts our hypothesis that X is a non empty set. Therefore, we conclude every non empty subset of ω has a least element.
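To make Definition 2.2 concrete, the sketch below searches a finite subset of ω for elements x satisfying x ⊆ y for every y in the subset. It is only an illustration on a finite sample: `numeral` and `least_elements` are hypothetical helper names, and Python's `<=` on frozensets plays the role of ⊆.

```python
def numeral(n: int) -> frozenset:
    """Build the Von Neumann set that models n (illustrative helper)."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset([x])  # successor x ∪ {x}
    return x


def least_elements(X: set) -> list:
    """Return every x in X with x ⊆ y for all y in X (Definition 2.2)."""
    return [x for x in X if all(x <= y for y in X)]


X = {numeral(n) for n in (5, 3, 9)}
found = least_elements(X)
# Proposition 2.2 says there is at most one such element; here it models 3.
assert len(found) == 1
assert found[0] == numeral(3)
```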
2.1.4 Reflections and the Infinitude of the Natural Numbers
In this Section we discuss Cantor’s approach of defining ordinals and reflect on Von Neumann’s approach, in comparison. We then finish our analysis of the natural numbers by examining how we should deal with their infinitude. As Dalen p15 [5] discusses, in the naïve phase of set theory, Cantor’s approach was to generate new ordinal numbers by adding 1 recursively and by adding previously obtained ordinals together. Consider a set of ordinals N = {0, 1, 2, 3, . . .}, where all elements satisfy the trichotomy property with respect to some order relation <, that is, 0 < 1 < 2 < 3 < . . .. Cantor supposed that there exists some ordinal, ω0 ∈ N, such that for all n ∈ N we have n < ω0, that is, 0 < 1 < 2 < 3 < . . . < ω0. Of course, we can count beyond ω0, for example
ω0 < ω0 + 1 < ω0 + 2 < . . . < ω0 + ω0 < . . . < ω0^{ω0} < . . . < ω0^{ω0^{ω0^{ω0}}} < ω0^{ω0^{ω0^{ω0}}} + 1 < . . ..
The concept behind ω0 is that it is some large natural number which is greater than all other natural numbers we have already thought of, “whose existence we had not expected” p182 [8]. It is important to note that we do not regard ω0 as infinity; instead, ω0 is the first countably-infinite ordinal. Faticoni writes, “ω0 is not a number. It is simply another element in a larger set” p172 [8].
Von Neumann based his definition of an ordinal number on stronger set-theoretic grounds involving the well-ordering principle; we have already discussed his formal definition in Section 2.1.1. Observe that every ordinal is a subset of ω, by construction, which is the same concept as before when we consider that every natural number is less than Cantor’s ω0. Furthermore, ω is itself something like an ordinal, because we have proved it is well-ordered and, by construction, every element is the set of all its predecessors. However, there does not exist an ordinal α ⊂ ω such that α ∪ {α} = ω because, by Proposition 2.1, α ⊂ ω ⇔ α ∈ ω, which implies that α ∪ {α} ∈ ω by definition of an inductive set. Hence, if there existed an element α ⊂ ω such that α ∪ {α} = ω then we would have ω ∈ ω, which contradicts the ZF axioms, as proved in my previous report, Theorem 0.1. This is analogous to Cantor’s ω0, as we consider that there is no number n such that n + 1 = ω0. We do not regard ω0 or ω as infinity; instead, they are merely elements of a larger set. Von Neumann’s ω gives a very conceptually-neat way of dealing with the infinity of natural numbers in terms of axiomatic set theory and, essentially, matches the same key ideas which Cantor had developed.
We finish our analysis of the foundations of the natural numbers by discussing how we deal with the cardinality of N, that is, how to understand the infinitude of the natural numbers. Theoretically, every natural number is ‘countable’ in the sense that we can begin at zero and recursively add one until we reach that number. However, Peano’s second postulate shows that we cannot simply write down a full list of all of the natural numbers. In short, we say that there are countably infinitely many natural numbers and we interpret card(N) ≡ card(ω) = ℵ0 to represent the smallest, ‘countable’, infinity. The abstract concept for a set to be countable is that there is a logical procedure to write a list which would incorporate every element of that set, even if only theoretically when the set is infinite. We define that the cardinality of two sets, X and Y, is the same if there exists a bijective function f : X → Y. Returning to the question in the Introduction and Context Section, according to this mathematical definition, the cardinality of N and the cardinality of N\{1} are the same because we can define a bijective function f : N → N\{1} given by
f(n) = n, if n = 0,
f(n) = n + 1, if n > 0.
We note that if a set has a greater cardinality than ℵ0 then it is said to be uncountable, which stands to reason because ℵ0 is defined to be the smallest infinity.
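The bijection between N and N\{1} can be checked on an initial segment with a short Python sketch. The inverse function `f_inverse` is our own addition for illustration; exhibiting an inverse is one standard way to confirm that a map is bijective.

```python
def f(n: int) -> int:
    """Map N to N \\ {1}: fix 0 and shift every positive number up by one."""
    return n if n == 0 else n + 1


def f_inverse(m: int) -> int:
    """Undo f on N \\ {1} (illustrative helper, not part of the source)."""
    if m == 1:
        raise ValueError("1 is not in the codomain N \\ {1}")
    return m if m == 0 else m - 1


images = [f(n) for n in range(1000)]
assert 1 not in images                      # f never hits the removed element
assert len(set(images)) == len(images)      # f is injective on this segment
assert all(f_inverse(f(n)) == n for n in range(1000))
```

Only finitely many values are tested here, so the sketch supports rather than proves the claim for all of N.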
Although it is beyond the scope of this report to analyse infinite cardinals, we wish to
address relevant issues which demonstrate the incompleteness of mathematics and put forward curious concepts which can be studied further as future research. The axiom of choice and the continuum hypothesis are two mathematical statements which can be formulated in terms of set theory but cannot be proved as theorems from the ZF axioms, nor can they be disproved. For completeness, we formally state them below:
(AC) For any non-empty set x there is a set y which has precisely one element in common with each member of x.
(CH) Each infinite set of real numbers is either countable or has the same cardinal number as the set of all real numbers.
We have taken these statements as found from Hamilton p120 [6]. As Faticoni discusses, p185-6 [8], it is possible to prove that every cardinal, ℵ, has a successor cardinal ℵ⁺ and that the continuum hypothesis can be formulated to read: ℵ1 = card(R) (where ℵ1 ≡ ℵ0⁺). The cardinality of the real numbers puts forward a new concept of being uncountable, since we can prove card(N) < card(R), which introduces the idea of having more than one type of infinity: countable and uncountable. Recall, from Cantor’s theorem, proved in my previous report, that the cardinality of the power set, P(S), of a set, S, is strictly greater than the cardinality of that set. This leads us to a more-general concept of having infinitely many infinities. Observe the following:
card(N) < card(P(N)) < card(P(P(N))) < card(P(P(P(N)))) < . . .
as noted by Faticoni p141 [8]. In fact, Paul J. Cohen proved that we cannot prove or disprove the statement ℵ1 = card(R) and furthermore Kurt Gödel proved that we can add the axiom of choice and continuum hypothesis to the list of ZF axioms without causing an inconsistency. This demonstrates the incompleteness of mathematics, which is essentially the meaning behind Kurt Gödel’s Incompleteness Theorems.
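The proof of Cantor’s theorem is deferred to the previous report, but the standard diagonal idea behind it can be illustrated on a small finite set. In the Python sketch below, which assumes that standard argument, every map g : S → P(S) is enumerated and the set D = {s ∈ S : s ∉ g(s)} is shown to be missed by g. The names `power_set` and `diagonal_set` are hypothetical helpers.

```python
from itertools import product


def power_set(S: frozenset) -> list:
    """All subsets of S, built by choosing in/out for each element."""
    elements = sorted(S)
    return [frozenset(e for e, keep in zip(elements, mask) if keep)
            for mask in product([False, True], repeat=len(elements))]


def diagonal_set(S: frozenset, g: dict) -> frozenset:
    """Return D = {s in S : s not in g(s)}, which no g(s) can equal."""
    return frozenset(s for s in S if s not in g[s])


S = frozenset({0, 1, 2})
subsets = power_set(S)
# Try every map g : S -> P(S); none of them reaches its own diagonal set,
# so no such map is surjective.
for images in product(subsets, repeat=len(S)):
    g = dict(zip(sorted(S), images))
    assert diagonal_set(S, g) not in g.values()
```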
2.2 Extending the Natural Numbers
In this Section we study how to define arithmetic, addition and multiplication, and how to construct new sets of numbers, namely the integers and rationals. In order to formally define arithmetic and new sets of numbers we make use of the abstract concepts of equivalence classes, Cartesian products, functions and relations. Most of the results here are elementary to prove and are based on the principle of induction, Peano’s fifth postulate, but they require a great deal of space. Rather than deriving every result in full detail, our objective is to outline the abstract techniques involved in establishing these foundations so that we gain a deeper understanding of numbers.
2.2.1 Arithmetic
At the foundations of the natural numbers we have seen that there is an underlying fundamental concept of having a starting element ∅, ‘zero’, and for every element, x, having a successor element x ∪ {x}. This recursive design lends itself to dealing with arithmetic in a recursive manner, as Hamilton writes [about addition and multiplication], “These functions can be defined in terms of the successor function, using the induction
principle” p114 [6]. Potter p72 [9], Nievergelt p166 [7] and Tourlakis p244 [14] define arithmetic in such a recursive way, similar to our definition below.
Consider the function f : ω × ω → ω given by:
f(x, ∅) = x and f(x, y ∪ {y}) = f(x, y) ∪ {f(x, y)}.
Then for any two natural numbers x, y ∈ ω we notice that f(x, y) corresponds to the addition x + y, with which we are more familiar. Although this is a moderately simple definition of a function, it is powerful enough to derive all the properties of addition which we expect. In particular, we should be primarily concerned to verify that the normal axioms of addition, namely the commutative and associative properties, are provable. Indeed they are. For shorthand notation, we use x+ to represent x ∪ {x}. Observe the following results:
Lemma 2.5. Let x, y ∈ ω, then we have f(x, y+) = f(x+, y).
This Lemma is easily provable by taking x ∈ ω and using induction on y. We omit the full details, in order to save space and focus on the following results. In addition, we note that ∅ is an identity element with respect to f: f(x, ∅) ≡ f(∅, x) = x; this is also provable using induction on x.
Theorem 2.2. Addition is commutative: f(x, y) = f(y, x).
Proof. Let x ∈ ω and take induction on y. For y = ∅ we have f(x, ∅) = x and f(∅, x) = x since, as previously noted, ∅ is an identity element of f. Hence, f(x, y) = f(y, x) holds for y = ∅. For the second part of the proof we assume the result holds for some y ∈ ω and consider y+:
f(x, y+) = f(x, y)+ by definition of f,
= f(y, x)+ by assumption,
= f(y, x+) by definition of f,
= f(y+, x) by Lemma 2.5.
Therefore, by induction, we conclude for all x, y ∈ ω, f(x, y) = f(y, x) and so addition is commutative: x + y = y + x.
Theorem 2.3. Addition is associative: f(f(x, y), z) = f(x, f(y, z)).
Proof. Let x, y ∈ ω and take induction on z. For z = ∅ we have f(f(x, y), ∅) = f(x, y) and also, f(x, f(y, ∅)) = f(x, y), both results follow by definition of f. Hence, f(f(x, y), z) = f(x, f(y, z)) holds for z = ∅. For the second part of the proof we assume the result holds for some z ∈ ω and consider z+:
f(f(x, y), z+) = f(f(x, y), z)+ by definition of f,
= f(x, f(y, z))+ by assumption,
= f(x, f(y, z)+) by definition of f,
= f(x, f(y, z+)) by definition of f.
Therefore, by induction, we conclude for all x, y, z ∈ ω, f(f(x, y), z) = f(x, f(y, z)) and so addition is associative: (x + y) + z = x + (y + z).
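The recursive definition of f can be run directly on set-theoretic numerals. The following Python sketch is one possible encoding, assuming frozensets as a stand-in for sets; `predecessor` (which recovers y from y⁺ as its largest element) and `numeral` are illustrative helpers of our own. It then spot-checks Theorems 2.2 and 2.3 on small values.

```python
EMPTY = frozenset()


def successor(x: frozenset) -> frozenset:
    """Return x⁺ = x ∪ {x}."""
    return x | frozenset([x])


def predecessor(y: frozenset) -> frozenset:
    """For y = p ∪ {p}, return p: the element of y with the most members."""
    return max(y, key=len)


def numeral(n: int) -> frozenset:
    """Build the set that models n (illustrative helper)."""
    x = EMPTY
    for _ in range(n):
        x = successor(x)
    return x


def f(x: frozenset, y: frozenset) -> frozenset:
    """Addition: f(x, ∅) = x and f(x, y⁺) = f(x, y)⁺."""
    if y == EMPTY:
        return x
    return successor(f(x, predecessor(y)))


nums = [numeral(n) for n in range(5)]
assert f(numeral(2), numeral(3)) == numeral(5)
assert all(f(x, y) == f(y, x) for x in nums for y in nums)
assert all(f(f(x, y), z) == f(x, f(y, z)) for x in nums for y in nums for z in nums)
```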
In order to define multiplication, consider the function g : ω × ω → ω given recursively by:
g(x, ∅) = ∅ and g(x, y ∪ {y}) = f(g(x, y), x).
Then for any two numbers x, y ∈ ω we notice that g(x, y) corresponds to the multiplication x × y. Following similar reasoning as with addition, we are mainly concerned to verify that the normal axioms of multiplication, the commutative and associative properties, are provable. In addition, we require that the distributive axiom, of multiplication over addition, is provable. We continue to use x+ in place of x ∪ {x} and deduce the following results. Firstly, we note it is provable by induction on x that ∅+ is an identity element with respect to g, that is g(x, ∅+) ≡ g(∅+, x) = x, and also g(x, ∅) ≡ g(∅, x) = ∅.
Theorem 2.4. Multiplication is distributive over addition: g(f(x, y), z) = f(g(x, z), g(y, z)).
Proof. Let x, y ∈ ω and take induction on z. For z = ∅ we have g(f(x, y), ∅) = ∅ and f(g(x, ∅), g(y, ∅)) = f(∅, ∅) = ∅ both by definitions of f and g. Hence, g(f(x, y), z) = f(g(x, z), g(y, z)) holds for z = ∅. We now assume the result holds for some z ∈ ω and consider z+:
g(f(x, y), z+) = f(g(f(x, y), z), f(x, y)) by definition of g,
= f(f(g(x, z), g(y, z)), f(x, y)) by assumption,
= f(f(g(x, z), x), f(g(y, z), y)) by the properties of addition,
= f(g(x, z+), g(y, z+)) by definition of g.
Therefore, by induction, we conclude for all x, y, z ∈ ω we have g(f(x, y), z) = f(g(x, z), g(y, z)) and so multiplication is distributive over addition: (x + y) × z = x × z + y × z.
Theorem 2.5. Multiplication is commutative: g(x, y) = g(y, x).
Proof. Let x ∈ ω and use induction on y. For y = ∅ we have g(x, ∅) = ∅ and g(∅, x) = ∅ as previously noted. Hence, g(x, y) = g(y, x) holds for y = ∅. For the second part of the proof we assume the result holds for some y ∈ ω and consider y+:
g(x, y+) = f(g(x, y), x) by definition of g,
= f(g(y, x), x) by assumption,
= f(g(y, x), g(∅+, x)) since, as noted, ∅+ is an identity element with respect to g,
= g(f(y, ∅+), x) by the distributive property,
= g(y+, x) by definition of f.
Therefore, by induction, we conclude for all x, y ∈ ω we have g(x, y) = g(y, x) and so multiplication is commutative: x × y = y × x.
Corollary 2.1. As an immediate consequence of the distributive property and commutative property of multiplication, we also have g(x, f(y, z)) = f(g(x, y), g(x, z)), which is a slightly different form of the distributive property.
Theorem 2.6. Multiplication is associative: g(g(x, y), z) = g(x, g(y, z)).
Proof. Let x, y ∈ ω and use induction on z. For z = ∅, g(g(x, y), ∅) = ∅ and g(x, g(y, ∅)) = g(x, ∅) = ∅ by definition of g. We now assume the result g(g(x, y), z) = g(x, g(y, z)) holds for some z ∈ ω and consider z+:
g(g(x, y), z+) = f(g(g(x, y), z), g(x, y)) by definition of g,
= f(g(x, g(y, z)), g(x, y)) by assumption,
= g(x, f(g(y, z), y)) by Corollary 2.1,
= g(x, g(y, z+)) by definition of g.
Therefore, by induction, we conclude for all x, y, z ∈ ω we have g(g(x, y), z) = g(x, g(y, z)) and so multiplication is associative: (x × y) × z = x × (y × z).
These results show how the arithmetic properties of the natural numbers can be proved using abstract definitions of functions to represent addition and multiplication. These results are commonly assumed as ‘the axioms of arithmetic’, which we have shown to be redundant because they can be derived.
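The recursive definition of g can be exercised in the same way as f. The Python sketch below is a self-contained illustration under the same assumptions as before (frozensets standing in for sets, with `predecessor` and `numeral` as hypothetical helpers); it spot-checks the identity element ∅⁺ and Theorems 2.4 to 2.6 on small values only.

```python
EMPTY = frozenset()


def successor(x: frozenset) -> frozenset:
    return x | frozenset([x])            # x⁺ = x ∪ {x}


def predecessor(y: frozenset) -> frozenset:
    return max(y, key=len)               # the p with y = p ∪ {p}


def numeral(n: int) -> frozenset:
    x = EMPTY
    for _ in range(n):
        x = successor(x)
    return x


def f(x: frozenset, y: frozenset) -> frozenset:
    """Addition: f(x, ∅) = x and f(x, y⁺) = f(x, y)⁺."""
    return x if y == EMPTY else successor(f(x, predecessor(y)))


def g(x: frozenset, y: frozenset) -> frozenset:
    """Multiplication: g(x, ∅) = ∅ and g(x, y⁺) = f(g(x, y), x)."""
    return EMPTY if y == EMPTY else f(g(x, predecessor(y)), x)


small = [numeral(n) for n in range(3)]
one = successor(EMPTY)
assert g(numeral(3), numeral(4)) == numeral(12)
assert all(g(x, one) == x and g(one, x) == x for x in small)          # identity ∅⁺
assert all(g(x, y) == g(y, x) for x in small for y in small)          # Theorem 2.5
assert all(g(f(x, y), z) == f(g(x, z), g(y, z))
           for x in small for y in small for z in small)              # Theorem 2.4
assert all(g(g(x, y), z) == g(x, g(y, z))
           for x in small for y in small for z in small)              # Theorem 2.6
```

The values are kept small because comparing deeply nested frozensets becomes expensive quickly; the sketch is meant to mirror the definitions, not to be an efficient arithmetic.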
2.2.2 The Sets of Integer and Rational Numbers
In principle, to construct the set of integer numbers we should formulate a list of definitive axioms and construct a model using the natural numbers which satisfies the properties in these axioms. We should then suitably define arithmetic functions and an ordering relation in order to verify that the normal rules of arithmetic and order hold. The procedure should then be repeated in order to construct the set of rational numbers. As Stewart and Tall comment about a similar issue, “the whole process makes a mountain out of a molehill” p179 [2]. Rather than derive every result in full detail, we detail the methods involved and give heuristic arguments to show they are indeed valid by satisfying various characteristic properties. Essentially, the set of integer numbers is an extension of the natural numbers which satisfies the following property: for every integer, m ∈ Z, there is another integer, n ∈ Z, such that m + n = 0. This is a crucial property of the integer numbers which a valid model must be able to deduce.
The construction of the integers is based on the concept of interpreting pairs of natural numbers as integers. Specifically, as Stewart and Tall demonstrate p178 [2], we define the relation R on N × N given by: (a, b)R(c, d) ⇔ a + d = b + c. Note that this is a valid construction because we have already defined the addition of two natural numbers. It can be verified with little effort that R is indeed an equivalence relation and so we can define the equivalence class of an element (a, b) ∈ N × N with respect to R as: [(a, b)]_R = {(c, d) ∈ N × N : (a, b)R(c, d)}. Now we may define the set of integers immediately: Z = ⋃_{(a,b)∈N×N} [(a, b)]_R, so that every element of Z is a pair (a, b) ∈ N × N. Observe that the relation R is equivalent to a + d = b + c ≡ a − b = c − d, so that we may interpret every ordered pair (a, b) ∈ Z as our intuitive notion of subtraction, a − b. For example, the element (1, 0) ∈ Z corresponds to 1 and the element (0, 1) ∈ Z corresponds to −1. By defining the addition
of integers as (a, b) + (c, d) = (a + c, b + d), it is easy to observe that Z satisfies the additive-inverse property: for all (a, b) ∈ Z we have (a, b) + (b, a) = (a + b, b + a). Since the addition of natural numbers is commutative, a + b = b + a, and also any pair of the form (c, c) effectively models the number 0, (a, b) + (b, a) is 0.
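The pair model of the integers can be sketched in a few lines of Python. Here Python tuples of non-negative ints stand in for elements of N × N, and the function names `equivalent`, `add` and `negate` are our own illustrative choices; only addition of natural numbers is used, as in the construction above.

```python
def equivalent(p: tuple, q: tuple) -> bool:
    """(a, b) R (c, d) ⇔ a + d = b + c, using only addition of naturals."""
    (a, b), (c, d) = p, q
    return a + d == b + c


def add(p: tuple, q: tuple) -> tuple:
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)


def negate(p: tuple) -> tuple:
    """Swap the components: (b, a) is the additive inverse of (a, b)."""
    a, b = p
    return (b, a)


zero = (0, 0)
assert equivalent((1, 0), (5, 4))                 # both model the integer 1
assert not equivalent((1, 0), (0, 1))             # 1 and −1 are distinct
assert equivalent(add((3, 7), negate((3, 7))), zero)
# Addition respects R: equivalent inputs give equivalent outputs.
assert equivalent(add((2, 5), (4, 1)), add((0, 3), (7, 4)))
```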
In order to construct the set of rationals, we recognise that the fundamental property we should be able to deduce is that every non-zero rational number, m ∈ Q, has a multiplicative inverse, n ∈ Q, such that m × n = 1. We follow a similar concept as before whereby we regard every rational number as being composed of two integer numbers. In particular, we define the relation r on Z × Z given by: (a, b)r(c, d) ⇔ a × d = b × c, for b, d ≠ 0, where we have implicitly assumed that the multiplication of integers has been defined. The importance of specifying b, d ≠ 0 is so that r is indeed an equivalence relation. If we allowed b = 0 or d = 0 then the transitive property would not necessarily hold; this gives a reasonable insight into why dividing by zero is undefined from an abstract-foundations perspective. We then define an equivalence class of an element (a, b) ∈ Z × Z with respect to r as:
[(a, b)]_r = {(c, d) ∈ Z × Z : (a, b)r(c, d)}
and define the set of rationals as the union of all equivalence classes: Q = ⋃_{(a,b)∈Z×Z} [(a, b)]_r. We can see that the relation r is equivalently a × d = b × c ≡ a/b = c/d, so that every pair (a, b) ∈ Q may be interpreted as the division a/b. We then define multiplication of rationals as (a, b) × (c, d) = (a × c, b × d) so that we can observe the multiplicative-inverse property is satisfied: for all (a, b) ∈ Q with a ≠ 0 we have (a, b) × (b, a) = (a × b, b × a). Assuming we have proved the commutative property of multiplication on the integers and by also noting that any pair of the form (c, c) effectively models the number 1, we see (a, b) × (b, a) is 1.
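A similar Python sketch illustrates the pair model of the rationals, including why zero second components must be excluded. Python ints stand in for integers, and `equivalent`, `naive_equivalent`, `multiply` and `inverse` are hypothetical names. The function `naive_equivalent` deliberately drops the restriction b, d ≠ 0 in order to exhibit a failure of transitivity.

```python
def equivalent(p: tuple, q: tuple) -> bool:
    """(a, b) r (c, d) ⇔ a × d = b × c, defined only for b, d ≠ 0."""
    (a, b), (c, d) = p, q
    if b == 0 or d == 0:
        raise ValueError("second components must be non-zero")
    return a * d == b * c


def naive_equivalent(p: tuple, q: tuple) -> bool:
    """The same test without the restriction b, d ≠ 0 (for comparison only)."""
    (a, b), (c, d) = p, q
    return a * d == b * c


def multiply(p: tuple, q: tuple) -> tuple:
    """(a, b) × (c, d) = (a × c, b × d)."""
    (a, b), (c, d) = p, q
    return (a * c, b * d)


def inverse(p: tuple) -> tuple:
    """Swap the components; only defined when the first component is non-zero."""
    a, b = p
    if a == 0:
        raise ZeroDivisionError("zero has no multiplicative inverse")
    return (b, a)


assert equivalent((1, 2), (3, 6))
assert equivalent(multiply((3, -4), inverse((3, -4))), (1, 1))
# Allowing (0, 0) breaks transitivity: (1, 2) ~ (0, 0) ~ (3, 4) but (1, 2) ≁ (3, 4).
assert naive_equivalent((1, 2), (0, 0)) and naive_equivalent((0, 0), (3, 4))
assert not naive_equivalent((1, 2), (3, 4))
```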
It is possible to establish arithmetic and ordering of the integers and rationals, by making suitable definitions of addition, multiplication and order relations. Furthermore, we can show that the arithmetic and ordering of these sets is equivalent to the arithmetic and ordering of the natural numbers, in the sense that for m, n ∈ N if m < n then we also have corresponding m, n ∈ Z such that m < n and (m + n), (m × n) ∈ Z, and similarly for some corresponding m, n ∈ Q. This idea is based on establishing an order isomorphism between sets, that is a function f : N → Z such that m < n ⇔ f(m) < f(n) for all m, n ∈ N. A slight issue is that elements of N are interpreted as numbers, whereas elements of Z are interpreted as ordered pairs of natural numbers which correspond to an equivalence class with respect to the relation R. To overcome this conceptual difficulty it is possible to define another isomorphism which uniquely maps every element of N to an element of Z, so that we may regard N ⊂ Z as intended. We exclude the complete details; as Stewart and Tall write, this task involves “littering the landscape with unsightly isomorphisms which confuse what is really a simple situation.” p179 [2].
It is important to note, however, that the well-ordering property does not hold for the set of positive rationals, Q+. For example, consider the subset A ⊆ Q+ given
by A = {x ∈ Q+ : 0 < x}. If we suppose that A has a least element, x∗ ∈ A, then the number x∗/2 ∈ A would contradict the definition for x∗ to be a least element, since 0 < x∗/2 < x∗. The rational numbers introduce the concept of having infinitely small numbers, m/n where m ≪ n, hence the rational numbers do not have successor elements. This leads to the realisation that if we attempted to write an ordered list of all positive rationals we would not even be able to write a second element, after zero. It is somewhat surprising, then, that Cantor was able to prove that Q has countably many elements, card(N) = card(Q). We wish to bring attention to the notion of being relatively prime and so we sketch a constructive proof below. In the table, the row gives the numerator, the column gives the denominator, and only fractions in reduced form are listed:
      1     2     3     4     5     6     7
1    1/1   1/2   1/3   1/4   1/5   1/6   1/7
2    2/1         2/3         2/5         2/7
3    3/1   3/2         3/4   3/5         3/7
4    4/1         4/3         4/5         4/7
5    5/1   5/2   5/3   5/4         5/6   5/7
6    6/1                     6/5         6/7
7    7/1   7/2   7/3   7/4   7/5   7/6
Table 1: Cantor’s proof to illustrate card(Q) = ℵ0.
Cantor’s proof then works on the idea of mapping every natural number to one of the elements in this table. Following a diagonal pattern we obtain the same result as Faticoni p107 [8]: f(0) = 1/1, f(1) = 2/1, f(2) = 1/2, and so on. Since the numerator and denominator of each fraction are relatively prime, every fraction is in reduced form and so, by construction, f is one-to-one. This is strongly related to Euler’s phi function φ(n) and gives a neat geometric picture as to why φ(n) is always even for n > 2; in Section 3.1.3 we define Euler’s phi function and refer back to this table. Finally, we observe that f is surjective: consider the fraction m/n, at the m-th row and n-th column, then there exists a number k ≤ −1 + Σ_{i=2}^{m+n} φ(i) such that f(k) = m/n. We conclude that f is bijective and so card(Q+) ≡ card(Q) = ℵ0.
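The diagonal enumeration can be sketched as a Python generator. This is an illustration under one assumption that we state explicitly: within each diagonal m + n = s the fractions are listed with decreasing numerator, which is the order that reproduces f(1) = 2/1 and f(2) = 1/2. The names `coprime_fractions`, `phi` and `index_of` are hypothetical, and `phi` is a brute-force count rather than an efficient implementation.

```python
from itertools import count, islice
from math import gcd


def coprime_fractions():
    """Yield reduced fractions (m, n), diagonal by diagonal (m + n = 2, 3, ...)."""
    for s in count(2):
        for m in range(s - 1, 0, -1):     # decreasing numerator within a diagonal
            n = s - m
            if gcd(m, n) == 1:
                yield (m, n)


def phi(n: int) -> int:
    """Euler's phi function by direct counting (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def index_of(m: int, n: int) -> int:
    """Return the k with f(k) = m/n; requires m/n to be in reduced form."""
    if gcd(m, n) != 1:
        raise ValueError("fraction must be in reduced form")
    for k, fraction in enumerate(coprime_fractions()):
        if fraction == (m, n):
            return k


print(list(islice(coprime_fractions(), 6)))
k = index_of(3, 4)
# The bound from the text: k ≤ −1 + Σ_{i=2}^{m+n} φ(i).
assert k <= -1 + sum(phi(i) for i in range(2, 3 + 4 + 1))
```

Each diagonal m + n = s contributes exactly φ(s) fractions, which is where the bound on k comes from.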
2.3 The Real Numbers
The linear continuum is an essential part of nature. It is fundamental to mathematics and science that we have a valid model for the linear continuum in order to gain an understanding of how the universe works. Objects in the real world move continuously, for example a ball falling under the influence of gravity does not move in discrete intervals, but rather, in a continuous motion. Time is also continuous, even though we commonly measure it in discrete periods. The linear continuum is at the heart of various pure mathematics topics such as topology, geometry and analysis. In this Section we address the issue of defining the set of real numbers, or equivalently modelling the linear continuum, and explore some interesting phenomena of representing numbers in digit form.
It can be interpreted that for a set to form a linear continuum means that every ‘point’ on an infinite ‘line’ can correspond to an element of that set. The problem with the
set of rational numbers is that it is not complete, that is, there are gaps on the linear continuum that do not correspond to any element of Q. Elementary examples of such instances are the length of a diagonal on the unit square and the area of a circle with unit radius. The view that the set of real numbers corresponds to all possible points on an infinite line is a geometrical interpretation which aids our understanding, we note that ‘point’ and ‘line’ are undefined terms. A valid construction of the set of real numbers is in such a way that we are able to deduce its completeness property, which is defined below:
Definition 2.3. A set X is said to be complete if and only if for every non empty subset of X which is bounded above there exists a supremum in X.
A set x is bounded above if there exists a number B such that ∀y(y ∈ x ⇒ y ≤ B). In addition, the supremum of a set is defined as the least upper bound of that set.
The incompleteness of the set of rational numbers can be demonstrated in numerous ways. Consider the following result:
Proposition 2.3. Let p be a prime number, then √p ∉ Q.
A proof can work by assuming √p = m/n ∈ Q and so deducing n²p = m². We now note that the left hand side of this equation has an odd number of prime factors (counted with multiplicity), while the right hand side has an even number of prime factors. This observation reveals an inconsistency with the fundamental theorem of arithmetic, which states that the prime factorisation of any number is unique. We prove the fundamental theorem of arithmetic in Section 3.1.1 and hence conclude that √p ∉ Q. Now consider the set X = {x ∈ Q : (x² < 2 ∨ x < 0)}. By inspection, we observe that the supremum of X is √2, however by Proposition 2.3, √2 ∉ Q. Since the supremum of a set is unique, we conclude that Q is incomplete.
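The parity argument can be checked numerically for small values. The Python sketch below counts prime factors with multiplicity by trial division; `big_omega` is a hypothetical helper name. It confirms that n²p always has an odd count and m² an even count, so the two can never be equal in the tested range.

```python
def big_omega(n: int) -> int:
    """Number of prime factors of n counted with multiplicity (trial division)."""
    total, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            total += 1
        d += 1
    if n > 1:          # whatever remains is a single prime factor
        total += 1
    return total


primes = [2, 3, 5, 7, 11]
for p in primes:
    for n in range(1, 40):
        assert big_omega(n * n * p) % 2 == 1      # n²p: odd number of prime factors
for m in range(1, 40):
    assert big_omega(m * m) % 2 == 0              # m²: even number of prime factors
# Consequently n²p = m² has no solution in the tested range.
assert all(n * n * p != m * m for p in primes for n in range(1, 40) for m in range(1, 200))
```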
Since there are infinitely many prime numbers, a result which we prove in Section 3.1.2, Proposition 2.3 uncovers infinitely many gaps in the set of rational numbers. These gaps introduce the concept of an irrational number. To give a definition of an irrational number is not a trivial task because the set of irrational numbers is in fact uncountable; we give a constructive proof of this in Section 2.3.2. Roughly speaking, to be uncountable means that there is no logical way to write down a list which would theoretically incorporate every irrational number. Notice that if we could define the set of irrational numbers, Q̄, then we would readily have a definition of the real numbers: R = Q ∪ Q̄. For mathematicians who study these foundations, the objective is to define irrational numbers in terms of rational ones. We have mainly considered two techniques of achieving this, Dedekind cuts and Cauchy sequences, which we examine in the following Section.
2.3.1 Analysis of Dedekind Cuts and Cauchy Sequences
In principle, these constructions of the real numbers are able to deduce all properties which characterise R, namely arithmetic, ordering and completeness. Our objective is to give an outline of how the various methods work and to reflect on them.
Dedekind Cuts
Dedekind cuts work on the concept of ‘cutting’ the linear continuum into two disjoint sets of rational numbers, a lower set and an upper set. Consider making a cut at point x. Dedekind observed that exactly one of the following three properties will hold:
1. the lower set contains its supremum, x, or
2. the upper set contains its infimum, x, or
3. x is irrational.
In any case, we then arbitrarily choose the lower set or the upper set to represent the number x where the cut was made. For example, making a cut at √2, we see the lower set is the set X from before: X = {x ∈ Q : (x² < 2 ∨ x < 0)} and the upper set is Y = {y ∈ Q : (y² > 2 ∧ y > 0)}. In this case we are in situation 3., neither the upper nor the lower set contains √2, and we interpret set X, or equivalently Y, to represent the number √2.
This concept is fairly neat because we will always be in one of the three cases listed above, and it gives a way of representing irrational numbers in terms of rational ones. Then the set of real numbers may be defined as the union of all lower sets for every possible cut, or equivalently of all upper sets. An important point to make is that, technically, we do not know what an irrational number is and so we couldn’t define the lower set as, for example, X = {x ∈ Q : x < √2}; this would be circular reasoning. The example using √2 works well because √2 is an algebraic number, namely a solution to the equation x² − 2 = 0. Hence, the rational numbers below √2 are easily defined by (x² < 2 ∨ x < 0). This technique is not easily generalised to make a cut at any arbitrary irrational number, for example transcendental numbers such as π and e. These are not roots of any non-zero polynomial f ∈ Z[x] and so a definition of all rational numbers less than, or greater than, π or e does not follow so easily. In general, to define a Dedekind cut at any irrational number becomes very complicated and should be correctly done by analysing limits of sequences of rational numbers. Dedekind’s observation that every cut of the linear continuum satisfies one of the three properties is reasonable and proposes a naïve way to accept a construction of the real numbers. However, Dedekind cuts do not offer a precise definition of an irrational number; for this, more rigorous analysis is required.
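The cut at √2 can be written down using rational arithmetic only, which is the point of avoiding circular reasoning. In the Python sketch below, `in_lower` and `in_upper` encode the sets X and Y, and `larger_in_lower` is a helper of our own, based on the standard map x ↦ (2x + 2)/(x + 2), which shows that the lower set has no greatest element among the non-negative rationals.

```python
from fractions import Fraction


def in_lower(x: Fraction) -> bool:
    """Membership of X = {x ∈ Q : x² < 2 ∨ x < 0}; no irrational number is used."""
    return x * x < 2 or x < 0


def in_upper(y: Fraction) -> bool:
    """Membership of Y = {y ∈ Q : y² > 2 ∧ y > 0}."""
    return y * y > 2 and y > 0


def larger_in_lower(x: Fraction) -> Fraction:
    """For 0 ≤ x with x² < 2, return a strictly larger rational still in X."""
    return (2 * x + 2) / (x + 2)


samples = [Fraction(a, b) for a in range(-30, 31) for b in range(1, 12)]
# Every rational lies in exactly one of the two sets, so the cut is in situation 3.
assert all(in_lower(q) != in_upper(q) for q in samples)

x = Fraction(7, 5)
for _ in range(5):
    nxt = larger_in_lower(x)
    assert in_lower(nxt) and nxt > x      # X contains no greatest element
    x = nxt
```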
Cauchy Sequences

Limits of rational numbers are often regarded as being the most appropriate way to represent irrational numbers, for example $\pi = \lim_{n\to\infty} 4\sum_{k=0}^{n} \frac{(-1)^k}{2k+1}$ and $e = \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n$. It is reasonable to expect a definition of an irrational number to be the limit of a sequence of rational numbers but, as Stewart and Tall remark, “The catch is that formally we do not know what the limit is” p182 [2]. This is similar to the problem of Dedekind cuts, where it would be circular reasoning to define an irrational number by implicitly using the concept of an irrational number. We follow Stewart and Tall’s reasoning and consider a sequence of rationals $s_n$ such that $\lim_{n\to\infty} s_n = l \in \mathbb{R}$, or equivalently $\forall \varepsilon > 0\ \exists N(\varepsilon)$ such that
\[ |s_n - l| < \varepsilon \ \ \forall n \ge N \quad \text{and} \quad |s_m - l| < \varepsilon \ \ \forall m \ge N. \]
Adding these two inequalities, we have $|s_n - l| + |s_m - l| < 2\varepsilon$. By the triangle inequality, we have $|s_n - s_m| \le |s_n - l| + |s_m - l| < 2\varepsilon = \varepsilon'$. This illustrates the brilliant concept behind Cauchy sequences: we have established a way of dealing with the limit of a sequence of rationals without the need to know what that limit actually is. The set of real numbers may then be defined as the set of equivalence classes of Cauchy sequences under the relation $s_n \sim t_n \Leftrightarrow \lim_{n\to\infty} (s_n - t_n) = 0$. We note that in the above discussion about infinite limits of sequences of rational numbers we mean the infinity of type $\omega$, since $\mathrm{card}(\mathbb{Q}) = \mathrm{card}(\omega)$.
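The Cauchy property can be observed numerically without any reference to the limit. The sketch below is only an illustration under our own assumptions: the helper names `leibniz_partial_sum` and `max_gap` are hypothetical, and the sequence used is the series for $\pi$ quoted above, evaluated exactly as rational numbers.

```python
from fractions import Fraction

def leibniz_partial_sum(n: int) -> Fraction:
    """s_n = 4 * sum_{k=0}^{n} (-1)^k / (2k + 1), computed as an exact rational."""
    return 4 * sum(Fraction((-1) ** k, 2 * k + 1) for k in range(n + 1))

def max_gap(N: int, span: int) -> Fraction:
    """Largest value of |s_n - s_m| over all N <= n, m <= N + span."""
    terms = [leibniz_partial_sum(n) for n in range(N, N + span + 1)]
    return max(terms) - min(terms)

# The gaps between terms shrink as N grows, and pi itself is never used.
for N in (10, 100, 1000):
    print(N, float(max_gap(N, 20)))
```

For this particular alternating series the gap beyond index $N$ is bounded by $\frac{4}{2N+3}$, so any tolerance $\varepsilon'$ can be met by taking $N$ large enough.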
2.3.2 Decimal Representation
Our objective in this Section is to reflect on the concept of regarding real numbers as infinite decimals and to prove the uncountability of the irrational numbers. Following this, we finish our analysis of the foundations of numbers by showing how recognising patterns can give way to deeper theoretical results. To use the term ‘decimal’ representation is a slight abuse of terminology; we note that we may use a base other than 10 and observe analogous results. For our work, we will focus on the base 10 representation of numbers in digit form because it is what we are most familiar with.
The philosophical concept of constructing real numbers as arbitrary infinite decimals can lead to controversial beliefs about the axiom of choice. Here, we briefly reflect on this issue in order to further our analysis of the real numbers and gain a deeper understanding of them. Such a construction would require a convincing reason which asserts the existence of arbitrary infinite decimals in the first place. The significance of having an arbitrary infinite decimal is that we cannot know what digits are in that number, nor can we know any pattern to determine those digits. The uncountability of the real numbers, which we prove below, means that we have an overwhelming amount of infinite decimals which we cannot possibly associate to any logical order. The axiom of choice can be thought of as equivalently reading, “if we have a family of non-empty sets, we can simultaneously choose one element from each of the sets in the family” p121 [13]. This supports the claim that an arbitrary infinite decimal exists because all digits of that number can be simultaneously chosen. However, uncertainty remains about the construction of real numbers as arbitrary infinite decimals because we are unable to prove the axiom of choice from the other axioms of our set theory.
We also note that to define arithmetic and ordering of arbitrary infinite decimals is fraught with difficulties. As an example, consider the following equality: $0.9999\ldots = 1$. This equation can be justified by noting $0.9999\ldots = \lim_{n\to\infty} (1 - 10^{-n})$, which indeed equals 1 because for all $\varepsilon > 0$ we have $|(1 - 10^{-n}) - 1| < \varepsilon$ for all $n \ge \lfloor \log_{10} \frac{1}{\varepsilon} \rfloor + 1$. Thus, the concept of representing real numbers as infinite decimals leads to the notion of having more than one way to represent the same number. This makes it seemingly impossible to accurately establish arithmetic and ordering in terms of infinite decimals. It is reasonable to claim that real numbers can be represented as infinite decimals, though we should appreciate that this can never be exact.
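The bound $n \ge \lfloor \log_{10} \frac{1}{\varepsilon} \rfloor + 1$ can be checked with exact arithmetic. The following is a small sketch, assuming $0 < \varepsilon \le 1$; the function names `threshold` and `nines` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def threshold(eps: Fraction) -> int:
    """Return N = floor(log10(1/eps)) + 1 for 0 < eps <= 1, using exact arithmetic."""
    if not 0 < eps <= 1:
        raise ValueError("this sketch assumes 0 < eps <= 1")
    m = 0  # m ends as the largest integer with 10**m <= 1/eps
    while Fraction(10) ** (m + 1) <= 1 / eps:
        m += 1
    return m + 1

def nines(n: int) -> Fraction:
    """The truncation 0.99...9 with n nines, that is 1 - 10**(-n)."""
    return 1 - Fraction(1, 10 ** n)

eps = Fraction(1, 1000)
N = threshold(eps)
# The distance to 1 is below eps from index N onwards, but not at index N - 1.
print(N, 1 - nines(N) < eps, 1 - nines(N - 1) < eps)  # 4 True False
```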
The difficulty of defining the irrational numbers, which we have examined in Section 2.3.1, gives some insight into their uncountability. The fact that the real numbers are uncountable is a result which Cantor was able to prove; we sketch a constructive proof below. We consider the real numbers between 0 and 1 and suppose this set is
countable, so that we may theoretically write down a complete list, such as:
$x_1 = 0.11083\ldots$, $x_2 = 0.22548\ldots$, $x_3 = 0.36317\ldots$,

$x_4 = 0.40349\ldots$, $x_5 = 0.75552\ldots$, $\ldots$, $x_i = 0.d_{i1}d_{i2}\ldots d_{ii}\ldots$, $\ldots$
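We now construct a new real number, $x = 0.e_1e_2e_3\ldots \in (0, 1)$, using the elements in this list, such that $e_i = 2$ if $d_{ii} \neq 2$ and $e_i = 3$ if $d_{ii} = 2$, for $i = 1, 2, \ldots$. Using the above list of numbers we would have $x = 0.23223\ldots$. By construction, $x$ differs from each $x_i$ in its $i$th digit, so $x$ is always a new number which is not already in the list. The particular digits used to construct $x$, here ‘2’ and ‘3’, are not important, provided we avoid 0 and 9 so that $x$ has only one decimal representation. Being able to construct such a number $x \in (0, 1)$ contradicts the assumption that the list is complete and demonstrates that the set of real numbers in $(0, 1)$ is uncountable, hence $\mathbb{R}$ itself is uncountable. Finally, since $\mathbb{R} = \mathbb{Q} \cup \overline{\mathbb{Q}}$, where $\overline{\mathbb{Q}}$ denotes the set of irrational numbers, and also $\mathrm{card}(\mathbb{Q}) < \mathrm{card}(\mathbb{R})$, we conclude $\mathrm{card}(\overline{\mathbb{Q}}) = \mathrm{card}(\mathbb{R})$, that is, the set of irrationals is uncountable.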
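The diagonal construction is mechanical enough to be written as a short program. The sketch below applies the rule stated above to the five listed expansions; the function name `diagonal_number` is a hypothetical one, and only the finitely many digits shown in the list are used.

```python
from typing import Sequence

def diagonal_number(expansions: Sequence[str]) -> str:
    """Build the digits e_i from digit strings (the digits after '0.').

    The rule is the one used in the text: e_i = 2 if d_ii != 2, and e_i = 3 if d_ii = 2.
    """
    digits = []
    for i, row in enumerate(expansions):
        digits.append("3" if row[i] == "2" else "2")
    return "0." + "".join(digits)

listed = ["11083", "22548", "36317", "40349", "75552"]
x = diagonal_number(listed)
print(x)  # 0.23223

# x disagrees with the i-th listed number in its i-th digit.
print(all(x[2 + i] != row[i] for i, row in enumerate(listed)))  # True
```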
We finish our analysis of the foundations of numbers by demonstrating a connection to some results in number theory which are relevant to our following work. Recognising patterns in numbers can give way to deeper theoretical results, “Gauss and others often used to carry out phenomenal amounts of calculation looking for patterns which might suggest mathematical assertions or relationships which ought to be true” p2 [12]. The decimal expansion of a real number, $x$, is either eventually periodic or never periodic; in fact this corresponds to $x$ being rational or irrational respectively. For a number $x = 0.d_1d_2d_3\ldots$ (where all $d_i$ are decimal digits) to have period $P$ we mean that $d_{n+P} = d_n$ for all $n$, or if $x$ has period $P$ after the $k$th digit then $d_{n+P} = d_n$ for all $n > k$. Our focus is specifically on the decimal form of rational numbers and recognising their periodicity.
Observe that we may write a finite decimal number as $0.d_1d_2d_3\ldots d_n = \sum_{k=1}^{n} d_k 10^{-k}$. Hence, a decimal number whose block of $n$ digits repeats from the first digit, that is with period $n$, may be written as:
\[
\begin{aligned}
0.\overline{d_1d_2d_3\ldots d_n} &= \left(\sum_{k=1}^{n} d_k 10^{-k}\right)\left(\sum_{j=0}^{\infty} 10^{-nj}\right), \\
&= \left(d_1 10^{-1} + d_2 10^{-2} + \ldots + d_n 10^{-n}\right)\frac{1}{1 - 10^{-n}} \\
&= \frac{d_1 10^{n-1} + d_2 10^{n-2} + \ldots + d_n}{10^n - 1},
\end{aligned}
\tag{2}
\]
where we have used the sum of a geometric series since $|10^{-n}| < 1$ for all $n \ge 1$. We first notice that for all numbers of the form $x = 2^{\alpha}5^{\beta}$, for natural numbers $\alpha$ and $\beta$, the decimal expansion of $\frac{1}{x}$ is finite: $\frac{1}{2} = 0.5$, $\frac{1}{5^2} = 0.04$, $\frac{1}{2^3 \times 5^2} = 0.005$ and so on. After a closer look we find that this is simply a consequence of 2 and 5 being the prime divisors of our base 10. So we consider numbers, $x$, whose prime factors do not include 2 or 5, that is, the greatest common divisor of $x$ and 10 is 1. Euler’s totient function is the focus of Section 3.1.3, where we give a formal definition and prove its various properties. After various trials and experiments, it is observable that the period of the repeating decimals in $\frac{1}{x}$ seems to always be a factor of Euler’s totient function $\varphi(x)$; furthermore, the period never seems to exceed $\varphi(x)$. For example, $\frac{1}{7}$ has period $6 = \varphi(7)$ and $\frac{1}{11}$ has period $2 = \frac{1}{5}\varphi(11)$. In general, this observation is correct and is proved below:
Proposition 2.4. Let $x$ be a natural number greater than 1 and such that $\gcd(x, 10) = 1$, then the period of $\frac{1}{x}$ is at most $\varphi(x)$ and is always a factor of $\varphi(x)$.

Proof. This proof relies on Euler’s Theorem, which we state and prove in Section 3.1.3. Since $x$ and 10 are relatively prime, by hypothesis, Euler’s Theorem shows that $10^{\varphi(x)} \equiv 1 \pmod{x}$. Equivalently, we may write $10^{\varphi(x)} - 1 = kx$ for some integer $k \in \mathbb{Z}$.

Now, since $x > 1$, we have $\frac{1}{x} = \frac{k}{kx} < 1$, that is, $k < kx = 10^{\varphi(x)} - 1$. Therefore, the integer $k$ has at most $\varphi(x)$ digits, expressed in base 10 (allowing leading zeros), $k = d_1 10^{\varphi(x)-1} + d_2 10^{\varphi(x)-2} + \ldots + d_{\varphi(x)}$. It immediately follows that we may write,
\[ \frac{1}{x} = \frac{k}{kx} = \frac{d_1 10^{\varphi(x)-1} + d_2 10^{\varphi(x)-2} + \ldots + d_{\varphi(x)}}{10^{\varphi(x)} - 1}, \]
which is the same form as (2) and so we see $\frac{1}{x}$ has a period of at most $\varphi(x)$. If we let $P$ be the lowest positive integer such that $10^P \equiv 1 \pmod{x}$ then, as Rosen proves p308 [16], $P$ is indeed a factor of $\varphi(x)$. Repeating the argument above with $P$ in place of $\varphi(x)$, the integer $\frac{10^P - 1}{x}$ can be written with exactly $P$ digits in base 10, allowing leading zeros, and so the period of $\frac{1}{x}$ is exactly $P$.

The minimum value of $(x - \varphi(x))$ is 1, which occurs only when $x$ is prime; we prove this in Theorem 3.2 iii). Therefore, by Proposition 2.4, the period of $\frac{1}{x}$ can attain the value $x - 1$ only when $x$ is prime; however, this value is not attained for all primes. The period of $\frac{1}{7}$ is $6 = \varphi(7)$, which attains it, whereas the period of $\frac{1}{11}$ is 2. Through my research and literature investigation, I have not found a resolution of the conjecture that there are infinitely many primes, $p$, such that the period of $\frac{1}{p}$ is $\varphi(p) = p - 1$. In number theoretic language, this conjecture is equivalent to supposing there are infinitely many primes, $p$, such that 10 is a primitive root modulo $p$.
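Proposition 2.4 can be explored experimentally, in the spirit of the trials mentioned above. The following sketch computes the period of $\frac{1}{x}$ as the least $P$ with $10^P \equiv 1 \pmod{x}$ and compares it with $\varphi(x)$. The function names `phi` and `decimal_period` are our own, and `phi` is evaluated naively by counting, which is only suitable for small arguments.

```python
from math import gcd

def phi(n: int) -> int:
    """Euler's totient function, by direct counting (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def decimal_period(x: int) -> int:
    """Least P >= 1 with 10**P congruent to 1 modulo x.

    Requires x > 1 and gcd(x, 10) = 1, as in Proposition 2.4.
    """
    if x <= 1 or gcd(x, 10) != 1:
        raise ValueError("need x > 1 with gcd(x, 10) = 1")
    P, power = 1, 10 % x
    while power != 1:
        power = (power * 10) % x
        P += 1
    return P

# Columns: x, period of 1/x, phi(x), and whether the period divides phi(x).
for x in (3, 7, 11, 13, 21, 49):
    P = decimal_period(x)
    print(x, P, phi(x), phi(x) % P == 0)
```

Running the same function over the primes shows that some, such as 7, attain the period $p - 1$ while others, such as 11 and 13, do not.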
3 Broadening the Theory of Numbers
Having established the technical groundwork which number theory rests upon, we now turn our attention to analyse how the theory can be extended to lead to a practical application, namely RSA cryptography. Number theory is part of the purest form of mathematics, “to be admired for its beauty and depth rather than its applicability” p13 [15]. However, in this Section, our primary objective is to build on the foundations so that we can apply our theoretical results to RSA cryptography in Section 4. By doing so, we appreciate the importance of establishing a well-founded and solid mathematical theory by demonstrating how it gives way to a practical application.
We begin by proving various fundamental results which concern all of number theory and have direct applications to RSA cryptography. Following this, the remainder of our work is mainly dedicated to tackling the problem of primality testing. We also address relevant issues regarding computational number theory and the difficulties of integer factorisation. As an additional motive to study number theory, in Section 3.1.2 we discuss some deeper theoretical results which are the focus of current research and are connected to remaining unsolved problems. New developments in these involved topics of number theory would have a significant impact on other areas of mathematics. With respect to primality testing and RSA cryptography, new results in these areas would offer very efficient and practical methods, as we conclude in Section 4.3.2.
3.1 Fundamental Results
A crucial component of a rigorous study of number theory is the study of prime numbers, as Yan writes, “In fact, the theory of numbers is essentially the theory of prime numbers” p85 [15]. An elementary definition of a prime number is a natural number, $p$, having exactly two divisors, that is, $1 \mid p$ and $p \mid p$ but for all $n \in \{2, 3, \ldots, p - 1\}$, $n \nmid p$. Prime numbers are the subject of many open challenges, for example in understanding the distribution of prime numbers and determining the prime factorisation of any number, large numbers in particular.
3.1.1 Factoring Natural Numbers
Firstly, the following elementary theorems introduce the significance of prime numbers:
Theorem 3.1. Every integer n > 1 is expressible as a product of primes.
We give two proofs of this theorem to demonstrate the importance of the principles of induction and well-ordering, which we established in Section 2.1.3. Both proofs follow the same reasoning as Allenby [12] p126. Let S be the set of all numbers which are expressible as a product of primes:
\[ \forall s\,(s \in S \Leftrightarrow (s \in \mathbb{N} \wedge s \text{ is expressible as a product of primes})). \]

Proof by Induction. Observe that 2 is expressible as a product of primes (2 is itself prime) so $2 \in S$. Now we assume that every integer $j$ with $2 \le j \le k$ belongs to $S$ and consider $(k + 1)$. If $(k + 1)$ is prime then we are finished, $(k + 1) \in S$. Alternatively, $(k + 1)$ must be expressible as a product of two integers, $a$ and $b$, such that $1 < a, b < (k + 1)$. By the hypothesis step, $a$ and $b$ are expressible as a product of primes and so $(k + 1) = a \times b$ is also expressible as a product of primes. That is, $(k + 1) \in S$ and so $S = \mathbb{N}\setminus\{1\}$.
Proof by Well-Ordering. Consider the set $T = (\mathbb{N}\setminus\{1\})\setminus S$, i.e. the set of numbers which are not expressible as a product of primes. If $T$ is non-empty then there exists a least element $t \in T$. Clearly, $t$ is not prime, otherwise it would be expressible as a product of primes. Therefore, $t$ is expressible as a product of two integers $a$ and $b$ such that $1 < a, b < t$. However, since $a$ and $b$ are less than $t$ and $t$ is defined as the least element which is not expressible as a product of primes, $a$ and $b$ must both be expressible as a product of primes. Therefore, $t = a \times b$ is also expressible as a product of primes, which contradicts $t \in T$. That is, $T$ has no least element and so $T = \emptyset$.

Being able to express any natural number as the product of primes is a crucial result. Knowing the prime factorisation of a number often allows many useful properties of that number to be calculated more easily. We will discuss more about this after we establish the following important result, which shows that a prime factorisation of any number is in fact the only prime factorisation of that number.
Theorem (Fundamental Theorem of Arithmetic). Every natural number n > 1 has a unique prime factorisation.
Proof. By Theorem 3.1 every natural number n > 1 can be expressed as a product of primes. Let S be the set of all numbers which do not have a unique prime factorisation,
\[ \forall s\,(s \in S \Leftrightarrow (s \in \mathbb{N} \wedge s \text{ has a non-unique prime factorisation})). \]
In order to reach a contradiction, we assume that $S$ is non-empty and so must have a least element, $s \in S$, by the well-ordering principle,
\[ s = \prod_{i=1}^{n} p_i = \prod_{j=1}^{m} q_j, \]
for primes $p_i$ and $q_j$, where the two factorisations differ by more than the order of their factors.

Euclid’s ‘first’ theorem shows that a prime divides a product of numbers if and only if it divides one of those numbers; we omit the proof as this is quite a straightforward result. Therefore, since $p_1$ is prime it must divide one of $q_j$ for $j \in \{1, 2, \ldots, m\}$. Without loss of generality, take that $p_1$ divides $q_1$. However, by construction, $q_1$ is prime and so we must have $p_1 = q_1$. Now, if we consider $\frac{s}{p_1}$ we have,
\[ \frac{s}{p_1} = \prod_{i=2}^{n} p_i = \prod_{j=2}^{m} q_j. \]
By the assumption that $s$ has a non-unique factorisation, the remaining factors $p_2, p_3, \ldots, p_n$ and $q_2, q_3, \ldots, q_m$ must still differ by more than their order. That is, $\frac{s}{p_1}$ has a non-unique prime factorisation. This is a contradiction since $\frac{s}{p_1} < s$ and $s$ is defined as the least element which does not have a unique prime factorisation. We conclude that $S$ does not have a least element and so $S = \emptyset$.
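The existence argument of Theorem 3.1 translates directly into a procedure: the least divisor greater than 1 of any number is prime, so it can be split off repeatedly. The following is a minimal sketch by trial division, with the hypothetical function name `prime_factors`; it is an illustration for small inputs only and is not an efficient factoring method.

```python
from math import prod
from typing import List

def prime_factors(n: int) -> List[int]:
    """Return the prime factorisation of n > 1 as a non-decreasing list.

    Trial division: the least divisor d > 1 of the remaining cofactor is
    always prime, so it is split off until nothing but a prime (or 1) is left.
    """
    if n <= 1:
        raise ValueError("need n > 1")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:  # whatever remains has no divisor up to its square root, so it is prime
        factors.append(n)
    return factors

print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
print(prod(prime_factors(360)) == 360)  # True
```

By the fundamental theorem of arithmetic, any other correct method must return the same list once the factors are sorted.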
We note that these fundamental results, on factoring natural numbers into their prime divisors, have been proved based on the foundational theory in Section 2.1.3, namely the induction and well-ordering properties of the natural numbers. This shows the significance of setting effective foundations to underlie our analysis.
The fundamental theorem of arithmetic shows that “primes are the multiplicative building blocks of the natural numbers” [17] p1. This result gives some insight into the significance of prime numbers and introduces the ‘fundamental problem of arithmetic’, as it is often called. The crux of the matter is that it is generally very difficult to factor a number, large numbers in particular, into its prime divisors. In truth, this is the underlying reason why the RSA cryptosystem is secure, as Rosen writes [about the RSA cryptosystem] “... security is based on the difficulty of factoring integers” [16] p259. As we prove in Theorem 3.2 v), if we are able to find the prime factorisation of a number, $n$, then Euler’s totient function of that number, $\varphi(n)$, is easily calculated and our cryptosystem would be broken. We discuss the difficulties of factoring a number into its prime divisors in Section 3.2.2.
3.1.2 Analytic Study of Primes
We now move focus onto the study of prime numbers. Though our objective is not to give a complete analysis of prime numbers, we wish to address some important results which are crucial to number theory and are relevant to our following work. Prime numbers have a major role in number theory. To gain a deeper understanding of them it seems natural to look for a rule or relation which determines the $n$th prime. However, to date, no exact $n$th-prime function has been discovered. It is also of high interest to discover some order or pattern in how the primes are distributed among the natural numbers, in particular to analyse how many primes are less than a given number. These are famous problems of number theory and remain open challenges, though significant breakthroughs have been made.

“Till now mathematicians have tried in vain to this day to discover some order in the sequence of prime numbers, and we have reason to believe that it is a mystery into which the human mind will never penetrate.” Leonhard Euler.
We define the function $\pi : \mathbb{R} \to \mathbb{N}$ where $\pi(x)$ is the number of primes less than or equal to $x$:
\[ \pi(x) = \sum_{\substack{p \le x \\ p \text{ prime}}} 1. \]
Throughout the remainder of our work, we will use ‘$\pi$’ to mean the prime counting function, or ‘prime distribution function’, as defined above. Although there is no computationally manageable method to calculate $\pi(x)$ for all $x \in \mathbb{R}$, there have been numerous discoveries which show $\pi(x)$ is quite well-behaved. For our work we are mainly concerned with the following results: Chebyshev’s inequality, which bounds $\pi(x)$ for any $x \in \mathbb{R}$, and that $\pi(x)$ can be approximated by $\frac{x}{\ln(x)}$ reasonably well for sufficiently large $x$. The latter result is a rough translation of the Prime Number Theorem, a fundamental theorem, which should be formally stated as below:
Theorem (Chebyshev’s $\pi(x)$ Inequality). There exist constants $0 < A, B \in \mathbb{R}$ such that, for all real $x \ge 3$,
\[ \frac{Ax}{\ln(x)} < \pi(x) < \frac{Bx}{\ln(x)}. \]

Theorem (Prime Number Theorem). Let $x \in \mathbb{R}$, then $\pi(x)$ is asymptotic to $\frac{x}{\ln(x)}$. That is,
\[ \lim_{x\to\infty} \frac{\pi(x)\ln(x)}{x} = 1. \]
Chebyshev’s result is somewhat close to the Prime Number Theorem, however, as Crandall and Pomerance write, “the real difficulty in the PNT [Prime Number Theorem] is showing that $\lim_{x\to\infty} \pi(x)/\frac{x}{\ln(x)}$ exists at all” p10 [17]. By going into a deep study of analytic number theory, it is possible to deduce the Prime Number Theorem from the Riemann zeta function, which is defined as:
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}, \quad \text{for } s \in \mathbb{C} \text{ and when the summation converges.} \]
As Crandall and Pomerance hint, one method of proof works on the principle that $\zeta(s)$ has no zeros for $\mathrm{Re}(s) = 1$ p34 [17]; this proof requires a rigorous study of complex analysis. It is important to appreciate the value of the Prime Number Theorem, in particular for our following work in Section 4.3, as the Prime Number Theorem gives a rough estimate of the probability that a randomly chosen integer is prime. We also note that Chebyshev’s inequality can be used to approximately bound this probability.
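The quality of the approximation $\pi(x) \approx \frac{x}{\ln(x)}$ can be examined numerically for small $x$. The sketch below tabulates $\pi(x)$ using the sieve of Eratosthenes, which is our own choice of method for this illustration, and prints the ratio $\frac{\pi(x)\ln(x)}{x}$ appearing in the Prime Number Theorem. The function name `prime_count_table` is hypothetical.

```python
from math import isqrt, log
from typing import List

def prime_count_table(limit: int) -> List[int]:
    """Sieve of Eratosthenes; entry x of the result equals pi(x) for 0 <= x <= limit."""
    if limit < 2:
        raise ValueError("need limit >= 2")
    is_prime = [False, False] + [True] * (limit - 1)
    for i in range(2, isqrt(limit) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    counts, running = [], 0
    for flag in is_prime:
        running += flag  # True counts as 1, False as 0
        counts.append(running)
    return counts

pi = prime_count_table(100_000)
# Columns: x, pi(x), and the ratio pi(x) * ln(x) / x.
for x in (10, 100, 1_000, 10_000, 100_000):
    print(x, pi[x], round(pi[x] * log(x) / x, 4))
```

In this range the ratio stays above 1 and decreases slowly, which is consistent with, though of course no proof of, the limit stated in the theorem.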
Another fundamental result in number theory is the infinitude of primes. Euclid is given credit for providing the first proof of this theorem. His proof works by contradiction and deduces that there is a prime which divides the number 1, which is impossible. As Crandall and Pomerance describe p34 [17], another proof of the infinitude of primes can be deduced from the zeta function by considering $s \in \mathbb{R}$ and taking the limit $s \to 1$. In this limit, the zeta function becomes the harmonic series, $\sum_{n=1}^{\infty} \frac{1}{n}$, which is known to diverge through studies in real analysis. Furthermore, using Euler’s ‘factor’ to show the relationship between the zeta function and primes, namely $\sum_{n=1}^{\infty} \frac{1}{n^s} \equiv \prod_{\text{all primes } p} (1 - p^{-s})^{-1}$, the infinitude of primes follows after some manipulations. Our proof of the infinitude of primes, below, is based on a similar approach. To derive our proof we have more or less answered Crandall and Pomerance’s exercise 1.20 p53 [17].
Theorem. There are infinitely many prime numbers.
Proof. Firstly, we use the principle of induction to show
\[ \sum_{n=1}^{x} \frac{1}{n} > \ln(x + 1) > \ln(x) \quad \text{holds for } x \ge 1. \tag{3} \]
In the case $x = 1$ we have $\sum_{n=1}^{1} \frac{1}{n} = 1 > \ln(2)$ which is clearly true. Now we assume the result is true for $x = k$,
\[ \sum_{n=1}^{k} \frac{1}{n} > \ln(k + 1), \]
and consider when $x = k + 1$:
\[
\begin{aligned}
\sum_{n=1}^{k+1} \frac{1}{n} = \frac{1}{k+1} + \sum_{n=1}^{k} \frac{1}{n} &> \frac{1}{k+1} + \ln(k + 1) \quad \text{by the hypothesis step} \\
&= \ln\left(e^{\frac{1}{k+1}}(k + 1)\right) > \ln(k + 2),
\end{aligned}
\]
since the logarithm is a monotonic function and $e^{\frac{1}{k+1}}(k + 1) > \left(1 + \frac{1}{k+1}\right)(k + 1) = k + 2$ holds for $k \ge 1$, using $e^y > 1 + y$ for $y > 0$. Therefore, we conclude (3) is true. The second result we need to prove is,
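\[ \sum_{n=1}^{x} \frac{1}{n} < \prod_{p \le x} \left(1 - \frac{1}{p}\right)^{-1} \quad \text{where } p \text{ is prime and } x \ge 2. \tag{4} \]

We proceed by making the following manipulations:
\[
\begin{aligned}
\frac{1}{2} \sum_{n=1}^{x} \frac{1}{n} &= \frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \frac{1}{8} + \ldots + \frac{1}{2x - 2} + \frac{1}{2x} \\
\therefore \left(1 - \frac{1}{2}\right) \sum_{n=1}^{x} \frac{1}{n} &= 1 + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \ldots - \frac{1}{2x - 2} - \frac{1}{2x} \\
\frac{1}{3}\left(1 - \frac{1}{2}\right) \sum_{n=1}^{x} \frac{1}{n} &= \frac{1}{3} + \frac{1}{9} + \frac{1}{15} + \frac{1}{21} + \ldots - \frac{1}{6x - 6} - \frac{1}{6x} \\
\therefore \left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{2}\right) \sum_{n=1}^{x} \frac{1}{n} &= 1 + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \frac{1}{13} + \ldots - \frac{1}{2x - 2} - \frac{1}{2x} - \ldots + \frac{1}{6x - 6} + \frac{1}{6x}.
\end{aligned}
\]
Repeating this process, we can observe that if we continue multiplying the left hand side by $\left(1 - \frac{1}{p}\right)$ for all primes $p \le x$, the positive terms on the right hand side of the form $\left\{\frac{1}{k}\right\}_{k \le x}$ will all disappear except for $k = 1$. The remaining terms of the form $\left\{\pm\frac{1}{k}\right\}_{k > x}$ generally form a complicated pattern and don’t admit a ‘neat’ formula, however we note that their sum is less than 0, for example since $2x < 6x$ we have $\frac{1}{2x} > \frac{1}{6x}$ and so $-\frac{1}{2x} + \frac{1}{6x} < 0$. Indeed, expanding each factor $\left(1 - \frac{1}{p}\right)^{-1}$ as a geometric series shows that the product on the right of (4) equals the sum of $\frac{1}{n}$ over every natural number $n$ whose prime divisors are all at most $x$, which contains every term with $n \le x$ together with further positive terms. In general, we find:
\[ \prod_{p \le x} \left(1 - \frac{1}{p}\right) \sum_{n=1}^{x} \frac{1}{n} = 1 + c, \text{ where } c < 0, \quad \text{giving} \quad \sum_{n=1}^{x} \frac{1}{n} < \prod_{p \le x} \left(1 - \frac{1}{p}\right)^{-1}, \]
we conclude (4) is true. Observe, since the natural logarithm is monotonic, by taking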
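the natural logarithm of (4) we have,
\[
\begin{aligned}
\sum_{p \le x} \ln\left(\left(1 - \frac{1}{p}\right)^{-1}\right) &> \ln\left(\sum_{n=1}^{x} \frac{1}{n}\right) > \ln(\ln(x)) \quad \text{by (3)}, \\
\therefore \sum_{p \le x} \left(\sum_{n=1}^{\infty} \frac{1}{n p^n}\right) &> \ln(\ln(x)) \quad \text{since } \ln\left((1 - y)^{-1}\right) = \sum_{n=1}^{\infty} \frac{y^n}{n} \text{ and } \frac{1}{p} < 1, \\
\therefore \sum_{p \le x} \left(\frac{1}{p} + \sum_{n=2}^{\infty} \frac{1}{n p^n}\right) &> \ln(\ln(x)).
\end{aligned}
\]
Notice from the left hand side, $\sum_{p \le x} \sum_{n=2}^{\infty} \frac{1}{n p^n} < \sum_{p \le x} \sum_{n=2}^{\infty} \frac{1}{p^n}$ and this is in fact a geometric series: $\sum_{n=2}^{\infty} \frac{1}{p^n} = \frac{1}{1 - \frac{1}{p}} - 1 - \frac{1}{p} = \frac{1}{p(p - 1)}$.
\[ \text{Therefore,} \quad \ln(\ln(x)) < \sum_{p \le x} \frac{1}{p} + \sum_{p \le x} \frac{1}{p(p - 1)} < \sum_{p \le x} \frac{1}{p} + \sum_{n=2}^{\infty} \frac{1}{n(n - 1)} = \sum_{p \le x} \frac{1}{p} + 1, \]
since $\sum_{n=2}^{\infty} \frac{1}{n(n - 1)}$ is a telescoping series; by taking partial fractions we can see it sums to 1. Finally, we have reached our desired result:
\[ -1 + \ln(\ln(x)) < \sum_{p \le x} \frac{1}{p}. \]
From this, it is straightforward that if we take the limit $x \to \infty$ on both sides,
\[ -1 + \lim_{x\to\infty} \ln(\ln(x)) \le \lim_{x\to\infty} \sum_{p \le x} \frac{1}{p}, \]
then the left hand side goes to infinity and so $\sum_{\text{all primes } p} \frac{1}{p}$, from the right hand side, diverges. As discussed previously, if a series diverges then it must have an infinite number of terms; we therefore conclude there is an infinite number of primes.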
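The three inequalities used in this proof can be sanity-checked numerically for small $x$. The sketch below is only a floating-point illustration under our own naming: `primes_up_to` and `check_inequalities` are hypothetical helpers, and the check is restricted to integers $x \ge 2$ so that $\ln(\ln(x))$ is defined.

```python
from math import log
from typing import List, Tuple

def primes_up_to(x: int) -> List[int]:
    """All primes p <= x, by naive trial division (adequate for small x)."""
    return [p for p in range(2, x + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def check_inequalities(x: int) -> Tuple[bool, bool, bool]:
    """Truth values of (3), (4) and the final bound, for an integer x >= 2."""
    harmonic = sum(1 / n for n in range(1, x + 1))
    primes = primes_up_to(x)
    euler_product = 1.0
    for p in primes:
        euler_product /= (1 - 1 / p)  # multiply by (1 - 1/p)^(-1)
    reciprocal_sum = sum(1 / p for p in primes)
    return (harmonic > log(x),                      # inequality (3)
            harmonic < euler_product,               # inequality (4)
            reciprocal_sum > log(log(x)) - 1)       # the final bound

print(all(check_inequalities(x) == (True, True, True) for x in range(2, 300)))
```

Since $\ln(\ln(x))$ grows extremely slowly, such experiments give little feeling for the divergence of $\sum \frac{1}{p}$; the value of the proof lies in the inequality itself.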
Seemingly harmless problems such as counting primes rely on vast areas of advanced mathematics to solve. As we have briefly discussed, the Riemann zeta function gives some great insights into the properties of prime numbers, such as their infinitude and their distribution; even an estimate of the error $\pi(x) - \frac{x}{\ln(x)}$ can be deduced p35 [17]. A fundamental conjecture about the zeta function, “for all of number theory, if not for all of mathematics” p36 [17], was put forward by Riemann and is known as the Riemann hypothesis. For completeness, we formally state it below:
Riemann Hypothesis. All the zeros of $\zeta(s)$ in the critical strip $0 < \mathrm{Re}(s) < 1$ lie on the line $\mathrm{Re}(s) = \frac{1}{2}$.

By introducing Dirichlet L-functions, the Riemann hypothesis can be taken further to the extended Riemann hypothesis and the generalised Riemann hypothesis. Such conjectures are in the field of current research, though it is relevant for us to address them because developments in these theoretical areas would have a significant impact on the practical application of mathematics. We have merely addressed these issues so that we may refer back to them in our Conclusion Section.
3.1.3 Theorems with Applicable Results to RSA Cryptography
In this Section we focus on particular theorems which are applicable to RSA cryptography. We analyse and prove how the RSA cryptosystem works in Section 4.1; for now, we are primarily concerned with proving various fundamental results. Euler’s totient function is at the heart of how RSA cryptography works and there are two important properties we need to prove: firstly, Euler’s theorem and, secondly, how to calculate the value of the totient function by knowing the prime factorisation of its argument.
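Definition 3.1. Euler’s totient function is defined as $\varphi : \mathbb{N} \to \mathbb{N}$ where $\varphi(n)$ is the number of natural numbers less than or equal to $n$ which are relatively prime to $n$. That is, in set theory terms, $\varphi(n) = |\mathbb{Z}_n^*|$, where $\mathbb{Z}_n^* = \{x \in \mathbb{N} : 1 \le x \le n, \gcd(x, n) = 1\}$.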
We begin by making a theoretical remark to show some insight into the nature of $\varphi(n)$. We return to Table 1 in Section 2.2.2. Consider the sets of diagonal elements at each row, for example at the first row we have $r_1 = \{\frac{1}{1}\}$, at the second row we have $r_2 = \{\frac{2}{1}, \frac{1}{2}\}$, at the seventh row we have $r_7 = \{\frac{7}{1}, \frac{5}{3}, \frac{3}{5}, \frac{1}{7}\}$ and so on. In general, we find that the number of elements in the diagonal line beginning at row $n$ is the same as the number of elements less than $n + 1$ and relatively prime to $n + 1$, that is $|r_n| = \varphi(n + 1)$. Furthermore, Table 1 gives a neat geometric proof as to why $\varphi(n)$ is even for $n > 2$. Notice there is diagonal symmetry: if $(a, b)$ are relatively prime then it trivially follows $(b, a)$ are also relatively prime. Therefore, every set of diagonal elements at each row, $r_{k>1}$, has an even number of elements: if $\frac{a}{b} \in r_k$ then $\frac{b}{a} \in r_k$ and also $\frac{a}{a} \notin r_{k>1}$. This is a heuristic proof as to why $\varphi(n)$ is even for $n > 2$.
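The remark above can be checked computationally. In the sketch below we assume, from the examples $r_1$, $r_2$ and $r_7$, that the diagonal beginning at row $n$ consists of the reduced fractions $\frac{a}{b}$ with $a + b = n + 1$; this reading of Table 1, and the function names `phi` and `diagonal`, are our own assumptions for the purpose of illustration.

```python
from math import gcd
from typing import List, Tuple

def phi(n: int) -> int:
    """Euler's totient function, counted directly from Definition 3.1."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

def diagonal(n: int) -> List[Tuple[int, int]]:
    """Pairs (a, b) with a + b = n + 1 and gcd(a, b) = 1 (assumed form of r_n)."""
    return [(a, n + 1 - a) for a in range(n, 0, -1) if gcd(a, n + 1 - a) == 1]

print(diagonal(7))  # [(7, 1), (5, 3), (3, 5), (1, 7)]

# |r_n| = phi(n + 1), and phi(n) is even for n > 2, on a small range.
print(all(len(diagonal(n)) == phi(n + 1) for n in range(1, 200)))  # True
print(all(phi(n) % 2 == 0 for n in range(3, 200)))                 # True
```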
We proceed by proving the following, Euler’s theorem:
Theorem (Euler’s Totient Theorem). If $a, n \in \mathbb{N}$ such that $\gcd(a, n) = 1$, then $a^{\varphi(n)} \equiv 1 \pmod{n}$.
Proof. Consider the sets $\mathbb{Z}_n^* = \{b_1, b_2, \ldots, b_{\varphi(n)}\}$ and $X = \{ab_1, ab_2, \ldots, ab_{\varphi(n)}\}$. Since $\gcd(a, n) = 1$ (by hypothesis) and $\gcd(b_i, n) = 1$ (by definition of $\mathbb{Z}_n^*$), it follows that $\gcd(ab_i, n) = 1$. Furthermore, for all $b \in \mathbb{Z}_n^*$ we are able to find a unique $c \in \mathbb{Z}_n^*$ such that $c \equiv a \times b \pmod{n}$. This follows from the result in group theory that we can define a permutation (one to one function) on $\mathbb{Z}_n^*$ using $f : \mathbb{Z}_n^* \to \mathbb{Z}_n^*$ by $f(x) = a \times x \pmod{n}$.

Since the elements of $X$ are, modulo $n$, a rearrangement of the elements of $\mathbb{Z}_n^*$, we can write the following equality of products:
\[
\begin{aligned}
b_1 \times b_2 \times \ldots \times b_{\varphi(n)} &\equiv (ab_1) \times (ab_2) \times \ldots \times (ab_{\varphi(n)}) \pmod{n} \\
&\equiv a^{\varphi(n)} \left(b_1 \times b_2 \times \ldots \times b_{\varphi(n)}\right) \pmod{n}.
\end{aligned}
\]
From this, $n$ divides the product $\left(a^{\varphi(n)} - 1\right) \prod_{i=1}^{\varphi(n)} b_i$. Since $\gcd(b_i, n) = 1$ for every $i$, the product $\prod_{i=1}^{\varphi(n)} b_i$ is relatively prime to $n$, and so $n$ must divide $a^{\varphi(n)} - 1$, that is, $a^{\varphi(n)} - 1 = k \times n$ for some $k \in \mathbb{N}$. Rewriting this result in congruence notation, we have:
\[ a^{\varphi(n)} \equiv 1 \pmod{n}. \]
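Both steps of this proof, the rearrangement of $\mathbb{Z}_n^*$ under multiplication by $a$ and the resulting congruence, can be verified directly for small moduli. The following is a minimal sketch with hypothetical function names `units` and `check_euler`; it assumes $n \ge 2$ and uses Python’s built-in three-argument `pow` for modular exponentiation.

```python
from math import gcd
from typing import List, Tuple

def units(n: int) -> List[int]:
    """The set Z_n^* as a sorted list: all 1 <= b <= n with gcd(b, n) = 1."""
    return [b for b in range(1, n + 1) if gcd(b, n) == 1]

def check_euler(a: int, n: int) -> Tuple[bool, bool]:
    """For n >= 2 and gcd(a, n) = 1, return two truth values:

    first, whether multiplying Z_n^* by a merely permutes it modulo n;
    second, whether a**phi(n) is congruent to 1 modulo n.
    """
    Z = units(n)
    X = sorted((a * b) % n for b in Z)
    return X == Z, pow(a, len(Z), n) == 1

print(check_euler(7, 10))   # (True, True)
print(all(check_euler(a, n) == (True, True)
          for n in range(2, 60) for a in range(1, 60) if gcd(a, n) == 1))
```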
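The following theorem shows some important properties about Euler’s totient function.

Theorem 3.2. Euler’s totient function $\varphi$ satisfies the following properties:

i) let $p$ be prime and $\alpha \in \mathbb{N}\setminus\{0\}$, then $\varphi(p^{\alpha}) = p^{\alpha}\left(1 - \frac{1}{p}\right)$;
ii) let $n \in \mathbb{N}\setminus\{0\}$, then $n = \sum_{d \mid n} \varphi(d)$;
iii) $p$ is prime if and only if $\varphi(p) = p - 1$;
iv) let $a, b \in \mathbb{N}\setminus\{0\}$ such that $\gcd(a, b) = 1$, then $\varphi(ab) = \varphi(a)\varphi(b)$;
v) let $n \in \mathbb{N}$, then $\varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right)$, where $p$ is prime.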
Remark. For our work, result v) shows a fundamental property as this allows $\varphi(n)$ to be calculated for all $n$, by knowing its prime factorisation. The proof we give of v) relies on the preceding properties i), ii) and iv), which are worth stating as individual results in their own right.
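Before turning to the proof, the properties of Theorem 3.2 can be tested on small numbers. The sketch below compares the product formula of v) against a direct count from Definition 3.1 and spot-checks ii) and iv). The function names are hypothetical, and the distinct prime divisors are found by trial division, which is our own choice and only practical for small $n \ge 1$.

```python
from math import gcd
from typing import List

def phi_by_definition(n: int) -> int:
    """Count the 1 <= x <= n with gcd(x, n) = 1."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

def prime_divisors(n: int) -> List[int]:
    """Distinct prime divisors of n >= 1, by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def phi_by_formula(n: int) -> int:
    """Property v): n times the product of (1 - 1/p) over primes p dividing n."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)  # exact, since p divides result here
    return result

print(all(phi_by_formula(n) == phi_by_definition(n) for n in range(1, 300)))
# Property ii) for n = 36, and property iv) for the coprime pair (8, 15).
print(sum(phi_by_formula(d) for d in range(1, 37) if 36 % d == 0) == 36)
print(phi_by_formula(8 * 15) == phi_by_formula(8) * phi_by_formula(15))
```

The formula needs only the prime divisors of $n$, which is why knowledge of the factorisation of an RSA modulus would immediately reveal $\varphi(n)$.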
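Proof. i) Since $p$ is prime, the only numbers between 1 and $p^{\alpha}$ which share a common factor with $p^{\alpha}$ are the multiples of $p$ up to $p^{\alpha}$: $p, 2p, 3p, \ldots, p^{\alpha} - p, p^{\alpha}$. Therefore, all other numbers between 1 and $p^{\alpha}$ are relatively prime to $p^{\alpha}$. That is, we can calculate $\varphi(p^{\alpha})$ as $p^{\alpha}$ minus the number of multiples of $p$ up to $p^{\alpha}$, of which there are $p^{\alpha - 1}$: $\varphi(p^{\alpha}) = p^{\alpha} - p^{\alpha - 1} = p^{\alpha}\left(1 - \frac{1}{p}\right)$.

ii) Let $d$ be a divisor of $n$ and define the sets: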
S = {1, 2, . . . , n} and T (d) = {x ∈ S : gcd(x, n) = d}.
Firstly, observe that $\bigcup_{d \mid n} T(d) = S$ and also $T(d) \cap T(d') = \emptyset$ for $d \neq d'$, therefore,
\[ n = |S| = \sum_{d \mid n} |T(d)|. \tag{5} \]