Matijasevič’s Theorem: Diophantine descriptions of recursively enumerable sets

Bachelor’s thesis

S.R. Groen ∗

First supervisor: prof. dr. J. Top Second supervisor: dr. A.E. Sterk

2017

Abstract

In 1970, Yuri Matijasevič finished the proof that all recursively enumerable sets are Diophantine, rendering Hilbert’s tenth problem unsolvable. He did so by showing that exponential Diophantine sets are Diophantine, which complemented earlier work done by Martin Davis, Hilary Putnam and Julia Robinson. In this thesis, we analyze, explore and apply this result. We reconstruct a known way to create a Diophantine description of exponentiation: using the unit group of $\mathbb{Z}[\sqrt{d}]$. This provides a mechanism with which we can create a Diophantine description of any recursively enumerable set. We apply this to find Diophantine descriptions of some specific sets of integers. We also study the complexity of such Diophantine descriptions. Furthermore, we try to create a new method of creating a Diophantine description of exponentiation, using the unit group of $\mathbb{Z}[\sqrt[3]{d}]$, whose structure is similar to that of the unit group of $\mathbb{Z}[\sqrt{d}]$. It turns out that such a similar method does not work, as the desired divisibility sequences don’t exist.

Keywords: Hilbert’s tenth problem, Diophantine sets, Matijasevič’s theorem, number rings, algebraic .

∗Faculty of mathematics and natural sciences, Rijksuniversiteit Groningen, Nijenborgh 4, 9747 AG Groningen, The Netherlands

Contents

1 Introduction
  1.1 Hilbert’s tenth problem
  1.2 Diophantine sets
  1.3 Recursively enumerable sets
  1.4 Matijasevič’s Theorem
  1.5 The DPR-theorem
  1.6 From exponential Diophantine to Diophantine
  1.7 The aim of this thesis

2 A Diophantine description of exponentiation using $\mathbb{Z}[\sqrt{d}]$
  2.1 $\mathbb{Z}[\sqrt{d}]$ and its unit group
  2.2 The Pell equation
  2.3 Cyclicity
  2.4 A suitable choice for d
  2.5 Behavior of $x_n(a)$ and $y_n(a)$
  2.6 Finding the solution number using divisibility properties
  2.7 A Diophantine description of $x_n(a)$ and $y_n(a)$
  2.8 A Diophantine description of exponentiation

3 Expanding the language of Diophantine descriptions
  3.1 Diophantine descriptions of important functions
  3.2 The Bounded Universal Quantifier Theorem
  3.3 The Sequence Number Theorem
  3.4 Putnam’s trick

4 Application
  4.1 The set of primes
    4.1.1 The straightforward definition
    4.1.2 Wilson’s Theorem
  4.2 The divisor number function
  4.3 The divisor sum function
  4.4 Euler’s φ function
  4.5 Gödel’s incompleteness theorems

5 The complexity of Diophantine descriptions
  5.1 Degree
  5.2 Dimension

6 A Diophantine description of exponentiation using $\mathbb{Z}[\sqrt[3]{d}]$
  6.1 $\mathbb{Z}[\sqrt[3]{d}]$ and its unit group
  6.2 Application of Dirichlet’s theorem
  6.3 A suitable choice for d
  6.4 The three-dimensional Pell equation
  6.5 Behavior of $x_n(a)$, $y_n(a)$ and $z_n(a)$
  6.6 Finding the solution number using divisibility properties
  6.7 Are $x_n(a)$, $y_n(a)$ and $z_n(a)$ Diophantine?
  6.8 Comparison to two-dimensional case

7 Conclusion and outlook

1 Introduction

1.1 Hilbert’s tenth problem

In 1900, David Hilbert posed 23 then unsolved problems in mathematics that he encouraged mathematicians to solve in the twentieth century. Some of these problems, such as the Riemann hypothesis, are still unsolved. Hilbert’s tenth problem plays an important role in this thesis. Through work by Martin Davis, Julia Robinson, Hilary Putnam and Yuri Matijasevič, this problem has been shown to be unsolvable. The problem was posed in 1900 as follows.

Hilbert’s tenth problem: Devise an algorithm that, given any polynomial equation with integer coefficients as input, gives as output whether this polynomial has any roots over the integers. [Dav73]

In 1970, Matijasevič completed the proof that such an algorithm does not exist and that Hilbert’s tenth problem is thus impossible to solve. [Mat70] In this thesis, we will explore and apply the methods used in the proof of Matijasevič’s Theorem. A key notion in the theory applicable to this problem is the notion of Diophantine sets. This will be a vital concept throughout this thesis.

1.2 Diophantine sets

Definition 1.1. A set $S \subset \mathbb{Z}^n$ is Diophantine if there exists an $m \ge n$ and a polynomial $p \in \mathbb{Z}[X_1, X_2, \dots, X_m]$ for which the following holds:

$$S = \{ s \in \mathbb{Z}^n \mid \exists t \in \mathbb{Z}^{m-n} \text{ s.t. } p(s, t) = 0 \}$$

A Diophantine set $S \subset \mathbb{Z}^n$ is thus the projection of the set of zeros in $\mathbb{Z}^m$ of the polynomial $p \in \mathbb{Z}[X_1, X_2, \dots, X_m]$ onto the first n coordinates. An equation of the form $p(X_1, X_2, \dots, X_m) = 0$ is also called a Diophantine equation, and a Diophantine description of S. We will call p the polynomial corresponding to S. In the following examples, the set of numbers $X_1$ for which the polynomial p has integer roots is Diophantine:

• The even numbers, with the corresponding polynomial $p = X_1 - 2X_2$.

• The squares (of integers), with the corresponding polynomial $p = X_1 - X_2^2$.

• The non-negative integers, with the corresponding polynomial $p = X_1 - X_2^2 - X_3^2 - X_4^2 - X_5^2$. (Here we use Lagrange’s result that every non-negative integer is the sum of four squares; obviously negative integers can’t have that property.)

• The Pythagorean hypotenuse integers, with the corresponding polynomial $p = X_1^2 - X_2^2 - X_3^2$.

Another example of a Diophantine set is the set of composite numbers. At first glance, $p = X_1 - X_2X_3 = 0$ might seem like a suitable Diophantine description of this set, but it allows for one (or two) of the factors of $X_1$ (which are $X_2$ and $X_3$) to be equal to 1. We therefore also need that both are larger than 1, in order to find the positive composite numbers. That will result in either of the following equivalent Diophantine descriptions of the set of composite numbers:

$$p = (X_1 - X_2X_3)^2 + (X_2 - X_4^2 - X_5^2 - X_6^2 - X_7^2 - 2)^2 + (X_3 - X_8^2 - X_9^2 - X_{10}^2 - X_{11}^2 - 2)^2 = 0$$

$$p = X_1 - (X_2^2 + X_4^2 + X_5^2 + X_6^2 + 2)(X_3^2 + X_7^2 + X_8^2 + X_9^2 + 2) = 0$$

If we also want to find the negative composite numbers, this is equivalent to also allowing $X_3$ to be smaller than −1 instead of greater than 1. We would then have the following polynomial:

$$p = (X_1 - X_2X_3)^2 + (X_2 - X_4^2 - X_5^2 - X_6^2 - X_7^2 - 2)^2 + \left( (X_3 - X_8^2 - X_9^2 - X_{10}^2 - X_{11}^2 - 2)(X_3 + X_{12}^2 + X_{13}^2 + X_{14}^2 + X_{15}^2 + 2) \right)^2 = 0$$

We will later see that the complement of this last set, which is the set of primes, is also Diophantine. However, this is far from trivial and may feel counterintuitive at this moment.
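The description of the positive composite numbers as a product of two factors, each of the form (sum of four squares) + 2, can be sanity-checked by brute force for small values. The sketch below is only an illustration: the function names and the bound `limit` are hypothetical, not part of the thesis. It compares the values produced by that description with the composite numbers found by trial division.

```python
from itertools import product

def four_square_values(limit):
    """All values w^2 + x^2 + y^2 + z^2 + 2 that do not exceed limit."""
    roots = range(int(limit ** 0.5) + 1)
    return {
        w * w + x * x + y * y + z * z + 2
        for w, x, y, z in product(roots, repeat=4)
        if w * w + x * x + y * y + z * z + 2 <= limit
    }

def composites_from_description(limit):
    """Values X1 <= limit for which X1 - F*G = 0 is solvable, where F and G
    are both of the form (sum of four squares) + 2."""
    factors = sorted(four_square_values(limit))
    return {f * g for f in factors for g in factors if f * g <= limit}

def composites_by_definition(limit):
    """Positive integers <= limit with a divisor strictly between 1 and themselves."""
    return {n for n in range(4, limit + 1) if any(n % k == 0 for k in range(2, n))}
```

For a small bound such as `limit = 60` the two sets agree, which is what Lagrange’s four square theorem predicts: each factor can take any value that is at least 2.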

As the example of the set of composite numbers shows, using the four squares theorem so many times is quite a hassle. We therefore assume from now on that every variable can only be non-negative. This is no loss of generality, as we can always introduce a minus sign to let a number be negative. We conclude that, for any set S, the following three are equivalent:

1. There exists a Diophantine description of S in the integers

2. There exists a Diophantine description of S in the non-negative integers

3. There exists a Diophantine description of S in the positive integers

This is because we can always introduce a minus sign or use Lagrange’s four square theorem.

It is straightforward to prove that the set of Diophantine sets is closed under union: if we have $S_1$ and $S_2$, with corresponding $p_1$ and $p_2$, the set $S_1 \cup S_2$ has corresponding polynomial $p_1 \cdot p_2$. This polynomial is zero if and only if at least one of the polynomials $p_1$ and $p_2$ is zero, which means we are dealing with an element of $S_1$ or $S_2$. Similarly, the set $S_1 \cap S_2$ has corresponding polynomial $p_1^2 + p_2^2$. We have already applied this technique to our Diophantine description of composite numbers. Not all Diophantine sets have the property that their complement is also Diophantine, but this is not trivial to see either.

We can now see what the algorithm Hilbert asked for should do precisely: it should be able to decide within a finite amount of time, given any polynomial as input, whether the corresponding Diophantine set is empty or non-empty.

1.3 Recursively enumerable sets

Another important notion will be the notion of recursively enumerable sets.

Definition 1.2. A set S is recursively enumerable if there exists an algorithm (for a Turing machine) that enumerates S. Equivalently: an algorithm exists that halts precisely when its input is an element of S.

Examples of such sets are the following:

• Any finite set: $S = \{s_1, s_2, \dots, s_n\} = \{s \mid s = s_1 \vee s = s_2 \vee \dots \vee s = s_n\}$

• The positive numbers: $S = \{s \mid s > 0\}$

• The even numbers: $S = \{s \mid \exists y \; s = 2y\}$

• The set of powers of 2: $S = \{s \mid \exists y \; s = 2^y\}$

• The set of prime numbers: S = {s | ¬(∃y)1

1.4 Matijasevič’s Theorem

As said, Hilbert’s tenth problem boils down to devising an algorithm that decides membership of any given Diophantine set, within a finite amount of time. Already in 1936, Alonzo Church showed that an algorithm that decides membership of any given recursively enumerable set can’t exist. [Chu36] It was only in 1970 that Matijasevič completed the work of proving Matijasevič’s Theorem, also known as the MRDP-theorem or DPRM-theorem (in credit to the others that had contributed).

Theorem 1.3. Matijasevič’s Theorem. A set is Diophantine if and only if it is recursively enumerable. [Mat70]

Then, since there does not exist an algorithm that can determine, within a finite amount of time, of any recursively enumerable set whether it’s non-empty, neither can we find an algorithm that does this for Diophantine sets. But this algorithm was exactly what Hilbert had asked for. Thus Matijasevič’s result implies that Hilbert’s tenth problem is unsolvable.

The proof of Matijasevič’s Theorem is a main subject of this thesis. We can make a good start by proving the following lemma:

Lemma 1.4. If a set is Diophantine, then it is recursively enumerable.

Proof. Let $S \subset \mathbb{Z}^n$ be a Diophantine set, with the corresponding polynomial $p \in \mathbb{Z}[X_1, X_2, \dots, X_m]$ and $n \le m$. Let $s \in \mathbb{Z}^n$ be arbitrary. Then an algorithm that systematically checks all elements of $\mathbb{Z}^{m-n}$ (for instance by ordering on the sum of the absolute values of the coordinates, so that only finitely many elements precede any given one) will suffice. If s is in S, then, by definition of S, $p(s, t) = 0$ for some $t \in \mathbb{Z}^{m-n}$. We know that this t will be found by our algorithm after a finite amount of time, and therefore the algorithm will halt. On the other hand, when s is not in S, no such $t \in \mathbb{Z}^{m-n}$ can be found, and thus our algorithm will run forever. As this algorithm halts precisely for elements in S, we conclude that S is recursively enumerable. Moreover, S was chosen arbitrarily, so we conclude that every Diophantine set is recursively enumerable.

The algorithm described here is not the algorithm Hilbert asked for, as this algorithm is not able to conclude within a finite amount of time that a Diophantine equation does not have any integer solutions.

Lemma 1.4 has a short proof. The inclusion the other way around, which is that all recursively enumerable sets are Diophantine, is a much more complicated and surprising result. It means that any recursively enumerable set of integers, e.g. the set of primes, corresponds to a polynomial that has a zero in the integers precisely when the first coordinate is an element of our set. In other words, every property of integers that can be found algorithmically is expressible by a Diophantine equation.

Before Matijasevič started his work, a lot of work had already been done on the connection between recursively enumerable sets and Diophantine sets. The most important result of that research is described in the following section.
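The search procedure from the proof of Lemma 1.4 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis’s own code: candidate witnesses t are ordered by the sum of the absolute values of their coordinates (one concrete ordering in which every level is finite), and the names `witnesses`, `semi_decide` and `max_size` are hypothetical. The `max_size` cut-off is an addition so that the demonstration terminates; without it the search halts exactly for elements of S.

```python
from itertools import count, product

def witnesses(dim, size):
    """Yield every integer tuple of length dim whose absolute values sum to size."""
    for t in product(range(-size, size + 1), repeat=dim):
        if sum(abs(c) for c in t) == size:
            yield t

def semi_decide(p, s, dim, max_size=None):
    """Search for t with p(*s, *t) == 0, trying tuples t in order of increasing size.

    With max_size=None the search halts exactly when s is in the set, as in
    the proof; max_size is an extra cut-off added here so the demo terminates.
    """
    sizes = count() if max_size is None else range(max_size + 1)
    for size in sizes:
        for t in witnesses(dim, size):
            if p(*s, *t) == 0:
                return t
    return None

def even(x1, x2):
    # Polynomial p = X1 - 2*X2 describing the even numbers.
    return x1 - 2 * x2
```

Calling `semi_decide(even, (10,), 1)` finds the witness `(5,)`, while `semi_decide(even, (7,), 1, max_size=20)` gives up and returns `None`. Without the cut-off the second call would never return, which is exactly why this procedure does not solve Hilbert’s tenth problem.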

1.5 The DPR-theorem

In 1961, Martin Davis, Julia Robinson and Hilary Putnam proved the Davis-Putnam-Robinson-theorem, or in short DPR-theorem, which was a very important step towards Matijasevič’s Theorem. In order to state this theorem, we must first define what an exponential Diophantine set is.

Definition 1.5. A set is exponential Diophantine if there exists an $m \ge 0$, a base b and a polynomial $p \in \mathbb{Z}[X_1, X_2, \dots, X_{n+2m}]$ such that the following holds:

$$S = \{ s \in \mathbb{Z}^n \mid \exists (t_1, t_2, \dots, t_m) \in \mathbb{Z}^m \text{ s.t. } p(s, t_1, t_2, \dots, t_m, b^{t_1}, b^{t_2}, \dots, b^{t_m}) = 0 \}$$

We can now state the DPR-theorem.

Theorem 1.6. DPR-theorem. Every recursively enumerable set is exponential Diophantine. [DPR61]

The proof of this theorem is intricate and contains quite a lot of analysis of and the Turing machine. It will therefore not be included in this thesis. An outline of the proof can be found in [DPR61] or in secondary literature, for instance [Kui10].

1.6 From exponential Diophantine to Diophantine

Now that we take the DPR-theorem for granted, the step towards Diophantine sets has become smaller. All we have to prove now, and what Matijasevič proved in 1970, is that all exponential Diophantine sets are Diophantine. This is equivalent to proving that sets of the form

$$S = \{ (a, b, c) \in \mathbb{Z}^3 \mid a = b^c \}$$

are Diophantine. Put otherwise, exponentiation is a Diophantine function.

There are several strategies for finding a Diophantine description of such sets. The strategy first used by Matijasevič in his original proof was based on the Fibonacci numbers. [Mat70] In slightly different words, it was based on the unit group (the elements with a multiplicative inverse) of the number ring $\mathbb{Z}\left[\frac{1+\sqrt{5}}{2}\right]$. After Matijasevič’s proof, similar proofs were given using $\mathbb{Z}[\sqrt{d}]$ for a non-square d. Although Matijasevič’s construction sufficed to show that all exponential Diophantine sets are Diophantine, the second approach is more useful for the systematic construction of Diophantine descriptions of recursively enumerable sets.

The use of unit groups of number rings can briefly be explained as follows. We will see that these unit groups have a cyclic subgroup, for which there is a Diophantine description. If every element of this subgroup is a power of some fundamental unit, the units behave exponentially with respect to the exponent to which the fundamental unit is raised. Roughly speaking, if we can relate a unit to the exponent using Diophantine equations, a Diophantine description of exponentiation follows. Divisibility sequences, such as the Fibonacci sequence, play a central role in relating a certain unit to the exponent.

The proof by Matijasevič is constructive: it contains a recipe that can turn any recursively enumerable set into a Diophantine set. Specifically, it can turn an algorithm that enumerates a set into a polynomial corresponding to that set.

1.7 The aim of this thesis

This thesis will focus on the construction of Diophantine descriptions of recursively enumerable sets. Firstly, in section 2, we will reconstruct the proof that exponential Diophantine sets are Diophantine, using $\mathbb{Z}[\sqrt{d}]$. Subsequently, in section 3, we give some new tools in Diophantine descriptions, using our Diophantine description of exponentiation. We then determine, in section 4, Diophantine descriptions corresponding to concrete sets, such as the following:

• The set of primes.

• The set of perfect numbers (numbers that are equal to the sum of their proper divisors).

• Highly composite numbers (numbers that have more divisors than all smaller numbers).

As said, Matijasevič’s proof is constructive, so Diophantine descriptions of these sets of integers can be made explicit in a systematic way. Finding these Diophantine descriptions will provide insight into how the recipe and all the theory involved works precisely.

In section 5 we will also look at the complexity of Diophantine descriptions in terms of dimension (the number of variables in the Diophantine equation) and degree (the degree of the Diophantine equation). We examine to what extent both can be minimized.

Furthermore, in section 6 we will discuss a different strategy of proving that exponential Diophantine sets are Diophantine. Instead of $\mathbb{Z}[\sqrt{d}]$, we will look at the number ring $\mathbb{Z}[\sqrt[3]{d}]$. In this ring, everything becomes a bit more complicated, and we work in a three-dimensional system instead of a two-dimensional one. However, its unit group has the same useful properties. If our approach using $\mathbb{Z}[\sqrt[3]{d}]$ works, this entails a new proof of Matijasevič’s Theorem.

As it turns out, many lemmas from section 2 have a three-dimensional analog in section 6, but the new approach ultimately does not work. It cannot similarly lead to a new Diophantine description of exponentiation. This is because the required divisibility sequences do not exist, which had already been shown in 1936 (in a paper not related to Matijasevič’s Theorem). [Hal36] This is an unexpected twist, as we thought a divisibility sequence would follow naturally. It makes Matijasevič’s Theorem and the two-dimensional case even more subtle and special.

Treating the three-dimensional case also provides lots of knowledge of the theory of unit groups of number rings, as the proofs in the three-dimensional case are more intricate and need more advanced algebra. This knowledge also gives us a deeper insight into why the two-dimensional case works as well as it does, and into Matijasevič’s Theorem in general.

2 A Diophantine description of exponentiation using $\mathbb{Z}[\sqrt{d}]$

A common approach in finding a Diophantine description of exponentiation is to use the cyclicity of a subgroup of the unit group of $\mathbb{Z}[\sqrt{d}]$, for some non-square d. Elements of this subgroup correspond to solutions to the Pell equation. We first need to prove some lemmas about these solutions. We can eventually apply these to find a Diophantine description of exponentiation, which means that $\{(a, b, c) \mid a = b^c\}$ is a Diophantine set, the result that Matijasevič obtained in 1970. This section provides a construction of this Diophantine description of exponentiation. Our construction is similar to the construction in [Dav73].

2.1 $\mathbb{Z}[\sqrt{d}]$ and its unit group

A ring that is useful for our purpose is the following:

$$\mathbb{Z}[\sqrt{d}] = \{ x + y\sqrt{d} \mid x, y \in \mathbb{Z} \}$$

in which d is a positive integer, but not the square of an integer. This ring is an example of a number ring:

Definition 2.1. A ring K is called a number ring if its field of fractions is a number field.

In our case, the field of fractions of $\mathbb{Z}[\sqrt{d}]$ is $\mathbb{Q}(\sqrt{d})$. The latter is an algebraic field extension of $\mathbb{Q}$ and hence a number field. We wish to determine which of the elements of this ring have a multiplicative inverse. That is, we are studying the unit group of $\mathbb{Z}[\sqrt{d}]$:

$$\mathbb{Z}[\sqrt{d}]^\times = \{ \alpha \in \mathbb{Z}[\sqrt{d}] \mid \exists \beta \in \mathbb{Z}[\sqrt{d}] \text{ s.t. } \alpha \cdot \beta = 1 \}$$

It can straightforwardly be seen that this set is in fact a group under multiplication. Firstly, 1 is obviously a unit. Furthermore, the product of two units and the inverse of a unit are again units, which proves that the set of units is a group. In order to find out whether some element is a unit, we compute the norm of that element.

Definition 2.2. Let $\alpha = x + y\sqrt{d}$ be an element of $\mathbb{Z}[\sqrt{d}]$. We define the norm of α as follows:

$$N(\alpha) = x^2 - dy^2 = (x + y\sqrt{d})(x - y\sqrt{d}) \tag{2.1}$$

The origin of this norm can be explained using a little module theory. The ring $\mathbb{Z}[\sqrt{d}]$ is a module over $\mathbb{Z}$, with the basis $\{1, \sqrt{d}\}$. Then, for every element of $\mathbb{Z}[\sqrt{d}]$ there exists a matrix in $\mathbb{Z}^{2 \times 2}$ that corresponds to multiplication by that element. In order to determine the columns of this matrix $M_\alpha$, we check what multiplying with α does with the elements of our basis:

$$(x + y\sqrt{d}) \cdot 1 = x + y\sqrt{d}$$
$$(x + y\sqrt{d}) \cdot \sqrt{d} = yd + x\sqrt{d}$$

This gives us the following matrix:

$$M_\alpha = \begin{pmatrix} x & yd \\ y & x \end{pmatrix}$$

We then compute the determinant of this matrix.

$$\det(M_\alpha) = \begin{vmatrix} x & yd \\ y & x \end{vmatrix} = x^2 - dy^2 = (x + y\sqrt{d})(x - y\sqrt{d}) \tag{2.2}$$

This norm resembles the usual norm for complex numbers or the Euclidean norm.

Lemma 2.3. $\alpha \in \mathbb{Z}[\sqrt{d}]$ is a unit if and only if $N(\alpha) = \pm 1$.

Proof. Only if: Suppose α is a unit. Since x and y are integers, N(α) will also be an integer. Furthermore, the norm is multiplicative: if $\alpha = x + y\sqrt{d}$ and $\beta = \chi + \psi\sqrt{d}$, then we have

$$\begin{aligned}
N(\alpha\beta) &= N\left( (x + y\sqrt{d})(\chi + \psi\sqrt{d}) \right) \\
&= N\left( (x\chi + y\psi d) + (x\psi + y\chi)\sqrt{d} \right) \\
&= (x\chi + y\psi d)^2 - d(x\psi + y\chi)^2 \\
&= x^2\chi^2 + 2dxy\chi\psi + d^2y^2\psi^2 - dx^2\psi^2 - 2dxy\chi\psi - dy^2\chi^2 \\
&= (x^2 - dy^2)(\chi^2 - d\psi^2) \\
&= N(\alpha)N(\beta)
\end{aligned}$$

Because of equation (2.2), this multiplicativity also follows directly from the multiplicativity of the determinant. Now, if α and β are each other’s inverse, then

$$N(\alpha)N(\beta) = N(\alpha\beta) = N(1) = 1$$

It follows that N(α) and N(β) are each other’s inverse in $\mathbb{Z}$. The only units in $\mathbb{Z}$ are −1 and 1, so we conclude

$$N(\alpha) = N(\beta) = \pm 1$$

If: Suppose $\alpha = x + y\sqrt{d}$ is such that $N(\alpha) = \pm 1$. Then it follows from equation (2.1) that

$$(x + y\sqrt{d})(x - y\sqrt{d}) = x^2 - dy^2 = \pm 1$$

We have thus found the multiplicative inverse of α, namely $\beta = \pm(x - y\sqrt{d})$, and we conclude that α is a unit.

Lemma 2.3 provides a different notation for the unit group of $\mathbb{Z}[\sqrt{d}]$:

$$x + y\sqrt{d} = \alpha \in \mathbb{Z}[\sqrt{d}]^\times \iff N(\alpha) = x^2 - dy^2 = \pm 1$$
$$\Rightarrow \mathbb{Z}[\sqrt{d}]^\times = \{ \alpha \in \mathbb{Z}[\sqrt{d}] \mid N(\alpha) = \pm 1 \}$$

This unit group will be of great importance in our construction of a Diophantine description of exponential sets.
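The norm, its multiplicativity and the inverse found in the proof of Lemma 2.3 are easy to try out numerically. The following is a minimal sketch in which an element $x + y\sqrt{d}$ is represented by the integer pair `(x, y)`; this representation and the function names are choices made for the illustration only.

```python
def multiply(alpha, beta, d):
    """Product of x + y*sqrt(d) and u + v*sqrt(d), both given as pairs (x, y)."""
    x, y = alpha
    u, v = beta
    return (x * u + d * y * v, x * v + y * u)

def norm(alpha, d):
    """N(x + y*sqrt(d)) = x^2 - d*y^2, the determinant of [[x, d*y], [y, x]]."""
    x, y = alpha
    return x * x - d * y * y

def inverse(alpha, d):
    """Inverse of a unit, following the proof: N(alpha) * (x - y*sqrt(d))."""
    n = norm(alpha, d)
    if n not in (1, -1):
        raise ValueError("not a unit: norm is %d" % n)
    x, y = alpha
    return (n * x, -n * y)
```

For example, with d = 2 the element $1 + \sqrt{2}$ has norm −1, and multiplying it by `inverse((1, 1), 2)` gives `(1, 0)`, the pair that represents 1.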

2.2 The Pell equation

We now study the following set:

$$\mathbb{Z}[\sqrt{d}]^\times \supset G_d = \{ \alpha \in \mathbb{Z}[\sqrt{d}] \mid N(\alpha) = 1, \; \alpha > 0 \} \tag{2.3}$$

Lemma 2.4. $G_d$ is a subgroup of $\mathbb{Z}[\sqrt{d}]^\times$.

Proof. 1. First of all, we observe that 1 ∈ Gd.

2. If α and β are positive, so is αβ. Furthermore, N(αβ) = N(α)N(β) = 1 · 1 = 1, so Gd is closed under multiplication.

3. If α is in $G_d$, then, since $\alpha\alpha^{-1} = 1$, $\alpha^{-1}$ is also positive. Finally, $N(\alpha)N(\alpha^{-1}) = 1 \cdot N(\alpha^{-1}) = 1$, so $N(\alpha^{-1}) = 1$, and hence $\alpha^{-1} \in G_d$. $G_d$ is thus closed under inverses.

We conclude that $G_d$ is indeed a subgroup of $\mathbb{Z}[\sqrt{d}]^\times$.

Lemma 2.5. A necessary and sufficient condition for any $\alpha = x + y\sqrt{d}$ to be an element of $G_d$ is that x is positive and (x, y) is an integer solution to the Pell equation:

x2 − dy2 = 1 (2.4)

Proof. Necessity: Suppose $\alpha = x + y\sqrt{d}$ is an element of $G_d$. This means $\alpha > 0$ and $N(\alpha) = x^2 - dy^2 = 1$. It follows that (x, y) is a solution to (2.4). Now, as $G_d$ is closed under inverses, it follows that $\alpha^{-1} = x - y\sqrt{d}$ is also in $G_d$, and hence positive:

$$x + y\sqrt{d} > 0$$
$$x - y\sqrt{d} > 0$$

Adding these inequalities gives $2x > 0$, so $x > 0$. We conclude that the condition is satisfied.

Sufficiency: Suppose x is positive and (x, y) is a solution to equation (2.4). We define $\alpha = x + y\sqrt{d}$. Equation (2.4) implies that $N(\alpha) = x^2 - dy^2 = 1$. Furthermore, as x is positive, at least one of $\alpha = x + y\sqrt{d}$ and $\alpha^{-1} = x - y\sqrt{d}$ is positive, since their sum is $2x > 0$. $\alpha\alpha^{-1} = 1$ implies they have the same sign. It follows that both must be positive, and thus $\alpha > 0$. We conclude that α is an element of $G_d$.

We furthermore observe that the Pell equation is a Diophantine equation and positivity is a Diophantine property, such that the set

$$\{ (x, y) \in \mathbb{Z}^2 \mid x + y\sqrt{d} \in G_d \} = \{ (x, y) \in \mathbb{Z}^2 \mid x^2 - dy^2 = 1, \; x > 0 \}$$

is a Diophantine set. This is a crucial part in the construction of a Diophantine description of exponentiation.

2.3 Cyclicity

To find the structure of $\mathbb{Z}[\sqrt{d}]^\times$ and $G_d$, we use a general and powerful unit theorem by Johann Dirichlet. We first need to define what an order of a number field is.

Definition 2.6. An order $\mathcal{O}$ in a number field K is a subring of K that is free of rank $n = [K : \mathbb{Q}]$ as a $\mathbb{Z}$-module.

Theorem 2.7. Dirichlet (1846): Let K be a number field with $r_1$ real embeddings and $r_2$ pairs of complex conjugate embeddings (so $2r_2$ complex embeddings in total). Then the unit group of any order in K is finitely generated with $r_1 + r_2 - 1$ independent generators of infinite order. More precisely, letting $r = r_1 + r_2 - 1$, any order $\mathcal{O}$ in K contains multiplicatively independent units $u_1, \dots, u_r$ of infinite order such that every unit in $\mathcal{O}$ can be written uniquely in the form

$$\zeta u_1^{m_1} \cdots u_r^{m_r}$$

where ζ is a root of unity and every $m_i$ is an integer. Abstractly, $\mathcal{O}^\times \cong \mu(\mathcal{O}) \times \mathbb{Z}^r$, where $\mu(\mathcal{O})$ is the finite group of roots of unity in $\mathcal{O}$. [Con]

A proof of Dirichlet’s Unit Theorem, which can be found in [Con], is too intricate to include in this thesis. Nevertheless, it is a very powerful tool for us to use.

We can readily apply it to our case. We are working with the number field $K = \mathbb{Q}(\sqrt{d})$. As $\{1, \sqrt{d}\}$ forms a basis for the vector space $\mathbb{Q}(\sqrt{d})$ over $\mathbb{Q}$, we have $[\mathbb{Q}(\sqrt{d}) : \mathbb{Q}] = 2$. We also have $\mathbb{Z}[\sqrt{d}]$ as a subring of $\mathbb{Q}(\sqrt{d})$, and $\mathbb{Z}[\sqrt{d}]$ is free with rank 2, such that $\mathbb{Z}[\sqrt{d}]$ is an order in $\mathbb{Q}(\sqrt{d})$. We can hence apply Dirichlet’s Theorem 2.7. We observe that the roots of unity of $\mathbb{Z}[\sqrt{d}]$ are just ±1, which means that $\mu(\mathbb{Z}[\sqrt{d}]) = \{\pm 1\}$. Furthermore, we can embed $\mathbb{Q}(\sqrt{d})$ in the real numbers by the following ring homomorphism:

$$f : \mathbb{Q}(\sqrt{d}) \hookrightarrow \mathbb{R}, \qquad x + y\sqrt{d} \mapsto x - y\sqrt{d}$$

Of course, as $\mathbb{Q}(\sqrt{d})$ is contained in $\mathbb{R}$, the identity is also an embedding of $\mathbb{Q}(\sqrt{d})$ in $\mathbb{R}$. That these are the only two embeddings follows from field theory: the minimum polynomial of $\sqrt{d}$ is $X^2 - d$. These embeddings permute the zeros of this polynomial, and are the identity on $\mathbb{Q}$. There can only be two such embeddings, as this polynomial only has two zeros.

From this we can determine r. The number of real embeddings, $r_1$, is 2, whereas the number of pairs of complex embeddings, $r_2$, is 0. This gives us $r = r_1 + r_2 - 1 = 2 + 0 - 1 = 1$. We thus only have one generator of infinite order, which we call u, the fundamental unit of $\mathbb{Z}[\sqrt{d}]$. It follows that the unit group is of the following form:

$$\mathbb{Z}[\sqrt{d}]^\times = \{ \pm u^n \mid n \in \mathbb{Z} \}$$

This unit u could be the smallest unit greater than 1, or its inverse, the greatest unit smaller than 1. We define it to be the smallest unit greater than 1. The group $\mathbb{Z}[\sqrt{d}]^\times$ has 2 generators, namely −1 and u. Note that −1 has order 2 and u has infinite order.

Let us now look at the set of positive units:

$$\{ \alpha \in \mathbb{Z}[\sqrt{d}]^\times \mid \alpha > 0 \} = \{ u^n \mid n \in \mathbb{Z} \}$$

This group has only one generator: we have eliminated the generator −1. We have thus ended up with a cyclic group. We can now return to the group $G_d$ from (2.3). This is a subgroup of the group of positive units, and is therefore itself cyclic. It is hence of the following form:

$$G_d = \{ \alpha \in \mathbb{Z}[\sqrt{d}] \mid N(\alpha) = 1, \; \alpha > 0 \} = \{ u_1^n \mid n \in \mathbb{Z} \}$$

where $u_1$ is the smallest element of $\mathbb{Z}[\sqrt{d}]$ greater than 1 with norm 1. Now we have also excluded the units with norm −1. The unit $u_1$ is the fundamental unit of $G_d$. It need not be identical to u, the fundamental unit of $\mathbb{Z}[\sqrt{d}]$, as the latter could have norm −1.

The fact that the group $G_d$ is cyclic can also be proven without Dirichlet’s Unit Theorem, using a more elementary arithmetical proof. [Dav73] However, we wish to find a deeper, more general reason for this fact, as this will be useful in section 6. This reason is Dirichlet’s Unit Theorem.

2.4 A suitable choice for d

Finding the fundamental unit of the group $G_d$ given some non-square d is not always easy. On top of that, such a unit can be quite large. For instance, if d is equal to 1141, the fundamental unit is

$$1036782394157223963237125215 + 30693385322765657197397208\sqrt{1141}.$$

The case d = 1000099 is even worse. In that case, the smallest positive value of y that is part of a solution to the Pell equation has 115 decimal digits. [Ste12] Another infamous example of such a huge solution to a Pell equation is Archimedes’ cattle problem, a relatively simple problem posed by the ancient Greek mathematician Archimedes. Solving the cattle problem eventually comes down to solving a Pell equation, for which the smallest solution has 206545 decimal digits. [Len02] Clever algorithms are needed to find such units, as simply trying values for x and y will take too long.

However, if we choose $d = a^2 - 1$ for some integer $a > 1$, finding the fundamental unit is easier. We obtain the following Pell equation:

x2 − (a2 − 1)y2 = 1 (2.5)

We can straightforwardly check that (x, y) = (a, 1) is a solution to this Pell equation.

x2 − (a2 − 1)y2 = a2 − (a2 − 1) = 1

Throughout the rest of this section, we will use the letter d as an abbreviation for the expression $a^2 - 1$, with $a > 1$. We can now prove that we have found the fundamental unit of $G_d$.

Lemma 2.8. The fundamental unit of $G_d$ is $u_1 = a + \sqrt{d}$.

Proof. The proof is by contradiction. Assume that $u_1$ is not the fundamental unit of $G_d$, such that $G_d$ has an element larger than 1, but smaller than $u_1$. That is, that there exist x and y such that $x^2 - dy^2 = 1$ and $1 < x + y\sqrt{d} < a + \sqrt{d}$. By

$$(x + y\sqrt{d})(x - y\sqrt{d}) = 1 = (a + \sqrt{d})(a - \sqrt{d}),$$

it follows that $1 > x - y\sqrt{d} > a - \sqrt{d}$, and hence $-1 < -x + y\sqrt{d} < -a + \sqrt{d}$. Adding this to the assumed inequality gives $0 < 2y\sqrt{d} < 2\sqrt{d}$. No integer y can fulfill this last inequality. This is a contradiction, from which we conclude that $u_1$ is the fundamental unit of $G_d$.
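Lemma 2.8 can be checked against a naive search for small values of a. The sketch below simply tries y = 1, 2, 3, ... until $dy^2 + 1$ is a perfect square. It relies on the fact that among solutions with positive x and y the value $x + y\sqrt{d}$ increases with y, so the first hit is the fundamental unit of $G_d$. The function name and the bound `y_limit` are hypothetical, and this brute force is not one of the clever algorithms needed for large solutions.

```python
from math import isqrt

def smallest_pell_solution(d, y_limit=10**6):
    """Smallest (x, y) with y >= 1 and x*x - d*y*y == 1, found by trying
    y = 1, 2, 3, ... up to y_limit. Returns None if none is found in range."""
    for y in range(1, y_limit + 1):
        x_squared = d * y * y + 1
        x = isqrt(x_squared)
        if x * x == x_squared:
            return (x, y)
    return None
```

For d = 7 the search returns `(8, 3)`, whereas for every d of the form $a^2 - 1$ it stops immediately at `(a, 1)`, in line with the lemma. For a value such as d = 61 the default bound is already too small, which illustrates why brute force does not scale.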

We now know what the positive solutions $(x_n(a), y_n(a))$ to equation (2.5) look like precisely:

$$x_n(a) + y_n(a)\sqrt{d} = \left( a + \sqrt{d} \right)^n \tag{2.6}$$

in which n ranges over the integers. As seen, the sequences $x_n$ and $y_n$ are functions of a.

2.5 Behavior of $x_n(a)$ and $y_n(a)$

Because of Dirichlet’s Unit Theorem 2.7, every positive solution (x, y) to the Pell equation (2.5) is equal to $(x_n(a), y_n(a))$ for some n, as given by equation (2.6). We can now prove some lemmas about the arithmetical behavior of $x_n(a)$ and $y_n(a)$. From now on, we will drop the dependence on a, and just write $x_n$, $y_n$.

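Lemma 2.9. If $(x_m, y_m)$ and $(x_n, y_n)$ are solutions to equation (2.5), then we have

$$x_{m \pm n} = x_m x_n \pm d y_m y_n$$
$$y_{m \pm n} = x_n y_m \pm x_m y_n$$

Proof. Let $(x_m, y_m)$ and $(x_n, y_n)$ be solutions to the Pell equation (2.5). Then we compute

$$\begin{aligned}
x_{m+n} + y_{m+n}\sqrt{d} &= (a + \sqrt{d})^{m+n} \\
&= (a + \sqrt{d})^m (a + \sqrt{d})^n \\
&= (x_m + y_m\sqrt{d})(x_n + y_n\sqrt{d}) \\
&= (x_m x_n + d y_m y_n) + (x_n y_m + x_m y_n)\sqrt{d}
\end{aligned}$$

So, $x_{m+n} = x_m x_n + d y_m y_n$ and $y_{m+n} = x_n y_m + x_m y_n$. Similarly, we compute

$$\begin{aligned}
x_{m-n} + y_{m-n}\sqrt{d} &= (a + \sqrt{d})^{m-n} \\
&= (a + \sqrt{d})^m (a + \sqrt{d})^{-n} \\
&= (x_m + y_m\sqrt{d})(x_n + y_n\sqrt{d})^{-1} \\
&= (x_m + y_m\sqrt{d})(x_n - y_n\sqrt{d}) \\
&= (x_m x_n - d y_m y_n) + (x_n y_m - x_m y_n)\sqrt{d}
\end{aligned}$$

which proves the lemma.

Substituting $n = \pm 1$ in Lemma 2.9, or simply working out $x_{m \pm 1} + y_{m \pm 1}\sqrt{d} = (x_m + y_m\sqrt{d})(a \pm \sqrt{d})$, gives us the following relations:

$$x_{m \pm 1} = a x_m \pm d y_m \tag{2.7}$$
$$y_{m \pm 1} = a y_m \pm x_m \tag{2.8}$$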
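The addition formulas of Lemma 2.9 can be verified numerically for small indices. In the following sketch, `unit_power` computes $(x_n, y_n)$ by repeatedly multiplying by $a + \sqrt{d}$, and `addition_formulas_hold` compares both sides of the lemma. Both function names are hypothetical and only non-negative indices with m ≥ n are covered.

```python
def unit_power(a, n):
    """(x_n, y_n) with x_n + y_n*sqrt(d) = (a + sqrt(d))**n, d = a*a - 1, n >= 0."""
    d = a * a - 1
    x, y = 1, 0
    for _ in range(n):
        # (x + y*sqrt(d)) * (a + sqrt(d)) = (a*x + d*y) + (x + a*y)*sqrt(d)
        x, y = a * x + d * y, x + a * y
    return x, y

def addition_formulas_hold(a, m, n):
    """Check Lemma 2.9 for one choice of a and m >= n >= 0."""
    d = a * a - 1
    xm, ym = unit_power(a, m)
    xn, yn = unit_power(a, n)
    plus = unit_power(a, m + n) == (xm * xn + d * ym * yn, xn * ym + xm * yn)
    minus = unit_power(a, m - n) == (xm * xn - d * ym * yn, xn * ym - xm * yn)
    return plus and minus
```

With a = 2 (so d = 3) the first solutions are (1, 0), (2, 1), (7, 4), (26, 15), and for instance `addition_formulas_hold(2, 3, 1)` evaluates to `True`.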

We can now state the recursive relations for the sequences $x_n$ and $y_n$, by which they are called Lucas sequences. [JSWW76]

Lemma 2.10. $x_{n+1} = 2ax_n - x_{n-1}$ and $y_{n+1} = 2ay_n - y_{n-1}$.

Proof. From equation (2.7) follows

$$x_{n+1} = ax_n + dy_n$$
$$x_{n-1} = ax_n - dy_n$$

Adding these two equations gives

$$x_{n+1} + x_{n-1} = 2ax_n \;\Rightarrow\; x_{n+1} = 2ax_n - x_{n-1}$$

Similarly, from equation (2.8) follows

$$y_{n+1} = ay_n + x_n$$
$$y_{n-1} = ay_n - x_n$$

Adding these two equations gives

$$y_{n+1} + y_{n-1} = 2ay_n \;\Rightarrow\; y_{n+1} = 2ay_n - y_{n-1}$$

The proof given above is the elementary arithmetical proof given by Martin Davis in [Dav73]. It is instructive to also give a different proof that is a bit more involved, but gives more insight into why this lemma holds. The following proof uses matrix notation and the Cayley-Hamilton Theorem to prove Lemma 2.10.

Proof. We denote a solution to the Pell equation (2.5) by the column vector

$$\begin{pmatrix} x_n \\ y_n \end{pmatrix}$$

such that $x_n$ and $y_n$ are the coordinates of a unit in the ordered basis $\{1, \sqrt{d}\}$. We now have to find the matrix corresponding to multiplication with our fundamental unit $a + \sqrt{d}$. We find the columns of this matrix by checking how this multiplication acts on our basis vectors, 1 and $\sqrt{d}$.

$$(a + \sqrt{d}) \cdot 1 = a + \sqrt{d}$$
$$(a + \sqrt{d}) \cdot \sqrt{d} = d + a\sqrt{d}$$

This gives us the following matrix:

$$A = M_{u_1} = \begin{pmatrix} a & d \\ 1 & a \end{pmatrix}$$

It follows that solutions are of the following form:

$$\begin{pmatrix} x_n \\ y_n \end{pmatrix} = \begin{pmatrix} a & d \\ 1 & a \end{pmatrix}^n \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = A^n \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{2.9}$$

We now compute the characteristic polynomial of A and use the definition $d = a^2 - 1$.

$$\begin{aligned}
p_A(\lambda) &= \det(A - \lambda I) \\
&= \begin{vmatrix} a - \lambda & d \\ 1 & a - \lambda \end{vmatrix} \\
&= (a - \lambda)^2 - d \cdot 1 \\
&= a^2 - 2a\lambda + \lambda^2 - a^2 + 1 \\
&= \lambda^2 - 2a\lambda + 1
\end{aligned}$$

From the Cayley-Hamilton Theorem then follows

$$A^2 - 2aA + I = 0 \;\Rightarrow\; A^2 = 2aA - I$$

We substitute this into equation (2.9):

$$\begin{aligned}
\begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} &= A^{n+1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\
&= A^2 A^{n-1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\
&= A^2 \begin{pmatrix} x_{n-1} \\ y_{n-1} \end{pmatrix} \\
&= (2aA - I) \begin{pmatrix} x_{n-1} \\ y_{n-1} \end{pmatrix} \\
&= 2aA \begin{pmatrix} x_{n-1} \\ y_{n-1} \end{pmatrix} - \begin{pmatrix} x_{n-1} \\ y_{n-1} \end{pmatrix} \\
&= 2a \begin{pmatrix} x_n \\ y_n \end{pmatrix} - \begin{pmatrix} x_{n-1} \\ y_{n-1} \end{pmatrix}
\end{aligned}$$

which is what we wanted to prove.

Matrix notation can also be used to derive Lemma 2.9 and equations (2.7) and (2.8), but this does not save us any effort or provide us any additional insight.

The recursive relations provided by Lemma 2.10 allow us to prove properties of $x_n$ and $y_n$ by induction, using our first two solutions $(x_0, y_0) = (1, 0)$ and $(x_1, y_1) = (a, 1)$. Specifically, we can now prove some lemmas about the growth of $x_n$ and $y_n$, which resembles exponential growth.
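The two descriptions of the solution sequences, the second-order recurrence of Lemma 2.10 and repeated multiplication by the matrix A, can be compared directly. The sketch below is illustrative only. The function names are hypothetical, and the matrix power is applied to the column vector (1, 0) without any matrix library.

```python
def lucas_pair(a, n):
    """(x_n, y_n) for n >= 0 from x_{n+1} = 2a*x_n - x_{n-1} (same for y),
    started at (x_0, y_0) = (1, 0) and (x_1, y_1) = (a, 1)."""
    previous, current = (1, 0), (a, 1)
    if n == 0:
        return previous
    for _ in range(n - 1):
        previous, current = current, (
            2 * a * current[0] - previous[0],
            2 * a * current[1] - previous[1],
        )
    return current

def matrix_power_column(a, n):
    """First column of A**n for A = [[a, d], [1, a]], i.e. A**n applied to (1, 0)."""
    d = a * a - 1
    x, y = 1, 0
    for _ in range(n):
        # One multiplication by A: (x, y) -> (a*x + d*y, x + a*y)
        x, y = a * x + d * y, x + a * y
    return x, y
```

Both functions return the same pairs for every a > 1 and n ≥ 0 that was tried, for example `(97, 56)` for a = 2 and n = 4.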

Lemma 2.11. For every non-negative n, $x_{n+1} > x_n > n$ and $y_{n+1} > y_n \ge n$.

Proof. This can be shown by induction. It is straightforwardly seen that $x_1 > x_0 > 0$ and $y_1 > y_0 \ge 0$. Furthermore, suppose the lemma holds up to n = k. Then, we have:

$$x_{k+2} = 2ax_{k+1} - x_k > ax_{k+1} + x_{k+1} - x_k > ax_{k+1} > x_{k+1} \;\Rightarrow\; x_{k+2} > x_{k+1}$$
$$x_{k+1} > x_k > k \;\Rightarrow\; x_{k+1} > k + 1$$
$$y_{k+2} = 2ay_{k+1} - y_k > ay_{k+1} + y_{k+1} - y_k > ay_{k+1} > y_{k+1} \;\Rightarrow\; y_{k+2} > y_{k+1}$$
$$y_{k+1} > y_k \ge k \;\Rightarrow\; y_{k+1} \ge k + 1$$

which proves our induction step.

Lemma 2.12. For every non-negative n, $a^n \le x_n \le (2a)^n$.

Proof. Again, we use induction. Firstly, $a^0 \le x_0 \le (2a)^0$ and $a^1 \le x_1 \le (2a)^1$. Now our induction step is as follows:
\[ x_{n+1} = 2a x_n - x_{n-1} \le 2a x_n \le 2a(2a)^n = (2a)^{n+1} \]
\[ x_{n+1} = 2a x_n - x_{n-1} \ge 2a x_n - x_n = a x_n + (a - 1) x_n \ge a x_n \ge a \cdot a^n = a^{n+1} \]
which proves our lemma. In this induction step, we have used Lemma 2.11.

Lemma 2.13. Let p be any positive number. Then we have
\[ x_n + (p - a) y_n \equiv p^n \pmod{2ap - p^2 - 1} \]

Proof. This again can be proven inductively. We first observe:

x0 + (p − a)y0 = 1 + (p − a) · 0 = 1

x1 + (p − a)y1 = a + (p − a) · 1 = p

We now suppose the lemma holds up to n = k. We then use the following induction step:

\[ x_{k+1} + (p - a) y_{k+1} = 2a x_k - x_{k-1} + (p - a)(2a y_k - y_{k-1}) \]
\[ = 2a\,(x_k + (p - a) y_k) - (x_{k-1} + (p - a) y_{k-1}) \]
\[ \equiv 2a p^k - p^{k-1} = (2ap - 1)\,p^{k-1} \equiv p^2 p^{k-1} = p^{k+1} \pmod{2ap - p^2 - 1} \]
in which we have used Lemma 2.10, and in the last congruence that $2ap - 1 \equiv p^2$ modulo $2ap - p^2 - 1$. This induction step completes the proof of the lemma.
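Lemma 2.13 is easy to test on small values. The sketch below assumes a ≥ 2, so that the modulus 2ap − p² − 1 is never zero; the helper names are illustrative only.

```python
def pell(a, n):
    """(x_n(a), y_n(a)) computed with the recurrence of Lemma 2.10."""
    x0, y0, x1, y1 = 1, 0, a, 1
    for _ in range(n):
        x0, x1 = x1, 2 * a * x1 - x0
        y0, y1 = y1, 2 * a * y1 - y0
    return x0, y0


def lemma_2_13_holds(a, p, n):
    """Check x_n + (p - a) y_n ≡ p^n modulo 2ap - p^2 - 1 (assumes a >= 2)."""
    x, y = pell(a, n)
    modulus = 2 * a * p - p * p - 1
    # the difference must be a multiple of the modulus (the modulus may be negative)
    return (x + (p - a) * y - p ** n) % modulus == 0
```

For a = 2 and p = 3 the modulus is 2, and indeed x_2 + (p − a) y_2 = 7 + 4 = 11 is congruent to 3² = 9 modulo 2.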

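The factors placed in front of $x_n$ and $y_n$ may seem arbitrary at first, but they are the solution $c_1, c_2$ to the following system:
\[ \begin{pmatrix} 1 & 0 \\ a & 1 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 1 \\ p \end{pmatrix} \]
These factors are not relevant in the induction step.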

Furthermore, note that the modulus $2ap - p^2 - 1$ is just minus the characteristic polynomial of A evaluated at p. This polynomial is also the origin of the coefficients in the recurrence relations of Lemma 2.10. In this lemma, it can be used to create a higher power of p.

We have now seen that a Diophantine description of exponentiation could follow from a Diophantine description of $x_n(a)$ and $y_n(a)$. This is no surprise: the n-th unit larger than 1 is just the fundamental unit raised to the power n. We now need to finish our Diophantine description of $x_n(a)$ and $y_n(a)$. All we have left to do is find the solution number given a solution to the Pell equation. That is, given some solution $(x, y)$ to equation (2.5), find n such that $(x, y) = (x_n, y_n)$. We must, of course, do this using only Diophantine equations. For that goal, we use some divisibility properties of $x_n$ and $y_n$.

2.6 Finding the solution number using divisibility properties

Divisibility properties are very useful in finding what the solution number n is given a solution (x, y) of equation (2.5). The technique essentially comes down to proving that two numbers have the same residue modulo some modulus, and that both are smaller than the modulus, such that they must be equal.

Lemma 2.14. For every n, gcd(xn, yn) = 1.

Proof. Any divisor of both xn and yn must also divide the left hand side of the Pell equation (2.5), and thus its right hand side, which is 1. It thus follows that this divisor must equal 1.

Lemma 2.15. For every positive n and k we have yn|ynk. Proof. We prove by induction on k. For k = 1, we have identity and hence division. Now suppose the lemma holds up to k = m. Then by Lemma 2.9 it follows that

yn(m+1) = xnynm + xnmyn ≡ xnynm mod yn

Our induction hypothesis provides that yn|ynm, so it follows that yn|yn(m+1), which completes our induction step. We conclude that yn|ynk for every positive n and k.

The property of the sequence y0, y1, ··· expressed by Lemma 2.15 makes that sequence a divis- ibility sequence.

Definition 2.16. A sequence $u_1, u_2, \cdots$ that is constructed along the recurrence relation
\[ u_{n+k} = a_1 u_{n+k-1} + \cdots + a_k u_n \]
is called a divisibility sequence of k-th order if $n \mid m$ implies $u_n \mid u_m$.

Definition 2.17. The characteristic polynomial corresponding to this sequence is given by
\[ f(x) = x^k - a_1 x^{k-1} - \cdots - a_k \]

In our case, this is just the characteristic polynomial of A. This polynomial immediately determines the coefficients of our recurrence relation by the Cayley-Hamilton Theorem, as seen in Lemma 2.10.

Definition 2.18. A divisibility sequence u0, u1, ··· is called normal if u0 = 0

In fact, all sequences of the form un+2 = P un+1 − Qun with u0 = 0 and u1 = 1, with P and Q integers, are second order divisibility sequences. [Smy10] By definition, these sequences are normal. We will pay extra attention to the importance of the fact that y0, y1, ··· is a divisibility sequence, because this will be relevant in section 6.6. It is useful that the converse of Lemma 2.15 also holds.

Lemma 2.19. For every n and t, yn|yt if and only if n|t.

Proof. If: Suppose n|t. Then Lemma 2.15 provides yn|yt. Only if: Suppose yn|yt. We write t = nq + r, with 0 ≤ r < n and observe:

yt = xrynq + xnqyr

Since yn divides both yt (our assumption) and ynq (by Lemma 2.15), it must divide xnqyr as well. Now we use Lemma 2.14: xnq and ynq are coprime. As yn|ynq, yn and xnq are coprime as well. It must then be that yn|yr. However, since r < n, we have yr < yn by Lemma 2.11. But yn can’t divide a positive number smaller than itself. It follows that yr = 0 and hence r = 0. We conclude that n|t.
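Lemma 2.19 can be checked by brute force on an initial segment of the sequence. This is a minimal sketch with illustrative names; it only tests finitely many indices and proves nothing.

```python
def pell_y(a, count):
    """Return [y_0(a), ..., y_{count-1}(a)] using y_{k+1} = 2a*y_k - y_{k-1}."""
    ys = [0, 1]
    while len(ys) < count:
        ys.append(2 * a * ys[-1] - ys[-2])
    return ys[:count]


def lemma_2_19_holds(a, limit):
    """Check y_n | y_t  <=>  n | t for all 1 <= n, t <= limit."""
    ys = pell_y(a, limit + 1)
    return all(
        (ys[t] % ys[n] == 0) == (t % n == 0)
        for n in range(1, limit + 1)
        for t in range(1, limit + 1)
    )
```

For a = 2 the sequence starts 0, 1, 4, 15, 56, so y_2 = 4 divides y_4 = 56 but not y_3 = 15.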

Lemma 2.20. $y_n^2 \mid y_t$ if and only if $n y_n \mid t$.

Proof. Only if: Suppose $y_n^2 \mid y_t$. Then $y_n \mid y_t$, and hence by Lemma 2.19 $n \mid t$, so $t = nk$. We work out:
\[ x_t + y_t\sqrt{d} = \left(a + \sqrt{d}\right)^t = \left(x_n + y_n\sqrt{d}\right)^k = \sum_{j=0}^{k} \binom{k}{j} x_n^{k-j} y_n^j \left(\sqrt{d}\right)^j \]
Comparing the coefficients of $\sqrt{d}$, only the terms with odd j contribute:
\[ \Rightarrow\; y_t = \sum_{\substack{j=0 \\ 2 \nmid j}}^{k} \binom{k}{j} x_n^{k-j} y_n^j\, d^{(j-1)/2} \equiv \binom{k}{1} x_n^{k-1} y_n = k x_n^{k-1} y_n \equiv 0 \pmod{y_n^2} \]

It is used that terms of the sum in which j exceeds 1 contain higher powers of $y_n$ and are therefore divisible by $y_n^2$. The last congruence is by our supposition that $y_n^2 \mid y_t$. It then follows that $y_n^2 \mid k x_n^{k-1} y_n$ and hence $y_n \mid k x_n^{k-1}$. However, by Lemma 2.14 $x_n$ and $y_n$ are coprime, and hence so are $x_n^{k-1}$ and $y_n$. It follows that $y_n \mid k$, and thus $n y_n \mid nk = t$, which is what we wanted to prove.

If: Suppose $n y_n \mid t$. We first set $k = y_n$ to find:
\[ y_{n y_n} = \sum_{\substack{j=0 \\ 2 \nmid j}}^{y_n} \binom{y_n}{j} x_n^{y_n - j} y_n^j\, d^{(j-1)/2} \equiv \binom{y_n}{1} x_n^{y_n - 1} y_n = y_n^2 x_n^{y_n - 1} \equiv 0 \pmod{y_n^2} \]

We conclude that $y_n^2 \mid y_{n y_n}$ and then by Lemma 2.19 it follows that $y_n^2 \mid y_t$.

Note that this lemma is stronger than the lemmas '$y_n^2 \mid y_{n y_n}$' and '$y_n^2 \mid y_t$ implies $y_n \mid t$', which are presented in [Dav73]. Lemma 2.20 forms a more general and compact lemma. Nevertheless, the lemmas in [Dav73] are also sufficient for the construction of a Diophantine description of exponentiation.

Lemma 2.21. For any non-negative n, we have:

x2n ≡ −1 mod xn

x2n ≡ 1 mod yn

y2n ≡ 0 mod xn

y2n ≡ 0 mod yn

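Proof. This can be seen by substituting m = n in Lemma 2.9, and using the fact that we are dealing with a solution to the Pell equation (2.5):
\[ x_{2n} = x_n^2 + d y_n^2 = x_n^2 + (x_n^2 - 1) = (d y_n^2 + 1) + d y_n^2 \]
\[ \Rightarrow\; x_{2n} \equiv -1 \pmod{x_n} \quad\text{and}\quad x_{2n} \equiv 1 \pmod{y_n} \]
\[ y_{2n} = 2 x_n y_n \]
\[ \Rightarrow\; y_{2n} \equiv 0 \pmod{x_n} \quad\text{and}\quad y_{2n} \equiv 0 \pmod{y_n} \]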

Note that the last congruence also follows from the fact that y0, y1, ··· is a divisibility sequence (Lemma 2.15). Lemma 2.22. For any non-negative n, we have:

x4n ≡ 1 mod xn

x4n ≡ 1 mod yn

y4n ≡ 0 mod xn

y4n ≡ 0 mod yn

Proof. Using Lemma 2.21, and a similar construction, we find:

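\[ x_{4n} = x_{2n}^2 + d y_{2n}^2 = x_{2n}^2 + (x_{2n}^2 - 1) = (d y_{2n}^2 + 1) + d y_{2n}^2 \]
\[ \Rightarrow\; x_{4n} \equiv (-1)^2 + ((-1)^2 - 1) = 1 \pmod{x_n} \quad\text{and}\quad x_{4n} \equiv (0 + 1) + 0 = 1 \pmod{y_n} \]
\[ y_{4n} = 2 x_{2n} y_{2n} \]
\[ \Rightarrow\; y_{4n} \equiv 0 \pmod{x_n} \quad\text{and}\quad y_{4n} \equiv 0 \pmod{y_n} \]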

Again, the last congruence also follows from the fact that y0, y1, ··· is a divisibility sequence. Lemma 2.21 and Lemma 2.22 can also be shown using matrix notation, which might give some more insight into why the lemmas hold.

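Proof. We consider the ring homomorphism 'mod $x_n$':
\[ f : \mathbb{Z}^{2\times 2} \to \mathbb{Z}^{2\times 2} \Big/ \left\langle \begin{pmatrix} x_n & 0 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & x_n \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ x_n & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & x_n \end{pmatrix} \right\rangle \]
\[ \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \mapsto \begin{pmatrix} a_{11} \bmod x_n & a_{12} \bmod x_n \\ a_{21} \bmod x_n & a_{22} \bmod x_n \end{pmatrix} \]

We compute $f(A^n)$:
\[ f(A^n) = f\!\left( \begin{pmatrix} a & d \\ 1 & a \end{pmatrix}^n \right) = f \begin{pmatrix} x_n & d y_n \\ y_n & x_n \end{pmatrix} = \begin{pmatrix} 0 & d y_n \\ y_n & 0 \end{pmatrix} \]

We then use the fact that f is a homomorphism, together with $d y_n^2 = x_n^2 - 1 \equiv -1 \pmod{x_n}$, to find:
\[ f\!\left((A^n)^2\right) = f(A^n)^2 = \begin{pmatrix} 0 & d y_n \\ y_n & 0 \end{pmatrix}^2 = \begin{pmatrix} d y_n^2 & 0 \\ 0 & d y_n^2 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -I \]
such that
\[ \begin{pmatrix} x_{2n} \bmod x_n \\ y_{2n} \bmod x_n \end{pmatrix} \equiv (A^n)^2 \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} -1 \\ 0 \end{pmatrix} \]
which proves the first and the third congruence of Lemma 2.21. We then find $f((A^n)^4)$:
\[ f\!\left((A^n)^4\right) = f\!\left((A^n)^2\right)^2 = (-I)^2 = I \]

We conclude that $A^n$ is of order 4 modulo $x_n$. It then similarly follows that
\[ \begin{pmatrix} x_{4n} \bmod x_n \\ y_{4n} \bmod x_n \end{pmatrix} \equiv I \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \]
which shows the first and third congruence of Lemma 2.22. A completely analogous computation shows that $A^n$ has order 2 modulo $y_n$, which provides the other congruences. This method can also be utilized to prove Lemma 2.23 and Lemma 2.24, but this does not provide much additional insight.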
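The eight congruences of Lemma 2.21 and Lemma 2.22 can be verified directly for small parameters. The following sketch assumes n ≥ 1, so that both moduli are positive; the function names are illustrative.

```python
def pell(a, n):
    """(x_n(a), y_n(a)) computed with the recurrence of Lemma 2.10."""
    x0, y0, x1, y1 = 1, 0, a, 1
    for _ in range(n):
        x0, x1 = x1, 2 * a * x1 - x0
        y0, y1 = y1, 2 * a * y1 - y0
    return x0, y0


def lemmas_2_21_and_2_22_hold(a, n):
    """Check all congruences of Lemma 2.21 and Lemma 2.22 for one pair (a, n), n >= 1."""
    xn, yn = pell(a, n)
    x2, y2 = pell(a, 2 * n)
    x4, y4 = pell(a, 4 * n)
    checks = [
        (x2 + 1, xn), (x2 - 1, yn), (y2, xn), (y2, yn),   # Lemma 2.21
        (x4 - 1, xn), (x4 - 1, yn), (y4, xn), (y4, yn),   # Lemma 2.22
    ]
    return all(value % modulus == 0 for value, modulus in checks)
```

For a = 2 and n = 1 this reduces to 7 ≡ −1 and 97 ≡ 1 modulo x_1 = 2.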

Lemma 2.23. For any integer j (hence allowing j to be negative) we have xj+2n = −xj mod xn

Proof. First, note that $x_{-j} = x_j$, as
\[ \left(x_j + y_j\sqrt{d}\right)^{-1} = x_j - y_j\sqrt{d} \]

Then, we apply Lemma 2.9 and Lemma 2.21 to obtain the following:

xj+2n = xjx2n + dyjy2n

≡ xj · (−1) + dyj · 0 mod xn

= −xj mod xn

Lemma 2.24. For any integer j, we have xj+4n = xj mod xn Proof. This follows similarly from Lemma 2.22:

xj+4n = xjx4n + dyjy4n

≡ xj · 1 + dyj · 0 mod xn

= xj mod xn

Lemma 2.25. Suppose xi ≡ xj mod xn, with n > 0 and 0 ≤ i ≤ j ≤ 2n. Then either i = j or we have the exceptional case: a = 2, n = 1, i = 0 and j = 2

Proof. We split the proof into two parts: either xn is even or xn is odd. We will show, in both cases, that x0, x1, ··· x2n are all different modulo xn, such that the result follows.

1. First, we treat the case in which $x_n$ is odd. We define $q = \frac{x_n - 1}{2}$. Then we consider the following set:
\[ x_0, \cdots, x_{n-1} \]
It follows from Lemma 2.11 that
\[ x_0 < \cdots < x_{n-1} \]
Moreover, 2.10 implies that $x_{n-1} \le \frac{x_n}{a} \le \frac{x_n}{2}$, and hence $x_{n-1} \le q$, as $x_{n-1}$ is an integer and $x_n$ is odd. Thus these are all unique residues modulo $x_n$ that do not exceed q. We then consider the set
\[ x_{n+1}, \cdots, x_{2n} \]
By Lemma 2.23, they are congruent modulo $x_n$, respectively, to:
\[ -x_{n-1}, \cdots, -x_0 \]
Similarly, these are all unique negative residues modulo $x_n$ that are not smaller than $-q$. Thus all residues modulo $x_n$ of the set $x_0, x_1, \cdots, x_{2n}$ are unique numbers between $-q$ and $q$, which is a range smaller than $x_n$. We conclude that they are mutually incongruent modulo $x_n$. From this follows our result: if $x_i$ and $x_j$ have the same residue, then they must be the same, as no two different possibilities have the same residue.

2. Now suppose $x_n$ is even. We then define $q = \frac{x_n}{2}$. The result follows similarly, unless $x_{n-1} = q$. In that case, we will have, by Lemma 2.23, that

xn+1 ≡ −q mod xn = q mod xn = xn−1 mod xn

This is precisely the case when

xn = axn−1 + dyn−1 = 2xn−1

This is the case when a = 2 and yn−1 = 0. This in turn implies that n = 1, i = 0 and j = 2, which completes the exceptional case.

Lemma 2.26. Suppose xi ≡ xj mod xn, with n > 0, 0 < i ≤ n and 0 ≤ j < 4n. Then either j = i or j = 4n − i. Proof. We will split the proof up into two cases. Either j ≤ 2n or j > 2n. In the first case, Lemma 2.25 implies that j = i. The exceptional case is excluded: in that case, n = 1 and i = 0 or i = 2. This contradicts 0 < i ≤ n. In the second case, Lemma 2.24 implies

x4n−j = xj−4n ≡ xj mod xn = xi mod xn

Then similarly by Lemma 2.25, it follows that i = 4n − j, and hence j = 4n − i. The exceptional case is ruled out because both i and 4n − j cannot be zero.

Lemma 2.27. If n > 0, 0 < i ≤ n, j is any integer and we have xi ≡ xj mod xn, then it follows that j ≡ ±i mod 4n.

Proof. We can write j = 4nq + r, where q is an integer and 0 ≤ r < 4n. Then Lemma 2.24 implies

xr ≡ xj mod xn = xi mod xn

Then from Lemma 2.26 follows that i = r or i = 4n−r, and thus j ≡ r mod 4n = ±i mod 4n.
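Lemma 2.27 can be tested exhaustively on a finite window of indices. The sketch below restricts j to non-negative values up to a chosen bound `j_max` (an assumption made here for simplicity; negative j is covered by x_{-j} = x_j).

```python
def pell_x(a, count):
    """Return [x_0(a), ..., x_{count-1}(a)] using x_{k+1} = 2a*x_k - x_{k-1}."""
    xs = [1, a]
    while len(xs) < count:
        xs.append(2 * a * xs[-1] - xs[-2])
    return xs[:count]


def lemma_2_27_holds(a, n, j_max):
    """For 0 < i <= n and 0 <= j <= j_max: x_i ≡ x_j mod x_n implies j ≡ ±i mod 4n."""
    xs = pell_x(a, max(j_max, n) + 1)
    xn = xs[n]
    for i in range(1, n + 1):
        for j in range(j_max + 1):
            if (xs[j] - xs[i]) % xn == 0:
                if (j - i) % (4 * n) != 0 and (j + i) % (4 * n) != 0:
                    return False
    return True
```

For a = 2 and n = 2 the residues of x_j modulo x_2 = 7 are 1, 2, 0, 5, 6, 5, 0, 2 and then repeat with period 8 = 4n.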

The following two lemmas have an elementary, arithmetical proof using induction and a more general proof, using ring theory, that can give more insight in why the lemmas hold. I will give both proofs, roughly as presented in [Dav73] and [Kui10], respectively.

Lemma 2.28. xn ≡ 1 mod (a − 1) and yn ≡ n mod (a − 1). Proof. We show this by induction: For n = 0, 1, it is straightforwardly seen. Then suppose it holds up to n = k. We then use the following induction step:

\[ y_{k+1} = 2a y_k - y_{k-1} \equiv 2 y_k - y_{k-1} \equiv 2k - (k - 1) = k + 1 \pmod{a - 1} \]
In the same way, $x_{k+1} = 2a x_k - x_{k-1} \equiv 2 \cdot 1 - 1 = 1 \pmod{a - 1}$.

Proof. The ring $\mathbb{Z}[\sqrt{d}]$ is isomorphic to the polynomial ring $\mathbb{Z}[t]/(t^2 - d)$. Any element of this ring looks like $x + yt$, and in this ring $t^2 - d = 0$ holds. The variable t could be seen as a replacement for $\sqrt{d}$. With our choice of d, the ring becomes $\mathbb{Z}[t]/(t^2 - a^2 + 1)$. We naturally have
\[ \left(\mathbb{Z}[t]/(t^2 - a^2 + 1)\right)^{\times} \cong \mathbb{Z}[\sqrt{d}]^{\times} \]

From equation (2.6), we know that the positive units of $\mathbb{Z}[t]/(t^2 - a^2 + 1)$ are of the form
\[ x_n(a) + y_n(a)\,t = (a + t)^n \]

In this lemma, we are looking at the ring homomorphism 'mod $(a - 1)$':
\[ \mathbb{Z}[t]/(t^2 - a^2 + 1) \to \left(\mathbb{Z}/(a - 1)\mathbb{Z}\right)[t]/(t^2), \qquad x + yt \mapsto (x \bmod (a - 1)) + t \cdot (y \bmod (a - 1)) \]

We have used that $a^2 - 1 = (a - 1)(a + 1)$. This induces a group homomorphism:
\[ \left(\mathbb{Z}[t]/(t^2 - a^2 + 1)\right)^{\times} \to \left(\left(\mathbb{Z}/(a - 1)\mathbb{Z}\right)[t]/(t^2)\right)^{\times}, \qquad x + yt \mapsto (x \bmod (a - 1)) + t \cdot (y \bmod (a - 1)) \]

The positive elements of the unit group become $x_n + y_n t = (1 + t)^n$, with $t^2 = 0$. The higher order terms of the binomial expansion vanish, and we are left with $x_n + y_n t = 1 + nt$, which proves the lemma.

Lemma 2.28 would be sufficient for finding the solution number n, given the condition that a − 1 exceeds n. This is because every yn then has a unique residue, and thus a unique n follows. However, we want to find the solution number given any n and a − 1, such that n can also exceed a − 1.

Lemma 2.29. If a, b, c, n are any non-negative integers, then a ≡ b mod c implies the following congruences:

xn(a) ≡ xn(b) mod c

yn(a) ≡ yn(b) mod c

Proof. This can be seen by induction on n: It is straightforward to check that the lemma holds for n = 0, 1. Now suppose it holds up to n = k. Then we use the following induction step:

xn+1(a) = 2axn(a) − xn−1(a)

≡ 2bxn(b) − xn−1(b) mod c

= xn+1(b) mod c

A similar induction step goes for yn.

Proof. We now have two different rings: $\mathbb{Z}[t]/(t^2 - a^2 + 1)$ and $\mathbb{Z}[t]/(t^2 - b^2 + 1)$. On their unit groups we can define the group homomorphism 'mod c':
\[ \left(\mathbb{Z}[t]/(t^2 - a^2 + 1)\right)^{\times} \to \left(\mathbb{Z}[t]/(t^2 - a^2 + 1,\, c)\right)^{\times}, \qquad x + yt \mapsto (x \bmod c) + t \cdot (y \bmod c) \]

We again use the form of the elements of these unit groups:
\[ x_n(a) + y_n(a)\,t = (a + t)^n, \qquad x_n(b) + y_n(b)\,t = (b + t)^n \]

Under the group homomorphism they become:
\[ x_n(a) + y_n(a)\,t \mapsto (a \bmod c + t)^n, \qquad x_n(b) + y_n(b)\,t \mapsto (b \bmod c + t)^n \]

Now, since $a \equiv b \pmod{c}$, the units of $\mathbb{Z}[t]/(t^2 - a^2 + 1)$ and $\mathbb{Z}[t]/(t^2 - b^2 + 1)$ look exactly the same modulo c. This proves the lemma.

We have now finally acquired enough tools to find n given a solution to the Pell equation (2.5). This allows us to make a Diophantine description of $x_n(a)$ and $y_n(a)$.
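Both congruence lemmas lend themselves to a quick numerical check. The following sketch assumes a ≥ 2 and c ≥ 1; the function names are illustrative.

```python
def pell(a, n):
    """(x_n(a), y_n(a)) computed with the recurrence of Lemma 2.10."""
    x0, y0, x1, y1 = 1, 0, a, 1
    for _ in range(n):
        x0, x1 = x1, 2 * a * x1 - x0
        y0, y1 = y1, 2 * a * y1 - y0
    return x0, y0


def lemma_2_28_holds(a, n):
    """x_n ≡ 1 and y_n ≡ n modulo a - 1 (assumes a >= 2)."""
    x, y = pell(a, n)
    return (x - 1) % (a - 1) == 0 and (y - n) % (a - 1) == 0


def lemma_2_29_holds(a, b, c, n):
    """If a ≡ b mod c, then x_n(a) ≡ x_n(b) and y_n(a) ≡ y_n(b) mod c."""
    if (a - b) % c != 0:
        return True                      # hypothesis not met, nothing to check
    xa, ya = pell(a, n)
    xb, yb = pell(b, n)
    return (xa - xb) % c == 0 and (ya - yb) % c == 0
```

For example, with a = 4 one has y_3(4) = 63 ≡ 3 modulo 3, in line with Lemma 2.28.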

2.7 A Diophantine description of xn(a) and yn(a)

We can now find a necessary and sufficient Diophantine condition for the equations x = xk(a) and y = yk(a) to hold.

Theorem 2.30. We have $x = x_k(a)$ and $y = y_k(a)$ if and only if the following system of 8 Diophantine equations has a solution in positive integers:

(I) $x^2 - (a^2 - 1)y^2 = 1$
(II) $u^2 - (a^2 - 1)v^2 = 1$
(III) $s^2 - (b^2 - 1)t^2 = 1$
(IV) $v = 4ry^2$
(V) $b = a + u^2(u^2 - a)$
(VI) $s = x + cu$
(VII) $t = k + 4y(d - 1)$
(VIII) $y = k + e - 1$

Proof. Sufficiency: Suppose we have a solution to the system. Then I, II and III, together with Lemma 2.5 imply:

\[ x = x_i(a), \quad u = x_n(a), \quad s = x_j(b) \]
\[ y = y_i(a), \quad v = y_n(a), \quad t = y_j(b) \]
for some positive integers i, n and j. Since IV implies $y \le v$, it follows that $i \le n$, by Lemma 2.11. VI yields the congruence
\[ x_j(b) \equiv x_i(a) \pmod{x_n(a)} \tag{2.10} \]

V implies that b ≡ a mod xn(a), and then Lemma 2.29 provides

xj(b) ≡ xj(a) mod xn(a) (2.11)

Combining equations (2.10) and (2.11) gives

xi(a) ≡ xj(a) mod xn(a)

Then, since 0 < i ≤ n, Lemma 2.27 yields

j ≡ ±i mod 4n (2.12)

Moreover, IV and Lemma 2.20 together imply yi(a)|n. Combining this with equation (2.12) yields

j ≡ ±i mod 4yi(a) (2.13)

Also, VII states that yj(b) ≡ k mod 4yi(a) (2.14)

Furthermore, II and IV together imply that $u^2 = 1 + (a^2 - 1)v^2 \equiv 1 \pmod{4y_i(a)}$, since $4y = 4y_i(a)$ divides v. Substituting this into V gives us
\[ b = a + u^2(u^2 - a) \equiv a + 1 \cdot (1 - a) = 1 \pmod{4y_i(a)} \]

Hence we have 4yi(a)|b − 1. Then Lemma 2.28 implies

yj(b) ≡ j mod (b − 1) = j mod 4yi(a) (2.15)

Combining equations (2.14) and (2.15) gives us

j ≡ k mod 4yi(a) (2.16)

Combining this with equation (2.13) yields

k ≡ ±i mod 4yi(a) (2.17)

However, both i and k are positive and not greater than yi(a): i by Lemma 2.11 and k by VIII. This implies that i = k. We conclude that x = xk(a) and y = yk(a).

Necessity: Suppose we have x = xk(a) and y = yk(a). Then I is immediately satisfied. We introduce m = 4kyk(a) and set u = xm(a) and v = ym(a), such that II holds. Furthermore, we

define $b = a + (x_m(a))^2\left((x_m(a))^2 - a\right)$, and set $s = x_k(b)$ and $t = y_k(b)$, such that III and V also hold. Lemma 2.15 and Lemma 2.20 imply $4y^2 \mid v$, such that IV can be satisfied. Again, II, IV and V together imply that $b \equiv a \pmod{u}$ and $4y \mid b - 1$. Then, by Lemma 2.29, $x_k(b) \equiv x_k(a) \pmod{u}$, i.e. $s \equiv x \pmod{u}$, such that VI can be satisfied. Moreover, Lemma 2.28 can be used to obtain $y_k(b) \equiv k \pmod{b - 1}$ and hence $t \equiv k \pmod{4y}$, which means VII can be satisfied. Finally, $k \le y_k(a)$ by Lemma 2.11, such that VIII can be satisfied.

This proof is largely the same as the one given in [Dav73], except for equation V. This has been combined into one equation, as pointed out to be possible in [JSWW76]. Using only substitution of IV-VII, the system can be reduced to a system of only 4 equations, also reducing the number of variables by 4. Writing each of these equations with right-hand side zero and summing the squares of the left-hand sides gives one single Diophantine equation whose solvability is a necessary and sufficient condition for $x = x_k(a)$ and $y = y_k(a)$. In [JSWW76] the requirement that the variables range over the positive integers is abolished, such that they can range over all integers. However, one of the equations in this system is n ≤ y. One needs Lagrange's four square theorem, and hence four new variables, to express this in a Diophantine way if the variables range over all integers.
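The witness construction from the necessity part of the proof of Theorem 2.30 can be carried out explicitly for small a and k. The sketch below follows the choices made in that proof (m = 4k·y_k(a), u = x_m(a), v = y_m(a), s = x_k(b), t = y_k(b)); the function names and the dictionary layout are assumptions made for illustration.

```python
def pell(a, n):
    """(x_n(a), y_n(a)) computed with the recurrence of Lemma 2.10."""
    x0, y0, x1, y1 = 1, 0, a, 1
    for _ in range(n):
        x0, x1 = x1, 2 * a * x1 - x0
        y0, y1 = y1, 2 * a * y1 - y0
    return x0, y0


def witnesses(a, k):
    """Build values for all variables of the system of Theorem 2.30 (a >= 2, k >= 1)."""
    x, y = pell(a, k)
    u, v = pell(a, 4 * k * y)                 # m = 4 k y_k(a)
    b = a + u * u * (u * u - a)               # equation V
    s, t = pell(b, k)
    r, rem_r = divmod(v, 4 * y * y)           # equation IV
    c, rem_c = divmod(s - x, u)               # equation VI
    d1, rem_d = divmod(t - k, 4 * y)          # equation VII, d1 = d - 1
    assert rem_r == rem_c == rem_d == 0       # divisibility claimed in the proof
    return dict(x=x, y=y, u=u, v=v, b=b, s=s, t=t, r=r, c=c, d=d1 + 1, e=y - k + 1)


def satisfies_system(a, k, w):
    """Check equations I-VIII and positivity of all variables."""
    q = a * a - 1
    equations = [
        w["x"] ** 2 - q * w["y"] ** 2 == 1,
        w["u"] ** 2 - q * w["v"] ** 2 == 1,
        w["s"] ** 2 - (w["b"] ** 2 - 1) * w["t"] ** 2 == 1,
        w["v"] == 4 * w["r"] * w["y"] ** 2,
        w["b"] == a + w["u"] ** 2 * (w["u"] ** 2 - a),
        w["s"] == w["x"] + w["c"] * w["u"],
        w["t"] == k + 4 * w["y"] * (w["d"] - 1),
        w["y"] == k + w["e"] - 1,
    ]
    return all(equations) and all(value > 0 for value in w.values())
```

For a = 2 and k = 1 this yields u = 97, v = 56 and r = 14. The intermediate numbers grow very quickly, so the sketch is only practical for small inputs.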

2.8 A Diophantine description of exponentiation

Now that we have found a Diophantine description of $x_n(a)$ and $y_n(a)$, a Diophantine description of exponentiation is a relatively small step away. An important lemma in this step is Lemma 2.13. We only have to make sure that we have not just congruence, but equality. That is, we need that $p^n < 2ap - p^2 - 1$, which comes down to choosing a sufficiently large. This is done by requiring that a is a solution to a Pell equation. We also need the following lemma:

Lemma 2.31. For $a > 1$, and p and n positive, if $a > p^n$, then $2ap - p^2 - 1 > p^n$.

Proof. We observe that in $p = 1$, we have
\[ 2ap - p^2 - 1 = 2a - 2 \ge a > p^n \]
We then look at the derivative:
\[ \frac{d}{dp}\left(2ap - p^2 - 1\right) = 2a - 2p > 0 \]
since $a > p^n \ge p$. Thus, the function $2ap - p^2 - 1$ is at least a (which is constant with respect to p) in $p = 1$ and always has a positive derivative. We conclude that $2ap - p^2 - 1 \ge a > p^n$ for all positive values of p.

Finally, this allows us to formulate the theorem that this section has been building up to: there exists a Diophantine condition that is necessary and sufficient for describing the graph of exponentiation.

Theorem 2.32. We have $m = p^n$ if and only if the following system of Diophantine equations has a solution in positive integers:

(I) $x^2 - (a^2 - 1)y^2 = 1$
(II) $u^2 - (a^2 - 1)v^2 = 1$
(III) $s^2 - (b^2 - 1)t^2 = 1$
(IV) $v = 4ry^2$
(V) $b = a + u^2(u^2 - a)$
(VI) $s = x + cu$
(VII) $t = n + 4y(d - 1)$
(VIII) $y = n + e - 1$
(IX) $(x - y(a - p) - m)^2 = (f - 1)^2(2ap - p^2 - 1)^2$
(X) $m + g = 2ap - p^2 - 1$
(XI) $w = p + n$
(XII) $a^2 - (w^2 - 1)(w - 1)^2 z^2 = 1$

Proof. Sufficiency: Suppose we have a solution to the system in positive integers. Then, by XI, w > 1, such that XII implies that a > 1. We then recognize I-VIII from Theorem 2.30: they imply x = xn(a) and y = yn(a). We then apply Lemma 2.13 to IX to obtain

\[ m \equiv p^n \pmod{2ap - p^2 - 1} \tag{2.18} \]

We now need equality instead of just congruence. X tells us
\[ m < 2ap - p^2 - 1 \tag{2.19} \]

Hence we only need to establish that $p^n < 2ap - p^2 - 1$. For that, we can use Lemma 2.12. XII is a Pell equation and tells us that $a = x_j(w)$ and $(w - 1)z = y_j(w)$ for some positive j. Lemma 2.12 then implies
\[ a \ge w^j \tag{2.20} \]
Furthermore, by Lemma 2.28:
\[ y_j(w) \equiv j \pmod{w - 1} \;\Rightarrow\; j \equiv (w - 1)z \equiv 0 \pmod{w - 1} \]
such that $(w - 1) \mid j$ and thus
\[ j \ge w - 1 \tag{2.21} \]
Furthermore, XI implies $p < w$ and $n < w$. Combining this with equations (2.20) and (2.21) gives
\[ a \ge w^j \ge w^{w-1} > p^n \]

Then, by Lemma 2.31, it follows that
\[ p^n < 2ap - p^2 - 1 \tag{2.22} \]

Finally, combining equations (2.18), (2.19) and (2.22) yields that m and $p^n$ have the same residue modulo $2ap - p^2 - 1$ and are both smaller than $2ap - p^2 - 1$. Thus it must be that $m = p^n$.

Necessity: Suppose $m = p^n$. Then we can set $w = p + n$, such that XI is satisfied. Moreover, we set $a = x_{w-1}(w)$. Lemma 2.28 then provides that $w - 1 \mid y_{w-1}(w)$, and hence $y_{w-1}(w) = z(w - 1)$, such that XII holds. We set $x = x_n(a)$ and $y = y_n(a)$, such that I-VIII can be satisfied as a consequence of Theorem 2.30. It then follows from Lemma 2.31 that $m = p^n < 2ap - p^2 - 1$, which satisfies X. The satisfiability of IX is then directly implied by Lemma 2.13. This completes the proof that the system can be satisfied if $m = p^n$.

A change with respect to the proof in [Dav73] is in XI. Davis introduces two new variables in order to establish precisely that w exceeds both n and p, but this is not necessary. As there are no other restrictions on w, this can be established by simply setting $w = p + n$.

This finishes the proof that exponential sets are Diophantine. We could for instance set p = 2, and form a polynomial P that is the sum of squares of the homogeneous equivalents of the equations in the system above. Then P has a solution in positive integers if and only if m is a power of 2. This is perhaps counterintuitive: the exponential relation can apparently be captured by some polynomial, an extraordinary result not known before Matijasevič's Theorem.
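The way Theorem 2.32 recovers $p^n$ from Pell solutions can be imitated numerically: choose w = p + n and a = x_{w-1}(w) as in the necessity part, and read off the residue from Lemma 2.13. This is a sketch for illustration with a hypothetical function name; it computes with large integers and is of course far less efficient than direct exponentiation.

```python
def pell(a, n):
    """(x_n(a), y_n(a)) computed with the recurrence of Lemma 2.10."""
    x0, y0, x1, y1 = 1, 0, a, 1
    for _ in range(n):
        x0, x1 = x1, 2 * a * x1 - x0
        y0, y1 = y1, 2 * a * y1 - y0
    return x0, y0


def power_via_pell(p, n):
    """Return p**n for positive p and n, using only Pell solutions and one residue."""
    w = p + n                               # equation XI
    a, _ = pell(w, w - 1)                   # a = x_{w-1}(w) >= w^(w-1) > p^n
    x, y = pell(a, n)                       # x = x_n(a), y = y_n(a)
    modulus = 2 * a * p - p * p - 1         # exceeds p^n by Lemma 2.31
    return (x - y * (a - p)) % modulus      # Lemma 2.13, residue in [0, modulus)
```

For p = n = 1 this gives w = 2, a = 2, modulus 2 and the value (2 − 1) mod 2 = 1.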

3 Expanding the language of Diophantine descriptions

3.1 Diophantine descriptions of important functions

We have shown in section 2 that an exponential relation is equivalent to the solvability of a Diophantine equation. This entails a Diophantine description of the following functions: $f(n, k) = \binom{n}{k}$, $g(n) = n!$ and $h(a, b, y) = \prod_{k=1}^{y}(a + bk)$. The first two functions were originally proved to be exponential Diophantine by Julia Robinson in [Rob52], the last was shown by Martin Davis to have the same property. I will stick to the notation used in [Dav73]. For Lemma 3.2, a simpler Diophantine expression was found in [JSWW76]. Again, the convention is used that the variables range only over the non-negative integers (for brevity). The proofs of the lemmas consist of long and tedious calculations. I do not think it is relevant to reproduce them in this thesis. Any interested reader can look them up in [Rob52], [JSWW76] or [Dav73].

Lemma 3.1. A Diophantine description exists for $f(n, k) = \binom{n}{k} = \frac{n!}{k!(n-k)!}$: [Dav73]
\[ f = \binom{n}{k} \;\Leftrightarrow\; (\exists u, v, w, x, y, t) \text{ s.t. } v = 2^n \wedge u > v \wedge t = u + 1 \wedge x = t^n \wedge y = u^k \wedge yw \le x < y(w + 1) \wedge f < u \wedge f \equiv w \bmod u \]

Lemma 3.2. A Diophantine description exists for $g(n) = n!$: [JSWW76]
\[ g = n! \;\Leftrightarrow\; (\exists j, h, m, p, q, w, z) \text{ s.t. } q = wz + h + j \wedge z = g(h + j) + h \wedge (2n)^3(2n + 2)(m + 1)^2 + 1 = z^2 \wedge p = (m + 1)^n \wedge q = (p + 1)^m \wedge z = p^{n+1} \]

Lemma 3.3. A Diophantine description exists for $h(a, b, y) = \prod_{k=1}^{y}(a + bk)$: [Dav73]
\[ h = \prod_{k=1}^{y}(a + bk) \;\Leftrightarrow\; (\exists m, p, q, r, s, t, u, v, w, x) \text{ s.t. } r = a + by \wedge s = r^y \wedge m = bs + 1 \wedge bq = a + mt \wedge u = b^y \wedge v = y! \wedge h < m \wedge w = q + y \wedge x = \binom{w}{y} \wedge h + mp = uvx \]

That these functions are Diophantine is essential in the Bounded Universal Quantifier Theorem.
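The idea behind Lemma 3.1 is that the binomial coefficients are the digits of $(u + 1)^n$ in base u, as long as u exceeds $2^n$. A minimal sketch, with the concrete choice u = 2^n + 1 made here only for illustration:

```python
from math import comb


def binomial_via_base_u(n, k):
    """Read C(n, k) off as the k-th base-u digit of (u + 1)^n, with u > 2^n."""
    u = 2 ** n + 1                     # any u > v = 2^n works
    w = (u + 1) ** n // u ** k         # w = floor(x / y) with x = t^n, y = u^k
    return w % u                       # f < u and f ≡ w mod u
```

The result can be compared with `math.comb`, which is only used as a reference value here.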

3.2 The Bounded Universal Quantifier Theorem

An important step in the creation of Diophantine descriptions of recursively enumerable sets is the Bounded Universal Quantifier Theorem.

Theorem 3.4. Given any polynomial $p \in \mathbb{Z}[X_1, \cdots, X_{n+m+2}]$, the set
\[ S = \{(y, x_1, \cdots, x_n) \mid (\forall k)_{\le y}(\exists y_1, \cdots, y_m)\; p(y, k, x_1, \cdots, x_n, y_1, \cdots, y_m) = 0\} \]
is Diophantine. [Dav73]

The trick here is how to get rid of the bounded universal quantifier. Note that such a description with an unbounded universal quantifier is in general not recursively enumerable and thus not Diophantine. Davis observes first that the values $y_1, \cdots, y_m$ must also have some bound, u. He then introduces some polynomial $Q(y, u, x_1, \cdots, x_n)$, with the following properties:

Q(y, u, x1, ··· , xn) > u

Q(y, u, x1, ··· , xn) > y

k ≤ y ∧ y1, ··· ym ≤ u ⇒ |p(y, k, x1, ··· , xn, y1, ··· , ym)| < Q(y, u, x1, ··· , xn)

From this he proves the following equivalence:

(∀k)≤y(∃y1, ··· ym)[p(y, k, x1, ··· , xn, y1, ··· , ym) = 0]

\[ \Leftrightarrow\; (\exists c, t, a_1, \cdots, a_m)\Big[(1 + ct) = \prod_{k=1}^{y}(1 + kt) \;\wedge\; t = Q(y, u, x_1, \cdots, x_n)! \;\wedge\; (1 + ct) \,\Big|\, \prod_{j=1}^{u}(a_1 - j) \;\wedge\; \cdots \;\wedge\; (1 + ct) \,\Big|\, \prod_{j=1}^{u}(a_m - j) \;\wedge\; (1 + ct) \mid p(y, c, x_1, \cdots, x_n, a_1, \cdots, a_m)\Big] \]

Although the second condition seems more complicated, it has only existential quantifiers. Furthermore, all functions in this condition are Diophantine. [Dav73]

The Bounded Universal Quantifier Theorem enables us to find a Diophantine description of properties of numbers that involve all numbers smaller than that number. A prominent example of this is the set of primes. Primality can be characterized as having no divisor smaller than itself, except for 1. In other words: for all numbers m smaller than p, m does not divide p, or m equals 1. This is a bounded universal quantifier description. A concrete elaboration of this can be found in section 4.1.

The Bounded Universal Quantifier Theorem is also useful when we want to find a Diophantine description of a singleton containing the smallest number with a certain property. An example of this is Taxicab(n): the smallest number that can be represented as the sum of two cubes in n different ways (the famous example by Srinivasa Ramanujan is Taxicab(2) = 1729 = $1^3 + 12^3 = 9^3 + 10^3$). A characterization of such a number t is that t has this property, but all numbers smaller than t don't have this property.
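The "smallest number with a property" pattern described above can be made concrete with a brute-force search for Taxicab(2). The sketch below is only an illustration of the bounded-quantifier characterization (t has the property, every smaller number does not); it counts representations a³ + b³ with 1 ≤ a ≤ b, and reads "in n different ways" as "in at least n ways", which is an assumption of this sketch.

```python
def cube_sum_count(t):
    """Number of ways to write t = a^3 + b^3 with positive integers a <= b."""
    count = 0
    a = 1
    while 2 * a ** 3 <= t:
        b = a
        while a ** 3 + b ** 3 < t:
            b += 1
        if a ** 3 + b ** 3 == t:
            count += 1
        a += 1
    return count


def taxicab(n):
    """Smallest t with at least n representations; all smaller t fail the test."""
    t = 1
    while cube_sum_count(t) < n:
        t += 1
    return t
```

The search is only feasible for very small n; it is not meant as an efficient algorithm.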

3.3 The Sequence Number Theorem

In order to 'store' some kind of information (e.g. a sum or product) while letting a bounded universal quantifier run iteratively, the Sequence Number Theorem is useful. It allows us to 'encrypt' an entire sequence of numbers in a single number u, which can be 'decrypted' by a function S(i, u).

This function is the following: [Dav73]
\[ S(i, u) = w \;\Leftrightarrow\; w \equiv L(u) \bmod (1 + iR(u)) \;\wedge\; w \le 1 + iR(u) \]
\[ \Leftrightarrow\; (\exists v, x, y, z) \text{ s.t. } 2u = (x + y - 2)(x + y - 1) + 2y, \quad x = w + z(1 + iy), \quad 1 + iy = w + v - 1 \]

Where the functions L(u) and R(u) are as follows:

x = L(z) ⇔ ∃y[2z = (x + y − 2)(x + y − 1) + 2y] y = R(z) ⇔ ∃x[2z = (x + y − 2)(x + y − 1) + 2y]

Using some straightforward properties of these functions, and the Chinese Remainder Theorem, we can prove that for any finite sequence a1, a2, ··· , aN there exists a u such that for every i ≤ N

S(i, u) ≤ u

S(i, u) = ai

This function can be used to store iterates of some function. The sequence would, in that case, be something like $S(i, u) = a_i = f^{i-1}(x_0)$ for $1 \le i \le N$. [Dav73]

The Sequence Number Theorem is useful for finding a Diophantine description of properties that are defined recursively. Some examples of such sets are the set of Fibonacci numbers [1] or numbers of the form $\prod_{k=1}^{n}((k! + k) \bmod 1523)$.
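The encoding and decoding of a sequence can be sketched as follows. The decoding follows the definitions of L, R and S above, with all variables positive. The encoder is an assumption of this sketch: it picks y as a factorial, so that the moduli 1 + iy are pairwise coprime and larger than every entry, and then solves the congruences with the Chinese Remainder Theorem. All function names are illustrative, and the entries of the sequence are assumed to be positive.

```python
from math import factorial, isqrt


def pair(x, y):
    """The number z with 2z = (x + y - 2)(x + y - 1) + 2y, for positive x and y."""
    return (x + y - 2) * (x + y - 1) // 2 + y


def unpair(z):
    """Return (L(z), R(z)), the inverse of pair."""
    n = (isqrt(8 * z - 7) + 1) // 2          # n = x + y - 1
    y = z - n * (n - 1) // 2
    return n + 1 - y, y


def S(i, u):
    """The positive w with w ≡ L(u) mod (1 + i R(u)) and w <= 1 + i R(u)."""
    x, y = unpair(u)
    modulus = 1 + i * y
    w = x % modulus
    return w if w > 0 else modulus


def encode(seq):
    """Find u with S(i, u) == seq[i - 1] for 1 <= i <= len(seq); entries must be positive."""
    y = factorial(max(len(seq), max(seq)) + 1)   # y > every entry, divisible by 1..N
    x, m = 0, 1
    for i, a in enumerate(seq, start=1):
        mod_i = 1 + i * y
        # lift x so that it also satisfies x ≡ a mod (1 + i y)
        x += m * ((a - x) * pow(m, -1, mod_i) % mod_i)
        m *= mod_i
    return pair(x if x > 0 else m, y)
```

A usage example: `u = encode([3, 1, 4, 1, 5])` followed by `[S(i, u) for i in range(1, 6)]` recovers the sequence. The three-argument `pow` with exponent -1 requires Python 3.8 or later.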

3.4 Putnam's trick

In 1960, Hilary Putnam proved the following theorem: [Put60]

Theorem 3.5. Suppose we have Diophantine set S ⊂ N and a corresponding polynomial M ∈ Z[X0,X1, ··· Xn], such that

k ∈ S ⇔ (∃x1, ··· , xn) s.t. M(k, x1, ··· xn) = 0.

Then there exists a polynomial P ∈ Z[X0, X1, ··· Xn] such that

k ∈ S ⇔ (∃x0, x1, ··· , xn) s.t. k = P (x0, x1, ··· xn)

Proof. We define P to be the following:
\[ P(x_0, x_1, \cdots, x_n) = x_0\left(1 - \left(M(x_0, x_1, \cdots, x_n)\right)^2\right) \]

"⇒": Suppose that $k \in S$. Then, by definition of S, there exist $x_1, \cdots, x_n$ with $M(k, x_1, \cdots, x_n) = 0$. We can then set $x_0 = k$, to obtain $P(x_0, x_1, \cdots, x_n) = k(1 - 0) = k$.

"⇐": Suppose P attains a positive value k. As $x_0$ is positive, the factor $1 - M^2$ must be positive as well, which is only possible if $M(x_0, x_1, \cdots, x_n) = 0$. Furthermore, the positive value P takes is just $k = x_0(1 - 0) = x_0$. Substitution gives $M(k, x_1, \cdots, x_n) = 0$, and thus $k \in S$.

Putnam's trick transforms a polynomial whose set of zeros can be projected onto the first coordinate to find a Diophantine set into a polynomial whose positive range is precisely that Diophantine set. We can now apply these tools to concrete examples.
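Putnam's trick can be illustrated on a toy example. The polynomial M below, whose zeros describe the set of perfect squares, is a hypothetical example chosen for this sketch and does not come from the thesis.

```python
def M(k, x1):
    """Hypothetical example polynomial: zero exactly when k is the square of x1."""
    return k - x1 * x1


def P(x0, x1):
    """Putnam's polynomial x0 * (1 - M(x0, x1)^2)."""
    return x0 * (1 - M(x0, x1) ** 2)


def positive_values(bound0, bound1):
    """Sorted positive values of P on the box 1 <= x0 <= bound0, 1 <= x1 <= bound1."""
    values = {
        P(x0, x1)
        for x0 in range(1, bound0 + 1)
        for x1 in range(1, bound1 + 1)
    }
    return sorted(v for v in values if v > 0)
```

On a finite box, the positive values of P are exactly the squares that can be reached there; all other values of P are zero or negative.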

[1] A much simpler Diophantine description of the Fibonacci numbers is possible, using the unit group of the ring $\mathbb{Z}\left[\frac{1 + \sqrt{5}}{2}\right]$. In fact, this was first shown by Matijasevič in 1970, which completed the proof that all recursively enumerable sets are Diophantine. [Mat70]

4 Application

We have now gathered sufficient tools to find a Diophantine description of any recursively enumerable set. In this section, we will apply these tools to some concrete examples. Note that these examples do not possess any special property that makes them Diophantine, except for being recursively enumerable. Matijasevič's Theorem provides a mechanism that allows us to construct a Diophantine description of a set given an algorithm that enumerates it. Throughout this section, variables will range over the positive integers.

4.1 The set of primes

A lot of literature has been published on Diophantine descriptions of the set of primes. We will consider two approaches: one using the straightforward definition of primes and one using Wilson's theorem.

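4.1.1 The straightforward definition

At first, we can straightforwardly use the definition of a prime number, which is that a prime number is only divisible by 1 and itself. We denote by $\mathcal{P}$ the set of primes and write
\[ n \in \mathcal{P} \;\Leftrightarrow\; \neg(\exists k)_{\le n} \text{ s.t. } k \mid n \wedge k \ne 1 \wedge k \ne n \]
\[ \Leftrightarrow\; (\forall k)_{\le n}\; k \nmid n \vee k = 1 \vee k = n \]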

Lemma 4.1. We define the polynomial P as follows:
\[ P(k, n, u, v) = \left((ku - vn)^2 - 1\right)(n - k). \]
Then the following equivalence holds:
\[ n \in \mathcal{P} \;\Leftrightarrow\; (\forall k)_{\le n}\, (\exists u, v)_{\le n} \text{ s.t. } P(k, n, u, v) = 0 \]

Proof. "⇒": Suppose n is prime. Then, for any $k \le n$, we have either $k = n$ or $k < n$. If $k = n$, then we have $n - k = 0$, and thus $P = 0$. If $k < n$, then $\gcd(k, n) = 1$, as n is prime. From Bézout's lemma it then follows that there exist integers $0 < u < n$ and $0 < v < k < n$ such that $ku - vn = \pm 1$, and hence $(ku - vn)^2 - 1 = 0$, and thus $P(k, n, u, v) = 0$. We conclude that in both cases, the right hand side holds.

"⇐": Suppose the right hand side holds. Then, for every $k \le n$, we have either $n - k = 0$ or $(ku - vn)^2 - 1 = 0$ for some $u, v \le n$. Now assume n is not prime: n has a divisor k strictly between 1 and n. From our supposition and Bézout's lemma it follows that $\gcd(k, n) = 1$, and hence $k = 1$, contradicting our assumption. We must conclude that n is prime.

We can now apply the Bounded Universal Quantifier Theorem to get rid of the universal quantifier in our description of primes. In this case, our bounds y and u are the same bound, namely n. It is straightforward to check that the polynomial $Q(n) = n^3 + 1$ satisfies the described conditions. We end up with:
\[ (\forall k)_{\le n}(\exists u, v)_{\le n}\left[\left((ku - nv)^2 - 1\right)(k - n) = 0\right] \]
\[ \Leftrightarrow\; (\exists c, t, a_1, a_2)\Big[(1 + ct) = \prod_{k=1}^{n}(1 + kt) \;\wedge\; t = (n^3 + 1)! \;\wedge\; (1 + ct) \,\Big|\, \prod_{j=1}^{n}(a_1 - j) \;\wedge\; (1 + ct) \,\Big|\, \prod_{j=1}^{n}(a_2 - j) \;\wedge\; (1 + ct) \mid \left((ca_1 - na_2)^2 - 1\right)(c - n)\Big] \]

We now have only existential quantifiers and a system of equations. We have already seen that these equations are Diophantine. All that is left to do now is use our Diophantine description of the exponential function and Lemmas 3.1-3.3 to express these equations in polynomials. Although straightforward, this system of polynomial equations will become large and complicated: we need to apply Theorem 2.32 many times and thus end up with a system of many equations.

A nice corollary of the Diophantineness of the set of primes is that there is a uniform bound for the number of calculations one has to do to prove that some N is prime. Using a straightforward definition, it takes $\sqrt{N}$ steps to prove the primality of N. This is unbounded as N increases. However, when using a Diophantine description, one only needs to show that the Diophantine equation has some solution in the integers, which consists of a number of calculations that is independent of N. This uniform bound does not help us in the quest for large primes: we would first have to actually find the integers that satisfy the system of Diophantine equations, which might take a long time. Then the proof that we in fact have a prime is short, but that is of no practical use.
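The bounded-quantifier characterization of Lemma 4.1 can be evaluated literally for small n. The sketch below performs the search over k, u and v by brute force and compares the outcome with trial division; it is restricted to n ≥ 2, because for n = 1 the bounded condition holds trivially. The function names are illustrative.

```python
def P(k, n, u, v):
    """The polynomial of Lemma 4.1."""
    return ((k * u - v * n) ** 2 - 1) * (n - k)


def is_prime_lemma_4_1(n):
    """For every k <= n there are u, v <= n with P(k, n, u, v) = 0 (n >= 2 assumed)."""
    bounded = range(1, n + 1)
    return all(
        any(P(k, n, u, v) == 0 for u in bounded for v in bounded)
        for k in bounded
    )


def is_prime_trial(n):
    """Reference implementation by trial division."""
    return n > 1 and all(n % k != 0 for k in range(2, n))
```

The brute-force search takes on the order of n³ evaluations of P, so this only illustrates the logical form of the lemma.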

4.1.2 Wilson’s Theorem An easier way to make primality Diophantine is by Wilson’s theorem, an old theorem about prime numbers. Theorem 4.2. A number n is prime if and only if n|(n − 1)! + 1.

Proof. Only if: Suppose $n$ is prime. For $n = 2$, this is straightforward, so we can assume $n$ is odd. Then all positive integers smaller than $n$ are coprime to $n$, and $(n - 1)!$ is the product of all these integers. We use
\[
(n - 1)! \bmod n = \prod_{k=1}^{n-1} k \bmod n = \prod_{k=1}^{n-1} (k \bmod n)
\]
Since $n$ is prime, $\mathbb{Z}/n\mathbb{Z}$ is a field. Therefore, every $k \bmod n$ in this product has a unique inverse, which is also contained in the product. The only residues that are their own inverse are $1 \bmod n$ and $n - 1 \bmod n$, so the other residues form pairs of inverses. Each pair multiplies to 1 and therefore cancels out. As $n$ is odd, there is an even number of factors in our product. It follows that only $1$ and $n - 1$ are left. Then we have

\[
(n - 1)! \bmod n = \prod_{k=1}^{n-1} (k \bmod n) = (1 \bmod n)\bigl((n - 1) \bmod n\bigr) = -1 \bmod n
\]

As (n − 1)! ≡ −1 mod n, it follows that n|(n − 1)! + 1. If: Suppose n|(n − 1)! + 1. Now assume n has some divisor k strictly between 1 and n. It follows that k|(n − 1)!. On the other hand (n − 1)! ≡ −1 mod n implies that n and (n − 1)! are coprime. This contradicts the assumption that k is a non-trivial divisor of both. We conclude that n is prime.
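Wilson's criterion is easy to test numerically. The sketch below is a minimal illustration (the function names are hypothetical); it assumes $n \ge 2$, since the statement $n \mid (n-1)! + 1$ also holds trivially for $n = 1$.

```python
from math import factorial, isqrt


def is_prime_wilson(n: int) -> bool:
    """Wilson's criterion: n >= 2 is prime iff n divides (n - 1)! + 1."""
    if n < 2:
        return False
    return (factorial(n - 1) + 1) % n == 0


def is_prime_trial(n: int) -> bool:
    """Reference primality test by trial division."""
    if n < 2:
        return False
    return all(n % k != 0 for k in range(2, isqrt(n) + 1))


primes_below_30 = [n for n in range(30) if is_prime_wilson(n)]
```

The factorial grows very quickly, which illustrates why this criterion is convenient for a Diophantine description but not for practical primality testing.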

The useful thing about this characterization of primes is that no universal quantifiers at all are involved. One only needs a Diophantine description of the factorial. This method is used in [JSWW76]. In this article, the following is shown: Theorem 4.3. For any positive integer k, k + 1 is prime if and only if the following system of Diophantine equations has a solution in the positive integers:

(I) $q = wz + h + j$
(II) $z = (gk + g + k)(h + j) + h$
(III) $(2k)^3(2k + 2)(n + 1)^2 + 1 = f^2$
(IV) $e = p + q + z + 2n$
(V) $e^3(e + 2)(a + 1)^2 + 1 = o^2$
(VI) $x^2 = (a^2 - 1)y^2 + 1$
(VII) $u^2 = 16(a^2 - 1)r^2y^4 + 1$
(VIII) $(x + cu)^2 = ((a + u^2(u^2 - a))^2 - 1)(n + 4dy)^2 + 1$
(IX) $m^2 = (a^2 - 1)l^2 + 1$
(X) $l = k + i(a - 1)$
(XI) $n + l + v = y$
(XII) $m = p + l(a - n - 1) + b(2a(n + 1) - (n + 1)^2 - 1)$
(XIII) $x = q + y(a - p - 1) + s(2a(p + 1) - (p + 1)^2 - 1)$
(XIV) $pm = z + pl(a - p) + t(2ap - p^2 - 1)$

We can turn these equations into homogeneous equations and then add the squares to obtain a single polynomial M, that has a zero in the non-negative integers if and only if k + 1 is prime. Finally we can apply Putnam’s trick to find the following polynomial:

\[
\begin{aligned}
P(a, b, c, &\, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z) = \\
(k + 2)\bigl\{1 &- [wz + h + j - q]^2 \\
&- [(gk + 2g + k + 1)(h + j) + h - z]^2 \\
&- [2n + p + q + z - e]^2 \\
&- [16(k + 1)^3(k + 2)(n + 1)^2 + 1 - f^2]^2 \\
&- [e^3(e + 2)(a + 1)^2 + 1 - o^2]^2 \\
&- [(a^2 - 1)y^2 + 1 - x^2]^2 \\
&- [16r^2y^4(a^2 - 1) + 1 - u^2]^2 \\
&- [((a + u^2(u^2 - a))^2 - 1)(n + 4dy)^2 + 1 - (x + cu)^2]^2 \\
&- [n + l + v - y]^2 \\
&- [(a^2 - 1)l^2 + 1 - m^2]^2 \\
&- [ai + k + 1 - l - i]^2 \\
&- [p + l(a - n - 1) + b(2an + 2a - n^2 - 2n - 2) - m]^2 \\
&- [q + y(a - p - 1) + s(2ap + 2a - p^2 - 2p - 2) - x]^2 \\
&- [z + pl(a - p) + t(2ap - p^2 - 1) - pm]^2\bigr\}
\end{aligned}
\tag{4.1}
\]

In this polynomial, the variables range over the non-negative integers. The set of positive values attained by this polynomial is identical to the set of primes. [JSWW76] This method also implies that for any prime, there exists a proof of its primality consisting of only 87 additions and multiplications. [JSWW76] Unfortunately, this proof will always involve some integers that are very large with respect to the prime itself, such as the factorial of that prime.

4.2 The divisor number function

We now have the appropriate tools to construct a Diophantine description of a function that, given an input $n$, has as output the number of (positive) divisors of $n$. In the construction, the following function is important:
\[
\operatorname{div}(n, k) = 1 - 1 \bmod ((n \bmod k) + 1)
\]
This function is equal to 1 exactly when $k$ divides $n$, and 0 otherwise. It is furthermore a Diophantine function. Using this function, we can express the number of divisors of $n$ in a more explicit formula:

\[
\operatorname{divnum}(n) = \sum_{d \mid n} 1 = \sum_{k=1}^{n} \operatorname{div}(n, k) = \sum_{k=1}^{n} \bigl(1 - 1 \bmod ((n \bmod k) + 1)\bigr)
\]

Now we can use the Sequence Number Theorem and the Bounded Quantifier Theorem. We need to store the consecutive values of this sum as k runs from 1 to n. This comes down to the following: m = divnum(n) ⇔

∃u s.t. S(1, u) = 1 ∧ (∀k)≤n[S(k, u) = S(k − 1, u) + 1 − 1 mod ((n mod k) + 1)] ∧ S(n, u) = m

Although the corresponding system of polynomials would be intricate, this equivalence implies that we can find an explicit Diophantine description of the divisor number function. It follows that the following properties of numbers can be captured by a Diophantine equation:

n is highly composite ⇔ (∀l)≤n [divnum(l) < divnum(n) ∨ l = n]
n is prime ⇔ divnum(n) = 2
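The indicator div(n, k) and the resulting divisor count can be evaluated directly. The following Python sketch is an illustration only; the function names mirror the notation of the text, and the bounded search in `is_highly_composite` is an assumption about how one would test the property numerically.

```python
def div(n: int, k: int) -> int:
    """Indicator from the text: 1 if k divides n, else 0.
    In Python, % binds tighter than -, so this is 1 - (1 mod ((n mod k) + 1))."""
    return 1 - 1 % ((n % k) + 1)


def divnum(n: int) -> int:
    """Number of positive divisors of n, as the sum of div(n, k) for k = 1..n."""
    return sum(div(n, k) for k in range(1, n + 1))


def is_highly_composite(n: int) -> bool:
    """n has strictly more divisors than every smaller positive integer."""
    return all(divnum(l) < divnum(n) for l in range(1, n))


highly_composite_up_to_30 = [n for n in range(1, 31) if is_highly_composite(n)]
primes_up_to_20 = [n for n in range(1, 21) if divnum(n) == 2]
```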

4.3 The divisor sum function A closely related function is the function that, given an input n, gives as output the sum of the divisors of n (other than n itself). The function div(n, k) = 1 − 1 mod ((n mod k) + 1) is again useful. We then define:

\[
\operatorname{divsum}(n) = \sum_{d \mid n,\ d \ne n} d = -n + \sum_{k=1}^{n} \operatorname{div}(n, k)\,k = -n + \sum_{k=1}^{n} k\,\bigl(1 - 1 \bmod ((n \bmod k) + 1)\bigr)
\]

We can again use the Sequence Number Theorem and the Bounded Quantifier Theorem. We need to store the consecutive values of this sum as k runs from 1 to n. This comes down to the following: m = divsum(n) ⇔

∃u s.t. S(1, u) = 1∧(∀k)≤n[S(k, u) = S(k−1, u)+k(1−1 mod ((n mod k)+1))]∧S(n, u)−n = m It follows that the following properties of numbers can be captured by a Diophantine equation:

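$n$ is a perfect number ⇔ $\operatorname{divsum}(n) = n$
$n_1, n_2$ are amicable ⇔ $(\operatorname{divsum}(n_1) - n_2)^2 + (\operatorname{divsum}(n_2) - n_1)^2 = 0$
$n_1, n_2$ are friendly ⇔ $\operatorname{divsum}(n_1)\,n_2 = \operatorname{divsum}(n_2)\,n_1$
$n$ is prime ⇔ $\operatorname{divsum}(n) = 1$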
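These characterisations can be tried out on small numbers. The Python sketch below is only an illustration with hypothetical helper names; it evaluates divsum through the same mod-based indicator and applies the four conditions literally.

```python
def divsum(n: int) -> int:
    """Sum of the divisors of n other than n itself, via the mod-based indicator."""
    return -n + sum(k * (1 - 1 % ((n % k) + 1)) for k in range(1, n + 1))


def is_perfect(n: int) -> bool:
    return divsum(n) == n


def are_amicable(n1: int, n2: int) -> bool:
    # A sum of two squares vanishes only if both squares vanish.
    return (divsum(n1) - n2) ** 2 + (divsum(n2) - n1) ** 2 == 0


def are_friendly(n1: int, n2: int) -> bool:
    # Cross-multiplied form of divsum(n1)/n1 == divsum(n2)/n2.
    return divsum(n1) * n2 == divsum(n2) * n1


perfect_below_500 = [n for n in range(1, 500) if is_perfect(n)]
```

For example, `are_amicable(220, 284)` and `are_friendly(30, 140)` both evaluate to `True`.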

4.4 Euler’s φ function In a similar way, we can find a Diophantine description of Euler’s φ function:

\[
\varphi(n) = \sum_{\gcd(k, n) = 1,\ k \le n} 1 = \sum_{k=1}^{n} \bigl(1 - 1 \bmod \gcd(k, n)\bigr)
\]

This results in the following Diophantine representation:

m = φ(n) ⇔

∃u s.t. S(1, u) = 1 ∧ (∀k)≤n[S(k, u) = S(k − 1, u) + (1 − 1 mod gcd(k, n))] ∧ S(n, u) = m

In order to find a Diophantine representation of perfect totient numbers, we must also define the following function: m = totientsum(n) ⇔

∃u∃c s.t. S(1, u) = n ∧ (∀k)≤n[S(k, u) = φ(S(k − 1, u))] ∧ S(c, u) = 2

∧ ∃v s.t.S(1, v) = 0 ∧ (∀k)≤c+1[S(k, v) = S(k − 1, v) + S(k, u)] ∧ S(c + 1, v) = m

It follows that the following properties of numbers can be captured by a Diophantine equation:

n is a perfect totient number ⇔ totientsum(n) = n
n is prime ⇔ φ(n) = n − 1
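As a small numerical illustration, Euler's φ can be computed with the same indicator trick. The sketch below is an assumption-laden example: the helper names are hypothetical, and `totientsum` uses the usual definition of the iterated totient sum (φ(n) + φ(φ(n)) + ... down to 1) rather than the sequence-number encoding above.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler's totient as the sum of (1 - 1 mod gcd(k, n)) for k = 1..n."""
    return sum(1 - 1 % gcd(k, n) for k in range(1, n + 1))


def totientsum(n: int) -> int:
    """Sum of the iterated totients phi(n), phi(phi(n)), ..., 1
    (assumed standard definition of the quantity used for perfect totient numbers)."""
    total = 0
    while n > 1:
        n = phi(n)
        total += n
    return total


perfect_totient_below_50 = [n for n in range(2, 50) if totientsum(n) == n]
primes_below_20 = [n for n in range(2, 20) if phi(n) == n - 1]
```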

4.5 Gödel’s incompleteness theorems

Matijasevič’s Theorem also has interesting implications for Gödel incompleteness. In particular, consider Gödel’s second incompleteness theorem.

Theorem 4.4. Any effective axiomatic theory of arithmetic that can prove its own consistency is inconsistent.² [Göd31]

Now, assume we have some effective axiomatic theory of arithmetic. We can consider the set of Gödel numbers of proofs of 0 = 1. Obviously, any element of this set would render that theory inconsistent. Moreover, by Gödel’s second incompleteness theorem, a proof that this set is empty would imply that our theory is inconsistent. As it takes a finite amount of time to verify whether any given number is the Gödel number of a proof of 0 = 1, this set is recursively enumerable. Then, by Matijasevič’s Theorem, it is Diophantine. This implies that a Diophantine description of this set can be constructed. This proves the following theorem.

² A rigorous definition of an effective axiomatic theory of arithmetic is beyond the scope of this thesis. It can be found in [Göd31]. For now, it is enough to know that the generally used Peano arithmetic is such a theory.

Theorem 4.5. There exists a polynomial p with the following properties:

• p doesn’t have any zeros in the integers.
• It is impossible to prove that p doesn’t have any zeros in the integers.

In [Dav73] an analogous proof of Theorem 4.5 is presented, one that does not use Theorem 4.4.

5 The complexity of Diophantine descriptions

There are two properties of a Diophantine set that constitute its complexity: the degree of the corresponding Diophantine equation, which we will call the degree of the set, and the number of variables in that equation, which we will call the dimension of the set. As a less complex Diophantine description is obviously preferred, we might want to minimize these quantities.

5.1 Degree

First, let’s look at the degree of the polynomial. In order to minimize the degree, we can introduce new variables. For instance, let’s say we have the following Diophantine equation of degree 7:
\[
x^2y^5 - 19 = 0 \tag{5.1}
\]
(Obviously this equation does not have a solution in the integers.) We can introduce the following variables:
\[
a = x^2, \qquad b = y^2, \qquad c = b^2, \qquad d = yc
\]
Then the Diophantine equation becomes a system of Diophantine equations:
\[
a = x^2, \qquad b = y^2, \qquad c = b^2, \qquad d = yc, \qquad ad - 19 = 0
\]
This can always be done such that none of the equations in the system has degree greater than 2: if an equation had a higher degree, we could introduce a new variable to split it up into several equations. Summing the squares gives us the following Diophantine equation:
\[
[a - x^2]^2 + [b - y^2]^2 + [c - b^2]^2 + [d - yc]^2 + [ad - 19]^2 = 0 \tag{5.2}
\]
It is straightforward to see that equation (5.1) and equation (5.2) are equivalent. The degree, however, has been reduced from 7 to 4. This technique is called Skolem substitution. [JSWW76] This leads to the following theorem on the degree of Diophantine sets:

Theorem 5.1. Every Diophantine set has degree at most 4.

Proof. Using Skolem substitution, we can transform any Diophantine equation into a system of equations that all have degree at most 2. Squaring all these equations yields polynomials of degree at most 4. Summing the obtained polynomials does not change the degree: it will still not exceed 4.

By Matijasevič’s Theorem, this implies that every recursively enumerable set has a Diophantine description of degree at most 4. Whether a smaller upper bound exists, i.e. whether every Diophantine equation is equivalent to some Diophantine equation of degree strictly smaller than 4, is not known. [JSWW76]

This gives us a nice result concerning Hilbert’s tenth problem.
We already saw that for every algorithm, there is some Diophantine equation such that the algorithm cannot decide within a finite amount of time whether the equation has zeros in the integers. Theorem 5.1 implies something stronger: for every algorithm, there is some Diophantine equation of degree at most 4 such that the algorithm cannot decide within a finite amount of time whether that equation has any zeros in the integers. The same can be applied to Theorem 4.5.

We see that there is a trade-off here: we can reduce the degree of a polynomial, but this will always give us more variables. Minimizing dimension and degree simultaneously is generally not possible. For instance, reducing the degree of the prime-representing polynomial given by (4.1) to 4 would result in a polynomial in 42 variables. [JSWW76] For the other functions described in section 4, the dimension will be increased much more.
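The effect of Skolem substitution can be illustrated numerically on the example $x^2y^5 - N = 0$. The Python sketch below is an illustration under stated assumptions: the helper names are hypothetical, and the auxiliary variables are chosen as $a = x^2$, $b = y^2$, $c = b^2$, $d = yc$, so that $ad = x^2y^5$. Every equation of the system has degree at most 2, and the sum of squares has degree 4.

```python
def original(x: int, y: int, n: int = 19) -> int:
    """Left-hand side of the degree-7 equation x^2 * y^5 - n = 0."""
    return x ** 2 * y ** 5 - n


def lift(x: int, y: int) -> tuple:
    """Values of the auxiliary variables forced by the degree-2 equations."""
    a = x * x
    b = y * y
    c = b * b
    d = y * c
    return a, b, c, d


def reduced(x, y, a, b, c, d, n: int = 19) -> int:
    """Degree-4 sum of squares; it vanishes iff every equation of the system holds."""
    return ((a - x * x) ** 2 + (b - y * y) ** 2 + (c - b * b) ** 2
            + (d - y * c) ** 2 + (a * d - n) ** 2)


box = range(-3, 4)
# With the auxiliary variables set consistently, only the last square survives.
consistent = all(reduced(x, y, *lift(x, y)) == original(x, y) ** 2
                 for x in box for y in box)
```

With $N = 32$ instead of 19, the point $(x, y) = (1, 2)$ is a zero of the original equation, and `reduced(1, 2, 1, 4, 16, 32, 32)` is correspondingly 0.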

5.2 Dimension

Minimizing the dimension of a Diophantine set is a little less straightforward. However, the following can be proven:

Theorem 5.2. There exists an $m$ such that every Diophantine set has dimension at most $m$. [Dav73]

Perhaps surprisingly, Yuri Matijasevič and Julia Robinson showed in 1975 that $m = 13$ works if the variables range over the non-negative integers. [MR73] Two years later, Matijasevič sharpened this bound to $m = 9$, for non-negative integer variables. [Mat77] In 1990, it was also shown that a bound of $m = 11$ suffices if the variables range over the integers. [Sun90] Whether a smaller bound for the dimension of Diophantine sets can exist is not known. [JSWW76]

Simply using substitution, we can reduce the dimension of the polynomial (4.1) to 19, which increases the degree to 29. Decreasing the dimension even further, to 12, requires an entirely different method, and raises the degree of the polynomial to 13376. [JSWW76] The other described functions are even more complicated, but it is guaranteed that there is some description of them with only 11 unknowns, and presumably a very high degree.

This also adds strength to Matijasevič’s result. Not only are all recursively enumerable subsets of $\mathbb{Z}$ equal to the projection of the set of zeros of some polynomial, but they are equal to the projection of the set of zeros of some polynomial in at most 11 unknowns. In terms of Hilbert’s problem, this means that for every algorithm, there exists a Diophantine equation with no more than 11 unknowns for which the algorithm cannot decide whether it has any zeros in the integers. Similarly, the polynomial in Theorem 4.5 can be chosen to have only 11 variables.

6 A Diophantine description of exponentiation using $\mathbb{Z}[\sqrt[3]{d}]$

In section 2, we have seen that the unit group of $\mathbb{Z}[\sqrt{d}]$ can be used to find a Diophantine description of exponentiation. In this section, we try a similar approach using the unit group of $\mathbb{Z}[\sqrt[3]{d}]$ instead. This ring is a bit more complicated than $\mathbb{Z}[\sqrt{d}]$, but because of Dirichlet’s Theorem 2.7 both rings have the same properties with respect to their unit groups. It therefore seems reasonable to think that we are able to make the exponential relation Diophantine using the unit group of this ring. A Diophantine description of this unit group would yield a new proof of Matijasevič’s Theorem. Although this would be an original proof, it would probably be more involved and less comprehensible than the proof given in section 2.

As this section will show, the approach applied in section 2.6 does not work in the three-dimensional case. This is because the required divisibility sequences do not exist.

6.1 $\mathbb{Z}[\sqrt[3]{d}]$ and its unit group

In this section, we will focus on the following ring:
\[
\mathbb{Z}[\sqrt[3]{d}] = \{x + y\sqrt[3]{d} + z\sqrt[3]{d}^2 \mid x, y, z \in \mathbb{Z}\}
\]
in which $d$ is an integer that is not the cube of an integer. We see by Definition 2.1 that this is a number ring: its field of fractions is $\mathbb{Q}(\sqrt[3]{d})$, a number field. Again, we study the unit group of this ring:
\[
\mathbb{Z}[\sqrt[3]{d}]^\times = \{\alpha \in \mathbb{Z}[\sqrt[3]{d}] \mid \exists \beta \in \mathbb{Z}[\sqrt[3]{d}] \text{ s.t. } \alpha \cdot \beta = 1\}
\]
Similarly, we need a norm to identify the units of this ring. The definition is slightly different in this case:

Definition 6.1. Let $\alpha = x + y\sqrt[3]{d} + z\sqrt[3]{d}^2$ be an element of $\mathbb{Z}[\sqrt[3]{d}]$. We define the norm of $\alpha$ as follows:
\[
N(\alpha) = x^3 + dy^3 + d^2z^3 - 3xyzd \tag{6.1}
\]

We can derive this norm by looking at the determinant of the matrix corresponding to multiplication with $\alpha$ in the $\mathbb{Z}$-module $\mathbb{Z}[\sqrt[3]{d}]$, with the basis $\{1, \sqrt[3]{d}, \sqrt[3]{d}^2\}$:

\begin{align*}
(x + y\sqrt[3]{d} + z\sqrt[3]{d}^2) \cdot 1 &= x + y\sqrt[3]{d} + z\sqrt[3]{d}^2 \\
(x + y\sqrt[3]{d} + z\sqrt[3]{d}^2) \cdot \sqrt[3]{d} &= zd + x\sqrt[3]{d} + y\sqrt[3]{d}^2 \\
(x + y\sqrt[3]{d} + z\sqrt[3]{d}^2) \cdot \sqrt[3]{d}^2 &= yd + zd\sqrt[3]{d} + x\sqrt[3]{d}^2
\end{align*}

This gives us the columns of the matrix $M_\alpha$.

\[
M_\alpha = \begin{pmatrix} x & zd & yd \\ y & x & zd \\ z & y & x \end{pmatrix}
\]

\[
\det(M_\alpha) = \begin{vmatrix} x & zd & yd \\ y & x & zd \\ z & y & x \end{vmatrix} = x^3 + dy^3 + d^2z^3 - 3xyzd = N(\alpha) \tag{6.2}
\]

Lemma 6.2. For any element $\alpha \in \mathbb{Z}[\sqrt[3]{d}]$, we have $N(-\alpha) = -N(\alpha)$.

Proof. This can be seen either by introducing minus signs in front of $x$, $y$, and $z$ in the norm equation, or by the fact that $\det(M_{-\alpha}) = \det(-M_\alpha) = (-1)^3 \det(M_\alpha) = -\det(M_\alpha)$.

As opposed to the two-dimensional case, the inverses of units are a bit more difficult to find in this case. However, they are given explicitly in terms of the given unit.

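Lemma 6.3. Let $\alpha = x + y\sqrt[3]{d} + z\sqrt[3]{d}^2$ be such that $N(\alpha) = \pm 1$. Then its inverse $\alpha^{-1}$ is given by

\[
\beta = \pm\bigl((x^2 - yzd) + (z^2d - xy)\sqrt[3]{d} + (y^2 - xz)\sqrt[3]{d}^2\bigr)
\]

Proof. Suppose we have $N(\alpha) = \pm 1$. Then computation shows that

\begin{align*}
\alpha\beta &= \pm\bigl(x + y\sqrt[3]{d} + z\sqrt[3]{d}^2\bigr)\bigl((x^2 - yzd) + (z^2d - xy)\sqrt[3]{d} + (y^2 - xz)\sqrt[3]{d}^2\bigr) \\
&= \pm(x^3 + dy^3 + d^2z^3 - 3xyzd) \\
&= \pm N(\alpha) = 1
\end{align*}
as the minuses cancel out in the case $N(\alpha) = -1$.

Furthermore, we can set $\alpha^{-1} = \beta = \chi + \psi\sqrt[3]{d} + \varphi\sqrt[3]{d}^2$. Finding out what $\beta$ looks like comes down to solving the following system:

\[
\begin{pmatrix} x & zd & yd \\ y & x & zd \\ z & y & x \end{pmatrix}
\begin{pmatrix} \chi \\ \psi \\ \varphi \end{pmatrix}
= \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
\]
Solving this system is straightforward, and using $N(\alpha) = \pm 1$ gives us the desired result.

Lemma 6.4. An element $\alpha \in \mathbb{Z}[\sqrt[3]{d}]$ is a unit if and only if $N(\alpha) = \pm 1$.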

Proof. If: This follows directly from Lemma 6.3.

Only if: Suppose $\alpha \in \mathbb{Z}[\sqrt[3]{d}]$ is a unit. That implies, by definition, the existence of $\beta \in \mathbb{Z}[\sqrt[3]{d}]$ such that $\alpha\beta = 1$. We now use the norm as defined by equation (6.2), and the fact that the determinant of matrices is multiplicative:
\[
\det(AB) = \det(A)\det(B)
\]
In terms of norms, this directly implies

\[
1 = N(1) = N(\alpha\beta) = N(\alpha)N(\beta)
\]
We thus have that the norms of inverses in $\mathbb{Z}[\sqrt[3]{d}]$ are each other’s inverse in $\mathbb{Z}$. The only units in $\mathbb{Z}$ are $1$ and $-1$, so we conclude:
\[
N(\alpha) = N(\beta) = \pm 1
\]
which completes the proof.
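The norm, its multiplicativity and the explicit inverse of Lemma 6.3 can be checked with integer arithmetic. In the Python sketch below (an illustration with hypothetical helper names), an element $x + y\sqrt[3]{d} + z\sqrt[3]{d}^2$ is represented by the triple `(x, y, z)`.

```python
def mul(u, w, d):
    """Product of two elements of Z[cbrt(d)], given as coordinate triples."""
    x1, y1, z1 = u
    x2, y2, z2 = w
    return (x1 * x2 + d * (y1 * z2 + z1 * y2),
            x1 * y2 + y1 * x2 + d * z1 * z2,
            x1 * z2 + y1 * y2 + z1 * x2)


def norm(u, d):
    """N(alpha) = x^3 + d*y^3 + d^2*z^3 - 3*x*y*z*d."""
    x, y, z = u
    return x ** 3 + d * y ** 3 + d * d * z ** 3 - 3 * x * y * z * d


def inverse(u, d):
    """Inverse of a unit according to Lemma 6.3; the sign is the norm itself."""
    x, y, z = u
    s = norm(u, d)
    if s not in (1, -1):
        raise ValueError("not a unit")
    return (s * (x * x - y * z * d), s * (z * z * d - x * y), s * (y * y - x * z))


d = 7                      # d = a^3 - 1 with a = 2
v = (4, 2, 1)              # a^2 + a*cbrt(d) + cbrt(d)^2
v_inv = inverse(v, d)      # expected to be a - cbrt(d), i.e. (2, -1, 0)
```

Here `mul(v, v_inv, d)` returns `(1, 0, 0)`, the coordinates of 1.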

6.2 Application of Dirichlet’s theorem Recall Dirichlet’s Unit Theorem:

Theorem 6.5. Dirichlet (1846): Let K be a number field with r1 real embeddings and r2 pairs of complex conjugate embeddings (so 2r2 complex embeddings in total). Then the unit group of any order in K is finitely generated with r1 + r2 − 1 independent generators of infinite order. More precisely, letting r = r1 + r2 − 1, any order O in K contains multiplicatively independent units u1, ··· , ur of infinite order such that every unit in O can be written uniquely in the form

\[
\zeta u_1^{m_1} \cdots u_r^{m_r}
\]
where $\zeta$ is a root of unity and every $m_i$ is an integer. Abstractly, $O^\times \cong \mu(O) \times \mathbb{Z}^r$, where $\mu(O)$ is the finite group of roots of unity in $O$. [Con]

This can again be applied to $\mathbb{Z}[\sqrt[3]{d}]$, which is an order in $\mathbb{Q}(\sqrt[3]{d})$. In order to determine $r_1$ and $r_2$, we look at the minimum polynomial of $\sqrt[3]{d}$:

\[
p(X) = X^3 - d
\]

The roots of this polynomial are given by
\[
\alpha_k = \sqrt[3]{d} \cdot e^{2\pi i k / 3} \quad \text{for } k = 0, 1, 2.
\]

For brevity, we write
\[
\omega = e^{2\pi i / 3}
\]

We observe that $\omega^2 = \bar{\omega}$: the square of $\omega$ is equal to its complex conjugate. As opposed to $\mathbb{Q}(\sqrt{d})$, $\mathbb{Q}(\sqrt[3]{d})$ is not a Galois extension of $\mathbb{Q}$, as not all the roots above are contained in $\mathbb{Q}(\sqrt[3]{d})$.

Since $\mathbb{Q}(\sqrt[3]{d})$ is contained in $\mathbb{R}$, the identity is an embedding in the real numbers. Moreover, we have the following pair of complex conjugate embeddings:
\begin{align*}
f_1 &: \mathbb{Q}(\sqrt[3]{d}) \hookrightarrow \mathbb{C}, & f_1(x + y\sqrt[3]{d} + z\sqrt[3]{d}^2) &= x + y\sqrt[3]{d}\,\omega + z\sqrt[3]{d}^2\omega^2 \\
f_2 &: \mathbb{Q}(\sqrt[3]{d}) \hookrightarrow \mathbb{C}, & f_2(x + y\sqrt[3]{d} + z\sqrt[3]{d}^2) &= x + y\sqrt[3]{d}\,\omega^2 + z\sqrt[3]{d}^2\omega
\end{align*}
These embeddings correspond to permuting the zeros of the minimum polynomial of $\sqrt[3]{d}$. The complex embeddings always come in pairs: if we have some complex embedding, its conjugate is also a complex embedding. In summary, we now have one real embedding and one pair of complex embeddings. This gives us $r = r_1 + r_2 - 1 = 1 + 1 - 1 = 1$, and hence Dirichlet’s theorem tells us the following:
\[
\mathbb{Z}[\sqrt[3]{d}]^\times \cong \mu(\mathbb{Z}[\sqrt[3]{d}]) \times \mathbb{Z}
\]
The roots of unity in $\mathbb{Z}[\sqrt[3]{d}]$ are just $-1$ and $1$. We let $v$ be the smallest unit larger than 1. We then have
\[
\mathbb{Z}[\sqrt[3]{d}]^\times = \{\pm v^n \mid n \in \mathbb{Z}\} \tag{6.3}
\]

6.3 A suitable choice for d

We can again make it easier to find the fundamental unit, this time by setting $d = a^3 - 1$ for some integer $a > 1$. It then follows that $a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2$ is a unit:

\[
\bigl(a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2\bigr)\bigl(a - \sqrt[3]{d}\bigr) = a^3 - d = a^3 - (a^3 - 1) = 1
\]

From now on, the letter ’d’ will abbreviate the expression ’$a^3 - 1$’, with the condition $a > 1$. In the three-dimensional case, it is much more intricate to prove that this is the fundamental unit. A simple elementary argument is not available. We have to invoke a theorem by Emil Artin.

Definition 6.6. Let O be an order in a number field K, of rank n. Let {x1, ··· , xn} be a basis for O over Z. Then the discriminant disc(O) is defined as follows:

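\[
\operatorname{disc}(O) = \det\bigl(\operatorname{Tr}_{O/\mathbb{Z}}(x_i x_j)\bigr)_{i,j=1}^{n}
\]
The following lemma gives us an easy way of calculating $\operatorname{disc}(\mathbb{Z}[\sqrt[3]{d}])$, our particular case.

Lemma 6.7. If $f \in \mathbb{Z}[X]$ is a monic irreducible polynomial with root $\alpha$, then the order $\mathbb{Z}[\alpha]$ has discriminant $\operatorname{disc}(f)$. [Ste12]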

We can now state Artin’s theorem.

Theorem 6.8. Artin: Let $O$ be an order in a cubic field $K$ with $r_1 = 1$. Viewing $K$ in $\mathbb{R}$, if $v > 1$ is a unit of $O$ (so $v \in O^\times$), then $4v^3 + 24 > |\operatorname{disc}(O)|$. [Con]

Although a proof of Artin’s theorem does not lie within the scope of this thesis, we can prove an important corollary of Artin’s theorem.

Corollary 6.9. Let $O$ be an order in a cubic field $K$ with $r_1 = 1$. Viewing $K$ in $\mathbb{R}$, if $v > 1$ is a unit of $O$ (so $v \in O^\times$) and $4v^{3/2} + 24 \le |\operatorname{disc}(O)|$, then $v$ is the fundamental unit of $O$.

Proof. As $K$ is cubic, $n = r_1 + 2r_2 = 3$. Moreover, since $r_1 = 1$, it follows that $r_2 = 1$, and thus $r = r_1 + r_2 - 1 = 1$. Then, from Dirichlet’s Theorem 6.5 it follows that

\[
O^\times = \{\pm\varepsilon^n \mid n \in \mathbb{Z}\}
\]
in which $\varepsilon$ is the fundamental unit of $O$. As $v$ is a unit greater than 1, it must be that $v = \varepsilon^k$ for some positive $k$. Now assume, for contradiction, that $v$ is not the fundamental unit of $O$, or equivalently $k > 1$. This would mean that $\varepsilon = v^{1/k}$. As $\varepsilon$ is a unit, it follows from Artin’s Theorem 6.8 that
\[
4\varepsilon^3 + 24 > |\operatorname{disc}(O)| \tag{6.4}
\]

We also observe that $\varepsilon = v^{1/k} \le v^{1/2}$. From our assumption it then follows that

\[
4\varepsilon^3 + 24 \le 4v^{3/2} + 24 \le |\operatorname{disc}(O)| \tag{6.5}
\]

Equation (6.4) and equation (6.5) form a contradiction, from which we conclude that v must be the fundamental unit of O.

Using this corollary of Artin’s theorem, we can prove that we have found the fundamental unit of $\mathbb{Z}[\sqrt[3]{d}]$, with our choice of $d = a^3 - 1$.

Lemma 6.10. The fundamental unit of $\mathbb{Z}[\sqrt[3]{d}]$ is $v = a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2$.

Proof. Let us begin by computing the discriminant of the minimum polynomial of $\sqrt[3]{d}$:

\[
\operatorname{disc}(f) = \operatorname{disc}(X^3 - d) = -27d^2 = -27(a^6 - 2a^3 + 1)
\]
It then follows from Lemma 6.7 that this is also the discriminant of $\mathbb{Z}[\sqrt[3]{d}]$. We observe:

\begin{align*}
& a^3 - 1 < a^3 \\
\Rightarrow\ & \sqrt[3]{d} < a \\
\Rightarrow\ & v < 3a^2 \\
\Rightarrow\ & v^{3/2} < 3\sqrt{3}\,a^3 \\
\Rightarrow\ & 4v^{3/2} + 24 < 12\sqrt{3}\,a^3 + 24
\end{align*}

We now have to prove that
\[
12\sqrt{3}\,a^3 + 24 \le |\operatorname{disc}(\mathbb{Z}[\sqrt[3]{d}])| = 27(a^6 - 2a^3 + 1)
\]
for every integer $a > 1$. Then it will be implied by Corollary 6.9 that $v$ is the fundamental unit. We can prove this as follows:

1. For $a = 2$, we have
\begin{align*}
12\sqrt{3} \cdot 2^3 + 24 &= 96\sqrt{3} + 24 < 96 \cdot 2 + 24 = 216 \\
27(2^6 - 2 \cdot 2^3 + 1) &= 27 \cdot 49 = 1323 > 216
\end{align*}

2. We look at the derivative of both functions for $a \ge 2$.
\begin{align*}
\frac{d}{da}\bigl(12\sqrt{3}\,a^3 + 24\bigr) &= 36\sqrt{3}\,a^2 < 72a^2 \\
\frac{d}{da}\bigl(27(a^6 - 2a^3 + 1)\bigr) &= 27(6a^5 - 6a^2) = 162a^2(a^3 - 1) > 72a^2
\end{align*}

3. As $27(a^6 - 2a^3 + 1)$ is greater at $a = 2$ and has a greater derivative for any $a \ge 2$, it must be that $27(a^6 - 2a^3 + 1)$ is greater than $12\sqrt{3}\,a^3 + 24$ for any integer $a > 1$.

By Corollary 6.9, this completes the proof that $v$ is the fundamental unit of $\mathbb{Z}[\sqrt[3]{d}]$.
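The hypothesis of Corollary 6.9 for $v = a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2$ can also be checked numerically for a range of values of $a$. The sketch below is only a floating-point sanity check under the assumption $d = a^3 - 1$; it does not replace the proof, and the function name is hypothetical.

```python
def artin_condition_holds(a: int) -> bool:
    """Check 4 * v**(3/2) + 24 <= |disc| = 27 * d**2 for d = a^3 - 1
    and v = a^2 + a*cbrt(d) + cbrt(d)^2, using floating-point arithmetic."""
    d = a ** 3 - 1
    cbrt_d = d ** (1.0 / 3.0)
    v = a * a + a * cbrt_d + cbrt_d ** 2
    return 4 * v ** 1.5 + 24 <= 27 * d * d


holds_for_range = all(artin_condition_holds(a) for a in range(2, 200))
```

For $a = 2$ the left-hand side is roughly 180, far below $27 \cdot 49 = 1323$, so the floating-point error is irrelevant here. For $a = 1$ we have $d = 0$ and the condition fails, consistent with the requirement $a > 1$.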

6.4 The three-dimensional Pell equation

Before defining the three-dimensional Pell equation, we need the following lemma:

Lemma 6.11. If $\alpha$ is any element of $\mathbb{Z}[\sqrt[3]{d}]^\times$, then we have $\alpha > 0$ if and only if $N(\alpha) = 1$.

Proof. This proof relies on Lemma 6.2 and the multiplicativity of the norm, which has been shown in Lemma 6.4: $N(\alpha\beta) = N(\alpha)N(\beta)$. Furthermore, we have that $\alpha = \pm v^n$ for some integer $n$, and $v = a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2$. From straightforward computation follows:

\begin{align*}
N(v) &= (a^2)^3 + a^3d + d^2 - 3a^3d \\
&= a^6 + a^3(a^3 - 1) + (a^3 - 1)(a^3 - 1) - 3a^3(a^3 - 1) \\
&= a^6 - (a^3 - 1)(a^3 + 1) \\
&= a^6 - (a^6 - 1) = 1
\end{align*}

Now, suppose $\alpha > 0$. Then it follows:

\[
N(\alpha) = N(v^n) = N(v)^n = 1^n = 1.
\]

Otherwise, suppose α < 0. Then we have

N(α) = −N(−α) = −1.

From this we conclude that $\alpha > 0$ if and only if $N(\alpha) = 1$.

Similar to section 2.2, we define the following set:
\[
\mathbb{Z}[\sqrt[3]{d}]^\times \supset G_d = \{\alpha \in \mathbb{Z}[\sqrt[3]{d}]^\times \mid N(\alpha) = 1\} = \{\alpha \in \mathbb{Z}[\sqrt[3]{d}]^\times \mid \alpha > 0\} \tag{6.6}
\]

Lemma 6.12. $G_d$ is a subgroup of $\mathbb{Z}[\sqrt[3]{d}]^\times$.

Proof. The proof of this is identical to the proof of Lemma 2.4, except that some parts are not necessary because of Lemma 6.11.

This finally brings us to the three-dimensional Pell equation:

\[
x^3 + y^3(a^3 - 1) + z^3(a^3 - 1)^2 - 3xyz(a^3 - 1) = 1 \tag{6.7}
\]

Lemma 6.13. For any $\alpha = x + y\sqrt[3]{d} + z\sqrt[3]{d}^2 \in \mathbb{Z}[\sqrt[3]{d}]$, we have that $\alpha \in G_d$ if and only if $(x, y, z)$ is a solution to equation (6.7).

Proof. This follows from the definition of Gd, equation (6.6). Equation (6.7) is equivalent to N(α) = 1.

Lemma 6.14. $G_d$ is cyclic.

Proof. As $G_d$ contains precisely the positive elements of $\mathbb{Z}[\sqrt[3]{d}]^\times$, it looks as follows:

\[
G_d = \{v^n \mid n \in \mathbb{Z}\}
\]

This means $G_d$ is generated by $v$ alone, and is hence cyclic.

This is a part where the three-dimensional case is a bit nicer. Because of Lemma 6.11, we need not impose an extra requirement (such as $x > 0$) on a solution $(x, y, z)$ of (6.7) to be the coordinates of a positive unit of $\mathbb{Z}[\sqrt[3]{d}]$. This is because $\mathbb{Z}[\sqrt[3]{d}]$ has only one real embedding, instead of two. It now follows that the following relation holds for solutions $x_n(a), y_n(a), z_n(a)$ of equation (6.7):

\[
x_n(a) + y_n(a)\sqrt[3]{d} + z_n(a)\sqrt[3]{d}^2 = \bigl(a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2\bigr)^n \tag{6.8}
\]

6.5 Behavior of xn(a), yn(a) and zn(a)

From equation (6.8), we can deduce some arithmetical properties of xn(a), yn(a) and zn(a). Again, we drop the dependence on a.

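Lemma 6.15. If $(x_m, y_m, z_m)$ and $(x_n, y_n, z_n)$ are solutions of equation (6.7), then we have

\begin{align*}
x_{m+n} &= x_mx_n + (y_mz_n + z_my_n)d \\
y_{m+n} &= x_my_n + y_mx_n + z_mz_nd \\
z_{m+n} &= x_mz_n + y_my_n + z_mx_n \\
x_{m-n} &= x_mx_n^2 + y_my_n^2d + z_mz_n^2d^2 - (x_my_nz_n + y_mx_nz_n + z_mx_ny_n)d \\
y_{m-n} &= x_mz_n^2d + y_mx_n^2 + z_my_n^2d - x_mx_ny_n - y_my_nz_nd - z_mx_nz_nd \\
z_{m-n} &= x_my_n^2 + y_mz_n^2d + z_mx_n^2 - x_mx_nz_n - y_mx_ny_n - z_my_nz_nd
\end{align*}

Proof. The first three equations follow from working out equation (6.8) for $m + n$:
\begin{align*}
x_{m+n} + y_{m+n}\sqrt[3]{d} + z_{m+n}\sqrt[3]{d}^2 &= \bigl(a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2\bigr)^m \bigl(a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2\bigr)^n \\
&= \bigl(x_m + y_m\sqrt[3]{d} + z_m\sqrt[3]{d}^2\bigr)\bigl(x_n + y_n\sqrt[3]{d} + z_n\sqrt[3]{d}^2\bigr)
\end{align*}

The latter three equations follow from Lemma 6.3 and then working out equation (6.8) for $m - n$:
\begin{align*}
x_{m-n} + y_{m-n}\sqrt[3]{d} + z_{m-n}\sqrt[3]{d}^2 &= \bigl(a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2\bigr)^m \bigl(a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2\bigr)^{-n} \\
&= \bigl(x_m + y_m\sqrt[3]{d} + z_m\sqrt[3]{d}^2\bigr)\bigl((x_n^2 - y_nz_nd) + (z_n^2d - x_ny_n)\sqrt[3]{d} + (y_n^2 - x_nz_n)\sqrt[3]{d}^2\bigr)
\end{align*}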
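The subtraction formulas are the less obvious half of the lemma, and they can be verified numerically against powers of the fundamental unit. The following Python sketch is an illustration with hypothetical helper names; it assumes $d = a^3 - 1$ and computes $(x_n, y_n, z_n)$ by repeated multiplication with $(a^2, a, 1)$.

```python
def add_index(s, t, d):
    """(x_{m+n}, y_{m+n}, z_{m+n}) from the solutions with index m and n."""
    xm, ym, zm = s
    xn, yn, zn = t
    return (xm * xn + (ym * zn + zm * yn) * d,
            xm * yn + ym * xn + zm * zn * d,
            xm * zn + ym * yn + zm * xn)


def sub_index(s, t, d):
    """(x_{m-n}, y_{m-n}, z_{m-n}) according to the explicit formulas."""
    xm, ym, zm = s
    xn, yn, zn = t
    x = (xm * xn ** 2 + ym * yn ** 2 * d + zm * zn ** 2 * d ** 2
         - (xm * yn * zn + ym * xn * zn + zm * xn * yn) * d)
    y = (xm * zn ** 2 * d + ym * xn ** 2 + zm * yn ** 2 * d
         - xm * xn * yn - ym * yn * zn * d - zm * xn * zn * d)
    z = (xm * yn ** 2 + ym * zn ** 2 * d + zm * xn ** 2
         - xm * xn * zn - ym * xn * yn - zm * yn * zn * d)
    return (x, y, z)


def solution(a, n):
    """n-th solution, obtained as the n-th power of (a^2, a, 1)."""
    d = a ** 3 - 1
    s = (1, 0, 0)
    for _ in range(n):
        s = add_index(s, (a * a, a, 1), d)
    return s


formulas_agree = all(
    sub_index(solution(a, m), solution(a, n), a ** 3 - 1) == solution(a, m - n)
    for a in (2, 3) for m in range(6) for n in range(m + 1)
)
```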

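Lemma 6.16. The sequence of solutions $x_n, y_n, z_n$ satisfies the following recurrence relation:
\begin{align*}
x_{n+3} &= 3a^2x_{n+2} - 3ax_{n+1} + x_n \\
y_{n+3} &= 3a^2y_{n+2} - 3ay_{n+1} + y_n \\
z_{n+3} &= 3a^2z_{n+2} - 3az_{n+1} + z_n
\end{align*}

Proof. As usual in the three-dimensional case, this lemma does not follow from elementary arithmetic. We need the Cayley-Hamilton Theorem to obtain this result. We denote a solution to equation (6.7) by the column vector
\[
\begin{pmatrix} x_n \\ y_n \\ z_n \end{pmatrix}
\]
Here $x_n$, $y_n$ and $z_n$ are the coordinates of a unit in the ordered basis $\{1, \sqrt[3]{d}, \sqrt[3]{d}^2\}$. We now have to find the matrix corresponding to multiplication with $(a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2)$. We do this by checking how this acts on our basis vectors, $1$, $\sqrt[3]{d}$ and $\sqrt[3]{d}^2$, in order to find the columns of the matrix $M_v$.
\begin{align*}
(a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2) \cdot 1 &= a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2 \\
(a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2) \cdot \sqrt[3]{d} &= d + a^2\sqrt[3]{d} + a\sqrt[3]{d}^2 \\
(a^2 + a\sqrt[3]{d} + \sqrt[3]{d}^2) \cdot \sqrt[3]{d}^2 &= ad + d\sqrt[3]{d} + a^2\sqrt[3]{d}^2
\end{align*}
This gives us the following matrix:
\[
A = M_v = \begin{pmatrix} a^2 & d & ad \\ a & a^2 & d \\ 1 & a & a^2 \end{pmatrix}
\]
It follows that solutions are of the following form:

\[
\begin{pmatrix} x_n \\ y_n \\ z_n \end{pmatrix}
= \begin{pmatrix} a^2 & d & ad \\ a & a^2 & d \\ 1 & a & a^2 \end{pmatrix}^{n}
\begin{pmatrix} x_0 \\ y_0 \\ z_0 \end{pmatrix}
= A^n \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \tag{6.9}
\]

We now compute the characteristic polynomial of $A$ and use $d = a^3 - 1$.

\begin{align*}
p_A(\lambda) &= \det(A - \lambda I) \\
&= \begin{vmatrix} a^2 - \lambda & d & ad \\ a & a^2 - \lambda & d \\ 1 & a & a^2 - \lambda \end{vmatrix} \\
&= (a^2 - \lambda)^3 + d^2 + a^3d - 3ad(a^2 - \lambda) \\
&= a^6 - 3a^4\lambda + 3a^2\lambda^2 - \lambda^3 + 3a^4\lambda - 3a\lambda - (a^3 - 1)(a^3 + 1) \\
&= -\lambda^3 + 3a^2\lambda^2 - 3a\lambda + 1
\end{align*}

From the Cayley-Hamilton Theorem it then follows that

\[
A^3 - 3a^2A^2 + 3aA - I = 0 \quad\Rightarrow\quad A^3 = 3a^2A^2 - 3aA + I
\]

Substituting this into equation (6.9) gives the following:
\begin{align*}
\begin{pmatrix} x_{n+3} \\ y_{n+3} \\ z_{n+3} \end{pmatrix}
&= A^{n+3} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
= A^3 A^n \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
= A^3 \begin{pmatrix} x_n \\ y_n \\ z_n \end{pmatrix} \\
&= \bigl(3a^2A^2 - 3aA + I\bigr) \begin{pmatrix} x_n \\ y_n \\ z_n \end{pmatrix} \\
&= 3a^2A^2 \begin{pmatrix} x_n \\ y_n \\ z_n \end{pmatrix} - 3aA \begin{pmatrix} x_n \\ y_n \\ z_n \end{pmatrix} + \begin{pmatrix} x_n \\ y_n \\ z_n \end{pmatrix} \\
&= 3a^2 \begin{pmatrix} x_{n+2} \\ y_{n+2} \\ z_{n+2} \end{pmatrix} - 3a \begin{pmatrix} x_{n+1} \\ y_{n+1} \\ z_{n+1} \end{pmatrix} + \begin{pmatrix} x_n \\ y_n \\ z_n \end{pmatrix}
\end{align*}
which is what we wanted to prove.

We can now prove other properties by induction, using the fact that we know the first values of $x_n$, $y_n$ and $z_n$:

\begin{align*}
(x_0, y_0, z_0) &= (1, 0, 0) \\
(x_1, y_1, z_1) &= (a^2, a, 1) \\
(x_2, y_2, z_2) &= (3a^4 - 2a,\ 3a^3 - 1,\ 3a^2)
\end{align*}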
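The recurrence and the initial values can be cross-checked against the matrix description (6.9). The Python sketch below is a minimal illustration with hypothetical function names: it generates the solutions once by repeated multiplication with the matrix $A$ and once by the third-order recurrence.

```python
def solutions_by_matrix(a: int, count: int) -> list:
    """First `count` solutions, obtained by repeatedly applying A = M_v to (1, 0, 0)."""
    d = a ** 3 - 1
    A = ((a * a, d, a * d),
         (a, a * a, d),
         (1, a, a * a))
    s, out = (1, 0, 0), []
    for _ in range(count):
        out.append(s)
        s = tuple(sum(A[i][j] * s[j] for j in range(3)) for i in range(3))
    return out


def solutions_by_recurrence(a: int, count: int) -> list:
    """Same sequence from s_{n+3} = 3a^2 s_{n+2} - 3a s_{n+1} + s_n."""
    out = [(1, 0, 0),
           (a * a, a, 1),
           (3 * a ** 4 - 2 * a, 3 * a ** 3 - 1, 3 * a * a)]
    while len(out) < count:
        s0, s1, s2 = out[-3], out[-2], out[-1]
        out.append(tuple(3 * a * a * p2 - 3 * a * p1 + p0
                         for p0, p1, p2 in zip(s0, s1, s2)))
    return out[:count]


both_agree = all(solutions_by_matrix(a, 12) == solutions_by_recurrence(a, 12)
                 for a in range(2, 6))
```

For $a = 2$ (so $d = 7$) both functions give $(505, 264, 138)$ as the solution with index 3.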

Lemma 6.17. For every non-negative $n$, $x_{n+1} > x_n > n$, $y_{n+1} > y_n \ge n$, and $z_{n+1} > z_n \ge n$.

Proof. The proof is by induction. Firstly, it is straightforward to check that this holds for $n = 0, 1, 2$. Now suppose it holds up to $n = k + 2$. Then the induction step is as follows:

\begin{align*}
x_{k+3} &= 3a^2x_{k+2} - 3ax_{k+1} + x_k \\
&> 3a^2x_{k+2} - 3ax_{k+2} \\
&= 3a(a - 1)x_{k+2} > x_{k+2}
\end{align*}

\[
x_{k+3} > x_{k+2} > k + 2 \quad\Rightarrow\quad x_{k+3} > k + 3
\]

Analogous inductive proofs can be given for yn and zn.

Lemma 6.18. For every non-negative $n$, $(a^2)^n \le x_n \le (3a^2)^n$.

Proof. We use induction. We first observe:

\begin{align*}
x_0 &= (a^2)^0 = (3a^2)^0 \\
x_1 &= (a^2)^1 < (3a^2)^1 \\
x_2 &= 3a^4 - 2a < 3a^4 < (3a^2)^2 \\
x_2 &= 3a^4 - 2a > 3a^4 - 2a^4 = (a^2)^2
\end{align*}
We then suppose it holds up to $n = k + 2$ and the following induction step follows, using Lemma 6.17 and the fact that $a > 1$:

\begin{align*}
x_{k+3} &= 3a^2x_{k+2} - 3ax_{k+1} + x_k \\
&< 3a^2x_{k+2} - (3a - 1)x_{k+1} \\
&< 3a^2x_{k+2} \le 3a^2(3a^2)^{k+2} = (3a^2)^{k+3} \\
x_{k+3} &= 3a^2x_{k+2} - 3ax_{k+1} + x_k \\
&> (3a^2 - 3a)x_{k+2} \\
&= a^2x_{k+2} + a(2a - 3)x_{k+2} \\
&> a^2x_{k+2} \ge a^2(a^2)^{k+2} = (a^2)^{k+3}
\end{align*}
This completes the proof.

Lemma 6.19. Let $p$ be any non-negative integer. Then we have

\[
x_n + (3a^2p - p^2 - 2a)y_n + (ap^2 - (3a^3 - 1)p + a^2)z_n \equiv p^n \bmod (-p^3 + 3a^2p^2 - 3ap + 1)
\]

Proof. Again, this follows by induction. We first observe:

\begin{align*}
& x_0 + (3a^2p - p^2 - 2a)y_0 + (ap^2 - (3a^3 - 1)p + a^2)z_0 = 1 \\[4pt]
& x_1 + (3a^2p - p^2 - 2a)y_1 + (ap^2 - (3a^3 - 1)p + a^2)z_1 \\
&\quad = a^2 + (3a^2p - p^2 - 2a)a + (ap^2 - (3a^3 - 1)p + a^2) \\
&\quad = a^2 + 3a^3p - ap^2 - 2a^2 + ap^2 - 3a^3p + p + a^2 \\
&\quad = p \\[4pt]
& x_2 + (3a^2p - p^2 - 2a)y_2 + (ap^2 - (3a^3 - 1)p + a^2)z_2 \\
&\quad = (3a^4 - 2a) + (3a^2p - p^2 - 2a)(3a^3 - 1) + (ap^2 - (3a^3 - 1)p + a^2)(3a^2) \\
&\quad = 3a^4 - 2a + 9a^5p - 3a^2p - 3a^3p^2 + p^2 - 6a^4 + 2a + 3a^3p^2 - 9a^5p + 3a^2p + 3a^4 \\
&\quad = p^2
\end{align*}

Again, the factors $c_1$, $c_2$ and $c_3$ to be placed in front of $x_n$, $y_n$ and $z_n$ can be found by solving the following matrix equation:
\[
\begin{pmatrix} 1 & 0 & 0 \\ a^2 & a & 1 \\ 3a^4 - 2a & 3a^3 - 1 & 3a^2 \end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}
= \begin{pmatrix} 1 \\ p \\ p^2 \end{pmatrix}
\]

In the induction step, we will refer to them as $c_1$, $c_2$ and $c_3$, for brevity and clarity. Their specific forms are not relevant there. We suppose that the lemma holds up to $k + 2$. It then follows:

\begin{align*}
c_1x_{k+3} + c_2y_{k+3} + c_3z_{k+3} &= 3a^2(c_1x_{k+2} + c_2y_{k+2} + c_3z_{k+2}) - 3a(c_1x_{k+1} + c_2y_{k+1} + c_3z_{k+1}) \\
&\quad + (c_1x_k + c_2y_k + c_3z_k) \\
&\equiv 3a^2p^{k+2} - 3ap^{k+1} + p^k \bmod (-p^3 + 3a^2p^2 - 3ap + 1) \\
&= (3a^2p^2 - 3ap + 1)p^k \bmod (-p^3 + 3a^2p^2 - 3ap + 1) \\
&= p^3p^k \bmod (-p^3 + 3a^2p^2 - 3ap + 1) \\
&= p^{k+3} \bmod (-p^3 + 3a^2p^2 - 3ap + 1)
\end{align*}

This completes the proof of the lemma.
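The congruence of Lemma 6.19 can be tested for small parameters. The Python sketch below is an illustration only (hypothetical function names). It assumes the coefficients $c_2 = 3a^2p - p^2 - 2a$ and $c_3 = ap^2 - (3a^3 - 1)p + a^2$ obtained from the matrix equation in the proof, and it generates the solutions by multiplication with the matrix $A$ of Lemma 6.16.

```python
def solutions(a: int, count: int) -> list:
    """First `count` solutions (x_n, y_n, z_n) for d = a^3 - 1."""
    d = a ** 3 - 1
    s, out = (1, 0, 0), []
    for _ in range(count):
        out.append(s)
        x, y, z = s
        # One multiplication by the fundamental unit (matrix A = M_v).
        s = (a * a * x + d * y + a * d * z,
             a * x + a * a * y + d * z,
             x + a * y + a * a * z)
    return out


def congruence_holds(a: int, p: int, n_max: int) -> bool:
    """Check x_n + c2*y_n + c3*z_n = p^n modulo m for all n <= n_max."""
    m = -p ** 3 + 3 * a * a * p * p - 3 * a * p + 1
    c2 = 3 * a * a * p - p * p - 2 * a
    c3 = a * p * p - (3 * a ** 3 - 1) * p + a * a
    if m == 0:
        return True  # Guard only; the congruence is vacuous modulo 0.
    return all((x + c2 * y + c3 * z - p ** n) % m == 0
               for n, (x, y, z) in enumerate(solutions(a, n_max + 1)))


all_hold = all(congruence_holds(a, p, 8) for a in range(2, 5) for p in range(0, 7))
```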

Having obtained this, we will need to find similar divisibility properties to determine the solution number n of a solution. That is, given some solution (x, y, z) of equation (6.7), we want to determine the n for which (x, y, z) = (xn(a), yn(a), zn(a)). After that we can finally construct a new way to capture the exponential relation using only Diophantine equations. This could ultimately result in a new proof that all exponential Diophantine sets are Diophantine.

6.6 Finding the solution number using divisibility properties

So far, the results we obtained about $\mathbb{Z}[\sqrt[3]{d}]^\times$ are very similar to the results about $\mathbb{Z}[\sqrt{d}]^\times$. In the two-dimensional case, we used divisibility properties to find the solution number $n$, given a solution to the Pell equation. In the three-dimensional case, we run into trouble: we will see that for some crucial lemmas, a three-dimensional analog simply doesn’t hold. We were, however, able to obtain some similar results.

Lemma 6.20. $x_n \equiv 1 \bmod (a - 1)$, $y_n \equiv n \bmod (a - 1)$ and $z_n \equiv \frac{n(n+1)}{2} \bmod (a - 1)$.

Proof. This can be shown by induction. However, it is more insightful to show this in a more general way, just as in the second proof of Lemma 2.28. We now work with the polynomial ring $\mathbb{Z}[t]/(t^3 - a^3 + 1)$, in which the positive units are of the form $(a^2 + at + t^2)^n$. We use the group homomorphism ’mod $(a - 1)$’:

\begin{align*}
\bigl(\mathbb{Z}[t]/(t^3 - a^3 + 1)\bigr)^\times &\to \bigl((\mathbb{Z}/(a - 1)\mathbb{Z})[t]/(t^3)\bigr)^\times \\
x + yt + zt^2 &\mapsto x \bmod (a - 1) + t \cdot (y \bmod (a - 1)) + t^2 \cdot (z \bmod (a - 1))
\end{align*}

We have used $a^3 - 1 = (a - 1)(a^2 + a + 1)$. The positive elements of the latter unit group are of the form $(1 + t + t^2)^n$, with $t^3 = 0$. The higher order terms in the expansion again cancel, such that

\[
(1 + t + t^2)^n = 1 + nt + nt^2 + \binom{n}{2}t^2 = 1 + nt + \frac{n(n+1)}{2}t^2
\]
which is what we wanted to prove.

Lemma 6.20 could again provide the solution number under the condition that $a - 1$ is greater than $n$. We may however not impose such a condition, so we need more lemmas.

Lemma 6.21. If $a, b, c, n$ are any non-negative integers, then $a \equiv b \bmod c$ implies the following congruences:

xn(a) ≡ xn(b) mod c

yn(a) ≡ yn(b) mod c

zn(a) ≡ zn(b) mod c

Proof. This too can be shown by induction, but we again choose the more general proof. We now have two different rings: Z[t]/(t3 − a3 + 1) and Z[t]/(t3 − b3 + 1). On their unit groups we can define the group homomorphism ’mod c’:

$$\left(\mathbb{Z}[t]/(t^3 - a^3 + 1)\right)^\times \to \left(\mathbb{Z}[t]/(t^3 - a^3 + 1,\ c)\right)^\times$$
$$x + yt + zt^2 \mapsto (x \bmod c) + t \cdot (y \bmod c) + t^2 \cdot (z \bmod c)$$

We again use the form of the elements of these unit groups:

$$x_n(a) + y_n(a)t + z_n(a)t^2 = (a^2 + at + t^2)^n$$
$$x_n(b) + y_n(b)t + z_n(b)t^2 = (b^2 + bt + t^2)^n$$

Under the group homomorphism they become:

$$x_n(a) + y_n(a)t + z_n(a)t^2 \equiv \left((a \bmod c)^2 + (a \bmod c)t + t^2\right)^n \pmod{c}$$
$$x_n(b) + y_n(b)t + z_n(b)t^2 \equiv \left((b \bmod c)^2 + (b \bmod c)t + t^2\right)^n \pmod{c}$$

Now, since $a \equiv b \bmod c$, we have $a \bmod c = b \bmod c$, so the units of $\mathbb{Z}[t]/(t^3 - a^3 + 1)$ and $\mathbb{Z}[t]/(t^3 - b^3 + 1)$ look exactly the same modulo $c$. Comparing coefficients proves the lemma.
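Lemmas 6.20 and 6.21 can be sanity-checked numerically for small parameters. The following Python sketch is an illustration and not part of the proof: it stores an element $x + yt + zt^2$ of $\mathbb{Z}[t]/(t^3 - a^3 + 1)$ as a triple, computes the coefficients of $(a^2 + at + t^2)^n$ by repeated multiplication, and tests the stated congruences. The helper names `mul`, `solution`, `check_lemma_6_20` and `check_lemma_6_21` are hypothetical, and the checks assume $a \geq 2$ and $c \geq 1$ so that the moduli are non-zero.

```python
def mul(u, v, d):
    """Multiply u and v in Z[t]/(t^3 - d); x + y*t + z*t^2 is stored as (x, y, z)."""
    x1, y1, z1 = u
    x2, y2, z2 = v
    return (
        x1 * x2 + d * (y1 * z2 + z1 * y2),  # constant term, using t^3 = d
        x1 * y2 + y1 * x2 + d * z1 * z2,    # coefficient of t, using t^4 = d*t
        x1 * z2 + y1 * y2 + z1 * x2,        # coefficient of t^2
    )


def solution(n, a):
    """Return (x_n(a), y_n(a), z_n(a)): coefficients of (a^2 + a*t + t^2)^n, d = a^3 - 1."""
    d = a**3 - 1
    result = (1, 0, 0)
    for _ in range(n):
        result = mul(result, (a * a, a, 1), d)
    return result


def check_lemma_6_20(a, n):
    """Test x_n = 1, y_n = n, z_n = n(n+1)/2 modulo a - 1 (assumes a >= 2)."""
    x, y, z = solution(n, a)
    m = a - 1
    return (x - 1) % m == 0 and (y - n) % m == 0 and (z - n * (n + 1) // 2) % m == 0


def check_lemma_6_21(a, b, c, n):
    """Test that a = b (mod c) gives coefficientwise congruent solutions (assumes c >= 1)."""
    assert (a - b) % c == 0, "the lemma assumes a = b (mod c)"
    return all((p - q) % c == 0 for p, q in zip(solution(n, a), solution(n, b)))


if __name__ == "__main__":
    print(all(check_lemma_6_20(a, n) for a in range(2, 8) for n in range(12)))
    print(all(check_lemma_6_21(a, a + 3 * c, c, n)
              for a in range(2, 6) for c in range(1, 6) for n in range(8)))
```

Such a check cannot replace the homomorphism argument, but it does catch transcription errors in the stated congruences.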

Lemma 6.22. For any non-negative $n$, we have:

$$\begin{aligned}
x_{3n} &\equiv 1 \bmod x_n, & x_{3n} &\equiv 1 \bmod y_n, & x_{3n} &\equiv 1 \bmod z_n, \\
y_{3n} &\equiv 3dy_n^2z_n \bmod x_n, & y_{3n} &\equiv 3dx_nz_n^2 \bmod y_n, & y_{3n} &\equiv 3x_n^2y_n \bmod z_n, \\
z_{3n} &\equiv 3dy_nz_n^2 \bmod x_n, & z_{3n} &\equiv 3x_n^2z_n \bmod y_n, & z_{3n} &\equiv 3x_ny_n^2 \bmod z_n.
\end{aligned}$$

Proof. This can be seen in several ways. The most straightforward (and least insightful) way is working out

$$x_{3n} + y_{3n}\sqrt[3]{d} + z_{3n}\sqrt[3]{d^2} = \left(x_n + y_n\sqrt[3]{d} + z_n\sqrt[3]{d^2}\right)^3.$$

In this process, we use the fact that we are dealing with a solution to equation (6.7), such that

$$x_n^3 + dy_n^3 + d^2z_n^3 - 3dx_ny_nz_n = 1.$$

A different way to prove this lemma is by the Cayley-Hamilton Theorem. We apply this to the matrix

$$A^n = \begin{pmatrix} x_n & dz_n & dy_n \\ y_n & x_n & dz_n \\ z_n & y_n & x_n \end{pmatrix}.$$

We can find the characteristic polynomial of $A^n$ as follows:

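$$\begin{aligned}
p_{A^n}(\lambda) &= \begin{vmatrix} x_n - \lambda & dz_n & dy_n \\ y_n & x_n - \lambda & dz_n \\ z_n & y_n & x_n - \lambda \end{vmatrix} \\
&= (x_n - \lambda)^3 + dy_n^3 + d^2z_n^3 - 3dy_nz_n(x_n - \lambda) \\
&= -\lambda^3 + 3x_n\lambda^2 - 3x_n^2\lambda + x_n^3 + dy_n^3 + d^2z_n^3 - 3dx_ny_nz_n + 3dy_nz_n\lambda \\
&= -\lambda^3 + 3x_n\lambda^2 - 3(x_n^2 - dy_nz_n)\lambda + 1 \\
&= -\lambda^3 + 3x_n\lambda^2 - 3x_{-n}\lambda + 1,
\end{aligned}$$

where the last step uses $x_{-n} = x_n^2 - dy_nz_n$. The Cayley-Hamilton Theorem then implies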

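$$(A^n)^3 = 3x_n(A^n)^2 - 3x_{-n}A^n + I.$$

And thus

$$\begin{aligned}
\begin{pmatrix} x_{3n} \\ y_{3n} \\ z_{3n} \end{pmatrix}
&= (A^n)^3 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \\
&= \left(3x_n(A^n)^2 - 3(x_n^2 - dy_nz_n)A^n + I\right) \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \\
&= 3x_n \begin{pmatrix} x_{2n} \\ y_{2n} \\ z_{2n} \end{pmatrix} - 3(x_n^2 - dy_nz_n) \begin{pmatrix} x_n \\ y_n \\ z_n \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \\
&= 3x_n \begin{pmatrix} x_n^2 + 2dy_nz_n \\ 2x_ny_n + dz_n^2 \\ 2x_nz_n + y_n^2 \end{pmatrix} - 3(x_n^2 - dy_nz_n) \begin{pmatrix} x_n \\ y_n \\ z_n \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}.
\end{aligned}$$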

Given this relation, reducing modulo $x_n$, $y_n$ or $z_n$ becomes easier, and the congruences follow. For instance,

$$x_{3n} = 3x_n(x_n^2 + 2dy_nz_n) - 3(x_n^2 - dy_nz_n)x_n + 1 = 9dx_ny_nz_n + 1.$$
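The congruences of Lemma 6.22 can also be verified by direct computation. The sketch below, with hypothetical helper names `mul`, `solution` and `check_lemma_6_22`, computes the coefficients of $(a^2 + at + t^2)^n$ and $(a^2 + at + t^2)^{3n}$ in $\mathbb{Z}[t]/(t^3 - d)$ with $d = a^3 - 1$ and tests all nine congruences together with the identity $x_{3n} = 9dx_ny_nz_n + 1$. It assumes $a \geq 2$ and $n \geq 1$, so that $x_n$, $y_n$ and $z_n$ are non-zero moduli.

```python
def mul(u, v, d):
    """Multiply u and v in Z[t]/(t^3 - d); x + y*t + z*t^2 is stored as (x, y, z)."""
    x1, y1, z1 = u
    x2, y2, z2 = v
    return (
        x1 * x2 + d * (y1 * z2 + z1 * y2),
        x1 * y2 + y1 * x2 + d * z1 * z2,
        x1 * z2 + y1 * y2 + z1 * x2,
    )


def solution(n, a):
    """Return (x_n(a), y_n(a), z_n(a)) for d = a^3 - 1."""
    d = a**3 - 1
    result = (1, 0, 0)
    for _ in range(n):
        result = mul(result, (a * a, a, 1), d)
    return result


def check_lemma_6_22(a, n):
    """Test the nine congruences of Lemma 6.22 and x_3n = 9*d*x_n*y_n*z_n + 1."""
    d = a**3 - 1
    x, y, z = solution(n, a)
    x3, y3, z3 = solution(3 * n, a)
    # Each entry is (left-hand side, right-hand side, modulus).
    congruences = [
        (x3, 1, x), (x3, 1, y), (x3, 1, z),
        (y3, 3 * d * y * y * z, x), (y3, 3 * d * x * z * z, y), (y3, 3 * x * x * y, z),
        (z3, 3 * d * y * z * z, x), (z3, 3 * x * x * z, y), (z3, 3 * x * y * y, z),
    ]
    exact = x3 == 9 * d * x * y * z + 1
    return exact and all((lhs - rhs) % m == 0 for lhs, rhs, m in congruences)


if __name__ == "__main__":
    print(all(check_lemma_6_22(a, n) for a in range(2, 7) for n in range(1, 6)))
```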

However, results analogous to Lemma 2.19 and Lemma 2.23 could not be obtained in the three-dimensional case. After trying in vain for a long time to prove these results, we found the explanation for the difficulty: a theorem proven by Marshall Hall in 1936 [Hal36].

Theorem 6.23. There is no regular divisibility sequence whose characteristic polynomial is an irreducible cubic whose last two coefficients are relatively prime. [Hal36]

We can apply this theorem immediately to obtain the following result:

Theorem 6.24. Let $a$ be any number greater than 1. Then none of the sequences $x_0(a), x_1(a), \ldots$, $y_0(a), y_1(a), \ldots$ and $z_0(a), z_1(a), \ldots$ is a divisibility sequence.

Proof. Observe that all three sequences have the same characteristic polynomial, which is just the characteristic polynomial of $A$:

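$$f(x) = p_A(x) = x^3 - 3a^2x^2 + 3ax - 1.$$

This is, firstly, a cubic polynomial. Furthermore, its last two coefficients are relatively prime, as the last coefficient is $-1$. Now, assume $f$ is reducible. Since $f$ is a monic cubic with integer coefficients, it then has a linear factor over $\mathbb{Z}$:

$$\begin{aligned}
f(x) &= x^3 - 3a^2x^2 + 3ax - 1 \\
&= (x + \alpha)(x^2 + \beta x + \gamma) \\
&= x^3 + \beta x^2 + \gamma x + \alpha x^2 + \alpha\beta x + \alpha\gamma \\
&= x^3 + (\alpha + \beta)x^2 + (\alpha\beta + \gamma)x + \alpha\gamma
\end{aligned}$$

$$\Rightarrow \quad \alpha + \beta = -3a^2, \qquad \alpha\beta + \gamma = 3a, \qquad \alpha\gamma = -1.$$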

The last equation implies that $\alpha$ and $\gamma$ are both units in $\mathbb{Z}$. Suppose $\alpha = 1$. Then it follows that $\gamma = -1$, and the other equations become $\beta - 1 = 3a$ and $1 + \beta = -3a^2$. No value of $\beta$ can meet these requirements, since $a > 1$: the first equation forces $\beta > 0$ and the second forces $\beta < 0$. On the other hand, suppose $\alpha = -1$, such that $\gamma = 1$. This implies $-\beta + 1 = 3a$ and $-1 + \beta = -3a^2$. Again, these requirements cannot be met, because together they would force $a = a^2$, which contradicts $a > 1$. We conclude that such a factorization does not exist, and thus $f$ is irreducible. In fact, $f$ is the minimal polynomial of $v$, our fundamental unit.

We also note that $y_0(a) = z_0(a) = 0$, such that Theorem 6.23 implies that $y_0(a), y_1(a), \ldots$ and $z_0(a), z_1(a), \ldots$ are not divisibility sequences. It is still possible that $x_0(a), x_1(a), \ldots$ is a divisibility sequence that is not normal. However, $x_1 = a^2$ and $x_3 = 9a^6 - 9a^3 + 1$, so $x_3 \equiv 1 \bmod x_1$ and hence $x_1 \nmid x_3$. We conclude that $x_0(a), x_1(a), \ldots$ can’t be a divisibility sequence either.
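The statement of Theorem 6.24 can be illustrated by searching for explicit counterexamples. The sketch below is a numerical illustration under our own assumptions, not Hall's argument: `is_irreducible_over_Q` uses the rational root test (a monic integer cubic is reducible over $\mathbb{Q}$ exactly when it has an integer root, which must divide the constant term $-1$) instead of the coefficient comparison in the proof, and `first_failure` looks for indices $m \mid n$ with $u_m \nmid u_n$. All function names are hypothetical.

```python
def mul(u, v, d):
    """Multiply u and v in Z[t]/(t^3 - d); x + y*t + z*t^2 is stored as (x, y, z)."""
    x1, y1, z1 = u
    x2, y2, z2 = v
    return (
        x1 * x2 + d * (y1 * z2 + z1 * y2),
        x1 * y2 + y1 * x2 + d * z1 * z2,
        x1 * z2 + y1 * y2 + z1 * x2,
    )


def sequences(a, count):
    """Return lists xs, ys, zs holding x_n(a), y_n(a), z_n(a) for 0 <= n < count."""
    d = a**3 - 1
    power = (1, 0, 0)
    triples = []
    for _ in range(count):
        triples.append(power)
        power = mul(power, (a * a, a, 1), d)
    xs, ys, zs = (list(column) for column in zip(*triples))
    return xs, ys, zs


def char_poly(a, t):
    """Evaluate f(t) = t^3 - 3a^2 t^2 + 3a t - 1."""
    return t**3 - 3 * a**2 * t**2 + 3 * a * t - 1


def is_irreducible_over_Q(a):
    """Rational root test: the only candidate roots of f are 1 and -1."""
    return all(char_poly(a, r) != 0 for r in (1, -1))


def first_failure(seq):
    """Return the first (m, n) with 1 <= m < n, m | n and seq[m] not dividing seq[n], else None.

    Assumes seq[m] != 0 for m >= 1, which holds for the sequences above when a >= 2.
    """
    for n in range(1, len(seq)):
        for m in range(1, n):
            if n % m == 0 and seq[n] % seq[m] != 0:
                return (m, n)
    return None


if __name__ == "__main__":
    for a in range(2, 6):
        xs, ys, zs = sequences(a, 7)
        print(a, is_irreducible_over_Q(a),
              first_failure(xs), first_failure(ys), first_failure(zs))
```

A returned pair $(m, n)$ is a concrete witness that the corresponding sequence is not a divisibility sequence for that value of $a$.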

Many of the lemmas in section 2.6 rely on the fact that $y_n(a)$ is a divisibility sequence, or at least imply it. Perhaps surprisingly, a similar Diophantine description is just not available in the three-dimensional case. This means that our attempt to find a new Diophantine description of exponentiation using $\mathbb{Z}[\sqrt[3]{d}]$ has failed. Not only has it failed, we have shown that the planned method cannot work. However disappointing at first, this is a far more interesting result than a successful new method. There might be other Diophantine ways to extract the solution number $n$, given a solution to equation (6.7), but such a method cannot be an analog of the one used in section 2.6. Perhaps an industrious reader can find such a method, which will be vastly different from the method in section 2.6.

6.7 Are xn(a), yn(a) and zn(a) Diophantine?

The previous section leads us to wonder whether $x_n(a)$, $y_n(a)$ and $z_n(a)$ are Diophantine, as we could not find a Diophantine description using the same method as in section 2.6. The answer to this question is: yes, $x_n(a)$, $y_n(a)$ and $z_n(a)$ are Diophantine, because they are recursively enumerable. We can, for instance, apply the Sequence Number Theorem to the recurrence relation, and then find $x_n(a)$, $y_n(a)$ and $z_n(a)$ for any $n$. In this construction, we need Theorem 2.32 to find a Diophantine description of exponentiation. However, what we wished to find was some sort of primary Diophantine description of $x_n(a)$, $y_n(a)$ and $z_n(a)$, one in which Theorem 2.32 is not used. This would lead to an original Diophantine description of exponentiation, and hence a new proof of Matijasevič’s Theorem. That wish was not granted, as Theorem 6.24 renders our approach ineffective in the three-dimensional case.
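To make the role of the recurrence relation concrete, the following sketch computes the three sequences from a third-order linear recurrence. We assume here that the recurrence meant is the one read off from the characteristic polynomial $f(x) = x^3 - 3a^2x^2 + 3ax - 1$ of Theorem 6.24, namely $u_{k+3} = 3a^2u_{k+2} - 3au_{k+1} + u_k$; the initial values are the coefficients of $v^0$, $v^1$ and $v^2$ for $v = a^2 + at + t^2$. The code only shows that the sequences are computable step by step. It does not implement the Sequence Number Theorem encoding, and the function names are hypothetical.

```python
def solution_by_power(n, a):
    """Reference values: coefficients of (a^2 + a*t + t^2)^n in Z[t]/(t^3 - (a^3 - 1))."""
    d = a**3 - 1
    x, y, z = 1, 0, 0
    for _ in range(n):
        # Multiply (x, y, z) by (a^2, a, 1), using t^3 = d.
        x, y, z = (
            x * a * a + d * (y + z * a),
            x * a + y * a * a + d * z,
            x + y * a + z * a * a,
        )
    return (x, y, z)


def solution_by_recurrence(n, a):
    """Compute (x_n, y_n, z_n) via u_{k+3} = 3a^2 u_{k+2} - 3a u_{k+1} + u_k."""
    seq = [
        (1, 0, 0),                                    # v^0
        (a**2, a, 1),                                 # v^1
        (3 * a**4 - 2 * a, 3 * a**3 - 1, 3 * a**2),   # v^2
    ]
    while len(seq) <= n:
        u0, u1, u2 = seq[-3], seq[-2], seq[-1]
        seq.append(tuple(3 * a**2 * p2 - 3 * a * p1 + p0
                         for p0, p1, p2 in zip(u0, u1, u2)))
    return seq[n]


if __name__ == "__main__":
    print(all(solution_by_power(n, a) == solution_by_recurrence(n, a)
              for a in range(2, 7) for n in range(12)))
```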

6.8 Comparison to two-dimensional case

The fact that the approach of section 2.6 does not work in the three-dimensional case adds some subtlety to the two-dimensional case. Initially, we thought that by Dirichlet’s theorem the cyclicity and all other steps would follow naturally, but this turned out to be naive and untrue. The strength of the two-dimensional case is not only the structure of the unit group of $\mathbb{Z}[\sqrt{d}]$, but the fact that a divisibility sequence is produced. As we have seen, this does not follow naturally from a cyclic subgroup of the unit group with a recurrence relation. This makes Matijasevič’s Theorem even more special and powerful.
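The contrast between the two cases can be made visible with a small experiment. The sketch below assumes the standard two-dimensional setup, which we take to be the one of chapter 2: $d = a^2 - 1$ with fundamental solution $(a, 1)$, so that $y_0 = 0$, $y_1 = 1$ and $y_{n+1} = 2ay_n - y_{n-1}$. For the three-dimensional case it uses the recurrence $u_{k+3} = 3a^2u_{k+2} - 3au_{k+1} + u_k$ belonging to the characteristic polynomial of Theorem 6.24, with $y_0 = 0$, $y_1 = a$, $y_2 = 3a^3 - 1$. The names `pell_y`, `cubic_y` and `first_failure` are hypothetical.

```python
def pell_y(a, count):
    """Return [y_0, ..., y_{count-1}] for x^2 - (a^2 - 1)y^2 = 1 (assumed 2D setup)."""
    ys = [0, 1]
    while len(ys) < count:
        ys.append(2 * a * ys[-1] - ys[-2])
    return ys[:count]


def cubic_y(a, count):
    """Return [y_0, ..., y_{count-1}] for the three-dimensional analog with d = a^3 - 1."""
    ys = [0, a, 3 * a**3 - 1]
    while len(ys) < count:
        ys.append(3 * a**2 * ys[-1] - 3 * a * ys[-2] + ys[-3])
    return ys[:count]


def first_failure(seq):
    """Return the first (m, n) with 1 <= m < n, m | n and seq[m] not dividing seq[n], else None."""
    for n in range(1, len(seq)):
        for m in range(1, n):
            if n % m == 0 and seq[n] % seq[m] != 0:
                return (m, n)
    return None


if __name__ == "__main__":
    for a in range(2, 6):
        print(a, first_failure(pell_y(a, 25)), first_failure(cubic_y(a, 8)))
```

In the two-dimensional case the search finds no failure within the tested range, whereas in the three-dimensional case it already fails at $(m, n) = (1, 2)$, since $y_2 = 3a^3 - 1$ is not divisible by $y_1 = a$.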

7 Conclusion and outlook

In this thesis, we studied Diophantine descriptions of recursively enumerable sets and Matijasevič’s Theorem, which entails that Hilbert’s tenth problem is unsolvable.

Firstly, we outlined the proof that exponential Diophantine sets are Diophantine, which is the result by Matijasevič that completed the proof that all recursively enumerable sets are Diophantine. In this proof, the structure of the unit group of $\mathbb{Z}[\sqrt{d}]$ is used. We found that the non-negative solutions to the two-dimensional Pell equation behave somewhat exponentially, following a second order recurrence relation. This recurrence relation also provides a divisibility sequence, which is essential in eventually finding a Diophantine description of exponentiation.

After expanding the language of Diophantine descriptions, we applied this mechanism to some concrete examples. Due to the constructivity of the proof given by Matijasevič, Davis, Robinson and Putnam, not only is the existence of a Diophantine equation guaranteed; such an equation can be explicitly written down. Basically, the algorithm that enumerates a recursively enumerable set can be transformed step by step into a Diophantine description of that set. This mechanism is applied, among others, to the set of primes. We then determined limits to the degree and dimension of any Diophantine set, adding strength to the negative solution to Hilbert’s tenth problem: there is always a relatively simple undecidable polynomial.

We also tried a different, but similar approach to proving that exponential Diophantine sets are Diophantine, using the unit group of $\mathbb{Z}[\sqrt[3]{d}]$ instead of $\mathbb{Z}[\sqrt{d}]$. A three-dimensional analog of the Pell equation is constructed. By Dirichlet’s theorem, the non-negative solutions to this equation also behave somewhat exponentially. However, it turns out that the divisibility relation we are looking for does not exist, whereas we had assumed that this would naturally follow from the structure of the unit group. The results in this thesis show that this assumption was naive and false. This also makes Matijasevič’s Theorem more subtle and special: the approach works in a surprisingly specific case. It now appears to be an even more miraculous result than it did before.

I would like to thank the reader and everyone who contributed to this thesis, especially first supervisor prof. dr. J. Top.

References

[Chu36] Alonzo Church. A Note on the Entscheidungsproblem. The Journal of Symbolic Logic, 1(1):40–41, 1936.

[Con] Keith Conrad. Dirichlet’s unit theorem. http://www.math.uconn.edu/~kconrad/blurbs/gradnumthy/unittheorem.pdf.

[Dav73] Martin Davis. Hilbert’s Tenth Problem is Unsolvable. The American Mathematical Monthly, 80(3):233–269, 1973.

[DPR61] Martin Davis, Hilary Putnam, and Julia Robinson. The decision problem for exponential Diophantine equations. Annals of Mathematics, 74:425–436, 1961.

[Göd31] Kurt Gödel. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik, 38(1):173–198, 1931.

[Hal36] Marshall Hall. Divisibility sequences of third order. American Journal of Mathematics, 58(3):577–584, 1936.

[JSWW76] James P. Jones, Daihachiro Sato, Hideo Wada, and Douglas Wiens. Diophantine Representation of the Set of Prime Numbers. American Mathematical Monthly, 83(6):449–464, 1976.

[Kui10] Bouke Kuijer. Creating a diophantine descriptions of a r.e. set and the complexity of such a description. Master’s thesis, University of Groningen, 2010.

[Len02] Hendrik W. Lenstra. Solving the Pell equation. Notices of the AMS, 49(2), 2002.

[Mat70] Yuri V. Matijasevič. The Diophantineness of enumerable sets (translated from Russian). Soviet Math. Dokl., 11:354–358, 1970.

[Mat77] Yuri V. Matijasevič. Primes are nonnegative values of a polynomial in 10 variables. Zap. Nauchn. Sem, 68:62–82, 1977.

[MR73] Yuri V. Matijasevič and Julia Robinson. Reduction of an arbitrary Diophantine equation to one in 13 unknowns. Acta Arithmetica, 27:521–553, 1973.

[Put60] Hilary Putnam. An Unsolvable Problem in Number Theory. Journal of Symbolic Logic, 25(3):220–232, 1960.

[Rob52] Julia Robinson. Existential definability in arithmetic. Trans. Amer. Math. Soc., 72:437–449, 1952.

[Smy10] Chris Smyth. The Terms in Lucas Sequences Divisible by Their Indices. Journal of Integer Sequences, 13, 2010.

[Ste12] Peter Stevenhagen. Number rings. Universiteit Leiden, 2012.

[Sun90] Zhi-Wei Sun. Reduction of unknowns in Diophantine representations. Science in China, 35(3), 1990.
