On Some Norm Equations over Cyclotomic Fields
Peter Lombaers
Doctoral thesis presented to the Faculdade de Ciências da Universidade do Porto, Mathematics, 2019
Peter Lombaers: On Some Norm Equations Over Cyclotomic Fields, July 2019, Porto
Supervisor: António Machiavelo
ABSTRACT
For an odd prime $p$, the equation $x^p + y^p = z^p$ can be rewritten as $(x + y)N_p(x + y\zeta_p) = z^p$, where $\zeta_p$ is a primitive $p$-th root of unity and $N_p$ is the norm map of the field $\mathbb{Q}(\zeta_p)$. That this norm equation has no non-trivial integer solutions has been known since Andrew Wiles’ proof of Fermat’s last theorem. However, very little is known about two natural generalisations of this equation, given by $N_p(x + y\zeta_p) = z^p$ and $(a_0 + a_1 + \cdots + a_k)N_p(a_0 + a_1\zeta_p + \cdots + a_k\zeta_p^k) = z^p$, for some integer $1 \leq k \leq p - 3$.
In this thesis, we look at the cyclotomic methods that were used to attack the equation $x^p + y^p = z^p$, and we apply them to those two generalisations. Naturally, this leads to a number of new observations and problems. First of all, we fix the coefficients $a_0, \ldots, a_k$ and investigate their behaviour as $p$ goes to infinity. This leads to an upper bound on $p$ in certain cases. Next, we show that $N_p(a_0 + \cdots + a_k\zeta_p^k)$ can be expressed in terms of linear recurrence sequences depending only on the coefficients $a_i$. This allows us to show that, for fixed $a, b$, there are only finitely many primes $p$ such that $N_p(a\zeta_p^2 + b\zeta_p - a) = z^p$ has non-trivial integer solutions.
We review Kummer’s method of attack on the equation $N_p(x + y\zeta_p) = z^p$ by the use of a logarithmic derivative, and we show that the Iwasawa $p$-adic logarithm can be used to obtain the same results. We review the theory of unramified extensions of degree $p$ of $\mathbb{Q}(\zeta_p)$, and we see how this leads to a proof of Herbrand’s theorem. Combining knowledge of the units of $\mathbb{Z}[\zeta_p]$ with computational results, we show that if $N_p(x + y\zeta_p) = z^p$, then $\max(|x|, |y|) > 5 \times 10^8$, for regular primes $p \geq 5$. Finally, we use the $p$-adic logarithm to give a unified way of proving congruence identities related to Wolstenholme’s theorem. With the same method, we can also prove a number of other congruences, which at first sight seem quite unrelated to each other.
RESUMO
For an odd prime $p$, the equation $x^p + y^p = z^p$ can be rewritten as $(x + y)N_p(x + y\zeta_p) = z^p$, where $\zeta_p$ is a primitive $p$-th root of unity and $N_p$ is the norm of the field $\mathbb{Q}(\zeta_p)$. That this equation has no non-trivial integer solutions has been known since Andrew Wiles’ proof of Fermat’s last theorem. However, little is known about the following two generalisations of this equation: $N_p(x + y\zeta_p) = z^p$ and $(a_0 + a_1 + \cdots + a_k)N_p(a_0 + a_1\zeta_p + \cdots + a_k\zeta_p^k) = z^p$, for some integer $1 \leq k \leq p - 3$.
In this thesis, we look at the cyclotomic methods that were used to attack the equation $x^p + y^p = z^p$, and we apply them to these two generalisations. Naturally, this leads to a series of new observations and problems. First, we fix the coefficients $a_0, \ldots, a_k$ and investigate their behaviour as $p$ tends to infinity. This leads to an upper bound for $p$ in certain cases. Next, we show that the norm $N_p(a_0 + \cdots + a_k\zeta_p^k)$ can be expressed in terms of linear recurrence sequences that depend only on the coefficients $a_i$. This allows us to show that, for fixed $a, b$, there are only finitely many primes $p$ such that $N_p(a\zeta_p^2 + b\zeta_p - a) = z^p$ has non-trivial integer solutions.
We review Kummer’s method for attacking the equation $N_p(x + y\zeta_p) = z^p$ using logarithmic derivatives, and we show that the Iwasawa $p$-adic logarithm can be used to reach the same results. We review the theory of unramified extensions of degree $p$ of $\mathbb{Q}(\zeta_p)$, and we see how this leads to a proof of Herbrand’s theorem. Combining knowledge of the units of $\mathbb{Z}[\zeta_p]$ with computational results, we show that if $N_p(x + y\zeta_p) = z^p$, then $\max(|x|, |y|) > 5 \times 10^8$, for regular primes $p \geq 5$. Finally, we use the $p$-adic logarithm to give a unified way of proving congruence identities related to Wolstenholme’s theorem.
With the same method we can also prove several other congruences which, at first sight, seem unrelated to each other.
ACKNOWLEDGMENTS
This thesis would not have been possible without the help and support of a number of people. First of all I would like to thank my supervisor. António, it’s been a great pleasure to work with you. You’ve always been friendly, enthusiastic and helpful, and I feel lucky to have had you as my supervisor. Next, I would like to thank my friends here in Porto and the friends from the Netherlands that came to visit. You made my time here a great experience, whether it was to enjoy the good things, or complain about the bad things. Special mention should go to Atefeh, without you I would have been taking the stairs for the past four years, and to Nikos, you were always there, listening to my explanations of what I was working on, so that I could understand it better myself. I also want to thank my family for their love and support while I was far away from home. Finally, Saul, without you I would not have survived the final year, but you turned it into a great year.
I would like to acknowledge the financial support by the FCT (Fundação para a Ciência e a Tecnologia) through the PhD grant PD/BD/128063/2016, and by CMUP (Centro Matemática - Universidade do Porto) (UID/MAT/00144/2013), which is funded by FCT (Portugal) with national (MEC) and European structural funds (FEDER), under the partnership agreement PT2020.
CONTENTS
1 Introduction
1.1 The main problem
1.2 Kummer’s approach
1.3 Description of the chapters
1.4 Notation
2 Limits of N-th roots
2.1 The Mahler measure
2.2 An upper bound on p
3 Recurrence sequences
3.1 Norms in terms of a recurrence sequence
3.2 A general recurrence theorem
4 From norms to singular integers
4.1 Singular integers
4.2 Self-prime integers
5 Singular integers
5.1 Idempotents and the Herbrand-Ribet theorem
5.2 Kummer’s logarithmic derivative
5.3 The p-adic logarithm
6 Kummer extensions
6.1 Kummer extensions
6.2 Ramification
6.3 Units in cyclotomic fields
6.4 Unramified extensions
7 Regular primes
7.1 Theoretical results
7.2 Computational results
8 Wolstenholme’s theorem
8.1 Logarithms and binomial coefficients
8.2 Logarithms and cyclotomic integers
Bibliography
1 INTRODUCTION
1.1 the main problem
In this thesis we look at two generalisations of Fermat’s last theorem. They arise naturally when we approach Fermat’s last theorem via the classical, cyclotomic method. Let us first recall the statement of Fermat’s last theorem:
Theorem 1.1 (FLT). For each integer n ≥ 3 there are no non-zero integers x, y, z such that
$$x^n + y^n = z^n. \tag{1.1}$$
The first successful proof was given by Andrew Wiles, using a connection with elliptic curves. Before that, the traditional method to attack FLT was via cyclotomic fields. We first note that we may assume in Theorem 1.1, without loss of generality, that $n = 4$ or that $n = p$ is an odd prime. Fermat himself already wrote a proof for the case $n = 4$ (see chapter 1 of [29] for a good account of the early history of FLT). Therefore we can restrict our attention to the case where $n = p$ is an odd prime. Since $p$ is odd, we know that $x^p + y^p$ is divisible by $x + y$. Let $\zeta_p$ be a primitive $p$-th root of unity. Then we have the equality
$$x^p + y^p = \prod_{i=0}^{p-1}(x + y\zeta_p^i).$$
Let us denote by $N_p = N_{\mathbb{Q}(\zeta_p)/\mathbb{Q}}$ the norm function of the $p$-th cyclotomic field $\mathbb{Q}(\zeta_p)$, so that $\prod_{i=1}^{p-1}(x + y\zeta_p^i) = N_p(x + y\zeta_p)$. Proving FLT is thus equivalent to showing that there are no non-zero integers $x, y, z$ such that
$$(x + y)N_p(x + y\zeta_p) = z^p. \tag{1.2}$$
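As a quick sanity check of the factorisation behind (1.2), the following Python sketch evaluates the norm numerically as a product over the non-trivial $p$-th roots of unity and compares it with $(x^p + y^p)/(x + y)$. It is only an illustration: the function names are ours, and the floating-point product is rounded to the nearest integer, which is reliable only for small inputs.

```python
import cmath

def cyclotomic_norm(x, y, p):
    """Numerically evaluate N_p(x + y*zeta_p) = prod_{i=1}^{p-1} (x + y*zeta_p^i)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    product = 1
    for i in range(1, p):
        product *= x + y * zeta**i
    # The norm of a cyclotomic integer is a rational integer, so we round.
    return round(product.real)

def identity_holds(x, y, p):
    """Check (x + y) * N_p(x + y*zeta_p) == x^p + y^p for an odd prime p."""
    return (x + y) * cyclotomic_norm(x, y, p) == x**p + y**p

print(all(identity_holds(x, y, p) for x, y, p in [(2, 1, 5), (3, -2, 7), (4, 3, 11)]))
```

For larger $x$, $y$ or $p$ one would use the exact expression $(x^p + y^p)/(x + y)$ instead of the floating-point product.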
The first generalisation we shall consider is a slightly less general version of a conjecture by George Gras in [12]. He referred to it as ‘Strong Fermat’s last theorem’.
Conjecture 1.2 (SFLT). For each prime p ≥ 5, the only solutions in pairwise coprime integers x, y, z to the equation
$$N_p(x + y\zeta_p) = z^p \tag{1.3}$$
are given by $z = 1$ and $x + y\zeta_p \in \{\pm 1, \pm\zeta_p, \pm(1 + \zeta_p)\}$.
Note that in [12], Conjecture 1.5 states that also $N_p(x + y\zeta_p) = pz^p$ only has the trivial solutions. The second generalisation we will consider is:
Problem 1.3. Let p ≥ 5 be a prime and let 1 ≤ k ≤ p − 3 be an integer. What are the integer solutions to the following two equations?
$$N_p(a_0 + a_1\zeta_p + \cdots + a_k\zeta_p^k) = z^p \tag{1.4}$$
$$(a_0 + a_1 + \cdots + a_k)N_p(a_0 + a_1\zeta_p + \cdots + a_k\zeta_p^k) = z^p \tag{1.5}$$
For $k = 1$, equation (1.5) gives us back FLT and equation (1.4), with coprime $a_0, a_1$, is precisely Conjecture 1.2. If $k = p - 2$ then $a_0 + a_1\zeta_p + \cdots + a_{p-2}\zeta_p^{p-2}$ is a generic element of $\mathbb{Z}[\zeta_p]$, and thus (1.5) has many solutions. For example, $N_p(u\gamma^p)$ will be a $p$-th power of an integer if $u$ is a unit of $\mathbb{Z}[\zeta_p]$ and $\gamma$ is any element of $\mathbb{Z}[\zeta_p]$. This is also the reason why we restrict to $p \geq 5$. Though Conjecture 1.2 got a little attention on its own (see [2, 12, 13]), almost all results come from theorems connected to FLT which can be applied to Conjecture 1.2 and Problem 1.3 as well. It is interesting to compare this situation with Catalan’s conjecture, which states that the only positive integer solution to the equation
$$x^n - 1 = y^q$$
with $n, q \geq 2$ is given by $3^2 - 2^3 = 1$. This has been proved completely by cyclotomic methods by Preda Mihăilescu. The corresponding ‘strong’ version is given by the Nagell-Ljunggren equation
$$\frac{x^n - 1}{x - 1} = y^q.$$
Contrary to SFLT, the Nagell-Ljunggren equation has a rich history of its own (see [3] for a survey). In this thesis we shall look at the approach Ernst Kummer took to FLT and improvements of that approach, and we shall see how this translates to Problem 1.3. Naturally this leads to a number of fresh problems which we shall look at. To make this more concrete, we give an overview of the different steps of Kummer’s approach to FLT, and indicate how these steps relate to the chapters of this thesis.
1.2 kummer’s approach
Suppose $x, y, z$ are non-zero integers such that $x^n + y^n = z^n$.
We may assume without loss of generality that $n = p$ is an odd prime. The analogue of Conjecture 1.2 for $n$ not a prime would be to consider the equation
$$\prod_{i=1}^{n-1}(x + y\zeta_n^i) = z^n,$$
and we can define something similar for Problem 1.3. In chapter 2 we look at a way to give a lower bound for $\max(|x|, |y|)$. This method also works for $n$ not a prime. In chapter 3 we look at a way to write
$$\prod_{i=0}^{n-1}(a_0 + a_1\zeta_n^i + \cdots + a_k\zeta_n^{ik})$$
in terms of recurrence sequences. This method also does not require that $n$ is a prime number.
We may assume without loss of generality that $x, y, z$ are pairwise coprime. For FLT, we can assume this without loss of generality, but for the equation (1.3) of SFLT this is not the case. In fact, if $a, b$ are any two integers, then we can write $N_p(a + b\zeta_p) = cd^p$ for some integers $c, d$ (with possibly $d = 1$). Then we see that $N_p(ac + bc\zeta_p) = c^{p-1} \cdot cd^p = c^p d^p$ is a solution to equation (1.3). So there are infinitely many solutions to equation (1.3) if we do not assume that $x, y, z$ are coprime. We will not consider the case where $x, y, z$ have a common divisor any further.
We separate the proof into two cases. In the first case $p \nmid xyz$, and in the second case $p \mid xyz$. These cases are traditionally referred to as Case 1 and Case 2, and we will write FLT1 and FLT2 accordingly. We can make a similar case distinction for SFLT, and we will refer to these as SFLT1 and SFLT2.
Assume we are in Case 1, so $p \nmid xyz$. Because $x$ and $y$ are coprime, we find that $x + y\zeta_p^i$ is coprime to $x + y\zeta_p^j$ if $i \not\equiv j \bmod p$. Hence we have $(x + y\zeta_p)\mathbb{Z}[\zeta_p] = I^p$ for some ideal $I$ of $\mathbb{Z}[\zeta_p]$. The rings $\mathbb{Z}[\zeta_p]$ do not have unique factorisation into irreducible elements for $p \geq 23$. To get around this problem, Kummer used ideals and the class group. The fact that for $p \nmid xyz$ and $x, y$ coprime we have
$$N_p(x + y\zeta_p) = z^p \implies (x + y\zeta_p)\mathbb{Z}[\zeta_p] = I^p \tag{1.6}$$
does not carry over to the situation with more variables of Problem 1.3. In chapter 4 we try to find conditions under which (1.6) holds for elements of the form $a_0 + a_1\zeta_p + \cdots + a_k\zeta_p^k$.
Apply an annihilator $\eta \in \mathbb{Z}[\mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q})]$ of the class group of $\mathbb{Q}(\zeta_p)$ to get $(x + y\zeta_p)^\eta = u\gamma^p$ for some unit $u \in \mathbb{Z}[\zeta_p]$ and some $\gamma \in \mathbb{Z}[\zeta_p]$. Kummer originally proved FLT for regular primes, that is, primes $p$ that do not divide the class number of $\mathbb{Q}(\zeta_p)$. In this case, the fact that $I^p$ is a principal ideal implies that $I$ must be principal itself. If the prime $p$ is irregular, then $I$ might not be principal, but certain products of Galois conjugates of $I$ will be principal. The two main theorems in this direction are Stickelberger’s theorem and the Herbrand-Ribet theorem. Stickelberger’s theorem was proved for cyclotomic fields already by Kummer. Both of these theorems give annihilators of the class group of $\mathbb{Q}(\zeta_p)$. In chapter 6 we give a proof of the easy direction of the Herbrand-Ribet theorem using class field theory and an explicit construction of an extension of $\mathbb{Q}(\zeta_p)$.
Take the logarithmic derivative and look modulo $p$. The idea in this step is that the logarithmic derivative of a $p$-th power should be congruent to 0 modulo $p$. So if $x + y\zeta_p = u\gamma^p$, then after taking the logarithmic derivative, we only care about the unit $u$. We can then use the knowledge we have of the units of $\mathbb{Z}[\zeta_p]$ to derive a congruence condition modulo $p$ that $x$, $y$ and $z$ must satisfy. For example, Kummer could conclude that if $(x, y, z)$ is a solution to FLT1, then they must satisfy:
$$B_{p-3}\,\frac{xy(x - y)}{(x + y)^3} \equiv 0 \bmod p \tag{1.7}$$
Here $B_{p-3}$ is the $(p-3)$-rd Bernoulli number. In chapter 5 we will show that instead of the logarithmic derivative, we can also use the $p$-adic logarithm. Although not obvious at the start, this will basically give the same information as we can obtain by using the logarithmic derivative. This method is not restricted to FLT and SFLT, but it applies to general cyclotomic elements, like in Problem 1.3. In chapter 8 we look closer at the condition $B_{p-3} \equiv 0 \bmod p$. Primes that satisfy this condition are called Wolstenholme primes. We use the $p$-adic logarithm to get a characterisation of these primes.
Use the symmetry of $x^p + y^p = z^p$ to conclude that $p \mid xyz$. If we have $x^p + y^p = z^p$, then we also get $x^p + (-z)^p = (-y)^p$ and $y^p + (-z)^p = (-x)^p$. If $p \nmid B_{p-3}$, then from equation (1.7) we conclude that $p \mid x - y$, so by using the symmetry of the equation we find that $p \mid x + z$ and $p \mid y + z$. So then $0 \equiv x^p + y^p - z^p \equiv x + y - z \equiv 3x \bmod p$. So $p \mid xyz$ after all, and we can go to Case 2. This kind of symmetry is only something we can use in the case of FLT. For SFLT we have nothing similar. Also for the more general Problem 1.3 we do not have this symmetry.
In Case 2, from a solution to FLT2, construct a smaller solution and use a descent argument to get a contradiction. Case 2 of FLT is considerably more difficult than Case 1. The approaches to Case 2 use a descent, and all rely on the symmetry of $x^p + y^p = z^p$. Therefore they do not carry over to SFLT or to Problem 1.3. Though we will not be able to prove SFLT for a specific prime, we will be able to give a lower bound on $\max(|x|, |y|)$.
1.3 description of the chapters
In chapter 2, we fix $\alpha \in \mathbb{Q}(\zeta_p)$, and look at the behaviour of
$$\lim_{p \to \infty} \sqrt[p]{N_p(\alpha)}.$$
We see that if $\alpha = f(\zeta_p)$ for some polynomial $f \in \mathbb{Z}[X]$, then this limit converges to the Mahler measure of $f$. We then use knowledge of this limit, in Theorem 2.5, to obtain the lower bound $\max(|x|, |y|) > 500$ for $x, y$ that satisfy $N_p(x + y\zeta_p) = z^p$.
In chapter 3, we show that we can write norms in terms of sums of linear recurrence sequences. More specifically, in Theorem 3.4, we see that if f ∈ Z[X] is of degree k, then we can write
$$\prod_{i=0}^{n-1} f(\zeta_n^i) = \sum_{j=0}^{k} A_n^{(j)},$$
where for each $j$, the sequence $(A_n^{(j)})_{n \in \mathbb{N}}$ is a linear recurrence sequence, depending only on the coefficients of $f$. There are many results on perfect powers in linear recurrence sequences. We use these to show that for fixed $a, b$, there are only finitely many primes $p$ such that $N_p(a\zeta_p^2 + b\zeta_p - a) = z^p$ has a solution.
In chapter 4, we try to find out when the following implication holds:
$$N_p(\alpha) = z^p \implies \alpha\mathbb{Z}[\zeta_p] = I^p \text{ for some ideal } I \text{ of } \mathbb{Z}[\zeta_p].$$
This question does not arise in the case of FLT, but only arises for Problem 1.3, so there is not much previously known. We give some conditions on $\alpha$ that ensure the implication is true. We also try to find out when $\alpha$ is coprime to its Galois conjugates. For $x + y\zeta_p$ this occurs precisely when $x$ and $y$ are coprime and $p \nmid x + y$, but for other $\alpha$ the situation is more difficult.
In chapter 5, we take a look at Kummer’s idea to study the equation $x + y\zeta_p = u\gamma^p$, using the logarithmic derivative. We see, in detail, how this leads to a proof of FLT1 for primes which satisfy $p \nmid B_{p-3}$. Next we see that, instead of the logarithmic derivative, we can also use the $p$-adic logarithm to study this kind of equation. However, in Theorem 5.9, we show that the information we obtain with the $p$-adic logarithm is, curiously, the same as the information we obtain with the logarithmic derivative.
In chapter 6, we give a review of the theory of Kummer extensions. We look at the ramification of primes in these extensions. Then we look at the unit group of $\mathbb{Z}[\zeta_p]$, and use these units to construct unramified extensions of $\mathbb{Q}(\zeta_p)$ of degree $p$. This leads to a proof of Herbrand’s direction of the Herbrand-Ribet theorem. All these results are well-known, but we hope that the presentation in this chapter is concrete, and ties in well with the logarithmic derivative of chapter 5.
In chapter 7, we look at SFLT for regular primes. If $p$ is regular, we have extra knowledge of the units, and this allows us to show that $xy(x - y)$ must be divisible by a number of small primes. In Theorem 7.8 we use this to show that if $p \geq 5$ is regular and $N_p(x + y\zeta_p) = z^p$, then $\max(|x|, |y|) > 5 \times 10^8$. This method does not work for Problem 1.3.
In chapter 8, we look at Wolstenholme’s theorem and generalisations of it. Wolstenholme’s theorem states that, for a prime $p \geq 5$, we have $\binom{2p-1}{p-1} \equiv 1 \bmod p^3$. A prime $p$ that satisfies this congruence modulo $p^4$ is called a Wolstenholme prime. This is equivalent to $p$ dividing $B_{p-3}$, hence the link with the previous chapters. There are a number of results that generalise Wolstenholme’s theorem and characterise Wolstenholme primes modulo higher powers of $p$. We use the $p$-adic logarithm to give easier proofs of some of these generalisations, and to extend previously known results. For example, we give a characterisation of Wolstenholme primes in terms of harmonic sums modulo $p^{12}$. In the second section, we apply the $p$-adic logarithm to norms of certain cyclotomic integers. This gives us $p$-adic equations for the trace of these integers. Looking modulo specific powers of $p$, we obtain a number of previously known results, and a number of new results. For example, in Proposition 8.17, we give another proof of the identity
$$\sum_{k=1}^{p-1} \frac{1}{k}\binom{2k}{k} \equiv 0 \bmod p^2.$$
Using the relation with recurrence sequences of chapter 3, we prove the identity
$$\sum_{k=1}^{p-1} \frac{(-1)^{k-1}}{k}\binom{2k}{k} \equiv \frac{(1 - L_p^2)(L_p^2 - 3)}{2p} \bmod p^2,$$
where $(L_n)_{n \in \mathbb{N}}$ is the sequence of Lucas numbers, given by $L_0 = 2$, $L_1 = 1$ and $L_{n+2} = L_{n+1} + L_n$. The most interesting part about this is that these results can be found using the same method, where previously each identity required a different method to prove it.
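The congruences mentioned in this overview are easy to test for small primes. The Python sketch below is only a numerical illustration and not the method of chapter 8; the function names are ours, and it assumes Python 3.8 or later (for `math.comb` and modular inverses via `pow`). It checks Wolstenholme’s theorem and evaluates both sides of the two displayed identities modulo $p^2$.

```python
from math import comb

def lucas(n):
    """Return the Lucas number L_n (L_0 = 2, L_1 = 1)."""
    a, b = 2, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def wolstenholme_holds(p):
    """Check binom(2p-1, p-1) = 1 mod p^3."""
    return comb(2 * p - 1, p - 1) % p**3 == 1

def central_sum(p, alternating=False):
    """Sum over k = 1..p-1 of (+/-) binom(2k, k)/k, reduced modulo p^2."""
    modulus = p * p
    total = 0
    for k in range(1, p):
        sign = (-1) ** (k - 1) if alternating else 1
        total += sign * comb(2 * k, k) * pow(k, -1, modulus)
    return total % modulus

def lucas_side(p):
    """(1 - L_p^2)(L_p^2 - 3) / (2p) modulo p^2; the numerator is divisible by p."""
    square = lucas(p) ** 2
    numerator = (1 - square) * (square - 3)
    assert numerator % p == 0
    return (numerator // p) * pow(2, -1, p * p) % (p * p)

for p in (5, 7, 11, 13):
    print(p, wolstenholme_holds(p), central_sum(p) == 0,
          central_sum(p, alternating=True) == lucas_side(p))
```

Such a check is of course no proof, but it is a convenient way to confirm that the signs and exponents in an identity have been transcribed correctly.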
1.4 notation
We use the following notation in this thesis. For us, $p$ will always denote an odd prime, except when mentioned otherwise. For $n \geq 3$ an integer, we let $\zeta_n$ be a primitive $n$-th root of unity, so it is a root of the $n$-th cyclotomic polynomial $\Phi_n(X)$. In most chapters, $n = p$ is an odd prime. If the prime $p$ is fixed throughout a chapter, we will drop the subscript $p$ and just write $\zeta = \zeta_p$ and $\Phi(X) = \Phi_p(X)$. We will work with the cyclotomic fields $\mathbb{Q}(\zeta_n)$. In chapters 4, 5, 7 and sections 3 and 4 of chapter 6, we will fix $n = p$ throughout the chapter and write $K = \mathbb{Q}(\zeta_p)$. We assume that the reader has had a first course in algebraic number theory, similar to the first chapter of [16]. The ring of integers of $\mathbb{Q}(\zeta_n)$ is $\mathbb{Z}[\zeta_n]$ and if we use the notation $K = \mathbb{Q}(\zeta_p)$, then we will write $\mathcal{O}_K$ for the ring of integers of $K$. We denote the elements of the Galois group $\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q})$ by $\sigma_a$, for $a \in (\mathbb{Z}/n\mathbb{Z})^*$, where $\sigma_a$ is determined by $\sigma_a(\zeta_n) = \zeta_n^a$. If $p$ is fixed and $K = \mathbb{Q}(\zeta_p)$, then we will also write $\mathrm{Gal}(K/\mathbb{Q}) = G$. The norm of $K$ is given by $N_p(\alpha) = \prod_{\sigma \in G} \sigma(\alpha)$ and the trace is given by $\mathrm{Tr}_p(\alpha) = \sum_{\sigma \in G} \sigma(\alpha)$. If it is obvious which prime $p$ we are talking about, we will drop the subscript $p$ and just write $N$ and $\mathrm{Tr}$. For us, $\pi$ will always denote the element $1 - \zeta$ of $K$, so that we have $\pi^{p-1}\mathcal{O}_K = p\mathcal{O}_K$ as ideals. We will write $(\alpha)$ for the principal ideal generated by $\alpha$ if it is clear in what ring we are working. If $\mathfrak{p}$ is a prime ideal of $\mathcal{O}_K$, then $\mathrm{ord}_{\mathfrak{p}}(\alpha)$ is the maximal $n \in \mathbb{Z}$ such that $\mathfrak{p}^n$ divides $\alpha$. So, for example, $\mathrm{ord}_{(\pi)}(p) = p - 1$. The field $\mathbb{Q}_p$ is the field of $p$-adic numbers. The $p$-adic logarithm will always be denoted by $\log_p$ and the ordinary logarithm will be denoted by $\log$.
2 LIMITS OF N-TH ROOTS
Let $\alpha \in \mathbb{Q}(\zeta_p)$ and write $\alpha = f(\zeta_p)$ for some polynomial $f \in \mathbb{Z}[X]$. Then we have
$$N_p(\alpha) = \prod_{i=1}^{p-1} f(\zeta_p^i),$$
and thus $N_p(\alpha) = z^p$ for some integer $z \in \mathbb{Z}$ if and only if
$$\sqrt[p]{\prod_{i=1}^{p-1} f(\zeta_p^i)} \in \mathbb{Z}.$$
In this chapter we shall fix $f$ and investigate the behaviour of the above expression as $p$ grows. In the first part we see how this relates to the Mahler measure of a polynomial. We give a concrete formula for the limit when the polynomial $f$ has degree 2. In the second part of the chapter we use this to give an upper bound on $p$ in terms of $a$ and $b$ (or equivalently a lower bound on $a$ and $b$ in terms of $p$) when there is a solution to $N_p(a + b\zeta_p) = c^p$. Concretely, Theorem 2.5 says that if $N_p(a + b\zeta_p) = c^p$, then $\max(|a|, |b|) > 500$.
2.1 the mahler measure
Let $f$ be a polynomial with integer coefficients. Note that $\prod_{i=1}^{n-1} f(\zeta_n^i)$ makes sense for any positive integer $n$, but will only be equal to the norm of an element of a cyclotomic field if $n$ is an odd prime. Also note that, since $f(\zeta_n^i)$ is the complex conjugate of $f(\zeta_n^{n-i})$, the number $\prod_{i=1}^{n-1} f(\zeta_n^i)$ is never negative, so we have $\prod_{i=1}^{n-1} f(\zeta_n^i) = \prod_{i=1}^{n-1} |f(\zeta_n^i)|$. Here $|x|$ is the absolute value of $x$ as an element of $\mathbb{C}$.
Theorem 2.1. Suppose $f(X) \in \mathbb{Z}[X]$ factors as $f(X) = a(X - \alpha_1)(X - \alpha_2) \cdots (X - \alpha_k)$ over the complex numbers, with $a, \alpha_1, \ldots, \alpha_k \neq 0$. Suppose $|\alpha_i| \neq 1$ for all $i = 1, \ldots, k$. Then we have
$$\lim_{n \to \infty} \sqrt[n]{\prod_{i=1}^{n-1} f(\zeta_n^i)} = |a| \prod_{|\alpha_i| > 1} |\alpha_i| = |a| \prod_{i=1}^{k} \max(1, |\alpha_i|).$$
Proof. First of all, we see
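$$\prod_{i=1}^{n-1} f(\zeta_n^i) = \prod_{i=1}^{n-1} |f(\zeta_n^i)| = |a|^{n-1} \prod_{j=1}^{k} \prod_{i=1}^{n-1} |\zeta_n^i - \alpha_j| = |a|^{n-1} \prod_{j=1}^{k} \left| \frac{1 - \alpha_j^n}{1 - \alpha_j} \right| = \frac{|a|^n}{|f(1)|} \prod_{j=1}^{k} |1 - \alpha_j^n|.$$
Now take the $n$-th root and let $n$ go to infinity. We get
$$\lim_{n \to \infty} \sqrt[n]{\frac{|a|^n}{|f(1)|}} = |a| \lim_{n \to \infty} \frac{1}{\sqrt[n]{|f(1)|}} = |a|.$$
Also, we have
$$\lim_{n \to \infty} \sqrt[n]{|1 - \alpha_j^n|} = \begin{cases} 1 & \text{if } |\alpha_j| < 1, \\ |\alpha_j| & \text{if } |\alpha_j| > 1. \end{cases}$$
Combining this we end up with
$$\lim_{n \to \infty} \sqrt[n]{\prod_{i=1}^{n-1} f(\zeta_n^i)} = |a| \prod_{|\alpha_j| > 1} |\alpha_j|.$$
The condition $|\alpha_i| \neq 1$ is there to ensure that this limit actually converges. If $|\alpha| = 1$ and $\alpha \neq 1$, then $\lim_{n \to \infty} \sqrt[n]{|1 - \alpha^n|}$ need not converge; for instance, if $\alpha$ is a root of unity then $1 - \alpha^n$ vanishes for infinitely many $n$. Note that, even for $f(X) \in \mathbb{Z}[X]$, the condition ‘$|\alpha| \neq 1$ for any root of $f$’ is not equivalent to the condition ‘$f(\zeta_n) \neq 0$ for all $n \in \mathbb{N}$’. For example, the polynomial $X^4 - 2X^3 - 2X + 1$ has two roots with absolute value 1 and two roots with absolute value different from 1.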
Definition 2.2. For $f(X) = a(X - \alpha_1)(X - \alpha_2) \cdots (X - \alpha_k) \in \mathbb{C}[X]$ define the Mahler measure $M(f)$ of $f$ as the number
$$M(f) := |a| \prod_{j=1}^{k} \max(1, |\alpha_j|).$$
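The convergence in Theorem 2.1 can be observed numerically. The following Python sketch is an illustration under our own naming; it sums logarithms to avoid overflow and takes the polynomial as a coefficient list with the highest degree first. It compares the $n$-th root of the product with the Mahler measure for $f(X) = X^2 - 3X + 1$, whose roots are $(3 \pm \sqrt{5})/2$, and for $f(X) = 2X + 1$.

```python
import cmath
from math import exp, log, sqrt

def root_of_product(coeffs, n):
    """n-th root of prod_{i=1}^{n-1} |f(zeta_n^i)|, coefficients highest degree first."""
    log_sum = 0.0
    for i in range(1, n):
        z = cmath.exp(2j * cmath.pi * i / n)
        value = 0
        for coefficient in coeffs:  # Horner evaluation of f(z)
            value = value * z + coefficient
        log_sum += log(abs(value))
    return exp(log_sum / n)

def mahler_measure(leading, roots):
    """Definition 2.2: |a| times the product of max(1, |alpha_j|)."""
    measure = abs(leading)
    for root in roots:
        measure *= max(1.0, abs(root))
    return measure

quadratic = mahler_measure(1, [(3 + sqrt(5)) / 2, (3 - sqrt(5)) / 2])
linear = mahler_measure(2, [-0.5])
for n in (10, 100, 400):
    print(n, root_of_product([1, -3, 1], n), quadratic,
          root_of_product([2, 1], n), linear)
```

For the quadratic example $|f(1)| = 1$, so the factor $\sqrt[n]{|f(1)|}$ from the proof plays no role and the agreement is very fast. For $2X + 1$ we have $|f(1)| = 3$, and the convergence to $M(f) = 2$ is visibly slower.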
In the article [21], D. H. Lehmer considered a method to find large primes. Suppose $f(X)$ is a monic polynomial with integer coefficients and with roots $\alpha_1, \ldots, \alpha_k$. Then, for each positive integer $n$, the numbers $\Delta_n(f) = \prod_{j=1}^{k} (\alpha_j^n - 1)$ will be integers. The prime factors of these numbers should satisfy certain congruence conditions, which might make it easier to find large primes. For example, if we take $f(X) = X - 2$ then we are considering the prime factors of the number $2^n - 1$, so this is related to Mersenne primes. This led Lehmer to consider the behaviour of these integers $\Delta_n$ as $n$ goes to infinity. If $f$ has no roots which are a root of unity, then
$$\lim_{n \to \infty} \frac{\Delta_{n+1}}{\Delta_n} = M(f).$$
For his purposes it was important that $M(f)$ is as small as possible. A theorem of Kronecker says that $M(f) = 1$ if and only if $f$ is the product of cyclotomic polynomials. Excluding these, the polynomial with smallest Mahler measure Lehmer could find was the polynomial
$$f(X) = X^{10} + X^9 - X^7 - X^6 - X^5 - X^4 - X^3 + X + 1,$$
which satisfies $M(f) \approx 1.176$. Lehmer’s conjecture states that there is a constant $\mu > 1$ such that $M(f) \geq \mu$ for all $f$ which are not products of cyclotomic polynomials. This is still an open problem. A non-cyclotomic polynomial with smaller Mahler measure than the one above has not been found. Later, Kurt Mahler considered the problem for multivariable polynomials, and the name Mahler measure got associated to $M(f)$.
2.2 an upper bound on p
Let $f(X) \in \mathbb{Z}[X]$ be a polynomial which does not have a root with absolute value 1. Then by Theorem 2.1 we know that
$$\lim_{n \to \infty} \sqrt[n]{\prod_{i=1}^{n-1} f(\zeta_n^i)} = M(f).$$
If $M(f)$ is not an integer, then we know that there is some $n_0 \in \mathbb{N}$ such that for all $n \geq n_0$ the number $\sqrt[n]{\prod_{i=1}^{n-1} f(\zeta_n^i)}$ is not an integer. If $M(f)$ is an integer, but $\sqrt[n]{\prod_{i=1}^{n-1} f(\zeta_n^i)} \neq M(f)$ from some $n$ onwards, then again we can find such an integer $n_0$. If $n = p \geq n_0$ is a prime then we see that $\sqrt[p]{\prod_{i=1}^{p-1} f(\zeta_p^i)} = \sqrt[p]{N_p(f(\zeta_p))}$. So we find that $N_p(f(\zeta_p))$ cannot be a $p$-th power for any prime $p \geq n_0$.
Let us find an explicit expression for $n_0$ in a simple example. The simplest example is the case where $f(X) = aX + b$ is a linear polynomial with $|a| \neq |b|$. In this case we have $N_p(f(\zeta_p)) = \frac{a^p + b^p}{a + b}$. Note that if $N_p(a\zeta_p + b)$ is a $p$-th power, then also $N_p(-a\zeta_p - b)$ and $N_p(a + b\zeta_p)$ are $p$-th powers, so we can assume without loss of generality that $a > |b| > 0$. Although the following proposition follows immediately from the proof of Fermat’s last theorem, we show this example anyway as an illustration of the method.
Proposition 2.3. Let a > |b| > 0 and let p be a prime such that
$$p > \frac{\log(2)}{\log(a + 1) - \log(a)}.$$
Then there is no integer $c$ such that $a^p + b^p = c^p$.
Proof. By assumption we have
$$p > \frac{\log(2)}{\log(a + 1) - \log(a)} > \frac{\log(2)}{\log(a) - \log(a - 1)},$$
and thus we see that $\log(2) < p \log\left(\frac{a}{a - 1}\right)$ and $\log(2) < p \log\left(\frac{a + 1}{a}\right)$. So $2(a - 1)^p < a^p$ and $2a^p < (a + 1)^p$. Hence we find:
$$(a - 1)^p < a^p - (a - 1)^p \leq a^p + b^p < 2a^p < (a + 1)^p.$$
Therefore $a - 1 < \sqrt[p]{a^p + b^p} < a + 1$. If $b \neq 0$ then $\sqrt[p]{a^p + b^p} \neq a$, so we conclude that $\sqrt[p]{a^p + b^p}$ is not an integer.
A slight variation on the previous proposition is the following.
Theorem 2.4. Let a > |b| > 0 and let p be a prime such that
$$p > \frac{\log(2) + \log(a)}{\log(a) - \log(a - 1)}.$$
Then there is no integer $c$ such that $\frac{a^p + b^p}{a + b} = c^p$.
Proof. Note that $1 - a - b \leq 0$, and thus $a^p(1 - a - b) + b^p < 0$. This means that for all $p$ we have
$$\frac{a^p + b^p}{a + b} < a^p.$$
Since we assume that $p > \frac{\log(2) + \log(a)}{\log(a) - \log(a - 1)}$, we find that $(p - 1)\log(a) - p\log(a - 1) > \log(2)$, and thus $\frac{a^{p-1}}{(a - 1)^p} > 2$. Consequently
$$a^p > 2a(a - 1)^p \geq (a - 1)^p(a + b + 1)$$
and therefore
$$a^p + b^p \geq a^p - (a - 1)^p > (a - 1)^p(a + b).$$
We conclude that $a - 1 < \sqrt[p]{\frac{a^p + b^p}{a + b}} < a$, so $\sqrt[p]{\frac{a^p + b^p}{a + b}}$ is not an integer.
This allows us to computationally give a lower bound on $\max(|a|, |b|)$. We fix an $a > 0$, and we check for all primes $p < \frac{\log(2) + \log(a)}{\log(a) - \log(a - 1)}$ and all $1 - a \leq b \leq a - 1$ coprime to $a$ that $\frac{a^p + b^p}{a + b}$ is not a $p$-th power. Note, for $p = 3$, that $a\zeta_3 + b$ is a general element of $\mathbb{Z}[\zeta_3]$, so we will certainly find many third powers of the form $\frac{a^3 + b^3}{a + b}$. The interesting case is $p > 3$. We did these computations using a simple Pari-GP [25] script. This gave us the following result.
Theorem 2.5. If $p \geq 5$ is a prime and $a, b$ are coprime integers such that $\frac{a^p + b^p}{a + b} = c^p$, then $\max(|a|, |b|) > 500$.
This computational process does not scale very efficiently. We will see in chapter 7 that if we assume that $p$ is a regular prime, then we can greatly improve this lower bound. In Lemma ?? we showed what the Mahler measure of a quadratic polynomial is. We would like to make an analysis similar to that of Theorem 2.4 for $N_p(a\zeta_p^2 + b\zeta_p + c)$. If $f(X) = aX^2 + bX + c$, can we have $\sqrt[p]{N_p(f(\zeta_p))} = M(f)$ or $\sqrt[p]{(a + b + c)N_p(f(\zeta_p))} = M(f)$? If not, there is an integer $n_0$ such that for all primes $p \geq n_0$ we know that $\sqrt[p]{N_p(f(\zeta_p))}$ and $\sqrt[p]{(a + b + c)N_p(f(\zeta_p))}$ are not integers. Can we give this integer $n_0$ explicitly in terms of $a$, $b$ and $c$? The situation is more complicated than in the two variable case, and we do not yet have an answer to these questions.
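The search behind Theorem 2.5 is simple enough to sketch. The original computation was done in Pari-GP and that script is not reproduced here; the following Python version is our own reconstruction of the procedure described above, with hypothetical function names and a much smaller search range. For each $a$ it uses the bound of Theorem 2.4 to limit the primes $p$, and it uses the inequality $\frac{a^p + b^p}{a + b} < a^p$ from the proof to limit the search for a $p$-th root.

```python
from math import gcd, log

def primes_up_to(m):
    """Return the list of primes <= m using a simple sieve."""
    if m < 2:
        return []
    sieve = [True] * (m + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(m ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, m + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def exact_root(value, p, upper):
    """Return c in [1, upper] with c**p == value, or None (binary search)."""
    lo, hi = 1, upper
    while lo <= hi:
        mid = (lo + hi) // 2
        power = mid ** p
        if power == value:
            return mid
        if power < value:
            lo = mid + 1
        else:
            hi = mid - 1
    return None

def search(max_a, min_p=5):
    """List (a, b, p, c) with a > |b| > 0, gcd(a, b) = 1 and (a^p + b^p)/(a + b) = c^p."""
    hits = []
    for a in range(2, max_a + 1):
        # Theorem 2.4: there are no solutions once p exceeds this bound.
        bound = (log(2) + log(a)) / (log(a) - log(a - 1))
        for p in primes_up_to(int(bound)):
            if p < max(min_p, 3):  # only odd primes make the quotient exact
                continue
            for b in range(1 - a, a):
                if b == 0 or gcd(a, b) != 1:
                    continue
                quotient = (a**p + b**p) // (a + b)  # exact, since p is odd
                c = exact_root(quotient, p, a)       # the quotient is below a^p
                if c is not None:
                    hits.append((a, b, p, c))
    return hits

print(search(25))
```

Calling `search(18, min_p=3)` illustrates the remark about $p = 3$: the list then contains $(18, -1, 3, 7)$, since $\frac{18^3 - 1}{17} = 343 = 7^3$.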
3 RECURRENCE SEQUENCES
In this chapter we investigate a connection between norms of cyclotomic integers and linear recurrence sequences. We start with the simplest case, when the cyclotomic integer is of the form $a\zeta_p^2 + b\zeta_p + c$. In Theorem 3.1 we will see that
$$\prod_{i=0}^{n-1} (a\zeta_n^{2i} + b\zeta_n^i + c) = a^n + c^n - H_n,$$
where the numbers $(H_n)_{n \in \mathbb{N}}$ are given by a linear recurrence relation. When $n = p$ is an odd prime, this means that we have expressed $(a + b + c)N_p(a\zeta_p^2 + b\zeta_p + c)$ in terms of a recurrence sequence. The method of linear forms in logarithms can be used to investigate the perfect powers that occur in a linear recurrence sequence. As a corollary, we will see that for fixed non-zero integers $a$ and $b$, there are only finitely many primes $p$ such that $bN_p(a\zeta_p^2 + b\zeta_p - a) = z^p$ has a solution. In the second part of this chapter we generalise Theorem 3.1 to hold for arbitrary cyclotomic elements. This does not make the problem of finding solutions to $N_p(\alpha) = z^p$ easier, but it is interesting in its own right. The main result is given by Theorem 3.4, and we will see some explicit examples for low variable cases.
3.1 norms in terms of a recurrence sequence
We start with the simplest example of a norm that can be expressed in terms of recurrence sequences.
Theorem 3.1. For each integer $n \geq 3$, and all $a, b, c \in \mathbb{C}$ with $a \neq 0$, we have:
$$\prod_{i=0}^{n-1} (a\zeta_n^{2i} + b\zeta_n^i + c) = a^n + c^n - H_n,$$
where $(H_n)_{n \in \mathbb{N}}$ is defined by the recurrence relation
$$H_{n+2} = -bH_{n+1} - acH_n$$
and the initial values $H_0 = 2$ and $H_1 = -b$.
Proof. Let $f(X) = aX^2 + bX + c = a(X - \omega_1)(X - \omega_2)$, for some $\omega_1, \omega_2 \in \mathbb{C}$. So we have $\omega_1\omega_2 = \frac{c}{a}$ and $\omega_1 + \omega_2 = -\frac{b}{a}$. Since $X^n - 1 = \prod_{i=0}^{n-1} (X - \zeta_n^i)$, we find that
$$\prod_{i=0}^{n-1} (a\zeta_n^{2i} + b\zeta_n^i + c) = a^n \prod_{i=0}^{n-1} (\zeta_n^i - \omega_1)(\zeta_n^i - \omega_2) = a^n(\omega_1^n - 1)(\omega_2^n - 1) = a^n + c^n - (a\omega_1)^n - (a\omega_2)^n.$$
Note that $a\omega_1$ and $a\omega_2$ are roots of the polynomial $(X - a\omega_1)(X - a\omega_2) = X^2 + bX + ac$. Hence, if we define $H_n := (a\omega_1)^n + (a\omega_2)^n$, then from the theory of linear recurrence relations, we find that $H_n$ satisfies $H_{n+2} = -bH_{n+1} - acH_n$, and has initial values $H_0 = 2$ and $H_1 = -b$.
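Theorem 3.1 is easy to test numerically. In the Python sketch below (the function names are ours, and the tolerance is an arbitrary choice suited to small inputs) the left-hand side is evaluated as a floating-point product over the $n$-th roots of unity, and the right-hand side is computed exactly from the recurrence.

```python
import cmath

def product_over_roots(a, b, c, n):
    """Evaluate prod_{i=0}^{n-1} (a*z^(2i) + b*z^i + c) with z = exp(2*pi*i/n)."""
    product = 1
    for i in range(n):
        z = cmath.exp(2j * cmath.pi * i / n)
        product *= a * z * z + b * z + c
    return product

def h_sequence(a, b, c, n):
    """Return H_n, where H_0 = 2, H_1 = -b and H_{m+2} = -b*H_{m+1} - a*c*H_m."""
    h_prev, h_curr = 2, -b
    for _ in range(n):
        h_prev, h_curr = h_curr, -b * h_curr - a * c * h_prev
    return h_prev

def theorem_3_1_holds(a, b, c, n):
    """Compare both sides of Theorem 3.1 up to floating-point error."""
    lhs = product_over_roots(a, b, c, n)
    rhs = a**n + c**n - h_sequence(a, b, c, n)
    return abs(lhs - rhs) < 1e-6 * max(1, abs(rhs))

print(all(theorem_3_1_holds(a, b, c, n)
          for a, b, c, n in [(2, 3, 5, 7), (1, -1, -1, 10), (3, 1, -3, 11)]))
```

For $(a, b, c) = (1, -1, -1)$ the sequence $H_n$ is the sequence of Lucas numbers, so for $n = 10$ the product equals $1 + 1 - L_{10} = -121$.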
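We give another proof of Theorem 3.1 using determinants of matrices. It is a longer proof, but the recurrence relation appears very naturally when computing the determinants.
Proof. By general properties of circulant matrices (see for example [7], page 72) we have:
$$\prod_{i=0}^{n-1} (a_0 + a_1\zeta_n^i + a_2\zeta_n^{2i} + \cdots + a_{n-1}\zeta_n^{(n-1)i}) = \begin{vmatrix} a_0 & a_1 & \cdots & a_{n-2} & a_{n-1} \\ a_{n-1} & a_0 & \cdots & a_{n-3} & a_{n-2} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_2 & a_3 & \cdots & a_0 & a_1 \\ a_1 & a_2 & \cdots & a_{n-1} & a_0 \end{vmatrix}_n$$
Here $|M|$ is the determinant of the matrix $M$ and we use the subscript $n$ to remember that it is an $n \times n$ matrix. In particular, we find
$$(-1)^{n-1} \prod_{i=0}^{n-1} (a\zeta_n^{2i} + b\zeta_n^i + c) = \prod_{i=0}^{n-1} (a\zeta_n^i + b + c\zeta_n^{-i}) = \begin{vmatrix} b & a & 0 & \cdots & 0 & c \\ c & b & a & \cdots & 0 & 0 \\ 0 & c & b & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & b & a \\ a & 0 & 0 & \cdots & c & b \end{vmatrix}_n \tag{3.1}$$
We will define:
$$G_n = G_n^{(a,b,c)} := \begin{vmatrix} b & a & 0 & \cdots & 0 & 0 \\ c & b & a & \cdots & 0 & 0 \\ 0 & c & b & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & b & a \\ 0 & 0 & 0 & \cdots & c & b \end{vmatrix}_n$$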
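This is the same matrix as in (3.1), but with 0 in the bottom left and upper right corners. Then we see that, for $n \geq 2$, the determinant in (3.1) is equal to:
$$bG_{n-1} - a \begin{vmatrix} c & a & 0 & \cdots & 0 & 0 \\ 0 & b & a & \cdots & 0 & 0 \\ 0 & c & b & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & b & a \\ a & 0 & 0 & \cdots & c & b \end{vmatrix}_{n-1} + (-1)^{n+1} c \begin{vmatrix} c & b & a & \cdots & 0 & 0 \\ 0 & c & b & \cdots & 0 & 0 \\ 0 & 0 & c & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & c & b \\ a & 0 & 0 & \cdots & 0 & c \end{vmatrix}_{n-1}$$
$$= bG_{n-1} - acG_{n-2} + (-1)^{n+1} a^2 \begin{vmatrix} a & 0 & 0 & \cdots & 0 & 0 \\ b & a & 0 & \cdots & 0 & 0 \\ c & b & a & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & a & 0 \\ 0 & 0 & 0 & \cdots & b & a \end{vmatrix}_{n-2} + (-1)^{n+1} c^2 \begin{vmatrix} c & b & a & \cdots & 0 & 0 \\ 0 & c & b & \cdots & 0 & 0 \\ 0 & 0 & c & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & c & b \\ 0 & 0 & 0 & \cdots & 0 & c \end{vmatrix}_{n-2} - acG_{n-2}$$
$$= bG_{n-1} - 2acG_{n-2} + (-1)^{n+1}(a^n + c^n).$$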
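The numbers $G_n$ satisfy a recurrence relation. Indeed, for $n \geq 2$, we have
\begin{align*}
G_n &=
\begin{vmatrix}
b & a & 0 & \cdots & 0 & 0 \\
c & b & a & \cdots & 0 & 0 \\
0 & c & b & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & b & a \\
0 & 0 & 0 & \cdots & c & b
\end{vmatrix}_n \\
&= bG_{n-1} - a
\begin{vmatrix}
c & a & 0 & \cdots & 0 & 0 \\
0 & b & a & \cdots & 0 & 0 \\
0 & c & b & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & b & a \\
0 & 0 & 0 & \cdots & c & b
\end{vmatrix}_{n-1} \\
&= bG_{n-1} - acG_{n-2}
\end{align*}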
Together with the initial values $G_0 = 1$ and $G_1 = b$, these determine the numbers $G_n$. Hence, we find
\[
\prod_{i=0}^{n-1} (a\zeta_n^{2i} + b\zeta_n^{i} + c) = a^n + c^n + (-1)^{n+1}(bG_{n-1} - 2acG_{n-2}).
\]
Now define $H_n := (-1)^n (bG_{n-1} - 2acG_{n-2})$. It is easy to see that the numbers $H_n$ satisfy the recurrence relation $H_{n+2} = -bH_{n+1} - acH_n$. Using the initial values of the sequence $(G_n)_{n \in \mathbb{N}}$, we see:
\begin{align*}
H_3 &= -bG_2 + 2acG_1 = -b^3 + 3abc \\
H_2 &= bG_1 - 2acG_0 = b^2 - 2ac
\end{align*}
Hence we find:
\begin{align*}
H_1 &= -\frac{H_3 + bH_2}{ac} = -b \\
H_0 &= -\frac{H_2 + bH_1}{ac} = 2
\end{align*}
Finally we conclude that
\[
\prod_{i=0}^{n-1} (a\zeta_n^{2i} + b\zeta_n^{i} + c) = a^n + c^n - H_n.
\]
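The determinant computations in this proof can be checked exactly for small $n$. The sketch below builds the circulant matrix of (3.1) and the tridiagonal matrix defining $G_n$, and compares their determinants with the closed expressions obtained above. The helper names `det`, `circulant`, `tridiagonal`, `G` and `expansion_formula` are hypothetical, and exact rational elimination is used here only as a convenient way to evaluate determinants; it is not the cofactor expansion used in the proof.

```python
from fractions import Fraction


def det(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n = len(m)
    sign, result = 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return sign * result


def circulant(a, b, c, n):
    """Matrix of (3.1): first row (b, a, 0, ..., 0, c), shifted cyclically."""
    first = [0] * n
    first[0] += b
    first[1 % n] += a
    first[(n - 1) % n] += c
    return [[first[(j - i) % n] for j in range(n)] for i in range(n)]


def tridiagonal(a, b, c, n):
    """Matrix defining G_n: b on the diagonal, a above it, c below it."""
    return [[b if j == i else a if j == i + 1 else c if j == i - 1 else 0
             for j in range(n)] for i in range(n)]


def G(a, b, c, n):
    """G_n from G_0 = 1, G_1 = b and G_n = b*G_{n-1} - a*c*G_{n-2}."""
    g0, g1 = 1, b
    for _ in range(n):
        g0, g1 = g1, b * g1 - a * c * g0
    return g0


def expansion_formula(a, b, c, n):
    """b*G_{n-1} - 2ac*G_{n-2} + (-1)^(n+1) * (a^n + c^n)."""
    return (b * G(a, b, c, n - 1) - 2 * a * c * G(a, b, c, n - 2)
            + (-1) ** (n + 1) * (a**n + c**n))


# Sample coefficients (an arbitrary choice made for this illustration).
a, b, c = 2, -3, 5
for n in range(3, 9):
    assert det(tridiagonal(a, b, c, n)) == G(a, b, c, n)
    assert det(circulant(a, b, c, n)) == expansion_formula(a, b, c, n)
```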
The determinants $G_n$ from the previous proof are a specific instance of the following, more general proposition, which also follows from the results in [16].
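Proposition 3.2. Let $a, b_i \in \mathbb{C}$ for $i \geq 0$. Set $A_0 := 1$, and for $n \geq 1$, define:
\[
A_n :=
\begin{vmatrix}
b_0 & b_1 & b_2 & \cdots & b_{n-1} \\
a & b_0 & b_1 & \ddots & \vdots \\
0 & a & b_0 & \ddots & b_2 \\
\vdots & \ddots & \ddots & \ddots & b_1 \\
0 & \cdots & 0 & a & b_0
\end{vmatrix}
\]
For any $c_i \in \mathbb{C}$, $i \geq 0$, we have the equality:
\[
\begin{vmatrix}
c_0 & c_1 & c_2 & \cdots & c_{n-1} \\
a & b_0 & b_1 & \cdots & b_{n-2} \\
0 & a & b_0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & b_1 \\
0 & \cdots & 0 & a & b_0
\end{vmatrix}
= c_0A_{n-1} - ac_1A_{n-2} + \cdots + (-1)^{n-1}a^{n-1}c_{n-1}A_0.
\]
In particular, if $b_i = 0$ for $i \geq k$, then the determinants $A_n$ satisfy the recurrence relation
\[
A_n = b_0A_{n-1} - ab_1A_{n-2} + \cdots + (-1)^{k-1}a^{k-1}b_{k-1}A_{n-k}.
\]
Proof. We use induction on $n$. It is obvious for $n = 1$. Now assume that it holds for matrices of size $(n-1) \times (n-1)$. Then, expanding along the first column, we see
\begin{align*}
\begin{vmatrix}
c_0 & c_1 & c_2 & \cdots & c_{n-1} \\
a & b_0 & b_1 & \cdots & b_{n-2} \\
0 & a & b_0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & b_1 \\
0 & \cdots & 0 & a & b_0
\end{vmatrix}
&= c_0A_{n-1} - a
\begin{vmatrix}
c_1 & c_2 & c_3 & \cdots & c_{n-1} \\
a & b_0 & b_1 & \cdots & b_{n-3} \\
0 & a & b_0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & b_1 \\
0 & \cdots & 0 & a & b_0
\end{vmatrix} \\
&= c_0A_{n-1} - a\left(c_1A_{n-2} - \cdots + (-1)^{n-2}a^{n-2}c_{n-1}A_0\right) \\
&= c_0A_{n-1} - ac_1A_{n-2} + \cdots + (-1)^{n-1}a^{n-1}c_{n-1}A_0
\end{align*}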
The last statement of the proposition is obtained by putting ci = bi for all i ≥ 0.
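Proposition 3.2 can be tested exactly on small matrices. The sketch below writes the expansion with general term $(-a)^j c_j A_{n-1-j}$ for $j = 0, \ldots, n-1$, which is the pattern produced by repeatedly expanding along the first column as in the proof. The helper names `det`, `hessenberg`, `A` and `expansion`, and the sample entries, are assumptions made for this illustration.

```python
def det(m):
    """Exact integer determinant by cofactor expansion along the first column."""
    if not m:
        return 1  # empty determinant, matching the convention A_0 = 1
    total = 0
    for i, row in enumerate(m):
        if row[0] != 0:
            minor = [r[1:] for k, r in enumerate(m) if k != i]
            total += (-1) ** i * row[0] * det(minor)
    return total


def hessenberg(first_row, a, b):
    """Given first row; below it, a on the subdiagonal and b_0, b_1, ... to its right."""
    n = len(first_row)
    if n == 0:
        return []
    rows = [list(first_row)]
    for i in range(1, n):
        rows.append([a if j == i - 1 else (b[j - i] if j >= i else 0)
                     for j in range(n)])
    return rows


def A(n, a, b):
    """The determinant A_n of the proposition (first row b_0, ..., b_{n-1})."""
    return det(hessenberg(b[:n], a, b))


def expansion(c, a, b):
    """Sum over j of (-a)^j * c_j * A_{n-1-j}, where n = len(c)."""
    n = len(c)
    return sum((-a) ** j * c[j] * A(n - 1 - j, a, b) for j in range(n))


# Sample entries (arbitrary integers chosen for this illustration).
a, b, c = 6, [3, 1, 4, 1, 5], [2, 7, 1, 8, 2]
for n in range(1, 6):
    assert det(hessenberg(c[:n], a, b)) == expansion(c[:n], a, b)

# With b_i = 0 for i >= 2 the recurrence has order 2, as for the numbers G_n.
b2 = [3, 5, 0, 0, 0, 0]
for n in range(2, 7):
    assert A(n, 2, b2) == 3 * A(n - 1, 2, b2) - 2 * 5 * A(n - 2, 2, b2)
```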
Note that the recurrence relation of the determinants $G_n$ follows from this proposition by putting $c_0 = b_0$, $c_1 = b_1$ and $c_i = b_i = 0$ for $i \geq 2$.
Let $G_n$ be a non-degenerate (the coefficients of the recurrence relation are non-zero) recurrence sequence of order 2. Shorey and Stewart [31], and independently Pethö [26], proved that there are effectively computable constants $C_1, C_2$, depending only on the coefficients and starting values of the sequence $G_n$, such that if $n, y, q$ are integers with $q > 1$ satisfying
\[
G_n = y^q,
\]
then $\max(|y|, n, q) < C_1$ if $|y| > 1$ and $n < C_2$ if $|y| \leq 1$. This result is obtained by using the method of linear forms in logarithms. Generally, the constants are very large. For example, if $G_n$ is the Fibonacci sequence, then one finds that $q < 5.1 \times 10^{17}$. See [27, 28] for a survey by Pethö on diophantine equations involving linear recurrence sequences. As an immediate consequence of this, we find:
Theorem 3.3. Let $a, b$ be non-zero integers. There is an effectively computable constant $C$, depending only on $a$ and $b$, such that if $p$ is a prime and $z$ an integer such that
\[
bN_p(a\zeta_p^2 + b\zeta_p - a) = z^p,
\]
then $p < C$.
Proof. If $p$ is an odd prime, then
\[
\prod_{i=0}^{p-1} (a\zeta_p^{2i} + b\zeta_p^{i} + c) = (a + b + c)N_p(a\zeta_p^2 + b\zeta_p + c).
\]
So if $a = -c$, then $a^p + c^p = 0$ for odd $p$, and Theorem 3.1 shows that we have
\[
bN_p(a\zeta_p^2 + b\zeta_p - a) = -H_p.
\]
Here $H_n$ is the second order linear recurrence sequence given by the initial values $H_0 = 2$, $H_1 = -b$ and the recurrence relation
\[
H_{n+2} = -bH_{n+1} + a^2H_n.
\]
So, taking $C = \max(C_1, C_2)$, we conclude that if $bN_p(a\zeta_p^2 + b\zeta_p - a) = z^p$, then we must have $p < C$.
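The reduction used in this proof can be illustrated with a small search. The sketch below checks the identity $bN_p(a\zeta_p^2 + b\zeta_p - a) = -H_p$ in floating point for one sample pair $(a, b)$, and then lists the odd primes below a small bound for which $-H_p$ is a perfect $p$-th power. All function names and the sample values are assumptions made for this illustration, and the bound is far below the constant $C$ of the theorem, so an empty result proves nothing beyond the range searched.

```python
import cmath


def H(a, b, n):
    """H_n with H_0 = 2, H_1 = -b, H_{n+2} = -b*H_{n+1} + a^2*H_n (the case c = -a)."""
    h0, h1 = 2, -b
    for _ in range(n):
        h0, h1 = h1, -b * h1 + a * a * h0
    return h0


def b_times_norm(a, b, p):
    """Floating-point product over i = 0..p-1 of a*z^(2i) + b*z^i - a.

    The factor for i = 0 equals b, so the product is b * N_p(a*z^2 + b*z - a).
    """
    prod = 1
    for i in range(p):
        z = cmath.exp(2j * cmath.pi * i / p)
        prod *= a * z * z + b * z - a
    return prod


def is_pth_power(m, p):
    """True if m = z**p for an integer z; p is odd, so the sign of m is irrelevant."""
    m = abs(m)
    lo, hi = 0, 1
    while hi**p < m:
        hi *= 2
    while lo < hi:  # binary search for the least lo with lo**p >= m
        mid = (lo + hi) // 2
        if mid**p < m:
            lo = mid + 1
        else:
            hi = mid
    return lo**p == m


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))


def small_prime_solutions(a, b, bound):
    """Odd primes p < bound for which -H_p is a perfect p-th power."""
    return [p for p in range(3, bound)
            if is_prime(p) and is_pth_power(-H(a, b, p), p)]


# Sample pair (a, b) = (2, 3), an arbitrary choice for this illustration.
assert abs(b_times_norm(2, 3, 7) + H(2, 3, 7)) < 1e-6
candidates = small_prime_solutions(2, 3, 40)
```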
For fixed $a$ and $b$, if we want to solve the equation $bN_p(a\zeta_p^2 + b\zeta_p - a) = z^p$, this theorem tells us we only need to check a finite number of primes $p$. However, in practice the constant $C$ is too large to check all primes below $C$ and more clever methods are necessary. In the breakthrough article [4], Bugeaud, Mignotte and Siksek show that the perfect powers in the sequence $F_n$ of Fibonacci numbers are given by $F_0 = 0$, $F_1 = F_2 = 1$, $F_6 = 8$ and $F_{12} = 144$. Furthermore, they show that the only perfect powers in the sequence $(L_n)_{n \in \mathbb{N}}$ of Lucas numbers are given by $L_1 = 1$ and $L_3 = 4$. They use a sharper version of the linear forms in logarithms method together with the modular method, which originates in Wiles' work on Fermat's last theorem. These two methods combined reduce the bounds on $q$ and $n$ far enough for the problem to be resolved computationally. Note that, because $N_p(\zeta_p^2 - \zeta_p - 1) = L_p$ by Theorem 3.1, we find as an immediate corollary that there are no primes $p$ such that
\[
N_p(\zeta_p^2 - \zeta_p - 1) = z^p.
\]
It is interesting that the modular method that is used in the resolution of the Fermat equation $(x + y)N_p(x\zeta_p + y) = z^p$, and which is not cyclotomic in nature, can also be used to solve the equation $N_p(\zeta_p^2 - \zeta_p - 1) = z^p$.
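For small indices, both the identity $N_p(\zeta_p^2 - \zeta_p - 1) = L_p$ and the lists of perfect powers quoted above can be reproduced by brute force. The sketch below does this up to index 100; the function names and the search limit are assumptions made for this illustration, and the search is of course no substitute for the result of [4], which covers all indices.

```python
import cmath


def fibonacci(n):
    f0, f1 = 0, 1
    for _ in range(n):
        f0, f1 = f1, f0 + f1
    return f0


def lucas(n):
    l0, l1 = 2, 1
    for _ in range(n):
        l0, l1 = l1, l0 + l1
    return l0


def norm(p):
    """N_p(z^2 - z - 1): product over i = 1..p-1, rounded to the nearest integer."""
    prod = 1
    for i in range(1, p):
        z = cmath.exp(2j * cmath.pi * i / p)
        prod *= z * z - z - 1
    return round(prod.real)


def iroot(m, q):
    """Largest integer r >= 0 with r**q <= m, for m >= 0."""
    lo, hi = 0, 1
    while hi**q <= m:
        hi *= 2
    while hi - lo > 1:  # invariant: lo**q <= m < hi**q
        mid = (lo + hi) // 2
        if mid**q <= m:
            lo = mid
        else:
            hi = mid
    return lo


def is_perfect_power(m):
    """True if m = y**q for integers y >= 0 and q > 1 (so 0 and 1 count)."""
    if m in (0, 1):
        return True
    return any(iroot(m, q) ** q == m for q in range(2, m.bit_length() + 1))


def perfect_power_indices(sequence, limit):
    return [n for n in range(limit + 1) if is_perfect_power(sequence(n))]


fib_indices = perfect_power_indices(fibonacci, 100)
lucas_indices = perfect_power_indices(lucas, 100)
assert all(norm(p) == lucas(p) for p in (3, 5, 7, 11, 13))
```

Within this range, `fib_indices` and `lucas_indices` recover exactly the indices named in the text.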
3.2 a general recurrence theorem
Now we will show that Theorem 3.1 does not just hold for cyclotomic integers of the form $a\zeta^2 + b\zeta + c$, but can in fact be generalized to any cyclotomic integer. For $j = 1, \ldots, k$, let $e_j(X_1, \ldots, X_k)$ be the elementary symmetric polynomial of degree $j$ in $k$ variables. So $e_1(X_1, \ldots, X_k) = X_1 + X_2 + \cdots + X_k$, $e_2(X_1, \ldots, X_k) = \sum_{1 \leq r < s \leq k} X_rX_s$. Below we also use $e_0 = 1$.
Theorem 3.4. Let $f(X) = a_kX^k + a_{k-1}X^{k-1} + \cdots + a_1X + a_0 \in \mathbb{C}[X]$ with $a_k \neq 0$ and let $\omega_1, \ldots, \omega_k$ be its roots. For $j = 0, \ldots, k$ and $n \in \mathbb{N}$ we define
\[
A_n^{(j)} := (-1)^{(n+1)k-j} a_k^n e_j(\omega_1^n, \ldots, \omega_k^n).
\]
For each $n \geq 3$ we have
\[
\prod_{i=0}^{n-1} f(\zeta_n^{i}) = \sum_{j=0}^{k} A_n^{(j)}.
\]
For each $j$, the sequence $(A_n^{(j)})_{n \in \mathbb{N}}$ is given by a recurrence relation
\[
A^{(j)}_{n+\binom{k}{j}} = \sum_{s=0}^{\binom{k}{j}-1} g_s^{(j)} A^{(j)}_{n+s}
\]
with $g_0^{(j)}, \ldots, g^{(j)}_{\binom{k}{j}-1} \in \mathbb{Z}[a_0, \ldots, a_k]$ and initial values $A_n^{(j)} \in \mathbb{Z}[a_0, \ldots, a_k]$.
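The first identity of Theorem 3.4 can be checked numerically before reading the proof. In the sketch below the roots $\omega_1, \ldots, \omega_k$ and the leading coefficient $a_k$ are chosen first and $f$ is built from them, which is an assumption made for convenience so that no root finding is needed. The names `elementary_symmetric`, `product_side` and `sum_side` are hypothetical, and only the product formula is tested, not the recurrence relations.

```python
import cmath
from math import prod


def elementary_symmetric(values):
    """Return [e_0, e_1, ..., e_k] as the coefficients of prod (1 + v*t)."""
    coeffs = [1]
    for v in values:
        coeffs = [x + v * y for x, y in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def product_side(lead, roots, n):
    """Product over i = 0..n-1 of f(z^i), with f(X) = lead * prod (X - w)."""
    total = 1
    for i in range(n):
        z = cmath.exp(2j * cmath.pi * i / n)
        total *= lead * prod(z - w for w in roots)
    return total


def sum_side(lead, roots, n):
    """Sum over j of A_n^(j) = (-1)^((n+1)k - j) * lead^n * e_j(w_1^n, ..., w_k^n)."""
    k = len(roots)
    e = elementary_symmetric([w**n for w in roots])
    return sum((-1) ** ((n + 1) * k - j) * lead**n * e[j] for j in range(k + 1))


# Sample data (arbitrary choices for this illustration): k = 3, a_k = 3.
lead, roots = 3, [2, -1.5, 0.5 + 1j]
for n in range(3, 8):
    lhs, rhs = product_side(lead, roots, n), sum_side(lead, roots, n)
    assert abs(lhs - rhs) <= 1e-9 * max(1, abs(rhs))
```

For $k = 2$ the three terms of the sum are $a^n$, $-H_n$ and $c^n$, which is the shape of Theorem 3.1.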
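Proof. Let $f(X) = a_k(X - \omega_1)(X - \omega_2) \cdots (X - \omega_k)$ for certain $\omega_1, \ldots, \omega_k \in \mathbb{C}$. Then we have:
\begin{align*}
\frac{a_{k-1}}{a_k} &= -e_1(\omega_1, \ldots, \omega_k) \\
\frac{a_{k-2}}{a_k} &= e_2(\omega_1, \ldots, \omega_k) \\
&\;\;\vdots \\
\frac{a_0}{a_k} &= (-1)^k e_k(\omega_1, \ldots, \omega_k)
\end{align*}
Moreover we see:
\begin{align*}
\prod_{i=0}^{n-1} f(\zeta_n^{i}) &= a_k^n \prod_{i=0}^{n-1} (\zeta_n^{i} - \omega_1) \cdots (\zeta_n^{i} - \omega_k) \\
&= (-1)^{nk} a_k^n (\omega_1^n - 1) \cdots (\omega_k^n - 1) \\
&= (-1)^{nk} a_k^n \left( \sum_{j=0}^{k} (-1)^{k-j} e_j(\omega_1^n, \ldots, \omega_k^n) \right) \\
&= \sum_{j=0}^{k} (-1)^{(n+1)k-j} a_k^n e_j(\omega_1^n, \ldots, \omega_k^n)
\end{align*}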