On Some Norm Equations over Cyclotomic Fields

Peter Lombaers Tese de Doutoramento apresentada à Faculdade de Ciências da Universidade do Porto Matemática 2019 Peter Lombaers: On Some Norm Equations Over Cyclotomic Fields , July 2019, Porto supervisor: António Machiavelo ABSTRACT

For an odd prime p, the equation xp + yp = zp can be rewritten as p (x + y)Np(x + yζ p) = z , where ζ p is a primitive p-th and Np is the norm map of the field Q(ζ p). That this norm equation does not have non-trivial integer solutions is known since ’ proof of Fermat’s last theorem. However, very little is known about two natural generalisations of this equation given by the equations p k Np(x + yζ p) = z , and (a0 + a1 + ··· + ak)Np(a0 + a1ζ p + ··· + akζ p) = zp, for some integer 1 ≤ k ≤ p − 3. In this thesis, we look at the cyclotomic methods that were used to attack the equation xp + yp = zp, and we apply them to those two gen- eralisations. Naturally, this leads to a of new observations and problems. First of all, we fix the coefficients a0,..., ak and investigate their behaviour as p goes to infinity. This leads to an upper bound k on p in certain cases. Next, we show that Np(a0 + ··· + akζ p) can be expressed in terms of linear recurrence sequences depending only on the coefficients ai. This allows us to show that, for fixed a, b, there 2 p are only finitely many primes p such that Np(aζ p + bζ p − a) = z has non-trivial integer solutions. We review Kummer’s method of attack to the equation Np(x + p yζ p) = z by the use of a logarithmic derivative, and we show that the Iwasawa p-adic logarithm can be used to obtain the same results. We review the theory of unramified extensions of degree p of Q(ζ p), and we see how this leads to a proof of Herbrand’s theorem. Combining knowledge of the units of Z[ζ p] with computational results, we show p 8 that if Np(x + yζ p) = z , then max(|x|, |y|) > 5 × 10 , for regular primes p ≥ 5. Finally, we use the p-adic logarithm to give a unified way of proving congruence identities related to Wolstenholme’s the- orem. With the same method, we can also prove a number of other congruences, which at first sight seem quite unrelated to each other.

iii RESUMO

Para um primo p ímpar, a equação xp + yp = zp pode ser reescrita p como (x + y)Np(x + yζ p) = z , em que ζ p é uma raiz primitiva p- ésima da unidade e Np é a norma do corpo Q(ζ p). Que esta equação não tem soluções inteiras não-triviais é sabido desde a prova de Andrew Wiles do último teorema de Fermat. No entanto, pouco se sabe acerca das duas generalizações seguintes dessa equação: Np(x + p k p yζ p) = z e (a0 + a1 + ··· + ak)Np(a0 + a1ζ p + ··· + akζ p) = z , para algum inteiro 1 ≤ k ≤ p − 3. Nesta tese, olhamos para os métodos ciclotómicos que foram usados para atacar a equação xp + yp = zp, e aplicamo-los a essas duas gener- alizações. Naturalmente, isso conduz a uma série de novas observações e problemas. Em primeiro lugar, fixamos os coeficientes a0,..., ak e investigamos o seu comportamento quando p tende para infinito. Isto conduz a um limite superior para p, em certos casos. Em seguida, k mostramos que a norma Np(a0 + ··· + akζ p) pode ser expressa em termos de sequências de recorrência linear que dependem apenas dos coeficientes ai. Isso permite-nos mostrar que, para a, b fixos, há apenas 2 p um número finito de primos p tais que Np(aζ p + bζ p − a) = z tem soluções inteiras não-triviais. Revemos o método de Kummer para atacar a equação Np(x + p yζ p) = z usando derivadas logarítmicas e mostramos que o log- aritmo p-ádico de Iwasawa pode ser usado para alcançar os mesmos resultados. Revemos a teoria das extensões não-ramificadas de grau p de Q(ζ p), e vemos como isso conduz a uma demonstração do teo- rema de Herbrand. Combinando conhecimento das unidades de Z[ζ p] p com resultados computacionais, mostramos que se Np(x + yζ p) = z , então max(|x|, |y|) > 5 × 108, para primos regulares p ≥ 5. Final- mente, usamos o logaritmo p-ádico para dar uma maneira unificada de demonstrar identidades envolvendo congruências relacionadas com o Teorema de Wolstenholme. Com o mesmo método podemos também demonstrar várias outras congruências que, à primeira vista, parecem não estar relacionadas entre si.

iv ACKNOWLEDGMENTS

This thesis would not have been possible without the help and support of a number of people. First of all I would like to thank my supervisor. António, it’s been a great pleasure to work with you. You’ve always been friendly, enthousiastic and helpful, and I feel lucky to have had you as my supervisor. Next, I would like to thank my friends here in Porto and the friends from the Netherlands that came to visit. You made my time here a great experience, whether it was to enjoy the good things, or complain about the bad things. Special mention should go to Atefeh, without you I would have been taking the stairs for the past four years, and to Nikos, you were always there, listening to my explanations of what I was working on, so that I could understand it better myself. I also want to thank my family for their love and support while I was far away from home. Finally, Saul, without you I would not survived the final year, but you turned it into a great year.

I would like to acknowledge the financial support by the FCT (Fundação para a Ciência e a Tecnologia) through the PhD grant PD/BD/128063/2016, and by CMUP (Centro Matemática - Universi- dade do Porto) (UID/MAT/00144/2013), which is funded by FCT (Por- tugal) with national (MEC) and European structural funds (FEDER), under the partnership agreement PT2020.

v

CONTENTS

1 introduction1 1.1 The main problem 1 1.2 Kummer’s approach 2 1.3 Description of the chapters 4 1.4 Notation 6 2 limits of N-th roots9 2.1 The Mahler measure 9 2.2 An upper bound on p 11 3 recurrence sequences 15 3.1 Norms in terms of a recurrence sequence 15 3.2 A general recurrence theorem 21 4 from norms to singular integers 27 4.1 Singular integers 27 4.2 Self-prime integers 28 5 singular integers 35 5.1 Idempotents and the Herbrand-Ribet theorem 35 5.2 Kummer’s logarithmic derivative 37 5.3 The p-adic logarithm 40 6 kummer extensions 47 6.1 Kummer extensions 47 6.2 Ramification 49 6.3 Units in cyclotomic fields 52 6.4 Unramified extensions 55 7 regular primes 61 7.1 Theoretical results 61 7.2 Computational results 63 8 wolstenholme’s theorem 67 8.1 Logarithms and binomial coefficients 68 8.2 Logarithms and cyclotomic integers 73

bibliography 81

vii

INTRODUCTION 1

1.1 the main problem

In this thesis we look at two generalisations of Fermat’s last theorem. They arise naturally when we approach Fermat’s last theorem via the classical, cyclotomic method. Let us first recall the statement of Fermat’s last theorem:

Theorem 1.1 (FLT). For each integer n ≥ 3 there are no non-zero integers x, y, z such that

xn + yn = zn.(1.1)

The first succesful proof was given by Andrew Wiles, using a con- nection with elliptic curves. Before that, the traditional method to attack FLT was via cyclotomic fields. We first note that we may assume in Theorem 1.1, without loss of generality, that n = 4 or n = p is an odd prime. Fermat himself already wrote a proof for the case n = 4 (see chapter 1 of [29] for a good account of the early history of FLT). Therefore we can restrict our attention to the case n = p is an odd prime. p p Since p is odd, we know that x + y is divisible by x + y. Let ζ p be a primitive p-th root of unity. Then we have the equality

p−1 p p i x + y = ∏(x + yζ p). i=0 = Let us denote by Np NQ(ζ p)/Q the norm function of the p-th cy- p−1 i clotomic field Q(ζ p), hence we have ∏i=1 (x + yζ p) = Np(x + yζ p). Proving FLT is thus equivalent to showing that there are no non-zero integers x, y, z such that

p (x + y)Np(x + yζ p) = z .(1.2)

The first generalisation we shall consider is a slightly less general version of a conjecture by George Gras in [12]. He referred to it as ‘Strong Fermat’s last theorem’.

Conjecture 1.2 (SFLT). For each prime p ≥ 5, the only solutions in pairwise coprime integers x, y, z to the equation

p Np(x + yζ p) = z (1.3) are given by z = 1 and x + yζ p ∈ {±1, ±ζ p, ±(1 + ζ p)}.

1 2 introduction

Note that in [12], Conjecture 1.5 states that also Np(x + yζ p) = pzp only has the trivial solutions. The second generalisation we will consider is:

Problem 1.3. Let p ≥ 5 be a prime and let 1 ≤ k ≤ p − 3 be an integer. What are the integer solutions to the following two equations?

k p Np(a0 + a1ζ p + ··· + akζ p) = z (1.4) k p (a0 + a1 + ··· + ak)Np(a0 + a1ζ p + ··· + akζ p) = z (1.5)

For k = 1, equation (1.5) gives us back FLT and equation (1.4), with coprime a0, a1, is precisely Conjecture 1.2. If k = p − 2 then p−2 a0 + a1ζ p + ··· + ap−2ζ p is a generic element of Z[ζ p], and thus (1.5) p has many solutions. For example, Np(uγ ) will be a p-th power of an integer if u is a of Z[ζ p] and γ is any element of Z[ζ p]. This is also the reason why we restrict to p ≥ 5. Though Conjecture 1.2 got a little attention on its own (see [2, 12, 13]), almost all results come from theorems connected to FLT which can be applied to Conjecture 1.2 and Problem 1.3 as well. It is interesting to compare this situation with Catalan’s conjecture, which states that the only positive integer solution to the equation

xn − 1 = yq

with n ≥ 2 is given by 32 − 23 = 1. This has been proved completely by cyclotomic methods by Preda Mih˘ailescu.The corresponding ‘strong’ version is given by the Nagell-Ljunggren equation

xn − 1 = yq. x − 1 Contrary to SFLT, the Nagell-Ljunggren equation has a rich history of itself (see [3] for a survey). In this thesis we shall look at the approach Ernest Kummer took to FLT and improvements of that approach, and we shall see how this translates to Problem 1.3. Naturally this leads to a number of fresh problems which we shall look at. To make this more concrete, we give an overview of the different steps of Kummer’s approach to FLT, and indicate how these steps relate to the chapters of this thesis.

1.2 kummer’s approach

Suppose x, y, z are non-zero integers such that xn + yn = zn. We may assume without loss of generality that n = p is an odd prime. The analog of Conjecture 1.2 for n not a prime would be to consider the equation n−1 i n ∏(x + yζn) = z i=1 1.2 kummer’s approach3 and we can define something similar for Problem 1.3. In chapter 2 we look at a way to give a lower bound for max(|x|, |y|). This method also works for n not a prime. In chapter 3 we look at a way to write

n−1 i ik ∏(a0 + a1ζn + ··· + akζn ) i=0 in terms of recurrence sequences. This method also does not require that n is a . We may assume without loss of generality that x, y, z are pairwise coprime. For FLT, we can assume this without loss of generality, but for the equation (1.3) of SFLT this is not the case. In fact if a, b are any p two integers, then we can write Np(a + bζ p) = cd for some integers p p c, d (with possibly d = 1). Then we see that Np(ac + bcζ p) = c d is a solution to equation (1.3). So there are infinitely many solutions to Conjecture 1.2 if we do not assume that x, y, z are coprime. We will not consider the case where x, y, z have a common divisor any further. We separate the proof into two cases. In the first case p - xyz, and in the second case p | xyz. These cases are traditionally referred to as Case 1 and Case 2, and we will write FLT1 and FLT2 accordingly. We can make a similar case distinction for SFLT, and we will refer to these as SFLT1 and SFLT2. Assume we are in Case 1, so p - xyz. Because x and y are coprime, i j we find that x + yζ p is coprime to x + yζ p if i 6≡ j mod p. Hence we p have (x + yζ p)Z[ζ p] = I for some ideal I of Z[ζ p]. The fields Q(ζ p) do not have unique factorisation into irreducible elements for p ≥ 23. To get around this problem, Kummer used ideals and the class group. The fact that for p - xyz and x, y coprime we have

p p Np(x + yζ p) = z =⇒ (x + yζ p)Z[ζ p] = I (1.6) does not carry over to the situation with more variables of Problem 1.3. In chapter 4 we try to find conditions under which (1.6) holds for k elements of the form a0 + a1ζ p + ··· + akζ p. Apply an annihilator η ∈ Z[Gal(Q(ζ p)/Q)] of the class group of η p Q(ζ p) to get (x + yζ p) = uγ for some unit u ∈ Z[ζ p] and some γ ∈ Z[ζ p]. Kummer originally proved FLT for regular primes, so prime p for which p does not divide the class number of Q(ζ p). In this case, the fact that I p is a principal ideal implies that I must be principal itself. If the prime p is irregular, then I might not be principal, but certain products of Galois conjugates of I will be principal. The two main theorems in this direction are Stickelberger’s theorem and the Herbrand-Ribet theorem. Stickelberger’s theorem was proved for cyclotomic fields already by Kummer. Both of these theorems give annihilators of the class group of Q(ζ p). In chapter 6 we give a proof of the easy direction of the Herbrand-Ribet theorem using class field theory and an explicit construction of an extension of Q(ζ p). 4 introduction

Take the logarithmic derivative and look modulo p. The idea in this step is that the logarithmic derivative of a p-th power should be p congruent to 0 modulo p. So if x + yζ p = uγ , then after taking the logarithmic derivative, we only care about the unit u. We can then use the knowledge we have of the units of Z[ζ p] to derive a congruence condition modulo p that x, y and z must satisfy. For example, Kummer could conclude that if (x, y, z) is a solution to FLT1, then they must satisfy:

xy(x − y) B − ≡ 0 mod p (1.7) p 3 (x + y)3

Here Bp−3 is the (p − 3)-rd . In chapter 5 we will show that instead of the logarithmic derivative, we can also use the p-adic logarithm. Although not obvious at the start, this will basically give the same information as we can obtain by using the logarithmic derivative. This method is not restricted to FLT and SFLT, but it applies to general cyclotomic elements, like in Problem 1.3. In chapter 8 we look closer at the condition Bp−3 ≡ 0 mod p. Primes that satisfy this condition are called Wolstenholme primes. We use the p-adic logarithm to get characterisation of these primes. Use the symmetry of xp + yp = zp to conclude that p | xyz. If we have xp + yp = zp, then we also get xp + (−z)p = (−y)p and yp + (−z)p = (−x)p. From equation (1.7) we concluded that p | x − y, so by using the symmetry of the equation we find that p | x + z and p | y + z. So then 0 ≡ xp + yp − zp ≡ 3x mod p. So p | xyz after and we can go to Case 2. This kind of symmetry is only something we can use in the case of FLT. For SFLT we have nothing similar. Also for the more general Problem 1.3 we do not have this symmetry. In Case 2, from a solution to FLT2, construct a smaller solution and use a descent argument to get a contradiction. Case 2 of FLT is considerably more difficult than Case 1. The approaches to Case 2 use a descent, and all rely on the symmetry of xp + yp = zp. Therefore they do not carry over to SFLT or to Problem 1.3. Though we will not be able to proof SFLT for a specific prime, we will be able to give a lower bound on max(|x|, |y|).

1.3 description of the chapters

In chapter 2, we fix α ∈ Q(ζ p), and look at the behaviour of q p lim Np(α). p→∞

We see that if α = f (ζ p) for some polynomial f ∈ Z[X], then this limit converges to the Mahler measure of f . We then use knowledge of this limit, in Theorem 2.5, to obtain the lower bound max(|x|, |y|) ≥ 500, p for x, y that satisfy Np(x + yζ p) = z . 1.3 description of the chapters5

In chapter 3, we show that we can write norms in terms of sums of linear recurrence sequences. More specifically, in Theorem 3.4, we see that if f ∈ Z[X] is of degree k, then we can write

n−1 k i (j) ∏ f (ζn) = ∑ An , i=0 j=0

(j) where for each j, the sequence (An )n∈N is a linear recurrence se- quence, depending only on the coefficients of f . There are many results on perfect powers in linear recurrence sequences. We use these to show that for fixed a, b, there are only finitely many primes p, such 2 p that Np(aζ p + bζ p − a) = z has a solution. In chapter 4, we try to find out when the following implication holds:

p p Np(α) = z =⇒ αZ[ζ p] = I for some ideal I of Z[ζ p].

This question does not arise in the case of FLT, but only arises for Problem 1.3, so there is not much previously known. We give some conditions on α that ensure the implication is true. We also try to find out when α is coprime to its Galois conjugates. For x + yζ p this occurs precisely when x and y are coprime, and p - x + y, but for other α the situation is more difficult. In chapter 5, we take a look at Kummer’s idea to study the equation p x + yζ p = uγ , using the logarithmic derivative. We see, in detail, how this leads to a proof of FLT1 for primes which satisfy p - Bp−3. Next we see that, instead of the logarithmic derivative, we can also use the p-adic logarithm to study these kind of equations. However, in Theorem 5.9, we show that the information we obtain with the p-adic logarithm is, curiously, the same as the information we obtain with the logarithmic derivative. In chapter 6, we give a review of the theory of Kummer extensions. We look at the ramification of primes in these extensions. Then we look at the unit group of Z[ζ p], and use these units to construct unramified extensions of Q(ζ p) of degree p. This leads to a proof Herbrand’s direction of the Herbrand-Ribet theorem. All these results are well- known, but we hope that presentation of this chapter is concrete, and ties in well with the logarithmic derivative of chapter 5. In chapter 7, we look at SFLT for regular primes. If p is regular, we have extra knowledge of the units, and this allows us to show that xy(x − y) must be divisible by a number of small primes. In Theorem p 7.8 we use this to show that if p ≥ 5 is regular and Np(x + yζ p) = z , then max(|x|, |y|) > 5 × 108. This method does not work for Problem 1.3. In chapter 8, we look at Wolstenholme’s theorem and generalisations of it. Wolstenholme’s theorem states that, for prime p ≥ 5, we have 2p−1 ≡ 3 ( p−1 ) 1 mod p . A prime p that satisfies this equation modulo 6 introduction

p4 is called a Wolstenholme prime. This is equivalent to the fact that p divides Bp−3, and hence the link with the previous chapters. There are a number of results that generalise Wolstenholme’s theorem and characterise Wolstenholme primes modulo higher powers of p. We use the the p-adic logarithm to give easier proofs of some of these generalisations, and to extend previously known results. For example, we give a characterisation of Wolstenholme primes in terms of harmonic sums modulo p12. In the second section, we apply the p-adic logarithm to norms of certain cyclotomic integers. This gives us p-adic equations for the trace of these integers. Looking modulo specific powers of p, we obtain a number of previously known results, and a number of new results. For example, in Proposition 8.17, we give another proof of the identity

− p 1 12k ∑ ≡ 0 mod p2. k=1 k k

Using the relation with recurrence sequences of chapter 3, we prove the identity

p−1   2 2 1 2k (1 − Lp)(Lp − 3) ∑ (−1)k−1 ≡ mod p2, k=1 k k 2p

where (Ln)n∈N is the sequence of Lucas , given by L0 = 2, L1 = 1 and Ln+2 = Ln+1 + Ln. The most interesting part about this is that these results can be found using the same method, where previously each identity required different methods to proof them.

1.4 notation

We use the following notation in this thesis. For us, p will always denote an odd prime, except when mentioned otherwise. For n ≥ 3 an integer, we let ζn be a primitive n-th root of unity, so it is a root of the n-th Φn(X). In most chapters, n = p is an odd prime. If the prime p is fixed throughout a chapter, we will drop the subscript p and just write ζ = ζ p and Φ(X) = Φp(X). We will work with the cyclotomic fields Q(ζn). In chapters 4, 5, 7 and section 3 and 4 of chapter 6, we will fix n = p throughout the chapter, then we will denote K = Q(ζ p). We assume that the reader has had a first course in , similar to the first chapter of [16]. The of integers of Q(ζn) is Z[ζn] and if we use the notation K = Q(ζ p), then we will write OK for the of K. We denote the ∗ elements of the Gal(Q(ζn)/Q) by σa, for a ∈ (Z/nZ) , a where σa is determined by σa(ζn) = ζn. If p is fixed and K = Q(ζ p), then we will also write Gal(K/Q) = G. The norm of K is given by Np(α) = ∏σ∈G σ(α) and the trace is given by Trp(α) = ∑σ inG σ(α). If it is obvious which prime p we are talking about, we will drop the 1.4 notation7 subscript p and just write N and Tr. For us, π will always denote p−1 the element 1 − ζ of K, so that we have π OK = pOK as ideals. We will denote (α) for the principal ideal generated by α if it is clear in what ring we are working. If p is a prime ideal of OK, then n ordp(α) is the maximal n ∈ Z such that p divides α. So, for example ord(π)(p) = p − 1. The field Qp is the field of p-adic numbers. The p-adic logarithm will always be denoted by logp and the ordinary logarithm will be denote by log.

LIMITSOF N -THROOTS 2

Let α ∈ Q(ζ p) and write α = f (ζ p) for some polynomial f ∈ Z[X]. Then we have p−1 i Np(α) = ∏ f (ζ p), i=1 p and thus Np(α) = z for some integer z ∈ Z if and only if v up−1 up i t∏ f (ζ p) ∈ Z. i=1

In this chapter we shall fix f and investigate the behaviour of the above expression as p grows. In the first part we see how this relates to the Mahler measure of a polynomial. We give a concrete formula for the limit when the polynomial f has degree 2. In the second part of the chapter we use this to give an upper bound on p in terms of a and b (or equivalently a lower bound on a and b in terms of p) when p there is a solution to Np(a + bζ p) = c . Concretely, Theorem 2.4 says p that if Np(a + bζ p) = c , then max(|a|, |b|) > 500.

2.1 the mahler measure

n−1 i Let f be a polynomial with integer coefficients. Note that ∏i=1 f (ζn) makes sense for any positive integer n, but will only be equal to the norm of an element of a cyclotomic field if n is an odd prime. Also i n−i note that, since f (ζn) is the complex conjugate of f (ζn ), the number n−1 i n−1 i n−1 i ∏i=1 f (ζn) is always positive, so we have ∏i=1 f (ζn) = ∏i=1 | f (ζn)|. Here |x| is the absolute value of x as an element of C.

Theorem 2.1. Suppose f (X) ∈ Z[X] factors as f (X) = a(X − α1)(X − α2) ... (X − αk) over the complex numbers, and such that a, α1,..., αk 6= 0. Suppose |αi| 6= 1 for all i = 1, . . . , k. Then we have v un−1 k un i lim t f (ζn) = |a| |αi| = |a| max(1, |αi|). n→∞ ∏ ∏ ∏ i=1 |αi|>1 i=1

9 10 limitsof N-th roots

Proof. First of all, we see

n−1 n−1 i i ∏ f (ζn) = ∏ | f (ζn)| i=1 i=1 k n−1 n−1 i = |a| ∏ ∏ |ζn − αj| j=1 i=1 k n 1 − αj = |a|n−1 ∏ − j=1 1 αj |a|n k = |1 − αn|. | ( )| ∏ j f 1 j=1

Now take the n-th root and let n go to infinity. We get s n n |a| 1 lim = |a| lim p = |a|. n→∞ | f (1)| n→∞ n | f (1)|

Also, we have  q 1 if |αj| < 1 lim n |1 − αn| = n→∞ j |αj| if |αj| > 1.

Combining this we end up with v un−1 un i lim t f (ζn) = |a| |αj|. n→∞ ∏ ∏ i=1 |αj|>1

The condition |αi| 6= 1 is there to ensure√ that this limit actually con- n n verges. If |α| = 1 and α 6= 1, then limn→∞ 1 − α does not converge. Note that, even for f (X) ∈ Z[X], the condition ‘|α| 6= 1 for any root of f ’ is not equivalent to the condition ‘ f (ζn) 6= 0 for all n ∈ N’. For example, the polynomial x4 − 2x3 − 2x + 1 has two roots with absolute value 1 and two roots with absolute value different from 1.

Definition 2.2. For f (X) = a(x − α1)(x − α2) ... (x − αk) ∈ C[X] define the Mahler measure M( f ) of f as the number

k M( f ) := |a| ∏ max(1, |αj|). j=1

In the article [21], D. H. Lehmer considered a method to find large primes. Suppose f (X) is a monic polynomial with integer coefficients and with roots α1,..., αk. Then, for each positive integer n, the num- k n bers ∆n( f ) = ∏j=1(αj − 1) will be integers. The prime factors of these numbers should satisfy certain congruence conditions, which 2.2 an upper bound on p 11 might make it easier to find large primes. For example, if we take f (X) = X − 2 then we are considering the prime factors of the number 2n − 1, so this is related to Mersenne primes. This lead Lehmer to consider the behaviour of these integers ∆n as n goes to infinity. If f has no roots which are a root of unity, then

∆ + lim n 1 = M( f ). n→∞ ∆n For his purposes it was important that M( f ) is as small as possible. A theorem of Kronecker says that M( f ) = 1 if and only if f is the product of cyclotomic polynomials. Excluding these, the polynomial with smallest Mahler measure Lehmer could find was the polynomial

f (X) = x10 + x9 − x7 − x6 − x4 − x3 + x + 1 which satisfies M( f ) ≈ 1.176. Lehmer’s conjectures states that there is a constant µ > 1 such that M( f ) ≥ µ for all f which are not products of cyclotomic polynomials. This is still an open problem. A non-cyclotomic polynomial with smaller Mahler measure than then one above has not been found. Later, Kurt Mahler considered the problem of multivariable polynomials, and the name Mahler measure got associated to M( f ).

2.2 an upper bound on p

Let f (X) ∈ Z[X] be a polynomial which does not have a root with absolute value 1. Then by Theorem 2.1 we know that v un−1 un i lim t f (ζn) = M( f ). n→∞ ∏ i=1

If M( f ) is not an integer, then we know that there is some n0 ∈ N such q n n−1 i that for all n ≥ n0 the number ∏i=1 f (ζn) is not an integer. If M( f ) n−1 i is an integer, but ∏i=1 f (ζn) 6= M( f ) from some n onwards, then again we can find such an integer n0. If n = p ≥ n0 is a prime then q q p p−1 i p we see that ∏i=1 f (ζ p) = Np( f (ζ p)). So we find that Np( f (ζ p)) cannot be a p-th power for all primes p ≥ n0. Let us find an explicit expression for n0 in a simple example. The simplest example is the case where f (X) = aX + b is a linear polyno- ap+bp mial with |a| 6= |b. In this case we have Np( f (ζ p)) = a+b . Note that if Np(aζ p + b) is a p-th power, then also Np(−aζ p − b) and Np(a + bζ p) are p-th powers, so we can assume without loss of generality that a > |b| > 0. Although the following proposition immediately follows the proof Fermat’s last theorem, we show this example anyway as an illustration of the method. 12 limitsof N-th roots

Proposition 2.3. Let a > |b| > 0 and let p be a prime such that

log(2) p > . log(a + 1) − log(a) Then there is no integer c such that ap + bp = cp.

Proof. By assumption we have

log(2) log(2) p > > log(a + 1) − log(a) log(a) − log(a − 1)

a a+1 and thus we see that log(2) < p log( a−1 ) and log(2) < p log( a ). So 2(a − 1)p < ap and 2ap < (a + 1)p. Hence we find:

(a − 1)p < ap − (a − 1)p ≤ ap + bp < 2ap < (a + 1)p. √ √ p p p p p p Therefore a − 1 <√ a + b < a + 1. If b 6= 0 then a + b 6= a, so we conclude that p ap + bp is not an integer.

A slight variation on the previous proposition is the following.

Theorem 2.4. Let a > |b| > 0 and let p be a prime such that

log(2) + log(a) p > . log(a) − log(a − 1)

ap+bp p Then there is no integer c such that a+b = c . Proof. Note that 1 − a − b ≤ 0, and thus ap(1 − a − b) + bp < 0. This means that for all p we have ap + bp < ap. a + b > log(2)+log(a) ( − ) ( ) − Since we assume that p log(a)−log(a−1) , we find that p 1 log a ( − ) > ( ) ap−1 > p log a 1 log 2 , and thus (a−1)p 2. Consequently

ap > 2a(a − 1)p ≥ (a − 1)p(a + b + 1)

and therefore

ap + bp > ap − (a − 1)p > (a − 1)p(a + b). q q p ap+bp p ap+bp We conclude that a − 1 < a+b < a, so a+b is not an integer.

This allows us to computationally give a lower bound on max(|a|, |b|). > < log(2)+log(a) We fix an a 0, and we check for all primes p log(a)−log(a−1) and ap+bp 1 − a ≤ b ≤ a − 1 coprime to a that a+b is not a p-th power. Note, for p = 3, that aζ3 + b is a general element of Z[ζ3], so we will certainly a3+b3 find many third powers of the form a+b . The interesting case is p > 3. We did these computations using a simple Pari-GP [25] script. This gave us the following result. 2.2 an upper bound on p 13

Theorem 2.5. If p ≥ 5 is a prime and a, b are coprime integers such that ap+bp p a+b = c , then max(|a|, |b|) > 500. This computational process does not scale very efficiently. We will see in chapter 7 that if we assume that p is a , then we can greatly improve this lower bound. In Lemma ?? we showed what the Mahler measure of a quadratic polynomial is. We would 2 like to make a similar analysis like that of Theorem 2.4 for Np(aζ p + q 2 p bζ p + c). If f (X) = aX + bX + c, can we have Np( f (ζ p)) = M( f ) q p or (a + b + c)Np( f (ζ p)) = M( f )? If not, there is an integer n0 such q p that for all primes p ≥ n0 we know that Np( f (ζ p)) and (a + b + q p c) Np( f (ζ p)) are not integers. Can we give this integer n0 explicitly in terms of a, b and c? The situation is more complicated than in the two variable case, and we do not yet have an answer to these questions.

RECURRENCESEQUENCES 3

In this chapter we investigate a connection between norms of cyclo- tomic integers and linear recurrence sequences. We start with the 2 simplest case, when the cyclotomic integer is of the form aζ p + bζ p + c. In Theorem 3.1 we will see that n−1 2i n n ∏(aζn + bζn + c) = a + c − Hn, i=0 where the numbers (Hn)n∈N are given by a linear recurrence relation. When n = p is an odd prime, this means that we have expressed 2 (a + b + c)Np(aζ p + bζ p + c) in terms of a recurrence sequence. The method of linear forms in logarithms can be used to investigate the perfect powers that occur in a linear recurrence sequence. As a corollary, we will see that for fixed non-zero integers a and b, there are 2 p only finitely many primes p such that bNp(aζ p + bζ p − a) = z has a solution. In the second part of this chapter we generalise Theorem 3.1 to hold for arbitrary cyclotomic elements. This does not make the p problem of finding solutions to Np(α) = z easier, but it is interesting in its own right. The main result is given by Theorem 3.4, and we will see some explicit examples for low variable cases.

3.1 norms in terms of a recurrence sequence

We start with the simplest example of a norm that can be expressed in terms of recurrence sequences. Theorem 3.1. For each integer n ≥ 3, and all a, b, c ∈ C with a 6= 0, we have: n−1 2i i n n ∏(aζn + bζn + c) = a + c − Hn, i=0 where (Hn)n∈N is defined by the recurrence relation

Hn+2 = −bHn+1 − acHn and the initial values H0 = 2 and H1 = −b. 2 Proof. Let f (X) = aX + bX + c = a(X − ω1)(X − ω2), for some c b n ω1, ω2 ∈ C. So we have ω1ω2 = a and ω1 + ω2 = − a . Since X − 1 = n−1 i ∏i=0 (X − ζn), we find that n−1 n−1 2i i n i i ∏(aζn + bζn + c) = a ∏(ζn − ω1)(ζn − ω2) i=0 i=0 n n n = a (ω1 − 1)(ω2 − 1) n n n n = a + c − (aω1) − (aω2) .

15 16 recurrence sequences

Note that aω1 and aω2 are roots of the polynomial (X − aω1)(X − 2 n n aω2) = X + bX + ac. Hence, if we define Hn := (aω1) + (aω2) , then from the theory of linear recurrence relations, we find that Hn satisfies Hn+2 = −bHn+1 − acHn, and has initial values H0 = 2 and H1 = −b.

We give another proof of Theorem 3.1 using determinants of ma- trices. It is a longer proof, but the recurrence relation appears very naturally when computing the determinants.

Proof. By general properties of circulant matrices (see for example [7], page 72) we have:

a a ··· a a 0 1 n−2 n−1

an−1 a0 ··· an−3 an−2 n−1 i 2i (n−1)i ...... ∏(a0 + a1ζn + a2ζn + ··· + an−1ζn ) = . . . . . = i 0 a2 a3 ··· a0 a1

··· a1 a2 an−1 a0 n Here |M| is the determinant of the matrix M and we use the subscript n to remember that it is an n × n matrix. In particular, we find

n−1 n−1 n−1 2i i i −i (−1) ∏(aζn + bζn + c) = ∏(aζn + b + cζn ) i=0 i=0

b a 0 ··· 0 c

c b a ··· 0 0

0 c b ··· 0 0 = (3.1) ......

··· 0 0 0 b a ··· a 0 0 c b n We will define:

b a 0 ··· 0 0

c b a ··· 0 0

(a,b,c) 0 c b ··· 0 0 G = Gn := n ......

··· 0 0 0 b a ··· 0 0 0 c b n 3.1 norms in terms of a recurrence sequence 17

This is the same matrix as in (3.1), but with 0 in the bottom left and upper right corners. Then we see that, for n ≥ 2, the determinant in (3.1) is equal to:

c a 0 ··· 0 0 c b a ··· 0 0

0 b a ··· 0 0 0 c b ··· 0 0

0 c b ··· 0 0 n+1 0 0 c ··· 0 0 bG − − a + (−1) c n 1 ......

··· ··· 0 0 0 b a 0 0 0 c b

a ··· c b a ··· c 0 0 n−1 0 0 0 n−1

a 0 0 ··· 0 0

b a 0 ··· 0 0

n+1 2 c b a ··· 0 0 =bG − − acGn− + (−1) a n 1 2 ......

··· 0 0 0 a 0

··· b a 0 0 0 n−2

c b a ··· 0 0

0 c b ··· 0 0

n+1 2 0 0 c ··· 0 0 +(−1) c − acGn− ...... 2 ......

··· 0 0 0 c b

··· c 0 0 0 0 n−2 n+1 n n =bGn−1 − 2acGn−2 + (−1) (a + c )

The numbers Gn satisfy a recurrence relation. Indeed, for n ≥ 2, we have 18 recurrence sequences

b a 0 ··· 0 0

c b a ··· 0 0

0 c b ··· 0 0 Gn = ......

··· 0 0 0 b a ··· 0 0 0 c b n

c a 0 ··· 0 0

0 b a ··· 0 0

0 c b ··· 0 0 = bG − − a n 1 ......

··· 0 0 0 b a ··· 0 0 0 c b n

= bGn−1 − acGn−2

Together with the initial values G0 = 1 and G1 = b, these determine the numbers Gn. Hence, we find

n−1 2i i n n n+1 ∏(aζn + bζn + c) = a + c + (−1) (bGn−1 − 2acGn−2). i=0

n Now define Hn := (−1) (bGn−1 − 2acGn−2). It is easy to see that the numbers Hn satisfy the recurrence relation Hn+2 = −bHn+1 − acHn. Using the initial values of the sequence (Gn)n∈N, we see:

3 H3 = −bG2 + 2acG1 = −b + 3abc 2 H2 = bG1 − 2acG0 = b − 2ac

Hence we find: H + bH H = − 3 2 = −b 1 ac H + bH H = − 2 1 = 2 0 ac Finally we conclude that

n−1 2i i n n ∏(aζn + bζn + c) = a + c − Hn. i=0

The determinants Gn from the previous proof are a specific instance of the following, more general proposition, which also follows from the results in [16]. 3.1 norms in terms of a recurrence sequence 19

Proposition 3.2. Let a, bi ∈ C for i ≥ 0. Set A0 := 1, and for n ≥ 1, define:

b0 b1 b2 ··· bn−1 . . a b0 b1 . . An := . 0 a b0 ...... b1

0 ··· 0 a b0

For any ci ∈ C, i ≥ 0, we have the equality:

c0 c1 c2 ··· cn−1

a b b ··· b 0 1 n−2 . n n 0 a b . = c0 An−1 − ac1 An−2 + ··· + (−1) a cn−1 A0. 0 . . . .. b 1

0 ··· 0 a b0

In particular, if bi = 0 for i ≥ k, then the matrices An satisfy the recurrence relation

k−1 k−1 An = b0 An−1 − ab1 An−2 + ··· + (−1) a bk−1 An−k.

Proof. We use induction on n. It is obvious for n = 0. Now assume that it holds for matrices of size (n − 1) × (n − 1). Then we see

c0 c1 c2 ··· cn−1 c1 c2 c3 ··· cn−1

a b b ··· b a b b ··· b 0 1 n−2 0 1 n−3 . . 0 a b . = c0 An−1 − a 0 a b . 0 0 ...... b . .. b 1 1

0 ··· 0 a b0 0 ··· 0 a b0 n−1 n−1 = c0 An−1−a(c1 An−2 − · · · + (−1) a cn−1 A0) n n = c0 An−1−ac1 An−2 + ··· + (−1) a cn−1 A0

The last statement of the proposition is obtained by putting ci = bi for all i ≥ 0.

Note that the recurrence relation of the determinants Gn follows from this proposition by putting c0 = b0, c1 = b1 and ci = bi = 0 for i ≥ 2. Let Gn be a non-degenerate (the coefficients of the recurrence rela- tion are non-zero) recurrence sequence of order 2. Shorey and Stewart [31], and independently Pethö [26], proved that there are effectively computable constants C1, C2, depending only on the coefficients and starting values of the sequence Gn, such that if n, y, q are integers with q > 1 satisfying q Gn = y , 20 recurrence sequences

then max(|y|, n, q) < C1 if |y| > 1 and n < C2 if |y| ≤ 1. This result is obtained by using the method of linear forms in logarithms. Generally, the constants are very large. For example, if Gn is the Fibonacci se- quence, then one finds that q < 5.1 × 1017. See [27, 28] for a survey by Pethö on diophantine equations involving linear recurrence sequences. As an immediate consequence of this, we find: Theorem 3.3. Let a, b be non-zero integers. There is an effectively com- putable constant C, depending only on a and b, such that if p is a prime and z an integer such that 2 p bNp(aζ p + bζ p − a) = z , then p < C. Proof. If p is an odd prime, then

p−1 2i i 2 ∏(aζ p + bζ p + c) = (a + b + c)Np(aζ p + bζ p + c). i=0 So if a = −c, then Theorem 3.1 shows that we have 2 bNp(aζ p + bζ p − a) = −Hn.

Here Hn is the second order linear recurrence sequence given by the initial values H0 = 2, H1 = −b and the recurrence relation 2 Hn+2 = −bHn+1 + a Hn. 2 So, taking C = max(C1, C2), we conclude that if bNp(aζ p + bζ p − a) = zp, then we must have p < C.

2 For fixed a and b, if we want to solve the equation bNp(aζ p + bζ p − a) = zp, this theorem tells us we only need to check a finite number of primes p. However, in practice the constant C is too large to check all primes below C and more clever methods are necessary. In the breakthrough article [4], Bugeaud, Mignotte and Siksek show that the perfect powers in the sequence Fn of Fibonacci numbers are given by F0 = 0, F1 = F2 = 1, F6 = 8 and F12 = 144. Furthermore, they showed that the only perfect powers in the sequence (Ln)n∈N of Lucas numbers are given by L1 = 1 and L3 = 4. They use a sharper version of the linear forms in logarithms method together with the modular method, which originates in Wiles’ work on Fermat’s last theorem. These two methods combined reduce the bounds on q and n far enough for the problem to be resolved computationally. Note 2 that, because Np(ζ p − ζ p − 1) = Lp by Theorem 3.1, we find as an immediate corollary that there are no primes p such that 2 p Np(ζ p − ζ p − 1) = z . It is interesting that the modular method that is used in the resolution p of the Fermat equation (x + y)Np(xζ p + y) = z , and which is not 2 cyclotomic in nature, can also be used to solve the equation Np(ζ p − p ζ p − 1) = z . 3.2 a general recurrence theorem 21

3.2 a general recurrence theorem

Now we will show that Theorem 3.1 does not just hold for cyclotomic integers of the form aζ2 + bζ + c, but can in fact be generalized to any cyclotomic integer. For j = 1, . . . , k, let ej(X1,..., Xk) be the elementary symmetric polynomial of degree j in k variables. So e1(X1,..., Xk) = X1 + X2 + ... + Xk, e2(X1,..., Xk) = ∑ ≤r

k k−1 Theorem 3.4. Let f (X) = akX + ak−1X + ··· + a1X + a0 ∈ C[X] with ak 6= 0 and let ω1,..., ωk be its roots. For j = 0, . . . , k and n ∈ N we define (j) (n+1)k−j n n n An := (−1) ak ej(ω1 ,..., ωk ). For each n ≥ 3 we have

n−1 k i (j) ∏ f (ζn) = ∑ An . i=0 j=0

(j) For each j, the sequence (An )n∈N is given by a recurrence relation

k ( j)−1 (j) (j) (j) A k = gs A n+( ) ∑ n+s j s=0

(j) (j) (j) with g0 ,..., g k ∈ Z[a0,..., ak] and initial values An ∈ Z[a0,..., ak]. ( j)−1

Proof. Let f (X) = ak(X − ω1)(X − ω2) ··· (X − ωk) for certain ω1,..., ωk ∈ C. Then we have:

ak−1 = −e1(ω1,..., ωk) ak ak−2 = e2(ω1,..., ωk) ak . . a0 k = (−1) ek(ω1,..., ωk) ak Moreover we see:

n−1 n−1 i n i i ∏ f (ζn) = ak ∏(ζn − ω1) ··· (ζn − ωk) i=0 i=0 nk n n n = (−1) ak (ω1 − 1) ··· (ωk − 1) k ! nk n k−j n n = (−1) ak ∑(−1) ej(ω1 ,..., ωk ) j=0 k (n+1)k−j n n n = ∑(−1) ak ej(ω1 ,..., ωk ) j=0 22 recurrence sequences

( n n) For each j, we will show that ej ω1 ,..., ωk n∈N and thus also  (j) An satisfies a recurrence sequence as in the statement of the n∈N theorem. In general, if

t t i ∑ ciX = ct ∏(X − γj) i=0 j=1

then from the theory of linear recurrence sequences we know that t n the numbers Cn := ∑j=1 γj satisfy the recurrence relation ctCn+t = t−1 − ∑i=0 ciCn+i. In our case, we would like the numbers Cn to be equal n n to ej(ω1 ,..., ωk ). To achieve this, we need to set the γj equal to k mr(ω1,..., ωk), where mr(X1,..., Xk), r = 1, . . . , ( j) are the mono- mials of the polynomials ej(X1,..., Xk). Therefore we consider the polynomial k ( j) E(X) = ∏(X − mr(ω1,..., ωk)). r=1

Let hs ∈ C be such that

k ( j) s E(X) = ∑ hsX . s=0

Then each hs is given by a symmetric polynomial in ω1,..., ωk. By the fundamental theorem of symmetric polynomials, this means that

hs = Ps (e1(ω1,..., ωk),..., ek(ω1,..., ωk))

for some polynomial Ps ∈ Z[Y1,..., Yk]. n n Setting Cn := ej(ω1 ,..., ωk ), we see that

k ( j) n n Cn = ∑ mr(ω1 ,..., ωk ), r=1

and therefore we conclude that the numbers (Cn)n∈N satisfy the recur- rence relation

k ( j)−1 C k = − hsCn+s n+( j) ∑ s=0 k n n n ( j)−s Hence, if An := (−ak) ej(ω1 ,..., ωk ) and gs = (−a) hs, then we get k ( j)−1 A k = − gs An+s. n+( j) ∑ s=0

To finish the proof, we need to show that the coefficients gs and the k values An are in fact elements of Z[a0,..., ak], for all s, n = 0, . . . , ( j) − 1. Note that they are all elements of Z[ a0 ,..., ak−1 , a ], and that the ak ak k 3.2 a general recurrence theorem 23

power of ak appearing in the denominator of hs is equal to the degree of the polynomial Ps. Hence if deg P k ≤ s, then gs ∈ Z[a0,..., ak]. ( j)−s Similarly, if

n n ej(ω1 ,..., ωk ) = Q(e1(ω1,..., ωk),..., ek(ω1,..., ωk)) then we see that the power of ak appearing in the denominator of n n ej(ω1 ,..., ωk ) is equal to the degree of Q. So if deg Q ≤ n, then An ∈ Z[a0,..., ak]. Now we use the following lemma.

Lemma 3.5. Let h(X1,..., Xk) be a symmetric polynomial in k variables, which has degree n as a polynomial in X1. Then there is some polynomial q ∈ Z[Y1,..., Yk] of degree n (as a polynomial in k variables) such that

h(X1,..., Xk) = q(e1(X1,..., Xk),..., ek(X1,..., Xk)). Proof of the Lemma. The proof is just the usual proof of the funda- mental theorem of symmetric polynomials, but keeping track of the degree of the polynomial q we end up with. For the usual proof, see for example [6], page 178. We use induction on the leading term of h in lexicographic ordering. Suppose the leading term is given by l1 l2 lk h0X1 X2 ··· Xk for some h0 ∈ C. By assumption we have l1 = n. Be- cause we use lexicographic ordering, and because h is symmetric, we have l1 ≥ l2 ≥ ... ≥ lk. We define − ˜ l1−l2 l2−l3 lk−1 lk lk h(Y1,..., Yk) = h0Y1 Y2 ··· Yk−1 Yk .

Then we see deg h˜ = l1 = n. Moreover, we see that

h(X1,..., Xk) − h˜(e1(X1,..., Xk),..., ek(X1,..., Xk)) is a symmetric polynomial whose leading term is smaller in lexico- graphic ordering. By induction we see that h = q(e1,..., ek), for some polynomial q ∈ Z[Y1,..., Yk] of degree n.

n n Because e1(ω1 ,..., ωk ) has degree n as a polynomial in ω1, the lemma tells us that Q has degree n as a polynomial in k variables. Fur- thermore, because in the definition of E(X) the monomials mr(ω1,..., ωk) have degree at most 1 as a polynomial in ω1, we find that h k (ω1,..., ωk) ( j)−s has degree s as a polynomial in ω1. Consequently, the lemma tells us that Ps has degree s as a polynomial in k variables. So, indeed, the coef- ficients gs and the starting values An are elements of Z[a0,..., ak].

(i) We can give some more information on the sequences (An )n∈N. They satisfy some symmetry: k Corollary 3.6. Suppose that a0 6= 0 in Theorem 3.4, let fˆ(X) = a0X + k−1 (j) a1X + ··· + ak−1X + ak be the reciprocal polynomial and let (Aˆ n )n∈N be the corresponding recurrence sequences. Then we have the following symmetrical property:

(j) (n+1)k (k−j) An = (−1) Aˆ n 24 recurrence sequences

1 1 1 Proof. Note that fˆ(X) = a0(X − )(X − ) ··· (X − ) and thus ω1 ω2 ωk   ˆ (k−j) nk+j n 1 1 An = (−1) a0 ej−k n ,..., n . ω1 ωk

k a0 Because we have ω ω2 ··· ω = (−1) we find: 1 k ak

e (ωn,..., ωn) = ωn ··· ωn j 1 k ∑ i1 ij 1≤i1<...

(j) (n+1)k−j n n n An = (−1) ak ej(ω1 ,..., ωk )   k−j n 1 1 (n+1)k ˆ (k−j) = (−1) a0 ek−j n ,..., n = (−1) An . ω1 ωk

(i) Furthermore, we can give (An )n∈N explicitly for small i.

(0) (n+1)k n (k) n Corollary 3.7. In Theorem 3.4 we have An = (−1) ak , An = a0 and for j = 1 and j = k − 1 we have the recurrences:

k−1 (1) = − (− )k(k−s) k−1−s (1) An+k ∑ 1 ak as An+s s=0 k−1 (k−1) = − k−1−s (k−1) An+k ∑ a0 ak−s An+s s=0

(0) Proof. From the proof of Theorem 3.4 we see that, by definition, An = (n+1)k n (−1) ak . For the case j = 1 we see that

k k−1 k as s E1(X) = E(X) = ∏(X − ωr) = X + ∑ X . r=1 s=0 ak

(1) (n+1)k−1 n n n So An = (−1) ak e1(ω1 ,..., ωk ) satisfies the recurrence

k−1 (1) = − (− )k(k−s) k−1−s (1) An+k ∑ 1 ak as An+s. s=0

The cases j = k − 1 and j = k now follow by the symmetry property of the previous corollary. 3.2 a general recurrence theorem 25

We will finish this chapter with a number of examples in the case where we have few variables. As a first example, we again look at f (X) = aX2 + bX + c. Theorem 3.4 tells us that

n−1 i n (1) n ∏ f (ζn) = a + An + c , i=0

(1) (1) (1) where A0 = −2, A1 = −ae1(ω1, ω2) = b, and where An satisfies the recurrence (1) (1) (1) An+2 = −bAn+1 − acAn . So, as expected, we obtain Theorem 3.1 again. In the case of f (X) = aX3 + bX2 + cX + d, we find that

n−1 i n+1 n (1) (2) n ∏ f (ζn) = (−1) a + An + An + d . i=0

(1) (1) (1) (1) 2 Here (An )n∈N is given by A0 = 3, A1 = b, A2 = b − ac and the recurrence (1) (1) (1) 2 (1) An+3 = bAn+2 − acAn+1 + a dAn . (2) (2) (2) (2) The sequence (An )n∈N is given by A0 = −3, A1 = c, A2 = bd − c2 and the recurrence

(2) (2) (2) 2 (2) An+3 = −cAn+2 − bdAn+1 − ad An .

Finally, we shall look at the example f (X) = aX4 + bX3 + cX2 + dX + e. Theorem 3.4 shows that we have

n−1 i n (1) (2) (3) n ∏ f (ζn) = a + An + An + An + e . i=0

(1) (3) For An and An we can use Corollary 3.7 to get the recurrence (1) relation. We see that the sequence (An )n∈N is given by the recurrence relation

(1) (1) (1) 2 (1) 3 (1) An+4 = −bAn+3 − acAn+2 − a dAn+1 − a eAn .

i i i i i Let ω1, ω2, ω3, ω4 be the roots of f (X), and let us set ej(ω ) := ej(ω1, ω2, ω3, ω4) (j) j n n and ej := ej(ω). With this notation, we have An = (−1) a ej(ω ), and (1) thus the the initial values of An are given by:

(1) 0 A0 = −e1(ω ) = −4 (1) A1 = −ae1(ω) = b (1) 2 2 2 2 2 A2 = −a e1(ω ) = −a (e1 − 2e2) = −b + 2ac (1) 3 3 3 3 3 2 A3 = −a e1(ω ) = −a (e1 − 3e1e2 + 3e3) = b − 3abc + 3a d 26 recurrence sequences

(3) By symmetry, we find that the sequence (An )n∈N is given by the recurrence

(3) (3) (3) 2 (3) 3 (3) An+4 = −dAn+3 − ceAn+2 − be An+1 − ae An .

The initial values are given by:

(3) A0 = −4 (3) A1 = d (3) 2 A2 = −d + 2ce (3) 3 2 A3 = d − 3cde + 3be

(2) To find the recurrence relation that the sequence (An )n∈N satisfies, we need to calculate the polynomial E(x) as in the proof of Theorem 3.4:

E(x) = (x − ω1ω2)(x − ω1ω3)(x − ω1ω4)(x − ω2ω3)(x − ω2ω4)(x − ω3ω4) 6 5 4 2 2 3 = x − e2x + (e1e3 − e4)x + (2e2e4 − e1e4 − e3)x 2 2 2 3 + (e1e3e4 − e4)x − e2e4x + e4 c bd − ae 2ace − ad2 − b2e = x6 − x4 + x4 + x3 a a2 a3 bde − ae2 ce2 e3 + x2 − x + a3 a3 a3 As a result, we find that the recurrence relation is given by:

(2) (2) (2) 2 2 (2) An+6 = cAn+5 + (ae − bd)An+4 + (ad + b e − 2ace)An+3 2 2 (2) 2 2 (2) 3 3 (2) + (a e − abde)An+2 + a ce An+1 − a e An

(2) n n Finally, for the initial values An = a e2(ω ) for n = 0, . . . , 5, we find:

(2) A0 = 6 (2) A1 = c (2) 2 A2 = c + 2ae − 2bd (2) 3 2 2 A3 = c − 3ace + 3ad + 3b e − 3bcd (2) 4 2 2 2 2 2 2 2 2 A4 = c + 6a e − 8abde − 4ac e − +4ad c + 4b ce + 2b d − 4bc d (2) 5 2 2 2 2 2 2 3 3 A5 = c + 5a ce + 5a d e + 5ab e − 5abcde − 5ac e − 5abd − 5b3de + 5ac2d2 + 5b2c2e + 5b2cd2 − 5bc3d FROMNORMSTOSINGULARINTEGERS 4

4.1 singular integers

In the classical approach to Fermat’s last theorem the following lemma is used. Lemma 4.1. Let x, y be coprime integers such that p does not divide x + y. Then x + yζ is coprime to its Galois conjugates.

Proof. Suppose q is a prime ideal of OK that divides both x + yζ and x + yζi for some i 6≡ 0, 1 mod p. Then q also divides x + yζ − (x + yζi) = y(ζ − ζi). Since p does not divide x + y, we know that q 6= (1 − ζ). Hence q divides y. Similarly we see that q has to divide ζ−1x + y − (ζ−ix + y) = x(ζ−1 − ζ−i) and therefore q also divides x. But this contradicts the fact that x and y are coprime, so we conclude that x + yζ and x + yζi are coprime.

This lemma allows us to conclude that if p does not divide x + y and p p N(x + yζ) = z , then (x + yζ) = I for some ideal I of OK. Following the terminology of [29], chapter 9, we define: p Definition 4.2. An element α ∈ OK is called a singular integer if (α) = I for some ideal I of OK. p Now suppose we have an element α ∈ OK satisfying N(α) = z . The main question of this chapter is: Are there conditions that allow us to conclude that α is a singular integer? For a rational prime q, we denote the primes of OK dividing q by qi for i = 1, . . . , g(q), where g(q) f (q) = p − 1 and f (q) is the relative degree of q. With this notation, the factorisation of α into prime ideals looks like

g(q) ord (α) q qi (α) = ∏ ∏ i . q prime i=1 We see that N(α) = zp if and only if

g(q) ( ) ≡ ∑ ordqi α 0 mod p, for all rational primes q.(4.1) i=1 On the other hand we see that α is a singular integer if and only if

ordqi (α) ≡ 0 mod p, for all primes qi of OK.(4.2) p−2 p−2 Given α = ap−2ζ + ··· + a1ζ + a0 we set α(X) := ap−2X + ··· + a1X + a0.

27 28 from norms to singular integers

Lemma 4.3. If N(α) = zp, and for each rational prime q the polynomials α(X) and Φ(X) share at most one irreducible factor modulo q, then α is singular.

Proof. We want to show that ordq(α) ≡ 0 mod p for all prime ideals q of OK. Since (1 − ζ) is the only prime above p, condition (4.2) is immediate if q = (1 − ζ). Now let q 6= p be a rational prime and let the factorisation into irreducibles of Φ(X) modulo q be given by

g(q) Φ(X) ≡ ∏ hi(X) mod q. i=1

Then the prime ideals above q are given by qi := (q, hi(ζ)). If qi divides α, then there are polynomials r, s ∈ Z[X] such that

α = r(ζ) · q + s(ζ) · hi(ζ).

So there is some polynomial t(X) ∈ Z[X] such that

α(X) = r(X) · q + s(X) · hi(X) + t(X) · Φ(X).

Because hi(X) is a factor of Φ(X) modulo q, we see that hi(X) is also a factor of α(X) modulo q. So each qi dividing α gives a distinct irreducible common factor of α(X) and Φ(X). By assumption there is at most one such common factor for each prime q, so there is at p most one prime ideal qi dividing α. Since N(α) = z , we find that α is singular.

For example, if α = aζ + b, then α(X) has degree 1 and thus α(X) and Φ(X) can have at most one common irreducible factor per prime q. If α = aζ2 + bζ + c then α(X) has degree 2. So if q 6≡ 0, 1 mod p, then q does not split completely in Q(ζ p) and thus f (q) = deg hi(X) ≥ 2. Therefore there can be at most one common factor of α(X) and Φ(X) modulo q. So only at the primes which are congruent to 1 modulo p is it possible that the condition of Lemma 4.3 is not satisfied. Of course, the above does not give a clear-cut way to say wether a given element α with N(α) = zp is singular or not. For a specific prime p and element α it is possible to check it, but we do not have a general condition in terms of the coefficients of α like we did in Lemma 4.1.

4.2 self-prime integers

In the case of α = x + yζ, we showed that α is singular by showing that α is coprime to its Galois conjugates. For a general element α ∈ OK this is a sufficient condition in order to get (4.1) =⇒ (4.2). That’s why we shall look a bit closer that this property. The goal is to find reasonably simple conditions that tell us when an element is coprime to its conjugates. Let us start with the basic definition. 4.2 self-prime integers 29

Definition 4.4. An element α ∈ OK is called self-prime if α is coprime to σi(α) for all i = 2, . . . , p − 1. Being self-prime is a property that we can check locally, at all the prime ideals of OK. We give some general properties of self-prime cyclotomic integers. The following lemma shows that the self-prime condition puts pretty big restrictions on α.

Lemma 4.5. If α ∈ OK is selfprime then all prime divisors of N(α) are congruent to 1 modulo p. Proof. Suppose q divides N(α) and q is not congruent to 1 modulo p. Then q does not split completely in K/Q. Hence for any prime q above q there is some σ ∈ Gal(K/Q) such that σ(q) = q. But then if q divides α, it also divides σ(α).

If some α is self-prime, then we automatically know some other self-prime cyclotomic integers.

Lemma 4.6. Let u be a unit in OK and let σ be an element of Gal(K/Q). If α is self-prime, then also uα and σ(α) are self-prime. Proof. This is immediate from the definition.

Lemma 4.7. Suppose Q ⊆ L ( K. If α is an element of L then α is not self-prime. Proof. Since L is a proper subfield of K, there is some non-trivial σ ∈ Gal(K/Q) that fixes K. But then σ(α) = α so α is not self-prime.

We will proceed by giving a characterisation of self-prime integers using the characteristic polynomial. Denote by χα(X) the charateristic polynomial of α. So we have

χα(X) = ∏ (X − σ(α)) σ∈Gal(K/Q) p−1 p−2 = X + gp−2(α)X + ··· + g1(α)X + g0(α).

The gi(α) are the elementary symmetric polynomials in σi(α), i = 1, . . . , p − 1. They are invariant under the action of the Galois group and thus they are integers. We have

p−1 gp−2(α) = − ∑ σi(α) = −Tr(α) i=1 p−1 p−1 g0(α) = (−1) ∏ σi(α) = N(α) i=1 p−1 p−1 ∏j=1 σj(α) g (α) = − = −Tr(α−1)N(α) 1 ∑ ( ) i=1 σi α The following proposition gives a characterisation of self-prime integers. 30 from norms to singular integers

Proposition 4.8. An element α ∈ OK is self-prime if and only if g0(α) = N(α) and g1(α) are coprime.

Proof. Suppose q is a prime lying above some prime integer q, which divides both α and σi(α) for some i 6= 1. Then obviously q divides N(α). Since g1(α) is the sum of the products of p − 2 different σi(α), each term in this sum is divisible by q. Hence q also divides g1(α). Conversely suppose that q divides both g0(α) and g1(α). Since it divides g0(α), there is some q above q which divides α. Since q divides p−1 g1(α) and α, it also divides ∏j=2 σj(α). Consequently q divides σj(α) for some j 6= 1.

We only need only p − 1 powers of ζ to get a basis of K over Q. For p−2 example, we can take 1, . . . , ζ as a basis and express α ∈ OK in p−2 terms of this basis as α = a0 + a1ζ + ··· + αp−2ζ for some integers ai. The linear map x 7→ αx then corresponds to the matrix   a0 a1 a2 ... ap−2    −ap−2 a0 − ap−2 a1 − ap−2 ... ap−3 − ap−2      Mα =  ap−2 − ap−3 −ap−3 a0 − ap−3 ... ap−4 − ap−3  .    . . . .   . . . .  a2 − a1 a3 − a1 a4 − a1 ... a0 − a1

Then we have the equality

det(XI − Mα) = χα(X).

However, by choosing p − 1 elements out of 1, ζ,..., ζ p−1 to form a basis, we lost some of the symmetry which these elements have. Instead we could consider the matrix   a0 a1 a2 ... ap−2 0    0 a a ... a a   0 1 p−3 p−2    0  ap−2 0 a0 ... ap−4 ap−3  M =   . α  . . . . .   . . . . .     a a a a a   2 3 4 ... 0 1  a1 a2 a3 . . . 0 a0

In a way, this matrix corresponds to the presentation of α as a0 + p−2 p−1 a1ζ + ··· + ap−2ζ + 0 · ζ . Denote α(1) = a0 + a1 + ... + ap−2. By adding the top p − 1 rows to the bottom row, we get only α(1)’s on the bottom row. If we then substract the last column from the first 0 p − 1 columns, we find that det Mα = α(1) det Mα. Similarly we see that we have

0 det(XI − Mα) = (X − α(1)) det(XIp−1 − Mα) = (X − α(1))χα(X). 4.2 self-prime integers 31

0 p p−1 In particular, if we write det(XI − Mα) = X + hp−1(α)X + ··· + h1(α)X + h0(α), then we have:

hp−1(α) = gp−2(α) − α(1) = −(Tr(α) + α(1))

h0(α) = −α(1)g0(α) = −α(1)N(α) −1 h1(α) = −α(1)g1(α) + g0(α) = (1 − α(1)Tr(α ))N(α)

It makes sense to denote α(1) = a0 + a1 + ··· + ap−2 when we think of it as α = f (ζ) for some polynomial f ∈ Z[X]. However, when we consider the Galois group, perhaps it makes more sense to think of it as a map σ0 : OK → Z given by σ0(α) = a0 + a1 + ··· + ap−2. With this notation we have

p−1 hp−1(α) = − ∑ σi(α) i=0 p−1 h0(α) = − ∏ σi(α) i=0 p−1 p−1 ∏j=0 σj(α) h (α) = 1 ∑ ( ) i=0 σi α

We can use the hi(α) instead of the gi(α) in Proposition 4.8.

Proposition 4.9. Let α ∈ OK. Then α is self-prime if and only if h0(α) and h1(α) are coprime. Proof. Suppose α is not self-prime. By Proposition 4.8 there is some prime q dividing both g0(α) and g1(α). Then, using the expressions above, we see that q divides h0(α) and h1(α). Conversely, suppose that q divides h0(α) and h1(α). From the first we see that q divides α(1) or N(α). But if q divides α(1) then the second gives us that q divides N(α). So we know that q divides N(α) and α(1)g1(α). If q divides N(α) and g1(α) we are done, by Proposition 4.8. Hence suppose that q divides N(α) and α(1). Then there is some prime ideal q lying above q that divides both α and α(1). Write α = a0 + a1ζ + p−2 ··· + ap−2ζ . Then we get

q | α(1) − α = (1 − ζ)(a1 + a2 + ··· + ap−2).

So either q = p or q divides a1 + a2 + ··· + ap−2. If q = p then we know that α is not self-prime. If q divides α(1) and a1 + ··· ap−2 then −i q divides a0. Similarly, using α(1) − ζ α instead of α(1) − α, we can show that q divides ai for any i. But if q divides all ai then it will divide σ(α) for any σ, so α is not self-prime.

In general the hi(α) are a bit nicer to work with, because they are more symmetrical. We can use this to give another proof of Lemma 4.1. 32 from norms to singular integers

Lemma 4.10. Let α = a + bζ. Then α is self-prime if and only if a and b are coprime and p does not divide a + b.

Proof. If a and b have a non-trivial common divisor, then obviously α is not self-prime. Moreover, if p divides a + b, then we know by Lemma 4.5 that α is not self-prime. Hence assume that a and b are coprime and p does not divide a + b. We have

X − a −b 0 . . . 0

0 X − a −b . . . 0

0 p p det(XI − Mα) = 0 0 X − a . . . 0 = (X − a) − b .

......

−b 0 0 . . . X − a

p p p−1 This means that h0(α) = −(a + b ) and h1(α) = pa . So if q divides h1(α) then q = p or q divides a. If q divides a and h0(α) then also q divides b, which contradicts the coprimality of a and b. If q = p p p then 0 ≡ −h0(α) ≡ a + b ≡ a + b mod p, which does not hold by assumption.

In the case of three variables we get the following.

Theorem 4.11. Let α = a + bζ + cζ2. Then α is self-prime if and only if (a + b + c)N(α) and ap − cp are coprime and p does not divide a + b + c.

Proof. By Lemma 4.9 we know that α is self-prime if and only if h0(α) p−1 −i and h1(α) are coprime. Note that we have h0(α) = − ∏i=0 (aζ + b + cζi) and:

p−1 p−1 −i i h1(α) = ∑ ∏(aζ + b + cζ ) j=0 i=0 i6=j − ∂ p 1 = − ∏(aζ−i + b + cζi) ∂b i=0 ∂ = − h (α) ∂b 0

So if a rational prime q divides h0(α) and h1(α), then it must also d divide the resultant of h0(α) and db h0(α) when considered as a polyno- mial in b. But this resultant is precisely the discriminant discb(h0(α)) 4.2 self-prime integers 33

p−1 −i of h0(α) with respect the variable b. Since h0(α) = ∏i=0 (b − (−aζ + cζi)) we find

p−1 ( − ) p p 1 −i i −j j discb(h0(α)) = (−1) 2 ∏ (aζ + cζ − aζ − cζ ) i,j=0 i6=j p−1 − p 1 −i −j i j = (−1) 2 ∏ (a(ζ − ζ ) + c(ζ − ζ )) i,j=0 i6=j p−1 − p 1 −i i−j i+j = (−1) 2 ∏ ζ (1 − ζ )(a − cζ ) i,j=0 i6=j

Note that we have:

p−1 p−1 p−1 p−1 ∏ ζ−i = ∏ ∏ ζ−i = ∏ ζi = 1 i,j=0 i=0 j=0 i=0 i6=j j6=i p−1 p−1 p−1 p−1 ∏ (1 − ζi−j) = ∏ ∏(1 − ζk) = ∏ p = pp i,j=0 i=0 k=1 i=0 i6=j p−1 p−1 p−1 p−1 ∏ (a − cζi+j) = ∏ ∏(a − cζk) = ∏(ap − cp) = (ap − cp)p−1 i,j=0 i=1 k=0 i=1 i6=j

We conclude that

− p 1 p p p p−1 discb(h0(α)) = (−1) 2 p (a − c ) .

So h0(α) and h1(α) are coprime if and only h0(α) and discb(h0(α)) p p are coprime, which happens if and only if h0(α) and p(a − c ) are coprime. We know that p divides h0(α) = −(a + b + c)N(α) if and only p divides a + b + c, so we end up with the statement of the theorem.

2 As an example, let p = 23 and consider the elements ζ23 + ζ23 − 4 2 2 2 and 2ζ23 − ζ23 + 3. We have N(ζ23 + ζ23 − 4) = 47 · 15927174689, but 1 + 423 is not divisible by 47, so this element is self-prime. It is divisible by (47, ζ + 20)2 and not by any other prime above 47. On the other 2 2 23 23 hand N(2ζ23 − ζ23 + 3) = 47 · 139 · 75211 and 2 − 3 is divisible by 47. So this element is not self-prime. Indeed, it is divisible by both the prime ideals (47, ζ23 + 30) and (47, ζ23 + 40). For dealing with the Diophantine equation N(a + bζ + cζ2) = zp, this condition is not so useful. We can not say much about how the primes that divide z depend on the coefficients a, b and c. Ideally we would like to have a more simple condition to determine whether a + bζ + cζ2 is self-prime, a condition which does not involve calculating the norm of the element.

SINGULARINTEGERS 5

If α ∈ OK is a singular integer and η ∈ Z[G] annihilates the class group, then we get an equation

αη = uγp,

∗ for some unit u ∈ OK and some γ ∈ OK. In [20], Kummer’s had the great idea to study these kind of equations using the logarith- mic derivative. Intuitively this makes sense, because the logarithmic derivative should turn the p-th power γp into something which is congruent to 0 modulo p, so we can hope to obtain information on α modulo p. Kummer used this to prove that if the first case of Fermat’s last theorem fails, then p must divide the Bernouilli numbers Bp−3 and Bp−5. His methods have later been generalised, to find many more conditions modulo p that should be satisfied if the first case fails. See for example [29]. It is also possible to obtain information modulo p from the equation αη = uγp by taking the p-adic logarithm. It seems that this has not been done before, but it turns out that the information we get is the same as the information we can obtain with Kummer’s logarithmic derivative. In this chapter we first review Kummer’s method. In the first part will see how it leads to congruence conditions that α must satisfy modulo p. In the second part of the chapter we use the p- adic logarithm instead. In a very similar way as with the logarithmic derivative, we obtain information on α modulo p. In fact, in Theorem 5.9 we give a formula that exactly gives the relation between the value of logp(α) and the logarithmic derivative of α.

5.1 idempotents and the herbrand-ribet theorem

We start with some algebraic prerequisites we use in this chapter, and the next. This can be found in chapter 6.3 of [36], but we include it here to make the presentation more self-contained. ∼ ∗ Let G = Gal(K/Q) = (Z/pZ) , and let Zp be the ring of p-adic integers. By Hensel’s lemma, for each a ∈ (Z/pZ)∗, there is a unique p-th root of unity ρa ∈ Zp satisfying ρa ≡ a mod p. Define ω : G → Zp by ω(σa) = ρa. Let Zp[G] be the group ring of G over Zp. We define the orthogonal idempotents of Zp[G] to be the elements

− 1 p 1 ε := ωi(σ )σ−1 i − ∑ a a p 1 a=1

35 36 singular integers

for 0 ≤ i ≤ p − 2. A quick check shows that they satisfy:

2 • εi = εi

• εiεj = 0 if i 6= j

p−2 • 1 = ∑i=0 εi i • σaεi = ω (σa)εi

If M is any Zp[G]-module, then from these properties it follows that we have p−2 M M = εi M. i=0 Moreover, using the last property, we see that

n i o εi M = x ∈ M | σa(x) = ω (σa)x .

As an example, let us look at C = CK, the class group of the field K. The Galois group G acts on C in the obvious manner. We will often denote this action multiplicatively, so cσ instead of σ(c), for c ∈ C and σ ∈ G. This makes sense, because C is abelian. Let Cp be the p-sylow ∞ i subgroup of C. If c ∈ Cp and ∑i=0 ai p ∈ Zp, then we define

∞ ∞ i i c∑i=0 ai p = ∏ cai p i=0

This product is well-defined because Cp is a finite p-group (all its elements have order a power of p). In this way, Cp is a Zp[G]-module. Hence we have p−2 M εi Cp = Cp . i=0 If we are only interested in the group C[p] := {c ∈ C | cp = 1}, then the action of Zp[G] is completely determined by the first coefficient of the p-adic number. So, we only care about the elements of Zp[G] modulo p, and we can see it as an action of Fp[G]. In this case, the p−1 i −1 corresponding idempotents are εi = − ∑a=1 a σa ∈ Fp[G]. Kummer showed that Cp is non-trivial if and only if one of the Bernoulli numbers B2, B4,..., Bp−3 is divisible by p. The Herbrand- Ribet theorem makes this relation, between the class group and Bernoulli numbers, more precise.

εi Theorem 5.1 (Herbrand-Ribet). Let 3 ≤ i ≤ p − 2 be odd. The group Cp is non-trivial if and only if p divides Bp−i.

εi Showing that p divides Bp−i if Cp is non-trivial is the easier direction, and was proven by Herbrand [15]. In the next chapter we shall give a proof of this direction. The other direction is more difficult and is due to Ribet [30] (see also chapter 15 of [36]). 5.2 kummer’s logarithmic derivative 37

5.2 kummer’s logarithmic derivative

Kummer’s method is well explained in [10] and [32], and we follow p−2 these expositions. For α = a0 + a1ζ + ··· + ap−2ζ we define the X X (p−2)X formal power series α(e ) = a0 + a1e + ··· + ap−2e . For 1 ≤ n ≤ p − 2 we define the logarithmic derivative `n : OK \ πOK → Fp by  n  ∂ X `n(α) := log α(e ) mod p. ∂X X=0

Lemma 5.2. For all 1 ≤ n ≤ p − 2, the maps `n are well-defined. n  ∂  X X X k Proof. Note that ∂X log α(e ) will be of the form gα(e )/α(e ) for some polynomial gα ∈ Z[X] and some integer k. If π does not divide α, then α(1) is not divisible by p, and therefore we get an element of Fp after evaluating at X = 0. Because the maps are defined on the set OK \ πOK, we indeed get a map to Fp. We also need to check that the maps `n are independent of the way we write α. So we need to check that `n(α) = `n(α + f (ζ)Φ(ζ)), for f ∈ Z[X]. We see that

 ∂ n  ∂ n log(α(eX) + f (eX)Φ(eX)) ≡ log(α(eX)) ∂X ∂X  ∂  ∂ n  mod Φ(eX), Φ(eX),..., Φ(eX) . ∂X ∂X

Now notice that Φ(X) ≡ (X − 1)p−1 mod p, which implies that  i   ∂  X ∂X Φ(e ) ≡ 0 mod p for all 0 ≤ i ≤ p − 2. So, indeed, we X=0 have `n(α) = `n(α + f (ζ)Φ(ζ)), and `n(α) is well-defined as a map OK \ πOK → Fp.

We continue by showing some basic properties that this logarithmic derivative satisfies.

Lemma 5.3. The maps `n satisfy the following properties:

n 1. `nσj = j `n for all 1 ≤ n ≤ p − 2 and 1 ≤ j ≤ p − 1.

2. `n(α1α2) = `n(α1) + `n(α2) for all 1 ≤ n ≤ p − 2 and α1, α2 ∈ OK \ πOK.

j j 3. `1(ζ ) = j and `n(ζ ) = 0 for j = 0, . . . , p − 1 and 2 ≤ n ≤ p − 2.

4. `n(γ) = 0 for γ ∈ OK ∩ R and 1 ≤ n ≤ p − 2 odd.

ε ε 5. `n(α i ) = `n(α) if n = i, and else `n(α i ) = 0, for 1 ≤ n ≤ p − 2 and 1 ≤ i ≤ p − 1. 38 singular integers

Proof. For the first one, we prove by induction on n that there is some An(X) ∈ Q(X) such that  ∂ n log α(ejX) = jn A (ejX). ∂X n 0 = ( ) = Xα (X) For n 1 it is easy to see that we need to take A1 X α(X) . Now suppose the claim holds for n − 1. Then  n ∂ jX ∂ n−1 jX n jX 0 jX log α(e ) = j A − (e ) = j e A (e ). ∂X ∂X n 1 n−1 0 So we can take An(X) = XAn−1(X) and the claim follows by induction. For the proof of the first property, we now note:

n h jX i n h X i n `nσj(α) ≡ j An(e ) ≡ j An(e ) ≡ j `n(α) mod p. X=0 X=0 The second property is immediate from the general properties of j the logarithmic derivative. For the third property, note that `n(ζ ) = h n i ∂ jX . The fourth follows from the first, because if γ is real ∂X X=0 then n `n(γ) = `n(σ−1(γ)) = (−1) `n(γ). The final property follows from the following computation:

p−1 p−1 ! εi i −1 n−i `n(α ) ≡ − ∑ j `n(σj (α)) ≡ − ∑ j `n(α) mod p. j=1 j=1

p−1 n−i Now use the fact that ∑j=1 j ≡ −1 mod p if p − 1 | n − i, and else it is congruent to 0 modulo p.

Example 5.4. If α = x + yζ with x, y ∈ Z and p - (x + y) then the first few values of `n(α) are given by:

n `n(x + yζ) y 1 x+y xy 2 (x+y)2 xy(x−y) 3 (x+y)3 xy(x2−4xy+y2) 4 (x+y)4 xy(x−y)(x2−10xy+y2) 5 (x+y)5 If α = x + yζ + zζ2 with x, y, z ∈ Z and p - x + y + z then the first few values of `n(α) are given by:

2 n `n(x + yζ + zζ ) y+2z 1 x+y+z xy+4xz+yz 2 (x+y+z)2 (x−z)(xy+8xz−y2+yz) 3 (x+y+z)3 x3y+16x3z−4x2y2−13x2yz−64x2z2+xy3+8xy2z−13xyz2+16xz3+y3z−4y2z2+yz3 4 (x+y+z)4 5.2 kummer’s logarithmic derivative 39

The following theorem was proven by Kummer in [20] in the case where α is of the from x + ζy. The case of a general singular integer was dealt with by Granville [9].

Theorem 5.5. Let α ∈ OK \ πOK be a singular integer. For all odd integers 3 ≤ n ≤ p − 2 we have

Bp−n`n(α) ≡ 0 mod p.

p Proof. Suppose that α ∈ OK and αOK = I . Applying εn for odd 3 ≤ n ≤ p − 2 gives us αεn = Iεn p. If Iεn is not principal, then by ε the Herbrand-Ribet theorem we know that p divides Bp−n. If I n is εn p ∗ principal, then α = uγ for some u ∈ OK and γ ∈ OK. We know t that u = ζ v for some integer t and some v ∈ OK ∩ R. By using the properties in Lemma 5.3 of the maps `n we see that:

εn t p `n(α) ≡ `n(α ) ≡ `n(ζ ) + `n(v) + `n(γ ) ≡ 0 mod p.

Kummer used Theorem 5.5 to prove that, if the first case of Fer- mat’s last theorem fails, then p must divide both Bp−3 and Bp−5. We will recall his argument here. Assume xp + yp = zp, with p - xyz, gcd(x, y, z) = 1. Then x + yζ is singular and therefore xy(x − y) B − ` (x + yζ) ≡ B − ≡ 0 mod p. p 3 3 p 3 (x + y)3

Note that p - z implies that x + y is not divisible by p, so this is well-defined. We find that if p does not divide Bp−3, then p divides x − y. Now we use the symmetry of our equation. We reorder it as xp + (−z)p = (−y)p and do the same procedure as before. We find that p divides x + z. But then

0 = xp + yp − zp ≡ x + y − z ≡ 3x mod p which gives a contradiction for p > 3. If p does not divide Bp−5 then we have

xy(x − y)(x2 − 10xy + y2) ` (x + yζ) = ≡ 0 mod p. 5 (x + y)5 Again, we can use the symmetry of the initial equation to find that

(x − y)(x2 − 10xy + y2) ≡ (x + z)(x2 + 10xz + z2) ≡ (y + z)(y2 + 10yz + z2) ≡ 0 mod p.

If x − y ≡ x + z ≡ 0 mod p then 3x ≡ 0 mod p. If x − y ≡ x2 + 10xz + z2 ≡ 0 mod p, then from the first congruence we find 2x ≡ z, and from the second we find 25z2 ≡ 0 mod p. Finally, assume that

x2 − 10xy + y2 ≡ x2 + 10xz + z2 ≡ y2 + 10yz + z2 ≡ 0 mod p. 40 singular integers

Adding the first part to the middle part, and using that y − z ≡ x mod p, because xp + yp = zp, one finds that 12x2 + y2 + z2 ≡ 0 mod p. Similarly, by adding the first part to the last part we find x2 + 12y2 + z2 ≡ 0 mod p and therefore 11(x + y)(x − y) ≡ 0 mod p. So for p 6= 3, 5, 11 this shows that p must divide Bp−5.

5.3 the p-adic logarithm

Given an equation of the form αη = uγp, we can also use the (Iwasawa) p-adic logarithm to obtain information on α modulo p. We recall the definition and some basic properties. See, for example, [19] for proofs of these facts. Let Cp be the completion of the algebraic closure of the field Qp and let | · |p be the p-adic valuation. For x ∈ Cp with |x|p < 1, the following series converges:

xi logp(1 − x) := − ∑ i≥1 i

If y ∈ Cp is any non-zero element (not necessarily with |y|p < 1), then r it can be written in the form y = p ξx, where r = ordp(X) ∈ Q, ξ is a root of unity, and x ∈ Cp satisfies |x − 1|p < 1. Then we define

logp(y) := logp(x). Just like we had the properties of Lemma 5.3 for the logarithmic derivative, for the p-adic logarithm we have the following.

Lemma 5.6. For all x, y ∈ OK − πOK we have:

1. logp(xy) = logp(x) + logp(y)

2. σ(logp(x)) = logp(σ(x)) for all σ ∈ G

3. logp(ξ) = 0 for any root of unity ξ ∗ 4. logp(u) = logp(σ−1(u)) for all u ∈ OK

5. logp(x) ≡ 0 mod p if x ∈ Z \ pZ. Proof. These are basic properties of the p-adic logarithm, see for ex- ample [19]. For the fourth property, note that u = ζrv for some integer r and real unit v. For the last one, we use that if p - x, then there is a (p − 1)-st root of unity congruent to x modulo p.

Because of the first two properties, we will sometimes write η logp(α) η instead of logp(α ) for η ∈ Z[G].

Theorem 5.7. Let α ∈ OK \ πOK be a singular integer. For all odd 3 ≤ n ≤ p − 2, we have

Bp−nεn logp(α) ≡ 0 mod p. 5.3 the p-adic logarithm 41

Proof. Just like in Theorem 5.5, we see that if p does not divide Bp−n, εn p ∗ then we must have α = uγ for some u ∈ OK and γ ∈ OK. Using the properties in Lemma 5.6 we see that

εn logp(α) ≡ logp(u) ≡ σ−1 logp(u) n ≡ σ−1εn logp(α) ≡ (−1) εn logp(α) mod p.

Because n is odd, this implies that εn logp(α) ≡ 0 mod p.

We can use εn logp(α) ≡ 0 mod p to get information about α mod- ulo p. As an example we take α = x + yζ and n = 3. Note that we have y y log (x + yζ) = log (x + y) + log (1 − π) ≡ log (1 − π) mod p, p p p x + y p x + y because logp(x + y) ≡ 0 mod p. So, by definition of the p-adic loga- rithm, we have

yi πi log (x + yζ) ≡ − mod p. p ∑ ( + )i i≥1 x y i

The terms with i = p − 1 and i ≥ p + 1 are all congruent to 0 modulo p. Moreover, we have

πi 0 = logp(ζ) = logp(1 − π) = − ∑ , i≥1 i and therefore we see that

− πp p 2 πi ≡ − ∑ mod p. p i=1 i

Hence, we conclude that

− p 2  yp yi  πi log (x + yζ) ≡ − p ∑ ( + )p ( + )i i=1 x y x y i − p 2  y yi  πi ≡ − mod p, ∑ + ( + )i i=1 x y x y i and thus xy π2 xy(x + 2y) π3 log (x + yζ) ≡ + mod π4. p (x + y)2 2 (x + y)3 3

Now we apply the idempotent ε3. We have

p−1 p−1 i −3 i −3 j i ε3π ≡ − ∑ j σjπ ≡ − ∑ j (1 − ζ ) mod p. j=1 j=1 42 singular integers

Because we have j j 1 − ζ j = 1 − (1 − π)j = ∑(−1)r+1 πr, r=1 r we find that  j (1 − ζ j)2 ≡ j2π2 − 2j π3 ≡ j2π2 + (j2 − j3)π3 mod π4, 2 (1 − ζ j)3 ≡ j3π3 mod π,

and therefore: p−1 2 −3 2 2 2 3 3 3 4 ε3π ≡ − ∑ j j π + (j − j )π ≡ −π mod π j=1 p−1 3 −3 3 3 3 4 ε3π ≡ − ∑ j j π ≡ π mod π . j=1 So, finally, we see that: xy ε π2 x2y + 2xy2 ε π3 ε log (x + yζ) ≡ 3 + 3 3 p (x + y)2 2 (x + y)3 3  xy x2y + 2xy2  ≡ − + π3 2(x + y)2 3(x + y)3 xy(x − y) ≡ − π3 mod π4. 6(x + y)3 In other words, we found that π3 ε log (x + yζ) ≡ −` (x + yζ) mod π4. 3 p 3 6 4 In particular, ε3 logp(x + yζ) ≡ 0 mod π if and only if p divides xy(x − y). So, by using Theorem 5.7, we again find that if the first case of Fermat’s last theorem fails, then p divides Bp−3. Remark 5.8. If θ ∈ Z[G] is in the Stickelberger ideal (see chapter 6 of [36]), and α is singular, then we can show, in pretty much the same manner as in Theorem 5.7, that

θ logp(α) ≡ 0 mod p. In this way we can obtain congruence conditions for the first case of Fermat’s last theorem. The process is similar to what we did above, but the details are more messy. For a clear exposition on the relationship between these congruence conditions for FLT1, the Stickelberger ideal, and congruences for Bernoulli numbers, see the PhD-thesis [18].

That `3(α) appeared in the p-adic expansion of ε3 logp(α) is no r coincidence, as the following theorem shows. Let [s] be the unsigned Stirling numbers of the first kind. They are defined by the equation r r x(x + 1) ... (x + r − 1) = ∑ xs. s=1 s 5.3 the p-adic logarithm 43

Note that we have a 1 1 b b = a(a − 1) ... (a − b + 1) = ∑(−1)b+s as. b b! b! s=1 s

Theorem 5.9. Let α ∈ OK \ πOK and 1 ≤ n ≤ p − 2. Then we have p−2   ! n 1 r r εn logp(α) ≡ (−1) `n(α) ∑ π mod p. r=n r! n

Proof. The idea of the proof is that εn logp(α) and `n(α) are both given by the n-th coefficient of certain power series. In the first part of the proof we show what these power series are in both cases. In the second part, we calculate the n-th coefficient of the series. This will lead to the formula in the statement of the theorem. i Suppose that we have a power series ∑i≥0 fiX with coefficients fi ∈ Fp. Then for any 1 ≤ n ≤ p − 2 we have  ∂ n  f (X) = n! fn.(5.1) ∂X X=0 p−2 i Similarly, for the polynomial g(X) = ∑i=0 giX we have p−1 p−2 p−1 −n i−n ∑ j g(j) = ∑ gi ∑ j . j=1 i=0 j=1

p−1 i−n p−1 i−n We know that ∑j=1 j ≡ 0 mod p if i 6≡ n mod p − 1, and ∑j=1 j ≡ −1 mod p if i ≡ n mod p − 1. Consequently, we find

p−1 −n ∑ j g(j) ≡ −gn mod p.(5.2) j=1 n h ∂  i p−1 −n We see that both f (X) and ∑j= j g(j) give us the n-th ∂X X=0 1 coefficient. Now consider α ∈ OK \ πOK and suppose we have α = a0 + a1ζ + p−2 ··· + ap−2ζ for some integers ai. Since α(1) is not divisible by p, we see: p−2 ! jk logp(σj(α)) = logp ∑ akζ k=0 − ! p 2 a = log (α(1)) + log 1 − k (1 − ζ jk) p p ∑ ( ) k=1 α 1 − ! p 2 a ≡ log 1 − k (1 − ζ jk) mod p. p ∑ ( ) k=1 α 1 h n i Similarly, since ∂ log(α(1)) = 0 we get ∂X X=0 " − #  ∂ n   ∂ n p 2 a log(α(eX)) = log(1 − k (1 − ekX)) . ∂X ∂X ∑ α(1) X=0 k=1 X=0 44 singular integers

So, from formula (5.1) we see that `n(α) is n! times the n-th coef- p− ( − 2 ak ( − kX)) ficient of log 1 ∑k=1 α(1) 1 e as a power series in X. On the other hand

p−1 −n εn logp(α) ≡ − ∑ j σj(logp(α)) mod p j=1

p− ( − 2 ak ( − is, by formula (5.2), equal the n-th coefficient of logp 1 ∑k=1 α(1) 1 jk ζ )) as a polynomial in Fp[j]. The rest of the proof will consist of calculating the n-th coefficient in both cases. Suppose that we have

− ! p 2 a log 1 − k (1 − Yk) = b Yi (5.3) ∑ ( ) ∑ i k=1 α 1 i≥0

for some coefficients bi ∈ Q, as a formal power series in Y. Substituting Y = eX in (5.3), we get

− ! p 2 a log 1 − k (1 − ekX) = b eiX. ∑ ( ) ∑ i k=1 α 1 i≥0

Then we see that

s !  s  iX (iX) ∑i≥0 bii s ∑ bie = ∑ bi ∑ = ∑ X . i≥0 i≥0 s≥0 s! s≥0 s!

So in the end we find  n  ∑i≥0 bii n `n(α) ≡ n! ≡ ∑ bii mod p. n! i≥0

On the other hand, substituting Y = ζ j in (5.3), we get

− ! p 2 a log 1 − k (1 − ζ jk) = b ζij. p ∑ ( ) ∑ i k=1 α 1 i≥0

p− 2 ak ( − jk)) ≡ Because ∑k=1 α(1) 1 ζ 0 mod p, this series actually converges. ij We want to write ∑i≥0 biζ as an element of Fp[j]. First note that ij ij+p ( r ) ≡ ( r ) mod p, and therefore

+ − ij p ij + p p 2 ij ζij = ζij+p = ∑ (−1)r πr ≡ ∑ (−1)r πr mod p. r=0 r r=0 r This ensures that we do not have to differentiate between the case where ij > p and the case where ij < p. Moreover, we have

ij r (−i)c r = (−1)r ∑ jc. r c=1 r! c 5.3 the p-adic logarithm 45

So, combining these equations, we get

− − − p 2 r (−i)c r p 2 p 2 ic r ζij ≡ ∑ ∑ πr jc ≡ ∑ ∑ (−1)c πr jc mod p. r=0 c=1 r! c c=1 r=c r! c Finally, we have

p−2 p−2 c   ij c i r r c ∑ biζ ≡ ∑ bi ∑ ∑ (−1) π j i≥0 i≥0 c=1 r=c r! c p−2 ! p−2   ! c c 1 r r c ≡ ∑ (−1) ∑ bii ∑ π j . c=1 i≥0 r=c r! c

n We know that εn logp(α) is equal to the coefficient of j , and thus we find ! p−2   ! n n 1 r r εn logp α ≡ (−1) ∑ bii ∑ π i≥0 r=n r! n p−2   ! n 1 r r ≡ (−1) `n(α) ∑ π mod p. r=n r! n

π3 4 The theorem again gives us ε3 logp(x + yζ) ≡ `3(x + yζ) 6 mod π . We get some easy consequences for the p-adic expansion of logp(α).

Corollary 5.10. For α ∈ OK \ πOK and 1 ≤ n ≤ p − 2 we have:

n 1. εn logp(α) ≡ 0 mod π

n+1 2. εn logp(α) ≡ 0 mod π if and only if εn logp(α) ≡ 0 mod p if and only if `n(α) ≡ 0 mod p

n 3. logp(α) ≡ 0 mod π if and only if εk logp(α) ≡ 0 mod p for all 1 ≤ k ≤ n − 1.

 p−2 1 r r n Proof. For the first part, just note that ∑r=n r! [n]π ≡ 0 mod π .  p−2 1 r r n πn πn n+1 In fact, ∑r=n r! [n]π ≡ [n] n! ≡ n! mod π . So εn logp(α) ≡ 0 n+1 mod π if and only if `n(α) ≡ 0 mod p if and only if εn logp(α) ≡ 0 mod p. For the last part, note that

p−2 logp(α) ≡ ∑ εk logp(α) mod p. k=0

2 Since εk logp(α) ≡ 0 mod π for k ≥ 2, we see that logp(α) ≡ 0 2 3 mod π if and only if ε1 logp(α) ≡ 0 mod p. By looking modulo π , 3 we see that logp(α) ≡ 0 mod π if and only if ε1 log( α) and ε2 logp(α) are congruent to 0 modulo p. Continuing in similar fashion, we find n that logp(α) ≡ 0 mod π if and only if εk logp(α) ≡ 0 mod p for all 1 ≤ k ≤ n − 1. 46 singular integers

This shows that we cannot hope to get more information about singular integers using the p-adic logarithm, than we could get by

using the logarithmic derivative. Looking at εn logp(α) mod p and at `n(α) basically amounts to the same thing. KUMMEREXTENSIONS 6

Our discussion of the classical approach to Fermat’s last theorem naturally led us to consider singular integers. That is α ∈ Q(ζ p) such p that αZ[ζ p] = I for some ideal I of Z[ζ p]. There is an intimate connection between singular integers in a number field and cyclic extensions of this field. In this chapter we explore this connection. In the first two sections, we discuss Kummer extensions of a field and the ramification of primes in these extensions. This can, for exam- ple, be found in [1] and [14]. In the last two sections we restrict our attention to extensions of cyclotomic fields. These sections are for a large part inspired by [36]. The last section concerns the Hilbert class field and the Herbrand-Ribet Theorem. We give a proof of Herbrand’s direction of the Herbrand-Ribet theorem. The structure is very similar to Herbrand’s original article [15]. Though the contents of this chapter are not new, it still seems good to have this chapter in the thesis. Singular integers in Q(ζ p) are very naturally connected to unramified extensions of Q(ζ p). Not including a chapter on this relation would seems like a waste, after we have talked about singular integers so much. We tried to make this chapter as concrete as possible. In general, in class field theory the constructions are quite abstract, and concrete constructions of the Hilbert class field of a number field are difficult. At least in the case of Q(ζ p), we can give explicit units generating the unramified extensions. Moreover, we can use the results of the previous chapter, to explain how the Galois group of Q(ζ p) acts on the Galois group of the Hilbert class field.

6.1 kummer extensions

The theory of Kummer extensions can be found in many sources. A classic text is the third chapter of [1]. Let n be a fixed throughout this chapter, and K be a number field, which contains a primitive n-th root of unity ζ = ζn. So in this section, and in the next section, K does not denote the cyclotomic field Q(ζ p). From the third section onwards, K = Q(ζ p) like in the rest of this thesis. We will classify all cyclic extensions of K of degree n. √ Lemma 6.1. If a ∈ K∗ has order n in the group K∗/(K∗)n then K( n a)/K is a cyclic extension of degree n. Conversely, if L/K √is a cyclic extension of degree n, then there is some a ∈ K such that L = K( n a).

47 48 kummer extensions

√ Proof. Suppose a has order n in K∗/(K∗)n and let α = n a. Since K contains the n-th roots of unity, the extension K(α)/K is Galois. We

have n [K(α):K]   a = NK(α)/K(a) = NK(α)/K(α) , and because a has order n in K∗/(K∗)n, we find that n divides [K(α) : K]. Hence we find [K(α) : K] = n and Xn − a is irreducible. Finally, we note that if σ ∈ Gal(K(α)/K), then σ(α) = ζiα for some i ∈ Z/nZ and σ is completely determined by this i. This means we get an injective group homomorphism from Gal(K(α)/K) to the group of n- th roots of unity, given by σ 7→ σ(α)/α. Since |Gal(K(α)/K)| = n, this homomorphism must be an isomorphism and consequently K(α)/K is a cyclic extension. This completes the proof of the first part. Now suppose that L/K is a cyclic extension of degree n and let σ be a generator of the corresponding√ Galois group. We will explicitly construct an a ∈ K such that L = K( n a). By the normal basis theorem (see, for example, section 3.2.5 of [5]), there is some γ ∈ L such that L = K(γ) and γ, σ(γ), σ2(γ),..., σn−1(γ) forms a basis for L/K. n−1 i i Define α = ∑i=0 ζ σ (γ) and note that α 6= 0 because the elements σi(γ) are linearly independent over K. We find

n−1 n−1 σ(α) = ∑ ζiσi+1(γ) = ∑ ζi−1σi(γ) = ζ−1α. i=0 i=0

This implies that σ(αn) = αn, so αn ∈ K. Similarly, σ(αr) 6= αr, so αr 6∈ K, for 1 ≤ r ≤ n − 1. Let a := αn. Then ar = cn for some c ∈ K if and only if for some integer j, αr = ζ jc ∈ K. Since αr is not in K ≤ ≤ − ∗ ( ∗)n for 1 r n 1 we find that a has order√ n in K / K . Hence, the first part of the lemma tells us that K( n a) is a cyclic extension of K of degree n contained in L, and therefore it must by equal to L.

The following lemma tells us when two elements a and b generate the same cyclic extension. √ ∗ ∗ ∗ n n Lemma√ 6.2. Let a, b ∈ K both have order n in K /(K ) . Then K( a) = K( n b) if and only if there is some c ∈ K, and r ∈ Z relatively prime to n, such that a = brcn.

Proof. Suppose that a = brcn. Then √ √ √ K( n a) = K( n brcn) = K(( n b)r). √ √ Obviously we have ( n b)r ∈ K( n b). Conversely, since r and n rela- tively prime, there exist integers k, l such that rk + nl = 1. Then we see √ √ √ √ r n b = ( n b)rk+nl = ( n b)rkbl ∈ K( n b ). √ √ ( n ) = ( n ) Hence K a K b . √ ( ( n ) ) For the other direction, note that any element√ of Gal√ K a /K is completely determined by what it does to n a or to n b. Suppose that 6.2 ramification 49

√ √ √ √ ( n ) = n ( ( n ) ) ( n ) = σ √ a ζ a. Then σ generates Gal K a /K , and√ thus σ b r n n ζ √b for some r relatively prime to n. The powers of√ a form a basis√ of n n n−1 n i K( a)/K, so there are elements ci ∈ K such that b = ∑i=0 ci( a) . This means we also have n−1 √ √ √ n−1 √ r n i r n n i n i ∑ ciζ ( a) = ζ b = σ( b) = ∑ ciζ ( a) . i=0 i=0 √ n Since the powers of a are linearly√ independent√ over K, we find that n n r only cr can be non-zero, and thus b = cr( a) . Raising this to the n r n-th power gives b = cr a , as desired. Combining these two lemmas gives us the following theorem.

Theorem 6.3. There is a bijection between the cyclic subgroups of K∗/(K∗)n of order n and the cyclic extensions of K of degree n. It is given by √ hai 7→ K( n a).

Proof. This is a direct consequence of the previous two lemmas. Lemma 6.2 tells us that the map is well-defined and injective, and Lemma 6.1 tells us that the map is surjective.

In fact we can get a more general theorem. We say that a group G has exponent n if for every g ∈ G we have gn = 1. We call a field extension L/K a Kummer n-extension if K contains the n-th roots of unity and L/K is Galois with Gal(L/K) a finite of exponent n.

Theorem 6.4. There is a bijection between the finite subgroups of K∗/(K∗)n of and the Kummer n-extensions of K. It is given by √ W 7→ K( n W) √ where K( n W) denotes the field obtained by adjoining the n’th roots of all elements of W to K.

Proof. This can found, for example, as Theorem 8.1 of chapter 5 of [17]. Since we will be mainly interested in cyclic extensions, we will not reproduce the proof here. The main idea is to write an abelian group of exponent n as the direct sum of cyclic groups of order dividing n. Then the result basically follows from the previous theorem.

6.2 ramification

Let K be like in the previous section, and suppose we are given a cyclic Kummer extension L/K of degree n. What can we say about the splitting√ behaviour of the primes of K? We know that L is given by K( n a) for some a ∈ K∗, and that a is a generator of a subgroup of K∗/(K∗)n of order n. 50 kummer extensions

Lemma 6.5. The discriminant of L divides nnan−1, so a prime p of K which does not divide na is unramified. In this case the residue degree is the least number f such that xn ≡ a f mod p is solvable in K. √ O O  n  Proof. Let K be the ring of integers of K. We know that √K a  n  is a submodule of the ring of integers OL of L. Since OK a has discriminant nnan−1, we see that discriminant of L must divide this number. Let p be a prime of K not dividing na. Then it is unramified, so its√ residue degree is equal to the degree of the field extension ( n ) n − f p Kp a /Kp. If x a has a root modulo , then Hensel’s√ lemma says  n  that this roots lifts to a root in Kp. So we see that Kp( a) : Kp is at most f . On the other hand, √ √ n √ n n √ [Kp( a):Kp] n ( ) = n ( ) = NKp( a)/Kp a NKp( a)/Kp a a . √  n  Therefore f is at most Kp( a) : Kp , and thus they must be equal.

In particular, the lemma above tells us that the prime p splits com- pletely if and only if the equation xn ≡ a mod p is solvable. So, just as splitting of primes in quadratic extensions is related to quadratic residues, so is splitting of primes in cyclic extensions related to n-th power residues. Now let us consider the primes that divide a. Theorem 6.6. Let p be a prime of K that divides a and let r be the exact power p ∈ p of √dividing a K. If r is relatively prime to n, then is totally ramified in ( n ) p ( ) = p K a . Suppose√ that - n and gcd r, n s. Then√ is unramified in the ( s ) p ( s ) extension√ K a and the prime factors of in K a are totally ramified in K( n a).

Proof. Suppose that p is a prime of K such that pr | a, but pr+1 - a for some integer r ≥ 1. First note that we may assume 0 ≤ r ≤ n − 1. = + ≤ ≤ − Indeed, if r s tn for some integers s, t with 0 s√ n 1 then√ we can pick an element b ∈ p−t and use the fact that K( n a) = K( n abn). r = p L K p = g Pe Suppose that 1 and that √factors in / as ∏i=1 i . Then n for each i we see that Pi | p + ( a) and therefore √ n n n Pi | (p + ( a)) . √ n/e g n n n So, in fact, p = ∏i=1 Pi divides (p + ( a)) . If e < n this implies that p2 | a, which is in contradiction with the fact that r = 1. Thus we must have e = n. = p Hence, in the case where√ r 1, we see that is totally ramified and in fact p = (p + ( n a))n. Alternatively, we could have used the fact that xn − a is an Eisenstein polynomial if r = 1 to show that p is totally ramified. If gcd(r, n) = 1, we can always reduce to the case r = 1 by using the + = same trick as above. Pick some h, g such that gr hn√ 1 mod√p and pick an element b ∈ ph which is not in ph+1. Then K( n a) = K( n agbn), = and therefore we are back in the case r 1. The final part immediately√ follows by applying the previous parts to the extension K( s a). 6.2 ramification 51

As a corollary, we find a relation between ramification of primes in extensions of K and singular integers. ∈ ∗ O = n Corollary 6.7. Let a K be relatively√ prime to n. Then a K I for O ( n ) some ideal I of K if and only if K a /K is unramified√ √ at the primes not dividing n, and this happens if and only if K( n a) = K( n u) for some ∗ u ∈ OK. √ Proof. If a is singular, then the previous theorem tells us that K( n a)/K is unramified outside n. The converse holds as long as a and n are relatively prime. This deals with the√ first ‘if and only if’. For the second one, we obviously have that K( n u)/K is unramified outside ∗ n n for u ∈ OK. In the other direction, if (a) = I , we can pick some ∈ ( ) = − ( ) p b √K with √ordp b ordp I for all prime ideals of K. Then K( n a) = K( n abn) and abn is a unit.

So, the extensions of K of degree n which√ are unramified outside n are precisely the extensions of the form K( n u) for some unit u. This leaves only the ramification at the primes of K which divide n. However, no general results are known, except for the case where = n p is a prime integer.√ In this case, the splitting behaviour of the primes p above p in K( p a)/K is intimately related to solutions of the equation xp ≡ a mod pb for integers b. The following can be found as exercises 9.1-9.3 of [36] or in section 38 of [14]. p−1 Let n = p be a prime. We know that (p) = (1 − ζ p) in the field extension Q(ζ p)/Q. Hence if p is a prime of K above p, then p divides 1 − ζ. If the exact power dividing 1 − ζ is pt, then the exact power of p pt(p−1) ∈ O p = dividing p is . Assume√ that a K is such that ξ a has no p solution in OK, and hence K( a) is a cyclic extension of K. By Fermat’s little theorem we have aN(p) ≡ a mod p. So ξ p ≡ a mod p always has p b a solution in OK. In fact, ξ ≡ a mod p may have a solution in OK > for b 1. The following theorem shows that√ the maximum value of b is related to the ramification above p in K( p a)/K.

p = ( − ) Theorem 6.8. Let be a prime of K dividing p and let t ordp√1 ζ . p Let a ∈ OK such that it is not a p-th power in OK. Let L = K( a) and p b let b ∈ Z>0 ∪ {∞} be maximal such that ξ ≡ a mod p has a solution p b in OK. (With b = ∞ we mean that ξ ≡ a mod p has a solution for any b > 0.) Then p splits completely in L/K if and only if b > tp if and only if b = ∞. The prime p is inert if and only if b = tp, and p is totally ramified if and only if b < tp.

Proof. Suppose p splits completely, and let P be a prime factor of p in p ∈ N L. Because splits completely, we know that for any b we have√ b b p [OL/P : OK/p ] = 1. Hence there is some ξ ∈ OK such that ξ ≡ a mod Pb. But then we see that ξ p ≡ a mod pb. p tp+1 Conversely, suppose that ξ ≡ a mod p for some ξ ∈ OK. Let ρ ∈ K be such that ordp(ρ) = −t and ordq(ρ) ≥ 0 for all other primes 52 kummer extensions

√ q of K. We consider the element β = ρ( p a − ξ). Then β is a root of the polynomial p  p  (x + ρξ)p − ρpa = xp + ρξxp−1 + ··· + ρp−1ξ p−1x + ρp(ξ p − a). 1 p − 1 pt(p−1) p i Since divides p and ordp(ρ) = −t, we find that ( i )ρ is in p tp p p OK for i = 1, . . . , p − 1. Since ξ − a ∈ p , we see that ρ (ξ − a) is in OK. So the polynomial has coefficients in OK, and thus β is in O p( p − ) p L. Its norm is ρ ξ a , which is divisible by √precisely√ when p ≡ ptp+1 ∈ ( ) ( p ) = p ξ a mod √ . If σ Gal L/K is given by σ a ζ a then β − σ(β) = ρ p a(1 − ζ) 6∈ p. Thus, if P is the prime factor of p containing β then σ(P) 6= P. Because L/K is Galois and of prime degree, this means that p splits completely. So, we have shown that p splits completely if and only if b ≥ tp + 1. If b is exactly tp then β is still an element of OL. Moreover, note that σi(β) − σj(β) is relatively prime to p. Hence also ∏ (σi(β) − σj(β)) 1≤i6=j≤p−1

is relatively prime to p. We conclude that the discriminant ideal ∆L/K contains an element relatively prime to p, and thus p does not divide it. Therefore p is unramified in L/K. By the first part of the theorem we know that p does not split, so p must be inert in L/K. We can finish the proof by showing that if b < tp, then p is totally ramified. Assume that we have b < tp. By Fermat’s little theorem, b is at least 1. We first show that b is not divisible by p. Suppose there is some 0 < c ≤ t − 1 and ξ ∈ K such that a ≡ ξ p mod pcp.

Let λ ∈ OK satisfy ordp(λ) = c. Then for any κ ∈ OK we find (ξ + λκ)p ≡ ξ p + λpκp mod pcp+1, since all other terms are divisible by λp. Now a − ξ p ∈ ppc and λp 6= 0 as an element of ppc/ppc+1. Hence we can pick κ in such a way that ξ p + λpκp ≡ a mod ppc+1, and therefore b is at least pc + 1. So we know that b = cp + v for some c ≤ t − 1 and 1 ≤ v ≤ p − 1. ∈ ( ) = − ( ) ≥ q Let ρ K satisfy ordp ρ c and ordq ρ √0 for other primes . In the same way as above, we see that β := ρ( p a − ξ) is an element of p p OL which is not divisible by p. Now, NL/K(β) = ρ (ξ − a) is divisible exactly by pv, but p does not divide β. This means that p cannot be inert in L/K, and by the first part p cannot split in L/K. Since the extension is of prime degree, we conclude that p must be totally ramified.

6.3 units in cyclotomic fields

In this section we apply the results of the previous two sections to extensions of degree p of cyclotomic fields. Our main reference for this 6.3 units in cyclotomic fields 53

part is chapter 8 of [36]. From now on, we set ζ = ζ p and K = Q(ζ) = ( Q) and we set G Gal K/ . In the previous section we saw that√ an extension L/K is unramified outside of p if and only if L = K( p u), ∗ for some unit u ∈ OK, so in this section we will investigate the units of K. To understand the unramified extensions, we only need to care about the units up to p-th powers, thus we are interested in the group ∗ ∗ p OK/(OK) . The subgroups of this group correspond to the extensions of K unramified outside p. ∗ + −1 To ease the notation, we will set E := OK, K = K ∩ R = Q(ζ + ζ ) + + and E := OK+ . By Dirichlet’s unit theorem, we know that E and E p−3 are both groups whose non-torsion part has rank 2 . We also know that for every u ∈ E there is some integer i such that ζiu is an element of E+. In general, we do not have a nice description of the fundamental units generating E, but we do have a nice discription for a subgroup of E of finite index. Define the group of cyclotomic units to be the 1−ζi subgroup C of E generated by −ζ and the units 1−ζ for i = 1, . . . , p − 1. − 1−ζ i i 1−ζi Note that 1−ζ = −ζ 1−ζ which means that the unit

i 1−i 1 − ζ ξ := ζ 2 i 1 − ζ

+ + + is a real unit. Let C := C ∩ E , so that C is generated by the ξi for p−1 i = 2, . . . , 2 . To get more information about the units we need to consider the action of the Galois group G. Since multiplying a unit by the p-th power of another unit will give us the same extension of K, we are p mostly interested in the group Ep := E/E . Similarly, we will set + + + p p + + + p Ep = E /(E ) , Cp := C/C and Cp := C /(C ) . Note that, since any unit u ∈ E can be written as u = ζrv, for some r ∈ Z and v ∈ E+, + we have Ep = hζi ⊕ Ep . All these above groups have an Fp[G]-module structure, with the obvious action of G, so we can use the idempotents of Fp[G] to decompose these group (see the beginning of chapter 5 for an introduction). Then we see

p−2 M εi Ep = Ep . i=0

p−1 ε0 εi Note that ε0 = ∑a=1 σa, and thus Ep = N(Ep) = 1. Since Ep consists i of the elements u such that uσa = ua , for all 1 ≤ a ≤ p − 1, we see that ε1 the roots of unity are elements of Ep . Furthermore, if v is real and i is odd, then i vεi = vσ−1εi = v(−1) εi = v−εi , and therefore vεi = 1. This means that we have: 54 kummer extensions

Proposition 6.9.

p−3 p−3 M εi + M εi Ep = hζi ⊕ Ep and Ep = Ep i=2 i=2 i even i even

The proof is immediate from the comments above.

+ Lemma 6.10. As an Fp[G]-module, Cp is generated by ξg, where g is a generator of (Z/pZ)∗.

Proof. If a = gr, then:

r − j+1 − 1−gr g r 1 gj−gj+1 g r 1 j 1 − ζ 1 − ζ σg ξ = ζ 2 = ζ 2 = ξ . a − ∏ gj ∏ g 1 ζ j=0 1 − ζ j=0

εi Define Ei := ξg . By the previous lemma, we now know that

p−3 +εi + M Cp = hEii and Cp = hEii. i=2 i even

εi For each even 2 ≤ i ≤ p − 3, we see that hEii is a subgroup of Ep . εi Moreover, we see that hEii 6= Ep if and only if Ei is the p-th power of some unit in E. Although we will not prove it, at this point it is nice to note the following theorem, which is Theorem 8.2 of [36].

Theorem 6.11. The index of the real cyclotomic units inside the group of all real units is given by [E+ : C+] = h+ where h+ is the class number of K+.

+ + + As a consequence, we see that if p does not divide h , then Ep = Cp and thus none of the units Ei is a p-th power of a unit in E. Vandiver’s conjecture states that p never divides h+, which has been verified for p < 4000000, though it is not clear whether this means the conjecture should be true (see the discussion on page 158 of [36]). p We have defined the elements Ei as elements√ of E/E . In the theorem p below, we talk about the extension K( Ei)/K. Here we pick any lift of Ei to E, and then take the p-th root. The extension we get clearly does not depend on the particular lift we have chosen.

≤ ≤ − ∈ Z Z Theorem√ 6.12. For each even 2 i p 3 and r /p , if the exten- p r sion K( ζ Ei)/K is an unramified extension, then p divides the Bernoulli number Bi. 6.4 unramified extensions 55

√ ( p rE ) ∈ Proof. If K ζ i /K is the trivial√ extension, then there is some η p r p r OK such that η = ζ Ei. If K( ζ Ei)/K is a unramified extension p r p of degree p then we have η ≡ ζ Ei mod π for some η ∈ OK, by Theorem 6.8. In both cases we see that

r 0 ≡ p logp(η) ≡ logp(ζ Ei) ≡ logp(Ei) mod p.

εi Since Ei = ξg , we can use Theorem 5.9 of chapter 5 to see that

p−2   ! 1 r r logp(Ei) ≡ εi logp(ξg) ≡ `i(ξg) ∑ π mod p. r=i r! i

So logp(Ei) ≡ 0 mod p if and only if `i(ξg) ≡ 0 mod p. 1−g 1−ζg 2 Because ξg = ζ 1−ζ , and because `i is multiplicative, we find 1 − ζg  ` (ξ ) = ` . i g i 1 − ζ

g i Hence, `i(ξ ) is equal to i! times the coefficient of X of the power egX −1 series log ex−1 . Note that we have: d egX − 1 gegX eX log = − dX eX − 1 egX − 1 eX − 1  1   1  = g + 1 − + 1 egX − 1 eX − 1 1  gX X  = − + g − 1. X egX − 1 eX − 1

m X = ∞ Bm X By definition of the Bernoulli numbers we have eX −1 ∑m=0 m! . So combinining all of this, we see that

i `i(ξg) ≡ (g − 1)Bi mod p.

Because g is a generator of (Z/pZ)∗ and i < p − 1, we know that i g − 1 is never congruent to 0 modulo p. Hence `i(ξg) ≡ 0 mod p if and only if Bi ≡ 0 mod p.

6.4 unramified extensions

We will use class field theory to see how unramified extensions of K relate to the class group of K. This allows us to prove the Herbrand’s direction of the Herbrand-Ribet theorem, by constructing a certain unramified extension of K. Suppose that L/K is an extension of degree√ p, which is unramified outside p. Then, by Corollary 6.7, L = K( p u) for some unit u ∈ E. Because of the results of the previous section, if Vandiver’s conjecture holds, we must have q pp p L ⊂ M := K(ζ p2 , E2,..., Ep−3). 56 kummer extensions

Because the Galois group Gal(M/K) is an elementary p-group (all ele- ments other than the identity have order p), we call M a p-elementary extension of K. In fact, under Vandiver’s conjecture, M is the maximal p-elementary extension of K that is unramified outside p. Since M/K is an , there is an action of G on Gal(M/K) given by conjugation: If τ ∈ Gal(M/K) and σˆ is any lift of σ ∈ G to M, then we define

τσ := στˆ σˆ −1.

This is well-defined (it does not depend on which lift of σ we choose) because M/K is abelian. Because Gal(M/K) is an elementary p-group, this action extends to an action of Fp[G] on Gal(M/K) by putting

p− 1 n σ n σ n σ τ∑a=1 a a := (τ 1 ) 1 ··· (τ p−1 ) p−1 .

In this way, we get a decomposition of Gal(M/K) using the orthogonal idempotents εi ∈ Fp[G]. The following lemma describes the different components of Gal(M/K) of this decompostion.

Lemma 6.13. For i = 0, . . . , p − 2 and τ ∈ Gal(M/K) we have:

ε0 εi 1. τ (ζ p2 ) = τ(ζ p2 ) and τ (ζ p2 ) = ζ p2

ε pp pp 2. τ i ( Ej) = Ej for j 6= p − i

ε pp pp 3. τ i ( Ep−i) = τ( Ep−i) for i ≥ 3 odd

Proof. Let σˆa be a lift of σa to M. Then

p a σˆa(ζ p2 ) = σa(ζ p) = ζ p,

and therefore we see that

a+sp ( 2 ) = σˆa ζ p ζ p2 ,

for some s ∈ Z/pZ. If τ ∈ Gal(M/K), then τ(ζ p) = ζ p and thus

1+tp ( 2 ) = τ ζ p ζ p2 ,

for some t ∈ Z/pZ. Combining this, we see that

−1 ai −1 ai a+sp ( 2 ) = ( ) σˆa τ σˆa ζ p σˆa τ ζ p2 i = −1( a(1+a tp)+sp) σˆa ζ p2 i = 1+a tp ζ p2 .

So we conclude that

 p−1 i 1− ∑a=1 a tp εi ( 2 ) = τ ζ p ζ p2 . 6.4 unramified extensions 57

ε0 εi Hence τ (ζ p2 ) = τ(ζ p2 ) and τ (ζ p2 ) = ζ p2 for all i = 1, . . . , p − 2. That deals with the first part of the lemma. εj For the second and third part, note that Ej = ξg , and thus σa(Ej) = aj p Ej v1 for some v1 ∈ E. Hence, we find that q q aj p p σˆa( Ej) = v2 Ej ,

pp t pp for some v2 ∈ E. We know that τ( Ej) = ζ Ej, for some t ∈ Z/pZ, and therefore  j  i q  i q a −1 a p −1 a p σˆa τ σˆa Ej = σˆa τ v2 Ej

 j  j+i q a −1 a t p = σˆa v2ζ Ej

j+i−1 √ = ζa t p E. We conclude that   q − p−1 j+i−1 q εi p ∑a=1 a t p τ ( Ej) = ζ Ej.

ε pp ε pp So τ i fixes Ej if j 6= p − i, and if j = p − i, then τ i ( Ej) = pp τ( Ej). The lemma tells us that the restriction map induces an isomorphisms ε pp Gal(M/K) i → Gal(K( Ep−i)/K) for i = 3, . . . , p − 2 odd, and it ε0 induces an isomorphism Gal(M/K) → Gal(K(ζ p2 )/K). Now we will use some results from class field theory. A classic reference for this is [1]. Another book is [17], which gives a rather concrete approach to class field theory, and fits quite well with this chapter. We give a short summary of the things we need. Class field theory gives us the existence of the Hilbert class field H = H(K) of K. It is the maximal abelian unramified (at all the primes, also the infinite ones) extension of K. It has the property that Gal(H/K) is isomorphic to the class group C = CK of K. The isomorphism is given by the Artin map: If q is a prime of K, which does not ramify in H/K and Q is a prime of H above q, then there N (q) is a unique ϕQ ∈ Gal(H/K) such that ϕQ(x) ≡ x K/Q mod Q for all x ∈ OH. This map ϕQ is called the Frobenius automorphism 0 corresponding to Q. If Q is another prime of H above q, then ϕQ and ϕQ0 will be conjugate elements. Since Gal(H/K) is abelian, this means that ϕQ = ϕQ0 , so we can denote it by ϕq. This means we have a map q 7→ ϕq from the set of unramified primes of K to Gal(H/K). This induces a map ϕ : CK → Gal(H/K), called the Artin map. Class field theory tells us that this map is in fact an isomorphism. Just like before for the field M, there is an action of G on Gal(H/K), given by conjugation. We have another action of G on Gal(H/K) via the class group. If τ ∈ Gal(H/K) and σ ∈ G we can define τσ := ϕσϕ−1(τ). 58 kummer extensions

The following lemma tells us that these actions are in fact the same.

Lemma 6.14. Let σˆ be a lift of σ ∈ G to H and let τ ∈ Gal(H/K). Then we have ϕσϕ−1(τ) = στˆ σˆ −1.

Proof. Let [q] ∈ C be such that ϕ(q) = τ. Then by definition we have

τσˆ −1(x) ≡ σˆ −1(x)NK/Q(q) mod Q.

Hence we see

στˆ σˆ −1(x) ≡ xNK/Q(q) mod σˆ (Q).

So ϕσϕ−1(τ) = στˆ σˆ −1.

Because C is a finite abelian group, we can write it like the direct sum of its Sylow subgroups. In particular, we can write C = Cp ⊕ D, where Cp is the p-part of C and D is the complement of Cp in C. ϕ(D) We define Hp = H , the subfield of H fixed by all elements of ϕ(D). Then the restriction map from H to Hp induces an isomorphism ϕ(Cp) → Gal(Hp/K). We call Hp the p-part of H. We know there is an action of G on C and Gal(Hp/K). Because these are p-groups (all elements have order a power of p), this action extends to an action of Zp[G]. Now we are ready to prove the following theorem.

εi Theorem 6.15 (Herbrand). For i = 3, . . . , p − 2 odd, if the group Cp is non-trivial, then p divides the Bernoulli number Bp−i. Proof. Let 3 ≤ i ≤ p − 2 be odd. Then we see:

εi 1−εi Cp is nontrivial ⇐⇒ Cp 6= Cp 1−ε ϕ(Cp) i ⇐⇒ Hp /K is a nontrivial extension ⇐⇒ ∃ L/K with [L : K] = p and τεj (L) = L

∀ τ ∈ Gal(Hp/K), ∀ j 6= i

If Ep−i is the p-th power of some unit of E, then as we saw in the ε p−i proof of Theorem 6.12, p must divide Bp−i. Otherwise Ep = hEp−ii. pp From Lemma 6.13 we conclude that L ⊂ K(ζ p2 , Ep−i). Since [L : K] = p, we find q p r L = K( ζ Ep−i), for some r ∈ Z/pZ. Since L is contained in the Hilbert class field, L/K must be an unramified extension. So, by Theorem 6.12, the Bernoulli number Bp−i is divisible by p.

This proof is quite similar to Herbrand’s proof in [15]. The other εi direction, showing that if p | Bp−i, then Cp is non-trivial, is due Kenneth Ribet [30]. He used techniques from algebraic geometry to 6.4 unramified extensions 59 construct unramified extensions. Later, a more elementary proof was found (see chapter 15 of [36]). This finishes the chapter on extensions of Q(ζ p). We see that the singular integers we investigated in previous chapters, give us the extensions of Q(ζ p) that are unramified outside of p. The proof of The- orem 6.15 can be seen as a nice application of Kummer’s logarithmic derivative.

REGULARPRIMES 7

In this chapter we look at the equation

N(x + yζ) = zp (7.1) for pairwise coprime integers x, y, z and p ≥ 5 a regular prime. The fact that p is regular means we have extra knowledge of the units of K, which we try to exploit. First we show that there is no solution to (7.1) in the ‘first case’, i.e. when p - xy(x − y). This follows immediately from what we looked at in chapter 5. However, because p is regular, we get extra information. Then we will see that if there is a solution to (7.1) for a regular prime p, then xy(x − y) must be ‘big’ in the sense that it must be divisible by all the ‘small’ primes congruent to 1 modulo p. More concretely, in Corollary 7.4, we show that if q = np + 1 is prime, with 2 ≤ n ≤ p − 3, then q must divide xy(x − y). In the computational section we use this, together with extra data, to show that if p ≥ 5 is a regular prime and x and y are coprime integers such that N(x + yζ) = zp, then max(|x|, |y|) > 5 × 108 (Theorem 7.8).

7.1 theoretical results

We are interested in the equation

N(x + yζ) = zp for pairwise coprime integers x, y, z. Note that, for any integers a and b, if we write N(a + bζ) = w, then we see that N(wa + wbζ) = wp is a solution to (7.1). So, if we would not restrict to the case of coprime integers, then we would immediately have infinitely many solutions. We will henceforth restrict to the case of coprime integers x and y. Note that if a prime divides any two of x, y and z, then this prime will also divide the third one. So assuming that x and y are coprime is the same as assuming that x, y and z are pairwise coprime.

Lemma 7.1. If p is an odd prime and x and y are coprime integers such that N(x + yζ) = zp, then p does not divide z.

p−1 Proof. Since pOK = π OK, we see that p divides z if and only if π divides x + yζ, and this happens if and only if p divides x + y. If p p divides z then p divides N(x + yζ) and thus ordπ(N(x + yζ)) ≥ i p(p − 1). Since ordπ(x + yζ) = ordπ(x + yζ ) for any i = 1, . . . , p − 1, we see that ordπ(x + yζ) ≥ p, and thus p divides x + yζ. Because

61 62 regularprimes

p−2 1, ζ,..., ζ form a basis of OK as a Z-module, we find that p divides both x and y. This is in contradiction with our assumption that x and y are coprime.

If p = 3, then any element of Z[ζ3] is of the form x + yζ3, so 3 3 N(x + yζ3) = z if and only if x + yζ3 = uα for some unit u ∈ Z[ζ3] and α ∈ Z[ζ3]. In particular, the case p = 3 has many solutions.

Theorem 7.2. Let p be an odd, regular prime and let x and y be coprime integers such that N(x + yζ) = zp. Then p divides xy(x − y).

Proof. From chapter 4 we know that if x and y are coprime, the ele- ments x + yζi and x + yζ j are coprime whenever i 6≡ j mod p. Hence p (x + yζ)OK = I for some ideal I of OK. Because p is regular, I = γOK for some γ ∈ OK. Thus we find that

x + yζ = uγp (7.2)

for some unit u of OK. Now use the logarithmic derivative, or the p-adic logarithm, as in chapter 5.

We can be more specific about the unit u appearing in equation (7.2).

Theorem 7.3. If p is an odd, regular prime and x, y are coprime integers p satisfying N(x + yζ) = z , then one of the following holds for some γ ∈ OK:

1.p | x and x + yζ = ζγp

2.p | y and x + yζ = γp

3.p | x − y and x + yζ = (1 + ζ)γp

p ∗ Proof. We have x + yζ = uγ0 for some unit u ∈ OK and γ0 ∈ OK. In Lemma 7.1 we saw that the assumption of the theorem imply that p p does not divide x + y. Therefore we have γ0 ≡ g mod p for some integer g not divisible by p. So g has an inverse modulo p, and we get

g−1x + g−1yζ ≡ u mod p.

By Theorem 7.2, we know that p divides xy(x − y). If p divides x, then ζ−1u is congruent to an integer modulo p. Since p is regular, Kummer’s lemma tells us that ζ−1u = vp for some unit v. Putting p γ = vγ0, we find that x + yζ = ζγ . If p divides y, then u ≡ g−1x mod p and thus u = vp for some unit p v. Again putting γ = vγ0, we get that x + yζ = γ . Finally, suppose that p divides x − y. Then we have x + yζ ≡ x(1 + ζ) mod p. Since 1 + ζ is a unit we find that there is a unit v such that u p p 1+ζ = v . Putting γ = vγ0, we obtain x + yζ = (1 + ζ)γ . 7.2 computational results 63

Bringing the units in Theorem 7.3 to the left hand side, we find −1 x+yζ 1 that one of xζ + y, x + yζ or 1+ζ = 1+ζ (x − y) + y must be the p-th power of an element of OK. In particular, this should be the case locally at primes q of OK. Theorem 7.3 gives us a way to show that certain ‘small’ primes have to divide xy(x − y). Corollary 7.4. Let p be an odd, regular prime and x, y coprime integers satisfying N(x + yζ) = zp. Suppose that 2 ≤ n ≤ p − 3 is such that q := np + 1 is a prime. Then p divides x (resp. y, x − y) if and only if q divides x (resp. y, x − y). Proof. Since q is congruent to 1 modulo p, it splits in K. So we have

qOK = (q, ζ − a1)(q, ζ − a2) ··· (q, ζ − ap−1) ∗ for some distinct a1,..., ap−1 ∈ (Z/qZ) . If p divides x, then by Theorem 7.3 we know that xζ−1 + y = γp for −1 some γ ∈ OK. Looking modulo (q, ζ − ai), we see that xai + y is a p-th power in Z/qZ. The number of p-th powers in Z/qZ is equal to n + 1 (the p-th powers in (Z/qZ)∗ together with 0), so this number is less than p − 1. In particular, there must be some j 6= i such that −1 −1 xai + y ≡ xaj + y mod q, and therefore q must divide x. Conversely, if q divides x, then q does not divide y or x − y because x and y are coprime. So by what we just showed, p does not divide y or x − y. Hence p must divide x. The proof for the cases where p divides y or x − y goes completely analogous.

7.2 computational results

For a prime p we define

Mp := p ∏ (np + 1). 1≤n≤p−3 np+1 prime

With this notation, Corollary 7.4 tells us that if p is regular, then one of x, y and x − y is divisible by Mp. Heuristically one would expect there to be approximately

1 p2 p ≈ p − 1 2 log p 2 log p primes less than p2 which are congruent to 1 modulo p. This would mean that Mp grows quite rapidly as p gets bigger. Giving a theoretical lower bound on Mp other than Mp ≥ p does not seem feasible, though. Denote by P(a, d) the smallest prime congruent to a modulo d. Linnik’s theorem (see [38]) states that for any coprime, positive integers a, d with 1 ≤ a ≤ d − 1 we have

P(a, d) < cdL 64 regularprimes

for some effictively computable constants c and L. It is conjectured that L ≤ 2, but the current record is L = 5, given in [38]. A lower 2 bound on Mp better than p would imply that P(1, p) < p . For relatively small primes p we can take a computational approach. Though not very efficient, we can still use the lower bound Mp ≥ p. This allows us to calculate all primes p such that Mp < B for a given upper bound B, in finite time. We did this for the bound B = 109, using PARI/GP [25]. This leads to the following lemma.

9 Lemma 7.5. The only primes p such that Mp < 10 are given by:

p Mp Prime Factors 2 2 2 3 3 3 5 55 5 · 11 7 203 7 · 29 11 1508639 11 · 23 · 67 · 89 13 7130461 13 · 53 · 79 · 131 17 57332993 17 · 103 · 137 · 239 19 831041 19 · 191 · 229

Note that these are all regular primes, seeing as the first irregular primes are 37, 59 and 67. Let q = np + 1 be a prime (not necessarily with n ≤ p − 3) and let s be a primitive root modulo q. Even when we can not apply Corollary 7.4, we can still get some computational pi q−1 results. The p-th powers modulo q are 0 and s for i = 1, . . . , p . The − q 1 i p-th roots of unity modulo q are given by s p for i = 0, . . . , p − 1. If x + yζ = γp, then, by looking modulo a prime of K above q, we − q 1 i see that x + ys p should be a p-th power modulo q for any 1 ≤ i ≤ − − q 1 i p − 1. Similarly if xζ−1 + y = γp then xs p + y should be a p-th x−y p power modulo q, for all 1 ≤ i ≤ p − 1. Finally, if 1+ζ + y = γ , then − q 1 i (x − y)(1 + s p )−1 + y has to be a p-th power modulo q. For example, if p = 5 and q = 31, then s = 3 is a primitive root modulo 31. Using PARI/GP, we see that there are no a, b ∈ (Z/31Z)∗ such that a + 36ib is fifth power modulo 31, simultaneously for i = 5 −1 5 1, 2, 3 and 4. Hence if x + yζ5 = γ or xζ + y = γ , then xy is divisible by 31. There are also no a, b ∈ (Z/31Z)∗ such that a + (1 + 6i −1 x−y 5 3 ) b is a fifth power modulo 31. So if 1+ζ + y = γ then 31 divides (x − y)y. We did the same computations for primes the 5 ≤ p ≤ 19 and some small primes q ≡ 1 mod p. The results are the given in the following tables. 7.2 computational results 65

Lemma 7.6. Let x, y be coprime integers satisfying N(x + yζ) = zp. For primes p in the table below, if p divides xy, then xy is also divisible by the primes in the right column of the following table:

p q such that q | xy 5 11, 31, 41, 61, 101, 251, 271, 751 7 29, 43, 71, 113, 127, 197, 211, 239, 281, 337, 379, 421, 449, 463, 491, 547, 631, 659, 673, 701, 743, 757, 827, 883, 911, 953, 967 11 23, 67, 89, 199, 331, 353, 397, 419, 463, 617, 661, 683, 727, 859, 881, 947, 991 13 53, 79, 131, 157, 313, 443, 521, 547, 599, 677, 859, 911, 1093, 1171 17 103, 137, 239, 307, 409, 443, 613, 647, 919, 953, 1021, 1123, 1259, 1327 19 191, 229, 419, 457, 571, 761, 1103, 1217, 1483, 1559, 1597 If p divides x − y then (x − y)y is divisible by the following primes:

p q such that q | (x − y)y 5 11, 31, 41, 61, 71, 101, 151, 191, 241, 311 7 29, 43, 71, 113, 127, 211, 239, 281, 337, 379, 449, 463, 491, 547, 617, 631, 659, 673, 701, 743, 757, 827, 883, 953 11 23, 67, 89, 199, 331, 353, 397, 419, 463, 617, 661, 683, 727, 859, 881, 947, 991 13 53, 79, 131, 157, 313, 443, 521, 547, 599, 677, 859, 911, 1093, 1171 17 103, 137, 239, 307, 409, 443, 613, 647, 919, 953, 1021, 1123, 1259, 1327 19 191, 229, 419, 457, 571, 761, 1103, 1217, 1483, 1559, 1597 Remark 7.7. We checked this for q < 1000, and, where necessary, for more primes q in order to ensure that the product of the primes q corresponding to p is at least 5 × 108. For the primes p ≥ 7 it should be possible to find more corresponding q, given extra computational time. For p = 5 the primes q in the list are all q < 10000. Using the data from the previous two lemmas we can give a lower bound on the absolute value of x and y. Theorem 7.8. Let p ≥ 5 be a regular prime and let x and y be coprime integers such that N(x + yζ) = zp. Then max(|x|, |y|) > 5 × 108. Proof. By Corollary 7.4 we know that one x, y and x − y should be divisible by Mp. So we see

2 max(|x|, |y|) > max(|x|, |y|, |x − y|) ≥ Mp.

9 Lemma 7.5 tells us that for p ≥ 23 we have Mp ≥ 10 and thus max(|x|, |y|) > 5 × 108. For the remaining primes we use Lemma 7.6. 66 regularprimes

For p = 5, if 5 divides xy, then xy is divisible by

5 · 11 · 31 · 41 · 61 · 101 · 251 · 271 · 751 = 22000998843422555.

Hence we find q √ max(|x|, |y|) > |xy| ≥ 22000998843422555 > 1.4 × 109.

If 5 divides x − y then (x − y)y is divisible by

5 · 11 · 31 · 41 · 61 · 71 · 101 · 151 · 191 · 241 · 251 · 311 = 61600621624429072505.

Therefore we see 1 1√ max(|x|, |y|) > |(x − y)y| ≥ 61600621624429072505 > 4 × 1010. 2 2 In both cases we have max(|x|, |y|) > 5 · 108. The calculations for the primes p = 7, . . . , 19 are done in the same manner.

We conclude with some remarks on the equation N(a + bζ + cζ2) = zp. In this case, it is not easy to give conditions for when a + bζ + cζ2 is singular, as we saw in chapter 4. If it is singular, then from chapter 5, Example 5.4, we can conclude that p divides (a − c)(ab + bc + 8ac − b2). This is of course not as strong as Theorem 7.2. We know that a + bζ + 2 p cζ = uγ for some unit u and γ in OK. Even if we assume that p divides a − c, we cannot deduce anything like we did in Theorem 7.3. So it seems that these methods are not strong enough to attack the equation N(a + bζ + cζ2) = zp. WOLSTENHOLME’STHEOREM 8

In 1862, Joseph Wolstenholme [37] proved the following theorem:

Theorem 8.1. For any prime p ≥ 5 we have:

2p − 1 ≡ 1 mod p3,(8.1) p − 1 − p 1 1 ∑ ≡ 0 mod p2,(8.2) k=1 k − p 1 1 ≡ ∑ 2 0 mod p.(8.3) k=1 k

(i) = n 1 = (1) Let us set Hn ∑k=1 ki and Hn Hn . So Wolstenholme’s the- 2 (2) orem says Hp−1 ≡ 0 mod p and Hp−1 ≡ 0 mod p. In 1900, James Glaisher [8] showed:

Theorem 8.2. For any prime p ≥ 5 we have

1 2 3 H − ≡ − p B − mod p . p 1 3 p 3

This means that a prime p divides the Bernoulli number Bp−3 if and 3 only if Hp−1 ≡ 0 mod p . So the condition p | Bp−3 that we found in chapter 5 can also be charaterised in terms of harmonic sums. Over the years many generalisations of Theorem 8.1 have been investigated. For a nice survey, see the article [23]. One generalisation 2p−1 is to consider ( p−1 ) modulo higher powers of p and express it in (i) terms of harmonic sums Hp−1, see for example [24, 35]. This leads to statements such as ([23], p. 6):   2p − 1 2 3 (3) 2 2 ≡ 1 + 2pH − + p H + 2p H − (8.4) p − 1 p 1 3 p−1 p 1

4 4 (3) 2 5 (5) 9 + p H − H + p H mod p . 3 p 1 p−1 5 p−1 In the article [11], the author used the p-adic logarithm and exponen- tial functions to study congruence conditions for binomial coefficients. In this chapter we will show that many Wolstenholme type theorems such as (8.4) can be proven in a uniform manner by using the proper- ties of the p-adic logarithm and the p-adic exponential function. Using this method, we can extend identities such as (8.4) to arbitrary high

67 68 wolstenholme’s theorem

powers of p. This leads to Theorem 8.6 below. As an application, in Corollary 8.7 we prove the following congruence:   2p − 1 2 3 (3) 2 5 (5) 2 7 (7) ≡ 1 + 2pH − + p H + p H + p H p − 1 p 1 3 p−1 5 p−1 7 p−1 2 2 9 (9) 2 2 2 6  (3)  + p H + 2p H − + p H 9 p−1 p 1 9 p−1 4 4 (3) 4 6 (5) + p H − H + p H − H 3 p 1 p−1 5 p 1 p−1 4 3 3 4 5 2 (3) 12 + p H − + p H − H mod p . 3 p 1 3 p 1 p−1 Following the article [22], we call a prime p a Wolstenholme prime 2p−1 ≡ 4 if it satisfies ( p−1 ) 1 mod p . The author of that article shows that the primes 16843 and 2124679 are the only two Wolstenholme primes smaller than 109. Moreover he conjectured there are infinitely many Wolstenholme primes. Using the p-adic logarithm, we obtain characterisations of Wolstenholme primes in terms of harmonic sums. This leads to a characterisation of Wolstenholme primes modulo p12 in Theorem 8.11. Thereby we confirm a conjecture from [23] (Remark 24), which says that a prime p is Wolstenholme if and only if it satisfies   2p − 1 2 3 2 5 (5) 8 ≡ 1 + 2pH − + p H − + p H mod p . p − 1 p 1 3 p 1 5 p−1 In the second part of this chapter we apply the logarithmic method of the first part to norms of integers of cyclotomic fields. The ana- log of Lemma 8.4 is given by Theorem 8.15. As an application, in Proposition 8.17 we prove the identity

− p 1 12k ∑ ≡ 0 mod p2. k=1 k k This is the weaker, modulo p2, version of a result from [35]. Moreover, in Theorem 8.20, we get the congruence

p−1   2 2 1 2k (1 − Lp)(Lp − 3) ∑ (−1)k−1 ≡ mod p2, k=1 k k 2p

where Lp is the p-th . This improves on a result in [34]. Finally we find a congruence involving the integers of the sequence A092765 of the OEIS.

8.1 logarithms and binomial coefficients

In this section we shall see how the relation between binomial co- efficients and harmonic sums becomes clear if you use the p-adic logarithm. For basic properties of the p-adic logarithm, see section 3 of chapter 5 and the references there. In addition to this, we will need the following lemma. 8.1 logarithms and binomial coefficients 69

Lemma 8.3. Let x ∈ Zp satisfy x ≡ 1 mod p, and let 0 < k < p − 2. Then  i k logp(x) x ≡ ∑ mod pr(k+1) i=0 i! if and only if x ≡ 1 mod pr.

r Proof. For any r ≥ 1, we have x ≡ 1 mod p if and only if logp(x) ≡ 0 mod pr. Moreover, we have

 i logp(x) x = expp logp(x) = ∑ , i≥0 i! so it is enough to show that for any y ≡ 0 mod p we have

yi ∑ ≡ 0 mod pr(k+1) i≥k+1 i!

i r  y  if and only if y ≡ 0 mod p . This amounts to showing that νp i! >  k+1   i   k+1  y y = y νp (k+1)! , because then we have νp ∑i≥k+1 i! νp (k+1)! . But  k+2   k+1  y > y < − νp (k+2)! νp (k+1)! follows from the fact that k p 2, whilst for i > k + 2 the claim is obvious.

Our main tool will be the next lemma.

Lemma 8.4. In Qp we have the following three identities:

∞ pi (i) = ∑ Hp−1 0, (8.5) i=1 i ∞ i   p ( ) 2p − 1 (−1)i−1 H i = log ,(8.6) ∑ p−1 p − i=1 i p 1 ∞ 2i−1   p ( − ) 2p − 1 2 H 2i 1 = log .(8.7) ∑ ( − ) p−1 p − i=1 2i 1 p 1

p−1 k−p Proof. Note that, since p is odd, 1 = ∏k=1 k . Hence we get:

− p 1 p 0 = − logp(1) = − ∑ logp(1 − ) k=1 k − p 1 ∞ pi ∞ pi = = (i) ∑ ∑ i ∑ Hp−1. k=1 i=1 k i i=1 i

2p−1 = p−1 p+k ≡ For the second identity, we note ( p−1 ) ∏k=1 k 1 mod p, which allows us to proceed in exactly the same manner as for the first identity. Finally, the third identity is obtained by adding the first two to each other. 70 wolstenholme’s theorem

The following lemma is useful for simplifying some of the expres- sions we will obtain.

Lemma 8.5. For any i ∈ N we have:  (i) 0 mod p, if p − 1 - i Hp−1 ≡ −1 mod p, if p − 1 | i

If i is odd, then:  2 (i) 0 mod p , if p − 1 - i + 1 Hp−1 ≡  ip 2 2 mod p , if p − 1 | i + 1

Proof. The first claim follows easily from the existence of a primitive root modulo p. For the second claim, note that we have

− ( − ) p 1 1 p 1 /2 ki + (p − k)i = ∑ i ∑ i( − )i k=1 k k=1 k p k ( − ) p 1 /2 ki + (−k)i + ip(−k)i−1 ≡ mod p2. ∑ i( − )i k=1 k p k

So, if i is odd, we see that

( − ) p 1 /2 1 ip (i) ≡ − ≡ − (i+1) 2 Hp−1 ip ∑ i+1 Hp−1 mod p . k=1 k 2

Lemma 8.5 covers parts (8.2) and (8.3) of Wolstenholme’s theo- rem 8.1. Note that the equivalence of (8.2) and (8.3) also follows from our Lemma 8.4 by looking at equation (8.5) modulo p3, after know- (3) ing that Hp−1 ≡ 0 mod p. The equivalence between (8.1) and the other two parts of Wolstenholme’s theorem follows by looking at (8.7) modulo p3. 5 2 For p = 3 we have (3) = 1 + 3 , and by Wolstenholme’s theorem 2p−1 ≡ 3 ≥ we have ( p−1 ) 1 mod p for p 5. Therefore, by the general properties of the p-adic logarithm and exponential, we know that 2p−1 = 2p−1 expp logp ( p−1 ) ( p−1 ). Combined with Lemma 8.4, this allows us 2p−1 to express ( p−1 ) in terms of harmonic sums.

Theorem 8.6. Let p be an odd prime. In Qp we have the identity:

n   2i−1 ! 2p − 1 1 p ( − ) = 2 H 2i 1 (8.8) − ∑ ∑ − p−1 p 1 n≥0 n! i≥1 2i 1 8.1 logarithms and binomial coefficients 71

Proof. The proof follows immediately from the combination of Lemma 8.4 and the definition of the exponential function:

2p − 1 2p − 1 = exp log p − 1 p p p − 1 1  2p − 1n = ∑ logp n≥0 n! p − 1 n 2i−1 ! 1 p ( − ) = 2 H 2i 1 ∑ ∑ − p−1 n≥0 n! i≥1 2i 1

By looking modulo a specific power pk, and using Lemma 8.5 to eliminate the terms which are congruent to 0 modulo pk, we can obtain an extension of previously known results (see Remark 8.9 below).

Corollary 8.7. For any prime p ≥ 11 we have the identity:   2p − 1 2 3 (3) 2 5 (5) 2 7 (7) ≡ 1 + 2pH − + p H + p H + p H p − 1 p 1 3 p−1 5 p−1 7 p−1 2 2 9 (9) 2 2 2 6  (3)  + p H + 2p H − + p H 9 p−1 p 1 9 p−1 4 4 (3) 4 6 (5) + p H − H + p H − H 3 p 1 p−1 5 p 1 p−1 4 3 3 4 5 2 (3) 12 + p H − + p H − H mod p . 3 p 1 3 p 1 p−1 Proof. The proof is immediate by looking at (8.8) and using Lemma 8.5.

Remark 8.8. The lower bound on the prime p is there to ensure that certain terms dissapear by using Lemma 8.5. Naturally, for the smaller primes we still obtain a corresponding identity using (8.8), but there will be some extra terms in the above formula.

Remark 8.9. Looking modulo p6, we obtain for all primes p ≥ 5:   2p − 1 2 3 (3) 6 ≡ 1 + 2pH − + p H mod p . p − 1 p 1 3 p−1

This was originally proved in [35]. Looking modulo p9, we get the identity mentioned in the introduction, i.e. for all primes p ≥ 7:   2p − 1 2 3 (3) 2 5 (5) ≡ 1 + 2pH − + p H + p H p − 1 p 1 3 p−1 5 p−1

2 2 4 4 (3) 9 + 2p H − + p H − H mod p . p 1 3 p 1 p−1 So far we have been looking at identities that hold for all primes. Now we will shift our attention to Wolstenholme primes. By using Lemma 8.4, one can quickly prove the following well-known fact. 72 wolstenholme’s theorem

Theorem 8.10. For any prime p the following are equivalent: 2p − 1 ≡ 1 mod p4,(8.9) p − 1 3 Hp−1 ≡ 0 mod p ,(8.10) (2) 2 Hp−1 ≡ 0 mod p .(8.11) > 2p−1 ≡ 4 Proof. From Lemma 8.3, we know that, for p 3, ( p−1 ) 1 mod p 2p−1 ≡ 4 4 if and only if logp ( p−1 ) 0 mod p . Looking at (8.7) modulo p , we 3 see that this happens if and only if Hp−1 ≡ 0 mod p . Finally, looking 4 p (2) 3 at (8.6) modulo p we see that Hp−1 ≡ 2 Hp−1 mod p , which proves the theorem.

As in the article [22], we call a prime satisfying any of these condi- tions a Wolstenholme prime. We can use the same logarithmic method to give characterisations of Wolstenholme primes in terms of harmonic sums. Theorem 8.11. A prime p is Wolstenholme if and only if:   2p − 1 2 3 (3) 2 5 (5) 2 7 (7) ≡ 1 + 2pH − + p H + p H + p H p − 1 p 1 3 p−1 5 p−1 7 p−1

2 9 (9) 2 2 4 4 (3) + p H + 2p H + p H − H 9 p−1 p−1 3 p 1 p−1 2 4 6 (5) 2 6  (3)  12 + p H − H + p H mod p . 5 p 1 p−1 9 p−1 Proof. By Lemma 8.3 we know that p is Wolstenholme if and only if

2p − 1 2p − 1 1  2p − 12 ≡ 1 + log + log mod p12. p − 1 p p − 1 2 p p − 1 We substitute the expression (8.7) we obtained for the logarithm into this equality. We obtain the result of the theorem after eliminating the terms which are 0 modulo p12, by using Lemma 8.5 and Theorem 8.10. Since the first primes are not Wolstenholme, we do not need to concern ourselves with a lower bound for p like in the previous section.

Remark 8.12. By looking modulo p8 we obtain that p is a Wolstenholme prime if and only if   2p − 1 2 3 2 5 (5) 8 ≡ 1 + 2pH − + p H − + p H mod p . p − 1 p 1 3 p 1 5 p−1 This is identity (49) of [23]. In Remark 24 of that article, the author asks whether this identity characterizes Wolstenholme primes. The theorem above proves that this is indeed the case. In [22] the author conjectures that there are infinitely many Wol- 2p−1 ≡ 5 stenholme primes, but no primes p such that ( p−1 ) 1 mod p . For sake of reference we also give the characterisation for these primes in terms of harmonic sums. 8.2 logarithms and cyclotomic integers 73

Proposition 8.13. For any prime p, the following are equivalent:

2p − 1 ≡ 1 mod p5, p − 1 4 Hp−1 ≡ 0 mod p , (2) 3 Hp−1 ≡ 0 mod p .

> 2p−1 ≡ 5 Proof. For p 5, we know ( p−1 ) 1 mod p if and only if

2p − 1 log ≡ 0 mod p5. p p − 1

We know   2p − 1 5 log ≡ 2pH − mod p , p p − 1 p 1 and also

  2 2p − 1 p (2) 5 log ≡ pH − − H mod p , p p − 1 p 1 2 p−1 which proves the equivalence of the three statements.

Just like in the previous two cases, we will extend this to a higher power of p. 2p−1 ≡ 5 Proposition 8.14. A prime p satisfies ( p−1 ) 1 mod p if and only if   2p − 1 2 3 (3) 2 5 (5) 2 7 (7) 10 ≡ 1 + 2H − + p H + p H + p H mod p . p − 1 p 1 3 p−1 5 p−1 7 p−1

Proof. Again, the proof is a simple application of Lemmas 8.3 and 8.4.

8.2 logarithms and cyclotomic integers

In thiss section we will be working in the p-th cyclotomic field K = Q(ζ p) again, with the corresonding notations. In exactly the same way as we did for binomial coefficients modulo p, we can use the p-adic logarithm to obtain congruence relations for norms of elements of OK. We will denote by νπ the valuation with respect to the prime ideal πOK. The analog of Lemma 8.4 is given by the following theorem.

Theorem 8.15. Let α ∈ OK be such that νπ(α) ≥ 1. Then we have, in Qp:

∞ 1 k ∑ Tr(α ) = − logp N(1 − α),(8.12) k=1 k 74 wolstenholme’s theorem

If moreover we know that 1 − α is a unit, then

∞ 1 ∑ Tr(αk) = 0, (8.13) k=1 k ∞ Tr(α2k−1) −2 = log N(1 + α),(8.14) ∑ − p k=1 2k 1 and in particular we get:

− p 1 1 ∑ Tr(αk) ≡ 0 mod pνπ (α).(8.15) k=1 k Proof. Applying the p-adic logarithm to N(1 − α) we see that:

p−1 logp(N(1 − α)) = ∑ logp(1 − σj(α)) j=1 p−1 ∞ σ (αk) = − ∑ ∑ j j=1 k=1 i ∞ 1 = − ∑ Tr(αk), k=1 k which gives the first identity. If 1 − α is a unit, then we have N(1 − α) =

1, so that logp N(1 − α) = 0. For the third identity, note that we have

i+1 (−1) i logp(N(1 + α)) = ∑ Tr(α ). i≥1 i Adding this to the second equation, we get

(−1) + (−1)i+1 Tr(α2k−1) log (N(1 + α)) = Tr(αi) = −2 . p ∑ ∑ − i≥1 i i≥1 2k 1

1 k Now we investigate the p-adic valuation of k Tr(α ). We know that νπ(Tr(x)) ≥ νπ(x), so we find: 1  ν Tr(αk) ≥ kν (α) − ν (k). π k π π

1 k  If p - k, then of course νπ k Tr(α ) ≥ kνπ(α). When νp(k) = t ≥ 1, then, using Bernoulli’s inequality (1 + x)r ≥ 1 + rx, one gets

t kνπ(α) − νπ(k) ≥ p νπ(α) − t(p − 1)

≥ (1 + t(p − 1)) νπ(α) − t(p − 1)

≥ νπ(α) + t(p − 1)(νπ(α) − 1).

In particular, we see that for any k ≥ p one has

1  ν Tr(αk) > (p − 1)(ν (α) − 1), π k π 8.2 logarithms and cyclotomic integers 75

1 k and since k Tr(α ) is a , this means that 1  ν Tr(αk) ≥ ν (α). p k π The congruence (8.15) then follows from equality (8.13).

The discrete Fourier transform will be useful for calculating the trace of an element:

Lemma 8.16. Let p be an odd prime and for j = 0, . . . , p − 1, let fj be a p−1 ij 1 p−1 −ij . If gi = ∑j=0 fjζ then fj = p ∑i=0 giζ . Proof. This is a simple calculation:

p−1 p−1 p−1 ! p−1 p−1 ! −ij ik −ij i(k−j) ∑ giζ = ∑ ∑ fkζ ζ = ∑ fk ∑ ζ = p fj. i=0 i=0 k=0 k=0 i=0

We are now ready to apply this to specific units.

Proposition 8.17. For any prime p ≥ 5 we have

− p 1 12k ∑ ≡ 0 mod p2. k=1 k k

Proof. Consider the element ζ − 1 + ζ−1 = 1 + ζ−1(1 − ζ)2, so that in this case we have α = −ζ−1(1 − ζ)2. Note that for p > 3 we have

1 − ζ6 ζ(ζ − 1 + ζ−1)(ζ + 1) = ζ3 + 1 = , 1 − ζ3 so that 1 − α is indeed a unit. We will calculate the trace of αk. We know − ! ∞ 2k p 1 2k (1 − ζi)2k = ∑ (−1)rζir = ∑ ∑ (−1)r ζij. r=−∞ r j=0 r≡j mod p r

i 2k Now we apply Lemma 8.16 with gi = (1 − ζ ) to get

− 2k 1 p 1 1 ∑ (−1)r = ∑ (1 − ζi)2kζ−ij = Tr(ζ−j(1 − ζ)2k). r≡j mod p r p i=1 p

In particular, we have

2k Tr(αk) = p ∑ (−1)k+r , r≡k mod p r and for 0 < k < p this implies that

2k Tr(αk) = p . k 76 wolstenholme’s theorem

Because νπ(α) = 2, (8.15) immediately gives us that

− − p 1 12k p 1 1 p ∑ = ∑ Tr(αk) ≡ 0 mod p2. k=1 k k k=1 k

We can easily get the result modulo p2 by using the fact that, for k > p, αk 1 k 3 one has νπ( k ) ≥ 2(p + 1), and thus k Tr(α ) ≡ 0 mod p . This means that we have − p 1 1 1 2p ∑ Tr(αk) ≡ − Tr(αp) = − ∑ (−1)p+r k=1 k p r≡0 mod p r 2p  = − − 2 ≡ 0 mod p3, p

by Wolstenholme’s theorem.

Remark 8.18. These trace calculations can be used to find some identities on their own. For example we can write α = −ζ(1 − ζ)2 from Proposition 8.17 also as α = (1 − ζ)(1 − ζ−1). Doing the trace calculation with this second form we get kk Tr(αk) = p ∑ (−1)r+s . r≡s mod p r s If k < p the only solution for r ≡ s mod p is r = s. Combining this with the expression we found for Tr(αk) in Proposition 8.17, we find

2k k k2 = ∑ k r=0 r (which is also an easy consequence of Vandermonde’s identity).

Remark 8.19. In [35] the author shows, via a different method, that in fact we have p−1      1 2k 8 2 2p 4 ∑ ≡ − Hp−1 ≡ − − 2 mod p . k=1 k k 3 3 p In chapter 3 we saw that for certain cyclotomic elements, the norm can be expressed completely by a recurrence sequence. We apply this in the following theorem.

Theorem 8.20. Let p be an odd prime, and let Lp be the p-th Lucas number, i.e. L0 = 2, L1 = 1 and Lk+2 = Lk+1 + Lk. Then we have:

p−1   2 2 1 2k (1 − Lp)(Lp − 3) ∑ (−1)k−1 ≡ mod p2. k=1 k k 2p

Proof. Again, let α = −ζ−1(1 − ζ)2 as in the previous proposition. Then 1 + α = 3 − ζ − ζ−1. Note that we have

(ζ2 − ζ − 1)(ζ−2 − ζ−1 − 1) = 3 − ζ2 − ζ−2 8.2 logarithms and cyclotomic integers 77 and therefore N(1 + α) = N(ζ2 − ζ − 1)2. By using Theorem 3, we see 2 2 that N(ζ − ζ − 1) = Lp. So N(1 + α) = Lp, and we can use this to calculate logp(N(1 + α)). We find:

1 log (N(1 + α)) ≡ N(1 + α) − 1 − (N(1 + α) − 1)2 p 2 1  2 ≡ L2 − 1 − L2 − 1 p 2 p 2 2 (1 − Lp)(Lp − 3) ≡ mod p3. 2 On the other hand, in the previous proposition we saw that Tr(α)k ≡ 0 mod p3 for k ≥ p, and thus (8.12) tells us that

p−1 p−1   k−1 1 k k−1 1 2k 3 logp(N(1 + α)) ≡ ∑ (−1) Tr(α ) ≡ p ∑ (−1) mod p . k=1 k k=1 k k

Together these give the desired result.

Remark 8.21. In the article [34], the authors show that

p−1 12k Fp−( p ) ∑ (−1)k−1 ≡ 5 5 mod p. k=1 k k p

2 Here Fk is the k-th . Using the the fact F p ≡ Lp − 1 p−( 5 ) mod p it is easy to see that Theorem 8.20 reduces to this result modulo p. 2 A prime p for which F p ≡ 0 mod p is called a Wall-Sun-Sun p−( 5 ) prime. It has been shown that if the first case of Fermat’s last theorem fails, then p must be a Wall-Sun-Sun prime. It is interesting to compare this with the case of Wolstenholme primes. From chapter 5 we know that if the first case of Fermat’s last theorem fails, then p must divide the Bernoulli number 2p−1 ≡ 4 Bp−3, which happens if and only if ( p−1 ) 1 mod p . Now we apply our results to a different element α. The result we obtain is concerned with the following sequence. For k ≥ 0, define

k k k  ak := ∑ . s=0 s 2k − 3s

This is sequence A092765 of the OEIS [33], starting with

1, 0, 4, 6, 36, 100, 430, 1470, 5796, . . .

Proposition 8.22. For a prime p ≥ 7 we have

( − ) p 1 /2 a ∑ (−1)k k ≡ 0 mod p. k=1 k 78 wolstenholme’s theorem

Proof. Consider the element α = −ζ2 + ζ + ζ−1 − ζ−2 = −ζ−2(1 − ζ)(1 − ζ3). Then

1 + ζ5 1 − α = ζ2 − ζ + 1 − ζ−1 + ζ−2 = ζ−2 1 + ζ

is indeed a unit. To calculate the trace we use the same trick as in the previous example. ! ! ∞ k ∞ k (1 − ζi)k(1 − ζ3i)k = ∑ (−1)rζir ∑ (−1)sζ3is r=−∞ r s=−∞ s ∞ kk = ∑ (−1)r+s ζi(3s+r) r,s=−∞ r s − ! p 1 kk = ∑ ∑ (−1)r+s ζij j=0 3s+r≡j mod p r s

By Lemma 8.16 we get

− kk 1 p 1 ∑ (−1)r+s = ∑ (1 − ζi)k(1 − ζ3i)kζ−ij 3s+r≡j mod p r s p i=1 1 = Tr(ζ−j(1 − ζ)k(1 − ζ3)k). p

1 k  p−1 Because νπ(α) = 2 we know that νπ k Tr(α ) > p − 1 for k > 2 , 1 k  and thus νπ k Tr(α ) > 1. Now Theorem 8.15 tells us that

( − ) p 1 /2 1 ∑ Tr(αk) ≡ 0 mod p2. k=1 k

p−1 For 0 ≤ k ≤ 2 and 0 ≤ r, s ≤ k, the only solution to 3s + r ≡ 2k mod p is r = 2k − 3s. Thus we find

( − ) p 1 /2 1 0 ≡ ∑ Tr(αk) k=1 k ( − ) ! p 1 /2 1 kk ≡ p ∑ ∑ (−1)k+r+s k=1 k 3s+r≡2k mod p r s ( − ) ! p 1 /2 (−1)k k k k  ≡ p mod p2 ∑ ∑ − k=1 k s=0 s 2k 3s

The previous results were all obtained by starting with an element of the cyclotomic field, calculating its norm and trace, and then using the logarithm to obtain a congruence. Clearly one can get many more results in this way. However, a more interesting problem is the inverse problem. Suppose there is a congruence, similar to those in this chapter, 8.2 logarithms and cyclotomic integers 79 one would like to prove. Is there a corresponding cyclotomic element that does the trick? Also, it would be interesting to see if there are other known identities that can be proved or improved using the logarithmic method.

BIBLIOGRAPHY

[1] Algebraic . Proceedings of an instructional confer- ence organized by the London Mathematical Society (a NATO Advanced Study Institute) with the support of the International Mathematical Union. Edited by J. W. S. Cassels and A. Fröhlich. Academic Press, London; Thompson Book Co., Inc., Washington, D.C., 1967, pp. xviii+366. [2] B. Anglès. “Units and norm residue symbol.” In: Acta Arith. 98.1 (2001), pp. 33–51. [3] Y. Bugeaud and M. Mignotte. “L’équation de Nagell-Ljunggren xn−1 q x−1 = y .” In: Enseign. Math. (2) 48.1-2 (2002), pp. 147–168. [4] Y. Bugeaud, M. Mignotte, and S. Siksek. “Classical and modular approaches to exponential Diophantine equations. I. Fibonacci and Lucas perfect powers.” In: Ann. of Math. (2) 163.3 (2006), pp. 969–1018. [5] H. Cohen. Number theory. Vol. I. Tools and Diophantine equations. Vol. 239. Graduate Texts in Mathematics. Springer, New York, 2007, pp. xxiv+650. [6] P. M. Cohn. Algebra. Vol. 1. Second Edition. John Wiley & Sons, Ltd., Chichester, 1982, pp. xv+410. [7] P. J. Davis. Circulant matrices. A Wiley-Interscience Publication, Pure and Applied Mathematics. John Wiley & Sons, New York- Chichester-Brisbane, 1979, pp. xv+250. [8] J. Glaisher. “On the residues of the sums of products of the first p − 1 numbers, and their powers, to modulus p2 or p3.” In: Quart. J. Math. 31 (1900), pp. 321–353. [9] A. Granville. Diophantine equations with varying exponents (with special reference to Fermat’s last theorem). Thesis (Ph.D.)–Queen’s University (Canada). ProQuest LLC, Ann Arbor, MI, 1987, (no paging). [10] A. Granville. “The Kummer-Wieferich-Skula approach to the first case of Fermat’s last theorem.” In: Advances in number theory (Kingston, ON, 1991). Oxford Sci. Publ. Oxford Univ. Press, New York, 1993, pp. 479–497. [11] A. Granville. “Arithmetic properties of binomial coefficients. I. Binomial coefficients modulo prime powers.” In: Organic mathe- matics (Burnaby, BC, 1995). Vol. 20. CMS Conf. Proc. Amer. Math. Soc., Providence, RI, 1997, pp. 253–276.

81 82 bibliography

[12] G. Gras. “Analysis of the classical cyclotomic approach to Fer- mat’s last theorem.” In: Actes de la Conférence “Fonctions L et Arithmétique”. Vol. 2010. Publ. Math. Besançon Algèbre Théorie Nr. Lab. Math. Besançon, Besançon, 2010, pp. 85–119. [13] G. Gras and R. Quême. “Vandiver papers on cyclotomy revisited and Fermat’s last theorem.” In: Publications mathématiques de Besançon. Algèbre et théorie des nombres, 2012/2. Vol. 2012/. Publ. Math. Besançon Algèbre Théorie Nr. Presses Univ. Franche- Comté, Besançon, 2012, pp. 47–111. [14] E. Hecke. Lectures on the theory of algebraic numbers. Vol. 77. Gradu- ate Texts in Mathematics. Translated from the German by George U. Brauer, Jay R. Goldman and R. Kotzen. Springer-Verlag, New York-Berlin, 1981, pp. xii+239. [15] J. Herbrand. “Sur les classes des corps circulaires.” French. In: J. Math. Pures Appl. (9) 11 (1932), pp. 417–441. [16] M. Janji´c.“Determinants and recurrence sequences.” In: J. Integer Seq. 15.3 (2012), Article 12.3.5, 21. [17] G. J. Janusz. Algebraic number fields. Second. Vol. 7. Graduate Studies in Mathematics. American Mathematical Society, Provi- dence, RI, 1996, pp. x+276. [18] V. Jha. The Stickelberger ideal in the spirit of Kummer with application to the first case of Fermat’s last theorem. Thesis (Ph.D.), Panjab University, Chandigarh, 1992. 1992. [19] N. Koblitz. p-adic numbers, p-adic analysis, and zeta-functions. Sec- ond. Vol. 58. Graduate Texts in Mathematics. Springer-Verlag, New York, 1984, pp. xii+150. [20] E. E. Kummer. “Erweiterter Beweis des letzten Fermatschen Lehrsatzes.” In: Abh. Akad. Wiss. Berlin (1857), pp. 41–74. [21] D. H. Lehmer. “Factorization of certain cyclotomic functions.” In: Ann. of Math. (2) 34.3 (1933), pp. 461–479. [22] R. J. McIntosh. “On the converse of Wolstenholme’s theorem.” In: Acta Arith. 71.4 (1995), pp. 381–389. [23] R. Meštrovi´c. Wolstenholme’s theorem: Its Generalizations and Exten- sions in the last hundred and fifty years (1862–2012). arXiv:1111.3057. 2011. 7 2p−1 [24] R. Meštrovi´c.“On the mod p determination of ( p−1 ).” In: Rocky Mountain J. Math. 44.2 (2014), pp. 633–648. [25] PARI/GP version 2.11.1. Available from http://pari.math.u- bordeaux.fr/. The PARI Group. Univ. Bordeaux, 2018. [26] A. Peth˝o.“Perfect powers in second order linear recurrences.” In: J. Number Theory 15.1 (1982), pp. 5–13. bibliography 83

[27] A. Peth˝o.“Diophantine properties of linear recursive sequences. II.” In: Acta Math. Acad. Paedagog. Nyházi. (N.S.) 17.2 (2001), pp. 81–96. [28] A. Pethö. “Diophantine properties of linear recursive sequences. I.” In: Applications of Fibonacci numbers, Vol. 7 (Graz, 1996). Kluwer Acad. Publ., Dordrecht, 1998, pp. 295–309. [29] P. Ribenboim. 13 lectures on Fermat’s last theorem. Springer-Verlag, New York-Heidelberg, 1979, xvi+302 pp. (1 plate). [30] K. A. Ribet. “A modular construction of unramified p-extensions ofQ(µp).” In: Invent. Math. 34.3 (1976), pp. 151–162. [31] T. N. Shorey and C. L. Stewart. “On the Diophantine equation ax2t + bxty + cy2 = d and pure powers in recurrence sequences.” In: Math. Scand. 52.1 (1983), pp. 24–36. [32] S. Sitaraman. “Vandiver revisited.” In: J. Number Theory 57.1 (1996), pp. 122–129. [33] N. J. A. Sloane. The On-Line Encyclopedia of Integer Sequences. pub- lished electronically at https://oeis.org, Sequence A092765. [34] Z.-W. Sun and R. Tauraso. “New congruences for central bino- mial coefficients.” In: Adv. in Appl. Math. 45.1 (2010), pp. 125– 148. [35] R. Tauraso. “More congruences for central binomial coefficients.” In: J. Number Theory 130.12 (2010), pp. 2639–2649. [36] L. C. Washington. Introduction to cyclotomic fields. Second. Vol. 83. Graduate Texts in Mathematics. Springer-Verlag, New York, 1997, pp. xiv+487. [37] J. Wolstenholme. “On certain properties of prime numbers.” In: Quart. J. Pure Appl. Math. 5 (1862), pp. 35–39. [38] T. Xylouris. Über die Nullstellen der Dirichletschen L-Funktionen und die kleinste Primzahl in einer arithmetischen Progression. Vol. 404. Bonner Mathematische Schriften [Bonn Mathematical Publica- tions]. Dissertation for the degree of Doctor of Mathematics and Natural Sciences at the University of Bonn, Bonn, 2011. Universität Bonn, Mathematisches Institut, Bonn, 2011, p. 110.