
Diophantine

Prof. Gisbert Wüstholz, ETH Zurich

Spring Semester, 2013

Contents

About these notes
Introduction

1 Algebraic Number Fields

2 Rings in Arithmetic
  2.1 Integrality
  2.2 Valuation rings
  2.3 Discrete valuation rings

3 Absolute Values and Completions
  3.1 Basic facts about absolute values
  3.2 Absolute Values on Number Fields
  3.3 Lemma of Nakayama
  3.4 Ostrowski’s Theorem
  3.5 Product Formula for Q
  3.6 Product formula for number fields

4 Heights and Siegel’s Lemma
  4.1 Absolute values and heights
  4.2 Heights in affine and projective spaces
  4.3 Height of matrices
  4.4 Absolute height
  4.5 Liouville estimates
  4.6 Siegel’s Lemma

5 Logarithmic Forms
  5.1 Historical remarks
  5.2 The Theorem
  5.3 Extrapolations
  5.4 Multiplicity estimates

6 The Unit
  6.1 Units and S-units
  6.2 Unit and regulator

7 Integral points in $\mathbb{P}^1 \setminus \{0, 1, \infty\}$

8 Appendix: Theory
  8.1 Lattices
  8.2 Geometry of
  8.3 Norm and Trace
  8.4 Ramification and

About these notes

These are notes from the course of Prof. Gisbert Wüstholz given in the Spring of 2013.

Please attribute all errors first to the scribe.¹

Introduction

The primary goal will be to consider the unit equation and especially its effective solution via linear forms in logarithms. Lecture 1 introduces the primary object of study in algebraic number theory: an algebraic number field, and Lecture 2 will move on to arithmetic rings, including local rings, valuation rings and Dedekind rings. Dirichlet proved a basic result:

Theorem 0.1 (Dirichlet). The group of units in a number field is finitely generated.

Lecture 3 treats the general theory of absolute values. Both heights and Siegel’s Lemma provide the subject matter of Lecture 4. We will begin with the notion of an absolute value for a general algebraic number field and study how absolute values split upon extension of the number field. This information determines the height, a key tool for modern number theory (see, e.g., the six hundred page tome, Heights, by Bombieri and Gubler). Siegel’s lemma will be treated only in a very elementary form. We will examine the problem of finding solutions of equations of the form

\[ x_1 a_1 + \cdots + x_n a_n = 0, \qquad a_1, \dots, a_n \in \mathbb{Z} \text{ not all zero}. \]

¹ Jonathan Skowera, [email protected]

The discussion proceeds in Lecture 5 to linear forms in logarithms. Baker received the Fields Medal for a version of the following theorem (the best version, presented here, is due to ...):

Theorem 0.2. Let $a_1, \dots, a_n$ be $n$ integers different from 0 and 1, and let $b_1, \dots, b_n$ be any integers. Then

\[ |a_1^{b_1} \cdots a_n^{b_n} - 1| > f(A, B, n) > A^{-C(n) \log B}, \qquad A = \max(|a_i|), \quad B = \max(|b_i|, e), \]

where $C(n)$ is an effective constant.²

The proof requires much machinery, so we will prove a simpler version. A trivial lower bound would be $A^{-CB}$.

The theorem includes an absolute value, and for a general algebraic number field, the various embeddings of the number field into the complex numbers $\mathbb{C}$ induce various absolute values.

Fix a finitely generated subgroup $\Gamma \subset K^\times$; then one can determine exactly (in principle) the solutions $x$ and $y$ of the unit equation. Hence the whole chain of results can be solved effectively. Solutions of covers of $\mathbb{P}^1$ follow from this and also an effective form of Siegel’s theorem.

Roth received the Fields Medal for the following theorem. Note that it is not effective!

Theorem 0.3 (Thue-Siegel-Dyson-Schneider-Roth). Let $\alpha$ be algebraic and irrational, and let $\varepsilon > 0$. Then

\[ |x - \alpha y| < \max(|x|, |y|)^{-(1+\varepsilon)} \]

has only finitely many solutions.

In Thue’s version from the 19th century, n = deg α and the exponent is n/2. Because this is the non-effective side of the story, we will leave it aside for the course and concentrate on the arithmetic approach.

Theorem 0.4 (Fermat’s Last Theorem/Wiles’ Theorem). The equation $x^n + y^n = 1$ has only finitely many solutions $x, y \in \mathbb{Q}$.

Precursors to the theorem are the work of Masser and Wüstholz.

In Lecture 6, we will briefly touch on lattice theory. Let $B$ be a simply connected domain in $\mathbb{R}^3$ containing a single lattice point: the origin. As the domain expands, it grows to contain more and more points. If the expanded domain $\lambda B$, $\lambda \in \mathbb{R}_{>1}$, contains a lattice point on its boundary, then the number $\lambda$ is said to be a successive minimum of $B$. Beginning with a smaller $B$ generally increases the values of the successive minima, which are a function of the domain,

\[ \lambda_1(B) < \lambda_2(B) < \cdots \]

The $\lambda_i$ depend on two parameters: on the volume of the domain $\mathrm{vol}(B)$ and on the volume $\mathrm{vol}(\mathbb{R}^3/\Lambda)$ of the lattice. We only touch the surface of this very interesting theory whose modern developments include Hermitian lattices and generalizations of theorems like Riemann-Roch.

In lecture 7 we will discuss unit equations which are basic tools for solving a large class of diophantine equations and diophantine problems.

² We will call an effective constant any constant which one can, at least in principle, compute.

Definition 0.5 (Unit equation). Let K be an algebraic number field, and let Γ ⊂ K× be finitely generated. The unit equation is x + y = 1, where x, y ∈ Γ.

In lecture 8, we will examine the Thue equation.³

Definition 0.6 (Thue equation). The Thue equation of degree $d$ is of the form $f(x, y) = 1$ for a homogeneous, degree $d$ polynomial $f$ with at least 3 distinct zeros and coefficients in $\mathbb{Q}$.

The unit equation has further ramifications for solutions of $y^n = f(x)$. The projective closure $C$ of the solutions $\tilde{C} = \{(x, y) \mid y^n = f(x)\}$ forms a covering of $\mathbb{P}^1$ of degree $n$ (technically, an algebraic curve $C$ with a morphism $C \to \mathbb{P}^1$). Its genus can be calculated by the Hurwitz formula, and it is defined over $\mathbb{Q}$. It has finitely many solutions in the integers $x$ and $y$. This fact relies on the unit equation.

Conjecture 0.7 (Mordell conjecture). If $C$ has genus $g > 1$ over the field $\mathbb{Q}$, then it has only finitely many rational points.

Faltings proved this fact in 1983 in the more general case where the curve $C$ may be over any finite extension of $\mathbb{Q}$.

Conjecture 0.8 (Shafarevich’s Conjecture). Fix a number field $K$ and a finite set of primes $S$. Then there are only finitely many elliptic curves (up to isomorphism) over $K$ which have good reduction outside of $S$.

³ Thue’s 150th birthday incidentally occurs this year.

The general Shafarevich conjecture makes the same claim for Abelian varieties. Good reduction means the following: let $y^2 = x^3 + ax + b$, $a, b \in \mathbb{Z}$, be reduced modulo $p$. Its discriminant has the form $\Delta = 4a^3 + 27b^2$. When $p \mid \Delta$, the discriminant vanishes modulo $p$, and hence the curve is not elliptic (it is not smooth). In all other cases, the curve is said to have good reduction at $p$.

The Mordell conjecture is a consequence of the Shafarevich conjecture according to Parshin. Faltings proved the Shafarevich conjecture, and hence the Mordell conjecture. The proof reduces the problem to a covering of $\mathbb{P}^1$ which hence has finitely many solutions. For a fixed $\Delta$, the covering is $4a^3 = -27b^2 + \Delta$.
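To make the reduction criterion concrete, the following minimal sketch lists the primes dividing $\Delta = 4a^3 + 27b^2$ for a given pair $(a, b)$, i.e. the primes at which the equation fails to have good reduction in the sense used above. The function names are illustrative and not part of the notes, and the check only reflects the simplified criterion stated here; it says nothing about whether a different equation for the same curve behaves better at a given prime.

```python
def prime_divisors(n):
    """Return the distinct prime divisors of a non-zero integer by trial division."""
    n = abs(n)
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def discriminant(a, b):
    """Discriminant 4a^3 + 27b^2 of y^2 = x^3 + ax + b, normalized as in the text."""
    return 4 * a**3 + 27 * b**2


def primes_of_bad_reduction(a, b):
    """Primes p with p | discriminant(a, b); requires a non-zero discriminant."""
    delta = discriminant(a, b)
    if delta == 0:
        raise ValueError("singular equation: the discriminant vanishes")
    return prime_divisors(delta)
```

For instance, $y^2 = x^3 - x$ has $\Delta = -4$, so only $p = 2$ is flagged, while $y^2 = x^3 - x + 1$ has $\Delta = 23$.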

Theorem 0.9 (Siegel’s theorem). Let $f(x, y) = 0$ define a curve of genus at least one with coefficients in $\mathbb{Q}$. Then there are only finitely many solutions $(x, y) \in \mathbb{Z} \times \mathbb{Z}$.

Certain questions can be generalized to number fields of class number one, which are those number fields whose rings of integers are principal ideal domains, so that elements factor uniquely into prime elements.

Conjecture 0.10 (Gauß conjecture on class numbers). There are only finitely many imaginary quadratic number fields with class number one.

Baker demonstrated the truth of this conjecture by showing that

\[ |\Delta_K| \le 163 \]

for all $K = \mathbb{Q}(\sqrt{n})$, where $n < 0$, of class number one.

Lecture 1

Algebraic Number Fields

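Let $K \supseteq \mathbb{Q}$ be a finite extension of fields. Then $K = \sum_{i=1}^{n} \mathbb{Q} \cdot \omega_i$ is in particular a finite dimensional $\mathbb{Q}$-vector space, where $n = [K : \mathbb{Q}]$ is called the degree of the field extension.

The Galois group of the extension $K/\mathbb{Q}$ is $\mathrm{Gal}(K/\mathbb{Q}) = \{\varphi : K \xrightarrow{\sim} K \mid \varphi \text{ fixes } \mathbb{Q}\}$ (in fact, the condition is vacuous, since field homomorphisms send 1 to 1). The extension need not be normal, hence not Galois, but the set

\[ \mathrm{Hom}_{\mathbb{Q}}(K, \mathbb{C}) = \{\sigma : K \to \mathbb{C}\} \]

is a measure of the distance of the field from being Galois and gives a criterion: the extension $K \supseteq \mathbb{Q}$ is Galois if and only if $\mathrm{Hom}_{\mathbb{Q}}(K, \mathbb{C}) \cong \mathrm{Gal}(K/\mathbb{Q})$. Furthermore, there is an action,

[Diagram: an embedding $\rho : K \to \mathbb{C}$ factors as an isomorphism $K \xrightarrow{\sim} \rho(K)$ followed by the inclusion $\rho(K) \subseteq \mathbb{C}$; at the right, $\pi$ acts on $K$.]

where the rightmost notation means that the Galois group acts on $K$ ($\pi \in \mathrm{Gal}(K/\mathbb{Q})$). So $\mathrm{Hom}_{\mathbb{Q}}(K, \mathbb{C})$ is a principal homogeneous space under $\pi$.

Theorem 1.1 (Primitive element theorem). The finite extension $K/\mathbb{Q}$ has a primitive element, i.e., there exists an $\alpha \in K$ such that $K = \mathbb{Q}(\alpha) = \mathbb{Q}[\alpha]$, where the first field is the field of rational functions in $\alpha$, which is equal to the polynomial ring $\mathbb{Q}[\alpha] = \mathbb{Q} \cdot 1 + \mathbb{Q} \cdot \alpha + \cdots + \mathbb{Q} \cdot \alpha^{n-1}$. Hence

\[ K \cong \mathbb{Q}[T]/(f_\alpha(T)), \qquad f_\alpha(T) \text{ irreducible}, \quad f_\alpha(\alpha) = 0. \]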

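Let $\mathcal{O}_K \subset K$ be the ring of integers of $K$, i.e.,

\[ \mathcal{O}_K = \{\alpha \in K \mid \text{minimal polynomial } f_\alpha = T^n + a_1 T^{n-1} + \cdots + a_{n-1} T + a_n, \ a_i \in \mathbb{Z}\}. \]

It is not immediate that this forms a ring, so we must return to this point.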

Rings and Ideals

Convention 1.2. Rings $R$ are always commutative and unitary. Homomorphisms of rings always send 1 to 1. $A \subseteq B$ means that $A$ is a subset of $B$, while $A \subset B$ means that $A \subseteq B$ and $A \neq B$.

Recall that a ring is entire¹ if $0 \neq 1$ and the ring is without zero divisors, i.e. $ab = 0$ implies $a = 0$ or $b = 0$.

• A subset $I \subseteq R$ is an ideal if and only if $I$ is an $R$-submodule.

• An ideal $\mathfrak{p} \subset R$ is prime if and only if $rs \in \mathfrak{p}$ implies $r \in \mathfrak{p}$ or $s \in \mathfrak{p}$ for all $r, s \in R$.² Equivalently, $\mathfrak{p}$ is prime if and only if $R/\mathfrak{p}$ is entire.

• An ideal $\mathfrak{q} \subset R$ is primary if and only if $rs \in \mathfrak{q}$ implies $r \in \mathfrak{q}$ or $s^n \in \mathfrak{q}$ for some $n$.³ These are powers of prime ideals in the ring of integers of a number field.

• An ideal $\mathfrak{p} \subset R$ is maximal if and only if $R/\mathfrak{p}$ is a field.

Let $M$ be an $R$-module. Then the annihilator ideal of $M$ in $R$ is

Ann(M) = 0 : M = {r ∈ R : rM = 0} ⊆ R.

For example, if R = Z and M = Z/6Z, then the annihilator ideal is Ann(M) = 6Z. To M are associated prime ideals which occur as annihilators of rank one modules:

ass(M) = {p ⊂ R | ∃m ∈ M : p = Ann(Rm)}.
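As a small check of these two definitions, the sketch below works out the annihilators for $R = \mathbb{Z}$ and $M = \mathbb{Z}/n\mathbb{Z}$ by direct computation. The function names are hypothetical; the only fact used is that the annihilator of the cyclic submodule generated by the class of $m$ is generated by $n / \gcd(m, n)$.

```python
from math import gcd


def annihilator_generator(m, n):
    """Positive generator of Ann(Z*m) for the class of m in Z/nZ."""
    return n // gcd(m, n)


def module_annihilator(n):
    """Smallest r > 0 with r*m = 0 in Z/nZ for every m, found by brute force."""
    return next(r for r in range(1, n + 1)
                if all((r * m) % n == 0 for m in range(n)))


def is_prime(k):
    return k > 1 and all(k % d for d in range(2, int(k**0.5) + 1))


def associated_primes(n):
    """Primes p such that pZ = Ann(Z*m) for some m in Z/nZ."""
    generators = {annihilator_generator(m, n) for m in range(n)}
    return sorted(p for p in generators if is_prime(p))
```

For $n = 6$ this recovers $\mathrm{Ann}(M) = 6\mathbb{Z}$ from the example above, and the associated primes are $2\mathbb{Z}$ and $3\mathbb{Z}$, realized by the elements 3 and 2 of $\mathbb{Z}/6\mathbb{Z}$.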

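Returning to the viewpoint of $K$ as a $\mathbb{Q}$-vector space, to each $\mathbb{Q}$-basis of $K$, $B = \{\omega_1, \dots, \omega_n\} \subset K$, is associated a discriminant,

\[ \Delta(\omega_1, \dots, \omega_n) = \det(\sigma_i \omega_j)^2, \]

where $1 \le i, j \le n$ and $\mathrm{Hom}_{\mathbb{Q}}(K, \mathbb{C}) = \{\sigma_1, \dots, \sigma_n\}$. Using the primitive element theorem, this can be made more explicit. Let $K = \mathbb{Q}[\alpha] = \mathbb{Q} \cdot 1 + \mathbb{Q} \cdot \alpha + \cdots + \mathbb{Q} \cdot \alpha^{n-1} \subset \mathbb{C}$. Then the primitive element $\alpha$ has a minimal polynomial $f_\alpha$, the lowest degree polynomial with integer coefficients such that $f_\alpha(\alpha) = 0$:

\[ f_\alpha(T) = a_0 T^n + a_1 T^{n-1} + \cdots + a_{n-1} T + a_n = a_0 \prod_{i=1}^{n} (T - \alpha_i) = a_0 \prod_{i=1}^{n} (T - \sigma_i \alpha), \qquad a_i \in \mathbb{Z}, \]

where $\mathrm{Hom}_{\mathbb{Q}}(K, \mathbb{C}) = \{\sigma_1, \dots, \sigma_n\}$ and $\sigma_i(\alpha) = \alpha_i$.

¹ Sometimes called integral.
² Compare to the case of integers and a prime $p$: $mn \in (p) \Rightarrow m \in (p)$ or $n \in (p)$ means $p \mid mn \Rightarrow p \mid m$ or $p \mid n$.
³ Compare to the case of integers and a prime power $p^n$: $ab \in (p^n) \Rightarrow a \in (p^n)$ or $b^n \in (p^n)$ means $p^n \mid ab \Rightarrow p^n \mid a$ or $p^n \mid b^n$.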

The discriminant is a polynomial in the coefficients $a_i$ which vanishes exactly when $f_\alpha(T)$ has a multiple root. The exact expression appears as the determinant of a Sylvester matrix, but the situation becomes much simpler if the roots $\alpha_i$ are used instead. In that case, a multiple root means that $\alpha_i = \alpha_j$ for some $i \neq j$, and the Vandermonde determinant calculates the discriminant:

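\[ \Delta(B) = \Delta(1, \alpha, \dots, \alpha^{n-1}) = \begin{vmatrix} 1 & \alpha_1 & \cdots & \alpha_1^{n-1} \\ 1 & \alpha_2 & \cdots & \alpha_2^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & \alpha_n & \cdots & \alpha_n^{n-1} \end{vmatrix}^2 = \prod_{i < j} (\alpha_j - \alpha_i)^2. \]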
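The product formula on the right can be checked numerically. The sketch below, which is only an illustration with floating-point roots, evaluates $\prod_{i<j} (\alpha_j - \alpha_i)^2$ for the roots of $T^2 - 2$ and of $T^3 - 2$; the function name is an assumption made for this example.

```python
import cmath
from itertools import combinations


def discriminant_from_roots(roots):
    """Product of (alpha_j - alpha_i)^2 over all pairs i < j."""
    result = 1
    for alpha_i, alpha_j in combinations(roots, 2):
        result *= (alpha_j - alpha_i) ** 2
    return result


# Roots of T^2 - 2.
sqrt2 = 2 ** 0.5
disc_quadratic = discriminant_from_roots([sqrt2, -sqrt2])

# Roots of T^3 - 2: the real cube root times the three cube roots of unity.
cbrt2 = 2 ** (1 / 3)
omega = cmath.exp(2j * cmath.pi / 3)
disc_cubic = discriminant_from_roots([cbrt2, cbrt2 * omega, cbrt2 * omega**2])
```

Up to rounding, the two values are 8 and $-108$, which agree with the usual discriminants of $T^2 - 2$ and $T^3 - 2$.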

Norm and Trace

The norm of an element in a number field is just the product of all the conjugates of it. But the norm and the trace can be defined using the left regular representation of $K$,

\[ \rho : K \hookrightarrow \mathrm{End}_{\mathbb{Q}}(K), \qquad x \mapsto (a \mapsto x \cdot a). \]

Via this representation, the norm and trace on $K$ are defined as the group homomorphisms

\[ N : K^\times \to \mathbb{Q}^\times, \quad x \mapsto \det(\rho(x)), \qquad \mathrm{Tr} : K \to \mathbb{Q}, \quad x \mapsto \mathrm{tr}(\rho(x)). \]

Question: Does $N(x) = \mathrm{Norm}_{K/\mathbb{Q}}(x)$? Does $\mathrm{Tr}(x) = \mathrm{Trace}_{K/\mathbb{Q}}(x)$?

We observe that if $B \subset \mathcal{O}_K$ is a basis of the ring of integers, then its discriminant $\Delta(\omega_1, \dots, \omega_n)$ is an integer.

Theorem 1.3. Let B ⊂ OK be a basis such that |∆(ω1, . . . , ωn)| is minimal among all bases. Then OK = Zω1 + ··· + Zωn.

Two such minimal bases differ by an element in GLn(Z).

Proof. Let α ∈ OK . Then

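\begin{align*}
\alpha &= a_1 \omega_1 + \cdots + a_n \omega_n, & a_i &\in \mathbb{Q}, \ 1 \le i \le n, \\
&= m_1 \omega_1 + \cdots + m_n \omega_n + (r_1 \omega_1 + \cdots + r_n \omega_n), & m_i &= \lfloor a_i \rfloor, \ r_i = a_i - m_i.
\end{align*}

Assume that not all $r_i$ are zero, and without loss of generality, let us assume that $r_1 \neq 0$. Change basis:

\begin{align*}
\omega_1' &:= \alpha - m_1 \omega_1 - \cdots - m_n \omega_n \\
\omega_2' &:= \omega_2 \\
&\ \ \vdots \\
\omega_n' &:= \omega_n.
\end{align*}

So

\[ \Delta(\omega_1', \dots, \omega_n') = \det \begin{pmatrix} r_1 & r_2 & \cdots & r_n \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}^2 \Delta(\omega_1, \dots, \omega_n) = r_1^2 \, \Delta(\omega_1, \dots, \omega_n). \]

Since $0 < r_1 < 1$, the elements $\omega_1', \dots, \omega_n'$ form a basis in $\mathcal{O}_K$ with $|\Delta(\omega_1', \dots, \omega_n')| < |\Delta(\omega_1, \dots, \omega_n)|$, a contradiction.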
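The basis change in this proof can be followed on a small example. The sketch below is an illustration under stated assumptions: it takes $K = \mathbb{Q}(\sqrt{5})$, the basis $\{1, \sqrt{5}\}$, and the integral element $\alpha = (1 + \sqrt{5})/2$ (a root of $T^2 - T - 1$), so that $r_1 = 1/2$. Elements $a + b\sqrt{5}$ are stored as pairs of exact fractions; all names are chosen for this example only.

```python
from fractions import Fraction

# An element a + b*sqrt(5) of Q(sqrt(5)) is stored as the pair (a, b).


def mul(x, y):
    (a, b), (c, d) = x, y
    return (a * c + 5 * b * d, a * d + b * c)


def sub(x, y):
    return (x[0] - y[0], x[1] - y[1])


def conj(x):
    """The non-trivial embedding sqrt(5) -> -sqrt(5)."""
    return (x[0], -x[1])


def discriminant(w1, w2):
    """det(sigma_i(w_j))^2 for sigma_1 = identity and sigma_2 = conj."""
    det = sub(mul(w1, conj(w2)), mul(w2, conj(w1)))
    rational_part, sqrt5_part = mul(det, det)
    assert sqrt5_part == 0  # the squared determinant is rational
    return rational_part


one = (Fraction(1), Fraction(0))
sqrt5 = (Fraction(0), Fraction(1))
alpha = (Fraction(1, 2), Fraction(1, 2))  # (1 + sqrt(5)) / 2

disc_old = discriminant(one, sqrt5)    # basis {1, sqrt(5)}
disc_new = discriminant(alpha, sqrt5)  # omega_1 replaced by alpha
```

Here `disc_old` is 20 and `disc_new` is 5, so the discriminant drops by the factor $r_1^2 = 1/4$, as in the displayed formula.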

Lecture 2

Rings in Arithmetic

In this lecture we consider localization, integrality and generalities on valuation rings.

Definition 2.1. Let $R$ be a commutative, unital ring and $S \subset R$ a multiplicative set, i.e., a set with $1 \in S$, such that $x, y \in S$ implies $xy \in S$. Then the localization of $R$ with respect to $S$ makes precise the notion of a ring of fractions $\{ \frac{r}{s} \mid r \in R, s \in S \}$. In particular, the localization $S^{-1}R$ can be formalized as a ring of (equivalence classes of) pairs $(r, s)$:

\begin{align*}
S^{-1}R &:= R \times S / \sim \\
(r, s) \sim (r', s') &\iff \exists t \in S : t(rs' - r's) = 0 \\
[(r, s)] + [(r', s')] &= [(rs' + r's, ss')] \\
[(r, s)] \cdot [(r', s')] &= [(rr', ss')] \\
0 &= [(0, 1)] \\
1 &= [(1, 1)]
\end{align*}

Direct calculation shows that addition and multiplication are well-defined on equivalence classes.

The reason for the requirement that $S$ be multiplicative becomes clear; the denominators of the pairs contain terms of the form $ss'$. These formulas follow from the analogy with fractions, i.e., $(r, s) \leftrightarrow \frac{r}{s}$. For example,

\[ \frac{r}{s} = \frac{r'}{s'} \iff rs' = r's \iff rs' - r's = 0. \]

The additional $t$ in the definition of $\sim$ deals with the case where $R$ has zero-divisors, i.e., the counter-intuitive case when neither $t$ nor $rs' - r's$ is zero and yet their product is. For entire rings with $0 \notin S$, the $t$ may be left off.

There is a well-defined, canonical morphism,

\[ \iota_S : R \to S^{-1}R, \qquad r \mapsto [(r, 1)]. \]

If $R$ is entire, and $0 \notin S$, then $\iota_S$ is injective, and $S^{-1}R$ is entire.
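The role of the extra factor $t$ is easiest to see in a ring with zero-divisors. The following sketch, an illustration with names and the choice of ring made up for this purpose, enumerates the equivalence classes of pairs for $R = \mathbb{Z}/6\mathbb{Z}$ and $S = \{1, 2, 4\}$ (the powers of 2 modulo 6).

```python
from itertools import product

N = 6
R = range(N)    # the ring Z/6Z
S = [1, 2, 4]   # powers of 2 modulo 6: a multiplicative set containing 1


def naive_equal(pair1, pair2):
    """Cross-multiplication without the extra factor t."""
    (r1, s1), (r2, s2) = pair1, pair2
    return (r1 * s2 - r2 * s1) % N == 0


def equivalent(pair1, pair2):
    """(r, s) ~ (r', s') iff t*(r*s' - r'*s) = 0 in R for some t in S."""
    (r1, s1), (r2, s2) = pair1, pair2
    return any((t * (r1 * s2 - r2 * s1)) % N == 0 for t in S)


def equivalence_classes():
    """Group all pairs (r, s) into classes under the relation ~."""
    classes = []
    for pair in product(R, S):
        for cls in classes:
            if equivalent(pair, cls[0]):
                cls.append(pair)
                break
        else:
            classes.append([pair])
    return classes
```

The pairs $(3, 1)$ and $(0, 1)$ are identified only through $t = 2$, since $2 \cdot 3 = 0$ in $\mathbb{Z}/6\mathbb{Z}$, and the enumeration finds three classes in total, consistent with $S^{-1}R \cong \mathbb{Z}/3\mathbb{Z}$. In particular $\iota_S$ is not injective here.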

• Let $R = K$ and $S = K^\times$. Then localization has no effect, since a field already contains all fractions, i.e., $S^{-1}R = K$.

• Localization at a multiplicative set containing 0 always produces the zero ring, i.e., if $0 \in S$, then $S^{-1}R = \{0\}$ is the zero ring.

• Let $\mathfrak{p} \subset R$ be a prime ideal. Then $S := R \setminus \mathfrak{p}$ is a multiplicative set by the definition of a prime ideal. The localization at $S$ is denoted by $R_\mathfrak{p} := (R \setminus \mathfrak{p})^{-1} R$. It has a unique maximal ideal $\mathfrak{p} R_\mathfrak{p}$. In fact, there is a bijective correspondence,

\[ \{I \subset \mathfrak{p} \mid (\forall s \in R \setminus \mathfrak{p})\ sa \in I \Rightarrow a \in I\} \longleftrightarrow \{I \subset R_\mathfrak{p}\}, \qquad I \mapsto I R_\mathfrak{p}, \qquad \iota_\mathfrak{p}^{-1}(J) \mapsfrom J. \]

This holds in particular for prime ideals $\mathfrak{q} \subset \mathfrak{p}$.

• Let $R = \mathbb{Z}$ and $S = \mathbb{Z} \setminus \{0\}$. Then $S^{-1}R = \{ \frac{a}{b} \mid a, b \in \mathbb{Z}, b \neq 0 \} = \mathbb{Q}$.

• More generally, let $R$ be an arbitrary entire ring and $S = R \setminus \{0\}$; then $S^{-1}R = \mathrm{Frac}(R)$ is the field of fractions of $R$.

• Let $R = \mathbb{Z}$ and $p$ be a prime number. Letting $S = \mathbb{Z} \setminus p\mathbb{Z}$ gives the localization $S^{-1}R = \mathbb{Z}_{(p)} = \{ \frac{a}{b} \mid a, b \in \mathbb{Z}, (p, b) = 1 \}$.¹ The only ideals are the zero ideal $(0)$ and powers of the maximal ideal $(p^n)$ for $n \in \mathbb{N}$.

• Again let $R = \mathbb{Z}$ and $p$ be a prime. Letting $S = \{p^n \mid n \in \mathbb{N}\}$ and localizing gives the ring of fractions whose denominators are powers of $p$, i.e., $S^{-1}\mathbb{Z} = \{ \frac{a}{p^n} \mid a \in \mathbb{Z}, n \in \mathbb{N} \}$.

Definition 2.2. A ring of the form $R_\mathfrak{p}$ is called a local ring.² Equivalently, a local ring is a ring with a unique maximal ideal. They are called local in analogy with geometry, where the ring of germs of functions (i.e., locally defined functions) at a fixed point $p$ of a smooth manifold has a unique maximal ideal formed by the functions vanishing at $p$.

2.1 Integrality

Let $R$ be a commutative, unitary, entire ring. An (associative, commutative, unitary) $R$-algebra $A$ is an $R$-module $A$ with a bilinear multiplication $\beta : A \times A \to A$, i.e., $a \cdot (b + c) = a \cdot b + a \cdot c$ and $(b + c) \cdot a = b \cdot a + c \cdot a$. We identify³ $R$ with its isomorphic image in $A$ under the ring morphism $R \to A$ which maps $r \mapsto r \cdot 1_A$.

¹ The notation $(p, b)$ means the greatest common divisor $\gcd(p, b)$ of $p$ and $b$.
² That is, $R$ is local if there exists a ring $R'$ and prime ideal $\mathfrak{p} \subset R'$ such that $R \cong R'_\mathfrak{p}$.
³ That is, we abuse notation to write $R \subset A$.

Definition 2.3. An element $a \in A$ is integral over $R$ if and only if there exists $g(T) \in R[T]$ of the form

\[ g(T) = T^n + g_1 T^{n-1} + \cdots + g_{n-1} T + g_n \]

such that $g(a) = 0$.

The weakness of this definition is that it does not immediately show why sums and products of integral elements are integral. The theorem below remedies this.

Definition 2.4. An $R$-algebra $A$ is integral over $R$ if and only if every $a \in A$ is integral over $R$.

For example, $R$ is integral over $R$, since for any $r \in R$, the polynomial $g(T) = T - r$ has zero $g(r) = 0$.

Theorem 2.5 (Equivalent characterizations of integrality). There are three equivalent conditions for the integrality of an element of an $R$-algebra.

(i) $a \in A$ is integral over $R$.

(ii) $R[a]$ is a finitely generated $R$-module, where $R[a]$ is the smallest subalgebra of $A$ containing $a$, i.e., $R[a] := \bigcap_{A' \ni a} A'$.

(iii) There exists an $R[a]$-module $M$ which is finitely generated as an $R$-module and satisfies $\mathrm{Ann}_{R[a]}(M) = 0$.

Proof. (i) ⇒ (ii): Let $M_q := \langle 1, a, a^2, \dots, a^{n+q} \rangle \subset R[a] = \sum_{k \ge 0} R a^k$ for $q \ge 0$. We have

\begin{align*}
0 = g(a) &= a^n + g_1 a^{n-1} + \cdots + g_{n-1} a + g_n \\
\Rightarrow\ a^n &= -g_1 a^{n-1} - \cdots - g_n \\
\Rightarrow\ a^{n+q} &= -g_1 a^{n+q-1} - \cdots - g_n a^q \in M_{q-1} \\
\Rightarrow\ M_0 &\subset M_q \subset M_{q-1} \subset \cdots \subset M_0 \\
\Rightarrow\ R[a] &= M_0 \text{ is finitely generated.}
\end{align*}

(ii) ⇒ (iii): How can we find an $M$? We use the only candidate we have: $M = R[a]$. Then $\mathrm{Ann}_{R[a]}(R[a]) = 0$, because $1 \in R[a]$.

(iii) ⇒ (i): This follows immediately from a lemma.

Lemma 2.6. Let $M$ be an $R[a]$-module which is finitely generated over $R$ such that $\mathrm{Ann}_{R[a]}(M) = 0$ and let $I \subset R$ be an ideal with $aM \subset IM$. Then $a$ is a zero of a polynomial $g(T) = T^n + g_1 T^{n-1} + \cdots + g_n$ with $g_i \in I$ for all $i$.

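Let $M = R m_1 + \cdots + R m_n$. Writing multiplication by $a$ in terms of these generators gives $a m_i = \mu_{i1}(a) m_1 + \cdots + \mu_{in}(a) m_n$. This defines a matrix $\mu(a) = (\mu_{ij}(a)) \in I \cdot M^{n \times n}(R)$. Let $\Phi := a \cdot I_{n \times n} - \mu(a)$, and then

\[ a \begin{pmatrix} m_1 \\ \vdots \\ m_n \end{pmatrix} = \mu(a) \begin{pmatrix} m_1 \\ \vdots \\ m_n \end{pmatrix} \ \Rightarrow\ (a \cdot I_{n \times n} - \mu(a)) \begin{pmatrix} m_1 \\ \vdots \\ m_n \end{pmatrix} = 0. \]

Multiplying on the left by the adjugate matrix⁴ shows that $\det \Phi = 0$:

\[ (\det \Phi) \cdot \begin{pmatrix} m_1 \\ \vdots \\ m_n \end{pmatrix} = 0 \Rightarrow \det \Phi \cdot m_i = 0 \Rightarrow \det \Phi \cdot M = 0 \Rightarrow \det \Phi \in \mathrm{Ann}(M) = 0 \Rightarrow \det \Phi = 0. \]

Hence $g(T) = \det(T \cdot I_{n \times n} - \mu(a))$ is a polynomial of the required form.

Corollary 2.7. (i) An element $a \in A$ is integral over $R$ if and only if there exists a finite⁵ $R$-algebra $A' \subset A$ such that $R[a] \subset A'$.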

(ii) The set $\overline{R}$ of elements of $A$ which are integral over $R$ is a subring of $A$ called the integral closure of $R$ in $A$.

Proof. Part (i). ⇒: By (ii) of the above theorem, we may take $A' = R[a]$: since $a$ is integral over $R$, $R[a]$ is finitely generated. ⇐: Let $A'$ be a finite $R$-algebra and $R[a] \subset A'$. Hence $aA' \subset A'$. So $A'$ is an $R[a]$-module with $\mathrm{Ann}_{R[a]}(A') = 0$, since $1 \in R[a] \subset A'$. Using the (iii) ⇒ (i) part of the above theorem shows that $a$ is integral over $R$.

Part (ii). Let $a, b \in \overline{R}$. Then there exist $R$-subalgebras $A', A'' \subset A$ such that $R[a] \subset A'$ and $R[b] \subset A''$ and $A', A''$ are finitely generated $R$-modules. The product $A'A''$ is also finitely generated.⁶ Since $R[a + b], R[ab] \subset R[a, b]$ and $R[a, b] \subset R[a] \cdot R[b] \subset A'A''$, applying Part (i) gives the proof.

The outlook for all this is the application to the case $K \supseteq \mathbb{Q} \supset \mathbb{Z}$ with $A = K$ and $R = \mathbb{Z}$, where $K$ is a number field. Then the ring of integers in $K$ is $\mathcal{O}_K := \overline{\mathbb{Z}}$, the integral closure of $\mathbb{Z}$ in $K$. Looking forward, we will consider absolute values which correspond roughly to valuations, and then move on to heights.
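The determinant argument from the proof of Lemma 2.6 can be tried out on an element of a number field. The sketch below takes $a = (1 + \sqrt{5})/2$ acting on $M = \mathbb{Z} \cdot 1 + \mathbb{Z} \cdot a$, where $a \cdot 1 = a$ and $a \cdot a = 1 + a$, so that $\mu(a)$ has rows $(0, 1)$ and $(1, 1)$. The characteristic polynomial is computed with the Faddeev-LeVerrier recursion, which is a choice made for this illustration and not a method from the notes; the function name is likewise hypothetical.

```python
from fractions import Fraction


def characteristic_polynomial(matrix):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(T*I - matrix).

    Uses the Faddeev-LeVerrier recursion with exact rational arithmetic.
    """
    n = len(matrix)
    a = [[Fraction(x) for x in row] for row in matrix]

    def matmul(x, y):
        return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    coeffs = [Fraction(1)]
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + c_{n-k+1} * I
        am = matmul(a, m)
        m = [[am[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        # c_{n-k} = -trace(A * M_k) / k
        trace = sum(matmul(a, m)[i][i] for i in range(n))
        coeffs.append(-trace / k)
    return coeffs


# Multiplication by (1 + sqrt(5))/2 on the generators 1 and (1 + sqrt(5))/2.
mu_golden = [[0, 1], [1, 1]]
# Multiplication by sqrt(2) on the generators 1 and sqrt(2).
mu_sqrt2 = [[0, 1], [2, 0]]
```

The first matrix yields $T^2 - T - 1$ and the second $T^2 - 2$; both are monic with integer coefficients, which exhibits the two elements as integral over $\mathbb{Z}$.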

⁴ The adjugate matrix $\mathrm{adj}(\Phi)$ always exists and satisfies $\mathrm{adj}(\Phi) \cdot \Phi = \det \Phi \cdot I_{n \times n}$.
⁵ We use “finite” to mean “finitely generated as a module”.
⁶ Proof: $A' = \sum_{i=1}^{n} R a_i$, $A'' = \sum_{j=1}^{m} R b_j \Rightarrow A'A'' = \sum_{i,j} R a_i b_j$.

2.2 Valuation rings

We again let $R$ be a commutative, unitary ring and let $U(R) = R^\times$ be the units⁷ of $R$.

Theorem 2.8. Let $I = \bigcup_{\mathfrak{a} \subset R} \mathfrak{a}$ be the union of all proper⁸ ideals of $R$. Then $I = R \setminus U(R)$, and $I$ is an ideal if and only if $R$ has a unique maximal ideal.

Proof. The first claim is proved by showing both inclusions.

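$I \supseteq R \setminus U(R)$: Let $x \in R \setminus U(R)$ and let $\mathfrak{a} = xR$. If $\mathfrak{a}$ were the whole ring, then we would have $1 \in xR$. In other words, there would be an element $y \in R$ with $xy = 1$. But then $x$ would be a unit. Contradiction. Hence $\mathfrak{a}$ is a proper ideal and $x \in \mathfrak{a} \subseteq I$.

$I \subseteq R \setminus U(R)$: Let $x \in I$, so that $x \in \mathfrak{a}$ for some proper ideal $\mathfrak{a} \subset R$. If $x$ were a unit, then $\mathfrak{a}$ would contain $x^{-1}x = 1$ and hence be the whole ring, a contradiction. So $x \in R \setminus U(R)$.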

The second claim is proved by showing both directions of implication.

(⇒) Assume $I$ is an ideal. Since $I$ contains all proper ideals, it is by definition maximal. Let $\mathfrak{m}$ be another maximal ideal. Then $\mathfrak{m}$ is a proper ideal, hence $I$ contains $\mathfrak{m}$. Since $\mathfrak{m}$ is maximal and $I$ is a proper ideal, $\mathfrak{m} = I$, i.e., $I$ is unique.

(⇐) Assume $R$ has a unique maximal ideal. Let $x, y$ be elements of $I$. By the first part, they are not units, so $xR, yR \neq R$. Let $\mathfrak{m}$ be the unique maximal ideal, which must contain both $xR$ and $yR$. Thus it contains both $x$ and $y$. Hence $x - y \in \mathfrak{m}$ and $rx \in \mathfrak{m}$ for all $r \in R$. Since $I$ contains all proper ideals, it also contains $\mathfrak{m}$, so $x - y, rx \in I$, i.e., $I$ is an ideal.

Note that the theorem implies that $U(R) = R \setminus I$.

Definition 2.9. Let $R$ be entire with quotient field $K = \mathrm{Frac}\, R$. The ring $R$ is called a valuation ring if either $r \in R$ or $r^{-1} \in R$ for every non-zero $r \in K$.

Definition 2.10. We will denote by I(R) the set of ideals of R.

⁷ An element $r \in R$ is a unit if it is invertible with respect to multiplication, i.e., there exists $r^{-1} \in R$ such that $r^{-1} r = 1_R$.
⁸ An ideal $I \subset R$ is a proper ideal if $I \neq R$.

Theorem 2.11. Let $R$ be a valuation ring. Then the set of ideals, $I(R)$, is totally ordered by inclusion.

Proof. Let $I, J \in I(R)$. If $I = J$, then there is nothing to show, so let $I \neq J$. Without loss of generality, let $I$ contain an $x$ which is not in $J$. Let $y \in J \setminus \{0\}$. Then

\[ \frac{x}{y} \cdot y = x \notin J \text{ and } y \in J \ \Rightarrow\ \frac{x}{y} \notin R. \]

By hypothesis, $R$ is a valuation ring, hence $y/x = (x/y)^{-1} \in R$ and $y = (y/x) \cdot x \in I$. Since $y$ was arbitrary, $J \subset I$.

The total ordering on $I(R)$ immediately implies a couple of corollaries.

Corollary 2.12. Let $R$ be a valuation ring. Then $R$ has a unique maximal ideal.

Corollary 2.13. Let $R$ be a valuation ring. Then $U(R) = R \setminus \mathfrak{m}(R)$.

Corollary 2.14. Let $R$ be a valuation ring. Then $K \setminus R = \{x \in U(K) \mid x^{-1} \in \mathfrak{m}(R)\}$.

Proof. ⊆: By the definition of a valuation ring, if $x \notin R$, then $x^{-1} \in R$ and also $x^{-1} \notin U(R)$ (otherwise $x = (x^{-1})^{-1} \in U(R) \subset R$). Hence $x^{-1} \in \mathfrak{m}(R)$.

⊇: Let $x \in U(K)$, and let $x^{-1} \in \mathfrak{m}(R)$. Since $x^{-1} \in \mathfrak{m}(R)$, $x^{-1} \notin U(R)$. But if $x^{-1}$ is not in $U(R)$, then neither is $x$ in $U(R)$. Since $x^{-1} \in \mathfrak{m}(R)$, $x$ cannot be in $\mathfrak{m}(R)$, or else $\mathfrak{m}(R)$ would be the whole ring. But if $x$ is neither in $\mathfrak{m}(R)$ nor in $U(R)$, it must not be in $R$ at all, that is, $x \in K \setminus R$.

Theorem 2.15. Let $R$ be a valuation ring. Then $R$ is integrally closed in $K$.

Proof. Let $x \in K$ be integral over $R$. Assume that $x \notin R$. Since $R$ is a valuation ring, this implies $x^{-1} \in R$. By definition of integrality, there is a polynomial relation,

\[ x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n = 0. \]

Division by $x^n$ and rearrangement gives,

\[ 1 = -a_1 x^{-1} - \cdots - a_{n-1} x^{-(n-1)} - a_n x^{-n} = x^{-1} \left( -a_1 - \cdots - a_{n-1} x^{-(n-2)} - a_n x^{-(n-1)} \right). \]

Since $x \notin R$, we have $x^{-1} \notin U(R)$. Thus $x^{-1} \in \mathfrak{m}(R)$. In that case the right-hand side lies in $\mathfrak{m}(R)$, so $1 \in \mathfrak{m}(R)$, a contradiction. So the assumption was false and $x \in R$, that is, $R$ is integrally closed.

2.3 Discrete valuation rings

Let R be an entire ring, and K = Frac R be its quotient field.

Definition 2.16 (Real valuation and value group). Let $K$ be a field. A real valuation on $K$ is a function $\nu : K \to \mathbb{R} \cup \{\infty\}$ such that

1. $\nu(xy) = \nu(x) + \nu(y)$.

2. $\nu(x + y) \ge \min\{\nu(x), \nu(y)\}$.

3. $\nu(x) = \infty \iff x = 0$.

The image $\nu(K \setminus \{0\}) \subseteq \mathbb{R}$ is called the value group.

Definition 2.17 (Discrete valuation). Let $K$ be a field. A real valuation on $K$ such that the value group is a subgroup of $\mathbb{Z}$ is called a discrete valuation, in which case the value group has the form $e \cdot \mathbb{Z}$ for some $e > 0$.

We introduce the notation,

Rν := {x ∈ K | ν(x) ≥ 0} ⊂ K (subring)

mν := {x ∈ K | ν(x) > 0} ⊂ Rν (ideal)
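A standard instance of these definitions is the $p$-adic valuation on $K = \mathbb{Q}$, which counts the power of a fixed prime $p$ in a rational number. The sketch below is an illustration with hypothetical function names; it implements this valuation with exact fractions, together with membership tests for the sets $R_\nu$ and $\mathfrak{m}_\nu$ introduced above.

```python
import math
from fractions import Fraction


def nu(x, p):
    """p-adic valuation of a rational number x; math.inf for x == 0."""
    x = Fraction(x)
    if x == 0:
        return math.inf
    num, den = abs(x.numerator), x.denominator
    v = 0
    while num % p == 0:   # powers of p in the numerator count positively
        num //= p
        v += 1
    while den % p == 0:   # powers of p in the denominator count negatively
        den //= p
        v -= 1
    return v


def in_valuation_ring(x, p):
    """x lies in R_nu, i.e. nu(x) >= 0."""
    return nu(x, p) >= 0


def in_maximal_ideal(x, p):
    """x lies in m_nu, i.e. nu(x) > 0."""
    return nu(x, p) > 0
```

With $p = 3$, for example, $\nu(9/2) = 2$ and $\nu(1/3) = -1$, so $1/3 \notin R_\nu$ while its inverse 3 lies in $\mathfrak{m}_\nu$. The inequality in axiom 2 can be strict: $\nu(1) = \nu(2) = 0$ but $\nu(1 + 2) = 1$.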

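Then $R_\nu$ is a discrete valuation ring. (Let $x \in K \setminus R_\nu$. By the definition of $R_\nu$, this means $\nu(x) < 0$. Hence $\nu(x^{-1}) = -\nu(x) > 0$, from which we conclude that $x^{-1} \in R_\nu$.) In addition, $\mathfrak{m}_\nu$ is the unique maximal ideal whose existence is guaranteed by Corollary 2.12. Let $\mathfrak{m}_n := \{x \in R_\nu \mid \nu(x) \ge n\}$ for $n \in \mathbb{Z}_{>0}$. Then

\[ \mathfrak{m}_1 = \mathfrak{m}_\nu \supseteq \mathfrak{m}_2 \supseteq \cdots \supseteq \mathfrak{m}_n \supseteq \cdots \]

We take $\pi \in \mathfrak{m}_1$ with $\nu(\pi) = 1$. In fact, $\pi$ is uniquely determined up to multiplication by a unit in $R_\nu$. Let $\nu(\pi') = 1$. Then

\[ \nu\!\left(\frac{\pi'}{\pi}\right) = 0 \Rightarrow \frac{\pi'}{\pi} \in R_\nu \setminus \mathfrak{m}_\nu = U(R_\nu) \Rightarrow \frac{\pi'}{\pi} = \varepsilon \in U(R_\nu) \Rightarrow \pi' = \varepsilon \pi. \]

The same argument shows that $\mathfrak{m}_n = \mathfrak{m}^n$ for all $n$:

\[ x \in \mathfrak{m}_n \setminus \mathfrak{m}_{n+1} \Rightarrow \nu(x) = n \Rightarrow \nu\!\left(\frac{x}{\pi^n}\right) = 0 \Rightarrow x = \varepsilon \pi^n \Rightarrow x \in \mathfrak{m}^n \Rightarrow \mathfrak{m}_n = \mathfrak{m}^n. \]

Proposition 2.18. Let $R$ be a discrete valuation ring. Then $R$ is a principal ideal domain.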

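Proof. Let $I \subseteq R$ be a non-zero ideal, and let $n$ be the minimum of $\nu$ on $I \setminus \{0\}$. An element $x \in I$ with $\nu(x) = n$ has the form $x = \varepsilon \pi^n$ with $\varepsilon$ a unit, so $\pi^n \in I$. Hence $I = \mathfrak{m}_n = \mathfrak{m}^n = (\pi^n)$ is a principal ideal.

Theorem 2.19. Let $R$ be a noetherian, entire local ring such that every non-zero prime ideal is maximal. Then the following are equivalent: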

(i) R is a discrete valuation ring.

(ii) R is integrally closed.

(iii) m is a principal ideal.

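(iv) $\mathfrak{m}/\mathfrak{m}^2$ is an $R/\mathfrak{m}$-vector space of dimension one.

(v) For all ideals $\mathfrak{a} \neq 0$ in $R$, there is an $n \ge 0$ with $\mathfrak{a} = \mathfrak{m}^n$.

(vi) There exists a $\pi \in R$ such that for all ideals $\mathfrak{a} \subseteq R$, $\mathfrak{a} = (\pi)^n$ for some $n \ge 0$.

Lemma 2.20. If $R$ is a discrete valuation ring, then the module quotient $\mathfrak{m}^n/\mathfrak{m}^{n+1}$ is an $R/\mathfrak{m}$-vector space for all $n$, and it has dimension one.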

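Proof. Since $\mathfrak{m}^{n+1}$ is an $R$-module, multiplication by $r$ in $\mathfrak{m}^n/\mathfrak{m}^{n+1}$ is well-defined: $r : x + \mathfrak{m}^{n+1} \mapsto rx + r\mathfrak{m}^{n+1} = rx + \mathfrak{m}^{n+1}$. Hence $\mathfrak{m}^n/\mathfrak{m}^{n+1}$ is an $R$-module. Since $\mathrm{Ann}(\mathfrak{m}^n/\mathfrak{m}^{n+1}) = \mathfrak{m}$, it is furthermore an $R/\mathfrak{m}$-module. Finally, since $\mathfrak{m}^n = (\pi^n)$, the isomorphism

\[ \mathfrak{m}^n/\mathfrak{m}^{n+1} \xrightarrow{\sim} R/\mathfrak{m}, \qquad x = \varepsilon \pi^n \mapsto \varepsilon \quad (0 \neq x \in \mathfrak{m}^n), \]

shows that it is a one dimensional vector space over $R/\mathfrak{m}$.

Let $R$ be a discrete valuation ring. We give it a topology of open sets generated by the basis of open neighborhoods of 0: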

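\[ R \supset \mathfrak{m} \supset \mathfrak{m}^2 \supset \cdots \supset \mathfrak{m}^n \supset \cdots \supset 0 \]

The topology is separated if

\[ \bigcap_{n \ge 0} \mathfrak{m}^n = (0). \]

If $R$ is noetherian, then the intersection $I = \bigcap_{n \ge 0} \mathfrak{m}^n$ satisfies $\mathfrak{m}I = I$, and a lemma of Nakayama (to be proved later) then gives $I = 0$.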

Lecture 3

Absolute Values and Completions

Today we introduce the general theory of absolute values and completions with respect to them. Then we examine the particular case of absolute values on number fields. Finally, we prove Nakayama’s Lemma.

3.1 Basic facts about absolute values

Definition 3.1 (Absolute value). Let $K$ be a field. An absolute value on $K$ is a function $|\cdot| : K \to \mathbb{R}_{\ge 0}$ such that the following three properties hold.

1. $|x + y| \le |x| + |y|$.

2. $|xy| = |x| \cdot |y|$.

3. $|x| = 0 \iff x = 0$.

An absolute value is called non-archimedean if it additionally satisfies the inequality

\[ |x + y| \le \max(|x|, |y|). \]

Lemma 3.2. Let $(K, |\cdot|)$ be a valued field (a field with absolute value). Then $|1| = |-1| = 1$.

Proof. By multiplicativity, $1 = 1 \cdot 1$ and $1 = (-1)(-1)$ imply that $|1| = |1| \cdot |1|$ and $|1| = |-1| \cdot |-1|$. Since only 0 has absolute value 0, we may divide by $|1|$ in the first relation, which gives $|1| = 1$; the second relation then gives $|-1|^2 = 1$, hence $|-1| = 1$.

Proposition 3.3. The absolute value $|\cdot|$ is non-archimedean if and only if $|\cdot|$ is bounded on $\mathbb{Z} \subset K$.

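Proof. ⇒: Let the absolute value be non-archimedean. We proceed by induction on $n$. In the base case, $|1| = |1 + 0| \le \max(|1|, |0|) \le 1$. For the general case, let $n \in \mathbb{Z}_{>1}$. Then $|n| = |(n - 1) + 1| \le \max(|n - 1|, |1|) \le 1$. Since $|-n| = |n|$, the absolute value is bounded by 1 on all of $\mathbb{Z}$.

⇐: Now let the absolute value be bounded on $\mathbb{Z}$. Let $x, y \in K$. Consider the binomial formula

\begin{align*}
(x + y)^n &= \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k \\
|x + y|^n &\le \sum_{k=0}^{n} \left| \binom{n}{k} \right| \cdot M^n, & M &= \max(|x|, |y|) \\
&\le (n + 1)\, c\, M^n, & c &= \sup_{m \in \mathbb{Z}} |m| < \infty.
\end{align*}

Hence

\[ |x + y| \le (n + 1)^{1/n} c^{1/n} M \quad \text{for all } n \in \mathbb{N}. \]

Taking the limit as $n$ goes to infinity gives $|x + y| \le M$.

A straightforward calculation shows that every absolute value $|\cdot| : K \to \mathbb{R}_{\ge 0}$ induces a metric $d : K \times K \to \mathbb{R}_{\ge 0}$ by the formula $d(x, y) = |x - y|$.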

Definition 3.4 (Completion). Let $(K, |\cdot|)$ be a field with absolute value. Let

\[ S = \{(x_n) \in K^{\mathbb{N}} \mid (\forall \varepsilon > 0)(\exists N)(\forall m, n > N)(|x_m - x_n| < \varepsilon)\} \]

be the set of Cauchy sequences. The completion, $\hat{K}$, of the field $K$ with respect to $|\cdot|$ is

\[ \hat{K} = S/\sim, \qquad (x_n) \sim (y_n) \iff \lim_{n \to \infty} (x_n - y_n) = 0, \]

with the canonical embedding of $K$ given by the map to constant sequences,

\[ \iota : K \to \hat{K}, \qquad x \mapsto [(x)_{n \in \mathbb{N}}]. \]

The operations are defined element-wise, i.e., $[(x_n)] + [(y_n)] = [(x_n + y_n)]$ and $[(x_n)] \cdot [(y_n)] = [(x_n y_n)]$. These are well-defined and give it the structure of a field.

The following can be easily checked:

• The completion $\hat{K}$ is complete, i.e., every Cauchy sequence in $\hat{K}^{\mathbb{N}}$ converges in $\hat{K}$.

• We extend $|\cdot|$ to $\widehat{|\cdot|}$ on $\hat{K}$ by taking limits, i.e., $|[(x_n)]| := \lim_{n \to \infty} |x_n|$.

A simple question asks whether an arbitrary field admits an absolute value. In fact, every field admits at least the trivial absolute value.

Definition 3.5 (Trivial absolute value). Let K be a field. Every field admits at least the trivial absolute value defined by |x| = 1 for all x 6= 0 in K.

Lemma 3.6. If (K, | · |) is a finite field with absolute value, then |x| = 1 for all x ∈ K \{0}.

Proof. The absolute value must take the zero element $0_K$ to 0, and every non-zero element $x$ is a torsion element of $K^\times$, say $x^q = 1$ with $q \ge 1$, so $|x|^q = |x^q| = |1| = 1$ and hence $|x| = 1$.

Definition 3.7 (Equivalent absolute values). Let K be a field. Two absolute values | · |1 and | · |2 on K are said to be equivalent if they define the same topology.

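Theorem 3.8. Let $K$ be a field, and let $|\cdot|_1$ and $|\cdot|_2$ be non-trivial absolute values on $K$. They are equivalent if and only if $|x|_1 < 1$ implies $|x|_2 < 1$ for all $x \neq 0$. In that case, there exists a $\lambda > 0$ such that

\[ |x|_2 = |x|_1^{\lambda} \qquad \forall x \in K \setminus \{0\}. \]

Proof. Let $x \in K \setminus \{0\}$. Then $|x| < 1$ if and only if $\lim_{n \to \infty} |x^n| = 0$. But equivalence means that $\lim_{n \to \infty} |x^n|_1 = 0$ exactly when $\lim_{n \to \infty} |x^n|_2 = 0$. Hence

\[ |\cdot|_1 \sim |\cdot|_2 \iff \left( \lim_{n \to \infty} |x^n|_1 = 0 \iff \lim_{n \to \infty} |x^n|_2 = 0 \right) \iff \left( |x|_1 < 1 \iff |x|_2 < 1 \right). \]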

This proves the first part.

For the second part, let $x_0 \in K \setminus \{0\}$ be such that $|x_0|_1 \neq 1$ (it exists since the absolute values are non-trivial). If $|x_0|_1 < 1$, then replace $x_0$ by its inverse, so that in either case, $|x_0|_1 > 1$. The goal is to show that $|x_0|_2 = |x_0|_1^{\lambda}$. Solving for $\lambda$ suggests the definition,

\[ \lambda = \frac{\log |x_0|_2}{\log |x_0|_1}. \]

This definition is well-defined, because $|x_0|_1$ is neither 1 nor $\infty$. It depends a priori on $x_0$. Our task will be to show that it is in fact independent of this choice. Let $x \in K \setminus \{0\}$ be arbitrary, and let $\alpha \in \mathbb{R}$ be such that $|x|_1 = |x_0|_1^{\alpha}$. Let $(s_n)$ and $(r_n)$ be sequences in $\mathbb{Q}$ whose limits are both $\alpha$ (i.e., $\lim_{n \to \infty} s_n = \lim_{n \to \infty} r_n = \alpha$), but such that $r_n \ge \alpha$ for all $n$ while $s_n \le \alpha$ for all $n$. Let $r_n = a_n/b_n$ with $a_n, b_n \in \mathbb{Z}$ and $b_n > 0$. Hence

\[ |x|_1 = |x_0|_1^{\alpha} \le |x_0|_1^{r_n} \Rightarrow \left| \frac{x^{b_n}}{x_0^{a_n}} \right|_1 = \frac{|x|_1^{b_n}}{|x_0|_1^{a_n}} \le 1 \iff \left| \frac{x^{b_n}}{x_0^{a_n}} \right|_2 \le 1 \Rightarrow \frac{|x|_2^{b_n}}{|x_0|_2^{a_n}} \le 1 \Rightarrow |x|_2 \le |x_0|_2^{r_n}. \]

Taking the limit as $n$ goes to infinity results in the inequality $|x|_2 \le |x_0|_2^{\alpha}$. The same argument with $(r_n)$ replaced by $(s_n)$ gives $|x|_2 \ge |x_0|_2^{\alpha}$. Hence

\[ |x|_2 = |x_0|_2^{\alpha} = |x_0|_1^{\lambda \alpha} = |x|_1^{\lambda} \]

for all $x \in K \setminus \{0\}$.

Definition 3.9 (Discrete absolute value). An absolute value is discrete if and only if $|K \setminus \{0\}| \subseteq \mathbb{R}_{\ge 0}$ is a discrete subset. A (non-trivial) discrete absolute value satisfies $|K \setminus \{0\}| = q^{\mathbb{Z}}$ for some $0 < q < 1$.

Definition 3.10 (Homomorphism of valued fields). A homomorphism of valued fields, $\sigma : (K, |\cdot|) \to (K', |\cdot|')$, is a field homomorphism $\sigma : K \to K'$ such that $|\sigma x|' = |x|$. In other words, $\sigma^*(K', |\cdot|') = (K, |\cdot|)$.

Next time we will restrict ourselves to the special case of number fields. Each homomorphism from the field into the complex numbers, which form a valued field, induces an absolute value on the field. By this construction we derive all archimedean absolute values on the field. Then we will determine all absolute values on the rationals by the theorem of Ostrowski, which gives information about absolute values on number fields more generally. The absolute values on the rationals satisfy the product formula, i.e., that the product of all absolute values of a non-zero rational number gives one. Then we will be prepared to study heights.
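The product formula mentioned here can be checked by hand or by machine for any given rational number. The sketch below assumes the usual normalization $|x|_p = p^{-\nu_p(x)}$ of the $p$-adic absolute values, which this section has not yet introduced, together with the ordinary absolute value; the function names are illustrative only. Only primes dividing the numerator or denominator contribute a factor different from 1.

```python
from fractions import Fraction


def prime_divisors(n):
    """Distinct prime divisors of a non-zero integer by trial division."""
    n = abs(n)
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def p_adic_abs(x, p):
    """|x|_p = p^(-v), where v is the exponent of p in the non-zero rational x."""
    x = Fraction(x)
    num, den = abs(x.numerator), x.denominator
    v = 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)


def product_of_absolute_values(x):
    """Ordinary absolute value times all non-trivial p-adic absolute values."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the product formula concerns non-zero rationals")
    result = abs(x)
    for p in prime_divisors(x.numerator * x.denominator):
        result *= p_adic_abs(x, p)
    return result
```

For $x = -360/7$ the factors are $360/7$, $1/8$, $1/9$, $1/5$ and $7$, whose product is 1.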

3.2 Absolute Values on Number Fields

Let K ⊇ Q be a number field, and let

Σ := HomQ(K, C)

be the set of $\mathbb{Q}$-linear field homomorphisms to $\mathbb{C}$. The standard absolute value on $\mathbb{C}$ is $|z| = (z \cdot \bar{z})^{1/2}$ and makes $(\mathbb{C}, |\cdot|)$ into a valued field. Conjugation $F^\infty : z \mapsto \bar{z}$ preserves the standard absolute value and fixes the real numbers. It is known in this context as the Frobenius morphism at infinity. The group $\{1, F^\infty\}$ acts on $\Sigma$. Given an embedding $\sigma : K \to \mathbb{C}$, the standard absolute value on $\mathbb{C}$ can be pulled back to give an absolute value on $K$ defined by,

|x|σ := |σx|, x ∈ K

∞ Since F preserves the absolute value, the pullback |·|F ∞σ gives an equivalent absolute value on K. Such pull-backs give all archimedean absolute values on a number field K. √To illustrate these notions, let us consider a simple example. Let K = Q( 2). Then Σ = {σ, τ}. These are two embeddings in C, √ √ √ √ σ : a + b 2 7→ a + b 2 τ : a + b 2 7→ a − b 2 inducing two absolute values, √ √ √ √ |a + b 2|σ = a + b 2 |a + b 2|τ = |a − b 2| √ On the other hand, if K = Q( −2), then the embeddings, √ √ √ √ a + b −2 7→ a + b −2 a + b −2 7→ a − b −2, √ √ induce the same absolute value (|a + b −2| = a2 + b2), and there is just one (archimedean) absolute value on K. In general, the set of embeddings can be decomposed into real and com- plex embeddings as Σ = ΣR ∪ ΣC. Let r = |ΣR| and 2s = |ΣC|. Then r + 2s = [K : Q], and r + s is the total number of archimedean absolute values. It remains to classify non-archimedean absolute values on K. To that end, let (K, | · |) be a non-archimedean value field. Then

\[ \mathcal{O} := \{x \in K \mid |x| \le 1\} \supseteq \mathbb{Z}, \qquad \mathfrak{p} := \{x \in K \mid |x| < 1\}. \]

The set $\mathcal{O}$ is a ring. Indeed, if $x, y \in \mathcal{O}$, then both $x - y$ and $xy$ are in $\mathcal{O}$, because $|x - y| \le \max(|x|, |-y|) \le 1$ and $|xy| = |x||y| \le 1$. It is even a valuation ring, because for all non-zero $x \in K$, either $|x| \le 1$ or $|x| > 1$, hence either $x \in \mathcal{O}$ or $x^{-1} \in \mathcal{O}$.

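On the other hand, if we begin with $\mathcal{O}_K \subset K$, the integral closure of $\mathbb{Z}$ in $K$, and $\mathfrak{p} \subset \mathcal{O}_K$ is a non-zero prime ideal, then $\mathfrak{p}$ is maximal. Since $\mathfrak{p} \cap \mathbb{Z} = (p)$ is a prime ideal, the number $p$ is prime. Furthermore, the quotient by $\mathfrak{p}$, $\mathcal{O}_K/\mathfrak{p}$, is integral over $\mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p$. Indeed, let $[x] \in \mathcal{O}_K/\mathfrak{p}$ be an equivalence class with $x \in \mathcal{O}_K$ its representative. Since $x$ is integral over $\mathbb{Z}$, it satisfies a monic polynomial relation with integral coefficients which can be reduced modulo $\mathfrak{p}$:

\[ x^n + a_1 x^{n-1} + \cdots + a_{n-1}x + a_n = 0 \;\Rightarrow\; [x]^n + [a_1][x]^{n-1} + \cdots + [a_{n-1}][x] + [a_n] = [0] \]

The second relation shows that $[x]$ is integral over $\mathbb{Z}/p\mathbb{Z}$.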

Proposition 3.11. Retaining the above notation, the ring OK /p is a field. More generally, a ring which is integral over a field is a field.

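Proof. We must show that every non-zero element has an inverse. Let $[x] \in \mathcal{O}_K/\mathfrak{p}$ be non-zero. We would like to find a $[y] \in \mathcal{O}_K/\mathfrak{p}$ such that $[x][y] = [1]$. Since $[x]$ is integral over $\mathbb{Z}/p\mathbb{Z}$, it satisfies a monic relation as above. Choosing one of minimal degree $n$, we have $[a_n] \ne [0]$, because $\mathcal{O}_K/\mathfrak{p}$ is a domain and otherwise we could divide the relation by $[x]$. Hence

\[ [1] = -[a_n]^{-1}\left([x]^{n-1} + [a_1][x]^{n-2} + \cdots + [a_{n-1}]\right)[x]. \]

Letting $[y] = -[a_n]^{-1}\left([x]^{n-1} + [a_1][x]^{n-2} + \cdots + [a_{n-1}]\right)$ finishes the proof. The same proof holds for the general case of a domain which is integral over a field. Since the quotient by the ideal $\mathfrak{p}$ is a field, $\mathfrak{p}$ must be a maximal ideal.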

Proposition 3.12. The ring of integers OK is integrally closed and every nonzero prime ideal of OK is maximal.

Since $\mathcal{O}_K$ is a noetherian domain, the immediate consequence of the proposition is that $\mathcal{O}_K$ is a Dedekind domain, i.e., an integrally closed, noetherian domain of Krull dimension one. Krull dimension one means exactly that every nonzero prime ideal is maximal. In a Dedekind domain, every nonzero proper ideal factors uniquely into a product of prime ideals. This unique factorization property makes Dedekind domains the most important rings in arithmetic. They also enjoy the property that all primary ideals are powers of prime ideals.

We have begun with a prime ideal $\mathfrak{p}$ in $\mathcal{O}_K$, and we would now like to construct an absolute value from it. To begin, its powers define a filtration,

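\[ \mathcal{O}_K \supset \mathfrak{p} \supseteq \mathfrak{p}^2 \supseteq \mathfrak{p}^3 \supseteq \cdots \supseteq \mathfrak{p}^n \supseteq \cdots \supseteq (0) \]

We claim the above chain of ideals tends to the zero ideal in the limit, i.e.,

\[ I := \bigcap_{n \ge 0} \mathfrak{p}^n = (0). \tag{3.1} \]

We will show this after proving Nakayama's lemma. This vanishing implies that the inclusions of the above filtration are strict and define a basis for a separated topology on $\mathcal{O}_K$. In this case, the filtration determines a valuation by the definition,

\[ \operatorname{ord}_{\mathfrak{p}} : \mathcal{O}_K \to \mathbb{Z} \cup \{\infty\}, \qquad x \mapsto \min\{e \in \mathbb{N} \mid x \in \mathfrak{p}^e \setminus \mathfrak{p}^{e+1}\}. \]

A straightforward calculation shows that this is indeed a valuation. It can be used to define an absolute value on $\mathcal{O}_K$ by choosing $0 < q < 1$ and letting

\[ |x|_{\mathfrak{p}} := q^{\operatorname{ord}_{\mathfrak{p}} x} \]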

This absolute value depends on a choice, namely a choice of real number, q, between 0 and 1. But different choices yield equivalent absolute values.
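For $K = \mathbb{Q}$ and $\mathfrak{p} = (p)$ the construction can be tried out directly. The following Python sketch computes $\operatorname{ord}_p$ and $|x|_p = q^{\operatorname{ord}_p x}$ for rational $x$. The function names `ord_p` and `abs_p` are our own choice for illustration, and the default $q = 1/p$ is an assumption (it is the usual normalization); exact arithmetic relies on the standard `fractions` module.

```python
from fractions import Fraction

def ord_p(x, p):
    """p-adic order of a non-zero rational x: the exponent of p in x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("ord_p(0) is +infinity")
    num, den, e = x.numerator, x.denominator, 0
    while num % p == 0:      # powers of p in the numerator count positively
        num //= p
        e += 1
    while den % p == 0:      # powers of p in the denominator count negatively
        den //= p
        e -= 1
    return e

def abs_p(x, p, q=None):
    """|x|_p = q**ord_p(x) with 0 < q < 1; q defaults to 1/p (an assumption)."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    q = Fraction(1, p) if q is None else Fraction(q)
    return q ** ord_p(x, p)

x = Fraction(18, 5)
print(ord_p(x, 3), abs_p(x, 3), abs_p(x, 5))   # 2 1/9 5
```

Choosing $q = 1/2$ instead of $1/3$ for $p = 3$ gives $|18/5|_3 = 1/4 = (1/9)^{\log 2/\log 3}$, which illustrates that different choices of $q$ only replace the absolute value by a power of itself.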

Definition 3.13 (Place). A place of a number field K is an equivalence class of absolute values on K. There are two kinds of absolute values. An infinite place is a class containing an archimedean absolute value, while a finite place contains non-archimedean absolute values.

Every prime ideal p ⊂ OK defines a place by taking the equivalence class of absolute values containing | · |p. If p = char OK /p, then ordp(x) | p for all x ∈ K.

3.3 Lemma of Nakayama

Proposition 3.14. Let $R$ be a Dedekind domain, and let $M$ be a finitely generated $R$-module. Let $\mathfrak{a} \subset R$ be an ideal, and let $\varphi \in \operatorname{End}_R(M)$¹ be such that $\varphi(M) \subseteq \mathfrak{a} \cdot M$.² Then $\varphi$ satisfies the equation

\[ \varphi^n + a_1 \varphi^{n-1} + \cdots + a_{n-1}\varphi + a_n = 0, \qquad a_i \in \mathfrak{a}. \]

¹ Note that $\operatorname{End}_R(M)$ is an $R$-module containing $R$ by the canonical morphism $r \mapsto r \cdot \mathrm{id}$.
² The set $\{\varphi \in \operatorname{End}_R(M) \mid \varphi(M) \subseteq \mathfrak{a} \cdot M\}$ is a right ideal in $\operatorname{End}_R(M)$.

Proof. Let $m_1, \ldots, m_k$ be generators for $M$ over $R$. Then the hypothesis $\varphi(M) \subseteq \mathfrak{a} \cdot M$ implies,

\[ \varphi(m_i) = \sum_{j=1}^{k} a_{ij} m_j, \qquad a_{ij} \in \mathfrak{a}. \]

This can be rewritten as,

\[ \sum_{j=1}^{k} (\delta_{ij}\varphi - a_{ij}) m_j = 0, \qquad i = 1, \ldots, k. \]

Letting $\Psi$ be the matrix $(\delta_{ij}\varphi - a_{ij})$, the system has the form $\Psi \cdot m = 0$. Multiplying both sides by the adjugate of $\Psi$ gives $d\,m := \det(\delta_{ij}\varphi - a_{ij}) \cdot m = 0$. Hence $d\,m_i = 0$ for all $i$. Since the $m_i$ generate $M$, it must be that $d = 0$. This finishes the proof with the relation $\det(\delta_{ij}\varphi - a_{ij}) = 0$ as the desired equation.

Corollary 3.15. Let $R$ again be a Dedekind domain with proper ideal $\mathfrak{a} \subset R$, and let $M$ be a finitely generated $R$-module. If $\mathfrak{a}M = M$ then there exists $x \in R$ such that $x \equiv 1 \pmod{\mathfrak{a}}$ and $xM = 0$.

Proof. Letting $\varphi := \mathrm{id}$ and applying the previous proposition gives that $\mathrm{id}^n + a_1 \mathrm{id}^{n-1} + \cdots + a_n = 0$ for some $a_i \in \mathfrak{a}$. Let $x = 1 + a_1 + \cdots + a_n$. Then $x \equiv 1 \pmod{\mathfrak{a}}$ and $xM = (\mathrm{id}^n + \cdots + a_n)M = 0$.

Lemma 3.16 (Nakayama). If $\mathfrak{a} \subset R$ is an ideal contained in each maximal ideal, and $\mathfrak{a}M = M$, then $M = 0$.

Proof. Let $x$ be as in the corollary. Then $x \equiv 1 \pmod{\mathfrak{a}}$. Hence $x \equiv 1 \pmod{\mathfrak{m}}$ for all maximal ideals $\mathfrak{m}$, i.e., $x - 1 \in \mathfrak{m}$. If $x$ were in $\mathfrak{m}$, then 1 would also be in $\mathfrak{m}$, so $x$ is not in any maximal ideal. Hence $x$ is a unit, so that $x^{-1} \in R$. Thus $xM = 0$ implies that $M = x^{-1}xM = 0$.

Returning to the condition (3.1), the definition of $I$ implies that

I = p · I (3.2)

The inclusion $\mathfrak{p} \cdot I \subseteq I$ follows straight from the definition. In the other direction, the difficulty is that an element $x \in I = \bigcap_n \mathfrak{p}^n$ might not factor uniformly into a product $x = yz$ for $y \in \mathfrak{p}$ and $z \in \mathfrak{p}^n$ for all $n$. (It does of course factor separately for each $n$, but this is not sufficient.) Since $\mathcal{O}_K$ is noetherian, this factorization is a consequence of the Artin-Rees Lemma.

Lemma 3.17 (Artin-Rees). Let $I \subset R$ be an ideal in a Noetherian ring $R$. Let $M$ be a finitely generated $R$-module and $N \le M$ a submodule. There exists an integer $k \ge 1$ such that

\[ I^n M \cap N = I^{n-k}\left((I^k M) \cap N\right) \quad \text{for all } n \ge k. \]

Applying this with the ideal $\mathfrak{p}$ in place of the ideal of the lemma, $M := \mathcal{O}_K$ and $N := I$ (the intersection from (3.1)) gives that,

\[ \mathfrak{p}^n \cap I = \mathfrak{p}^{n-k}(\mathfrak{p}^k \cap I), \qquad n \ge k. \]

Since $I \subseteq \mathfrak{p}^n$ for all $n$, this can be rewritten as $I = \mathfrak{p}^{n-k} I$. Letting $n = k + 1$ results in the desired equality.

Were the equality (3.2) to hold for a non-zero $I$, it would seem strange, because the equality could be interpreted as saying that elements of $I$ are infinitely divisible. In other words, one might take $x \in I$ and factor out an element of $\mathfrak{p}$, e.g., $x = p_1 \cdot x_1$. Iteratively applying the equality, an arbitrary number of elements of $\mathfrak{p}$ may be factored out, $x = p_1 \cdots p_n x_n$. More precisely, localizing the exact sequence,

\[ 0 \to \mathfrak{p}I \to I \to I/\mathfrak{p}I \to 0 \]

at $\mathfrak{p}$ produces

\[ 0 \to (\mathfrak{p}I)_{\mathfrak{p}} \to I_{\mathfrak{p}} \to (I/\mathfrak{p}I)_{\mathfrak{p}} \cong 0 \to 0. \]

By writing out a generic element, one sees that $(\mathfrak{p}I)_{\mathfrak{p}} \cong \mathfrak{a} I_{\mathfrak{p}}$, where $\mathfrak{a} = \mathfrak{p}\mathcal{O}_{K,\mathfrak{p}}$. Localization is exact, so this gives

\[ \mathfrak{a} I_{\mathfrak{p}} \cong I_{\mathfrak{p}}. \]

Since $\mathfrak{a}$ is the unique maximal ideal in the localization, we may apply Nakayama's lemma to infer that $I_{\mathfrak{p}} = 0$. Since $\mathcal{O}_K$ has no zero-divisors, this implies that $I = 0$, which was to be shown.

We have constructed in a very natural way a set of absolute values. It is not yet clear whether these are all. This will be a central point of the next lecture. In the case of $K = \mathbb{Q}$, it will turn out that in principle, we have already found them all. The result must then be extended to arbitrary number fields $K$. In this direction we will prove the product formula and then proceed to heights.

3.4 Ostrowski’s Theorem

To summarize the results so far, if K ⊇ Q is a number field and OK ⊃ Z is the integral closure of Z in K, then the absolute values fall into two categories:

Archimedean absolute values These arise as the pull-backs of (C, |·|) by the embeddings of K in C, HomQ(K, C).

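Non-archimedean absolute values The ring $\mathcal{O}_K$ is a Dedekind domain. Every prime ideal in $\mathcal{O}_K$ is maximal, and there is a descending chain $\mathcal{O}_K \supset \mathfrak{p} \supset \mathfrak{p}^2 \supset \cdots \supset (0)$ such that $\bigcap_{n=0}^{\infty} \mathfrak{p}^n = (0)$ (which implies that the inclusions are strict). The associated absolute value is $|x|_{\mathfrak{p}} := \lambda^{\operatorname{ord}_{\mathfrak{p}} x}$ for $0 < \lambda < 1$.

Each non-archimedean absolute value defines a valuation ring $\mathcal{O}_{\mathfrak{p}} \supset \mathcal{O}_K$ by $\mathcal{O}_{\mathfrak{p}} := \mathcal{O}_{K,\mathfrak{p}}$, the localization at the set $S = \mathcal{O}_K \setminus \mathfrak{p}$. These local rings intersect to give the original ("global") ring, i.e., $\bigcap_{\mathfrak{p}} \mathcal{O}_{\mathfrak{p}} = \mathcal{O}_K$.

Remark 3.18. A non-archimedean absolute value can be normalized. First a prime number $p$ can be associated to $\mathfrak{p}$ by intersecting $\mathfrak{p} \cap \mathbb{Z} = (p)$ and choosing the positive generator. Let $q$ be chosen so that $q^{-\operatorname{ord}_{\mathfrak{p}} p} = p$, i.e., so that $|p|_{\mathfrak{p}} = |p|_p$. The correct choice is $q = p^{-1/e_{\mathfrak{p}}}$, where $e_{\mathfrak{p}} = \operatorname{ord}_{\mathfrak{p}} p$. This is the normalization.

Remark 3.19. This works also for $\mathbb{Q}$ replaced by $\mathbb{Q}_p$. There are inclusions $K \supseteq \mathbb{Q}_p \supseteq \mathbb{Z}_p$. The ring of integers $\mathcal{O}_{\mathfrak{p}} \subset K$ is the integral closure of $\mathbb{Z}_p$ in $K$.

Applying this theory to $\mathbb{Q}$ gives the absolute values

\[
\begin{cases}
|x|_\infty = \max(x, -x) & \text{archimedean} \\
|x|_p = p^{-\operatorname{ord}_p x} & \text{non-archimedean (normalize } \lambda = \tfrac{1}{p}\text{)}
\end{cases}
\]

These are all absolute values: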

Theorem 3.20 (Ostrowski3). Up to equivalence these are all the non-trivial absolute values on Q.

Proof. We fix a non-trivial absolute value on $\mathbb{Q}$. Then by Lemma 3.2, $|1| = |-1| = 1$. Let $k \in \mathbb{N}$ be a natural number. Then

\[ |k| = |\underbrace{1 + \cdots + 1}_{k}| \le k|1| = k \;\Rightarrow\; |-k| \le k. \]

For integers $m, n > 1$, define

\[ N := \max(1, |n|). \]

We can expand $m$ in terms of powers of $n$, i.e., $m = m_0 + m_1 n + \cdots + m_d n^d$ where $0 \le m_i < n$ and $m_d \ne 0$. The necessary power of $n$ may be detected by the inequality $m \ge n^d$, which rewritten becomes $d \le \log m / \log n$. Applying absolute values to the expansion of $m$ gives,

\[
\begin{aligned}
|m| &\le |m_0| + |m_1||n| + \cdots + |m_d||n|^d \\
    &\le m_0 + m_1|n| + \cdots + m_d|n|^d \\
    &< n(1 + |n| + \cdots + |n|^d) \\
    &\le n(1 + N + \cdots + N^d) \\
    &\le n(d + 1)N^d.
\end{aligned}
\]

3 Ostrowski was professor at Basel.

Choose an integer $s > 0$, and replace $m$ by $m^s$. Then $d$ becomes $ds$.

\[ |m|^s = |m^s| \le n(ds + 1)N^{ds} \]

Taking the $s$th root gives

\[ |m| \le n^{1/s}(ds + 1)^{1/s} N^{d} \]

The limit as $s$ approaches infinity is $|m| \le N^d \le N^{\log m/\log n}$, which rewritten gives

\[ |m|^{1/\log m} \le N^{1/\log n} = \max(1, |n|)^{1/\log n}. \]

The maximum divides into two cases according to whether the absolute value is archimedean. In the first case, let $\max(1, |n|) = |n|$ for some integer $n > 1$. Interchanging the role of $m$ and $n$ in the above derivation gives the above inequality with $m$ and $n$ exchanged, i.e., $|n|^{1/\log n} \le \max(1, |m|)^{1/\log m}$. Concatenating these two inequalities produces an equality,

\[ |m|^{1/\log m} = |n|^{1/\log n}. \]

Since the equality holds for all $m$, it will hold in addition for all $n$. Independent of $n$, there is thus a $\lambda$ such that $|n|^{1/\log n} = e^{\lambda}$. Hence,

\[ |n| = e^{\lambda \log n} = n^{\lambda}, \qquad n \in \mathbb{Z}_{>1}. \]

In fact, for all rational numbers $r = p/q$, we have

\[ |r| = |r|_\infty^{\lambda} \]

for the standard absolute value $|\cdot|_\infty$. This holds for integers, because $|-n| = |(-1)(n)| = |-1| \cdot |n| = n^{\lambda} = |n|_\infty^{\lambda}$, and $|-1| = |1| = 1 = 1^{\lambda}$ and $|0| = 0 = 0^{\lambda}$. For a general rational number, $qr = p$ implies $|q||r| = |p|$. Hence

\[ |r| = |p|/|q| = |p|_\infty^{\lambda}/|q|_\infty^{\lambda} = |p/q|_\infty^{\lambda} = |r|_\infty^{\lambda}. \]

By Theorem 3.8, this means that $|\cdot| \sim |\cdot|_\infty$.

On the other hand, let $\max(1, |n|) = 1$ for some $n$, i.e., $|n| \le 1$. Then induction on $m$ shows that $|m| \le 1$ for all integers $m$. Proposition 3.3 then implies that this is a non-archimedean absolute value bounded by 1. We define,

\[ \mathcal{O} := \{x \in \mathbb{Q} \mid |x| \le 1\}, \qquad \mathfrak{p} := \{x \in \mathcal{O} \mid |x| < 1\}. \]

The ring $\mathcal{O}$ is a discrete valuation ring, and the ideal $\mathfrak{p} = (\pi)$ intersects $\mathbb{Z}$ in either $\mathfrak{p} \cap \mathbb{Z} = (p)$ or $(0)$. If $|\cdot|$ is non-trivial, then there exists an $x \ne 0$ with $|x| \ne 1$. Either $|x|$ or $|x^{-1}|$ is less than one, so either $x \in \mathfrak{p}$ or $x^{-1} \in \mathfrak{p}$. So it is impossible that $\mathfrak{p} \cap \mathbb{Z}$ should be $(0)$.

Let therefore $p$ be the generator of $\mathfrak{p} \cap \mathbb{Z}$, and let $|\cdot|_p$ be the normalized absolute value corresponding to $p$ for which $|p|_p = |p|$. Let $n = p^k m$ be an integer factored so that $p \nmid m$, i.e., $m \notin \mathfrak{p} \cap \mathbb{Z}$. Hence $m \in \mathcal{O} \setminus \mathfrak{p}$, so $|m| = 1 = |m|_p$. Putting this together gives,

\[ |n| = |p^k m| = |p|^k |m| = |p|_p^k |m|_p = |n|_p. \]

The equality may be extended to rational numbers, proving that |·| ∼ |·|p.

3.5 Product Formula for Q

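Theorem 3.21 (Product Formula). Let $M_{\mathbb{Q}}$ be the set of normalized valuations on $\mathbb{Q}$. For $x \in \mathbb{Q}^\times$, we have

\[ \prod_{\nu \in M_{\mathbb{Q}}} |x|_\nu = 1. \]

Proof. Let $P(x) := \prod_\nu |x|_\nu$. We show that $P(x) = 1$. Since $P : \mathbb{Q}^\times \to \mathbb{R}$ is multiplicative, it suffices to show the formula for prime $x$, and then apply prime decomposition to an arbitrary integer. Let $x = p$ be prime, and let $q$ be any finite prime. Then

\[ \operatorname{ord}_q p = \begin{cases} 1 & q = p \\ 0 & q \ne p \end{cases} \]

Therefore

\[ \prod_\nu |p|_\nu = |p|_\infty |p|_p \cdot \prod_{q \ne p} |p|_q = p \cdot p^{-1} \cdot \prod_{q \ne p} 1 = 1. \]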
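The product formula for $\mathbb{Q}$ is easy to check by machine. The sketch below multiplies the archimedean absolute value of a non-zero rational with the normalized $p$-adic absolute values at the finitely many primes dividing its numerator or denominator (all other places contribute 1). The helper names are our own, and trial division is only meant for small inputs.

```python
from fractions import Fraction

def prime_factors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    ps, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            ps.add(d)
            n //= d
        d += 1
    if n > 1:
        ps.add(n)
    return ps

def abs_p(x, p):
    """Normalized p-adic absolute value p**(-ord_p x) of a non-zero rational."""
    num, den, e = x.numerator, x.denominator, 0
    while num % p == 0:
        num, e = num // p, e + 1
    while den % p == 0:
        den, e = den // p, e - 1
    return Fraction(1, p) ** e

def product_of_absolute_values(x):
    """Product of |x|_v over v = infinity and all primes p dividing x."""
    x = Fraction(x)
    primes = prime_factors(abs(x.numerator)) | prime_factors(x.denominator)
    result = abs(x)                      # the archimedean place
    for p in primes:                     # the finite places with |x|_p != 1
        result *= abs_p(x, p)
    return result

print(product_of_absolute_values(Fraction(-360, 77)))   # 1
```

For $x = -360/77$ the factors are $360/77$ at infinity, $1/8$, $1/9$, $1/5$ at $2, 3, 5$ and $7$, $11$ at the primes of the denominator.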

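Remark 3.22. Let $\mathbb{Q}_\nu$ be the completion of $\mathbb{Q}$ at $\nu \in M_{\mathbb{Q}}$, i.e., $\mathbb{Q}_\nu = \mathbb{Q}_p$ if $\nu = p$ is a prime and $\mathbb{Q}_\nu = \mathbb{R}$ if $\nu = \infty$. Let

\[ \mathbb{A}_{\mathbb{Q}} \subseteq \prod_\nu \mathbb{Q}_\nu \]

be the subset of $(x_\nu)_{\nu \in M_{\mathbb{Q}}}$ with $|x_\nu|_\nu \le 1$ for all but finitely many $\nu$, i.e., where $x_\nu$ is a $p$-adic integer for all but finitely many $p$. Then $\mathbb{A}_{\mathbb{Q}}$ is a ring called the ring of Adèles with an embedding of $\mathbb{Q}$ by

\[ \iota : \mathbb{Q} \hookrightarrow \mathbb{A}_{\mathbb{Q}}, \qquad x \mapsto (x)_\nu \]

By the product formula, the image of $\mathbb{Q}^\times$ lies in the subset cut out by the equation $\prod_\nu |x_\nu|_\nu = 1$. This theory was initially developed by A. Weil.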

3.6 Product formula for number fields

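We now examine the general case of number fields. Let $(K, |\cdot|_\nu)$ be a valued field, and $(K_\nu, |\cdot|_w)$, its completion. Let $L \supseteq K$ be a finite, separable extension. The $K$-algebra $L$ is a finite, separable field extension. Hence $L = K[x]$ for some $x \in L$, and

\[ \pi : K[T] \to L, \qquad T \mapsto x \]

is a surjection with kernel $(f)$ for an irreducible polynomial $f$. Its quotient is $L \cong K[T]/(f)$, and changing coefficients to the completion $K_\nu$ produces $L \otimes_K K_\nu \cong K_\nu[T]/(f)$. Factoring the polynomial $f = f_1 \cdots f_n$ over the completion $K_\nu$ leads to a finite set of pairwise orthogonal idempotents. Namely, by the Chinese Remainder Theorem,

\[ K_\nu[T]/(f) \cong K_\nu[T]/\bigl((f_1) \cap \cdots \cap (f_n)\bigr) \cong \prod_{i=1}^{n} K_\nu[T]/(f_i) =: \prod_{i=1}^{n} L_i \]

Hence $E_\nu := L \otimes_K K_\nu \cong \prod_{i=1}^{n} L_i$ for fields $L_i \supseteq K_\nu$. The projections $\epsilon_i \in \operatorname{End}_{K_\nu}(E_\nu)$ such that $\epsilon_i : E_\nu \to L_i$ are idempotents. Indexing the factors by $w$, they decompose the identity as $1 = 1 \otimes 1 = \sum_w \epsilon_w$ and multiply according to the rule,

\[ \epsilon_w \cdot \epsilon_{w'} = \delta_{w,w'}\,\epsilon_w, \]

for the Kronecker delta, $\delta_{w,w'}$. These two properties can be seen using the diagram,

\[ E_\nu \cong \prod_{i=1}^{n} L_i \;\underbrace{\xrightarrow{\ \mathrm{pr}_i\ } L_i \hookrightarrow}_{\epsilon_i}\; \prod_{j=1}^{n} L_j \cong E_\nu. \]

There are only finitely many prime ideals $\mathfrak{p}_w \subset E_\nu$, one for each $w$,

\[ \mathfrak{p}_w := \bigoplus_{w' \ne w} E_\nu \epsilon_{w'} = \operatorname{Ann}(E_\nu \epsilon_w) = \Bigl\{x = \sum_{w'} x_{w'}\epsilon_{w'} \in E_\nu \;\Big|\; x_w = 0\Bigr\}. \]

They are all maximal. Letting $x = \sum_{w'} x_{w'}\epsilon_{w'}$, the condition that $0 = x\epsilon_w$ becomes $0 = \sum_{w'} x_{w'}\epsilon_{w'}\epsilon_w = x_w\epsilon_w$, which happens exactly when $x_w = 0$. Hence the morphism $\pi_w$ is a quotient by a maximal ideal, and $E_\nu\epsilon_w \cong (E_\nu/\mathfrak{p}_w) \cdot \epsilon_w = L_w$ is a field.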

Constructing the absolute values

All absolute values of $L$ will be found as absolute values of $L_w$ in the following manner. The above reasoning may be summarized by the diagram,

[Commutative diagram relating $K$, $K_\nu$, $L$, $E_\nu := L \otimes_K K_\nu = \bigoplus_w E_\nu\epsilon_w$ and $L_w := E_\nu\epsilon_w$, with maps labelled $\iota_w$, $\pi_w$, $\lambda$ and $\kappa_w$.]

By the density of the embedding of $K$ in its completion, the extension of $|\cdot|_\nu$ to $K_\nu$ is unique. We have already constructed it above by taking limits. The embedding $L \to L \otimes_K K_\nu$ embeds $L$ as a dense subset, and thus any absolute value on $L$ extends uniquely to $E_\nu$. For each norm on $L_w$ which pulls back to $|\cdot|_w$ on $K_\nu$, we can pull it back to $L$ to get a norm extending the given norm $|\cdot|_\nu$ on $K$. Each idempotent defines an extension $\|\cdot\|_w$ of the absolute value $\|\cdot\|_\nu$ to $E_\nu$:

\[ \|x\|_\nu = \sum_{i=1}^{n} |\mathrm{pr}_i x|_\nu = \sum_{i=1}^{n} |x_i|_\nu, \qquad x = (x_1, \ldots, x_n) \in K_\nu^n. \]

This norm makes $E_\nu$ into a complete $K_\nu$-vector space. It is an extension: $\iota_w^* \|\cdot\|_w = |\cdot|_\nu$. In turn, it defines an absolute value on $L$ by

\[ |x|_w := \|\kappa_w x\|_w. \]

This absolute value extends the original absolute value on $K$, i.e., $|x|_w = |\lambda x|_\nu$. We have shown,

Lemma 3.23. Let $K$ be a number field, and let $L \supseteq K$ be a finite, separable extension. In the above notation, every idempotent $\epsilon_w$ in $E_\nu$ defines an absolute value $|\cdot|_w$ on $L$ which extends $|\cdot|_\nu$.

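Letting $n_{w/v} := [L_w : K_\nu]$ and using the fact that tensoring with a field is flat, we arrive at an important formula,

\[ [L : K] = \dim_{K_\nu} E_\nu = \sum_{w \mid \nu} [L_w : K_\nu] = \sum_{w \mid \nu} n_{w/v} \]

Lemma 3.24.

\[ \|x\|_w^{n_{w/v}} = |N_{L_w/K_\nu}(x)|_\nu \]

Proof. Let $L_w \supseteq K_\nu$. Then every $\sigma \in \operatorname{Gal}(L_w/K_\nu)$ preserves the norm, i.e., $\sigma^*\|\cdot\|_w = \|\cdot\|_w$, because the norm on $L_w$ which extends the norm on $K_\nu$ is unique. Then

\[
\begin{aligned}
\|x\|_w^{n_{w/v}} &= \prod_\sigma \|x\|_w = \prod_\sigma \|x^\sigma\|_w = \Bigl\|\prod_\sigma x^\sigma\Bigr\|_w \\
&= \|N_{L_w/K_\nu}(x)\|_w \\
&= |N_{L_w/K_\nu}(x)|_\nu.
\end{aligned}
\]

The decomposition of $E_\nu$ gives a morphism

\[ x \mapsto (T_x : y \mapsto xy) \in \operatorname{End}_{K_\nu} E_\nu \cong \operatorname{End}_{K_\nu}(L_w) \]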

The idempotents induce a decomposition, $T_x = \bigoplus_w T_{x_w}$. Then,

\[ N_{E_\nu/K_\nu}(x) = \det T_x = \prod_w \det T_{x_w} = \prod_w N_{L_w/K_\nu}(x_w). \]

This leads to the theorem,

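Theorem 3.25 (Product formula for number fields). Let $x \in K$ be nonzero. Then

\[ \prod_{w \in M_K} |x|_w^{n_{w/v}} = 1. \]

Proof. By the product formula for $\mathbb{Q}$,

\[
\begin{aligned}
\prod_{w \in M_K} |x|_w^{n_{w/v}} &= \prod_{w \in M_K} |N_{L_w/K_\nu}(x)|_\nu \\
&= \prod_{\nu \in M_{\mathbb{Q}}} \prod_{w \mid \nu} |N_{L_w/K_\nu}(x)|_\nu = \prod_{\nu \in M_{\mathbb{Q}}} |N_{E_\nu/\mathbb{Q}_\nu}(x)|_\nu \\
&= \prod_{\nu \in M_{\mathbb{Q}}} |N_{E/\mathbb{Q}}(x)|_\nu = 1.
\end{aligned}
\]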

Lecture 4

Heights and Siegel’s Lemma

4.1 Absolute values and heights

The product formula plays an important role in calculating with heights. Let $K \supseteq \mathbb{Q}$ be a number field. The places $\nu$ of $K$ have the following form.

• $\nu \mid \infty$: There are two cases. For every real embedding $\sigma : K \to \mathbb{R}$, there is a real place with absolute value $|x|_\nu := |\sigma x|$. Complex places have the form $|x|_\nu = |\sigma x|$ and correspond to pairs $(\sigma, \bar\sigma)$ of an embedding $\sigma : K \to \mathbb{C}$ and its conjugate.

• $\nu \mid p$: We classified the finite places in Lemma 3.23. Each has an absolute value of the form $|x|_\nu = p^{-(f_\nu/n_\nu)\operatorname{ord}_\nu x}$. Here $n_\nu := [K_\nu : \mathbb{Q}_p]$ is the degree of the completion as an extension of $\mathbb{Q}_p$. Letting $\mathfrak{p} \subseteq \mathcal{O}_\nu$ be the valuation ideal of $\nu$, we define $f_\nu := \dim_{\mathbb{F}_p} \mathcal{O}_\nu/\mathfrak{p} = [k_\nu : \mathbb{F}_p]$.

What is $|p|_\nu$? To calculate it, let $e_\nu = \operatorname{ord}_\nu p$. Then $n_\nu = e_\nu \cdot f_\nu$, so

\[ |p|_\nu = p^{-\frac{f_\nu}{n_\nu}\operatorname{ord}_\nu p} = p^{-\frac{f_\nu e_\nu}{n_\nu}} = p^{-1}. \]

The product formula for $K$ (see below) motivates the definition of the normalization of an absolute value.

Definition 4.1 (Normalization of an Absolute Value). The normalization of an absolute value of a number field $K \supseteq \mathbb{Q}$ is,

\[ \|x\|_\nu := |x|_\nu^{n_\nu} \]

where $n_\nu = [K_\nu : \mathbb{Q}_p]$ if $\nu$ is a finite place, and

\[ n_\nu = \begin{cases} 1 & \sigma \text{ real} \\ 2 & \sigma \text{ complex} \end{cases} \]

when $\nu$ is an infinite place.

With this definition in hand, the product formula for $K$ takes the form,

\[ \prod_\nu \|x\|_\nu = \prod_\nu |x|_\nu^{n_\nu} = 1, \qquad x \in K,\ x \ne 0. \]

Definition 4.2. Let $K/\mathbb{Q}$ be a finite extension. The (logarithmic) height of an element $x \in K$ is

\[ h_K(x) := \sum_\nu \log^+ \|x\|_\nu, \]

where $\log^+ r := \max\{0, \log r\}$.

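The height enjoys a number of properties inherited from the norm. It is non-negative, $h_K(x) \ge 0$, and it satisfies a triangle inequality up to an additive constant, $h_K(x + y) \le h_K(x) + h_K(y) + [K : \mathbb{Q}] \log 2$, where the constant comes from the archimedean places. The variant of the height for points of affine space defined in Section 4.2 remains invariant under multiplication by $\lambda \in K^\times$ (Lemma 4.7).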

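Example 4.3. Let $K = \mathbb{Q}$, and let $x \in \mathbb{Q}$. We may factor its numerator and denominator:

\[ x = \frac{a}{b} = \prod_{p \text{ prime}} p^{\epsilon(p)} = \prod_{\epsilon(p) \ge 0} p^{\epsilon(p)} \prod_{\epsilon(p) < 0} p^{\epsilon(p)} \]

The height can be calculated as follows.

\[
\log^+ |x|_\infty = \begin{cases} \sum_p \epsilon(p) \log p & |x|_\infty \ge 1 \\ 0 & |x|_\infty < 1 \end{cases}
\qquad
\log^+ |x|_p = \begin{cases} -\epsilon(p) \log p & \epsilon(p) \le 0 \\ 0 & \epsilon(p) > 0 \end{cases}
\]

If $x = \frac{3}{5}$, then

\[
\begin{aligned}
\log^+ |x|_\infty &= 0 \\
\log^+ |x|_3 &= 0 \\
\log^+ |x|_5 &= -(-1) \cdot \log 5 = \log 5.
\end{aligned}
\]

Hence $h_{\mathbb{Q}}(\frac{3}{5}) = \log 5 = \log(\max(3, 5))$.

Exercise 4.4. What is $h(\sqrt{2})$?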
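The computation of Example 4.3 can be repeated for any non-zero rational number. The sketch below sums $\log^+ |x|_\nu$ over the archimedean place and the primes dividing the denominator (the only finite places where $|x|_p > 1$), and compares the result with $\log \max(|a|, |b|)$ for $x = a/b$ in lowest terms. The function names are our own, and floating point logarithms are used, so the comparison is only up to rounding.

```python
from fractions import Fraction
from math import log

def prime_factors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    ps, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            ps.add(d)
            n //= d
        d += 1
    if n > 1:
        ps.add(n)
    return ps

def height_Q(x):
    """h_Q(x): sum over all places v of log^+ |x|_v, for a non-zero rational x."""
    x = Fraction(x)
    h = max(0.0, log(float(abs(x))))            # archimedean place
    for p in prime_factors(x.denominator):      # finite places with |x|_p > 1
        e, den = 0, x.denominator
        while den % p == 0:
            den, e = den // p, e + 1
        h += e * log(p)                         # log^+ |x|_p = e * log p
    return h

x = Fraction(3, 5)
print(abs(height_Q(x) - log(5)) < 1e-12)        # True
```

For $x = -7/2$ the archimedean place contributes $\log 3.5$ and the prime 2 contributes $\log 2$, which again adds up to $\log 7 = \log \max(7, 2)$.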

4.2 Heights in affine and projective spaces

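Let $\mathbb{A}^n(K) = \{(x_1, \ldots, x_n) \mid x_i \in K\}$ be affine $n$-space.

Definition 4.5. The norm¹ of a point $x \in \mathbb{A}^n(K)$ is defined by

\[
\|x\|_\nu := \begin{cases}
\max_i \{\|x_i\|_\nu\} & \nu \text{ non-archimedean} \\
\left(\sum_{i=1}^{n} \tau(x_i)^2\right)^{1/2} & \nu \text{ real} \\
\sum_{i=1}^{n} \tau(x_i)\overline{\tau(x_i)} & \nu \text{ complex}
\end{cases}
\]

Definition 4.6 (Heights in Affine Space). The (logarithmic) height of a point $x \in \mathbb{A}^n(K)$ is

\[ h_K(x) := \sum_\nu \log \|x\|_\nu. \]

The lack of $\log^+$ means that the height of $x \in K$ might not agree with the height of $x \in \mathbb{A}^1(K)$, i.e., the two values denoted $h_K(x)$ may differ. These heights are compatible. For example, given the embedding

\[ i_1 : \mathbb{A}^1 \hookrightarrow \mathbb{A}^2, \qquad x \mapsto (1, x) \]

the height on $\mathbb{A}^1$ agrees with the restriction of the height on $\mathbb{A}^2$, i.e., $h_K(i_1(x)) = h_K(1, x) = \sum_\nu \log \|x\|_\nu = h_K(x)$. Recall that projective space is defined as the quotient of affine space (minus the origin) by the multiplicative group of the field:

\[ \mathbb{P}^n(K) = \left(\mathbb{A}^{n+1}(K) \setminus \{0\}\right)/K^\times \]

We can extend the definition of heights to projective space by virtue of a lemma:

Lemma 4.7. Let $h_K$ be the height on $\mathbb{A}^{n+1}(K)$ and let $x \in \mathbb{A}^{n+1}(K)$ be non-zero. Then for all $\lambda \in K^\times$, $h(\lambda x) = h(x)$.

Proof. By the product formula for $\lambda$,

\[ h(\lambda x) = \sum_\nu \log \|\lambda x\|_\nu = \sum_\nu \log\left(\|\lambda\|_\nu \|x\|_\nu\right) = \underbrace{\sum_\nu \log \|\lambda\|_\nu}_{=0} + \sum_\nu \log \|x\|_\nu = h(x). \]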

Definition 4.8 (Heights in Projective Space). The (logarithmic) height of a point $(x_0 : \cdots : x_n) \in \mathbb{P}^n(K)$ can be defined using the definition for affine space, because heights are invariant under the $K^\times$-action:

\[ h_K((x_0 : \cdots : x_n)) := h_K(x_0, \ldots, x_n) \]

¹ Note that this does not agree with many standard definitions, where $\max_i\{\|x_i\|_\nu\}$ is also used for the archimedean case. But our definition agrees with the matrix definition to follow.

4.3 Height of matrices

We will denote by $M_{m,n}(K)$ the $K$-vector space of $m \times n$-matrices. There are two ways of assigning height to $X \in M_{m,n}(K)$. The naive definition takes the height of the corresponding point in affine space under the isomorphism $M_{m,n}(K) \cong \mathbb{A}^{m \times n}$ defined by

\[ X \mapsto (\ldots, x_{ij}(X), \ldots), \qquad x_{ij}(X) = (i, j)\text{-th entry} \]

The better way is due to Bombieri-Vaaler.

Definition 4.9 (Height of a Matrix). The height of a matrix $X \in M_{m,n}(K)$ is defined as follows. Assume $m \le n$, and let $I \subseteq \{1, \ldots, n\}$ be a choice of a rank $m$ minor (i.e., $|I| = m$). Let $X_I$ denote the minor of coefficients whose column indices appear in $I$. Let

\[
H_\nu(X) := \begin{cases}
\max_{|I| = m} \|\det X_I\|_\nu & \nu \text{ non-archimedean} \\
\|\det X X^T\|_\nu^{1/2} & \nu \text{ real} \\
\|\det X \bar{X}^T\|_\nu & \nu \text{ complex}
\end{cases}
\]

Then the height of $X$ is defined to be

\[ h(X) = \sum_\nu \log H_\nu(X). \]

Remark 4.10. A more elegant way to introduce these heights applies the inclusion

\[ i : M_{m,n}(K) \hookrightarrow \mathbb{A}^N, \qquad X \mapsto (\ldots, \det X_I, \ldots) \]

for $N = \binom{n}{m}$ and $m \le n$. Then the case $m = 1$ has $M_{1 \times n} = \mathbb{A}^n$, and we recover heights for affine space.

\[
h : X \mapsto \begin{cases}
\max_I \|\det X_I\|_\nu & \nu \nmid \infty \\
\|\det(X \bar{X}^T)\|_\nu^{1/2} & \nu \mid \infty
\end{cases}
\]
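For $K = \mathbb{Q}$ and an integer matrix $X$ of rank $m$, Definition 4.9 can be evaluated by hand: the finite places contribute $\prod_p \max_I |\det X_I|_p = 1/g$, where $g$ is the gcd of the maximal minors, and the single real place contributes $\det(XX^T)^{1/2}$, which by the Cauchy-Binet formula equals $(\sum_I (\det X_I)^2)^{1/2}$. The following sketch implements this special case only. The function names are ours, and the Laplace expansion is meant for small matrices.

```python
from itertools import combinations
from math import gcd, log

def det(rows):
    """Determinant of a small square integer matrix by Laplace expansion."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j, a in enumerate(rows[0]):
        minor = [r[:j] + r[j + 1:] for r in rows[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def matrix_height(X):
    """h(X) over K = Q for an integer m x n matrix X (list of lists), rank m <= n."""
    m, n = len(X), len(X[0])
    # All maximal minors det X_I, one for each choice I of m columns.
    minors = [det([[row[j] for j in I] for row in X])
              for I in combinations(range(n), m)]
    g = 0
    for d in minors:
        g = gcd(g, d)
    if g == 0:
        raise ValueError("X must have rank m")
    # Finite places: prod_p max_I |det X_I|_p = 1/g.
    # Real place: det(X X^T)^(1/2), with det(X X^T) = sum_I det(X_I)^2.
    return 0.5 * log(sum(d * d for d in minors)) - log(g)

print(abs(matrix_height([[2, 4, 6]]) - matrix_height([[1, 2, 3]])) < 1e-12)  # True
```

For $m = 1$ this is the height of the row as a point of affine space, and the example shows that scaling the row by 2 does not change it, in line with Lemma 4.7.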

The properties of the heights of matrices include the following.

• $h(X) \ge 0$.

• There are only finitely many $X \in M_{m,n}(K)/K^\times$ such that $h(X) \le C$ for given $C$.

• Let $X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$ for $X_i \in M_{m_i,n}(K)$, and let $m = m_1 + m_2$. Then

\[ h(X) \le h(X_1) + h(X_2). \]

• Let $P$ be an element of $K[x_1, \ldots, x_n]_d \subseteq K[x_1, \ldots, x_n]$, the polynomials of degree at most $d$. There is an isomorphism

4.4 Absolute height

Since $\mathbb{Q} \subset \mathbb{Q}(\sqrt{2})$, we now have two definitions of height for an $x \in \mathbb{Q}$: the height on $\mathbb{Q}$ and the height on $\mathbb{Q}(\sqrt{2})$. These do not agree. The height in $\mathbb{Q}(\sqrt{2})$ will be $[\mathbb{Q}(\sqrt{2}) : \mathbb{Q}]$ times the height in $\mathbb{Q}$. This property holds in general and leads to a definition of height preserved by field extension.

Exercise 4.11. Let $p \in \mathbb{Z}$ be prime. Calculate $h_{\mathbb{Q}}(p)$ and $h_{\mathbb{Q}(\sqrt{2})}(p)$.

Definition 4.12. The absolute height of a non-zero element $x \in \bar{\mathbb{Q}} \setminus \{0\}$ is given by choosing $K \subset \bar{\mathbb{Q}}$ containing $x$ and defining

\[ h(x) := \frac{1}{[K : \mathbb{Q}]} h_K(x). \]

This does not depend on the choice of $K \ni x$.

4.5 Liouville estimates

Let $x = \frac{u}{v} \in \mathbb{Q}$ be non-zero with coprime $u, v \in \mathbb{Z}$. Then

\[ |x| \ge \frac{1}{|v|} \ge \frac{1}{\max(|u|, |v|)} = e^{-h(x)}. \tag{4.1} \]

This can be generalized as follows.

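Proposition 4.13. Let $L = \ell_1 T_1 + \cdots + \ell_n T_n$ for $\ell_i \in \mathbb{Z}$. Let $K \supseteq \mathbb{Q}$ be a number field, $w$ an infinite place of $K$, and $\xi \in K^n$. If $L(\xi) \ne 0$, then

\[ \log \|L(\xi)\|_w \ge -(h(\xi) + h(L)) + \log \|L\|_w + \log \|\xi\|_w \]

where $|L| = \left(\sum_{i=1}^{n} \ell_i^2\right)^{1/2}$.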

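Proof. By the product formula

\[
1 = \prod_\nu \|L(\xi)\|_\nu = \|L(\xi)\|_w \cdot \prod_{\nu \ne w} \|L(\xi)\|_\nu \le \|L(\xi)\|_w \prod_{\nu \ne w} \|L\|_\nu \cdot \|\xi\|_\nu
\]

Note that the finite places $\nu$ satisfy

\[ |L(\xi)|_\nu = |\ell_1\xi_1 + \cdots + \ell_n\xi_n|_\nu \le \max_i |\ell_i\xi_i|_\nu \le \max_i |\ell_i|_\nu \cdot \max_i |\xi_i|_\nu, \]

so $\|L(\xi)\|_\nu \le H_\nu(L) \cdot H_\nu(\xi)$. For the rest, one has $\nu \mid \infty$. For the infinite places $\nu$, the Cauchy-Schwarz inequality gives

\[ |L(\xi)|_\nu = |\langle \ell, \xi \rangle|_\nu \le \langle \ell, \ell \rangle^{1/2} \langle \xi, \xi \rangle^{1/2} \]

Therefore,

\[
1 \le \|L(\xi)\|_w \prod_\nu \|L\|_\nu \cdot \prod_\nu \|\xi\|_\nu \cdot \|L\|_w^{-1} \cdot \|\xi\|_w^{-1}
= \|L(\xi)\|_w \, e^{h(L)} e^{h(\xi)} \cdot \|L\|_w^{-1} \cdot \|\xi\|_w^{-1}.
\]

Taking logarithms gives the result.

This generalizes the equation (4.1) as follows. Let $L = T_2$, and let $\xi = (1, x)$. Then $L(\xi) = x$, $h(L) = 0$ and $h(\xi) = \log \max(|u|, |v|)$, so we conclude by the proposition,

\[ \log |x| \ge -\log \max(|u|, |v|) + \log\left(1 + \left(\tfrac{u}{v}\right)^2\right)^{1/2} = -h(x) + \log\left(1 + \left(\tfrac{u}{v}\right)^2\right)^{1/2} \ge -h(x) \]

4.6 Siegel's Lemma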

Let $N > M > 0$ be integers. Following Bombieri-Vaaler, we consider a system of $M$ linear forms in $N$ variables.

\[ L_i(T_1, \ldots, T_N) = \sum_{j=1}^{N} \ell_{ij} T_j, \qquad 1 \le i \le M \]

We want a small solution of Li(T ) = 0 for 1 ≤ i ≤ M.

Lemma 4.14 (Siegel's Lemma). The system has a non-trivial solution $x = (x_1, \ldots, x_N) \in \mathcal{O}_K^N$ such that

 s 1/4 M 2 √   N−M h(x) ≤ D max h(Li) , π i where, d = [K : Q], D = discr(K), and s is the number of real places of K. Phrasing the system of equations as a matrix equation, previous proper- ties imply a bound on the height of the matrix by the height of the column vectors. Instead of proving this, we prove a light version of Siegel’s lemma. Lemma 4.15 (Siegel’s Lemma: Light Version). Let N > M > 0, and let

\[ \sum_{j=1}^{N} a_{ij} T_j = 0, \qquad 1 \le i \le M, \quad a_{ij} \in \mathbb{Z}, \]

and let $V_i = \max_j |a_{ij}|$. Then the system has a non-trivial solution $\xi = (\xi_1, \ldots, \xi_N) \in \mathbb{Z}^N$ such that

\[ |\xi_i| < 2 + \Bigl(\prod_j N V_j\Bigr)^{\frac{1}{N-M}}. \]

Proof. We apply Dirichlet's box principle. Let $H > 0$ be an integer, and let $C \subset \mathbb{Z}^N$ be the cube given by $|x_i| \le H$ for $1 \le i \le N$. It contains $(2H + 1)^N$ points. Let $\alpha : \mathbb{R}^N \to \mathbb{R}^M$ be the linear map, $\alpha = (\alpha_1, \ldots, \alpha_M)$, $\alpha_i = \sum_j a_{ij} T_j$. The image $\alpha(C) \subseteq C'$ has $y \in C'$ if and only if $|y_i| \le N \cdot V_i H$. Then $C'$ contains $\prod_i (2NV_iH + 1)$ points. If $H$ is sufficiently large, then $\alpha|_C$ is not injective. Thus there exist $\xi' \ne \xi''$ in $C$ such that $\alpha(\xi') = \alpha(\xi'')$.

Let $\xi := \xi' - \xi'' \ne 0$. Then $|\xi_i| \le |\xi_i'| + |\xi_i''| \le 2H$. We choose $H$ such that

\[ \Bigl(\prod_j N V_j\Bigr)^{\frac{1}{N-M}} \le 2H < \Bigl(\prod_j N V_j\Bigr)^{\frac{1}{N-M}} + 2 \]

Then

\[ (2H+1)^N = (2H+1)^{N-M} \cdot (2H+1)^M > (2H+1)^M \prod_j N V_j \ge \prod_j (2NV_jH + 1) = |C'| \]

In other words, the number of points in $C$ is greater than the number of points in $C'$, which contains the image of $\alpha$, so $\alpha|_C$ cannot be injective. Siegel's lemma is essentially the best possible bound.
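The box principle argument is constructive enough to run on small systems. The sketch below follows the proof literally: it picks the smallest integer $H \ge 1$ with $(2H)^{N-M} \ge \prod_j NV_j$, enumerates the cube $|x_i| \le H$, and returns the difference of the first two points with the same image. The function name `siegel_small_solution` is our own, every row of the matrix is assumed to be non-zero, and the exhaustive search over $(2H+1)^N$ points is only feasible for tiny inputs.

```python
from itertools import product
from math import prod

def siegel_small_solution(A):
    """Non-zero integer solution xi of A xi = 0 found by the box principle.

    A is an M x N integer matrix (list of rows) with N > M > 0 and non-zero rows.
    Returns (xi, P) with P = prod_j N * V_j, where V_j = max |a_ij| over row j.
    """
    M, N = len(A), len(A[0])
    assert N > M > 0
    P = prod(N * max(abs(a) for a in row) for row in A)
    # Smallest integer H >= 1 with (2H)^(N-M) >= P, i.e. 2H >= P^(1/(N-M)).
    H = 1
    while (2 * H) ** (N - M) < P:
        H += 1
    seen = {}                                    # image -> first preimage found
    for x in product(range(-H, H + 1), repeat=N):
        y = tuple(sum(a * t for a, t in zip(row, x)) for row in A)
        if y in seen:                            # two points with equal image
            return tuple(s - t for s, t in zip(x, seen[y])), P
        seen[y] = x
    raise AssertionError("unreachable: the cube has more points than the target box")

xi, P = siegel_small_solution([[1, 2, 3]])
print(sum(a * t for a, t in zip([1, 2, 3], xi)))   # 0
```

For the single equation $T_1 + 2T_2 + 3T_3 = 0$ one has $\prod_j NV_j = 9$ and $H = 2$, so the solution found satisfies $|\xi_i| \le 4 < 2 + 9^{1/2}$.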

Lecture 5

Logarithmic Forms

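This lecture treats one of the central contributions to number theory of the twentieth century. It goes back to a problem of Gauß.

Let $\alpha_1, \ldots, \alpha_n \in \bar{\mathbb{Q}}^\times$ and $b_1, \ldots, b_n \in \mathbb{Z}$. We define

\[ \Gamma := \langle \alpha_1, \ldots, \alpha_n \rangle \subset \bar{\mathbb{Q}}^\times. \]

If $|\alpha - 1| := |\alpha_1^{b_1} \cdots \alpha_n^{b_n} - 1| \ne 0$, then we want to know how close this value can approach zero in terms of $B := \max\{|b_1|, \ldots, |b_n|\}$. In other words, we want a lower bound $|\alpha - 1| > C$.

There is a trivial lower bound. Assume that $\alpha_i \in \mathcal{O}_K$. Then $N(\alpha - 1) \ge 1$. Indeed, $N(\alpha - 1) = \prod_{\nu \mid \infty} |\alpha - 1|_\nu$ is the absolute value of a non-zero rational integer. Moreover

\[ |\alpha - 1|_\nu \le |\alpha|_\nu + 1 \le \left(\max_i H_\nu(\alpha_i)\right)^B + 1 \le (2H_\nu(\alpha))^B =: C_\nu^B. \]

Hence, with $C := \prod_{\mu \ne \nu} C_\mu$,

\[ |\alpha - 1|_\nu \cdot C^B \ge |\alpha - 1|_\nu \cdot \prod_{\mu \ne \nu} |\alpha - 1|_\mu \ge 1 \]

so that $|\alpha - 1|_\nu \ge C^{-B}$.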

5.1 Historical remarks

We ask whether this can be improved, for example, to a bound of the form

|α − 1| > c−B ,  > 0

This would show that diophantine equations of the type,

f(x, y) = 1,

for a homogeneous, irreducible $f \in \mathbb{Z}[X, Y]$ of degree at least 3, have only finitely many integer solutions. The size can be effectively bounded. The equation $y^2 = x^3 + ax + b$ for $a, b \in \mathbb{Z}$ has only finitely many solutions $(x, y) \in \mathbb{Z}^2$. Similarly, the curve $C : y^k = F(x)$ is a ramified, cyclic¹ $k$-fold covering of $\mathbb{P}^1$. The fundamental group $\pi_1(C)$ may be regarded as the Galois group of the covering. In general, any $r \in \mathbb{Q}(C)$ defines a ramified cover $C \to \mathbb{P}^1$. This may be reformulated in terms of logarithms,

Λ = b1 log α1 + ··· + bn log αn, bi ∈ Z

Remark 5.1. If the $b_i$ are replaced by algebraic numbers $\beta_i$, then $\alpha_1^{\beta_1} \cdots \alpha_n^{\beta_n}$ is expected to be transcendental.

Problem 5.2 (Hilbert's 7th Problem, 1900 ICM Paris). Let $\alpha \ne 0, 1$ be algebraic and let $\beta$ be an algebraic irrational number. Is $\alpha^\beta$ transcendental?

Another of Hilbert's problems was the Riemann Hypothesis. Although the Riemann Hypothesis remains unsolved, he considered his seventh problem to be harder.

Theorem 5.3 (Gel'fond, Schneider, 1934). If $\alpha, \beta \in \bar{\mathbb{Q}}$ with $\alpha \ne 0, 1$ and $\beta \notin \mathbb{Q}$, then $\alpha^\beta$ is transcendental.

Fixing $\alpha_1, \ldots, \alpha_n$, we want a lower bound for $|\Lambda|$ ($\Lambda \ne 0$) in terms of $B$. In 1966, Baker showed that $\log |\Lambda| > -c(\log B)^k$ whenever $k > 2n + 1$. He improved the result in 1969 (?) to include all cases where $k > n$. Around 1970, Feldman showed that this holds so long as $k > 1$. We write $f \gg g$ for functions $f$ and $g$ if there exists a constant $C$ such that $f \ge Cg$. In this case, $|\Lambda|$ and $B$ are functions of the $b_i$. In its final form, the theorem says that $\log |\Lambda| \gg -\log B$.

Remark 5.4. If $\alpha_1, \ldots, \alpha_n$ are rational integers $a_1, \ldots, a_n$, let $A_n \ge A_i \ge |a_i|$, $\Omega = \log A_1 \cdots \log A_n$ and $\Omega' = \Omega/\log A_n$. Then Baker showed that $\log |\Lambda| \gg -\log B \cdot \Omega \log \Omega'$. Wüstholz improved this to $\log |\Lambda| \gg -\log B \cdot \Omega$, which is currently the best form.

Conjecture 5.5.

\[ \log |\Lambda| \gg -\log B\,(\log A_1 + \cdots + \log A_n) \]

(This is closely related to the abc-conjecture, but we omit the relation.)

¹ A cyclic covering $C \to \mathbb{P}^1$ is one such that $\pi_1(C)$ is a cyclic group.

5.2 The Theorem

Theorem 5.6. Let $b_1, \ldots, b_n \in \mathbb{Z}$ and let $B \ge 2$ be a real number bound such that $|b_i| \le B$ for all $i$. Let $\alpha_1, \ldots, \alpha_n \in K^\times$ for a number field $K \subseteq \bar{\mathbb{Q}}$. If $\Lambda = b_1 \log \alpha_1 + \cdots + b_n \log \alpha_n \ne 0$, then there exists an effectively computable constant $C > 0$ such that

\[ \log |\Lambda| > -C \log B. \]

The constant $C$ depends on $\alpha_1, \ldots, \alpha_n$ and $d = [K : \mathbb{Q}]$.

Remark 5.7. The theorem can also be stated as follows: There exists a $C > 0$ such that if $\log |\Lambda| \le -C \log B$ then $\Lambda = 0$. In other words, $\alpha_1, \ldots, \alpha_n$ are multiplicatively dependent.

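Preparations for the proof

Feldman introduced $\Delta$-functions. For each integer $k \ge 1$, we define a function from $\mathbb{C}$ to $\mathbb{C}$ by,

\[ \Delta(z, k) := \frac{(z + 1) \cdots (z + k)}{k!}. \]

The $\Delta$ functions satisfy a number of properties.

• The $\Delta$ functions map integers to integers, i.e., $\Delta(\mathbb{Z}, k) \subseteq \mathbb{Z}$.

• $|\Delta(z, k)| \le \frac{(|z| + k)^k}{k!} \le e^{|z| + k}$

• The set $\{\Delta(z, k) \mid k \in \mathbb{N}\}$ is a $\mathbb{Q}$-basis of the polynomial ring $\mathbb{Q}[z]$.

• The $\mathbb{Z}$-submodule $U = \{f \in \mathbb{Q}[z] \mid f(\mathbb{Z}) \subseteq \mathbb{Z}\} \subseteq \mathbb{Q}[z]$ of functions which preserve integers is generated by the $\Delta$ functions.

\[ U = \bigoplus_{k \ge 0} \mathbb{Z}\,\Delta(z, k) \]

For all integers $\ell > 0$ and $m \ge 0$, we define

\[ \Delta(z, k, \ell, m) = \frac{1}{m!}\left(\frac{d}{dz}\right)^{m} \Delta(z, k)^{\ell} = \frac{1}{2\pi i} \oint_{|\zeta - z| = 1} \frac{\Delta(\zeta, k)^{\ell}}{(\zeta - z)^{m+1}}\, d\zeta \]

Hence we arrive at a bound which does not depend on $m$.

\[ |\Delta(z, k, \ell, m)| \le e^{(|z| + k)\ell} \tag{5.1} \]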

We have successfully played the game to find a useful bound. But there are no games without pain, so the question is, where does the pain enter? The answer is that differentiating to define $\Delta(z, k, \ell, m)$ introduces denominators. It no longer maps integers to integers. Let

\[ v(k) = \operatorname{lcm}(1, \ldots, k) = \prod_{p \text{ prime}} p^{\lfloor \log k/\log p \rfloor} \]

By the Prime Number Theorem, the number of primes less than or equal to $k$ is asymptotically $k/\log k$. Therefore $v(k) \le 4^k$. By this definition, we have assuaged the pain, since $v(k)^m \Delta(z, k, \ell, m)$ maps integers to integers.
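These properties of the $\Delta$ functions can be checked by exact computation for small parameters. The sketch below evaluates $\Delta(z, k)$ and $\Delta(z, k, \ell, m)$ with rational arithmetic and computes $v(k)$. All function names are ours, and `math.lcm` with several arguments assumes Python 3.9 or later. The integrality check uses the power $v(k)^m$, which is the form in which this statement is usually given: a single factor $v(k)$ is not enough in general, for instance $v(4)\,\Delta(0, 4, 1, 4) = 12/24$.

```python
from fractions import Fraction
from math import comb, factorial, lcm

def delta(z, k):
    """Delta(z, k) = (z+1)...(z+k)/k! for an integer z, as an exact fraction."""
    num = 1
    for j in range(1, k + 1):
        num *= z + j
    return Fraction(num, factorial(k))

def v(k):
    """v(k) = lcm(1, ..., k)."""
    return lcm(*range(1, k + 1))

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def delta_lm(z, k, l, m):
    """Delta(z, k, l, m) = (1/m!) (d/dz)^m Delta(z, k)**l, evaluated at z."""
    base = [Fraction(1, factorial(k))]
    for j in range(1, k + 1):
        base = poly_mul(base, [Fraction(j), Fraction(1)])   # multiply by (z + j)
    coeffs = [Fraction(1)]
    for _ in range(l):
        coeffs = poly_mul(coeffs, base)
    # (1/m!) (d/dz)^m z^i = C(i, m) z^(i-m)
    return sum(c * comb(i, m) * Fraction(z) ** (i - m)
               for i, c in enumerate(coeffs) if i >= m)

print(delta(5, 3), v(4), v(4) * delta_lm(0, 4, 1, 4))   # 56 12 1/2
```

With $v(4)^4$ in place of $v(4)$ the last value becomes $12^4/24 = 864$, an integer as expected.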

Proof: The auxiliary function

We introduce the notation c1, c2, c3,... for positive constants which do not depend on B (although perhaps they depend on αi or K), and which can be effectively computed. We begin the proof by assuming that log |Λ| ≤ −C log B. Without loss of generality, we assume that bn 6= 0. Knowing how the proof will end, we define constants

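\[ L := [k^{n/(n+1)} \log k], \qquad h := [\log kB], \qquad L' := h - 1 \]

for some large $k \in \mathbb{N}$. We define a function on $\mathbb{C}^n$ by

\[ \Phi(z_0, \ldots, z_{n-1}) := \sum_{0 \le \lambda_{-1} \le L'} \; \sum_{0 \le \lambda_0, \ldots, \lambda_n \le L} p(\lambda)\, \Delta(z_0 + \lambda_{-1}, h, \lambda_0 + 1, 0)\, \alpha_1^{\gamma_1 z_1} \cdots \alpha_{n-1}^{\gamma_{n-1} z_{n-1}} \]

where the function $p(\lambda)$ is unknown, and

\[ \gamma_i = \lambda_i - \frac{b_i}{b_n}\lambda_n, \qquad 1 \le i \le n - 1. \]

We shall determine $p(\lambda)$ as an element of $K = \mathbb{Q}(\alpha_1, \ldots, \alpha_n)$.

Lemma 5.8. There exist $p(\lambda) \in K$, not all 0, of height at most $c_1^{hk}$ such that

\[ S := \sum_\lambda p(\lambda)\, \Delta(\ell + \lambda_{-1}, h, \lambda_0 + 1, m_0) \prod_{j=1}^{n} \alpha_j^{\lambda_j \ell} \prod_{j=1}^{n-1} \gamma_j^{m_j} = 0 \tag{5.2} \]

for $0 \le \ell < h$, $m_j \ge 0$ ($0 \le j \le n - 1$) and $m_0 + \cdots + m_{n-1} < k$.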

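Proof. This is a typical application of Siegel's Lemma. Up to a factor of the form $(\log\alpha_1)^{m_1}\cdots(\log\alpha_{n-1})^{m_{n-1}}$, the formula of the lemma can be written as

\[ \frac{1}{m_0!}\left(\frac{d}{dz_0}\right)^{m_0}\left(\frac{d}{dz_1}\right)^{m_1}\cdots\left(\frac{d}{dz_{n-1}}\right)^{m_{n-1}}\Phi(z_0, z_1, \dots, z_{n-1}). \]

Taking the $z_0$ derivative of $\Phi$ only changes 0 into $m_0$ in the fourth argument of the $\Delta$ function (by the properties of $\Delta$). The rest of the derivatives add a factor of $(\log\alpha_1)^{m_1}\cdots(\log\alpha_{n-1})^{m_{n-1}}$. To make them disappear, we define

\[ D_j := (\log\alpha_j)^{-1}\frac{d}{dz_j}, \qquad D_0 := \frac{d}{dz_0}. \]

The new system is

\[ \frac{1}{m_0!}D_0^{m_0}D_1^{m_1}\cdots D_{n-1}^{m_{n-1}}\Phi(z_0, z_1, \dots, z_{n-1}). \]

We reformulate the powers appearing in $\Phi$, where in the last two lines we set $z_1 = \cdots = z_{n-1} = z$ in the final factor:

\begin{align*}
\alpha_1^{\gamma_1 z_1}\cdots\alpha_{n-1}^{\gamma_{n-1} z_{n-1}}
&= \alpha_1^{(\lambda_1 - \frac{b_1}{b_n}\lambda_n) z_1}\cdots\alpha_{n-1}^{(\lambda_{n-1} - \frac{b_{n-1}}{b_n}\lambda_n) z_{n-1}} \\
&= \alpha_1^{\lambda_1 z_1}\cdots\alpha_{n-1}^{\lambda_{n-1} z_{n-1}}\left(\alpha_1^{-\frac{b_1}{b_n} z_1}\cdots\alpha_{n-1}^{-\frac{b_{n-1}}{b_n} z_{n-1}}\right)^{\lambda_n} \\
&= \alpha_1^{\lambda_1 z_1}\cdots\alpha_{n-1}^{\lambda_{n-1} z_{n-1}}\left(\alpha_1^{-b_1/b_n}\cdots\alpha_{n-1}^{-b_{n-1}/b_n}\right)^{\lambda_n z} \\
&= \alpha_1^{\lambda_1 z_1}\cdots\alpha_{n-1}^{\lambda_{n-1} z_{n-1}}\,(\alpha_n')^{\lambda_n z}
\end{align*}

where we define $\alpha_n' := \alpha_1^{-b_1/b_n}\cdots\alpha_{n-1}^{-b_{n-1}/b_n}$. Recall the Schwarz lemma:

Lemma 5.9 (Schwarz Lemma). Let $f : D \to D$ be a holomorphic function from the open unit disk $D = \{z : |z| < 1\} \subset \mathbb{C}$ to itself. If $f(0) = 0$, then $|f(z)| \le |z|$ for all $z \in D$, and furthermore, $|f'(0)| \le 1$. If additionally $|f(z)| = |z|$ for some non-zero $z$ or if $|f'(0)| = 1$, then $f(z) = e^{i\theta}z$ for some $\theta \in \mathbb{R}$.

We estimate the difference $|\alpha_n' - \alpha_n|$. By the power series $e^z = 1 + z + z^2/2! + z^3/3! + \cdots$,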

\[ |e^{z} - 1| \le |z|\,e^{|z|} \qquad \text{for all } z \in \mathbb{C}. \]
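The inequality follows termwise from the power series, since $|e^z - 1| \le \sum_{n \ge 1} |z|^n/n! \le |z| \sum_{n \ge 0} |z|^n/n!$. As a sanity check only, the following Python sketch (with hypothetical function names) samples random complex numbers and reports the smallest value of the difference between the two sides, which should never be negative beyond rounding error.

```python
import cmath
import math
import random

def exp_bound_slack(z):
    """Return |z| e^{|z|} - |e^z - 1|, which the inequality says is >= 0."""
    r = abs(z)
    return r * math.exp(r) - abs(cmath.exp(z) - 1)

def smallest_slack(samples=10000, radius=5.0, seed=0):
    """Smallest slack over random points of the square [-radius, radius]^2."""
    rng = random.Random(seed)
    return min(
        exp_bound_slack(complex(rng.uniform(-radius, radius),
                                rng.uniform(-radius, radius)))
        for _ in range(samples)
    )
```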

Letting $Q = b_n(\log\alpha_n' - \log\alpha_n)$,

\begin{align*}
|\alpha_n' - \alpha_n| &= |\alpha_n|\,|\alpha_n'\alpha_n^{-1} - 1| = |\alpha_n| \cdot |e^{\log\alpha_n' - \log\alpha_n} - 1| \\
&= |\alpha_n|\,|e^{Q/b_n} - 1| \\
&\le |\alpha_n|\,|Q/b_n|\,e^{|Q/b_n|} \\
&\le |\alpha_n|\,|Q|\,e^{|Q|} \\
&\le |\alpha_n|\,B^{-C}e^{B^{-C}} \\
&\le B^{-\frac{3}{4}C}
\end{align*}

At the last step, we impose a condition on the constant $C$. It must satisfy $e \cdot B^{-C} \le 1$. Since $B \ge 2$, this will be true if $e \cdot 2^{-C} \le 1$ is satisfied, a condition which does not depend on $B$.

We now bound the height of 5.2. The proof divides here into two cases. In the first case, we consider archimedean places $\nu$. We use the bound 5.1. Evaluating $z$ at $\ell$ gives that $|z| \le h$. Observing that $k = \lambda_0 + 1 \le L$ and that $\ell \le L$ gives that

\[ |\Delta(\ell + \lambda_{-1}, h, \lambda_0 + 1, m_0)| \le e^{(L'+h)L} \]

Again observing the bounds on the sums, $\prod_{j=1}^{n}|\alpha_j^{\lambda_j \ell}|_\nu \le \prod_{j=1}^{n} H_\nu(\alpha_j)^{Lh}$. Using the definition of $\gamma_i$ bounds the final product in equation 5.2 to give

\begin{align*}
\|S\|_\nu &\le c_2^{(L'+h)L}\prod_{j=1}^{n} H_\nu(\alpha_j)^{Lh}\prod_{j=1}^{n-1}\max\left(1,\ 2L\frac{\|b_j\|_\nu}{\|b_n\|_\nu}\right)^{m_j} \\
&\le c_2^{(L'+h)L}\prod_{j=1}^{n} H_\nu(\alpha_j)^{Lh}\,(2BL)^{m_1 + \cdots + m_{n-1}} \\
&\le c_2^{(L'+h)L}\prod_{j=1}^{n} H_\nu(\alpha_j)^{Lh}\,(2BL)^{k}
\end{align*}

Second, we consider the case when $\nu$ is a finite place. Then

\[ \|S\|_\nu \le \|\nu(h)\|_\nu^{-m_0} \cdot \prod_{j=1}^{n} H_\nu(\alpha_j)^{Lh}\,\|b_n^{-1}\|_\nu^{k} \]

Altogether this makes

\[ H(S) \le c_3^{hk} \]

For an application of Siegel's Lemma, we need to bound the heights of the coefficients as well as the quantity $\frac{M}{N-M}$, where $N$ is the number of variables, and $M$ is the number of equations in 5.2.

• The coefficients are bounded by $c_3^{hk}$.
• $N = (L'+1)(L+1)^{n+1} \ge h \cdot (k^{n/(n+1)}\log k)^{n+1} \ge 2hk^{n}$.
• $M = h \cdot \binom{k+n-1}{n} \le hk^{n}$.
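The counts of unknowns and equations can be tabulated for concrete parameters. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the sample values of $n$, $k$, $B$ are arbitrary, and $N$ is computed as $(L'+1)(L+1)^{n+1}$, i.e. one factor for $\lambda_{-1}$ and one for each of $\lambda_0, \dots, \lambda_n$. It also counts the derivative orders $(m_0, \dots, m_{n-1})$ with $m_0 + \cdots + m_{n-1} < k$ by brute force, so that the binomial coefficient can be compared against a direct enumeration.

```python
from itertools import product
from math import comb, floor, log

def siegel_counts(n, k, B):
    """Return (h, L, N, M) for the auxiliary construction with parameters n, k, B."""
    h = floor(log(k * B))                      # h = [log(kB)]
    L = floor(k ** (n / (n + 1)) * log(k))     # L = [k^{n/(n+1)} log k]
    L_prime = h - 1                            # L' = h - 1
    N = (L_prime + 1) * (L + 1) ** (n + 1)     # number of unknowns p(lambda)
    M = h * comb(k + n - 1, n)                 # number of equations in (5.2)
    return h, L, N, M

def count_derivative_orders(n, k):
    """Brute-force count of (m_0, ..., m_{n-1}) >= 0 with m_0 + ... + m_{n-1} < k."""
    return sum(1 for m in product(range(k), repeat=n) if sum(m) < k)
```

For instance, `siegel_counts(2, 20, 10)` gives far more unknowns than equations, which is the situation Siegel's Lemma requires.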

This gives $\frac{M}{N-M} \le 1$, and now Siegel's Lemma gives the result.

We look now at the holomorphic functions

\[ f_m(z) = \frac{1}{m_0!}D_0^{m_0}D_1^{m_1}\cdots D_{n-1}^{m_{n-1}}\Phi(z_0, \dots, z_{n-1})\Big|_{z_0 = \cdots = z_{n-1} = z}. \]

We will sometimes suppress the $m$ in the notation and simply write $f(z)$. Similar estimates as in the proof of Lemma 5.8 give

\[ |f(z)| \le c_4^{hk + L|z|} \qquad (5.3) \]

for $m_0 + \cdots + m_{n-1} < k$.

Exercise 5.10. Prove the estimate of equation 5.3.

Definition 5.11. Let

\[ g(\ell) = \sum_{\lambda} p(\lambda)\,\Delta(\ell + \lambda_{-1}, h, \lambda_0 + 1, m_0)\,\alpha_1^{\lambda_1 \ell}\cdots\alpha_n^{\lambda_n \ell}\,\gamma_1^{m_1}\cdots\gamma_{n-1}^{m_{n-1}} \]

Exercise 5.12. For $0 \le \ell < hk^{\frac{1}{2(n+1)}}(\log k)^{-1}$,

\[ |f(\ell) - g(\ell)| \le B^{-\frac{1}{2}C}. \]

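Lemma 5.13. For all integers $\ell$ with $0 \le \ell < hk^{\frac{1}{2(n+1)}}(\log k)^{-1}$, we either have $g(\ell) = 0$ or

\[ |f(\ell)| > c_6^{-hk - L\ell}. \]

Proof. If $g(\ell) = 0$, then there is nothing to show, so let $g(\ell) \ne 0$. We use the product formula to show that

\[ |g(\ell)| > c_7^{-hk}. \]

Then

\begin{align*}
|f(\ell)| &= |g(\ell) - (g(\ell) - f(\ell))| \\
&\ge |g(\ell)| - |f(\ell) - g(\ell)| \\
&\ge |g(\ell)| - B^{-\frac{1}{2}C} \\
&\ge \tfrac{1}{2}|g(\ell)|
\end{align*}

because, as we shall see, we may take $C$ sufficiently large so that

\[ B^{-\frac{1}{2}C} < \tfrac{1}{2}c_7^{-hk}. \]

Since $g(\ell) \ne 0$, by Lemma 5.8, $\ell \ge h$. If $\nu \mid \infty$, then

\[ \prod_{j=1}^{n}\|\alpha_j^{\lambda_j \ell}\|_\nu \le \prod_{j=1}^{n}\max(1, \|\alpha_j\|_\nu)^{L\ell}, \qquad \prod_{j=1}^{n-1}\|\gamma_j^{m_j}\|_\nu \le \prod_{j=1}^{n-1}\max\left(1,\ 2\|L\|_\nu\frac{\|b_j\|_\nu}{\|b_n\|_\nu}\right)^{m_j} \]

The first part of $g(\ell)$ can also be bounded (we omit this) to give a total bound of

\[ \|g(\ell)\|_\nu \le c_7^{hk}\prod_{j=1}^{n}\max(1, \|\alpha_j\|_\nu)^{L\ell}\prod_{j=1}^{n-1}\max\left(1,\ 2\|L\|_\nu\frac{\|b_j\|_\nu}{\|b_n\|_\nu}\right)^{m_j}. \]

If $\nu \nmid \infty$, then $g(\ell)$ has a correspondingly different bound,

\[ \|g(\ell)\|_\nu \le \|\nu(h)\|_\nu^{-m_0}\prod_{j=1}^{n}\max(1, \|\alpha_j\|_\nu)^{L\ell}\prod_{j=1}^{n-1}\max\left(1,\ 2\|L\|_\nu\frac{\|b_j\|_\nu}{\|b_n\|_\nu}\right)^{m_j}. \]

Since $L\ell \le hk$, this implies that

\[ 1 = \prod_{\nu}\|g(\ell)\|_\nu \le c_7^{hk}\,|g(\ell)| \]

Hence $|g(\ell)| > c_7^{-hk}$. The inequality $B^{-\frac{1}{2}C} < \frac{1}{2}c_7^{-hk}$ used above can be achieved for $C$ sufficiently large by a chain of reasoning which produces a list of constants $c_8, \dots$ imposing a finite number of lower bounds on $C$. Choosing $C$ greater than all these lower bounds will then satisfy the inequality.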

5.3 Extrapolations

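We now turn our attention to the zeros of $f(\ell)$ in the rectangle bounded by $h$ and $k$:

[Figure: the $(\ell, m)$-plane, with the horizontal $\ell$-axis marked at $h$ and $S$ and the vertical $m$-axis marked at $T = [k/2]$ and $k$.]

Recall that $h = [\log(kB)]$. Our aim is to show that, if $f(\ell)$ has zeros in the $h \times k$ rectangle, then it has zeros in the rectangle $S \times T$, where

\[ S := [hk^{\frac{1}{2(n+1)}}(\log k)^{-1}], \qquad T := [k/2]. \]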

Proposition 5.14. For $0 \le \ell < S$ and $m_0, \dots, m_{n-1} \ge 0$ with $m_0 + \cdots + m_{n-1} < T$, we have $g(\ell) = 0$.

Proof. We use Cauchy's integral formula. Let

\[ F(z) := \prod_{j=0}^{h-1}(z - j)^{T} \]

and consider the function $\varphi(z) := f(z)/F(z)$. We will use the notation

\[ |G(z)|_r := \max_{|z| = r}|G(z)| \]

for the maximum of a holomorphic function on the disk of radius $r$. By the Schwarz lemma 5.9, $|\varphi(z)|_S \le |\varphi(z)|_R$ for $R := [k^{\frac{1}{2(n+1)}}S]$. Hence

\[ |f(z)|_S \le \frac{|F(z)|_S\,|f(z)|_R}{|F(z)|_R} \qquad \text{and} \qquad \frac{S^{hk}}{(R/2)^{hk}} = \left(\frac{2}{k^{\frac{1}{2(n+1)}}}\right)^{hk}. \]

Here $|z - j| \ge |z| - j$ and $R - h \ge R/2$, and

\[ |\varphi(z)| \le \left(\frac{S}{R}\right)^{hk}|\varphi(z)|_R. \]

The function $\varphi(z)$ is meromorphic with poles at $0, \dots, h-1$. To apply the Cauchy integral formula, we define contours:

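[Figure: the circle $\Gamma$ of radius $R$ about the origin, enclosing the small circles $\Gamma_0, \Gamma_1, \dots, \Gamma_{h-1}$ around the integers $0, 1, \dots, h-1$.]

Here $\Gamma$ is the circle of radius $R$ and the $\Gamma_s$ are circles of radius 1/2 centered at integral points on the x-axis. Then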

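\[ \frac{1}{2\pi i}\int_{\Gamma}\frac{\varphi(z)}{z - \ell}\,dz = \varphi(\ell) + \frac{1}{2\pi i}\sum_{s=0}^{h-1}\sum_{t=0}^{T-1}\frac{f^{(t)}(s)}{t!}\int_{\Gamma_s}\frac{1}{F(z)}\frac{(z-s)^{t}}{z - \ell}\,dz. \]

Multiplying by $F(\ell)$ gives

\[ \underbrace{\frac{1}{2\pi i}\int_{\Gamma}\frac{F(\ell)}{F(z)}\frac{f(z)}{z - \ell}\,dz}_{I} = f(\ell) + \underbrace{\frac{1}{2\pi i}\sum_{s=0}^{h-1}\sum_{t=0}^{T-1}\frac{f^{(t)}(s)}{t!}\int_{\Gamma_s}\frac{F(\ell)}{F(z)}\frac{(z-s)^{t}}{z - \ell}\,dz}_{J}. \]

(Note the definitions of $I$ and $J$.) We want to exclude the possibility that $|f(\ell)| > c_6^{-hk - L\ell}$. To do this, we show that $I, J < c_8^{-hk}$. We bound $F(\ell)/F(z)$ by first bounding $F(z)$ from below on the circle of radius $R$, and then bounding $F(\ell)$ from above. On $\Gamma$, the circle of radius $R$, $|z - j| \ge |z| - j = R - j \ge R/2$. Hence,

\[ |F(z)| > 2^{-hT}R^{hT}, \qquad |F(\ell)| = \prod_{j=0}^{h-1}|\ell - j|^{T} \le S^{hT} \]

Together, this gives,

\[ \left|\frac{F(\ell)}{F(z)}\right| \le 2^{hk}\,k^{-\frac{hT}{2(n+1)}}, \qquad z \in \Gamma. \qquad (5.4) \]

On $\Gamma$ we have $L|z| = LR$ with $L \le k^{\frac{n}{n+1}}\log k$ and $R \le k^{\frac{1}{2(n+1)}}S \le hk^{\frac{1}{n+1}}(\log k)^{-1}$, so that $LR \le hk$ and

\[ |f(z)| \le c_9^{hk + L|z|} \le c_{10}^{2hk}, \qquad z \in \Gamma. \qquad (5.5) \]

Using equations 5.4 and 5.5, we can now bound the integral $I$ by,

\[ I \le 2 \cdot 2^{hk}\left(\frac{S}{R}\right)^{hT}c_{10}^{2hk} = c_{11}^{hk}\,k^{-\frac{hT}{2(n+1)}} \le B^{-\frac{1}{4}C}. \]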

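To bound the double sum of integrals, $J$, we write the derivatives of $f$ in terms of partial derivatives of $\Delta$,

\[ f_m^{(t)}(z) = \left(\frac{\partial}{\partial z_0} + \cdots + \frac{\partial}{\partial z_{n-1}}\right)^{t}\frac{1}{m_0!}D_0^{m_0}\cdots D_{n-1}^{m_{n-1}}\Phi(z_0, \dots, z_{n-1})\Big|_{z_0 = \cdots = z_{n-1} = z}, \qquad t = m_0 + \cdots + m_{n-1} \]

Recall that $\frac{\partial}{\partial z_0} = D_0$ and $\frac{\partial}{\partial z_i} = \log\alpha_i\,D_i$ for $1 \le i \le n-1$. The powers of the derivative are then

\[ \left(\frac{d}{dz}\right)^{t} = (D_0 + \log\alpha_1\,D_1 + \cdots + \log\alpha_{n-1}\,D_{n-1})^{t} = \sum_{\tau} q(\tau)\frac{1}{\tau_0!}D^{\tau} \]

where we use the notation $D^{\tau} = D_0^{\tau_0}\cdots D_{n-1}^{\tau_{n-1}}$. The main point is that these differential operators are not defined over $\mathbb{Q}$.

\begin{align*}
f_m^{(t)}(z) &= \sum_{\tau} q(\tau)\frac{1}{m_0!\,\tau_0!}D_0^{m_0+\tau_0}\cdots D_{n-1}^{m_{n-1}+\tau_{n-1}}\Phi(z_0, \dots, z_{n-1})\Big|_{z_0 = \cdots = z_{n-1} = z} \\
&= \sum_{\tau} q^{*}(\tau)\frac{1}{(m_0+\tau_0)!}D_0^{m_0+\tau_0}\cdots D_{n-1}^{m_{n-1}+\tau_{n-1}}\Phi(z_0, \dots, z_{n-1})\Big|_{z_0 = \cdots = z_{n-1} = z} \\
&= \sum_{\tau,\ |m| \le k-1} q^{*}(\tau)\,f_{m+\tau}(z), \qquad 0 \le |m| \le T-1
\end{align*}

where

\[ q^{*}(\tau) = \frac{t!}{\tau_1!\cdots\tau_{n-1}!}\binom{\tau_0 + m_0}{\tau_0}(\log\alpha_1)^{\tau_1}\cdots(\log\alpha_{n-1})^{\tau_{n-1}}. \]

Hence $c^{k} \cdot \sum\frac{t!}{\tau_1!\cdots\tau_{n-1}!} \le c^{k} \cdot n^{T}$. And if $m_0 + \cdots + m_{n-1} \le T-1$, then the summation goes over $f_{m+\tau}(z)$ with $m_0 + \cdots + m_{n-1} \le k-1$.

\[ |f_{m+\tau}(s)| \le \underbrace{|g_{m+\tau}(s)|}_{=0} + |f_{m+\tau}(s) - g_{m+\tau}(s)| \]

\[ |f_m^{(t)}(s)| < c_{12}^{hk}B^{-C/2}, \qquad 0 \le t < T-1, \quad 0 \le s \le h-1 \qquad (5.6) \]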

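Finally, we need a bound on the integrals over $\Gamma_s$. For $z \in \Gamma_s$,

\[ |z - j| \ge \begin{cases} 1/2 & s = j \\ (s - j)/2 & 0 \le j < s \\ (j - s)/2 & s < j \le h-1 \end{cases} \]

Then

\[ |F(z)| \ge \left(2^{-h}\,s!\,(h-s)!\,(h-s)^{-1}\right)^{T}, \qquad \text{where } \binom{h}{s}\,s!\,(h-s)! = h!. \]

Stirling's formula gives that $h! \ge (h/e)^{h}\sqrt{2\pi h}$. Finally, $|F(z)| \ge (8e)^{-hT}h^{hT}$. This gives for $z \in \Gamma_s$,

\[ \left|\frac{F(\ell)}{F(z)}\right| \le \frac{S^{hT}(8e)^{hT}}{h^{hT}} \le \left(\frac{k^{\frac{1}{2(n+1)}}h}{h}\right)^{hT}(8e)^{hT} \le \left(8e\,k^{\frac{1}{2(n+1)}}\right)^{hT} \qquad (5.7) \]

We are finally ready to bound the double sum of integrals using equation 5.7 and 5.6 to give,

\begin{align*}
J &\le \sum_{s=0}^{h-1}\sum_{t=0}^{T-1}\frac{1}{t!}\,c^{hk}B^{-\frac{1}{2}C}k^{\frac{hT}{2(n+1)}} \\
&\le \sum_{s=0}^{h-1}\left(\sum_{t=0}^{T-1}\frac{1}{t!}\right)B^{-\frac{1}{4}C} \le h\,e\,B^{-\frac{1}{4}C}
\end{align*}

provided that $B^{\frac{1}{4}C} > (ck)^{hk}$. The bounds on $I$ and $J$ together bound the absolute value of $f(\ell)$:

\[ |f(\ell)| \le |I| + |J| \le B^{-\frac{1}{4}C}(1 + he) \le B^{-\frac{1}{5}C} \]

Details for abbreviated parts of the proof can be found in Baker and Wüstholz.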

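Proposition 5.15. If $\log|\Lambda| < -C\log B$ we have $g(\ell) = 0$ for all $0 \le \ell < S$ and all non-negative integers $m_0, \dots, m_{n-1}$ with $m_0 + \cdots + m_{n-1} \le T$.

Proof. Let

\[ P(X_0, \dots, X_n) := \sum_{\lambda_{-1}=0}^{L'}\ \sum_{\lambda_0, \dots, \lambda_n \le L} p(\lambda_{-1}, \dots, \lambda_n)\,\Delta(X_0 + \lambda_{-1}, h, \lambda_0 + 1, 0)\,X_1^{\lambda_1}\cdots X_n^{\lambda_n} \]

Then for $m_0, \dots, m_{n-1}$ with $m_0 + \cdots + m_{n-1} < T$,

\[ g(\ell) = \frac{1}{m_0!}D_0^{m_0}\cdots D_{n-1}^{m_{n-1}}P(X_0, \dots, X_n)\Big|_{(X_0, \dots, X_n) = (\ell, \alpha_1^{\ell}, \dots, \alpha_n^{\ell})} = \Phi(\ell, \dots, \ell) \]

where $D_0 = \frac{\partial}{\partial X_0}$ and $D_i = X_i\frac{\partial}{\partial X_i} - \frac{b_i}{b_n}X_n\frac{\partial}{\partial X_n}$. Hence $P$ satisfies,

\[ D_0^{m_0}\cdots D_{n-1}^{m_{n-1}}P(\ell, \alpha_1^{\ell}, \dots, \alpha_n^{\ell}) = 0, \]

for all $0 \le \ell < S$ and all $m_0, \dots, m_{n-1}$ such that $m_0 + \cdots + m_{n-1} < T$.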

5.4 Multiplicity estimates

The presentation here can be supplemented by the book by Baker and Wüstholz. If $S(T/n)^{n} \ge kL'L^{n+1}$ for sufficiently large  and k, then $\alpha_1, \dots, \alpha_n$ are multiplicatively dependent, i.e., there exist $X_1, \dots, X_n \in \mathbb{Z}$ not all zero, such that

\[ \alpha_1^{X_1}\cdots\alpha_n^{X_n} = 1. \qquad (5.8) \]

This is satisfied for our $P(X_0, \dots, X_n)$. Siegel's lemma says that $ST^{n} \le L'L^{n}$ and extrapolation says that $ST^{n} > kL'L^{n}n^{n}$.

Theorem 5.16. If $\alpha_1, \dots, \alpha_n$ are multiplicatively independent, then

\[ \log|\Lambda| > -C\log B \]

for some effectively computable constant $C > 0$ which depends only on $[K : \mathbb{Q}]$, $h(\alpha_1), \dots, h(\alpha_n)$ and $n$.

If the terms are multiplicatively dependent, then taking the logarithm of equation 5.8 gives a linear relation (assuming $X_n \ne 0$),

\[ \log\alpha_n = -\frac{1}{X_n}\left(X_1\log\alpha_1 + \cdots + X_{n-1}\log\alpha_{n-1}\right). \]

The $n$th case is thereby reduced to the $(n-1)$th case.
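To make the notion of multiplicative dependence concrete, the following Python sketch searches for a relation of the form 5.8 among positive rational numbers by brute force. It is an illustration only: the function name and the search bound are hypothetical, and exact rational arithmetic with `fractions.Fraction` stands in for the number field setting of the notes.

```python
from fractions import Fraction
from itertools import product

def multiplicative_relation(alphas, bound):
    """Search exponent vectors X in [-bound, bound]^n, not all zero,
    with alpha_1^X_1 * ... * alpha_n^X_n == 1. Return the first one found, or None."""
    for X in product(range(-bound, bound + 1), repeat=len(alphas)):
        if any(X):
            value = Fraction(1)
            for alpha, x in zip(alphas, X):
                value *= Fraction(alpha) ** x
            if value == 1:
                return X
    return None
```

For the numbers 2, 3, 6 the search finds exponents with $2^{X_1}3^{X_2}6^{X_3} = 1$, and taking logarithms recovers $\log 6 = \log 2 + \log 3$; for the distinct primes 2, 3, 5 no relation exists, in line with unique factorization.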

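Lecture 6

The Unit Equation

6.1 Units and S-units

Let $K \supset \mathbb{Q}$ be a number field, and let $\mathcal{O}_K$ be its ring of integers. Let $\Sigma := \operatorname{Hom}(K, \mathbb{C})$. Then $|\Sigma| = r + 2s$ where $r$ is the number of homomorphisms which factor through $\mathbb{R} \hookrightarrow \mathbb{C}$ and $s$ is the number of pairs of complex conjugate embeddings.

Theorem 6.1 (Dirichlet's Unit Theorem). With the above notation,

\[ \mathcal{O}_K^{\times} \cong \mu(K) \times \mathbb{Z}^{t}, \qquad t = r + s - 1 \]

where $\mu(K)$ is the group of roots of unity in $K$.

Let $\varepsilon_1, \dots, \varepsilon_t \in \mathcal{O}_K^{\times}$ be a basis for $\mathcal{O}_K^{\times}/\mu(K)$. Then $x \in \mathcal{O}_K^{\times} \Leftrightarrow x = \zeta \cdot \varepsilon_1^{x_1}\cdots\varepsilon_t^{x_t}$ for some $x_1, \dots, x_t \in \mathbb{Z}$ and some $\zeta \in \mu(K)$. In this lecture, we solve the unit equation:

Definition 6.2 (Unit equation). The unit equation is

\[ x + y = 1, \qquad x, y \in \mathcal{O}_K^{\times}. \qquad (6.1) \]

Definition 6.3 (S-units). If $S$ is a finite set of primes in $\mathcal{O}_K$, then the S-units are the invertible elements of

\[ \mathcal{O}_{K,S} = \left\{\frac{a}{b}\ \Big|\ a, b \in \mathcal{O}_K,\ b \notin \mathfrak{p} \text{ for } \mathfrak{p} \notin S\right\} = \{x \in K \mid |x|_\nu \le 1 \text{ for } \nu \notin S\}. \]

Note that $\mathcal{O}_{K,S} \supset \mathcal{O}_K$.

Definition 6.4 (S-unit equation). The S-unit equation is $x + y = 1$ for $x, y \in \mathcal{O}_{K,S}^{\times}$. A solution of the S-unit equation gives a solution of $ax + by = 1$ with $x, y \in \mathcal{O}_K$.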

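Let $x, y \in \mathcal{O}_K^{\times}$ be solutions of 6.1. Put

\begin{align*}
x &= \xi\,\varepsilon_1^{x_1}\cdots\varepsilon_t^{x_t}, \qquad \xi \in \mu(K),\ x_i \in \mathbb{Z} \\
y &= \eta\,\varepsilon_1^{y_1}\cdots\varepsilon_t^{y_t}, \qquad \eta \in \mu(K),\ y_i \in \mathbb{Z}
\end{align*}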

Let $X := \max(|x_i|)$ and $Y := \max(|y_i|)$. We shall bound $\max(X, Y)$. Without loss of generality, we assume that $X \le Y$. We assume that there exists $\nu \mid \infty$ such that $\log\|y\|_\nu \gg Y$. We will show that this assumption is actually no restriction using the product formula. Equation 6.1 implies that

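\[ \left|\frac{x}{y} + 1\right| = \left|\frac{1}{y}\right| = \frac{1}{\|y\|} \le e^{-Y} \]

where the norm is the $\nu$-norm. Now

\[ \frac{x}{y} = \frac{\xi}{\eta}\,\varepsilon_1^{x_1 - y_1}\cdots\varepsilon_t^{x_t - y_t} = \zeta'\,\varepsilon_1^{x_1 - y_1}\cdots\varepsilon_t^{x_t - y_t} \]

so

\[ \frac{x}{y} + 1 = \zeta'\,\varepsilon_1^{x_1 - y_1}\cdots\varepsilon_t^{x_t - y_t} + 1 = e^{z} + 1 \]

where $z = \sum_{i=1}^{t}(x_i - y_i)\log\varepsilon_i + iq\pi$ for some $q \in \mathbb{Q}$ depending on the root of unity $\zeta'$.

Lemma 6.5. Let $z \in \mathbb{C}$ satisfy

\[ |e^{z} + 1| \le \frac{1}{4}. \]

Then there exists an integer $k$ such that

\[ |z - ik\pi| \le \frac{4}{3}|e^{z} + 1|. \]

Proof. We take the principal branch of the logarithm which satisfies $\log e^{z} = z$ for $0 \le \operatorname{Im} z < 2\pi$. Let $\zeta = e^{z} + 1$, and let $w$ lie in the strip such that $z = w$ modulo $2\pi i$. Then $w = \log e^{z} = \log(\zeta - 1)$. Since $\zeta - 1 = -(1 - \zeta)$, $w$ differs from $\log(1 - \zeta)$ by an odd multiple of $i\pi$. The hypothesis $|\zeta| = |e^{z} + 1| \le \frac{1}{4}$ and the power series formula $\log(1 - \zeta) = -\sum_{n=1}^{\infty}\frac{\zeta^{n}}{n}$ together imply that

\[ |\log(1 - \zeta)| \le \frac{4}{3}|\zeta| = \frac{4}{3}|e^{z} + 1|. \]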
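Lemma 6.5 can be tested numerically. The Python sketch below (the function name is hypothetical) takes the integer $k$ nearest to $\operatorname{Im} z/\pi$, which minimizes $|z - ik\pi|$ over all integers, and returns both sides of the inequality so they can be compared; it returns `None` when the hypothesis $|e^z + 1| \le 1/4$ fails.

```python
import cmath
import math

def check_lemma_6_5(z):
    """Return (k, lhs, rhs) with lhs = |z - i k pi| and rhs = (4/3)|e^z + 1|,
    or None when the hypothesis |e^z + 1| <= 1/4 is not satisfied."""
    zeta = cmath.exp(z) + 1
    if abs(zeta) > 0.25:
        return None
    k = round(z.imag / math.pi)  # integer closest to Im(z)/pi
    return k, abs(z - 1j * k * math.pi), 4 * abs(zeta) / 3
```

For a point close to $3\pi i$ the nearest integer is the odd number $k = 3$, as the proof predicts, since $e^z$ is close to $-1$ only near odd multiples of $i\pi$.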

The Lemma implies that there is some $k \in \mathbb{Q}$ such that

\[ |z - ik\pi| = |\Lambda| \le \frac{4}{3}\left|\frac{x}{y} + 1\right| \]

where $\Lambda = \sum_{i=1}^{t}(x_i - y_i)\log\varepsilon_i - k\log(-1)$. Thus $\log|\Lambda| \ll -Y$, where $Y = \max(|x_i|, |y_i|)$. We now have a linear combination of logarithms, and we would like to apply Baker's theorem. This requires,

\[ |x_i - y_i| \le 2\max(|x_i|, |y_i|) = 2Y \qquad \text{and} \qquad |k| \ll Y. \]

These both hold. Thus $-\log Y \ll \log|\Lambda|$. But this reasoning depends on the hypothesis that at some place $w$ we have $\log|y|_w \gg Y$.

6.2 Unit and regulator

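In this section, we denote the units of $K$ by $U_K = \mathcal{O}_K^{\times}$. Then $\varepsilon \in U_K$ implies that $N(\varepsilon) = \pm 1$. Let $\varepsilon_1, \dots, \varepsilon_t$ be a set of generators of $U_K$, and let $\Sigma_0 := \{\sigma_1, \dots, \sigma_{r+s}\} \subseteq \Sigma = \operatorname{Hom}(K, \mathbb{C})$ be a set of embeddings.

Definition 6.6 (Regulator). Let

\[ M := \left(\log|\sigma_j(\varepsilon_i)|\right)_{\substack{i = 1, \dots, t \\ j = 1, \dots, t+1}} = \left(\log|\varepsilon_i|_w\right) \in M_{t+1,t}(\mathbb{R}). \]

We take any $t \times t$ submatrix $R$ in $M_{t,t}(\mathbb{R})$. Then the regulator of $K$ is $|\det R|$, which is independent of the choice of $R$.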
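As a small worked case, take $K = \mathbb{Q}(\sqrt{2})$, where $r = 2$, $s = 0$, $t = 1$ and $\varepsilon = 1 + \sqrt{2}$ is a fundamental unit. The Python sketch below (hypothetical function names, floating-point arithmetic) builds the $2 \times 1$ matrix of logarithms and lists the absolute determinants of its $1 \times 1$ submatrices; both equal $\log(1 + \sqrt{2})$ because the two entries sum to zero.

```python
import math

def log_embedding_matrix():
    """(t+1) x t matrix (log|sigma_j(eps)|) for K = Q(sqrt 2), eps = 1 + sqrt 2."""
    root = math.sqrt(2)
    embeddings_of_eps = [1 + root, 1 - root]   # sigma_1(eps), sigma_2(eps)
    return [[math.log(abs(value))] for value in embeddings_of_eps]

def regulator_candidates(matrix):
    """Absolute determinants of all 1 x 1 submatrices (one per deleted row)."""
    return [abs(row[0]) for row in matrix]
```

The two candidate determinants differ only in sign before taking absolute values, which illustrates why the choice of submatrix does not matter in this example.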

Taking the logarithm of the absolute value of y gives

\[ \log|y|_w = y_1\log|\varepsilon_1|_w + \cdots + y_t\log|\varepsilon_t|_w, \qquad w \in \Sigma_0. \]

As $w$ ranges through the embeddings $\Sigma_0$, this produces a system of equations for $y_1, \dots, y_t$. Hence

\[ Y = \max|y_i| \ll \max_{w}(\log|y|_w). \]

So there exists a $w$ such that $\log|y|_w \gg Y$, and hence there are only finitely many solutions to the unit equation.
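Finiteness can be observed directly in a small case. The Python sketch below is an illustration with hypothetical function names: it works in $\mathbb{Z}[\varphi]$ with $\varphi = (1 + \sqrt{5})/2$, the ring of integers of $\mathbb{Q}(\sqrt{5})$, writes elements as $a + b\varphi$, and uses the norm $a^2 + ab - b^2$ to test for units. It then searches a box of coefficients for pairs of units with $x + y = 1$.

```python
def norm(a, b):
    """Norm of a + b*phi in Z[phi], where phi^2 = phi + 1."""
    return a * a + a * b - b * b

def unit_equation_solutions(bound):
    """All pairs (x, y) of units with x + y = 1 and coefficients of x in [-bound, bound]."""
    solutions = []
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            # x = a + b*phi and y = 1 - x = (1 - a) - b*phi must both have norm +-1
            if abs(norm(a, b)) == 1 and abs(norm(1 - a, -b)) == 1:
                solutions.append(((a, b), (1 - a, -b)))
    return solutions
```

The search returns six solutions, among them $\varphi^{2} + (-\varphi) = 1$ and $\varphi^{-1} + \varphi^{-2} = 1$, and enlarging the box does not produce more. A brute-force search of this kind proves nothing by itself; the point of the effective bounds above is that they tell one how large a box suffices.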

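Lecture 7

Integral points in $\mathbb{P}^{1} \setminus \{0, 1, \infty\}$

Let $K$ be a number field, and let $X = \operatorname{Spec} R$ be an affine variety for $R = K[f_1, \dots, f_n]$. The variety $X$ embeds in affine space by the morphism $x \mapsto (f_1(x), \dots, f_n(x)) \in \mathbb{A}^{n}$.

Definition 7.1 (Integral points). The integral points of $X$ are

\[ X(\mathcal{O}_K) = \{P \in X \mid f_1(P), \dots, f_n(P) \in \mathcal{O}_K\} \]

Let $X$ be projective over $K$, and let $D$ be a very ample divisor.

[Diagram: the embeddings $X_K \to \mathbb{P}^{N}_K \to \mathbb{P}^{N}_{\mathcal{O}_K}$, the maps $X_{\mathcal{O}_K} \setminus D_{\mathcal{O}_K} \to X_{\mathcal{O}_K} \leftarrow D_{\mathcal{O}_K}$ over $\operatorname{Spec}\mathcal{O}_K$, and the sections $\sigma_K : \operatorname{Spec} K \to X_K$ and $\sigma = \sigma_{\mathcal{O}_K}$ from $\operatorname{Spec}\mathcal{O}_K$.]

The image $\sigma(\operatorname{Spec}\mathcal{O}_K)$ does not intersect $D_{\mathcal{O}_K}$.

Now let $X = \mathbb{P}^{1}_K \setminus \{0, 1, \infty\}$ be the projective line with three punctures. It can be expressed as the affine variety $X = \mathbb{A}^{1} \setminus \{0, 1\}$ whose regular functions include $T$, $\frac{1}{T}$, $T - 1$ and $\frac{1}{T-1}$. These generate the coordinate ring $R = K[T, T^{-1}, T - 1, (T-1)^{-1}]$. By definition, a point $P \in X$ is an integral point if and only if the values of these regular functions are integral. Since $1/T(P)$ is defined and integral, $T(P) \in \mathcal{O}_K^{\times}$. Similarly, $1/(T-1)(P)$ is defined and integral, so $(T-1)(P) \in \mathcal{O}_K^{\times}$. We conclude that the integral points of $\mathbb{P}^{1}_K \setminus \{0, 1, \infty\}$ are solutions of the unit equation:

\[ T(P) + (1 - T)(P) = 1, \qquad T(P),\ (1 - T)(P) \in \mathcal{O}_K^{\times} \]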

Examples

Let $E$ be the elliptic curve in normal form $y^{2} = x(x-1)(x-\lambda)$. It forms a ramified cover

\[ E \setminus \{E_1, E_2, E_3\} \to \mathbb{P}^{1} \setminus \{0, 1, \infty\} \]

Integrality of points in $E$ is intimately related to integrality of points in the image, of which there are finitely many. This allows one to conclude that $E$ also has finitely many integral points. A super-elliptic curve has normal form

\[ y^{m} = x(x-1)(x-\lambda). \]

On any super-elliptic curve, there are only finitely many integral points, and they can be determined effectively. This should be compared with Siegel's theorem that the set of integral points on an algebraic curve of genus greater than one is finite. But Siegel's theorem is not effective, hence the work we have done.

Lecture 8

Appendix: Lattice Theory

8.1 Lattices

Let $(E, \langle\cdot,\cdot\rangle)$ be a finite dimensional Euclidean $\mathbb{Q}$-vector space, where $\langle\cdot,\cdot\rangle : E \times E \to \mathbb{Q}$ is a positive-definite bilinear form. A subgroup $\Lambda \subset E$ is called discrete if for all bounded $B \subset E$, the intersection $B \cap \Lambda$ is finite.

Theorem A.1. A subgroup $\Lambda \subset E$ is discrete if and only if

\[ \dim(\mathbb{Q}\Lambda) = \operatorname{rk}(\Lambda) \]

Proof. ($\Leftarrow$) Let $l_1, \dots, l_r \in \Lambda$ be free generators from $\mathbb{Q}\Lambda$ ($r \le \dim(E)$). By assumption, these are linearly independent over $\mathbb{Q}$ and can therefore be completed to a basis of $E$. Because $B \subset E$ is bounded, the coordinate functions of $B \cap \Lambda$ are also bounded, and hence $B \cap \Lambda$ is finite.

($\Rightarrow$) We choose a basis $l_1, \dots, l_r \in \Lambda$ of $\mathbb{Q}\Lambda$ and let

\[ F(\Lambda) := \left\{\sum_{i=1}^{r} x_i l_i\ \Big|\ x_i \in \mathbb{Q},\ -1/2 < x_i \le 1/2\right\}. \]

The set $F(\Lambda)$ is bounded and therefore $F(\Lambda) \cap \Lambda$ is finite by assumption. We set $\Lambda' := \sum_i \mathbb{Z}l_i \subset \Lambda$, and then the inclusion $\Lambda/\Lambda' \hookrightarrow \mathbb{Q}\Lambda/\Lambda' \cong F(\Lambda)$ shows that the quotient $\Lambda/\Lambda'$ is also finite. Therefore the sublattice $\Lambda' \subseteq \Lambda$ has finite index, so that $\Lambda$ is finitely generated and torsion free as a subgroup of $E$. It is thus free and hence

\[ \dim(\mathbb{Q}\Lambda) = \operatorname{rk}(\Lambda') = \operatorname{rk}(\Lambda). \]

A discrete subgroup of maximal rank is called a lattice.

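We now equip $(E, \langle\cdot,\cdot\rangle)$ with the following measure: given an orthonormal basis $e_1, \dots, e_n$ of $E$, we can produce an isomorphism $e$ between $E$ and $\mathbb{Q}^{n}$. We set then $d\mu = e^{*}d\mu_L$, where $d\mu_L$ indicates the Lebesgue measure. We obtain in this way a volume,

\[ \operatorname{Vol}(F(\Lambda)) = \int_{F(\Lambda)} d\mu. \]

Theorem A.2. Let $\Lambda \subset E$ be a lattice, and $e_1, \dots, e_n$ an orthonormal basis of $E$ with $\Lambda = \Phi(\mathbb{Z}e_1 + \cdots + \mathbb{Z}e_n)$. Then,

\[ \operatorname{Vol}(F(\Lambda)) = |\det(\Phi)|. \]

Proof. The pullback satisfies $\Phi^{*}d\mu = |\det(\Phi)|\,d\mu$, and therefore,

\begin{align*}
\operatorname{Vol}(F(\Lambda)) &= \int_{F(\Lambda)} d\mu = \int_{F(\Phi(\mathbb{Z}e_1 + \cdots + \mathbb{Z}e_n))} d\mu \\
&= \int_{F(\mathbb{Z}e_1 + \cdots + \mathbb{Z}e_n)} \Phi^{*}d\mu \\
&= |\det(\Phi)|\int_{F(\mathbb{Z}e_1 + \cdots + \mathbb{Z}e_n)} d\mu \\
&= |\det(\Phi)|\int_{F(\mathbb{Z}^{n})} d\mu_L = |\det(\Phi)|.
\end{align*}

We note further that $\det(\Phi)^{2} = \det(\Phi\Phi^{T}) = \det(\langle l_i, l_j\rangle)$ for $l_i = \Phi(e_i)$.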
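The identity $\det(\Phi)^{2} = \det(\langle l_i, l_j\rangle)$ is easy to check for an explicit basis. The Python sketch below is an illustration with hypothetical function names and an arbitrary sample basis; it computes the determinant by cofactor expansion, forms the Gram matrix of the basis vectors, and compares the two quantities using exact integer arithmetic.

```python
def det(matrix):
    """Determinant by cofactor expansion along the first row (fine for small sizes)."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum(
        (-1) ** j * matrix[0][j] * det([row[:j] + row[j + 1:] for row in matrix[1:]])
        for j in range(len(matrix))
    )

def gram(basis):
    """Gram matrix (<l_i, l_j>) of the basis vectors for the standard scalar product."""
    return [[sum(x * y for x, y in zip(u, v)) for v in basis] for u in basis]

# Rows are the basis vectors l_1, l_2, l_3 of a sample lattice in Z^3.
sample_basis = [[2, 1, 0], [0, 3, 1], [1, 0, 4]]
```

Here the covolume is $|\det| = 25$ and the Gram determinant is $625 = 25^{2}$.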

8.2 Geometry of Numbers

We consider now the lattice $\Lambda \subseteq \mathbb{R}^{n}$ of the Euclidean space $\mathbb{R}^{n}$ viewed with a standard scalar product with respect to which the canonical basis is orthonormal. We call a set $S \subset \mathbb{R}^{n}$ convex if for every pair of points $x, y \in S$, the linear combinations $\lambda x + \mu y \in S$ for all $\lambda + \mu = 1$ with $0 \le \lambda, \mu \le 1$. We call the set $S$ symmetric if $S = -S$. For a closed, symmetric and convex set $K \subset \mathbb{R}^{n}$, we define the successive minima of the lattice $\Lambda$ with respect to $K$ by

\[ \lambda_k := \inf\{\lambda \mid \lambda K \text{ contains } k \text{ linearly independent lattice points}\}. \]

It clearly holds that $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$. Writing $d(\Lambda) = \det(\Lambda)$, we formulate the following theorem of Minkowski.

Theorem A.3 (Minkowski). With the above notation,

\[ \frac{1}{n!} \le \lambda_1\cdots\lambda_n\,\frac{\operatorname{Vol}(K)}{2^{n}d(\Lambda)} \le 1. \]

Proof. We refer the reader to J. W. S. Cassels, An Introduction to the Geometry of Numbers.

We now let $\delta(K, \Lambda) = \frac{\operatorname{Vol}(K)}{2^{n}d(\Lambda)}$, so that

\[ \frac{1}{n!} \le \lambda_1\cdots\lambda_n \cdot \delta(K, \Lambda) \le 1. \]

In particular,

\[ \lambda_1^{n} \le \lambda_1\cdots\lambda_n \le \delta(K, \Lambda)^{-1}. \]

If $\delta(K, \Lambda) \ge 1$, then $\lambda_1 \le 1$, which means that $K$ contains a nonzero lattice point. This is the content of the following corollary.

Corollary A.4 (Minkowski's Lattice Point Theorem). If, under the above assumptions, the inequality

\[ \operatorname{Vol}(K) \ge 2^{n}d(\Lambda) \]

holds, then $K$ contains a non-trivial lattice point.

This immediately implies a corollary.

Corollary A.5. Let $l_1, \dots, l_n \in \operatorname{Hom}(\mathbb{R}^{n}, \mathbb{R})$ be linear forms, and let $0 < c_1, \dots, c_n \in \mathbb{R}$ with $c_1\cdots c_n \ge |\det(l_1, \dots, l_n)|\,d(\Lambda)$. Then there is a $0 \ne u \in \Lambda$ with $|l_j(u)| \le c_j$ for $1 \le j \le n$.

Proof. We define $K$ by $|l_j| \le c_j$ for $1 \le j \le n$. Then,

\[ \operatorname{Vol}(K) = 2^{n}c_1\cdots c_n/|\det(l_1, \dots, l_n)| \ge 2^{n}d(\Lambda). \]
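Corollary A.5 can be illustrated for $\Lambda = \mathbb{Z}^{2}$ and the two linear forms $l_1(u) = u_1 - \sqrt{2}u_2$ and $l_2(u) = u_1 + \sqrt{2}u_2$, whose determinant is $2\sqrt{2}$. The Python sketch below is only a brute-force illustration (the function name, the bounds and the search radius are hypothetical choices): it looks for a nonzero integer vector with $|l_1(u)| \le c_1$ and $|l_2(u)| \le c_2$ where $c_1c_2 = 2\sqrt{2}$.

```python
import math
from itertools import product

def small_lattice_point(forms, bounds, search):
    """Return a nonzero integer vector u with |l_j(u)| <= c_j for all j,
    searching coordinates in [-search, search]; None if no such vector is found."""
    for u in product(range(-search, search + 1), repeat=len(forms)):
        if any(u) and all(
            abs(sum(a * x for a, x in zip(form, u))) <= c
            for form, c in zip(forms, bounds)
        ):
            return u
    return None

root2 = math.sqrt(2)
forms = [(1.0, -root2), (1.0, root2)]   # coefficient rows of l_1 and l_2
bounds = [0.1, 2 * root2 / 0.1]         # c_1 * c_2 equals |det(l_1, l_2)| = 2*sqrt(2)
```

Making $c_1$ small forces $u_1/u_2$ to be a good rational approximation of $\sqrt{2}$; with these bounds the vector $(7, 5)$ and its negative qualify, since $|7 - 5\sqrt{2}| < 0.1$.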

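8.3 Norm and Trace

Let $L/K$ be a finite field extension. Then $L$ is in particular a finite dimensional $K$-vector space. We consider $\xi \in L$ and define $T_\xi \in \operatorname{GL}_K(L)$ by $T_\xi(x) = \xi x$ and let

\[ N_{L/K}(\xi) := \det(T_\xi) \]

\[ \operatorname{Tr}_{L/K}(\xi) := \operatorname{Tr}(T_\xi). \]

The norm and trace enjoy the following properties:

(i) $N_{L/K}(\xi\eta) = N_{L/K}(\xi)N_{L/K}(\eta)$,

(ii) $N_{L/K}(1) = 1$,

(iii) $\operatorname{Tr}_{L/K}(\xi + \eta) = \operatorname{Tr}_{L/K}(\xi) + \operatorname{Tr}_{L/K}(\eta)$.

Therefore the norm and trace are homomorphisms from $L^{*}$ to $K^{*}$ and from $L$ to $K$ respectively. According to the definition $T_\xi(x) = \xi x$, which means $x \in \ker(\xi\,\mathrm{id} - T_\xi)$ and hence $\det(\xi\,\mathrm{id} - T_\xi) = 0$. Therefore $\xi$ is a root of

\[ \det(T\,\mathrm{id} - T_\xi) = T^{n} - \operatorname{Tr}_{L/K}(\xi)T^{n-1} + \cdots + (-1)^{n}N_{L/K}(\xi) = \prod_{\sigma \in \operatorname{Hom}(L/K)}(T - \sigma\xi) \]

Hence a theorem.

Theorem A.6. For separable extensions $L/K$ we have,

(i) $\det(T\,\mathrm{id} - T_\xi) = \prod_{\sigma \in \operatorname{Hom}(L/K)}(T - \sigma\xi)$,

(ii) $N_{L/K}(\xi) = \prod_{\sigma}\sigma\xi$,

(iii) $\operatorname{Tr}_{L/K}(\xi) = \sum_{\sigma}\sigma\xi$.

The following statements are left to the reader as an exercise.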

(i) M ⊇ L ⊇ K ⇒ TrM/K = TrL/K ◦ TrM/L,

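(ii) $N_{M/K} = N_{L/K} \circ N_{M/L}$,

(iii) If $L/K$ is separable, the function

\[ \operatorname{Tr}_{L/K} : L \times L \to K, \qquad (\xi, \eta) \mapsto \operatorname{Tr}(\xi\eta) \]

is non-degenerate, symmetric and bilinear.

(iv) If $L \supseteq K_s \supseteq K$, where $K_s$ is the separable closure of $K$ in $L$ and $L \supset K_s$ is purely inseparable, then for $\xi \in K_s$

\[ \operatorname{Tr}_{L/K}(\xi) = [L : K_s]\operatorname{Tr}_{K_s/K}(\xi) \]

and

\[ N_{L/K}(\xi) = N_{K_s/K}(\xi)^{[L : K_s]}. \]

Now let $K$ be an algebraic number field. Then the norm of an ideal $\mathfrak{a} \subseteq \mathcal{O}_K$ is defined by $N_K(\mathfrak{a}) := (\mathcal{O}_K : \mathfrak{a})$.

For $\xi \in \mathcal{O}_K$ we set $N_K(\xi) := N_K(\xi\mathcal{O}_K)$; this definition agrees with the one given earlier. Since $\mathcal{O}_K$ is a Dedekind domain,

\[ \mathfrak{a} = \prod_{\mathfrak{p}}\mathfrak{p}^{\nu_{\mathfrak{p}}(\mathfrak{a})} \qquad (\mathfrak{a} \ne 0) \]

and by the Chinese Remainder theorem it follows that

\[ \mathcal{O}_K/\mathfrak{a} \cong \prod_{\mathfrak{p}}\mathcal{O}_K/\mathfrak{p}^{\nu_{\mathfrak{p}}(\mathfrak{a})}, \]

hence

\[ N_K(\mathfrak{a}) = \prod_{\mathfrak{p}}N_K(\mathfrak{p}^{\nu_{\mathfrak{p}}(\mathfrak{a})}) \]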

Lemma A.7. With the above notation,

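\[ \mathfrak{p}^{n}/\mathfrak{p}^{n+1} \cong \mathcal{O}/\mathfrak{p}. \]

Proof. There are isomorphisms,

\[ \mathcal{O}/\mathfrak{p} \cong \mathcal{O}_{\mathfrak{p}}/\mathfrak{p}\mathcal{O}_{\mathfrak{p}} \cong \mathcal{O}_{\mathfrak{p}}/\pi\mathcal{O}_{\mathfrak{p}} \]

and

\[ \mathfrak{p}^{n}/\mathfrak{p}^{n+1} \cong \mathfrak{p}^{n}\mathcal{O}_{\mathfrak{p}}/\mathfrak{p}^{n+1}\mathcal{O}_{\mathfrak{p}} \cong \pi^{n}\mathcal{O}_{\mathfrak{p}}/\pi^{n+1}\mathcal{O}_{\mathfrak{p}}. \]

The final term is an $\mathcal{O}_{\mathfrak{p}}/\pi\mathcal{O}_{\mathfrak{p}}$-vector space of dimension one. It follows that

\[ \mathfrak{p}^{n}/\mathfrak{p}^{n+1} \cong \mathcal{O}/\mathfrak{p}. \]

The immediate conclusion gives that $N_K(\mathfrak{p}^{n}) = N_K(\mathfrak{p})^{n}$, hence

\[ N_K(\mathfrak{a}) = \prod_{\mathfrak{p}}N_K(\mathfrak{p})^{\nu_{\mathfrak{p}}(\mathfrak{a})} \]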
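The definitions of norm and trace in this section can be made concrete for $L = \mathbb{Q}(\sqrt{2})$ over $K = \mathbb{Q}$. In the Python sketch below (hypothetical function names; elements $a + b\sqrt{2}$ are stored as pairs of integers), the matrix of $T_\xi$ in the basis $(1, \sqrt{2})$ is written down and its determinant and trace are read off. They agree with the product and sum of the two conjugates $a \pm b\sqrt{2}$, as in Theorem A.6.

```python
def multiplication_matrix(a, b):
    """Matrix of x -> (a + b*sqrt2) * x in the basis (1, sqrt2); columns are images."""
    return [[a, 2 * b], [b, a]]

def norm_and_trace(a, b):
    """Return (N(xi), Tr(xi)) for xi = a + b*sqrt2 as determinant and trace of T_xi."""
    m = multiplication_matrix(a, b)
    norm = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    trace = m[0][0] + m[1][1]
    return norm, trace

def multiply(x, y):
    """Product of a + b*sqrt2 and c + d*sqrt2, both given as integer pairs."""
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)
```

For $\xi = 3 + 2\sqrt{2}$ this gives norm 1 and trace 6. Since the index of the image of $\mathbb{Z}^{2}$ under an integer matrix is the absolute value of its determinant, the same matrix also shows $(\mathbb{Z}[\sqrt{2}] : \xi\mathbb{Z}[\sqrt{2}]) = |N(\xi)|$ for nonzero $\xi$.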

8.4 Ramification and discriminants

Let $R$ be a Dedekind domain with fraction field $K$, and furthermore let $L \supseteq K$ be a finite, separable extension containing $R'$, the integral closure of $R$ in $L$.

Definition A.8. Let $\mathfrak{P}$ be a nonzero prime ideal in $R'$ and $\mathfrak{P} \cap R = \mathfrak{p}$. We say that $\mathfrak{P}$ is ramified over $R$ if either $e(\mathfrak{P}) > 1$ or $k(\mathfrak{P})$ is non-separable over $k(\mathfrak{p})$. One says the prime ideal $\mathfrak{p}$ is ramified in $R'$ if there is a prime ideal $\mathfrak{P}$ in $R'$ which lies over $\mathfrak{p}$ and is ramified over $R$.

Let $V$ be a vector space, $B = \{v_1, \dots, v_n\}$ a basis, and $\beta : V \times V \to K$ a bilinear form. We set

\[ d(B) := \det(\beta(v_i, v_j)) \]

and call this the discriminant of the basis $B$ with respect to $\beta$. If $L/K$ is a finite field extension of number fields and $\mathfrak{a} \subset L$ is a finitely generated $\mathcal{O}_K$-module, then we choose a free submodule $a_1\mathcal{O}_K + \cdots + a_n\mathcal{O}_K$ and examine the discriminant

\[ d = \det(\operatorname{Tr}_{L/K}(a_i a_j)). \]

The ideal generated in $\mathcal{O}_K$ by all such $d$ is called the discriminant of $\mathfrak{a}$ and is denoted by $\delta_{L/K}(\mathfrak{a})$. In the particular case when $\mathfrak{a} = \mathcal{O}_L$, one calls $\delta_{L/K} = d(\mathcal{O}_L)$ the discriminant of $L/K$. It is an ideal in $\mathcal{O}_K$.

Proposition A.9. Let $S$ be a multiplicative set in $\mathcal{O}_K$. Then

\[ \delta_{L/K}(\mathfrak{a})_S = \delta_{L/K}(\mathfrak{a}_S) \]

Proof. A basis of $\mathfrak{a}$ is a basis of $\mathfrak{a}_S$, and every basis $B_S \subseteq \mathfrak{a}_S$ defines a basis of $\mathfrak{a}$ by multiplying by an element $s \in S$. From this follows the conclusion, because

\[ d_{sB} = s^{2[L:K]}d_B. \]

The relation between ramification and discriminant is given through the following theorem.

Theorem A.10. The prime ideals of $R$ which are ramified in $R'$ are exactly those ideals which divide the discriminant $\delta_{L/K}$.

Proof. Let $\mathfrak{p} \subset \mathcal{O}_K$ be a prime ideal, and $S = \mathcal{O}_K \setminus \mathfrak{p}$. By Proposition A.9, we can localize at $S$ and consider the rings $R = S^{-1}\mathcal{O}_K$ as well as $R' = S^{-1}\mathcal{O}_L$. Then $\mathfrak{p}$ is ramified in $\mathcal{O}_L$ exactly when $\mathfrak{p}R$ is ramified in $R'$. The ring $R$ is a discrete valuation ring and therefore a principal ideal ring. For this reason, $R'$ is a free $R$-module which can be written as $R' = x_1R \oplus \cdots \oplus x_rR$, etc. (see Gerald J. Janusz, Algebraic Number Fields).

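Minkowski Space¹

¹ J. Neukirch, Algebraische Zahlentheorie, III.4

Let $F : \mathbb{C} \to \mathbb{C}$ be complex conjugation, i.e., $\operatorname{Gal}(\mathbb{C}/\mathbb{R}) = \{1, F\}$ and $\Sigma = \operatorname{Hom}_{\mathbb{Q}}(K, \mathbb{C})$. Then we obtain the embedding

\[ j : K \to \Gamma(\Sigma, \mathbb{C}) \cong K \otimes_{\mathbb{Q}} \mathbb{C}, \qquad x \mapsto j(x), \quad j(x)(\tau) = \tau(x),\ \tau \in \Sigma, \]

where $\Gamma(\Sigma, \mathbb{C}) = \{z : \Sigma \to \mathbb{C}\}$. The complex conjugation $F$ operates on $\Sigma$ by $\tau \mapsto F(\tau) = F \circ \tau$ and on $\Gamma(\Sigma, \mathbb{C})$ by $z \mapsto F(z)$ via $F(z)(\tau) = F(z(F(\tau))) = (F \circ z \circ F)(\tau)$. The space $\Gamma(\Sigma, \mathbb{C})$ possesses a non-trivial involution $x \mapsto F(x)$, and we obtain an eigenspace decomposition,

\[ \Gamma(\Sigma, \mathbb{C}) = \Gamma(\Sigma, \mathbb{C})^{+} \oplus \Gamma(\Sigma, \mathbb{C})^{-} \]

into the eigenspaces with eigenvalues 1 and -1 respectively. For $z \in \Gamma(\Sigma, \mathbb{C})^{+}$ we have $z(F(\tau)) = F(z(\tau))$, in other words, $z$ and $F$ commute. The embedding $j$ factors through $\Gamma(\Sigma, \mathbb{C})^{+}$ because

\begin{align*}
F(j(x))(\tau) &= (F \circ j(x) \circ F)(\tau) = F(j(x)(F(\tau))) \\
&= F(F(\tau)(x)) = (F \circ F \circ \tau)(x) = \tau(x) = j(x)(\tau)
\end{align*}

for an arbitrary $\tau \in \Sigma$, and therefore $j(x) \in \Gamma(\Sigma, \mathbb{C})^{+}$. This space together with the Euclidean scalar product that we now introduce is called the Minkowski space. We define a sesquilinear form on $\Gamma := \Gamma(\Sigma, \mathbb{C})$ by

\[ (x, y) \in \Gamma \times \Gamma \mapsto \langle x, y\rangle := \sum_{\tau} x(\tau)F(y(\tau)). \]

It satisfies the property that $F\langle x, y\rangle = \langle y, x\rangle$ and

\begin{align*}
\langle Fx, Fy\rangle &= \sum_{\tau} F(x)(\tau)\,F(F(y)(\tau)) \\
&= \sum_{\tau} (F \circ x \circ F)(\tau)\,(F(F \circ y \circ F))(\tau) \\
&= F\left(\sum_{\tau} x(\tau)F(y(\tau))\right) \\
&= F\langle x, y\rangle,
\end{align*}

that means $F$ commutes with $\operatorname{Tr}$, so that,

\[ \operatorname{Tr}(x) = \operatorname{Tr}(F(x)) = F(\operatorname{Tr}(x)) \]

for $x \in \Gamma(\Sigma, \mathbb{C})^{+}$, and therefore $\operatorname{Tr} \circ j = \operatorname{Tr}_{K/\mathbb{Q}}$ as claimed. We now construct a canonical measure on $(\Gamma(\Sigma, \mathbb{C})^{+}, \langle\cdot,\cdot\rangle)$. Assume the volume of an orthonormal parallelepiped $P$ is $\operatorname{Vol}(P) = 1$. Let $b_1, \dots, b_n$ be a basis of $\Gamma(\Sigma, \mathbb{C})^{+}$, and let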

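\[ P(b_1, \dots, b_n) = \{x_1b_1 + \cdots + x_nb_n \mid 0 \le x_i < 1\}, \]

be the parallelepiped defined by the basis. The transformation

\[ b_i = \sum_{k} a_{ik}e_k, \qquad 1 \le i \le n \]

defines a matrix $A := (a_{ik})$, and we obtain

\begin{align*}
\operatorname{Vol}(P(b_1, \dots, b_n))^{2} &= \det(\langle b_i, b_j\rangle) \\
&= \det\left(\left\langle\sum_{k} a_{ik}e_k,\ \sum_{l} a_{jl}e_l\right\rangle\right) \\
&= \det\left(\sum_{k} a_{ik}a_{jk}\right) \\
&= \det(AA^{T})
\end{align*}

The basis $b_1, \dots, b_n$ generates the lattice,

\[ \Lambda = \mathbb{Z}b_1 + \cdots + \mathbb{Z}b_n \subseteq \Gamma(\Sigma, \mathbb{C})^{+} \]

and we define the lattice volume to be,

\[ d(\Lambda) := \operatorname{Vol}(P(b_1, \dots, b_n)). \]

Example A.11. $\mathcal{O}_K$ is a free $\mathbb{Z}$-module of the form

\[ \mathcal{O}_K = \mathbb{Z}\omega_1 + \cdots + \mathbb{Z}\omega_n, \]

and therefore $j(\mathcal{O}_K) = \mathbb{Z}j(\omega_1) + \cdots + \mathbb{Z}j(\omega_n)$. It follows that

\[ d(\mathcal{O}_K)^{2} = \det(\langle j\omega_k, j\omega_l\rangle) = |d_K| = |\det(\operatorname{Tr}_{K/\mathbb{Q}}(\omega_k\omega_l))|. \]

For $\mathfrak{a} \subset \mathcal{O}_K$, $\Lambda := j\mathfrak{a} \subseteq \Gamma(\Sigma, \mathbb{C})^{+}$ is a sub-lattice of $j(\mathcal{O}_K)$, and one obtains the relation,

\[ d(\mathfrak{a}) = N(\mathfrak{a})\,d(\mathcal{O}_K) = N(\mathfrak{a})\sqrt{|d_K|}. \]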
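For a totally real example, take $K = \mathbb{Q}(\sqrt{2})$ with integral basis $\omega_1 = 1$, $\omega_2 = \sqrt{2}$, where both embeddings are real and $j(a + b\sqrt{2}) = (a + b\sqrt{2},\ a - b\sqrt{2})$. The Python sketch below (hypothetical function names; the trace is computed from the coefficient formula $\operatorname{Tr}(a + b\sqrt{2}) = 2a$) compares the determinant of the trace form with the Gram determinant of the embedded basis.

```python
import math

def multiply(x, y):
    """Product of a + b*sqrt2 and c + d*sqrt2, given as integer pairs."""
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)

def trace(x):
    """Trace of a + b*sqrt2 over Q, i.e. the sum of the two conjugates."""
    return 2 * x[0]

def minkowski_embedding(x):
    """Image j(x) = (sigma_1(x), sigma_2(x)) in R^2."""
    a, b = x
    return (a + b * math.sqrt(2), a - b * math.sqrt(2))

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

basis = [(1, 0), (0, 1)]   # omega_1 = 1, omega_2 = sqrt2
trace_form = [[trace(multiply(u, v)) for v in basis] for u in basis]
embedded = [minkowski_embedding(u) for u in basis]
gram_matrix = [[sum(p * q for p, q in zip(u, v)) for v in embedded] for u in embedded]
```

Both determinants equal 8, the discriminant of $\mathbb{Q}(\sqrt{2})$, so the lattice volume is $d(\mathcal{O}_K) = \sqrt{8}$.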

The logarithmic Minkowski space²

² J. Neukirch, Algebraische Zahlentheorie, I.7