<<

Algebraic Theory summary of notes

Robin Chapman 3 May 2000, revised 28 March 2004, corrected 4 January 2005

This is a summary of the 1999–2000 course on the- ory. Proofs will generally be sketched rather than presented in detail. Also, examples will be very thin on the ground. I first learnt algebraic from Stewart and Tall’s textbook (Chapman & Hall, 1979) (latest edition retitled Algebraic Number Theory and Fermat’s Last Theorem (A. K. Peters, 2002)) and these notes owe much to this book. I am indebted to Artur Costa Steiner for pointing out an error in an earlier version.

1 Algebraic and

We say that α ∈ C is an algebraic number if f(α) = 0 for some monic f ∈ Q[X]. We say that β ∈ C is an algebraic if g(α) = 0 for some g ∈ Z[X]. We let A and B denote the sets of algebraic numbers and algebraic integers respectively. Clearly B ⊆ A, Z ⊆ B and Q ⊆ A.

Lemma 1.1 Let α ∈ A. Then there is β ∈ B and a nonzero m ∈ Z with α = β/m.

Proof There is a monic polynomial f ∈ Q[X] with f(α) = 0. Let m be the product of the denominators of the coefficients of f. Then g = mf ∈ Z[X]. Pn j Write g = j=0 ajX . Then an = m. Now

n n−1 X n−1+j j h(X) = m g(X/m) = m ajX j=0

1 is monic with integer coefficients (the only slightly problematical coefficient n −1 n−1 is that of X which equals m Am = 1). Also h(mα) = m g(α) = 0. Hence β = mα ∈ B and α = β/m.  Let α ∈ A. Then there is a monic polynomial f ∈ Q[X] of least degree such that f(α) = 0. This polynomial is uniquely determined.

Proposition 1.1 Let α ∈ A. Then there is precisely one monic polynomial f ∈ Q[X] of minimum degree with f(α) = 0. This polynomial f has the property that if g ∈ Q[X] and g(α) = 0 then f | g.

Proof Note first that if h ∈ Q[X] is a nonzero polynomial with deg(h) < −1 deg(f), then h(α) 6= 0 since otherwise h1 = a h is a monic polynomial, where a is the leading coefficient of h, with the property that deg(h1) < deg(f) and h1(α) = 0. That would contradict the definition of f. Now f is unique, since if f1 had the same degree as f and also satisfied the same conditions then h = f − f1, if nonzero, has h ∈ Q[X], h(α) = 0 and deg(h) < deg(g) which is impossible. Now let g ∈ Q[X] and suppose that g(α) = 0. By the (Proposition A.1), g = qf + h where q, h ∈ Q[X], and either h = 0 (which means that f | g as we want) or h 6= 0 and deg(h) < deg(f). But as h(α) = g(α) − f(α)q(α) = 0 this is impossible.  We call this f the minimum polynomial of α and call its degree the degree of α. Minimum are always irreducible.

Lemma 1.2 Let f be the minimum polynomial of α ∈ A. Then f is irre- ducible over Q.

Proof If f is not irreducible then f = gh where g, h ∈ Q[X] are monic polynomials of degree less than that of f. Then 0 = f(α) = g(α)h(α) and so either g(α) = 0 or h(α) = 0. We may assume g(α) = 0. Then f | g which is impossible since deg(g) < deg(f).  Suppose the minimum polynomial f of α lies in Z[X]. Then, since f is monic and f(α) = 0, α is an . In fact the converse holds: if α ∈ B then its minimum polynomial lies in Z[X]. We need to study integer polynomials in more detail to prove this. A nonzero polynomial f ∈ Z[X] is primitive if the greatest common of its coefficients is 1. Equivalently f is primitive if there is no dividing all its coefficients.

Lemma 1.3 (Gauss’s Lemma) Let f, g ∈ Z[X] be primitive polynomials. Then fg is also a primitive polynomial.

2 Proof Write

2 m f(X) = a0 + a1X + a2X + ··· + amX and 2 n g(X) = b0 + b1X + b2X + ··· + bnX . We show that there is no prime p dividing all the coefficients of fg. Take a prime p. As f is primitive there is a coefficient of f not divisible by p; let ar be the first such. Similarly let bs be the first coefficient of g not divisible r+s by p. Then p | ai for i < r and p | bj for j < s. The coefficient of X in fg is X cr+s = aibj. i+j=r+s

This sum contains the term arbs, which is not divisible by p. Its other terms aibj are all divisible by p, since they either have i < r or j < s. Hence cr+s is not divisible by p. As there is no prime dividing all the coefficients of fg, the polynomial fg is primitive.  If f ∈ Z[X] is nonzero, let a be the of the coefficients of f. Then f = af1 where f1 is primitive. We call a the content of f and denote it by c(f). More generally, let g be a nonzero element of Q[X]. Then bg ∈ Z[X] where the positive integer b is the product of the denominators of the coefficients of f. Then bg = cg1 where c is the content of bg and g1 is primitive. Hence g = (c/b)g1 where q1 is primitive polynomial in Z[X] and c/b is a positive rational. We can write any nonzero g ∈ Z[X] as g = rg1 with r ∈ Q, r > 0 and g1 ∈ Z[X] being primitive. It’s an instructive to show that this r is uniquely determined; hence it makes sense to call r the content of g. Putting s = 1/r we see that there is a positive rational s with sg a primitive element of Z[X]. We now show that if a monic polynomial in Z[X] factors over the rationals then it factors over the integers.

Proposition 1.2 Let f and g be monic polynomials with f ∈ Z[X] and g ∈ Q[X]. If g | f then g ∈ Z[X].

Proof Suppose that g | f. Then f = gh where h ∈ Q[X]. Then h is monic, as both f and g are. There are positive rationals r and s with rg and sh primitive elements of Z[X]. The leading coefficients of rg and sh are r and s respectively, so that r, s ∈ Z. By Gauss’s lemma, (rg)(sh) = (rs)f is primitive. But since f ∈ Z[X] all coefficients of (rs)f are divisible by rs.

3 Hence rs = 1 (as rs is a positive integer) and so r = s = 1 (as r and s are positive integers). Thus g = rg ∈ Z[X] as required.  An immediate corollary is this important characterization of algebraic integers.

Theorem 1.1 Let α ∈ A have minimum polynomial f. Then α ∈ B if and only if f ∈ Z[X].

Proof Suppose f ∈ Z[X]. Since f is monic and f(α) = 0 then α ∈ B. Conversely suppose that α ∈ B. There is a monic g ∈ Z[X] with g(α) = 0. Then f | g, since f is the minimum polynomial of α. By Proposition 1.2, f ∈ Z[X].  Another corollary is this useful criterion for irreducibility.

Proposition 1.3 (Eisenstein’s criterion) Let p be a prime number and let n−1 n X j f(X) = X + ajX ∈ Z[X]. j=0 If

• p | aj when 0 ≤ j < n, and

2 • p - a0 then f is irreducible over Q.

Proof Suppose that f is reducible over Q. Then f = gh where g, h ∈ Q[X], g and h are monic, and also 0 < r = deg(g) < n and deg(g) = s = n − r. By Proposition 1.2, g, h ∈ Z[X]. Write

r s X i X j g(X) = biX and h(X) = cjX . i=0 j=0

Note that br = 1 = cs. Certainly p - br and p - cs. Let u and v be the least nonnegative integers with p - bu and p - cv. Then u ≤ r and v ≤ s. I claim that u = r and v = s. Otherwise u + v < r + s = n and X au+v = bicj. i+j=u+v

This sum contains the term bucv which which is not divisible by p. The remaining terms have the form bicj with either i < u or j < v. In each case

4 one of bi and cj is divisible by p. Hence au+v is the sum of a nonmultiple of p with a collection of multiples of p and so p - au+v contrary to hypothesis. Hence u = r and v = s. As r, s > 0 both b0 and c0 are divisible by p so that 2 a0 = b0c0 is divisible by p again contrary to hypothesis. This contradiction shows that f is irreducible over Q.  Example Let p be a prime number and let

p−1 X Xp − 1 f(X) = 1 + X + X2 + ··· + Xp−1 = Xj = . X − 1 j=0

We cannot apply Eisenstein to f directly, but if we f1(X) = f(X + 1) we get p−1 (X + 1)p − 1 (X + 1)p − 1 X p f (X) = = = Xp−j−1. 1 (X + 1) − 1 X j j=0 p This is a monic polynomial, but its remaining coefficients have the form j p  for 0 < j < p and so are divisible by p. The final coefficient is p−1 = p which 2 is not divisible by p . By Eisenstein’s criterion, f1 is irreducible over Q. It follows that f is irreducible over Q, for if f(X) = g(X)h(X) were a nontrivial of f, then f1(X) = g(X + 1)h(X + 1) would be a nontrivial factorization of f1. We now show that A is a subfield of C and B is a of C.

Theorem 1.2 (i) Let α, β ∈ A. Then α + β, α − β, αβ ∈ A, and if α 6= 0 then α−1 ∈ A. (ii) Let α, β ∈ B. Then α + β, α − β, αβ ∈ B.

Proof We first prove (ii) in detail, since the bulk of the proof of (i) follows mutatis mutandis. Let α and β have minimum polynomials f and g of degrees m and n respectively. Write

m−1 n−1 m X i n X j f(X) = X + aiX and g(X) = X + bjX . i=0 j=0

Then the ai and bj are integers and

m−1 n−1 m X i n X j α = − aiα and β = − bjβ . (∗) i=0 j=0

5 Let v be the column vector of height mn given by

v> = (1 α α2 ··· αm−1 β αβ α2β ··· αm−1β β2 ··· αm−1βn−1).

In other words the entries of v are the numbers αiβj for 0 ≤ i < m and 0 ≤ j < n. I claim that there are (mn-by-mn) matrices A and B with entries in Z such that Av = αv and Bv = βv. The typical entry in αv has the form αiβj where 1 ≤ i ≤ m and 0 ≤ j < n. If i < m this already is an entry of v while if i = m,(∗) gives

m−1 m j X k j α β = − akα β . k=0

In any case this entry αiβj of αv is a , with integer coeffi- cients, of the entries of v. Putting these coefficients into a A we get αv = Av. Similarly there is a matrix B with integer entries with βv = Bv. Now (A+B)v = (α+β)v,(A−B)v = (α−β)v and (AB)v = (αβ)v. As v 6= 0 the numbers α+β, α−β and αβ are eigenvalues of the matrices A+B, A − B and AB each of which has integer entries. But if the matrix C has integer entries, its eigenvalues are algebraic integers, since the polynomial of C is a monic polynomial with integer coefficients. It follows that α + β, α − β and αβ are all algebraic integers. If we assume instead that α, β ∈ A, the above argument shows (when we replace ‘integer’ by ‘rational’ etc.) that α + β, α − β, αβ are all algebraic numbers. Finally suppose that α is a nonzero algebraic number with minimum polynomial n−1 n X i f(X) = X + aiX . i=0 n Then a0 6= 0 (why?) and dividing the f(α) = 0 by a0α gives

n−1 X an−i 1 α−n + α−i + = 0 a a i=1 0 0

−1 so that α ∈ A.  √ Example Let√ us see what the matrices A and B are for say α = 2 1 2 and β = 2 (1 + 5). The minimum polynomials of α and β are X − 2 and X2 −X −1 respectively so that α2 = 2 and β2 = β +1. Let v> = (1 α β αβ).

6 Then  α   α   0 1 0 0   1   α2   2   2 0 0 0   α  αv =   =   =      αβ   αβ   0 0 0 1   β  α2β 2β 0 0 2 0 αβ and  β   β   0 0 1 0   1   αβ   αβ   0 0 0 1   α  βv =   =   =     .  β2   1 + β   1 0 1 0   β  αββ2 α + αβ 0 1 0 1 αβ We can take  0 1 0 0   0 0 1 0   2 0 0 0   0 0 0 1  A =   and B =   .  0 0 0 1   1 0 1 0  0 0 2 0 0 1 0 1 Then, for instance, αβ is an eigenvalue of  0 0 0 1   0 0 2 0  AB =    0 1 0 1  2 0 2 0 so that h(αβ) = 0 where h is the characteristic polynomial of AB.

2 Number fields

The set A of algebraic numbers is too large to handle all at once. We restrict our consideration to looking at smaller subfields of A which contain all the algebraic numbers “generated” from a given one. For instance consider

K1 = Q(i) = {a + bi : a, b ∈ Q} and √ √ √ 3 3 3 K2 = Q( 2) = {a + b 2 + c 4 : a, b, c ∈ Q}.

It is apparent that both K1 and K2 are rings, being closed under , and . It’s not hard to see that K1 is a field since if a + bi is a nonzero element of K1 then 1 a b = − i ∈ Q(i). a + bi a2 + b2 a2 + b2

7 √ √ 3 3 But it is not so obvious that 1/(a + b 2 + c 4) is an element of K2. But this is in fact so, and is an example of a general phenomenon. Let α be an algebraic number of degree n. Define

2 n−1 Q(α) = {a0 + a1α + a2α + ··· + an−1α : a0, a1, . . . , an−1 ∈ Q}.

Proposition 2.1 For each α ∈ A, Q(α) is a subfield of A.

Proof Since A is closed under addition and multiplication, and α ∈ A and Q ⊆ A then it is apparent that Q(α) ⊆ A. Let α have degree n and minimum polynomial f. Then by definition

Q(α) = {g(α): g ∈ Q[X], and either g = 0 or deg(g) < n}.

I claim that in fact Q(α) = {g(α): g ∈ Q[X]}. Certainly Q(α) ⊆ {g(α): g ∈ Q[X]} so that to prove equality we need to show that g(α) ∈ Q(α) whenever g ∈ Q[X]. By the division algorithm (Proposition A.1) there is q ∈ Q[X] such that h = g − qf either vanishes or has deg(h) < n. Then h(α) ∈ Q(α) but h(α) = g(α) − f(α)q(α) = g(α) as f(α) = 0. This proves that Q(α) = {g(α): g ∈ Q[X]}. It is now clear that, since Q[X] is closed under addition, subtraction and multiplication then so is Q(α). Hence Q(α) is a subring of A. (Alternatively, one sees that the map g 7→ g(α) from Q[X] to A is a homomorphism with Q(α) which must therefore be a subring of A.) To complete the proof that Q(α) is a field, we must show that 1/β ∈ Q(α) whenever β is a nonzero element of Q(α). Write β = g(α) with g ∈ Q[X] and note that f - g since otherwise g(α) = 0. Let h be the greatest common divisor of f and g. By Proposition A.2, there exist u, v ∈ Q[X] with h = uf + vg. But f is irreducible, and so either h = 1 or h = f. But this latter is impossible since h - f. Hence 1 = uf +vg and so 1 = u(α)f(α) +v(α)g(α) = v(α)β. It follows that 1/β = v(α) ∈ Q(α) and so K is a field.  The numbers 1, α, α2, . . . , αn−1, where n is the degree of α, form a basis of Q(α) as a over Q. Thus the degree n is also the of Q(α) as a vector space over Q, and so we call n the degree of Q(α). In general when we speak of a basis for a number field K = Q(α) me mean a basis for K as a vector space over Q. Given α ∈ A of degree n, its minimum polynomial f factors over C as

n Y f(X) = (X − αj) j=1

8 where α = α1 say. The numbers α1, . . . , αn are the conjugates of α. They are all algebraic numbers with minimal polynomial f. It is important to note that the conjugates of α are all distinct. This follows from the following lemma.

Lemma 2.1 Let f ∈ Q[X] be a monic polynomial and suppose that f is irreducible over Q. Then f(X) = 0 has n distinct roots in C.

Proof Suppose that α is a repeated root of f(X) = 0. Then f(X) = (X − α)2g(X) where g(X) ∈ C[X]. Consequently f 0(X) = (X − α)2g0(X) + 2(X − α)g(X) and so f(α) = f 0(α) = 0. Let h be the greatest common divisor of f and f 0. Then h = uf + vf 0 for some u, v ∈ Q[X]. Thus h(α) = u(α)f(α) + v(α)f 0(α) = 0. But as h | f and f is irreducible, then h = 1 of h = f. Since h(α) = 0, h = f. But then f | f 0 and as f 0 has leading n−1 term nX this is impossible.  The field Q(α) forms the set of numbers which can be expressed in terms of rational numbers and α using the standard operations. We might instead consider what happens when we take two algebraic numbers α and β and consider which numbers can be expressed in terms of both. Suppose α and β have degrees m and n respectively, and define

(m−1 n−1 ) X X j k Q(α, β) = cjkα β : cjk ∈ Q . j=0 k=0

It is readily apparent that Q(α, β) is a ring; less apparent but nonetheless true that it is a field. However this field can be expressed in terms of one generator.

Theorem 2.1 (Primitive element) Let α, β ∈ A. Then there is γ ∈ A with Q(α, β) = Q(γ).

Proof We show that for a suitable c, γ = α + cβ suffices. Let α and β have degrees m and n respectively, and let their minimum polynomials be

m n Y Y f(X) = (X − αj) and g(X) = (X − βk) j=1 k=1 respectively, with α = α1 and β = β1. Suppose that 1 ≤ j ≤ m and 2 ≤ k ≤ n. The equation

α + xβ = αj + xβk

9 can be rewritten as (β1 − βk)x = α1 − αj and so has exactly one solution x = xjk as β1 6= βk. Choose c to be a nonzero rational which is not equal to any of the xjk. This is possible as Q is an infinite set. Then α + cβ 6= αj + cβk whenever k 6= 1, by the choice of c. Let γ = α + cβ. For convenience put K = Q(γ). I claim that Q(α, β) = K. Certainly γ ∈ Q(α, β) and as Q(α, β) is a ring, then K ⊆ Q(α, β). To prove that K ⊇ Q(α, β) it suffices to show that α ∈ K and β ∈ K. Let h(X) = f(γ − cX). Then h has degree m, as c 6= 0, and h ∈ K[X]. Also h(β) = f(γ − cβ) = f(α) = 0. But of course g(β) = 0 so that g and h have β as a common zero. Suppose that it had another one, so that g(δ) = h(δ) = 0. Then δ = βk for some k ≥ 2 as g(δ) = 0. But then 0 = h(βk) = f(γ − cβk) so that γ − cβk = αj for some j. Thus γ = αj + cβk which is false by the choice of c. The greatest common divisor of g(X) and h(X) must be X − β. As g ∈ Q[X] ⊆ K[X] and h ∈ K[X] there exists u, v ∈ K[X] with u(X)g(X) + h(X)v(X) = X − β. Thus β = −(u(0)g(0) + h(0)v(0)) ∈ K, and it follows that α = γ − cβ ∈ K. This completes the proof.  More generally we can consider fields Q(β1, . . . , βn) generated by any finite number of algebraic numbers. But by using the primitive element theorem and induction we see that each such field still has the form Q(γ). We call a field of the form Q(α) for α ∈ A an algebraic number field or simply a number field. Let α1, . . . , αn be the conjugates of α. The fields Q(αj) are very similar to Q(α) each being generated by an element with minimum polynomial f. In fact they are all isomorphic. We define an isomorphism σj : Q(α) → Q(αj) by setting σj(g(α)) = g(αj) when g ∈ Q[X]. It is perhaps not immediately evident that σj is well-defined. But this follows since if g1, g2 ∈ Q[X] and g1(α) = g2(α) then g1(αj) = g2(αj). This is a consequence of α and αj having the same minimum polynomial. Once σj is seen to be well-defined, then it is straightforward to prove it is an isomorphism. As α1 = α then σ1 is the identity map on Q(α). Let β ∈ Q(α). We define the norm N(β) and T (β) of β as follows. Let n Y N(β) = σj(β) j=1

10 and n X T (β) = σj(β). j=1

Since the σj preserve addition and multiplication, the following properties are immediate:

• N(βγ) = N(β)N(γ) for all β, γ ∈ Q(α),

• N(cβ) = cnN(β) for all c ∈ Q and β ∈ Q(α),

• T (β + γ) = T (β)T (γ) for all β, γ ∈ Q(α), and

• T (cβ) = cT (β) for all c ∈ Q and β ∈ Q(α).

Clearly N(0) = 0 and N(1) = 1. If β 6= 0 then 1 = N(1) = N(β)N(1/β) so that N(β) 6= 0. A word of warning: the norm N(β) and trace T (β) depend on the field Q(α) as well as the number β. If we wish to be strict we should use the notation NQ(α)/Q(β) and TQ(α)/Q(β) instead. The crucial property of the norm and trace is that they are both rational.

Theorem 2.2 Let β ∈ Q(α). Then N(β) ∈ Q and T (β) ∈ Q.

Pn−1 k Proof Write β = k=0 bkα where the bj ∈ Q. Then

n n−1 n n−1 Y X k X X k N(β) = bkαj and T (β) = bkαj . j=1 k=0 j=1 k=0

Both N(β) and T (β) are symmetric polynomials with rational coefficients in the variables α1, . . . , αn. By Newton’s theorem on symmetric polynomi- als (Theorem A.2), N(β) = g1(e1, e2, . . . , en) and T (β) = g2(e1, e2, . . . , en) where g1 and g2 are polynomials in n variables with rational coefficients and e1, . . . , en are the elementary symmetric polynomials in the variables α1, . . . , αn. But

n n n X j n−j Y X + (−1) ejX = (X − αj) = f(X) j=1 j=1 which is the minimum polynomial of α. Hence ej ∈ Q and so N(β), T (β) ∈ Q. 

11 More generally we can consider the field polynomial

n Y n n−1 n (X − σj(β)) = X − T (β)X + ··· + (−1) N(β) j=1 of β. Using the same argument as Theorem 2.2 one shows that all its coeffi- cients are rational. One gets similar results on replacing A by B and Q by Z. Before proving them it’s convenient to prove, in essence, that σj(β) is always a conjugate of β.

Lemma 2.2 Let α ∈ A and β ∈ Q(α). For each j, β and σj(β) have the same minimum polynomial.

Proof Let g be the minimum polynomial of β. Then

n−1 n X k g(X) = X + bkX k=0 where each bj ∈ Q. For any γ ∈ Q(α) we have, since σj is a ring homomor- phism and σj(b) = b whenever b ∈ Q,

n−1 n−1 n X k n X k σj(g(γ)) = σj(γ ) + σj(bkγ ) = σj(γ) + bkσj(γ) = g(σj(γ)). k=0 k=0

In particular g(σj(β)) = σj(g(β)) = σj(0) = 0. As g is irreducible over Q then g is the minimum polynomial of σ(β).  If β ∈ Q(α) is an algebraic integer then its minimum polynomial has integer coefficients. As σj(β) shares this minimum polynomial, then σj(β) is also an algebraic integer.

Proposition 2.2 Let α ∈ A and β ∈ Q(α) ∩ B. Then T (β), N(β) ∈ Z.

Proof We already know that T (β), N(β) ∈ Q. But T (β) is the sum, and N(β) is the product of the σj(β). As β ∈ B then all σj(β) ∈ B and so T (β), N(β) ∈ B. Thus T (β), N(β) ∈ Q ∩ B. But Q ∩ B = Z since if a ∈ Q its minimum polynomial is X − a and this has integer coefficients if and only if a ∈ Z. Hence T (β), N(β) ∈ Z.  More generally the same argument shows that the field polynomial of β ∈ Q(α) ∩ B has integer coefficients.

12 Given a number field K = Q(α) we define its as OK = K∩B, that is the set of algebraic integers in K. In the proof of Proposition 2.2 we see that if K = Q then OK = Z. We aim to develop the concepts of number theory (primes, congruences, ) in the rings OK , just as standard number theory does for Z. √ Example A√quadratic field is a number field of the form√ Q( m) where√ 2 m ∈ Q but m∈ / Q. Since it is easy to see that Q( r m) = Q(√m) for any nonzero rational r, each quadratic field has the form K = Q( m) where m is a squarefree integer, that is, m is not divisible by the of any prime number. We shall always assume this is the case when we discuss quadratic fields. √ √ When m > 0,√Q( m) is a real quadratic field since Q( √m) ⊆ R, and when m < 0, Q( m) is an imaginary quadratic√ field since Q( m) 6⊆ R. We shall√ compute OK whenever K = Q( m) is a quadratic field. Let β = a + b m ∈ K with a, b ∈ Q. For α ∈ OK it is necessary that T (β), N(β) ∈ Z and this is sufficient too, since β2 − T (β)β + N(β) = 0. Suppose that T (β), N(β) ∈ Z. Then 2a = T (β) ∈ Z and a2 − mb2 = N(β) ∈ Z. It follows that m(2b)2 = T (β)2 − 4N(β) ∈ Z. Since m is squarefree, 2b ∈ Z for otherwise, 2b would have a power of a prime p dividing its denominator. But 2 2 1 √ then, since p - m, so would m(2b) . We can write β = 2 (c + d m) with c, d ∈ Z. Finally c2 − md2 = 4N(β) ≡ 0 (mod 4). Since m is squarefree, m 6≡ 0 (mod 4). As odd are congruent to 1 modulo 4, and even squares are divisible by 4, then c2 − md2 ≡ 0 (mod 4) is only possible if c and d are both even, or if they are both odd and m ≡ 1 (mod 4). To conclude, when m 6≡ 1 (mod 4) then √ √ OK = {a + b m : a, b ∈ Z} = Z[ m] and when m ≡ 1 (mod 4) then √ c + d m  O = : c, d ∈ Z, c ≡ d (mod 2) K 2 √  1 + m  = a + b : a, b ∈ Z 2 √ 1 + m = Z . 2

For quadratic fields K we have show that there exist β1 and β2 such that √ 1 √ OK = {a1β1+a2β2 : a1, a2 ∈ Z}. (We have β1 = 1 and β2 = m or 2 (1+ m) as appropriate.) Our aim will be to show that the corresponding property

13 holds for every number field. Indeed if K is a number field of degree n, then there exist β1, . . . , βn ∈ OK with the property that each element of OK can Pn be uniquely expressed in the form j=1 ajβj where the aj ∈ Z. To this end we need to introduce the concept of . Let K = Q(α) be a number field of degree n. Let β1, . . . , βn ∈ K. We define M(β1, . . . , βn) as the matrix whose (j, k)-entry is T (βjβk). We the define the discriminant of the sequence β1, . . . , βn as ∆(β1, . . . , βn) = det(M(β1, . . . , βn)). Then as each T (βjβk) ∈ Q, ∆(β1, . . . , βn) ∈ Q.

Lemma 2.3 Let K be a number field of degree n and let β1, . . . , βn ∈ K. Then 2 ∆(β1, . . . , βn) = det(N(β1, . . . , βn)) where N(β1, . . . , βn) is the matrix whose (j, k)-entry is σk(βj).

Proof Let M = M(β1, . . . , βn) and N = N(β1, . . . , βn). The (j, k)-entry of NN > is n n X X σi(βj)σi(βk) = σi(βjβk) = T (βjβk) i=1 i=1 > > 2 so that NN = M. Hence det(M) = det(N) det(N ) = det(N) .  Example Suppose that K = Q(α) has degree n. We shall compute ∆(1, α, α2, . . . , αn−1). By the Lemma, 2 1 1 1 ··· 1

α1 α2 α3 ··· αn

α2 α2 α2 ··· α2 2 n−1 1 2 3 n ∆(1, α, α , . . . , α ) = α3 α3 α3 ··· α3 1 2 3 n ...... n−1 n−1 n−1 n−1 α1 α2 α3 ··· αn where α1, α2, . . . , αn are the conjugates of α. But this is a Vandermonde , and so by Proposition A.5

2 n−1 Y 2 ∆(1, α, α , . . . , α ) = (αk − αj) . 1≤j

In particular ∆(α1, α2, . . . , αn) 6= 0 since the αj are distinct. We can continue to get a simpler formula. We have

Y 2 n(n−1)/2 Y (αk − αj) = (−1) (αj − αk)(αk − αj) 1≤j

14 Qn Let f(X) = k=1(X − αk) be the minimum polynomial of α. Then by the product rule for differentiation

n n 0 X Y f (X) = (X − αk). j=1 k=1 k6=k

When X = αj only the j-th summand is nonzero, so

n 0 Y f (αj) = (αj − αk). k=1 k6=k Hence n 2 n−1 n(n−1)/2 Y 0 n(n−1)/2 0 ∆(1, α, α , . . . , α ) = (−1) f (αj) = (−1) N(f (αj)). j=1

Since ∆(β1, . . . , βn) 6= 0 for some choice of the βj, then ∆(β1, . . . , βn) 6= 0 in many other cases. The following lemma enables us to relate the discrimi- nants of different sequences.

Lemma 2.4 Let K be a number field of degree n and let β1, . . . , βn ∈ K. If Pn B = (bjk) is an n-by-n matrix over Q and γj = k=1 bjkβk then

2 ∆(γ1, . . . , γn) = det(B) ∆(β1, . . . , βn).

2 Proof We have ∆(β1, . . . , βn) = det(N(β1, . . . , βn)) where the (j, k) entry Pn of N(β1, . . . , βn) is σk(βj). Now σk(γj) = i=1 bjiσk(βi) so that

N(γ1, . . . , γn) = BN(β1, . . . , βn).

Taking and squaring completes the proof.  We can now show how the discriminant discriminates between bases and nonbases.

Proposition 2.3 Let K be a number field of degree n and let β1, . . . , βn ∈ K. Then β1, . . . , βn form a basis of K as a vector space over Q if and only if ∆(β1, . . . , βn) 6= 0.

Pn k−1 Proof Certainly we can write βj = k=1 bjkα with bjk ∈ Q. Let B be the matrix with the bjk as entries. Then by Lemma 2.4

2 2 n−1 ∆(β1, . . . , βn) = det(B) ∆(1, α, α , . . . , α )

15 and so ∆(β1, . . . , βn) 6= 0 if and only if det(B) 6= 0. But det(B) 6= 0 if and only if the βj form a basis of K as a vector space over Q.  Let K be a number field of degree n. Certainly OK is a subgroup of K under the operation of addition. We aim to show that there are β1, . . . , βn ∈ Pn OK with each element of OK uniquely expressible in the form j=1 bjβj where the βj ∈ Z. A sequence β1, . . . , βn satisfying this is called an integral basis of OK . More generally if G is an abelian under the operation of addition, then an integral basis of G is a sequence γ1, . . . , γm of elements of Pm G such that each element of G uniquely expressible as j=1 cjγj with the γj ∈ Z. If G has an integral basis with m elements then we say that G is a free of rank m. The basic theory of free abelian groups is outlined in Appendix A.3.

Theorem 2.3 Let K be a number field of degree n. Then OK is a of rank n.

Proof Let β1, . . . , βn be a basis of K. Then for positive integers r1, . . . , rn the sequence r1β1, . . . , rnβn is also a basis of K. By Lemma 1.1 we may choose the rj such that rjβj ∈ OK for each j. Replacing βj by rjβj we see that there is a basis β1, . . . , βn of K with each βj ∈ OK . Pn Suppose that γ = k=1 ckβk ∈ OK where the ck ∈ Q. Then for each k, Pn βjγ ∈ OK and so T (βjγ) ∈ Z. Thus dj = k=1 ckT (βjβk) ∈ Z for all j. Let M be the matrix with (j, k)-entry T (βjβk). Then d = Mc where c and d are the column vectors with j-th entries cj and dj respectively. Now −1 det(M) = ∆(β1, . . . , βn) 6= 0 as the βj form a basis. Hence c = M d. The matrix M has integer entries so that M −1 = (det(M))−1adj(M). Let ∆ = det(M). Then adj(M) and d have integer entries and so ∆c has integer entries. Hence ∆cj ∈ Z for all j. −1 Let A = {a1β1 + ··· + anβn : aj ∈ Z} and B = {∆ (a1β1 + ··· + anβn): aj ∈ Z}. Since the βj form a basis of K, the βj form an integral basis of A and the βj/∆ form an integral basis of B. We have shown that A ⊆ OK ⊆ B. Since B is free abelian of rank n, then OK is free abelian of rank m where m ≤ n by Proposition A.3. Again by this proposition, since A ⊆ OK , n ≤ m. Hence m = n.  The choice of integral basis for the ring of integers of a number field K is not unique. However, the discriminant of each integral basis is the same.

Proposition 2.4 Let K be a number field of degree n, and let β1, . . . , βn and γ1, . . . , γn be integral bases of OK . Then ∆(β1, . . . , βn) = ∆(γ1, . . . , γn).

16 Pn Pn Proof We can write γj = k=1 bjkβk and βj = k=1 cjkγk where all the bjk and cjk are integers. Let B and C be the matrices with (j, k)-entries bjk and cjk respectively. Now

n n n n X X X X βj = cjkγk = cjkbkiβi = djiβi k=1 k=1 i=1 i=1 Pn where dji = k=1 cjkbki ∈ Z. From the uniqueness of representations of elements of O in terms of the βi we must have djj = 1 and dji = 0 whenever j 6= i. But dji is the (j, i) entry of the matrix CB. Hence CB = I, the identity matrix. Thus det(C) det(B) = det(I) = 1 and as det(B) and det(C) are integers det(B) = det(C) = ±1. But by Proposition 2.4,

2 ∆(γ1, . . . , γn) = det(B) ∆(β1, . . . , βn) = ∆(β1, . . . , βn).

 The nonzero integer

∆K = ∆(β1, . . . , βn), where β1, . . . , βn form an integral basis of OK , only depends (as the notation suggests) on the field K. We call ∆K the discriminant of K. After the degree, it is the most important numerical of the field. Let β1, . . . , βn be an integral basis for OK and suppose that γ1, . . . , γn ∈ OK and that γ1, . . . , γn are linearly independent over Q. Then γ1, . . . , γn form Pn an integral basis of the subgroup H = { j=1 ajγj : aj ∈ Z} of OK . However, 2 it may happen that H 6= OK . We have ∆(γ1, . . . , γn) = det(B) ∆K by Pn Lemma 2.4, where the matrix B has integer entries bjk and γj = k=1 bjkβk. But by Proposition A.4, det(B) = |OK : H|. Hence

2 ∆(γ1, . . . , γn) = |OK : H| ∆K .

Hence the index |OK : H| is a number whose square divides ∆(γ1, . . . , γn). In particular if ∆(γ1, . . . , γn) is squarefree, then |OK : H| = 1 and γ1, . . . , γn form an integral basis of OK .

3 Factorization

In ordinary number theory we study the integers Z, in particular the positive integers. In algebraic number theory we study rings of integers OK . Each positive integer is a product of prime numbers p which have the two properties

17 • if p = ab with a, b ∈ Z then a = ±1 or b = ±1,

• If p | cd with c, d ∈ Z then p | c or p | d.

These two properties are equivalent, but while it is easy to prove that the sec- ond implies the first, the converse requires the . However the corresponding properties in OK are not equivalent. We need some definitions. A in OK is an element β ∈ OK such that 1/β ∈ OK . The set of units of OK is denoted by U(OK ) and it is apparent that it forms a group under multiplication. There is a nice characterization of units.

Lemma 3.1 Suppose that K is a number field of degree n and let β ∈ OK . Then β ∈ U(OK ) if and only if N(β) = ±1.

Proof If β ∈ OK , then 1/β ∈ OK and so N(β), N(1/β) ∈ Z. But N(β)N(1/β) = N(1) = 1 so that N(β) = N(1/β) = ±1. Conversely suppose that N(β) = ±1. Then

n n Y Y ±1 = σj(β) = β σj(β). j=1 j=2

Qn Thus 1/β = ± j=2 σj(β) which is an algebraic integer because each σj(β) ∈ B. Hence β ∈ OK .  For Z = OQ, the only units√ are ±1. But√ the unit group of OK can be infinite. For example β = 1 + 2 ∈ K = Q( 2) is a unit as N(β) = −1. But β > 1 so that βm → ∞ as m → ∞. But when m is a positive integer, m β ∈ U(OK ) so that U(OK ) is infinite. As for the integers we define a divisibility relation on OK . For β, γ ∈ OK with β 6= 0 we say that β | γ (β divides γ or γ is divisible by β) if γ/β ∈ OK and β - γ otherwise. Similarly we write γ ≡ δ (mod β)(γ is congruent to δ modulo β) if β | (γ − δ). Divisibility and congruences have all the formal properties familiar from Z so we shall not repeat them. Note that β is a unit if and only if β | 1. One useful new property of divisibility is the following.

Lemma 3.2 Let β, γ ∈ OK . If β | γ then N(β) | N(γ) as integers.

Proof If β | γ then δ = γ/β ∈ OK and N(γ) = N(β)N(δ). As N(β), N(δ) ∈ Z then N(β) | N(γ).  We can now define what turns out to be our first analogue of prime numbers. Let β ∈ OK . We say that β is irreducible if

18 • β 6= 0, • β is not a unit, and

• if β = γδ with γ, δ ∈ OK then either γ or δ is a unit. In Z, the irreducible elements have the form ±p where p is a prime num- ber. From Lemma 3.2 it follows that if N(β) is a prime number then β is irreducible. The converse is not true; take the example K = Q(i) so that OK = Z[i]. Then 3 is irreducible in OK but N(3) = 9 is not prime. In each OK we can achieve factorization into irreducibles.

Lemma 3.3 Let K be a number field. Suppose that β ∈ OK and that β 6= 0 and β∈ / U(OK ). Then there are irreducible elements γ1, . . . , γk ∈ OK with β = γ1 ··· γk. Proof By induction on |N(β)| which is a positive integer. Since β 6= 0 and β is not a unit then |N(β)| ≥ 2. If β is irreducible then take k = 1 and γ1 = β. If β is reducible then β = β1β2 where β1, β2 ∈ OK and β1, β2 ∈/ U(OK ). Then |N(β1)|, |N(β2)| > 1 and as |N(β)| = |N(β1)||N(β2)| it follows that |N(β1)|, |N(β2)| < |N(β)|. By the induction hypothesis both β1 and β2 are products of irreducible elements, and by combining these factorizations we see that β is also a product of irreducible elements.  We turn to the question of uniqueness. It is easy to see that if β is irreducible and ξ ∈ U(OK ) then ξβ is also irreducible. Hence we can adjust factorizations by multiplying factors by units. For instance if β = γ1γ2γ3 is a −1 factorization into irreducibles and ξ ∈ OK then β = (ξγ1)γ2(ξ γ3) is also a factorization into irreducibles. If we can go from one factorization to another by introducing unit factors and/or permuting the of factors then we say that the factorizations are equivalent. From standard number theory we know that in Z all factorizations of a given number into irreducibles are equivalent. However this does not hold in all O . √ K√ √Example√ Let K = Q( −6). Then OK = Z[ −6]. Now 6 = 2 × 3 = −6(− −6). I claim that these are inequivalent√ factorizations of 6 into irreducibles.√ Now N(2) = 4, N(3) = 9 and N(± −6) = 6. If any of 2, 3 or ± −6 were reducible,√ their nontrivial factors would have norms√ 2 or 3. But suppose that β ∈ Z[ −6] has N(β) = 2 or 3. Then β = a + b −6√ with 2 2 a, b ∈ Z and a + 6b = 2 or 3 which is impossible. Hence√ 2, 3 and√ ± −6 and as the norms of the factors on both sides of 2 × 3 = −6(− −6) are different then the two factorizations are inequivalent. √ √ √ In this√ example we have −6 is irreducible, but that −6 | (2×3), −6 - 2 and −6 - 3. If we study the proof of uniqueness of prime factorization

19 in Z we see that it relies on the fact that if a prime p divides a product of integers ab then it divides (at least) one of√ the integers√a and b. This property is not shared by the −6 of Z[ −6]. We thus make a definition. Let β ∈ OK . We say that β is prime if • β 6= 0,

• β is not a unit, and

• if β | γδ with γ, δ ∈ OK then either β | γ or β | δ.

From standard number√ theory an integer is irreducible in Z√if and only if it is prime. However −6 is irreducible but not prime in Z[ −6]. However primes are always irreducible.

Lemma 3.4 Let K be a number field. If β is a of OK then β is irreducible in OK .

Proof Let β be prime and suppose that β = γδ with γ, δ ∈ OK . Then β | γδ so that β | γ or β | δ by primality. Without loss of generality suppose that β | γ. Then as γ | β, δ = β/γ is a unit. Hence β is irreducible.  If in a given OK every irreducible is prime then we achieve unique fac- torization, by an argument similar to that of unique factorization in Z.

Proposition 3.1 Let K be a number field and suppose that every irreducible element of OK is prime. Then OK has unique factorization: any two factor- ization of an element into irreducibles are equivalent.

Proof Let r s Y Y β = γj = δk j=1 k=1 be two factorizations of β into irreducibles. We argue that these are equiva- lent by induction on |N(β)|. Since γ1 is prime and γ1 | δ1 ··· δs then γ1 | δk for some k. By shuffling the δs we may assume that γ1 | δ1 and as δ1 is irreducible then δ1 = ξγ1 where ξ is a unit. Hence

r s Y Y β/γ1 = γj = (ξδ2) δk j=2 k=3 is a factorization into irreducibles and |N(β/γ1)| < |N(β)|. By the induc- tive hypothesis these factorizations of β/γ1 are equivalent, and so the given factorizations of β are equivalent. 

20 For some fields K, for instance K = Q every irreducible in OK is prime and for these fields OK has the unique factorization property. The reason Z has unique factorization is because of the Euclidean algorithm. This works as if a, b ∈ Z, a 6= 0 and a - b then there is c ∈ Z with |b − ac| < |a|. Some other number fields have the analogous property. We say that K is norm-Euclidean if when β, γ ∈ OK with β 6= 0, then there exists δ ∈ OK with |N(γ − δβ)| < |N(β)|. In other words we get the “remainder” γ − δβ when dividing γ by β and this remainder is “smaller” than β. There is a useful alternative characterization.

Lemma 3.5 The number field K is norm-Euclidean if and only if for all ξ ∈ K there is δ ∈ OK with |N(ξ − δ)| < 1.

Proof Suppose that K is norm-Euclidean. Let ξ ∈ K. Then ξ = γ/β for some β, γ ∈ OK with β 6= 0. By the norm-Euclidean property, there is δ ∈ OK with |N(γ − δβ)| < |N(β)|. Thus

N(γ − δβ) 1 > = |N(γ/β − δ)| = |N(ξ − δ)|. N(β) Conversely, suppose that for all ξ ∈ K there is δ ∈ K with |N(ξ −δ)| < 1. Let β, γ ∈ OK with β 6= 0. Put ξ = γ/β. Then there is δ ∈ OK with |N(ξ − δ)| < 1. Hence

|N(γ − βδ)| = |N(β)||N(ξ − δ)| < |N(β)| as so K is norm-Euclidean.  Example We show that Q is norm-Euclidean. Let x ∈ Q. Then n ≤ x ≤ n + 1 for some n ∈ Z. Now |N(x − n)| = |x − n| = x − n and |N(x − (n + 1))| = |x − (n + 1)| = n + 1 − x. As (x − n) + (n + 1 − x) = 1 then one of these numbers is ≤ 1/2. So for a = n or a = n+1, |N(x−a)| ≤ 1/2 < 1.

√ Example Consider K = Q( −d) where d = 1 or d = 2. For ξ ∈ K, 2 |N(ξ)| = |ξ| where√|ξ| denotes the absolute√ value of the √ ξ. We have OK = Z[ −d] = {a + b −d : a, b ∈ Z}. Let ξ = x + y −d with x, y√∈ Q. There are integers a, b with |x − a|, |y − b| ≤ 1/2. Let δ = a + b −d ∈ OK . Then 1 + d |N(ξ − δ)| = (x − a)2 + d(y − b)2 ≤ < 1. 4 √ Hence Q(i) and Q( −2) are norm-Euclidean.

21 √ Example Consider K = Q( −d) where d =√ 3, 7 or 11. We have O√K = 1 Z[α] = {a + bα : a, b ∈ Z} where α = 2 (1 + −d). Let ξ = x + y −d with x, y ∈ Q. There√ is an integer b with |2y − b| ≤ 1/2. Then ξ − bα = 1 1 1 2 (2x − b) + 2 (2y − b) −d. There is an integer a with | 2 (2x − b) − a| < 1/2. Let δ = a + bα. Then (2x − 2a − b)2 + d(2y − b)2 1 + d/4 |N(ξ − δ)| = ≤ < 1. 4 4 √ √ √ Hence Q( −3), Q( −7) and Q( −11) are all norm-Euclidean. √ √ √ Example Let K = Q( −6). Then O = Z[ −6]. Let ξ = 1 (1 + −6). If √ K 2 δ = a + b −6 ∈ OK with a, b ∈ Z, then 1 + 6 |N(ξ − δ)| = (1/2 − a)2 + 6(1/2 − b)2 ≥ > 1 4 √ since |c − 1/2| ≥ 1/2 for all c ∈ Z. Hence Q( −6) is not norm-Euclidean. √ Example√ Consider√ K = Q( m) where m √= 2 or m = 3. Then OK = Z[ m] = {a + b m : a, b ∈ Z}. Let ξ = x + y m with√x, y ∈ Q. There are integers a, b with |x − a|, |y − b| ≤ 1/2. Let δ = a + b m ∈ OK . Then

N(ξ − δ) = (x − a)2 − m(y − b)2.

Thus −m/4 ≤ N(ξ − δ) ≤ 1/4 and consequently

|N(ξ − δ)| ≤ max(1/4, m/4) < 1. √ √ Hence Q( 2) and Q( 3) are norm-Euclidean.

It is apparent that if K is norm-Euclidean then OK is a with respect to the Euclidean φ(β) = |N(β)|. Every in a Euclidean domain is principal (Proposition A.6). If all ideals of OK are principal then every irreducible in OK is prime.

Proposition 3.2 Suppose that the number field K has the property that each ideal of OK is principal. Then every irreducible element of OK is prime.

Proof Suppose that each ideal of OK is principal. Suppose that β ∈ OK is irreducible and that β | γδ. We must show that if β - γ then β | δ. Let I = {ξβ + ηγ : ξ, η ∈ OK } be the ideal generated by β and γ. Then I is principal: I = hλi say. As β ∈ I then λ | β. But as β is irreducible either

22 λ = εβ or λ = ε where ε is a unit. If λ = εβ then β | λ, but also λ | γ as γ ∈ I. Hence β | γ which is false. Hence λ = ε and so I = R. Therefore 1 ∈ I so that 1 = ξβ + ηγ for some ξ, η ∈ OK . Hence δ = ξβδ + ηγδ from which it follows that β | δ as δ | γδ. Hence β is prime.  When K is norm-Euclidean we get the following chain of implications. First OK is a Euclidean domain. Then every ideal of OK is principal. Then every irreducible in OK is prime. Finally OK has the unique factorization property.√ However not all these implications are reversible. When K = Q( −19), OK has unique factorization√ but is not a Euclidean domain. Clark proved in 1993 that when K = Q( 69) then OK is a Euclidean domain despite the fact that K is not norm-Euclidean. As an application we look at an equation where to find all integer solutions it is useful to work in a number field. Example We wish to find all solutions of

x3 = y2 + 2 (∗) with x, y ∈ Z. The presence of the 2 in (∗) suggests that we see whether we can restrict the parity of x and y. If y is even, then 4 | y2 and so y2 + 2 ≡ 2 (mod 4). But x3 6≡ 2 (mod 4) so y must be odd. This forces x3 to be odd and so x is odd. Next we factor (∗) as √ √ x3 = (y + −2)(y − −2). (†) √ √ This is a factorization in K = Q( −2). Note that OK = Z[ −2] and that OK has unique factorization as we have seen that K is Euclidean. It is easy to see√ that the only units of OK are ±1. Let us write out the factorization of y + −2 into primes, putting together repeated occurrences and also putting together occurrences of π and −π as −π2. We get √ a1 a2 am y + −2 = ±π1 π2 ··· πm (‡) where if j 6= k then πj 6= ±πk. Then (†) implies that

3 a1 a2 ak a1 a2 ak x = π1 π2 ··· πk π1 π2 ··· πk . √ I claim that√ no πj equals ±πk. If this happened√ then πj | √(y + −2)√ and πj | (y− −2). Thus√ πj would be a factor of√ (y+ −2)−(y− −2) = 2 −2. 2 But as π√j | (y + −2) then N(π√j) | N(y + −2) = y + 2, which is odd, and as πj | 2 −2 then N(πj) | N(2 −2) = 8. Hence N(πj) = 1, which means that πj is a unit, and not an irreducible.

23 Since x3 is a cube, when we write it as a power of irreducibles, the ex- ponent of each is a multiple of 3. From unique√ factorization then each aj 3 3 is divisible by 3. Consequently√ from (‡), y + −2 = ±β = (±β) where β ∈ OK . Write ±β = a + b −2 with a, b ∈ Z. Then √ √ y + −2 = (a3 − 6ab2) + (3a2b − 2b3) −2 and so y = a(a2 − 6b2) and 1 = b(3a2 − 2b2). Hence b = ±1 and ±1 = 3a2 − 2b2 = 3a2 − 2. This can only happen when 3a2 = 3 and a = ±1. Then y = ±(−5) = ±5. Thus x3 = 27 and x = 3. We conclude that the only integer solutions of (∗) are (x, y) = (5, 3) and (x, y) = (−5, 3).

4 Ideals

Recall that an ideal of a (commutative) ring R is a I of R such that • I is a subgroup of R (under the operation of addition),

• if a ∈ I and x ∈ R then xa ∈ I. A is one of the form

hai = {xa : x ∈ R} for some a ∈ R. The trivial cases are h0i = {0} and h1i = R. All other ideals of R are called nontrivial. Let I and J be ideals of R. It is easy to see that their sum I +J = {a+b : a ∈ I, b ∈ J} is an ideal of R. The sum of I and J is the smallest ideal containing both I and J. It is even easier to see that their intersection I ∩ J is an ideal of R. Ideals can be multiplied, but this is more difficult. If I and J are ideals of R then the set {ab : a ∈ I, b ∈ J} is not in general an ideal of R (although if one if I and J is principal it is). The problem is that the sum a1b1 + a2b2 where a1, a2 ∈ I and b1, b2 ∈ J may not be expressible as ab for a ∈ I and n ∈ J. However the additive group generated by the elements ab for a ∈ I, b ∈ J is an ideal of R, and we call this ideal the product IJ of I and J. Symbolically

IJ = {a1b1 + a2b2 + ··· + anbn : a1, . . . , an ∈ I, b1, . . . , bn ∈ J}.

The product IJ is the smallest ideal containing all ab with a ∈ I and b ∈ J. The sum and product satisfy a number of formal properties: • I + I = I when I is an ideal of R,

24 • I + J = J + I when I and J are ideals of R,

• I1 + (I2 + I3) = (I1 + I2) + I3 when I1, I2 and I3 are ideals of R, • IJ ⊆ I ∩ J when I and J are ideals of R,

• IJ = JI when I and J are ideals of R,

• I1(I2I3) = (I1I2)I3 when I1, I2 and I3 are ideals of R, and

• I(J1 + J2) = IJ1 + IJ2 when I, J1 and J2 are ideals of R. We abbreviate the sum of a number of principal ideals as follows:

ha1, a2, . . . , ari = ha1i + ha2i + ··· + hari .

Then ha1, a2, . . . , ari is the smallest ideal containing each of the aj. An ideal of this form is called finitely generated. By using the above properties of the ideal sum and product we find that

ha1, a2, . . . , ari + hb1, b2, . . . , bsi = ha1, a2, . . . , ar, b1, b2, . . . , bsi and

ha1, a2, . . . , ari hb1, b2, . . . , bsi = ha1b1, a1b2, . . . , a1bs, a2b1, a2b2, . . . , arbsi .

We now turn to the of OK for number fields K. Two elements of OK generate the same principal ideal when they differ by a unit factor.

Lemma 4.1 Let β and γ be nonzero elements of OK where K is a number field. Then hβi = hγi if and only if γ/β is a unit in OK .

Proof If hβi = hγi then β ∈ hγi and γ ∈ hβi. Hence γ/β ∈ OK and β/γ ∈ OK and so γ/β ∈ U(OK ). Conversely if γ/β ∈ U(OK ) then β/γ, γ/β ∈ OK and so β | γ and γ | β. Hence hβi ⊆ hγi ⊆ hβi so that hβi = hγi.  The concepts of divisibility and primality in OK can be expressed in terms of ideals. For instance β | γ if and only if γ ∈ hβi which occurs if and only if hγi | hβi. Similarly γ ≡ δ (mod β) if and only if γ − δ ∈ hβi. We can generalize the notion of congruences modulo an element to congruences modulo an ideal; if I is an ideal then we write γ ≡ δ (mod I) whenever γ − δ ∈ I. Hence γ ≡ δ (mod hβi) means the same as γ ≡ δ (mod β). The relation of congruence modulo an ideal has the same formal properties as congruence modulo an element, which I shall not list. The condition for β to be a prime element of OK becomes the following:

25 • hβi= 6 h0i,

• hβi= 6 h1i, and

• if γ, δ ∈ OK and γδ ∈ hβi then either γ ∈ hβi or δ ∈ hβi. Note that here β only enters through the ideal hβi. We say that an ideal P of OK is prime if • P 6= h0i,

• P 6= h1i, and

• if γ, δ ∈ OK and γδ ∈ P then either γ ∈ P or δ ∈ P . Thus the principal prime ideals are those of the form hβi with β prime. When every ideal of OK is principal then every irreducible element of OK is prime by Proposition 3.2. But then factorizations into irreducibles are always unique up to equivalence by Proposition 3.1. We can put these results together and rephrase in the language of ideals.

Proposition 4.1 Let K be a number field, and suppose that each ideal of OK is principal. Each nontrivial ideal of OK is a product of prime ideals and all such expressions are unique up to the order of the factors.

Proof Each nontrivial ideal I has the form I = hβi where β 6= 0 and β∈ / U(OK ). Then by Lemma 3.3, β = γ1γ2 ··· , γr where the γj are irreducible. Then I = hγ1i hγ2i · · · hγri and by Proposition 3.2 the γj are primes. But then the hγji are prime ideals so that I is a product of prime ideals. Let I = P1P2 ··· Pr = Q1Q2 ··· Qs (∗) be two factorizations of I into prime ideals. Write = hγii and Qj = hδji. Then β, γ1γ2 ··· γr and δ1δ2 ··· δs differ only by unit factors. By absorbing these into γ1 and δ1 we may assume that

β = γ1γ2 ··· γr = δ1δ2 ··· δs. By Proposition 3.1, these factorizations are equivalent which means that in (∗), r = s and the Pi and Qj are the same up to order.  But not every K has the property that each ideal of OK is principal. Remarkably, Proposition 4.1 is still valid for these fields, although the proof is harder. The unique factorization property for prime ideals compensates in part for the failure of unique factorization into irreducible elements. It is time to see some examples of nonprincipal ideals.

26 √ √ Example Let K = Q( −6). Then OK = {a + b −6}. We define two of OK which will turn out to be nonprincipal ideals. Let √ √ I = {a + b −6 : a, b ∈ Z, a is even} = {2c + b −6 : b, c ∈ Z} and √ √ J = {a + b −6 : a, b ∈ Z, 3 | a} = {3c + b −6 : b, c ∈ Z}.

It is easy to√ see that I and J are subgroups√ of OK under addition. Suppose β = 2c + b −6 ∈ I and γ = r + s −6 ∈ OK . Then √ √ √ γβ = (r + s −6)(2c + b −6) = 2(rs − 3sb) + (rb + 2sc) −6 ∈ I and so I is an ideal of OK . A similar√ argument shows that J is√ an ideal of O . In fact I claim that I = 2, −6 . Certainly 2 ∈ I and −6 ∈ I K √ so that 2, −6 ⊆ I. On the other hand each element of I has the form √ √ 2c +b −6 for b, c ∈ Z. A fortiori√ each element of I has the√ form 2γ +δ −6 with γ, δ ∈ O and so I ⊆ 2, −6 . Indeed then, I = 2, −6 . Similarly √ K J = 3, −6 . We now show that I and J are nonprincipal. Suppose√ that I were princi- pal.√ Then I = hβi for some β ∈ OK . Then as 2 ∈√I and −6 ∈ I, β | 2 and β | −6. Hence N(β) | N(2) = 4 and N(β) | N( −6) =√ 6. It follows that N(β) = ±1 or ±2. But N(β) = a2 + 6b2 where β = a + b −6 and a, b ∈ Z. The only possibility is a = ±1 and b = 0. But then β = ±1 and ±1 ∈/ I so this is false. Hence I is nonprincipal. A similar argument shows that J is also nonprincipal. We shall compute the products of I and J. First of all consider I2. We have √ √ √ √ I2 = 2, −6 2, −6 = 4, 2 −6, 2 −6, −6 . √ By inspection we see that 4, −6 and 2 −6 are all elements of h2i so that I2 ⊆ h2i. But 2 = (−1)4 − (−1)(−6) ∈ I2. Hence h2i ⊆ I2 and we conclude that I2 = h2i. Similarly J 2 = h3i. Now consider IJ. We have √ √ √ √ IJ = 2, −6 3, −6 = 6, 2 −6, 3 −6, −6 . √ √ √ √ As −6 | ±6 in O we see that IJ ⊆ −6 . But −6 = 3 −6 + √ K √ √ (−1)2 −6 ∈ IJ and so −6 ⊆ IJ. Hence IJ = −6 . √ We√ now show that I and J are prime ideals. Let β = a + b −6, γ = c + d −6 ∈ OK and suppose that√ β∈ / I and γ∈ / I. Then a and c are odd. Thus βγ = (ac − 6bd) + (ad + bd) −6. But ac − 6cd is odd so βγ∈ / I. Hence I is prime. Now suppose that β∈ / J and γ∈ / J. Then 3 - a and 3 - c. But then 3 - (ac − 6bd) so that βγ∈ / J. Hence J is prime.

27 We have already seen the example √ √ 6 = 2 × 3 = ( −6)(− −6) of nonunique factorization into irreducibles in OK . This gives the ideal fac- torization √ 2 h6i = h2i h3i = −6 . (∗) √ But none of h2i, h3i and −6 is “irreducible” as an ideal. The factorization (∗) can be rewritten as (I2)(J 2) = (IJ)(IJ) and is now seen to be exhibit two ways of regrouping the nonprincipal prime ideals in the factorization h6i = I2J 2 into pairs multiplying to principal ideals.

We need a technical result about ideals in OK .

Lemma 4.2 Let K be a number field of degree n. Each nonzero ideal of OK is a free abelian group of rank n under the operation of addition.

Proof Let I be a nonzero ideal of OK . Let β1, β2, . . . , βn form an integral basis of OK and let γ be a nonzero element of I. Then it is plain that γβ1, γβ2, . . . , γβn form an integral basis of hγi. Hence hγi is free abelian of rank m. Since I is a subgroup of OK then by Proposition A.3, I is free abelian of rank m where m ≤ n. But hγi is a subgroup of I and so the rank of hγi, that is n, does not exceed m. Hence n ≤ m ≤ n so that m = n.  Let K be a number field. Each nonzero ideal of OK has the same rank as an abelian group as OK . By Proposition A.4 each ideal I has finite index as a subgroup of OK . We call this index the norm of I, and denote it as N(I). That is, N(I) = |OK : I|. What this means is that if N(I) = m, then there are γ1, . . . , γm ∈ OK which form a system of coset representatives for I in O . That is, each β ∈ O is congruent to exactly one γ modulo I. K √ K √ j Example √Let K = Q( −6) so√ that OK = Z[ −6]. Consider the principal ideal 1 + −6 . Let β = 1 + −6. Then γ ∈ hβi if and only if γ/β ∈ √ √ Z[ −6]. If γ = a + b −6 then √ √ √ γ a + b −6 (a + b −6)(1 − −6) a + 6b b − a√ = √ = √ √ = + −6. β 1 + −6 (1 + −6)(1 − −6) 7 7

Thus γ ∈ hβi if and only if a + 6b ≡ 0 (mod 7) and b − a ≡ 0 (mod√ 7). Both√ conditions are equivalent to a ≡ b (mod 7). Consequently a + b −6 ≡ c + d −6 (mod hβi) if and only if b − a ≡ d − c (mod 7). Hence 0, 1, 2,

28 √ 3, 4, 5,√ 6 form a system of coset representatives for hβi in Z[ −6] and so N( 1 + −6 ) = 7. √ Example√ Again let K = Q( −6), and√ consider the nonprincipal ideal I = 2, −6 . We have seen that a + b −6 ∈ I if and only if a is even. √ √ Hence a + b −6 ≡ c + d −6 (mod I) if and only if a√≡ c (mod 2). Hence 0 and 1 form a system of coset representatives for I in√Z[ −6] and so N(I) = 2. A similar argument gives N(J) = 3 when J = 3, −6 . We list some formal properties of the norm. Let I and J be nonzero ideals of OK .

• N(I) is a positive integer, and N(I) = 1 only when I = h1i = OK , • if I ⊆ J then N(J) | N(I) with equality only when I = J; a fortiori N(J) < N(I) with equality only when I = J.

The latter of these is because |OK : I| = |OK : J||J : I|. So far we have two notions of norm. The norm of an element of K, and the form of a nonzero ideal of OK . As one might expect these notions are linked.

Theorem 4.1 Let K be a number field. If γ is a nonzero element of OK then N(hγi) = |N(γ)|. (∗) (Note that on the left of (∗) we have the norm of an ideal, and on the right we have the norm of an element.)

Proof Let β1, . . . , βn form an integral basis of OK . Then γβ1, . . . , γβn forms Pn an integral basis of hγi. We can write γβj = j=1 ajkβk where the ajk ∈ Z. By Proposition A.4, N(hγi) = |OK : hγi | = | det(A)| where A is the n-by-n matrix with (j, k)-entry ajk. It suffices to show that det(A) = N(γ). We have the matrix equation γv = Av where v is the column vec- > tor (β1 β2 ··· βn) . Applying the homomorphism σk to this equation > gives σk(γ)vk = Avk where vk = (σk(β1) σk(β2) ··· σk(βn)) . Thus the vk are eigenvectors of A with eigenvalues σk(γ). The n-by-n matrix B > with columns the vk has (j, k)-entry σk(βj). Then BB has (j, k)-entry Pn > i=1 σi(βj)σi(βk) = T (βjβk) and so det(BB ) = ∆(β1, . . . , βn) 6= 0. Hence −1 B is nonsingular. But then BAB is a with entries σj(γ) −1 Qn and so det(A) = det(BAB ) = j=1 σj(γ) = N(γ).  A similar argument shows that N(hγi I) = |N(γ)|N(I). Later we shall show that N(IJ) = N(I)N(J) is general, but our proof will be very indirect.

29 Our aim is to show that each nontrivial ideal of OK can be uniquely represented as a product of prime ideals. We need many preliminary results alas. By definition if P is a , β, γ ∈ OK and βγ ∈ P , then either β ∈ P or γ ∈ P . Equivalently if hβi hγi = hβγi ⊆ P then either hβi ⊆ P or hγi ⊆ P . This can be extended to nonprincipal ideals.

Lemma 4.3 Let K be a number field and let P be a prime ideal of OK . If I and J are ideals of OK and IJ ⊆ P then either I ⊆ P or J ⊆ P . More generally if I1,...,Im are ideals and I1 ··· Im ⊆ P then Ik ⊆ J for some k.

Proof Suppose, for a contradiction, that IJ ⊆ P but I 6⊆ P and J 6⊆ P . Then there exist β ∈ I, γ ∈ J with β∈ / P and γ∈ / P . But then βγ ∈ IJ, but βγ∈ / P , since P is prime, contradicting the hypothesis IJ ⊆ P . The case of an m-term product I1 ··· Im now follows by induction.  Primality is also equivalent to maximality. An ideal I of OK is maximal if I is nontrivial but the only ideals J of OK with I ⊆ J are J = I and J = OK .

Lemma 4.4 Let K be a number field. An ideal I of OK is prime if and only if it is maximal.

Proof First suppose that I is maximal. Let β, γ ∈ OK with βγ ∈ I and β∈ / I. To show that I is prime it suffices to show that γ ∈ I. Let J = I +hβi. Then J is an ideal and I ⊆ J, but I 6= J since β ∈ J. By maximality of I, J = OK . Hence 1 ∈ J so 1 = η + δβ where η ∈ I and δ ∈ OK . Then 1 ≡ δβ (mod I). Consequently, γ = 1γ ≡ δβγ ≡ 0 (mod I), as βγ ∈ I. We conclude that γ ∈ I and that I is prime. Conversely suppose that I is prime. Suppose that J is an ideal of OK with I ⊆ J and I 6= J. We need to show that J = OK , or equivalently, that 1 ∈ OK . Let β ∈ J and β∈ / I. Then J ⊇ I + hβi so all we need to do is to show that 1 ∈ I + hβi. The ideal I has finite index, m say, in OK . Let γ1, . . . , γm be coset representatives for I in OK . That is to say that each element of OK is congruent modulo I to exactly one γj. In particular γj ≡ γk (mod I) if and only if j = k. If βγj ≡ βγk (mod I) then β(γj − γk) ∈ I and as I is prime and β∈ / I then γj − γk ∈ I and so j = k. The numbers βγ1, . . . , βγm lie in distinct cosets of I, and so they represent all cosets. In particular 1 ≡ βγj (mod I) for some j, and so 1 = η + γjβ for some η ∈ I. Thus 1 ∈ I + hβi and I is maximal.  As an immediate consequence, if P and Q are prime ideals of OK and P ⊆ Q then P = Q due to the maximality of P .

30 Due to maximality being the same as primality, every nontrivial ideal is contained in a prime ideal.

Lemma 4.5 Let K be a number field, and let I be a nontrivial ideal of OK . Then there is a prime ideal P of OK with I ⊆ P .

Proof Consider the nontrivial ideals J of OK with I ⊆ J. There is certainly at least one namely I itself. Take one, P , with least possible norm. Then P is maximal, for if P ⊆ J1 with J1 6= P an ideal of OK , then N(J1) < N(P ) and so J1 = OK .  We would like to show that each nontrivial ideal is a product of prime ideals. We cannot do so yet, but we can prove a first approximation to this result.

Lemma 4.6 Let K be a number field, and let I be a nontrivial ideal of OK . Then I ⊇ P1P2 ··· Pm where the Pj are prime ideals of OK .

Proof We use induction on N(I). If I is prime, then we can take m = 1 and P1 = I. If I is not prime, then there exist β, γ ∈ OK with β∈ / I, γ∈ / I but βγ ∈ I. Let J1 = hβi + I and J2 = hγi + I. Then I ⊆ J1 and I ⊆ J2, but I 6= J1 and I 6= J2. Hence N(J1) < N(I) and N(J2) < N(I). But 2 J1J2 = hβγi + βI + γI + I ⊆ I as βγ ∈ I. By the inductive hypothesis, J1 ⊇ P1 ··· Pr and J2 ⊇ Q1 ··· Qs where the Pj and Qk are prime. Hence I ⊇ J1J2 ⊇ P1 ··· PrQ1 ··· Qs as required.  As a technical convenience we extend the notion of ideals in OK to that of . It will turn out that the set of fractional ideals forms a group under multiplication, which it is clear that the set of ideals do not. A fractional ideal of K is a set of the form βI where β is a nonzero element of K and I is a nonzero ideal of OK . Note that we do not assume that β ∈ OK . In particular βOK = hβi is a fractional ideal of K for all nonzero β ∈ K. We call such a fractional ideal principal. If all ideals of OK are principal, for instance if K = Q, then so are all fractional ideals, for the fractional ideal βI = hβγi if I = hγi. We define the sum and product of fractional ideals in the same way as for ideals. In particular if β ∈ K, β 6= 0 then hβi h1/βi = h1i = OK , so that principal fractional ideals are invertible. We shall show that all fractional ideals are invertible. We start with an alternative characterization of fractional ideals.

Lemma 4.7 Let K be a number field. Then I is a fractional ideal of K if and only if • I is a nonzero subgroup of K under addition,

31 • if β ∈ I and γ ∈ OK then γβ ∈ I, and

• there is a nonzero η ∈ K such that β/η ∈ OK for each β ∈ I.

Proof If I = ηJ is a fractional ideal of K, with η ∈ K and J an ideal of OK , then the three properties follow with the same value of η. Conversely suppose the three properties hold. Then J = η−1I = {β/η : β ∈ I} is a nonzero ideal of OK and so I = ηJ is a fractional ideal.  We shall show that all fractional ideals are invertible, that is given a fractional ideal I, there is a fractional ideal J with IJ = h1i. It is easy to write down a candidate for the inverse of a fractional ideal I; define I∗ = ∗ {β ∈ K : βI ⊆ OK }. It is clear that I is an additive subgroup of K and is nonzero since it contains 1/η whenever I = ηJ with J an ideal of OK . Also ∗ ∗ it is clear that if β ∈ I and γ ∈ OK then βγ ∈ I . If δ is a nonzero element ∗ ∗ ∗ of I then δI ⊆ OK and so (1/δ)I ⊆ OK . Thus I is a fractional ideal of K. ∗ ∗ Also II ⊆ OK so that II is an ideal of OK . The hard part is to show that ∗ II = OK . We first prove the invertibility for prime ideals. Let P be a prime ideal ∗ ∗ of OK . Then P ⊇ OK since βP ⊆ P for all β ∈ OK . Thus PP ⊇ P OK = ∗ ∗ P . But PP ⊆ OK . By the maximality of P , either PP = OK (as we want) or PP ∗ = P . We dispose of the possibility that PP ∗ = P in two stages: first ∗ ∗ ∗ we show that PP = P implies that P = OK , then we show that P 6= OK .

Lemma 4.8 Let K be a number field and let I be a nonzero ideal of OK . If γI ⊆ I for some I ∈ K, then γ ∈ OK .

Proof By Lemma 4.2, I is a free abelian group, so let β1, . . . , βn form an Pn integral basis of I. Then γβj ∈ I for all j, so γβj = k=1 ajkβk where the ajk ∈ Z. Thus γv = Av where v is the column vector with entries the βj and A is the matrix with entries the ajk. Thus γ is an eigenvalue of A which is a matrix with integer entries. Thus γ is an algebraic integer and so γ ∈ K ∩ B = OK .  We now show that prime ideals are invertible.

Proposition 4.2 Let K be a number field and let P be a prime ideal of OK . Then there is a fractional ideal J of K with PJ = h1i.

∗ ∗ Proof We let P = {β ∈ K : βP ⊆ OK }. Then P is a fractional ideal ∗ ∗ of K, OK ⊆ P and P ⊆ PP ⊆ OK . By the maximality of the prime ideal ∗ ∗ P , either PP = P or PP = OK . We show that the latter is true, so to obtain a contradiction, suppose that PP ∗ = P .

32 ∗ Then γ ∈ P implies that γP ⊆ P and so γ ∈ OK by Lemma 4.8. ∗ ∗ Hence P ⊆ OK and we conclude that P = OK . To obtain the desired ∗ contradiction, it suffices to find an element in P but not in OK . Let β be a nonzero element of P . Then by Lemma 4.6, hβi contains a product P1P2 ··· Pr of prime ideals. Choose such a product with fewest possible factors. Then P ⊇ hβi ⊇ P1P2 ··· Pr and so P ⊇ Pj for some j by Lemma 4.3. We shall assume, without loss of generality, that P ⊇ P1. Then by maximality of the prime ideal P1, P = P1. Thus hβi ⊇ PI where I = P2 ··· Pr. As r was chosen to be minimal then hβi 6⊇ I. Thus there exists γ ∈ I but γ∈ / hβi. Hence δ = γ/β∈ / OK . But γP ⊆ PI ⊆ hβi and so −1 ∗ δP = β γP ⊆ OK . Hence δ ∈ P . But as γ∈ / hβi then δ∈ / OK . ∗ The assumption that P = OK has led to a contradiction. We cannot ∗ ∗ then have PP = P and we conclude that PP = OK = h1i as required.  The inverse of a prime ideal P is uniquely determined, for if PJ = OK then J = JPP ∗ = P ∗. We can now show that every nontrivial ideal is a product of prime ideals.

Theorem 4.2 Let K be a number field, and suppose that I is a nontrivial ideal of OK . Then I = P1P2 ··· Pm where the Pj are prime ideals.

Proof We use induction on N(I). There is nothing to prove when I is prime so assume that it is not. By Lemma 4.5 I ⊆ P for some prime −1 ideal P . Let P be the inverse of P as a fractional ideal. As P ⊆ OK −1 −1 −1 −1 then OK = PP ⊆ OK P = P . Let J = IP . As I ⊆ P then −1 J ⊆ PP = OK and J is an ideal of OK . Also I = PJ. Thus J is a proper ideal of OK . −1 −1 −1 We know P ⊇ OK but P 6⊆ OK for otherwise P would be OK −1 which is not an inverse of P . Thus J = IP ⊇ IOK = I. If we had −1 −1 J = I then γI ⊆ I for all γ ∈ P and so P ⊆ OK by Lemma 4.8. This −1 contradicts P 6⊆ OK . Hence I ⊆ J but I 6= J and so N(J) < N(I). By the inductive hypothesis J is a product of primes, and so I = PJ is also.  We can conclude that every fractional ideal is invertible.

Proposition 4.3 Let K be a number field, and let I be a fractional ideal of K. Then I is invertible.

Proof Write I = βJ where β 6= 0 and J is an ideal of OK . By Theorem 4.2 J = P1P2 ··· Pr where the Pj are prime, and by Lemma 4.5, each Pj is −1 −1 −1 −1 −1 invertible with inverse Pj say. Then the fractional ideal β P1 P2 ··· Pj is an inverse of I. 

33 It is now plain that the fractional ideals of K form a group under mul- tiplication and the principal fractional ideals form a subgroup. One useful consequence is that ideals satisfy the maxim, “to contain is to divide”.

Proposition 4.4 Let K be a number field, and let I1, I2 be nonzero ideals of OK . Then I1 ⊇ I2 if and only if there is an ideal J of OK with I2 = I1J.

Proof If I2 = I1J for an ideal J then I2 ⊆ I1. Conversely suppose that −1 −1 I1 ⊇ I2. Then OK = I1I1 ⊇ I2I1 = J say. Then J is an ideal of OK and −1 I1J = I1I2I = I2.  After all this hard work it is now easy to conclude that factorizations into prime ideals are unique.

Theorem 4.3 Let K be a number field, and suppose that I is a nontrivial ideal of OK . If I = P1P2 ··· Pr = Q1Q2 ··· Qs (∗) where the Pj and Qk are prime ideals of OK , then r = s and the Qk can be reordered so that Pj = Qj for each j.

Proof We use induction on r. Certainly P1 ⊇ I = Q1Q2 ··· Qs. By Lemma 4.3, P1 ⊇ Qk for some k. We reorder the Qj so that P1 ⊇ Q1. −1 By maximality of Q1 then P1 = Q1. Multiplying (∗) by P1 gives

P2 ··· Pr = Q2 ··· Qs and the result now follows from the inductive hypothesis.  We can also repair an earlier omission; we can show that the is multiplicative.

Proposition 4.5 Let K be a number field, and let I and J be nonzero ideals of OK . Then N(IJ) = N(I)N(J).

Proof If I = OK there is nothing to prove, so we can write I as a product of prime ideals. Using induction on the number of prime ideals it plainly suffices to prove the result in the special case where I = P is a prime ideal. By definition N(P ) = |OK : P |, N(J) = |OK : J| and N(PJ) = |OK : PJ|. From the transitivity of index we have

N(PJ) = |OK : PJ| = |OK : J||J : PJ| = |J : PJ|N(J).

It suffices to show that |J : PJ| = N(P ). Certainly J ⊇ PJ and J 6= PJ (why?). Let β ∈ J with β∈ / PJ. Then J ⊇ hβi + PJ ⊇ PJ and so

34 J = hβi + PJ (why?). Let γ1, . . . , γm be a system of coset representatives for P in OK , so that m = N(P ). We shall show that βγ1, . . . , βγm form a system of coset representatives for PJ in J. If δ ∈ J then δ = ξβ + η with ξ ∈ OK and η ∈ PJ as J = hβi + PJ. Now ξ − γk ∈ P for some k, and so β(ξ − γk) ∈ PJ. Hence δ ≡ βγk (mod PJ) so that each coset of PJ in J is represented by some βγk. We need to show that the βγk represent distinct cosets of PJ. Suppose that βγi ≡ βγk (mod PJ) with i 6= k. Then β(γi −γk) ∈ PJ. Let δ = γi −γk. Then δ∈ / P , and as P is maximal hδi + P = OK . There exists λ ∈ OK with λδ ≡ 1 (mod P ) and so (λδ − 1)β ∈ PJ. But δβ ∈ PJ and so β ∈ PJ which is a contradiction. Since βγ1, . . . , βγm form a system of coset representatives for PJ in J with m = N(P ) then the index |J : PJ| = m = N(P ) and this completes the proof.  It is important to know how to find prime ideals in OK . The follow- ing lemma shows they are all obtained from factorization of ordinary prime numbers.

Lemma 4.9 Let K be a number field, and let P be a prime ideal of OK . Then P occurs in the ideal factorization of hpi for a unique prime number p. Also N(P ) is a power of p.

Proof By Proposition 4.4, P occurs in the prime factorization of hpi if and only if P ⊇ hpi which occurs if and only if p ∈ P . Let P have norm m. Then the index of P as a subgroup of OK is m. Consequently mβ ∈ P for all β ∈ OK ; in particular m ∈ P . Write m = p1p2 ··· pr with the pj prime. Then pj ∈ P for some j since P is a prime ideal. Write pj = p. Since P ⊇ hpi, N(P ) is a factor of N(hpi) = |N(p)| = pn where n is the degree of K. Hence N(P ) = pk for some k with 1 ≤ k ≤ n. This also shows that the prime p is uniquely determined.  One corollary of this is the fact that there are only a finite number of ideals of a given norm (one can also prove this directly from the definition of norm).

Lemma 4.10 Let K be a number field and m a positive integer. There are only finitely many ideals I of OK with N(I) = m.

a1 a2 ar Proof Write m = p1 p2 ··· pr as a product of primes. Then I is a product of at most a1 prime ideals dividing hp1i, at most a2 prime ideals dividing hp2i, and so on. There are only finitely many ways of choosing these prime ideals, and consequently only finitely many possibilities for I. 

35 Hence we can determine all the prime ideals of OK by resolving each hpi into its prime ideal factors. 2 n−1 When OK = Z[α] for some α, that is when 1, α, α , . . . , α forms an integral basis of OK , then the prime ideal factorization of p can be computed from the minimum polynomial of α. In general we cannot always find an integral basis of this form for a given K. But in special cases, for instance the important case of quadratic fields, we can. We will need to factorize polynomials modulo p. Recall that for a prime number p, the set Fp = {0, 1, 2,..., p − 1} with addition and multiplication modulo p forms a field. Here we have written elements of Fp in the form a to distinguish them from integers, but in practice we are usually sloppy and write a both for an integer and the corresponding element of Fp. If f ∈ Z[X] is a polynomial with integer coefficients then f ∈ Fp[X] denotes its reduction 2 m 2 modulo p, that is a0 + a1X + a2X + ··· + AmX = a0 +a1X +a2X +···+ m amX . When OK = Z[α] we can classify all ideals containing hpi.

Proposition 4.6 Let K be a number field of degree n, and suppose that there is α ∈ OK with OK = Z[α]. Let f be the minimum polynomial of α, and let p be a prime number. Then each ideal of OK which contains hpi has the form Ig = hp, g(α)i where g ∈ Z[X] is monic and g | f in Fp[X]. Also Ig1 = Ig2 if and only if g1 = g2 in Fp[X]. d The norm of Ig is p where d = deg(g), and Ig1 ⊇ Ig2 if and only if g1 | g2 in Fp[X].

Proof Suppose that I ⊇ hpi. Since f(α) = 0 ∈ I, there are certainly monic polynomials g ∈ Z[X] with g(α) ∈ I. Fix such a polynomial g of least possible degree. Certainly I ⊇ Ig = hp, g(α)i. I claim that, for h ∈ Z[X], h(α) ∈ I if and only if g | h in Fp[X]. Let d = deg(g) = deg(g). Suppose first that deg(h) < d. Then for some integer a we have ah monic, so ah = h1 + ph2 where h1 ∈ Z[X] is monic of degree less than d and h2 ∈ Z[X]. Thus h1(α) = ah1(α) − ph2(α) ∈ I. This contradicts the definition of g. In general suppose g - h and h(α) = 0. Then h − ug is nonzero and has degree less than d for some u ∈ Z[X]. Then vα ∈ I where v = h − ug and v is nonzero and has degree less than d, which is false. Conversely, if g | h then h = ug + pv with u, v ∈ Z[X], and so h ∈ Ig ⊆ I. Hence I = Ig. As h(α) ∈ I then g | h. d 2 d−1 The p numbers a0 + a1α + a2α + ··· + ad−1α where 0 ≤ aj < p form a system of coset representatives for Ig in OK , since each h ∈ Fp[X] is congruent to a unique polynomial of degree less than d modulo g. Hence

36 d N(Ig) = p . Since g2(α) ∈ Ig1 if and only if g1 | g2, it follows that Ig1 ⊇ Ig2 if and only if g1 | g2. In particular, Ig1 = Ig2 if and only if g1 | g2 and g2 | g1, that is if and only if g1 = g2.  Using the above notation we get that when g = f, If = hp, 0i = hpi and when g = 1 then I1 = hp, 1i = OK . Clearly the maximal (prime) ideals containing p correspond to the irreducible factors of f.

Theorem 4.4 Let K be a number field of degree n, and suppose that there is α ∈ OK with OK = Z[α]. Let f be the minimum polynomial of α, and let p be a prime number. Write

a1 a2 ar f = g1 g2 ··· gr where the gj are the distinct monic irreducible factors of f in Fp[X]. Then the prime ideal factorization of hpi in OK is

a1 a2 ar hpi = P1 P2 ··· Pr where Pj = hp, gj(α)i .

Proof By Proposition 4.6, if Q is an ideal of OK and Q ⊇ Pj, then Q = hp, h(α)i where h | gj. Thus h = 1 or h = gj so that Q = OK or Q = Pj. Hence Pj is maximal, and so prime. a1 a2 ar The norm of the ideal P1 P2 ··· Pr is

a1 a2 ar d1a1+d2a2+···+drar N(P1) N(P2) ··· N(Pr) = p where dj = deg(gj). Hence

a1 a2 ar deg(f) n N(P1 P2 ··· Pr ) = p = p = N(hpi).

For any polynomials h1, h2 ∈ Z[X] we have

(hpi + hh1(α)i)(hpi + hh2(α)i) ⊆ hpi + hh1h2(α)i .

Iterating this by induction we get

a1 a2 ar a1 a2 ar P1 P2 ··· Pr ⊆ hpi + hg1 g2 ··· gr (α)i = hpi + hf1(α)i

a1 a2 ar where f1 = g1 g2 ··· gr . Then f1 − f = pf2 where f2 ∈ Z[α]. But f1(α) = a1 a2 ar pf2(α) ∈ hpi so that P1 P2 ··· Pr ⊆ hpi. But we have seen these ideals have the same norm, so they are equal. 

37 In fact this result is true even when OK 6= Z[α] as long as p - |OK : Z[α]|. We shall not prove this generalization. √ Example Let K = Q( m) be a quadratic field where √m is a squarefree integer with m 6≡ 1 (mod 4). Then OK = Z[α] where α = m has minimum polynomial X2 −m. To determine the prime ideal factorization of hpi, where 2 p is a prime number, in OK , we must factorize X − m in Fp[X]. There are three possibilities:

2 1. X − m is irreducible over Fp. Then hpi is a prime ideal of OK . This occurs when the congruence x2 ≡ m (mod p) is insoluble. This can only happen when p is odd. This condition is described by saying that m is a quadratic nonresidue of p. In this case we say that p is inert in K.

2 2 2. X − m splits into two distinct factors over Fp. Then X − m ≡ (X − a)(X + a) (mod p) with a 6≡ −a (mod p). This means that p is odd, a2 ≡ m (mod p) and p - m. This condition is described by saying that m is√ a quadratic residue of√ p. In this case hpi = P1P2 where P1 = hp, m + ai and P2 = hp, m − ai. Here P1 6= P2 and both P1 and P2 have norm p. In this case we say that p splits in K.

2 3. X − m splits into two equal factors over Fp. This happens only when p = 2 or when p | m. When p = 2 then X2 − m ≡ X2 or (X + 1)2 (mod 2) according to the parity of m. When p | m then X2 − m ≡ X2 2 √ (mod p). Then hp√i = P where P = hp, mi, unless p = 2 and m is odd when P = h2, m + 1i. In any case P has norm p. In this case we say that p ramifies in K.

Note that p ramifies if and only if p divides the discriminant of K. Thus is true for all number fields K, but is too difficult to prove in this course.√ It is a good exercise to perform the same calculations when K = Q( m) has m ≡ 1 (mod 4). The theory of ideal factorization allows us to prove various results in elementary number theory. Example Let K = Q(i). We know that K is norm-Euclidean so that OK = Z[i] is a Euclidean domain. Then each ideal of Z[i] is principal. Let p be a prime number. Then p splits in K whenever p is odd and the congruence a2 ≡ −1 (mod p) is soluble. By elementary number theory this 2 occurs if and only if p ≡ 1 (mod 4). When p ≡ 1 (mod 4) then hpi = P1P 2 where P1 = hp, a + ii and P2 = hp, a − ii. Here a ≡ −1 (mod p). These ideals are principal: P1 = hβi and as N(P1) = p then N(β) = p. Hence

38 p = b2 + c2 where β = b + ci. We have recovered the two-square theorem of elementary number theory: if p is a prime congruent to 1 modulo 4, then p is the sum of two squares of integers. If for a given p we can find an a with a2 ≡ −1 (mod p) then by applying the Euclidean algorithm for Z[i] to p and a+i we can obtain β = gcd(p, a+i) and so integers b and c with p = b2 + c2. We shall briefly consider ideals in K = Q(ζ) where ζ = exp(2πi/p) and p is an odd prime number. We have seen that the minimum polynomial of ζ is f(X) = Xp−1 + Xp−2 + ··· + X + 1 so that K has degree p − 1. Certainly ζ ∈ OK . Let λ = ζ − 1, so that Z[ζ] = Z[λ]. Then N(ζ) = (−1)p−1N(−ζ) − f(0) = 1 and N(λ) = (−1)p−1N(1 − ζ) = f(1) = p. Also f(λ + 1) = 0 so that the minimum polynomial of λ is

p−1 (X + 1)p − 1 X p g(X) = f(X + 1) = = Xp−1 + Xp−1−j. (X + 1) − 1 j j=1 Thus p−1 p−2 X p X  p  λp−1 = − λp−1−j = − λk. j k + 1 j=1 k=0 p−1 In particular λ = pβ where β ∈ OK . Comparing norms gives pp−1 = N(λp−1) = pp−1N(β).

Consequently N(β) = 1 and β = λp−1/p is a unit.

Proposition 4.7 Let K = Q(ζ) where p is an odd prime number and ζ = exp(2πi/p). Then ∆(1, ζ, ζ2, . . . , ζp−2) = (−1)(p−1)/2pp−2.

Proof Call this discriminant ∆. Then ∆ = (−1)p(p−1)/2f 0(ζ) where f is the minimum polynomial of ζ. Then f(X) = (Xp − 1)/(X − 1), so p(X − 1)Xp−2 − (Xp − 1) f 0(X) = (X − 1)2 and so f 0(ζ) = pζp−2/(ζ − 1). Hence N(f 0(ζ)) = N(p)N(ζ)p−1/N(ζ − 1) = p−2 p(p−1)/2 (p−1)/2 p . As p is odd, (−1) = (−1) and the result follows.  We can now show that the ring of integers of Q(ζ) is Z[ζ]. We have seen N(λ) = p and αp−1/p is a unit. Thus hλi must be a prime ideal of norm p and hλip−1 = hpi. We thus know the prime factorization of hqi when q = p.

Theorem 4.5 Let K = Q(ζ) where p is an odd prime number and ζ = exp(2πi/p). Then OK = Z[ζ].

39 Proof Since the discriminant of 1, ζ, ζ2, . . . , ζp−2 is, up to sign, a power of p, the index |OK : Z[ζ]| is a power of p. If OK 6= Z[ζ] then there is β ∈ OK with β∈ / Z[ζ] but pβ ∈ Z[ζ]. Since Z[ζ] = Z[λ], where λ = ζ − 1 then we can write p−2 1 X β = b λj p j j=0 where the bj are integers, not all divisible by p. Choose j to be the smallest number such that p - bj. Then

j−1 1 X b λk ∈ O p k K k=0 and so j−1 p−2 1 X 1 X γ = β − b λk = b λk ∈ O . p k p k K j=0 k=j We infer that p−2 1 X λp−2−jγ = b λp−2−j+k ∈ O . p k K k=j p−1 p−2+j+k But we have seen that λ /p ∈ OK . Then for k ≥ j+1 we have λ /p = p−1 k−j−1 λ λ ∈ OK . Hence

p−1 p−2 bjλ 1 X = λp−2−jγ − b λp−2−j+k ∈ O . p p k K k=j+1

p−2 p−1 p−2 p−1 p−1 We now consider norms. The norm of bjλ /p is bj p /p = bj /p. This must be an integer, yet it cannot be as p - bj. This contradiction shows that Ok = Z[ζ]. 

Example Let K = Q(ζ) where ζ = exp(2πi/5). Then OK = Z[ζ] and ζ has minimum polynomial f(X) = X4 + X3 + X2 + X + 1. For each prime number q we aim to factorize the ideal hqi by factorizing the polynomial f modulo q. Consider the case q = 5. Then

(X − 1)4 = X4 − 4X3 + 6X2 − 4X + 1 ≡ X4 + X3 + X2 + X + 1 (mod 5).

4 It follows that h5i = P5 where P5 = h5, ζ − 1i. For λ = ζ − 1 we have seen 4 that λ | 5 so that P5 = hλi. Hence h5i is the fourth power of the prime ideal hζ − 1i.

40 Now suppose that q 6= 5. If f is reducible modulo q, then either it has a linear factor or a quadratic factor. Let us suppose that f has the linear factor X − a modulo q. Then f(a) ≡ 0 (mod q) and so as f(X)(X − 1) = X5 − 1 then a5 ≡ 1 (mod q). But a 6≡ 1 (mod q) for f(1) = 5 6≡ 0 (mod q). Thus a has order 5 modulo p. The powers a2, a3, a4 must also be solutions of f(x) ≡ 0 (mod p), so we have

f(X) ≡ (X − a)(X − a2)(X − a3)(X − a4) (mod q).

Such an a exists if and only if q ≡ 1 (mod 5), and we conclude for these primes that hqi is a product of four distinct prime ideals, each of norm q. For instance, take q = 11. Then a = 3 satisfies a5 ≡ 1 (mod 11) and a2 ≡ 9, a3 ≡ 5, and a4 ≡ 4 (mod 11). Thus

h11i = h11, ζ − 3i h11, ζ − 4i h11, ζ − 5i h11, ζ − 9i .

If f has no linear factor modulo q then either f is irreducible or f is the product of two quadratics, each irreducible modulo q. In the latter case then in fact f(X) ≡ (X2 + aX + 1)(X2 + bX + 1) (mod q)(∗) for some a and b. I shan’t prove this; it is easy if one knows some theory of finite fields, otherwise it’s a rather messy calculation. Then (∗) holds if and only if a + b ≡ 1 and ab ≡ −1 (mod q), that is that a and b are the roots of Y 2 − Y − 1 ≡ 0 (mod q). This equation is soluble modulo q if and only 1 if 5 is a square modulo q; then a and b are congruent to 2 (1 ± s) modulo q, where s2 ≡ 5 (mod q). By 5 is a square modulo q if and only if q ≡ ±1 (mod 5). We have seen that q ≡ 1 (mod 5) if and only if hqi is the product of four prime ideals of norm q. Thus hqi is the product of two prime ideals of norm q2 if and only if q ≡ −1 (mod 5). For example, 2 1 let q = 19. Then 9 ≡ 5 (mod 19) and so we can take a = 2 (1 + 9) = 5 and 1 b = 2 (1 − 9) = −4. That is f(X) ≡ (X2 + 5X + 1)(X2 − 4X + 1) (mod 19) and both X2 + 5X + 1 and X2 − 4X + 1 are irreducible modulo 19. Thus

h19i = 19, ζ2 + 5ζ + 1 19, ζ2 − 4ζ + 1 is the factorization of h19i into prime ideals. In all other cases, that is when q ≡ ±2 (mod 5) then f is irreducible modulo q and hqi is prime.

41 5 Ideal classes

The set of fractional ideals of a number field K forms an abelian group under multiplication which we shall call IK . The set of principal fractional ideals of K is a subgroup of IK and we shall denote it by PK . The quotient group of IK /PK is called the class-group of K and is denoted by ClK . Its elements are the cosets of PK in IK and are called ideal classes. An ideal class is an equivalence class of fractional ideals under the equivalence relation I ∼ J if I = βJ for some nonzero β ∈ K. The set of principal fractional ideals forms an ideal class, the principal ideal class. We denote the ideal class containing the fractional ideal I by [I]. We can consider ClK as the set of such symbols [I] obeying the rule that [I] = [J] whenever J = βI, β ∈ K and β 6= 0 and with the operation [I][J] = [IJ]. The identity element of ClK is [h1i] = [OK ]. In addition since each fractional ideal I of K has the form βJ where J is an ideal, then [I] = [J] and so each ideal class contains ideals. √ √ Example√ Let K = Q( −√6) so that OK = Z[ −6]. Consider the ideals I = 2, −6 and J = 3, −6 . We have already seen that I and J are nonprincipal. We can now√ write this as [I] 6= [OK ] and [J] 6= [OK ]. But we also have I2 = h2i, IJ = −6 and J 2 = h3i. Thus [I]2 = [I][J] = [J]2 = −1 [OK ] in ClK . Thus [J] = [I] = [I], that√ is, the ideals I and√ J lie in the same ideal class. Indeed IJ −1 = IJJ −2 = −6 h2i−1 = −6/2 . Hence √ J = ( −6/2)I. We can confirm this by calculating √ √ −6 −6 √ √ √ I = 2, −6 = −6, −3 = 3, −6 = J. 2 2 2 So we have at least two elements, [OK ] and [I], in ClK . As [I] = [OK ] then [I] has order 2 in the class-group. It will turn out that these are the only ideal classes in K so that |ClK | = 2. We now aim to prove that the class-group of each number field is finite. The order of the class-group ClK of K is called the class-number of K and is denoted by hK . In particular hK = 1 if and only if every ideal of OK is principal. The crucial step in proving the finiteness of the class-group is the following result, which shows that each nonzero ideal of OK has an element of approximately the same norm as that of the ideal.

Proposition 5.1 Let K be a number field. There is a positive number AK with the following property: each nonzero ideal I of OK has a nonzero element β with |N(β)| ≤ AK N(I).

Proof Let γ1, . . . , γn form an integral basis of OK . Let m = N(I) and let r be the integer part of m1/n, that is r is an integer and r ≤ m1/n < r + 1.

42 Consider the set

A = {b1γ1 + b2γ2 + ··· + bnγn : bj ∈ Z, 0 ≤ bj ≤ r}. Then A has (r + 1)n elements and so |A| > m. The elements of A cannot lie in distinct cosets of I in OK . Hence there are β1, β2 ∈ A with β1 6= β2 but β1 ≡ β2 (mod I). Set β = β1 − β2. Then β 6= 0 but β ∈ I. Also Pn β = j=1 cjβj where cj ∈ Z and |cj| ≤ r. Qn Pn Now N(β) = k=1 σk(β) and σk(β) = j=1 cjσk(βj). Hence n n X X |σk(β)| ≤ |cj||σk(βj)| ≤ r |σk(βj)|. j=1 j=1 Multiplying these inequalities together gives n n n Y X |N(β)| ≤ r |σk(βj)| ≤ AK m = AK N(I) k=1 j=1 where n n Y X AK = |σk(βj)|. k=1 j=1  Note here that it is important that AK only depends on K and not on the ideal I. Also if we know an integral basis we can calculate AK . However using other methods we can often get smaller constants that this AK and they will be better in practice. We now show that each ideal class contains an ideal of small norm. Proposition 5.2 Let K be a number field, and suppose that the number A has the property that each nonzero ideal I of OK as a nonzero element β with |N(β)| ≤ AN(I). Then each ideal class of K contains an ideal J with norm N(J) ≤ A.

Proof Let [I] be an ideal class with I an ideal of OK . Let β ∈ I have β 6= 0 and |N(β)| ≤ AN(I). Then I ⊇ hβi and by Proposition 4.4 hβi = IJ for −1 some ideal J of OK . Then [J] = [I] and by Theorem 4.1 and Proposition 4.5 we have N(hβi) |N(β)| N(J) = = ≤ A. N(I) N(I) Hence the ideal class [I]−1 has an ideal of norm at most A, but if we had started with [I]−1 instead of [I] we would have proved that [I] has an ideal of norm at most A.  The finiteness of the class-group is now easy to see.

43 Theorem 5.1 Let K be a number field. Its class-group ClK is a finite abelian group.

Proof Only the finiteness needs to be shown. By Propositions 5.1 and 5.2 there is a positive number A with each class of K containing an ideal of OK with norm at most A. By Lemma 4.10, OK has only finitely many norms of each given norm, and so only finitely many ideals of norm at most A. Therefore K has only finitely many ideal classes.  These results give us a method for finding the class-group and so also the class-number. First find a constant A for which Proposition 5.2 is valid, and find all ideals of norm at most A. Determine which of these lie in the same ideal classes. Of course, this is easier said than done, but is is clear that the smaller one can make A, the less work needs to done. An alternative is to find all prime ideals P of norm at most A. Each ideal of norm at most A is a product of prime ideals of norm at most A. Thus the ideal classes [P ] for these prime ideals generate the class-group in the sense that each ideal class is a product of powers of such ideal classes. √ √ Example Let K = Q( −6). Then OK has integral basis 1, −6. Using the proof of proposition 5.1 we can take √ √ √ 2 A = (|σ1(1)| + |σ1( −6)|)(|σ2(1)| + |σ2( −6)|) = (1 + 6) ≈ 11 · 9.

We shall find all prime ideals of norm at most 11. We find that√ 2 and 3 ramify, 2 2 5 and 7 and 11 split. In detail h2i = P2 where P2 = 2, −6 , h3i = P3 √ √ where P3 = 3, −6 , h5i = P5Q5 where P5 = 5, −6 + 2 and Q5 = √ √ √ 5, −6 − 2 , h7i = P7Q7 where P7 = 7, −6 + 1 and Q7 = 7, −6 − 1 , √ √ and h11i = P11Q11 where P11 = 11, −6 + 4 and Q11 = 11, −6 − 4 . Thus P2, P3, P5, Q5, P7, Q7, P11 and Q11 are the prime ideals of norm at most 11, and each ideal class of K is the product of powers of these classes. To find the relations between these ideal classes we look at the factoriza- 2 tion of principal ideals of small norm. We already know that [P2] = [h2i] = 2 [h1i], [P3] = [h3i] = [h1i], [P5][Q5] = [h5i] =√ [h1i], [P7][Q7] = [h7i] = [h1i] and [P ][Q ] = [h11i] = [h1i]. Consider −6 . It has norm 6 so must 11 11 √ be the product of prime ideals of norms 2 and 3. Hence −6 = P2P3 and 2 so [P2][P3] = [h1i]. As√ [P2] = [h1i] then [P2] = [P3] (as we have already observed). Next 1 + −6 has norm 7, so this is a prime ideal of norm 7. √ √ As 1 + −6 ∈ P then P = 1 + −6 is principal, and so [P ] = [h1i] 7 7 √ 7 −1 and [Q7] = [h1i][P7] = [h1i]. Next 2 + −6 has norm 10 and so is the product√ of prime ideals of norms√ 2 and 5. Hence it is either√ P2P5 or P2Q5. But 2 + −6 ∈ P and so 2 + −6 ⊆ P . Hence 2 + −6 = P P and 5 5 2 √5 −1 −1 −1 so [P5] = [P2] = [P2]. Also [Q5] = [P5] = [P2] = [P2]. Then 4 + −6

44 has norm 22 and so is the product√ of prime ideals of norms√ 2 and 11. It is either P P or P Q , But 4 + −6 ∈ P and so 4 + −6 ⊆ P . 2 √11 2 11 11 11 −1 Hence 4 + −6 = P2P11 and so [P11] = [P2] = [P2]. Also [Q11] = −1 −1 [P11] = [P2] = [P2]. Summarizing, we have [P2] = [P3] = [P5] = [Q5] = 2 [P11] = [Q11] and [P7] = [Q7] = [h1i]. We also have [P2] = [h1i]. Hence ClK = {[h1i], [P2]}. We have seen that P2 is nonprincipal so that [P2] 6= [h1i]. Hence the class-group has two elements and hK = 2. If one knows what the class-number is of a field is, then one can often prove that certain ideals are principal.

Lemma 5.1 Let K be a number field with class-number h, and let m be an m integer with gcd(m, h) = 1. If I is an ideal of OK and I is principal, then I is principal.

m m Proof By assumption we have [I] = [I ] = [h1i] in ClK . As h is the order h of the group ClK then [I] = [h1i] also. As gcd(m, h) = 1 there exist integers r and s with 1 = mr + hs. Then

[I] = [I]mr+hs = ([I]m)r([I]h)s = [h1i]r[h1i]s = [h1i].

Hence I is principal.  Another useful lemma really has nothing to do with the class-group and everything to do with unique factorization into prime ideals. We say that two ideals I and J of OK are coprime if I + J = OK . Then there is no prime ideal P with P ⊇ I and P ⊇ J (for otherwise P ⊇ I + J) and so no prime ideal occurs in both the factorizations of I and J.

Lemma 5.2 Let K be a number field. Let I1 and I2 be coprime ideals of OK m m and suppose that I1I2 = J for some ideal J and integer m. Then I1 = J1 m and I2 = J2 for some ideals J1 and J2.

Proof Let P be a prime ideal occurring in the factorization of I1. Suppose that P occurs a times in this factorization. Then P does not occur in the prime ideal factorization of I2, as I1 and I2 are coprime. Hence P occurs a m times in the prime ideal factorization of I1I2 = J . But if P occurs b times in the prime ideal factorization of J then P occurs mb times in the prime ideal factorization of J m and so a = mb so that m | a. As this is true for all prime ideal factors of I1 then I1 is the m-th power of an ideal. Similarly I2 is the m-th power of an ideal.  Example As an application of these ideas, we investigate the solutions of

y3 = x2 + 6 (∗)

45 √ √ with x, y ∈ Z. let us work in K = Q( −6) so that OK = Z[ −6]. From (∗) we get √ √ y3 = (x + −6)(x − −6) and so in ideal terms √ √ hyi3 = x + −6 x − −6 . (†) √ √ By Lemma 5.2, we can conclude that if the ideals x + −6 and x − −6 are coprime then each is a cube√ of an ideal.√ √ √ Consider then I = x + −6 + x − −6 = x + −6, x − −6 . Then √ √ √ (x + −6) − (x − −6) = 2 −6 ∈ I and √ √ (x − −6)(x + −6) = x2 + 6 ∈ I. √ √ √ From 2 −6 ∈ I we get −6(2 −6) = −12 ∈ I. If x is not divisible by 2 2 2 or 3 then neither is x + 6 and so gcd(x + 6, −12) = 1. This implies√ that I = h1i. Hence as long as x is not divisible by 2 and 3 then x + −6 and √ x − −6 are coprime. 3 If x is even, then by (∗) we have y ≡ 2 (mod 4) which is impossible.√ If 3 | x then we have y3 ≡ 3 (mod 9) which is impossible. Hence x + −6 √ √ and x − −6 are coprime and so x + −6 = J 3 for some ideal J by Lemma 5.2. The class-number of K is√ 2, so that by Lemma 5.1, J is a principal√ ideal. Write J = hβi. Then x + −6 = hβ3i. This means that x + −6 = ηβ3 √ where η is a unit of OK . But the only units of OK are ±1 so that√ x+ −6 = 3 3 3 ±β = (±β) = γ where γ = ±β ∈ OK . Write γ = a + b −6 where a, b ∈ Z. Then

x = a3 − 18ab2 and 1 = 3a2b − 6b3.

The latter equation is absurd as it implies that 3 | 1. Hence there are no integers x and y satisfying (∗).

6 Geometric methods

We can obtain results such as better bounds in Proposition 5.1, by regarding a number field K of degree n as a subset of Rn. For simplicity’s sake, we shall restrict ourselves to quadratic fields here. The general case is no harder in principle, but more complicated in detail. For further details see appropriate

46 textbooks. Alas, I shall not illustrate the arguments here with diagrams due to my inability to produce then in LATEX, but the reader should supply her or his own. √ Throughout this section, let K be a quadratic field. Then K = Q( m) for some squarefree integer m. We define a mapσ ¯ from K to R2 or to C according to whether K is real or imaginary.√ When K is imaginary√ we√ simply takeσ ¯(β) = β. When K is real, letσ ¯(a + b m) = (a + b m, a − b m). Of course we can regard R2 and C as being “the same” by letting (x, y) ∈ R2 correspond to x + iy in C. Write √  m if m 6≡ 1 (mod 4), τ = 1 √ 2 (1 + m) if m ≡ 1 (mod 4) so that OK = Z[τ]. The imageσ ¯(OK ) of OK underσ ¯ is now easily seen to equal {av1 + bv2 : a, b ∈ Z} where v1 =σ ¯(1) and v2 =σ ¯(τ).

Lemma 6.1 The vectors v1 =σ ¯(1) and v2 =σ ¯(τ) are linearly independent over R.

Proof In the case where K is imaginary, v1 = 1 ∈ C and v2 has nonzero imaginary part so clear v1 and v2 are linearly independent over√ R. √ Suppose that K is real. Then v1 = (1, 1) and v2 = ( m, − m) or 1 √ 1 √ ( 2 (1+ m), 2 (1− m)). The matrix with rows v1 and v2 is now seen to have nonzero determinant. This means that v1 and v2 are linearly independent over R.  2 It follows thatσ ¯(OK ) is a in R , that is a set of the form

Λ = {au1 + bu2 : a, b ∈ Z}

2 where u1 and u2 are vectors in R which are linearly independent over R. We need some basic results on lattices. We call u1 and u2 generators of the lattice Λ. The generators u1 and u2 determine a fundamental region

F = {ru1 + su2 : r, s ∈ R, 0 ≤ r, s < 1} of Λ. Each element of R2 is the sum of an element in Λ and in F.

n Lemma 6.2 Let Λ be a lattice in R with generators u1 and u2. Let F be 2 the fundamental region associated to u1 and u2. Each x ∈ R can be uniquely expressed as x = a + t where a ∈ Λ and t ∈ F.

47 2 Proof The vectors u1 and u2 form a basis for R . Write x = x1u1 + x2u2 where x1 and x2 are uniquely determined. If a = a1u1 + a2u2 ∈ Λ and t = t1u1 + t2u2 ∈ F then a1, a2 ∈ Z, t1, t2 ∈ [0, 1). Also x = a + t if and only if x1 = a1 +t1 and x2 = a2 +t2. This can only happen if aj is the integer part of xj (j = 1, 2) and then tj = xj − aj is its fractional part. Thus the representation x = a + t is uniquely determined.  Geometrically the fundamental region F is a parallelogram, and we define the area of the lattice Λ as the area of F. We can easily calculate the area of Λ given the generators u1 and u2.

2 Lemma 6.3 The area of the lattice Λ in R with generators u1 and u2 equals | det(U)|, where U is the matrix with rows u1 and u2.

Proof Let  u u  U = 11 12 u21 u22 2 2 be the matrix with rows u1 and u2. Consider the map ψ : R → R given by ψ(x) = xU. Then ψ is linear with matrix U and so it multiplies areas by | det(U)|. But F = ψ(A) where A = [0, 1) × [0, 1) is a square of area 1. Hence the area of F is | det(U)|.  We aim to show that various regions in the plane contain points of a lattice Λ if their area is big enough. Obviously a very wiggly region might be large but still miss all the points of Λ. We introduce a class of regions which are not too wiggly, namely convex regions. We say that a subset A of R2 is convex if for any points a, b ∈ A, the line segment joining a to b lies in A; more formally if a, b ∈ A and 0 < t < 1 then ta + (1 − t)b ∈ A. We aim to show that a sufficiently large and nice enough convex region containing the origin must contain some other point of the lattice Λ. We first need a technical result, an analogue of the .

Lemma 6.4 Let Λ be a lattice in R2, and let X be a bounded subset of R2 whose area is greater than the area of Λ. Then there are points u, v ∈ X with u 6= v but u − v ∈ Λ.

Proof Let u1 and u2 be generators for Λ and consider the fundamental region F associated to u1 and u2. Lemma 6.2 says essentially that the plane is tiled by translates of F by elements of Λ. More precisely R2 is the disjoint union of the sets a + F = {a + t : t ∈ F}

48 for a ∈ Λ. The set X is bounded and so meets only finitely many of these a + F. Also [ X = (a + F) ∩ X a∈Λ is a disjoint union. Hence X area(X ) = area((a + F) ∩ X ) a∈Λ where only a finite number of terms in the sum are nonzero. Consider the set (a + F) ∩ A. Its elements are a + t where t ∈ F and t ∈ {x − a : x ∈ X} = X − a using the obvious notation. Hence (a + F) ∩ A = a + (F ∩ (X − a)) and clearly area((a + F) ∩ A) = area(F ∩ (X − a)). Thus X area(X ) = area(F ∩ (X − a)). a∈Λ As area(X ) > area(F) by hypothesis, the sets F ∩ (X − a)) cannot all be disjoint, for then ! X [ area(F ∩ (X − a)) = area F ∩ (X − a) ≤ area(F) a∈Λ a∈Λ as this union is a subset of F. Hence there exist a, b ∈ Λ with a 6= b but (F ∩ (X − a)) ∩ (F ∩ (X − b)) 6= ∅. Thus there is x ∈ F with x + a ∈ X and x + b ∈ X . If we let u = x + a and v = x + b we get u, v ∈ X , u 6= v, and u − v = a − b ∈ Λ.  We can now prove the the theorem of Minkowski which has a host of applications to algebraic number theory. First one definition: a subset X of R2 is symmetric if x ∈ X implies −x ∈ X . Theorem 6.1 (Minkowski) Let Λ be a lattice in R2 and X be a bounded convex symmetric subset of R2. If area(X ) > 4 area(Λ) then there exists a nonzero a ∈ Λ ∩ X .

1 1 1 1 Proof Consider 2 X = { 2 c : x ∈ X }. Then area( 2 X ) = 4 area(X ) > area(Λ) 1 1 by hypothesis. Also 2 X is bounded. By Lemma 6.4 there exist u, v ∈ 2 X with a = u − v ∈ Λ and a 6= 0. Now (2u + (−2v)) a = 2

49 is the midpoint of the segment joining 2u to −2v. Now 2u, 2v ∈ X and as X is symmetric, −2v ∈ X . By convexity then a ∈ X . As a ∈ Λ and a 6= 0 the theorem follows.  We shall apply Minkowski’s theorem to Λ =σ ¯(OK ) for quadratic fields K, and more generally to Λ =σ ¯(I) for ideals I of OK . We first need to calculate the areas of these lattices. First we calculate area(¯σ(OK )).

Lemma 6.5 Let K be a quadratic field of discriminant ∆. Then √  ∆ if ∆ > 0, area(¯σ(OK )) = 1 p 2 |∆| if ∆ < 0. √ Proof Let K = Q( m) where m is a squarefree integer. We split into four cases according to the sign of m and whether m is congruent to 1 modulo 4 or not. √Suppose that m > 0 and m 6≡ 1 (mod 4). Then√ ∆ =√ 4m and√ OK = Z[ m]. Then u1 =σ ¯(1) = (1, 1) and u2 =σ ¯( m) = ( m, − m) are generators ofσ ¯(OK ). Then area(¯σ√(OK )) = | det(U)| where U √has rows√ u1 and u2. We calculate det(U) = −2 m and so area(¯σ(OK )) = 2 m = ∆. Now suppose√ that m > 0 and m ≡ 1 (mod 4). Then ∆ = m and O = Z[ 1 (1 + m)]. Then we can take u =σ ¯(1) = (1, 1) and u = K√ 2 √ √ 1 √ 2 σ¯( m) = ( 1 (1 + m), 1 (1 − m)), and we find that det(U) = − m. Hence 2 √2 √ again area(¯σ(OK )) = m = ∆. Now consider the case where m < 0 and m 6≡ 1 (mod 4). Then |∆| = 4|m| √ 2 and OK = Z[ m]. Here we need to identify C with R by identifying x + iy √ p with (x, y). Then u1 =σ ¯(1) = 1 = (1, 0) and u2 =σ ¯( m) = i |m| = p p (0, |m|) are generators ofσ ¯(√OK ). It is immediate that det(U) = |m| and √ 1 so so area(¯σ(OK )) = m = 2 ∆. Finally suppose that m < 0 and m ≡ 1 (mod 4). Then |∆| = |m| and 1 √ 1 √ OK = Z[ 2 (1+ m)]. Then u1 =σ ¯(1) = 1 = (1, 0) and u2 =σ ¯( 2 (1+ m)) = 1 (1 + ip|m|) = ( 1 , 1 p|m|) are generators ofσ ¯(O ). Then det(U) = 1 p|m| 2 2 2 √ K 2 1 √ 1 and so so area(¯σ(OK )) = 2 m = 2 ∆.  Now we calculate area(¯σ(I)).

Lemma 6.6 Let K be a quadratic field and I a nonzero ideal of OK . Then area(¯σ(I)) = N(I)area(¯σ(OK )).

Proof Let β1, β2 form an integral basis of OK and let γ1, γ2 form an integral basis of I. Then γj = aj1β1 + aj2β2 where the ajk ∈ Z. Then N(I) = |OK : I| = | det(A)| where A is the 2-by-2 matrix with entries ajk by Proposition A.4. Let uj =σ ¯(βj) and vj =σ ¯(γj)(j = 1, 2). Let F and

50 0 F be the fundamental regions ofσ ¯(OK ) andσ ¯(I) respectively corresponding to these generators. Then F 0 = φ(F) where φ is the linear transformation with matrix A. Hence area(F 0) = | det(A)|area(F) which gives area(¯σ(I)) = N(I)area(¯σ(OK )).  We now apply this to prove a major theorem of Minkowski. We define the Minkowski bound MK of a number field as follows. Write K = Q(α) where α has degree n. Consider the conjugates α1, . . . , αn of α. Let s denote the number of αj which are real. Since αj is also a conjugate of α for each j, the nonreal conjugates come in pairs. Hence there are 2t nonreal conjugates of α and s + 2t = n. The Minkowski bound of K is

 4 t n! M = p|∆ | K π nn K where ∆K denotes the discriminant of K. In particular when K is real 1 √ quadratic then MK = 2 ∆K and when K is imaginary quadratic then MK = 2 p π |∆K |. The following theorem is valid for all number fields, but we state and prove it only for quadratic fields.

Theorem 6.2 (Minkowski) Let K be a quadratic field. Each ideal class of K contains an ideal I of OK with N(I) ≤ MK .

Proof First take any nonzero ideal J of OK . Let Λ =σ ¯(J). We shall apply Minkowski’s theorem 6.1 to Λ and a suitable region X . To define X we split into the cases of K real and K imaginary. First suppose that K is imaginary. Let c be any positive number, and let X be the interior of the radius c centred at the origin. Note that the interior of a circle is convex; also X is symmetric. By Lemmas 6.5 and 6.6, 1 p 2 area(Λ) = 2 N(J) |∆K |. The area of X is πc . If area(X ) > 4 area(Λ), that 2 p is, if πc > 2N(J) |∆K | then there is a nonzero a ∈ Λ ∩ X by Theorem 6.1. Then a =σ ¯(β) with β ∈ J and |β| < c. Hence 0 < N(β) < c2. Hence if c2 > 2 p 2 π N(J) |∆K | = MK N(J) then there exists a nonzero β ∈ J with N(β) < c . If γ is a nonzero element of J with N(γ) as small as possible, then N(γ) < c2 2 for all c > MK N(J). Hence N(γ) ≤ MK N(J). Now by Proposition 5.2 each ideal class of K contains an ideal I with N(I) ≤ MK . The argument when K is real is similar. Again let c be any positive number. Then let X = {(x, y): |x| + |y| < c}. Then X is the interior of a square with vertices (±c, 0) and (0, ±c). Again, note that the interior of a square is convex; also X is symmetric. This time,

51 p 2 by Lemmas 6.5 and 6.6, area(Λ) =√N(J) |∆K |. The area of X is 2c as the side of the square has length 2c. If area(X ) > 4 area(Λ), that is, if 2 p 2c > 4N(J) |∆K | then there is a nonzero a ∈ Λ ∩ X by Theorem 6.1. Then a =σ ¯(β) = (β, σ2(β)) with β ∈ J. Now

|β| + |σ2(β)| < c and |N(β)| = |β||σ2(β)|. By the inequality of the arithmetic and geometric means, |β| + |σ (β)| 2 ≥ p|β||σ (β)| 2 2 and so p|N(β)| < c/2. 2 2 p Hence 0 < |N(β)| < c /4. Hence if c > 2πN(J) |∆K | = 4MK N(J) then there exists a nonzero β ∈ J with N(β) < c2/4. If γ is a nonzero element of J 2 2 with |N(γ)| as small as possible, then |N(γ)| < c /4 for all c /4 > MK N(J). Hence |N(γ)| ≤ MK N(J). Again by Proposition 5.2 each ideal class of K contains an ideal I with N(I) ≤ MK .  We can now compute class-groups more efficiently. √ √ 2 Example Let K = Q( −6). Then ∆K = π 24 ≈ 3 · 1 so that ClK is generated by the prime ideals of norm at most 3. We have seen that the only √ √ such ideals are P2 = 2, −6 and P3 = 3, −6 , that P2 is not principal, 2 2 but that P2 , P2P3 and P3 all are. Then it is easy to see that the class-number is 2. This took a great deal less effort than our previous computation! √ Example√Let K = Q( −47). Then ∆K = −47 and OK = Z[τ] where τ = 1 (1 + −47) and τ has minimum polynomial f(X) = X2 − X + 12. We 2 √ 2 have MK = π 47 ≈ 4 · 4 so that the class-group is generated by the prime ideals of norm at most 4. We now need to find the prime ideal factors of h2i and h3i. As f(X) ≡ X2 − X = X(X − 1) (mod 2) and f(X) ≡ X2 − X = X(X − 1) (mod 3) then h2i = P2Q2 and h3i = P3Q3 where P2 = h2, τi, Q2 = h2, τ − 1i, P3 = −1 −1 h3, τi and Q3 = h3, τ − 1i. Naturally [Q2] = [P2] and [Q3] = [P3] , and we attempt to find relations between [P2] and [P3] by considering principal ideals of small norm. First of all N(τ) = 12. Also τ ∈ P2 and τ ∈ P3 so that 2 P2P3 ⊇ hτi. Hence either hτi = P2 P3 or P2Q2P3 but the latter is impossible 2 −2 as h2i = P2Q2 ⊇ P2Q2P3 but 2 - τ. Hence hτi = P2 P3 and so [P3] = [P2] . 1 √ It follows that ClK is generated by [P2]. If we try τ + 1 = 2 (3 + −47)

52 then N(τ + 1) = 14 so hτ + 1i has a prime ideal factor of norm 7, which we 1 √ haven’t considered. We get better luck with τ + 2 = 2 (5 + −47). This time N(τ + 2) = 18. Also τ + 2 ∈ P2 and τ + 2 = (τ − 1) + 3 ∈ Q3 and so 2 hτ + 2i = P2P3Q3 or P2Q3. The former is impossible as P3Q3 = h3i does not 2 divide hτ + 2i. Hence hτ + 2i = P2Q3 and

−2 2 −4 [P2] = [Q3] = [P3] = [P2]

5 so that [P2] = [h1i]. We have not determined whether P2 is principal or not. If P2 is principal— 1 √ P2 = hγi where γ = 2 (b + c −47) with b, c ∈ Z—then N(γ) = 2. Hence b2 + 47c2 = 8 and it is apparent that there are no solutions in integers to this equation. Hence [P2] = [h1i] and so [P2] has order 5 in the class-group. The class-number of K is 5. 5 It is an instructive exercise to find generators for the principal ideals P2 , 5 5 5 Q2, P3 and Q3.

We finish this section by showing that U(OK ) is an infinite group when- ever K is a real quadratic field. We first show that there are arbitrarily large numbers in OK with small norm.

Lemma 6.7 Let K be a real quadratic field with discriminant ∆. Let A be any positive number with A > p|∆|. Then for each positive number M, there exists β ∈ OK with β > M and |N(β)| < A.

Proof Apply Minkowski’s theorem 6.1 to the following set

X = {(x, y): |x| < AM, |y| < 1/M}.

Then X is a convex symmetric set (the interior√ of a rectangle centred at the origin) with area 4A. Butσ ¯(OK ) has area ∆ by Lemma 6.5. Thus area(X ) > 4 area(¯σ(OK )) and so by Theorem 6.1 there exists a nonzero β ∈ OK withσ ¯(β) ∈ X . By replacing β by −β if necessary, we may assume that β > 0. Nowσ ¯(β) = (β, σ2(β)) and N(β) = βσ2(β). As |σ2(β)| < 1/M and |N(β)| ≥ 1 then |β| > M.  We can now show that OK has a unit ξ > 1.

Proposition 6.1 Let K be a real quadratic field. There exists ξ ∈ U(OK ) with ξ > 1.

Proof Let A be any number satisfying the conditions of Lemma 6.7. Define a sequence β0, β1, β2,... of elements of OK as follows. Let β0 = 1. Suppose that βn has been defined. Then choose βn+1 to be some number in OK

53 with βn+1 > βn and |N(βn+1)| < A. Consider the ideals hβ0i , hβ1i , hβ2i ,.... Each of these ideals has norm less than A, and so by Lemma 4.10 only a finite number of different ideals can occur. Hence there exist j < k with hβji = hβki. Thus βk = ξβj where ξ ∈ U(OK ), and ξ > 1 as βk > βj.  In fact the structure of the unit group of OK is easy to determine.

Theorem 6.3 Let K be a real quadratic field. There exists η ∈ OK such j that η > 1 and such that every unit in OK has the form ±η where j ∈ Z.

Proof By Proposition 6.1 there exists ξ ∈ U(OK ) with ξ > 1. I claim that there are only finitely many units ε in OK with 1 < ε ≤ ξ. For such an ε, 1 √ 1 √ 1/ε ∈ OK . Let ε = 2 (a+b m) where a, b, m ∈ Z, then 1/ε = ± 2 (a−b m). Consequently a = ε ± 1/ε. As 0 < 1/ε < 1 then 0 < a < ξ + 1. There are only a finite number of possibilities for a, and as a determines b up to sign, only finitely many possibilities for ε. We let η be the smallest unit with 1 < η ≤ ξ; this exists as there is at least one such unit, namely ξ, and there are only finitely many such units. Then there are no units ε with 1 < ε < η. Let δ be any unit. Consider log |δ|. There exists an integer j such that

j log η ≤ log |δ| < (j + 1) log η and so 0 ≤ log |δ|η−j < log η. −j −j −j j Now |δ|η is a unit and 1 ≤ |δ|η < η. Thus |δ|η = 1 and so δ = ±η .  It is easily seen that the unit η in this theorem is uniquely determined by the field K, as it is called the fundamental unit of K. √ √ Example Let K = Q( 2). Then η = 1+ 2 is a unit in OK , and obviously η > 1. I claim that it is√ the fundamental unit. If it were not, then there√ would be a unit ξ = a + b 2 of OK with 1 < ξ < η. Now 1/ξ = ±(a√− b 2) and so 2a = ξ ± 1/ξ. As 0 < 1/ξ < 1 then 0 < 2a < η + 1 = 2 + 2 < 5. Thus a = 1 or 2. Of course a = 1 gives b = ±1 and for ξ > 1 we need b = 1, that is ξ = η which is false. But a = 2 gives 4 − 2b2 = ±1 which is impossible. Thus η is the fundamental unit.

We√ shall not give details on how to find the fundamental unit. If ξ = a + b m with a, b ∈ Z then ξ is a unit if and only if

a2 − mb2 = 1. (∗)

The equation (∗) is called Pell’s equation and methods for finding its solution using continued fractions can be found in texts on elementary number theory.

54 Of course, not all units are of this form, but for any given unit ξ, ξ6 does have this form and so these methods can be used to determine the fundamental unit of a given real quadratic field. Example The method of solving Pell’s equation shows that the first non- trivial solution of a2 − 61b2 = 1 is a = 1766319049, b = 226153980. √ Thus the smallest unit of the form ξ = a + b 61 with a, b ∈ Z ξ > 1 and N(ξ) = 1 is √ 1766319049 + 226153980 61. √ However, this is not the fundamental unit of Q( 61), since we find √ ξ1/2 = 29718 + 3805 61 which is a unit of norm −1. Again, this is still√ not the fundamental√ unit. Since 61 ≡ 1 (mod 4) the ring of integers of Q( 61) is Z[ 1 (1 + 61)] and √ √ 2 1 there may be units of 61 of the form 2 (c+d 61) with c and d odd integers. Indeed we find that √ 39 + 5 61 η = ξ1/6 = (ξ1/2)1/3 = . 2 Now it is not hard to check that η really is the fundamental unit of this quadratic field. √ In general given K = Q( m) with m positive and squarefree, the size of the fundamental unit varies very irregularly with m; some fields have small fundamental units, while others have astronomical fundamental units.

55 A Appendix

A.1 Polynomials If R is a (commutative) ring then R[X] denotes the ring of polynomials in 2 n the variable X with coefficients in R. If f = a0 + a1X + a2X + ··· + anX with an 6= 0 then n is the degree of f, an is the leading coefficient of f and n anX is the leading term of f. The degree of f is denoted deg(f). These concepts are not defined for the zero polynomial. A monic polynomial is one whose leading coefficient is 1. If R is an , then deg(fg) = deg(f) + deg(g) whenever f and g are nonzero elements of R[X]. In particular fg 6= 0 and so R[X] is also an integral domain.

Proposition A.1 (The division algorithm) Let K be a field, and let f, g ∈ K[X] with g 6= 0. There exist unique q, r ∈ K[X] with

• f = gq + r, and

• either r = 0 or deg(r) < deg(g). 

Let f, g ∈ K[X]. We say that f divides g (or g is divisible by f) if g = fh for some h ∈ K[X].

Proposition A.2 (Greatest common ) Let K be a field, and let f and g be nonzero elements of K[X]. There is a unique monic polynomial h such that

• h | f and h | g

• if q ∈ K[X] and q | f and q | g then q | h.

In addition there exist u, v ∈ K[X] with h = uf + vg. 

Let K be a field and f ∈ K[X] have positive degree. We say that f is irreducible over K if there are no g, h ∈ K[X] with f = gh and deg(g), deg(h) < deg(f).

Theorem A.1 (Unique factorization) Let K be a field and let f ∈ K[X] be a monic polynomial of positive degree. Then there are monic polynomials p1, p2, . . . , pk ∈ K[X], each irreducible over K, such that f = p1p2 ··· pk. Furthermore the pj are determined, up to order, uniquely by f. 

56 A.2 Symmetric polynomials

Let R be a ring. Then R[T1,...,Tk] denotes the ring of polynomials in the n indeterminates T1,...,Tk with coefficients in R. (In our applications R will usually be Z or Q so you may think of R as being one of these rings if you prefer). A polynomial f ∈ R[T1,...,Tn] is symmetric if it is left invariant under any permutation of the indeterminates. More precisely f is symmetric if f(T1,T2,...,Tn) = f(Tσ(1),Tσ(2),...,Tσ(n)) for all permutations σ in the symmetric group Sn. For example when n = 3,

2 2 2 2 2 2 f1(T1,T2,T3) = T1 T2 + T1 T3 + T2 T1 + T2 T3 + T3 T1 + T3 T2 is symmetric since

f1(T1,T2,T3) = f1(T1,T3,T2) = f1(T2,T1,T3)

= f1(T2,T3,T1) = f1(T3,T1,T2) = f1(T3,T2,T1) but 2 2 2 f2(T1,T2,T3) = T1 T2 + T2 T3 + T3 T1 is not symmetric, as although f2(T1,T2,T3) = f2(T2,T3,T1) = f2(T3,T1,T2) we have

2 2 2 f2(T1,T3,T2) = T1 T3 + T2 T1 + T3 T2 6= f2(T1,T2,T3). The elementary symmetric functions are defined as follows. Write

n n X n−r (X + T1)(X + T2) ··· (X + Tn) = X + er(T1,T2,...,Tn)X . r=1 Then X er(T1,T2,...,Tn) = Tj1 Tj2 ··· Tjr

1≤j1

e1(T1,T2,...,Tn) = T1 + T2 + ··· + Tn,

e2(T1,T2,...,Tn) = T1T2 + T1T3 + ··· + T1Tn + T2T3 + ··· + Tn−1Tn and en(T1,T2,...,Tn) = T1T2 ··· Tn. Each symmetric function can be expressed in terms of elementary sym- metric polynomials.

57 Theorem A.2 (Newton) Let R be a ring, and let f ∈ R[T1,...,Tn] be a symmetric polynomial. Then there is a polynomial g ∈ R[X1,...,Xn] with the property that f(T1,...,Tn) = g(E1,...,En) where Er denotes er(T1,...,Tn). Proof (outline) Order the terms of f “lexicographically”, that is write the i1 in j1 jn term aT1 ··· Tn ahead of bT1 ··· Tn when ik > jk for the least k where ik 6= i1 in jk. Let aT1 ··· Tn be the first term of f in lexicographic ordering. Then since i1 in aTσ(1) ··· Tσ(n) is also a term in f, as f is symmetric, we have i1 ≥ i2 ≥ · · · ≥ i1 in i1−i2 i2−i3 in−1−in in in. But also aT1 ··· Tn is the first term of h1 = aE1 E2 ··· En−1 En in lexicographical order. Then either f − h1 is zero, or its first term is i1 in later than aT1 ··· Tn in lexicographical ordering. Repeating this argument for h − h1 then yields h2, a in the Ej, with either f − h1 − h2 = 0 or f − h1 − h2 = 0 having first term even later in the lexicographical ordering. Eventually we get f = h1 + h2 + ··· + hk where each hi is a monomial in the Ej.  Example The method of proof gives an algorithm that computes g given f. For instance let

3 3 3 3 3 3 f(T1,T2,T3) = T1 T2 + T1 T3 + T1T2 + T1T3 + T2 T3 + T2T3 . 3 These terms have been written in lexicographic order. The first term T1 T2 is also the first term in

2 2 E1 E2 = (T1 + T2 + T3) (T1T2 + T1T3 + T2T3) 3 3 2 2 2 2 2 3 = T1 T2 + T1 T3 + 2T1 T2 + 5T1 T2T3 + 2T1 T3 + T1T2 2 2 3 3 2 2 3 +5T1T2 T3 + 5T1T2T3 + T1T3 + T2 T3 + 2T2 T3 + T2T3 . Then

2 2 2 2 2 2 2 2 2 2 f − E1 E2 = −2T1 T2 − 5T1 T2T3 − 2T1 T3 − 5T1T2 T3 − 5T1T2T3 − 2T2 T3 . 2 2 The first term of this is −2T1 T2 which is the same as that of 2 2 2 2 2 2 2 2 2 2 −2E2 = −2T1 T2 − 4T1 T2T3 − 2T1 T3 − 4T1T2 T3 − 4T1T2T3 − 2T2 T3 . Then 2 2 2 2 2 2 f − E1 E2 + 2E2 = −T1 T2T3 − T1T2 T3 − T1T2T3 . The first term of this expression agrees with that of

2 2 2 f − E1E3 = −T1 T2T3 − T1T2 T3 − T1T2T3 2 2 and so f = E1 E2 − 2E2 − E1E3.

58 A.3 Free abelian groups In this section all groups are written with addition as the operation. Natu- rally, the identity element of each of our groups will be denoted as 0 and the inverse of an element u as −u. The set Zn of all n- of integers will be our main example. Its ele- ments are row vectors of integers and it forms a group under vector addition. A group G is called a free abelian group of rank n if it is isomorphic to Zn. This means that G has an integral basis or Z-basis: there is a sequence u1, . . . , un of elements of G such that each element of G can be uniquely expressed in the form a1u1 + ··· + anun with a1, . . . , an ∈ Z. The rank of a free abelian group is uniquely determined. Each abelian group G has a subgroup 2G = {u + u : u ∈ G} and when G is free abelian of rank n, then the index |G : 2G| = 2n. We wish to show that each subgroup of a free abelian group G is also a free abelian group. It suffices to prove this for G = Zn. To warm up, consider the case where n = 1. If H is a subgroup of Z then either H = {0} or H contains some positive number. The group {0} is free abelian of rank 0, but if H 6= {0} let b be the smallest positive integer in H. Then b (on its own) forms an integral basis of H. For if c ∈ H there is a ∈ Z with 0 ≤ c − ab < b. Then as c − ab ∈ H we must have c − ab = 0. Thus c = ab and a = c/b is uniquely determined. We now consider the general result.

Proposition A.3 Let G be a free abelian group of rank n, and let H be a subgroup of G. Then H is free abelian of rank m where m ≤ n.

Proof We may assume that G = Zn. We begin by defining various sub- groups H1,...,Hn of H. We let H1 = H and for j > 1 let

Hj = {(0, 0,..., 0, cj, cj+1, . . . , cn): cj, cj+1, . . . , cn ∈ Z} ∩ H.

That is Hj is the set of all vectors in H where the first j − 1 entries vanish. We also let Kj = {cj : (0, 0,..., 0, cj, cj+1, . . . , cn) ∈ Hj} for each j. That is, Kj is the set of j-th entries of vectors in Hj. It is easy n to see that Hj is a subgroup of H (and so of Z ) and consequently that Kj is a subgroup of Z. Then Jj = Zbj = {abj : a ∈ Z} for some integer bj. Define vectors u1, . . . , un ∈ H as follows. If bj = 0 let uj be the zero vector, otherwise let uj be any vector in Hj whose j-th entry is bj. We claim that those uj which are nonzero form an integral basis of H. Given this claim, it is immediate that H is free abelian of rank m, where m is the number of nonzero uj, and so m ≤ n.

59 Pn We first show that each element of H has the form j=1 ajuj with uj ∈ Z. To do this note that if v = (0,..., 0, cj, . . . , cn) ∈ Hj then cj = ajbj with aj ∈ Z and so v − ajuj ∈ Hj+1 if j < n and v − ajuj = 0 if j = n. Start with any v ∈ H = H1. Then there exists a1 ∈ Z with v − a1u1 ∈ H2. Then there exists a2 ∈ Z with v−a1u1 −a2u2 ∈ H3 etc. Eventually we find a1, . . . , an ∈ Z Pn with v − a1u1 − a2u2 − · · · − anun = 0 so that v = j=1 ajuj. In this sum we may discard the uj which vanish, so each v ∈ H is a linear combination, with integer coefficients, of the nonzero uj. We now show uniqueness. Define an n-by-n matrix U with the j-th row being uj. Let ujk denote the typical entry. Then U is upper-triangular: ujk = 0 when k < j. Also ujj = bj and the j-th row of U is zero whenever Pn ujj = 0. To show that H is free abelian we need to show that v = j=1 ajuj uniquely determine the aj for the j where uj is nonzero. We proceed by induction on j; suppose that ak for the k < j with uk 6= 0 are uniquely Pj−1 determined by v. Then v uniquely determines k=1 akuk and so also v − Pj−1 Pn k=1 akuk = ajvj + r=j+1 ajvj. But the j-th entry of this vector is ajbj, and as long as bj = 0 this determines aj uniquely.  Example Let n = 3 and

H = {(r, s, t) ∈ Z3 : 3r + 5s + 7t = 0 and s is even}.

3 It is swiftly verified that H is a subgroup of Z . Of course H1 = H and we find that (1, −2, 1) ∈ H1. It follows that K1 = Z and we may take b1 = 1 and u1 = (1, −2, 1). Now

3 H1 = {(0, s, t) ∈ Z : 5s + 7t = 0 and s is even}.

If (0, s, t) ∈ H2 then s is even and 5s = −7t ≡ 0 (mod 7). It follows that s ≡ 0 (mod 7) and so s is divisible by 14. Hence K2 ⊆ 14Z. But as (0, 14, −10) ∈ H2, then K2 = 14Z and we may take b2 = 14 and u2 = (0, 14, −10). Finally

3 H3 = {(0, 0, t) ∈ Z : 7t = 0} = {(0, 0, 0)} so that b3 = 0 and u3 = (0, 0, 0). It follows that H has rank 2 and that each element of H can be written uniquely as a1u1 + a2u2 for a1, a2 ∈ Z. For example v = (19, 4, −11) ∈ H. To have v = au1 + vu2 we must have, on comparing the first coordinate, 19 = a1 so that a2u2 = v − 19a1 = (0, 42, −30). Thus a2 = 3 so that v = 19u1 + 3u2. n In general given an integral basis u1, . . . , um of a subgroup H of Z , we define an m-by-n matrix U having the uj as rows. We call U a generator matrix of H. In the language of matrix theory, this means that the elements

60 of H are the vectors v = aU where a ∈ Zm, and that each v ∈ H determines the vector a uniquely. A square matrix is called unimodular if it has integer entries and deter- minant ±1. Equivalently a matrix A is unimodular if it is nonsingular and both A and A−1 have integer entries.

Lemma A.1 Let H be a subgroup of Zn of rank m with generator matrix U. The m-by-n matrix V is also a generator matrix of M if and only if V = AU where A is an m-by-m unimodular matrix.

Proof Let V be a generator matrix of H. Let its rows be v1, . . . , vm while Pm let the rows of U be u1, . . . , um. There are integers ajk with vj = k=1 ajkuk. It follows that V = AU where A is the matrix with entries ajk. Similarly U = BV where B is also a matrix with integer entries. Hence U = BAU = Pm CU where C = BA has integer entries. We get uj = k=1 cjkuk so that by uniqueness, cjj = 1 and cjk = 0 for j 6= k. Hence C = BA = I and so A is unimodular. Now let A be unimodular with inverse B, so that B also has integer entries. Let V = AU. Let H0 = {wV : w ∈ Zm} be the subgroup of Zn generated by the rows of V . Then wV = wAU ∈ H so that H0 ⊆ H. Similarly if w0U ∈ H, then w0U = w0BAUV = w0BV ∈ H0 so that H0 = H. If the rows of V did not form an integral basis then wV = 0 for some nonzero w ∈ Zm. Then 0 = wABV = wAU so that wA = 0. Consequently w = wAB = 0, a contradiction.  In the case where m = n the matrix U is square, and so has a determinant. This determinant has a group-theoretical interpretation.

Proposition A.4 Let U be an n-by-n matrix with integer entries. The rows of U form an integral basis of a subgroup H of Zn if and only if det(U) 6= 0. In this case H has rank n and |Zn : H| = | det(U)|.

Proof If the matrix U is singular, then xU = 0 for some nonzero x ∈ Qn. By multiplying the vector x by the product of the denominators of its entries, we get a nonzero y ∈ Zn with yU = 0. Then the rows of U cannot form an integral basis for any subgroup of Zn. Suppose that U is nonsingular and let H = {xU : x ∈ Zn}. Then H is a subgroup of Zn and the rows of u form an integral basis of H since xU = 0 implies x = 0 due to U’s nonsingularity. Hence H is free abelian of rank n. Now assume that U is nonsingular. Then H has a generating matrix V which is upper triangular by the proof of Proposition A.3. By Lemma A.1,

61 V = AU with A unimodular, so that det(V ) = det(A) det(U) = ± det(U). Let V have diagonal entries b1, . . . , bn. Then I claim that the set of vectors

A = {(a1, . . . , an) : 0 ≤ aj < |bj|, for each j} is a set of coset representatives for H in Zn. To prove this we need to show that each x ∈ Zn is congruent modulo H to a unique a ∈ A. Given such an x consider the first entry of x − tv1 for t ∈ Z. This entry equals x1 − tb1. There is a unique t = t1 ∈ Z with 0 ≤ a1 = x1 − tb1 < |b1|. Now consider x − t1v1 − tv2. The first entry is still a1, and there is a unique t = t2 making the second entry a2 satisfy 0 ≤ a2 < b2. Proceeding in this manner we get Pn unique t1, t2, . . . , tn such that a = x − k=1 tjvj satisfy 0 ≤ aj < |bj| for each j.  Example Suppose that n = 3 and H has the generator matrix

 3 −2 1   0 4 −3  . 0 0 5

Let x = (17, −11, 13). We wish to find integers t1, t2 and t3 with x − t1v1 − t2v2 − t3v3 = (a1, a2, a3) and 0 ≤ a1 < 3, 0 ≤ a2 < 4 and 0 ≤ a3 < 5. Then a1 = 17 − 3t1 so that we must have t1 = 5 and a1 = 2. Then x − t1v1 = (2, −1, 8). Then a2 = −1 − 4t2 so that we must have t2 = −1 and a2 = 3. Then x−t1v1 −t2v2 = (2, 3, 5). Then a3 = 5−5t3 so that we must have t3 = 1 and a3 = 0. Thus x − 5v1 + v2 − v2 = (2, 3, 0) ∈ A, and these coefficients are uniquely determined. Proposition A.4 implies that each rank n subgroup of Zn has finite index. In fact the converse is true: each finite index subgroup of Zn has rank n. Equivalently each subgroup of Zn of rank less than n has infinite index.

Lemma A.2 Let H be a rank m subgroup of Zn with m < n. Then H has infinite index in Zn.

Proof Take a generator matrix U for H. As it has m rows and m < n its rows fail to span the Q-vector space Qn. There must be some vector ej = (0, 0,..., 0, 1, 0,..., 0) (with the unique 1 in the j-th position) not in the span of the rows. Hence kej ∈/ H for any nonzero integer k. It follows n that the image of ej in the quotient group Z /H has infinite order. Hence n Z /H has infinite order. 

62 A.4 The Vandermonde determinant

For variables X1,X2,...,Xn the Vandermonde determinant is defined as

1 1 1 ··· 1

X1 X2 X3 ··· Xn

X2 X2 X2 ··· X2 1 2 3 n V (X1,X2,...,Xn) = X3 X3 X3 ··· X3 . 1 2 3 n ...... n−1 n−1 n−1 n−1 X1 X2 X3 ··· Xn

Proposition A.5 For variables X1,X2,...,Xn we have Y V (X1,X2,...,Xn) = (Xk − Xj). 1≤j

Proof We use elementary row operations on the determinant. Starting at the bottom row and continuing upwards until one reaches the second row, subtract X1 times the preceding row from each row. This does not alter the determinant so that

V (X1,X2,...,Xn)

1 1 1 ··· 1

0 X2 − X1 X3 − X1 ··· Xn − X1

0 X (X − X ) X (X − X ) ··· X (X − X ) 2 2 1 3 3 1 n n 1 = 0 X2(X − X ) X2(X − X ) ··· X2(X − X ) 2 2 1 3 3 1 n n 1 ...... n−2 n−2 n−2 0 X2 (X2 − X1) X3 (X3 − X1) ··· Xn (Xn − X1)

X2 − X1 X3 − X1 ··· Xn − X1

X (X − X ) X (X − X ) ··· X (X − X ) 2 2 1 3 3 1 n n 1 X2(X − X ) X2(X − X ) ··· X2(X − X ) = 2 2 1 3 3 1 n n 1 ...... n−2 n−2 n−2 X2 (X2 − X1) X3 (X3 − X1) ··· Xn (Xn − X1) Taking out the common factors from each column we get

n Y V (X1,X2,...,Xn) = V (X2,X3,...,Xn) (Xk − X1) k=2 from which the stated formula follows by induction. 

63 A.5 Euclidean domains Recall that an integral domain is a (commutative) ring in which the product of two nonzero elements is always nonzero. Let R be an integral domain. Recall that we write a | b for a, b ∈ R if b = ac for some c ∈ R. A Euclidean function φ on an integral domain R is a function φ : R\{0} → Z with the properties

• φ(a) ≥ 0 for all a ∈ R, a 6= 0, and

• if a, b ∈ R with a 6= 0 then either a | b or there is c ∈ R with φ(b − ac) < φ(a).

An integral domain R is a Euclidean domain if there is a Euclidean function on R.

Proposition A.6 Let R be a Euclidean domain. Each ideal in R is princi- pal.

Proof Let φ be a Euclidean function on R. Suppose that I is a nonzero ideal of R. Let a be a nonzero element of I with φ(a) minimal. Clearly hai ⊆ I. If equality did not hold there would be b ∈ I with a - b. By the Euclidean property there is c ∈ R with φ(b − ac) < φ(a). But b − ac ∈ I so this contradicts the choice of a as minimizing φ(a). Hence I = hai is principal. 

64