<<

Chapter 2

Geometry of

Literature: W.M. Schmidt, Diophantine approximation, Lecture Notes in 785, Springer Verlag 1980, Chap.II, §§1,2, Chap. IV, §1 J.W.S. Cassels, An Introduction to the of Numbers, Springer Verlag 1997, Classics in Mathematics series, reprint of the 1971 edition C.L. Siegel, Lectures on the Geometry of Numbers, Springer Verlag 1989

2.1 Introduction

Geometry of numbers is concerned with the study of points in certain bodies n in R , where n > 2. We discuss Minkowski’s theorems on lattice points in central symmetric convex bodies. In this introduction we give the necessary definitions.

Lattices. A (full) lattice in Rn is an additive group

L = {z1v1 + ··· + znvn : z1, . . . , zn ∈ Z}

n where {v1,..., vn} is a basis of R , i.e., {v1,..., vn} is linearly independent.

We call {v1,..., vn} a basis of L. The of L is defined by

d(L) := | det(v1,..., vn)|, that is, the absolute value of the determinant of the matrix with columns v1,..., vn.

7 We show that the determinant of a lattice does not depend on the choice of the basis. Recall that GL(n, Z) is the multiplicative group of n×n-matrices with entries in Z and determinant ±1.

Lemma 2.1. Let L be a lattice, and {v1,..., vn}, {w1,..., wn} two bases of L. Then there is a matrix U = (uij) ∈ GL(n, Z) such that

n X (2.1) wi = uijvj for i = 1, . . . , n. j=1

Consequently, | det(v1,..., vn)| = | det(w1,..., wn)|.

Proof. Let U be the matrix relating v1,..., vn to w1,..., wn, that is, the matrix given by (2.1). A priori, U is just a non-singular matrix, but since w1,..., wn lie in the lattice generated by v1,..., vn, it must have its entries in Z. −1 ij Pn ij Let U = (u ). Then from linear algebra we know that vi = j=1 u wj for −1 i = 1, . . . , n. Now U has its entries in Z since v1,..., vn lie in the lattice generated −1 by w1,..., wn. Since both det U and det U are integers and must be multiplicative inverses of one another, we have det U = ±1, i.e., U ∈ GL(n, Z). Finally, we observe that

| det(w1,..., wn)| = | det U| · | det(v1,..., vn)| = | det(v1,..., vn)|.

n Let L, M be two lattices in R with M ⊆ L. Choose bases {v1,..., vn} of L, {w1,..., wn} of M. Let U = (uij) be the matrix relating v1,..., vn to w1,..., wn. Then U has its entries in Z. We define the index of M in L by [L : M] := | det U|.

The relation det(w1,..., wn) = det U · det(v1,..., vn) easily translates into

d(M) = [L : M] · d(L) and this shows that [L : M] does not depend on the choices of the bases of L and M.

Convex bodies. Recall that a subset C of Rn is convex if for any two points

8 x, y ∈ C, also the line segment connecting them, i.e., {λx + (1 − λ)y : 0 6 λ 6 1}, is contained in C.A central symmetric convex body in Rn is a closed, bounded, convex subset C of Rn having 0 as an interior point, and which is symmetric about 0, i.e. if x ∈ C then also −x ∈ C.

Examples. (i). If C is a central symmetric convex body, then so is its dilation with factor λ > 0, λC := {λx : x ∈ C}.

(ii). If C is a central symmetric convex body and φ a linear transformation of Rn, then φ(C) is also a central symmetric convex body. (iii). Parallelepipeds and ellipsoids: Let

n n 2 2 Kn = {x ∈ R : max |xi| 6 1},Bn = {x ∈ R : x1 + ··· + xn 6 1} 16i6n be the n-dimensional unit cube, and the n-dimensional Euclidean unit ball, respec- n n tively, where x = (x1, . . . , xn) ∈ R . A (symmetric) parallelepiped in R is the n n image of Kn under a linear transformation of R , and a (symmetric) ellipsoid in R n the image of Bn under a linear transformation of R . Both are central symmetric convex bodies.

n n Exercise 2.1. Recall that a norm on R is a function k · k : R → R>0 such that • kλxk = |λ| · kxk for all x ∈ Rn, λ ∈ R; n • kx + yk 6 kxk + kyk for all x, y ∈ R ; • kxk = 0 ⇐⇒ x = 0. n Prove that the unit ball Bk·k = {x ∈ R : kxk 6 1} is a central symmetric convex body. Hint. You may use that all norms on Rn induce the usual topology, that is, a subset n U of R is open if and only for every x0 ∈ U there is δ > 0 such that n {x ∈ R : kx − x0k < δ} ⊆ U.

In fact, every central symmetric convex body arises from a norm. On Rn we define the function

kxkC := min{λ ∈ R>0 : x ∈ λC}.

9 Lemma 2.2. (i) k · kC is well defined. n (ii) k · kC defines a norm on R . n (iii) λC = {x ∈ R : kxkC 6 λ} for λ > 0.

Proof. (i). Clearly, k0kC = 0. Suppose x 6= 0. Consider the set

S := {ξx : ξ ∈ R>0, ξx ∈ C}.

Then S 6= {0}. For since 0 is an interior point of C there is δ > 0 with δx ∈ C. Further, since C is compact and S is a closed subset of C, the set S is compact.

S is homeomorphic to I = {ξ ∈ R>0 : ξx ∈ C}. Hence I is compact. So I has a maximum, call it µ0. Then µ0 > 0 since S 6= {0}. −1 −1 We claim that kxkC = µ0 . First µ0x ∈ C, hence x ∈ µ0 C. Further, if x ∈ λC, −1 −1 −1 then λ x ∈ C hence λ 6 µ0. So λ > µ0 . This proves our claim, hence (i). −1 (ii). We have shown above that kxkC = µ0 > 0 if x 6= 0. The proofs of the other two norm properties are left to the reader. (iii). Left to the reader.

Exercise 2.2. Prove (ii) and (iii).

2.2 Minkowski’s first convex body theorem

Using Lebesgue theory, one can define an n-dimensional volume vol(S) ∈ R>0 ∪{∞} (the so-called Lebesgue measure) for subsets S of Rn from a large class, the so-called measurable subsets of Rn. We do not need the precise definition of Lebesgue mea- sure or measurable set. What is important to us is that all open sets and all closed sets are measurable, bounded measurable sets have finite volume, and the empty R set has volume 0. The volume of S is equal to the Riemann integral S dx1 ··· dxn for every set S for which this integral is defined. However, there are measurable sets S for which the Riemann integral is not defined. We mention some important properties of the volume:

1. Let S be a measurable subset of Rn. Then every translate a+S := {a+x : x ∈ S} is also measurable and vol(a + S) = vol(S). Further, if φ is a linear transformation of Rn, then φ(S) is measurable and vol(φ(S)) = | det φ| · vol(S).

10 2. Let S ⊂ Rn be measurable. Then Sc = Rn \ S is measurable. n 3. Let Sn (n = 1, 2, 3,...) be a countable collection of measurable subsets of R . S∞ Then S = n=1 Sn is measurable. Moreover, if the sets Sn are pairwise disjoint, P∞ then vol(S) = n=1 vol(Sn).

Theorem 2.3. (Minkowski’s first convex body theorem, 1896). Let C be a central symmetric convex body in Rn and L a lattice in Rn of rank n. Suppose that n vol(C) > 2 d(L). Then C contains a point from L \{0}.

Choose a basis {v1,..., vn} of L. We call

F := {x1v1 + ··· + xnvn : xi ∈ R, 0 6 xi < 1 for i = 1, . . . , n} a fundamental parallelepiped for L. Notice that F has volume d(L), the translates u + F (u ∈ L) are pairwise disjoint, and

n [ R = (u + F ). u∈L

n 1 1 Proof. We first assume that vol(C) > 2 d(L). Then the set 2 C = { 2 x : x ∈ C} has 1 volume > d(L). For u ∈ L, define Su := 2 C ∩ (u + F ). Then the sets Su (u ∈ L) 1 are pairwise disjoint and their union is precisely 2 C. Hence

X 1 vol(Su) = vol( 2 C) > d(L). u∈L

We shift the sets Su into F , that is, we define

∗ 1 Su := −u + Su = (−u + 2 C) ∩ F for u ∈ L.

∗ Since Su has the same volume as Su, we have

X ∗ X vol(Su) = vol(Su) > d(L) = vol(F ). u∈L u∈L

∗ That is, we have a collection of subsets Su (u ∈ L) of F , the sum of whose volumes is ∗ ∗ larger than the volume of F . So there are two distinct u, v ∈ L such that Su∩Sv 6= ∅. ∗ ∗ 1 Pick a point a ∈ Su ∩Sv. Then for certain x, y ∈ 2 C we have x−u = y −v = a. Hence x − y = u − v ∈ L \{0}.

11 Now 2x, 2y ∈ C, by the symmetry of C we have −2y ∈ C, and by the convexity 1 of C we have 2 (2x − 2y) = x − y ∈ C. This shows that C contains a non-zero point from L. Now assume that vol(C) = 2nd(L). Suppose that C does not contain a non-zero point from L. Then since C is compact and L is discrete, there is λ > 1 such that λC does not contain a non-zero point from L. But this contradicts what has been established above, since vol(λC) = λn vol(C) > 2nd(L).

Exercise 2.3. Prove the following theorem of Blichfeldt. let S be a measurable, not necessarily convex, subset of Rn with vol(S) > d(L). Then there are x, y ∈ S with x 6= y and x − y ∈ L.

We give some consequences.

Corollary 2.4. Let li = αi1X1 + ··· + αinXn (i = 1, . . . , n) be linear forms with real coefficients and with det(l1, . . . , ln) 6= 0. Let A1,...,An be positive reals with

A1 ··· An > | det(l1, . . . , ln)| .

Then there is a non-zero x ∈ Zn with

|l1(x)| 6 A1,..., |ln(x)| 6 An .

Proof. Define

n C = {x ∈ R : |li(x)| 6 Ai (i = 1, . . . , n)}, n R := {y = (y1, . . . , yn) ∈ R : |yi| 6 Ai (i = 1, . . . , n)}.

Then R = φ(C), where φ(x) = (l1(x), . . . , ln(x)). Notice that φ is a linear transfor- n mation of R of determinant det(l1, . . . , ln). Since applying a linear transformation φ to a set has the effect that the volume of that set is multiplied by | det φ|, we have vol(R) = | det φ| vol(C), hence

−1 −1 n n vol(C) = | det φ| vol R = | det(l1, . . . , ln)| 2 A1 ··· An > 2 .

Now it follows at once that C contains a non-zero point from Zn.

12 We recall some well-known results of Dirichlet from 1842, concerning the approx- imation of real numbers by rational numbers. Dirichlet proved these results using his own box principle (if m objects are put into n boxes where m > n then two objects are in the same box). Dirichlet’s results can be obtained alternatively from the above corollary.

Corollary 2.5 (Dirichlet, 1842). Let ξ be a real irrational . Then there are infinitely many pairs of integers (x, y) with gcd(x, y) = 1, y > 0 and

x −2 ξ − y . y 6

This has been proved in the introduction. We deduce some generalizations.

Corollary 2.6 (Dirichlet, 1842). (i) Let ξ1, . . . , ξn be real numbers, at least one of which is irrational. Then there are infinitely many tuples of integers (x1, . . . , xn, y) with gcd(x1, . . . , xn, y) = 1, y > 0 and

xi −1−1/n (2.2) ξi − y for i = 1, . . . , n. y 6

(ii) Let ξ1, . . . , ξn be real numbers such that 1, ξ1, . . . , ξn are linearly independent over Q. Then there are infinitely many tuples of integers (x, y1, . . . , yn) with (y1, . . . , yn) 6= (0,..., 0) and −n |ξ1y1 + ··· + ξnyn − x| 6 max(|y1|,..., |yn|) .

Proof. We prove only (i). Notice that (2.2) is equivalent to

−1/n (2.3) |xi − ξiy| 6 y for i = 1, . . . , n.

The idea is to consider instead of (2.2) the system of inequalities

−1/n (2.4) |xi − ξiy| 6 Q (i = 1, . . . , n), |y| 6 Q, for any integer Q > 1, and to let Q vary. If (x1, . . . , xn, y) is a tuple of integers satisfying this system, then by replacing Q−1/n by y−1/n as we may, we obtain a solution of (2.3) and hence of (2.2).

Notice that the system of linear forms X1 − ξ1Xn+1,...,Xn − ξnXn+1,Xn+1 has determinant 1. So by Corollary 2.4, for every integer Q > 1 there is a non-zero

13 tuple of integers (x1, . . . , xn, y) satisfying (2.4). If y = 0 then x1 = ··· = xn = 0 which is impossible. Hence y 6= 0. By changing signs and dividing out the gcd of x1, . . . , xn, y if necessary, we obtain a solution xQ = (x1, . . . , xn, y) of (2.4) with y > 0 and gcd(x1, . . . , xn, y) = 1.

We claim that if we let Q → ∞, then xQ runs through an infinite set. Indeed, sup- pose the contrary. Then there is an infinite sequence of integers Qi → ∞ such that for each Qi the point xQi is equal to some fixed tuple of integers x = (x1, . . . , xn, y) independent of i. But then, ξi = xi/y ∈ Q for i = 1, . . . , n, against our assumption. Now replacing in (2.4) the upper bound Q−1/n by y−1/n as suggested above, we obtain infinitely many solutions of (2.2). Exercise 2.4. Prove the following common generalization of both (i) and (ii). Let m, n be positive integers and li = ξi1X1 + ··· + ξinXn (i = 1, . . . , m) linear forms with real coefficients satisfying

n {y ∈ Z : li(y) ∈ Z for i = 1, . . . , m} = {0}.

m Then there are infinitely many tuples (x, y), with x = (x1, . . . , xm) ∈ Z , y = n (y1, . . . , yn) ∈ Z , y 6= 0, such that  −m/n |li(y) − xi| 6 max |yj| for i = 1, . . . , m. 16j6n

Remarks. 1. Corollary 2.5 has been improved by Hurwitz (1891) as follows. If ξ is any irrational, real number, then there are infinitely pairs of integers (x, y) ∈ Z2 such that √ −1 −2 |ξ − x/y| 6 5 y , y > 0, gcd(x, y) = 1. √ 1 A second result of Hurwitz states that there are ξ (e.g. ξ = 2 (1+ 5)), for which the √ −1 √ −1 constant 5 is optimal, that is, for every A < 5 , the inequality |ξ − x/y| 6 Ay−2 has only finitely many solutions in pairs (x, y) ∈ Z2 with y > 0. Hurwitz’ proofs are based on the theory of continued fractions (not to be discussed in this course). 2. Given a real number θ, we denote by kθk the distance of θ to the nearest integer, i.e., kθk = min{|θ − m| : m ∈ Z}. Corollary 2.6 implies that for any two real numbers ξ1, ξ2, not both in Q, there are infinitely many positive integers y such that

−1/2 −1/2 kξ1yk 6 y , kξ2yk 6 y .

14 This implies that there are infinitely many positive integers y such that

ykξ1yk · kξ2yk 6 1.

In fact, this is true also if both ξ1, ξ2 ∈ Q, since then, there are infinitely many integers y with kξ1yk = 0, kξ2yk = 0. The following famous conjecture, due to Littlewood, is still open:

Littlewood’s Conjecture. Let ξ1, ξ2 be any two real numbers. Then for every ε > 0 there exists a positive integer y such that

ykξ1yk · kξ2yk < ε.

1 Note that kθk 6 2 for every x ∈ R. So Littlewood’s conjecture would imply also that for any n > 3 reals ξ1, . . . , ξn, and for any ε > 0 there is a positive integer y with ykξ1yk · · · kξnyk < ε.

Exercise 2.5. Deduce from Hurwitz’ second result in Remark 1 that there are ξ ∈ R for which there is a constant c(ξ) > 0 such that

y · kξyk > c(ξ) for all y ∈ Z \{0}. (that is, there is no one-dimensional analogue of Littlewood’s Conjecture).

2.3 Minkowski’s second convex body theorem

Let L be a lattice in Rn and C a central symmetric convex body in Rn.

Definition. The n successive minima λ1, . . . , λn of C with respect to L are defined as follows:

λi is the minimum of all positive reals λ such that λC contains at least i linearly independent points from L.

Lemma 2.7. The successive minima λ1, . . . , λn of C with respect to L are well- defined, and 0 < λ1 6 ··· 6 λn < ∞.

Proof. Let k · kC be the norm associated with C, defined by kxkC = min{λ ∈ R>0 : n x ∈ λC}. Recall that λC = {x ∈ R : kxkC 6 λ}.

15 We can order the points of L as a sequence x0 = 0, x1, x2,... such that 0 = kx0kC < kx1kC 6 kx2kC 6 ··· . To see this, consider for each positive integer m the points x ∈ L with m − 1 < kxkC 6 m, these are the points with x ∈ mC, x 6∈ (m − 1)C. Since L is discrete and mC is closed and bounded, there are only

finitely many such points and these can be ordered according to their k · kC -values.

Let v1 := x1, and for i = 2, 3, . . . , n, let vi be the first element in the sequence ∞ {xk}k=1 that is linearly independent of v1,..., vi−1. Then clearly, λi = kvikC for i = 1, . . . , n and by Lemma 2.7 we have λ1 > 0.

Remark. The above construction gives linearly independent vectors v1,..., vn ∈ L with vi ∈ λiC for i = 1, . . . , n. These vectors need not form a basis of L.

Minkowski’s second convex body theorem gives an optimal upper and lower bound for the product of the successive minima of a central symmetric convex body C with respect to a lattice L.

Theorem 2.8 (Minkowski’s second convex body theorem, 1910). Let L be n a lattice and C a central symmetric convex body, both in R , and let λ1, . . . , λn be the successive minima of C with respect to L. Then

2n d(L) d(L) · λ ··· λ 2n · . n! vol(C) 6 1 n 6 vol(C)

We show that Minkowski’s second convex body theorem implies his first.

Second convex body theorem ⇒ First convex body theorem. Minkowski’s second con- n n n vex body theorem implies that λ1 6 2 d(L)/ vol(C). Assume that vol(C) > 2 d(L); then λ1 6 1. Now λ1C contains a non-zero point from L and λ1C ⊆ C; hence C contains a non-zero point from L.

n 2 2 Example 1. Let Bn be the Euclidean ball in R , given by x1 +···+xn 6 1. Let L be n a lattice of rank n in R . Let λ1, . . . , λn be the successive minima of Bn with respect Pn 21/2 to L. It is clear that x ∈ λBn if and only if kxk2 6 λ, where kxk2 = j=1 xj is the Euclidean norm. There are linearly independent vectors v1,..., vn ∈ L with kvik2 6 λi for i = 1, . . . , n. In fact, v1 is the shortest non-zero vector in L, and for

16 i = 2, . . . , n, vi is the shortest vector in L outside the spanned by v1,..., vi−1. Now Theorem 2.8 implies that

n Y n −1 kvik2 6 2 V (n) d(L), i=1 where V (n) = vol(Bn). We mention once more that v1,..., vn need not be a basis of L. Example 2. We prove that the constant 2n in the upper bound of Theorem 2.8 is best possible, i.e., the theorem becomes false if 2n is replaced by a smaller quantity. n Let L be any lattice in R of rank n and choose a basis v1,..., vn of L. Further, let λ1, . . . , λn be reals with 0 < λ1 6 ··· 6 λn. Define   −1 −1 C1 := x1λ1 v1 + ··· + xnλn vn : x1, . . . , xn ∈ R, max |xi| 6 1 . 16i6n

The set C1 is the image of the n-dimensional cube

n Kn := {x = (x1, . . . , xn) ∈ R : max |xi| 6 1} 16i6n under the linear transformation

n n −1 φ : R → R : ei 7→ λi vi (i = 1, . . . , n),

n where e1,..., en is the standard basis of R . Consequently, C1 is a central symmetric convex body with volume

n | det(v1,..., vn)| n −1 vol(C1) = | det(φ)| · vol(Kn) = 2 = 2 d(L)(λ1 ··· λn) . λ1 ··· λn n −1 Thus, λ1 ··· λn = 2 d(L) vol(C1) .

We now show that λ1, . . . , λn are the successive minima of C1 with respect to L. For λ > 0 we have ( n ) X −1 λC1 = xjλj vj : xj ∈ R, |xj| 6 λ for j = 1, . . . , n . j=1

This implies that λiC1 contains the i linearly independent points v1,..., vi. Let Pn λ < λi and let y = j=1 yjvj with y1, . . . , yn ∈ Z be a lattice point in λC1. Then

17 |yi| < 1,..., |yn| < 1, implying that yi = ··· = yn = 0. So all lattice points in λC1 lie in the (i − 1)-dimensional space spanned by v1,..., vi−1, and this space cannot contain i linearly independent points. So λi is the i-th successive minimum of C1. It follows that indeed λ1, . . . , λn are the successive minima of C1.

Example 3. We prove that the factor 2n/n! in the lower bound of Theorem 2.8 is best possible. Let

( n ) −1 −1 X C2 := x1λ1 v1 + ··· + xnλn vn : x1, . . . , xn ∈ R, |xi| 6 1 . i=1 −1 Then C2 is the image under φ : ei 7→ λi vi (i = 1, . . . , n) of the n-dimensional octahedron ( n ) n X On := y = (y1, . . . , yn) ∈ R : |yi| 6 1 . i=1 n We have vol(On) = 2 /n! (verify this!). It follows that C2 is a central symmetric convex body with volume 2n 2n vol(C ) = | det φ| · = d(L)(λ ··· λ )−1, 2 n! n! 1 n 2n hence λ1 ··· λn = n! d(L)/ vol(C2).

Exercise 2.6. Prove that λ1, . . . , λn are the successive minima of C2 with respect to L.

We first deduce the lower bound for λ1 ··· λn in Theorem 2.8. After that, we prove the upper bound in the special case that C is an ellipsoid. For a proof of the upper bound for arbitrary C, which is much more involved, we refer to the book of Cassels, Chapter 8. We need a lemma.

n Lemma 2.9. Let v1,..., vr ∈ R . Then ( r r ) X X xivi : xi ∈ R, |xi| 6 1 i=1 i=1 n is the smallest convex subset in R , symmetric about 0, that contains v1,..., vr, that is, the set itself is convex and symmetric about 0, and it is contained in every other which is symmetric about 0 and contains v1,..., vr.

18 Exercise 2.7. Prove this lemma.

Proof of the lower bound in Theorem 2.8. Choose linearly independent vectors −1 v1,..., vn of L such that vi ∈ λiC for i = 1, . . . , n. Then λi vi ∈ C for i = 1, . . . , n. Consider the set C2 from Example 3. By Lemma 2.9, this is the smallest symmetric −1 convex set containing the points λi vi ∈ C (i = 1, . . . , n). Hence C2 ⊆ C.

We mention that v1,..., vn generate a sublattice M of L of rank n, therefore, | det(v1,..., vn)| = d(M) = [L : M] · d(L) > d(L). Hence, 2n 2n vol(C) vol(C ) = d(M)(λ ··· λ )−1 d(L)(λ ··· λ )−1. > 2 n! 1 n > n! 1 n

This implies the lower bound for λ1 ··· λn from Theorem 2.8.

We now prove the upper bound for ellipsoids.

Theorem 2.10. Let L be a lattice in Rn and E an ellipsoid in Rn which is symmetric about 0. Then for the successive minima λ1, . . . , λn of E with respect to L we have n λ1 ··· λn 6 2 d(L)/ vol(E).

Proof. We first observe that there is no loss of generality to assume that E = Bn is 2 2 1/2 the Euclidean unit ball given by kxk2 = (x1 +···+xn) 6 1. Recall that there is a n 0 −1 linear transformation φ of R such that φ(Bn) = E. Let L = φ (L). Then clearly, 0 the successive minima of Bn with respect to L are equal to the successive minima 0 of E with respect to L. Further, vol(E) = | det φ| vol(Bn), d(L) = | det φ|d(L ), 0 which implies d(L)/ vol(E) = d(L )/ vol(Bn). This argument shows that it suffices 0 to prove Theorem 2.10 for L ,Bn instead of L, E.

So let λ1, . . . , λn be the successive minima of Bn with respect to a given lattice n L in R . This means that λi is the smallest λ > 0 such that the set of x ∈ L with kxk2 6 λ contains i linearly independent points. Define

 2nd(L) 1/n µ := . vol(Bn)λ1 ··· λn

We have to prove that µ > 1.

Let v1,..., vn be linearly independent vectors in L such that vi ∈ λiBn, i.e., kvk2 = λi for i = 1, . . . , n. Recall from linear algebra that by means of the

19 Gram-Schmidt orthogonalization procedure, one can construct an orthonormal basis n n e1,..., en of R such that for i = 1, . . . , n, the linear subspace of R spanned by n e1,..., ei is equal to the linear subspace of R spanned by v1,..., vi. n Let ψ be the linear transformation of R defined by ψ(ei) = λiei for i = 1, . . . , n. Then ψ has determinant λ1 ··· λn. Hence

 n n vol µψ(Bn) = µ λ1 ··· λn vol(Bn) = 2 d(L).

By Minkowski’s first convex body theorem, µψ(Bn) contains a non-zero lattice point, x, say, of L. The idea of the proof is now to estimate from above and below −1 kψ (x)k2. A comparison of both estimates gives µ > 1. −1 First observe that ψ (x) ∈ µBn. This means precisely that

−1 kψ (x)k2 6 µ.

−1 We now show that kψ (x)k2 > 1. Then µ > 1 follows. Since x ∈ L \{0} we have kxk2 > λ1. Let i be the largest index such that kxk2 > λi. This means that x is in the linear space Vi generated by v1,..., vi. Indeed, this is clearly true if i = n n since Vn = R . Suppose that i < n. Then kxk2 < λi+1. The vectors v1,..., vi have Euclidean norm 6 λi < λi+1 and by the definition of λi+1 there cannot be i + 1 linearly independent points in L of Euclidean norm < λi+1. Hence x ∈ Vi.

As mentioned before, Vi is also spanned by e1,..., ei. So we have in fact

x = x1e1 + ··· + xiei with x1, . . . , xi ∈ R.

−1 Pi Clearly, ψ (x) = k=1(xk/λk)ek. Hence, using λ1 6 ··· 6 λi,

i i ! −1 2 X 2 X 2 2 2 2 kψ (x)k2 = (xk/λk) > xk /λi = kxk /λi > 1. k=1 k=1 This completes the proof of Theorem 2.10.

2.4 Dual lattices

Henceforth, vectors in Rn will be column vectors, unless otherwise stated. As usual, T T n the standard inner product of x = (x1, . . . , xn) , y = (y1, . . . , yn) ∈ R is given by Pn T hx, yi = i=1 xiyi. Here, by A we denote the transpose of a matrix A.

20 Let L be a lattice in Rn The dual of L is given by

∗ n L := {x ∈ R : hx, yi ∈ Z for all y ∈ L}.

Let {v1,..., vn} be a basis of L, and V the matrix with columns v1,..., vn. Thus, L = {V z : z ∈ Zn}. Using hx,V yi = hV T x, yi for x, y ∈ Rn, we obtain

∗ n n L = {x ∈ R : hx,V zi ∈ Z ∀z ∈ Z } n T n = {x ∈ R : hV x, zi ∈ Z ∀z ∈ Z } n T n T −1 n = {x ∈ R : V x ∈ Z } = {(V ) w : w ∈ Z }.

∗ n ∗ ∗ T −1 Hence L is a lattice in R , with basis the columns v1,..., vn of (V ) . Note that d(L∗) = | det(V T )−1| = | det V |−1 = d(L)−1.

1/2 We denote by Bn the n-dimensional Euclidean ball given by kxk2 = hx, xi 6 1. n ∗ Theorem 2.11. Let L be a lattice in R and L its dual. Further, let λ1, . . . , λn be ∗ ∗ the successive minima of L with respect to Bn, and λ1, . . . , λn the successive minima ∗ of L with respect to Bn. Then

∗ 1 6 λiλn+1−i 6 c(n) for i = 1, . . . , n,

n −2 where c(n) = 4 vol(Bn) .

∗ Proof. We first deduce the lower bound for λiλn+1−i. Let v1,..., vn be linearly independent vectors from L such that vi ∈ λiBn, i.e., kvik2 6 λi for i = 1, . . . , n. ∗ ∗ ∗ ∗ ∗ Likewise, let v1,..., vn be linearly independent vectors from L such that kvi k2 6 λi for i = 1, . . . , n. Now let i ∈ {1, . . . , n}, and consider the set of vectors

n {x ∈ R : hvk, xi = 0 for k = 1, . . . , i}.

n Since v1,..., vi are linearly independent, this is a linear subspace of R of dimension ∗ ∗ n − i. Hence at least one of the vectors v1,..., vn+1−i does not lie in this space. It ∗ follows that there are indices k 6 i, l 6 n + 1 − i, such that hvk, vl i= 6 0. ∗ ∗ ∗ ∗ But hvk, vl i ∈ Z, since vk ∈ L, vl ∈ L . Hence |hvk, vl i| > 1. Now by the Cauchy-Schwarz inequality,

∗ ∗ ∗ ∗ 1 6 |hvk, vl i| 6 kvkk2kvl k2 6 λkλl 6 λiλn+1−i.

21 This establishes the lower bound. To prove the upper bound, recall that by Theorem 2.10, we have

n −1 ∗ ∗ n ∗ −1 λ1 ··· λn 6 2 d(L)(vol Bn) , λ1 ··· λn 6 2 d(L )(vol Bn) .

Further, d(L∗) = d(L)−1. Hence

n Y ∗ n ∗ −2 (λiλn+1−i) 6 4 d(L)d(L ) vol(Bn) = c(n). i=1 It follows that for i = 1, . . . , n,

∗ c(n) λiλn+1−i 6 Q ∗ 6 c(n). j6=i λjλn+1−j

This proves Theorem 2.11.

2.5 Kronecker’s approximation theorem

Recall that by Dirichlet’s Theorem, if ξ1, . . . , ξn are real numbers of which at least one is irrational, then there are infinitely many tuples of integers x1, . . . , xn, y such that −1−1/n |ξi − xi/y| 6 y for i = 1, . . . , n, y > 0. n+1 This implies that for every ε > 0, there exists (x1, . . . , xn, y) ∈ Z such that

|ξiy − xi| 6 ε for i = 1, . . . , n, y > 0.

Kronecker’s approximation theorem deals with systems of inhomogeneous inequali- ties of the shape

(2.5) |ξiy − xi − θi| 6 ε (i = 1, . . . , n) in x1, . . . , xn, y ∈ Z where θ1, . . . , θn are any real numbers.

Theorem 2.12. Let ξ1, . . . , ξn, θ1, . . . , θn be real numbers. Suppose that 1, ξ1, . . . , ξn are linearly independent over Q. Then for every ε > 0, there exists (x1, . . . , xn, y) ∈ Zn+1 with (2.5).

22 Remark. The condition that 1, ξ1, . . . , ξn be linearly independent over Q can not be removed. For suppose that 1, ξ1, . . . , ξn are linearly dependent over Q. Then there are integers a1, . . . , an, a0, not all 0, such that a1ξ1 + ··· + anξn = a0. In fact, at least one of a1, . . . , an is non-zero. Choose θ1, . . . , θn ∈ R such that

a1θ1 + ··· + anθn 6∈ Z.

Let δ be the distance from a1θ1 + ··· + anθn to the nearest integer. We show that n+1 for sufficiently small ε > 0, (2.5) is not solvable. Indeed, let (x1, . . . , xn, y) ∈ Z be a solution of (2.5). Then

n n n X X X ai(ξiy − xi − θi) 6 |ai| · |ξiy − xi − θi| 6 ε |ai|. i=1 i=1 i=1 But on the other hand,

n n n X X X ai(ξiy − xi − θi) = a0y − aixi − aiθi > δ. i=1 i=1 i=1 Pn Hence (2.5) is unsolvable for ε < δ/ i=1 |ai|. We need the following lemma.

n Lemma 2.13. Let L be a lattice in R , and λ1, . . . , λn the successive minima of Bn with respect to L. Then for every b ∈ Rn there is u ∈ L such that

1 ku − bk2 6 2 (λ1 + ··· + λn).

Proof. Choose linearly independent v1,..., vn ∈ L such that vi ∈ λiBn for i = Pn 1, . . . , n. Then b = i=1 tivi with t1, . . . , tn ∈ R. Define ai ∈ Z, ri ∈ R by 1 1 Pn ti = ai + ri, − 2 < ri 6 2 , and put u := i=1 aivi. Then clearly, u ∈ L, and

n n n X 1 X 1 X ku − bk2 = k rivik2 6 2 kvik2 6 2 λi. i=1 i=1 i=1

n Exercise 2.8. Let L, λ1, . . . , λn be as above. Prove that there exists b ∈ R such 1 that ku − bk2 > 2 λ1 for every u ∈ L.

23 Proof of Theorem 2.12. Let M be a large positive integer, to be chosen later. Con- sider the lattice in Rn+1,

 −1 −1 −1  LM = ε (x1 − ξ1y), . . . , ε (xn − ξny),M y : x1, . . . , xn, y ∈ Z n+1 = {Az : z ∈ Z }, where  −1 −1  ε 0 −ε ξ1 . .  .. .  A =  .   −1 −1   ε −ε ξn  0 M −1

T −1 −1 T and z = (x1, . . . , xn, y) . Put b = (ε θ1, . . . , ε θn, 0) . We show that there is a positive integer M, such that for the successive minima of the (n + 1)-dimensional Euclidean ball Bn+1 with respect to LM we have 2 (2.6) λ λ ··· λ . 1 6 2 6 6 n+1 6 n + 1

T n+1 Then by Lemma 2.13, there is u = Az ∈ LM with z = (x1, . . . , xn, y) ∈ Z such that 1 ku − bk2 6 2 (λ1 + ··· λn+1) 6 1, and this can be rewritten as n X −2 2 −2 2 ε (xi − ξiy − θi) + M y 6 1. i=1

This certainly implies that x1, . . . , xn, y satisfy (2.5). By Theorem 2.11, to prove (2.6) it suffices to show that for the first minimum ∗ ∗ λ1 of Bn+1 with respect to the dual lattice LM , we have

∗ 1 0 λ1 > 2 (n + 1) · c(n + 1) =: c (n). It is easy to verify that

 ε 0   .. .  T −1  . .  (A ) =   .  ε 0  Mξ1 . . . Mξn M

24 Hence

∗  T −1 n+1 LM = (A ) z : z ∈ Z   = εx1, . . . , εxn,M(ξ1x1 + ··· + ξnxn + y) : x1, . . . , xn, y ∈ Z .

We assume without loss of generality that ε < c0(n). Let µ be the minimum of all numbers |ξ1x1 + ··· + ξnxn + y|, taken over all integers x1, . . . , xn, y such that

−1 0 1 |xi| 6 ε c (n) for i = 1, . . . , n, |ξ1x1 + ··· + ξnxn + y| 6 2 ,

(x1, . . . , xn, y) 6= 0.

Then µ is the minimum of finitely many real numbers which are all positive, since 1, ξ1, . . . , ξn are linearly independent over Q. Hence µ > 0. 0 ∗ Now let M be an integer with M > c (n)/µ. Then for every non-zero u ∈ LM we have 0 kuk2 > c (n) since either one of the numbers εx1, . . . , εxn, or M(ξ1x1 +···+ξnxn +y), has absolute 0 0 value > c (n). This implies that if λ 6 c (n), then λBn+1 does not contain a non-zero ∗ ∗ 0 point from LM , that is, λ1 > c (n). This completes our proof.

In fact, Kronecker proved a much more general approximation theorem, of which

Theorem 2.12 is just a special case. As usual, kxk2 denotes the Euclidean norm of a vector x ∈ Rn. Theorem 2.14 (Kronecker, 1887). Let A be an m × n-matrix with real entries, and b ∈ Rm a column vector. Then the following two assertions are equivalent: (i) For every y ∈ Rm with AT y ∈ Zn we have hb, yi ∈ Z; (ii) For every ε > 0 there is z ∈ Zn such that

kAz − bk2 6 ε.

For a proof, we refer to Siegel, Chapter II.

Exercise 2.9. a) Prove (ii)=⇒(i). b) Deduce Theorem 2.12 from Theorem 2.14.

25