
U.U.D.M. Project Report 2019:29

Parametric Geometry of Numbers and Exponents of Diophantine Approximation

Erik Landstedt

Degree project in mathematics, 30 credits. Supervisor: Andreas Strömbergsson. Examiner: Denis Gaidashev. June 2019

Department of Mathematics, Uppsala University

Parametric Geometry of Numbers and Exponents of Diophantine Approximation

Erik Landstedt

June 2019

Acknowledgments

I would like to express my gratitude to my supervisor, Professor Andreas Strömbergsson, for his guidance and enthusiasm about this project. Your feedback has truly been helpful together with our discussions.

Contents

1 Introduction

2 Preliminaries
  2.1 Lattices
  2.2 Convex Bodies and Minkowski’s First Theorem
  2.3 Minkowski’s Second Convex Body Theorem
    2.3.1 Gauge Functions
    2.3.2 Successive Minima
    2.3.3 Statement of Minkowski’s Second Theorem

3 Rigid Systems and Convex Bodies
  3.1 Convex Bodies and Alternating k-forms

4 The Approximation Theorem of Schmidt and Summerer
  4.1 Appendix: Translation between Roy’s Article and the Article by Schmidt and Summerer

5 Roy’s Contribution

6 Exponents of Diophantine Approximation
  6.1 Khinchin’s Transference Principle
  6.2 A Consequence of Roy’s Contribution

Chapter 1

Introduction

Diophantine approximation is the part of number theory that studies how well an arbitrary real number can be approximated by a rational number. The famous Dirichlet approximation theorem asserts that, given a real number $\theta$ and a positive integer $n$, there are integers $\alpha$ and $\beta$ such that
\[ |\alpha\theta - \beta| < \frac{1}{n}, \]
where $1 \le \alpha \le n$. It follows that if $\theta$ is irrational, then there are infinitely many rational numbers $\beta/\alpha$ such that

\[ \left|\theta - \frac{\beta}{\alpha}\right| \le \frac{1}{\alpha^{2}}. \]
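The statement of Dirichlet's theorem can be checked numerically. The following is a minimal sketch, assuming a plain exhaustive search over $1 \le \alpha \le n$; the function name `dirichlet_pair` is hypothetical and floating-point arithmetic stands in for exact real numbers.

```python
import math

def dirichlet_pair(theta, n):
    """Search 1 <= alpha <= n for an integer beta with |alpha*theta - beta| < 1/n.

    Illustrative brute force only; the name and strategy are assumptions.
    """
    for alpha in range(1, n + 1):
        beta = round(alpha * theta)  # nearest integer to alpha * theta
        if abs(alpha * theta - beta) < 1.0 / n:
            return alpha, beta
    # Dirichlet's theorem rules this out in exact arithmetic.
    raise ValueError("no pair found; floating-point rounding is the likely cause")

theta = math.sqrt(2)
alpha, beta = dirichlet_pair(theta, 10)
print(alpha, beta, abs(alpha * theta - beta))
```

The returned pair also satisfies the weaker bound $|\theta - \beta/\alpha| \le 1/\alpha^2$, since $1/(n\alpha) \le 1/\alpha^2$ whenever $\alpha \le n$.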

A more general idea is to consider an arbitrary unit vector $u \in \mathbb{R}^n$ and construct a rational codimension one subspace that approximately contains $u$: Define $\tau(u)$ to be the supremum of $\eta > 0$ such that the two inequalities

\[ \|x\| \le Q \quad \text{and} \quad |x \cdot u| \le Q^{-\eta} \]

admit a nonzero solution $x \in \mathbb{Z}^n$ for arbitrarily large values of $Q$. Then $\tau(u)$ is an example of an exponent of Diophantine approximation. There are other exponents as well, and the approximating subspace does not necessarily need to have codimension 1.

The present work is a survey of Roy’s article: On Schmidt and Summerer Parametric Geometry of Numbers, [15], where the author develops a framework which allows many types of Diophantine exponents to be understood in a new light. The building blocks of Roy’s article are the results from Schmidt and Summerer, [16] and [17], where they construct certain one parameter families of convex bodies. The asymptotic behavior of the successive minima of these bodies for large parameter values is directly connected to a certain family of Diophantine exponents. Schmidt and Summerer show that the successive minima can be approximated with bounded difference by certain functions which are amenable to analysis.

Roy constructs his own class of approximating functions, called rigid systems. He also proves that, conversely, given any such rigid system, there exists a point in $\mathbb{R}^n$ whose corresponding successive minima approximate the rigid system with a bounded difference. The discussion of this result will form the main part of the present thesis.

Chapter two is dedicated to preliminaries where concepts such as convex bodies and successive minima are presented. The necessary foundations from the geometry of numbers are also introduced here such as Minkowski’s second convex body theorem.

In Chapter three we build up Roy’s theory regarding rigid systems, starting with the construction of a certain type of convex bodies, in order to be able to present the Approximation Theorem of Schmidt and Summerer. This theorem is presented in Chapter four together with a translation between Schmidt and Summerer’s notation and Roy’s. In Chapter five we focus on Roy’s result regarding the existence of a rigid system for any point in $\mathbb{R}^n$. Finally, in Chapter 6, we discuss some of the foundations of Diophantine exponents and explain the connection between these and Roy’s result.

Chapter 2

Preliminaries

2.1 Lattices

The main objective in this introductory section will be the study of lattices and we will introduce some fundamental concepts and results from the geometry of numbers. The content can be found in [19].

Definition 1. For any linearly independent vectors $e_1, \dots, e_m$ in $\mathbb{R}^n$ ($m \le n$), the additive subgroup of $(\mathbb{R}^n, +)$ generated by $e_1, \dots, e_m$ is called a lattice of dimension (or rank) $m$. We will denote lattices with $\Lambda$ or $\Gamma$.

This means that a lattice of rank $m$ is an abelian group of the form

\[ \Lambda = \bigoplus_{k=1}^{m} \mathbb{Z} e_k \]

and one immediate example is $(\mathbb{Z} \times \mathbb{Z}, +)$. This is a lattice of rank 2 in $\mathbb{R}^2$ and the geometric interpretation can be seen in Figure 2.1. Intuitively we feel that a lattice should be a discrete subset of $\mathbb{R}^n$ and this is true together with the converse. It is possible to show that an additive subgroup of $\mathbb{R}^n$ is a lattice if and only if it is discrete.

Example 2. Take $e_1 = (1, 1)$ and $e_2 = (0, 1)$ in $\mathbb{R}^2$. Then the additive group

\[ \Lambda_1 = (\{\alpha_1 e_1 + \alpha_2 e_2 : \alpha_1, \alpha_2 \in \mathbb{Z}\}, +) \]

is a discrete subgroup of $\mathbb{R}^2$ and hence a lattice. Notice that $\Lambda_1$ in fact will be $\mathbb{Z}^2$. In the same way it can be noted that if $\{e_1, \dots, e_n\}$ is a basis in $\mathbb{R}^n$, then

\[ \Lambda_2 = \left( \left\{ \sum_{k=1}^{n} \alpha_k e_k : \alpha_k \in \mathbb{Z} \right\}, + \right) \]

will be a lattice in $\mathbb{R}^n$. △
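The claim that the basis $(1, 1)$, $(0, 1)$ generates all of $\mathbb{Z}^2$ can be sanity-checked by solving for the coefficients exactly. This is a small sketch under the assumption that checking a finite box of integer points is enough for illustration; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def coefficients(e1, e2, v):
    """Solve a1*e1 + a2*e2 = v exactly by Cramer's rule."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if det == 0:
        raise ValueError("e1 and e2 are linearly dependent")
    a1 = Fraction(v[0] * e2[1] - v[1] * e2[0], det)
    a2 = Fraction(e1[0] * v[1] - e1[1] * v[0], det)
    return a1, a2

def generates_integer_points(e1, e2, box=5):
    """Check that every integer point in [-box, box]^2 has integer coefficients."""
    for v in product(range(-box, box + 1), repeat=2):
        a1, a2 = coefficients(e1, e2, v)
        if a1.denominator != 1 or a2.denominator != 1:
            return False
    return True

print(generates_integer_points((1, 1), (0, 1)))  # basis from Example 2
print(generates_integer_points((2, 0), (0, 1)))  # generates only a proper sublattice
```

Since the box contains the standard basis vectors, a positive answer already shows that the generated lattice contains $\mathbb{Z}^2$.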

Figure 2.1: This is an example of a lattice of rank 2 in $\mathbb{R}^2$. Hence it is of full rank.

Definition 3. Given a lattice $\Lambda$ of full rank in $\mathbb{R}^n$, any set of the form

\[ \left\{ \sum_{j=1}^{n} a_j e_j : a_1, \dots, a_n \in [0, 1) \right\} \]

where $\{e_1, \dots, e_n\}$ is a set of generators of $\Lambda$, is called a fundamental domain of $\mathbb{R}^n/\Lambda$.

Let $\Lambda$ be a lattice of rank $n$ in $\mathbb{R}^n$. Then the quotient group $\mathbb{R}^n/\Lambda$ is isomorphic to the $n$-dimensional torus, which we will denote by $\mathbb{T}^n$. This follows from the fact that $\mathbb{R}/\mathbb{Z} \cong S$ where $S$ denotes the circle group. The latter follows using the map $\varphi : \mathbb{R} \to S$ defined by

\[ \varphi(x) = e^{2\pi i x}, \]

which is clearly a surjective homomorphism and its kernel is $\mathbb{Z}$. Then $\mathbb{R}/\mathbb{Z} \cong S$ follows from the first group isomorphism theorem. Then it is natural to define $\psi : \mathbb{R}^n \to \mathbb{T}^n \cong S \times \cdots \times S$ via

\[ \psi\left( \sum_{j=1}^{n} a_j e_j \right) := (e^{2\pi i a_1}, \dots, e^{2\pi i a_n}) \]

and the isomorphism $\mathbb{R}^n/\Lambda \cong \mathbb{T}^n$ follows analogously.

The volume of a subset $D$ of $\mathbb{R}^n$ will be denoted by $\operatorname{vol}(D)$:
\[ \operatorname{vol}(D) := \int_D dx_1 \cdots dx_n. \]

It is also necessary to define a volume measure on $\mathbb{T}^n$ and to do this we let $\pi : \mathbb{R}^n \to \mathbb{R}^n/\Lambda$ be the canonical homomorphism and put for any smooth subset $X \subset \mathbb{R}^n/\Lambda$

\[ \operatorname{vol}(X) := \operatorname{vol}(T \cap \pi^{-1}(X)), \tag{2.1} \]

where $T$ is a fundamental domain of $\Lambda$. This volume measure will indeed be well-defined and this statement is formulated as the following proposition:

Proposition 4. The volume measure, vol, on $\mathbb{T}^n$ defined above is well-defined.

Proof. To prove this, let $T$ and $T'$ be two fundamental domains for $\mathbb{T}^n$. The goal is to show that

\[ \operatorname{vol}(T \cap X) = \operatorname{vol}(T' \cap X) \tag{2.2} \]

where $X$ is any Lebesgue measurable subset of $\mathbb{R}^n$ such that for all $x \in X$ and $\lambda \in \Lambda$ we have $x + \lambda \in X$. We will need the following lemma:

Lemma 5. For every $x \in \mathbb{R}^n$ there exists a unique $\lambda \in \Lambda$ such that $x \in \lambda + T$, where $T$ is a fundamental domain of $\mathbb{R}^n/\Lambda$.

Proof. With

\[ x = \sum_{k=1}^{n} c_k e_k \]

where $c_1, \dots, c_n \in \mathbb{R}$ we can take

\[ \lambda = \sum_{k=1}^{n} \lfloor c_k \rfloor e_k. \]

Then it follows that every element of $\mathbb{R}^n/\Lambda$ has a unique representation in $T$.

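Lemma 5 above now gives that

\[ \mathbb{R}^n = \bigcup_{\lambda \in \Lambda} (T' + \lambda) \]

and hence

\[ T \cap X = \bigcup_{\lambda \in \Lambda} \bigl( T \cap X \cap (T' + \lambda) \bigr) \]

and since these sets are pairwise disjoint it follows that

\[ \operatorname{vol}(T \cap X) = \sum_{\lambda \in \Lambda} \operatorname{vol}\bigl( T \cap X \cap (T' + \lambda) \bigr). \]

In a similar way, now also using the translation invariance of the Lebesgue measure together with $X + \lambda = X$, one gets

\[ \operatorname{vol}(T' \cap X) = \sum_{\lambda \in \Lambda} \operatorname{vol}\bigl( T' \cap X \cap (T + \lambda) \bigr) = \sum_{\lambda \in \Lambda} \operatorname{vol}\bigl( T \cap X \cap (T' + \lambda) \bigr) \]

and the equality in (2.2) is obtained.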
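The construction in Lemma 5 (take the integer parts of the coordinates with respect to the lattice basis) is easy to carry out explicitly. Below is a minimal two-dimensional sketch with exact rational arithmetic; the function name `decompose` and the sample point are assumptions made for illustration.

```python
import math
from fractions import Fraction

def decompose(e1, e2, x):
    """Split x as lam + t with lam in the lattice Z*e1 + Z*e2 and t in the
    fundamental domain {a1*e1 + a2*e2 : 0 <= a1, a2 < 1}."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if det == 0:
        raise ValueError("e1 and e2 are linearly dependent")
    # Coordinates c1, c2 of x in the basis (e1, e2), by Cramer's rule.
    c1 = Fraction(x[0] * e2[1] - x[1] * e2[0]) / det
    c2 = Fraction(e1[0] * x[1] - e1[1] * x[0]) / det
    n1, n2 = math.floor(c1), math.floor(c2)
    lam = (n1 * e1[0] + n2 * e2[0], n1 * e1[1] + n2 * e2[1])
    t = (x[0] - lam[0], x[1] - lam[1])
    return lam, t, (c1 - n1, c2 - n2)

x = (Fraction(7, 2), Fraction(-1, 3))
lam, t, frac = decompose((1, 1), (0, 1), x)
print(lam, t, frac)
```

The third return value lists the coordinates of `t` in the basis, which lie in $[0, 1)$ by construction; this is what makes the lattice point unique.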

2.2 Convex Bodies and Minkowski’s First Theorem

Convex bodies play a fundamental role in the geometry of numbers and we will see that many theorems require the relevant set to be e.g. convex. What is normally called Minkowski’s first theorem asserts that if we have a convex body of a certain size, then it must contain a nonzero point of a given lattice. The precise statement and proof will be discussed below. One of the main objects that are studied in both [16] and in [15] are convex bodies but these are also used in many other areas of mathematics such as optimization theory and geometry.

Definition 6. Let $S$ be a subset of $\mathbb{R}^n$. Then $S$ is called convex if for any $x, y \in S$ and $\lambda \in [0, 1]$ we have $\lambda x + (1 - \lambda) y \in S$. The expression $\lambda x + (1 - \lambda) y$ (with $\lambda \in [0, 1]$) is often referred to as a convex combination. Moreover, $S$ is called a convex body if $S$ is a compact, convex and symmetric region that contains a neighborhood of the origin.

It is somewhat non-standard to require, as we do, every convex body to be symmetric (viz., −S = S); however, this convention is used in [15], which we will give a survey of later on. Moreover, S will be allowed to be a subset of a finite dimensional Euclidean space and the notion of a convex body translates directly to this situation.

Example 7. The closed unit ball

\[ B = \{x \in \mathbb{R}^n : \|x\| \le 1\} \]

is an example of a convex body in $\mathbb{R}^n$. It is clearly nonempty and symmetric. By the Heine-Borel theorem it is compact. To show convexity we can pick $x, y \in B$ and form the convex combination

\[ \lambda x + (1 - \lambda) y \]

for $0 \le \lambda \le 1$ and show that it lies in $B$. By the triangle inequality we have

\[ \|\lambda x + (1 - \lambda) y\| \le \lambda \|x\| + (1 - \lambda) \|y\| \le \lambda + (1 - \lambda) = 1 \implies \lambda x + (1 - \lambda) y \in B. \]

△

Figure 2.2: The picture shows two sets. The one to the left is convex and the one to the right is not convex.

The lattice $\Lambda \subseteq \mathbb{R}^n$ will still be assumed to have rank $n$. We have the following lemma, [19, p. 136], that is used to prove Minkowski’s first theorem.

Lemma 8. Let $\pi : \mathbb{R}^n \to \mathbb{T}^n$ be the canonical homomorphism and let $X$ be a bounded Lebesgue measurable subset of $\mathbb{R}^n$. Then $\operatorname{vol}(\pi(X)) \le \operatorname{vol}(X)$ and if the restriction map $\pi|_X$ is injective then we have equality.

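Proof. Let $T$ be a fundamental domain for $\Lambda$. We have, using Lemma 5, that
\[ \operatorname{vol}(\pi(X)) = \operatorname{vol}\bigl( T \cap \pi^{-1}(\pi(X)) \bigr) = \operatorname{vol}\Bigl( T \cap \bigcup_{\lambda \in \Lambda} (X + \lambda) \Bigr). \]

Since $\Lambda$ is countable, the countable subadditivity of the Lebesgue measure implies that
\[ \operatorname{vol}\Bigl( T \cap \bigcup_{\lambda \in \Lambda} (X + \lambda) \Bigr) \le \sum_{\lambda \in \Lambda} \operatorname{vol}\bigl( T \cap (X + \lambda) \bigr) = \sum_{\lambda \in \Lambda} \operatorname{vol}\bigl( X \cap (T - \lambda) \bigr) = \operatorname{vol}\Bigl( \bigcup_{\lambda \in \Lambda} (T - \lambda) \cap X \Bigr) \]

and since
\[ \bigcup_{\lambda \in \Lambda} (T - \lambda) \]
tessellates $\mathbb{R}^n$ we get
\[ \operatorname{vol}\Bigl( \bigcup_{\lambda \in \Lambda} (T - \lambda) \cap X \Bigr) = \operatorname{vol}(X). \]

Now, if $\pi|_X$ is an injective map, then
\[ \bigl( T \cap (X + \lambda_1) \bigr) \cap \bigl( T \cap (X + \lambda_2) \bigr) = \emptyset \]
for all $\lambda_1 \ne \lambda_2 \in \Lambda$. To see this, assume that $x \in \bigl( T \cap (X + \lambda_1) \bigr) \cap \bigl( T \cap (X + \lambda_2) \bigr)$. Then, in particular $\pi(x - \lambda_1) = \pi(x - \lambda_2)$ since $(x - \lambda_1) - (x - \lambda_2) = \lambda_2 - \lambda_1 \in \Lambda$ and injectivity of $\pi|_X$ now implies $x - \lambda_1 = x - \lambda_2$; thus $\lambda_1 = \lambda_2$, a contradiction to the original assumption. Hence the sets $T \cap (X + \lambda)$ are pairwise disjoint, so the inequality above becomes an equality. This concludes the proof of the lemma.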

We are now ready to state Minkowski’s first theorem. We use the formulation found in [19, Theorem 7.1, p. 140].

Theorem 9. Let $\Lambda$ be a lattice of full rank in $\mathbb{R}^n$ with fundamental domain $T$ and let $X$ be a bounded symmetric convex subset of $\mathbb{R}^n$. If

\[ \operatorname{vol}(X) > 2^n \operatorname{vol}(\mathbb{R}^n/\Lambda) \]

then $X \cap \Lambda \ne \{0\}$.

Proof. We follow the proof presented in [19]. The idea is to double the size of the given lattice, $\Lambda$, and use Lemma 8. The proof idea is rather standard and similar arguments can be found in e.g. [18] where the lattice is scaled with $1/2$ instead. So, let us consider $\Gamma = 2\Lambda$. Any fundamental domain of $\Gamma$ has volume $2^n \operatorname{vol}(T)$ where $T$ is a fundamental domain of $\Lambda$. By assumption $2^n \operatorname{vol}(T) < \operatorname{vol}(X)$ and hence the canonical homomorphism $\pi : \mathbb{R}^n \to \mathbb{R}^n/\Gamma$ fulfills

\[ \operatorname{vol}(\pi(X)) \le \operatorname{vol}(\mathbb{R}^n/\Gamma) < \operatorname{vol}(X) \]

and by Lemma 8 it follows that $\pi|_X$ is not injective. This in turn means that there are elements $x_1, x_2 \in X$ with $x_1 \ne x_2$ such that $x_1 - x_2 \in \Gamma$. By symmetry and convexity of $X$ we get

\[ \tfrac{1}{2}(x_1 - x_2) \in X \]

and since

\[ \tfrac{1}{2}(x_1 - x_2) \in \Lambda \]

we have

\[ \tfrac{1}{2}(x_1 - x_2) \in X \cap \Lambda. \]

Since $x_1 \ne x_2$ this point is nonzero, which means that $X \cap \Lambda \ne \{0\}$ and the proof is done.

Figure 2.3: This is an illustration of Minkowski’s first convex body theorem.

Example 10. An easy example would be to take $\Lambda = \mathbb{Z}^3$ and $X$ as any ellipsoid centered at the origin with volume $V > 8$. Then Minkowski’s convex body theorem implies that the ellipsoid contains a nonzero lattice point.

△
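The ellipsoid example can be illustrated by direct enumeration. The sketch below assumes an axis-aligned ellipsoid with semi-axes chosen (hypothetically) so that the volume slightly exceeds $8$, and lists the nonzero points of $\mathbb{Z}^3$ inside it.

```python
import math
from itertools import product

def ellipsoid_volume(a, b, c):
    """Volume of the ellipsoid with semi-axes a, b, c."""
    return 4.0 / 3.0 * math.pi * a * b * c

def nonzero_lattice_points(a, b, c):
    """Nonzero integer points (x, y, z) with (x/a)^2 + (y/b)^2 + (z/c)^2 <= 1."""
    ranges = [range(-math.floor(s), math.floor(s) + 1) for s in (a, b, c)]
    points = []
    for x, y, z in product(*ranges):
        inside = (x / a) ** 2 + (y / b) ** 2 + (z / c) ** 2 <= 1
        if (x, y, z) != (0, 0, 0) and inside:
            points.append((x, y, z))
    return points

axes = (1.9, 1.1, 0.95)  # illustrative semi-axes, not taken from the text
print(ellipsoid_volume(*axes))
print(nonzero_lattice_points(*axes))
```

For a general (rotated) ellipsoid the same check would use the defining quadratic form instead of the diagonal expression above.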

Example 11. A number-theoretic application of Minkowski’s first theorem is to show the following famous result proved in 1770 by Lagrange:

Proposition 12. Every positive integer $n$ is the sum of four integer squares.

Proof. This proof can be found in [19, p. 143] and [5, Theorem 5]. The idea is to first show the statement for primes $p$ and then extend it to all positive integers. We may also assume $p$ is odd since $2 = 0^2 + 1^2 + 0^2 + 1^2$. If $p$ is an odd prime, then there are integers $x$ and $y$ such that

\[ x^2 + y^2 \equiv -1 \pmod{p} \tag{2.3} \]

holds. Define the lattice $\Lambda$ as

\[ \Lambda := \mathbb{Z}(x, y, 1, 0) \oplus \mathbb{Z}(-y, x, 0, 1) \oplus \mathbb{Z}(p, 0, 0, 0) \oplus \mathbb{Z}(0, p, 0, 0). \]

Then the volume of the fundamental domain becomes

\[ \operatorname{vol}(\mathbb{R}^4/\Lambda) = \begin{vmatrix} x & -y & p & 0 \\ y & x & 0 & p \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{vmatrix} = p^2 \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = p^2. \]

As a symmetric convex body we will consider the closed ball centered at the origin with radius $R > 0$, called $B_R$, in $\mathbb{R}^4$. The volume becomes
\[ \operatorname{vol}(B_R) = \iiiint_{x_1^2 + x_2^2 + x_3^2 + x_4^2 \le R^2} dx_1\, dx_2\, dx_3\, dx_4 = \frac{\pi^2}{\Gamma\left(\frac{4}{2} + 1\right)} R^4 = \frac{\pi^2}{2} R^4. \]
Then

\[ \frac{\pi^2}{2} R^4 > 2^4 p^2 = 16 p^2 \]

if we for instance take $R^2 = \frac{19p}{10}$. Then Minkowski’s theorem ensures the existence of a nonzero lattice point $(a, b, c, d) \in \Lambda \cap B_R$. This gives

\[ 0 < a^2 + b^2 + c^2 + d^2 \le R^2 < 2p. \]

Moreover, by the construction of $\Lambda$ together with (2.3),

\[ a^2 + b^2 + c^2 + d^2 \equiv 0 \pmod{p} \]

and this means that $a^2 + b^2 + c^2 + d^2$ is a multiple of $p$ and since it is positive and strictly smaller than $2p$ we get

\[ a^2 + b^2 + c^2 + d^2 = p. \]

Let $n$ be a composite integer with prime factorization

\[ n = \prod_{j=1}^{m} p_j^{t_j}. \]

We have just shown that each prime can be written as the sum of four squares, so

\[ n = \prod_{j=1}^{m} \left( a_j^2 + b_j^2 + c_j^2 + d_j^2 \right)^{t_j}. \]

This will also be a sum of four squares since

\[ (a^2 + b^2 + c^2 + d^2)(e^2 + f^2 + g^2 + h^2) = (ae - bf - cg - dh)^2 + (af + be + ch - dg)^2 + (ag - bh + ce + df)^2 + (ah + bg - cf + de)^2. \]

△
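Both ingredients of the argument, a four-square representation and the product identity, are easy to test on small numbers. The following sketch uses a naive exhaustive search (not the lattice argument of the proof); the function names are hypothetical.

```python
import math

def four_squares(n):
    """Return (a, b, c, d) with a <= b <= c <= d and a^2+b^2+c^2+d^2 = n,
    found by exhaustive search."""
    r = math.isqrt(n)
    for a in range(r + 1):
        for b in range(a, r + 1):
            for c in range(b, r + 1):
                rest = n - a * a - b * b - c * c
                if rest < c * c:  # d >= c is impossible from here on
                    break
                d = math.isqrt(rest)
                if d * d == rest:
                    return a, b, c, d
    return None

def euler_product(p, q):
    """Combine two four-square representations via the four-square identity."""
    a, b, c, d = p
    e, f, g, h = q
    return (a * e - b * f - c * g - d * h,
            a * f + b * e + c * h - d * g,
            a * g - b * h + c * e + d * f,
            a * h + b * g - c * f + d * e)

rep7, rep13 = four_squares(7), four_squares(13)
rep91 = euler_product(rep7, rep13)
print(rep7, rep13, rep91, sum(t * t for t in rep91))
```

The search is only meant for small $n$; it illustrates the statement of Proposition 12 rather than the proof via Minkowski's theorem.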

2.3 Minkowski’s Second Convex Body Theorem

In order to formulate and prove Minkowski’s second convex body theorem some theory is required. We will give a proof by contradiction following [18].

2.3.1 Gauge Functions

This section is dedicated to the concept of a gauge function introduced by Minkowski. The material below can be found in [18].

Definition 13. Let $X$ be a bounded, convex and open set in $\mathbb{R}^n$ containing the origin. The gauge function for $X$ is the function $f : \mathbb{R}^n \to [0, \infty)$ such that $f(0) = 0$, and for every $w \ne 0$, $f(w)$ is the unique positive real number such that $f(w)^{-1} w \in \partial X$.

We provide some examples to explore what this concept means.

Example 14. Let $X$ be the interior of the unit circle. Then the gauge function corresponding to $X$ is given by
\[ f(x_1, x_2) = \sqrt{x_1^2 + x_2^2}. \]

△

Example 15. If $X$ is the interior of an ellipsoid with axes $a > 0$, $b > 0$ and $c > 0$, then its gauge function is
\[ f(x_1, x_2, x_3) = \sqrt{\frac{x_1^2}{a^2} + \frac{x_2^2}{b^2} + \frac{x_3^2}{c^2}}. \]

△
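The gauge function of an ellipsoid can be evaluated directly, and its basic behaviour (the boundary is the level set $f = 1$, scaling by $\mu \ge 0$ scales $f$ by $\mu$, and $f(u + v) \le f(u) + f(v)$) can be observed numerically. This is a sketch with arbitrarily chosen axes and test vectors; the names are assumptions.

```python
import math

def ellipsoid_gauge(axes):
    """Gauge function of the open ellipsoid {x : sum (x_i / axes_i)^2 < 1}."""
    def f(x):
        return math.sqrt(sum((xi / ai) ** 2 for xi, ai in zip(x, axes)))
    return f

f = ellipsoid_gauge((2.0, 1.0, 0.5))
u, v = (1.0, -2.0, 0.5), (0.5, 3.0, 1.0)
w = tuple(ui + vi for ui, vi in zip(u, v))
boundary_point = tuple(ui / f(u) for ui in u)

print(f(boundary_point))                        # u / f(u) lies on the boundary
print(f(tuple(3 * ui for ui in u)), 3 * f(u))   # positive homogeneity
print(f(w), f(u) + f(v))                        # subadditivity
```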

Let us note two fundamental properties satisfied by any gauge function $f$. First,

\[ f(\mu v) = \mu f(v), \quad \forall v \in \mathbb{R}^n,\ \mu \ge 0. \tag{2.4} \]

Indeed, $f$ is the gauge function of $X$, and if $\mu > 0$ and $v \ne 0$, then (2.4) follows from the fact that $(\mu f(v))^{-1} \mu v = f(v)^{-1} v \in \partial X$; and in the remaining case when $\mu = 0$ or $v = 0$, both sides of (2.4) clearly vanish; hence (2.4) is proved. Secondly,

\[ f(u + v) \le f(u) + f(v), \quad \forall u, v \in \mathbb{R}^n. \tag{2.5} \]

In order to establish the subadditivity condition, (2.5), we notice that $f(v) \le 1$ for all vectors $v \in \overline{X} := X \cup \partial X$. In particular this means that for any convex combination $\lambda u_1 + (1 - \lambda) u_2 \in \overline{X}$ the following holds:

\[ f(\lambda u_1 + (1 - \lambda) u_2) \le 1, \quad \forall u_1, u_2 \in \overline{X}. \tag{2.6} \]

Assume, without loss of generality, that $u, v \ne 0$ and notice that $u/f(u) \in \overline{X}$ and $v/f(v) \in \overline{X}$ because of the positive-homogeneous condition

\[ f\left( \frac{u}{f(u)} \right) = 1 \]

and similarly for $v/f(v)$. By defining

\[ \lambda = \frac{f(u)}{f(u) + f(v)} \]

and applying (2.6) with $u_1 := u/f(u)$ and $u_2 := v/f(v)$ it follows that

\[ f\left( \frac{u + v}{f(u) + f(v)} \right) \le 1 \iff f(u + v) \le f(u) + f(v) \]

and (2.5) is fulfilled.

2.3.2 Successive Minima

One of the main objects that are studied in both [16] and in [15] are convex bodies. These are used in many areas of mathematics such as optimization theory and geometry, but the purpose they will have for us is in the construction of successive minima and hence helping us define a very fundamental and important function.

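Definition 16. Let $C$ be a convex body in $\mathbb{R}^n$, and let $\Lambda$ be a lattice of full rank in $\mathbb{R}^n$. For $j = 1, \dots, n$ we define $\lambda_j(C)$ to be the smallest positive $\lambda \in \mathbb{R}$ such that $|\operatorname{Span}\{\lambda C \cap \Lambda\}| \ge j$, that is, such that $\lambda C \cap \Lambda$ spans a subspace of dimension at least $j$. The numbers $\lambda_1(C), \dots, \lambda_n(C)$ are called the successive minima of $C$ with respect to $\Lambda$.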

2.3.3 Statement of Minkowski’s Second Theorem

Let $\lambda_1, \dots, \lambda_n$ denote the successive minima of a convex body, $C$, with respect to some lattice $\Lambda \subset \mathbb{R}^n$. Observe that this will correspond to an even gauge function. Minkowski’s second convex body theorem states the following:

Theorem 17. The successive minima fulfill

\[ \frac{2^n}{n!} \operatorname{vol}(\mathbb{R}^n/\Lambda) \le \prod_{j=1}^{n} \lambda_j \cdot \operatorname{vol}(C) \le 2^n \operatorname{vol}(\mathbb{R}^n/\Lambda). \tag{2.7} \]
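For a concrete planar body the successive minima and both inequalities of (2.7) can be checked by brute force. The sketch below assumes $\Lambda = \mathbb{Z}^2$ and an axis-aligned ellipse with hypothetical semi-axes; the search box is chosen large enough to contain all points whose gauge value is at most that of the standard basis vectors.

```python
import math
from itertools import product

def successive_minima_2d(gauge, radius):
    """Successive minima with respect to Z^2 of the body {x : gauge(x) <= 1},
    searching only the integer points in [-radius, radius]^2."""
    points = [p for p in product(range(-radius, radius + 1), repeat=2) if p != (0, 0)]
    points.sort(key=gauge)
    first = points[0]
    for p in points[1:]:
        # The first point not collinear with `first` gives the second minimum.
        if first[0] * p[1] - first[1] * p[0] != 0:
            return gauge(first), gauge(p)
    raise ValueError("search box too small")

a, b = 3.0, 0.25  # illustrative semi-axes of the ellipse

def gauge(p):
    return math.hypot(p[0] / a, p[1] / b)

lam1, lam2 = successive_minima_2d(gauge, radius=12)
volume = math.pi * a * b
print(lam1, lam2, lam1 * lam2 * volume)  # the product should lie in [2, 4]
```

Here $2^n / n! = 2$ and $2^n = 4$ for $n = 2$, and $\operatorname{vol}(\mathbb{R}^2/\mathbb{Z}^2) = 1$.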

Proof. The following proof can be found in [18] (we modify it slightly since our $C$ is compact whereas Siegel works with open convex bodies). Let $f$ be the gauge function of $C$; then

\[ C = \{x \in \mathbb{R}^n : f(x) \le 1\}. \]

Also, without loss of generality, let $\Lambda = \mathbb{Z}^n$. We will start by proving the second inequality in (2.7). Assume, towards a contradiction, that

\[ \prod_{j=1}^{n} \lambda_j \cdot \operatorname{vol}(C) > 2^n \operatorname{vol}(\mathbb{R}^n/\mathbb{Z}^n). \]

Since $\operatorname{vol}(\mathbb{R}^n/\mathbb{Z}^n) = 1$ the assumption is that

\[ \prod_{j=1}^{n} \lambda_j \cdot \operatorname{vol}(C) > 2^n. \tag{2.8} \]

Take vectors $x_1, \dots, x_n \in \mathbb{Z}^n$ such that the corresponding successive minima of the gauge function $f$ are attained at these integral points. Since these vectors are linearly independent they form a basis for $\mathbb{R}^n$.

We now define a map $T : C \to \mathbb{R}^n$ as follows: Let $x \in C$ be given, and choose $a_1, \dots, a_n \in \mathbb{R}$ so that

\[ x = \sum_{j=1}^{n} a_j x_j. \]

For $k \in \{0, \dots, n - 1\}$ we set

\[ L_k = \left\{ \sum_{j=1}^{k} c_j x_j + \sum_{j=k+1}^{n} a_j x_j : c_1, \dots, c_k \in \mathbb{R} \right\}; \tag{2.9} \]

this is an affine subspace of $\mathbb{R}^n$ and so $L_k \cap C$ is a bounded convex set containing the point $x$. Therefore the center of mass of $L_k \cap C$, which is denoted $c = c(a_{k+1}, \dots, a_n)$, is a well-defined point lying in $L_k \cap C$. In particular for $k = 0$ we get $c(a_1, \dots, a_n) = x$. Finally we define

\[ T(x) := \lambda_1 c(a_1, \dots, a_n) + (\lambda_2 - \lambda_1) c(a_2, \dots, a_n) + \cdots + (\lambda_n - \lambda_{n-1}) c(a_n). \tag{2.10} \]

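This will be an injective map because if

\[ T(x) = \sum_{j=1}^{n} b_j x_j \]

then it follows after some calculations that

\[ b_j = \lambda_j a_j + \sum_{k=j}^{n-1} (\lambda_{k+1} - \lambda_k) g_j(a_{k+1}, \dots, a_n), \tag{2.11} \]

where $g_j(a_{k+1}, \dots, a_n)$ denotes the $x_j$-coordinate of $c(a_{k+1}, \dots, a_n)$, which means that it is possible to uniquely determine $a_j$ from $b_k$ for $j, k = 1, \dots, n$. Hence $T$ is injective. Assume now that

\[ \sum_{k=j}^{n-1} (\lambda_{k+1} - \lambda_k) g_j(a_{k+1}, \dots, a_n) = 0 \]

for all $j = 1, \dots, n$. Then $T$ will be a rescaling and hence

\[ \operatorname{vol}(T(C)) = \operatorname{vol}(C) \prod_{j=1}^{n} \lambda_j. \]

If

\[ \sum_{k=j}^{n-1} (\lambda_{k+1} - \lambda_k) g_j(a_{k+1}, \dots, a_n) \ne 0 \]

and if $A$ is the matrix corresponding to the linear map changing basis to $\{x_1, \dots, x_n\}$ we get that the volume becomes

\[ \operatorname{vol}(T(C)) = \det A \cdot \int\!\!\cdots\!\!\int_{D} dy_1 \cdots dy_n = \det A \cdot \int\!\!\cdots\!\!\int_{\{(y_1, \dots, y_n) \in \mathbb{R}^n \,:\, \sum_{j=1}^{n} y_j x_j \in T(C)\}} dy_1 \cdots dy_n. \]

Without loss of generality, assume that the back substitution of (2.11) is

\[ \begin{cases} b_1 = \lambda_1 a_1 + \sum_{k=2}^{n-1} (\lambda_{k+1} - \lambda_k) g_2(a_{k+1}, \dots, a_n) \\ b_2 = \lambda_2 a_2 \\ \quad \vdots \\ b_n = \lambda_n a_n \end{cases} \]

and let the integration interval for $y_1$ be given by the functions $\varphi_1(a_2, \dots, a_n)$ and $\varphi_2(a_2, \dots, a_n)$. Then
\[ \operatorname{vol}(T(C)) = \det A \cdot \int\!\!\cdots\!\!\int_{\tilde{D}} \left( \int_{A_1(a_2, \dots, a_n)}^{A_2(a_2, \dots, a_n)} \lambda_1 \, dy_1 \right) dy_2 \cdots dy_n \]
where

\[ A_1(a_2, \dots, a_n) = \frac{1}{\lambda_1} \left( \varphi_1(a_2, \dots, a_n) - \sum_{k=j}^{n-1} (\lambda_{k+1} - \lambda_k) g_j(a_{k+1}, \dots, a_n) \right) \]

and

\[ A_2(a_2, \dots, a_n) = \frac{1}{\lambda_1} \left( \varphi_2(a_2, \dots, a_n) - \sum_{k=j}^{n-1} (\lambda_{k+1} - \lambda_k) g_j(a_{k+1}, \dots, a_n) \right). \]

Now,
\[ \begin{aligned} \operatorname{vol}(T(C)) &= \det A \cdot \int\!\!\cdots\!\!\int_{\tilde{D}} \left( \int_{A_1(a_2, \dots, a_n)}^{A_2(a_2, \dots, a_n)} \lambda_1 \, dy_1 \right) dy_2 \cdots dy_n \\ &= \det A \cdot \int\!\!\cdots\!\!\int_{\tilde{D}} \left( \varphi_2(a_2, \dots, a_n) - \sum_{k=j}^{n-1} (\lambda_{k+1} - \lambda_k) g_j(a_{k+1}, \dots, a_n) - \varphi_1(a_2, \dots, a_n) + \sum_{k=j}^{n-1} (\lambda_{k+1} - \lambda_k) g_j(a_{k+1}, \dots, a_n) \right) dy_2 \cdots dy_n \\ &= \det A \cdot \int\!\!\cdots\!\!\int_{\tilde{D}} \bigl( \varphi_2(a_2, \dots, a_n) - \varphi_1(a_2, \dots, a_n) \bigr) \, dy_2 \cdots dy_n, \end{aligned} \]
so the sums involving the functions $g_j$ cancel and do not contribute to the volume. Hence the volume of $T(C)$ is independent of whether the latter part of (2.11) is zero or not, which means that

\[ \operatorname{vol}(T(C)) = \operatorname{vol}(C) \prod_{j=1}^{n} \lambda_j. \tag{2.12} \]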

It is convenient, just as in the proof of Minkowski’s first convex body theorem, to scale the lattice, or equivalently $C$. Following the proof presented in [18], we consider the set $\tfrac{1}{2} T(C)$. Since

\[ \operatorname{vol}\left( \tfrac{1}{2} T(C) \right) = 2^{-n} \operatorname{vol}(C) \prod_{j=1}^{n} \lambda_j > 1 \]

by (2.8) and (2.12), it follows by Lemma 8 that there exist two distinct points $\tfrac{1}{2} y_1, \tfrac{1}{2} y_2 \in \tfrac{1}{2} T(C)$ whose difference lies in $\mathbb{Z}^n$. Since $T$ is injective we have $T^{-1}(y_1) \ne T^{-1}(y_2)$, and of course these two points lie in $C$. Then there are two vectors $y_1 \ne y_2 \in T(C)$ so that
\[ 0 \ne \tfrac{1}{2}(y_1 - y_2) \in \mathbb{Z}^n. \]

Notice that $T^{-1}(y_1), T^{-1}(y_2) \in C$. By expressing both these as linear combinations using vectors from $\{x_1, \dots, x_n\}$ we get

\[ T^{-1}(y_1) = \sum_{j=1}^{n} \alpha_j x_j \]

and

\[ T^{-1}(y_2) = \sum_{j=1}^{n} \beta_j x_j, \]

implying that $\alpha_j \ne \beta_j$ for some $j \in \{1, \dots, n\}$ due to injectivity. Let $k$ be the largest such index. From the definition of the map $T$ it follows that:

\[ \tfrac{1}{2}(y_1 - y_2) = \tfrac{1}{2} \lambda_1 \bigl( c(\alpha_1, \dots, \alpha_n) - c(\beta_1, \dots, \beta_n) \bigr) + \tfrac{1}{2} (\lambda_2 - \lambda_1) \bigl( c(\alpha_2, \dots, \alpha_n) - c(\beta_2, \dots, \beta_n) \bigr) + \cdots + \tfrac{1}{2} (\lambda_n - \lambda_{n-1}) \bigl( c(\alpha_n) - c(\beta_n) \bigr). \]

Since $\alpha_j = \beta_j$ for all $j > k$, the terms containing $c(\alpha_j, \dots, \alpha_n) - c(\beta_j, \dots, \beta_n)$ with $j > k$ vanish. This yields that

\[ \begin{aligned} f\left( \tfrac{1}{2}(y_1 - y_2) \right) &= f\Bigl( \tfrac{1}{2} \lambda_1 \bigl( c(\alpha_1, \dots, \alpha_n) - c(\beta_1, \dots, \beta_n) \bigr) + \tfrac{1}{2} (\lambda_2 - \lambda_1) \bigl( c(\alpha_2, \dots, \alpha_n) - c(\beta_2, \dots, \beta_n) \bigr) + \cdots + \tfrac{1}{2} (\lambda_k - \lambda_{k-1}) \bigl( c(\alpha_k, \dots, \alpha_n) - c(\beta_k, \dots, \beta_n) \bigr) \Bigr) \\ &\le \lambda_1 + (\lambda_2 - \lambda_1) + \cdots + (\lambda_k - \lambda_{k-1}) = \lambda_k, \end{aligned} \]

where we used that $f(x) \le 1$ for all $x \in C$. This is a contradiction to the choice of the vectors $\{x_1, \dots, x_n\}$. This finishes the proof of the second inequality of (2.7).

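In order to show the first inequality of (2.7), it is enough to observe that $x_k \in \lambda_k \partial C$, which in particular means that $\frac{1}{\lambda_k} x_k \in \partial C$. By symmetry of $C$ we have $-\frac{1}{\lambda_k} x_k \in \partial C$. After placing out all $2n$ points we get an $n$-dimensional octahedron, $Oc$. See Figure 2.4 for an illustration.

Figure 2.4: This figure describes the octahedron in three dimensions.

The volume becomes, after iterating,
\[ \operatorname{vol}(Oc) = \frac{2^n \, |\det(x_1, \dots, x_n)|}{n! \, \lambda_1 \cdots \lambda_n} \ge \frac{2^n}{n! \, \lambda_1 \cdots \lambda_n}, \]
where we used that $|\det(x_1, \dots, x_n)| \ge 1$ since $x_1, \dots, x_n \in \mathbb{Z}^n$ are linearly independent, and since $Oc \subset C$ it follows that

\[ \frac{2^n}{n! \, \lambda_1 \cdots \lambda_n} \le \operatorname{vol}(C) \iff \frac{2^n}{n!} \operatorname{vol}(\mathbb{R}^n/\Lambda) \le \prod_{j=1}^{n} \lambda_j \operatorname{vol}(C), \]

and the theorem is proved.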

Chapter 3

Rigid Systems and Convex Bodies

The goal of this chapter is to build up the theory required to deal with Roy’s contribution made in [15, Th. 1.3]. Roy’s work is a continuation of what was developed by Schmidt and Summerer in [16] and [17]. Hence parts of their work will also be presented in this chapter and we will see a connection between Roy’s article and that of Schmidt and Summerer. This chapter and Chapter 4 will serve as a survey of Roy’s article [15], which means that we follow it closely through these chapters also regarding notation. Notationwise, $V$ will always denote a finite dimensional Euclidean vector space and $\Lambda$ will denote a lattice in $V$. We also write $V = U \oplus W$ where $W \ne \{0\}$ and $U$ are subspaces of $V$ to split $V$ into a direct sum of orthogonal subspaces.

Initially, the construction of a special type of convex bodies will take place, where the shape of the body depends on a parameter Q > 0.

Definition 18. Let $\dim_{\mathbb{R}}(V) = m$ and let $W \ne \{0\}$ be a fixed linear subspace of $V$. We then define the following one parameter family of convex bodies:

\[ C(Q) := \{x \in V : \|x\| \le 1 \text{ and } \bigl\| x\|_{W} \bigr\| \le Q^{-1}\} \]

for $Q \ge 1$ and where $x\|_{W}$ denotes the orthogonal projection of $x$ on $W$. Furthermore, we let $\Gamma$ be a fixed lattice of full rank in $V$, and for each $j = 1, \dots, m$, we define $L_j : [0, \infty) \to \mathbb{R}$ by

\[ L_j(q) = \log \lambda_j(C(e^q)) \]

for $q \ge 0$. We also put

\[ L(q) = (L_1(q), \dots, L_m(q)). \]

Let the focus now be the foundations of the theory presented in [15]. Some fundamental properties of these functions presented in the above definition will be described later on. One thing we might establish directly is that L1(q) ≤ · · · ≤ Lm(q) for q ≥ 0 since the natural logarithm function is monotonically increasing.

Figure 3.1: This is an illustration of $C(Q)$ when $V = \mathbb{R}^3$. We have two hyperplanes at a distance $2Q^{-1}$ giving us the set of vectors inside the unit ball that also are in $C(Q)$. These are illustrated as the blue part of the unit sphere.

Example 19. The situation which will be most common for us is when $V = \mathbb{R}^n$ (thus $m = n$) and $W = \mathbb{R}u$ for a fixed unit vector $u$. In this case we write $C_u(Q) := C(Q)$, and we clearly have

\[ C_u(Q) = \{x \in V : \|x\| \le 1 \text{ and } |x \cdot u| \le Q^{-1}\}. \tag{3.1} \]

Note that since $u$ is a unit vector, $|x \cdot u|$ is the norm of the orthogonal projection of the vector $x$ onto $\mathbb{R}u$, in accordance with Definition 18. Observe also that if we consider $\mathbb{R}^n$ for $n \ge 2$ as

\[ \mathbb{R}^n \cong U \oplus W \tag{3.2} \]

with $W = \mathbb{R}u$ for a unit vector $u \in \mathbb{R}^n$ and with $U = W^{\perp}$ it is natural to consider $C_u(Q)$ for $Q \ge 1$ as a convex body. From this it is possible to consider a special case of $L$. Initially, define

\[ \Delta_n = \{x \in \mathbb{R}^n : x_1 \le \cdots \le x_n\} \tag{3.3} \]

to be the set of all monotone increasing $n$-tuples of real numbers. Next we define a map $L_u : [0, \infty) \to \Delta_n$ by

\[ L_u(q) = (L_{u,1}(q), \dots, L_{u,n}(q)) \tag{3.4} \]

for $q \ge 0$ where we for each $j = 1, \dots, n$ have

\[ L_{u,j}(q) := \inf\{y \in \mathbb{R}_{\ge 0} : |\operatorname{Span}(\mathbb{Z}^n \cap e^{y} C_u(e^{q}))| \ge j\}. \]

Then $L_{u,j}(q) = \log \lambda_j(C_u(e^q))$ with $\Gamma = \mathbb{Z}^n$ is indeed a special case of $L$.

△

We now prove some first properties of the functions defined above.

Proposition 20. The function L(q) is Lipschitz continuous.

Proof. If $0 \le r_1 \le r_2$ it follows that $C(e^{r_2}) \subseteq C(e^{r_1})$ and since

\[ e^{-r_2} = e^{r_1 - r_2} e^{-r_1} \]

we get

\[ e^{r_1 - r_2} C(e^{r_1}) \subseteq C(e^{r_2}). \]

The chain of inclusions then implies that

\[ L_j(r_1) \le L_j(r_2) \le L_j(r_1) + (r_2 - r_1) \tag{3.5} \]

for $j = 1, \dots, m$ and since this is true for all $0 \le r_1 \le r_2$ we see that $L_j$ is continuous since (3.5) implies that

\[ |L_j(r_2) - L_j(r_1)| \le |r_2 - r_1|, \]

which means that $L_j$ is Lipschitz continuous with Lipschitz constant 1.
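The functions $L_{u,1}$ and $L_{u,2}$ can be computed by brute force when $V = \mathbb{R}^2$ and the lattice is $\mathbb{Z}^2$, which makes the bounds in (3.5) visible on sample values. The sketch below rests on two assumptions made for illustration: a point $x$ lies in $e^{y} C_u(e^{q})$ exactly when $e^{y} \ge \max\{\|x\|, e^{q}|x \cdot u|\}$, and the search box $|x_i| \le e^{q} + 1$ suffices because both standard basis vectors satisfy this with $y = q$. The function names and the direction $u$ are hypothetical.

```python
import math
from itertools import product

def point_level(x, u, q):
    """Smallest y with x in e^y * C_u(e^q), i.e. log max(||x||, e^q |x.u|)."""
    norm = math.hypot(x[0], x[1])
    proj = abs(x[0] * u[0] + x[1] * u[1])
    return math.log(max(norm, math.exp(q) * proj))

def L_u(u, q):
    """Return (L_{u,1}(q), L_{u,2}(q)) for the lattice Z^2 by exhaustive search."""
    radius = math.floor(math.exp(q)) + 1
    points = [p for p in product(range(-radius, radius + 1), repeat=2) if p != (0, 0)]
    points.sort(key=lambda p: point_level(p, u, q))
    first = points[0]
    for p in points[1:]:
        if first[0] * p[1] - first[1] * p[0] != 0:  # first independent point
            return point_level(first, u, q), point_level(p, u, q)
    raise ValueError("search box too small")

u = (math.cos(1.0), math.sin(1.0))  # an arbitrary unit vector
qs = [0.5 * i for i in range(7)]
values = [L_u(u, q) for q in qs]
for q, (l1, l2) in zip(qs, values):
    print(q, round(l1, 4), round(l2, 4))
```

Consecutive rows of the output should increase by at most the step in $q$, in line with the Lipschitz constant 1.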

Example 21. Notice that then the corresponding functions

\[ L_{u,j}(q) = \log \lambda_j(C_u(e^q)) \tag{3.6} \]

will be continuous for all $j = 1, \dots, m$.

△

Moreover, the following property holds:

Lemma 22. The function $L(q)$ is piecewise linear with slopes 0 and 1.

Proof. As a first step, define

\[ \lambda_x(C(Q)) = \max\{\|x\|, Q \bigl\| x\|_{W} \bigr\|\} \]

where $x \in V$ and $Q \ge 1$. Using this one can define

\[ L_x(q) = \log \lambda_x(C(e^q)) \tag{3.7} \]

for $q \ge 0$. Intuitively this means that whenever $x \ne 0$ the function $L_x$ allows us to follow the trajectory of $x$, which depends on $C(e^q)$. A consequence of this is that any bounded subset of $[0, \infty) \times \mathbb{R}$ meets the graphs $\{(q, L_x(q))\}$ of only finitely many $x \in \Lambda \setminus \{0\}$. We further notice that $L_x$ is piecewise linear and if $x\|_{W} = 0$ then $L_x = \log \|x\|$ and hence it has slope 0. If $x\|_{W} \ne 0$ it will first have slope 0 and then, as $q$ increases, get slope 1. Since each $L_j$ locally coincides with one of the functions $L_x$, all component functions $L_1, \dots, L_m$ will be piecewise linear functions with slopes 0 and 1.
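The shape of a single trajectory $L_x$ can be made explicit in the case $W = \mathbb{R}u$: it is constant equal to $\log \|x\|$ up to the point $q_0 = \log(\|x\| / |x \cdot u|)$ and has slope 1 afterwards. The following is a minimal sketch of that observation; the function names and the sample vectors are assumptions.

```python
import math

def L_x(x, u, q):
    """log max(||x||, e^q |x.u|), the trajectory of x for W = R*u."""
    norm = math.sqrt(sum(t * t for t in x))
    proj = abs(sum(s * t for s, t in zip(x, u)))
    return math.log(max(norm, math.exp(q) * proj))

def breakpoint_of(x, u):
    """Value of q where the slope changes from 0 to 1, or None if x is
    orthogonal to u (then the trajectory is constant)."""
    norm = math.sqrt(sum(t * t for t in x))
    proj = abs(sum(s * t for s, t in zip(x, u)))
    if proj == 0:
        return None
    return math.log(norm / proj)

x, u = (3, 1, 0), (0.0, 1.0, 0.0)
q0 = breakpoint_of(x, u)
print(q0, [round(L_x(x, u, q), 4) for q in (0.0, 1.0, 2.0, 3.0)])
```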

3.1 Convex Bodies and Alternating k-forms

It is possible to somewhat generalize the concept from the previous section by considering spaces of alternating $k$-forms and see that the corresponding family of convex bodies for such spaces will work as an approximation of the convex hull of the exterior products of $k$ copies of $C_u(Q)$. Observe that we still follow Roy’s article, [15]. However, it will here be convenient to present some background regarding alternating $k$-forms. A good account on this topic can be found in [1], [14] and [7]. Let $\bigwedge(V)$ denote the exterior algebra of $V$ and recall that, by definition

\[ \bigwedge(V) := \bigoplus_{k=0}^{\infty} \bigwedge\nolimits^{k}(V) \]

where $\bigwedge^{k}(V)$ is the space of alternating multilinear forms

\[ \varphi : V^{*} \times \cdots \times V^{*} \to \mathbb{R} \]

under the $\wedge$-product operation defined for any $k, l \ge 0$ by

\[ \bigwedge\nolimits^{k}(V) \times \bigwedge\nolimits^{l}(V) \to \bigwedge\nolimits^{k+l}(V) \]

and

\[ (\varphi_1, \varphi_2) \mapsto \varphi_1 \wedge \varphi_2 := \frac{(k + l)!}{k! \, l!} \sum_{\sigma \in S_n} (\operatorname{sign} \sigma)\bigl( \sigma(\varphi_1 \otimes \varphi_2) \bigr) \tag{3.8} \]

where $S_n$ denotes the permutation group on $n$ letters. Note also that if $\{e_1, \dots, e_n\}$ is an orthonormal basis for $\mathbb{R}^n$, then $\{e_{j_1} \wedge \cdots \wedge e_{j_k}\}$ for $1 \le j_1 < \cdots < j_k \le n$ will be an orthonormal basis for $\bigwedge^{k} \mathbb{R}^n$ and with $U = W^{\perp}$ and $W = \mathbb{R}u$ we get

\[ \bigwedge\nolimits^{k} \mathbb{R}^n \cong \bigwedge\nolimits^{k} U \oplus \left( \bigwedge\nolimits^{k-1} U \wedge \mathbb{R}u \right). \]

If we put $W^{(k)} = \bigwedge^{k-1} U \wedge \mathbb{R}u$ it is natural to consider the following, corresponding, parameter family of convex bodies:

\[ C_u^{(k)}(Q) := \left\{ x \in \bigwedge\nolimits^{k} \mathbb{R}^n : \|x\| \le 1 \text{ and } \bigl\| x\|_{W^{(k)}} \bigr\| \le Q^{-1} \right\} \]

and the associated maps

\[ L_{u,j}^{(k)}(q) = \log\bigl( \lambda_j(C_u^{(k)}(e^q)) \bigr). \]

Definition 23. The $k$-th compound convex body of $C_u(Q)$ is the convex hull of the exterior products of $k$ elements of $C_u(Q)$.

Example 24. For $k = 1$ we have $\bigwedge^{1} \mathbb{R}^n \cong \mathbb{R}^n$ so $C_u^{(1)}(Q) = C_u(Q)$ and for $k = n$ we have $\bigwedge^{n} U = \{0\}$ and hence

\[ C_u^{(n)}(Q) = \left\{ x \in \bigwedge\nolimits^{n} \mathbb{R}^n : \|x\| \le Q^{-1} \right\}. \tag{3.9} \]

We continue to follow [15] and Roy’s way to describe the theory developed in Schmidt and Summerer [16] and [17]. The following lemma shows that $\bigwedge^{k} C_u(Q) \subseteq k C_u^{(k)}(Q)$ and can be found in [15, Lemma 2.4]:

Lemma 25. If $x_1, \dots, x_k \in C_u(Q)$ with $Q \ge 1$, then $x_1 \wedge \dots \wedge x_k \in k C_u^{(k)}(Q)$ and in particular if $z_1, \dots, z_k$ are linearly independent vectors in $\mathbb{Z}^n$ we have the inequality

\[ L(z_1 \wedge \dots \wedge z_k, q) \le \sum_{j=1}^{k} L(z_j, q) + \log k, \tag{3.10} \]

where $q \ge 0$ and the function $L(x, \cdot) : [0, \infty) \to \mathbb{R}$ is defined by

\[ L(x, q) = \max\{\log \|x\|, \ q + \log \bigl\| x\|_{W^{(k)}} \bigr\|\}. \tag{3.11} \]

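Proof.

Let $U$ be as in (3.2) and define $y_j = x_j\|_{U}$. For the sake of clarity the inner product $x \cdot y$ will here be denoted $\langle x, y \rangle$, for $x$ and $y$ vectors. First note that

\[ \|x_1 \wedge \cdots \wedge x_k\| \le \|x_1\| \cdots \|x_k\| \le 1 \]

and that by the standard projection formula from linear algebra

\[ x_1 \wedge \cdots \wedge x_k = (y_1 + \langle x_1, u \rangle u) \wedge \cdots \wedge (y_k + \langle x_k, u \rangle u) = y_1 \wedge \cdots \wedge y_k + \sum_{j=1}^{k} (-1)^{j+k} \langle x_j, u \rangle \, y_1 \wedge \cdots \wedge \widehat{y_j} \wedge \cdots \wedge y_k \wedge u, \]

where $\widehat{y_j}$ means that the element $y_j$ is omitted from the product. Here $y_1 \wedge \cdots \wedge y_k \in \bigwedge^{k} U$ while the sum lies in $W^{(k)}$, so that

\[ \sum_{j=1}^{k} (-1)^{j+k} \langle x_j, u \rangle \, y_1 \wedge \cdots \wedge \widehat{y_j} \wedge \cdots \wedge y_k \wedge u = (x_1 \wedge \cdots \wedge x_k)\|_{W^{(k)}}. \]

By taking norms on both sides we see that

\[ \bigl\| (x_1 \wedge \cdots \wedge x_k)\|_{W^{(k)}} \bigr\| \le \sum_{j=1}^{k} |(-1)^{j+k}| \cdot |\langle x_j, u \rangle| \prod_{\substack{i=1 \\ i \ne j}}^{k} \|y_i\| \cdot \|u\| = \sum_{j=1}^{k} |\langle x_j, u \rangle| \prod_{\substack{i=1 \\ i \ne j}}^{k} \|y_i\| \le \sum_{j=1}^{k} |\langle x_j, u \rangle| \le \sum_{j=1}^{k} Q^{-1} = k Q^{-1}. \]

The conclusion is hence that $x_1 \wedge \cdots \wedge x_k \in k C_u^{(k)}(Q)$. In order to prove the inequality in (3.10) we note that $\exp(L(z_j, q))^{-1} z_j \in C_u(e^q)$ by the definition of $L(z_j, q)$. This means that

\[ \frac{z_1 \wedge \cdots \wedge z_k}{\prod_{j=1}^{k} \exp(L(z_j, q))} \in k C_u^{(k)}(e^q). \]

A consequence is then that

\[ L(z_1 \wedge \dots \wedge z_k, q) = \max\{\log \|z_1 \wedge \dots \wedge z_k\|, \ q + \log \bigl\| (z_1 \wedge \dots \wedge z_k)\|_{W^{(k)}} \bigr\|\} \le \log\left( k \prod_{j=1}^{k} \exp(L(z_j, q)) \right) = \sum_{j=1}^{k} L(z_j, q) + \log k. \]

This completes the proof of the lemma.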
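For $k = 2$ the first claim of Lemma 25 can be tested numerically with coordinates. The sketch below assumes $u = e_n$, so that $W^{(2)}$ is spanned by the basis vectors $e_i \wedge e_n$ with $i < n$, and it represents $x \wedge y$ by its $2 \times 2$ minors in the orthonormal basis $e_i \wedge e_j$, $i < j$. The sampling scheme, the seed and all names are hypothetical.

```python
import math
import random
from itertools import combinations

def wedge2(x, y):
    """Coordinates of x ^ y in the basis e_i ^ e_j (i < j), as 2x2 minors."""
    n = len(x)
    return {(i, j): x[i] * y[j] - x[j] * y[i] for i, j in combinations(range(n), 2)}

def norms(w, n):
    """Norm of w and norm of its projection on W^(2) = span{e_i ^ e_n}."""
    total = math.sqrt(sum(c * c for c in w.values()))
    proj = math.sqrt(sum(c * c for (i, j), c in w.items() if j == n - 1))
    return total, proj

def sample_body_point(n, Q, rng):
    """Random point with ||x|| <= 1 and |x_n| <= 1/Q (rejection sampling)."""
    while True:
        x = [rng.uniform(-1, 1) for _ in range(n - 1)] + [rng.uniform(-1 / Q, 1 / Q)]
        if sum(t * t for t in x) <= 1:
            return x

rng = random.Random(0)
n, Q = 4, 5.0
worst_total = worst_proj = 0.0
for _ in range(1000):
    x, y = sample_body_point(n, Q, rng), sample_body_point(n, Q, rng)
    total, proj = norms(wedge2(x, y), n)
    worst_total, worst_proj = max(worst_total, total), max(worst_proj, proj)
print(worst_total <= 1.0, worst_proj <= 2.0 / Q)
```

Both printed comparisons correspond to the two conditions defining $2 C_u^{(2)}(Q)$, after dividing by the factor $k = 2$ where needed.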

In 1955 Mahler published two articles, [10] and [11], on the compound of convex bodies, and the main idea and original source for Lemma 26 comes from his theory in these articles. However, for the sake of consistency we will continue to follow Roy’s, [15], way of presenting this result and his added estimates to the inequality. Define

\[ N := \dim_{\mathbb{R}} \bigwedge\nolimits^{k} \mathbb{R}^n = \binom{n}{k} \]

and then we state the lemma:

Lemma 26. Order the elements of the set

\[ \left\{ \sum_{m=1}^{k} L_{u,j_m}(q) : 1 \le j_1 < \cdots < j_k \le n \right\} \]

in monotone increasing order and define $\{S_{u,j}^{(k)}\}_{j=1}^{N}$ to be the sequence consisting of those. Then the following inequality holds true:

\[ -\log n \le S_{u,j}^{(k)}(q) - L_{u,j}^{(k)}(q) \le 2^n n \log n \tag{3.12} \]

for $q \ge 0$ and $1 \le j \le N$.
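The sequence $S_{u,1}^{(k)}, \dots, S_{u,N}^{(k)}$ is just the sorted list of all sums of $k$ distinct components of $L_u(q)$. A short sketch, using made-up integer values in place of $L_{u,1}(q), \dots, L_{u,n}(q)$, shows the construction together with two counting facts: there are $\binom{n}{k}$ sums, and each component occurs in exactly $\binom{n-1}{k-1}$ of them.

```python
from itertools import combinations
from math import comb

def k_sums(values, k):
    """All sums of k entries with strictly increasing indices, sorted increasingly."""
    return sorted(sum(c) for c in combinations(values, k))

L = [1, 4, 9, 16]  # placeholder values, not actual successive minima
S = k_sums(L, 2)
print(S)
print(len(S) == comb(4, 2), sum(S) == comb(3, 1) * sum(L))
```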

n Proof. Pick y1, ..., yk as linearly independent vectors of Z in such a way that L(yj, q) = Lu,j(q) for j = 1, ..., N. This can always be done since there for each j = 1, ..., n exist a successive minima q for Cu(e ). This means in particular that

y1 ∧ · · · ∧ yk

Vk n for 1 ≤ j1 < ... < jk ≤ N will be linearly independent in Z . Let (µ1, ..., µN ) be the N-tuple consisting of the numbers L(y1 ∧ · · · ∧ yk, q) for 1 ≤ j1 < ... < jk ≤ n in monotone increasing order. Then L(yj, q) ≤ µj for all j = 1, ..., N since L(yj, q) will act as the j:th successive minima (k) q for Cu (e ). Lemma 25 now implies that

k X (k) L(yj, q) ≤ L(zj, q) + log k =⇒ L(yj, q) ≤ Su,j (q) + log k j=1 for j = 1, ..., n. Observe specially that Lu,j(q) is monotonically increasing, which was proved earlier, and hence

n X Lu,i(q) − q ≤ n log n (3.13) i=1 holds for q ≥ 0 together with

N       X (k) n − 1 n n L (q) − q ≤ log = N log N. (3.14) u,i k − 1 k k i=1

That this is true is due to Minkowksi’s second convex body theorem, Theorem 17. In the latter situation the inequalities

N 2N Y ≤ λ (C(k)(Q))vol(C(k)(Q)) ≤ 2N (3.15) N! i u u i=1

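hold true. The next step is to derive an inequality for the volume of $C_u^{(k)}(Q)$. To do this, take $E := \{e_1, \dots, e_K\}$ as an orthonormal basis for $\bigwedge^{k-1} U \wedge \mathbb{R}u$ where

\[
K = \dim_{\mathbb{R}} \left( \bigwedge\nolimits^{k-1} U \wedge \mathbb{R}u \right).
\]

Define the following convex body using a natural extension of $E$:

\[
C := \left\{ x \in \bigwedge\nolimits^{k} \mathbb{R}^n : \max_{1 \le j \le K} |x \cdot e_j| \le Q^{-1}, \ \max_{K+1 \le j \le N} |x \cdot e_j| \le 1 \right\}.
\]

The set $C$ will then be an $N$-dimensional cuboid. Notice that the volume of this cuboid will be $2^N Q^{-K}$. Further we have that $N^{-1} C \subset C_u^{(k)}(Q) \subset C$ and in terms of volume this means

\[
\left( \frac{2}{N} \right)^{N} Q^{-K} \le \mathrm{vol}\big(C_u^{(k)}(Q)\big) \le 2^N Q^{-K}
\]

\[
\implies \frac{1}{N!} \le \prod_{i=1}^{N} \lambda_i\big(C_u^{(k)}(Q)\big) \, Q^{-K} \le N^N.
\]

Taking logarithms on both sides then implies

\[
\left| \log \left( \prod_{i=1}^{N} \lambda_i\big(C_u^{(k)}(Q)\big) \right) - K \log Q \right| \le \binom{n}{k} \log \binom{n}{k} = N \log N
\]

\[
\iff \left| \sum_{i=1}^{N} \log \lambda_i\big(C_u^{(k)}(Q)\big) - K \log Q \right| \le \binom{n}{k} \log \binom{n}{k} = N \log N
\]

\[
\iff \left| \sum_{i=1}^{N} L^{(k)}_{u,i}(q) - K \log Q \right| \le N \log N,
\]

and if we change notation by letting $Q = e^q$ and observe that

\[
K = \dim_{\mathbb{R}} \left( \bigwedge\nolimits^{k-1} U \wedge \mathbb{R}u \right) = \binom{n-1}{k-1}
\]

we indeed get (3.14). Showing that (3.13) holds is done in exactly the same way. Now that we have established (3.13) and (3.14), we continue the proof via the following estimate: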

\[
\sum_{j=1}^{N} \left( S^{(k)}_{u,j}(q) - L^{(k)}_{u,j}(q) + \log k \right) = \sum_{j=1}^{N} S^{(k)}_{u,j}(q) - \sum_{j=1}^{N} L^{(k)}_{u,j}(q) + N \log k
\]

\[
= \sum_{1 \le j_1 < \cdots < j_k \le n} \sum_{m=1}^{k} L_{u,j_m}(q) - \sum_{j=1}^{N} L^{(k)}_{u,j}(q) + N \log k = \binom{n-1}{k-1} \sum_{j=1}^{n} L_{u,j}(q) - \sum_{j=1}^{N} L^{(k)}_{u,j}(q) + \binom{n}{k} \log k
\]

\[
\le \binom{n-1}{k-1} n \log n + N \log(kN)
\]

due to the inequalities in (3.13) and (3.14). Because

\[
n \binom{n-1}{k-1} = kN \le n^k
\]

it follows that

\[
\sum_{j=1}^{N} \left( S^{(k)}_{u,j}(q) - L^{(k)}_{u,j}(q) + \log k \right) \le 2 \binom{n-1}{k-1} n \log n \le 2^n n \log n,
\]

which in addition means that

\[
-\log n \le S^{(k)}_{u,j}(q) - L^{(k)}_{u,j}(q) \le 2^n n \log n.
\]
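The arithmetic in the last steps of the proof can be checked numerically. The sketch below is only a sanity check, not part of the argument. It assumes the reading $n\binom{n-1}{k-1} = kN \le n^k$ of the identity above, and the function name `check_lemma26_arithmetic` is hypothetical.

```python
from math import comb, log

def check_lemma26_arithmetic(n):
    """Check, for every 1 <= k <= n, the arithmetic used at the end of the proof."""
    for k in range(1, n + 1):
        N = comb(n, k)          # N = binom(n, k)
        K = comb(n - 1, k - 1)  # K = binom(n-1, k-1)
        assert n * K == k * N   # n * binom(n-1, k-1) = k * binom(n, k)
        assert k * N <= n ** k
        # binom(n-1,k-1) * n*log(n) + N*log(kN) is bounded by 2^n * n * log(n)
        total = K * n * log(n) + N * log(k * N)
        assert total <= 2 ** n * n * log(n) + 1e-9
    return True

for n in range(2, 13):
    check_lemma26_arithmetic(n)
```

The small tolerance `1e-9` only guards against floating-point rounding in the logarithms.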

Chapter 4

The Approximation Theorem of Schmidt and Summerer

This chapter is dedicated to the approximation theorem of Schmidt and Summerer, Theorem 28. The original source for this theorem is [17, p. 57 ff.], but we will still use Roy's notation. However, translation is needed between the two articles. Unlike Schmidt and Summerer, Roy uses the polar bodies. The representations are equivalent, as Roy points out, and we will provide a proof of this in the appendix to this chapter.

For now we focus on the approximation theorem of Schmidt and Summerer, namely Theorem 28. Let us start with the concept of an $(n, \gamma)$-system in order to state Theorem 28 in the same way as in [15, p. 750]. The notion of $(n, \gamma)$-systems is omitted in [17], but since we are following Roy's article [15] very closely, we use his way of presenting the proof as well. The following definition can be found in [15, Def. 2.8]:

Definition 27. For $\gamma, q_0 \ge 0$, a so-called $(n, \gamma)$-system on the half line $[q_0, \infty)$ is a function $P = (P_1, \dots, P_n) : [q_0, \infty) \to \mathbb{R}^n$ such that the following holds:

(i) −γ ≤ Pj(q) ≤ Pj+1(q) + γ for 1 ≤ j < n and q0 ≤ q;

(ii) Pj(q1) ≤ Pj(q2) + γ for 1 ≤ j ≤ n and q0 ≤ q1 ≤ q2;

(iii) the function given by Mj := P1 + ··· + Pj :[q0, ∞) → R is a continuous function for all j = 1, ..., n and piecewise linear with slopes 0 and 1;

(iv) Mn(q) = q for q0 ≤ q;

(v) if there is some j ∈ {1, ..., n − 1} such that Mj changes slope from 1 to 0 at a point q > q0, then

Pj+1(q) ≤ Pj(q) + γ.

The first natural thing to ask after a definition has been stated is whether or not such an object exists. The existence of such a function P given in the above definition is guaranteed by the following theorem, found in [15, p. 750]:

Theorem 28. For every unit vector $u \in \mathbb{R}^n$ there exists an $(n, \gamma)$-system $P : [0, \infty) \to \mathbb{R}^n$ for $\gamma = 6n2^n \log(n)$ such that

\[
\sup_{q \ge 0} \| P(q) - L_u(q) \|_\infty \le \gamma.
\]

The original source for Theorem 28 goes back to [17, p. 57 ff.] except the precise value of γ, which is due to Roy.

Proof. Start by choosing $j_1 = 1, \dots, j_k = k$ in (3.12) from Lemma 26. Then for every $q \ge 0$ the following inequality holds:

\[
\left| L^{(k)}_{u,1}(q) - \sum_{j=1}^{k} L_{u,j}(q) \right| \le 2^n n \log n. \tag{4.1}
\]

It will be convenient to put $P_k = L^{(k)}_{u,1}(q) - L^{(k-1)}_{u,1}(q)$ for all $k = 1, \dots, n$ (with $L^{(0)}_{u,1} := 0$) because the triangle inequality implies that

\[
|P_k(q) - L_{u,k}(q)| \le 2^{n+1} n \log n \implies -2^{n+1} n \log n \le L_{u,k}(q) - 2^{n+1} n \log n \le P_k(q)
\]

and hence part (i) of the definition is satisfied. Moreover,

\[
P_k(q) \le L_{u,k}(q) + 2^{n+1} n \log n \le L_{u,k+1}(q) + 2^{n+1} n \log n \le P_{k+1}(q) + 2^{n+2} n \log n
\]

and for $0 \le q_1 \le q_2$ the inequality

\[
P_k(q_1) \le L_{u,k}(q_1) + 2^{n+1} n \log n \le P_k(q_2) + 2^{n+2} n \log n
\]

holds for all $1 \le k \le n$, so (ii) is fulfilled. We also see that

\[
M_k = \sum_{j=1}^{k} P_j = \sum_{j=1}^{k} \left( L^{(j)}_{u,1}(q) - L^{(j-1)}_{u,1}(q) \right) = L^{(k)}_{u,1}(q).
\]

By Proposition 20 the function $M_k$ will be continuous for all $k = 1, \dots, n$ and by Lemma 22 it will be piecewise linear with slopes 0 and 1. This completes part (iii). From (3.9) it follows that $\| x|_{W^{(n)}} \| = \|x\|$ for any $x \in \bigwedge^n \mathbb{R}^n$, and hence for any $q \ge 0$ we have

\[
L^{(n)}_{u,1}(q) = \log e^q = q.
\]

This proves (iv). Finally, the part that is left is (v). This one is implied by the fact that when the function $L^{(k)}_{u,1}$ changes slope from 1 to 0 at $q > 0$ the equality $L^{(k)}_{u,1}(q) = L^{(k)}_{u,2}(q)$ holds. This is true because if $x$ and $y$ are two linearly independent vectors in $\Lambda$ such that $\frac{d}{dq} L^{(k)}_x(q) = 1$ and $\frac{d}{dq} L^{(k)}_y(q) = 0$, then both $x$ and $y$ belong to $\exp(L^{(k)}_1(q)) \, C^{(k)}(e^q)$. Without loss of generality one can assume that $k < n$, and by letting $j_1 = 1, \dots, j_{k-1} = k - 1$ and $j_k = k + 1$ in (3.12) from Lemma 26 we get that

\[
\left| L^{(k)}_{u,2}(q) - \sum_{j=1}^{k-1} L_{u,j}(q) - L_{u,k+1}(q) \right| \le 2^n n \log n \tag{4.2}
\]

and combining (4.2) with (4.1) implies that

\[
\left| \left( L^{(k)}_{u,1}(q) - \sum_{j=1}^{k} L_{u,j}(q) \right) - \left( L^{(k)}_{u,2}(q) - \sum_{j=1}^{k-1} L_{u,j}(q) - L_{u,k+1}(q) \right) \right| \le 2^{n+1} n \log n \iff |L_{u,k+1} - L_{u,k}| \le 2^{n+1} n \log n.
\]

Because $L_{u,k+1} \ge L_{u,k}$ we get

\[
L_{u,k+1} - L_{u,k} \le 2^{n+1} n \log n \implies P_{k+1}(q) \le P_k + 2 \cdot 2^n n \log n \le P_k + 6 \cdot 2^n n \log n = P_k + \gamma.
\]

This proves (v) and hence the existence of an $(n, \gamma)$-system $P : [0, \infty) \to \mathbb{R}^n$ for $\gamma = 6n2^n \log(n)$ such that

\[
\sup_{q \ge 0} \| P(q) - L_u(q) \|_\infty \le \gamma
\]

has been shown.

Schmidt and Summerer derived the same conclusion in [17]. Let $\mu_1, \dots, \mu_N$ denote the successive minima of the lattice in $\mathbb{R}^N$ generated by all exterior products $x_1 \wedge x_2 \wedge \cdots \wedge x_k$ for $x_1, \dots, x_k \in \Lambda$. They derived inequality (4.1) from the fact that

\[
\mu_1 \asymp \prod_{k=1}^{N} \lambda_k\big(C_u^{(k)}(e^q)\big). \tag{4.3}
\]

The proof of (4.3) can be found in [10]. Hence there exists a constant $c$ depending on $n$ giving

\[
\left| \log \mu_1 - \sum_{k=1}^{N} \log \lambda_k\big(C_u^{(k)}(e^q)\big) \right| = \left| L^{(k)}_{u,1}(q) - \sum_{j=1}^{k} L_{u,j}(q) \right| \le c.
\]

The rest of the proof is almost the same except for the estimate of $\gamma$.

Recall that in Example 19 we defined

\[
\Delta_n = \{ x \in \mathbb{R}^n : x_1 \le \cdots \le x_n \}.
\]

Next the notion of a canvas will be defined. This is a mathematical object that provides aid to graphically understand the rigid n-systems that will be defined later on. The definition below is

cited from Roy's article [15], but we notice that Schmidt and Summerer do not use this concept in their article [17]. This can be seen as the point where Roy starts to create a new way of looking at the situation. Eventually the concept of a rigid n-system will be defined and it will be used in the proof of what is in this thesis called Roy's contribution.

Definition 29. Let $\Phi_n : \mathbb{R}^n \to \Delta_n$ be the continuous function that lists the coordinates of a given point in monotone increasing order and let $\delta \in \mathbb{R}_+$ and $s \in (\mathbb{N} \cup \{\infty\}) \setminus \{0\}$. A canvas with mesh $\delta$ and cardinality $s$ in $\mathbb{R}^n$ is a triple $(a, k, l)$ where $a = \{a_i\}_{i=0}^{s-1}$ is a sequence of points in $\Delta_n$ and where $k = \{k_i\}_{i=0}^{s-1}$ and $l = \{l_i\}_{i=0}^{s-1}$ are two sequences of integers of the same cardinality $s$ such that for each index $0 \le i < s$ we have that

(i) the coordinates $(a_i^1, \dots, a_i^n)$ of $a_i$ form a strictly increasing sequence of positive multiples of $\delta$;

(ii) we have that $1 \le k_0 \le l_0 = n$ and $1 \le k_i < l_i \le n$ given that $i \ge 1$;

(iii) if $i + 1 < s$ then $k_i \le l_{i+1}$ and $a_i^{l_{i+1}} + \delta \le a_{i+1}^{l_{i+1}}$;

(iv) we have that

\[
(a_i^1, \dots, \widehat{a_i^{k_i}}, \dots, a_i^n) = (a_{i+1}^1, \dots, \widehat{a_{i+1}^{l_{i+1}}}, \dots, a_{i+1}^n) \tag{4.4}
\]

for $i + 1 < s$.

Before we give an example, we note that the sequence a uniquely determines all li and ki, except ks−1 in the case s < ∞.

Example 30. Let $a = \{(2, 4, 6, 8), (2, 4, 8, 12), (2, 4, 12, 14)\}$. Then the coordinates of $a_0$ form a strictly increasing sequence of positive multiples of $\delta = 2$. This is also true for $a_1$ and $a_2$. We have that $l_0 = 4$ and the inequalities

\[
\begin{cases}
1 \le k_0 \le l_1 \le 4 \\
1 \le k_1 < l_2 \le 4 \\
a_0^{l_1} + 2 \le a_1^{l_1} \\
a_1^{l_2} + 2 \le a_2^{l_2},
\end{cases}
\]

which together with (4.4) gives that

l = (4, 4, 4) and

k = (3, 3, k2).

As one can see, we can pick k2 = 3 and then get a canvas with mesh 2 and cardinality 3.
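The values of $k_i$ and $l_{i+1}$ in Example 30 can be recovered mechanically from condition (4.4): one looks for the coordinate to delete from $a_i$ and the coordinate to delete from $a_{i+1}$ so that the remaining tuples agree. The following Python sketch does this by brute force. The function name `transitions` is hypothetical and not taken from [15].

```python
def transitions(a_prev, a_next):
    """Return all 1-based pairs (k, l) such that deleting the k-th coordinate
    of a_prev and the l-th coordinate of a_next gives the same tuple, cf. (4.4)."""
    n = len(a_prev)
    return [(k, l)
            for k in range(1, n + 1)
            for l in range(1, n + 1)
            if a_prev[:k - 1] + a_prev[k:] == a_next[:l - 1] + a_next[l:]]

a = [(2, 4, 6, 8), (2, 4, 8, 12), (2, 4, 12, 14)]
delta = 2
for prev, nxt in zip(a, a[1:]):
    (k, l), = transitions(prev, nxt)  # exactly one admissible pair in this example
    # condition (iii): k_i <= l_{i+1} and the l_{i+1}-th coordinate grows by at least delta
    assert k <= l and prev[l - 1] + delta <= nxt[l - 1]
    print(k, l)  # prints "3 4" for both transitions
```

This reproduces $k_0 = k_1 = 3$ and $l_1 = l_2 = 4$, while $k_2$ remains free as noted before the example.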

The next definition will help us to understand these canvases graphically.

Definition 31. Let P := (P1, ..., Pn):[c, ∞) → ∆n be a function. The combined graph of P above I, where I is a subinterval of [c, ∞), is the combined graph of its components restricted to I.

Definition 32. To each canvas of mesh $\delta > 0$ we associate the function $P : [q_0, \infty) \to \Delta_n$ given by

\[
P(q) = \Phi_n\big(a_i^1, \dots, \widehat{a_i^{k_i}}, \dots, a_i^n, \ a_i^{k_i} + q - q_i\big) \tag{4.5}
\]

for $0 \le i < s$ and $q_i \le q < q_{i+1}$, where $q_i = a_i^1 + \cdots + a_i^n$ for $0 \le i < s$ and $q_s = \infty$ if $s < \infty$. Such a function is called a rigid n-system with mesh $\delta$ and $\{q_i\}_{i=0}^{s-1}$ is its sequence of switch numbers.

We see from the definition above that if $i + 1 < s$ then $P$ is a continuous function and that on the interval $[q_i, q_{i+1})$ the component $a_i^{k_i} + q - q_i$ gives us a line segment of slope 1 since $a_i^{k_i} < q_i$ for all $i$. The rest of the line segments on this interval will be horizontal.

Example 33. Let us draw the combined graph of a rigid 4-system with mesh 2 with respect to the canvas given in Example 30. For i = 0 we get

\[
\begin{cases}
a_0^1 = 2 \\
a_0^2 = 4 \\
a_0^3 + q - q_0 = q - 14 \\
a_0^4 = 8
\end{cases}
\]

on the interval $[q_i, q_{i+1}) = [20, 26)$. If one does similar calculations for the other intervals we get the combined graph drawn in Figure 4.1. The switch numbers are in this case $\{20, 26, 32\}$.

Figure 4.1: The combined graph of a rigid 4-system with mesh 2 with respect to the canvas given in Example 30.
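The computation behind Figure 4.1 can be sketched in a few lines of Python: the switch numbers are the coordinate sums of the points $a_i$, and on each interval one coordinate is replaced by a term growing with slope 1 before sorting, as in (4.5). The function names and the choice $k_2 = 3$ below are illustrative assumptions. The code only handles canvases of finite cardinality.

```python
def switch_numbers(a):
    """q_i = a_i^1 + ... + a_i^n for each point of the canvas."""
    return [sum(point) for point in a]

def rigid_system(a, k, q):
    """Evaluate P(q) from (4.5) for a finite canvas.

    a: list of points (tuples), k: list of 1-based indices k_i.
    """
    qs = switch_numbers(a)
    if q < qs[0]:
        raise ValueError("q lies below the first switch number q_0")
    i = max(j for j, qj in enumerate(qs) if qj <= q)  # q_i <= q < q_{i+1}
    point = list(a[i])
    moving = point.pop(k[i] - 1) + (q - qs[i])        # a_i^{k_i} + q - q_i
    return tuple(sorted(point + [moving]))            # Phi_n sorts the coordinates

a = [(2, 4, 6, 8), (2, 4, 8, 12), (2, 4, 12, 14)]
k = [3, 3, 3]  # k_2 = 3 is the free choice made in Example 30
print(switch_numbers(a))       # [20, 26, 32]
print(rigid_system(a, k, 23))  # (2, 4, 8, 9)
```

At each switch number the function returns the corresponding canvas point, for instance `rigid_system(a, k, 26)` gives `(2, 4, 8, 12)`, which reflects the continuity of $P$.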

△

Remark 34. More can be said about combined graphs of rigid n-systems. Notice for instance that in the combined graph in Figure 4.1, the line segments with slope 1 decrease from left to right. This is no coincidence but follows since we have strict inequality in $1 \le k_i < l_i \le n$ for $i \ge 1$. This means that $a^{(i)}_{k_i} - q_i > a^{(i+1)}_{k_{i+1}} - q_{i+1}$ and the statement follows.

Example 35. We will now provide the reader with one more example of a combined graph of a rigid n-system. Consider the canvas $\{(3, 9, 12, 15), (3, 9, 15, 21)\}$ of cardinality 2 and mesh 3. The corresponding rigid 4-system is given by

\[
P(q) = \Phi_n((3, 9, 15, 12 + q - 39)) = \Phi_n((3, 9, 15, q - 27)) \qquad (39 \le q < 48)
\]

since k0 = 3. This means that we will have three horizontal line segments starting from (0, 3), (0, 9) and (0, 15) together with a line segment of slope one starting at (0, −27) and reaching the vertical line x = qi = 39 at the point (39, 12). Putting k1 = 3 then gives

P(q) = Φn((3, 9, 21, 15 + q − 48)) = Φn((3, 9, 21, q − 33)) (48 ≤ q < ∞)

The system will continue to look like this for all q ≥ 48. See an illustration in Figure 4.2.

The idea is to consider rigid n-systems where some of the components of the corresponding com- bined graph are unbounded. The behaviour of the system when q approaches infinity is related to exponents of Diophantine approximation; something that Roy has noticed and uses to describe the spectrum related to some of them.

Figure 4.2: This is the combined graph of a rigid 4-system with mesh 3 with respect to the canvas given in Example 35.

△

4.1 Appendix: Translation between Roy's Article and the Article by Schmidt and Summerer

Schmidt and Summerer [17] consider another type of convex body than Roy does. Define for $q \ge 0$

\[
K(q) := \{ (x_1, \dots, x_n) \in \mathbb{R}^n : |x_1| \le e^{(n-1)q}, \ |x_i| \le e^{-q} \ \forall i > 1 \}
\]

and for $\xi_1, \dots, \xi_{n-1} \in \mathbb{R}$, let $\Lambda = \Lambda(\xi_1, \dots, \xi_{n-1})$ be the lattice

\[
\Lambda = \{ (y, \xi_1 y - y_1, \dots, \xi_{n-1} y - y_{n-1}) : (y, y_1, \dots, y_{n-1}) \in \mathbb{Z}^n \}.
\]

Observe that $K(q)$ is a convex body and that it contains a nonzero lattice point by Minkowski's first theorem. Then $K(q)$ will be the convex body considered by Schmidt and Summerer in [17]. The authors also define

\[
L_i(q) = \log \lambda_i(\Lambda, K(q)) \tag{4.6}
\]

for $i = 1, \dots, n$. So, in order to translate these functions to the ones which Roy studies, we fix a linear map $A \in GL_n(\mathbb{R})$ in such a way that $\Lambda = A\mathbb{Z}^n$. The definition of successive minima then implies

\[
\lambda_i(\Lambda, K(q)) = \lambda_i(A^{-1}\Lambda, A^{-1}K(q)) = \lambda_i(\mathbb{Z}^n, A^{-1}K(q)). \tag{4.7}
\]

Let $e_1 = (1, 0, \dots, 0)$ and define a unit vector in $\mathbb{R}^n$ by

\[
u := \|A^{-1}e_1\|^{-1} A^{-1}e_1 \tag{4.8}
\]

and fix a rotation $R \in SO(n)$ such that $Re_1 = u$. Now $A^{-1}$ maps $K(q)$ to a "skew" long, thin cylinder (for $q$ large), whose axis is in the direction of $A^{-1}e_1$, i.e. the direction of $u$. In fact, by a computation one shows that there exist constants $0 < \alpha_1 < \alpha_2$, which may depend on $A$ but not on $q$, such that

\[
\alpha_1 RK(q) \subset A^{-1}K(q) \subset \alpha_2 RK(q), \quad \forall q \ge 0. \tag{4.9}
\]

We will now change notation to $L_i(\Lambda, K, q) = \log \lambda_i(\Lambda, K(q))$ and use that (4.9) holds for all $q \ge 0$ in order to get

\[
|L_i(\Lambda, K, q) - L_i(\mathbb{Z}^n, RK(q))| \le O(1), \quad \forall q \ge 0,
\]

which means that $|L_i(\Lambda, K, q) - L_i(\mathbb{Z}^n, RK(q))|$ is uniformly bounded for all $q \ge 0$. In order to continue, the notion of a dual body is now needed and is defined as follows:

Definition 36. Let $C$ be a convex body. Then the dual body, or polar body, of $C$ is denoted $C^*$ and is defined as the set

\[
C^* = \{ x \in \mathbb{R}^n : x \cdot c \le 1, \ \forall c \in C \}. \tag{4.10}
\]

One can show that, up to a bounded rescaling, the dual of $C_{e_1}(e^{nq})$ is equal to $e^q K(q)$. That is, there exist constants $0 < \alpha_3 < \alpha_4$ such that, for all $q \ge 0$,

\[
\alpha_3 C_{e_1}(e^{nq}) \subseteq (e^q K(q))^* \subset \alpha_4 C_{e_1}(e^{nq}),
\]

which gives, since $u = Re_1$, that

\[
\alpha_3 C_u(e^{nq}) \subset (e^q \cdot RK(q))^* \subset \alpha_4 C_u(e^{nq}). \tag{4.11}
\]

We will use the following theorem, found in [4, Thm. VI]:

Theorem 37. Let $\lambda_1, \dots, \lambda_n$ denote the successive minima of a lattice $\Lambda$ with respect to the gauge function $f$ and let $\lambda_1^*, \dots, \lambda_n^*$ denote the successive minima of the polar lattice $\Lambda^*$ with respect to the gauge function $f^*$ polar to $f$. Then

\[
1 \le \lambda_j \lambda^*_{n+1-j} \le n! \tag{4.12}
\]

for $1 \le j \le n$.

Remark 38. The notion of a dual lattice of $\Lambda \subset \mathbb{R}^n$ is defined as

\[
\Lambda^* = \{ x \in \mathbb{R}^n : x \cdot \Lambda \subset \mathbb{Z} \}.
\]

Now Theorem 37 gives that

\[
1 \le \lambda_j(\mathbb{Z}^n, e^q \cdot RK(q)) \cdot \lambda_{n+1-j}(\mathbb{Z}^n, (e^q RK(q))^*) \le n!
\]

and hence

\[
\alpha_3 \le \lambda_j(\mathbb{Z}^n, e^q RK(q)) \cdot \lambda_{n+1-j}(\mathbb{Z}^n, C_u(e^{nq})) \le \alpha_4 n!, \tag{4.13}
\]

where we also used the fact that $(\mathbb{Z}^n)^* = \mathbb{Z}^n$. Let $L_j^{[SS]}$ denote the corresponding function in the article of Schmidt and Summerer [17], and $L_j^{[Roy]}$ the function in [15]. Taking logarithms in (4.13) now gives

\[
O(1) \le L_j^{[SS]}(\mathbb{Z}^n, e^q \cdot RK(q)) + L^{[SS]}_{n+1-j}(\mathbb{Z}^n, C_u(e^{nq})) \le O(1), \quad \forall q \ge 0.
\]

This means that

\[
L_j^{[SS]}(\Lambda, K, q) = L_j^{[SS]}(\mathbb{Z}^n, RK(q)) + O(1) = q - L^{[Roy]}_{u,n+1-j}(nq) + O(1). \tag{4.14}
\]

Chapter 5

Roy’s Contribution

In Roy’s discussion of the results of Schmidt and Summerer in [16] and [17] he contributes a new result regarding the existence of rigid systems and this result has applications in Diophantine approximation. In particular, Roy proves that for a given unit vector u in Rn, with n ≥ 2, there exists a rigid system P :[q0, ∞) → ∆n such that Lu − P is bounded on [q0, ∞). In this section we will state his theorem and reproduce, from [15], his proof. The following discussion will rely on the (n, γ)-systems introduced in the previous section. First, we consider a special type of such a system.

Definition 39. For q0 ≥ 0 we call an (n, γ)-system P = (P1, ..., Pn) on [q0, ∞) an (n, γ)-reduced system if, for any j ∈ {1, ..., n − 1}, a ≥ q0 and b ≥ a + nγ such that P1 + ··· + Pj is constant on the interval [a, b], we have that each function P1, ..., Pj is constant on [a, b − nγ].

Example 40. For γ = 0 we have that every (n, 0)-system is an (n, 0)-reduced system. Indeed, let P = (P1, ..., Pn) be an (n, 0)-system. Then condition (ii) in Definition 27 says that each Pj is a monotonically increasing function on [q0, ∞). Consider any j ∈ {1, ..., n − 1} and any b ≥ a ≥ q0 such that P1+···+Pj is constant on [a, b]. Since each function P1, ..., Pj is monotonically increasing, it then follows that each P1, ..., Pj must be constant on [a, b]. Hence P is an (n, 0)-reduced system.

△

The following theorem will serve as a major building block later on and it can be found as Proposition 7.1 in [15, p. 775]:

Theorem 41. Let $\gamma$ and $\delta$ be real numbers such that $0 \le \gamma < \delta/(2n^2)$. Let $P : [0, \infty) \to \mathbb{R}^n$ be an $(n, \gamma)$-reduced system and let $q_0 = n(n + 1)\delta/2$. Then there is a rigid n-system $R : [q_0, \infty) \to \mathbb{R}^n$ with mesh $\delta$ such that

\[
\| P(q) - R(q) \|_\infty \le 3n^2 \delta
\]

for all $q \ge q_0$.

Proof. We follow [15] closely throughout the entire proof. As a first step, we prove by a simple scaling argument that we may, without loss of generality, assume that $\delta = 1$. Indeed, let $\gamma$, $\delta$, $P$ and $q_0$ be given as in Theorem 41; then define the function $\tilde{P} : [0, \infty) \to \mathbb{R}^n$ by $\tilde{P}(q) = \delta^{-1} P(\delta q)$. This is then an $(n, \gamma/\delta)$-reduced system, and we note that $0 \le \gamma/\delta < 1/(2n^2)$. Hence, if Theorem 41 holds for "$\delta = 1$", it follows that there exists a rigid n-system $\tilde{R} : [n(n+1)/2, \infty) \to \mathbb{R}^n$ with mesh 1 such that $\| \tilde{P}(q) - \tilde{R}(q) \|_\infty \le 3n^2$ for all $q \ge n(n+1)/2$. Define $R(q) = \delta \tilde{R}(q/\delta)$ for $q \ge q_0$. Then $R$ is a rigid system with mesh $\delta$, and $\| P(q) - R(q) \|_\infty \le 3n^2 \delta$ for all $q \ge q_0$, and so we have proved the theorem in the general case.

Hence from now on we assume $\delta = 1$; thus $0 \le \gamma < 1/(2n^2)$ and $q_0 = n(n + 1)/2$. Define new functions $\bar{P}_j : [0, \infty) \to \mathbb{R}$ by setting

\[
\bar{P}_j(q) := \gamma + \sup\{ P_j(t) ; 0 \le t \le q \}
\]

for all $j \in \{1, \dots, n\}$ and $q \ge 0$. Then it follows that

\[
-\gamma \le \sup\{ P_j(t) ; 0 \le t \le q \} \implies 0 \le \bar{P}_j(q) \quad \forall q \ge 0, \tag{5.1}
\]

and one can also show that $\bar{P}_j(q) \le \bar{P}_{j+1}(q) + \gamma$ is true for all $j = 1, \dots, n - 1$. Since $P_j(q_1) \le P_j(q_2) + \gamma$ for $1 \le j \le n$ and $0 \le q_1 \le q_2$, it follows that the inequality

\[
P_j(q) + \gamma \le \bar{P}_j(q) \le P_j(q) + 2\gamma \quad \forall q \ge 0 \tag{5.2}
\]

holds. Since each function $P_j(t)$ is piecewise linear with slopes $-1$, 0 or 1, it follows that each function $\bar{P}_j$ is piecewise linear with slopes 0 or 1. Following Roy [15], a sequence of functions $E_j : [0, \infty) \to \mathbb{N}^*$ is defined recursively via

\[
\begin{cases}
E_1(q) = \lfloor \bar{P}_1(q) \rfloor + 1, \\
E_j(q) = \max\{ E_{j-1}(q) + 1, \ \lfloor \bar{P}_j(q) - 2(j-1)\gamma \rfloor + 1 \}, & j = 2, \dots, n.
\end{cases} \tag{5.3}
\]
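The recursion (5.3) is easy to evaluate at a fixed $q$ once the values $\bar{P}_1(q), \dots, \bar{P}_n(q)$ are known. The sketch below assumes these values are supplied as a list. The function name `E_values` and the sample inputs are hypothetical and only illustrate how the maximum forces the $E_j$ to be strictly increasing integers.

```python
from math import floor

def E_values(P_bar, gamma):
    """Compute (E_1, ..., E_n) from (5.3) at a fixed q.

    P_bar: list with the values bar P_1(q), ..., bar P_n(q); gamma: the constant gamma.
    """
    E = [floor(P_bar[0]) + 1]
    for j in range(2, len(P_bar) + 1):
        candidate = floor(P_bar[j - 1] - 2 * (j - 1) * gamma) + 1
        E.append(max(E[-1] + 1, candidate))  # keeps E_j > E_{j-1}
    return E

# Hypothetical sample values with n = 3 and gamma < 1/(2 n^2) = 1/18.
print(E_values([0.3, 2.7, 2.9], 0.05))  # [1, 3, 4]
print(E_values([0.0, 0.0, 0.0], 0.05))  # [1, 2, 3]
```

The second call mirrors the situation for small $q$ treated later in the proof, where $E = (1, 2, \dots, n)$.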

Recall that δ = 1 by assumption and that this means that γ < 1/(2n2). A conclusion that can be drawn here is that for all j = 2, ..., n

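\[
|E_j(q) - P_j(q)| = \left| \max\{ E_{j-1}(q) + 1, \ \lfloor \bar{P}_j(q) - 2(j-1)\gamma \rfloor + 1 \} - P_j(q) \right| \le \max\{ 2(j-1)\gamma, \ j(1+\gamma) + 2\gamma \} \le 1 + n
\]

\[
\implies \| E(q) - P(q) \|_\infty \le 1 + n, \tag{5.4}
\]

where (5.2) is used. To see one other similarity we state the following lemma:

Lemma 42. For all $q \ge 0$ we have that

\[
\bar{P}_j(q) - 2(j-1)\gamma < E_j(q) \le \bar{P}_j(q) + j(1 + \gamma),
\]

for $j = 1, \dots, n$.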

Proof. The proof of this lemma is by induction on $j$. Since $\bar{P}_1(q) < \lfloor \bar{P}_1(q) \rfloor + 1 \le \bar{P}_1(q) + 1 + \gamma$ we see that the inequalities are true for $j = 1$. For the induction hypothesis, assume that the above holds for some integer $j = p$ where $1 \le p < n$. Then

\[
\bar{P}_{p+1}(q) - 2p\gamma < \lfloor \bar{P}_{p+1}(q) - 2p\gamma \rfloor + 1 \le \max\{ E_p(q) + 1, \ \lfloor \bar{P}_{p+1}(q) - 2p\gamma \rfloor + 1 \} = E_{p+1}(q).
\]

By definition we have

\[
E_{p+1}(q) = \max\{ E_p(q) + 1, \ \lfloor \bar{P}_{p+1}(q) - 2p\gamma \rfloor + 1 \} \le \max\{ E_p(q) + 1, \ \bar{P}_{p+1}(q) - 2p\gamma + 1 \}
\]

and the induction hypothesis implies that

\[
\max\{ E_p(q) + 1, \ \bar{P}_{p+1}(q) - 2p\gamma + 1 \} \le \max\{ \bar{P}_p(q) + p(1+\gamma) + 1, \ \bar{P}_{p+1}(q) - 2p\gamma + 1 \} \le \bar{P}_{p+1}(q) + (p+1)(1+\gamma),
\]

since $\bar{P}_p(q) \le \bar{P}_{p+1}(q) + \gamma$. Hence we have proved the lemma.

Another property is that

\[
\left| \sum_{k=1}^{n} E_k(q) - q \right| \le n(n + 1), \tag{5.5}
\]

which follows since

\[
\sum_{k=1}^{n} P_k(q) = q.
\]

Indeed,

\[
\left| \sum_{k=1}^{n} E_k(q) - q \right| = \left| \sum_{k=1}^{n} E_k(q) - \sum_{k=1}^{n} P_k(q) \right| = \left| \sum_{k=1}^{n} \big( E_k(q) - P_k(q) \big) \right| \le \sum_{k=1}^{n} |E_k(q) - P_k(q)|
\]

\[
\le \sum_{k=1}^{n} (n + 1) = n(n + 1).
\]

We will also need the following lemma:

Lemma 43. For every $j = 1, \dots, n$, the function $E_j$ admits at most one point of discontinuity on $I$, where $I = (a, b)$ is any subinterval of $[0, \infty)$ of length less than or equal to 1.

Proof. We will argue by contradiction, so assume, on the contrary, that there exists $j \in \{1, \dots, n\}$ such that $E_j$ admits more than one discontinuity on $I$. Suppose $q_1 < q_2$ are two such points with the property that $E_j$ is constant on $(q_1, q_2) \subset I$ and let $j$ be the minimal index so that $E_j$ is discontinuous at these points. Assume that $E_j = k$ on $(q_1, q_2)$. Then by definition of $E_j$ and the assumption about discontinuity at $q_1$ we have

\[
\lim_{t \to q_1^-} E_j(t) = k - 1 \tag{5.6}
\]

and

\[
\lim_{t \to q_2^-} E_j(t) = k. \tag{5.7}
\]

The discontinuity assumption at $q_2$ then implies that

\[
E_j(q_2) = k + 1.
\]

See Figure 5.1 for an illustration.

Figure 5.1: This illustrates the equalities (5.6) and (5.7) above.

This means that

\[
\bar{P}_j(q_1) \le k - 1 + 2(j-1)\gamma.
\]

We also see that

\[
\bar{P}_j(q_2) \le \bar{P}_j(q_1) + 1
\]

since $q_2 - q_1 < 1$ and $\bar{P}_j$ increases at most with slope 1. Hence

\[
\lfloor \bar{P}_j(q_2) - 2(j-1)\gamma \rfloor + 1 \le k.
\]

Observe that from before it is known that $E_j(q_2) = k + 1$. By definition,

\[
\max\{ E_{j-1}(q_2) + 1, \ \lfloor \bar{P}_j(q_2) - 2(j-1)\gamma \rfloor + 1 \} \overset{!}{=} E_{j-1}(q_2) + 1 = k + 1,
\]

which gives $E_{j-1}(q_2) = k$. So, in order for $E_j$ to have a discontinuity at $q_2$ the function $E_{j-1}$ must also have a discontinuity there. However, we chose $j$ to be the minimal index such that $E_j$ admits at least two discontinuities. Hence $q_2$ must be the only one for $E_{j-1}$ and then we get

\[
\lim_{t \to q_2^-} E_{j-1}(t) = k - 1.
\]

Since it only has one discontinuity we can conclude that $E_{j-1}$ must be constant on $(a, q_2)$. But this means that

\[
\lim_{t \to q_1^-} E_{j-1}(t) = \lim_{t \to q_1^-} E_j(t),
\]

which gives a contradiction. Hence we have proved the lemma above.

Using part (iv) of Definition 27 it is possible to conclude that

\[
\sum_{j=1}^{n} \bar{P}_j(q) \le \sum_{j=1}^{n} \big( P_j(q) + 2\gamma \big) = q + 2n\gamma < 1 \tag{5.8}
\]

provided that $0 \le q \le 1/2$; in particular $0 \le \bar{P}_j(q) < 1$ for all $1 \le j \le n$. Hence we see that

\[
\max\{ E_{j-1}(q) + 1, \ \lfloor \bar{P}_j(q) - 2(j-1)\gamma \rfloor + 1 \} = E_{j-1}(q) + 1
\]

and it can be deduced that

\[
E_1 = 1, \quad E_2 = 2, \quad \dots, \quad E_n = n
\]

for these $q \in [0, 1/2]$. In particular, $E$ is constant on $[0, 1/2]$, which means that the discontinuities must be an infinite discrete subset of $[1/2, \infty)$. Following Roy's notation we let $\Sigma$ denote the set of all discontinuities of $E$ in $[1/2, \infty)$ and we put

\[
\bar{\Sigma} = \Sigma + [-n\gamma, n\gamma]. \tag{5.9}
\]

We can then express $\bar{\Sigma}$ as

\[
\bar{\Sigma} = \bigcup_{i=0}^{\infty} [a_i, b_i]
\]

where $0 \le a_0 < b_0 < a_1 < b_1 < \cdots$ such that

\[
2n\gamma \le b_i - a_i < 1
\]

for all $i \ge 0$. Let $\Omega^{(i)}$ be the set of indices $j$ such that $E_j$ is not constant on $[a_i, b_i]$ and let $\Omega(q)$ be the nonempty set of indices such that $E_j$ is discontinuous at $q$. The reason Roy introduces this set is that it will help in the construction of the desired rigid n-system. Namely, if we let

\[
k^{(i)} := \max\{ j \in \Omega^{(i)} \mid j = 1 \text{ or } E_{j-1}(b_i) < E_j(b_i) - 1 \} \tag{5.10}
\]

and

\[
l^{(i)} := \min\{ j \in \Omega^{(i)} \mid j = n \text{ or } E_{j+1}(a_i) > E_j(a_i) - 1 \} \tag{5.11}
\]

one can prove that $k^{(i+1)} \le l^{(i)}$. Let us formulate this as a lemma:

Lemma 44. With the above notation, the inequality

\[
k^{(i+1)} \le l^{(i)} \tag{5.12}
\]

holds for all $i \ge 0$.

Proof. We recall that we still follow [15] closely throughout this chapter. The first thing to observe is that if for some $j < n$ we have $P_j + \gamma < P_{j+1}$ on some subinterval $[a, b]$, then

\[
M_j = \sum_{i=1}^{j} P_i
\]

either has constant slope on $[a, b]$ or has slope 0 until a point $c \in (a, b)$ and thereafter slope 1. Assume that $k \in \Omega(q)$. Then

\[
E_k(q) = \lim_{t \to q^-} E_k(t) + 1. \tag{5.13}
\]

This means that whenever $k \ne 1$ we have

\[
E_k(t) = \lfloor \bar{P}_k(t) - 2(k-1)\gamma \rfloor + 1 \tag{5.14}
\]

provided that

\[
E_{k-1}(q) + 1 < E_k(q). \tag{5.15}
\]

This implies that

\[
\lim_{t \to q^-} \bar{P}_k'(t) = 1 \tag{5.16}
\]

since $E_k$ has a discontinuity at $q$. From (5.13) it is immediate that

\[
E_{k-1}(q) \le \lim_{t \to q^-} E_k(t) - 1
\]

and Lemma 42 then gives that

\[
\bar{P}_{k-1}(t) - 2(k-2)\gamma < E_{k-1}(t) \iff \bar{P}_{k-1}(t) < E_{k-1}(t) + 2(k-2)\gamma \le \lim_{t \to q^-} E_k(t) - 1 + 2(k-2)\gamma.
\]

Under the above assumptions equation (5.14) gives that

\[
\bar{P}_k(q) = \lim_{t \to q^-} E_k(t) + 2(k-1)\gamma
\]

and substitution then gives

\[
P_k(t) \ge \lim_{t \to q^-} E_k(t) + 2(k-1)\gamma - \gamma - 1 > P_{k-1}(q) + \gamma,
\]

which in particular means that $M_{k-1}$ must be a concave up function for $q - 1 \le t \le q$. However, there is something stronger to claim here, namely that $M_{k-1}$ is constant on this interval. To see this start by noticing that

\[
\lim_{t \to q^-} M_{k-1}'(t) = \lim_{t \to q^-} M_k'(t) - \lim_{t \to q^-} P_k'(t).
\]

We have $\lim_{t \to q^-} M_k'(t) \le 1$ by definition and that

\[
\lim_{t \to q^-} P_k'(t) = 1
\]

because of equation (5.16). Hence $\lim_{t \to q^-} M_{k-1}'(t) = 0$ and we conclude that $M_{k-1}$ is constant on $[q - 1, q]$.

Assume now, on the contrary, that there exists some $i \ge 0$ such that $l^{(i)} < k^{(i+1)}$. Let $q \in [a_i + n\gamma, b_i - n\gamma]$ and $q' \in [a_{i+1} + n\gamma, b_{i+1} - n\gamma]$ be the points of discontinuity of $E_{l^{(i)}}$ and $E_{k^{(i+1)}}$ respectively. Then it follows that

\[
E_{k^{(i+1)}-1}(q') \le E_{k^{(i+1)}}(q') - 1,
\]

which means that $M_{k^{(i+1)}-1}$ must be constant on $[q' - 1, q']$ and we see that all of $P_1, \dots, P_{k^{(i+1)}-1}$ are constant on $[q' - 1, a_{i+1}]$ because $q' - n\gamma \ge a_{i+1}$. A similar argument can be made in order to show that $E_{l^{(i)}}$ is constant on $[q, a_{i+1}]$ and one can deduce that $M_{l^{(i)}}$ must be concave up on $[q - 1, a_{i+1}]$. The fact that $P$ is an $(n, \gamma)$-reduced system gives that $E_1, \dots, E_{l^{(i)}}$ are constant on $[q - 1, a_{i+1} - n\gamma]$. However, $q \in [q - 1, a_{i+1} - n\gamma]$, which means that $E_{l^{(i)}}$ is discontinuous at $q$. This yields a contradiction and the lemma is proved.

We are now close to the end of the proof of Theorem 41. The idea from here on is to define some ti such that there is a rigid n-system R :[t0, ∞) → ∆n such that R(ti) = E(ai) and then show that

\[
\| R(t) - P(t) \|_\infty \le 3n^2, \tag{5.17}
\]

for all $t \in [t_0, \infty)$. By defining

\[
t_i = \sum_{k=1}^{n} E_k(a_i), \quad \forall i \ge 0
\]

we get that $R(t_i) = E(a_i)$. To see this claim, recall that $E(0) = (1, 2, \dots, n)$ and that therefore

\[
t_0 = \sum_{k=1}^{n} k = \frac{n(n+1)}{2}.
\]

Let us partition $\Omega^{(i)}$ into sets $\{\Omega^{(i)}_j\}_{j=1}^{r_i}$ where each $\Omega^{(i)}_j$ consists of consecutive integers $k, \dots, l$ so that $E_k(a_i), \dots, E_l(a_i)$ are also consecutive integers. We also choose the indices $j \in \{1, \dots, r_i\}$ in such a way that $\max \Omega^{(i)}_{k+1} < \min \Omega^{(i)}_k$ for all $1 \le k < r_i$. The goal is to express the inequality $k^{(i+1)} \le l^{(i)}$ in terms of the partitions defined above. To see the connection the following definition is made:

\[
(\Omega_i)_{i \ge 0} := (\Omega^{(0)}_1, \dots, \Omega^{(0)}_{r_0}, \Omega^{(1)}_1, \dots, \Omega^{(1)}_{r_1}, \dots). \tag{5.18}
\]

Then we can consider the inequality $\min \Omega^{(i+1)}_1 \le \max \Omega^{(i)}_{r_i}$ instead because

\[
\min \Omega^{(i)}_1 = k^{(i)}
\]

and

\[
l^{(i)} = \max \Omega^{(i)}_{r_i}
\]

for all $i \ge 0$. Directly from construction

\[
\max \Omega_i \ge \min \Omega_{i+1}
\]

for all $i \ge 0$, which implies that $\max \Omega_i \ge \min \Omega_{i-1}$ for $i \ge 1$. We can make the last inequality valid for $i = 0$ as well by defining $\Omega_{-1} := \{n\}$. Let us now define

\[
a^{(i)}_k = (a^{(i)}_{k,1}, \dots, a^{(i)}_{k,n})
\]

by

\[
a^{(i)}_{k,j} =
\begin{cases}
E_j(a_i), & \text{if } j \in \Omega^{(i)}_k \cup \cdots \cup \Omega^{(i)}_{r_i}, \\
E_j(a_{i+1}), & \text{else.}
\end{cases}
\]

Notice that $a^{(i)}_k \in \mathbb{Z}^n$ and that

\[
a^{(i)}_1 = E(a_i). \tag{5.19}
\]

The last equation, (5.19), follows since $E$ is constant on $[b_i, a_{i+1}]$ and since for $i \ge 0$ and $j \in \{1, \dots, n\}$ we have

\[
E_j(a_{i+1}) =
\begin{cases}
E_j(a_i) + 1, & j \in \Omega^{(i)}, \\
E_j(a_i), & \text{else,}
\end{cases}
\]

which holds since $E_j$ has only one point of discontinuity for $j \in \Omega^{(i)}$ in the interval $[a_i, b_i]$. This means that if we know $a^{(i)}_k$, we can obtain $a^{(i)}_{k+1}$ by adding 1 to all components that have index in $\Omega^{(i)}_k$. We define $a^{(i)}_{r_i+1} = a^{(i+1)}_1$. Then

\[
(a^{(i)}_1, \dots, \widehat{a^{(i)}_{k_i}}, \dots, a^{(i)}_n) = (a^{(i+1)}_1, \dots, \widehat{a^{(i+1)}_{l_{i+1}}}, \dots, a^{(i+1)}_n) \tag{5.20}
\]

if we define

\[
k_i = \min \Omega_i
\]

and

li+1 = max Ωi for all i ≥ 0. We have now almost constructed a canvas of mesh 1. Unfortunately we have 1 ≤ ki ≤ li ≤ n. This can, however, be fixed if we use R(ti) = E(ai) and define (im)0≤m

get $1 \le k_i < l_i \le n$. Since we have constructed a rigid n-system, i.e. $R$, we only need to show that (5.17) holds for all $t \in [t_0, \infty)$. To do this, fix such a $t \in [t_i, t_{i+1}]$ for $i \ge 0$. Then we obtain

\[
|a_i - t| \le \left| a_i - \sum_{k=1}^{n} E_k(a_i) \right| + \sum_{k=1}^{n} \big( E_k(a_{i+1}) - E_k(a_i) \big) \le n(n + 1) + n
\]

and since $P$ is continuous with components that have slopes 0 and 1 we get the estimate

\[
\| P(a_i) - P(t) \|_\infty \le |a_i - t| \le n(n + 1) + n = n^2 + 2n.
\]

By the previous discussion,

\[
\| R(t_i) - P(a_i) \|_\infty = \| E(a_i) - P(a_i) \|_\infty \le 1 + n
\]

and

\[
\| R(t) - R(t_i) \|_\infty \le \| R(t_{i+1}) - R(t_i) \|_\infty = \| E(a_{i+1}) - E(a_i) \|_\infty = 1.
\]

This means that

\[
\| R(t) - P(t) \|_\infty = \| R(t) - R(t_i) + R(t_i) - P(a_i) + P(a_i) - P(t) \|_\infty
\]

\[
\le \| R(t) - R(t_i) \|_\infty + \| R(t_i) - P(a_i) \|_\infty + \| P(a_i) - P(t) \|_\infty
\]

\[
\le n^2 + 3n + 2 \le 3n^2.
\]

This completes the proof of Theorem 41 in the case δ = 1, and by the argument at the beginning of the proof, we have thus proved Theorem 41 in the general setting.

Theorem 41 will work as a lemma in the proof of Theorem 49, which is the main result of this chapter. However, Theorem 41 is not all that is needed. The concept of the so-called projective distance will also be of importance.

Definition 45. Let $x, y \in \mathbb{R}^n$ be two nonzero vectors. The projective distance is defined by

\[
d_p(x, y) = \frac{\| x \wedge y \|}{\|x\| \cdot \|y\|}.
\]

The geometric intuition of the projective distance comes from linear algebra and the theory regarding alternating k-forms, since $\| x \wedge y \|$ represents the area of the parallelogram spanned by the vectors $x$ and $y$. This means that if the smallest angle between $x$ and $y$ is $\theta$, then $d_p(x, y) = \sin \theta$. A slight generalization of the definition above is when we talk about the distance from a vector to a subspace of $\mathbb{R}^n$. Then we set

\[
d_p(x, V) = \inf\{ d_p(x, y) : y \in S^{n-1} \cap V \},
\]

whenever $V$ is a nonzero subspace of $\mathbb{R}^n$ and $x \in \mathbb{R}^n$ is a nonzero vector. If we write $\mathbb{R}^n = V \oplus V^\perp$ we have that

\[
\bigwedge\nolimits^2 \mathbb{R}^n = \bigwedge\nolimits^2 V \oplus (V^\perp \wedge V) \oplus \bigwedge\nolimits^2 V^\perp,
\]

which means that one can split $x$ into $x = \tilde{y} + z$ with $\tilde{y} \in V$ and $z = x|_{V^\perp} \in V^\perp$. Hence we get

\[
d_p(x, y) = \frac{\| (\tilde{y} \wedge y) + (z \wedge y) \|}{\|x\| \cdot \|y\|} \ge \frac{\|z\|}{\|x\|} = \frac{\| x|_{V^\perp} \|}{\|x\|} \tag{5.21}
\]

for any nonzero $y \in V$.
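For two vectors the projective distance can be computed without forming the exterior product explicitly, because Lagrange's identity gives $\|x \wedge y\|^2 = \|x\|^2 \|y\|^2 - (x \cdot y)^2$. The following Python sketch uses this identity. The function names are hypothetical, and the clamp to zero is only a guard against floating-point rounding.

```python
from math import sqrt

def dot(x, y):
    """Euclidean inner product of two vectors given as sequences."""
    return sum(a * b for a, b in zip(x, y))

def projective_distance(x, y):
    """d_p(x, y) = ||x ^ y|| / (||x|| ||y||) for nonzero vectors x and y."""
    # Lagrange's identity: ||x ^ y||^2 = ||x||^2 ||y||^2 - (x . y)^2
    wedge_sq = dot(x, x) * dot(y, y) - dot(x, y) ** 2
    return sqrt(max(wedge_sq, 0.0)) / sqrt(dot(x, x) * dot(y, y))

print(projective_distance((1, 0, 0), (1, 1, 0)))  # sin(45 degrees), about 0.7071
print(projective_distance((1, 0), (0, 2)))        # 1.0 for orthogonal vectors
print(projective_distance((1, 2), (2, 4)))        # 0.0 for parallel vectors
```

The three calls match the interpretation $d_p(x, y) = \sin \theta$ for angles of 45, 90 and 0 degrees.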

Now the focus will lie on some more technical parts in the form of some lemmas. The following lemma together with its proof can, again, be found in [15]:

Lemma 46. Let $1 \le h \le l \le n$ and $1 \le k < l \le n$ be integers and assume that $\{x_1, \dots, x_n\}$ is a basis of $\mathbb{Z}^n$. Let also $A \ge 2^l(\|x_1\| + \cdots + \|x_l\|)$. Then there exists a basis $(y_1, \dots, y_n)$ of $\mathbb{Z}^n$ such that

(i) $(y_1, \dots, \widehat{y_l}, \dots, y_n) = (x_1, \dots, \widehat{x_h}, \dots, x_n)$,

(ii) $y_l \in x_h + \langle x_1, \dots, \widehat{x_h}, \dots, x_l \rangle_{\mathbb{Z}}$,

(iii) $A \le \|y_l\| \le 2A$,

(iv) $d_p(y_l, \langle y_1, \dots, \widehat{y_k}, \dots, y_{l-1} \rangle_{\mathbb{R}}) \ge 1 - \frac{1}{2^{l-1}}$.

Proof. Let $y_1, \dots, \widehat{y_l}, \dots, y_n$ be the unique vectors satisfying

\[
(y_1, \dots, \widehat{y_l}, \dots, y_n) = (x_1, \dots, \widehat{x_h}, \dots, x_n).
\]

Then for any choice of $y_l$ as in (ii), $y_1, \dots, y_n$ is a basis of $\mathbb{Z}^n$. Hence our task is to prove that there exists a choice of $y_l$ as in (ii) such that also (iii) and (iv) are fulfilled. One way to do this is to construct a chain of vector spaces $W \subset V \subset U$ where all the subspaces have codimension one relative to each other. We will define these spaces via

\[
U := \langle x_1, \dots, x_n \rangle_{\mathbb{R}},
\]

\[
V := \langle y_1, \dots, y_n \rangle_{\mathbb{R}}
\]

and

\[
W := \langle y_1, \dots, \widehat{y_l}, \dots, y_n \rangle_{\mathbb{R}}.
\]

It is then clear that $W \subset V \subset U$. Observe that $\dim U \cap W^\perp = 2$ and let us construct an orthonormal basis of this space by choosing unit vectors $u \in U \cap V^\perp$ and $v \in V \cap W^\perp$. Since $x_h|_{V^\perp} = \lambda u$ for some $\lambda \ne 0$ we get that

\[
\lambda u - x_h + \alpha v \in V
\]

for $\alpha \in \mathbb{R}$. It will be convenient to choose $\alpha = (3/2)A$ and this will be seen later on. We can now write the vector $\lambda u - x_h + \alpha v$ as a linear combination of the basis vectors in $V$ and obtain:

\[
\lambda u - x_h + \alpha v = \sum_{j=1}^{l-1} c_j y_j
\]

where $c_1, \dots, c_{l-1} \in \mathbb{R}$. If we define

\[
y_l := x_h + \sum_{j=1}^{l-1} \lceil c_j \rceil y_j \in x_h + \langle x_1, \dots, \widehat{x_h}, \dots, x_l \rangle_{\mathbb{Z}}
\]

we get that

\[
\big| \|y_l\| - \alpha \big| \le \sum_{j=1}^{l} \|x_j\| \le \frac{\alpha}{2^l}
\]

where we in the last inequality used the assumption $A \ge 2^l(\|x_1\| + \cdots + \|x_l\|)$ and that $\alpha = (3/2)A > A$. This implies that

\[
\frac{3(1 - 2^{-l})A}{2} \le \|y_l\| \le \frac{3(1 + 2^{-l})A}{2} \overset{l \ge 2}{\implies} A \le \|y_l\| \le 2A.
\]

We now need to show (iv). However, it is known that $|y_l \cdot v| \le \| y_l|_{W^\perp} \|$ and since

\[
|y_l \cdot v| = \left| \frac{3A}{2} + \sum_{k} (\lceil c_k \rceil - c_k) \, y_k \cdot v \right| \ge \frac{3A}{2} - \sum_{k} \|y_k\| \ge \frac{3(1 - 2^{-l})A}{2}
\]

\[
\implies d_p(y_l, \langle y_1, \dots, \widehat{y_k}, \dots, y_{l-1} \rangle_{\mathbb{R}}) = \frac{\| y_l|_{W^\perp} \|}{\|y_l\|} \ge \frac{1 - 2^{-l}}{1 + 2^{-l}} \ge 1 - \frac{1}{2^{l-1}}.
\]

This proves the lemma.

This proves the lemma.

The following Corollary can be found as a part of Lemma 5.2 in [15, p. 765].

Corollary 46.1. Let $A_1, \dots, A_n$ be real numbers satisfying $A_1 \ge 8$ and $A_j \ge 2^{j+3} A_{j-1}$ for $j = 2, \dots, n$. Then there is a basis $(y_1, \dots, y_n)$ of $\mathbb{Z}^n$ such that

(i) $A_j \le \|y_j\| \le 2A_j$ for all $j = 1, \dots, n$,

(ii) $d_p(y_n, \langle y_1, \dots, y_{n-1} \rangle_{\mathbb{R}}) \ge 1 - \frac{1}{2^{n-1}}$ for $k < n$.

Proof. We follow the proof of Lemma 5.2 in [15, p. 765]. We use Lemma 46 in order to generate a sequence of integer points in $\mathbb{Z}^n$ with the desired properties and then use this sequence to construct a new basis of $\mathbb{Z}^n$ satisfying the list in the corollary. We do this by considering the standard basis of $\mathbb{Z}^n$. We define this sequence $x_1, \dots, x_n \in \mathbb{Z}^n$ by first letting $x_1 = e_1$ and then requiring $x_j$ for $j \ge 2$ to satisfy Lemma 46. This means that the vectors $x_1, \dots, x_{j-1}, e_j, \dots, e_n$ must be a basis for $\mathbb{Z}^n$ for each $j \ge 2$ and also

(i) $x_l \in e_j + \langle x_1, \dots, \widehat{x_1}, \dots, x_{j-1} \rangle_{\mathbb{Z}}$,

(ii) $A_{j-1} \le \|x_l\| \le 2A_{j-1}$,

(iii) $d_p(x_j, \langle x_1, \dots, x_{j-1} \rangle_{\mathbb{R}}) \ge 1 - \frac{1}{2^{l-1}}$.

So, we are creating sequences of basis vectors of $\mathbb{Z}^n$ starting from the initial vector $e_1$. We can then apply Lemma 46 to the last basis fulfilling the constraints above, namely

\[
x_1, \dots, x_{l-1}, e_l, \dots, e_n.
\]

By assumption we obtain

\[
2^l \left( \sum_{k=1}^{l-1} \|x_k\| + \|e_l\| \right) \le 2^l \left( 2 \sum_{j=2}^{l-1} A_{j-1} + 2 \right) = 2^{l+1} \left( 1 + \sum_{j=2}^{l-1} A_{j-1} \right)
\]

\[
\le 2^{l+1} \big( 1 + (l - 2) A_{l-1} \big) \le A_{l-1}.
\]

Hence by Lemma 46 we get a new vector $x_l$ to add to our list. In particular this is true for $l = n$ and we are done with the corollary.

Whether or not a sequence of vectors $(x_1, \ldots, x_m)$ satisfies (iv) of Lemma 46 is important enough that a definition is in order.

Definition 47. A nonempty sequence $(x_1, \ldots, x_m)$ of vectors in $\mathbb{R}^n$ is called almost orthogonal if the vectors $x_1, \ldots, x_m$ are linearly independent and if, for all $2 \le l \le m$, they satisfy

$$d_p(x_l, \langle x_1, \ldots, x_{l-1} \rangle_{\mathbb{R}}) \ge 1 - \frac{1}{2^{l-1}}. \tag{5.22}$$
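As an illustration of Definition 47, the following is a minimal Python sketch that tests a finite sequence of vectors for almost orthogonality. It assumes the reading of $d_p(x, V)$ used in the proof of Lemma 46, namely the norm of the component of $x$ orthogonal to $V$ divided by $\|x\|$. The function name `almost_orthogonal` and the tolerance `tol` are illustrative choices, not notation from the text.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def almost_orthogonal(vectors, tol=1e-12):
    """Check condition (5.22) with Gram-Schmidt (illustrative helper)."""
    ortho = []  # orthonormal basis of span(x_1, ..., x_{l-1})
    for l, x in enumerate(vectors, start=1):
        r = [float(c) for c in x]
        for e in ortho:  # remove the component of x_l lying in the span
            c = dot(r, e)
            r = [ri - c * ei for ri, ei in zip(r, e)]
        norm_r = math.sqrt(dot(r, r))
        norm_x = math.sqrt(dot(x, x))
        if norm_r <= tol * norm_x:  # x_l depends (numerically) on earlier vectors
            return False
        d_p = norm_r / norm_x  # sine of the angle between x_l and the span
        if l >= 2 and d_p < 1 - 2.0 ** (1 - l) - tol:
            return False
        ortho.append([ri / norm_r for ri in r])
    return True
```

For example, the standard basis is almost orthogonal, whereas the pair $(1, 0), (10, 1)$ is not, since the second vector makes too small an angle with the first.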

We will now put some more constraints on the constants $A_1, \ldots, A_n$ from the previous discussion and derive a lemma that will provide an important inequality later on. Let us fix a choice of $s \in \mathbb{N}^* \cup \{\infty\}$. From this point onwards we assume that for each $0 \le i < s$ we are given a fixed point $(A_1^{(i)}, \ldots, A_n^{(i)}) \in \mathbb{R}^n$ and fixed integers $k_i$ and $l_i$ such that:

(i) $1 \le k_0 \le l_0 = n$ and $1 \le k_i < l_i \le n$ whenever $i \ge 1$;

(ii) $A_1^{(i)} \ge 2^{n+3} e^4$ and $A_j^{(i)} \ge A_{j-1}^{(i)} 2^{n+3} e^4$ for all $j = 2, \ldots, n$;

(iii) $k_{i-1} \le l_i$ and $A_{l_i}^{(i)} \ge A_{l_i}^{(i-1)} 2^{n+3} e^4$;

(iv) $(A_1^{(i)}, \ldots, \widehat{A_{l_i}^{(i)}}, \ldots, A_n^{(i)}) = (A_1^{(i-1)}, \ldots, \widehat{A_{k_{i-1}}^{(i-1)}}, \ldots, A_n^{(i-1)})$ for $i \ge 1$.

We recall that we are following the construction done by Roy in [15, p. 766]. The following result is a combination of Proposition 5.3 and Proposition 5.4 in [15, p. 766, p. 769]:

Lemma 48. Define

$$Q_i = \prod_{j=1}^{n} A_j^{(i)}$$

for all $0 \le i < s$ and $Q_s = \infty$ whenever $s \ne \infty$. Then there exists a basis $(x_1^{(i)}, \ldots, x_n^{(i)})$ of $\mathbb{Z}^n$ for each $0 \le i < s$ and a unit vector $u \in \mathbb{R}^n$ such that for each $i$ with $0 \le i < s$ and each $Q \in [Q_i, Q_{i+1})$ we have

(i) $A_j^{(i)} \le \lambda(x_j^{(i)}, C_u(Q)) \le 8e^4 A_j^{(i)}$ for any $j = 1, \ldots, n$ with $j \ne k_i$;

(ii) $\dfrac{Q A_{k_i}^{(i)}}{2^n Q_i} \le \lambda(x_{k_i}^{(i)}, C_u(Q)) \le \dfrac{8 Q A_{k_i}^{(i)}}{Q_i}$.

Proof. First of all we want to establish that

$$A_j^{(i)} \le \|x_j^{(i)}\| \le 2A_j^{(i)} \tag{5.23}$$

for all $j = 1, \ldots, n$. This inequality can be seen as a slight generalization of (iii) in Lemma 46. To prove (5.23) we use induction on $i$ together with Lemma 46. For the base case $i = 0$, the claim follows directly from part (i) of Corollary 46.1. As the induction hypothesis we assume that the inequality holds for $i = 0, \ldots, p - 1$, i.e. we assume that there is a basis $(x_1^{(i)}, \ldots, x_n^{(i)})$ with the wanted properties for every such $i$. If we let $k = k_p$, $h = k_{p-1}$ and $l = l_p$ in the assumptions of Lemma 46, we get from (ii) of the construction of $(A_1^{(i)}, \ldots, A_n^{(i)})$ that

$$2^{l_p} \sum_{j=1}^{l_p} \|x_j^{(p-1)}\| \le 2^{l_p + 1} \sum_{j=1}^{l_p} A_j^{(p-1)} \le A_{l_p}^{(p)},$$

which means that we may apply Lemma 46 and conclude that there exists a basis $(x_1^{(p)}, \ldots, x_n^{(p)})$ of $\mathbb{Z}^n$ such that

$$A_{l_p}^{(p)} \le \|x_{l_p}^{(p)}\| \le 2A_{l_p}^{(p)}. \tag{5.24}$$

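By the principle of mathematical induction we have proved that (5.23) holds. Assume now that $s = \infty$. Define

$$\Omega = \langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}} \cap \langle x_1, \ldots, \widehat{x_l}, \ldots, x_m \rangle_{\mathbb{R}}$$

and let $U$ be a $\mathbb{Q}$-vector space with $\dim U = m$. Choose unit vectors:

$$\begin{aligned}
u_1 &\in U \cap \langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}}, \\
u_2 &\in U \cap \langle x_1, \ldots, \widehat{x_l}, \ldots, x_m \rangle_{\mathbb{R}}, \\
v_1 &\in \Omega^{\perp} \cap \langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}}, \\
v_2 &\in \Omega^{\perp} \cap \langle x_1, \ldots, \widehat{x_l}, \ldots, x_m \rangle_{\mathbb{R}}.
\end{aligned}$$

Since $\{u_1, v_1\}$ and $\{u_2, v_2\}$ are orthonormal bases of $U \cap \Omega^{\perp}$ we see that $v_1 \wedge v_2 = \pm u_1 \wedge u_2$. Let now $x \in \langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}}$. Then we can write $x = y + \alpha \tilde{x}$ for $y \in \Omega$ and $\tilde{x} \in \langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}}$. This gives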

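$$\begin{aligned}
& d_p(\langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}}, \langle x_1, \ldots, \widehat{x_l}, \ldots, x_m \rangle_{\mathbb{R}}) \\
&\quad = \sup \bigl\{ d_p(x, \langle x_1, \ldots, \widehat{x_l}, \ldots, x_m \rangle_{\mathbb{R}}) : x \in \langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}} \setminus \{0\} \bigr\} \\
&\quad = \sup \bigl\{ \inf \{ d_p(x, \tilde{y}) : \tilde{y} \in S^{n-1} \cap \langle x_1, \ldots, \widehat{x_l}, \ldots, x_m \rangle_{\mathbb{R}} \} : x \in S^{n-1} \cap \langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}} \bigr\} \\
&\quad = \sup \left\{ \frac{\|(y + \alpha \tilde{x})_{\parallel \langle x_1, \ldots, \widehat{x_l}, \ldots, x_m \rangle_{\mathbb{R}}^{\perp}}\|}{\|y + \alpha \tilde{x}\|} : y + \alpha \tilde{x} \in S^{n-1} \cap \langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}} \right\} \\
&\quad = d_p(v_1, v_2) = \|v_1 \wedge v_2\|
\end{aligned} \tag{5.25}$$

because the supremum value is attained if and only if $y = 0$. Let us now define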

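$$\omega = x_1 \wedge \cdots \wedge \widehat{x_k} \wedge \cdots \wedge \widehat{x_l} \wedge \cdots \wedge x_m \in \Omega$$

and immediately observe that, writing $x_k = \tilde{y}_1 + \alpha_1 \tilde{y}_2$ with $\tilde{y}_1 \in \Omega$, $\tilde{y}_2 \in \langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}}$ and $\alpha_1 \in \mathbb{R}$,

$$\|\omega \wedge x_k\| = \|\omega \wedge (\tilde{y}_1 + \alpha_1 \tilde{y}_2)\| = |\alpha_1| \, \|\omega\|.$$

A similar calculation shows that

$$\|\omega \wedge x_l\| = |\alpha_2| \, \|\omega\|$$

if we write $x_l = \tilde{y}_3 + \alpha_2 \tilde{y}_4$ with $\tilde{y}_3 \in \Omega$, $\tilde{y}_4 \in \langle x_1, \ldots, \widehat{x_l}, \ldots, x_m \rangle_{\mathbb{R}}$ and $\alpha_2 \in \mathbb{R}$. We also notice that

$$\begin{aligned}
\|\omega \wedge x_k \wedge x_l\| &= \|\omega \wedge (\tilde{y}_1 + \alpha_1 \tilde{y}_2) \wedge (\tilde{y}_3 + \alpha_2 \tilde{y}_4)\| = |\alpha_1 \alpha_2| \cdot \|\omega\| \cdot \|\tilde{y}_2 \wedge \tilde{y}_4\| \\
&= |\alpha_1 \alpha_2| \cdot \|\omega\| \cdot d_p(\langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}}, \langle x_1, \ldots, \widehat{x_l}, \ldots, x_m \rangle_{\mathbb{R}}).
\end{aligned}$$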

We let $H(U)$ stand for the co-volume of the lattice $\mathbb{Z}^n \cap U$ when $U$ is a vector space over $\mathbb{Q}$ of dimension $m$. With this notation we get

$$H(\Omega) = \|\omega\| \le \frac{1}{\|x_k\| \cdot \|x_l\|} \prod_{j=1}^{m} \|x_j\|,$$

and since

$$\|x_1 \wedge \cdots \wedge x_k\| \ge \prod_{j=1}^{k} \|x_j\| \prod_{i=2}^{k} (1 - 2^{1-i}) \ge \prod_{j=1}^{k} \|x_j\| \prod_{i=2}^{\infty} (1 - 2^{1-i}) \ge e^{-2} \prod_{j=1}^{k} \|x_j\|$$

for all $1 \le k \le m$, we get

$$H(\langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}}) \ge \frac{1}{e^2 \|x_k\|} \prod_{j=1}^{m} \|x_j\| \tag{5.26}$$

and

$$H(\langle x_1, \ldots, \widehat{x_l}, \ldots, x_m \rangle_{\mathbb{R}}) \ge \frac{1}{e^2 \|x_l\|} \prod_{j=1}^{m} \|x_j\| \tag{5.27}$$

because of (ii) in the assumptions. We can then conclude that

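$$d_p(\langle x_1, \ldots, \widehat{x_k}, \ldots, x_m \rangle_{\mathbb{R}}, \langle x_1, \ldots, \widehat{x_l}, \ldots, x_m \rangle_{\mathbb{R}}) \le \frac{e^4 \|x_1 \wedge \cdots \wedge x_m\|}{\|x_1\| \cdots \|x_m\|}. \tag{5.28}$$

Observe that the above inequality is only valid since

$$d_p(x_j, \langle x_2, \ldots, x_{j-1} \rangle_{\mathbb{R}}) \ge 1 - \frac{1}{2^{j-1}} \tag{5.29}$$

for all $j \ge 3$. The next step is to show that if $u_i$ is a unit vector that is orthogonal to $\langle x_1^{(i)}, \ldots, \widehat{x_{k_i}^{(i)}}, \ldots, x_n^{(i)} \rangle_{\mathbb{R}}$, then for all $i$ and $j$ such that $0 \le i < j < s$ we have

$$d_p(u_i, u_j) \le \frac{2e^4}{\|x_1^{(i+1)}\| \cdots \|x_n^{(i+1)}\|}. \tag{5.30}$$

By induction one can prove that

$$d_p(\langle x_1^{(i)}, \ldots, \widehat{x_{l_i}^{(i)}}, \ldots, x_m^{(i)} \rangle_{\mathbb{R}}, \langle x_1^{(i)}, \ldots, \widehat{x_{k_i}^{(i)}}, \ldots, x_m^{(i)} \rangle_{\mathbb{R}}) \le \frac{e^4 \|x_1^{(i)} \wedge \cdots \wedge x_m^{(i)}\|}{\|x_1^{(i)}\| \cdots \|x_m^{(i)}\|}$$

holds for all $i$ such that (5.29) holds. Since

$$\langle x_1^{(i)}, \ldots, \widehat{x_{k_i}^{(i)}}, \ldots, x_n^{(i)} \rangle_{\mathbb{R}} \oplus \langle x_1^{(i)}, \ldots, \widehat{x_{k_i}^{(i)}}, \ldots, x_n^{(i)} \rangle_{\mathbb{R}} \cong \mathbb{R}^n$$

and since the co-volume of the integer lattice in $\mathbb{R}^n$ is 1, we get that

$$d_p(\langle x_1^{(i)}, \ldots, \widehat{x_{k_i}^{(i)}}, \ldots, x_n^{(i)} \rangle_{\mathbb{R}}, \langle x_1^{(i)}, \ldots, \widehat{x_{l_i}^{(i)}}, \ldots, x_n^{(i)} \rangle_{\mathbb{R}}) \le \frac{e^4}{\|x_1^{(i)}\| \cdots \|x_n^{(i)}\|}$$

for $i \ge 1$, and because of (5.25) we get

$$d_p(u_{i-1}, u_i) \le \frac{e^4}{\|x_1^{(i)}\| \cdots \|x_n^{(i)}\|}. \tag{5.31}$$

Now observe that $i - 1$ and $i$ are consecutive indices. For arbitrary integers $0 \le r < t < s$ such that (5.29) holds we get

$$d_p(u_r, u_t) \le \sum_{i=r+1}^{t} d_p(u_{i-1}, u_i) \le \sum_{i=r+1}^{t} \frac{e^4}{\|x_1^{(i)}\| \cdots \|x_n^{(i)}\|} \le \frac{2e^4}{\|x_1^{(r+1)}\| \cdots \|x_n^{(r+1)}\|}.$$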

Hence we have derived (5.30). In particular we see that there exists a sequence $(u_i)_{i \ge 0}$ in $P^{n-1}(\mathbb{R})$ of unit vectors converging to a unit vector $u$ such that

$$d_p(u_i, u) \le \frac{2e^4}{\|x_1^{(i+1)}\| \cdots \|x_n^{(i+1)}\|}$$

for $0 \le i < s$. By choosing $u_i$ so that $u_i \cdot u \ge 0$ it follows that

$$\|u_i - u\| \le 2 d_p(u_i, u) \le \frac{4e^4}{\|x_1^{(i+1)}\| \cdots \|x_n^{(i+1)}\|} \le \frac{4e^4}{Q_{i+1}}.$$

This immediately gives that

$$|x_j^{(i)} \cdot (u - u_i)| \le \frac{8e^4 A_j^{(i)}}{Q_{i+1}},$$

since $|x_j^{(i)} \cdot (u - u_i)| \le \|x_j^{(i)}\| \cdot \|u - u_i\|$ and $\|x_j^{(i)}\| \le 2A_j^{(i)}$ by (5.23). If we use that $Q < Q_{i+1}$ and assume that $j \ne k_i$, so that $x_j^{(i)} \cdot u_i = 0$, we get

$$|x_j^{(i)} \cdot u| \, Q = |x_j^{(i)} \cdot (u - u_i)| \, Q \le 8e^4 A_j^{(i)}. \tag{5.32}$$

Notice that we also have $\|x_j^{(i)}\| \le 8e^4 A_j^{(i)}$, and so we can conclude that

$$A_j^{(i)} \le \max\{\|x_j^{(i)}\|, |x_j^{(i)} \cdot u| \cdot Q\} \le 8e^4 A_j^{(i)} \iff A_j^{(i)} \le \lambda(x_j^{(i)}, C_u(Q)) \le 8e^4 A_j^{(i)},$$

and we have proved the first claim. To prove part (ii) we use that

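$$\frac{\|x_{k_i}^{(i)}\|}{\|x_1^{(i)}\| \cdots \|x_n^{(i)}\|} \le \frac{1}{\|x_1^{(i)} \wedge \cdots \wedge \widehat{x_{k_i}^{(i)}} \wedge \cdots \wedge x_n^{(i)}\|},$$

since we in general have

$$\|x_1 \wedge \cdots \wedge x_n\| = \|x_1 \wedge \cdots \wedge x_{n-1}\| \cdot |x_n \cdot u| = 1 \iff |x_n \cdot u| = \frac{1}{\|x_1 \wedge \cdots \wedge x_{n-1}\|}.$$

If we now use (5.26) adapted to our situation we get

$$\frac{1}{\|x_1 \wedge \cdots \wedge x_{n-1}\|} \le \frac{e^2 \|x_{k_i}^{(i)}\|}{\|x_1^{(i)}\| \cdots \|x_n^{(i)}\|}.$$

Now that we have established the above main inequalities, we translate them into the language of the lemma to get

$$\frac{A_{k_i}^{(i)}}{Q_i} \le \frac{1}{\|x_1 \wedge \cdots \wedge x_{n-1}\|} \le \frac{e^2 A_{k_i}^{(i)}}{Q_i} \iff \frac{A_{k_i}^{(i)}}{Q_i} \le |x_{k_i}^{(i)} \cdot u| \le \frac{e^2 A_{k_i}^{(i)}}{Q_i}. \tag{5.33}$$

Now, part (iii) of the definition of the point $(A_1^{(i)}, \ldots, A_n^{(i)}) \in \mathbb{R}^n$ together with (5.33) yields

$$\frac{A_{k_i}^{(i)}}{2^n Q_i} \le |x_{k_i}^{(i)} \cdot u| \le \frac{8 A_{k_i}^{(i)}}{Q_i}$$

because

$$\frac{Q_{i+1}}{Q_i} \ge 2^{n+3} e^4.$$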

This completes the proof of the lemma.
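The proof above bounds the products $\prod_{i=2}^{k} (1 - 2^{1-i})$ from below by the constant $e^{-2}$. The following Python sketch checks this numerically. It is only a sanity check in floating-point arithmetic, and the helper name `partial_product` is an illustrative choice.

```python
import math

def partial_product(k):
    """prod_{i=2}^{k} (1 - 2^(1-i)); the empty product for k = 1 equals 1."""
    p = 1.0
    for i in range(2, k + 1):
        p *= 1.0 - 2.0 ** (1 - i)
    return p

# The partial products decrease towards the infinite product, so a large k
# gives a good approximation of the limiting value.
limit = partial_product(200)
print(limit, math.exp(-2))
```

The partial products decrease to a limit of about 0.2888, which is comfortably above $e^{-2} \approx 0.1353$, so the lower bound has room to spare.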

The above properties will be used to prove Roy’s contribution:

Theorem 49. If $q_0 \ge 0$ and $\mathbf{P} : [q_0, \infty) \to \mathbb{R}^n$ is an $(n, 0)$-system, then there exists a unit vector $u$ of $\mathbb{R}^n$ such that

$$\|\mathbf{P}(q) - \mathbf{L}_u(q)\|_\infty \le 3n^2(n + 9)$$

for all $q \ge q_0$.

What the above theorem says is that we can approximate a rigid n system P with the help of the functions Lu constructed before. This is a very important tool because it allows us to find exponents of Diophantine approximation.

Proof. Define

$$\mathbf{P}(q) = \Phi_n(0, \ldots, 0, P_1(q_0), \ldots, P_{i-1}(q_0), q - s_{i-1})$$

for $s_{i-1} \le q \le s_i$ and all $1 \le i \le n$, where

$$s_i = \begin{cases} 0, & i = 0, \\ \sum_{k=1}^{i} P_k(q_0), & i \in \{1, \ldots, n\}. \end{cases}$$

Then $\mathbf{P}$ is an $(n, 0)$-system and in particular it is reduced. Hence, by Theorem 41, we know that for any given value $\delta > 0$ there exists a rigid $n$-system $\mathbf{R} : [n(n + 1)/2, \infty) \to \mathbb{R}^n$ with mesh $\delta$ so that

$$\|\mathbf{P}(q) - \mathbf{R}(q)\|_\infty \le 3n^2 \delta \tag{5.34}$$

for all $q \ge n(n + 1)/2$. However, this will not immediately give us the wanted inequality. We will now apply Lemma 48 with

$$A_j^{(i)} = \exp(a_j^{(i)})$$

for $1 \le j \le n$ in order to derive a powerful inequality. Let $(x_1^{(i)}, \ldots, x_n^{(i)})$ be a basis of $\mathbb{Z}^n$ satisfying Lemma 46 and Corollary 46.1. Moreover, let $u$ be a unit vector with the properties from Lemma 48. First we note that

$$\log Q_i = \log \left( \prod_{j=1}^{n} \exp(a_j^{(i)}) \right) = \sum_{j=1}^{n} a_j^{(i)} = q_i$$

for $0 \le i < s$, and we put $q_s = \log Q_s = \infty$ if $s \ne \infty$. Let $q \in [q_i, q_{i+1})$ and notice that $e^q \in [Q_i, Q_{i+1})$. Since $x_1^{(i)}, \ldots, x_n^{(i)}$ form a basis of $\mathbb{Z}^n$, there exists a permutation $\sigma \in S_n$ such that

$$\lambda_j(C_u(e^q)) \le \lambda(x_{\sigma(j)}^{(i)}, C_u(e^q)).$$

If we now use Lemma 48 and that $L_{u,j}(q) = \log \lambda_j(C_u(e^q))$ we get

$$L_{u,j}(q) \le \log(8e^4 A_j^{(i)}) = \log(8e^4) + a_j^{(i)}$$

when $j \ne k_i$ and

$$\begin{aligned}
L_{u,j}(q) &\le \log \left( \frac{8 A_{k_i}^{(i)} e^q}{Q_i} \right) = \log 8 + a_{k_i}^{(i)} + q - \log \left( \prod_{j=1}^{n} \exp(a_j^{(i)}) \right) = \log 8 + a_{k_i}^{(i)} + q - q_i \\
&\le \log(8e^4) + a_{k_i}^{(i)} + q - q_i
\end{aligned}$$

if $j = k_i$. If we combine them we get

$$L_{u,j}(q) \le P_j(q) + \log(8e^4)$$

for all $j = 1, \ldots, n$. By (3.13) we then get

$$q - n \log(n) \le \sum_{j=1}^{n} L_{u,j}(q).$$

We also see that

$$0 \le P_j(q) + \log(8e^4) - L_{u,j}(q) \le n(\log n + \log(8e^4)) \tag{5.35}$$

since

$$\sum_{j=1}^{n} \bigl( P_j(q) + \log(8e^4) \bigr) = q + n \cdot \log(8e^4).$$

Also, since (5.35) holds true for all $1 \le j \le n$ we can conclude that

$$\|\mathbf{L}_u(q) - \mathbf{P}(q)\|_\infty \le n \log(8e^4 n). \tag{5.36}$$

Now we use this inequality applied to the rigid $n$-system $\mathbf{R}$. Then we know there exists a unit vector $u$ such that

$$\|\mathbf{L}_u(q) - \mathbf{R}(q)\|_\infty \le n \log(8e^4 n).$$

Let us choose $\delta = n + 7$. Then it follows, after combining the above inequality with (5.34), that

$$\begin{aligned}
\|\mathbf{P}(q) - \mathbf{L}_u(q)\|_\infty &= \|-\mathbf{P}(q) + \mathbf{R}(q) + \mathbf{L}_u(q) - \mathbf{R}(q)\|_\infty \le \|\mathbf{L}_u(q) - \mathbf{R}(q)\|_\infty + \|\mathbf{P}(q) - \mathbf{R}(q)\|_\infty \\
&\le n \log(8e^4 n) + 3n^2 \delta \le 3n^2(n + 9)
\end{aligned}$$

for $q \ge n(n + 1)\delta/2$, since it is for those values of $q$ that we can define $\mathbf{R}$. The case $q \le n(n + 1)\delta/2$ yields the same result because we then have

$$P_j(q) \le \sum_{k=1}^{n} P_k(q) = q \le \frac{n(n + 1)\delta}{2} \tag{5.37}$$

and since (3.13) holds we get

$$\sum_{i=1}^{n} L_{u,i}(q) - q \le n \log n \implies \sum_{i=1}^{n} L_{u,i}(q) \le q + n \log n \le \frac{n(n + 1)\delta}{2} + n \log n.$$

This implies that

$$\|\mathbf{P}(q) - \mathbf{L}_u(q)\|_\infty \le 3n^2(n + 9)$$

for $0 \le q \le n(n + 1)\delta/2$ and we have proved the theorem.
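The final step of the proof combines the two error terms under the choice $\delta = n + 7$. The short Python sketch below checks the resulting numerical inequality $n \log(8e^4 n) + 3n^2 \delta \le 3n^2(n + 9)$ for a range of $n$. The function names `combined_error` and `target_bound` are illustrative, and the check is numerical rather than a proof.

```python
import math

def combined_error(n):
    """n*log(8 e^4 n) + 3 n^2 delta with delta = n + 7, as in the proof."""
    delta = n + 7
    return n * math.log(8 * math.e ** 4 * n) + 3 * n ** 2 * delta

def target_bound(n):
    """The bound 3 n^2 (n + 9) claimed in Theorem 49."""
    return 3 * n ** 2 * (n + 9)

for n in range(2, 8):
    print(n, round(combined_error(n), 3), target_bound(n))
```

The inequality reduces to $\log(8n) + 4 \le 6n$, which holds for every $n \ge 2$. For $n = 1$ the left-hand side is about 6.08, so this particular estimate relies on $n \ge 2$, the case of interest in Theorem 50.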

Finally, we now obtain Roy’s main theorem [15, Thm. 1.3]:

Theorem 50. Assume that $n \ge 2$ is an integer and that $\delta \in (0, \infty)$. For each unit vector $u$ of $\mathbb{R}^n$ there exists a rigid system $\mathbf{P} : [q_0, \infty) \to \Delta_n$ with mesh $\delta$ such that $\mathbf{L}_u - \mathbf{P}$ is bounded on the interval $[q_0, \infty)$. Conversely, for each rigid system $\mathbf{P} : [q_0, \infty) \to \Delta_n$ with mesh $\delta$, there exists a unit vector $u$ in $\mathbb{R}^n$ such that $\mathbf{L}_u - \mathbf{P}$ is bounded on $[q_0, \infty)$.

Proof. This is a direct consequence of Theorems 28, 41 and 49.

Chapter 6

Exponents of Diophantine Approximation

In the study of Diophantine approximation the goal is to approximate an arbitrary real number with a rational number, or more generally, an arbitrary vector in Rn with a vector in Qn. The following theorem is rather standard and can be found in many elementary books; we follow [13] for the statement and proof. The proof uses the Pigeonhole Principle, which we assume is known from earlier studies.

Theorem 51 (Dirichlet’s approximation theorem). Let α be a real number and n a positive integer. Then there are integers a and b with 1 ≤ a ≤ n such that

|aα − b| < 1/n.

Proof. Let {α} denote the fractional part of α, that is {α} := α − ⌊α⌋. Then each of the numbers {jα} for j = 0, 1, ..., n belongs to exactly one of the disjoint intervals (k − 1)/n ≤ x < k/n, k = 1, 2, ..., n. Note that we have n + 1 numbers and n intervals. By the Pigeonhole Principle there must exist integers l and k with 0 ≤ k < l ≤ n so that

|{kα} − {lα}| < 1/n.

If we define a = l − k and b = ⌊lα⌋ − ⌊kα⌋ we get

|aα − b| = |lα − kα − ⌊lα⌋ + ⌊kα⌋| = |{kα} − {lα}| < 1/n

and we have proved the theorem.
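The pigeonhole argument in the proof is constructive, and the following Python sketch turns it into a small search. It assumes that α is given as an exact rational (`fractions.Fraction`) so that fractional parts are computed without rounding. The function name `dirichlet_pair` is an illustrative choice.

```python
from fractions import Fraction
from math import floor

def dirichlet_pair(alpha, n):
    """Return integers (a, b) with 1 <= a <= n and |a*alpha - b| < 1/n."""
    seen = {}  # interval index -> multiplier j with {j*alpha} in that interval
    for j in range(n + 1):
        frac = j * alpha - floor(j * alpha)  # fractional part {j*alpha}
        box = floor(frac * n)                # {j*alpha} lies in [box/n, (box+1)/n)
        if box in seen:                      # two multipliers share an interval
            k = seen[box]
            return j - k, floor(j * alpha) - floor(k * alpha)
        seen[box] = j
    raise AssertionError("unreachable: n + 1 numbers cannot avoid n intervals")

a, b = dirichlet_pair(Fraction(314159, 100000), 10)
print(a, b)  # prints 7 22
```

With α = 3.14159 and n = 10, the multipliers 1 and 8 land in the same interval, which gives a = 7 and b = 22, and indeed |7α − 22| ≈ 0.0089 < 1/10.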

A slight generalization of Theorem 51 states that for any real numbers $\xi_2, \ldots, \xi_n \in \mathbb{R}$ and $Q \ge 1$ there exists a nonzero point $y = (y_1, \ldots, y_n) \in \mathbb{Z}^n$ such that $|y_1| \le Q$ and

$$|\xi_i y_1 - y_i| \le Q^{-1/(n-1)} \tag{6.1}$$

for $i = 2, \ldots, n$. This follows from Minkowski’s theorem applied to the lattice

$$\Gamma = \{(y_1, \xi_2 y_1 - y_2, \ldots, \xi_n y_1 - y_n) : (y_1, \ldots, y_n) \in \mathbb{Z}^n\} \tag{6.2}$$

and the convex body

$$C_n(Q) = \{x = (x_1, \ldots, x_n) \in \mathbb{R}^n : |x_1| \le Q, \ |x_i| \le Q^{-1/(n-1)} \text{ for } i = 2, \ldots, n\}.$$

The determinant of this full lattice will be

$$\operatorname{vol}(\mathbb{R}^n/\Gamma) = \operatorname{abs} \begin{vmatrix} 1 & 0 & \cdots & 0 \\ \xi_2 & -1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ \xi_n & 0 & \cdots & -1 \end{vmatrix} = |(-1)^{n-1}| = 1.$$

The volume of $C_n(Q)$ is given by $2^n \cdot Q \cdot Q^{-(n-1)/(n-1)} = 2^n$ and hence we get that $V(C_n(Q)) \ge 2^n \det(\Gamma)$. By Minkowski’s theorem, $C_n(Q)$ contains a nonzero lattice point of $\Gamma$. From this the existence claimed around (6.1) follows. We have hence proved that the exponent $-1/(n - 1)$ is a valid one in (6.1). Are there others? Given $\xi_1, \ldots, \xi_{n-1} \in \mathbb{R}$, for which values $\eta > 0$ do there exist arbitrarily large integers $Q$ such that there exists $(x, y_1, \ldots, y_{n-1}) \in \mathbb{Z}^n \setminus \{0\}$ with $|x| \le Q$ and $|\xi_i x - y_i| \le Q^{-\eta}$ for $i = 1, \ldots, n - 1$? These types of questions form the basis of the theory of Diophantine exponents. Some classical exponents were studied by Khintchine [8, 9] and are defined as follows:

For $\xi = (\xi_1, \ldots, \xi_{n-1}) \in \mathbb{R}^{n-1}$ it is standard to define $\omega = \omega(\xi)$ to be the supremum of real numbers $\eta$ such that there exist arbitrarily large integers $Q$ for which the system of inequalities

$$|x| \le Q \tag{6.3}$$

and

$$|\xi_i x - y_i| \le Q^{-\eta} \quad (i = 1, \ldots, n - 1) \tag{6.4}$$

has a nontrivial solution $(x, y_1, \ldots, y_{n-1}) \in \mathbb{Z}^n$. We will also consider an exponent denoted $\omega^* = \omega^*(\xi)$, which is defined as the supremum of $\eta \in \mathbb{R}$ such that there exist arbitrarily large integers $Q$ for which the system of inequalities

$$\left| x - \sum_{k=1}^{n-1} \xi_k y_k \right| \le Q^{-\eta} \tag{6.5}$$

and

$$|y_i| \le Q \quad (i = 1, \ldots, n - 1) \tag{6.6}$$

admits a nonzero integer solution $(x, y_1, \ldots, y_{n-1}) \in \mathbb{Z}^n$. Finally, we let $\hat{\omega} = \hat{\omega}(\xi)$ and $\hat{\omega}^* = \hat{\omega}^*(\xi)$ be the corresponding uniform exponents, i.e. $\hat{\omega}$ is the supremum of real numbers $\eta$ such that for all sufficiently large integers $Q$ the system (6.3), (6.4) admits a nontrivial integer solution, and $\hat{\omega}^*$ is the supremum of real numbers $\eta$ such that for all sufficiently large integers $Q$ the system (6.5), (6.6) admits a nontrivial integer solution.
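To get a concrete feel for (6.1) and for the system (6.3), (6.4) with $\eta = 1/(n - 1)$, the following Python sketch searches by brute force for a solution. It works in floating-point arithmetic, so it is only an illustration. The function name `simultaneous_approx` and the sample point $(\sqrt{2}, \sqrt{3})$ are assumptions made for the example.

```python
import math

def simultaneous_approx(xis, Q):
    """Smallest 1 <= x <= Q with |xi*x - y_i| <= Q^(-1/(n-1)) for all xi,
    where n - 1 = len(xis) and y_i is the integer nearest to xi*x."""
    bound = Q ** (-1.0 / len(xis))
    for x in range(1, int(Q) + 1):
        ys = [round(xi * x) for xi in xis]
        if all(abs(xi * x - y) <= bound for xi, y in zip(xis, ys)):
            return (x, *ys)
    return None  # not expected: Minkowski's theorem guarantees a solution

print(simultaneous_approx((math.sqrt(2), math.sqrt(3)), 100))
```

For $n = 3$ and $Q = 100$ the bound is $Q^{-1/2} = 0.1$. One admissible solution is $x = 41$, since $41\sqrt{2} \approx 57.983$ and $41\sqrt{3} \approx 71.014$.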

6.1 Khinchin’s Transference Principle

There are different ways to present Khinchin’s transference principle, but we will first present it in the same way as it is done in [16, p. 80] and in [21, Satz 3]. This result goes back to [8] and says, with the above notation, that

$$\frac{\omega^*}{(n - 2)\omega^* + n - 1} \le \omega \tag{6.7}$$

and

$$(n - 1)\omega + n - 2 \le \omega^* \tag{6.8}$$

for $n > 2$. The inequalities that Khinchin’s transference principle provides lead to the following conclusion made by Schmidt and Summerer in [16]:

Theorem 52. Consider the lattice from (6.2) and use the convex body

$$C(Q) = [-Q, Q] \times [-Q^{-1/(n-1)}, Q^{-1/(n-1)}]^{n-1}$$

to define the corresponding functions $L_1, \ldots, L_n$. Then Khinchin’s transference principle is equivalent to

$$\liminf_{q \to \infty} \frac{L_1(q)}{q} + (n - 1) \limsup_{q \to \infty} \frac{L_n(q)}{q} \ge 0$$

and

$$\limsup_{q \to \infty} \frac{L_n(q)}{q} + (n - 1) \liminf_{q \to \infty} \frac{L_1(q)}{q} \le 0.$$
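The transference inequalities (6.7) and (6.8) are easy to test on sample values. The Python sketch below checks them with exact rational arithmetic. The function name `transference_holds` is illustrative, and the sample values $\omega = 1/(n - 1)$ and $\omega^* = n - 1$ are the exponents obtained from Dirichlet-type box arguments, used here only as test input.

```python
from fractions import Fraction

def transference_holds(n, omega, omega_star):
    """Check inequalities (6.7) and (6.8) for given omega and omega*."""
    lower = omega_star / ((n - 2) * omega_star + n - 1)  # left side of (6.7)
    upper = (n - 1) * omega + n - 2                      # left side of (6.8)
    return lower <= omega and upper <= omega_star

for n in range(3, 8):
    # Both inequalities hold with equality at these values.
    print(n, transference_holds(n, Fraction(1, n - 1), Fraction(n - 1)))
```

At $\omega = 1/(n - 1)$, $\omega^* = n - 1$ both (6.7) and (6.8) become equalities, so these values sit exactly on the boundary of the region allowed by the principle.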

The proof of Theorem 52 will require a lemma, presented in [16, Theorem 1.4]:

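Lemma 53. Let

$$E_1 = (\omega + 1) \left( 1 + \liminf_{q \to \infty} \frac{L_1(q)}{q} \right), \tag{6.9}$$

$$E_2 = (\hat{\omega} + 1) \left( 1 + \limsup_{q \to \infty} \frac{L_1(q)}{q} \right), \tag{6.10}$$

$$E_3 = (\omega^* + 1) \left( \frac{1}{n - 1} - \limsup_{q \to \infty} \frac{L_n(q)}{q} \right), \tag{6.11}$$

$$E_4 = (\hat{\omega}^* + 1) \left( \frac{1}{n - 1} - \liminf_{q \to \infty} \frac{L_n(q)}{q} \right). \tag{6.12}$$

Then all four expressions take the same value:

$$E_i = \frac{n}{n - 1} \quad (i = 1, \ldots, 4).$$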

Proof. We follow Section 4 in [16] for the proof of this lemma. Let

$$C_\gamma(Q) := \{x = (x, y_1, \ldots, y_{n-1}) \in \mathbb{R}^n : |x| \le Q^{1+\gamma}, \ |\xi_i x - y_i| \le Q^{-1/(n-1)+\gamma} \ (i = 1, \ldots, n - 1)\}. \tag{6.13}$$

Introducing the auxiliary variable

$$\eta := \frac{(n - 1)^{-1} - \gamma}{1 + \gamma} = -1 + \frac{n}{(n - 1)(1 + \gamma)}$$

makes it possible for us to rewrite (6.13) as

$$C_\gamma(Q) = \{x = (x, y_1, \ldots, y_{n-1}) \in \mathbb{R}^n : |x| \le Q^{1+\gamma}, \ |\xi_i x - y_i| \le (Q^{1+\gamma})^{-\eta} \ (i = 1, \ldots, n - 1)\}. \tag{6.14}$$

By the definition of $L_1(q)$, for any number $\gamma > \liminf_{q \to \infty} L_1(q)/q$ there exist arbitrarily large values of $Q$ for which $\lambda_1(C(Q)) < Q^\gamma$, i.e. $C_\gamma(Q) \cap \mathbb{Z}^n \ne \{0\}$. Hence, letting $X = Q^{1+\gamma}$ and comparing with the definition of the Diophantine exponent $\omega$, we conclude that $\gamma > \liminf_{q \to \infty} L_1(q)/q$ implies that $\omega \ge \eta$. Letting $\gamma \to \liminf_{q \to \infty} L_1(q)/q$ we now conclude:

$$\omega \ge -1 + \frac{n}{(n - 1)\left(1 + \liminf_{q \to \infty} \frac{L_1(q)}{q}\right)}. \tag{6.15}$$

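Equivalently,

$$\frac{n}{n - 1} \le (1 + \omega) \left( 1 + \liminf_{q \to \infty} \frac{L_1(q)}{q} \right).$$

Since $\omega$ is defined to be the supremum of the numbers $\eta$ such that there exist arbitrarily large values of $X > 0$ for which the system of inequalities $|x| < X$ and $|\xi_i x - y_i| \le X^{-\eta}$ $(i = 1, \ldots, n - 1)$ has a non-zero solution $(x, y_1, \ldots, y_{n-1}) \in \mathbb{Z}^n$, we see that if $\eta < \omega$ then there exist arbitrarily large values of $Q$ for which $C_\gamma(Q)$ contains a nonzero lattice point of $\mathbb{Z}^n$. By letting

$$X = Q^{1+\gamma} \quad \text{and} \quad \gamma = \frac{n}{(n - 1)(\eta + 1)} - 1$$

we get that

$$\liminf_{q \to \infty} \frac{L_1(q)}{q} \le \gamma.$$

Letting $\eta \to \omega$, this means that we get the following inequality:

$$\liminf_{q \to \infty} \frac{L_1(q)}{q} \le \frac{n}{(n - 1)(\omega + 1)} - 1 \iff (1 + \omega) \left( 1 + \liminf_{q \to \infty} \frac{L_1(q)}{q} \right) \le \frac{n}{n - 1},$$

which means that it is possible to conclude that

$$(\omega + 1) \left( 1 + \liminf_{q \to \infty} \frac{L_1(q)}{q} \right) = \frac{n}{n - 1}.$$

For $E_2$ we notice that if $\delta > \limsup_{q \to \infty} L_1(q)/q$ then the corresponding set $C_\delta(Q)$ will contain a nonzero lattice point of $\mathbb{Z}^n$ for all sufficiently large values of $Q$, and we obtain in a similar way as before that

$$\hat{\omega} \ge -1 + \frac{n}{(n - 1)\left(\limsup_{q \to \infty} \frac{L_1(q)}{q} + 1\right)},$$

and whenever $\eta < \hat{\omega}$ the system

$$\begin{cases} |x| \le Q^{1+\delta} \\ |\xi_i x - y_i| \le Q^{-1/(n-1)+\delta} & \text{for } i = 1, \ldots, n - 1 \end{cases}$$

must have a nonzero solution in $\mathbb{Z}^n$ for all sufficiently large values of $X = Q^{1+\delta}$. This means that

$$\limsup_{q \to \infty} \frac{L_1(q)}{q} \le \delta$$

and we get the inequality

$$(\hat{\omega} + 1) \left( 1 + \limsup_{q \to \infty} \frac{L_1(q)}{q} \right) \le \frac{n}{n - 1},$$

which means that we can conclude that $E_2 = n/(n - 1)$.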

Next, let $\nu_i(Q)$ denote the successive minima for $C(Q)$ with respect to $C_n(Q)$ and the dual lattice $\Gamma^*$ for $1 \le i \le n$. Define now

$$\pi_i(Q) := \frac{\log \nu_i(Q)}{\log Q}. \tag{6.16}$$

Note that if $C$ is a convex body and $C^*$ denotes the dual body, then we have that $C^* \subseteq C \subseteq nC^*$ and

$$\nu_i(Q) \le \lambda_i^*(Q) \le n \cdot \nu_i(Q),$$

which by monotonicity gives that

$$0 \le \log \lambda_i^*(Q) - \log \nu_i(Q) \le \log n.$$

This in turn means that

$$\left| \pi_i(Q) - \frac{\log \lambda_i^*(Q)}{\log Q} \right| \le \left| \frac{\log \nu_i(Q)}{\log Q} - \frac{\log \lambda_i^*(Q)}{\log Q} \right| \le \frac{\log n}{\log Q} \le \frac{n}{\log Q}.$$

Hence

$$\liminf_{Q \to \infty} \pi_i(Q) = \liminf_{Q \to \infty} \frac{\log \lambda_i^*(Q)}{\log Q}$$

and

$$\limsup_{Q \to \infty} \pi_i(Q) = \limsup_{Q \to \infty} \frac{\log \lambda_i^*(Q)}{\log Q}.$$

We now use the following inequality:

$$\lambda_i^*(Q) \lambda_{n+1-i}(Q) \asymp 1, \tag{6.17}$$

which was proved by Mahler in 1939. The proof can be found in e.g. [4, Th. VI, p. 219]. This gives us

$$\limsup_{Q \to \infty} \frac{\lambda_i^*(Q)}{\log Q} + \limsup_{Q \to \infty} \frac{\lambda_{n+1-i}(Q)}{\log Q} < \frac{A}{\log Q},$$

where $A$ is some constant. Using the formulation found in [4] we have $A = n!$, but other sharper estimates can be used as well. Following Banaszczyk in [22] we see that we can take $A = n$. However, what we want to conclude is that

$$\limsup_{Q \to \infty} \frac{\lambda_i^*(Q)}{\log Q} = -\liminf_{Q \to \infty} \frac{\lambda_{n+1-i}(Q)}{\log Q} = -\liminf_{Q \to \infty} \pi_1(Q)$$

and

$$\liminf_{Q \to \infty} \frac{\lambda_i^*(Q)}{\log Q} = -\limsup_{Q \to \infty} \frac{\lambda_{n+1-i}(Q)}{\log Q} = -\limsup_{Q \to \infty} \pi_1(Q).$$

Hence we see that

$$E_3 = (\omega^* + 1) \left( \frac{1}{n - 1} + \liminf_{Q \to \infty} \pi_1(Q) \right).$$

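Let $\tilde{\gamma} > \omega^*$ and consider the following convex body:

$$C_{\tilde{\gamma}}(Q) = \left\{ x = (x, y_1, \ldots, y_{n-1}) \in \mathbb{R}^n : |x| < Q^{1+\tilde{\gamma}}, \ \left| x - \sum_{j=1}^{n-1} \xi_j y_j \right| \le Q^{-1/(n-1)+\tilde{\gamma}} \right\}.$$

By the definition of $\omega^*$ we have that $C_{\tilde{\gamma}}(Q)$ contains a nonzero integer point for some arbitrarily large values of $Q$. If we use the same reasoning as before with the same $\eta$ we get

$$-1 + \frac{n}{(n - 1)\left(1 + \liminf_{Q \to \infty} \pi_1(Q)\right)} \le \omega^* \iff (\omega^* + 1) \left( \frac{1}{n - 1} + \liminf_{Q \to \infty} \pi_1(Q) \right) \ge \frac{n}{n - 1}.$$

Whenever $\eta < \omega^*$ we get the reversed inequality, and combining them yields

$$(\omega^* + 1) \left( \frac{1}{n - 1} + \liminf_{Q \to \infty} \pi_1(Q) \right) = \frac{n}{n - 1}.$$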

Hence we have proved that E3 = n/(n − 1). The proof for E4 is analogous with the proof for E3.

We will now prove Theorem 52 using the above lemma.

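Proof. If we use Khinchin’s transference principle, (6.7), we get

$$\begin{aligned}
\frac{\omega^*}{(n - 2)\omega^* + n - 1} \le \omega &\iff \frac{\omega^*}{(n - 2)\omega^* + n - 1} + 1 \le \omega + 1 \iff \frac{\omega^* + (n - 2)\omega^* + n - 1}{(n - 2)\omega^* + n - 1} \le \omega + 1 \\
&\iff \frac{n - 1}{1/(\omega^* + 1) + n - 2} \le \omega + 1.
\end{aligned}$$

Let

$$\underline{L}_1 := \liminf_{q \to \infty} \frac{L_1(q)}{q} \quad \text{and} \quad \overline{L}_n := \limsup_{q \to \infty} \frac{L_n(q)}{q}.$$

By Lemma 53 we get

$$\begin{aligned}
\frac{n - 1}{\left( \frac{1}{n - 1} - \overline{L}_n \right) \frac{n - 1}{n} + n - 2} \le \frac{n/(n - 1)}{1 + \underline{L}_1} &\iff \frac{n}{n - 1 - \overline{L}_n} \le \frac{n}{(n - 1)(1 + \underline{L}_1)} \\
&\iff (n - 1)(1 + \underline{L}_1) \le n - 1 - \overline{L}_n \\
&\iff \overline{L}_n + (n - 1)\underline{L}_1 \le 0.
\end{aligned}$$

If we now use (6.8) instead:

$$\begin{aligned}
\omega^* + 1 \ge (n - 1)\omega + n - 1 &\iff \frac{\omega^* + 1}{n - 1} \ge \frac{n}{(n - 1)(1 + \underline{L}_1)} \\
&\iff \frac{n}{(n - 1)^2 \left( \frac{1}{n - 1} - \overline{L}_n \right)} \ge \frac{n}{(n - 1)(1 + \underline{L}_1)} \\
&\iff (n - 1)\overline{L}_n + \underline{L}_1 \ge 0,
\end{aligned}$$

which finishes the proof.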

6.2 A Consequence of Roy’s Contribution

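For any nonzero unit vector $u \in \mathbb{R}^n$ Roy defines various Diophantine exponents $\tau(u)$, $\hat{\tau}(u)$, $\lambda(u)$, $\hat{\lambda}(u)$; these are defined in a “projective” way and the connection with the more standard definitions is as follows: all of $\tau$, $\hat{\tau}$, $\lambda$ and $\hat{\lambda}$ are invariant under the map $u \mapsto cu$ $(c \in \mathbb{R} \setminus \{0\})$, and for $u_n = 1$, i.e. $u = (u_1, \ldots, u_{n-1}, 1)$ (with $u_1, \ldots, u_{n-1}, 1$ $\mathbb{Q}$-linearly independent), we have

$$\tau(u) = \omega^*(u_1, \ldots, u_{n-1}); \quad \hat{\tau}(u) = \hat{\omega}^*(u_1, \ldots, u_{n-1}); \quad \lambda(u) = \omega(u_1, \ldots, u_{n-1}); \quad \hat{\lambda}(u) = \hat{\omega}(u_1, \ldots, u_{n-1}),$$

where “$\omega^*, \hat{\omega}^*, \omega, \hat{\omega}$” is the notation of Schmidt and Summerer [16].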

Theorem 50 brings a very important corollary regarding exponents of Diophantine approximation. This is due to Roy, [15], and establishes the following bijection:

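Theorem 54. There exists a bijection between the set of quadruples $(\tau(u), \hat{\tau}(u), \hat{\lambda}(u), \lambda(u))$, where $u$ runs through all unit vectors of $\mathbb{R}^n$ with $\mathbb{Q}$-linearly independent coordinates, and the set of quadruples

$$\left( \liminf_{q \to \infty} \frac{P_1(q)}{q}, \ \limsup_{q \to \infty} \frac{P_1(q)}{q}, \ \liminf_{q \to \infty} \frac{P_n(q)}{q}, \ \limsup_{q \to \infty} \frac{P_n(q)}{q} \right), \tag{6.18}$$

where $\mathbf{P} = (P_1, \ldots, P_n)$ goes through all rigid $n$-systems of mesh $\delta > 0$ for which $P_1$ is unbounded. The bijection is given by $\theta : \mathbb{R}^4 \to \mathbb{R}^4$ defined as

$$\theta(\tau, \hat{\tau}, \hat{\lambda}, \lambda) = \left( \frac{1}{1 + \tau}, \ \frac{1}{1 + \hat{\tau}}, \ \frac{\hat{\lambda}}{1 + \hat{\lambda}}, \ \frac{\lambda}{1 + \lambda} \right). \tag{6.19}$$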

Proof. We will see that this, in more precise terms, is a consequence of Theorem 50, combined with Lemma 53 and a thorough translation between Schmidt and Summerer’s notation and Roy’s. Let us start by connecting the definitions of $\tau(u)$, $\hat{\tau}(u)$, $\lambda(u)$ and $\hat{\lambda}(u)$ with the more standard definitions. Note that the definitions make sense verbatim for any vector $u \in \mathbb{R}^n \setminus \{0\}$, and furthermore, for any $u \in \mathbb{R}^n \setminus \{0\}$ and $c \in \mathbb{R} \setminus \{0\}$ we have $\tau(cu) = \tau(u)$, and similarly for the exponents $\hat{\tau}$, $\lambda$ and $\hat{\lambda}$; thus $\tau$, $\hat{\tau}$, $\lambda$ and $\hat{\lambda}$ are really functions on $(\mathbb{R}^n \setminus \{0\})/\mathbb{R}^\times \simeq P^{n-1}(\mathbb{R})$. Considering now the special case $u_n = 1$, and writing $m = n - 1$ together with $u = (u_1, \ldots, u_m, 1) \in \mathbb{R}^{m+1}$, we see that $\tau(u)$, respectively $\hat{\tau}(u)$, is the supremum of all $\tau > 0$ for which

$$\left[ \exists q \in \mathbb{Z}^m, \ a \in \mathbb{Z} : \|(q, a)\| \le X \ \text{and} \ \left| \sum_{k=1}^{m} q_k u_k + a \right| \le X^{-\tau} \right]$$

holds for arbitrarily large values of $X$, respectively for all sufficiently large values of $X$. Here (since $\tau > 0$) $a$ is forced to be the integer closest to $-(q_1 u_1 + \cdots + q_m u_m)$ for all large $X$; hence we may just as well define $\tau(u)$ and $\hat{\tau}(u)$ as the supremum of all $\tau > 0$ such that

$$\left[ \exists q \in \mathbb{Z}^m, \ a \in \mathbb{Z} : \|q\| \le X \ \text{and} \ \langle q_1 u_1 + \cdots + q_m u_m \rangle \le X^{-\tau} \right]$$

holds for arbitrarily large values of $X$, respectively for all sufficiently large values of $X$. Notice that it does not matter here if $\| \cdot \|$ is the $l^2$-norm or the $l^\infty$-norm; hence it follows that $\tau(u) = \omega^*(u_1, \ldots, u_{n-1})$ and $\hat{\tau}(u) = \hat{\omega}^*(u_1, \ldots, u_{n-1})$ in the notation of Schmidt and Summerer.

A remark is that Schmidt and Summerer only define $\omega^*(u_1, \ldots, u_{n-1})$ and $\hat{\omega}^*(u_1, \ldots, u_{n-1})$ if $1, u_1, \ldots, u_{n-1}$ are linearly independent over $\mathbb{Q}$, which is equivalent to $u_1, \ldots, u_n$ being linearly independent over $\mathbb{Q}$, and this condition is invariant under $u \mapsto cu$.

Next, consider $\lambda(u)$ and $\hat{\lambda}(u)$ and note that

$$\|x \wedge u\| = \sqrt{\sum_{i < j} (x_i u_j - x_j u_i)^2}$$

and that

$$\|x \wedge u\| \asymp_u \sum_{i=1}^{n-1} |x_n u_i - x_i|. \tag{6.20}$$

Indeed, for $1 \le i < j < n$ we get

$$|x_i u_j - x_j u_i| = \left| x_i \left( u_j - \frac{x_j}{x_n} \right) - x_j \left( u_i - \frac{x_i}{x_n} \right) \right| \le \left| \frac{x_i}{x_n} \right| |x_n u_j - x_j| + \left| \frac{x_j}{x_n} \right| |x_n u_i - x_i|.$$

Hence we see that $\lambda(u) = \omega(u_1, \ldots, u_{n-1})$ and $\hat{\lambda}(u) = \hat{\omega}(u_1, \ldots, u_{n-1})$ in the notation of Schmidt and Summerer in [16].

Since $u$ is assumed to be a unit vector the above relations become

$$\tau(u) = \omega^*(\xi); \quad \hat{\tau}(u) = \hat{\omega}^*(\xi); \quad \lambda(u) = \omega(\xi); \quad \hat{\lambda}(u) = \hat{\omega}(\xi),$$

where

$$\xi = (\xi_1, \ldots, \xi_{n-1}) = \left( \frac{u_1}{u_n}, \ldots, \frac{u_{n-1}}{u_n} \right) \in \mathbb{R}^{n-1}. \tag{6.21}$$

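From the discussion in the appendix to Chapter 4 it is possible to derive that

$$L_j^{[SS]}(\Lambda, K, q) = L_j((n - 1)q). \tag{6.22}$$

Then it follows from (4.14) that

$$L_j((n - 1)q) = L_j^{[SS]}(\mathbb{Z}^n, RK(q)) + O(1) = q - L_{u,n+1-j}^{[Roy]}(nq) + O(1). \tag{6.23}$$

Remember that

$$\Lambda = \{(y, \xi_1 y - y_1, \ldots, \xi_{n-1} y - y_{n-1}) : (y, y_1, \ldots, y_{n-1}) \in \mathbb{Z}^n\},$$

while $u$ is proportional to $A^{-1} e_1$, where $A$ is a matrix in $GL_n(\mathbb{R})$ with $\Lambda = A\mathbb{Z}^n$. Let us choose

$$A = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ \xi_1 & -1 & 0 & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ \xi_{n-1} & 0 & 0 & \cdots & -1 \end{pmatrix}.$$

It is then easy to verify that $\Lambda = A\mathbb{Z}^n$ and that

$$A \begin{pmatrix} 1 \\ \xi_1 \\ \vdots \\ \xi_{n-1} \end{pmatrix} = e_1 \iff A^{-1} e_1 = \begin{pmatrix} 1 \\ \xi_1 \\ \vdots \\ \xi_{n-1} \end{pmatrix}.$$

The conclusion from this is that the vectors $u$ and $\xi$ can be obtained from one another via (6.21), after a certain cyclic permutation of the coordinates of $u$. Now, let $\mathbf{P} = (P_1, \ldots, P_n)$ be a rigid system corresponding to the unit vector $u$ as in Roy’s main result, Theorem 50. Then for each $j$, the difference

$$L_{u,n+1-j}(q) - P_{n+1-j}(q)$$

is uniformly bounded for all large $q$. This implies that

$$L_j((n - 1)q) = q - P_{n+1-j}(nq) + O(1)$$

and so

$$\limsup_{q \to \infty} \frac{L_j((n - 1)q)}{q} = 1 - \liminf_{q \to \infty} \frac{P_{n+1-j}(nq)}{q}, \quad \text{and likewise} \quad (n - 1) \liminf_{q \to \infty} \frac{L_j(q)}{q} = 1 - n \limsup_{q \to \infty} \frac{P_{n+1-j}(q)}{q}.$$

Letting

$$\underline{L}_j := \liminf_{q \to \infty} \frac{L_j(q)}{q}; \quad \overline{L}_j := \limsup_{q \to \infty} \frac{L_j(q)}{q}; \quad \underline{P}_j := \liminf_{q \to \infty} \frac{P_j(q)}{q}; \quad \overline{P}_j := \limsup_{q \to \infty} \frac{P_j(q)}{q}$$

then gives the relations

$$(n - 1)\overline{L}_j = 1 - n\underline{P}_{n+1-j} \iff \overline{L}_j = \frac{1}{n - 1} \left( 1 - n\underline{P}_{n+1-j} \right) \tag{6.24}$$

and

$$(n - 1)\underline{L}_j = 1 - n\overline{P}_{n+1-j} \iff \underline{L}_j = \frac{1}{n - 1} \left( 1 - n\overline{P}_{n+1-j} \right). \tag{6.25}$$

Applying this to $E_1$ in Lemma 53, it follows that

$$\frac{n}{n - 1} = (\omega + 1)(1 + \underline{L}_1) = (\lambda + 1) \left( 1 + \frac{1}{n - 1} \left( 1 - n\overline{P}_n \right) \right) \iff 1 = (\lambda + 1)(1 - \overline{P}_n) \iff \overline{P}_n = \frac{\lambda}{\lambda + 1}.$$

The formula $E_2$ in Lemma 53 gives

$$\frac{n}{n - 1} = (\hat{\omega} + 1)(1 + \overline{L}_1) = (\hat{\lambda} + 1) \left( 1 + \frac{1}{n - 1} \left( 1 - n\underline{P}_n \right) \right) \iff \underline{P}_n = \frac{\hat{\lambda}}{1 + \hat{\lambda}}.$$

The formulas $E_3$ and $E_4$ in Lemma 53 give

$$\frac{n}{n - 1} = (\omega^* + 1) \left( \frac{1}{n - 1} - \overline{L}_n \right) = (\tau + 1) \left( \frac{1}{n - 1} - \frac{1}{n - 1} \left( 1 - n\underline{P}_1 \right) \right) \iff \underline{P}_1 = \frac{1}{1 + \tau}$$

and

$$\frac{n}{n - 1} = (\hat{\omega}^* + 1) \left( \frac{1}{n - 1} - \underline{L}_n \right) = (\hat{\tau} + 1) \left( \frac{1}{n - 1} - \frac{1}{n - 1} \left( 1 - n\overline{P}_1 \right) \right) \iff \overline{P}_1 = \frac{1}{1 + \hat{\tau}}.$$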
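The map $\theta$ from (6.19) is simple enough to evaluate directly. The following Python sketch implements it with exact rationals and evaluates it at the sample values $\tau = \hat{\tau} = n - 1$ and $\lambda = \hat{\lambda} = 1/(n - 1)$. These inputs are an assumption chosen for illustration (they are the Dirichlet-type exponents), and the function name `theta` simply mirrors the symbol in the theorem.

```python
from fractions import Fraction

def theta(tau, tau_hat, lam_hat, lam):
    """The map (6.19), sending (tau, tau^, lambda^, lambda) to the quadruple (6.18)."""
    return (
        1 / (1 + tau),
        1 / (1 + tau_hat),
        lam_hat / (1 + lam_hat),
        lam / (1 + lam),
    )

n = 4
tau = Fraction(n - 1)     # sample value for tau and tau^
lam = Fraction(1, n - 1)  # sample value for lambda and lambda^
print(theta(tau, tau, lam, lam))
```

At these sample values every coordinate equals $1/n$, which matches the picture of a system in which $P_1(q)/q$ and $P_n(q)/q$ both tend to $1/n$.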

Bibliography

[1] R. Bott and L. W. Tu, Differential Forms in Algebraic Topology. Springer Verlag, New York inc, 1982.

[2] H. Brezis, Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer, New York, 2011.

[3] Y. Bugeaud and M. Laurent, Diophantine approximation and parametric geometry of numbers. Monatsh. Math. 169, 51–104, (2006).

[4] J.W.S. Cassels, An Introduction to the Geometry of Numbers. Springer-Verlag Berlin Heidelberg 1997.

[5] Jesse Ira Deutsch, Geometry of Numbers Proof of Götzky’s Four-Squares Theorem. Journal of Number Theory 96, 417–431, 2002.

[6] O. German, On Diophantine exponents and Khintchine’s Transference Principle. arXiv:1004.4933 [math.NT] 2010.

[7] J. Jost, Riemannian Geometry and Geometric Analysis. Springer Verlag Berlin Heidelberg, 2011.

[8] A. Khintchine, Über eine Klasse linearer Diophantischer Approximationen. Rend. Circ. Math. Palermo 50, 170–195 (1926).

[9] A. Khintchine, Zur metrischen Theorie der diophantischen Approximationen. Mathematische Zeitschrift 24, 706-714 (1926).

[10] K. Mahler, On Compound Convex Bodies I. Proc. Lond. Math. Soc 5(3), 358-379 (1955).

[11] K. Mahler, On Compound Convex Bodies II. Proc. Lond. Math. Soc 5(3), 380-384 (1955).

[12] A. Marnat, About Jarník’s-type Relation in Higher Dimension. Ann. Inst. Fourier, Grenoble 68, 1 (2018) 131-150.

[13] K. H. Rosen, Elementary Number Theory and Its Applications. Pearson education, Inc, 2011.

[14] S. Roman, Advanced Linear Algebra. Springer Science+Business Media, LLC, 2008.

[15] D. Roy, On Schmidt and Summerer parametric geometry of numbers. Annals of Math. 182, 739-786, (2015).

[16] W. M. Schmidt and L. Summerer, Parametric geometry of numbers and applications. Acta Arith. 140, (2009), 67–91.

[17] W. M. Schmidt and L. Summerer, Diophantine approximation and parametric geometry of numbers. Monatsh. Math. 169, (2013), 51–104.

[18] C. L. Siegel, Lectures on the Geometry of Numbers. Springer-Verlag Berlin Heidelberg, 1989, Notes by B. Friedman, Rewritten by Komaravolu Chandrasekharan with the assistance of Rudolf Suter, With a preface by Chandrasekharan.

[19] I. Stewart and D. Tall, Algebraic Number Theory and Fermat’s Last Theorem. Third edition, AK Peters, 2002.

[20] M. Laurent, On transfer inequalities in Diophantine Approximation. https://arxiv.org/abs/math/0703146. 2007.

[21] V. Jarník, Zum Khintchineschen "Übertragungssatz". Trudy Tbilis. Mat. Instituta 3 (1938), pp. 193–216.

[22] W. Banaszczyk, New bounds in some transference theorems in the geometry of numbers. Mathematische Annalen, 296(4):625–635, 1993.
