Contemporary Mathematics
Orders of Finite Groups of Matrices
Robert M. Guralnick and Martin Lorenz
To Don Passman, on the occasion of his 65th birthday
ABSTRACT. We present a new proof of a theorem of Schur’s from 1905 determining the least common multiple
of the orders of all finite groups of complex n × n-matrices whose elements have traces in the field É of rational numbers. The basic method of proof goes back to Minkowski and proceeds by reduction to the case of finite fields.
For the most part, we work over an arbitrary number field rather than É. The first half of the article is expository and is intended to be accessible to graduate students and advanced undergraduates. It gives a self-contained treatment, following Schur, over the field of rational numbers.
1. Introduction 1.1. How large can a finite group of complex n n-matrices be if n is fixed? Put differently: if is × G a finite collection of invertible n n-matrices over C such that the product of any two matrices in again belongs to , is there a bound on× the possible cardinality , usually called the order of ? WithoutG further restrictionsG the answer to this question is of course negative.|G| Indeed, the complex numbersG contain all roots of ∗ unity; so there are arbitrarily large finite groups inside C . Thinking of complex numbers as scalar matrices,
we also obtain arbitrarily large finite groups of n n-matrices over C. The situation changes when certain arithmetic× conditions are imposed on the matrix group . When G all matrices in have entries in the field É rational numbers, Minkowski [33] has shown that the order of divides someG explicit, and optimal, constant M(n) depending only on the matrix size n. Later, Schur G[39] improved on this result by showing that Minkowski’s bound M(n) still works if only the traces of all
matrices in are required to belong to É. G 1.2. The first four sections of this article present full proofs of the theorems of Schur and Minkowski that depend on very few prerequisites. These sections follow Schur’s approach via character theory and have been written with a readership of beginning graduate and advanced undergraduatestudents in mind. Provided the reader is willing to accept one simple fact concerning group representations (Fact 2 in Section 3.2 below), the proofs will be completely understandable with only a rudimentary knowledge of linear algebra, group theory (symmetric groups, Sylow’s theorem), and some algebraic number theory (minimal polynomials, Galois groups of cyclotomic fields). The requisite background material will be reviewed in Section 3.
2000 Mathematics Subject Classification. Primary 20-02, 20C15, 20G40; Secondary 11B99. Key words and phrases. Finite linear group, group representation, Jordan bound, number field, Minkowski sequence. The first author was supported in part by NSF Grant DMS 0140578. Research of the second author supported in part by a grant from the NSA.
c 0000 (copyright holder) 1 2 ROBERT M. GURALNICK AND MARTIN LORENZ
The material in Section 5 is new. We show that Minkowski’s original approach used in [33] in fact also yields Schur’s theorem [39]. Minkowski’s method is conceptually very simple, and it quickly and elegantly explains why some bound on the order must exist, even for arbitrary algebraic number fields, |G| that is, finite extensions of É. The method proceeds by reduction modulo suitably chosen primes and then using information about the orders of certain classical linear groups over finite fields. In fact, the general linear group alone almost suffices; only dealing with the 2-part of using this strategy requires additional information. Since we work over algebraic number fields, a bit more|G| mathematical background is assumed in this section. As of this writing, Schur’s theorem first appeared in print exactly a century ago and Minkowski’s goes even further back. In the final section of this article, we will survey some recent related work of Collins, Feit and Weisfeiler on finite groups of matrices, in particular on the so-called Jordan bound. We will also mention two mysterious coincidences concerning the Minkowski numbers M(n), one proven but unexplained, the other merely based on experimental evidence as of now.
1.3. Minkowski [33] provedhis remarkabletheorem in the course of his investigationof quadratic forms. Stated in group theoretical terms, the theorem reads as follows.
THEOREM 1 (Minkowski 1887). The least common multiple of the orders of all finite groups of n n- ×
matrices over É is given by
j k n n n − + − + +... (1) M(n)= p⌊ p 1 ⌋ ⌊ p(p 1) ⌋ p2(p−1) p Y Here, . denotes the greatest integer less than or equal to . and p runs over all primes. Note that if p>n+1 then⌊ ⌋ the corresponding factor in the product equals 1 and can be omitted. Therefore, (1) is actually a finite product. The first few values of M(n) are: M(1) = 21 =2 ,M(2) = 22+1 31 = 24 ,M(3) = 23+1 31 = 48 ,M(4) = 24+2+1 32 51 = 5760 .
1.4. For a positive integer m and a prime p, let mp denote the p-part of m, that is, the largest power of p
j k n n n p−1 + p(p−1) + 2 − +... dividing m. Thus, M(n)p = p⌊ ⌋ ⌊ ⌋ p (p 1) . This number can be written in a more compact form. Indeed, the p-part of m!=1 2 . . . m is given by
· · ·
j k m m p + 2 +... (2) (m!)p = p⌊ ⌋ p .
′ m ′ To see this, put m = p and note that m!= p (2p) . . . (m p) (factors not divisible by p) . Therefore, ′ · · · · m ′ (m!)p = p (m !)p andj (2)k follows by induction. Using (2) we can write
n p−1 n (3) M(n)p = p p−1 ! . ⌊ ⌋ p j k 1.5. The notation M(n), in the variant Mn, was introduced by Schur in [39] to honor Minkowski who had originally denoted the same number by n . Relaxing the condition in Theorem 1 that all matrix entries
be rational and replacing it with the weaker requirement that only the matrix traces belong to É, Schur was
able to prove that Minkowski’s bound M(n) still works: É THEOREM 2 (Schur 1905). If is any finite group of n n-matrices over C such that trace(g) holds for all g then the order ofG divides M(n). × ∈ ∈ G G Schur’s theorem covers a considerably larger class of groups than Theorem 1. In [39], the following example of a group covered by Theorem 2 but not Theorem 1 is given. FINITE MATRIX GROUPS 3
0 −1 i 0 EXAMPLE 3. Consider the matrices g = and h = , where i = √ 1 C. Then 1 0 0 −i − ∈ g2 = h2 = 1 and gh = ( 0 i ) = hg. Thus = 1 , g, h, gh is a group of complex 2×2 i 0 2×2 2 2-matrices− of order 8; it is isomorphic− to the so-calledG quaternion{± ± group± ±. Note} that the traces of all × Q8 elements of are rational – they are either 0 or 2 – but certainly does not consist of matrices over É. In fact, thereG does not even exist an invertible complex± 2 G 2-matrix a such that the matrices x = aga−1 −1 × and y = aha both have entries in the field Ê of real numbers. To see this, note that x and y both would have determinant 1 and trace 0, as g and h do. A direct calculation shows that the product matrix z = xy then satisfies z 2 + x 2 + y 2 = x y trace(z), where . indicates the (1, 2)-entry of the matrix in 12 12 12 − 12 12 12 question. However, trace(z) = trace(gh)=0. Hence, if x and y are matrices over Ê then all terms on the left will be zero. But then 1 = det(x)= x x = x 2 which is impossible. 11 22 − 11 We remark in passing that, for any “irreducible” finite group of complex n n-matrices, a necessary and sufficient condition for the existence of an invertible complexGn n-matrix a such× that aga−1 is real for all g is that × ∈ G 1 trace(g2)=1 . |G| gX∈G The sum on the left is called the Frobenius-Schur indicator of ; see, e.g., Isaacs [20, Chapter 4]. The group = in the example above has Frobenius-Schur indicator G1. G Q8 − 1.6. The proof of Theorems 1 and 2 to be given in Section 4 below proceeds by first exhibiting suf- ficiently large groups of rational (in fact, integer) matrices showing that the least common multiple of the
orders of all finite groups of n n-matrices over É must be at least equal to M(n). Thereafter, we may concentrate on Theorem 2 which× in particular implies that the least common multiple in Theorem 1 does not exceed M(n). Apart from updating terminology and notation to current usage and adding more generous details to the exposition, we have followed Schur’s original approach in [39] quite closely. For a proof of Schur’s theorem using slightly more sophisticated tools from representation theory, see Isaacs [20, Theorem 14.19]. A proof of Minkowski’s Theorem 1 is also presented in Burnside [6, pp. 479–484] and in Bourbaki [4, chap. III 7, Exerc. 5–8]. Stronger results can be found in Feit [17] and in Serre [41, pp. 518–519]. § 1.7. This article is dedicated to our friend and colleague Don Passman. Don’s contributions to group theory and ring theory in general and his expository masterpieces [35],[36] in particular have profoundly influenced our own work. In the course of various collaborations with Don, we have both benefitted from his deep insights and his generosity in sharing ideas.
NOTATIONS. Throughout, GLn(R) will denote the group of all invertible n n-matrices over the commutative ring R. Recall that a matrix over R is invertible if and only if its determinant× is an invertible element of R.
2. Large groups of integer matrices
The principal goal of this section is to construct certain groups of n n-matrices over Z such that the least common multiple of their orders equals the Minkowski bound M(n)×in (1). This will then allow us to give a reformulation of the core of Theorem 2.
2.1. Construction of groups. The main building blocks of the construction will be the symmetric groups for various r. Recall that consists of all permutations of 1,...,r and has order r! . Sr Sr { }
PROPOSITION 4. Let a, m and n be positive integers with am n. Then GLn(Z) has a subgroup of order = (m + 1)! a a! . ≤ G |G|
4 ROBERT M. GURALNICK AND MARTIN LORENZ Z PROOF. If we can realize inside GL (Z) then we can view as a subgroup of GL ( ) via G am G n
GLam(Z) Z GLam(Z) ∼= 1 GLn( ) . G ⊆ ⊆ . .. 1 Therefore, we may assume that n = am. Think of the rows of any n n-matrix as partitioned into a blocks × of m adjacent rows, and similarly for the columns. Now consider all matrices in GLn(Z) that have exactly one m m-identity matrix 1 in each block of rows and each block of columns and 0s elsewhere; these × m×m are special permutation matrices. In fact, the collection of all these matrices forms a subgroup Π GLn(Z) that is isomorphic to the symmetric group : ⊆ Sa 1m×m ...... 1m×m
Z a ∼= Π= . GLn( ) . S .. ⊆ . . . 1 m×m m+1 Next, we turn to the symmetric group . This group acts on the lattice Z by permuting its canon- Sm+1 ical basis e1 = (1, 0,..., 0),...,em+1 = (0,..., 0, 1) via σ(ei) = eσ(i). Note that this action maps the following sublattice to itself:
m+1 m Z A = (z ,...,z ) Z z =0 = m { 1 m+1 ∈ | i } ∼ i X
(The notation Am comes from the theory of root systems; cf. [3].) Thus, fixing some Z-basis of Am, each
permutation σ yields a matrix σ GL (Z). It is easy to see that the map σ σ is an injec- ∈ Sm+1 ∈ m 7→ tive group homomorphism m+1 GLm(Z). Stringing each a-tuple (σ1,..., σa) along the diagonal in
S → a Z GL (Z) we obtain a subgroup ∆ GL ( ) that is isomorphic to : n ⊆ ne Sm+1 e f f σ1 σ
a f 2 Z m+1 = m+1 m+1 ∼= ∆= GLn( ) . S S ×···×S . ⊆ .. a factors f | {z } σa The subgroup Π of GLn(Z) constructed earlier has only the identity matrixf in common with ∆. Moreover, conjugatinga matrix in ∆ with a matrix from Π simply permutes the σi-blocks along the diagonal. Therefore,
defining to be the subgroup of GL (Z) that is generated by Π and ∆, we obtain G n = ∆ Π = (m + 1)! a a!e, |G| | | | | as desired.
n Now fix a prime p n +1. Taking m = p 1 and a = p−1 in Proposition 4 we obtain a subgroup a≤ a − G of GLn(Z) of order p! a!; so p = p (a!)p. In view of (3),j thisk says that p = M(n)p. Letting p range |G| |G| over all primes n +1, we have exhibited a collection of subgroups of GLn(Z) such that the least common multiple of their≤ orders is M(n) . FINITE MATRIX GROUPS 5
2.2. Reformulation of Theorem 2. Let GL (C) be as in Theorem2. Our goal is to show that, for G ⊆ n all primes p, the p-part divides M(n) = pa (a!) with a = n as in (3). Now Sylow’s Theorem |G|p p p p−1 tells us that has subgroups of order p, the so-called Sylow p-subgroupsj k of . Replacing by one of its Sylow p-subgroups,G the issue becomes|G| to show that divides paa! . Therefore,G in order to proveTheoremG 2, and thereby complete the proof of Theorem 1, it suffices|G| to establish the following proposition.
PROPOSITION 5. Let be finite subgroup of GLn(C) whose order is a p-power for some prime p and G a n such that trace(g) É holds for all g . Then divides p a! with a = . ∈ ∈ G |G| p−1 j k 3. Tools for the proof The proof of Proposition 5 will depend on three ingredients: a lemma to narrow down the possible trace values, some basic facts on characters of group representations, and an observation concerning the familiar Vandermonde matrix. We will discuss each of these topics in turn.
3.1. Traces. This section uses a small amount of algebraic number theory. The book [23] by Janusz is a good background reference. Besides the usual matrix traces, we will use a notion of trace that is associated with field extensions. Specifically, let K/F be a finite Galois extension with Galois group Γ = Gal(K/F ). Then the trace Tr : K F is defined by Tr (α) = γ(α) for α K. If xm + cxm−1 + . . . is the min- K/F → K/F γ∈Γ ∈ imal polynomial of α over F then P Γ (4) Tr (α)= | | c . K/F − m · This follows from the fact that the minimal polynomial of α is equal to m (x α ), where α m are the i=1 − i { i}1 distinct Galois conjugates γ(α) with γ Γ. We will only be concerned with the special case where F = É
2πi/pr ∈ 2πi/pr Q
É É and K = É(e ) with p prime. The Galois group of (e )/ is isomorphic to the group of units
r ∗ r r r−1
Z Z Z (Z/p ) of the ring /p ; its order is ϕ(p )= p (p 1). −
LEMMA 6. Let g GL (C) be a matrix of finite order. Then trace(g) n and trace(g) = n holds ∈ n | | ≤ only for g = 1 . If the order of g is a power of p and trace(g) É then trace(g) must be one of the n×n ∈ integers n,n p,n 2p,...,n ap , where a = n . { − − − } p−1 q j k PROOF. By hypothesis, g = 1n×n for some positive integer q. Let ε1,...,εn denote the eigenvalues
2πi/q C of g; they are all powers of ζ = e . Hence, trace(g) = ε belongs to the subring Z[ζ] . By the i i ⊆ triangle inequality, trace(g) εi = n and is equality if and only if all εi are the same, that is, g is | |≤ i | | ≤ P a scalar matrix. In particular, trace(g)= n holds only for g =1n×n. r P Now assume that q = p and trace(g) É. Then trace(g) is actually an integer; see [23, Section I.2]. ∈
Let p = (ζ 1) denote the ideal of Z[ζ] that is generated by the element ζ 1. So ζ 1 mod p, and − − ≡ hence all εi 1 mod p and trace(g) n mod p. Therefore, trace(g) n p Z = (p); see [23, Theorem I.10.1]≡ for the last equality. Since≡ we have already shown that trace(− g∈) ∩n, we conclude that trace(g)= n pt for some non-negative integer t. It remains to show that t n ≤or, equivalently, − ≤ p−1 n trace(g) . ≥−p 1
−
É É É É To this end, consider the Galois extension (ζ)/ and its trace TrÉ(ζ)/ . The minimal polynomial over of s−1 s−1 a root of unity of order ps > 1 is given by xp (p−1) + xp (p−2) + . . . +1 ([23, Theorem I.10.1] again). 6 ROBERT M. GURALNICK AND MARTIN LORENZ
Therefore, equation (4) yields
r ϕ(p ) if εi =1 ,
r−1 É TrÉ (εi)= p if ε has order p , (ζ)/ − i 0 otherwise.
Put n0 = # i εi = 1 and n1 = # i εi has order p ; so 0 ni n. Using the fact that trace(g) É we obtain { | } { | } ≤ ≤ ∈
r r r−1
É É É ϕ(p ) trace(g) = TrÉ (trace(g)) = Tr (ε )= ϕ(p )n p n . (ζ)/ (ζ)/ i 0 − 1 i X Hence, trace(g)= n n1 n , as desired. 0 − p−1 ≥− p−1 3.2. Characters. A complex representation of a group is a homomorphism ρ: GL(V ) for some
G G →
C C -vector space V . If n = dimC V then we may identify GL(V ) with GLn( ); the integer n is called the degree of the representation ρ. The character χ = χρ of ρ is the complex-valued function on that is given by χ(g) = trace(ρ(g)) for g . G ∈ G Fact 1 The sum χ(g) is always an integer that is divisible by . g∈G |G|
P 1 C
To see this, consider the linear operator e EndC (V ) = M ( ) that is defined by e = ρ(g). ρ ∈ ∼ n ρ |G| g∈G Note that ρ(g)eρ = eρ holds for all g , because multiplication with ρ(g) simply permutes the summands ∈ G 2 P of eρ. Hence, eρ is an idempotent operator: eρ = eρ. Therefore, the trace of eρ is equal to the rank of eρ: 1 1 trace(eρ) = dimC eρ(V ). On the other hand, trace(eρ) = |G| g∈G trace(ρ(g)) = |G| g∈G χ(g). This proves Fact 1. We remark that Fact 1 is a special case of the so-called orthogonality relations of characters. P P Fact 2 The product of any two characters of is again a character of . In particular, all powers χs (s 0) of a character χ are alsoG characters of . G ≥ G Here, the 0th power χ0 is the constant function with value 1; it is the character of the so-called trivial repre-
∗ C sentation C = GL ( ) sending every g to 1. In order to show that the product of two characters, G → 1 ∈ G χρ and χρ′ , is itself a character, one needs to construct a complex representation of whose character is ′ G ′ χρ χρ′ . This is achieved by the so-called tensor product ρ ρ of the representations ρ and ρ , a complex representation· of degree equal to deg ρ deg ρ′ for whose detailed⊗ construction the reader is referred to Isaacs [20, Chapter 4] or any other text on group· representation theory. More generally, tensor products of repre- sentations can be defined for Hopf algebras; they form an important aspect of the current investigation of quantum groups.
3.3. Vandermonde matrix. Given a collection z0,...,za of elements in some commutative ring R
(later we will take R = Z), form the familiar Vandermonde matrix 2 a 1 z0 z0 ... z0 2 a 1 z1 z1 ... z1 V = . . . . . ...... 1 z z2 ... za a a a We will exhibit a matrix E over R so that the matrix product V Eis diagonal: ·
(5) V E = diag z0 zs, z1 zs,..., za zs . · − − − 0≤s≤a 0≤s≤a 0≤s≤a sY6=0 sY6=1 sY6=a FINITE MATRIX GROUPS 7
th To this end, let es = es(x1,...,xa) denote the s elementary symmetric function in the commuting variables x1,...,xa . These functions can be defined by a a (6) (x x )= xs( 1)a−se , − i − a−s i=1 s=0 Y X where x is an additional commuting variable. Explicitly, es = I i∈I xi, where I runs over all subsets I 1,...,a with I = s. Specializing x to zt′ and (x1,...,xa) to (z0,..., zt,...,za), where zt signals ⊆{ } | | P Q that zt has been deleted from the list, and defining E = ( 1)a−se (z ,..., z ,...,z ) b b − a−s 0 t a s,t=0,...,a equation (6) becomes the desired equation (5). b 4. Schur’s proof of Theorems 1 and 2
It remains to prove Proposition 5. So fix a prime p and let be finite subgroup of GL (C) whose order G n
is a power of p. We assume that trace(g) É holds for all g . Since the order of each g divides , Lemma|G| 6 implies that the traces trace(g) can∈ only take the values∈ G |G| z = n pt with 0 t a = n . t − ≤ ≤ p−1 j k Put mt = # g trace(g) = zt ; so m0 = 1 by Lemma 6. Proposition 5 is the case t = 0 of the following { ∈ G | }
CLAIM. For all 0 t a, the order divides the product m pa s t. ≤ ≤ |G| t − 0≤s≤a sY6=t
To prove this, note that the inclusion GLn(C) is a complex representation of with character χ(g) = trace(g). Therefore, it follows fromG Facts ⊆ 1 and 2 above that, for each non-negativeG integer s, the sum s a s g∈G trace(g) is an integer that is divisible by . In other words, t=0 mtzt 0 mod or, in matrix form, |G| ≡ |G| P P (7) (m ,...,m ) V (0,..., 0) mod , 0 a · ≡ |G| where V = (zs) is the Vandermonde matrix, as in 3.3. Multiplying both sides of equation (7) with t t,s=0,...,a § the matrix E constructed in 3.3, we deduce from equation (5) that § m z z 0 mod t t − s ≡ |G| 0≤s≤a sY6=t holds for all 0 t a. Since zt zs = p(s t), this is exactly what the claim states. This completes the proof of Proposition≤ ≤ 5, and hence− Theorems 1− and 2 are proved as well.
REMARK. It has been pointed out to us by Serre that the Claim above can be stated more generally as
follows. Let be a finite subgroup of GL (C). Put X = trace(g) g and, for each ξ X, let
G n { | ∈ G} ∈ C mξ = # g trace(g) = ξ . Note that X Z[ζ] for some root of unity ζ so that the eigenvalues of each g{ ∈are G | powers of ζ; no} a priori hypothesis⊆ on trace values is necessary.∈ The above argument, with the obvious∈ notationalG changes, proves the following: For each ξ X, the product m (ξ η) is divisible by .
∈ ξ η∈X\{ξ} − |G| Z The matrices V and E will now be matricesQ over Z[ζ] and divisibility is to be understood in [ζ]. However, X
s É is stable under the Galois group Gal(É(ζ)/ ), because each Galois automorphism sends ζ to some power ζ , and hence trace(g) is sent to trace(gs) X. Thus, if ξ is rational then so is the product (ξ η), ∈ η∈X\{ξ} − and hence this product is actually an integer and divides m (ξ η) in Z. This applies in |G| ξ η∈X\{ξ} − Q Q 8 ROBERT M. GURALNICK AND MARTIN LORENZ
particular to ξ = n X, with m =1. Therefore, always divides (n η) in Z. Now, if is ∈ ξ |G| η∈X\{n} − |G| a power of p and trace(g) É holds for all g then Lemma 6 implies that (n η) is a divisor ∈ ∈ G Q η∈X\{n} − of paa!; so divides paa!, as desired. ± |G| Q 5. Minkowski’s reduction method Minkowski’s original proof of Theorem 1 is quite different from Schur’s. The essential tool are reduc- tion homomorphisms to the general linear group over certain finite fields. The reduction method applies
to algebraic number fields K, that is, finite extensions of É, and very quickly yields rough bounds for the
orders of all finite subgroups GLn(K); see Proposition 11 below. In fact, subgroups GLn(C) satisfying only trace(g) KGfor ⊆ all g can also be treated by this strategy due to the factG ⊆ that linear groups over finite fields can∈ be realized∈ over G the subfield generated by the traces; see Lemma 8. A sharp bound for the 2′-part of can be easily deduced in this way from the well-known order of the general linear group over a finite field|G| together with some elementary number theoretic observations; see Proposition 15. The 2-part of requires additional information concerning certain classical groups associated to hermitian or skew-hermitian|G| forms. This will be explained in 5.5 and 5.6 below. §§ As usual, the field with q elements will be denoted by Fq. We will also occasionally write the p-part of vp(m) ′ an integer m as mp = p , and mp′ will denote the p -part of m; so mp′ = m/mp.
5.1. The general linear group over finite fields. It is well-known and easy to see that GLn(Fq) has order n−1(qn qi); cf., e.g., Rotman [38, Theorem 8.5]. Thus, if q = pf then i=0 − Q n ′ i (8) GL (F ) = (q 1) . | n q |p − i=1 Y LEMMA 7. Let ℓ be an odd prime. There are infinitely many primes p such that
(1+vℓ(f))⌊n/τ⌋ GL (F f ) = ℓ ( n/τ !) | n p |ℓ ⌊ ⌋ ℓ ℓ−1 holds for all positive integers n and f, where τ = (ℓ−1,f) .
s ∗ s
Z Z Z PROOF. We use the fact that, for odd primes ℓ, the group of units (Z/ℓ ) of the ring /ℓ is cyclic
s s−1 2 2 ∗ Z of order ϕ(ℓ ) = ℓ (ℓ 1). Any integer whose residue class modulo ℓ generates (Z/ℓ ) will also generate the units modulo− all powers ℓs; see [19, proof of Theorem 2 on p. 43]. Moreover, by Dirichlet’s theorem on primes in arithmetic progression (e.g., [40, p. 61]), the residue class modulo ℓ2 of any generator
2 ∗ s Z of (Z/ℓ ) contains infinitely many primes p. Let p be one of these primes. Then p has order ϕ(ℓ ) in
s ∗ i s s i Z (Z/ℓ ) ; so p 1 mod ℓ if and only if i is divisible by ϕ(ℓ ). In other words, ℓ divides p 1 if and only if ℓ 1 divides≡ i and, in this case, − − i i (p 1)ℓ = ℓ . − ℓ−1 ℓ f i i Now put q = p . Then ℓ divides q 1 if and only if τ divides i and, in this case, (q 1)ℓ = ℓfℓ (i/τ)ℓ. − − n i For 1 i n, this applies to i = τ, 2τ,...,ατ, where α = n/τ . Thus, GLn(Fpf ) ℓ = i=1(q 1)ℓ = α≤ ≤ ⌊ ⌋ | | − (ℓf ) (α!) , which proves the lemma. ℓ ℓ Q (1+vp(f))⌊n/τ⌋ We remark that, for f = 1, the expression ℓ ( n/τ !)ℓ in Lemma 7 is identical with the ℓ-part of the Minkowski bound M(n); see equation (3). Thus, for⌊ an odd⌋ prime ℓ,
(9) GL (F ) = M(n) | n p |ℓ ℓ holds for infinitely many primes p. Lemma 7 fails for the prime ℓ = 2, because the linear group is too big. 2 For example, for all odd primes p, GL (F ) = (p 1) (p 1) is divisible by 16 while M(2) =8. | 2 p |2 − 2 − 2 2 FINITE MATRIX GROUPS 9
f LEMMA 8. Let be a finite subgroup of GL (F ), where q = p . Assume that p does not divide and G n q |G| that p>n. If all g satisfy trace(g) F for some subfield F F then is conjugate to a subgroup of ∈ G ∈ ⊆ q G GLn(F ).
alg
F k PROOF. Let k = F denote an algebraic closure of F with q , and let σ denote the canonical
⊆ σ σ
Z k topological generator of Gal(k/F ) = . Then σ acts on GL ( ) by (g ) = g . By our ∼ n i,j n×n i,j n×n
σ k hypothesis on traces, the map GL (k), g g ,isa -representations of having the same character G → n 7→ G as the inclusion ֒ GLn(k). Since bothb representations are semisimple, by Maschke’s theorem, they are isomorphic. (TheG proof→ of [5, 12, Proposition 3] works in characteristic p>n.) Thus, there exists a matrix −1§ σ σ −1 u GLn(k) such that ugu = g holds for all g . By Lang’s theorem [28], we can write u = v v ∈ −1 ∈ G for some v GL (k). Thus, each v gv is fixed by σ, and hence it belongs to GL (F ). By the Noether- ∈ n n Deuring Theorem (e.g., Curtis-Reiner [12, p. 139]), we may replace v by a matrix in GLn(Fq), proving the lemma.
REMARKS. (a) Lang’s theorem is a much more general result than what is actually needed for the proof of Lemma 8; see, e.g., Borel [2, Corollary 16.4]. Indeed, we only invoke the theorem for the algebraic group GLn and, in this case, it is a special case of Speiser’sversion of Hilbert’s Theorem 90: the Galois cohomology 1 set H (F, GLn) is trivial for every field F ; cf. Serre [42, Proposition X.3] or Knus et. al. [26, Remark 29.3]. 1 alg For a finite field F , triviality of H (F, GLn) amounts to the desired fact that every u GLn(F ) can be written as u = vσv−1, where σ is the Frobenius generator of Gal(F alg/F ); see[26, Exercise∈ 2 on p. 442].
1 alg alg alg ∗
F F F (b)It followsfrom (a)that H (Fq, PGLn) is trivial as well: every U PGLn( q )=GLn( q )/( q )
σ −1 alg ∈ 1 F can be written as U = V V for some V PGLn(Fq ). Moreover, triviality of H ( q, PGLn) is equiv- alent to Wedderburn’s commutativity theorem∈ for finite division rings; see [42, Proposition X.8] or [26, p. 396]. For an alternative proof of a version of Lemma 8 based on Wedderburn’s commutativity theorem, see Isaacs [20, Theorem 9.14]. Incidentally, Wedderburn’s article [45] appeared in 1905, as did Schur’s, and Speiser’s generalization of Hilbert’s Theorem 90 appeared in 1919 [44, Satz 1]. None of this was available to Minkowski when [33] was written. 5.2. The reduction map. Throughout this section, K will denote an algebraic number field and will G be a finite subgroup of GLn(K). Furthermore, = K will denote the ring of algebraic integers in K. Put L = g n Kn; this is a O-stableO finitely generated -submodule of Kn. If is a g∈G · O ⊂ G O O principal ideal domain (or, put differently, K has class number 1) then the theory of modules over PIDs tells us that L is isomorphicP to n; see, e.g., Jacobson [21, Section 3.8]. Therefore: O If = is a PID then is conjugate in GL (K) to a subgroup of GL ( ).
O OK G n n O
É Z For K = É, for example, this says that every finite subgroup of GLn( ) can be conjugated into GLn( ).
This explains why it was enough to look at integer matrices rather than matrices over É in Section 2. In general, is a Dedekind domain and the foregoing applies “locally”: for every prime ideal p of , O O the localization p is a PID; see Jacobson [22, Section 10.2]. Consequently, as above, we may conclude that O is conjugate in GLn(K) to a subgroup of GLn( p), and hence we may assume that GLn( p) after Greplacing by a conjugate. In fact, except for finitelyO many primes of , the group isG actually ⊆ containedO G O G in GLn( p) at the outset: if a is a common denominator for all matrix entries of all elements of the originalO then GL ( [1∈/a O]); so any prime p not containing a will do. Now let p = 0 and put G G ⊆ n O 6
(p) = p Z. Then /p is a finite field of characteristic p. The number of elements of /p is often called the absolute∩ or countingO norm of p; it will be denoted by (p). Thus, O N f /p = F and (p)= p ,
O ∼ N (p) N É where f = f(p/É) is the relative degree of p over . Reduction of all matrix entries modulo the maximal ideal p p of p gives a homomorphism O O
(10) GL ( p) GL (F ) , n O → n N (p) 10 ROBERT M. GURALNICK AND MARTIN LORENZ because p/p p = /p. The following lemma is well-known. Only the first assertion will be needed later; O O ∼ O the second is included for its own sake. Recall that, since p is a local PID, its non-zeroideals are exactly the O e powers of the maximal ideal p p. The ramification index of p over É is the power e such that p p = p p. O O O LEMMA 9. The kernel of the reduction homomorphism (10) has at most p-torsion. In fact, any torsion i element g in the kernel satisfies gp =1 for some pi ep/(p 1). n×n ≤ − m PROOF. For each g GL ( p), define d(g) = sup m g 1 M (p p) ; so d(g) = ∈ n O { | − n×n ∈ n O } ∞ if and only if g = 1n×n and d(g) > 0 if and only if g belongs to the kernel of (10). Now assume that d 0 < d = d(g) < and write g = 1n×n + π h, where π is a generator of the ideal p p and h ∞ r d ℓ r d(i−1) i O ∈ Mn( p) Mn(p p). Then g = 1n×n + π (rh + s) with s = i=2 i π h Mn(p p). If (r, p)=1O \ then rhO + s / M (p ) and so gr = 1 . This shows that the kernel of∈ (10) hasO at most n p n×n p-torsion. ∈ O 6 P p p We claim that any g GLn( p) with d = d(g) > 0 satisfies d(g ) min e+d, pd ,and d(g )= e+d ∈ O ≥ { d } if pd > e + d. Indeed, we may assume that d < . Writing g = 1n×n + π h be as above, we obtain p dp p p−1 p di i ∞ p g =1n×n + π h + t with t = i=1 i π h . Since p divides all binomial coefficients i occurring in e+d e+d+1 t, we have t Mn(p p) Mn(p p). The claim follows from this. We conclude in particular that p ∈ O \ P O g =1n×n if > (p 1)d>e. 6 ∞ − pi Now assume that g GLn( p) is a torsion-element with 0 < d(g) < . Then g = 1n×n for some positive integer i. If∈i is chosenO minimal then our observations in the previous∞ paragraph imply that i−1 e (p 1)d(gp ) (p 1)pi−1d(g). Hence, pi ep/(p 1) which proves our second assertion. ≥ − ≥ − ≤ − The above proof also shows that if m(p 1) >e then there is no non-trivial torsion in the kernel of the m − m homomorphism GL ( p) GL ( /p ) that is defined by reduction of all matrix entries modulo p p. n O → n O O
EXAMPLE 10. Let K = É. Then p = (p) and e = 1. Thus, in Lemma 9, we must have i = 0 when
p is an odd prime, and i 1 when p = 2. In other words, the kernel of the reduction map GL (Z ) ≤ n (p) →
GLn(Fp) is torsion-free for odd p. For p = 2, the only non-trivial torsion possible is order 2. The kernel of
Z Z GL (Z ) GL ( /4 ) is torsion-free. n (2) → n ′ ′ ′ The first assertion of Lemma 9 implies that the p -part p of the order of divides GLn(FN (p)) p . In view of equation (8), this yields the following proposition.|G| G | |
PROPOSITION 11. Let be a finite subgroup of GLn(K), where K is an algebraic number field. Then, n G ′ i for each non-zero prime p of lying over p Z, divides ( (p) 1) . OK ∈ |G|p i=1 N − Applying Proposition 11 with any two choices of p lying overQ different rational primes yields a bound for the order of . Moreover, Proposition 11 comes close to establishing the Minkowski bound M(n) for the field of rationalG numbers:
EXAMPLE 12. For a finite subgroup GL (É) and a given prime ℓ, Proposition 11 implies that the G ⊆ n
ℓ-part of the order of divides GL (F ) , where p is any prime other than ℓ. Furthermore, if ℓ = 2 |G|ℓ G | n p |ℓ 6 then GL (F ) = M(n) for infinitely many primes p, by (9). Thus, we have shown (again) that if is a | n p |ℓ ℓ G finite subgroup of GLn(É) then ℓ divides M(n)ℓ for all primes ℓ =2. In order to extend this to the prime ℓ =2, Minkowski uses additional|G| facts about quadratic forms. This will6 be explained below. 5.3. The Schur bound. Fix an algebraic number field K. We will describe certain constants S(n,K), introduced by Schur in [39], for the purpose of extending Theorem 2 to general algebraic number fields.
Thus, S(n, É) will be identical to M(n). Like M(n), the constant S(n,K) will be defined as a product of ℓ-factors for all prime numbers ℓ, and almost all ℓ-factors will be 1. Throughout, we put
2πi/m
ζ = e C . m ∈
FINITE MATRIX GROUPS 11
É É For a given prime ℓ, the chain K É(ζ ) K (ζ m ) K (ζ m+1 ) . . . of subfields of ∩ ℓ ⊆···⊆ ∩ ℓ ⊆ ∩ ℓ ⊆
K must stabilize, since K is finite over É. Thus we may define É (11) m(K,ℓ) = min m 1 K É(ζ m )= K (ζ m+1 )= . . . . { ≥ | ∩ ℓ ∩ ℓ }
Now put É (12) t(K,ℓ) = [É(ζ m(K,ℓ) ) : K (ζ m(K,ℓ) )] . ℓ ∩ ℓ
and define
j k n n n n n− m(K,ℓ) t(K,ℓ) + ℓt(K,ℓ) + 2 +... (13) S(n,K)=2 ⌊ t(K,2) ⌋ ℓ ⌊ ⌋ ⌊ ⌋ ℓ t(K,ℓ) ℓ n Y n n− t(K,2) m(K,ℓ) t(K,ℓ) n =2 ℓ t(K,ℓ) ! . ⌊ ⌋ ⌊ ⌋ ℓ Yℓ j k Here, ℓ runs over all rational primes, including 2, and the second equality follows from equation (2). Since
t(K,ℓ)[K : É] ℓ 1, only finitely many ℓ will satisfy t(K,ℓ) n and so almost all ℓ-factors are trivial.
≥ − ≤
É É É EXAMPLE 13. Let K = É(ζ ) for some positive integer k. Since (ζ ) (ζ ) = (ζ ), we have k k ∩ t (k,t) m(K,ℓ) = max 1, v (k) . If ℓ does not divide k then t(K,ℓ)= ℓ 1; otherwise t(K,ℓ)=1. For K = É
{ ℓ } − É in particular, we obtain m(É,ℓ)=1 and t( ,ℓ)= ℓ 1 for all ℓ. Thus, equation (13) reduces to (1) and so −
S(n, É)= M(n). In [39], Schur proved the following generalization of Theorem 2 using a larger dose of character theory than what was needed in Section 4.
THEOREM 14 (Schur 1905). Let be a finite subgroup of GLn(C) such that the traces of all elements of belong to some fixed algebraic numberG field K. Then divides S(n,K). G |G| An alternative description of the constants S(n,K) is as follows. Let µℓ∞ denote the group of all
∞ É ℓ-power complex roots of unity. Then K É(µ )= K (ζ m(K,ℓ) ).
∩ ℓ ∩ ℓ
É É É If ℓ is odd then each K É(ζℓm )/ is a subextension of (ζℓm )/ which is cyclic with Galois group
• ∗ ∩m−1
Z Z Z Z Z É É isomorphic to (Z/ℓ ) = /ℓ /(ℓ 1) . Also, K (ζℓ) is the fixed subfield of K (ζℓm )
m−1∼ × − ∩ ∩
Z É É É É É É É É under the group Z/ℓ . Thus, [K (ζℓm ) : ] = [K (ζℓ) : ][K (ζℓm ) : ]ℓ and [K (ζℓ) : ] is a divisor of ℓ 1. Hence, for odd primes∩ ℓ, ∩ ∩ ∩ −
∞ É (14) m(K,ℓ)=1+ v ([K É(µ ) : ]) ℓ ∩ ℓ
ℓ−1 É
t(K,ℓ) = [É(ζℓ) : K (ζℓ)] = ∞ . É ∩ (ℓ−1,[K∩É(µℓ ): ])
m ∗ m−2
É Z Z Z Z Z Z For the prime ℓ = 2, the extension É(ζ m )/ has Galois group ( /2 ) = /2 /2
• 2 ∼ ×
Z É (m 2). The factor Z/2 is generated by complex conjugation. When m > 2, the field (ζ2m ) has ≥ −1
−
É É exactly three subfields that are not contained in É(ζ2m 1 ): besides (ζ2m ), there are (ζ2m + ζ2m ) and
−1 ∞ É É(ζ m ζ m ). If t(K, 2)=1, which certainly holds when m(K, 2)=1 or m(K, 2)=2, then K (µ )= 2 − 2 ∩ 2
∞ m(K,2)−1 ∞
É É É É(ζ2m(K,2) ), andso [K (µ2 ) : ]=2 . If t(K, 2) =1 then K (µ2 ) must be equal to either
−1 ∩ −1 6 ∩ ∞ m(K,2)−2
É É É É(ζ m(K,2) + ζ ) or (ζ m(K,2) ζ ). Thus, t(K, 2)= 2 and [K (µ ) : ]=2 . 2 2m(K,2) 2 − 2m(K,2) ∩ 2 In either case, the 2-factor of S(n,K) in (13) simplifies to
n
∞ t(K,2) n É (15) S(n,K) = [K É(µ ) : ] 2 (n!) 2 ∩ 2 ⌊ ⌋ 2 The following properties of S(n,K) are easy to verify: (16) S(m,K)S(n,K) divides S(m + n,K) and (17) S(n,K) divides S(n, F ) if K F . ⊆ 12 ROBERT M. GURALNICK AND MARTIN LORENZ
5.4. Odd primes. The following proposition establishes Theorem 14 for the 2′-part of . The special |G| case where K = É was done earlier in Example 12.
PROPOSITION 15. Let be a finite subgroup of GLn(C). Assume that the traces of all elements of belong to some algebraic numberG field K. Then divides S(n,K) for all odd primes ℓ. G |G|ℓ
PROOF. Replacing by a conjugate in GLn(C) if necessary, we can make sure that GLn(F ) for some algebraic number fieldG F K. Indeed, any splitting field for that is finite over GK ⊆will serve this purpose; see [20, Theorem 9.9].⊇ Let = denote the ring of algebraicG integers of F and consider any O OF non-zero prime p of such that GL ( p) and ℓ / p. Put (p)= p Z and assume that p is chosen as O G ⊆ n O ∈ ∩ in Lemma 7 and also satisfies p>n. Let ρ: GLn(FN (p)) denote the reduction homomorphism (10) restricted to . Upon replacing by a SylowGℓ-subgroup, → the map ρ becomes injective, by Lemma 9, and G G our goal now is to show that divides S(n,K)ℓ. As in the first paragraph|G| of the proof of Lemma 6, one sees that the traces of all elements of actually
′ ′ ∞ G F belong to the ring of algebraic integers of the field K = K É(µ ). Therefore, trace(ρ(g)) OK ∩ ℓ ∈ q holds for all g , where q = (p ′ ) = pf . Lemma 8 now implies that ρ( ) is conjugate to a ∈ G N ∩ OK G subgroup of GLn(Fq) and Lemma 7 further gives that n (1+vℓ(f)) τ n divides GL (F ) = ℓ ! , |G| | n q |ℓ ⌊ ⌋ τ ℓ ℓ−1 where τ = (ℓ−1,f) . Now, for odd ℓ,
n m(K,ℓ) t(K,ℓ) n S(n,K)ℓ = ℓ t(K,ℓ) ! ⌊ ⌋ ℓ
∞ j ℓ−1 k É
with m(K,ℓ)=1+vℓ([K É(µℓ ) : ]) and t(K,ℓ)= ∞ by (14). Since the residue class É (ℓ−1,[K∩É(µℓ ): ])
s ∗ ∩
Z Z of p generates (Z/ℓ ) for all s, p remains prime in [ζℓs ]; see the proof of Lemma 7 and [19, Theorem
′ ′ ∞
É É 2 on p. 196]. In particular, p remains prime in K , and so f = f(p K /É) = [K (µℓ ) : ]. O ∩ O ∩ Therefore, GL (F ) = S(n,K) and the proposition is proved. | n q |ℓ ℓ 5.5. Unitary, orthogonal and symplectic groups. In this section, we review some standard facts about hermitian and skew-hermitian forms and certain classical groups that are associated with them. Throughout,
θ 2 k k will denote a field and a a will be an automorphism of satisfying θ = Id. We assume for simplicity 7→ that char k =2. 6
5.5.1. Sesquilinear forms. Let V denote an n-dimensional vector space over k. A bi-additive map
β : V V k is called sesquilinear (with respect to θ) if × → β(av,bw)= abθβ(v, w)
holds for all v, w V and a,b k. When θ is the identity, sesquilinear forms are ordinary bilinear forms. A sesquilinear∈ form β is called∈ non-singular if β satisfies the following equivalent conditions: (i) β(v, V ) = 0 for v V implies v = 0; (ii) β(V, v) = 0 for v V implies v = 0; (iii) for any basis v ,...,v { of} V , the∈ matrix (β(v , v )) has non-zero{ determinant;} ∈ see [29, Proposition XIII.7.2]. If β { 1 n} i j n×n is any sesquilinear form on V and g GL(V ) then, defining βg(v, v′) := β(g(v),g(v′)) for v, v′ V , one again obtains a sesquilinear form βg ∈on V with respect to θ; it is called equivalent to β. ∈ Sesquilinear forms β satisfying β(w, v) = β(v, w)θ (resp. β(w, v) = β(v, w)θ) for all v, w V are called hermitian (resp. skew-hermitian). The stabilizer in GL(V ) of a non-singular− hermitian or skew-∈ hermitian form β is called the group of isometries of (V,β) and is denoted by Iso(V,β); so Iso(V,β)= g GL(V ) β(g(v),g(v′)) = β(v, v′) for all v, v′ V . { ∈ | ∈ } Let β be non-singular skew-hermitian. If β(v, v) =0 for some v V then β′ = β(v, v)β is a non-singular hermitian form on V with Iso(V,β′) = Iso(V,β).6 On the other hand,∈ if β(v, v)=0 for all v V then it is easy to see that θ = Id and so β is an alternating bilinear form. Therefore, when studying isometry∈ groups of non-singular hermitian or skew-hermitian forms β on V , it suffices to consider the following cases: FINITE MATRIX GROUPS 13
unitary case: β is hermitian with respect to θ = Id; orthogonal case: β is symmetric bilinear (θ =6 Id); symplectic case: β is alternating bilinear (θ = Id).
5.5.2. Twisting modules. Now assume that V is a finitely generated (left) k[ ]-module, where is a finite group. We let V θ = vθ v V denote a copy of V with operations G G { | ∈ } vθ + wθ = (v + w)θ , (av)θ = aθvθ and gvθ = (gv)θ
θ k for v, w V , a k and g . Then V becomes a [ ]-module and ∈ ∈ ∈ G G θ
θ k (18) traceV /k (g)= traceV/ (g)
holds for all g . Furthermore, there is an isomorphism