Quick viewing(Text Mode)

A Square Matrix Is Congruent to Its Transpose

A Square Matrix Is Congruent to Its Transpose

Journal of Algebra 257 (2002) 97–105 www.academicpress.com

A square is congruent to its

Dragomir Ž. Ðokovic´ a,∗,1 and Khakim D. Ikramov b,2

a Department of Pure , University of Waterloo, Waterloo, ON, Canada N2L 3G1 b Faculty of Computational Mathematics and Cybernetics, Moscow State University, Moscow 119899, Russia Received 12 September 2001 Communicated by Efim Zelmanov

Abstract  For any matrix X let X denote its transpose. We show that if A is an n by n matrix over    afieldK,thenA and A are congruent over K, i.e., P AP = A for some P ∈ GLn(K).  2002 Elsevier Science (USA). All rights reserved.

Keywords: Congruence of matrices; Bilinear forms; Sesquilinear forms; Kronecker modules

1. Introduction

For any matrix X let X denote its transpose. Let us start by recalling the following known fact (see, e.g., [2, Theorem 4, p. 205] or [3, Theorem 11]): if A is a complex n by n matrix, then A and A are congruent, i.e., P AP = A for some invertible complex matrix P . Our main objective is to prove that the same assertion is valid for matrices over an arbitrary field K. In spite of its elementary character, the proof of this result is quite involved.

* Corresponding author. E-mail addresses: [email protected] (D.Ž. Ðokovic),´ [email protected] (K.D. Ikramov). 1 The author was supported in part by the NSERC Grant A-5285. 2 The author was supported in part by the NSERC Grant OGP 000811.

0021-8693/02/$ – see front matter  2002 Elsevier Science (USA). All rights reserved. PII:S0021-8693(02)00126-6 98 D.Ž. Ðokovi´c, K.D. Ikramov / Journal of Algebra 257 (2002) 97–105

Theorem 1.1. If A is an n by n matrix over a field K,thenP AP = A for some P ∈ GLn(K).

The following result is an immediate consequence.

−1 Corollary 1.2. For A ∈ GLn(K), the matrices A and A are congruent over K.

   −1 Proof. Indeed, if P ∈ GLn(K) is such that P AP = A ,thenQ AQ = A for Q = PA−1. ✷

The proof of Theorem 1.1 is based on the paper of Riehm [3] and an addendum to it by Gabriel [1]. This paper solves the equivalence problem for bilinear forms on finite-dimensional vector spaces over a field. For technical reasons we prefer to use the subsequent paper of Riehm and Shrader-Frechette [4], which gives a solution of the equivalence problem for sesquilinear forms on finitely generated modules over semisimple (Artinian) rings. We need only apply this general theory to bilinear forms over K. Let us reformulate the above theorem in the language of bilinear forms. If f : V × V → K is a on a finite-dimensional K- V , we shall say that (V, f ) is a bilinear space. The definition of equivalence of two bilinear forms is the usual one.

Definition 1.3. Two bilinear forms f : V × V → K and g : W × W → K are equivalent if there exists a vector space isomorphism ϕ : V → W such that g(ϕ(x), ϕ(y)) = f(x,y), ∀x,y ∈ V . In that case, assuming that V and W are finite-dimensional, we also say that the bilinear spaces (V, f ) and (W, g) are isometric and that ϕ is an isometry.

Let us define the transpose of a bilinear form.

Definition 1.4. The transpose of a bilinear form f : V × V → K is the bilinear form g : V ×V → K such that g(x,y)= f(y,x) for all x,y ∈ V . We shall denote the transpose of f by f .

Let f and g be as in Definition 1.3 and assume that dim(V ) = dim(W) = n<∞.Wefixabasis{v1,v2,...,vn} of V . Then the n by n matrix A = (aij ) where aij = f(vi ,vj ) is the matrix of f with respect to this . The matrix   of f , with respect to the same basis, is A . Similarly, let B = (bij ) be the matrix of g with respect to a fixed basis {w1,w2,...,wn} of W. To say that f and g are  equivalent is the same as to say that P AP = B for some P ∈ GLn(K).

Theorem 1.5. Let V be a finite-dimensional K-vector space, f : V × V → K a bilinear form on V , and f  its transposed form. Then f and f  are equivalent. D.Ž. Ðokovi´c, K.D. Ikramov / Journal of Algebra 257 (2002) 97–105 99

The orthogonality of subspaces of a bilinear space is defined as follows.

Definition 1.6. Let (V, f ) be a bilinear space. We say that the subspaces U and W of V are orthogonal to each other if f(U,W)= 0andf(W,U)= 0.

Answering the question of a referee, we point out that Theorem 1.1 does not generalize to “hermitian conjugacy,” i.e., if σ ∈ Aut(K) with σ 2 = 1andwe set A∗ = (A)σ ,thenA and A∗ in general are not ∗-congruent. (Two n by n matrices A and B over K are said to be ∗-congruent if B = P ∗AP for some P ∈ GLn(K).) A simple counter example is provided by (K, σ) = (the complex field, the complex conjugation) and the 1 by 1 matrix A =[i],wherei is the .

2. Kronecker modules

In this section we recall some facts about the Kronecker modules which are special cases of the general Kronecker modules discussed in [4]. The reader should consult this reference and [1] for more details. We define a Kronecker as a four-tuple (X,u,v,Y) where X and Y are finite-dimensional K-vector spaces and u, v : X → Y are linear maps. To a bilinear space (Z, h) we assign the Kronecker module K(Z,h) = K(Z) = ∗ ∗ ∗ (Z, hl,hr ,Z ),whereZ is the of Z and hl,hr : Z → Z are defined by

hl(x)(y) = h(x,y), hr (x)(y) = h(y,x), ∀x,y ∈ Z.

Every Kronecker module is a direct sum of indecomposable ones which are unique up to ordering and isomorphism. There are five types of indecomposable Kronecker modules (X,u,v,Y):

I. Both u and v are isomorphisms and u−1v is indecomposable (i.e., it has only one elementary divisor). II. The spaces X and Y have the same dimension and the pencil λu + µv, with respect to suitable bases of X and Y , has the matrix   λ    µλ   .   ..   µ  .  .  .. λ µλ 100 D.Ž. Ðokovi´c, K.D. Ikramov / Journal of Algebra 257 (2002) 97–105

II∗. Similar to II with the matrix   µλ    µλ   .   ..   µ  .  .  .. λ µ III. In this case dim(Y ) = dim(X) + 1 and the pencil λu + µv has the matrix   λ    µλ   .   ..   µ  .  .  .. λ µ III∗. In this case dim(X) = dim(Y ) + 1andthematrixis   µλ  .   ..   µ   .  . .. λ µλ

The following theorem is an immediate consequence of the theory of Kronecker modules (also known as the theory of matrix pencils).

Theorem 2.1. If (V, f ) is a bilinear space, then the Kronecker modules K(V,f) and K(V,f) are isomorphic.

In terms of matrices, this can be restated as follows.

Theorem 2.2. If A is an n by n matrix over a field K, then the matrix pencils λA + µA and λA + µA are equivalent.

If (Z, h) is a bilinear space and Z = Z1 + Z2 is a direct decomposition of Z such that h(Z1,Z2) = 0andh(Z2,Z1) = 0, then we say that this space is the orthogonal direct sum of the bilinear spaces (Z1,h1) and (Z2,h2),where h1 and h2 are the corresponding restrictions of h. The following theorem is a very special case of the general result stated in [4, Section 9].

Theorem 2.3. Every bilinear space (Z, h) can be decomposed into an orthogonal direct sum

Z = ZI + ZII + ZIII D.Ž. Ðokovi´c, K.D. Ikramov / Journal of Algebra 257 (2002) 97–105 101 such that all indecomposable direct summands of K(ZI) are of type I, those ∗ ∗ of K(ZII) are of type II or II , and those of K(ZIII) are of type III or III .If hI is the restriction of h to ZI × ZI, etc., the bilinear spaces (ZI,hI), (ZII,hII), (ZIII,hIII) are uniquely determined by (Z, h) up to isometry. Moreover, if Z = ZII or Z = ZIII,thenK(Z) determines (Z, h) up to isometry.

3. Reduction to the nondegenerate case

We now begin the proof of Theorem 1.5. We warn the reader that the notation and the definitions of the invariants of bilinear spaces in [3] and [4] do not agree. The second paper is more general and we shall exclusively use the definitions given there. By Theorem 2.3, there is an orthogonal direct decomposition V = VI + VII + VIII where the summands VI, VII,andVIII have the properties stated there. ×   Let fI be the restriction of f to VI VI,etc.,andfI the restriction of f to ×  VI VI, etc. Clearly fI is the transpose of fI, etc. We claim that the bilinear   spaces (VII,fII) and (VII,fII) are isometric, and so are (VIII,fIII) and (VIII,fIII). The Kronecker module K(VII,fII) is a direct sum of indecomposable summands ∗ =∼  of type II or II . By Theorem 2.2, K(VII,fII) K(VII,fII) and the last assertion  of Theorem 2.3 implies that (VII,fII) and (VII,fII) are isometric. The same  argument shows that also (VIII,fIII) and (VIII,fIII) are isometric, and so our claim is true.  It remains to show that the bilinear spaces (VI,fI) and (VI,fI ) are isometric. As fI is nondegenerate, the proof of our theorem has been reduced to the nondegenerate case, i.e., the case where A is a nonsingular matrix.

4. Reduction to the primary case

We assume in this section that f is nondegenerate. We shall use a number of results of [4] without explicit reference and the reader should consult this paper for the claims made but not proved here. We recall from [4] that the asymmetry of f is the invertible linear operator α : V → V such that f(x,y) = f(α(y),x), ∀x,y ∈ V . Its matrix, with respect to our fixed basis of V ,is(A)−1A. The asymmetry of f  is α = α−1 and its matrix is A−1A. As any matrix is similar to its transpose, the asymmetries α and α are similar operators. Let p ∈ K[X] be a monic irreducible polynomial and assume that p = X. For such p we define the monic irreducible polynomial p∗ ∈ K[X] by p∗(X) = 102 D.Ž. Ðokovi´c, K.D. Ikramov / Journal of Algebra 257 (2002) 97–105 p(0)−1Xd p(X−1),whered is the degree of p. Let us decompose V into primary components with respect to α V = Vp, p where the sum is over the monic irreducible polynomials p ∈ K[X], p = X.The ∗ ∗ subspaces Vp and Vq are orthogonal if q = p .Ifp = p then the assertion of our theorem is true for the restriction of f to Vp + Vp∗ by [4, Theorem 16, Corollary]. It remains to deal with the case p∗ = p. Hence the proof of Theorem 1.4 has been reduced to the primary case: The minimal polynomial of α is a power of p,wherep = p∗ is an irreducible polynomial in K[X].

5. Reduction to the homogeneous primary case

In this section we consider the primary case as just described above. By [4, Proposition 25], there exists an orthogonal direct decomposition V = Vs s1 s such that Vs ⊆ ker(p(α) ) and the induced map s s−1 s+1 Vs/p(α)(Vs ) → ker p(α) ker p(α) + p(α)ker p(α) is an isomorphism for each s. Hence, without any loss of generality, we may assume that V = Vs for some s. In other words, we may assume that α has only one elementary divisor with arbitrary multiplicity. We refer to this case as the homogeneous primary case.

6. The homogeneous primary case

The minimal polynomial of α is a power of p,sayps ,wherep is as in the previous section, and all elementary divisors of α are equal to ps .Letr be the number of these elementary divisors. In order to prove that f and f  are equivalent, it suffices to check that they have the same invariants attached to them by [4, Theorems 27 and 31]. That is exactly what we are going to show. Assume first that p = X − 1. Set π = 1 − α−1 and π = 1 − (α)−1 = 1 − α. Define V = V/π(V) and note that π(V ) = π(V).Ifs is even, then the bilinear invariant attached to f (see [4, p. 517]) is a nondegenerate skew-symmetric form on the r-dimensional K-vector space V.(Henceifs is even then r must be even.) Therefore this invariant is unique up to equivalence. D.Ž. Ðokovi´c, K.D. Ikramov / Journal of Algebra 257 (2002) 97–105 103

We now assume that s is odd, in which case the bilinear invariant is the nondegenerate symmetric form f˜ on V defined by f(˜ x,˜ y)˜ = f πs−1(x), y , ∀x,y ∈ V, where x˜ denotes the canonical image of x in V. The analogous invariant f˜ of the bilinear form f  is defined similarly (using π instead of π). Since πs−1(V ) is the eigenspace of α for the eigenvalue 1 and π =−απ, we obtain that    −  − − f˜ (x,˜ y)˜ = f (π )s 1(x), y = f (απ)s 1x,y = f y,(απ)s 1x − − − = f α(απ)s 1x,y = f αs πs 1x,y = f πs 1x,y = f(˜ x,˜ y)˜ for all x,y ∈ V . Hence f˜ = f˜. If the of K is 2 then there are additional invariants: The quadratic forms Fi , i  0. It is immediate from the definition of these forms (see [4, Section 8]) that these invariants are the same for f and f . In the case p = X + 1 (we may assume that the characteristic of K is not 2) the proof is similar. It remains to consider the case where p has degree d>1. As p = p∗, it follows that d is even (see [3]) and p(0)2 = 1. We set π = α−d p(α), π = αd p(α−1) and V = V/π(V)= V/π(V ). The algebra K[α], which is isomorphic to the quotient K[X]/(ps ), has an J such that αJ = α−1. The corresponding involution, also denoted by J ,ofK[X]/(ps ) sends the element ζ = X + (ps ) to its inverse. The algebra K[X]/(p) is a finite field extension of K. It also has an involu- tion, J , which sends the element ξ = X + (p) to its inverse. The space V is nat- urally a module over the algebra K[X]/(ps) with ζ acting as α. Similarly, V is naturally a module over the algebra K[ξ]=K[X]/(p) in two different ways: First we let ξ act as α˜ (the linear transformation induced by α), and second we let ξ act as α˜  = (α)˜ −1. We shall distinguish these two actions by writing ξ ◦˜x =˜α(x)˜ for the former and ξ ∗˜x = (α)˜ −1(x)˜ for the latter. Recall that a J -sesquilinear form h on a K[ξ]-vector space M is a K- h : M × M → K[ξ] such that h(ax, by) = aJ bh(x,y) for all a,b ∈ K[ξ] and x,y ∈ M.Letµ ∈ K[ξ] satisfy µµJ = 1. A J -sesquilinear form h on M is called µ-hermitian if h(y,x) = µh(x, y)J for all vectors x,y ∈ M.AJ -sesquilinear form h on M is called hermitian if it is µ-Hermitian for µ = 1. From now on we set µ = p(0)s−1ξ (s−1)d+1. As in [4, p. 512], let ν be 1 if the characteristic of K is 0, and otherwise let ν be the greatest power of the ν ν−1 characteristic such that p is a polynomial in X1 = X .Then{1,ξ,...,ξ } is ν abasisofK[ξ] as a vector space over its subfiled K[ξ1],whereξ1 = ξ .Define the K[ξ1]-linear τ1 : K[ξ]→K[ξ1] by ν−1 i τ1 aiξ = a0,ai ∈ K[ξ1]. i=0 104 D.Ž. Ðokovi´c, K.D. Ikramov / Journal of Algebra 257 (2002) 97–105

We now define the K- τ : K[ξ]→K by τ = Tr ◦ τ1 where Tr is the trace map K[ξ1]→K of the separable field extension K[ξ1] of K. Apart from the asymmetry α, the bilinear form f has only one invariant (see [4]): The unique nondegenerate µ-Hermitian form f˜ on the K[ξ]-vector space (V, ◦) such that τf(˜ x,˜ y)˜ = f πs−1(x), y , ∀x,y ∈ V.

Similarly, the analogous invariant of the bilinear form f  is the unique nondegen- erate µ-Hermitian form f˜ on the K[ξ]-vector space (V, ∗) such that τf˜(x,˜ y)˜ = f  (π)s−1(x), y , ∀x,y ∈ V.

In order to complete the proof of the theorem, it suffices to show that the µ-Hermitian forms f˜ and f˜ are equivalent, i.e., that there exists an isomorphism ϕ : (V, ◦) → (V, ∗) of K[ξ]-vector spaces such that f˜(ϕ(x),ϕ(˜ y))˜ = f(˜ x,˜ y)˜ for all x,y ∈ V . Recall that p(0)2 = 1. It is easy to check that τ(aJ ) = τ(a) for all a ∈ K[ξ]. Since π = p(0)αd π and f(x,y) = f(α(y),x)for arbitrary x,y ∈ V , we obtain that    −  − τf˜ (x,˜ y)˜ = f (π )s 1(x), y = f p(0)αd π s 1(x), y s− − + − − = f y, p(0)αd π 1(x) = f p(0)s 1α1 d(s 1)πs 1(x), y − + − − − − = τf˜ p(0)s 1ξ 1 d(s 1) ◦˜x,y˜ = τ p(0)s 1ξ d(1 s) 1f(˜ x,˜ y)˜ − = τ p(0)2(s 1)f(˜ y,˜ x)˜ J = τf(˜ y,˜ x).˜

As both (x,˜ y)˜ → f˜(x,˜ y)˜ and (x,˜ y)˜ → f(˜ y,˜ x)˜ are µ-Hermitian forms on the K[ξ]-vector space (V, ∗), the above equality and [4, Theorem 22] imply that  f˜ (x,˜ y)˜ = f(˜ y,˜ x),˜ ∀x,y ∈ X. (6.1)

One should keep in mind that this identity is possible only because f˜ and f˜ are µ-Hermitian forms for two different K[ξ]-vector space structures on the K-vector space V. The identity map from (V, ◦) to (V, ∗) is a J -linear isomorphism (not K[ξ]-linear). We remark that every basis of (V, ◦) is also a basis of (V, ∗).By[5, Theorem 6.3, p. 259], we can choose vectors x1,...,xr ∈ V such that {˜x1,...,x˜r } is an orthogonal basis of (V, ◦) with respect to the form f˜. By (6.1), this basis is also an orthogonal basis of (V, ∗) with respect to the form f˜.Moreover, (6.1) entails that the µ-Hermitian forms f˜ and f˜ have the same matrix with respect to the above basis. Hence these two forms are equivalent and the proof of Theorem 1.4 is completed. D.Ž. Ðokovi´c, K.D. Ikramov / Journal of Algebra 257 (2002) 97–105 105

7. An application to real orthogonal groups

Let O(p, q), p + q = n, be the of GLn(R) consisting of all  matrices A such that A Jp,qA = Jp,q,whereJp,q = diag(1,...,1, −1,...,−1) with the first p (respectively last q) diagonal entries equal +1 (respectively −1). Consider the action of O(p, q) on the space Kn of all n by n skew-symmetric  matrices given by X → AXA , X ∈ Kn, A ∈ O(p, q). Then the following result is valid.

Proposition 7.1. For any X ∈ Kn, the matrices X and −X belong to the same orbit of O(p, q).

Proof. Apply Theorem 1.1 to the matrix Jp,q + X whose transpose is Jp,q − X. ✷

References

[1] P. Gabriel, Appendix: Degenerate bilinear forms, J. Algebra 31 (1974) 67–72. [2] A.I. Mal’tsev, Foundations of , Freeman, San Francisco, 1963. [3] C. Riehm, The equivalence of bilinear forms, J. Algebra 31 (1974) 45–66. [4] C. Riehm, M.A. Shrader-Frechette, The equivalence of sesquilinear forms, J. Algebra 42 (1976) 495–530. [5] W. Scharlau, Quadratic and Hermitian Forms, Springer, New York, 1985.