A Trace Inequality with a Subtracted Term

CORE Metadata, citation and similar papers at core.ac.uk

Provided by Elsevier - Publisher Connector

A Trace Inequality With a Subtracted Term

H. Miranda and Robert C. Thompson Department of Mathematics University of California Santa Barbara, California 93106

Submitted by Richard A. Brualdi

ABSTRACT

For futed real or complex matrices A and B, the well-known von Neumann trace inequality identifies the maximum of ItdUAVB)], as U and V range over the unitary group, the maximum being a bilinear expression in the singular values of A and B. This paper establishes the analogue of this inequality for real matrices A and B when U and V range over the proper (real) orthogonal group. The maximum is again a bilinear expression in the singular values, but there is a subtracted term when A and B have determinants of opposite sign.

John von Neumann [l] proved a half century ago that if A and B are square matrices with complex elements, then

sup Itr(UAVB)) = a,fil + a,& + *a. +(Y,P,, u, VE U(n)

where (Ye > ‘a. > (Y, are the singular values of A and p1 > *.a > p,, the singular values of B, with the sup taken over all matrices U and V in the n x n unitary group U(n). This theorem has attracted interest in applied linear algebra, including mathematical physics 171, psychology 191, the hypere- lasticity of isotropic materials [18], and elsewhere, including 117, 191. In this paper we consider matrices A and B with real elements, and we locate the value of sup tdUAVB) as the sup is taken over all elements U, V of SO(n), the real proper orthogonal group.

LINEAR ALGEBRA AND ITS APPLICATIONS 185: 165-172 (1993) 165 0 Elsevier Science Publishing Co., Inc., 1993 655 Avenue of the Americas, New York, NY 10010 0024.3795/93/$6,00 166 H. MIRANDA AND ROBERT C. THOMPSON

A list [2-111 of articles simplifying the original von Neumann proof, or expanding the scope of the result, appears at the end of this paper. The earliest of these is Fan’s paper [8]. There is a detailed analysis of the case of equality in [7]. N ew results related to the theorem are probably worthwhile, and ours are a natural counterpart to the original theorem and seem not to be in the literature, at least not in [2-111. Our results add to the slowly growing class of spectral inequalities having subtracted terms. We augment the last sentence by explaining that spectral inequalities with subtracted terms often occur in the study of singular values. See [13-151 for some examples. Many of these seemingly curious inequalities are best under- stood in terms of the properties of the root systems associated with the classical simple Lie groups and algebras. Our proof technique in this paper is elementary, using no Lie theory, instead using a maximization technique often employed to establish spectral inequalities. An application of Theorem 1 is in [20].

THEOREM 1. Taking the singular values q, /3, of the real matrices A and B in weakly decreasing order,

sup tr(PAQB) = ol& -t oJ2 + *.a +cY,_~&-~ P,QESO~)

+(signdet AB)a, &.

In particular, when A and B have determinants of opposite sign,

sup tr(PAQB) = or& + (~a& + *** +o,_1@,_r - a,&,. P,QESOh)

Proof. Since SO(n) X SO(n) 1s. compact and trace is a continuous function, the sup in Theorem 1 is attained. We show that it has at most the value claimed in the theorem. Let P, and Q. be elements of SO(n) at which the sup is attained. We are going to perturb the matrix P, AQo B by a rotation and deduce certain information. Let Rij(8) be a rotation matrix, that is, an identity matrix apart from elements cos 8, sin 8, -sin 8, cos 8 in positions (i, i>, (i,j>,(j, i> and (j,j>, respectively. Then tr[ R,(B)P, AQO B 1 achieves a maximum at 13= 0, so that its derivative with respect to 8 vanishes at 13= 0. A simple computation shows that the (i, j) and (j, i) elements of P, AQ, B are the same. Application of this fact for all i and j shows that PO AQ, B is symmetric. Since tl(P, AQO B) = tr(Q, BP, A), a similar computation shows that Q0 BP, A is symmetric. TRACE INEQUALITY 167

Let S = PO A and T = Q,,B. Then ST and TS are real symmetric matrices, with S having as its singular values those of A, and T those of R. By the singular value decomposition for real matrices, matrices 0, and 0, in SO(n) exist such that

O,SO, = diag( si, . . , sn).

We may assume that the diagonal elements si in O,SO, are nonnegative, except perhaps for the last, and are arranged in order of weakly decreasing absolute values. Thus si = cri, . , s,_~ = (Y,_ 1, s, = (sign det A)cr,. Note that

tr(F’,,AQ,B) = tr(ST) = tr[(O,P,)(AO,)(O~‘Q,)(RO~‘)].

Renaming O,P, as P,,, AO, as A, O,‘Q,, as Q,,, BO,’ as B, O,SO, as S, and O,‘TO;’ as T, we now have S = P,A = diagia,, . . . , a,_ 1, (sign det Ajo,,), T = Q. B, with ST and Z’S symmetric. Let T = [tij]. We assert that the trace of ST is the trace of a product

diag(ai,...,a,_l (sign det A) on) diag( + Ppojr . , A Ppcnj), with p a permutation of 1,. . , n, and with the product of the + signs in the right factor giving the sign of det B. The symmetry of ST and TS implies that si tij = sjtji and tijsj = tjisi. Hence (sf - sf)tij = 0. If sf # sJ? then tij = 0. If S2 has distinct diagonal elements, then T must be diagonal. Because the diagonal elements of T are _+ the singular values of B and det ST = det AB, our assertion is immediate, even if A or B is singular. Since an inequality is being proved, we could avoid the case in which S2 has nondistinct singular values case by appealing to the distinct singular value case and continuity. We prefer to give a direct analysis. Let S2 have nondistinct diagonal elements. Then T splits as a direct sum of blocks: T = diagf?;, T,, . . . , Tk_ 1, Tk), say, corresponding to S = diag(u,Zi, OZZ,,..., uk_,Zk_,, ai,D,), with ui > u2 > 1.. > uk_i > a, > 0. Here each Ii is an identity matrix, but D, departs from an identity in that the last diagonal entry is - 1 exactly when det A is negative. A simultaneous block diagonal similarity of S and T, with proper orthogonal diagonal blocks, permits us to take T,,...,Tk_l to be diagonal, and also Tk when D, is an identity matrix and Us is nonzero. If Us = 0, we may replace Tk by PkTkQk where Pk and Qk are proper orthogonal matrices diagonalizing T,, and leave 168 H. MIRANDA AND ROBERT C. THOMPSON the products ST and TS unchanged. We only have to show how to replace Tk by a diagonal matrix when det A is negative and D, has - 1 as its last diagonal element. The matrix o, DkTk is symmetric, and its trace, as the sum of its eigenvalues, is a sum of terms each of which is the singular value gk of S times + a singular value of DkTk, that is, times + a singular value of Tk. The product of the + signs is the sign of the product of the eigenvalues of DkTk and therefore is the sign of det DkTk Because the last diagonal element of D, is - 1, the product of the + signs is the sign of - det Tk = det DkTk. Hence the trace of CT~DkTk is the trace of a product (c~ D, times a diagonal matrix of signed singular values of Tk), in which the signs on the singular values of Tk are those on the singular values of DkTk, except for the sign on one singular value, which for Tk is opposite to that of DkTk. From these facts, our assertion follows without any need to effect a diagonalization of Tk. Thus

tr(P,AQ,B) = tr(ST) = 2 (+ai)(+Pp(,,): i=l with only (Y, among the oi perhaps carrying a negative sign. If det AB is negative, the positions of the negative entries on the q and on the pPCij cannot completely be the same, so that at least one term cq /!IPCijcarries a negative sign. A simple rearrangement argument shows that when det ( AR) is nonnegative the sum cannot exceed

and when det AB is negative,

n-l c %Pi - %P,. i=l

Returning to the original matrices A and B, before the notational changes, we have proved that the expressions just displayed are upper bounds for trf PAQB). M oreover, these expressions are achievable values for tr( PAQB) as P and Q range over SO(n). Indeed, we may take A = diag(a,, . , a,_ 1, (sign det A)cw,), B = diag( /Ii, . , P, _ 1, (sign det B)&), and then take P=Q=I. n TRACE INEQUALITY 169

It is easy to see that inf”, y E uCnjltr(UAVR>j = 0. Because of the absence of absolute values, the inf parallel to the sup in Theorem 1 is generally nonzero, and its value is left to the reader. The generalization of Theorem I to more than two matrices is the content of Theorem 2. Its proof will give an alternative demonstration of Theorem 1.

THEOREM2. Let A,, . , A, be matrices with real entries. Take the singular values of Aj to be s,(Aj) > -a. 2 s,

sup tr( p, A, ... P,A,) P, E SO(?l), , Pm E SO(n)

n-l ln

= C nsi(Aj) + [signdet(Al*** A,)IjQSn(Aj). i=l j=l

Proof. We shall use induction on m. No use is made of Theorem 1. The following argument includes the m = 1 case starting the induction. Without loss of generality, we may suppose that A,, . , A,,, are diagonal, with the diagonal elements of each Aj appearing in order of decreasing absolute values, and only the last possibly negative. As before, the sup is attained, so suppose that matrices P,, Pz, , P, in SO(n) achieve it. The matrices A,, . . , A,, may have multiple or zero singular values. Suppose, as an initial case, that each diagonal matrix Ai has simple nonzero singular values. Set M = P,A, *.. P”,_ 1A,_, P,,,. Let Rij( 13) be a rotation matrix as before. Then tr[ R,,(B)MA,] has a maximum at 8 = 0, and so does tr[MRij(B)A,] = tr[Rij(O>A, M]. Therefore MA, is symmetric, and so is A,M. Let A, = diag(a,, . . , a,), where the ai are distinct in absolute value. Then Mija; = Mjiq and criMij = a; Mji. Therefore ( ui’ - uj2>Mij =

0, whence M is diagonal. Moreover, the diagonal elements of M are in order of weakly decreasing absolute values, and only the last is possibly negative. For if not, by simple rearrangement inequalities, tr(R-lMRA,) would be increased by a suitable choice of the generalized permutation matrix R in SO(n). When m = 1, by proper orthogonality the matrix M = P, must now be the identity, and the value of the sup is clear. Let m > 1. Let N = P,,A,P,A, *-- P,_l. Then trMA, = trNA,_i, and by the same argument N is a diagonal matrix, with diagonal elements in order of weakly decreasing absolute values and only the last possible negative. 170 H. MIRANDA AND ROBERT C. THOMPSON

Now

P,,,A,MP,-’ = NA,_,(= P,A,P,A, .a* P,,,_,A,_,).

Both A,,, M and NA,_ I are diagonal matrices with nonzero diagonal elements which are in order of strictly decreasing absolute values, since this is true for A, and weakly so for M, and for A,,_ 1 and weakly for N. Thus the similar diagonal matrices A, M and NA, _ r have their necessarily simple eigenvalues appearing on the diagonal in the same order. Consequently the matrix Pm effecting the similarity must be a diagonal matrix, and therefore commutes with the diagonal matrix A,_ r. Hence

Pl Al **a P,,-lA,_,PmA, = P,A, .*.(P,_,P,)( A,_,&,)

We are now in a position to apply induction. As P,, , . , P,,, _ 2, (Pm _ 1Pm) range over SO(n),

suptr[P,A, -0. P,,_2 A,-e(P,-1P,)(A,-IA,)I

n-1

= iFl ‘i( Al) **aSi(Am-,)Si(Atn-lAm)

+{signdet[ A, 0.. A rn-2( 474L)lIsn( Ad.‘.

n-l

= c si( A,) ... ‘i( An-l)si( Am) i=l

+ [sign det( A, ~0. A,-,A,)]s,(A,) *** 4A,-,)s,(A,J.

Therefore, for m > 1, the sup has its claimed value when the A, have simple nonzero singular values. Now suppose the Ai do not all have simple nonzero singular values. Choose the Pi so that the sup is attained, and then perturb the Ai to have simple nonzero singular values. The upper bound on the trace is then valid for the chosen Pi and the perturbed A,. By continuity it continues to be an upper bound as the perturbations approach zero, whence it is an upper bound for the original matrices A,. It is clear that the upper bound on the trace is achieved for suitable matrices Pi in SO(n). n TRACE INEQUALITY 171

The case of equality in the von Neumann theorem seems to be analysed only in [7]. Th e f u 11 result is somewhat intricate, but becomes a bit simpler if the von Neumann theorem is stated in another way. A later paper examining cases of equality in the von Neumann result and our proper orthogonal version of it will be prepared if sufficiently significant results are found.

The preparation of this paper was supported in part by a National Science Foundation grant to the second author. The referee is thanked for helpful comments.

REFERENCES

Item 1 is the original paper, items 2-6 include simpler proofs of the von Neumann inequality for two matrices, and 7-11 include the generalization to an arbitrary number of matrices. Item 7 discusses the case of equality. None of 1-11 investigate the case in which the sup is taken over the proper orthogonal group. Items 12-16 relate to spectral inequalities having subtracted terms, and 17-19 contain applications of the theorem. An application of Theorem 1 is in [20].

1 J. von Neumann, Some matrix inequalities and metrization of matrix space, Tomsk. Univ. Rev. 1:286-300 (1937); Collected Works, Pergamon, New York, Vol. 4, 1962, pp. 205-219. 2 P. M. Alberti and A. Uhlmann, Stochasticity and Partial Order, Reidel, Dordrecht, 1982. 3 M. L. Eaton, Lectures on Topics in Probability Inequalities, Centrum voor Wiskunde en Informatica, Amsterdam, 1987. 4 R. Horn and C. R. Johnson, Matrix Analysis, Cambridge U.P., Cambridge, 1985. 5 L. Mirsky, A trace inequality of John von Neumann, Mona&. Math. 79:303-306 (1975). 6 R. Schatten, Norm Ideals of Completely Continuous Operators, Springer-Verlag, Berlin, 1960. 7 H. W. Cape1 and P. A. J. Tindemans, An inequality for the trace of matrix products, Rep. Math. Phys. 6:225-235 (1974). 8 Ky Fan, Maximum properties and inequalities for the eigenvalues of completely continuous operators, Proc. Nat. Acad. Sci. U.S.A. 37:760-766 (1951). 9 W. Kristoff, A theorem on the trace of certain matrix products and some applications, J. Math. Psych. 7:515-530 (1970). 10 M. Marcus and B. N. Moyls, On the maximum principle of Ky Fan, Canad. J. Math. 9:313-320 (1957). 11 I. Olkin and A. Marshall, Inequalities: Theory of Majorization and Its Applica- tions, Academic, New York, 1979. 12 A. Horn, Doubly stochastic matrices and the diagonal of a rotation matrix, Amer. 1. Math. 76:620-630 (1954). 13 R. C. Thompson, Singular values, diagonal elements, and convexity, SIAM J. Appl. Math. 32:39-63 (1977). 172 H. MIRANDA AND ROBERT C. THOMPSON

14 R. C. Thompson, The intersection of the convex hulls of proper and improper real matrices with prescribed singular values. Linear and Multilinear Algebra 3:155-160 (1975). 15 R. C. Thompson, Singular values and diagonal elements of complex symmetric matrices, Linear Algebra A$. 26:65-106 (1979). 16 B. Tromberg and S. Waldenstrom, Bounds on the diagonal elements of unitary matrices, Linear Algebra Appl. 20:189-195 (1978). 17 R. W. Brockett, Dynamical systems that sort lists, diagonalize matrices, and solve linear programming problems, Linear Algebra A&. 146:79-91 (1991). 18. H. Le Dret, Sur les functions de matrices convexes et isotropes, C. R. Acud. Ski. Paris (Sir. 1. Math.) 310:617-620 (1990). 19. G. W. Stewart and Ji-guang Sun, Matrix Perturbation Theory, Academic, Boston, 1990. 20. Hector F. Miranda and Robert C. Thompson, Group majorization, the convex hulls of sets of matrices, and the diagonal element/singular value inequalities, in preparation.

Received 12 August 1991; final manuscript accepted 14 April 1992