The hyperbolic Schur decomposition

Šego, Vedran

2012

MIMS EPrint: 2012.119

Manchester Institute for Mathematical Sciences School of Mathematics

The University of Manchester

Reports available from: http://eprints.maths.manchester.ac.uk/ And by contacting: The MIMS Secretary School of Mathematics The University of Manchester Manchester, M13 9PL, UK

ISSN 1749-9097

Vedran Šego a,b

a Faculty of Science, University of Zagreb, Croatia b School of Mathematics, The University of Manchester, Manchester, M13 9PL, UK

Abstract

We propose a hyperbolic counterpart of the Schur decomposition, with the emphasis on the preservation of structures related to some given hyperbolic scalar product. We give results regarding the existence of such a decomposition and research the properties of its block triangular factor for various structured matrices.

Keywords: indefinite scalar products, hyperbolic scalar products, Schur decomposition, Jordan decomposition, quasitriangularization, quasidiagonalization, structured matrices

2000 MSC: 15A63, 46C20, 65F25

1. Introduction

The Schur decomposition $A = UTU^*$, sometimes also called Schur's unitary triangularization, is a unitary similarity between any given square matrix $A \in \mathbb{C}^{n \times n}$ and some upper triangular matrix $T \in \mathbb{C}^{n \times n}$. Such a decomposition has a structured form for various structured matrices, e.g., $T$ is diagonal if and only if $A$ is normal, real diagonal if and only if $A$ is Hermitian, positive (nonnegative) real diagonal if and only if $A$ is positive (semi)definite, and so on.

Furthermore, the Schur decomposition can be computed in a numerically stable way, making it a good choice for calculating the eigenvalues of $A$ (which are the diagonal elements of $T$) as well as various matrix functions (for more details, see [11]). Its structure-preserving property allows us to save time and memory when working with structured matrices. For example, computing the value of some function of a normal matrix is reduced to working with a diagonal matrix, which involves only evaluation of the diagonal elements.

Unitary matrices are very useful when working with the traditional Euclidean scalar product $\langle x, y \rangle = y^*x$, as their columns form an orthonormal basis of $\mathbb{C}^n$. However, many applications require a nonstandard scalar product, which is usually defined by $[x,y]_J = y^*Jx$, where $J$ is some nonsingular matrix, and many of these applications consider Hermitian or skew-Hermitian $J$. The hyperbolic scalar product defined by a signature matrix $J = \operatorname{diag}(j_1,\dots,j_n)$, $j_k \in \{-1, 1\}$, arises frequently in applications. It is used, for example, in the theory of relativity and in the research of polarized light. More on the applications of such products can be found in [10, 13, 14, 17].

Email address: [email protected] (Vedran Šego)

The Euclidean matrix decompositions have some nice structure-preserving properties even in nonstandard scalar products, as shown by Mackey, Mackey and Tisseur [16], but it is often worth looking into versions of such decompositions that respect the structures related to the given scalar product. There is plenty of research on the subject, e.g., hyperbolic SVD [17, 24], $J_1J_2$-SVD [9], two-sided hyperbolic SVD [20], hyperbolic CS decomposition [8, 10] and indefinite QR factorization [19].

There are many advantages of using decompositions related to some specific, nonstandard scalar product, as such decompositions preserve structures related to the given scalar product. They can simplify calculations and provide a better insight into the structures of such structured matrices.

In this paper we investigate the existence of a decomposition which would resemble the traditional Schur decomposition, but with respect to the given hyperbolic scalar product. In other words, our similarity matrix should be unitary-like (orthonormal, to be more precise) with respect to that scalar product.

As we shall see, a hyperbolic Schur decomposition can be constructed, but not for all square matrices. Furthermore, we will have to relax conditions on both $U$ and $T$. The matrix $U$ will be hyperexchange (a column permutation of a matrix unitary with respect to $J$). The matrix $T$ will have to be block upper triangular with diagonal blocks of order 1 and 2. Both of these changes are quite usual in hyperbolic scalar products. For example, they appear in the traditional QR vs. the hyperbolic QR factorizations [19].
Some work on the hyperbolic Schur decomposition was done by Ammar, Mehl and Mehrmann [1, Theorem 8], but with a somewhat different focus. They assume a partitioned $J = I_p \oplus (-I_q)$, denoted in that paper as $\Sigma_{p,q}$, for which they observe a Schur-like similarity through unitary factors (without permuting $J$), producing more complex triangular factors. Also, their decomposition is applicable only to the set of $J$-unitary matrices, denoted in that paper as the Lie group $O_{p,q}$.

In the symplectic scalar product spaces, Schur-like decompositions were researched by Lin, Mehrmann and Xu [15], by Ammar, Mehl and Mehrmann [1], and by Xu [22, 23].

In section 2, we provide a brief overview of the definitions, properties and other results relating to the hyperbolic scalar products that will be used later. In section 3, the definition and the construction of the hyperbolic Schur decomposition are presented. We also provide sufficient requirements for its existence and examples showing why such a decomposition does not exist for all matrices. In section 4 we observe various properties of the proposed decomposition. We finalize the results by providing the necessary and the sufficient conditions for the existence of the hyperbolic Schur decomposition of $J$-Hermitian matrices in section 5.

The notation used is fairly standard. Capital letters refer to matrices and their blocks, elements are denoted by the appropriate lowercase letter with two subscript indices, while lowercase letters with a single subscript index represent vectors (including matrix columns). By $J = \operatorname{diag}(\pm 1)$ we denote a diagonal signature matrix defining the hyperbolic scalar product, while $P$ and $P_k$ (for some indices $k$) denote permutation matrices. We use

$$S_n := [\delta_{i,n+1-j}] = \begin{bmatrix} & & 1 \\ & \cdot^{\cdot^{\cdot}} & \\ 1 & & \end{bmatrix} = S_n^{-1}$$

for the standard involutory permutation (see [6, Example 2.1.1.]), $\mathcal{J}$ for a Jordan matrix and $J_k(\lambda)$ for a single Jordan block of order $k$ associated with the eigenvalue $\lambda$. The vector $e_k$ denotes the $k$-th column of the identity matrix and $\otimes$ denotes the Kronecker product. The symbol $\oplus$ is used to describe a diagonal concatenation of matrices, i.e., $A \oplus B$ is a block diagonal matrix with the diagonal blocks $A$ and $B$.

Another standard notation, although somewhat incorrect in terms of indefinite scalar products, is $|v| := \sqrt{|[v,v]|}$. This is used as the norm of the vector $v$ induced by the scalar product $[\cdot,\cdot]$. One should keep in mind that it does not have the usual properties of a norm (definiteness and the triangle inequality do not hold), but it is used nevertheless due to its relation with the scalar product.

2. The hyperbolic scalar products

As mentioned in the introduction, an indefinite scalar product is defined by a nonsingular Hermitian indefinite matrix $J \in \mathbb{C}^{n \times n}$ as $[x,y]_J = y^*Jx$. When $J$ is known from the context, we simply write $[x,y]$ instead of $[x,y]_J$. When $J$ is a signature matrix, i.e., $J = \operatorname{diag}(\pm 1) := \operatorname{diag}(j_{11}, j_{22}, \dots, j_{nn})$, where $j_{kk} \in \{-1, 1\}$ for all $k$, the scalar product is referred to as hyperbolic and takes the form

$$[x,y]_J = y^*Jx = \sum_{i=1}^{n} j_{ii}\, x_i\, \overline{y_i}.$$

Throughout this paper we assume that all considered scalar products are hyperbolic, unless stated otherwise.

Indefinite scalar products have another important property which, unfortunately, causes a major problem with the construction of the decomposition. A vector $v \neq 0$ is said to be $J$-degenerate if $[v,v] = 0$; otherwise, we say that it is $J$-nondegenerate. Degenerate vectors are sometimes also called $J$-neutral. If $[v,v] < 0$ for some vector $v$, we say that $v$ is $J$-negative, while we call it $J$-positive if $[v,v] > 0$. When $J$ is known from the context, we simply say that the vector is degenerate, nondegenerate, neutral, negative or positive.

We extend this notion to matrices as well: a matrix $A$ is $J$-degenerate if $\operatorname{rank} A^*JA < \operatorname{rank} A$. Otherwise, we say that $A$ is $J$-nondegenerate. Again, if $J$ is known from the context, we simply say that $A$ is degenerate or nondegenerate.

We say that the vector $v$ is $J$-normalized, or just normalized when $J$ is known from the context, if $|[v,v]| = 1$. As in the Euclidean scalar product, if a vector $v$ is given, then the vector

$$v' = \frac{1}{|v|}\, v = \frac{1}{\sqrt{|[v,v]|}}\, v \qquad (1)$$

is a normalization of $v$. Note that degenerate vectors cannot be normalized. Also, for a given vector $x \in \mathbb{C}^n$, $\operatorname{sign}[\xi x, \xi x]$ is constant for all $\xi \in \mathbb{C} \setminus \{0\}$. This means that the normalization (1) does not change the sign of the scalar product, i.e., $\operatorname{sign}[v,v] = \operatorname{sign}[v',v'] = [v',v']$.

As in the Euclidean scalar products, we define the $J$-conjugate transpose (or $J$-adjoint) of $A$ with respect to a hyperbolic $J$, denoted as $A^{[*]_J}$, by $[Ax,y]_J = [x, A^{[*]_J}y]_J$ for all vectors $x, y \in \mathbb{C}^n$. It is easy to see that $A^{[*]_J} = JA^*J$. Again, if $J$ is known from the context, we simply write $A^{[*]}$.

The usual structured matrices are defined naturally. A matrix $H$ is called $J$-Hermitian (or $J$-selfadjoint) if $H^{[*]} = H$, i.e., if $JH$ is Hermitian. A matrix $U$ is said to be $J$-unitary if $U^{[*]} = U^{-1}$, i.e., if $U^*JU = J$.

Like their traditional counterparts, $J$-unitary matrices are orthonormal with respect to $[\cdot,\cdot]_J$. However, unlike in the Euclidean scalar product, in hyperbolic scalar products we have a wider class of matrices orthonormal with respect to $J$ (which are not necessarily unitary with respect to the same scalar product), called $J$-hyperexchange matrices. We say that $U$ is $J$-hyperexchange if $U^*JU = P^*JP$ for some permutation $P$.

Although the term "hyperexchange" is quite common, we often refer to such matrices as $J$-orthonormal, to emphasize that their columns are $J$-orthonormal vectors. In other words, if the columns of $U$ are denoted as $u_i$, then

$$[u_i,u_i] = \pm 1, \qquad [u_i,u_j] = 0, \qquad \text{for all } i \text{ and all } j \neq i.$$

More on the definitions and properties related to hyperbolic (and, more generally, indefinite) scalar products can be found in [6].

Throughout the paper, we often consider the diagonal blocks of a given matrix $A$. In order to keep the relation with the given hyperbolic scalar product induced by some $J = \operatorname{diag}(\pm 1)$, we introduce the term the corresponding part of $J$. Let $J = \operatorname{diag}(j_1,\dots,j_n)$, $j_k \in \{-1,1\}$, define a hyperbolic scalar product and let $A = [A_{ij}] \in \mathbb{C}^{n \times n}$ be a block matrix with elements $a_{pq}$, partitioned in the blocks $A_{ij}$ in such a way that each diagonal block $A_{kk}$ is of order 1 or 2. Observing the block $A_{kk}$ for a given $k$, the corresponding part of $J$, here denoted as $J'$, is defined as $J' = [j_p]$ if $A_{kk} = [a_{pp}]$ is of order 1, and as $J' = \operatorname{diag}(j_p, j_{p+1})$ if

$$A_{kk} = \begin{bmatrix} a_{pp} & a_{p,p+1} \\ a_{p+1,p} & a_{p+1,p+1} \end{bmatrix}$$

is of order 2.
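The definitions of this section translate directly into a few lines of NumPy. The following is only an illustrative sketch: the function names (`ip`, `classify`, `j_adjoint`, `is_j_unitary`, `is_hyperexchange`) and the tolerances are assumptions made here, not notation from the paper, and the signature matrix is passed as the vector `j` of its diagonal entries.

```python
import numpy as np

def ip(x, y, j):
    """Hyperbolic scalar product [x, y]_J = y^* J x, with J = diag(j)."""
    return np.vdot(y, j * x)  # np.vdot conjugates its first argument

def classify(v, j, tol=1e-12):
    """Label v as J-positive, J-negative or J-degenerate."""
    q = ip(v, v, j).real
    if abs(q) <= tol:
        return "degenerate"
    return "positive" if q > 0 else "negative"

def j_adjoint(A, j):
    """J-adjoint A^[*] = J A^* J."""
    J = np.diag(j)
    return J @ A.conj().T @ J

def is_j_unitary(U, j, tol=1e-10):
    """True if U^* J U = J."""
    J = np.diag(j)
    return np.allclose(U.conj().T @ J @ U, J, atol=tol)

def is_hyperexchange(U, j, tol=1e-10):
    """True if U^* J U = P^* J P for some permutation P, i.e. if U^* J U
    is a signature matrix with as many +1 entries as J."""
    G = U.conj().T @ np.diag(j) @ U
    d = np.diag(G).real
    return (np.allclose(G, np.diag(d), atol=tol)
            and np.allclose(np.abs(d), 1.0, atol=tol)
            and np.sum(d > 0) == np.sum(j > 0))

# A hyperbolic rotation is J-unitary for J = diag(1, -1); swapping its
# columns gives a matrix that is hyperexchange but no longer J-unitary.
j = np.array([1.0, -1.0])
t = 0.7
H = np.array([[np.cosh(t), np.sinh(t)],
              [np.sinh(t), np.cosh(t)]])
H_swapped = H[:, ::-1]
print(is_j_unitary(H, j), is_j_unitary(H_swapped, j), is_hyperexchange(H_swapped, j))
print(classify(np.array([1.0, 1.0]), j))
```

The column swap illustrates why the class of hyperexchange matrices is strictly wider than the class of $J$-unitary matrices whenever $J$ is indefinite.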

3. Definition and existence of the hyperbolic Schur decomposition

In this section we present the definition and the main results regarding the hyperbolic Schur decomposition of a matrix $A \in \mathbb{C}^{n \times n}$.

Like other hyperbolic generalizations of the Euclidean decompositions, this one also has a hyperexchange matrix instead of a unitary one, as well as a block structured factor instead of the triangular/diagonal factor of the Schur decomposition. Block upper triangular matrices with diagonal blocks of order 1 and 2 are usually referred to as quasitriangular (see [18, Section 3.3]). Similarly, if $T$ is block diagonal with diagonal blocks of order 1 and 2, we refer to it as quasidiagonal.

Before providing a formal definition, let us first show the obstacles which explain why these changes are necessary.

It is a well known fact that the first column $u_1$ of the similarity matrix $U$ in any similarity triangularization $A = UTU^{-1}$ is an eigenvector of $A$. Also, it is easy to construct a nonsingular matrix with all degenerate columns (see [21, Example 3.1]), which cannot be $J$-normalized. This means that there are some matrices (even diagonalizable ones!) that are not unitarily triangularizable with respect to the scalar product induced by $J$. However, allowing the blocks of $T$ to be of order 2, we relax that condition, as shown in Example 3.1.

For better clarity, we say that a block $X_{jj}$ of a matrix $X$ is irreducible if it cannot be split into smaller blocks without losing the block structure of the matrix. If $X = \bigoplus_{i=1}^{k} X_{ii}$ is block diagonal, we say that $X_{jj}$ is irreducible if it cannot be split as $X_{jj} = Y_1 \oplus Y_2$ for some square blocks $Y_1$ and $Y_2$. For a block triangular $X$, we say that $X_{jj}$ is irreducible if it cannot be split as

$$X_{jj} = \begin{bmatrix} Y_{11} & Y_{12} \\ & Y_{22} \end{bmatrix},$$

where $Y_{11}$ and $Y_{22}$ are square blocks.

Since we mostly deal with blocks of order 2, the irreducible blocks will be those of order 1 and those of order 2 with one of the nondiagonal elements (in case $X$ is block diagonal) or the bottom left element (in case $X$ is block triangular) being nonzero.

The described notion of irreducible blocks is just a descriptive way of saying that we cannot split $X$ into smaller blocks while preserving its block diagonal or block triangular structure. It is used to simplify some of the statements and proofs by reducing the number of observed cases without losing generality. See [21, Example 3.2] for further clarification of the concept of irreducible blocks.

Unfortunately, allowing the triangular factor to have blocks is not enough. We also need permutations of the decomposing matrix and the scalar product generator $J$. The following example shows why this is needed, simultaneously illustrating a general approach to proving that some matrices do not have a hyperbolic Schur decomposition.

Example 3.1. Let $J = \operatorname{diag}(1, 1, -1, -1)$ and let $A = S\mathcal{J}S^{-1}$, where $\mathcal{J} = \operatorname{diag}(1, 2, 3, 4)$ and

$$S = \begin{bmatrix}
\frac{1}{5\sqrt{3}} & \frac{1}{17\sqrt{15}} & \frac{1}{195\sqrt{7}} & \frac{1}{257\sqrt{255}} \\
\frac{2}{5\sqrt{3}} & \frac{4}{17\sqrt{15}} & \frac{8}{195\sqrt{7}} & \frac{16}{257\sqrt{255}} \\
\frac{4}{5\sqrt{3}} & \frac{16}{17\sqrt{15}} & \frac{64}{195\sqrt{7}} & \frac{256}{257\sqrt{255}} \\
\frac{8}{5\sqrt{3}} & \frac{64}{17\sqrt{15}} & \frac{512}{195\sqrt{7}} & \frac{4096}{257\sqrt{255}}
\end{bmatrix}.$$

We shall now show that no $J$-unitary $V$ and quasitriangular $T$ exist such that $A = VTV^{-1}$. Since

$$S^*JS = \begin{bmatrix}
-1 & -0.994393 & -0.970230 & -0.949852 \\
-0.994393 & -1 & -0.993828 & -0.985075 \\
-0.970230 & -0.993828 & -1 & -0.998151 \\
-0.949852 & -0.985075 & -0.998151 & -1
\end{bmatrix} \qquad (2)$$

to 6 significant digits, all eigenvectors of $A$ (i.e., columns of $S$) are normalized negative vectors. Furthermore, if we denote them by $s_i$, then $-1 < [s_i,s_j] < 0$ for all $i \neq j$.

Let us now assume that there exist a $J$-unitary $V$ and a quasitriangular $T$ such that $A = VTV^{-1}$. We distinguish the following possibilities:

1. The first block of $T$ is of order 1. Then $v_1$ is obviously a $J$-normalized eigenvector of $A$. But this is impossible, since $V$ is $J$-unitary, meaning that $V^*JV = J$, so $[v_1,v_1] = 1$, while $[s_i,s_i] < 0$ for all $i$ and, since normalization does not change the sign of a vector's scalar product with itself, $[v_1,v_1] < 0$.

2. The first block of $T$ is an irreducible block of order 2 (i.e., $t_{21} \neq 0$). Then it is easy to see that

Av1 = t11v1 + t21v2, Av2 = t12v1 + t22v2.

In other words,

(A − t11I)v1 = t21v2, (A − t22I)v2 = t12v1.

Multiplying the second equality with t21 and substituting t21v2 with the expression from the first equality, we get:

t12t21v1 =(A − t22I)t21v2 =(A − t22I)(A − t11I)v1.

In other words, v1 is an eigenvector of (A − t22I)(A − t11I). Using the same argument, we see that v2 is an eigenvector of (A − t11I)(A − t22I). Furthermore,

(A − t22I)(A − t11I)v1 = A²v1 − (t11 + t22)Av1 + t11t22v1 = t12t21v1.

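Now, we have:

$$\begin{aligned}
0 &= A^2v_1 - (t_{11} + t_{22})Av_1 + (t_{11}t_{22} - t_{12}t_{21})v_1 \\
  &= \left(A - \frac{t_{11} + t_{22}}{2}\, I\right)^{2} v_1 - \left(\frac{(t_{11} - t_{22})^2}{4} + t_{12}t_{21}\right) v_1.
\end{aligned}$$

So, $v_1$ is an eigenvector of

$$A_2 := \left(A - \frac{t_{11} + t_{22}}{2}\, I\right)^{2}, \qquad (3)$$

i.e.,

$$A_2 v_1 = \lambda' v_1, \qquad \lambda' := \frac{(t_{11} - t_{22})^2}{4} + t_{12}t_{21}. \qquad (4)$$

But, since

$$(A - t_{22}I)(A - t_{11}I) = (A - t_{11}I)(A - t_{22}I),$$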

$v_1$ and $v_2$ are both (linearly independent) eigenvectors of $A_2$, with the same eigenvalue $\lambda'$.

Since $A$ and $A_2$ are diagonalizable, every eigenvector of $A$ is also an eigenvector of $A_2$. Moreover, since the eigenvalues of $A$ are distinct (they are 1, 2, 3 and 4), every eigenvalue of $A_2$ has multiplicity at most 2 (this is easily seen from its Jordan decomposition). In other words, its eigenspaces have dimensions at most 2, so $v_1$ and $v_2$ are linear combinations of $s_i$ and $s_j$ for some $i \neq j$.

Let $v_1 = \alpha_1 s_i + \beta_1 s_j$ and $v_2 = \alpha_2 s_i + \beta_2 s_j$. Obviously, $\alpha_1, \beta_1 \neq 0$ (or $v_1$ would be an eigenvector of $A$, hence covered by case 1). Then, from $J = V^*JV$, we get:

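$$\begin{aligned}
1 = j_{11} = [v_1,v_1] &= |\alpha_1|^2 [s_i,s_i] + |\beta_1|^2 [s_j,s_j] + 2\operatorname{Re}(\alpha_1 \overline{\beta_1}\, [s_i,s_j]) \\
&= -\left(|\alpha_1|^2 + |\beta_1|^2 - 2[s_i,s_j] \operatorname{Re}(\alpha_1 \overline{\beta_1})\right). \qquad (5)
\end{aligned}$$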

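From (2) we see that $-1 < [s_i,s_k] < 0$ for $i \neq k$. Using (5) and the well known fact that $|\operatorname{Re}(z)| \leq |z|$, i.e., $\operatorname{Re}(z) \geq -|z|$, for all $z \in \mathbb{C}$, we see that

$$\begin{aligned}
-1 &= |\alpha_1|^2 + |\beta_1|^2 - 2[s_i,s_j] \operatorname{Re}(\alpha_1 \overline{\beta_1}) \\
   &= |\alpha_1|^2 + |\beta_1|^2 + 2|[s_i,s_j]| \operatorname{Re}(\alpha_1 \overline{\beta_1}) \\
   &\geq |\alpha_1|^2 + |\beta_1|^2 - 2|[s_i,s_j]| \cdot |\alpha_1| \cdot |\beta_1| \\
   &> |\alpha_1|^2 + |\beta_1|^2 - 2|\alpha_1| \cdot |\beta_1| = (|\alpha_1| - |\beta_1|)^2 \geq 0,
\end{aligned}$$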

which is an obvious contradiction. Since both possible cases have led to a contradiction, the described decomposition does not exist for the pair (A, J).
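The Gram matrix (2) of Example 3.1 is easy to reproduce numerically. The sketch below assumes the reading of $S$ in which column $k$ is $(1, 2^k, 4^k, 8^k)^T$ divided by $\sqrt{|[c,c]|}$ for that unscaled column $c$; this scaling reproduces the denominators $5\sqrt{3}$, $17\sqrt{15}$, $195\sqrt{7}$ and $257\sqrt{255}$ quoted in the example. The variable names are chosen here for illustration.

```python
import numpy as np

j = np.array([1.0, 1.0, -1.0, -1.0])   # diagonal of J
J = np.diag(j)

# Unscaled columns (1, 2^k, 4^k, 8^k)^T for k = 1, 2, 3, 4.
cols = [np.array([1.0, 2.0**k, 4.0**k, 8.0**k]) for k in (1, 2, 3, 4)]

# J-normalize each column: divide by sqrt(|[c, c]|).
S = np.column_stack([c / np.sqrt(abs(c @ (j * c))) for c in cols])

G = S.T @ J @ S                         # Gram matrix S^* J S (S is real)
A = S @ np.diag([1.0, 2.0, 3.0, 4.0]) @ np.linalg.inv(S)

print(np.round(G, 6))
```

All diagonal entries of `G` equal $-1$ and all off-diagonal entries lie strictly between $-1$ and $0$, which are exactly the two facts the contradiction in the example relies on.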

Knowing the obstacles, we are now ready to define the hyperbolic Schur decomposition.

Definition 3.2 (The hyperbolic Schur decomposition). For a given $A \in \mathbb{C}^{n \times n}$ and $J = \operatorname{diag}(\pm 1)$, a hyperbolic Schur decomposition of $A$ (with respect to $J$) is any $J$-orthonormal similarity of $A$ to the quasitriangular form, i.e.,

$$A = \widehat{V} T \widehat{V}^{-1}, \qquad \widehat{V}^* J \widehat{V} = P^*JP, \qquad (6)$$

where $T$ is quasitriangular and $P$ is a permutation.

It is easy to show (see [21]) that (6) is equivalent to the following identities:

$$\begin{aligned}
A &= (P_1UP_2)\,T\,(P_1UP_2)^{-1}, & U^*\tilde{J}U &= \tilde{J}, \quad \tilde{J} = P_1^*JP_1, \\
A &= (VP)\,T\,(VP)^{-1}, & V^*JV &= J, \qquad (7) \\
A &= (PW)\,T\,(PW)^{-1}, & W^*J'W &= J', \quad J' = P^*JP, \qquad (8) \\
P^*AP &= WTW^{-1},
\end{aligned}$$

where $U$ is $\tilde{J}$-unitary, $V$ is $J$-unitary, $W$ is $J'$-unitary and $P$, $P_1$ and $P_2$ are permutations such that $P = P_1P_2$.

Throughout the paper we also consider decompositions that resemble the Schur decomposition, but with some of the irreducible blocks in $T$ of order strictly larger than 2. We refer to such decompositions as the hyperbolic Schur-like decompositions.

Naturally, the most interesting question here concerns the existence of such a decomposition. The following theorem shows that all diagonalizable matrices have it.

Theorem 3.3. If $A \in \mathbb{C}^{n \times n}$ is diagonalizable, then it has a hyperbolic Schur decomposition with respect to any given $J = \operatorname{diag}(\pm 1)$.

Proof. Let $A = S\mathcal{J}S^{-1}$, where $\mathcal{J}$ is a diagonal matrix, be a Jordan decomposition of $A$. Since $S$ is nonsingular, and therefore $S^*JS$ is of full rank, by [19, Theorem 5.3], there exist matrices $Q \in \mathbb{C}^{n \times n}$ and $R \in \mathbb{C}^{n \times n}$ and permutations $P_1$ and $P_2$ such that

$$S = P_1QRP_2^*, \qquad Q^*\tilde{J}Q = \tilde{J}, \qquad \tilde{J} = P_1^*JP_1$$

and $R$ is quasitriangular. Note that, since $S$ is nonsingular, $R$ is nonsingular as well and is therefore invertible. So,

$$A = S\mathcal{J}S^{-1} = P_1QRP_2^*\,\mathcal{J}\,P_2R^{-1}Q^{-1}P_1^*. \qquad (9)$$

Since $\mathcal{J}$ is diagonal, $P_2^*\mathcal{J}P_2$ is diagonal as well, which means it is also (block) upper triangular. We already have that $R$ is quasitriangular and $R^{-1}$ has the same block triangular structure as $R$. In other words,

$$T := RP_2^*\,\mathcal{J}\,P_2R^{-1}$$

is quasitriangular. From this and (9), we see that there exists $P := P_1$ such that

$$A = (PQ)T(PQ)^{-1}, \qquad Q^*\tilde{J}Q = \tilde{J}, \qquad \tilde{J} = P^*JP,$$

so (8) holds, which means that $A$ has a hyperbolic Schur decomposition.

Obviously, there are also some nondiagonalizable matrices that have a hyperbolic Schur decomposition. Trivial examples are all Jordan matrices with at least one diagonal block of order greater than 1, since they are by definition both nondiagonalizable and triangular.

In Theorem 3.3, we assume $A$ to be diagonalizable, which is then conveniently used in the proof. Under the assumption of diagonalizability, we can also mimic the traditional proof from the Euclidean case, using the diagonalizability for the convenient choice of the columns of the similarity matrix. Although technically far more complex than in the Euclidean case, this proof is also pretty straightforward and will be used to prove the existence of the hyperbolic Schur decomposition for some nondiagonalizable matrices in Proposition 3.7.

It is only natural to ask whether there exists a nondiagonalizable matrix which does not have a hyperbolic Schur decomposition. As the following example shows, such matrices do exist.

Remark 3.4. When discussing the counterexamples for the existence of a hyperbolic Schur decomposition, we shall often define our matrices via their Jordan decompositions because the (non)existence of a hyperbolic Schur decomposition heavily depends on the degeneracy and mutual $J$-orthogonality of the (generalized) eigenvectors, i.e., of (some) columns in the similarity matrix $S$.

Example 3.5. Let $J = \operatorname{diag}(1, -1, 1, -1)$ and $A = SJ_4(\lambda)S^{-1}$ for some $\lambda \in \mathbb{C}$ and

$$S = \begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 1 & -1 \\
1 & -1 & 1 & -1 \\
1 & -1 & -1 & -1
\end{bmatrix}.$$

Let us assume that $A$ has a hyperbolic Schur decomposition (6). As usual, $T_{11}$ denotes the smallest irreducible top left diagonal block of $T = [t_{ij}]$ (i.e., such that the elements of $T$ beneath $T_{11}$ are zero). In other words, $T_{11}$ is of order 1 if $t_{21} = 0$ and of order 2 otherwise. We denote the columns of $\widehat{V}$ as $v_i$ and the columns of $S$ as $s_i$. Note that

$$[v_i,v_j] = \begin{cases} \pm 1, & i = j, \\ 0, & i \neq j. \end{cases}$$

Now, if $T_{11}$ is of order 1, then $v_1$ is an eigenvector of $A$, i.e., it is collinear with $s_1$. In other words, $v_1 = \xi s_1$ for some $\xi \neq 0$. Then

$$1 = |[v_1,v_1]| = |\xi|^2\, |[s_1,s_1]| = 0,$$

which is an obvious contradiction, so $T_{11}$ is of order 2.

Let $T_{11}$ be of order 2, with the elements denoted as $t_{ij}$, for $i,j \in \{1, 2\}$. Similarly to Example 3.1 (case 2), we define

$$A_2 := \left(A - \frac{t_{11} + t_{22}}{2}\, I\right)^{2}, \qquad \lambda' := \frac{(t_{11} - t_{22})^2}{4} + t_{12}t_{21}.$$

Now, the first two columns of $\widehat{V}$, $v_1$ and $v_2$, are both (linearly independent) eigenvectors of $A_2$ with the same eigenvalue $\lambda'$. It is easy to see that if $A_2$ is nonsingular, then it is similar to $J_4(\lambda')$, which means it has only one eigenvector. This contradicts the assumption that $v_1$ and $v_2$ are linearly independent. Hence, $A_2$ is singular, i.e., $\lambda' = 0$, and we get $t_{11} + t_{22} = 2\lambda$. A simple calculation yields two eigenvectors of $A_2$:

$$s_1' = [1\ 1\ 0\ 0]^T, \qquad s_2' = [0\ 0\ 1\ 1]^T.$$

Since they span the same eigenspace as $v_1$ and $v_2$, we conclude that

$$v_1 = \alpha_{11}s_1' + \alpha_{21}s_2', \qquad v_2 = \alpha_{12}s_1' + \alpha_{22}s_2',$$

for some $\alpha_{11}$, $\alpha_{12}$, $\alpha_{21}$, $\alpha_{22}$. Note that

$$[s_1',s_1'] = 0, \qquad [s_2',s_2'] = 0, \qquad [s_1',s_2'] = 0,$$

which means that $s_1'$ and $s_2'$ are both $J$-degenerate and mutually $J$-orthogonal. The contradiction is now obvious:

$$\begin{aligned}
\pm 1 = [v_1,v_1] &= [\alpha_{11}s_1' + \alpha_{21}s_2',\ \alpha_{11}s_1' + \alpha_{21}s_2'] \\
&= |\alpha_{11}|^2 [s_1',s_1'] + 2\operatorname{Re}(\alpha_{11}\overline{\alpha_{21}}\,[s_1',s_2']) + |\alpha_{21}|^2 [s_2',s_2'] = 0.
\end{aligned}$$
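The key facts used in Example 3.5 can be checked numerically. The sketch below fixes an arbitrary value $\lambda = 0.5$ (an assumption made only for illustration; the example holds for any $\lambda$), builds $A_2 = (A - \lambda I)^2$, which is the singular case $t_{11} + t_{22} = 2\lambda$, and confirms that its null space is spanned by the two $J$-degenerate, mutually $J$-orthogonal vectors $s_1'$ and $s_2'$.

```python
import numpy as np

j = np.array([1.0, -1.0, 1.0, -1.0])
J = np.diag(j)
S = np.array([[1,  1,  1,  1],
              [1,  1,  1, -1],
              [1, -1,  1, -1],
              [1, -1, -1, -1]], dtype=float)

lam = 0.5                                # arbitrary illustrative eigenvalue
N = np.diag(np.ones(3), k=1)             # nilpotent part of J_4(lam)
A = S @ (lam * np.eye(4) + N) @ np.linalg.inv(S)

B = A - lam * np.eye(4)
A2 = B @ B                               # A_2 for t11 + t22 = 2*lam

# Columns are s1' = [1 1 0 0]^T and s2' = [0 0 1 1]^T.
E = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)

rank_A2 = np.linalg.matrix_rank(A2)      # 4 - rank = dimension of the null space
gram_E = E.T @ J @ E                     # all J-products among s1', s2'

print(rank_A2)
print(np.allclose(A2 @ E, 0.0))
print(gram_E)
```

Since `gram_E` vanishes, every vector in the null space of $A_2$ is $J$-degenerate, so no $J$-normalized $v_1$ can be found there.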

It is fairly easy to construct a matrix $A$ of order $n$ similar to the one in Example 3.5 for the Schur-like decompositions with bigger diagonal blocks of $T$. This means some matrices do not have a hyperbolic Schur-like decomposition, regardless of the order of the biggest diagonal block in $T$, as long as this order is strictly less than $n$. For a more detailed description, see [21].

Since the blocks in $T$ are of order at most 2, it makes sense to ask if all the matrices with the Jordan blocks of order at most 2 (in their Jordan decomposition) have a hyperbolic Schur decomposition. As the following example shows, this is also, unfortunately, not the case.

Example 3.6. Let $J = \operatorname{diag}(1, -1, 1, -1)$ and let $A = S\mathcal{J}S^{-1}$, where

$$S = \begin{bmatrix}
1 & 1 & 1 & 2 \\
1 & 2 & 1 & 1 \\
1 & 2 & -1 & -1 \\
1 & 1 & -1 & -2
\end{bmatrix}, \qquad \mathcal{J} = J_2(\lambda_1) \oplus J_2(\lambda_2), \qquad \lambda_1 \neq \lambda_2.$$

Assume that $A$ has a hyperbolic Schur decomposition (6) with $\widehat{V} = [v_1 \dots v_4]$ hyperexchange and $T$ quasitriangular, and let $s_i = Se_i$. Using the same argumentation as in Example 3.5, we see that the top left block of $T$ has to be of order 2. Also, $v_1$ and $v_2$ must be eigenvectors of some $A_2(\xi) := (A - \xi I)^2$ associated with the same eigenvalue (for some $\xi$). However, $v_1$ and $v_2$ must be linearly independent, and the only 2-dimensional eigenspaces of $A_2(\xi)$ are those spanned by:

1. $\{s_1, s_2\}$ for $\xi = \lambda_1$,
2. $\{s_1, s_3\}$ for $\xi = \frac{1}{2}(\lambda_1 + \lambda_2)$, and
3. $\{s_3, s_4\}$ for $\xi = \lambda_2$.

Each of these sets consists of degenerate, mutually $J$-orthogonal vectors. It is easy to see that the linear combinations of such vectors are also degenerate (and mutually $J$-orthogonal), which contradicts the assumption that $v_1$ and $v_2$ are columns of a hyperexchange matrix $\widehat{V}$.
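The claims of Example 3.6 about the three candidate eigenspaces can be verified directly. In the sketch below, the values $\lambda_1 = 1$ and $\lambda_2 = 3$ are arbitrary illustrative choices (the example only requires $\lambda_1 \neq \lambda_2$), and the variable names are assumptions made here.

```python
import numpy as np

j = np.array([1.0, -1.0, 1.0, -1.0])
J = np.diag(j)
S = np.array([[1, 1,  1,  2],
              [1, 2,  1,  1],
              [1, 2, -1, -1],
              [1, 1, -1, -2]], dtype=float)
G = S.T @ J @ S                          # Gram matrix of the columns s_1..s_4

# Index pairs (0-based) of the three spanning sets {s1,s2}, {s1,s3}, {s3,s4}.
pairs = [(0, 1), (0, 2), (2, 3)]
blocks_vanish = [np.allclose(G[np.ix_(p, p)], 0.0) for p in pairs]

# Check the middle case: for xi = (lam1 + lam2)/2, both s1 and s3 are
# eigenvectors of A_2(xi) = (A - xi I)^2 with eigenvalue (lam1 - lam2)^2 / 4.
lam1, lam2 = 1.0, 3.0
Jordan = np.array([[lam1, 1, 0, 0],
                   [0, lam1, 0, 0],
                   [0, 0, lam2, 1],
                   [0, 0, 0, lam2]])
A = S @ Jordan @ np.linalg.inv(S)
xi = 0.5 * (lam1 + lam2)
A2 = (A - xi * np.eye(4)) @ (A - xi * np.eye(4))
mu = 0.25 * (lam1 - lam2) ** 2

print(blocks_vanish)
print(np.allclose(A2 @ S[:, [0, 2]], mu * S[:, [0, 2]]))
```

Each $2 \times 2$ Gram block being zero means that every linear combination of the corresponding pair is $J$-degenerate, which is the property the example uses.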

However, if we limit the matrix to have only one Jordan block of order 2 (the rest of the Jordan form being diagonal), it will always have a hyperbolic Schur decomposition, as shown in the following proposition. Its proof is a constructive one, following the idea of the iterative reduction, very much like the common proof of the Schur decomposition (see [12, Theorem 2.3.1], [7, Theorem 7.1.3] or [21]).

Proposition 3.7. Let $A \in \mathbb{C}^{n \times n}$ have a Jordan decomposition $A = S\mathcal{J}S^{-1}$ such that $\mathcal{J}$ has at most one block of order 2, while all others are of order 1. Then $A$ has a hyperbolic Schur decomposition for any given $J = \operatorname{diag}(\pm 1)$.

Proof. If all Jordan blocks of A are of order 1, the matrix A is diagonalizable and, by Theorem 3.3, has a hyperbolic Schur decomposition. So, we shall only consider a case when A has (exactly one) Jordan block of order 2.

Case 1 If there is a nondegenerate eigenvector $s_1$ of $A$, we can $J$-normalize it, obtaining the $J$-normal eigenvector $v_1 = s_1/|s_1|$. As explained in [6, page 10], $v_1$ can be expanded to a $J$-orthonormal basis $\{v_1, v_2, \dots, v_n\}$. Defining a matrix $V := [v_1\ v_2\ \cdots\ v_n]$, we see that

$$A = V \begin{bmatrix} t_{11} & * \\ 0 & A' \end{bmatrix} V^{-1}, \qquad V^*JV = P^*JP.$$

We repeat the process on $A'$ until we either get to the block of order 2 or to an $A'$ such that all its eigenvectors are $J'$-degenerate, where $J'$ is the corresponding (bottom right) part of $P^*JP$. It is not hard to see that this sequence really gives a hyperbolic Schur-like decomposition such that the blocks in $T$ are of order 1, except maybe for the bottom right one, which may be of an arbitrary order and is covered by Case 2.

Case 2 We now focus on $A$ such that all its eigenvectors are $J$-degenerate, i.e., $[s_i,s_i] = 0$ for all $i$ (except, maybe, the second one, since $s_2$ is not an eigenvector but (the second) generalized eigenvector associated with the aforementioned block $J_2(\lambda)$), as this is the only case not resolved by the previously described reductions. Without loss of generality, we may assume that

$$\mathcal{J} = J_2(\lambda_1) \oplus \lambda_2 \oplus \cdots \oplus \lambda_{n-1}.$$

So, by assumption, $[s_i,s_i] = 0$ for all $i \in \{1, 3, 4, \dots, n\}$. Note the absence of the second column, as this one is not an eigenvector, but a generalized eigenvector associated with $J_2(\lambda_1)$.

Case 2.1 If $[s_1,s_2] \neq 0$ and $[s_2,s_2] \neq 0$, we define

v1′ = ξs1 + s2, v2′ = s2.

Since $s_1$ and $s_2$ are linearly independent, $v_1'$ and $v_2'$ are also linearly independent for every $\xi \neq 0$. We shall define the appropriate $\xi$ in a moment. Note that

$$[v_1',v_2'] = [\xi s_1 + s_2, s_2] = \xi[s_1,s_2] + [s_2,s_2].$$

Since we want to construct a $J$-orthonormal set, we want $[v_1',v_2'] = 0$, so we define

$$\xi := -\frac{[s_2,s_2]}{[s_1,s_2]}.$$

Now that $v_1'$ and $v_2'$ are $J$-orthogonal, we need to be able to $J$-normalize them and, in order to do that, we need them to be nondegenerate. The vector $v_2' = s_2$ is nondegenerate by assumption. We check the (non)degeneracy of the vector $v_1'$, using the fact that $[s_2,s_2] \in \mathbb{R}$, which is valid for all vectors in any indefinite scalar product space:

$$\begin{aligned}
[v_1',v_1'] &= [\xi s_1 + s_2, \xi s_1 + s_2] = |\xi|^2[s_1,s_1] + 2\operatorname{Re}(\xi[s_1,s_2]) + [s_2,s_2] \\
&= -2\operatorname{Re}[s_2,s_2] + [s_2,s_2] = -[s_2,s_2] \neq 0.
\end{aligned}$$

We define $v_1 = v_1'/|v_1'|$ and $v_2 = v_2'/|v_2'|$, obtaining the $J$-orthonormal set $\{v_1, v_2\}$. As we did in Case 1, we expand this set to a $J$-orthonormal basis $\{v_1, v_2, \dots, v_n\}$, define the $J$-orthonormal matrix $V$ with columns $v_1, \dots, v_n$ and, by construction, see that

$$A = V \begin{bmatrix} T_{11} & * \\ 0 & A' \end{bmatrix} V^{-1}, \qquad V^*JV = P^*JP. \qquad (10)$$

Here, $T_{11}$ is of order 2 and the matrix $A'$ is diagonalizable. Hence, by Theorem 3.3, $A'$ has a hyperbolic Schur decomposition, so $A$ has one too.

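Case 2.2 Let us now assume that $[s_1,s_2] \neq 0$ and $[s_2,s_2] = 0$. We define

$$v_1' = \xi s_1 - s_2, \qquad v_2' = \xi s_1 + s_2.$$

As before, $v_1'$ and $v_2'$ are linearly independent for every $\xi \neq 0$. In order to define the appropriate $\xi$, note that, since $[s_2,s_1] = \overline{[s_1,s_2]}$ and $[s_1,s_1] = [s_2,s_2] = 0$,

$$\begin{aligned}
[v_1',v_2'] &= [\xi s_1 - s_2, \xi s_1 + s_2] = |\xi|^2[s_1,s_1] + 2\mathrm{i}\operatorname{Im}(\xi[s_1,s_2]) - [s_2,s_2] \\
&= 2\mathrm{i}\operatorname{Im}(\xi[s_1,s_2]).
\end{aligned}$$

Hence, to obtain the $J$-orthogonality of $v_1'$ and $v_2'$, we define

$$\xi := \overline{[s_1,s_2]},$$

so that $\xi[s_1,s_2] = |[s_1,s_2]|^2$ is real, once again getting a $J$-orthogonal set $\{v_1', v_2'\}$. As before, we check that these vectors are $J$-nondegenerate:

$$\begin{aligned}
[v_1',v_1'] &= [\xi s_1 - s_2, \xi s_1 - s_2] = |\xi|^2[s_1,s_1] - 2\operatorname{Re}(\xi[s_1,s_2]) + [s_2,s_2] = -2|[s_1,s_2]|^2 \neq 0, \\
[v_2',v_2'] &= [\xi s_1 + s_2, \xi s_1 + s_2] = |\xi|^2[s_1,s_1] + 2\operatorname{Re}(\xi[s_1,s_2]) + [s_2,s_2] = 2|[s_1,s_2]|^2 \neq 0.
\end{aligned}$$

Next, we $J$-normalize the vectors $v_1'$ and $v_2'$ to obtain a $J$-orthonormal set $\{v_1, v_2\}$, expand it to a $J$-orthonormal basis $\{v_1, \dots, v_n\}$ and a $J$-orthonormal matrix $V$. By construction, (10) holds and we can further decompose $A'$, which is again diagonalizable.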
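The constructions of Cases 2.1 and 2.2 can be written as one small routine. The following is a minimal sketch, assuming the conventions of this section ($[x,y] = y^*Jx$, $s_1$ degenerate, $[s_1,s_2] \neq 0$); the function name `j_orthonormal_pair`, the tolerance and the test vectors are hypothetical choices made here, and the expansion to a full $J$-orthonormal basis is not included.

```python
import numpy as np

def ip(x, y, j):
    """Hyperbolic scalar product [x, y]_J = y^* J x, with J = diag(j)."""
    return np.vdot(y, j * x)

def j_orthonormal_pair(s1, s2, j, tol=1e-12):
    """Return J-orthonormal v1, v2 spanning span{s1, s2}, assuming
    [s1, s1] = 0 and [s1, s2] != 0 (Cases 2.1 and 2.2)."""
    c = ip(s1, s2, j)                 # [s1, s2]
    q = ip(s2, s2, j).real            # [s2, s2], always real
    if abs(ip(s1, s1, j)) > tol or abs(c) <= tol:
        raise ValueError("need [s1, s1] = 0 and [s1, s2] != 0")
    if abs(q) > tol:                  # Case 2.1: s2 is nondegenerate
        xi = -q / c
        w1, w2 = xi * s1 + s2, s2
    else:                             # Case 2.2: s2 is degenerate
        xi = np.conj(c)
        w1, w2 = xi * s1 - s2, xi * s1 + s2

    def normalize(w):
        return w / np.sqrt(abs(ip(w, w, j)))

    return normalize(w1), normalize(w2)

j = np.array([1.0, -1.0, 1.0])
s1 = np.array([1.0, 1.0, 0.0], dtype=complex)          # degenerate

# Case 2.1 data: [s2, s2] = 3, [s1, s2] = 1j.
va, vb = j_orthonormal_pair(s1, np.array([0.0, 1j, 2.0]), j)
# Case 2.2 data: [s2, s2] = 0, [s1, s2] = -2j.
vc, vd = j_orthonormal_pair(s1, np.array([1j, -1j, 0.0]), j)

for v, w in ((va, vb), (vc, vd)):
    V = np.column_stack([v, w])
    print(np.round(V.conj().T @ np.diag(j) @ V, 12))
```

In both cases the two resulting vectors have opposite signs of $[v,v]$, in agreement with the formulas $[v_1',v_1'] = -[s_2,s_2]$ (Case 2.1) and $[v_1',v_1'] = -[v_2',v_2']$ (Case 2.2).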

Case 2.3 We have now covered all the cases such that $[s_1,s_2] \neq 0$, so we now assume that $[s_1,s_2] = 0$. Let $k$ be such that $[s_1,s_k] \neq 0$. Obviously, $k \neq 2$, so both vectors $s_1$ and $s_k$ are $J$-degenerate eigenvectors of $A$. Note that such $k$ must exist because, otherwise, the first row and column of $S^*JS$ would be zero, which contradicts the assumption that $S$ and $J$ are nonsingular.

We handle this case in exactly the same way as Case 2.2. The only difference is in the exact formula for the block $T_{11}$ in (10), which we omit, as it is unimportant for the proof.

We have researched the existence of the hyperbolic Schur decomposition of a given matrix in relation to its Jordan structure. But the Schur decomposition is particularly interesting as a decomposition that preserves structures (with respect to the corresponding scalar product), so it is only natural to research the existence of the hyperbolic Schur decomposition for $J$-Hermitian and $J$-unitary matrices.

A discussion in [6, Section 5.6] gives a detailed analysis of $J$-Hermitian matrices for the case when $J$ (in [6] denoted as $H$) has exactly one negative eigenvalue. In the case of the hyperbolic scalar products, the discussion is basically about the Minkowski spaces, i.e.,

J = ± diag(1, −1,..., −1) or J = ± diag(1,..., 1, −1).

However, Proposition 3.7 cannot be used to conclude that all $J$-Hermitian matrices in the Minkowski space have a hyperbolic Schur decomposition. The problem arises from the case (iv) in the aforementioned discussion in [6, Section 5.6], which states that a $J$-Hermitian matrix can have a Jordan block of order 3. Using this case, we construct the following very important example, which shows that there really exist such matrices that have no hyperbolic Schur decomposition.

Example 3.8 ($J$-Hermitian matrix that does not have a hyperbolic Schur decomposition). Let $J = \operatorname{diag}(1, 1, -1)$ and let

$$A = \begin{bmatrix} 12 & 11 & -16 \\ 11 & 8 & -13 \\ 16 & 13 & -20 \end{bmatrix} = SJ_3(0)S^{-1}, \qquad S = \begin{bmatrix} 3 & 5 & 3 \\ 4 & 5 & 3 \\ 5 & 7 & 4 \end{bmatrix}.$$

Let us assume that $A$ has a hyperbolic Schur decomposition $A = VTV^{-1}$, $V^*JV = P^*JP$. From the definition of $A$, it is obvious that all the eigenvectors of $A$ are collinear with $s_1$. Note that

$$S^*JS = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 2 \end{bmatrix},$$

which means that $s_1$ is degenerate and $J$-orthogonal to $s_2$. Following the previous discussions (see Example 3.5 or the proof of Proposition 3.7), we see that the top left block of $T$ must be of order 2. But the associated (first two) columns of $V$, denoted $v_1$ and $v_2$, must be $J$-normal, mutually $J$-orthogonal linear combinations of $s_1$ and $s_2$. This is impossible, since $s_1$ is degenerate and $s_1$ and $s_2$ are $J$-orthogonal. To show this, assume that such vectors exist, i.e., we have $\alpha_{ij}$ such that

v1 = α11s1 + α12s2, v2 = α21s1 + α22s2.

We check the desired properties of v1 and v2. First, J-normality:

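$$\begin{aligned}
1 = [v_1,v_1] &= [\alpha_{11}s_1 + \alpha_{12}s_2,\ \alpha_{11}s_1 + \alpha_{12}s_2] \\
&= |\alpha_{11}|^2[s_1,s_1] + 2\operatorname{Re}(\alpha_{11}\overline{\alpha_{12}}\,[s_1,s_2]) + |\alpha_{12}|^2[s_2,s_2] = |\alpha_{12}|^2, \\
1 = [v_2,v_2] &= [\alpha_{21}s_1 + \alpha_{22}s_2,\ \alpha_{21}s_1 + \alpha_{22}s_2] \\
&= |\alpha_{21}|^2[s_1,s_1] + 2\operatorname{Re}(\alpha_{21}\overline{\alpha_{22}}\,[s_1,s_2]) + |\alpha_{22}|^2[s_2,s_2] = |\alpha_{22}|^2.
\end{aligned}$$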

So, $|\alpha_{12}| = |\alpha_{22}| = 1$. Using this result, we analyse the $J$-orthogonality of $v_1$ and $v_2$:

$$\begin{aligned}
0 = |[v_1,v_2]| &= |[\alpha_{11}s_1 + \alpha_{12}s_2,\ \alpha_{21}s_1 + \alpha_{22}s_2]| \\
&= |\alpha_{11}\overline{\alpha_{21}}[s_1,s_1] + \alpha_{11}\overline{\alpha_{22}}[s_1,s_2] + \alpha_{12}\overline{\alpha_{21}}[s_2,s_1] + \alpha_{12}\overline{\alpha_{22}}[s_2,s_2]| \\
&= |\alpha_{12}\overline{\alpha_{22}}[s_2,s_2]| = 1,
\end{aligned}$$

which is an obvious contradiction, hence no hyperbolic Schur decomposition exists for $A$.

The previous example is very significant, as it proves that not even all $J$-Hermitian matrices have a hyperbolic Schur decomposition. The construction of Example 3.8 is described in [21].

There also exist $J$-unitary matrices that do not have a hyperbolic Schur decomposition. One can be constructed in a manner similar to Example 3.8.

Example 3.9 ($J$-unitary matrix without a hyperbolic Schur decomposition). Let $J = \operatorname{diag}(1, 1, -1)$ and let

$$U = \frac{1}{8}\begin{bmatrix} -1 & -32 & 31 \\ 8 & -8 & 8 \\ 1 & -32 & 33 \end{bmatrix} = SJ_3(1)S^{-1}, \qquad S = \begin{bmatrix} 3 & 10 & 23 \\ 4 & 10 & 23 \\ 5 & 14 & 33 \end{bmatrix}.$$

It is easy to see that $U$ is $J$-unitary. Also, it does not have a hyperbolic Schur decomposition, which can be shown using exactly the same arguments as in Example 3.8.

Even though the above examples show that some $J$-Hermitian and $J$-unitary matrices do not have a hyperbolic Schur decomposition, they also show how rare such matrices are: in both cases we had rather strict conditions that had to be met in order to construct them.

Interestingly, a special class of $J$-Hermitian matrices, referred to as $J$-nonnegative matrices, always has a hyperbolic Schur decomposition. We say that a matrix $A$ is $J$-nonnegative if $JA$ is positive semidefinite and that $A$ is $J$-positive if $JA$ is positive definite. These are, in a way, hyperbolic counterparts of positive semidefinite and positive definite matrices, and they find their applications in the research of $J$-nonnegative spaces and the semidefinite $J$-polar decomposition. For details, see [3] and [6].

Theorem 3.10 ($J$-nonnegative matrices). Let $J = \operatorname{diag}(\pm 1)$. If $A \in \mathbb{C}^{n \times n}$ is $J$-nonnegative, then it has a hyperbolic Schur decomposition with respect to $J$.

Proof. Since JA is positive semidefinite, we can write $A = JB^*B$ for some B. Then the hyperbolic SVD of B, as described in [24, Section 3], where J is denoted as Φ, $A^*$ as $A^H$ and JV as P (we shall use P as a permutation matrix, which is not explicitly used in [24]), is

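\[
B = U \begin{bmatrix}
\begin{bmatrix} I_j & I_j \end{bmatrix} & & \\
& \operatorname{diag}(|\lambda_1|^{1/2},\dots,|\lambda_l|^{1/2}) & \\
& & 0
\end{bmatrix} (JV)^*,
\]

where $U^*U = I$, $(JV)^*J(JV) = V^*JV = P^*JP$, for some permutation P, and $j = \operatorname{rank} B - \operatorname{rank} BJB^*$. It is easy to see that

\[
A = JB^*B = V \begin{bmatrix}
\begin{bmatrix} I_j & I_j \\ I_j & I_j \end{bmatrix} & & \\
& \operatorname{diag}(|\lambda_1|,\dots,|\lambda_l|) & \\
& & 0
\end{bmatrix} P^*JP\, V^{-1}.
\]

Note that $P^*JP$ is diagonal and

\[
\begin{bmatrix} I_j & I_j \\ I_j & I_j \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \otimes I_j
\]

is permutationally similar to

\[
\bigoplus_{i=1}^{j} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = I_j \otimes \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix},
\]

which completes the proof of the theorem.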

Remark 3.11. If A is J-positive, then JA is nonsingular, so j = 0 in the above proof, which means that J-positive matrices have a hyperbolic Schur decomposition with T diagonal with positive entries. In fact, more is known for this case, as shown in [20, Corollary 5.3]:

\[
A = VTV^{-1}, \quad V^{-1} = V^{[*]}, \quad T = J\Sigma, \quad \Sigma = \operatorname{diag}(|\lambda_1|,\dots,|\lambda_n|),
\]

i.e., V can be chosen to be J-unitary if we set the signs in T according to those in J.
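A small numerical illustration of Remark 3.11 may help. The sketch below does not use the hyperbolic SVD of [24]. Instead, it assumes NumPy and swaps in a Hermitian eigendecomposition of $M^{1/2} J M^{1/2}$, with $M = JA$, to build a J-orthonormal V that diagonalizes a J-positive A. The function name `j_positive_diagonalize` and the random test matrix are hypothetical choices made only for this illustration.

```python
import numpy as np

def j_positive_diagonalize(J, A):
    """Return (V, lam) with A V = V diag(lam) and V* J V = diag(sign(lam)).

    Assumes J = diag(+-1) and that M = J A is Hermitian positive definite.
    This is an illustrative route via M^{1/2} J M^{1/2}, not the hyperbolic
    SVD used in the proof of Theorem 3.10.
    """
    M = J @ A
    w, Q0 = np.linalg.eigh(M)                      # M = Q0 diag(w) Q0*, w > 0
    M_half = (Q0 * np.sqrt(w)) @ Q0.conj().T       # M^{1/2}
    M_inv_half = (Q0 / np.sqrt(w)) @ Q0.conj().T   # M^{-1/2}
    lam, Q = np.linalg.eigh(M_half @ J @ M_half)   # real, nonzero eigenvalues of A
    V = (M_inv_half @ Q) * np.sqrt(np.abs(lam))    # J-normalize the eigenvectors
    return V, lam

rng = np.random.default_rng(0)
J = np.diag([1.0, 1.0, -1.0])
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = J @ (B.conj().T @ B + np.eye(3))               # J A is positive definite

V, lam = j_positive_diagonalize(J, A)
J_tilde = V.conj().T @ J @ V                       # expected: diag(sign(lam))
T = np.linalg.solve(V, A @ V)                      # expected: diag(lam)
```

In this sketch the diagonal factor is `diag(lam)`, and the sign of each entry matches the corresponding entry of `J_tilde`, which agrees with the form T = JΣ quoted above up to a permutation of the signs.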

Before exploring the properties of the hyperbolic Schur decomposition, we investigate further the quasitriangular factor T. To do this, we define the following very important variant of the hyperbolic Schur decomposition.

Definition 3.12 (The complete hyperbolic Schur decomposition). Let J = diag(±1) and $A \in \mathbb{C}^{n \times n}$ be of the same order. We say that $A = VTV^{-1}$ is a complete hyperbolic Schur decomposition of the matrix A with respect to J if V is J-orthonormal, T is quasitriangular, and all diagonal blocks $T_{kk}$ of order 2 are indecomposable¹.

¹ A diagonal block $T_{kk}$ is indecomposable if it cannot be further reduced by the hyperbolic Schur decomposition, i.e., there are no $\tilde{J}_k$-orthogonal $\tilde{V}$ and triangular (not just quasitriangular!) $\tilde{T}$ such that $T_{kk} = \tilde{V}\tilde{T}\tilde{V}^{-1}$, where $\tilde{J}_k$ is the part of $\tilde{J} := V^*JV$ corresponding to $T_{kk}$.

Definition 3.12 allows us to work with the fully reduced (in terms of the J-orthonormal similarity) quasitriangular matrices. This makes proofs simpler (by reducing the number of observed cases) and gives us the following properties of the diagonal blocks in an indecomposable matrix T. Note that any matrix having a hyperbolic Schur decomposition trivially also has a complete hyperbolic Schur decomposition.

Theorem 3.13 (The complete hyperbolic Schur decomposition). Let $A = VTV^{-1}$ be a complete hyperbolic Schur decomposition of A with respect to some J = diag(±1). Then all the irreducible diagonal 2 × 2 blocks of T have only degenerate eigenvectors with respect to the corresponding part of $\tilde{J} := V^*JV$. Furthermore, all such blocks are either degenerate (with respect to the corresponding part of $\tilde{J}$) or nonsingular.

Proof. It is sufficient to note that, once we have a hyperbolic Schur decomposition, we can further decompose irreducible 2 × 2 diagonal blocks of the quasitriangular factor T, either using the traditional Schur decomposition (if the corresponding part of $\tilde{J}$, denoted J′, is definite, i.e., J′ = ±I_2) or the hyperbolic Schur decomposition (if J′ is hyperbolic, i.e., J′ = diag(1, −1) or J′ = diag(−1, 1)). In the latter case, we can triangularize the observed 2 × 2 block if and only if it has at least one nondegenerate eigenvector, by J′-normalizing that eigenvector and expanding it to the J′-orthonormal basis, which can always be done for a J′-orthonormal set (see [6, p. 10]). So, the only indecomposable 2 × 2 blocks of T are those with degenerate eigenvectors with respect to the hyperbolic scalar product induced by the corresponding part of $\tilde{J}$.

For the second part of the theorem, regarding the irreducible 2 × 2 diagonal blocks in the factor T, note that all irreducible singular blocks of order 2 must have rank 1; otherwise, they are either 0 (which is reducible as [0] ⊕ [0]) or nonsingular.
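Let us observe one such block, denoting it as T′ and the corresponding part of $\tilde{J}$ as J′ = ± diag(1, −1). We see that T′ must be in one of the following two forms:

1. $T' = S \operatorname{diag}(\lambda, 0) S^{-1}$, $\lambda \neq 0$, $S^*J'S = \begin{bmatrix} 0 & \alpha \\ \bar{\alpha} & 0 \end{bmatrix}$, $\alpha \in \mathbb{C} \setminus \{0\}$, which gives

\[
(T')^*J'T' = S^{-*} \operatorname{diag}(\bar{\lambda}, 0)\, S^*J'S \operatorname{diag}(\lambda, 0) S^{-1}
= S^{-*} \operatorname{diag}(\bar{\lambda}, 0) \begin{bmatrix} 0 & \alpha \\ \bar{\alpha} & 0 \end{bmatrix} \operatorname{diag}(\lambda, 0) S^{-1} = 0,
\]

or

2. $T' = S J_2(0) S^{-1}$, $S^*J'S = \begin{bmatrix} 0 & \alpha \\ \bar{\alpha} & \beta \end{bmatrix}$, $\alpha \in \mathbb{C} \setminus \{0\}$, $\beta \in \mathbb{C}$, which gives

\[
(T')^*J'T' = S^{-*} J_2(0)^T S^*J'S\, J_2(0) S^{-1}
= S^{-*} \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & \alpha \\ \bar{\alpha} & \beta \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} S^{-1} = 0.
\]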

So, in both cases $(T')^*J'T' = 0$, which means that T′ is J′-degenerate. Since T′ was an arbitrary irreducible diagonal singular block in the complete Schur decomposition, this means that all such blocks are degenerate.

The indecomposable blocks are sometimes referred to as atomic (see [4, page 466]) and Theorem 3.13 states that all irreducible blocks in the complete hyperbolic Schur decomposition are atomic and have a specific eigenstructure. Note that each atomic block can have one or two eigenvectors, as Theorem 3.13 makes no statement on the Jordan structure of such blocks.
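Returning briefly to Example 3.9, its claims can be checked numerically. The sketch below assumes NumPy and uses the matrices U and S as read from that example, with the factor 1/8 applied to U. It verifies that U is J-unitary, that $US = SJ_3(1)$, and that the single eigenvector direction $s_1$ (the first column of S) is J-degenerate. The variable names are illustrative only.

```python
import numpy as np

J = np.diag([1.0, 1.0, -1.0])
U = np.array([[-1.0, -32.0, 31.0],
              [ 8.0,  -8.0,  8.0],
              [ 1.0, -32.0, 33.0]]) / 8.0
S = np.array([[3.0, 10.0, 23.0],
              [4.0, 10.0, 23.0],
              [5.0, 14.0, 33.0]])
J3 = np.array([[1.0, 1.0, 0.0],
               [0.0, 1.0, 1.0],
               [0.0, 0.0, 1.0]])                  # Jordan block J_3(1)

is_j_unitary = np.allclose(U.T @ J @ U, J)        # U* J U = J (U is real)
is_jordan_chain = np.allclose(U @ S, S @ J3)      # equivalent to U = S J_3(1) S^{-1}
s1 = S[:, 0]                                      # eigenvector of U for the eigenvalue 1
s1_norm = s1 @ J @ s1                             # [s1, s1]_J
```

Since U is similar to a single Jordan block, every eigenvector is a multiple of `s1`, and `s1_norm` vanishes, so no eigenvector of U can be J-normalized.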

4. Properties

In this section, we assume that J = diag(±1) and A are given such that A has a hyperbolic Schur decomposition, as described in Definition 3.2. We also use V, P and $\tilde{J}$ from (7) and (6).

One of the main properties of the traditional Schur decomposition is that it keeps some structures of the matrix unchanged, i.e., if the decomposed matrix A is normal, Hermitian or unitary, then the triangular block T of a Schur decomposition of A will also be normal, Hermitian or unitary, respectively. Not surprisingly, similar properties hold in the hyperbolic case as well.

Proposition 4.1 (J-conjugate transpose). Let A have a hyperbolic Schur decomposition (7) with respect to J = diag(±1). Then

\[
A^{[*]_J} = (VP)\, T^{[*]_{\tilde{J}}}\, (VP)^{-1}, \qquad \tilde{J} = P^*JP.
\]

Proof. The proof is straightforward. From (7), it follows that

\[
\begin{aligned}
A^{[*]_J} &= J\bigl((VP)T(VP)^{-1}\bigr)^* J = J(VP)^{-*}T^*(VP)^*J \\
&= JV^{-*}PT^*P^*V^*J = (VP)P^*(V^*JV)^{-1}PT^*P^*V^*JVP(VP)^{-1} \\
&= (VP)\tilde{J}T^*\tilde{J}(VP)^{-1} = (VP)\, T^{[*]_{\tilde{J}}}\, (VP)^{-1}.
\end{aligned}
\]
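The identity of Proposition 4.1 can be checked on a small instance. The sketch below assumes NumPy and assumes, as the proof above suggests, that (7) has the form $A = (VP)T(VP)^{-1}$ with V J-unitary and P a permutation. The helper `j_adjoint`, the hyperbolic angle `t` and the random triangular factor are hypothetical choices for illustration.

```python
import numpy as np

def j_adjoint(X, J):
    """J-conjugate transpose X^{[*]_J} = J X* J for J = diag(+-1)."""
    return J @ X.conj().T @ J

rng = np.random.default_rng(1)
J = np.diag([1.0, -1.0, 1.0])

t = 0.7                                            # arbitrary hyperbolic angle
V = np.array([[np.cosh(t), np.sinh(t), 0.0],
              [np.sinh(t), np.cosh(t), 0.0],
              [0.0,        0.0,        1.0]])      # J-unitary: V* J V = J
P = np.eye(3)[:, [0, 2, 1]]                        # permutation swapping the last two indices
J_tilde = P.T @ J @ P                              # = diag(1, 1, -1)

T = np.triu(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
W = V @ P
A = W @ T @ np.linalg.inv(W)

lhs = j_adjoint(A, J)                              # A^{[*]_J}
rhs = W @ j_adjoint(T, J_tilde) @ np.linalg.inv(W) # (VP) T^{[*]_J~} (VP)^{-1}
```

Note that the adjoint of T is taken with respect to the permuted `J_tilde`, not J, which is the point of the proposition.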

As we have seen in Example 3.5, some matrices do not have a hyperbolic Schur decomposition. However, if a matrix A has it, then its conjugate transposes (both the Euclidean and the hyperbolic one) have it as well.

Proposition 4.2 (Existence of the hyperbolic Schur decomposition for conjugate transposes). A matrix A has a hyperbolic Schur decomposition with respect to J = diag(±1) if and only if $A^*$ and $A^{[*]_J}$ have it as well.

Proof. The proof is a direct consequence of Proposition 4.1. Note that if some matrix X is lower triangular, then $S_n^{-1}XS_n$ is upper triangular. Now, we have:

\[
\begin{aligned}
A^{[*]_J} &= (VP)\, T^{[*]_{\tilde{J}}}\, (VP)^{-1} = (VP)S_nS_n^{-1}\, T^{[*]_{\tilde{J}}}\, S_nS_n^{-1}(VP)^{-1} \\
&= (VPS_n)\bigl(S_n^{-1}\, T^{[*]_{\tilde{J}}}\, S_n\bigr)(VPS_n)^{-1},
\end{aligned}
\]

which is one possible hyperbolic Schur decomposition of $A^{[*]_J}$. A similar proof can be applied to $A^*$. For the other implication, we need only apply what was already proven in the first part and use the fact that

\[
A = (A^*)^* = \bigl(A^{[*]_J}\bigr)^{[*]_J}.
\]
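The role of $S_n$ in the proof above can be illustrated on a small matrix. The sketch below assumes NumPy and assumes that $S_n$ is the reversal (exchange) permutation matrix, i.e., the identity with its columns in reverse order. The test matrix `X` is an arbitrary illustrative choice.

```python
import numpy as np

n = 4
S_n = np.fliplr(np.eye(n))                              # reversal permutation, S_n^{-1} = S_n
X = np.tril(np.arange(1.0, n * n + 1).reshape(n, n))    # a lower triangular matrix
Y = S_n @ X @ S_n                                       # S_n^{-1} X S_n
```

Conjugation by `S_n` reverses both the row and the column order, so the entries below the diagonal of `X` end up above the diagonal of `Y`. The same reversal maps a block lower triangular matrix to a block upper triangular one, with the order of the diagonal blocks reversed.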

We now consider J-Hermitian matrices. As seen in the previous section, the hyperbolic Schur decomposition was defined in such a way that it keeps J-Hermitianity. This property can be shown directly as well.

Proposition 4.3 (J-Hermitian matrices). If a matrix A has a hyperbolic Schur decomposition with respect to J = diag(±1), then A is J-Hermitian if and only if T is $\tilde{J}$-Hermitian and quasidiagonal, where $\tilde{J} = P^*JP$.

Proof. From Proposition 4.1, we see that for $A = (VP)T(VP)^{-1}$,

\[
A^{[*]_J} = (VP)\, T^{[*]_{\tilde{J}}}\, (VP)^{-1},
\]

so $T = T^{[*]_{\tilde{J}}}$ if and only if $A^{[*]_J} = A$. If T is quasitriangular and $\tilde{J}$-Hermitian, then it is quasidiagonal.

At this point, it is worth noting that the spectrum of a J-Hermitian matrix A is always symmetric with respect to the real axis. Moreover, the Jordan structure associated with λ is the same as that associated with $\bar{\lambda}$, as shown in [6, Proposition 4.2.3]. Also, by [6, Corollary 4.2.5], nonreal eigenvalues of J-Hermitian matrices have J-neutral root subspaces, which implies that all their adjoined eigenvectors are J-degenerate. This means that each such eigenvalue will participate in some singular atomic block of order 2 in the quasidiagonal matrix T of the matrix's hyperbolic Schur decomposition. Of course, for J = ±I, all eigenvalues are real and such blocks do not exist.

In the following proposition, we consider J-normal matrices. Recall that A is J-normal if $AA^{[*]_J} = A^{[*]_J}A$.

Proposition 4.4 (J-normal matrices). If a matrix A has a hyperbolic Schur decomposition with respect to J = diag(±1), then A is J-normal if and only if T is $\tilde{J}$-normal quasitriangular, where $\tilde{J} = P^*JP$.

Proof. This follows directly from Proposition 4.1:

\[
AA^{[*]_J} = (VP)\, TT^{[*]_{\tilde{J}}}\, (VP)^{-1}, \qquad A^{[*]_J}A = (VP)\, T^{[*]_{\tilde{J}}}T\, (VP)^{-1}.
\]

Unlike the Euclidean case, where a normal triangular matrix is also diagonal, in the hyperbolic case we have no guarantee that a block triangular J-normal matrix is also block diagonal, as shown in the following example.

Example 4.5 (A block triangular, J-normal matrix which is not block diagonal). Let J = diag(1, −1, 1, −1) and let A be the following block triangular matrix:

\[
A = \begin{bmatrix} \xi & 1 & 1 & 1 \\ 1 & \xi & 1 & 1 \\ 0 & 0 & \xi & 1 \\ 0 & 0 & 1 & \xi \end{bmatrix},
\]

for some $\xi \in \mathbb{C}$. A simple multiplication shows that $AA^{[*]} = A^{[*]}A$, for any $\xi \in \mathbb{C}$, and A is obviously not block diagonal (with the diagonal blocks of order 2).

Where does this difference between the Euclidean and the hyperbolic case come from? In the Euclidean case, for an upper triangular normal A, we have:

\[
\sum_{k \le n} |a_{1k}|^2 = (AA^*)_{11} = (A^*A)_{11} = |a_{11}|^2, \tag{11}
\]

from which we conclude that $\sum_{1 < k \le n} |a_{1k}|^2 = 0$, i.e., $a_{1k} = 0$ for all $k > 1$. Then we do the same for $(AA^*)_{22}$, $(AA^*)_{33}$, etc. But, in the hyperbolic case, for a block upper triangular J-normal A, (11) takes the following form:

\[
j_{11} \sum_{k \le n} j_{kk} |a_{1k}|^2 = (AA^{[*]})_{11} = (A^{[*]}A)_{11} = |a_{11}|^2.
\]

Since J contains both positive and negative numbers on its diagonal, the sum on the left hand side of the previous equation may also contain both positive and negative elements, so we can make no direct conclusion about the elements $a_{1k}$ for any k.

Similarly to J-normal and J-Hermitian, we can also analyze J-unitary matrices. Somewhat surprisingly, this property is much closer to the Euclidean case than the previous result regarding J-normal matrices. Let us first review the structure of quasitriangular hyperexchange matrices, as this will give us more insight into the structure of the J-unitary matrices (which are a special case of the hyperexchange matrices).

Proposition 4.6 (Block triangular hyperexchange matrices). Let T be a quasitriangular hyperexchange matrix with respect to some given J = diag(±1). Then T is also quasidiagonal.

Proof. Since T is a hyperexchange matrix, there exists a permutation P such that $T^*JT = PJP^*$. This means that $T^{-1} = PJP^*T^*J$. Because T is block upper triangular and J and $PJP^*$ are diagonal,

1. $T^{-1}$ is block upper triangular, and
2. $T^*$ and $PJP^*T^*J$ are block lower triangular.

Hence, $T^{-1}$ is both block upper and block lower triangular, i.e., $T^{-1}$ and therefore T are quasidiagonal.
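The claim of Example 4.5 is easy to confirm numerically. The sketch below assumes NumPy; the function names `example_4_5` and `j_normal_defect` and the sample values of ξ are hypothetical. It evaluates $\|AA^{[*]} - A^{[*]}A\|$ for a few values of ξ and measures the nonzero off-diagonal block that prevents A from being block diagonal.

```python
import numpy as np

def example_4_5(xi):
    """The block triangular matrix A of Example 4.5 for a given xi."""
    return np.array([[xi, 1, 1, 1],
                     [1, xi, 1, 1],
                     [0, 0, xi, 1],
                     [0, 0, 1, xi]], dtype=complex)

J = np.diag([1.0, -1.0, 1.0, -1.0])

def j_normal_defect(A, J):
    """Frobenius norm of A A^{[*]} - A^{[*]} A, with A^{[*]} = J A* J."""
    A_adj = J @ A.conj().T @ J
    return np.linalg.norm(A @ A_adj - A_adj @ A)

defects = [j_normal_defect(example_4_5(xi), J) for xi in (0.0, 2.0, 1.5 - 0.5j)]
off_block = np.linalg.norm(example_4_5(2.0)[:2, 2:])   # nonzero, so A is not block diagonal
```

All entries of `defects` vanish up to rounding, while `off_block` is nonzero, so this A is J-normal and block triangular but not block diagonal. By Proposition 4.6, such a matrix cannot be hyperexchange.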

We are now ready to analyze the hyperbolic Schur decomposition of a J-unitary matrix.

Proposition 4.7 (J-unitary matrices). If a matrix A has a hyperbolic Schur decomposition $A = VTV^{-1}$ with respect to J = diag(±1), then A is J-unitary if and only if the quasitriangular factor T is $\tilde{J}$-unitary and quasidiagonal, where $\tilde{J} = P^*JP$.

Proof. By Proposition 4.1, using the same arguments as in the proof of Proposition 4.4, T is $\tilde{J}$-unitary. The quasidiagonality of T follows directly from Proposition 4.6 and the fact that every $\tilde{J}$-unitary matrix is also $\tilde{J}$-hyperexchange.

The previous proposition can also be proven in a more straightforward manner, by analyzing the top right block of dimensions 1 × (n − 1) or 2 × (n − 2) in $T^*\tilde{J}T$ and then repeating the process iteratively on the bottom right block of order n − 1 or n − 2.

Eigenvalues of J-unitary and J-Hermitian matrices have well researched properties, nicely presented in [16, Section 7], with J-unitary matrices being referred to as members of the automorphism group $\mathbb{G}$ and J-Hermitian matrices being referred to as members of the Jordan algebra $\mathbb{J}$. These results apply for various scalar products, but when it comes to hyperbolic products and a hyperbolic Schur decomposition, more can be said about the atomic blocks in T.

Proposition 4.8 (Nondiagonalizable atomic blocks). Let J = diag(±1) and A of the same order be given such that A has a complete hyperbolic Schur decomposition (7). Furthermore, let $\hat{T}$ be a nondiagonalizable atomic block on the diagonal of T, let $\hat{J}$ be the corresponding part of $\tilde{J}$ and let $s_1$ and $s_2$ denote the columns of the similarity matrix S in the Jordan decomposition $\hat{T} = SJ_2(\lambda)S^{-1}$. Then the following is true:

1. If A is J-unitary, then $|\lambda| = 1$ and $[s_1,s_2]_{\hat{J}} \in i\mathbb{R} \setminus \{0\}$, and
2. If A is J-Hermitian, then $\lambda \in \mathbb{R}$ and $[s_1,s_2]_{\hat{J}} \in \mathbb{R} \setminus \{0\}$.

Proof. By Propositions 4.7 and 4.3, if A is J-unitary or J-Hermitian, T is quasidiagonal $\tilde{J}$-unitary or $\tilde{J}$-Hermitian, respectively. Since we have assumed a complete hyperbolic Schur decomposition, by Theorem 3.13 the eigenvector of every nondiagonalizable atomic block $\hat{T}$ is $\hat{J}$-degenerate, also implying that $\hat{J} = \pm\operatorname{diag}(1, -1)$. This means that

\[
\hat{T} = SJ_2(\lambda)S^{-1}, \qquad S^*\hat{J}S = \begin{bmatrix} 0 & \alpha \\ \bar{\alpha} & \beta \end{bmatrix}.
\]

Since $S^*\hat{J}S$ is Hermitian, $\beta \in \mathbb{R}$. Note that $\alpha = [s_2,s_1]_{\hat{J}} \neq 0$ because $S^*\hat{J}S$ is nonsingular.

Let us now assume that A is J-unitary. As stated before, T is $\tilde{J}$-unitary and, since it is quasidiagonal, its block $\hat{T}$ is $\hat{J}$-unitary. So,

\[
\hat{J} = \hat{T}^*\hat{J}\hat{T} = S^{-*}J_2(\lambda)^*S^*\hat{J}SJ_2(\lambda)S^{-1} = S^{-*} \begin{bmatrix} 0 & \alpha \\ \bar{\alpha} & 2\lambda\operatorname{Re}(\alpha) + \beta \end{bmatrix} S^{-1}.
\]

Premultiplying by $S^*$ and postmultiplying by S, we get

\[
\begin{bmatrix} 0 & \alpha \\ \bar{\alpha} & \beta \end{bmatrix} = S^*\hat{J}S = \begin{bmatrix} 0 & \alpha \\ \bar{\alpha} & 2\lambda\operatorname{Re}(\alpha) + \beta \end{bmatrix}.
\]

From here, we see that $2\lambda\operatorname{Re}(\alpha) = 0$. Since $\hat{T}$ is $\hat{J}$-unitary and, hence, nonsingular, $\lambda \neq 0$. Obviously, $\operatorname{Re}(\alpha) = 0$, i.e., α is imaginary, so $[s_1,s_2] \in i\mathbb{R}$.

The J-Hermitian case is similar. For a J-Hermitian A, as before, we conclude that T is $\tilde{J}$-Hermitian, and its block $\hat{T}$ is $\hat{J}$-Hermitian. From $\hat{T}^{[*]} = \hat{T}$ it follows that $\hat{J}\hat{T}^* = \hat{T}\hat{J}$, so

\[
\hat{J}S^{-*}J_2(\lambda)^*S^* = SJ_2(\lambda)S^{-1}\hat{J}.
\]

Premultiplying by $S^*\hat{J}$ and postmultiplying by $S^{-*}$, we get

\[
J_2(\lambda)^* = S^*\hat{J}SJ_2(\lambda)S^{-1}\hat{J}S^{-*} = (S^*\hat{J}S)J_2(\lambda)(S^*\hat{J}S)^{-1}.
\]

Expanding all these matrices and multiplying those on the right hand side, we see that

\[
\begin{bmatrix} \bar{\lambda} & 0 \\ 1 & \bar{\lambda} \end{bmatrix} = \begin{bmatrix} \lambda & 0 \\ \bar{\alpha}/\alpha & \lambda \end{bmatrix}.
\]

Here, we see that $\lambda \in \mathbb{R}$ (which also follows straight from [16, Theorem 7.6]) and $\alpha = \bar{\alpha}$, i.e., α is real, so $[s_1,s_2] \in \mathbb{R}$.

It is worth noting that J-Hermitian J-unitary matrices always have a hyperbolic Schur decomposition with a specific structure of the atomic blocks of order 2, as shown in the following proposition.

Proposition 4.9 (J-Hermitian J-unitary matrices). Let J = diag(±1) and let $A \in \mathbb{C}^{n \times n}$ be both J-Hermitian and J-unitary. Then A has a complete hyperbolic Schur decomposition (7) such that each diagonal atomic block $\hat{T}$ of order 2 is of the form

\[
\hat{T} = \begin{bmatrix} \alpha & \beta \\ -\bar{\beta} & -\alpha \end{bmatrix}, \qquad \hat{J} = \pm\operatorname{diag}(1, -1),
\]

for some $\alpha \in \mathbb{R}$, $\beta \in \mathbb{C}$ such that $\alpha^2 = |\beta|^2 + 1$. As before, $\hat{J}$ denotes the part of $\tilde{J}$ corresponding to $\hat{T}$.

Proof. From the J-unitarity and the J-Hermitianity of A, we get:

\[
I = A^{[*]}A = A^2,
\]

which means that A is involutory and, hence, diagonalizable (see [2, Fact 5.12.13]) which, by Theorem 3.3, means that it has a hyperbolic Schur decomposition. By Theorem 3.13, we can choose such a decomposition in a way that each diagonal block of order 2 in T, denoted $\hat{T}$, has only degenerate eigenvectors with respect to $\hat{J}$ (the corresponding part of $\tilde{J}$), which is possible if and only if that part is either $\hat{J} = \operatorname{diag}(1, -1)$ or $\hat{J} = \operatorname{diag}(-1, 1)$. By Proposition 4.3, $\hat{T}$ is $\hat{J}$-Hermitian and, by Proposition 4.7, it is also $\hat{J}$-unitary. So,

\[
I_2 = \hat{T}^{[*]_{\hat{J}}}\hat{T} = \hat{T}^2. \tag{12}
\]

Let us denote the elements of $\hat{T}$ as $t_{ij}$ ($i,j \in \{1, 2\}$):

\[
\hat{T} = \begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix}, \qquad
\hat{T}^{[*]_{\hat{J}}} = \hat{J}\hat{T}^*\hat{J} = \begin{bmatrix} \overline{t_{11}} & -\overline{t_{21}} \\ -\overline{t_{12}} & \overline{t_{22}} \end{bmatrix}.
\]

Since $\hat{T} = \hat{T}^{[*]_{\hat{J}}}$, we see that $t_{11}, t_{22} \in \mathbb{R}$ and $t_{21} = -\overline{t_{12}}$. Note that we have chosen $\hat{T}$ to be irreducible (because it comes from the complete hyperbolic Schur decomposition), i.e., $t_{12} \neq 0$ or $t_{21} \neq 0$, which means that both of the nondiagonal elements of $\hat{T}$ are nonzero. Furthermore, from (12), we get

\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} t_{11}^2 - |t_{12}|^2 & t_{12}(t_{11} + t_{22}) \\ -\overline{t_{12}}(t_{11} + t_{22}) & t_{22}^2 - |t_{12}|^2 \end{bmatrix}.
\]

Since $t_{12} \neq 0$, we get $t_{11} = -t_{22}$. Defining $\alpha := t_{11}$ and $\beta := t_{12}$ completes this proof.
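The algebraic identities behind the block form in Proposition 4.9 can be checked directly. The sketch below assumes NumPy and an arbitrarily chosen β (a hypothetical value). It verifies only that a block of the stated form with $\alpha^2 = |\beta|^2 + 1$ is $\hat{J}$-Hermitian, $\hat{J}$-unitary and involutory for $\hat{J} = \operatorname{diag}(1, -1)$; it makes no statement about whether such a block is atomic.

```python
import numpy as np

beta = 0.8 - 0.6j                         # arbitrary complex beta, here |beta| = 1
alpha = np.sqrt(abs(beta) ** 2 + 1.0)     # enforce alpha^2 = |beta|^2 + 1
T_hat = np.array([[alpha, beta],
                  [-np.conj(beta), -alpha]])
J_hat = np.diag([1.0, -1.0])

T_adj = J_hat @ T_hat.conj().T @ J_hat    # J_hat-conjugate transpose of T_hat
```

Comparing `T_adj` with `T_hat` tests the $\hat{J}$-Hermitian property, and the products `T_adj @ T_hat` and `T_hat @ T_hat` should both equal the identity, which corresponds to (12).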

5. Existence of the hyperbolic Schur decomposition for J-Hermitian matrices

As we have seen in Example 3.8, some J-Hermitian matrices do not have a hyperbolic Schur decomposition. Also, Theorem 3.7 provides a sufficient, but not necessary, condition for the existence of such a decomposition in the general case. In this section we give a necessary and sufficient condition for the existence of the hyperbolic Schur decomposition of J-Hermitian matrices.

To achieve our goal, we briefly leave the hyperbolic scalar product spaces and move to more general, indefinite scalar product spaces. A detailed analysis of such spaces is given in [6], from where we use Theorem 5.1.1, which states that for every nonsingular indefinite J and for every J-Hermitian A (i.e., $JA = A^*J$) there exists a nonsingular matrix X such that

\[
A = X\mathcal{J}X^{-1}, \quad X^*JX = S, \quad \mathcal{J} = \bigoplus_{k=1}^{\beta} \mathcal{J}_k, \quad S = \bigoplus_{k=1}^{\beta} \varepsilon_k S_k, \tag{13}
\]

where $\mathcal{J}$ is a Jordan normal form of A. Blocks $\mathcal{J}_k$ for $k = 1,\dots,\alpha$ are associated with the real eigenvalues $\lambda_1,\dots,\lambda_\alpha$ and blocks $\mathcal{J}_k$ for $k = \alpha+1,\dots,\beta$ are associated with conjugate pairs of nonreal eigenvalues $\lambda_{\alpha+1},\dots,\lambda_\beta$ in the upper half-plane, i.e., $\mathcal{J}_k = \mathcal{J}(\lambda_k) \oplus \mathcal{J}(\overline{\lambda_k})$ for $k = \alpha+1,\dots,\beta$. Matrices $S_k$ denote standard involutory permutations of the same order as $\mathcal{J}_k$, $\varepsilon_k \in \{-1, 1\}$ for $k \le \alpha$ and $\varepsilon_k = 1$ for $k > \alpha$. In [6], J, X and S are denoted as H, $T^{-1}$ and J, respectively.

The decomposition (13) is referred to as the canonical form of the pair (A, J) and has many applications. See [6, Chapter 5] for the applications in the indefinite scalar product spaces and [15] for its symplectic version and the application in the symplectic scalar product spaces.

The set of signs $\{\varepsilon_1,\dots,\varepsilon_\alpha\}$ is called the sign characteristic of the pair (A, J). In [15], this notion is adapted to the symplectic scalar product spaces, losing the property that $\varepsilon_k = 1$ for $k > \alpha$. For more details, see [15, Section 3.2]. Apart from the applications to the nonstandard scalar product spaces, the sign characteristic plays a significant role in the research of the self-adjoint matrix polynomials [5, Chapter 12].

Throughout this paper, the Jordan normal form of a matrix plays a crucial role in the existence of the hyperbolic Schur decomposition. To better describe it, we associate with each Jordan block $J_d(\lambda)$ the term partial multiplicity, which is the size d of that Jordan block. Obviously, each eigenvalue has as many partial multiplicities as it has associated Jordan blocks and the largest partial multiplicity of each eigenvalue λ of a matrix A is equal to the size of its largest Jordan block or, equivalently, to $\max\{d \in \mathbb{N} : (x - \lambda)^d \mid \mu_A(x)\}$, where $\mu_A(x)$ is the minimal polynomial of A.

We are now ready to present and prove the necessary and sufficient conditions for the existence of the hyperbolic Schur decomposition of a J-Hermitian matrix.

Theorem 5.1 (Hyperbolic Schur decomposition of a J-Hermitian matrix). Let J = diag(±1) and let $A \in \mathbb{C}^{n \times n}$ be a J-Hermitian matrix of the same order. Then A has a hyperbolic Schur decomposition with respect to J if and only if its real eigenvalues have partial multiplicities at most 2 and its non-real eigenvalues have partial multiplicities at most 1.

Proof. Let us first show that if such a decomposition of A exists, then the real eigenvalues of A have partial multiplicities at most 2 and its non-real eigenvalues have partial multiplicities at most 1.

Let $A = UTU^{-1}$, where U is J-orthonormal, and let $\tilde{J} := U^*JU$. By Proposition 4.3, T is quasidiagonal and $\tilde{J}$-Hermitian. This means that we can write

\[
T = \bigoplus_k T_k, \qquad \tilde{J} = \bigoplus_k \tilde{J}_k,
\]

where each $T_k$ is irreducible (i.e., either of order 1 or nondiagonal of order 2) and $\tilde{J}_k$ has the same order as $T_k$. Furthermore, each $T_k$ is $\tilde{J}_k$-Hermitian. Trivially, this means that all blocks $T_k$ of order 1 are real.

Notice that if $T_k$ of order 2 has non-real eigenvalues $\lambda_1^{(k)}$ and $\lambda_2^{(k)}$, then they form a complex conjugate pair, i.e., $\lambda_1^{(k)} = \overline{\lambda_2^{(k)}}$, due to [16, Theorem 7.6], so their partial multiplicity must be 1. If they are real, their partial multiplicity is 1 or 2, depending on the diagonalizability of $T_k$. Since A is similar to $T = \bigoplus_k T_k$, it has the same Jordan structure, i.e., the same eigenvalues with the same partial multiplicities.

Let us now show that any J-Hermitian matrix A with the described partial multiplicities of its eigenvalues has a hyperbolic Schur decomposition with respect to J. We observe the canonical form of (A, J) from (13). By the assumption, all $\mathcal{J}_k$ are of order at most 2.

Let k be such that $\mathcal{J}_k$ is of order 2. Then $S_k = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, which is congruent to diag(1, −1). By Sylvester's law of inertia, there exists $Y_k$ such that $Y_k^*S_kY_k = \operatorname{diag}(1, -1)$. For all k such that $\mathcal{J}_k$ is of order 1 we define $Y_k = 1$, so $Y_k^*S_kY_k = I_1$ for such k. Furthermore, we define

\[
Y := \bigoplus_k Y_k, \qquad U := XY, \qquad T := Y^{-1}\mathcal{J}Y. \tag{14}
\]

Note that T has the same block structure as $\mathcal{J}$, so it is quasidiagonal, and

\[
Y^*SY = \Bigl(\bigoplus_k Y_k^*\Bigr)\Bigl(\bigoplus_k \varepsilon_k S_k^*\Bigr)\Bigl(\bigoplus_k Y_k\Bigr) = \bigoplus_k \varepsilon_k Y_k^*S_k^*Y_k =: J', \tag{15}
\]

where $J' = \operatorname{diag}(j_1,\dots,j_n)$ for some $j_1,\dots,j_n \in \{-1, 1\}$. Using (13), (14) and (15), we see that

\[
A = X\mathcal{J}X^{-1} = UY^{-1}\mathcal{J}YU^{-1} = UTU^{-1},
\]

\[
U^*JU = (XY)^*J(XY) = Y^*X^*JXY = Y^*SY = J',
\]

which proves that A is J-orthonormally similar to some quasidiagonal T, hence A has a hyperbolic Schur decomposition with respect to J.
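The construction (14)–(15) can be traced on a small example. The sketch below assumes NumPy and a hypothetical canonical form with one real Jordan block of order 2 (with $\varepsilon_1 = 1$) and one real block of order 1 (with $\varepsilon_2 = -1$). The choice $Y_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$ is one possible congruence taking $S_1$ to diag(1, −1), and the matrix X is manufactured from a J-unitary hyperbolic rotation H so that $X^*JX = S$ holds. None of these specific values come from the paper.

```python
import numpy as np

lam, mu = 2.0, -1.0                          # hypothetical real eigenvalues
Jordan = np.array([[lam, 1.0, 0.0],
                   [0.0, lam, 0.0],
                   [0.0, 0.0, mu]])           # direct sum of J_2(lam) and J_1(mu)
S = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, -1.0]])              # direct sum of eps_k S_k with eps = (1, -1)

Y = np.array([[1.0, 1.0, 0.0],
              [1.0, -1.0, 0.0],
              [0.0, 0.0, np.sqrt(2.0)]]) / np.sqrt(2.0)   # direct sum of Y_1 and Y_2 = 1
J = Y.T @ S @ Y                               # (15): here diag(1, -1, -1)

t = 0.3                                       # arbitrary hyperbolic angle
H = np.array([[np.cosh(t), 0.0, np.sinh(t)],
              [0.0, 1.0, 0.0],
              [np.sinh(t), 0.0, np.cosh(t)]]) # J-unitary: H* J H = J
X = H @ np.linalg.inv(Y)                      # chosen so that X* J X = S
A = X @ Jordan @ np.linalg.inv(X)             # J-Hermitian, with canonical form (13)

U = X @ Y                                     # (14)
T = np.linalg.inv(Y) @ Jordan @ Y             # (14): quasidiagonal factor
```

In this instance U coincides with H, so $U^*JU = J$, and T consists of a nondiagonal 2 × 2 block followed by the 1 × 1 block μ. The 2 × 2 block contains a Jordan block of order 2 for the real eigenvalue λ, which is the largest partial multiplicity Theorem 5.1 allows.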

6. Conclusion

In this paper we have introduced the hyperbolic Schur decomposition of a square matrix with respect to the scalar product induced by J = diag(±1). We have shown that all diagonalizable matrices have such a decomposition, which means that the set of matrices that do not have a hyperbolic Schur decomposition is a subset of the set of nondiagonalizable matrices, which is a set of measure zero (this follows from [12, Section 2.4.7]). We have also given examples of matrices that do not have such a decomposition.

By its design, the hyperbolic Schur decomposition preserves structures of the structured matrices, albeit with respect to a somewhat changed (symmetrically permuted) J, denoted by $\tilde{J}$, which is a common property of hyperbolic decompositions. We have analyzed the properties of such matrices, as well as the properties of atomic (indecomposable) blocks on the diagonal of the quasitriangular factor T.

In Example 3.8 we have shown that there exist J-Hermitian (and, therefore, J-normal) matrices that do not have a hyperbolic Schur decomposition, even in spaces as simple as the Minkowski space. In Example 3.9 we have provided an example of a J-unitary matrix for which there is no hyperbolic Schur decomposition, also showing that not all hyperexchange matrices have such a decomposition (since J-unitary matrices are a special case). In Section 5, we have given sufficient and necessary conditions under which a J-Hermitian matrix has a hyperbolic Schur decomposition.

We have shown that those structured matrices that do have a hyperbolic Schur decomposition also have desirable properties. The only exception to this rule are general J-normal matrices (which are not "nicer", i.e., neither J-Hermitian nor hyperexchange), for which block triangularity does not imply block diagonality. It remains to be researched if the block triangular ones can always be block diagonalized via J-orthonormal similarities (i.e., by a hyperbolic Schur decomposition), as always happens in the Euclidean case. Interestingly enough, all important subclasses of J-normal matrices maintain the block diagonality of the factor T.

Another subject that remains to be researched is the algorithm to calculate such a decomposition. The first thing that might come to mind here is the QR algorithm used for the Euclidean Schur decomposition. However, this would not work, at least not as directly as one might hope.

Firstly, every nonsingular matrix has a hyperbolic QR factorization, as shown by [19, Theorem 5.3]. But, as shown in the examples in this paper, not all such matrices have a hyperbolic Schur decomposition (see Example 3.5 for λ ≠ 0, Example 3.6 for λ1, λ2 ≠ 0 and Example 3.9). This means that any algorithm employing the hyperbolic QR factorization to calculate the hyperbolic Schur decomposition will diverge for such matrices.

Secondly, many singular matrices have a hyperbolic Schur decomposition, while it is unclear if they also have a hyperbolic QR factorization, since [19, Theorem 5.3] covers only the matrices A such that $A^*JA$ is of full rank. This means that it may be possible for some matrices to have a hyperbolic Schur decomposition which is uncomputable via the hyperbolic QR factorization.

Acknowledgments

Most of the work on this paper was done at the School of Mathematics, University of Manchester, where I was invited as a research visitor by Françoise Tisseur, whom I thank dearly for the opportunity and a great working experience as well as many suggestions that helped me with this work. The paper was also proofread by Nataša Strabić, whom I thank for all the suggestions that have considerably improved the paper. I would also like to thank the anonymous referee at the LAA, who made a special impact on this paper and without whose suggestions Section 5 would not exist.

References

[1] G. Ammar, C. Mehl, and V. Mehrmann. Schur-like forms for matrix Lie groups, Lie algebras and Jordan algebras. Linear Algebra Appl., 287(1-3):11–39, 1999.

[2] D. S. Bernstein. Matrix Mathematics: Theory, Facts, and Formulas with Application to Linear Systems Theory. Princeton University Press, Princeton, NJ, USA, 2005.

[3] Y. Bolshakov, C. V. M. van der Mee, A. Ran, B. Reichstein, and L. Rodman. Extension of isometries in finite-dimensional indefinite scalar product spaces and polar decompositions. SIAM J. Matrix Anal. Appl., 18(3):752–774, July 1997.

[4] P. Davies and N. Higham. A Schur-Parlett algorithm for computing matrix functions. SIAM J. Matrix Anal. Appl., 25(2):464–485, 2006.

[5] I. Gohberg, P. Lancaster, and L. Rodman. Matrix polynomials. Classics in applied mathematics. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1982.

[6] I. Gohberg, P. Lancaster, and L. Rodman. Indefinite Linear Algebra and Applications. Birkhäuser, Basel, Switzerland, 2005.

[7] G. Golub and C. F. Van Loan. Matrix computations. 3rd ed. Johns Hopkins University Press, Baltimore, MD, USA, 1996.

[8] E. Grimme, D. Sorensen, and P. Van Dooren. Model reduction of state space systems via an implicitly restarted Lanczos method. Numer. Algorithms, 12(1-2):1–31, 1996.

[9] S. Hassi. A Singular Value Decomposition of Matrices in a Space with an Indefinite Scalar Product. Series A, I mathematica, dissertationes no. 79, Annales Academiæ Scientiarum Fennicæ, Helsinki, 1990.

[10] N. J. Higham. J-orthogonal matrices: Properties and generation. SIAM Rev., 45(3):504–519, 2003.

[11] N. J. Higham. Functions of matrices. Theory and computation. Society for Industrial and Applied Mathematics, 2008.

[12] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, UK, second edition, 2013.

[13] A. Kılıçman and Z. A. Zhour. The representation and approximation for the weighted Minkowski inverse in Minkowski space. Math. Comput. Modelling, 47(3-4):363–371, 2008.

[14] B. C. Levy. A note on the hyperbolic singular value decomposition. Linear Algebra Appl., 277(1–3):135–142, 1998.

[15] W.-W. Lin, V. Mehrmann, and H. Xu. Canonical forms for Hamiltonian and symplectic matrices and pencils. Linear Algebra Appl., 302-303:469–533, 1999.

[16] D. S. Mackey, N. Mackey, and F. Tisseur. Structured factorizations in scalar product spaces. SIAM J. Matrix Anal. Appl., 27(3):821–850, 2006.

[17] R. Onn, A. O. Steinhardt, and A. Bojanczyk. The hyperbolic singular value decomposition and applications. Applied mathematics and computing, Trans. 8th Army Conf., Ithaca/NY (USA) 1990, ARO Rep. 91–1, 93–108, 1991.

[18] G. Sewell. Computational Methods of Linear Algebra, Second Edition. Wiley, 2005.

[19] S. Singer. Indefinite QR factorization. BIT, 46(1):141–161, 2006.

[20] V. Šego. Two-sided hyperbolic SVD. Linear Algebra Appl., 433(7):1265–1275, 2010.

[21] V. Šego. The hyperbolic Schur decomposition (extended), Oct 2013. http://eprints.ma.man.ac.uk/2026/.

[22] H. Xu. An SVD-like matrix decomposition and its applications. Linear Algebra Appl., 368:1–24, 2003.

[23] H. Xu. A numerical method for computing an SVD-like decomposition. SIAM J. Matrix Anal. Appl., 26(4):1058–1082, 2005.

[24] H. Zha. A note on the existence of the hyperbolic singular value decomposition. Linear Algebra Appl., 240:199–205, 1996.