<<

arXiv:2103.02086v1 [math.NA] 2 Mar 2021 References h vrl loih n ueia eut 26 results numerical and algorithm overall The 7 structure Jordan numerical the Computing 6 optn tics ietiltwt nw tutr 13 11 structure known a with eigentriplet staircase a sensitivity Computing its and 5 eigentriplet staircase numerical The eigentriplet 4 staircase a of Regularity 3 Preliminaries 2 Introduction 1 Contents . ueia eut ...... results Numerical algorithm overall 7.2 The 7.1 . ueia xmlsfrcmuigtesaraefr . . . . . form . . staircase . the . . . computing decomposition . for . Jordan examples . to . Numerical . form . staircase . eigentriplet a 5.4 staircase Converting . a for refinement eigentriplet 5.3 Iterative staircase initial the 5.2 Computing 5.1 . h iia oyoil...... stratification . bundle . . . . and . reduction . Hessenberg . Minimal via . polynomials minimal . minimal 6.4 The via . structure . Jordan 6.3 The . polynomial minimal 6.2 The 6.1 . h tics om...... Form . Canonical . . . Jordan . . numerical . the . . . of . . notion . The . . form . . staircase 2.4 . The characteristics . Weyr and . 2.3 Segre terminology and 2.2 Notation 2.1 optn h odnCnnclForm Canonical Jordan the Computing hngn Zeng Zhonggang ueia Method Numerical A for and inYe Li Tien-Yien 15 ...... 16 16 ...... 14 ...... 25 ...... 21 ...... 20 ...... 5 ...... 4 ...... 3 . . . . 27 . . . 26 . . . 18 . . . 4 . . 29 18 7 3 1 i A numerical method for computing the Jordan

Zhonggang Zeng∗ Tien-Yien Li†

March 4, 2021

Abstract

The Jordan Canonical Form of a matrix is highly sensitive to perturbations, and its nu- merical computation remains a formidable challenge. This paper presents a regularization theory that establishes a well-posed least squares problem of finding the nearest staircase decomposition in the matrix bundle of the highest codimension. A two-staged algorithm is developed for computing the numerical Jordan Canonical Form. At the first stage, the method calculates the Jordan structure of the matrix and an initial approximation to the multiple eigenvalues. The staircase decomposition is then constructed by an iterative algorithm at the second stage. As a result, the numerical Jordan Canonical decomposition along with multiple eigenvalues can be computed with high accuracy even if the underlying matrix is perturbed. keywords Jordan canonical form, eigenvalue, staircase form,

1 Introduction This paper presents an algorithm and a regularization theory for computing the Jordan Canon- ical Form accurately even if the matrix is perturbed.

The existence of the Jordan Canonical Form is one of the fundamental theorems in as an indispensable tool in matrix theory and beyond. In practical applications, however, it is well documented that the Jordan Canonical Form is extremely difficult, if not impossible, for numerical computation [3, p.25], [5, p.52], [8, p.189], [9, p.165], [13, p.146], [23, p.371], [26, p.132], [44, p.22]. In short, as remarked in a celebrated survey article by Moler and Van Loan [38]: “The difficulty is that the JCF cannot be computed using floating point arithmetic. A single error may cause some multiple eigenvalue to become distinct or vise versa, altering the entire structure of J and P .”

Indeed, defective multiple eigenvalues in a non-trivial Jordan Canonical Form degrade to clus- ters of simple eigenvalues in practical numerical computation. A main theme of the early

∗Department of Mathematics, Northeastern Illinois University, Chicago, IL 60625 (email: [email protected]). Research supported in part by NSF under Grant DMS-0412003. †Department of Mathematics, Michigan State University, East Lansing, MI 48824 (email: [email protected]). Research supported in part by NSF under Grant DMS-0411165.

1 attempts for numerical computation of the Jordan Canonical Form is to locate a multiple eigenvalue as the mean of a cluster that is selected from eigenvalues computed by QR algo- rithm and, when it succeeds, the Jordan structure may be determined by computing a staircase form at the multiple eigenvalue. This approach includes works of Kublanovskaya [33] (1966), Ruhe [41] (1970), Sdridhar et al [43] (1973), and culminated in Golub and Wilkinson’s review [24] (1976) as well as K˚agstr¨om and Ruhe’s JNF [29, 30] (1980). Theoretical issues have been analyzed in, e.g. [11, 12, 48, 49].

However, the absence of a reliable method for identifying the proper cluster renders a major difficulty for this approach. Even if the correct cluster can be identified, its arithmetic mean may not be sufficiently accurate for identifying the Jordan structure, as shown in Example 1 (§4). While improvements have been made steadily [6, 37], a qualitative approach is proposed in [8], and a partial canonical form computation is studied in [31], “attempts to compute the Jordan canonical form of a matrix have not been very successful” as commented by Stewart in [44, p. 22].

A related development is to find a well-conditioned matrix G such that G-1AG is block [23, §7.6.3]. Gu proved this approach is NP-hard [25], with a suggestion that “it is still possible that there are algorithms that can solve most practical cases” for the problem. Another closely related problem is the computation of the Kronecker Canonical Form for a matrix pencil A λB (see [15, 16, 17, 28]). For a given Jordan structure, a minimization − method is proposed in [36] to find the nearest matrix with the same Jordan structure.

Multiple eigenvalues are multiple roots of the characteristic polynomial of the underlying ma- trix. There is a perceived barrier of “attainable accuracy” associated with multiple zeros of algebraic equations which, in terms of number of digits, is the larger one between data error and machine divided by the multiplicity [39, 50, 53]. Thus, as mentioned above, ac- curate computation of multiple eigenvalues remains a major obstacle of computing the Jordan Canonical Form using floating point arithmetic. Recently, a substantial progress has been achieved in computing multiple roots of polynomials. An algorithm is developed in [53] along with a software package [52] that consistently determines multiple roots and their multiplicity structures of a polynomial with remarkable accuracy without using multiprecision arithmetic even if the polynomial is perturbed. The method and results realized Kahan’s observation in 1972 that multiple roots are well behaved under perturbation when the multiplicity structure is preserved [32].

Similar to the methodology in [53], we propose a two-stage algorithm in this paper for com- puting the numerical Jordan Canonical Form. To begin, we first find the Jordan structure in terms of the Segre/Weyr characteristics at each distinct eigenvalue. With this structure as a constraint, the problem of computing the Jordan Canonical Form is reformulated as a least squares problem. We then iteratively determine the accurate eigenvalues and a staircase decomposition, and the Jordan decomposition can follow as an option.

We must emphasize the numerical aspect of our algorithm that focuses on computing the numerical Jordan Canonical Form of inexact matrices. The exact Jordan Canonical Form of a matrix with exact data may be obtainable in many cases using symbolic computation (see, e.g. [10, 21, 22, 35]). Due to ill-posedness of the Jordan Canonical Form, however, symbolic computation may not be suitable for applications where matrices will most likely be perturbed

2 in practice. For those applications, we must formulate the notion of the numerical Jordan Canonical Form that is structurally invariant under small data perturbation, and continuous in a neighborhood of the matrix with the exact Jordan Canonical Form in question.

More precisely, matrices sharing a particular Jordan structure form a matrix bundle, or, a manifold. For a given matrix A, we compute the exact Jordan Canonical Form of the nearest matrix A˜ in a bundle Π of the highest co- within a neighborhood of A. Under this formulation, computing the numerical Jordan Canonical Form of A should be a well-posed problem when A is sufficiently close to bundle Π. In other words, under perturbation of sufficiently small magnitudes, the deviation of the numerical Jordan Canonical Form is tiny with the structure intact.

The main results of this paper can be summarized as follows. In §3, we formulate a system of quadratic equations that uniquely determines a local staircase decomposition at a multiple eigenvalue from a given Jordan structure. Regularity theorems (Theorem 1 and Theorem 2) in this section establish the well-posedness of the staircase decomposition that ensures accurate computation of multiple eigenvalues. Based on this regularity, the numerical unitary-staircase eigentriplets is formulated in §4, along with the backward error measurement and a proposed .

In §5, we present an iterative algorithm for computing the well-posed unitary-staircase eigen- triplet assuming the Jordan structure is given. The algorithm employs the Gauss-Newton it- eration whose local convergence is a result of the regularity theorems given in §3. The method itself can be used as a stand-alone algorithm for calculating the nearest staircase/Jordan de- composition of a given structure, as demonstrated via numerical examples in §5.4.

The algorithm in §5 requires a priori knowledge of the Jordan structure, which can be com- puted by an algorithm we propose in §6. The algorithm employs a special purpose Hessenberg reduction and a -revealing mechanism that produces the sequence of minimal polynomials. Critically important in our algorithm is the application of the recently established robust mul- tiple root algorithm [53] to those numerically computed minimal polynomials in determining the Jordan structure as well as an initial approximation of the multiple eigenvalues, providing the crucial input items needed in the staircase algorithm in §5. In §7, we summarize the overall algorithm and present numerical results

2 Preliminaries 2.1 Notation and terminology Throughout this paper, matrices are denoted by upper case letters A, B, etc., and O denotes a with known dimensions. Vectors are in columns and represented by lower case boldface letters like u, v and x. A zero vector is denoted by 0, or 0n to emphasize the dimension. The notation ( )⊤ represents the of a matrix or a vector ( ), and ( )H · · · is its (or ). The fields of real and complex numbers are denoted by R and C respectively.

For any matrix B Cm×n, the rank, nullity, range and of B are denoted by rank B , ∈ nullity B , ( B ) and ( B ) respectively. The n n is In, or simply R K ×  I when its size is clear. The column vectors of I are canonical vectors e , , en. A  1 · · ·

3 matrix U Cm×n is said to be unitary if U HU = I. A matrix V Cm×(n-m) is called a ∈ ∈ unitary complement of U if [U, V ] is a square unitary matrix. Subspaces of Cn are denoted by calligraphic letters , , with dimensions dim , dim , etc., and X Y X Y ⊥ stands for the orthogonal complement of subspace . The set of distinct eigenvalues of X X   A is the spectrum of A and is denoted by Λ A .

2.2 Segre and Weyr characteristics  The Jordan structure and the corresponding staircase structure (see §2.3) of an eigenvalue can be characterized by the Segre characteristic and the Weyr characteristic respectively. These two characteristics are conjugate partitions of the algebraic multiplicity of the underlying eigenvalue. Here, a sequence of nonnegative k k is called a partition of { 1 ≥ 2 ≥···} a positive k if k = k +k + . For such a partition, sequence lj = max i ki j , 1 2 · · · ≥ j = 1, 2, is called the conjugate partition of k1, k2, . For example,n [3, 2, 2, 1]o is · · · { ···} a partition of 8 with conjugate [4, 3, 1] and vice versa.

Let λ be an eigenvalue of A with an algebraic multiplicity m corresponding to elementary Jordan blocks of orders n n nl > 0. The infinite sequence n , ,nl, 0, 0, is 1 ≥ 2 ≥···≥ { 1 · · · ···} called the Segre characteristic of A associated with λ. The Segre characteristic forms a partition of the algebraic multiplicity m of λ. Its conjugate partition is called the Weyr characteristic of A associated with λ. We also take the Weyr characteristic as an infinite sequence for convenience. The nonzero part of such sequences will be called the nonzero Segre/Weyr characteristics. Let A be an n n matrix with Weyr characteristic × m m associated with an eigenvalue λ. Then [14, Definition 3.6 and Lemma { 1 ≥ 2 ≥ ···} 3.2], for j = 1, 2, , · · · j j-1 mj = nullity (A λI) nullity (A λI) − − − which immediately implies the uniqueness of the two charact eristics and their invariance under unitary similarity transformations, since the rank of (P AP -1 λI)j = P (A λI)jP -1 is the − − same as the rank of (A λI)j for j = 1, 2, . In particular, both characteristics are − · · · invariant under Hessenberg reduction [23, §7.4.3].

2.3 The staircase form Discovered by Kublanovskaya [33], a matrix is associated with a staircase form given below.

Cn×n k Lemma 1 Let A be a matrix with nonzero Weyr characteristic mj j=1 associated ∈ { n×m} with an m-fold eigenvalue λ. For consecutive j = 1, , k, let Yj C j be a matrix j · · · ∈ satisfying Y , ,Yj = (A λI) . Then Y , ,Yk is of full rank and R 1 · · · K − 1 · · ·      A Y , ,Yk = Y , ,Yk (λIm + S) (1) 1 · · · 1 · · ·     m1 m2 ··· mk O S12 ··· S1k m1  .. .. .  . where S = . . . . (2)    ..   . S -  mk-1  k 1,k   O  k   m

Furthermore, all super-diagonal blocks S ,S , ,Sk- ,k are matrices of full rank. 12 23 · · · 1

4 Proof. Equation (1) and the existence of S in (2) can be proved by a straightforward l verfication using ( Yl )= (A λI) for l = 1, , k. From (2), we have R K − · · · l-1 l-2  l-2 (A λI) Yl = (A λI) Y1S1,l + + Yl-1Sl-1,l = (A λI) Yl-1Sl-1,l. − − · · · −   ml-1×ml This implies Sl-1,l C is of full rank since Sl-1,l z = 0 with z = 0 will lead to l∈-1 6 Yl z (A λI) = Y1, ,Yl-1 , contradicting to the of ∈ K − R · · · columns of Y , ,Yl . ✷ 1 · · ·      The matrix λmI + S in (1) is called a local staircase form of A associated with λ. The matrix S is called a staircase associated with eigenvalue λ. Writing Y = Y , ,Yk , we call the array (λ, Y, S) as in (1) a staircase eigentriplet of A associated 1 · · · with Weyr characteristic m m . It is called unitary-staircase eigentriplet   { 1 ≥ 2 ≥ ···} of A if Y is a unitary matrix. The unitary-staircase form is often preferable to Jordan Canonical Form itself since the columns of Y = Y , ,Yk in (1) form an orthonormal 1 · · · for the of A associated with λ.   n×n Let Λ A = λ , , λl . Lemma 1 lead to the existence of a unitary matrix U C sat- { 1 · · · } ∈ isfying [24, 33, 41] 

λ1I + S1 T12 ··· T1l .. . H λ I S . A = UTU , where T =  2 + 2 .  (3) . .. T  l-1,l   λlI + Sl    The matrix T in (3) is called a staircase form of A and the matrix factoring UTU H is called a unitary-staircase decomposition of A. A staircase decomposition of matrix A can be converted to Jordan decomposition via a series of similarity transformations [24, 30, 33, 41].

2.4 The notion of the numerical Jordan Canonical Form

Corresponding to a fixed set of k integer partitions ni ni for i = 1, k with { 1 ≥ 2 ≥···} · · · i,j nij = n, the collection of all n n matrices with k distinct eigenvalues associated ∞ × with Segre characteristics nij for i = 1, , k forms a manifold, known as a matrix P { }j=1 · · · bundle originated by A. I. Arnold [1]. This bundle has a codimension that can be represented in terms of Segre/Wyre characteristics [1, 14]

k ∞ k ∞ 2 1+ (2j 1)nij 1+ m . (4) i=1 − j=1 − ≡ i=1 − j=1 ij     ∞P P P P where mij for 1 i k are corresponding Weyr characteristics. When a matrix { }j=1 ≤ ≤ A belongs to such a bundle, it can also be in the closure of many bundles with respect to different Segre characteristics. In other words, a matrix with certain Jordan structure can be arbitrarily close to matrices with other Jordan structures. For example, matrix deformations

λ 1 λ 1 λ 1       λ ε ε→0 λ δ→0 λ  λ δ   λ δ   λ         λ   λ   λ  (5) bundle codimention = 3 −→ bundle codimention = 7 −→ bundle codimention = 9 show that a matrix with Segre characteristic 2, 1, 1 is arbitrarily close to some matrices with { } Segre characteristic 2, 2 , which are arbitrarily near certain matrices with Segre characteristic { }

5 4 . Let ( ) denote the matrix bundle with respect to the Segre characteristics listed in { } B · ( ) and ( ) denote its closure. Then (5) suggests that ( 2, 1, 1 ) ( 2, 2 ) and · B · B { } ⊂ B { } ( 2, 2 ) ( 4 ). Extensively studied in e.g. [7, 14, 16, 17, 20], these closure relationships B { } ⊂ B { } form a hierarchy or stratification of Jordan structures that can be conveniently decoded by a covering relationship theorem by Edelman, Elmroth and K˚agstr¨om [17, Theorem 2.6]. As an example, Figure 1 lists all the Jordan structures and their closure stratification for 4 4 × matrices in different Segre characteristics and codimensions of matrix bundles.

Number of distinct eigenvalues Codim 1 2 3 4

15 {1,1,1,1}

9 {2,1,1} : 8 {1,1,1} {1} "is in the closure of"

7 {2,2}

6 {1,1}{1,1}

5 {3,1}

4 {2}{1,1} {2,1} {1}

3 {4} {1,1}{1}{1}

2 {2}{2}{3} {1}

1 {2}{1}{1}

0 {1}{1}{1}{1}

Figure 1: Stratification of Jordan structures for 4 × 4 matrices [17]. General stratification graphs can be drawn automatically using software package StratiGraph [19, 27].

Let dist A, Π = inf A B F B Π present the of a matrix A to a bundle × k − k ∈ Π. When A Cn n is near a bundle Π, say Π = ( 2, 2 ) listed in Figure 1, then ∈  B { } clearly dist A, Π dist A, ( 3, 1 ) since Π = ( 2, 2 ) ( 3, 1 ). Indeed, matrix ≥ B { } B { } ⊂ B { } A is automatically as close, or even closer, to ten other bundles of lower codimensions below   2, 2 following the hierarchy. In practical applications and numerical computation, the given { } matrix A comes with imperfect data and/or roundoff error. We must assume A = A + E with a perturbation E of small magnitude E F on the original matrix A. The main theme k k of this article is: How to compute the Jordan Canonical Form of the matrix A accuratelyb from its inexact data A. b b Suppose matrix A has an exact nontrivial Jordan Canonical Form and thus belongs to a bundle Π of codimension d, then there is a lower bound δ > 0 for the distance from A to any other bundleb Π′ of codimension d′ d. When A is the given data of A with an ≥ imperfect accuracy, that is, A = A + E with a perturbation E, then A generically resides in the bundle ( 1 , 1 , , 1 ) of codimension 0. As a result, the Jordan structureb of B { } { } · · · { } b 6 A is lost in exact computation on A for its (exact) Jordan Canonical Form. However, the original bundle Π where A belongs to has a distinct feature: it is of the highest codimension amongb all the bundles passing through the ε-neighborhood of A, as long as ε satisfies dist A, Π <ε<δ. Therefore,b to recover the desired Jordan structure of A from its empirical data A, we first identify the matrix bundle Π of the highest codimension in the  neighborhood of A, followed by determining the matrix A on Π that is closestb to A. The numerical Jordan Canonical Form of A will then be defined as the exact Jordan Canonical Form of A. In summary, the notion of the numerical Jordane Canonical Form is formulated according to the following three principles: e • Backward nearness: The numerical Jordan Canonical Form of A is the exact Jordan Canonical Form of certain matrix A within a given distance ε, namely A A F < ε. k − k • Maximum codimension: Among all matrix bundles having distance less than ε of A, matrix A lies in the bundle Π withe the highest codimension. e • Minimum distance: Matrix A is closest to A among all matrices in the bundle Π. e e Definition 1 For A Cn×n and ε> 0, let Π Cn×n be the matrix bundle such that ∈ ⊂ codim Π = max codim Π′ dist A, Π′ < ε ,     and A Π satisfying A A F = min A B F with (exact) Jordan decomposition ∈ k − k B∈Π k − k - A = XJX 1. Then J is called the numerical Jordan Canonical Form of A within ε, e - e and XJX 1 is called the numerical Jordan decomposition of A within ε. e Remark: The same three principles have been successfully applied to formulate other ill- posed problems with well-posed numerical solutions such as numerical multiple roots [53] and numerical polynomial GCD [51, 54]. In this section, we shall attempt to determine the structure of the bundle Π with the highest codimension in the neighborhood of A. The iterative algorithm EigentripletRefine developed in §5.2 is essentially used to find the matrix A in the bundle Π which is nearest to matrix A. ✷

There ise an inherent difficulty in computing the Jordan structure from inexact data and/or using floating point arithmetic. If, for instance, a matrix is near several bundles of the same codimension with almost identical , then the structure identification may not be a well determined problem. Therefore, occasional failures [18] for computing the numerical Jordan Canonical Form can not be completely eliminated.

3 Regularity of a staircase eigentriplet For an eigenvalue λ of matrix A with a fixed Weyr characteristic, the components U and S of the staircase eigentriplet in the staircase decomposition AU = U (λI + S) are not unique. We shall impose additional constraints for achieving uniqueness which is important in establishing the well-posedness of computing the numerical staircase form.

Theorem 1 Let A Cn×n and λ Λ A of multiplicity m with nonzero Weyr ∈ ∈ n characteristic m mk. Then for almost all b , , bm C there is a unitary 1 ≥ ··· ≥  1 · · · ∈

7 n×m m×m matrix U = u , , um C and a staircase nilpotent matrix S C as in (2) 1 · · · ∈ ∈ such that   AU U(λI + S) = O − H (6) u bj = 0 for every (i, j) Φλ  i ∈ where Φλ (i, j) µ - < i µ , i

and µ0 = 0, µj = m 1 + + mj, j = 1, , k. (8) · · · · · · Moreover, if there is another unitary matrix Uˆ = [uˆ1, , uˆm] and a staircase nilpotent · · · H matrix Sˆ that can substitute U and S in (6), then Sˆ = Uˆ (A λI)Uˆ where Uˆ = UD − for a D = diag α , , αm with α = = αm = 1. 1 · · · | 1| · · · | |  Remark : The second equation in (6) along ∗ 0 0 0 ∗ ∗ ∗ ∗ ∗ ∗ ∗ 0 0 ∗ ∗ ∗ ∗ ∗ with the index set Φλ in (7) means that, for ∗ ∗ ∗ 0 ∗ ∗ ∗ ∗ ∗  ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗  j = 1, , k, the matrix UjBj is lower tri- ∗ ∗ ∗∗ ∗ 0 0 ∗ ∗ · · · ∗∗∗∗∗∗ 0 ∗ ∗ angular where Uj = uµj-1+1, , uµj and  ∗∗∗∗∗∗∗ ∗ ∗  · · ·  ∗∗∗∗∗∗∗∗ 0  Bj = bµ + , , bµ . For example: Let j-1 1 · · · j   ∗∗∗∗∗∗∗∗∗  λ be an eigenvalue with Weyr characteris-       tic 4, 3, 2 , we have multiplicity m = 9, Figure 2: Index set Φλ for Weyr characteristic { } {4, 3, 2}. Every (i, j) ∈ Φλ entry is zero. [µ0,µ1,µ2,µ3] = [0, 4, 7, 9], and Φλ = (1, 2), { H (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (5, 6), (5, 7), (6, 7), (8, 9) . The matrix U [b , , b ] has zero at every } 1 · · · 9 (i, j) Φλ entry. As shown in Figure 2, those zeros are under the staircase and above the ∈ diagonal. ✷

Proof of Theorem 1. For j = 1, , k, the subspace (A λI)j (A λI)j-1 ⊥ · · · K − ∩ K − is of dimension mj. For almost all vectors bi, i = µj-1+1, ,µj, the subspace · · ·   j j-1 ⊥ ⊥ (A λI) (A λI) span bµ + , bµ K − ∩ K − ∩ j-1 2 · · · j    is of dimension one and spanned by a uµj-1+1 which is unique up to a unit constant multiple. After obtaining uµ + , uµ +l, the subspace j-1 1 · · · j-1 j j-1 ⊥ ⊥ (A λI) (A λI) span uµ + , uµ +l, bµ +l+ , bµ K − ∩ K − ∩ j-1 1 · · · j-1 j-1 2 · · · j    is of dimension one and spanned by uµj-1+l+1 which is again unique up to a unit constant multiple. Therefore, we have a unitary matrix Uj = uµ + , , uµ , whose columns j-1 1 · · · j j j-1 ⊥ satisfy the second equation in (6) and span the subspace  (A λI)  (A λI) K − H∩ K − for j = 1, , k. These unitary matrices uniquely determines Sij = U (A λI)Uj in (2). · · · i −  It is straightforward to verify (6) for U = U , ,Uk = u , , um . ✷ 1 · · · 1 · · · One of the main components of our algorithm is an iterative  refinement of the eigentriplet (λ, U, S) using the Gauss-Newton iteration. For this purpose we need to construct a system of analytic equations for the eigentriplet (λ, U, S) specified in Theorem 1. If the matrix A and the eigenvalue λ are real, it is straightforward to set up the system using equations in (6 along with the equaiton U ⊤U I = O. When either matrix A or eigenvalue − H λ is complex, however, the unitary constraint U U I = O is not analytic. One way to H − circumvent this difficulty is converting (6) and U U I = O to real equations by splitting − A and the eigentriplet (λ, U, S) into real and imaginary parts. The resulting system of real equations would be real analytic.

8 Alternatively, we developed a simple and effective strategy to overcome this difficulty by a two step approach. As an initial approximation, a staircase eigentriplet (λ, Y, S) is computed and Y does not need to be unitary. We replace the unitary constraint U HU I with a − nonsingularity requirement

1 0 ··· 0 . . . . C H Y = ∗ 1 . . (9)  . . .  ...... 0 ∗ ··· ∗  1 m×m   n via a constant matrix C = c , , cm C . The solution Y = [y , , ym] to the { 1 · · · } ⊂ 1 · · · equation (9) combined with (6) is a nonsingular matrix whose columns span the invariant subspace of A associated with λ. Then, at the second step, Y can be stably orthogonalized to U = u , , um and provide the solution of (6). If necessary, we repeat the process 1 · · · as refinement by replacing C with U in (9) along with (6) and solve for Y again using   the previous eigentriplet results as the initial iterate. In the spirit of Kahan’s well-regarded “twice is enough” observation [40, p. 110], this reorthogonalization never needs the third run.

n In general, let b1, , bm C be the set of predetermined random complex vectors as in { · · · }⊂ n Theorem 1. A second set of complex vectors c , , cm C will also be chosen to set up 1 · · · ∈ the overdetermined quadratic system (A λI)Y = Y S − H H c1, , ci yi = [0, , 0, 1] , for 1 i m (10)  · · · H · · · ≤ ≤ b yi = 0, for (i, j) Φλ   j ∈ where Φλ is defined in (7). There are η equations and ζ unknowns in (10) where m2 1 k 2 η = nm + 2 + 2 j=1 mj and ζ = 1+ nm + i

j Proof. The subspace (A λI) is of dimension µj for j = 1, , k. For each K − j · · · ⊥ l µj-1 + 1, ,µj , the subspace  (A λI) span c1, , cl-1, bl+1, , bµj is ∈ { · · · } K − ·H · · · · · of dimension one and spanned by the unique vector yl with cl yl = 1. Theirfore vectors  T  j y1, , ym are uniquely defined so that Y1, ,Yj = (A λI) for j = 1, , k · · · R · · · H K − · · · where Yi = yµi-1+1, yµi , i = 1, , k. Moreover, yi c1, , ci] = 0, , 0, 1 and H · · · · · ·   · · ·  · · · b yi = 0 for (i, j) Φλ. It is straightforward to verify A Y , ,Yk = Y , ,Yk (λI+S) j  ∈  1· · · 1 · · ·  for S being a nilpotent staircase matrix in the form of (2) with uniquely determined blocks    

S1j . +  .  = Y1, ,Yj-1 (A λI) Yj for j = 2, , k. . · · · − · · ·  Sj-1,j      Consequently, the matrix pair (Y,S) satisfying (10) exists and is unique for almost all n c , , cm, b , , bm C . 1 · · · 1 · · · ∈ We now prove the Jacobian J(λ, Y, S) of f(λ, Y, S) in (12) is of full rank at a staircase eigentriplet (λ, Y, S). The Jacobian J(λ, Y, S) can be considered a linear transformation which maps (σ, Z, T ) into Cζ, where σ C, Z Cn×m and T is a nilpotent staircase ∈ ∈ matrix of m m relative to the nonzero Weyr characteristic m1 mk . We partition × mi×mj { ≥···≥ } T with blocks Tij C in the same way as we partition S = Sk in (2) for l = k. ∈ Assume J(λ, Y, S) is rank-deficient. Then there is a triplet (σ, Z, T ) = (0,O,O) such that 6 J(λ, Y, S)[σ, Z, T ] = 0, namely

(A λI)Z = σY + ZS + Y T (15) − H c1, , ci zi = 0, i = 1, , m, (16) · · · H · · · b zi = 0, (i, j) Φλ. (17)  j ∈

Using Yj = yµ +1, , yµ and Zj = zµ +1, , zµ for j = 1, , k, we have j-1 · · · j j-1 · · · j · · ·     (A λI)Z1 = σY1 − i-1 (18) (A λI)Zi = σYi + (ZjSji + YjTji), i = 2, k.  − j=1 · · · P j+1 from (15). A simple induction using (18) leads to (A λI) Zj = O, for j = 1, , k. − · · · Namely, vectors z1, , zm all belong to the invariant subspace of A associated with λ, · · · × and thus Z = Y E holds for certain E Cm m. ∈ Also by a straightforward induction we have

l (A λI) Yl+ = Y S S Sl,l+ , for l = 1, , k 1. (19) − 1 1 12 23 · · · 1 · · · − We claim that

l (A λI) Zl = lσY1S12S23 Sl-1,l, for each l = 1, , k. (20) − · · · · · ·

10 This is true for l = 1 because of (18). Assume (20) is true for l j 1. Then by (19) ≤ − j j-1 j-1 j-1 (A λI) Zj = (A λI) (A λI)Zj = (A λI) σYj + (ZiSij + YiTij) − − − − i=1 j-1 j-1 = σ(A λI) Yj + (A λI) Zj-1Sj-1,j = jσY1S12S23 Sj-1,j − −  P · · ·  j-1 since, again, (A λI) Yi = O for i j 1. Thus (20) holds for l = 1, , k. − ≤ − · · · Since, (A λI)kY = O, hence (A λI)kZ = (A λI)kY E = O from Z = Y E. From − k − − (20), we have (A λI) Zk = kσY1S12S23 Sk-1,k = O, By Lemma 1, S12S23 Sk-1,k is − l · · · · · · l of full rank. Consequently, (A λI) Zl = O by (20), namely ( Zl ) (A λI) for − R ⊂ K − l = 1, , k. Therefore, for every i µi-1 + 1, ,µi , zi is in · · · ∈ { · · · }  l ⊥ (A λI) span c , , ci, bi+ , , bµ = 0 . K − 1 · · · 1 · · · i { }   \  for i = 1, , k, implying Z = O. The equation (15) them implies Y T = O and thus · · · T = O since Y is of full column rank. Consequently, J(λ, Y, S) is of full column rank. ✷

The component Y in the staircase eigentriplet (λ, Y, S) satisfying (10) is a unitary matrix for a particular c , , cm = Y . This will be achieved in our eigentriplet refinement process. 1 · · ·   4 The numerical staircase eigentriplet and its sensitivity

Consider an n n complex matrix A along with a fixed partition m m mk > × { 1 ≥ 2 ≥···≥ 0 of integer (multiplicity) m> 0. Let the vector f(λ, Y, S) be defined defined in } (12) with respect to fixed vectors b1, , bm and c1, , cm = y1, , ym and J( , , ) · · · × · ·× · · · · · · · is its Jacobian. An array (λ, Y, S) C Cn m Cm m is called a numerical unitary- ∈ × ×    staircase eigentriplet of A with respect to nonzero Weyr characteristic m1 m2 H { ≥ ≥···≥ mk if (λ, Y, S) satisfies J(λ, Y, S) f(λ, Y, S)= 0, a necessary condition for f( , , ) to } · · · 2 reach a local minimum at (λ, Y, S). The requirement c , , cm = y , , ym can be 1 · · · 1 · · · satisfied in our refinement algorithm that is to be elaborated in §5.2.     If A possesses a numerical unitary-staircase eigentriplet (λ, Y, S) with a small residual

ρ = AY Y (λI + S) F A F , (21) k − k k k then, letting Z be a unitary complement of Y , it is straightforward to verify that

H H λI + S Y AZ Y H H H H H Aˆ = [Y, Z] H H = Y (λI + S)Y + YY AZZ + ZZ AZZ (22) O Z AZ Z h ih i possesses (λ, Y, S) as its exact unitary-staircase eigentriplet and the distance ˆ ˆ ˆ ˆ A A F = (A A)[Y,Z] F = (A A)Y F + (A A)Z F − − − kH −H = AY Y (λI + S) + AZ (YY AZ + ZZ AZ) − F − F = AY Y (λI + S) = ρ A F − F k k is small. We now derive the well-posedness and the sensitivity measurement in a heuristic manner. Let (λ, Y, S) bean m-fold numerical unitary-staircase eigentriplet of A with residual q = f(λ, Y, S) for f defined in (12) via certain auxiliary vectors b , , bm and 1 · · · c , , cm. To analyze the effect of perturbation on matrix A, let g(A, λ, Y, S) denote the 1 · · ·

11 same vector function f(λ, Y, S) in (12) where A is now considered as a variable. When A becomes A˜ = A+E by adding a matrix E of small , denote (λ,˜ Y˜ , S˜) as a numerical unitary eigentriplet of A˜. Let us estimate the asymptotic bound of error

˜ ˜ ˜ ˜ 2 ˜ 2 ˜ 2 λ, Y, S λ, Y , S λ λ + Y Y F + S S F J K − J K 2 ≡ | − | k − k k − k ˜ ˜ ˜ q ˜ ˜ ˜ where λ, Y, S and λ, Y , S denote the vector forms of (λ, Y, S) and (λ, Y , S) respectively accordingJ to theK ruleJ given inK (13). Write g(A, λ, Y, S) = q. Since g(A,˜ λ,˜ Y˜ , S˜) is the k k2 local minimum in a neighborhood of (λ,˜ Y˜ , S˜), we have

g(A,˜ λ,˜ Y˜ , S˜) g(A,˜ λ, Y, S) EY F + q E F + q k k2 ≤ k k2 ≤ k k k k2 ≤ k k k k2 for small E F and q . Moreover, k k k k2 g(A, λ,˜ Y˜ , S˜) g(A,˜ λ,˜ Y˜ , S˜) + E Y˜ F 2 E F + q . k k2 ≤ k k2 k k ≤ k k k k2 In other words,

J(λ, Y, S)( λ, Y, S λ,˜ Y˜ , S˜ ) 2 = f(λ, Y, S) f(λ,˜ Y˜ , S˜) 2 + h.o.t. k J K − J K k k − k 2 E F + 2 q + h.o.t. ≤ k k k k2 where J( , , ) is the Jacobian of f( , , ) and h.o.t represents the higher order terms · · · · · · of E F + q . Let σ ( ) be the smallest of matrix ( ). Then k k k k2 min · · σmin J(λ, Y, S) is strictly positive by Theorem 2 and  σmin(J(λ, Y, S)) λ, Y, S λ,˜ Y˜ , S˜ J(λ, Y, S)( λ, Y, S λ,˜ Y˜ , S˜ ) J K − J K 2 ≤ J K − J K 2

2 E F + 2 q 2 + h.o.t. ≤ k k k k where h.o.t represents the higher order terms of E F . This provides an asymptotic bound k k 2 λ λ˜ λ, Y, S λ,˜ Y˜ , S˜ ( E F + q 2) , (23) | − | ≤ J K − J K 2 ≤ σmin J(λ, Y, S) k k k k

and the finite positive  -1 κ(λ, Y, S) 2 σ J(λ, Y, S) = 2 J(λ, Y, S)+ (24) ≡ min k k2 serves as a condition number of the unitary-staircase  eigentriplet that measures its sensitivity with respect to perturbations on matrix A.

Definition 2 Let (λ, Y, S) be a numerical unitary-staircase eigentriplet of A Cn×n as ∈ a regular orthogonal solution to the system f( , , ) = 0 corresponding to auxiliary vectors · · · b , , bm and c , , cm = Y in (12). Let J( , , ) be the Jacobian of f( , , ). Then 1 · · · 1 · · · · · · · · · we call κ(λ, Y, S) 2 J(λ, Y, S)+ the staircase condition number for the eigentriplet. ≡ k  k2 Remark. The arithmetic mean of an eigenvalue cluster is often used as an approximation to a multiple eigenvalue. Let λ be an m-fold eigenvalue of A with an orthonormal basis matrix Y for the invariant subspace. The perturbed matrix A + E has a cluster of eigenvalues around λ. Chatelin [9, pp.155–156] established the bound on the arithmetic mean λˆ as λˆ λ (X HY )-1 E (25) − ≤ 2k k2 for small E 2, where X is a matrix whose columns form a basis for the invariant subspace H k k H -1 of A . We call (X Y ) 2 the cluster condition number of λ. From our comput- ing experiments, the cluster condition number can be substantially larger than the staircase condition number as shown in the following example. ✷

12 Example 1 Matrix

1 0 20 93 0 71 34 6 -20 3 31 -14 0 0 -19 14 -11 0 3 -6  17 7 -32 -84 3 -69 -33 -9 21 -3 -30 17 -2 -4 15 -12 9 0 -3 9   10 5 40 247 0 193 92 17 -58 9 83 -21 -5 0 -48 27 -25 0 9 -17     -7 -3 0 -39 -3 -34 -19 -4 13 0 -15 -1 0 0 7 1 2 0 0 4     -6 0 62 307 -1 248 118 26 -77 12 106 -30 -3 4 -56 31 -29 0 12 -26   -5 -1 22 86 3 71 39 1 -22 -1 31 -18 4 0 -17 10 -9 3 -1 -7     -3 0 -5 -37 0 -29 -15 11 9 1 -13 4 0 0 8 -4 4 -6 1 1     1 0 3 26 0 22 11 4-15 0 11 0 0 0 -4 0 0 1 0 -3     12 4 -15 -9 0 -6 -1 -2 -1 -8 -1 16 -4 0 3 -8 4 3 -4 -1     -3 0 11 45 0 37 19 7 -15 1 19 -4 0 0 -8 4 -4 0 -1 -7    A =  48 16 -64 -63 0 -47 -24 -7 5 -7 -18 60 -16 0 16 -28 15 3 -3 4     16 9 10 145 0 116 55 11 -38 6 49 3 -9 -4 -26 10 -11 0 6 -11     21 8 -39 -93 3 -75 -36 -9 21 -3 -33 24 -3 -12 18 -15 12 0 -3 9     -3 0 -3-18 0-12 -6 0 3 0 -6 3 0 3 6 -3 3 0 0 0     -3 1 16 57 -3 41 17 5 -7 3 18 -11 -4 0 -11 19 -11 -3 3 -2     4 4 -10 -18 0 -12 -6 -3 0 0 -6 10 -4 -4 6 -3 6 3 0 0     15 4 -24 -27 0 -18 -7 -11 -4 -8 -7 25 -4 0 9 -17 13 12 -4 -1     1 0 3 26 0 22 11 1-15 0 11 0 0 0 -4 0 0 4 0 -3     18 4 -36 -95 0 -77 -42 -7 27 3 -38 23 -4 0 18 -15 11 -3 5 13   22 9 10 177 0 142 68 12 -53 6 62 3 -9 0 -32 11 -13 1 6 -11 

has two exact eigenvalues λ1 = 2.0 and 0.3

λ2 = 3.0 with Segre characteristics 0.2 9, 1 and 8, 2 respectively. Under 0.1 { } { } 0 round-off perturbation in the magnitude imaginary part of machine precision ( 2.2 10-16), Mat- −0.1 ≈ × −0.2 lab outputs eigenvalues in two noticeable −0.3

1.8 2 2.2 2.4 2.6 2.8 3 3.2 clusters show in Figure 3. The arithmetic real part means of the two clusters are as follows Figure 3: Eigenvalue clusters produced by Matlab

means exact eigenvalues cluster condition number left cluster: 1.99724665369002 2.000000000000000 6.50e+012 right cluster: 3.00275334630999 3.000000000000000 6.48e+012

From these results, we can see that only 3 correct digits are obtained by grouping. In con- trast, our iterative method, which will be presented in §5, converges on the two eigentriplets accurately and attains 14 correct digits on the two eigenvalues.

Computed eigenvalues 2.00000000000004 3.00000000000003 ------forward error 4.00e-15 3.02e-14 backward error 1.65e-17 5.77e-17 staircase condition number 3.45e+07 5.33e+05

The cluster condition numbers are over 6 1012 and the staircase condition numbers are × substantially smaller ( 3 107 and 5 105). From the examples we have tested, computing × × staircase eigentriplet appears to be always more accurate than grouping clusters.

5 Computing a staircase eigentriplet with a known structure In this section we present the method for computing a numerical unitary-staircase eigentriplet under the assumption that the Weyr characteristic m m is known for an m-fold { 1 ≥ 2 ≥···} eigenvalue λ that is approximated by λˆ. An algorithm for computing the required Weyr characteristic and initial approximations to the eigenvalues will be given in the next section (§6). There are two steps in calculating the staircase eigentriplet (λ, U, S): First find an initial staircase eigentriplet (λ,ˆ U,ˆ Sˆ), then the Gauss-Newton iteration is applied to refine the eigentriplet until a desired accuracy is attained.

13 The QR decomposition and its updating/downdating will be used extensively. When a row is deleted from a matrix B to form a new matrix Bˇ, finding a QR decomposition of Bˇ from an existing QR decomposition of B is called a QR downdating. Conversely, computing the QR decomposition after inserting a row called a QR updating. QR updating and downdating are standard techniques in matrix computation [23, §12.5.3] requiring O(m2) flops.

5.1 Computing the initial staircase eigentriplet When λˆ λ is available with known multiplicity m and nonzero Weyr characteristic ≈ m mk, we need an initial approximation (λ,ˆ U,ˆ Sˆ) to the solution of equations (6). 1 ≥···≥ Write U = [U1, ,Uk] with Uj = uµ +1, , uµ . From the uniqueness in Theorem 1, · · · j-1 · · · j each column u + of U + along with the j-th column of S is the unique solution to the µi j i 1   homogeneous system

(A λI) uµi +j U1S1,i+1ej UiSi,i+1ej = 0 − − −···− H u1, , uµi +j-1 uµi+j = 0 (26)  · · · H bµ +j+ , , bµ + uµ +j = 0  i 1 · · · i 1  i up to a unit multiple for i = 0, , k-1 and j = 1,  ,mi+ where S is as in (2). · · · · · · 1 Consequently, the vector zµ +j consists of components uµ +j, S ,i+ ej, ,Si,i+ ej spans the i i 1 1 · · · 1 one-dimensional kernel of the matrix H b + + b µi j 1, ··· , µi+1 Gi,j =  A − λI  −U1, ··· , −Ui , i = 0, , k 1, j = 1, ,mi-1 (27) u u H · · · − · · · " 1, ··· , µi+j-1 #   Let Qij Rij be the QR decomposition of Gi,j. Then the vector zµi+j can be computed by a simple [34] on R = Rij

set z0 as a random vector for j = 1, 2, do  H· · · (28)  solve R x = zj-1  solve Ry = x and set zj = y/ y  k k2  After zmui+j is computed from Gij = QijRij, the next vector zmui+j+1 will be computed from Gi,j+1 which comes from deleting bmui+j+1 from the top row of Gij and inserting uµi+j at the bottom. Namely, the QR decomposition of Gi,j+1 is obtained from that of Gij via a QR updating and a QR downdating.

In summary, computing the initial staircase eigentriplet (λ,ˆ U,ˆ Sˆ) is a process consisting of repeated QR updating/downdating and consecutive applications of inverse iteration (28), as outlined in the following pseudo-code. Algorithm InitialEigentriplet

Input: matrix A, Weyr char. m mk , initial eigenvalue λ = λˆ { 1 ≥···≥ } – get random vectors b , , bm and QR decomposition of A λI 1 · · · − – for i = 0, 1, , k 1 do · · · − Update the QR decomposition Gi1 = Qi1Ri1

for j = 1, 2, ,mi+1 do  · · ·  apply iteration (28) on Rij to find a numerical null vector z  ˆ ˆ  extract uˆµ +j, S ,i+ ej, , Si,i+ ej from z   i 1 1 · · · 1   get G + = Q + R + by QR downdating/updating on G = Q R   i,j 1 i,j 1 i,j 1 ij ij ij   14 Output Uˆ, Sˆ

Remark: Computing a staircase form from a given eigenvalue was proposed by Kublanovskaya [33] in 1968. Ruhe [41] improved the Kublanovskaya Algorithm in 1970 by employing singular value decomposition (SVD) for determining the numerical rank and kernel. Due to successive SVD computation, the original Kublanovskaya-Ruhe approach leads to an O(n4) algorithm [6] in the worst case senerio. Further improvement has been proposed in [6, 24] that reduce the complexity to O(n3). Our Algorithm InitialEigentriplet can be considered a new improvement from Kublanovskaya-Ruhe Algorithm. The novelty of our algorithm includes (a) the nullity-one homogeneous system (26); (b) employing an efficient null-vector finder (28) to replace the costly SVD; and (c) successive QR updating/downdating. As a result, Algorithm InitialEigentriplet is of complexity O(n3) and fits our specific need in satisfying the second constraint in (6). Furthermore, our computation of staircase form goes further with a refinement step using the Gauss-Newton iteration in the following section. ✷

5.2 Iterative refinement for a staircase eigentriplet The initial eigentriplet (λ,ˆ U,ˆ Sˆ) produced by Algorithm InitialEigentriplet (or by existing variations of the Kublanovskaya Algorithm) may not be accurate enough. One of the main features of our algorithm is an iterative refinement strategy for ensuring the highest achievable accuracy in computing the staircase eigentriplet. We elaborate the process in the following.

Since (λ,ˆ U,ˆ Sˆ) approximately satisfies (26), this eigentriplet is an approximate solution to (10) for c , , cm = uˆ , , uˆm = Uˆ . Using these ci’s in (10) and (12), we apply the 1 · · · 1 · · · Gauss-Newton iteration for i = 0, 1, ,    · · · + + + + qλ(i 1),Y (i 1),S(i 1)y = qλ(i),Y (i),S(i)y J λ(i),Y (i),S(i) f λ(i),Y (i),S(i) (29) −   with initial iterate qλ(0),Y (0),S(0)y = λ,ˆ U,ˆ Sˆ . Here again, λ(i),Y (i),S(i) denotes the vector form of λ(i),Y (i),S(i) accordingJ to theK rule given in (13).J K

Let (λ, Y, S) be a least squares solution of (10) with sufficiently small residual, or equivalently A is close to a matrix Aˆ having (λ, Y, S) as its exact eigentriplet. Then the Jacobian J( , , ) · · · is injective by Theorem 2, ensuring the Gauss-Newton iteration (29) to converges to (λ, Y, S) locally. This (λ, Y, S) is a numerical staircase eigentriplet, a unitary-staircase eigentriplet can be obtained by an orthogonalization and an extra step of refinement. Specifically, Let Y = UR be the “economic” QR decomposition with U = u , , um . Partitioning 1 · · · U = U1, ,Uk the same way as Y = Y1, ,Yk , it is straightforward to verify that · · · l · · ·   U1, ,Ul = (A λI) for l = 1, , k if A has λ as an exact eigenvalue R  · · ·  K − H  · · ·  with Weyr characteristic mj , and U AU λI is the corresponding nilpotent staircase   { }  H − form. Furthermore, by resetting S = U AU λI, c = b = u , , cm = bm = um, the − 1 1 1 · · · equations in (10) are satisfied including the auxilliary equations.

In actual computation with the empirical data matrix A, a small error may emerge during the reorthogonalization process. This error can easily be eliminated by one extra step of refinement via the Gauss-Newton iteration starting from the new eigentriplet (λ, U, S). Algorithm EigentripletRefine Input: Initial approximate unitary-staircase eigentriplet (λ,ˆ U,ˆ Sˆ), tolerance δ > 0

15 (0) (0) (0) – Set λ ,Y ,S = (λ,ˆ U,ˆ Sˆ), and c , , cm = Uˆ 1 · · · – For i = 1, 2, do · · ·    Solve J λ(i-1),Y (i-1),S(i-1) z = f λ(i-1),Y (i-1),S(i-1) for z Set λ(i),Y (i),S(i) = λ(i-1),Y (i-1),S(i-1) z   −   If z < δ , then set (λ, Y, S) = λ(i),Y (i),S(i) and break the loop  k k2    – Economic QR decomposition Y = UR and set S = U HAU λI  − – If R I, exit. Otherwise, set (λ,ˆ U,ˆ Sˆ) = (λ, U, S), reset b , , bm and ≈ 1 · · · c , , cm, and repeat the algorithm 1 · · · Output (λ, U, S)

(i) (i) (i) To carry out the iterative refinement (29), a QR decomposition J λ ,Y ,S = QiRi is required at every iteration step. A straightforward QR decomposition costs O (mn)3 flops,  which can be substantially reduced by taking the structure of the Jacobian into account. Using  the Weyr characteristic 3, 2, 1 as an example, the nilpotent staircase matrix S is shown { } in (14). the Jacobian J(λ, Y, S) of f(λ, Y, S) with a re-arrangement of columns and rows P1 and P2 becomes

P1J(λ, Y, S)P2 =

H C6 A − λI -s56I -s46I -s36I -s26I -s16I -y6 -[y1, ··· , y5] H C5  A − λI -s35I -s25I -s15I -y5 -[y1, y2, y3]  H b5 H  C4   A − λI -s34I -s24I -s14I -y4 -[y1, y2, y3]  H  C3   A − λI -y3   H  b3  H   C2   A − λI -y2   H  b2  H  b3  H   C1   −   A λI -y1    where the blocks Cj = c , , cj for j = 1, 2, , 6. Without loss of generality, we 1 · · · · · · can assume A is in Hessenberg form. Then J(λ, Y, S) is near triangular and requires only   O(m3n2) flops for its QR decomposition. The backward substitution requires O(m2n2) flops.

5.3 Converting a staircase form to Jordan decomposition After finding a staircase decomposition A = UTU H, the Jordan Canonical Form of A is avail- able by conjugating the Weyr characteristic. In the cases where the Jordan decomposition is in demand, a method for converting the staircase decomposition to the Jordan decomposition is proposed by Kublanovskaya [33] which primarily involves non-unitary similarity transforma- tions. Detailed procedures can also be found in [24, 30].

Alternatively, we may calculate the Jordan decomposition AX = XJ by converting unitary- staircase eigentriplets (λi,Ui,Si) to Jordan decompositions A(UiGi) = (UiGi)Ji for i = 1, , k using Kublanovskaya’s algorithm. · · · 5.4 Numerical examples for computing the staircase form Algorithm InitialEigentriplet combined with Algorithm EigentripletRefine forms a stand-alone algorithm for computing a staircase/Jordan decomposition, assuming an initial

16 approximation to a multiple eigenvalue together with its Segre/Weyr characteristics are avail- able by other means. This combination is implemented as a Matlab EigTrip. We shall present a method for computing the Jordan structure in §6.

A previous algorithm for computing a staircase form with known Segre characteristic via a min- imization process is constructed by Lippert and Edelman [36] and implemented as a Matlab module sgmin. The iteration implemented in sgmin converges in many cases, including diffi- cult test matrices such as the Frank matrix. As we shall show below, our algorithm provides a substantial improvement over sgmin particularly on cases where the cluster condition numbers (25) are large but the staircase condition numbers stay moderate. We list the comparisons on accuracy only.

Example 1’ We test both sgmin and our EigTrip on the matrix A given in Example 1 in §4, starting from eigenvalue approximation λˆ1 = 1.999 and λˆ2 = 2.999 with given Segre characteristics 9, 1 and 8, 2 respectively. The code sgmin improves the eigenvalue { } { } accuracy by one and four digits respectively. In contrast, our EigTrip obtains an accuracy near the machine precision on both eigenvalues, as shown in Table 1.

from λˆ1 =1.999 from λˆ2 =2.999 computed backward computed backward eigenvalue error eigenvalue error cluster mean 1.99724665369002 — 3.00275334630999 — sgmin 1.99991878946447 1.004e-008 2.99999991118127 6.895e-010 EigTrip 1.99999999999998 3.270e-017 3.000000000000003 4.673e-017

Table 1: Accuracy comparison for Example 1

Example 2 We construct a 50 50 matrix having known multiple eigenvalues λ = × 1.0, 2.0 and 3.0 with Segre characteristics 10, 5, 3, 2 , 8, 4, 3 and 4, 1 respectively, { } { } { } together with ten simple eigenvalues randomly generated in the box [ 3, 3] [ 3, 3]. Both (0) (0) − × − (0) sgmin and EigTrip start at initial approximations λ1 = 0.99, λ2 = 1.99, and λ3 = 2.99. The results of the iterations are listed in Table 2, in which forward errors are λj λˆj for | − | each computed eigenvalue λˆj, j = 1, 2, 3, and the backward errors are the residual (21) for each eigentriplet.

at λ = 1.0 at λ = 2.0 at λ = 3.0 forward backward forward backward forward backward error error error error error error sgmin 2.29e-008 8.46e-007 5.01e-008 9.42e-007 1.03e-009 3.15e-008 EigTrip 2.22e-016 1.16e-015 0 1.89e-016 8.88e-016 1.23e-016

Table 2: Accuracy comparison for Example 2

The results show that our algorithm is capable of calculating eigenvalues to the accuracy near machine precision (16 digits). For each approximate eigentriplet (λ, Y, S) of matrix A, the residual ρ is defined in (21). By (23), with relative distance up to ρ from A, there is a nearby matrix Aˆ for which (λ, Y, S) is an exact eigentriplet.

17 Example 3 (Frank matrix) [8, 24, 30, 37, 41, 45, 47]: This is a classical test matrix given in a Hessenberg form F = (fij), with fij = n + 1 max i, j for j i 1 and fij = 0 − { } ≥ − otherwise. Frank matrix has no multiple eigenvalues. However, its small eigenvalues are ill-conditioned measured by the standard eigenvalue condition number [23], as shown in the following table.

Eigenvalues and condition numbers of 12 12 Frank matrix Eigenvalue condition Eigenvalue condition × Eigenvalue condition 32.22889 8.5 3.51186 34.1 0.143647 611065747.8 20.19899 16.2 1.55399 1512.5 0.081228 2377632497.8 12.31108 9.0 0.64351 1371441.3 0.049507 3418376227.8 6.96153 24.1 0.28475 53007100.5 0.031028 1600156877.4

Clearly, Frank matrix is near matrices which possess multiple eigenvalues near zero with non- trivial Jordan structures. Using an initial eigenvalue estimation near zero and Segre charac- teristics 2 , 3 , 4 , 5 and 6 in consecutive tests, our refinement algorithm EigTrip { } { } { } { } { } produces five nearby matrices with an eigenvalue of multiplicity 2, 3, 4, 5, and 6 respectively, as shown in the table below.

5 nearby matrices with following features respectively given computed backward staircase cluster Segre ch. eigenvalue error condition condition sgmin 6 0.1870511240986754 6.34e-05 126.8 EigTrip {6} 0.1870509025041315 6.34e-05 5.96 { } sgmin 5 0.1076751260727581 1.90e-06 7689.2 EigTrip {5} 0.1076751114381528 1.90e-06 32.2 { } sgmin 4 0.0701182985767899 6.12e-08 291589.8 EigTrip {4} 0.0703019426541069 3.47e-08 447.4 { } sgmin 3 0.0504328996330119 4.23e-l0 3666804.6 EigTrip {3} 0.0504338685708545 4.23e-10 11322.9 { } sgmin 2 0.0305042120283680 9.87e-10 15192435.2 EigTrip {2} 0.0386493437615946 3.45e-12 458607.1 { }

In other words, Frank matrix F resides within a relative distance 3.45 10-12 from a × matrix having a double eigenvalue, or 4.23 10-10 from a matrix having a triple eigenvalue, × etc. Notice that the cluster condition numbers (25) in both cases are quite high whereas the staircase condition numbers are small. It appears that our Algorithm EigTrip substantially improves backward accuracy over sgmin, particularly when cluster condition number is large.

6 Computing the numerical Jordan structure In this section we present the theory and algorithm for computing the structure of the numerical Jordan Canonical Form represented by Segre and Weyr characteristics.

6.1 The minimal polynomial As described in many textbooks on fundamental algebra (see, e.g. [2]), given a linear operator T : on a over a field , one may view as a module over [t] V −→V V nF n-1 V F by a “” product: p(t)v p(T )v = anT (v)+ an-1T (v)+ + a1T (v)+ a0v for ≡ · · ·

18 n n n×n p(t)= ant + + a t + a [t] and v . For = C, = C , and A C being · · · 1 0 ∈ F ∈V F V ∈ the of T , we consider Cn a module over the polynomial C[t] with scalar product p(t)v p(A)v for p(t) C[t] and v . ≡ ∈ ∈V A monic polynomial p(t) C[t] is called an annihilating polynomial for v (with respect ∈ ∈V to A) if p(t)v ( p(A)v ) = 0. For a subspace Cn, if p(t)v = 0 for all v , ≡ W ⊆ ∈ W then p(t) is regarded as an annihilating polynomial for . The polynomial with least W degree among all the annihilating polynomials for v (or subspace ) is called the minimal W polynomial for v (or subspace ). Note that every annihilating polynomial for v (or W subspace ) is divisible by the minimal polynomial and obviously the minimal polynomial W for subspace is divisible by any minimal polynomial for any vector in . If the minimal W W polynomial for a vector v coincides with the minimal polynomial for then v is ∈ W W said to be a regular vector of . W By the Fundamental Structure Theorem for modules over Euclidean domain [2], Cn is a direct n sum of cyclic submodules, say C = k, where for each i = 1, , k, i is a W1 ⊕···⊕W · · · W cyclic submodule (a submodule spanned by one vector) invariant with respect to A and is isomorphic to C[t]/ (pi(t)) with pi(t) being the minimal polynomial for i. Moreover, each W pi(t) is divisible by pi+1(t) for i = 1, , k 1, that is pk(t) pk-1(t) p1(t). Here, for · · · − · · · polynomial h(t) and q(t), notation h(t) q(t) stands for “h(t) divides q(t)”.

It follows from pk(t) p1(t) that each pi(t) for i = 1, , k can be written in the form · · · · · · mi1 mil pi(t) = (t α ) (t αl) (30) − 1 · · · − for fixed α , , αl C, and m j m j mkj 0 for j = 1, , l. 1 · · · ∈ 1 ≥ 2 ≥···≥ ≥ · · · m Lemma 2 [2] For each (t αj) ij in (30) with mij > 0 where i = 1, , k and − · · · j = 1, , l, there is an elementary Jordan block Jm (αj) of order mij associated with · · · ij eigenvalue αj in the Jordan Canonical Form of A, and the Jordan Canonical Form of A consists of all such elementary Jordan blocks.

When a subspace Cn is invariant with respect to A, the linear transformation W ⊂ A induces a A˜ : Cn/ Cn/ given by A˜(v + ) = Av + . All W −→ W W W the concepts and statements on annihilating polynomials and minimal polynomials introduced above for Cn with linear map A : Cn Cn can be repeated for Cn/ with linear map → W A˜ : Cn/ Cn/ . For instance, p(t) C[t] is the minimal polynomial for subspace W → W ∈ ˜ Cn/ if p(t) is the least degree polynomial which annihilates all u˜ ˜, that is U ⊂ W ∈ U p(t)u˜ = p(A˜)u˜ = 0 for all u˜ ˜. ∈ U Cn When / is isomorphic to a vector n A˜ n ′ W C / C / space over C with isomorphism W −−−−→ W nW ′ σ : C / , then the linear map σ σ W→W A˜ : Cn/ Cn/ induces a linear W → W ′ B ′ map B = σ A˜ σ-1 : ′ ′,   ◦ ◦ W −→ W Wy −−−−→ Wy making the diagram in Figure 4 com- Figure 4: Commuting diagram mutes. That is, B σ = σ A˜. ◦ ◦ Lemma 3 For any subspace ˜ Cn/ , p(t) C[t] is the minimal polynomial for ˜ U ⊂ W ∈ U with respect to A˜ if and only if p(t) is the minimal polynomial of σ( ˜) with respect to B. U

19 Proof. B = σ A˜ σ-1 implies Bm = σ A˜m σ-1 for any integer m> 0. It follows that ◦ ◦ ◦ ◦ g(B)= σ g(A˜) σ-1 for any g(t) C[t]. Thus g(B)σ(u˜)= σ g(A˜)u˜ for u˜ ˜ and ◦ ◦ ∈ ◦ ∈ U g(B)σ(u˜) = 0 g(A˜) u˜ = 0. (31) ⇐⇒ Let p (t) be the minimal polynomial for ˜ (with respect to A˜) and p (t) be the minimal 1 U 2 polynomial for σ( ˜) (with respect to B). Then, by (31), p (A˜)u˜ = 0 implies p (B)σ(u˜)= 0 U 1 1 for all u˜ ˜. So, p (t) annihilates σ( ˜) and hence p (t) p (t). By the same argument ∈ U 1 U 2 1 p (t) p (t), and the assertion follows. ✷ 1 2

6.2 The Jordan structure via minimal polynomials By Lemma 2, the first task in finding the Jordan structure of A : Cn Cn is to identify → the minimal polynomial pi(t) for the corresponding cyclic submodules i, i = 1, , k in n W · · · C = k, followed by factorizing pi(t) in the form given in (30). We must W1 ⊕···⊕W emphasize here that accurate factorization of pi(t) in numerical computation used to be regarded as a difficult problem. However, the appearance of a newly developed numerical algorithm MultRoot [52, 53] for calculating multiple roots and their multiplicities makes this problem well-posed and solvable. Consequently the structure of the Jordan Canonical Form can be determined accurately.

n We shall begin by finding the minimal polynomial p1(t) for 1. From C = 1 k, n W W ⊕···⊕W every v C can be written in the form v = v + +vk where vi i for i = 1, , k. ∈ 1 · · · ∈W · · · Thus, by pk(t) p (t), we have p (t)v = p (A)v = p (A)v + +p (A)vk = 0, making · · · 1 1 1 1 1 · · · 1 p (t) the minimal polynomial for Cn. Meanwhile, p (t) is the minimal polynomial for 1 1 all v Cn except those v’s for which v = 0. The exceptional set is of measure zero. ∈ 1 Therefore almost every v Cn is a regular vector. In other words, vector v is regular with ∈ probability one if it is chosen at random as in §6.3.

To find minimal polynomial p (t), we choose a generic vector x Cn and check the 1 ∈ dimensions of the Krylov subspaces span x, Ax , span x, Ax, A2x , span x, Ax, A2x, A3x , consecutively to look for the first integer j where span x, Ax, , Ajx is of dimension · · ·   · · ·  j. For this j, let c′ x + c′ Ax + + c′ Ajx = 0. Obviously, c′ =0 and 0 1 · · · j  j 6 j j-1 ′ p1(t) = t + cj-1t + + c0, with ci = c /cj, i = 1, , j-1 · · · i · · · can serve as the minimal polynomial of 1. We then proceed to find the minimal polynomial W n p2(t) for 2. By the same argument given above along with the property C / 1 W ′ ′ W ≃ 2 k = , p2(t) is the minimal polynomial for (by pk(t) p1(t) as W ⊕···⊕W W ′ W · · · well as the minimal polynomial for almost all v . By Lemma 3, p2(t) is the minimal n ∈ W n n polynomial for C / 1 (with respect to the induced linear map A˜ : C / 1 C / 1), W n W → W and, with probability one, the minimal polynomial for any vector in C / 1. To derive the ˜ W j-1 ⊥ induced map A, let qj+ , , qn be an orthonormal basis for span x, Ax, , A x . { 1 · · · } · · · j-1 n Then x, Ax, , A x, qj+ , , qn forms a basis for C , and by writing v˜ = v + · · · 1 · · ·  W 1 ∈ Cn Cn Cn / 1n for any vector v , q˜j+o1, , q˜n forms a basis for / 1. For the matrix W ∈ { · · · } W representation of A˜ : Cn/ Cn/ , let W1 → W1 j-1 Aqi = c ix + c iAx + + cjiA x + cj+ ,iqj+ + + cniqn for i > j 1 2 · · · 1 1 · · · = bi + cj+ ,iqj+ + + cniqn (32) 1 1 · · ·  20 j-1 with bi = c ix + c iAx + + cjiA x . It follows that 1 2 · · · ∈W1 ˜ Aq˜i = Aq = cj+ ,iq˜j+ + + cniq˜n i 1 1 · · · and the (n j) (n j) matrix f − × −

cj+1,j+1 ··· cj+1,n . .. .  . . .  cn,j+1 ··· cn,n   n n becomes the matrix representation of the linear transformation A˜ : C / 1 C / 1 with H W → W respect to the basis q˜j+ , , q˜n . Meanwhile, by (32), cli = q Aqi, for l, i = j+1, ,n. { 1 · · · } l · · · With the matrix representation of A˜ : Cn/ Cn/ available, we may find the minimal W1 → W1 polynomial p (t) for Cn/ (with respect to A˜) by following the same procedure that 2 W1 produces minimal polynomial p1(t) for 1 (with respect to A). For instance, using W j-1 generically chosen y 1, write y = y1x + y2Ax + + yjA x + yj+1qj+1 + + ynqn and ∈W ⊤ n n-j · · · · · · consider y˜ = (yj+ , ,yn) C / ( C ). Checking the sequence of Krylov subspaces 1 · · · ∈ W1 ≃ span y˜, A˜y˜ , span y˜, A˜y˜, A˜2y˜ , consecutively. Let span y˜, A˜y˜, A˜ly˜ be the · · · · · · first one with its dimension less than the number of generating vectors. That is, the relation   l  l dl-1 l-1 d0 d0y˜ +d1A˜y˜ + +dlA˜ y˜ = 0 with dl = 0 exists, and polynomialp ˜2(t)= t + t + + · · · 6 dl · · · dl becomes the minimal polynomial for y˜. With probability one, it is the minimal polynomial for Cn/ (with respect to A˜). Therefore p (t) =p ˜ (t). W1 2 2 l-1 n Notice that the linear independence of y˜, A˜y˜, , A˜ y˜ in C / 1 implies the linear l-1 { n · · · } W l-1 independence of y˜, A˜y˜, , A˜ y˜ in C . Thus 2 = span y, Ay, , A y and { · · · j-1 } l-1 W n · · · 1 2 = span x, Ax, , A x, y, Ay, A y . In general, C /( 1 m-1) W ⊕W · · · · · ·  W ⊕···⊕W ≃ m k, for m = 2, , k, so the same process may be continued to find the minimal W ⊕···⊕W  · · · polynomial pi(t) for i, i = 3, , k. W · · · 6.3 The minimal polynomial via Hessenberg reduction In the process elaborated in the last section (§6.2), a crucial step for finding minimal poly- nomials is the determination of the dimensions of the Krylov subspaces spanned by vector sets x, Ax, , Aj-1x for j = 1, 2, . However, the condition of the Krylov ma- { · · · } · · · trix K(A, x, j) x, Ax, , Aj-1x deteriorates when j increases, making the rank ≡ · · · decision difficult. A more reliable method is developed below to decide the dimension of   span x, Ax, , Aj-1x accurately without the explicit calculation of the Krylov matrices. · · · Computing eigenvalues of a matrix A Cn×n starts with the Hessenberg reduction [23, p.344] ∈ H H Q AQ = H = h , , hn , with Q Q = I. (33) 1 · · ·   Let q , , qn be the column vectors of Q in (33). Then 1 · · · QH q , Aq , , Aj-1q = e ,QHAq , ,QHAj-1q 1 1 · · · 1 1 1 · · · 1 H H H j-1 H j-1 = e , (Q AQ)Q q , , (Q A Q)Q q = e ,He , ,H e = Rj.  1 1 · · ·  1 1  1 · · · 1  ⊤    Here e = [1, 0, , 0] . Clearly, Rj is an n j upper . Therefore 1 · · · × j-1 K(A, q , j) = q , Aq , , A q = QRj (34) 1 1 1 · · · 1   21 is a QR decomposition of the Krylov matrix K(A, q , j) = q , Aq , , Aj-1q . Further- 1 1 1 · · · 1 more, if K(A, q , j) is of full rank, then the first j columns q , , qj of Q form an 1  1 · · ·  orthonormal basis for the Krylov subspace ( K(A, q , j) ). R 1 Taking (34) into account for computing the minimal polynomial via Krylov matrices K(A, v, j) for j = 1, 2, , using randomly chosen unit vector v, the Hessenberg reduction matrix Q in · · · (33) must have v as its first column. This can be achieved by a modified Hessenberg reduction

H Find the Householder matrix T such that T v = e1 for j = 1, 2, do  · · · H  Hessenberg reduction [23, §7.4.3] step j on T AT : Obtaining Pj so (35)  H H  that the first j column block [h , , hj] of (P Pj) (T AT )(P Pj)   1 · · · 1 · · · 1 · · ·  is upper-Hessenberg      Since v = T e1, the first column of T is the same as v. The subsequent Hessenberg H reduction steps of T AT with unitary transformations P Pj does not change its first 1 · · · j 1 columns. Consequently the first j-column block [q , , qj] of T P Pk stay − 1 · · · 1 · · · the same for k j with q1 = v. Upon completing (35) for j up to n, we obtain the ≥ H Q AQ = H with a specified first column q1 = v in Q = T P1 ...Pn.

When the Krylov matrix K(A, v, j) is of full rank, then ( K(A, v, j)) = ( [q , , qj] ) R R 1 · · · and ( K(A, v, j+1)) = q , A q , , qj . Thus, the rank of K(A, v, j+1) can R R 1 1 · · · be decided by finding the numerical rank of q , A[q , , qj] during the process (35).  1 1 · · · Moreover, AQ = QH implies A q , , qj = Q h , , hj where h , , hn are 1 · · ·  1 · · ·  1 · · · columns of H. Consequently    

q , A[q , , qj] = q ,Q[h , , hj] = Q e , h , , hj . 1 1 · · · 1 1 · · · 1 1 · · ·       Therefore, the numerical rank of q , A[q , , qj ] is the same as the upper-triangular matrix 1 1 · · · e , h , , hj . We summarize this result in the following proposition. 1 1 · · ·     Proposition 3 For A Cn×n, let Q be the unitary whose first ∈ n H column is parallel to v C such that Q AQ = H = h , , hn is upper-Hessenberg. ∈ 1 · · · Assume j > 0 is the smallest integer for which Krylov matrix K(A, v, j+1) is rank-deficient,   then rank K(A, v, i) = rank e , h , , hi- for i = 2, , j. 1 1 · · · 1 · · ·    When Krylov matrices K(A, v, i) for i j are of full rank, the matrix e , h , , hj is ≤ 1 1 · · · rank-deficient in exact sense if and only if the diagonal entry h is zero since the matrix j-1,j   e , h , , hj is upper-triangular. In numerical computation, however, an upper-triangular 1 1 · · · matrix can be numerically rank deficient even though its diagonal entries are not noticeably   small, e.g., the Kahan matrix [23, p.260]. That is, hj-1,j is usually small but not near zero for e , h , , hj to be numerically rank deficient. Therefore we must apply the inverse 1 1 · · · iteration (28) to determine whether e , h , , hj is rank deficient in approximate sense.   1 1 · · ·   When the first index j is encountered with e , h , , hj being numerically rank-deficient, 1 1 · · · we can further refine the Hessenberg reduction and minimize the magnitude of the entry h +   j 1,j of H since ˆ ˆ A q , , qj q , , qj h , , hj = hj+ ,jqj+ (36) 1 · · · − 1 · · · 1 · · · 1 1      22 j should be zero, here hˆi C is the first j-entry subvector of hi for i = 1, , j. As a ∈ · · · result, the least squares solution to the overdetermined system

A q1, , qj q1, , qj hˆ1, , hˆj = 0 · · · H − · · · · · · q c1, , ci-1, ci 0, , 0, 1 = 0, for i = 1, , j,   i  · · · − · · ·  · · · (37)  q1 v = 0     −  hil = 0, for i > l+1  minimizeshj+ ,j. Here v is the predetermined random vector and c , , cj are constant 1 1 · · · vectors. Let f(q , , qj, h , , hj) be the vector mapping that represents the left side of the 1 · · · 1 · · · system (37) and J(q , , qj, h , , hj) be its Jacobian. The following proposition ensures 1 · · · 1 · · · the local convergence of the Gauss-Newton iteration in solving f(q , , qj, h , , hj) = 0 1 · · · 1 · · · for the least squares solution.

n×n Proposition 4 Let A C and f(q , , qj, h , , hj)= 0 be the vector form of the ∈ 1 · · · 1 · · · system (37). Assume q , , qj, h , , hj satisfies (37) with hi+ ,i = 0 for i = 1, , j-1. 1 · · · 1 · · · 1 6 · · · Then the Jacobian of f(q , , qj, h , , hj) is injective. 1 · · · 1 · · · n×j Proof. Differentiating the system (37), let matrix z1, , zj C and upper- j×j · · · ∈ Hessenberg matrix g , , gj C satisfy 1 · · · ∈     A z1, , zj z1, , zj hˆ1, , hˆj = q1, , qj g1, , gj · · · − · · · · · · H · · · · · · c1, , ci-1, ci zi = 0, for i = 1, , j, (38)     · · ·    · · ·  z = 0, g = 0, for i > l+1   1  il

Using an induction, we have z1 = 0 and assume z1 = = zk = 0. The equation k+1 k+1 · · · Azk = i=1(hikzi + gikqi) becomes hk+1,kzk+1 + i=1 gikqi = 0. For i = 1, , k, we have H H H · · · H gik = 0 from ci zk+1 = 0, ci qi = 1, ci ql =0 for l = i+1, , k+1. Also, ck+1zk+1 = 0 HP P · · · and c + qk+ = 1 lead to gk+ ,k =0 and zk+ = 0 since hk+ ,k = 0. Thus zi = 0 and k 1 1 1 1 1 6 gi = 0 for i = 1, , j. Namely, J(q , , qj , h , , hj) is injective. ✷ · · · 1 · · · 1 · · ·

When e1, h1, , hj is numerically rank-deficient, we set c1, , cj = q1, , qj · · · H · · · · · · and the initial iterate z(0) = qH, , qH, hH, , hH for the Gauss-Newton iteration   1 · · · j 1 · · · j    

+   z(i 1) = z(i) J z(i) +f z(i) (39) − (i) H (i) H (i) H (i) H H with z(i) = q ,  , q , h , , h , i = 0, 1, 1 · · · j j · · · j · · ·       that refines the (partial) Hessenberg reduction A q , , qj = q , , qj hˆ , , hˆj 1 · · · 1 · · · 1 · · · and minimize the magnitude of the residual h + q + that approaches zero during the j 1,j j 1     iterative refinement.

Overwrite q , , qj, h , , hj with the terminating iterate of (39) and U R be the QR 1 · · · 1 · · · 1 1 decomposition of q , , qj . Then 1 · · ·   H H1 U AU1 = ∗ 1 O A  1  with H1 being an upper-Hessenberg matrix whose characteristic polynomial is the first min- imal polynomial p1 of A. By the argument in § 6.2, the second minimal polynomial p2 of

23 A is the (first) minimal polynomial of A1. Therefore we can continue the same Hessenberg reduction-refinement strategy on A1 recursively and obtain a reduced-Hessenberg form

H 1 ··· ∗ H .. . U AU =  . .  (40) H  ℓ    where each Hi is an irreducible upper-Hessenberg matrix whose characteristic polynomial is the i-th minimial polynomial pi of A for i = 1, ,ℓ. · · · j The first minimal polynomial p1(t) = p0 + p1t + + pjt and its coefficient vector p ⊤ · · · ≡ (p ,p , ,pj) satisfies K(A, v, j+1)p = 0. From (34), v = q , and K(A, v, j) being 0 1 · · · 1 Rˆj full rank, we have K(A, v, j) = QRj = Q where Rˆj is a j j upper triangular  O  × matrix and

K(A, v, j+1) = v, AK(A, v, j) = v, AQRj = v, Q h , , hj Rˆj 1 · · · h i ˆh i h 1   i = Qe1,Q h1, , hj Rj = Q[e1,H1] ˆ . · · ·  Rj  h   i In general, to find the coefficient vector pi of the i-th minimal polynomial pi for i = 1, ,ℓ, · · · we first solve [e1,Hi] z = 0 for z and write

α α pi = , and z = .  u   v 

Then solve 1 α α = . (41) Rˆ u v  j     Algorithm MinimalPolynomials Input: A Cn×n, numerical rank threshold θ > 0 ∈ – Initialize ℓ = 0 – While n> 0 do Set unit vector v at random. Apply (35) until rank [e , h , , hj] = j θ 1 1 · · · Update ℓ = ℓ + 1    Apply the Gauss-Newton iteration (39) to refine q , , qj, h , , hj  1 · · · 1 · · ·  Obtain the QR decomposition UℓRℓ = [q , , qj]  1 · · ·  H Hℓ ∗  Obtain Uℓ AUℓ = . Overwrite A with Aℓ and update n = n-j  O Aℓ  Solve [e ,H ] z = h0 for z iand construct p by solving (41)  1 ℓ ℓ Output: minimal polynomials p , ,pℓ 1 · · ·

The sequence of minimal polynomials p (t), , pℓ(t) produced by Algorithm MinimalPoly- 1 · · · nomials are in the form

n1i nli pi(t) = (t λ ) (t λl) , i = 1, 2 ,ℓ − 1 · · · − · · · where nj nj is the Segre characteristic of A associated with λj for j = 1, , l. { 1 ≥ 2 ≥···} · · · Although the process is recursive, there is practically no loss of accuracy from Ai to Ai+1 since

24 Ai+1 is extracted as a submatrix of Ai during which only unitary similarity transformations are involved.

For each pi(t), Algorithm MultRoot in [52, 53] is applied to calculate the multiplicity structure [ mi , ,miσ ] and corresponding approximate roots ti , ,tiσ , obtaining the 1 · · · i 1 · · · i Jordan structure of matrix A.

Remark. The modified Hessenberg reduction (35) is in fact the Arnoldi process [42, p. 172-179] with Householder orthogonalization, which is the most reliable version of the Arnoldi method. We improve its robustness even further with a novel iterative refinement step (39). There are less reliable versions of the (see, e.g. [13, p.303][23, p.499]) based on Gram-Schmidt orthogonalization that may be applied to construct unitary bases for the Krylov subspaces. A method of finding minimal polynomials can alternatively be based on those versions of the Arnoldi algorithm. We choose the modified Hessenberg reduction and Gauss-Newton refinement to ensure the highest possible accuracy. ✷

6.4 Minimal polynomials and matrix bundle stratification

In fact, the process of applying Algorithm MinimalDegree on matrix sequence A1, A2, inherently calculates the Segre characteristics associated with the matrix bundle of the · · · × highest codimension. Suppose A Cn n belongs to matrix bundle defined by Segre ∈ B characteristics nj nj , for j = 1, 2, , k. As explained in §2.4, bundle is { 1 ≥ 2 ≥ ···} · · · B imbedded in the closure of a lower codimension matrix bundle, say ˜, in a hierarchy of bundle B stratification. Our algorithm actually identifies the highest codimension bundle because B of the covering relationship established in [17].

k For minimal polynomials p ,p , of A, let di = deg(pi) = nji for i = 1, , . 1 2 · · · j=1 · · · ∞ The integer sequence d d forms a partition of n. Let d˜ d˜ be { 1 ≥ 2 ≥ ···} P { 1 ≥ 2 ≥ ···} the similarly constructed sequence of minimal polynomial degrees associated with bundle ˜ B where ˜ . B⊇B Lemma 4 Suppose and ˜ are two bundles of n n matrices with ˜ and a matrix B B × B⊇B on has at least as many distinct eigenvalues as a matrix on ˜. Let d = d d B B { 1 ≥ 2 ≥···} and d˜ = d˜ d˜ be the degree sequences of minimal polynomials associated with { 1 ≥ 2 ≥ ···} and ˜ respectively. Then d and d˜ as partitions of n satisfy the dominant ordering B B relationship d˜ d, namely ≥

d˜ + d˜ + + d˜j d + d + + dj for each j = 1, 2, . (42) 1 2 · · · ≥ 1 2 · · · · · ·

Proof. By [17, Theorem 2.6], ˜ if and only if it is possible to coalesce eigenvalues and B⊇B apply the dominance ordering coin moves to the Segre characteristics which defines bundle ˜ B to reach those of . If is obtained by one dominance coin move from one Segre characteristic B B ν˜ = n˜j n˜j to ν = nj nj with other Segre characteristics unchanged, { 1 ≥ 2 ≥···} { 1 ≥ 2 ≥···} thenν>ν ˜ and therefore (42) holds.

Similarly, assume is obtained by coalescing two eigenvalues on ˜ with their Wyre charac- B B teristics combined as a union of sets, or equivalently, their Segre characteristics n˜i , n˜i , { 1 2 ···} and n˜j , n˜j , combined in a componentwise sum n˜i +n ˜j , n˜i +n ˜j , and other { 1 2 ···} { 1 1 2 2 ···} Segre characteristics unchanged (see also [17, Lemma 2.5]). Actually the equalities in (42)

25 hold in this case since the degree dk is the sum n k + n k + of the k-th components in 1 2 · · · the Segre characteristics.

Since (42) is valid for every single dominant coin move and every coalesce of eigenvalues, it holds for a sequence of such manipulations of Segre characteristics from ˜ to . ✷ B B Because of (42), we have either d = d˜ or d > d˜. If d˜ > d, there is an l > 0 such that d˜ = d , , d˜l = dl and d˜l > dl. Algorithm MinimalDegree applying on Al 1 1 · · · −1 −1 stops at dl instead of d˜l, since the search goes through degree 1, 2, and dl precedes · · · d˜l. Consequently, the highest codimension bundle is identified before ˜ with proper B B rank calculation. If d = d˜, the degrees of minimal polynomials associated with is the B same as those of ˜. For a similar reason, Algorithm MultRoot [52] extracts the highest B codimension multiplicity structure that leads to rather than B˜. Consequently, the highest B codimension bundle is identified before ˜ with proper rank calculation. Our computing B B experiment is consistent with this observation.

7 The overall algorithm and numerical results 7.1 The overall algorithm Our overall algorithm for computing the numerical Jordan Canonical Form of given matrix A Cn×n can now be summarized as follows. ∈

Stage I: Computing the Jordan Structure

Step 1 Francis QR. Apply Francis QR algorithm to obtain a H A = QTQ and approximate eigenvalues λ , , λn. 1 · · · Step 2 Deflation. For each well-conditioned simple eigenvalue λj, apply the deflation method in [4] to swap λj downward along the diagonal of T to reach

A = U B D U H  C 

where Λ C consists of all the well-conditioned eigenvalues of A. Step 3 Jordan  structure. Apply the method in §6.3 to calculate the Segre charac- teristics of B and initial estimates of the distinct eigenvalues.

Stage II: Computing the staircase/Jordan decompositions There is an option here to select either the unitary staircase decompsition A = USU H or the Jordan decomposition A = XJX-1.

Step 4(a) To compute the staircase decomposition. For each distinct eigen- value, apply Algorithm InitialEigentriplet for an initial eigentriplet using the Segre characteristic and initial eigenvalue approximation computed in the previous step. Then iteratively refine the eigentriplet by Algorithm EigentripletRefine. Continue this process to reach a unitary staircase decomposition A = USU H ul- timately. Step 4(b) To compute the Jordan decomposition. For each distinct eigenvalue λj, j = 1, , k with Segre characteristic and the initial approximate determined · · ·

26 in Step 3 above, apply the precess described in §5 to compute a unitary-staircase eigentriplet (λj,Uj,Sj). Then apply the Kublanovskaya algorithm to obtain the -1 local Jordan decomposition λjI + Sj = GjSjGj . Consequently, the Jordan -1 decomposition A = X J X with X = U G , ,UkGk is constructed. 1 1 · · ·   As mentioned before, Stage II can be considered a stand-alone algorithm for computing the staircase/Jordan form from given Weyr/Segre characteristics and initial eigenvalue approxi- mation. It can be used in conjunction with other approaches where the Jordan structure is determined by alternative means.

There are four control parameters that can be adjusted to improve the results:

1. The deflation threshold δ: If a simple eigenvalue has a condition number less than δ, it will be deflated. The default value for δ is 1000. 2. The gap threshold in rank decision γ: In determining the rank deficiency of Hj in Algorithm MinimalDegree, we calculate the smallest singular value of each Hj. If the ratio of the smallest singular values of Hj and Hj-1 is less than γ, then Hj is considered rank deficient. The default value for γ is 10−4. 3. The residual tolerance τ for MultRoot: The residual tolerance required by Mult- Root. See [53] for details. 4. The residual tolerance ρ for eigentriplet refinement: It is used to stop the iteration in refining the eigentriplet. The default value ρ = 10-8.

7.2 Numerical results We made a Matlab implementation NumJCF of our algorithm for computing the Jordan decomposition. It has been tested in comparison with the Matlab version of JNF [29] on a large number of matrices, including classical examples in the literature. Our experiment is carried out on a Dell Optiplex GX 270 personal computer with Intel Pentium 4 CPU, 2.66 GHz and 1.5 GB RAM. For a computed Jordan decomposition AX = XJ of matrix A, the residual ρ = A X X J F A F is used as one of the measures for the accuracy k − k k k Example 4 Let 

2 r − 5 − s −r + 3 s − 2 t 20 − 2 s + 2 t 15 − 2 s + 2 t 10 −5+ s − t 2 r − 5 − 2 s −r − 15 + 6 s − 4 t 50 − 4 s + 4 t 40 − 4 s + 4 t 20 −15 + 2 s − 2 t A = 0 −10 − 2 s + 2 t 10 + 4 s − 3 t 10 + 3 s − 3 t s − t −5 − s + t R6×6. 4  2 r − 5 − 2 s −r − 10 + 8 s − 7 t 50 − 8 s + 8 t 40 − 7 s + 8 t 25 − s + t −15 + 3 s − 3 t  −2 r +5+2 s r + 25 − 6 s + 5 t −65 + 4 s − 4 t −55 + 4 s − 4 t −25 + t 25 − 2 s + 2 t ∈ − −  0 5 10 10 5 5+ t    We compare our method with the conventional symbolic computation on the exact matrix. The exact eigenvalues are r, s and t with Segre characteristics 1 , 2 and 3 , { } { } { } respectively. For r = √2, s = √3 and t = √5, it takes Maple 10 nearly two hours (7172 seconds) to find the Jordan Canonical Form, while both JNF and NumJCF complete the computation instantly. On a similarly constructed matrix of size 10 10, Maple does × not finish the computation in 8 hours and Mathematica runs out of memory.

Approximating √2, √3 and √5 in machine precision 2.2×10−16, both JNF and NumJCF ≈ correctly identify the Jordan structure, whereas our NumJCF obtained the eigenvalues with 3 6 more correct digits than JNF along with smaller residual as shown in Table 3. ∼

27 computed eigenvalues residual (with correct digits in boldface and Jordan block sizes in braces) ρ JNF 1.414213563 {1} 1.732050809 {2} 2.236067975 {3} 3.38e-013 NumJCF 1.41421356237311 {1} 1.732050807574 {2} 2.23606797749971 {3} 1.01e-016

Table 3: Eigenvalues and residuals computed by JNF and NumJCF

Example 5 This is a classic test matrix that is widely used in eigenvalue computing exper- iment [8, 30, 35, 41, 46]:

1 1 1 −2 1 −1 2 −2 4 −3 −1 2 3 −4 2 −2 4 −4 8 −6 −1 0 5 −5 3 −3 6 −6 12 −9 Λ A5 Segre ch.  −1 0 3 −4 4 −4 8 −8 16 −12  −1 0 3 −6 5 −4 10 −10 20 −15 1 1 A5 = −1 0 3 −6 2 −2 12 −12 24 −18   −1 0 3 −6 2 −5 15 −13 28 −21  2 3 2  −1 0 3 −6 2 −5 12 −11 32 −24   −1 0 3 −6 2 −5 12 −14 37 −26  3 2 2  − − − − −   1 0 3 6 2 5 12 14 36 25    Using the default parameters, both JNF and our NumJCF easily obtained the accurate Jordan decomposition.

Computing results for eigentriplets of A5 eigenvalue Segre characteristic residual 1.0000000000000002 {1, 0} JNF 2.0000000000000001 {3, 2} 1.41e-15 3.0000000000000002 {2, 2} 0.9999999999999995 {1, 0} NumJCF 2.0000000000000000 {3, 2} 1.40e-16 3.0000000000000003 {2, 2}

Both JNF and NumJCF obtain similarly accurate results on classical matrices such as those in [8, pp. 192-196]. We choose to omit them and concentrate on the cases in which our NumJCF significantly improves the robustness and accuracy in comparison with JNF.

Example 6 This is a series of test matrices with a parameter t.

t 2+t -t -2 -1-3t -2 2 -1+t -t 0  1-t -1-3t 2t 2+t 2+6t 2+t -3-t 1-t 1+2t 1  2t -4t 2 4t t 3t -2t t 0 0  + + + + + +   -1 t -7t 2t 1 4t 1 10t -1 4t -1-5t 0 1 3t 1 t   +  ×  3t -4t 0 4t 3 t 4t 1-3t 2t 0 -1  R10 10 A(t) =  + +  (43)  2-3t -4 5t 0 4-4t 1-5t 6-4t -2 6t 1-t -t -2t  ∈  + + + +   -3 4t 2-3t -t -2 4t -1-3t -2 4t 6-2t -1 4t -t -1-t   +   4t -5t 0 5t t 5t 1-4t 3 3t 0 -1   + + + +   -2-3t -2 2t t -3t 1 2t -3t -3 2t -2t 4 t 3   + + + +   -3 4t 2-3t -t -2 4t -1-3t -2 4t 3-2t -1 4t -t 2-t  For every t > 0, matrix A(t) has the same Jordan Canonical Form J consisting of two eigenvalues λ =2 and λ = 3 with Segre characteristics 3, 1 and 4, 2 respectively. 1 2 { } { } Let the Jordan decomposition be A(t) = X(t) J X(t)-1. When t increases, the condition number X(t) X(t)-1 of X(t) increases rapidly. This example tests the accuracy and k k2k k2 robustness of numerical Jordan Canonical Form finders under the increasing condition number of X(t). As shown in Table 4, our algorithm maintains high backward accuracy, forward accuracy, and structure correctness, while the results of JNF deteriorate as t increases. Starting from t = 5, JNF outputs incorrect Jordan structure. When t 23, JNF outputs ≥ only one 10 10 Jordan block. Our NumJCF continues to produce accurate results. ×

28 -1 eigenvalues Segre ch. backward error kX(t)k2kX(t) k2

2.00000000000001 3,1 JNF 2.99999999999999 4,2 6.10e-015 t = 1 2.00000000000000 3,1 1113.9 NumJCF 3.00000000000000 4,2 1.11e-015 2.000000000002 3,1 JNF 2.999999999998 4,2 7.40e-014 t = 2 2.00000000000000 3,1 28894.5 NumJCF 3.00000000000000 4,2 4.87e-016 1.99999999987 3,1 JNF 3.00000000009 4,2 8.30e-012 t = 4 2.00000000000000 3,1 1658396.6 NumJCF 2.99999999999999 4,2 5.65e-016 2.0000000006 4 JNF 2.9999999996 5,1 1.36e-011 t = 5 2.00000000000001 3,1 5655648.5 NumJCF 2.99999999999999 4,2 7.60e-016 2.0000001 4 JNF 2.99999992 5,1 9.29e-010 t = 10 2.00000000000003 3,1 297244917.4 NumJCF 2.99999999999998 4,2 6.94e-016 JNF 2.6 10 1.27e-004 t = 25 1.99999999999992 3,1 60948418207.9 NumJCF 2.99999999999998 4,2 8.58e-016

Table 4: Comparison between JNF and NumJCF on matrix A(t) in (43)

Example 7 The matrices in the literature on computing Jordan Canonical Forms are usually not larger than 10 10. We construct a 100 100 real matrix × × J -1 A = X B X , h i where J is the Jordan Canonical Form of eigenvalues λ1 = 1 and λ2 = 2 with Segre characteristics 5, 4, 3, 1 and 4, 2, 2 respectively, B R80×80 and X R100×100 are { } { } ∈ ∈ random matrices with entries uniformly distributed in [ 1, 1]. There are 80 simple eigenvalues − randomly scattered around the two multiple eigenvalues. This example is designed to show that our code NumJCF may be more reliable than the approach of grouping the eigenvalue clusters in the process of identifying a multiple eigenvalue and determining the Jordan structure. We generate 1000 such matrices A with fixed J and randomly chosen B as well as X. For each matrix A, we run JNF and NumJCF twice and the results are shown in Table 5.

% of failures % of failures % of failures on both run on first run on second run JNF 41.9% 41.9% 41.9% NumJCF 0.1% 4.5% 4.6%

Table 5: Results for Example 7 on 1000 matrices.

Notice that there are several steps in our algorithm which require parameters generated at random. Consequently, failures are rarely repeated (0.1% in this case) in the subsequent runs of NumJCF. The code JNF appears to be deterministic and always repeats the same results. On the other hand, failures are verifiable in our algorithm from the residuals staircase condition numbers. One may simply run the code second time when the first run fails.

29 References [1] V. I. Arnold, On matrices depending on parameters, Russian Math. Surveys, (1971), pp. 29–43. [2] M. Artin, Algebra, Pretice-Hall, New Jersey, 1991. [3] Z. Bai, J. Demmel, J. Dongarra, A. Ruhe, and H. van der Vorst, editors, Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide, SIAM, Philadelphia, 2000. [4] Z. Bai and J. W. Demmel, On swapping diagonal blocks in real Schur form, Lin. Alg. Appl., 186 (1993), pp. 73–95. [5] S. Barnett and R. G. Cameron, Introduction to Mathematical Control Theory, Clarendon Press, Oxford, 2nd ed., 1985. [6] T. Beelen and P. V. Dooren, Computational aspects of the Jordan canonical form, in Reliable Numerical Computation, M. Cox and S. Hammerling, eds., Oxford, 1990, Clarendon Press, pp. 57–72. [7] R. Byers, C. He, and V. Mehrmann, Where is the nearest non-regular pencil?, Lin. Alg. Appl., 121 (1998), pp. 245–287. [8] F. Chaitin-Chatelin and V. Fraysse´, Lectures on Finite Precision Computations, SIAM, Philadelphia, 1996. [9] F. Chatelin, Eigenvalues of Matrices, John Wiley and Sons, New York, 1993. [10] J. M. de Olazabal´ , Unified method for determining canonical forms of a matrix, ACM SIGSAM Bulletin, 33, issue 1 (1999), pp. 6–20. [11] J. W. Demmel, A numerical analyst’s Jordan canonical form. Ph.D. Diss., Computer Sci. Div., Univ. of California, Berkeley, 1983. [12] , Computing stable eigendecompositions of matrices, Lin. Alg. and Appl., 79 (1986), pp. 163–193. [13] , Applied , SIAM, Philadelphia, 1997. [14] J. W. Demmel and A. Edelman, The dimension of matrices (matrix pencils) with given Jordan (Kronecker) canonical forms, Linear Algebra and its Applications, 230 (1995), pp. 61–87. [15] J. W. Demmel and B. K˚agstrom¨ , The generalized Schur decomposition of an arbitrary pencil A λB: robust software with error bounds and applications. Part I & Part II, ACM − Trans. Math. Software, 19 (1993), pp. 161–201. [16] A. Edelman, E. Elmroth, and B. K˚agstrom¨ , A geometric approach to perturbation theory of matrices and and matrix pencils. Part I: Versal deformations, SIAM J. Matrix Anal. Appl., 18 (1997), pp. 653–692. [17] , A geometric approach to perturbation theory of matrices and and matrix pencils. Part II: a stratification-enhanced staircase algorithm, SIAM J. Matrix Anal. Appl., 20 (1999), pp. 667–699.

30 [18] A. Edelman and Y. Ma, Staircase failures explained by orthogonal versal form, SIAM J. Matrix Anal. Appl., 21 (2000), pp. 1004–1025.

[19] E. Elmroth, P. Johansson, and B. K˚agstrom¨ , Computation and presentation of graphs displaying closure hierarchies of Jordan and Kronecker structures, Numerical Lin- ear Algebra with Applications, 8 (2001), pp. 381–399.

[20] , Bounds for the distance between nearby Jordan and Kronecker structures in a closure hierarchy, J. of Mathematical Sciences, 114 (2003), pp. 1765–1779.

[21] E. Fortuna and P. Gianni, Square-free decomposition in finite characteristic: an appli- cation to Jordan Form computation, ACM SIGSAM Bulletin, 33, issue 4 (1999), pp. 14–32.

[22] M. Giesbrecht, Nearly optimal algorithms for canonical matrix forms, SIAM J. Comp., 24 (1995), pp. 948–969.

[23] G. H. Golub and C. F. Van Loan, Matrix Computations, The John Hopkins University Press, Baltimore and London, 3rd ed., 1996.

[24] G. H. Golub and J. H. Wilkinson, Ill-conditioned eigensystems and the computation of the Jordan canonical form, SIAM Review, 18 (1976), pp. 578–619.

[25] M. Gu, Finding well-conditioned similarities to block-diagonalize nonsymmetric matrices is NP-hard, J. of Complexity, 11 (1995), pp. 377–391.

[26] R. A. Horn and C. R. Johnson, , Cambridge University Press, New York, 1985.

[27] P. Johansson, StratiGraph User’s Guide. Report UMINF 03.21, Department of Com- puting Science, Ume˚aUniversity, SE-901 87, Ume˚a, Sweden, 2003.

[28] B.K˚agstrom¨ , Singular matrix pencils (Section 8.7). In Z. Bai, J. Demmel, J. Dongarra, A. Ruhe, and H. van der Vorst, editors, Templates for the Solutions of Algebraic Eigenvalue Problems: A Practical Guide, pp 260–277, SIAM, Philadelphia, 2000.

[29] B.K˚agstrom¨ and A. Ruhe, Algorithm 560: JNF, an algorithm for numerical compu- tation of the of a complex matrix, ACM Trans. Math. Software, 6 (1980), pp. 437–443.

[30] , An algorithm for numerical computation of the Jordan normal form of a complex matrix, ACM Trans. Math. Software, 6 (1980), pp. 398–419.

[31] B.K˚agstrom¨ and P. Wiberg, Extracting partial canonical structure for large scale eigenvalue problem, Numerical Algorithms, 24 (2000), pp. 195–237.

[32] W. Kahan, Conserving confluence curbs ill-condition. Technical Report 6, Computer Science, University of California, Berkeley, 1972.

[33] V. N. Kublanovskaya, On a method of solving the complete eigenvalue problem for a degenerate matrix, USSR Computational Math. and Math. Phys., 6 (1968), pp. 1–14.

[34] T. Y. Li and Z. Zeng, A rank-revealing method with updating, downdating and applica- tions, SIAM J. Matrix Anal. Appl., 26 (2005), pp. 918–946.

31 [35] T. Y. Li, Z. Zhang, and T. Wang, Determining the structure of the Jordan normal form of a matrix by symbolic computation, Linear Algebra and its Appl., 252 (1997), pp. 221–259.

[36] R. A. Lippert and A. Edelman, Nonlinear eigenvalue problems with orthogonality constraints (Section 9.4). In Z. Bai, J. Demmel, J. Dongarra, A. Ruhe, and H. van der Vorst, editors, Templates for the Solutions of Algebraic Eigenvalue Problems: A Practical Guide, pp 290–314, SIAM, Philadelphia, 2000.

[37] , The computation and sensitivity of double eigenvalues, in Advances in computational mathematics, Lecture Notes in Pure and Appl. Math. 202, New York, 1999, Dekker, pp. 353–393.

[38] C. Moler and C. Van Loan, Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later, SIAM Review, 45 (2003), pp. 3–49.

[39] V. Y. Pan, Solving polynomial equations: some history and recent progress, SIAM Review, 39 (1997), pp. 187–220.

[40] B. N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, Englewood Cliffs, N.J., 1980.

[41] A. Ruhe, An algorithm for numerical determination of the structure of a general matrix, BIT, 10 (1970), pp. 196–216.

[42] Y. Saad, Numerical Methods for Large Eigenvalue Problems, Manchester University Press, Manchester, and Halsted Press, New York, 1992.

[43] B. Sridhar and D. Jordan, An algorithm for calculation of the Jordan Canonical Form of a matrix, Comput. & Elect. Engng., 1 (1973), pp. 239–254.

[44] G. W. Stewart, Matrix Algorithms, Volumn II, Eigensystems, SIAM, Philadelphia, 2001.

[45] L. N. Trefethen and M. Ebree, Spectra and Pseudospectra, Princeton University Press, Princeton and Oxford, 2005.

[46] J. Varah, The computation of bounds for the invariant subspaces of general matrix op- erator. Stanford Tech. Rep. CS 66, Stanford Univ., 1967.

[47] J. H. Wilkinson, The Algebraic Eigenvalue Problem, Oxford University Press, New York, 1965.

[48] , Sensitivity of eigenvalues, Utilitas Mathematica, 25 (1984), pp. 5–76.

[49] , Sensitivity of eigenvalues, II, Utilitas Mathematica, 30 (1986), pp. 243–286.

[50] T. J. Ypma, Finding a multiple zero by transformations and Newton-like methods, SIAM Review, 25 (1983), pp. 365–378.

[51] Z. Zeng, The approximate GCD of inexact polynomials, I: a univariate algorithm. to appear.

32 [52] , Algorithm 835: Multroot – a Matlab package for computing polynomial roots and multiplicities, ACM Trans. Math. Software, 30 (2004), pp. 218–235.

[53] , Computing multiple roots of inexact polynomials, Math. Comp., 74 (2005), pp. 869– 903.

[54] Z. Zeng and B. H. Dayton, The approximate GCD of inexact polynomials, II: a mul- tivariate algorithm. Proc. of ISSAC ’04, ACM Press, (2004), pp 320–327.

33