
Computing Accurate Eigensystems of Scaled Diagonally Dominant Matrices

(Appeared in SIAM J. Numer. Anal., v. 27, n. 3, pp. 762-791, 1990)

Jesse Barlow
Department of Computer Science
The Pennsylvania State University
University Park, PA 16802

James Demmel
Courant Institute
251 Mercer Str.
New York, NY 10012

Abstract

When computing eigenvalues of symmetric matrices and singular values of general matrices in finite precision arithmetic we in general only expect to compute them with an error bound proportional to the product of the machine precision and the norm of the matrix. In particular, we do not expect to compute tiny eigenvalues and singular values to high relative accuracy. There are some important classes of matrices where we can do much better, including bidiagonal matrices, scaled diagonally dominant matrices, and scaled diagonally dominant definite pencils. These classes include many graded matrices, and all symmetric positive definite matrices which can be consistently ordered (and thus all symmetric positive definite tridiagonal matrices). In particular, the singular values and eigenvalues are determined to high relative precision independent of their magnitudes, and there are algorithms to compute them this accurately. The eigenvectors are also determined more accurately than for general matrices, and may be computed more accurately as well. This work extends results of Kahan and Demmel for bidiagonal and tridiagonal matrices.

Keywords: Graded matrices, singular value decomposition, symmetric eigenproblem, perturbation theory, error analysis

AMS(MOS) subject classifications: 65F15, 15A60

Acknowledgements: The first author was supported by the Air Force Office of Scientific Research under grant no. AFOSR-88-0161 and the Office of Naval Research under grant no. N00014-80-0517. The second author was supported by the National Science Foundation (grants NSF-DCR-8552474 and NSF-ASC-8715728). Part of this work was performed while the first author was visiting the Courant Institute and the second author was visiting the IBM Bergen Scientific Center.


1. Introduction

When computing the eigenvalues of symmetric matrices and singular values of general matrices in finite precision arithmetic one generally only expects to compute them with an error bound $f(n)\varepsilon\|A\|$, where $f(n)$ is a modestly growing function of the matrix dimension $n$, $\varepsilon$ is the machine precision, and $\|A\|$ is the 2-norm of the matrix $A$. This follows as a result of standard theorems which state:

(1.1) A perturbation $\delta A$ in the matrix $A$ cannot change its eigenvalues (singular values) by more than $\|\delta A\|$ [12].

(1.2) The standard algorithm for computing eigenvalues (singular values) of $A$ computes the exact eigenvalues (singular values) of $A+\delta A$, $\|\delta A\| \le f(n)\varepsilon\|A\|$, where $f(n)$ is a modestly growing function of $n$ and $\varepsilon$ is the machine precision [12].

These error bounds imply that tiny eigenvalues and singular values (tiny compared to $\|A\|$) cannot generally be computed to high relative accuracy, since the error bound $f(n)\varepsilon\|A\|$ may be much larger than the desired quantity. In fact, if each matrix entry is uncertain in its least significant digits, the tiny eigenvalues and singular values may not even be determined accurately by the data.

Sometimes, however, the eigenvalues and singular values are determined much more accurately than error bounds like $f(n)\varepsilon\|A\|$ would indicate. This was shown for singular values of bidiagonal matrices in [9], where it was proven that small relative perturbations in the bidiagonal entries only cause small relative perturbations in the singular values, independent of their magnitudes. It was also shown how to compute all the singular values to high relative accuracy. In this paper we extend these results to eigenvalues of symmetric scaled diagonally dominant matrices and scaled diagonally dominant definite pencils. (Henceforth we will abbreviate "scaled diagonally dominant" by s.d.d.) A symmetric s.d.d. matrix is any matrix of the form $\Delta A \Delta$, where $A$ is symmetric and diagonally dominant in the usual sense, and $\Delta$ is an arbitrary nonsingular diagonal matrix. A pencil $H - \lambda M$ is s.d.d. definite if $H$ and $M$ are symmetric s.d.d. and $M$ is positive definite. Examples of s.d.d. matrices are the "graded" matrices

$$A_0 = \begin{pmatrix} 10 & 10 & & & \\ 10 & 10^2 & 10^2 & & \\ & 10^2 & 10^3 & 10^3 & \\ & & 10^3 & 10^4 & 10^4 \\ & & & 10^4 & 10^5 \end{pmatrix} \qquad\text{and}\qquad A_1 = \begin{pmatrix} 1 & 10 & & & \\ 10 & -10^3 & 10^4 & & \\ & 10^4 & 10^6 & 10^4 & \\ & & 10^4 & -10^3 & 10 \\ & & & 10 & 1 \end{pmatrix}.$$

Note that $A_0$ is graded in the usual sense, but not diagonally dominant in the usual sense. $A_1$ is neither diagonally dominant in the usual sense, nor graded in the usual sense, since the diagonal entries are positive and negative, and not sorted. Thus we see that the usual diagonal dominance implies being s.d.d., but not the converse. In fact, the set of s.d.d. matrices includes all symmetric positive definite matrices which can be consistently ordered, a class which includes all symmetric positive definite tridiagonal matrices. Dense matrices may be s.d.d. as well.

Another example arises from modeling a series of masses $m_1, \ldots, m_n$ on a line connected by simple, linear springs with spring constants $k_0, \ldots, k_n$ (the ends of the extreme springs are fixed). The natural frequencies of vibration of this system are the square roots of the eigenvalues of the s.d.d. definite pencil $H - \lambda M$, where $M$ is the diagonal mass matrix $\mathrm{diag}(m_1, \ldots, m_n)$ and $H$ is the tridiagonal stiffness matrix with diagonal $k_0 + k_1,\ k_1 + k_2,\ \ldots,\ k_{n-1} + k_n$ and offdiagonal $-k_1, \ldots, -k_{n-1}$. Note that the matrix $M^{-1/2} H M^{-1/2}$, which has the same eigenvalues as $H - \lambda M$, is symmetric s.d.d.

In particular, we will show that small relative perturbations in the entries of an s.d.d. matrix only cause small relative perturbations in the eigenvalues and singular values, independent of their magnitudes. This is a much tighter perturbation bound than the classical one provided by (1.1) above. Our proof of this result generalizes and unifies results in [9] for bidiagonal matrices alone and in [14] for symmetric tridiagonal s.d.d. matrices alone.
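As an illustration, the following Python sketch (ours, not from the original paper; the masses and spring constants are made-up values) builds this mass-spring pencil and verifies that $M^{-1/2} H M^{-1/2}$ has the same eigenvalues as $H - \lambda M$:

```python
import numpy as np

# Hypothetical masses and spring constants (n masses, n+1 springs).
m = np.array([1.0, 10.0, 100.0])       # m_1 .. m_n
k = np.array([1.0, 0.1, 0.01, 0.001])  # k_0 .. k_n

M = np.diag(m)                          # diagonal mass matrix
H = np.diag(k[:-1] + k[1:])             # stiffness diagonal: k_{i-1} + k_i
H += np.diag(-k[1:-1], 1) + np.diag(-k[1:-1], -1)  # offdiagonal: -k_i

# Eigenvalues of the pencil H - lambda*M ...
pencil_eigs = np.sort(np.linalg.eigvals(np.linalg.solve(M, H)).real)

# ... agree with those of the symmetric s.d.d. matrix M^{-1/2} H M^{-1/2}.
S = np.diag(m ** -0.5)
sdd_eigs = np.sort(np.linalg.eigvalsh(S @ H @ S))

print(np.allclose(pencil_eigs, sdd_eigs))  # True
```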


Given that the matrix entries determine all eigenvalues or singular values to high relative accuracy, one would naturally like to compute them that accurately as well. We present algorithms based on bisection which attain this accuracy; in the case of bidiagonal or symmetric positive definite tridiagonal matrices QR iteration (suitably modified) can be shown to attain high accuracy as well. It is not yet known whether algorithms based on divide and conquer [5, 11, 13] can be made to work in some of these situations too.

One may also ask if the singular vectors and eigenvectors of s.d.d. matrices and pencils are determined any more accurately than for general matrices. To state the standard perturbation bound for eigenvectors of symmetric matrices and singular vectors of general matrices, we need to define the gap: if $\lambda_i$ is an eigenvalue (singular value) of $A$ then $\mathrm{gap}(\lambda_i) \equiv \min_{j \ne i} |\lambda_i - \lambda_j|$. In other words, it is the absolute distance between $\lambda_i$ and the remainder of the spectrum.

(1.3) Let $y$ be a unit eigenvector of $A + \delta A$, $\alpha = y^T A y$ the Rayleigh quotient, $\lambda_i$ the eigenvalue of $A$ closest to $\alpha$, and $z_i$ its unit eigenvector. Let $\theta(z_i, y)$ be the acute angle between $y$ and $z_i$. Then $\sin\theta(y, z_i) \le 4\|\delta A\|/\mathrm{gap}(\lambda_i)$ [17, p. 222].

In other words, the error as measured by the angle is proportional to the reciprocal of the gap; if the gap is small ($\lambda_i$ is in a cluster of eigenvalues), the corresponding eigenvector is poorly determined. As before, the standard algorithms guarantee $\|\delta A\| \le f(n)\varepsilon\|A\|$, so eigenvectors of eigenvalues poorly separated with respect to $\|A\|$ (i.e. $\|A\|/\mathrm{gap}(\lambda_i)$ is large) will generally not be computed accurately. Analogous results hold for singular vectors of general matrices.

For eigenvectors of s.d.d. matrices, a stronger perturbation theorem is true. Briefly, we can replace the gap in (1.3) with the relative gap, $\min_{j \ne i} |\lambda_i - \lambda_j|/|\lambda_j \lambda_i|^{1/2}$. Thus, as long as $\lambda_i$ is relatively well separated from its neighbors, its corresponding eigenvector is determined to high relative accuracy. This is a much stronger result than (1.3), as the following example shows. Suppose the eigenvalues are 1, $2 \cdot 10^{-10}$ and $10^{-10}$. Then the gap for the smallest eigenvalue is $\mathrm{gap}(10^{-10}) = 10^{-10}$, but the relative gap is .707. Thus (1.3) predicts a loss of 10 decimal digits, whereas the finer analysis predicts nearly full accuracy. We also show that a suitable variation of inverse iteration can be used to compute the eigenvectors to this accuracy. We conjecture that other methods based on divide and conquer can attain this accuracy as well, but this has not been proven. Similar results can be proven for singular vectors of bidiagonal matrices and eigenvectors of s.d.d. definite pencils; the result for singular vectors partially settles an open question in [9].

The rest of this paper is organized as follows. Section 2 contains definitions. Section 3 discusses some simple generalizations of Gershgorin's theorem applicable to s.d.d. matrices. Section 4 uses the minimax characterization of eigenvalues to present simple perturbation lemmas. In section 5 these lemmas are applied to singular values and in section 6 to eigenvalues. Section 7 discusses perturbation theory for eigenvectors and singular vectors. Section 8 shows that the condition numbers for the eigenvectors provide good estimates for the reciprocal of the distance to the nearest matrix with multiple eigenvalues.
Section 9 discusses algorithms for the bidiagonal singular value decomposition, section 10 discusses algorithms for the symmetric tridiagonal eigenproblem, and section 11 discusses algorithms for the dense symmetric eigenproblem (both matrices and pencils). The new algorithm for the symmetric positive definite tridiagonal eigenproblem will be included in the LAPACK library [8]. Section 12 applies our results to a matrix arising from a differential operator. Section 13 summarizes the available algorithms and the current state of research, and discusses future work.

2. Definitions and Basic Lemmas

In this paper we will deal exclusively with real (usually symmetric) matrices. Extensions to complex (usually Hermitian) matrices will be obvious. $\|\cdot\|$ will denote the 2-norm. Decompose the matrix $A$ as $A = D + N$ where $D$ is diagonal and $N$ has a zero diagonal. We will call a matrix $A$ $\gamma$-diagonally dominant with respect to a norm $|||\cdot|||$ if $|||N||| \le \gamma \min_i |D_{ii}|$,


where $0 \le \gamma < 1$. Suppose that $A$ is $\gamma$-diagonally dominant with respect to either the 1-norm or infinity-norm. Then the well known Gershgorin's Theorem says that the eigenvalues of $A$ lie in the union of the Gershgorin disks $B_i$, where $B_i$ is centered at $D_{ii}$ and has radius at most $\gamma|D_{ii}|$. In particular, if some $B_i$ is disjoint from the other disks, it contains exactly one eigenvalue and $D_{ii}$ is an approximation to this eigenvalue with relative error at most $\gamma$.

Now let $A = D + N$ and $D_{ii} = \pm 1$, i.e. $A$ has $\pm 1$'s on the diagonal. Let $\Delta_1$ and $\Delta_2$ be arbitrary nonsingular diagonal matrices. Then we call $H \equiv \Delta_1 A \Delta_2$ $\gamma$-scaled diagonally dominant ($\gamma$-s.d.d.) with respect to a norm $|||\cdot|||$ if $A$ is $\gamma$-diagonally dominant with respect to $|||\cdot|||$. If $H$ is symmetric, we insist that $A$ be symmetric as well, in which case $\Delta_1 = \Delta_2$ can be chosen in only one way: $(\Delta_1)_{ii} = |H_{ii}|^{1/2}$. Note that a matrix may only be diagonally dominant in the scaled sense, as the following example shows:

$$A = \begin{pmatrix} 1 & .1 \\ .1 & 1 \end{pmatrix}, \quad \Delta_1 = \Delta_2 = \begin{pmatrix} 1 & 0 \\ 0 & 100 \end{pmatrix}, \quad H \equiv \Delta_1 A \Delta_2 = \begin{pmatrix} 1 & 10 \\ 10 & 10000 \end{pmatrix}$$

Here, $A$ is $\gamma$-diagonally dominant with $\gamma = .1$ (with respect to the 1-norm, 2-norm or infinity-norm), $H$ is $\gamma$-s.d.d. with the same $\gamma$, but $H$ is not diagonally dominant in the nonscaled sense for any $\gamma < 1$. Our definition of scaling implies nothing about the monotonicity of the diagonal entries of $H$; for example $H$ is $\gamma$-s.d.d. if

$$A = \begin{pmatrix} 1 & .1 & .1 \\ .1 & -1 & .1 \\ .1 & .1 & 1 \end{pmatrix}, \quad \Delta_1 = \Delta_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 100 & 0 \\ 0 & 0 & .01 \end{pmatrix}, \quad H \equiv \Delta_1 A \Delta_2 = \begin{pmatrix} 1 & 10 & .001 \\ 10 & -10000 & .1 \\ .001 & .1 & .0001 \end{pmatrix}.$$
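The scaled diagonal dominance measure of a symmetric matrix with nonzero diagonal is easy to compute from this definition. Here is a minimal Python sketch (ours, not from the paper) that recovers $\Delta$ and $A$ from $H$ and returns $\gamma = \|N\|$ in the 2-norm; the test matrix is the first example above:

```python
import numpy as np

def sdd_gamma(H):
    """Return gamma such that H = Delta*A*Delta is gamma-s.d.d. (2-norm),
    where Delta_ii = |H_ii|^(1/2) and A has +-1's on its diagonal."""
    d = np.sqrt(np.abs(np.diag(H)))
    A = H / np.outer(d, d)           # A = Delta^{-1} H Delta^{-1}
    N = A - np.diag(np.diag(A))      # strip the +-1 diagonal
    return np.linalg.norm(N, 2)

H = np.array([[1.0, 10.0], [10.0, 10000.0]])
print(sdd_gamma(H))                  # 0.1: H is .1-s.d.d. even though it is
                                     # not diagonally dominant in the usual sense
```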

If $H$ is symmetric with positive diagonal entries, being $\gamma$-s.d.d. with respect to the 2-norm is closely related to another well known property: consistent ordering [18]. Consistent ordering is defined as follows: Let $A = I + N = I + L + U$, where $L$ is strictly lower triangular and $U$ is strictly upper triangular. Then $A$ is consistently ordered if the eigenvalues of $\alpha L + \alpha^{-1} U$ are independent of $\alpha \ne 0$. Now suppose there is a permutation matrix $P$ such that $A$ in $PHP^T = \Delta A \Delta = \Delta(I+N)\Delta$ is consistently ordered. Then we claim $H$ is positive definite if and only if it is s.d.d. To prove this, note that by choosing $\alpha = 1$ and $\alpha = -1$, we see the eigenvalues of $N = L + U$ occur in $\pm$ pairs, including $\pm\|N\| = \pm\gamma$. Now note that $H$ is positive definite if and only if $I + N$ is positive definite (by Sylvester's theorem), and that the smallest eigenvalue of $I + N$ is $1 - \|N\| = 1 - \gamma$. Therefore, the theorems in this paper apply to many matrices arising from discretized differential equations [18]; see section 12 for an example.

We will call a symmetric pencil $H - \lambda M$ $\gamma$-scaled diagonally dominant definite ($\gamma$-s.d.d. definite) with respect to a norm $|||\cdot|||$ if $H$ and $M$ are $\gamma$-s.d.d. symmetric with respect to $|||\cdot|||$ and $M$ is positive definite. If $H$ is positive definite as well, we call $H - \lambda M$ $\gamma$-s.d.d. positive definite.

If $T$ is any symmetric matrix, $\|T\| = \max_i |\lambda_i(T)| \le |||T|||$ for any operator norm or the Frobenius norm $|||\cdot|||$. Therefore, all the theorems in this paper which are proven for diagonal dominance with respect to the 2-norm automatically hold for diagonal dominance with respect to any operator norm or the Frobenius norm.

The minimax characterization of the eigenvalues $\lambda_1 \le \cdots \le \lambda_n$ of a definite pencil $H - \lambda M$ ($H = H^T$, $M = M^T$, $M$ positive definite) is [17]

$$\lambda_i = \min_{S^i} \max_{x \in S^i,\ \|x\|=1} \frac{x^T H x}{x^T M x} \qquad (2.1)$$

where $S^i$ varies over all $i$-dimensional subspaces of $\mathbf{R}^n$ and $x$ varies over all unit vectors in $S^i$ ($x$ could vary over all nonzero vectors, but we will find it convenient to restrict to unit vectors). There is an obvious simplification if $M$ is the identity matrix (the standard eigenproblem).


3. Generalizations of Gershgorin's Theorem

It turns out the eigenvalues of the s.d.d. matrix $H = \Delta_1 A \Delta_2$ lie in Gershgorin circles whose centers and radii are both scaled by $\Delta_1 \Delta_2$:

Proposition 1: Let $H = \Delta_1 A \Delta_2$ ($A_{ii} = \pm 1$) be a (possibly nonsymmetric) $\gamma$-s.d.d. matrix with respect to the infinity-norm or 1-norm, with $\gamma < 1$. Then the eigenvalues of $H$ lie in disks $B_i$, where $B_i$ is centered at $H_{ii}$ and has radius at most $\gamma|H_{ii}|$. If $B_i$ is disjoint from the other disks, it contains exactly one eigenvalue and $H_{ii}$ is an approximation to that eigenvalue with relative error at most $\gamma$.

Proof: Suppose without loss of generality that $H$ is $\gamma$-s.d.d. with respect to the infinity-norm; otherwise consider $H^T$. The scalar $\lambda$ is an eigenvalue of $H$ if and only if $H - \lambda I$ is singular, which is in turn true if and only if $A - \lambda \Delta_1^{-1}\Delta_2^{-1}$ is singular. Let $x$ be a right null vector of $A - \lambda \Delta_1^{-1}\Delta_2^{-1}$, and suppose $x_j$ has absolute value at least as large as any other component of $x$. Then we may rearrange the equation

$$\sum_k A_{jk} x_k - \lambda \Delta_{1,jj}^{-1}\Delta_{2,jj}^{-1} x_j = 0$$

to obtain

$$\lambda = \Delta_{1,jj}\Delta_{2,jj}\Big(A_{jj} + \sum_{k \ne j} A_{jk}\frac{x_k}{x_j}\Big).$$

Since $A_{jj}\Delta_{1,jj}\Delta_{2,jj} = H_{jj}$ and $\big|\sum_{k \ne j} A_{jk}\frac{x_k}{x_j}\big| \le \gamma$, $\lambda$ must lie in a ball of radius $\gamma|H_{jj}|$ centered at $H_{jj}$. The usual Gershgorin argument shows that if this disk is isolated, it contains exactly one eigenvalue. ∎

This theorem implies that, at least if a Gershgorin disk is isolated so that small relative changes in the matrix entries do not affect its isolation, the eigenvalue it contains cannot change by a factor of more than $(1+\gamma)/(1-\gamma)$. If we assume $H$ is symmetric, we need not assume the disks are isolated to obtain this result:

Proposition 2: Let $H$ be a symmetric matrix, $\gamma$-s.d.d. with respect to the 2-norm. Let $h_i$ be its diagonal entries in increasing order $h_1 \le \cdots \le h_n$, and $\lambda_i$ its eigenvalues, also in increasing order. Then

$$1 - \gamma \le \frac{\lambda_i}{h_i} \le 1 + \gamma.$$

Proof: Assume without loss of generality that $H_{ii} = h_i$, by reordering the rows and columns of $H$ if necessary. Then by (2.1)

$$\lambda_i = \min_{S^i} \max_{x \in S^i,\ \|x\|=1} x^T H x \le \max_{x \in S_0^i,\ \|x\|=1} x^T H x = \max_{\|\hat x\|=1} \hat x^T H^{(i)} \hat x$$

where $S_0^i$ is the space spanned by the first $i$ standard basis vectors, and $H^{(i)}$ is the leading principal $i$ by $i$ submatrix of $H$. If $h_i < 0$ then $\lambda_i \le (1-\gamma)h_i$ by simple norm inequalities. If $h_i > 0$ and all $h_j > 0$ for $j \le i$, simple norm inequalities again imply $\lambda_i \le (1+\gamma)h_i$. If $h_i > 0$ and some $h_j < 0$ for $j < i$, we also have $\lambda_i \le (1+\gamma)h_i$ but we must argue as follows:

$$\hat x^T H^{(i)} \hat x = [\hat x_1^T, \hat x_2^T] \begin{pmatrix} \Delta_1 & 0 \\ 0 & \Delta_2 \end{pmatrix} \left[ \begin{pmatrix} -I_j & 0 \\ 0 & I_k \end{pmatrix} + N^{(i)} \right] \begin{pmatrix} \Delta_1 & 0 \\ 0 & \Delta_2 \end{pmatrix} \begin{pmatrix} \hat x_1 \\ \hat x_2 \end{pmatrix}$$

$$\le -\|\Delta_1 \hat x_1\|^2 + \|\Delta_2 \hat x_2\|^2 + \gamma\big(\|\Delta_1 \hat x_1\|^2 + \|\Delta_2 \hat x_2\|^2\big)$$

$$\le (1+\gamma)\|\Delta_2 \hat x_2\|^2 \le (1+\gamma)h_i.$$

Applying the same process to $-H$ yields the complementary inequalities $\lambda_i \ge (1+\gamma)h_i$ for $h_i < 0$ and $\lambda_i \ge (1-\gamma)h_i$ for $h_i > 0$. ∎

Finally, we may extend the result to s.d.d. symmetric definite pencils:

Proposition 3: Let $H - \lambda M$ be a symmetric $\gamma$-s.d.d. definite pencil with respect to the 2-norm. Let $r_i$ be the sorted ratios of diagonal entries $H_{ii}/M_{ii}$, where $r_1 \le \cdots \le r_n$, and $\lambda_i$ the eigenvalues, also in increasing order. Then

$$\frac{1-\gamma}{1+\gamma} \le \frac{\lambda_i}{r_i} \le \frac{1+\gamma}{1-\gamma}.$$

Proof: Assume as in the proof of Proposition 2 that $r_i = H_{ii}/M_{ii}$, by reordering rows and columns if necessary. Write $M = \Delta A \Delta$ where $\Delta$ is diagonal and $A$ is diagonally dominant with ones on its diagonal. Then the pencil $\Delta^{-1} H \Delta^{-1} - \lambda A \equiv R - \lambda A$ has the same eigenvalues as $H - \lambda M$, but now the diagonal entries $R_{ii} = r_i$. Then

$$\lambda_i = \min_{S^i} \max_{x \in S^i,\ \|x\|=1} \frac{x^T R x}{x^T A x} \le \max_{x \in S_0^i,\ \|x\|=1} \frac{x^T R x}{x^T A x} = \max_{\|\hat x\|=1} \frac{\hat x^T R^{(i)} \hat x}{\hat x^T A^{(i)} \hat x}$$

where $S_0^i$ is the space spanned by the first $i$ standard basis vectors, and $R^{(i)}$ and $A^{(i)}$ are the leading principal $i$ by $i$ submatrices of $R$ and $A$, respectively. Note that $1 - \gamma \le \hat x^T A^{(i)} \hat x \le 1 + \gamma$ for all unit vectors $\hat x$, since $A$ equals the identity matrix plus a matrix of norm at most $\gamma$. Then by Proposition 2, we have $\lambda_i \le (1+\gamma)/(1-\gamma) \cdot r_i$ if $r_i > 0$ and $\lambda_i \le (1-\gamma)/(1+\gamma) \cdot r_i$ if $r_i < 0$. Applying the same process to $R + \lambda A$ yields the other inequalities. ∎
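The bound of Proposition 2 is easy to test numerically. The following sketch (our illustration; the test matrix is the 3-by-3 indefinite example from section 2, for which $\gamma = 0.2$) compares the sorted eigenvalues with the sorted diagonal entries:

```python
import numpy as np

# gamma-s.d.d. with gamma = ||N|| = 0.2 (all scaled offdiagonals are 0.1).
H = np.array([[1.0,   10.0,     0.001 ],
              [10.0, -10000.0,  0.1   ],
              [0.001, 0.1,      0.0001]])

h = np.sort(np.diag(H))                # sorted diagonal entries
lam = np.sort(np.linalg.eigvalsh(H))   # sorted eigenvalues
print(lam / h)                         # each ratio lies in [1 - 0.2, 1 + 0.2]
```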


4. Perturbation Lemmas Based on the Minimax Theorem

Let $\lambda_1(H,M) \le \cdots \le \lambda_n(H,M)$ denote the eigenvalues of the definite pencil $H - \lambda M$. Given the minimax characterization in (2.1), the following lemma is simple to prove:

Lemma 1: Suppose $\delta H$ has the property that for all nonzero $x$

$$g_l \le \frac{x^T(H + \delta H)x}{x^T H x} \le g_u,$$

where $0 < g_l \le g_u$. Then

$$g_l \le \frac{\lambda_i(H+\delta H, M)}{\lambda_i(H, M)} \le g_u$$

for all $i$. In other words, if the Rayleigh quotients $x^T(H+\delta H)x$ and $x^T H x$ differ by at most a certain factor for all $x$, then the eigenvalues of $H+\delta H-\lambda M$ and $H-\lambda M$ differ by at most that same factor.

Proof: Let $\lambda_i \equiv \lambda_i(H,M)$ and let $\lambda_i' \equiv \lambda_i(H+\delta H, M)$. We consider only $\lambda_i \ge 0$; the case $\lambda_i < 0$ is analogous. Let the spaces $S_0^i$ and $S_1^i$ satisfy

$$\lambda_i = \max_{x \in S_0^i} \frac{x^T H x}{x^T M x} \quad\text{and}\quad \lambda_i' = \max_{x \in S_1^i} \frac{x^T(H+\delta H)x}{x^T M x}.$$

Then

$$\lambda_i' = \min_{S^i} \max_{x \in S^i} \frac{x^T(H+\delta H)x}{x^T M x} \le \max_{x \in S_0^i} \frac{x^T(H+\delta H)x}{x^T H x} \cdot \frac{x^T H x}{x^T M x} \le g_u \lambda_i$$

and similarly

$$\lambda_i = \min_{S^i} \max_{x \in S^i} \frac{x^T H x}{x^T M x} \le \max_{x \in S_1^i} \frac{x^T H x}{x^T(H+\delta H)x} \cdot \frac{x^T(H+\delta H)x}{x^T M x} \le g_l^{-1} \lambda_i',$$

completing the proof. ∎

Lemma 1 can also be generalized to infinite dimensional operators [15, Thm VI.3.9]. There is an obvious analogous result if both $H$ and $M$ are perturbed simultaneously:

Lemma 2: Suppose $\delta H$ and $\delta M$ have the property that for all nonzero $x$

$$g_{lH} \le \frac{x^T(H+\delta H)x}{x^T H x} \le g_{uH} \quad\text{and}\quad g_{lM} \le \frac{x^T(M+\delta M)x}{x^T M x} \le g_{uM},$$

where $0 < g_{lH} \le g_{uH}$ and $0 < g_{lM} \le g_{uM}$. Then

$$\frac{g_{lH}}{g_{uM}} \le \frac{\lambda_i(H+\delta H, M+\delta M)}{\lambda_i(H, M)} \le \frac{g_{uH}}{g_{lM}}$$

for all $i$.

Lemma 3: Let $H$ be symmetric $\gamma$-s.d.d. with respect to the 2-norm. Write $H = \Delta A \Delta$ where $\Delta$ is diagonal and $A$ has $\pm 1$'s on its diagonal. Let $\lambda$ be an eigenvalue of $H$, $y$ the corresponding unit eigenvector, and $x = \Delta y/\|\Delta y\|$ the corresponding unit eigenvector of $A - \lambda\Delta^{-2}$. Then

$$1 - \gamma \le |x^T A x| \le 1 + \gamma.$$

Proof: Write $A = E + N$, where $E$ is diagonal with $\pm 1$'s on the diagonal, $N$ has zero diagonal, and $\|N\| \le \gamma < 1$. The upper bound on $|x^T A x|$ follows immediately from taking norms: $|x^T A x| \le \|E\| + \|N\| \le 1 + \gamma$. If $E = I$, then $|x^T A x| = |1 + x^T N x| \ge 1 - \gamma$. Assume then without loss of generality that

$$E = \begin{pmatrix} I_j & 0 \\ 0 & -I_k \end{pmatrix}$$

where $I_l$ denotes an $l$-by-$l$ identity matrix. We will prove the theorem only for $\lambda < 0$; for positive $\lambda$ consider $-H$. Partition


$$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad \Delta = \begin{pmatrix} \Delta_1 & 0 \\ 0 & \Delta_2 \end{pmatrix} \quad\text{and}\quad N = \begin{pmatrix} N_1 \\ N_2 \end{pmatrix}$$

conformally with $E$ ($N_1$ and $N_2$ are block rows of $N$). Then $Ax = \lambda\Delta^{-2}x$ may be rewritten

$$x_1 + N_1 x = \lambda\Delta_1^{-2}x_1$$
$$-x_2 + N_2 x = \lambda\Delta_2^{-2}x_2.$$

Solving the first equation above for $x_1$ yields

$$x_1 = (\lambda\Delta_1^{-2} - I)^{-1} N_1 x.$$

Note that $\lambda\Delta_1^{-2} - I$ is diagonal with diagonal entries less than $-1$. Now

$$x^T A x = x^T E x + x^T N x = x_1^T x_1 - x_2^T x_2 + x^T N x = 2x_1^T x_1 - 1 + x^T N x$$

since $x_1^T x_1 + x_2^T x_2 = 1$. Combining the last two displayed equations, and using the fact that

$$-x^T N_1^T(\lambda\Delta_1^{-2} - I)^{-1} N_1 x \ge x^T N_1^T(\lambda\Delta_1^{-2} - I)^{-2} N_1 x \ge 0$$

yields

$$x^T A x = 2x^T N_1^T(\lambda\Delta_1^{-2} - I)^{-2} N_1 x + (x_1^T, x_2^T)\begin{pmatrix} N_1 x \\ N_2 x \end{pmatrix} - 1$$

$$= 2x^T N_1^T(\lambda\Delta_1^{-2} - I)^{-2} N_1 x + x_2^T N_2 x + x^T N_1^T(\lambda\Delta_1^{-2} - I)^{-1} N_1 x - 1$$

$$\le x_2^T N_2 x + x^T N_1^T(\lambda\Delta_1^{-2} - I)^{-2} N_1 x - 1$$

$$= x_2^T N_2 x + x_1^T(\lambda\Delta_1^{-2} - I)^{-1} N_1 x - 1$$

$$= \big(x_1^T(\lambda\Delta_1^{-2} - I)^{-1},\ x_2^T\big)\,Nx - 1$$

$$\le \left\|\begin{pmatrix} (\lambda\Delta_1^{-2} - I)^{-1} x_1 \\ x_2 \end{pmatrix}\right\| \cdot \|Nx\| - 1$$

$$\le \left\|\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}\right\| \cdot \|Nx\| - 1$$

$$\le \gamma - 1. \qquad ∎$$
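A quick numerical sanity check of Lemma 3 (our illustration, reusing the 3-by-3 indefinite example from section 2, for which $\gamma = 0.2$):

```python
import numpy as np

H = np.array([[1.0,   10.0,     0.001 ],
              [10.0, -10000.0,  0.1   ],
              [0.001, 0.1,      0.0001]])
d = np.sqrt(np.abs(np.diag(H)))
A = H / np.outer(d, d)              # A = Delta^{-1} H Delta^{-1}, gamma = 0.2

lam, Y = np.linalg.eigh(H)          # unit eigenvectors y_i of H
for i in range(3):
    x = d * Y[:, i]                 # x = Delta y / ||Delta y||
    x /= np.linalg.norm(x)
    print(abs(x @ A @ x))           # each value lies in [0.8, 1.2]
```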

In the next two sections we will use these results to derive perturbation theorems for eigenvalues and singular values.


5. A Perturbation Theorem for Singular Values

Using Lemma 2, we prove the following theorem, which is a slight strengthening of a result of Kahan [9]:

Theorem 1: Let $B$ be an $n$ by $n$ bidiagonal matrix:

$$B = \begin{pmatrix} a_1 & b_1 & & \\ & a_2 & \ddots & \\ & & \ddots & b_{n-1} \\ & & & a_n \end{pmatrix}$$

We assume the $a_i$ and $b_i$ are nonzero since otherwise $B$ splits into independent subproblems. Let $B+\delta B$ be a perturbed bidiagonal matrix with entries $\alpha_i a_i$ in place of $a_i$ and $\beta_i b_i$ in place of $b_i$. Then the singular values $\sigma_1 \le \cdots \le \sigma_n$ of $B$ and $\sigma_1' \le \cdots \le \sigma_n'$ of $B+\delta B$ satisfy

$$g_l \le \frac{\sigma_i'}{\sigma_i} \le g_u$$

where $g_l$ and $g_u$ are defined as follows. Define the finite set $S$ of positive numbers by

$$S \equiv \Big\{ \Big|\frac{\beta_j \cdots \beta_k}{\alpha_{j+1} \cdots \alpha_k}\Big| : 1 \le j \le k \le n-1 \Big\} \cup \Big\{ \Big|\frac{\alpha_j \cdots \alpha_k}{\beta_j \cdots \beta_{k-1}}\Big| : 1 \le j \le k \le n \Big\}.$$

Note that $S$ contains $|\beta_j|$, $1 \le j < n$, and $|\alpha_j|$, $1 \le j \le n$. Let $\min S$ and $\max S$ denote the minimum and maximum entries of $S$, respectively. Then $g_u = \max S$ and $g_l = \min S$.

Corollary 1: Let $B$ and $B+\delta B$ be bidiagonal with singular values $\sigma_1(B) \le \cdots \le \sigma_n(B)$ and $\sigma_1(B+\delta B) \le \cdots \le \sigma_n(B+\delta B)$ respectively. If for all nonzero entries $B_{ij}$, $\tau^{-1} \le |(B+\delta B)_{ij}/B_{ij}| \le \tau$ for some $\tau \ge 1$, then

$$\frac{1}{\tau^{2n-1}} \le \frac{\sigma_i(B+\delta B)}{\sigma_i(B)} \le \tau^{2n-1}.$$

Thus, relative perturbations of at most $\tau$ in the entries of $B$ cause relative perturbations of at most $\tau^{2n-1}$ in its singular values. If $\tau = 1+\eta$ is close to 1, so is $\tau^{2n-1} \approx 1 + (2n-1)\eta$.

Corollary 2: Let $B$ and $B+\delta B$ be bidiagonal. If $|(B+\delta B)_{ij}| = \tau|B_{ij}|$ for all $i$ and $j$, then $\sigma_i(B+\delta B) = \tau\sigma_i(B)$. This simply says that if you multiply each entry of a bidiagonal matrix by $\pm\tau$, you multiply all its singular values by $\tau$ as well.

Proof of Theorem 1: For notational convenience rename the entries of $B$ so that

$$B = \begin{pmatrix} s_1 & s_2 & & & \\ & s_3 & s_4 & & \\ & & \ddots & \ddots & \\ & & & \ddots & s_{2n-2} \\ & & & & s_{2n-1} \end{pmatrix}$$

and so that $B+\delta B$ has entries $\gamma_i s_i$ in place of $s_i$. We may assume without loss of generality that all the $s_i$ are real and positive, since this may be achieved by pre- and postmultiplying $B$ by unitary diagonal matrices. We may also assume the $\gamma_i$ are real and positive for the same reason. We also use the well known fact that the eigenvalues of

$$C = \begin{pmatrix} 0 & B^T \\ B & 0 \end{pmatrix}$$

are plus and minus the singular values of $B$. Furthermore, by reordering the rows and columns of $C$ in the order $1, n+1, 2, n+2, \ldots, n, 2n$, we see $C$ is similar to


$$E = \begin{pmatrix} 0 & s_1 & & & \\ s_1 & 0 & s_2 & & \\ & s_2 & \ddots & \ddots & \\ & & \ddots & 0 & s_{2n-1} \\ & & & s_{2n-1} & 0 \end{pmatrix}.$$

Thus, the singular values of $B$ are the absolute values of the eigenvalues of the pencil $E - \lambda I$. We are interested in comparing these eigenvalues with the eigenvalues of $E+\delta E-\lambda I$, where $E+\delta E$ has entries $\gamma_i s_i$ in place of $s_i$. We will choose a diagonal matrix $D$ such that $D(E+\delta E)D = E$, so that these eigenvalues will be the eigenvalues of the pencil $E - \lambda D^2$; we will then apply Lemma 2 to conclude that the eigenvalues will change at most by a factor in the range $\min_i D_{ii}^{-2}$ to $\max_i D_{ii}^{-2}$.

Actually, we will find two diagonal matrices $D_u$ and $D_l$ satisfying $D(E+\delta E)D = E$ but corresponding to different choices of $D_{11}$, in order to minimize $\max_i D_{l,ii}$ and maximize $\min_i D_{u,ii}$; this will in turn maximize $g_l$ and minimize $g_u$ where (by Lemma 2)

$$g_l \equiv \frac{1}{\max_i D_{l,ii}^2} \le \frac{\sigma_i'}{\sigma_i} \le \frac{1}{\min_i D_{u,ii}^2} \equiv g_u.$$

First consider $D_l$. We want to choose $D_{l,11}$ to minimize $\max_i D_{l,ii}$. $D(E+\delta E)D = E$ implies the diagonal entries of $D_l$ must satisfy the recurrence $D_{l,i+1,i+1} = (\gamma_i D_{l,ii})^{-1}$. This means the diagonal entries of $D_l$ are either of the form

$$D_{l,11} \cdot \frac{\gamma_1\gamma_3\cdots\gamma_{2j-1}}{\gamma_2\gamma_4\cdots\gamma_{2j}} \qquad (5.1)$$

or

$$\frac{1}{D_{l,11}} \cdot \frac{\gamma_2\gamma_4\cdots\gamma_{2j}}{\gamma_1\gamma_3\gamma_5\cdots\gamma_{2j+1}}. \qquad (5.2)$$

Choosing $D_{l,11}$ to minimize the maximum of these terms is equivalent to choosing $D_{l,11}$ to minimize $\max(s_1 D_{l,11},\ s_2 D_{l,11}^{-1})$, where $s_1$ is the maximum coefficient of $D_{l,11}$ in (5.1) and $s_2$ is the maximum coefficient of $D_{l,11}^{-1}$ in (5.2); this minimum is easily seen to be $(s_1 s_2)^{1/2}$. Some tedious algebraic manipulation leads to the expression for $g_l$ in the statement of the theorem. Choosing $D_{u,11}$ to maximize $\min_i D_{u,ii}$ is analogous. ∎

A similar proof yields the following analogous result: Let $A$ be an arbitrary matrix, and $D_1$ and $D_2$ arbitrary nonsingular diagonal matrices, which we may assume are real and nonnegative. Then the singular values $\sigma_i$ of $A$ and $\sigma_i'$ of $D_1 A D_2$ satisfy

$$\min_i D_{1,ii} \cdot \min_j D_{2,jj} \le \frac{\sigma_i'}{\sigma_i} \le \max_i D_{1,ii} \cdot \max_j D_{2,jj}.$$
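Corollary 1 is easy to observe numerically. The following sketch (ours, not from the paper; the graded test matrix, the random perturbation factors, and $\tau$ are arbitrary choices) perturbs each nonzero entry of a bidiagonal matrix by a relative amount at most $\tau - 1$ and reports the worst relative change in the singular values, which stays below the $\tau^{2n-1} - 1$ bound:

```python
import numpy as np
rng = np.random.default_rng(0)

n = 6
a = 10.0 ** -np.arange(n)           # graded diagonal: 1, 1e-1, ..., 1e-5
b = 10.0 ** -np.arange(1, n)        # superdiagonal
B = np.diag(a) + np.diag(b, 1)

tau = 1 + 1e-8                      # entrywise relative perturbation bound
F = tau ** rng.uniform(-1, 1, B.shape)   # factors in [1/tau, tau]
Bp = B * F                          # perturbs only the nonzero entries

s = np.linalg.svd(B, compute_uv=False)
sp = np.linalg.svd(Bp, compute_uv=False)
print(np.max(np.abs(sp / s - 1)))   # << tau**(2*n-1) - 1, even for sigma ~ 1e-5
```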


6. Perturbation Theorems for Eigenvalues

In this section we use the perturbation lemmas of section 4 to prove perturbation theorems for eigenvalues of symmetric $\gamma$-s.d.d. matrices and $\gamma$-s.d.d. positive definite pencils. Theorems 2 and 3 apply only to positive definite matrices and pencils, and are stronger than Proposition 4 and Theorem 4, which apply to indefinite matrices as well. In fact, when the matrix or pencil is positive definite, $\Delta$ may be an arbitrary matrix, not just diagonal. Our results for definite pencils with both positive and negative eigenvalues are rather weaker than the results for s.d.d. indefinite symmetric matrices.

Theorem 2: Let $H = \Delta^T A \Delta = \Delta^T(I+N)\Delta$ be a positive definite matrix where $\|N\| = \gamma < 1$, and $\Delta$ is an arbitrary nonsingular matrix. (If $\Delta$ is diagonal this is equivalent to $H$ being a symmetric positive definite $\gamma$-s.d.d. matrix with respect to the 2-norm.) Let $\delta H$ be a symmetric perturbation of $H$ with $\|\Delta^{-T}\delta H\Delta^{-1}\| \equiv \eta < 1-\gamma$. Then the eigenvalues $\lambda_1 \le \cdots \le \lambda_n$ of $H$ and $\lambda_1' \le \cdots \le \lambda_n'$ of $H+\delta H$ satisfy

$$1 - \frac{\eta}{1-\gamma} \le \frac{\lambda_i'}{\lambda_i} \le 1 + \frac{\eta}{1-\gamma}.$$

Thus, if $\Delta$ is diagonal, relative perturbations of at most $\eta$ in the entries of $H$ cause only relative perturbations of at most $n\eta/(1-\gamma)$ in the eigenvalues.

Proof: We have

$$\frac{x^T(H+\delta H)x}{x^T H x} = \frac{x^T\Delta^T(A + \Delta^{-T}\delta H\Delta^{-1})\Delta x}{x^T\Delta^T A\Delta x} = \frac{y^T(A + \Delta^{-T}\delta H\Delta^{-1})y}{y^T A y} \qquad (6.1)$$

where $y = \Delta x$. The values taken by the quantity in (6.1) as $x$ varies over all nonzero vectors are the same as the values taken on as $y$ varies over all unit vectors. Since $1-\gamma \le y^T A y$ if $y$ is a unit vector,

$$1 - \frac{\eta}{1-\gamma} \le \frac{y^T(A + \Delta^{-T}\delta H\Delta^{-1})y}{y^T A y} \le 1 + \frac{\eta}{1-\gamma}.$$

The result now follows from Lemma 1. ∎

Theorem 3: Let $H = \Delta_H^T A_H \Delta_H = \Delta_H^T(I+N_H)\Delta_H$ and $M = \Delta_M^T A_M \Delta_M = \Delta_M^T(I+N_M)\Delta_M$ be positive definite matrices where $\|N_H\| \le \gamma < 1$, $\|N_M\| \le \gamma < 1$, and $\Delta_H$ and $\Delta_M$ are arbitrary nonsingular matrices. (If $\Delta_H$ and $\Delta_M$ are diagonal this is equivalent to $H - \lambda M$ being a $\gamma$-s.d.d. positive definite pencil with respect to the 2-norm.) Let $\delta H$ and $\delta M$ be symmetric perturbations of $H$ and $M$ such that $\|\Delta_H^{-T}\delta H\Delta_H^{-1}\| \le \eta < 1-\gamma$ and $\|\Delta_M^{-T}\delta M\Delta_M^{-1}\| \le \eta < 1-\gamma$. Then the eigenvalues $\lambda_1 \le \cdots \le \lambda_n$ of $H - \lambda M$ and $\lambda_1' \le \cdots \le \lambda_n'$ of $(H+\delta H) - \lambda(M+\delta M)$ satisfy

$$\frac{1-\gamma-\eta}{1-\gamma+\eta} \le \frac{\lambda_i}{\lambda_i'} \le \frac{1-\gamma+\eta}{1-\gamma-\eta}.$$

Thus, if $\Delta_H$ and $\Delta_M$ are diagonal, relative perturbations of at most $\eta$ in the entries of $H$ and $M$ cause only relative perturbations of at most $2n\eta/(1-\gamma-2n\eta)$ in the eigenvalues of $H - \lambda M$.

Proof: Apply the technique in the proof of Theorem 2 to both $H$ and $M$ and apply Lemma 2. ∎
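As a numerical illustration of Theorem 2 (ours; the graded test matrix and perturbation size are made up), the following sketch perturbs each entry of a symmetric positive definite s.d.d. tridiagonal matrix by a relative amount at most $\eta$ and observes that the eigenvalues, including the tiniest, move by a relative amount bounded by $n\eta/(1-\gamma)$:

```python
import numpy as np
rng = np.random.default_rng(1)

# Graded symmetric positive definite tridiagonal matrix; scaled offdiagonals
# are 0.3, so gamma = ||N|| <= 0.6.
d = 10.0 ** -np.arange(6)               # diagonal 1, 1e-1, ..., 1e-5
e = 0.3 * np.sqrt(d[:-1] * d[1:])       # offdiagonal 0.3 * sqrt(h_i h_{i+1})
H = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)

eta = 1e-10
E = rng.uniform(-1, 1, H.shape)
E = np.triu(E) + np.triu(E, 1).T        # symmetric entrywise perturbation
Hp = H * (1 + eta * E)

lam, lamp = np.linalg.eigvalsh(H), np.linalg.eigvalsh(Hp)
print(np.max(np.abs(lamp / lam - 1)))   # <= n*eta/(1 - gamma) ~ 1.5e-9
```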


When $H$ is indefinite, our results are weaker. In the case of $H$ alone, we must assume $\Delta$ is diagonal to attain a bound like that of Theorem 2:

Proposition 4: Let $H$ be an $n$-by-$n$ symmetric $\gamma$-s.d.d. matrix with respect to the 2-norm. Thus $H = \Delta A\Delta$ where $\Delta$ is diagonal, $A = E + N$ where $E$ is diagonal with $\pm 1$'s on the diagonal, $N$ has a zero diagonal and $\|N\| \le \gamma < 1$. Let $\delta H$ be a symmetric perturbation of $H$ with $\|\Delta^{-1}\delta H\Delta^{-1}\| \equiv \eta < (1-\gamma)/n$. Assume also that $H+\delta H$ is $\gamma$-s.d.d. Then the eigenvalues $\lambda_1 \le \cdots \le \lambda_n$ of $H$ and $\lambda_1' \le \cdots \le \lambda_n'$ of $H+\delta H$ satisfy

$$1 - \frac{n\eta}{1-\gamma} \le \frac{\lambda_i'}{\lambda_i} \le \Big(1 - \frac{n\eta}{1-\gamma}\Big)^{-1}.$$

Thus, relative perturbations of at most $\eta$ in the entries of $H$ can cause relative perturbations of at most $n^2\eta/(1-\gamma)$ in the eigenvalues.

Proof: We will prove the theorem only for the negative eigenvalues; for the positive ones consider $-H$. We cannot apply Lemma 1 directly here because $x^T A x/x^T x$ is not bounded away from 0 for all $x$ as it was in the proof of Theorem 2. By Lemma 3, however, it is so bounded if we restrict $x$ to lie in an eigenspace of $A - \lambda\Delta^{-2}$ corresponding to only negative (or only positive) eigenvalues; this will be sufficient for the proof.

To this end, let $\lambda_1 \le \cdots \le \lambda_n$ be the eigenvalues and $x_1, \ldots, x_n$ the corresponding unit right eigenvectors of $A - \lambda\Delta^{-2}$; let the primed quantities $\lambda_1' \le \cdots \le \lambda_n'$ and $x_i'$ correspond to $A + \Delta^{-1}\delta H\Delta^{-1} - \lambda\Delta^{-2}$. By Lemma 3, $x_i^T A x_i \le \gamma - 1 < 0$ if $\lambda_i < 0$.

Now let $X_i = [x_1, \ldots, x_i]$ and $\Lambda_i = \mathrm{diag}(\lambda_1, \ldots, \lambda_i)$, so $AX_i = \Delta^{-2}X_i\Lambda_i$. Since the columns of $X_i$ are eigenvectors of $A - \lambda\Delta^{-2}$, the columns of $\Delta^{-1}X_i$ are eigenvectors of $H$ and so are orthogonal. Thus

$$X_i^T A X_i = X_i^T\Delta^{-2}X_i\Lambda_i = \mathrm{diag}(x_k^T\Delta^{-2}x_k\,\lambda_k)$$

is diagonal with diagonal entries bounded above by $\gamma - 1$. Let $z$ be an arbitrary unit vector; then

$$z^T X_i^T A X_i z = z^T\,\mathrm{diag}(x_k^T\Delta^{-2}x_k\,\lambda_k)\,z \le \gamma - 1.$$

Now we use the characterization

$$\lambda_i = \min_{S^i}\max_{x \in S^i}\frac{x^T A x}{x^T\Delta^{-2}x}$$

where the minimum is attained for $S^i = S_0^i = \mathrm{span}(X_i)$. Then

$$\lambda_i' = \min_{S^i}\max_{x \in S^i}\frac{x^T(A+\Delta^{-1}\delta H\Delta^{-1})x}{x^T\Delta^{-2}x} \le \max_{x \in S_0^i}\frac{x^T(A+\Delta^{-1}\delta H\Delta^{-1})x}{x^T A x}\cdot\frac{x^T A x}{x^T\Delta^{-2}x}$$

$$= \max_{\|z\|=1}\Big(1 + \frac{z^T X_i^T\Delta^{-1}\delta H\Delta^{-1}X_i z}{z^T X_i^T A X_i z}\Big)\cdot\frac{z^T X_i^T A X_i z}{z^T X_i^T\Delta^{-2}X_i z}.$$

Now $|z^T X_i^T\Delta^{-1}\delta H\Delta^{-1}X_i z| \le i\eta$ (since $\|X_i z\|^2 \le i$) and $|z^T X_i^T A X_i z| \ge 1-\gamma$, so

$$\lambda_i' \le \Big(1 - \frac{i\eta}{1-\gamma}\Big)\lambda_i \quad\text{or}\quad \frac{\lambda_i'}{\lambda_i} \ge 1 - \frac{i\eta}{1-\gamma}$$

(recall $\lambda_i < 0$). Swapping the roles of $A$ and $A + \Delta^{-1}\delta H\Delta^{-1}$ we obtain

$$1 - \frac{i\eta}{1-\gamma} \le \frac{\lambda_i'}{\lambda_i} \le \Big(1 - \frac{i\eta}{1-\gamma}\Big)^{-1}$$

as desired, since $i \le n$. ∎


The factor $n$ in the bound of Proposition 4 is an overestimate, and can be removed by modifying the conditions of the proposition just slightly:

Theorem 4: Let $H = \Delta A\Delta$ be an $n$-by-$n$ symmetric $\gamma$-s.d.d. matrix with respect to the 2-norm. Here $\Delta$ is diagonal and $A$ has $\pm 1$'s on the diagonal. Let $\delta H$ be a symmetric perturbation with $\|\Delta^{-1}\delta H\Delta^{-1}\| \equiv \eta$. Assume that $H + \xi\delta H$ is $\gamma$-s.d.d. for all $0 \le \xi \le 1$. Then letting $\lambda_1 \le \cdots \le \lambda_n$ be the eigenvalues of $H$ and $\lambda_1' \le \cdots \le \lambda_n'$ be the eigenvalues of $H+\delta H$, we have

$$\exp\Big(\frac{-\eta}{1-\gamma}\Big) \le \frac{\lambda_i'}{\lambda_i} \le \exp\Big(\frac{\eta}{1-\gamma}\Big). \qquad (6.2)$$

Proof: Let $E = \Delta^{-1}\delta H\Delta^{-1}/\|\Delta^{-1}\delta H\Delta^{-1}\|$ be a matrix of norm 1, $H(\zeta) = \Delta(A+\zeta E)\Delta$, and $\lambda_1(\zeta) \le \cdots \le \lambda_n(\zeta)$ be the eigenvalues of $H(\zeta)$. Suppose first that $\lambda_i(0)$ is simple. Let $x_i$ be the unit eigenvector corresponding to $\lambda_i(0)$. Then from standard eigenvalue perturbation theory [15, 19], we know

$$\lambda_i(\zeta) = \lambda_i(0) + \zeta\,x_i^T\Delta E\Delta x_i + O(\zeta^2).$$

Therefore

$$\frac{\lambda_i(\zeta)}{\lambda_i(0)} = 1 + \zeta\,\frac{x_i^T\Delta E\Delta x_i}{x_i^T\Delta A\Delta x_i} + O(\zeta^2) = 1 + \zeta\,\frac{y_i^T E y_i}{y_i^T A y_i} + O(\zeta^2)$$

where $y_i = \Delta x_i/\|\Delta x_i\|$ may be taken to have norm one. By Lemma 3, $|y_i^T A y_i| \ge 1-\gamma$, and so

$$1 - \frac{\zeta}{1-\gamma} + O(\zeta^2) \le \frac{\lambda_i(\zeta)}{\lambda_i(0)} \le 1 + \frac{\zeta}{1-\gamma} + O(\zeta^2). \qquad (6.3)$$

Assume now that $\lambda_i(\zeta)$ is simple for all $0 \le \zeta \le \eta$; then (6.3) implies that $|d\log\lambda_i(\zeta)/d\zeta| \le (1-\gamma)^{-1}$. Integrating from 0 to $\eta$ yields (6.2). By [11, Theorem II.6.1], the eigenvalues are all real analytic, even when they are multiple. Thus, if there are only finitely many $\zeta$ where $\lambda_i(\zeta)$ is multiple, we can apply the above argument in the intermediate intervals.

It remains to consider the case where $\lambda_i(\zeta)$ is multiple for infinitely many $\zeta$. Here we argue that this can only happen for a set of pairs of matrices $H$ and $E$ of measure zero, that off this set the previous argument holds, and by continuity of the eigenvalues the same bounds must hold on the set. To see that the set of $H$ and $E$ such that some $\lambda_i(\zeta)$ is multiple infinitely often is of measure zero, consider the discriminant of the characteristic polynomial of $H + \zeta\Delta E\Delta$; this is a polynomial in $\zeta$ and the $2n^2$ entries of $H$ and $E$. $H(\zeta)$ can have multiple eigenvalues if and only if this discriminant vanishes. If it vanishes for infinitely many $\zeta$, its coefficients (viewing it as a polynomial in $\zeta$) must vanish identically. These coefficients are in turn polynomials in the entries of $H$ and $E$, and so vanish only on a proper variety (a set of measure zero). Off this set, the discriminant has at most a finite number of zeros (bounded by its degree as a polynomial in $\zeta$). ∎

Result (6.2) was claimed without proof in [14] just for the case of tridiagonal matrices perturbed on their offdiagonals.

In light of Proposition 3, it is probably possible to prove an analogous theorem for $\gamma$-s.d.d. definite pencils which have both positive and negative eigenvalues, but we have not been able to do so. However, the following technique frequently succeeds in reducing the eigenproblem for such pencils to a problem where Theorem 4 may be applied:


Algorithm 1: Reducing a $\gamma$-s.d.d. definite pencil $H - \lambda M$ to an s.d.d. matrix $Y$:

(1) Let $D_1 = \mathrm{diag}(M_{ii}^{1/2})$, and compute $H_1 = D_1^{-1}HD_1^{-1}$ and $M_1 = D_1^{-1}MD_1^{-1}$. Now $M_1$ has unit diagonal and is diagonally dominant in the usual sense.

(2) Let $P$ be a permutation matrix chosen so that $H_2 = PH_1P^T$ has its diagonal entries sorted from smallest to largest in absolute value (smallest at the top left, largest at the bottom right). Let $M_2 = PM_1P^T$.

(3) Let $L$ be the lower triangular Cholesky factor of $M_2$. Let $Y = L^{-1}H_2L^{-T}$. Then $Y$ and $H - \lambda M$ have the same eigenvalues.
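A direct transcription of Algorithm 1 into Python might look as follows (a minimal sketch; error handling and the tiny test pencil are ours):

```python
import numpy as np
from scipy.linalg import cholesky, eigvalsh

def reduce_pencil(H, M):
    """Reduce the s.d.d. definite pencil H - lambda*M to a single symmetric
    matrix Y with the same eigenvalues (Algorithm 1)."""
    # (1) Scale M to unit diagonal.
    d = np.sqrt(np.diag(M))
    H1, M1 = H / np.outer(d, d), M / np.outer(d, d)
    # (2) Sort the diagonal of H1 by increasing absolute value.
    p = np.argsort(np.abs(np.diag(H1)))
    H2, M2 = H1[np.ix_(p, p)], M1[np.ix_(p, p)]
    # (3) Cholesky-factor M2 and form Y = L^{-1} H2 L^{-T}.
    L = cholesky(M2, lower=True)
    return np.linalg.solve(L, np.linalg.solve(L, H2).T)

# Tiny check on a 2-by-2 s.d.d. definite pencil:
H = np.array([[2.0, 0.1], [0.1, 1.0]])
M = np.array([[1.0, 0.2], [0.2, 3.0]])
Y = reduce_pencil(H, M)
print(np.allclose(np.sort(eigvalsh(Y)), np.sort(eigvalsh(H, M))))  # True
```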

This is a variation on the usual reduction of a definite pencil to standard form. The point is that if $M$ is sufficiently diagonally dominant, $L$ will also be diagonally dominant with nearly unit diagonal, and the multiplication $L^{-1}H_2L^{-T}$ will not destroy the s.d.d. property of $H_2$. Thus, $Y$ will be s.d.d. The following theorem formalizes this, but is weak in that it only guarantees diagonal dominance of $Y$ for rather small $\gamma$, much smaller than those that work in practice:

Proposition 5: Let $H - \lambda M$ be an $n$ by $n$ $\gamma$-s.d.d. definite pencil, and let $Y$ be the output of the above reduction algorithm. Define

$$\gamma' \equiv ((2n)^{1/2}+1)\cdot\frac{1+\gamma}{1-\gamma}\cdot\big[2 + ((2n)^{1/2}+1)\gamma^{1/2}\big]\cdot\gamma^{1/2}.$$

Then if

$$\hat\gamma \equiv \frac{(n+1)\gamma' + \gamma}{1-\gamma'} < 1,$$

which will be true for $\gamma$ small enough, $Y$ will be $\hat\gamma$-s.d.d.

The proof of Proposition 5 requires the following lemma:

Lemma 4: Let $M$ be an $n$ by $n$ $\gamma$-s.d.d. matrix with unit diagonal. Let $L$ be its lower triangular Cholesky factor. Then $\|L-I\| \le ((2n)^{1/2}+1)\gamma^{1/2}$ and $\|L^{-1}-I\| \le ((2n)^{1/2}+1)\gamma^{1/2}/(1-\gamma)^{1/2}$. Also, $(1+\gamma)^{-1/2} \le \|L^{-1}\| \le (1-\gamma)^{-1/2}$.

Proof: Let $X = L - I$ and $W = L^{-1} - I = -L^{-1}X$. Since

$$(1-\gamma)^{1/2} \le \lambda_{\min}^{1/2}(M) = \sigma_{\min}(L) \le L_{ii} \le \sigma_{\max}(L) = \lambda_{\max}^{1/2}(M) \le (1+\gamma)^{1/2}$$

we have $|X_{ii}| = |L_{ii}-1| \le 1 - (1-\gamma)^{1/2} \le \gamma^{1/2}$. Also, we may bound the norm of the $i$-th subdiagonal column of $X$ as follows:

$$1+\gamma \ge \|[L_{i,i}, L_{i+1,i}, \ldots, L_{n,i}]\|^2 = \|[L_{i,i}, X_{i+1,i}, \ldots, X_{n,i}]\|^2 \ge 1-\gamma + \|X_{i+1:n,i}\|^2$$

whence $\|X_{i+1:n,i}\| \le (2\gamma)^{1/2}$. Thus $\|X\| \le ((2n)^{1/2}+1)\gamma^{1/2}$ as desired. Finally, $\|W\| \le \|L^{-1}\|\cdot\|X\| \le (1-\gamma)^{-1/2}\|X\|$. ∎

Proof of Proposition 5: By applying the first two steps of the reduction, assume without loss of generality that $M$ has unit diagonal and that $|H_{ii}| \le |H_{i+1,i+1}|$ for all $i$. Let $L_{.,i}$ denote the $i$-th column of $L$ and similarly $L_{j,.}$ the $j$-th row. Also, let $G^{(i,j)}$ denote the leading $i$ by $j$ submatrix of $G$. Let $D = \mathrm{diag}(|H_{ii}|^{1/2})$ and $L^{-1} = I + W$. Then

$$Y_{ij} = (L^{-1}HL^{-T})_{ij} = H_{ij} + (WHL^{-T})_{ij} + (L^{-1}HW^T)_{ij} + (WHW^T)_{ij}.$$

Therefore

$$|Y_{ij} - H_{ij}| \le |(WHL^{-T})_{ij}| + |(L^{-1}HW^T)_{ij}| + |(WHW^T)_{ij}|$$

$$\le \big(2\|W\|\cdot\|L^{-1}\| + \|W\|^2\big)\,\|H^{(i,j)}\| \le \big(2\|W\|\cdot\|L^{-1}\| + \|W\|^2\big)\,D_{ii}D_{jj}(1+\gamma).$$

Now insert the bounds of Lemma 4 to get

$$\big|\big(D^{-1}(Y-H)D^{-1}\big)_{ij}\big| \le ((2n)^{1/2}+1)\cdot\frac{1+\gamma}{1-\gamma}\cdot\big[2 + ((2n)^{1/2}+1)\gamma^{1/2}\big]\cdot\gamma^{1/2} = \gamma'.$$


Finally, letting $D_Y = \mathrm{diag}(|Y_{ii}|^{1/2})$,

$$\|\mathrm{offdiag}(D_Y^{-1}YD_Y^{-1})\| \le \frac{(n+1)\gamma' + \gamma}{1-\gamma'}$$

follows from simple norm inequalities. ∎

In order to apply Proposition 4 or Theorem 4, however, we must argue that small relative perturbations in $H$ and $M$ (of the type permitted in Proposition 4 and Theorem 4) cause small perturbations of the same type in $Y$:

Theorem 5: Let $H - \lambda M = D_H A_H D_H - \lambda D_M A_M D_M$ be a $\gamma$-s.d.d. definite pencil, where $A_H$ and $A_M$ have $\pm 1$'s on their diagonals. Let $Y = D_Y A_Y D_Y$ be the output of Algorithm 1 applied to $H - \lambda M$. Assume that $Y$ is $\hat\gamma$-s.d.d. ($\hat\gamma$ may be smaller than the expression in Proposition 5). Now define $H(\zeta) = D_H(A_H + \zeta E_H)D_H$, and similarly $M(\zeta) = D_M(A_M + \zeta E_M)D_M$, where $\|E_H\| = \|E_M\| = 1$. Let $Y(\zeta)$ be the output of the reduction algorithm applied to $H(\zeta) - \lambda M(\zeta)$. Then for asymptotically small $\zeta$

$$\|D_Y^{-1}(Y(\zeta)-Y)D_Y^{-1}\| \le \zeta\cdot\frac{n^{1/2}(2n^{1/2}+1)(1+\gamma)}{(1-\gamma)^2} + O(\zeta^2)$$

and the eigenvalues $\lambda_i(\zeta)$ of $H(\zeta) - \lambda M(\zeta)$ satisfy

$$1 - \zeta\cdot\frac{n^{1/2}(2n^{1/2}+1)(1+\gamma)}{(1-\hat\gamma)(1-\gamma)^2} + O(\zeta^2) \le \frac{\lambda_i(\zeta)}{\lambda_i(0)} \le 1 + \zeta\cdot\frac{n^{1/2}(2n^{1/2}+1)(1+\gamma)}{(1-\hat\gamma)(1-\gamma)^2} + O(\zeta^2).$$

For the proof of Theorem 5 we require the following lemma:

Lemma 5: Let $M$ be a $\gamma$-s.d.d. symmetric matrix with unit diagonal. Let $L$ be its lower triangular Cholesky factor. Let $L+\delta L$ be the lower triangular Cholesky factor of the perturbed matrix $M+\delta M$. Then

$$\|\delta L\| \le \Big(\frac{n}{1-\gamma}\Big)^{1/2}\cdot\|\delta M\| + O(\|\delta M\|^2).$$

Proof: It suffices to consider $M$ diagonal, since the Cholesky factors of $M$ and $QMQ^T$ ($Q$ orthogonal) have the same norm. Equating first order terms on both sides of $M+\delta M = (L+\delta L)(L+\delta L)^T$ yields $\delta L_{ij} = M_{jj}^{-1/2}\delta M_{ij}$ $(i>j)$ and $\delta L_{ii} = .5\,M_{ii}^{-1/2}\delta M_{ii}$ $(i=j)$. Taking norms yields the result. ∎

Proof of Theorem 5: By applying the first two steps of Algorithm 1, we may assume without loss of generality that $M_{ii} = 1$ and $|H_{ii}| \le |H_{i+1,i+1}|$. Let $L(\zeta)$ be the Cholesky factor of $M(\zeta)$. Then the eigenvalues of $H(\zeta) - \lambda M(\zeta)$ are the eigenvalues of $Y(\zeta) = L^{-1}(\zeta)H(\zeta)L^{-T}(\zeta)$. Letting $M(\zeta) = M+\delta M$, $L(\zeta) = L+\delta L$, and $H(\zeta) = H+\delta H$, we get that to first order

$$Y(\zeta) = (L^{-1} - L^{-1}\delta L L^{-1})(H+\delta H)(L^{-T} - L^{-T}\delta L^T L^{-T})$$

$$= L^{-1}HL^{-T} - L^{-1}\delta L L^{-1}HL^{-T} + L^{-1}\delta H L^{-T} - L^{-1}HL^{-T}\delta L^T L^{-T}.$$

Therefore to first order in $\zeta$

$$|(Y(\zeta)-Y)_{ij}| \le D_{A,ii}D_{A,jj}\big(2(1+\gamma)\cdot\|L^{-1}\|^3\cdot n^{1/2}\cdot(1-\gamma)^{-1/2} + \|L^{-1}\|^2\big)\,\zeta$$

$$\le D_{A,ii}D_{A,jj}\,\frac{(2n^{1/2}+1)(1+\gamma)}{(1-\gamma)^2}\,\zeta$$

as desired. Applying Theorem 4 yields the final result. ∎

We may apply Theorem 4 to analyze the convergence criterion for the QR algorithm for eigenvalues of symmetric tridiagonal matrices [17]. In the course of running the QR algorithm on a symmetric tridiagonal matrix one must examine the matrix


$$T = \begin{pmatrix} \ddots & \ddots & & \\ \ddots & a_j & b_j & \\ & b_j & a_{j+1} & \ddots \\ & & \ddots & \ddots \end{pmatrix}$$

and decide whether $b_j$ can be set to zero ("convergence") without making unacceptably large perturbations in the eigenvalues. Theorem 4 tells us that if $T$ is $\gamma$-s.d.d., then setting $b_j$ to zero makes relative errors no larger than

$$\exp\Big(\frac{|b_j|}{|a_j\cdot a_{j+1}|^{1/2}}\cdot\frac{1}{1-\gamma}\Big) - 1 \qquad (6.4)$$

in any eigenvalue. This result is attractive because it is inexpensive and purely local; it only depends on $b_j$ and its neighbors $a_j$ and $a_{j+1}$ on the diagonal of $T$. It does differ from the standard criterion, which essentially asks whether

$$\frac{|b_j|}{|a_j| + |a_{j+1}|}$$

is small; this criterion is weaker than (6.4) and does not guarantee high relative accuracy. Unfortunately, even using (6.4) as a convergence criterion does not guarantee that QR will compute eigenvalues with high relative accuracy; there are examples, even fairly strongly s.d.d. ones, where QR computes eigenvalues with incorrect signs, i.e. no relative accuracy at all. We discuss this further in Section 10.
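In code, the relative convergence test (6.4) amounts to a one-line comparison per offdiagonal entry. A sketch (ours; the function name and the acceptance threshold `tol` are illustrative choices):

```python
import numpy as np

def can_deflate(a, b, j, gamma, tol=1e-15):
    """Relative-accuracy deflation test (6.4): setting b[j] to zero perturbs
    every eigenvalue by a relative amount at most exp(r/(1-gamma)) - 1,
    where r = |b_j| / sqrt(|a_j * a_{j+1}|)."""
    r = abs(b[j]) / np.sqrt(abs(a[j] * a[j + 1]))
    return np.expm1(r / (1.0 - gamma)) <= tol
```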

7. Perturbation Theorems for Eigenvectors

In this section we discuss the sensitivity of the eigenvectors of symmetric s.d.d. matrices under the same small perturbations as in section 6. As discussed in the introduction, the standard perturbation bound (1.3) is proportional to the reciprocal of $\mathrm{gap}(\lambda_i) = \min_{j\ne i}|\lambda_i - \lambda_j|$. Thus, if the absolute distance from $\lambda_i$ to its nearest neighbor is small, we expect the corresponding eigenvector to be sensitive to perturbations. Our first result will be an analogous theorem which replaces the gap with the relative gap

$$\mathrm{relgap}(\lambda_i) = \min_{j\ne i}\frac{|\lambda_i - \lambda_j|}{|\lambda_i\lambda_j|^{1/2}}, \qquad (7.1)$$

which may be large even when the usual gap is small.

Even more may be shown. Proposition 6 will show that the eigenvectors are scaled analogously to the matrix entries: if $x_i$ is the eigenvector for $\lambda_i$, and $\lambda_j$ differs from $\lambda_i$ by a large factor, the $j$-th component of $x_i$ will be small. Theorem 7 will show that small relative perturbations in the matrix only cause small perturbations in the eigenvector entries relative to their upper bounds of Proposition 6; thus some tiny eigenvector components may be determined to high relative accuracy as well. Finally, we discuss eigenvector bounds for definite pencils and singular vector bounds for bidiagonal matrices (partially settling a conjecture from [9]).

Perturbation theory and algorithms for conventionally diagonally dominant nonsymmetric matrices were developed in [1], under the assumption that the gap between eigenvalues greatly exceeds the norm $\gamma$ of the offdiagonal part. Those results apply to general nonsymmetric matrices, but require much stronger diagonal dominance assumptions than we do and are weaker than our results.


Theorem 6: Let $H = \Delta A\Delta$ be a $\gamma$-s.d.d. symmetric matrix with respect to the 2-norm. Let $E$ be symmetric and have 2-norm one, and define $H(\zeta) = \Delta(A+\zeta E)\Delta$. Let $\lambda_i(\zeta)$ be the $i$-th eigenvalue of $H(\zeta)$, and assume $\lambda_i(0)$ is simple so that the corresponding unit eigenvector $x_i(\zeta)$ is well defined for sufficiently small $\zeta$. Then for asymptotically small $\zeta$

$$\|x_i(\zeta) - x_i(0)\| \le \frac{(n-1)\zeta}{(1-\gamma)\,\mathrm{relgap}(\lambda_i)} + O(\zeta^2).$$

Proof: From [19] we have

$$x_i(\zeta) = x_i(0) + \zeta\sum_{k\ne i}\frac{x_k^T\Delta E\Delta x_i}{\lambda_i - \lambda_k}\,x_k + O(\zeta^2).$$

Let $y_k = \Delta x_k$. Then

$$x_i(\zeta) = x_i(0) + \zeta\sum_{k\ne i}\frac{y_k^T E y_i}{\lambda_i - \lambda_k}\,x_k + O(\zeta^2).$$

The pair $(\lambda_k, y_k)$ is an eigenpair of the pencil $A - \lambda\Delta^{-2}$. Thus $y_k^T A y_k = \lambda_k\,y_k^T\Delta^{-2}y_k$. From Lemma 3 we have

$$(1-\gamma)\|y_k\|^2 \le |y_k^T A y_k| = |\lambda_k|\cdot\|\Delta^{-1}y_k\|^2 = |\lambda_k| \le (1+\gamma)\|y_k\|^2.$$

Thus

$$\Big(\frac{|\lambda_k|}{1+\gamma}\Big)^{1/2} \le \|y_k\| \le \Big(\frac{|\lambda_k|}{1-\gamma}\Big)^{1/2}. \qquad (7.2)$$

If we let $z_k = y_k/\|y_k\|$ then

$$x_i(\zeta) = x_i(0) + \zeta\sum_{k\ne i}\xi_{ik}\,\frac{z_k^T E z_i}{(\lambda_i - \lambda_k)/|\lambda_i\lambda_k|^{1/2}}\,x_k + O(\zeta^2) \qquad (7.3)$$

where $(1+\gamma)^{-1} \le |\xi_{ik}| \le (1-\gamma)^{-1}$. If we take norms then

$$\|x_i(\zeta) - x_i(0)\| \le \frac{(n-1)\zeta}{(1-\gamma)\,\mathrm{relgap}(\lambda_i)} + O(\zeta^2)$$

as desired. ∎

Corollary 3: Let $H(\zeta)$, $\lambda_i(\zeta)$, and $x_i(\zeta)$ be as in Theorem 6. Assume further that $H(\zeta)$ is $\gamma$-s.d.d. for all $0 \le \zeta \le \bar\zeta$, that $1 - \gamma - 3n\bar\zeta > 0$, and

$$\mathrm{relgap}(\lambda_i) \ge \frac{3\cdot2^{-1/2}\cdot n\cdot\bar\zeta}{1-\gamma-3n\bar\zeta}.$$

Then

$$\|x_i(\bar\zeta) - x_i(0)\| \le \frac{3(n-1)\bar\zeta}{2(1-\gamma-3n\bar\zeta)\Big(\mathrm{relgap}(\lambda_i) - \dfrac{3\cdot2^{-1/2}\cdot n\cdot\bar\zeta}{1-\gamma-3n\bar\zeta}\Big)}.$$

Proof: The idea is that if $\bar\zeta$ is small enough, the $\lambda_k(\zeta)$ can only change by a small relative amount, so the relative gap can only change by a small absolute amount. From Proposition 4, we can bound the perturbed relative gap from below as follows:

$$\mathrm{relgap}(\lambda_i(\zeta)) = \min_{k\ne i}\frac{|\lambda_i(\zeta)-\lambda_k(\zeta)|}{|\lambda_i(\zeta)\lambda_k(\zeta)|^{1/2}} \ge \min_{k\ne i}\frac{|\lambda_i(0)-\lambda_k(0)| - \frac{n\zeta}{1-\gamma}\big(|\lambda_i(0)|+|\lambda_k(0)|\big)}{|\lambda_i(0)\lambda_k(0)|^{1/2}\big(1-\frac{n\zeta}{1-\gamma}\big)^{-1}}$$

$$= \Big(1-\frac{n\zeta}{1-\gamma}\Big)\,\min_{k\ne i}\Big[\mathrm{relgap}(\lambda_i(0),\lambda_k(0)) - \frac{n\zeta}{1-\gamma}\cdot\frac{|\lambda_i(0)|+|\lambda_k(0)|}{|\lambda_i(0)\lambda_k(0)|^{1/2}}\Big]$$

where $\mathrm{relgap}(\lambda_i(0),\lambda_k(0)) \equiv |\lambda_i(0)-\lambda_k(0)|\cdot|\lambda_i(0)\lambda_k(0)|^{-1/2}$.

We consider two cases, $\mathrm{relgap}(\lambda_i(0),\lambda_k(0)) \ge 2^{-1/2}$ and $\mathrm{relgap}(\lambda_i(0),\lambda_k(0)) \le 2^{-1/2}$. The first case corresponds to $\lambda_i(0)$ and $\lambda_k(0)$ differing by at least a factor of 2, whence

$$\min_{k\ne i}\frac{|\lambda_i(0)|+|\lambda_k(0)|}{|\lambda_i(0)\lambda_k(0)|^{1/2}} \le 3\cdot\mathrm{relgap}(\lambda_i(0),\lambda_k(0))$$

and

$$\mathrm{relgap}(\lambda_i(\zeta),\lambda_k(\zeta)) \ge \Big(1-\frac{n\zeta}{1-\gamma}\Big)\cdot\mathrm{relgap}(\lambda_i(0),\lambda_k(0))\cdot\Big(1-\frac{3n\zeta}{1-\gamma}\Big).$$

The second case corresponds to $\lambda_i(0)$ and $\lambda_k(0)$ differing by at most a factor of 2, whence

$$\min_{k\ne i}\frac{|\lambda_i(0)|+|\lambda_k(0)|}{|\lambda_i(0)\lambda_k(0)|^{1/2}} \le 3\cdot2^{-1/2}$$

and

$$\mathrm{relgap}(\lambda_i(\zeta),\lambda_k(\zeta)) \ge \Big(1-\frac{n\zeta}{1-\gamma}\Big)\cdot\Big(\mathrm{relgap}(\lambda_i(0),\lambda_k(0)) - \frac{3\cdot2^{-1/2}n\zeta}{1-\gamma}\Big). \qquad (7.4)$$

Altogether, we have

$$\mathrm{relgap}(\lambda_i(\zeta)) \ge \Big(1-\frac{n\zeta}{1-\gamma}\Big)\Big(1-\frac{3n\zeta}{1-\gamma}\Big)\Big(\mathrm{relgap}(\lambda_i(0)) - \frac{(3\cdot2^{-1/2}n\bar\zeta)/(1-\gamma)}{1-3n\bar\zeta/(1-\gamma)}\Big).$$

Now integrate the bound of Theorem 6 from $\zeta = 0$ to $\zeta = \bar\zeta$ to get the desired result. ∎

The next theorem shows that the components of each eigenvector are scaled analogously to the way the matrix is scaled:

Proposition 6: Let $H = \Delta A\Delta$ be a $\gamma$-s.d.d. symmetric matrix with respect to the 2-norm. Let $H = X\Lambda X^T$ be its eigenvector decomposition, where $X = [x_1, \ldots, x_n]$ is the matrix whose columns are orthonormal eigenvectors and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Let $x_i(j)$ be the $j$-th component of $x_i$. Then

$$|x_i(j)| \le \bar x_i(j) \equiv \Big(\frac{1+\gamma}{1-\gamma}\Big)^{3/2}\cdot\min\Big(\Big|\frac{\lambda_i}{\lambda_j}\Big|^{1/2}, \Big|\frac{\lambda_j}{\lambda_i}\Big|^{1/2}\Big).$$

We also have

$$|x_i(j)| \le \Big(\frac{1+\gamma}{1-\gamma}\Big)^{3/2}\cdot\min\Big(\frac{\Delta_{ii}}{\Delta_{jj}}, \frac{\Delta_{jj}}{\Delta_{ii}}\Big).$$

Proof: Let $y_i = \Delta x_i$ for all $i$. First we consider the case $\Delta_{ii} \le \Delta_{jj}$. From (7.2) we have $\|y_i\| \le (|\lambda_i|/(1-\gamma))^{1/2}$. Thus, applying Proposition 2 as well,

$$|x_i(j)| = \Delta_{jj}^{-1}|y_i(j)| \le \Delta_{jj}^{-1}\big(|\lambda_i|/(1-\gamma)\big)^{1/2} \le \Big(\Big|\frac{\lambda_i}{\lambda_j}\Big|\cdot\frac{1+\gamma}{1-\gamma}\Big)^{1/2}$$

as desired. We may also write

$$|x_i(j)| = \Delta_{jj}^{-1}|y_i(j)| \le \Delta_{jj}^{-1}\big(|\lambda_i|/(1-\gamma)\big)^{1/2} \le \frac{\Delta_{ii}}{\Delta_{jj}}\cdot\Big(\frac{1+\gamma}{1-\gamma}\Big)^{1/2}$$

to get the other inequality.

Now consider the case $\Delta_{ii} \ge \Delta_{jj}$. We will take the $j$-th components of both sides of the equality $Ay_i = \lambda_i\Delta^{-2}y_i$, and bound them as follows. The left hand side component is bounded


above in absolute value by

$$(1+\gamma)\|y_i\| \le (1+\gamma)(1-\gamma)^{-1/2}|\lambda_i|^{1/2}.$$

The right hand side component is bounded below in absolute value by

$$|\lambda_i|\,\Delta_{jj}^{-2}\,|y_i(j)| \ge (1-\gamma)\,|\lambda_i/\lambda_j|\cdot|y_i(j)|.$$

Thus

$$|x_i(j)| = \Delta_{jj}^{-1}|y_i(j)| \le \Delta_{jj}^{-1}\,\frac{(1+\gamma)|\lambda_j|}{(1-\gamma)^{3/2}|\lambda_i|^{1/2}} \le \Big(\frac{1+\gamma}{1-\gamma}\Big)^{3/2}\Big|\frac{\lambda_j}{\lambda_i}\Big|^{1/2}$$

as desired. The bound in terms of $\Delta_{jj}/\Delta_{ii}$ is obtained similarly. ∎

Thus, if a matrix is strongly scaled (the $\Delta_{jj}$ vary greatly in magnitude), the eigenvectors will be strongly scaled, and small relative perturbations in the matrix entries will not be able to change the smaller eigenvector components much. In the next theorem, we prove something even stronger: the perturbations in the eigenvector components $x_i(j)$ will be small compared to the upper bounds $\bar x_i(j)$:

Theorem 7: Let $H(\zeta)$ be as in Theorem 6. Let $x_i(\zeta)(j)$ denote the $j$-th component of the $i$-th unit eigenvector of $H(\zeta)$. Let $\bar x_i(j)$ be the upper bound of Proposition 6 for the $j$-th component of the unperturbed unit eigenvector $x_i(0)(j)$. Then

$$|x_i(\zeta)(j) - x_i(0)(j)| \le \zeta\cdot\frac{2^{1/2}(n-1)}{(1-\gamma)\,\min(\mathrm{relgap}(\lambda_i),\,2^{-1/2})}\cdot\bar x_i(j) + O(\zeta^2)$$

($\lambda_i$ is the same as $\lambda_i(0)$). Note that $\mathrm{relgap}(\lambda_i)$ exceeds $2^{-1/2}$ only when $\lambda_i$ differs from its nearest neighbor by a factor greater than 2 or less than 1/2.

Proof: We start from (7.3) in the proof of Theorem 6; it implies

$$|x_i(\zeta)(j) - x_i(0)(j)| = \Big|\zeta\sum_{k\ne i}\frac{\xi_{ik}\,z_k^T E z_i}{(\lambda_i-\lambda_k)/|\lambda_i\lambda_k|^{1/2}}\,x_k(j) + O(\zeta^2)\Big|$$

$$\le \frac{\zeta(n-1)(1+\gamma)^{3/2}}{(1-\gamma)^{5/2}}\cdot\max_{k\ne i}\frac{|\lambda_i\lambda_k|^{1/2}}{|\lambda_i-\lambda_k|}\cdot\min\Big(\Big|\frac{\lambda_k}{\lambda_j}\Big|^{1/2},\Big|\frac{\lambda_j}{\lambda_k}\Big|^{1/2}\Big) + O(\zeta^2)$$

$$= \frac{\zeta(n-1)(1+\gamma)^{3/2}}{(1-\gamma)^{5/2}}\cdot\max_{k\ne i}\min\Big(\Big|\frac{\lambda_i}{\lambda_j}\Big|^{1/2}\frac{|\lambda_k|}{|\lambda_i-\lambda_k|},\ \Big|\frac{\lambda_j}{\lambda_i}\Big|^{1/2}\frac{|\lambda_i|}{|\lambda_i-\lambda_k|}\Big) + O(\zeta^2)$$

$$\le \frac{\zeta(n-1)(1+\gamma)^{3/2}}{(1-\gamma)^{5/2}}\cdot\min\Big(\Big|\frac{\lambda_i}{\lambda_j}\Big|^{1/2},\Big|\frac{\lambda_j}{\lambda_i}\Big|^{1/2}\Big)\cdot\max_{k\ne i}\frac{\max(|\lambda_i|,|\lambda_k|)}{|\lambda_i-\lambda_k|} + O(\zeta^2)$$

$$\le \frac{\zeta(n-1)(1+\gamma)^{3/2}}{(1-\gamma)^{5/2}}\cdot\min\Big(\Big|\frac{\lambda_i}{\lambda_j}\Big|^{1/2},\Big|\frac{\lambda_j}{\lambda_i}\Big|^{1/2}\Big)\cdot\frac{2^{1/2}}{\min(\mathrm{relgap}(\lambda_i),\,2^{-1/2})} + O(\zeta^2)$$

as desired. ∎

A version of Theorem 7 for nonasymptotically small $\zeta$ can be proven in the same way as Corollary 3 was derived from Theorem 6. If we have a cluster of eigenvalues which are relatively well separated from the others, similar analyses to those above show that the invariant subspace they span is insensitive to perturbations, even if the individual eigenvectors are sensitive.

Now consider s.d.d. definite pencils. From Proposition 5 of the last section, we know we can often reduce such pencils to standard form without sacrificing diagonal dominance: given $H-\lambda M$, $M$ positive definite with lower triangular Cholesky factor $L$, the matrix $Y = L^{-1}HL^{-T}$ has the same spectrum as $H-\lambda M$. Also, if $y$ is an eigenvector of $Y$, $x = L^{-T}y$ is an eigenvector


of $H-\lambda M$; thus Theorems 6 and 7 can be used to derive perturbation bounds on $x$, although we will not do so here.

Finally, we consider perturbation theory for the singular vectors of bidiagonal matrices. Let

$$B = \begin{pmatrix} a_1 & b_1 & & \\ & a_2 & b_2 & \\ & & \ddots & \ddots \\ & & & a_n \end{pmatrix} \qquad (7.5)$$

be bidiagonal, where as in Theorem 1 we may assume without loss of generality that all $a_i$ and $b_i$ are positive. Recall that the left singular vectors of $B$ are the eigenvectors of $BB^T$ and the right singular vectors of $B$ are the eigenvectors of $B^TB$. Since

$$BB^T = \begin{pmatrix} a_1^2+b_1^2 & b_1a_2 & & & \\ b_1a_2 & a_2^2+b_2^2 & b_2a_3 & & \\ & b_2a_3 & \ddots & \ddots & \\ & & \ddots & a_{n-1}^2+b_{n-1}^2 & b_{n-1}a_n \\ & & & b_{n-1}a_n & a_n^2 \end{pmatrix}$$

and

$$B^TB = \begin{pmatrix} a_1^2 & a_1b_1 & & & \\ a_1b_1 & a_2^2+b_1^2 & a_2b_2 & & \\ & a_2b_2 & \ddots & \ddots & \\ & & \ddots & a_{n-1}^2+b_{n-2}^2 & a_{n-1}b_{n-1} \\ & & & a_{n-1}b_{n-1} & a_n^2+b_{n-1}^2 \end{pmatrix},$$

small relative perturbations in the $a_i$ and $b_i$ only cause small relative perturbations in the entries of $BB^T$ and $B^TB$. Therefore, we can reduce perturbation theory for the singular vectors of a bidiagonal matrix to perturbation theory for the tridiagonal matrices $BB^T$ and $B^TB$.

Proposition 7: Let $B$ be as in (7.5). Since $BB^T$ and $B^TB$ are positive definite tridiagonal, and hence s.d.d., small relative changes in the entries of $B$ cause perturbations in the singular vectors as described by Theorems 6 and 7. More specifically, let $D_L = \mathrm{diag}((BB^T)_{ii})$ and $D_R = \mathrm{diag}((B^TB)_{ii})$. Then

$$\|D_L^{-1/2}BB^TD_L^{-1/2} - I\| \le \gamma_L \equiv 2\cdot\max\Big(\max_{j<n-1}\frac{b_ja_{j+1}}{(a_j^2+b_j^2)^{1/2}(a_{j+1}^2+b_{j+1}^2)^{1/2}},\ \frac{b_{n-1}}{(a_{n-1}^2+b_{n-1}^2)^{1/2}}\Big)$$

and

$$\|D_R^{-1/2}B^TBD_R^{-1/2} - I\| \le \gamma_R \equiv 2\cdot\max\Big(\max_{j>1}\frac{a_jb_j}{(a_j^2+b_{j-1}^2)^{1/2}(a_{j+1}^2+b_j^2)^{1/2}},\ \frac{b_1}{(a_2^2+b_1^2)^{1/2}}\Big).$$

Both $\gamma_L$ and $\gamma_R$ are bounded by 2. If $a_j > 3^{1/2}b_j$ for all $j$, then $\gamma_L < 1$, and if $a_j > 3^{1/2}b_{j-1}$ for all $j$, then $\gamma_R < 1$.

Proof: A simple computation shows that the diagonal of $D_L^{-1/2}BB^TD_L^{-1/2}$ consists of ones and that the offdiagonals are $b_ja_{j+1}(a_j^2+b_j^2)^{-1/2}(a_{j+1}^2+b_{j+1}^2)^{-1/2}$ for $j < n-1$ and $b_{n-1}(a_{n-1}^2+b_{n-1}^2)^{-1/2}$ for $j = n-1$. Thus $\gamma_L$ bounds the 1-norm of $D_L^{-1/2}BB^TD_L^{-1/2} - I$. A similar computation applies to $B^TB$. ∎

Thus, the sensitivity of the singular vectors to relative perturbations in the entries of $B$ is governed by the relative gap between singular values, as conjectured in [9]. Actually, more was conjectured: it does not appear that the measure of diagonal dominance $\gamma$ of $BB^T$ or $B^TB$


affects the singular vector sensitivity. A proof of this stronger conjecture will appear in [6]. More precisely, a version of Theorem 6 is proved in [6] without the $1-\gamma$ factor in the denominator, and extended to nonasymptotically small perturbations as in Corollary 3. Whether Theorem 7 can be generalized without a $1/(1-\gamma)$ dependence is an open question.
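The quantities in Proposition 7 are cheap to form directly from the bidiagonal entries, without squaring $B$ explicitly. A small sketch (ours; the test matrix is an arbitrary graded example) computing $\gamma_L$:

```python
import numpy as np

def gamma_L(a, b):
    """Diagonal-dominance measure of B*B^T from Proposition 7, formed
    directly from the bidiagonal entries a (diagonal) and b (superdiagonal)."""
    d = np.concatenate([a[:-1] ** 2 + b ** 2, [a[-1] ** 2]])  # diag of B B^T
    off = b * a[1:]                                           # offdiag of B B^T
    return 2 * np.max(np.abs(off) / np.sqrt(d[:-1] * d[1:]))

a = 10.0 ** -np.arange(5)   # an illustrative graded bidiagonal matrix
b = 0.1 * a[:-1]
print(gamma_L(a, b))        # < 1: B*B^T is s.d.d., so Theorems 6 and 7 apply
```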

8. On Condition Numbers and the Distance to the Nearest Ill-Posed Problem

In [7] it was observed that a common feature of many numerical analysis problems is that their condition numbers approximate, or at least bound, the reciprocal of the distance to the nearest ill-posed problem, i.e. the nearest problem whose condition number is infinite. The classical example of this is matrix inversion, where the condition number of a matrix $T$ is $\kappa(T) \equiv \|T\|\cdot\|T^{-1}\|$. By scaling $T$ we may assume without loss of generality that $\|T\| = 1$, so that $\kappa(T) = \|T^{-1}\|$. But the distance (in the $\|\cdot\|$ norm) from $T$ to the nearest singular matrix (the nearest matrix whose condition number is infinite) is $\|T^{-1}\|^{-1} = 1/\kappa(T)$. Thus, the condition number is exactly the reciprocal of the distance to the nearest ill-posed problem. This phenomenon recurs throughout numerical analysis, although we usually get a somewhat weaker relationship, such as one sided bounds between the condition number and reciprocal distance.

Here we investigate this phenomenon in the case of finding the eigenvectors of a symmetric matrix. From (1.3), we see that $1/\mathrm{gap}(\lambda_i)$ is a condition number for the $i$-th eigenvector of a general symmetric matrix. This condition number is infinite precisely when $\mathrm{gap}(\lambda_i) = 0$, i.e. $\lambda_i$ is a multiple eigenvalue. It is reasonable to call such an eigenproblem ill-posed because the eigenvector is no longer uniquely determined: any vector in an at least two-dimensional invariant subspace will do. It is elementary to show that the reciprocal of this condition number gives exactly the distance to the nearest ill-posed problem:

Proposition 8: Let $H$ be a symmetric matrix with simple eigenvalue $\lambda_i$; thus $\mathrm{gap}(\lambda_i) = \min_{k\ne i}|\lambda_i - \lambda_k| > 0$. Then the smallest $\|\delta H\|$ such that the eigenvalue of $H+\delta H$ "corresponding to" $\lambda_i$ is multiple is

$$\min\|\delta H\| = \frac{\mathrm{gap}(\lambda_i)}{2}.$$

By "the eigenvalue corresponding to $\lambda_i$ is multiple" we mean that if the continuous function $\lambda_i(\xi)$ is an eigenvalue of $H+\xi\,\delta H$ for all $0 \le \xi \le 1$, with $\lambda_i(0) = \lambda_i$, then $\lambda_i(\xi)$ is simple for $0 \le \xi < 1$ and $\lambda_i(1)$ is multiple.

Proof: Suppose $|\lambda_j - \lambda_i| = \mathrm{gap}(\lambda_i)$ and that $x_i$ and $x_j$ are corresponding unit eigenvectors. Let $\delta H = .5(\lambda_j - \lambda_i)(x_ix_i^T - x_jx_j^T)$ to show $\min\|\delta H\| \le \mathrm{gap}(\lambda_i)/2$. To get the other inequality apply (1.1) to see that any smaller $\|\delta H\|$ could not move either $\lambda_i$ or $\lambda_j$ more than half the distance towards one another. ∎
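Proposition 8's construction is easy to verify numerically. The sketch below (our illustration on a random symmetric matrix) builds the minimal perturbation $\delta H = .5(\lambda_j - \lambda_i)(x_ix_i^T - x_jx_j^T)$ for the closest pair of eigenvalues and checks that it merges them:

```python
import numpy as np
rng = np.random.default_rng(2)

H = rng.standard_normal((5, 5))
H = (H + H.T) / 2
lam, X = np.linalg.eigh(H)

gaps = np.diff(lam)                     # gaps between adjacent eigenvalues
i = np.argmin(gaps)                     # closest pair (lambda_i, lambda_j)
j = i + 1
dH = 0.5 * (lam[j] - lam[i]) * (np.outer(X[:, i], X[:, i])
                                - np.outer(X[:, j], X[:, j]))

print(np.linalg.norm(dH, 2), gaps[i] / 2)  # ||dH|| equals gap/2
lamp = np.linalg.eigvalsh(H + dH)
print(lamp[j] - lamp[i])                   # ~0: the pair has merged
```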


It turns out that a similar relationship holds for $\gamma$-s.d.d. symmetric matrices, provided we measure distances in the scaled way used so far in this paper, and that we use $1/\mathrm{relgap}(\lambda_i)$ as a condition number. This is interesting because it extends work in [7] to a case where the distance metric is quite skewed from the usual norm, and shows that $1/\mathrm{relgap}(\lambda_i)$ is the most natural condition number for this problem, because it shares the same geometric properties as other condition numbers.

Proposition 9: Let $H = \Delta A\Delta$ be an $n$ by $n$ $\gamma$-s.d.d. symmetric matrix with simple eigenvalue $\lambda_i$; thus $\mathrm{relgap}(\lambda_i) = \min_{k\ne i}|\lambda_i - \lambda_k|\cdot|\lambda_i\lambda_k|^{-1/2} > 0$. Assume further that $\mathrm{relgap}(\lambda_i) \le 2^{-1/2}$ (this means that $\lambda_i$ and its nearest neighbor differ by a factor between .5 and 2). Then the smallest $\|\delta A\|$ such that the eigenvalue of $\Delta(A+\delta A)\Delta$ "corresponding to" $\lambda_i$ is multiple satisfies

$$\frac{(1-\gamma)\,\mathrm{relgap}(\lambda_i)}{3(2^{1/2}n + \mathrm{relgap}(\lambda_i))} \le \min\|\delta A\| \le \mathrm{relgap}(\lambda_i)\cdot2^{1/2}\cdot n\cdot\frac{(1+\gamma)^4}{(1-\gamma)^3}.$$

For $\mathrm{relgap}(\lambda_i) \ll 1$, the lower bound on $\min\|\delta A\|$ equals $\mathrm{relgap}(\lambda_i)(1-\gamma)/(3\cdot2^{1/2}\cdot n) + O((\mathrm{relgap}(\lambda_i))^2)$. In other words, both the upper and lower bounds are $\Theta(\mathrm{relgap}(\lambda_i))$.

Proof: By scaling $H$ we may assume without loss of generality that $\Delta_{ii} = 1$. Let $\lambda_j$ satisfy $\mathrm{relgap}(\lambda_i) = |\lambda_i - \lambda_j|\cdot|\lambda_i\lambda_j|^{-1/2}$, and let $x_i$ and $x_j$ be corresponding unit eigenvectors. First we prove the upper bound on $\min\|\delta A\|$. From Proposition 6 we have that $|(x_ix_i^T)_{kl}| \le (1+\gamma)^3(1-\gamma)^{-3}\Delta_{kk}\Delta_{ll}$. Thus $\|\Delta^{-1}x_ix_i^T\Delta^{-1}\| \le n(1+\gamma)^3(1-\gamma)^{-3}$. Let $\delta A = (\lambda_j - \lambda_i)\Delta^{-1}x_ix_i^T\Delta^{-1}$. Clearly $\Delta(A+\delta A)\Delta$ has a multiple eigenvalue at $\lambda_j$ as desired. Also,

$$|\lambda_j - \lambda_i| = \mathrm{relgap}(\lambda_i)\cdot|\lambda_i\lambda_j|^{1/2} \le 2^{1/2}\,\mathrm{relgap}(\lambda_i)\cdot|\lambda_i| \le 2^{1/2}(1+\gamma)\,\mathrm{relgap}(\lambda_i),$$

which when combined with the bound on $\|\Delta^{-1}x_ix_i^T\Delta^{-1}\|$ yields the desired upper bound.

Now we consider the lower bound. Abbreviate $\|\delta A\|$ by $\eta$. Note that if $H$ is $\gamma$-s.d.d., then $H+\Delta\delta A\Delta$ is $(\gamma+2\eta)(1-\eta)^{-1}$-s.d.d., if $(\gamma+2\eta)(1-\eta)^{-1} < 1$. From (7.4) in Corollary 3, we see that if $(\gamma+2\eta)(1-\eta)^{-1} < 1$, then the perturbed relative gap will be at least

$$\mathrm{relgap}(\lambda_i) - \frac{3\cdot2^{1/2}\cdot n\cdot\eta}{1-(\gamma+2\eta)(1-\eta)^{-1}}.$$

In order for the perturbed relative gap to be zero, this lower bound will have to be nonpositive, and so

$$\eta \ge \frac{\mathrm{relgap}(\lambda_i)}{3\cdot2^{1/2}\cdot n}\cdot\big(1-(\gamma+2\eta)(1-\eta)^{-1}\big).$$

Solving for the smallest $\eta$ satisfying this inequality yields the desired lower bound. When $\mathrm{relgap}(\lambda_i) \ll 1$ so that $\eta$ is small enough that $(\gamma+2\eta)(1-\eta)^{-1} \approx \gamma$, we get $\eta \approx \mathrm{relgap}(\lambda_i)(1-\gamma)/(3\cdot2^{1/2}\cdot n)$ as desired. ∎

The same results could have been obtained using the general machinery of differential inequalities in [7], but these proofs are more straightforward.


9. Algorithms for the Bidiagonal Singular Value Decomposition

In this section we discuss algorithms capable of attaining the high relative accuracy inherent in the data as described in Theorem 1. Most of this work has appeared elsewhere, but since we will need the results in the next section, we outline them here. The three classes of algorithms we will discuss are QR, bisection, and divide and conquer.

The standard QR iteration [12] as implemented in LINPACK [3] does not compute all singular values to high relative precision. It may be modified, however, to achieve this as described in [9]. Briefly, the idea is to use a zero shift in a QR sweep when a tiny singular value is present. It turns out this zero-shift QR can be implemented in a forward stable way that only introduces small relative errors in each entry of the bidiagonal matrix. Corollary 1 of Theorem 1 then guarantees the singular values are not changed significantly. The standard convergence criterion must also be changed to guarantee high relative accuracy; see [9] for details. The resulting algorithm is not only more accurate than the standard implementation but faster on the average; this is because the zero-shift QR sweep contains significantly fewer floating point operations than shifted QR. It was conjectured in [9] that this modified QR algorithm computes the singular vectors as accurately as the "relative gap" error bounds of section 7 permit; this conjecture is supported by Proposition 7 and will be proven completely in [6] (see section 7 for discussion).

Bisection is another method that guarantees high relative accuracy. An error analysis of the Sturm sequence recurrence for counting the number of singular values of a bidiagonal matrix in an interval [14] shows that it computes the exact number of singular values for a matrix differing from the original one only by small relative perturbations in each entry; Corollary 1 of Theorem 1 then guarantees high relative accuracy again. (A sketch of this counting step appears at the end of this section.)

Divide and conquer [13] has not yet been shown to achieve high relative accuracy, at least without resorting to extended precision arithmetic in the inner loop. Achieving this accuracy is a current area of research.
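As mentioned above, here is a minimal sketch of the Sturm counting step used by bisection. It uses the standard Golub-Kahan embedding of the bidiagonal matrix in a symmetric tridiagonal matrix with zero diagonal whose eigenvalues are ±σ_i; the function name is ours, and the unguarded division is a simplification of this sketch (the recurrence analyzed in [14] treats zero pivots carefully):

    import numpy as np

    def count_singvals_below(d, e, x):
        # Number of singular values < x (x > 0) of the bidiagonal matrix with
        # diagonal d and superdiagonal e.  Works on the Golub-Kahan tridiagonal
        # (zero diagonal, offdiagonals d1, e1, d2, e2, ..., dn), whose eigenvalues
        # are +-sigma_i, via the LDL^T pivot (Sturm) recurrence.  Sketch only.
        off = np.empty(2 * len(d) - 1)
        off[0::2], off[1::2] = d, e        # interleave the bidiagonal entries
        t = -x                             # first pivot of T_GK - x*I
        neg = 1                            # t < 0 always, since x > 0
        for b in off:
            t = -x - b * b / t             # next pivot (unguarded: sketch)
            if t < 0:
                neg += 1
        return neg - len(d)                # subtract the n eigenvalues -sigma_i

Bisection with this count on an interval containing the spectrum then locates any desired σ_k; since the count is exact for a bidiagonal matrix differing only by small relative entry perturbations, Corollary 1 of Theorem 1 again gives high relative accuracy.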

10. Algorithms for the Symmetric Tridiagonal Eigenproblem

In this section we present algorithms for computing eigenvalues of γ-s.d.d. symmetric tridiagonal matrices to high relative accuracy. We note that reducing a dense γ-s.d.d. symmetric matrix to tridiagonal form will not generally preserve diagonal dominance or the accuracy of the eigenvalues; thus the algorithms in this section are suitable only when the original matrix is tridiagonal. If the original matrix is dense, the algorithm in the next section should be used. Briefly, bisection can always be used to find the eigenvalues accurately. If the matrix is positive definite as well, Cholesky factorization followed by the algorithm of the last section applied to the bidiagonal Cholesky factor can be used. QR does not seem to work in general, but may if the matrix is strongly diagonally dominant and monotonically graded. It is still an open question whether divide and conquer techniques [5, 11] can achieve high relative accuracy.

As with the bidiagonal singular value problem, the standard implementation of QR for tridiagonal matrices does not guarantee high relative accuracy in the computed eigenvalues for symmetric γ-s.d.d. matrices, even if the change in convergence criterion suggested in section 6 is adopted. Even if a zero-shifted QR algorithm like the one used for the bidiagonal singular value decomposition is used, relative accuracy is lost (in numerical experiments on a strongly s.d.d. positive definite matrix, negative eigenvalues were computed). However, recent work by Le and Parlett [16] gives some hope that zero-shifted tridiagonal QR may sometimes be used in the same way as the bidiagonal zero-shifted QR to compute all eigenvalues to high relative accuracy. Their work shows that the inner loop of the standard QR iteration may be modified to provide componentwise relative stability in the following sense: in floating point this modified zero-shifted QR is equivalent to making small relative perturbations in each entry of the tridiagonal matrix, performing QR exactly on this perturbed matrix, and again making small relative perturbations in each entry of the resulting matrix. This is a much stronger kind of stability than the usual kind described in paragraph (1.2) of the introduction. If


the original and final tridiagonals are also γ-s.d.d., Proposition 4 would imply that their eigenvalues agree to high relative accuracy. Thus QR, combined with the stopping criterion (6.4), could be used to compute all eigenvalues to high relative accuracy, but only if all the matrices produced in the course of the iteration were γ-s.d.d. Unfortunately, numerical experiments show this is not the case in general, and may only be true if the original matrix is very strongly diagonally dominant and monotonically scaled (H_ii ≥ H_{i+1,i+1}). Thus, we do not expect to be able to use QR based algorithms for the γ-s.d.d. tridiagonal eigenproblem in general.

If we limit ourselves to positive definite matrices T, the following QR based approach, which originally appeared in [10], will work. Recall that a tridiagonal matrix T is positive definite if and only if it has a positive diagonal and is γ-s.d.d. for some γ < 1.

Algorithm 2: Computing the eigenvalues of a positive definite tridiagonal matrix T:
1) Compute the Cholesky factorization T = LL^T.
2) Find the singular values of the bidiagonal matrix L using the bidiagonal QR algorithm of section 9.
3) Square the singular values of L to get the eigenvalues of T.
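For concreteness, here is a sketch of Algorithm 2 in Python; the names are ours, and np.linalg.svd applied to the (dense) bidiagonal factor stands in for the zero-shift bidiagonal QR of section 9, so the sketch shows the structure of the algorithm rather than its full accuracy guarantee:

    import numpy as np

    def eigvals_pd_tridiag(a, b):
        # Eigenvalues of the positive definite tridiagonal T with diagonal a and
        # offdiagonal b, via Algorithm 2: bidiagonal Cholesky factor L, then the
        # squared singular values of L.
        n = len(a)
        ld = np.empty(n)                       # l_{i,i}
        ls = np.empty(n - 1)                   # l_{i+1,i}
        ld[0] = np.sqrt(a[0])
        for i in range(n - 1):                 # the recurrence of Proposition 10 below
            ls[i] = b[i] / ld[i]
            ld[i + 1] = np.sqrt(a[i + 1] - ls[i] ** 2)
        L = np.diag(ld) + np.diag(ls, -1)      # formed densely only for this sketch
        sigma = np.linalg.svd(L, compute_uv=False)
        return np.sort(sigma ** 2)             # eigenvalues of T = L L^T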

To show that this method is viable, one needs to show that scaled diagonal dominance is sufficient to guarantee that the squares of the exact singular values of L are all relatively close to the eigenvalues of A; the algorithms of section 9 then guarantee that we can compute the singular values of L to high relative accuracy. To this end we present a backward error analysis of the Cholesky factorization of a positive definite symmetric tridiagonal matrix A. Our goal is to show that the computed Cholesky factor L is a small componentwise relative perturbation of the exact Cholesky factor of a small componentwise relative perturbation of A. We assume the usual model of floating point arithmetic, fl(a op b) = (a op b)·(1+e), where |e| ≤ ε and op ∈ {+, −, ×, /}, and that the floating point square root function sqrt satisfies sqrt(a) = (1+e)·a^{1/2}, where |e| ≤ ε.

Proposition 10: Let A be an n by n positive definite symmetric tridiagonal matrix with diagonal entries a_1, ..., a_n and offdiagonal entries b_1, ..., b_{n−1}. Let L be the computed Cholesky factor from the following algorithm:

    l_{1,1} = sqrt(a_1)
    for i = 1 to n−1
        l_{i+1,i} = b_i / l_{i,i}
        l_{i+1,i+1} = sqrt(a_{i+1} − l_{i+1,i}^2)
    endfor

Then, barring over/underflow and attempts to take square roots of negative arguments, L is the exact Cholesky factor of Ã = A + δA = LL^T, where |δA_ij| ≤ g(ε)|A_ij|, and

    g(ε) = 3ε + 3ε^2 + ε^3 + (4ε + 6ε^2 + 4ε^3 + ε^4)·(1+ε)^3/(1−ε)^4 = 7ε + O(ε^2) .    (10.1)

Proof: We construct Ã = A + δA as follows. Subscripted ε's denote independent quantities bounded in magnitude by ε.

    l_{1,1} = fl(sqrt(a_1)) = (1+ε_{11}) a_1^{1/2} = ((1+ε_{11})^2 a_1)^{1/2} ≡ ã_1^{1/2}

    l_{i,i−1} = fl(b_{i−1}/l_{i−1,i−1}) = (1+ε_{i1}) b_{i−1}/l_{i−1,i−1} ≡ b̃_{i−1}/l_{i−1,i−1}

    l_{i,i} = fl(sqrt(a_i − l_{i,i−1}^2))
            = (1+ε_{i2})·((1+ε_{i3})(a_i − (1+ε_{i4}) l_{i,i−1}^2))^{1/2}
            = ((1+ε_{i2})^2 (1+ε_{i3}) a_i − (1+ε_{i2})^2 (1+ε_{i3})(1+ε_{i4}) l_{i,i−1}^2)^{1/2}
            ≡ ((1+g_{i1}) a_i − (1+g_{i2}) l_{i,i−1}^2)^{1/2}

where |g_{i1}| ≤ 3ε + 3ε^2 + ε^3 = 3ε + O(ε^2) and |g_{i2}| ≤ 4ε + 6ε^2 + 4ε^3 + ε^4 = 4ε + O(ε^2). By assumption (1+g_{i2}) l_{i,i−1}^2 < (1+g_{i1}) a_i, so we may write g_{i2} l_{i,i−1}^2 ≡ g_{i3} a_i, where

    |g_{i3}| ≤ |g_{i2}|·(1+g_{i1})/(1+g_{i2}) ≤ (4ε + 6ε^2 + 4ε^3 + ε^4)·(1+ε)^3/(1−ε)^4 = 4ε + O(ε^2) .

Thus

    l_{i,i} = ((1+g_{i1}−g_{i3}) a_i − l_{i,i−1}^2)^{1/2} ≡ ((1+g_{i4}) a_i − l_{i,i−1}^2)^{1/2} ≡ (ã_i − l_{i,i−1}^2)^{1/2} ,

where

    |g_{i4}| ≤ 3ε + 3ε^2 + ε^3 + (4ε + 6ε^2 + 4ε^3 + ε^4)·(1+ε)^3/(1−ε)^4 = 7ε + O(ε^2)

as desired. `

Theorem 8: Let A be an n by n positive definite symmetric tridiagonal matrix, L its computed Cholesky factor as in Proposition 10, and g(ε) as in (10.1). Let γ < 1 be the scaled diagonal dominance of A. Let σ_1 ≤ ... ≤ σ_n be the singular values of L and λ_1 ≤ ... ≤ λ_n be the eigenvalues of A. Then

    1 − g(ε)/(1−γ) ≤ λ_i(A)/σ_i^2(L) ≤ 1 + g(ε)/(1−γ) .

For example, when ε < .001, the upper and lower bounds are bounded by 1 ± [7.04ε/(1−γ)].

Proof: Combine Corollary 1 of Theorem 1, Theorem 2 and Proposition 10. `

It is easy to find T for which Algorithm 2 computes all the eigenvalues accurately, but where the standard QR iteration [17] loses all accuracy on the smallest eigenvalues. Since the bidiagonal SVD algorithm of section 9 can compute the singular vectors as accurately as the "relative gap" error bound permits (see the discussion of section 9), Algorithm 2 will compute the eigenvectors of T as accurately as Theorem 6 permits.

Bisection is a viable algorithm for all s.d.d. tridiagonal matrices. An error analysis similar to the one for bidiagonal Sturm sequences shows that Sturm sequences can find accurate eigenvalues of T + δT, where δT causes only small relative perturbations in each entry of T [14]. No pivoting is required, so Sturm sequence evaluation can be done in linear time with no storage beyond that needed for T. Together with Theorems 2 or 4, this implies that the computed eigenvalues all have high relative accuracy if the matrix is symmetric tridiagonal γ-s.d.d. As in the bidiagonal case, the ability of divide and conquer algorithms [11] to achieve high relative accuracy is an open problem.
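As a quick arithmetic check of the constant quoted after Theorem 8, one can evaluate g(ε) from (10.1) directly:

    eps = 1e-3
    g = (3*eps + 3*eps**2 + eps**3
         + (4*eps + 6*eps**2 + 4*eps**3 + eps**4) * (1 + eps)**3 / (1 - eps)**4)
    print(g / eps)    # about 7.037, consistent with the bound 7.04 when eps < .001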

11. Algorithms for the Dense Symmetric Eigenproblem


First we present a new algorithm (or rather a new analysis of an old algorithm) for finding accurate eigenvalues of (possibly dense) symmetric s.d.d. matrices. The algorithm is based on bisection, but rather than computing the LDL^T factorization of H − xI and using the number of negative D_ii to compute the number of eigenvalues of H less than x, we compute the LDL^T factorization of A − x∆^{-2}, where H = ∆A∆.

Second, we show that a suitable variation of inverse iteration can compute the eigenvectors of a symmetric s.d.d. matrix to the limiting accuracy of the "relative gap" error bounds in Theorems 6 and 7.

Algorithm 3: Stably computing the inertia of a shifted γ-scaled diagonally dominant symmetric matrix H − xI:
(0) We assume as before that A_ii = ±1. We consider only x > 0; for x < 0 consider −H − xI.
(1) Permute the rows and columns of A − x∆^{-2} and partition it as

    [ A_11 − x∆_1^{-2}   A_12             ]
    [ A_21               A_22 − x∆_2^{-2} ]

so that if a − xd^{-2} is a diagonal entry of A_11 − x∆_1^{-2}, then either a = −1, or a = 1 and xd^{-2} ≥ 2.
(2) Compute X = A_22 − x∆_2^{-2} − A_21 (A_11 − x∆_1^{-2})^{-1} A_12.
(3) Compute inertia(X) = (n, z, p) using a stable pivoting scheme such as in [4]. Here n is the number of negative eigenvalues, z the number of zero eigenvalues, and p the number of positive eigenvalues of X.
(4) The inertia of H − xI is (n + dim(A_11), z, p).
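The following sketch mirrors the four steps of Algorithm 3, with scipy.linalg.ldl (a Bunch-Kaufman style factorization in the spirit of [4]) supplying the inertia of the Schur complement; the function names, the zero tolerance, and the bisection driver are our illustrative choices, and Theorem 9 below is what justifies the bisection wrapper:

    import numpy as np
    from scipy.linalg import ldl

    def inertia_shifted(A, delta, x):
        # Inertia (n, z, p) of H - x*I, H = diag(delta) @ A @ diag(delta),
        # with A_ii = +-1 and x > 0, following Algorithm 3.  A sketch only.
        dinv2 = delta ** -2.0
        M = A - x * np.diag(dinv2)                   # A - x*Delta^{-2}
        # step (1): rows with a = -1, or a = 1 and x*d^{-2} >= 2, form block 11
        p1 = np.flatnonzero((np.diag(A) < 0) | (x * dinv2 >= 2.0))
        p2 = np.setdiff1d(np.arange(len(A)), p1)
        if p1.size:                                  # step (2): Schur complement
            X = M[np.ix_(p2, p2)] - M[np.ix_(p2, p1)] @ np.linalg.solve(
                M[np.ix_(p1, p1)], M[np.ix_(p1, p2)])
        else:
            X = M
        _, D, _ = ldl(X)                             # step (3): stable LDL^T
        ev = np.linalg.eigvalsh(D)                   # signs of the 1x1/2x2 blocks
        tol = 1e-14 * max(1.0, np.abs(ev).max())
        n, z, p = (ev < -tol).sum(), (np.abs(ev) <= tol).sum(), (ev > tol).sum()
        return int(n) + p1.size, int(z), int(p)      # step (4)

    def eigenvalue_by_bisection(A, delta, k, lo, hi, reltol=1e-14, maxit=200):
        # k-th smallest eigenvalue of H inside the positive bracket [lo, hi],
        # by bisection on the inertia count; the relative stopping test keeps
        # relative accuracy for tiny eigenvalues (illustrative driver).
        for _ in range(maxit):
            mid = 0.5 * (lo + hi)
            if inertia_shifted(A, delta, mid)[0] >= k:
                hi = mid
            else:
                lo = mid
            if hi - lo <= reltol * abs(mid):
                break
        return 0.5 * (lo + hi)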

Theorem 9: Let H = ∆A∆ be a γ-scaled diagonally dominant symmetric matrix and x > 0 a real scalar. Algorithm 3 computes the exact inertia of a matrix H + δH − xI, where δH = ∆δA∆ and ||δA|| = O(ε), ε being the machine precision. Thus, Algorithm 3 can be used in a bisection algorithm to find all the eigenvalues of H to high relative accuracy.

Proof: The partitioning guarantees that the diagonal entries of A_11 − x∆_1^{-2} are all less than or equal to −1. Therefore, all the eigenvalues of A_11 − x∆_1^{-2} are less than or equal to −1 + γ. Since X is defined so that

    [ A_11 − x∆_1^{-2}   A_12             ]
    [ A_21               A_22 − x∆_2^{-2} ]

        = [ I                               0 ]   [ A_11 − x∆_1^{-2}   0 ]   [ I   (A_11 − x∆_1^{-2})^{-1} A_12 ]
          [ A_21 (A_11 − x∆_1^{-2})^{-1}    I ] · [ 0                  X ] · [ 0   I                            ] ,

the inertia of A − x∆^{-2} is equal to

    inertia(A − x∆^{-2}) = inertia(X) + inertia(A_11 − x∆_1^{-2}) = inertia(X) + (dim(A_11), 0, 0)

by Sylvester's Theorem. The algorithm in [4] will compute the exact inertia of X + δX, where ||δX|| = O(ε)||X||. Thus if we show that ||X|| is of order 1 (note that 1−γ ≤ ||A_22|| ≤ 1+γ), we will be done. To this end we note that by construction ||x∆_2^{-2}|| ≤ 2, σ_min(A_11 − x∆_1^{-2}) ≥ 1−γ, and ||A_12|| = ||A_21|| ≤ γ. Thus

    ||X|| ≤ 1+γ + 2 + γ^2/(1−γ) ≤ 3/(1−γ)

as desired. `

In the case of pencils, there is an analogous algorithm. If H − λM = ∆_H A_H ∆_H − λ∆_M A_M ∆_M is a γ-s.d.d. definite pencil, we compute the LDL^T decomposition of A_H − x∆_H^{-1}∆_M A_M ∆_M ∆_H^{-1} in order to count the number of eigenvalues less than x. Henceforth we will assume without loss of generality that H − λM = H − λ∆A∆ with |H_ii| = A_ii = 1.


Algorithm 4: Stably computing the inertia of a shifted γ-scaled diagonally dominant definite pencil H − xM:
(0) As stated above, we assume without loss of generality that M = ∆A∆ with |H_ii| = A_ii = 1. We also consider only x > 0; for x < 0 consider −H − xM.
(1) Permute the rows and columns of H − x∆A∆ and partition it as

    [ H_11 − x∆_1 A_11 ∆_1   H_12 − x∆_1 A_12 ∆_2 ]
    [ H_21 − x∆_2 A_21 ∆_1   H_22 − x∆_2 A_22 ∆_2 ]

so that if h − xd^2 is a diagonal entry of H_11 − x∆_1 A_11 ∆_1, then xd^2 ≥ μ ≡ 2(1+γ)/(1−γ).
(2) Compute

    X = H_22 − x∆_2 A_22 ∆_2 − (H_21 − x∆_2 A_21 ∆_1)(H_11 − x∆_1 A_11 ∆_1)^{-1}(H_12 − x∆_1 A_12 ∆_2) .

(3) Compute inertia(X) = (n, z, p) using a stable pivoting scheme such as in [4].
(4) The inertia of H − xM is (n + dim(A_11), z, p).

Theorem 10: Let H − λM = ∆_H A_H ∆_H − λ∆_M A_M ∆_M be a γ-s.d.d. definite pencil and x > 0 a real scalar. Algorithm 4 computes the exact inertia of H + δH − xM, where δH = ∆δA∆ and ||δA|| = O(ε), ε being the machine precision. Thus, Algorithm 4 can be used in a bisection algorithm to find all the eigenvalues of H − λM to high relative accuracy.

Proof: Let Y = H_11 − x∆_1 A_11 ∆_1, and define K = x^{-1}∆_1^{-1} H_11 ∆_1^{-1} − A_11, so that Y = x∆_1 K ∆_1. Now if λ(K) is any eigenvalue of K, we have

    λ(K) ≤ λ_max(x^{-1}∆_1^{-1} H_11 ∆_1^{-1}) − λ_min(A_11) ≤ μ^{-1}(1+γ) − (1−γ) = −(1−γ)/2

and ||K^{-1}|| ≤ 2/(1−γ). Thus Y is also nonsingular, with all negative eigenvalues. Since X and Y are defined so that

    [ H_11 − x∆_1 A_11 ∆_1   H_12 − x∆_1 A_12 ∆_2 ]
    [ H_21 − x∆_2 A_21 ∆_1   H_22 − x∆_2 A_22 ∆_2 ]

        = [ I                              0 ]   [ Y   0 ]   [ I   Y^{-1}(H_12 − x∆_1 A_12 ∆_2) ]
          [ (H_21 − x∆_2 A_21 ∆_1) Y^{-1}   I ] · [ 0   X ] · [ 0   I                            ] ,

we have

    inertia(H − xM) = inertia(X) + inertia(Y) = inertia(X) + (dim(A_11), 0, 0)

by Sylvester's Theorem. The algorithm in [4] will compute the exact inertia of X + δX, where ||δX|| = O(ε)||X||. Thus if we show that ||X|| is of order 1 (note that 1−γ ≤ ||H_22|| ≤ 1+γ), we will be done. To this end we write

    ||X|| ≤ ||H_22 − x∆_2 A_22 ∆_2|| + ||H_21 Y^{-1} H_12|| + ||x∆_2 A_21 ∆_1 Y^{-1} H_12||
              + ||H_21 Y^{-1} x∆_1 A_12 ∆_2|| + ||x∆_2 A_21 ∆_1 Y^{-1} x∆_1 A_12 ∆_2||
          = ||H_22 − x∆_2 A_22 ∆_2|| + x^{-1}||H_21 ∆_1^{-1} K^{-1} ∆_1^{-1} H_12|| + ||∆_2 A_21 K^{-1} ∆_1^{-1} H_12||
              + ||H_21 ∆_1^{-1} K^{-1} A_12 ∆_2|| + ||x∆_2 A_21 K^{-1} A_12 ∆_2|| .

Using the facts that ||A_12||, ||A_21||, ||H_12||, ||H_21|| ≤ γ, ||K^{-1}|| ≤ 2/(1−γ), ||∆_2||·||∆_1^{-1}|| ≤ 1, x^{-1}||∆_1^{-1}||^2 ≤ μ^{-1}, and x||∆_2||^2 ≤ μ, we get

    ||X|| ≤ 1+γ + (1+γ)μ + γ^2(2/(1−γ))μ^{-1} + γ^2(2/(1−γ)) + γ^2(2/(1−γ)) + γ^2(2/(1−γ))μ


         ≤ 14/(1−γ)^2

as desired. `

Now we present a variation on inverse iteration which can compute the eigenvectors of a symmetric s.d.d. matrix to the limiting accuracy of the "relative gap" error bounds of Theorems 6 and 7. A similar algorithm applies to pencils:

Algorithm 5: Inverse iteration for computing the eigenvector x of a symmetric s.d.d. matrix H = ∆A∆ corresponding to the eigenvalue z:
(0) We assume the eigenvalue z has been computed accurately using one of the previous algorithms.
(1) Choose a unit starting vector y_0; set i = 0.
(2) Compute the LDL^T factorization of P^T(A − z∆^{-2})P, where P is the same permutation as in Algorithm 3.
(3) Repeat
        i = i + 1
        solve (A − z∆^{-2}) ỹ_i = y_{i−1} for ỹ_i using the LDL^T factorization of step (2)
        r = 1/||ỹ_i||
        y_i = r·ỹ_i
    until (r = O(ε))
(4) x = ∆^{-1} y_i
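A minimal sketch of Algorithm 5 follows, with two stated simplifications: an LU factorization of the shifted matrix stands in for the LDL^T factorization of step (2), and the stopping constant is an illustrative stand-in for "r = O(ε)":

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    def sdd_inverse_iteration(A, delta, z, tol=1e-14, maxit=50):
        # Eigenvector of H = diag(delta) @ A @ diag(delta) for an accurately
        # computed eigenvalue z, by inverse iteration on A - z*Delta^{-2}.
        n = len(A)
        fac = lu_factor(A - z * np.diag(delta ** -2.0))  # factor once and reuse
        scale = np.linalg.norm(A, np.inf)                # crude size for the test
        y = np.ones(n) / np.sqrt(n)                      # unit starting vector
        for _ in range(maxit):
            yt = lu_solve(fac, y)                        # solve (A - z*Delta^{-2}) yt = y
            r = 1.0 / np.linalg.norm(yt)                 # small r: yt is nearly a null vector
            y = r * yt
            if r <= tol * scale:                         # stand-in for "r = O(eps)"
                break
        return y / delta                                 # step (4): x = Delta^{-1} y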

Theorem 11: Suppose Algorithm 5 terminates with x as the computed eigenvector of the symmetric s.d.d. matrix H = ∆A∆. Then there is a diagonal matrix D with D_ii = 1 + O(ε), ε the machine precision, and a matrix δA, ||δA|| = O(ε), such that Dx is an exact eigenvector of ∆(A + δA)∆. Thus, the error in x is bounded by the results in Theorems 6 and 7.

Sketch of Proof: Let ỹ_i be the computed solution of (A − z∆^{-2}) ỹ_i = y_{i−1} at the last iteration of Algorithm 5. Applying the error analysis of the proof of Theorem 4, one can show that there is a diagonal matrix D, D_ii = 1 + O(ε), and an E, ||E|| = O(ε), such that D(A − z∆^{-2} + E)D ỹ_i = y_{i−1}. Applying the result in [2], we can assume E is symmetric. Since Algorithm 5 guarantees r = 1/||ỹ_i|| = O(ε), another application of the result in [2] guarantees the existence of a symmetric F, ||F|| = O(ε), such that (A − z∆^{-2} + E + F)D ỹ_i = 0. Thus Dy_i (which is proportional to D ỹ_i) is an exact eigenvector of A + E + F − λ∆^{-2} for λ = z, and Dx = D∆^{-1} y_i is an exact eigenvector of ∆(A + δA)∆ with ||δA|| = ||E + F|| = O(ε), as desired. `

12. Application to Differential Operators

Consider the n by n second central difference matrix

    H_n = h^{-2} · tridiag(−1, 2, −1) ,

the tridiagonal matrix with 2h^{-2} on the diagonal and −h^{-2} on the offdiagonals, which arises from discretizing −d^2/dx^2 at equally spaced grid points x_i = ih, 1 ≤ i ≤ n. One easily computes that 1−γ for H_n is 1 − cos(π/(n+1)), which approaches 0 as n → ∞. This is to be expected since H_n approximates an unbounded operator, and so has a wider and wider range of eigenvalues as n → ∞. Since the diagonal of H_n is constant, nothing is gained by writing H_n = ∆A∆, A_ii = 1, and our perturbation theory merely says that all the eigenvalues are at least as sensitive as the smallest one. Our theory becomes more interesting when considering unevenly spaced


grid points x_i. For example, let h_i = x_i − x_{i−1}, and suppose h_{i+1}/h_i ≡ β for all i; this corresponds to a uniformly graded grid. Then the corresponding H_n(β) has diagonal entries ranging from 2h_1^{-2}β^{-1} to 2h_1^{-2}β^{-2n+1}. One can show that γ(β) for H_n(β) satisfies γ(β) ≤ 2(2 + β + β^{-1})^{-1/2}, so that the eigenvalues of H_n(β) are always at least as accurately determined as the eigenvalues of H_n. In fact, if β ≠ 1, then 1 − γ(β) is bounded away from 0 for all n.
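To see the grading numerically, one can tabulate the diagonal entries 2/(h_i h_{i+1}) of H_n(β) and evaluate the bound on γ(β); this is a sketch that forms only the diagonal, with illustrative grid parameters:

    import numpy as np

    def graded_diagonal(n, h1, beta):
        # Diagonal entries 2/(h_i h_{i+1}) of H_n(beta) on a geometrically graded
        # grid with h_{i+1} = beta * h_i; they range from 2*h1**-2/beta down to
        # 2*h1**-2*beta**(1 - 2*n) when beta > 1.  (Offdiagonals omitted: sketch.)
        h = h1 * beta ** np.arange(n + 1)          # spacings h_1, ..., h_{n+1}
        return 2.0 / (h[:-1] * h[1:])

    beta = 2.0
    print(graded_diagonal(5, 1.0, beta))
    print(2.0 * (2.0 + beta + 1.0 / beta) ** -0.5)  # bound on gamma(beta); < 1 for beta != 1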


13. Summary and Future Work

We have shown that there are a number of situations where tiny eigenvalues and singular values can be determined much more accurately than standard perturbation theorems and numerical algorithms can guarantee. This is true for singular values of bidiagonal matrices, eigenvalues of symmetric s.d.d. matrices, and eigenvalues of s.d.d. definite pencils. In addition, eigenvectors of symmetric s.d.d. matrices corresponding to relatively isolated eigenvalues are determined accurately by the data. Open questions remain in the perturbation theory for the singular vectors of bidiagonal matrices and in the perturbation theory for eigenvalues of s.d.d. pencils with positive and negative eigenvalues.

The following is a tabular summary of the current state of research into corresponding high accuracy algorithms. We consider two classes of algorithms for eigenvalues and singular values (bisection and QR), and two classes of algorithms for eigenvectors and singular vectors (inverse iteration using accurate eigenvalues/singular values, and QR). If an algorithm was presented in earlier research, a reference is given; otherwise it was discussed here for the first time. Conjectured algorithms are also indicated. (Divide and conquer is another technique for these problems. Since its ability to deliver high accuracy has not been proven in any of the cases considered in this paper, it qualifies as a "conjectured" algorithm for all of them.)

Bisection based algorithms for computing eigenvalues and singular values to high relative accuracy:
1) Singular values of bidiagonal matrices [9].
2) Eigenvalues of symmetric tridiagonal s.d.d. matrices (Theorem 4 and [14]).
3) Eigenvalues of not necessarily tridiagonal symmetric s.d.d. matrices (Algorithm 3).
4) Eigenvalues of s.d.d. definite pencils (Algorithm 4).

QR based algorithms for computing eigenvalues and singular values to high relative accuracy:
1) Singular values of bidiagonal matrices [9].
2) Eigenvalues of symmetric positive definite tridiagonal matrices (Algorithm 2 and [10]).
3) Eigenvalues of symmetric indefinite tridiagonal scaled diagonally dominant matrices: no QR based algorithm appears to work in general.

Inverse iteration based algorithms for computing eigenvectors and singular vectors accurately:
1) Eigenvectors of symmetric s.d.d. matrices and pencils (Algorithm 5).

QR based algorithms for computing eigenvectors and singular vectors accurately:
1) Conjectured: singular vectors of bidiagonal matrices (Proposition 7 and [6]).
2) Eigenvectors of symmetric positive definite tridiagonal s.d.d. matrices (Algorithm 2 and [6]). (Conjectured: the finer error bounds of Theorem 7.)
3) Eigenvectors of symmetric indefinite tridiagonal s.d.d. matrices: since the eigenvalues apparently cannot be computed accurately, neither can the eigenvectors.

In summary, various numerical algorithms are available to compute eigenvalues and eigenvectors, and singular values and singular vectors with high accuracy. Algorithm 2 for the symmetric positive de®nite tridiagonal eigenproblem will be incorporated in the LAPACK linear algebra library [8]. Not all algorithmic questions have been settled, however, and these will be the subject of future research.


References

[1] M. Blevins and G. W. Stewart, Calculating the Eigenvectors of Diagonally Dominant Matrices, J. Assoc. Comput. Mach., Vol. 21, No. 2, April 1974, pp. 261-271.
[2] J. Bunch, J. Demmel, and C. Van Loan, The Strong Stability of Algorithms for Solving Symmetric Linear Systems, to appear in SIAM J. Matrix Anal. Appl.
[3] J. Bunch, J. Dongarra, C. B. Moler, and G. W. Stewart, LINPACK Users' Guide, SIAM, Philadelphia, 1979.
[4] J. Bunch and L. Kaufman, Some Stable Methods for Calculating Inertia and Solving Symmetric Linear Systems, Math. Comp., Vol. 31, No. 137, Jan. 1977, pp. 163-179.
[5] J. J. M. Cuppen, A Divide and Conquer Method for the Symmetric Tridiagonal Eigenproblem, Numer. Math. 36, pp. 177-195, 1981.
[6] P. Deift, J. Demmel, L.-C. Li, and C. Tomei, The Bidiagonal Singular Value Decomposition and Hamiltonian Mechanics, in preparation.
[7] J. Demmel, On Condition Numbers and the Distance to the Nearest Ill-posed Problem, Numer. Math., v. 51, n. 3, pp. 251-289, July 1987.
[8] J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, and D. Sorensen, Prospectus for the Development of a Linear Algebra Library for High-Performance Computers, Argonne National Lab, Mathematics and Computer Science Division, ANL-MCS-TM-97, Sept. 1987.
[9] J. Demmel and W. Kahan, Accurate Singular Values of Bidiagonal Matrices, to appear in SIAM J. Sci. Stat. Comp. (updated version of [10]).
[10] J. Demmel and W. Kahan, Computing Small Singular Values of Bidiagonal Matrices with Guaranteed High Relative Accuracy, LAPACK Working Note #3, ANL-MCS-TM-110, Argonne National Laboratory, February 1988.
[11] J. J. Dongarra and D. C. Sorensen, A Fully Parallel Algorithm for the Symmetric Eigenproblem, SIAM J. Sci. Stat. Comput. 8, pp. 139-154, 1987.
[12] G. Golub and C. Van Loan, Matrix Computations, The Johns Hopkins University Press, 1983.
[13] E. R. Jessup and D. C. Sorensen, A Parallel Algorithm for Computing the Singular Value Decomposition of a Matrix, ANL-MCS-TM-102, Argonne National Laboratory, December 1987.
[14] W. Kahan, Accurate Eigenvalues of a Symmetric Tridiagonal Matrix, Technical Report CS41, Computer Science Dept., Stanford University, 1966, revised 1968.
[15] T. Kato, Perturbation Theory for Linear Operators, Springer-Verlag, New York, 1966.
[16] J. Le and B. Parlett, On the Forward Instability of the QR Transformation, submitted to SIAM J. Matrix Anal. Appl.; also Report PAM-419, Center for Pure and Applied Mathematics, University of California, Berkeley, July 1988.
[17] B. Parlett, The Symmetric Eigenproblem, Prentice Hall, 1980.
[18] R. S. Varga, Matrix Iterative Analysis, Prentice Hall, 1962.
[19] J. H. Wilkinson, The Algebraic Eigenvalue Problem, Clarendon Press, Oxford, 1965.

