THEME ARTICLE

THE QR ALGORITHM

After a brief sketch of the early days of eigenvalue hunting, the author describes the QR Algorithm and its major virtues. The symmetric case brings with it guaranteed convergence and an elegant implementation. An account of the impressive discovery of the algorithm brings the article to a close.

Beresford N. Parlett
University of California, Berkeley

I assume you share the view that the rapid computation of a square matrix's eigenvalues is a valuable tool for engineers and scientists.1 The QR Algorithm solves the eigenvalue problem in a very satisfactory way, but this success does not mean the QR Algorithm is necessarily the last word on the subject. Machines change and problems specialize. What makes the experts in matrix computations happy is that this algorithm is a genuinely new contribution to the field of numerical analysis and not just a refinement of ideas given by Newton, Gauss, Hadamard, or Schur.

Early history of eigenvalue computations

Matrix theory dramatically increased in importance with the arrival of matrix mechanics and quantum theory in the 1920s and 1930s. In the late 1940s, some people asked themselves how the digital computer might be employed to solve the matrix eigenvalue problem. The obvious approach was to use a two-stage method. First, compute the coefficients of the characteristic polynomial, and then compute the zeros of the characteristic polynomial. There are several ingenious ways to accomplish the first stage in a number of arithmetic operations proportional to n³ or n⁴, where n is the order of the matrix.2 The second stage was a hot topic of research during this same time. Except for very small values of n, n ≤ 10, this two-stage approach is a disaster on a computer with fixed word length. The reason is that the zeros of a polynomial are, in general, incredibly sensitive to tiny changes in the coefficients, whereas a matrix's eigenvalues are often, but not always, insensitive to small uncertainties in the n² entries of the matrix. In other words, the replacement of those n² entries by the characteristic polynomial's n coefficients is too great a condensation of the data.
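The contrast is easy to demonstrate. The following numpy sketch (my illustration; none of this code comes from the article) takes the diagonal matrix with eigenvalues 1, 2, …, 20, whose characteristic polynomial is Wilkinson's famously ill-conditioned example, and compares a tiny relative perturbation of one polynomial coefficient with a comparably tiny perturbation of the matrix entries.

```python
import numpy as np

n = 20
A = np.diag(np.arange(1.0, n + 1))      # symmetric; eigenvalues are exactly 1, ..., 20
true_eigs = np.arange(1.0, n + 1)

# Route 1: characteristic polynomial, then its zeros (Wilkinson's polynomial).
coeffs = np.poly(A)                     # coefficients, highest degree first
coeffs[1] *= 1 + 1e-10                  # tiny relative change in one coefficient
root_shift = np.max(np.abs(np.sort(np.roots(coeffs).real) - true_eigs))

# Route 2: perturb the matrix entries themselves by a comparable amount.
rng = np.random.default_rng(0)
E = rng.standard_normal((n, n))
E = 1e-10 * (E + E.T) / 2               # small symmetric perturbation
eig_shift = np.max(np.abs(np.linalg.eigvalsh(A + E) - true_eigs))

print(f"shift in polynomial zeros:   {root_shift:.1e}")  # large: the zeros scatter
print(f"shift in matrix eigenvalues: {eig_shift:.1e}")   # tiny: comparable to 1e-10
```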

A radical alternative to the characteristic polynomial is the use of similarity transformations to obtain a nicer matrix with the same eigenvalues. The more entries that are zero, the nicer the matrix. Diagonal matrices are perfect, but a triangular matrix is good enough for our purposes because the eigenvalues lie on the main diagonal. For deep theoretical reasons, a triangular form is generally not attainable in a finite number of arithmetic operations. Fortunately, one can get close to triangular form with only O(n³) arithmetic operations. More precisely, a matrix is said to be upper Hessenberg if it is upper triangular with an extra set of nonzero entries just below the diagonal in positions (i + 1, i), i = 1, 2, …, n – 1; these are called the subdiagonal entries. What is more, the similarity transformations needed to obtain a Hessenberg form (it is not unique) can be chosen so that no computed matrix entry ever exceeds the norm of the original matrix. Any proper norm, such as the square root of the sum of the squares of the entries, will do.

This property of keeping intermediate quantities from becoming much bigger than the original data is important for computation with fixed-length numbers and is called stability. The hard task is to find similarity transformations that both preserve Hessenberg form and eliminate those subdiagonal entries. This is where the QR Algorithm comes to the rescue. The subroutine DHSEQR in the Lapack library embodies the latest implementation.3

Let us try to put the improvement based on QR in perspective. It has reduced the time for standard eigenvalue computations to the time required for a few matrix multiplies.

The LU and QR Algorithms

Suppose B = XY, with X invertible. Then the new matrix C := YX is similar to B, because C = YX = X⁻¹BX. Although intriguing, this observation does not appear to be of much use to eigenvalue hunters. Let us recall two well-known ways of factoring a matrix (a small numerical check of the reversed-factor similarity follows the list):

• Triangular factorization (or Gaussian elimination), B = LU, where L is lower triangular with 1's along the main diagonal and U is upper triangular. This factorization is not always possible. The multipliers in the reduction are stored in L, the reduced matrix in U.
• QR (or orthogonal triangular) factorization, B = QR, where Q is unitary, Q⁻¹ = Q* (conjugate transpose of Q), and R is upper triangular with nonnegative diagonal entries. This factorization always exists. Indeed, the columns of Q are the outputs of the Gram-Schmidt orthonormalizing process when it is executed in exact arithmetic on the columns of B = (b1, b2, …, bn).
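A few lines of numpy make the reversed-factor observation concrete (my sketch, not the article's code): the reversed product RQ equals Q*BQ exactly and therefore has the same eigenvalues as B.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))

Q, R = np.linalg.qr(B)                  # B = QR, Q orthogonal, R upper triangular
C = R @ Q                               # reverse the factors

print(np.allclose(C, Q.T @ B @ Q))      # True: C = Q*BQ is similar to B
print(np.allclose(np.sort_complex(np.linalg.eigvals(B)),
                  np.sort_complex(np.linalg.eigvals(C))))   # True: same spectrum
```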

Each factorization leads to an algorithm by iteration. The LU transform of B is UL = L⁻¹BL, and the QR transform of B is RQ = Q⁻¹BQ = Q*BQ. In general, the LU or QR transform of B will not have more zero entries than B. The rewards of using this transform come only by repetition.

Theorem 1 (Fundamental Theorem). If B's eigenvalues have distinct absolute values and the QR transform is iterated indefinitely starting from B1 = B,

  Factor Bj = QjRj,
  Form Bj+1 = RjQj, j = 1, 2, …,

then, under mild conditions on the eigenvector matrix of B, the sequence {Bj} converges to the upper triangular form (commonly called the Schur form) of B with eigenvalues in monotone decreasing order of absolute value down the diagonal.

This result is not obvious; see James Hardy Wilkinson's article for proofs.4 The procedure that generates {Bj} is called the basic QR Algorithm. A similar theorem holds for the basic LU algorithm, provided all the transforms exist. It takes several clever observations to turn this simple theory into the highly successful QR Algorithm of today.
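Theorem 1 translates directly into code. Here is a bare-bones sketch of the basic QR Algorithm (illustrative only; the name basic_qr is mine, and real library codes add the Hessenberg reduction, shifts, and deflation discussed below):

```python
import numpy as np

def basic_qr(B, steps=1000):
    """Iterate the QR transform: factor B_j = Q_j R_j, form B_{j+1} = R_j Q_j."""
    for _ in range(steps):
        Q, R = np.linalg.qr(B)
        B = R @ Q                        # each step is a similarity transform Q*BQ
    return B

rng = np.random.default_rng(2)
X = rng.standard_normal((5, 5))
A = X @ X.T + 5.0 * np.eye(5)            # positive eigenvalues, generically distinct
T = basic_qr(A)

print(np.round(T, 8))                    # nearly triangular (here diagonal) Schur form
print(np.diag(T))                        # largest eigenvalue first, as Theorem 1 says
print(np.linalg.eigvalsh(A)[::-1])       # agreement with a library solver
```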

Invariance of the Hessenberg form

If B is an upper Hessenberg matrix (entry (i, j) vanishes if i > j + 1), then so are all its QR iterates and LU iterates. This useful result is easy to see because it depends only on the pattern of zeros in the matrices and not on the values of the nonzero entries. If Bj is Hessenberg, then so is Qj, because Qj = BjRj⁻¹ and Rj⁻¹ is triangular. Consequently, RjQj (= Bj+1) is also Hessenberg. The cost of computing the QR factorization of an n × n matrix falls from O(n³) to O(n²) when the matrix is Hessenberg.5 A similar result holds for LU iterates.

Fortunately, any matrix can be reduced by similarities to Hessenberg form in O(n³) arithmetic operations in a stable way, and from this point on we will assume this reduction has been performed as an initial phase. The mild conditions mentioned in the Fundamental Theorem are satisfied by upper Hessenberg matrices with nonzero subdiagonals.6
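The invariance claim is easy to check numerically. The sketch below (mine) uses scipy's hessenberg routine for the stable O(n³) reduction mentioned above and verifies that one QR transform preserves the zero pattern:

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 6))
H = hessenberg(A)                        # upper Hessenberg, similar to A

Q, R = np.linalg.qr(H)
H_next = R @ Q                           # one basic QR transform of H

def is_hessenberg(M, tol=1e-12):
    n = M.shape[0]
    return all(abs(M[i, j]) <= tol for i in range(n) for j in range(n) if i > j + 1)

print(is_hessenberg(H), is_hessenberg(H_next))             # True True
print(np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                  np.sort_complex(np.linalg.eigvals(H))))  # reduction kept the spectrum
```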

Accelerating convergence

The Hessenberg sequences {Bj} and {Cj} produced by the basic algorithms converge linearly, and that is too slow for impatient customers. We can improve the situation by making a subtle change in our goal. Instead of looking at the matrix sequence {Bj}, we can focus on the (n, n – 1) entry of each matrix, the last subdiagonal entry. When the (n, n – 1) entry is negligible, the (n, n) entry is an eigenvalue, to within working precision, and column n does not influence the remaining eigenvalues. Consequently, the variable n can be reduced by 1, and computation continues on the smaller matrix. We say that the nth eigenvalue has been deflated. The top of the matrix need not be close to triangular form. Thus, convergence refers to the scalar sequence of (n, n – 1) entries. The rate of convergence of this sequence can be vastly improved, from linear to quadratic, by using the shifted QR Algorithm:

  Let B1 = B. For i = 1, 2, … until convergence,
    Select a shift si
    Factor Bi – siI = QiRi
    Form Bi+1 = RiQi + siI = Qi*BiQi.

In principle, each shift strategy has a different convergence theory. It is rare that more than 2n QR iterations are needed to compute all the eigenvalues of B.
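Here is a compact sketch of the shifted QR Algorithm with deflation (my illustration; it uses the simple choice si = trailing diagonal entry rather than a production shift strategy, and it works in complex arithmetic, the very cost the double shift below avoids):

```python
import numpy as np

def shifted_qr_eigvals(B, tol=1e-12, max_steps=10000):
    """Shifted QR with deflation; the shift is the trailing diagonal entry."""
    B = np.array(B, dtype=complex)
    eigs = []
    while B.shape[0] > 0 and max_steps > 0:
        m = B.shape[0]
        if m == 1 or abs(B[m - 1, m - 2]) < tol * np.linalg.norm(B):
            eigs.append(B[m - 1, m - 1])   # (n, n-1) entry negligible: deflate
            B = B[:m - 1, :m - 1]
            continue
        s = B[m - 1, m - 1]                # select a shift
        Q, R = np.linalg.qr(B - s * np.eye(m))
        B = R @ Q + s * np.eye(m)          # B_{i+1} = Q_i* B_i Q_i
        max_steps -= 1
    return np.array(eigs)

A = np.array([[3.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 1.0]])
print(np.sort(shifted_qr_eigvals(A).real))
print(np.linalg.eigvalsh(A))               # matches
```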

The double shift implementation for real matrices

There is a clever variation on the shifted QR Algorithm that I should mention. In many applications, the initial matrix is real, but some of the eigenvalues are complex. The shifted algorithm must then be implemented in complex arithmetic to achieve quadratic convergence. The man who first presented the QR Algorithm, J.G.F. Francis,7 showed how to keep all arithmetic in the real field and still retain quadratic convergence.

Let us see how it is done. Without loss, consider the first two steps:

  B1 – s1I = Q1R1
  B2 = R1Q1 + s1I
  B2 – s2I = Q2R2
  B3 = R2Q2 + s2I.

It turns out, after some manipulation, that

  (Q1Q2)(R2R1) = (B1 – s1I)(B1 – s2I)

and

  B3 = (Q1Q2)*B1(Q1Q2).

Suppose s1 and s2 are either both real or a complex conjugate pair. Then (B1 – s1I)(B1 – s2I) is real. By the uniqueness of the QR factorization, Q1Q2 is the Q factor of (B1 – s1I)(B1 – s2I) and so is real orthogonal, not just unitary. Hence, B3 is a product of three real matrices and thus real.

The next challenge is to compute B3 from B1 and s (complex) without constructing B2. The solution is far from obvious and brings us to the concept of bulge chasing, a significant component of the QR success story.

Bulge chasing

The theoretical justification comes from the Implicit Q, or Uniqueness of Reduction, property.

Theorem 2 (Uniqueness). If Q is orthogonal, B is real, and H = Q*BQ is a Hessenberg matrix in which each subdiagonal entry hᵢ₊₁,ᵢ > 0, then H and Q are determined uniquely by B and q1, the first column of Q.

Now return to the equations above and suppose s1 = s, s2 = s̄ ≠ s. If B1 is Hessenberg, then so are all the Bi's. Suppose that B3 has positive subdiagonal entries. By Theorem 2, both B3 and Q1Q2 are determined by column 1 of Q1Q2, which we'll call q. Because R2R1 is upper triangular, q is a multiple of the first column of

  B1² – 2(Re s)B1 + |s|²I.

Because B1 is Hessenberg, the vector q is zero except in its top three entries; a numerical check appears at the end of this section.

The following three-stage algorithm computes B3. It uses orthogonal matrices Hj, j = 1, 2, …, n – 1, such that Hj is the identity except for a 3 × 3 submatrix in rows and columns j, j + 1, j + 2; Hn–1 differs from I only in its trailing 2 × 2 submatrix.

1. Compute the first three entries of q.
2. Compute a matrix H1 such that H1ᵗq is a multiple of e1, the first column of I. Form C1 = H1ᵗB1H1. It turns out that C1 is upper Hessenberg except for nonzeros in positions (3, 1), (4, 1), and (4, 2). This little submatrix is called the bulge.
3. Compute a sequence of matrices H2, …, Hn–1 and Cj = HjᵗCj–1Hj, j = 2, …, n – 1, such that Cn–1 = Hn–1ᵗ … H2ᵗC1H2 … Hn–1 is a Hessenberg matrix with positive subdiagonal entries. More on Hj below.

We claim that Cn–1 = B3. Recall that column 1 of Hj is e1 for j > 1. Thus, H1H2 … Hn–1e1 = H1e1 = q/‖q‖ = (Q1Q2)e1. Now Cn–1 = (H1 … Hn–1)ᵗB1(H1 … Hn–1), and B3 = (Q1Q2)ᵗB1(Q1Q2). The Implicit Q Theorem ensures that B3 and Cn–1 are the same. Moreover, if s is not an eigenvalue, then Cn–1 must have positive subdiagonal entries in exact arithmetic.
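The promised check of the formula for q (my verification code; the matrix and shift are arbitrary): for a real Hessenberg B1 and complex shift s, the first column of B1² – 2(Re s)B1 + |s|²I is real and has exactly three nonzero entries, computable from the top-left corner of B1 alone.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
B = np.triu(rng.standard_normal((n, n)), -1)   # random upper Hessenberg
s = 1.5 + 2.0j                                  # complex shift; s2 = conj(s)

M = B @ B - 2 * s.real * B + abs(s) ** 2 * np.eye(n)
q = M[:, 0]                                     # first column; real since s1, s2 conjugate

q_top3 = np.array([
    B[0, 0]**2 + B[0, 1] * B[1, 0] - 2 * s.real * B[0, 0] + abs(s)**2,
    B[1, 0] * (B[0, 0] + B[1, 1] - 2 * s.real),
    B[2, 1] * B[1, 0],
])
print(np.allclose(q[:3], q_top3), np.allclose(q[3:], 0))   # True True
```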

Step 3 involves n – 2 minor steps, and at each one only three rows and columns of the array are altered. The code is elegant, and the cost of forming Cn–1 is about 5n² operations. The transformation C1 → C2 = H2ᵗC1H2 pushes the bulge into positions (4, 2), (5, 2), and (5, 3), while creating zeros in positions (3, 1) and (4, 1). Subsequently, each operation with an H matrix pushes the bulge one row lower until it falls off the bottom of the matrix and the Hessenberg form is restored. The transformation B1 → B3 is called a double step.

It is now necessary to inspect entry (n – 1, n – 2) as well as (n, n – 1) to see whether a deflation is warranted. For complex conjugate pairs, the (n – 1, n – 2) entry, not (n, n – 1), becomes negligible.

The current shift strategies for QR do not guarantee convergence in all cases. Indeed, examples are known where the sequence {Bj} can cycle. To guard against such misfortunes, an ad hoc exceptional shift is forced from time to time. A more complicated choice of shifts might produce a nicer theory, but practical performance is excellent.8

The symmetric case

The QR transform preserves symmetry for real matrices and preserves the Hermitian property for complex matrices: B → Q*BQ. It also preserves Hessenberg form. Because a symmetric Hessenberg matrix is tridiagonal (that means entry (i, j) vanishes if |i – j| > 1), the QR Algorithm preserves symmetric tridiagonal form, and the cost of a transform plunges from O(n²) to O(n) operations. In fact, the standard estimate for the cost of computing all the eigenvalues of an n × n symmetric tridiagonal matrix is 10n² arithmetic operations. Recall that all the eigenvalues are real.

One reason this case is worthy of a section to itself is its convergence theory. Theorem 1 tells us that the basic algorithm (all shifts are zero) pushes the large eigenvalues to the top and the small ones to the bottom. A shift strategy Wilkinson suggested in the 1960s always makes the (n, n – 1) entry converge to zero. Moreover, convergence is rapid. Everyone believes the rate is cubic (very fast), although our proofs only guarantee a quadratic rate (such as Newton's iteration for polynomial zeros).9

The implementation for symmetric tridiagonals is particularly elegant, and we devote a few lines to evoke the procedure. The bulge-chasing method described earlier simplifies, because the bulge consists of a single entry on each side of the diagonal. There is one more twist to the implementation that is worth mentioning. The code can be rearranged so that no square roots need to be computed.9
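A short experiment (mine) illustrates the section's claims: the shifted QR transform of a symmetric tridiagonal matrix is again symmetric tridiagonal (up to roundoff), and with the Wilkinson shift, the eigenvalue of the trailing 2 × 2 block closest to the last diagonal entry, the last off-diagonal entry collapses within a handful of steps.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
d, o = rng.standard_normal(n), rng.standard_normal(n - 1)
T = np.diag(d) + np.diag(o, 1) + np.diag(o, -1)    # symmetric tridiagonal

for step in range(5):
    trail = T[n - 2:, n - 2:]                      # trailing 2 x 2 block
    mu = min(np.linalg.eigvalsh(trail), key=lambda z: abs(z - T[n - 1, n - 1]))
    Q, R = np.linalg.qr(T - mu * np.eye(n))
    T = R @ Q + mu * np.eye(n)                     # still symmetric tridiagonal
    print(step, abs(T[n - 1, n - 2]))              # rapid decay toward deflation
```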

The discovery of the algorithms

There is no obvious benefit in factoring a square matrix B into B = QR and then forming a new matrix RQ = Q*BQ. Indeed, some structure in B might be lost in Q*BQ. So how did someone come up with the idea of iterating this transformation? Major credit is due to the Swiss mathematician and computer scientist H. Rutishauser. His doctoral thesis was not concerned with eigenvalues but rather with a more general algorithm he invented, which he called the Quotient-Difference (QD) Algorithm.10 This procedure can be used to find zeros of polynomials or poles of rational functions or to manipulate continued fractions. The algorithm transforms an array of numbers, which Rutishauser writes as

  Z = (q1, e1, q2, e2, …, qn–1, en–1, qn),

into another one, Ẑ, of the same form.

Let us define two bidiagonal matrices associated with Z. For simplicity, take n = 5; then

  L = [  1                     ]      U = [ q1   1                  ]
      [ e1   1                 ]          [      q2   1             ]
      [      e2   1            ]          [           q3   1        ]
      [           e3   1       ]          [                q4   1   ]
      [                e4   1  ]          [                     q5  ]

Rutishauser observed that the rhombus rules he discovered for the QD transformation, namely

  êi + q̂i+1 = qi+1 + ei+1,
  êi q̂i = qi+1 ei,

admit the following remarkable interpretation:

  L̂Û = UL.

Note that UL and LU are tridiagonal matrices with all superdiagonal entries equal to one. In other words, the QD Algorithm is equivalent to the following procedure on tridiagonals J with unit superdiagonals:

  Factor J = LU,
  Form Ĵ = UL.

Thus was the LU transform born. It did not take Rutishauser a moment to see that the idea of reversing factors could be applied to a dense matrix or a banded matrix.
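The rhombus rules translate into a few lines of code. The sketch below (my reconstruction; the names qd_step and bidiag are mine, and the convention en = 0 handles the last step) performs one QD step on the arrays q and e and confirms Rutishauser's interpretation L̂Û = UL:

```python
import numpy as np

def qd_step(q, e):
    """Progressive QD: from J = LU return the factors (qh, eh) of J^ = UL."""
    n = len(q)
    qh, eh = np.empty(n), np.empty(n - 1)
    qh[0] = q[0] + e[0]
    for i in range(n - 1):
        eh[i] = e[i] * q[i + 1] / qh[i]            # rule: eh_i qh_i = q_{i+1} e_i
        nxt = e[i + 1] if i + 1 < n - 1 else 0.0   # convention e_n = 0
        qh[i + 1] = q[i + 1] + nxt - eh[i]         # rule: eh_i + qh_{i+1} = q_{i+1} + e_{i+1}
    return qh, eh

def bidiag(q, e):
    L = np.eye(len(q)) + np.diag(e, -1)            # unit lower bidiagonal
    U = np.diag(q) + np.diag(np.ones(len(q) - 1), 1)   # unit superdiagonal
    return L, U

q, e = np.array([5.0, 4.0, 3.0, 2.0, 1.0]), np.array([0.5, 0.4, 0.3, 0.2])
qh, eh = qd_step(q, e)
L, U = bidiag(q, e)
Lh, Uh = bidiag(qh, eh)
print(np.allclose(Lh @ Uh, U @ L))                 # True: one QD step = one LU transform
```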

Although the QD Algorithm appeared in 1953 or 1954, Rutishauser did not publish his LU algorithm until 1958.10,11 He called it LR, but I use LU to avoid confusion with QR.

Unfortunately, the LU transform is not always stable, so the hunt was begun for a stable variant. This variant was found by a young computer scientist, J.G.F. Francis,7 greatly assisted by his mentor Christopher Strachey, the first professor of computation at Oxford University. Independently of Rutishauser and Francis, Vera Kublanovskaya in the USSR presented the basic QR Algorithm in 1961.12 However, Francis not only gave us the basic QR Algorithm but at the same time exploited the invariance of the Hessenberg form and gave the details of a double step to avoid the use of complex arithmetic.

In 1955, the calculation of the eigenvalues and eigenvectors of a real matrix that was not symmetric was considered a formidable task for which there was no reliable method. By 1965, the task was routine, thanks to the QR Algorithm. However, there is more to be done, because of accuracy issues. The eigenvalues delivered by QR have errors that are tiny, like the errors in rounding a number to 15 decimals, with respect to the size of the numbers in the original matrix. That is good news for the large eigenvalues but disappointing for any that are tiny. In many applications, the tiny eigenvalues are important because they are the dominant eigenvalues of the inverse. For a long time, this limitation on accuracy was regarded as an intrinsic limitation caused by the limited number of digits for each number.
Now we know that there are important cases where the data in the problem do define all the eigenvalues to high relative accuracy, and current work seeks algorithms that can deliver answers to that accuracy.

The QR Algorithm was aimed at matrices with at most a few hundred rows and columns. Its success has raised hopes. Now customers want to find a few eigenpairs of matrices with thousands, even hundreds of thousands, of rows. Most of the entries of these matrices are zero, and they are called sparse. The QR Algorithm is not well suited for such cases. It destroys sparsity and has difficulty finding selected eigenpairs. Since 1970, research has turned toward these challenging cases.13

References
1. A. Jennings and J.J. McKeown, Matrix Computations, 2nd ed., John Wiley, New York, 1977.
2. D.K. Faddeev and V.N. Faddeeva, Computational Methods of Linear Algebra, W.H. Freeman and Co., San Francisco, 1963.
3. E. Anderson et al., LAPACK Users' Guide, 2nd ed., SIAM, Philadelphia, 1995.
4. J.H. Wilkinson, "Convergence of the LR, QR, and Related Algorithms," The Computer J., No. 4, 1965, pp. 77–84.
5. G.H. Golub and C.F. Van Loan, Matrix Computations, 3rd ed., The Johns Hopkins Univ. Press, Baltimore, 1996.
6. B.N. Parlett, "Global Convergence of the Basic QR Algorithm on Hessenberg Matrices," Mathematics of Computation, No. 22, 1968, pp. 803–817.
7. J.G.F. Francis, "The QR Transformation, Parts I and II," The Computer J., No. 4, 1961–1962, pp. 265–271 and 332–345.
8. S. Batterson and D. Day, "Linear Convergence in the Shifted QR Algorithm," Mathematics of Computation, No. 59, 1992, pp. 141–151.
9. B.N. Parlett, The Symmetric Eigenvalue Problem, 2nd ed., SIAM, Philadelphia, 1998.
10. H. Rutishauser, "Der Quotienten-Differenzen-Algorithmus," Zeitschrift für Angewandte Mathematik und Physik, No. 5, 1954, pp. 233–251.
11. H. Rutishauser, "Solution of Eigenvalue Problems with the LR-Transformation," Nat'l Bureau of Standards Applied Mathematics Series, No. 49, 1958, pp. 47–81.
12. V.N. Kublanovskaya, "On Some Algorithms for the Solution of the Complete Eigenvalue Problem," USSR Computational Mathematics and Mathematical Physics, No. 3, 1961, pp. 637–657.
13. Y. Saad, Numerical Methods for Large Eigenvalue Problems, Manchester Univ. Press, Manchester, UK, 1992.

Beresford N. Parlett is an emeritus professor of mathematics and of computer science at the University of California, Berkeley. His professional interest is in matrix computations, especially eigenvalue problems. He received his BA in mathematics from Oxford University and his PhD from Stanford University. He is a member of the AMS and SIAM. Contact him at the Mathematics Dept. and Computer Science Division, EECS Dept., Univ. of California, Berkeley, CA 94720; parlett@math.berkeley.edu.
