Finding Eigenvalues: Arnoldi Iteration and the QR Algorithm
Will Wheeler
Algorithms Interest Group
Jul 13, 2016 Outline
Finding the largest eigenvalue
I Largest eigenvalue ← power method
I So much work for one eigenvector. What about others?
I More eigenvectors ← orthogonalize Krylov matrix
I build orthogonal list ← Arnoldi iteration Solving the eigenvalue problem
I Eigenvalues ← diagonal of triangular matrix
I Triangular matrix ← QR algorithm
I QR algorithm ← QR decomposition
I Better QR decomposition ← Hessenberg matrix
I Hessenberg matrix ← Arnoldi algorithm Power Method
How do we find the largest eigenvalue of an m × m matrix A?
I Start with a vector b and make a power sequence:
b, Ab, A2b,...
I Higher eigenvalues dominate:
n n n n A (v1 + v2) = λ1v1 + λ2v2 ≈ λ1v1
Vector sequence (normalized) converged?
I Yes: Eigenvector
I No: Iterate some more Krylov Matrix
Power method throws away information along the way.
I Put the sequence into a matrix
K = [b, Ab, A2b,..., An−1b] (1)
= [xn, xn−1, xn−2,..., x1] (2)
I Gram-Schmidt approximates first n eigenvectors
v1 = x1 (3) x2 · x1 v2 = x2 − x1 (4) x1 · x1 x3 · x1 x3 · x2 v3 = x3 − x1 − x2 (5) x1 · x1 x2 · x2 ... (6)
I Why not orthogonalize as we build the list? Arnoldi Iteration
I Start with a normalized vector q1
xk = Aqk−1 (next power) (7) k−1 X yk = xk − (qj · xk ) qj (Gram-Schmidt) (8) j=1
qk = yk / |yk | (normalize) (9)
I Orthonormal vectors {q1,..., qn} span Krylov subspace n−1 (Kn = span{q1, Aq1,..., A q1}) I These are not the eigenvectors of A, but make a similarity † (unitary) transformation Hn = QnAQn Arnoldi Iteration
I Start with a normalized vector q1
xk = Aqk−1 (next power) (10) k−1 X yk = xk − (qj · xk ) qj (Gram-Schmidt) (11) j=1
qk = yk /|yk | (normalize) (12)
I Orthonormal vectors {q1,..., qn} span Krylov subspace n−1 (Kn = span{q1, Aq1,..., A q1}) I These are not the eigenvectors of A, but make a similarity † (unitary) transformation Hn = QnAQn Arnoldi Iteration - Construct Hn
We can construct the matrix Hn along the way. † I For k ≥ j: hj,k−1 = qj · xk = qj Aqk−1 I For k = j: hj,k−1 = |yk | (equivalent)
I For k < j: hj,k−1 = 0
I ⇒ Hn is upper Hessenberg h1,1 h1,2 h1,3 h1,4 h1,5 h1,6 h2,1 h2,2 h2,3 h2,4 h2,5 h2,6 0 h3,2 h3,3 h3,4 h3,5 h3,6 Hn = (13) 0 0 h4,3 h4,4 h4,5 h4,6 0 0 0 h5,4 h5,5 h5,6 0 0 0 0 h6,5 h6,6
† I Why is this useful? Hn = QnAQn → same eigenvalues as A Finding Eigenvalues
How do we find eigenvalues of a matrix?
I Start with a triangular matrix A
I The diagonal elements are the eigenvalues
a11 a12 a13 a14 a15 a16 0 a22 a23 a24 a25 a26 0 0 a33 a34 a35 a36 A = (14) 0 0 0 a44 a45 a46 0 0 0 0 a55 a56 0 0 0 0 0 a66 What if A isn’t triangular? QR algorithm
Factorize A = QR
I Q is orthonormal (sorry this is a different Q)
I R is upper triangular (aka right triangular) Algorithm:
1. Start A1 = A
2. Factorize Ak = Qk Rk
3. Construct Ak+1 = Rk Qk −1 −1 = Qk Qk Rk Qk = Qk Ak Qk ← similarity transform 4. Ak converges to upper triangular QR Algorithm Example (numpy.linalg.qr)
0.313 0.106 0.899 A1 = 0.381 0.979 0.375 0.399 0.488 0.876 1.649 0.05 −0.256 A2 = 0.035 0.502 0.534 −0.014 0.004 0.017 1.653e + 00 2.290e − 02 2.305e − 01 A3 = 5.966e − 03 5.063e − 01 −5.342e − 01 8.325e − 05 −9.429e − 05 9.666e − 03 1.653e + 00 1.872e − 02 −2.285e − 01 A4 = 1.800e − 03 5.063e − 01 5.349e − 01 −4.812e − 07 1.803e − 06 9.554e − 03 QR Decomposition - Gram-Schmidt
How do we get the matrices Q and R? http://www.seas.ucla.edu/ vandenbe/103/lectures/qr.pdf Recursively solve for the first column of Q and the first row of R:
I A = QR r11 R12 I a1 A2 = q1 Q2 0 R22 I a1 A2 = r11q1 q1R12 + Q2R22
I r11 = ||a1||, q1 = a1/r11 | | I Note q1q1 = 1 and q1Q2 = 0 | I R12 = q1A2 I Solve A2 − q1R12 = Q2R22 QR Decomposition of Hessenberg Matrix
What if we have a matrix in upper Hessenberg form? h1,1 h1,2 h1,3 h1,4 h1,5 h1,6 h2,1 h2,2 h2,3 h2,4 h2,5 h2,6 0 h3,2 h3,3 h3,4 h3,5 h3,6 Hn = (15) 0 0 h4,3 h4,4 h4,5 h4,6 0 0 0 h5,4 h5,5 h5,6 0 0 0 0 h6,5 h6,6 We just need to remove the numbers below the diagonal by combining rows → Givens rotation. Givens Rotation
A Givens rotation acts only on two rows and leaves the others unchanged.
γ −σ 0 0 0 σ γ 0 0 0 G1 = 0 0 1 0 0 (16) 0 0 0 1 0 0 0 0 0 1 a r √ Want G = , r = a2 + b2 b 0 √ 2 2 I γ = a/ a + b √ 2 2 I σ = −b/ a + b QR Decomposition of Hessenberg Matrix
he11 he12 he13 he14 he15 0 he22 he23 he24 he25 G2G1Hn = 00 he33 he34 he35 (17) 0 0 h43 h44 h45 0 0 0 h54 h55
| | | So Hn = G1G2 ... Gn−1R = QR QR Decomposition of Hessenberg Matrix
QR algorithm is iterative; is RQ also upper Hessenberg? Yes: Acting on the right, the Givens rotations mix two columns instead of rows, but change the same zeros.
re11 re12 re13 r14 r15 r21 r22 r23 r24 r25 e e e G2G1Hn = 0 r32 r33 r34 r35 (18) e e 0 00 r44 r45 0 0 0 0 r55 Householder Reflections
I Arnoldi finds a few eigenvalues of a large matrix
I In practice, QR uses Householder reflections
I www.math.usm.edu/lambers/mat610/sum10/lecture9.pdf Reflection across the plane perpendicular to a unit vector v:
P = I − 2vv| (19) Px = x − 2vv|x (20) Px − x = −v(2v|x) (21)
Want to arrange first column into zeros: find v so that Px = αe1. Householder Reflections
Want to arrange first column into zeros: find v so that Px = αe1.
||x|| = α (22) | x − 2vv x = αe1 (23) 1 (x − αe ) = v(v|x) (24) 2 1 So we know
I v ∝ x − αe1 I v is a unit vector
x − αe v = 1 (25) ||x − αe1|| Lanczos
What if we apply the Arnoldi algorithm to a Hermitian matrix?
I Start with a normalized vector q1
xk = Aqk−1 (next power) (26) k−1 X yk = xk − (qj · xk ) qj (Gram-Schmidt) (27) j=k−2
qk = yk / |yk | (normalize) (28)
† Since Hn = QnAQn,
I Hn is also Hermitian
I Entries of Hn above the first diagonal are 0 (tridiagonal) I We don’t need to calculate them! Lanczos
What if we apply the Arnoldi algorithm to a Hermitian matrix? † Since Hn = QnAQn,
I Hn is also Hermitian
I Entries of Hn above the first diagonal are 0 (tridiagonal) I We don’t need to calculate them!
∗ h11 h21 0 0 0 ∗ h21 h22 h32 0 0 ∗ Hn = 0 h32 h33 h43 0 (29) ∗ 0 0 h43 h44 h54 0 0 0 h54 h55