AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 16: Rayleigh Quotient Iteration
Xiangmin Jiao
SUNY Stony Brook
Xiangmin Jiao Numerical Analysis I 1 / 10 Solving Eigenvalue Problems
All eigenvalue solvers must be iterative Iterative algorithms have multiple facets:
1 Basic idea behind the algorithms 2 Convergence and techniques to speed-up convergence 3 Efficiency of implementation 4 Termination criteria We will focus on first two aspects
Xiangmin Jiao Numerical Analysis I 2 / 10 Simplification: Real Symmetric Matrices
We will consider eigenvalue problems for real symmetric matrices, i.e. T m×m m A = A ∈ R , and Ax = λx for x ∈ R √ ∗ T T I Note: x = x , and kxk = x x
A has real eigenvalues λ1,λ2, ... , λm and orthonormal eigenvectors q1, q2, ... , qm, where kqj k = 1 Eigenvalues are often also ordered in a particular way (e.g., ordered from large to small in magnitude) In addition, we focus on symmetric tridiagonal form
I Why? Because phase 1 of two-phase algorithm reduces matrix into tridiagonal form
Xiangmin Jiao Numerical Analysis I 3 / 10 Rayleigh Quotient
m The Rayleigh quotient of x ∈ R is the scalar xT Ax r(x) = xT x For an eigenvector x, its Rayleigh quotient is r(x) = xT λx/xT x = λ, the corresponding eigenvalue of x
For general x, r(x) = α that minimizes kAx − αxk2. 2 x is eigenvector of A⇐⇒ ∇r(x) = x T x (Ax − r(x)x) = 0 with x 6= 0 r(x) is smooth and ∇r(qj ) = 0 for any j, and therefore is quadratically accurate:
2 r(x) − r(qJ ) = O(kx − qJ k ) as x → qJ for some J
Xiangmin Jiao Numerical Analysis I 4 / 10 Power Iteration
Simple power iteration for largest eigenvalue
Algorithm: Power Iteration v (0) =some unit-length vector for k = 1, 2,... w = Av (k−1) v (k) = w/kwk λ(k) = r(v (k)) = (v (k))T Av (k)
Termination condition is omitted for simplicity
Xiangmin Jiao Numerical Analysis I 5 / 10 Convergence of Power Iteration
(0) k Expand initial v in orthonormal eigenvectors qi , and apply A :
(0) v = a1q1 + a2q2 + ··· + amqm (k) k (0) v = ck A v k k k = ck (a1λ1 q1 + a2λ2 q2 + ··· + amλmqm) k k k = ck λ1 (a1q1 + a2(λ2/λ1) q2 + ··· + am(λm/λ1) qm)
T (0) If |λ1| > |λ2| ≥ · · · ≥ |λm| ≥ 0 and q1 v 6= 0, this gives
(k) k (k) 2k kv − (±q1)k = O |λ2/λ1| , |λ − λ1| = O |λ2/λ1|
T (k) where ± sign is chosen to be sign of q1 v It finds the largest eigenvalue (unless eigenvector is orthogonal to v (0))
Error reduces by only a constant factor (≈ |λ2/λ1|) each step, and very slowly especially when |λ2| ≈ |λ1|
Xiangmin Jiao Numerical Analysis I 6 / 10 Inverse Iteration −1 −1 Apply power iteration on (A − µI ) , with eigenvalues {(λj − µ) } −1 If µ ≈ λJ for some J, then (λJ − µ) may be far larger than −1 (λj − µ) , j 6= J, so power iteration may converge rapidly Algorithm: Inverse Iteration v (0) =some unit-length vector for k = 1, 2,... Solve (A − µI )w = v (k−1) for w v (k) = w/kwk λ(k) = r(v (k)) = (v (k))T Av (k)
Converges to eigenvector qJ if parameter µ is close to λJ k ! 2k ! (k) µ − λJ (k) µ − λJ kv − (±qJ )k = O , |λ − λJ | = O µ − λK µ − λK
where λJ and λK are closest and second closest eigenvalues to µ Standard method for determining eigenvector given eigenvalue
Xiangmin Jiao Numerical Analysis I 7 / 10 Rayleigh Quotient Iteration
Parameter µ is constant in inverse iteration, but convergence is better for µ close to the eigenvalue Improvement: At each iteration, set µ to last computed Rayleigh quotient
Algorithm: Rayleigh Quotient Iteration v (0) =some unit-length vector λ(0) = r(v (0)) = (v (0))T Av (0) for k = 1, 2,... Solve (A − λ(k−1)I )w = v (k−1) for w v (k) = w/kwk λ(k) = r(v (k)) = (v (k))T Av (k)
Cost per iteration is linear for tridiagonal matrix
Xiangmin Jiao Numerical Analysis I 8 / 10 Convergence of Rayleigh Quotient Iteration
Cubic convergence in Rayleigh quotient iteration
(k+1) (k) 3 kv − (±qJ )k = O(kv − (±qJ )k )
and (k+1) (k) 3 |λ − λJ | = O |λ − λJ | In other words, each iteration triples number of digits of accuracy (k) (k) Proof idea: If v is close to an eigenvector, kv − (±qJ )k ≤ , then accuracy of Rayleigh quotient estimate λ(k) is (k) 2 |λ − λJ | = O( ). One step of inverse iteration then gives
(k+1) (k) (k) 3 kv − qJ k = O(|λ − λJ |kv − qJ k) = O( )
Rayleigh quotient is great in finding largest (or smallest) eigenvalue and its corresponding eigenvector. What if we want to find all eigenvalues?
Xiangmin Jiao Numerical Analysis I 9 / 10 Operation Counts
In Rayleigh quotient iteration, m×m (k−1) if A ∈ R is full matrix, then solving (A − µI )w = v may take O(m3) flops per step m×m 2 if A ∈ R is upper Hessenberg, then each step takes O(m ) flops m×m if A ∈ R is tridiagonal, then each step takes O(m) flops
Xiangmin Jiao Numerical Analysis I 10 / 10