AMS526: I () Lecture 16: Rayleigh Quotient Iteration

Xiangmin Jiao

SUNY Stony Brook

Xiangmin Jiao Numerical Analysis I 1 / 10 Solving Eigenvalue Problems

All eigenvalue solvers must be iterative Iterative algorithms have multiple facets:

1 Basic idea behind the algorithms 2 Convergence and techniques to speed-up convergence 3 Efficiency of implementation 4 Termination criteria We will focus on first two aspects

Xiangmin Jiao Numerical Analysis I 2 / 10 Simplification: Real Symmetric Matrices

We will consider eigenvalue problems for real symmetric matrices, i.e. T m×m m A = A ∈ R , and Ax = λx for x ∈ R √ ∗ T T I Note: x = x , and kxk = x x

A has real eigenvalues λ1,λ2, ... , λm and orthonormal eigenvectors q1, q2, ... , qm, where kqj k = 1 Eigenvalues are often also ordered in a particular way (e.g., ordered from large to small in magnitude) In addition, we focus on symmetric tridiagonal form

I Why? Because phase 1 of two-phase algorithm reduces into tridiagonal form

Xiangmin Jiao Numerical Analysis I 3 / 10 Rayleigh Quotient

m The Rayleigh quotient of x ∈ R is the scalar xT Ax r(x) = xT x For an eigenvector x, its Rayleigh quotient is r(x) = xT λx/xT x = λ, the corresponding eigenvalue of x

For general x, r(x) = α that minimizes kAx − αxk2. 2 x is eigenvector of A⇐⇒ ∇r(x) = x T x (Ax − r(x)x) = 0 with x 6= 0 r(x) is smooth and ∇r(qj ) = 0 for any j, and therefore is quadratically accurate:

2 r(x) − r(qJ ) = O(kx − qJ k ) as x → qJ for some J

Xiangmin Jiao Numerical Analysis I 4 / 10

Simple power iteration for largest eigenvalue

Algorithm: Power Iteration v (0) =some unit-length vector for k = 1, 2,... w = Av (k−1) v (k) = w/kwk λ(k) = r(v (k)) = (v (k))T Av (k)

Termination condition is omitted for simplicity

Xiangmin Jiao Numerical Analysis I 5 / 10 Convergence of Power Iteration

(0) k Expand initial v in orthonormal eigenvectors qi , and apply A :

(0) v = a1q1 + a2q2 + ··· + amqm (k) k (0) v = ck A v k k k = ck (a1λ1 q1 + a2λ2 q2 + ··· + amλmqm) k k k = ck λ1 (a1q1 + a2(λ2/λ1) q2 + ··· + am(λm/λ1) qm)

T (0) If |λ1| > |λ2| ≥ · · · ≥ |λm| ≥ 0 and q1 v 6= 0, this gives

(k)  k  (k)  2k  kv − (±q1)k = O |λ2/λ1| , |λ − λ1| = O |λ2/λ1|

T (k) where ± sign is chosen to be sign of q1 v It finds the largest eigenvalue (unless eigenvector is orthogonal to v (0))

Error reduces by only a constant factor (≈ |λ2/λ1|) each step, and very slowly especially when |λ2| ≈ |λ1|

Xiangmin Jiao Numerical Analysis I 6 / 10 −1 −1 Apply power iteration on (A − µI ) , with eigenvalues {(λj − µ) } −1 If µ ≈ λJ for some J, then (λJ − µ) may be far larger than −1 (λj − µ) , j 6= J, so power iteration may converge rapidly Algorithm: Inverse Iteration v (0) =some unit-length vector for k = 1, 2,... Solve (A − µI )w = v (k−1) for w v (k) = w/kwk λ(k) = r(v (k)) = (v (k))T Av (k)

Converges to eigenvector qJ if parameter µ is close to λJ k ! 2k ! (k) µ − λJ (k) µ − λJ kv − (±qJ )k = O , |λ − λJ | = O µ − λK µ − λK

where λJ and λK are closest and second closest eigenvalues to µ Standard method for determining eigenvector given eigenvalue

Xiangmin Jiao Numerical Analysis I 7 / 10 Rayleigh Quotient Iteration

Parameter µ is constant in inverse iteration, but convergence is better for µ close to the eigenvalue Improvement: At each iteration, set µ to last computed Rayleigh quotient

Algorithm: Rayleigh Quotient Iteration v (0) =some unit-length vector λ(0) = r(v (0)) = (v (0))T Av (0) for k = 1, 2,... Solve (A − λ(k−1)I )w = v (k−1) for w v (k) = w/kwk λ(k) = r(v (k)) = (v (k))T Av (k)

Cost per iteration is linear for

Xiangmin Jiao Numerical Analysis I 8 / 10 Convergence of Rayleigh Quotient Iteration

Cubic convergence in Rayleigh quotient iteration

(k+1) (k) 3 kv − (±qJ )k = O(kv − (±qJ )k )

and (k+1)  (k) 3 |λ − λJ | = O |λ − λJ | In other words, each iteration triples number of digits of accuracy (k) (k) Proof idea: If v is close to an eigenvector, kv − (±qJ )k ≤ , then accuracy of Rayleigh quotient estimate λ(k) is (k) 2 |λ − λJ | = O( ). One step of inverse iteration then gives

(k+1) (k) (k) 3 kv − qJ k = O(|λ − λJ |kv − qJ k) = O( )

Rayleigh quotient is great in finding largest (or smallest) eigenvalue and its corresponding eigenvector. What if we want to find all eigenvalues?

Xiangmin Jiao Numerical Analysis I 9 / 10 Operation Counts

In Rayleigh quotient iteration, m×m (k−1) if A ∈ R is full matrix, then solving (A − µI )w = v may take O(m3) flops per step m×m 2 if A ∈ R is upper Hessenberg, then each step takes O(m ) flops m×m if A ∈ R is tridiagonal, then each step takes O(m) flops

Xiangmin Jiao Numerical Analysis I 10 / 10