The QR algorithm
® The most common method for solving small (dense) eigenvalue problems.

The basic algorithm: QR without shifts
1. Until Convergence Do:
2.    Compute the QR factorization A = QR
3.    Set A := RQ
4. EndDo
® “Until Convergence” means “Until A becomes close enough to an upper triangular matrix”

® Note: A_new = RQ = Q^H (QR) Q = Q^H A Q
® A_new is unitarily similar to A → the spectrum does not change
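The basic algorithm above can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not a production routine; the function name `qr_no_shift` and the stopping test (norm of the strictly lower triangular part) are our choices.

```python
import numpy as np

def qr_no_shift(A, tol=1e-10, maxiter=500):
    """Basic QR iteration without shifts (illustrative sketch).

    Repeats A := R @ Q until the strictly lower triangular part is
    negligible; the diagonal then approximates the eigenvalues.
    """
    A = np.array(A, dtype=float)
    for _ in range(maxiter):
        Q, R = np.linalg.qr(A)
        A = R @ Q                      # A_new = Q^H A Q: same spectrum
        if np.linalg.norm(np.tril(A, -1)) < tol:
            break
    return A

# Symmetric example: the diagonal of the limit holds the eigenvalues
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
T = qr_no_shift(A)
print(np.sort(np.diag(T)))             # close to np.linalg.eigvalsh(A)
```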
13-1 GvL 8.1-8.2.3 – Eigen2
® Convergence analysis is complicated — but here is the insight: we are implicitly computing a QR factorization of A^k:
         QR-factorize:   Multiply backward:
Step 1:  A0 = Q0 R0      A1 = R0 Q0
Step 2:  A1 = Q1 R1      A2 = R1 Q1
Step 3:  A2 = Q2 R2      A3 = R2 Q2

Then:

[Q0 Q1 Q2][R2 R1 R0] = Q0 Q1 A2 R1 R0            (A2 = Q2 R2)
                     = Q0 (Q1 R1)(Q1 R1) R0       (A2 = R1 Q1)
                     = Q0 A1 A1 R0                (A1 = Q1 R1)
                     = (Q0 R0)(Q0 R0)(Q0 R0)      (A1 = R0 Q0)
                     = A^3

® [Q0 Q1 Q2][R2 R1 R0] is a QR factorization of A^3
® This helps analyze the algorithm (details skipped)
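The identity can be checked numerically: after three unshifted QR steps, the accumulated products give a QR factorization of A^3. A small sketch, assuming NumPy; the seed and matrix size are arbitrary.

```python
import numpy as np

# Check: [Q0 Q1 Q2][R2 R1 R0] equals A^3, with the first factor unitary
# and the second upper triangular.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

Ak = A.copy()
Qs, Rs = [], []
for _ in range(3):
    Q, R = np.linalg.qr(Ak)
    Qs.append(Q); Rs.append(R)
    Ak = R @ Q

Qacc = Qs[0] @ Qs[1] @ Qs[2]           # product of unitaries: unitary
Racc = Rs[2] @ Rs[1] @ Rs[0]           # product of triangulars: triangular
print(np.allclose(Qacc @ Racc, np.linalg.matrix_power(A, 3)))  # True
```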
® The basic algorithm above is never used as is in practice. Two variations: (1) use a shift of origin, and (2) start by transforming A into a Hessenberg matrix
Practical QR algorithms: Shifts of origin
Observation (from theory): the last row converges fastest. Convergence is dictated by the ratio |λn| / |λn−1|.

® We will now consider only the real symmetric case.
® Eigenvalues are real.
® A^(k) remains symmetric throughout the process.
® As k → ∞, the last row and column (except a_nn^(k)) converge to zero quickly,
® and a_nn^(k) converges to λn, the eigenvalue of smallest modulus.
        [ a a a a a ε ]
        [ a a a a a ε ]
A^(k) = [ a a a a a ε ]
        [ a a a a a ε ]
        [ a a a a a ε ]
        [ ε ε ε ε ε a ]

(entries ε in the last row and column tend to zero)
® Idea: Apply the QR algorithm to A^(k) − µI with µ = a_nn^(k). Note: the eigenvalues of A^(k) − µI are those of A^(k) shifted by µ, and the eigenvectors are the same.
QR with shifts
1. Until the entries a_in, 1 ≤ i < n, converge to zero Do:
2.    Obtain next shift (e.g. µ = a_nn)
3.    A − µI = QR
4.    Set A := RQ + µI
5. EndDo
® Convergence (of the last row) is cubic in the limit! [for the symmetric case]
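One sweep of the shifted iteration can be sketched as follows — a minimal illustration for the real symmetric case, assuming NumPy; the function name `qr_with_shift` and the tolerance are our choices.

```python
import numpy as np

def qr_with_shift(A, tol=1e-12, maxiter=100):
    """Shifted QR iteration with mu = a_nn, run until the
    off-diagonal part of the last row is negligible (sketch)."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for _ in range(maxiter):
        if np.linalg.norm(A[-1, :-1]) < tol:
            break
        mu = A[-1, -1]
        Q, R = np.linalg.qr(A - mu * np.eye(n))
        A = R @ Q + mu * np.eye(n)     # similar to A: same spectrum
    return A

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
T = qr_with_shift(A)
print(T[-1, -1])                       # an eigenvalue of A
```

With the Rayleigh-quotient-style shift µ = a_nn, only a handful of iterations are typically needed before the last row converges.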
® Result of the algorithm:

        [ * * * * *  * ]
        [ * * * * *  * ]
A^(k) = [ * * * * *  * ]
        [ * * * * *  * ]
        [ * * * * *  * ]
        [ 0 0 0 0 0 λn ]
® Next step: deflate, i.e., apply the above algorithm to the leading (n − 1) × (n − 1) block.
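Combining the shifted sweep with deflation yields all eigenvalues. A sketch for the real symmetric case, assuming NumPy; the function name `eig_shifted_qr` and the iteration cap are our choices.

```python
import numpy as np

def eig_shifted_qr(A, tol=1e-12):
    """All eigenvalues of a real symmetric matrix by shifted QR
    with deflation: once the last row converges, record a_nn and
    continue on the leading (n-1)x(n-1) block. Illustrative sketch."""
    A = np.array(A, dtype=float)
    eigs = []
    while A.shape[0] > 1:
        n = A.shape[0]
        for _ in range(100):           # shifted sweep (capped for safety)
            if np.linalg.norm(A[-1, :-1]) < tol:
                break
            mu = A[-1, -1]
            Q, R = np.linalg.qr(A - mu * np.eye(n))
            A = R @ Q + mu * np.eye(n)
        eigs.append(A[-1, -1])         # converged eigenvalue
        A = A[:-1, :-1]                # deflate
    eigs.append(A[0, 0])
    return np.sort(np.array(eigs))

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(eig_shifted_qr(A))               # matches np.linalg.eigvalsh(A)
```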
Practical algorithm: Use the Hessenberg Form
Recall: Upper Hessenberg matrix is such that
aij = 0 for j < i − 1
Observation: The QR algorithm preserves Hessenberg form (tridiagonal form in the symmetric case). This results in substantial savings.
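The preservation of Hessenberg structure is easy to verify numerically: one QR step leaves the entries below the first subdiagonal (numerically) zero. A small check, assuming NumPy; the seed and size are arbitrary.

```python
import numpy as np

# One QR step on an upper Hessenberg matrix: the result is again
# upper Hessenberg (below-subdiagonal entries stay at roundoff level).
rng = np.random.default_rng(1)
H = np.triu(rng.standard_normal((6, 6)), -1)    # upper Hessenberg
Q, R = np.linalg.qr(H)
H1 = R @ Q
print(np.max(np.abs(np.tril(H1, -2))))          # ~ machine epsilon
```

The reason: Q = H R^{-1} is itself Hessenberg, and an upper triangular matrix times a Hessenberg matrix is Hessenberg.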
Transformation to Hessenberg form

® Want H1 A H1^T (= H1 A H1, since H1 is symmetric) to have the form shown below
® Consider the first step only, on a 6 × 6 matrix:

[ ? ? ? ? ? ? ]
[ ? ? ? ? ? ? ]
[ 0 ? ? ? ? ? ]
[ 0 ? ? ? ? ? ]
[ 0 ? ? ? ? ? ]
[ 0 ? ? ? ? ? ]
® Choose w in H1 = I − 2ww^T to make the first column have zeros in positions 3 to n. So w1 = 0.
® Apply to left: B = H1A
® Apply to right: A1 = BH1.
Main observation: the Householder matrix H1 which transforms the column A(2 : n, 1) into a multiple of e1 works only on rows 2 to n. When applying H1 (which equals its transpose) to the right of B = H1 A, only columns 2 to n are altered. So the first column retains the desired pattern (zeros below row 2).
® The algorithm continues the same way for columns 2, …, n − 2.
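The full reduction can be sketched as below — a minimal Householder implementation, assuming NumPy (a library routine such as scipy.linalg.hessenberg does the same job); the function name `to_hessenberg` is ours.

```python
import numpy as np

def to_hessenberg(A):
    """Reduce A to upper Hessenberg form by Householder similarity
    transformations (illustrative sketch)."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for k in range(n - 2):
        x = A[k+1:, k]                 # column entries below the diagonal
        v = x.copy()
        v[0] += np.sign(x[0] if x[0] != 0 else 1.0) * np.linalg.norm(x)
        if np.linalg.norm(v) == 0:
            continue                   # column already has the right pattern
        v /= np.linalg.norm(v)
        # Apply H = I - 2 v v^T from the left (touches rows k+1..n-1 only)
        A[k+1:, k:] -= 2.0 * np.outer(v, v @ A[k+1:, k:])
        # ... and from the right (touches columns k+1..n-1 only),
        # so the zeros just created in column k survive
        A[:, k+1:] -= 2.0 * np.outer(A[:, k+1:] @ v, v)
    return A

A = np.random.default_rng(2).standard_normal((5, 5))
H = to_hessenberg(A)
print(np.max(np.abs(np.tril(H, -2))))   # ~ 0: upper Hessenberg
```

Since each step is a similarity transformation, H has the same spectrum as A.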
QR for Hessenberg matrices
® Need the “Implicit Q theorem”
Suppose that QT AQ is an unreduced upper Hessenberg matrix. Then columns 2 to n of Q are determined uniquely (up to signs) by the first column of Q.
® In other words if V T AV = G and QT AQ =