A Compact Arnoldi Algorithm for Polynomial Eigenvalue Problems

Yangfeng Su, Junyi Zhang, Zhaojun Bai
School of Mathematical Sciences, Fudan University
January 2008, Taiwan, RANMEP2008

Goals

Polynomial eigenvalue problems (PEPs):

\[ \left(A_0 + \lambda A_1 + \cdots + \lambda^d A_d\right) x = 0, \]

where $(\lambda, x)$ is called an eigenpair.

d = 1: SEP, GEP
d = 2: QEP (quadratic eigenvalue problem)

Goal: solve large-scale polynomial eigenvalue problems with the Implicitly Restarted Arnoldi (IRA) algorithm using less memory.

Outline

1. Background
2. Algorithm
3. Numerical Comparison

Linearization

Linearization ⇒ a larger GEP:

\[
\left(
\begin{bmatrix} A_0 & & & \\ & I & & \\ & & \ddots & \\ & & & I \end{bmatrix}
- \lambda
\begin{bmatrix} -A_1 & -A_2 & \cdots & -A_d \\ I & & & 0 \\ & \ddots & & \vdots \\ & & I & 0 \end{bmatrix}
\right)
\begin{bmatrix} x \\ \lambda x \\ \vdots \\ \lambda^{d-1} x \end{bmatrix} = 0
\]

Sleijpen, van der Vorst, and van Gijzen, "Quadratic eigenproblems are no problem", SIAM News, 1996: the linearized problem can be solved with ARPACK.

Not true: the structure is neither used nor preserved, and the size, and therefore the memory, is d times larger.

This work: exploit the structure to save memory in ARPACK. It is a byproduct of SOAR [Bai & S., SIMAX 2005].
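The linearization above can be checked numerically on a small dense problem. The following is only a sketch under stated assumptions: the helper names `linearize_pep` and `pep_residual` are hypothetical, the coefficient matrices are random, and the pencil is reduced to a standard eigenproblem by inverting the second matrix, which requires $A_d$ to be nonsingular.

```python
import numpy as np

def linearize_pep(coeffs):
    """Build (L0, L1) with (L0 - lam * L1) z = 0 for
    z = [x; lam x; ...; lam^(d-1) x] whenever P(lam) x = 0."""
    d = len(coeffs) - 1
    n = coeffs[0].shape[0]
    L0 = np.eye(d * n, dtype=complex)
    L0[:n, :n] = coeffs[0]                       # diag(A_0, I, ..., I)
    L1 = np.zeros((d * n, d * n), dtype=complex)
    for k in range(1, d + 1):                    # first block row: -A_1 ... -A_d
        L1[:n, (k - 1) * n:k * n] = -coeffs[k]
    for i in range(1, d):                        # identity blocks on the block subdiagonal
        L1[i * n:(i + 1) * n, (i - 1) * n:i * n] = np.eye(n)
    return L0, L1

def pep_residual(coeffs, lam, x):
    """Relative residual ||P(lam) x|| / (sum_k |lam|^k ||A_k|| ||x||)."""
    r = sum(lam**k * (Ak @ x) for k, Ak in enumerate(coeffs))
    scale = sum(abs(lam)**k * np.linalg.norm(Ak, 2) for k, Ak in enumerate(coeffs))
    return np.linalg.norm(r) / (scale * np.linalg.norm(x))

rng = np.random.default_rng(0)
n, d = 4, 3                                      # illustrative sizes only
coeffs = [rng.standard_normal((n, n)) for _ in range(d + 1)]
L0, L1 = linearize_pep(coeffs)

# Assumption: A_d is nonsingular, so L1 is invertible and
# L0 z = lam L1 z becomes the standard problem inv(L1) L0 z = lam z.
lams, Z = np.linalg.eig(np.linalg.solve(L1, L0))

# The leading n entries of each eigenvector play the role of x.
residuals = [pep_residual(coeffs, lam, Z[:n, i]) for i, lam in enumerate(lams)]
print(L0.shape, max(residuals))
```

The pencil has size $dn \times dn$, which is exactly the d-fold growth in size and memory noted above.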
Arnoldi decomposition

An Arnoldi decomposition of order $j$:

\[ \mathrm{OP}\, Q_j = Q_j H_j + h_{j+1,j}\, q_{j+1} e_j^T \]

where
OP: a matrix
$Q_j = [q_1, q_2, \ldots, q_j]$: orthonormal
$H_j$: upper Hessenberg

The Arnoldi process computes the Arnoldi decomposition in a numerically stable way.

Implicitly Restarted Arnoldi (IRA) for SEP

Sorensen [SIMAX92]. Given an Arnoldi decomposition of order $p$,

\[ \mathrm{OP}\, Q_p = Q_p H_p + \beta_p\, q_{p+1} e_p^T : \]

1. Extend the Arnoldi decomposition from order $p$ to order $k$:
\[ \mathrm{OP}\, Q_k = Q_k H_k + \beta_k\, q_{k+1} e_k^T . \]
2. Divide the eigenvalues of $H_k$ into "good" ones $\mu_1, \ldots, \mu_p$ and "bad" ones $\mu_{p+1}, \ldots, \mu_k$.
3. Apply implicit QR steps to $H_k$ with shifts $\mu_{p+1}, \ldots, \mu_k$ to get
\[ H_k = U \tilde{H}_k U^H . \]
4. Take the first $p$ columns of
\[ \mathrm{OP}\, Q_k U = Q_k U \tilde{H}_k + \beta_k\, q_{k+1} e_k^T U \]
as a restarted Arnoldi decomposition of order $p$.

ARPACK

ARPACK is
an implementation of the IRA algorithm
a well-coded, well-documented package
produced by Lehoucq, Sorensen and Yang during 1992-1997
used in MATLAB as eigs and arpackc

IRA for QEP

For simplicity, we only discuss QEPs. For the QEP

\[ (\lambda^2 M + \lambda D + K)\, x = 0 : \]

1. Shift-and-invert: for a shift $\sigma$, let $\lambda = \sigma + 1/\mu$. Then
\[ (\mu^2 I - \mu A - B)\, x = 0, \]
where
\[ A = -(\sigma^2 M + \sigma D + K)^{-1} (2\sigma M + D), \qquad B = -(\sigma^2 M + \sigma D + K)^{-1} M . \]
2. Linearize:
\[ \begin{bmatrix} A & B \\ I & 0 \end{bmatrix} \begin{bmatrix} \mu x \\ x \end{bmatrix} = \mu \begin{bmatrix} \mu x \\ x \end{bmatrix} . \]
3. Apply IRA.

This makes ARPACK easy to use. How can the Frobenius structure be utilized to save memory?
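The shift-and-invert linearization and the plain Arnoldi process can be sketched as follows. This is a minimal dense illustration, not ARPACK's implementation: the function names, the random test matrices, and the shift value are assumptions, and no restarting is performed. Note that the basis `Q` has $2n$ rows, which is the memory cost the compact form aims to reduce.

```python
import numpy as np

def shift_invert_blocks(M, D, K, sigma):
    """A and B of the shifted problem (mu^2 I - mu A - B) x = 0."""
    Ks = sigma**2 * M + sigma * D + K
    A = -np.linalg.solve(Ks, 2 * sigma * M + D)
    B = -np.linalg.solve(Ks, M)
    return A, B

def arnoldi(op, q0, j):
    """j Arnoldi steps: op applied to Q[:, :j] equals Q @ H, H of size (j+1) x j."""
    Q = np.zeros((q0.size, j + 1), dtype=complex)
    H = np.zeros((j + 1, j), dtype=complex)
    Q[:, 0] = q0 / np.linalg.norm(q0)
    for i in range(j):
        w = op(Q[:, i])
        for _ in range(2):                 # MGS pass plus one re-orthogonalization pass
            for l in range(i + 1):
                c = np.vdot(Q[:, l], w)    # vdot conjugates its first argument
                w = w - c * Q[:, l]
                H[l, i] += c
        H[i + 1, i] = np.linalg.norm(w)    # assumes no breakdown (norm > 0)
        Q[:, i + 1] = w / H[i + 1, i]
    return Q, H

rng = np.random.default_rng(1)
n, j, sigma = 20, 12, 0.5                  # illustrative values only
M, D, K = (rng.standard_normal((n, n)) for _ in range(3))
A, B = shift_invert_blocks(M, D, K, sigma)

def op(z):
    """Apply the linearized operator [[A, B], [I, 0]] to z = [z1; z2]."""
    return np.concatenate([A @ z[:n] + B @ z[n:], z[:n]])

Q, H = arnoldi(op, rng.standard_normal(2 * n), j)
OPQ = np.column_stack([op(Q[:, i]) for i in range(j)])
relation_error = np.linalg.norm(OPQ - Q @ H)

# Ritz values of the shifted problem map back through lambda = sigma + 1/mu.
ritz = np.linalg.eigvals(H[:j, :j])
lam_estimates = sigma + 1.0 / ritz
print(Q.shape, relation_error)
```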
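2. Algorithm

Arnoldi decomposition for QEP

An Arnoldi decomposition of order $j$:

\[ \mathrm{OP}\, Q_j = Q_{j+1} \hat{H}_j \]

For the QEP:

\[ \begin{bmatrix} A & B \\ I & 0 \end{bmatrix} \begin{bmatrix} Q_{j,1} \\ Q_{j,2} \end{bmatrix} = \begin{bmatrix} Q_{j+1,1} \\ Q_{j+1,2} \end{bmatrix} \hat{H}_j \]

Since $Q_{j,1} = Q_{j+1,2} \hat{H}_j$, we have

Theorem: $\operatorname{rank}([Q_{j,1}, Q_{j,2}]) \equiv r_j \le j + 1$.

This has been observed by many people, e.g. Meerbergen [SIMAX06], Bai & S. [SIMAX05]. The key is how to use it with numerical stability.

Let $V_j \in \mathbb{C}^{n \times r_j} = \operatorname{orth}[Q_{j,1}, Q_{j,2}]$. Then

\[ Q_j = \begin{bmatrix} Q_{j,1} \\ Q_{j,2} \end{bmatrix} = \begin{bmatrix} V_j R_{j,1} \\ V_j R_{j,2} \end{bmatrix} = \begin{bmatrix} V_j & \\ & V_j \end{bmatrix} \begin{bmatrix} R_{j,1} \\ R_{j,2} \end{bmatrix} \]

Two levels of orthonormality:
$V_j$ is orthonormal
$\begin{bmatrix} R_{j,1} \\ R_{j,2} \end{bmatrix}$ is orthonormal

Compact ARnoldi Decomposition (CARD)

\[ \mathrm{OP} \begin{bmatrix} V_j R_{j,1} \\ V_j R_{j,2} \end{bmatrix} = \begin{bmatrix} V_{j+1} R_{j+1,1} \\ V_{j+1} R_{j+1,2} \end{bmatrix} \hat{H}_j \]

$V_j \in \mathbb{C}^{n \times r_j}$
$R_{j,1}, R_{j,2} \in \mathbb{C}^{r_j \times j}$
$j \le r_{j+1} \le j + 1$

Memory cost:
Arnoldi: $2n(j + 1)$ (for PEPs: $dn(j + 1)$)
CARD: $n r_{j+1} \le n(j + 2)$ (for PEPs: $\le n(j + d + 1)$)

CARD process

The CARD process computes the CARD with numerical stability.

CARD of order $j$:

\[ \mathrm{OP} \begin{bmatrix} V R_1 \\ V R_2 \end{bmatrix} = \begin{bmatrix} V R_1 \\ V R_2 \end{bmatrix} H + \beta q e_j^T = \begin{bmatrix} \hat{V} \hat{R}_1 \\ \hat{V} \hat{R}_2 \end{bmatrix} \hat{H} \]

Expand it to a CARD of order $j + 1$ with the following steps.

Expand CARD process

1. Compute $q_1 := A q_1 + B q_2$.
2. Decompose $q_1 = \hat{V} x + v \alpha$ with MGS:
\[ x = \hat{V}^H q_1, \quad v = q_1 - \hat{V} x, \quad \alpha = \|v\|, \quad v = v / \alpha . \]
3. Update $\hat{V} = [\hat{V}, v]$, $r_{j+1} = r_j + 1$, and
\[ \hat{R}_1 = \begin{bmatrix} \hat{R}_1 & x \\ 0 & \alpha \end{bmatrix}, \qquad \hat{R}_2 = \begin{bmatrix} \hat{R}_2 & \hat{R}_1(:, j+1) \\ 0 & 0 \end{bmatrix}, \]
that is,
\[ \begin{bmatrix} \hat{R}_1 \\ \hat{R}_2 \end{bmatrix} := \begin{bmatrix} \hat{R}_1 & x \\ 0 & \alpha \\ \hat{R}_2 & \hat{R}_1(:, j+1) \\ 0 & 0 \end{bmatrix} . \]
4. Decompose with MGS:
\[ \begin{bmatrix} \hat{R}_1(:, j+2) \\ \hat{R}_2(:, j+2) \end{bmatrix}_{\text{old}} = \begin{bmatrix} \hat{R}_1(:, 1:j+1) \\ \hat{R}_2(:, 1:j+1) \end{bmatrix} H_{1:j+1,\, j+1} + \begin{bmatrix} \hat{R}_1(:, j+2) \\ \hat{R}_2(:, j+2) \end{bmatrix}_{\text{new}} H_{j+2,\, j+1} . \]
5. Update the current Arnoldi vector $q = \begin{bmatrix} q_1 \\ q_2 \end{bmatrix}$:
\[ q_1 = \hat{V} \hat{R}_{j+1,1}(:, j+2), \qquad q_2 = \hat{V} \hat{R}_{j+1,2}(:, j+2) . \]

Only MGS (with re-orthogonalization) is used, with no inversion, so CARD is numerically stable.

IRA with CARD

Given a CARD of order $k$:

\[ \mathrm{OP} \begin{bmatrix} V R_1 \\ V R_2 \end{bmatrix} = \begin{bmatrix} \hat{V} \hat{R}_1 \\ \hat{V} \hat{R}_2 \end{bmatrix} \begin{bmatrix} H \\ \beta e_k^T \end{bmatrix} \]

IRA performs $(k - p)$ QR steps on $H$ with shifts $\mu_{p+1}, \ldots, \mu_k$, i.e.

\[ H = U \tilde{H} U^H . \]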
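The expansion steps 1 to 5 of the CARD process can be sketched in NumPy as below. This is a hedged illustration rather than the authors' code: the function name `card` is hypothetical, the orthogonalization uses block Gram-Schmidt with one re-orthogonalization pass in place of column-by-column MGS, the starting blocks are assumed to be linearly independent, and deflation ($\alpha = 0$) is not handled.

```python
import numpy as np

def card(A, B, q1, q2, j):
    """j steps of a compact Arnoldi process for OP = [[A, B], [I, 0]].
    Returns V (n x (j+2)), R1, R2 ((j+2) x (j+1)) and H ((j+1) x j) such that
    OP [V R1; V R2][:, :j] = [V R1; V R2] H."""
    nrm = np.sqrt(np.linalg.norm(q1)**2 + np.linalg.norm(q2)**2)
    V, R = np.linalg.qr(np.column_stack([q1, q2]) / nrm)   # V = orth[q1, q2]
    V = V.astype(complex)
    R1, R2 = R[:, :1].astype(complex), R[:, 1:2].astype(complex)
    H = np.zeros((j + 1, j), dtype=complex)
    for c in range(j):
        # Step 1: top block of OP q; the bottom block is q1 = V R1[:, c].
        w = A @ (V @ R1[:, c]) + B @ (V @ R2[:, c])
        # Step 2: orthogonalize w against V (two passes).
        x = V.conj().T @ w
        v = w - V @ x
        y = V.conj().T @ v
        v -= V @ y
        x += y
        alpha = np.linalg.norm(v)
        # Step 3: grow V by one column and pad R1, R2 with a zero row.
        V = np.column_stack([V, v / alpha])
        s = np.concatenate([x, [alpha], R1[:, c], [0.0]])  # new column of [R1; R2]
        Rs = np.vstack([np.pad(R1, ((0, 1), (0, 0))),
                        np.pad(R2, ((0, 1), (0, 0)))])
        # Step 4: orthogonalize the short coordinate vector (two passes).
        h = Rs.conj().T @ s
        s -= Rs @ h
        g = Rs.conj().T @ s
        s -= Rs @ g
        h += g
        H[:c + 1, c] = h
        H[c + 1, c] = np.linalg.norm(s)
        s /= H[c + 1, c]
        # Step 5: the normalized column holds the next Arnoldi vector in compact form.
        r = V.shape[1]
        R1 = np.column_stack([Rs[:r], s[:r]])
        R2 = np.column_stack([Rs[r:], s[r:]])
    return V, R1, R2, H

rng = np.random.default_rng(2)
n, j = 30, 10                                              # illustrative sizes only
A = rng.standard_normal((n, n)) / np.sqrt(n)
B = rng.standard_normal((n, n)) / np.sqrt(n)
V, R1, R2, H = card(A, B, rng.standard_normal(n), rng.standard_normal(n), j)

# Expand the compact form only to verify the Arnoldi relation.
Q1, Q2 = V @ R1, V @ R2
lhs = np.vstack([A @ Q1[:, :j] + B @ Q2[:, :j], Q1[:, :j]])
relation_error = np.linalg.norm(lhs - np.vstack([Q1, Q2]) @ H)
print(V.shape, relation_error)
```

All long vectors live in `V`, which has $j + 2$ columns of length $n$; the orthogonalization in step 4 works only on the short coordinate vectors, consistent with the memory count $n r_{j+1} \le n(j + 2)$.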
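Substituting $H = U \tilde{H} U^H$ into the CARD of order $k$ gives

\[ \mathrm{OP} \begin{bmatrix} V R_1 \\ V R_2 \end{bmatrix} U = \begin{bmatrix} \hat{V} \hat{R}_1 \\ \hat{V} \hat{R}_2 \end{bmatrix} \begin{bmatrix} U \tilde{H} \\ \beta e_k^T U \end{bmatrix} . \]

Denote

\[ \hat{U} = \begin{bmatrix} U & \\ & 1 \end{bmatrix} . \]

Then

\[ \mathrm{OP} \begin{bmatrix} V R_1 U \\ V R_2 U \end{bmatrix} = \begin{bmatrix} \hat{V} \hat{R}_1 \hat{U} \\ \hat{V} \hat{R}_2 \hat{U} \end{bmatrix} \begin{bmatrix} \tilde{H} \\ \beta e_k^T U \end{bmatrix} . \]

Its first $p$ columns, denoted by

\[ \mathrm{OP} \begin{bmatrix} V_k R_{1,p} \\ V_k R_{2,p} \end{bmatrix} = \begin{bmatrix} V_{k+1} R_{1,p+1} \\ V_{k+1} R_{2,p+1} \end{bmatrix} \begin{bmatrix} H_p \\ \tilde{\beta} e_p^T \end{bmatrix}, \]

still form an Arnoldi decomposition of order $p$. However, $V_k$ has $r_k$ (instead of $r_p$) columns, so it is not a CARD.

Since

\[ \begin{bmatrix} V_{k+1} R_{1,p+1} \\ V_{k+1} R_{2,p+1} \end{bmatrix} \]

is the orthonormal factor of an Arnoldi decomposition, the previous theorem gives

\[ \operatorname{rank}[V_{k+1} R_{1,p+1},\ V_{k+1} R_{2,p+1}] = \operatorname{rank}[R_{1,p+1},\ R_{2,p+1}] = r_{p+1} \le p + 2, \]

and we have a compact SVD:

\[ [R_{1,p+1},\ R_{2,p+1}] = P\, \Sigma\, [G_1,\ G_2] \equiv P\, [R_1,\ R_2], \]

with $P \in \mathbb{C}^{r_{k+1} \times r_{p+1}}$, $\Sigma \in \mathbb{C}^{r_{p+1} \times r_{p+1}}$, and $G_1, G_2 \in \mathbb{C}^{r_{p+1} \times (p+1)}$.

Therefore, with $V_{k+1} \in \mathbb{C}^{n \times r_{k+1}}$ and $V_{k+1} P \in \mathbb{C}^{n \times r_{p+1}}$,

\[ \begin{bmatrix} V_{k+1} R_{1,p+1} \\ V_{k+1} R_{2,p+1} \end{bmatrix} = \begin{bmatrix} (V_{k+1} P) R_1 \\ (V_{k+1} P) R_2 \end{bmatrix} . \]

The Arnoldi decomposition is expressed in CARD again.

This process can also be implemented by a compact "QR" decomposition, which is similar to the compact Arnoldi decomposition (CARD). Details omitted.

POLYAR

POLYAR: modified ARPACK for polynomial eigenvalue problems (not only QEPs)

1. znaitr_p: compute CARD
2. znapps_p: IRA with CARD; uses the LAPACK routine zgesdd to compute the SVD
3. znaupd_p, znaupd2_p, zgetv0_p: slightly revised (arguments, storage)
4. zgemip (added): compute inner product in compact form

3. Numerical Comparison

Example 1: A random QEP

Problem: QEP (pdeg=2)
Size: n=500
Environment: PC (EMS memory: 512M)
Random matrices M, D, K (each matrix has about 24,000 nonzero elements).
We choose the shift $\sigma = 1$ and use shift-invert mode to compute eigenvalues close to 1.
LU factorization of $\sigma^2 M + \sigma D + K$: L and U contain about 120,000 nonzero elements.
Compute 8 eigenvalues close to 1.
Use 30 Arnoldi basis vectors, that is, 31 CARD basis vectors.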
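The ARPACK side of this setup can be approximated with SciPy, whose `eigs` function wraps ARPACK. The sketch below is not the experiment reported here: the random matrices, their density, and the seed are assumptions chosen only to resemble the description (n = 500, roughly 24,000 nonzeros per matrix, $\sigma = 1$, 8 eigenvalues, 30 Arnoldi vectors), and it uses the plain linearized operator of size $2n$, not POLYAR.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, eigs, splu
from scipy.sparse.linalg import norm as spnorm

rng = np.random.default_rng(3)
n, density, sigma = 500, 0.096, 1.0        # about 24,000 nonzeros per matrix

def random_sparse(n, density):
    """Hypothetical test matrix: standard normal entries on a random pattern."""
    mask = rng.random((n, n)) < density
    return sp.csc_matrix(mask * rng.standard_normal((n, n)))

M, D, K = (random_sparse(n, density) for _ in range(3))

# Sparse LU factorization of sigma^2 M + sigma D + K, computed once.
lu = splu(sp.csc_matrix(sigma**2 * M + sigma * D + K))

def matvec(z):
    """Apply [[A, B], [I, 0]] with A, B defined through the LU solve."""
    z1, z2 = z[:n], z[n:]
    top = -lu.solve((2 * sigma * M + D) @ z1 + M @ z2)
    return np.concatenate([top, z1])

OP = LinearOperator((2 * n, 2 * n), matvec=matvec, dtype=np.float64)

# Largest-magnitude mu correspond to lambda closest to sigma.
mu, Z = eigs(OP, k=8, ncv=30, which="LM")
lam = sigma + 1.0 / mu

def qep_residual(l, x):
    """Relative residual of (l^2 M + l D + K) x, scaled with Frobenius norms."""
    r = l**2 * (M @ x) + l * (D @ x) + K @ x
    scale = abs(l)**2 * spnorm(M) + abs(l) * spnorm(D) + spnorm(K)
    return np.linalg.norm(r) / (scale * np.linalg.norm(x))

# In the linearization z = [mu x; x], so x is the lower half of each eigenvector.
residuals = [qep_residual(l, Z[n:, i]) for i, l in enumerate(lam)]
print(lam)
print(max(residuals))
```

Here ARPACK stores `ncv` basis vectors of length $2n$, which is the storage that the compact representation replaces with basis vectors of length $n$.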
A random QEP

Computed eigenvalues:

            ARPACK                          POLYAR
      Real          Imag            Real          Imag
1     1.02817D+00   1.38768D-01     1.02817D+00   1.38768D-01
2     1.11582D+00   3.61818D-02     1.11582D+00   3.61818D-02
3     1.11582D+00  -3.61818D-02     1.11582D+00  -3.61818D-02
4     1.05613D+00  -1.05380D-01     1.05613D+00  -1.05380D-01
5     1.05613D+00   1.05380D-01     1.05613D+00   1.05380D-01
6     9.34692D-01  -2.73028D-15     9.34692D-01   4.39496D-15
7     1.00023D+00   5.87804D-02     1.00023D+00   5.87804D-02
8     1.00023D+00  -5.87804D-02     1.00023D+00  -5.87804D-02

A random QEP

Storage comparison:

          ARPACK           POLYAR
V         500 × 2 × 30     500 × 31
workd     3 × 2 × 500      (2 + 2) × 500
Resid     500 × 2          500 × 2

A random QEP

Iteration and time comparison:

                                 ARPACK      POLYAR
update iterations                4           4
OP × x operations                86          86
reorthogonalizations of V        84          84
reorthogonalizations of R        0           84
user's OP × x operations         1.375000    1.250000
naupd2                           1.609375    1.515625
basic Arnoldi iteration loop     1.531250    1.437500
reorthogonalization phase        0.093750    0.046875
Hessenberg eig subproblem        0.031250    0.031250
applying the shifts              0.046875    0.046875
calling gesdd                    0           0.000000

Example 2: A QEP from SLAC

Problem: QEP (pdeg=2)
Size: n=5384
Environment: PC (EMS memory: 512M)
Genuine matrices M, D, K (with 61425, 1183, and 61425 nonzero elements, respectively).
We choose the shift $\sigma = -10i$ and use shift-invert mode to compute eigenvalues close to $-10i$.
LU factorization of $\sigma^2 M + \sigma D + K$: L and U have 749610 and 780229 nonzero elements, respectively.
