
4 Parallel Algorithms for the Singular Value Decomposition

Michael W. Berry, Dani Mezher, Bernard Philippe, and Ahmed Sameh

CONTENTS

Abstract
4.1 Introduction
    4.1.1 Basics
    4.1.2 Sensitivity of the Smallest Singular Value
    4.1.3 Distance to Singularity — Pseudospectrum
4.2 Jacobi Methods for Dense Matrices
    4.2.1 Two-Sided Jacobi Scheme [2JAC]
    4.2.2 One-Sided Jacobi Scheme [1JAC]
    4.2.3 Algorithm [QJAC]
    4.2.4 Block-Jacobi Algorithms
4.3 Methods for Large and Sparse Matrices
    4.3.1 Sparse Storages and Linear Systems
    4.3.2 Subspace Iteration [SISVD]
    4.3.3 Lanczos Methods
        4.3.3.1 The Single-Vector Lanczos Method [LASVD]
        4.3.3.2 The Block Lanczos Method [BLSVD]
    4.3.4 The Trace Minimization Method [TRSVD]
        4.3.4.1 Polynomial Acceleration Techniques for [TRSVD]
        4.3.4.2 Shifting Strategy for [TRSVD]
    4.3.5 Refinement of Left Singular Vectors
    4.3.6 The Davidson Methods
        4.3.6.1 General Framework of the Methods
        4.3.6.2 How do the Davidson Methods Differ?
        4.3.6.3 Application to the Computation of the Smallest Singular Value
4.4 Parallel Computation for Sparse Matrices
    4.4.1 Parallel Matrix–Vector Multiplications
    4.4.2 A Parallel Scheme for Basis Orthogonalization
    4.4.3 Computing the Smallest Singular Value on Several Processors
4.5 Application: Parallel Computation of a Pseudospectrum
    4.5.1 Parallel Path-Following Algorithm Using Triangles
    4.5.2 Speedup and Efficiency
    4.5.3 Test Problems
References


ABSTRACT

The goal of this survey is to review the state of the art of computing the Singular Value Decomposition (SVD) of dense and sparse matrices, with some emphasis on those schemes that are suitable for parallel computing platforms. For dense matrices, we present those schemes that yield the complete decomposition, whereas for sparse matrices we describe schemes that yield only the extremal singular triplets. Special attention is devoted to the computation of the smallest singular values, which are normally the most difficult to evaluate but which provide a measure of the distance to singularity of the matrix under consideration. We conclude with the presentation of a parallel method for computing pseudospectra, which depends on computing the smallest singular values.

4.1 INTRODUCTION

4.1.1 BASICS

The Singular Value Decomposition (SVD) is a powerful computational tool. Modern algorithms for obtaining such a decomposition of general matrices have had a profound impact on numerous applications in science and engineering disciplines. The SVD is commonly used in the solution of unconstrained linear least squares problems, matrix rank estimation, and canonical correlation analysis. In computational science, it is commonly applied in domains such as information retrieval, seismic reflection tomography, and real-time signal processing [1].

In what follows, we provide a brief survey of some parallel algorithms for obtaining the SVD of dense and large sparse matrices. For sparse matrices, however, we focus mainly on the problem of obtaining the smallest singular triplets. To introduce the notation of the chapter, the basic facts related to the SVD are presented without proof. Complete presentations are given in many textbooks, for instance [2, 3].

Theorem 4.1 [SVD] If A ∈ R^{m×n} is a real matrix, then there exist orthogonal matrices

U = [u1, ..., um] ∈ R^{m×m}  and  V = [v1, ..., vn] ∈ R^{n×n}

such that

Σ = U^T AV = diag(σ1, ..., σp) ∈ R^{m×n},  p = min(m, n),   (4.1)

where σ1 ≥ σ2 ≥···≥σp ≥ 0.

Definition 4.1 The singular values of A are the real numbers σ1 ≥ σ2 ≥ ··· ≥ σp ≥ 0. They are uniquely defined. For every singular value σi (i = 1, ..., p), the vectors ui and vi are respectively the left and right singular vectors of A associated with σi.

More generally, the theorem holds in the complex field, but with the inner products and orthogonal matrices replaced by Hermitian products and unitary matrices, respectively. The singular values, however, remain real nonnegative values.

Theorem 4.2 Let A ∈ R^{m×n} (m ≥ n) have the singular value decomposition

U^T AV = Σ.

Then, the symmetric matrix

B = A^T A ∈ R^{n×n},   (4.2)


has eigenvalues σ1² ≥ ··· ≥ σn² ≥ 0, corresponding to the eigenvectors vi (i = 1, ..., n).

The symmetric matrix

Aaug = [ 0  A ; A^T  0 ]   (4.3)

has eigenvalues ±σ1, ..., ±σn, corresponding to the eigenvectors (1/√2) [ ui ; ±vi ], i = 1, ..., n.

The matrix Aaug is called the augmented matrix. Every method for computing singular values is based on one of these two matrices. The numerical accuracy of the ith approximate singular triplet (ũi, σ̃i, ṽi) determined via the eigensystem of the 2-cyclic matrix Aaug is then measured by the norm of the eigenpair residual vector ri defined by

‖ri‖2 = ‖Aaug (ũi, ṽi)^T − σ̃i (ũi, ṽi)^T‖2 / [ ‖ũi‖2² + ‖ṽi‖2² ]^{1/2},

which can also be written as

‖ri‖2 = [ (‖A ṽi − σ̃i ũi‖2² + ‖A^T ũi − σ̃i ṽi‖2²)^{1/2} ] / [ ‖ũi‖2² + ‖ṽi‖2² ]^{1/2}.   (4.4)

The backward error [4]

ηi = max{ ‖A ṽi − σ̃i ũi‖2 , ‖A^T ũi − σ̃i ṽi‖2 }

may also be used as a measure of absolute accuracy. A normalizing factor can be introduced for assessing relative errors.

Alternatively, we may compute the SVD of A indirectly from the eigenpairs of either the n × n matrix A^T A or the m × m matrix AA^T. If V = {v1, v2, ..., vn} is the n × n orthogonal matrix representing the eigenvectors of A^T A, then

V^T (A^T A) V = diag(σ1², σ2², ..., σr², 0, ..., 0)   (n − r zeros),

where σi is the ith nonzero singular value of A corresponding to the right singular vector vi. The corresponding left singular vector, ui, is then obtained as ui = (1/σi) A vi. Similarly, if U = {u1, u2, ..., um} is the m × m orthogonal matrix representing the eigenvectors of AA^T, then

U^T (AA^T) U = diag(σ1², σ2², ..., σr², 0, ..., 0)   (m − r zeros),

where σi is the ith nonzero singular value of A corresponding to the left singular vector ui. The corresponding right singular vector, vi, is then obtained as vi = (1/σi) A^T ui.

Computing the SVD of A via the eigensystems of either A^T A or AA^T may be adequate for determining the largest singular triplets of A, but some loss of accuracy may be observed for the smallest singular triplets (see Ref. [5]). In fact, extremely small singular values of A (i.e., smaller than √ε ‖A‖, where ε is the machine precision parameter) may be computed as zero eigenvalues of A^T A (or AA^T). Whereas the smallest and largest singular values of A are the lower and upper bounds of the spectrum of A^T A or AA^T, the smallest singular values of A lie at the center of the


spectrum of Aaug in (4.3). For computed eigenpairs of A^T A and AA^T, the norms of the ith eigenpair residuals (corresponding to (4.4)) are given by

‖ri‖2 = ‖A^T A ṽi − σ̃i² ṽi‖2 / ‖ṽi‖2   and   ‖ri‖2 = ‖AA^T ũi − σ̃i² ũi‖2 / ‖ũi‖2,

respectively. Thus, extremely high precision in the computed eigenpairs may be necessary to compute the smallest singular triplets of A. This fact is analyzed in Section 4.1.2. Difficulties in approximating the smallest singular values by any of the three equivalent symmetric eigenvalue problems will be discussed in Section 4.4.

When A is a square nonsingular matrix, it may be advantageous in certain cases to compute the singular values of A^{-1}, which are (1/σn) ≥ ··· ≥ (1/σ1). This approach has the drawback of requiring the solution of linear systems involving the matrix A but, when manageable, it provides a more robust algorithm. Such an alternative is of interest for some subspace methods (see Section 4.3). Actually, the method can be extended to rectangular matrices of full rank by considering a QR-factorization:

Proposition 4.1 Let A ∈ R^{m×n} (m ≥ n) be of rank n. Let

A = QR,  where Q ∈ R^{m×n} and R ∈ R^{n×n},

such that Q^T Q = In and R is upper triangular. Then the singular values of R are the same as the singular values of A. Therefore, the smallest singular value of A can be computed from the largest eigenvalue of (R^{-1} R^{-T}) or of

[ 0  R^{-1} ; R^{-T}  0 ].
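As a concrete illustration of Proposition 4.1 (ours, not part of the original text), the following NumPy sketch computes the smallest singular value of a full-rank rectangular matrix from its triangular factor R and checks it against the largest eigenvalue of R^{-1} R^{-T}; the test matrix and variable names are arbitrary.

    # Sketch illustrating Proposition 4.1 with NumPy:
    # sigma_min(A) = sigma_min(R) = 1 / sqrt(lambda_max(R^{-1} R^{-T})).
    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 50))            # full column rank with probability 1

    Q, R = np.linalg.qr(A)                        # thin QR: Q is 200 x 50, R is 50 x 50
    sigma_min_A = np.linalg.svd(A, compute_uv=False)[-1]
    sigma_min_R = np.linalg.svd(R, compute_uv=False)[-1]

    Rinv = np.linalg.solve(R, np.eye(R.shape[0]))    # R^{-1} (R is nonsingular here)
    lam_max = np.linalg.eigvalsh(Rinv @ Rinv.T)[-1]  # largest eigenvalue of R^{-1} R^{-T}

    print(sigma_min_A, sigma_min_R, 1.0 / np.sqrt(lam_max))   # three equal values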

4.1.2 SENSITIVITY OF THE SMALLEST SINGULAR VALUE

To compute the smallest singular value in a reliable way, one must investigate the sensitivity of the singular values with respect to perturbations of the matrix at hand.

Theorem 4.3 Let A ∈ R^{n×n} and ∆ ∈ R^{n×n}. The singular values of A and A + ∆ are respectively denoted

σ1 ≥ σ2 ≥ ··· ≥ σn   and   σ̃1 ≥ σ̃2 ≥ ··· ≥ σ̃n.

They satisfy the following bounds:

|σi − σ̃i| ≤ ‖∆‖2,  for i = 1, ..., n.

Proof. See Ref. [3].

When applied to the smallest singular value, this result yields the following estimate.

Proposition 4.2 The relative condition number of the smallest singular value of a nonsingular matrix A is equal to χ2(A) = ‖A‖2 ‖A^{-1}‖2.

Proof. The result is obtained from

|σn − σ̃n| / σn ≤ (‖∆‖2 / ‖A‖2) (‖A‖2 / σn).


This means that the smallest singular value of an ill-conditioned matrix cannot be computed with high accuracy, even with an algorithm of perfect arithmetic behavior (i.e., a backward stable one). Recently, some progress has been made [6]: it is shown that for some special classes of matrices, an accurate computation of the smallest singular value may be obtained via a combination of a QR-factorization with column pivoting and a one-sided Jacobi algorithm (see Section 4.2.2).

Because the nonzero singular values are roots of a polynomial (e.g., roots of the characteristic polynomial of the augmented matrix), they are differentiable with respect to the entries of the matrix whenever they are simple. More precisely, one can state the following:

Theorem 4.4 Let σ be a nonzero simple singular value of the matrix A = (aij), with u = (ui) and v = (vi) being the corresponding normalized left and right singular vectors. Then the singular value is differentiable with respect to the matrix A, and

∂σ/∂aij = ui vj,  ∀ i, j = 1, ..., n.

Proof. See Ref. [7].

The effect of a perturbation of the matrix on the singular vectors can be more significant than its effect on the singular values. The sensitivity of the singular vectors depends on the singular value distribution. When a simple singular value is not well separated from the rest, the corresponding left and right singular vectors are poorly determined. This is made precise by the following theorem, see Ref. [3], which we state here without proof.

Let A ∈ R^{n×m} (n ≥ m) have the SVD

U^T AV = [ Σ ; 0 ].

Partition U = (u1  U2  U3) and V = (v1  V2), where u1 ∈ R^n, U2 ∈ R^{n×(m−1)}, U3 ∈ R^{n×(n−m)}, v1 ∈ R^m, and V2 ∈ R^{m×(m−1)}. Partition conformally

U^T AV = [ σ1  0 ; 0  Σ2 ; 0  0 ].

Given a perturbation Ã = A + E of A, let

U^T EV = [ γ11  g12^T ; g21  G22 ; g31  G32 ].

Theorem 4.5 Let h = σ1 g12 + Σ2 g21. If (σ1² I − Σ2²) is nonsingular (i.e., if σ1 is a simple singular value of A), then the matrix

U^T ÃV = [ σ1 + γ11  g12^T ; g21  Σ2 + G22 ; g31  G32 ]

has a right singular vector of the form

[ 1 ; (σ1² I − Σ2²)^{-1} h ] + O(‖E‖²).

It was remarked in Theorem 4.1 that the SVD of A could be obtained from the eigendecomposition of the matrix C = A^T A or of the augmented matrix Aaug = [ 0  A ; A^T  0 ]. It


is clear, however, that using C to compute the smallest singular value of A is bound to yield poorer results, as the condition number of C is the square of the condition number of Aaug. It can be shown, however, that even with an ill-conditioned matrix A, the matrix C can be used to compute accurate singular values.
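The following small experiment (ours, not the authors') illustrates the accuracy loss a naive use of C = A^T A can incur; the construction of the test matrix is arbitrary, with known singular values so the error is visible.

    # Numerical illustration: a matrix with sigma_min = 1e-10 < sqrt(eps)*||A||,
    # so that sigma_min^2 falls below machine precision and is lost when
    # recovered from the eigenvalues of A^T A.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 20
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    sigma = np.logspace(0, -10, n)                    # exact singular values
    A = U @ np.diag(sigma) @ V.T

    sigma_min_svd = np.linalg.svd(A, compute_uv=False)[-1]
    sigma_min_eig = np.sqrt(abs(np.linalg.eigvalsh(A.T @ A)[0]))

    print(f"exact            : {sigma[-1]:.6e}")
    print(f"via SVD of A     : {sigma_min_svd:.6e}")  # ~1e-10, several correct digits
    print(f"via eig of A^T A : {sigma_min_eig:.6e}")  # dominated by rounding error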

4.1.3 DISTANCE TO SINGULARITY — PSEUDOSPECTRUM

Let us consider a linear system defined by a square matrix A ∈ R^{n×n}. One often needs to quantify how far the system under consideration is from being singular. It turns out that the smallest singular value σmin(A) is equal to that distance. Let S be the set of all singular matrices in R^{n×n}, and consider the distance corresponding to the 2-norm:

d(A, B) = ‖A − B‖2  for A, B ∈ R^{n×n}.

Theorem 4.6 d(A,S) = σmin(A).

Proof. Let us denote σ = σmin(A) and d = d(A, S). There exist two unit vectors u and v such that Au = σv. Therefore (A − σ v u^T) u = 0. Since ‖σ v u^T‖2 = σ, it follows that B = A − σ v u^T ∈ S and d(A, B) = σ, which proves that d ≤ σ. Conversely, let us consider any matrix ∆ ∈ R^{n×n} such that (A + ∆) ∈ S. There exists a unit vector u such that (A + ∆)u = 0. Therefore

σ ≤ ‖Au‖2 = ‖∆u‖2 ≤ ‖∆‖2,

which concludes the proof.

This result leads to a useful lower bound on the condition number of a matrix.

Proposition 4.3 The condition number χ2(A) = ‖A‖2 ‖A^{-1}‖2 satisfies

χ2(A) ≥ ‖A‖2 / ‖A − B‖2,

for any singular matrix B ∈ S.

Proof. The result follows from the fact that B = A + (B − A) and ‖A^{-1}‖2 = 1/σmin(A).

In Ref. [8], for instance, this property is used to illustrate that the condition number of the linear systems arising from the simulation of flow in porous media, using mixed finite element methods, is of the order of the ratio of the extreme values of conductivity.

Let us now consider the sensitivity of eigenvalues with respect to matrix perturbations. Toward this goal, the notion of pseudospectrum [9] or of ε-spectrum [10] was introduced:

Definition 4.2 For a matrix A ∈ R^{n×n} (or A ∈ C^{n×n}) and a parameter ε > 0, the pseudospectrum is the set

Λε(A) = { z ∈ C | ∃ ∆ ∈ C^{n×n} such that ‖∆‖ ≤ ε and z is an eigenvalue of (A + ∆) }.   (4.5)

This definition does not provide a constructive method for determining the pseudospectrum. Fortunately, a constructive method can be drawn from the following property. Proposition 4.4 The pseudospectrum is the set:

Λε(A) = { z ∈ C | σmin(A − zI) ≤ ε },   (4.6)

where I is the identity matrix of order n.


Proof. For any z ∈ C, z is an eigenvalue of (A + ∆) if and only if the matrix (A − zI) + ∆ is singular. The proposition is therefore a straight application of Theorem 4.6.
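Before turning to the grid-based evaluation discussed next, here is a minimal dense-matrix sketch (ours) of the criterion of Proposition 4.4: a grid point zij is flagged as belonging to Λε(A) whenever σmin(A − zij I) ≤ ε. The helper name and the toy matrix are illustrative only.

    # Brute-force sampling of the epsilon-pseudospectrum on a rectangular grid,
    # using the criterion sigma_min(A - z I) <= epsilon of Proposition 4.4.
    # This is the "highly parallel but expensive" approach; Section 4.5
    # describes the cheaper path-following alternative.
    import numpy as np

    def pseudospectrum_mask(A, re_pts, im_pts, eps):
        """Boolean mask over the grid re_pts x im_pts marking Lambda_eps(A)."""
        n = A.shape[0]
        I = np.eye(n)
        smin = np.empty((len(im_pts), len(re_pts)))
        for p, y in enumerate(im_pts):
            for q, x in enumerate(re_pts):
                smin[p, q] = np.linalg.svd(A - (x + 1j * y) * I, compute_uv=False)[-1]
        return smin <= eps

    A = np.diag(np.arange(1.0, 5.0)) + 0.6 * np.triu(np.ones((4, 4)), 1)   # toy example
    mask = pseudospectrum_mask(A, np.linspace(0.0, 5.0, 80),
                               np.linspace(-2.0, 2.0, 80), eps=0.3)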

This proposition provides a criterion for deciding whether z belongs to Λε(A). To represent the pseudospectrum graphically, one can define a grid in the complex region under consideration and compute σmin(A − zij I) for all the zij determined by the grid. Although highly parallel, this approach involves a very high volume of operations. Presently, one prefers to use path-following techniques [11–13]. Section 4.5 describes one of these methods. For illustration, the pseudospectrum of a matrix from the Matrix Market [14] test suite is displayed in Figure 4.1, where several values of ε are shown.

In what follows, we present a selection of parallel algorithms for computing the SVD of dense and sparse matrices. For dense matrices, we restrict our survey to the Jacobi methods for obtaining all the singular triplets, a class of methods not contained in ScaLAPACK [15, 16]. The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines redesigned for distributed-memory Multiple Instruction Multiple Data (MIMD) parallel computers. It is currently written in a Single-Program-Multiple-Data style using explicit message passing for interprocessor communication, and it assumes that matrices are laid out in a two-dimensional block-cyclic fashion. The routines of the library achieve respectable efficiency on parallel computers, and the software is considered to be robust. Some projects are under way, however, for making the use of ScaLAPACK in large-scale applications more user-friendly. Examples include the PLAPACK project [17, 18], the OURAGAN project which is based on SCILAB [19], as well as projects based on MATLAB [20]. None of these projects, however, provides all the capabilities of ScaLAPACK. For sparse matrices, we concentrate our exposition on those schemes that obtain the smallest singular triplets. We devote the last section to the vital primitives that help assure the realization of high performance on parallel computing platforms.

4.2 JACOBI METHODS FOR DENSE MATRICES

4.2.1 TWO-SIDED JACOBI SCHEME [2JAC]

Here, we consider the standard eigenvalue problem

Bx = λx (4.7)

where B is a real n × n dense symmetric matrix. One of the best known methods for determining all the eigenpairs of (4.7) was developed by the 19th century mathematician Jacobi. We recall that Jacobi's sequential method reduces the matrix B to diagonal form by an infinite sequence of plane rotations

Bk+1 = Uk^T Bk Uk,  k = 1, 2, ...,

where B1 ≡ B, and Uk = Uk(i, j, θij^k) is a rotation of the (i, j)-plane in which

uii^k = ujj^k = ck = cos θij^k  and  uij^k = −uji^k = sk = sin θij^k.

The angle θij^k is determined so that bij^{k+1} = bji^{k+1} = 0, or

tan 2θij^k = 2 bij^k / (bii^k − bjj^k),

where |θij^k| ≤ π/4.


[Figure 4.1 appears here: three panels showing ε-pseudospectrum level curves of the matrix DW8192 for several values of ε.]

FIGURE 4.1 Portraits of the matrix DW8192.


For numerical stability, we determine the plane rotation by

ck = 1/√(1 + tk²)  and  sk = ck tk,

where tk is the smaller root (in magnitude) of the quadratic equation

tk² + 2 αk tk − 1 = 0,  with αk = cot 2θij^k.

Hence, tk may be written as

tk = sign(αk) / ( |αk| + √(1 + αk²) ).

Each Bk+1 remains symmetric and differs from Bk only in rows and columns i and j, where the modified elements are given by

bii^{k+1} = bii^k + tk bij^k,
bjj^{k+1} = bjj^k − tk bij^k,

and

bir^{k+1} = ck bir^k + sk bjr^k,   (4.8)
bjr^{k+1} = −sk bir^k + ck bjr^k,   (4.9)

in which r ≠ i, j. If we represent Bk by

Bk = Dk + Ek + Ek^T,   (4.10)

where Dk is diagonal and Ek is strictly upper triangular, then as k increases, ‖Ek‖F approaches zero and Bk approaches the diagonal matrix diag(λ1, λ2, ..., λn) (‖·‖F denotes the Frobenius norm). Similarly, the transpose of the product (Uk ··· U2 U1) approaches a matrix whose jth column is the eigenvector corresponding to λj.

Several schemes are possible for selecting the sequence of elements bij^k to be eliminated via the plane rotations Uk. Unfortunately, Jacobi's original scheme, which consists of sequentially searching for the largest off-diagonal element, is too time consuming for implementation on a multiprocessor. Instead, a simpler scheme in which the off-diagonal elements (i, j) are annihilated in the cyclic fashion (1, 2), (1, 3), ..., (1, n), (2, 3), ..., (2, n), ..., (n − 1, n) is usually adopted, as its convergence is assured [21]. We refer to each such sequence of rotations, covering every off-diagonal element once, as a sweep. Furthermore, quadratic convergence for this sequential cyclic Jacobi scheme has been well documented (see Refs. [22, 23]). Convergence usually occurs within 6 to 10 sweeps, i.e., from 3n² to 5n² Jacobi rotations.

A parallel version of this cyclic Jacobi algorithm is obtained by the simultaneous annihilation of several off-diagonal elements by a given Uk, rather than only one as is done in the serial version. For example, let B be of order 8 and consider the orthogonal matrix Uk as the direct sum of four independent plane rotations, where the ci's and si's for i = 1, 2, 3, 4 are simultaneously determined. An example of such a matrix is

Rk(1, 3) ⊕ Rk(2, 4) ⊕ Rk(5, 7) ⊕ Rk(6, 8),


where Rk(i, j) is the rotation that annihilates the (i, j) off-diagonal element. If we consider one sweep to be a collection of orthogonal similarity transformations that annihilate the element in each of the n(n − 1)/2 off-diagonal positions (above the main diagonal) only once, then for a matrix of order 8, the first sweep will consist of eight successive orthogonal transformations, each one annihilating a distinct group of four elements simultaneously. For the remaining sweeps, the structure of each subsequent transformation Uk, k > 8, is chosen to be the same as that of Uj where j = 1 + (k − 1) mod 8. In general, the most efficient annihilation scheme consists of (2r − 1) similarity transformations per sweep, where r = ⌊(n + 1)/2⌋, in which each transformation annihilates a different set of ⌊n/2⌋ off-diagonal elements (see Ref. [24]). Although several annihilation schemes are possible, the Jacobi algorithm we present below utilizes an annihilation scheme which requires a minimal amount of indexing for computer implementation. Moreover, Luk and Park [25, 26] have demonstrated that various parallel Jacobi rotation ordering schemes are equivalent to the sequential row-ordering scheme, and hence share the same convergence properties.

Algorithm [2JAC]

Step 1: (Apply orthogonal similarity transformations via Uk for the current sweep).

1. (a) For k = 1, 2, 3, ..., n − 1 (serial loop), simultaneously annihilate the elements in positions (i, j), where

i = 1, 2, 3, ..., ⌈(n − k)/2⌉,  j = (n − k + 2) − i,

and, for k > 2,

i = (n − k + 2), (n − k + 3), ..., n − ⌊k/2⌋,  j = (2n − k + 2) − i.

(b) For k = n, simultaneously annihilate the elements in positions (i, j), where

i = 2, 3, ..., ⌊n/2⌋,  j = (n + 2) − i.

Step 2: (Convergence test).

1. (a) Compute ‖Dk‖F and ‖Ek‖F (see (4.10)).
(b) If

‖Ek‖F / ‖Dk‖F < tolerance,   (4.11)

then stop. Otherwise, go to Step 1 to begin the next sweep.

We note that this algorithm requires n similarity transformations per sweep for a dense real symmetric matrix of order n (n may be even or odd). Each Uk is the direct sum of either ⌊n/2⌋ or ⌊(n − 1)/2⌋ plane rotations, depending on whether k is odd or even, respectively. The annihilation pattern for n = 8 is shown in Table 4.1, where the integer k denotes an element annihilated via Uk.


TABLE 4.1 Annihilation Scheme for [2JAC]

x  7  6  5  4  3  2  1
   x  5  4  3  2  1  8
      x  3  2  1  8  7
         x  1  8  7  6
            x  7  6  5
               x  5  4
                  x  3
                     x

In the annihilation of a particular (i, j)-element in Step 1 above, we update the off-diagonal elements in rows and columns i and j as specified by (4.8) and (4.9). With regard to storage requirements, it would be advantageous to modify only those row or column entries above the main diagonal and utilize the guaranteed symmetry of Bk. However, if one wishes to take advantage of the vectorization that may be supported by the parallel computing platform, we disregard the symmetry of Bk and operate with full vectors on the entirety of rows and columns i and j in (4.8) and (4.9), i.e., we use a full matrix scheme. The product of the Uk's, which eventually yields the eigenvectors of B, is accumulated in a separate two-dimensional array by applying (4.8) and (4.9) to the n × n identity matrix.

In Step 2, we monitor the convergence of the algorithm by using the ratio of the computed norms to measure the systematic decrease in the relative magnitudes of the off-diagonal elements with respect to the relative magnitudes of the diagonal elements. For double precision accuracy in the eigenvalues and eigenvectors, a tolerance of order 10^{-16} will suffice for Step 2(b). If we assume convergence (see Ref. [25]), this multiprocessor algorithm can be shown to converge quadratically by following the work of Wilkinson [23] and Henrici [27].
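To make the indexing of Step 1 concrete, the short sketch below (ours; the ceiling/floor choices are our reading of the formulas, fixed so as to reproduce Table 4.1) generates the groups of index pairs annihilated by each Uk for an even order n.

    # Generate the [2JAC] annihilation schedule of Step 1 for even n and check
    # that one sweep touches every off-diagonal pair exactly once (reproduces
    # Table 4.1 for n = 8).  The bracket choices are assumptions consistent
    # with that table.
    from math import ceil, floor

    def annihilation_schedule(n):
        """Map k -> list of (i, j) pairs (1-based) annihilated simultaneously by U_k."""
        groups = {}
        for k in range(1, n):                                        # Step 1(a)
            pairs = [(i, n - k + 2 - i) for i in range(1, ceil((n - k) / 2) + 1)]
            if k > 2:
                pairs += [(i, 2 * n - k + 2 - i)
                          for i in range(n - k + 2, n - floor(k / 2) + 1)]
            groups[k] = pairs
        groups[n] = [(i, n + 2 - i) for i in range(2, n // 2 + 1)]   # Step 1(b)
        return groups

    sched = annihilation_schedule(8)
    covered = sorted(p for pairs in sched.values() for p in pairs)
    assert covered == [(i, j) for i in range(1, 8) for j in range(i + 1, 9)]
    print(sched[1])   # [(1, 8), (2, 7), (3, 6), (4, 5)], the k = 1 rotation set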

4.2.2 ONE-SIDED JACOBI SCHEME [1JAC]

Suppose that A is a real m × n matrix with m ≥ n and rank A = r. The SVD of A can be defined as

A = UΣV^T,   (4.12)

where U^T U = V^T V = In and Σ = diag(σ1, ..., σn), with σi > 0 for 1 ≤ i ≤ r and σj = 0 for j ≥ r + 1. The first r columns of the orthonormal matrix U and the orthogonal matrix V define the orthonormalized eigenvectors associated with the r nonzero eigenvalues of AA^T or A^T A.

As indicated in Ref. [28] for a ring of processors, using a method based on the one-sided iterative orthogonalization method of Hestenes (see also Refs. [29, 30]) is an efficient way to compute the decomposition (4.12). Luk [31] recommended this singular value decomposition scheme on the Illiac IV, and corresponding systolic algorithms associated with two-sided schemes have been presented in Refs. [32, 33]. We now consider a few modifications to the scheme discussed in Ref. [28] for the determination of (4.12) on shared-memory multiprocessors; see, e.g., Ref. [34].


Our main goal is to determine the orthogonal matrix V = [Ṽ, W̃], where Ṽ is n × r, so that

AṼ = Q = (q1, q2, ..., qr),   (4.13)

and

qi^T qj = σi² δij,

where the columns of Q are mutually orthogonal and δij is the Kronecker delta. Writing Q as

Q = Ũ Σ̃  with  Ũ^T Ũ = Ir  and  Σ̃ = diag(σ1, ..., σr),

then A = Ũ Σ̃ Ṽ^T. We construct the matrix V via the plane rotations

(ai, aj) [ c  −s ; s  c ] = (ãi, ãj),  i < j,

so that

ãi^T ãj = 0  and  ‖ãi‖2 > ‖ãj‖2,   (4.14)

where ai designates the ith column of matrix A. This is accomplished by choosing

c = ( (β + γ)/(2γ) )^{1/2}  and  s = α/(2γc),  if β > 0,   (4.15)

or

s = ( (γ − β)/(2γ) )^{1/2}  and  c = α/(2γs),  if β < 0,   (4.16)

where α = 2 ai^T aj, β = ‖ai‖2² − ‖aj‖2², and γ = (α² + β²)^{1/2}. Note that (4.14) requires the columns of Q to decrease in norm from left to right, and hence the resulting σi to be in monotonic nonincreasing order.

Several schemes can be used to select the order of the (i, j)-plane rotations. Following the annihilation pattern of the off-diagonal elements in the sequential Jacobi algorithm mentioned in Section 4.2.1, we could certainly orthogonalize the columns in the same cyclic fashion and thus perform the one-sided orthogonalization serially. This process is iterative, with each sweep consisting of n(n − 1)/2 plane rotations selected in cyclic fashion. By adopting the ordering of the annihilation scheme in [2JAC], we obtain a parallel version of the one-sided Jacobi method for computing the singular value decomposition on a multiprocessor. For example, let n = 8 and m ≥ n, so that in each sweep of our one-sided Jacobi algorithm we simultaneously orthogonalize pairs of columns of A (see Table 4.1). For n = 8 we can orthogonalize the pairs (1,8), (2,7), (3,6), (4,5) simultaneously via postmultiplication by a matrix V1 which consists of the direct sum of four plane rotations. In general, each Vk will have the same form as Uk, so that at the end of any particular sweep si we have

Vs1 = V1 V2 ··· Vn,

and hence

V = Vs1 Vs2 ··· Vst,   (4.17)


where t is the number of sweeps required for convergence.

Algorithm [1JAC]

Step 1: (Postmultiply the matrix A by the orthogonal matrix Vk for the current sweep).

1. (a) Initialize the convergence counter, istop, to zero. (b) For k = 1, 2, 3,...,n − 1 (serial loop) simultaneously orthogonalize the column pairs (i, j), where i and j are given by 1(a) in Step 1 of [2JAC], provided that for each (i, j) we have

(ai^T aj)² / [ (ai^T ai)(aj^T aj) ] > tolerance,   (4.18)

and i, j ∈ {k | k < kmin}, where kmin is the minimal column index k such that ‖ak‖2² < tolerance. Upon the termination of [1JAC], the rank r of A is determined by kmin. Note: if (4.18) is not satisfied for any particular pair (i, j), istop is incremented by 1 and that rotation is not performed.

(c) For k = n, simultaneously orthogonalize the column pairs (i, j), where i and j are given by 1(b) in Step 1 of [2JAC].

Step 2: (Convergence test).

If istop = n(n − 1)/2, then compute σi = ((A^T A)ii)^{1/2}, i = 1, 2, ..., r, and stop. Otherwise, go to the beginning of Step 1 to start the next sweep.

In the orthogonalization of columns in Step 1, we are implementing the plane rotations specified by (4.15) and (4.16), and hence guaranteeing the ordering of column norms and singular values upon termination. Whereas [2JAC] must update rows and columns following each similarity transformation, [1JAC] performs only postmultiplication of A by each Vk, and hence the plane rotation (i, j) changes only the elements in columns i and j of matrix A. The changed elements can be represented by

ai^{k+1} = c ai^k + s aj^k,   (4.19)
aj^{k+1} = −s ai^k + c aj^k,   (4.20)

where ai denotes the ith column of matrix A, and c, s are determined by either (4.15) or (4.16). Since no row accesses are required and no columns are interchanged, one would expect good performance for this method on a machine which can apply vector operations to compute (4.19) and (4.20). Each processor is assigned one rotation and hence orthogonalizes one pair of the n columns of matrix A. Following the convergence test used in Ref. [30], in Step 2 we count the number of times the quantity

(ai^T aj) / [ (ai^T ai)(aj^T aj) ]^{1/2}   (4.21)

falls, in any sweep, below a given tolerance. The algorithm terminates when the counter istop reaches n(n − 1)/2, the total number of column pairs, after any sweep. Upon termination, the first r columns of the matrix A are overwritten by the matrix Q from (4.13), and hence the nonzero


singular values σi can be obtained via the r square roots of the first r diagonal entries of A^T A. The matrix Ũ in (4.12), which contains the leading r left singular vectors of the original matrix A, is readily obtained by scaling the columns of the resulting matrix A (now overwritten by Q = Ũ Σ̃) by the nonzero singular values σi. Similarly, the matrix V, which contains the right singular vectors of the original matrix A, is obtained as in (4.17) as the product of the orthogonal Vk's. This product is accumulated in a separate two-dimensional array by applying the rotations specified by (4.19) and (4.20) to the n × n identity matrix. It is important to note that the use of the ratio in (4.21) is preferable over the use of ai^T aj alone, since this dot product is necessarily small for relatively small singular values.

Although our derivation of [1JAC] is concerned with the singular value decomposition of rectangular matrices, it is most effective for solving the eigenvalue problem (4.7) for symmetric positive definite matrices. If m = n = r, B is a positive definite matrix, and Q in (4.13) is an orthogonal matrix. Consequently, it is not difficult to show that

σi = λi  and  xi = qi / λi,  i = 1, 2, ..., n,

where λi denotes the ith eigenvalue of B, xi the corresponding normalized eigenvector, and qi the ith column of matrix Q. If B is symmetric, but perhaps not positive definite, we can obtain its eigenvectors by considering instead B + αI, where α is the smallest quantity that ensures the definiteness of B + αI, and retrieve the eigenvalues of B via Rayleigh quotients. [1JAC] has two advantages over [2JAC]: (i) no row accesses are needed, and (ii) the matrix Q need not be accumulated.
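The following compact sketch (ours, serial and simplified: plain cyclic ordering instead of the parallel ordering of Table 4.1, and a single skip test in place of the istop counter) shows the Hestenes one-sided rotations (4.15)–(4.20) in NumPy; it assumes A has full column rank.

    # Serial sketch of the one-sided Jacobi scheme underlying [1JAC].
    # Rotations follow (4.15)/(4.16), column updates follow (4.19)/(4.20),
    # and the rotation test mirrors (4.18).
    import numpy as np

    def one_sided_jacobi(A, tol=1e-14, max_sweeps=30):
        A = np.array(A, dtype=float, copy=True)
        n = A.shape[1]
        V = np.eye(n)
        for _ in range(max_sweeps):
            rotated = False
            for i in range(n - 1):
                for j in range(i + 1, n):
                    aij = A[:, i] @ A[:, j]
                    aii = A[:, i] @ A[:, i]
                    ajj = A[:, j] @ A[:, j]
                    if aij * aij <= tol * aii * ajj:      # columns already orthogonal
                        continue
                    rotated = True
                    alpha, beta = 2.0 * aij, aii - ajj
                    gamma = np.hypot(alpha, beta)
                    if beta >= 0.0:                       # (4.15); beta = 0 folded in
                        c = np.sqrt((gamma + beta) / (2.0 * gamma))
                        s = alpha / (2.0 * gamma * c)
                    else:                                 # (4.16)
                        s = np.sqrt((gamma - beta) / (2.0 * gamma))
                        c = alpha / (2.0 * gamma * s)
                    G = np.array([[c, -s], [s, c]])
                    A[:, [i, j]] = A[:, [i, j]] @ G       # updates (4.19)/(4.20)
                    V[:, [i, j]] = V[:, [i, j]] @ G       # accumulate right vectors
            if not rotated:
                break
        sigma = np.linalg.norm(A, axis=0)                 # column norms = singular values
        return A / sigma, sigma, V                        # A = U diag(sigma) V^T

    B = np.random.default_rng(2).standard_normal((30, 6))
    U, s, V = one_sided_jacobi(B)
    print(np.allclose((U * s) @ V.T, B), np.allclose(V.T @ V, np.eye(6)))   # True True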

4.2.3 ALGORITHM [QJAC]

As discussed above, [1JAC] is certainly a viable candidate for computing the SVD (4.12) on multiprocessors. However, for m × n matrices A in which m ≫ n, the problem complexity can be reduced if an initial orthogonal factorization of A is performed. One can then apply the one-sided Jacobi method, [1JAC], to the resulting upper-triangular matrix R (which may be singular) and obtain the decomposition (4.12). In this section, we present a multiprocessor method, [QJAC], which can be quite effective for computing (4.12) on parallel machines.

Given the m × n matrix A, where m ≫ n, we perform a block generalization of Householder's reduction for the orthogonal factorization

A = QR,   (4.22)

where Q is an m × n orthonormal matrix and R is an n × n upper-triangular matrix. The block schemes of LAPACK are used for computing (4.22) to make use of vector–matrix, matrix–vector (BLAS2), and matrix–matrix (BLAS3) multiplication modules. The [1JAC] algorithm can then be used to obtain the SVD of the upper-triangular matrix R. Hence, the SVD of an m × n matrix A (m ≫ n, having rank r), defined by

A = UΣV^T,

where U^T U = V^T V = Ir and Σ = diag(σ1, ..., σr), σi > 0 for 1 ≤ i ≤ r, can be efficiently determined as follows:

Block Householder-Jacobi [QJAC]

Step 1: Apply block Householder reduction via (3.6) to the matrix A to obtain the factorization

A = QR,   (4.23)

where Q is m × n with orthonormal columns, and R is upper triangular of order n.


Step 2: Determine the SVD of the upper-triangular matrix R via [1JAC]:

R = Ũ Σ Ṽ^T,   (4.24)

where Ũ and Ṽ are n × r matrices having orthogonal columns (r ≡ rank A) and Σ = diag(σi) contains the r nonzero singular values of A.

Step 3: Recover the left singular vectors ui of A by back-transforming the columns of Ũ:

U = QŨ,   (4.25)

where Q is the product of the Householder transformations applied in Step 1 and ui is the ith column of U.

Note that in using [1JAC] for computing the SVD of R, we must iterate on a full n × n matrix which is initially upper-triangular. This sacrifice in storage must be made to capitalize upon the potential vectorization and parallelism inherent in [1JAC] on parallel machines with vector processors.

Charlier et al. [35] demonstrate that an implementation of Kogbetliantz's algorithm for computing the SVD of upper-triangular matrices is quite effective on a systolic array of processors. We recall that Kogbetliantz's method for computing the SVD of a real square matrix A mirrors the [2JAC] method for symmetric matrices, in that the matrix A is reduced to the diagonal form by an infinite sequence of plane rotations

Ak+1 = Uk^T Ak Vk,  k = 1, 2, ...,   (4.26)

where A1 ≡ A, and Vk = Vk(i, j, φij^k), Uk = Uk(i, j, θij^k) are orthogonal plane rotation matrices which deviate from In and Im, respectively, in the (i, i)-, (j, j)-, (i, j)-, and (j, i)-entries. It follows that Ak approaches the diagonal matrix Σ = diag(σ1, σ2, ..., σn), where σi is the ith singular value of A, and the products (Uk ··· U2 U1), (Vk ··· V2 V1) approach matrices whose ith column is the respective left and right singular vector corresponding to σi.

For the case when the σi's are not pathologically close, Paige and Van Dooren [36] have shown that the row (or column) cyclic Kogbetliantz method ultimately converges quadratically. For triangular matrices, Charlier and Van Dooren [37] have demonstrated that Kogbetliantz's algorithm converges quadratically for those matrices having multiple or clustered singular values, provided that singular values of the same cluster occupy adjacent diagonal positions of Aν, where ν is the number of sweeps required for convergence. Even if we were to assume that R in (4.22) satisfies this condition for quadratic convergence of the parallel Kogbetliantz method in [36], the ordering of the rotations and the subsequent row (or column) permutations needed to maintain the upper-triangular form are more suited to systolic architectures than to shared-memory parallel machines. One clear advantage of using [1JAC] to determine the SVD of R lies in the fact that the rotations defined by (4.15) or (4.16), as applied via the parallel ordering illustrated in Table 4.1, require no processor synchronization among any set of the ⌊n/2⌋ or ⌊(n − 1)/2⌋ simultaneous plane rotations. The convergence rate of [1JAC], however, does not necessarily match that of Kogbetliantz's algorithm. Let

Rk = Dk + Ek + Ek^T,


and

Sk = Rk^T Rk = D̃k + Ẽk + Ẽk^T,   (4.27)

where Dk, D̃k are diagonal matrices and Ek, Ẽk are strictly upper-triangular. Although we cannot guarantee quadratic convergence for [1JAC], we can always produce clustered singular values in adjacent positions of D̃k for any matrix A. If we monitor the magnitudes of the elements of D̃k and Ẽk in (4.27) for successive values of k in [1JAC] (for clustered singular values), Sk will converge to a diagonal form through an intermediate block diagonal form, where each of the principal submatrices (positioned along D̃k) has diagonal elements which comprise one cluster of singular values of A (see Refs. [34, 35]). Thus, after a particular number of critical sweeps kcr, we obtain

Skcr = diag(T1, T2, T3, ..., Tnc),   (4.28)

so that the SVD of each Ti, i = 1, 2, ..., nc, can be computed in parallel by either a Jacobi or a Kogbetliantz method. Each symmetric matrix Ti will, in general, be dense of order qi, representing the number of singular values of A contained in the ith cluster. Since the quadratic convergence of Kogbetliantz's method for upper-triangular matrices [37] mirrors the quadratic convergence of the two-sided Jacobi method, [2JAC], for symmetric matrices having clustered spectra [38], we would obtain faster global convergence for k > kcr if [2JAC], rather than [1JAC], were applied to each of the Ti. Thus, a hybrid method consisting of an initial phase of several [1JAC] iterations followed by [2JAC] on the resulting subproblems would combine the optimal parallelism of [1JAC] and the fast convergence of [2JAC]. Of course, the difficulty in implementing such a method lies in the determination of the critical number of sweeps kcr. We note that such a hybrid SVD method would be quite suitable for implementation on multiprocessors with hierarchical memory structure and/or vector processors.
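A sketch of the [QJAC] pipeline (ours), reusing the hypothetical one_sided_jacobi routine sketched in Section 4.2.2 and NumPy's Householder-based QR in place of the block LAPACK factorization:

    # [QJAC] sketch: QR factorization of a tall matrix, one-sided Jacobi on the
    # n x n triangular factor, then back-transformation of the left singular vectors.
    import numpy as np

    def qjac_svd(A):
        Q, R = np.linalg.qr(A)                # Step 1: A = QR (Householder-based here)
        Ur, sigma, V = one_sided_jacobi(R)    # Step 2: SVD of R via the [1JAC] sketch
        return Q @ Ur, sigma, V               # Step 3: U = Q * U_tilde

    A = np.random.default_rng(3).standard_normal((500, 8))
    U, sigma, V = qjac_svd(A)
    print(np.allclose((U * sigma) @ V.T, A))  # True up to rounding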

4.2.4 BLOCK-JACOBI ALGORITHMS

The above algorithms are well suited for shared-memory computers. Although they can also be implemented on distributed-memory systems, their efficiency on such systems may suffer from communication costs. To increase the granularity of the computation (i.e., to increase the number of floating point operations between two communications), block algorithms are considered.

For one-sided algorithms, each processor is allocated a block of columns instead of a single column. The computation remains the same as discussed above, with the ordering of the rotations within a sweep as given in Ref. [26].

For the two-sided version, the allocation manipulates two-dimensional blocks instead of single entries of the matrix. Some authors [39–41] propose a modification of the basic algorithm in which one annihilates, at each step, two off-diagonal blocks by performing a full SVD on a small matrix. Good efficiencies are realized on distributed-memory systems, but this block strategy increases the number of sweeps needed to reach convergence.


We note that parallel Jacobi algorithms can surpass the speed of the bidiagonalization schemes of ScaLAPACK only when the number of available processors is much larger than the size of the matrix under consideration.

4.3 METHODS FOR LARGE AND SPARSE MATRICES

4.3.1 SPARSE STORAGES AND LINEAR SYSTEMS

When the matrix is large and sparse, a compact storage scheme must be considered. The principle behind such storage schemes is to store only the nonzero entries, and sometimes more data, as in band or profile storage schemes. For a more detailed presentation of various compact storage schemes, we refer, for instance, to Refs. [42, 43]. Here, we consider only the Compressed Sparse Row (CSR) storage scheme.

Let us consider a sparse matrix A of order n with nz (denoted nz in the algorithms) nonzero entries. The CSR format is organized into three one-dimensional arrays:

array a(1:nz): contains all the nonzero entries of the matrix, sorted by rows; within a row no special ordering is assumed, although it is often preferable to sort the entries by increasing column indices.

array ja(1:nz): contains all the column indices of the nonzero entries, in the same order as the order of the entries in array a.

array ia(1:n+1): ia(i) (i = 1, ..., n) is the index of the first nonzero entry of the ith row stored in array a, and ia(n+1) is set to nz + 1.

The main procedures which use a matrix stored in this way are the multiplications of the matrix, A or A^T, by a vector x ∈ R^n. The corresponding algorithms are:

ALGORITHM : y := y + A*x
for i = 1:n,
    for l = ia(i) : ia(i+1)-1,
        y(i) = y(i) + a(l)*x(ja(l)) ;
    end ;
end ;

ALGORITHM : y := y + A^T*x
for i = 1:n,
    for l = ia(i) : ia(i+1)-1,
        y(ja(l)) = y(ja(l)) + a(l)*x(i) ;
    end ;
end ;
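For illustration (ours, not part of the text), the same storage scheme and the two products above can be checked against SciPy, whose CSR arrays data, indices, and indptr play the roles of a, ja, and ia with 0-based indexing:

    # CSR arrays and the two matrix-vector products above, written in Python
    # and verified against SciPy (0-based indices instead of the 1-based ones above).
    import numpy as np
    from scipy.sparse import csr_matrix

    A = csr_matrix(np.array([[4.0, 0.0, 1.0],
                             [0.0, 3.0, 0.0],
                             [2.0, 0.0, 5.0]]))
    a, ja, ia = A.data, A.indices, A.indptr        # a(1:nz), ja(1:nz), ia(1:n+1)

    x = np.array([1.0, 2.0, 3.0])
    y = np.zeros(3)                                # y := y + A x
    yt = np.zeros(3)                               # y := y + A^T x
    for i in range(3):
        for l in range(ia[i], ia[i + 1]):
            y[i] += a[l] * x[ja[l]]
            yt[ja[l]] += a[l] * x[i]

    print(np.allclose(y, A @ x), np.allclose(yt, A.T @ x))   # True True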

Solving linear systems defined by sparse matrices is not an easy task. One may consider direct methods, invariably consisting of matrix factorization, or iterative schemes. In direct solvers, care is needed to minimize the fill-in in the triangular factors, whereas in iterative methods, adopting effective preconditioning techniques is vital for fast convergence.

It is usually admitted that direct methods are more robust, but they are economical only when the triangular factors are not too dense and when the size of the linear system is not too large. Reordering schemes are almost always necessary to keep the level of fill-in as low as possible. Also, while pivoting strategies for dense linear systems are relaxed to minimize fill-in, the most effective sparse factorization schemes forbid the use of pivots below a given threshold. It is well known that the QR-factorization scheme, one of the most robust, is not used often in direct solvers, as it suffers from a high level of fill-in. Such a high level of fill-in occurs because the upper-triangular factor, R, is the transpose of the Cholesky factor of the matrix A^T A, which is much denser than the original matrix A. Nevertheless, we shall see that this orthogonal factorization is a viable tool for computing the smallest singular value of sparse matrices. A survey of the state of the art of sparse


matrix factorization may be found in Refs. [42, 44–51]. Current efficient packages for LU-factorization include UMFPACK [52], SuperLU [53], and MUMPS [54].

When the order of the matrix is so large as to make the use of direct methods prohibitively expensive in storage and time, one resorts to iterative solvers. Classic iterative solvers, such as the relaxation schemes of Jacobi, Gauss–Seidel, SOR, or SSOR, are easy to use but not as effective as Krylov subspace methods. The latter class uses the matrix only through the procedures of matrix–vector multiplication as defined above. Moreover, under certain conditions, Krylov subspace schemes exhibit superlinear convergence. Often, however, Krylov subspace schemes are successful only in conjunction with a preconditioning strategy. This is especially true for nonsymmetric ill-conditioned linear systems. Such preconditioners may be one of the above relaxation methods, approximate factorizations, or approximate inverses of the matrix of coefficients. A state-of-the-art survey of iterative solvers may be found in Refs. [43, 48, 55–58] and the references therein. When the matrix is symmetric positive definite, a preconditioned conjugate gradient scheme (PCG) may be an optimal choice as an iterative solver [2]. For symmetric indefinite systems, methods like SYMMLQ and MINRES [59] are adequate, but surprisingly PCG is often used with great success even though it may fail in theory. For nonsymmetric systems, the situation is less clear because available iterative schemes cannot combine minimizing the residual, at any given step, for a given norm within the Krylov subspace with keeping it orthogonal to the same subspace for some scalar product. Therefore, two classes of methods arise. The most popular methods include GMRES [60], Bi-CGSTAB [61], QMR [62], and TFQMR [63].

Before presenting methods for computing the sparse SVD, we note that the classical methods for determining the SVD of dense matrices, the Golub–Kahan–Reinsch method [64, 65] and Jacobi-like SVD methods [34, 66], are not viable for large sparse matrices. Because these methods apply orthogonal transformations (Householder or Givens) directly to the sparse matrix A, they incur excessive fill-in and thereby require tremendous amounts of storage. Another drawback of these methods is that they compute all the singular triplets of A, and hence may be computationally wasteful when only a few of the smallest, or largest, singular triplets are desired. We demonstrate instead how canonical sparse symmetric eigenvalue problems can be used to (indirectly) compute the sparse SVD. Since the computation of the smallest singular value is equivalent to computing an eigenvalue of a symmetric matrix, which is the augmented matrix or the matrix of the normal equations, in what follows we present methods that are specifically designed for such problems.
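As a minimal illustration of the preconditioned Krylov approach mentioned above (ours; the test matrix, drop tolerance, and solver choice are arbitrary), SciPy's GMRES can be combined with an incomplete LU preconditioner as follows:

    # GMRES with an incomplete-LU preconditioner on a sparse nonsymmetric system.
    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n = 1000
    A = sp.diags([-1.0, 2.5, -1.2], [-1, 0, 1], shape=(n, n), format="csc")
    b = np.ones(n)

    ilu = spla.spilu(A, drop_tol=1e-4)                    # approximate factorization of A
    M = spla.LinearOperator((n, n), matvec=ilu.solve)     # preconditioner M ~= A^{-1}
    x, info = spla.gmres(A, b, M=M)
    print(info, np.linalg.norm(A @ x - b))                # info == 0 signals convergence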

4.3.2 SUBSPACE ITERATION [SISVD]

Subspace iteration is perhaps one of the simplest algorithms used to solve large sparse eigenvalue problems. As discussed in Ref. [67], subspace iteration may be viewed as a block generalization of the classical power method. The basic version of subspace iteration was first introduced by Bauer [68] and, if adapted to the matrix

B̂ = γ² In − A^T A,   (4.29)

would involve forming the sequence

Zk = B̂^k Z0,

where γ² is chosen so that B̂ is (symmetric) positive definite and Z0 = [z1, z2, ..., zs] is an n × s matrix. If the column vectors zi are normalized separately (as done in the power method), then these vectors will converge to the dominant eigenvectors of B̂, which are also the right singular vectors corresponding to the smallest singular values of A. Thus, the columns of the matrix Zk will progressively lose linear independence. To approximate the p largest eigenpairs of B̂, Bauer


demonstrated that linear independence among the zi's could be maintained via reorthogonalization at each step, by a modified Gram–Schmidt procedure, for example. However, the convergence rate of the zi's to eigenvectors of B̂ will only be linear.

The most sophisticated implementation of subspace iteration is that of Rutishauser's RITZIT (see Ref. [69]). This particular algorithm incorporates both a Rayleigh–Ritz procedure and acceleration via Chebyshev polynomials. The iteration which embodies the RITZIT program is given in Table 4.2. The Rayleigh quotient matrix, Hk, in step (3) is essentially the projection of B̂² onto the span of Zk−1. The three-term recurrence in step (6) follows from the adaptation of the Chebyshev polynomial of degree q, say Tq(x), to the interval [−e, e], where e is chosen to be the smallest eigenvalue of Hk. This use of Chebyshev polynomials has the desired effects of damping unwanted eigenvalues of B̂ and producing an improved rate of convergence which is considerably higher than the original rate of convergence, governed by θs/θ1 (θ1 ≥ θ2 ≥ ··· ≥ θs), where the θi's are given by the square roots of the eigenvalues of Hk, i.e., of the diagonal matrix ∆k² in step (4) of Table 4.2. We note that one could alternatively compute the eigenpairs of the positive definite 2-cyclic matrix

Ãaug = [ γI  A ; A^T  γI ],   (4.30)

where γ is an estimate of the largest singular value of A. The smallest singular values of A, in this case, will lie in the center of the spectrum of Ãaug (see Section 4.1.1), and thus prohibit suppression of the unwanted (largest) singular values by the use of Chebyshev polynomials defined on the symmetric interval [−e, e]. Thus, it is preferable to approximate eigenpairs of B̂ = γ² In − A^T A instead. In practice, γ can be chosen as either ‖A‖1 or ‖A‖∞, depending upon the sparse data structure used to store the nonzero elements (row- or columnwise). Once the eigenvectors of B̂ (right singular vectors of A) have been determined, one can recover the left singular vectors (ui) of A via ui = (1/σi) A vi (see Section 4.1.1).

The orthogonal factorization in step (2) of Table 4.2 may be computed by a modified Gram–Schmidt procedure or by Householder transformations, provided that the orthogonal matrix Qk is explicitly available for the computation of Zk in step (5). On multiprocessor architectures, especially those having hierarchical memories, one may achieve high performance by using either a block Gram–Schmidt [70] or the block Householder orthogonalization method in step (2). To improve upon the two-sided Jacobi algorithm, originally suggested by Rutishauser [71] for the spectral decomposition in step (4) of RITZIT, one may employ a parallel two- or one-sided Jacobi method on a multiprocessor. In fact, the one-sided Jacobi scheme, when appropriately adapted for symmetric positive definite matrices (see Ref. [34]), is quite efficient for step (4) provided the dimension

TABLE 4.2
Subspace Iteration as Implemented in Rutishauser's ritzit [SISVD]

(1) Compute Ck = B̂ Zk−1
(2) Factor Ck = Qk Rk
(3) Form Hk = Rk Rk^T
(4) Factor Hk = Pk ∆k² Pk^T
(5) Form Zk = Qk Pk
(6) Iterate Zk+j = (2/e) B̂ Zk+j−1 − Zk+j−2  (j = 2, ..., q)


of the current subspace, s, is not too large. For larger subspaces, an optimized implementation of the classical EISPACK [72] pair, TRED2 and TQL2, or Cuppen's algorithm as parallelized by Dongarra and Sorensen [73], may be used in step (4).

The success of Rutishauser's subspace iteration method using Chebyshev acceleration relies upon the following strategy for delimiting the degree of the Chebyshev polynomial, Tq(x/e), on the interval [−e, e], where e = θs (assuming s vectors carried and k = 1 initially), ξ1 = 0.04, and ξ2 = 4:

qnew = min{2 qold, q̂},  where

q̂ = 1,  if θ1 < ξ1 θs,
q̂ = 2 × max[ ξ2 / arccosh(θ1/θs), 1 ],  otherwise.   (4.31)

The polynomial degree of the current iteration is then taken to be q = qnew. It can easily be shown that the strategy in (4.31) ensures that

|Tq(θ1/θs)| = cosh( q arccosh(θ1/θs) ) ≤ cosh(8) < 1500.

Although this bound has been quite successful for RITZIT, we can easily generate several variations of polynomial-accelerated subspace iteration schemes (SISVD) using a more flexible bound. Specifically, we consider an adaptive strategy for selecting the degree q in which ξ1 and ξ2 are treated as control parameters for determining the frequency and the degree of polynomial acceleration, respectively. In other words, large (small) values of ξ1 inhibit (invoke) polynomial acceleration, and large (small) values of ξ2 yield larger (smaller) polynomial degrees when acceleration is selected. Correspondingly, the number of matrix–vector multiplications will increase with ξ2, and the total number of iterations may well increase with ξ1. Controlling the parameters ξ1 and ξ2 allows us to monitor the method's complexity so as to maintain an optimal balance between the dominating kernels (e.g., sparse matrix–vector multiplication, orthogonalization, and spectral decomposition). We will demonstrate these controls in the polynomial acceleration-based trace minimization SVD method discussed in Section 4.3.4.
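A bare-bones sketch (ours) of steps (1)–(5) of Table 4.2 applied to B̂ = γ²In − A^T A, with the Chebyshev acceleration of step (6) and all adaptive controls omitted; the tight choice of γ² used here is an assumption made so that the unaccelerated iteration converges in a reasonable number of steps.

    # Simplified SISVD sketch: Rayleigh-Ritz subspace iteration on
    # B_hat = gamma^2 I - A^T A, whose dominant eigenpairs correspond to the
    # smallest singular triplets of A.  Chebyshev acceleration is omitted.
    import numpy as np

    def sisvd_smallest(A, s=4, iters=300, seed=0):
        n = A.shape[1]
        gamma2 = 1.01 * np.linalg.norm(A, 2) ** 2   # tight estimate for this sketch;
                                                    # the text uses ||A||_1 or ||A||_inf
        Z = np.linalg.qr(np.random.default_rng(seed).standard_normal((n, s)))[0]
        for _ in range(iters):
            C = gamma2 * Z - A.T @ (A @ Z)          # (1) C_k = B_hat Z_{k-1}
            Q, R = np.linalg.qr(C)                  # (2) C_k = Q_k R_k
            H = R @ R.T                             # (3) H_k = R_k R_k^T
            d2, P = np.linalg.eigh(H)               # (4) H_k = P_k Delta_k^2 P_k^T
            Z = Q @ P[:, ::-1]                      # (5) Z_k = Q_k P_k, dominant first
        theta = np.sqrt(d2[::-1])                   # Ritz values of B_hat, descending
        sigma = np.sqrt(np.maximum(gamma2 - theta, 0.0))   # smallest sigma, ascending
        return sigma, Z                             # Z: approximate right singular vectors

    rng = np.random.default_rng(5)
    U, _ = np.linalg.qr(rng.standard_normal((60, 30)))
    V, _ = np.linalg.qr(rng.standard_normal((30, 30)))
    A = U @ np.diag(np.linspace(10.0, 1.0, 30)) @ V.T       # known singular values
    print(sisvd_smallest(A)[0])                             # ~ [1.000, 1.310, 1.621, 1.931]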

4.3.3 LANCZOS METHODS

4.3.3.1 The Single-Vector Lanczos Method [LASVD]

Other popular methods for solving large, sparse, symmetric eigenproblems originated from a method attributed to Lanczos (1950). This method generates a sequence of tridiagonal matrices Tj with the property that the extremal eigenvalues of the j × j matrix Tj are progressively better estimates of the extremal eigenvalues of the original matrix. Let us consider the (m + n) × (m + n) 2-cyclic matrix Aaug given in (4.3), where A is the m × n matrix whose singular triplets are sought. Also, let w1 be a randomly generated starting vector such that ‖w1‖2 = 1. For j = 1, 2, ..., l define the corresponding Lanczos matrices Tj using the following recursion [74]: define β1 ≡ 0 and w0 ≡ 0; then for i = 1, 2, ..., l define the Lanczos vectors wi and the scalars αi and βi+1 by

βi+1 wi+1 = Aaug wi − αi wi − βi wi−1,  with
αi = wi^T (Aaug wi − βi wi−1),
|βi+1| = ‖Aaug wi − αi wi − βi wi−1‖2.   (4.32)


For each j, the corresponding Lanczos matrix Tj is defined as a real symmetric tridiagonal matrix having diagonal entries αi (1 ≤ i ≤ j) and subdiagonal (superdiagonal) entries βi+1 (1 ≤ i ≤ j − 1), i.e.,

Tj ≡ [ α1  β2
       β2  α2  β3
           β3  ·   ·
               ·   ·   βj
                   βj  αj ].   (4.33)

By definition, the vectors αi wi and βi wi−1 in (4.32) are, respectively, the orthogonal projections of Aaug wi onto the most recent wi and wi−1. Hence, for each i, the next Lanczos vector wi+1 is obtained by orthogonalizing Aaug wi with respect to wi and wi−1. The resulting αi, βi+1 obtained in these orthogonalizations define the corresponding Lanczos matrices. If we rewrite (4.32) in matrix form, then for each j we have

Aaug Wj = Wj Tj + βj+1 wj+1 ej^T,   (4.34)

where Wj ≡ [w1, w2, ..., wj] is an (m + n) × j matrix whose kth column is the kth Lanczos vector, and ej is the jth coordinate vector of length j. Thus, the Lanczos recursion (4.34) generates a family of real symmetric tridiagonal matrices related to both Aaug and w1. Table 4.3 outlines the basic Lanczos procedure for computing the eigenvalues and eigenvectors of the symmetric 2-cyclic matrix Aaug.

As in subspace iteration, the matrix Aaug is only referenced through matrix–vector multiplication. At each iteration, the basic Lanczos recursion requires only the two most recently generated vectors, although for finite-precision arithmetic the modifications suggested by Grcar [75], Parlett and Scott [76], and Simon [77] require additional Lanczos vectors to be readily accessible via secondary storage. We note that on a multiprocessor architecture, step (2) in Table 4.3 may benefit from any available optimized library routine that solves the symmetric tridiagonal eigenvalue problem (e.g., multisectioning in Ref. [78] or divide and conquer in Ref. [73]).

In using finite-precision arithmetic, any practical Lanczos procedure must address problems created by losses in the orthogonality of the Lanczos vectors, wi. Such problems include the occurrence of numerically multiple eigenvalues of Tj (for large j) for simple eigenvalues of Aaug, and the appearance of spurious eigenvalues among the computed eigenvalues for some Tj. Approaches to

TABLE 4.3
Single-Vector Lanczos Recursion [LASVD]

(1) Use any variant of the Lanczos recursion (4.32) to generate a family of real symmetric tridiagonal matrices Tj (j = 1, 2, ..., q)
(2) For some k ≤ q, compute the relevant eigenvalues of Tk
(3) Select some or all of these eigenvalues as approximations to the eigenvalues of the matrix Aaug, and hence to the singular values of A
(4) For each selected eigenvalue λ, compute a corresponding unit eigenvector z such that Tk z = λz. Map such vectors onto the corresponding Ritz vectors y ≡ Wk z, which are then used as approximations to the desired eigenvectors of the matrix Aaug or the singular vectors of A


deal with these problems range between two different extremes. The first involves total reorthogonalization of every Lanczos vector with respect to every previously generated vector [79]. The other approach accepts the loss of orthogonality and deals with the resulting problems directly. Total reorthogonalization is certainly one way of maintaining orthogonality; however, it requires additional storage and additional arithmetic operations. As a result, the number of eigenvalues which can be computed is limited by the amount of available secondary storage. On the other hand, a Lanczos procedure with no reorthogonalization needs only the two most recently generated Lanczos vectors at each stage, and hence has minimal storage requirements. Such a procedure requires, however, the tracking [5] of the spurious eigenvalues of Aaug (singular values of A) associated with the loss of orthogonality in the Lanczos vectors, wi.

We employ a version of a single-vector Lanczos algorithm (4.32) equipped with a selective reorthogonalization strategy, LANSO, designed by Parlett and Scott [76] and Simon [77]. This particular method (LASVD) is primarily designed for the standard and generalized symmetric eigenvalue problem. We simply apply it to either B = A^T A or the 2-cyclic matrix Aaug defined in (4.3).
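The sketch below (ours) runs the single-vector recursion (4.32) on Aaug with full reorthogonalization in place of the selective scheme of LANSO, and returns the Ritz values of Tj; the extremal ones approximate ±σ of A.

    # Single-vector Lanczos on A_aug = [0 A; A^T 0], with full reorthogonalization
    # for simplicity (LANSO uses selective reorthogonalization instead).
    import numpy as np

    def lanczos_aug_ritz(A, steps=60, seed=0):
        m, n = A.shape
        def matvec(w):                              # A_aug * w without forming A_aug
            return np.concatenate((A @ w[m:], A.T @ w[:m]))
        w = np.random.default_rng(seed).standard_normal(m + n)
        W = [w / np.linalg.norm(w)]
        alphas, betas = [], []
        for j in range(steps):
            v = matvec(W[-1])
            alpha = W[-1] @ v
            v -= alpha * W[-1]
            if j > 0:
                v -= betas[-1] * W[-2]
            for u in W:                             # full reorthogonalization
                v -= (u @ v) * u
            beta = np.linalg.norm(v)
            alphas.append(alpha)
            betas.append(beta)
            if beta < 1e-14:
                break
            W.append(v / beta)
        T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
        return np.linalg.eigvalsh(T)                # Ritz values approximate +/- sigma(A)

    A = np.random.default_rng(6).standard_normal((120, 80))
    ritz = lanczos_aug_ritz(A)
    print(ritz[-1], np.linalg.svd(A, compute_uv=False)[0])   # largest Ritz value vs sigma_max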

4.3.3.2 The Block Lanczos Method [BLSVD]
Here, we consider a block analog of the single-vector Lanczos method. Exploiting the structure of the matrix A_aug in (4.3), we can obtain an alternative form for the Lanczos recursion (4.32). If we apply the Lanczos recursion specified by (4.32) to A_aug with a starting vector ũ = (u^T, 0)^T such that ||ũ||_2 = 1, then the diagonal entries of the real symmetric tridiagonal Lanczos matrices generated are all identically zero. The Lanczos recursion in (4.32) reduces to the following: define u_1 ≡ u, v_0 ≡ 0, and β_1 ≡ 0, then for i = 1, 2, ..., k

    β_{2i} v_i = A^T u_i − β_{2i−1} v_{i−1},
    β_{2i+1} u_{i+1} = A v_i − β_{2i} u_i.                              (4.35)

The Lanczos recursion (4.35), however, can only compute the distinct singular values of an m × n matrix A and not their multiplicities. Following the block Lanczos recursion for the sparse symmetric eigenvalue problem [80, 81], (4.35) can be represented in matrix form as

    A^T Û_k = V̂_k J_k^T + Z_k,
    A V̂_k = Û_k J_k + Z̃_k,                                             (4.36)

where Û_k = [u_1, ..., u_k], V̂_k = [v_1, ..., v_k], J_k is a k × k bidiagonal matrix with J_k[j, j] = β_{2j} and J_k[j, j+1] = β_{2j+1}, and Z_k, Z̃_k contain remainder terms. It is easy to show that the nonzero singular values of J_k are the same as the positive eigenvalues of

           (  O      J_k )
    K_k ≡  ( J_k^T    O  ).                                             (4.37)

For the block analog of (4.36), we make the simple substitutions

    u_i ↔ U_i,    v_i ↔ V_i,

where U_i is m × b, V_i is n × b, and b is the current block size. The matrix J_k is now an upper block bidiagonal matrix of order bk,

           ( S_1  R_1^T                    )
           (      S_2   R_2^T              )
    J_k ≡  (             ·     ·           ),                           (4.38)
           (                   ·   R_{k−1}^T )
           (                        S_k    )


where the S_i's and R_i's are b × b upper-triangular matrices. If the U_i's and V_i's form mutually orthogonal sets of bk vectors, so that Û_k and V̂_k are orthonormal matrices, then the singular values of the matrix J_k will be identical to those of the original m × n matrix A. Given the upper block bidiagonal matrix J_k, we approximate the singular triplets of A by first computing the singular triplets of J_k. To determine the left and right singular vectors of A from those of J_k, we must retain the Lanczos vectors of Û_k and V̂_k. Specifically, if {σ_i^{(k)}, y_i^{(k)}, z_i^{(k)}} is the ith singular triplet of J_k, then the approximation to the ith singular triplet of A is given by {σ_i^{(k)}, Û_k y_i^{(k)}, V̂_k z_i^{(k)}}, where Û_k y_i^{(k)} and V̂_k z_i^{(k)} are the left and right approximate singular vectors, respectively.
The computation of singular triplets for J_k requires two phases. The first phase reduces J_k to a bidiagonal matrix C_k having diagonal elements {α_1, α_2, ..., α_{bk}} and superdiagonal elements {β_1, β_2, ..., β_{bk−1}} via a finite sequence of orthogonal transformations (thus preserving the singular values of J_k). The second phase reduces C_k to diagonal form by a modified QR-algorithm. This diagonalization procedure is discussed in detail in Ref. [65]. The resulting diagonalized C_k will yield the approximate singular values of A, whereas the corresponding left and right singular vectors are determined through multiplications by all the left and right transformations used in both phases of the SVD of J_k.
There are a few options for the reduction of J_k to the bidiagonal matrix C_k. Golub et al. [82] advocated the use of either band Householder or band Givens methods which in effect chase off (or zero) elements on the diagonals above the first superdiagonal of J_k. In either reduction (bidiagonalization or diagonalization), the computations are primarily sequential and offer limited data locality or parallelism for possible exploitation on a multiprocessor. For this reason, we adopt the single-vector Lanczos bidiagonalization recursion defined by (4.35) and (4.36) as our strategy for reducing the upper block bidiagonal matrix J_k to the bidiagonal form C_k, i.e.,

    J_k^T Q̂ = P̂ C_k^T,
    J_k P̂ = Q̂ C_k,                                                     (4.39)

or

    J_k p_j = α_j q_j + β_{j−1} q_{j−1},
    J_k^T q_j = α_j p_j + β_j p_{j+1},                                  (4.40)

where P̂ ≡ [p_1, p_2, ..., p_{bk}] and Q̂ ≡ [q_1, q_2, ..., q_{bk}] are orthonormal matrices of order bk × bk. The recursions in (4.40) require matrix–vector multiplications which can be easily exploited by optimized level-2 BLAS routines [83] now resident in optimized mathematical libraries on most high-performance computers. For orthogonalization of the outermost Lanczos vectors, {U_i} and

TABLE 4.4 Hybrid Lanczos Outer Iteration [BLSVD]

(1) [Formation of the symmetric block tridiagonal matrix H_k]
    Choose V_1 (n × b and orthonormal) and c = max{bk}
    Compute S_1 = V_1^T A^T A V_1   (V_0 = R_0 = 0 initially)
    For i = 2, 3, ..., k do: (k = c/b)
(2)     Compute Y_{i−1} = A^T A V_{i−1} − V_{i−1} S_{i−1} − V_{i−2} R_{i−2}^T
(3)     Orthogonalize Y_{i−1} against {V_l}_{l=0}^{i−1}
(4)     Factor Y_{i−1} = V_i R_{i−1}
(5)     Compute S_i = V_i^T A^T A V_i


{V_i}, as well as the innermost Lanczos vectors, {p_i} and {q_i}, we have chosen to apply a complete or total reorthogonalization [79] strategy to ensure robustness in our triplet approximations for the matrix A. This hybrid Lanczos approach, which incorporates inner iterations of single-vector Lanczos bidiagonalization within the outer iterations of a block Lanczos SVD recursion, is also discussed in Ref. [1].
As an alternative to the outer recursion defined by (4.36), which is derived from the equivalent 2-cyclic matrix A_aug, Table 4.4 depicts the simplified outer block Lanczos recursion for approximating the eigensystem of A^T A. Combining the equations in (4.36), we obtain

    A^T A V̂_k = V̂_k H_k,

where H_k = J_k^T J_k is the k × k symmetric block tridiagonal matrix

           ( S_1   R_1^T                        )
           ( R_1   S_2   R_2^T                  )
    H_k ≡  (       R_2    ·      ·              ),                      (4.41)
           (              ·      ·    R_{k−1}^T )
           (                   R_{k−1}   S_k    )

having block size b. We then apply the block Lanczos recursion [79] in Table 4.4 for computing the eigenpairs of the n × n symmetric positive definite matrix A^T A. The tridiagonalization of H_k via an inner Lanczos recursion follows from simple modifications of (4.34). Analogous to the reduction of J_k in (4.38), the computation of eigenpairs of the resulting tridiagonal matrix can be performed via a Jacobi or QR-based symmetric eigensolver.
As with the previous iterative SVD methods, we access the sparse matrices A and A^T for this hybrid Lanczos method only through sparse matrix–vector multiplications. Some efficiency, however, is gained in the outer (block) Lanczos iterations by the multiplication of b vectors rather than by a single vector. These dense vectors may be stored in a fast local memory (cache) of a hierarchical memory-based architecture, and thus yield more effective data reuse. A stable variant of Gram–Schmidt orthogonalization [69], which requires efficient dense matrix–vector multiplication (level-2 BLAS) routines [83] or efficient dense matrix–matrix multiplication (level-3 BLAS) routines [84], is used to produce the orthogonal projections of Y_i (i.e., R_{i−1}) and W_i (i.e., S_i) onto Ṽ^⊥ and Ũ^⊥, respectively, where U_0 and V_0 contain converged left and right singular vectors, respectively, and

    Ṽ = (V_0, V_1, ..., V_{i−1})   and   Ũ = (U_0, U_1, ..., U_{i−1}).
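To make the outer recursion of Table 4.4 concrete, the following sketch builds the blocks S_i and R_i of H_k for A^T A, using dense NumPy orthogonalization in place of the stable Gram–Schmidt variant cited above; the QR-based factorization in step (4), the random starting block, and all names are illustrative assumptions rather than the authors' implementation.

# Hedged sketch of the outer block Lanczos recursion of Table 4.4 for A^T A.
import numpy as np

def outer_block_lanczos(A, b, k, rng=np.random.default_rng(0)):
    m, n = A.shape
    V = [np.linalg.qr(rng.standard_normal((n, b)))[0]]    # V_1, orthonormal n x b
    S, R = [], []
    S.append(V[0].T @ (A.T @ (A @ V[0])))                 # S_1 = V_1^T A^T A V_1
    for i in range(1, k):
        Y = A.T @ (A @ V[i-1]) - V[i-1] @ S[i-1]          # step (2)
        if i >= 2:
            Y -= V[i-2] @ R[i-2].T
        basis = np.hstack(V)                              # step (3): reorthogonalize
        Y -= basis @ (basis.T @ Y)
        Vi, Ri = np.linalg.qr(Y)                          # step (4): Y_{i-1} = V_i R_{i-1}
        V.append(Vi); R.append(Ri)
        S.append(Vi.T @ (A.T @ (A @ Vi)))                 # step (5)
    return V, S, R                                        # blocks of H_k in (4.41)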

4.3.4 THE TRACE MINIMIZATION METHOD [TRSVD]
Another candidate subspace method for the SVD of sparse matrices is based upon the trace minimization algorithm discussed in Ref. [85] for the generalized eigenvalue problem

Hx = λGx, (4.42)

where H and G are symmetric and G is also positive definite. To compute the SVD of an m × n matrix A, we initially replace H with Ã_aug as defined in (4.30) or set H = A^T A. Since we need only consider equivalent standard symmetric eigenvalue problems, we simply define G = I_{m+n} (or I_n if H = A^T A). Without loss of generality, let us assume that H = A^T A, G = I_n, and consider


the associated symmetric eigensystem of order n. If 𝒴 is defined as the set of all n × p matrices Y for which Y^T Y = I_p, then using the Courant–Fischer theorem (see Ref. [86]) we obtain

    min_{Y ∈ 𝒴} trace(Y^T H Y) = Σ_{i=1}^{p} σ̃_{n−i+1},                (4.43)

where √σ̃_i is a singular value of A, λ_i = σ̃_i is an eigenvalue of H, and σ̃_1 ≥ σ̃_2 ≥ ··· ≥ σ̃_n. In other words, given an n × p matrix Y which forms a section of the eigenvalue problem

Hz = λz, (4.44)

i.e.,

    Y^T H Y = Σ̃,    Y^T Y = I_p,                                        (4.45)

where Σ̃ = diag(σ̃_n, σ̃_{n−1}, ..., σ̃_{n−p+1}), our trace minimization scheme [TRSVD] (Ref. [85], see also [87]) finds a sequence of iterates Y_{k+1} = F(Y_k), where both Y_k and Y_{k+1} form a section of (4.44) and have the property trace(Y_{k+1}^T H Y_{k+1}) < trace(Y_k^T H Y_k). From (4.43), the matrix Y in (4.45) which minimizes trace(Y^T H Y) is the matrix of H-eigenvectors associated with the p smallest eigenvalues of the problem (4.44). As discussed in Ref. [85], F(Y) can be chosen so that global convergence is assured. Moreover, (4.45) can be regarded as the quadratic minimization problem

    minimize trace(Y^T H Y)                                              (4.46)

subject to the constraints

    Y^T Y = I_p.                                                         (4.47)

Using Lagrange multipliers, this quadratic minimization problem leads to solving the (n + p) × (n + p) system of linear equations

    ( H      Y_k ) ( Δ_k )   ( H Y_k )
    ( Y_k^T   0  ) (  L  ) = (   0   ),                                  (4.48)

so that Y_{k+1} ≡ Y_k − Δ_k will be an optimal subspace iterate. Since the matrix H is positive definite, one can alternatively consider the p independent (parallel) subproblems

    minimize (y_j^{(k)} − d_j^{(k)})^T H (y_j^{(k)} − d_j^{(k)})         (4.49)

subject to the constraints

    Y^T d_j^{(k)} = 0,    j = 1, 2, ..., p,

where d_j^{(k)} = Δ_k e_j, e_j is the jth column of the identity, and Y_k = [y_1^{(k)}, y_2^{(k)}, ..., y_p^{(k)}]. The corrections Δ_k in this case are selected to be orthogonal to the previous estimates Y_k, i.e., so that (see Ref. [88])

    Δ_k^T Y_k = 0.


We then recast (4.48) as

    ( H      Y_k ) ( d_j^{(k)} )   ( H y_j^{(k)} )
    ( Y_k^T   0  ) (    l      ) = (      0      ),    j = 1, 2, ..., p,   (4.50)

where l is a vector of order p reflecting the Lagrange multipliers. The solution of the p systems of linear equations in (4.50) can be done in parallel by either a direct or an iterative solver. Since the original matrix A is assumed to be large, sparse, and without any particular sparsity structure (pattern of nonzeros), we have chosen an iterative method (conjugate gradient) for the systems in (4.50). As discussed in Refs. [1, 85], a major reason for using the conjugate gradient (CG) method for the solution of (4.50) stems from the ability to terminate the CG iterations early, i.e., without obtaining corrections d_j^{(k)} that are more accurate than warranted. In later stages, however, as Y_k converges to the desired set of eigenvectors of H, one needs full accuracy in computing the correction matrix Δ_k.
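One common way to realize this in practice, sketched below under illustrative assumptions, is to solve each correction equation with CG restricted to the orthogonal complement of span(Y_k), i.e., to apply CG to the projected operator P H P with P = I − Y_k Y_k^T; the loose iteration cap plays the role of the early termination discussed above. This is not the authors' implementation, and all names are hypothetical.

# Hedged sketch: inexact solution of the correction equations (4.50) by CG on
# the projected operator P H P, with P = I - Y Y^T, so that Y^T d_j = 0 holds.
import numpy as np
import scipy.sparse.linalg as spla

def tracemin_corrections(H_matvec, Y, maxiter=25):
    """H_matvec: callable x -> H x; Y: n x p orthonormal iterate Y_k."""
    n, p = Y.shape
    proj = lambda x: x - Y @ (Y.T @ x)          # projector onto the complement of span(Y)
    op = spla.LinearOperator((n, n), matvec=lambda x: proj(H_matvec(proj(x))))
    D = np.zeros((n, p))
    for j in range(p):
        rhs = proj(H_matvec(Y[:, j]))
        # a small maxiter gives the deliberately inexact corrections discussed above
        d, _ = spla.cg(op, rhs, maxiter=maxiter)
        D[:, j] = proj(d)
    return D                                    # next iterate: Y_{k+1} = Y_k - D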

4.3.4.1 Polynomial Acceleration Techniques for [TRSVD]
The Chebyshev acceleration strategy used within subspace iteration (see Section 4.3.2) can also be applied to [TRSVD]. However, to dampen unwanted (largest) singular values of A in this context, we must solve the generalized eigenvalue problem

    x = (1/P_q(λ)) P_q(H) x,                                             (4.51)

where P_q(H) = T_q(H) + εI_n, T_q is the Chebyshev polynomial of degree q, and ε is chosen so that P_q(H) is (symmetric) positive definite. The appropriate quadratic minimization problem, similar to (4.49), can here be expressed as

    minimize (y_j^{(k)} − d_j^{(k)})^T (y_j^{(k)} − d_j^{(k)})           (4.52)

subject to the constraints

    Y^T P_q(H) d_j^{(k)} = 0,    j = 1, 2, ..., p.

In effect, we approximate the smallest eigenvalues of H as the largest eigenvalues of the matrix P_q(H), whose gaps are considerably larger than those of the eigenvalues of H. Although the additional number of sparse matrix–vector multiplications associated with the multiplication by P_q(H) will be significant for high degrees q, the system of equations via Lagrange multipliers in (4.50) becomes much easier to solve, i.e.,

    ( I              P_q(H) Y_k ) ( d_j^{(k)} )   ( y_j^{(k)} )
    ( Y_k^T P_q(H)        0     ) (    l      ) = (     0     ),    j = 1, 2, ..., p.   (4.53)

It is easy to show that the updated eigenvector approximation, y_j^{(k+1)}, is determined by

    y_j^{(k+1)} = y_j^{(k)} − d_j^{(k)} = P_q(H) Y_k [Y_k^T P_q^2(H) Y_k]^{−1} Y_k^T P_q(H) y_j^{(k)}.

Thus, we may not need to use an iterative solver for determining Y_{k+1}, since the matrix Y_k^T P_q^2(H) Y_k is of relatively small order p. Using the orthogonal factorization

    P_q(H) Y_k = Q̂ R̂,

we have

    [Y_k^T P_q^2(H) Y_k]^{−1} = R̂^{−1} R̂^{−T},

where the polynomial degree, q, is determined by the strategy defined in Section 4.3.2.
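Since P_q(H) is only ever needed through its action on vectors, it can be applied with the standard Chebyshev three-term recurrence. The sketch below does this for a generic symmetric operator; the interval [a, b] to be damped and the affine mapping onto [−1, 1] are illustrative assumptions, not the specific choice made in Section 4.3.2.

# Hedged sketch: applying T_q to a symmetric operator by the three-term
# recurrence, so that P_q(H)v is formed without ever building P_q(H).
def chebyshev_apply(H_matvec, v, q, a, b):
    """Return T_q(L)v where L = (2H - (b+a)I)/(b-a) maps [a, b] onto [-1, 1]."""
    scale = 2.0 / (b - a)
    shift = (b + a) / (b - a)
    L = lambda x: scale * H_matvec(x) - shift * x
    t_prev, t_curr = v, L(v)                   # T_0(L)v and T_1(L)v
    for _ in range(2, q + 1):
        t_prev, t_curr = t_curr, 2.0 * L(t_curr) - t_prev
    return t_curr if q >= 1 else t_prev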


4.3.4.2 Shifting Strategy for [TRSVD]

As discussed in Ref. [85], we can also accelerate the convergence of the Yk’s to eigenvectors of H by incorporating Ritz shifts (see Ref. [67]) into TRSVD. Specifically, we modify the symmetric eigenvalue problem in (4.44) as follows,

    (H − ν_j^{(k)} I) z_j = (λ_j − ν_j^{(k)}) z_j,    j = 1, 2, ..., s,   (4.54)

where ν_j^{(k)} = σ̃_{n−j+1}^{(k)} is the jth approximate eigenvalue at the kth iteration of [TRSVD], with (λ_j, z_j) an exact eigenpair of H. In other words, we simply use our most recent approximations to the eigenvalues of H from our kth section within [TRSVD] as Ritz shifts. As was shown by Wilkinson [89], the Rayleigh quotient iteration associated with (4.54) will ultimately achieve cubic convergence to the square of an exact singular value of A, σ_{n−j+1}^2, provided ν_j^{(k)} is sufficiently close to σ_{n−j+1}^2. However, since ν_j^{(k+1)} < ν_j^{(k)} for all k, i.e., we approximate the eigenvalues of H from above, H − ν_j^{(k)} I will not be positive definite and thus we cannot guarantee the convergence of this shifted [TRSVD] method for any particular singular triplet of A. However, the strategy outlined in Ref. [85] has been quite successful in maintaining global convergence with shifting.
Table 4.5 outlines the basic steps of [TRSVD]. The scheme appropriately utilizes polynomial (Chebyshev) acceleration before the Ritz shifts. It is important to note that once shifting has been invoked (Step (4)) we abandon the use of the Chebyshev polynomials P_q(H) and solve shifted systems (H replaced by H − ν_j^{(k)} I) of the form (4.48) via (4.50) and the CG-algorithm. The context switch from either nonaccelerated (or polynomial-accelerated) trace minimization iterations to trace minimization iterations with Ritz shifting is accomplished by monitoring the reduction of the residuals in Step (4) for isolated eigenvalues (r_j^{(k)}) or clusters of eigenvalues (R_j^{(k)}).
For a 374 × 82 matrix with only 1343 nonzero elements, the use of Chebyshev acceleration and Ritz shifting within [TRSVD] for approximating the smallest singular value has cut down the number

TABLE 4.5 [TRSVD] Algorithm with Chebyshev Acceleration and Ritz Shifts

(Step 0) Set k = 0; choose an initial n × s subspace iterate Y_0 = [y_1^{(0)}, y_2^{(0)}, ..., y_s^{(0)}]
(Step 1) Form a section as in (4.45), or determine Y_k such that Y_k^T P_q(H) Y_k = I_p, Y_k^T Y_k = Σ̃
(Step 2) Compute residuals: r_j^{(k)} = H y_j^{(k)} − σ̃_{n−j+1}^{(k)} y_j^{(k)}, and assess accuracy
(Step 3) Analyze the current approximate spectrum (Gershgorin disks)
(Step 4) Invoke Ritz shifting strategy (see Ref. [85]):
    For isolated eigenvalues:
        if ||r_j^{(k)}||_2 ≤ η ||r_j^{(k_0)}||_2, where η ∈ [10^{−3}, 10^0] and k_0 < k for some j
    For a cluster of eigenvalues (size c):
        if ||R_j^{(k)}||_F ≤ η ||R_j^{(k_0)}||_F, where R_j^{(k)} ≡ {r_j^{(k)}, ..., r_{j+c}^{(k)}} and k_0 < k for some j
    (Disable polynomial acceleration if shifting is selected)
(Step 5) Deflation: reduce the subspace dimension, s, by the number of H-eigenpairs accepted
(Step 6) Adjust the polynomial degree q via (4.31) for P_q(H) in iteration k + 1 (if needed)
(Step 7) Update the subspace iterate Y_{k+1} ≡ Y_k − Δ_k via (4.48) or (4.50)
(Step 8) Set k = k + 1 and go to (Step 1)


of iterations needed by [TRSVD] with no acceleration by a factor of 3, using the same stopping criteria. In other words, the convergence rate of [TRSVD] with Chebyshev acceleration plus Ritz shifts (CA + RS) is three times higher than that of [TRSVD] with no acceleration.

4.3.5 REFINEMENT OF LEFT SINGULAR VECTORS
As discussed in Section 4.1.1, since the smallest singular values of the matrix A lie in the interior of the spectrum of either the 2-cyclic matrix A_aug (4.3) or the shifted matrix Ã_aug (4.30), all four candidate methods will have difficulties in approximating these singular values. The Lanczos-based methods, [LASVD] and [BLSVD], cannot be expected to effectively approximate the interior eigenvalues of either A_aug or Ã_aug. Similarly, subspace iteration [SISVD] would not be able to suppress the unwanted (largest) singular values if the Chebyshev polynomials are defined on symmetric intervals of the form [−e, e]. [TRSVD], however, via direct or iterative schemes for solving the systems in (4.50), could indeed obtain accurate approximations of the smallest singular triplets. Recently, some progress has been realized by considering an implicitly restarted Lanczos bidiagonalization [90] that avoids direct inversion.
To target the p smallest singular triplets of A, we compute the p smallest eigenvalues and eigenvectors of the matrices listed in Table 4.6 for LASVD, BLSVD, and TRSVD. With SISVD, we determine the p largest eigenpairs of γ²I_n − A^T A. As discussed in Section 4.1.1, we may determine the singular values, σ_i, of A and their corresponding right singular vectors, v_i, as eigenpairs of either A^T A or γ²I_n − A^T A, where A is m × n and γ = ||A||_{1,∞}. The corresponding left singular vector, u_i, must then be determined by

    u_i = (1/σ_i) A v_i,                                                 (4.55)

which may not be of sufficient accuracy (see Section 4.1.1) for a given precision in the residual (4.4). We point out, however, that alternative operators of the form (shift and invert)

    (A^T A − σ̃² I)^{−1},

where σ̃ is a good approximation to an exact singular value of A, can also be used to determine the p smallest singular values of A (see Refs. [67, 69]). The effectiveness of this approach depends on an accurate approximation of the Cholesky factorization of A^T A, or preferably of the QR-factorization of A, when A is a large and sparse rectangular matrix.
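A minimal sketch of such a shift-and-invert operator is given below; it builds (A^T A − σ̃²I)^{−1} from a sparse LU factorization of the explicitly formed normal-equations matrix, which is a simplification of the Cholesky- or QR-based approach recommended above, and all names are illustrative.

# Hedged sketch of a shift-and-invert operator for the normal equations.
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def shift_invert_operator(A, sigma_shift):
    n = A.shape[1]
    C = sp.csc_matrix(A.T @ A) - sigma_shift**2 * sp.identity(n, format="csc")
    lu = spla.splu(C)                                   # sparse LU of A^T A - shift^2 I
    return spla.LinearOperator((n, n), matvec=lu.solve)

Such an operator could then be handed, for instance, to a Lanczos-type eigensolver in shift-and-invert mode so that the singular values nearest σ̃ become the dominant eigenvalues.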

TABLE 4.6 Operators of Equivalent Symmetric Eigenvalue Problems for Computing the p-Smallest Singular Triplets of the m×n Matrix A

Method Label Operator

Single-vector Lanczos    LASVD    A^T A
Block Lanczos            BLSVD    A^T A
Subspace iteration       SISVD    γ²I_n − A^T A
Trace minimization       TRSVD    A^T A


For [LASVD], we simply define Ã_aug = A^T A in (4.32) and apply the strategy outlined in Table 4.3. For [SISVD] we make the substitution Ã_aug = γ²I_n − A^T A for (4.30) and apply the method in Table 4.2. Similarly, we solve the generalized eigenvalue problem in (4.44) with

    H = A^T A,    G = I_n,

and apply the strategy in Table 4.5 for the appropriate [TRSVD] method. Here, we do not require the shift γ², because the trace minimization will converge to the smallest eigenvalues of A^T A (and hence the squares of the smallest singular values of A) by default. For [BLSVD], we combine the equations in (4.36) and obtain

    A^T A V̂_k = V̂_k H_k + Ẑ_k,

where H_k = J_k^T J_k is the k × k symmetric block tridiagonal matrix in (4.41), with the n × k matrix Ẑ_k containing the remainder terms. The appropriate block Lanczos (outer) recursion is then given by Table 4.4. Having determined approximate singular values, σ̃_i, and corresponding right singular vectors, ṽ_i, to a user-specified tolerance for the residual

    r̂_i = A^T A ṽ_i − σ̃_i² ṽ_i,                                        (4.56)

we must then obtain an approximation to the corresponding left singular vector, u_i, via (4.55). As mentioned in Section 4.1.1, it is quite possible that the square roots of the approximate eigenvalues of either A^T A or γ²I_n − A^T A will be poor approximations to exact singular values of A which are extremely small. This phenomenon, of course, will lead to poor approximations to the left singular vectors. Even if σ̃_i (computed by any of the four methods) is an acceptable singular value approximation, the residual corresponding to the singular triplet {σ̃_i, ũ_i, ṽ_i}, defined by (4.4), will be bounded by

    ||r_i||_2 ≤ ||r̂_i||_2 / [σ̃_i (||ũ_i||_2² + ||ṽ_i||_2²)^{1/2}],      (4.57)

where r̂_i is the residual given in (4.56) for the symmetric eigenvalue problem for A^T A or (γ²I_n − A^T A). Scaling by σ̃_i can easily lead to a significant loss of accuracy in estimating the triplet residual norm, ||r_i||_2, especially when σ̃_i approaches the machine unit roundoff error, µ. One remedy is to refine the initial approximation of the left singular vector via (inverse iteration)

    A A^T ũ_i^{(k+1)} = σ̃_i² ũ_i^{(k)},                                  (4.58)

where ũ_i^{(0)} ≡ u_i from (4.55). Since AA^T is symmetric semidefinite, direct methods developed by Aasen [91], Bunch and Kaufman [92], or Parlett and Reid [93] could be used in each step of (4.58) if AA^T is explicitly formed. With regard to iterative methods, the SYMMLQ algorithm developed by Paige and Saunders [94] can be used to solve these symmetric indefinite systems of equations. As an alternative, we consider the following equivalent eigensystem for the SVD of an m × n matrix A:

    ( γI_n   A^T ) ( v_i )             ( v_i )
    ( A     γI_m ) ( u_i ) = (γ + σ_i) ( u_i ),                          (4.59)

where {σi , ui ,vi } is the ith singular triplet of A and

γ = min[1, max{σi }]. (4.60)


One possible refinement recursion (inverse iteration) is thus given by

    ( γI_n   A^T ) ( ṽ_i^{(k+1)} )             ( ṽ_i^{(k)} )
    ( A     γI_m ) ( ũ_i^{(k+1)} ) = (γ + σ̃_i) ( ũ_i^{(k)} ).            (4.61)

By applying block Gaussian elimination to (4.61) we obtain a more optimal form (reduced system) of the recursion

    ( γI_n        A^T       ) ( ṽ_i^{(k+1)} )             (          ṽ_i^{(k)}           )
    (  0    γI_m − (1/γ)AA^T ) ( ũ_i^{(k+1)} ) = (γ + σ̃_i) ( ũ_i^{(k)} − (1/γ) A ṽ_i^{(k)} ).   (4.62)

Our iterative refinement strategy for an approximate singular triplet of A, {σ̃_i, ũ_i, ṽ_i}, is then defined by the last m equations of (4.62), i.e.,

    (γI_m − (1/γ)AA^T) ũ_i^{(k+1)} = (γ + σ̃_i)(ũ_i^{(k)} − (1/γ) A ṽ_i),   (4.63)

where the superscript k is dropped from ṽ_i, since we refine only our left singular vector approximation, ũ_i. If ũ_i^{(0)} ≡ u_i from (4.55), then (4.63) can be rewritten as

    (γI_m − (1/γ)AA^T) ũ_i^{(k+1)} = (γ − σ̃_i²/γ) ũ_i^{(k)},             (4.64)

with the normalization

    ũ_i^{(k+1)} = ũ_i^{(k+1)} / ||ũ_i^{(k+1)}||_2.

It is easy to show that the left-hand-side matrix in (4.63) is symmetric positive definite provided (4.60) holds. Accordingly, we may use parallel conjugate gradient iterations to refine each singular triplet approximation. Hence, the refinement procedure outlined in Table 4.7 may be considered a black box procedure to follow the eigensolution of A^T A or γ²I_n − A^T A by any one of our four candidate methods for the sparse SVD. The iterations in Step (3) of the refinement scheme in Table 4.7 terminate once the norms of the residuals of all p approximate singular triplets (||r_i||_2) fall below a user-specified tolerance or after k_max iterations.

TABLE 4.7 Refinement Procedure for the Left Singular Vector Approximations Obtained Via Scaling

(1) Solve eigensystems of the form (αI_n + βA^T A) ṽ_i = (α + βσ̃_i²) ṽ_i, i = 1, 2, ..., p, where
        α = 0 and β = 1 for LASVD, BLSVD, and TRSVD, or
        α = γ² > σ²_max and β = −1 for SISVD
(2) Define ũ_i^{(0)} ≡ (1/σ̃_i) A ṽ_i, i = 1, 2, ..., p
(3) For k = 0, 1, 2, ... (until ||r_i||_2 ≤ tolerance or k ≥ k_max)
        Solve (γI_m − (1/γ)AA^T) ũ_i^{(k+1)} = (γ − σ̃_i²/γ) ũ_i^{(k)}, i = 1, 2, ..., p
        Set ũ_i^{(k+1)} = ũ_i^{(k+1)} / ||ũ_i^{(k+1)}||_2
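The inner solve in Step (3) can be carried out with conjugate gradients, since the operator γI_m − (1/γ)AA^T is symmetric positive definite when (4.60) holds. The sketch below, under illustrative assumptions (SciPy's CG with default tolerances and a fixed number of outer sweeps), shows one way to organize this refinement loop; it is not the authors' implementation.

# Hedged sketch of the refinement loop of Table 4.7 (steps (2)-(3)).
import numpy as np
import scipy.sparse.linalg as spla

def refine_left_vector(A, sigma, v, gamma, kmax=10):
    m, n = A.shape
    M = spla.LinearOperator((m, m),
                            matvec=lambda x: gamma * x - (A @ (A.T @ x)) / gamma)
    u = (A @ v) / sigma                       # step (2): initial left vector via (4.55)
    for _ in range(kmax):
        rhs = (gamma - sigma**2 / gamma) * u
        u, _ = spla.cg(M, rhs)                # step (3): solve (4.64)
        u /= np.linalg.norm(u)                # normalization
    return u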


SUMMARY
In this work, we demonstrated that a trace minimization strategy [TRSVD] using Chebyshev acceleration, Ritz shifting, and an iterative refinement method for improving left singular vector approximations can be quite economical and robust when computing several of the smallest singular triplets of sparse matrices arising from a variety of applications. A single-vector Lanczos method [LASVD] can be quite competitive in speed compared to [TRSVD], at the risk of missing some of the desired triplets due to parameter selection difficulties. Whereas [TRSVD] is effective in achieving high accuracy in approximating all the desired singular triplets, [LASVD] is acceptable only when triplets of moderate to low accuracy are needed. A subspace iteration-based method [SISVD] using Chebyshev acceleration performed poorly for clustered singular values when polynomials of relatively high degree are used; typically, though, this scheme requires less memory than the other three methods. A block Lanczos SVD method [BLSVD], on the other hand, is quite robust for obtaining all the desired singular triplets, at the expense of large memory requirements. The use of alternative reorthogonalization strategies could be a remedy for this difficulty. Finally, we have proposed a hybrid [TRSVD] method which circumvents the potential loss of accuracy when solving eigensystems of A^T A.

4.3.6 THE DAVIDSON METHODS
4.3.6.1 General Framework of the Methods
In 1975, Davidson [95] introduced a method for computing the smallest eigenvalues of the Schrödinger operator. Later, the method was generalized in two papers by Morgan and Scott [96] and Crouzeix et al. [97] in which the convergence of the method is proved. More recently, another version of the algorithm, under the name Jacobi–Davidson, was introduced in Ref. [98]. We present here a general framework for the class of Davidson methods, and point out how the various versions differ from one another. We should point out that for symmetric eigenvalue problems, the Davidson method version in Ref. [98] is strongly related to the trace minimization scheme discussed in Section 4.3.4, see Refs. [85, 99].
All versions of the Davidson method may be regarded as various forms of preconditioning the basic Lanczos method. To illustrate this point, let us consider a symmetric matrix A ∈ R^{n×n}. Both classes of algorithms generate, at some iteration k, an orthonormal basis V_k = [v_1, ..., v_k] of a k-dimensional subspace 𝒱_k of R^n. In the Lanczos algorithm, 𝒱_k is a Krylov subspace, but for Davidson methods this is not the case. In both classes, however, the interaction matrix is given by the symmetric matrix H_k = V_k^T A V_k ∈ R^{k×k}. As with the Lanczos method, the goal is to obtain V_k such that some eigenvalues of H_k are good approximations of some eigenvalues of A: if (λ, y) is an eigenpair of H_k, then it is expected that the Ritz pair (λ, x), where x = V_k y, is a good approximation of an eigenpair of A. Note that this occurs only for some eigenpairs of H_k and at convergence.
Davidson methods differ from Lanczos in the definition of the new direction w which will be incorporated in the subspace 𝒱_k to obtain 𝒱_{k+1}. For Lanczos schemes the vector w is given by w = A v_k, whereas for Davidson methods, a local improvement of the direction of the Ritz vector towards the sought-after eigenvector is obtained by a quasi-Newton step (similar to the trace minimization scheme in Refs. [85, 99]). In Lanczos, the next vector v_{k+1} is computed by the three-term recursion, if reorthogonalization is not considered, whereas in Davidson methods, the next vector is obtained by reorthogonalizing w with respect to V_k. Moreover, in this case, the matrix H_k is no longer tridiagonal. Therefore, one iteration of a Davidson method involves more arithmetic operations than the basic Lanczos scheme; it is at least as expensive as a Lanczos scheme with full reorthogonalization. Moreover, the basis V_k must be stored, which implies the need for limiting the


maximum value k_max to control storage requirements. Consequently, an algorithm with periodic restarts must be implemented. To compute the smallest eigenvalue of the matrix A, the skeleton of the basic algorithm is:

ALGORITHM : Generic Davidson

Choose an initial normalized vector V_1 = [v_1]
Repeat
    for k = 1 : k_max,
        compute W_k = A V_k ;
        compute the interaction matrix H_k = V_k^T A V_k ;
        compute the smallest eigenpair (λ_k, y_k) of H_k ;
        compute the Ritz vector x_k = V_k y_k ;
        compute the residual r_k = W_k y_k − λ_k x_k ;
        if convergence then exit ;
        compute the new direction t_k ;      (Davidson correction)
        V_{k+1} = MGS(V_k, t_k) ;
    end ;
    V_1 = MGS(x_k, t_k) ;                    (restart)
until convergence ;

Algorithms of the Davidson family differ only in how the vector t_k is determined. Note that the above generic algorithm is expressed in a compact form; the steps which compute W_k or H_k, however, only determine updates to W_{k−1} or H_{k−1}. In this algorithm MGS denotes the Modified Gram–Schmidt algorithm, which is used to orthogonalize t_k with respect to V_k and then to normalize the resulting vector to obtain v_{k+1}. Similar to the Lanczos algorithm, a block version of the Davidson methods may be considered for approximating the s smallest eigenvalues of A; a sketch in code follows the block algorithm below. These block versions are even more closely related to the trace minimization scheme [85, 99].

ALGORITHM : Generic Block Davidson
Choose an initial orthonormal matrix V_1 = [v_1, ..., v_s] ∈ R^{n×s}
Repeat
    for k = 1 : k_max/s,
        compute W_k = A V_k ;
        compute the interaction matrix H_k = V_k^T A V_k ;
        compute the s smallest eigenpairs (λ_{k,i}, y_{k,i})_{1≤i≤s} of H_k ;
        compute the Ritz vectors x_{k,i} = V_k y_{k,i} for i = 1, ..., s ;
        compute the residuals r_{k,i} = W_k y_{k,i} − λ_{k,i} x_{k,i} for i = 1, ..., s ;
        if convergence then exit ;
        compute the new directions (t_{k,i})_{1≤i≤s} ;      (Davidson correction)
        V_{k+1} = MGS(V_k, t_{k,1}, ..., t_{k,s}) ;
    end ;
    V_1 = MGS(x_{k,1}, ..., x_{k,s}, t_{k,1}, ..., t_{k,s}) ;        (restart)
until convergence ;
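As promised above, here is a minimal NumPy sketch of the single-vector generic algorithm for a dense symmetric matrix. It uses the classical diagonal correction (M − λI)t = −r with M = diag(A) described in the next subsection, omits restarting, and its parameter choices and names are illustrative; it is not a faithful reproduction of any of the published variants.

# Hedged sketch of the generic single-vector Davidson iteration for a dense
# symmetric matrix A, with the classical diagonal correction and no restart.
import numpy as np

def davidson_smallest(A, kmax=30, tol=1e-8, seed=0):
    n = A.shape[0]
    d = np.diag(A)
    V = np.random.default_rng(seed).standard_normal((n, 1))
    V /= np.linalg.norm(V)
    for k in range(1, kmax + 1):
        W = A @ V                                  # W_k = A V_k
        H = V.T @ W                                # interaction matrix H_k
        lam, Y = np.linalg.eigh(H)
        lam_k, y_k = lam[0], Y[:, 0]               # smallest eigenpair of H_k
        x_k = V @ y_k                              # Ritz vector
        r_k = W @ y_k - lam_k * x_k                # residual
        if np.linalg.norm(r_k) < tol:
            break
        t_k = -r_k / (d - lam_k + 1e-12)           # (M - lam I) t = -r, M = diag(A)
        t_k -= V @ (V.T @ t_k)                     # MGS-style orthogonalization vs V_k
        V = np.hstack([V, (t_k / np.linalg.norm(t_k))[:, None]])
    return lam_k, x_k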


4.3.6.2 How do the Davidson Methods Differ?

To introduce the correction vectors (t_{k,i})_{1≤i≤s}, we assume that a known normalized vector x approximates an unknown eigenvector x + y of A, where y is chosen orthogonal to x. The quantity λ = ρ(x) (where ρ(x) = x^T A x denotes the Rayleigh quotient of the normalized vector x) approximates the eigenvalue λ + δ which corresponds to the eigenvector x + y: λ + δ = ρ(x + y) = (x + y)^T A (x + y)/||x + y||². The quality of the approximation is measured by the norm of the residual r = Ax − λx. Since r = (I − xx^T)Ax, the residual is orthogonal to the vector x. Let us denote by θ the angle ∠(x, x + y); let t be the orthogonal projection of x onto x + y and z = x − t.

Lemma 4.1 The norms of the involved vectors are:

    ||x + y|| = 1/cos θ,                                                 (4.65)
    ||y|| = tan θ,                                                       (4.66)
    ||z|| = sin θ,                                                       (4.67)
    ||t|| = cos θ.                                                       (4.68)


Proof. Obvious.

Proposition 4.5 With the previous notations, the correction δ to the approximation λ of an eigenvalue, and the orthogonal correction y of the corresponding approximate eigenvector x, satisfy

    (A − λI)y = −r + δ(x + y),    y ⊥ x,                                 (4.69)

in which the following bounds hold:

    |δ| ≤ 2||A|| tan²θ,                                                  (4.70)
    ||r|| ≤ 2||A|| tan θ (sin θ/cos²θ + 1).                              (4.71)

Proof. Equation (4.69) is directly obtained from A(x + y) = (λ + δ)(x + y). Moreover,

    λ + δ = ρ(t)
          = (1 + tan²θ) t^T A t
          = (1 + tan²θ)(x − z)^T A (x − z)
          = (1 + tan²θ)(λ − 2 x^T A z + z^T A z).

Since z is orthogonal to the eigenvector t, we obtain x^T A z = (t + z)^T A z = z^T A z and therefore

    δ = −z^T A z + tan²θ (λ − z^T A z)
      = tan²θ (λ − ρ(z)),

or |δ| ≤ 2 tan²θ ||A||,


which proves (4.70). Bound (4.71) is a straightforward consequence of relations (4.69), (4.66), and (4.70).
Thus the various versions of the Davidson method are based on the following: at iteration k, the Ritz pair (λ, x) under consideration is computed and the determined correction y is added to the space 𝒱_k to obtain 𝒱_{k+1}. The system (4.69) is not easy to solve. Since the norm of the nonlinear term δ(x + y) is O(θ²) whereas ||r|| = O(θ), one may instead consider the approximate problem

    (A − λI)y = −r,    y ⊥ x,                                            (4.72)

which has no solution except the trivial one when λ is an eigenvalue. Two alternatives may be considered to define another approximate problem. The first one, which corresponds to the original Davidson method, is to approximately solve the problem

(A − λI )y =−r, (4.73)

and then orthogonalize y with respect to the subspace 𝒱_k. The solution cannot be the exact one, else it would yield y = −x and no new direction would be incorporated to obtain 𝒱_{k+1}. The second approach is that of the trace minimization and Jacobi–Davidson schemes, in which one projects the linear system (4.72) onto the hyperplane x^⊥ orthogonal to x. More precisely, the different approaches for the correction are:

Davidson with a preconditioner: solve

(M − λI )y =−r, (4.74)

where M is some preconditioner of A. In the original Davidson method, M is taken as the diagonal of A.
Davidson and Inverse Iteration: when λ is close to an eigenvalue, solve approximately

(A − λI )y =−r, (4.75)

by an iterative method with a fixed number of iterations. In such a situation, the method is similar to inverse iteration, in which the ill-conditioning of the system provokes an error which lies in the direction of the sought-after eigenvector. The iterative solver, however, must be adapted to symmetric indefinite systems.
Trace Minimization and Jacobi–Davidson schemes: solve the problem

    Q(x)(A − λI)y = −r,    y ⊥ x,                                        (4.76)

where Q(x) = I − xx^T is the orthogonal projection with respect to x. The system is solved iteratively. As in the previous situation, the iterative solver must be adapted to symmetric indefinite systems (the system matrix can be expressed as Q(x)(A − λI)Q(x)).
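For illustration, the sketch below solves the projected correction equation (4.76) with MINRES, a Krylov solver suited to the symmetric indefinite operator Q(x)(A − λI)Q(x); the inner iteration cap and all names are illustrative assumptions rather than a prescription from the text.

# Hedged sketch: solving (4.76) iteratively with a projected operator and MINRES.
import numpy as np
import scipy.sparse.linalg as spla

def jd_correction(A_matvec, x, lam, r, n, inner_its=20):
    """x: normalized Ritz vector; lam: Ritz value; r: residual Ax - lam x."""
    Q = lambda w: w - x * (x @ w)                    # Q(x) = I - x x^T
    op = spla.LinearOperator(
        (n, n), matvec=lambda w: Q(A_matvec(Q(w)) - lam * Q(w)))
    y, _ = spla.minres(op, -Q(r), maxiter=inner_its)
    return Q(y)                                      # keep the correction orthogonal to x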

A recent study by Simoncini and Eldén [100] compares the Rayleigh quotient method, correction via "Davidson and Inverse Iteration," and the Newton–Grassmann method, which corresponds to corrections via trace minimization or Jacobi–Davidson. The study concludes that the two correction schemes have comparable behavior. Simoncini and Eldén [100] also provide a stopping criterion for controlling the inner iterations of an iterative solver for the correction vectors.


4.3.6.3 Application to the Computation of the Smallest Singular Value
The smallest singular value of a matrix A may be obtained by applying one of the various versions of the Davidson methods to obtain the smallest eigenvalue of the matrix C = A^T A, or to obtain the innermost positive eigenvalue of the 2-cyclic augmented matrix in (4.3). We assume that one has the basic kernels for matrix–vector multiplications, including the multiplication of the transpose of the original matrix A by a vector. Multiplying the transpose of a matrix by a vector is considered a drawback, and whenever possible, the so-called "transpose-free" methods should be used. Even though one can avoid such a drawback when dealing with the interaction matrix H_k = V_k^T C V_k = (A V_k)^T (A V_k), we still have to compute the residuals corresponding to the Ritz pairs, and this does involve multiplication of the transpose of the matrix by a vector.
For the regular single-vector Davidson method, the correction vector is obtained by approximately solving the system

    A^T A t_k = r_k.                                                     (4.77)

Obtaining an exact solution of (4.77) would yield the Lanczos algorithm applied to C^{−1}. Once the Ritz value approaches the square of the sought-after smallest singular value, it is recommended that we solve (4.77) without any shifts; the benefit is that we deal with a fixed symmetric positive definite system matrix. The approximate solution of (4.77) can be obtained by performing a fixed number of iterations of the conjugate gradient scheme, or by solving an approximate linear system M t_k = r_k with a direct method. The latter has been theoretically studied in Ref. [101], in which M is obtained from approximate factorizations:

Incomplete LU-factorization of A: Here, M = U^{−1} L^{−1} L^{−T} U^{−T}, where L and U are the factors of an incomplete LU-factorization of A obtained via the so-called ILUTH. In such a factorization, one drops entries of the reduced matrix A which are below a given threshold. This version of the Davidson method is called DAVIDLU.
Incomplete QR-factorization of A: Here, M = R^{−1} R^{−T}, where R is the upper-triangular factor of an incomplete QR-factorization of A. This version of the Davidson method is called DAVIDQR.
Incomplete Cholesky of A^T A: Here, M = L^{−T} L^{−1}, where L is the lower-triangular factor of an incomplete Cholesky factorization of the normal equations. This version of the Davidson method is called DAVIDIC.
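SciPy ships neither ILUTH nor an incomplete Cholesky, so the sketch below is only in the spirit of these preconditioners: it applies an incomplete LU factorization to the explicitly formed normal-equations matrix A^T A and wraps the result as an operator approximating (A^T A)^{−1}; the drop tolerance, fill factor, and names are illustrative assumptions.

# Hedged sketch of a factorization-based preconditioner for A^T A (not ILUTH).
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def normal_equations_preconditioner(A, drop_tol=1e-3, fill_factor=10):
    C = sp.csc_matrix(A.T @ A)                           # normal-equations matrix
    ilu = spla.spilu(C, drop_tol=drop_tol, fill_factor=fill_factor)
    n = A.shape[1]
    return spla.LinearOperator((n, n), matvec=ilu.solve)  # M^{-1} approx (A^T A)^{-1}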

Even though the construction of any of the above approximate factorizations may fail, experiments presented in Ref. [101] show the effectiveness of the above three preconditioners whenever they exist. It is also shown that DAVIDQR is slightly more effective than either DAVIDLU or DAVIDIC.
Similar to trace minimization, the Jacobi–Davidson method can be used directly on the matrix A^T A to compute the smallest eigenvalue and the corresponding eigenvector. More recently, the Jacobi–Davidson method has been adapted in Ref. [102] for obtaining the singular values of A by considering the eigenvalue problem corresponding to the 2-cyclic augmented matrix.

SUMMARY
Computing the smallest singular value often involves a shift-and-invert strategy. The advantage of the trace minimization method and of the Davidson methods over Lanczos or subspace iteration methods is that they use only an approximation of the inverse, through a preconditioner or through the partial solution of the corresponding system by an iterative procedure. Trace minimization and Davidson methods are all based on a Newton-like


correction. With the former a fixed-size basis is updated at each step, whereas with the latter an extension of the subspace is invoked at each iteration except when restarting. One may assume that expanding the subspace usually speeds up the convergence. Often, however, using a subspace of fixed dimension results in a more robust scheme, especially if the size of the subspace is well adapted to the eigenvalue separation (i.e., the singular value separation for our problem).

4.4 PARALLEL COMPUTATION FOR SPARSE MATRICES
In this section, we present approaches for the parallelization of the two most important kernels: matrix–vector products and basis orthogonalization. The experiments were done on a 56-processor distributed memory machine, an Intel Paragon XP/S i860, where communications between processors were handled by the MPI library. We conclude with the description of a parallel computation of the smallest singular value of a family of sparse matrices. In that case, the experiment was done on a cluster of workstations.

4.4.1 PARALLEL SPARSE MATRIX–VECTOR MULTIPLICATIONS
Matrix–vector multiplications are usually the most CPU-time-consuming operations in most iterative solvers. As mentioned earlier, the methods devoted to SVD computations involve multiplication by the original matrix and by its transpose as well. Obviously, only one copy of the matrix is stored. We present here the approach described in Ref. [103], which considers a matrix stored using the CSR format outlined in Section 4.3.1.
The main goal of data partitioning is to define large-grain local tasks and to overlap communications with computations. This is not always possible, as efficiency depends on the matrix structure. Because of the row-oriented compact storage of the sparse matrix, data allocation is performed by rows:

Data allocation:

    A = ( A^0     )        v = ( v_0     )    −→ Node 0
        ( A^1     )            ( v_1     )    −→ Node 1
        (  ...    )            (  ...    )    −→ ...
        ( A^{p−1} )            ( v_{p−1} )    −→ Node (p − 1)

On each processor, the matrix components are stored row-wise as well. These, in turn, are organized into two parts: local and exterior. On processor k, the local part, denoted A_{loc,k}, holds the components which do not require communication during the multiplication. On the same processor, the exterior part is split into the blocks A^j_{ext,k} (j ≠ k), as shown below.

Data partition for the multiplications by Av and A^T v:

    A = ( A_{loc,0}      A^1_{ext,0}    ···   A^{p−1}_{ext,0} )        v = ( v^0_{loc}     )    −→ Node 0
        ( A^0_{ext,1}    A_{loc,1}      ···   A^{p−1}_{ext,1} )            ( v^1_{loc}     )    −→ Node 1
        (    ···            ···         ···        ···        )            (    ···        )    −→ ...
        ( A^0_{ext,p−1}  A^1_{ext,p−1}  ···   A_{loc,p−1}     )            ( v^{p−1}_{loc} )    −→ Node (p − 1)


FIGURE 4.2 Timings for matrix–vector multiplications (matrix of order n = 8192 with nz = 41,746 nonzeros): CPU time in ms for 1000 products Av and A^T v, and the corresponding speedup (referenced on r = 17 processors), as functions of the number of processors.

Algorithms for the matrix–vector multiplication
The algorithms for the two matrix–vector multiplications are expressed in the two following tables, in which the communication routine Send is preferably a nonblocking one.

k = processor number; p = number of processors.

Matrix–vector multiplication v → Av:
Step 1: Send the needed components of v^k_{loc} to the other processors.
Step 2: Compute w^k = A_{loc,k} v^k_{loc}.
Step 3: Receive the needed components of v^j_{loc} for j ≠ k from the other processors.
Step 4: Compute w^k = w^k + Σ_{j≠k} A^j_{ext,k} v^j_{loc}.

Matrix transpose–vector multiplication v → A^T v:
Step 1: Compute y^j = (A^j_{ext,k})^T v^k_{loc} for j ≠ k.
Step 2: Send y^j for j ≠ k to node number j.
Step 3: Compute w^k = (A_{loc,k})^T v^k_{loc}.

Step 4: Receive y^j with j ≠ k from the other processors.
Step 5: Compute w^k = w^k + Σ_{j≠k} y^j.
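A minimal mpi4py sketch of the first algorithm (v → Av) is given below; for simplicity each rank sends its whole local slice rather than only the needed components, the block sizes are assumed to be known on every rank, and all names are illustrative, so it should be read as a schematic of the communication/computation overlap rather than the implementation used in the experiments.

# Hedged sketch of v -> A v from Section 4.4.1 with mpi4py and SciPy CSR blocks.
# Rank k owns A_loc (its diagonal block), A_ext = {j: exterior block from rank j},
# and v_loc (float64); counts[j] is the length of rank j's local slice.
from mpi4py import MPI
import numpy as np

def parallel_matvec(A_loc, A_ext, v_loc, counts, comm):
    rank, p = comm.Get_rank(), comm.Get_size()
    # Step 1: nonblocking sends of the local components (whole slice here).
    reqs = [comm.Isend(v_loc, dest=j) for j in range(p) if j != rank]
    # Step 2: local multiply, overlapped with the communication above.
    w = A_loc @ v_loc
    # Steps 3-4: receive the exterior pieces and accumulate their contribution.
    for j in range(p):
        if j == rank:
            continue
        v_j = np.empty(counts[j])
        comm.Recv(v_j, source=j)
        if j in A_ext:                      # only nonzero exterior blocks matter
            w += A_ext[j] @ v_j
    MPI.Request.Waitall(reqs)
    return w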

The matrix DW8192 of the Matrix Market [14] test suite was used to measure the performance of the above two matrix–vector multiplication schemes. Run times and speedups are displayed in Figure 4.2, showing that one should avoid matrix–vector multiplications involving the transpose of the stored matrix, at least on architectures similar to the Intel Paragon.


TABLE 4.8 Distribution by contiguous rows: the rows of A_m and the corresponding entries of z are split into contiguous blocks assigned to processors P_0, P_1, and P_2.

4.4.2 A PARALLEL SCHEME FOR BASIS ORTHOGONALIZATION
One of the crucial procedures which must be optimized for maximum performance is the construction of an orthonormal basis. Let us assume that at some stage, the factor Q of the QR-factorization of a matrix A ∈ R^{n×m} (n ≫ m) must be computed on a parallel computer with distributed memory. The most common procedure, that of the MGS-algorithm, does not offer effective parallelism for the case at hand (n ≫ m). The classical version is often considered due to its suitability on parallel architectures; in this case, however, to maintain numerical reliability, it is necessary to reapply the procedure a second time. We present here a better procedure that we outlined in a previous study (see Ref. [104]), in which the QR-factorization of A is obtained in two steps: ROCDEC, which computes the triangular factor R by combining Householder and Givens transformations, and ROCVEC, which accumulates the orthogonal transformations to compute the vector Qy for any vector y.
Assuming that A is distributed by contiguous blocks of rows on a ring of r processors (see Table 4.8), the two procedures are given below.

ALGORITHM : ROCDEC
initialize N, m, Nloc, Aloc(1 : Nloc, :) ;
for j := 1 : m do
    create my reflector[ j] and update my local columns
    if myid() = 0 then
        send row Aloc( j, j : m) to myright()
    else
        receive row( j : m) from myleft()
        create my rotation[1, j] to annihilate Aloc(1, j)
        if myid() ≠ r − 1 then update and send row( j : m) to myright()
    endif
endfor


At completion, the factor Q is not formed explicitly; it is shared among the processors, in the same fashion as the original matrix A_m at the beginning, in the form of the stored reflectors and rotations. The product z = Qy is then computed as follows.

ALGORITHM : ROCVEC
k := myid() ; zloc := 0 ;
{ apply orthogonal transformations in reverse order }
for j := m : 1 step −1 do
    if (k = 0) then
        receive zloc( j) from myright()
        i := j ;
    else
        if (k = r − 1) then
            apply my rotation[1, j] on [y( j), zloc(1)]
            send the updated y( j) to myleft()
        else
            receive yj from myright()
            apply my rotation[1, j] on [yj, zloc(1)]
            send the updated yj to myleft()
        endif
        i := 1 ;
    endif
    apply my reflector[ j] on zloc(i : Nloc)
endfor

Data distribution. The factor R is sent to processor P_{r−1}. To compute the product z = Qy, the vector y ∈ R^m must initially reside on processor P_{r−1}. The resulting vector z ∈ R^N is split among the processors at completion.

Parallelism. The application of the Householder reflectors is a completely independent stage. However, to annihilate the remaining nonzero elements through Givens rotations, it is necessary to transport a row portion along the r processors. But as soon as that portion has passed through a processor, the latter applies its next reflector, so that its computations overlap with the transfer of that row portion to the other processors. Since there will be a total of m transfers of this sort during the whole factorization, communication overheads will be masked if m ≫ r. In summary, we expect a reasonable speedup when (n/r) ≫ m and m ≫ r.

Experiment. The following experiment illustrates the superior scalability of the scheme ROC (ROC = ROCDEC + ROCVEC) with respect to the MGS procedure. The goal is to compute Qy, where Q is the Q-factor in the QR-factorization of A. In Table 4.9, we give the running times (in seconds) of the two algorithms with a constant computing load per processor: the column length is n = 10^4 r and the number of columns is m = 40. The algorithm ROC appears to be much more scalable than MGS.


TABLE 4.9 Scalability of the Orthogonalization Process (As Illustrated by the Running Time in Seconds)

r MGS ROC

1     0.16E+01    0.18E+01
2     0.21E+01    0.19E+01
4     0.25E+01    0.20E+01
8     0.28E+01    0.20E+01
16    0.32E+01    0.21E+01
24    0.34E+01    0.21E+01
32    0.36E+01    0.22E+01
56    0.40E+01    0.24E+01

4.4.3 COMPUTING THE SMALLEST SINGULAR VALUE ON SEVERAL PROCESSORS As mentioned earlier (see Proposition 4.1) the smallest singular value of A can be computed from the largest eigenvalue of the matrix

    B = (   0      R^{−1} )
        ( R^{−T}     0    ),                                             (4.78)

where A = QR is a QR-decomposition of A. Typically, one would couple a Lanczos algorithm to a QR-decomposition to compute the largest eigenvalue of B and hence the smallest singular value of A.
For large matrix dimensions, the matrix R can be too large to fit in the fast memory. Furthermore, each iteration of the Lanczos algorithm requires the solution of two linear systems based on R and R^T. Therefore, parallel QR-decomposition algorithms and parallel system solvers are required to efficiently compute the smallest singular value.
The multifrontal QR-decomposition (MFQRD), presented in Ref. [105], allows a large granularity for parallelism. The MFQRD starts by building a dependency graph that expresses the connections between successive steps involved in the Householder reflections. The nodes in the elimination tree correspond to matrix columns, where column a is a parent of column b if the application of the Householder reflection corresponding to column b alters column a. The computation of the Householder reflections is totally independent for separate tree leaves. Therefore, the reduction of independent subtrees can be done in parallel. On a cluster of workstations, one processor computes the elimination tree, isolates a set of independent subtrees, and scatters the subtrees among the different processors of the cluster. Each processor retrieves a subtree and applies the corresponding Householder reflections. This process leads to a scattered R matrix (row-wise) adequate for parallel solvers. For a detailed description of this procedure, the reader is referred to Ref. [13].
Table 4.10 shows the wall-clock time needed to compute σ_min for three test matrices S1 ∈ R^{1890×1890}, S2 ∈ R^{3906×3906}, and S3 ∈ R^{32130×32130}. Although the test matrices are obtained from actual applications, we do not discuss here the physical interpretation of the results. It can be observed that parallelizing the computation of σ_min is not beneficial for S1 and S2, but for S3


TABLE 4.10 Parallel Computation of σmin on a Small Cluster of Machines.

Matrix 1 Proc. 3 Procs. Speedup Efficiency

Multifrontal QR-decomposition
S1       3.7       3.6      1.02    0.34
S2      41.8      30.8      1.35    0.45
S3    2353.0    1410.0      1.66    0.55

Lanczos algorithm
S1       0.7       5.2      0.13    0.04
S2       2.1      15.6      0.14    0.05
S3    1479.0     647.0      2.29    0.76

σ_min (total)
S1       4.4       8.8      0.49    0.16
S2      43.9      46.5      0.94    0.32
S3    3822.0    2057.0      1.85    0.61

speedups of 1.66 and 2.29 are achieved for the QR-decomposition and the Lanczos algorithm, respectively. The total speedup for obtaining σmin is 1.85 with a corresponding efficiency of 61%.

4.5 APPLICATION: PARALLEL COMPUTATION OF A PSEUDOSPECTRUM As discussed in Section 4.1.3, the computation of the pseudospectrum of a matrix A involves a large volume of arithmetic operations. It is now commonly accepted that path-following algorithms which compute the level curve

    {z ∈ C | σ_min(A − zI) = ε},                                         (4.79)

are of much lower complexity than methods based on grid discretization [106]. The first attempt in this direction was published by Brühl [11]. Being based on a continuation with a predictor–corrector scheme, that process may fail at angular discontinuities along the level curve [12, 107]. Trefethen and Wright [108] use the upper Hessenberg matrix constructed after successive iterations of the implicitly restarted Arnoldi algorithm to cheaply compute an approximation of the pseudospectrum. However, they show that for highly nonnormal matrices the computed pseudospectrum is a poor approximation of the exact one.

4.5.1 PARALLEL PATH-FOLLOWING ALGORITHM USING TRIANGLES
In this section, we present the parallel path-following algorithm using triangles (PPAT) for computing the level curve (4.79). For a detailed description of this algorithm, the reader is referred to Refs. [13, 109]. PPAT uses a numerically stable algorithm that offers a guarantee of termination even in the presence of round-off errors. Furthermore, the underlying algorithm can handle singular points along the level curve of interest without difficulty.


FIGURE 4.3 Computing a level curve using PPAT: equilateral triangles lined up along the level curve in the complex plane.

The main idea is to line up a set of equilateral triangles along the level curve, as presented in Figure 4.3, and to use a bisection algorithm to compute a numerical approximation of the pseudospectrum. More specifically, given a mesh of the complex plane with equilateral triangles, for any triangle T_i of the mesh which intersects the sought level curve, an adjacent triangle T_{i+1} of the same type is defined as the successor of T_i. Therefore, from a starting triangle T_0, a chain of triangles T_1, ..., T_n such that T_n = T_0 is defined. In Ref. [109], it is proven that to compute a level curve of length l, the number of equilateral triangles of size τ satisfies

    l/τ ≤ n ≤ 10l/(√3 τ).                                                (4.80)

PPAT is based on a master–slave model where a master node controls a set of slave nodes capable of computing σ_min(A − zI) for a given complex value z and of extracting a point z of the level curve from a segment [z_l, z_g], assuming σ_min(A − z_l I) ≤ ε < σ_min(A − z_g I). For that purpose, A is broadcast to all workers. Tasks are queued in a task list managed by the master. Given a triangle T_i along the level curve, the master spawns two tasks; the first computes T_{i+1} whereas the second extracts a new point of the level curve. The dynamic task scheduling allows better load balancing among the different processors of a heterogeneous network of workstations.
PPAT exploits the fact that multiple level curve slices can be computed simultaneously. The main idea is to locate different starting triangles along the level curve and use each triangle to compute a level curve slice. For this to work, the triangles of the different computed slices should align perfectly; therefore, a prefixed lattice is used for all level curve slices. Furthermore, PPAT proceeds in both directions on a single level curve slice to achieve higher speedups.
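The two kernels delegated to the workers can be stated very compactly. The sketch below evaluates σ_min(A − zI) with a dense SVD, a simplification of the sparse machinery of Section 4.4.3, and extracts a level-curve point on a segment [z_l, z_g] by bisection under the bracketing assumption stated above; names and tolerances are illustrative.

# Hedged sketch of the two worker kernels used by PPAT.
import numpy as np

def sigma_min(A, z):
    """Smallest singular value of A - z I (dense SVD, for illustration only)."""
    return np.linalg.svd(A - z * np.eye(A.shape[0]), compute_uv=False)[-1]

def extract_point(A, eps, zl, zg, tol=1e-6):
    """Bisection on [zl, zg], assuming sigma_min(A - zl I) <= eps < sigma_min(A - zg I)."""
    while abs(zg - zl) > tol:
        zm = 0.5 * (zl + zg)
        if sigma_min(A, zm) <= eps:
            zl = zm
        else:
            zg = zm
    return 0.5 * (zl + zg)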

4.5.2 SPEEDUP AND EFFICIENCY
It is shown in Ref. [13] that if n is the number of equilateral triangles built to compute an approximation of a given level curve using a single slice, p the number of processors, and q the number


of evaluations of σ_min for each bisection process, then the speedup is given by

    S_p = 2n(q + 1)/(n + 2q)                        if p ≥ 2q + 2,
    S_p = 2np(q + 1)/(p² − 2p + 2n(q + 1))          if p < 2q + 2.       (4.81)

For large values of n, the speedup and efficiency can be expressed as

Sp = min(p, 2q + 2) + O(1/n),

and

    E_p = min(1, (2q + 2)/p) + O(1/n).

The upper bound of the speedup is given by Smax = 2q + 2. This limit is theoretically achieved whenever p ≥ 2q + 2.
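As a sanity check, the model (4.81) is simple enough to evaluate directly; the helper below is an illustrative, hypothetical utility (not part of PPAT) returning S_p for given n, p, and q.

# Hedged helper implementing the speedup model (4.81).
def ppat_speedup(n, p, q):
    if p >= 2 * q + 2:
        return 2 * n * (q + 1) / (n + 2 * q)
    return 2 * n * p * (q + 1) / (p * p - 2 * p + 2 * n * (q + 1))

# Example: with n = 100 triangles and q = 5 evaluations per bisection,
# the speedup saturates near S_max = 2q + 2 = 12 once p >= 12.
print(ppat_speedup(100, 12, 5))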

4.5.3 TEST PROBLEMS Three test matrices were selected from the Matrix Market suite [14]. Table 4.11 summarizes their

characteristics (Nz is the number of nonzero entries and t_{σ_min} is the average computation time for σ_min(A − zI)). The application uses up to 20 workers, where the master process shares the same physical processor as the first worker. The underlying hardware is a low-cost, general purpose network of personal computers (Pentium III, 600 MHz, 128 MB RAM).
Figure 4.4 displays speedups and efficiencies for computing 100 points on a given level curve with PPAT. The best performance was observed for the matrix Dw8192, which is the largest of the set: a speedup of 10.85 using 12 processors, which corresponds to a 90% efficiency. The lowest efficiency obtained was 63% for Olm1000 on 13 processors.
In a more realistic setting, we have used the 70 processors of the PARASKI cluster at IRISA to compute, in 41 s, the level curve ε = 0.05 of Olm1000 split into 15 slices. A single processor required 4020 s to compute the same level curve. The corresponding speedup is 98 with an efficiency greater than 1, indicating the higher data locality realized on each processor. This result indicates the favorable scalability of PPAT, because PARASKI is a heterogeneous cluster of processors ranging from PII to PIV; the sequential time was measured on a randomly selected PIII of the cluster.

TABLE 4.11 Test Matrices from the Non-Hermitian Eigenvalue Problem (NEP) Collection

Matrix       Order     Nz        ||A||_F       t_{σ_min} (s)

Olm1000      1,000     3,996     1.3 × 10^6      0.27
Rdb3200L     3,200    18,880     2.8 × 10^3     15.14
Dw8192       8,192    41,746     1.6 × 10^3    114.36


FIGURE 4.4 Speedup and efficiency of PPAT versus the number of processors for the matrices OLM1000, RDB3200L, and DW8192, together with the bounds S_max and E_max.

REFERENCES
[1] M.W. Berry. Large scale singular value computations. Int. J. Supercomputer Appl., 6:13–49, 1992.
[2] G.H. Golub and C.F. Van Loan. Matrix Computations, 3rd ed. Johns Hopkins University Press, Baltimore, MD, 1996.
[3] G.W. Stewart and J.-G. Sun. Matrix Perturbation Theory. Academic Press, New York, 1990.
[4] J.-G. Sun. Condition number and backward error for the generalized singular value decomposition. SIAM J. Matrix Anal. Appl., 22(2):323–341, 2000.
[5] J.K. Cullum and R.A. Willoughby. Lanczos Algorithms for Large Symmetric Eigenvalue Computations, Vol. 1. Birkhäuser, Basel, 1985.


[6] J. Demmel, M. Gu, S. Eisenstat, Z. Drmač, I. Slapničar, and K. Veselić. Computing the singular value decomposition with high accuracy. Linear Algebra Appl., 299(1–3):21–80, 1999.
[7] J.-G. Sun. A note on simple non-zero singular values. J. Comput. Math., (6):258–266, 1988.
[8] H. Hoteit, J. Erhel, R. Mosé, B. Philippe, and P. Ackerer. Numerical reliability for mixed methods applied to flow problems in porous media. J. Comput. Geosci., 6:161–194, 2002.
[9] L.N. Trefethen. Pseudospectra of matrices. In D.F. Griffiths and G.A. Watson, Eds., Numerical Analysis 1991, Dundee 1991. Longman, Chicago, 1992, pp. 234–266.
[10] S.K. Godunov. Spectral portraits of matrices and criteria of spectrum dichotomy. In L. Atanassova and J. Herzberger, Eds., 3rd Int. IMACS-CAMM Symp. Comp. Arithmetic Enclosure Meth., Amsterdam, 1992.
[11] M. Brühl. A curve tracing algorithm for computing the pseudospectrum. BIT, 36(3):441–454, 1996.
[12] C. Bekas and E. Gallopoulos. Cobra: parallel path following for computing the matrix pseudospectrum. Parallel Comput., 27(14):1879–1896, 2001.
[13] D. Mezher and B. Philippe. Parallel computation of pseudospectra of large sparse matrices. Parallel Comput., 28(2):199–221, 2002.
[14] Matrix Market. http://math.nist.gov/MatrixMarket/.
[15] L.S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R.C. Whaley. ScaLAPACK Users' Guide. SIAM, Philadelphia, 1997.
[16] ScaLAPACK. The ScaLAPACK project. http://www.netlib.org/scalapack/.
[17] P. Alpatov, G. Baker, C. Edwards, J. Gunnels, G. Morrow, J. Overfelt, Y.-J.J. Wu, and R. van de Geijn. PLAPACK: parallel linear algebra package. In Proc. SIAM Parallel Process. Conf., 1997.
[18] PLAPACK. PLAPACK: parallel linear algebra package.
[19] E. Caron, S. Chaumette, S. Contassot-Vivier, F. Desprez, E. Fleury, C. Gomez, M. Goursat, E. Jeannot, D. Lazure, F. Lombard, J.-M. Nicod, L. Philippe, M. Quinson, P. Ramet, J. Roman, F. Rubi, S. Steer, F. Suter, and G. Utard. Scilab to Scilab//, the Ouragan project. Parallel Comput., 27(11):1497–1519, 2001.
[20] Mathtools.net. MATLAB > Parallel. http://www.mathtools.net/MATLAB/Parallel/.
[21] G. Forsythe and P. Henrici. The cyclic Jacobi method for computing the principal values of a complex matrix. Trans. Am. Math. Soc., 94:1–23, 1960.
[22] A. Schönhage. Zur Konvergenz des Jacobi-Verfahrens. Numer. Math., 3:374–380, 1961.
[23] J.H. Wilkinson. Note on the quadratic convergence of the cyclic Jacobi process. Numer. Math., 5:296–300, 1962.
[24] A. Sameh. On Jacobi and Jacobi-like algorithms for a parallel computer. Math. Comput., 25:579–590, 1971.
[25] F. Luk and H. Park. A proof of convergence for two parallel Jacobi SVD algorithms. IEEE Trans. Comput., 38(6):806–811, 1989.
[26] F. Luk and H. Park. On parallel Jacobi orderings. SIAM J. Sci. Stat. Comput., 10(1):18–26, 1989.
[27] P. Henrici. On the speed of convergence of cyclic and quasicyclic Jacobi methods for computing eigenvalues of Hermitian matrices. J. Soc. Ind. Appl. Math., 6:144–162, 1958.
[28] A. Sameh. Solving the Linear Least Squares Problem on a Linear Array of Processors. Academic Press, New York, 1985, pp. 191–200.
[29] H.F. Kaiser. The JK method: a procedure for finding the eigenvectors and eigenvalues of a real symmetric matrix. Comput. J., 15(33):271–273, 1972.
[30] J.C. Nash. A one-sided transformation method for the singular value decomposition and algebraic eigenproblems. Comput. J., 18(1):74–76, 1975.
[31] F.T. Luk. Computing the singular value decomposition on the Illiac IV. ACM Trans. Math. Soft., 6(4):524–539, 1980.
[32] R.P. Brent and F.T. Luk. The solution of singular value and symmetric eigenproblems on multiprocessor arrays. SIAM J. Sci. Stat. Comput., 6:69–84, 1985.
[33] R.P. Brent, F.T. Luk, and C. Van Loan. Computation of the singular value decomposition using mesh connected processors. VLSI Comput. Syst., 1(3):242–270, 1985.


[34] M.W. Berry and A.H. Sameh. An overview of parallel algorithms for the singular value and symmetric eigenvalue problems. J. Comput. Appl. Math., 27:191–213, 1989.
[35] J.-P. Charlier, M. Vanbegin, and P. Van Dooren. On efficient implementations of Kogbetliantz's algorithm for computing the singular value decomposition. Numer. Math., 52:279–300, 1988.
[36] C. Paige and P. Van Dooren. On the quadratic convergence of Kogbetliantz's algorithm for computing the singular value decomposition. Linear Algebra Appl., 77:301–313, 1986.
[37] J.-P. Charlier and P. Van Dooren. On Kogbetliantz's SVD algorithm in the presence of clusters. Linear Algebra Appl., 95:135–160, 1987.
[38] J.H. Wilkinson. Almost diagonal matrices with multiple or close eigenvalues. Linear Algebra Appl., 1:1–12, 1968.
[39] M. Bečka and M. Vajteršic. Block–Jacobi SVD algorithms for distributed memory systems I. Parallel Algorithms Appl., 13:265–267, 1999.
[40] M. Bečka and M. Vajteršic. Block–Jacobi SVD algorithms for distributed memory systems II. Parallel Algorithms Appl., 14:37–56, 1999.
[41] M. Bečka, G. Okša, and M. Vajteršic. Dynamic ordering for a Block–Jacobi SVD algorithm. Parallel Comput., 28:243–262, 2002.
[42] I. Duff, A. Erisman, and J. Reid. Direct Methods for Sparse Matrices. Oxford Science Publications, Oxford, 1992.
[43] Y. Saad. Iterative Methods for Sparse Linear Systems. PWS Publishing Company, Boston, MA, 1996.
[44] P.R. Amestoy, I.S. Duff, and C. Puglisi. Multifrontal QR factorization in a multiprocessor environment. Numer. Linear Algebra Appl., 3:275–300, 1996.
[45] P.R. Amestoy, I.S. Duff, and J.Y. L'Excellent. Multifrontal parallel distributed symmetric and unsymmetric solvers. Comput. Meth. Appl. Mech. Eng., 184(2–4):501–520, 2000.
[46] I. Brainman and S. Toledo. Nested-dissection orderings for sparse LU with partial pivoting. SIAM J. Matrix Anal. Appl., 23(4):998–1012, 2002.
[47] T.A. Davis and I.S. Duff. A combined unifrontal/multifrontal method for unsymmetric sparse matrices. ACM Trans. Math. Software, 25:1–19, 1999.
[48] J.J. Dongarra, I.S. Duff, D.S. Sorensen, and H.A. van der Vorst. Solving Linear Systems on Vector and Shared Memory Computers. SIAM, Philadelphia, 1991.
[49] J.W. Demmel, S.C. Eisenstat, J.R. Gilbert, X.S. Li, and J.W.H. Liu. A supernodal approach to sparse partial pivoting. SIAM J. Matrix Anal. Appl., 20(3):720–755, 1999.
[50] I.S. Duff. A review of frontal methods for solving linear systems. Comput. Phys. Commun., 97(1–2):45–52, 1996.
[51] J.R. Gilbert, X.S. Li, E.G. Ng, and B.W. Peyton. Computing row and column counts for sparse QR and LU factorization. BIT, 41(4):693–710, 2001.
[52] T.A. Davis. UMFPACK version 4.0. http://www.cise.ufl.edu/research/sparse/umfpack, April 2002.
[53] X.S. Li. SuperLU version 2. http://www.nersc.gov/~xiaoye/SuperLU/, September 1999.
[54] P.R. Amestoy, I.S. Duff, J.Y. L'Excellent, J. Koster, and M. Tuma. MUMPS 4.1.6. http://www.enseeiht.fr/lima/apo/MUMPS/, March 2000.
[55] S.F. Ashby, T.A. Manteuffel, and P.E. Saylor. A taxonomy for conjugate gradient methods. SIAM J. Numer. Anal., 26:1542–1568, 1990.
[56] R. Barrett, M. Berry, T. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, and H. van der Vorst. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, 2nd ed. SIAM, Philadelphia, PA, 1994.
[57] W. Hackbusch. Iterative Solution of Large Sparse Systems of Equations, Vol. 95 of Applied Mathematical Sciences. Springer-Verlag, Heidelberg, 1994.
[58] G. Meurant. Computer Solution of Large Linear Systems. North-Holland, Amsterdam, 1999.
[59] C. Paige and M. Saunders. Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal., 12:617–629, 1975.
[60] Y. Saad and M.H. Schultz. GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 7:856–869, 1986.
[61] H.A. van der Vorst. Bi-CGSTAB: a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 13:631–644, 1992.


[62] R. Freund and N. Nachtigal. QMR: a quasi-minimal residual method for non-Hermitian linear systems. Numer. Math., 60:315–339, 1991.
[63] R. Freund. A transpose-free quasi-minimal residual algorithm for non-Hermitian linear systems. SIAM J. Sci. Comput., 14:470–482, 1993.
[64] G.H. Golub and W. Kahan. Calculating the singular values and pseudo-inverse of a matrix. SIAM J. Numer. Anal., 2(3):205–224, 1965.
[65] G.H. Golub and C. Reinsch. Singular Value Decomposition and Least Squares Solutions. Springer-Verlag, Heidelberg, 1971.
[66] M.R. Hestenes. Inversion of matrices by biorthogonalization and related results. J. Soc. Ind. Appl. Math., 6:51–90, 1958.
[67] B.N. Parlett. The Symmetric Eigenvalue Problem. Prentice Hall, New York, 1980.
[68] F.L. Bauer. Das Verfahren der Treppeniteration und verwandte Verfahren zur Lösung algebraischer Eigenwertprobleme. ZAMP, 8:214–235, 1957.
[69] H. Rutishauser. Simultaneous iteration method for symmetric matrices. Numer. Math., 16:205–223, 1970.
[70] K. Gallivan, W. Jalby, and U. Meier. The use of BLAS3 in linear algebra on a parallel processor with a hierarchical memory. SIAM J. Sci. Stat. Comput., 18(6):1079–1084, 1987.
[71] H. Rutishauser. Computational aspects of F.L. Bauer's simultaneous iteration method. Numer. Math., 13:4–13, 1969.
[72] B. Smith, J. Boyce, J. Dongarra, B. Garbow, Y. Ikebe, V. Klema, and C. Moler. Matrix Eigensystem Routines – EISPACK Guide, 2nd ed. Springer-Verlag, Heidelberg, 1976.
[73] J. Dongarra and D. Sorensen. A fast algorithm for the symmetric eigenvalue problem. SIAM J. Sci. Stat. Comput., 8(2):s139–s154, 1987.
[74] C.C. Paige. Error analysis of the Lanczos algorithm for tridiagonalizing a symmetric matrix. J. Inst. Math. Appl., 18:341–349, 1976.
[75] J. Grcar. Analysis of the Lanczos Algorithm and of the Approximation Problem in Richardson's Method, Ph.D. Thesis. Technical report, The University of Illinois, Chicago, 1981.
[76] B.N. Parlett and D.S. Scott. The Lanczos algorithm with selective reorthogonalization. Math. Comput., 33:217–238, 1979.
[77] H. Simon. Analysis of the symmetric Lanczos algorithm with reorthogonalization methods. Linear Algebra Appl., 61:101–131, 1984.
[78] S. Lo, B. Philippe, and A.H. Sameh. A multiprocessor algorithm for the symmetric tridiagonal eigenvalue problem. SIAM J. Sci. Stat. Comput., 8(2):s155–s165, 1987.
[79] G.H. Golub and C. Van Loan. Matrix Computations, 2nd ed. Johns Hopkins University Press, Baltimore, MD, 1989.
[80] R.R. Underwood. An Iterative Block Lanczos Method for the Solution of Large Sparse Symmetric Eigenproblems, Ph.D. Thesis. Technical report, Stanford University, Stanford, 1975.
[81] G.H. Golub and R.R. Underwood. The Block Lanczos Method for Computing Eigenvalues. Academic Press, New York, 1977, pp. 361–377.
[82] G.H. Golub, F.T. Luk, and M.L. Overton. A block Lanczos method for computing the singular values and corresponding singular vectors of a matrix. ACM Trans. Math. Software, 7(2):149–169, 1981.
[83] J. Dongarra, J. Du Croz, S. Hammarling, and R. Hanson. An extended set of FORTRAN basic linear algebra subprograms. ACM Trans. Math. Software, 14(1):1–17, 1988.
[84] W. Jalby and B. Philippe. Stability analysis and improvement of the block Gram–Schmidt algorithm. SIAM J. Sci. Stat. Comput., 12(5):1058–1073, 1991.
[85] A.H. Sameh and J.A. Wisniewski. A trace minimization algorithm for the generalized eigenvalue problem. SIAM J. Numer. Anal., 19(6):1243–1259, 1982.
[86] J.H. Wilkinson. The Algebraic Eigenvalue Problem. Clarendon Press, Oxford, 1965.
[87] M.W. Berry, B. Parlett, and A.H. Sameh. Computing extremal singular triplets of sparse matrices on a shared-memory multiprocessor. Int. J. High Speed Comput., 2:239–275, 1994.
[88] D.G. Luenberger. Introduction to Linear and Nonlinear Programming. Addison-Wesley, Reading, MA, 1973.


[89] J.H. Wilkinson. Inverse Iteration in Theory and in Practice. Academic Press, New York, 1972.
[90] E. Kokiopoulou, C. Bekas, and E. Gallopoulos. Computing Smallest Singular Triplets with Implicitly Restarted Lanczos Bidiagonalization. Technical report HPCLAB-TR-17-02-03, Computer Eng. & Informatics Dept.; Appl. Numer. Math., 2003, to appear.
[91] J.O. Aasen. On the reduction of a symmetric matrix to tridiagonal form. BIT, 11:233–242, 1971.
[92] J.R. Bunch and L. Kaufman. Some stable methods for calculating inertia and solving symmetric linear systems. Math. Comput., 31:162–179, 1977.
[93] B.N. Parlett and J.K. Reid. On the solution of a system of linear equations whose matrix is symmetric but not definite. BIT, 10:386–397, 1970.
[94] C.C. Paige and M.A. Saunders. Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal., 12(4):617–629, 1975.
[95] E.R. Davidson. The iterative calculation of a few of the lowest eigenvalues and corresponding eigenvectors of large real-symmetric matrices. J. Comput. Phys., 17:87–94, 1975.
[96] R.B. Morgan and D.S. Scott. Generalizations of Davidson's method for computing eigenvalues of sparse symmetric matrices. SIAM J. Sci. Stat. Comput., 7:817–825, 1986.
[97] M. Crouzeix, B. Philippe, and M. Sadkane. The Davidson method. SIAM J. Sci. Comput., 15(1):62–76, 1994.
[98] G.L.G. Sleijpen and H.A. van der Vorst. A Jacobi–Davidson iteration method for linear eigenvalue problems. SIAM J. Matrix Anal. Appl., 17(2):401–425, 1996. Also SIGEST in SIAM Rev., 42(2):267–293.
[99] A. Sameh and Z. Tong. The trace minimization method for the symmetric generalized eigenvalue problem. J. Comput. Appl. Math., 123:155–175, 2000.
[100] V. Simoncini and L. Eldén. Inexact Rayleigh quotient-type methods for eigenvalue computations. BIT, 42(1):159–182, 2002.
[101] B. Philippe and M. Sadkane. Computation of the fundamental singular subspace of a large matrix. Linear Algebra Appl., 257:77–104, 1997.
[102] M.E. Hochstenbach. A Jacobi–Davidson type SVD method. SIAM J. Sci. Comput., 23(2):606–628, 2001.
[103] V. Heuveline, B. Philippe, and M. Sadkane. Parallel computation of spectral portrait of large matrices by Davidson type methods. Numer. Algorithms, 16(1):55–75, 1997.
[104] R. Sidje and B. Philippe. Parallel Krylov subspace basis computation. In J. Tankoano, Ed., CARI'94 Proc., Ouagadougou. INRIA–ORSTOM, ORSTOM Editions, Paris, 1994, pp. 421–440.
[105] P.R. Amestoy, I.S. Duff, and C. Puglisi. Multifrontal QR factorization in a multiprocessor environment. Numer. Linear Algebra Appl., 3:275–300, 1996.
[106] L.N. Trefethen. Computation of pseudospectra. Acta Numer., 247–295, 1999.
[107] S.H. Lui. Computation of pseudospectra by continuation. SIAM J. Sci. Comput., 18(2):565–573, 1997.
[108] L.N. Trefethen and T. Wright. Large-scale Computation of Pseudospectra using ARPACK and Eigs. Technical report, Oxford University Computing Laboratory, Oxford, June 2000.
[109] D. Mezher and B. Philippe. PAT – a reliable path-following algorithm. Numer. Algorithms, 29:131–152, 2002.

