Algorithms and Perturbation Theory for Matrix Eigenvalue Problems and the Singular Value Decomposition

By

YUJI NAKATSUKASA
B.S. (University of Tokyo) 2005
M.S. (University of Tokyo) 2007

DISSERTATION

Submitted in partial satisfaction of the requirements for the degree of

DOCTOR OF PHILOSOPHY

in

APPLIED MATHEMATICS

in the

OFFICE OF GRADUATE STUDIES

of the

UNIVERSITY OF CALIFORNIA DAVIS

Approved:

Roland Freund (Chair)
François Gygi
Naoki Saito

Committee in Charge

2011

Contents

Abstract v
Acknowledgments vi

Chapter 1. Introduction 1

Chapter 2. Overview and summary of contributions 4
2.1. Notations 4
2.2. The symmetric eigendecomposition and the singular value decomposition 5
2.3. The polar decomposition 6
2.4. Backward stability of an algorithm 7
2.5. The dqds algorithm 8
2.6. Aggressive early deflation 10
2.7. Hermitian eigenproblems 13
2.8. Generalized eigenvalue problems 14
2.9. Eigenvalue first-order expansion 16
2.10. Perturbation of eigenvectors 17
2.11. Gerschgorin's theorem 21

Part 1. Algorithms for matrix decompositions 22

Chapter 3. Communication minimizing algorithm for the polar decomposition 23
3.1. Newton-type methods 25
3.2. Halley's method and dynamically weighted Halley 27
3.3. QR-based implementations 33
3.4. Backward stability proof 36
3.5. Numerical examples 48
3.6. Solving the max-min problem 50
3.7. Application in molecular dynamics simulations 56

Chapter 4. Efficient, communication minimizing algorithm for the symmetric eigenvalue problem 59
4.1. Algorithm QDWH-eig 60
4.2. Practical considerations 66
4.3. Numerical experiments 69

Chapter 5. Efficient, communication minimizing algorithm for the SVD 73
5.1. Algorithm QDWH-SVD 73
5.2. Practical considerations 75
5.3. Numerical experiments 76

Chapter 6. dqds with aggressive early deflation for computing singular values of bidiagonal matrices 80
6.1. Aggressive early deflation for dqds - version 1: Aggdef(1) 81
6.2. Aggressive early deflation for dqds - version 2: Aggdef(2) 83
6.3. Convergence analysis 98
6.4. Numerical experiments 102

Part 2. Eigenvalue perturbation theory 108

Chapter 7. Eigenvalue perturbation bounds for Hermitian block tridiagonal matrices 109
7.1. Basic approach 110
7.2. 2-by-2 block case 111
7.3. Block tridiagonal case 114
7.4. Two case studies 118
7.5. Effect of the presence of multiple eigenvalues 123

Chapter 8. Perturbation of generalized eigenvalues 126
8.1. Absolute Weyl theorem 127
8.2. Relative Weyl theorem 130
8.3. Quadratic perturbation bounds for Hermitian definite pairs 132
8.4. An extension to non-Hermitian pairs 146

Chapter 9. Perturbation and condition numbers of a multiple generalized eigenvalue 148
9.1. Perturbation bounds for multiple generalized eigenvalues 150
9.2. Another explanation 154
9.3. Implication in the Rayleigh-Ritz process 157
9.4. Condition numbers of a multiple generalized eigenvalue 161
9.5. Hermitian definite pairs 163
9.6. Non-Hermitian pairs 165
9.7. Multiple singular value 170

Chapter 10. Perturbation of eigenvectors 172
10.1. The tan θ theorem under relaxed conditions 173
10.2. The generalized tan θ theorem with relaxed conditions 177
10.3. Refined Rayleigh-Ritz approximation bound 178
10.4. New bounds for the angles between Ritz vectors and exact eigenvectors 179
10.5. Singular vectors 183
10.6. The cos θ theorem 187

Chapter 11. Gerschgorin theory 192
11.1. A new Gerschgorin-type eigenvalue inclusion set 194
11.2. Examples and applications 199
11.3. Gerschgorin theorems for generalized eigenproblems 202
11.4. Examples 213
11.5. Applications 214

Chapter 12. Summary and future work 219

Bibliography 220

Yuji Nakatsukasa
September 2011
Applied Mathematics

Algorithms and Perturbation Theory for Matrix Eigenvalue Problems and the Singular Value Decomposition

Abstract

This dissertation is about algorithmic and theoretical developments for eigenvalue problems in numerical linear algebra.
The first part of this dissertation proposes algorithms for two important matrix decompositions: the symmetric eigenvalue decomposition and the singular value decomposition. Recent advances and changes in computational architectures have made it necessary for basic linear algebra algorithms to be well-adapted for parallel computing. A few decades ago, an algorithm was considered faster if it required fewer arithmetic operations. This is no longer the case, and it is now vital to minimize both arithmetic and communication when designing algorithms that are well-suited for high performance scientific computing. Unfortunately, for the above two matrix decompositions, no known algorithm minimizes communication without requiring significantly more arithmetic. The development of such algorithms is the main theme of the first half of the dissertation. Our algorithms have great potential as the future approach to computing these matrix decompositions.

The second part of this dissertation explores eigenvalue perturbation theory. Besides being of theoretical interest, perturbation theory is a useful tool that plays important roles in many applications. For example, it is frequently employed in the stability analysis of a numerical algorithm, for examining whether a given problem is well-conditioned under perturbation, or for bounding errors of a computed solution. However, there are a number of phenomena that still cannot be explained by existing theory. We make contributions by deriving refined eigenvalue perturbation bounds for Hermitian block tridiagonal matrices and generalized Hermitian eigenproblems, giving explanations for the perturbation behavior of a multiple generalized eigenvalue, presenting refined eigenvector perturbation bounds, and developing new Gerschgorin-type eigenvalue inclusion sets.

Acknowledgments

Roland Freund advised me throughout this dissertation work, and I am grateful for his support and encouragement, and for sharing his mathematical insights.
His view of numerical linear algebra greatly influenced my perspective on research directions.

A number of individuals have given me valuable suggestions and advice during my graduate studies. With Kensuke Aishima I had exciting exchanges on the dqds algorithm. I thank Zhaojun Bai for many discussions and for introducing me to the polar decomposition. François Gygi introduced me to the computational aspects of molecular dynamics. Nick Higham always inspired me with his insights and knowledge, and gave me countless suggestions on a number of my manuscripts. Ren-Cang Li provided his expertise on eigenvalue perturbation theory, and discussions with him have always been exciting and illuminating. Beresford Parlett shared his wisdom on tridiagonal and bidiagonal matrices. Efrem Rensi made our office an excellent place to work, and I enjoyed many discussions with him inside and outside the office. Naoki Saito introduced me to Laplacian eigenvalues, and provided help beyond mathematics, always being attentive despite his busy schedule. Françoise Tisseur generously spared her time to go over my manuscripts and presentations and gave me invaluable suggestions. Ichitaro Yamazaki offered great help with coding, especially for the work on dqds. Shao-Liang Zhang introduced me to the field of numerical linear algebra, and has been understanding and welcoming whenever I visited Japan.

Beyond my studies, I would like to extend my special acknowledgment to Celia Davis, who provided me with inestimable support and advice during difficult times. And last but not least, I would like to thank my family for always being incredibly patient, understanding and supportive. Without their support this work would not have been completed.

This work was partially supported by NSF grant OCI-0749217 and DOE grant DE-FC02-06ER25794.
CHAPTER 1

Introduction

This dissertation is about the design and perturbation analysis of numerical algorithms for matrix eigenvalue problems and the singular value decomposition. The most computationally challenging part of many problems in scientific computing is often to solve large-scale matrix equations, the two most frequently arising of which are linear systems of equations and eigenvalue problems. The rapid development of computer technology makes possible simulations and computations of ever larger scale. It is vital that the algorithms are well-suited for the computing architecture to make full use of the computational power, and that we understand the underlying theory to ensure the computed quantities are meaningful and reliable. These two issues are the fundamental motivation for this dissertation work.

For a numerical algorithm to make the most of the computational resources, not only does it need to perform as few arithmetic operations as possible, it must also be suitable for parallel or pipelined processing. In particular, on the emerging multicore and heterogeneous computing systems, communication costs have exceeded arithmetic costs by orders of magnitude, and the gap is growing exponentially over time. Hence it is vital to minimize both arithmetic and communication when designing algorithms that are well-suited for high performance scientific computing.

The asymptotic communication lower bound is analyzed in [9], and algorithms that attain these asymptotic lower bounds are said to minimize communication. There has been much progress in the development of such algorithms in numerical linear algebra, and communication-minimizing implementations are now known for many basic matrix operations such as matrix multiplication, Cholesky factorization and QR decomposition [9]. However, the reduction in communication cost sometimes comes at the expense of significantly more arithmetic. This includes the existing