A Tour of the Lanczos Algorithm and Its Convergence Guarantees Through the Decades


Qiaochu Yuan
Department of Mathematics, UC Berkeley
Joint work with Prof. Ming Gu and Bo Li
April 17, 2018
(Qiaochu Yuan, Tour of Lanczos, April 17, 2018, 43 slides)

Timeline

- (1950) Lanczos iteration
- (1965) Lanczos algorithm for SVD
- (1966, 1971, 1980) Convergence bounds for Lanczos
- (1974, 1977) Block Lanczos algorithm
- (1975, 1980) Convergence bounds for Block Lanczos
- (2009, 2010, 2015) Randomized Block Lanczos algorithm
- (2015) Convergence bounds for Randomized Block Lanczos

Our work:
1) Convergence bounds for Randomized Block Lanczos (RBL) with arbitrary block size b
2) Superlinear convergence for typical data matrices

Lanczos Iteration Algorithm

Developed by Lanczos in 1950 [Lan50]. A widely used iterative algorithm for computing the extremal eigenvalues, and corresponding eigenvectors, of a large, sparse, symmetric matrix A.
Goal

Given a symmetric matrix $A \in \mathbb{R}^{n \times n}$, with eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$ and associated eigenvectors $u_1, \dots, u_n$, we want to find approximations for

- $\lambda_i$, $i = 1, \dots, k$, the k largest eigenvalues of A,
- $u_i$, $i = 1, \dots, k$, the associated eigenvectors,

where $k \ll n$.

Lanczos - Details

General idea:
1. Select an initial vector v.
2. Construct the Krylov subspace $K(A, v, k) = \mathrm{span}\{v, Av, A^2 v, \dots, A^{k-1} v\}$.
3. Restrict and project A to the Krylov subspace: $T = \mathrm{proj}_K A|_K$.
4. Use the eigenvalues and eigenvectors of T as approximations to those of A.

In matrices:
$$K_k = \begin{bmatrix} v & Av & \cdots & A^{k-1} v \end{bmatrix} \in \mathbb{R}^{n \times k}$$
$$Q_k = \begin{bmatrix} q_1 & q_2 & \cdots & q_k \end{bmatrix} = \mathrm{qr}(K_k)$$
$$T_k = Q_k^T A Q_k \in \mathbb{R}^{k \times k}$$

The projected matrix is tridiagonal:
$$A \begin{bmatrix} q_1 & \cdots & q_j \end{bmatrix} = \begin{bmatrix} q_1 & \cdots & q_j & q_{j+1} \end{bmatrix} \begin{bmatrix} \alpha_1 & \beta_1 & & \\ \beta_1 & \ddots & \ddots & \\ & \ddots & \ddots & \beta_{j-1} \\ & & \beta_{j-1} & \alpha_j \\ & & & \beta_j \end{bmatrix}$$

At each step $j = 1, \dots, k$ of the Lanczos iteration:
$$A Q_j = Q_j T_j + \beta_j q_{j+1} e_j^T$$

Use the three-term recurrence
$$A q_j = \beta_{j-1} q_{j-1} + \alpha_j q_j + \beta_j q_{j+1}$$

and calculate the $\alpha$'s and $\beta$'s as
$$\alpha_j = q_j^T A q_j \quad (1)$$
$$r_j = (A - \alpha_j I)\, q_j - \beta_{j-1} q_{j-1} \quad (2)$$
$$\beta_j = \|r_j\|_2, \qquad q_{j+1} = r_j / \beta_j \quad (3)$$

Convergence of Lanczos

How well do $\lambda_i^{(k)}$, the eigenvalues of $T_k$, approximate $\lambda_i$, the eigenvalues of A, for $i = 1, \dots, k$? This was first answered by Kaniel in 1966 [Kan66] and Paige in 1971 [Pai71].
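The recurrence (1)-(3) is short enough to sketch directly. Below is a minimal NumPy implementation (the function name and test matrix are illustrative, not from the talk; a full reorthogonalization step is added for numerical stability, which the bare three-term recurrence omits):

```python
import numpy as np

def lanczos(A, v, k):
    """Run k steps of the Lanczos iteration on a symmetric matrix A.

    Returns the orthonormal basis Q (n x k) and the k x k tridiagonal
    projection T_k, built from the three-term recurrence (1)-(3).
    """
    n = A.shape[0]
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)
    q = v / np.linalg.norm(v)
    q_prev = np.zeros(n)
    b_prev = 0.0
    for j in range(k):
        Q[:, j] = q
        w = A @ q
        alpha[j] = q @ w                          # alpha_j = q_j^T A q_j
        r = w - alpha[j] * q - b_prev * q_prev    # r_j = (A - alpha_j I) q_j - beta_{j-1} q_{j-1}
        r -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ r)  # reorthogonalize (not in the bare recurrence)
        beta[j] = np.linalg.norm(r)               # beta_j = ||r_j||_2
        if beta[j] == 0.0:                        # invariant subspace found; stop early
            Q, alpha, beta = Q[:, :j + 1], alpha[:j + 1], beta[:j + 1]
            break
        q_prev, b_prev = q, beta[j]
        q = r / beta[j]                           # q_{j+1} = r_j / beta_j
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return Q, T

# The largest Ritz value (eigenvalue of T_k) approximates lambda_1.
rng = np.random.default_rng(0)
evals = np.concatenate(([2.0], rng.uniform(0.0, 1.0, 199)))
A = np.diag(evals)                                # known spectrum, lambda_1 = 2
Q, T = lanczos(A, rng.standard_normal(200), 25)
ritz_max = np.linalg.eigvalsh(T)[-1]
```

With the large gap between $\lambda_1 = 2$ and the rest of the spectrum, 25 steps drive the largest Ritz value to $\lambda_1$ far faster than the power method would.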
Theorem (Kaniel-Paige Inequality)

If v is chosen to be not orthogonal to the eigenspace associated with $\lambda_1$, then
$$0 \leq \lambda_1 - \lambda_1^{(k)} \leq (\lambda_1 - \lambda_n)\, \frac{\tan^2 \theta(u_1, v)}{T_{k-1}^2(\gamma_1)} \quad (4)$$
where $T_i(x)$ is the Chebyshev polynomial of degree i, $\theta(\cdot, \cdot)$ is the angle between two vectors, and
$$\gamma_1 = 1 + 2\, \frac{\lambda_1 - \lambda_2}{\lambda_2 - \lambda_n}.$$

Convergence of Lanczos

This was later generalized by Saad in 1980 [Saa80].

Theorem (Saad Inequality)

For $i = 1, \dots, k$, if v is chosen such that $u_i^T v \neq 0$, then
$$0 \leq \lambda_i - \lambda_i^{(k)} \leq (\lambda_i - \lambda_n) \left( \frac{L_i^{(k)} \tan \theta(u_i, v)}{T_{k-i}(\gamma_i)} \right)^2 \quad (5)$$
where $T_i(x)$ is the Chebyshev polynomial of degree i, $\theta(\cdot, \cdot)$ is the angle between two vectors, and
$$\gamma_i = 1 + 2\, \frac{\lambda_i - \lambda_{i+1}}{\lambda_{i+1} - \lambda_n}, \qquad L_i^{(k)} = \begin{cases} \prod_{j=1}^{i-1} \dfrac{\lambda_j^{(k)} - \lambda_n}{\lambda_j^{(k)} - \lambda_i} & \text{if } i \neq 1 \\ 1 & \text{if } i = 1 \end{cases}$$

Aside - Chebyshev Polynomials

Recall
$$T_j(x) = \frac{1}{2} \left[ \left( x + \sqrt{x^2 - 1} \right)^j + \left( x - \sqrt{x^2 - 1} \right)^j \right] \quad (6)$$
When j is large,
$$T_j(x) \approx \frac{1}{2} \left( x + \sqrt{x^2 - 1} \right)^j,$$
and when g is small,
$$T_j(1 + g) \approx \frac{1}{2} \left( 1 + g + \sqrt{2g} \right)^j.$$
This growth determines the convergence of Lanczos, with
$$g = \Theta\left( \frac{\lambda_i - \lambda_{i+1}}{\lambda_{i+1} - \lambda_n} \right).$$

[Figure: growth of the Chebyshev polynomial (Lanczos) vs. the monomial (power method).]

Block Lanczos Algorithm

Introduced by Golub and Underwood in 1977 [GU77] and Cullum and Donath in 1974 [CD74]. The block generalization of the Lanczos method uses, instead of a single initial vector v, a block of b vectors $V = \begin{bmatrix} v_1 & \cdots & v_b \end{bmatrix}$, and builds the Krylov subspace in q iterations as
$$K_q(A, V, q) = \mathrm{span}\{V, AV, \dots, A^{q-1} V\}, \qquad k \leq b, \quad bq \ll n.$$
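The closed form (6) and the small-gap approximation behind these bounds can be checked numerically. A small sketch (function names are illustrative, not from the talk):

```python
import numpy as np

def cheb_closed(j, x):
    """Chebyshev polynomial T_j(x) via the closed form (6), for x >= 1."""
    s = np.sqrt(x * x - 1.0)
    return 0.5 * ((x + s) ** j + (x - s) ** j)

def cheb_recur(j, x):
    """Same T_j(x) via the three-term recurrence T_{j+1} = 2x T_j - T_{j-1}."""
    t_prev, t = 1.0, x
    for _ in range(j - 1):
        t_prev, t = t, 2.0 * x * t - t_prev
    return t if j >= 1 else 1.0

g, j = 0.05, 40
exact = cheb_closed(j, 1.0 + g)
approx = 0.5 * (1.0 + g + np.sqrt(2.0 * g)) ** j   # small-g approximation of T_j(1+g)
monomial = (1.0 + g) ** j                          # the growth the power method achieves
```

Even for a 5% spectral gap, `exact` exceeds `monomial` by orders of magnitude after 40 steps: the square-root term $\sqrt{2g}$ in the exponent base is exactly what makes Krylov methods beat power iteration.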
Compared to classical Lanczos, block Lanczos:

- is more memory- and cache-efficient, using BLAS3 operations;
- has the ability to converge to eigenvalues with cluster size > 1;
- has faster convergence with respect to the number of iterations.

Block Lanczos Algorithm - Details

The projected matrix is block tridiagonal:
$$A \begin{bmatrix} Q_1 & \cdots & Q_j \end{bmatrix} = \begin{bmatrix} Q_1 & \cdots & Q_j & Q_{j+1} \end{bmatrix} \begin{bmatrix} A_1 & B_1^T & & \\ B_1 & \ddots & \ddots & \\ & \ddots & \ddots & B_{j-1}^T \\ & & B_{j-1} & A_j \\ & & & B_j \end{bmatrix}$$

At each step $j = 1, \dots, k$ of the block Lanczos iteration:
$$A Q = Q T_j + Q_{j+1} \begin{bmatrix} 0 & \cdots & 0 & B_j \end{bmatrix}$$

Use the three-term recurrence
$$A Q_j = Q_{j-1} B_{j-1}^T + Q_j A_j + Q_{j+1} B_j$$

and calculate the $A$'s and $B$'s as
$$A_j = Q_j^T A Q_j \quad (7)$$
$$Q_{j+1} B_j = \mathrm{qr}\left( A Q_j - Q_j A_j - Q_{j-1} B_{j-1}^T \right) \quad (8)$$

Convergence of Block Lanczos

First analyzed by Underwood in 1975 [Und75] with the introduction of the algorithm.

Theorem (Underwood Inequality)

Let $\lambda_i$, $u_i$, $i = 1, \dots, n$ be the eigenvalues and eigenvectors of A, respectively. Let $U = \begin{bmatrix} u_1 & \cdots & u_b \end{bmatrix}$. If $U^T V$ is of full rank b, then for $i = 1, \dots, b$
$$0 \leq \lambda_i - \lambda_i^{(q)} \leq (\lambda_1 - \lambda_n)\, \frac{\tan^2 \Theta(U, V)}{T_{q-1}^2(\rho_i)} \quad (9)$$
where $T_i(x)$ is the Chebyshev polynomial of degree i, $\cos \Theta(U, V) = \sigma_{\min}(U^T V)$, and
$$\rho_i = 1 + 2\, \frac{\lambda_i - \lambda_{b+1}}{\lambda_{b+1} - \lambda_n}.$$

Convergence of Block Lanczos

Theorem (Saad Inequality [Saa80])

Let $\lambda_j$, $u_j$, $j = 1, \dots, n$ be the eigenvalues and eigenvectors of A, respectively. Let $U_i = \begin{bmatrix} u_i & \cdots & u_{i+b-1} \end{bmatrix}$.
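The block recurrence (7)-(8) can be sketched in a few lines of NumPy (illustrative names, not from the talk; as before, full reorthogonalization is added for stability):

```python
import numpy as np

def block_lanczos(A, V, q):
    """Run q steps of block Lanczos on symmetric A with start block V (n x b).

    Returns the orthonormal basis [Q_1 ... Q_q] and its block-tridiagonal
    Rayleigh-Ritz projection, built from the block recurrence (7)-(8).
    """
    n, b = V.shape
    Q, _ = np.linalg.qr(V)               # Q_1 from a QR of the start block
    Q_prev = np.zeros((n, b))
    B_prev = np.zeros((b, b))
    blocks = []
    for _ in range(q):
        blocks.append(Q)
        W = A @ Q
        Aj = Q.T @ W                     # A_j = Q_j^T A Q_j                        (7)
        R = W - Q @ Aj - Q_prev @ B_prev.T
        basis = np.hstack(blocks)
        R -= basis @ (basis.T @ R)       # reorthogonalize against all earlier blocks
        Q_next, B = np.linalg.qr(R)      # Q_{j+1} B_j = qr(AQ_j - Q_j A_j - Q_{j-1} B_{j-1}^T)  (8)
        Q_prev, B_prev, Q = Q, B, Q_next
    basis = np.hstack(blocks)
    T = basis.T @ A @ basis
    return basis, T

# A clustered top triple is captured at once with block size b = 3,
# which single-vector Lanczos handles poorly.
rng = np.random.default_rng(1)
evals = np.concatenate(([3.0, 2.99, 2.98], rng.uniform(0.0, 1.0, 97)))
A = np.diag(evals)
basis, T = block_lanczos(A, rng.standard_normal((100, 3)), 10)
top3 = np.linalg.eigvalsh(T)[-3:]        # approximates 2.98, 2.99, 3.0
```

Because the cluster size (3) does not exceed the block size, all three clustered eigenvalues converge at the rate set by the gap to $\lambda_4$, as in Underwood's bound.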
For $i = 1, \dots, b$, if $U_i^T V$ is of full rank b, then
$$0 \leq \lambda_i - \lambda_i^{(q)} \leq (\lambda_i - \lambda_n) \left( \frac{L_i^{(q)} \tan \Theta(U_i, V)}{T_{q-i}(\hat{\gamma}_i)} \right)^2 \quad (10)$$
where $T_i(x)$ is the Chebyshev polynomial of degree i, $\cos \Theta(U, V) = \sigma_{\min}(U^T V)$, and
$$\hat{\gamma}_i = 1 + 2\, \frac{\lambda_i - \lambda_{i+b}}{\lambda_{i+b} - \lambda_n}, \qquad L_i^{(q)} = \begin{cases} \prod_{j=1}^{i-1} \dfrac{\lambda_j^{(q)} - \lambda_n}{\lambda_j^{(q)} - \lambda_i} & \text{if } i \neq 1 \\ 1 & \text{if } i = 1 \end{cases}$$

Aside - Block Size

Recall that when j is large and g is small,
$$T_j(1 + g) \approx \frac{1}{2} \left( 1 + g + \sqrt{2g} \right)^j \quad (11)$$

classical: $g = \Theta\left( \dfrac{\lambda_i - \lambda_{i+1}}{\lambda_{i+1} - \lambda_n} \right)$, block: $g_b = \Theta\left( \dfrac{\lambda_i - \lambda_{i+b}}{\lambda_{i+b} - \lambda_n} \right)$.

Suppose the eigenvalues are distributed such that $\lambda_j > (1 + \epsilon) \lambda_{j+1}$ for all j. Then
$$g_b \approx \frac{1 - (1 + \epsilon)^{-b}}{(1 + \epsilon)^{-b}} \approx b \cdot \epsilon \approx b \cdot g,$$
so blocking enlarges the effective gap by roughly a factor of b.

[Figure: Chebyshev (Lanczos) vs. monomial (power method) growth.]

Randomized Block Lanczos

In recent years, there has been increased interest in algorithms that compute low-rank approximations of matrices, with

- a high computational-efficiency requirement, running on large matrices;
- a low approximation-accuracy requirement, 2-3 digits of accuracy.

Applications are mostly in big-data computations: compression of data matrices, matrix processing techniques, e.g.
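As a hedged sketch of this low-rank-approximation setting (not the talk's algorithm): start from a Gaussian random block, form the block Krylov space, and Rayleigh-Ritz project to get a rank-k approximation with a few digits of accuracy. All names below are illustrative.

```python
import numpy as np

def rand_block_krylov(A, k, b, q, rng):
    """Rank-k approximation of a symmetric matrix A from the randomized
    block Krylov space span{V, AV, ..., A^(q-1) V} with Gaussian V (n x b).
    Minimal sketch; production codes orthogonalize between powers.
    """
    n = A.shape[0]
    V = rng.standard_normal((n, b))            # random Gaussian start block
    blocks = [V]
    for _ in range(q - 1):
        blocks.append(A @ blocks[-1])          # powers of A applied to the block
    K, _ = np.linalg.qr(np.hstack(blocks))     # orthonormal Krylov basis
    w, S = np.linalg.eigh(K.T @ A @ K)         # Rayleigh-Ritz projection
    top = np.argsort(np.abs(w))[::-1][:k]      # keep the k dominant Ritz pairs
    U = K @ S[:, top]
    return U, w[top]

rng = np.random.default_rng(2)
d = 0.5 ** np.arange(50)                       # fast-decaying spectrum
X, _ = np.linalg.qr(rng.standard_normal((50, 50)))
A = X @ np.diag(d) @ X.T                       # symmetric, eigenvalues d
U, w = rand_block_krylov(A, k=5, b=5, q=4, rng=rng)
err = np.linalg.norm(A - (U * w) @ U.T, 2)     # spectral error, compare to sigma_6
```

For a spectrum decaying this fast, a handful of iterations already puts `err` near the best possible rank-5 error $\sigma_6 = 2^{-5}$, matching the "2-3 digits of accuracy" regime these applications need.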