Lanczos and Golub-Kahan Reduction Methods Applied to Ill-Posed Problems: A Dissertation Submitted to Kent State University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Total Pages: 16

File Type: PDF, Size: 1020 KB

LANCZOS AND GOLUB-KAHAN REDUCTION METHODS APPLIED TO ILL-POSED PROBLEMS

A dissertation submitted to Kent State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

by Enyinda N. Onunwor
May, 2018

Dissertation written by Enyinda N. Onunwor
A.A., Cuyahoga Community College, 1998
B.S., Youngstown State University, 2001
M.S., Youngstown State University, 2003
M.A., Kent State University, 2011
Ph.D., Kent State University, 2018

Approved by
Lothar Reichel, Chair, Doctoral Dissertation Committee
Jing Li, Member, Doctoral Dissertation Committee
Jun Li, Member, Doctoral Dissertation Committee
Arden Ruttan, Member, Outside Discipline
Arvind Bansal, Member, Graduate Faculty Representative

Accepted by
Andrew Tonge, Chair, Department of Mathematical Sciences
James L. Blank, Dean, College of Arts and Sciences

TABLE OF CONTENTS

LIST OF FIGURES ... v
LIST OF TABLES ... vii
ACKNOWLEDGEMENTS ... x
NOTATION ... xii
1 Introduction ... 1
1.1 Overview ... 1
1.2 Regularization methods ... 2
1.2.1 Truncated singular value decomposition (TSVD) ... 2
1.2.2 Truncated eigenvalue decomposition (TEVD) ... 4
1.2.3 Tikhonov regularization ... 5
1.2.4 Regularization parameter: the discrepancy principle ... 7
1.3 Krylov subspace methods ... 8
1.3.1 The Arnoldi method ... 9
1.3.2 The symmetric Lanczos process ... 11
1.3.3 Golub-Kahan bidiagonalization ... 13
1.3.4 Block Krylov methods ... 15
1.4 The test problems ... 16
1.4.1 Descriptions of the test problems ... 16
2 Reduction methods applied to discrete ill-posed problems ... 20
2.1 Introduction ... 20
2.2 Application of the symmetric Lanczos method ... 21
2.3 Application of the Golub-Kahan reduction method ... 29
2.4 Computed examples ... 31
2.5 Conclusion ... 42
3 Computation of a truncated SVD of a large linear discrete ill-posed problem ... 44
3.1 Introduction ... 44
3.2 Symmetric linear discrete ill-posed problems ... 45
3.3 Nonsymmetric linear discrete ill-posed problems ... 47
3.4 Computed examples ... 48
3.5 Conclusion ... 61
4 Solution methods for linear discrete ill-posed problems for color image restoration ... 62
4.1 Solution by partial block Golub-Kahan bidiagonalization ... 66
4.2 The GGKB method and Gauss-type quadrature ... 72
4.3 Golub-Kahan bidiagonalization for problems with multiple right-hand sides ... 77
4.4 Computed examples ... 81
4.5 Conclusion ... 87
BIBLIOGRAPHY ... 88

LIST OF FIGURES

1 Behavior of the bounds (2.2.1) (left), (2.2.7) (center), and (2.3.1) (right), with respect to the iteration index $\ell$. The first test matrix is symmetric positive definite, the second is symmetric indefinite, and the third is unsymmetric. The left-hand side of each inequality is represented by crosses, the right-hand side by circles. ... 31
2 The graphs in the left column display the relative error $R_{\ell,k}$ between the eigenvalues of the symmetric test problems and the corresponding Ritz values generated by the Lanczos process. The right column shows the behavior of $R_{\hat{s},k}$ for the unsymmetric problems; see (2.4.1) and (2.4.3). ... 33
3 Distance between the subspace spanned by the first $\lceil k/3 \rceil$ eigenvectors (resp. singular vectors) of the symmetric (resp. nonsymmetric) test problems, and the subspace spanned by the corresponding Lanczos (resp. Golub–Kahan) vectors; see (2.4.2) and (2.4.4). ... 35
4 Distance $\|V_{k,i}^T V_{n-i}^{(2)}\|$, $i = 1, 2, \ldots, k$, between the subspace spanned by the first $i$ eigenvectors of the Foxgood (left) and Shaw (right) matrices, and the subspace spanned by the corresponding $i$ Ritz vectors at iteration $k = 10$. ... 36
5 Distance between the subspace spanned by the first $\lceil k/2 \rceil$ eigenvectors (resp. singular vectors) of selected symmetric (resp. nonsymmetric) test problems and the subspace spanned by the corresponding Lanczos (resp. Golub–Kahan) vectors. The index $\ell$ ranges from 1 to either the dimension of the matrix ($n = 200$) or to the iteration where there is a breakdown in the factorization process. ... 37
6 Distance $\max\{\|V_{k,i}^T V_{n-i}^{(2)}\|, \|U_{k,i}^T U_{n-i}^{(2)}\|\}$, $i = 1, 2, \ldots, k$, between the subspace spanned by the first $i$ singular vectors of the Heat (left) and Tomo (right) matrices and the subspace spanned by the corresponding $i$ Golub–Kahan vectors at iteration $k = 100$. ... 38
7 The first four LSQR solutions to the Baart test problem (thin lines) are compared to the corresponding TSVD solutions (dashed lines) and to the exact solution (thick line). The size of the problem is $n = 200$, the noise level is $\delta = 10^{-4}$. The thin and dashed lines are very close. ... 41
8 Convergence history for the LSQR and TSVD solutions to the Tomo example of size $n = 225$, with noise level $\delta = 10^{-2}$. The error $E_{\mathrm{LSQR}}$ has a minimum at $k = 66$, while $E_{\mathrm{TSVD}}$ is minimal for $k = 215$. ... 42
9 Solution by LSQR and TSVD to the Tomo example of size $n = 225$, with noise level $\delta = 10^{-2}$: exact solution (top left), optimal LSQR solution (top right), TSVD solution corresponding to the same truncation parameter (bottom left), optimal TSVD solution (bottom right). ... 43
1 Example 2: Original image (left), blurred and noisy image (right). ... 85
2 Example 2: Restored image by Algorithm 5 (left), and restored image by Algorithm 6 (right). ... 85
3 Example 3: Cross-channel blurred and noisy image (left), restored image by Algorithm 6 (right). ... 86

LIST OF TABLES

1 Solution of symmetric linear systems: the errors $E_{\mathrm{Lanczos}}$ and $E_{\mathrm{TEIG}}$ are optimal for truncated Lanczos iteration and truncated eigenvalue decomposition. The corresponding truncation parameters are denoted by $k_{\mathrm{Lanczos}}$ and $k_{\mathrm{TEIG}}$. Three noise levels $\delta$ are considered; $\ell$ denotes the number of Lanczos iterations performed. ... 38
2 Solution of nonsymmetric linear systems: the errors $E_{\mathrm{LSQR}}$ and $E_{\mathrm{TSVD}}$ are optimal for LSQR and TSVD. The corresponding truncation parameters are denoted by $k_{\mathrm{LSQR}}$ and $k_{\mathrm{TSVD}}$. Three noise levels are considered; $\ell$ denotes the number of Golub–Kahan iterations performed. ... 40
1 foxgood test problem. ... 49
2 shaw test problem. ... 50
3 shaw test problem. ... 51
4 phillips test problem. ... 52
5 baart test problem. ... 53
6 baart test problem. ... 53
7 Inverse Laplace transform test problem. ... 54
8 Example 3.6: Relative errors and number of matrix-vector products, $\tilde{\delta} = 10^{-2}$. The initial vector for the first Golub–Kahan bidiagonalization computed by irbla is a unit random vector. ... 54
9 Example 3.6: Relative errors and number of matrix-vector products, $\tilde{\delta} = 10^{-2}$. The initial vector for the first Golub–Kahan bidiagonalization computed by irbla is $b/\|b\|$. ... 54
10 Example 3.6: Relative errors and number of matrix-vector products, $\tilde{\delta} = 10^{-4}$. The initial vector for the first Golub–Kahan bidiagonalization computed by irbla is $b/\|b\|$. ... 55
11 Example 3.6: Relative errors and number of matrix-vector products, $\tilde{\delta} = 10^{-6}$. The initial vector for the first Golub–Kahan bidiagonalization computed by irbla is $b/\|b\|$. ... 55
12 Relative errors and number of matrix-vector product evaluations, $\tilde{\delta} = 10^{-2}$. ... 60
13 Relative errors and number of matrix-vector product evaluations, $\tilde{\delta} = 10^{-4}$. ... 60
14 Relative errors and number of matrix-vector product evaluations, $\tilde{\delta} = 10^{-6}$. ... 61
1 Results for the phillips test problem ... 82
2 Results for the baart test problem ... 83
3 Results for the shaw test problem ... 83
4 Results for Example 2 ... 84

To Olivia and Kristof

ACKNOWLEDGEMENTS

This work would not have been possible without the wisdom, support, and tireless assistance of my advisor, Lothar Reichel. I genuinely appreciate both his patience with me and the guidance he has given me over the years. His invaluable encouragement and counsel have been critical in facilitating the progress I have made to this point. He has truly been a blessing, and he has made a positive impact on my life. In addition, I extend my undying gratitude to my committee: Jing Li, Jun Li, Arden Ruttan, and Arvind Bansal. I am tremendously indebted to them for their collective time, effort, and direction. I would be remiss if I failed to recognize the important contributions made by the following collaborators: Silvia Gazzola, Giuseppe Rodriguez, Mohamed El Guide, Abdeslem Bentbib, and Khalide Jbilou. A special thanks to Xuebo Yu for helping me debug my codes and for his valuable input. I honor the memory of my parents, HRH Sir Wobo Weli Onunwor and Dame Nchelem Onunwor. Their legacy of love, strength, determination, support, and faith imbued me with the courage I needed to achieve this objective, and they will forever endure in my spirit and in my work. My sister, Chisa, is one of the most brilliant people I know; her fortitude and determination are unmatched, and I am inspired by her integrity and work ethic. My oldest brother, HRH Nyema Onunwor, sets the example for the rest of us; he helps us maintain a calm demeanor in the face of the challenges we encounter and remains a constant voice of reason. I offer my deep respect and admiration to my other siblings, Rommy, Acho, and Emenike, for helping me maintain my sanity through this process. Their stimulating conversations and the familial communion we share sustained and comforted me when I was in need of a respite during challenging moments. My thanks to Dike Echendu for his wisdom and advice. Special thanks to two of my closest friends, Dennis Frank-Ito and Ian Miller, for their mathematical insights and constant encouragement. My cousins Anderson, Blessing, Charles, Mary-Ann, and Gloria are like siblings to me, and their parents, Dr. Albert and Ezinne Charity Nnewihe, have acted as my parental figures. I will be eternally grateful to them for their emotional support and loving guidance.
Recommended publications
  • 18.06 Linear Algebra, Problem Set 2 Solutions
18.06 Problem Set 2 Solution. Total: 100 points.

Section 2.5, Problem 24: Use Gauss-Jordan elimination on $[U \; I]$ to find the upper triangular $U^{-1}$:

$$UU^{-1} = I: \qquad \begin{bmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Solution (4 points): Row reduce $[U \; I]$ to get $[I \; U^{-1}]$ as follows (here $R_i$ stands for the $i$th row):

$$\left[\begin{array}{ccc|ccc} 1 & a & b & 1 & 0 & 0 \\ 0 & 1 & c & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right] \xrightarrow{\substack{R_1 = R_1 - aR_2 \\ R_2 = R_2 - cR_3}} \left[\begin{array}{ccc|ccc} 1 & 0 & b-ac & 1 & -a & 0 \\ 0 & 1 & 0 & 0 & 1 & -c \\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right] \xrightarrow{R_1 = R_1 - (b-ac)R_3} \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & -a & ac-b \\ 0 & 1 & 0 & 0 & 1 & -c \\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right].$$

Section 2.5, Problem 40: (Recommended) $A$ is a 4 by 4 matrix with 1's on the diagonal and $-a, -b, -c$ on the diagonal above. Find $A^{-1}$ for this bidiagonal matrix.

Solution (12 points): Row reduce $[A \; I]$ to get $[I \; A^{-1}]$:

$$\left[\begin{array}{cccc|cccc} 1 & -a & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & -b & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -c & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \end{array}\right] \xrightarrow{\substack{R_1 = R_1 + aR_2 \\ R_2 = R_2 + bR_3 \\ R_3 = R_3 + cR_4}} \left[\begin{array}{cccc|cccc} 1 & 0 & -ab & 0 & 1 & a & 0 & 0 \\ 0 & 1 & 0 & -bc & 0 & 1 & b & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & c \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \end{array}\right] \xrightarrow{\substack{R_1 = R_1 + abR_3 \\ R_2 = R_2 + bcR_4}} \left[\begin{array}{cccc|cccc} 1 & 0 & 0 & 0 & 1 & a & ab & abc \\ 0 & 1 & 0 & 0 & 0 & 1 & b & bc \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & c \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \end{array}\right].$$

Alternatively, write $A = I - N$, where $N$ carries $a, b, c$ on the first superdiagonal; since $N^4 = 0$, the finite Neumann series gives $A^{-1} = I + N + N^2 + N^3$, which reproduces the $1, a, ab, abc$ pattern above.
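A quick numerical check of Problem 40 (my own sketch, not part of the solution set, with arbitrary values for a, b, c):

```python
import numpy as np

a, b, c = 2.0, 3.0, 5.0  # arbitrary test scalars (assumption: any values work)

# A = I - N, with N holding a, b, c on the first superdiagonal.
N = np.diag([a, b, c], k=1)
A = np.eye(4) - N

# N is nilpotent (N^4 = 0), so A^{-1} = I + N + N^2 + N^3 exactly.
A_inv = np.eye(4) + N + N @ N + N @ N @ N

assert np.allclose(A @ A_inv, np.eye(4))
print(A_inv)  # row 1 is [1, a, ab, abc], matching the Gauss-Jordan result
```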
  • Two Linear Transformations Each Tridiagonal with Respect to an Eigenbasis of the Other [math.RA, 19 Jun 2003]
Two linear transformations each tridiagonal with respect to an eigenbasis of the other; comments on the parameter array. Paul Terwilliger.

Abstract. Let $\mathbb{K}$ denote a field. Let $d$ denote a nonnegative integer and consider a sequence $p = (\theta_i, \theta_i^*,\ i = 0 \ldots d;\ \varphi_j, \phi_j,\ j = 1 \ldots d)$ consisting of scalars taken from $\mathbb{K}$. We call $p$ a parameter array whenever: (PA1) $\theta_i \neq \theta_j$, $\theta_i^* \neq \theta_j^*$ if $i \neq j$ $(0 \le i, j \le d)$; (PA2) $\varphi_i \neq 0$, $\phi_i \neq 0$ $(1 \le i \le d)$; (PA3) $\varphi_i = \phi_1 \sum_{h=0}^{i-1} \frac{\theta_h - \theta_{d-h}}{\theta_0 - \theta_d} + (\theta_i^* - \theta_0^*)(\theta_{i-1} - \theta_d)$ $(1 \le i \le d)$; (PA4) $\phi_i = \varphi_1 \sum_{h=0}^{i-1} \frac{\theta_h - \theta_{d-h}}{\theta_0 - \theta_d} + (\theta_i^* - \theta_0^*)(\theta_{d-i+1} - \theta_0)$ $(1 \le i \le d)$; (PA5) $(\theta_{i-2} - \theta_{i+1})(\theta_{i-1} - \theta_i)^{-1}$, $(\theta_{i-2}^* - \theta_{i+1}^*)(\theta_{i-1}^* - \theta_i^*)^{-1}$ are equal and independent of $i$ for $2 \le i \le d-1$. In [13] we showed the parameter arrays are in bijection with the isomorphism classes of Leonard systems. Using this bijection we obtain the following two characterizations of parameter arrays. Assume $p$ satisfies PA1, PA2. Let $A, B, A^*, B^*$ denote the matrices in $\mathrm{Mat}_{d+1}(\mathbb{K})$ which have entries $A_{ii} = \theta_i$, $B_{ii} = \theta_{d-i}$, $A^*_{ii} = \theta_i^*$, $B^*_{ii} = \theta_i^*$ $(0 \le i \le d)$, $A_{i,i-1} = 1$, $B_{i,i-1} = 1$, $A^*_{i-1,i} = \varphi_i$, $B^*_{i-1,i} = \phi_i$ $(1 \le i \le d)$, and all other entries 0. We show the following are equivalent: (i) $p$ satisfies PA3-PA5; (ii) there exists an invertible $G \in \mathrm{Mat}_{d+1}(\mathbb{K})$ such that $G^{-1}AG = B$ and $G^{-1}A^*G = B^*$; (iii) for $0 \le i \le d$ the polynomial $\sum_{n=0}^{i} \frac{(\lambda - \theta_0)(\lambda - \theta_1) \cdots (\lambda - \theta_{n-1})\,(\theta_i^* - \theta_0^*)(\theta_i^* - \theta_1^*) \cdots (\theta_i^* - \theta_{n-1}^*)}{\varphi_1 \varphi_2 \cdots \varphi_n}$ is a scalar multiple of the polynomial $\sum_{n=0}^{i} \frac{(\lambda - \theta_d)(\lambda - \theta_{d-1}) \cdots (\lambda - \theta_{d-n+1})\,(\theta_i^* - \theta_0^*)(\theta_i^* - \theta_1^*) \cdots (\theta_i^* - \theta_{n-1}^*)}{\phi_1 \phi_2 \cdots \phi_n}$.
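The four matrices in the equivalence are easy to write down concretely. A minimal sketch (my own illustration in Python/NumPy, with a hypothetical example array p; the values are not claimed to satisfy PA3-PA5) of the bidiagonal matrices A, B, A*, B* exactly as defined in the abstract:

```python
import numpy as np

d = 3
# Hypothetical parameter values, for illustration only.
theta  = np.array([0.0, 1.0, 2.0, 3.0])   # theta_0 .. theta_d
thetas = np.array([0.0, 2.0, 4.0, 6.0])   # theta*_0 .. theta*_d
vphi   = np.array([1.0, 2.0, 3.0])        # varphi_1 .. varphi_d
phi    = np.array([4.0, 5.0, 6.0])        # phi_1 .. phi_d

# A, B: diagonal theta_i (resp. theta_{d-i}), subdiagonal entries 1.
A = np.diag(theta) + np.diag(np.ones(d), k=-1)
B = np.diag(theta[::-1]) + np.diag(np.ones(d), k=-1)

# A*, B*: diagonal theta*_i, superdiagonals varphi_i (resp. phi_i).
As = np.diag(thetas) + np.diag(vphi, k=1)
Bs = np.diag(thetas) + np.diag(phi, k=1)

# The theorem: p satisfies PA3-PA5 iff a single invertible G gives
# G^{-1} A G = B and G^{-1} A* G = B* simultaneously.
print(A, B, As, Bs, sep="\n\n")
```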
  • Self-Interlacing Polynomials II: Matrices with Self-Interlacing Spectrum
SELF-INTERLACING POLYNOMIALS II: MATRICES WITH SELF-INTERLACING SPECTRUM. MIKHAIL TYAGLOV.

Abstract. An $n \times n$ matrix is said to have a self-interlacing spectrum if its eigenvalues $\lambda_k$, $k = 1, \ldots, n$, are distributed as follows: $\lambda_1 > -\lambda_2 > \lambda_3 > \cdots > (-1)^{n-1}\lambda_n > 0$. A method for constructing sign-definite matrices with self-interlacing spectra from totally nonnegative ones is presented. We apply this method to bidiagonal and tridiagonal matrices. In particular, we generalize a result by O. Holtz on the spectrum of real symmetric anti-bidiagonal matrices with positive nonzero entries.

1. Introduction. In [5] the so-called self-interlacing polynomials were introduced. A polynomial $p(z)$ is called self-interlacing if all its roots are real, simple, and interlace the roots of the polynomial $p(-z)$. It is easy to see that if $\lambda_k$, $k = 1, \ldots, n$, are the roots of a self-interlacing polynomial, then they are distributed as follows:

(1.1) $\lambda_1 > -\lambda_2 > \lambda_3 > \cdots > (-1)^{n-1}\lambda_n > 0$,

or

(1.2) $-\lambda_1 > \lambda_2 > -\lambda_3 > \cdots > (-1)^{n}\lambda_n > 0$.

The polynomials whose roots are distributed as in (1.1) (resp. in (1.2)) are called self-interlacing of kind I (resp. of kind II). It is clear that a polynomial $p(z)$ is self-interlacing of kind I if, and only if, the polynomial $p(-z)$ is self-interlacing of kind II. Thus, it is enough to study self-interlacing polynomials of kind I, since all the results for self-interlacing polynomials of kind II are then obtained automatically.

Definition 1.1. An $n \times n$ matrix is said to possess a self-interlacing spectrum if its eigenvalues $\lambda_k$, $k = 1, \ldots, n$, are real, simple, and distributed as in (1.1).
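For concreteness, a small sketch (my own illustration, not from the paper) that tests whether a given real, simple spectrum satisfies the kind-I chain (1.1):

```python
import numpy as np

def is_self_interlacing_kind_I(eigs):
    """Check lam1 > -lam2 > lam3 > ... > (-1)^(n-1) lam_n > 0."""
    lam = np.asarray(eigs, dtype=float)
    # The signed sequence (-1)^(k-1) lam_k must be strictly decreasing
    # and end with a positive value.
    signed = np.array([(-1) ** k for k in range(len(lam))]) * lam
    return bool(np.all(np.diff(signed) < 0) and signed[-1] > 0)

# Eigenvalues alternating in sign with strictly decreasing magnitudes qualify.
print(is_self_interlacing_kind_I([5.0, -4.0, 3.0, -2.0]))  # True
print(is_self_interlacing_kind_I([5.0, 4.0, 3.0, 2.0]))    # False
```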
  • A Parallel Lanczos Algorithm for Eigensystem Calculation
A Parallel Lanczos Algorithm for Eigensystem Calculation. Hans-Peter Kersken / Uwe Küster.

Eigenvalue problems arise in many fields of physics and engineering science, for example in structural engineering, from the prediction of the dynamic stability of structures or of the vibrations of a fluid in a closed cavity. Their solution consumes a large amount of memory and CPU time if more than very few (1-5) vibrational modes are desired. Both make the problem a natural candidate for parallel processing. Here we shall present some aspects of the solution of the generalized eigenvalue problem on parallel architectures with distributed memory. The research was carried out as part of the EC-funded project ACTIVATE. This project combines various activities to increase the performance of vibro-acoustic software. It brings together end users from the space, aviation, and automotive industries with software developers and numerical analysts.

Introduction. The algorithm we shall describe in the next sections is based on the Lanczos algorithm for solving large sparse eigenvalue problems. It became popular during the past two decades because of its superior convergence properties compared to more traditional methods like inverse vector iteration. Some experiments with a new algorithm described in [Bra97] showed that it is competitive with the Lanczos algorithm only if very few eigenpairs are needed. Therefore we decided to base our implementation on the Lanczos algorithm. However, when implementing the Lanczos algorithm one has to pay attention to some subtle algorithmic details. A state-of-the-art algorithm is described in [Gri94]. On parallel architectures further problems arise concerning the robustness of the algorithm. We implemented the algorithm almost entirely by using numerical libraries.
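As background for the excerpt above, a minimal serial sketch of the symmetric Lanczos recurrence it builds on (my own Python/NumPy illustration; the paper's parallel, generalized-eigenproblem version adds factorizations and reorthogonalization on top of this, and the breakdown check is omitted here):

```python
import numpy as np

def lanczos(A, v0, k):
    """k steps of symmetric Lanczos: A V_k = V_k T_k + beta_k v_{k+1} e_k^T."""
    n = len(v0)
    V = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k)
    v, v_prev, b = v0 / np.linalg.norm(v0), np.zeros(n), 0.0
    for j in range(k):
        V[:, j] = v
        w = A @ v - b * v_prev          # three-term recurrence
        alpha[j] = v @ w
        w -= alpha[j] * v
        b = np.linalg.norm(w)           # breakdown if b == 0 (not handled)
        beta[j] = b
        v_prev, v = v, w / b
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return V, T

# Ritz values of T approximate the extreme eigenvalues of A.
rng = np.random.default_rng(0)
M = rng.standard_normal((200, 200))
A = M + M.T
V, T = lanczos(A, rng.standard_normal(200), 30)
print(np.linalg.eigvalsh(T)[-3:], np.linalg.eigvalsh(A)[-3:])
```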
  • Accurate Singular Values of Bidiagonal Matrices
Accurate Singular Values of Bidiagonal Matrices (appeared in SIAM J. Sci. Stat. Comput., v. 11, n. 5, pp. 873-912, 1990). James Demmel, Courant Institute, 251 Mercer Str., New York, NY 10012. W. Kahan, Computer Science Division, University of California, Berkeley, CA 94720.

Abstract. Computing the singular values of a bidiagonal matrix is the final phase of the standard algorithm for the singular value decomposition of a general matrix. We present a new algorithm which computes all the singular values of a bidiagonal matrix to high relative accuracy independent of their magnitudes. In contrast, the standard algorithm for bidiagonal matrices may compute small singular values with no relative accuracy at all. Numerical experiments show that the new algorithm is comparable in speed to the standard algorithm, and frequently faster.

Keywords: singular value decomposition, bidiagonal matrix, QR iteration. AMS(MOS) subject classifications: 65F20, 65G05, 65F35.

1. Introduction. The standard algorithm for computing the singular value decomposition (SVD) of a general real matrix $A$ has two phases [7]:

1) Compute orthogonal matrices $P_1$ and $Q_1$ such that $B = P_1^T A Q_1$ is in bidiagonal form, i.e. has nonzero entries only on its diagonal and first superdiagonal.

2) Compute orthogonal matrices $P_2$ and $Q_2$ such that $\Sigma = P_2^T B Q_2$ is diagonal and nonnegative. The diagonal entries $\sigma_i$ of $\Sigma$ are the singular values of $A$. We will take them to be sorted in decreasing order: $\sigma_i \ge \sigma_{i+1}$. The columns of $Q = Q_1 Q_2$ are the right singular vectors, and the columns of $P = P_1 P_2$ are the left singular vectors.
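To make phase 1 concrete, here is a compact sketch of Householder bidiagonalization (a generic textbook version, my own illustration and not the paper's code), with the singular values of B checked against those of A, since orthogonal transformations preserve them:

```python
import numpy as np

def householder(x):
    """Reflector v (unit norm) with (I - 2 v v^T) x proportional to e_1."""
    v = x.astype(float).copy()
    v[0] += np.sign(x[0] or 1.0) * np.linalg.norm(x)  # avoid cancellation
    nv = np.linalg.norm(v)
    return v / nv if nv > 0 else v

def bidiagonalize(A):
    """Phase 1: B = P1^T A Q1 upper bidiagonal, via alternating reflectors."""
    B = A.astype(float).copy()
    m, n = B.shape
    for j in range(n):
        v = householder(B[j:, j])                  # zero column j below diagonal
        B[j:, j:] -= 2.0 * np.outer(v, v @ B[j:, j:])
        if j < n - 2:
            v = householder(B[j, j + 1:])          # zero row j past superdiagonal
            B[j:, j + 1:] -= 2.0 * np.outer(B[j:, j + 1:] @ v, v)
    return B

A = np.random.default_rng(1).standard_normal((6, 4))
B = bidiagonalize(A)
print(np.round(np.linalg.svd(A, compute_uv=False), 6))
print(np.round(np.linalg.svd(B, compute_uv=False), 6))  # identical up to roundoff
```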
  • Design and Evaluation of Tridiagonal Solvers for Vector and Parallel Computers Universitat Politècnica De Catalunya
Design and Evaluation of Tridiagonal Solvers for Vector and Parallel Computers. Author: Josep Lluis Larriba Pey. Advisor: Juan José Navarro Guerrero. Barcelona, January 1995. UPC, Universitat Politècnica de Catalunya, Departament d'Arquitectura de Computadors.

Doctoral thesis presented by Josep Lluis Larriba Pey to obtain the degree of Doctor in Computer Science from the Universitat Politècnica de Catalunya.

To my wife Marta. To my parents Elvira and José Luis.

"A journey of a thousand miles must begin with a single step." Lao-Tse

Acknowledgements. I would like to thank my parents, José Luis and Elvira, for their unconditional help and love. I will always be indebted to you. I also want to thank my wife, Marta, for being the sparkle in my life, for her support and for understanding my (good) moods. I thank Juanjo, my advisor, for his friendship, patience and good advice. Also, I want to thank Àngel Jorba for his unconditional collaboration, work and support in some of the contributions of this work. I thank the members of the "Comissió de Doctorat del DAG" for their comments, and especially Miguel Valero for his suggestions on the topics of chapter 4. I thank Mateo Valero for his friendship and always good advice.
  • Exact Diagonalization of Quantum Lattice Models on Coprocessors
Exact diagonalization of quantum lattice models on coprocessors. T. Siro, A. Harju. Aalto University School of Science, P.O. Box 14100, 00076 Aalto, Finland.

Abstract. We implement the Lanczos algorithm on an Intel Xeon Phi coprocessor and compare its performance to a multi-core Intel Xeon CPU and an NVIDIA graphics processor. The Xeon and the Xeon Phi are parallelized with OpenMP and the graphics processor is programmed with CUDA. The performance is evaluated by measuring the execution time of a single step in the Lanczos algorithm. We study two quantum lattice models with different particle numbers, and conclude that for small systems, the multi-core CPU is the fastest platform, while for large systems, the graphics processor is the clear winner, reaching speedups of up to 7.6 compared to the CPU. The Xeon Phi outperforms the CPU with sufficiently large particle number, reaching a speedup of 2.5.

Keywords: Tight binding, Hubbard model, exact diagonalization, GPU, CUDA, MIC, Xeon Phi.

1. Introduction. In recent years, there has been tremendous interest in utilizing coprocessors in scientific computing, including condensed matter physics [1, 2, 3, 4]. Most of the work has been done on graphics processing units (GPU), resulting in impressive speedups compared to CPUs in problems that exhibit high data-parallelism and benefit from the high throughput of the GPU. [From the adjacent column:] The hopping amplitude is denoted by t. The tight-binding model describes free electrons hopping around a lattice, and it gives a crude approximation of the electronic properties of a solid. The model can be made more realistic by adding interactions, such as on-site repulsion, which results in the well-known Hubbard model [9]. In our basis, however, such interaction terms are diagonal, rendering their effect on the computational complexity
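Since the benchmark above times a single Lanczos step (essentially one sparse matrix-vector product plus a few vector operations), a serial reference point is easy to sketch (my own illustration with a hypothetical random sparse stand-in for the Hamiltonian; the paper's implementations use OpenMP and CUDA):

```python
import time
import numpy as np
import scipy.sparse as sp

# Hypothetical stand-in Hamiltonian: random sparse, symmetrized.
n = 200_000
H = sp.random(n, n, density=1e-5, format="csr", random_state=0)
H = H + H.T

rng = np.random.default_rng(0)
v = rng.standard_normal(n)
v /= np.linalg.norm(v)
v_prev, beta = np.zeros(n), 0.0

t0 = time.perf_counter()
# One Lanczos step: matvec, orthogonalize against two previous vectors, normalize.
w = H @ v - beta * v_prev
alpha = v @ w
w -= alpha * v
beta = np.linalg.norm(w)
v_prev, v = v, w / beta
print(f"one Lanczos step: {time.perf_counter() - t0:.4f} s")
```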
  • Computing the Moore-Penrose Inverse for Bidiagonal Matrices
УДК 519.65. Yu. Hakopian. DOI: https://doi.org/10.18523/2617-70802201911-23

COMPUTING THE MOORE-PENROSE INVERSE FOR BIDIAGONAL MATRICES

The Moore-Penrose inverse is the most popular type of matrix generalized inverse, with many applications both in matrix theory and in numerical linear algebra. It is well known that the Moore-Penrose inverse can be found via the singular value decomposition. In this regard, the most effective algorithm consists of two stages. In the first stage, through the use of Householder reflections, an initial matrix is reduced to upper bidiagonal form (the Golub-Kahan bidiagonalization algorithm). The second stage is known in the scientific literature as the Golub-Reinsch algorithm. This is an iterative procedure which, with the help of Givens rotations, generates a sequence of bidiagonal matrices converging to a diagonal form. This allows one to obtain an iterative approximation to the singular value decomposition of the bidiagonal matrix. The principal intention of the present paper is to develop a method which can be considered as an alternative to the Golub-Reinsch iterative algorithm. Realizing the approach proposed in the study, the following two main results have been achieved. First, we obtain explicit expressions for the entries of the Moore-Penrose inverse of bidiagonal matrices. Secondly, based on the closed-form formulas, we get a finite recursive numerical algorithm of optimal computational complexity. Thus, we can compute the Moore-Penrose inverse of bidiagonal matrices without using the singular value decomposition.

Keywords: Moore-Penrose inverse, bidiagonal matrix, inversion formula, finite recursive algorithm.
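A quick numerical illustration of the SVD route the abstract describes (my own sketch; the paper's contribution is precisely a formula that avoids this): build a rank-deficient bidiagonal matrix, form the Moore-Penrose inverse from its SVD, and compare with NumPy's pinv.

```python
import numpy as np

# An upper bidiagonal matrix; the zero diagonal entry makes it rank-deficient.
B = np.diag([3.0, 0.0, 2.0, 5.0]) + np.diag([1.0, 4.0, 6.0], k=1)

# Moore-Penrose inverse via the SVD: B+ = V diag(1/sigma_i or 0) U^T.
U, s, Vt = np.linalg.svd(B)
tol = max(B.shape) * np.finfo(float).eps * s.max()
s_inv = np.array([1.0 / x if x > tol else 0.0 for x in s])
B_pinv = Vt.T @ np.diag(s_inv) @ U.T

assert np.allclose(B_pinv, np.linalg.pinv(B))
assert np.allclose(B @ B_pinv @ B, B)  # first Penrose condition holds
print(np.round(B_pinv, 4))
```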
  • An Implementation of a Generalized Lanczos Procedure for Structural Dynamic Analysis on Distributed Memory Computers
NUMERICAL ANALYSIS PROJECT, AUGUST 1992, MANUSCRIPT NA-92-09. Numerical Analysis Project, Computer Science Department, Stanford University, Stanford, California 94305.

An Implementation of a Generalized Lanczos Procedure for Structural Dynamic Analysis on Distributed Memory Computers. David R. Mackay and Kincho H. Law, Department of Civil Engineering, Stanford University, Stanford, CA 94305-4020.

Abstract. This paper describes a parallel implementation of a generalized Lanczos procedure for structural dynamic analysis on a distributed memory parallel computer. One major cost of the generalized Lanczos procedure is the factorization of the (shifted) stiffness matrix and the forward and backward solution of triangular systems. In this paper, we discuss load assignment of a sparse matrix and propose a strategy for inverting the principal block submatrix factors to facilitate the forward and backward solution of triangular systems. We also discuss the different strategies in the implementation of mass matrix-vector multiplication on a parallel computer and how they are used in the Lanczos procedure. The Lanczos procedure implemented includes partial and external selective reorthogonalizations and spectral shifts. Experimental results are presented to illustrate the effectiveness of the parallel generalized Lanczos procedure. The issues of balancing the computations among the basic steps of the Lanczos procedure on distributed memory computers are discussed.

This work is sponsored by the National Science Foundation, grant number ECS-9003107, and the Army Research Office, grant number DAAL-03-91-G-0038.
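The generalized eigenproblem the procedure targets is Kx = λMx solved with a shifted factorization. The same shift-invert Lanczos idea is available off the shelf in SciPy; a minimal sketch, assuming small random SPD stand-ins for the stiffness and mass matrices (not the paper's parallel implementation):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

# Stand-in SPD "stiffness" K and diagonal "mass" M (assumptions for illustration).
n = 500
R = sp.random(n, n, density=0.01, random_state=0)
K = (R @ R.T) + 10.0 * sp.identity(n)                        # symmetric positive definite
M = sp.diags(np.random.default_rng(0).uniform(1.0, 2.0, n))  # lumped (diagonal) mass

# Shift-invert Lanczos for K x = lambda M x: (K - sigma*M) is factored once,
# then Lanczos runs with the factored operator; eigsh does this internally.
vals, vecs = eigsh(K, k=5, M=M, sigma=0.0, which="LM")
print(np.sort(vals))  # the five generalized eigenvalues nearest the shift
```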
  • Basic Matrices
Linear Algebra and its Applications 373 (2003) 143-151. www.elsevier.com/locate/laa

Basic matrices. Miroslav Fiedler, Institute of Computer Science, Czech Academy of Sciences, 182 07 Prague 8, Czech Republic. Received 30 April 2002; accepted 13 November 2002. Submitted by B. Shader.

Abstract. We define a basic matrix as a square matrix which has both subdiagonal and superdiagonal rank at most one. We investigate, partly using additional restrictions, the relationship of basic matrices to factorizations. Special basic matrices are also mentioned. © 2003 Published by Elsevier Inc. AMS classification: 15A23; 15A33. Keywords: Structure rank; Tridiagonal matrix; Oscillatory matrix; Factorization; Orthogonal matrix.

1. Introduction and preliminaries. For $m \times n$ matrices, structure (cf. [2]) will mean any nonvoid subset of $M \times N$, $M = \{1, \ldots, m\}$, $N = \{1, \ldots, n\}$. Given a matrix $A$ (over a field, or a ring) and a structure, the structure rank of $A$ is the maximum rank of any submatrix of $A$ all entries of which are contained in the structure. The most important examples of structure ranks for square $n \times n$ matrices are the subdiagonal rank ($S = \{(i,k);\ n \ge i > k \ge 1\}$), the superdiagonal rank ($S = \{(i,k);\ 1 \le i < k \le n\}$), the off-diagonal rank ($S = \{(i,k);\ i \ne k,\ 1 \le i, k \le n\}$), the block-subdiagonal rank, etc. All these ranks enjoy the following property:

Theorem 1.1 [2, Theorems 2 and 3]. If a nonsingular matrix has subdiagonal (superdiagonal, off-diagonal, block-subdiagonal, etc.) rank $k$, then the inverse also has subdiagonal (...) rank $k$.
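The subdiagonal rank is computable directly from the definition: a submatrix lies entirely below the diagonal exactly when all its row indices exceed all its column indices, so it suffices to check the maximal blocks A[t:n, 1:t]. A small sketch (my own illustration; numerical rank via NumPy, so well-conditioned examples are assumed):

```python
import numpy as np

def subdiagonal_rank(A):
    """Max rank of a submatrix contained in S = {(i,k): i > k}.

    Any such submatrix sits inside a maximal block A[t:, :t]
    (rows t..n-1, columns 0..t-1 in 0-based indexing).
    """
    n = A.shape[0]
    return max(np.linalg.matrix_rank(A[t:, :t]) for t in range(1, n))

# A tridiagonal matrix has subdiagonal (and superdiagonal) rank 1.
T = np.diag([2.0] * 5) + np.diag([1.0] * 4, 1) + np.diag([3.0] * 4, -1)
print(subdiagonal_rank(T))    # 1
print(subdiagonal_rank(T.T))  # superdiagonal rank of T, also 1
```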
  • Lanczos Vectors Versus Singular Vectors for Effective Dimension Reduction Jie Chen and Yousef Saad
Lanczos Vectors versus Singular Vectors for Effective Dimension Reduction. Jie Chen and Yousef Saad.

Abstract. This paper takes an in-depth look at a technique for computing filtered matrix-vector (mat-vec) products which are required in many data analysis applications. In these applications the data matrix is multiplied by a vector and we wish to perform this product accurately in the space spanned by a few of the major singular vectors of the matrix. We examine the use of the Lanczos algorithm for this purpose. The goal of the method is identical with that of the truncated singular value decomposition (SVD), namely to preserve the quality of the resulting mat-vec product in the major singular directions of the matrix. The Lanczos-based approach achieves this goal by using a small number of Lanczos vectors, but it does not explicitly compute

[From the adjacent column:] a) Latent Semantic Indexing (LSI): LSI [3], [4] is an effective information retrieval technique which computes the relevance scores for all the documents in a collection in response to a user query. In LSI, a collection of documents is represented as a term-document matrix $X = [x_{ij}]$ where each column vector represents a document and $x_{ij}$ is the weight of term $i$ in document $j$. A query is represented as a pseudo-document in a similar form, a column vector $q$. By performing dimension reduction, each document $x_j$ ($j$-th column of $X$) becomes $P_k^T x_j$ and the query becomes $P_k^T q$ in the reduced-rank representation, where $P_k$ is the matrix whose column vectors are the $k$ major left singular vectors of $X$.
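A tiny sketch of the truncated-SVD reduction described in that passage (my own illustration with a made-up term-document matrix): project documents and query onto the k major left singular vectors and score by cosine similarity in the reduced space.

```python
import numpy as np

# Hypothetical term-document matrix: rows = terms, columns = documents.
X = np.array([[2., 0., 1., 0.],
              [1., 1., 0., 0.],
              [0., 2., 0., 1.],
              [0., 0., 1., 2.]])
q = np.array([1., 1., 0., 0.])   # query as a pseudo-document

k = 2
P, s, Vt = np.linalg.svd(X, full_matrices=False)
Pk = P[:, :k]                    # k major left singular vectors

X_red = Pk.T @ X                 # each document x_j becomes Pk^T x_j
q_red = Pk.T @ q                 # the query becomes Pk^T q

# Relevance scores: cosine similarity in the reduced space.
scores = (X_red.T @ q_red) / (np.linalg.norm(X_red, axis=0) * np.linalg.norm(q_red))
print(np.argsort(scores)[::-1])  # documents ranked by relevance to q
```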
  • Parallel Bidiagonalization of a Dense Matrix
SIAM J. MATRIX ANAL. APPL., Vol. 29, No. 3, pp. 826-837. © 2007 Society for Industrial and Applied Mathematics.

PARALLEL BIDIAGONALIZATION OF A DENSE MATRIX. CARLOS CAMPOS, DAVID GUERRERO, VICENTE HERNÁNDEZ, AND RUI RALHA.

Abstract. A new stable method for the reduction of rectangular dense matrices to bidiagonal form has been proposed recently. This is a one-sided method since it can be entirely expressed in terms of operations with (full) columns of the matrix under transformation. The algorithm is well suited to parallel computing and, in order to make it even more attractive for distributed memory systems, we introduce a modification which halves the number of communication instances. In this paper we present such a modification. A block organization of the algorithm to use level 3 BLAS routines seems difficult and, at least for the moment, it relies upon level 2 BLAS routines. Nevertheless, we found that our sequential code is competitive with the LAPACK DGEBRD routine. We also compare the time taken by our parallel codes and the ScaLAPACK PDGEBRD routine. We investigated the best data distribution schemes for the different codes and we can state that our parallel codes are also competitive with the ScaLAPACK routine.

Key words: bidiagonal reduction, parallel algorithms. AMS subject classifications: 15A18, 65F30, 68W10. DOI: 10.1137/05062809X.

1. Introduction. The problem of computing the singular value decomposition (SVD) of a matrix is one of the most important operations in numerical linear algebra and is employed in a variety of applications. The SVD is defined as follows. For any rectangular matrix $A \in \mathbb{R}^{m \times n}$ (we will assume that $m \ge n$), there exist two orthogonal matrices $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ and a matrix $\Sigma = \begin{bmatrix} \Sigma_A \\ 0 \end{bmatrix} \in \mathbb{R}^{m \times n}$, where $\Sigma_A = \mathrm{diag}(\sigma_1, \ldots, \sigma_n)$ is a diagonal matrix, such that $A = U \Sigma V^t$.
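A short check of the SVD form exactly as stated above (a sketch in plain NumPy; it verifies the definition rather than implementing the paper's one-sided reduction):

```python
import numpy as np

m, n = 6, 4
A = np.random.default_rng(2).standard_normal((m, n))

# Full SVD: U is m x m, V is n x n, Sigma is m x n with Sigma_A = diag(s) on top.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)                 # Sigma = [Sigma_A; 0]

assert np.allclose(U @ Sigma @ Vt, A)      # A = U Sigma V^t
assert np.allclose(U.T @ U, np.eye(m))     # U orthogonal
assert np.allclose(Vt @ Vt.T, np.eye(n))   # V orthogonal
print("singular values:", np.round(s, 4))
```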