Fast Algorithms for Sparse Matrix Inverse Computations: A Dissertation Submitted to the Institute for Computational and Mathematical Engineering, Stanford University


FAST ALGORITHMS FOR SPARSE MATRIX INVERSE COMPUTATIONS

A DISSERTATION SUBMITTED TO THE INSTITUTE FOR COMPUTATIONAL AND MATHEMATICAL ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Song Li
September 2009

© Copyright by Song Li 2009. All Rights Reserved.

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. (Eric Darve) Principal Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. (George Papanicolaou)

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. (Michael Saunders)

Approved for the University Committee on Graduate Studies.

Abstract

An accurate and efficient algorithm, called Fast Inverse using Nested Dissection (FIND), has been developed for certain sparse matrix computations. The algorithm reduces the computation cost by an order of magnitude for 2D problems. After discretization on an $N_x \times N_y$ mesh, the previously best-known algorithm, Recursive Green's Function (RGF), requires $O(N_x^3 N_y)$ operations, whereas ours requires only $O(N_x^2 N_y)$. The current major application of the algorithm is to simulate the transport of electrons in nano-devices, but it may be applied to other problems as well, e.g., data clustering, imagery pattern clustering, and image segmentation.

The algorithm computes the diagonal entries of the inverse of a sparse matrix of finite-difference, finite-element, or finite-volume type. In addition, it can be extended to computing certain off-diagonal entries and other inverse-related matrix computations. As an example, we focus on the retarded Green's function, the less-than Green's function, and the current density in the non-equilibrium Green's function approach for transport problems. Special optimizations of the algorithm are also discussed. These techniques are applicable to 3D regular-shaped domains as well.

Computing inverse elements for a large matrix requires a lot of memory and is very time-consuming, even using our efficient algorithm with optimization. We propose several parallel algorithms for such applications based on ideas from cyclic reduction, dynamic programming, and nested dissection. A similar parallel algorithm is discussed for solving sparse linear systems with repeated right-hand sides with significant speedup.
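To see the order-of-magnitude claim concretely, take a square mesh with $N_x = N_y = n$, so $N = n^2$ mesh nodes in total; the following arithmetic is an illustrative check, not a result quoted from the dissertation beyond the stated complexities:

\[
\text{RGF: } O(N_x^3 N_y) = O(n^4) = O(N^2),
\qquad
\text{FIND: } O(N_x^2 N_y) = O(n^3) = O(N^{3/2}).
\]

For example, on a $1000 \times 1000$ mesh ($N = 10^6$ nodes), the two estimates differ by a factor on the order of $n = 1000$: this factor of $N_x$ is the "order of magnitude" reduction referred to above, and the $O(N^{3/2})$ square-mesh cost matches the units used in the complexity tables later in the dissertation.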
Acknowledgements

The first person I would like to thank is professor Walter Murray. My pleasant study here at Stanford and the newly established Institute for Computational and Mathematical Engineering would not have been possible without him. I would also very much like to thank Indira Choudhury, who helped me with numerous administrative issues with lots of patience.

I would also like to thank professor TW Wiedmann, who has constantly provided mental support and given me advice on all aspects of my life here. I am grateful to professors Tze Leung Lai, Parviz Moin, George Papanicolaou, and Michael Saunders for being willing to serve on my committee. In particular, Michael gave lots of comments on my dissertation draft. I believe that his comments not only helped me improve my dissertation, but will also benefit my future academic writing.

I would very much like to thank my advisor Eric Darve for choosing a great topic for me at the beginning and for all his help in the past five years. His constant support and tolerance provided me with a stable environment, with more than enough freedom to learn what interests me and to think at my own pace and in my own style, while at the same time his advice led my vague and diverse ideas to rigorous and meaningful results. This is almost the ideal environment in my mind for academic study and research, one that I had never actually had before. In particular, the discussions with him over the past years have been very enjoyable. I am very lucky to have met him and to have him as my advisor here at Stanford. Thank you, Eric!

Lastly, I thank my mother and my sister for always encouraging me and believing in me.

Contents

Abstract  v
Acknowledgements  vi

1 Introduction  1
  1.1 The transport problem of nano-devices  1
    1.1.1 The physical problem  1
    1.1.2 NEGF approach  4
  1.2 Review of existing methods  8
    1.2.1 The need for computing sparse matrix inverses  8
    1.2.2 Direct methods for matrix inverse related computation  10
    1.2.3 Recursive Green's Function method  13

2 Overview of FIND Algorithms  17
  2.1 1-way methods based on LU only  17
  2.2 2-way methods based on both LU and back substitution  20

3 Computing the Inverse  23
  3.1 Brief description of the algorithm  23
    3.1.1 Upward pass  25
    3.1.2 Downward pass  26
    3.1.3 Nested dissection algorithm of A. George et al.  29
  3.2 Detailed description of the algorithm  30
    3.2.1 The definition and properties of mesh node sets and trees  31
    3.2.2 Correctness of the algorithm  34
    3.2.3 The algorithm  37
  3.3 Complexity analysis  39
    3.3.1 Running time analysis  39
    3.3.2 Memory cost analysis  44
    3.3.3 Effect of the null boundary set of the whole mesh  45
  3.4 Simulation of device and comparison with RGF  46
  3.5 Concluding remarks  49

4 Extension of FIND  51
  4.1 Computing diagonal entries of another Green's function: G^<  51
    4.1.1 The algorithm  52
    4.1.2 The pseudocode of the algorithm  53
    4.1.3 Computation and storage cost  56
  4.2 Computing off-diagonal entries of G^r and G^<  57

5 Optimization of FIND  59
  5.1 Introduction  59
  5.2 Optimization for extra sparsity in A  61
    5.2.1 Schematic description of the extra sparsity  61
    5.2.2 Preliminary analysis  64
    5.2.3 Exploiting the extra sparsity in a block structure  66
  5.3 Optimization for symmetry and positive definiteness  72
    5.3.1 The symmetry and positive definiteness of dense matrix A  72
    5.3.2 Optimization combined with the extra sparsity  74
    5.3.3 Storage cost reduction  75
  5.4 Optimization for computing G^< and current density  76
    5.4.1 G^< sparsity  76
    5.4.2 G^< symmetry  76
    5.4.3 Current density  77
  5.5 Results and comparison  78

6 FIND-2way and Its Parallelization  83
  6.1 Recurrence formulas for the inverse matrix  84
  6.2 Sequential algorithm for 1D problems  89
  6.3 Computational cost analysis  91
  6.4 Parallel algorithm for 1D problems  95
    6.4.1 Schur phases  97
    6.4.2 Block Cyclic Reduction phases  103
  6.5 Numerical results  104
    6.5.1 Stability  104
    6.5.2 Load balancing  105
    6.5.3 Benchmarks  107
  6.6 Conclusion  108

7 More Advanced Parallel Schemes  113
  7.1 Common features of the parallel schemes  113
  7.2 PCR scheme for 1D problems  114
  7.3 PFIND-Complement scheme for 1D problems  117
  7.4 Parallel schemes in 2D  120

8 Conclusions and Directions for Future Work  123

Appendices  126

A Proofs for the Recursive Method Computing G^r and G^<  126
  A.1 Forward recurrence  128
  A.2 Backward recurrence  129

B Theoretical Supplement for FIND-1way Algorithms  132
  B.1 Proofs for both computing G^r and computing G^<  132
  B.2 Proofs for computing G^<  136

C Algorithms  140
  C.1 BCR Reduction Functions  141
  C.2 BCR Production Functions  142
  C.3 Hybrid Auxiliary Functions  144

Bibliography  147

List of Tables

I Symbols in the physical problems  xvii
II Symbols for matrix operations  xviii
III Symbols for mesh description  xviii
3.1 Computation cost estimate for different types of clusters  46
5.1 Matrix blocks and their corresponding operations  66
5.2 The cost of operations and their dependence in the first method. The costs are in flops. The size of A, B, C, and D is m × m; the size of W, X, Y, and Z is m × n  67
5.3 The cost of operations and their dependence in the second method. The costs are in flops  68
5.4 The cost of operations in flops and their dependence in the third method  69
5.5 The cost of operations in flops and their dependence in the fourth method. The size of A, B, C, and D is m × m; the size of W, X, Y, and Z is m × n  70
5.6 Summary of the optimization methods  70
5.7 The cost of operations and their dependence for (5.10)  71
5.8 Operation costs in flops  75
5.9 Operation costs in flops for G^< with optimization for symmetry  77
6.1 Estimate of the computational cost for a 2D square mesh for different cluster sizes. The size is in units of N^{1/2}. The cost is in units of N^{3/2} flops  94

List of Figures

1.1 The model of a widely studied double-gate SOI MOSFET with ultra-thin intrinsic channel. Typical values of key device parameters are also shown  3
1.2 Left: example of 2D mesh to which RGF can be applied. Right: 5-point stencil  13
2.1 Partitions of the entire mesh into clusters  18
2.2 The partition tree and the complement tree of the mesh
Recommended publications
  • Eigenvalues and Eigenvectors of Tridiagonal Matrices∗
Electronic Journal of Linear Algebra, ISSN 1081-3810, a publication of the International Linear Algebra Society, Volume 15, pp. 115-133, April 2006. http://math.technion.ac.il/iic/ela

EIGENVALUES AND EIGENVECTORS OF TRIDIAGONAL MATRICES∗

SAID KOUACHI†

Abstract. This paper is a continuation of previous work by the present author, where explicit formulas for the eigenvalues associated with several tridiagonal matrices were given. In this paper the associated eigenvectors are calculated explicitly. As a consequence, a result obtained by Wen-Chyuan Yueh and independently by S. Kouachi, concerning the eigenvalues and in particular the corresponding eigenvectors of tridiagonal matrices, is generalized. Expressions for the eigenvectors are obtained that differ completely from those obtained by Yueh. The techniques used herein are based on the theory of recurrent sequences. The entries situated on each of the secondary diagonals are not necessarily equal, as was the case considered by Yueh.

Key words. Eigenvectors, Tridiagonal matrices.

AMS subject classifications. 15A18.

1. Introduction. The subject of this paper is the diagonalization of tridiagonal matrices. We generalize a result obtained in [5] concerning the eigenvalues and the corresponding eigenvectors of several tridiagonal matrices. We consider tridiagonal matrices of the form

\[
A_n = \begin{pmatrix}
-\alpha + b & c_1 & 0 & \cdots & \cdots & 0 \\
a_1 & b & c_2 & 0 & \cdots & 0 \\
0 & a_2 & b & c_3 & \ddots & \vdots \\
0 & 0 & \ddots & \ddots & \ddots & 0 \\
\vdots & \vdots & \ddots & \ddots & \ddots & c_{n-1} \\
0 & \cdots & \cdots & 0 & a_{n-1} & -\beta + b
\end{pmatrix}, \tag{1}
\]

where $\{a_j\}_{j=1}^{n-1}$ and $\{c_j\}_{j=1}^{n-1}$ are two finite subsequences of the sequences $\{a_j\}_{j=1}^{\infty}$ and $\{c_j\}_{j=1}^{\infty}$ of the field of complex numbers $\mathbb{C}$, respectively, and $\alpha$, $\beta$ and $b$ are complex numbers. We suppose that

\[
a_j c_j = \begin{cases} d_1^2, & \text{if } j \text{ is odd}, \\ d_2^2, & \text{if } j \text{ is even}, \end{cases} \qquad j = 1, 2, \ldots, \tag{2}
\]

where $d_1$ and $d_2$ are complex numbers.
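As a concrete instance of the matrix family (1), the sketch below builds a small $A_n$ with numpy and checks its eigenpairs numerically; the parameter values and sequences are my own illustrative choices satisfying condition (2), not data from the paper.

```python
import numpy as np

# Illustrative parameters (not from the paper): b on the main diagonal,
# corner perturbations -alpha and -beta, and off-diagonal sequences a_j, c_j
# with c_j = 1 so that a_j * c_j alternates between d1^2 (j odd) and d2^2 (j even).
n = 6
b, alpha, beta = 2.0, 0.5, 0.3
d1, d2 = 1.0, 1.5
a = np.array([d1**2 if j % 2 == 1 else d2**2 for j in range(1, n)])  # subdiagonal a_j
c = np.ones(n - 1)                                                   # superdiagonal c_j

A = np.diag(np.full(n, b)) + np.diag(a, -1) + np.diag(c, 1)
A[0, 0] -= alpha
A[-1, -1] -= beta

# Verify each computed eigenpair satisfies A v = lambda v.
lam, V = np.linalg.eig(A)
for k in range(n):
    assert np.allclose(A @ V[:, k], lam[k] * V[:, k])
print(np.sort(lam.real))
```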
  • Parametrizations of K-Nonnegative Matrices
Parametrizations of k-Nonnegative Matrices Anna Brosowsky, Neeraja Kulkarni, Alex Mason, Joe Suk, Ewin Tang∗ October 2, 2017 Abstract Totally nonnegative (positive) matrices are matrices whose minors are all nonnegative (positive). We generalize the notion of total nonnegativity, as follows. A k-nonnegative (resp. k-positive) matrix has all minors of size k or less nonnegative (resp. positive). We give a generating set for the semigroup of k-nonnegative matrices, as well as relations for certain special cases, i.e. the k = n − 1 and k = n − 2 unitriangular cases. In the above two cases, we find that the set of k-nonnegative matrices can be partitioned into cells, analogous to the Bruhat cells of totally nonnegative matrices, based on their factorizations into generators. We will show that these cells, like the Bruhat cells, are homeomorphic to open balls, and we prove some results about the topological structure of the closure of these cells; in fact, in the latter case, the cells form a Bruhat-like CW complex. We also give a family of minimal k-positivity tests which form sub-cluster algebras of the total positivity test cluster algebra. We describe ways to jump between these tests, and give an alternate description of some tests as double wiring diagrams. 1 Introduction A totally nonnegative (respectively totally positive) matrix is a matrix whose minors are all nonnegative (respectively positive). Total positivity and nonnegativity are well-studied phenomena and arise in areas such as planar networks, combinatorics, dynamics, statistics and probability. The study of total positivity and total nonnegativity admits many varied applications, some of which are explored in "Totally Nonnegative Matrices" by Fallat and Johnson [5].
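To make the definition concrete, here is a small hedged sketch (my illustration, not code from the paper) that tests k-nonnegativity by brute-force enumeration of all minors of size at most k:

```python
import numpy as np
from itertools import combinations

def is_k_nonnegative(M: np.ndarray, k: int, tol: float = 1e-12) -> bool:
    """Check that every minor of size <= k is nonnegative.

    Brute force: enumerate all j x j row/column index subsets for j = 1..k.
    Exponential in general -- fine for small illustrative matrices only.
    """
    n = M.shape[0]
    for j in range(1, k + 1):
        for rows in combinations(range(n), j):
            for cols in combinations(range(n), j):
                if np.linalg.det(M[np.ix_(rows, cols)]) < -tol:
                    return False
    return True

# A totally nonnegative example: it is k-nonnegative for every k <= 3.
M = np.array([[1.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
print([is_k_nonnegative(M, k) for k in (1, 2, 3)])  # [True, True, True]
```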
  • On Multigrid Methods for Solving Electromagnetic Scattering Problems
On Multigrid Methods for Solving Electromagnetic Scattering Problems. Dissertation for the academic degree of Doctor of Engineering (Dr.-Ing.), submitted to the Faculty of Engineering of the Christian-Albrechts-Universität zu Kiel by Simona Gheorghe, 2005. First reviewer: Prof. Dr.-Ing. L. Klinkenbusch. Second reviewer: Prof. Dr. U. van Rienen. Date of oral examination: 20 Jan. 2006.

Contents

1 Introductory remarks  3
  1.1 General introduction  3
  1.2 Maxwell's equations  6
  1.3 Boundary conditions  7
    1.3.1 Sommerfeld's radiation condition  9
  1.4 Scattering problem (Model Problem I)  10
  1.5 Discontinuity in a parallel-plate waveguide (Model Problem II)  11
  1.6 Absorbing-boundary conditions  12
    1.6.1 Global radiation conditions  13
    1.6.2 Local radiation conditions  18
  1.7 Summary  19

2 Coupling of FEM-BEM  21
  2.1 Introduction  21
  2.2 Finite element formulation  21
    2.2.1 Discretization  26
  2.3 Boundary-element formulation  28
  2.4 Coupling  32

3 Iterative solvers for sparse matrices  35
  3.1 Introduction  35
  3.2 Classical iterative methods  36
  3.3 Krylov subspace methods  37
    3.3.1 General projection methods  37
    3.3.2 Krylov subspace methods  39
  3.4 Preconditioning  40
    3.4.1 Matrix-based preconditioners  41
    3.4.2 Operator-based preconditioners  42
  3.5 Multigrid  43
    3.5.1 Full Multigrid  47

4 Numerical results  49
  4.1 Coupling between FEM and local/global boundary conditions  49
    4.1.1 Model problem I  50
    4.1.2 Model problem II  63
  4.2 Multigrid  64
    4.2.1 Theoretical considerations regarding the classical multigrid behavior in the case of an indefinite problem
  • Sparse Matrices and Iterative Methods
Sparse Matrices and Iterative Methods
K. Cooper, Department of Mathematics, Washington State University, 2018

Iterative Methods. Consider the problem of solving Ax = b, where A is n × n. Why would we use an iterative method?
• Avoid direct decomposition (LU, QR, Cholesky)
• Replace with iterated matrix multiplication
• LU is O(n^3) flops...
• ...matrix-vector multiplication is O(n^2)...
• ...so if we can get convergence in e.g. log(n) iterations, iteration might be faster.

Jacobi, GS, SOR. Some old methods:
• Jacobi is easily parallelized...
• ...but converges extremely slowly.
• Gauss-Seidel/SOR converge faster...
• ...but cannot be effectively parallelized.
• Only Jacobi really takes advantage of sparsity.

Sparsity. When a matrix is sparse (many more zero entries than nonzero), then typically the number of nonzero entries is O(n), so matrix-vector multiplication becomes an O(n) operation. This makes iterative methods very attractive. It does not help direct solves as much because of the problem of fill-in, but we note that there are specialized solvers to minimize fill-in.

Krylov Subspace Methods. A class of methods that converge in n iterations (in exact arithmetic). We hope that they arrive at a solution that is "close enough" in fewer iterations. Often these work much better than the classic methods. They are more readily parallelized, and take full advantage of sparsity.
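The trade-off described in these slides is easy to see in code. Below is a minimal Jacobi iteration on a sparse matrix (my illustration, not from the slides), with a diagonally dominant test matrix chosen so the method is guaranteed to converge:

```python
import numpy as np
import scipy.sparse as sp

def jacobi(A, b, iters=200):
    """Jacobi iteration: x_{k+1} = D^{-1} (b - (A - D) x_k).

    Each step costs one sparse matrix-vector product, O(nnz) = O(n) for a
    sparse A, versus O(n^3) for a dense LU factorization.
    """
    d = A.diagonal()
    R = A - sp.diags(d)          # off-diagonal part of A
    x = np.zeros_like(b)
    for _ in range(iters):
        x = (b - R @ x) / d
    return x

# Diagonally dominant tridiagonal test problem.
n = 1000
A = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)
x = jacobi(A, b)
print(np.linalg.norm(A @ x - b))  # small residual
```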
  • Scalable Stochastic Kriging with Markovian Covariances
Scalable Stochastic Kriging with Markovian Covariances Liang Ding and Xiaowei Zhang∗ Department of Industrial Engineering and Decision Analytics The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong Abstract Stochastic kriging is a popular technique for simulation metamodeling due to its flexibility and analytical tractability. Its computational bottleneck is the inversion of a covariance matrix, which takes O(n^3) time in general and becomes prohibitive for large n, where n is the number of design points. Moreover, the covariance matrix is often ill-conditioned for large n, and thus the inversion is prone to numerical instability, resulting in erroneous parameter estimation and prediction. These two numerical issues preclude the use of stochastic kriging at a large scale. This paper presents a novel approach to address them. We construct a class of covariance functions, called Markovian covariance functions (MCFs), which have two properties: (i) the associated covariance matrices can be inverted analytically, and (ii) the inverse matrices are sparse. With the use of MCFs, the inversion-related computational time is reduced to O(n^2) in general, and can be further reduced by orders of magnitude with additional assumptions on the simulation errors and design points. The analytical invertibility also enhances the numerical stability dramatically. The key in our approach is that we identify a general functional form of covariance functions that can induce sparsity in the corresponding inverse matrices. We also establish a connection between MCFs and linear ordinary differential equations. Such a connection provides a flexible, principled approach to constructing a wide class of MCFs. Extensive numerical experiments demonstrate that stochastic kriging with MCFs can handle large-scale problems in a manner that is both computationally efficient and numerically stable.
  • Chapter 7 Iterative Methods for Large Sparse Linear Systems
Chapter 7: Iterative methods for large sparse linear systems

In this chapter we revisit the problem of solving linear systems of equations, but now in the context of large sparse systems. The price to pay for the direct methods based on matrix factorization is that the factors of a sparse matrix may not be sparse, so that for large sparse systems the memory cost makes direct methods too expensive, in memory and in execution time. Instead we introduce iterative methods, for which matrix sparsity is exploited to develop fast algorithms with a low memory footprint.

7.1 Sparse matrix algebra

Large sparse matrices. We say that the matrix $A \in \mathbb{R}^{n \times n}$ is large if $n$ is large, and that $A$ is sparse if most of the elements are zero. If a matrix is not sparse, we say that the matrix is dense. Whereas for a dense matrix the number of nonzero elements is $O(n^2)$, for a sparse matrix it is only $O(n)$, which has obvious implications for the memory footprint and efficiency of algorithms that exploit the sparsity of a matrix.

A diagonal matrix is a sparse matrix $A = (a_{ij})$ for which $a_{ij} = 0$ for all $i \neq j$, and a diagonal matrix can be generalized to a banded matrix, for which there exists a number $p$, the bandwidth, such that $a_{ij} = 0$ for all $i < j - p$ or $i > j + p$. For example, a tridiagonal matrix $A$ is a banded matrix with $p = 1$,

\[
A = \begin{pmatrix}
x & x & 0 & 0 & 0 & 0 \\
x & x & x & 0 & 0 & 0 \\
0 & x & x & x & 0 & 0 \\
0 & 0 & x & x & x & 0 \\
0 & 0 & 0 & x & x & x \\
0 & 0 & 0 & 0 & x & x
\end{pmatrix}, \tag{7.1}
\]

where $x$ represents a nonzero element.

Compressed row storage. The compressed row storage (CRS) format is a data structure for efficient representation of a sparse matrix by three arrays, containing the nonzero values, the respective column indices, and the extents of the rows.
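To illustrate the three CRS arrays on a tridiagonal matrix shaped like (7.1), here is a short sketch (mine, not from the chapter) that builds them by hand and checks the result against scipy's CSR implementation, which stores exactly these arrays:

```python
import numpy as np
import scipy.sparse as sp

# Dense 6x6 tridiagonal matrix with the sparsity pattern of (7.1);
# the nonzero values themselves are arbitrary.
A = np.diag(2.0 * np.ones(6)) + np.diag(-np.ones(5), 1) + np.diag(-np.ones(5), -1)

# Build the three CRS arrays by scanning the rows.
values, col_idx, row_ptr = [], [], [0]
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        if A[i, j] != 0.0:
            values.append(A[i, j])
            col_idx.append(j)
    row_ptr.append(len(values))  # row i occupies entries row_ptr[i]:row_ptr[i+1]

# scipy's CSR format exposes the same arrays as .data, .indices, .indptr.
S = sp.csr_matrix(A)
assert np.allclose(S.data, values)
assert np.array_equal(S.indices, col_idx)
assert np.array_equal(S.indptr, row_ptr)
print(row_ptr)  # [0, 2, 5, 8, 11, 14, 16] -- 16 nonzeros in total
```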
  • A Geometric Theory for Preconditioned Inverse Iteration. III: a Short and Sharp Convergence Estimate for Generalized Eigenvalue Problems
A geometric theory for preconditioned inverse iteration. III: A short and sharp convergence estimate for generalized eigenvalue problems. Andrew V. Knyazev, Department of Mathematics, University of Colorado at Denver, P.O. Box 173364, Campus Box 170, Denver, CO 80217-3364. Klaus Neymeyr, Mathematisches Institut der Universität Tübingen, Auf der Morgenstelle 10, 72076 Tübingen, Germany. Abstract In two previous papers by Neymeyr: A geometric theory for preconditioned inverse iteration I: Extrema of the Rayleigh quotient, LAA 322: (1-3), 61-85, 2001, and A geometric theory for preconditioned inverse iteration II: Convergence estimates, LAA 322: (1-3), 87-104, 2001, a sharp, but cumbersome, convergence rate estimate was proved for a simple preconditioned eigensolver, which computes the smallest eigenvalue together with the corresponding eigenvector of a symmetric positive definite matrix, using a preconditioned gradient minimization of the Rayleigh quotient. In the present paper, we discover and prove a much shorter and more elegant, but still sharp in decisive quantities, convergence rate estimate of the same method that also holds for a generalized symmetric definite eigenvalue problem. The new estimate is simple enough to stimulate a search for a more straightforward proof technique that could be helpful to investigate such a practically important method as the locally optimal block preconditioned conjugate gradient eigensolver. We demonstrate the practical effectiveness of the latter for a model problem, where it compares favorably with two
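For intuition, the preconditioned gradient iteration the abstract refers to can be sketched in a few lines. This is a generic PINVIT-style illustration under my own assumptions (a Jacobi preconditioner and a 1D Laplacian test matrix), not the authors' code:

```python
import numpy as np

def pinvit(A, B_inv, x, iters=2000):
    """Preconditioned gradient iteration for the Rayleigh quotient:
    x <- x - B_inv (A x - rho(x) x), then normalize.
    Converges to the smallest eigenpair of SPD A for a good preconditioner B.
    """
    for _ in range(iters):
        rho = x @ (A @ x) / (x @ x)        # Rayleigh quotient
        x = x - B_inv @ (A @ x - rho * x)  # preconditioned residual step
        x /= np.linalg.norm(x)
    return rho, x

# SPD test matrix: discrete 1D Laplacian; Jacobi preconditioner B = diag(A).
n = 50
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
B_inv = np.diag(1.0 / np.diag(A))
rho, x = pinvit(A, B_inv, np.random.default_rng(0).standard_normal(n))
print(rho, np.linalg.eigvalsh(A)[0])  # estimate vs. true smallest eigenvalue
```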
  • A Parallel Solver for Graph Laplacians
A Parallel Solver for Graph Laplacians Tristan Konolige, University of Colorado Boulder; Jed Brown, University of Colorado Boulder. [email protected] ABSTRACT Problems from graph drawing, spectral clustering, network flow and graph partitioning can all be expressed in terms of graph Laplacian matrices. There are a variety of practical approaches to solving these problems in serial. However, as problem sizes increase and single core speeds stagnate, parallelism is essential to solve such problems quickly. We present an unsmoothed aggregation multigrid method for solving graph Laplacians in a distributed memory setting. We introduce new parallel aggregation and low degree elimination algorithms targeted specifically at irregular degree graphs. These algorithms are expressed in terms of sparse matrix-vector products using generalized sum and product operations. This formulation is amenable to linear algebra using arbitrary distributions and allows us to operate on a 2D sparse matrix distribution, which is necessary for parallel scalability.

... low energy components while maintaining coarse grid sparsity. We also develop a parallel algorithm for finding and eliminating low degree vertices, a technique introduced for a sequential multigrid algorithm by Livne and Brandt [22], that is important for irregular graphs. While matrices can be distributed using a vertex partition (a 1D/row distribution) for PDE problems, this leads to unacceptable load imbalance for irregular graphs. We represent matrices using a 2D distribution (a partition of the edges) that maintains load balance for parallel scaling [11]. Many algorithms that are practical for 1D distributions are not feasible for more general distributions. Our new parallel algorithms are distribution agnostic.

2 BACKGROUND
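As a reminder of the object being solved for (my illustration, not from the paper): the graph Laplacian is L = D − A for the degree matrix D and adjacency matrix A, and its rows sum to zero, so the constant vector lies in its null space.

```python
import numpy as np
import scipy.sparse as sp

# Adjacency matrix of a small undirected graph (a 4-cycle plus one chord),
# chosen arbitrarily for illustration.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n = 4
A = sp.lil_matrix((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
A = A.tocsr()

# Graph Laplacian: L = D - A, with D the diagonal matrix of vertex degrees.
degrees = np.asarray(A.sum(axis=1)).ravel()
L = sp.diags(degrees) - A

print(L.toarray())
print(L @ np.ones(n))  # all zeros: rows sum to zero
```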
  • (Hessenberg) Eigenvalue-Eigenmatrix Relations∗
(HESSENBERG) EIGENVALUE-EIGENMATRIX RELATIONS∗ JENS-PETER M. ZEMKE† Abstract. Explicit relations between eigenvalues, eigenmatrix entries and matrix elements are derived. First, a general, theoretical result based on the Taylor expansion of the adjugate of $zI - A$ on the one hand and explicit knowledge of the Jordan decomposition on the other hand is proven. This result forms the basis for several, more practical and enlightening results tailored to non-derogatory, diagonalizable and normal matrices, respectively. Finally, inherent properties of (upper) Hessenberg, resp. tridiagonal matrix structure are utilized to construct computable relations between eigenvalues, eigenvector components, eigenvalues of principal submatrices and products of lower diagonal elements. Key words. Algebraic eigenvalue problem, eigenvalue-eigenmatrix relations, Jordan normal form, adjugate, principal submatrices, Hessenberg matrices, eigenvector components AMS subject classifications. 15A18 (primary), 15A24, 15A15, 15A57 1. Introduction. Eigenvalues and eigenvectors are defined using the relations

\[
Av = v\lambda \quad \text{and} \quad V^{-1}AV = J. \tag{1.1}
\]

We speak of a partial eigenvalue problem when, for a given matrix $A \in \mathbb{C}^{n \times n}$, we seek a scalar $\lambda \in \mathbb{C}$ and a corresponding nonzero vector $v \in \mathbb{C}^n$. The scalar $\lambda$ is called the eigenvalue and the corresponding vector $v$ is called the eigenvector. We speak of the full or algebraic eigenvalue problem when, for a given matrix $A \in \mathbb{C}^{n \times n}$, we seek its Jordan normal form $J \in \mathbb{C}^{n \times n}$ and a corresponding (not necessarily unique) eigenmatrix $V \in \mathbb{C}^{n \times n}$. Apart from these constitutional relations, for some classes of structured matrices several more intriguing relations between components of eigenvectors, matrix entries and eigenvalues are known. For example, consider the so-called Jacobi matrices.
  • Solving Linear Systems: Iterative Methods and Sparse Systems
Solving Linear Systems: Iterative Methods and Sparse Systems
COS 323

Last time
• Linear system: Ax = b
• Singular and ill-conditioned systems
• Gaussian Elimination: a general-purpose method
  – Naïve Gauss (no pivoting)
  – Gauss with partial and full pivoting
  – Asymptotic analysis: O(n^3)
• Triangular systems and LU decomposition
• Special matrices and algorithms:
  – Symmetric positive definite: Cholesky decomposition
  – Tridiagonal matrices
• Singularity detection and condition numbers

Today: Methods for large and sparse systems
• Rank-one updating with Sherman-Morrison
• Iterative refinement
• Fixed-point and stationary methods
  – Introduction
  – Iterative refinement as a stationary method
  – Gauss-Seidel and Jacobi methods
  – Successive over-relaxation (SOR)
• Solving a system as an optimization problem
• Representing sparse systems

Problems with large systems
• Gaussian elimination, LU decomposition (factoring step) take O(n^3)
• Expensive for big systems!
• Can get by more easily with special matrices
  – Cholesky decomposition: for symmetric positive definite A; still O(n^3) but halves storage and operations
  – Band-diagonal: O(n) storage and operations
• What if A is big? (And not diagonal?)

Special Example: Cyclic Tridiagonal
• Interesting extension: cyclic tridiagonal
• Could derive yet another special-case algorithm, but there's a better way

Updating Inverse
• Suppose we have some fast way of finding A^{-1} for some matrix A
• Now A changes in a special way: A* = A + uv^T for some n×1 vectors u and v
• Goal: find a fast way of computing (A*)^{-1}
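The update the last slide leads up to is the standard Sherman-Morrison identity, $(A + uv^T)^{-1} = A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u}$. A small sketch (my illustration, not from the slides) checking it numerically:

```python
import numpy as np

def sherman_morrison_inv(A_inv, u, v):
    """Update A^{-1} after a rank-one change A* = A + u v^T.

    Costs O(n^2) given A^{-1}, versus O(n^3) for refactoring from scratch.
    """
    Au = A_inv @ u
    vA = v @ A_inv
    return A_inv - np.outer(Au, vA) / (1.0 + v @ Au)

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned test matrix
u, v = rng.standard_normal(n), rng.standard_normal(n)

A_inv = np.linalg.inv(A)
updated = sherman_morrison_inv(A_inv, u, v)
print(np.allclose(updated, np.linalg.inv(A + np.outer(u, v))))  # True
```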
  • Explicit Inverse of a Tridiagonal (P, R)–Toeplitz Matrix
Explicit inverse of a tridiagonal (p, r)-Toeplitz matrix A.M. Encinas, M.J. Jiménez, Departament de Matemàtiques, Universitat Politècnica de Catalunya Abstract Tridiagonal matrices appear in many contexts in pure and applied mathematics, so the study of the inverse of these matrices becomes of specific interest. In recent years the invertibility of nonsingular tridiagonal matrices has been quite investigated in different fields, not only from the theoretical point of view (either in the framework of linear algebra or in the ambit of numerical analysis), but also due to applications, for instance in the study of sound propagation problems or certain quantum oscillators. However, explicit inverses are known only in a few cases, in particular when the tridiagonal matrix has constant diagonals or when the coefficients of these diagonals are subject to some restrictions, as in the tridiagonal p-Toeplitz matrices [7], whose three diagonals are formed by p-periodic sequences. The recent formulae for the inversion of tridiagonal p-Toeplitz matrices are based, more or less directly, on the solution of second-order linear difference equations, although most of them use a cumbersome formulation that in fact does not take into account the periodicity of the coefficients. This contribution presents the explicit inverse of a tridiagonal (p, r)-Toeplitz matrix, whose diagonal coefficients lie in a more general class of sequences than the periodic ones, which we have called quasi-periodic sequences. A tridiagonal matrix $A = (a_{ij})$ of order $n + 2$ is called (p, r)-Toeplitz if there exists $m \in \mathbb{N}_0$ such that $n + 2 = mp$ and

\[
a_{i+p,\,j+p} = r\, a_{ij}, \qquad i, j = 0, \ldots, (m-1)p.
\]

Equivalently, $A$ is a (p, r)-Toeplitz matrix iff

\[
a_{i+kp,\,j+kp} = r^k a_{ij}, \qquad i, j = 0, \ldots, p, \quad k = 0, \ldots, m-1.
\]

We have developed a technique that reduces any linear second-order difference equation with periodic or quasi-periodic coefficients to a difference equation of the same kind but with constant coefficients [3].
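To make the definition tangible, here is a tiny hedged sketch (not from the paper) that generates a tridiagonal matrix from one period of its three diagonals, scaling each successive period by r, and verifies the defining relation a_{i+p, j+p} = r a_{ij}:

```python
import numpy as np

def tridiag_pr_toeplitz(main, lower, upper, r, m):
    """Build a tridiagonal (p, r)-Toeplitz matrix of order m*p from one
    period (length p) of each diagonal; period k is scaled by r**k."""
    p = len(main)
    n = m * p
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i] = r ** (i // p) * main[i % p]
    for i in range(n - 1):
        A[i, i + 1] = r ** (i // p) * upper[i % p]
        A[i + 1, i] = r ** (i // p) * lower[i % p]
    return A

# Illustrative parameters: period p = 2, ratio r = 3, m = 3 periods.
p, r, m = 2, 3.0, 3
A = tridiag_pr_toeplitz(main=[2.0, 5.0], lower=[1.0, -1.0], upper=[4.0, 0.5], r=r, m=m)

# Check the defining relation a_{i+p, j+p} = r * a_{ij} on all valid indices.
n = m * p
ok = all(np.isclose(A[i + p, j + p], r * A[i, j])
         for i in range(n - p) for j in range(n - p))
print(ok)  # True
```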
  • The Godunov–Inverse Iteration: a Fast and Accurate Solution to the Symmetric Tridiagonal Eigenvalue Problem
The Godunov-Inverse Iteration: A Fast and Accurate Solution to the Symmetric Tridiagonal Eigenvalue Problem Anna M. Matsekh, Institute of Computational Technologies, Siberian Branch of the Russian Academy of Sciences, Lavrentiev Ave. 6, Novosibirsk 630090, Russia. Present address: Modeling, Algorithms and Informatics Group, Los Alamos National Laboratory, P.O. Box 1663, MS B256, Los Alamos, NM 87544, USA. Email address: [email protected] (Anna M. Matsekh). Abstract We present a new hybrid algorithm based on Godunov's method for computing eigenvectors of symmetric tridiagonal matrices and Inverse Iteration, which we call the Godunov-Inverse Iteration. We use eigenvectors computed according to Godunov's method as starting vectors in the Inverse Iteration, replacing any non-numeric elements of Godunov's eigenvectors with random uniform numbers. We use the right-hand bounds of the Ritz intervals found by the bisection method as Inverse Iteration shifts, while staying within guaranteed error bounds. In most test cases convergence is reached after only one step of the iteration, producing error estimates that are as good as or superior to those produced by standard Inverse Iteration implementations. Key words: Symmetric eigenvalue problem, tridiagonal matrices, Inverse Iteration 1 Introduction The construction of algorithms that make it possible to find all eigenvectors of the symmetric tridiagonal eigenvalue problem with guaranteed accuracy in O(n^2) arithmetic operations has become one of the most pressing problems of modern numerical algebra. The QR method, one of the most accurate methods for solving eigenvalue problems, requires 6n^3 arithmetic operations and O(n^2) square-root operations to compute all eigenvectors of a tridiagonal matrix [Golub and Van Loan (1996)]. Preprint submitted to Elsevier Science, 31 January 2003.
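The abstract combines two standard ingredients; the second, shifted inverse iteration, is easy to sketch. Below is a generic hedged illustration (not the paper's Godunov-based variant): given a shift near an eigenvalue, repeatedly solving (A − σI) y = x and normalizing converges to the corresponding eigenvector, often in one or two steps when the shift is good, which is the regime the paper exploits.

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal, solve

def inverse_iteration(A, sigma, iters=3, seed=0):
    """Shifted inverse iteration: converges to the eigenvector whose
    eigenvalue is closest to the shift sigma."""
    n = A.shape[0]
    x = np.random.default_rng(seed).standard_normal(n)
    M = A - sigma * np.eye(n)
    for _ in range(iters):
        x = solve(M, x)
        x /= np.linalg.norm(x)
    return x

# Symmetric tridiagonal test matrix: discrete 1D Laplacian.
n = 50
d, e = 2.0 * np.ones(n), -1.0 * np.ones(n - 1)
A = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)

lam = eigh_tridiagonal(d, e, select="i", select_range=(0, 0))[0][0]
v = inverse_iteration(A, sigma=lam + 1e-6)  # shift just above the eigenvalue
print(np.linalg.norm(A @ v - lam * v))      # small residual
```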