Sparse Approximate Inverse Methods

Bachelor thesis in Applied Mathematics July 2013 Student : K. van Geffen Primary Supervisor : dr. B. Carpentieri Secondary Supervisor : prof. dr. H. Waalkens Abstract In undergraduates numerical mathematics courses I was strongly warned that inverting a for computational purposes is generally very inefficient. Not only do we have to do more computation than factorization, but we also lose sparsity in the matrix. However, doing a search in the literature, I found that for many computational settings the inverse, although dense, may contain many small entries that can be dropped. As a result we approximate the inverse by a . Techniques for constructing such a sparse approximate inverse can be effectively used in many applications of numerical analysis, e.g. for preconditioning of linear systems and for smoothing multigrid methods. I describe some of the most popular algorithms and through some theory, numerical experiments and examples of applications, I show that sparse approximate inverse methods can be competitive, sometimes even superior, to standard factorization methods based on incomplete LU decomposition. Keywords: Sparse approximate inverse, Preconditioning, Krylov subspace methods, Iterative methods

i Acknowledgements

I would like to gratefully thank my primary supervisor dr. B. Carpentieri for the supervision during this work, the research of this thesis would not have been possible without his expertise on the subject. I would also like to thank my primary supervisor for providing me with an abundance of the needed material and for his extensive feedback on the work during the process. Thanks to his guidance the writing of this thesis has been a great learning experience.

ii Contents

1 Introduction 1

2 Krylov Methods 2 2.1 Theoretical background on Krylov Methods ...... 2 2.1.1 The nonsingular case ...... 3 2.1.2 The singular case ...... 3 2.1.3 Some remarks on Krylov methods ...... 5 2.2 Arnoldi Method ...... 5 2.3 Generalization on Richardson iterations ...... 6 2.4 Krylov subspace methods ...... 7 2.4.1 Ritz-Galerkin projection ...... 7 2.4.2 The minimum residual approach ...... 8 2.4.3 Petrov-Galerkin projection ...... 9 2.4.4 The minimum error approach ...... 10

3 Preconditioning 11 3.1 The concept of preconditioning ...... 11 3.1.1 Discussion on preconditioning ...... 14 3.2 Preconditioning techniques ...... 16 3.2.1 Incomplete LU-factorization ...... 16 3.2.2 Zero Fill-In ILU (ILU(0)) ...... 18 3.2.3 ILU(p) ...... 18 3.2.4 ILUT ...... 20

4 Sparse Approximate Inverse Preconditioners 21 4.1 Motivation to sparse approximate inverse techniques ...... 22 4.1.1 A priori pattern selection ...... 23 4.2 Frobenius norm minimization ...... 25 4.2.1 SPAI ...... 26 4.3 Factorized sparse approximate inverses ...... 28 4.3.1 FSAI ...... 28 4.3.2 AINV ...... 30 4.3.3 Factorized vs unfactorized preconditoners ...... 32 4.4 Approximate inverses of ILU-factorizations ...... 32

5 Software Implementation 34

iii 5.1 Software libraries and packages ...... 34 5.1.1 PETSc ...... 35 5.1.2 SPAI ...... 35 5.1.3 ParaSails ...... 36 5.1.4 HSL MI12 ...... 37

6 Numerical experiments 39 6.1 Comparative study on SPAI and ILUT preconditioners ...... 39 6.2 Electromagnetic Scattering - the Satellite Problem ...... 43 6.2.1 Spectral deflation ...... 45

7 Conclusion 49

References 50

Appendices 51

A ILUT algorithm 51

B MATLAB code for GMRES(m) 51

iv 1 Introduction

In my thesis I review state-of-the-art techniques for computing sparse approximate inverses. That is, we consider methods for constructing a sparse approximation to the inverse of some matrix. These techniques can be effectively used in many applications of numerical analysis, e.g. for preconditioning of linear systems and for smoothing multigrid methods. For the purpose of this thesis we are mainly interested in linear systems, the problems we address to are of the form

Ax = b, (1.1) where A is an n × n large and sparse matrix with real entries and b is a given, real, right-hand side. Sparse linear systems are ubiquitous in computational science. They arise, for instance, in the numerical solution of partial differential equations, of inverse problems and in optimization. Direct methods are in general very robust, but they are not feasible for large systems; they are expensive in terms of both work and memory. Modern iterative methods, namely Krylov subspace methods, can solve the bottleneck of memory, but it is now well established that convergence might be slow when the coefficient matrix A is highly nonsymmetric and/or indefinite. With the assistance of a preconditioner, we could try to transform system (1.1) into an equivalent system which is more amenable to an iterative solver. Well-known standard preconditioning techniques compute a nonsingular approximation M to the matrix A and the equivalent system might for example be M −1Ax = M −1b. Usually the construction of M is based on an incomplete LU decomposition. In contrast, we could construct a preconditioner M that directly approximates the inverse matrix, i.e. M ≈ A−1. In this case the equivalent system might be given by MAx = Mb. From numerical experiments it is observed that the inverse of sparse matrix is typically dense, but a lot of the entries are of small magnitude. This justifies the motivation to approximate the inverse by a sparse matrix. In the last decade, a significant amount of research has been devoted to develop such preconditioning techniques. There are several advantages of sparse approximate inverse techniques. The most obvious one is that the preconditioning operation reduces to forming one (or more) matrix-vector products, whereas for standard techniques we require a linear solve. Hence, approximate inverse meth- ods are generally less prone to instabilities. Another issue we address to in this thesis is that sparse approximate inverses are suitable for a parallel environment; matrix-vector products are nowadays already efficiently implemented in parallel, but in many cases the constructing phase of the sparse approximate inverse preconditioner is implemented in parallel as well. The stan- dard techniques, however, are highly sequential. Other issues we address to in these thesis are: pattern selection, the use of factorized or non-factorized preconditioners and numerical software packages. The structure of the thesis is as follows. In Section 2 we give a theoretical background on Krylov subspace methods. In Section 3 we discuss the concept of preconditioning and we review some well-known standard preconditioning techniques. The most popular sparse approximate inverse techniques are described in Section 4. Next, in Section 5, we will see what software packages are provided for some of the described algorithms. In Section 6 we conduct some numerical experiments and we will see an application of sparse approximate inverse in a practical implementation. Finally, in Section 7 we draw some conclusions arising from the work.

1 2 Krylov Methods

In the field of iterative techniques for solving linear systems, Krylov methods are nowadays among the most important. Many well known algorithms, such as the Generalized Minimum Residual (GMRES), Conjugate Gradient (CG), Full Orthogonalization Method (FOM), Lanczos Method, Biconjugate Gradient (Bi-CG) and Biconjugate Stabilized (Bi-CGSTAB) are based on projection techniques on Krylov subspaces [15]. Depending on the properties of the system, one method tends to be more efficient than the other. Later on we give an overview of these methods and tell for what kind of systems each particular method is suitable. But first we will give a theoretical background on Krylov methods and show that Krylov methods can be introduced as a generalization of Richardson iterations. Recall that the standard Richardson iteration is based on the A = I − (I − A). Similar techniques that are based on matrix splitting are the Jacobi and Gauss-Seidel method. As it turns out, Krylov methods are a generalization of these type of methods as well.

2.1 Theoretical background on Krylov Methods

In this theoretical background we do not concern ourselves with technical details that come into play when we consider numerical applications, nor do we go into detail on the concept of preconditioning yet. For now we will just focus on the idea behind the Krylov methods. This means that the following is based on exact arithmetic.

Now, as in all iterative methods, we start with an initial guess x(0). For the purpose of this thesis we assume that there is no information available on the solution x to the system Ax = b and hence we will always choose x(0) = 0. Krylov methods search in the kth iteration for an approximated solution x(k) in the Krylov subspace generated by the vector b, which is defined by1:

k−1 Kk(A, b) := span{b, Ab, . . . , A b}. (2.1)

The subspace is usually simply referred to by Kk. At first glance it is not clear why it makes sense to search for a solution in a Krylov subspace and it is certainly not right away clear why a Krylov method has the opportunity to converge fast. Ilse C.F. Ipsen and Carl D. Meyer tried to answer these questions in an article on Krylov methods [12]. In this theoretical background on Krylov subspace methods, we will follow their reasoning starting with the nonsingular case. After that we consider the singular case and give some restrictions to b that need to hold in order for a Krylov solution x to exist. We will always make a distinction between regular solutions and Krylov solutions. We already know that a regular solution exists if and only if b is in the range of A, i.e. b ∈ R(A). We will see that for singular matrices we have to meet slightly different restrictions for the existence of Krylov solutions.

1In other literature you may find that instead of the vector b, the Krylov subspace is generated by the vector r(0) := b − Ax(0). This is a generalization for methods that start with a nonzero vector.

2 2.1.1 The nonsingular case

When A is nonsingular we know that the unique solution is given by x = A−1b. We wish to find the minimal integer k such that x ∈ Kk. The following definition will help us to find this integer k.

Definition 2.1 The minimal polynomial q(t) of A is the unique monic polynomial of least degree such that q(A) = 0. 

As a consequence of the Cayley-Hamilton theory, the degree of q(t) can not exceed n. Using the fact that every matrix A is similar to a Jordan matrix J, i.e. A = XJX−1, we can construct the minimal polynomial q(t) from J. This is because 0 = q(A) = q(XJX−1) = Xq(J)X−1 and hence q(J) = 0. Let A have d distinct eigenvalues λ1, . . . , λd, then to each λj we can assign an index mj which is the size of the largest Jordan block associated to λj. This allows us to determine the minimal polynomial of A:

d d X Y mj m := mj and q(t) = (t − λj) (2.2) j=1 j=1 where q(t) is of degree m. Notice that this equation holds for singular matrices as well. However, only when A is nonsingular we can use this polynomial to express A−1 in terms of a finite sum of powers from A. First we write q(t) as follows:

m X j q(t) = αjt j=0

Qd mj where α0 = j=1(−λj) . To express the inverse of A in terms of powers of A, we proceed as follows:

m−1 1 X 0 = q(A) = α I + α A + ... + α Am and hence A−1 = − α Aj. (2.3) 0 1 m α j+1 0 j=0

In case A is singular we notice α0 = 0 and the inverse of eqn (2.3) is not well defined. For −1 nonsingular matrices we notice x = A b and hence x ∈ Km. We formulate the result in the following theorem.

Theorem 2.2 Let q(t) be the minimal polynomial of degree m to the nonsingular matrix A, then −1 Km is the smallest Krylov subspace that contains the exact solution x = A b. 

The minimal polynomial gives us some insight on why it makes sense to look for a solution in a Krylov subspace. Also, when A has a low degree minimal polynomial, we might expect fast convergence. Next we will see what the Krylov method computes to singular systems.

2.1.2 The singular case

As said before, every matrix is similar to a Jordan matrix. For this reason we confine this discussion to Jordan matrices, because with a little extra work one could generalize the idea using a similarity transformation. The Jordan matrix is unique up to a permutation of the

3 diagonal blocks of which it exists. Moreover, Jordan blocks from zero eigenvalues are nilpotent; a k × k Jordan block from a zero eigenvalue is a nilpotent matrix of index k. So we can even confine this discussion a little more to Jordan matrices AJ of the form: C 0  A = (2.4) J 0 N where C is a nonsingular Jordan matrix and N is a singular nilpotent matrix of index i (meaning that i is the smallest positive integer such that N i = 0, of course i is equal to the index of the zero eigenvalue of A). To determine whether a Krylov solution exist, we need the following lemma which is stated without proof (see [12]).

Lemma 2.3 For the nilpotent system Nx = c, a Krylov solution for a nonzero right-hand never exists, even when a regular solution does exist. 

With this in mind, we try to find a Krylov solution to the system AJ x = b where AJ is of the form (2.4). First partition both x and b as follows x  b  x = 1 and b = 1 x2 b2 such that Ax = b implies Cx1 = b1 and Nx2 = b2. The second is a nilpotent system and we T know that b2 = 0 must hold in order for a Krylov solution x2 to exist, i.e. b = (b1 0) . Also notice: i C 0  Ci 0  Ci 0 Ai = = = 0 N 0 N i 0 0 From this we conclude that for singular Jordan matrices, a Krylov solution can only exist if b ∈ R(Ai). To see that this is also a sufficient condition we first consider the matrix C. This matrix is nonsingular and hence we know that there exist a minimal polynomial q(t) of degree m − i and a corresponding polynomial p(t) of degree m − i − 1 such that p(C) = C−1 (m can T i not exceed n). Let b = (b1 0) ∈ R(A ), the solution satisfies:

C−1b  p(C) 0 b  p(C) 0  b  x = 1 = 1 = 1 = p(A)b ∈ K (A, b). 0 0 0 0 0 p(N) 0 m−i Ipsen and Meyer proved in their article on Krylov methods that the Krylov solution can be found with a certain pseudo inverse, namely the Drazin inverse2. They also proved that this Krylov solution is unique. In stead of following the proof, we give the conclusive theorem that tells us something about the existence and uniqueness of a Krylov solution for singular systems.

Theorem 2.4 For the singular system Ax = b, a Krylov solution exists and is unique if and only if b ∈ R(Ai) where i is the index of the zero eigenvalue of A. Let m − i be the degree of the D minimal polynomial of A, then for the Krylov solution it holds that x = A b ∈ Km−i(a, b) where AD is the Drazin inverse of A. If b∈ / R(Ai) then the system does not have a Krylov solution in Kn(A, b). 

In conclusion, whenever a linear system is solved with a Krylov subspace solver, either for nonsingular or singular systems, the solution that is found (if it exists) is unique and does not depend on the choice of x(0) .

2For nonsingular matrices, the Drazin inverse generalizes to the ordinary inverse. This shows that there is a gradual transition from the singular to the nonsingular case

4 2.1.3 Some remarks on Krylov methods

Recall the definition of the Krylov subspace from eqn (2.1). From the power method we know that as k increases, the vector Ak−1b becomes more and more part of the eigenspace that corresponds λ2 to dominant eigenvalue of A. Convergence to this eigenspace is of the order O(| |) where λ1 λ1 and λ2 are the dominant and second dominant eigenvalue. In general we might expect that, for k sufficiently large, the set {b, Ab, . . . , Ak−1b} becomes more and more linear dependent. From a numerical point of view this is not favorable and for that reason the Krylov space is always computed in a way such that the space is spanned by an orthogonal set. This is done by an algorithm called the Arnoldi method, see subsection 2.2.

As there exist an iteration step m at which the exact solution is found, Krylov methods could be interpreted as direct methods. However, in practical implementations, one would like to terminate the process of iterating far before the mth step. Also, in general one is satisfied with an approximated solution as long as the approximation holds within reasonable bounds. Since we terminate the process after some k << m steps, we will actually interpret Krylov subspace solvers as iterative techniques.

2.2 Arnoldi Method

th We start of with a Krylov subspace Ki(A, b) at the i iteration step and we assume that we know an orthogonal basis {v1, . . . , vi} for it. In the next iteration step we would like to extend the Krylov subspace with another vector, but now we are only interested in extending the orthogonal basis. In order to do this we compute the vector Avi and orthogonalize it with respect to the vectors v1, . . . , vi. The orthogonalized vector vi+1 is computed in the following way:

i X hi+1,ivi+1 = Avi − hj,ivj (2.5) j=1 where hi+1,i is a constant chosen such that the vector vi+1 is normalized. With the constants hj,i = (vj, Avi) it is ensured that (vj, vi+i) = 0 for j = 1, . . . , i and hence {v1, . . . , vi+1} is an extended orthogonal basis for Ki+1. The process is started with choosing v1 = b/kbk2. i−1 Intuitively speaking, we can say that vi is the normalized part of A b that is linear independent of span{b, Ab, . . . , Ai−2b}. From the theory on Krylov methods we also know that the process terminates at step m, where m is the degree of the minimal polynomial. So when we follow the Arnoldi method we can state:

Ki(A, b) = span{v1, . . . , vi}, ∀i ≤ m (2.6) For i = m the Krylov subspace becomes invariant under A. Recall the definition on invariance.

Definition 2.5 A set V is invariant under A if for every v ∈ V it holds that Av ∈ V. 

There is a very compact form of notation for the Arnoldi method. When we define the n × i matrix Vi which consist of the column vectors v1, . . . , vi and the (i + 1) × i upper Hessenberg matrix Hi+1,i where the elements hk,l are defined by the Arnoldi method, then from eqn (2.5) it is clear that the following holds: AVi = Vi+1Hi+1,i (2.7) This form of notation will be very helpful in understanding certain Krylov methods.

5 2.3 Generalization on Richardson iterations

We will now consider iterative solvers that are based on matrix splitting. It will become clear that these solvers indirectly search for an approximated solution in some Krylov subspace of increasing dimension in each iteration step. As an example we first consider the Richardson iteration, based on the standard splitting A = I − (I − A). The corresponding iteration is given by: x(i) = b + (I − A)x(i−1) = x(i−1) + r(i−1) where r(i−1) := b − Ax(i−1) (2.8) and an iterative scheme for the residual is given by:

r(i) = (I − A)r(i−1) = (I − A)ir(0). (2.9)

In general, matrix splitting is of the form A = N − P where N is considered to be some approximation of A for which a linear system of the form Ny = d is easily solvable and requires low costs on operations and memory. This idea is also on the basis of preconditioning and we will return to this subject later.

For the Richardson iteration we have chosen N = I and we hope that the polynomial Pi(A) = (I−A)i converges to zero reasonably fast. Of course in most cases, I will be a poor approximation of A. Alternatives are the Jacobi and Gauss-Seidel methods, where one chooses N to be the diagonal part and the lower triangular part of A respectively. Whenever N is chosen to be unequal to identity, we can view N as an preconditioner on the system Ax = b. Defining B := N −1A and c := N −1b the system to be solved becomes Bx = c and the splitting on A with N 6= I becomes equivalent with the standard splitting on B to the transformed system. Hence, if the Richardson Iteration generalizes to Krylov subspace solver, all iterative solvers based on matrix splitting do. See e.g. [9, Chapter 7].

We again assume x(0) = 0 and hence r(0) = b. Using equations (2.8) and (2.9) we obtain:

i−1 X x(i) = b + r(1) + ... + r(i−1) = (I − A)jb i=0 from which we conclude

(i+1) i−1 x ∈ span{b, Ab, ..., A b} = Ki(A, b).

As it seems, Richardson iteration approximates solutions to the linear system indirectly from a Krylov subspace of increasing dimension. In Krylov subspace methods we compute the best approximation to the exact solution that is contained in the Krylov subspace of increasing di- mension. Hence, Krylov methods are more efficient than algorithms that are based on matrix splitting.

6 2.4 Krylov subspace methods

(k) In Krylov subspace methods we are looking for a solution x ∈ Kk that is in some sense the best approximation to the solution x. Finding a solution in a subspace is done by projection and there are different types of projection that we could use. In all projections we are dealing with some sort of minimization problem. Depending on the problem, one type of minimization may be numerical more favorable then the other. We consider four types of projections [9, Chapter 8]:

1. The Ritz-Galerkin approach requires that

(k) b − Ax ⊥ Kk(A, b) (2.10)

2. The minimum residual approach requires:

(k) kb − Ax k2 to be minimal over Kk(A, b) (2.11)

3. The Petrov-Galerkin approach requires that:

b − Ax(k) is orthogonal to some other suitable k-dimensional subspace (2.12)

4. The minimum error approach requires:

(k) T T kx − x k2 to be minimal over A Kk(A , b) (2.13)

We will treat briefly some (aforementioned) popular methods. Methods as CG, the Lanczos method and FOM rely on the Ritz-Galerkin approach. The minimum residual approach has lead to methods as GMRES and MINRES. The Petrov-Galerkin approach gives rise to methods like Bi-CG and BI-CGSTAB. And at last, a method that relies on the minimum error approach is for example GMERR.

2.4.1 Ritz-Galerkin projection

We begin with a formulation of the the solution to the best approximation in a general sense: (k) k since x ∈ Kk = R(Vk), see eqn (2.7), we know that there exist y ∈ R such that:

(k) x = Vky. (2.14)

When we have constructed a basis {v1, . . . , vk} the best approximation is determined by finding y. This will hold in general for Krylov methods, but the solution we may find depends on the type of projection we choose. We will begin with the Ritz-Galerkin approach, recall that for (k) this approach we require that b − Ax ⊥ Kk(A, b). The most common method based on this approach is the Full Orthogonalization Method (FOM).

Full Orthogonalization Method In FOM we do not require A to possess any special properties and we can just apply the Ritz- Galerkin projection. Hence, we find that y must satisfy the following condition:

T (k) T Vk (b − Ax ) = Vk (b − AVky) = 0.

7 T From Arnoldi Method we know that b = kbk2v1 and hence Vk b = kbk2e1, where e1 is the first k T T canonical unit vector in R . From Arnoldi Method we also know that Vk AVk = Vk Vk+1Hk+1,k = Hk,k. If we combine this results with the Ritz-Galerkin projection, we find that y is the solution of the following linear system: Hk,ky = kb|k2e1 This system can be quite efficiently solved by performing k − 1 Givens rotations on the diagonal- and subdiagonal entries of Hk,k, after which we need to solve a lower triangular system. In essence we are performing an efficient QR factorization on the Hessenberg matrix. As it turns out this method becomes significantly more efficient when A is symmetric, it is known as the Lanczos method. If A is symmetric positive definite (SPD) there are some additional numerical benefits which lead to the even more efficient Conjugate Gradient method (CG).

Lanczos method T To see why a symmetric matrix leads to a more efficient algorithm, consider the matrix Vk AVk. As we know it equals an upper Hessenberg matrix with entries we know from Arnoldi’s method, moreover we see that it is symmetric and we conclude that the Hessenberg matrix is tri-diagonal. T The corresponding notation is as follows: Vk AVk = Tk,k. And now the best approximation is found by solving the tri-diagonal system:

Tk,ky = kbk2e1

Applying Arnoldi’s method simplifies as well, since in the next iteration step Avk has to be orthogonalized with respect to only the previous two vectors vk and vk−1. In practical imple- mentations there are algorithms that prevent the necessity to store all the vectors {v1, . . . , vk}, therefor the cost in terms of computation and memory are reduced tremendously, especially when n is very large.

Conjugate Gradient method There are different ways in which one can interpret the CG method. When we follow the above reasoning, the main benefit of the Conjugate gradient method is that we can solve the tri- diagonal even more efficiently by performing an LU factorization on Tk,k. By symmetric positive definiteness we know that this factorization exist and the tri-diagonal form makes it possible to do this in an way that is low in computational cost. Another way of interpreting the CG method is that we try to minimize the A-norm of the error e(k) in each step. We do this by making the error A-orthogonal to {v1, . . . , vk}, by using the A-norm we prevent that we actual have to compute any errors, instead the residuals are used in an elegant way.

2.4.2 The minimum residual approach

(k) In the minimum residual approach we want to minimize kb − Ax k2 = kb − AVkyk2 on the Krylov subspace Kk. Consider the following:

kb − AVkyk2 = kb − Vk+1Hk+1,kyk2

= kVk+1(kbk2e1 − Hk+1,ky)k2

But as Vk+1 is orthogonal, the norm isn’t influenced by it. Hence we are interested in minimizing kkbk2e1 − Hk+1,ky)k2 over the space Kk. This is done by determining the solution of Hk+1,ky = kbk2e1

8 in the minimum norm sense. For general matrices this approach leads to the GMRES algorithm. For symmetric matrices, Hk+1,k reduces again to a tri-diagonal matrix. Similar to the Lanczos method, this results to a more efficient algorithm known by MINRES.

2.4.3 Petrov-Galerkin projection

For nonsymmetric matrices we notice that the Ritz-Galerkin approach becomes to expensive in both memory and computational cost. To overcome this problem, we would like to mimic, in some way, the 3-term recurrence that we encounter for symmetric matrices. We can do this with a Petrov-Galerking approach (2.12): require that b − Ax(k) is orthogonal to some other suitable k-dimensional subspace. This is what is done in the Bi-Lanczos and Bi-CG method.

To find the suitable subspace, we have to perform Arnoldi’s method in a slightly different way. The idea is that when we have constructed in some way Vi, the suitable basis should satisfy T Wi Vi = Di := [dii], some diagonal matrix with diagonal entries dii := (wi, vi), and besides that T it should satisfy Wi vi+1 = 0, i.e. Wi and Vi form a bi-orthogonal basis. Then it follows that: T Wi AVi = DiHi,i and we would like to choose Wi such that Hi,i is tri-diagonal. This can be realized by defining the bi-orthogonal basis sets {v1, . . . , vi} and {w1, . . . , wi} in the following manner:

i X hi+1,ivi+1 = Avi − hj,ivj j=1

i T X hi+1,iwi+1 = A wi − hj,iwj j=1 where hi+1,i is chosen such that vi+1 is normalized. Notice we generate wi with the transpose of A. With the constants hj,i = (wj, Avi)/dj,j for j = 1, . . . , i it is ensured that the sets are bi-orthogonal, i.e. (wj, vi) = 0 for i 6= j. In matrix notation this gives: T Wi AVi = DiHi,i T T Vi A Wi = DiHi,i

Hence DiHi,i is symmetric and therefor tri-diagonal, notation: DiTi,i. This leads to the required 3-term recurrence. The process is started by taking v1 = b/kbk2 and choosing some w1 6= 0 such that (w1, v1) 6= 0.

th As in the Lanczos method, the following equality holds AVk = Vk+1Tk+1,i. In the k iteration step, the Bi-Lanczos method performs the projection similar to the Lanczos method: T (k) T Wk (b − Ax ) = Wk (b − AVky) T = Wk b − DkTk,ky = 0 −1 T −1 T Now consider the following: D Wk b = D Wk v1kbk2 = kbk2e1. We conclude that, similar as to the Lanczos method, we end up with a tri-diagonal system for y of the same form:

Tk,ky = kbk2e1 This method is known as the Bi-Lanczos method. For symmetric matrices this leads to even shorter recurrences such as the Bi-CG and Bi-CGSTAB algorithms.

9 2.4.4 The minimum error approach

The minimum error approach was formulated as follows. We require:

(k) T T ||x − x ||2 to be minimal over A Kk(A , b).

With this approach we are, for some reasons that are beyond the scope of this thesis, able to minimize the forward error. Methods in this class are SYMMLQ and GMERR.

10 3 Preconditioning

Preconditioning is a technique that is commonly used to accelerate convergence. In practice many iterative methods, including Krylov subspace methods, converge very slow for unpreconditioned systems. By preconditioning the system we try to modify the system in such a way that an iterative method converges significantly faster. In this section we first treat the concept of preconditioning in general. Then we will treat some well-known preconditioning techniques that are based on incomplete factorizations.

3.1 The concept of preconditioning

In order to understand the concept of preconditioning we need to know what properties of the system causes the bad convergence behavior. Next we will see how these properties can be improved by a preconditioning technique. The eigenvalues of the matrix play an important role in this discussion. Recall the definition on the spectrum of a matrix.

Definition 3.1 The set of all the eigenvalues of A is called the spectrum of A and is denoted by σ(A). 

Another definition that will be useful in our discussion on preconditioning is the condition number of a matrix.

Definition 3.2 The condition number of a matrix A is the quantity

K(A) = kAkkA−1k where k.k is any induced matrix norm. 

In general the condition number K(A) depends on the choice of the norm and it is not defined if A is singular. The quantity plays an important role in studying the stability properties of linear systems. (k) Now recall from section 2 eqn (2.1) that for Krylov subspace methods x ∈ Kk. Hence for (k) (k) the residuals it follows that r = b − Ax ∈ Kk as well. We could also write this as:

(k+1) (0) r = Pk(A)r (3.1) where Pk is a polynomial with degree k that satisfies Pk(0) = 1. Notice that in this thesis we will always assume r(0) = b as we assumed x(0) = 0. In methods such as GMRES and MINRES we aim to minimize the residuals at each iteration, hence for these methods we have:

(k) (0) kr k2 = minkPk(A)r k2 (3.2) Pk where we minimize over all polynomials Pk with degree k or less with Pk(0) = 1. In a similar way we could derive that for the same class of polynomials the CG method satisfies

(k) (0) ke kA = minkPk(A)e k2 (3.3) Pk

(k) (k) (k) where ke kA := (e , Ae ) is a well defined A-norm for SPD matrices. For that reason we will first consider the case where A is SPD. In this case there exist an orthogonal matrix Q and

11 T (k) 1/2 (k) a diagonal matrix D such that A = QDQ . Notice we could write ke kA = kA e k2. Then it follows from equations (3.2) and (3.3) that:

(k+1) T (0) (0) kr k2 = minkQPk(D)Q r k2 ≤ minkPk(D)k · kr k2 Pk Pk and

(k+1) 1/2 (0) T 1/2 (0) ke kA = minkA Pk(A)e k2 = minkQPk(D)Q A e k2 Pk Pk (0) ≤ minkPk(D)k · ke kA Pk where k.k denotes some appropriate matrix norm. The inequality follows because:

T T minkQPk(D)Q wk2 ≤ kQPˆk(D)Q wk2 ≤ kPˆk(D)k · kwk2 Pk where we define Pˆk to be the polynomial that minimizes kPk(D)k. We conclude:   (k+1) (0) kr k2/kr k2 ≤ min max |Pk(λi)| for MINRES (3.4) Pk i=1,...,n and   (k+1) (0) ke kA/ke kA ≤ min max |Pk(λi)| for CG. (3.5) Pk i=1,...,n −1 It is not clear whether the bounds are sharp, since the polynomial that minimizes kQPk(D)Q wk2 might not be the same as the one that minimizes kPk(D)k. Nonetheless the upper bound provides us with qualitative information; a small upper bound corresponds to the case of fast convergence. Hence, the smaller the values of the minimizing polynomial Pk (with Pk(0) = 1) on the set σ(A) are, the faster convergence we may expect. For SPD matrices we can simplify the bounds of equations (3.4) and (3.5). It can be shown that [10]: √    κ − 1k min max |Pk(λi)| ≤ 2 √ , κ := λmin/λmax (3.6) Pk i=1,...,n κ + 1 and hence the bounds do not depend on the entire spectrum of A, but only on ratio of the largest to smallest eigenvalue of A. However, for nonsymmetric matrices we could derive for the GMRES method a similar bound as in eqn (3.4). Assume for simplicity that A has a complete set of eigenvalue, i.e. there exist a non singular matrix V and a diagonal matrix D such that A = VDV −1. Then the bound is given by:   (k+1) (0) kr k2/kr k2 ≤ K(V ) · min max |Pk(λi)| for GMRES (3.7) Pk i=1,...,n where K(V ) = kV kkV −1k is the condition number. In this case we expect the bound to depend on the entire spectrum of A. We can understand this by analyzing the following scenarios:

ˆ As mentioned before we require k  n so that we think of Pk as a low degree polynomial. Consequently, when the eigenvalues are widely spread in the complex plane, the polynomial cannot be small at a large number of such points. ˆ Similarly, when a large number of eigenvalues are located around the origin, then a poly- nomial of low degree cannot be 1 at the origin and have small values at a large number of such points located around the origin.

12 ˆ In contrast, eigenvalues clustered around a single point c away from the origin are favorable. k Take for example the polynomial Pk(z) = (1 − z/c) , hence Pk(0) = 1 and the polynomial has small values for z ∈ σ(A). Note that we do not claim that this is the minimizing poly- nomial. Matrices with clustered eigenvalues will typically show good convergence behavior.

It appears that a clustered spectrum is in general good for convergence. Also for SPD matrices, the bound of equations (3.6) is reduced when the eigenvalues are clustered. Hence, it is of no surprise that the main goal of preconditioning is to transform the system with a precondition- ing matrix (or preconditioner) M such that the eigenvalues of the preconditioned system are clustered.

We can transform the system by applying the preconditioner to the original system. We will treat three different types of preconditioning:

1. Left-preconditioning. Apply the iterative method to:

M −1Ax = M −1b (3.8)

2. Right-preconditioning. Apply the iterative method to:

AM −1u = b, x := M −1u (3.9)

3. Two-sided preconditioning. In many applications the preconditioner is constructed in factored form M = M1M2. In this case we can apply apply the iterative method to:

−1 −1 −1 −1 M1 AM2 u = M1 b, x := M2 u (3.10)

Whenever we precondition a system, we are in essence solving a different system. For ex- ample, when we are left-preconditioning a system we are solving a system M −1Ax = M −1b. From the Arnoldi method we know that in the first step we have to compute the vector v1 = −1 −1 −1 M b/kM bk2, hence we need to compute the vector c = M b. We do not want to determine the inverse of M explicitly as this is likely to be too expensive. Instead we determine c by solving the system Mc = b indirectly. From eqn (2.5) we know that in the following steps we have to solve systems of the form M −1Au = v. This is done in two steps, first we apply the matrix-vector product u1 = Au and next we solve Mv = u1 indirectly. Here the subscript on u1 is to emphasize that it is a dummy variable. Hence, in order to be useful for numerical applications we require that the system Mx = b is much easier solved than the original system Ax = b. Note that we will almost never determine M −1 explicitly. This is with the exception of preconditioning techniques where M is a sparse approximate inverse preconditioner, i.e. the preconditioner is a sparse matrix such that M ≈ A−1. In this case left-preconditioning is of the form MA = Mb and now in each step we have to perform an additional matrix-vector product in stead of the additional linear solve we had to perform before. Matrix-vector products are, especially for sparse matrices, in general much cheaper to perform than linear solves. The construction of sparse approximate inverses is however less straightforward and we will treat this in the next section more extensively.

Another issue of preconditioning is that M should be chosen such that the preconditioned system is better conditioned; the condition number is directly related to the stability of a linear system. Recall the definition on the forward error.

13 Definition 3.3 Let x be such that Ax = b and let x∗ be an approximated solution to the system ∗ corresponding to A and b. The forward error is the quantity δx such that x = x − δx. 

With a posteriori analysis we aim to find the nearby system of Ax = b for which x∗ is the exact solution, i.e. we aim to find the minimal perturbations δA and δb such that:

(A + δA)(x + δx) = b + δb.

This is known as the backward error. Now assume for simplicity that δA = 0, hence A(x + δx) = b−r where we defined the residual r := b−Ax∗ = A(x−x∗) = −Aδx. With a posteriori analysis it can be shown that the forward error is related to residual in the following way [14, Section 3.1]: kδxk krk 2 ≤ K(A) 2 (3.11) kxk2 kbk2 where the condition number has the property K(A) ≥ 1. We say that the system is ill-conditioned when the condition number is relatively large. In general, a good preconditioner M will improve the condition number. That means M is chosen such thatK(M −1A)  K(A). In this case we expect from eqn (3.11) a more accurate approximated solution to the linear system.

3.1.1 Discussion on preconditioning

The concept of preconditioning is in theory quite straightforward, but in practical implemen- tations there are a lot of technical details we have to take in account. In this discussion on preconditioning we discuss some of the major issues.

When we construct a preconditioner M we have to make up a balance. A well chosen precondi- tioner will decrease the amount of iterations steps needed, but as a rule a good preconditioner is likely to be expensive to construct. Also, as we will see later, a very good preconditioner may contain more entries in comparison to other cheap preconditioners. In this case we also have to take in account the additional work needed per iteration step. In general the amount of information we have from the spectrum of A and that of the preconditioned system is limited. Therefore we can not state a priori much about the convergence behavior of the preconditioned system. For example, the spectrum of an indefinite systems may have eigenvalues on both sides of the complex plane. When we precondition such a system it may happen that the preconditioned matrix has eigenvalues close to zero. It can be shown that this is bad for convergence. Hence, improving a preconditioner can potentially be bad for convergence. This is merely to clarify that the construction of a preconditioner is not an exact science. We discussed that convergence of Krylov methods depends for an important part on the spectrum of A, or on the spectrum of the preconditioned matrix when a preconditioner is applied. We have mentioned three types of preconditioning. Notice the following:

det(M −1A − λI) = det(M −1(A − λM)) = det(A − λM)det(M −1) = det(AM −1 − λI).

14 In a similar fashion we can show the following for preconditioners in factored form M = M1M2:

−1 −1 −1 det(M A − λI) = det(M2 M1 A − λI) −1 −1 = det(M2 (M1 A − λM2)) −1 −1 = det(M1 A − λM2)det(M2 ) −1 −1 = det(M1 AM2 − λI). We conclude that spectrum of the preconditioned matrices does not depend on the type of implementation. However M −1Av = λv =6 ⇒ AM −1v = λv and hence the set of eigenvectors does depend on the choice of implementation. As convergence also depends on the degree in which the initial residual b − Ax(0) is in the dominant eigenvector direction, the convergence behavior will depend on the chosen implementation.

There are other issues we have to keep in mind when we apply left-, right- or two-sided precon- ditioning.

ˆ When we are applying left-preconditioning this will have effect on the projection of an approximated solution on the Krylov subspace. For example, in the preconditioned GMRES algorithm we are minimizing the residual

M −1(b − Ax(k))

which may differ significantly from the actual residual (b−Ax(k)) when M is ill-conditioned, see eqn (3.11). ˆ When we are applying right-preconditioning the stopping criteria may be based on

(k) ku − u k2

(k) −1 (k) which may differ significantly from the actual error kx−x k2 = kM (u−u )k2 when M is ill-conditioned. A benefit from these type of preconditioning is that we do not transform the right-hand side of the the system, see eqn (3.9).

For applications in a general sense we distinguish two types of preconditioners. The first type is the problem-specific preconditioner for which construction is based on very specific information of the problem, such as its geometry and physical properties. Mostly these type of problems are related to applications involving partial differential equations (PDEs). For a narrow class of problems one may obtain good preconditioners in this way, however, this approach requires a deep understanding of the problem. For a broad range of problems this problem specific information is very difficult to exploit or might not even be available. In this case a preconditioner can be based purely on the information that is contained in the coefficient matrix A. These are called algebraic or general-purpose preconditioners. Preconditioner of these types are generally not as efficient as the problem-specific types, however, the achieved convergence can certainly be reasonable good. Finally, one of the most important issues in the development of preconditioning techniques nowadays is the need for parallel processing. Krylov subspace methods are already efficiently implemented in parallel on high-performance computers. However, preconditioning is currently the main stumbling block in achieving high performance for large, sparse linear systems [2]. There is still a need for preconditioners which are inherently parallelizable both in construction as in implementation.

15 The characteristics of a good preconditioner are summarized:

ˆ The preconditioner clusters the eigenvalues of A to a point away from the origin.

ˆ The preconditioner is (relatively) cheap to construct. ˆ The preconditioner does not demand much memory. ˆ Linear solves with the preconditioner are cheap to perform.

ˆ The preconditioner improves the condition number of A. ˆ The preconditioner is parallelizable, both in construction as in implementation.

3.2 Preconditioning techniques

The most straightforward approach in preconditioning techniques is obtained by considering the exact solution by a direct method. A well-known direct method is the LU-factorization. The factors are constructed by the Gaussian elimination algorithm and the system to be solved is LUx = b where L and U T are lower triangular matrices. A preconditioning technique that is derived from this method is the incomplete LU-factorization (ILU). The method constructs a preconditioner M = L˜U˜ (3.12) from incomplete factors and tries to capture the biggest entries from L and U.

3.2.1 Incomplete LU-factorization

In an ILU factorization we force the factors to have some sparsity pattern. This is done by performing Gaussian Elimination and dropping elements from nondiagonal positions in a suitable way. The factorization is as follows: A = L˜U˜ − R (3.13) where R is a residual matrix. In general ILU-factorizations we define a zero pattern P such that

P ⊂ {(i, j) | i 6= j; 1 ≤ i, j, ≤ n} (3.14) and drop elements during Gaussian Elimination that are part of the zero pattern. Suppose we have predetermined the zero pattern P . In this case we perform a Static Pattern ILU- factorization. This is implemented in the following algorithm [15]: There are some remarks on the above algorithm:

ˆ The process may terminate due to a zero pivot. ˆ In practice the for-loops are run more efficiently by exploiting the zero pattern P . When P has some known structure we can implement this in the second and third for-loop such that we only evaluate these loops at a small amount of nonzero entries. In this way the algorithm can still be feasible for large n.

16 Algorithm 3.1 General Static Pattern ILU for each (i, j) ∈ P do set aij = 0 end for for k = 1, . . . , n − 1 do for i = k + 1, . . . , n and if (i, k) ∈/ P do aik = aik/akk for j = k + 1, . . . , n and for (i, j) ∈/ P do aij = aij − aik ∗ akj end for end for end for

ˆ Since the diagonal elements of L˜ are all equal to one, we don’t need to store them. Hence we could overwrite A during the process to store L˜ in the strict lower triangular part and U˜ in the upper triangular part. This is done with a variant of Gaussian elimination, called the IKJ variant. This is implemented in the algorithm 3.2. Although we could overwrite the matrix A to store both factors L˜ and U˜, this is in general not how the algorithm is implemented. Every iterative Krylov method requires to perform matrix-vector products with A, see e.g. section 2.2 on the Arnoldi method. In practical implementations we need to allocate additional memory for storing the preconditioner.

Algorithm 3.2 General Static Pattern ILU, IKJ variant for each (i, j) ∈ P do set aij = 0 end for for i = 2, . . . , n do for k = 1, . . . , i − 1 and if (i, k) ∈/ P do aik = aik/akk for j = k + 1, . . . , n and for (i, j) ∈/ P do aij = aij − aik ∗ akj end for end for end for

For general ILU-factorizations it can be shown that the following theorem holds, we will state it without proof.

Theorem 3.4 Algorithm 3.2 produces factors L˜ and U˜ such that

A = L˜U˜ − R and the entries of R are such that

rij = 0 when (i, j) ∈/ P and the elements −rij for (i, j) ∈ P are determined by algorithm 3.2. 

17 The nonzero entries −rij for (i, j) ∈ P are called fill-in elements. We conclude that for a preconditioner M = L˜U˜ we have mij = aij for (i, j) ∈/ P . Hence, depending on the choice of the zero pattern, M might be a good approximation of A. Matrix-vector products with ILU-preconditioner of the form v = M −1u are computed in two steps:

u = Mv = L˜Uv˜

=⇒ Lu˜ 1 = u

=⇒ Uv˜ = u1 where u1 is a dummy variable. In practical implementations we overwrite u1 when we are solving for v.

3.2.2 Zero Fill-In ILU (ILU(0))

For a complete LU factorization of a matrix A the factors L and U are typically less sparse than the original matrix A. In ILU(0) we allow no fill-in, i.e. we take as a zero pattern P the zero pattern of A: PA = {(i, j) | aij = 0, i 6= j} (3.15) ILU(0) is implemented with this P in algorithm 3.2. This technique ensures us that sparsity properties are maintained in the preconditioner.

3.2.3 ILU(p)

It is possible that ILU(0) produces a preconditioner that is not a good approximation to A. In ILU(p) we allow more fill-in in the factors Lp and Up in order to produce a more accurate approximation of A [15]. We will show how ILU(1) is performed and how this process generalizes to performing ILU(p) for p > 1. To this end we consider an example with a 5-point matrix that is derived from the finite difference discretization of Poisson’s equation on the unit square. Hence we consider the problem:  ∂2 ∂2  + u(x, y) = f(x, y) (3.16) ∂x2 ∂y2 for some function f and (x, y) ∈ [0, 1] × [0, 1]. We discretize the problem by applying a 5-point operator on a regular n-by-n mesh on the unit square. In this way we obtain a block tridiagonal matrix. In MATLAB we construct with the command A = gallery(´Poisson´, 4) the matrix that corresponds to the considered problem. Next we compute the ILU(0) factorization with the command [L0,U0] = ilu(A,setup) where we have set setup.type = ´nofill´. In figure 1 3 we represented the nonzero structure of the matrices A, L0, U0 and the product L0U0. Notice that there is indeed some additional fill-in in the product of the incomplete factors.

We can use the nonzero structure of the product L0U0 to allow more fill-in for some new to construct factors L1 and U1. This is what is done in ILU(1). We have performed ILU(1) on the matrix A with algorithm 3.2 where we defined the zero pattern by the complement of the nonzero structure of L0U0. In figure 2 we represent sparsity patterns of the factors L1, U1 and the incomplete product L1U1. Notice that the additional fill-in has increased. However, with

3The nonzero structure, or sparsity pattern, of a matrix is a set of the row and column indices corresponding to the nonzero entries of the matrix.

18 L0 U0 0 0

5 5

10 10

15 15

0 5 10 15 0 5 10 15

A L0U0 0 0

5 5

10 10

15 15

0 5 10 15 0 5 10 15

Figure 1: Four matrices involved in ILU(0): A, L0, U0 and L0U0 the commands normest(A-L0*U0) and normest(A-L1*U1), where L1 and U1 correspond to the incomplete factors obtained by ILU(1), it is shown that the incomplete factorization obtained by ILU(1) is a much better approximation to A.

If we define the ILU(0) factorization by [L0,U0] = ilu(A, 0), then [L1,U1] = ilu(A, 1) is the ILU(1) factorization that uses the pattern of L0U0. Thus in general [Lp,Up] = ilu(A, p) is the ILU(p) factorization that uses the pattern of the previously determined product Lp−1Up−1. As p increases this process will become too expensive. Therefore, in practice the same process is approximated by using clever algorithms where we initially associate each entry aij with a level of fill, denoted by levij, such that:

 0 if a 6= 0 or i = j lev = ij ij ∞ otherwise

During the process of Gaussian elimination we update the level of fill whenever an element is modified. This is done in the following way:

levij = min{levij, levik + levkj + 1}

For diagonal dominant matrices this approach ensures that a high level of fill is associated with a small entry. Evaluated entries in algorithm 3.2 are dropped whenever the level of fill is bigger than p.

19 L1 U1 0 0

5 5

10 10

15 15

0 5 10 15 0 5 10 15

L0U0 L1U1 0 0

5 5

10 10

15 15

0 5 10 15 0 5 10 15

Figure 2: Four matrices involved in ILU(1): L0U0, L1, U1 and L1U1

3.2.4 ILUT

For matrices that are not diagonally dominant it is possible that in the factors there are many small entries with a low level of fill. The preconditioner is not significantly improved by these small entries, hence the ILU(p) factorization is not very efficient in this case. In an alternative ILU factorization we drop these entries based on their numerical value, this method is known as ILUT. In algorithm A.1, see appendix A, we give an outline of the ILUT algorithm. In this th algorithm the i row of a matrix A is denoted by the MATLAB notation ai∗.

In the ILUT algorithm we include to algorithm 3.2 a set of rules for dropping small elements based on their numerical values.

ˆ We define a drop tolerance τ and for each i in the first for-loop, line 1, we define τi = τ · kai∗k2. Next we drop elements that are smaller than τi. By defining τi for each i we take in account that a matrix might be badly scaled. This is what is done in line 5.

ˆ In line 10 we again drop elements smaller than the relative tolerance τi. Besides that we restrict the use of memory storage by allowing a maximal amount of fill. For both line 11 and 12 we keep only (at most) the p largest elements.

By this dropping rules we reduce both the computational cost and the memory usage. The benefit of ILUT in comparison to ILU(p) is that we determine the level of fill in the factors based on there numerical values rather than on the structure of A.

20 4 Sparse Approximate Inverse Preconditioners

A different type of preconditioning is established by constructing a direct approximation of the inverse. With the preconditioner M ≈ A−1 the preconditioning step reduces to performing one ore more (sparse) matrix-vector products. In subsection 3.1.1 we discussed that in recent developments there is a need for efficient general-purpose, parallel preconditioners. The standard incomplete factorization techniques we described in the previous section are in general highly sequential; this becomes clear from algorithm 3.2, where in each step of the process information from the previous steps is needed to proceed. Similarly, linear solves of the form z = M −1y are highly sequential for incomplete factorization techniques. Hence, these standard techniques are certainly not inherently parallelizable. As we will see later, sparse approximate inverse techniques are indeed inherently parallelizable. The need for parallel processing has been the main driving force in the development of these techniques. There is yet another reason why approximated inverse techniques may be favored over stan- dard incomplete factorization techniques; ILU preconditioner are known to have a high failure rate for indefinite systems. In an article by Chow and Saad on ILU preconditioners for indefinite systems, it is explained that failures may be caused by breakdowns due to zero pivots, inaccuracy and instability of triangular solves [8]. We treat this issues in more detail and make clear that approximated inverse techniques do not suffer from these problems.

Incomplete LU factorizations may suffer from the same problems as complete LU factorizations do. Small pivots are common for indefinite systems and as a consequence entries in the fac- tors can grow uncontrollably. The factorization becomes unstable and the factorization will be inaccurate, which means that LU will not be a good approximation to A. Besides that, small pivots lead to unstable triangular solves; linear systems corresponding to L and U might be ill-conditioned. Moreover, this problem might even occur without the presence of small pivots. To fully understand the consequences to this particular case, consider a preconditioner M = L˜U˜ from incomplete factorization. Assume M is an accurate preconditioner, that is M ≈ A. By applying two-sided preconditioning we find from equation (3.13) the following equality:

L˜−1AU˜ −1 = I + L˜−1RU˜ −1 where the term R is negligible. In order for the system to be efficiently preconditioned, the term L˜−1RU˜ −1 should be negligible as well. However, it is possible that the incomplete factors are unstable even when M is an accurate preconditioners. This means that the last term is not necessarily small. Approximated inverse preconditioners do not suffer from this problem as the preconditioning step consist of a matrix-vector product. In these techniques we may expect M to be a satisfactory preconditioner as long as it is a good approximation to A−1.

In sparse approximate inverse techniques we assume that we are able to approximate the inverse of a sparse matrix with another sparse matrix M. This is not necessarily true since the inverse of sparse matrices is generally dense. However, for many sparse matrices it turns out that a lot of entries are very small. We will see an example of a matrix with this properties in subsection 6.1 on numerical experiments with an sparse approximate invers.

21 There are different sparse approximate inverse techniques and they can be grouped in three categories. The different techniques are based on:

ˆ Frobenius norm minimization. ˆ Factorized sparse approximate inverses. ˆ Approximate inverses of ILU-factorizations.

First we will give the motivation to approximate the inverse by a sparse matrix and then will treat the three categories of sparse approximate inverse techniques separately.

4.1 Motivation to sparse approximate inverse techniques

Many linear systems arise from partial differential equations of physical problems. In this field of research it was observed that, very often, the inverse contains many small entries. We would like to be able to predict where the smallest entries are located such that we could drop these entries in an approximation of the inverse. We have already seen that for standard techniques, based on incomplete LU decomposition, pattern selection can be a big issue in the construction of the preconditioner. As we will see later, the same is true for sparse approximate inverse techniques. However, once the sparsity pattern is determined, the construction of a sparse approximate inverse is in many cases fairly straightforward. In this subsection we will also describe how a sparsity pattern can be selected a priori. Now consider again the example of the Poisson equation from eqn (3.16), this time on some 2D domain Ω. Let P = (x, y) ∈ Ω, then the exact solution on Ω is given by: Z u(P ) = G(P,Q)f(Q)dQ Ω where Q = (ξ, η) is a variable point used for integration and G is the Green’s function. For the Poisson equation the Green’s function is given by: 1 1 G(x, y, ξ, η) = ln . 2π p(x − ξ)2 + (y − η)2

We notice that there is a rapid decay of the Green’s function when the distance between P and Q increases. Also, if we discretize the problem, we obtain a linear system of the form Ax = b. The matrix A is a discrete approximation to the continuous operator of the Poisson Equation and we observe from the exact solution that A−1 should be a good, discrete, approximation to the Green’s function. The inverse will be typically dense, however, from the rapid decay of the Green’s function we may expect that the inverse contains many small entries which can be dropped. Hence, the inverse can be approximated by a sparse matrix. We can generalize the idea for other problems arising from partial differential equations. In general, the Green’s function is a mathematical description of the decreasing influence between two points if the distance between them increases. Hence, a lot of entries in the inverse of the discrete Green’s function will be small. Unfortunately, for higher-dimensional problems the decay of the Green’s function has a more complex structure and the distance between two nodes is not necessarily a good measure to characterize the influence between them. As a consequence, it might be difficult to predict a priori sparsity patterns and we need different strategies to obtain an effective sparsity pattern.

Another motivation for approximating the inverse by a sparse matrix arises from matrices that are diagonally dominant. Recall the following definition.

Definition 4.1 The matrix A = [a_ij] is said to be strictly diagonally dominant by row if

|a_ii| > Σ_{j ≠ i} |a_ij|   for all i.



For diagonally dominant matrices the following theorem holds:

Theorem 4.2 If A is a strictly diagonally dominant matrix by row, then the matrix is nonsingular and the explicit inverse A^{-1} is strictly diagonally dominant by column. 

In the ideal case A is diagonally dominant and has a banded structure. For such matrices we have exponential decay in the inverse A^{-1} = [a_ij]:

|a_ij| ≤ C γ^{|i−j|}

for some constant C and γ < 1. Hence, entries far away from the banded structure are small and may be dropped. For many problems from practical applications, the matrix A might only be diagonally dominant by row to some degree, i.e. a few rows might not be diagonally dominant. Still, we may expect the inverse to be diagonally dominant by column to some degree, and the inverse can still be approximated by a sparse matrix.

In conclusion, we observe that for many systems arising from practical applications the inverse can be approximated by a sparse matrix. However, for more complex problems it might be hard to predict where the smallest entries are located. In the next subsection we treat a strategy to determine a priori an effective sparsity pattern for the inverse of a general matrix.

4.1.1 A priori pattern selection

In this section we describe a priori pattern selection. The pattern selection may be based on a specific structure, e.g. a band structure, and other characteristics of the matrix. In general, however, the nonzero structure of powers of A provides a good choice for the a priori pattern. To understand this, consider the second power A²; we denote the i-th row of the second power by a²_{i*}. For this row it holds that

a²_{i*} = a_{i*} A = a_{i*} [a_{1*}; … ; a_{n*}] = Σ_{j=1}^{n} a_ij · a_{j*}    (4.1)

and hence the i-th row is obtained by merging rows of A. Note that for sparse matrices only a few (≪ n) rows are merged. In numerical applications, cancellation of elements is rare due to finite precision and hence we expect additional fill-in in each row of the second power. If in general we denote the i-th row of A^k by a^k_{i*}, then it holds that

a^l_{i*} = a_{i*} A^{l−1} = a_{i*} [a^{l−1}_{1*}; … ; a^{l−1}_{n*}] = Σ_{j=1}^{n} a_ij · a^{l−1}_{j*}    (4.2)

and hence a^l_{i*} is obtained by merging rows of A^{l−1}. If we assume that there is no cancellation of elements, then for general matrices the fill-in in powers of A will increase, i.e. the nonzero structure of A^{l−1} is in general contained in the nonzero structure of A^l. By equation (2.3) we know that for nonsingular matrices A^{-1} can be written as a linear combination of powers of A. For this reason we expect the nonzero structure of A^l to be a good a priori pattern selection for some l < m, where m is the degree of the minimal polynomial of A.

The process of computing higher powers of A becomes increasingly expensive. When we increase l by one we have to compute an additional sparse matrix-matrix product A·A^l. Moreover, this product becomes increasingly more expensive because the fill-in in A^l increases as well. In practice this means that we choose small values for l. An additional argument for choosing small values is that this produces a very sparse preconditioner. Since the preconditioning step consists of a matrix-vector product with M, this reduces the computing time of a single iteration step. Although the process will require more iterations because the sparser preconditioner is a less accurate approximate inverse, the total computing time for solving MAx = Mb may be reduced in this way. Notice that we have to find a balance, since both too small and too large values for l will increase the total computing time. The cost of constructing a good pattern is reduced by sparsifying the matrix and computing powers of the sparsified matrix Ã = [ã_ij]. This is realized by thresholding A while transforming it to a binary matrix. With a given threshold τ this is done in the following way:

ã_ij = 1 if i = j or |{D^{-1/2} A D^{-1/2}}_ij| > τ,   ã_ij = 0 otherwise,    (4.3)

where D = [d_ii] is prescribed by:

d_ii = |a_ii| if |a_ii| > 0,   d_ii = 1 otherwise.    (4.4)

The matrix D is constructed to account for badly scaled matrices. The use of the a priori sparsity pattern is shown by an example of Chow [6]. Consider figure 3(a); we present a density plot of the discrete Green's function of a partial differential equation at a point near the center of a square domain. The discrete Green's function is directly related to the inverse. In figure 3(b) we present an approximation to the discrete Green's function when we use the sparsity pattern of the original matrix A that arises from the discretization of the problem. In a similar way, an approximate inverse is obtained by using the pattern of the sparsified matrix Ã, figure 3(c). Notice that in this case we only included the largest entries in the pattern. Now we can try to improve the approximate inverse by using powers of the sparsified matrix; the nonzero structure of Ã^L, for some integer L, is called the level L − 1 pattern. With the level 1 pattern we obtain the approximation to the inverse of figure 3(d). Notice that we have captured the largest entries of the discrete Green's function and that the approximation has indeed improved. A small MATLAB sketch of this construction is given below.

Figure 3: Discrete Green's function arising from a partial differential equation on a square domain.
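
The sparsification of equations (4.3)–(4.4) and the level patterns built from powers of the sparsified matrix can be sketched in a few lines of MATLAB; the function name and the parameters tau and lev below are illustrative choices, not part of any package.

function S = level_pattern(A, tau, lev)
% LEVEL_PATTERN  A priori sparsity pattern from powers of the sparsified matrix.
%   Returns a sparse 0/1 matrix S: the nonzero structure of Atilde^(lev+1),
%   where Atilde is the thresholded, symmetrically scaled matrix of eqn (4.3).
    n = size(A, 1);
    d = full(abs(diag(A)));
    d(d == 0) = 1;                              % eqn (4.4): guard against zero diagonal
    Dih = spdiags(1 ./ sqrt(d), 0, n, n);       % D^{-1/2}
    B = Dih * A * Dih;                          % symmetrically scaled matrix
    Atilde = double(abs(B) > tau) + speye(n);   % eqn (4.3): keep large entries and the diagonal
    Atilde = spones(Atilde);                    % binary pattern
    S = Atilde;
    for k = 1:lev                               % level-lev pattern = structure of Atilde^(lev+1)
        S = spones(S * Atilde);
    end
end

For example, S = level_pattern(A, 0.1, 1) returns a level 1 pattern as used in figure 3(d), while lev = 0 gives the pattern of the sparsified matrix itself.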

4.2 Frobenius norm minimization

Methods that are based on Frobenius norm minimization determine a preconditioner M with some known sparse structure. Usually the sparse structure is determined iteratively in order to find an efficient preconditioner. The process of determining a sparse structure is treated later. Recall the following definition of the Frobenius norm.

Definition 4.3 The Frobenius norm of an m × n matrix A is defined as

||A||_F = ( Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij² )^{1/2}. 

Let S be the set of all sparse matrices with some known structure. We require M to be the best approximation to A^{-1} in the Frobenius norm amongst all matrices in S. Now suppose we want to find a right-sided preconditioner (or right approximate inverse). Then M is the solution to the following minimization problem:

min_{M ∈ S} ||I − AM||_F    (4.5)

For left-preconditioning we observe that ||I − MA||_F = ||I − A^T M^T||_F. Hence for this case an equivalent minimization problem is formulated by finding a right approximate inverse for A^T. Now notice:

||I − AM||_F² = Σ_{j=1}^{n} ||e_j − A m_j||_2²    (4.6)

where m_j is the j-th column of M. Hence the minimization problem (4.5) is equivalent to solving n independent linear least squares problems. Since they are independent, the construction of this type of preconditioner is inherently parallelizable. By exploiting sparsity properties of A and M, the cost of solving each least squares problem can be reduced significantly. Let G be the nonzero pattern of M, i.e. its complement

G^c := {(i, j) | m_ij = 0}

is the zero pattern of M. This set contains the row and column indices of the entries of the preconditioner that are necessarily zero. Now we define for each j the following subset:

J := {i | (i, j) ∈ G}. (4.7)

This set contains the indices of the columns of A that will be evaluated in the matrix-vector product A m_j. Next we define yet another set by:

I = {i | A(i, J) is a nonzero row}    (4.8)

where A(:, J) is the submatrix formed by the columns of A that correspond to the set J. With the set I we determine the submatrix Â := A(I, J), dropping the zero rows that occur due to sparsity. The aforementioned matrix-vector product is then computed more efficiently via Â m̂_j, where m̂_j := m_j(J). Now we can replace each least squares problem from equation (4.6) by a much smaller least squares problem of the form:

min ||ê_j − Â m̂_j||_2²   for j = 1, …, n    (4.9)

where ê_j := e_j(I). Assuming that due to sparsity each system is relatively small, we solve each system with a direct method such as a QR factorization. Thus, for a known structure it is straightforward to compute a sparse approximate inverse. Methods in which the structure is pre-determined are called static pattern techniques. Unfortunately, for a general matrix we do not know a suitable sparse structure that captures the large entries of its inverse. Therefore we will describe a method in which we start with an initial sparse structure, usually an empty or diagonal structure, which is iteratively augmented such that the new approximate inverse preconditioner is improved. These types of methods are called adaptive pattern techniques. The most successful approach in this category is known as the Sparse Approximate Inverse algorithm (SPAI), proposed by Grote and Huckle [11].
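
Before turning to the adaptive SPAI algorithm, the static-pattern construction of equations (4.8)–(4.9) can be written down directly in MATLAB. The sketch below is only an illustration (the function name is made up and the loop is sequential); a common choice for the prescribed pattern is S = spones(A).

function M = spai_static(A, S)
% SPAI_STATIC  Static-pattern Frobenius norm minimization, min ||I - A*M||_F.
%   A : sparse n-by-n matrix.
%   S : sparse 0/1 matrix prescribing the nonzero pattern of M, e.g. spones(A).
    n = size(A, 1);
    rows = []; cols = []; vals = [];
    for j = 1:n
        J = find(S(:, j));                 % allowed nonzero positions of column m_j
        Ahat = A(:, J);
        I = find(any(Ahat, 2));            % drop rows of A(:,J) that are identically zero
        ej = double(I == j);               % e_j restricted to the row set I
        mhat = full(Ahat(I, :)) \ ej;      % small dense least-squares problem (QR)
        rows = [rows; J];
        cols = [cols; j * ones(numel(J), 1)];
        vals = [vals; mhat];
    end
    M = sparse(rows, cols, vals, n, n);
end

Each pass through the loop is independent of the others, which is precisely what makes the construction attractive for parallel implementation.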

4.2.1 SPAI

Suppose that we have defined a sparse structure S that we aim to improve. From equation (4.9) we define the corresponding residuals:

r_j = e_j − A(:, J) m̂_j   for j = 1, …, n    (4.10)

By augmenting the structure S we aim to reduce the residuals in the Euclidean norm ||r_j||_2. We therefore determine the following set:

L = {ℓ | r_j(ℓ) ≠ 0}

In general we expect that, due to sparsity, most entries of r_j are zero. Also, due to finite precision, we expect most entries of r̂_j = ê_j − Â m̂_j to be nonzero. Hence, typically L will equal I.

To reduce ||r_j||_2 it makes sense to consider columns of A that were not used before and that have a nonzero entry in the rows indexed by L. To this end we define the following set:

Ĵ = {k ∉ J | A(ℓ, k) ≠ 0 for some ℓ ∈ L}

and for each k ∈ Ĵ we want to determine whether the k-th column of A can make a significant contribution to the reduction of ||r_j||_2. This is done in a cheap way by considering the following one-dimensional minimization problems:

min_{μ_k} ||r_j − μ_k A e_k||_2   for each k ∈ Ĵ

where μ_k ∈ R. Notice that A e_k is the k-th column of A. The solution is determined simply by projecting r_j onto the space spanned by A e_k, which gives μ_k = r_j^T A e_k / ||A e_k||_2². Next we compute the Euclidean norm ρ_k of the new residual r_j − μ_k A e_k:

ρ_k² = ||r_j||_2² − (r_j^T A e_k)² / ||A e_k||_2².

In practical implementations we make a selection of the most profitable indices k, i.e. those that correspond to the smallest values of ρ_k. We also restrict the memory storage by allowing a maximal amount of fill-in; as a consequence, some indices of Ĵ are dropped. Next we determine the corresponding set Î in a similar way as in equation (4.8). Now we consider the augmented matrix A(I ∪ Î, J ∪ Ĵ), for which we again solve the minimization problem of equation (4.9) by computing a QR factorization (see footnote 4). Finally we compute the new residual norm ||r_j||_2. The process is repeated until the residual meets a certain tolerance or a maximal amount of fill-in is reached. The method is summarized in Algorithm 4.1.

We mention an important theoretical property of the SPAI algorithm. Denote by r_k the residual for every column m_k and assume that it satisfies:

||r_k||_2 = ||A m_k − e_k||_2 < ε.

For the spectral properties of the preconditioned matrix AM, the following theorem can then be derived [11].

Theorem 4.4 Let p = max_{1≤k≤n} {number of nonzero elements of r_k}. Then the eigenvalues λ_k of AM are clustered at 1 and lie inside a circle of radius √p ε. Furthermore, if √p ε < 1, then λ_max and λ_min satisfy

|λ_max| / |λ_min| ≤ (1 + √p ε) / (1 − √p ε). 
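
The clustering is easy to make plausible with a short sketch (using the cruder radius √n ε instead of the √p ε of Theorem 4.4): the columns of AM − I are exactly the residuals r_k, so

AM − I = R := [r_1, …, r_n],    ||R||_F² = Σ_{k=1}^{n} ||r_k||_2² < n ε²,

and for every eigenvalue λ of AM = I + R we have |λ − 1| ≤ ||R||_2 ≤ ||R||_F < √n ε. The sharper radius √p ε exploits, in addition, the fact that each residual column has at most p nonzero entries.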

In conclusion, depending on the user-defined parameters ε and the maximal amount of fill-in, we may expect good results from the SPAI algorithm. However, we mention that the construction of the SPAI preconditioner may be expensive. To reduce cost we can use a static technique with an a priori pattern selection, which in many cases appears to be an equally effective approach [6].

4 In practice we expand the QR factorization of A(I, J) to reduce cost [11].

Algorithm 4.1 SPAI algorithm

1: for every column m_j of M do
2:   (a) Choose an initial sparsity pattern J.
3:   (b) Determine the set of corresponding row indices I.
4:   Compute a QR decomposition of Â = A(I, J) and compute the solution m̂_j from equation (4.9).
5:   Set r_j = e_j − A(:, J) m̂_j
6:   while ||r_j||_2 > ε do
7:     (c) Set L equal to the set of indices ℓ for which r_j(ℓ) ≠ 0
8:     (d) Set Ĵ equal to the set of all new column indices that appear in the rows L of A but not in J
9:     for each k ∈ Ĵ do
10:      (e) Set ρ_k² = ||r_j||_2² − (r_j^T A e_k)² / ||A e_k||_2²
11:    end for
12:    Delete from Ĵ all but the most profitable indices
13:    (f) Determine the new row indices Î and update the QR decomposition of the submatrix A(I ∪ Î, J ∪ Ĵ); solve for the new m̂_j
14:    Set I = I ∪ Î
15:    Set J = J ∪ Ĵ
16:    Set r_j = e_j − A(:, J) m̂_j
17:  end while
18: end for
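
A compact MATLAB sketch of Algorithm 4.1 for a single column is given below. It is a simplified illustration under several assumptions: it keeps only one new index per augmentation step, it refactorizes instead of updating the QR decomposition, and the function name and parameters (eps_tol, maxnnz) are made up for this example.

function [mj, J] = spai_column(A, j, J, eps_tol, maxnnz)
% SPAI_COLUMN  Adaptive computation of one column of M (sketch of Algorithm 4.1).
%   A       : sparse n-by-n matrix
%   j       : index of the column m_j being computed
%   J       : initial sparsity pattern, e.g. J = j for a diagonal start
%   eps_tol : residual tolerance
%   maxnnz  : maximal number of nonzeros allowed in m_j
    n  = size(A, 1);
    J  = J(:);
    ej = zeros(n, 1); ej(j) = 1;
    mhat = full(A(:, J)) \ ej;                 % least-squares problem (4.9)
    rj = ej - A(:, J) * mhat;                  % residual (4.10)
    while norm(rj) > eps_tol && numel(J) < maxnnz
        L = find(rj);                          % step (c): rows with nonzero residual
        cand = setdiff(find(any(A(L, :), 1)), J');   % step (d): new candidate columns
        if isempty(cand), break; end
        rho2 = zeros(numel(cand), 1);
        for t = 1:numel(cand)                  % step (e): cheap estimate of the gain
            ak = A(:, cand(t));
            rho2(t) = norm(rj)^2 - full(rj' * ak)^2 / norm(ak)^2;
        end
        [~, best] = min(rho2);                 % keep only the single most profitable index
        J = [J; cand(best)];
        mhat = full(A(:, J)) \ ej;             % a real code would update the QR factors
        rj = ej - A(:, J) * mhat;
    end
    mj = sparse(J, 1, mhat, n, 1);
end

Looping this function over j = 1, …, n (in parallel, in an actual implementation) yields the full preconditioner M = [m_1, …, m_n].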

4.3 Factorized sparse approximate inverses

A different type of sparse approximate inverse preconditioner is constructed by incomplete inverse factorization. Suppose that A is nonsingular and admits the factorization A = LDU, where L and U^T are unit lower triangular and D is diagonal, that is, A admits a scaled LU factorization. Hence the inverse is factorized as follows: A^{-1} = U^{-1} D^{-1} L^{-1} = Z^T D^{-1} W, where Z := U^{-T} and W := L^{-1} are unit lower triangular matrices. We expect Z and W to be dense, even when the original matrix A is sparse. In factorized sparse approximate inverse methods we compute sparse approximations Z̃ ≈ Z and W̃ ≈ W such that the preconditioner is defined by [2]:

M = Z̃^T D̃^{-1} W̃ ≈ A^{-1}    (4.11)

where D̃ is a nonsingular diagonal matrix such that D̃ ≈ D. In this subsection we treat two well-known methods that are based on this concept. The first one is the Factorized Sparse Approximate Inverse (FSAI), which computes a direct approximation of the inverse factors; the method was introduced by Kolotilina and Yeremin [13, 16]. The second method constructs the approximate inverse factors with an incomplete biconjugation procedure. This method is known as AINV and was proposed by Benzi and Tůma [3, 4].

4.3.1 FSAI

The FSAI method is particularly useful for SPD matrices because, as we will see later, the obtained preconditioner will be SPD as well. The same holds for the preconditioned matrix, and hence the preconditioned system can be solved with the CG method. However, we first treat general nonsymmetric matrices [16] and then derive that the method leads to SPD preconditioners for SPD matrices [2, 13]. In FSAI we assume A to be a nonsingular matrix that allows an unscaled LU factorization, i.e. A = LU and A^{-1} = U^{-1} L^{-1}. The aim is to construct an incomplete inverse factorization of the form M = G_U G_L, where G_L and G_U^T are lower triangular matrices that have some sparse structure. The sparsity pattern must be prescribed beforehand; we require G_L and G_U^T to be constrained to a triangular zero pattern S_L satisfying [16]:

{(i, j) | i < j} ⊆ S_L ⊆ {(i, j) | 1 ≤ i ≠ j ≤ n}.    (4.12)

Initially we construct unscaled factors Ḡ_L and Ḡ_U^T that are constrained to the same set. When we denote the entries of a matrix K by {K}_ij, then these factors are constrained to S_L in the following way:

{Ḡ_L A}_ij = δ_ij for (i, j) ∉ S_L,   {Ḡ_L}_ij = 0 for (i, j) ∈ S_L,
and
{A Ḡ_U}_ij = δ_ij for (j, i) ∉ S_L,   {Ḡ_U}_ij = 0 for (j, i) ∈ S_L.    (4.13)

The factor Ḡ_L is computed by rows. Denote the i-th row by l_i; from equation (4.13) it follows that:

(l_i A)_j = δ_ij   for (i, j) ∉ S_L,  i = 1, …, n.    (4.14)

If we additionally define the set M = {j | (i, j) ∉ S_L}, where i is fixed, then the systems of equation (4.14) are equivalent to the systems of submatrices

l̂_i Â_i = ê_i   for i = 1, …, n,

where l̂_i := l_i(M), Â_i := A(M, M) and ê_i := e_i(M). Notice that each system is independent; thus the method is, just like SPAI, inherently parallelizable. Also notice that if the matrices Â_i are nonsingular, then the factor Ḡ_L is uniquely determined.

For the factor Ḡ_U we can reason in a similar way; its entries are computed by columns. Denote the j-th column by u_j and define the set:

N = {i | (j, i) ∉ S_L}, where j is fixed. Then the columns of Ḡ_U are determined by solving the systems of submatrices:

Â_j û_j = ê_j   for j = 1, …, n,

where û_j := u_j(N), Â_j := A(N, N) and ê_j := e_j(N). Again, if the matrices Â_j are nonsingular, then the factor Ḡ_U is uniquely determined. Also, in this case it can be shown that the diagonal entries of Ḡ_U and Ḡ_L are the same [16].

Assume that the diagonal entries of both Ḡ_L and Ḡ_U are all nonzero. The actual factors G_L and G_U are then defined by:

G_L := diag(|{Ḡ_L}_11|^{-1/2}, …, |{Ḡ_L}_nn|^{-1/2}) Ḡ_L    (4.15)

and

G_U := Ḡ_U diag(sign({Ḡ_U}_11)|{Ḡ_U}_11|^{-1/2}, …, sign({Ḡ_U}_nn)|{Ḡ_U}_nn|^{-1/2}).    (4.16)

The factors can be applied as a two-sided preconditioner (see equation (3.10)), which corresponds to the approximate inverse M = G_U G_L. In this case the diagonal entries of the preconditioned matrix are all equal to 1, that is {G_L A G_U}_ii = 1 for i = 1, …, n.

We will show that FSAI is a symmetry-preserving preconditioner. Since for symmetric matrices it holds that (A Ḡ_U)^T = Ḡ_U^T A, it follows from equation (4.13) that the factors Ḡ_L and Ḡ_U^T are constrained to S_L in the same way, hence Ḡ_L = Ḡ_U^T. If both factors are uniquely determined, it is then obvious from equations (4.15) and (4.16) that the resulting preconditioner M is symmetric. If in addition the matrix is positive definite, it can be shown that the diagonal entries of Ḡ_L are all positive [16]. Then we only require the computation of the single factor

G_L := diag({Ḡ_L}_11^{-1/2}, …, {Ḡ_L}_nn^{-1/2}) Ḡ_L    (4.17)

where Ḡ_L is constructed according to equation (4.13). It follows that G_U = G_L^T and the preconditioner is given by M = G_L^T G_L. Moreover, the preconditioned matrix G_L A G_L^T is SPD as well [2]. The main benefit of FSAI is that for SPD matrices we can apply the CG method to the preconditioned system. Finally, we mention that the following theorem can be derived for SPD matrices [16].

Theorem 4.5 Let A be an SPD matrix and define the nonnegative quadratic functional

F(X) := ||I − X L||_F²,

where L is the Cholesky factor of A. If we require for the unscaled factors that Ḡ_L = Ḡ_U^T, constrained to the triangular sparsity pattern S_L, then the computation of the unscaled factor of the FSAI preconditioner prescribed in subsection 4.3.1 is equivalent to the following minimization problem:

min F(X)   with X constrained to S_L. 

In conclusion, for SPD matrices the FSAI algorithm is equivalent to a Frobenius norm minimization involving the Cholesky factor L of A, i.e. the lower triangular matrix L such that A = L L^T, which is known to exist for SPD matrices. Remarkably, the factor L itself is never needed to carry out this minimization: the normal equations of the problem only involve the product L L^T = A.
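
A minimal MATLAB sketch of the SPD construction follows; it assumes A is SPD, uses (as one illustrative choice) the lower triangular pattern of A as the allowed nonzero pattern of the factor, and is meant only to illustrate equations (4.13) and (4.17), not to be an efficient implementation.

function G = fsai_spd(A, P)
% FSAI_SPD  FSAI factor G for an SPD matrix A (illustrative sketch).
%   P : sparse lower triangular 0/1 matrix with the allowed pattern of G,
%       e.g. P = tril(spones(A)). The preconditioner is M = G'*G and the
%       preconditioned matrix is G*A*G'.
    n = size(A, 1);
    rows = []; cols = []; vals = [];
    for i = 1:n
        Midx = find(P(i, :));              % allowed column indices in row i (all <= i)
        Ahat = full(A(Midx, Midx));        % small dense SPD submatrix
        ehat = double(Midx == i)';         % e_i restricted to Midx
        lhat = (Ahat \ ehat)';             % row i of the unscaled factor (A is symmetric)
        rows = [rows, i * ones(1, numel(Midx))];
        cols = [cols, Midx];
        vals = [vals, lhat];
    end
    Gbar = sparse(rows, cols, vals, n, n); % unscaled factor, constrained as in (4.13)
    d = full(diag(Gbar));                  % diagonal entries, positive for SPD A
    G = spdiags(1 ./ sqrt(d), 0, n, n) * Gbar;   % scaling of equation (4.17)
end

The CG method is then applied to the symmetrically preconditioned matrix G*A*G'.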

4.3.2 AINV

The AINV method is based on an incomplete biconjugation procedure. The complete biconjugation procedure is an alternative method for explicitly determining A^{-1}. The following is the definition of A-biconjugacy.

Definition 4.6 Two sets of vectors {z_i}_{i=1}^{n} and {w_i}_{i=1}^{n} are A-biconjugate if

w_i^T A z_j = 0 if and only if i ≠ j. 

In the biconjugation procedure we construct two sets of vectors that are A-biconjugate, assuming that two such sets exist. Let two matrices Z and W be defined by:

Z := [z1, . . . , zn]

W := [w1, . . . , wn]

By definition 4.6 we can also define p_i := w_i^T A z_i ≠ 0 for i = 1, …, n. Then it follows that W^T A Z = D := diag(p_1, …, p_n). When A is nonsingular, then so are W and Z, and it follows that:

A^{-1} = Z D^{-1} W^T = Σ_{i=1}^{n} z_i w_i^T / p_i    (4.18)

We conclude that when we are able to construct two sets of vectors which are A-biconjugate, we have explicitly determined A^{-1}. The two sets can be constructed from any two linearly independent sets {u_i}_{i=1}^{n} and {v_i}_{i=1}^{n} by performing a generalized Gram–Schmidt orthogonalization [4]. For numerical applications one would generally choose u_i = v_i = e_i. The biconjugation procedure is given in Algorithm 4.2. Here we have denoted the rows of A and A^T by a_{i*} and c_{i*} respectively.

Algorithm 4.2 Biconjugation algorithm
1: Let w_i^(0) = z_i^(0) = e_i for i = 1, …, n
2: for i = 1, …, n do
3:   for j = i, …, n do
4:     p_j^(i−1) := a_{i*} z_j^(i−1)
5:     q_j^(i−1) := c_{i*} w_j^(i−1)
6:   end for
7:   If i = n go to (13)
8:   for j = i + 1, …, n do
9:     z_j^(i) = z_j^(i−1) − (p_j^(i−1) / p_i^(i−1)) z_i^(i−1)
10:    w_j^(i) = w_j^(i−1) − (q_j^(i−1) / q_i^(i−1)) w_i^(i−1)
11:  end for
12: end for
13: Let z_i := z_i^(i−1), w_i := w_i^(i−1) and p_i := p_i^(i−1) for i = 1, …, n
14: Set Z = [z_1, …, z_n], W = [w_1, …, w_n] and D = diag(p_1, …, p_n)

Notice that the process can fail in lines 9 and 10 due to division by a zero or very small pivot. The following theorem tells us when the biconjugation algorithm does not break down.

Theorem 4.7 In exact arithmetic, Algorithm 4.2 does not break down if and only if all n leading minors of A are nonzero. 

Recall the definition of leading minors:

Definition 4.8 The kth leading minor of a matrix A is the determinant of its upper-left k × k submatrix. 

In case the process does not break down, the matrices Z and W are unit upper triangular. Hence A = W^{-T} D Z^{-1} is an LDU factorization, and the algorithm computes W = L^{-T} and Z = U^{-1} explicitly. Again we expect the factors Z and W to be dense, which makes the complete biconjugation procedure in general too expensive. With an incomplete biconjugation we aim to produce, in a relatively cheap way, factors W̃, Z̃ and D̃ such that we obtain the preconditioner

M := Z̃ D̃^{-1} W̃^T ≈ A^{-1}    (4.19)

which is of the form of equation (4.11). The incomplete biconjugation procedure can be realized by prescribing a priori sparsity patterns, see e.g. subsection 4.1.1, and dropping the elements in Algorithm 4.2 that fall outside the sparsity pattern. The factors Z and W are typically dense; however, for sparse matrices it is observed that very often many entries have small magnitude. Hence, the sparsity pattern can also be determined by an adaptive technique: we can drop elements according to a user-defined tolerance and allow a maximal amount of fill-in, similarly to the ILUT algorithm (A.1). In practice the algorithm is implemented in a more sophisticated way to prevent breakdowns due to small pivots; this is done by transforming the system with a diagonal shift A_1 = A + αI and by performing pivot modifications. The interested reader is referred to the article by Benzi and Tůma [4].
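
The incomplete biconjugation idea can be sketched in MATLAB as follows; the drop rule (a plain absolute tolerance, assumed smaller than 1 so the unit diagonal is kept) and the function name are simplifications chosen for this illustration, whereas the actual AINV code is considerably more careful about small pivots.

function [Z, W, D] = ainv_sketch(A, droptol)
% AINV_SKETCH  Incomplete biconjugation (illustration of Algorithm 4.2 with dropping).
%   Returns sparse unit upper triangular Z, W and diagonal D such that
%   M = Z * inv(D) * W' approximates inv(A), cf. equation (4.19).
    n = size(A, 1);
    Z = speye(n);
    W = speye(n);
    p = zeros(n, 1);
    for i = 1:n
        pii = full(A(i, :) * Z(:, i));        % p_i = a_i* z_i
        qii = full(A(:, i)' * W(:, i));       % q_i = c_i* w_i, c_i* being the i-th row of A'
        if abs(pii) < eps || abs(qii) < eps
            error('Breakdown: (nearly) zero pivot, cf. Theorem 4.7.');
        end
        p(i) = pii;
        for j = i+1:n
            pj = full(A(i, :) * Z(:, j));
            qj = full(A(:, i)' * W(:, j));
            Z(:, j) = Z(:, j) - (pj / pii) * Z(:, i);
            W(:, j) = W(:, j) - (qj / qii) * W(:, i);
            % incompleteness: drop small entries to preserve sparsity
            zj = Z(:, j); zj(abs(zj) < droptol) = 0; Z(:, j) = zj;
            wj = W(:, j); wj(abs(wj) < droptol) = 0; W(:, j) = wj;
        end
    end
    D = spdiags(p, 0, n, n);
end

In practice the resulting preconditioner is applied to a vector v as Z * (D \ (W' * v)), i.e. two sparse triangular products and a diagonal scaling, without ever forming M explicitly.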

4.3.3 Factorized vs unfactorized preconditioners

We end this subsection with a discussion of factorized versus unfactorized preconditioners. We already mentioned that with factorized preconditioners we are able to preserve symmetry. However, we are also interested in the construction cost. Let m be the number of nonzeros in a row of M. Then for the described methods, the required number of flops for constructing this row is of the order O(m³). For a preconditioner in factorized form, M = M_1 M_2, the same holds for the separate factors M_1 and M_2. However, in the factorized case the number of nonzeros per row is normally halved [6]. Therefore the factorized preconditioner may be cheaper to construct. We conclude that it can be advantageous to compute factorized preconditioners for unsymmetric and indefinite systems as well.

4.4 Approximate inverses of ILU-factorizations

In this subsection we treat the type of sparse approximate inverse preconditioners that are constructed by an approximate inversion of an incomplete LU factorization. The factorization may be obtained by techniques such as ILU(p) or ILUT, but for the purpose of this discussion we assume that the factorization L̃Ũ ≈ A is given. The preconditioner M ≈ A^{-1} is obtained by constructing an approximation of the product Ũ^{-1} L̃^{-1}. The approximation is constructed by considering the incomplete factors separately. We briefly treat how the approximate inverse of L̃ is determined (the process for Ũ is similar).

Given the factor L̃, we can construct its inverse column-wise. The i-th column of the inverse is L̃^{-1} e_i, and hence this column is the solution to the following linear system:

L̃ x_i = e_i    (4.20)

By solving the system for i = 1, …, n we construct L̃^{-1}. In principle the columns can be constructed in parallel, but there are some issues we need to take into account. First of all, when we construct the i-th column we have to solve a system with the trailing submatrix L̃(i : n, i : n). Thus the cost of constructing x_1 is much higher than the cost of constructing x_n, which means that we have to apply load balancing. Secondly, we note that there is some need for communication in the process.

The system of equation (4.20) is of course solved only approximately. This is for example done by applying forward substitution while forcing a sparsity pattern on the solution, either by applying a drop tolerance or by prescribing a sparsity structure and perhaps a maximal amount of fill-in. For practical reasons it is best to apply the dropping rules during the substitution. In determining a structure for the approximate inverses of the incomplete factors, there will be some communication in the process. When the factors from an incomplete LU factorization are ill-conditioned, they can still be used by this type of technique. However, these methods are less popular than other approaches because they are more difficult to implement: the two levels of incompleteness, the ILU factorization and the approximate inversion of the factors, require more user-defined parameters. Moreover, both phases of the process are not readily implemented in parallel.
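
A sketch of this column-wise approximate inversion of a lower triangular factor is given below; the dropping rule (relative to the first entry of the column) is one simple possibility and the function name is made up.

function Linv = approx_inv_lower(L, droptol)
% APPROX_INV_LOWER  Approximate inverse of a sparse lower triangular factor.
%   Column i of inv(L) solves L*x = e_i; forward substitution is performed
%   while entries below the drop tolerance are discarded as soon as they are
%   computed, which keeps the columns sparse.
    n = size(L, 1);
    rows = []; cols = []; vals = [];
    for i = 1:n
        x = zeros(n, 1);
        x(i) = 1 / L(i, i);
        for k = i+1:n
            % forward substitution restricted to the already computed part of x
            s = L(k, i:k-1) * x(i:k-1);
            if s ~= 0
                xk = -s / L(k, k);
                if abs(xk) > droptol * abs(x(i))   % simple dropping rule
                    x(k) = xk;
                end
            end
        end
        idx = find(x);
        rows = [rows; idx];
        cols = [cols; i * ones(numel(idx), 1)];
        vals = [vals; x(idx)];
    end
    Linv = sparse(rows, cols, vals, n, n);
end

Note that the inner loop is longest for i = 1 and shortest for i = n, which is exactly the load-balancing issue mentioned above.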

5 Software Implementation

In parallel software implementations the work is distributed over multiple processors to achieve high performance. In the previous section we discussed that for sparse approximate inverse preconditioning techniques, such as SPAI and FSAI, the rows or columns are computed independently from each other. Hence, the construction of the rows and columns can be distributed over multiple processors. The parallel implementation does, however, add to the complexity of the process. In order to achieve high performance we require specialized programming techniques. We will briefly mention the most important ones.

The computation of a sparse approximate inverse that contains many nonzero entries per row may be excessively expensive. Assume that the preconditioner is computed by rows. In order to construct the rows efficiently, we have to perform load balancing, i.e. we have to distribute the work in such a way that each processor has approximately the same workload. Even when each processor owns the same number of rows of the matrix, the construction phase may be unbalanced because the sparsity pattern may be irregular. The load balancing is realized by repartitioning the distribution of the work. In this process the data transfer should be reduced as much as possible, as this slows down the process as well. A consequence of parallel implementation is the need for communication between the processors. For example, the construction of an a priori zero pattern may be determined by powers of the sparsified matrix Ã, as described in section 4.1.1. Rows of Ã^l are constructed by merging rows of Ã, and in this process it will occur that a processor requires rows that are stored on a different processor. The first processor then has to communicate with the other; this is called one-sided communication. This type of communication may become asynchronous and slow down the process. By predetermining the number of requests a processor will receive, this behavior is avoided; this technique is called global communication. One can imagine that the communication becomes even more complex for highly sequential processes such as incomplete LU factorizations. Finally, we have to take the matrix data structure into account. If in the construction of a preconditioner the matrix is mainly accessed by rows, then the data structure of A should allow for efficient access by row. It might be necessary to store A twice, such that it is efficiently accessed both by rows and by columns.

5.1 Software libraries and packages

Above we discussed the difficulties in parallel implementations for large systems. For a wide range of problems, there are libraries and packages readily available on the internet that deal with these issues. We will discuss the ones that are relevant to this thesis. First we take a look at the library PETSc, a large library which can be used both for parallel Krylov subspace solvers and for parallel preconditioning (including sparse approximate inverse preconditioners). Next we treat some packages that are more specific to certain sparse approximate inverse techniques.

5.1.1 PETSc

Name PETSc - The Portable, Extensible Toolkit for Scientific Computation. [1]

Author(s) S. Balay, J. Brown, K. Buschelman, V. Eijkhout, W. Gropp, D. Kaushik, M. Knepley, L. Curfman McInnes, B. Smith, and H. Zhang.

Language The software is written in C, C++, Fortran and Python.

MATLAB PETSc is directly callable from MATLAB.

Description PETSc is a large library of data structures and routines, specialized for parallel implementations on high-performance computers. It includes parallel Krylov subspace solvers as well as sparse approximate inverse preconditioners.

Website http://www.mcs.anl.gov/petsc/index.html

PETSc is a large library of data structures and routines for parallel computation on high-performance computers. Many of the routines have been derived from computations on numerical solutions of partial differential equations and related problems. The experience of the authors has led to efficient parallel implementations; PETSc deals with the practical issues that are associated with parallel programming and computations with large systems. Some examples are [1]:

• Index sets (IS), including permutations, for indexing into vectors, renumbering, etc.
• Vectors (Vec).
• Matrices (Mat) (generally sparse).
• Krylov subspace methods (KSP) (over fifteen).
• Dozens of preconditioners, including sparse direct solvers (PC).

In conclusion, PETSc provides a rich environment of efficient algorithms and data structures for a user that has to perform parallel computations. The library provides parallel implementations of Krylov subspace solvers and is therefore especially relevant to the theory of this thesis.

5.1.2 SPAI

Name SPAI - SParse Approximate Inverse Preconditioner [7].

Author(s) Marcus Grote, Stephen Barnard. The configuration was programmed by Oliver Bröker and Michael Hagemann.

Language The software is written in C/MPI (see the footnote under subsection 5.1.3 for MPI).

MATLAB SPAI provides a MATLAB interface.

Description Given a sparse matrix A, the SPAI algorithm computes a sparse approximate inverse M by Frobenius norm minimization. The algorithm can be applied both as a static and as an adaptive technique. The preconditioner can be applied within an iterative method. The package includes a Bi-CGSTAB routine which is only intended for testing.

Website http://cccs.unibas.ch/en/education/software-packages

The SPAI algorithm computes, in parallel, a sparse approximate inverse preconditioner by Frobenius norm minimization, see equation (4.5). The algorithm can be used as a static technique; in this case the sparsity pattern is fixed by either a banded structure or a subset of the sparsity pattern of A. This simple approach may not be robust, and as an alternative we can determine the sparsity pattern iteratively by an adaptive technique, as is done in Algorithm 4.1; this algorithm was also suggested by Marcus Grote, one of the main authors of the package. If SPAI is used as an adaptive technique, the algorithm proceeds until the Frobenius norm of AM − I is less than a user-defined threshold eps. The threshold must be between 0 and 1. It is suggested to start with a relatively large value, like 0.7, and then to decrease eps until one finds an accurate preconditioner. The user must also define a parameter ns: this is the maximum number of improvement steps per column in SPAI. From section 4.2 we know that the preconditioner M is constructed column by column. If the maximum number of improvement steps is reached and the residual is not less than eps, then SPAI uses the best approximation obtained so far. It is suggested to use values of ns such that some columns do not achieve the required accuracy eps. The included Bi-CGSTAB routine can be used for testing, but it is mentioned that this routine is not efficient in a parallel environment. For practical implementations, there is a PETSc interface to SPAI. The SPAI algorithm is robust, inherently parallel, ordering independent and effective on nonsymmetric and ill-conditioned problems.

5.1.3 ParaSails

Name ParaSails - Parallel sparse approximate inverse preconditioner, using a priori sparsity patterns and least-squares (Frobenius norm) minimization (symmetric positive definite and general versions) [7].

Author(s) Edmond Chow.

Language The software is written in C.

MATLAB ParaSails does not provide a MATLAB interface.

5 Message Passing Interface (MPI) is a standardized and portable message-passing system designed by a group of researchers from academia and industry to function on a wide variety of parallel computers.

Description ParaSails is a parallel implementation of a sparse approximate inverse preconditioner, using a priori sparsity patterns and least-squares (Frobenius norm) minimization. The package includes parallel CG and GMRES solvers and a parallel matrix class. SPD problems are preconditioned with a factorized SPD preconditioner and general (nonsymmetric and/or indefinite) problems are handled with an unfactorized preconditioner. ParaSails uses post-filtering techniques to reduce the cost of applying the preconditioner.

Website http://www.llnl.gov/CASC/parasails

The a priori sparsity pattern in ParaSails is determined by constructing, in parallel, powers of the binary matrix Ã, see equation (4.3). In tests performed by Chow, powers of levels up to 4 and thresholds less than 0.3 were suitable parameters for producing an accurate preconditioner [6]. For SPD problems a factorized SPD sparse approximate inverse is constructed by Frobenius norm minimization of the Cholesky factor (cf. subsection 4.3.1). General problems are preconditioned with an unfactorized sparse approximate inverse M of the original matrix, see equation (4.5). The preconditioning step is executed more efficiently by a post-filtering technique on M (or on the factor G_L): entries of the scaled preconditioner that are smaller than a user-defined threshold, called the filter value, are dropped. In tests performed by Chow, filter values between 0.05 and 0.1 have been shown to give good results.

5.1.4 HSL MI12

Name HSL MI12.

Author(s) Numerical Analysis Group at the STFC Rutherford Appleton Laboratory

Origin N.I.N. Gould and J.A. Scott

Language The software is written in Fortran 77.

MATLAB HSL MI12 does not provide a MATLAB interface.

Description The routine finds a sparse approximate inverse M by Frobenius norm minimization. The process may be improved by first performing a block triangularization of A and then computing approximate inverses of the resulting diagonal blocks.

Website http://www.hsl.rl.ac.uk/catalogue/mi12.xml

The abbreviation HSL stands for Harwell Subroutine Library; it is a collection of packages that were written and developed by the Numerical Analysis Group at the STFC Rutherford Appleton Laboratory and other experts. The MI packages are specialized for iterative methods for sparse matrices. The MI12 package computes a sparse approximate inverse preconditioner for sparse unsymmetric matrices. The process is, similarly to the SPAI package, based on Frobenius norm minimization of AM − I. In addition, the package includes an option to perform a block triangularization of A as follows:

        [ A_11  A_12  …  A_1b ]
P A Q = [       A_22  …  A_2b ]
        [              ⋱  ⋮   ]
        [                A_bb ]

where P and Q are permutation matrices. When a block triangularization is performed, the preconditioner is constructed from sparse approximate inverses of the diagonal blocks. These are determined by a Frobenius norm minimization of A_jj M_jj − I_j for j = 1, …, b. In both cases the sparsity pattern is determined iteratively, starting with a completely zero pattern. The algorithm allows a maximum number l of nonzero entries in each column of M, according to a user-defined parameter. The sparsity pattern is augmented in a similar way as in the SPAI algorithm (4.1). As in the SPAI package, a threshold eps should be defined such that the process is terminated whenever ||A_jj m_jj^i − e_i|| < eps, where m_jj^i is the i-th column of M_jj.

6 Numerical experiments

In this section we describe some numerical experiments with sparse approximate inverse techniques. First we consider test problems with two matrices from practical applications and test the effectiveness of SPAI preconditioners in comparison with the more commonly used ILUT preconditioners. Secondly, we consider a practical application of SPAI preconditioners to electromagnetic scattering problems.

6.1 Comparative study on SPAI and ILUT preconditioners

In this section we report on experiments with SPAI and ILUT preconditioners applied to Krylov methods. In this comparative study we show that the sparse approximate inverse technique SPAI gives good results for two unsymmetric indefinite systems. In the second case we consider an ill-conditioned system; even then SPAI produces an accurate preconditioner and fast convergence is achieved. The more commonly used ILUT algorithm leads to poor convergence behavior, and for the ill-conditioned system the process breaks down because a zero pivot is encountered in the construction phase of the preconditioner.

The two matrices that we consider in the experiments originate from physical problems and are freely available on the website http://math.nist.gov/MatrixMarket. The first matrix belongs to the set denoted by BRUSSEL, a set of matrices obtained from the reaction-diffusion Brusselator model. The matrix that we consider has the matrix name RDB2048; for the sake of both notation and brevity we refer to this matrix by A_rdb. The matrix is supplied by K. Meerbergen from the Katholieke Universiteit Leuven and A. Spence from the University of Bath. The field of application is chemical engineering. The matrix is unsymmetric indefinite, is of size n = 2048 and has condition number K(A_rdb) = 1.81 · 10³, see figure 4 (left). The nonzero structure of the matrix is represented with a density plot; this is a coloured plot of the sparsity pattern from which we readily observe the pattern of the largest entries.

Figure 4: Matrices RDB2048 (left) and FIDAP032 (right).

The second matrix belongs to the set FIDAP, a set of matrices that is generated by the software package FIDAP. The matrix that we consider has the matrix name FIDAP032 and we refer to it by A_fdp. The matrix is supplied by Isaac Hasbani from Fluid Dynamics International and Barry Rackner from the Minnesota Supercomputer Center. The field of application is finite element modeling. The matrix is unsymmetric indefinite and is of size n = 1159, see figure 4 (right). With condition number K(A_fdp) = 5.24 · 10²⁰ we consider the corresponding system to be ill-conditioned.

We have explicitly determined the inverse of Ardb in MATLAB. The density plot is shown in figure 5. We notice that, although the original matrix is very sparse, the inverse is completely dense. This is typically observed for sparse matrices. However in many cases, and also in this particular case, a lot of the entries outside the sparsity pattern of the original matrix have small magnitude. We conclude that an inverse may be effectively approximated by a sparse matrix. We mention that Afdp is too ill-conditioned to invert in MATLAB.

Figure 5: Density plot of the explicit inverse A_rdb^{-1}.

With the package SPAI, described in subsection 5.1.2, we have constructed sparse approximate inverses of the two matrices. The right-hand side of each system was computed by taking for x the vector of all ones, so that the exact solution is always known. SPAI was used as an adaptive technique; for the threshold eps we chose 0.6, for the parameter ns we chose 5, and we allowed a maximum of 5 nonzero entries per column. The meaning of these parameters is described in subsection 5.1.2. The parameters were chosen such that we obtained good results with the Bi-CGSTAB routine included in the SPAI package. Although we could compute the forward error explicitly, since we know the exact solution, we used the residuals for the stopping criterion, as is standard for the Bi-CGSTAB routine. We applied right-preconditioning, so that the residuals are not changed. The tolerance for the stopping criterion was 10⁻⁸.
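
As an aside, a precomputed approximate inverse M can also be used with MATLAB's built-in bicgstab; the preconditioner argument may be a function handle that returns the preconditioned vector, which for an approximate inverse is just a matrix-vector product. The sketch below assumes A and M are already in the workspace (e.g. read from the Matrix Market files); note that MATLAB applies the preconditioner on the left, whereas the experiments above use right-preconditioning.

% Sketch: using a precomputed sparse approximate inverse M with MATLAB's bicgstab.
n     = size(A, 1);
b     = A * ones(n, 1);            % right-hand side so that the exact solution is all ones
tol   = 1e-8;
maxit = 1000;

applyM = @(r) M * r;               % preconditioning step = product with the approximate inverse
[x, flag, relres, iter] = bicgstab(A, b, tol, maxit, applyM);

fprintf('flag = %d, relres = %.2e, iterations = %g, forward error = %.2e\n', ...
        flag, relres, iter, norm(x - ones(n, 1)));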

The obtained preconditioners for A_rdb and A_fdp (M_rdb and M_fdp, respectively) are shown in figure 6 (top left and top right, respectively). Notice that both preconditioners have a very sparse pattern, comparable with the sparsity pattern of the original matrix. Also notice that M_rdb is much sparser than we might have expected from the density plot of A_rdb^{-1} (figure 5). We present convergence plots of the Bi-CGSTAB routine applied to the preconditioned systems A_rdb M_rdb and A_fdp M_fdp in figure 6 (bottom left and bottom right, respectively). Convergence was achieved in 146 iteration steps for A_rdb and in 255 steps for A_fdp. In practice, however, a looser tolerance of, say, 10⁻³ might suffice, in which case convergence would have been achieved in approximately 50 and 100 iteration steps, respectively.

Figure 6: Results of the numerical experiments with the package SPAI applied to the matrices A_rdb and A_fdp. Top: density plots of the SPAI preconditioners for RDB2048 (left) and FIDAP032 (right). Bottom: convergence plots (residual versus iteration step) of Bi-CGSTAB applied to the two preconditioned systems.

We tested the effectiveness of the SPAI preconditioners M_rdb and M_fdp in comparison with the popular and more commonly used ILUT preconditioners. We used the ITSOL package (see footnote 6); this package provides a routine for the standard ILUT algorithm described in subsection 3.2.4. The package also includes a routine for the flexible GMRES with restart m (FGMRES(m)). First, the GMRES(m) routine works as follows: in each cycle we construct a Krylov subspace of size up to m and check whether the solution is accurate enough; in the next cycle, the previous Krylov subspace is discarded and GMRES(m) restarts the process with the latest approximation as initial guess. We refer to one such restart cycle as an outer iteration of GMRES; from the Arnoldi method we know that, for unpreconditioned systems, each outer iteration of GMRES(m) requires m matrix-vector products.

6 Available on the website http://www-users.cs.umn.edu/saad/software/ITSOL/index.html; the software is written in C.

Table 1: Results of GMRES(60) applied to the preconditioned systems RDB2048L and FIDAP032. By "iter" we refer to the outer iterations, the residuals and errors are given in the 2-norm, and the abbreviation "b.d." stands for a breakdown in the construction of the preconditioner.

GMRES(60), Toler. 1e-8 |  RDB2048L          |  FIDAP032
                       |  iter    residual  |  iter    residual
SPAI                   |  3       8.98e-09  |  23      1.00e-08
ILUT                   |  +5000   3.93e+02  |  b.d.    b.d.

Notice that an outer iteration of GMRES(m) is therefore, in terms of computational work, equivalent to m iterations of GMRES. The algorithm FGMRES is called flexible because it allows a variable preconditioner in each outer iteration step; in our case the preconditioner remains the same, so we interpret the included solver simply as GMRES(m). The right-hand sides of the systems are again computed by taking for x the vector of all ones. For the construction of the ILUT preconditioners we have chosen a drop tolerance of 0.3 and a level of fill of 50. For the dimension of the Krylov subspace in FGMRES we have chosen m = 60. The SPAI preconditioners M_rdb and M_fdp were stored in MATLAB and we solved the systems with a similar GMRES(60) routine. We present the results in table 1. For the well-conditioned system RDB2048L we notice that the SPAI preconditioner is very effective, whereas the ILUT preconditioner did not manage to converge within 5000 outer iterations (the maximal number of iterations allowed); at that point the forward error is still significant. The ill-conditioned system FIDAP032 converges in 23 outer iterations when we use the SPAI preconditioner. However, from the forward error we observe that the system did not converge to an accurate solution; since the system is ill-conditioned it is prone to numerical errors and we cannot always fix this with a preconditioner. The SPAI preconditioner still performs much better, as for this second matrix the construction of the ILUT preconditioner breaks down due to a zero pivot. In this comparative study on preconditioners we have shown that sparse approximate inverse techniques may provide accurate preconditioners for unsymmetric indefinite systems and that they should be valued as alternatives to standard techniques based on incomplete LU decomposition.

6.2 Electromagnetic Scattering - the Satellite Problem

In this subsection we review the electromagnetic scattering problem for a simplified satellite, see figure 7. The problem addresses the physical issue of detecting the diffraction pattern of the electromagnetic radiation scattered from the satellite when it is illuminated by an incident incoming wave. Electromagnetic scattering problems of large and complex bodies are of interest in the design of many industrial devices, like radars, antennae, computer microprocessors, cellular telephones and so on. Maxwell's equations describe how electric and magnetic fields are generated and altered by each other, and by charges and currents. However, for computational purposes the problem is converted to an integral formulation in terms of the surface current of the object. Once the surface current is known, we can derive the diffraction pattern of the electromagnetic radiation using Maxwell's equations. Hence, the surface current is the unknown quantity. The linear systems that are derived from these types of problems are among the most difficult to solve by iterative methods. The need for preconditioning of these problems, and especially the effectiveness of sparse approximate inverse techniques based on Frobenius norm minimization in comparison with other preconditioning techniques, is described by B. Carpentieri [5].

Figure 7: Simplified model of a satellite with a mesh of triangles on its surface.

The discretization of the problem is realized by constructing a mesh of triangles on the surface of the satellite, see figure 7. The unknowns of the system are associated with the vectorial flux across the edges of the mesh; hence, each column of the matrix that arises from this problem corresponds to an edge in the mesh. The right-hand side depends on the frequency and the direction of the incoming wave. For the purpose of this thesis we will not go into the details of the construction of the linear system. We do mention that the frequency ω is related to the wavelength λ by ω = 2πc/λ, where the constant c is the speed of light. For physical consistency we require about ten discretization points per wavelength, so that the linear systems arising from

43 high-frequency scattering problems can be very large.

For the SPAI preconditioner we compute the sparsity pattern in advance; the preconditioner is then constructed by solving the n least squares problems of eqn (4.9). For the a priori pattern selection we consider a model problem: the electromagnetic scattering problem of a sphere, see figure 8. The model problem is representative of the general trend and we will use the same approach for the satellite problem.

Figure 8: Model problem of a sphere with a mesh of triangles on its surface.

In the upper left corner of figure 9 we present the density plot of the matrix that arises from the electromagnetic scattering problem of the sphere. The matrix is completely dense; however, it contains many small entries. We sparsify the matrix by dropping these small values, see figure 9 (bottom left corner). In this figure we also present a density plot of the explicit inverse and the nonzero structure of a sparsified version of the explicit inverse (upper right and bottom right corner, respectively). The general trend observed for these types of problems is that the nonzero structures of the sparsified matrix and of the sparsified explicit inverse are very similar. Since the most important information is maintained in the sparsified matrices, it makes sense to choose as sparsity pattern for the preconditioner M ≈ A^{-1} the nonzero structure of the sparsified matrix A. For the satellite problem we use the same approach. The sparsity pattern that we obtained for the preconditioner is shown in figure 10 (left); the size of the matrix is n = 1699. The preconditioner is obtained by Frobenius norm minimization and is applied as a left-sided preconditioner, eqn (3.8). We solve the system with GMRES(60) with a tolerance of 1.0e−10 for the stopping criterion. Notice that the stopping criterion is based on the preconditioned residual M r^(k) := M(b − A x^(k)), which we call the Arnoldi residual. The method converges within 121 iterations, at which point the actual residual for the unpreconditioned system has the value 0.43e−08. We present the convergence plot in figure 10 (right).

Figure 9: The matrix A corresponding to the model problem of the sphere (upper left), a sparsified version of A (bottom left), the explicit inverse A^{-1} (upper right) and a sparsified version of the explicit inverse (bottom right).

6.2.1 Spectral deflation

In this subsection we propose a refinement technique for the SPAI preconditioner. The goal is to remove the effect of the smallest eigenvalues (in magnitude) from the preconditioned matrix; we know from section 3.1 that eigenvalues close to the origin are bad for convergence. This technique is known as spectral deflation. We apply this technique to the satellite problem to see how the convergence improves. We first consider the model problem of the sphere, as it is representative of the general trend. The eigenvalue distribution of the corresponding matrix is presented in figure 11 (left). We observe that the spectrum is widely spread in the complex plane, which explains the need for preconditioning. When we precondition the system with the Frobenius norm minimization method described earlier for the model problem, the eigenvalues become clustered at the point one, see figure 11 (right). However, we observe that there are still a few eigenvalues located near the origin, and we expect these eigenvalues to slow down the convergence significantly.

45 0 10

−2 10

−4 10

−6 10

Arnoldi residual −8 10

−10 10

−12 10 0 20 40 60 80 100 120 Iteration Step Figure 10: The pattern for the SPAI preconditioner to the satellite problem (left) and the convergence plot for GMRES(60) applied to the preconditioned system (right). as described earlier for the model problem, the eigenvalues become clustered at the point one, see figure 11 (right). However, we observe that there are still a few eigenvalues located near the origin and we expect these eigenvalues to slow down the convergence significantly.

We consider a system that is left-preconditioned, i.e. the system M_1 A x = M_1 b, where M_1 is the preconditioner obtained by Frobenius norm minimization. Assume for simplicity that the preconditioned matrix M_1 A is diagonalizable; this means that there exists a diagonal matrix D = diag(λ_i), where |λ_1| ≤ … ≤ |λ_n|, and a nonsingular matrix V = [v_1, …, v_n], where v_i is an associated right eigenvector, such that:

M_1 A = V D V^{-1}.

Figure 11: The eigenvalue distribution of the coefficient matrix corresponding to the model problem (left) and of the same coefficient matrix preconditioned by the Frobenius norm minimization method (right).

In addition we define the matrix U = [u1, . . . , un], where ui is an associated left eigenvector;

we then have U^H V = diag(u_i^H v_i), with u_i^H v_i ≠ 0 for every i. Here U^H denotes the Hermitian transpose of U, as U may contain complex eigenvectors. Now let V_ε denote the matrix of right eigenvectors associated with the eigenvalues satisfying |λ_i| ≤ ε, and define U_ε in a similar way. It is shown that the following theorem holds [5].

Theorem 6.1 Let

A_c = U_ε^H M_1 A V_ε,
M_c = V_ε A_c^{-1} U_ε^H M_1 and
M = M_1 + M_c.

Then MA is diagonalizable and we have MA = V diag(η_i) V^{-1} with

η_i = λ_i     if |λ_i| > ε,
η_i = 1 + λ_i if |λ_i| ≤ ε.



We do not provide a proof of the theorem, but it is easy to derive that A_c = diag(λ_i · u_i^H v_i), a matrix of size k × k, where k is the number of eigenvalues with absolute value at most ε. By using the diagonalization of M_1 A repeatedly, the result follows. Computing the eigenvalues and the associated eigenvectors is expensive and we want to reduce the cost as much as possible. The following theorem helps to reduce the cost by providing a preconditioner that is similar to the one obtained in theorem 6.1. We state the theorem without proof and refer the interested reader to the article by B. Carpentieri [5].

Theorem 6.2 Let W be such that

Ã_c = W^H A V_ε has full rank,
M̃_c = V_ε Ã_c^{-1} W^H and
M̃ = M_1 + M̃_c.

Then M̃A is similar to a matrix whose eigenvalues are

η_i = λ_i     if |λ_i| > ε,
η_i = 1 + λ_i if |λ_i| ≤ ε.



We expect V_ε to be of full rank k; consequently, a natural choice is to take W = V_ε. Now, without computing the left eigenvectors, we are able to construct a preconditioner M̃ such that the preconditioned matrix M̃A has the desired properties; in general we assume ε ≪ 1, so that the smallest eigenvalues become clustered at the point one as well. We have applied spectral deflation to the satellite problem. With the package ARPACK we have computed the smallest eigenvalues and their corresponding approximate eigenvectors. By successively shifting these eigenvalues, as described in theorem 6.2, we observe how the convergence behavior of GMRES(60) improves. The results are shown in table 2.

Table 2: Effect of shifting the eigenvalues nearest to zero on the convergence of GMRES(60). We show the number of successively shifted eigenvalues and the number of iterations required when these eigenvalues are shifted.

The Satellite Problem
Nr. of shifted eigenvalues | GMRES(60), Toler. 1e-10
 0                         | 121
 1                         | 112
 2                         | 107
 3                         |  97
 4                         |  91
 5                         |  85
 6                         |  80
 7                         |  76
 8                         |  66
 9                         |  60
10                         |  59

The convergence behavior keeps improving as the smallest eigenvalues are successively shifted; by shifting the ten smallest eigenvalues, GMRES(60) already converges twice as fast. We mention some remarks on spectral deflation. First of all, the preconditioning phase of GMRES(60) requires a matrix-vector (M-V) product with the matrix M = M_1 + M̃_c. The cost of the correction update M̃_c v within the M-V product Mv is 2nk + k², where k is the number of shifted eigenvalues. Secondly, the ARPACK package requires a significant number of M-V products for the computation of the smallest eigenvalues and the associated eigenvectors. For example, in order to shift 12 eigenvalues for the model problem of the sphere, we require 131 such products; for the slightly more complex problem of the cylinder, this already increases to 597. This should be compared with the approximately 120N M-V products that are required for N steps of the preconditioned GMRES(60) routine. However, the additional cost of constructing the preconditioner can be amortized if the preconditioner is reused for solving linear systems with the same coefficient matrix and several right-hand sides; this is actually a typical scenario for realistic electromagnetic simulations [5].
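
A minimal MATLAB sketch of the spectral correction of Theorem 6.2, with the choice W = V_ε, is given below. It is only an illustration: the eigenvectors are computed with eigs (MATLAB's interface to ARPACK, which is also used in the experiments above), the product M_1*A is formed explicitly for simplicity, and the function name is made up.

function applyM = deflated_preconditioner(A, M1, k)
% DEFLATED_PRECONDITIONER  Sketch of the spectral correction of Theorem 6.2 (W = V).
%   A  : coefficient matrix, M1 : sparse approximate inverse preconditioner,
%   k  : number of eigenvalues nearest the origin to shift.
%   Returns a function handle applying M = M1 + V*inv(Ac)*V' to a vector.
    B = M1 * A;                               % formed explicitly only for this illustration
    [V, ~] = eigs(B, k, 'smallestabs');       % right eigenvectors of M1*A nearest the origin
                                              % ('sm' in older MATLAB releases)
    Ac = V' * (A * V);                        % k-by-k matrix  Ac = W^H A V  with W = V
    applyM = @(r) M1 * r + V * (Ac \ (V' * r));   % rank-k correction: about 2nk + k^2 extra work
end

The returned handle can then be used as the preconditioning step in a GMRES(m) routine such as the one in Appendix B.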

7 Conclusion

In my thesis I have reviewed techniques for computing a sparse approximation to the inverse of a matrix. We have seen that the inverse of a sparse matrix is typically dense, but may contain a lot of entries with small magnitude. Sparse approximate inverse techniques can be effectively used in many applications of numerical analysis; the main focus of this thesis, however, was on the preconditioning of linear systems. We have discussed the concept of preconditioning and reviewed some well-known standard preconditioning techniques based on incomplete LU decomposition. These types of techniques are effectively used in many applications; however, we have seen that they might be ineffective, or might even fail, on unsymmetric and indefinite systems. The most important causes of failure are a breakdown due to a zero pivot and unstable linear solves in the preconditioning phase. We have shown that sparse approximate inverse techniques do not suffer from these problems and can be effectively used on indefinite and unsymmetric systems. Another issue we addressed in this thesis is the need for parallel implementation on high-performance computers. The standard techniques are in general highly sequential, both in construction and in application, whereas sparse approximate inverse techniques are in general inherently parallelizable. Preconditioning is still the main stumbling block in achieving high performance for large, sparse linear systems, and sparse approximate inverse techniques are promising techniques to overcome this. In a comparative study on preconditioners, we have shown that sparse approximate inverse techniques are competitive and sometimes even superior to the standard techniques based on incomplete LU decomposition. The focus was mainly on preconditioners obtained by Frobenius norm minimization of the inverse, i.e. the SPAI preconditioner. With the SPAI preconditioner we achieved good results on unsymmetric and indefinite (and in a particular case even ill-conditioned) systems from practical applications. With the ILUT preconditioner, convergence was either slow or the construction of the preconditioner broke down due to a zero pivot. Finally, we have reviewed the use of SPAI preconditioners in practical applications of electromagnetic scattering problems. Linear systems arising from electromagnetic scattering problems are among the most difficult systems to solve by iterative methods. Guided by a model problem of a sphere, we have studied the scattering problem of a simplified satellite. With the a priori pattern selection obtained from the nonzero structure of the sparsified matrix of the original problem, we achieved good results when the system was solved with GMRES(60). In addition we proposed a refinement technique based on spectral deflation and showed that the convergence was significantly improved by shifting up to ten of the smallest eigenvalues.

References

[1] S. Balay, J. Brown, K. Buschelman, V. Eijkhout, W. Gropp, D. Kaushik, M. Knepley, L. Curfman McInnes, B. Smith, and H. Zhang. PETSc Users Manual, Revision 3.4. Technical report, Argonne National Laboratory, 2013.

[2] M. Benzi and M. Tůma. A comparative study of sparse approximate inverse preconditioners. Applied Numerical Mathematics, 30(2–3):305–340, 1999.

[3] M. Benzi, C. D. Meyer, and M. Tůma. A sparse approximate inverse preconditioner for the conjugate gradient method. SIAM Journal on Scientific Computing, 17(5):1135–1149, 1996.

[4] M. Benzi and M. Tůma. A sparse approximate inverse preconditioner for nonsymmetric linear systems. SIAM Journal on Scientific Computing, 19(3):968–994, 1998.

[5] B. Carpentieri. Fast iterative solution methods in electromagnetic scattering. Progress In Electromagnetics Research, 79:151–178, 2008.

[6] E. Chow. Parallel implementation and performance characteristics of least squares sparse approximate inverse preconditioners. International Journal of High Performance Computing Applications, 2000.

[7] E. Chow. ParaSails: Parallel sparse approximate inverse (least-squares) preconditioner. Software documentation, Lawrence Livermore National Laboratory, 2001.

[8] E. Chow and Y. Saad. Experimental study of ILU preconditioners for indefinite matrices. Journal of Computational and Applied Mathematics, 86(2):387–414, 1997.

[9] J. J. Dongarra, I. S. Duff, D. C. Sorensen, and H. A. van der Vorst. Numerical Linear Algebra for High-Performance Computers, volume 7 of Software, Environments, and Tools. SIAM, Philadelphia, PA, 1998.

[10] A. Greenbaum. Iterative Methods for Solving Linear Systems. Frontiers in Applied Mathematics, No. 17. SIAM, Philadelphia, 1997.

[11] M. J. Grote and T. Huckle. Parallel preconditioning with sparse approximate inverses. SIAM Journal on Scientific Computing, 18(3):838–853, 1997.

[12] I. C. F. Ipsen and C. D. Meyer. The idea behind Krylov methods. The American Mathematical Monthly, 105(10):889–899, 1998.

[13] L. Yu. Kolotilina and A. Yu. Yeremin. Factorized sparse approximate inverse preconditionings I. Theory. SIAM Journal on Matrix Analysis and Applications, 14(1):45–58, 1993.

[14] A. Quarteroni, R. Sacco, and F. Saleri. Numerical Mathematics, volume 37. Springer, 2007.

[15] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, second edition, 2003.

[16] A. Yu. Yeremin and A. A. Nikishin. Factorized-sparse-approximate-inverse preconditionings of linear systems with unsymmetric matrices. Journal of Mathematical Sciences, 121(4):2448–2457, 2004.

Appendices

A ILUT algorithm

Algorithm A.1 ILUT algorithm, IKJ variant

1:  for i = 2, ..., n do
2:      w := a_{i*}
3:      for k = 1, ..., i−1 and when w_k ≠ 0 do
4:          w_k := w_k / a_{kk}
5:          Apply a dropping rule to w_k
6:          if w_k ≠ 0 then
7:              w := w − w_k u_{k*}
8:          end if
9:      end for
10:     Apply a dropping rule to row w
11:     l_{ij} := w_j for j = 1, ..., i−1
12:     u_{ij} := w_j for j = i, ..., n
13:     w := 0
14: end for
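For concreteness, the following is a minimal MATLAB sketch of the IKJ variant above; it is not part of the thesis codes. It uses a simple drop tolerance tau relative to the 2-norm of each row, while the fill-in limit p of the full ILUT(p, tau) is omitted for brevity. The division by the current diagonal of U shows where the construction breaks down on a zero pivot.

function [L, U] = ilut_sketch(A, tau)
% Minimal MATLAB sketch of Algorithm A.1 (IKJ variant of ILUT).
% Dropping rule: entries smaller in magnitude than tau * norm(A(i,:)) are
% discarded; the fill-in limit p of ILUT(p,tau) is omitted for brevity.
n = size(A, 1);
L = speye(n);                 % unit lower triangular factor
U = sparse(n, n);
U(1, :) = A(1, :);            % the first row is copied unchanged
for i = 2:n
    w = full(A(i, :));        % working row w := a_{i*}
    droptol = tau * norm(w);
    for k = 1:i-1
        if w(k) ~= 0
            w(k) = w(k) / U(k, k);    % multiplier; breaks down if U(k,k) = 0
            if abs(w(k)) < droptol
                w(k) = 0;             % dropping rule applied to w_k
            else
                w(k+1:n) = w(k+1:n) - w(k) * full(U(k, k+1:n));
            end
        end
    end
    d = w(i);                 % never drop the diagonal entry
    w(abs(w) < droptol) = 0;  % dropping rule applied to the row
    w(i) = d;
    L(i, 1:i-1) = w(1:i-1);   % l_{ij} := w_j for j < i
    U(i, i:n)   = w(i:n);     % u_{ij} := w_j for j >= i
end
end

The factors can then be applied as a preconditioner through the two triangular solves U \ (L \ r).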

B MATLAB code for GMRES(m)

The GMRES(m) routine is implemented in MATLAB with the following code.

function [x, error, iter, mvprod, flag] = gmres_spai( A, x, b, M, restrt, max_it, tol )

%  -- Iterative template routine --
%     Univ. of Tennessee and Oak Ridge National Laboratory
%     October 1, 1993
%     Details of this algorithm are described in "Templates for the
%     Solution of Linear Systems: Building Blocks for Iterative
%     Methods", Barrett, Berry, Chan, Demmel, Donato, Dongarra,
%     Eijkhout, Pozo, Romine, and van der Vorst, SIAM Publications,
%     1993. (ftp netlib2.cs.utk.edu; cd linalg; get templates.ps).
%
% [x, error, iter, mvprod, flag] = gmres_spai( A, x, b, M, restrt, max_it, tol )
%
% gmres_spai.m solves the linear system Ax = b using the Generalized
% Minimal Residual method with restarts, GMRES(m).
%
% input   A        REAL nonsymmetric matrix
%         x        REAL initial guess vector
%         b        REAL right hand side vector
%         M        REAL preconditioner matrix (applied by multiplication)
%         restrt   INTEGER number of iterations between restarts
%         max_it   INTEGER maximum number of iterations
%         tol      REAL error tolerance
%
% output  x        REAL solution vector
%         error    REAL error norm
%         iter     INTEGER number of iterations performed
%         mvprod   INTEGER number of actual M-V products
%         flag     INTEGER: 0 = solution found to tolerance
%                           1 = no convergence given max_it

iter = 0;                                          % initialization
flag = 0;
mvprod = 0;

bnrm2 = norm( b );
if ( bnrm2 == 0.0 ), bnrm2 = 1.0; end

r = M * ( b - A*x );
mvprod = mvprod + 1;

error = norm( r ) / bnrm2;
if ( error < tol ), return, end

[n,n] = size(A);                                   % initialize workspace
m = restrt;
V(1:n,1:m+1) = zeros(n,m+1);
H(1:m+1,1:m) = zeros(m+1,m);
cs(1:m) = zeros(m,1);
sn(1:m) = zeros(m,1);
e1 = zeros(n,1);
e1(1) = 1.0;

for iter = 1:max_it,                               % begin iteration

   r = M * ( b - A*x );
   mvprod = mvprod + 1;

   V(:,1) = r / norm( r );
   s = norm( r )*e1;
   for i = 1:m,                                    % construct orthonormal
      w = M * (A*V(:,i));                          % basis using Gram-Schmidt
      mvprod = mvprod + 1;

      for k = 1:i,
         H(k,i) = w'*V(:,k);
         w = w - H(k,i)*V(:,k);
      end
      H(i+1,i) = norm( w );
      V(:,i+1) = w / H(i+1,i);
      for k = 1:i-1,                               % apply Givens rotation
         temp     = cs(k)*H(k,i) + sn(k)*H(k+1,i);
         H(k+1,i) = -sn(k)*H(k,i) + cs(k)*H(k+1,i);
         H(k,i)   = temp;
      end
      [cs(i),sn(i)] = rotmat( H(i,i), H(i+1,i) );  % form i-th rotation matrix
                                                   % (rotmat.m from the same templates)
      temp   = cs(i)*s(i);                         % approximate residual norm
      s(i+1) = -sn(i)*s(i);
      s(i)   = temp;
      H(i,i) = cs(i)*H(i,i) + sn(i)*H(i+1,i);
      H(i+1,i) = 0.0;
      error  = abs(s(i+1)) / bnrm2;
      if ( error <= tol ),                         % update approximation
         y = H(1:i,1:i) \ s(1:i);                  % and exit
         x = x + V(:,1:i)*y;
         break;
      end
   end
   fprintf('iter = %d, res = %g\n', iter, error);

   if ( error <= tol ), break, end
   y = H(1:m,1:m) \ s(1:m);
   x = x + V(:,1:m)*y;                             % update approximation
   r = M * ( b - A*x );                            % compute residual
   mvprod = mvprod + 1;
   s(i+1) = norm(r);
   error = s(i+1) / bnrm2;                         % check convergence
   if ( error <= tol ), break, end;
end

if ( error > tol ), flag = 1; end;                 % no convergence within max_it

% END of gmres_spai.m

The routine is applied to the systems RDB2048 and FIDAP032, with m = 60, by the following script.

clear all; close all; clc

load matrices

fprintf('\n');
fprintf('* RDB2048L problem * \n');

A = rdb2048;
M = rdb2048_spai;

n = size(A,1);
restrt = 60;
tol = 1e-8;
b = A*ones(n,1);
x0 = zeros(n,1);
max_it = n;

[x, error, iter, mvprod, flag] = gmres_spai( A, x0, b, M, restrt, max_it, tol);

%%
fprintf('\n\n');
fprintf('* FIDAP032 problem * \n');

pause(0)

A = fidap032;
M = fidap032_spai;

n = size(A,1);
restrt = 60;
tol = 1e-8;
b = A*ones(n,1);
x0 = zeros(n,1);
max_it = n;

[x, error, iter, mvprod, flag] = gmres_spai( A, x0, b, M, restrt, max_it, tol);

fprintf('\n');
