Solution of General Linear Systems of Equations Using Block Krylov Based Iterative Methods on Distributed Computing Environments

Total Pages: 16

File Type: pdf, Size: 1020 KB

Solution of general linear systems of equations using block Krylov based iterative methods on distributed computing environments

Leroy Anthony Drummond Lewis

December

Résolution de systèmes linéaires creux par des méthodes itératives par blocs dans des environnements distribués hétérogènes

Résumé

We study the implementation of block iterative methods on distributed-memory multiprocessor environments for the solution of general linear systems. First, we examine the potential of the classical conjugate gradient method in parallel environments. We then study another iterative method from the same family as the conjugate gradient, designed to solve linear systems with multiple right-hand sides simultaneously and commonly known as the block conjugate gradient method. The algorithmic complexity of the block conjugate gradient method is higher than that of the classical conjugate gradient method, since it requires more computation per iteration and also more memory. Despite this, the block conjugate gradient method turns out to be better suited to vector and parallel environments. We have studied three variants of the block conjugate gradient method based on different parallel programming models, and their efficiency has been compared on various distributed environments.

For the solution of nonsymmetric linear systems, we consider row projection iterative methods accelerated by the block conjugate gradient method. The block conjugate gradient variants studied earlier are used for this acceleration. In particular, we study the distributed implementation of the block Cimmino method accelerated by the block conjugate gradient method; the combination of these two techniques indeed offers good potential for parallelism.

To obtain good performance from the implementation of this last method on heterogeneous distributed environments, we have studied different strategies for assigning tasks to the processors, and we compare two static schedulers that carry out this assignment. The first aims at maintaining load balance; the second aims first at reducing communication between processors, while still balancing the processor loads as well as possible. Finally, we study preprocessing strategies for linear systems to improve the performance of the iterative method based on the Cimmino method.

Mots clefs: Cimmino method, conjugate gradient, distributed computing, heterogeneous computing, iterative methods, parallel computing, preprocessing of linear systems, scheduling, sparse symmetric linear systems, sparse matrices, sparse nonsymmetric linear systems.

Solution of general linear systems of equations using block Krylov based iterative methods on distributed computing environments

Abstract

We study the implementation of block Krylov based iterative methods on distributed computing environments for the solution of general linear systems of equations. First, we study the potential implementation of the classical conjugate gradient (CG) method on parallel environments. From the family of conjugate direction methods, we study the Block Conjugate Gradient (Block-CG) method, which is based on the classical CG method. The Block-CG method works on a block of s right-hand side vectors instead of a single one, as is the case for the classical CG, and we study the implementation of the Block-CG on distributed environments. The complexity of the Block-CG method is higher than the complexity of the classical CG in terms of computations and memory requirements. We show that the fact that an iteration of the Block-CG requires more computations than the classical CG makes the Block-CG more suitable for vector and parallel environments. Additionally, the increase in memory requirements is only a multiple of s, which generally is much smaller than the size of the linear system being solved. We present three models of distributed implementations of the Block-CG and discuss the advantages and disadvantages of each of these model implementations.

The classical CG and Block-CG are suitable for the solution of symmetric and positive definite systems of equations. Furthermore, both methods are guaranteed, in exact arithmetic, to find a solution to positive definite systems in a finite number of iterations. For nonsymmetric linear systems, we study block-row projection iterative methods for solving general linear systems of equations, and we are particularly interested in showing that the rate of convergence of some row projection methods can be accelerated with the Block-CG method. We review two block projection methods: the block Cimmino method and the block Kaczmarz method. Afterwards, we study the implementation of an iterative procedure based on the block Cimmino method, using the Block-CG method to accelerate its rate of convergence, on distributed computing environments. The complexity of the new iterative procedure is higher than that of the Block-CG method in terms of computations and memory requirements. In this case, the main advantage is the extension of the application of CG based methods to general linear systems of equations. We present a parallel implementation of the block Cimmino method with Block-CG acceleration that performs well on distributed computing environments.

This last parallel implementation opens a study of potential scheduling strategies for distributing tasks to a set of computing elements. We present a scheduler for heterogeneous computing environments which is currently implemented inside the block Cimmino based solver and can be reused inside parallel implementations of several other iterative methods. Lastly, we combine all of the above efforts for parallelizing iterative methods with preprocessing strategies to improve the numerical behavior of the block Cimmino based iterative solver.

Keywords: Cimmino method, conjugate gradient, distributed computing, heterogeneous computing, iterative methods, parallel computing, preprocessing of linear systems, scheduling, symmetric sparse linear systems, sparse matrices, unsymmetric sparse linear systems.
Acknowledgements

The author gratefully acknowledges the following people and organizations for their immeasurable support.

Iain S. Duff: For his professional support and research guidance.

Daniel Ruiz: For the time and effort he has dedicated to my thesis, and the genuine pleasure that it has been working with him.

Joseph Noailles: Who was responsible for my PhD studies.

Mario Arioli: For his remarkable contributions to my work.

Craig Douglas: For all his insights and valuable comments, proofreading my thesis during the process of preparation, and great advice: you are supposed to have fun while writing your thesis.

J. C. Diaz: For his constant support in my career.

Dominique Bennett: For all the help settling in Toulouse and at CERFACS, which was an indirect but substantial support for four years of work in France.

CERFACS: For providing the scientific resources and a pleasant working environment. To all my colleagues, and especially to Osni Marques, for helping me plot and find eigenvalues.

ENSEEIHT-IRIT: The APO team, for hosting me in their group and sharing with me their research experiences and great resources. Brigitte Sor and Frédéric Noailles, for their outstanding computing support.

MCS, Argonne National Laboratory: For letting me use their parallel computing facilities.

IAN-CNR, Pavia, Italy: For hosting me in the summer.

CNUSC: For the use of the SP Thin node.

My family: For always being with me. To Sandra Drummond, for her help and company.

My friends: For all that I have found, learned, and shared in our friendship.
Contents

Introduction
Block conjugate gradient
    Conjugate gradient algorithm
    A stable Block-CG algorithm
    Performance of Block-CG vs. classical CG
    Stopping criteria
    Sequential Block-CG experiments
        Solving the LANPRO NOS problem
        Solving again the LANPRO NOS problem
        Solving the LANPRO NOS problem
        Solving the SHERMAN problem
    Remarks
Parallel Block-CG implementation
    Partitioning and scheduling strategies
        Partitioning strategy
        Scheduling strategy
    Implementations of Block-CG
        All-to-All implementation
        Master-Slave distributed Block-CG implementation
        Master-Slave centralized Block-CG implementation
    Parallel Block-CG experiments
        Parallel solution of the SHERMAN problem
        Parallel solution of the LANPRO NOS problem
        Parallel solution of the LANPRO NOS problem
        Fixing the number of equivalent iterations
        Comparing different computing platforms
        Parallel solution of the POISSON problem
    Remarks
Solving general systems using Block-CG
    The block Cimmino method
    The block Kaczmarz method
    Block Cimmino accelerated by Block-CG
    Computer implementation issues
    Parallel implementation
A scheduler for heterogeneous environments
    Taxonomy of scheduling techniques
    A static scheduler for block iterative methods
        Heterogeneous environment specification
        A scheduler for a parallel block Cimmino
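The contents above center on the block Cimmino method and its Block-CG acceleration. As a point of reference, here is a minimal serial sketch of the unaccelerated block Cimmino iteration in Python/NumPy, under stated assumptions: the system is consistent, the rows are split into blocks of roughly equal size, and the relaxation parameter nu is user-chosen. The names block_cimmino, num_blocks and nu are illustrative only, not the thesis's actual solver interface.

```python
import numpy as np

def block_cimmino(A, b, num_blocks=4, nu=1.0, tol=1e-8, max_iter=1000):
    """Iterate x <- x + nu * sum_i A_i^+ (b_i - A_i x) over row blocks A_i."""
    m, n = A.shape
    x = np.zeros(n)
    blocks = np.array_split(np.arange(m), num_blocks)
    for _ in range(max_iter):
        update = np.zeros(n)
        for rows in blocks:
            # minimum-norm least-squares correction from one row block;
            # each of these solves is independent of the others, which is
            # the source of the method's parallelism
            r_i = b[rows] - A[rows] @ x
            update += np.linalg.lstsq(A[rows], r_i, rcond=None)[0]
        x += nu * update
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            break
    return x
```

The operator behind this update, the sum of the projections A_i^+ A_i, is symmetric and positive definite when A has full column rank, which is what makes acceleration by CG-type methods, and in the thesis by Block-CG, applicable.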
Recommended publications
  • Randomized Fast Solvers for Linear and Nonlinear Problems in Data Science
UC Irvine Electronic Theses and Dissertations. Title: Randomized Fast Solvers for Linear and Nonlinear Problems in Data Science. Author: Wu, Huiwen. Publication Date: 2019. Permalink: https://escholarship.org/uc/item/7c81w1kx. License: https://creativecommons.org/licenses/by/4.0/. Peer reviewed | Thesis/dissertation. eScholarship.org, powered by the California Digital Library, University of California.

UNIVERSITY OF CALIFORNIA, IRVINE. Randomized Fast Solvers for Linear and Nonlinear Problems in Data Science. DISSERTATION submitted in partial satisfaction of the requirements for the degree of DOCTOR OF PHILOSOPHY in Math, by Huiwen Wu. Dissertation Committee: Professor Long Chen, Chair; Professor Jack Xin; Chancellor's Professor Hongkai Zhao. © 2019 Huiwen Wu. Dedication: To my family.

Table of contents: List of Figures; List of Tables; Acknowledgments; Curriculum Vitae; Abstract of the Dissertation. 1 Introduction. 2 Linear and Nonlinear Problems in Data Science: 2.1 Least Squares Problem; 2.2 Linear Regression; 2.3 Logistic Regression (2.3.1 Introduction; 2.3.2 Likelihood Function for Logistic Regression; 2.3.3 More on Logistic Regression Objective Function); 2.4 Non-constraint Convex and Smooth Minimization Problem. 3 Existing Methods for Least Squares Problems: 3.1 Direct Methods (3.1.1 Gauss Elimination; 3.1.2 Cholesky Decomposition; 3.1.3 QR Decomposition; 3.1.4 SVD Decomposition); 3.2 Iterative Methods (3.2.1 Residual Correction Method; 3.2.2 Conjugate Gradient Method; 3.2.3 Kaczmarz Method).
  • Overview of Iterative Linear System Solver Packages
Overview of Iterative Linear System Solver Packages. Victor Eijkhout. July, 1998. Abstract: Description and comparison of several packages for the iterative solution of linear systems of equations.

1 Introduction. There are several freely available packages for the iterative solution of linear systems of equations, typically derived from partial differential equation problems. In this report I will give a brief description of a number of packages, and give an inventory of their features and defining characteristics. The most important features of the packages are which iterative methods and preconditioners they supply; the most relevant defining characteristics are the interface they present to the user's data structures, and their implementation language.

2 Discussion. Iterative methods are subject to several design decisions that affect ease of use of the software and the resulting performance. In this section I will give a global discussion of the issues involved, and how certain points are addressed in the packages under review.

2.1 Preconditioners. A good preconditioner is necessary for the convergence of iterative methods as the problem to be solved becomes more difficult. Good preconditioners are hard to design, and this especially holds true in the case of parallel processing. Here is a short inventory of the various kinds of preconditioners found in the packages reviewed.

2.1.1 About incomplete factorisation preconditioners. Incomplete factorisations are among the most successful preconditioners developed for single-processor computers. Unfortunately, since they are implicit in nature, they cannot immediately be used on parallel architectures.
  • Comparison of Two Stationary Iterative Methods
Comparison of Two Stationary Iterative Methods. Oleksandra Osadcha, Faculty of Applied Mathematics, Silesian University of Technology, Gliwice, Poland. Email: [email protected].

Abstract—This paper illustrates a comparison of two stationary iterative methods for solving systems of linear equations. The aim of the research is to analyze which method solves these equations faster and how many iterations each method requires. The paper presents some ways to deal with this problem.

Index Terms—comparison, analysis, stationary methods, Jacobi method, Gauss-Seidel method.

I. INTRODUCTION. Some convergence results for the block Gauss-Seidel method, for problems where the feasible set is the Cartesian product of m closed convex sets, under the assumption that the sequence generated by the method has limit points, are presented in [5]. In [9], an improvement of the modified Gauss-Seidel method for Z-matrices is presented. In [11], the convergence of the Gauss-Seidel iterative method is shown, because the matrix of the linear system of equations is positive definite; this does not afford a conclusion on the convergence of the Jacobi method, but it is also proved that the Jacobi method converges. These methods are also presented in [12], where each method is described. This publication presents a comparison of the Jacobi and Gauss-Seidel methods, which are used for solving systems of linear equations; we analyze how many iterations each method requires and which method is faster. The term "iterative method" refers to a wide range of techniques that use successive approximations to obtain more accurate solutions to a linear system at each step. In this publication we will cover two types of iterative methods. Stationary methods are older, simpler to understand and implement, but
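As a concrete reference for the two schemes under comparison, below is a minimal sketch in Python/NumPy (the paper itself is not tied to any language). It assumes a square system with nonzero diagonal and floating-point arrays, and returns the iteration count alongside the solution, mirroring the paper's question of how many iterations each method needs.

```python
import numpy as np

def jacobi(A, b, tol=1e-8, max_iter=1000):
    """Jacobi sweep: every component is updated from the previous iterate."""
    x = np.zeros_like(b, dtype=float)
    d = np.diag(A)
    for k in range(max_iter):
        x = x + (b - A @ x) / d            # all components updated at once
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            return x, k + 1
    return x, max_iter

def gauss_seidel(A, b, tol=1e-8, max_iter=1000):
    """Gauss-Seidel sweep: each component uses freshly updated values."""
    x = np.zeros_like(b, dtype=float)
    n = len(b)
    for k in range(max_iter):
        for i in range(n):
            x[i] = (b[i] - A[i, :i] @ x[:i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            return x, k + 1
    return x, max_iter
```

On systems where both converge, for example strictly diagonally dominant ones, Gauss-Seidel typically needs fewer sweeps because each update already uses the newest components.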
  • A Novel Greedy Kaczmarz Method for Solving Consistent Linear Systems
A NOVEL GREEDY KACZMARZ METHOD FOR SOLVING CONSISTENT LINEAR SYSTEMS. HANYU LI AND YANJUN ZHANG.

Abstract. With a quite different way to determine the working rows, we propose a novel greedy Kaczmarz method for solving consistent linear systems. Convergence analysis of the new method is provided. Numerical experiments show that, for the same accuracy, our method outperforms the greedy randomized Kaczmarz method and the relaxed greedy randomized Kaczmarz method introduced recently by Bai and Wu [Z.Z. Bai and W.T. Wu, On greedy randomized Kaczmarz method for solving large sparse linear systems, SIAM J. Sci. Comput., 40 (2018), pp. A592–A606; Z.Z. Bai and W.T. Wu, On relaxed greedy randomized Kaczmarz methods for solving large sparse linear systems, Appl. Math. Lett., 83 (2018), pp. 21–26] in terms of computing time.

Key words. greedy Kaczmarz method, greedy randomized Kaczmarz method, greedy strategy, iterative method, consistent linear systems.

AMS subject classifications. 65F10, 65F20, 65K05, 90C25, 15A06.

1. Introduction. We consider the following consistent linear systems

(1.1) Ax = b,

where A ∈ C^{m×n}, b ∈ C^m, and x is the n-dimensional unknown vector. As we know, the Kaczmarz method [10] is a popular so-called row-action method for solving the systems (1.1). In 2009, Strohmer and Vershynin [18] proved the linear convergence of the randomized Kaczmarz (RK) method. Following that, Needell [13] found that the RK method does not converge to the ordinary least squares solution when the system is inconsistent. To overcome this, Zouzias and Freris [20] extended the RK method to the randomized extended Kaczmarz (REK) method.
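For illustration only, here is a Python/NumPy sketch of the simplest greedy rule in this family: always project onto the row with the largest scaled residual. This is not the authors' new method; the GRK method of Bai and Wu instead samples randomly from a set of rows whose residuals pass a threshold built from the same quantities.

```python
import numpy as np

def greedy_kaczmarz(A, b, tol=1e-10, max_iter=10000):
    """Max-residual greedy Kaczmarz sketch for a consistent system Ax = b."""
    x = np.zeros(A.shape[1])
    row_norms2 = np.einsum('ij,ij->i', A, A)   # ||a_i||^2 for every row
    for _ in range(max_iter):
        r = b - A @ x
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        i = np.argmax(r**2 / row_norms2)       # greedy choice of working row
        x += (r[i] / row_norms2[i]) * A[i]     # project onto {x : a_i x = b_i}
    return x
```

Note that recomputing the full residual b - A x at every step is exactly the per-iteration overhead the abstract points to for large matrices.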
  • Parallelization and Implementation of Methods for Image Reconstruction Applications in GNSS Water Vapor Tomography
UNIVERSIDADE DA BEIRA INTERIOR, Faculdade Engenharia. Parallelization and Implementation of Methods for Image Reconstruction: Applications in GNSS Water Vapor Tomography. Fábio André de Oliveira Bento. Submitted to the University of Beira Interior in candidature for the Degree of Master of Science in Computer Science and Engineering. Supervisor: Prof. Dr. Paul Andrew Crocker. Co-supervisor: Prof. Dr. Rui Manuel da Silva Fernandes. Covilhã, October 2013.

Acknowledgements. I would like to thank my supervisor Prof. Paul Crocker for all his support, enthusiasm, patience, time and effort. I am grateful for all his guidance in all of the research and writing of this dissertation. It was very important to me to work with someone with experience and knowledge who believed in my work. I also would like to thank my co-supervisor Prof. Rui Fernandes for all his support in the research and writing of this dissertation. His experience and knowledge were really helpful, especially in the GNSS area and the organization of the dissertation. I am really grateful for all his support and availability. The third person I would like to thank is André Sá. He was always available to help me with all my GNSS doubts; his support was really important for the progress of this dissertation. I am really thankful for his time, enthusiasm and patience. I am also very grateful to Prof. Cédric Champollion for all his support and for providing the ESCOMPTE data, to David Adams for his support and for providing the Chuva Project data, and to Michael Bender for his support and for providing German and European GNSS data.
  • Chebyshev and Fourier Spectral Methods 2000
Chebyshev and Fourier Spectral Methods, Second Edition. John P. Boyd, University of Michigan, Ann Arbor, Michigan 48109-2143. Email: [email protected], http://www-personal.engin.umich.edu/jpboyd/. 2000. DOVER Publications, Inc., 31 East 2nd Street, Mineola, New York 11501.

Dedication: To Marilyn, Ian, and Emma. "A computation is a temptation that should be resisted as long as possible." — J. P. Boyd, paraphrasing T. S. Eliot.

Contents: Preface; Acknowledgments; Errata and Extended-Bibliography. 1 Introduction: 1.1 Series expansions; 1.2 First Example; 1.3 Comparison with finite element methods; 1.4 Comparisons with Finite Differences; 1.5 Parallel Computers; 1.6 Choice of basis functions; 1.7 Boundary conditions; 1.8 Non-Interpolating and Pseudospectral; 1.9 Nonlinearity; 1.10 Time-dependent problems; 1.11 FAQ: Frequently Asked Questions; 1.12 The Chrysalis. 2 Chebyshev & Fourier Series: 2.1 Introduction; 2.2 Fourier series; 2.3 Orders of Convergence; 2.4 Convergence Order; 2.5 Assumption of Equal Errors.
  • Iterative Solution Methods and Their Rate of Convergence
Uppsala University, Graduate School in Mathematics and Computing, Institute for Information Technology. Numerical Linear Algebra, FMB and MN1, Fall 2007. Mandatory Assignment 3a: Iterative solution methods and their rate of convergence.

Exercise 1 (Implementation tasks). Implement in Matlab the following iterative schemes: M1: the (pointwise) Jacobi method; M2: the second order Chebyshev iteration, described in Appendix A; M3: the unpreconditioned conjugate gradient (CG) method (see the lecture notes); M4: the Jacobi (diagonally) preconditioned CG method (see the lecture notes). For methods M3 and M4, you have to write your own Matlab code. For the numerical experiments you should use your implementation and not the available Matlab function pcg. The latter can be used to check the correctness of your implementation.

Exercise 2 (Generation of test data). Generate four test matrices which have different properties: A: A=matgen_disco(s,1); B: B=matgen_disco(s,0.001); C: C=matgen_anisot(s,1,1); D: D=matgen_anisot(s,1,0.001). Here s determines the size of the problem, namely, the matrix size is n = 2s. Check the sparsity of the above matrices (spy). Compute the complete spectra of A, B, C, D (say, for s = 3) by using the Matlab function eig. Plot those and compare. Compute the condition numbers of these matrices. You are advised to repeat the experiment for a larger s to see how the condition number grows with the matrix size n.

Exercise 3 (Numerical experiments and method comparisons). Test the four methods M1, M2, M3, M4 on the matrices A, B, C, D.
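The assignment asks for Matlab; purely as a language-neutral illustration of method M4, here is a minimal Jacobi-preconditioned CG sketch in Python/NumPy (dense A for brevity; the function and variable names are ours, and Matlab's pcg remains the correctness check the assignment suggests).

```python
import numpy as np

def jacobi_pcg(A, b, tol=1e-8, max_iter=1000):
    """CG preconditioned with M = diag(A), i.e. method M4 in the assignment."""
    x = np.zeros_like(b, dtype=float)
    d = np.diag(A)                  # the Jacobi (diagonal) preconditioner
    r = b - A @ x
    z = r / d                       # apply M^{-1} to the residual
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = r / d
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

Dropping the two z = r / d lines (taking z = r) recovers the unpreconditioned CG method M3, which makes the pair convenient for the convergence-rate comparison in Exercise 3.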
  • Randomised Algorithms for Solving Systems of Linear Equations
Randomised algorithms for solving systems of linear equations. Per-Gunnar Martinsson, Dept. of Mathematics & Oden Inst. for Computational Sciences and Engineering, University of Texas at Austin. Students, postdocs, collaborators: Tracy Babb, Ke Chen, Robert van de Geijn, Abinand Gopal, Nathan Halko, Nathan Heavner, James Levitt, Yijun Liu, Gregorio Quintana-Ortí, Joel Tropp, Bowei Wu, Sergey Voronin, Anna Yesypenko, Patrick Young. Slides: http://users.oden.utexas.edu/∼pgm/main_talks.html

Scope: Let A be a given m × n matrix (real or complex), and let b be a given vector. The talk is about techniques for solving Ax = b. Environments considered (time permitting): A is square, nonsingular, and stored in RAM; A is square, nonsingular, and stored out of core (very large matrix); A is rectangular, with m ≫ n (very overdetermined system); A is a graph Laplacian matrix (very large, and sparse); A is an n × n "kernel matrix", in the sense that given some set of points {x_i}_{i=1}^n ⊂ R^d, the ij entry of the matrix can be written as k(x_i, x_j) for some kernel function k. Scientific computing: high accuracy required. Data analysis: d is large (d = 4, or 10, or 100, or 1 000, ...).

Techniques: The recurring theme is randomisation. Randomized sampling: typically used to build preconditioners. Randomized embeddings: reduce the effective dimension of intermediate steps.

Prelude: We introduce the ideas around randomized embeddings by reviewing randomized techniques for low rank approximation. (Recap of Tuesday talk.) Randomised SVD. Objective: Given an m × n matrix A of approximate rank k, compute a factorisation A ≈ UDV* (U of size m × k, D of size k × k, V* of size k × n), where U and V are orthonormal, and D is diagonal.
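A minimal sketch of the randomised SVD just outlined, in Python/NumPy, assuming A has approximate rank k. The oversampling parameter and function name are illustrative; the talk's own algorithms may differ in detail, for example by adding power iterations for slowly decaying spectra.

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Return U, D, Vt with A ~= U @ diag(D) @ Vt and U, Vt orthonormal."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k + oversample))   # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)                     # orthonormal basis for the range of A
    B = Q.T @ A                                        # small (k+p)-by-n projected matrix
    U_small, D, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small                                    # lift back to m dimensions
    return U[:, :k], D[:k], Vt[:k]
```

The cost is dominated by the two products with A, which is why this rangefinder pattern scales to matrices far too large for a dense SVD.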
  • A Kaczmarz Method with Simple Random Sampling for Solving Large Linear Systems
A KACZMARZ METHOD WITH SIMPLE RANDOM SAMPLING FOR SOLVING LARGE LINEAR SYSTEMS. YUTONG JIANG, GANG WU, AND LONG JIANG.

Abstract. The Kaczmarz method is a popular iterative scheme for solving large, consistent systems of overdetermined linear equations. This method has been widely used in many areas such as reconstruction of CT scanned images, computed tomography and signal processing. In the Kaczmarz method, one cycles through the rows of the linear system, and each iteration is formed by projecting the current point onto the hyperplane formed by the active row. However, the Kaczmarz method may converge very slowly in practice. The randomized Kaczmarz method (RK) greatly improves the convergence rate of the Kaczmarz method by using the rows of the coefficient matrix in random order rather than in their given order. An obvious disadvantage of the randomized Kaczmarz method is its probability criterion for selecting the active or working rows of the coefficient matrix. In [Z.Z. Bai and W.T. Wu, On greedy randomized Kaczmarz method for solving large sparse linear systems, SIAM Journal on Scientific Computing, 2018, 40: A592–A606], the authors proposed a new probability criterion that can capture larger entries of the residual vector of the linear system as far as possible at each iteration step, and presented a greedy randomized Kaczmarz method (GRK). However, the greedy Kaczmarz method may suffer from heavy computational cost when the size of the matrix is large, and the overhead will be prohibitively large for big data problems. The contribution of this work is as follows. First, from the probability significance point of view, we present a partially randomized Kaczmarz method, which can reduce the computational overhead needed in the greedy randomized Kaczmarz method.
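To fix ideas, here is a minimal Python/NumPy sketch of the RK step described above, with rows drawn with probability proportional to ||a_i||^2 as in Strohmer and Vershynin. It is not the partially randomized variant proposed in this paper, and the periodic convergence check is an arbitrary choice of ours.

```python
import numpy as np

def randomized_kaczmarz(A, b, tol=1e-10, max_iter=100000, seed=0):
    """Each step projects x onto a randomly chosen hyperplane {x : a_i x = b_i}."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    row_norms2 = np.einsum('ij,ij->i', A, A)   # ||a_i||^2 for each row
    probs = row_norms2 / row_norms2.sum()      # row-sampling distribution
    for k in range(max_iter):
        i = rng.choice(m, p=probs)
        x += ((b[i] - A[i] @ x) / row_norms2[i]) * A[i]
        if k % 100 == 0 and np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            break
    return x
```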
  • Iterative Methods for Linear Systems of Equations: a Brief Historical Journey
Iterative methods for linear systems of equations: A brief historical journey. Yousef Saad.

Abstract. This paper presents a brief historical survey of iterative methods for solving linear systems of equations. The journey begins with Gauss, who developed the first known method that can be termed iterative. The early 20th century saw good progress of these methods, which were initially used to solve least-squares systems, and then linear systems arising from the discretization of partial differential equations. Then iterative methods received a big impetus in the 1950s, partly because of the development of computers. The survey does not attempt to be exhaustive. Rather, the aim is to bring out the way of thinking at specific periods of time and to highlight the major ideas that steered the field.

1. It all started with Gauss. A big part of the contributions of Carl Friedrich Gauss can be found in the voluminous exchanges he had with contemporary scientists. This correspondence has been preserved in a number of books, e.g., in the twelve "Werke" volumes gathered from 1863 to 1929 at the University of Göttingen. There are also books specialized on specific correspondences. One of these is dedicated to the exchanges he had with Christian Ludwig Gerling [67]. Gerling was a student of Gauss, under whose supervision he earned a PhD from the University of Göttingen in 1812. Gerling later became professor of mathematics, physics, and astronomy at the University of Marburg, where he spent the rest of his life from 1817 and maintained a relation with Gauss until Gauss's death in 1855.
  • Linear Algebra Libraries
Linear Algebra Libraries. Claire Mouton, [email protected]. March 2009. arXiv:1103.3020v1 [cs.MS] 15 Mar 2011.

Contents: I Requirements; II CPPLapack; III Eigen; IV Flens; V Gmm++; VI GNU Scientific Library (GSL); VII IT++; VIII Lapack++; IX Matrix Template Library (MTL); X PETSc; XI Seldon; XII SparseLib++; XIII Template Numerical Toolkit (TNT); XIV Trilinos; XV uBlas; XVI Other Libraries; XVII Links and Benchmarks (1 Links; 2 Benchmarks: 2.1 Benchmarks for Linear Algebra Libraries; 2.2 Benchmarks including Seldon: 2.2.1 Benchmarks for Dense Matrix, 2.2.2 Benchmarks for Sparse Matrix); XVIII Appendix (3 Flens Overloaded Operator Performance Compared to Seldon; 4 Flens, Seldon and Trilinos Content Comparisons: 4.1 Available Matrix Types from Blas (Flens and Seldon), 4.2 Available Interfaces to Blas and Lapack Routines (Flens and Seldon), 4.3 Available Interfaces to Blas and Lapack Routines (Trilinos); 5 Flens and Seldon Synoptic Comparison).

Part I. Requirements. This document has been written to help in the choice of a linear algebra library to be included in Verdandi, a scientific library for data assimilation. The main requirements are: 1. Portability: Verdandi should compile on BSD systems, Linux, MacOS, Unix and Windows. Beyond the portability itself, this often ensures that most compilers will accept Verdandi. An obvious consequence is that all dependencies of Verdandi must be portable, especially the linear algebra library. 2. High-level interface: the dependencies should be compatible with the building of the high-level interface (e.
  • Implementing a Preconditioned Iterative Linear Solver Using Massively Parallel Graphics Processing Units
Implementing a Preconditioned Iterative Linear Solver Using Massively Parallel Graphics Processing Units, by Amirhassan Asgari Kamiabad. A thesis submitted in conformity with the requirements for the degree of Master of Applied Science, Graduate Department of Electrical and Computer Engineering, University of Toronto. Copyright © 2011 by Amirhassan Asgari Kamiabad.

Abstract. The research conducted in this thesis provides a robust implementation of a preconditioned iterative linear solver on programmable graphics processing units (GPUs), appropriately designed to be used in electric power system analysis. Solving a large, sparse linear system is the most computationally demanding part of many widely used power system analyses, such as power flow and transient stability. This thesis presents a detailed study of iterative linear solvers with a focus on Krylov-based methods. Since the ill-conditioned nature of power system matrices typically requires substantial preconditioning to ensure robustness of Krylov-based methods, a polynomial preconditioning technique is also studied in this thesis. Programmable GPUs currently offer the best ratio of floating point computational throughput to price for commodity processors, outdistancing same-generation CPUs by an order of magnitude. This has led to the widespread