HPC Libraries

High Performance Computing: Concepts, Methods & Means
HPC Libraries
Hartmut Kaiser, PhD
Center for Computation & Technology, Louisiana State University
April 19th, 2007

Outline
• Introduction to High Performance Libraries
• Linear Algebra Libraries (BLAS, LAPACK)
• PDE Solvers (PETSc)
• Mesh manipulation and load balancing (METIS/ParMETIS, JOSTLE)
• Special purpose libraries (FFTW)
• General purpose libraries (C++: Boost)
• Summary – Materials for test

Puzzle of the Day

    #include <stdio.h>
    int main()
    {
        int a = 10;
        switch (a)
        {
        case '1': printf("ONE\n"); break;
        case '2': printf("TWO\n"); break;
        defa1ut: printf("NONE\n");
        }
        return 0;
    }

If you expect the output of the above program to be NONE, I would request you to check it out!

Application domains
• Linear algebra
  – BLAS, ATLAS, LAPACK, ScaLAPACK, Slatec, pim
• Ordinary and partial differential equations
  – PETSc
• Mesh manipulation and load balancing
  – METIS, ParMETIS, CHACO, JOSTLE, PARTY
• Graph manipulation
  – Boost.Graph library
• Vector/Signal/Image processing
  – VSIPL, PSSL
• General parallelization
  – MPI, pthreads
• Other domain specific libraries
  – NAMD, NWChem, Fluent, Gaussian, LS-DYNA

Application Domain Overview
• Linear Algebra Libraries
  – Provide optimized methods for constructing sets of linear equations, performing operations on them (matrix-matrix products, matrix-vector products) and solving them (factoring, forward and backward substitution).
  – Commonly used libraries include BLAS, ATLAS, LAPACK, ScaLAPACK, PLAPACK
• PDE Solvers
  – General-purpose, parallel numerical libraries for solving PDEs
  – Usual toolsets include manipulation of sparse data structures, iterative linear system solvers, preconditioners, nonlinear solvers and time-stepping methods.
  – Commonly used libraries for solving PDEs include SAMRAI, PETSc, PARASOL, Overture, among others.

Application Domain Overview
• Mesh manipulation and Load Balancing
  – These libraries partition meshes into roughly equal parts across processors, balancing the workload while minimizing the size of separators and the communication costs.
  – Commonly used libraries for this purpose include METIS, ParMETIS, Chaco, JOSTLE, among others.
• Other packages
  – FFTW: highly optimized Fourier transform package with both real and complex multidimensional transforms in sequential, multithreaded, and parallel versions.
  – NAMD: molecular dynamics library available for Unix/Linux, Windows, OS X
  – Fluent: computational fluid dynamics package, used for such applications as environment control systems, propulsion, reactor modeling etc.

BLAS
• (Updated set of) Basic Linear Algebra Subprograms
• The BLAS functionality is divided into three levels:
  – Level 1: contains vector operations of the form $y \leftarrow \alpha x + y$, as well as scalar dot products and vector norms
  – Level 2: contains matrix-vector operations of the form $y \leftarrow \alpha A x + \beta y$, as well as solving $Tx = y$ for $x$ with $T$ being triangular
  – Level 3: contains matrix-matrix operations of the form $C \leftarrow \alpha A B + \beta C$, as well as solving $B \leftarrow \alpha T^{-1} B$ for triangular matrices $T$. This level contains the widely used General Matrix Multiply (GEMM) operation.
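
The three BLAS levels can be illustrated with plain reference loops. The sketch below is not the BLAS library and makes no attempt at optimization; the function names (ref_axpy, ref_dot, ref_gemv, ref_gemm) are hypothetical, and row-major storage of double-precision data is an assumption made for brevity.

```c
#include <stddef.h>

/* Level 1 (AXPY-style): y <- alpha*x + y for vectors of length n. */
void ref_axpy(size_t n, double alpha, const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}

/* Level 1: scalar dot product of two vectors of length n. */
double ref_dot(size_t n, const double *x, const double *y)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}

/* Level 2 (GEMV-style): y <- alpha*A*x + beta*y, A is m x n, row-major. */
void ref_gemv(size_t m, size_t n, double alpha, const double *A,
              const double *x, double beta, double *y)
{
    for (size_t i = 0; i < m; ++i) {
        double sum = 0.0;
        for (size_t j = 0; j < n; ++j)
            sum += A[i * n + j] * x[j];
        y[i] = alpha * sum + beta * y[i];
    }
}

/* Level 3 (GEMM-style): C <- alpha*A*B + beta*C,
 * A is m x k, B is k x n, C is m x n, all row-major. */
void ref_gemm(size_t m, size_t n, size_t k, double alpha, const double *A,
              const double *B, double beta, double *C)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (size_t p = 0; p < k; ++p)
                sum += A[i * k + p] * B[p * n + j];
            C[i * n + j] = alpha * sum + beta * C[i * n + j];
        }
    }
}
```

The operation count grows from O(n) at Level 1 to O(n^2) at Level 2 and O(n^3) at Level 3, while the data touched grows only to O(n^2). That ratio is why tuned Level 3 routines can reuse data in cache far more effectively than these naive loops.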

BLAS
• Several implementations for different languages exist
  – Reference implementation (F77 and C): http://www.netlib.org/blas/
  – ATLAS, highly optimized for particular processor architectures
  – uBLAS, a generic C++ template class library providing BLAS functionality: http://www.boost.org
  – Several vendors provide libraries optimized for their architecture (AMD, HP, IBM, Intel, NEC, NVIDIA, Sun)

BLAS: F77 naming conventions

BLAS: C naming conventions
• The F77 routine name is changed to lowercase and prefixed with cblas_
• All routines which accept two-dimensional arrays have an additional first parameter specifying the matrix memory layout (row major or column major)
• Character parameters are replaced by corresponding enum values
• Input arguments are declared const
• Non-complex scalar input parameters are passed by value
• Complex scalar input arguments are passed using a void*
• Arrays are passed by address
• Output scalar arguments are passed by address
• Complex functions become subroutines which return the result via an additional last parameter (void*), appending _sub to the name

BLAS Level 1 routines
• Vector operations (xROT, xSWAP, xCOPY etc.)
• Scalar dot products (xDOT etc.)
• Vector norms (xNRM2, xASUM etc.) and index of the largest-magnitude element (IxAMAX)

BLAS Level 2 routines
• Matrix-vector operations (xGEMV, xGBMV, xHEMV, xHBMV etc.)
• Solving Tx = y for x, where T is triangular (xTRSV, xTBSV etc.)
• Rank-1 updates (xGER, xHER etc.)

BLAS Level 3 routines
• Matrix-matrix operations (xGEMM etc.)
• Solving for triangular matrices (xTRSM)
• Widely used matrix-matrix multiply (xSYMM, xGEMM)

Demo 1
• Shows solving a matrix multiplication problem using BLAS expressed in FORTRAN, C, and C++
• Shows genericity of uBLAS, by comparing generic and banded matrix versions
• Shows newmat, a C++ matrix library which uses operator overloading

LAPACK
• Linear Algebra PACKage
  – http://www.netlib.org/lapack/
  – Written in F77
  – Provides routines for
    • Solving systems of simultaneous linear equations,
    • Least-squares solutions of linear systems of equations,
    • Eigenvalue problems,
    • Householder transformation to implement QR decomposition on a matrix and
    • Singular value problems
  – Was initially designed to run efficiently on shared memory vector machines
  – Depends on BLAS
  – Has been extended for distributed-memory systems (ScaLAPACK and PLAPACK)

LAPACK (Architecture)

LAPACK naming conventions

Demo 2
• Shows how using a library might speed up the computation considerably

PETSc (pronounced PET-see)
• Portable, Extensible Toolkit for Scientific Computation (http://www-unix.mcs.anl.gov/petsc/petsc-as/)
  – Suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations (PDEs)
  – Employs the MPI standard for all message-passing communication
  – Intended for use in large-scale application projects
  – Includes a large suite of parallel linear and nonlinear equation solvers
  – Easily used in application codes written in C, C++, Fortran and Python
• Good introduction: http://www-unix.mcs.anl.gov/petsc/petsc-as/documentation/tutorials/nersc02/nersc02.ppt

PETSc (general features)
• Features include:
  – Parallel vectors
    • Scatters (handles communicating ghost point information)
    • Gathers
  – Parallel matrices
    • Several sparse storage formats
    • Easy, efficient assembly
  – Scalable parallel preconditioners
  – Krylov subspace methods
  – Parallel Newton-based nonlinear solvers
  – Parallel time stepping (ODE) solvers

PETSc (Architecture)
PETSc: Module architecture and layers of abstraction

PETSc: Component details
• Vector operations (Vec): Provides the vector operations required for setting up and solving large-scale linear and nonlinear problems. Includes easy-to-use parallel scatter and gather operations, as well as special-purpose code for handling ghost points for regular data structures.
• Matrix operations (Mat): A large suite of data structures and code for the manipulation of parallel sparse matrices. Includes four different parallel matrix data structures, each appropriate for a different class of problems.
• Preconditioners (PC): A collection of sequential and parallel preconditioners, including
  – (sequential) ILU(k) (incomplete factorization),
  – LU (lower/upper decomposition),
  – both sequential and parallel block Jacobi, overlapping additive Schwarz methods
• Time stepping ODE solvers (TS): Code for the time evolution of solutions of PDEs. In addition, provides pseudo-transient continuation techniques for computing steady-state solutions.
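
To make the idea of a sparse matrix data structure concrete, here is a minimal sequential sketch of one widely used storage format, compressed sparse row (CSR), together with a matrix-vector product. The type csr_matrix and the function csr_matvec are hypothetical names for illustration only; they are not PETSc types or calls, and the sketch ignores parallel distribution entirely.

```c
#include <stddef.h>

/* Hypothetical CSR container (illustration only, not a PETSc type).
 * Only the nonzero entries are stored, row by row. */
typedef struct {
    size_t nrows;
    const size_t *row_ptr;  /* nrows + 1 entries; row i occupies the
                               index range [row_ptr[i], row_ptr[i + 1]) */
    const size_t *col_idx;  /* column index of each stored value */
    const double *val;      /* the stored nonzero values */
} csr_matrix;

/* y <- A * x, visiting only the stored nonzeros of A. */
void csr_matvec(const csr_matrix *A, const double *x, double *y)
{
    for (size_t i = 0; i < A->nrows; ++i) {
        double sum = 0.0;
        for (size_t k = A->row_ptr[i]; k < A->row_ptr[i + 1]; ++k)
            sum += A->val[k] * x[A->col_idx[k]];
        y[i] = sum;
    }
}
```

The work is proportional to the number of stored nonzeros rather than to the full matrix dimension squared, which is what makes iterative solvers practical for the large sparse systems that arise from discretized PDEs.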

PETSc: Component details
• Krylov subspace solvers (KSP): Parallel implementations of many popular Krylov subspace iterative methods, including
  – GMRES (Generalized Minimal Residual method),
  – CG (Conjugate Gradient),
  – CGS (Conjugate Gradient Squared),
  – Bi-CG-Stab (BiConjugate Gradient Stabilized),
  – two variants of TFQMR (transpose free QMR),
  – CR (Conjugate Residuals),
  – LSQR (Least Squares QR).
  All are coded so that they are immediately usable with any preconditioners and any matrix data structures, including matrix-free methods.
• Non-linear solvers (SNES): Data-structure-neutral implementations of Newton-like methods for nonlinear systems. Includes both line search and trust region techniques with a single interface. Employs by default the above data structures and linear solvers.
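
Of the Krylov subspace methods listed under KSP, CG is the shortest to write out. The following is a minimal sequential sketch of unpreconditioned CG for a small dense symmetric positive definite matrix. It is not PETSc code: the names cg_solve, cg_matvec and cg_dot are hypothetical, and dense row-major storage is an assumption chosen to keep the example short.

```c
#include <stddef.h>

/* y <- A * x for a dense n x n matrix stored row-major. */
static void cg_matvec(size_t n, const double *A, const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i) {
        double sum = 0.0;
        for (size_t j = 0; j < n; ++j)
            sum += A[i * n + j] * x[j];
        y[i] = sum;
    }
}

static double cg_dot(size_t n, const double *x, const double *y)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}

/* Solve A x = b with unpreconditioned CG; A must be symmetric positive
 * definite. On entry x holds the initial guess, on exit the approximate
 * solution. r, p and Ap are caller-provided work arrays of length n.
 * Iterates until ||r||_2 <= tol or max_iter is reached and returns the
 * number of iterations performed. */
size_t cg_solve(size_t n, const double *A, const double *b, double *x,
                double *r, double *p, double *Ap,
                double tol, size_t max_iter)
{
    cg_matvec(n, A, x, Ap);
    for (size_t i = 0; i < n; ++i) {
        r[i] = b[i] - Ap[i];   /* initial residual */
        p[i] = r[i];           /* first search direction */
    }
    double rr = cg_dot(n, r, r);
    size_t it = 0;
    while (it < max_iter && rr > tol * tol) {
        cg_matvec(n, A, p, Ap);
        double alpha = rr / cg_dot(n, p, Ap);   /* step length */
        for (size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        double rr_new = cg_dot(n, r, r);
        double beta = rr_new / rr;              /* direction update weight */
        for (size_t i = 0; i < n; ++i)
            p[i] = r[i] + beta * p[i];
        rr = rr_new;
        ++it;
    }
    return it;
}
```

The matrix enters only through the matrix-vector product, so cg_matvec could be swapped for a sparse or matrix-free routine without touching the iteration. This is the property that lets such solvers work with any matrix data structure.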
