Managing HPC Software Complexity With

Managing HPC Software Complexity With

Managing HPC Software Complexity with Spack Supercomputing 2019 Full-day Tutorial November 18, 2018 The most recent version of these slides can be found at: https://spack-tutorial.readthedocs.io Dallas, Texas LLNL-PRES-806064 This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344. spack.io Lawrence Livermore National Security, LLC Tutorial Materials Download the latest version of slides and handouts at: spack-tutorial.readthedocs.io For more: § Spack website: spack.io § Spack GitHub repository: github.com/spack/spack § Spack Reference Documentation: spack.readthedocs.io 2 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io Tutorial Presenters Todd Gamblin, Greg Becker, Peter Scheibel Mario Melara Adam Stewart Massimiliano LLNL NERSC UIUC Culpo Sylabs, Inc. 3 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io Software complexity in HPC is growing glm suite-sparse yaml-cpp metis cmake ncurses parmetis pkgconf nalu hwloc libxml2 xz bzip2 openssl boost trilinos superlu openblas netlib-scalapack mumps openmpi zlib netcdf hdf5 matio m4 libsigsegv parallel-netcdf Nalu: Generalized Unstructured Massively Parallel Low Mach Flow 4 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io Software complexity in HPC is growing adol-c automake autoconf perl glm suite-sparse yaml-cpp metis cmake ncurses gmp libtool parmetis pkgconf m4 libsigsegv nalu hwloc libxml2 xz bzip2 openssl p4est pkgconf boost hwloc libxml2 trilinos superlu openblas xz netlib-scalapack netcdf-cxx netcdf mumps openmpi zlib netcdf hdf5 nanoflann matio m4 libsigsegv parallel-netcdf matio hdf5 Nalu: Generalized Unstructured Massively Parallel Low Mach Flow sundials glm netlib-scalapack mumps openmpi trilinos hypre openblas gdbm superlu-dist parmetis sqlite readline zlib petsc python dealii slepc ncurses arpack-ng openssl metis cmake gmsh oce suite-sparse tetgen intel-tbb netgen bzip2 assimp boost gsl muparser dealii: C++ Finite Element Library 5 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io Software complexity in HPC is growing r-randomforest glm suite-sparse yaml-cpp r-labeling metis cmake ncurses parmetis pkgconf nalu r-lazyeval r-munsell hwloc libxml2 xz r-colorspace bzip2 openssl r-rcolorbrewer boost r-viridislite trilinos superlu openblas r-dichromat netlib-scalapack r-scales mumps r-tibble r-assertthat openmpi zlib netcdf hdf5 r-rlang r-gtable matio m4 libsigsegv parallel-netcdf r-ggplot2 r-plyr r-reshape2 r-rcpp r-modelmetrics Nalu: Generalized Unstructured Massively Parallel Low Mach Flow freetype bzip2 gperf r-r6 r-minqa fontconfig r-caret r-nloptr r-digest cairo libxml2 r-rcppeigen pixman r-lme4 harfbuzz libpng r-praise r-testthat pango adol-c automake sed xz gobject-introspection autoconf perl glib gettext r-mgcv r-crayon tar gmp libtool m4 libsigsegv libffi r-nlme p4est pkgconf libtiff hwloc libxml2 r-pbkrtest help2man xz r-mass zlib r-car r-foreach r-iterators flex netcdf-cxx netcdf r-sparsem curl r python ncurses nanoflann pkgconf openssl readline matio r-quantreg r-matrixmodels pcre hdf5 jdk libjpeg-turbo r-matrix cmake sundials r-lattice icu4c sqlite font-util r-adabag r-glmnet glm tk tcl nasm libxau gdbm xproto netlib-scalapack r-cubist r-mlbench libxdmcp openmpi xextproto mumps r-magrittr perl libpthread-stubs trilinos xtrans r-data-table libxcb util-macros hypre r-stringr xcb-proto r-xgboost libx11 inputproto r-rminer r-e1071 r-class r-codetools openblas kbproto gdbm bison r-kknn r-igraph r-irlba r-stringi r-rpart automake superlu-dist parmetis sqlite autoconf readline r-nnet zlib r-pkgconfig petsc gmp libtool m4 libsigsegv python dealii slepc ncurses arpack-ng openssl r-mda cmake metis r-kernlab gmsh oce suite-sparse r-strucchange r-zoo tetgen r-sandwich intel-tbb r-party netgen r-survival bzip2 r-multcomp r-th-data assimp r-plotrix boost r-coin r-mvtnorm r-modeltools gsl r-pls muparser dealii: C++ Finite Element Library R Miner: R Data Mining Library 6 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io What is the “production” environment for HPC? § Someone’s home directory? § LLNL? LANL? Sandia? ANL? LBL? TACC? — Environments at large-scale sites are very different. § Which MPI implementation? § Which compiler? § Which dependencies? § Which versions of dependencies? — Many applications require specific dependency versions. Real answer: there isn’t a single production environment or a standard way to build. Reusing someone else’s software is HARD. 7 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io The complexity of the exascale ecosystem threatens productivity. 5+ target architectures/platforms 15+ applications X 80+ software packages X Xeon Power KNL NVIDIA ARM Laptops? Up to 7 compilers 10+ Programming Models 2-3 versions of each package X Intel GCC Clang XL X OpenMPI MPICH MVAPICH OpenMP CUDA X PGI Cray NAG OpenACC Dharma Legion RAJA Kokkos + external dependencies = up to 1,260,000 combinations! § Every application has its own stack of dependencies. § Developers, users, and facilities dedicate (many) FTEs to building & porting. § Often trade reuse and usability for performance. We must make it easier to rely on others’ software! 8 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io What about containers? § Containers provide a great way to reproduce and distribute an already-built software stack § Someone needs to build the container! — This isn’t trivial — Containerized applications still have hundreds of dependencies § Using the OS package manager inside a container is insufficient — Most binaries are built unoptimized — Generic binaries, not optimized for specific architectures § HPC containers may need to be rebuilt to support many different hosts, anyway. — Not clear that we can ever build one container for all facilities — Containers likely won’t solve the N-platforms problem in HPC We need something more flexible to build the containers 9 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io Spack is a flexible package manager for HPC § How to install Spack: $ git clone https://github.com/spack/spack $ . spack/share/spack/setup-env.sh § How to install a package: $ spack install hdf5 § HDF5 and its dependencies are installed within the Spack directory. § Unlike typical package managers, Spack can also install github.com/spack/spack many variants of the same build. — Different compilers — Different MPI implementations @spackpm — Different build options 10 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io Who can use Spack? People who want to use or distribute software for HPC! 1. End Users of HPC Software — Install and run HPC applications and tools 2. HPC Application Teams — Manage third-party dependency libraries 3. Package Developers — People who want to package their own software for distribution 4. User support teams at HPC Centers — People who deploy software for users at large HPC sites 11 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io Spack is used worldwide! Over 3,500 software packages Over 2,300 monthly active users (on docs site) Over 450 contributors from labs, academia, industry Plot shows sessions on spack.readthedocs.io for one month 12 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io Spack has been gaining adoption rapidly (if stars are an indicator) Stars over time Stars per day (same data, 60-day window) 13 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io Users on our documentation site have also been increasing 14 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io Spack is being used on many of the top HPC systems § Official deployment tool for the U.S. Exascale Computing Project § 7 of the top 10 supercomputers § High Energy Physics community — Fermilab, CERN, collaborators § Astra (Sandia) Fugaku coming to RIKEN in 2021 DOE/MEXT collaboration § Fugaku (Japanese National Supercomputer Project) Summit (ORNL), Sierra (LLNL) SuperMUC-NG (LRZ, Germany) Edison, Cori, Perlmutter (NERSC) 15 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io Contributions to Spack continue to grow! § In November 2015, LLNL provided most of the contributions to Spack § Since then, we’ve gone from 300 to over 3,500 packages § Most packages are from external contributors! § Many contributions in core, as well. § We are committed to sustaining Spack’s open source ecosystem! 16 LLNL-PRES-806064 Follow along at spack-tutorial.readthedocs.io spack.io Spack v0.13.1 is the latest release § Major new features: 1. Chaining: use dependencies from external "upstream" Spack instances 2. Views for Spack environments (covered today) 3. Spack detects and builds specifically for your microarchitecture (not shown in tutorial) • named, understandable targets like skylake, broadwell, power9, zen2 4. Spack stacks: combinatorial environments for facility deployment (covered today) 5. Projections: ability to build easily navigable symlink trees environments (covered today) 6. Support no-source packages (BundlePackage) to aggregate related packages 7. Extensions: users can write custom commands that live outside of Spack repo 8. ARM + Fujitsu compiler support 9. GitLab Build Pipelines: Spack can generate a pipeline from a stack (covered in slides) § Over 3,500 packages (~700 added since last year) § Full release notes: https://github.com/spack/spack/releases/tag/v0.13.0 17 LLNL-PRES-806064

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    62 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us