Install your scientific software stack easily with Spack
Les mardis du développement technologique
Florent Pruvost (SED)
Outline
1. Context
2. Features overview
3. In practice
4. Some feedback
1 Context
A scientific software stack
• modular, several languages, different build systems
[Figure: layers of the stack: Application; Linear Algebra; Optimized Kernels, Graph Processing, Runtime Systems; Paradigms, Miscellaneous]
• it is hard to be an expert in the whole chain
An example: Aerosol
• a finite element library developed by the Inria teams Cagire and Cardamom
[Figure: Aerosol dependency stack: Aerosol; PaMPA, PaStiX, HDF5, XML2; PT-SCOTCH, StarPU, BLAS/LAPACK; MPI, CUDA]
Constraints
• R&D ⇒ develop prototypes
  - test many different builds
• computing/data-intensive application ⇒ HPC environment
  - different machines, OSes, environments
  - remote connection, not administrator
• performance ⇒ highly tuned installation
  - well-chosen components
  - specific build options
• reproducibility
  - control the environment
  - characterize what influences the build
Wish list
• a simple process to install a default version
• a flexible way to choose build variants
  - choose compiler, software versions
  - enable components, e.g. MPI: yes/no
  - build options, e.g. --enable-debug
• be able to install it on a remote machine (supercomputer)
  - no root permissions
  - internet access not necessarily available
• be able to reproduce experiments
  - non-destructive installation
  - control the environment and third-party libraries
Traditional tools: binary package managers
• dpkg (APT), RPM, pacman, etc.
• designed to manage a single stack
• install one version of each package in a single prefix (/usr)
  - root permission required
• seamless upgrades to a stable, well-tested stack
Traditional tools: port systems
• BSD Ports, Portage (Gentoo), MacPorts, Homebrew, etc.
• minimal support for builds parameterized by compilers and dependency versions
Traditional tools: virtual machines and Linux containers
• Docker, etc.
• containers allow users to build environments for different applications
• they do not solve the build problem (someone has to build the image)
• performance, security, and upgrade issues prevent widespread HPC deployment
Tools designed for scientific applications
A short list to focus on: Nix/Guix, EasyBuild, Spack
• Common features:
  - build from sources
  - own directory structure
  - hash parametrized by versions, dependencies, etc.
  - no need to be root to install packages
  - useful packages available: MPI, BLAS/LAPACK, FFTW, etc.
• Differences:
  - robustness vs. flexibility
  - maturity
  - languages and technical details
Nix/Guix: functional languages (Guile)
• pros:
  - very precise dependency tracking
  - cryptographic hashes determine the exact build and run-time dependencies
  - safe upgrades
  - good for reproducibility
• cons:
  - administrator rights required to install Nix/Guix
  - limited to open-source software: no Intel, no CUDA
  - how to deal with combinatorial builds?
  - multi-compiler and version support?
  - virtual dependencies?
  - syntax for parametrization?
EasyBuild: Python
• pros:
  - designed for installation on HPC systems
  - support for proprietary software like Intel and CUDA
  - can reuse what is already installed, cf. dummy toolchain
• cons:
  - cannot deal with combinatorial builds
  - requires a file per configuration of a stack
  - limited command line interface
Spack: Python 2
• pros:
  - deals with combinatorial builds
  - a couple of packages = thousands of builds available
  - nice command line syntax to tune the stack parameters
  - can reuse what is already installed, cf. configuration files
  - no need to be root
  - no need to have internet access if tarballs are available locally
• cons:
  - Spack is currently alpha software (young project)
  - package parameters have changed a lot
  - new nice features = packages to re-write
  - not all parameters that affect a build are controlled internally (external compilers and libraries) = not all builds are safe
2 Features overview
Overview
• object-oriented Python 2
• Unix systems
  - Windows is not an OS for HPC
• ∼ 500 available packages
• open source, open community (GitHub, pull requests)
  - Feb 10, 2013 to today
  - mainly developed by T. Gamblin and friends from LLNL
  - 95 contributors, 154 forks
  - 1 release every 6 months
Handles combinatorial software complexity
• each unique dependency graph is a unique configuration
• each configuration is installed in a unique directory
• a hash of the DAG is appended to the prefix
• installed packages automatically find their dependencies
  - Spack embeds RPATHs in binaries (compiler wrappers)
  - no need to use modules or set LD_LIBRARY_PATH
  - things work the way you built them
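The idea of a unique directory per dependency graph can be illustrated with a small, self-contained Python sketch. It is not Spack's actual hashing code: the function names `dag_hash` and `install_prefix`, the dictionary layout, the version numbers and the directory scheme are all assumptions made for illustration.

```python
import hashlib
import json


def dag_hash(name, dag, length=8):
    """Return a short hash covering a package and its whole dependency sub-DAG."""
    node = dag[name]
    payload = {
        "name": name,
        "version": node["version"],
        "compiler": node["compiler"],
        "variants": sorted(node.get("variants", [])),
        # Recursing into the dependencies makes any change below this node
        # (version, compiler, variant) change the hash of this node too.
        "deps": sorted(dag_hash(dep, dag, length) for dep in node.get("deps", [])),
    }
    text = json.dumps(payload, sort_keys=True)
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:length]


def install_prefix(root, name, dag):
    """Build a unique install directory: <root>/<compiler>/<name>-<version>-<hash>."""
    node = dag[name]
    return "{0}/{1}/{2}-{3}-{4}".format(
        root, node["compiler"], name, node["version"], dag_hash(name, dag)
    )


# Two hypothetical stacks that differ only in one variant of a dependency.
# Version numbers are placeholders, not real releases.
with_mpi = {
    "pastix": {"version": "1.0", "compiler": "gcc-4.9", "deps": ["scotch"]},
    "scotch": {"version": "1.0", "compiler": "gcc-4.9", "variants": ["+mpi"]},
}
without_mpi = {
    "pastix": with_mpi["pastix"],
    "scotch": {"version": "1.0", "compiler": "gcc-4.9", "variants": ["~mpi"]},
}

print(install_prefix("/opt/stack", "pastix", with_mpi))
print(install_prefix("/opt/stack", "pastix", without_mpi))
```

Both calls describe the same pastix version and compiler, yet the two prefixes differ because the hash covers the dependency configuration as well. This is what lets several builds of the same package coexist.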
Provides a spec syntax to describe customized DAG configurations
• each expression is a spec for a particular configuration
  - each clause adds a constraint to the spec
  - constraints are optional: specify only what you need
  - customize the install on the command line!
• the syntax abstracts details in the common case
  - makes parameterization by version, compiler, and options easy when necessary
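To make "each clause adds a constraint" concrete, here is a minimal Python sketch of a parser for a small subset of the spec syntax: a package name followed by optional `@version`, `%compiler` (optionally with its own `@version`), and `+variant` / `~variant` clauses. The function `parse_spec` and the returned dictionary layout are assumptions for illustration. Spack's own parser is richer; for example it also handles dependency constraints and the `-variant` spelling, which this sketch leaves out.

```python
import re

# One clause: a sigil followed by its value, e.g. "@1.11.2", "%gcc", "+cuda".
CLAUSE = re.compile(r"([@%+~])([A-Za-z0-9_.:\-]+)")


def parse_spec(text):
    """Parse 'name[@version][%compiler[@version]][+variant|~variant]...'."""
    text = text.replace(" ", "")
    head = re.match(r"[A-Za-z0-9_\-]+", text)
    if head is None:
        raise ValueError("spec must start with a package name: %r" % text)
    spec = {"name": head.group(0), "version": None, "compiler": None,
            "compiler_version": None, "variants": {}}
    rest = text[head.end():]
    pos = 0
    previous = None
    for clause in CLAUSE.finditer(rest):
        if clause.start() != pos:
            break  # unparsable characters between two clauses
        sigil, value = clause.groups()
        if sigil == "%":
            spec["compiler"] = value
        elif sigil == "@" and previous == "%":
            # An "@" right after the compiler clause versions the compiler.
            spec["compiler_version"] = value
        elif sigil == "@":
            spec["version"] = value
        else:
            spec["variants"][value] = (sigil == "+")
        previous = sigil
        pos = clause.end()
    if pos != len(rest):
        raise ValueError("cannot parse %r at position %d" % (rest, pos))
    return spec


print(parse_spec("hwloc"))
print(parse_spec("hwloc~cuda%gcc"))
print(parse_spec("hwloc@1.11.2 %gcc@4.9 +cuda"))
```

Every field that stays `None` or empty is a detail the user chose not to constrain.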
Spack specs can constrain versions of dependencies
• Spack ensures one configuration of each library per DAG
  - consistency
  - the user does not need to know the DAG structure, only the dependency names
• Spack can ensure that builds use the same compiler, or you can mix
  - work is ongoing to ensure ABI compatibility when compilers are mixed
Spack handles API incompatibility
• mpi is a virtual dependency
• the same package can be installed with two different MPI implementations
• Spack can choose the MPI implementation, as long as it provides the MPI-2 interface
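The notion of a virtual dependency can be sketched in a few lines of Python: several implementations declare which version of an interface they provide, and a request such as "anything that provides MPI-2" is resolved against that table. This is an illustrative model, not Spack's resolver. The table `PROVIDERS`, the implementation names and the version numbers are hypothetical.

```python
# Hypothetical provider table: for each virtual package, the implementations
# that provide it and the highest interface version each one supports.
# Names and numbers are illustrative only.
PROVIDERS = {
    "mpi": {
        "mpi_impl_a": (3, 0),
        "mpi_impl_b": (2, 2),
        "mpi_impl_legacy": (1, 3),
    },
}


def providers_of(virtual, minimum):
    """List the implementations providing at least `minimum` of `virtual`."""
    table = PROVIDERS[virtual]
    return sorted(name for name, provided in table.items() if provided >= minimum)


def choose_provider(virtual, minimum, requested=None):
    """Honour an explicit request if it is valid, otherwise pick a default."""
    candidates = providers_of(virtual, minimum)
    if requested is not None:
        if requested not in candidates:
            raise ValueError("%s does not provide %s >= %s" % (requested, virtual, minimum))
        return requested
    if not candidates:
        raise ValueError("no provider for %s >= %s" % (virtual, minimum))
    return candidates[0]  # arbitrary default: first name in alphabetical order


print(providers_of("mpi", (2, 0)))
print(choose_provider("mpi", (2, 0), requested="mpi_impl_b"))
print(choose_provider("mpi", (2, 0)))
```

A package that depends on the virtual name never needs to list the concrete implementations. The user either names one explicitly or lets the tool pick any candidate that satisfies the interface constraint.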
Spack packages are simple Python scripts
Dependencies may be optional
• versions can be tarballs or VCS repositories (git, svn, hg)
• the user can define named variants
• and use them to install:
  $ spack install hwloc +cuda
  $ spack install hwloc -cuda
• dependencies may be optional according to other conditions
  - e.g. gcc depends on mpc from version 4.5 on
Concretization fills in missing configuration details when the user is not explicit
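A minimal Python sketch of the idea, under simplifying assumptions: an abstract spec is a dictionary with only the fields the user set, and a table of known packages supplies the defaults. The names `KNOWN_PACKAGES` and `concretize`, and all versions and defaults in the table, are hypothetical. Spack's real concretizer works on the whole dependency DAG, which this sketch does not attempt.

```python
# Hypothetical knowledge base: available versions and default settings.
KNOWN_PACKAGES = {
    "hwloc": {
        "versions": ["1.11.2", "1.11.3"],
        "default_compiler": "gcc@4.9",
        "default_variants": {"cuda": False},
    },
}


def version_key(version):
    """Turn '1.11.3' into (1, 11, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def concretize(abstract):
    """Return a fully specified copy of `abstract`; user constraints win."""
    info = KNOWN_PACKAGES[abstract["name"]]
    version = abstract.get("version") or max(info["versions"], key=version_key)
    if version not in info["versions"]:
        raise ValueError("unknown version %s of %s" % (version, abstract["name"]))
    variants = dict(info["default_variants"])
    variants.update(abstract.get("variants", {}))
    return {
        "name": abstract["name"],
        "version": version,
        "compiler": abstract.get("compiler") or info["default_compiler"],
        "variants": variants,
    }


# The user only asked for CUDA support; everything else is filled in.
print(concretize({"name": "hwloc", "variants": {"cuda": True}}))
```

The result of this step is what a command such as `spack spec hwloc` shows before anything is built.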
Default behaviour is configurable
• configure your preferred:
  - compilers, versions, dependencies, variants
• directly in the package, e.g. prefer Python 2.7.11
• or, for a specific machine/environment, edit ~/.spack/packages.yaml
Spack builds in an isolated environment
• a forked build process isolates the environment of each build
• compiler wrappers add include, lib, and RPATH flags
  - ensure that dependencies are found automatically
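What a compiler wrapper does can be sketched in Python: for each dependency prefix it prepends an include flag, a library search flag and an RPATH flag to the user's compiler command. This is a simplified model and not Spack's wrapper script. The helper names `wrapper_flags` and `wrap_command` and the example prefix are assumptions. The flag spellings `-I`, `-L` and `-Wl,-rpath,` are the usual GCC-style ones.

```python
import posixpath


def wrapper_flags(dependency_prefixes):
    """Return include, library and RPATH flags for each dependency prefix."""
    flags = []
    for prefix in dependency_prefixes:
        include_dir = posixpath.join(prefix, "include")
        lib_dir = posixpath.join(prefix, "lib")
        flags.append("-I" + include_dir)        # headers are found at compile time
        flags.append("-L" + lib_dir)            # libraries are found at link time
        flags.append("-Wl,-rpath," + lib_dir)   # and again at run time, via RPATH
    return flags


def wrap_command(compiler, user_args, dependency_prefixes):
    """Build the real compiler command line from the user's arguments."""
    return [compiler] + wrapper_flags(dependency_prefixes) + list(user_args)


# Hypothetical install prefix of one dependency.
command = wrap_command("gcc", ["-o", "app", "app.c", "-lhwloc"], ["/opt/stack/hwloc-1.0"])
print(" ".join(command))
```

Because the library directory is recorded in the binary through RPATH, the program finds the exact libraries it was built with, without any LD_LIBRARY_PATH setting.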
3 In practice
Setup
• entry point on GitHub: https://github.com/LLNL/spack
• read the documentation, at least Getting started: http://spack.readthedocs.io/en/latest/index.html
  $ sudo apt install python
  $ git clone https://github.com/LLNL/spack.git
  $ . spack/share/spack/setup-env.sh
  $ spack install gcc
Check compilers
• check the compilers found automatically
  $ spack compiler list
• information about a compiler
  $ spack compiler info gcc
Configure compilers
• add compilers installed in non-standard paths
  $ spack compiler find /home/jdoe/intel/bin
• remove unwanted compilers
  $ spack compiler rm clang
• the compiler configuration can be edited by hand
  $ vi ~/.spack/compilers.yaml
Install
• list available packages
  $ spack list
• information about a package
  $ spack info hwloc
• check the concrete stack to be installed
  $ spack spec hwloc
• install a package
  $ spack install [-v] hwloc
Configure the software stack
• constrained spec
• providers of virtual packages
Look for installed packages
• look for all installed packages
  $ spack find
• find all installed configurations of a package
  $ spack find [-d | -p] hwloc~cuda
• location of an installation
  $ spack find -p hwloc~cuda%gcc
  $ spack location -i hwloc~cuda%gcc
Uninstall + Install = update
• there is no update feature
• new variants or dependencies ⇒ re-build the stack
• uninstall + re-install = the poor man's update!
  $ spack uninstall [-yad] hwloc~cuda%gcc
  $ spack install hwloc~cuda%gcc
Spack on a supercomputer
• typically no internet access in an HPC environment
• the sources must be downloaded beforehand
  $ spack mirror create -d ./mirror/ hwloc
  $ tar czvf mirror.tar.gz mirror
• copy the tarballs to the remote machine
  $ scp mirror.tar.gz plafrim:
• on the machine, add the mirror directory to the Spack mirrors
  $ ssh plafrim
  $ tar xvf mirror.tar.gz
  $ spack mirror add local file:///home/pruvost/mirror
Spack on a supercomputer
• externally installed software can be used: edit ~/.spack/packages.yaml
4 Some feedback
Spack is more and more used
• Spack strengths:
  - helpful when dealing with combinatorial stacks
  - user-friendly and highly configurable
  - flexible enough for HPC needs
• Spack is used in production at LLNL
  - build, test, and deployment by code teams
  - tools, libraries, and Python at Livermore Computing
  - build research projects for students, postdocs
• Spack has a rapidly growing external community
  - users and contributors at NERSC, Argonne/IIT, EPFL, U. Oregon, Sandia, LANL
  - Kitware contributing ParaView builds & features
  - Inria (HiePACS) using Spack to package their linear solvers
Spack weaknesses
• the current Spack version is 0.9.1; it is alpha software and will remain so until it reaches v1.0
• packages change a lot ⇒ frequent re-builds
• builds are not fully guaranteed
  - external programs: compilers and libraries
  - the build environment is not fully isolated and controlled
  ⇒ problem for reproducibility
  ⇒ in practice: builds often fail on new clusters
• high combinatorics = difficult to test all builds
• high flexibility = possibility to consider different DAGs/variants for one software stack
  ⇒ no convergence of stack definitions between research teams?
Thank you
Any questions?