Install your scientific stack easily with Spack

Les mardis du développement technologique

Florent Pruvost (SED)

Outline

1. Context
2. Features overview
3. In practice
4. Some feedback

Florent Pruvost (SED) – Install your scientific software stack easily with Spack

1 Context

A scientific software stack

• modular, several languages, different build systems

[Figure: stack layers: Application; Linear Algebra; Optimized Kernels, Graph Processing, Runtime Systems; Paradigms, Miscellaneous]

• it is difficult to be an expert in the whole chain

An example: Aerosol

• a finite element library developed by the Inria teams Cagire and Cardamom

[Figure: Aerosol dependency graph: Aerosol; PaMPA, PaStiX, HDF5, XML2; PT-SCOTCH, StarPU, BLAS/LAPACK; MPI, CUDA]

Constraints

• R&D ⇒ develop prototypes
  - test many different builds
• computing/data-intensive applications ⇒ HPC environment
  - different machines, OSes, environments
  - remote connection, no administrator rights
• performance ⇒ highly tuned installation
  - well-chosen components
  - specific build options
• reproducibility
  - control the environment
  - characterize what influences the build

Wish list

• a simple process to install a default version
• a flexible way to choose build variants
  - choose compilers, software versions
  - enable components, e.g. MPI: yes/no
  - build options, e.g. --enable-debug
• be able to install on a remote machine (supercomputer)
  - no root permissions
  - possibly no internet access
• be able to reproduce experiments
  - non-destructive installation
  - control the environment and third-party libraries

Traditional tools: binary package managers

• APT, RPM, pacman, etc.
• designed to manage a single stack
• install one version of each package in a single prefix (/usr)
  - root permission required
• seamless upgrades to a stable, well-tested stack

Traditional tools: port systems

• BSD Ports, Macports, Homebrew, Gentoo, etc.
• minimal support for builds parameterized by compilers, dependency versions

Traditional tools: virtual machines and containers

• Docker, etc.
• containers allow users to build environments for different applications
• they do not solve the build problem (someone has to build the image)
• performance, security, and upgrade issues prevent widespread HPC deployment

Tools designed for scientific applications

A short list to focus on: Nix/Guix, EasyBuild, Spack

• Common features:
  - build from sources
  - own directory structure
  - hash parameterized by versions, dependencies, etc.
  - no need to be root to install packages
  - useful packages available: MPI, BLAS/LAPACK, FFTW, etc.
• Differences:
  - robustness vs. flexibility
  - maturity
  - languages and technical details

Nix/Guix: functional languages (Guile)

• pros:
  - very precise dependency tracking
  - cryptographic hashes determine the exact build and run-time dependencies
  - safe upgrades
  - good for reproducibility
• cons:
  - administrator rights required to install Nix/Guix
  - limited to open-source software: no Intel, CUDA
  - how to deal with combinatorial builds?
  - multi-compiler and version support?
  - virtual dependencies?
  - syntax for parameterization?

EasyBuild: Python

• pros:
  - designed for installation on HPC systems
  - support for proprietary software like Intel and CUDA
  - can reuse what is already installed, cf. dummy toolchain
• cons:
  - cannot deal with combinatorial builds
  - requires one file per configuration of a stack
  - limited command-line interface

Spack: Python 2

• pros:
  - deals with combinatorial builds
  - a couple of packages = thousands of possible builds
  - nice command-line syntax to tune the stack parameters
  - can reuse what is already installed, cf. configuration files
  - no need to be root
  - no need for internet access if tarballs are available locally
• cons:
  - Spack is currently alpha software (young project)
  - package parameters have changed a lot
  - new features often require rewriting packages
  - not all parameters that affect a build are controlled internally (external compilers and libraries), so not all builds are safe

2 Features overview

Overview

• object-oriented Python 2
• Unix systems
  - Windows is not an OS for HPC
• ~500 available packages
• open source, open community (GitHub, pull requests)
  - from Feb 10, 2013 to today
  - mainly developed by T. Gamblin and friends from LLNL
  - 95 contributors, 154 forks
  - 1 release every 6 months

Handles combinatorial software complexity

• each unique dependency graph is a unique configuration
• each configuration is installed in a unique directory
• a hash of the DAG is appended to the prefix
• installed packages automatically find their dependencies
  - Spack embeds RPATHs in binaries (compiler wrappers)
  - no need to use modules or set LD_LIBRARY_PATH
  - things work the way you built them
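The idea of a prefix derived from a hash of the dependency graph can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the dictionary layout, the function names `dag_hash` and `install_prefix`, the directory scheme and the example packages are all hypothetical, and Spack's real hashing and directory layout differ in detail.

```python
import hashlib
import json


def dag_hash(spec):
    """Hash a node together with the hashes of all its dependencies."""
    node = {
        "name": spec["name"],
        "version": spec["version"],
        "compiler": spec["compiler"],
        "variants": spec.get("variants", {}),
        # Child hashes make the result depend on the whole sub-graph.
        "deps": sorted(dag_hash(dep) for dep in spec.get("deps", [])),
    }
    payload = json.dumps(node, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def install_prefix(root, spec, hash_length=7):
    """Build a unique directory: <root>/<compiler>/<name>-<version>-<hash>."""
    short_hash = dag_hash(spec)[:hash_length]
    return "{0}/{1}/{2}-{3}-{4}".format(
        root, spec["compiler"], spec["name"], spec["version"], short_hash
    )


# Hypothetical configurations that differ only in one dependency version.
mpi_a = {"name": "mpi-impl", "version": "1.0", "compiler": "gcc-4.9"}
mpi_b = {"name": "mpi-impl", "version": "2.0", "compiler": "gcc-4.9"}
app_a = {"name": "app", "version": "0.1", "compiler": "gcc-4.9", "deps": [mpi_a]}
app_b = {"name": "app", "version": "0.1", "compiler": "gcc-4.9", "deps": [mpi_b]}

if __name__ == "__main__":
    print(install_prefix("/opt/stack", app_a))
    print(install_prefix("/opt/stack", app_b))
```

Because the hash of each node includes the hashes of its children, changing any dependency anywhere in the graph yields a different prefix, so both builds of `app` can coexist on disk.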

Provides a spec syntax to describe customized DAG configurations

• each expression is a spec for a particular configuration
  - each clause adds a constraint to the spec
  - constraints are optional: specify only what you need
  - customize the install on the command line!
• the syntax abstracts details in the common case
  - makes parameterization by version, compiler, and options easy when necessary
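To make the "each clause adds a constraint" idea concrete, here is a toy parser for a small subset of the spec syntax: `@` for a version, `%` for a compiler, `+`/`~` to enable or disable a variant, and `^` to constrain a dependency. It is a sketch, not Spack's parser: the function name `parse_spec`, the dictionary fields and the package name `myapp` are assumptions made for the example, and the `-variant` form is deliberately left out.

```python
import re

# A clause is an optional sigil followed by a word (name, version or range).
TOKEN = re.compile(r"(\^|@|%|\+|~)?\s*([A-Za-z0-9_.:\-]+)")


def new_node(name=None):
    return {"name": name, "version": None, "compiler": None,
            "compiler_version": None, "variants": {}, "dependencies": []}


def parse_spec(text):
    """Turn a spec string into a nested dictionary of constraints."""
    root = new_node()
    node = root          # node that currently receives constraints
    last_sigil = None    # lets '@' after '%gcc' mean "compiler version"
    for sigil, word in TOKEN.findall(text):
        if sigil == "":
            node["name"] = word
        elif sigil == "^":
            node = new_node(word)
            root["dependencies"].append(node)
        elif sigil == "%":
            node["compiler"] = word
        elif sigil == "@":
            if last_sigil == "%":
                node["compiler_version"] = word
            else:
                node["version"] = word
        elif sigil == "+":
            node["variants"][word] = True
        elif sigil == "~":
            node["variants"][word] = False
        last_sigil = sigil
    return root


if __name__ == "__main__":
    print(parse_spec("myapp@1.2 %gcc@4.9 +debug ~cuda ^mpi@2:"))
```

Every clause is optional, so `parse_spec("myapp")` is as valid as the fully constrained form; anything left as `None` is what a later step would have to fill in.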

Spack specs can constrain versions of dependencies

• Spack ensures one configuration of each library per DAG
  - consistency
  - the user does not need to know the DAG structure, only the dependency names
• Spack can ensure that builds use the same compiler, or you can mix compilers
  - work is ongoing to ensure ABI compatibility when compilers are mixed
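The "one configuration of each library per DAG" rule can be illustrated with a small consistency check. This is a sketch under assumptions, not Spack's algorithm: the function name `collect_configurations`, the dictionary layout and the package names are hypothetical.

```python
def collect_configurations(spec, seen=None):
    """Walk the DAG and record one (version, compiler) pair per package name."""
    seen = {} if seen is None else seen
    config = (spec["version"], spec["compiler"])
    previous = seen.setdefault(spec["name"], config)
    if previous != config:
        # The same library is requested in two different configurations.
        raise ValueError(
            "{0} requested as {1} and {2}".format(spec["name"], previous, config)
        )
    for dep in spec.get("deps", []):
        collect_configurations(dep, seen)
    return seen


# Hypothetical stack: the application and its solver share the same MPI build.
mpi = {"name": "mpi-impl", "version": "2.0", "compiler": "gcc"}
solver = {"name": "solver", "version": "5.2", "compiler": "gcc", "deps": [mpi]}
consistent = {"name": "app", "version": "0.1", "compiler": "gcc",
              "deps": [solver, mpi]}

# Same stack, but the application asks for another MPI version than the solver.
other_mpi = {"name": "mpi-impl", "version": "1.0", "compiler": "gcc"}
inconsistent = {"name": "app", "version": "0.1", "compiler": "gcc",
                "deps": [solver, other_mpi]}

if __name__ == "__main__":
    print(collect_configurations(consistent))
```

The check only needs package names, which mirrors the point made on the slide: the user constrains a dependency by name without knowing where it sits in the graph.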

Spack handles API-incompatibility

• mpi is a virtual dependency
• the same package can be installed with two different MPI implementations
• Spack can choose the MPI implementation itself, as long as it provides the MPI-2 interface
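Resolving a virtual dependency amounts to picking a concrete package that provides the requested interface. The sketch below shows that selection step only; the provider table, the package names `fastmpi`, `stablempi` and `oldmpi`, the interface levels and the function `choose_provider` are all invented for the illustration and say nothing about real MPI implementations.

```python
# Hypothetical provider table: package name -> highest MPI standard it implements.
# Names and numbers are invented for the example.
PROVIDERS = {
    "mpi": {"fastmpi": 3, "stablempi": 2, "oldmpi": 1},
}


def choose_provider(virtual, min_interface, preferred=()):
    """Pick a package providing at least `min_interface` of a virtual package."""
    candidates = sorted(
        name
        for name, level in PROVIDERS.get(virtual, {}).items()
        if level >= min_interface
    )
    if not candidates:
        raise LookupError(
            "no provider of {0}@{1}:".format(virtual, min_interface)
        )
    # Honour the user's preference order when one of them qualifies.
    for name in preferred:
        if name in candidates:
            return name
    return candidates[0]


if __name__ == "__main__":
    print(choose_provider("mpi", 2))
    print(choose_provider("mpi", 2, preferred=("stablempi",)))
```

An explicit preference plays the role of naming the implementation on the command line, while the fallback corresponds to letting the tool decide.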

Spack packages are simple Python scripts

Dependencies may be optional

• versions can be tarballs or VCS repositories (git, svn, hg)
• named variants can be defined in a package
• and used at install time:
  $ spack install hwloc +cuda
  $ spack install hwloc -cuda
• dependencies may be optional according to other conditions
  - e.g. gcc depends on mpc from version 4.5 on
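The two kinds of optional dependency above (enabled by a variant, or by a version condition) can be modelled with plain data. In a real Spack package they are declared with the `variant(...)` and `depends_on(..., when=...)` directives; the code below does not use Spack and is only a toy evaluator. The dictionaries `HWLOC` and `GCC`, their simplified dependency lists, and the helper names are assumptions for illustration, and only the `+variant` and `@X.Y:` condition forms are handled.

```python
# Simplified, illustrative package descriptions: (dependency, condition) pairs.
HWLOC = {
    "variants": {"cuda": False},
    "dependencies": [("libpciaccess", None), ("cuda", "+cuda")],
}
GCC = {
    "variants": {},
    "dependencies": [("gmp", None), ("mpfr", None), ("mpc", "@4.5:")],
}


def version_tuple(text):
    """'4.9.2' -> (4, 9, 2), so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def condition_holds(when, version, variants):
    """Evaluate the two condition forms supported by this sketch."""
    if when is None:
        return True
    if when.startswith("+"):
        return variants.get(when[1:], False)
    if when.startswith("@") and when.endswith(":"):
        return version_tuple(version) >= version_tuple(when[1:-1])
    raise ValueError("unsupported condition: " + when)


def active_dependencies(package, version, enabled=()):
    """List the dependencies that apply to one version and variant choice."""
    variants = dict(package["variants"])
    for name in enabled:
        variants[name] = True
    return [dep for dep, when in package["dependencies"]
            if condition_holds(when, version, variants)]


if __name__ == "__main__":
    print(active_dependencies(HWLOC, "1.11.2", enabled=("cuda",)))
    print(active_dependencies(GCC, "4.4.7"))
    print(active_dependencies(GCC, "4.9.2"))
```

Keeping the condition next to the dependency is what lets a single package description cover many versions and variant combinations.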

Concretization fills in missing configuration details when the user is not explicit
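A much reduced picture of concretization: take whatever the user specified and complete it with defaults. The sketch below is an assumption-laden toy, not Spack's concretizer (which also resolves the whole dependency graph); the `DEFAULTS` table, its values and the function `concretize` are hypothetical.

```python
# Hypothetical site defaults; versions are listed in order of preference.
DEFAULTS = {
    "compiler": "gcc@4.9",
    "versions": {"hwloc": ["1.11.3", "1.11.2"]},
    "variants": {"hwloc": {"cuda": False}},
}


def concretize(abstract, defaults=DEFAULTS):
    """Fill in every field the user left unspecified."""
    name = abstract["name"]
    concrete = {"name": name}
    # Explicit user constraints win; otherwise fall back to the defaults.
    concrete["version"] = abstract.get("version") or defaults["versions"][name][0]
    concrete["compiler"] = abstract.get("compiler") or defaults["compiler"]
    variants = dict(defaults["variants"].get(name, {}))
    variants.update(abstract.get("variants", {}))
    concrete["variants"] = variants
    return concrete


if __name__ == "__main__":
    print(concretize({"name": "hwloc"}))
    print(concretize({"name": "hwloc", "variants": {"cuda": True}}))
```

The output of such a step is fully specified, which is what makes it possible to hash it and give it a unique installation directory.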

Default behaviour is configurable

• configure your preferred:
  - compilers, versions, dependencies, variants
• directly in the package, e.g. prefer Python 2.7.11
• or, for a specific machine/environment, edit ~/.spack/packages.yaml

Spack builds in an isolated environment

• a forked build process isolates the environment for each build
• compiler wrappers add include, lib, and RPATH flags
  - ensures that dependencies are found automatically
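What a compiler wrapper adds can be shown with a short sketch: for each dependency prefix, an include path, a library path and an RPATH entry are placed on the command line. The function names and the example prefix are hypothetical, and the `include`/`lib` sub-directory layout is an assumption; Spack's actual wrappers handle many more cases.

```python
import posixpath


def wrapper_flags(dependency_prefixes):
    """Return the -I, -L and RPATH flags for a list of install prefixes."""
    flags = []
    for prefix in dependency_prefixes:
        include_dir = posixpath.join(prefix, "include")
        lib_dir = posixpath.join(prefix, "lib")
        flags += [
            "-I" + include_dir,          # headers are found at compile time
            "-L" + lib_dir,              # libraries are found at link time
            "-Wl,-rpath," + lib_dir,     # and again at run time, via RPATH
        ]
    return flags


def wrapped_command(compiler, user_args, dependency_prefixes):
    """Build the command line that the real compiler would receive."""
    return [compiler] + wrapper_flags(dependency_prefixes) + list(user_args)


if __name__ == "__main__":
    command = wrapped_command("gcc", ["-o", "app", "app.c"], ["/opt/stack/hwloc"])
    print(" ".join(command))
```

The RPATH entry is what removes the need for modules or LD_LIBRARY_PATH: the path to each dependency is recorded in the binary itself.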

3 In practice

Setup

• entry point on GitHub: https://github.com/LLNL/spack
• read the documentation, at least "Getting started": http://spack.readthedocs.io/en/latest/index.html

$ sudo apt-get install python   # Debian/Ubuntu; adapt to your distribution
$ git clone https://github.com/LLNL/spack.git
$ . spack/share/spack/setup-env.sh
$ spack install gcc

Check compilers

• check the compilers found automatically
  $ spack compiler list

• information about a compiler
  $ spack compiler info gcc

Configure compilers

• add compilers installed in non-standard paths
  $ spack compiler find /home/jdoe/intel/bin

• remove unwanted compilers
  $ spack compiler rm clang

• the compiler configuration can be edited by hand
  $ vi ~/.spack/compilers.yaml

Install

• list available packages
  $ spack list

• information about a package
  $ spack info hwloc

• check the concrete stack to be installed
  $ spack spec hwloc

• install a package
  $ spack install [-v] hwloc

Configure the software stack

• constrained spec

• providers of virtual packages

Look for installed packages

• look for all installed packages
  $ spack find

• find all installed configurations of a package
  $ spack find [-d | -p] hwloc~cuda

• location of an installation
  $ spack find -p hwloc~cuda%gcc
  $ spack location -i hwloc~cuda%gcc

Uninstall + Install = update

• there is no update feature
• new variants or dependencies ⇒ rebuild the stack
• uninstall + re-install = the poor man's update!
  $ spack uninstall [-yad] hwloc~cuda%gcc
  $ spack install hwloc~cuda%gcc

Spack on a supercomputer

• no internet access in an HPC environment
• need to download the sources beforehand
  $ spack mirror create -d ./mirror/ hwloc
  $ tar czvf mirror.tar.gz mirror

• copy the tarballs to the remote machine
  $ scp mirror.tar.gz plafrim:

• on the machine, add the mirror directory to the Spack mirrors
  $ ssh plafrim
  $ tar xzvf mirror.tar.gz
  $ spack mirror add local file:///home/pruvost/mirror

Spack on a supercomputer

• externally installed software can be used: edit ~/.spack/packages.yaml

4 Some feedback

Spack is more and more used

• Spack strengths:
  - helpful when dealing with combinatorial stacks
  - user-friendly and highly configurable
  - flexible enough for HPC needs
• Spack is used in production at LLNL
  - build, test, and deployment by code teams
  - tools, libraries, and Python at Livermore Computing
  - builds of research projects for students and postdocs
• Spack has a rapidly growing external community
  - users and contributors at NERSC, Argonne/IIT, EPFL, U. Oregon, Sandia, LANL
  - Kitware contributing ParaView builds and features
  - Inria (HiePACS) using Spack to package their linear solvers

Spack weaknesses

• the current Spack version is 0.9.1; it is alpha software and will remain so until it reaches v1.0
• packages change a lot ⇒ frequent rebuilds
• builds are not fully guaranteed
  - external programs: compilers and libraries
  - the build environment is not fully isolated and controlled
  ⇒ a problem for reproducibility
  ⇒ in practice: builds often fail on new clusters
• high combinatorial complexity = difficult to test all builds
• high flexibility = possibility to consider different DAGs/variants for one software stack
  ⇒ non-convergence of stack definitions between research teams?

Thank you

Any questions?