
Making Scientific Software Installation Reproducible On Cray Systems Using EasyBuild

Petar Forai
Research Institute of Molecular Pathology
Dr Bohrgasse 7
A-1030 Vienna, Austria
[email protected]

Kenneth Hoste
Ghent University
Krijgslaan 281, S9
B-9000 Ghent, Belgium
[email protected]

Guilherme Peretti-Pezzi
Swiss National Supercomputing Centre
Via Trevano, 131
6900 Lugano, Switzerland
[email protected]

Brett Bode
National Center for Supercomputing Applications
University of Illinois
1205 W. Clark St.
Urbana, IL 61801
[email protected]

ABSTRACT

Cray provides a tuned and supported OS and programming environment (PE), including compilers and libraries integrated with the modules system. While the Cray PE is updated frequently, tools and libraries not in it quickly become outdated. In addition, the number of tools, libraries and scientific applications that HPC user support teams are expected to support is increasing significantly. The uniformity of the software environment across Cray sites makes it an attractive target for sharing this ubiquitous burden and for collaborating on a common solution.

EasyBuild is an open-source, community-driven framework to automatically and reproducibly install (scientific) software on HPC systems. This paper presents how EasyBuild has been integrated with the Cray PE, in order to leverage the provided optimized components and support an easy way to deploy tools, libraries and scientific applications on Cray systems.

We will discuss the changes that needed to be made to EasyBuild to achieve this, and outline the use case of providing the Python distribution and accompanying Python packages on top of the Cray PE, to obtain a fully featured 'batteries included' optimized Python installation that integrates with the Cray-provided software stack. In addition, we will outline how EasyBuild was deployed at the Swiss National Supercomputing Centre CSCS and how it is leveraged to obtain a consistent software stack and installation workflow across multiple Cray (XC, CS-Storm) and non-Cray HPC systems.

1. INTRODUCTION

The job of High-Performance Computing (HPC) support teams is to enable the productive and efficient use of HPC resources. This ranges from helping out users by resolving issues like login problems or unexpected software crashes, providing detailed answers to both simple and deeply technical questions, installing requested software libraries, tools and applications, to providing services such as performance analysis and optimization of scientific software being developed.

One particularly time-consuming task for HPC support teams is installing (scientific) software. Due to the advanced nature of supercomputers (i.e., multiple multi-core processors, high-performance network interconnect, need for most recent compilers and libraries, etc.), compiling software from source on the actual operating system and for the system architecture that it is going to be used on is typically highly preferred, if not required, as opposed to using readily available binary packages that were built in a generic way.

Support teams at HPC sites worldwide typically invest large amounts of time and manpower tackling this tedious and time-consuming task of installing the (scientific) software tools, libraries, and applications that researchers are requesting, while trying to maintain a coherent software stack.

EasyBuild is a recent software build and installation framework that intends to relieve HPC support teams from the ubiquitous burden of installing scientific software on HPC systems. It supports fully automating the (often complex) installation procedure of scientific software, and includes features specifically targeted towards HPC systems while providing a flexible yet powerful interface. As such, it has quickly grown to become a platform for collaboration between HPC sites worldwide. We discuss EasyBuild in more detail in Section 2.

The time-consuming problem of getting scientific software installed also presents itself on Cray systems, despite the extensive programming environment that Cray usually provides. Both HPC support teams and end users of Cray systems struggle on a daily basis with getting the required tools and applications up and running (let alone doing so in a somewhat optimal way). While a policy for providing software installations in a consistent way to end users is a common goal, the reality more often than not is that shortcuts are taken to speed up the process, resulting in a software stack that is organised in a suboptimal (and sometimes outright confusing) manner, with little consistency in the software stacks provided on different systems beyond what is provided by Cray. Hence, it is clear that EasyBuild could also be useful on Cray systems.

Originally, the main target of EasyBuild was the standard GNU/Linux 64-bit x86 system architecture that is commonplace on today's HPC systems. These systems typically lack a decent (recent) basic software stack, i.e., compilers and libraries for MPI, linear algebra, etc., to build scientific applications on top of. EasyBuild was designed to deal with this, by supporting the installation of compilers and accompanying libraries, and through the toolchain mechanism it employs (see Section 2.2.4). As such, it was not well suited to Cray systems, where a well-equipped programming environment is provided and its use is highly recommended.

In this paper, we present the changes that had to be made to EasyBuild to provide stable integration with the available programming environment on Cray systems (see Section 3). In addition, we outline the use case of installing Python and several common Python libraries using EasyBuild on Cray systems (Section 4), and discuss how EasyBuild was deployed at the Swiss National Supercomputing Centre CSCS (Section 5).

2. EASYBUILD

EasyBuild [17-19] is a tool for building and installing scientific software on HPC systems, implemented in Python. It was originally created by the HPC support team at Ghent University (Belgium) in 2009, and was publicly released in 2012 under an open-source software license (GPLv2). Soon after, it was adopted by various HPC sites and an active community formed around it. Today, it is used by institutions worldwide that provide HPC services to the scientific research community (see Section 2.4).

2.1 Goals

The main goal of EasyBuild is to fully automate the task of building and installing (scientific) software on HPC systems, including taking care of the required dependencies and generating environment module files that match the installations. It aims to be an expert system with which all aspects of software installation procedures can be performed autonomously, adhering to the provided specifications.

In addition, EasyBuild serves as a platform for collaboration where different HPC support teams, who likely apply different site policies concerning scientific software installations, can efficiently work together to implement best practices and benefit from each other's expertise.

It aims to support achieving reproducibility in science, by making it easy to reproduce software installations that were performed previously.

... and the particular compiler and libraries being employed; see also Section 2.3.6.

2.2 Terminology

Before going into more detail, we introduce some terminology specific to EasyBuild that will be used throughout this paper.

2.2.1 EasyBuild framework

The EasyBuild framework is the core of the tool that provides the functionality that is commonly needed for building and installing scientific software on HPC systems. It consists of a collection of Python modules that implement:

• the eb command line interface

• an abstract software installation procedure, split up into different steps including configuring, building, testing, installing, etc. (see Section 2.3.1)

• functions to perform common tasks like downloading and unpacking source tarballs, applying patch files, autonomously running (interactive) shell commands and capturing output & exit codes, generating module files, etc.

• an interface to interact with the modules tool, to check which modules are available, to load modules, etc.

• a mechanism to define the environment in which the installation will be performed, based on the compiler, libraries and dependencies being used (see also Section 2.2.4)

2.2.2 Easyblocks

The implementation of a particular software installation procedure is done in an easyblock, a Python module that defines, extends and/or replaces one or more of the steps of the abstract procedure defined by the EasyBuild framework. Easyblocks leverage the supporting functionality provided by the EasyBuild framework, and can be viewed as 'plugins'.

A distinction is made between software-specific and generic easyblocks. Software-specific easyblocks implement a procedure that is entirely custom to one particular software package (e.g., OpenFOAM), while generic easyblocks implement a procedure using standard tools (e.g., CMake, make).

Each easyblock must define the configuration, build and install steps in one way or another; the EasyBuild framework leaves these steps purposely unimplemented since their implementation heavily depends on the tools being used in the installation procedure. Since easyblocks are implemented in an object-oriented scheme, the step methods implemented by a particular easyblock can be reused in others through inheritance, enabling code reuse across easyblocks.

For each software package being installed, the EasyBuild
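The easyblock scheme of Section 2.2.2 (an abstract procedure whose configure, build and install steps are left unimplemented, a generic easyblock that fills them in with standard tools, and a software-specific easyblock that reuses them through inheritance) can be illustrated with a small, self-contained Python sketch. The sketch does not use EasyBuild itself: the class names AbstractProcedureSketch, ConfigureMakeLikeSketch and ExampleAppSketch, the run helper and the --enable-mpi option are all hypothetical and chosen for illustration only, and shell commands are recorded in a list rather than executed.

```python
class AbstractProcedureSketch:
    """Hypothetical stand-in for an abstract installation procedure.

    The configure, build and install steps are deliberately left
    unimplemented, mirroring the description in Section 2.2.2.
    """

    STEPS = ("configure", "build", "test", "install")

    def __init__(self, name, version, prefix):
        self.name = name
        self.version = version
        self.prefix = prefix
        self.commands = []  # commands are recorded here, not executed

    def run(self, cmd):
        # Illustration only: remember the command instead of running it.
        self.commands.append(cmd)

    def configure_step(self):
        raise NotImplementedError("configure step must be defined")

    def build_step(self):
        raise NotImplementedError("build step must be defined")

    def test_step(self):
        pass  # optional step, no-op by default in this sketch

    def install_step(self):
        raise NotImplementedError("install step must be defined")

    def run_all_steps(self):
        # Walk through the steps in order and return the recorded commands.
        for step in self.STEPS:
            getattr(self, step + "_step")()
        return self.commands


class ConfigureMakeLikeSketch(AbstractProcedureSketch):
    """Hypothetical generic procedure using standard tools (configure, make)."""

    def __init__(self, name, version, prefix):
        super().__init__(name, version, prefix)
        self.configopts = []

    def configure_step(self):
        opts = " ".join(self.configopts)
        self.run(("./configure --prefix=%s %s" % (self.prefix, opts)).strip())

    def build_step(self):
        self.run("make")

    def install_step(self):
        self.run("make install")


class ExampleAppSketch(ConfigureMakeLikeSketch):
    """Hypothetical software-specific procedure: only configure is extended."""

    def configure_step(self):
        # Add a package-specific option, then reuse the inherited step.
        self.configopts.append("--enable-mpi")
        super().configure_step()


if __name__ == "__main__":
    app = ExampleAppSketch("example", "1.0", "/tmp/example-1.0")
    for cmd in app.run_all_steps():
        print(cmd)
```

In this sketch the software-specific class overrides a single step and inherits the build and install steps unchanged, which is the kind of code reuse the object-oriented design is meant to enable.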