<<

How Open Source Supports the Largest Computers on the Planet

Best Practices for HPC Software Developers

Ian Lee
Lawrence Livermore National Laboratory
July 18, 2018

LLNL-PRES-754800 This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344. Lawrence Livermore National Security, LLC

software.llnl.gov

https://upload.wikimedia.org/wikipedia/commons/a/a8/U.S._National_labs_map.jpg

http://www.ex-astris-scientia.org/articles/new_enterprise/enterprise-warpcore.jpg

https://pixabay.com/get/e833b10d2af4083ed1534705fb0938c9bd22ffd41db612439df7c17ba0/silos-1602209_1920.jpg

[Figure: timeline by decade, 1960s through 2010s. Machines and eras labeled: CDC 3600; CDC 7600; CRAY 1; ASCI Blue-Pacific; BlueGene; petascale and exascale computing. Applications labeled: breakthrough visualizations of mixing fluids; detailed predictions of ecosystems; helping the medical community plan radiation treatment; dynamics in three dimensions; ozone mixing models; unprecedented dislocation dynamics simulations; pioneering simulations of particle tracking; global climate modeling.]

Top500.org

§ 3 out of 16 #1 systems over the last 20 years

— ASCI White: Nov 2000 – Nov 2001

— BlueGene/L: Nov 2004 – Nov 2007

— Sequoia: June 2012

https://www.top500.org/resources/top-systems/

ZFS on Linux

§ ZFS is an open source filesystem and volume manager designed to address the limitations of existing storage solutions

§ 2011: Available for Linux

§ Ten LLNL filesystems, totaling ~100 PB

§ Ships in Ubuntu 16.04

http://zfsonlinux.org

https://software.llnl.gov

LLNL Open Source Presence

https://software.llnl.gov/explore

LLNL Open Source Engagement

https://software.llnl.gov/explore

LLNL Open Source Activities

https://software.llnl.gov/explore

Science & Technology Review

“Our large collection of software is a precious Laboratory asset, one that benefits both Lawrence Livermore, and in many cases, the public at large.”

- Bruce Hendrickson, Associate Director, Computation

https://str.llnl.gov/2018-01/comjan18

https://www.exascaleproject.org/more-on-the-software-that-underpins-the-exascale-computing-project/

Federal Source Code Policy

§ “Federal Source Code Policy: Achieving Efficiency, Transparency, and Innovation through Reusable and Open Source Software”

— “Agencies shall make custom-developed code available for Government-wide reuse and make their code inventories discoverable at https://www.code.gov (“Code.gov”) […]”

— “[…] establishes a pilot program that requires agencies, when commissioning new custom software, to release at least 20 percent of new custom-developed code as Open Source Software (OSS) […]”
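The 20 percent pilot requirement quoted above can be illustrated with a small calculation. This is only a sketch: the policy excerpt here does not say how the share is measured, so counting by project is an assumption, and the `Project` type, the `open_source_share` helper, and the inventory entries are all hypothetical.

```python
from dataclasses import dataclass

# "at least 20 percent", taken from the policy text quoted above
PILOT_MINIMUM = 0.20


@dataclass
class Project:
    name: str          # hypothetical project name
    open_source: bool  # True if the project was released as OSS


def open_source_share(projects):
    """Fraction of projects released as open source (0.0 for an empty inventory).

    Assumption: the share is counted per project; an agency could just as
    well measure it some other way.
    """
    if not projects:
        return 0.0
    return sum(p.open_source for p in projects) / len(projects)


# Hypothetical inventory of newly commissioned custom software.
inventory = [
    Project("solver", True),
    Project("scheduler-plugin", False),
    Project("data-portal", False),
    Project("viz-tool", True),
    Project("internal-dashboard", False),
]

share = open_source_share(inventory)
meets_pilot = share >= PILOT_MINIMUM
print(f"{share:.0%} released as open source; meets pilot minimum: {meets_pilot}")
```

With two of the five hypothetical projects released, the share is 40 percent, which clears the 20 percent minimum.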

https://code.gov & https://sourcecode.cio.gov

https://osti.gov/doecode

https://government.github.com

US Government Organizations on GitHub

https://government.github.com/community/

Thank You!

[email protected]

@IanLee1521 // @LLNL_OpenSource

https://speakerdeck.com/IanLee1521

This document was prepared as an account of work sponsored by an agency of the United States government. Neither the United States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes.

TOSS – Tri-Lab Operating System Software

§ Built on Red Hat Enterprise Linux

— Not an HPC distribution

§ Adds LLNL-developed additions and patches to support HPC

— Low Latency Interconnect: InfiniBand

— Parallel File System: Lustre

— Resource Manager: SLURM

§ Work closely with open communities

[Figure: TOSS software stack. Labels: HPSS; Hopper; Compiler & Development Tools; User Environment; File Systems; Batch Scheduler (MOAB); Resource Manager (SLURM); Kernel, Infiniband, Message Passing Interface; Supported Linux Commodity Hardware Platform. Legend: TOSS Components; Components not in TOSS.]

TOSS is a software stack for HPC – large, interconnected clusters!

LLNL-PRES-550311

SLURM

§ Began as simple resource manager

— Now scalable to 1.6M+ cores (Sequoia)

§ Launch and manage parallel jobs

— Large, parallel jobs, often MPI

§ Queuing and scheduling of jobs

— Much more work than resources
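The point about queuing, that there is much more work than resources, can be made concrete with a toy simulation. The sketch below is not SLURM's scheduling algorithm; it is a strict first-come, first-served queue with no priorities or backfill, and the `fifo_schedule` function, the job list, and the 4-node cluster are all hypothetical.

```python
import heapq


def fifo_schedule(jobs, total_nodes):
    """Return {job_name: start_time} for jobs started strictly in submission order.

    jobs is a list of (name, nodes_needed, duration) tuples.
    """
    free = total_nodes
    now = 0
    running = []  # min-heap of (end_time, nodes_held)
    starts = {}
    for name, nodes, duration in jobs:
        if nodes > total_nodes:
            raise ValueError(f"{name} needs more nodes than the cluster has")
        # Wait for running jobs to finish until enough nodes are free.
        while free < nodes:
            end_time, released = heapq.heappop(running)
            now = max(now, end_time)
            free += released
        starts[name] = now
        free -= nodes
        heapq.heappush(running, (now + duration, nodes))
    return starts


# Hypothetical workload: 9 node-requests competing for 4 nodes.
jobs = [("A", 2, 10), ("B", 2, 5), ("C", 4, 3), ("D", 1, 1)]
for name, start in fifo_schedule(jobs, total_nodes=4).items():
    print(f"job {name} starts at t={start}")
```

Jobs A and B start immediately, C waits until both have finished, and D waits behind C even though a node was idle earlier. Production schedulers add priorities and backfill to reduce that kind of idle time.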

http://slurm.schedmd.com

http://www.ibm.com/developerworks/library/l-slurm-utility/figure3.gif

Flux

§ Family of projects used to build site-customized resource management systems

§ flux-core

— Implements the communication layer and lowest level services and interfaces

§ flux-sched

— Consists of an engine that handles all the functionality common to scheduling

§ capacitor

— A bulk execution manager using flux-core, handles running and monitoring 1000s of jobs

http://flux-framework.github.io

SPACK

§ Handles combinatorial explosion of ABI-incompatible packages

§ All versions coexist, binaries work regardless of user’s environment

§ Familiar syntax, reminiscent of brew, yum, etc.

$ spack install mpileaks                                          unconstrained
$ spack install [email protected]                                  @ custom version
$ spack install [email protected] %[email protected]                % custom compiler
$ spack install [email protected] %[email protected] +threads       +/- build option
$ spack install [email protected] os=SuSE11                        os=
$ spack install [email protected] os=CNL10                         os=
$ spack install [email protected] os=CNL10 target=haswell          target=
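To show how the sigils in these commands compose, here is a minimal sketch of a parser that splits a spec string into its parts. It is not Spack's own parser and covers only the forms shown above (`@`, `%`, `+`/`-`, and `key=value`). The `parse_spec` function is hypothetical, and the version numbers `1.0` and `9.0` in the example are placeholders chosen for illustration.

```python
def parse_spec(spec):
    """Split a spec string into its parts, using only the sigils shown on the slide."""
    parsed = {
        "name": None,
        "version": None,
        "compiler": None,
        "variants": {},
        "settings": {},
    }
    for token in spec.split():
        if token.startswith("%"):
            # % selects a compiler, optionally with its own @version
            parsed["compiler"] = token[1:]
        elif token.startswith("+"):
            # + turns a build option on
            parsed["variants"][token[1:]] = True
        elif token.startswith("-"):
            # - turns a build option off
            parsed["variants"][token[1:]] = False
        elif "=" in token:
            # key=value settings such as os= and target=
            key, value = token.split("=", 1)
            parsed["settings"][key] = value
        else:
            # package name, optionally followed by @version
            name, _, version = token.partition("@")
            parsed["name"] = name
            parsed["version"] = version or None
    return parsed


# Placeholder versions, for illustration only.
example = parse_spec("mpileaks@1.0 %gcc@9.0 +threads target=haswell")
print(example)
```

Each constraint is optional and independent of the others, which is why a bare `mpileaks` is also a valid, fully unconstrained request.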

https://spack.io

ESGF

https://github.com/ESGF

§ Manages the first-ever decentralized database for handling climate science data

§ Multiple petabytes of data at dozens of federated sites worldwide

§ International collaboration for the software that powers most global climate change research

https://esgf.llnl.gov

VisIt

§ Originally developed to visualize and analyze the results of terascale simulations

§ Interactive, scalable visualization, animation, and analysis tool

§ Powerful, easy to use GUI

§ Distributed and parallel architecture allows handling extremely large data sets interactively

https://visit.llnl.gov

https://computation.llnl.gov/casc

https://code.gov/#/explore-code/agencies/DOE

Public US Government GitHub Data Scrape

§ 252 US Government Orgs

— U.S. Federal (137)

— U.S. Military and Intelligence (12)

— U.S. Research Labs (103)

§ 8716 Open Source Repositories

[Pie chart: LLNL 5%; Other US Government 95%]
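The figures on this slide can be cross-checked with a few lines of arithmetic. This sketch assumes the 5% in the pie chart refers to LLNL's share of the 8716 repositories and that the percentage was rounded; the variable names are hypothetical.

```python
# Organization counts by category, as listed on the slide.
org_counts = {
    "U.S. Federal": 137,
    "U.S. Military and Intelligence": 12,
    "U.S. Research Labs": 103,
}
total_orgs = sum(org_counts.values())

total_repos = 8716
llnl_share = 0.05  # assumption: share of repositories, rounded on the slide

# Rough repository count implied by the rounded percentage.
llnl_repos_estimate = round(total_repos * llnl_share)

print(f"organizations: {total_orgs}")
print(f"estimated LLNL repositories: about {llnl_repos_estimate}")
print(f"average repositories per organization: {total_repos / total_orgs:.1f}")
```

The three categories sum to the 252 organizations stated, and the rounded 5% share corresponds to roughly 436 repositories, well above the average of about 35 per organization.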

https://github.com/LLNL/scraper/pull/3