0 to 60 With Intel® HPC Orchestrator HPC Advisory Council Stanford Conference — February 7 & 8, 2017

Steve Jones, Director, High Performance Computing Center, Stanford University
David N. Lombard, Chief Software Architect, Intel® HPC Orchestrator; Senior Principal Engineer, Extreme Scale Computing, Intel
Dellarontay Readus, HPC Software Engineer and CS Undergraduate, Stanford University

Legal Notices and Disclaimers

• Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer.
• No computer system can be absolutely secure.
• Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance.
• Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.
• No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
• Intel, the Intel logo and others are trademarks of Intel Corporation in the U.S. and/or other countries.
• *Other names and brands may be claimed as the property of others.
• © 2017 Intel Corporation.

Agenda

• Demo Outline
• Intel® HPC Orchestrator and OpenHPC
• Installation Walkthrough
• Intel® Parallel Studio XE
• Modules Support
• Management and Maintenance
• Workload Management
• 3rd-Party Packages
• Demo of Built Cluster
• Moving Forward

Faster Compute: performance and capability over time; the Intel® Scalable System Architecture continues the pace of Moore's Law. A systems approach for innovation.

Remove Bottlenecks: develop new technologies across compute, memory, storage, fabric, and software under the Intel® Scalable System Framework (Intel® SSF). Tighter Integration: fabric, PCIe, memory, graphics, and cores, for better power, performance, density, scaling, and cost; achieve full potential. Execution of Vision: Intel® HPC Orchestrator products are the realization of the software portion of Intel® Scalable System Framework.

Intel® HPC Orchestrator

Intel® Scalable System Framework (SSF): A Holistic Design Solution for All HPC
• Small clusters through supercomputers
• Compute and data-centric computing
• Standards-based programmability
• On-premise and cloud-based

OpenHPC
• Open source community under the Linux Foundation
• Ecosystem innovation building a consistent HPC SW platform
• Platform agnostic
• 30 global members
• Multiple distributions

Intel® HPC Orchestrator: Intel HW Validation, SW Alignment & Technical Support
• Intel's distribution of OpenHPC
• Intel HW validation
• Expose best performance for Intel HW
• Advanced testing & premium features
• Product technical support & updates

What is OpenHPC?

OpenHPC is a community effort endeavoring to:
• provide collection(s) of pre-packaged components that can be used to help install and manage flexible HPC systems throughout their lifecycle
• leverage the standard Linux delivery model to retain admin familiarity (i.e., package repos); see the repository sketch after Figure 1 below

• allow and promote multiple system configuration recipes that leverage community reference designs and best practices
• implement integration testing to gain validation confidence
• provide an additional distribution mechanism for groups releasing open-source software
• provide a stable platform for new R&D initiatives

Figure 1 (courtesy of OpenHPC*; from the Install Guide - CentOS7.1 Version (v1.0)): Overview of physical cluster architecture, showing the master (SMS) node, compute nodes, Lustre* storage system, high-speed network, data center network, and per-node eth/BMC interfaces (eth0/eth1, tcp networking). *Other names and brands may be claimed as the property of others.
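As a minimal illustration of the package-repo delivery model mentioned above (not taken from the slides), enabling the OpenHPC repository on a CentOS 7 master (SMS) node and installing base meta-packages might look like the sketch below; the ohpc-release RPM location and the exact package names are assumptions modeled on OpenHPC 1.x conventions and should be checked against the Install Guide for your version.

```bash
# Sketch only: register the OpenHPC yum repository via the ohpc-release RPM
# (download it from the location given in the Install Guide for your version).
yum -y install ./ohpc-release-*.rpm

# Base administrative tools and the Warewulf provisioning stack, delivered as
# ordinary yum meta-packages from the OpenHPC repo (names per OpenHPC 1.x).
yum -y install ohpc-base ohpc-warewulf

# Confirm the new repository is visible to the package manager.
yum repolist
```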

…the local data center network and eth1 is used to provision and manage the cluster backend (note that these interface names are examples and may be different depending on local settings and OS conventions). Two logical IP interfaces are expected for each compute node: the first is the standard Ethernet interface that will be used for provisioning and resource management. The second is used to connect to each host's BMC and is used for power management and remote console access. Physical connectivity for these two logical IP networks is often accommodated via separate cabling and switching infrastructure; however, an alternate configuration can also be accommodated via the use of a shared NIC, which runs a packet filter to divert management packets between the host and BMC. In addition to the IP networking, there is a high-speed network (InfiniBand in this recipe) that is also connected to each of the hosts. This high-speed network is used for application message passing and optionally for Lustre connectivity as well.
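To make the BMC network concrete, a hedged example of out-of-band power control and console access over that second logical interface is sketched below; ${c_bmc[0]}, ${bmc_username}, and ${bmc_password} are placeholders for the per-node BMC address and credentials gathered as recipe inputs (see section 1.4), not values from this document.

```bash
# Illustrative only: query and control a compute node's power state and open a
# serial-over-LAN console through its BMC, using placeholder input variables.
ipmitool -I lanplus -H ${c_bmc[0]} -U ${bmc_username} -P ${bmc_password} power status
ipmitool -I lanplus -H ${c_bmc[0]} -U ${bmc_username} -P ${bmc_password} power on
ipmitool -I lanplus -H ${c_bmc[0]} -U ${bmc_username} -P ${bmc_password} sol activate   # remote console
```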

1.3 Bring your own license

OpenHPC provides a variety of integrated, pre-packaged elements from the Intel® Parallel Studio XE software suite. Portions of the runtimes provided by the included compilers and MPI components are freely usable. However, in order to compile new binaries using these tools (or access other analysis tools like Intel® Trace Analyzer and Collector, Intel® Inspector, etc.), you will need to provide your own valid license; OpenHPC adopts a bring-your-own-license model. Note that licenses are provided free of charge for many categories of use. In particular, licenses for compilers and development tools are provided at no cost to academic researchers or developers contributing to open-source software projects. More information on this program can be found at:

https://software.intel.com/en-us/qualify-for-free-software
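As a sketch of the bring-your-own-license model in practice (assuming a FLEXlm-style site license server and the Intel compiler environment exposed as an environment module; the server name, port, and module name below are made up, not from this document), pointing the tools at your license might look like:

```bash
# Hypothetical site license server; replace with your own license file or server.
export INTEL_LICENSE_FILE=28518@licserver.example.edu

# The freely usable runtimes need no license, but compiling new binaries does.
module load intel            # module name assumed per the packaged toolchain
icc -O2 -xHost hello.c -o hello
```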

1.4 Inputs

As this recipe details installing a cluster starting from bare metal, there is a requirement to define IP addresses and gather hardware MAC addresses in order to support a controlled provisioning process. These values are necessarily unique to the hardware being used, and this document uses variable substitution (${variable}) in the command-line examples that follow to highlight where local site inputs are required. A summary of the required and optional variables used throughout this recipe is presented below. Note …
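A hedged sketch of what those site inputs can look like is shown below; the variable names follow the recipe's ${variable} convention (modeled loosely on the OpenHPC input.local style), and every address and credential is fictional.

```bash
# Master (SMS) node identity and the NIC used to provision the compute backend.
sms_name=sms-host
sms_ip=192.168.1.1
internal_netmask=255.255.255.0
eth_provision=eth1

# Per-compute-node values: provisioning IP, hardware MAC, and BMC address.
num_computes=2
c_name[0]=c1; c_ip[0]=192.168.1.10; c_mac[0]=00:11:22:33:44:55; c_bmc[0]=192.168.2.10
c_name[1]=c2; c_ip[1]=192.168.1.11; c_mac[1]=00:11:22:33:44:56; c_bmc[1]=192.168.2.11

# BMC credentials used for power management and remote console access.
bmc_username=admin
bmc_password=changeme
```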

Participation as of January '17

Official Members

Mixture of Academics, Labs, OEMs, and ISVs/OSVs

Courtesy of OpenHPC*. Project member participation interest? Please contact Kevlin Husser or Jeff ErnstFriedman. *Other names and brands may be claimed as the property of others.

Individuals

• Reese Baird, Intel (Maintainer)
• Pavan Balaji, Argonne National Laboratory (Maintainer)
• David Brayford, LRZ (Maintainer)
• Todd Gamblin, Lawrence Livermore National Labs (Maintainer)
• Craig Gardner, SUSE (Maintainer)
• Yiannis Georgiou, ATOS (Maintainer)
• Balazs Gerofi, RIKEN (Component Development Representative)
• Jennifer Green, Los Alamos National Laboratory (Maintainer)
• Eric Van Hensbergen, ARM (Maintainer, Testing Coordinator)
• Thomas Sterling, Indiana University (Component Development Representative)
• Craig Stewart, Indiana University (End-User/Site Representative)
• Scott Suchyta, Altair (Maintainer)
• Nirmala Sundararajan, Dell (Maintainer)
• Douglas