high-performance computing

Intel Cluster Ready and Platform Open Cluster Stack: Clusters Made Simple

The Intel® Cluster Ready program is designed to provide a common standard for high-performance computing (HPC) clusters, helping organizations design and build seamless, compatible configurations. Integrating the standards and tools provided by this program with Platform™ Open Cluster Stack and certified Dell™ clusters can help significantly simplify the deployment and management of HPC clusters.

As the cost of high-performance computing (HPC) falls, a growing range of problems have become economical to solve with compute clusters based on commodity hardware components. These problems range from risk management for insurance portfolios to design optimization and durability studies on automobile and aerospace components, which can now be solved with levels of compute power formerly reserved for only the biggest and most costly problems. Even in areas where non-cluster HPC systems have been in use for some time, such as oil and gas reservoir simulation and life sciences clinical studies, commodity hardware enables analysis of larger problem sizes with more fidelity than was previously economically feasible.

According to IDC, the HPC technical server market, which is 50 percent compute clusters, is growing dramatically in the divisional, departmental, and workgroup areas. This growth is largely attributable to the use of commodity components, which have dramatically reduced the price/performance ratio of these systems. However, IDC still expresses concern that adoption may be limited, because “clusters are increasingly complex to deploy and manage” and require advanced or specialist skill sets for IT personnel.¹

The source of this complexity is the cluster architecture itself. Cluster servers are fundamentally different from traditional symmetric multiprocessing servers, because they comprise individual units for processing and storage. In addition, using clusters for either multiple jobs or to run software applications that can use multiple processors simultaneously requires high-speed interconnects as well as workload management middleware. Because of these differences, clusters also require different approaches to specification, installation, and management than are used in traditional IT environments.

Until recently, it was difficult to ensure that HPC clusters met a minimum set of standards: each cluster may have had different hardware and software components, and the resulting combinations may or may not have functioned in the same way. To help avoid this problem, Intel has collaboratively developed the Intel Cluster Ready program and technology package with original equipment manufacturers, channel members, and independent software vendors (ISVs). By taking advantage of the standards and tools provided by this program, and combining them with Platform Open Cluster Stack (OCS) software and certified Dell HPC clusters, organizations can help significantly simplify the deployment and management of HPC clusters.

This product includes software developed by the Rocks™ Cluster Group at the San Diego Supercomputer Center at the University of California, San Diego, and its contributors.

¹ “Intel Cluster Ready,” by IDC, Doc #207312, June 2007.

Reprinted from Dell Power Solutions, November 2007. Copyright © 2007 Dell Inc. All rights reserved.

Introducing Intel Cluster Ready and Intel Cluster Checker

Intel and its partners have created the Intel Cluster Ready program to help simplify the definition, acquisition, installation, and management of HPC clusters for organizations without prior experience in cluster computing and those working to increase their technical computing capacity. By incorporating certified hardware, cluster system software, application software, and cluster-ready configurations, this program helps reduce both deployment time and total cost of ownership, both of which can be critical in environments where HPC applications are delivering essential competitive and strategic advantages.

Intel Cluster Ready provides a reference specification for ISVs and system builders to help validate HPC clusters as well as a set of configurations describing in detail how to combine components to create an Intel Cluster Ready–certified cluster. For IT organizations, the key feature of this program is that it specifies a common basis for clusters, allowing them to select from a variety of hardware and software components based on their cluster’s purpose and helping ensure that ISV applications that work on one certified cluster can also run reliably on a different certified cluster. This common basis significantly simplifies the processes of designing, building, acquiring, and deploying clusters based on Intel components, increasing flexibility and helping reduce total cost of ownership.

The Intel Cluster Ready specification is a key part of the program, but the program consists of more than just documentation. The Intel Cluster Checker, a script-based tool that performs direct computational tests and measurements, helps both vendors and IT organizations ensure conformance to the specification, provides an objective measure of system performance, and can assist in troubleshooting. Figure 1 illustrates the architecture of the Intel Cluster Checker engine. This tool is designed to significantly reduce deployment time while increasing uptime for certified clusters.

If a cluster passes all of the Cluster Checker tests, it is considered Intel Cluster Ready certified. Organizations can also use this tool to help ensure that the cluster continues operating properly and within the specification simply by running tests periodically on the cluster and comparing the results with those of previous tests. Doing so helps detect deviations from the original cluster certification to help ensure that the cluster remains certified.

Combining Platform OCS and certified Dell HPC clusters

Platform OCS is a pre-integrated, vendor-certified, modular software stack designed to streamline the deployment and management of clusters running the Linux® OS. Backed by available global 24/7 enterprise support, it transparently integrates open source and commercial software into a single consistent cluster operating environment. Platform OCS helps simplify the deployment and management of Intel Cluster Ready–certified clusters by installing and configuring Intel Cluster Ready software components on Dell HPC platforms (see Figure 2).

Platform Load Sharing Facility (LSF®) HPC, a powerful, comprehensive, policy-driven workload management application for engineering and scientific environments, works in conjunction with Platform OCS to intelligently schedule parallel and serial workloads, helping maximize available computing resources. By utilizing hardware-specific integrations, Platform LSF HPC and Platform OCS also enable organizations to take advantage of the high-performance network interconnects available on clustered systems.

[Diagram labels: cluster definition and configuration XML file; Intel Cluster Checker engine; application programming interface; test modules (configuration, output, result; parallel operations; check); nodes; standard output and log file (pass/fail results and diagnostics).]
Figure 1. Intel Cluster Checker engine architecture
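The periodic re-check described in this article (rerun the tests, then compare the outcome with earlier results) can be illustrated with a short Python sketch. It does not call or reproduce the Intel Cluster Checker itself: the result format (a mapping from test name to pass/fail), the test names, and the function names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical result format: test name -> True (pass) or False (fail).
# The real Intel Cluster Checker writes its own output and log files;
# this dictionary is only a stand-in for illustration.
Results = Dict[str, bool]


def is_certifiable(results: Results) -> bool:
    """A run counts as passing only if it is non-empty and every test passed."""
    return bool(results) and all(results.values())


def find_deviations(baseline: Results, current: Results) -> List[str]:
    """Return the sorted names of tests whose outcome differs from the
    baseline, including tests present in only one of the two runs."""
    names = set(baseline) | set(current)
    return sorted(n for n in names if baseline.get(n) != current.get(n))


if __name__ == "__main__":
    # Made-up test names; not taken from the actual tool.
    baseline = {"node_uniformity": True, "interconnect": True, "mpi_hello": True}
    current = {"node_uniformity": True, "interconnect": False, "mpi_hello": True}

    print(is_certifiable(baseline))
    print(is_certifiable(current))
    print(find_deviations(baseline, current))
```

Keeping the baseline from the original certification run and diffing every later run against it is one simple way to surface drift; a real deployment would read the tool's actual output rather than hand-built dictionaries.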


Combining Platform software with Dell hardware allows organizations to easily create and deploy HPC clusters using industry-standard components. HPC clusters incorporate a diverse array of hardware components, and the appropriate choice of computers, processors, memory, hard drives, storage arrays, network devices, cabling, switches, and power supplies depends on the cluster’s purpose. Organizations must carefully match the hardware with their goal to help achieve the desired performance. Hardware from Dell, a leader in HPC, can provide a standard, reliable reference platform when building clusters.

Because Platform OCS is already a part of several Intel Cluster Ready configurations, using it helps alleviate the need for organizations to assemble, configure, and install the necessary components either manually or by using other cluster management tools. They can use Platform OCS to quickly install and configure Intel Cluster Ready–certified Dell HPC clusters with the following components:

• Certified Dell hardware
• A Message Passing Interface (MPI) implementation such as Open MPI or Intel MPI Library
• The Intel Runtime Library, including the Intel MPI Library Runtime Environment
• The OpenFabrics Enterprise Distribution stack (optional)

[Diagram labels: Intel Cluster Checker; application registered with Intel Cluster Ready program; Intel C++ Compiler and Intel Fortran Compiler; Intel MPI Library Runtime Environment; Intel Runtime Library; Direct Access Programming Library; Math Kernel Library; OpenFabrics Enterprise Distribution; cluster installer; Linux OS; OS kernel; distributed file system; message fabric; cluster hardware node architecture.]
Figure 2. Intel Cluster Ready components installed and configured by Platform OCS

For example, an Intel Cluster Ready configuration might include Platform OCS with the Intel Cluster Checker Roll as well as the following hardware components:

• One Dell PowerEdge™ 2950 server as the front-end node
• 12 Dell PowerEdge 1950 servers as the compute nodes
• One 16-port Dell PowerConnect™ switch
• KVM (keyboard, video, mouse) over IP switch, cables, server rack, cable management system, and power distribution system

Building a standard for simplified cluster deployment

The Intel Cluster Ready program is designed to let organizations easily deploy and manage HPC clusters, helping eliminate the need to create custom implementations in which they must individually install and configure each application while modifying the cluster hardware and system software to meet these applications’ requirements. This program also helps significantly simplify the work required by commercial and noncommercial application vendors, who can focus on certifying their applications for Intel Cluster Ready configurations rather than on porting and configuring the applications for different potential combinations of hardware and software. By integrating the Platform OCS software stack with Intel Cluster Ready–certified Dell HPC clusters, organizations can easily create seamless cluster environments to help meet their HPC requirements.

QUICK LINKS

Intel Cluster Ready: www.intel.com/go/cluster
Platform OCS: www.platform.com/Products/Platform.OCS
Dell HPC cluster solutions: DELL.COM/HPCC
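The example hardware configuration listed earlier in this article (one front-end node, 12 compute nodes, one 16-port switch) can be written down as data and sanity-checked. The Python sketch below is only an illustration: the class and field names are hypothetical, and the assumption that each node occupies exactly one switch port is made here for the arithmetic, since the article does not describe the cabling layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterConfig:
    """Hypothetical description of a small cluster; not an Intel or Platform format."""
    front_end_model: str
    compute_model: str
    compute_nodes: int
    switch_ports: int
    front_end_nodes: int = 1

    @property
    def total_nodes(self) -> int:
        # Front-end node(s) plus compute nodes.
        return self.front_end_nodes + self.compute_nodes

    def spare_ports(self, ports_per_node: int = 1) -> int:
        # Assumption: every node uses the same number of switch ports.
        # A negative value means the switch is too small for the cluster.
        return self.switch_ports - self.total_nodes * ports_per_node


# Values taken from the example configuration in the article.
example = ClusterConfig(
    front_end_model="Dell PowerEdge 2950",
    compute_model="Dell PowerEdge 1950",
    compute_nodes=12,
    switch_ports=16,
)

if __name__ == "__main__":
    print(example.total_nodes)    # 13
    print(example.spare_ports())  # 3
```

Under the one-port-per-node assumption, the 13 nodes fit on the 16-port switch with three ports to spare.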

