Scyld ClusterWare HPC


Scyld ClusterWare HPC: User's Guide
Revised Edition
Published April 4, 2010

Copyright © 1999-2010 Penguin Computing, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording or otherwise) without the prior written permission of the publisher. Scyld ClusterWare, the Highly Scyld logo, and the Penguin Computing logo are trademarks of Penguin Computing, Inc. All other trademarks and copyrights referred to are the property of their respective owners.

Table of Contents

Preface
    Feedback
1. Scyld ClusterWare Overview
    What Is a Beowulf Cluster?
        A Brief History of the Beowulf
        First-Generation Beowulf Clusters
        Scyld ClusterWare: A New Generation of Beowulf
    Scyld ClusterWare Technical Summary
        Top-Level Features of Scyld ClusterWare
        Process Space Migration Technology
        Compute Node Provisioning
        Compute Node Categories
        Compute Node States
        Major Software Components
    Typical Applications of Scyld ClusterWare
2. Interacting With the System
    Verifying the Availability of Nodes
    Monitoring Node Status
        The BeoStatus GUI Tool
            BeoStatus Node Information
            BeoStatus Update Intervals
            BeoStatus in Text Mode
        The bpstat Command Line Tool
        The beostat Command Line Tool
    Issuing Commands
        Commands on the Master Node
        Commands on the Compute Node
            Examples for Using bpsh
            Formatting bpsh Output
            bpsh and Shell Interaction
    Copying Data to the Compute Nodes
        Sharing Data via NFS
        Copying Data via bpcp
        Programmatic Data Transfer
        Data Transfer by Migration
    Monitoring and Controlling Processes
3. Running Programs
    Program Execution Concepts
        Stand-Alone Computer vs. Scyld Cluster
        Traditional Beowulf Cluster vs. Scyld Cluster
        Program Execution Examples
    Environment Modules
    Running Programs That Are Not Parallelized
        Starting and Migrating Programs to Compute Nodes (bpsh)
        Copying Information to Compute Nodes (bpcp)
    Running Parallel Programs
        An Introduction to Parallel Programming APIs
            MPI
            PVM
            Custom APIs
        Mapping Jobs to Compute Nodes
        Running MPICH Programs
            mpirun
            Setting Mapping Parameters from Within a Program
            Examples
        Running OpenMPI Programs
            Pre-Requisites to Running OpenMPI
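The chapters above center on launching work across the cluster: serially with bpsh, and in parallel with mpirun as covered in "Running MPICH Programs" and "Running OpenMPI Programs". For orientation, here is a minimal MPI "hello world" in C of the kind those chapters launch with mpirun; it is an illustrative sketch, and the compile and launch commands in the header comment (mpicc, mpirun -np 4) are generic MPI usage assumed for the example, not commands taken from this extract.

    /* hello_mpi.c: a minimal MPI program of the kind the guide's
     * "Running MPICH Programs" chapter launches with mpirun.
     * Illustrative sketch; assumed compile and launch commands:
     *   mpicc hello_mpi.c -o hello_mpi
     *   mpirun -np 4 ./hello_mpi
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size, name_len;
        char node_name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);                       /* start the MPI runtime */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);         /* this process's rank */
        MPI_Comm_size(MPI_COMM_WORLD, &size);         /* total ranks in the job */
        MPI_Get_processor_name(node_name, &name_len); /* node this rank runs on */

        printf("Hello from rank %d of %d on node %s\n", rank, size, node_name);

        MPI_Finalize();                               /* shut down cleanly */
        return 0;
    }

Each rank prints the node it landed on, which makes the rank-to-node placement discussed in "Mapping Jobs to Compute Nodes" directly visible.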
Recommended publications
  • Department of Energy National Laboratories and Plants: Leadership in Cloud Computing (Brochure), U.S. Department of Energy (DOE)
    Department of Energy National Laboratories and Plants: Leadership in Cloud Computing. Prepared by the National Renewable Energy Laboratory (NREL), a national laboratory of the U.S. Department of Energy, Office of Energy Efficiency and Renewable Energy; NREL is operated by the Alliance for Sustainable Energy, LLC. The brochure opens with an overview of the DOE laboratories and plants and of the report itself, covers the history of computing, the evolution of computing models, what cloud computing is, cloud security, and RightPath (the DOE/NNSA cloud strategy), and then profiles individual sites, including the current state, cloud vision, and key initiatives of the Nevada National Security Site, the current state and future work of Oak Ridge National Laboratory, and the vision of Pacific Northwest National Laboratory.
  • Intel® Technologies Advance Supercomputing for U.S. Security
    White Paper, Government High Performance Computing: Intel® Technologies Advance Supercomputing at National Laboratories for U.S. Security. Penguin clusters with Intel® Omni-Path Architecture and Intel® Xeon® processors deliver enhanced performance and scalability for the Department of Energy's workhorse supercomputers, the Commodity Technology Systems. Executive summary: America's nuclear stockpile of weapons is aging, yet the weapons must be certified as effective without physical testing. That is the challenge of the Department of Energy's National Nuclear Security Administration (NNSA). Without underground testing, scientists must test, validate, and certify the nuclear stockpile using a combination of focused above-ground experiments, computational science, and simulations that run on large computing clusters. The Commodity Technology System 1 (CTS-1) deployment has over 15 Penguin Computing, Inc. supercomputer clusters consisting of 83 Tundra* Scalable Units (SUs). These clusters incorporate Intel® Xeon® processors, many other Intel® technologies, and Asetek* DCLC direct chip cooling. All clusters began deployment at the NNSA's tri-labs in early 2016, providing the computational power for scientists to carry …
  • SPECIAL STUDY D2 Interim Report: Development of a Supercomputing Strategy in Europe
    SPECIAL STUDY, D2 Interim Report: Development of a Supercomputing Strategy in Europe (SMART 2009/0055, Contract Number 2009/S99-142914). Authors (IDC): Earl C. Joseph, Ph.D., Steve Conway, Chris Ingle, Gabriella Cattaneo, Cyril Meunier, and Nathaniel Martinez. Deliverable: D2 Interim Report, version 2.0, delivered July 7, 2010. Addressee: Bernhard Fabianek, Performance Computing Officer, European Commission, DG Information Society, Unit F03, office BU25 04/087, 25 Avenue Beaulieu, B-1160 Bruxelles. Contract ref.: Contract Nr 2009/S99-142914. Filing information: April 2010, IDC #IDCWP12S, Special Study. IDC OPINION: This is the Interim Report (Deliverable D2) of the study "Development of a Supercomputing Strategy in Europe" by IDC, the multinational market research and consulting company specialized in the ICT markets, on behalf of DG Information Society and Media of the European Commission. This D2 Interim Report presents the main results of the supercomputer market and industry analysis and an overview of the main technology trends and requirements (WP1 and WP2 of the study workplan). With the terms supercomputers or HPC (High Performance Computers) in this study, we refer to all technical computing servers and clusters used to solve problems that are computationally intensive or data intensive, excluding desktop computers.
  • Scyld Clusterware Documentation Release 7.6.2
    Scyld ClusterWare Documentation, Release 7.6.2. Penguin Computing, June 11, 2019. Contents:
    1. Release Notes
        1.1 Release Notes: Scyld ClusterWare Release v7.6.2-762g0000
            1.1.1 About This Release
                Important: Recommend using /usr/sbin/install-scyld script
                Important for clusters using 3rd-party drivers or applications
                Important for clusters using Panasas storage
            1.1.2 First Installation of Scyld ClusterWare 7 On A Server
            1.1.3 Upgrading An Earlier Release of Scyld ClusterWare 7 to 7.6
            1.1.4 Post-Installation Configuration Issues
                Resolve *.rpmnew and *.rpmsave configuration file differences
                Disable SELinux and NetworkManager
                Edit /etc/beowulf/conf.d/sysctl.conf as needed
                Optionally reduce size of /usr/lib/locale/locale-archive
                Optionally configure and enable compute node CPU speed/power management
                Optionally install a different TORQUE package
                Optionally enable job manager
                Optionally enable TORQUE scheduler
                Optionally enable Ganglia monitoring tool
                Optionally enable NFS locking
                Optionally adjust the size limit for locked memory
                Optionally increase the max number of processes per user
  • March Transcript
    CHINA’S PURSUIT OF NEXT FRONTIER TECH: COMPUTING, ROBOTICS, AND BIOTECHNOLOGY. Hearing before the U.S.-China Economic and Security Review Commission, One Hundred Fifteenth Congress, First Session, Thursday, March 16, 2017. Printed for use of the United States-China Economic and Security Review Commission; available via the World Wide Web: www.uscc.gov. United States-China Economic and Security Review Commission, Washington: 2017. Carolyn Bartholomew, Chairman; Hon. Dennis C. Shea, Vice Chairman. Commissioners: Robin Cleveland, Jonathan N. Stivers, Sen. Byron L. Dorgan, Sen. James Talent, Sen. Carte P. Goodwin, Dr. Katherine C. Tobin, Robert G. Hubbard, Michael R. Wessel, Daniel M. Slane, and Dr. Larry M. Wortzel. Michael R. Danis, Executive Director. The Commission was created on October 30, 2000 by the Floyd D. Spence National Defense Authorization Act for 2001 § 1238, Public Law No. 106-398, 114 STAT. 1654A-334 (2000) (codified at 22 U.S.C. § 7002 (2001)); as amended by the Treasury and General Government Appropriations Act for 2002 § 645 (regarding employment status of staff) and § 648 (regarding changing the annual report due date from March to June), Public Law No. 107-67, 115 STAT. 514 (Nov. 12, 2001); as amended by Division P of the “Consolidated Appropriations Resolution, 2003,” Pub. L. No. 108-7 (Feb. 20, 2003) (regarding Commission name change, terms of Commissioners, and responsibilities of the Commission); as amended by Public Law No. 109-108 (H.R. 2862) (Nov. 22, 2005) (regarding responsibilities of the Commission and applicability of FACA); as amended by Division J of the “Consolidated Appropriations Act, 2008,” Public Law No. …