Scyld Clusterware HPC
Total Page:16
File Type:pdf, Size:1020Kb
Scyld ClusterWare HPC User’s Guide Scyld ClusterWare HPC: User’s Guide Revised Edition Published April 4, 2010 Copyright © 1999 - 2010 Penguin Computing, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording or otherwise) without the prior written permission of the publisher. Scyld ClusterWare, the Highly Scyld logo, and the Penguin Computing logo are trademarks of Penguin Computing, Inc. All other trademarks and copyrights referred to are the property of their respective owners. Table of Contents Preface .....................................................................................................................................................................................v Feedback.........................................................................................................................................................................v 1. Scyld ClusterWare Overview ............................................................................................................................................1 What Is a Beowulf Cluster?............................................................................................................................................1 A Brief History of the Beowulf.............................................................................................................................1 First-Generation Beowulf Clusters .......................................................................................................................2 Scyld ClusterWare: A New Generation of Beowulf..............................................................................................3 Scyld ClusterWare Technical Summary .........................................................................................................................3 Top-Level Features of Scyld ClusterWare ............................................................................................................3 Process Space Migration Technology ...................................................................................................................5 Compute Node Provisioning.................................................................................................................................5 Compute Node Categories ....................................................................................................................................5 Compute Node States............................................................................................................................................5 Major Software Components ................................................................................................................................6 Typical Applications of Scyld ClusterWare....................................................................................................................7 2. Interacting With the System..............................................................................................................................................9 Verifying the Availability of Nodes ................................................................................................................................9 Monitoring Node Status..................................................................................................................................................9 The BeoStatus GUI Tool.......................................................................................................................................9 BeoStatus Node Information .....................................................................................................................10 BeoStatus Update Intervals .......................................................................................................................10 BeoStatus in Text Mode ............................................................................................................................11 The bpstat Command Line Tool..........................................................................................................................11 The beostat Command Line Tool........................................................................................................................12 Issuing Commands........................................................................................................................................................14 Commands on the Master Node..........................................................................................................................14 Commands on the Compute Node ......................................................................................................................14 Examples for Using bpsh...........................................................................................................................14 Formatting bpsh Output.............................................................................................................................15 bpsh and Shell Interaction .........................................................................................................................16 Copying Data to the Compute Nodes ...........................................................................................................................17 Sharing Data via NFS .........................................................................................................................................17 Copying Data via bpcp........................................................................................................................................17 Programmatic Data Transfer ...............................................................................................................................18 Data Transfer by Migration.................................................................................................................................18 Monitoring and Controlling Processes .........................................................................................................................18 3. Running Programs ...........................................................................................................................................................21 Program Execution Concepts .......................................................................................................................................21 Stand-Alone Computer vs. Scyld Cluster ...........................................................................................................21 Traditional Beowulf Cluster vs. Scyld Cluster....................................................................................................21 Program Execution Examples.............................................................................................................................22 Environment Modules...................................................................................................................................................24 Running Programs That Are Not Parallelized ..............................................................................................................24 Starting and Migrating Programs to Compute Nodes (bpsh)..............................................................................25 Copying Information to Compute Nodes (bpcp) ................................................................................................25 Running Parallel Programs ...........................................................................................................................................26 An Introduction to Parallel Programming APIs..................................................................................................26 iii MPI............................................................................................................................................................27 PVM ..........................................................................................................................................................28 Custom APIs..............................................................................................................................................28 Mapping Jobs to Compute Nodes .......................................................................................................................28 Running MPICH Programs.................................................................................................................................29 mpirun........................................................................................................................................................29 Setting Mapping Parameters from Within a Program ...............................................................................29 Examples ...................................................................................................................................................30 Running OpenMPI Programs..............................................................................................................................30 Pre-Requisites to Running OpenMPI ........................................................................................................31