R

Managing System Software for the Cray Linux Environment™

S–2393–5003 © 2005, 2006-2013 Cray Inc. All Rights Reserved. This document or parts thereof may not be reproduced in any form unless permitted by contract or by written permission of Cray Inc.

U.S. GOVERNMENT RESTRICTED RIGHTS NOTICE The Computer Software is delivered as "Commercial Computer Software" as defined in DFARS 48 CFR 252.227-7014. All Computer Software and Computer Software Documentation acquired by or for the U.S. Government is provided with Restricted Rights. Use, duplication or disclosure by the U.S. Government is subject to the restrictions described in FAR 48 CFR 52.227-14 or DFARS 48 CFR 252.227-7014, as applicable. Technical Data acquired by or for the U.S. Government, if any, is provided with Limited Rights. Use, duplication or disclosure by the U.S. Government is subject to the restrictions described in FAR 48 CFR 52.227-14 or DFARS 48 CFR 252.227-7013, as applicable.

Cray and Sonexion are federally registered trademarks and Active Manager, Cascade, Cray Apprentice2, Cray Apprentice2 Desktop, Cray C++ Compiling System, Cray CS300, Cray CX, Cray CX1, Cray CX1-iWS, Cray CX1-LC, Cray CX1000, Cray CX1000-C, Cray CX1000-G, Cray CX1000-S, Cray CX1000-SC, Cray CX1000-SM, Cray CX1000-HN, Cray Fortran Compiler, Cray Linux Environment, Cray SHMEM, Cray X1E, Cray XC30, Cray XD1, Cray XE, Cray XEm, Cray XE5, Cray XE5m, Cray XE6, Cray XE6m, Cray XK6, Cray XK6m, Cray XK7, Cray XMT, Cray XR1, Cray XT, Cray XTm, Cray XT3, Cray XT4, Cray XT5, Cray XT5h, Cray XT5m, Cray XT6, Cray XT6m, CrayDoc, CrayPort, CRInform, ECOphlex, LibSci, NodeKARE, RapidArray, The Way to Better Science, Threadstorm, uRiKA, UNICOS/lc, and YarcData are trademarks of Cray Inc.

AMD is a trademark of Advanced Micro Devices, Inc. Aries, Gemini, and Intel are trademarks of Intel Corporation or its subsidiaries in the United States and other countries. DDN is a trademark of DataDirect Networks. Dell and PowerEdge are trademarks of Dell, Inc. Engenio and SANtricity are trademarks of NetApp, Inc. FlexNet is a trademark of Flexera Software. GNU is a trademark of The Free Software Foundation. InfiniBand is a trademark of InfiniBand Trade Association. Java, MySQL Enterprise, MySQL, NFS, and Solaris are trademarks of Oracle and/or its affiliates. Kerberos is a trademark of Massachusetts Institute of Technology. Linux is a trademark of Linus Torvalds. LSF, Platform, Platform Computing, and Platform LSF are trademarks of Platform Computing Corporation. LSI is a trademark of LSI Corporation. Lustre is a trademark of Xyratex and/or its affiliates. Mac, Mac OS, and OS X are trademarks of Apple Inc. Moab is a trademark of Adaptive Computing Enterprises, Inc. Novell, SLES, and SUSE are trademarks of Novell, Inc. NVIDIA and Tesla are trademarks of NVIDIA Corporation. PBS Professional is a trademark of Altair Engineering, Inc. PGI is a trademark of The Portland Group Compiler Technology, STMicroelectronics, Inc. Red Hat is a trademark of Red Hat, Inc. RSA is a trademark of RSA Security Inc. TotalView is a trademark of Rogue Wave Software, Inc. UNIX is a trademark of The Open Group. Whamcloud is a trademark of Whamcloud, Inc. Windows is a trademark of Microsoft Corporation. All other trademarks are the property of their respective owners.

RECORD OF REVISION

S–2393–5003 Published June 2013 Supports the Cray Linux Environment (CLE) 5.0.UP03 release and the System Management Workstation (SMW) 7.0.UP03 release.
S–2393–42 Published April 2013 Supports the Cray Linux Environment (CLE) 4.2 release and the System Management Workstation (SMW) 7.0.UP02 release.
S–2393–5002 Published March 2013 Supports the Cray Linux Environment (CLE) 5.0.UP02 release and the System Management Workstation (SMW) 7.0.UP02 release.
S–2393–4101 Published December 2012 Supports the Cray Linux Environment (CLE) 4.1.UP01 release and the System Management Workstation (SMW) 7.0.UP01 release.
S–2393–5001 Published November 2012 Limited Availability (LA) draft; supports the Cray Linux Environment (CLE) 5.0.UP01 release and the System Management Workstation (SMW) 7.0.UP01 release.
S–2393–41 Published August 2012 Limited Availability (LA) draft; supports the Cray Linux Environment (CLE) 4.1.UP00 LA release and the System Management Workstation (SMW) 7.0.UP00 LA release.
S–2393–4003 Published March 2012 Supports the Cray Linux Environment (CLE) 4.0.UP03 release and the System Management Workstation (SMW) 6.0.UP03 release.
S–2393–4002 Published December 2011 Supports the Cray Linux Environment (CLE) 4.0.UP02 release and the System Management Workstation (SMW) 6.0.UP02 release.
S–2393–4001 Published September 2011 Supports the Cray Linux Environment (CLE) 4.0.UP01 release and the System Management Workstation (SMW) 6.0.UP01 release.
S–2393–4001 Published August 2011 Limited Availability draft; supports the Cray Linux Environment (CLE) 4.0.UP00 release and the System Management Workstation (SMW) 6.0.UP00 release.
S–2393–3102 Published December 2010 Supports the Cray Linux Environment (CLE) 3.1.02 release and the System Management Workstation (SMW) 5.1.02 release.
3.1 Published June 2010 Supports the Cray Linux Environment (CLE) 3.1 release and the System Management Workstation (SMW) 5.1 release.
3.0 Published March 2010 Supports the Cray Linux Environment (CLE) 3.0 release and the System Management Workstation (SMW) 5.0 release.
2.2 Published July 2009 Supports the general availability (GA) release of Cray XT systems running the Cray Linux Environment (CLE) 2.2 release and the general availability (GA) release of the System Management Workstation (SMW) 4.0 release.
2.1 Published November 2008 Supports the general availability (GA) release of Cray XT systems running the Cray Linux Environment (CLE) 2.1 release and the System Management Workstation (SMW) 3.1 release as of the SMW 3.1.09 update.
2.0 Published October 2007 Supports the general availability (GA) release of Cray XT series systems running the Cray XT series Programming Environment 2.0, UNICOS/lc 2.0, and System Management Workstation 3.0.1 releases.
1.5 Published October 2006 Supports the general availability (GA) release of Cray XT series systems running the Cray XT series Programming Environment 1.5, UNICOS/lc 1.5, and System Management Workstation 1.5 releases.
1.4 Published May 2006 Supports Cray XT3 systems running the Cray XT3 Programming Environment 1.4, Cray XT3 RAS and Management System (CRMS) 1.4, and UNICOS/lc 1.4 releases.
1.3 Published November 2005 Supports Cray XT3 systems running the Cray XT3 Programming Environment 1.3, System Management Workstation (SMW) 1.3, and UNICOS/lc 1.3 releases.
1.2 Published September 2005 Supports Cray XT3 systems running the Cray XT3 Programming Environment 1.2, System Management Workstation (SMW) 1.2, and UNICOS/lc 1.2 releases.
1.1 Published June 2005 Supports Cray XT3 systems running the Cray XT3 Programming Environment 1.1, System Management Workstation (SMW) 1.1, and UNICOS/lc 1.1 releases.
1.0 Published February 2005 Draft documentation to support Cray XT3 limited availability (LA) systems.

Changes to this Document


S–2393–5003

Added information

• Figure 11 shows the port assignments for InfiniBand interfaces on IO base blades.

Revised information

• Dumping Information Using the xtdumpsys Command on page 87

• The xtclear_link_alerts command is deprecated for users and is now run only internally by the xtnlrd command. When run internally by xtnlrd, xtclear_link_alerts saves a list of the components that had alerts so that xtnlrd can put the alert flags back onto the components and LCBs in case of failure, ensuring that a warmswap --add or warmswap --sync command completes correctly before any link or blade failures are handled.

• Added the resFullNode, cleanSharedTimeout, cleanSharedAge, and cleanSharedInterval apsched options to The alps.conf Configuration File on page 271 (an illustrative snippet follows this list).

• Configuring Fine-grained Routing with clcvt on page 327
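The following minimal sketch suggests how the new apsched options might appear in an alps.conf apsched stanza. The stanza delimiters, values, and comments are illustrative assumptions, not recommended or default settings; see The alps.conf Configuration File on page 271 and Example 103 for the authoritative format.

    # Illustrative alps.conf excerpt (layout and values are assumptions)
    apsched
        resFullNode          1       # placeholder value; see page 271 for semantics
        cleanSharedTimeout   300     # placeholder value, in seconds
        cleanSharedAge       600     # placeholder value, in seconds
        cleanSharedInterval  900     # placeholder value, in seconds
    /apsched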

S–2393–5002

Added information

• Creating Logical Machines on Cray XC30 Systems on page 190.

• Rebooting Controllers of a Cabinet or Blade on page 71.

• Initiating a Network Discovery Process on page 72.

• Displaying System Network Congestion Protection Information on page 100.

• Configuring Fine-grained Routing with clcvt on page 327.

Revised information

• Table 1. Physical ID Naming Conventions.

• Using the xthotbackup Command to Back Up Boot Root and Shared Root on page 252 has been updated to point to Logical Volume Manager (LVM) backup configuration procedures in Installing and Configuring Cray Linux Environment (CLE) Software.

• Checking the Status of System Components: added a description of the new xtcli status -m option usage.

• Procedure 47 on page 215: added a new step 2, which executes the xtbounce --linktune=all command to tune PCIe and HSN links on the system.

• HSS Commands on page 339.

• Procedure 101 on page 353 was updated to use the new xtccreboot command to reboot all cabinet controllers at once (see the usage sketch after this list).

• Manually Cleaning ALPS and PBS or TORQUE and Moab After Downed Login Node on page 278 was updated to reflect that the procedure applies both to TORQUE and Moab, and to PBS Professional.

• Configuring Node Health Checker (NHC) on page 164 was updated to include a description of the Sigcont Plugin test.

Deleted information

• Deleted the instructions for changing the correctable MCE threshold setting via the rc.local script in Callout to rc.local During Boot on page 193. This method is valid for AMD CPUs only.
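As a rough companion to the Procedure 101 change noted above, the sketch below shows how all cabinet controllers might be rebooted in a single invocation from the SMW. The option spelling and prompt are assumptions; verify against Procedure 101 on page 353 and the HSS Commands appendix before use.

    smw:~> xtccreboot -c all    # assumed syntax: reboot every cabinet controller at once
    smw:~> xtccreboot -b all    # assumed variant: target blade controllers instead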

S–2393–5001

Added information

• Cray XC30 systems use the cdump and crash utilities; see cdump and crash Utilities for Node Memory Dump and Analysis on page 88.

• Monitoring the Health of PCIe Channels on page 100 describes the new xtpcimon command.

• The Node ARP Management Daemon (rca_arpd) on page 190 describes a new daemon that manages the system ARP cache.

• Larger system configuration LNET directives may exceed the 1024 character limit of modprobe.conf entries. Procedure 86 on page 324 allows administrators to source ip2nets and routes information from files to work around this limitation.

Revised information

• Changing the Class of a Node on page 139 has been updated to show that the xtnce -c command must be executed from within xtopview (a brief sketch follows).
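A minimal sketch of that workflow is shown below; the node ID, class name, and prompts are illustrative assumptions, and the authoritative steps are in Changing the Class of a Node on page 139.

    boot:~ # xtopview -m "change class of node 104"    # enter the shared-root view
    default/:/ # xtnce -c login 104                    # assign node 104 to the login class
    default/:/ # exit                                  # leave xtopview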

Contents

Introduction [1] 27 1.1 Audience for This Guide ...... 27 1.2 Cray System Administration Publications ...... 28 1.3 Related Publications ...... 28

Introducing System Components [2] 31 2.1 System Management Workstation (SMW) ...... 33 2.2 Cray Linux Environment (CLE) ...... 34 2.3 Boot Root File System ...... 34 2.4 Shared Root File System ...... 34 2.5 Service Nodes ...... 35 2.5.1 Boot Node ...... 35 2.5.2 Service Database (SDB) Node ...... 36 2.5.3 Syslog Node ...... 36 2.5.4 Login Nodes ...... 36 2.5.5 Network Nodes ...... 37 2.5.6 I/O Nodes ...... 37 2.5.7 Services on the Service Nodes ...... 37 2.5.7.1 Resiliency Communication Agent (RCA) ...... 37 2.5.7.2 Lustre File System ...... 38 2.5.7.3 Cray Data Virtualization Service (Cray DVS) ...... 38 2.5.7.4 Application Level Placement Scheduler (ALPS) for Compute Nodes ...... 39 2.5.7.5 Cluster Compatibility Mode ...... 39 2.5.7.6 Repurposing CNL Compute Nodes as Service Nodes ...... 40 2.5.7.7 IP Implementation ...... 40 2.6 Compute Nodes ...... 40 2.7 Job Launch Commands ...... 42 2.8 Node Health Checker (NHC) ...... 42 2.9 Comprehensive System Accounting (CSA) ...... 42 2.10 Optional Workload-management (Batch) System Software Products ...... 43


2.11 Hardware Supervisory System (HSS) ...... 43 2.11.1 HSS Network ...... 44 2.11.2 HSS Interface ...... 44 2.11.3 Blade Controllers and Cabinet Controllers ...... 45 2.11.4 NTP Server ...... 45 2.11.5 Event Router ...... 46 2.11.6 HSS Managers ...... 46 2.11.6.1 State Manager ...... 47 2.11.6.2 Boot Manager ...... 47 2.11.6.3 System Environmental Data Collections (SEDC) Manager ...... 47 2.11.6.4 NID Manager ...... 48 2.11.7 Automatically Discover and Configure Cray System Hardware ...... 49 2.11.8 Cray System Network Routing Utility ...... 49 2.11.9 Log Files ...... 49 2.11.9.1 Event Logs ...... 50 2.11.9.2 Boot Logs ...... 50 2.11.9.3 Dump Logs ...... 50 2.12 Storage ...... 50 2.13 Other Administrative Information ...... 51 2.13.1 Identifying Components ...... 51 2.13.1.1 Physical ID ...... 51 2.13.1.2 Node ID (NID) ...... 55 2.13.1.3 Class Name ...... 56 2.13.2 Topology Class ...... 56 2.13.3 Persistent /var Directory ...... 56 2.13.4 Default Network IP Addresses ...... 57 2.13.5 /etc/hosts Files ...... 57 2.13.6 Realm-Specific IP Addressing (RSIP) for Compute Nodes ...... 58 2.13.7 Security Auditing ...... 58 2.13.8 Logging Failed Login Attempts ...... 59 2.13.9 Logical Machines ...... 59

Managing the System [3] 61 3.1 Connecting the SMW to the Console of a Service Node ...... 61 3.2 Logging On to the Boot Node ...... 61 3.3 Preparing a Service Node and Compute Node Boot Image ...... 61 3.3.1 Using shell_bootimage_LABEL.sh to Prepare Boot Images ...... 62 3.3.2 Customizing Existing Boot Images ...... 65


3.3.3 Changing Boot Parameters ...... 67 3.4 Booting Nodes ...... 67 3.4.1 Booting the System ...... 67 3.4.2 Using the xtcli boot Command to Boot a Node or Set of Nodes ...... 70 3.4.3 Rebooting a Single Compute Node ...... 71 3.4.4 Rebooting Login or Network Nodes ...... 71 3.5 Rebooting Controllers of a Cabinet or Blade ...... 71 3.6 Requesting and Displaying System Routing ...... 72 3.7 Initiating a Network Discovery Process ...... 72 3.8 Bouncing Blades Repeatedly Until All Blades Succeed ...... 73 3.9 Shutting Down the System Using the auto.xtshutdown File ...... 73 3.10 Shutting Down Service Nodes Using the xtshutdown Command ...... 74 3.11 Shutting Down the System or Part of the System Using the xtcli shutdown Command . . . 75 3.12 Stopping System Components ...... 75 3.12.1 Reserving a Component ...... 76 3.12.2 Powering Down Blades or Cabinets ...... 76 3.12.3 Halting Selected Nodes ...... 77 3.13 Restarting a Blade or Cabinet ...... 77 3.14 Aborting Active Sessions on the HSS Boot Manager ...... 78 3.15 Displaying and Changing Software System Status ...... 78 3.15.1 Displaying the Status of Nodes from the Operating System ...... 78 3.15.2 Viewing and Changing the Status of Nodes ...... 78 3.15.3 Marking a Compute Node as a Service Node ...... 80 3.15.4 Finding Node Information ...... 80 3.15.4.1 Translating Between Physical ID Names and Integer NIDs ...... 80 3.15.4.2 Finding Node Information Using the xtnid2str Command ...... 80 3.15.4.3 Finding Node Information Using the nid2nic Command ...... 81 3.16 Displaying and Changing Hardware System Status ...... 81 3.16.1 Generating HSS Physical IDs ...... 81 3.16.2 Disabling Hardware Components ...... 82 3.16.3 Enabling Hardware Components ...... 82 3.16.4 Setting Components to Empty ...... 83 3.16.5 Locking Components ...... 84 3.16.6 Unlocking Components ...... 84 3.17 Performing Parallel Operations on Service Nodes ...... 84 3.18 Performing Parallel Operations on Compute Nodes ...... 85 3.19 xtbounce Error Message Indicating Cabinet Controller and Its Blade Controllers Not in Sync . . 86


3.20 Handling Bus Errors ...... 86 3.21 Handling Component Failures ...... 87 3.22 Capturing and Analyzing System-level and Node-level Dumps ...... 87 3.22.1 Dumping Information Using the xtdumpsys Command ...... 87 3.22.2 cdump and crash Utilities for Node Memory Dump and Analysis ...... 88 3.22.3 Using dumpd to Automatically Dump and Reboot Nodes ...... 88 3.22.3.1 Enabling dumpd ...... 89 3.22.3.2 /etc/opt/cray-xt-dumpd/dumpd.conf Configuration File . . . . . 90 3.22.3.3 Using the dumpd-dbadmin Tool ...... 91 3.22.3.4 Using the dumpd-request Tool ...... 92 3.23 Using xtnmi Command to Collect Debug Information from Hung Nodes ...... 92

Monitoring System Activity [4] 93 4.1 Monitoring the System with the System Environmental Data Collector (SEDC) ...... 93 4.2 Displaying Installed SMW Release Level ...... 93 4.3 Displaying Current and Installed CLE Release Information ...... 93 4.4 Displaying Boot Configuration Information ...... 94 4.5 Managing Log Files Using CLE and HSS Commands ...... 94 4.5.1 Filtering the Event Log ...... 95 4.5.2 Adding Entries to Log Files ...... 95 4.5.3 Examining Log Files ...... 95 4.5.4 Removing Old Log Files ...... 96 4.6 Checking the Status of System Components ...... 96 4.7 Checking the Status of Compute Processors ...... 97 4.8 Checking CNL Compute Node Connection ...... 98 4.9 Checking Link Control Block and Router Errors ...... 99 4.10 Displaying System Network Congestion Protection Information ...... 100 4.11 Monitoring the Health of PCIe Channels ...... 100 4.12 Monitoring the Status of Jobs Started Under a Third-party Batch System ...... 101 4.13 Using the cray_pam Module to Monitor Failed Login Attempts ...... 101 4.14 Monitoring DDN RAID ...... 101 4.15 Monitoring NetApp, Inc. Engenio RAID ...... 101 4.16 Monitoring HSS Managers ...... 101 4.16.1 Examining Activity on the HSS Boot Manager ...... 101 4.16.2 Polling a Response from an HSS Daemon, Manager, or the Event Router ...... 102 4.17 Monitoring Events ...... 102 4.18 Monitoring Node Console Messages ...... 103 4.19 Showing the Component Alert, Warning, and Location History ...... 103


4.20 Displaying Component Information ...... 103 4.21 Displaying Alerts and Warnings ...... 105 4.22 Clearing Flags ...... 106 4.23 Displaying Error Codes ...... 106

Managing User Access [5] 107 5.1 Load Balancing Across Login Nodes ...... 107 5.2 Passwords ...... 107 5.2.1 Changing Default SMW Passwords After Completing Installation ...... 108 5.2.2 Changing root and crayadm Passwords on Boot and Service Nodes ...... 108 5.2.3 Changing the root Password on CNL Compute Nodes ...... 109 5.2.4 Changing the HSS Data Store (MySQL) Password ...... 109 5.2.5 Changing Default MySQL Passwords on the SDB ...... 110 5.2.6 Assigning and Changing User Passwords ...... 114 5.2.7 Logins That Do Not Require Passwords ...... 114 5.3 Administering Accounts ...... 115 5.3.1 Managing Boot Node Accounts ...... 115 5.3.2 Managing User Accounts That Must Be Maintained on the Cray System Directly . . . . . 115 5.3.2.1 Adding a User or Group ...... 116 5.3.2.2 Removing a User or Group ...... 116 5.3.2.3 Changing User or Group Information ...... 117 5.3.2.4 Assigning Groups of CNL Compute Nodes to a User Group ...... 117 5.3.2.5 Associating Users with Projects ...... 117 5.3.2.6 Enabling LDAP Support for User Authentication ...... 117 5.3.3 Setting Disk Quotas for a User on the Cray Local, Non-Lustre File System ...... 119 5.4 About Modules and Modulefiles ...... 119 5.5 About the /etc/*rc.local Files ...... 120 5.6 System-wide Default Modulefiles ...... 120 5.7 Configuring the Default Programming Environment (PE) ...... 121 5.8 Using the pam_listfile Module in the Shared Root Environment ...... 122 5.9 ulimit Stack Size Limit ...... 122 5.10 Stopping a User's Job ...... 122 5.10.1 Stopping a Job Running in Interactive Mode ...... 123 5.10.2 Stopping a Job Running Under a Batch System ...... 123

Modifying an Installed System [6] 125 6.1 Configuring the Shared-root File System on Service Nodes ...... 125 6.1.1 Specialization ...... 126


6.1.2 Visible Shared-root File System Layout ...... 127 6.1.3 How Specialization Is Implemented ...... 129 6.1.4 Working with the Shared-root File System ...... 130 6.1.4.1 Managing System Configuration with the xtopview Tool...... 131 6.1.4.2 Updating Specialized Files From Within the xtopview Shell ...... 134 6.1.4.3 Specializing Files ...... 134 6.1.4.4 Determining which Files are Specialized ...... 136 6.1.4.5 Checking Shared-root Configuration ...... 138 6.1.4.6 Verifying the Coherency of /etc/init.d Files Across All Shared Root Views . . . 138 6.1.4.7 Cloning a Shared-root Hierarchy ...... 139 6.1.4.8 Changing the Class of a Node ...... 139 6.1.4.9 Removing Specialization ...... 140 6.1.4.10 Displaying RCS Log Information for Shared Root Files ...... 140 6.1.4.11 Checking Out an RCS Version of Shared Root Files ...... 141 6.1.4.12 Listing Shared Root File Specification and Version Information ...... 142 6.1.4.13 Performing Archive Operations on Shared Root Files ...... 143 6.1.5 Logging Shared-root Activity ...... 144 6.2 PBS Professional Licensing Requirements for Cray Systems ...... 144 6.3 Disabling Secure Shell (SSH) on Compute Nodes ...... 144 6.4 Modifying SSH Keys for Compute Nodes ...... 145 6.5 Configuring the System Environmental Data Collector (SEDC) ...... 147 6.6 Configuring Optional RPMs in the CNL Boot Image ...... 147 6.7 Configuring Memory Control Groups ...... 147 6.8 Configuring the Zone Moveable Feature for Compute Nodes ...... 149 6.9 Configuring Cray Enhanced Linux Security Features ...... 149 6.9.1 Security Auditing and Cray Audit Extensions ...... 150 6.9.1.1 Lustre File System Requirements for Cray Audit ...... 154 6.9.1.2 System Performance Considerations for Cray Audit ...... 155 6.9.2 Using the cray_pam PAM to Log Failed Login Attempts ...... 155 6.10 Configuring cron Services ...... 160 6.11 Configuring the Load Balancer ...... 163 6.12 Configuring Node Health Checker (NHC) ...... 164 6.12.1 /etc/opt/cray/nodehealth/nodehealth.conf Configuration File . . . 164 6.12.2 Configuring Node Health Checker Tests ...... 166 6.12.2.1 Guidance About NHC Tests ...... 170 6.12.2.2 NHC Control Variables ...... 173 6.12.2.3 Global Configuration Variables That Affect All NHC Tests ...... 174


6.12.2.4 Standard Variables That Affect Individual NHC Tests ...... 176 6.12.3 Suspect Mode ...... 179 6.12.4 NHC Messages ...... 180 6.12.5 What if a Login Node Crashes While xtcheckhealth Binaries are Monitoring Nodes? . . 181 6.12.6 Disabling NHC ...... 183 6.12.7 nodehealth Modulefile ...... 183 6.12.8 Configuring the Node Health Checker to Use SSL ...... 183 6.13 Activating Process Accounting for Service Nodes ...... 183 6.14 Configuring Failover for Boot and SDB Nodes ...... 184 6.14.1 Configuring Boot-node Failover ...... 184 6.14.2 Configuring SDB Node Failover ...... 188 6.14.3 The Node ARP Management Daemon (rca_arpd) ...... 190 6.15 Creating Logical Machines ...... 190 6.15.1 Creating Logical Machines on Cray XC30 Systems ...... 190 6.15.1.1 Multiple Group Systems ...... 190 6.15.1.2 Single Group, Multiple-chassis Systems ...... 191 6.15.1.3 Single Chassis Systems ...... 191 6.15.2 Configuring a Logical Machine ...... 191 6.15.3 Booting a Logical Machine ...... 192 6.16 Updating Boot Configuration ...... 192 6.17 Modifying Boot Automation Files ...... 193 6.18 Callout to rc.local During Boot ...... 193 6.19 Changing the System Software Version to be Booted ...... 193 6.19.1 Minor Release Switching Within a System Set ...... 194 6.19.2 Major Release Switching using Separate System Sets ...... 194 6.20 Changing the Service Database (SDB) ...... 195 6.20.1 Service Database Tables ...... 196 6.20.2 Database Security ...... 197 6.20.3 Updating Database Tables ...... 197 6.20.3.1 Changing Nodes and Classes ...... 199 6.21 Viewing the Service Database Contents with MySQL Commands ...... 200 6.22 Configuring the Lustre File System ...... 201 6.23 Enabling File-locking for Lustre Clients ...... 202 6.24 Configuring Cray Data Virtualization Service (Cray DVS) ...... 202 6.25 Setting and Viewing Node Attributes ...... 202 6.25.1 Setting Node Attributes Using the /etc/opt/cray/sdb/attr.xthwinv.xml and /etc/opt/cray/sdb/attr.defaults Files ...... 202 6.25.1.1 Generating the /etc/opt/cray/sdb/attributes File ...... 203


6.25.2 SDB attributes Table ...... 204 6.25.3 Setting Attributes Using the xtprocadmin Command ...... 205 6.25.4 Viewing Node Attributes ...... 205 6.26 Using the XTAdmin Database segment Table ...... 205 6.27 Configuring Networking Services ...... 207 6.27.1 Changing the High-speed Network (HSN) ...... 207 6.27.2 Network File System (NFS) ...... 207 6.27.3 Configuring Ethernet Link Aggregation (Bonding, Channel Bonding) ...... 207 6.27.4 Configuring a Virtual Local Area Network (VLAN) Interface ...... 209 6.27.5 Increasing Size of ARP Tables ...... 209 6.27.6 Configuring Realm-specific IP Addressing (RSIP) ...... 209 6.27.6.1 Using the CLEinstall Program to Install and Configure RSIP ...... 210 6.27.7 IP Routes for CNL Nodes in the /etc/routes File ...... 214 6.28 Updating the System Configuration After A Blade Change ...... 215 6.28.1 Updating the System Configuration When the System is Not Booted ...... 215 6.28.2 Updating the System Configuration While the System is Booted ...... 217 6.28.2.1 Reusing One or More Previously-failed HSN Links ...... 217 6.28.2.2 Reusing One or More Previously-failed Blades, Mezzanines, or Cabinets ...... 218 6.28.2.3 Planned Removal of a Compute Blade ...... 218 6.28.2.4 Planned Installation of a Compute Blade ...... 219 6.29 Changing the Location to Log syslog-ng Information ...... 220 6.30 Cray Lightweight Log Management (LLM) System ...... 220 6.30.1 Configuring LLM ...... 221 6.30.2 LLM Configuration Tips ...... 222

Managing Services [7] 223 7.1 Configuring the SMW to Synchronize to a Site NTP Server ...... 223 7.2 Synchronizing Time of Day on Compute Node Clocks with the Clock on the Boot Node . . . . 223 7.3 Adding and Starting a Service Using Standard Linux Mechanisms ...... 224 7.4 Creating a Snapshot of /var ...... 224 7.5 Setting Soft and Hard Limits to Prevent Login Node Hangs ...... 225 7.6 Rack-mount SMW: Creating a Cray System Management Workstation (SMW) Bootable Backup Drive 226 7.7 Desk-side SMW: Creating an System Management Workstation (SMW) Bootable Backup Drive . . 236 7.8 Rack-mount SMW: Setting Up the Bootable Backup Drive as an Alternate Boot Device . . . . 244 7.9 Desk-side SMW: Setting Up the Bootable Backup Drive as an Alternate Boot Device . . . . . 247 7.10 Archiving the SDB ...... 249 7.11 Backing Up Limited Shared-root Configuration Data ...... 249 7.11.1 Using the xtoparchive Utility to Archive the Shared-root File System ...... 250


7.11.2 Using Linux Utilities to Save the Shared-root File System ...... 250 7.12 Backing Up Boot Root and Shared Root ...... 251 7.12.1 Using the xthotbackup Command to Back Up Boot Root and Shared Root . . . . . 252 7.12.2 Using dump and restore Commands to Back Up Boot Root and Shared Root . . . . 253 7.13 Backing Up User Data ...... 254 7.14 Rebooting a Stopped SMW ...... 254 7.15 SMW Recovery ...... 254 7.16 Restoring the HSS Database ...... 255 7.17 Recovering from Service Database Failure ...... 255 7.17.1 Database Server Failover ...... 256 7.17.2 Rebuilding Corrupted SDB Tables ...... 256 7.18 Using Persistent SCSI Device Names ...... 256 7.19 Using a Linux iptables Firewall to Limit Services ...... 257 7.20 Handling Single-node Failures ...... 257 7.21 Increasing the Boot Manager Time-out Value ...... 257 7.22 RAID Failure ...... 258

Using the Application Level Placement Scheduler (ALPS) [8] 259 8.1 ALPS Functionality ...... 259 8.2 ALPS Architecture ...... 260 8.2.1 ALPS Clients ...... 261 8.2.1.1 The aprun Client ...... 262 8.2.1.2 The apstat Client ...... 262 8.2.1.3 The apkill Client ...... 263 8.2.1.4 The apmgr Client ...... 263 8.2.1.5 The apbasil Client ...... 263 8.2.2 ALPS Daemons ...... 264 8.2.2.1 The apbridge Daemon ...... 264 8.2.2.2 The apsched Daemon ...... 264 8.2.2.3 The apsys Daemon ...... 264 8.2.2.4 The apwatch Daemon ...... 265 8.2.2.5 The apinit Daemon ...... 265 8.2.2.6 The apres Daemon ...... 266 8.2.2.7 ALPS Log Files ...... 266 8.2.2.8 Changing Debug Message Level of apsched and apsys Daemons ...... 266 8.3 Configuring ALPS ...... 267 8.3.1 /etc/sysconfig/alps Configuration File ...... 267 8.3.2 The alps.conf Configuration File ...... 271


8.4 Resynchronizing ALPS and the SDB Command After Manually Changing the SDB ...... 275 8.5 Identifying Reserved Resources ...... 276 8.6 Terminating a Batch Job ...... 276 8.7 Setting a Compute Node to Batch or Interactive Mode ...... 277 8.8 Manually Starting and Stopping ALPS Daemons on Service Nodes ...... 277 8.9 Manually Cleaning ALPS and PBS or TORQUE and Moab After Downed Login Node . . . . . 278 8.10 Verifying that ALPS is Communicating with Cray System Compute Nodes ...... 279 8.11 ALPS and Node Health Monitoring Interaction ...... 279 8.11.1 aprun Actions ...... 280 8.11.2 apinit Actions ...... 281 8.11.3 apsys Actions ...... 282 8.11.4 Node Health Checker Actions ...... 283 8.11.5 Verifying Application Cleanup ...... 283

Using Comprehensive System Accounting [9] 285 9.1 Interacting with Batch Entry Systems or the PAM job Module ...... 286 9.2 CSA Configuration File Values ...... 286 9.3 Configuring CSA ...... 288 9.3.1 Obtaining File System and Node Information ...... 288 9.3.2 Editing the csa.conf File ...... 289 9.3.3 Editing Other System Configuration Files ...... 292 9.3.4 Creating a CNL Image with CSA Enabled ...... 292 9.3.5 Setting Up CSA Project Accounting ...... 293 9.3.5.1 Disabling Project Accounting ...... 295 9.3.6 Setting Up Job Accounting ...... 296 9.4 Creating Accounting cron Jobs ...... 297 9.4.1 csanodeacct cron Job for Login Nodes ...... 297 9.4.2 csarun cron Job ...... 297 9.4.3 csaperiod cron Job...... 297 9.5 Enabling CSA ...... 298 9.6 Using LDAP with CSA ...... 298

Dynamic Shared Objects and Cluster Compatibility Mode in the Cray Linux Environment [10] 299 10.1 Configuring the Compute Node Root Runtime Environment (CNRTE) Using CLEinstall . . 299 10.2 Configuring Cluster Compatibility Mode ...... 301 10.2.1 Preconditions ...... 303 10.2.2 Configuration Options Relevant to Installation ...... 303 10.2.3 Post-install Options and Configuration ...... 305


Using InfiniBand and OpenFabrics Interconnect Drivers with CLE Systems [11] 313 11.1 InfiniBand and OFED Overview ...... 313 11.2 Using InfiniBand ...... 314 11.2.1 Storage Area Networking ...... 314 11.2.2 Lustre Routing ...... 315 11.2.3 IP Connectivity ...... 316 11.3 Configuration ...... 316 11.4 InfiniBand Configuration ...... 317 11.5 Subnet Manager (OpenSM) Configuration ...... 319 11.5.1 Starting OpenSM at Boot Time ...... 319 11.6 Internet Protocol over InfiniBand (IPoIB) Configuration ...... 320 11.7 Configuring SCSI RDMA Protocol (SRP) on Cray Systems ...... 320 11.8 Lustre Networking (LNET) Router ...... 321 11.8.1 Configuring the LNET Router ...... 322 11.8.2 Configuring Routes for the Lustre Server ...... 325 11.8.3 Configuring the LNET Compute Node Clients ...... 326 11.9 Configuring Fine-grained Routing with clcvt ...... 327 11.9.1 Prerequisite Files ...... 327 11.9.1.1 info.file-system-identifier ...... 328 11.9.1.2 client-system.hosts ...... 330 11.9.1.3 client-system.ib ...... 332 11.9.1.4 cluster-name.ib ...... 333 11.9.1.5 client-system.rtrIm ...... 333 11.9.2 Generating ip2nets and routes Information ...... 334

Appendix A SMW and CLE System Administration Commands 339 A.1 HSS Commands ...... 339 A.2 Cray Lightweight Log Management (LLM) System Commands ...... 342 A.3 CLE System Administration Commands ...... 342

Appendix B System States 347

Appendix C Remote Access to the SMW 349

Appendix D Updating the Time Zone 353

Appendix E Creating Modulefiles 359 E.1 Modulefile Template ...... 359 E.2 Sharing Your Modulefile ...... 362 E.3 Modulefile Help ...... 363


Appendix F PBS Professional Licensing for Cray Systems 365 F.1 Introduction ...... 365 F.2 Migrating the PBS Professional Server and Scheduler ...... 366 F.3 Configuring RSIP to the SDB Node ...... 368 F.4 Network Address Translation (NAT) IP Forwarding ...... 371 F.5 Installing and Configuring a NIC ...... 373

Appendix G Installing RPMs 377 G.1 Generic RPM Usage ...... 377

Appendix H Sample LNET Router Controller Script 379

Appendix I Enabling an Integrated Dell Remote Access Controller (iDRAC6) on a Rack-mount SMW 381 I.1 Before You Start ...... 381 I.2 Enabling an Integrated Dell Remote Access Controller (iDRAC6) on a Rack-mount SMW . . . . 381 I.3 Using the iDRAC6 ...... 383

Appendix J Rack-mount SMW: Replacing a Failed LOGDISK or DBDISK Disk Drive 385 J.1 Rack-mount SMW: Replacing a Failed LOGDISK or DBDISK Disk Drive ...... 385

Procedures Procedure 1. Logging on to the boot node ...... 61 Procedure 2. Preparing a boot image for CNL compute nodes and service nodes ...... 62 Procedure 3. Creating a Cray boot image from existing file system images ...... 66 Procedure 4. Manually booting the boot node and service nodes ...... 68 Procedure 5. Booting the compute nodes ...... 69 Procedure 6. Shutting down service nodes ...... 74 Procedure 7. Reserving a component ...... 76 Procedure 8. Powering down a specified blade ...... 76 Procedure 9. Halting a node ...... 77 Procedure 10. Power up blades in a cabinet ...... 77 Procedure 11. Power-cycling a component ...... 86 Procedure 12. Enabling dumpd ...... 89 Procedure 13. Showing boot configuration information for the entire system ...... 94 Procedure 14. Showing boot configuration information for a partition of a system ...... 94 Procedure 15. Showing the status of a component ...... 96 Procedure 16. Displaying the location history for component c0-0c0s0n1 ...... 103 Procedure 17. Changing the root and crayadm passwords on boot and service nodes . . . . 108 Procedure 18. Changing the root password on CNL compute nodes ...... 109


Procedure 19. Changing default MySQL passwords on the SDB ...... 110 Procedure 20. Stopping a job running in interactive mode ...... 123 Procedure 21. Specializing a file by class login ...... 135 Procedure 22. Specializing a file by node ...... 135 Procedure 23. Specializing a file by node without entering xtopview ...... 136 Procedure 24. Finding files in /etc that are specialized by a node ...... 136 Procedure 25. Disabling SSH daemon (sshd) on CNL compute nodes ...... 145 Procedure 26. Using dropbear to generate site-specific SSH keys ...... 145 Procedure 27. Adjusting the memory control group limit ...... 148 Procedure 28. Disabling memory control groups ...... 148 Procedure 29. Enabling Zone Moveable ...... 149 Procedure 30. Configuring Cray Audit ...... 151 Procedure 31. Configuring cray_pam to log failed login attempts ...... 157 Procedure 32. Configuring cron for the SMW and the boot node ...... 160 Procedure 33. Configuring cron for the shared root with persistent /var ...... 160 Procedure 34. Configuring cron for the shared root without persistent /var ...... 161 Procedure 35. Configuring lbnamed ontheSMW...... 163 Procedure 36. Installing the load balancer on an external "white box" server ...... 164 Procedure 37. Recovering from a login node crash when a login node will not be rebooted . . . . 182 Procedure 38. Configuring boot-node failover ...... 186 Procedure 39. Disabling boot-node failover ...... 188 Procedure 40. Configuring a logical machine ...... 191 Procedure 41. Booting a system set ...... 195 Procedure 42. Examining the service databases with MySQL commands ...... 200 Procedure 43. Configuring an I/O service node bonding interface ...... 208 Procedure 44. Configuring a Virtual Local Area Network (VLAN) interface ...... 209 Procedure 45. Installing, configuring, and starting RSIP clients and servers ...... 212 Procedure 46. Adding isolated service nodes as RSIP clients ...... 213 Procedure 47. Updating the SMW configuration after hardware changes ...... 215 Procedure 48. Using CLEinstall to update the system configuration after adding a blade to a system 216 Procedure 49. Rerouting the HSN to use previously-failed links ...... 217 Procedure 50. Bringing failed blades/mezzanines/cabinets back into the HSN configuration . . . . 218 Procedure 51. Removing a compute blade for maintenance or replacement while the system is running . 219 Procedure 52. Returning a blade into service ...... 219 Procedure 53. Configuring the SMW to synchronize to a site NTP server ...... 223 Procedure 54. Preventing login node hangs by setting soft and hard limits ...... 225 Procedure 55. Rack-mount SMW: Creating an SMW bootable backup drive ...... 226


Procedure 56. Desk-side SMW: Creating an SMW bootable backup drive ...... 237 Procedure 57. Rack-mount SMW: Setting up the bootable backup drive as an alternate boot device . . 245 Procedure 58. Desk-side SMW: Setting up the bootable backup drive as an alternate boot device . . . 247 Procedure 59. Backing up limited shared-root configuration data ...... 250 Procedure 60. Backing up the boot root and shared root using the dump and restore commands . 253 Procedure 61. Rebooting a stopped SMW ...... 254 Procedure 62. SMW primary disk failure recovery ...... 254 Procedure 63. Restoring the HSS database ...... 255 Procedure 64. Releasing a reserved system service protection domain ...... 275 Procedure 65. Starting and stopping ALPS daemons on a specific service node ...... 277 Procedure 66. Restarting ALPS daemon on a specific service node ...... 278 Procedure 67. Manually cleaning up ALPS and TORQUE and Moab or PBS after a login node goes down...... 278 Procedure 68. Obtaining file system and node information ...... 288 Procedure 69. Editing CSA parameters for the example system ...... 289 Procedure 70. Setting up CSA project accounting ...... 293 Procedure 71. Disabling project accounting ...... 295 Procedure 72. Setting up CSA job accounting for non-CCM CNOS jobs ...... 296 Procedure 73. Using DVS to mount home directories on the compute nodes for CCM ...... 305 Procedure 74. Modifying CCM and Platform-MPI system configurations ...... 306 Procedure 75. Setting up files for the cnos class ...... 306 Procedure 76. Linking the CCM prologue/epilogue scripts for use with PBS and Moab TORQUE on login nodes ...... 307 Procedure 77. Using qmgr to create a general CCM queue and queues for separate ISV applications . 308 Procedure 78. Configuring Platform LSF for use with CCM ...... 309 Procedure 79. Creating custom resources with PBS ...... 310 Procedure 80. Creating custom resources with Moab ...... 311 Procedure 81. Configuring InfiniBand on service nodes ...... 317 Procedure 82. Starting a single instance of OpenSM on a service node at boot time ...... 319 Procedure 83. Configuring IP Over InfiniBand (IPoIB) on Cray systems ...... 320 Procedure 84. Configuring and enabling SRP on Cray Systems ...... 320 Procedure 85. Configuring the LNET routers ...... 322 Procedure 86. Specifying service node LNET routes and ip2nets directives with files . . . 324 Procedure 87. Manually controlling LNET routers ...... 324 Procedure 88. Configuring the InfiniBand Lustre Server ...... 325 Procedure 89. Configuring the LNET Compute Node Clients ...... 326 Procedure 90. Creating the client-system.rtrIm file on the SMW ...... 334 Procedure 91. Creating the persistent-storage file ...... 335


Procedure 92. Create ip2nets and routes information for the compute nodes ...... 335 Procedure 93. Create ip2nets and routes information for service node Lustre clients (MOM and internal login nodes) ...... 336 Procedure 94. Create ip2nets and routes information for the LNET router nodes . . . . . 337 Procedure 95. Create ip2nets and routes information for the Lustre server nodes . . . . . 338 Procedure 96. Starting the VNC server ...... 349 Procedure 97. For workstation or laptop running Linux: Connecting to the VNC server through an ssh tunnel, using the vncviewer -via option ...... 350 Procedure 98. For workstation or laptop running Linux: Connecting to the VNC server through an ssh tunnel ...... 351 Procedure 99. For workstation or laptop running Mac OS X: Connecting to the VNC server through an ssh tunnel ...... 351 Procedure 100. For workstation or laptop running Windows: Connecting to the VNC server through an ssh tunnel ...... 352 Procedure 101. Changing the time zone for the SMW and the blade and cabinet controllers . . . . 353 Procedure 102. Changing the time zone on the boot root and shared root ...... 354 Procedure 103. Changing the time zone for compute nodes ...... 356 Procedure 104. Migrating PBS off the SDB node ...... 367 Procedure 105. Creating a simple RSIP configuration with the SDB node as a client ...... 369 Procedure 106. Adding the SDB node as an RSIP client to an existing RSIP configuration . . . . 370 Procedure 107. Configuring NAT IP forwarding for the SDB node ...... 371 Procedure 108. Installing and configuring a NIC on the SDB node ...... 373 Procedure 109. Enabling an Integrated Dell Remote Access Controller (iDRAC6) on a rack-mount SMW...... 381 Procedure 110. Changing the default iDRAC Password ...... 382 Procedure 111. Using the iDRAC6 ...... 383 Procedure 112. Rack-mount SMW: Replacing a failed LOGDISK or DBDISK disk drive . . . . 385

Examples Example 1. Sample /etc/opt/cray/sdb/node_classes file ...... 56 Example 2. Making a boot image with new parameters for service and CNL compute nodes . . . . 67 Example 3. Booting all service nodes with a specific image ...... 70 Example 4. Booting all compute nodes with a specific image ...... 71 Example 5. Booting compute nodes using a load file ...... 71 Example 6. Rebooting a single compute node ...... 71 Example 7. Rebooting login or network nodes ...... 71 Example 8. Rebooting cabinet controller c0-0, with verbose output ...... 72 Example 9. Displaying routing information ...... 72 Example 10. Routing the entire system ...... 72 Example 11. Bounce failed blades repeatedly until all blades succeed ...... 73


Example 12. Shutting down the system using the auto.xtshutdown file ...... 73 Example 13. Shutting down all compute nodes ...... 75 Example 14. Shutting down specified compute nodes ...... 75 Example 15. Shutting down all nodes of a system ...... 75 Example 16. Forcing nodes to shut down (immediate halt) ...... 75 Example 17. Aborting a session running on the boot manager ...... 78 Example 18. Looking at node characteristics ...... 79 Example 19. Viewing all node attributes ...... 79 Example 20. Viewing selected node attributes of selected nodes ...... 79 Example 21. Disabling a node ...... 79 Example 22. Disabling all processors ...... 79 Example 23. Finding the physical ID for node 38 ...... 80 Example 24. Finding the physical ID for nodes 0, 1, 2, and 3 ...... 80 Example 25. Finding the physical IDs for Aries IDs 0-7 ...... 81 Example 26. Printing the nid-to-nic address mappings for the node with NID 31...... 81 Example 27. Printing the nid-to-nic address mappings for the same node as shown in Example 26,but specifying the NIC value in the command line ...... 81 Example 28. Creating a list of node identifiers that are not in the DISABLE, EMPTY,orOFF state . 81 Example 29. Disabling the Aries ASIC c0-0c1s3a0 ...... 82 Example 30. Setting a blade to the empty state ...... 83 Example 31. Locking cabinet c0-0 ...... 84 Example 32. Show all session (lock) data ...... 84 Example 33. Unlocking cabinet c0-0 ...... 84 Example 34. Restarting the NTP service ...... 85 Example 35. Dumping information about a working component ...... 87 Example 36. Displaying installed SMW release level ...... 93 Example 37. Displaying the current xtrelease value ...... 93 Example 38. Displaying the most recently installed CLE release information ...... 94 Example 39. Finding information in the event log ...... 95 Example 40. Adding entries to syslog file ...... 95 Example 41. Display nodes that were repurposed with the xtcli mark_node command . . . 97 Example 42. Identifying nodes in down or admindown state ...... 97 Example 43. Display current allocation and status of each compute processing element and the application that it is running ...... 98 Example 44. Verifying that a compute node is connected to the network ...... 98 Example 45. Running xtnetwatch to monitor the system interconnection network . . . . . 99 Example 46. Reporting PCIe-related errors to stdout ...... 100 Example 47. Looking at a session running on the boot manager ...... 102


Example 48. Checking the boot manager ...... 102 Example 49. Monitoring for specific events ...... 102 Example 50. Checking events except heartbeat: ...... 102 Example 51. Identifying all service nodes ...... 103 Example 52. Showing compute nodes in the disabled state ...... 104 Example 53. Showing components with a status of not empty ...... 105 Example 54. Show all alerts on the system ...... 105 Example 55. Clear all warnings in specified cabinet ...... 106 Example 56. Displaying HSS error codes ...... 106 Example 57. Displaying an HSS error code using its bit mask number ...... 106 Example 58. Adding a group ...... 116 Example 59. Adding a user account ...... 116 Example 60. Removing a user account ...... 116 Example 61. Creating a pam_listfile list file ...... 122 Example 62. Adding a line to /etc/pam.d/sshd to enable pam_listfile ...... 122 Example 63. Stopping a job running under PBS Professional ...... 123 Example 64. Shared-root links ...... 129 Example 65. Starting the xtopview shell for a node ...... 132 Example 66. Starting the xtopview shell for a class of nodes ...... 132 Example 67. Starting the xtopview shell for a directory other than /rr/current ..... 132 Example 68. Sample xtopview session ...... 133 Example 69. Starting xtopview using node_classes for information ...... 133 Example 70. Running xtopview from the SMW while the system is not booted ...... 133 Example 71. Updating a file within xtopview shell ...... 134 Example 72. Finding files in /etc that are specialized by class ...... 137 Example 73. Finding specialization of a file on a node ...... 137 Example 74. Finding nodes on which a file is specialized ...... 137 Example 75. Finding specialization of a file on a node without invoking the xtopview shell . . . 137 Example 76. Finding specialization of files by class without invoking the xtopview shell . . . . 138 Example 77. Finding the class of a node ...... 139 Example 78. Adding a node to a class ...... 139 Example 79. Removing node specialization ...... 140 Example 80. Removing class specialization ...... 140 Example 81. Printing the latest version of a file ...... 141 Example 82. Printing the RCS log for /etc/fstab in the node 3 view ...... 141 Example 83. Displaying differences between two versions of the /etc/fstab file . . . . . 141 Example 84. Checking out a version 1.2 copy of /etc/fstab ...... 141


Example 85. Recreating the file link for /etc/fstab to the current view's /etc/fstab file . . 142 Example 86. Printing specifications for login class specialized files ...... 142 Example 87. Printing specifications for all node specialized files ...... 143 Example 88. Printing specifications for files modified in the default view and include any warning messages ...... 143 Example 89. Adding files specified by specifications listed in specfile to an archive file . . . . 144 Example 90. Listing specifications for files currently in the archive.20110422 archive file . . 144 Example 91. Default /etc/auditd.conf file ...... 155 Example 92. Modified PAM configuration files configured to report failed login by using an alternate path ...... 159 Example 93. Creating a logical machine with a boot node and SDB node specifying the boot image path 191 Example 94. Updating boot configuration ...... 192 Example 95. Sample mount line from compute node /etc/fstab ...... 202 Example 96. Using node attribute labels to assign nodes to user groups ...... 204 Example 97. Using the xtoparchive utility to archive the shared-root file system . . . . . 250 Example 98. Using the xthotbackup command to create a bootable backup system set . . . . 252 Example 99. Using the xthotbackup command to copy selected file systems from source to the backup system set ...... 252 Example 100. Recovering from an SDB failure ...... 256 Example 101. Increasing the boot_timeout value ...... 257 Example 102. Sample /etc/sysconfig/alps configuration file ...... 270 Example 103. Sample alps.conf configuration file ...... 274 Example 104. Retrieving node allocation status ...... 277 Example 105. Verifying that ALPS is communicating with Cray system compute nodes . . . . . 279 Example 106. Running a csanodeacct cron job on each login node to move local accounting files 297 Example 107. Executing the csarun script ...... 297 Example 108. Running periodic accounting at different intervals than the regular system accounting interval ...... 298 Example 109. Location of queue configuration files ...... 309 Example 110. Sample info.file-system-identifier file: info.snx11029 ...... 330 Example 111. Sample info.file-system-identifierfile using multiple IB interfaces per router . . . 330 Example 112. Sample client-system.hosts file: hera.hosts ...... 331 Example 113. Sample client-system.ib file: hera.ib...... 332 Example 114. Sample client-system.ib file using multiple IB interfaces per router ...... 333 Example 115. Sample cluster-name.ib file: snx11029n.ib...... 333 Example 116. Sample client-system.rtrIm file: hera.rtrIm ...... 334 Example 117. Modulefile example ...... 361 Example 118. Installing an RPM on the SMW ...... 378 Example 119. Installing an RPM on the boot node root ...... 378


Example 120. Installing an RPM on the shared root ...... 378

Figures Figure 1. Administrative Components of a Cray System ...... 31 Figure 2. Types of Specialization ...... 125 Figure 3. Shared-root Implementation ...... 128 Figure 4. ALPS Process ...... 261 Figure 5. Cray System Job Distribution Cross-section ...... 302 Figure 6. CCM Job Flow Diagram ...... 303 Figure 7. The OFED Stack (source: OpenFabrics Alliance) ...... 314 Figure 8. Cray System Connected to Storage Using SRP ...... 315 Figure 9. Cray Service Node Acting as an InfiniBand Lustre Router ...... 315 Figure 10. Cray Service Node in IP over IB Configuration ...... 316 Figure 11. Cray XC30 InfiniBand Port Assignment ...... 329

Tables Table 1. Physical ID Naming Conventions ...... 52 Table 2. File Specialization by Class ...... 126 Table 3. File Specialization by Node ...... 127 Table 4. Shared-root Commands ...... 130 Table 5. Service Database Tables ...... 196 Table 6. Database Privileges ...... 197 Table 7. Service Database Update Commands ...... 197 Table 8. CSA Parameters That Must Be Specific to Your System ...... 287 Table 9. Project Accounting Parameters That Must Be Specific to Your System ...... 295 Table 10. Upper Layer InfiniBand I/O Protocols for Cray Systems ...... 317 Table 11. LNET Network Address Configuration for Cray Systems ...... 322 Table 12. HSS Commands ...... 339 Table 13. LLM Commands ...... 342 Table 14. CLE Commands ...... 342 Table 15. State Definitions ...... 347 Table 16. Additional State Definitions ...... 348 Table 17. xtcli Commands and Allowed States ...... 348


Introduction [1]

The release documents provided with your Cray Linux Environment (CLE) operating system and Cray System Management Workstation (SMW) release packages state the specific Cray platforms supported with each release package. A Cray system is a massively parallel processing (MPP) system that has a shared-root file system available to all service-processing elements (nodes). Cray has combined commodity and open-source components with custom-designed components to create a system that can operate efficiently at immense scale.

The Cray Linux Environment (CLE) operating system includes Cray's customized version of the SUSE Linux Enterprise Server (SLES) 11 Service Pack 2 (SP2) operating system, with a Linux 3.0.58 kernel. This full-featured operating system runs on the Cray system's service nodes. Service nodes perform the functions needed to support users, administrators, and applications running on compute nodes. Above the operating system level are specialized daemons and applications that perform functions unique to each service node.

Compute nodes on Cray systems run the CNL compute node operating system, which runs a Linux kernel. The kernel provides support for application execution without the overhead of a full operating-system image. The kernel interacts with an application process in very limited ways. It provides virtual memory addressing and physical memory allocation, memory protection, access to the message-passing layer, and a scalable job loader. Support for I/O operations is limited inside the compute node's kernel. For a more complete description, see Compute Nodes on page 40.

Note: Functionality marked as deferred in this documentation is planned to be implemented in a later release.

1.1 Audience for This Guide

The audience for this guide is system administrators and those who manage the operation of a Cray system. Prerequisites for using this guide include a working knowledge of Linux to administer the system and a review of the Cray system administration documentation listed in Cray System Administration Publications and in Related Publications on page 28 of this guide. This guide assumes that you have a basic understanding of your Cray system and the software that runs on it.


1.2 Cray System Administration Publications

This publication is one of a set of related manuals that cover information about the structure and operation of your Cray system. See also:

• Installing Cray System Management Workstation (SMW) Software (S–2480)

• Cray System Management Workstation (SMW) Software Release Errata

• Cray System Management Workstation (SMW) Software Release README file

• Cray System Management Workstation (SMW) Software Release Notes file

• Cray Linux Environment (CLE) Software Release Overview (S–2425)

• Cray Linux Environment (CLE) Software Release Overview Supplement (S–2497)

• Installing and Configuring Cray Linux Environment (CLE) Software (S–2444)

• Limitations for the CLE Release

• CLE Release Errata

• Using and Configuring System Environment Data Collections (SEDC) (S–2491)

• Managing Lustre for the Cray Linux Environment (CLE) (S–0010)

• Introduction to Cray Data Virtualization Service (S–0005)

• Using the GNI and DMAPP APIs (S–2446)

• Cray Application Developer's Environment Installation Guide (S–2465)

• Cray Application Developer's Environment Supplement Installation Guide (S–2485)

• Workload Management and Application Placement for the Cray Linux Environment (S–2496)

• Writing a Node Health Checker (NHC) Plugin Test (S–0023)

• Cray Programming Environments Installation Guide (S–2372)

• Using Cray Performance Measurement and Analysis Tools (S–2376)

• Cray Application Developer's Environment User's Guide (S–2396)

1.3 Related Publications

Because your Cray system runs a combination of software developed by Cray, other vendors' software, and open-source software, the following websites may be useful:

• Linux Documentation Project — See http://www.tldp.org

• SLES 11 and Linux documentation — See http://www.novell.com/linux


• Data Direct Networks documentation — See http://www.ddn.com/support/product-downloads-and-documentation

• NetApp, Inc. Engenio storage system documentation — See http://www.netapp.com/us/products/storage-systems

• MySQL documentation — See http://www.mysql.com/documentation

• Lustre File System documentation — See https://wiki.hpdd.intel.com/display/PUB/Documentation

• Batch system documentation:

PBS Professional:    Altair Engineering, Inc.              http://www.pbsworks.com
Moab and TORQUE:     Adaptive Computing Enterprises Inc.   http://www.adaptivecomputing.com
Platform LSF:        Platform Computing Corporation        http://www.platform.com


Introducing System Components [2]

Cray systems separate calculation and monitoring functions. Figure 1 shows the components of a Cray system that an administrator manages.

Figure 1. Administrative Components of a Cray System

[Figure 1 depicts the SMW, boot RAID, and service nodes (boot, SDB, login, network, and I/O nodes) connected to the compute nodes and external networks by the HSS Ethernet, the high-speed network, 10-GigE and 1-GigE links, and Fibre Channel.]


A Cray system contains operational components plus storage:

• The System Management Workstation (SMW) is the single point of control for system administration. (For additional information about the SMW, see System Management Workstation (SMW) on page 33.)

Note: For a system configured for SMW high availability (HA) with the SMW failover feature, there are two SMWs in an active-passive failover configuration. A virtual host name provides access to the active SMW, so that there is still a single point of control. For more information, see Installing, Configuring, and Managing SMW Failover on the Cray XC30 System (S–0044).

• The Hardware Supervisory System (HSS) monitors the system and handles component failures. The HSS is an integrated system of hardware and software that monitors components, manages hardware and software failures, controls system startup and shutdown, manages the system interconnection network, and maintains system states. (For additional information about HSS, see Hardware Supervisory System (HSS) on page 43.)
• The Cray Linux Environment (CLE) operating system is the operating system for Cray systems. (For additional information about CLE, see Cray Linux Environment (CLE) on page 34.)
• Service nodes perform the management functions that enable the computations to occur. (For additional information about service nodes, see Service Nodes on page 35.)
• Compute nodes are primarily dedicated to computation. (For additional information about compute nodes, see Compute Nodes on page 40.)
• RAID is partitioned for a variety of storage functions such as boot RAID, database storage, and parallel and user-file system storage. (For additional information about RAID, see Boot Root File System on page 34 and Storage on page 50.)


A Cray system has six network components:

• The 10-GigE network is a high-speed Ethernet pipe that provides external NFS access. It connects to the network nodes and is specifically configured to transfer large amounts of data in and out of the system.

• Users access a 1-GigE network server connection to the login nodes. Logins are distributed among the login nodes by a load-leveling service through the Domain Name Service (DNS) that directs them to the least loaded login node.

• Fibre Channel networks connect storage to the system components.

• The RAID controllers connect to the SMW through the HSS network. This storage sends log messages to the SMW when a failure affects the ability of the disk farm to reliably store and retrieve data.

• The system interconnection network includes custom Cray components that provide high-bandwidth, low-latency communication between all the service nodes and compute nodes in the system. The system interconnection network is often referred to as the high-speed network (HSN).

• The HSS network performs the reliability, accessibility, and serviceability functions. The HSS consists of an internet protocol (IP) address and associated control platforms that monitor all nodes.

2.1 System Management Workstation (SMW)

The SMW is the administrator's console for managing a Cray system. The SMW is a server that runs a combination of the SUSE Linux Enterprise Server version 11 operating system with a Service Pack, Cray developed software, and third-party software. The SMW is also a single point of control for the HSS. The HSS data is stored on an internal hard drive of the SMW. For more information about the HSS, see Hardware Supervisory System (HSS) on page 43. For information about installing the SMW release software, see Installing Cray System Management Workstation (SMW) Software (S–2480).

Note: For a system configured for SMW HA with the SMW failover feature, see Installing, Configuring, and Managing SMW Failover on the Cray XC30 System (S–0044).

You log on to an SMW window on the console to perform SMW functions. From the SMW, you can log on to a disk controller or use a web-browser-based interface from the SMW to configure a RAID controller or Fibre Channel switch. You can log on to the boot node from the SMW as well. From the SMW, you cannot log on directly (ssh) to any service node except the boot node.


Most system logs are collected and stored on the SMW. The SMW plays no role in computation after the system is booted. From the SMW, you can initiate the boot process, access the database that keeps track of system hardware, and perform standard administrative tasks.

2.2 Cray Linux Environment (CLE)

CLE is the operating system for Cray systems. CLE is the Cray customized version of the SLES 11 SP2 operating system with a Linux 3.0.58 kernel. This full-featured operating system runs on the Cray service nodes. The Cray compute nodes run a kernel developed to provide support for application execution without the overhead of a full operating-system image. In the compute node root runtime environment (CNRTE), compute nodes have access to the service node shared root (via chroot) such that compute nodes can access the full features of a Linux environment. CLE commands enable administrators to perform administrative functions on the service nodes to control processing. The majority of CLE commands are launched from the boot node, making the boot node the focal point for CLE administration. For a complete list of Cray developed CLE administrator commands, see Appendix A, SMW and CLE System Administration Commands on page 339.

2.3 Boot Root File System

The boot node has its own root file system, bootroot, which is created on the boot RAID during installation. You install and configure the boot RAID from the SMW before you boot the boot node. The boot node mounts the bootroot from the boot RAID.

2.4 Shared Root File System

A Cray system has a root file system that is distributed as a read-only shared file system among all the service nodes except the boot node. Each service node has the same directory structure, which is made up of a set of symbolic links to the shared-root file system. For most files, only one version of the file exists on the system, so if you modify the single copy, it affects all service nodes. This makes the administration process similar to that of a single system.

You manage the shared-root file system from the boot node through the xtopview command (see Managing System Configuration with the xtopview Tool on page 131).


If you need unique files on a specific node or class of nodes (that is, nodes of a certain type), you can set up a modified directory structure. This process, called specialization, creates a new directory hierarchy that overlays the existing root directory on the specified nodes and contains symbolic links that point to the unique files. For information about the shared root and file specialization, see Configuring the Shared-root File System on Service Nodes on page 125.
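For illustration only, specializing a single file for all nodes of the login class might look like the following sketch; the target file /etc/motd and the prompts shown are hypothetical, and the exact options and workflow are described in the sections referenced above:

boot:~ # xtopview -c login
class/login:/ # xtspec /etc/motd
class/login:/ # exit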

2.5 Service Nodes

Service nodes can be specialized and categorized by the services that run on them. Service nodes run the CLE operating system. The administrator commands for these nodes are standard Linux commands and Cray system-specific commands. You log on to the boot node through the SMW console, then from the boot node you can log on to the other service nodes.

Service nodes perform the functions needed to support users, administrators, and applications running on compute nodes. As the system administrator, you define service node classes by the service they perform. Configuration information in the service database on the SDB node determines the functions of the other nodes and services, such as where a batch subsystem runs. In small configurations, some services can be combined on the same node: for example, the sdb and syslog services can both run on the same node.

You can start services system-wide or on specific nodes. You can start services during the boot process or later on specific nodes of a running system. How you start a service depends on the type of service. Service nodes, unlike compute nodes, are generally equipped with Peripheral Component Interconnect (PCI) protocol card slots to support external devices.

Service nodes run a full-featured version of the Linux operating system. Service node kernels are configured to enable Non-Uniform Memory Access (NUMA), which minimizes traffic between sockets by using socket-local memory whenever possible. System management tools are a combination of Linux commands and Cray system commands that are analogous to standard Linux commands but operate on more than one node. For more information about Cray system commands, see Appendix A, SMW and CLE System Administration Commands on page 339. After the system is up, you can access any service node from any other service node, provided you have the correct permissions.

2.5.1 Boot Node

Use the boot node to manage files, add users, and mount and export the shared-root file system to the rest of the service nodes. These shared-root files are mounted from the boot node as read-only.


You can configure two boot nodes per system or per partition, one primary and one for backup (secondary). The two boot nodes must be located on different blades. When the primary boot node is booted, the backup boot node also begins to boot. But the backup boot node boot process is suspended until a primary boot-node failure event is detected. For information about configuring boot-node failover, see Configuring Boot-node Failover on page 184.

2.5.2 Service Database (SDB) Node

The SDB node hosts the service database (SDB), which is a MySQL database that resides on a separate file system on the boot RAID. The SDB is accessible to every service node (see Changing the Service Database (SDB) on page 195). The SDB provides a central location for storing information so that it does not need to be stored on each node. You can access the SDB from any service node after the system is booted, provided you have the correct authorizations.

The SDB stores the following information:

• Global state information of compute processors. This information is used by the Application Level Placement Scheduler (ALPS), which allocates compute processing elements for compute nodes running CNL. For more information about ALPS, see Application Level Placement Scheduler (ALPS) for Compute Nodes on page 39.

• System configuration tables that list and describe processor attribute and service information.

The SDB node is the second node that is started during the boot process.

You can configure two SDB nodes per system or per partition, one primary and one for backup (secondary). The two SDB nodes must be located on different system blades. For more information, see Configuring SDB Node Failover on page 188.

2.5.3 Syslog Node

By default, the boot node forwards syslog traffic from the service nodes to the SMW for storage in log files. An optional syslog node may be specified (see the CLEinstall.conf(5) man page); however, this service node must be provisioned and configured to be able to reach the SMW directly over an attached Ethernet link.

2.5.4 Login Nodes

Users log on to a login node, which is the single point of control for applications that run on the compute nodes. Users do not log on to the compute nodes.


You can use the Linux lbnamed load balancer software provided to distribute user logins across login nodes (see Configuring the Load Balancer on page 163). The number of login nodes depends upon the installation and user requirements. For typical interactive usage, a single login node handles 20 to 30 batch users or 20 to 40 interactive users with double this number of user processes.

Caution: Login nodes, as well as other service nodes, do not have swap space. If users consume too many resources, Cray service nodes can run out of memory. When an out of memory condition occurs, the node can become unstable or may crash. System administrators should take steps to manage system resources on service nodes. For example, resource limits can be configured using the pam_limits module and the /etc/security/limits.conf file. For more information, see the limits.conf(5) man page.
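As a sketch only, a limits.conf entry that caps the per-process address space of ordinary users on a login node might look like the following; the limit value is illustrative and must be tuned for your site:

# /etc/security/limits.conf entry: domain  type  item  value (KB)
*       hard    as      16000000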

2.5.5 Network Nodes

Network nodes connect to the external network with a 10-GigE card. These nodes are designed for high-speed data transfer.

2.5.6 I/O Nodes

I/O nodes host the Lustre file system; see Lustre File System on page 38. The I/O nodes connect to the RAID subsystems that contain the Lustre file system. Two I/O nodes connect to each RAID device for resiliency; each I/O node has full accessibility to all storage on the connected RAID device. Cray provides support for RAID subsystems from two different vendors, Data Direct Networks (DDN) and NetApp, Inc. Cray Data Virtualization Service (Cray DVS) servers run on an I/O node; see Cray Data Virtualization Service (Cray DVS) on page 38. DVS servers cannot run on the same I/O nodes as Lustre servers. On I/O nodes, DVS servers act as external file system clients. DVS will project the external file systems to service and compute node clients within the system.

2.5.7 Services on the Service Nodes

Service nodes provide the services described in this section.

2.5.7.1 Resiliency Communication Agent (RCA)

The RCA is the message path between the CLE operating system and the HSS. The RCA runs on all service nodes and CNL compute nodes.


The service_config table of the SDB maintains a list of services that RCA starts. For the services listed in the service_config table, the RCA daemon (rcad_svcs) starts and restarts all services that must run on a node. You can determine or modify services available through the SDB service_config table by using the xtservconfig command.

Note: Services can also be started manually or automatically by using standard Linux mechanisms (see Adding and Starting a Service Using Standard Linux Mechanisms on page 224).

The SDB serv_cmd table stores information about each service, such as service type, service instance, heartbeat interval, and restart policy.

The configuration file for service nodes is /etc/opt/cray/rca/rcad_svcs.service.conf. By default, this configuration file starts the rca_dispatcher and the failover manager.

The RCA consists of a kernel-mode driver and a user-mode daemon on CLE. The RCA driver, rca.ko, runs as a kernel-loadable module for the service partition. On CNL compute nodes, the RCA operates through system calls and communicates with the HSS to track the heartbeats (see Blade Controllers and Cabinet Controllers on page 45) of any programs that have registered with it and to handle event traffic between the HSS and the applications that register to receive events. The RCA driver starts as part of the kernel boot, and the RCA daemon starts as part of the initialization scripts.

2.5.7.2 Lustre File System

Cray systems running CLE support the Lustre file system, which provides a high-performance, highly scalable, POSIX-compliant shared file system. You can configure Lustre file systems to operate in the most efficient manner for the I/O needs of applications, ranging from a single metadata server (MDS) and object storage target (OST) to a single MDS with up to 128 OSTs. User directories and files are shared and are globally visible from all compute and service node Lustre clients. For more information, see Managing Lustre for the Cray Linux Environment (CLE) (S–0010) and Installing and Configuring Cray Linux Environment (CLE) Software (S–2444).

2.5.7.3 Cray Data Virtualization Service (Cray DVS)

The Cray Data Virtualization Service (Cray DVS) is a parallel I/O forwarding service that provides for transparent use of multiple file systems on Cray systems with close-to-open coherence, much like NFS. For additional information, see Installing and Configuring Cray Linux Environment (CLE) Software (S–2444) and Introduction to Cray Data Virtualization Service (S–0005).


2.5.7.4 Application Level Placement Scheduler (ALPS) for Compute Nodes

For compute nodes running CNL, the Application Level Placement Scheduler (ALPS) is provided. ALPS provides application placement, launch, and management functionality and cooperates with third-party batch systems for application scheduling. The third-party batch system (such as PBS Professional, Moab, TORQUE, or Platform LSF) makes the policy and scheduling decisions, and ALPS provides a mechanism to place and launch the applications contained within batch jobs. ALPS also supports placement and launch functionality for interactive applications.

An Extensible Markup Language (XML) interface is provided by ALPS for communication with third-party batch systems. This interface is available through use of the apbasil client. ALPS uses application resource reservations to guarantee resource availability to batch system schedulers. The ALPS application placement and launch functionality is provided for applications executing on compute nodes only; ALPS does not provide placement and launch functionality for service nodes.

Note: Only one application can be placed per node; two different executables cannot be run on the same node at the same time.

ALPS is automatically loaded as part of the CNL environment when booting CNL. The RCA starts the ALPS apinit daemon on the compute nodes. When a job is running on CNL compute nodes, the aprun process (see Job Launch Commands on page 42) interacts with ALPS to keep track of the processors that the job uses. For more information about ALPS, see Chapter 8, Using the Application Level Placement Scheduler (ALPS) on page 259.

2.5.7.5 Cluster Compatibility Mode

Cluster Compatibility Mode (CCM) provides the services needed to run most cluster-based independent software vendor (ISV) applications "out of the box." CCM is tightly coupled to the workload management system. It enables users to execute cluster applications alongside workload-managed jobs running in a traditional MPP batch or interactive queue. Support for dynamic shared objects and expanded services on CNL compute nodes, using the compute node root runtime environment (CNRTE), provides the services to compute nodes within the cluster queue. Essentially, CCM uses the batch system to logically designate part of the Cray system as an emulated cluster for the duration of the job. For more information about CCM, see Chapter 10, Dynamic Shared Objects and Cluster Compatibility Mode in the Cray Linux Environment on page 299.


2.5.7.6 Repurposing CNL Compute Nodes as Service Nodes

Some services on Cray systems have resource requirements or limitations (for example, memory, processing power or response time) that you can address by configuring a dedicated service node, such as a Cray Data Virtualization Service (Cray DVS) node or a batch system management (MOM) node. On Cray systems, service I/O node hardware (on a service blade) is equipped with Peripheral Component Interconnect (PCI) protocol card slots to support external devices. Compute node hardware (on a compute blade) does not have PCI slots. For services that do not require external connectivity, you can configure the service to run on a single, dedicated compute node and avoid using traditional service I/O node hardware. When you configure a node on a compute blade to boot a service node image and perform a service node role, that node is referred to as a repurposed compute node. For additional information, see the section on repurposing compute nodes in Installing and Configuring Cray Linux Environment (CLE) Software (S–2444).

2.5.7.7 IP Implementation

Ethernet interfaces handle IP connectivity to external components. Both IPv4 and IPv6 are supported; IPv4 is the default.

Note: The IPv6 capability is limited to the Ethernet interfaces and localhost. Therefore, IPv6 connectivity is limited to service nodes that have Ethernet cards installed. Routing of IPv6 traffic between service nodes across the HSN is not supported.

2.6 Compute Nodes

Cray XC30 system compute nodes run the CNL compute node operating system. CNL is a lightweight compute node operating system. It includes a run-time environment based on the SLES 11 SP2 distribution, with a Linux 3.0.58 kernel and with Cray specific modifications. Device drivers for hardware not supported on Cray systems were eliminated from the kernel. CNL is designed for scalability; only the features required to run high-performance computing applications are available on CNL compute nodes. Other features and services are available from service nodes. Cray has configured and tuned the kernel to minimize processing delays caused by inefficient synchronization. CNL compute node kernels are configured to enable Non-Uniform Memory Access (NUMA), which minimizes traffic between sockets by using socket-local memory whenever possible. CNL also includes a set of supported system calls and standard networking.


Several libraries and compilers are linked at the user level to support I/O and communication service. The Cray, PGI, PathScale, Intel, and GNU Compiler Collection (GCC) C and C++ compilers are supported. For information about using modulefiles and configuring the default programming environment, see About Modules and Modulefiles on page 119 and Configuring the Default Programming Environment (PE) on page 121. For information about the libraries that Cray systems host, see the Cray Application Developer's Environment User's Guide (S–2396).

The Resiliency Communication Agent (RCA) daemon, rcad_svcs, handles node services (see Services on the Service Nodes on page 37).

The Application Level Placement Scheduler (ALPS) handles application launch, monitoring, and signaling and coordinates batch job processing with third-party batch systems. If you are running ALPS, use the xtnodestat command to report job information. The following user-level BusyBox commands are functional on CNL compute nodes: ash, busybox, cat, chmod, chown, cp, cpio, free, grep, gunzip, kill, killall, ln, ls, mkdir, mktemp, more, ps, rm, sh, tail, test, vi, and zcat. For information about supported command options, see the busybox(1) man page.

The following administrator-level busybox commands and associated options are functional on CNL compute nodes:

• dmesg -c -n -s
• fuser -m -k -s -4 -6 -SIGNAL
• logger -s -t -p
• mount -a -f -n -o -r -t -w
• ping -c -s -q
• sysctl -n -w -p -a -A
• umount -a -n -r -l -f -D

A compute-node failure affects only the job running on that node; the rest of the system continues running.

The CLEinstall program creates /var/opt/cray/install/shell_bootimage_LABEL.sh, which uses the xtclone and xtpackage utilities on the SMW. Use these commands to set up boot images. You can boot CNL on compute nodes. For more information, see Preparing a Service Node and Compute Node Boot Image on page 61, the xtclone(8), xtpackage(8), and xtnodestat(8) man pages, and Installing and Configuring Cray Linux Environment (CLE) Software (S–2444).
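For illustration only, assuming a boot image label of LABEL chosen during installation, the generated script can be run from the SMW to rebuild the boot images; the exact script name and any options depend on your site's CLEinstall configuration:

smw:~ # /var/opt/cray/install/shell_bootimage_LABEL.sh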


2.7 Job Launch Commands

Users run applications from a login node and use the aprun command to launch CNL applications. The aprun command provides options for automatic and manual application placement. With automatic job placement, aprun distributes the application instances on the number of processors requested, using all of the available nodes. With manual job placement, users can control the selection of the compute nodes on which to run their applications. Users select nodes on the basis of desired characteristics (node attributes), allowing a placement scheduler to schedule jobs based on the node attributes. To provide the application launcher with a list of nodes that have a particular set of characteristics (attributes), the user invokes the cnselect command to specify node-selection criteria. The cnselect script uses these selection criteria to query the table of node attributes in the SDB; then it returns a node list to the user based on the results of the query. For an application to be run on CNL compute nodes, the nodes satisfying the requested node attributes are passed by the aprun utility to the ALPS placement scheduler as the set of nodes from which to make an allocation. For detailed information about ALPS, see Chapter 8, Using the Application Level Placement Scheduler (ALPS) on page 259. For more information about the aprun and cnselect commands, see the aprun(1) and cnselect(8) man pages.
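For illustration only, a user might select candidate nodes and then launch a hypothetical executable on them as follows; the attribute name, the returned node list, and the process count are all illustrative and vary by system:

login:~> cnselect numcores.eq.32
20-29
login:~> aprun -n 64 -L 20-29 ./my_app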

2.8 Node Health Checker (NHC)

NHC is automatically invoked by ALPS upon the termination of an application. ALPS passes a list of CNL compute nodes associated with the terminated application to NHC. NHC performs specified tests to determine if compute nodes allocated to the application are healthy enough to support running subsequent applications. If not, it removes any compute nodes incapable of running an application from the resource pool. The CLE installation and upgrade processes automatically install and enable NHC software; there is no need for you to change any installation configuration parameters or issue any commands. To configure NHC tests and to optionally configure NHC to use the secure sockets layer (SSL) protocol, see Configuring Node Health Checker (NHC) on page 164.

2.9 Comprehensive System Accounting (CSA)

Comprehensive System Accounting (CSA) is open-source software that includes changes to the Linux kernel so that CSA can collect more types of system resource usage data than under standard Fourth Berkeley Software Distribution (BSD) process accounting. CSA software also contains interfaces for the Linux process aggregates (paggs) and jobs software packages. The CSA software package includes accounting utilities that perform standard types of system accounting processing on the CSA generated accounting files. CSA, with Cray modifications, is included with CLE and runs on login nodes and CNL compute nodes only. For more information, see Chapter 9, Using Comprehensive System Accounting on page 285.

2.10 Optional Workload-management (Batch) System Software Products

For information about optional batch systems software products for Cray systems, see the following websites.

PBS Professional: Altair Engineering, Inc., http://www.pbsworks.com
Moab and TORQUE: Cluster Resources, Inc., http://www.clusterresources.com
Platform LSF: Platform Computing Corporation, http://www.platform.com

Note: Specific third-party batch system software releases are required for Checkpoint/Restart (CPR) support. For more information, access the 3rd Party Batch SW link on the CrayPort website at http://www.crayport.cray.com.

2.11 Hardware Supervisory System (HSS)

The HSS is an integrated system of hardware and software that monitors the hardware components of the system and proactively manages the health of the system. The HSS communicates with nodes and with the management processors over an internal (private) Ethernet network that operates independently of the system interconnection network. The HSS data is stored on an internal hard drive of the SMW.

For a complete list of Cray developed HSS commands, see Appendix A, SMW and CLE System Administration Commands on page 339.


The HSS includes the following components:

• The HSS network (see HSS Network on page 44).
• The HSS interface (see HSS Interface on page 44).

• Blade and cabinet control processors (see Blade Controllers and Cabinet Controllers on page 45).

• Network Time Protocol (NTP) server (see NTP Server on page 45).

• Event router (see Event Router on page 46).
• HSS managers (see HSS Managers on page 46).

• xtdiscover command (see Automatically Discover and Configure Cray System Hardware on page 49).

• Various logs (see Event Logs on page 50, Boot Logs on page 50, Dump Logs on page 50).

2.11.1 HSS Network

The SMW, with its HSS Ethernet network, performs reliability, accessibility, and serviceability tasks. The HSS commands monitor and control the physical aspects of the system.

The SMW manages the HSS network. A series of Ethernet switches connects the SMW to all the cabinets in the system.

2.11.2 HSS Interface

The HSS has a command-line interface to manage and monitor your system. You can use the command-line interface to manage your Cray system from the SMW. For usage information, see Chapter 3, Managing the System on page 61 and Chapter 4, Monitoring System Activity on page 93. For a list of all HSS system administration commands, see Appendix A, SMW and CLE System Administration Commands on page 339.


2.11.3 Blade Controllers and Cabinet Controllers

A blade control processor (blade controller) is hierarchically the lowest component of the monitoring system. One blade controller resides on each compute blade and service blade, monitoring only the nodes and ASICs. It provides access to status and control registers for the components of the blade. The blade controller also monitors the general health of components, including items such as voltages, temperature, and other failure indicators. A version of Linux optimized for embedded controllers runs on each blade controller.

Note: In some contexts, the blade controller is referred to as a slot.

On Cray XC30 systems, the blade controller is referred to as the BC.

Each cabinet has a cabinet control processor (cabinet controller) that monitors and controls the cabinet power and cooling equipment and communicates with all the blade controllers in the cabinet. It sends a periodic heartbeat to the SMW to indicate cabinet health.

The cabinet controller connects to the chassis controller and in turn the chassis controller connects to the blade controllers (via the backplane) on each blade by Ethernet cable and routes HSS data to and from the SMW. The cabinet controller runs embedded Linux. The monitoring system operates by periodic heartbeats. Processes send heartbeats within a time interval. If the interval is exceeded, the system monitor generates a fault event that is sent to the state manager. The fault is recorded in the event log, and the state manager (see State Manager on page 47) sets an alert flag for the component (blade controller or cabinet controller) that spawned it.

The cabinet and blade controllers use ntpclient to keep accurate time with the SMW.

You can dynamically configure the cabinet controller system daemon and the blade controller system daemon with the xtdaemonconfig --daemon_name command (see the xtdaemonconfig(8) man page for detailed information).
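For example, a query of a single daemon's settings might look like the following sketch; the daemon name shown is illustrative, and whether an invocation with no further arguments simply displays the current configuration is an assumption to verify against the xtdaemonconfig(8) man page:

smw:~> xtdaemonconfig --daemon_name=state_manager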

Note: There is no NV write protection feature on the cabinet and blade controllers; you should not assume the write protection functionality on the cabinet controller front panel display will protect the NV memory on the cabinet and blade controllers.

2.11.4 NTP Server

The SMW workstation is the primary NTP server for the Cray system. The blade controllers use the HSS network to update themselves according to the NTP protocol. To change the NTP server, see Configuring the SMW to Synchronize to a Site NTP Server on page 223.


2.11.5 Event Router

HSS functions are event-driven. The event router daemon, erd, is the root of the HSS. It is a system daemon that runs on the SMW, cabinet controllers, and blade controllers. The SMW runs a separate thread for each cabinet controller. The cabinet controller runs a separate thread for each blade controller. HSS managers subscribe to events and inject events into the HSS system by using the services of the erd. (For descriptions of HSS managers, see HSS Managers on page 46.) The event router starts as each of the devices (SMW, cabinet controller, blade controller) is started. When the event router on the SMW receives an event from either a connected agent or from another event router in the hierarchy, the event is logged and then processed. The xtcli commands, which are primary HSS control commands, also access the event router to pass information to the managers.

The xtconsumer command (see Monitoring Events on page 102) monitors the erd. The xtconsole command (see Monitoring Node Console Messages on page 103) operates a shell window that displays all node console messages.
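For example, each of these monitors can be started in its own window on the SMW, shown here without options; both commands accept options described on their man pages:

smw:~> xtconsumer
smw:~> xtconsole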

2.11.6 HSS Managers

HSS managers are located in /opt/cray/hss/default/etc. They report to the event router and get information from it.

The HSS managers are started by running the /etc/init.d/rsms start command.

You can configure HSS daemons dynamically by executing the xtdaemonconfig command. For a list of the HSS daemons, see the xtdaemonconfig(8) man page. This section highlights the following key HSS daemons:

• state manager
• boot manager
• system environmental data collections (SEDC) manager
• NID manager


2.11.6.1 State Manager

Every component has a state at all times. The state manager, state_manager, runs on the SMW and uses a relational database (also referred to as the HSS database) to read and write the system state. The state manager keeps the database up to date with the current state of components and retrieves component information from the database when needed. In addition, the dynamic system state persists between boots. The state manager performs the following functions:

• Updates and maintains component state information (see Appendix B, System States on page 347)
• Monitors events to update component states

• Detects and handles state notification upon failure

• Provides state and configuration information to HSS applications so that they do not interfere with other applications working on the same component

The state manager listens to the erd, records changes of states, and shares those states with other daemons.

2.11.6.2 Boot Manager

The boot manager, bootmanager, runs on the SMW. It controls the acts of placing kernel data into node memories and requesting that they begin booting.

During the boot process, the state manager provides state information that allows the nodes to be locked for booting. After the nodes boot, the state manager removes the locks and notifies the boot manager. The boot manager logging facility includes a timestamp on log messages.

2.11.6.3 System Environmental Data Collections (SEDC) Manager

The System Environment Data Collections (SEDC) manager, sedc_manager, monitors the system's health and records the environmental data and status of hardware components such as power supplies, processors, temperature, and fans. SEDC can be set to run at all times or only when a client is listening. The SEDC configuration file provided by Cray has automatic data collection set as the default action.

The SEDC configuration file (/opt/cray/hss/default/etc/sedc_srv.ini by default) configures the SEDC server. In this file, you can also create sets of different configurations as groups so that the blade and cabinet controller daemons can scan components at different frequencies. The sedc_manager sends out the scanning configuration for specific groups to the cabinet and blade controllers and records the incoming data by group. For information about configuring the SEDC manager, see Using and Configuring System Environment Data Collections (SEDC) and the sedc_manager(8) man page.


To view System Environment Data Collections (SEDC) scan data, use the xtsedcviewer command-line interface. This utility allows you to view the server configurations (groups) as well as the SEDC scan data from blade and cabinet controllers. For information about viewing SEDC server configuration and the SEDC scan data, see Using and Configuring System Environment Data Collections (SEDC) and the xtsedcviewer(8) man page.

2.11.6.4 NID Manager

The NID (node ID) manager, nid_mgr, runs on the SMW and provides a NID mapping service for the rest of the HSS environment. Along with the ability to assign NIDs automatically, the nid_mgr supports a mechanism that allows an administrator to control the NID assignment; this is useful for handling unique configurations. Administrator-controlled NID assignment is accomplished through the nids.ini NID assignment file.

Caution: The nids.ini file can have a major impact on the functionality of a Cray system and should only be used or modified at the recommendation of Cray support personnel. Setting up this file incorrectly could make the Cray system unroutable.

Typically, after a NID mapping is defined for a system, this mapping is used until some major event occurs, such as a hardware configuration change (see Updating the System Configuration After A Blade Change on page 215). This may require the NID mapping to change, depending on the nature of the configuration change. Adding additional cabinets to the ends of rows does not typically result in a new mapping. Adding additional rows most likely does result in a new mapping. If the configuration change is such that the topology class of the system is changed, this will require a new NID mapping. Otherwise, the NID mapping remains static.

The nid_mgr generates a list of mappings between the physical location and Network Interface Controller ID (NIC ID) and distributes this information to the blade controllers. Because the operating system always uses node IDs (NIDs), the HSS converts these to NIC IDs when sending them onto the HSS network and back to NIDs when forwarding events from the HSS network to a node. For more information about node IDs, see Identifying Components on page 51.


2.11.7 Automatically Discover and Configure Cray System Hardware

The xtdiscover command automatically discovers the hardware components on a Cray system and creates entries in the system database to reflect the current hardware configuration. The xtdiscover status command can correctly identify missing or nonresponsive cabinets, empty or nonfunctioning slots, the blade type (service or compute) in each slot, and the CPU type and other attributes of each node in the system. The xtdiscover command and the state manager ensure that the system status represents the real state of the hardware. When it has finished, you can use the xtcli command to display the current configuration. No previous configuration of the system is required; the hardware is discovered and made available, and you can modify the components after xtdiscover has finished creating entries in the system database. The xtdiscover interface steps a system administrator through the discovery process. The xtdiscover.ini file allows you to predefine values such as topology class, cabinet layout, and so on. A template xtdiscover.ini file is installed with the SMW software. The default location of the file is /opt/cray/hss/default/etc/xtdiscover.ini.
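For example, after a hardware change an administrator might rerun the discovery interactively and then display the resulting configuration for the whole system (s0); output is omitted here:

smw:~> xtdiscover
smw:~> xtcli status s0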

Note: When xtdiscover creates a default partition, it uses c0-0c0s0n1 as the default for the boot node and c0-0c0s1n1 as the default SDB node.

The xtdiscover command does not use or configure the Cray High Speed Network (HSN). The HSN configuration is done when booting the system with the xtbootsys command.

If there are changes to the system hardware, such as adding a new cabinet or removing a blade and replacing it with a blade of a different type (for example, a service blade that is replaced with a compute blade), then xtdiscover must be executed again, and it will perform an incremental discovery of the hardware changes without disturbing the rest of the system. For more information, see the xtdiscover(8) man page.

2.11.8 Cray System Network Routing Utility

Use the rtr command to perform a variety of routing-related tasks. The rtr command is also invoked as part of the xtbootsys process. For more information, see the rtr(8) man page.

2.11.9 Log Files

For more information about examining log files, see Managing Log Files Using CLE and HSS Commands on page 94.


2.11.9.1 Event Logs

The event router records events to the event log in the /var/opt/cray/log/event-yyyymmdd file. When the log grows beyond a reasonable size, it turns over and its contents are stored in a numbered file in the directory.

2.11.9.2 Boot Logs

The /var/opt/cray/log/session-id directory is a repository for files collected by commands such as xtbootsys, xtconsole, xtconsumer, and xtnetwatch for the currently booted session. To determine the current sessionid, see the xtsession(8) man page.

Note: A symbolic link is created from /var/opt/cray/log/partition-current to the currently booted session directory.

2.11.9.3 Dump Logs

The /var/opt/cray/dump directory is a repository for files collected by the xtdumpsys command. It contains time-stamped dump files.

For more information about examining log files, see Managing Log Files Using CLE and HSS Commands on page 94.

2.12 Storage

All Cray XC30 systems require RAID storage. RAID storage consists of one or more physical RAID subsystems; a RAID subsystem is defined as a pair of disk controllers and all disk modules that connect to the controllers. Functionally, there are two types of RAID subsystems: system RAID (also referred to as boot RAID) and file system RAID. The system RAID stores the boot image and system files and is also partitioned for database functionality, while the file system RAID stores user files.

File system RAID subsystems use the Lustre file system. Lustre offers high performance scalable parallel I/O capabilities, POSIX semantics, and scalable metadata access. For more information on Lustre file system configuration, see Managing Lustre for the Cray Linux Environment (CLE) (S–0010).


Cray offers RAID subsystems from two vendors: DataDirect Networks (DDN) and NetApp (formerly Engenio, which was formerly LSI). All DDN RAID subsystems function as dedicated file system RAID, while NetApp RAID subsystems can function as dedicated file system RAID, a dedicated system RAID, or a combination of both. Different controller models support Fibre Channel (FC), Serial ATA (SATA), and Serial Attached SCSI (SAS) disk options. In addition to vendor solutions for file system RAID, Cray offers an integrated file system, software and storage product, the Sonexion. RAID devices are commonly configured with zoning so that only appropriate service nodes see the disk devices (LUNs) for the services that will be provided by each node; this is done in order to reduce the possibility of accidental or unauthorized access to LUNs.

Caution: Because the system RAID disk is accessible from the SMW, the service database (SDB) node, the boot node, and backup nodes, it is important that you never mount the same file system in more than one place at the same time. If you do so, the Linux operating system will corrupt the file system.

For more information about configuring RAID, see the documentation for your site's particular RAID setup.

2.13 Other Administrative Information

This section contains additional information that is helpful for the administrator.

2.13.1 Identifying Components

System components (nodes, blades, chassis, cabinets, etc.) are named and located by node ID, IP address, physical ID, or class number. Some naming conventions are specific to CLE.

2.13.1.1 Physical ID

The physical ID identifies the cabinet's location on the floor and the component's location in the cabinet as seen by the HSS. The table below shows the physical ID naming conventions. Descriptions assume that you are standing in front of the system cabinets.


Table 1. Physical ID Naming Conventions

• SMW (s0, all): All components attached to the SMW. For example, xtcli power up s0 powers up all components attached to the SMW.
• cabinet (cX-Y): Compute/service cabinet, cabinet controller hostname. X is 0 to 63; Y is 0 to 15. Not used for blower cabinets. For example: c12-3 is cabinet 12 in row 3.
• compute/service cabinet controller HSS microcontroller (cX-YmM): Compute/service cabinet controller HSS microcontroller; M is 0.
• power rectifier module within a cabinet (cX-YrR): Power rectifier module within a cabinet; R is 0 to 63.
• cabinet controller (CC) FPGA (cX-YfF): Cabinet controller (CC) FPGA; F is 0.
• blower cabinet (bX-Y): Blower cabinet, cabinet controller hostname (if applicable). X is 0 to 63; Y is 0 to 15. For example: b12-3 is blower cabinet 12 in row 3.
• blower cabinet controller (bX-YmM): Blower cabinet, cabinet controller; M is 0.
• blower within a blower cabinet (bX-YbB): Blower within a blower cabinet; B is 0-5.
• chassis (cX-YcC): Physical unit within cabinet cX-Y; cC is the chassis number and C is 0-2. Chassis are numbered bottom to top. For example: c0-0c2 is chassis 2 of cabinet c0-0.
• chassis host controller (cX-YcCmM): Chassis host controller; M is 0.
• chassis host FPGA (cX-YcCfF): Chassis host FPGA; F is 0.
• blade or slot (cX-YcCsS): Physical unit within a slot of a chassis cX-YcC; sS is the slot number of the blade and S is 0-15. For example: c0-0c2s4 is slot 4 of chassis 2 of cabinet c0-0, and c0-0c2s* is all slots (0...15) of chassis 2 of cabinet c0-0.
• optical controller group (cX-YcCoO): Optical controller groups are associated with slots by multiplying the controller number by 2 (and optionally adding 1); O is 0-7.
• individual optical controller (cX-YcCoOxX): Individual optical controller within an optical controller group; X is 0-4.
• L0D FPGA within a base blade (cX-YcCsSfF): L0D FPGA within a base blade; F is 0.
• Aries ASIC (cX-YcCsSaA): Aries ASIC within a base blade. There is only one Aries ASIC per blade, and all nodes on the blade connect to it. aA is the location of the ASIC within the blade and A is 0. For example: c0-1c2s3a0.
• Aries NIC (cX-YcCsSaAnNIC): NIC (Network Interface Controller) within an Aries ASIC; NIC is 0-3. For example: c0-1c2s3a0n1.
• LCB tile row/column (cX-YcCsSaAlRCol): LCB tile row/column. Row 5 is all processor tiles; all other rows contain only HSN tiles. Note the octal numbering. R is 0-5 and Col is 0-7.
• SerDes macro associated with an LCB (cX-YcCsSaAmRCol): SerDes macro associated with an LCB. Note the octal numbering. R is 0-5 and Col is 0-7.
• SerDes macro network processor associated with an LCB (cX-YcCsSaApRCol): SerDes macro network processor associated with an LCB. Note the octal numbering. R is 0-5 and Col is 0-7.
• Aries ASIC VRM (cX-YcCsSaAvV): Aries ASIC VRM; V is 0.
• Processor Daughter Card (PDC) within a base blade (cX-YcCsSpP): Processor Daughter Card within a base blade; P is 0-3.
• quad Processor Daughter Card within a base blade (cX-YcCsSqQ): Quad Processor Daughter Card within a base blade; Q is 0-1.
• L0C FPGA within a PDC (cX-YcCsSpPfF): L0C FPGA within a PDC; F is 0.
• L0C FPGA within a Quad PDC (cX-YcCsSqQfF): L0C FPGA within a Quad PDC; F is 0.
• VRM within a PDC associated with a processor socket (cX-YcCsSpPvV): VRM within a PDC associated with a processor socket; V is 0-1.
• SouthBridge chip within a PDC (cX-YcCsSpPsSouthBridge): SouthBridge chip within a PDC; SouthBridge is 0.
• SouthBridge chip within a Quad PDC (cX-YcCsSqQsSouthBridge): SouthBridge chip within a Quad PDC; SouthBridge is 0-1.
• blade controller HSS microcontroller within a base blade (cX-YcCsSmM): Blade controller HSS microcontroller within a base blade (not the blade controller CPU); M is 0.
• node (cX-YcCsSnN): Physical node on a base blade; nN is the location of the node and N is 0-3. For example: c0-0c2s4n0 is node 0 on blade 4 of chassis 2 in cabinet c0-0, and c0-0c2s4n* is all nodes on blade 4 of chassis 2 of cabinet c0-0.
• processor socket associated with a physical node (cX-YcCsSnNsSocket): Processor socket associated with a physical node; Socket is 0-1.
• DIMM associated with a processor socket (cX-YcCsSnNsSocketmM): DIMM associated with a processor socket; M is 0-7.
• VDD VRM associated with processor socket (cX-YcCsSnNsSocketvV): VDD VRM associated with a processor socket; V is 0.
• VDR VRM associated with processor socket (cX-YcCsSnNsSocketrR): VDR VRM associated with a processor socket; R is 0.
• die within a processor socket (cX-YcCsSnNsSocketdD): Die within a processor socket; D is 0-3.
• core within a die (cX-YcCsSnNsSocketdDcCore): Core within a die; Core is 0-63.
• memory controller within a die (cX-YcCsSnNsSocketdDmM): Memory controller within a die; M is 0-3.
• logical machine (partition) (p#): A partition is a group of components that make up a logical machine. Logical systems are numbered from 0 to the maximum number of logical systems minus one. A configuration with 32 logical machines would be numbered p0 through p31 (see Logical Machines on page 59). p0, however, is reserved to refer to the entire machine as a partition.

2.13.1.2 Node ID (NID)

The node ID (NID) is a decimal numbering of all CLE nodes. NIDs are sequential numberings of the nodes starting in cabinet c0-0. Each additional cabinet continues from the highest value of the previous cabinet; so, cabinet 0 has NIDs 0-191, cabinet 1 has NIDs 192-383, and so on.

A cabinet contains three chassis; chassis 0 is the lower chassis in the cabinet. Each chassis contains sixteen blades and each blade contains four nodes. The lowest numbered NID in the cabinet is in chassis 0 slot 0 (lower left corner); slots (blades) are numbered left to right (slot 0 to slot 15, as you face the front of the cabinet). NID numbering begins in cabinet 0, chassis 0, slot 0 with NIDs 0, 1, 2, and 3; NIDs 4, 5, 6, and 7 are in slot 1, and this numbering scheme continues to slot 15 and then moves up to chassis 1 and so on.
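Following the numbering scheme described above, each chassis accounts for 16 blades x 4 nodes = 64 NIDs and each cabinet for 3 x 64 = 192 NIDs, so, as an illustrative calculation, node 2 on the blade in slot 5 of chassis 1 of cabinet 0 is:

NID = (chassis x 64) + (slot x 4) + node = (1 x 64) + (5 x 4) + 2 = 86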

Use the xtnid2str command to convert a NID to a physical ID. For information about using the xtnid2str command, see the xtnid2str(8) man page. To convert a physical ID to a NID number, you can use the rtr --system-map command and filter the output. For example:

crayadm@smw:~> rtr --system-map | grep c1-0c0s14n3 | awk '{ print $1 }'
251

Use the nid2nic command to print the nid-to-nic_address mappings, nic_address-to-nid mappings, and a specific physical_location-to-nic_address and nid mappings. For information about using the nid2nic command, see the nid2nic(8) man page.


2.13.1.3 Class Name

Class names are a CLE construct. The /etc/opt/cray/sdb/node_classes file is created as part of the system installation, based on the node_class* settings defined in CLEinstall.conf. During the boot process, the service_processor database table is populated from the /etc/opt/cray/sdb/node_classes file, which can be changed if you add or remove nodes (see Changing Nodes and Classes on page 199).

Note: It is important to keep node class settings in CLEinstall.conf and /etc/opt/cray/sdb/node_classes consistent in order to avoid errors during update or upgrade installations (see Installing and Configuring Cray Linux Environment (CLE) Software (S–2444)).

The only restriction on how you name the classes is that the class name must be valid (e.g., the name of a directory). There is no restriction on how many classes you specify; however, you must use the same class names when you invoke the xtspec specialization command (see Specializing Files on page 134). Change the class of a node (see Changing the Class of a Node on page 139) when you change its function, for example, when you have added an additional login node.

The /etc/opt/cray/sdb/node_classes file describes the nodes associated with each class.

Example 1. Sample /etc/opt/cray/sdb/node_classes file

# node:classes
0:service
1:service
8:login
9:service

2.13.2 Topology Class

Each Cray system is given a topology class based on the number of cabinets and their cabling. Some commands, such as xtbounce, let you specify topology class as an option. You can see the class value of your system in a number of places, such as xtcli status output, rca-helper -o command output (rca-helper is run from a Cray node), or by using the xtclass command from the SMW:

smw:~> xtclass
1

2.13.3 Persistent /var Directory

You can set up a persistent, writable /var directory on each service node served with NFS. The boot node has its own root file system and its own /var directory; the boot node /var is not part of the NFS exported /snv file system.


Because the Cray system root file system is read-only, some subdirectories of /var are mounted on tmpfs (memory) and not on disk. Persistent /var retains the contents of /var directories between system boots. If persistent_var=yes in CLEinstall.conf, CLEinstall configures the correct values for VAR_SERVER, VAR_PATH, and VAR_MOUNT_OPTIONS in the /etc/sysconfig/xt file during installation so the service nodes will NFS mount the proper path at boot time.
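As an illustrative sketch only (CLEinstall generates the real values, which are site-specific), the persistent /var settings in /etc/sysconfig/xt take a form similar to:

VAR_SERVER=boot            # hypothetical NFS server for /var
VAR_PATH=/snv/var          # hypothetical exported path
VAR_MOUNT_OPTIONS="rw,tcp" # hypothetical NFS mount options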

Boot scripts and the xtopview utility (see Managing System Configuration with the xtopview Tool on page 131) respect these configuration values and mount the correct /var directory. For more information, see the Installing and Configuring Cray Linux Environment (CLE) Software (S–2444).

2.13.4 Default Network IP Addresses

The default IP addresses for network components are described in the Installing Cray System Management Workstation (SMW) Software (S–2480).

2.13.5 /etc/hosts Files

The host file on the boot node is for the HSN and external hosts accessible from login and network nodes. The hosts file on the SMW is for the HSS network. The xtcdr2proc utility takes information from the Resiliency Communication Agent (RCA) to build the /etc/hosts file on the boot node. The /etc/hosts file on the boot node maps IP addresses to node IDs on the system interconnection network (see Identifying Components on page 51). The file can also contain aliases for the physical ID location of the system interconnection network components and class names. The /etc/hosts file is updated or created at boot time and contains the default hostname mappings as well as service and HSS names. The upper octets typically range from 10.128.0.0 to 10.131.255.255. Lower octets for nodes are derived from their NID. The NID is a sequential numbering of nodes from cabinet 0 up.

The /etc/hosts file on the boot node is generated at boot time to include the compute nodes. Also, the installation and upgrade process modifies the /etc/hosts file on the boot root to include compute nodes if they are not included.

The /etc/hosts file on the SMW contains physIDs (physical IDs that map to the physical location of HSS network components), such as the blade and cabinet controllers (see Physical ID on page 51). The default system IP addresses are shown in the Installing Cray System Management Workstation (SMW) Software (S–2480).


The xtdb2etchosts command converts service information in the SDB to an /etc/hosts style file. The resulting /etc/hosts file has lines of the following form, where the first column is the IP address, the second column is the NID, and the third column is the service type and class ID of the node:

10.131.255.254 nid12345 boot001
10.131.255.253 nid67890 boot002
10.131.255.252 nid55512 login001

The service configuration table (service_config) in the SDB XTAdmin database provides a line for each service IP address of the following form, where SERV1 and SERV2 are the service names in the service_config table:

1.2.3.1 SERV1
1.2.3.2 SERV2

Note: Each time you update or upgrade your CLE software, CLEinstall verifies the content of /etc/opt/cray/sdb/node_classes and modifies /etc/hosts to match the configuration specified in your CLEinstall.conf file. For additional detail about how CLEinstall modifies the /etc/hosts file, see Installing and Configuring Cray Linux Environment (CLE) Software (S–2444).

The xtdb2etchosts command is documented on the xtdb2etchosts(8) man page.

2.13.6 Realm-Specific IP Addressing (RSIP) for Compute Nodes

Realm-Specific IP Addressing (RSIP) allows compute nodes and the service nodes to share the IP addresses configured on the external Gigabit and 10 Gigabit Ethernet interfaces of network nodes. By sharing the external addresses, you may rely on your system's use of private address space and do not need to configure compute nodes with addresses within your site's IP address space. The external hosts see only the external IP addresses of the Cray system.

To configure RSIP for compute nodes, see Configuring Realm-specific IP Addressing (RSIP) on page 209.

2.13.7 Security Auditing

Cray Audit is a set of Cray-specific extensions to standard Linux security auditing. When Cray Audit is configured, separate logs are generated for each audited node on a Cray system. Cray-specific utilities simplify administration of auditing options and log files across a large number of nodes. For more information, see Security Auditing and Cray Audit Extensions on page 150.


2.13.8 Logging Failed Login Attempts

The cray_pam module is a Pluggable Authentication Module (PAM) that, when configured, provides information to the user at login time about any failed login attempts since their last successful login. For more information, see Using the cray_pam PAM to Log Failed Login Attempts on page 155.

2.13.9 Logical Machines

You can subdivide a single Cray system into two or more logical machines (partitions), which can then be run as independent systems. An operable logical machine has its own compute nodes and service nodes, external network connections, boot node, and SDB node. Each logical machine can be booted and dumped independently of the other logical machines. Once booted, a logical machine appears as a normal Cray system to the users, limited to the set of hardware included for the logical machine.

The HSS is common across all logical machines. Because logical machines apply from the system interconnection network layer and up, the HSS functions continue to behave as a single system for power control, diagnostics, low-level monitoring, and so on. In addition,

• Cray recommends that you do not configure more than one logical machine per cabinet. That way, if you power down a cabinet, you do not affect more than one logical machine. A logical machine can include more than one cabinet.

• A job is limited to running within a single logical machine.

• Although the theoretical maximum of allowable logical machines per physical Cray system is 31 logical machines (as p0 is the entire system), you must consider your hardware requirements to determine a practical number of logical machines to configure.

• You can run only a single instance of SMW software.

• Boot and routing commands affect only a single logical machine. To create logical machines, see Creating Logical Machines on page 190.


Managing the System [3]

Important: SCSI device names (/dev/sd*) are not guaranteed to be numbered the same from boot to boot. This inconsistency can cause serious system problems following a reboot. When installing CLE, you must use persistent device names for file systems on your Cray system. This does not apply to SMW disks. For additional information, see Using Persistent SCSI Device Names on page 256.

3.1 Connecting the SMW to the Console of a Service Node

The xtcon command is a console interface for service nodes. When it is executing, the xtcon command provides a two-way connection to the console of any running service node.

See the xtcon(8) man page for additional information.

3.2 Logging On to the Boot Node

The standard Cray configuration has a gigabit Ethernet connection between the SMW and boot node. You can access the other nodes on the Cray system from the boot node.

Procedure 1. Logging on to the boot node

• From the SMW, log on to the boot node.

crayadm@smw:~> ssh boot
crayadm@boot:~>

Note: You can open an administrator window on the SMW to access the boot node:

crayadm@smw:~> xterm -ls -vb -sb -sl 2049 &

After the window opens, use it to ssh to the boot node.

3.3 Preparing a Service Node and Compute Node Boot Image

This section describes how to prepare a Cray service node and compute node boot image.


A boot image is an archive containing all the components necessary to boot Cray service nodes and compute nodes. In general, a boot image contains the operating system kernel, ramdisk, and boot parameters used to bring up a node. A single boot image can contain multiple sets of these files to support booting service nodes and compute nodes from the same boot image as well as booting different versions of compute node operating systems. The operating systems supported by a particular boot image are described through load files. A load file is simply a manifest of operating system components to include (represented as files) and load address information to provide to the boot loader. Load files should not be edited by the administrator.

Cray system compute and service nodes use a RAM disk for booting. Service nodes and compute nodes use the same initramfs format and work space environment. This space is created in /opt/xt-images/machine-xtrelease-LABEL[-partition]/nodetype, where machine is the Cray hostname, xtrelease is the CLE release level, LABEL is the system set label in /etc/sysset.conf, partition describes a system partition and is typically omitted if partitions are not used, and nodetype is either compute or service.
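For example, assuming a hypothetical machine name xthost, the 5.0.41 release level, the system set label BLUE, and no partitions (all values illustrative), the service node work space would be /opt/xt-images/xthost-5.0.41-BLUE/service and the compute node work space would be /opt/xt-images/xthost-5.0.41-BLUE/compute.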

3.3.1 Using shell_bootimage_LABEL.sh to Prepare Boot Images

The CLEinstall installation program creates a /var/opt/cray/install/shell_bootimage_LABEL.sh script on the SMW. This script is unique to the system set label you installed, based on settings in the CLEinstall.conf and /etc/sysset.conf installation configuration files. You can re-use this script to automate some of the steps for creating boot images.

Procedure 2. Preparing a boot image for CNL compute nodes and service nodes

Invoke the shell_bootimage_LABEL.sh script to prepare boot images for the system set with the specified LABEL. This script uses xtclone and xtpackage to prepare the work space in /opt/xt-images.


shell_bootimage_LABEL.sh accepts the following options:

-v Run in verbose mode.

-T Do not update the default template link.

-h Display help message.

-c Create and set the boot image for the next boot. The default is to display xtbootimg and xtcli commands that will generate the boot image. Use the -c option to invoke these commands automatically.

-b bootimage

Specify bootimage as the boot image disk device or file name. The default bootimage is determined by using values for the system set LABEL when CLEinstall was executed. Use this option to override the default and manage multiple boot images.
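For example, assuming the system set label BLUE and the raw device /raw0 (both illustrative), a hypothetical invocation that builds the boot image and sets it for the next boot in one step might be:

smw:~# /var/opt/cray/install/shell_bootimage_BLUE.sh -c -b /raw0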

Optionally, this script includes CNL_* parameters that you can use to modify the CNL boot image configuration you defined in CLEinstall.conf. Edit the script and set the associated parameter to y to load an optional RPM or change the /tmp configuration.

1. Execute shell_bootimage_LABEL.sh, where LABEL is the system set label specified in /etc/sysset.conf for this boot image. For example, if the system set label is BLUE, log on to the SMW as root and type:

smw:~# /var/opt/cray/install/shell_bootimage_BLUE.sh

On completion, the script displays the xtbootimg and xtcli commands required to build and set the boot image for the next boot. If you specified the -c option, the script invokes these commands automatically and you should skip the remaining steps in this procedure.

2. Create a unified boot image for compute and service nodes by using the xtbootimg command suggested by the shell_bootimage_LABEL.sh script.

In the following example, replace bootimage with the mountpoint for BOOT_IMAGE0 in the system set defined in /etc/sysset.conf. Set bootimage to either a raw device (for example, /raw0) or a file name (for example, /bootimagedir/bootimage.new).

Caution: If bootimage is a file, verify that the file exists in the same path on both the SMW and the boot root.


Type the following command:

smw:~# xtbootimg \
-L /opt/xt-images/machine-xtrelease-LABEL/compute/CNL0.load \
-L /opt/xt-images/machine-xtrelease-LABEL/service/SNL0.load \
-c bootimage

a. At the prompt 'Do you want to overwrite', type y to overwrite the existing boot image file.

b. If bootimage is a file, mount the boot node root file system to /bootroot0, copy the boot image file from the SMW to the same directory on the boot root, and then unmount the boot node root file system. If bootimage is a raw device, skip this step. For example, if the bootimage file is /bootimagedir/bootimage.new and bootroot_dir is set to /bootroot0, type these commands.

smw:~ # mount /dev/bootrootdevice /bootroot0
smw:~ # cp -p /bootimagedir/bootimage.new /bootroot0/bootimagedir/bootimage.new
smw:~ # umount /bootroot0

3. Set the boot image for the next system boot using the suggested xtcli command. The shell_bootimage_LABEL.sh program suggests an xtcli command to set the boot image based on the value of BOOT_IMAGE0 for the system set that you are using. The -i bootimage option specifies the path to the boot image and is either a raw device, for example, /raw0 or /raw1, or a file such as /bootimagedir/bootimage.new.

Caution: The next boot, anywhere on the system, uses the boot image you set here.

a. Display the currently active boot image. Record the output of this command. If the partition variable in CLEinstall.conf is s0, type:

smw:~# xtcli boot_cfg show

Or, if the partition variable in CLEinstall.conf is a partition value such as p0, p1, and so on, type:

smw:~# xtcli part_cfg show pN

b. Invoke xtcli with the update option to set the default boot configuration used by the boot manager.

If the partition variable in CLEinstall.conf is s0, type this command to select the boot image to be used for the entire system.

smw:~# xtcli boot_cfg update -i bootimage

Or


If the partition variable in CLEinstall.conf is a partition value such as p0, p1, and so on, type this command to select the boot image to be used for the designated partition.

smw:~# xtcli part_cfg update pN -i bootimage

3.3.2 Customizing Existing Boot Images

Cray recommends using Procedure 2 on page 62 to prepare production boot images. However, you may use the xtclone, xtpackage, and xtbootimg utilities on the SMW to modify existing compute node or service node images for the purpose of experimenting with custom options.

Note: You must have root privileges to invoke the xtclone and xtpackage commands.

You can customize a boot image on the SMW using a four-step process:

1. Execute the xtclone utility to create your new work area, which is copied from an existing work area.

2. In your new work area, make necessary changes, for example, install RPMs, edit configuration files, or add or remove scripts.

3. Execute the xtpackage utility to properly package the operating system components and prepare a load file for use by xtbootimg.

4. Execute the xtbootimg utility to create a boot image (an archive or cpio file) from your work area. The xtbootimg utility collects the components described by one or more load files into a single archive. The load files themselves are also included in the archive, along with other components, BIOS, and sources listed in the load file from xtpackage. The following is a sample service node load file (SNL0.load):

#NODES_REALLY_READY
SNL0/size-initramfs 0x9021C
#Kernel source: /opt/xt-images/hostname-5.0.41-LABEL-s0/service/boot/bzImage-3.0.58-0.6.6.1_1.0500.7272-cray_ari_s
SNL0/bzImage-3.0.58-0.6.6.1_1.0500.7272-cray_ari_s.bin 0x100000
#Parameters source: /opt/xt-images/hostname-5.0.41-LABEL-s0/service/boot/parameters-snl
SNL0/parameters 0x90800
SNL0/initramfs.gz 0xFA00000

To create load files that support, for example, different boot parameters or different RAM disk contents, use the xtpackage command with the -L option. Use the xtbootimg -L option to specify the path to the CNL compute node load file and the path to the service node load file.


Procedure 3. Creating a Cray boot image from existing file system images

1. Make copies of the compute-node-side and service-node-side of an existing work area.

Note: It is recommended that your work area be in a subdirectory of /opt/xt-images, as shown in the example.

smw:~ # xtclone /opt/xt-images/machine-xtrelease-LABEL/compute \
/opt/xt-images/test/compute
smw:~ # xtclone -s /opt/xt-images/machine-xtrelease-LABEL/service \
/opt/xt-images/test/service

2. Make any changes to your work area that are necessary for your site. For example, you can install or erase RPMs, change configuration files, or add or remove scripts. When you are finished making changes, wrap up (package) the compute-node-side and service-node-side of your work area. Use the xtpackage -s option to create a "service-node-only" boot image.

smw:~ # xtpackage /opt/xt-images/test/compute
smw:~ # xtpackage -s /opt/xt-images/test/service

Note: The xtpackage utility automatically creates an /etc/xt.snl file in service node initramfs. This allows compute node hardware to boot service node images, if necessary.

3. Finally, make a boot image (a cpio file) from your work area.

smw:~ # xtbootimg -L /opt/xt-images/test/service/SNL0.load \
-L /opt/xt-images/test/compute/CNL0.load \
-c /opt/xt-images/cpio/test/bootimage

Note: The directory path for bootimage must exist on both the SMW and the boot node, and the bootimage files in each location must be identical.

Some configurations export /opt/xt-images/cpio via NFS, so the SMW and the boot node can see the same files in /opt/xt-images/cpio, although this is not recommended for larger systems. Other configurations use a non-networked file system at /tmp/boot, in which case, you must put a copy of smw:/tmp/boot/bootimage.cpio at boot:/tmp/boot/bootimage. This is required for the boot node to be able to distribute bootimage to the other service nodes.
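For example, assuming the non-networked /tmp/boot configuration described above, one possible way to stage the copy (file names illustrative) is to use scp from the SMW:

smw:~# scp -p /tmp/boot/bootimage.cpio boot:/tmp/boot/bootimage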

For more information about these utilities, see the xtclone(8), xtpackage(8), and xtbootimg(8) man pages.


3.3.3 Changing Boot Parameters

Caution: Some of the default boot parameters are mandatory. The system may not boot if they are removed.

Updating the parameters passed to the Linux kernel requires recreating the boot image with the xtpackage and xtbootimg commands. You can either edit the files in the file system image or specify a path to a file containing parameters. If editing the files, the default service and compute node parameters can be found in boot/parameters-snl and boot/parameters-cnl, respectively.

Example 2. Making a boot image with new parameters for service and CNL compute nodes

smw:~ # xtpackage -s -p /tmp/parameters-service.new /opt/xt-images/test/service
smw:~ # xtpackage -p /tmp/parameters-compute.new /opt/xt-images/test/compute
smw:~ # xtbootimg -L /opt/xt-images/test/service/SNL0.load \
-L /opt/xt-images/test/compute/CNL0.load -c /raw0

3.4 Booting Nodes

This section describes how to manually boot your boot node, service nodes, and compute nodes. It also describes how to reboot a single compute node, and how to reboot login or network nodes.

For information about modifying boot automation files, see Modifying Boot Automation Files on page 193.

3.4.1 Booting the System

Use the xtbootsys command to manually boot your boot node, service nodes, and CNL compute nodes.

Note: You can also boot the system using both user-defined and built-in procedures in automation files, for example, auto.generic.cnl. Before you modify the auto.generic.cnl file, Cray recommends copying it first because it will be replaced by an SMW software upgrade. For related procedures, see Installing and Configuring Cray Linux Environment (CLE) Software.


The xtbootsys command prevents unintentional booting of currently booted partitions. If a boot automation file is being used, xtbootsys checks that file to determine if the string shutdown exists within any actions defined in the file. If it does, then xtbootsys assumes that a shutdown is being done, and no further verification of operating on a booted partition occurs. If the partition is not being shut down, and the boot node is in the ready state, then xtbootsys announces this fact to you and queries you for confirmation that you want to proceed. By default, confirmation is enabled. To disable or enable confirmation when booting booted partitions, use the xtbootsys config,confirm_booting_booted and config,confirm_booting_booted_last_session global TCL variables, the --config name=value option on the xtbootsys command line, or the XTBOOTSYS_CONFIRM_BOOTING_BOOTED and XTBOOTSYS_CONFIRM_BOOTING_BOOTED_LAST_SESSION environment variables.
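For example, based on the --config name=value syntax shown later in this chapter (see Bouncing Blades Repeatedly Until All Blades Succeed), a hypothetical invocation that disables the confirmation prompt for a single boot might be:

crayadm@smw:~> xtbootsys --config confirm_booting_booted=0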

Procedure 4. Manually booting the boot node and service nodes

Note: The Lustre file system should start up before the compute nodes, and compute node Lustre clients should be unmounted before shutting down the Lustre file system.

Note: If you run more than one boot image, you can check which image you are set up to boot with the xtcli boot_cfg show or xtcli part_cfg show pN commands. To change which image you are booting, see Updating Boot Configuration on page 192.

1. As crayadm, use the xtbootsys command to boot the boot node.

crayadm@smw:~> xtbootsys

Note: If you have a partitioned system, invoke xtbootsys with the --partition pN option.

The xtbootsys command prompts you with a series of questions. Cray recommends that you answer yes by typing Y to each question.


The session pauses at:

Enter your boot choice:
0) boot bootnode ...
1) boot sdb ...
2) boot compute ...
3) boot service ...
4) boot all (not supported) ...
5) boot all_comp ...
10) boot bootnode and wait ...
11) boot sdb and wait ...
12) boot compute and wait ...
13) boot service and wait ...
14) boot all and wait (not supported) ...
15) boot all_comp and wait ...
17) boot using a loadfile ...
18) turn console flood control off ...
19) turn console flood control on ...
20) spawn off the network link recovery daemon (xtnlrd)...
q) quit.

Choose option 10 to boot the boot node and wait.

You are prompted to confirm your selection. Press the Enter key or type Y to each question to confirm your selection.

Do you want to boot the boot node ? [Yn] Y
Do you want to send the ec_boot event ('no' means to only load memory) ? [Yn] Y

2. After the boot node has finished booting, the process returns to the boot choice menu. Choose option 11 to boot the SDB node and wait. You are prompted to confirm your selection. Press the Enter key or type Y to each question to confirm your selection.

Do you want to boot the sdb node ? [Yn] Y
Do you want to send the ec_boot event ('no' means to only load memory) ? [Yn] Y

3. Next, select option 13 to boot the service nodes and wait.

You are prompted to enter a list of the service nodes to be booted. Type p0 to boot the remaining service nodes in the entire system, or pN (where N is the partition number) to boot a partition.

4. To confirm your selection, press the Enter key or type Y to each question.

Do you want to boot service p0 ? [Yn] Y
Do you want to send the ec_boot event ('no' means to only load memory) ? [Yn] Y

5. Log on to any service nodes for which there are local configuration or startup scripts (such as starting Lustre) and run the scripts.

Procedure 5. Booting the compute nodes

1. After all service and login nodes are booted and Lustre has started (if configured at this time), return to the xtbootsys menu.


2. Select 17 from the xtbootsys menu. A series of prompts is displayed. Type the responses indicated in the following example. For the component list prompt, type p0 to boot the entire system, or pN (where N is the partition number) to boot a partition. At the final three prompts, press the Enter key.

Enter your boot choice: 17
Enter a boot type string (or nothing to do nothing): CNL0
Enter a boot type option (or nothing to do nothing): compute
Enter a component list (or nothing to do nothing): p0
Enter 'any' to wait for any console output, or 'linux' to wait for a linux style boot, or anything else (or nothing) to not wait at all: Enter
Enter an alternative CPIO archive name (or nothing): Enter
Do you want to send the ec_boot event ('no' means to only load memory) ? [Yn] Enter

3. After all the compute nodes are booted, return to the xtbootsys menu. Type q to exit the xtbootsys program.

4. Remove the /etc/nologin file from all service nodes to permit a non-root account to log on.

smw:~# ssh root@boot
boot:~# xtunspec -r /rr/current -d /etc/nologin

3.4.2 Using the xtcli boot Command to Boot a Node or Set of Nodes

To boot a specific image or load file on a given node or set of nodes, you can execute the HSS xtcli boot boot_type command, as shown in the following examples.

Warning: Each system boot must be started with an xtbootsys session to establish a sessionid. Only perform direct boot commands using the xtcli boot command after a session has been established through xtbootsys.

Note: When using a file for the boot image, the same file must be on both the SMW and the bootroot at the same path.

Example 3. Booting all service nodes with a specific image

The following example boots all service nodes with the specific image located at /raw0:

crayadm@smw:~> xtcli boot all_serv_img -i /raw0


Example 4. Booting all compute nodes with a specific image

The following example boots all compute nodes with the specific image located at /bootimagedir/bootimage:

crayadm@smw:~> xtcli boot all_comp_img -i /bootimagedir/bootimage

Example 5. Booting compute nodes using a load file

The following example boots all compute nodes in the system using a load file named CNL0:

crayadm@smw:~> xtcli boot CNL0 -o compute s0

3.4.3 Rebooting a Single Compute Node

You can initiate a warm boot with the xtbootsys command's --reboot option. This operation performs minimal initialization followed by a boot of only the selected compute nodes. Unlike the sequence that is used by the xtbounce command, there is no power cycling of the Cray ASICs or of the node itself, so the high-speed network (HSN) routing information is preserved. Do not specify a session identifier (-s or --session option) because --reboot continues the last session and adds the selected components to it.

Example 6. Rebooting a single compute node

crayadm@smw:~> xtbootsys --reboot c1-0c2s1n2

3.4.4 Rebooting Login or Network Nodes

Login or network nodes cannot be rebooted through a shutdown or reboot command issued on the node; they must be restarted through the HSS system using the xtbootsys --reboot idlist SMW command. The HSS must be used so that the proper kernel is pushed to the node.

Note: Do not attempt to warm boot nodes running other services in this manner.

Example 7. Rebooting login or network nodes

crayadm@smw:~> xtbootsys --reboot idlist

For additional information, see the xtbootsys(8) man page.

3.5 Rebooting Controllers of a Cabinet or Blade

The xtccreboot command provides a means to reboot controllers. Options allow for rebooting all controllers of a specified type (cabinet or blade) or providing a list of controllers of a specified type to be rebooted.


Example 8. Rebooting cabinet controller c0-0, with verbose output

smw:~> xtccreboot -v -c c0-0
xtccreboot: /opt/cray-xt-pdsh/default/bin/pdsh -w "c0-0" /sbin/reboot
xtccreboot: reboot sent to specified CCs

For additional information, see the xtccreboot(8) man page.

3.6 Requesting and Displaying System Routing

Use the HSS rtr command to request routing for the HSN, to verify your current route configuration, or to display route information between nodes. Upon startup, rtr determines whether it is making a routing request or an information request.

Example 9. Displaying routing information

The --system-map option to rtr writes the current routing information to stdout or to a specified file. This command can also be helpful for translating node IDs (NIDs) to physical ID names.

crayadm@smw:~> rtr --system-map

Example 10. Routing the entire system

The rtr -R|--route-system command sends a request to perform system routing. If no components are specified, the entire configuration is routed as a single routing domain based on the configuration information provided by the state manager. If a component list (idlist) is provided, routing is limited to the listed components. The state manager configuration further limits the routing domain to omit disabled blades, nodes, and links and empty blade slots.

crayadm@smw:~> rtr --route-system

For more information about displaying system routing information, see the rtr(8) man page.

3.7 Initiating a Network Discovery Process

Use the HSS rtr --discover command to initiate a network discovery process.

crayadm@smw:~> rtr --discover

Important: The discovery process must be done on the system as a whole; it cannot be applied to individual partitions. Therefore, discovery will immediately fail if the system does not have partition p0 enabled. See the rtr(8) man page for additional information about using the rtr --discover command.


3.8 Bouncing Blades Repeatedly Until All Blades Succeed

Example 11. Bounce failed blades repeatedly until all blades succeed

To bounce failed blades multiple times in order to have them eventually all succeed, complete these steps.

Important: This iterative xtbounce should typically be done in concert with an xtbootsys automation file where bounce and routing are turned off.

1. Bounce the system.

smw:~> xtbounce s0

2. Bounce any blades that failed the first bounce.

3. Repeat step 2 as necessary.

4. Execute the following command, which copies route configuration files, based on the idlist (such as s0), to the blade controllers. This avoids having old, partial route configuration files left on the blades that were bounced in step 2 above and ensures that the links are initialized correctly. For example,

smw:~> xtbounce --linkinit s0

5. Route and boot the system, without executing xtbounce again. If using an xtbootsys automation file, specify set data(config,xtbounce) 0, or use the xtbootsys --config xtbounce=0 command.

3.9 Shutting Down the System Using the auto.xtshutdown File

The preferred method to shut down the system is to use the xtbootsys -s last -a auto.xtshutdown command. This method shuts down the compute nodes (which are commonly also Lustre clients), then executes xtshutdown on service nodes, halting the nodes and then stopping processes on the SMW. You can shut down the system using both user-defined and built-in procedures in the auto.xtshutdown file, which is located on the SMW in the /opt/cray/hss/default/etc directory.

Example 12. Shutting down the system using the auto.xtshutdown file

To shut down the system using the auto.xtshutdown file, execute the following command from the SMW:

crayadm@smw:~> xtbootsys -s last -a auto.xtshutdown

Or for a partitioned system with partition pN:

smw:~# xtbootsys --partition pN -s last -a auto.xtshutdown


For related procedures, see Installing and Configuring Cray Linux Environment (CLE) Software. For more information about using automation files, see the xtbootsys(8) man page.

3.10 Shutting Down Service Nodes Using the xtshutdown Command

The xtshutdown command executes from the boot node to shut down the services on service nodes and then shut down the service nodes of the Cray system. It executes a series of commands locally on the boot node and on the service nodes to shut down the system in an orderly fashion.

Procedure 6. Shutting down service nodes

• Modify the /etc/opt/cray/init-service/xtshutdown.conf file or the file specified by the XTSHUTDOWN_CONF environment variable to define the sequence of shutdown steps and the nodes on which to execute them. (The /etc/opt/cray/init-service/xtshutdown.conf file resides on the boot node.)

Caution: The xtshutdown command does not shut down compute nodes. To shut down the compute and service nodes, see Shutting Down the System or Part of the System Using the xtcli shutdown Command on page 75.

The xtshutdown command uses pdsh to invoke commands on the service nodes you select. You can choose the boot node, SDB node, a class of nodes, or a single host. You can define functions to execute when the system is shut down. Place these functions in the /etc/opt/cray/init-service/xt_shutdown_local file or the file defined by the XTSHUTDOWN_LOCAL environment variable.

Note: You must be the root user to use the xtshutdown command. Passwordless ssh must be enabled for the root user from the boot node to all service nodes.

boot:~ # xtshutdown

After you have shut down the software on the nodes, you can halt the hardware, reboot, or power down.

For information about shutting down service nodes, see the xtshutdown(8) man page.


3.11 Shutting Down the System or Part of the System Using the xtcli shutdown Command

The HSS xtcli shutdown command allows you to shut down the system or a part of the system. To shut down compute nodes, execute the xtcli shutdown command. Under normal circumstances, for example to successfully disconnect from Lustre, invoking the xtcli shutdown command attempts to gracefully shut down the specified nodes.

Example 13. Shutting down all compute nodes

To gracefully shut down all compute nodes, execute the following command:

crayadm@smw:~> xtcli shutdown compute

Example 14. Shutting down specified compute nodes

To gracefully shut down only compute nodes in cabinet c13-2:

crayadm@smw:~> xtcli shutdown c13-2

Example 15. Shutting down all nodes of a system

The xtcli shutdown command allows you to shut down the system gracefully; to shut down a partition, specify pN, where N is the partition you want to shut down.

crayadm@smw:~> xtcli shutdown s0
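To shut down only a single partition instead, a corresponding hypothetical command (the partition name p1 is illustrative) would be:

crayadm@smw:~> xtcli shutdown p1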

Example 16. Forcing nodes to shut down (immediate halt)

To force nodes to shut down, for example when all nodes of a system must be halted immediately, use the -f argument; nodes will not go through their normal shutdown process. You can force a shutdown by using the -f argument, even if the nodes have an alert status present. For example:

crayadm@smw:~> xtcli shutdown -f s0

After you have shut down the software on the nodes, you can halt the hardware, reboot, or power down.

For information about shutting down nodes using the xtcli shutdown command, see the xtcli(8) man page.

3.12 Stopping System Components

When you remove, stop, or power down components, any applications and compute processes that are running on those components are lost.


3.12.1 Reserving a Component

If you want the applications and compute processes to complete before you stop components, use the HSS xtcli set_reserve idlist command to select the nodes you want to remove. This prevents them from accepting new jobs.

Note: If you are running CNL and using ALPS, after a node is reserved it is considered to be down by ALPS. The output from apstat will show the node as down (DN), even though there may be an application running on that node. This DN designation indicates that no other work will be placed on the node after the currently running application has terminated.

Procedure 7. Reserving a component

• Type:

crayadm@smw:~> xtcli set_reserve idlist
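For example, a hypothetical invocation that reserves a single node (the cname is illustrative) would be:

crayadm@smw:~> xtcli set_reserve c0-0c0s2n0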

For information about reserving a component, see the xtcli_set(8) man page.

3.12.2 Powering Down Blades or Cabinets

Warning: Power down the cabinets with software commands. Tripping the circuit breakers may result in damage to system components.

Warning: Ensure the operating system is not running before you power down a blade or a cabinet.

The xtcli power down command powers down the specified cabinet and/or blades within the specified partition, chassis or list of blades. Cabinets must be in the READY state (see Appendix B, System States on page 347) to receive power commands. The xtcli power force_down and xtcli power down_slot commands are aliases for the xtcli power down command.

Procedure 8. Powering down a specified blade

The xtcli power down command has the following form, where physIDlist is a comma-separated list of cabinets or blades present on the system (see Physical ID on page 51):

xtcli power down physIDlist

• To power down a blade with the ID c0-0c0s7, type:

crayadm@smw:~> xtcli power down c0-0c0s7

Warning: Although a blade is powered off, the HSS in the cabinet is live and has power.


For information about disabling and enabling components, see Disabling Hardware Components on page 82, and Enabling Hardware Components on page 82, respectively. For information about powering down a component, see the xtcli_power(8) man page.

3.12.3 Halting Selected Nodes

You can halt selected nodes with the HSS xtcli halt command.

Procedure 9. Halting a node

The command has the form:

xtcli halt node

• Type:

crayadm@smw:~> xtcli halt node
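For example, a hypothetical invocation that halts a single node (the cname is illustrative) would be:

crayadm@smw:~> xtcli halt c0-0c0s2n1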

For more information about halting a node, see the xtcli(8) man page.

3.13 Restarting a Blade or Cabinet

Note: Change the state of the hardware only when the operating system is not running or is shut down.

The xtcli power up command powers up the specified cabinet and/or blades within the specified partition, chassis or list of blades. Cabinets must be in the READY state (see Appendix B, System States on page 347) to receive power commands.

The xtcli power up command does not attempt to power up network mezzanine cards or nodes, which are handled by the xtbounce command during system boot. The xtcli power up_slot command is an alias for the xtcli power up command. Use the HSS xtcli power up command to restart a component.

Procedure 10. Power up blades in a cabinet

The xtcli power up command has the following form, where physIDlist is a comma-separated list of cabinets or blades present on the system (see Physical ID on page 51):

xtcli power up physIDlist

• Power up the selected blade:

crayadm@smw:~> xtcli power up blade
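For example, a hypothetical invocation that powers the blade from Procedure 8 back up (the cname is illustrative) would be:

crayadm@smw:~> xtcli power up c0-0c0s7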

For more information, see the xtcli_power(8) man page.


3.14 Aborting Active Sessions on the HSS Boot Manager

Use the HSS xtcli session abort command to abort sessions in the boot manager. A session corresponds to executing a specific command such as xtcli power up or xtcli boot.

Example 17. Aborting a session running on the boot manager

To display all running sessions in the boot manager, execute the following command.

crayadm@smw:~> xtcli session show BM all

Execute the following HSS xtcli session abort command to abort session 1 running on the boot manager:

crayadm@smw:~> xtcli session abort BM 1

Use this command if you have started an xtcli power or xtcli boot command but want to stop it before the command has completed.

Note: Only the boot manager supports multiple simultaneous sessions. For more information about manager sessions, see the xtcli(8) man page.

3.15 Displaying and Changing Software System Status

There are a number of tools that enable you to inspect and change the status of compute nodes on a running system.

3.15.1 Displaying the Status of Nodes from the Operating System

The user command xtnodestat provides a display of the status of nodes: how they are allocated and to what jobs. The xtnodestat command provides current job and node status summary information, and it provides an interface to ALPS and jobs running on CNL compute nodes. You must be running ALPS in order for xtnodestat to report job information. For more information, see the xtnodestat(1) man page.

3.15.2 Viewing and Changing the Status of Nodes

Use the xtprocadmin command on a service node to view the status of components of a booted system in the processor table of the SDB. The command enables you to retrieve or set the processing mode (interactive or batch) of specified nodes. You can display the state (up, down, admindown, route, or unavailable) of the selected components, if needed. You can also allocate processor slots or set nodes to become unavailable at a particular time. The node is scheduled only if the status is up.


Example 18. Looking at node characteristics

login:~> xtprocadmin
NID (HEX) NODENAME TYPE STATUS MODE
1 0x1 c0-0c0s0n1 service up batch
2 0x2 c0-0c0s0n2 service up batch
5 0x5 c0-0c0s1n1 service up batch
6 0x6 c0-0c0s1n2 service up batch
8 0x8 c0-0c0s2n0 compute up batch
9 0x9 c0-0c0s2n1 compute up batch
10 0xa c0-0c0s2n2 compute up batch
11 0xb c0-0c0s2n3 compute up batch

Example 19. Viewing all node attributes

Use the xtprocadmin command to view current node attributes. The xtprocadmin -A option lists all attributes of selected nodes. For example:

login:~> xtprocadmin -A
NID (HEX) NODENAME TYPE ARCH OS CPUS CU AVAILMEM PAGESZ CLOCKMHZ GPU SOCKETS DIES C/CU
1 0x1 c0-0c0s0n1 service xt (service) 16 8 32768 4096 2600 0 1 1 2
2 0x2 c0-0c0s0n2 service xt (service) 16 8 32768 4096 2600 0 1 1 2
5 0x5 c0-0c0s1n1 service xt (service) 16 8 32768 4096 2600 0 1 1 2
6 0x6 c0-0c0s1n2 service xt (service) 16 8 32768 4096 2600 0 1 1 2
8 0x8 c0-0c0s2n0 compute xt CNL 32 16 65536 4096 2600 0 2 2 2
9 0x9 c0-0c0s2n1 compute xt CNL 32 16 65536 4096 2600 0 2 2 2
10 0xa c0-0c0s2n2 compute xt CNL 32 16 65536 4096 2600 0 2 2 2

Example 20. Viewing selected node attributes of selected nodes

The xtprocadmin -a attr1,attr2 option lists selected attributes of selected nodes. For example:

login:~> xtprocadmin -n 8 -a arch,clockmhz,os,cores
NID (HEX) NODENAME TYPE ARCH CLOCKMHZ OS CPUS
8 0x8 c0-0c0s2n0 compute xt 2600 CNL 32

Example 21. Disabling a node

To mark a node as admindown and not allow it to be allocated, type the following command:

crayadm@nid00004:~> xtprocadmin -n c0-0c0s3n1 -k s admindown

Example 22. Disabling all processors

To mark all processors as admindown and to disable the system's ability to change their state, type the following command:

crayadm@nid00004:~> xtprocadmin -k s admindown
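To return all processors to the up state after maintenance is complete, a corresponding hypothetical invocation would be:

crayadm@nid00004:~> xtprocadmin -k s up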

Note: When the xtprocadmin -k s option is used, the option argument can be either a normal state (up, down, etc.), or it can contain a colon to represent a conditional option; for example, an option of the form up:down means "if the state was up, mark it down". For more information, see the xtprocadmin(8) man page.


3.15.3 Marking a Compute Node as a Service Node

Use the xtcli mark_node command to mark a node in a compute blade to have a role of service or compute; compute is the default. It is not permitted to change the role of a node on a service blade, which always has the service role. Marking a node on a compute blade as service or compute allows you to load the desired boot image at boot time. Compute nodes marked as service can run software-based services. A request to change the role of a running node (that is, the node is in the ready state and the operating system is running) will be denied.

For more information, see the xtcli(8) man page and Checking the Status of System Components on page 96.

3.15.4 Finding Node Information

3.15.4.1 Translating Between Physical ID Names and Integer NIDs

To translate between physical ID names (cnames) and integer NIDs, generate a system map by using the rtr --system-map command on the SMW and filter the output; for example:

crayadm@smw:~> rtr --system-map | grep cname | awk '{ print $1 }'

For more information, see the rtr(8) man page.

3.15.4.2 Finding Node Information Using the xtnid2str Command

The xtnid2str command converts numeric node identification values to their physical names (cnames). This allows conversion of Node ID values, ASIC NIC address values, or ASIC ID values.

Example 23. Finding the physical ID for node 38

smw:~> xtnid2str 38
node id 0x26 = 'c0-0c0s1n2'

Example 24. Finding the physical ID for nodes 0, 1, 2, and 3

smw:~> xtnid2str 0 1 2 3
node id 0x0 = 'c0-0c0s0n0'
node id 0x1 = 'c0-0c0s0n1'
node id 0x2 = 'c0-0c0s1n0'
node id 0x3 = 'c0-0c0s1n1'


Example 25. Finding the physical IDs for Aries IDs 0-7

smw:~> xtnid2str -a 0-7
aries id 0x0 = 'c0-0c0s0a0'
aries id 0x1 = 'c0-0c0s1a0'
aries id 0x2 = 'c0-0c0s2a0'
aries id 0x3 = 'c0-0c0s3a0'
aries id 0x4 = 'c0-0c0s4a0'
aries id 0x5 = 'c0-0c0s5a0'
aries id 0x6 = 'c0-0c0s6a0'
aries id 0x7 = 'c0-0c0s7a0'

For additional information, see the xtnid2str(8) man page.

3.15.4.3 Finding Node Information Using the nid2nic Command

Use the nid2nic command to print the nid-to-nic address mappings, nic-to-nid address mappings, and a specific physical_location-to-nic address and nid mappings. For information about using the nid2nic command, see the nid2nic(8) man page.

Example 26. Printing the nid-to-nic address mappings for the node with NID 31

smw:~> nid2nic 31
NID:0x1f NIC:0x1f c0-0c0s7n3

Example 27. Printing the nid-to-nic address mappings for the same node as shown in Example 26, but specifying the NIC value in the command line

smw:~> nid2nic -n 0x21
NIC:0x21 NID:0x21 c0-0c0s8n1

3.16 Displaying and Changing Hardware System Status

You can execute commands that look at and change the status of the hardware.

Caution: Execute commands that change the status of hardware only when the operating system is shut down.

3.16.1 Generating HSS Physical IDs

Execute the HSS xtgenid command to generate HSS physical IDs, for example, to create a list of blade controller identifiers for input to the flash manager. You can restrict your selections to components that are of a particular type.

Note: Only user root can execute the xtgenid command.

Example 28. Creating a list of node identifiers that are not in the DISABLE, EMPTY, or OFF state

smw:~ # xtgenid -t node --strict

For more information, see the xtgenid(8) man page.


3.16.2 Disabling Hardware Components

If links, nodes, or Cray ASICs have hardware problems, you can direct the system to ignore the components with the xtcli disable command.

By default, when disabling a component, this command takes into consideration the hierarchy of components, performs the action upon the identified component(s) and cascades that action to any subcomponent of the identified component(s), unless the -n option is specified.

Important: The -n option with the xtcli disable command must be used carefully because this may create invalid system state configurations.

Disabling of a cabinet, chassis, or blade will fail if any nodes under the component are in the ready state, unless the force option (-f) is used. An error message will indicate the reason for the failure. Disabling of a node in the ready state will fail, unless the force option (-f) is used. An error message will indicate the reason for the failure. The state of empty components will not change when using the disable command, unless the force option (-f) is used.

The xtcli disable command has the form:

xtcli disable [{-t type [-a]} | -n] [-f] idlist

where idlist is a comma-separated list of components (in cname format) that you want the system to ignore. The system disregards these links or nodes.

Example 29. Disabling the Aries ASIC c0-0c1s3a0

1. Determine that the ASIC is in the OFF state.

crayadm@smw:~> xtcli status -t aries c0-0c1s3a0

2. If the ASIC is not in the OFF state, power down the blade that contains the ASIC.

crayadm@smw:~> xtcli power down c0-0c1s3

3. Disable the ASIC.

crayadm@smw:~> xtcli disable c0-0c1s3a0

4. Power up the blade that contains the ASIC.

crayadm@smw:~> xtcli power up c0-0c1s3

For detailed information about using the xtcli disable command, see the xtcli(8) man page.

3.16.3 Enabling Hardware Components

If links, nodes, or Cray ASICs that have been disabled are later fixed, you can add them back to the system with the xtcli enable command.
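For example, a hypothetical invocation that re-enables the Aries ASIC disabled in Example 29 (the cname is illustrative) would be:

crayadm@smw:~> xtcli enable c0-0c1s3a0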


By default, when enabling a component, this command takes into consideration the hierarchy of components, performs the action upon the identified component(s) and cascades that action to any subcomponent of the identified component(s), unless the -n option is specified.

Important: The -n option with the xtcli enable command must be used carefully because this may create invalid system state configurations.

The state of empty components does not change when using the xtcli enable command, unless the force option (-f) is used.

The xtcli enable command has the form:

xtcli enable [{-t type [-a]} | -n] [-f] idlist

where idlist is a comma-separated list of components (in cname format) that you want the system to recognize. The state of off means that a component is present on the system. If the component is a blade controller, node, or ASIC, then this will also mean that the component is powered off. If you disable a component, the state shown becomes disabled. When you use the xtcli enable command to enable that component for use once again, its state switches from disabled to off. In the same manner, enabling an empty component means that its state switches from empty to off. For more information, see the xtcli(8) man page.

3.16.4 Setting Components to Empty

Use the xtcli set_empty command to set a selected component to the empty state. HSS managers and the xtcli command ignore empty or disabled components. Setting a selected component to the empty state is typically done when a component, usually a blade, is physically removed. By setting it to empty, the system ignores it and routes around it. By default, when setting a component to an empty state, this command takes into consideration the hierarchy of components, performs the action upon the identified component(s) and cascades that action to any subcomponent of the identified component(s), unless the -n option is specified.

Note: The -n option with the set_empty command must be used carefully because this may create invalid system state configurations.

Example 30. Setting a blade to the empty state

Set the blade and all its components to empty:

crayadm@smw:~> xtcli set_empty -a c0-0c1s7

For more information, see the xtcli(8) man page.


3.16.5 Locking Components

Components are automatically locked when a command that can change their state is running. As the command is started, the state manager locks these components so that nothing else can affect their state while the command executes. When the manager is finished with the command, it unlocks the components.

Use the HSS xtcli lock command to lock components.

Example 31. Locking cabinet c0-0

The lock command identifies the session ID. Locking a component prints out the state manager session ID.

crayadm@smw:~> xtcli lock -l c0-0

Example 32. Show all session (lock) data

You can use the xtcli lock show command to show session (lock) information.

crayadm@smw:~> xtcli lock show

3.16.6 Unlocking Components

Use the HSS xtcli lock command to unlock components.

Example 33. Unlocking cabinet c0-0

The xtcli lock command is useful when a manager fails to unlock some set of components. You can manually check for locks with the xtcli lock show command and unlock them. Unlocking a component does not print out the state manager session ID. The -u option must be used to unlock a component.

crayadm@smw:~> xtcli lock -u lock_number

lock_number is the value given when initiating the lock; it is also indicated in the xtcli lock show query. Unlocking does nothing to the state of the component other than to release locks associated with it. HSS managers cannot affect components that are locked by a different session.

3.17 Performing Parallel Operations on Service Nodes

Use the pdsh command, which is the CLE parallel remote shell utility, on a service node to issue commands to groups of nodes in parallel. You can select the nodes on which to use the command, exclude nodes from the command, and limit the time the command is allowed to execute. You must be user root to execute the pdsh command. The command has the form:

pdsh [options] command


Example 34. Restarting the NTP service

To restart the network time protocol (NTP) service on the first 9 login nodes, type:

boot:~ # pdsh -w 'login[1-9]' /etc/init.d/ntp restart

For more information, see the pdsh(1) man page.

3.18 Performing Parallel Operations on Compute Nodes

The parallel command tool (pcmd) allows you to execute the same commands on compute nodes in parallel, similar to pdsh. (You launch the pcmd command from a service node, but it acts on compute nodes.) It allows administrators and/or, if your site deems it feasible, other users to securely execute programs in parallel on compute nodes. You can specify on which nodes to execute the command. Alternatively, you can specify an application ID (apid) to execute the command on all the nodes available under that apid.

An unprivileged user must execute the command targeting nodes where the user is currently running an aprun. A root user is allowed to target any compute node, regardless of whether there are jobs running there or not. In either case, if the aprun exits and the associated applications are killed, any commands launched by pcmd will also exit.

By default, pcmd is installed as a root-only tool. It must be installed as setuid root in order for unprivileged users to use it.

The pcmd command is located in the nodehealth module. If the nodehealth module is not part of your default profile, you can load it by specifying:

module load nodehealth

For additional information, see the pcmd(1) man page.


3.19 xtbounce Error Message Indicating Cabinet Controller and Its Blade Controllers Not in Sync

During the gather_cab_pwr_states phase of xtbounce, if the HSS software on a cabinet controller and any of its blade controllers is out of sync, error messages such as the following will be printed during the xtbounce:

***** gather_cab_pwr_states *****
18:28:42 - Beginning to wait for response(s)

ERROR: rs_phys_node2ascii() failed for node struct 0xb7e70150080700f8
ERROR: rs_phys_node2ascii() failed for node struct 0xb7e70150080700f8
ERROR: rs_phys_node2ascii() failed for node struct 0xb7e70150080700f8
ERROR: rs_phys_node2ascii() failed for node struct 0xb7e70150080700f8

If this occurs, it indicates that the blade controller software is at a different revision than the cabinet controller software. xtbounce will print a list of cabinets for which this error has occurred. The message will be like:

ERROR: power state check error on 2 cabinet(s)
WARNING: unable to find c0-0 in err_cablist
WARNING: unable to find c0-2 in err_cablist

This error is an indication that when the HSS software was previously updated, the cabinet controllers and the blade controllers were not updated to the same version.

To correct this error, cancel out of xtbounce (with Ctrl-C), wait approximately five minutes for the xtbounce related activities on the blade controllers to finish, then reboot the cabinet controller(s) and their associated blade controllers to get the HSS software synchronized. Following this, the xtbounce may be executed once again.
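For example, the cabinet controllers named in the error output could be rebooted with the xtccreboot command (see Rebooting Controllers of a Cabinet or Blade); a hypothetical invocation for cabinet c0-0 would be:

smw:~> xtccreboot -v -c c0-0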

3.20 Handling Bus Errors

Bus errors are caused by machine-check exceptions. If you have received a bus error, try the following procedure.

Procedure 11. Power-cycling a component

Power down then power up components. The physIDlist is a comma-separated list of components present on the system (see Physical ID on page 51).

1. Power down the components.

crayadm@smw:~> xtcli power down physIDlist

2. Power up the components.

crayadm@smw:~> xtcli power up physIDlist


3.21 Handling Component Failures

Components that fail are replaced as field replaceable units (FRUs). FRUs include compute blade components, service blade components, and power and cooling components.

When a field replaceable unit (FRU) problem arises, contact your Customer Service Representative to schedule a repair.

3.22 Capturing and Analyzing System-level and Node-level Dumps

3.22.1 Dumping Information Using the xtdumpsys Command

The xtdumpsys command collects and analyzes information from a Cray system that is failing or has failed, has crashed, or is hung. Analysis is performed on, for example, event log data, active heartbeat probing, voltages, temperatures, health faults, in-memory console buffers, and high-speed interconnection network errors. When failed components are found, detailed information is gathered from them.

To collect similar information for components that have not failed, invoke the xtdumpsys command with the --add option and name the components from which to collect data. The HSS xtdumpsys command saves dump information in /var/opt/cray/dump/timestamp by default.

Example 35. Dumping information about a working component

To dump the entire system and collect detailed information from all blade controllers in chassis 0 of cabinet 0, type:

crayadm@smw:~> xtdumpsys --add c0-0c0s0

Note: When using the --add option to add multiple components, separate components with spaces, not commas.
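For example, a hypothetical invocation that collects detailed information from two blade controllers (both cnames illustrative) separates the components with a space:

crayadm@smw:~> xtdumpsys --add c0-0c0s0 c0-0c0s1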

Effective with the 7.0.UP03 release of the Cray SMW software, the xtdumpsys has been rewritten. The previous implementation is still supported and has been renamed xtdumpsys-old.

The new xtdumpsys command is written in Python and supports plug-ins written in Python. A number of plug-in scripts are included in the software release. Call xtdumpsys --list to view a list of included plug-ins and their respective directories. The xtdumpsys command also now supports the use of configuration files to specify xtdumpsys presets, rather than entering them via the command line. For more information, see the xtdumpsys(8) man page.


3.22.2 cdump and crash Utilities for Node Memory Dump and Analysis

The cdump and crash utilities may be used to analyze the memory on any Cray service node or CNL compute node. The cdump command is used to dump node memory to a file. After cdump completes, you may then use the crash utility on the dump file generated by cdump.

Cray recommends executing the cdump utility only if a node has panicked or is hung, or if a dump is requested by Cray.

To select the desired access method for reading node memory, use the cdump -r access option. Valid access methods are:

• xt-bhs: The xt-bhs method uses a basic hardware system server that runs on the SMW to access and read node memory. xt-bhs is the default access method for these systems.

• xt-hsn: The xt-hsn method utilizes a proxy that reads node memory through the High-speed Network (HSN). The xt-hsn method is faster than the xt-bhs method, but there are situations where it will not work (for example, if the ASIC is not functional). However, the xt-hsn method is preferable because the dump completes in a short amount of time and the node can be returned to service sooner.

• xt-file: The xt-file method is used for a memory dump file created by the -z option. The compressed memory dump file must be uncompressed prior to executing this command. Use the file name for node-id.

To dump Cray node memory, access takes the following form:

method[@host]

For additional information, see the cdump(8) and crash(8) man pages.

3.22.3 Using dumpd to Automatically Dump and Reboot Nodes

The SMW daemon dumpd initiates automatic dump and reboot of nodes when requested by Node Health Checker (NHC).

Caution: The dumpd daemon is invoked automatically by xtbootsys on system (or partition) boot. In most cases, system administrators do not need to use this daemon directly.

You can set global variables in the /etc/opt/cray/nodehealth/nodehealth.conf configuration file to control the interaction of NHC and dumpd. For more information about NHC and the nodehealth.conf configuration file, see Configuring Node Health Checker (NHC) on page 164.

You can also set variables in the /etc/opt/cray-xt-dumpd/dumpd.conf configuration file on the SMW to control how dumpd behaves on your system.


Each CLE release package also includes an example dumpd configuration file, /etc/opt/cray-xt-dumpd/dumpd.conf.example. The dumpd.conf.example file is a copy of the /etc/opt/cray-xt-dumpd/dumpd.conf file provided for an initial installation.

Important: The /etc/opt/cray-xt-dumpd/dumpd.conf file is not overwritten during a CLE upgrade if the file already exists. This preserves your site-specific modifications previously made to the file. However, you should compare your /etc/opt/cray-xt-dumpd/dumpd.conf file content with the /etc/opt/cray-xt-dumpd/dumpd.conf.example file provided with each release to identify any changes, and then update your /etc/opt/cray-xt-dumpd/dumpd.conf file accordingly.
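For example, one way to identify such changes is to compare the two files directly (a minimal sketch using the standard diff utility):

smw:~# diff /etc/opt/cray-xt-dumpd/dumpd.conf /etc/opt/cray-xt-dumpd/dumpd.conf.example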

If the /etc/opt/cray-xt-dumpd/dumpd.conf file does not exist, then the /etc/opt/cray-xt-dumpd/dumpd.conf.example file is copied to the /etc/opt/cray-xt-dumpd/dumpd.conf file.

The CLE installation and upgrade processes automatically install dumpd software but you must explicitly enable it.

3.22.3.1 Enabling dumpd

Procedure 12. Enabling dumpd

1. In the nodehealth.conf configuration file on the shared root (located in /etc/opt/cray/nodehealth/nodehealth.conf) change:

dumpdon: off

to dumpdon: on

This allows node health to make requests to dumpd.

2. In the same file, set the maxdumps variable to some number greater than zero if you want dumps to be taken.

3. Specify an action of dump, reboot, or dumpreboot for any tests for which you want NHC to make a request of dumpd when that test fails.

4. In the dumpd.conf configuration file on the SMW (in /etc/opt/cray-xt-dumpd/dumpd.conf), change:

enable: no

to

enable: yes

After the changes to the configuration files are made, NHC will request action from dumpd for any test that fails with an action of dump, reboot, or dumpreboot.
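As an illustration, after completing this procedure the two files contain lines like the following; the maxdumps value is only an example (any value greater than zero is acceptable), and its "keyword: value" form is assumed to match the dumpdon setting shown above:

In /etc/opt/cray/nodehealth/nodehealth.conf on the shared root:

dumpdon: on
maxdumps: 1

In /etc/opt/cray-xt-dumpd/dumpd.conf on the SMW:

enable: yes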


3.22.3.2 /etc/opt/cray-xt-dumpd/dumpd.conf Configuration File

The dumpd configuration file, /etc/opt/cray-xt-dumpd/dumpd.conf, is located on the SMW. There is no need for you to change any installation configuration parameters; however, you may edit the /etc/opt/cray-xt-dumpd/dumpd.conf file to customize how dumpd behaves on your system using the following configuration variables.

enable: yes|no

Provides a quick on/off switch for all dumpd functionality (the default value in the file provided by Cray Inc. is no).

partitions: number

Specifies whether or not dumpd acts on specific partitions or ranges of partitions. Placing ! in front of a partition or range disables it. For example, specifying

partitions: 1-10,!2-4

enables partitions 1, 5, 6, 7, 8, 9, and 10 but not 2, 3, or 4. Partitions must be explicitly enabled. Leaving this option blank disables all partitions.

disabled_action: ignore|queue

Specifies what to do when requests come in for a disabled partition. If you specify ignore, requests are removed from the database and not acted upon. If you specify queue, requests continue to build while dumpd is disabled on a partition. When the partition is reenabled, the requests will be acted on. Specifying queue is not recommended if dumpd will be disabled for long periods of time, as it can cause SMW stress and database problems.

save_output: always|errors|never

Indicates when to save stdout and stderr from dumpd commands that are executed. If save_output is set to always, all output is saved. If errors is specified, output is saved only when the command exits with a nonzero exit code. If never is specified, output is never saved.

The default is to save output on errors.


command_output: directory

Specifies where to save output of dumpd commands, per the save_output variable. The command output is put in the file action.pid.timestamp.out in the directory specified by this option.

Default directory is /var/opt/cray/dump.

dump_dir: directory

Specifies the directory in which to save dumps.

Default directory is /var/opt/cray/dump.

max_disk: nnnMB|unlimited

Specifies the amount of disk space beyond which no new dumps will be created. This is not a hard limit; if dumpd sees that the space used in this directory is still below this amount, it starts a new dump, even if that dump subsequently uses enough space to exceed the max_disk limit.

The default value is max_disk: unlimited.

no_space_action: action

Specifies a command to be executed if the directory specified by the variable dump_dir does not have enough space free, as specified by max_disk.

Examples of possible actions:

Deletes the oldest dump in the dump directory:

no_space_action: rm -rf $dump_dir/$(ls -rt $dump_dir | head -1)

Moves the oldest dump somewhere useful:

no_space_action: mv $dump_dir/$(ls -rt $dump_dir | head -1) /some/dump/archive

Sends E-mail to an administrator at [email protected]:

no_space_action: echo "" | mail -s "Not Enough Space in $dump_dir" \
[email protected]
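Pulling these variables together, a minimal sketch of a site-edited dumpd.conf might look like the following; every value is illustrative and should be adjusted to your site's partitions, disk layout, and retention policy:

enable: yes
partitions: 1
disabled_action: queue
save_output: errors
command_output: /var/opt/cray/dump
dump_dir: /var/opt/cray/dump
max_disk: 200000MB
no_space_action: rm -rf $dump_dir/$(ls -rt $dump_dir | head -1)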

3.22.3.3 Using the dumpd-dbadmin Tool

The dumpd daemon sits and waits for requests from NHC (or some other entity using the dumpd-request tool on the shared root). When dumpd gets a request, it creates a database entry in the mznhc database for the request, and calls the script /opt/cray-xt-dumpd/default/bin/executor to read the dumpd.conf configuration file and perform the requested actions. You can use the dumpd-dbadmin tool to view or delete entries in the mznhc database in a convenient manner.


3.22.3.4 Using the dumpd-request Tool

You can use the dumpd-request tool to send dump and reboot requests to dumpd from the SMW or the shared root.

A request includes a comma-separated list of actions to perform, and the node or nodes on which to perform the actions. A typical request from NHC looks like this:

cname: c0-0c1s4n0 actions: halt,dump,reboot

You can define additional actions in the dumpd.conf configuration file; to use them, execute the dumpd-request tool located on the shared root or the SMW. A typical call would be:

dumpd-request -a halt,dump,reboot -c c0-0c1s4n0

Or:

dumpd-request -a myaction1,myaction2 -c c1-0c0s0n0,c1-0c0s0n1,c1-0c0s0n2,c1-0c0s0n3

For this example to work, you must define myaction1 and myaction2 in the dumpd.conf file. See the examples in the configuration file for more detail.

3.23 Using xtnmi Command to Collect Debug Information from Hung Nodes

Caution: This is not a harmless tool for repeatedly collecting information from a node at various times; use this command only when you need debugging data from nodes that are in trouble. The xtnmi command output may be used to determine problems such as a core hang.

The sole purpose of the xtnmi command is to collect debug information from unresponsive nodes. As soon as that debug information is displayed to the console, the node panics.

For additional information, see the xtnmi(8) man page.

Monitoring System Activity [4]

4.1 Monitoring the System with the System Environmental Data Collector (SEDC)

To use the System Environmental Data Collector (SEDC) to collect data about internal cabinet temperatures, cooling system air pressures, critical voltages, etc., see Using and Configuring System Environment Data Collections (SEDC).

4.2 Displaying Installed SMW Release Level

Following a successful installation, the file /opt/cray/hss/default/etc/smw-release is populated with the installed SMW release level.

Example 36. Displaying installed SMW release level

% cat /opt/cray/hss/default/etc/smw-release
7.0.UP03

4.3 Displaying Current and Installed CLE Release Information

Following a successful installation, several files in the /etc/opt/cray/release directory are populated with various CLE release information.

The current xtrelease value (build number) is stored in /etc/opt/cray/release/xtrelease. The most recently installed CLE version number and update level is stored in /etc/opt/cray/release/clerelease.

Example 37. Displaying the current xtrelease value

crayadm@login:~> cat /etc/opt/cray/release/xtrelease
DEFAULT=5.0.41


The /etc/opt/cray/release/CLEinfo file contains a list of install-time options that other scripts may want access to, such as network type, installer version, and existence of the Lustre file system.

Example 38. Displaying the most recently installed CLE release information

crayadm@login:~> cat /etc/opt/cray/release/CLEinfo
CLERELEASE=5.0.UP03
INSTALLERVERSION=e03
LUSTRE=yes
NETWORK=ari
XTRELEASE=5.0.41

Note: The CLEinfo file is an install-time "snapshot" and does not change; the release values may not be the currently booted version on your system.

4.4 Displaying Boot Configuration Information

Use the xtcli command to display the configuration information for the primary and backup boot nodes, the primary and backup SDB nodes, and the cpio path.

Procedure 13. Showing boot configuration information for the entire system

• To display boot configuration information for the entire system, execute the xtcli boot_cfg show command:

crayadm@smw:~> xtcli boot_cfg show
Network topology: class 2
=== xtcli_boot_cfg ===
[boot]: c0-0c0s0n1:ready,c0-0c0s0n1:ready
[sdb]: c1-0c0s1n1:ready
[cpio_path]: /tmp/boot/kernel.cpio_5.0.41-wGPFS

Procedure 14. Showing boot configuration information for a partition of a system

• To display boot configuration information for one partition in a system, specify the partition number, pN. p0 is always the whole system:

crayadm@smw:~> xtcli part_cfg show p1

4.5 Managing Log Files Using CLE and HSS Commands

Boot, diagnostic, and other Hardware Supervisory System (HSS) events are logged on the SMW in the /var/opt/cray/log directory, which is created during the installation process. CLE logs are saved on the SMW in /var/opt/cray/log/sessionid.


Controller logs are saved on the SMW in /var/opt/cray/log/controller/cabinet_name/controller_name/log_name, where cabinet_name is of the form c0-0, c1-0, etc.; and where controller_name is of the form c0-0, c1-0, etc. for the cabinet controller (CC) and of the form c0-0c0s0 for each blade controller (BC).

For more information, see the intro_llm_logfiles(5) man page.

4.5.1 Filtering the Event Log

The xtlogfilter command enables you to filter the event log for information such as the time a particular event occurred or messages from a particular cabinet.

Example 39. Finding information in the event log

To search for all console messages from node c9-2c0s3n2, type:

crayadm@smw:~> xtlogfilter -f /var/opt/cray/log/event-yyyymmdd c9-2c0s3n2

For more information, see the xtlogfilter(8) man page.

4.5.2 Adding Entries to Log Files

You can add entries to the syslog with the logger command. For example, to identify the start or finish of system activities, use the /bin/logger command to log events into the system log. The message is then available to anyone who reads the log.

Example 40. Adding entries to syslog file

To mark the start of a new system test, type:

login# logger -is "Start of test 4A $(date) "
Start of test 4A Thu Jul 14 16:20:43 CDT 2011

The system log shows:

Jul 14 16:20:43 nid00003 xx[21332]: Start of test 4A Thu Jul 14 16:20:43 CDT 2011

For more information, see the logger(1) man page.

4.5.3 Examining Log Files

Time-stamped log files of boot, diagnostic, and other HSS events are located on the SMW in the /var/opt/cray/log directory. The time-stamped bootinfo, console, consumer, and netwatch log files are located in the /var/opt/cray/log/sessionid directory by default.


For example, the HSS xtbootsys command starts the xtconsole command, which redirects the output to a time-stamped log file, such as /var/opt/cray/log/p0-20120716t104708/console-20120716. The SMWinstall, SMWconfig, and SMWinstallCLE commands create several detailed log files in the /var/adm/cray/logs directory. The log files are named using the PID of the SMWinstall or the SMWinstallCLE command; the exact names are displayed when the command is invoked.

4.5.4 Removing Old Log Files

Deprecated: The xtclean_logs command is deprecated. The functionality previously provided by xtclean_logs is now provided by the xttrim utility.

The xttrim utility provides a simple, configurable method of automating the compression and deletion of old log files. The xttrim utility is intended to be run on the SMW from cron and is automatically configured to do this as part of the Cray SMW software installation process. Review the xttrim.conf configuration file and ensure that xttrim will manage the desired directories and that the compression and deletion times are appropriate.

Note: To avoid unintended actions, the xttrim utility does not perform any action unless the --confirm flag is used, nor will xttrim perform any action on open files. All actions are based on file-modified time.
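For example, a manual run from the SMW looks like the following; the prompt is illustrative, and whether to invoke xttrim by hand at all (rather than leaving it to the installed cron entry) is a site decision:

crayadm@smw:~> xttrim --confirm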

For additional information, see the xttrim(8) and xttrim.conf(5) man pages.

4.6 Checking the Status of System Components

To check the status of the system or a component, use the xtcli status command on the SMW. By default, the xtcli status command returns the status of nodes.

Procedure 15. Showing the status of a component

• The xtcli status command has the form:

xtcli status [-n] [-m] [{-t type [-a]}] node_list

Note: The list should have component IDs only (no wild cards).

type may be: cc, bc, cage, node, aries, aries_lcb, pdc, or qpdc.
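For instance, to restrict the query to one component type, a call might look like the following; the type and the component ID are illustrative only:

crayadm@smw:~> xtcli status -t aries c0-0c0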


Use the -m option to display all nodes that were repurposed by using the xtcli mark_node command. (The xtcli mark_node command can be used to repurpose a service node to a compute role or to repurpose a compute node to a service role.)

Example 41. Display nodes that were repurposed with the xtcli mark_node command

In this example, c0-0c0s2n0 is a service node repurposed as a compute node, and c0-0c0s3n0 is a compute node repurposed as a service node.

crayadm@smw:~> xtcli status -m c0-0c0
Network topology: class 2
Network type: Aries
Nodeid: Service Core Arch| Comp state [Flags]
---------------------------------------------
c0-0c0s2n0: - SB16 X86| off [noflags|]
c0-0c0s3n0: service SB16 X86| off [noflags|]
---------------------------------------------

For more information, see the xtcli(8) man page.

4.7 Checking the Status of Compute Processors

To check that compute nodes are available after the system is booted, use the xtprocadmin command on a service node.

Example 42. Identifying nodes in down or admindown state

To identify if there are any nodes that are in a down or admindown state, execute the following command from a node:

nid00007:~> xtprocadmin | grep down


Example 43. Display current allocation and status of each compute processing element and the application that it is running

Use the xtnodestat user command to display the current allocation and status of each compute processing element and the application that it is running. A simplified text display shows each processing element on the Cray system interconnection network. For example:

nid00007:~> xtnodestat
Current Allocation Status at Wed Jul 06 13:53:26 2011

     C0-0
  n3 AAaaaaaa
  n2 AAaaaaaa
  n1 Aeeaaaa-
c2n0 Aeeaaaaa
  n3 Acaaaaa-
  n2 cb-aaaa-
  n1 AA-aaaa-
c1n0 Aadaaaa-
  n3 SASaSa--
  n2 SbSaSa--
  n1 SaSaSa--
c0n0 SASaSa--
     s01234567

Legend:
    nonexistent node                     S  service node
;   free interactive compute node        -  free batch compute node
A   allocated interactive or ccm node    ?  suspect compute node
W   waiting or non-running job           X  down compute node
Y   down or admindown service node       Z  admindown compute node

Available compute nodes: 0 interactive, 15 batch

    Job ID   User   Size  Age     State  command line
--- -------- ------ ----- ------- ------ ------------
a   3772974  user1  48    0h06m   run    app1
b   3773088  user2  2     0h01m   run    app2
c   3749113  user3  2     28h26m  run    app3
d   3773114  user4  1     0h00m   run    app4
e   3773112  user5  4     0h00m   run    app5

For more information, see the xtprocadmin(8) and xtnodestat(1) man pages.

4.8 Checking CNL Compute Node Connection

Use the Linux ping command to verify that a compute node is connected to the network. The Linux ping command must be run from a node, not from the SMW.

Example 44. Verifying that a compute node is connected to the network

nid00007:~> ping nid00015
PING nid00015 (10.128.0.16) 56(84) bytes of data.
64 bytes from nid00015 (10.128.0.16): icmp_seq=1 ttl=64 time=0.032 ms
64 bytes from nid00015 (10.128.0.16): icmp_seq=2 ttl=64 time=0.010 ms


For more information, see the Linux ping(8) man page.

4.9 Checking Link Control Block and Router Errors

The HSS xtnetwatch command monitors the Cray system interconnection network. It requests link control block (LCB) and router error information from the blade controller-based router daemons and specifies how often to sample for errors. It then detects events that contain the error information sent by these daemons and displays the information as formatted output in a log file.

You can specify which system components to sample and control the level of verbosity of the output, select the sampling interval, and log results to an output file.

Although the command can be invoked standalone from the SMW prompt, Cray recommends that you run xtnetwatch each time you boot the system with the xtbootsys command (the default). The output is a time-stamped log file such as:

/var/opt/cray/log/p0-20120803t185511/netwatch.p0-20120803t185511

Check the log file for fatal link errors and router errors. Fatal link errors signal faulty hardware. Fatal router errors can be generated either by hardware or software; they do not cause the network or individual links to become inoperable but imply that a single transfer was discarded.

Note: To turn off blade controller high-speed interconnect link monitoring, use the xtnetwatch -d option.

Example 45. Running xtnetwatch to monitor the system interconnection network

Sample the network once every 10 seconds using the least verbose display format:

crayadm@smw:~> xtnetwatch -i 10
120718 13:31:21 ################ ################ ################# ##################
120718 13:31:21 LCB ID           Peer LCB         Category          Description
120718 13:31:21 ################ ################ ################# ##################
Received all responses to request to start monitoring
Time of last linktune: Wed Jul 18 13:25:53 2012
Number of LCBs    : 112
Number of Blades  : 17
120718 13:31:22 c1-0c2s8a0l35 c1-0c2s0a0l32 ute Micropacket CRC Error, cnt: 122
120718 13:31:22 c1-0c2s8a0l35 c1-0c2s0a0l32 ute Sequential CRC Error

For more information, see the xtnetwatch(8) man page.


4.10 Displaying System Network Congestion Protection Information

Two utilities help you more easily identify the time and duration of system network congestion events, either by parsing through logs (xtcpreport) or in real time (xtcptop):

• The xtcpreport command uses information contained in the given xtnlrd file to extract and display information related to system network congestion protection. Using this command, you can specify a start time and an end time for the system network congestion protection information to display. See the xtcpreport(8) man page for additional information.

• The xtcptop command monitors an xtnlrd file that is currently being updated and displays real-time system network congestion protection information, including start time, duration, and apid. See the xtcptop(8) man page for additional information.

Note: You may need to execute the module load congestion-tools command to be able to call these utilities, as shown in the sketch below.
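A rough sketch of a typical invocation follows; it assumes the tools must first be loaded, that xtcptop accepts the xtnlrd file as an argument as the description above suggests, and it uses a hypothetical xtnlrd file path, since the actual log location depends on your boot session:

crayadm@smw:~> module load congestion-tools
crayadm@smw:~> xtcptop /var/opt/cray/log/p0-20120803t185511/nlrd-20120803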

4.11 Monitoring the Health of PCIe Channels

Processors are connected to the high-speed interconnect network (HSN) ASIC through PCIe channels. The xtpcimon command is executed from the SMW and is started and run during the boot process. Any PCIe-related errors are reported to stdout, unless directed to a log file. xtpcimon also displays CLE-originated GHAL-based Advanced Error Reporting (AER) errors for PCIe. If the optional /opt/cray/hss/default/etc/xtpcimon.ini initialization file is present, the xtpcimon command uses the settings provided in the file. For more information, see the xtpcimon(8) man page.

Example 46. Reporting PCIe-related errors to stdout

crayadm@smw:~> xtpcimon
starting ----> connection to event router made
121017 04:57:01 ############# ################# ##################
121017 04:57:01 Node          Category          Description
121017 04:57:01 ############# ################# ##################
Received all responses to request to start monitoring
121017 04:58:01 c0-0c0s7a0n1 CorrectableMemErr 0:0:0 AER Correctable: Non-fatal error (mask bit: 1)
121008 05:42:00 c0-0c1s6a0n2 CorrectableMemErr Link CRC error (cnt: 3)
121008 05:43:30 c0-0c1s6a0n2 Info Correctable/CRC error


4.12 Monitoring the Status of Jobs Started Under a Third-party Batch System

To monitor the status of jobs that were started under a third-party batch system, use the command appropriate to your batch system. For more information, see the documentation provided by your batch system vendor.

4.13 Using the cray_pam Module to Monitor Failed Login Attempts

The cray_pam module is a Pluggable Authentication Module (PAM). When configured, the cray_pam module provides information to the user at login time about any failed login attempts since their last successful login. See Using the cray_pam PAM to Log Failed Login Attempts on page 155 and the procedure to configure the cray_pam module, Procedure 31 on page 157.

4.14 Monitoring DDN RAID

Use Data Direct Networks tools to monitor DDN RAID. These can be accessed by telnetting to the RAID device from the SMW. To configure remote logging of DDN messages, see the Cray System Management Workstation (SMW) Software Installation Guide. For additional information, see your DDN documentation.

4.15 Monitoring NetApp, Inc. Engenio RAID

Use NetApp, Inc. Engenio tools to monitor NetApp, Inc. Engenio RAID. The NetApp, Inc. Engenio storage system uses SNMP to provide boot RAID messages. For additional information, see your NetApp, Inc. Engenio Storage System documentation.

4.16 Monitoring HSS Managers

This section provides procedures to view active sessions and to check whether the boot manager or the blade or cabinet controller daemons are running.

4.16.1 Examining Activity on the HSS Boot Manager

Use the HSS xtcli session show command to examine sessions in the boot manager. A session corresponds to running a specific command such as xtcli power up or xtcli boot. This command reports on sessions, not daemons.


Example 47. Looking at a session running on the boot manager

Execute the HSS xtcli session show command to view the session running on the boot manager:

crayadm@smw:~> xtcli session show BM

For more information about manager sessions, see the xtcli(8) man page.

4.16.2 Polling a Response from an HSS Daemon, Manager, or the Event Router

Use the HSS xtalive command to verify that an HSS daemon, manager, or the event router is responsive.

Example 48. Checking the boot manager

crayadm@smw:~> xtalive -l smw -a bm s0

For more information, see the xtalive(8) man page.

4.17 Monitoring Events

The HSS xtconsumer command enables you to monitor events mediated by the event router daemon erd, which runs passively.

Example 49. Monitoring for specific events

This command shows watching two events: ec_heartbeat_stop, which will be sent if either the node stops sending heartbeats or if the system interconnection network ASIC stops sending heartbeats, and ec_l0_health, which will be sent if any of the subcomponents of a blade controller report a bad health indication.

crayadm@smw:~> xtconsumer -b ec_heartbeat_stop ec_l0_health

Example 50. Checking events except heartbeat:

To display all events except heartbeats:

crayadm@smw:~> xtconsumer -x ec_l1_heartbeat

Use the xthb command to confirm the stopped heartbeat. Use the xthb command only when you are actively looking into a known problem because it is intrusive and degrades system performance.

For more information, see the xtconsumer(8) and xthb(8) man pages.


4.18 Monitoring Node Console Messages

The xtbootsys program automatically starts an xtconsole session, which processes console messages for the booted partition or system. The console messages are written to /var/opt/cray/log/sessionid/console-yyyymmdd, where the administrator may monitor them.

Only one instance of the xtconsole utility may run at a time.

For more information, see the xtconsole(8) man page.

4.19 Showing the Component Alert, Warning, and Location History

Use the xtcli comp_hist command to display the component alert, warning, and location history. Either an error history, which displays alerts or warnings found on designated components, or a location history may be displayed.

Procedure 16. Displaying the location history for component c0-0c0s0n1

• Type:

crayadm@smw:~> xtcli comp_hist -o loc c0-0c0s0n1

For more information, see the xtcli(8) man page.

4.20 Displaying Component Information

Use the HSS xtshow command to identify compute and service components. Commands are typed as xtshow --option_name. You can also combine the --service or --compute option with other xtshow options to limit your selection to the specified type of node.

For a list of all xtshow --option_names, see the xtshow(8) man page.

Example 51. Identifying all service nodes

crayadm@smw:~> xtshow --service
L1s ...
Cages ...
L0s ...
c0-0c0s0: service X86| ready [noflags|]
c0-0c0s1: service X86| ready [noflags|]
c1-0c0s0: service X86| ready [noflags|]
c1-0c0s1: service X86| ready [noflags|]
c2-0c0s1: service X86| ready [noflags|]
c2-0c1s1: service X86| ready [noflags|]
Nodes ...
c0-0c0s0n0: service X86| empty [noflags|]
c0-0c0s0n1: service SB08 X86| ready [noflags|]
c0-0c0s0n2: service SB08 X86| ready [noflags|]
c0-0c0s0n3: service X86| empty [noflags|]


c0-0c0s1n0: service X86| empty [noflags|]
c0-0c0s1n1: service SB08 X86| ready [noflags|]
c0-0c0s1n2: service SB08 X86| ready [noflags|]
c0-0c0s1n3: service X86| empty [noflags|]
c1-0c0s0n0: service X86| empty [noflags|]
c1-0c0s0n1: service SB08 X86| ready [noflags|]
c1-0c0s0n2: service SB08 X86| ready [noflags|]
c1-0c0s0n3: service X86| empty [noflags|]
c1-0c0s1n0: service X86| empty [noflags|]
c1-0c0s1n1: service SB08 X86| ready [noflags|]
c1-0c0s1n2: service SB08 X86| ready [noflags|]
c1-0c0s1n3: service X86| empty [noflags|]
. . .
Aries ...
c0-0c0s0a0: service X86| on [noflags|]
c0-0c0s1a0: service X86| on [noflags|]
c1-0c0s0a0: service X86| on [noflags|]
c1-0c0s1a0: service X86| on [noflags|]
c2-0c0s1a0: service X86| on [noflags|]
c2-0c1s1a0: service X86| on [noflags|]
AriesLcbs ...
c0-0c0s0a0l00: service X86| on [noflags|]
c0-0c0s0a0l01: service X86| on [noflags|]
c0-0c0s0a0l02: service X86| on [noflags|]
c0-0c0s0a0l03: service X86| on [noflags|]
c0-0c0s0a0l04: service X86| on [noflags|]
c0-0c0s0a0l05: service X86| on [noflags|]
c0-0c0s0a0l06: service X86| on [noflags|]
c0-0c0s0a0l07: service X86| on [noflags|]
c0-0c0s0a0l10: service X86| on [noflags|]
c0-0c0s0a0l11: service X86| on [noflags|]
c0-0c0s0a0l12: service X86| on [noflags|]
c0-0c0s0a0l13: service X86| on [noflags|]
c0-0c0s0a0l14: service X86| on [noflags|]
c0-0c0s0a0l15: service X86| on [noflags|]
c0-0c0s0a0l16: service X86| on [noflags|]
. . .

Example 52. Showing compute nodes in the disabled state

crayadm@smw:~> xtshow --compute --disabled
L1s ...
Cages ...
L0s ...
Nodes ...
c0-0c2s0n3: - X86| disabled [noflags|]
c0-0c2s11n0: - X86| disabled [noflags|]
c0-0c2s11n3: - X86| disabled [noflags|]
c1-0c0s11n2: - X86| disabled [noflags|]
Aries ...
AriesLcbs ...


Example 53. Showing components with a status of not empty

crayadm@smw:~> xtshow --not_empty c0-0c0s0
L1s ...
c0-0: - | on [warn|alert|]
Cages ...
L0s ...
c0-0c0s0: service X86| ready [noflags|]
Nodes ...
c0-0c0s0n1: service SB08 X86| ready [noflags|]
c0-0c0s0n2: service SB08 X86| ready [noflags|]
Aries ...
c0-0c0s0a0: service X86| on [noflags|]
AriesLcbs ...
c0-0c0s0a0l00: service X86| on [noflags|]
c0-0c0s0a0l01: service X86| on [noflags|]
c0-0c0s0a0l02: service X86| on [noflags|]
c0-0c0s0a0l03: service X86| on [noflags|]
c0-0c0s0a0l04: service X86| on [noflags|]
c0-0c0s0a0l05: service X86| on [noflags|]
c0-0c0s0a0l06: service X86| on [noflags|]
c0-0c0s0a0l07: service X86| on [noflags|]
c0-0c0s0a0l10: service X86| on [noflags|]
c0-0c0s0a0l11: service X86| on [noflags|]
c0-0c0s0a0l12: service X86| on [noflags|]
c0-0c0s0a0l13: service X86| on [noflags|]
c0-0c0s0a0l14: service X86| on [noflags|]
c0-0c0s0a0l15: service X86| on [noflags|]
c0-0c0s0a0l16: service X86| on [noflags|]
c0-0c0s0a0l17: service X86| on [noflags|]
c0-0c0s0a0l20: service X86| on [noflags|]
c0-0c0s0a0l21: service X86| on [noflags|]
c0-0c0s0a0l22: service X86| on [noflags|]
c0-0c0s0a0l23: service X86| on [noflags|]
c0-0c0s0a0l24: service X86| on [noflags|]
c0-0c0s0a0l25: service X86| on [noflags|]
c0-0c0s0a0l26: service X86| on [noflags|]
c0-0c0s0a0l27: service X86| on [noflags|]
. . .

4.21 Displaying Alerts and Warnings

Use the xtshow command to display alerts and warnings. Type commands as xtshow --option_name, where option_name is alert, warn, or noflags.

Note: Alerts are not propagated through the system hierarchy, so you only receive information for the component you are examining. A node can have an alert, but you would not see it if you ran the xtshow --alert command for a cabinet. In a similar fashion, you would not detect an alert on a cabinet if you checked the status of a node.

Example 54. Show all alerts on the system

smw:~> xtshow --alert


For additional information, see the xtshow(8) man page.

Alerts and warnings typically occur while the HSS xtcli command operates; these alerts and warnings are listed in the command output with an error message. After they are generated, alerts and warnings become part of the state for the component and remain set until you manually clear them. For example, the temporary loss of a heartbeat by the blade controller may set a warning state on a chip.

4.22 Clearing Flags

Use the xtclear command to clear system information for components you select. Type commands as xtclear --option_name, where option_name is alert, reserve, or warn.

Example 55. Clear all warnings in specified cabinet

To clear all warnings in cabinet c13-2:

smw:~> xtclear --warn c13-2

You must clear alerts, reserves, and warnings before a component can operate. Clearing an alert on a component frees its state so that subsequent commands can execute (see Appendix B, System States on page 347).

For more information, see the xtclear(8) man page.

4.23 Displaying Error Codes

When a Hardware Supervisory System (HSS) event error occurs, the related message is displayed on the SMW. You can also use the xterrorcode command on the SMW to display a single error code or the entire list of error codes.

Example 56. Displaying HSS error codes

smw:~> xterrorcode errorcode

A system error code entered in a log file is a bit mask; invoking the xterrorcode bitmask_code_number command on the SMW displays the associated error code.

Example 57. Displaying an HSS error code using its bit mask number

smw:~> xterrorcode 131279
Maximum error code (RS_NUM_ERR_CODE) is 297
code = 207, string = 'node voltage fault'

Managing User Access [5]

For a description of administrator accounts that enable you to access the functions described in this chapter, see Administering Accounts on page 115.

5.1 Load Balancing Across Login Nodes

Having all users log on to the same login node may overload the node. (Also, see the Caution in Login Nodes on page 36.) For typical interactive usage, a single login node is expected to handle 20 to 30 batch users or 20 to 40 interactive users with double this number of user processes. You can use the lbnamed load-balancing software to distribute logins to different login nodes. The lbnamed daemon is a name server that gathers the output of lbcd client daemons to select the least loaded node, provides DNS-like responses, interacts with the corporate DNS server, and directs the user login request to the least busy login node. Because lbnamed runs on the SMW, eth0 on the SMW must be connected to the same network from which users log on to the login nodes.

Note: If security considerations do not allow you to put the SMW on the public network, lbnamed may be installed on an external server. This can be any type of computer running the SUSE Linux Enterprise Server (SLES) operating system (not a 32-bit system). However, this option is not a tested or supported Cray configuration.

The behavior of the lbnamed daemon is site-configurable and determined by the contents of the /etc/opt/cray-xt-lbnamed/lbnamed.conf and /etc/opt/cray-xt-lbnamed/poller.conf configuration files. For details about configuring the load balancer, see Configuring the Load Balancer on page 163, and the lbcd(8), lbnamed(8), and lbnamed.conf(5) man pages.

5.2 Passwords

The default passwords for the root and crayadm accounts are the same for the System Management Workstation (SMW), the boot node, and the shared root.


Default passwords for the root and crayadm accounts are provided in the Installing and Configuring Cray Linux Environment (CLE) Software. Also, default MySQL passwords and an example of how to change them are provided in the Installing and Configuring Cray Linux Environment (CLE) Software. Cray recommends changing these default passwords as part of the software installation process.

5.2.1 Changing Default SMW Passwords After Completing Installation

After completing the installation, change the default SMW passwords. The SMW contains its own /etc/passwd file that is separate from the password file for the rest of the system. To change the passwords on the SMW, log on to the SMW as root and execute the following commands:

crayadm@smw:~> su - root
smw:~# passwd root
smw:~# passwd crayadm
smw:~# passwd cray-vnc
smw:~# passwd mysql

To change the default iDRAC6 password, see Procedure 110 on page 382.

5.2.2 Changing root and crayadm Passwords on Boot and Service Nodes

For security purposes, it is desirable to change the passwords for the root and crayadm accounts on a regular basis. Use the Linux passwd command to change the /etc/passwd file. For information about using the passwd command, see the passwd(1) man page.

Procedure 17. Changing the root and crayadm passwords on boot and service nodes

1. The boot node contains its own /etc/passwd file that is separate from the password file for the rest of the system. To change the passwords on the boot node, use these commands. You will be prompted to type and confirm new root and administrative passwords.

boot:~ # passwd root
boot:~ # passwd crayadm


2. To change the passwords on the other service nodes, you must run these commands on the shared root. Again, you will be prompted to type and confirm new passwords for the root and crayadm accounts.

Note: If the SDB node is not started, you must add the -x /etc/opt/cray/sdb/node_classes option to the xtopview command in this procedure.

boot:~ # xtopview
default/:/ # passwd root
default/:/ # passwd crayadm
default/:/ # exit

For more information about using the xtopview command, see Managing System Configuration with the xtopview Tool on page 131, and the xtopview(8) man page.

5.2.3 Changing the root Password on CNL Compute Nodes

Procedure 18. Changing the root password on CNL compute nodes

For compute nodes, update the root account password in the /opt/xt-images/templates/default/etc/shadow file on the SMW.

Note: To make these changes for a system partition, rather than for the entire system, replace /opt/xt-images/templates with /opt/xt-images/templates-pN, where N is the partition number.

1. Copy the master password file to the template directory.

smw:~ # cp /opt/xt-images/master/default/etc/shadow \
/opt/xt-images/templates/default/etc/shadow

2. Edit the password file to include a new encrypted password for the root account.

smw:~ # vi /opt/xt-images/templates/default/etc/shadow

3. After making these changes, update the boot image by following the steps in Procedure 2 on page 62.

5.2.4 Changing the HSS Data Store (MySQL) Password

Use the hssds_init command to change the HSS data store (MySQL) root password. The hssds_init command prompts you for the current HSS data store (MySQL) root password. When you type the current and new passwords, they are not echoed.

Note: The hssds_init utility is run by the SMWinstall command, so you do not have to run it during installation of an SMW release package.

For additional information, see the hssds_init(8) man page.


5.2.5 Changing Default MySQL Passwords on the SDB

Procedure 19. Changing default MySQL passwords on the SDB

For security, you should change the default passwords for MySQL database accounts.

1. If you have not set a site-specific MySQL password for root, type the following commands. Press the Enter key when prompted for a password.

boot:~ # ssh root@sdb
sdb:~ # mysql -h localhost -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.0.64-enterprise-log MySQL Enterprise Server (Commercial)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> set password for 'root'@'localhost' = password('newpassword');
Query OK, 0 rows affected (0.00 sec)
mysql> set password for 'root'@'%' = password('newpassword');
Query OK, 0 rows affected (0.00 sec)
mysql> set password for 'root'@'sdb' = password('newpassword');
Query OK, 0 rows affected (0.00 sec)

2. (Optional) Set a site-specific password for other MySQL database accounts.

a. To change the password for the sys_mgmt account, type the following MySQL command. You must also update .my.cnf in step 4.

mysql> set password for 'sys_mgmt'@'%' = password('newpassword');
Query OK, 0 rows affected (0.00 sec)

b. To change the password for the basic account, type the following MySQL command. You must also update /etc/opt/cray/MySQL/my.cnf in step 5.

Note: Changing the password for the basic MySQL user account will not provide any added security. This read-only account is used by the system to allow all users to run xtprocadmin, xtnodestat, and other commands that require SDB access.

mysql> set password for 'basic'@'%' = password('newpassword');
Query OK, 0 rows affected (0.00 sec)


c. To change the password for the mazama account, type the following MySQL commands. You must also update /etc/sysconfig/mazama in step 6.

mysql> set password for 'mazama'@'%' = password('newpassword');
Query OK, 0 rows affected (0.00 sec)
mysql> set password for 'mazama'@'localhost' = password('newpassword');
Query OK, 0 rows affected (0.00 sec)

Note: When making changes to the MySQL database, your connection may time out; however, it is automatically reconnected. If this happens, you will see messages similar to the following. These messages may be ignored.

ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 21127
Current database: *** NONE ***

Query OK, 0 rows affected (0.00 sec)

3. Exit from MySQL and the SDB.

mysql> exit
Bye
sdb:~ # exit
boot:~ #

4. (Optional) If you set a site-specific password for sys_mgmt in step 2, update the .my.cnf file for root with the new password. Additionally, update the .odbc.ini.root file with the new password.

a. Edit .my.cnf for root on the boot node.

boot:~ # vi /root/.my.cnf
[client]
user=sys_mgmt
password=newpassword

b. Edit .my.cnf for root in the shared root.

boot:~ # xtopview
default/:/ # vi /root/.my.cnf
[client]
user=sys_mgmt
password=newpassword
default/:/ # exit
boot:~ #

c. Edit .odbc.ini.root for the root user on the boot node. Update each database section with the new password.

boot:~ # vi /root/.odbc.ini.root
Driver = MySQL_ODBC
Description = Connector/ODBC Driver DSN
USER = sys_mgmt
PASSWORD = newpassword


After updating the file, copy .odbc.ini.root to .odbc.ini.

boot:~ # cp -p /root/.odbc.ini.root /root/.odbc.ini

d. Edit .odbc.ini.root for the root user in the shared root. Update each database section with the new password.

boot:~ # xtopview
default/:/ # vi /root/.odbc.ini.root
Driver = MySQL_ODBC
Description = Connector/ODBC Driver DSN
USER = sys_mgmt
PASSWORD = newpassword

After updating the file, copy .odbc.ini.root to .odbc.ini.

default/:/ # cp -p /root/.odbc.ini.root /root/.odbc.ini
default/:/ # exit
boot:~ #

5. (Optional) If you set a site-specific password for basic in step 2, update the /etc/opt/cray/MySQL/my.cnf file with the new password. Additionally, update the /etc/opt/cray/sysadm/odbc.ini file with the new password.

a. Edit /etc/opt/cray/MySQL/my.cnf on the boot node.

boot:~ # vi /etc/opt/cray/MySQL/my.cnf
# The following options will be passed to all MySQL clients
[client]
user=basic
password=newpassword

b. Edit /etc/opt/cray/MySQL/my.cnf in the shared root.

boot:~ # xtopview
default/:/ # vi /etc/opt/cray/MySQL/my.cnf
# The following options will be passed to all MySQL clients
[client]
user=basic
password=newpassword
default/:/ # exit
boot:~ #

c. Edit /etc/opt/cray/sysadm/odbc.ini for the basic user on the boot node. Update each database section with the new password.

boot:~ # vi /etc/opt/cray/sysadm/odbc.ini
Driver = MySQL_ODBC
Description = Connector/ODBC Driver DSN
USER = basic
PASSWORD = newpassword


d. Edit /etc/opt/cray/sysadm/odbc.ini in the shared root. Update each database section with the new password.

boot:~ # xtopview
default/:/ # vi /etc/opt/cray/sysadm/odbc.ini
Driver = MySQL_ODBC
Description = Connector/ODBC Driver DSN
USER = basic
PASSWORD = newpassword
default/:/ # exit
boot:~ #

6. (Optional) If you set a site-specific password for mazama in step 2, update the /etc/sysconfig/mazama file with the new password. In addition, update the mazama MySQL account on the SMW to match.

a. Edit /etc/sysconfig/mazama on the boot node.

boot:~ # vi /etc/sysconfig/mazama
## Type: string
## Default: mazama
## Config: ""
#
# Default password for mazama user in the mazama database
#
passwd=newpassword

b. Edit /etc/sysconfig/mazama in the shared root.

boot:~ # xtopview
default/:/ # vi /etc/sysconfig/mazama
## Type: string
## Default: mazama
## Config: ""
#
# Default password for mazama user in the mazama database
#
passwd=newpassword
default/:/ # exit
boot:~ #

c. To change the password for the MySQL accounts on the SMW, type the following MySQL commands.

boot:~ # ssh smw
smw:~ # mysql -u root -p
mysql> set password for 'mazama'@'%' = password('newpassword');
mysql> set password for 'mazama'@'localhost' = password('newpassword');
mysql> set password for 'mazama'@'smw' = password('newpassword');
mysql> exit


d. Update /etc/sysconfig/mazama on the SMW.

smw:~# vi /etc/sysconfig/mazama
## Type: string
## Default: mazama
## Config: ""
#
# Default password for mazama user in the mazama database
#
passwd=newpassword

Make the following additional change, unless you are using a remote MySQL server for CMS logs.

## Type: string
## Default: mazama
# Default password for mazama user in the mazama Log database
#
log_passwd=newpassword

5.2.6 Assigning and Changing User Passwords

Because a Cray system has a read-only shared-root configuration, users cannot execute the passwd command on a Cray system to change their password. If your site has an external authentication service such as Kerberos or LDAP, users should follow your site instructions to update their passwords. If your site does not have external authentication set up, you can implement a manual mechanism, such as having users change their password on an external system and then periodically copying their entries from the external /etc/passwd, /etc/shadow, and /etc/group files to the equivalent Cray system files in the default xtopview view.

Warning: Be careful not to overwrite Cray system accounts (crayadm, cray_vnc, and standard Linux accounts such as root) in the /etc/passwd, /etc/shadow, and /etc/group files.
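A rough sketch of the manual copying approach described above, using standard shell commands; the user name bobp and the editing steps are illustrative, and you should verify every entry before adding it so that the system accounts mentioned in the warning are never touched:

external$ grep '^bobp:' /etc/passwd /etc/shadow /etc/group
boot:~ # xtopview
default/:/ # vi /etc/passwd
default/:/ # vi /etc/shadow
default/:/ # vi /etc/group
default/:/ # exit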

5.2.7 Logins That Do Not Require Passwords

All logins must have passwords; however, you can set up passwordless ssh by creating an ssh key with a null passphrase and distributing that ssh key to another computer.

While the key-based authentication systems such as OpenSSH are relatively secure, convenience and security are often mutually exclusive. Setting up passphrase-less ssh is convenient, but the security ramifications can be dire; if the local host is compromised, access to the remote host will be compromised as well. If you wish to use passphrase-less authentication, Cray encourages you to consider using ssh-agent if available, or take other steps to mitigate risk.
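The two approaches look roughly like the following; the key type, file names, and remote host are examples only:

# Key with a null passphrase (convenient, but riskier):
user@local:~> ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
user@local:~> ssh-copy-id user@remote-host

# Preferred: protect the key with a passphrase and cache it with ssh-agent:
user@local:~> ssh-keygen -t rsa -f ~/.ssh/id_rsa
user@local:~> eval $(ssh-agent)
user@local:~> ssh-add ~/.ssh/id_rsa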


5.3 Administering Accounts

Your Cray system supports several types of accounts:

• Boot node accounts allow only system administrator (crayadm) and superuser (root) access. To modify configuration files, the administrator must become superuser by supplying the root account password.

• SMW user account access is managed using local files (that is, /etc/passwd, /etc/shadow, etc.). In addition to the standard Linux system accounts, Cray includes an account named crayadm; crayadm is used for many of the Cray system management functions, such as booting the system.

• Cray provides a Virtual Network Computing (VNC) account on the SMW (for details, see Appendix C, Remote Access to the SMW on page 349).

5.3.1 Managing Boot Node Accounts

The only accounts that are supported on the boot node are root (superuser), crayadm (administrator), and those accounts for various services such as network time protocol (NTP). The boot node does not support user accounts.

The boot node has an /etc/passwd file that is separate from the password file for the rest of the system. For a list of default passwords, see Installing and Configuring Cray Linux Environment (CLE) Software.

5.3.2 Managing User Accounts That Must Be Maintained on the Cray System Directly

Note: Normally, user accounts and passwords are managed through an external LDAP or Kerberos server. However, for those accounts that must be maintained on the Cray system directly, in case the LDAP or Kerberos service is not available, this section describes how you manage them in the shared root.

User accounts are set up on the shared-root file system by using the xtopview command. Your Cray system supports 16-bit and 32-bit user IDs (UIDs). The 16-bit user IDs run 0-65535; that is, 0-(2^16-1). The 32-bit user IDs run 0-(2^32-1), although Cray systems are limited to a maximum of 65,536 user accounts, including those that are predefined, such as root, crayadm, and mysql.

For more information about using the xtopview command in the default view, see Managing System Configuration with the xtopview Tool on page 131, and the xtopview(8) man page. For more information about mysql accounts, see Database Security on page 197.


5.3.2.1 Adding a User or Group

To add additional accounts to the shared root for login nodes, use the groupadd and useradd commands using the xtopview command in the default view.

Example 58. Adding a group

To add the group xtusers with a gid of 5605, type:

boot:~ # xtopview
default/:/ # groupadd -g 5605 xtusers
default/:/ # exit

The above groupadd command adds group xtusers to /etc/group.

Example 59. Adding a user account

This example creates a new user bobp from xtopview in the default view. The new user account, bobp, has a user ID of 12645, a home directory of /home/users/bobp, and runs a /bin/bash login shell. Then, as root, create the user's home directory and chown the directory to the new user.

boot:~ # xtopview
default/:/ # useradd -d /home/users/bobp -g 5605 -s /bin/bash -u 12645 bobp
default/:/ # exit
boot:~ # ssh root@login
login:~ # mkdir -p /home/users/bobp
login:~ # chown -R bobp:xtusers /home/users/bobp

After the account is created, use the passwd command to set a password in either /etc/passwd or /etc/shadow.

For more information, see the useradd(8), passwd(1), and groupadd(8) man pages.

5.3.2.2 Removing a User or Group

To remove a user account, first remove all files, jobs, and other references to the user. Then, using the xtopview command in the default view, remove users or groups by using the Linux commands /usr/sbin/userdel and /usr/sbin/groupdel, respectively; and, as root, remove the user's home directory.

Example 60. Removing a user account

To remove the user bobp and the user's home directory, type:

boot:~ # xtopview
default/:/ # userdel -r bobp
default/:/ # exit
boot:~ # ssh root@login
login:~ # rm -rf /home/users/bobp
login:~ # exit

For more information, see the userdel(8) and groupdel(8) man pages.


5.3.2.3 Changing User or Group Information

To change user and group information, use Linux commands. For more information, see the usermod(8) and groupmod(8) man pages.

5.3.2.4 Assigning Groups of CNL Compute Nodes to a User Group

Use the /etc/opt/cray/sdb/attr.defaults file label attribute to assign groups of CNL compute nodes to specific user groups without the need to partition the system. For more information, see Setting Node Attributes Using the /etc/opt/cray/sdb/attr.xthwinv.xml and /etc/opt/cray/sdb/attr.defaults Files on page 202.

5.3.2.5 Associating Users with Projects

You can assign project names for users to submit jobs in order to determine project charges. Project names can be up to 80 characters long. To associate users with project names, add the following line to their individual login scripts in their home directories:

set_account a_project_name

After accounts are set, users do not have to manually run the set_account command at each login.

If your users run batch jobs, they can set a project code; for example, when using PBS Professional, a user can set a project code with an environment variable. This associates the project code with the job in the accounting database. For more information, see the documentation provided by your batch system vendor.

5.3.2.6 Enabling LDAP Support for User Authentication

To enable LDAP support for user authentication, you must edit files as shown, in addition to making any other standard LDAP configuration setting changes necessary for your site.

Note: The following changes should be made using xtopview in the default view.

boot:~ # xtopview
default/:/ # vi /etc/pam.d/common-account-pc

In file /etc/pam.d/common-account, replace:

account required pam_unix2.so

with

account sufficient pam_ldap.so config=/etc/openldap/ldap.conf
account required pam_unix2.so

default/:/ # vi /etc/pam.d/common-auth-pc


In file /etc/pam.d/common-auth, replace:

auth required pam_env.so
auth required pam_unix2.so

with

auth required pam_env.so
auth sufficient pam_ldap.so config=/etc/openldap/ldap.conf
auth required pam_unix2.so

default/:/ # vi /etc/pam.d/common-password-pc

In file /etc/pam.d/common-password, replace:

password required pam_pwcheck.so nullok
password required pam_unix2.so nullok use_authtok

with

password required pam_pwcheck.so nullok
password sufficient pam_ldap.so config=/etc/openldap/ldap.conf
password required pam_unix2.so nullok use_authtok

default/:/ # vi /etc/pam.d/common-session-pc

In file /etc/pam.d/common-session, replace:

session required pam_limits.so
session required pam_unix2.so
session optional pam_umask.so

with

session required pam_limits.so
session sufficient pam_ldap.so config=/etc/openldap/ldap.conf
session required pam_unix2.so
session optional pam_umask.so

On the boot root, in the xtopview default view, make your site-specific changes to the /etc/openldap/ldap.conf or /etc/ldap.conf, /etc/nsswitch.conf, /etc/sysconfig/ldap, /etc/passwd, and /etc/group files.

Note: Adding the LDAP servers to the local host file allows you to avoid running DNS on the SDB and MDS. The SDB and MDS need access to the LDAP server. You can set up this access through RSIP or NAT; see Configuring Realm-specific IP Addressing (RSIP) on page 209.

default/:/ # exit


5.3.3 Setting Disk Quotas for a User on the Cray Local, Non-Lustre File System

The quota and quota-nfs RPMs are installed by default. You can activate disk quotas for a user on service nodes on the Cray local, non-Lustre file system. You must activate two boot scripts, as discussed in the README.SUSE file located in /usr/share/doc/packages/quota.

Note: When following the procedure in the README.SUSE file, remember that any commands should be issued from within the default view of xtopview. Also, use the chkconfig command instead of the yast2 run level editor to turn on the quota and quotad services:

boot:~ # xtopview
default/:/ # chkconfig boot.quota on
default/:/ # chkconfig quotad on
default/:/ # exit

Then start the services on all service nodes; either reboot the system or execute /etc/init.d/boot.quota start; /etc/init.d/quotad start on each service node.

After the quota services have been enabled, for each user you can use standard Linux quota commands to do the following:

• Enable quotas (quotaon command)
• Check quotas (quotacheck command)
• Set quotas (edquota command)

The quota subsystem warns users when they exceed their allotted limit, but it allows some extra space for current work (that is, there is a hard limit and a soft limit).

For more information, see the quotaon(8), quotacheck(8), and edquota(8) Linux man pages.
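A brief sketch of that workflow for a single user follows; it assumes the commands are run as root on a service node that mounts the file system, and the user name is illustrative:

login:~ # quotacheck -avug
login:~ # quotaon -av
login:~ # edquota -u bobp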

5.4 About Modules and Modulefiles

The Modules software package enables your users to modify their environment dynamically by using modulefiles. The module command is a user interface to the Modules package. The module command interprets modulefiles, which contain Tool Command Language (Tcl) code, and dynamically modifies shell environment variables such as PATH and MANPATH. For more information about the Modules software package, see the module(1) and modulefile(4) man pages. The shell configuration files /etc/csh.cshrc.local and /etc/bash.bashrc.local contain module commands that establish the default user environment set by the system at login time.


To support customer-specific needs, you can create your own modulefiles for a product set for your users; for details, see Appendix E, Creating Modulefiles on page 359.

5.5 About the /etc/*rc.local Files

The /etc/csh.cshrc.local and /etc/bash.bashrc.local files contain several ordered blocks, each clearly delimited by ##BEGIN and ##END tags.

The CLE installation and upgrade process creates and maintains the first two ##BEGIN and ##END blocks. These blocks contain clearly delimited sections for operating system and programming environment changes only. These sections should not be modified by the administrator. The administrator should add site local changes within the SITE-set-up block only. A CLE upgrade may modify the OS blocks in place, preserving SITE-set-up local changes you have made.

5.6 System-wide Default Modulefiles

The /etc/csh.cshrc.local and /etc/bash.bashrc.local files load Base-opts, which loads two lists of modulefiles: a default list and a site-specified local list. The default list differs between the SMW and the Cray system. On the SMW, the file /etc/opt/cray/modules/Base-opts.default.SMW contains the list of the CLE modulefiles to load by default. On the Cray system, the file /etc/opt/cray/modules/Base-opts.default contains the list of CLE modulefiles to load by default.

Additionally, all the modulefiles listed in the file /etc/opt/cray/modules/Base-opts.default.local are loaded. Edit this file to make your site-specific changes.

The /etc/opt/cray/modules/Base-opts.default.local file initially includes the admin-modules modulefile, which loads a full set of modulefiles. You do not need to manually load the admin-modules modulefile, unless you have removed it from the default list. The CLE installation process removes the admin-modules modulefile from the default list on login nodes. The files on the Cray system are installed on both the boot root and the shared root.

An example file, /etc/opt/cray/modules/Base-opts.default.local.example, is also provided. The example file is a copy of the /etc/opt/cray/modules/Base-opts.default.local file provided for an initial installation.


5.7 Configuring the Default Programming Environment (PE)

The Cray, PGI, GCC, PathScale (Cray XE systems only), and Intel compilers are available to Cray system users, if installed. A Programming Environment comprises a compiler and its supporting libraries and tools.

The system-wide default PE is set to PrgEnv-cray in the PE-set-up block in the /etc/*rc.local files. For Cray XE sites without a Cray Compiling Environment license, PrgEnv-pgi is the default. If you wish to alter the system default PE, do not edit the PE-set-up block manually.

To change the default PE and add more PE user defaults, add appropriate instructions to the SITE-set-up block. The instructions in the SITE-set-up block are not altered by operating system installations. The instructions are evaluated after the PE-set-up block, so make sure new instructions do not conflict with the ones in the PE-set-up block.

For example, if you want to change the default PE to PrgEnv-abc, add this to the SITE-set-up block in /etc/*rc.local files:

module unload PrgEnv-cray
module load PrgEnv-abc

Targeting modules are released in the xt-asyncpe and the craype packages and installed in either the /opt/cray/xt-asyncpe/default/modulefiles or /opt/cray/craype/default/modulefiles directory, depending on the package name.

The following commands display the list of available targeting modulefiles:

module avail xt-asyncpe
module avail craype

If there are no targeting modules loaded in the user's environment, the compiler driver scripts (cc, CC, ftn) set the CPU target to sandybridge on Cray XC30 systems, and to interlagos on Cray XE and Cray XK systems. To change the default CPU target, configure /etc/*rc.local to load the appropriate craype-* target module. For example, to set the default target to xyz in the *rc.local files, add the following line to the SITE-set-up section:

module add craype-xyz

Other commonly used modules and settings in the SITE-set-up block are:

#load a workload manager
module load pbs
#define target architecture
module load craype-abudhabi
#mpich2 is not loaded by PrgEnv-cray
module load cray-mpich2
#enable abnormal termination processing (see intro_atp)
setenv ATP_ENABLED 1


5.8 Using the pam_listfile Module in the Shared Root Environment

The Linux pam_listfile Pluggable Authentication Module (PAM) may be used to maintain a list of authorized users. Using the pam_listfile PAM may also help to reduce impacts on service nodes if users consume too many resources (see Caution in Login Nodes on page 36).

The pam_listfile PAM requires that the file specified with the file= parameter be a regular file. The usual approach of storing the file in the /etc directory does not work in the shared-root environment of Cray systems: files in the /etc directory are symbolic links, so the required file must be created in a directory other than the /etc directory. For example, you can place it in persistent /var or another directory that is not controlled by the shared root.

Example 61. Creating a pam_listfile list file

This example assumes you have created an empty pam_listfile called /var/path/to/pam_listfile_authorized_users_list. It adds authorized users to it.

boot:~ # xtopview -x /etc/opt/cray/sdb/node_classes -c login
class/login/:# vi /var/path/to/pam_listfile_authorized_users_list

user1
user2
...

Example 62. Adding a line to /etc/pam.d/sshd to enable pam_listfile

Edit the pam.d/sshd file to include an alternative path for file=.

class/login/:# vi /etc/pam.d/sshd

auth required pam_listfile.so \
    file=/var/path/to/pam_listfile_authorized_users_list

If you need nodes to have different pam_listfile list files, create the list files and specialize the PAM configuration files (such as pam.d/sshd) to point to them.
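For example, a minimal sketch of pointing one node's sshd PAM configuration at its own list file (the node ID and list-file path are illustrative; the xtspec command is described in Specializing Files on page 134):

boot:~ # xtopview -x /etc/opt/cray/sdb/node_classes -n 12
node/12/: # xtspec /etc/pam.d/sshd
node/12/: # vi /etc/pam.d/sshd
(point the file= parameter at that node's list file, for example /var/path/to/pam_listfile_node12_users)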

5.9 ulimit Stack Size Limit

The login environment defaults to the kernel default stack size limit. To set up the default user environment to have an unlimited stack size resource limit, add the following to /etc/profile.local in the shared root:

ulimit -Ss unlimited
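To confirm the new default from a login session, a user can check the soft stack limit (the output shown assumes the change above is in place):

% ulimit -Ss
unlimited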

5.10 Stopping a User's Job

This section describes how to stop a user's job.


5.10.1 Stopping a Job Running in Interactive Mode

If the job is running on a CNL compute node in interactive mode (through aprun), perform the following procedure.

Procedure 20. Stopping a job running in interactive mode

• Use the apkill -signal apid command to send a signal to all processes that are part of the specified application (apid); signal 15 (SIGTERM) is sent by default.

The signaled application must belong to the current user unless the user is a privileged user. For more information, see the aprun(1) and apkill(1) man pages.
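For example, to send the default SIGTERM to a hypothetical application with apid 48640, then SIGKILL if it does not exit (the apid is illustrative):

% apkill 48640
% apkill -9 48640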

5.10.2 Stopping a Job Running Under a Batch System

To stop a job that is running under a batch system, see the documentation provided by your batch system vendor.

Example 63. Stopping a job running under PBS Professional

If the job is running under PBS Professional, use the qdel command and name the job.

To terminate job 104, type:

% qdel 104

For Platform LSF the command to kill a job is bkill.
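For example, to kill job 104 under Platform LSF (the job ID is illustrative):

% bkill 104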


Modifying an Installed System [6]

6.1 Configuring the Shared-root File System on Service Nodes

CLE implements a shared-root file system where / is exported from the boot node and is mounted as read-only on all service nodes. To overcome the restriction that all nodes must have the same shared-root file system, /etc directories can be symbolic links to unique directories that have the same structure as the default /etc directory but contain modified files. These node-specific files reside in subdirectories in the /.shared/base directory. Specialization is the process of changing the link to a file in the /etc directory to point to a unique file for one, a few, or all nodes. You can specialize one or more files for an individual node or for a class (type) of nodes, such as login. You must be root user to configure the shared-root file system in this manner. You can specialize files when you install the system or at a later time.

The hierarchical structure of the specialized files is shown in Figure 2. Node specialization is more specific than class specialization. Class specialization is more specific than default specialization. Generally, about 98% of what the service nodes use is the default version of the shared root.

Figure 2. Types of Specialization. (The original figure is a diagram; its content is summarized here.) Three levels are shown: default specialization, in which nodes use the shared /rr root directory and its default /etc files; class specialization, in which class-specific /etc files and other files override the default for all nodes of that class; and node specialization, in which node-specific /etc files and other files override both the class and default versions.


6.1.1 Specialization

You specialize files when you need to point to a unique version of a file in the /etc directory rather than to the standard version of the file that is shared on all nodes. For example, you might specialize files when differences exist in hardware, network configuration, or boot scripts or when there are services that run on a single node. You can also specialize files for a class of nodes that have a particular function, such as login.

Generally, files are specialized as part of the installation process, but the process can be done at any time. It is good practice to enter the xtopview shell (see Managing System Configuration with the xtopview Tool on page 131) and then specialize your files (see Specializing Files on page 134). Table 2 lists files and directories that you can specialize by class and the reasons to do so. Table 3 lists files and directories that you can specialize by node and the reasons to do so. In these tables, * refers to "wildcard" characters that represent no characters or any number of characters.

Table 2. File Specialization by Class

File or Directory              Reason for Specialization
/etc/auditd.conf               Cray Audit configured on login nodes.
/etc/audit.rules               Cray Audit configured on login nodes.
/etc/cron*                     Different classes need custom crontabs.
/etc/fstab                     I/O nodes need to mount other file systems.
/etc/hosts.{allow,deny}        Must restrict logins on login nodes.
/etc/init.d/boot.d/*           Different classes have different start-up scripts enabled.
/etc/init.d/rc*/               Different classes have different start-up scripts enabled.
/etc/issue                     Different classes have different messages.
/etc/modprobe.conf             I/O and login nodes have different hardware.
/etc/motd                      Different classes have different messages.
/etc/pam*                      Authentication is class-specific.
/etc/profile.d/*               Login nodes have custom environments.
/etc/resolv.conf               Hosts that interact with external servers need special resolver configurations.
/etc/security/*                Authorization and system limits are class-specific.
/etc/sysconfig/network/*       I/O and login nodes need custom network configuration.


Table 3. File Specialization by Node

File or Directory                      Reason for Specialization
/etc/cron*                             Certain service nodes, such as sdb and syslog, need custom crontabs.
/etc/ntp.conf                          A node that runs an NTP server needs a different configuration than NTP clients.
/etc/sysconfig/network/*               Each network node should have a different IP address.
/etc/syslog-ng/syslog-ng.conf.in       A node that runs a syslog server needs a different configuration than syslog clients.
/etc/ssh/*key*                         Use when sharing keys across systems is unacceptable.

6.1.2 Visible Shared-root File System Layout

Figure 3 is a detailed illustration of shared-root directory structure. The directory current is a subdirectory of /rr. The current directory links to a time-stamped directory (in this example 20110726). The timestamp indicates the date of the software installation, not the date of the release.


Figure 3. Shared-root Implementation. (The original figure is a diagram; its content is summarized here.) The /rr directory contains the time-stamped installation directory (20110726 in this example), to which current is a link. That directory holds etc, root, bin, sbin, lib, usr, and the .shared directory, which in turn contains the base, class, node, and default subdirectories. The class views (for example, login and io) and the node views (for example, nodes 11, 128, and 136) each contain an etc directory of links into the corresponding files under .shared/base.

Service nodes mount the /rr/current directory from the boot node as read-only for use as their root file system. The visible file layout, that is, how it appears from the node you are viewing it from, contains the following files:

/       Root file system
/root   Equivalent to directory of the same name in /rr/current
/bin    Equivalent to directory of the same name in /rr/current
/lib    Equivalent to directory of the same name in /rr/current
/sbin   Equivalent to directory of the same name in /rr/current
/usr    Equivalent to directory of the same name in /rr/current
/opt    Equivalent to directory of the same name in /rr/current
/etc    Contains links to the shared-root files
/home   Link to /home, a customer-specific location
/tmp    Implemented through the tmpfs (in RAM)
/var    Directory in the tmpfs and RAMFS but populated with skeleton files if you do not have persistent /var
/proc   Per-node pseudo-file system
/dev    Per-node pseudo-file system implemented through the DEVFS
/ufs    Mount point for the /ufs file system to be mounted from the ufs node

6.1.3 How Specialization Is Implemented

The shared-root file system is implemented in the /.shared directory. Only the /etc directory has been set up for specialization. Files in /etc are symbolic links to files in /.shared/base. A specialized file is a unique version of the file in the /.shared/base directory.

The /.shared directory contains four subdirectories: base, node, class, and default. The node, class, and default directories are also known as view directories, because you can look at the file system (with the xtopview command) as if the view directory were /.

The base subdirectory also contains subdirectories called node, class, and default. These are referred to as base directories. They contain files that are specific to a certain node, specific to a class of nodes, or shared as the default among all nodes. Under each of the base directories is a rooted directory hierarchy where files are stored.

Example 64. Shared-root links

The path of the link shows the type of specialization for the file.

Default specialization:
default/: # ls -la /etc/hosts
lrwxrwxrwx 1 root root 31 Dec 8 17:12 /etc/hosts -> /.shared/base/default/etc/hosts

Class specialization:
class/login/: # ls -la /etc/security/access.conf
lrwxrwxrwx 1 root root 46 Dec 8 17:14 /etc/security/access.conf -> \
/.shared/base/class/login/etc/security/access.conf

Node specialization:
node/128/: # ls -la /etc/resolv.conf
lrwxrwxrwx 1 root root 36 Dec 8 17:15 /etc/resolv.conf -> \
/.shared/base/node/128/etc/resolv.conf


6.1.4 Working with the Shared-root File System

CLE commands shown in Table 4 control and monitor the shared-root file system. For more information, refer to the sections noted and the related man pages.

Table 4. Shared-root Commands

Command             Function
xtopview            View file layout from the specified node (see Managing System Configuration with the xtopview Tool on page 131).
xtopcommit          Record file specialization before leaving xtopview shell (see Updating Specialized Files From Within the xtopview Shell on page 134).
xtspec              Specialize; create a directory structure that links files to non-default files (see Specializing Files on page 134).
xthowspec           Determine the type of specialization (see Determining which Files are Specialized on page 136).
xtverifyshroot      Verify that node-specialized and class-specialized files are linked correctly (see Checking Shared-root Configuration on page 138).
xtverifyconfig      Verify that start/stop links generated by tools such as chkconfig are consistent across all views of the shared root. You can configure xtopview to invoke xtverifyconfig automatically; this is the preferred usage. xtverifyconfig is not intended for direct use. (See Verifying the Coherency of /etc/init.d Files Across All Shared Root Views on page 138.)
xtverifydefaults    Verify and fix inconsistent system default links within the shared root. You can configure xtopview to invoke xtverifydefaults automatically; this is the preferred usage. xtverifydefaults is not intended for direct use. (See Verifying the Coherency of /etc/init.d Files Across All Shared Root Views on page 138.)
xtcloneshared       Create a directory structure for a new node or class based on an existing node or class (see Cloning a Shared-root Hierarchy on page 139).
xtnce               Modify the class of a node or display the current class of a node (see Changing the Class of a Node on page 139).


xtunspec            Remove specialization (see Removing Specialization on page 140).
xtoprlog            Display RCS log information for shared root files (see Displaying RCS Log Information for Shared Root Files on page 140).
xtopco              Check out (restore) RCS versioned shared root files (see Checking Out an RCS Version of Shared Root Files on page 141).
xtoprdump           Print a list of file specifications that can be used as the list of files to operate on an archive of shared root file system files (see Listing Shared Root File Specification and Version Information on page 142).
xtoparchive         Perform operations on an archive of shared root configuration files (see Performing Archive Operations on Shared Root Files on page 143).

6.1.4.1 Managing System Configuration with the xtopview Tool

The xtopview tool manages the files in the shared-root file system. You specify the view of the system you want, such as from a particular node, when you invoke the command. The system appears as if you were logged in directly to the location you specify; that is, the files that are specialized for that node appear in the /etc directory. You can specify location by node ID or hostname.

Changes you make within xtopview are logged to a Revision Control System (RCS) file. When you exit the shell, you are prompted to type a message about each change you have made. Use the c command to comment the work you have done in xtopview. This information is saved in the RCS files.

Tip: Use the -m msg option when starting an xtopview session to make similar changes to multiple files.

The changed files and messages are then logged to create a history that is stored in the /.shared/base directory by its specialization (node, class, or default) and file name. For example, changes and messages relating to default-specialized file /etc/spk are stored in /.shared/base/default/etc/RCS. Use standard RCS tools, such as rlog, for retrieving information.
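For example, a minimal sketch of displaying the RCS history for the default-specialized /etc/spk file mentioned above (rlog is a standard RCS command; the ,v file name follows the convention shown):

boot:~ # xtopview
default/:/ # rlog /.shared/base/default/etc/RCS/spk,v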

Warning: If you do not want the changes you have made in your xtopview session, you must invoke any necessary commands to undo them. There is no automatic way to back out.

S–2393–5003 131 Managing System Software for the Cray Linux Environment™

Cray recommends that you configure the shared root from within the xtopview shell. Only operations that take place within the xtopview shell are logged. If you choose to use specialization commands outside of xtopview, they are not logged. Logs reside in the /rr/current/.shared/log path relative to the boot node. New files that are created from within the xtopview shell automatically have the specialization that is associated with the view under which you are operating. You do not have to specialize them. If you want a file to be used by all service nodes, create the file in the default view.
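For example, a minimal sketch of creating a configuration file that all service nodes should see (the file name is hypothetical):

boot:~ # xtopview
default/:/ # vi /etc/site-example.conf
default/:/ # exit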

The xtopview command is typically executed on the boot node; however, you may perform xtopview work from the SMW if the system is not booted, for example, if your system is undergoing hardware maintenance. For more information, see Example 70.

Example 65. Starting the xtopview shell for a node

To start the xtopview shell for node 131, type:

boot:~ # xtopview -n 131
node/131/: #

Example 66. Starting the xtopview shell for a class of nodes

To start the xtopview shell for the login nodes, type:

boot:~ # xtopview -c login
class/login/: #

Note: If you are using the emacs editor within the xtopview shell, you may see the following message:

Symbolic link to RCS-controlled source file; follow link [yes or no]?

The symbolic link points to a real file in the /.shared directory. If you choose yes, you edit the file directly. If you choose no, you replace the symbolic link with a real file, but when you exit the xtopview shell, the file is moved to the correct location and the link is recreated. The difference is that if you are editing the real file, modifications appear immediately in other views.

Example 67. Starting the xtopview shell for a directory other than /rr/current

To start the xtopview shell in a directory other than /rr/current, which is a link to the most current directory, type:

boot:~ # xtopview -r /rr/20120901
default/:/ #


Example 68. Sample xtopview session

boot:~ # xtopview -n 3
node/3:/ # vi etc/fstab
. . . (edited the file)
node/3:/ # exit
exit
***File /etc/fstab was MODIFIED
operation on file /etc/fstab? (h for help):c
enter description, terminated with single '.' or end of file:
>changed the fstab file to add support for xyz.
boot:~ #

Generally, the xtopview command obtains node and class information from the SDB. If the SDB is not running, you can direct xtopview to access the /etc/opt/cray/sdb/node_classes file by selecting the -x option.

Example 69. Starting xtopview using node_classes for information

For nodes:
boot:~ # xtopview -x /etc/opt/cray/sdb/node_classes -n 4

For classes:
boot:~ # xtopview -x /etc/opt/cray/sdb/node_classes -c login

Example 70. Running xtopview from the SMW while the system is not booted

You may use xtopview from the SMW to perform software work while the system is not booted. This can be quite useful during hardware maintenance periods. As root on the SMW, verify that the boot node is down:

smw:~ # ping boot

Mount the boot root and shared root file systems (if they are not already mounted). This example uses the default bootroot_dir and the shared root mount point is /rr.

smw:~ # mount /dev/disk/by-id/your_system_bootroot_ID /bootroot0
smw:~ # mount /dev/disk/by-id/your_system_shareroot_ID /bootroot0/rr

Start a default view xtopview session:

Note: Additional options like -c class or -n nid can be used here as well.

smw:~ # chroot /bootroot0
smw:/ # xtopview -x /etc/opt/cray/sdb/node_classes -r /rr/current

When you are finished with your changes, exit from the xtopview and chroot sessions:

default/:~ # exit
smw:/ # exit


Unmount the boot root and shared root file systems:

smw:~ # umount /bootroot0/rr
smw:~ # umount /bootroot0

Verify that the boot root and shared root file systems are not mounted:

smw:~ # mount
smw:~ # df

For more information, see the xtopview(8) man page.

6.1.4.2 Updating Specialized Files From Within the xtopview Shell

When you exit the xtopview shell (see Managing System Configuration with the xtopview Tool on page 131), changes you make are propagated to the shared-root file system. Use the xtopcommit command to immediately update the shared root with modifications you have made. You do not need to leave the xtopview shell.

Example 71. Updating a file within xtopview shell

boot:~ # xtopview -n 3
node/3:/ # vi /etc/fstab
node/3:/ # xtopcommit
***File /etc/fstab was MODIFIED
operation on file /etc/fstab? (h for help):h
c:check-in - record changes in RCS file
d:diff - diff between file and backup RCS file
h:help - print this help message
m:message - set message for later checkins
M:nomsg - clear previously set message
l:list - list file info (ls -l)
s:skip - check-in file with empty log message
q:quit - check-in ALL files without querying

6.1.4.3 Specializing Files

Specifying a view with the xtopview command does not automatically specialize existing files. To specialize existing files, you must use the specialization command xtspec. The command runs on the boot node and creates a copy of a file that is unique to a node or class. The xtspec command has the form:

xtspec [options] file

The command specializes the file at the location file and updates each node or class of nodes that contains the newly specialized file if the new file is the most specialized file in its view. For example, if a file is specialized by class io, for all nodes with class io the symbolic links associated with this file are updated to point to the new file unless they are already specialized by node (see Figure 2), which is a more restrictive class.


If you are not within xtopview (see Managing System Configuration with the xtopview Tool on page 131) when you specialize a file, you must specify the path of the shared root with the -r option. In addition, the RCS log of changes has a generic entry for each file.

Note: The xtspec command can be used only to specialize files or directories residing in or under the /etc directory. If you attempt to specialize a file or directory outside of the /etc directory, the command fails and an error message is generated.

The -V option of the xtspec command specifies the location from which the file that is to be the specialized file is copied. If the -V option is specified, the newly specialized file is a duplicate of the file from the target's view. If the -V option is not specified, the newly specialized file is a duplicate of the file from the default view.

If you do not specify otherwise, the default specialization level is based on the current view if you are running in the xtopview shell (see Managing System Configuration with the xtopview Tool on page 131) or on the default view if you are operating outside the xtopview shell. Classes are defined in the node_classes file (see Class Name on page 56).

Procedure 21. Specializing a file by class login

1. To specialize the file /etc/dhcpd.conf by the class of login nodes, enter the xtopview shell in the login class view.

boot:~ # xtopview -c login

2. Specialize the selected file.

class/login:~ # xtspec /etc/dhcpd.conf

3. Edit /etc/dhcpd.conf if it is the default copy of the file. If you have pointed to a unique copy of the file in the xtspec command, omit this step.

As a result of this procedure, each node in the class login links to the "new" /etc/dhcpd.conf file unless the node is already specialized by node. For example, node 23 might already be specialized and link to a different /etc/dhcpd.conf file.

Procedure 22. Specializing a file by node

1. To specialize the file /etc/motd for node 11, enter the xtopview shell for that node.

boot:~ # xtopview -n 11

2. Specialize the selected file.

node/11/: # xtspec /etc/motd

This procedure creates a node-specific copy of /etc/motd. That is, the directory entry in the /etc file associated with node 11 is updated to point to the new version of /etc/motd and the activity is logged.


Procedure 23. Specializing a file by node without entering xtopview

• Specify the root path and view mode.

boot:~ # xtspec -r /rr/current -V -n 11 /etc/motd

As a result of this procedure, the directory entry in the /etc file associated with node 11 is updated to point to the new version of /etc/motd but the activity is not logged. After you have specialized nodes, you can unspecialize them (see xtunspec command, Removing Specialization on page 140) or determine how they are specialized (see xthowspec command Determining which Files are Specialized on page 136). You can also view or change the class type of a particular node (see xtnce command, Changing the Class of a Node on page 139).

You can use specialization commands only from the boot node. You must be root user to use them. For more information, see the shared_root(5) and xtspec(8) man pages.

6.1.4.4 Determining which Files are Specialized

The CLE xthowspec command displays how the files in a specified path are specialized. For example, you might use this command to examine restrictions on login nodes.

The xthowspec command has the form:

xthowspec [options] path

You can display file specialization for all nodes or all classes, for a particular node or class, for the default view, or for a selection of parameters. Inside the xtopview shell, the xthowspec command acts on files in the current view by default.

Output has the form TYPE:ITEM:FILE:SPEC, where the fields are as follows:

TYPE    Node, class or default.
ITEM    The specific node or class type; this field is empty for the default view.
FILE    The file upon which the command is acting.
SPEC    The specialization level of the file in the view; for example, for default view this is default; for class view options are class or default.

Procedure 24. Finding files in /etc that are specialized by a node

1. Enter the xtopview shell for the node.

boot:~ # xtopview -n 4


2. Use the xthowspec command for the node.

node/4/: # xthowspec -t node /etc
node:4:/etc/fstab.h:node
node:4:/etc/hostname:node

Or, outside the xtopview shell:

boot:~ # xthowspec -r /rr/current -t node -n 4 /etc
node:4:/etc/fstab.h:node
node:4:/etc/hostname:node

Example 72. Finding files in /etc that are specialized by class

To find all files specialized by class, type:

class/login:~ # xthowspec -r /rr/current -t class /etc
node:4:/etc/init.d/rc3.d/K01pbs:class
node:4:/etc/init.d/rc3.d/S11pbs:class
node:16:/etc/init.d/rc3.d/K01pbs:class
node:16:/etc/init.d/rc3.d/S11pbs:class
class:login:/etc/HOSTNAME:class
class:login:/etc/sysconfig/network/routes:class
...

Example 73. Finding specialization of a file on a node

To find the specialization of /etc/dhcpd.conf on node 4, type:

boot:~ # xtopview -n 4
node/4/: # xthowspec /etc/dhcpd.conf
node:4:/etc/dhcpd.conf:default

Example 74. Finding nodes on which a file is specialized

To find the nodes on which /etc/fstab is specialized, type:

boot:~ # xthowspec -r /rr/current -N /etc/fstab
node:0:/etc/fstab:default
node:1:/etc/fstab:default
node:8:/etc/fstab:class
node:9:/etc/fstab:node

To examine specialization outside the xtopview shell, you must type the full path name.

Example 75. Finding specialization of a file on a node without invoking the xtopview shell

To find the specialization of /etc/fstab on node 102, type:

boot:~ # xthowspec -r /rr/current -n 102 /etc/fstab
node:102:/etc/fstab:node


Example 76. Finding specialization of files by class without invoking the xtopview shell

To find all files that are specialized by class in /etc for all nodes, type:

boot:~ # xthowspec -r /rr/current -N -t class /etc
node:11:/etc/crontab:class
node:1:/etc/crontab:class

For more information, see the xthowspec(8) man page.

6.1.4.5 Checking Shared-root Configuration

You can check the configuration of the shared-root file system with the xtverifyshroot command:

xtverifyshroot [options] path

If there are node-specialized or class-specialized files, the command verifies that they are linked correctly. If a problem is detected with a file, it is reported but not corrected.

Note: You must be in the xtopview shell to use the xtverifyshroot command.
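For example, a minimal sketch of checking the login class view (the class and the /etc path argument are illustrative; the command follows the form shown above):

boot:~ # xtopview -c login
class/login:/ # xtverifyshroot /etc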

For more information, see the xtverifyshroot(8) man page.

6.1.4.6 Verifying the Coherency of /etc/init.d Files Across All Shared Root Views

The xtopview command can be configured to invoke the xtverifyconfig utility automatically to resolve potential inconsistencies in the mechanism used to configure various CLE software services on or off.

Note: This is the preferred usage; the xtverifyconfig utility is not intended for direct use.

When you use the chkconfig utility to configure services on or off, a collection of encoded symbolic links are generated to determine which system services are started or shut down and in what order. The chkconfig utility does not account for the multiple levels of specialization within the shared root when xtopview is used. As a result, chkconfig occasionally produces a startup or shutdown order that violates dependencies between services when all levels of specialization are taken into account. To resolve this problem, you can configure xtopview to invoke the xtverifyconfig verification utility upon exit. The xtverifyconfig utility will detect inconsistencies and may rename startup and shutdown links to maintain the proper dependency ordering. The /.shared/log log file in the shared root contains a log of modifications xtverifyconfig makes to the shared root.


The xtopview command will run xtverifyconfig upon exit if the XTOPVIEW_VERIFY_INITD environment variable is non-zero when xtopview is invoked, or if the XTOPVIEW_VERIFY_INITD variable is set to non-zero in the /etc/sysconfig/xt file on the boot node. By default, this parameter is not included in the configuration file and this feature is not enabled.
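For example, a minimal sketch of enabling the verification for a single session from the boot node (the variable only needs to be non-zero when xtopview is invoked):

boot:~ # export XTOPVIEW_VERIFY_INITD=1
boot:~ # xtopview -c login

To enable it persistently, set XTOPVIEW_VERIFY_INITD=1 in the /etc/sysconfig/xt file on the boot node.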

For more information, see the xtverifyconfig(8) man page.

6.1.4.7 Cloning a Shared-root Hierarchy

You can create a directory structure for a new node or class name in the shared-root hierarchy based on an existing node or class with the xtcloneshared command. For more information, see the xtcloneshared(8) man page.

6.1.4.8 Changing the Class of a Node

If you remove nodes, you may need to change the class of the remaining nodes. If you add a login node, you must add it to class login. The xtnce command displays the current class of a node or modifies its class. The command has the form:

xtnce [options] nodename

Example 77. Finding the class of a node

To identify the class of node 750, type:

boot:~ # xtnce 750
750:snx-lnet

Example 78. Adding a node to a class

Enter xtopview and use the xtnce command for the node and specify the class it should be:

boot:~ # xtopview -x /etc/opt/cray/sdb/node_classes
default/:/ # xtnce -c login 104

You also need to change /etc/opt/cray/sdb/node_classes on the boot node so the data is preserved across a boot; this is because the node_classes file is used to initialize the SDB data on the next boot, and the boot node file cannot be updated from within xtopview.

Note: If you make changes to /etc/opt/cray/sdb/node_classes, you must make the same changes to the node class settings in CLEinstall.conf before performing an update or upgrade installation; otherwise, the install utility will complain about the inconsistency.

For more information, see the xtnce(8) man page.

Note: The xtnodeclasses2db command inserts the node-class list into the database, but it does not make any changes to the shared root.


6.1.4.9 Removing Specialization

If you specialized a node or class of nodes and, for example, you want to remove unique start-up scripts from them, you can remove this specialization with the xtunspec command:

xtunspec [options] path

You can unspecialize files for all nodes and classes (default), for a specified class of nodes or for a particular node. Cray strongly recommends that you unspecialize files from within the xtopview shell; if you do not unspecialize your files from within the xtopview shell (see Managing System Configuration with the xtopview Tool on page 131), you must also specify the path for the shared root.

Note: You can only use xtunspec on the boot node.

Example 79. Removing node specialization

To remove all versions of /etc/fstab specialized by node, type:

boot:~ # xtopview
default/:/ # xtunspec -N /etc/fstab

Each node is updated so that it uses a version of /etc/fstab based on its class, or if that is not available, based on the default version of /etc/fstab.

Example 80. Removing class specialization

To remove all versions of /etc/fstab that are specialized by, for example, class I/O (io), type:

boot:~ # xtopview
default/:/ # xtunspec -c io /etc/fstab

I/O nodes that link to the class-specialized version of the file are changed to link to the default version of /etc/fstab. However, I/O nodes that already link to node-specialized versions of /etc/fstab are unchanged. To remove a file specialized by node, you must use the xtunspec command on the node (see Example 79).

For more information, see the xtunspec(8) man page.

6.1.4.10 Displaying RCS Log Information for Shared Root Files

The xtoprlog command displays Revision Control System (RCS) log information for shared root files. Specify the file name using the required filename command-line argument. The xtoprlog command can be executed from within an xtopview shell or from the boot node as root. If xtoprlog is executed from within xtopview, it will operate on the current view; if the command is executed outside of xtopview on the boot node, then you must specify a view to use with the -d, -c, or -n options and also the xtopview root location with the --root option.


The scope of this tool is limited to identification and manipulation of /etc configuration data within the shared root. Configuration files on the boot root file system or on the SMW are not managed by this utility. For more information, see the xtoprlog(8) man page.

Example 81. Printing the latest version of a file

Use the xtoprlog --version option to print the latest version (revision) number of a specified file:

default/:/ # xtoprlog --version /etc/fstab
1.1

Example 82. Printing the RCS log for /etc/fstab in the node 3 view

Use the xtoprlog -n option to specify the /etc/fstab node view RCS log to print:

default/:/ # xtoprlog -n 3 /etc/fstab
RCS file: /.shared/base/node/3/etc/RCS/fstab,v
Working file: /.shared/base/node/3/etc/fstab
head: 1.6
...

Example 83. Displaying differences between two versions of the /etc/fstab file

Use the xtoprlog -x option with the xtoprlog -r option to display the differences between the current version of /etc/fstab and version 1.3:

default/:/ # xtoprlog -x -r 1.3 /etc/fstab
=======================================================================
RCS file: /.shared/base/default/etc/RCS/fstab,v
retrieving revision 1.3
diff -r1.3 /.shared/base/default/etc/fstab
1,3c1,4
< # Default view fstab file 1.3
---
> # Default view fstab file 1.7

6.1.4.11 Checking Out an RCS Version of Shared Root Files

Use the xtopco command to check out a version of shared root files. The xtopco command should be run on the boot node using the xtopview utility in the default view. The scope of this tool is limited to identification and manipulation of /etc configuration data within the shared root. Configuration files on the boot root file system or on the SMW are not managed by this utility.

Example 84. Checking out a version 1.2 copy of /etc/fstab

Use the xtopco -r option to specify the version of the file to check out:

boot:~ # xtopview
default/:/ # xtopco -r 1.2 /etc/fstab


Example 85. Recreating the file link for /etc/fstab to the current view's /etc/fstab file

To recreate the file link only, use the xtopco --link option:

boot:~ # xtopview
default/:/ # xtopco --link /etc/fstab

For more information, see the xtopco(8) man page.

6.1.4.12 Listing Shared Root File Specification and Version Information

Using RCS information, combined with the xtopview specialization information, xtoprdump prints a list of file specifications that can be used as the list of files to operate on an archive of shared root file system files. The xtoprdump command should be invoked using the xtopview utility unless the --root option is specified. The scope of this tool is limited to identification and manipulation of /etc configuration data within the shared root. Configuration files on the boot root file system or on the SMW are not managed by this utility.

Example 86. Printing specifications for login class specialized files

Use the xtoprdump -c option to specify the class view; set to login to print the login class specifications:

boot:~ # xtopview
default/:/ # xtoprdump -c login
class:login:/etc/HOSTNAME:1.2:*
class:login:/etc/fstab:1.2:*
class:login:/etc/fstab.old:1.1:*
class:login:/etc/modprobe.conf.local:1.2:*
class:login:/etc/opt/cray/modules/Base-opts.default.local:1.2:*
class:login:/etc/sysconfig/network/config:1.2:*
class:login:/etc/sysconfig/network/routes:1.1:*
class:login:/etc/yp.conf:1.1:*


Example 87. Printing specifications for all node specialized files

Use the xtoprdump -n option to specify the node view; set to all for all nodes:

boot:~ # xtopview
default/:/ # xtoprdump -n all
node:22:/etc/sysconfig/network/ifcfg-ib0:1.1:*
node:23:/etc/sysconfig/network/ifcfg-ib0:1.1:*
node:30:/etc/new_file:1.1:*
node:30:/etc/opt/cray/rsipd/rsipd.conf:1.1:*
node:30:/etc/sysconfig/network/ifcfg-eth0:1.1:*
node:30:/etc/sysctl.conf:1.2:*
node:30:/etc/sysctl.conf.14524:1.1:*
node:30:/etc/udev/rules.d/77-network.rules:1.1:*
node:5:/etc/exports:1.2:*
node:5:/etc/fstab:1.5:*
node:5:/etc/fstab.old:1.4:*
node:5:/etc/init.d/boot.local:1.1:*
node:5:/etc/motd:1.2:*
node:5:/etc/sysconfig/syslog:1.1:*
node:5:/etc/syslog-ng/syslog-ng.conf:1.2:*
node:8:/etc/sysconfig/network/ifcfg-ib0:1.1:*
node:9:/etc/sysconfig/network/ifcfg-ib0:1.1:*

Example 88. Printing specifications for files modified in the default view, including any warning messages

The following xtoprdump command prints specifications for modified files (-m option) in the default view (-d option), including warning messages (-w option):

boot:~ # xtopview
default/:/ # xtoprdump -m -d -w
default::/etc/alps.gpus:1.2:*
default::/etc/bash.bashrc.local:1.5:*
default::/etc/bash.bashrc.local.rpmsave:1.2:*
...

For more information, see the xtoprdump(8) man page.

6.1.4.13 Performing Archive Operations on Shared Root Files

Use the xtoparchive command to perform operations on an archive of shared root configuration files. Run the xtoparchive command on the boot node using the xtopview utility in the default view. The archive is a text-based file similar to a tar file and is specified using the required archivefile command-line argument. The xtoparchive command is intended for configuration files only. Binary files will not be archived. If a binary file is contained within a specification file list, it will be skipped and a warning will be issued.

The scope of this tool is limited to identification and manipulation of /etc configuration data within the shared root. Configuration files on the boot root file system or on the SMW are not managed by this utility.


Example 89. Adding files specified by specifications listed in specfile to an archive file

Use the following xtoparchive command to add files specified by the specifications listed in specfile to the archive file archive.20110422; create the archive file if it does not already exist:

boot:~ # xtopview
default/:/ # xtoparchive -a -f specfile archive.20110422

Example 90. Listing specifications for files currently in the archive.20110422 archive file

Use the xtoparchive -l command to list specifications for files currently in the archive file archive.20110422:

boot:~ # xtopview
default/:/ # xtoparchive -l archive.20110422

For more information, see the xtoparchive(8) man page.

6.1.5 Logging Shared-root Activity

All specialization activity is logged in the log file /.shared/log, which tracks additions, deletions, and modifications of files. To view the details of your changes, you must access the RCS logs that were created during the xtopview session.

Note: If you have exited xtopview with Ctrl-c, you do not log the operations you performed within the shell; the changes to the system are present nonetheless. This means that if you want to back out of changes, it is not sufficient to exit xtopview. You must submit the commands to undo what you have done.

6.2 PBS Professional Licensing Requirements for Cray Systems

The licensing scheme for PBS Professional uses a central license server to allow licenses to float between servers. The PBS server and scheduler are run on the service database (SDB) node; therefore, network connectivity must exist between the license server and the SDB node. For information about network configuration options for PBS, see Appendix F, PBS Professional Licensing for Cray Systems on page 365.

6.3 Disabling Secure Shell (SSH) on Compute Nodes

By default, the SSH daemon, sshd, is enabled on compute nodes. To disable sshd, follow this procedure.


Procedure 25. Disabling SSH daemon (sshd) on CNL compute nodes

1. Edit the CLEinstall.conf file and set ssh_generate_root_sshkeys to no (by default, this is set to yes).

smw:~ # vi CLEinstall.conf
ssh_generate_root_sshkeys=no

2. Invoke the CLEinstall program on the SMW; you must specify the xtrelease that is currently installed on the system set that you are using and located in the CLEmedia directory.

smw:~ # /home/crayadm/install.xtrelease/CLEinstall --upgrade \
--label=system_set_label --XTrelease=xtrelease \
--configfile=/home/crayadm/install.xtrelease/CLEinstall.conf \
--CLEmedia=/home/crayadm/install.xtrelease

3. Type y and press the Enter key to proceed when prompted to update the boot root and shared root.

*** Do you wish to continue? (y/n) --> y

Upon completion, CLEinstall lists suggested commands to finish the installation. Those commands are also described in the remaining steps of this procedure. For more information about running the CLEinstall program, see Installing and Configuring Cray Linux Environment (CLE) Software.

4. Rebuild the boot image using the /var/opt/cray/install/shell_bootimage_LABEL.sh script and the xtbootimg and xtcli commands. Suggested commands are included in output from CLEinstall and shell_bootimage_LABEL.sh. For more information about creating boot images, follow Procedure 3 on page 66.

5. Run the shell_post_install.sh script on the SMW to unmount the boot root and shared root file systems and perform other cleanup as necessary.

smw:~# /var/opt/cray/install/shell_post_install.sh /bootroot0 /sharedroot0

6.4 Modifying SSH Keys for Compute Nodes

The dropbear RPM is provided with the CLE release. Using dropbear SSH software, you can supply and generate site-specific SSH keys for compute nodes in place of the keys provided by Cray.

Procedure 26. Using dropbear to generate site-specific SSH keys

Follow these steps to replace the RSA and DSA/DSS keys provided by the CLEinstall program.

1. Load the dropbear module.

crayadm@smw:~> module load dropbear


2. Create a directory for the new keys on the SMW.

crayadm@smw:~> mkdir dropbear_ssh_keys
crayadm@smw:~> cd dropbear_ssh_keys

3. To generate a dropbear compatible RSA key, type:

crayadm@smw:~/dropbear_ssh_keys> dropbearkey -t rsa -f ssh_host_rsa_key.db
Will output 1024 bit rsa secret key to 'ssh_host_rsa_key.db'
Generating key, this may take a while...
Public key portion is:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgwCQ9ohUgsrrBw5GNk7w2H5RcaBGajmUv8XN6fxg/YqrsL4t5
CIkNghI3DQDxoiuC/ZVIJCtdwZLQJe708eiZee/tg5y2g8JIb3stg+ol/9BLPDLMeX24FBhCweUpfGCO6Jfm4
Xg4wjKJIGrcmtDJAYoCRj0h9IrdDXXjpS7eI4M9XYZ
Fingerprint: md5 00:9f:8e:65:43:6d:7c:c3:f9:16:48:7d:d0:dd:40:b7
crayadm@smw:~/dropbear_ssh_keys>

To generate a dropbear compatible DSS key, type:

crayadm@smw:~/dropbear_ssh_keys> dropbearkey -t dss -f ssh_host_dss_key.db
Will output 1024 bit dss secret key to 'ssh_host_dss_key.db'
Generating key, this may take a while...
Public key portion is:
ssh-dss AAAAB3NzaC1kc3MAAACBAMEkThlE9N8iczLpfg0wUtuPtPcpIs7Y4KbG3Wg1T4CAEXDnfMCKSyuCy
21TMAvVGCvYd80zPtL04yc1eUtD5RqEKy0h8jSBs0huEvhaJGHx9FzKfGhWi1ZOVX5vG3R+UCOXG+71wZp3LU
yOcv/U+GWhalTWpUDaRU81MPRLW7rnAAAAFQCEqnqW61bouSORQ52d+MRiwp27MwAAAIEAho69yAfGrNzxEI/
kjyDE5IaxjJpIBF262N9UsxleTX6F65OjNoL84fcKqlSL6NV5XJ5O00SKgTuVZjpXO913q9SEhkcI0Zy0vRQ8
H5x3osZZ+Bq20QWof+CtWTqCoWN2xvne0NtET4lg81qCt/KGRq1tY6WG+a01yrvunzQuafQAAACASXvs8h8AA
EK+3TEDj57rBRV4pz5JqWSlUaZStSQ2wJ3Oy1pIJIhKfqGWytv/nSoWnr8YbQbvH9k1BsyQU8sOc5IJyCFu7+
Exom1yrxq/oirfeSgg6xC2rodcs+jH/K8EKoVtTak3/jHQeZWijRok4xDxwHdZ7e3l2HgYbZLmA5Y=
Fingerprint: md5 cd:a0:0b:41:40:79:f9:4a:dd:f9:9b:71:3f:59:54:8b
crayadm@smw:~/dropbear_ssh_keys>

4. As root, copy the SSH keys to the boot image template.

Note: To make these changes for a system partition, rather than for the entire system, replace /opt/xt-images/templates with /opt/xt-images/templates-pN, where N is the partition number.

crayadm@smw:~/dropbear_ssh_keys> su root

For the RSA key:
smw:/home/crayadm/dropbear_ssh_keys # cp -p ssh_host_rsa_key.db \
/opt/xt-images/templates/default/etc/ssh/ssh_host_rsa_key

For the DSA/DSS key:
smw:/home/crayadm/dropbear_ssh_keys # cp -p ssh_host_dss_key.db \
/opt/xt-images/templates/default/etc/ssh/ssh_host_dss_key

5. Update the boot image to include these changes; follow the steps in Procedure 2 on page 62.


6.5 Configuring the System Environmental Data Collector (SEDC)

To configure the System Environmental Data Collector (SEDC), which collects data about internal cabinet temperatures, cooling system air pressures, critical voltages, etc., see Using and Configuring System Environment Data Collections (SEDC).

6.6 Configuring Optional RPMs in the CNL Boot Image

You can configure which optional RPMs are installed into the CNL boot image for your system in one of two ways. First, several parameters are available in the CLEinstall.conf file to control whether specific RPMs are included during installation or upgrade of your system software. When you edit CLEinstall.conf prior to running CLEinstall, set the CNL_ parameters to either yes or no to indicate which optional RPMs should be included in your CNL compute node boot images. For example, to include these optional RPMs, change the following lines.

CNL_audit=yes
CNL_dvs=yes
CNL_ntpclient=yes
CNL_rsip=yes

The second method is to add or remove specific RPMs by editing the /var/opt/cray/install/shell_bootimage_LABEL.sh script used when preparing boot images for CNL compute nodes. Change the settings for these parameters to y or n to indicate which optional RPMs should be included. For example, to include the optional Cray Audit, DVS, and RSIP RPMs, change the following lines.

Note: If you make changes to /var/opt/cray/install/shell_bootimage_LABEL.sh directly, it is important that you make similar changes to the CLEinstall.conf file in order to avoid unexpected configuration changes during update or upgrade installations.

CNL_AUDIT=y
CNL_DVS=y
CNL_RSIP=y

6.7 Configuring Memory Control Groups

CLE allows an administrator to force compute node applications to execute within memory control groups. Memory control groups are a Linux kernel feature that can improve the resiliency of the kernel and system services running on compute nodes while also accounting for application memory usage.


Before ALPS launches an application on a compute node, it determines how much memory is available. It then creates a memory control group for the application with a memory limit that is slightly less than the amount of available memory on the compute node. CLE tracks the application's memory usage, and if any allocations meet or try to exceed this limit, the allocation fails and the application aborts.

Since non-application processes execute outside of the memory control group and are not bound to this limit, system services should continue to execute normally during these low memory scenarios, resulting in improved resiliency for the kernel and system services. For details on how the memory control group limit is calculated, see the description of the -M option in the apinit(8) man page.

Procedure 27. Adjusting the memory control group limit

You adjust the memory control group limit using one of two methods:

1. Edit the rcad_svcs.conf file in the compute node image in /opt/xt-images.

a. Change the apinit -M value in the compute node image /etc/opt/cray/rca/rcad_svcs.conf file. The following illustrates the apinit line within rcad_svcs.conf. The total amount of memory taken by the memory control group is the -M value multiplied by the number of cores on the reserved compute node.

apinit 0 3 1 0x7000016 0 /usr/sbin/apinit -n -r -M 400k

b. Rebuild the compute node and service node boot images as detailed in Preparing a Service Node and Compute Node Boot Image on page 61 to ensure the new value is used whenever the new boot image is used.

c. Reboot the compute nodes.

d. To ensure the change is not lost during an upgrade of CLE, copy the modified compute node /etc/opt/cray/rca/rcad_svcs.conf file into the default template directory in /opt/xt-images/templates/.

2. Alternatively, set the memory control group limit using the mcgroup option with the apmgr command while the compute node is booted. However, when the compute node is rebooted it will revert to the settings in the compute node image /etc/opt/cray/rca/rcad_svcs.conf. See the apmgr(8) man page for more details.

Procedure 28. Disabling memory control groups

1. Open the /etc/opt/cray/rca/rcad_svcs.conf file in the compute node image and remove or comment-out the apinit -M option and corresponding value.


2. Within the compute node image edit /boot/parameters-cnl and set the cgroup_disable parameter to memory:

cgroup_disable=memory

3. Rebuild the compute node boot image as detailed in Preparing a Service Node and Compute Node Boot Image on page 61.

4. Reboot the compute nodes using the newly created boot image. There is a slightly higher likelihood that some applications will cause the compute nodes to experience OOM (out of memory) conditions if they happen to run low on memory while memory control groups are disabled. However, most programs will not see this condition, as it is highly dependent on application and site configurations.

6.8 Configuring the Zone Moveable Feature for Compute Nodes

Zone moveable is a Linux kernel feature used to reduce external memory fragmentation. It is not a defragmentation mechanism but can possibly help prevent fragmentation to some degree. The strategy in the zone moveable feature is to separate user from kernel memory. Although this feature may improve performance of applications sensitive to memory fragmentation, it does increase the size of the compute node OS footprint. Cray therefore leaves zone moveable disabled by default.

Procedure 29. Enabling Zone Moveable

1. Add the following kernelcore_pct kernel parameter to the /opt/xt-images/workarea/compute/boot/parameters-cnl file. Cray recommends a compute node kernel memory percentage of 5%.

kernelcore_pct=5

2. Boot the compute nodes with the rebuilt compute node boot image as detailed in Preparing a Service Node and Compute Node Boot Image on page 61.

3. (Optional) You can test if a node has the kernel feature enabled by using aprun with the following syntax:

% aprun -L nid grep 'zone *Movable' /proc/pagetypeinfo

To disable the zone movable kernel parameter, remove the kernelcore_pct kernel parameter from the parameters-cnl mentioned in Procedure 29 on page 149, rebuild the compute node image and reboot the compute nodes.

6.9 Configuring Cray Enhanced Linux Security Features

This section describes Cray extensions to Linux security auditing utilities and the cray_pam PAM module for logging failed login attempts.


6.9.1 Security Auditing and Cray Audit Extensions

Cray Audit is a set of Cray-specific extensions to standard Linux security auditing. When Cray Audit is configured, separate logs are generated for each audited node on the Cray system. Cray-specific utilities simplify administration of auditing options and log files across a large number of nodes. For more information about standard Linux security auditing, see the following website: http://www.novell.com/linux.

Cray Audit includes the following components:

• Cluster option to enable Cray Audit. The /etc/auditd.conf file includes a Cray-specific option called cluster which, when configured on, will enable Cray Audit extensions. The standard Linux auditd daemon has been enhanced to implement this configuration option. The clustered configuration provides a mechanism to collect audit data on many nodes and store the data in a central location. The configuration script creates a separate directory for each node and names and manages the auditing log file in the same way as on a single-node system. This includes tracking log size, responding to size-related events, and rotating log files.

Caution: If you run Linux security auditing on a Cray system without Cray Audit extensions, auditing data from the various nodes collide and generate a corrupt audit log. Because of this, Cray Audit extensions are enabled by default when Linux auditing is configured on.

Note: The cluster option should not be used when auditing a boot node.

• xtauditctl command. The xtauditctl command distributes auditctl administrative commands to compute nodes on the system. This command traverses a list of all running compute nodes and invokes commands that deliver a signal to the audit daemons on each node. This utility allows an administrator to apply configuration changes without having to restart every node in the system. For more information see the xtauditctl(8) and auditctl(8) man pages.

• xtaumerge command. The xtaumerge command merges clustered audit logs into a single log file. When you use this tool to generate a single audit file, you can also use Linux audit tools to report on and analyze system-wide audit data. An additional benefit is that xtaumerge maintains compatibility with the Linux audit tools; you can move audit data to another Linux platform for analysis. For more information see the xtaumerge(8) man page.

Note: When you run xtaumerge, the resulting merged data stream loses one potentially useful piece of information: the node name of the node on which the event originated. In order to maintain compatibility with standard Linux utilities, the merged audit log does not include this information. Use Linux audit utilities directly on the per-node log files to find a specific record if you require that level of information.


• ALPS interface to security auditing. For compute nodes, the Application Level Placement Scheduler (ALPS) supports security auditing functionality. ALPS instantiates an application on behalf of the user on specific compute nodes. After instantiating the application, the ALPS interface calls the auditing system to begin auditing the application. At job start and end, auditing system utilities write the audit record to the audit log.

By default, Cray Audit extensions are enabled but will have no impact until Linux security auditing is configured on. Linux security auditing is configured off by default. Follow Procedure 30 on page 151 to configure Cray Audit and Linux security auditing to audit boot, login and compute nodes. This procedure will direct you to edit the /etc/auditd.conf and /etc/audit.rules files and define your audit configuration based on site-specific requirements. The file /usr/share/doc/packages/audit/sample.rules describes a sample rule set. Once you have established these configuration files, you can make temporary changes to your audit configuration using xtauditctl and standard Linux auditctl command options. For more information see the xtauditctl(8) and auditctl(8) man pages.
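For example, assuming xtauditctl accepts the same options as auditctl and passes them through to the compute nodes (its stated purpose), a temporary, non-persistent change such as disabling auditing across compute nodes might look like the following sketch; consult the xtauditctl(8) man page for the exact syntax supported at your site:

boot:~ # xtauditctl -e 0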

Cray recommends that you configure auditing to use a Lustre file system to hold the audit log files. Follow Procedure 30 on page 151 to specify the Lustre file system by setting log_file = lustre_pathname. For more information on specific Lustre file system requirements to run Cray Audit, see Lustre File System Requirements for Cray Audit on page 154.

Procedure 30. Configuring Cray Audit

By default, Linux security auditing is disabled and Cray Audit extensions are enabled. Follow these steps to define your site-specific auditing rules and enable standard Linux auditing.

Note: To make these changes for a system partition, rather than for the entire system, replace /opt/xt-images/templates with /opt/xt-images/templates-pN, where N is the partition number. Also, replace /opt/xt-images/hostname-xtrelease-LABEL with /opt/xt-images/hostname-xtrelease-LABEL[-pN].

1. Follow these steps to edit the auditing configuration files in the compute node image and enable auditing on CNL compute nodes.

a. Copy the auditd.conf and audit.rules configuration files to the template directory so that modifications are retained when new boot images are created in the future.

smw:~# cp /opt/xt-images/hostname-xtrelease-LABEL/compute/etc/auditd.conf \
/opt/xt-images/templates/default/etc/auditd.conf
smw:~# cp /opt/xt-images/hostname-xtrelease-LABEL/compute/etc/audit.rules \
/opt/xt-images/templates/default/etc/audit.rules

b. Edit /opt/xt-images/templates/default/etc/auditd.conf on the SMW and set the log_file parameter. For example, if the mount point for your Lustre file system is mylusmnt and you want to place audit logs in a directory called auditdir, type the following commands.

smw:~# vi /opt/xt-images/templates/default/etc/auditd.conf
log_file = /mylusmnt/auditdir/audit.log

Warning: If you run auditing on compute nodes without configuring the audit directory, audit records that are written to the local ram-disk could cause the ram-disk to fill.

c. Edit the /opt/xt-images/templates/default/etc/audit.rules file on the SMW. Change this file to set site-specific auditing rules for the compute nodes. At a minimum, you should set the -e option to 1 (one) to enable auditing.

smw:~# vi /opt/xt-images/templates/default/etc/audit.rules

Make your changes after the following line; for example:

# Feel free to add below this line. See auditctl man page
-e 1

d. Create the following symbolic link.

smw:~# mkdir -p -m 755 /opt/xt-images/templates/default/etc/init.d/rc3.d
smw:~# cd /opt/xt-images/templates/default/etc/init.d/rc3.d
smw:/opt/xt-images/templates/default/etc/init.d/rc3.d # ln -s ../auditd S12auditd

e. If you set CNL_audit=yes in CLEinstall.conf before you ran the CLEinstall program, update the boot image by following the steps in Procedure 2 on page 62.

Otherwise, you must first edit the /var/opt/cray/install/shell_bootimage_LABEL.sh script and set CNL_AUDIT=y and then update the boot image following the steps in Procedure 2 on page 62.

2. Follow these steps to enable and configure auditing on login nodes.

a. Log on to the boot node and use the xtopview command to access all login nodes by class.

smw:~# ssh root@boot
boot:~ # xtopview -c login -m "configuring audit files"

b. Specialize these files to the login class.

class/login:/ # xtspec -c login /etc/auditd.conf
class/login:/ # xtspec -c login /etc/audit.rules

c. Edit /etc/auditd.conf and set the log_file parameter. For example, if your Lustre file system is called filesystem and you want to place audit logs in a directory called auditdir, type the following commands.

class/login:/ # vi /etc/auditd.conf
log_file = /filesystem/auditdir/audit.log

d. Edit the /etc/audit.rules file to set site-specific auditing rules for the login nodes. At a minimum, you should set the -e option to 1 (one).

class/login:/ # vi /etc/audit.rules

Make your changes after the following line; for example:

# Feel free to add below this line. See auditctl man page
-e 1

e. Exit xtopview.

class/login:/ # exit

3. You must configure auditing on the boot node to use standard Linux auditing. Follow these steps to turn off Cray audit extensions for the boot node. Configure the boot node to use the default log_file parameter in the auditd.conf file and set the cluster entry to no.

a. While logged on to the boot node, edit the /etc/auditd.conf file.

boot:~ # vi /etc/auditd.conf
log_file = /var/log/audit/audit.log
cluster = no

b. Edit the /etc/audit.rules file to set site-specific auditing rules for the boot node. At a minimum, you should set the -e option to 1 (one).

boot:~ # vi /etc/audit.rules

Make your changes after the following line; for example:

# Feel free to add below this line. See auditctl man page
-e 1

c. Configure the audit daemon to start on the boot node.

boot:~ # chkconfig --force auditd on

4. Create the log file directory. Log into a node that has the Lustre file system mounted and type the following commands:

login:~# mkdir -p /filesystem/auditdir
login:~# chmod 700 /filesystem/auditdir

5. Edit the boot automation file to configure your system to start the Cray audit daemon on login nodes by invoking /etc/init.d/auditd start on each login node.
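The exact form of this entry depends on the helper procedures your boot automation file already uses; the line below is only a sketch that assumes the crms_exec_via_bootnode helper and a login node named login1 (both are assumptions), so mirror the entries already present in your site's file:

# Sketch only: helper procedure and node name are assumptions
lappend actions { crms_exec_via_bootnode "login1" "root" "/etc/init.d/auditd start" }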

6.9.1.1 Lustre File System Requirements for Cray Audit

The audit system stores audit data in a directory tree structure that uses a naming scheme based on the directory name provided by the log_file parameter. For example, if you set log_file to /lus/audit/audit.log, the auditing system stores audit data in files named /lus/audit/node_specific_path/audit.log, where node_specific_path is a directory structure generated by Cray Audit.

Warning: If you run auditing on compute nodes without configuring the audit directory, audit records are written to the local ram-disk which may consume all your resources and cause data loss.

With the exception of the boot node, each audited node in the system must have access to the Lustre file system that contains the audit directory. Because each node has its own audit log file, sufficient space must be made available to store audit data. You configure the log size in the /etc/auditd.conf file. The file system should be large enough to hold at least twice the maximum configured log size, multiplied by the number of log files retained and the number of audited nodes, plus enough space to avoid triggering out of space recovery actions. The following formula can be used to estimate a reasonable file system size:

(2 * num_logs * max_log_file * nnodes) + space_left

Where:

num_logs is the number of log files kept in rotation.
max_log_file is the maximum size of a log file in megabytes.
nnodes is the number of audited nodes.
space_left is the amount of space in megabytes required to avoid out of space recovery actions.

The num_logs, max_log_file, and space_left parameters are set in the /etc/auditd.conf file. The default /etc/auditd.conf file is shown in Example 91.

Note: This formula assumes that you use the default destination for the output of xtaumerge, placing the merged log file and the per-node log files on the same file system. This roughly doubles the size of the disk space needed to hold the audit trail.
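As a worked illustration using the default values shown in Example 91 (num_logs = 4, max_log_file = 5 MB, space_left = 75 MB) and a hypothetical system with 100 audited nodes:

(2 * 4 * 5 * 100) + 75 = 4075 MB

That is, roughly 4 GB of file system space before allowing for any site-specific margin.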

Example 91. Default /etc/auditd.conf file

#
# This file controls the configuration of the audit daemon
#

log_file = /var/log/audit/audit.log
cluster = yes
log_format = RAW
priority_boost = 3
flush = INCREMENTAL
freq = 20
num_logs = 4
#dispatcher = /usr/sbin/audispd
disp_qos = lossy
max_log_file = 5
max_log_file_action = ROTATE
space_left = 75
space_left_action = SYSLOG
action_mail_acct = root
admin_space_left = 50
admin_space_left_action = SUSPEND
disk_full_action = SUSPEND
disk_error_action = SUSPEND

6.9.1.2 System Performance Considerations for Cray Audit

With auditing turned off there is no performance impact from this feature. With auditing turned on, system performance is impacted. The performance costs for running Linux audit and the associated Cray extensions vary greatly, depending on the site-defined audit event selection criteria. Auditing of judiciously chosen events, for example login or su attempts, does not impact overall system performance. However, auditing of frequently used system calls has a negative impact on system performance because each occurrence of an audited system call triggers a file system write operation to the audit log. It is the responsibility of the administrator or auditor to design the site security policy and configure auditing to minimize this impact.

6.9.2 Using the cray_pam PAM to Log Failed Login Attempts

The cray_pam module is a Pluggable Authentication Module (PAM) that, when configured, provides information to the user at login time about any failed login attempts since their last successful login. The module provides:

• Date and time of last successful login
• Date and time of last unsuccessful login
• Total number of unsuccessful logins since the user's last successful login

Cray recommends that you configure login failure logging on all service nodes. The RPMs are installed by default on the boot root and shared root file systems.

To use this feature, you must configure the pam_tally and cray_pam PAM modules. The PAM configuration files provided with the CLE software allow you to manipulate a common set of configuration files that will be active for all services. The cray_pam module requires an entry in the PAM common-auth and common-session files or an entry in the PAM auth section and an entry in the PAM session section of any PAM application configuration file. Use of the common files is typically preferable so that other applications such as su also report failed login information; for example:

crayadm@boot:~> su -
2 failed login attempts since last login.
Last failure Thu May 8 11:41:20 2008 from smw.
boot:~ #

For each login attempt, a per-user counter is updated. When a successful login occurs, the statistics are displayed and the counter is cleared. The default location of the pam_tally counter file is /var/log/faillog. Additionally, cray_pam uses a temporary directory, by default, /var/opt/cray/faillog, to store information about the users. Change these defaults by editing /etc/opt/cray/pam/faillog.conf and by using the file= option for each pam_tally and cray_pam entry. You can find an example faillog.conf file in /opt/cray/pam/pamrelease-version/etc.

You can configure a number of nodes to share information by modifying the default location for these directories to use a common set of directories, writable to all nodes. Edit /etc/opt/cray/pam/faillog.conf to reflect an alternate, root-writable directory. Configure pam_tally to save tally information in an alternate location using the file= option; each entry for cray_pam must also include the file= option to specify the alternate location.

Limitations:

• If a login attempt fails, cray_pam in the auth section creates a temporary file; but because the login attempt failed, the session section is not called and, as a result, the temporary file is not removed. This is harmless because the file will be overwritten at the next login attempt and removed at the next successful login.

• Logins that occur outside of the PAM infrastructure will not be noted.

• Host names are truncated after 12 characters. This is a limitation in the underlying faillog recording.

• The cray_pam module requires pam_tally to be configured.

Note: For additional information on using the cray_pam PAM module, see the pam(8) and pam_tally(8) man pages.

Procedure 31. Configuring cray_pam to log failed login attempts

1. Edit the /etc/pam.d/common-auth, /etc/pam.d/common-account, and /etc/pam.d/common-session files on the boot node.

Note: In these examples, the pam_faillog.so and pam_tally.so entries can include an optional file=/path/to/pam_tally/counter/file argument to specify an alternate location for the tally file.

Example 92 shows these files after they have been modified to report failed login using an alternate location for the tally file.

a. Edit the /etc/pam.d/common-auth file and add the following lines as the first and last entries:

boot:~ # vi /etc/pam.d/common-auth
auth required pam_faillog.so [file=alternatepath]   (as the FIRST entry)
auth required pam_tally.so [file=alternatepath]   (as the LAST entry)

Your modified /etc/pam.d/common-auth file should look like this:

#%PAM-1.0
#
# This file is autogenerated by pam-config. All changes
# will be overwritten.
#
# Authentication-related modules common to all services
#
# This file is included from other service-specific PAM config files,
# and should contain a list of the authentication modules that define
# the central authentication scheme for use on the system
# (e.g., /etc/shadow, LDAP, Kerberos, etc.). The default is to use the
# traditional Unix authentication mechanisms.
#
auth required pam_faillog.so
auth required pam_env.so
auth required pam_unix2.so
auth required pam_tally.so

b. Edit the /etc/pam.d/common-account file and add the following line as the last entry:

boot:~ # vi /etc/pam.d/common-account
account required pam_tally.so [file=alternatepath]

Your modified /etc/pam.d/common-account file should look like this:

#%PAM-1.0
#
# This file is autogenerated by pam-config. All changes
# will be overwritten.
#
# Account-related modules common to all services
#
# This file is included from other service-specific PAM config files,
# and should contain a list of the authorization modules that define
# the central access policy for use on the system. The default is to
# only deny service to users whose accounts are expired.
#
account required pam_unix2.so
account required pam_tally.so

c. Edit the /etc/pam.d/common-session file and add the following line as the last entry:

boot:~ # vi /etc/pam.d/common-session
session optional pam_faillog.so [file=alternatepath]

Your modified /etc/pam.d/common-session file should look like this:

#%PAM-1.0
#
# This file is autogenerated by pam-config. All changes
# will be overwritten.
#
# Session-related modules common to all services
#
# This file is included from other service-specific PAM config files,
# and should contain a list of modules that define tasks to be performed
# at the start and end of sessions of *any* kind (both interactive and
# non-interactive). The default is pam_unix2.
#
session required pam_limits.so
session required pam_unix2.so
session optional pam_umask.so
session optional pam_faillog.so

2. Copy the edited files to the shared root by using xtopview in the default view.

boot:~ # cp -p /etc/pam.d/common-auth /rr/current/software
boot:~ # cp -p /etc/pam.d/common-account /rr/current/software
boot:~ # cp -p /etc/pam.d/common-session /rr/current/software
boot:~ # xtopview -m "configure login failure logging PAM"
default/:/ # cp -p /software/common-auth /etc/pam.d/common-auth
default/:/ # cp -p /software/common-account /etc/pam.d/common-account
default/:/ # cp -p /software/common-session /etc/pam.d/common-session

3. Exit xtopview.

default/:/ # exit
boot:~ #

Example 92. Modified PAM configuration files configured to report failed login by using an alternate path

If you configure pam_tally to save tally information in an alternate location by using the file= option, each entry for cray_pam must also include the file= option to specify the alternate location.

Your modified /etc/pam.d/common-auth file should look like this:

#
# /etc/pam.d/common-auth - authentication settings common to all services
#
# This file is included from other service-specific PAM config files,
# and should contain a list of the authentication modules that define
# the central authentication scheme for use on the system
# (e.g., /etc/shadow, LDAP, Kerberos, etc.). The default is to use the
# traditional Unix authentication mechanisms.
#
auth required pam_faillog.so file=/ufs/logs/tally.log
auth required pam_env.so
auth required pam_unix2.so
auth required pam_tally.so file=/ufs/logs/tally.log

Your modified /etc/pam.d/common-account file should look like this:

#
# /etc/pam.d/common-account - authorization settings common to all services
#
# This file is included from other service-specific PAM config files,
# and should contain a list of the authorization modules that define
# the central access policy for use on the system. The default is to
# only deny service to users whose accounts are expired.
#
account required pam_unix2.so
account required pam_tally.so file=/ufs/logs/tally.log

Your modified /etc/pam.d/common-session file should look like this:

#
# /etc/pam.d/common-session - session-related modules common to all services
#
# This file is included from other service-specific PAM config files,
# and should contain a list of modules that define tasks to be performed
# at the start and end of sessions of *any* kind (both interactive and
# non-interactive). The default is pam_unix2.
#
session required pam_limits.so
session required pam_unix2.so
session optional pam_umask.so
session optional pam_faillog.so file=/ufs/logs/tally.log

6.10 Configuring cron Services

Optional: Configuring cron services is optional on CLE systems.

The cron daemon is disabled, by default, on the shared root file system and the boot root. It is enabled, by default, on the SMW. Use standard Linux procedures to enable cron on the boot root, following Procedure 32 on page 160. On the shared root, how you configure cron for CLE depends on whether you have set up persistent /var. If you have persistent /var follow Procedure 33 on page 160; if you have not set up persistent /var, follow Procedure 34 on page 161.

The /etc/cron.* directories include a large number of cron scripts. During new system installations and any updates or upgrades, the CLEinstall program disables execute permissions on these scripts and you must manually enable any scripts you want to use.

Procedure 32. Configuring cron for the SMW and the boot node

Note: By default, the cron daemon on the SMW is enabled and this procedure is required only on the boot node.

1. Log on to the target node as root and determine the current configuration status for cron. On the SMW:

smw:~# chkconfig cron
cron  on

On the boot node:

boot:~ # chkconfig cron
cron  off

2. Use the chkconfig command to configure the cron daemon to start. For example, to enable cron on the boot node, type the following command:

boot:~ # chkconfig --force cron on

The cron scripts shipped with the Cray customized version of SLES are located under /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly. Because CLEinstall removes the execute permission bit from these scripts, the system administrator must re-enable any desired scripts by restoring their execute permissions (chmod ug+x), as shown in Procedure 33 and Procedure 34. However, if you do not have a persistent /var, Cray recommends that you follow Procedure 34.

Procedure 33. Configuring cron for the shared root with persistent /var

Use this procedure for service nodes that use the shared root on systems that are set up with a persistent /var file system.

1. Invoke the chkconfig command in the default view to enable the cron daemon.

boot:~ # xtopview -m "configuring cron"
default/:/ # chkconfig --force cron on

2. Examine the /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly directories and change the file access permissions to enable or disable distributed cron scripts to meet your needs. To enable a script, invoke chmod ug+x to make the script executable. By default, CLEinstall removes the execute permission bit to disable all distributed cron scripts.
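For example, to re-enable a single distributed script from within the same xtopview session (logrotate is used here purely as an illustrative script name; substitute the scripts your site actually needs):

default/:/ # chmod ug+x /etc/cron.daily/logrotate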

Caution: Some distributed scripts impact performance negatively on a CLE system. To ensure that all scripts are disabled, type the following:

default/:/ # find /etc/cron.hourly /etc/cron.daily \
/etc/cron.weekly /etc/cron.monthly \
-type f -follow -exec chmod ugo-x {} \;

3. Exit xtopview.

default/:/ # exit
boot:~ #

Procedure 34. Configuring cron for the shared root without persistent /var

Because CLE has a shared root, the standard cron initialization script /etc/init.d/cron activates the cron daemon on all service nodes. Therefore, the cron daemon is disabled by default and you must turn it on with the xtservconfig command to specify which nodes you want the daemon to run on.

1. Edit the /etc/group file in the default view to add users who do not have root permission to the "trusted" group. The operating system requires that all cron users who do not have root permission be in the "trusted" group.

boot:~ # xtopview
default/:/ # vi /etc/group
default/:/ # exit

2. Create a /var/spool/cron directory in the /ufs file system on the ufs node which is shared among all the nodes of class login.

boot:~ # ssh root@ufs
ufs:~# mkdir /ufs/cron
ufs:~# cp -a /var/spool/cron /ufs
ufs:~# exit

3. Designate a single login node on which to run the scripts in this directory. Configure this node to start cron with the xtservconfig command rather than the /etc/init.d/cron script. This enables users, including root, to submit cron jobs from any node of class login. These jobs are executed only on the specified login node.

a. Create or edit the following entry in the /etc/sysconfig/xt file in the shared root file system in the default view.

boot:~ # xtopview
default/:/ # vi /etc/sysconfig/xt
CRON_SPOOL_BASE_DIR=/ufs/cron
default/:/ # exit

b. Start an xtopview shell to access all login nodes by class and configure the spool directory to be shared among all nodes of class login.

boot:~ # xtopview -c login
class/login:/ #

c. Edit the /etc/init.d/boot.xt-local file to add the following lines.

class/login:/ # vi /etc/init.d/boot.xt-local

MYCLASS_NID=`rca-helper -i`
MYCLASS=`xtnce $MYCLASS_NID | awk -F: '{ print $2 }' | tr -d [:space:]`
CRONSPOOL=`xtgetconfig CRON_SPOOL_BASE_DIR`
if [ "$MYCLASS" = "login" -a -n "$CRONSPOOL" ]; then
  mv /var/spool/cron /var/spool/cron.$$
  ln -sf $CRONSPOOL /var/spool/cron
fi
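These lines determine the node's class at boot (rca-helper returns the node ID and xtnce maps it to a class) and, on login-class nodes only, replace the local /var/spool/cron directory with a symbolic link to the shared spool directory named by CRON_SPOOL_BASE_DIR, so all login nodes share one cron spool.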

d. Examine the /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly directories and change the file access permissions to enable or disable distributed cron scripts to meet your needs. To enable a script, invoke chmod ug+x to make the file executable. By default, CLEinstall removes the execute permission bit to disable all distributed cron scripts.

Caution: Some distributed scripts impact performance negatively on a CLE system. To ensure that all scripts are disabled, type the following:

class/login:/ # find /etc/cron.hourly /etc/cron.daily \
/etc/cron.weekly /etc/cron.monthly \
-type f -follow -exec chmod ugo-x {} \;

e. Exit from the login class view.

class/login:/ # exit
boot:~ #

f. Use the xtservconfig command to enable the cron service on a single login node; in this example, node 8.

boot:~ # xtopview -n 8
node/8:/ # xtservconfig -n 8 add CRON
node/8:/ # exit

The cron configuration becomes active on the next reboot. For more information, see the xtservconfig(8) man page.

6.11 Configuring the Load Balancer

Optional: The load balancer service is optional on systems that run CLE.

The load balancer can distribute user logins to multiple login nodes, allowing users to connect by using the same Cray host name, for example xthostname. Two main components are required to implement the load balancer: the lbnamed service (on the SMW and Cray login nodes) and the site-specific domain name service (DNS). When an external system tries to resolve xthostname, a query is sent to the site-specific DNS. The DNS server recognizes xthostname as being part of the Cray domain and shuttles the request to lbnamed on the SMW. The lbnamed service returns the IP address of the least-loaded login node to the requesting client. The client connects to the Cray system login node by using that IP address.

The CLE software installation process installs lbnamed in /opt/cray-xt-lbnamed on the SMW and in /opt/cray/lbcd on all service nodes. Configure lbnamed by using the lbnamed.conf and poller.conf configuration files on the SMW. For more information about configuring lbnamed, see the lbnamed.conf(5) man page.

Procedure 35. Configuring lbnamed on the SMW

1. (Optional) If site-specific versions of /etc/opt/cray-xt-lbnamed/lbnamed.conf and /etc/opt/cray-xt-lbnamed/poller.conf do not already exist, copy the provided example files to these locations.

smw:~ # cd /etc/opt/cray-xt-lbnamed/
smw:/etc/opt/cray-xt-lbnamed/ # cp -p lbnamed.conf.example lbnamed.conf
smw:/etc/opt/cray-xt-lbnamed/ # cp -p poller.conf.example poller.conf

2. Edit the lbnamed.conf file on the SMW to define the lbnamed host name, domain name, and polling frequency.

smw:/etc/opt/cray-xt-lbnamed/ # vi lbnamed.conf

For example, if lbnamed is running on the host name smw.mysite.com, set the login node domain to the same domain specified for the $hostname. The Cray system xthostname is resolved within the domain specified as $login_node_domain.

$poller_sleep = 30;
$hostname = "mysite-lb";
$lbnamed_domain = "smw.mysite.com";
$login_node_domain = "mysite.com";
$hostmaster = "rootmail.mysite.com";

3. Edit the poller.conf file on the SMW to configure the login node names.

smw:/etc/opt/cray-xt-lbnamed/ # vi poller.conf

#
# groups
# ------
# login mycray1-mycray3

mycray1 1 login
mycray2 1 login
mycray3 1 login

Note: Because lbnamed runs on the SMW, eth0 on the SMW must be connected to the same network from which users log on to the login nodes. Do not put the SMW on the public network.

Procedure 36. Installing the load balancer on an external "white box" server

Optional: Install lbnamed on an external "white box" server as an alternative to installing it on the SMW. Cray does not test or support this configuration.

A "white box" server is any workstation or server that supports the lbnamed service. 1. Shut down and disable lbnamed.

smw:~# /etc/init.d/lbnamed stop
smw:~# chkconfig lbnamed off

2. Locate the cray-xt-lbnamed RPM on the Cray CLE 5.0.UPnn Software media and install this RPM on the "white box." Do not install the lbcd RPM.

3. Follow the instructions in the lbnamed.conf(5) man page to configure lbnamed, taking care to substitute the name of the external server wherever SMW is indicated, then enable the service.

6.12 Configuring Node Health Checker (NHC)

For an overview of NHC (sometimes referred to as NodeKARE), see the intro_NHC(8) man page. For additional information about ALPS and how ALPS cooperates with NHC to perform application cleanup, see Chapter 8, Using the Application Level Placement Scheduler (ALPS) on page 259.

6.12.1 /etc/opt/cray/nodehealth/nodehealth.conf Configuration File

NHC can be run under two basic circumstances:

• when a node boots
• immediately after applications within a reservation have terminated and immediately after a reservation has terminated

The latter circumstance is known as Compute Node Cleanup (CNCU). Its objective is to efficiently return compute nodes to the pool of available nodes with as much free memory as they had when they were first booted. ALPS invokes NHC after every application completes and after every reservation completes. The NHC tests that run after applications form the application set; the NHC tests that run after reservations exit form the reservation set. With multiple test sets executing, CNCU requires more than one instance of NHC to be running simultaneously. The advanced_features NHC control variable must be enabled to use CNCU. The default setting of advanced_features in the example NHC configuration file is on.

To support running NHC at boot time and after applications and reservations complete, NHC uses two separate and independent configuration files, which enable NHC to be configured differently for these situations.

After application and reservation termination: The configuration file that controls NHC behavior after a job has terminated is /etc/opt/cray/nodehealth/nodehealth.conf, which is located in the shared root. The CLE installation and upgrade processes automatically install this file and enable NHC software; there is no need for you to change any installation configuration parameters or issue any commands. However, if you like, you may edit this configuration file to customize NHC behavior. After you do so, the changes you made are reflected in the behavior of NHC the next time that it runs.

When a node boots: The configuration file that controls NHC behavior on boot is located on the compute node. To change this file, you must instead change its template, which is located on the SMW, in one of two locations.

On non-partitioned systems, the SMW template is located here:

/opt/xt-images/templates/default/etc/opt/cray/nodehealth/nodehealth.conf

On partitioned systems, the SMW template is located here, where x is the partition number:

/opt/xt-images/templates/default-px/etc/opt/cray/nodehealth/nodehealth.conf

In either case, after you have modified the nodehealth.conf file, you must remake the boot image for the compute node and reboot the node with the new boot image in order for your changes to take effect.

Each CLE release package also includes an example NHC configuration file, /opt/cray/nodehealth/default/etc/nodehealth.conf.example. The nodehealth.conf.example file is a copy of the /etc/opt/cray/nodehealth/nodehealth.conf file provided for an initial installation.

Important: The /etc/opt/cray/nodehealth/nodehealth.conf file is not overwritten during a CLE upgrade if the file already exists. This preserves your site-specific modifications previously made to the file. However, you should compare your /etc/opt/cray/nodehealth/nodehealth.conf file content with the /opt/cray/nodehealth/default/etc/nodehealth.conf.example file provided with each release to identify any changes, and then update your /etc/opt/cray/nodehealth/nodehealth.conf file accordingly. If the /etc/opt/cray/nodehealth/nodehealth.conf file does not exist, then the /opt/cray/nodehealth/default/etc/nodehealth.conf.example file is copied to the /etc/opt/cray/nodehealth/nodehealth.conf file.
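A minimal sketch of that comparison, assuming both files are visible from the shared root (adjust the paths if your release directory differs):

boot:~ # xtopview
default/:/ # diff /etc/opt/cray/nodehealth/nodehealth.conf \
/opt/cray/nodehealth/default/etc/nodehealth.conf.example
default/:/ # exit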

To use an alternate NHC configuration file, use the xtcleanup_after -f alt_NHCconfigurationfile option to specify which NHC configuration file to use with the xtcleanup_after script. For additional information, see the xtcleanup_after(8) man page. NHC can also be configured to automatically dump, reboot, or dump and reboot nodes that have failed tests. This is controlled by the action variable specified in the NHC configuration file that is used with each NHC test and the /etc/opt/cray-xt-dumpd/dumpd.conf configuration file. For additional information, see Using dumpd to Automatically Dump and Reboot Nodes on page 88, the dumpd(8) man page, and the dumpd.conf configuration file on the System Management Workstation (SMW).

6.12.2 Configuring Node Health Checker Tests

Edit the /etc/opt/cray/nodehealth/nodehealth.conf file to configure the NHC tests that will test CNL compute node functionality. All tests that are enabled will run when NHC is in either Normal Mode or in Suspect Mode. Tests run in parallel, independently of each other, except for the Free Memory Check test, which requires that the Application Exited Check test passes before the Free Memory Check test begins. The xtcheckhealth binary runs the NHC tests; for information about the xtcheckhealth binary, see the intro_NHC(8) and xtcheckhealth(8) man pages. The NHC tests are listed below. In the default NHC configuration file, each test that is enabled starts with an action of admindown, except for the Free Memory Check, which starts with an action of log.

Important: Also read important test usage information in Guidance About NHC Tests on page 170.

• Accelerator, which tests the health of any accelerators present on the node. It is an application set test and should not be run in the reservation set.

The global accelerator test (gat) script detects the type of accelerator(s) present on the node and then launches a test specific to the accelerator type. The test fails if it is unable to run successfully on the accelerator, or if the amount of allocated memory on the accelerator exceeds the amount specified using the gat -m argument. The Accelerator test is enabled in the default NHC configuration file.

• Application Exited Check, which verifies that any remaining processes from the most recent application have terminated. It is an application set test and should not be run in the reservation set because an application is not associated with a reservation cancellation. The Application Exited Check test checks locally on the compute node to see if there are processes running under the ID of the application (APID). If there are processes running, then NHC waits a period of time (set in the configuration file) to determine if the application processes exit properly. If the process does not exit within that time, then this test fails.

The Application Exited Check test is enabled in the default NHC configuration file.

• Apinit Log and Core File Recovery, which is a plugin script to copy apinit core dump and log files to a login/service node. It is an application set test.

This test is not enabled in the default NHC configuration file. It should not be enabled until after a destination directory is decided on and specified in the NHC configuration file.

• Apinit Ping, which verifies that the ALPS daemon is running on the compute node and is responsive. It is an application set test.

The Apinit Ping test queries the status of the apinit daemon locally on each compute node; if the apinit daemon does not respond to the query, then this test fails.

The Apinit Ping test is enabled in the default NHC configuration file.

• Free Memory Check, which examines how much memory is consumed on a compute node while applications are not running. Use it only as a reservation test because an application within a reservation may leave data for another application in the reservation. If run in the application set, Free Memory Check could consider data that was intentionally left for the next application to be leaked memory and mark the node admindown. Run the Free Memory Check test only after the Reservation test passes successfully.

The Free Memory Check test is enabled in the default NHC configuration file; however, its action is log only.

• Filesystem, which ensures that the compute node is able to perform simple I/O to the specified file system. It is configured as an application set test in the default configuration, but it can be run in the reservation set. For a file system that is mounted read-write, the test performs a series of operations on the file system to verify the I/O. A file is created, written, flushed, synced, and deleted. If a mount point is not explicitly specified, the mount point(s) from the compute node /etc/fstab file will be used and a Filesystem test will be created for each mount point found in the file. If a mount point is explicitly specified, then only that file system will be checked. You can specify multiple Filesystem tests by placing multiple Filesystem lines in the configuration file. One line could specify the implicit Filesystem test. The next line could specify a specific file system that does not appear in /etc/fstab. This could continue for any and all file systems. If you enable the Filesystem test, you can place an optional line (such as, Excluding: FileSystem-foo) in the /etc/opt/cray/nodehealth/nodehealth.conf configuration file that allows you to list mount points that should not be tested by the Filesystem test. This allows you to intentionally exclude specific mount points even though they appear in the fstab file. This action prevents NHC from setting nodes to admindown because of errors on relatively benign file systems. Explicitly specified mount points cannot be excluded in this fashion; if they should not be checked, then they should simply not be specified.

The Filesystem test creates its temporary files in a subdirectory (.nodehealth.fstest) of the file system root. An error message is written to the console when the unlink of a file created by this test fails.

The Filesystem test is enabled in the default NHC configuration file.

• Hugepages, which calculates the amount of memory available in a specified page size with respect to a percentage of /proc/boot_freemem. It is a reservation set test.

This test will continue to check until either the memory clears up or the time-out is reached. The default time-out is 300 seconds.

The Hugepages test is disabled in the default NHC configuration file.

• Sigcont Plugin, which sends a SIGCONT signal to the processes of the current APID. It is an application set test.

The Sigcont Plugin test is disabled in the default NHC configuration file.

• Plugin, which allows scripts and executables not built into NHC to be run, provided they are accessible on the compute node. No plugins are configured by default and the Plugin test is disabled in the default NHC configuration file so that local configuration settings may be used. For information about writing a plugin test, see Writing a Node Health Checker (NHC) Plugin Test.

• ugni_nhc_plugins, which tests the User level Gemini Network Interface (uGNI) on compute nodes. It is a reservation set test and an application set test. By extension, testing the uGNI interface also tests the proper operation of parts of the network interface card (NIC). The test sends a datagram packet out to the node's NIC and back again.

• Reservation, which checks for the existence of the /proc/reservations/rid directory, where rid is the reservation ID. It is a reservation set test, and should not be run in the application set.

If this directory still exists, the test will attempt to end the reservation and then wait for the specified timeout value for the directory to disappear. If the test fails and Suspect Mode is enabled, NHC enters Suspect Mode. In Suspect Mode, Reservation continues running, repeatedly requesting that the kernel clean up the reservation, until the test passes or until Suspect Mode times out. If the directory does not disappear in that time, the test prints information to the console and exits with a failure.

The Reservation test is enabled in the default NHC configuration file, with a timeout value of 300 seconds.

Individual tests may appear multiple times in the /etc/opt/cray/nodehealth/nodehealth.conf file, with different variable values. Every time a test is specified in the /etc/opt/cray/nodehealth/nodehealth.conf file, NHC will run that test. This means if the same line is specified five times, NHC will try to run that same test five times. This functionality is mainly used in the case of the Plugin test, allowing you to specify as many additional tests as you want to write, or the Filesystem test, allowing you to specify as many additional file systems as you want. However, any test can be specified to run any number of times. Different parameters and test actions can be set for each test. For example, this could be used so that you can set up hard limits and soft limits for the Free Memory Check test. Two Free Memory Check tests could be specified in the configuration file; the first test configured to only warn about small amounts of non-free memory, and the second test configured to admindown a node that has large amounts of non-free memory. See the /etc/opt/cray/nodehealth/nodehealth.conf file for configuration information.

6.12.2.1 Guidance About NHC Tests

Guidance about the Accelerator test: This test uses the global accelerator test (gat) script (/opt/cray/nodehealth/default/bin/gat.sh) to first detect the accelerator type and then launch the test specific to that type of accelerator. The gat script supports two arguments:

-m maximum_memory_size

You can specify the maximum_memory_size as either a kilobyte value or a percentage of total memory. For example, -m 100 specifies that no more than 100 kilobytes of memory can be allocated, while -m 10% specifies that no more than 10 percent of memory can be allocated. In the default NHC configuration file, the specified memory size is 10%.

-r Perform a soft restart on the GPU and then rerun the test. In the default NHC configuration file, the -r argument is specified.

Guidance about the Application Exited Check and Apinit Ping tests: These two tests must be enabled and both tests must have their action set as admindown or die; otherwise, NHC runs the risk of allowing ALPS to enter a live-lock. (Specify the die action only when the advanced_features control variable is turned off.) ALPS must guarantee the following two things about the nodes in a reservation before releasing that reservation: 1) ALPS must guarantee that ALPS is functioning on the nodes, and 2) ALPS must guarantee that the previous application has exited from the nodes. Either those two things are guaranteed or the nodes must be set to some state other than up. When either ALPS has guaranteed the two things about the nodes or the nodes have been set to some state other than up, then ALPS can release the reservation. These two NHC tests guarantee those two things: 1) the Apinit_ping test guarantees that ALPS is functioning on the nodes, and 2) the Application_Exited_Check test guarantees that the previous application has exited from the nodes. If either test fails, then NHC sets the nodes to suspect state (if Suspect Mode is enabled; otherwise, NHC sets the nodes to admindown or unavail). In the end, either the nodes pass those tests, or the nodes are no longer in the up state. In either case, ALPS is free to release the reservation and the live-lock is avoided. However, this only happens if the two tests are enabled and their action is set as admindown or die. The log action does not suffice because it does not change the state of the nodes. If either test is disabled or has an action of log, then ALPS may live-lock. In this live-lock, ALPS will call NHC endlessly.

Guidance about the Filesystem test: The NHC Filesystem test can take an explicit argument (the mount point of the file system) or no argument. If an argument is provided, then the Filesystem test is referred to as an explicit Filesystem test. If no argument is given, the Filesystem test is referred to as an implicit Filesystem test.

The explicit Filesystem test will test the file system located at the specified mount point.

The implicit Filesystem test will test each file system listed in the /etc/fstab file on each compute node. The implicit Filesystem test is enabled by default in the NHC configuration file.

The Filesystem test will determine whether a file system is mounted read-only or read-write. If the file system is mounted read-write, then NHC will attempt to write to it. If it is mounted read-only, then NHC will attempt to read the directory entities "." and ".." in the file system to guarantee, at a minimum, that the file system is readable. Some file systems are mounted on the compute nodes as read-write file systems, while their underlying permissions are read-only. As an example, for an auto-mounted file system, the base mount-point may have read-only permissions; however, it could be mounted as read-write. It would be mounted as read-write, so that the auto-mounted sub-mount-points could be mounted as read-write. The read-only permissions prevent tampering with the base mount-point. In a case such as this, the Filesystem test would see that the base mount-point had been mounted as a read-write file system. The Filesystem test would try to write to this file system, but the write would fail due to the read-only permissions. Because the write fails, then the Filesystem test would fail, and NHC would incorrectly decide that the compute node is unhealthy because it could not write to this file system. For this reason, file systems that are mounted on compute nodes as read-write file systems, but are in reality read-only file systems, should be excluded from the implicit Filesystem test.

You can exclude tests by adding an "Excluding: file system mount point" line in the NHC configuration file. See the NHC configuration file for further details and an example.

A file system is deemed a critical file system if it is needed to run applications. All systems will likely need at least one shared file system for reading and writing input and output data. Such a file system would be a critical file system. File systems that are not needed to run applications or read and write data would be deemed as noncritical file systems. You need to determine the criticality of each file system.

Cray recommends the following:

• Excluding noncritical file systems from the implicit Filesystem test. See the NHC configuration file for further details and an example.

• If there are critical file systems that do not appear in the /etc/fstab file on the compute nodes (such file systems would not be tested by the implicit Filesystem test), these critical file systems should be checked through explicit Filesystem tests. You can add explicit Filesystem tests to the NHC configuration file by providing the mount point of the file system as the final argument to the Filesystem test. See the NHC configuration file for further details and an example.

• If you have a file system that is mounted as read-write but it has read-only permissions, you should exclude it from the implicit Filesystem test. NHC does not support such file systems.

Guidance about the Hugepages test: The Hugepages test runs the hugepages_check command, which supports two arguments.

-p percentage

Use this argument to specify the percentage of /proc/boot_freemem. If this test is enabled and this argument is not supplied, the default of -p 60 is used.

-s size

Specify the hugepage size. The valid sizes are 2, 4, 8, 16, 32, 64, 128, 256, and 512. If this test is enabled and this argument is not supplied, the default of -s 2 is used.

Guidance about the NHC Lustre file system test: The Lustre file system has its own hard time-out value that determines the maximum time that a Lustre recovery will last. This time-out value is called RECOVERY_TIME_HARD, and it is located in the file system's fs_defs file. The default value for the RECOVERY_TIME_HARD is fifteen minutes.

Important: The time-out value for the NHC Lustre file system test should be twice the RECOVERY_TIME_HARD value.

The default in the NHC configuration file is thirty minutes, which is twice the default RECOVERY_TIME_HARD. If you change the value of RECOVERY_TIME_HARD, you must also correspondingly change the time-out value of the NHC Lustre file system test. The NHC time-out value is specified on this line in the NHC configuration file:

# Lustre: Lustre: 900 1800 60

If you change the RECOVERY_TIME_HARD value, you must change the 1800 seconds (thirty minutes) to reflect your new RECOVERY_TIME_HARD multiplied by two.
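For example (a hypothetical site change), if you raise RECOVERY_TIME_HARD to twenty minutes (1200 seconds), the NHC Lustre test time-out becomes 2 * 1200 = 2400 seconds, and the line would read:

# Lustre: Lustre: 900 2400 60

The remaining fields are left unchanged from the default line shown above; you must also increase the Suspect Mode suspectend value accordingly, as described below.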

Further, the overall time-out value of NHC's Suspect Mode is based on the maximum time-out value for all of the NHC tests. Invariably, the NHC Lustre file system test has the longest time-out value of all the NHC tests.

Important: If you change the NHC Lustre file system test time-out value, then you must also change the time-out value for Suspect Mode. The time-out value for Suspect Mode is set by the suspectend variable in the NHC configuration file. The guidance for setting the value of suspectend is that it should be the maximum time-out value, plus an additional buffer. In the default case, suspectend was set to thirty-five minutes: thirty minutes for the Lustre test, plus an additional five-minute buffer. For more information about the suspectend variable, see Suspect Mode on page 179.

6.12.2.2 NHC Control Variables

The following variables in /etc/opt/cray/nodehealth/nodehealth.conf affect the fundamental behavior of NHC.

nhcon: [on|off]

Turning off nhcon disables NHC entirely.

Default: on

advanced_features: [on|off]

If set to on, this variable allows multiple instances of NHC to run simultaneously. This variable must be on to use CNCU and reservation sets.

Default: on

dumpdon: [on|off]

If set to off, NHC will not request any dumps or reboots from dumpd. This is a quick way to turn off dump and reboot requests from NHC. The dump, reboot, and dumpreboot actions do not function properly when this variable is off.

Default: on

anyapid: [on|off]

Turning anyapid on specifies that NHC should look for any apid in /dev/cpuset while running the Application Exited Check and print stack traces for processes that are found.

Default: off

6.12.2.3 Global Configuration Variables That Affect All NHC Tests

The following global configuration variables may be set in the /etc/opt/cray/nodehealth/nodehealth.conf file to alter the behavior of all NHC tests. The global configuration variables are case-insensitive.

Runtests: Frequency

Determines how frequently NHC tests are run on the compute nodes. Frequency may be either errors or always. When the value errors is specified, the NHC tests are run only when an application terminates with a non-zero error code or terminates abnormally. When the value always is specified, the NHC tests are run after every application termination. If you do not specify the Runtests global variable, the implicit default is errors. This variable applies only to tests in the application set; reservations do not terminate abnormally.

Connecttime: TimeoutSeconds

Specifies the amount of time, in seconds, that NHC waits for a node to respond to requests for the TCP connection to be established. If Suspect Mode is disabled and a particular node does not respond after connecttime has elapsed, then the node is marked admindown. If Suspect Mode is enabled and a particular node does not respond after connecttime has elapsed, then the node is marked suspect. Then, NHC will attempt to contact the node with a frequency established by the recheckfreq variable. (For information about Suspect Mode and the recheckfreq variable, see Suspect Mode on page 179.)

If you do not specify the Connecttime global variable, then the implicit default TCP time-out value is used. NHC will not enforce time-out on the connections if none is specified. The Connecttime: TimeoutSeconds value provided in the default NHC configuration file is 60 seconds.
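Both variables follow the same variable: value form used throughout the file; illustrative entries matching the documented defaults would look like this:

Runtests: errors
Connecttime: 60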

The following global variables control the interaction of NHC and dumpd, the SMW daemon that initiates automatic dump and reboot of nodes.

maxdumps: MaximumNodes

Specifies the number of nodes that fail with the dump or dumpreboot action that will be dumped. For example, if NHC was checking on 10 nodes that all failed tests with the dump or dumpreboot actions, only the number of nodes specified by maxdumps would be dumped, instead of all of them. The default value is 1. To disable dumps of failed nodes with dump or dumpreboot actions, set maxdumps: 0.

downaction: action

Specifies the action NHC takes when it encounters a down node. Valid actions are log, dump, reboot, and dumpreboot. The default action is log.

downdumps: number_dumps

Specifies the maximum number of dumps that NHC will dump for a given APID, assuming that the downaction variable is either dump or dumpreboot. These dumps are in addition to any dumps that occur because of NHC test failures. The default value is 1.

The following global variables control the interaction between NHC, ALPS, and the SDB.

alps_recheck_max: number of seconds

NHC will attempt to verify its view of the nodes' states with the ALPS view. If NHC is unable to contact ALPS, this variable controls the maximum delay between rechecks.

Default value: 10 seconds

alps_sync_timeout: number of seconds

If NHC is unable to contact ALPS to verify the nodes' states, this variable controls the length of time before NHC gives up and aborts.

Default value: 1200 seconds

alps_warn_time: number of seconds

If NHC is unable to contact ALPS to verify the nodes' states, this variable controls how often warnings are issued.

Default value: 120 seconds

sdb_recheck_max: number of seconds

NHC will contact the SDB to query for the nodes' states. If NHC is unable to contact the SDB, this variable controls the maximum delay between rechecks.

Default value: 10 seconds

sdb_warn_time: number of seconds

If NHC is unable to contact the SDB, this variable controls how often warnings are issued.

Default value: 120 seconds

node_no_contact_warn_time: number of seconds

If NHC is unable to contact a specific node, this variable controls how often warnings are issued.

Default value: 600 seconds

The following global variable controls NHC's use of node states.

unhealthy_state: swdown

When a node is deemed unhealthy, it is normally set to admindown. This variable permits a different state to be chosen instead.

Default: not set

unhealthy_state: rebootq

When a node is going to be rebooted, it normally is set to Unavail. This variable permits a different state to be chosen instead.

Default: not set

6.12.2.4 Standard Variables That Affect Individual NHC Tests

The following variables are used with each NHC test; set each variable for each test. All variables are case-insensitive. Each NHC test has values supplied for these variables in the default NHC configuration file.

Note: Specific NHC tests require additional variables, which are defined in the nodehealth configuration file.

action

Specifies the action to perform if the compute node fails the given NHC test. action may have one of the following values:

• log — Logs the failure to the system console log; the log action will not cause a compute node's state to be set to admindown.

Important: Tests that have an action of Log do not run in Suspect Mode. If you use plugin scripts with an action of Log, the script will only be run once, in Normal Mode; this makes log collecting and various other maintenance tasks easier to code.

• admindown — Sets the compute node's state to admindown (no more applications will be scheduled on that node) and logs the failure to the system console log.

If Suspect Mode is enabled, the node will first be set to suspect state, and if the test continues to fail, the node will be set to admindown at the end of Suspect Mode.

• die — Halts the compute node so that no processes can run on it, sets the compute node's state to admindown, and logs the failure to the system console log. (The die action is the equivalent of a kernel panic.) This action is good for catching bugs because the state of the processes is preserved and can be dumped at a later time.

Note: If the advanced_features variable is enabled, die is not allowed.

Each subsequent action includes the actions that preceded it; for example, the die action encompasses the admindown and log actions.

Note: If NHC is running in Normal Mode and cannot contact a compute node, and if Suspect Mode is not enabled, NHC will set the compute node's state to admindown.

The following actions control the NHC and dumpd interaction.

• dump — Sets the compute node's state to admindown and requests a dump from the SMW, in accordance with the maxdumps configuration variable.

• reboot — Sets the compute node's state to unavail and requests a reboot from the SMW. The unavail state is used rather than the admindown state when nodes are to be rebooted because a node that is set to admindown and subsequently rebooted stays in the admindown state. The unavail state does not have this limitation.

• dumpreboot — Sets the compute node's state to unavail and requests a dump and reboot from the SMW.

warntime Specifies the amount of test time, in seconds, that should elapse before xtcheckhealth logs a warning message to the console file. This allows an administrator to take corrective action, if necessary, before the timeout is reached.

timeout Specifies the total time, in seconds, that a test should run before an error is returned by xtcheckhealth and the specified action is taken.

restartdelay Valid only when NHC is running in Suspect Mode. Specifies how long NHC will wait, in seconds, to restart the test after the test fails. The minimum restart delay is one second.

sets Indicates when to run a test. The default NHC configuration specifies to run specific tests after application completion and to run an alternate group of tests at reservation end. When ALPS calls NHC at the end of the application, tests marked with Sets: Application are run. By default, these tests are: Filesystem, Accelerator, ugni_nhc_plugins, Application Exited Check, Apinit Ping Test, and Apinit Log and Core File Recovery. At the end of the reservation, ALPS calls tests marked Sets: Reservation. By default, these are: Free Memory Check, ugni_nhc_plugins, Reservation, and Hugepages Check. If no set is specified for a test, it will default to Application, and run when ALPS calls NHC at the end of the application. If NHC is launched manually, using the xtcheckhealth command, and the -m sets argument is not specified on the command line, then xtcheckhealth defaults to running the Application set.

If a test is marked Sets: All, it will always run, regardless of how NHC is invoked.
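The following sketch is purely illustrative; it does not reproduce the entry syntax of the default nodehealth.conf file, but it shows plausible values for a single file system test and how they interact:

action:       admindown
warntime:     60
timeout:      900
restartdelay: 65
sets:         Application

With these hypothetical values, NHC logs a console warning if the test is still running after 60 seconds, declares the test failed at 900 seconds and sets the node admindown (at the end of Suspect Mode, if Suspect Mode is enabled), waits 65 seconds between retries in Suspect Mode, and runs the test when ALPS calls NHC at the end of an application.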


6.12.3 Suspect Mode Upon entry into Suspect Mode, NHC immediately returns healthy nodes to the resource pool. Suspect Mode gives the remaining nodes, which are all in suspect state, an opportunity to recover. If the nodes have not recovered by the end of Suspect Mode (determined by the suspectend global variable; see below), their states are set to admindown. For more information about how Suspect Mode functions, see the intro_NHC(8) man page.

Important: Suspect Mode is enabled in the default /etc/opt/cray/nodehealth/nodehealth.conf configuration file. Cray Inc. recommends that you run NHC with Suspect Mode enabled.

If enabled, the default NHC configuration file provided from Cray Inc. uses the following Suspect Mode variables:

suspectenable: Enables Suspect Mode; valid values are y and n. The /etc/opt/cray/nodehealth/nodehealth.conf configuration file provided from Cray Inc. has this variable set as suspectenable: y.

suspectbegin: Sets the Suspect Mode timer. Suspect Mode starts after the number of seconds indicated by suspectbegin have expired. The /etc/opt/cray/nodehealth/nodehealth.conf configuration file provided from Cray Inc. has this variable set as suspectbegin: 180.

suspectend: Suspect Mode ends after the number of seconds indicated by suspectend have expired. This timer starts only after NHC has entered Suspect Mode. The /etc/opt/cray/nodehealth/nodehealth.conf configuration file provided from Cray Inc. has this variable set as suspectend: 2100.

Considerations when evaluating shortening the length of Suspect Mode:

• You can shorten the length of Suspect Mode if you do not have external file systems, such as Lustre, that NHC would be checking.

• The length of Suspect Mode should be at least a few seconds longer than the longest time-out value for any of the NHC tests.


For example, if the Filesystem test had the longest time-out value at 900 seconds, then the length of Suspect Mode should be at least 905 seconds.

• The longer Suspect Mode is, the longer nodes have to recover from any unhealthy situations. Setting the length of Suspect Mode too short reduces this recovery time and increases the likelihood of the nodes being marked admindown prematurely.

recheckfreq: Suspect Mode rechecks the health of the nodes in suspect state at a frequency specified by recheckfreq. This value is in seconds. The /etc/opt/cray/nodehealth/nodehealth.conf configuration file provided from Cray Inc. has this variable set as recheckfreq: 300. (For a detailed description of NHC actions during the recheck process, see the intro_NHC(8) man page.)
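For example, a site with no external Lustre file systems whose longest NHC test time-out is 300 seconds might shorten Suspect Mode with settings such as the following; these are hypothetical values chosen to satisfy the considerations above, not Cray defaults:

suspectenable: y
suspectbegin: 180
suspectend: 305
recheckfreq: 60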

6.12.4 NHC Messages NHC messages may be found on the SMW in /var/opt/cray/log/sessionid/nhc-YYYYMMDD and include the NHC revision, in the form M.m, in the message, where M is the major and m is the minor NHC revision number. All NHC messages are visible in the console file.

NHC prints a summary message per node at the end of Normal Mode and Suspect Mode when at least one test has failed on a node. For example:

APID:100 (xtnhc) FAILURES: The following tests have failed in normal mode:
APID:100 (xtnhc) FAILURES: (Admindown) Apinit_Ping
APID:100 (xtnhc) FAILURES: (Admindown) Plugin /example/plugin
APID:100 (xtnhc) FAILURES: (Log Only ) Filesystem_Test on /mydir
APID:100 (xtnhc) FAILURES: (Admindown) Free_Memory_Check
APID:100 (xtnhc) FAILURES: End of list of 5 failed test(s)

The xtcheckhealth error and warning messages include node IDs and application IDs and are written to the console file on the SMW; for example:

[2010-04-05 23:07:09][c1-0c2s0n0] APID:2773749 (check_apid) WARNING: Failure: File /dev/cpuset/2773749/tasks exists and is not empty. \
The following processes are running under expired APID 2773749:
[2010-04-05 23:07:09][c1-0c2s0n1] APID:2773749 (check_apid) WARNING: Pid: 300 Name: (marys_program) State: D

The xtcleanup_after script writes its normal launch information to the /var/log/xtcheckhealth_log file, which resides on the login nodes. The launch information includes the time that xtcleanup_after was launched and xtcleanup_after's call to xtcheckhealth.


The xtcleanup_after script writes error output (launch failure information) to the /var/log/xtcheckhealth_log file, to the console file on the SMW, and to the syslog. Example xtcleanup_after output follows:

Thu Apr 22 17:48:18 CDT 2010 (xtcleanup_after) \
/opt/cray/nodehealth/3.0-1.0000.20840.30.8.ss/bin/xtcheckhealth -a 10515 \
-e 1 /tmp/apsysLVNqO9 /etc/opt/cray/nodehealth/nodehealth.conf

6.12.5 What if a Login Node Crashes While xtcheckhealth Binaries are Monitoring Nodes? If a login node crashes while some xtcheckhealth binaries on that login node are monitoring compute nodes that are in suspect state, those xtcheckhealth binaries die with the login node. When the crashed login node is rebooted, a recovery action takes place.

When the login node boots, the node_health_recovery binary starts up. This script checks for all compute nodes that are in suspect state and were last set to suspect state by this login node. The script then determines the APID of the application that was running on each of these compute nodes at the time of the crash and launches one xtcheckhealth binary per compute node monitored. xtcheckhealth is launched with this APID, so it can test for any processes that may have been left behind by that application.

This testing only takes place if the Application_Exited_Check test is enabled in the configuration file. (The Application_Exited_Check test is enabled in the default NHC configuration file.) If the Application_Exited_Check test is not enabled, when the recovery action takes place, NHC does not run the Application_Exited_Check test and does not check for leftover processes; however, it runs any other NHC tests that are enabled in the configuration file.

Nodes will be changed from suspect state to up or admindown, depending upon whether they fail any health checks. No system administrator intervention should be necessary.

NHC automatically recovers the nodes in suspect state when the crashed login node is rebooted because the recovery feature runs on the rebooted login node. If the crashed login node is not rebooted, then manual intervention is required to rescue the nodes from suspect state. This manual recovery can commence as soon as the login node has crashed. To recover from a login node crash during the case in which a login node will not be rebooted, the nhc_recovery binary is provided to help you release the compute nodes owned by the crashed login node; see Procedure 37 on page 182. Also, see the nhc_recovery(8) man page for a description of the nhc_recovery binary usage.


Procedure 37. Recovering from a login node crash when a login node will not be rebooted

1. Create a nodelistfile that contains a list of the nodes in the system that are currently in Suspect Mode. The file must be a list of NIDs, one per line; do not include a blank line at the end of the file.

2. To list all of the suspect nodes in the system and which login nodes own those nodes, execute the following command; use the nodelistfile you created in step 1.

nhc_recovery -d nodelistfile

3. Parse the nhc_recovery output for the NID of the login node that crashed, and save the parsed list to a file (for example, name it nodelistfile_computenodes); this file should contain all of the compute nodes owned by the crashed login node.

4. If you plan to recover the suspect nodes by using the option in step 6.a below, then complete this step; otherwise, skip this step.

Note: This recovery method is recommended.

From the list you created in step 3, create nodelistfiles containing nodes that share the same APID to determine the nodes from the crashed login node. For example, your nodelistfiles can be named nodelistfile-APID1, nodelistfile-APID2, nodelistfile-APID3, and so on.

5. Using the file you created in step 3, release all of the suspect compute nodes owned by the crashed login node. Execute the following command:

nhc_recovery -r nodelistfile_computenodes

6. All of these compute nodes have been released in the database. However, they are all still in suspect state. Determine what to do with these suspect nodes from the following three options:

a. (Cray recommends this option) Rerun NHC on a non-crashed login node to recover the nodes listed in step 4. Invoke NHC for each nodelistfile. Supply as the APID argument the APID that corresponds to the nodelistfile; an iteration count of 0 (zero), which is the value normally supplied to NHC by ALPS; and an application exit code of 1 (one). An exit code of 1 ensures that NHC will run regardless of the value of the runtests variable (always or errors) in the NHC configuration file. For example:

xtcleanup_after -s nodelistfile-APID1 APID1 0 1
xtcleanup_after -s nodelistfile-APID2 APID2 0 1
xtcleanup_after -s nodelistfile-APID3 APID3 0 1
...

b. These suspect nodes can be set to admindown and their fate determined by further analysis.


c. These suspect nodes can be set back to up, but they were in Suspect Mode for a reason.

6.12.6 Disabling NHC To disable NHC entirely, set the value of the nhcon global variable in the /etc/opt/cray/nodehealth/nodehealth.conf file to off (the default value in the file provided from Cray Inc. is on).
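For example, with NHC disabled, the relevant line in /etc/opt/cray/nodehealth/nodehealth.conf would read:

nhcon: off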

6.12.7 nodehealth Modulefile To gain access to the NHC functions, the nodehealth module must be loaded. The admin-modules modulefile loads the nodehealth module, or you can load the nodehealth module by executing the following command:

module load nodehealth

The Base-opts.default.local file includes the admin-modules modulefile. For additional information about the Base-opts.default.local file, see System-wide Default Modulefiles on page 120.

6.12.8 Configuring the Node Health Checker to Use SSL Note: NHC is configured to use secure sockets layer (SSL) protocol by default. Although this setting is configurable, Cray recommends that all sites configure NHC to use SSL.

If your site requires authentication and authorization to protect access to compute nodes, you can configure compute nodes to perform node health checking by using the openssl utility and secure sockets layer (SSL) protocol. SSL provides optional security functionality for NHC.

To enable the use of SSL, set the NHC_SSL setting in the CLEinstall.conf file to yes.
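For illustration, if the setting follows the parameter=value form used by other CLEinstall.conf entries, the line would be:

NHC_SSL=yes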

For more information about configuring NHC to use SSL, see Installing and Configuring Cray Linux Environment (CLE) Software.

6.13 Activating Process Accounting for Service Nodes The GNU 6.4 accounting package uses Berkeley Software Design (BSD) type process accounting. The GNU 6.4 process accounting is enabled for the Cray system's service nodes. The package name is acct; it can be activated using the acct boot script. To enable the acct boot script, execute the following command on the boot node root and/or shared root:

boot:~ # chkconfig acct on


The GNU 6.4 process accounting utilities process V2 and V3 format records seamlessly, even if the data is written to the same file. Output goes to an accounting file, which by default is /var/account/pacct. The accounting utilities provided for administration use are: ac, lastcomm, accton, and sa. The related man pages are accessible by using the man command.
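For example, typical invocations of these standard tools on a service node might look like the following; output and supported options vary with the installed version of the package, so see the respective man pages:

login:~ # lastcomm | head
login:~ # sa -m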

6.14 Configuring Failover for Boot and SDB Nodes The boot node is integral to the operation of a Cray system. Critical services like the Application Level Placement Scheduler (ALPS) and Lustre rely on the SDB and will fail if the SDB node is unavailable. The CLE release provides functionality to create standby boot and SDB nodes that automatically act as a backup in the event of primary node failure. Failover allows the system to keep running without an interruption to the system or system services. Note: The boot-node and SDB node failover features do not provide a failback capability.

A virtual network is configured for the boot and SDB nodes to support failover for these nodes. The virtual network is configured by default, regardless of the boot or SDB node failover configuration on your system. The CLEinstall program provides the capability to change the default virtual network configuration; however, the default values are acceptable in most cases. For more information, see Installing and Configuring Cray Linux Environment (CLE) Software or the CLEinstall.conf(5) man page.

6.14.1 Configuring Boot-node Failover When you configure a secondary (backup) boot node, boot-node failover occurs automatically when the primary boot node fails.

The following services run on the boot node:

• NFS shared root (read-only)

• NFS persistent /var (read-write)

• Boot node daemon, bnd

• Hardware supervisory system (HSS) and system database (SDB) synchronization daemon, xtdbsyncd

• ALPS daemons apbridge, apres, and apwatch (for information about configuring ALPS, see Chapter 8, Using the Application Level Placement Scheduler (ALPS) on page 259)


When the primary boot node is booted, the backup boot node also begins to boot. However, the backup boot node makes a call to the rca-helper utility before it mounts its root file system, causing the backup boot node to be suspended until a primary boot-node failure event is detected. The rca-helper daemon running on the backup boot node waits for a primary boot-node failure event, ec_node_failed. When the heartbeat of the primary boot node stops, the blade controller begins the heartbeat checking algorithm to determine if the primary boot node has failed. When the blade controller determines that the primary boot node has failed, it sends an ec_heartbeat_stop event to set the alert flag for the primary node. The primary boot node is halted through STONITH. Setting the alert flag on the node triggers the HSS state manager on the SMW to send out the ec_node_failed event.

When the rca-helper daemon running on the backup boot node receives an ec_node_failed event alerting it that the primary boot node has failed, it allows the boot process of the backup boot node to continue. Any remaining boot actions occur on the backup boot node. Booting of the backup boot node takes approximately two minutes. Each service node runs a failover manager daemon (rca_arpd). When each service node's rca_arpd receives the ec_node_failed event, it takes appropriate action. The rca_arpd process updates the ARP cache entry for the boot node virtual IP address to reference the backup boot node.

The purpose of this implementation of boot-node failover is to ensure that the system continues running, not to guarantee that every job will continue running. Therefore, note the following:

• During the time the primary boot node has failed, any service node that tries to access its root file system will be I/O blocked until the backup boot node is online, at which time the request will be satisfied and the operation will resume. In general, this means if an application is running on a service node, it can continue to run if the application is in memory and does not need to access disk. If it attempts to access disk for any reason, it will be blocked until the backup boot node is online.

• Applications running on compute nodes are affected only if they cause a service node to access its root file system, in which case the service node function would be blocked until the backup boot node is online.


The following is a list of requirements for configuring your system for boot-node failover:

• The backup boot node requires a Host Bus Adapter (HBA) card to communicate with the RAID.

Note: You must configure the backup boot node in the same zone as the primary boot node.

• You must ensure that the boot RAID host port can see the desired LUNs; for DDN, use the host port mapping; for NetApp (formerly LSI and Engenio), use SANshare in the SANtricity Storage Manager.

• The backup boot node also requires a Gigabit Ethernet card connected through a Gigabit Ethernet switch to the same port on the SMW as the primary boot node (typically port 4 of the SMW quad Ethernet card).

• You must enable the STONITH capability on the blade or module of the primary boot node in order to use the boot node failover feature. STONITH is a per blade setting and not a per node setting. Ensure that your primary boot node is located on a separate blade from services with conflicting STONITH requirements, such as Lustre.

Procedure 38. Configuring boot-node failover

Note: If you configured boot-node failover during your CLE software installation or upgrade (as documented in the Installing and Configuring Cray Linux Environment (CLE) Software), this procedure is not needed.

Tip: Use the nid2nic or rtr --system-map commands to translate between node or NIC IDs and physical ID names.

1. As crayadm on the SMW, halt the primary and alternate boot nodes.

Warning: Verify that your system is shut down before you invoke the xtcli halt command.

crayadm@smw:~> xtcli halt primary_id, backup_id

2. Update the default boot configuration used by the boot manager to boot nodes by using the xtcli command:

crayadm@smw:~> xtcli boot_cfg update -b primary_id, backup_id -i /bootimagedir/bootimage

Or, if you are using /raw0, use the following command:

crayadm@smw:~> xtcli boot_cfg update -i /raw0


If you are using partitions, use the following command to designate the primary boot node and the backup boot node:

crayadm@smw:~> xtcli part_cfg update pN -b primary_id,backup_id -i /bootimagedir/bootimage

Or, if you are using /raw0, use the following command:

crayadm@smw:~> xtcli part_cfg update pN -i /raw0

3. Update the CLEinstall.conf file to designate the primary and backup boot nodes so the file has the correct settings when you do your next upgrade.

4. Boot the boot node.

5. The STONITH capability must be enabled on the blade of the primary boot node in order to use the boot-node failover feature.

Caution: STONITH is a per blade setting, not a per node setting. You must ensure that your primary boot node is not assigned to a blade that hosts services with conflicting STONITH requirements, such as Lustre.

a. Use the xtdaemonconfig command to determine the current STONITH setting on your primary boot node. For example, if the primary boot node is c0-0c0s0n1 located on blade c0-0c0s0, type this command:

Note: If you have a partitioned system, invoke these commands with the --partition pn option.

crayadm@smw:~> xtdaemonconfig c0-0c0s0 | grep stonith
c0-0c0s0: stonith=false

b. To enable STONITH on your primary boot node, execute the following command:

crayadm@smw:~> xtdaemonconfig c0-0c0s0 stonith=true
c0-0c0s0: stonith=true
The expected response was received.

c. The STONITH setting does not survive a power cycle. You can maintain the STONITH setting for the primary boot node by adding the following line to your boot automation file:

# boot bootnode: lappend actions {crms_exec "xtdaemonconfig c0-0c0s0 stonith=true"}

6. Boot the system.


Procedure 39. Disabling boot-node failover

• To disable boot-node failover, type these commands; in this example procedure, the primary boot node is c0-0c0s0n1 and the backup boot node is c2-0c1s7n1.

crayadm@smw:~> xtcli halt c0-0c0s0n1,c2-0c1s7n1
crayadm@smw:~> xtcli boot_cfg update -b c0-0c0s0n1,c0-0c0s0n1
crayadm@smw:~> xtdaemonconfig c0-0c0s0 stonith=false

6.14.2 Configuring SDB Node Failover When you configure a secondary (backup) SDB node, SDB node failover occurs automatically when the primary SDB node fails. The CLE implementation of SDB node failover includes installation configuration parameters that facilitate automatic configuration, a chkconfig service called sdbfailover, and a sdbfailover.conf configuration file for defining site-specific commands to invoke on the backup SDB node.

The backup SDB node uses /etc files that are class or node specialized for the primary SDB node and not for the backup node itself; the /etc files for the backup node will be identical to those that existed on the primary SDB node.


The following list summarizes requirements to implement SDB node failover on your Cray system.

• Designate a service node to be the alternate or backup SDB node. The backup SDB node requires a Host Bus Adapter (HBA) card to communicate with the RAID. This backup node is dedicated and cannot be used for other service I/O functions.

• Enable the STONITH capability on the blade or module of the primary SDB node in order to use the SDB node failover feature. STONITH is a per blade setting and not a per node setting. Ensure that your primary SDB node is located on a separate blade from services with conflicting STONITH requirements, such as Lustre.

• Enable SDB node failover by setting the sdbnode_failover parameter to yes in the CLEinstall.conf file prior to running the CLEinstall program.

When this parameter is used to configure SDB node failover, the CLEinstall program will verify and turn on chkconfig services and associated configuration files for sdbfailover.

• Specify the primary and backup SDB nodes in the boot configuration by using the xtcli command with the boot_cfg update -d options. For more information, see the xtcli(8) man page.

• (Optional) Populate /etc/opt/cray/sdb/sdbfailover.conf with site-specific commands.

When a failover occurs, the backup SDB node invokes all commands listed in the /etc/opt/cray/sdb/sdbfailover.conf file. Include commands in this file that are normally invoked during system start-up through boot automation scripts. In an SDB node failover situation, these commands must be invoked on the new (backup) SDB node. For example, you may include commands to start batch system software (if not started through chkconfig) or commands to add a route to an external license server.
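For illustration only, a hypothetical /etc/opt/cray/sdb/sdbfailover.conf might contain entries such as the following; the route, addresses, and init script name are examples, not Cray-supplied values:

/sbin/route add -net 192.168.200.0 netmask 255.255.255.0 gw 192.168.1.1
/etc/init.d/mybatchd start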

If at any time you reconfigure your system to use a different primary SDB node, you must enable STONITH for the new SDB node and disable STONITH for the previous node. For procedures to configure SDB node failover during a CLE software installation, see Installing and Configuring Cray Linux Environment (CLE) Software.


6.14.3 The Node ARP Management Daemon (rca_arpd) The node ARP management daemon (rca_arpd) manages the system ARP cache. This daemon deletes the IP to hardware address (ARP) mappings for failed nodes and re-adds them when they become available. It will only manage ARP mappings on the high speed interconnect network and not external network interfaces such as Ethernet. If failover is configured, rca_arpd also manages ARP mappings for the backup boot or SDB node. When a node failed event from the primary boot or SDB node is received, rca_arpd updates the ARP mapping for the boot or SDB node virtual IP address to point to the backup node.

This functionality is included in the cray-rca-compute and cray-rca-service RPMs and is installed by default.

6.15 Creating Logical Machines Logical Machines on page 59 introduces logical machines. Configure a logical machine (sometimes known as a system partition) with the xtcli part_cfg command. Partition IDs are predefined as p0 to p31. The default partition p0 is reserved for the complete system.

6.15.1 Creating Logical Machines on Cray XC30 Systems Cray XC30 systems can have one or more cabinets. Every cabinet is fully-populated (with 3 chassis), with the possible exception of the last cabinet. For Cray XC30 systems, groups are made up of two-cabinet pairs starting from the beginning. The last group may not be completely full, and it can consist of 1 through 6 fully-populated chassis.

6.15.1.1 Multiple Group Systems When a Cray XC30 system contains multiple groups, you may partition the system at a per-group level of granularity. Groups do not need to be sequentially positioned in a multi-group partition. So, if a Cray XC30 system has more than 2 cabinets, every partition can consist of any number of groups; the last group in the system (or the remainder of system chassis that are not part of a full 6-chassis group) should be considered a group in this partitioning context, whether or not it is fully populated.


6.15.1.2 Single Group, Multiple-chassis Systems When a Cray XC30 system contains between two and six fully-populated chassis, then you may partition the system at a per-chassis level of granularity. Each partition must be at least one full chassis, and a chassis cannot be shared between partitions. Chassis do not need to be sequentially positioned in a multi-chassis partition.

6.15.1.3 Single Chassis Systems When a Cray XC30 system is composed of a single fully-populated chassis, each slot must be in the same partition with its corresponding even/odd pair.

Note: Even/odd pair nodes (for example, slot 0 and slot 1, or slot 8 and slot 9) share optical connections and therefore must be in the same partition. There are 16 slots (or blades) in a single chassis, making 8 even/odd slot pairs, and a maximum of 8 partitions. Single chassis systems can have any combination of even/odd slot pairs (for example: 4-4, 6-2, 4-2-2, 2-2-1-1-1-1, etc.) and even/odd slot pairs do not need to be sequentially positioned in a multiple slot pair partition. In order for a partition to be bootable, however, it must have a boot node, an SDB node, an I/O node, and a login node.

6.15.2 Configuring a Logical Machine The logical machine can have one of three states:

• Empty — not configured

• Disabled — configured but not activated

• Enabled — configured and activated

When a partition is defined, its state changes to DISABLED. Undefined partitions are EMPTY by default.

Procedure 40. Configuring a logical machine

• Use the xtcli part_cfg command with the part_cmd option (add in the following example) to identify the operation to be performed and the part_option (-m, -b, -d and -i) to specify the characteristics of the logical machine. The boot image may be a raw device, such as /raw0, or a file.

Example 93. Creating a logical machine with a boot node and SDB node specifying the boot image path

crayadm@smw:~> xtcli part_cfg add p2 -m c0-0,c0-1,c0-2,c0-3 \ -b c0-0c0s0n0 -d c0-0c0s2n1 -i /bootimagedir/bootimage

Note: When using a file for the boot image, the same file must be on both the SMW and the bootroot at the same path.

For the logical machine to be bootable, you must specify boot node and SDB node IDs.


For instructions on booting a logical machine, see Booting a Logical Machine on page 192.

For information about configuring boot-node failover, see Configuring Boot-node Failover on page 184.

To watch HSS events on the specified partition, execute the xtconsumer -p partition_name command.

To display the console text of the specified partition, execute the xtconsole -p partition_name command.

For more information, see the xtcli_part(8), xtconsole(8), and xtconsumer(8) man pages.

6.15.3 Booting a Logical Machine The xtbootsys --partition pN option enables you to indicate which logical machine (partition) to boot. If you do not specify a partition name, the default partition p0 (component name for the entire system) is booted. Alternatively, if you do not specify a partition name and you use the CRMS_PARTITION environment variable, this variable is used as the default partition name. Valid values are in the form p#, where # ranges from 0 to 31.

Note: xtbootsys manages a link from /var/opt/cray/log/partition-current to the current sessionid directory for that partition, allowing you to change to /var/opt/cray/log/p1-current, for example.

To boot a partition, see Booting the System on page 67.
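For example, to boot partition p1 with a boot automation file, a command of the following form might be used; the automation file name shown is illustrative:

crayadm@smw:~> xtbootsys --partition p1 -a auto.generic.cnl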

6.16 Updating Boot Configuration The HSS xtcli boot_cfg command allows you to specify the primary and backup boot nodes and the primary and backup SDB nodes for s0 or p0 (the entire system).

Example 94. Updating boot configuration

Update the boot configuration using the boot image /bootimagedir/bootimage, primary boot node (for example, c0-0c0s0n1), backup boot node, primary SDB node, and the backup SDB node:

crayadm@smw:~> xtcli boot_cfg update -b primaryboot_id,backupboot_id \ -d primarySDB_id,backupSDB_id -i /bootimagedir/bootimage

For a partitioned system, use xtcli part_cfg to manage boot configurations for partitions. For more information, see the xtcli_boot(8) and xtcli_part(8) man pages.


For information about configuring failover, see Configuring Failover for Boot and SDB Nodes on page 184.

6.17 Modifying Boot Automation Files Your boot automation files should be located in /opt/cray/hss/default/etc on the SMW. There are several automation files; for example, auto.generic.cnl and auto.min.cnl.

For boot automation scripts, the Lustre file system should start up before the compute nodes.

Note: You can also boot the system or shut down the system using both user-defined and built-in procedures in the auto.xtshutdown file. For related procedures, see Installing and Configuring Cray Linux Environment (CLE) Software.

If you use boot automation files, see the xtbootsys(8) man page, which provides detailed information about boot automation files, including descriptions of using the xtbootsys crms_boot_loadfile and xtbootsys crms_boot_sdb_loadfile automation file procedures.

6.18 Callout to rc.local During Boot The file /etc/init.d/rc.local is available for local customization of the boot process. If this file/script is present, it is executed during the compute node boot process. This script is executed after /init, before any of the scripts in /etc/init.d/rc3.d and before /etc/fstab is processed.
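As a minimal sketch, a site-supplied /etc/init.d/rc.local in the compute node image might look like the following; the commands are examples only and must not depend on file systems that are mounted later from /etc/fstab:

#!/bin/sh
# Example site customization run early in the compute node boot process
echo "rc.local: applying site settings" > /dev/console
mkdir -p /tmp/site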

6.19 Changing the System Software Version to be Booted Release switching enables you to change between versions and releases of the CLE software that are installed concurrently on the system. You must reboot the operating system to switch CLE releases on your Cray system; you cannot change a release while the mainframe is running. You must reboot each time you change versions; however, you do not need to reboot the SMW.

Minor release switching allows you to select one of the CLE software versions that are installed within a single system set and have the same base operating system release (for example, switching from 4.0.22 back to 4.0.21). Switching is achieved by modifying sets of symbolic links in the file system to refer to the requested release.


Major release switching requires that you have a separate set of disk partitions for each major operating system (for example, switching from 3.1.72 to 4.0.25). Each system set provides a complete set of all file system and boot images, thus making it possible to switch easily between two or more different versions of your CLE system software. Each system set can be an alternative location for an installation or upgrade of your Cray system. System sets are defined in the /etc/sysset.conf file on the SMW.

If multiple versions of the software are installed and no version is chosen, the most recently installed is used.

6.19.1 Minor Release Switching Within a System Set The xtrelswitch command performs release switching by manipulating symbolic links in the file system and by setting the default version of modulefiles that are loaded at login. xtrelswitch uses a release version that is provided either in the /etc/opt/cray/release/xtrelease file or by the xtrel= boot parameter. If the latter is not provided, the former is used. The xtrelswitch command is not intended to be invoked interactively; rather, it is called by other scripts as part of the boot sequence. Specifically, when the boot node is booted, this command is invoked to switch the components in the boot node and shared root file systems.

To accomplish minor release switching, you must set the bootimage_xtrel parameter to yes in your CLEinstall.conf installation configuration file. This will include the release version in your boot image parameters file. If you routinely switch between minor levels, you may find it more convenient to use a bootimage in /bootimagedir (the boot image must be in the same path for both the SMW and the boot root), instead of updating the BOOT_IMAGE disk partition.

Note: The xtrelswitch command does not support switching between major release levels, for example from CLE 4.0 to CLE 5.0.

For additional information, see the xtrelswitch(8) man page.
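For example, assuming the usual parameter=value form of CLEinstall.conf entries, the setting would be:

bootimage_xtrel=yes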

6.19.2 Major Release Switching using Separate System Sets When you use system sets to change the Cray software booted on your Cray system, you boot an entirely different file system. The switched components include:

• The boot node root file system

• The shared-root file system

• The disk partition containing the SDB

• The syslog, ufs, and persistent /var file systems


Booting a system set requires:

• The /etc/sysset.conf file that describes the available system sets.

• Choosing which boot image will be used for the next boot. Each system set label has at least one BOOT_IMAGE.

• Activating a boot image for the chosen system set label.

The CLEinstall program installs or upgrades a system set to a set of disk partitions on the Boot RAID. For more information about the CLEinstall program and the /etc/sysset.conf file, see the Installing and Configuring Cray Linux Environment (CLE) Software and the sysset.conf(5) man page.

Procedure 41. Booting a system set

1. Choose which system set in the /etc/sysset.conf file should be used for the boot. For example:

LABEL:BLUE
DESCRIPTION:BLUE system with production

2. For the chosen system set, there is at least one BOOT_IMAGE in the /etc/sysset.conf file. Look at the /etc/sysset.conf file to determine which boot image is associated with which raw device. For example, to get the SMWdevice entry for BOOT_IMAGE0 for the chosen system set:

# function SMWdevice host hostdevice mountpoint shared
BOOT_IMAGE0 /dev/disk/by-id/scsi-3600a0b800026e1400000192c4b66eb70-part2 boot \
/dev/disk/by-id/scsi-3600a0b800026e1400000192c4b66eb70-part2 /raw0 no

3. Set the next boot to use the boot image BOOT_IMAGE0 from the BLUE system set, which is the /dev/disk/by-id/scsi-3600a0b800026e1400000192c4b66eb70-part2 disk partition. There will be a link from /raw0 to /dev/disk/by-id/scsi-3600a0b800026e1400000192c4b66eb70-part2.

smw:~ # xtcli boot_cfg update -i /raw0

Or, if you are working with a partitioned system, pN:

smw:~ # xtcli part_cfg update pN -i /raw0

6.20 Changing the Service Database (SDB) The SDB, which is a MySQL database, contains the XTAdmin system database. The XTAdmin database contains both persistent and nonpersistent tables. The processor and service_processor tables are nonpersistent and are created from the HSS data at boot time. The XTAdmin database tables track system configuration information. The SDB makes the system configuration information available to the Application Level Placement Scheduler (ALPS), which interacts with individual compute nodes running CNL.


Cray provides commands (see Updating Database Tables on page 197) that enable you to examine values in the SDB tables and update them when your system configuration changes.

Caution: Do not use MySQL commands to change table values directly. Doing so can leave the database in an inconsistent state. Accounts that access MySQL by default contain a .my.cnf file in their home directories.

6.20.1 Service Database Tables Table 5 describes the SDB tables, which belong to the XTAdmin database.

Table 5. Service Database Tables

Table Name | Function
attributes | Stores compute node attribute information
lustre_failover | Updates the database when a node's Lustre failover configuration changes
lustre_service | Updates the database when a node's Lustre service configuration changes
filesystem | Updates the database when a Lustre file system's configuration changes
gpus | Stores accelerator module (GPU) information
processor | Stores master list of processing elements and their status
segment | For nodes with multiple NUMA nodes, stores attribute information about the compute node and its associated NUMA nodes
service_cmd | Stores characteristics of a service
service_config | Stores processing element services that the resiliency communication agent (RCA) starts
service_processor | Stores nodes and classes (boot, login, server, I/O, or network)
version | Stores the database schema version


6.20.2 Database Security Access to MySQL databases requires a user name and password. The MySQL accounts and privileges are shown in Table 6. For security purposes, Cray recommends changing the account passwords on a regular basis. Default MySQL account passwords and an example of how to change them are documented in Installing and Configuring Cray Linux Environment (CLE) Software. To change the default MySQL passwords, also see Changing Default MySQL Passwords on the SDB on page 110.

Table 6. Database Privileges

Account | Privilege
MySQL basic | Read access to most tables; most applications use this account.
MySQL sys_mgmt | Most privileged; access to all information and commands.

6.20.3 Updating Database Tables The CLE command pairs shown in Table 7 enable you to update tables in the SDB. One command converts the data into an ASCII text file that you can edit; the other writes the data back into the database file.

Table 7. Service Database Update Commands

Get Command | Put Command | Table Accessed | Reason to Use | Default File
xtdb2proc | xtproc2db | processor | Updates the database when a node is taken out of service | ./processor
xtdb2attr | xtattr2db | attributes | Updates the database when node attributes change (see Setting and Viewing Node Attributes on page 202) | ./attribute
xtdb2nodeclasses | xtnodeclasses2db | service_processor | Updates the database when a node's class changes (see Changing Nodes and Classes on page 199) | ./node_classes
xtdb2segment | xtsegment2db | segment | For nodes with multiple NUMA nodes, updates the database when attribute information about a node changes (see Using the XTAdmin Database segment Table on page 205) | ./segment
xtdb2servcmd | xtservcmd2db | service_cmd | Updates the database when characteristics of a service change | ./serv_cmd
xtdb2servconfig | xtservconfig2db | service_config | Updates the database when services change | ./serv_config
xtdb2etchosts | none | processor | Manages IP mapping for service nodes | none
xtdb2lustrefailover | xtlustrefailover2db | lustre_failover | Updates the database when a node's Lustre failover state changes | ./lustre_failover
xtdb2lustreserv | xtlustreserv2db | lustre_service | Updates the database when a file system's failover process is changed | ./lustre_serv
xtdb2filesys | xtfilesys2db | filesystem | Updates the database when a file system's status changes | ./filesys
xtdb2gpus | xtgpus2db | gpus | Updates the database when attributes about the accelerators change | ./gpus
xtprocadmin | none | processor | Displays or sets the current value of processor flags and node attributes in the service database (SDB); the batch scheduler and ALPS are impacted by changes to these flags and attributes | none
xtservconfig | none | service_config | Adds, removes, or modifies service configuration in the SDB service_config table | none
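For example, a hypothetical round trip for the processor table, using the default file name shown above and run as root on the boot node, might look like the following; exact options are described on the xtdb2proc(8) and xtproc2db(8) man pages:

boot:~ # xtdb2proc
boot:~ # vi ./processor
boot:~ # xtproc2db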

6.20.3.1 Changing Nodes and Classes The service_processor table tracks node IDs (NIDs) and their classes (see Class Name on page 56). The table is populated from the /etc/opt/cray/sdb/node_classes file on the boot node every time the system boots. Change this file to update the database when the classes of nodes change, for example, when you are adding login nodes.

Note: If you make changes to /etc/opt/cray/sdb/node_classes, you must make the same changes to the node class settings in CLEinstall.conf before performing an update or upgrade installation; otherwise, the install utility will complain about the inconsistency.

Note: The xtnodeclasses2db command inserts the node-class list into the database. It does not make any changes to the shared root. To change the shared root, invoke the xtnce command (see Changing the Class of a Node on page 139). For more information, see the xtdb2nodeclasses(8) and xtnodeclasses2db(8) man pages.


6.21 Viewing the Service Database Contents with MySQL Commands The service database is configured as part of the system installation (see the Installing and Configuring Cray Linux Environment (CLE) Software).

Caution: Use MySQL commands to examine tables, but do not use them to change table values directly. Doing so can leave the database in an inconsistent state.

Procedure 42. Examining the service databases with MySQL commands

1. As user crayadm, on the SDB node, enter the MySQL shell.

crayadm@sdb:~> mysql -u basic -p
Enter password: ***********
mysql> show databases;
+----------+
| Database |
+----------+
| XTAdmin  |
+----------+
1 row in set (0.04 sec)

2. Select the XTAdmin database.

mysql> use XTAdmin;
Database changed

3. Display the tables in the XTAdmin database.

mysql> show tables;
+-------------------+
| Tables_in_XTAdmin |
+-------------------+
| attributes        |
| filesystem        |
| gpus              |
| lustre_failover   |
| lustre_service    |
| processor         |
| segment           |
| service_cmd       |
| service_config    |
| service_processor |
| version           |
+-------------------+
11 rows in set (0.00 sec)

4. Display the format of the service_processor table.

mysql> describe service_processor;
+--------------+------------------+------+-----+---------+-------+
| Field        | Type             | Null | Key | Default | Extra |
+--------------+------------------+------+-----+---------+-------+
| processor_id | int(10) unsigned |      | PRI | 0       |       |
| service_type | varchar(64)      | YES  |     | NULL    |       |
+--------------+------------------+------+-----+---------+-------+
2 rows in set (0.00 sec)


5. Display the contents of all fields in the service_processor table.

mysql> select * from XTAdmin.service_processor;
+--------------+--------------+
| processor_id | service_type |
+--------------+--------------+
|            0 | service      |
|            3 | service      |
|            4 | service      |
|            7 | service      |
|            8 | service      |
|           11 | service      |
|           12 | service      |
|           15 | service      |
|           16 | service      |
|           19 | service      |
|           20 | service      |
|           23 | service      |
|           24 | service      |
|           27 | service      |
+--------------+--------------+
14 rows in set (0.00 sec)

6. Display processor_id values from the processor table.

mysql> select processor_id from processor;
+--------------+
| processor_id |
+--------------+
|            0 |
|            3 |
|            4 |
|            7 |
|            8 |
|          103 |
|          104 |
|          107 |
...
|          192 |
|          195 |
+--------------+
162 rows in set (0.00 sec)

6.22 Configuring the Lustre File System For a description of the Lustre file system and how to configure it, see the Lustre documentation at https://wiki.hpdd.intel.com/display/PUB/Documentation and your specific Lustre server documentation.


6.23 Enabling File-locking for Lustre Clients To enable file-locking for all Linux clients when mounting the Lustre file system on service nodes or on compute nodes, you must use the flock option for mount. For more information, see the Lustre documentation at http://wiki.whamcloud.com/display/PUB/Documentation.

Example 95. Sample mount line from compute node /etc/fstab

4@gni:136@gni:/filesystem /lus/nid00004 lustre rw,flock 0 0

6.24 Configuring Cray Data Virtualization Service (Cray DVS) For a description of the Cray DVS parallel I/O forwarding service and how to configure it, see Introduction to Cray Data Virtualization Service.

6.25 Setting and Viewing Node Attributes Users can control the selection of the compute nodes on which to run their applications and can select nodes on the basis of desired characteristics (node attributes). This allows a placement scheduler to schedule jobs based on the node attributes.

A user invokes the cnselect command to specify node-selection criteria. The cnselect script uses these selection criteria to query the table of node attributes in the SDB and returns a node list to the user based on the results of the query.

When launching the application, the user includes the node list using the aprun -L node_list option as described on the aprun(1) man page. The ALPS placement scheduler allocates nodes based on this list.

Note: To meet specific user needs, you can modify the cnselect script. For additional information about the cnselect script, see the cnselect(1) man page.
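For example, a user might select nodes by core count and pass the resulting list to aprun; the expression form, NID list, and executable below are illustrative only (see the cnselect(1) and aprun(1) man pages for exact syntax):

login:~> cnselect numcores.eq.24
40-63
login:~> aprun -n 24 -L 40-63 ./my_app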

6.25.1 Setting Node Attributes Using the /etc/opt/cray/sdb/attr.xthwinv.xml and /etc/opt/cray/sdb/attr.defaults Files In order for users to select desired node attributes, you must first set the characteristics of individual compute nodes. Node attribute information is written to the /etc/opt/cray/sdb/attributes data file and loaded into the attributes table in the SDB when the SDB is booted.


6.25.1.1 Generating the /etc/opt/cray/sdb/attributes File Data for the /etc/opt/cray/sdb/attributes file comes from two other files: the /etc/opt/cray/sdb/attr.xthwinv.xml file, which contains information to generate the hardware attributes for each node, and the /etc/opt/cray/sdb/attr.defaults file, which allows administrators to set values for specific nodes (or all nodes if a DEFAULT is specified). The xtprocadmin(8) man page includes a description of the attributes fields used by these two files.

• The /etc/opt/cray/sdb/attr.xthwinv.xml file is created by CLEinstall and automatically regenerated by xtbootsys at each boot through the xthwinv -x command.

To manually generate the /etc/opt/cray/sdb/attr.xthwinv.xml file, invoke the xthwinv -x command on the System Management Workstation (SMW) through the boot node, redirecting the output to the /etc/opt/cray/sdb/attr.xthwinv.xml file on the boot node; for example:

boot:~ # ssh smw xthwinv -x s0 > /etc/opt/cray/sdb/attr.xthwinv.xml

For additional information about the xthwinv command, see the xthwinv(8) man page.

Note: If you have blades powered down when you want to upgrade your software, see the CLEinstall(8) man page for instructions on using the --xthwinvxmlfile option during your upgrade process.

• The /etc/opt/cray/sdb/attr.defaults file can be used to set an attribute value on any node, but it is primarily used for assigning labels to nodes (see Example 96):

label0
label1
label2
label3

Each label is a string of up to 32 characters; the string cannot contain any spaces or shell-sensitive characters.

These labels can be applied to all nodes or to a given set of nodes.

Note: Do not attempt to set hardware attributes (memory size, clock speed, and cores) in the attr.defaults file because the values will be overwritten by those already specified in the /etc/opt/cray/sdb/attr.xthwinv.xml file.

To create the attr.defaults file, copy the example file provided in /opt/cray/sdb/default/etc/attr.defaults.example. Edit the file to modify the existing attribute settings and to create site-specific attributes as needed. If you have run CLEinstall previously, attr.defaults was already copied and exists in that location.


In addition to the attributes in the /etc/opt/cray/sdb/attr.defaults file, there are two keywords that allow you to describe the node or set of nodes to which attributes are assigned. For global default-attribute values that apply to the entire system, the line that specifies an attribute must begin with the DEFAULT: keyword. For example:

DEFAULT: osclass=2

The nodeid keyword assigns attributes to a specific node or set of nodes and overrides a default setting. For values that apply only to certain nodes, the line that specifies the attributes must begin with nodeid=[RANGE], where RANGE is a comma-separated list of nodes and ranges that have the form m,n or m-n. For example:

nodeid=234,245-248 label3='GREEN'

Example 96. Using node attribute labels to assign nodes to user groups

The following example uses labels to assign groups of compute nodes to specific user groups without the need to partition the system:

nodeid=101-500 label0=physicsdept nodeid=501-1000 label1=csdept nodeid=50-100,1001 label2=biologydept

6.25.2 SDB attributes Table When the SDB boots, it reads the /etc/opt/cray/sdb/attributes file and loads it into the SDB attributes table.

To display the format of the attributes SDB table, use the mysql command:

crayadm@login:~> mysql -e "desc attributes;" -h sdb XTAdmin
+----------+------------------+------+-----+---------+-------+
| Field    | Type             | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+-------+
| nodeid   | int(32) unsigned | NO   | PRI | 0       |       |
| archtype | int(4) unsigned  | NO   |     | 2       |       |
| osclass  | int(4) unsigned  | NO   |     | 2       |       |
| coremask | int(4) unsigned  | NO   |     | 1       |       |
| availmem | int(32) unsigned | NO   |     | 0       |       |
| pageszl2 | int(32) unsigned | NO   |     | 12      |       |
| clockmhz | int(32) unsigned | YES  |     | NULL    |       |
| label0   | varchar(32)      | YES  |     | NULL    |       |
| label1   | varchar(32)      | YES  |     | NULL    |       |
| label2   | varchar(32)      | YES  |     | NULL    |       |
| label3   | varchar(32)      | YES  |     | NULL    |       |
| numcores | int(4) unsigned  | NO   |     | 1       |       |
| sockets  | int(4) unsigned  | NO   |     | 1       |       |
| dies     | int(4) unsigned  | NO   |     | 1       |       |
+----------+------------------+------+-----+---------+-------+


The service database command pair xtdb2attr and xtattr2db enables you to update the attributes table in the SDB. For additional information about updating SDB tables using command pairs, see Updating Database Tables on page 197.

6.25.3 Setting Attributes Using the xtprocadmin Command You can use the xtprocadmin -a attr=value command to temporarily set certain site-specific attributes; settings made this way are not persistent across reboots. Attribute settings that are intended to be persistent across reboots (such as labels) must be specified in the attr.defaults file.

Note: For compute nodes, changes that xtprocadmin makes to attributes require that you restart the apbridge daemon on the boot node in order for ALPS to detect changes that the xtprocadmin command has made to the SDB. Restarting the other ALPS components (for example, on the SDB node or on the login node if they are separate nodes) is not necessary. To restart apbridge, log into the boot node as root and execute the following command:

boot:~ # /etc/init.d/alps restart

For example, the following command creates a new label1 attribute value for the compute node whose NID is 350; you must be user root and execute the xtprocadmin command from a service node, and the SDB must be running:

boot:~ # xtprocadmin -n 350 -a label1=eedept

The output is:

Connected
    NID    (HEX)    NODENAME    TYPE LABEL1
    350    0x15e  c1-0c1s0n0 compute eedept

Then restart the apbridge daemon on the boot node in order for ALPS to detect changes that the xtprocadmin command has made to the SDB.

boot:~ # /etc/init.d/alps restart

6.25.4 Viewing Node Attributes Use the xtprocadmin command to view current node attributes. The xtprocadmin -A option lists all attributes of selected nodes. The xtprocadmin -a attr1,attr2 option lists selected attributes of selected nodes.
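For example, one possible invocation to list the label1 attribute of the node used earlier in this section is shown below; the output is illustrated in the same form as the earlier xtprocadmin example:

login:~> xtprocadmin -n 350 -a label1
Connected
    NID    (HEX)    NODENAME    TYPE LABEL1
    350    0x15e  c1-0c1s0n0 compute eedept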

6.26 Using the XTAdmin Database segment Table The XTAdmin database contains a segment table that supports the memory affinity optimization tools for applications and CPU affinity options for all Cray compute nodes. The CPU affinity options apply to all Cray multicore compute nodes.


The segment table is similar to the attributes table but differs in that a node may have multiple segments associated with it; the attributes table provides summary information for each node. In order to address the application launch and placement requirements for compute nodes with two or more NUMA nodes, the Application Level Placement Scheduler (ALPS) requires additional information that characterizes the intranode topology of the system. This data is stored in the segment table of the XTAdmin database and acquired by apbridge when ALPS is started, in much the same way that node attribute data is acquired. (For more information about XTAdmin database tables, see Changing the Service Database (SDB) on page 195.) The segment table contains the following fields:

• node_id is the node identifier that maps to the nodeid field of the attributes table and processor_id field of the processor table.

• socket_id contains a unique ordinal for each processor socket.

• die_id contains a unique ordinal for each processor die; with this release, die_id is 0 in the segment table and is otherwise unused (reserved for future use).

• numcores is the number of integer cores per node; in systems with accelerators this only applies to the host processor (CPU).

• coremask is the processor core mask. The coremask has a bit set for each core of a CPU. 24-core nodes will have a value of 16777215 (hex 0xFFFFFF).

Note: coremask is deprecated and will be removed in a future release.

• mempgs represents the amount of memory available, in Megabytes, to a single segment.

The /etc/sysconfig/xt file contains the SDBSEG field, which specifies the location of the segment table file; by default, SDBSEG=/etc/opt/cray/sdb/segment.

To update the segment table, use the following service database commands:

• xtdb2segment, which converts the data into an ASCII text file that can be edited • xtsegment2db, which writes the data back into the database file

For more information, see the xtdb2segment(8) and xtsegment2db(8) man pages.


After manually updating the segment table, you can log on to any login node or the SDB node as root and execute the apmgr resync command to request ALPS to reevaluate the configuration node segment information and update its information. Note: If ALPS or any portion of the feature fails in relation to segment scheduling, ALPS reverts to the standard scheduling procedure.
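For example, a hypothetical update sequence using the default ./segment file, run as root on a node with SDB access, might look like the following; exact options are described on the man pages listed above:

login:~ # xtdb2segment
login:~ # vi ./segment
login:~ # xtsegment2db
login:~ # apmgr resync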

6.27 Configuring Networking Services

6.27.1 Changing the High-speed Network (HSN) To change your system interconnection network (HSN) address ranges, see Installing and Configuring Cray Linux Environment (CLE) Software.

6.27.2 Network File System (NFS) The Network File System (NFS) version 4 distributed file system protocol is supported. NFS is enabled by default on the boot, sdb, and ufs service nodes but is not enabled on compute nodes. Support for NFSv4 is included as part of the SLES software. The CLE installation tool supports NFS tuning via /etc/sysconfig/nfs and /etc/init.d/nfsserver on the boot node. The nfs_mountd_num_threads and use_kernel_nfsd_number parameters in the CLEinstall.conf installation configuration file control an NFS mountd tuning parameter that is added to /etc/sysconfig/nfs and used by /etc/init.d/nfsserver to configure the number of mountd threads on the boot node. By default, NFS mountd behavior is a single thread. If you have a larger Cray system (greater than 50 service I/O nodes), contact your Cray service representative for assistance changing the default setting.

If you wish to enable the nfsserver service on all service nodes, you may do so by setting the CLEinstall.conf nfsserver parameter to yes. The default setting is no.
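For example, a CLEinstall.conf fragment that uses the parameters named above might look like the following; the numeric values are illustrative only and should be chosen with guidance from your Cray service representative.

smw:~ # vi /home/crayadm/install.xtrelease/CLEinstall.conf
nfs_mountd_num_threads=4
use_kernel_nfsd_number=64
nfsserver=yes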

6.27.3 Configuring Ethernet Link Aggregation (Bonding, Channel Bonding)

Linux Ethernet link aggregation is generally used to increase aggregate bandwidth by combining multiple Ethernet channels into a single virtual channel. Bonding can also be used to increase the availability of a link by utilizing other interfaces in the bond when one of the links in that bond fails.


Procedure 43. Configuring an I/O service node bonding interface

1. On the boot node, run the xtopview command for the node that needs the bonding interface configured. For example, to access node 2, type the following:

boot:~ # xtopview -n 2
node/2:/ #

2. Create and specialize the following files: /etc/sysconfig/network/ifcfg-bond0, /etc/sysconfig/network/ifcfg-eth0, and /etc/sysconfig/network/ifcfg-eth1.

node/2:/ # touch /etc/sysconfig/network/ifcfg-bond0
node/2:/ # xtspec -n 2 /etc/sysconfig/network/ifcfg-bond0
node/2:/ # touch /etc/sysconfig/network/ifcfg-eth0
node/2:/ # xtspec -n 2 /etc/sysconfig/network/ifcfg-eth0
node/2:/ # touch /etc/sysconfig/network/ifcfg-eth1
node/2:/ # xtspec -n 2 /etc/sysconfig/network/ifcfg-eth1

3. Edit the previously created files to include your specific network settings.

node/2:/ # vi /etc/sysconfig/network/ifcfg-bond0
BOOTPROTO="static"
BROADCAST="10.0.2.255"
IPADDR="10.0.2.10"
NETMASK="255.255.0.0"
NETWORK="10.0.2.0"
REMOTE_IPADDR=""
STARTMODE="onboot"
BONDING_MASTER="yes"
BONDING_MODULE_OPTS="mode=active-backup primary=eth1"
BONDING_SLAVE0="eth0"
BONDING_SLAVE1="eth1"

node/2:/ # vi /etc/sysconfig/network/ifcfg-eth0
BOOTPROTO='static'
STARTMODE='onboot'
MASTER=bond0
SLAVE=yes
REMOTE_IPADDR=''
IPV6INIT=no

node/2:/ # vi /etc/sysconfig/network/ifcfg-eth1
BOOTPROTO='static'
STARTMODE='onboot'
MASTER=bond0
SLAVE=yes
REMOTE_IPADDR=''
IPV6INIT=no

4. Exit from xtopview.

node/2:/ # exit

For more information on Ethernet link aggregation, see the Linux documentation file /usr/src/linux/Documentation/networking/bonding.txt, installed on your system.
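After the node boots with the new configuration, you can confirm that the bond formed correctly by reading the bonding driver's status file on the node itself; this is a standard Linux interface rather than a Cray-specific one, and the node hostname shown is only an example.

boot:~ # ssh nid00002
nid00002:~ # cat /proc/net/bonding/bond0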


6.27.4 Configuring a Virtual Local Area Network (VLAN) Interface

This procedure configures an 802.1Q standard VLAN.

Procedure 44. Configuring a Virtual Local Area Network (VLAN) interface

1. On the boot node, run the xtopview command for the node that needs the VLAN configured. For example, to access node 2, type the following:

boot:~ # xtopview -n 2
node/2:/ #

2. Create and specialize a file named /etc/sysconfig/network/ifcfg-vlanN, where N is the VLAN ID. The following example creates and specializes the vlan2 file:

node/2:/ # touch /etc/sysconfig/network/ifcfg-vlan2
node/2:/ # xtspec -n 2 /etc/sysconfig/network/ifcfg-vlan2

3. Edit the /etc/sysconfig/network/ifcfg-vlanN file to include your usual network settings. It must also include the variable ETHERDEVICE, which provides the real interface for the VLAN. The real interface will be set up automatically; it does not require a configuration file. For additional information, see the ifcfg-vlan(5) man page. The following example sets up vlan2 on top of eth0:

ifcfg-vlan2:
STARTMODE=onboot
ETHERDEVICE=eth0
IPADDR=192.168.3.27/24

An interface named vlan2 will be created when the system boots.

4. Exit from xtopview.

node/2:/ # exit
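After the node boots with the new configuration, you can verify the interface from the node itself using standard Linux tools; the hostname shown is only an example.

boot:~ # ssh nid00002
nid00002:~ # ip -d link show vlan2
nid00002:~ # ip addr show vlan2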

6.27.5 Increasing Size of ARP Tables

To increase the size of ARP tables, change the ARP_OVERHEAD parameter in the /etc/sysconfig/xt file. ARP_OVERHEAD should be set to a value greater than the number of hosts in all locally attached external networks; the current default is 0.
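For example, to accommodate roughly 400 hosts on locally attached external networks with some headroom, you might add a line such as the following to /etc/sysconfig/xt; the value is illustrative only.

ARP_OVERHEAD=512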

6.27.6 Configuring Realm-specific IP Addressing (RSIP)

Realm-Specific Internet Protocol (RSIP) enables internal client nodes, such as compute nodes, to reach external IP networking resources. Support for RSIP is available with CLE on systems that have CNL compute nodes.

Note: RSIP for the IPv4 TCP and User Datagram Protocol (UDP) transport protocols is supported. Internet Protocol Security (IPSec) and IPv6 protocols are not supported.


RSIP is composed of two main components: RSIP clients and RSIP servers or gateways. You configure RSIP and select servers using RSIP parameters in CLEinstall.conf. By default, when RSIP is enabled, all CNL compute nodes are configured to be RSIP clients. On your Cray system, RSIP servers must be service nodes with an external IP interface such as a 10-GbE network interface card (NIC). You can configure multiple RSIP servers using multiple service nodes; however, only one RSIP daemon (rsipd) and one external interface are allowed per service node. Cray requires that you configure RSIP servers as dedicated network nodes.

Warning: Do not configure login nodes or service nodes that provide Lustre or batch services as RSIP servers. Failure to set up an RSIP server as a dedicated network node will disrupt network functionality.

The performance impact of configuring RSIP is negligible; very little noise is generated by the RSIP client. RSIP clients will issue a lease refresh message request/response pair once an hour, at a rate of 10 clients per second, but otherwise are largely silent.

To configure RSIP for your Cray system, first determine which service nodes and associated Ethernet devices will be used to provide RSIP services. Optionally, determine if you will configure service nodes with no external IP interfaces (isolated service nodes) to act as RSIP clients. After selecting RSIP servers based on your machine-specific networking hardware configuration, follow Procedure 45 on page 212 to complete a default RSIP configuration and setup.

Enhancements to the default RSIP configuration require a detailed analysis of specific site configuration and requirements. Contact your Cray representative for assistance in changing the default RSIP configuration.

6.27.6.1 Using the CLEinstall Program to Install and Configure RSIP

The CLEinstall program can be configured to automatically install RSIP either during a system software upgrade or as a separate event. In either case, you will need to update the compute node boot image and restart your Cray system before RSIP is functional.


When you set the following RSIP-specific parameters in the CLEinstall.conf file, CLEinstall will load the RSIP RPM, modify rsipd.conf, and invoke the appropriate xtrsipcfg commands to configure RSIP for your system.

rsip_nodes=

Specifies the RSIP servers. Populate with the node IDs of the nodes you have identified as RSIP servers.

rsip_interfaces=

Specifies the IP interface for each RSIP server node. List the interfaces in the same order specified by the rsip_nodes parameter.

rsip_servicenode_clients=

Specifies a space-separated integer list of service nodes you would like to use for RSIP clients.

Warning: Do not configure service nodes with external network connections as RSIP clients. Configuring a network node as an RSIP client will disrupt network functionality. Service nodes with external network connections will route all non-local traffic into the RSIP tunnel, and IP may not function as desired.

CNL_rsip=yes

Enables the RSIP client on CNL compute nodes. Optionally, you can edit the /var/opt/cray/install/shell_bootimage_LABEL.sh script and set CNL_RSIP=y.

If you are configuring RSIP for the first time during an installation or upgrade of your CLE system software, follow RSIP-specific instructions in the Installing and Configuring Cray Linux Environment (CLE) Software. If you are configuring RSIP as a separate event, follow Procedure 45 on page 212. If you already configured RSIP and want to add isolated service nodes as RSIP clients, follow Procedure 46 on page 213.

For additional information about configuring RSIP, see the xtrsipcfg(8), rsipd(8), and rsipd.conf(5) man pages.


Procedure 45. Installing, configuring, and starting RSIP clients and servers

1. Edit CLEinstall.conf for your RSIP configuration. For example, to configure nodes 16 and 20 as RSIP servers with an external interface named eth0 and node 64 as an RSIP server with an external interface named eth1; and node 0 as a service node RSIP client, make these changes.

smw:~ # vi /home/crayadm/install.xtrelease/CLEinstall.conf
rsip_nodes=16 20 64
rsip_interfaces=eth0 eth0 eth1
rsip_servicenode_clients=0
CNL_rsip=yes

2. Invoke the CLEinstall program on the SMW; you must specify the xtrelease that is currently installed on the system set you are using and located in the CLEmedia directory.

smw:~ # /home/crayadm/install.xtrelease/CLEinstall --upgrade \
--label=system_set_label --XTrelease=xtrelease \
--configfile=/home/crayadm/install.xtrelease/CLEinstall.conf \
--CLEmedia=/home/crayadm/install.xtrelease

3. Type y and press the Enter key to proceed when prompted to update the boot root and again for the shared root.

*** Do you wish to continue? (y/n) --> y

Upon completion, CLEinstall lists suggested commands to finish the installation. Those commands are also described here. For more information about running the CLEinstall program, see Installing and Configuring Cray Linux Environment (CLE) Software.

4. Rebuild the boot image using the /var/opt/cray/install/shell_bootimage_LABEL.sh, xtbootimg, and xtcli commands. Suggested commands are included in output from CLEinstall and shell_bootimage_LABEL.sh. For more information about creating boot images, follow Procedure 2 on page 62.

5. Run the shell_post_install.sh script on the SMW to unmount the boot root and shared root file systems and perform other cleanup as necessary.

smw:~# /var/opt/cray/install/shell_post_install.sh /bootroot0 /sharedroot0


6. (Optional) If you are configuring a service node RSIP client, edit the boot automation file to start the RSIP client. On the isolated service node, invoke a modprobe of the krsip module with an IP argument pointing to the HSN IP address of an RSIP server node. For example, if the IP address of the RSIP server is 10.128.0.17 and the isolated service node is nid00000, make these changes.

crayadm@smw:~> vi /opt/cray/etc/auto.xthostname

After the line or lines that start the RSIP servers, add:

# RSIP client startup
lappend actions { crms_exec_via_bootnode "nid00000" "root" "modprobe krsip ip=10.128.0.17 rsip_local_ports=1" }

7. Boot your Cray system; for example:

crayadm@smw:~> xtbootsys -a auto.xthostname

Note: RSIP clients on the compute nodes make connections to the RSIP server(s) during system boot. Initiation of these connections is staggered at a rate of 10 clients per second; during that time, connectivity over RSIP tunnels will be unreliable. Avoid using RSIP services for several minutes following a system boot; larger systems will require more time for connections to complete.

8. Test RSIP functionality. From a login node, log on to an RSIP client node (compute node) and ping the IP address of the SMW or other host external to your Cray system. For example, if nid00074 is a compute node and 10.3.1.1 is a valid external IP address, type these commands.

crayadm@login:~> ssh root@nid00074
root@nid00074's password:
Welcome to the initramfs
# ping 10.3.1.1
10.3.1.1 is alive!
#

Procedure 46. Adding isolated service nodes as RSIP clients

You can configure service nodes that are isolated from the network as RSIP clients. This procedure assumes that RSIP is already configured and functional on your Cray system. If you have not installed and configured RSIP on your system, follow Procedure 45 on page 212, which includes an optional step to configure isolated service nodes as RSIP clients.

Warning: Do not configure service nodes with external network connections as RSIP clients. Configuring a network node as an RSIP client will disrupt network functionality. Service nodes with external network connections will route all non-local traffic into the RSIP tunnel, and IP may not function as desired.

1. Select one of your RSIP servers to provide access for the isolated service node. In this example, we have chosen the RSIP server nid00016.


2. Log on to the boot node and invoke xtopview in the node view for the RSIP server you have selected; for example:

boot:~ # xtopview -n 16
node/16:/ #

Modify max_clients in the rsipd.conf file to add an additional client for each isolated service node you are configuring. For example, if you configured 300 RSIP clients (compute nodes), change 300 to 301.

node/16:/ # vi /etc/opt/cray/rsipd/rsipd.conf
max_clients 301

3. Load the RSIP client on the node. On the isolated service node, invoke a modprobe of the krsip module with an IP argument pointing to the HSN IP address of the RSIP server node you selected in step 1. For example, if the IP address of the RSIP server is 10.128.0.17 and the isolated service node is nid00023, type these commands.

boot:~ # ssh nid00023
nid00023:~ # modprobe krsip ip=10.128.0.17 rsip_local_ports=1

4. Edit the boot automation file to start the RSIP client. Using the example from the previous steps, make these changes.

crayadm@smw:~> vi /opt/cray/etc/auto.xthostname

After the line or lines that start the RSIP servers, add:

# RSIP client startup
lappend actions { crms_exec_via_bootnode "nid00023" "root" "modprobe krsip \
ip=10.128.0.17 rsip_local_ports=1" }

6.27.7 IP Routes for CNL Nodes in the /etc/routes File

You can edit the /etc/routes file in the compute node template image on the SMW to provide route entries for compute nodes. This provides a simple mechanism for you to configure routing access from compute nodes to login and network nodes using external IP destinations without having to traverse RSIP tunnels. This mechanism is not intended to be used for general-purpose routing of internal HSN IP traffic. It is intended only to provide IP routes for compute nodes that need to reach external IP addresses or external networks. A new /etc/routes file is created in the compute images and is examined during startup. Non-comment, non-blank lines are passed to the route add command. The empty template file provided contains comments describing the syntax.
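A hypothetical /etc/routes entry is shown below. Because non-comment, non-blank lines are passed to the route add command, each line must consist of valid route add arguments; the network and gateway addresses here are placeholders, and the comments in the template file describe the exact syntax expected on your release.

# reach an external network through a gateway node on the HSN
-net 10.3.1.0 netmask 255.255.255.0 gw 10.128.0.17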


6.28 Updating the System Configuration After A Blade Change

When a blade is changed in a Cray system, you need to update the configuration of the system. You will need to do this after:

• Adding additional blades to the system

• Removing a blade from your system configuration

• Changing a blade from a compute blade to a service blade

• Changing a blade from a service blade to a compute blade

You can update the system configuration when the system is not booted (see Updating the System Configuration When the System is Not Booted on page 215) or while the system is booted (see Updating the System Configuration While the System is Booted on page 217).

6.28.1 Updating the System Configuration When the System is Not Booted

Important: After you have made hardware changes and you want to update the system configuration when the system is not booted, follow the procedures in this section.

Procedure 47. Updating the SMW configuration after hardware changes

Note: When blades of the same type are swapped, or when cabling changes are made and xtdiscover does not have to be executed, you must still execute the xtbounce --linktune command, which forces xtbounce to do full tuning on the system. For more information about the xtbounce --linktune command, see the xtbounce(8) man page.

1. Execute the xtdiscover command, which updates the system configuration to reflect the changed blade configuration.

2. Execute the xtbounce --linktune=all command to tune PCIe and HSN links on the system. If using a partition and not the entire machine, use pN instead of s0.

smw:~ # xtbounce --linktune=all s0

3. Capture the system configuration for CLEinstall by executing the following xthwinv command. (The CLEinstall --xthwinvxmlfile option directs the CLEinstall program to use this captured configuration information.) If using a partition and not the entire machine, use pN instead of s0.

smw:~ # xthwinv -x s0 > /home/crayadm/install.5.0.41/xthwinv.s0.xml

For additional information about the CLEinstall and xthwinv commands, see the CLEinstall(8) and xthwinv(8) man pages.


Procedure 48. Using CLEinstall to update the system configuration after adding a blade to a system

1. Update the CLEinstall.conf file with the blade changes.

2. Execute the CLEinstall command, including the --xthwinvxmlfile option, to update the system and prepare a boot image. The CLEinstall --xthwinvxmlfile option directs the CLEinstall program to use previously captured configuration information (see Procedure 47 on page 215). If using a partition and not the entire machine, use pN instead of s0.

smw:~ # ./CLEinstall --label=LABEL --upgrade --XTrelease=5.0.41 \
--xthwinvxmlfile=/home/crayadm/install.5.0.41/xthwinv.s0.xml \
--configfile=/home/crayadm/install.5.0.41/CLEinstall.conf

3. Run the shell_bootimage_LABEL.sh script, where LABEL is the system set label specified in /etc/sysset.conf for this boot image. Specify the -c option to automatically create and set the boot image for the next boot. For example:

smw:~ # /var/opt/cray/install/shell_bootimage_LABEL.sh -c

For information about additional options accepted by this script, use the -h option to display a help message.

4. Run the shell_post_install.sh script on the SMW to unmount the boot root and shared root file systems and perform other cleanup as needed.

smw:~ # /var/opt/cray/install/shell_post_install.sh /bootroot0 /sharedroot0

5. If you are adding a compute blade into the system configuration, boot the system as you normally boot your system. If you are adding a service blade into the system configuration, complete the following steps.

a. Boot the boot node.

b. If the login node(s) or RSIP node(s) have changed, edit /etc/sysconfig/network/ifcfg-eth0.

c. Boot the SDB node and all other service nodes.

d. Update ssh keys for new service nodes.

1) Create a backup copy of your boot:/root/.ssh/known_hosts file.

2) Delete your boot:/root/.ssh/known_hosts file.


3) Run the /var/opt/cray/install/shell_ssh.sh script, which creates a new known_hosts file. These keys are used for ssh commands to the blades, including pdsh commands that are called by xtshutdown and /etc/init.d/lustre and possibly by actions specified in the boot automation file on the SMW.

boot:~ # /var/opt/cray/install/shell_ssh.sh

e. Configure the new nodes to support the role that they were added for.

f. Update SMW boot automation files if new service nodes have been added or removed that are providing services (such as ALPS on login nodes) that are explicitly started by hostname in the boot automation file.

g. Complete booting the system (or reboot using the boot automation file).

6.28.2 Updating the System Configuration While the System is Booted

To change the system configuration physically while the system is booted, use the xtwarmswap command to remove or add one or more blades. The xtwarmswap command runs on the SMW and coordinates with the xtnlrd daemon to take the necessary steps to perform warm swap operations. See the xtwarmswap(8) man page for additional information about using the xtwarmswap command.

Caution: When reserving nodes for maintenance, an admindown of any node in use by a current batch job can cause a subsequent aprun in the job to fail; instead, it is recommended that a batch subsystem be used to first reserve nodes for maintenance, and then verify that a node is not in use by a batch job prior to setting a node to admindown. Contact your Cray service representative to reserve nodes for maintenance.

6.28.2.1 Reusing One or More Previously-failed HSN Links

Before failed links can be reintegrated into the HSN configuration, the alert flags on the LCBs that are to be reused must be cleared. The xtnlrd daemon will automatically do this when told to perform a warm swap sync operation.

Deprecated: The xtclear_link_alerts command is deprecated for users and is only run internally by the xtnlrd command. When run internally by xtnlrd, xtclear_link_alerts saves a list of the components that had alerts, to allow xtnlrd to put the alert flags back onto the components and LCBs in case of failure so that a warmswap --add or warmswap --sync is done correctly, before handling any link or blade failures.

Procedure 49. Rerouting the HSN to use previously-failed links

• Execute an xtwarmswap -s LCB_names -p partition_name to clear the alert flags on the specified LCBs and to reroute the HSN using those links.


Note that by specifying all on the xtwarmswap -s option, all LCBs with alerts will be reused, and hence their alert flags will be cleared. Alternatively, by specifying none on the xtwarmswap -s option, no LCB alerts will be cleared, and the system will be rerouted without any changes to the configuration. The xtwarmswap command results in xtnlrd performing the same link recovery steps as for a failed link, but with two differences: no alert flags are set, and an init_new_links step is performed to initialize both ends of any links that should be used, before new routes are asserted into the ASIC routing tables.

The elapsed time for the warm swap synchronization operation is typically about 60 seconds.
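For example, to clear the alerts on all LCBs that have them and reroute the entire system, you might run the following on the SMW; the partition name is illustrative, and pN would be used instead of s0 on a partitioned system.

crayadm@smw:~> xtwarmswap -s all -p s0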

6.28.2.2 Reusing One or More Previously-failed Blades, Mezzanines, or Cabinets

Failed blades have alert flags set on their Gemini ASICs and on the link endpoints (LCBs); these alert flags must be cleared before the blades, mezzanines, or cabinets can be reused. The xtnlrd daemon clears them automatically when told to perform a warm swap add operation. Perform an xtwarmswap --add operation to bring the blades back into the HSN configuration. Doing so cold starts the blades and re-initializes the links to the blades.

Procedure 50. Bringing failed blades/mezzanines/cabinets back into the HSN configuration

1. Ensure that blades/mezzanines/cabinets have power.

2. Ensure that an xtalive command to all required blades succeeds.

3. Add the blade(s) to the HSN by running the xtwarmswap --add blade_ID,... command. Note that this command automatically runs a mini-xtdiscover command once the warm swap steps have completed successfully. No manual invocation of xtdiscover, which gets the new hardware attributes from the added blades, is necessary.

4. Boot the nodes on the blade(s) by executing the xtcli boot CNL0 blade_ID,... command on the SMW.

Because the xtwarmswap --add command cold starts the added blades, the time to return the blades back to service is about 10 minutes for the cold start and about 60 seconds for the link-recovery handling.

6.28.2.3 Planned Removal of a Compute Blade

You can physically remove a compute blade for maintenance or replacement while the system is running. However, the applications using the nodes on the blade to be removed must be allowed to drain, or be killed beforehand.


Procedure 51. Removing a compute blade for maintenance or replacement while the system is running

1. Note: You must be on the boot node, as root, to perform this step.

a. Use the xtprocadmin -n nid[,nid...] -k s admindown command to mark the nodes on the compute blade as admindown. This tells ALPS not to launch new applications onto them. The arguments to the -n option should be the NID values for the nodes on the blade being removed, as shown by executing xtprocadmin | grep bladename.

b. Use the apstat -n command to determine if any applications are running on the node you marked admindown. In this example you can see that apid 1301 is running on NID 6:

boot:~ # apstat -n
NID Arch State HW Rv Pl PgSz     Avl    Conf  Placed PEs Apids
  6   XT UP  I  8  8  8   4K 4096000 4096000 4096000   8 1301
  7   XT UP  I  8  -  -   4K 4096000       0       0   0
  8   XT UP  I  8  -  -   4K 4096000       0       0   0

c. Wait until the applications using the nodes on the blade finish or use the apkill apid command to kill the application.

2. Note: You must be on the SMW, as crayadm, to perform this step.

a. Use the xtcli halt blade command to halt the blade.

b. Use the xtwarmswap --remove blade command to remove the compute blade from service.

Note: AOC cables are disabled if they are detected to be unused, the most likely case being the removal of a blade from an active system while the cabling stays in place.

c. Physically remove the blade, if desired.

The --remove stage of the xtwarmswap process uses the ASIC resiliency infrastructure and takes about 60 seconds to complete.

6.28.2.4 Planned Installation of a Compute Blade

After a blade has been repaired or when a replacement blade is available, you can use the following procedure to return the blade into service.

Procedure 52. Returning a blade into service

1. Physically insert the blade into the slot.

2. Ensure that blades/mezzanines/cabinets have power.

3. Ensure that an xtalive command to all required blades succeeds.


4. Add the blade(s) to the HSN by running the xtwarmswap --add blade_ID,... command. Note that this command automatically runs a mini-xtdiscover command once the warm swap steps have completed successfully. No manual invocation of xtdiscover, which gets the new hardware attributes from the added blades, is necessary.

5. Boot the nodes on the blade(s) by executing the xtcli boot CNL0 blade_ID,... command on the SMW. Because the xtwarmswap --add command cold starts the added blade, the time it takes to bring the blade back into service includes about 10 minutes for the cold start and about 60 seconds for the link-recovery handling.

6.29 Changing the Location to Log syslog-ng Information

Syslog messages from the service partition are only present on the SMW in /var/opt/cray/log/sessionid. The log system does not support customization of the configuration files, but if you wish to have custom log formats, you may configure the log system to forward all system logs from the SMW to your site-provided log server. See the intro_llm(8) and CLEinstall.conf(5) man pages for more information.

6.30 Cray Lightweight Log Management (LLM) System

The Cray Lightweight Log Management (LLM) system is the log infrastructure for Cray systems and must be enabled for systems to successfully log events. At a high level, a library is used to deliver messages to rsyslog utilizing the RFC 5424 protocol; rsyslog transports those messages to the SMW and places the messages into log files.

For an overview of LLM, see the intro_LLM(8) man page. For an overview of the LLM log files, see the intro_LLM_logfiles(5) man page.

The LLM system relies on the sessionid that is generated by xtbootsys. Therefore, systems must always be booted using xtbootsys. If you have multi-part boot procedures or if you use manual procedures, have the process started by an xtbootsys session. That session can be effectively empty – it is only needed to initiate a boot sessionid. Subsequent xtbootsys calls can then use --session last or manual processes.

By default, LLM has a log trimming mechanism enabled called xttrim. For additional information, see Removing Old Log Files on page 96.

Note: Do not use the xtgetsyslog command because it is not compatible with LLM.


6.30.1 Configuring LLM

The LLM system is intended to work as a turnkey system. In most cases little or no configuration is required.

Both the SMW and CLE software installation tools allow certain LLM settings to be defined. The installation tool enforces any of these settings in the LLM configuration file, which means that these settings will not be accidentally lost during an upgrade due to changes in the llm.conf file that is distributed with the LLM software.

Note: The llm.conf file may be automatically replaced during the upgrade process. The previous version will be renamed llm.conf.rpmsave. Settings made in the SMWinstall.conf and CLEinstall.conf files will be enforced in the new llm.conf configuration file. It is important to review the resulting llm.conf file after a software update to ensure all settings are as intended.

The most critical parameter is LLM= in the SMWinstall.conf and CLEinstall.conf files. This parameter must be set to LLM=yes, which ensures that enabled=yes is set in the llm.conf file. The LLM=yes setting is the only one needed for basic LLM operation.

Important: It is recommended that the minimal possible configuration be performed in order to achieve the needed results. Only set or change items if needed. If you must change a setting, change it in the SMWinstall.conf and/or CLEinstall.conf files if they provide a mechanism to change that value. Change values directly in the llm.conf configuration file (/etc/opt/cray/llm/llm.conf) only if you require the change and the SMWinstall.conf and CLEinstall.conf files do not provide a mechanism to change that value.

Caution: The rsyslog.conf configuration file is not intended to be modified locally. Configurable settings can be found in the llm.conf configuration file. Modifications outside of those provided by llm.conf may cause failure in other Cray software components, such as xtdumpsys. If the provided configuration does not have the needed functionality, it is recommended that the logs be forwarded to another host where custom file processing can be performed without risk to critical software components.

For a description of all LLM configuration settings, see the llm.conf(5) man page or the llm.conf configuration file.


6.30.2 LLM Configuration Tips

• If your RAID controllers do not have IP addresses that begin with 10.1.0., make sure you specify the llm_raid_ip= option in your SMWinstall.conf file. Failure to set this correctly will result in the RAID logs not going into their specified location of /var/opt/cray/log/raid-yyyymmdd.

• Verify that your /var/opt/cray/log directory is on a different disk than the root hard drive. The subdirectories of /var/opt/cray should be links to a different disk.

• If you delete an active log file while rsyslog is running, rsyslog will continue to write to the file handle even though there is no longer an entry for the file in the directory table of contents. Once rsyslog exits, all references to that file handle are gone, so the contents will be lost. To delete a currently open log file, the suggested approach is to rename or remove the file and then hup rsyslog (/etc/init.d/cray-syslog hup) to tell it to reopen files; see the example following this list.

• Make sure your site log host system can handle the log files load. Otherwise, the messages will back up on the SMW and cause unexpected behavior.
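For example, to rotate a currently open log file out of the way without losing messages, you might rename it and then signal rsyslog to reopen its files; the path and file name shown are placeholders for an actual log file under /var/opt/cray/log.

smw:~ # mv /var/opt/cray/log/sessionid/logfile /var/opt/cray/log/sessionid/logfile.save
smw:~ # /etc/init.d/cray-syslog hup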

Managing Services [7]

This chapter describes how to manage Cray system services to best use the system or to modify a service.

For a list of administrator accounts that enable you to access these functions, see Administering Accounts on page 115.

7.1 Configuring the SMW to Synchronize to a Site NTP Server

The components of the Cray system synchronize time with the System Management Workstation (SMW) through Network Time Protocol (NTP). By default, the NTP configuration of the SMW is configured to stand alone; however, the SMW can optionally be configured to synchronize with a site NTP server. Use the following procedure to configure the SMW to synchronize to a site NTP server.

Procedure 53. Configuring the SMW to synchronize to a site NTP server

1. Stop the NTP server by issuing the /etc/init.d/ntp stop command; this command must be executed as user root:

smw:~ # /etc/init.d/ntp stop

2. Edit the /etc/ntp.conf file on the SMW to point to the new server (see the example following this procedure).

3. Start the NTP server by issuing the /etc/init.d/ntp start command:

smw:~ # /etc/init.d/ntp start
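The edit in step 2 typically amounts to adding one or more server lines that name the site NTP server; the host name below is a placeholder for your site's server.

smw:~ # vi /etc/ntp.conf
server ntp.site.example.com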

The SMW can continue to update the rest of the system by proxy. By default, the SMW qualifies as a stratum 3 (local) NTP server. For more information about NTP, refer to the Linux documentation.

7.2 Synchronizing Time of Day on Compute Node Clocks with the Clock on the Boot Node

A network time protocol (NTP) client, ntpclient, is available to install on compute nodes. By default, ntpclient is not installed. When installed, the time of day on compute node clocks is synchronized with the clock on the boot node.


Without this feature, compute node clocks will drift apart over time, as much as 18 seconds a day. When ntpclient is installed on the compute nodes, the clocks drift apart for a four-hour calibration period and then slowly converge on the time reported by the boot node.

Note: The standard Cray system configuration includes an NTP daemon (ntpd) on the boot node that synchronizes with the clock on the SMW. Additionally, the service nodes run ntpd to synchronize with the boot node.

To install the ntpclient RPM in the compute node boot image, edit the shell_bootimage_LABEL.sh script and specify CNL_NTPCLIENT=y, and then update the CNL boot image. Optionally, you can enable this feature as part of a CLE software upgrade by setting CNL_ntpclient=yes in the CLEinstall.conf file before the CLEinstall program is run.
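For example, to enable the client as part of a CLE software upgrade, the CLEinstall.conf entry described above would be set as follows; the path to CLEinstall.conf is the installation directory used elsewhere in this guide and may differ at your site.

smw:~ # vi /home/crayadm/install.xtrelease/CLEinstall.conf
CNL_ntpclient=yes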

On compute nodes, the computational overhead for ntpclient is negligible and a small increase (800K) to the memory footprint will be incurred. Minimal network overhead for the boot node is required to process NTP requests. For each compute node on the system, the boot node will send and receive one packet every 15 minutes. Even on very large Cray systems, the boot node will process fewer than 25 transactions a second to support ntpclient requests.

7.3 Adding and Starting a Service Using Standard Linux Mechanisms

Services can be added to the service nodes by using standard Linux mechanisms, such as executing the chkconfig command while in the xtopview utility on the boot node or executing /etc/init.d/servicename start|stop|restart (which starts, stops, or restarts a service immediately) on the service node. This is the recommended approach for most services.
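For example, a minimal sketch of enabling a service on all login nodes from the shared root is shown below; the class and service names are placeholders for your own.

boot:~ # xtopview -c login
class/login:/ # chkconfig servicename on
class/login:/ # exit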

7.4 Creating a Snapshot of /var

The /var directory on a Cray system can be configured either as persistent (see Installing and Configuring Cray Linux Environment (CLE) Software) or nonpersistent. In the latter case, the /var directory is volatile, and its initial contents are rebuilt at boot time from a skeleton archive, /.shared/var-skel.tgz.

The advantage of using a nonpersistent /var directory is ease of management. Each time the system is rebooted, the /var directory is freshly re-created from the central skeleton file, so accumulation of files and potential corruption of files within the /var directory is much less of a concern. However, because the contents of /var are not saved, if there is a need to update the initial contents of the /var directory (for example, when a new package requires a directory), the skeleton archive must be updated.


The xtpkgvar command creates a compressed tar file with a skeleton snapshot of the /var directory. To add files to the directory, make changes in the xtopview shell to the /var directory and take a snapshot of it with the xtpkgvar command.

Caution: Use the xtpkgvar command only when you are configuring the shared-root file system. The xtpkgvar command is used by the CLEinstall utility.

For more information, see the xtpkgvar(8) man page.

7.5 Setting Soft and Hard Limits to Prevent Login Node Hangs

A login node can be caused to hang or become nearly unresponsive by having all available processes on the node in use. A hang of this type can be identified primarily by the presence of cannot fork error messages, but it is also associated with an unusually large number of processes running concurrently, the machine taking several minutes to make a prompt available, or never making a prompt available. In the case of an overwhelming number of total processes, it is often a large number of the same process overwhelming the system, which indicates a fork() system call error in that particular program.

This problem can be prevented by making a few changes to configuration files in /etc on the shared root of the login node. These configurations set up the ulimit built-in and the Linux Pluggable Authentication Module (PAM) to enforce limits on resources as specified in the configuration files. There are two types of limits that can be specified, a soft limit and a hard limit. Users receive a warning when they reach the soft limit specified for a resource, but they can temporarily increase this limit up to the hard limit using the ulimit command. The hard limit can never be exceeded by a normal user. Because of the shared root location of the configuration files, the changes must be made from the boot node using the xtopview tool.

Procedure 54. Preventing login node hangs by setting soft and hard limits

1. On the boot node type the following in order to make changes to the shared root, where login is the class name for login nodes.

boot:~ # xtopview -c login

2. Next, add the following lines to the /etc/security/limits.conf file, where soft_lim_num and hard_lim_num are the number of processes at which you would like the hard and soft limits enforced. The * represents "apply to all users" but can also be configured to apply specific limits by user or group (see the limits.conf file's comments for further options).

class/login:/ # vi /etc/security/limits.conf
* soft nproc soft_lim_num
* hard nproc hard_lim_num

Save the file.


3. Verify that the following line is included in the appropriate PAM configuration files for any authentication methods for which you want limits enforced; the PAM configuration files are located in the /etc/pam.d/ directory. For example, to enforce limits for users connecting through ssh, add the pam_limits.so line to the file /etc/pam.d/sshd. Other applicable authentication methods to include also are su in the file /etc/pam.d/su and local logins in /etc/pam.d/login.

session required pam_limits.so

For more information about the Pluggable Authentication Module (PAM), see the PAM(8) man page.

4. Type exit to return to the normal prompt on the boot node; the changes you made should be effective immediately on login nodes.

class/login:/ # exit
boot:~ #

5. To test that the limits are in place, from a login node type the following command, which should return the number specified as the soft limit for the number of processes available to a user, for example:

boot:~ # ssh login
login:~ # ulimit -u

For more information about using the ulimit command, see the ulimit(P) man page.

7.6 Rack-mount SMW: Creating a Cray System Management Workstation (SMW) Bootable Backup Drive

Warning: You must be running the SUSE Linux Enterprise Server version 11 Service Pack 2 (SLES 11 SP2) SMW base operating system and a release of SMW 6.0 or SMW 7.0 on your SMW in order to perform the procedures in this chapter.

The following procedure creates a bootable backup drive for a rack-mount SMW in order to replace the primary drive if the primary drive fails. When this procedure is completed, the backup drive on the SMW will be a bootable replacement for the primary drive when the backup drive is plugged in as or cabled as the primary drive.

Procedure 55. Rack-mount SMW: Creating an SMW bootable backup drive

Important: The disk device names shown in this procedure are only examples. You should substitute the actual disk device names for your system. The boot disk is phy7 and is slot 0, and the bootable backup disk is phy6 and is slot 1.

Caution: Shut down the Cray system before you begin this procedure. Also be aware that there may be a considerable load on the SMW while creating the SMW bootable backup drive.


1. Log on to the SMW as crayadm and su to root.

crayadm@smw:~> su -
Password:
smw:~ # ls -al /dev/disk/by-path
total 0
drwxr-xr-x 2 root root 380 Mar 15 13:21 .
drwxr-xr-x 6 root root 120 Mar 11 18:42 ..
lrwxrwxrwx 1 root root  9 Mar 11 18:42 pci-0000:00:11.0-scsi-0:0:0:0 -> ../../sr0
lrwxrwxrwx 1 root root  9 Mar 11 18:42 pci-0000:00:12.2-usb-0:3:1.0-scsi-0:0:0:0 -> ../../sdf
lrwxrwxrwx 1 root root  9 Mar 11 18:42 pci-0000:00:12.2-usb-0:3:1.1-scsi-0:0:0:0 -> ../../sr1
lrwxrwxrwx 1 root root  9 Mar 11 18:42 pci-0000:00:12.2-usb-0:3:1.1-scsi-0:0:0:1 -> ../../sdg
lrwxrwxrwx 1 root root  9 Mar 11 18:42 pci-0000:00:13.2-usb-0:2.1:1.0-scsi-0:0:0:0 -> ../../sde
lrwxrwxrwx 1 root root 10 Mar 11 18:42 pci-0000:00:13.2-usb-0:2.1:1.0-scsi-0:0:0:0-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 Mar 11 18:42 pci-0000:00:13.2-usb-0:2.1:1.0-scsi-0:0:0:0-part2 -> ../../sde2
lrwxrwxrwx 1 root root  9 Mar 11 18:42 pci-0000:05:00.0-sas-phy4-...
lrwxrwxrwx 1 root root 10 Mar 14 15:57 pci-0000:05:00.0-sas-phy4-...
lrwxrwxrwx 1 root root  9 Mar 11 18:42 pci-0000:05:00.0-sas-phy5-...
lrwxrwxrwx 1 root root 10 Mar 14 16:00 pci-0000:05:00.0-sas-phy5-...
lrwxrwxrwx 1 root root  9 Mar 11 18:42 pci-0000:05:00.0-sas-phy6-...
lrwxrwxrwx 1 root root 10 Mar 15 13:21 pci-0000:05:00.0-sas-phy6-...
lrwxrwxrwx 1 root root 10 Mar 15 13:21 pci-0000:05:00.0-sas-phy6-...
lrwxrwxrwx 1 root root  9 Mar 11 18:42 pci-0000:05:00.0-sas-phy7-...
lrwxrwxrwx 1 root root 10 Mar 11 18:42 pci-0000:05:00.0-sas-phy7-...
lrwxrwxrwx 1 root root 10 Mar 11 18:42 pci-0000:05:00.0-sas-phy7-...

2. If your SMW configuration files on the SMW root drive have been modified already (that is, your site has completed this step at least once after installing SLES 11 SP2 as your SMW base operating system), skip to step 3; otherwise, complete this step. After you have installed SLES 11 SP2 as your SMW base operating system, complete this step to modify the configuration files on the SMW root drive, which will standardize the SMW's boot-time drive names with the Linux run-time drive names.


Set up ordered drives on your rack-mount SMW.

a. Identify the installed SMW drive model numbers, serial numbers, and associated Linux device (/dev) names.

Provided with the SMW 7.0.UP03 release is a script named smwmapdrives.sh. This script will identify local (internal) drives mounted in the SMW and provide their Linux device (/dev) names. Execute the smwmapdrives.sh script on the SMW.

smw:~ # ./smwmapdrives.sh
List of SMW-installed disk drives
------
Physical slot 0:
/dev/sda
/dev/disk/by-id/ata-FUJITSU_MHZ2160BK_G2_K85DTB227RDS
/dev/disk/by-id/scsi-SATA_FUJITSU_MHZ2160_K85DTB227RDS
/dev/disk/by-path/pci-0000:05:00.0-sas-phy7-0x4433221107000000-lun-0
Physical slot 1:
/dev/sdc
/dev/disk/by-id/ata-FUJITSU_MHZ2160BK_G2_K85DTB227RD7
/dev/disk/by-id/scsi-SATA_FUJITSU_MHZ2160_K85DTB227RD7
/dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0
Physical slot 2:
/dev/sdd
/dev/disk/by-id/ata-FUJITSU_MHZ2160BK_G2_K85DTB227RF3
/dev/disk/by-id/scsi-SATA_FUJITSU_MHZ2160_K85DTB227RF3
/dev/disk/by-path/pci-0000:05:00.0-sas-phy5-0x4433221105000000-lun-0
Physical slot 3:
/dev/sdb
/dev/disk/by-id/ata-ST9500620NS_9XF0665V
/dev/disk/by-id/scsi-SATA_ST9500620NS_9XF0665V
/dev/disk/by-path/pci-0000:05:00.0-sas-phy4-0x4433221104000000-lun-0
Physical slot 4:
NOT INSTALLED
Physical slot 5:
NOT INSTALLED

The device names for by-id are persistent and will reference the drive, regardless of the slot in which the drive is installed.

by-path names reference a physical drive slot only and do not identify the drive installed in that slot. This is the naming used by default for the logging and database drives when the SMW was installed. This by-path name is used to specifically install logging and database file systems because the by-id device names refer to the physical drive slots expected to be used for those file systems and are provided as the default examples in the SMW installation configuration process.


The /dev/sdX drive names are not persistent, except for the /dev/sda drive name, which always refers to the drive installed in physical slot 0 (zero) of the SMW; the other /dev/sdX names can change with each SMW boot and will change if drives are added, removed, or reordered in the SMW slots. Only the /dev/sda drive name may ever be used for the rack-mount SMW for this reason.

Either the by-id naming or the by-path naming should be chosen for the site administrative policy for managing the SMW-installed disk drives. The following documentation provides the steps necessary to implement this selection on the SMW prior to creating an SMW bootable backup drive.

b. Back up the following files before proceeding:

smw# cp -p /boot/grub/device.map /boot/grub/device.map-YYYYMMDD
smw# cp -p /boot/grub/menu.lst /boot/grub/menu.lst-YYYYMMDD
smw# cp -p /etc/fstab /etc/fstab-YYYYMMDD

c. Edit the grub device.map file to reflect physical drive locations.

To provide a direct mapping of the SMW disk drive physical slots to the boot loader (BIOS and GRUB) drive namings, the device.map mapping file used by grub should be replaced. Perform the following steps to install new device.map file entries to effect this mapping.

1) Edit the grub device.map file.

2) Delete all lines.

3) Enter the following lines into the file. These lines show each drive slot's physical location mapped to its boot-time hd? name. Note: by-id names should not be used in the device.map file.

# Dell Rackmount r805 SMW
# grub(8) device mapping for boot-drive identification
# hd? numbers are being mapped to their physical slot
(hd0) /dev/disk/by-path/pci-0000:05:00.0-sas-phy7-0x4433221107000000-lun-0
(hd1) /dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0
(hd2) /dev/disk/by-path/pci-0000:05:00.0-sas-phy5-0x4433221105000000-lun-0
(hd3) /dev/disk/by-path/pci-0000:05:00.0-sas-phy4-0x4433221104000000-lun-0
(hd4) /dev/disk/by-path/pci-0000:05:00.0-sas-phy3-0x4433221103000000-lun-0
(hd5) /dev/disk/by-path/pci-0000:05:00.0-sas-phy2-0x4433221102000000-lun-0

d. Modify the SMW boot drive /etc/fstab file to use by-id or by-path naming, replacing any /dev/sdX disk partition references.

Note: Reference the output of the smwmapdrives.sh script in step 2.a for drive namings.


Edit /etc/fstab, replacing drive /dev/sdX references with either the by-id or by-path name's corresponding device name.

When replacing a reference such as /dev/sda1, use the corresponding by-id or by-path device name with the partition suffix -part1 appended. File system partitions are indicated by the numeral appended to the device name; for example, /dev/sda1 refers to partition 1 on /dev/sda, so its replacement keeps the same numeral in the -part1 suffix.

For example, if the root and swap file systems are currently configured to mount /dev/sda2, they should be changed. Using the by-id device name from the example in step 2.a, the fstab lines would change from:

/dev/sda1 swap swap defaults 0 0
/dev/sda2 / ext3 acl,user_xattr 1 1

to:

/dev/disk/by-id/ata-FUJITSU_MHZ2160BK_G2_K85DTB227RDS-part1 swap swap defaults 0 0
/dev/disk/by-id/ata-FUJITSU_MHZ2160BK_G2_K85DTB227RDS-part2 / ext3 acl,user_xattr 1 1

e. Modify /boot/grub/menu.lst to reflect the device.map BIOS/boot-up drive changes for the sdX remapping.

The same device name replacement performed on /etc/fstab should also be performed on the GRUB bootloader /boot/grub/menu.lst configuration file. All references to /dev/sdX devices should be replaced with corresponding by-id or by-path device names.

f. Invoke the grub utility to reinstall the SMW boot loader on the primary boot drive.

Once the changes to device.map, fstab, and menu.lst have been completed, the GRUB bootloader boot blocks must be updated to reflect changes to the device names. Complete this step to update the boot loader on the boot drive.


Invoke the grub utility and reinstall the SMW root-drive boot blocks.

smw:~ # grub --no-curses
GNU GRUB version 0.97 (640K lower / 3072K upper memory)
[ Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists the possible
completions of a device/filename. ]
grub> root (hd0,1)
root (hd0,1)
 Filesystem type is ext2fs, partition type 0x83
grub> setup (hd0)
 Checking if "/boot/grub/stage1" exists... yes
 Checking if "/boot/grub/stage2" exists... yes
 Checking if "/boot/grub/e2fs_stage1_5" exists... yes
 Running "embed /boot/grub/e2fs_stage1_5 (hd0)"... 17 sectors are embedded.
Succeeded
 Running "install /boot/grub/stage1 (hd0) (hd0)1+17 p (hd0,1)/boot/grub/stage2 /boot/grub/menu.lst"... succeeded
Done.
grub> quit

3. If the backup drive disk partition table already exists and the partition table on the backup drive matches the partition table that is on the primary boot drive, skip this step; otherwise, create the backup drive disk partition table.

In this example, the partition table consists of the following:

• Slice 1: 4 GB Linux swap partition

• Slice 2: Balance of disk space used for the root file system

a. Use the fdisk command to display the boot disk partition layout. (Example output spacing was modified to fit on the printed page.)

smw:~ # fdisk -lu /dev/disk/by-path/pci-0000:05:00.0-sas-phy7-0x4433221107000000-lun-0
Disk /dev/disk/by-path/pci-0000:05:00.0-sas-phy7-0x4433221107000000-lun-0: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders, total 312581808 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x00000082

Device                                                                      Boot    Start       End     Blocks  Id System
/dev/disk/by-path/pci-0000:05:00.0-sas-phy7-0x4433221107000000-lun-0-part1            63  16771859   8385898+ 82 Linux swap / Solaris
/dev/disk/by-path/pci-0000:05:00.0-sas-phy7-0x4433221107000000-lun-0-part2 *    16771860 312576704 147902422+ 83 Linux

b. Use the fdisk command to configure the bootable backup disk partition layout. Set the bootable backup disk partition layout to match the boot disk partition layout. First, clear all of the old partitions using the d command within fdisk; next create a Linux swap and a Linux partition; and then write your changes to the disk. For help, type m within fdisk (see the following sample output, spacing was modified to fit on the printed page.)


smw:~ # fdisk -u /dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0

The number of cylinders for this disk is set to 19457. There is nothing wrong with that, but this is larger than 1024, and could in certain setups cause problems with: 1) software that runs at boot time (e.g., old versions of LILO) 2) booting and partitioning software from other OSs (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): p

Disk /dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders, total 312581808 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x00000080

Device                                                                      Boot   Start       End     Blocks  Id System
/dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0-part1           63  16771859      83828  82 Linux swap / Solaris
Partition 1 does not end on cylinder boundary.
/dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0-part2       167719 312581807 156207044+ 83 Linux

Command (m for help): d
Partition number (1-4): 2

Command (m for help): d
Selected partition 1

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First sector (63-312581807, default 63): (Press the Enter key)
Using default value 63
Last sector, +sectors or +size{K,M,G} (63-312581807, default 312581807): 16771859

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): 82
Changed system type of partition 1 to 82 (Linux swap / Solaris)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 2
First sector (16771860-312581807, default 16771860): (Press the Enter key)
Using default value 16771860
Last sector, +sectors or +size{K,M,G} (16771860-312581807, default 312581807): (Press the Enter key)
Using default value 312581807

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table. Syncing disks.


c. Display the boot backup disk partition layout and confirm it matches your phy7 sector information.

smw:~ # fdisk -lu /dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0
Disk /dev/disk/by-path/pci-0000:05:00.0-sas-phy6:1-0x4433221106000000:0-lun0: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders, total 312581808 sectors

4. Initialize the swap device.

smw:~ # mkswap /dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0-part1
mkswap: /dev/disk/by-path/pci-0000:05:00.0-sas-phy6:1-0x4433221106000000:0-lun0-part1:
warning: don't erase bootbits sectors (DOS partition table detected). Use -f to force.
Setting up swapspace version 1, size = 8385892 KiB
no label, UUID=c0ef22ac-b405-4236-855b-e4a09b6e94ed

5. Create a new file system on the backup drive root partition by executing the mkfs command.

smw:~ # mkfs -t ext3 /dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0-part2
mke2fs 1.41.9 (22-Aug-2009)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
9248768 inodes, 36976243 blocks
1848812 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
1129 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 37 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override.

6. Mount the new backup root file system on /mnt.

smw:~ # mount /dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0-part2 /mnt

7. Confirm that the backup root file system is mounted.

smw:~ # df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda2            303528624   6438700 281671544   3% /
udev                   1030332       116   1030216   1% /dev
/dev/sdb2            306128812    195568 290505224   1% /mnt

The running root file system device is the one mounted on /.


8. Dump the running root file system to the backup drive.

smw:~ # cd /mnt
smw:~ # dump 0f - /dev/disk/by-path/pci-0000:05:00.0-sas-phy7-0x4433221107000000-lun-0-part2 | restore rf -
DUMP: WARNING: no file `/etc/dumpdates'
DUMP: Date of this level 0 dump: Tue Mar 15 13:43:17 2011
DUMP: Dumping /dev/sda2 (/) to standard output
DUMP: Label: none
DUMP: Writing 10 Kilobyte records
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 7898711 blocks.
DUMP: Volume 1 started with block 1 at: Tue Mar 15 13:44:40 2011
DUMP: dumping (Pass III) [directories]
DUMP: dumping (Pass IV) [regular files]
restore: ./lost+found: File exists
DUMP: 79.34% done at 20890 kB/s, finished in 0:01
DUMP: Volume 1 completed at: Tue Mar 15 13:52:13 2011
DUMP: Volume 1 7908080 blocks (7722.73MB)
DUMP: Volume 1 took 0:07:33
DUMP: Volume 1 transfer rate: 17457 kB/s
DUMP: 7908080 blocks (7722.73MB)
DUMP: finished in 453 seconds, throughput 17457 kBytes/sec
DUMP: Date of this level 0 dump: Tue Mar 15 13:43:17 2011
DUMP: Date this dump completed: Tue Mar 15 13:52:13 2011
DUMP: Average transfer rate: 17457 kB/s
DUMP: DUMP IS DONE

9. Modify the backup drive's fstab and menu.lst files to reflect the backup drive's device, replacing the primary drive's device name.

Note: This step is only necessary if by-id names are used. If by-path names are being utilized for the root and swap devices, changes are not necessary; these devices reference physical slots, and the backup drive will be moved to the same physical slot (slot 0) when replacing a failed primary boot drive.

a. Edit /mnt/etc/fstab; replace the root and swap partitions' by-id device names with those used for this backup device, replacing the original disk device name. For example, change:

/dev/disk/by-id/ata-FUJITSU_MHZ2160BK_G2_K85DTB227RDS-part1 swap swap defaults /dev/disk/by-id/ata-FUJITSU_MHZ2160BK_G2_K85DTB227RDS-part2 / ext3 acl,user_xattr

to:

/dev/disk/by-id/ata-FUJITSU_MHZ2160BK_G2_K85DTB227RD7-part1 swap swap defaults /dev/disk/by-id/ata-FUJITSU_MHZ2160BK_G2_K85DTB227RD7-part2 / ext3 acl,user_xattr


b. Edit /mnt/boot/grub/menu.lst; replace the root= and resume= device names with those used for this backup device, replacing the original disk device name. Note: The root= entry normally refers to partition -part2, and the resume= entry normally refers to partition -part1; these partition references must be maintained. For example, replace the menu.lst configuration references of:

root=/dev/disk/by-id/ata-FUJITSU_MHZ2160BK_G2_K85DTB227RDS-part2

with: root=/dev/disk/by-id/ata-FUJITSU_MHZ2160BK_G2_K85DTB227RD7-part2

or similarly with the by-path device names, if those are preferred.

Replace the resume= references similarly. 10. Install the grub boot loader.

To make the backup drive bootable, reinstall the grub boot facility on that drive.

Caution: Although all of the disks connected to the SMW are available to the system, grub only detects the first 16 devices. Therefore, if you add a disk to the SMW after the SMW is connected to the boot RAID, it is advisable to reboot the SMW before continuing this procedure. a. Create a unique file on the backup drive to be used to identify that drive to the grub boot facility. smw:~ # cd / smw:~ # touch /mnt/THIS_IS_6

b. Invoke the grub boot utility. Within the grub boot utility:

1) Execute the find command to locate the drive designation that grub uses.

2) Select the drive to which the boot blocks will be installed with the root command.

3) Use the setup command to set up and install the grub boot blocks on that drive.

Note: The Linux grub utility and boot system always refer to drives as hd, regardless of the actual type of drives.


For example: smw:~ # grub --no-curses GNU GRUB version 0.97 (640K lower / 3072K upper memory) [ Minimal BASH-like line editing is supported. For the first word, TAB lists possible command completions. Anywhere else TAB lists the possible completions of a device/filename. ] grub> find /THIS_IS_6 (hd2,1) grub> root (hd2,1) root (hd2,1) Filesystem type is ext2fs, partition type 0x83 grub> setup (hd2) Checking if "/boot/grub/stage1" exists... yes Checking if "/boot/grub/stage2" exists... yes Checking if "/boot/grub/e2fs_stage1_5" exists... yes Running "embed /boot/grub/e2fs_stage1_5 (hd2)"... 17 sectors are embedded. succeeded Running "install /boot/grub/stage1 (hd2) (hd2)1+17 p (hd2,1)/boot/grub/stage2 \ /boot/grub/menu.lst"... succeeded Done. grub> quit

Important: For rack-mount SMWs, grub recreates device.map with the short names, not the persistent names; the /dev/sdX names must not be trusted. Always use the find command each time you invoke grub, because the grub root may not be hd2 the next time you execute grub. 11. Unmount the backup root partition.

smw:~ # umount /mnt

The drive is now bootable once plugged in or cabled as the primary drive.

7.7 Desk-side SMW: Creating a System Management Workstation (SMW) Bootable Backup Drive The following procedure creates a System Management Workstation (SMW) bootable backup drive for a desk-side SMW in order to replace the primary drive if the primary drive fails. When this procedure is completed, the backup drive on the SMW will be a bootable replacement for the primary drive when the backup drive is plugged in or cabled as the primary drive. Note: In the following procedure, /dev/sdX2 is the SMW disk (either /dev/sdb2 or /dev/sdc2).


Procedure 56. Desk-side SMW: Creating an SMW bootable backup drive

Important: The disk device names shown in this procedure are only examples. You should substitute the actual disk device names for your system. For example, on an SMW with three SMW disks, the boot disk is /dev/sda and the bootable backup disk is /dev/sdc; on an SMW with two SMW disks, the boot disk is /dev/sda and the bootable backup disk is /dev/sdb.

Caution: Shut down the Cray system before you begin this procedure. Also be aware that there may be a considerable load on the SMW while creating the SMW bootable backup drive. 1. Log on to the SMW as crayadm and su to root.

crayadm@smw:~> su - root smw:~ #

2. If the backup drive disk partition table already exists and the partition table on the backup drive matches the partition table that is on the primary boot drive, skip this step; otherwise, create the backup drive disk partition table.

Note: For optimal performance, the source and destination disks should be on different buses; drive slots 0 and 1 are on a different bus than drive slots 2 and 3. In this example, the partition table consists of the following:

• Slice 1: 4 GB Linux swap partition • Slice 2: Balance of disk space used for the root file system

a. Use the fdisk command to display the boot disk partition layout. smw:~ # fdisk -lu /dev/sda Disk /dev/sda: 320.0 GB, 320072933376 bytes 255 heads, 63 sectors/track, 38913 cylinders, total 625142448 sectors Units = sectors of 1 * 512 = 512 bytes

Device Boot Start End Blocks Id System /dev/sda1 63 8401994 4200966 82 Linux swap / Solaris /dev/sda2 * 8401995 625137344 308367675 83 Linux

b. Use the fdisk command to configure the bootable backup disk partition layout. Set the bootable backup disk partition layout to match the boot disk partition layout. First, clear all of the old partitions using the d command within fdisk; next create a Linux swap and a Linux partition; and then write your changes to the disk. For help, type m within fdisk (see the following sample output). smw:~ # fdisk -u /dev/sdb

The number of cylinders for this disk is set to 38913. There is nothing wrong with that, but this is larger than 1024, and could in certain setups cause problems with:


1) software that runs at boot time (e.g., old versions of LILO) 2) booting and partitioning software from other OSs (e.g., DOS FDISK, OS/2 FDISK).

Command (m for help): p Disk /dev/sdb: 320.0 GB, 320072933376 bytes 255 heads, 63 sectors/track, 38913 cylinders, total 625142448 sectors Units = sectors of 1 * 512 = 512 bytes

Device Boot Start End Blocks Id System /dev/sdb1 63 8401994 4200966 82 Linux swap /dev/sdb2 8401995 625105214 308351610 83 Linux

Command (m for help): d Partition number (1-5): 2 Command (m for help): d Selected partition 1 Command (m for help): n Command action e extended p primary partition (1-4) p Partition number (1-4): 1 First sector (63-625105215, default 63): (Press the Enter key) Using default value 63 Last sector or +size or +sizeM or +sizeK (63-625105215, default 625105215): 8401994

Command (m for help): t Selected partition 1 Hex code (type L to list codes): 82 Changed system type of partition 1 to 82 (Linux swap / Solaris)

Command (m for help): n Command action e extended p primary partition (1-4) p Partition number (1-4): 2 First sector (8401995-625105215, default 8401995): (Press the Enter key) Using default value 8401995 Last sector or +size or +sizeM or +sizeK (8401995-625105215, default 625105215): \ (Press the Enter key) Using default value 625105215

Command (m for help): w The partition table has been altered!

Calling ioctl() to re-read partition table. Syncing disks.


c. Display the boot backup disk partition layout. smw:~ # fdisk -lu /dev/sdb Disk /dev/sdb: 320.0 GB, 320072933376 bytes 255 heads, 63 sectors/track, 38913 cylinders, total 625142448 sectors Units = sectors of 1 * 512 = 512 bytes

Device Boot Start End Blocks Id System

/dev/sdb1 63 8401994 4200966 82 Linux swap / Solaris /dev/sdb2 * 8401995 625137344 308367675 83 Linux

3. Initialize the swap device.

smw:~ # mkswap /dev/sdb1

4. Standardize the /etc/fstab and grub disk device names. The device names that the installation process writes into the /boot/grub/menu.lst file are UDEV-based names (for example, /dev/disk/by-id/scsi-SATA_ST3320620AS_922J3-part2 or /dev/disk/by-id/ata-ST3320620A_9QFA85PV-part2) instead of the more commonly used device names (for example, /dev/sda2 or /dev/hda2). In the following procedures, edit the /boot/grub/menu.lst file to change only the long UDEV-based name to the shorter, commonly used device name reflected in the output of the df command on your system.

If the device names have already been standardized, skip to step 5.

Caution: Mistakes in the /boot/grub/menu.lst file will affect your ability to boot the SMW. a. SLES 11 sets up /etc/fstab and /boot/grub/menu.lst with UDEV-based names for the root device. For example: smw:~ # head -2 /etc/fstab /dev/disk/by-id/scsi-SATA_ST3320620AS_9QF922J3-part2 / ext3 acl,user_xattr 1 1 /dev/disk/by-id/scsi-SATA_ST3320620AS_9QF922J3-part1 swap swap defaults 0 0 smw:~ # more /boot/grub/menu.lst ###Don't change this comment - YaST2 identifier: Original name: linux### title SUSE Linux Enterprise Server 11 - 2.6.27.19-5 root (hd0,1) kernel /boot/vmlinuz-2.6.27.19-5-default \ root=/dev/disk/by-id/ata-ST3320620AS_5QF00F84-part2 \ resume=/dev/sda1 splash=silent crashkernel=256M-:128M@16M showopts vga=0x31a initrd /boot/initrd-2.6.27.19-5-default

###Don't change this comment - YaST2 identifier: Original name: failsafe### title Failsafe -- SUSE Linux Enterprise Server 11 - 2.6.27.19-5 root (hd0,1) kernel /boot/vmlinuz-2.6.27.19-5-default \ root=/dev/disk/by-id/ata-ST3320620AS_5QF00F84-part2 showopts \ ide=nodma apm=off noresume edd=off powersaved=off nohz=off highres=off processor.max_cstate=1 x11failsafe vga=0x31a initrd /boot/initrd-2.6.27.19-5-default


b. Execute the df command to get the name of the device to use in the /etc/fstab and /boot/grub/menu.lst files to replace the long UDEV-based device name. Then edit your /etc/fstab and /boot/grub/menu.lst files appropriately. 1) Execute the df command to get the name of the device to use in the /etc/fstab and /boot/grub/menu.lst files to replace the long UDEV-based device name. For example: smw:# df Filesystem 1K-blocks Used Available Use% Mounted on /dev/sda2 303528624 40652904 247457340 15% / udev 1030780 460 1030320 1% /dev

2) Save a copy of your /etc/fstab and /boot/grub/menu.lst files.

smw:# cp -p /etc/fstab /etc/fstab.save smw:# cp -p /boot/grub/menu.lst /boot/grub/menu.lst.save

3) Edit your /etc/fstab file appropriately; use the device name (dev) you got from the df command output. In this example, the "1" and "2" refer to the partition names on the device. Change the following lines; this changes the long name disk/by-id/scsi-SATA_ST3320620AS_9QF922J3-part2 to sda2 and disk/by-id/scsi-SATA_ST3320620AS_9QF922J3-part1 to sda1. Ensure that your swap is on sda1:

/dev/disk/by-id/scsi-SATA_ST3320620AS_9QF922J3-part2 / ext3 acl,user_xattr 1 1 /dev/disk/by-id/scsi-SATA_ST3320620AS_9QF922J3-part1 swap swap defaults 0 0

to:

/dev/sda2 / ext3 acl,user_xattr 1 1 /dev/sda1 swap swap defaults 0 0

4) Edit your /boot/grub/menu.lst file appropriately; use the device name (dev) you got from the df command output. Change the long name disk/by-id/scsi-SATA_ST3320620AS_9QF922J3-part2 to sda2. Change the following lines: title SUSE Linux Enterprise Server 11 - 2.6.27.19-5 kernel /boot/vmlinuz-2.6.27.19-5-default \ root=/dev/disk/by-id/ata-ST3320620AS_5QF00F84-part2 \ resume=/dev/sda1 splash=silent crashkernel=256M-:128M@16M showopts vga=0x31a

to: title SUSE Linux Enterprise Server 11 - 2.6.27.19-5 kernel /boot/vmlinuz-2.6.27.19-5-default \ root=/dev/sda2 resume=/dev/sda1 splash=silent \ crashkernel=256M-:128M@16M showopts vga=0x31a


and change the following lines: title Failsafe -- SUSE Linux Enterprise Server 11 - 2.6.27.19-5 kernel /boot/vmlinuz-2.6.27.19-5-default \ root=/dev/disk/by-id/ata-ST3320620AS_5QF00F84-part2 \ showopts ide=nodma apm=off noresume edd=off powersaved=off nohz=off \ highres=off processor.max_cstate=1 x11failsafe vga=0x31a

to: title Failsafe -- SUSE Linux Enterprise Server 11 - 2.6.27.19-5 kernel /boot/vmlinuz-2.6.27.19-5-default \ root=/dev/sda2 showopts ide=nodma apm=off noresume edd=off \ powersaved=off nohz=off highres=off processor.max_cstate=1 x11failsafe vga=0x31a

5) Verify that the edited files are correct and match the output of the df command. smw:~ # head -2 /etc/fstab /dev/sda2 / ext3 acl,user_xattr 1 1 /dev/sda1 swap swap defaults 0 0 smw:~ # more /boot/grub/menu.lst ###Don't change this comment - YaST2 identifier: Original name: linux### title SUSE Linux Enterprise Server 11 - 2.6.27.19-5 root (hd0,1) kernel /boot/vmlinuz-2.6.27.19-5-default root=/dev/sda2 \ resume=/dev/sda1 splash=silent crashkernel=256M-:128M@16M showopts initrd /boot/initrd-2.6.27.19-5-default

###Don't change this comment - YaST2 identifier: Original name: failsafe### title Failsafe -- SUSE Linux Enterprise Server 11 - 2.6.27.19-5 root (hd0,1) kernel /boot/vmlinuz-2.6.27.19-5-default \ root=/dev/sda2 showopts ide=nodma apm=off noresume edd=off powersaved=off nohz=off highres=off initrd /boot/initrd-2.6.27.19-5-default

5. Update the grub device table to recognize any new drives added since the initial operating system installation.

Caution: Although all of the disks connected to the SMW are available to the system, grub only detects the first 16 devices. Therefore, if you add a disk to the SMW after the SMW is connected to the boot RAID, it is advisable to reboot the SMW before continuing this procedure.

a. Back up the current grub device mapping file. smw:~ # mv /boot/grub/device.map /boot/grub/device.map-YYYYMMDD

b. Invoke the grub utility to create a new device mapping file.

smw:~ # grub --device-map=/boot/grub/device.map Probing devices to guess BIOS drives. This may take a long time. GNU GRUB version 0.97 (640K lower / 3072K upper memory) grub> quit


The file /boot/grub/device.map is now updated to reflect all drives, utilizing the standardized drive naming. This file can be viewed for verification; for example: smw:~ # cat /boot/grub/device.map (fd0) /dev/fd0 (hd0) /dev/sda (hd1) /dev/sdc

6. Create a new file system on the backup drive root partition by executing the mkfs command. smw:~ # mkfs -t ext3 /dev/sdb2 mke2fs 1.41.1 (01-Sep-2008) Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) 19275776 inodes, 77091918 blocks 3854595 blocks (5.00%) reserved for the super user First data block=0 Maximum filesystem blocks=4294967296 2353 block groups 32768 blocks per group, 32768 fragments per group 8192 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 4096000, 7962624, 11239424, 20480000, 23887872, 71663616

Writing inode tables: done Creating journal (32768 blocks): done Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 33 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override. smw:~ #

7. Mount the new backup root file system on /mnt.

smw:~ # mount /dev/sdb2 /mnt

8. Confirm the running root file system device.

smw:~ # df Filesystem 1K-blocks Used Available Use% Mounted on /dev/sda2 303528624 6438700 281671544 3% / udev 1030332 116 1030216 1% /dev /dev/sdb2 306128812 195568 290505224 1% /mnt

The running root file system device is the one mounted on /.


9. Dump the running root file system to the backup drive.

smw:~ # cd /mnt smw:~ # dump 0f - /dev/sda2 | restore rf - DUMP: WARNING: no file` / etc/dumpdates' DUMP: Date of this level 0 dump: Thu Nov 11 06:55:29 2010 DUMP: Dumping /dev/sda2 (/) to standard output DUMP: Label: none DUMP: Writing 10 Kilobyte records DUMP: mapping (Pass I) [regular files] DUMP: mapping (Pass II) [directories] DUMP: estimated 4003398 blocks. DUMP: Volume 1 started with block 1 at: Thu Nov 11 06:57:38 2010 DUMP: dumping (Pass III) [directories] DUMP: dumping (Pass IV) [regular files] restore: ./lost+found: File exists DUMP: 81.99% done at 10941 kB/s, finished in 0:01 DUMP: Volume 1 completed at: Thu Nov 11 07:04:01 2010 DUMP: Volume 1 4008910 blocks (3914.95MB) DUMP: Volume 1 took 0:06:23 DUMP: Volume 1 transfer rate: 10467 kB/s DUMP: 4008910 blocks (3914.95MB) DUMP: finished in 383 seconds, throughput 10467 kBytes/sec DUMP: Date of this level 0 dump: Thu Nov 11 06:55:29 2010 DUMP: Date this dump completed: Thu Nov 11 07:04:01 2010 DUMP: Average transfer rate: 10467 kB/s DUMP: DUMP IS DONE

10. Install the grub boot loader.

To make the backup drive bootable, reinstall the grub boot facility on that drive.

Caution: Although all of the disks connected to the SMW are available to the system, grub only detects the first 16 devices. Therefore, if you add a disk to the SMW after the SMW is connected to the boot RAID, it is advisable to reboot the SMW before continuing this procedure. a. Create a unique file on the backup drive to be used to identify that drive to the grub boot facility. smw:~ # cd / smw:~ # touch /mnt/THIS_IS_SDX

b. Invoke the grub boot utility. Within the grub boot utility:

1) Execute the find command to locate the drive designation that grub uses.

2) Select the drive to which the boot blocks will be installed with the root command.

3) Use the setup command to set up and install the grub boot blocks on that drive.

Note: The Linux grub utility and boot system always refer to drives as hd, regardless of the actual type of drives.


For example: smw:~ # grub --no-curses GNU GRUB version 0.97 (640K lower / 3072K upper memory) [ Minimal BASH-like line editing is supported. For the first word, TAB lists possible command completions. Anywhere else TAB lists the possible completions of a device/filename. ] grub> find /THIS_IS_SDX find /THIS_IS_SDX (hd1,1) grub> root (hd1,1) root (hd1,1) Filesystem type is ext2fs, partition type 0x83 grub> setup (hd1) setup (hd1) Checking if "/boot/grub/stage1" exists... yes Checking if "/boot/grub/stage2" exists... yes Checking if "/boot/grub/e2fs_stage1_5" exists... yes Running "embed /boot/grub/e2fs_stage1_5 (hd1)"... 17 sectors are embedded. succeeded Running "install /boot/grub/stage1 (hd1) (hd1)1+17 p (hd1,1)/boot/grub/stage2 /boot/grub/menu.lst"... succeeded Done. grub> quit

11. Unmount the backup root partition.

smw:~ # umount /dev/sdb2

The drive is now bootable once plugged in or cabled as the primary drive.

7.8 Rack-mount SMW: Setting Up the Bootable Backup Drive as an Alternate Boot Device Warning: You must be running the SUSE Linux Enterprise Server version 11 Service Pack 2 (SLES 11 SP2) SMW base operating system and a release of SMW 6.0 or SMW 7.0 on your SMW in order to perform the procedures in this chapter.

The following procedure modifies a bootable backup drive for a rack-mount SMW in order to boot from and run the rack-mount SMW from the backup root partition.

Important: To boot from this backup drive, the primary boot drive must still be operable and able to boot the grub boot blocks installed. If the backup drive is modified to boot as an alternate boot device, it will no longer function as a bootable backup if the primary drive fails.


Procedure 57. Rack-mount SMW: Setting up the bootable backup drive as an alternate boot device

Note: This procedure will not provide a usable backup drive that can be booted in the event of a primary drive failure.

Caution: The disk device names shown in this procedure are only examples. You should substitute the actual disk device names for your system. The boot disk is phy7 and is slot 0, and the bootable backup disk is phy6 and is slot 1. 1. Mount the backup drive's root partition. smw:~ # mount /dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0-part2 /mnt

2. Create a new boot entry in the /boot/grub/menu.lst file. This entry should be a duplicate of the primary boot entry with the following changes: • Modify the title to uniquely identify the backup boot entry.

• Modify the root (hd0,1) directive to reflect the grub name of the backup drive.

• Modify the root= and resume= specifications to reference the backup drive device.


An example /boot/grub/menu.lst file follows. Note the new entry for the backup drive. This example references phy7 (slot 0) as the primary drive and phy6 (slot 1) as the backup drive. smw:~ # cp -p /boot/grub/menu.lst /boot/grub/menu.lst.20110317 smw:~ # vi /boot/grub/menu.lst smw:~ # cat /boot/grub/menu.lst # Modified by YaST2. Last modification on Wed Jun 27 12:32:43 CDT 2012 default 0 timeout 8 ##YaST - generic_mbr gfxmenu (hd0,1)/boot/message ##YaST - activate

###Don't change this comment - YaST2 identifier: Original name: linux### title SUSE Linux Enterprise Server 11 SP2 - 3.0.26-0.7 root (hd0,1) kernel /boot/vmlinuz-3.0.26-0.7-default \ root=/dev/disk/by-path/pci-0000:05:00.0-sas-phy7-0x4433221107000000-lun-0-part2 \ resume=/dev/disk/by-path/pci-0000:05:00.0-sas-phy7-0x4433221107000000-lun-0-part1 \ splash=silent crashkernel=256M-:128M showopts vga=0x31a initrd /boot/initrd-3.0.26-0.7-default

### New entry allowing a boot of the back-up drive when the primary drive ### is still present. title BACK-UP DRIVE - SUSE Linux Enterprise Server 11 SP2 - 3.0.26-0.7 root (hd1,1) kernel /boot/vmlinuz-3.0.26-0.7-default \ root=/dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0-part2 \ resume=/dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0-part1 \ splash=silent crashkernel=256M-:128M showopts vga=0x31a initrd (hd0,1)/boot/initrd-3.0.26-0.7-default

###Don't change this comment - YaST2 identifier: Original name: failsafe### title Failsafe -- SUSE Linux Enterprise Server 11 SP2 - 3.0.26-0.7 root (hd0,1) kernel /boot/vmlinuz-3.0.26-0.7-default \ root=/dev/disk/by-path/pci-0000:05:00.0-sas-phy7-0x4433221107000000-lun-0-part2 \ showopts ide=nodma apm=off noresume edd=off powersaved=off \ nohz=off highres=off processor.max_cstate=1 nomodeset x11failsafe vga=0x31a initrd /boot/initrd-3.0.26-0.7-default

3. Modify the backup drive's /etc/fstab file to reference the secondary drive slot rather than the first drive slot. a. Examine the backup drive's fstab file. smw:~ # cp -p /mnt/etc/fstab /mnt/etc/fstab.20110317 smw:~ # cat /mnt/etc/fstab /dev/disk/by-path/pci-0000:05:00.0-sas-phy7-0x4433221107000000-lun-0-part1 \ swap swap defaults 0 0 /dev/disk/by-path/pci-0000:05:00.0-sas-phy7-0x4433221107000000-lun-0-part2 \ / ext3 acl,user_xattr 1 1 proc /proc proc defaults 0 0 sysfs /sys sysfs noauto 0 0 debugfs /sys/kernel/debug debugfs noauto 0 0 usbfs /proc/bus/usb usbfs noauto 0 0 devpts /dev/pts devpts mode=0620,gid=5 0 0


b. Edit the /mnt/etc/fstab file, changing phy7 to phy6 device names to reference the backup drive. In the following example, the backup drive is phy6:1-.... smw:~ # vi /mnt/etc/fstab smw:~ # cat /mnt/etc/fstab /dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0-part1 \ swap swap defaults 0 0 /dev/disk/by-path/pci-0000:05:00.0-sas-phy6-0x4433221106000000-lun-0-part2 \ / ext3 acl,user_xattr 1 1 proc /proc proc defaults 0 0 sysfs /sys sysfs noauto 0 0 debugfs /sys/kernel/debug debugfs noauto 0 0 usbfs /proc/bus/usb usbfs noauto 0 0 devpts /dev/pts devpts mode=0620,gid=5 0 0

4. Unmount the backup drive.

smw:~ # umount /mnt

The SMW can now be shut down and rebooted. Upon display of the Please select boot device prompt, select the BACK-UP DRIVE - SLES 11 entry to boot the backup root partition.

7.9 Desk-side SMW: Setting Up the Bootable Backup Drive as an Alternate Boot Device The following procedure modifies a bootable backup drive for a desk-side SMW in order to boot from and run the desk-side SMW from the backup root partition. Important: To boot from this backup drive, the primary boot drive must still be operable and able to boot the grub boot blocks installed. If the backup drive is modified to boot as an alternate boot device, it will no longer function as a bootable backup if the primary drive fails.

Procedure 58. Desk-side SMW: Setting up the bootable backup drive as an alternate boot device

Note: This procedure will not provide a usable backup drive that can be booted in the event of a primary drive failure.

Caution: The disk device names shown in this procedure are provided as examples only. Substitute the correct disk devices for your system. For example, on an SMW with three SMW disks, the boot disk is /dev/sda and the bootable backup disk is /dev/sdc; on an SMW with two SMW disks, the boot disk is /dev/sda and the bootable backup disk is /dev/sdb. 1. Mount the backup drive's root partition.

smw:~ # mount /dev/sdX2 /mnt


2. Create a new boot entry in the /boot/grub/menu.lst file. This entry should be a duplicate of the primary boot entry with the following changes:

• Modify the title to uniquely identify the backup boot entry.

• Modify the root (hd0,1) directive to reflect the grub name of the backup drive. If you do not know the grub name of the backup drive, it is provided in the /boot/grub/device.map file on the primary drive.

• Modify the root= and resume= specifications to reference the backup drive device.

An example /boot/grub/menu.lst file follows. Note the new entry at the end of the file. This example references /dev/sda as the primary drive and /dev/sdc as the backup drive. smw:~ # cat /boot/grub/menu.lst # Modified by YaST2. Last modification on Wed Dec 9 15:09:52 UTC 2009 default 0 timeout 8 ##YaST - generic_mbr gfxmenu (hd0,1)/boot/message ##YaST - activate

###Don't change this comment - YaST2 identifier: Original name: linux### title SUSE Linux Enterprise Server 11 - 2.6.27.19-5 root (hd0,1) kernel /boot/vmlinuz-2.6.27.19-5-default root=/dev/sda2 \ resume=/dev/sda1 splash=silent crashkernel=256M-:128M@16M showopts vga=0x31a \ initrd /boot/initrd-2.6.27.19-5-default

###Don't change this comment - YaST2 identifier: Original name: failsafe### title Failsafe -- SUSE Linux Enterprise Server 11 - 2.6.27.19-5 root (hd0,1) kernel /boot/vmlinuz-2.6.27.19-5-default root=/dev/sda2 showopts \ ide=nodma apm=off noresume edd=off powersaved=off nohz=off highres=off \ processor.max_cstate=1 x11failsafe vga=0x31a \ initrd /boot/initrd-2.6.27.19-5-default

###Don't change this comment - YaST2 identifier: Original name: floppy### title Floppy rootnoverify (fd0) chainloader +1

### New entry allowing a boot of the back-up drive when the primary drive ### is still present. title BACK-UP DRIVE - SUSE Linux Enterprise Server 11 - 2.6.27.19-5 root (hd0,1) kernel /boot/vmlinuz-2.6.27.19-5-default root=/dev/sdc2 \ resume=/dev/sdc1 splash=silent crashkernel=256M-:128M@16M showopts vga=0x31a \ initrd /boot/initrd-2.6.27.19-5-default

3. Modify the backup drive's /etc/fstab file to reference the secondary drive slot rather than the first drive slot.


a. Examine the backup drive's fstab file. smw:~ # cat /mnt/etc/fstab /dev/sda1 swap swap defaults 0 0 /dev/sda2 / ext3 acl,user_xattr 1 1 proc /proc proc defaults 0 0 sysfs /sys sysfs noauto 0 0 debugfs /sys/kernel/debug debugfs noauto 0 0 usbfs /proc/bus/usb usbfs noauto 0 0 devpts /dev/pts devpts mode=0620,gid=5 0 0

b. Edit the /mnt/etc/fstab file, changing /dev/sda1 and /dev/sda2 to reference the backup drive. In the following example, the backup drive is /dev/sdc. smw:~ # vi /mnt/etc/fstab /dev/sdc1 swap swap defaults 0 0 /dev/sdc2 / ext3 acl,user_xattr 1 1 proc /proc proc defaults 0 0 sysfs /sys sysfs noauto 0 0 debugfs /sys/kernel/debug debugfs noauto 0 0 usbfs /proc/bus/usb usbfs noauto 0 0 devpts /dev/pts devpts mode=0620,gid=5 0 0

4. Unmount the backup drive.

smw:~ # umount /dev/sdX2

The SMW can now be shut down and rebooted. Upon display of the Please select boot device prompt, select the BACK-UP DRIVE - SLES 11 entry to boot the backup root partition.

7.10 Archiving the SDB The service database (SDB) can be archived by using the mysqldump command. For more information, see http://www.dev.mysql.com/doc/.
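For example, a minimal archive sketch (the database name XTAdmin, the account, and the output path are assumptions; substitute the values in use at your site):

sdb:~ # mysqldump -u root -p XTAdmin > /var/tmp/sdb_backup.sql

The resulting SQL file can later be replayed with the mysql client to repopulate the database.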

7.11 Backing Up Limited Shared-root Configuration Data Cray recommends that, if you cannot make a full copy, you make a backup copy of the .shared directory structure before making significant changes to the shared root. You can use the xtoparchive utility or the Linux utilities (cp, tar, cpio) to save the shared-root file system. Run these procedures from the boot node.


7.11.1 Using the xtoparchive Utility to Archive the Shared-root File System Use the xtoparchive command to perform operations on an archive of shared root configuration files. Run the xtoparchive command on the boot node using the xtopview utility in the default view. The archive is a text-based file similar to a tar file and is specified using the required archivefile command-line argument. The xtoparchive command is intended for configuration files only. Binary files will not be archived. If a binary file is contained within a specification file list, it will be skipped and a warning will be issued. Example 97. Using the xtoparchive utility to archive the shared-root file system

Use the following xtoparchive command to add files specified by the specifications listed in specfile to the archive file archive.042208; create the archive file if it does not already exist:

% xtoparchive -a -f specfile archive.042208

Note: To archive any specialized files that have changed, invoke the archive_etc.sh script. You can do this while your system is booted or from the boot root and shared root in a system set that is not booted. The archive_etc.sh script uses the xtoprdump and xtoparchive commands to generate an archive of specialized files on the shared root. For more information about archiving and upgrading specialized files, see the shared_root(5), xtoparchive(8), xtopco(8), xtoprdump(8), and xtoprlog(8) man pages.

7.11.2 Using Linux Utilities to Save the Shared-root File System Use the Linux utilities (cp, tar, cpio) to save the shared-root file system.

Procedure 59. Backing up limited shared-root configuration data

Cray recommends that, if you cannot make a full copy, you make a backup copy of the .shared directory structure before making significant changes to the shared root. Run this procedure from the boot node.

1. Change to the shared root directory that you are backing up.

boot:/rr # cd /rr/current

2. Create a tar file for the directory.

boot:/rr/current # tar czf /rr/dot_shared-20120929.tgz .shared

3. Change to the /rr directory.

boot:/rr/current # cd /rr


4. Verify that the file exists.

boot:/rr # ls -al dot_shared-20120929.tgz -rw-r--r-- 1 root root 7049675 Sep 29 14:21 dot_shared-20120929.tgz boot:/rr #

For more information, see the cp(1), tar(1), and cpio(1) man pages.
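If you prefer cpio to tar, an equivalent sketch (the archive file name is an example) is:

boot:/rr/current # find .shared -print | cpio -o > /rr/dot_shared-20120929.cpio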

7.12 Backing Up Boot Root and Shared Root Before you back up your boot root and shared root, consider the following issues.

• You must be root to do this procedure. • Do not have file systems mounted on the SMW and the Cray system at the same time.

• File system device names may be different at your site.

• If the backup file systems have not been used yet, you may need to run mkfs first. • File systems should be quiescent and clean (fsck) to get an optimal dump and restore. You can back up the boot root and the shared root by using the xthotbackup command or by using the Linux dump and restore commands.
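For example, to verify that the backup file systems are clean before running the dump and restore commands in Procedure 60, fsck can be run while they are unmounted (a minimal sketch; the device names are examples only):

smw:~ # fsck -f /dev/sdb1
smw:~ # fsck -f /dev/sdg6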


7.12.1 Using the xthotbackup Command to Back Up Boot Root and Shared Root Execute the xthotbackup command on the SMW to create a bootable backup. The xthotbackup command must be executed with root privileges. The system set labels in /etc/sysset.conf define disk partitions for backup and source system sets which are used by xthotbackup to generate the appropriate dump and restore commands. The entire contents of the disk partitions defined in a source system set are copied to the corresponding disk partitions in the backup system set. The backup and source system sets must be configured with identical partitions. (Follow the steps provided on the xthotbackup(8) man page in the Initial Setup section to set up identical system sets.) The disk partitions in the backup system set are formatted prior to the dump and restore commands. Important: The xthotbackup utility can also work with Logical Volume Manager (LVM) volumes, but this requires extra configuration before LVM snapshots can be created. For more information on LVM configuration and xthotbackup use with LVM, see Installing and Configuring Cray Linux Environment (CLE) Software and the xthotbackup(8) man page.

Note: By default, xthotbackup forces file system checks by using fsck -f, unless you use the xthotbackup -n option. All fsck activity is done in parallel (by default, xthotbackup uses the fsck -p option to check all file systems in parallel), unless you use the xthotbackup -l option.

Load the cray-install-tools module to access the xthotbackup utility and the xthotbackup(8) man page. Example 98. Using the xthotbackup command to create a bootable backup system set

Enter the following to dump all of the partitions from the source label, BLUE, to the backup label, GREEN, and then make them bootable.

Warning: Do not use the xthotbackup command when either the source or destination system set is booted. Running xthotbackup with a booted system set or partition could cause data corruption.

smw:~ # xthotbackup -a -b BLUE GREEN

The xthotbackup command can also be used to copy selected file systems from source to the backup system set. Example 99. Using the xthotbackup command to copy selected file systems from source to the backup system set

The following example dumps only the SDB and SYSLOG partitions in the system set labelled BLUE to the system set labelled GREEN.


Warning: Do not use the xthotbackup command when either the source or destination system set is booted. Running xthotbackup with a booted system set or partition could cause data corruption.

smw:~ # xthotbackup -f SDB,SYSLOG BLUE GREEN

7.12.2 Using dump and restore Commands to Back Up Boot Root and Shared Root

Procedure 60. Backing up the boot root and shared root using the dump and restore commands

1. Verify that the Cray system is halted.

2. Open a root session. crayadm@smw:~> su -

3. Mount the boot root to the SMW.

smw:~ # mount /dev/sda1 /bootroot0

4. Mount the backup boot root to the SMW.

smw:~ # mount /dev/sdb1 /bootroot1

5. Change directories to the backup boot root.

smw:~ # cd /bootroot1

6. Dump and restore boot root to the backup boot root.

smw:/bootroot1 # dump -0 -f - /bootroot0 | restore -rf -

7. When the dump is complete, unmount both boot-root file systems. smw:/bootroot1 # cd / smw:/ # umount /bootroot0 /bootroot1

8. Mount the shared root to the SMW.

smw:/ # mount /dev/sdc6 /sharedroot0

9. Mount the backup shared root to the SMW. smw:/ # mount /dev/sdg6 /sharedroot1

10. Change directories to the backup shared root.

smw:/ # cd /sharedroot1

11. Dump and restore shared root to the backup shared root.

smw:/sharedroot1 # dump -0 -f - /sharedroot0 | restore -rf -


12. When the dump is complete, unmount both shared root file systems.

smw:/sharedroot1 # cd / smw:/ # umount /sharedroot0 /sharedroot1

13. Exit the root session. smw:~ # exit

7.13 Backing Up User Data Backing up user data is a site-specific activity. You can use Linux utilities to back up user files and directories.

7.14 Rebooting a Stopped SMW If the SMW is down or being rebooted (that is, not fully working), the blade controllers will automatically throttle the high-speed network because they are no longer hearing SMW heartbeats. This is done in order to prevent possible network congestion, which normally requires the SMW to be up in order to respond to such congestion. Once the SMW is up again, the blade controllers will unthrottle the network. (No attempt is made to prevent loss of data or to carry out operations that occur when the SMW is offline.) The consequences of throttling are that the network will perform much more slowly than normal.

When the SMW comes up, it restarts, establishes communications with all external interfaces, restores the proper state in the state manager, and continues normal operation without user intervention.

For a scheduled or unscheduled shutdown and reboot of the SMW, it is necessary to have uncorrupted configuration files on a local SMW disk.

Procedure 61. Rebooting a stopped SMW

1. Verify that your configuration files contain the most recent system configuration. 2. Boot the SMW.

7.15 SMW Recovery

Procedure 62. SMW primary disk failure recovery

The following procedure describes how to recover an SMW primary disk failure. To find out how to create a System Management Workstation (SMW) bootable backup drive, see Procedure 56 on page 237. To find out how to modify a bootable backup drive, in order to boot from and run the SMW from the backup root partition, see Procedure 57 on page 245.


Caution: Booting from the bootable backup disk is intended only for emergency use in the event of failure or loss of data on the primary disk. To recover an SMW, you must reorder the drives at the front of the SMW. No BIOS or software configuration changes are required.

1. Shut down the OS on the SMW, if possible. 2. Power the SMW off.

3. Unplug the power cord.

4. For a desk-side SMW, open the disk drive access door, which is on the front of the SMW. 5. Remove the primary disk from its slot. For a desk-side SMW, the primary disk is located at the bottom of the column of disk drives at the front of the SMW. For a rack-mount SMW, remove the disk drive that is in slot 0.

6. Remove the bootable backup disk and place it in the primary disk slot. 7. Press the reset button (front), if required.

8. Boot the SMW.

7.16 Restoring the HSS Database When you execute the xtdiscover command, it automatically makes a backup copy of partition information, the entire HSS database, and the /etc/hosts file. In the event that the HSS database is lost or corrupted for any reason, such as a disk failure, you can restore the HSS database.

Procedure 63. Restoring the HSS database

• Execute these commands on the SMW. smw:~> rsms stop smw:~> mysql -uhssds -phssds < /home/crayadm/hss_db_backup/database_backup_file smw:~> cp /home/crayadm/hss_db_backup/hosts_backup_file /etc/hosts smw:~> rsms start

7.17 Recovering from Service Database Failure If you notice problems with the SDB, for example, if commands like xtprocadmin do not work, restart the service-node daemons.


Example 100. Recovering from an SDB failure

Type the following command on the SDB node: sdb:~ # /etc/init.d/sdb restart

Commands in this file stop and restart MySQL.

7.17.1 Database Server Failover The SDB uses dual-ported local RAID to store files.

If you have SDB node failover configured, one service processor acts as the primary SDB server. If the primary server daemon dies, or the node on which it is running dies, the secondary (backup) SDB server that connects to the same RAID storage starts automatically. IP failover directs all new TCP/IP connections to the server, which now becomes the primary SDB server. Connections to the failed server are ended, and an error is reported to the client.

7.17.2 Rebuilding Corrupted SDB Tables The boot process creates all SDB tables except the accounting and boot tables. If you notice a small corruption and you do not want to reboot, you can change the content of a database table manually by using the tools in Table 7. If you cannot recover a database table in any other way, as a last resort reboot the system.

7.18 Using Persistent SCSI Device Names Important: The information provided in this section does not apply to SMW disks.

SCSI device names (/dev/sd*) are not guaranteed to be numbered the same from boot to boot. This inconsistency can cause serious system problems following a reboot. When installing CLE, you must use persistent device names for file systems on your Cray system.

Cray recommends that you use the /dev/disk/by-id/ persistent device names. Use /dev/disk/by-id/ for the root file system in the initramfs image and in the /etc/sysset.conf installation configuration file as well as for other file systems, including Lustre (as specified in /etc/fstab and /etc/sysset.conf). For more information, see Installing and Configuring Cray Linux Environment (CLE) Software.

Alternatively, you can define persistent names using a site-specific udev rule or cray-scsidev-emulation. However, only the /dev/disk/by-id method has been verified and tested.


Caution: You must use /dev/disk/by-id when specifying the root file system. There is no support in the initramfs for cray-scsidev-emulation or custom udev rules.
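To determine which /dev/disk/by-id/ names correspond to a given device, list the directory and note the symbolic link targets; for example (a minimal sketch, run from the boot node):

boot:~ # ls -l /dev/disk/by-id/

Each entry is a symbolic link to the kernel device name (such as ../../sdc) currently assigned to that disk.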

7.19 Using a Linux iptables Firewall to Limit Services You can set up a firewall to limit services that are running on your system. Cray has enabled the Linux kernel to provide this capability. Use the iptables command to set up, maintain, and inspect tables that contain rules to filter IP packets.

For more information about iptables, see the iptables(8) man page.
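For example, a minimal sketch of a restrictive inbound policy that still permits ssh and established connections (shown on a login node; the interfaces, ports, and policy are site decisions, and a mistake here can cut off your own access):

login:~ # iptables -A INPUT -i lo -j ACCEPT
login:~ # iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
login:~ # iptables -A INPUT -p tcp --dport 22 -j ACCEPT
login:~ # iptables -P INPUT DROP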

7.20 Handling Single-node Failures A single-node failure is visible when you use the xtnodestat command.

You can parse the syslog to look for failures.

If you suspect problems with a node, invoke the xtcli status command. Nodes that have failed show an alert status. Jobs are not scheduled on the node as long as the alert is set. If problems persist, consult your service representative.

To see cabinet status, use the System Environmental Data Collections (SEDC); see Using and Configuring System Environment Data Collections (SEDC).

For more information, see the xtnodestat(1), xtcli(8), and xtsedcviewer(8) man pages.
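For example (a minimal sketch; s0 denotes the entire system, and a specific component name can be substituted to narrow the query):

login:~> xtnodestat
smw:~> xtcli status s0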

7.21 Increasing the Boot Manager Time-out Value On systems of 4,000 nodes or larger, the time that elapses until the boot manager receives all responses to the boot requests can be greater than the default 60-second time-out value. This is due, in large part, to the amount of other event traffic that occurs as each compute node generates its console output. To avoid this problem, change the boot_timeout value in the /opt/cray/hss/default/etc/bm.ini file on the SMW to increase the default time-out value, as shown in Example 101. Example 101. Increasing the boot_timeout value

For systems of 4,000 to 7,000 nodes, change the boot_timeout line to

boot_timeout 120

For systems larger than 7,000 nodes, change the boot_timeout line to boot_timeout 180
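To confirm the value currently in effect, display the line directly (a trivial check, not a required step):

smw:~ # grep boot_timeout /opt/cray/hss/default/etc/bm.ini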


7.22 RAID Failure System RAID has its own recovery system that the manufacturer supplies. For more information, refer to the manufacturer documentation.

Using the Application Level Placement Scheduler (ALPS) [8]

ALPS (Application Level Placement Scheduler) is the Cray supported mechanism for placing and launching applications on compute nodes. ALPS provides application placement, launch, and management functionality and cooperates closely with third-party batch systems for application scheduling across Cray systems. The third-party batch systems make policy and scheduling decisions, while ALPS provides a mechanism to place and launch the applications contained within batch jobs.

Note: ALPS application placement and launch functionality is only for applications executing on compute nodes. ALPS does not provide placement or launch functionality on service nodes.

8.1 ALPS Functionality ALPS performs the following functions:

• Assigns application IDs.

• Manages compute node resources by relaying a reservation to third-party batch systems.

• Provides a configurable node selection algorithm for placing applications. (See the ALPS_NIDORDER configuration parameter in /etc/sysconfig/alps Configuration File on page 267 for more information.)

• Launches applications. • Delivers signals to applications.

• Returns stdout and stderr from applications.

• Provides application placement and reservation information. • Supports batch and interactive workloads.

• Supports huge pages functionality for CNL applications.

• Provides an XML interface for third-party batch-system communication. • Provides launch assistance to debuggers, such as TotalView.


• Supports application placement of nonuniform numbers of processing elements (PEs) per node, allowing full use of all compute node resources on mixed-node machines. • Works with the CLE Node Health software to perform application cleanup following the non-orderly exit of an application (see ALPS and Node Health Monitoring Interaction on page 279). For additional information about the CLE Node Health software, see Configuring Node Health Checker (NHC) on page 164.

8.2 ALPS Architecture The ALPS architecture includes the following clients and daemons, each intended to fulfill a specific set of responsibilities as they relate to application and system resource management. The ALPS components use TCP/IP sockets and User Datagram Protocol (UDP) datagrams to communicate with each other. The apinit daemon executes on compute nodes. All other ALPS components execute on service nodes (login, SDB, and boot nodes).

ALPS clients (for detailed descriptions, see ALPS Clients on page 261 and the man page for each ALPS client):

• aprun: Application submission

• apstat: Application placement and reservation status • apkill: Application signaling

• apmgr: Collection of functions usually used by the system administrator in exceptional circumstances to manage ALPS

• apbasil: Workload manager interface ALPS daemons (for detailed descriptions, see ALPS Daemons on page 264 and the man page for each ALPS daemon): • apsys: Client local privileged contact

• apinit: Application management on compute nodes

• apsched: Reservations and placement decisions • apbridge: System data collection

• apwatch: Event monitoring

• apres: ALPS database event watcher restart daemon ALPS uses memory-mapped files to consolidate and distribute data efficiently, reducing the demand on the daemons that maintain these files by allowing clients and other daemons direct access to data they require. Figure 4 illustrates the ALPS process.


Figure 4. ALPS Process [Diagram: aprun and apsys run on a login node; apsched runs on the SDB node; apbridge, apwatch, and apres run on the boot node; on the compute nodes, apinit daemons and their application shepherd children manage the application processes, with PE0 on the first node. The key distinguishes parent/child relationships from network connections.]

8.2.1 ALPS Clients The ALPS clients provide the user interface to ALPS and application management. They are separated into the following distinct areas of functionality: application submission, application and reservation status, application signaling, administrator interface to ALPS, and batch system integration.


8.2.1.1 The aprun Client The aprun client is used for application submission. Specifically, a user executes the aprun command to run a program across one or more compute nodes. The aprun client serves as the local representative of the application and is the primary interface between the user and an application running on compute nodes. The aprun client parses command-line arguments to determine the application resource requirements. These requirements are submitted locally to apsys, which forwards them to apsched for application placement.

After the application has an assigned placement list of compute nodes, aprun provides application-launch information to the apinit daemon on the first compute node in the placement list. The aprun client also provides user identity and environment information to apinit so that the user's login node session can be replicated for the application on the assigned set of compute nodes. This information includes the aprun current working directory, which must be accessible from the compute nodes.

The aprun client forwards stdin data to apinit, which is delivered to the first processing element (PE) of the application. Application stdout and stderr data is sent from apinit to aprun on the login node.

The aprun client catches certain signals (see the aprun(8) man page) and forwards the signal information to apinit for delivery to the application. Any signal that cannot be caught and that terminates aprun causes apinit to terminate the application. Note: Do not suspend aprun. It is the local representative of the application that is running on compute nodes. If aprun is suspended, the application cannot communicate with ALPS, such as sending exit notification to aprun that the application has completed.

For more information about using the aprun command, see the aprun(8) man page.
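For example, a minimal launch sketch (the executable name and the values are illustrative only):

login:~> aprun -n 64 -N 16 ./my_app

Here -n requests 64 processing elements in total and -N places 16 of them on each compute node.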

8.2.1.2 The apstat Client The apstat client reports on application placement and reservation information. It reflects the state of apsched placement decisions. The apstat client does not have dynamic run-time information about an application, so the apstat display does not imply anything about the running state of an application. The apstat display indicates statically that an application was placed and that the aprun claim against the reserved resources has not yet been released.

If no application ID (apid) is specified when executing the apstat command, the apstat command displays a brief overview of all applications.

For detailed information about this status information, see the apstat(1) man page.
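For example (the apid shown is illustrative):

login:~> apstat
login:~> apstat 204570

The first form gives the brief overview of all placed applications; the second reports on a single application ID.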


8.2.1.3 The apkill Client The apkill client is used for application signaling. It parses the command-line arguments and sends signal information to its local apsys daemon. The apkill command can be invoked on any login or service node and does not need to be on the same node as the aprun client for that application. Based upon the application ID, apsys finds the aprun client for that application and sends the signal to aprun, which sends signal information to apinit for delivery to the application. The apkill client can send a signal only to a placed application, not a pending application. For more information about the actions of this client, see the apkill(1) man page and the Linux signal(7) man page.
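For example, a minimal sketch (the apid is illustrative):

login:~> apkill 204570

With no signal argument the default signal is delivered; see the apkill(1) man page for how to name a specific signal.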

8.2.1.4 The apmgr Client The apmgr command is a collection of ALPS-related functions for use by system administrators. These functions (subcommands) often require root permission and are usually used in exceptional circumstances to manage ALPS. The apmgr command is not typically installed on the boot node's file system; it is available on and is run from service nodes other than the boot node. For information about using the apmgr subcommands, see the apmgr(8) man page.

8.2.1.5 The apbasil Client The apbasil client is used for batch system integration. It is the interface between ALPS and the batch scheduling system. The apbasil client implements the Batch and Application Scheduler Interface Layer (BASIL). When a job is submitted to the batch system, the batch scheduler uses apbasil to obtain ALPS information about available and assigned compute node resources to determine whether sufficient compute node resources exist to run the batch job.

After the batch scheduler selects a batch job to run, the batch scheduler uses apbasil to submit a resource reservation request to the local apsys daemon. The apsys daemon forwards this reservation request to apsched. If the reservation-request resources are available, specific compute node resources are reserved at that time for the batch scheduler use only.

When the batch job is initiated, the prior confirmed reservation is bound to this particular batch job. Any aprun client invoked from within this batch job can claim compute node resources only from this confirmed reservation. The batch system uses apbasil to cancel the confirmed reservation after the batch job terminates. The apbasil client again contacts the local apsys daemon to forward the cancel-reservation request to apsched. The compute node resources from that reservation are available for other use after the application has been released.


For additional information, see the apbasil(1) and basil(7) man pages.

8.2.2 ALPS Daemons ALPS daemons provide support for application submission, placement, execution, and cleanup on the system.

8.2.2.1 The apbridge Daemon The apbridge daemon collects data about the hardware configuration from the service database (SDB) and sends it to the apsched daemon. It also works with the apwatch daemon to supply ongoing compute node status information to apsched. The apbridge daemon is the bridge from the architecture-independent ALPS software to the architecture-dependent specifics of the underlying system.

The apbridge daemon is not intended for direct use; it is only installed in the boot root and is invoked from within /etc/init.d/alps.

For more information, see the apbridge(8) man page.

8.2.2.2 The apsched Daemon The apsched daemon manages memory and processor resources of applications running on compute nodes.

Note: Only one instance of the ALPS scheduler can run across the entire system at a time.

When apsched receives a request for application placement from aprun, it either returns a message regarding placement or a message indicating why placement is not possible (errors in the request or temporarily unavailable resources). When an application terminates, an exit message is sent to apsched, and it releases the resources reserved for the application.

The apsched daemon writes a log file on the node on which apsched is executing. By default, this is the SDB node. For more information, see the apsched(8) man page.

8.2.2.3 The apsys Daemon The apsys daemon provides a central privileged point of contact and coordination between ALPS components running on login and other service nodes. The apsys daemon receives incoming requests and forks child agent processes to delegate responsibilities and improve scalability and responsiveness. An apsys daemon executes on each login node and writes a log file on each login node.


Each aprun client has an apsys agent associated with it. Those two programs are on the same login node and communicate with each other over a persistent TCP/IP socket connection that lasts for the lifetime of the aprun client. The apsys daemon passes aprun messages to apsched over a transitory TCP/IP socket connection and returns the response to aprun.

An apsys agent is created to service apbasil and apkill messages. These programs communicate over transitory TCP/IP socket connections. The apsys agent handles the apkill message itself and forwards apbasil messages to apsched.

Each apsys agent maintains a separate agents file that is located in the ALPS shared directory. The file name format is agents.nid, for example, /ufs/alps_shared/agents.40. For information about defining the ALPS shared directory, see /etc/sysconfig/alps Configuration File on page 267.

For more information, see the apsys(8) man page.

8.2.2.4 The apwatch Daemon The apwatch daemon waits for events and sends compute node status changes to apbridge, which sends it to apsched. The apwatch daemon is not intended for direct use; it is only installed in the boot root and is invoked from within /etc/init.d/alps.

For more information, see the apwatch(8) man page.

8.2.2.5 The apinit Daemon

The apinit daemon launches and manages new applications. A master apinit daemon resides on every compute node, initiates all new activity on that node, and writes a log file on the compute node. The aprun client connects to the apinit daemon on the first node of an application's allocated node set and sends a launch message containing all of the information the compute nodes need to launch and manage the new application.

The apinit daemon then forks a child process (referred to as the apshepherd or just shepherd) and transfers responsibility for managing the application on that node to that child. If the application requires more compute nodes, the shepherd process communicates to the apinit daemon on the next compute node, which forks another shepherd child process. If the application is placed on more than one compute node, ALPS uses a TCP fan-out control tree network for application management messages to do binary transfer of the application when requested, and to handle application stdin, stdout, and stderr data. The root of the fan-out control tree is aprun. The width of the fan out is configured within the /etc/opt/cray/alps/alps.conf file and is 32, by default.


The apinit daemon is under the control of RCA. If the apinit daemon fails, RCA restarts apinit. If RCA is unable to restart apinit after several attempts, ALPS is notified and the node is made unavailable (DOWN) for applications. For more information, see the apinit(8) man page.

8.2.2.6 The apres Daemon

The ALPS apres event watcher restart daemon registers with the event router daemon to receive ec_service_started events. When the service type is the SDB (RCA_SVCTYPE_SDBD), ALPS updates its data to reflect the current values in the SDB. The apres daemon is invoked as part of the ALPS startup process on the boot node.

The apres daemon is not intended for direct use; it is only installed in the boot root and is invoked from within /etc/init.d/alps.

For more information, see the apres(8) man page.

8.2.2.7 ALPS Log Files

Each of the ALPS daemons writes information to its log file in /var/opt/cray/alps on the node that runs the daemon. The name of the log file consists of the daemon name appended with the month and day, such as apsched0302.

The apinit log file is in the /var/opt/cray/alps directory on each compute node and also has a node ID appended to it, such as apinit0302.00206. Because this directory is in memory, the apinit log file is lost when a compute node is rebooted.

Each system has one apbridge daemon, one apwatch daemon, and one apres daemon, all of which must execute on the same node. By default, this is the boot node. These three daemons write to one log file on that node. The log file name format is apbridgemmdd, for example, apbridge1027.
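As an illustration only (file names depend on the date and on which daemons run on each node), listing the log directories might show something like the following:

boot:~ # ls /var/opt/cray/alps
apbridge0302
sdb:~ # ls /var/opt/cray/alps
apsched0302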

8.2.2.8 Changing Debug Message Level of apsched and apsys Daemons

The level of debug messages written by the apsched and apsys daemons is defined in the /etc/opt/cray/alps/alps.conf configuration file. You can change the debug level dynamically by modifying the alps.conf file and sending a SIGHUP signal to apsched or apsys, as applicable, to read the alps.conf file.
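For example, after changing the debug value in alps.conf, a SIGHUP can be delivered to the affected daemon. The following is a minimal sketch that assumes apsched runs on the SDB node, apsys runs on a login node, and killall is available on those nodes:

sdb:~ # killall -HUP apsched
login:~ # killall -HUP apsys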


8.3 Configuring ALPS

ALPS uses the following three files:

• /etc/sysconfig/alps configuration file

• /etc/opt/cray/alps/alps.conf configuration file

• /etc/init.d/alps file, which is used to start and stop ALPS components and does not require customization

Note: When configuring the RAID LUNs (logical units), verify that write caching is enabled on the LUN that contains the ALPS shared file system. For more information about RAID configuration, see Installing Cray System Management Workstation (SMW) Software and Installing and Configuring Cray Linux Environment (CLE) Software.

8.3.1 /etc/sysconfig/alps Configuration File

The /etc/sysconfig/alps file is in both the boot root and in the shared root. If you defined the ALPS-related parameters in your CLEinstall.conf file, after installation the parameters and settings are placed into your /etc/sysconfig/alps file.

If you do not define the ALPS-related parameters in your CLEinstall.conf file, to use ALPS you must define the parameters in your /etc/sysconfig/alps file (required parameters are indicated) and then start the ALPS daemons.

Note: When changing parameter settings, update the /etc/sysconfig/alps file in both the boot root and in the shared root and restart the ALPS daemons on all service nodes.

ALPS_MASTER_NODE

(Required) Specifies the node name (uname -n) of the service node that runs apsched. Cray recommends that the SDB node be used as the ALPS_MASTER_NODE. For example: ALPS_MASTER_NODE="nid00005"

ALPS_BRIDGE_NODE

(Required) Specifies the node name (uname -n) of the service node that runs apbridge. This is usually the boot node. Network connectivity between the SMW and the node specified by ALPS_BRIDGE_NODE is required. (Such connectivity is guaranteed to exist from the boot node.) This default value is enforced in the /etc/init.d/alps file. For example: ALPS_BRIDGE_NODE="boot"


ALPS_MOUNT_SHARED_FS

Specifies if a separate file system is to be mounted at ALPS startup to hold control data; default is no. For configurations using multiple login nodes, a shared file system is required, and the shared file system must be mounted before ALPS is started. For example: ALPS_MOUNT_SHARED_FS="no"

ALPS_SHARED_DEV_NAME

Specifies the device to mount at ALPS start-up. If it is null and ALPS_MOUNT_SHARED_FS is yes, the device is determined by /etc/fstab. This parameter is not used unless yes is specified for ALPS_MOUNT_SHARED_FS. For example: ALPS_SHARED_DEV_NAME="ufs:/ufs/alps_shared"

ALPS_SHARED_MOUNT_OPTIONS

Specifies the shared mount options. Set this parameter only if ALPS_MOUNT_SHARED_FS is yes and ALPS_SHARED_DEV_NAME is not null. For example: ALPS_SHARED_MOUNT_OPTIONS="-t nfs -o tcp,rw"

ALPS_IP_PREFIX

(Deferred implementation) Use of this parameter has no effect. Specifies the first two octets for IP addresses on the high-speed network (HSN). These are internal addresses within the HSN. For example: ALPS_IP_PREFIX="192.168"

ALPS_NIDORDER

Specifies the sorting order to use for assigning nodes for an application's use. For example: ALPS_NIDORDER="-Ox".

-O2 Assigns order of nodes based on the "2x2x2" NID reordering method, which is a merger of the incidental small node packing of the simple NID number method and the inter-application interaction reduction of the "xyz" method.

Note: Cray recommends this option for Cray XE systems. Use of this option results in better application performance for larger applications.

-On Assigns order of nodes based solely on ascending numerical Node ID (NID) order. This is the default sorting order.


-Ox Assigns order of nodes by maximum dimension as the outer dimension; the smallest dimension will change most quickly. For example, a system whose topology is described as a 6x12x8 system would have the y dimension varied last and the x dimension varied most rapidly, ordering them as (0,0,0), (1,0,0), (2,0,0), (3,0,0), (4,0,0), (5,0,0), (0,0,1), (1,0,1), (2,0,1) and so on. This option often improves application performance over using the -On option. Applications that use a small percentage of the machine, especially on machines that are largely cubic in their dimensions, may not benefit from this configuration.

-Oy Specifies to order by y-axis last (for Cray XE systems of more than three cabinets).

-Or Assigns order of nodes by minimum dimension, a reverse of -Ox; in the example above, the y dimension would vary most quickly, the x least.

If ALPS_NIDORDER is not specified, the default action is -On.

Note: Because this is a system-wide setting, Cray recommends that you change this option only when you reboot your system to ensure apbridge and apsched are restarted in the correct sequence.

APWATCH_LIBRARY_PATH

The LD_LIBRARY_PATH "add-on" needed for apwatch; it includes the path to the gnet and glib libraries and the rsms and erd libraries. For example:

APWATCH_LIBRARY_PATH="/opt/gnet/lib:/opt/glib/lib:/opt/cray/librsmsevent.so: \ /opt/cray/libcray_event_router.so:/opt/gnome/lib64"

APWATCH_ERD

(Required) The host that has the event router daemon (ERD) running; typically, this is the host name of the SMW. For example: APWATCH_ERD="smw"

If ALPS_MOUNT_SHARED_FS is set to yes, a separate file system for control data is mounted at ALPS startup; ALPS_SHARED_DEV_NAME is assumed to be a mount point. Specify the device to mount at ALPS start-up using the ALPS_SHARED_DEV_NAME parameter. If it is null and ALPS_MOUNT_SHARED_FS is yes, the device is determined by /etc/fstab.


The following example shows a sample /etc/sysconfig/alps configuration file.

Example 102. Sample /etc/sysconfig/alps configuration file

#ALPS Configuration File

ALPS_MASTER_NODE="sdb"

ALPS_BRIDGE_NODE="boot"

ALPS_NIDORDER="-Ox"

ALPS_MOUNT_SHARED_FS="no"

## Type: string
## Default: ""
## Example: "ufs:/ufs/alps_shared"
#
# Device to mount at ALPS start-up. If it is null
# but ALPS_MOUNT_SHARED_FS is "yes", then the device
# will be determined by /etc/fstab. This parameter
# is not used unless ALPS_MOUNT_SHARED_FS is "yes".
#

ALPS_SHARED_DEV_NAME=""

## Type: string
## Default: ""
## Example: "-t nfs -o tcp,rw"
#
# This parameter is not used unless ALPS_MOUNT_SHARED_FS
# is "yes" and ALPS_SHARED_DEV_NAME is not null.
#

ALPS_SHARED_MOUNT_OPTIONS=""

APWATCH_LIBRARY_PATH="/opt/gnet/2.0.5/64/lib:/opt/glib/2.4.2/64/lib:/opt/cray/lib64:/opt/gnome/lib64"

APWATCH_ERD="smw"


8.3.2 The alps.conf Configuration File

The /etc/opt/cray/alps/alps.conf file is in the shared root and contains ALPS static configuration information used by the apsched and apsys daemons. The configuration parameters are described in this section.

Note: You can change the parameter settings dynamically by modifying the alps.conf file and sending a SIGHUP signal to apsched or apsys, as applicable, to re-read the alps.conf file.

bridge Enables the apbridge daemon to provide dynamic rather than static information about the system node configuration to apsched. Cray strongly recommends setting the bridge parameter to use the apbridge daemon. By default, it is set to 1 (enabled).

alloc If this field is set to 0 or is not specified, the distinction between batch and interactive nodes is enforced. If this field is set as nonzero, no distinction is made by ALPS; job schedulers will likely still limit their placement only to nodes marked as batch. By default, it is set to 0.

fanout This field is set to a default level of 32. This value controls the width of the ALPS TCP/IP network fan-out tree used by apinit on the compute nodes for ALPS application launch, transfer, and control messages.

debug This field is set to a default level of 1 for both apsched and apsys. For information about valid values, see the apsched(8) and apsys(8) man pages.

cpuAffinity

Supports switchable default CPU affinity in apsched. Valid values are cpu, none, and numa; the default value is cpu. aprun checks for and uses the default cpuAffinity string from apsched. If the user has not explicitly set the aprun -cc option, aprun will use the default supplied by apsched. If there is not a default from apsched, aprun sets a default of cpu. For more information, see the aprun(1) man page.

lustreFlush

Supports switchable default Lustre cache flushing as part of application exit processing on the compute nodes. Enabling this Lustre cache flushing provides more consistent application run times. When Lustre cache flushing is enabled, all of the Lustre cache flushing completes as part of the application exit processing. The next application executing on the same set or subset of compute nodes no longer inherits a variable amount of run time due to Lustre cache flushing from a previous application.


Valid values are 0 (disabled) and 1 (enabled); the default value is 1 (enabled). Apsched provides this default lustreFlush value to the apinit daemon to enable or disable Lustre cache flushing as part of application exit processing.

Note: This value cannot be set on an individual application basis; it is a system-wide setting.

nodeShare Controls which compute node cores and memory are put into an application cpuset on the compute node. The valid values are exclusive and share. The default value is exclusive.

The exclusive setting puts all of a compute node's cores and memory resources into an application-specific cpuset on the compute node. This allows the application access to any and all of the compute node cores and memory. This can be useful when specifying a particular CPU affinity binding string through the aprun -cc option.

The share setting restricts the application-specific cpuset contents to only the application reserved cores and memory on NUMA node boundaries. That is, if an application requests and is assigned cores and memory on NUMA node 0, then only NUMA node 0 cores and memory will be contained within the application cpuset. The application will not have access to the cores and memory on other NUMA nodes on that compute node.

To override the default system-wide setting in /etc/opt/cray/alps/alps.conf on an individual basis, use the aprun -F option. For more information, see the aprun(1) man page.

cleanup_cto

Specifies the maximum amount of time, in milliseconds, allowed for the connect system call to respond before assuming the target node is down. Default value is 1000 milliseconds.

Note: The cleanup_cto value applies only when you have also specified cleanup_version=2.

resFullNode

If set, reservations will get all node resources.

batchCPCU Specifies the default number of CPUs per compute unit for batch applications. The default on Gemini systems is 0.


interactiveCPCU

Specifies the default number of CPUs per compute unit for interactive applications. The default on Gemini systems is 0.

pTagFreeDelay

Specifies the number of seconds to delay allocation of a PTag used previously. (The current PTag allocation implementation rotates through the PTags range.) Default value is 30 seconds.

cms (Deferred implementation) Indicates whether or not ALPS will use CMS to store reservation and claim information. Valid settings are no and yes; the default setting is no.

pTagGlobalNodes

The minimum application size in nodes that forces ALPS to assign a global pTag value.

pDomainMax Maximum number of concurrent pre-allocated protection domains allowed. The default is 0 or disabled. The absolute maximum is 256 and apsched will silently truncate higher values to 256.

pDomainIDs A comma separated list of one or more system service dedicated protection domain identifiers.

cmsTimeout Maximum time in milliseconds that calls to CMS have before timing out.

cmsLogTime Logs the calls that take longer than the time specified for cmsTimeout.

sharedDir Note: ALPS_SHARED_DIR_PATH was removed from /etc/sysconfig/alps.

(Required) Specifies the directory path to the file that contains ALPS control data. If ALPS_MOUNT_SHARED_FS is set to yes, this is assumed to be a mount point. Default is /ufs/alps_shared. For example: sharedDir /ufs/alps_shared

cleanSharedTimeout

Time at which cleanShared calls time out. cleanSharedAge

Remove files older than this age. cleanSharedInterval

Time between apmgr cleanShared calls (in seconds). A value of 0 (zero) disables this feature.


Application and reservation cleanup can be configured with the cleanup variable, which contains application and reservation sections; each section accepts the following configuration variables:

configured on (default) | off

iterationSleep

Specifies the time in milliseconds to pause between cleanup iterations.

iterationMax

Specifies the maximum cleanup iterations to execute before giving up.

connectTimeout

Specifies the time to allow an inter-node connection to be established.

connectAttempts

Specifies the maximum number of inter-node connection attempts.

waitMin Specifies the minimum time to give any cleanup iteration to complete.

The following example shows a sample /etc/opt/cray/alps/alps.conf configuration file.

Example 103. Sample alps.conf configuration file

# ALPS configuration file

apsched

alloc 0
bridge 1
fanout 32
debug 1

# Default CPU affinity: values cpu (default), none, numa
# cpuAffinity cpu
# Default lustre cache flushing at app exit: values 0, 1
# lustreFlush 1
# Default app node share mode for cores and memory: values exclusive, share
# nodeShare exclusive
#
# Value used for systems with Gemini/Aries network:
# - pTagGlobalNodes: if set to X, apps >= than X nodes
#   will use a global PTag (and NOT use the NTT)
# pTagGlobalNodes 5000
#
# CMS variables:
# - cms: if yes (no default), apsched will send reservations/claims to CMS
# - cmsTimeout: if set to X, calls to CMS will timeout in X msec
# - cmsLogTime: if set to X, log calls to CMS taking longer than X msec
#   (X < 0: no logging, X = 0: log all calls)
# cms yes
cmsTimeout 5000
cmsLogTime 1000
#
#

/apsched

apsys

debug 1

/apsys

cleanup
application
configured on
iterationSleep 1000
iterationMax 10
connectTimeout 1000
connectAttempts 5
waitMin 5000
/application
reservation
configured on
iterationSleep 1000
iterationMax 10
connectTimeout 1000
connectAttempts 5
waitMin 5000
/reservation
/cleanup

Procedure 64. Releasing a reserved system service protection domain

1. Ensure that no applications are running or reboot the system.

2. Remove the desired identifier from the /etc/opt/cray/alps/alps.conf pDomainIDs list.

3. Send a SIGHUP signal to apsched so that it restarts and reads the new pDomainIDs list.
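The following is a hedged sketch of steps 2 and 3; the xtopview prompt shown and the use of killall to deliver the SIGHUP are illustrative assumptions, and the edit is made from the shared root because alps.conf resides there:

boot:~ # xtopview
default/:/ # vi /etc/opt/cray/alps/alps.conf   (remove the identifier from the pDomainIDs list)
default/:/ # exit
sdb:~ # killall -HUP apsched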

8.4 Resynchronizing ALPS and the SDB Command After Manually Changing the SDB

Manual changes to node attributes and status can be reflected in ALPS by using the apmgr resync command. The apmgr resync command requests ALPS to reevaluate the configuration and attribute information and update its information. For example, after making manual changes to the SDB using the xtprocadmin -e or xtprocadmin --noevent command, use the apmgr resync command so that ALPS becomes aware of the changes.
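For example, after the manual SDB change has been made, the resynchronization itself is a single command; the login node prompt shown here is illustrative:

login:~ # apmgr resync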


8.5 Identifying Reserved Resources

The apstat -r command displays the batch job ID in the From field; for executables launched interactively, apstat displays aprun in the From field:

% apstat -r
ResId    ApId     From          Arch PEs  N  d Memory State
A 140    2497406  batch:741789  XT   512  -  - 1333   conf,claim
  141    2497405  batch:741790  XT   768  24 1 1333   NID list,conf,claim
A 141    2497407  batch:741790  XT   768  -  - 1333   conf,claim

The apstat -A apid command filters information by application IDs. You can include multiple application IDs, but it must be a space-separated list of IDs. For example:

% apstat -avv -A 3848874
Total (filtered) placed applications: 1
Placed  Apid ResId     User PEs Nodes   Age State Command
     3848874  1620 crayuser 512    22 0h07m   run dnsp3+pat

Application detail
Ap[0]: apid 3848874, pagg 0x2907, resId 1620, user crayuser,
       gid 1037, account 0, time 0, normal
  Reservation flags = 0x2001
  Created at Tue Jul 12 14:20:08 2011
  Originator: aprun on NID 8, pid 6369
  Number of commands 1, control network fanout 32
  Network: pTag 131, cookie 0xfb860000, NTTgran/entries 1/22, hugePageSz 2M
  Cmd[0]: dnsp3+pat -n 512, 1365MB, XT, nodes 22
  Placement list entries: 512
  Placement list: 6-7,11,20-21,24-25,36-39,56-59,69-71,88-91

The apstat -R resid command filters information about reservation IDs. You can include multiple reservation IDs, but it must be a space-separated list of IDs. For example:

% apstat -rvv -R 1620
ResId    ApId     From   Arch PEs N d Memory State
 1620    3848874  aprun  XT   512 0 1 1365   atomic,conf,claim

Reservation detail for resid 1620
Res[1]: apid 3848874, pagg 0, resId 619, user crayuser,
        gid 1037, account 8944, time 0, normal
  Batch System ID = 1971375
  Created at Tue Jul 12 14:20:08 2011
  Number of commands 1, control network fanout 32
  Cmd[0]: dnsp3+pat -n 512, 1365MB, XT, nodes 22
  Reservation list entries: 512
  Reservation list: 6-7,11,20-21,24-25,36-39,56-59,69-71,88-91

8.6 Terminating a Batch Job

To terminate a batch job, use the job ID from the apstat -r display.
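For example, for the reservation shown above whose From field is batch:741790, the batch job ID is 741790, and the job would be deleted with the workload manager's job deletion command; this sketch assumes a PBS/TORQUE-style qdel, as used elsewhere in this chapter:

% qdel 741790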


8.7 Setting a Compute Node to Batch or Interactive Mode

To set a node to either batch or interactive mode, use the xtprocadmin command to set the alloc_mode column of the SDB processor table. Then execute the apmgr resync command so that ALPS becomes aware of the changes.

Example 104. Retrieving node allocation status

The apstat -n command displays the application placement status of nodes that have a state of UP and their allocation mode (B for batch or I for interactive) in the State column.

Note: The apstat utility does not have dynamic run-time information about an application, so an apstat display does not imply anything about the running state of an application. An apstat display indicates statically that an application was placed and that the aprun claim against the reserved resources has not yet been released.

% apstat -n
  NID Arch State HW Rv Pl PgSz     Avl    Conf  Placed PEs  Apids
   20 XT   UP  B 12  -  -   4K 3072000       0       0   0
   21 XT   UP  B 12  -  -   4K 3072000       0       0   0
   22 XT   UP  B 12  -  -   4K 3072000       0       0   0
   23 XT   UP  B 12  -  -   4K 3072000       0       0   0
   63 XT   UP  B 12 12 12   4K 3072000 3072000 1572864  12  221180
   64 XT   UP  B 12 12 12   4K 3072000 3072000 1572864  12  221180
   65 XT   UP  B 12 12 12   4K 3072000 3072000 1572864  12  221180
   66 XT   UP  B 12 12 12   4K 3072000 3072000 1572864  12  221182
Compute node summary
   arch config     up    use   held  avail   down
     XT    744    744     46     12    686      0
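A hedged sketch of changing a node's allocation mode follows; the xtprocadmin -k m option and the interactive mode value shown are assumptions and should be verified against the xtprocadmin(8) man page before use:

sdb:~ # xtprocadmin -n 23 -k m interactive
sdb:~ # apmgr resync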

8.8 Manually Starting and Stopping ALPS Daemons on Service Nodes

ALPS is automatically loaded and started when CNL is booted on compute nodes.

You can manually start and stop the ALPS daemons on the service nodes as shown in the following procedures.

Procedure 65. Starting and stopping ALPS daemons on a specific service node

1. To start the ALPS daemons on a specific service node, log on to that service node as root and type the /etc/init.d/alps start command; for example, to start the ALPS daemons on the boot node:

boot:~ # /etc/init.d/alps start


2. To stop the ALPS daemons on a specific service node, log on to that service node as root and type the /etc/init.d/alps stop command; for example, to stop the ALPS daemons on the boot node:

boot:~ # /etc/init.d/alps stop

Procedure 66. Restarting ALPS daemon on a specific service node

• To restart the ALPS daemon on a specific service node, log on to the service node as root and type the /etc/init.d/alps restart command; for example, to restart the ALPS daemons on the boot node:

boot:~ # /etc/init.d/alps restart

The /etc/init.d/alps restart command stops and then starts the ALPS daemons on the node.

8.9 Manually Cleaning ALPS and PBS or TORQUE and Moab After Downed Login Node

If a login node goes down and will not be rebooted, job reservations associated with jobs deleted with qdel may not be released by ALPS. In this case, the apstat -r command lists the reservations as state pendCancel and leaves the jobs orphaned. Use the following procedure to manually clean up ALPS and the workload manager.

Procedure 67. Manually cleaning up ALPS and TORQUE and Moab or PBS after a login node goes down

1. Verify that the batch job still appears in the qstat output.

crayadm@smw:~> qstat -as 106728.sdb
sdb:
                                                    Req'd  Req'd   Elap
Job ID       Username Queue Jobname    SessID NDS TSK Memory Time  S Time
------------ -------- ----- ---------- ------ --- --- ------ ----- - -----
106728.sdb   root     workq qsub.scrip   6231   1   1    --    --  R 00:00
   Job run at Thu Dec 03 at 14:31 on (login1:ncpus=1)

2. Purge the job from the workload manager and verify that it was purged. On the SDB node, type:

sdb:~ # qdel -W force 106728.sdb

3. Verify that the job no longer exists.

sdb:~ # qstat -as 106728.sdb
qstat: Unknown Job Id 106728.sdb
sdb:~ #

4. Restart apsched on the SDB node:

sdb:~ # /etc/init.d/alps restart


5. Use apmgr to cancel the reservation that still exists in ALPS.

sdb:~ # apstat -r | grep 106728
ResId    ApId     From          Arch PEs N d Memory State
    5    2949806  batch:106728  XT     1 0 1    500 conf

sdb:~ # apmgr cancel 5

6. Use apstat to verify that the reservation no longer exists.

sdb:~ # apstat -r | grep 106728
sdb:~ #

8.10 Verifying that ALPS is Communicating with Cray System Compute Nodes

Executing the following aprun command on a login node will return a list of host names of the Cray system compute nodes used to execute the last program.

Example 105. Verifying that ALPS is communicating with Cray system compute nodes

crayadm@login:~> cd /tmp
crayadm@login:/tmp> aprun -b -n 16 -N 1 /bin/cat /proc/sys/kernel/hostname

8.11 ALPS and Node Health Monitoring Interaction

ALPS and node health monitoring cooperate in performing application cleanup following an application exit. The Node Health Checker (NHC) is automatically invoked by ALPS upon the termination of an application.

During normal operations, applications are run on a set of nodes, complete successfully, then those node resources are freed up to be reallocated for other applications. When an application exit is considered orderly, a set of up to four unique application process exit codes and exit signals is gathered and consolidated by ALPS on each compute node within the application placement list. Once all of the application processes on a compute node have exited, that compute node adds its local exit information to this consolidated list of exit data.

The exit information is sent to aprun over the ALPS application specific TCP fan-out tree control network. All of the application processes must have completely exited before this exit information is received by aprun. aprun forwards the compiled exit information to apsys just before aprun itself exits. Once all exit information has been received from the compute nodes, the application exit is considered orderly. An orderly exit does not necessarily mean that the application completed successfully. An orderly exit means that exit information about the application was received by aprun and forwarded to apsys. apsys sends an exit message to apsched, which releases the reserved resources for another application.


An unorderly exit means that exit information has not been received by apsys prior to an aprun exit. A typical occurrence of an unorderly exit consists of a SIGKILL signal being sent to aprun by the batch system after the application's wall time limit is exceeded. Since there is no exit information available to apsys during an unorderly exit, apsys does not know the true state of the application processes on the compute nodes. Therefore, ALPS must perform application cleanup on each of the assigned compute nodes before it is safe to free those application resources for another application.

Application cleanup begins with ALPS contacting each assigned compute node and sending a SIGKILL signal to any remaining application processes. Node health monitoring checks compute node conditions and marks a compute node admindown if it detects a problem. ALPS cannot free the application resources for reallocation until all of the application processes have exited or node health monitoring has marked applicable compute nodes admindown or suspect. Until that time, the application will continue to be shown in apstat displays.

8.11.1 aprun Actions

The aprun command is the ALPS application launch command on login nodes and the SDB node. aprun has a persistent TCP connection to a local apsys. aprun also has a persistent TCP connection to an apinit daemon child on the first compute node within the assigned placement list, but not to an apinit on each assigned compute node. After receiving a placement list from apsched, aprun writes information into the syslog as in the example below.

May 18 10:38:16 nid00256 aprun[22477]: apid=1985825, Starting, user=10320, batch_id=2325008, cmd_line="aprun -n 1 -b /tmp/hostname.xx ", num_nodes=1, node_list=384

May 18 10:38:16 nid00256 aprun[22477]: apid=1985825, Error, user=10320, batch_id=2325008, [NID 00384] 2010-05-18 10:38:15 Apid 1985825: cannot execute: exit(107) exec failed

May 18 10:38:17 nid00256 apsys[22480]: apid=1985825, Finishing, user=10320, batch_id=2325008

In a typical case of an orderly exit, aprun receives application exit information over the connection from that apinit. aprun then forwards the exit information over the connection to apsys. The ordering of application exit signals and exit codes is arbitrary. aprun displays any nonzero application exit information and uses the application exit information to determine its own exit code:

Application 284004 exit signals: Terminated


In the case of an unorderly exit, aprun exits without receiving application exit information. When aprun exits, its TCP connections are closed. The socket closures trigger application cleanup activity by both apinit and apsys as described in the following sections. An unorderly exit may occur for various reasons. The usual causes of an unorderly exit include the following cases:

• The batch system sends a SIGKILL signal to aprun due to the application wall time expiring

• apkill or kill are used to send a SIGKILL signal to aprun

• aprun receives a fatal message from apinit due to some fatal error during launch or at other points during the application lifetime, causing aprun to write the message to stderr and exit

• aprun receives a fatal read, write or unexpected close error on the TCP socket it uses to communicate with apinit

8.11.2 apinit Actions

apinit is the ALPS privileged daemon that launches and manages applications on compute nodes. For each application, the apinit daemon forks a child apshepherd process. Within ps displays, the child apshepherd processes retain the name "apinit". The per-application TCP fan-out control tree has aprun as the root. Each compute node apshepherd within this control tree has a parent controller and may have a set of controlling nodes. Whenever a parent controller socket connection closes, the local apshepherd attempts to kill any application processes still executing and then will exit. This socket closing process results in a ripple effect through the fan-out control tree, resulting in automatic application tear down.

Whenever the aprun TCP connection to the apshepherd on the first compute node within the placement list closes, the tear down process begins. During an application orderly exit, the exit information is sent to aprun, followed by the aprun closure of the socket connection, resulting in the exit of the apshepherd. The apshepherd exit causes its controlling socket connections to close as well. Each of those apshepherds will exit, and the application specific fan-out tree shuts down. When the aprun TCP socket closure is not expected and the application processes are still executing, the apshepherd will send a SIGKILL signal to each local application process and then exit. There can be local delays in kernel delivery of the SIGKILL signal to the application processes due to application I/O activity. The application process will process the SIGKILL signal after the I/O completes. The apinit daemon is then responsible to monitor any remaining application processes.


This kill and exit process ripples throughout the control tree. However, if any compute node within the control tree is unresponsive, the ripple effect will stop for any compute nodes beyond that branch portion of the tree. In response to this situation, ALPS must take action independent of the shutdown of the control tree to ensure all of the application processes have exited or that compute nodes are marked either admindown or suspect by node health monitoring. The apsys daemon is involved in invoking the independent action.

8.11.3 apsys Actions

apsys is a local privileged ALPS daemon that runs on each login node and the SDB node. When contacted by aprun, the apsys daemon forks a child agent process to handle that specific local aprun. The apsys agent provides a privileged communication path between aprun and apsched for placement and exit information exchanges. The apsys agent name remains "apsys" within ps displays.

During an orderly application exit, the apsys agent receives exit information from aprun and forwards that information to apsched. However, during an unorderly exit, when the aprun socket connection closes prior to receipt of exit information, the apsys agent is responsible to start application cleanup on the assigned compute nodes.

To begin application cleanup, the apsys agent invokes cleanup version 1 (apmgrcleanup) or cleanup version 2, and the apsys agent blocks until cleanup completes. (The cleanup version used is controlled by the cleanup_version configuration parameter in the alps.conf file; see The alps.conf Configuration File on page 271 for more information.) At the start of application cleanup, the /var/log/alps/apsysMMDD log file displays data similar to the following messages.

Cleanup version 1 messages:

14:00:20: [32606] Agent unexpected close of peer connection 6, apid 227061
14:00:20: [32606] Agent invoking cleanup v1 for apid 227061
14:00:22: /opt/cray/alps/5.0.0-2.0500.7497.1.1.ari/bin/apmgrcleanup [32824] invoking
/opt/cray/nodehealth/default/bin/xtcleanup_after /tmp/apsysiY08fc 227061 0 with 1 entries


Cleanup version 2 messages:

14:00:20: [32606] Agent unexpected close of peer connection 6, apid 227061
14:00:20: [32606] Agent invoking cleanup v2 for apid 227061
14:00:20: Beginning cleanup of apid 227061, iteration 1
14:00:21: Post-cleanup: apid 227061 definitely resident on 1/1 nodes, maybe on 0 others
14:00:21: Beginning cleanup of apid 227061, iteration 2
14:00:21: Target Nodes: Match list portion for apid 227061 (1/1): 20
14:00:21: Target Nodes: Unreached list portion for apid 227061 (0/0):
14:00:21: Post-cleanup: apid 227061 definitely resident on 0/1 nodes, maybe on 0 others
14:00:21: Invoking health check: /opt/cray/nodehealth/default/bin/xtcleanup_after /tmp/apsysWRPpna 227061 0
14:00:30: Successfully cleaned up apid 227061 on 1 nodes

After apmgrcleanup returns, the apsys log file contains something similar to the sample messages below:

14:02:30: [32606] Agent sending ALPSMSG_EXIT message to apsched fd 7, apid 227061
14:02:30: [32606] Agent received ALPSMSG_EXITCONFIRM from apsched fd 7, apid 227061

In the above example, apsched has been told that the resources assigned to that aprun can now be reallocated to another application. The apstat display will no longer show information about this application.

8.11.4 Node Health Checker Actions

The Node Health Checker (NHC) is automatically invoked by ALPS upon the termination of an application. ALPS passes a list of nodes associated with the terminated application to NHC. NHC performs tests, which are specified in the NHC configuration file, to determine whether compute nodes allocated to the application are healthy enough to support running subsequent applications. If not, it removes any nodes incapable of running an application from the resource pool. NHC verifies that the Application Level Placement Scheduler (ALPS) acknowledges a change that NHC has made to a node's state. If ALPS does not acknowledge a change, then NHC recognizes this disagreement between itself and ALPS, changes the node's state to admindown, and exits.

For an overview of NHC, see the intro_NHC(8) man page. For additional information about configuring node health checker, see Configuring Node Health Checker (NHC) on page 164.

8.11.5 Verifying Application Cleanup

There are a number of circumstances that can delay completion of application cleanup after an unorderly exit. This delay is often detected through apstat displays that still show the application and the resource reservation for that application.


As described in previous sections, check the various log files to understand what activity has taken place for a specific application.

• Check the /var/log/alps/apsysMMDD log files for that apid; verify that cleanup version 1 (apmgrcleanup) or cleanup version 2 has been invoked (a grep sketch follows this list).

• If using cleanup version 1, use ps on that same login node to check whether apmgrcleanup is still executing.

• Check the applicable node health monitoring log file (/var/log/xtcheckhealth_log) for that apid.

• Check the SMW /var/opt/cray/log/sessionid/console-YYMMDDHHMM log file for that apid.
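A minimal sketch of scanning the apsys logs for a given apid; the apid shown and the login node are illustrative:

login:~ # grep 227061 /var/log/alps/apsys*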

Using Comprehensive System Accounting [9]

Comprehensive System Accounting (CSA) is open-source software that includes changes to the Linux kernel so that the CSA can collect more types of system resource usage data than under standard Fourth Berkeley Software Distribution (BSD) process accounting. CSA software also contains interfaces for the Linux process aggregates (paggs) and jobs software packages. The CSA software package includes accounting utilities that perform standard types of system accounting processing on the CSA generated accounting files. CSA, with Cray modifications, provides:

• Project accounting capabilities, which provide a way to charge computer system resources to specific projects

• An interface with various other job management systems in use at Cray sites

• A data management system for collecting and reporting accounting data

• An interface that you use to create the project account and user account databases, and to later modify them, as needed

• An interface that allows the project database to use customer–supplied user, account, and project information that resides on a separate Lightweight Directory Access Protocol (LDAP) server

• An interface with the ALPS application management systems so that application accounting records that include application start, termination, and placement information can be entered into the system accounting database

Specific third-party software releases are required for batch system compatibility with CSA on Cray systems. For more information, access the 3rd Party Batch SW link on the CrayPort website at http://www.crayport.cray.com.

Complete features and capabilities of CSA are described in the csa(8) and intro_csa(8) man pages. The accounting utilities provided for administrative use are csanodeacct, csaperiod, and csarun. The related man pages are accessible by using the man command.

Note: CSA runs only on login nodes and compute nodes. The SMW, boot node, SDB node, Lustre MDS nodes, and Lustre OSS nodes do not support CSA.


9.1 Interacting with Batch Entry Systems or the PAM job Module

Jobs are created on the system using either a batch job entry system (when such a system is used to launch jobs) or by the PAM job module for interactive sessions.

Note: You must be running TORQUE snapshot (release) 2.4.0-snap.20080925140 or later to take advantage of CSA support for the Cray platform. You must run PBS Professional 9.2 or later to take advantage of CSA support for the Cray platform.

Compute node project accounting for applications submitted through workload managers (for example, PBS Professional) depends on the ability of the workload manager to obtain and propagate the project ID to ALPS at job submission time. If the workload manager does not support the ability to obtain and propagate the project ID to ALPS at job submission, the project ID must be set by using the account command prior to issuing an ALPS aprun command. Otherwise, project ID information will not be included in any CSA accounting records for the job.

9.2 CSA Configuration File Values

The CSA configuration file, csa.conf, is included with the Cray Linux Environment (CLE) software release package. By default, on Cray systems, this file is located at /etc/opt/cray/csa/csa.conf for login and compute nodes.

This file contains default settings for several configuration parameters you must change to tailor CSA to your individual site configuration.

Note: The csa.conf file exists on the shared root and also on the CNL image for compute nodes; the two copies of this file must be identical (except for the NODE_PROCESS_ACCOUNT parameter, which may be different). You must create a new version of the CNL compute node image after editing the csa.conf file.

Each Cray system that runs CNL has its own unique hardware configuration. This includes the number of nodes on the system and the physical location of the nodes. In addition, each installation contains its own unique file system configuration.

Since the file system and node configurations for each Cray system are unique, the default /etc/opt/cray/csa/csa.conf file can only be used as a template. The parameters shown in the following table are used to define the accounting file system configuration and the node configuration for your system. You must change the settings of these parameters so that they conform to your system configuration.


Table 8. CSA Parameters That Must Be Specific to Your System

CSA_START
Defines whether CSA is enabled (on) or disabled (off). By default, CSA is disabled.

ACCT_SIO_NODES
Declares the number of account file system mount points. There must be at least one account file system mount point. The maximum number of mount points is 10. Multiple mount points are allowed so that the individual node accounting files can be distributed across more than one file system in order to provide better scaling for large system configurations. Use the df command to display the possible file system mount points. The actual maximum number of ACCT_SIO_NODES that may be specified is limited by the number of file systems available on your system.

ACCT_FILE_SYSTEM_00 ... ACCT_FILE_SYSTEM_nn
Must be one entry for each declared file system mount point. Numbering must begin with 00, and numbers must be consecutive. For example, if you have specified ACCT_SIO_NODES 1, you will only define ACCT_FILE_SYSTEM_00. If you have specified ACCT_SIO_NODES 2, you will also need to define ACCT_FILE_SYSTEM_01.

_lus_nid00023_csa_XT
The default file system mount point. It must be changed to correspond to a file system that exists on your system. There is one of these entries for each ACCT_FILE_SYSTEM declared.

Note: The program that parses the configuration file does not allow any special characters, other than the underscore character (_), in configuration names. Therefore, in the file system paths used in the mount point description, each forward slash character (/) must be represented by an underscore (_) character. This also means that an account file system mount point cannot have a _ character in the pathname.

SYSTEM_CSA_PATH
Defines the pathname on the common file system where CSA establishes its working directories for generating accounting reports. This parameter is only used on the service node image. It is not used on the compute nodes.

NODE_PROCESS_ACCOUNT
Defines whether all process account records written on a node will be written to the common file system, or whether the process account records for each application will be combined into a single application summary record that represents the total execution of the application on a node. This parameter may be set differently on the shared root and compute node images.

For other parameters in csa.conf, default settings should be acceptable.


9.3 Configuring CSA

CSA is disabled by default. When CSA is enabled, all system accounting, including service node accounting, is performed by CSA. Therefore, there is no need to have BSD process accounting enabled on service nodes. To enable CSA, set CSA_START to on in csa.conf.

Note: You must include the CSA RPM in your CNL boot image. If you set values in the shell_bootimage.sh script, make sure to edit the same values in CLEinstall.conf so that any new features remain enabled after the next CLE update or upgrade.

Perform the procedures in this section, in order, to correctly set up CSA.

9.3.1 Obtaining File System and Node Information

Procedure 68. Obtaining file system and node information

1. From a login node, enter the df command to determine which file systems are available for writing CSA accounting data.

login:~ > df
rootfs                                        173031424  158929920      5311488  97% /
initramdevs                                     8268844         76      8268768   1% /dev
10.131.255.254:/rr/current                    173031424  158929920      5311488  97% /
10.131.255.254:/rr/current//.shared/node/8/etc 173031424 158929920      5311488  97% /etc
10.131.255.254:/snv                            48070496   13872768     31755840  31% /var
10.131.255.254:/snv                            48070496   13872768     31755840  31% /var
none                                            8268844         12      8268832   1% /var/lock
none                                            8268844        940      8267904   1% /var/run
none                                            8268844          0      8268844   0% /var/tmp
tmpfs                                           8268844         12      8268832   1% /tmp
ufs:/ufs                                       38457344   26436608     10067968  73% /ufs
ufs:/ostest                                    20169728   10874880      8269824  57% /ostest
23@gni:/lus_system                         215354196400 60192583820 144222067004  30% /lus/nid00023
30@gni:/ib54ex                             114821632416    5588944 108983420416   1% /lus/ib54ex

2. Determine and record the file system information you want to use for CSA.

The file systems of interest for saving accounting data are the two whose mount points are /lus/nid00011 and /lus/nid00064, respectively. Record this information for later use.

3. Determine the hardware node configuration on your system.


Run the xtprocadmin command to get a complete list of nodes.

login:~ > xtprocadmin

NID (HEX)    NODENAME     TYPE    STATUS MODE  PSLOTS FREE
  0 0x0      c0-0c0s0n0   service up     batch      4    0
  3 0x3      c0-0c0s0n3   service up     batch      4    0
  4 0x4      c0-0c0s1n0   service up     batch      4    4
  7 0x7      c0-0c0s1n3   service up     batch      4    4
...
475 0x1db    c3-0c2s6n3   compute up     batch      4    4
476 0x1dc    c3-0c2s7n0   compute up     batch      4    4
477 0x1dd    c3-0c2s7n1   compute up     batch      4    4
478 0x1de    c3-0c2s7n2   compute up     batch      4    4
479 0x1df    c3-0c2s7n3   compute up     batch      4    4

For this example system, you want to choose a set of nodes that will have their accounting files written to /lus/nid00011 and another set of nodes that will have their accounting files written to /lus/nid00064. You also need to make sure there is no overlap between the two sets of nodes.

9.3.2 Editing the csa.conf File

After you have the file system mount point and node configuration information for your system, you are ready to edit the csa.conf file. By default, on Cray systems, this file is located at /etc/opt/cray/csa/csa.conf for login and compute nodes.

Note: You must use xtopview to edit the shared root image of the csa.conf file on the boot node. You can use any text editor to edit the compute node image of the csa.conf file on the SMW.

Procedure 69. Editing CSA parameters for the example system

1. Enable CSA by setting CSA_START to on.
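The resulting csa.conf line follows the same form as the other parameter settings shown in this procedure:

CSA_START on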

2. Set the number for the ACCT_SIO_NODES parameter. From Procedure 68 on page 288, you know that both /lus/nid00011 and /lus/nid00064 will be used to host individual node accounting files. The number of file systems (in this case two) to be used to contain accounting files is the value for the ACCT_SIO_NODES parameter. Since this example shows using /lus/nid00011 and /lus/nid00064 to contain accounting files, set ACCT_SIO_NODES to 2:

ACCT_SIO_NODES 2


3. Declare a file system mount point for each SIO node specified.

Note: The program that parses the configuration file does not allow any special characters, other than the underscore character (_), in configuration names. Therefore, in the file system paths used in the mount point description, each forward slash character (/) must be represented by an underscore (_) character. This also means that an account file system mount point cannot have a _ character in the pathname.

The df command from the previous procedure showed a mount point on /lus/nid00011 and another one on /lus/nid00064; these are the two mount points that need to be declared. Just because there are multiple mount points, however, does not mean that you need to use them. You may choose to have all accounting files written to a single file system. Since in this example you are configuring two mount points, you must specify the ACCT_FILE_SYSTEM_00 and ACCT_FILE_SYSTEM_01 parameters:

ACCT_FILE_SYSTEM_00 _lus_nid00011
ACCT_FILE_SYSTEM_01 _lus_nid00064

4. Determine the node range values for the account system mount point parameters. All accounting file directories have csa as the first element of the path name, following the mount point. The next element in the path name after csa describes the node type. For Cray node types, the next element of the path name is XT.

For Cray systems, the CSA software uses the node name, otherwise known as the cname, when creating pathnames for accounting files. For example, node name c1-0c2s7n3 has a pathname of cab1/row0/chassis2/slot7/mcomp3. This path name is appended to the applicable accounting system mount point name in order to create a full path name for the accounting file. The xtprocadmin command output from the previous procedure shows that the system has 4 cabinets, c0-c3. One simple way to configure the accounting file systems so that the files are divided fairly evenly between the two file systems in this example would be to specify that cabinet 0 and cabinet 1 have their data written to /lus/nid00011, and cabinet 2 and cabinet 3 have their data written to /lus/nid00064.

Using the pathname conventions described above, and the node name data from Procedure 68 on page 288, you can define the file system mount point parameters:

_lus_nid00011_csa_XT c0-0c0s0n0--c1-0c2s7n3
_lus_nid00064_csa_XT c2-0c0s0n0--c3-0c2s7n3
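For instance, with the ranges above and the path conventions described in this step, accounting files for node c1-0c2s7n3 (cabinet 1) are written under a directory of the form:

/lus/nid00011/csa/XT/cab1/row0/chassis2/slot7/mcomp3

The individual file names within that directory are created by CSA.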

5. Define the SYSTEM_CSA_PATH parameter.

The SYSTEM_CSA_PATH parameter describes the file pathname for the system wide csa directories that are used for CSA work areas, and for containing the system-wide pacct file. The system-wide pacct file contains the merged


contents of the individual node pacct files. Since the file pathname for the SYSTEM_CSA_PATH is not used as an input to the configuration file parser, the file path name is allowed to contain the / character. Usually the SYSTEM_CSA_PATH parameter uses an account file system mount point as its base directory; however, this is not required. The SYSTEM_CSA_PATH parameter is only used on the login node where CSA file processing is performed. It is not necessary to set this parameter in the compute node image of /etc/opt/cray/csa/csa.conf, but setting it there does not cause any problems.

For this example, use the /lus/nid00011 mount point for the CSA work areas:

SYSTEM_CSA_PATH /lus/nid00011/csa

6. Define the NODE_PROCESS_ACCOUNT parameter. The NODE_PROCESS_ACCOUNT parameter defines whether all process accounting records from nodes are to be collected. This parameter may be set differently for /etc/opt/cray/csa/csa.conf in the compute node image than in /etc/opt/cray/csa/csa.conf in the shared root file system. The NODE_PROCESS_ACCOUNT parameter allows your site to determine how much detailed accounting data is to be collected, processed, and saved from the nodes on the system.

To understand the usefulness of this parameter, it may be helpful to know how CSA accounting records are handled on Cray systems. When ALPS launches an application to the compute nodes on a Cray system, CSA process accounting occurs on each compute node. All CSA job and process accounting records for each compute node are written to an in-memory file system on the node, and the records remain there until the application terminates. When the application terminates, ALPS notifies the CSA software on each compute node to process the accounting data for that node.

The NODE_PROCESS_ACCOUNT parameter allows CSA to decide whether to write all of the individual process accounting records for each compute node to the common file system, or to read the individual process accounting records and combine them into a single application summary record that represents the total resources used by the application on the compute node. By choosing to have application summary records, the amount of data transferred from each compute node to the common file system may be substantially reduced. In doing so, the amount of internal system network traffic and the amount of data moved from compute nodes to disk can be decreased. Also, the total amount of CSA accounting data that must be processed later for creating usage reports, and the amount of CSA data to be permanently stored, can be reduced.

You may want to set NODE_PROCESS_ACCOUNT off for compute nodes, and to set it on for service nodes. This provides more accounting process detail on the


login nodes where such information may be more useful. Therefore, this single parameter may be set differently on the shared root image than it is set on the compute node image of /etc/opt/cray/csa/csa.conf. To use this split configuration, specify the following:

# Shared root version of /etc/opt/cray/csa/csa.conf: NODE_PROCESS_ACCOUNT ON

# Compute node image of /etc/opt/cray/csa/csa.conf: NODE_PROCESS_ACCOUNT OFF

7. Change the parameter that defines the group name used for setting the ownership and group on accounting files. This parameter is named CHGRP and defaults to:

CHGRP csaacct

If you use a different group name, change the parameter to match your system configuration.

9.3.3 Editing Other System Configuration Files

You also must make configuration changes to other system files. Use the xtopview command on the boot node to make the changes. For detailed information about using xtopview, see Managing System Configuration with the xtopview Tool on page 131 or the xtopview(8) man page.

• Add the csaacct user name to /etc/passwd on the shared root.

csaacct:*:391:391:CSA:/var/lib/csa:/sbin/nologin

• Add the csaacct group name to /etc/group on the shared root.

csaacct:!:391:

• Update the shadow password file to reflect the changes you have made: /usr/sbin/pwconv

• Add the csaacct group name to /etc/group on the CNL image.

Note: The csaacct group and gid must be the same on the shared root and CNL image.

• Create additional PAM entries in /etc/pam.d/common-session to enable CSA. For more information about creating PAM entries, see Setting Up Job Accounting on page 296.

9.3.4 Creating a CNL Image with CSA Enabled

After you have modified the compute node copy of csa.conf, you must rebuild the compute node image. For more information about how to rebuild the compute node image, see Preparing a Service Node and Compute Node Boot Image on page 61.


You can edit the shared root version of csa.conf and install the new version using the xtopview command. For more information about editing the shared root image of csa.conf using the xtopview utility, see Managing System Configuration with the xtopview Tool on page 131 or the xtopview(8) man page.

9.3.5 Setting Up CSA Project Accounting

The project database allows your site to define project names and assign an account number to each project. Users can have a list of account numbers that they can use for charging computing resources. Each user has a default account number that is assigned at login time.

Note: If you do not want to use CSA project accounting, complete Procedure 71 on page 295 instead of Procedure 70 on page 293.

Procedure 70. Setting up CSA project accounting

The project database resides on the system SDB node as a MySQL database. To set up CSA project accounting for your system, perform this procedure.

1. Establish the project database, UserProject, and define the project database tables on the System Data Base (SDB) server:

sdb:~ # mysql -u root -h sdb -p < /opt/cray/projdb/default/sql/create_UserProject.sql

2. Grant administrative access privileges to the project database:

sdb:~ # mysql -u root -h sdb -p < /opt/cray/projdb/default/sql/create_accounts.sql

3. Use the xtopview command from the boot node to create and edit the /etc/opt/cray/projdb/projects file on the shared root so that it contains a list of valid account numbers and the associated project names. The /etc/opt/cray/projdb/projects file consists of a list of entries where each entry contains a project number followed by a project name. A colon character separates the project number from the project name. A project number and an account number are the same thing. The following example shows a simple project file:

0:root
100:sysadm
101:ProjectA
102:ProjectB
103:Big_Name_Project_that_is_insignificant_and_unimportant
1234567890:Big Name Project with Blanks in the Name


4. Use the xtopview command from the boot node to create and edit the /etc/opt/cray/projdb/useracct file on the shared root so that it contains a list of authorized users and the valid account numbers for each user. Each line of the user accounts file contains the login name of a user and list of accounts that are valid for that user. The first account number in the list is the user's default account. The default account number is assigned to the user at login time by the pam_job module. The user name is separated from the first account ID by a colon (:). Additional account numbers are separated by a comma (,).

The following shows a simple user account file:

root:0
u1000:100
u1001:101,103
u1002:100,101
u1003:100,103,1234567890

5. On the login node, edit the ~crayadm/.my.cnf file in the home directory of the project database administrator so that it contains the following lines:

[client]
user=sys_mgmt
password=sys_mgmt
host=sdb

6. On the login node, change the permissions and owner on the ~crayadm/.my.cnf file, as follows:

login:/home/crayadm:~> chmod 600 ~crayadm/.my.cnf
login:/home/crayadm:~> chown crayadm:crayadm ~crayadm/.my.cnf

7. If you are using customer–supplied user, account, and project information that resides on a separate LDAP server, use the xtopview command from the boot node to edit the /etc/opt/cray/projdb/projdb.conf project accounting configuration file so that it contains site-specific values for the parameters listed in Table 9.


Table 9. Project Accounting Parameters That Must Be Specific to Your System

Parameter         Description

PROJDBTYPE        If you are using customer-supplied user, account, and project
                  information that resides on a separate LDAP server, change from
                  MYSQL (default) to CUSTOM.

CUSTOM_VALIDATE   If you are using customer-supplied user, account, and project
                  information that resides on a separate LDAP server, specify the
                  full path name to the customer-supplied function that performs the
                  necessary validation, for example /usr/local/sbin/validate_account.
                  Input parameters to the validation function are position order
                  dependent, as follows: user_name account_number

8. On a login node, run the projdb command with the -c option to create the project database. After the project database has been established, any users gaining access to the system through the job PAM module are assigned a default account ID at the time of system access.

login:/home/crayadm:~> projdb -c -p /etc/opt/cray/projdb/projects -u /etc/opt/cray/projdb/useracct -v

Note: The project database package commands are installed in /opt/cray/projdb/default/bin, which must be in your PATH variable to access the commands.
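For example, a user of a bash login shell could prepend the directory to the search path with the following line (a minimal sketch; add it to a shell startup file to make it persistent):

export PATH=/opt/cray/projdb/default/bin:$PATH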

9.3.5.1 Disabling Project Accounting

If you do not want to use project accounting on your site, either as provided by the MySQL database or by a separate customer-supplied LDAP server, use the following procedure to disable project accounting.

Note: If you want to use CSA project accounting, complete Procedure 70 on page 293 instead of Procedure 71 on page 295.

Procedure 71. Disabling project accounting

1. In the /etc/opt/cray/projdb/projdb.conf file, set the PROJDBTYPE parameter to CUSTOM.

2. In the /etc/opt/cray/projdb/projdb.conf file, declare a CUSTOM_VALIDATE parameter and define it as /usr/local/sbin/validate_account.


3. As root, create the /usr/local/sbin/validate_account file with file permissions set to 755 and the following contents:

#!/bin/sh
echo 0
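As a minimal sketch, assuming the file is created on the shared root so that it is visible to the login and SDB nodes (adjust if your site manages /usr/local differently), the file can be created and given the required permissions from within xtopview:

boot:~ # xtopview
default/:/ # vi /usr/local/sbin/validate_account    # insert the two lines shown above
default/:/ # chmod 755 /usr/local/sbin/validate_account
default/:/ # exit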

9.3.6 Setting Up Job Accounting

Note: You must include the csa RPM in your CNL boot image. To do this either set CNL_csa=yes in the CLEinstall.conf before the CLEinstall program is run or edit the shell_bootimage_LABEL.sh script and specify CNL_CSA=y prior to updating your CNL boot image. If you do set values in the shell_bootimage.sh script, make sure to edit the same values in CLEinstall.conf so that any new features remain enabled after the next CLE update or upgrade.

Procedure 72. Setting up CSA job accounting for non-CCM CNOS jobs

1. Cluster Compatibility Mode (CCM) does not support CSA accounting. However, the CNOS class must support CSA accounting for non-CCM jobs. To accomplish this, use the following configuration.

# xtopview
default/:/# vi /etc/opt/cray/ccm/ccm_mounts.local
/etc/pam.d/common-session-pc.ccm /etc/pam.d/common-session bind 0
default/:/# exit

2. In CNOS Class view:

common-session includes:

session optional pam_mkhomedir.so skel=/software/skel
session required pam_limits.so
session required pam_unix2.so
session optional pam_ldap.so
session optional pam_umask.so
session optional /opt/cray/job/default/lib64/security/pam_job.so

common-session.ccm does not include the pam_job entry:

session optional pam_mkhomedir.so skel=/software/skel
session required pam_limits.so
session required pam_unix2.so
session optional pam_ldap.so

For the procedure to disable CSA for the CNOS class view, see Workload Management and Application Placement for the Cray Linux Environment. For additional information about setting up job accounting on your system, read the INSTALL file that is included in the job RPM.

For more information about editing the shared root image of the pam configuration files using the xtopview utility, see Managing System Configuration with the xtopview Tool on page 131 or the xtopview(8) man page.


9.4 Creating Accounting cron Jobs

CSA depends on your system having a persistent /var file system for the shared root. For CSA to run successfully, you must establish the following cron jobs.

The normal order for the cron jobs is: csanodeacct, csarun, and then csaperiod (if necessary).

9.4.1 csanodeacct cron Job for Login Nodes

On Cray system compute nodes, when an application terminates, the Application Launch and Placement Scheduler (ALPS) initiates the CSA software that moves the local node accounting file records to a node-specific directory on the common file system (Lustre). On login nodes, this does not happen, and accounting records continue to accumulate indefinitely until the csanodeacct script is invoked to move the data to the common file system. Therefore, you need to periodically run a cron job on each login node to make sure that the local accounting files are moved as needed. This cron job must be owned by root.

Example 106. Running a csanodeacct cron job on each login node to move local accounting files

The following example shows moving accounting files from the local file system to the common file system on an hourly basis at 10 minutes before the hour. This crontab must be executed for each login node:

50 * * * * /opt/cray/csa/default/sbin/csanodeacct

9.4.2 csarun cron Job

You normally execute the csarun script at defined intervals to generate a set of system accounting reports.

Example 107. Executing the csarun script

To run csarun once per day at one minute before midnight, use a crontab entry of the following form:

59 23 * * * /opt/cray/csa/default/sbin/csarun

Note: This crontab must be executed from only one login node since it executes the csanodemerg script that merges all of the local node accounting files into a single system wide accounting file.

9.4.3 csaperiod cron Job

You can invoke the csaperiod script to run periodic accounting at different intervals than the regular system accounting interval.


Example 108. Running periodic accounting at different intervals than the regular system accounting interval

To run csaperiod every four hours at 5 minutes before the hour, use a crontab entry of the following form:

55 3,7,11,15,19,23 * * * /opt/cray/csa/default/sbin/csaperiod

Note: This crontab must be executed from only one login node since it executes the csanodemerg script that merges all of the local node accounting files into a single system-wide accounting file.

9.5 Enabling CSA

Using the xtopview command on the boot node is the only method to configure, enable, or disable services on the shared-root file system. You cannot configure, enable, or disable services on the login node itself. If your site has configured a login class for your system, invoke the following command sequence from the boot node as root:

boot# xtopview -x /etc/opt/cray/sdb/node_classes -c login
class/login:/# chkconfig job on
class/login:/# chkconfig csa on
class/login:/# xtspec -c login /etc/init.d/job
class/login:/# xtspec -c login /etc/init.d/csa
class/login:/# exit

On the subsequent system boot, this starts up the specified services on all nodes of the login class.

Note: If your site has not configured a login class, you must enable CSA for the individual login nodes using the xtopview -n [nid#] syntax rather than the xtopview -c login syntax shown. You must repeat the process for each login node. See the xtopview(8) man page for complete command option information.
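For example, the equivalent per-node sequence for a single login node might look like the following minimal sketch; the node ID used here (8) is only an illustration and must be replaced with the NID of your login node:

boot# xtopview -x /etc/opt/cray/sdb/node_classes -n 8
node/8:/# chkconfig job on
node/8:/# chkconfig csa on
node/8:/# xtspec -n 8 /etc/init.d/job
node/8:/# xtspec -n 8 /etc/init.d/csa
node/8:/# exit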

9.6 Using LDAP with CSA

The projdb command and the -l option on the account command are not supported with customer-provided account validation.

The following Cray supplied library functions do not support this feature: db_add_project, db_add_user, db_get_proj_acct, db_get_proj_name, db_get_user_accts, db_has_table, db_print_table, db_truncate_table, and db_validate_acct. For a description of the /etc/opt/cray/projdb/projdb.conf file and additional information on using a customer-supplied database, see the projdb(8) and intro_csa(8) man pages.

Dynamic Shared Objects and Cluster Compatibility Mode in the Cray Linux Environment [10]

10.1 Configuring the Compute Node Root Runtime Environment (CNRTE) Using CLEinstall

Users can link and load dynamic shared objects in their applications by using the compute node root runtime environment (CNRTE) in the Cray Linux Environment (CLE). CLE includes software that enables compiling with dynamic libraries, using an alternative to the initramfs file system on the CNL compute nodes, called the compute node root. The compute node root is essentially the read-only DVS-projected shared root file system. This supports the ability to run a limited set of dynamically linked binaries on compute nodes.

The main benefit of this feature is expanded use of programs and libraries that require shared libraries on Linux cluster systems. If an independent software vendor (ISV) program ships with compiled binaries and dynamic libraries, you can also take advantage of this feature. Users are able to effectively reduce memory and executable footprint when shared objects, called multiple times, use the same segment of memory address space. Users can create applications that no longer need recompiling when libraries change.

Administrators enable this option at install time by modifying parameters in the CLEinstall.conf file.

For additional information, see Installing and Configuring Cray Linux Environment (CLE) Software and Workload Management and Application Placement for the Cray Linux Environment.

CNRTE is the framework used to allow compute node access to dynamic shared objects and libraries. Configuring and installing the compute node root runtime environment involves setting up the shared root as a DVS-projected file system. This process entails configuring DVS server nodes and updating the compute node boot images to enable them as clients. To configure the compute node root runtime environment for CLE, do the following:

1. Determine which service or compute nodes will be the compute node root servers.


There are essentially two classes of nodes in a Cray system: service or compute. Service nodes have connectivity to external file systems and networks, access to the shared root of the boot node, and a full set of Linux services. Compute nodes have reduced services and a lightweight kernel to allow a maximized utilization of computational resources. Some services do not require external connectivity but are still desirable. There is also a practical limit to the number of available service nodes for each site. CLE allows you to run the service node image on a node otherwise considered a compute node to act as an internal DVS server of the Cray system shared root.

Note: Any compute nodes you choose here will no longer be a part of the available compute node pool. An allocation mode of other will be assigned to these compute nodes in the service database (SDB). These nodes will no longer belong to the group of batch and interactive nodes in the SDB, and they will be unavailable to ALPS.

Caution: Do not place DVS servers on the same node as a Lustre (Object Storage, Metadata, or Management) server. Doing so can cause load oversubscription on the node and reduce performance.

If the /etc files are specialized with a cnos class, the cnos class /etc files will be mounted on top of the projected shared root content on the compute nodes. This class specialization allows the compute nodes to have access to a different set of /etc files that exist on the DVS servers. Otherwise, the compute nodes will use the set of /etc files that are specific to their DVS server and that are contained in the shared root projected by the DVS server.

2. When editing the CLEinstall.conf file and running the CLEinstall program, modify the parameters specific to shared object support according to your site-specific configuration. When you set the following parameters in the CLEinstall.conf file, the CLEinstall program will automatically configure your system for the compute node root runtime environment.

DSL=yes

This variable enables dynamic shared objects and libraries for CLE. The default is no.

Note: Setting this option to yes will automatically enable DVS.

DSL_nodes=17 20

The decimal NIDs of the nodes that will act as compute node root servers. These nodes can be a combination of service or compute nodes. Each NID is separated by a space.


DSL_mountpoint=/dsl

This path is the DVS mount point on the compute nodes; it is the projection of the shared root file system.

DSL_attrcache_timeout=14400

This value is the attribute cache time out for compute node root servers. The value represents the number of seconds before DVS attributes are considered invalid and they are retrieved from the server again.

3. Follow the appropriate procedures in Installing and Configuring Cray Linux Environment (CLE) Software to complete the installation.

The /etc/opt/cray/cnrte/roots.conf file contains site-specific values for custom root file systems. To specify a different pathname for roots.conf, edit the configuration file /etc/sysconfig/xt and change the value for the variable CRAY_ROOTFS_CONF. In the roots.conf file, the system default compute node root used is specified by the symbolic name DEFAULT. If no default value is specified, / will be assumed. In the following example segment of roots.conf, the default case uses /dsl as the reference root file system:

DEFAULT=/dsl
INITRAMFS=/
DSL=/dsl

Users can override the system default compute node root value by setting the CRAY_ROOTFS environment variable to a value from the roots.conf file. This changes the compute node root used for launching jobs. For example, to override the use of /dsl, set CRAY_ROOTFS to INITRAMFS. An administrator can modify the contents of this file to restrict user access. For example, if the administrator only wants to allow applications to launch using the compute node root, the roots.conf file would read like the following:

% cat /etc/opt/cray/cnrte/roots.conf
DEFAULT=/dsl
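As a minimal sketch of the user-side override, a user who wants a job launched against the initramfs root rather than the default could set the variable before launching; the application name and width are only illustrations:

% export CRAY_ROOTFS=INITRAMFS
% aprun -n 16 ./my_app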

10.2 Configuring Cluster Compatibility Mode

A Cray XE series system is not a cluster but a massively parallel processing (MPP) computer. An MPP is simply one computer with many networked processors used for distributed computation, and, in the case of Cray XE architectures, a high-speed communications processor that facilitates optimal bandwidth and memory operations between those processors. When operating as an MPP machine, the Cray compute node kernel (Cray CNL) typically does not have a full set of the Linux services available that are used in cluster ISV applications.


Cluster Compatibility Mode (CCM) is a software solution that provides the services needed to run most cluster-based independent software vendor (ISV) applications out-of-the-box. CCM supports ISV applications running in four simultaneous cluster jobs on up to 256 CNL compute nodes per job instance. It is built on top of the Compute Node Root Runtime Environment (CNRTE), the infrastructure used to provide dynamic library support in Cray systems.

CCM is tightly coupled to the workload management system. It enables users to execute cluster applications alongside workload-managed jobs running in a traditional MPP batch or interactive queue. Essentially, CCM uses the batch system to logically designate part of the Cray system as an emulated cluster for the duration of the job as shown in Figure 5 and Figure 6.

Figure 5. Cray System Job Distribution Cross-section (diagram: service nodes, Cluster Compatibility Mode nodes, and free compute nodes, with traditional MPP batch jobs submitted through workq and CCM applications submitted through ccm_queue)

Figure 6. CCM Job Flow Diagram (diagram: free MPP compute nodes in workq are provisioned and placed in ccm_queue using qsub -V -I -q ccm_queue -lmppwidth=1 or bsub -V -Is -n1 -q ccm_queue; the user runs the application with ccmrun, either in a batch script or interactively; when the application terminates, CCM processes clean up and the cluster job nodes are returned as free MPP compute nodes)

10.2.1 Preconditions

• Dynamic library support is installed.

• (Optional) RSIP must be installed if you have applications that need access to a license server; see Installing and Configuring Cray Linux Environment (CLE) Software.

• PBS 10.2RC2 (Emerald), LSF 8.0, or Torque-2.4.1b1-snap.200908271407 or later versions are installed.

10.2.2 Configuration Options Relevant to Installation

The following variables are used for installation of CCM in CLEinstall.conf. Variables such as CCM_ENABLERSH and CCM_QUEUES can be changed in /etc/opt/cray/ccm/ccm.conf after installation. For more information on how to install CCM, please see Installing and Configuring Cray Linux Environment (CLE) Software.

CCM=yes

Set this parameter to yes to enable Cluster Compatibility Mode and install the appropriate RPMs.

CCM_ENABLERSH=yes

Optional: Enables services or daemons that most ISV applications need to run. Examples of these services are xinetd, portmap, rsh, and rlogin. If you set CCM_ENABLERSH to no some ISV applications will not work. If you do not specify this parameter, rsh is enabled by default.


CCM_QUEUES="ccm_queue1,ccm_queue2"

Specifies one or more batch queues used in the workload management system. The default value is ccm_queue.

Important: The syntax in the configuration file, ccm.conf, and the installer variable CCM_QUEUES differ. In the installer, the queues are listed as comma-separated values. In the configuration file they are space-separated.
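A minimal sketch of the difference, using the example queue names from above:

# In CLEinstall.conf (comma-separated):
CCM_QUEUES="ccm_queue1,ccm_queue2"

# In /etc/opt/cray/ccm/ccm.conf (space-separated):
CCM_QUEUES="ccm_queue1 ccm_queue2"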

After your batch system software is installed, you must manually create the queues you specify here. For steps required to create CCM batch queues, see Procedure 76 on page 307.

CCM_WLM=pbs

Specifies the batch processing system; valid values are pbs, torque, and lsf.

Note: If CCM_WLM=lsf is specified and any non-null values are set for CRAY_QSTAT_PATH and CRAY_BATCH_VAR, a message is displayed by CLEinstall stating that the settings of CRAY_QSTAT_PATH and CRAY_BATCH_VAR will be ignored and the variables will be set to "" in /etc/opt/cray/ccm/ccm.conf on the shared root.

CRAY_QSTAT_PATH=/opt/pbs/default/bin

Specifies the path to the batch system software qstat command. The default value is the path for PBS Professional; the path for Moab/TORQUE is included as a comment in the configuration file.

This variable is ignored if CCM_WLM=lsf.

CRAY_BATCH_VAR=/var/spool/PBS

Specifies the path to the batch system software /var directory.

CCM_ENABLENIS=no

Optional: This option can be set to yes to start yp services on the compute node. If NIS is not properly configured, calls to the network will time out, significantly slowing down CCM startup, so this option is disabled by default.


10.2.3 Post-install Options and Configuration

The following are exclusively post-install options included in /etc/opt/cray/ccm/ccm.conf:

CCM_DEBUG=no

Setting this option to yes enables debug logging for CCM. These logs will be stored on the PBS MOM node in /var/log/crayccm. Cray recommends that sites set this option to yes.

CCM_RESOURCES="ccm ccm2"

This option indicates that the administrator has configured a custom application resource that can be allocated and used for a job. Users requesting a job will consume one of the pool of available resources listed here. The job submission is checked against the list provided when making a determination whether the job is a CCM targeted job.

Note: Only one of CCM_RESOURCES or CCM_QUEUES is required.

To configure yp, /etc/defaultdomain and /etc/yp.conf must be properly configured on the compute node specialized view. Cray recommends that you use the cnos class within xtopview to set up this specialized view.

Procedure 73. Using DVS to mount home directories on the compute nodes for CCM

For each DVS server node you have configured, mount the path to the user home directories. Typically, these will be provided from a location external to the Cray system.

1. Specialize and add a line to the /etc/fstab file on the DVS server by using xtopview in the node view. For example, if your DVS server is c0-0c0s2n3 (node 27 on a Cray XE system), type the following:

boot:~ # xtopview -m "mounting home dirs" -n 27 node/27:/ # xtspec -n 27 /etc/fstab node/27:/ # vi /etc/fstab nfs_home_server:/home /home nfs tcp,rw 0 0 node/27:/ # exit

2. Log into each DVS server and mount the file system:

boot:~ # ssh nid00027
nid00027:~ # mount /home
nid00027:~ # exit


3. To allow the compute nodes to mount their DVS partitions, add an entry in the /etc/fstab file in the compute image for each DVS file system. For example:

smw:~ # vi /opt/xt-images/templates/default/etc/fstab
/home /home dvs path=/home,nodename=c0-0c0s2n3

4. For each DVS mount in the /etc/fstab file, create a mount point in the compute image.

smw:~ # mkdir -p /opt/xt-images/templates/default/home

5. Update the boot image to include these changes; follow the steps in Procedure 2 on page 62. Note: You can defer this step and update the boot image once before you finish booting the system.

Procedure 74. Modifying CCM and Platform-MPI system configurations

1. If you wish to enable additional features such as debugging and Linux NIS (Network Information Service) support, edit the CCM configuration file by using xtopview in the default view.

boot:~ # xtopview -m "configuring ccm.conf" default/:/ # vi /etc/opt/cray/ccm/ccm.conf

If you wish to configure additional CCM debugging, set CCM_DEBUG=yes.

If you wish to enable NIS support, set CCM_ENABLENIS=yes.

2. (Optional) You may have a site configuration where the path to the qstat command is not at a standard location. Change the values in the configuration file for CRAY_QSTAT_PATH and CRAY_BATCH_VAR accordingly for your site configuration.

3. Save and close ccm.conf.

4. Exit xtopview.

default/:/ # exit
boot:~ #

Important: If your applications will use Platform-MPI (previously known as HP-MPI), Cray recommends that users populate their ~/.hpmpi.conf (or ~/.pmpi.conf) file with these values:

MPI_REMSH=ssh
MPIRUN_OPTIONS="-cpu_bind=MAP_CPU:0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,\
22,23,24,25,26,27,28,29,30,31"

Procedure 75. Setting up files for the cnos class

The cnos compute nodes that have access to the shared root through CNRTE have a specialized class of their own /etc files. Login files and all /etc files should be migrated to the cnos class in order for CCM to work.


1. Use xtopview to access the cnos class specialized files:

boot:~# xtopview -m "CCM cnos setup" -c cnos

Note: If the SDB has not been started, use the -x /etc/opt/cray/sdb/node_classes option to specify node/class relationships.

2. To add a file or modify a file, edit the file and then specialize it for the cnos class.

class/cnos:/# vi /etc/file
class/cnos:/# xtspec -c cnos /etc/file

Repeat the above steps for each new file that you want to add or modify for the compute nodes.

3. Exit xtopview.

class/cnos:/# exit

Note: You are prompted to type c and enter a brief comment describing the changes you made. To complete your comment, type Ctrl-d or a period on a line by itself. Do this each time you exit xtopview to log a record of revisions into a version control system.

Procedure 76. Linking the CCM prologue/epilogue scripts for use with PBS and Moab TORQUE on login nodes

Prerequisites: This procedure requires that you have already installed a workload management system such as PBS or Moab TORQUE.

Add lines that append the CCM prologue and epilogue scripts to the end of the existing batch prologue and epilogue. The PBS batch prologue is configured on all PBS MOM nodes in /var/spool/PBS/mom_priv/prologue. The Moab TORQUE batch prologue is configured on all TORQUE MOM nodes in /var/spool/torque/mom_priv/prologue.

Note: This procedure assumes that you are using /bin/bash as your shell, but this can be modified appropriately for others.

1. Add the following lines to prologue:

#!/bin/bash
ccm_dir=/opt/cray/ccm/default/etc

if [ -x $ccm_dir/cray-ccm-prologue ] ; then
   $ccm_dir/cray-ccm-prologue $1 $2 $3
fi


2. Add the following lines to epilogue:

#!/bin/bash
ccm_dir=/opt/cray/ccm/default/etc

if [ -f $ccm_dir/cray-ccm-epilogue ] ; then
   $ccm_dir/cray-ccm-epilogue $1 $2 $3 $4 $5 $6 $7 $8 $9
fi

3. Set the executable bit for prologue and epilogue if not set:

system:/var/spool/PBS/mom_priv # chmod a+x prologue epilogue

4. Change the default batch time-out value. Cray recommends changing this to 120 seconds. This allows the system enough time to start up and shut down all infrastructure on the nodes associated with the CCM job. To change the batch time-out, append the following line to /var/spool/PBS/mom_priv/config or /var/spool/torque/mom_priv/config:

$prologalarm 120

Procedure 77. Using qmgr to create a general CCM queue and queues for separate ISV applications

1. Set up a general CCM queue by issuing the following qmgr commands on the PBS server node:

# module load pbs
# qmgr
Qmgr: create queue ccm_queue
Qmgr: set queue ccm_queue queue_type = Execution
Qmgr: set queue ccm_queue resources_max.mpparch = XT
Qmgr: set queue ccm_queue resources_min.mpparch = XT
Qmgr: set queue ccm_queue resources_min.mppwidth = 1
Qmgr: set queue ccm_queue resources_default.mpparch = XT
Qmgr: set queue ccm_queue resources_default.mppwidth = 1
Qmgr: set queue ccm_queue enabled = True
Qmgr: set queue ccm_queue started = True
Qmgr: exit

For Moab TORQUE, add this command while creating the queue:

set server query_other_jobs = True

2. Repeat step 1 for additional application-specific queues, if desired.


Procedure 78. Configuring Platform LSF for use with CCM

Prerequisites: This procedure requires that you have already installed the Platform LSF workload management system.

1. Determine the path to the directory on your system that contains the files lsb.queues and lsb.params. This path is ${LSF_TOP}/conf/lsbatch/${LSF_CLUSTER_NAME}/configdir, where LSF_TOP and LSF_CLUSTER_NAME are themselves paths that were defined at install time.

Example 109. Location of queue configuration files

If LSF_TOP=/opt/xt-lsfhpc and LSF_CLUSTER_NAME=nid00196, the full path to the directory containing the queue configuration files would be /opt/xt-lsfhpc/conf/lsbatch/nid00196/configdir.

2. Create a ccm_queue for Platform LSF. Refer to Platform documentation for information on managing LSF queues.

3. Enable PRE_EXEC and POST_EXEC scripts for the queue set up in Procedure 77 on page 308 by setting the following parameters in lsb.queues:

QUEUE_NAME = ccm_queue
PRE_EXEC = /opt/cray/ccm/default/etc/lsf_ccm_pre
POST_EXEC = /opt/cray/ccm/default/etc/lsf_ccm_post
LOCAL_MAX_PREEXEC_RETRY=1
DESCRIPTION=ccm_queue

To enable LSF using an application profile rather than a queue, set the following in lsb.applications:

Begin Application
NAME=ccm
DESCRIPTION=Sets up an application profile for CCM
PRE_EXEC=/opt/cray/ccm/default/etc/lsf_ccm_pre
POST_EXEC=/opt/cray/ccm/default/etc/lsf_ccm_post
LOCAL_MAX_PREEXEC_RETRY=1

For more information on the lsb.applications file, see the Platform LSF Configuration Reference manual.

4. In the file lsb.params, set JOB_INCLUDE_POSTPROC to ensure that the job reservation remains in a running state until execution of the POST_EXEC script completes and all necessary cleanup has finished:

JOB_INCLUDE_POSTPROC=Y


5. On the boot node shared root file system, update /etc/lsf.sudoers using xtopview:

boot:~ # xtopview
default/:/ # vi /etc/lsf.sudoers

Make the root user the LSB_PRE_POST_EXEC_USER:

LSB_PRE_POST_EXEC_USER=root

6. Exit the editor and change the default permissions for /etc/lsf.sudoers so that the batch system infrastructure can properly communicate with compute nodes:

default/:/ # chmod 600 /etc/lsf.sudoers

7. Exit xtopview.

Once you have completed system configuration and started the system compute nodes, you should verify that write permissions are correct. You can accomplish this by using touch to create a dummy file within CCM:

% ccmrun touch foo

If foo is created in the user directory then the write permissions are set correctly.

Procedure 79. Creating custom resources with PBS

1. Edit the /var/spool/PBS/server_priv/resourcedef file on the SDB node.

2. Add the following line:

ccm type=boolean

3. Invoke one of the following commands to submit a job that uses your custom resource.

• Batch:

qsub -lmppwidth=width -lccm=1 job_script.pbs

• Interactive:

qsub -I -lmppwidth=width -lccm=1 ./job_script.pbs


Procedure 80. Creating custom resources with Moab

Moab custom resources are managed as generic global node resources. These can be configured in the moab.cfg file in the installed Moab spool directory (for example, /var/spool/moab/moab.cfg).

1. Add the following entry to moab.cfg to allow up to 1024 ccm instances on the system at one time:

NODECFG[GLOBAL] GRES=ccm:1024

2. To consume this resource at runtime, invoke the following command:

qsub -I -lmppwidth=1 -lgres=ccm

The result of this submit is that the following information is set for the job:

Resource_List.gres = ccm

Using InfiniBand and OpenFabrics Interconnect Drivers with CLE Systems [11]

InfiniBand (IB) and OpenFabrics remote direct memory access (RDMA) is supported on service nodes for Cray systems running the Cray Linux Environment (CLE) operating system.

No separate installation is required. The kernel-space libraries and drivers are built against Cray's kernel. OFED and InfiniBand RPMs are included in the CLE release and installed by default. However, OFED will not run on your Cray system until you configure the I/O nodes to use IB.

To configure IB and OFED, see the procedures provided in this chapter; to configure IB and OFED during installation or upgrade of your CLE software, see Installing and Configuring Cray Linux Environment (CLE) Software.

11.1 InfiniBand and OFED Overview

Cray supports InfiniBand as an I/O interconnect. IB enables efficient zero-copy, low-latency RDMA transfers between network peers. As a result, IB gives Cray systems the most efficient transfer mechanism from the high speed network (HSN) to external I/O devices. CLE includes a subset of the OpenFabrics Enterprise Distribution (OFED) to support the use of InfiniBand on Cray I/O nodes. OFED is the software stack on the host that coordinates user-space and kernel-space access to the IB hardware. IB support is restricted to I/O service nodes that are equipped with PCI Express (PCIe) cards for network connectivity.

IB can be used on Lustre router nodes as a network interconnect between the Cray system and external Lustre servers.

The OFED software stack consists of many different components. These components can be categorized as kernel modules (drivers) and user/system libraries and utilities, including commands and daemons for InfiniBand administration, configuration, and diagnostics. Cray maintains the kernel modules so that they are compatible with CLE.


Figure 7. The OFED Stack (source: OpenFabrics Alliance) (diagram of the OFED software stack: user-level APIs, the subnet manager, and diagnostic tools in user space; upper layer protocols such as IPoIB, SDP, SRP, iSER, RDS, and NFS-RDMA over the connection manager abstraction, mid-layer services, and the InfiniBand verbs API in kernel space; and hardware-specific drivers for InfiniBand HCAs and iWARP R-NICs)

11.2 Using InfiniBand

InfiniBand is a payload-agnostic transport. It can move small messages or large blocks efficiently between network endpoints. The following examples demonstrate how Cray uses InfiniBand and the OFED stack to support block I/O, file I/O, and standard network inter-process communication.

11.2.1 Storage Area Networking

InfiniBand can transport block I/O requests to external storage targets. ANSI T10's SCSI RDMA Protocol (SRP) is currently the only SCSI-transporting protocol supported on Cray systems with InfiniBand. Figure 8 shows SRP on InfiniBand connecting the Cray to an external RAID array. The OFED stack is shown in the storage array for clarity; it is provided by your site-specific third party storage vendor.


Figure 8. Cray System Connected to Storage Using SRP (diagram: a user application on a Cray XE compute node reaches a Lustre server on an I/O node across the Cray HSN via the gni LND and Gemini ASICs; the I/O node then uses SRP over the OFED RDMA stack, IB driver, and IB HCA to reach the external RAID storage over InfiniBand)

11.2.2 Lustre Routing

Cray uses InfiniBand on the service nodes to connect Cray compute nodes to External Services File System (esFS) Lustre servers, as shown in Figure 9. In this configuration, the Cray service node is no longer a Lustre server. Instead, it runs a Lustre router provided by the LNET layer. The router moves LNET messages between the Cray HSN and the external IB network, which transports file-level I/O requests between the clients on the Cray HSN and the servers over the IB fabric. Please speak with your Cray service representative regarding an esFS solution for your Cray system.

Figure 9. Cray Service Node Acting as an InfiniBand Lustre Router (diagram: Lustre clients on Cray XE compute nodes use the gni LND over the Cray HSN to reach a Lustre router on an I/O node; the router uses the o2ib LND over the OFED RDMA stack, IB driver, and IB HCA to reach external Lustre servers across the InfiniBand fabric, and those servers access their block storage over a SAN)

11.2.3 IP Connectivity

InfiniBand can also carry socket-based inter-process traffic typical of commodity clusters and TCP/IP networking. InfiniBand supports the IP over IB (IPoIB) protocol. Since IB plugs in below the socket interface, neither the application nor the service needs to be recompiled to communicate over an InfiniBand network. Both protocols are diagrammed on a service node in Figure 10.

Figure 10. Cray Service Node in IP over IB Configuration (diagram: an application on a Cray XE I/O node and a service on an external TCP host communicate through the standard sockets and TCP/IP layers, with IPoIB over the OFED RDMA stack, IB drivers, and IB HCAs carrying the traffic across the InfiniBand fabric)

11.3 Configuration

In addition to the OFED RDMA stack, Cray supports three upper layer protocols (ULPs) on its service nodes, as shown in Table 10. Because all ULPs use the OFED stack, the InfiniBand Configuration must be followed for all IB service nodes.

Note: It is only necessary to configure the specific ULPs that you intend to use on the service node. For example, a Lustre server with an IB direct-attached storage array uses the SCSI RDMA Protocol (SRP), not the LNET Router. On the other hand, if the Lustre servers are external to the Cray system, the service node uses the LNET router instead of SRP. IP over InfiniBand (IPoIB) is used to connect non-RDMA socket applications across the IB network.


Table 10. Upper Layer InfiniBand I/O Protocols for Cray Systems

Upper Layer Protocol        Purpose

IP over IB (IPoIB)          Provides IP connectivity between hosts over IB.

Lustre (OFED LND)           Base driver for Lustre over IB. On service nodes, this
                            protocol enables efficient routing of LNET messages from
                            Lustre clients on the Cray HSN to external IB-connected
                            Lustre servers. The name of the LND is o2iblnd.

SCSI RDMA Protocol (SRP)    T10 standard for mapping SCSI over IB and other RDMA
                            fabrics. Supported by DDN and LSI for their IB-based
                            storage controllers.

11.4 InfiniBand Configuration

Procedure 81. Configuring InfiniBand on service nodes

InfiniBand includes the core OpenFabrics stack and a number of upper layer protocols (ULPs) that use this stack. Configure InfiniBand by modifying /etc/sysconfig/infiniband for each IB service node.

1. Use the xtopview command to access service nodes with IB HCAs. For example, if the service nodes with IB HCAs are part of a node class called lnet, type the following command:

boot:~ # xtopview -x /etc/opt/cray/sdb/node_classes -c lnet

Or

Access each IB service node by specifying either a node ID or physical ID. For example, access node 27 by typing the following:

boot:~ # xtopview -x /etc/opt/cray/sdb/node_classes -n 27

2. Specialize the /etc/sysconfig/infiniband file:

node/27:/ # xtspec -n 27 /etc/sysconfig/infiniband

3. Add IB services to the service nodes by using standard Linux mechanisms, such as executing the chkconfig command while in the xtopview utility or executing /etc/init.d/openibd start | stop | restart (which starts or stops the InfiniBand services immediately). Use the chkconfig command to ensure that IB services are started at system boot.

node/27:/ # chkconfig --force openibd on


4. While in the xtopview session, edit /etc/sysconfig/infiniband and make these changes. A consolidated excerpt of the resulting settings follows the substeps.

node/27:/ # vi /etc/sysconfig/infiniband

a. By default, IB services do not start at system boot. Change the ONBOOT parameter to yes to enable IB services at boot.

ONBOOT=yes

b. By default at boot time, the Internet Protocol over InfiniBand (IPoIB) driver loads on all nodes where IB services are configured. Verify that the value for IPOIB_LOAD is set to yes to enable IPoIB services.

IPOIB_LOAD=yes

Important: LNET routers use IPoIB to select the paths that data will travel via RDMA.

c. The SCSI RDMA Protocol (SRP) driver loads by default on all nodes where IB services are configured to load at boot time. If your Cray system needs SRP services, verify that the value for SRP_LOAD is set to yes to enable SRP.

SRP_LOAD=yes

Important: Direct-attached InfiniBand file systems require SRP; Lustre file systems external to the Cray system do not require SRP.
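Taken together, the relevant portion of /etc/sysconfig/infiniband on an IB service node would contain settings like the following minimal excerpt (only the lines discussed in this step are shown; the rest of the file is left at its defaults):

ONBOOT=yes
IPOIB_LOAD=yes
SRP_LOAD=yes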

5. Exit xtopview.

node/27:/ # exit
boot:~ #

Note: You are prompted to type c and enter a brief comment describing the changes you made. To complete your comment, type Ctrl-d or a period on a line by itself. Do this each time you exit xtopview to log a record of revisions into an RCS system.

6. Proper IPoIB operation requires additional configuration. See Procedure 83 on page 320.


11.5 Subnet Manager (OpenSM) Configuration

InfiniBand fabrics require at least one Subnet Manager (SM) operating on each IB subnet in order to activate its respective IB port connected to the fabric. This is one critical difference between IB fabrics and Ethernet, where simply connecting a cable to an Ethernet port is sufficient to get an active link. Managed IB switches typically include an SM and, therefore, do not require any additional configuration of any of the hosts. Unmanaged IB switches, which are considerably less expensive, do not include an SM and, thus, at least one host connected to the switch must act as a subnet manager. InfiniBand standards also support switchless (point-to-point) connections as long as an SM is installed. An example of this case is when a service node is connected to direct-attached storage through InfiniBand.

The OpenFabrics distribution includes OpenSM, an open-source IB subnet management and subnet administration utility. Either one of the following configuration steps is necessary if no other subnet manager is available on the IB fabric. The subnet manager RPMs are installed in the shared root by running CLEinstall. OpenSM can be started from the service node on a single port at boot time or manually from the command line to load multiple instances per host.

11.5.1 Starting OpenSM at Boot Time

Procedure 82. Starting a single instance of OpenSM on a service node at boot time

Note: This procedure assumes that the IB HCA is in node 8.

1. Use xtopview to access the service node that will host your instance of OpenSM.

boot:~ # xtopview -x /etc/opt/cray/sdb/node_classes -n 8

2. Specialize /etc/sysconfig/opensm for the IB node.

node/8:/ # xtspec -n 8 /etc/sysconfig/opensm

3. Edit /etc/sysconfig/opensm to have OpenSM start at boot time.

# To start OpenSM automatically set ONBOOT=yes
ONBOOT=yes

4. Add IB services to the service nodes by using standard Linux mechanisms, such as executing the chkconfig command while in the xtopview utility or executing /etc/init.d/opensmd start|stop|restart|status (which starts or stops the OpenSM service immediately). The chkconfig command can be used to ensure that the OpenSM service is started at system boot.

node/8:/ # /sbin/chkconfig --force opensmd on


11.6 Internet Protocol over InfiniBand (IPoIB) Configuration

Procedure 83. Configuring IP Over InfiniBand (IPoIB) on Cray systems

1. Use xtopview to access each service node with an IB HCA by specifying either a node ID or physical ID. For example, to access node 27, type the following:

boot:~ # xtopview -x /etc/opt/cray/sdb/node_classes -n 27

2. Specialize the /etc/sysconfig/network/ifcfg-ib0 file.

node/27:/ # xtspec -n 27 /etc/sysconfig/network/ifcfg-ib0

3. Modify the site-specific /etc/sysconfig/network/ifcfg-ib0 file on each service node with an IB HCA; a complete example of the resulting file appears after these edits.

node/27:/ # vi /etc/sysconfig/network/ifcfg-ib0

For example, to use static IP address, 172.16.0.1, change the BOOTPROTO line in the file.

BOOTPROTO='static'

Add the following lines to the file.

IPADDR='172.16.0.1'
NETMASK='255.128.0.0'

To configure the interface at system boot, change the STARTMODE line in the file.

STARTMODE='onboot'
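Putting the pieces together, a minimal ifcfg-ib0 for this example would contain the following; the 172.16.0.1 address and 255.128.0.0 netmask are only illustrations and must match your site's IB subnet:

BOOTPROTO='static'
IPADDR='172.16.0.1'
NETMASK='255.128.0.0'
STARTMODE='onboot'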

4. (Optional) If you would like to configure IPoIB for another IB interface connected to this node, repeat step 2 and step 3 for /etc/sysconfig/network/ifcfg-ibn.

Note: For LNET traffic, each IB interface should be assigned a unique IP address from the subnet that it will operate on. For TCP/IP traffic, multiple IB interfaces on a node must be assigned unique IP addresses from different subnets.

11.7 Configuring SCSI RDMA Protocol (SRP) on Cray Systems

Procedure 84. Configuring and enabling SRP on Cray Systems

1. Use the xtopview command to access service nodes with IB HCAs. For example, if the service nodes with IB HCAs are part of a node class called ib, type the following command:

boot:~ # xtopview -x /etc/opt/cray/sdb/node_classes -c ib

2. Edit /etc/sysconfig/infiniband

ib/:/ # vi /etc/sysconfig/infiniband


and change the value of SRP_DAEMON_ENABLE to yes:

SRP_DAEMON_ENABLE=yes

3. Edit srp_daemon.conf to increase the maximum sector size for SRP.

ib/:/ # vi /etc/srp_daemon.conf

a max_sect=8192

4. Edit /etc/modprobe.conf.local to increase the maximum number of gather-scatter entries per SRP I/O transaction.

ib/:/ # vi /etc/modprobe.conf.local

options ib_srp srp_sg_tablesize=255

5. Exit from xtopview.

ib/:/ # exit
boot:~ #

11.8 Lustre Networking (LNET) Router

LNET is the Lustre networking layer. LNET isolates the file system code from the Lustre Networking Drivers (LNDs), which provide an interface to the underlying network transport. For more information on Lustre networking please see the Lustre documentation at http://wiki.whamcloud.com/display/PUB/Documentation.

Although LNET is automatically loaded with the Lustre servers and clients, it can be launched by itself to create a standalone router between networks instantiated by LNDs. LNET routing is most efficient when the underlying transports are capable of remote direct memory access (RDMA). Cray Lustre currently supports LNDs for a number of RDMA transports, including gnilnd, which is used for Cray systems using the Gemini or Aries interconnects, and o2iblnd, which is used for the OpenFabrics InfiniBand stack. Cray builds and distributes the OFED LND (o2iblnd) and gnilnd as part of its Lustre distribution.

Note: LNET routing is also available over GigE and 10GigE networks with socklnd, although this configuration does not support RDMA.

Routing Lustre involves configuring router nodes to route traffic between Lustre servers and Lustre clients which are on different networks. Routing Lustre requires that three types of nodes be configured: the router, the client, and the InfiniBand server. LNET assigns node IDs to each node on the network. While gnilnd uses node IDs (for example, nnn@gni) to enumerate ports on the Cray side, o2iblnd uses InfiniBand IP addresses (for example, nn.nn.nnn.nnn@o2ib) for hosts. As a result, IPoIB must be configured on each IB port. See Internet Protocol over InfiniBand (IPoIB) Configuration on page 320 for more information.

For the rest of this discussion, assume that LNET routers are being created on two Cray service nodes, both of which have a single IB port connected to a switched InfiniBand fabric. The network configuration is shown in Internet Protocol over InfiniBand (IPoIB) Configuration on page 320.

Table 11. LNET Network Address Configuration for Cray Systems

gni Address        Network Component     InfiniBand Address
27                 Router 1              10.10.10.28
31                 Router 2              10.10.10.32
10.128.0.255       IP Subnet             10.10.10.255
255.255.255.0      Subnet Mask           255.255.255.0

11.8.1 Configuring the LNET Router

Procedure 85. Configuring the LNET routers

The following description covers the configuration of the LNET router nodes.

1. Use xtopview to access the default view of the shared root.

boot:~ # xtopview -x /etc/opt/cray/sdb/node_classes

2. Create your /etc/init.d/lnet router controller (RC) script. An RC script is necessary to start LNET in the absence of any Lustre file services. A sample RC script is available in Appendix H, Sample LNET Router Controller Script on page 379.

Note: Cray does not provide an RC script with its release packages. You must verify that this script will work for your configuration or contact your Cray service representative for more information.

a. Create the /etc/init.d/lnet file in the default view.

default/:/ # vi /etc/init.d/lnet

b. Copy the sample script into your new file and write it.

3. Use chkconfig to enable LNET since there are no mounts or Lustre server activity to load the LNET module implicitly.

default/:/ # /sbin/chkconfig lnet on


4. Add the following LNET directives to the Cray shared root in /etc/modprobe.conf.local.

options lnet ip2nets="gni0 10.128.0.*; o2ib 10.10.10.*"
options lnet routes="gni0 10.10.10.[28,32]@o2ib; o2ib [27,31]@gni0"

Note: Larger system configuration LNET directives may exceed the 1024 character limit of modprobe.conf entries. Procedure 86 on page 324 allows administrators to source ip2nets and routes information from files to work around this limitation.

o2ib is the LNET name for the OFED LND and gni is the LNET name for the Cray LND. Here ip2nets is used instead of networks because it provides for an identical modprobe.conf across all Lustre clients in the Cray system. The ip2nets directive tells LNET to load both LNDs and associates each LND with an IP subnet. It overrides any previous networks directive (for example, lnet networks=gni). On service nodes without an IB adapter, the o2ib LND does not load because there are no ports on the IP subnet defined in ip2nets.

Note: Each Cray system sharing an external Lustre file system must have a unique gni identifier for the LNET options listed in this step. In this case, the Cray system is using gni0. Other systems would use other numbers to identify their LNET networks (such as gni1, gni2, and so on).

5. Cray recommends enabling these options to improve network resiliency. Edit /etc/modprobe.conf.local on the Cray shared root to include:

options ko2iblnd timeout=100 peer_timeout=130
options ko2iblnd credits=2048 ntx=2048
options ko2iblnd peer_credits=126 concurrent_sends=63 peer_buffer_credits=128
options kgnilnd credits=2048 peer_health=1
options lnet check_routers_before_use=1
options lnet dead_router_check_interval=60
options lnet live_router_check_interval=60
options lnet router_ping_timeout=50
options lnet large_router_buffers=1024 small_router_buffers=16384

Note: You will need to configure these values for your specific system size.

6. Exit from xtopview. You are prompted to add a comment about the operations you have performed. Enter c, and then enter a brief comment about the changes you made to the file.


Procedure 86. Specifying service node LNET routes and ip2nets directives with files

1. Enter xtopview and create the following files (if they do not already exist):

boot:~ # xtopview
default/:/ # touch /etc/lnet_ip2nets.conf
default/:/ # touch /etc/lnet_routes.conf

2. Exit xtopview.

default/:/ # exit

3. Enter the correct view of xtopview for the service node or node class that you wish to modify modprobe directives for and specialize the two files you created in step 1.

boot:~ # xtopview -c login
class/login:/ # xtspec /etc/lnet_ip2nets.conf
class/login:/ # xtspec /etc/lnet_routes.conf

4. Add the following lines to /etc/lnet_ip2nets.conf (substitute your site-specific values here).

gni0 10.128.0.*
o2ib 10.10.10.*

5. Add the following lines to /etc/lnet_routes.conf (substitute your site-specific values here).

gni0 10.10.10.[28,32]@o2ib
o2ib [27,31]@gni0

6. Add the following LNET directives to the Cray shared root in /etc/modprobe.conf.local.

options lnet ip2nets="/etc/lnet_ip2nets.conf"
options lnet routes="/etc/lnet_routes.conf"

7. Exit xtopview.

class/login:/ # exit

Procedure 87. Manually controlling LNET routers

• If /etc/init.d/lnet is not provided, send the following commands to each LNET router node to control them manually:

• Startup:

modprobe lnet
lctl net up

• Shutdown:

lctl net down
lustre_rmmod
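After bringing the network up, a quick sanity check (a minimal sketch; it assumes the Lustre utilities are installed on the router node) is to list the NIDs that LNET has configured and confirm that both a gni and an o2ib NID appear:

lctl list_nids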


11.8.2 Configuring Routes for the Lustre Server

Procedure 88. Configuring the InfiniBand Lustre Server

The Lustre servers must be configured with proper routes to allow them to reach the Cray client nodes on the gni network. If there are multiple Cray systems involved, each Cray must use a unique gni network identifier (for example, gni0, gni1). The server routes configuration maps each gni network to the corresponding Cray router nodes. Add an lnet routes directive for each Cray system. Perform these steps on the remote host:

1. Edit /etc/modprobe.conf on the remote host to include the route to the LNET network.

options lnet ip2nets="o2ib"
options lnet routes="gni0 10.10.10.[28,32]@o2ib"

If there are two Cray systems accessing the file system exported by these hosts, then both Cray systems must be included in the lnet routes directive.

options lnet routes="gni0 10.10.10.[28,32]@o2ib; \
gni1 10.10.10.[71,72,73,74]@o2ib"

In this example, there are two Cray systems: gni0 with two router nodes and gni1 with four. 2. Add the following options to /etc/modprobe.conf to improve network resiliency : options ko2iblnd timeout=100 peer_timeout=0 keepalive=30 options ko2iblnd credits=2048 ntx=2048 options ko2iblnd peer_credits=126 concurrent_sends=63

options lnet avoid_asym_router_failure=1
options lnet dead_router_check_interval=60
options lnet live_router_check_interval=60
options lnet router_ping_timeout=50
options lnet check_routers_before_use=1

Because Lustre is running on the external host, there is no need to start LNET explicitly.


11.8.3 Configuring the LNET Compute Node Clients

Procedure 89. Configuring the LNET Compute Node Clients

Since compute nodes are running the Lustre client, they do not need explicit commands to start LNET. There is, however, additional configuration required for compute nodes to be able to access the remote Lustre servers via the LNET routers. These changes are made to /etc/modprobe.conf for the compute node image used in booting the system.

1. Edit /etc/modprobe.conf for the compute node boot image. The lnet ip2nets directive identifies the LND. If there is more than one Cray system sharing the file system, then this identifier (gni) must be unique for each Cray system.

options lnet ip2nets="gni0"
options lnet routes="o2ib [27,31]@gni0"

2. Add the following options to /etc/modprobe.conf to improve network resiliency:

options lnet avoid_asym_router_failure=1
options lnet dead_router_check_interval=60
options lnet live_router_check_interval=60
options lnet router_ping_timeout=50
options lnet check_routers_before_use=1

3. (Optional) For InfiniBand-connected LNET clients (such as esLogin nodes), add the following options to /etc/modprobe.conf:

options ko2iblnd timeout=100 peer_timeout=0 keepalive=30
options ko2iblnd credits=2048 ntx=2048
options ko2iblnd peer_credits=126 concurrent_sends=63

4. Modify /etc/fstab in the compute node boot image to identify the external server. The file system is described by the LNET node ID of the MGS server (and its failover partner if Lustre failover is configured):

10.10.10.1@o2ib:10.10.10.2@o2ib:/boss1 /mnt/boss1 lustre rw,flock,lazystatfs

Here, the fstab mount option rw gives read/write access to the client node. The flock option allows the Lustre client node to use file locking, and the lazystatfs option prevents commands such as df from hanging if one or more OSTs are unavailable. In this example, the Lustre file system with the fsname "boss1" is provided by the Lustre management server (MGS) on the InfiniBand fabric at IP address 10.10.10.1 (with 10.10.10.2 being the failover partner). Because both routers have access to this subnet, the Lustre client performs a round-robin with its requests to the routers.
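Before rebuilding the boot image, the mount entry can be tested by hand from a node that already has the Lustre client and the LNET configuration loaded; this is a sketch using the same example addresses and fsname, not a required step:

mount -t lustre -o rw,flock,lazystatfs 10.10.10.1@o2ib:10.10.10.2@o2ib:/boss1 /mnt/boss1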

5. Update the boot image to include these changes; follow the steps in Procedure 2 on page 62.


Important: Accessing any externally supplied Lustre file system requires that both the file server hosts and the LNET routers be up and available before the clients attempt to mount the file system. Boot time scripts in the compute node image take care of reading fstab and running the necessary mount commands. In production, this is the only opportunity to run Lustre mount commands because kernel modules get deleted at the end of the boot process.

11.9 Configuring Fine-grained Routing with clcvt

The clcvt command aids in the configuration of LNET fine-grained routing (FGR). FGR is a routing scheme that aims to group sets of Lustre servers on the file system storage array with LNET routers on the Cray system. This grouping maximizes file system performance on larger systems by using a router-to-server ratio where the relative bandwidth is roughly equal on both sides. FGR also minimizes the number of LNET network hops (hop count) and file system network congestion by sending traffic to particular Lustre servers over dedicated network lanes instead of the default round-robin configuration.

The clcvt command takes as input several file-system-specific files and generates LNET kernel module configuration information that can be used to configure the servers, routers, and clients for that file system. The utility can also create cable maps in HTML, CSV, and human-readable formats and validate cable connections on installed systems. For more information, such as available options and actions for clcvt, see the clcvt(8) man page.

11.9.1 Prerequisite Files

The clcvt command requires several prerequisite files in order to compute the ip2nets and routes information for your specific configuration. These files must be created before clcvt can be executed for the first time.


How to assign routers to OSSs, what FGR ratios to use, which interface on each router to use for an LNET group, and where to place the routers can all vary greatly from site to site. LNET configuration is determined as the system is ordered and configured; see your Cray representative for your site-specific values.

info.file-system-identifier

A file with global file system information for the cluster-name server machine and each client system that will access it.

client-system.hosts

A file that maps the client system (such as the Cray mainframe) IP addresses to unique host names, such as the boot node /etc/hosts file. The client-system name must match one of the clients in the info.file-system-identifier file.

client-system.ib

A file that maps the client system LNET router InfiniBand IP addresses to system hardware cnames. The client-system name must match one of the clients in the info.file-system-identifier file. This file must be created by an administrator.

clustername.ib

A file that maps the Lustre server InfiniBand IP addresses to cluster (for example, Sonexion) host names. The clustername name must match the clustername in the info.file-system-identifier file. This file must be created by an administrator.

client-system.rtrIm

A file that contains rtr -Im command output (executed on the SMW) for the client-system.

11.9.1.1 info.file-system-identifier

info.file-system-identifier is a manually-created file that contains global file system information for the Lustre server machine and each client system that will access it. Based on the ratio of servers to LNET routers in your configuration, the [clustername] section and each [client-system] section will define which servers and routers will belong to each InfiniBand subnet.

This is an ini-style file; the possible keywords in the [info] section include clustername, ssu_count, and clients.
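For example, a minimal [info] section (using the values from Example 110 below) looks like this:

[info]
clustername = snx11029n
SSU_count = 6
clients = hera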


clustername defines the base name used for all file system servers. For a Sonexion file system, as in the example below, it might be something like snx11029n, and all server hostnames will be snx11029nNNN, where NNN is a three-digit number starting at 000 and 001 for the primary and secondary Cray Sonexion Management Servers (CSMS), 002 for the MGS, 003 for the MDS, 004 for the first OSS, and counting up from there for all remaining OSSs.

ssu_count defines how many SSUs make up a Sonexion file system. If this is missing, then this is not a Sonexion file system but an esFS installation.

clients defines a comma-separated list of mainframe names that front-end this filesystem.

The info.file-system-identifier file also needs a [client-system] section for each client system listed in the clients line of the [info] section to describe the client systems, and a [clustername] section to describe the Lustre server system. Each of these sections contains a literal lnet_network_wildcard in the format LNET-name:IP-wildcard, which instructs the LNET module to match a host's IP address to IP-wildcard and, if it matches, instantiate LNET LNET-name on that host. Example 110 shows a sample info.file-system-identifier configuration file.

The hostname fields in the [client-system] section of this file are fully-qualified interface specifications of the form hostname(ibn), where (ib0) is the assumed default if not specified. Cray XC30 systems can support multiple InfiniBand interfaces per router. Configure the second IB interface (see Procedure 83 on page 320) and append the interface names (ibn) to the cnames for the routers. See Example 111 for reference. You will also need to append these interface names to the client-system.ib file. InfiniBand port assignments are shown below in Figure 11.

Figure 11. Cray XC30 InfiniBand Port Assignment

Outer slot: ib0 = mlx4_0 Port 1, ib1 = mlx4_0 Port 2, ib2 = mlx4_1 Port 1, ib3 = mlx4_1 Port 2
Inner slot: ib0 = mlx4_0 Port 1, ib1 = mlx4_0 Port 2, ib2 = mlx4_1 Port 1, ib3 = mlx4_1 Port 2 (physical port order mirrored between the two slots)


Example 110. Sample info.file-system-identifier file: info.snx11029

# This section describes the size of this filesystem.
[info]
clustername = snx11029n
SSU_count = 6
clients = hera

[hera]
lnet_network_wildcard = gni1:10.128.*.*

# Because of our cabling assumptions and naming conventions, we only
# need to know which XIO nodes are assigned to which LNETs. From that
# our tool can actually generate a "cable map" for the installation folks.
o2ib6000: c0-0c2s2n0, c0-0c2s2n2 ; MGS and MDS
o2ib6002: c1-0c0s7n0, c1-0c0s7n1, c1-0c0s7n2, c1-0c0s7n3 ; OSSs 2/4/6
o2ib6003: c3-0c1s5n0, c3-0c1s5n1, c3-0c1s5n2, c3-0c1s5n3 ; OSSs 3/5/7
o2ib6004: c3-0c1s0n0, c3-0c1s0n1, c3-0c1s0n2, c3-0c1s0n3 ; OSSs 8/10/12
o2ib6005: c3-0c2s4n0, c3-0c2s4n1, c3-0c2s4n2, c3-0c2s4n3 ; OSSs 9/11/13

[snx11029n]
lnet_network_wildcard = o2ib6:10.10.100.*
o2ib6000: snx11029n002, snx11029n003 ; MGS and MDS
o2ib6002: snx11029n004, snx11029n006, snx11029n008 ; OSSs 2/4/6
o2ib6003: snx11029n005, snx11029n007, snx11029n009 ; OSSs 3/5/7
o2ib6004: snx11029n010, snx11029n012, snx11029n014 ; OSSs 8/10/12
o2ib6005: snx11029n011, snx11029n013, snx11029n015 ; OSSs 9/11/13

Example 111. Sample info.file-system-identifier file using multiple IB interfaces per router

[info]
clustername = snx11014n
SSU_count = 4
clients = crystal

[crystal]
lnet_network_wildcard = gni0:10.128.*.*
o2ib1000: c2-0c0s1n1, c2-0c1s1n1 ; MGS and MDS
o2ib1002: c0-0c0s1n2(ib0), c3-0c0s0n1(ib0), c3-0c0s0n2(ib0) ; OSSs 4/6/8/10
o2ib1003: c0-0c0s1n2(ib2), c3-0c0s0n1(ib2), c3-0c0s0n2(ib2) ; OSSs 5/7/9/11

[snx11014n]
lnet_network_wildcard = o2ib:10.10.100.*
o2ib1000: snx11014n002, snx11014n003 ; MGS and MDS
o2ib1002: snx11014n004, snx11014n006, snx11014n008, snx11014n010 ; OSSs 4/6/8/10
o2ib1003: snx11014n005, snx11014n007, snx11014n009, snx11014n011 ; OSSs 5/7/9/11

11.9.1.2 client-system.hosts

For a typical Cray system, this file can be the /etc/hosts file taken from the boot node. Simply make a copy of the /etc/hosts file from the boot node and save it to a working directory where you will later run the clcvt command.
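For example (a sketch; the host names and working directory match the examples below, substitute your own):

crayadm@hera-smw:~/working_dir> scp root@boot:/etc/hosts ./hera.hosts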


Example 112. Sample client-system.hosts file: hera.hosts

#
# hosts        This file describes a number of hostname-to-address
#              mappings for the TCP/IP subsystem. It is mostly
#              used at boot time, when no name servers are running.
#              On small systems, this file can be used instead of a
#              "named" name server.
# Syntax:
#
# IP-Address   Full-Qualified-Hostname   Short-Hostname
#

127.0.0.1 localhost

# special IPv6 addresses
::1             ipv6-localhost localhost ipv6-loopback
fe00::0         ipv6-localnet
ff00::0         ipv6-mcastprefix
ff02::1         ipv6-allnodes
ff02::2         ipv6-allrouters
ff02::3         ipv6-allhosts
# Licenses
172.30.74.55    tic tic.us.cray.com
172.30.74.56    tac tac.us.cray.com
172.30.74.57    toe toe.us.cray.com
172.30.74.206   cflls01 cflls01.us.cray.com
172.30.74.207   cflls02 cflls02.us.cray.com
172.30.74.208   cflls03 cflls03.us.cray.com
##LDAP Server Info
172.30.12.46    kingpin kingpin.us.cray.com kingpin.cray.com
172.30.12.48    kingfish kingfish.us.cray.com kingfish.cray.com
##esLogin Info
172.30.48.62    kiyi kiyi.us.cray.com el-login0.us.cray.com
10.2.0.1        kiyi-eth1
##Networker server
#172.30.74.90   booboo booboo.us.cray.com

10.3.1.1        smw
10.128.0.1      nid00000  c0-0c0s0n0  dvs-0
10.128.0.2      nid00001  c0-0c0s0n1  boot001 boot002
10.128.0.31     nid00030  c0-0c0s0n2  #old ddn6620_mds
10.128.0.32     nid00031  c0-0c0s0n3  hera-rsip2
10.128.0.3      nid00002  c0-0c0s1n0
10.128.0.4      nid00003  c0-0c0s1n1
10.128.0.29     nid00028  c0-0c0s1n2
10.128.0.30     nid00029  c0-0c0s1n3
10.128.0.5      nid00004  c0-0c0s2n0
10.128.0.6      nid00005  c0-0c0s2n1
10.128.0.27     nid00026  c0-0c0s2n2
10.128.0.28     nid00027  c0-0c0s2n3
10.128.0.7      nid00006  c0-0c0s3n0
10.128.0.8      nid00007  c0-0c0s3n1
10.128.0.25     nid00024  c0-0c0s3n2
10.128.0.26     nid00025  c0-0c0s3n3


10.128.0.9      nid00008  c0-0c0s4n0  login login1 hera
10.128.0.10     nid00009  c0-0c0s4n1  sdb001 sdb002
10.128.0.23     nid00022  c0-0c0s4n2  hera-rsip hera-rsip1
10.128.0.24     nid00023  c0-0c0s4n3  mds nid00023_mds
...

11.9.1.3 client-system.ib

The client-system.ib file contains client-system LNET router InfiniBand IP address to cname mapping information in a /etc/hosts style format. The hostname field in this file is a fully-qualified interface specification of the form hostname(ibn), where (ib0) is the assumed default if not specified. Cray XC30 systems can support multiple InfiniBand interfaces per router; configure the second IB interface (see Procedure 83 on page 320) and append the interface names (ibn) to the cnames for the routers. The LNET router InfiniBand IP addresses should be within the same subnet as the Lustre servers (MGS/MDS/OSS); one possible address assignment scheme would be to use a contiguous set of IP addresses, with ib0 and ib2 on each node having adjacent addresses. See Example 114 for reference. You will also need to append these interface names to the info.file-system-identifier file. This file must be created by an administrator.

Example 113. Sample client-system.ib file: hera.ib

#
# This is the /etc/hosts-like file for Infiniband IP addresses
# on "hera".
#
10.10.100.101   c0-0c2s2n0
10.10.100.102   c0-0c2s2n2
10.10.100.103   c1-0c0s7n0
10.10.100.104   c1-0c0s7n1
10.10.100.105   c1-0c0s7n2
10.10.100.106   c1-0c0s7n3
10.10.100.107   c3-0c1s0n0
10.10.100.108   c3-0c1s0n1
10.10.100.109   c3-0c1s0n2
10.10.100.110   c3-0c1s0n3
10.10.100.111   c3-0c1s5n0
10.10.100.112   c3-0c1s5n1
10.10.100.113   c3-0c1s5n2
10.10.100.114   c3-0c1s5n3
10.10.100.115   c3-0c2s4n0
10.10.100.116   c3-0c2s4n1
10.10.100.117   c3-0c2s4n2
10.10.100.118   c3-0c2s4n3


Example 114. Sample client-system.ib file using multiple IB interfaces per router

#
# This is the /etc/hosts-like file for Infiniband IP addresses
# on "crystal".
#
10.10.100.101   c0-0c0s1n2
10.10.100.102   c0-0c0s1n2(ib2)
10.10.100.103   c3-0c0s0n1
10.10.100.104   c3-0c0s0n1(ib2)
10.10.100.105   c3-0c0s0n2
10.10.100.106   c3-0c0s0n2(ib2)
10.10.100.107   c2-0c0s1n1
10.10.100.108   c2-0c1s1n1

11.9.1.4 cluster-name.ib

The cluster-name.ib file contains Lustre server InfiniBand IP addresses to cluster (for example, Sonexion) host name mapping information in a /etc/hosts style format. This file must be created by an administrator.

Example 115. Sample cluster-name.ib file: snx11029n.ib

#
# This is the /etc/hosts-like file for Infiniband IP addresses
# on the Sonexion known as "snx11029n".
#
10.10.100.1     snx11029n000  #mgmnt
10.10.100.2     snx11029n001  #mgmnt
10.10.100.3     snx11029n002  #mgs
10.10.100.4     snx11029n003  #mds
10.10.100.5     snx11029n004  #first oss, oss0
10.10.100.6     snx11029n005
10.10.100.7     snx11029n006
10.10.100.8     snx11029n007
10.10.100.9     snx11029n008
10.10.100.10    snx11029n009
10.10.100.11    snx11029n010
10.10.100.12    snx11029n011
10.10.100.13    snx11029n012
10.10.100.14    snx11029n013
10.10.100.15    snx11029n014
10.10.100.16    snx11029n015  #last oss, oss11

11.9.1.5 client-system.rtrIm

The client-system.rtrIm file contains output from the rtr -Im command as executed from the SMW. When capturing the command output to a file, use the -H option to remove the header information from rtr -Im, or open the file after capturing and delete the first two lines.


Procedure 90. Creating the client-system.rtrIm file on the SMW

1. Log on to the SMW.

crayadm@boot:~> ssh smw
Password:
Last login: Sun Feb 24 23:05:29 2013 from boot
crayadm@hera-smw:~>

2. Run the following command to capture the rtr -Im output (without header information) to a file:

crayadm@hera-smw:~> rtr -Im -H > hera.rtrIm

3. Move the hera.rtrIm file to the working directory that you will later run the clcvt command from.

crayadm@hera-smw:~> mv hera.rtrIm /path/to/working/dir/.

Example 116. Sample client-system.rtrIm file: hera.rtrIm

 0   0  c0-0c0s0n0  c0-0c0s0g0  0  0  0
 1   1  c0-0c0s0n1  c0-0c0s0g0  0  0  0
 2   4  c0-0c0s1n0  c0-0c0s1g0  0  0  1
 3   5  c0-0c0s1n1  c0-0c0s1g0  0  0  1
 4   8  c0-0c0s2n0  c0-0c0s2g0  0  0  2
 5   9  c0-0c0s2n1  c0-0c0s2g0  0  0  2
 6  12  c0-0c0s3n0  c0-0c0s3g0  0  0  3
 7  13  c0-0c0s3n1  c0-0c0s3g0  0  0  3
 8  16  c0-0c0s4n0  c0-0c0s4g0  0  0  4
 9  17  c0-0c0s4n1  c0-0c0s4g0  0  0  4
10  20  c0-0c0s5n0  c0-0c0s5g0  0  0  5
11  21  c0-0c0s5n1  c0-0c0s5g0  0  0  5
12  24  c0-0c0s6n0  c0-0c0s6g0  0  0  6
13  25  c0-0c0s6n1  c0-0c0s6g0  0  0  6
14  28  c0-0c0s7n0  c0-0c0s7g0  0  0  7
15  29  c0-0c0s7n1  c0-0c0s7g0  0  0  7
30  32  c0-0c0s0n2  c0-0c0s0g1  0  1  0
31  33  c0-0c0s0n3  c0-0c0s0g1  0  1  0
28  36  c0-0c0s1n2  c0-0c0s1g1  0  1  1
29  37  c0-0c0s1n3  c0-0c0s1g1  0  1  1
...

11.9.2 Generating ip2nets and routes Information

Once you have created and gathered your prerequisite files, you can generate the persistent-storage file with the clcvt generate action; this portable file will then be used to create ip2nets and routes directives for the servers, routers, and clients.

The following procedures frequently use the --split-routes=4 flag, which will print information that can be loaded into ip2nets and routes files. This method of adding modprobe.conf directives is particularly valuable for large systems where the directives might otherwise exceed the modprobe buffer limit.


Procedure 91. Creating the persistent-storage file

1. Move all of your prerequisite files to an empty directory on the boot node or SMW (the clcvt command is only available on the boot node or the SMW). Your working directory should look similar to this when you are ready:

crayadm@hera-smw:~/working_dir> ll
total 240
-rw-rw-r-- 1 crayadm crayadm 23707 Feb 8 14:27 hera.hosts
-rw-rw-r-- 1 crayadm crayadm   548 Feb 8 14:27 hera.ib
-rw-rw-r-- 1 crayadm crayadm 36960 Feb 8 14:27 hera.rtrIm
-rw-rw-r-- 1 crayadm crayadm  1077 Feb 8 14:27 info.snx11029
-rw-rw-r-- 1 crayadm crayadm   662 Feb 8 14:27 snx11029n.ib

2. Create the persistent-storage file.

crayadm@hera-smw:~/working_dir> clcvt generate

Note: The clcvt command does not print to stdout with successful completion; however, if there are errors when you run the command, you can add debugging information with the --debug flag.
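For example, to rerun the generate action with debugging enabled (the amount of output, if any, depends on your configuration):

crayadm@hera-smw:~/working_dir> clcvt generate --debug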

Procedure 92. Create ip2nets and routes information for the compute nodes

1. Execute the clcvt command with the compute flag to generate directives for the compute nodes.

crayadm@hera-smw:~/working_dir> clcvt compute --split-routes=4
# Place the following line(s) in the appropriate 'modprobe' file.
#vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
options lnet ip2nets=/path/to/ip2nets-loading/filename
options lnet routes=/path/to/route-loading/filename
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Place the following line(s) in the appropriate ip2nets-loading file.
#vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
gni1 10.128.*.*
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Place the following line(s) in the appropriate route-loading file.
#vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
o2ib6000 1 [68,90]@gni1
o2ib6002 1 [750,751,752,753]@gni1
o2ib6003 1 [618,619,628,629]@gni1
o2ib6004 1 [608,609,638,639]@gni1
o2ib6005 1 [648,649,662,663]@gni1
o2ib6000 2 [608,609,618,619,628,629,638,639,648,649,662,663,750,751,752,753]@gni1
o2ib6002 2 [608,609,638,639]@gni1
o2ib6003 2 [648,649,662,663]@gni1
o2ib6004 2 [750,751,752,753]@gni1
o2ib6005 2 [618,619,628,629]@gni1
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

2. Follow the procedures in Procedure 89 on page 326 to update the compute node boot image modprobe information using the ip2nets and routes information produced by the previous step.


Procedure 93. Create ip2nets and routes information for service node Lustre clients (MOM and internal login nodes)

1. Execute the clcvt command with the login flag to generate directives for the service node Lustre clients.

crayadm@hera-smw:~/working_dir> clcvt login --split-routes=4
# Place the following line(s) in the appropriate 'modprobe' file.
#vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
options lnet ip2nets=/path/to/ip2nets-loading/filename
options lnet routes=/path/to/route-loading/filename
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Place the following line(s) in the appropriate ip2nets-loading file.
#vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
gni1 10.128.*.*
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Place the following line(s) in the appropriate route-loading file.
#vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
o2ib6000 1 [68,90]@gni1
o2ib6002 1 [750,751,752,753]@gni1
o2ib6003 1 [618,619,628,629]@gni1
o2ib6004 1 [608,609,638,639]@gni1
o2ib6005 1 [648,649,662,663]@gni1
o2ib6000 2 [608,609,618,619,628,629,638,639,648,649,662,663,750,751,752,753]@gni1
o2ib6002 2 [608,609,638,639]@gni1
o2ib6003 2 [648,649,662,663]@gni1
o2ib6004 2 [750,751,752,753]@gni1
o2ib6005 2 [618,619,628,629]@gni1
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

2. Follow the procedures in Procedure 86 on page 324 to update the modprobe information for the default view of the shared root using the ip2nets and routes information produced by the previous step.


Procedure 94. Create ip2nets and routes information for the LNET router nodes

1. Execute the clcvt command with the router flag to generate directives for the LNET router nodes.

crayadm@hera-smw:~/working_dir> clcvt router --split-routes=4
# Place the following line(s) in the appropriate 'modprobe' file.
#vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
options lnet ip2nets=/path/to/ip2nets-loading/filename
options lnet routes=/path/to/route-loading/filename
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Place the following line(s) in the appropriate ip2nets-loading file.
#vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
gni1 10.128.*.*
o2ib6000 10.10.100.[101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118]
o2ib6002 10.10.100.[103,104,105,106,107,108,109,110]
o2ib6003 10.10.100.[111,112,113,114,115,116,117,118]
o2ib6004 10.10.100.[103,104,105,106,107,108,109,110]
o2ib6005 10.10.100.[111,112,113,114,115,116,117,118]
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Place the following line(s) in the appropriate route-loading file.
#vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
o2ib6000 1 [68,90]@gni1
o2ib6002 1 [750,751,752,753]@gni1
o2ib6003 1 [618,619,628,629]@gni1
o2ib6004 1 [608,609,638,639]@gni1
o2ib6005 1 [648,649,662,663]@gni1
o2ib6000 2 [608,609,618,619,628,629,638,639,648,649,662,663,750,751,752,753]@gni1
o2ib6002 2 [608,609,638,639]@gni1
o2ib6003 2 [648,649,662,663]@gni1
o2ib6004 2 [750,751,752,753]@gni1
o2ib6005 2 [618,619,628,629]@gni1
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

2. Follow the procedures in Procedure 86 on page 324 to update the modprobe information for the LNET router view of the shared root using the ip2nets and routes information produced by the previous step.


Procedure 95. Create ip2nets and routes information for the Lustre server nodes

1. Execute the clcvt command with the server flag to generate directives for the Lustre server nodes.

crayadm@hera-smw:~/working_dir> clcvt server --split-routes=4
# Place the following line(s) in the appropriate 'modprobe' file.
#vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
options lnet ip2nets=/path/to/ip2nets-loading/filename
options lnet routes=/path/to/route-loading/filename
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Place the following line(s) in the appropriate ip2nets-loading file.
#vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
o2ib6 10.10.100.*
o2ib6000 10.10.100.[3,4]
o2ib6002 10.10.100.[5,7,9]
o2ib6003 10.10.100.[6,8,10]
o2ib6004 10.10.100.[11,13,15]
o2ib6005 10.10.100.[12,14,16]
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Place the following line(s) in the appropriate route-loading file.
#vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
gni1 1 10.10.100.[101,102]@o2ib6000
gni1 1 10.10.100.[103,104,105,106]@o2ib6002
gni1 1 10.10.100.[111,112,113,114]@o2ib6003
gni1 1 10.10.100.[107,108,109,110]@o2ib6004
gni1 1 10.10.100.[115,116,117,118]@o2ib6005
gni1 2 10.10.100.[103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118]@o2ib6000
gni1 2 10.10.100.[107,108,109,110]@o2ib6002
gni1 2 10.10.100.[115,116,117,118]@o2ib6003
gni1 2 10.10.100.[103,104,105,106]@o2ib6004
gni1 2 10.10.100.[111,112,113,114]@o2ib6005
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

2. Update the modprobe information for your Lustre servers using the ip2nets and routes information produced by the previous step. For more information, refer to your Lustre server documentation.

SMW and CLE System Administration Commands [A]

In addition to the SUSE Linux Enterprise Server (SLES) commands available to you, this appendix lists the Cray-developed commands for administering CLE on your Cray system.

The system provides the following types of commands for the system administrator:

• Hardware Supervisory System (HSS) commands invoked from the System Management Workstation (SMW) to control HSS operations; HSS commands are provided with SMW release packages.
• Cray Lightweight Log Management (LLM) System commands invoked from the SMW or on a CLE service node; the LLM commands are provided with both the SMW release packages and the CLE release packages.
• Cray Linux Environment (CLE) commands invoked from a node to control the service and compute partitions; CLE commands are provided with CLE release packages.

A.1 HSS Commands

Table 12 shows the HSS commands and their functions.

Table 12. HSS Commands

Command                    Description
dbMonitor                  Controls the monitor process script that starts during system boot to watch mysqld and restart mysqld if it should crash
getSedcLogValues           Displays specified sedc_manager log file records
hss_make_default_initrd    Creates the default HSS controller boot image
hssbootlink                Links a Linux kernel bzImage file, an initramfs file, and a parameters file so that they can be booted on a controller by using PXE boot on an SMW
hssclone                   Clones the master image directory


hssds_init                 Creates the Hardware Supervisory System (HSS) data store; ensures the proper HSS data store user credentials are created and that the data store is ready for operation
hsspackage                 Facilitates creation of controller boot images
nid2nic                    Prints all nid-to-nic address mappings
rtr                        Routes the Cray network
sedc_manager               Invokes the System Environment Data Collections (SEDC) SMW manager
SMWconfig                  Automatically configures software on SMW
SMWinstall                 Automatically installs and configures software on SMW
SMWinstallCLE              Updates the CMS software on bootroot and sharedroot for system sets with CLE software installed
xtalive                    Gets a response from an HSS daemon
xtbootdump                 Parses a bootinfo-file to determine if xtdumpsys needs to be invoked
xtbootimg                  Creates, extracts, or updates a Cray bootable image file
xtbootsys                  Boots specified components in a Cray system
xtbounce                   Powers components of the Cray system down then up
xtccreboot                 Reboots specified cabinet or blade controllers
xtcheckmac                 Checks for duplicate MAC addresses among L1 and L0 controllers
xtclass                    Displays the network topology class for this system
xtclean_logs               Removes HSS log files based on age

                           Deprecated: The xtclean_logs command is deprecated. Functionality provided by xtclean_logs is provided by the xttrim utility (see Table 13); for additional information, see the xttrim(8) and xttrim.conf(5) man pages.
xtclear                    Clears component flags in the state manager
xtclear_link_alerts        Clears alert flags
xtcli                      Runs the HSS command line
xtcli boot                 Specifies the types of components to boot
xtcli clear                Clears flag status in component state
xtcli part                 Updates partition configurations
xtcli power                Powers a component up or down
xtcli set                  Sets flag status in the component state


xtcon                      Provides a two-way connection to the console of any running service node
xtconsole                  Displays console text from a node
xtconsumer                 Displays HSS events
xtcpreport                 Parses xtnlrd log file and displays system network congestion protection information
xtcptop                    Parses specified xtnlrd log file and displays real-time system network congestion protection information as it is written to the file
xtdaemonconfig             Configures HSS daemons dynamically
xtdiscover                 Discovers and configures the Cray system hardware
xtdumpsys                  Gathers information when a system stops responding or fails
xterrorcode                Displays event error codes
xtfileio                   Reads or writes a file on an L1 or L0 controller
xtgenid                    Generates HSS physical IDs
xthb                       Node heartbeat checker
xthwerrlog                 Reports hardware errors
xthwerrlogd                Logs Gemini network errors
xthwinv                    Retrieves hardware component information for selected modules
xtlogfilter                Filters information from event router log files
xtlogin                    Logs on to cabinet and blade control processors
xtmcinfo                   Gets microcontroller information from cabinet and blade control processors
xtmem2file                 Reads CPU or Cray Gemini memory and saves it in a file
xtmemio                    Reads or writes 32-bit or 64-bit words from CPU or Cray Gemini memory
xtnetwatch                 Watches the Cray system interconnection network for link control block (LCB) and router errors
xtnid2str                  Converts node identification numbers to physical names
xtnlrd                     Responds to fatal link errors by rerouting the system
xtnmi                      Collects debug information from unresponsive nodes
xtpcimon                   Monitors health of PCIe channels
xtpget                     (Deferred implementation) Displays current system power usage and applied capping parameters
xtpmaction                 (Deferred implementation) Implements power management actions


xtresview                  Displays the current state of cabinets, blades, and links, and whether any have failed or been warm swapped out
xtrsh                      Invokes a diagnostic utility that concurrently executes programs on batches of cabinet control processors and/or blade control processors
xtsedcviewer               Command-line interface for SEDC
xtshow                     Shows components with selected characteristics
xtwarmswap                 Allows Cray system blades, chassis, or cabinets to be warm swapped
xtwatchsyslog              Shows all log messages for cabinet control processors (L1 controllers) and blade control processors (L0 controllers)

A.2 Cray Lightweight Log Management (LLM) System Commands

Table 13 shows the LLM commands and their functions.

Table 13. LLM Commands

Command                    Description
cray-syslog                Starts, stops, restarts, or checks the status of the log system
xtlog                      Delivers messages to the Cray Lightweight Log Management (LLM) system
xtsession                  Displays the current boot sessionid
xttail                     Outputs the last part of Cray Lightweight Log Management (LLM) files
xttoday                    Provides today's date in same format that is used to time stamp Cray Lightweight Log Management (LLM) log files
xttrim                     Provides a simple and configurable method to automate the compression and deletion of old log files

A.3 CLE System Administration Commands

Table 14 shows CLE commands and their functions.

Table 14. CLE Commands

Command                    Description
apmgr                      Provides interface for ALPS to cancel pending reservations.
apconf                     A utility for manipulating and modifying ALPS configuration files.
cdump                      Dumps node memory.


clcvt                      A utility for configuring and validating Fine-grained Routing (FGR) on Cray systems.
crash                      Analyzes Linux crash dump data or a live system (Red Hat utility).
csacon                     Condenses records from the sorted pacct file.
csanodeacct                Initiates the end of application accounting on a node.
csanodemerg                Initiates collection of individual compute node accounting files.
csanodesum                 Reads and consolidates application node accounting records.
dumpd                      Initiates automatic dump and reboot of nodes when requested by Node Health Checker (NHC).
lastlogin                  Records last date on which each user logged in.
lbcd                       Invokes the load balancer client daemon.
lbnamed                    Invokes the load balancer service daemon.
lustre_control             Manages direct-attached and external Lustre file systems using standard Lustre commands and site specific Lustre file system definition and tuning files.
nhc_recovery               Releases compute nodes on a crashed login node that will not be rebooted.
pdsh                       Issues commands to groups of hosts in parallel.
projdb                     Creates and updates system project database for CSA.
rca-helper                 Used in various administrative scripts to retrieve information from the Resiliency Communication Agent (RCA).
rsipd                      Invokes the Realm-Specific IP Gateway Server.
sdbwarmswap                Updates the Service Database (SDB) when blades are replaced or added.
xt-lustre-proxy            Invokes the Lustre startup/shutdown, health monitor, and automatic failover utility.
xtalloc2db                 Converts a text file to the alloc_mode table in the Service Database (SDB).
xtattr2db                  Converts a text file to the attributes table in the Service Database (SDB).
xtauditctl                 Distributes auditctl requests to nodes on a Cray system.
xtaumerge                  Merges audit logs from multiple nodes into a single audit log file.
xtcdr2proc                 Gets information from the RCA.
xtcheckhealth              Executes the Node Health Checker.
xtcleanup_after            Called by ALPS to check node health.


xtclone                    Clones the master image directory and overlays a site-specific template.
xtcloneshared              Clones node or class directory in shared root hierarchy.
xtdb2alloc                 Converts the alloc_mode table in the Service Database (SDB) to a text file.
xtdb2attr                  Converts the attributes table in the Service Database (SDB) to a text file.
xtdb2etchosts              Converts service information in the SDB to a text file.
xtdb2filesys               Converts the filesystem table of the SDB to a text file.
xtdb2gpus                  Converts the gpus table in the Service Database (SDB) to a text file.
xtdb2lustrefailover        Converts the lustre_failover table in the SDB to a text file.
xtdb2lustreserv            Converts the lustre_serv table of the SDB to a text file.
xtdb2nodeclasses           Converts the service_processor table of the SDB to a text file.
xtdb2order                 Converts the processor table od_allocator_id field in the Service Database (SDB) to a text file.
xtdb2proc                  Converts the processor table of the SDB to a text file.
xtdb2segment               Converts segment table in the Service Database (SDB) to a text file.
xtdb2servcmd               Converts the service_cmd table of the SDB to a text file.
xtdb2servconfig            Converts the service_config table of the SDB to a text file.
xtdbsyncd                  Invokes the HSS/SDB synchronization daemon.
xtfilesys2db               Converts a text file to the SDB filesystem table.
xtfsck                     Checks file systems on a system set defined in /etc/sysset.conf.
xtgetconfig                Gets configuration information from /etc/sysconfig/xt file.
xtgetdslroot               Returns compute node root path used within an environment.
xtgpus2db                  Converts a text file to the SDB gpus table.
xthotbackup                Creates a backup copy of a system set on the boot RAID.
xthowspec                  Displays file specialization in the shared root directory.
xtlusfoevntsndr            Sends failover events to clients for Lustre imperative recovery.
xtlusfoadmin               Displays Lustre automatic failover database tables and enables/disables Lustre server failover.
xtlustrefailover2db        Converts a text file to the SDB lustre_failover table.
xtlustreserv2db            Converts a text file to the SDB lustre_service table.


xtmount                    Allows administrators to mount storage devices on the SMW based on their LABEL and FUNCTION roles in the sysset.conf file instead of long /dev/disk/by-id names.
xtnce                      Displays or changes the class of a node.
xtnodeclasses2db           Converts a text file to the service_processor table in the SDB.
xtnodestat                 Provides current job and node status summary information on a CNL compute node.
xtoparchive                Performs archive operations on shared root files from a given specification list.
xtopco                     Checks out RCS versioned shared root specialized files.
xtopcommit                 Commits changes made inside an xtopview session.
xtoprdump                  Lists shared root file specification and version information.
xtoprlog                   Provides RCS log information about shared root specialized files.
xtopview                   Views file system as it would appear on any node, class of nodes, or all service nodes.
xtorder2db                 Converts a text file to values in the od_allocator_id field of the processor table in the Service Database (SDB).
xtpackage                  Facilitates creation of boot images.
xtpkgvar                   Creates a skeleton structure of /var.
xtproc2db                  Converts a text file to the processor table of the SDB.
xtprocadmin                Gets/sets the processor flag in the SDB.
xtrelswitch                Performs release switching by manipulating symbolic links in the file system and by setting the default version of modulefiles that are loaded at login.
xtrsipcfg                  Generates and optionally installs the necessary RSIP client and server configuration files.
xtsegment2db               Converts a text file to segment table in the Service Database (SDB).
xtservcmd2db               Converts a text file to the service_cmd table of the SDB.
xtservconfig               Adds, removes, or modifies the service_config table of the SDB.
xtservconfig2db            Converts a text file to the service_config table of the SDB.
xtshutdown                 Shuts down the service nodes in an orderly fashion.
xtspec                     Specializes files for nodes or classes.
xtunspec                   Unspecializes files for nodes or classes.


xtverifyconfig             Verifies the coherency of /etc/init.d files across all shared root views.
xtverifydefaults           Verifies and fixes inconsistent system default links within the shared root.
xtverifyshroot             Checks the configuration of the shared-root file system.

System States [B]

Table 15 defines state definitions for system components. States are designated by uppercase letters. Table 16 shows states that are common to all components.

Note: The state of off means that a component is present on the system. If the component is a blade controller, node, or ASIC, then this will also mean that the component is powered off. If you disable a component, the state shown becomes disabled. When you use the xtcli enable command to enable that component for use once again, its state switches from disabled to off. In the same manner, enabling an empty component means that its state switches from empty to off. The state of EMPTY components does not change when using the xtcli enable or the xtcli disable command, unless the force option (-f) is used.

Disabling of a cabinet, chassis, or blade will fail if any nodes under the component are in the ready state, unless the force option (-f) is used. An error message will indicate the reason for the failure.
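For example, to take a blade out of service and later return it to the off state (a sketch; the cname c0-0c1s3 is hypothetical, and the force option -f described above can be added when child nodes are in the ready state; see the xtcli(8) man page for exact option placement):

crayadm@smw:~> xtcli disable c0-0c1s3
crayadm@smw:~> xtcli enable c0-0c1s3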

Table 15. State Definitions

State     Cabinet Controller   Blade Controller   Cray ASIC                    CPU                     Link
OFF       Powered off          Powered off        Powered off                  Powered off             Link is down
ON        Powered on           Powered on         Powered on and operational   Powered on              Link is up
HALT      –                    –                  –                            CPU halted              –
STANDBY   –                    –                  –                            Booting was initiated   –
READY     Operational          Operational        Operational                  Booted                  Operational


Table 16. Additional State Definitions

State      Description
DISABLED   Operator disabled this component.
EMPTY      Component does not exist.
N/A        Component cannot be accessed by the system.
RESVD      Reserved; new jobs are not allocated to this component.

There are two notification flags, which can occur with any state.

• WARNING: A condition of the component was detected that is outside the normal operating range but is not yet dangerous.
• ALERT: A dangerous condition or fatal error has been detected for the component.

Table 17 shows the states by component for which the xtcli commands run.

Note: Administrative states are hierarchical, so disabling or enabling a component has a cascading effect on that component's children. A component may not be enabled if its parent component is disabled, but a subcomponent may be disabled without affecting its parents.

Table 17. xtcli Commands and Allowed States

xtcli Command   Subcommand                       Cabinet Controller   Blade Controller   Node
power           up                               ON                   OFF                OFF
                down                             READY                ON                 ON, HALT, DIAG
                up_slot (an alias for up)
                down_slot (an alias for down)
                force_down (an alias for down)
halt                                             N/A                  N/A                STANDBY, READY
boot                                             N/A                  N/A                ON, HALT

Remote Access to the SMW [C]

Virtual Network Computing (VNC) software enables you to view and interact with the SMW from another computer. The Cray system provides a VNC server, Xvnc; you must download a VNC client to connect to it. See RealVNC (http://www.realvnc.com/) or TightVNC (http://www.tightvnc.com/) for more information.

Note: The VNC software requires a TCP/IP connection between the server and the viewer. Some firewalls and site security do not allow this connection.

Cray supplies a VNC account cray-vnc.

Procedure 96. Starting the VNC server

1. Log on to the SMW as root user.

2. Use the chkconfig command to check the current status of the server:

smw:~ # chkconfig vnc
vnc  off

3. Disable xinetd startup of Xvnc.

If the chkconfig command you executed in step 2 reports that Xvnc was started by INET services (xinetd):

smw:~ # chkconfig vnc
vnc  xinetd

Execute the following commands to disable xinetd startup of Xvnc (xinetd startup of Xvnc is the SLES 11 default, but it usually is disabled by chkconfig):

smw:~ # chkconfig vnc off
smw:~ # /etc/init.d/xinetd reload
Reload INET services (xinetd). done

If no other xinetd services have been enabled, the reload command will return failed instead of done. If the reload command returns failed, this is normal and you can ignore the failed notification.

4. Use the chkconfig command to start Xvnc at boot time:

smw:~ # chkconfig vnc on

5. Start the Xvnc server immediately:

smw:~ # /etc/init.d/vnc start


If the password for cray-vnc has not already been established, the system prompts you for one. You must enter a password to access the server.

Password: ********
Verify:
Would you like to enter a view-only password (y/n)? n
xauth: creating new authority file /home/cray-vnc/.Xauthority

New 'X' desktop is smw-xt:1

Creating default startup script /home/cray-vnc/.vnc/xstartup
Starting applications specified in /home/cray-vnc/.vnc/xstartup
Log file is /home/cray-vnc/.vnc/smw-xt:1.log

smw:~ # ps -eda | grep vnc
1839 pts/0    00:00:00 Xvnc

Note: The startup script starts the Xvnc server for display :1.

To access the Xvnc server, use a VNC client, such as vncviewer, tight_VNC, vnc4, or a web browser. Direct it to the SMW that is running Xvnc. Many clients allow you to specify whether you want to connect in view-only or in an active mode. If you choose active participation, every mouse movement and keystroke made in your client is sent to the server. If more than one client is active at the same time, your typing and mouse movements are intermixed.

Note: Commands entered through the VNC client affect the system as if they were entered from the SMW. However, the main SMW window and the VNC clients cannot detect each other. It is a good idea for the administrator who is sitting at the SMW to access the system through a VNC client.

Procedure 97. For workstation or laptop running Linux: Connecting to the VNC server through an ssh tunnel, using the vncviewer -via option

Important: This procedure is for use with the TightVNC client program.

Verify that you have the vncviewer -via option available. If you do not, use Procedure 98 on page 351.

• If you are connecting from a workstation or laptop running Linux, enter the vncviewer command shown below.

The first password you enter is for crayadm on the SMW. The second password you enter is for the VNC server on the SMW, which was set when the VNC server was started for the first time using /etc/init.d/vnc start on the SMW.

/home/mary> vncviewer -via crayadm@smw localhost:1
Password: ********
VNC server supports protocol version 3.130 (viewer 3.3)
Password: ********
VNC authentication succeeded
Desktop name "cray-vnc's X desktop (smw:1)"
Connected to VNC server, using protocol version 3.3


Procedure 98. For workstation or laptop running Linux: Connecting to the VNC server through an ssh tunnel

Note: This procedure assumes that the VNC server on the SMW is running with the default port of 5901.

1. This ssh command starts an ssh session between the local Linux computer and the SMW, and it also creates an SSH tunnel so that port 5902 on the localhost is forwarded through the encrypted SSH tunnel to port 5901 on the SMW. You will be prompted for the crayadm password on the SMW.

local_linux_prompt> ssh -L 5902:localhost:5901 smw -l crayadm
Password:
crayadm@smw>

2. Now vncviewer can be started using the local side of the SSH tunnel, which is port 5902. You will be prompted for the password of the VNC server on the SMW. This password was set when the VNC server was started for the first time using /etc/init.d/vnc start on the SMW.

local_linux_prompt> vncviewer localhost:2
Connected to RFB server, using protocol version 3.7
Performing standard VNC authentication
Password:

The VNC window from the SMW appears. All traffic between the vncviewer on the local Linux computer and the VNC server on the SMW is now encrypted through the SSH tunnel.

Procedure 99. For workstation or laptop running Mac OS X: Connecting to the VNC server through an ssh tunnel

Note: This procedure assumes that the VNC server on the SMW is running with the default port of 5901.

1. This ssh command starts an ssh session between the local Mac OS X computer and the SMW, and it also creates an SSH tunnel so that port 5902 on the localhost is forwarded through the encrypted SSH tunnel to port 5901 on the SMW. You will be prompted for the crayadm password on the SMW.

local_mac_prompt> ssh -L 5902:localhost:5901 smw -l crayadm
Password:
crayadm@smw>

2. Now vncviewer can be started using the local side of the SSH tunnel, which is port 5902. You will be prompted for the password of the VNC server on the SMW. This password was set when the VNC server was started for the first time using /etc/init.d/vnc start on the SMW.

If you type this on the Mac OS X command line after having prepared the SSH tunnel, the vncviewer will pop up:

local_mac_prompt% open vnc://localhost:5902


The VNC window from the SMW appears. All traffic between the vncviewer on the local Mac OS X computer and the VNC server on the SMW is now encrypted through the SSH tunnel.

Procedure 100. For workstation or laptop running Windows: Connecting to the VNC server through an ssh tunnel

Note: If you are connecting from a computer running Windows, both a VNC client program, such as TightVNC, and an SSH program, such as PuTTY, SecureCRT, or OpenSSH, are recommended.

1. The same method described in Procedure 98 can be used for computers running the Windows operating system.

Although TightVNC encrypts VNC passwords sent over the network, the rest of the traffic is sent unencrypted. To avoid a security risk, install and configure an SSH program that creates an SSH tunnel between TightVNC on the local computer (localhost port 5902) and the remote VNC server (localhost port 5901).

Note: Details about how to create the SSH tunnel vary amongst the different SSH programs for Windows computers.

2. After installing TightVNC, start the VNC viewer program by double-clicking on the TightVNC icon. Enter the hostname and VNC screen number, localhost:number (such as, localhost:2 or localhost:5902), and then click on the Connect button.
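For example, with PuTTY's command-line tool plink (a sketch, assuming plink is installed and on the PATH; the SMW host name is an example), the SSH tunnel described above can be created before starting TightVNC:

C:\> plink -L 5902:localhost:5901 crayadm@smw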

Updating the Time Zone [D]

When you install the Cray Linux Environment (CLE) operating system, the Cray system time is set at US/Central Standard Time (CST), which is six hours behind Greenwich Mean Time (GMT). You can change this time.

Note: When a Cray system is initially installed, the time zone set on the SMW is copied to the boot root, shared root and CNL boot images.

To change the time zone on the SMW, L0 controller, L1 controller, boot root, shared root, or for the compute node image, follow the appropriate procedure below.

Procedure 101. Changing the time zone for the SMW and the blade and cabinet controllers

Warning: Perform this procedure while the Cray system is shut down; do not flash blade and cabinet controllers while the Cray system is booted.

You must be logged on as root. In this example, the time zone is changed from "America/Chicago" to "America/New_York".

1. Ensure the blade and cabinet controllers are responding. For example:

smw:~ # xtalive -a l0sysd s0

2. Check the current time zone setting for the SMW and controllers.

smw:~ # date
Wed Aug 01 21:30:06 CDT 2012

smw:~ # xtrsh -l root -s /bin/date s0
c0-0c0s2 : Wed Aug 01 21:30:51 CDT 2012
c0-0c0s5 : Wed Aug 01 21:30:51 CDT 2012
c0-0c0s7 : Wed Aug 01 21:30:51 CDT 2012
c0-0c1s1 : Wed Aug 01 21:30:51 CDT 2012
.
.
.
c0-0 : Wed Aug 01 21:30:52 CDT 2012

3. Verify that the zone.tab file in the /usr/share/zoneinfo directory contains the time zone you want to set.

smw:~ # grep America/New_York /usr/share/zoneinfo/zone.tab
US +404251-0740023 America/New_York Eastern Time


4. Create the time conversion information files.

smw:~ # date
Wed Aug 01 21:32:52 CDT 2012
smw:~ # /usr/sbin/zic -l America/New_York
smw:~ # date
Wed Aug 01 22:33:05 EDT 2012

5. Modify the clock file in the /etc/sysconfig directory to set the DEFAULT_TIMEZONE and the TIMEZONE variables to the new time zone.

smw:/etc/sysconfig # grep TIMEZONE /etc/sysconfig/clock
TIMEZONE="America/Chicago"
DEFAULT_TIMEZONE="US/Eastern"
smw:~ # vi /etc/sysconfig/clock
make changes
smw:~ # grep TIMEZONE /etc/sysconfig/clock
TIMEZONE="America/New_York"
DEFAULT_TIMEZONE="US/Eastern"

6. Copy the /etc/localtime file to /opt/tftpboot, and then restart the log system and rsms.

smw:~ # cp /etc/localtime /opt/tftpboot
smw:~ # /etc/init.d/cray-syslog restart
smw:~ # /etc/init.d/rsms restart

For Cray XC30 systems, continue with step 7.

7. For Cray XC30 systems only: Reboot the cabinet controllers to get the updated time zone.

a. Power down the system.

smw:~ # xtcli power down s0

b. Reboot the cabinet controllers, then ensure that all cabinet controllers are up.

smw:~ # xtccreboot -c all
xtccreboot: reboot sent to specified CCs
smw:~ # xtalive -l cc

c. Power up the system.

smw:~ # xtcli power up s0

d. Exit from the root login.

smw:~ # exit

8. Bounce the system.

crayadm@smw:~> xtbounce s0

Procedure 102. Changing the time zone on the boot root and shared root

Perform the following steps to change the time zone. You must be logged on as root. In this example, the time zone is changed from "America/Chicago" to "America/New_York".


1. Confirm the time zone setting on the SMW.

smw:~ # cd /etc/sysconfig
smw:/etc/sysconfig # grep TIMEZONE clock
TIMEZONE="America/New_York"
DEFAULT_TIMEZONE="US/Eastern"

2. Log on to the boot node.

smw:/etc/sysconfig # ssh root@boot
boot:~ #

3. Verify that the zone.tab file in the /usr/share/zoneinfo directory contains the time zone you want to set.

boot:~ # cd /usr/share/zoneinfo
boot:/usr/share/zoneinfo # grep America/New_York zone.tab
US +404251-0740023 America/New_York Eastern Time

4. Create the time conversion information files.

boot:/usr/share/zoneinfo # date
Mon Jul 30 22:50:52 CDT 2012
boot:/usr/share/zoneinfo # /usr/sbin/zic -l America/New_York
boot:/usr/share/zoneinfo # date
Mon Jul 30 23:59:38 EDT 2012

5. Modify the clock file in the /etc/sysconfig directory to set the DEFAULT_TIMEZONE and the TIMEZONE variables to the new time zone.

boot:/usr/share/zoneinfo # cd /etc/sysconfig
boot:~ # grep TIMEZONE clock
TIMEZONE="America/Chicago"
DEFAULT_TIMEZONE="US/Eastern"
boot:~ # vi clock
make changes
boot:~ # grep TIMEZONE clock
TIMEZONE="America/New_York"
DEFAULT_TIMEZONE="US/Eastern"

6. Switch to the default view by using xtopview.

Note: If the SDB node has not been started, you must include the -x /etc/opt/cray/sdb/node_classes option when you invoke the xtopview command.

boot:~ # xtopview
default/:/ #

7. Verify that the zone.tab file in the /usr/share/zoneinfo directory contains the time zone you want to set.

default/:/ # grep America/New_York /usr/share/zoneinfo/zone.tab
US +404251-0740023 America/New_York Eastern Time


8. Create the time conversion information files.

default/:/ # date
Mon Jul 30 23:10:52 CDT 2012
default/:/ # /usr/sbin/zic -l America/New_York
default/:/ # date
Tue Jul 31 00:11:38 EDT 2012

9. Modify the clock file in the /etc/sysconfig directory to set the DEFAULT_TIMEZONE and the TIMEZONE variables to the new time zone.

default/:/ # cd /etc/sysconfig
default/:/etc/sysconfig # grep TIMEZONE clock
TIMEZONE="America/Chicago"
DEFAULT_TIMEZONE="US/Eastern"
default/:/etc/sysconfig # vi clock
make changes
default/:/etc/sysconfig # grep TIMEZONE clock
TIMEZONE="America/New_York"
DEFAULT_TIMEZONE="US/Eastern"

10. Exit xtopview.

default/:/etc/sysconfig # exit
boot:/usr/share/zoneinfo #

Procedure 103. Changing the time zone for compute nodes

1. Exit from the boot node and confirm the time zone setting on the SMW.

boot:/usr/share/zoneinfo # exit
smw:/etc/sysconfig # grep TIMEZONE clock
TIMEZONE="America/New_York"
DEFAULT_TIMEZONE="US/Eastern"

2. Copy the new /etc/localtime file from the SMW to the bootimage template directory.

smw:/etc/sysconfig # cp -p /etc/localtime \
/opt/xt-images/templates/default/etc/localtime

3. Copy the new /usr/share/zoneinfo file from the SMW to the bootimage template directory. The directory to contain the time zone file must be created in the bootimage template area.

smw:/etc/sysconfig # mkdir -p \
/opt/xt-images/templates/default/usr/share/zoneinfo/America
smw:~# cp -p /usr/share/zoneinfo/America/New_York \
/opt/xt-images/templates/default/usr/share/zoneinfo/America/New_York

Note: This procedure enables a single time zone for the compute nodes. If users will be setting the TIMEZONE variable to time zones which are not the system default, you may wish to either copy a few of the common time zones used by the user community or the entire /usr/share/zoneinfo directory to the /opt/xt-images/templates/default/ area.
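For example, to copy the entire zoneinfo directory into the template area (a sketch of the second alternative mentioned in the note above):

smw:/etc/sysconfig # mkdir -p /opt/xt-images/templates/default/usr/share
smw:/etc/sysconfig # cp -pr /usr/share/zoneinfo /opt/xt-images/templates/default/usr/share/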


4. Update the boot image to include these changes; follow the steps in Procedure 2 on page 62.

The time zone is not changed until you boot the compute nodes with the updated boot image.


Creating Modulefiles [E]

This appendix provides a template and an example of a modulefile that you can use as you construct modulefiles for your site.

E.1 Modulefile Template

The following listing provides a template of the elements required in a modulefile. Use this as your guide when creating your own modulefiles.

#%Module######################################################################
#
# Generic modulefile template
#

###
# Add your verbiage into ModulesHelp area. This information
# will be seen by those invoking
# module help [my_product]
###
proc ModulesHelp { } {
    puts stderr "This modulefile defines the library paths and"
    puts stderr "include paths needed to use "
    puts stderr "[my_product]."
    puts stderr ""
}

###
# [my_product] is the name consistently used in the modulefile
# to set environment variables. It may be the same name as
# the modulefile and the rpm, however the modulefile and rpm
# will be named in a lower case name while [my_product] should
# be upper case, i.e. "module load acml" and ACML_DIR.
###
set is_module_rm [module-info mode remove]

###
# If [my_product] will not be versioned, then set
# [my_product]_CURPATH to the location of [my_product].
# If you use versions, then you only need to change one
# number as you create a module for another product version.
###
set [my_product]_LEVEL [product-version]
set [my_product]_CURPATH /opt/[installed-product-name]/$[my_product]_LEVEL


setenv [my_product]_DIR $[my_product]_CURPATH

###
# Add your executable to PATH.
###
#prepend-path PATH $[my_product]_CURPATH/bin

###
# Add your dynamic library path. This is *NOT* for statically built
# libraries. For those use [my_product]_POST_LINK_OPTS below.
###
#prepend-path LD_LIBRARY_PATH $[my_product]_CURPATH/lib

###
# Add MANPATH and INFOPATH
###
if { [file isdirectory $[my_product]_CURPATH/info] == 1} {
    prepend-path INFOPATH $[my_product]_CURPATH/info
}
if { [file isdirectory $[my_product]_CURPATH/man] == 1} {
    prepend-path MANPATH $[my_product]_CURPATH/man
}

###
# To make your product work in commandline generation, you must
# add [my_product] to PE_PRODUCT_LIST. The options for the products
# listed in PE_PRODUCT_LIST will display on the command line in
# reverse order. For example, if PE_PRODUCT_LIST is X:Y, the
# compile/link command line will contain:
#   "... ..."
###
append-path PE_PRODUCT_LIST [my_product]

###
# The following 5 *_OPTS environment variables allow placement of compiler
# commandline options. The PRE and POST in the names refer to
# the location before or after the user-specified arguments.
# Remember that, in general, the linker evaluates its commandline from left
# to right, but the compiler generally uses the last argument in the list.
# The commandline is created for you in this order:
#   cc [PRE_COMPILE_OPTS] [PRE_LINK_OPTS] user_args [POST_COMPILE_OPTS] \
#      [POST_LINK_OPTS] [INCLUDE_OPTS]
###

###
# Compiler options. The first character in this list must be a
# space and the list must be double quoted.
#
# You can define a Fortran modules path for pgi-compiled files by adding it
# as a "-I" option to [my_product]_PRE_COMPILE_OPTS.
###

#setenv [my_product]_PRE_COMPILE_OPTS
#setenv [my_product]_POST_LINK_OPTS

###
# Options passed to the linker, including
# -L paths and -l library names. The -L and -l are used for statically built
# libraries. The first character in the list must be a space and the list
# must be double quoted. The -L and -l arguments should be added to
# [my_product]_POST_LINK_OPTS.
###

#setenv [my_product]_PRE_LINK_OPTS
#setenv [my_product]_POST_COMPILE_OPTS

#
# Include search path
#

#setenv [my_product]_INCLUDE_OPTS

Example 117. Modulefile example

This example shows a product, kate, with library files libkate.a and libkit.a, which were built with 64-bit PGI. Naming directories pgi64 helps keep track of library formats. You can create whatever directory structure works for you. Likewise, naming the modulefile kate-pgi tells a potential user that this would be loaded when compiling using PGI.

#%Module######################################################################
#
# kate-pgi modulefile
#
proc ModulesHelp { } {
    puts stderr "This modulefile defines the library paths and"
    puts stderr "include paths needed to use the pgi-compiled kate."
    puts stderr "Libraries -libkate.a, libkit.a, libkate.so and compiler"
    puts stderr "option, -Mprof=mpi, are added. The utility run-kate"
    puts stderr "is added to PATH."
}

###
# The modulefile kate-gnu could load gnu-built kate libraries,
# which are defined at $KATE_CURPATH/gnu64/lib
###
set is_module_rm [module-info mode remove]
set KATE_LEVEL 2.0
set KATE_CURPATH /opt/kate/$KATE_LEVEL

prepend-path PATH $KATE_CURPATH/bin
prepend-path LD_LIBRARY_PATH $KATE_CURPATH/pgi64/lib
prepend-path MANPATH $KATE_CURPATH/man

append-path PE_PRODUCT_LIST KATE

###
# Definitions for these must begin with a space.
# Remember that in general the linker evaluates its command-line
# options left to right, while the compiler takes the last one
# it detects. You can define a Fortran modules path for the pgi compiler
# by adding it as a "-I" option to *_POST_COMPILE_OPTS.
###
setenv KATE_PRE_COMPILE_OPTS " -Mprof=mpi"
setenv KATE_POST_LINK_OPTS " -L $KATE_CURPATH/lib -lkate -lkit"
setenv KATE_POST_COMPILE_OPTS " -I $KATE_CURPATH/fortran_modules_dir"
setenv KATE_INCLUDE_OPTS " -I $KATE_CURPATH/include"
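Once the kate-pgi modulefile is installed, a user session might look like the following sketch (the source and output file names are illustrative). Because KATE is appended to PE_PRODUCT_LIST, the compiler drivers pick up the KATE_*_OPTS values during command-line generation:

login:~> module load kate-pgi
login:~> cc -o app app.c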

E.2 Sharing Your Modulefile

Add your modulefile to /opt/modulefiles or to another directory. If you use another directory, you must add the path to your environment by using a module use command; for example, module use /my/module/path. To make the new modulefile path available to all users, edit the file /opt/modules/init/.modulespath.
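For example, a sketch of using a site directory (the directory path and modulefile name are illustrative):

login:~> module use /my/module/path
login:~> module avail kate-pgi
login:~> module load kate-pgi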


E.3 Modulefile Help

Using the module command, you can get online help about any module in your system:

# module help modulefile
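For example, using the modulefile from Example 117, module help kate-pgi prints the text defined in its ModulesHelp procedure (the exact output formatting varies with the modules version):

# module help kate-pgi
This modulefile defines the library paths and
include paths needed to use the pgi-compiled kate.
...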


PBS Professional Licensing for Cray Systems [F]

F.1 Introduction

PBS Professional uses a licensing scheme based on a central license server that allows licenses to float between servers. This reduces the complexity of managing environments with multiple, independent PBS installations and simplifies configuration when you run other software packages that use the same license manager. The PBS server and scheduler run on the Cray service database (SDB) node. By default, the SDB node is only connected to the Cray system high-speed network (HSN) and cannot access an external license server. Various options to set up network connectivity between the license server and the SDB node are detailed below. Determine which option is best suited to your needs and implement that solution prior to installing the PBS Professional software from Altair.

Note: Regardless of the option chosen, you must run a PBS Professional MOM daemon on each login node where users may launch jobs.
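A quick way to confirm that a MOM daemon is running on a given login node is to check the process table; a minimal sketch using standard Linux tools (the login node name is illustrative):

login1:~ # ps -ef | grep pbs_mom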

PBS Professional configuration options on a Cray system include:

• Running the PBS Professional server and scheduler on a Cray system service node. If you choose to run the PBS Professional scheduler and server on a login node, you should be aware that these daemons consume processor and memory resources and have the potential to impact other processes running on the login node. In addition, any service running on a node where users are allowed to run processes increases the potential for interruption of that service. While these risks are generally low, it is important that you consider them before selecting this option. Refer to Migrating the PBS Professional Server and Scheduler on page 366 to configure PBS Professional using this strategy.

• Moving the PBS Professional server and scheduler external to the Cray system. The PBS Professional scheduler requests MPP data from one of the MOM daemons running on the Cray system login nodes. The volume of this data depends on the size and utilization of the Cray system. If you run the PBS Professional scheduler outside of the Cray system, the scheduler cycle time could increase due to decreased bandwidth and increased latency in network communication. In most cases, the difference in cycle time is negligible. However, if your system has larger node counts (> 8192), you may want to avoid this option. To configure PBS Professional for this strategy, refer to Migrating the PBS Professional Server and Scheduler on page 366.

• Configuring the SDB node as an RSIP client. This option allows you to leave the PBS Professional scheduler and server on the SDB node. If you are already running RSIP, this may be an attractive option. Cray recommends a dedicated network node for the RSIP server, which may not be desirable if you are not already running RSIP. Follow the appropriate procedure in Configuring RSIP to the SDB Node on page 368 to configure the SDB node as an RSIP client.

• Configuring Network Address Translation (NAT) to forward IP packets to and from the SDB node. This may be the best choice if you intend to use packet forwarding exclusively for PBS Professional licensing and do not mind running NAT services on a login node. The steps to configure NAT IP forwarding to the SDB node are described in Network Address Translation (NAT) IP Forwarding on page 371.

• Installing a network card in the SDB node to connect it to the external network. With this option you do not need to configure RSIP or NAT, but you must purchase a PCIe network interface card (NIC) for a modest cost. This is an attractive option if you want to access the SDB node directly from your external network. This procedure does not require connection through another node on the Cray system. The steps to configure this option are covered in Installing and Configuring a NIC on page 373.

Cray recommends that system administrators consult their local networking and security staff prior to selecting one of these options. Once you have chosen and configured a method for accessing the license server, complete the PBS Professional license manager configuration as described in the Altair License Management System Installation Guide. For additional information about using the qmgr command to set up the pbs_license_file_location resource, see the PBS Professional Installation and Upgrade Guide from Altair Engineering, Inc. For more information, see: http://www.pbsworks.com.
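Once connectivity is in place, the license location is set through qmgr on the PBS Professional server host. The following is only a sketch; the port (6200) and license host (tic.domain.com) are placeholders, and the authoritative attribute syntax is given in the Altair documentation:

myserver:~ # qmgr
Qmgr: set server pbs_license_file_location = "6200@tic.domain.com"
Qmgr: exit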

F.2 Migrating the PBS Professional Server and Scheduler

Before migrating the PBS Professional server and scheduler off of the SDB node, you must first select the target host. PBS Professional versions 9.2 and beyond are MPP aware, meaning they are capable of scheduling jobs to Cray systems. If you already have a central PBS Professional server and scheduler, simply add the Cray system to the list of nodes.


The first step is to install PBS Professional on the Cray system as described in the PBS Professional Installation and Upgrade Guide. The install procedure configures the SDB node as the PBS Professional server and scheduler host. You must modify the default configuration to ensure that the PBS Professional scheduler and server do not start automatically on the SDB node.
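The relevant switches are the PBS_START_* variables used later in this procedure. A sketch of an /etc/pbs.conf that keeps only the MOM running on a node (server and scheduler disabled) follows; myserver.domain.com is the placeholder server name used throughout this appendix:

PBS_SERVER=myserver.domain.com
PBS_START_SERVER=0
PBS_START_SCHED=0
PBS_START_MOM=1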

Procedure 104. Migrating PBS off the SDB node

1. If the PBS scheduler and server are running on the SDB node, log on to the SDB and stop the services.

sdb:~ # /etc/init.d/pbs stop

2. Log on to the Cray system boot node as root and unspecialize the PBS Professional configuration file for the SDB node. For example, if your SDB is node 3, type the following commands:

boot:~ # xtopview -m "Unspecialize pbs.conf on the SDB"-n3 node/3:/ # xtunspec /etc/pbs.conf node/3:/ # exit boot:~ #

3. Edit the PBS Professional configuration file for the login nodes to point to the new server. The new server may be one of the login nodes or a host external to the Cray system. Set PBS_SERVER in /etc/pbs.conf to the new PBS Professional server host. For example, if your server is named myserver, type the following commands:

boot:~ # xtopview -m "Update pbs.conf for new server" -c login class/login/: # vi /etc/pbs.conf PBS_SERVER=myserver.domain.com class/login/: exit boot:~#

4. To migrate the server and scheduler to a login node and start PBS Professional automatically at boot time, specialize the /etc/pbs.conf file for that node. If the services are being moved to an external host, skip this step. For example, if the node ID of the login node is 4, type the following commands:

boot:~ # xtopview -m "Specialize pbs.conf for new server"-n4 node/4:/ # xtspec /etc/pbs.conf

5. Modify the /etc/pbs.conf file to start all of the PBS Professional services; for example:

node/4:/ # vi /etc/pbs.conf
PBS_START_SERVER=1
PBS_START_SCHED=1
PBS_START_MOM=1

node/4:/ # exit
boot:~ #

6. Log on to each of the login nodes as root and modify the PBS Professional MOM configuration file /var/spool/PBS/mom_priv/config. Change the $clienthost value to the name of the new PBS Professional server. For example, if your server is named myserver, type the following commands:

login2:~ # vi /var/spool/PBS/mom_priv/config
$clienthost myserver.domain.com

7. After the configuration file has been updated, restart PBS Professional on each login node.

login2:~ # /etc/init.d/pbs restart

Note: This command starts the PBS Professional scheduler and server if you have migrated them to a login node.

8. Log on to the new PBS Professional server host and add a host entry for each of the login nodes.

myserver:~ # qmgr
Qmgr: create node mycrayxt1
Qmgr: set node mycrayxt1 resources_available.mpphost=xthostname
Qmgr: create node mycrayxt2
Qmgr: set node mycrayxt2 resources_available.mpphost=xthostname
Qmgr: exit
myserver:~ #

At this point, the login nodes should be visible to the PBS Professional server.

F.3 Configuring RSIP to the SDB Node

Follow the instructions in this section to configure the SDB node as an RSIP client. Once the SDB node is configured as an RSIP client, refer to the Altair License Management System Installation Guide for detailed instructions about obtaining and installing the appropriate license manager components.

If you have not configured RSIP on your system, follow Procedure 105 on page 369 to generate a simple RSIP configuration with a single server and only the SDB node as a client. Using the CLEinstall Program to Install and Configure RSIP on page 210 includes procedures to configure RSIP on a Cray system using the CLEinstall installation program. If you have already configured RSIP using these procedures during your Cray Linux Environment (CLE) installation or upgrade, follow Procedure 106 on page 370 to add the SDB node as an RSIP client for one of your existing RSIP servers.

For additional information on configuring RSIP services, see Configuring Realm-specific IP Addressing (RSIP) on page 209.


Procedure 105. Creating a simple RSIP configuration with the SDB node as a client

Important: Cray strongly recommends Procedure 45 on page 212 for RSIP configuration.

1. Boot the system as normal. Ensure all the service nodes are available, and ensure that the system is set up to allow password-less ssh access for the root user.

2. Select a service node to run the RSIP server. The RSIP server node must have external Ethernet connectivity and must not be a login node. In this example the physical ID for the RSIP server is c0-0c0s6n1.

3. Specialize the rsipd.conf file by node ID, install it to the shared root, and tune the RSIP server by updating the associated sysctl.conf file. Invoke the following steps for the RSIP server node.

a. Log on to the boot node and invoke xtopview in the node view.

boot:~ # xtopview -n c0-0c0s6n1 node/c0-0c0s6n1:/ #

b. Specialize /etc/opt/cray/rsipd/rsipd.conf for the specified node.

node/c0-0c0s6n1:/ # xtspec /etc/opt/cray/rsipd/rsipd.conf

c. Copy the rsipd.conf template file from the SMW to the shared root.

node/c0-0c0s6n1:/ # scp crayadm@smw:/opt/cray-xt-rsipd/default/etc/rsipd.conf.example \
/etc/opt/cray/rsipd/rsipd.conf

d. Modify the port_range, ext_if and max_clients parameters in the rsipd.conf file as follows:

node/c0-0c0s6n1:/ # vi /etc/opt/cray/rsipd/rsipd.conf
port_range 8192-60000
max_clients 2

Uncomment: ext_if eth0

Note: If your external Ethernet interface is not eth0, modify ext_if accordingly. For example,

ext_if eth1

e. Specialize the /etc/sysctl.conf file and modify the OS port range so that it does not conflict with the RSIP server.

node/c0-0c0s6n1:/ # xtspec /etc/sysctl.conf
node/c0-0c0s6n1:/ # vi /etc/sysctl.conf
net.ipv4.ip_local_port_range = 60001 61000


f. If the specified RSIP server is using a 10GbE interface, update the default socket buffer settings by modifying the following lines in the sysctl.conf file.

net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 0
net.core.rmem_max = 524287
net.core.wmem_max = 524287
net.core.rmem_default = 131072
net.core.wmem_default = 131072
net.ipv4.tcp_rmem = 131072 1000880 9291456
net.ipv4.tcp_wmem = 131072 1000880 9291456
net.ipv4.tcp_mem = 131072 1000880 9291456

g. Update the udev rules to skip the ifup of the rsip interfaces as they are created. Add rsip* to the list of interface names for GOTO="skip_ifup".

node/c0-0c0s6n1:/ # xtspec /etc/udev/rules.d/31-network.rules
node/c0-0c0s6n1:/ # vi /etc/udev/rules.d/31-network.rules
SUBSYSTEM=="net", ENV{INTERFACE}=="rsip*|ppp*|ippp*|isdn*|plip*|lo*|irda*| \
dummy*|ipsec*|tun*|tap*|bond*|vlan*|modem*|dsl*", GOTO="skip_ifup"

h. Exit the xtopview session.

node/c0-0c0s6n1:/ # exit

4. Update the boot automation script to start the RSIP client on the SDB node. This line is simply a modprobe of the krsip module with the IP argument pointing to the HSN IP address of the RSIP server node; place the new line towards the end of the file, immediately before any 'motd' or 'ip route add' lines. For example, if the IP address of the RSIP server is 10.128.0.14, type the following commands.

crayadm@smw:~> vi /opt/cray/etc/auto.xthostname

# RSIP client startup
lappend actions { crms_exec_via_bootnode "sdb" "root" "modprobe krsip ip=10.128.0.14 rsip_local_ports=1" }

Procedure 106. Adding the SDB node as an RSIP client to an existing RSIP configuration

1. Select one of your RSIP servers to provide RSIP access for the SDB node. In this example, we have chosen the RSIP server with the physical ID c0-0c0s6n1.

2. Log on to the boot node and invoke xtopview in the node view for the RSIP server you have selected.

boot:~ # xtopview -n c0-0c0s6n1 node/c0-0c0s6n1:/ #


3. Modify the max_clients parameter in the rsipd.conf file; add 2 more clients to make room for the new SDB node. For example, if you configured 300 RSIP clients (compute nodes), type the following:

node/c0-0c0s6n1:/ # vi /etc/opt/cray/rsipd/rsipd.conf
max_clients 302

4. Update the boot automation script to start the RSIP client on the SDB node. Do this after the line that starts the RSIP server. This line is simply a modprobe of the krsip module with the IP argument pointing to the HSN IP address of the RSIP server node. For example, if the IP address of the RSIP server is 10.128.0.14, type the following commands.

crayadm@smw:~> vi /opt/cray/etc/auto.xthostname

# RSIP client startup
lappend actions { crms_exec_via_bootnode "sdb" "root" "modprobe krsip ip=10.128.0.14 rsip_local_ports=1" }

F.4 Network Address Translation (NAT) IP Forwarding

Follow Procedure 107 on page 371 to configure NAT IP forwarding for the SDB node.

Procedure 107. Configuring NAT IP forwarding for the SDB node

1. Select a login node to act as the NAT router. Cray recommends that you select the node with the lowest load or network latency. For this example the login node is named login2.

2. Log on to the node you have selected and invoke the ifconfig command to obtain the IP address of the routing node.

If you have a Cray system with a Gemini-based system interconnection network, type this command:

login2:/ # ifconfig ipogif0
ipogif0   Link encap:Ethernet  HWaddr 00:01:01:00:00:04
          inet addr:10.128.0.3  Mask:255.252.0.0
          inet6 addr: fe80::201:1ff:fe00:4/64 Scope:Link
          UP RUNNING NOARP  MTU:16000  Metric:1
          RX packets:1543290 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1640783 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1643894879 (1567.7 Mb)  TX bytes:1446996661 (1379.9 Mb)

The IP address of the routing node is indicated as inet addr (in this case, 10.128.0.3).

3. Record the Ethernet interface used on this login node. For example:

login2:/ # netstat -r | grep default
default         cfgw-12-hsrp.us 0.0.0.0         UG        0 0          0 eth0


Following this example, use the Ethernet interface, eth0, in the NAT startup script that is created in the next step.

4. Edit /etc/hosts on the shared root to include the external license server(s). Add these entries prior to the first local Cray IP addresses; that is, before the 10.128.x.y entries. For example:

boot:~# xtopview
default/:/ # vi /etc/hosts
10.0.0.55 tic tic.domain.com
10.0.0.56 tac tac.domain.com
10.0.0.57 toe toe.domain.com
default/:/ # exit
boot:~#

5. In the same manner, edit /etc/hosts on the boot root to include entries for the external license server(s).

boot:~# vi /etc/hosts
10.0.0.55 tic tic.domain.com
10.0.0.56 tac tac.domain.com
10.0.0.57 toe toe.domain.com

6. In the default xtopview view, create and/or edit the /etc/init.d/local.start-nat file on the shared root, adding the following text:

boot:~# xtopview
default/:/ # vi /etc/init.d/local.start-nat

#!/bin/bash
### BEGIN INIT INFO
# Provides:       local.start-nat
# Required-Start:
# Required-Stop:
# Default-Start:
# Default-Stop:
# Description:    Set up NAT IP forwarding
### END INIT INFO
echo "Setting up NAT IP forwarding."
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -t nat -F
iptables -A FORWARD -i eth0 -o ipogif0 -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A FORWARD -i ipogif0 -o eth0 -j ACCEPT
iptables -A FORWARD -j LOG
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

7. Add execute permissions to the local.start-nat file:

default/:/ # chmod 755 /etc/init.d/local.start-nat

8. Exit the xtopview session.

default/:/ # exit
boot:~#


9. Log on as root to the selected router node and start the NAT service. Use the iptables command to verify that forwarding is active.

login2:~ # /etc/init.d/local.start-nat
login2:~ # iptables -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
LOG        all  --  0.0.0.0/0            0.0.0.0/0           LOG flags 0 level 4

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
login2:~ #

10. Add a new default route on the SDB node. Ensure that this route does not currently exist. For example, if the routing node IP interface you identified in step 2 is 10.128.0.3, type this command:

login2:~ # ssh sdb /sbin/route add default gw 10.128.0.3

11. Test the new route by invoking the ping command and ensuring the service node can access external servers by name.
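For example, using tic.domain.com, one of the placeholder license server names from step 4 (ping -c limits the number of packets sent):

login2:~ # ssh sdb ping -c 3 tic.domain.com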

12. Edit the boot automation script to configure NAT services. For example, if the IP address you identified in step 2 is 10.128.0.3:

smw:~# vi /opt/cray/etc/auto.xthostname

Add the following lines just prior to the ALPS/PBS startup:

lappend actions { crms_exec_via_bootnode "login2" "root" \
  "/etc/init.d/local.start-nat" }
lappend actions { crms_exec_via_bootnode "sdb" "root" \
  "/sbin/route add default gw 10.128.0.3" }

NAT services should now restart automatically upon the next reboot of the Cray system.

F.5 Installing and Configuring a NIC

Obtain a PCIe-compliant NIC. Intel 82546-based cards have been verified with Cray system SDB nodes. Follow Procedure 108 on page 373 to install the network card in the SDB node and connect it to the external network. Note that you are required to reboot your system as part of this procedure.

Procedure 108. Installing and configuring a NIC on the SDB node

1. Prior to shutting the system down, perform the following steps on the boot node to ensure the new NIC is configured upon the ensuing reboot. Invoke xtopview in the node view for the SDB node. For example, if your SDB is node 3, the IP address to assign on the external network is 172.30.10.100, the appropriate netmask is 255.255.0.0, and the default gateway IP is 172.30.10.1, type these commands.

boot:~# xtopview -m "add eth0 interface"-n3 node/3:/ # cd /etc/sysconfig/network node/3:/ # xtspec ifcfg-eth0 node/3:/ # vi ifcfg-eth0 Add the following content to the ifcfg-eth0 file: DEVICE="eth0" BOOTPROTO="static" STARTMODE="onboot" IPADDR=172.30.10.100 NETMASK=255.255.0.0 node/3:/ # touch routes node/3:/ # xtspec routes node/3:/ # echo 'default 172.30.10.1 - -' >routes node/3:/ # exit boot:~ #

2. Edit the /etc/hosts file on the shared root and add entries for the external license server(s). Add these entries prior to the first local Cray system IP addresses; that is, before the 192.168.x.y entries. For example:

boot:~# xtopview
default/:/ # vi /etc/hosts
10.0.0.55 tic tic.domain.com
10.0.0.56 tac tac.domain.com
10.0.0.57 toe toe.domain.com

default/:/ # exit
boot:~# exit

3. Shut down the system.

smw:~# xtbootsys -s last -a auto_xtshutdown

4. Power down the slot where the SDB node is installed. For example:

smw:~# xtcli power down_slot -f c0-0c0s0

5. Pull the blade, physically insert the new NIC into the PCIe slot of the SDB node, and reinsert the blade into the slot.

6. Power up the slot where the SDB node is installed. For example:

smw:~# xtcli power up_slot -f c0-0c0s0

7. Connect the NIC to the Ethernet network on which the license server is accessible.

8. Boot the Cray system.


9. Log on to the SDB node and invoke the ifconfig command to confirm that the SDB shows the new eth0 interface.

nid00003:~ # /sbin/ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:04:23:DF:4C:56
          inet addr:172.30.10.100  Bcast:172.30.10.1  Mask:255.255.0.0
          inet6 addr: 2001:408:4000:40c:204:23ff:fedf:4c56/64 Scope:Global
          inet6 addr: 2600:805:100:40c:204:23ff:fedf:4c56/64 Scope:Global
          inet6 addr: fe80::204:23ff:fedf:4c56/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:428807 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10400 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:34426088 (32.8 Mb)  TX bytes:1292332 (1.2 Mb)
          Base address:0x2fc0 Memory:fece0000-fed00000

10. Confirm that you can ping the license server from the SDB node.

nid00003:~ # ping tic.domain.com


Installing RPMs [G]

A variety of software packages are distributed as standard Linux RPM Package Manager (RPM) packages. RPM packages are self-contained installation files that must be executed with the rpm command in order to create all required directories and install all component files in the correct locations.

G.1 Generic RPM Usage

To install RPMs on a Cray system, you must use xtopview on the boot node to access and modify the shared root. The rpm command is not able to modify the RPM database from a login node or other service node; the root directory is read-only from these nodes.

Any changes to the shared root apply to all service nodes. If the RPM you are installing modifies files in /etc, you must invoke xtopview to perform any class or node specialization that may be required. xtopview specialization applies only to /etc in the shared root.

For some Cray distributed RPMs, you can set the CRAY_INSTALL_DEFAULT environment variable to configure the new version as the default. Set this variable before you install the RPM. For more information, see the associated installation guide. For more information on installing RPMs, see the xtopview(8) man page and the installation documentation for the specific software package you are installing.
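For example, a sketch of installing a hypothetical Cray-distributed package and making it the default version (the package file name is illustrative; check the package's installation guide for the exact variable semantics):

boot:~ # export CRAY_INSTALL_DEFAULT=1
boot:~ # rpm -ivh /software/cray-product-1.2.3-1.x86_64.rpm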


Example 118. Installing an RPM on the SMW

As root, use the following command:

smw:~# rpm -ivh /directorypath/filename.rpm

Example 119. Installing an RPM on the boot node root

As root, use the following command:

boot:~ # rpm -ivh /directorypath/filename.rpm

Example 120. Installing an RPM on the shared root

As root, use the following commands:

Note: If the SDB node has not been started, you must include the -x /etc/opt/cray/sdb/node_classes option when you invoke the xtopview command.

boot:~ # cp -a /tmp/filename.rpm /rr/current/software
boot:~ # xtopview
default/:/ # rpm -ivh /software/filename.rpm

Sample LNET Router Controller Script [H]

#!/bin/bash
#
# lnet.rc
# $Id: lnet.rc bogl Exp $
#
### BEGIN INIT INFO
# Provides:       lnet
# Required-Start: $network openibd
# Required-Stop:  $network openibd
# X-UnitedLinux-Should-Start:
# Default-Start:  3
# Default-Stop:   0 1 2 5 6
# Description:    Enable lnet routers
### END INIT INFO
#set -x
PATH=/bin:/usr/bin:/usr/sbin:/sbin:/opt/cray/lustre-cray_ari_s/default/sbin
. /etc/rc.status
rc_reset
case "$1" in
  start)
    echo -n "Starting lnet "
    modprobe lnet
    lctl net up > /dev/null
    rc_status -v
    ;;
  stop)
    echo -n "Stopping lnet "
    lctl net down > /dev/null
    lustre_rmmod || true
    rc_status -v
    ;;
  restart)
    $0 stop
    $0 start
    rc_status
    ;;
  *)
    echo "Usage: $0 {start|stop|restart}"
    exit 1
    ;;
esac
rc_exit
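To have the script run automatically on the LNET router nodes, it must be installed as /etc/init.d/lnet in the shared root (specialized to the router nodes) and enabled with the SLES init tools. The following is only a sketch from within an xtopview session on a router node; the node view prompt is illustrative:

node/27:/ # chmod 755 /etc/init.d/lnet
node/27:/ # chkconfig lnet on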


Enabling an Integrated Dell Remote Access Controller (iDRAC6) on a Rack-mount SMW [I]

Enabling an Integrated Dell Remote Access Controller (iDRAC6) allows you to manage your rack-mount SMW remotely.

I.1 Before You Start

Before you enable an iDRAC6 on a rack-mount SMW, you must:

• Have physical access to the SMW console

• Know your iDRAC6 IP address, subnet mask, and default gateway

• Know your SMW root account password

• Have Java installed on your SMW console

I.2 Enabling an Integrated Dell Remote Access Controller (iDRAC6) on a Rack-mount SMW

Procedure 109. Enabling an Integrated Dell Remote Access Controller (iDRAC6) on a rack-mount SMW

1. If the SMW is up, su to root and shut it down.

crayadm@smw:~> su - root
smw:~ # shutdown -h now;exit

2. Connect an Ethernet cable to the iDRAC6 port, which is located in the lower left corner on the back of a rack-mount SMW.

3. Power up the SMW.

4. After the BIOS, Dell PowerEdge Expandable RAID Controller (PERC) card, and disk map have displayed, the IPv4/IPv6 information displays. When it does, press Ctrl-E.

5. Using the arrow keys, select Lan Parameters, then press Enter.

6. Select NIC Selection and set it to Dedicated. Then press Esc.


7. Using the arrow keys, scroll down and select the IPv4 settings section.

a. Ensure that IPv4 is enabled.

b. Confirm that the IPv4 address source is set to static:

IPv4 Address Source: Static

c. Enter your iDRAC6 IP addresses for the following:

• Address:

• Subnet Mask:

• Default Gateway:

d. Ensure that IPv6 is disabled.

e. Press Esc and return to the Lan Parameters window.

8. Using the arrow keys, select Lan User Configuration, then press Enter. Note: This configuration is for both ssh and web browser access to the iDRAC.

a. Enter the root account name and iDRAC password. (See the table of Default System Passwords in Installing Cray System Management Workstation (SMW) Software.)

Account User name: root
Enter Password: ******
Confirm Password: ******

b. Press Esc.

9. Press Esc again.

10. Select Save Changes and Exit, then press Enter. The SMW will complete booting up; no user intervention is required.
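As noted in step 8, the same credentials also permit command-line access to the controller over ssh. A sketch follows; cray-drac is the same placeholder iDRAC host name used in Procedure 111:

$ ssh root@cray-drac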

Procedure 110. Changing the default iDRAC Password

1. Log into the web interface as root.

2. Select iDRAC settings on the left-hand bar.

3. Select network/Security on the main top bar.

4. Select Users on the secondary top bar.

5. Select the user whose password you are changing. For example, userid 2 and username root.

6. Select Configure User, then Next.

7. Type the new password into the New Password and Confirm New Password text boxes.


8. Select Apply to complete the password change.

I.3 Using the iDRAC6

Procedure 111. Using the iDRAC6

1. Bring up a web browser.

2. Go to: https://cray-drac. A login screen appears.

3. Enter the account user name and password that you set up in Procedure 109 on page 381, step 8.a. Then click on Submit. The System Summary window appears.

4. To access the SMW console, click on the Console Media tab.

The Virtual Console and Virtual Media window appears.

5. Click on Launch Virtual Console.

Tip: By default, your console window has two cursors: one for the console and one for your own windowing environment. To switch to single-cursor mode, click on Tools, then Single Cursor. This single cursor will not move outside the console window. To exit single-cursor mode, press the F9 key.

Tip: To log out of the virtual console, kill the window or click on File, then Exit. You will still be logged into the iDRAC6 in your web browser.

For detailed information, download the iDRAC6 documentation at:

http://goo.gl/4Jm4T


Rack-mount SMW: Replacing a Failed LOGDISK or DBDISK Disk Drive [J]

J.1 Rack-mount SMW: Replacing a Failed LOGDISK or DBDISK Disk Drive

Warning: You must be running the SUSE Linux Enterprise Server version 11 Service Pack 2 (SLES 11 SP2) SMW base operating system and a release of SMW 6.0 or SMW 7.0 on your SMW in order to perform the procedures in this appendix.

Procedure 112. Rack-mount SMW: Replacing a failed LOGDISK or DBDISK disk drive

Note: This procedure specifies replacing the LOGDISK disk. If you are replacing the DBDISK, use the appropriate /dev/disk/by-path/pci* device name.

1. Replace the failed drive with the new drive.

2. Reboot the SMW.

smw:~ # reboot

3. Reconfigure LOGDISK.

smw:~ # /sbin/fdisk /dev/disk/by-path/pci-0000:05:00.0-sas-phy4-0x4433221104000000-lun-0
Command (m for help): p

a. Delete all the current partitions, if there are any.

Command (m for help): d
Partition number 4
Command (m for help): d
Partition number 3
Command (m for help): d
Partition number 2
Command (m for help): d
Partition number 1
Command (m for help): p


b. Create the new, single partition.

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-121601, default 1):   # Hit return, take the default
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-121601, default 121601):   # Hit return, take the default
Using default value 121601

Command (m for help): p

Disk /dev/disk/by-path/pci-0000:05:00.0-sas-phy4-0x4433221104000000-lun-0: 1000.2 GB, \
1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000083

Device Boot      Start       End      Blocks  Id  System
/dev/disk/by-path/pci-0000:05:00.0-sas-phy4-0x4433221104000000-lun-0-part1
                     1    121601   976760001  83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

4. Recreate the file system.

smw:~ # mkfs -t ext3 -b 4096 /dev/disk/by-path/pci-0000:05:00.0-sas-phy4-0x4433221104000000-lun-0-part1

5. Mount the newly created file system.

smw:~ # mount /dev/disk/by-path/pci-0000:05:00.0-sas-phy4-0x4433221104000000-lun-0-part1 /var/opt/cray/disk/1
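If the file system should also mount automatically at boot, confirm that /etc/fstab still references the device; a sketch of a quick check (adjust the device path if you are replacing the DBDISK):

smw:~ # grep /var/opt/cray/disk/1 /etc/fstab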

6. The symbolic links should already be there to link this to /var/opt/cray/log.

smw:~ # ls -al /var/opt/cray

7. Restart rsms and mazama daemons.

smw:~ # /etc/init.d/rsms restart
smw:~ # /etc/init.d/cray-mzsd restart
smw:~ # /etc/init.d/cray-mzwatcher restart
smw:~ # /etc/init.d/dbMonitor restart
