Bright Cluster Manager 7.0
Administrator Manual
Revision: 81e8b53
Date: Mon Jun 21 2021

©2015 Bright Computing, Inc. All Rights Reserved. This manual or parts thereof may not be reproduced in any form unless permitted by contract or by written permission of Bright Computing, Inc.

Trademarks
Linux is a registered trademark of Linus Torvalds. PathScale is a registered trademark of Cray, Inc. Red Hat and all Red Hat-based trademarks are trademarks or registered trademarks of Red Hat, Inc. SUSE is a registered trademark of Novell, Inc. PGI is a registered trademark of The Portland Group Compiler Technology, STMicroelectronics, Inc. SGE is a trademark of Sun Microsystems, Inc. FLEXlm is a registered trademark of Globetrotter Software, Inc. Maui Cluster Scheduler is a trademark of Adaptive Computing, Inc. ScaleMP is a registered trademark of ScaleMP, Inc. All other trademarks are the property of their respective owners.

Rights and Restrictions
All statements, specifications, recommendations, and technical information contained herein are current or planned as of the date of publication of this document. They are reliable as of the time of this writing and are presented without warranty of any kind, expressed or implied. Bright Computing, Inc. shall not be liable for technical or editorial errors or omissions which may occur in this document. Bright Computing, Inc. shall not be liable for any damages resulting from the use of this document.

Limitation of Liability and Damages Pertaining to Bright Computing, Inc.
The Bright Cluster Manager product principally consists of free software that is licensed by the Linux authors free of charge. Bright Computing, Inc. shall have no liability nor will Bright Computing, Inc. provide any warranty for the Bright Cluster Manager to the extent that is permitted by law.
Unless confirmed in writing, the Linux authors and/or third parties provide the program as is without any warranty, either expressed or implied, including, but not limited to, marketability or suitability for a specific purpose. The user of the Bright Cluster Manager product shall accept the full risk for the quality or performance of the product. Should the product malfunction, the costs for repair, service, or correction will be borne by the user of the Bright Cluster Manager product. No copyright owner or third party who has modified or distributed the program as permitted in this license shall be held liable for damages, including general or specific damages, damages caused by side effects or consequential damages, resulting from the use of the program or the un-usability of the program (including, but not limited to, loss of data, incorrect processing of data, losses that must be borne by you or others, or the inability of the program to work together with any other program), even if a copyright owner or third party had been advised about the possibility of such damages unless such copyright owner or third party has signed a writing to the contrary.

Table of Contents
  0.1 Quickstart
  0.2 About This Manual
  0.3 About The Manuals In General
  0.4 Getting Administrator-Level Support
1 Introduction
  1.1 Bright Cluster Manager Functions And Aims
  1.2 The Scope Of The Administrator Manual (This Manual)
    1.2.1 Installation
    1.2.2 Configuration, Management, And Monitoring Via Bright Cluster Manager Tools And Applications
  1.3 Outside The Direct Scope Of The Administrator Manual
2 Cluster Management With Bright Cluster Manager
  2.1 Concepts
    2.1.1 Devices
    2.1.2 Software Images
    2.1.3 Node Categories
    2.1.4 Node Groups
    2.1.5 Roles
  2.2 Modules Environment
    2.2.1 Adding And Removing Modules
    2.2.2 Using Local And Shared Modules
    2.2.3 Setting Up A Default Environment For All Users
    2.2.4 Creating A Modules Environment Module
  2.3 Authentication
    2.3.1 Changing Administrative Passwords On The Cluster
    2.3.2 Logins Using ssh
    2.3.3 Certificates
    2.3.4 Profiles
  2.4 Cluster Management GUI
    2.4.1 Installing Cluster Management GUI On The Desktop
    2.4.2 Navigating The Cluster Management GUI
    2.4.3 Advanced cmgui Features
  2.5 Cluster Management Shell
    2.5.1 Invoking cmsh
    2.5.2 Levels, Modes, Help, And Commands Syntax In cmsh
    2.5.3 Working With Objects
    2.5.4 Accessing Cluster Settings
    2.5.5 Advanced cmsh Features
  2.6 Cluster Management Daemon
    2.6.1 Controlling The Cluster Management Daemon
    2.6.2 Configuring The Cluster Management Daemon
    2.6.3 Configuring The Cluster Management Daemon Logging Facilities
    2.6.4 Configuration File Modification
    2.6.5 Configuration File Conflicts Between The Standard Distribution And Bright Cluster Manager For Generated And Non-Generated Files
3 Configuring The Cluster
  3.1 Main Cluster Configuration Settings
    3.1.1 Cluster Configuration: Various Name-Related Settings
    3.1.2 Cluster Configuration: Some Network-Related Settings
    3.1.3 Miscellaneous Settings
    3.1.4 Limiting The Maximum Number Of Open Files
  3.2 Network Settings
    3.2.1 Configuring Networks
    3.2.2 Adding Networks
    3.2.3 Changing Network Parameters
  3.3 Configuring Bridge Interfaces
  3.4 Configuring VLAN interfaces
    3.4.1 Configuring A VLAN Interface Using cmsh
    3.4.2 Configuring A VLAN Interface Using cmgui
  3.5 Configuring Bonded Interfaces
    3.5.1 Adding A Bonded Interface
    3.5.2 Single Bonded Interface On A Regular Node
    3.5.3 Multiple Bonded Interface On A Regular Node
    3.5.4 Bonded Interfaces On Head Nodes And HA Head Nodes
    3.5.5 Tagged VLAN On Top Of a Bonded Interface
    3.5.6 Further Notes On Bonding
  3.6 Configuring InfiniBand Interfaces
    3.6.1 Installing Software Packages
    3.6.2 Subnet Managers
    3.6.3 InfiniBand Network Settings
    3.6.4 Verifying Connectivity
  3.7 Configuring BMC (IPMI/iLO) Interfaces
    3.7.1 BMC Network Settings
    3.7.2 BMC Authentication
    3.7.3 Interfaces Settings
  3.8 Configuring Switches And PDUs
    3.8.1 Configuring With The Manufacturer’s Configuration Interface
    3.8.2 Configuring SNMP
    3.8.3 Uplink Ports
    3.8.4 The showport MAC Address to Port Matching Tool
  3.9 Disk Layouts: Disked, Semi-Diskless, And Diskless Node Configuration
    3.9.1 Disk Layouts
    3.9.2 Disk Layout Assertions
    3.9.3 Changing Disk Layouts
    3.9.4 Changing A Disk Layout From Disked To Diskless
  3.10 Configuring NFS Volume Exports And Mounts
    3.10.1 Exporting A Filesystem Using cmgui And cmsh
    3.10.2 Mounting A Filesystem Using cmgui And cmsh
    3.10.3 Mounting A Filesystem Subtree For A Diskless Node Over NFS
    3.10.4 Mounting The Root Filesystem For A Diskless Node Over NFS
    3.10.5 Configuring NFS Volume Exports And Mounts Over RDMA With OFED Drivers
  3.11 Managing And Configuring Services
    3.11.1 Why Use The Cluster Manager For Services?
    3.11.2 Managing And Configuring Services—Examples
  3.12 Managing And Configuring A Rack
    3.12.1 Racks
    3.12.2 Rack View
    3.12.3 Assigning Devices To A Rack
    3.12.4 Assigning Devices To A Chassis
    3.12.5 An Example Of Assigning A Device To A Rack, And Of Assigning A Device To A Chassis
  3.13 Configuring A GPU Unit, And Configuring GPU Settings
    3.13.1 GPUs And GPU Units
    3.13.2 GPU Unit Configuration Example: The Dell PowerEdge C410x
    3.13.3 Configuring GPU Settings
  3.14 Configuring Custom Scripts
    3.14.1 custompowerscript
    3.14.2 custompingscript
    3.14.3 customremoteconsolescript
  3.15 Cluster Configuration Without Execution By CMDaemon
    3.15.1 Cluster Configuration: The Bigger Picture
    3.15.2 Making Nodes Function Differently By Image
    3.15.3 Making All Nodes Function Differently From Normal Cluster Behavior With FrozenFile
    3.15.4 Adding Functionality To Nodes Via An initialize Or finalize Script
    3.15.5 Examples Of Configuring Nodes With Or Without CMDaemon
4 Power Management
  4.1 Configuring Power Parameters
    4.1.1 PDU-Based Power Control
    4.1.2 IPMI-Based Power Control
    4.1.3 Combining PDU- and IPMI-Based Power Control
    4.1.4 Custom Power Control
    4.1.5 Hewlett Packard iLO-Based Power Control
  4.2 Power Operations
    4.2.1 Power Operations With cmgui
    4.2.2 Power Operations Through cmsh
  4.3 Monitoring Power
  4.4 CPU Scaling Governors
    4.4.1 The Linux Kernel And CPU Scaling Governors
    4.4.2 The Governor List According To sysinfo
    4.4.3 Setting The Governor
5 Node Provisioning
  5.1 Before The Kernel Loads
    5.1.1 PXE Booting
    5.1.2 iPXE Booting From A Disk Drive
    5.1.3 iPXE Booting Using InfiniBand