xCAT3 Documentation Release 2.16.3

IBM Corporation
Sep 23, 2021

Contents

1 Table of Contents
  1.1 Overview
  1.2 Install Guides
  1.3 Get Started
  1.4 Admin Guide
  1.5 Advanced Topics
  1.6 Questions & Answers
  1.7 Reference Implementation
  1.8 Troubleshooting
  1.9 Developers
  1.10 Need Help
  1.11 Security Notices

xCAT stands for Extreme Cloud Administration Toolkit.

xCAT offers complete management of clouds, clusters, HPC, grids, datacenters, renderfarms, online gaming infrastructure, and whatever tomorrow's next buzzword may be.

xCAT enables the administrator to:

1. Discover the hardware servers
2. Execute remote system management
3. Provision operating systems on physical or virtual machines
4. Provision machines in Diskful (stateful) and Diskless (stateless) modes
5. Install and configure user applications
6. Perform parallel system management
7. Integrate xCAT in Cloud

You've reached the xCAT documentation site. The main product page is http://xcat.org

xCAT is an open source project hosted on GitHub. Go to GitHub to view the source, open issues, ask questions, and participate in the project.

Enjoy!

CHAPTER 1 Table of Contents

1.1 Overview

xCAT enables you to easily manage large numbers of servers for any type of technical computing workload.
xCAT is known for exceptional scaling, a wide variety of supported hardware, operating systems, and virtualization platforms, and complete “day0” setup capabilities.

1.1.1 Differentiators

• xCAT Scales
  Beyond all IT budgets, up to 100,000s of nodes with distributed architecture.
• Open Source
  Eclipse Public License. Support contracts are also available; contact IBM.
• Supports Multiple Operating Systems
  RHEL, SLES, Ubuntu, Debian, CentOS, Fedora, Scientific Linux, Oracle Linux, Windows, ESXi, RHEV, and more!
• Supports Multiple Hardware
  IBM Power, IBM Power LE, x86_64
• Supports Multiple Virtualization Infrastructures
  IBM PowerKVM, KVM, IBM z/VM, ESXi, Xen
• Supports Multiple Installation Options
  Diskful (Install to Hard Disk), Diskless (Runs in memory), Cloning
• Built-in Automatic Discovery
  No need to power on one machine at a time for discovery. Nodes that fail can be replaced and put back in action simply by powering the new one on.
• RESTful API
  Provides a REST API interface for third-party software to integrate with.

1.1.2 Features

1. Discover the hardware servers
   • Manually define
   • MTMS-based discovery
   • Switch-based discovery
   • Sequential-based discovery
2. Execute remote system management against the discovered server
   • Remote power control
   • Remote console support
   • Remote inventory/vitals information query
   • Remote event log query
3. Provision Operating Systems on physical (Bare-metal) or virtual machines
   • RHEL
   • SLES
   • Ubuntu
   • Debian
   • Fedora
   • CentOS
   • Scientific Linux
   • Oracle Linux
   • PowerKVM
   • ESXi
   • RHEV
   • Windows
4. Provision machines in
   • Diskful (Scripted install, Clone)
   • Stateless
5. Install and configure user applications
   • During OS install
   • After the OS install
   • HPC products - GPFS, Parallel Environment, LSF, compilers
   • Big Data - Hadoop, Symphony
   • Cloud - OpenStack, Chef
6. Parallel system management
   • Parallel shell (Run shell command against nodes in parallel)
   • Parallel copy
   • Parallel ping
7. Integrate xCAT in Cloud
   • OpenStack
   • SoftLayer

1.1.3 Operating System & Hardware Support Matrix

         Power  Power LE  zVM  Power KVM  x86_64  x86_64 KVM  x86_64 ESXi
RHEL     yes    yes       yes  yes        yes     yes         yes
SLES     yes    yes       yes  yes        yes     yes         yes
Ubuntu   no     yes       no   yes        yes     yes         yes
CentOS   no     no        no   no         yes     yes         yes
Windows  no     no        no   no         yes     yes         yes

1.1.4 Architecture

The following diagram shows the basic structure of xCAT:

[Figure: basic structure of xCAT]

xCAT Management Node (xCAT Mgmt Node): The server where the xCAT software is installed, used as the single point from which to perform system management over the entire cluster. On this node, a database is configured to store the xCAT node definitions. Network services (dhcp, tftp, http, etc.) are enabled to respond during operating system deployment.

Service Node: One or more defined “slave” servers operating under the Management Node to assist in system management and reduce the load (CPU, network bandwidth) that a single Management Node would otherwise carry. This concept is necessary when managing very large clusters.

Compute Node: The compute nodes are the target servers which xCAT is managing.

Network Services (dhcp, tftp, http, etc.): The various network services necessary to perform operating system deployment over the network. xCAT will bring up and configure the network services automatically without any intervention from the System Administrator.

Service Processor (SP): A module embedded in the hardware server used to perform out-of-band hardware control (e.g. Integrated Management Module (IMM), Flexible Service Processor (FSP), Baseboard Management Controller (BMC), etc.).

Management network: The network used by the Management Node (or Service Node) to install operating systems and manage the nodes. The Management Node and the in-band Network Interface Card (NIC) of the nodes are connected to this network. If you have a large cluster utilizing Service Nodes, sometimes this network is segregated into separate VLANs for each Service Node.

Service network: The network used by the Management Node (or Service Node) to control the nodes out-of-band through the Service Processor. If the Service Processor is configured in shared mode (meaning the NIC of the Service Processor is used for both the SP and the host), then this network can be combined with the management network.

Application network: The network used by the applications on the Compute Nodes to communicate with each other.

Site (Public) network: The network used by users to access the Management Nodes or access the Compute Nodes directly.

RestAPIs: The RestAPI interface can be used by third-party applications to integrate with xCAT.

1.1.5 xCAT2 Release Information

The following tables document the xCAT release versions and release dates. For more detailed information regarding new functions, supported OSs, bug fixes, and download links, refer to the specific release notes.
xCAT 2.16.x

Table 1: 2.16.x Release Information

Version  Release Date  New OS Supported   Release Notes
2.16.2   2021-05-25    RHEL 8.3           2.16.2 Release Notes
2.16.1   2020-11-06    RHEL 8.2           2.16.1 Release Notes
2.16.0   2020-06-17    RHEL 8.1, SLES 15  2.16.0 Release Notes

xCAT 2.15.x

Table 2: 2.15.x Release Information

Version  Release Date  New OS Supported  Release Notes
2.15.1   2020-03-06    RHEL 7.7          2.15.1 Release Notes
2.15.0   2019-11-11    RHEL 8.0          2.15.0 Release Notes

xCAT 2.14.x

Table 3: 2.14.x Release Information

Version  Release Date  New OS Supported            Release Notes
2.14.6   2019-03-29                                2.14.6 Release Notes
2.14.5   2018-12-07    RHEL 7.6                    2.14.5 Release Notes
2.14.4   2018-10-19    Ubuntu 18.04.1              2.14.4 Release Notes
2.14.3   2018-08-24    SLES 12.3                   2.14.3 Release Notes
2.14.2   2018-07-13    RHEL 6.10, Ubuntu 18.04     2.14.2 Release Notes
2.14.1   2018-06-01    RHV 4.2, RHEL 7.5 (Power8)  2.14.1 Release Notes
2.14.0   2018-04-20    RHEL 7.5                    2.14.0 Release Notes

xCAT 2.13.x

Table 4: 2.13.x Release Information

Version  Release Date  New OS Supported  Release Notes
2.13.11  2018-03-09                      2.13.11 Release Notes
2.13.10  2018-01-26                      2.13.10 Release Notes
2.13.9   2017-12-18                      2.13.9 Release Notes
2.13.8   2017-11-03                      2.13.8 Release Notes
2.13.7   2017-09-22                      2.13.7 Release Notes
2.13.6   2017-08-10    RHEL 7.4          2.13.6 Release Notes
2.13.5   2017-06-30                      2.13.5 Release Notes
2.13.4   2017-05-09    RHV 4.1           2.13.4 Release Notes
2.13.3   2017-04-14    RHEL 6.9          2.13.3 Release Notes
2.13.2   2017-02-24                      2.13.2 Release Notes
2.13.1   2017-01-13                      2.13.1 Release Notes
2.13.0   2016-12-09    SLES 12.2         2.13.0 Release Notes

xCAT 2.12.x

Table 5: 2.12.x Release Information

Version  Release Date  New OS Supported                           Release Notes
2.12.4   2016-11-11    RHEL 7.3 LE, RHEV 4.0                      2.12.4 Release Notes
2.12.3   2016-09-30                                               2.12.3 Release Notes
2.12.2   2016-08-19    Ubuntu 16.04.1                             2.12.2 Release Notes
2.12.1   2016-07-08                                               2.12.1 Release Notes
2.12.0   2016-05-20    RHEL 6.8, Ubuntu 14.4.4 LE, Ubuntu 16.04   2.12.0 Release Notes

xCAT 2.11.x

xCAT 2.11.1 (2016/04/22, 2.11.1 Release Notes)
  New Feature:
    • Bug fix

xCAT 2.11 (2015/12/11, 2.11 Release Notes)
  New OS:
    • RHEL 7.2 LE
    • UBT 14.4.3 LE
    • UBT 15.10 LE
    • PowerKVM 3.1
  New Hardware:
    • S822LC(GCA)
    • S822LC(GTA)
    • S812LC
    • NeuCloud OP
    • ZoomNet RP
  New Feature:
    • NVIDIA GPU for OpenPOWER
    • Infiniband for OpenPOWER
    • SW KIT support for OpenPOWER
    • renergy command for OpenPOWER
    • rflash command for OpenPOWER
    • Add xCAT Troubleshooting Log
    • xCAT Log Classification
    • RAID Configuration
    • Accelerate genimage process
    • Add bmcdiscover Command
    • Enhance xcatdebugmode
    • new xCAT doc in ReadTheDocs
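
As an illustration of how the Operating System & Hardware Support Matrix in section 1.1.3 can be read, the following is a minimal Python sketch that transcribes the matrix into a lookup table. It is not part of xCAT; the names PLATFORMS, SUPPORT, is_supported and platforms_for are hypothetical and exist only for this example. Lookups ignore letter case, so "x86_64 Esxi" and "x86_64 ESXi" are treated as the same column.

```python
# Illustrative only: this is not an xCAT API. All names below are hypothetical.
# Column headings and yes/no cells are transcribed from the matrix in section 1.1.3.
PLATFORMS = ["Power", "Power LE", "zVM", "Power KVM",
             "x86_64", "x86_64 KVM", "x86_64 Esxi"]

_ROWS = {
    "RHEL":    "yes yes yes yes yes yes yes",
    "SLES":    "yes yes yes yes yes yes yes",
    "Ubuntu":  "no yes no yes yes yes yes",
    "CentOS":  "no no no no yes yes yes",
    "Windows": "no no no no yes yes yes",
}


def _key(name):
    """Normalize a row or column name so lookups ignore case and padding."""
    return name.strip().lower()


# Nested mapping: normalized OS name -> normalized platform name -> bool.
SUPPORT = {
    _key(os_name): {_key(p): cell == "yes"
                    for p, cell in zip(PLATFORMS, cells.split())}
    for os_name, cells in _ROWS.items()
}


def is_supported(os_name, platform):
    """Return True if the matrix marks this OS/platform pair as 'yes'."""
    try:
        return SUPPORT[_key(os_name)][_key(platform)]
    except KeyError:
        raise ValueError(
            f"not in the matrix: {os_name!r} on {platform!r}") from None


def platforms_for(os_name):
    """List the platforms (in column order) on which the OS is marked 'yes'."""
    row = SUPPORT[_key(os_name)]
    return [p for p in PLATFORMS if row[_key(p)]]


if __name__ == "__main__":
    print(is_supported("Ubuntu", "Power LE"))
    print(platforms_for("CentOS"))
```

A pair that does not appear in the matrix raises ValueError rather than returning False, so an unknown OS or platform is not mistaken for an unsupported one.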
