xCAT 2 – Extreme Cloud Administration Toolkit


Jordi Caubet, IBM Spain – IT Specialist
http://xcat.sourceforge.net/
© 2011 IBM Corporation

Agenda
- xCAT Overview
- xCAT Architecture & Basic Functionality
- xCAT Commands
- Setting Up an xCAT Cluster Example
- Energy Management
- Monitoring

What is xCAT?
Extreme Cluster (Cloud) Administration Toolkit
- Open source (Eclipse Public License) cluster management solution
- Configuration database – a relational DB with a simple shell
- Distributed network services management and shell commands
- Framework for alerts and alert management
- Hardware management – control, monitoring, etc.
- Software provisioning and maintenance

Design Goals
- Build on the work of others – encourage community participation
- Use best practices – borrow concepts, not code
- Scripts only (source included) for portability – key to customization!
- Customer-requirement driven
- Provide a flexible, extensible framework
- Ability to scale "beyond your budget"

Systems Need Management
- Administrators have to manage increasing numbers of both physical and virtual servers
- Workloads are becoming more specific to OS, libraries, and software stacks
- Increasing need for dynamic reprovisioning
- Re-purposing of existing equipment
- Single commands distributed to hundreds or thousands of servers/VMs simultaneously
- File distribution
- Firmware and OS updates
- Cluster troubleshooting

xCAT 2 Main Features
- Client/server architecture. Clients can run on any Perl-compliant system. All communications are SSL-encrypted.
- Role-based administration. Different users can be assigned various administrative roles for different resources.
- New stateless and iSCSI support. Stateless can be RAM-root, compressed RAM-root, or stacked NFS-root. Linux software-initiator iSCSI support for RHEL and SLES is included; systems without hardware-based initiators can still be iSCSI-installed and booted.
- Virtualization support, including the rmigrate command to request live migration of a virtualized guest from one host to another (KVM, Xen, VMware).
- Scalability. xCAT 2.x was designed to scale "beyond your budget". 100,000 nodes? No problem with xCAT's Hierarchical Management Cloud (HMC).
- Automagic discovery. Single power-button-press, physical-location-based discovery and configuration capability.
- Choice of database backend: SQLite, PostgreSQL, MySQL, DB2.
- Plug-in architecture for compartmental development. Add your own xCAT functionality to do whatever you want; new plug-ins extend the xCAT vocabulary available to xCAT clients.
- Monitoring plug-in infrastructure to easily integrate third-party monitoring software into an xCAT cluster. Plug-ins provided with xCAT: SNMP, RMC, Ganglia, Performance Co-Pilot.
- Notification infrastructure to watch for xCAT DB table changes.
- SNMP monitoring. A trap handler handles all SNMP traps.
- Node status updates (nodelist.status is updated during node deployment and the node power on/off process).
- Centralized console and system logs.
- Automatic installation of any additional RPMs requested by the user, both during the node deployment phase and after the nodes are up and running.
- Supports a large number of operating systems.

xCAT Functionality
- Remote hardware control – power, reset, vitals, inventory, event logs, SNMP alert processing
- Remote console management – serial console, SOL, logging
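The remote hardware control and console functions map onto a small family of xCAT commands. A sketch of typical invocations (node names are hypothetical; assumes xCAT is installed and the cluster tables are populated):

```shell
# Query and control power on a node range (talks to the BMC/MM, e.g. via IPMI)
rpower node01-node04 stat     # report current power state
rpower node01-node04 on       # power on
rpower node02 boot            # power-cycle if on, power on if off

# Hardware vitals, inventory, and service-processor event logs
rvitals node01 temp           # temperature sensors
rinv node01 all               # full hardware inventory
reventlog node01 10           # last 10 event-log entries

# Remote serial/SOL console (backed by conserver)
rcons node01
```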
- Remote destiny control – local boot, network boot, iSCSI boot
- Parallel cluster control – parallel shell, parallel rsync, parallel secure copy, parallel ping
- Remote automated unattended network installation
  - Auto-discovery: MAC address collection, service processor programming
  - Remote flashing
  - Kickstart, AutoYaST, imaging, stateless/diskless
- Easy to use and it scales! Think 100,000 nodes – xCAT will make you lazy: no need to walk to the datacenter again.

xCAT – Past and Present
- October 1999 – xCAT Zero created for Web 1.0
- January 2000 to present – xCAT used worldwide for scale-out Linux and Windows clusters; xCAT community: 273 members from at least 16 countries
- May 2007 – xCAT and CSM developers begin work together on xCAT 2
- October 2007 – xCAT 1.3.0 released; xCAT 2.0-alpha (Linux only)
- 2008 – xCAT 2.0 and 2.1 released: xSeries and pSeries; Linux and AIX, Windows and Xen; open source; CLI and GUI
- 2009 (10 years) – xCAT 2.2 and 2.3 released: Statelite support
- 2010 – xCAT 2.4 and 2.5 released
(Slide diagram: the PSSP and CSM product lines converging into xCAT 2.)

Current State of xCAT
- Good: open source; CSM/xCAT developers; proven scalability – deployed at hundreds of sites; community involvement
- Bad: version 2 is a complete rewrite (current = 2.5.x); documentation (getting better)
- Ugly: need more developers; web interface

xCAT Is a Scaled-Out Datacenter Control
- Provision operating systems on physical or virtual machines: SLES 10 SP2 and higher, SLES 11 (incl. SP1), RHEL 4.x, RHEL 5.x, RHEL 6, CentOS 4.x, CentOS 5.x, SL 5.5, Fedora 8-14, AIX 6.1 and 7.1 (all available Technology Levels), Windows 2008, Windows 7, VMware, KVM, PowerVM, z/VM
- Scripted install, stateless, Statelite, iSCSI, or cloning
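The parallel cluster-control functions listed above are exposed as the psh, prsync, pscp, and pping commands (with xdsh/xdcp as the more general distributed-shell pair). A sketch, assuming a node group named compute has been defined in the nodelist table:

```shell
# Run a command on every node in the "compute" group in parallel
psh compute uptime

# Distributed shell and distributed copy across the same group
xdsh compute "rpm -q openssl"
xdcp compute /etc/ntp.conf /etc/ntp.conf

# Push a directory tree with parallel rsync, then check node reachability
prsync /opt/app compute:/opt
pping compute
```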
- Remotely manage systems: integrated lights-out management, remote console, and distributed shell support
- Quickly set up and control management node services: DNS, HTTP, DHCP, TFTP
- Ideal management for HPC clusters, render farms, grids, web farms, online gaming infrastructure, clouds, datacenters, ...

xCAT Architecture & Basic Functionality
- Architecture
- Tables and database (SQLite, PostgreSQL, MySQL, DB2)
- Provisioning methods (stateful, stateless, Statelite)

Simple Infrastructure
- A single xCAT Management Node (MN) for N nodes, acting as the single-node DHCP/TFTP/HTTP/NFS(hybrid) server, backed by the xCAT database (SQLite, MySQL, PostgreSQL, or DB2).
- Scales to ~128 compute nodes; if staggered boot is used, this can scale to 1024 nodes (tested).

Hierarchical Infrastructure
- A single xCAT Management Node (MN) with multiple Service Nodes (SN), each running DHCP/TFTP/HTTP/NFS(hybrid) and providing boot services (with backup SNs) to increase scaling.
- Can scale to thousands and even hundreds of thousands of nodes.

xCAT Architecture
The heart of the xCAT architecture is the xCAT daemon (xcatd) on the management node, reached by CLI, web GUI, and REST API clients.
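Setting up the management node services named above (DNS, DHCP, TFTP, hosts) is scripted by xCAT's mak* commands. A minimal sketch, assuming the site and networks tables have already been filled in:

```shell
# Regenerate /etc/hosts entries from the xCAT hosts table
makehosts

# Initialize and reload DNS from the networks table and /etc/hosts
makedns -n

# Write a fresh DHCP configuration for all defined networks, then add node entries
makedhcp -n
makedhcp -a

# Build the TFTP-served network-boot image for the given architecture
mknb x86_64
```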
Clients communicate with xcatd over XML/SSL. xcatd checks each request against ACLs and dispatches it to the appropriate plug-in, which drives deployment services, logging/events, and hardware control for the nodes; on large clusters the management node's xcatd forwards requests over XML/SSL to the xcatd on each service node.

Tables and Database
- xCAT stores all information about the nodes and subsystems it manages in a database. The default database lives in /etc/xcat as SQLite tables; xCAT can be instructed to store the tables in MySQL, PostgreSQL, or DB2 as well.
- For most installations you won't even need to fill in half of the tables, and for the tables you do need, in most cases a single line is enough.
- There are a lot of tables, but only some are common to Linux and AIX; some are AIX-only, some are just for monitoring, and some are for advanced functions (virtual machines, iSCSI settings), ...
- xCAT comes with a rich set of functions for manipulating tables.

Common tables:
- site – Controls settings that are primarily used for the management node and how xCAT will behave.
- nodelist – The list of all the nodes in the cluster, including each node's current status and what groups it is in.
- nodehm – Settings that control how each node's hardware is managed. Typically, an additional table specific to the hardware type of the node (e.g. the ipmi, mp, and ppc tables) contains further info.
- ipmi – Settings for nodes that are controlled by an on-board BMC via IPMI.
- mp – Hardware control info specific to blades; also refers to the mpa table.
- mpa – Info about each Management Module and how to access it.
- networks – Describes the networks in the cluster and the info necessary to set up nodes on each network.
- noderes – Resources and settings to use ...
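The table-manipulation functions mentioned above include tabdump, tabedit, and chtab, plus the object-level mkdef/lsdef/chdef commands layered on top of the tables. A sketch (node names, group names, and the BMC address are hypothetical):

```shell
# Dump a table in CSV form; open another one in an interactive editor
tabdump site
tabedit networks

# Change a single attribute directly in a table (here: the site table's domain key)
chtab key=domain site.value=cluster.example.com

# Object commands: define a node, list it, change an attribute
mkdef -t node node01 groups=compute,all mgt=ipmi bmc=10.0.1.1
lsdef node01
chdef -t node node01 netboot=xnba
```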