IBM Cluster Systems Management V1.4.1 — Scalable System Administration Management Platform


Software Announcement 205-088, April 12, 2005

Overview

Cluster Systems Management (CSM) is designed to provide a robust, powerful, and centralized way to manage a large number of IBM servers from a single point of control. CSM V1.4.1 helps simplify the tasks of installing, configuring, operating, and maintaining clusters of servers using logical partitions (LPARs), which may help reduce the overall cost of IT ownership. CSM offers one consistent interface for managing both AIX and Linux nodes, with capabilities for remote parallel network installation, remote hardware control, distributed command execution, file collection and distribution, and cluster-wide monitoring.

This is the last release of CSM that includes the Linux Cluster Support extension.

At a glance

CSM V1.4.1 adds function and hardware support, as follows:

• Cluster Ready Hardware Server (CRHS) support, enhancing the CSM management server's ability to manage IBM pSeries POWER5 hardware
• Separate installation servers for dramatic improvement in installation capabilities
• Doubling of the CSM Hardware Management Console (HMC) scaling capabilities
• Enhanced support for mixed cluster environments
• High Availability Management Server (HA MS) feature support on AIX 5L V5.3 and POWER5 servers
• Improved handling of CSM event types, such as consolidating error reports across platforms
• Implementation documentation for firewalls
• Support for Serial-Over-LAN on xSeries
• HA management server enhancements

New with CSM V1.4.1 on AIX 5L and Linux — specific product enhancements:

• CSM for Linux on POWER:
  − Runs on SUSE LINUX Enterprise Server (SLES) 9 and Red Hat AS 4.0
• CSM for Linux on Multiplatforms (xSeries, IBM 325 and 326, BladeCenter HS20 and HS40):
  − Support for SLES 9 and Red Hat Enterprise Linux 4 for selected xSeries and POWER architectures
• CSM for AIX 5L:
  − Support for AIX 5L V5.3 and POWER5 servers
  − High Availability Management Server (HA MS) feature
• Support for selected new OpenPower, p5, xSeries, and BladeCenter hardware models: IBM OpenPower 710 and 720 servers; IBM p5 510, 520, 550, 570, 575, 590, and 595; IBM 326; IBM xSeries 336 and 346; and IBM BladeCenter JS20, JS20+, HS20 (Model 8843), and HS40
• CRHS function, enabling hardware discovery, HMC-to-server assignment, and server service processor password management from the CSM management server for POWER5 hardware
• Separate installation servers located off the management server for dramatic improvement in installation and configuration capabilities
• Enhanced CSM HMC scaling capabilities
• Enhanced support for mixed cluster environments
• Improved handling of CSM event types, such as consolidating error reports across platforms
• Implementation documentation for firewalls
• Support for Serial-Over-LAN on xSeries
• High Availability (HA) management server enhancements

Key prerequisites

Runs on AIX 5L V5.2 and V5.3, SUSE SLES, Red Hat, and IBM eServer Cluster 1350, Cluster 1600, and selected xSeries, OpenPower, and POWER servers.

Planned availability date

May 6, 2005

For ordering, contact: Your IBM representative, an IBM Business Partner, or the Americas Call Centers at 800-IBM-CALL. Reference: RE001.

This announcement is provided for your information only. For additional information, contact your IBM representative, call 800-IBM-4YOU, or visit the IBM home page at http://www.ibm.com. IBM is a registered trademark of International Business Machines Corporation.

Description

CSM V1.4.1 is designed to provide simple, low-cost management of distributed and clustered IBM eServer servers. For organizations with both Linux and AIX applications, a single AIX 5L V5.2 or V5.3 management console can provide management services to AIX 5L and Linux clients in distributed and clustered configurations.

CSM is also a key element of the eServer Cluster 1600 and Cluster 1350, platforms that are ideal for workload consolidation or for achieving high degrees of scalability and performance for applications that take advantage of clustered systems architectures. Primary examples are computational modeling in high-performance computing and multi-terabyte data warehouses in large corporations.

The CSM licensed products include:

• CSM for Linux on POWER
• CSM for Linux on Multiplatforms (xSeries, BladeCenters)
• CSM for AIX 5L

New with CSM V1.4.1 in all three product offerings:

Cluster Ready Hardware Server (CRHS)

CRHS is a set of software that enhances the ability of the CSM management server to manage pSeries POWER5 hardware. The net gain is that the CSM management server becomes more of a central point of control in a cluster containing POWER5 HMCs, since more function can be executed and controlled directly from the CSM management server rather than via the HMC.

Some of the features CRHS brings to CSM include:

• Hardware discovery of HMCs and POWER5 hardware on a cluster network from the CSM Management Server (MS)
• The ability to assign, from the CSM MS, which HMC controls which POWER5 server
• The ability to set, change, and distribute server service processor passwords from the CSM MS
• Reduction in the number of HMCs required for failover, by allowing one HMC to take over any of the systems on the network; the old model was to have duplicate HMCs for each subnet of HMCs and servers

Note: APARs IY69115, IY69116, and IY69117 are required for CRHS.

Install servers

CSM V1.4.1 provides separate install server support, allowing node installation services to be off-loaded from the management server onto one or more separate servers. An install server handles installation services such as DHCP, TFTP, PXE, BOOTP, NFS, and other functions required during a network operating system installation of a node. All managed node operating system installations are still initiated using CSM commands from the management server as a central point of control, and the CSM software automatically distributes the necessary operations to the appropriate install servers. Install servers in homogeneous clusters can improve scalability and node installation performance by distributing the required network services across multiple servers. Install servers in heterogeneous clusters allow CSM to provide node operating system installation for different operating system (OS) distributions within the same cluster.

CSM now allows managed nodes to be installed with a higher version of an operating system than is installed on the management server. However, the version of CSM on the management server must remain as high as, or higher than, the version of CSM on the managed nodes.

Mixed clusters

CSM has enhanced its mixed cluster support structure for operating systems. The following lists outline the mixed cluster scenarios supported in CSM 1.4.1.

Clusters managed by pSeries Management Servers running AIX 5L can contain the following:

• All CSM supported releases of AIX on pSeries managed nodes, and up to one of the following:
  − All CSM supported releases of Red Hat on pSeries managed nodes, or
  − All CSM supported releases of Red Hat on xSeries managed nodes, or
  − All CSM supported releases of SUSE SLES on pSeries managed nodes, or
  − All CSM supported releases of SUSE SLES on xSeries managed nodes

With APARs IY69115, IY69116, and IY69117, clusters managed by pSeries Management Servers running AIX 5L can contain any combination of the following:

• All CSM supported releases of AIX on pSeries managed nodes
• All CSM supported releases of Red Hat on pSeries managed nodes
• All CSM supported releases of Red Hat on xSeries managed nodes
• All CSM supported releases of SUSE SLES on pSeries managed nodes
• All CSM supported releases of SUSE SLES on xSeries managed nodes

Clusters managed by xSeries Management Servers running SUSE SLES can contain any combination of the following:

• All CSM supported releases of Red Hat on xSeries managed nodes
• All CSM supported releases of SUSE SLES on xSeries managed nodes

Clusters managed by xSeries Management Servers running Red Hat can contain any combination of the following:

• All CSM supported releases of Red Hat on xSeries managed nodes
• All CSM supported releases of SUSE SLES on xSeries managed nodes

Clusters managed by pSeries Management Servers running SUSE SLES can contain any combination of the following:

• All CSM supported releases of Red Hat on pSeries managed nodes
• All CSM supported releases of SUSE SLES on pSeries managed nodes

Clusters managed by pSeries Management Servers running Red Hat can contain any combination of the following:

• All CSM supported releases of Red Hat on pSeries managed nodes
• All CSM supported releases of SUSE SLES on pSeries managed nodes

Specific product enhancements:

• CSM for Linux on POWER:
  − Support for eServer POWER5 servers running SUSE SLES 9, and Red Hat EL 3.0 or EL 4.0
  − High Availability Management Server (HA MS) feature running SUSE SLES 9, and Red Hat EL 3.0 and EL 4.0
• CSM for Linux on Multiplatforms:
  − Support for Red Hat EL 3.0 and 4.0
• CSM for AIX 5L:
  − Support for new POWER5 servers and AIX 5L V5.3
  − HA MS feature supported on AIX 5L V5.3

Implementation documentation for firewalls

CSM support for firewalls documents the configuration requirements for operating a firewall within a CSM Management Domain (MD), such that a firewall host physically and logically separates a management server (MS) from its managed nodes (MNs).

Support for p5 Virtual I/O Servers and I/O Server Partitions

CSM now supports the p5 Virtual I/O Server as a non-node device. On a p5 system, a Virtual I/O Server allows multiple client partitions to share physical resources, such as network adapters and SCSI devices.

High Availability Management Server

On all platforms, the optional CSM HA MS is designed to prevent the management server from being a single point of failure in the CSM cluster. CSM HA MS uses a shared disk to store the CSM management server's data and requires a backup management server with a configuration similar to that of the primary management server. CSM HA MS is designed to automate failover in response to failure conditions.

Usability enhancements
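The CRHS assignment and HMC-failover behavior described in the announcement can be modeled as a small registry kept at the management server. The following Python sketch is purely illustrative — the class, method names, and host names are hypothetical, not part of CSM:

```python
# Illustrative model of CRHS HMC-to-server assignment: the management
# server records which HMC controls which POWER5 server, and a single
# spare HMC can take over any system whose HMC fails (instead of one
# duplicate HMC per subnet, as in the old model). All names are made up.

class CRHSRegistry:
    def __init__(self):
        self.assignments = {}  # POWER5 server -> controlling HMC

    def assign(self, server, hmc):
        """Assign, from the CSM MS, which HMC controls which server."""
        self.assignments[server] = hmc

    def fail_over_hmc(self, failed_hmc, spare_hmc):
        """Point every server controlled by failed_hmc at the spare HMC."""
        for server, hmc in self.assignments.items():
            if hmc == failed_hmc:
                self.assignments[server] = spare_hmc

reg = CRHSRegistry()
reg.assign("p5-570-1", "hmc1")
reg.assign("p5-590-1", "hmc1")
reg.assign("p5-595-1", "hmc2")
reg.fail_over_hmc("hmc1", "hmc-spare")
print(reg.assignments)
```

Because any spare HMC can absorb the servers of a failed one, the cluster needs fewer standby HMCs overall.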
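The install-server arrangement described above — installs initiated centrally, with per-node network services handled by dedicated servers — can be sketched as a simple routing table. The table, host names, and function below are hypothetical illustrations, not CSM commands or attributes:

```python
# Minimal sketch of the install-server idea: the management server routes
# each node's network-install services (DHCP, TFTP, NFS, ...) to the
# install server configured for that node's OS distribution. Nodes with
# no dedicated install server fall back to the management server itself.

INSTALL_SERVERS = {
    "SLES9": "installsrv-a.example.com",
    "RHEL4": "installsrv-b.example.com",
}

def route_installs(nodes, default_server="mgmtsrv.example.com"):
    """Group nodes by the server that will provide their install services."""
    plan = {}
    for name, distro in nodes:
        server = INSTALL_SERVERS.get(distro, default_server)
        plan.setdefault(server, []).append(name)
    return plan

plan = route_installs([("node1", "SLES9"), ("node2", "RHEL4"),
                       ("node3", "SLES9"), ("node4", "AIX53")])
for server, members in sorted(plan.items()):
    print(server, "->", members)
```

In a homogeneous cluster this spreads the install load across servers; in a heterogeneous one it lets each distribution be served by a host prepared for it.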
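The compatibility rule above — a node may run a newer operating system than the management server, but never a newer CSM level — reduces to a simple version comparison. The helper below is a sketch, not a CSM utility:

```python
# Illustrative check of the CSM level rule: the CSM version on the
# management server must be equal to or higher than the CSM version on
# every managed node (the node's *OS* version is allowed to be higher).

def parse(version):
    """Turn a dotted version string such as '1.4.1' into an int tuple."""
    return tuple(int(part) for part in version.split("."))

def csm_levels_ok(ms_csm, node_csm_levels):
    """True if the management server's CSM is >= every node's CSM."""
    ms = parse(ms_csm)
    return all(ms >= parse(node) for node in node_csm_levels)

print(csm_levels_ok("1.4.1", ["1.4.1", "1.4.0"]))  # True
print(csm_levels_ok("1.4.0", ["1.4.1"]))           # False - node is newer
```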
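The mixed-cluster scenarios listed above form a support matrix, which can be expressed directly as data. The following Python sketch validates a proposed node mix against those scenarios; the structures and names are illustrative, not a CSM API:

```python
# Encoding of the CSM 1.4.1 mixed-cluster matrix: each management-server
# (platform, OS) pair maps to the (node OS, node platform) combinations
# it may manage. Without the CRHS APARs (IY69115/IY69116/IY69117), an AIX
# management server may add only ONE Linux combination to its AIX nodes.

SUPPORTED = {
    ("xSeries", "SLES"):   {("RedHat", "xSeries"), ("SLES", "xSeries")},
    ("xSeries", "RedHat"): {("RedHat", "xSeries"), ("SLES", "xSeries")},
    ("pSeries", "SLES"):   {("RedHat", "pSeries"), ("SLES", "pSeries")},
    ("pSeries", "RedHat"): {("RedHat", "pSeries"), ("SLES", "pSeries")},
    ("pSeries", "AIX"):    {("AIX", "pSeries"), ("RedHat", "pSeries"),
                            ("RedHat", "xSeries"), ("SLES", "pSeries"),
                            ("SLES", "xSeries")},
}

def mix_supported(ms, nodes, crhs_apars=False):
    """Return True if every managed-node combination is allowed for ms."""
    allowed = SUPPORTED.get(ms, set())
    if any(n not in allowed for n in nodes):
        return False
    if ms == ("pSeries", "AIX") and not crhs_apars:
        linux = {n for n in nodes if n[0] != "AIX"}
        return len(linux) <= 1  # at most one Linux combination
    return True

print(mix_supported(("pSeries", "AIX"),
                    [("AIX", "pSeries"), ("SLES", "xSeries")]))    # True
print(mix_supported(("pSeries", "AIX"),
                    [("RedHat", "pSeries"), ("SLES", "xSeries")])) # False
```

With `crhs_apars=True`, the second mix becomes valid, matching the "any combination" wording in the announcement.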
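The HA MS design above — a primary and a backup management server sharing one disk that holds the CSM data, with automated failover — can be sketched as a tiny state machine. The class and its method are illustrative only, not part of the product:

```python
# Hedged sketch of the HA MS concept: because the CSM data lives on a
# shared disk, failover only swaps which server is active; no data is
# copied between the primary and the backup management server.

class HAManagementServer:
    def __init__(self, primary, backup, shared_disk):
        self.active, self.standby = primary, backup
        self.shared_disk = shared_disk  # single copy of the CSM data

    def failover(self):
        """Automated failover: the standby becomes the active server."""
        self.active, self.standby = self.standby, self.active
        return self.active

ha = HAManagementServer("ms-primary", "ms-backup", "/dev/shared0")
print(ha.active)      # ms-primary
print(ha.failover())  # ms-backup now serves the cluster
```

This is why the backup must have a configuration similar to the primary's: after failover it must run the same management workload against the same shared data.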