Front cover IBM Platform Computing Solutions
Describes the offering portfolio
Shows implementation scenarios
Delivers add-on value to current implementations
Dino Quintero Scott Denham Rodrigo Garcia da Silva Alberto Ortiz Aline Guedes Pinto Atsumori Sasaki Roger Tucker Joanna Wong Elsie Ramos
ibm.com/redbooks
International Technical Support Organization
IBM Platform Computing Solutions
December 2012
SG24-8073-00 Note: Before using this information and the product it supports, read the information in “Notices” on page vii.
First Edition (December 2012)
This edition applies to IBM Platform High Performance Computing (HPC) v3.2, IBM Platform Cluster Manager Advanced Edition v3.2, IBM Platform Symphony v5.2, IBM Platform Load Sharing Facility (LSF) v8.3, RedHat Enterprise Linux v6.2/v6.3, Hadoop v1.0.1, Sun Java Development Kit (JDK) v1.6.0_25, and Oracle Database Express Edition v11.2.0.
© Copyright International Business Machines Corporation 2012. All rights reserved. Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Contents
Notices ...... vii Trademarks ...... viii
Preface ...... ix The team who wrote this book ...... ix Now you can become a published author, too! ...... xi Comments welcome...... xi Stay connected to IBM Redbooks ...... xii
Chapter 1. Introduction to IBM Platform Computing ...... 1 1.1 IBM Platform Computing solutions value proposition ...... 2 1.1.1 Benefits ...... 3 1.1.2 Cluster, grids, and clouds ...... 4 1.2 History ...... 5 1.2.1 IBM acquisition ...... 6
Chapter 2. Technical computing software portfolio...... 7 2.1 IBM Platform Computing product portfolio ...... 8 2.1.1 Workload management...... 8 2.1.2 Cluster management...... 8 2.2 IBM Platform Computing products overview ...... 10 2.2.1 IBM Platform Load Sharing Facility family ...... 10 2.2.2 IBM Platform Message Passing Interface...... 12 2.2.3 IBM Platform Symphony family...... 12 2.2.4 IBM Platform HPC...... 13 2.2.5 IBM Platform Cluster Manager ...... 13 2.3 Current user roadmap ...... 14
Chapter 3. Planning ...... 15 3.1 Hardware setup for this residency...... 16 3.1.1 Server nodes...... 16 3.1.2 Shared storage ...... 18 3.1.3 Infrastructure planning ...... 20 3.2 Software packages ...... 23
Chapter 4. IBM Platform Load Sharing Facility (LSF) product family ...... 27 4.1 IBM Platform LSF overview...... 28 4.2 IBM Platform LSF add-on products...... 36 4.2.1 IBM Platform Application Center...... 36 4.2.2 IBM Platform RTM...... 40 4.2.3 IBM Platform Process Manager ...... 41 4.2.4 IBM Platform License Scheduler...... 44 4.3 Implementation ...... 49 4.3.1 IBM Platform LSF implementation ...... 50 4.3.2 IBM Platform Application Center implementation ...... 54 4.3.3 IBM Platform RTM implementation ...... 65 4.3.4 IBM Platform Process Manager implementation...... 98 4.4 References ...... 109
© Copyright IBM Corp. 2012. All rights reserved. iii Chapter 5. IBM Platform Symphony ...... 111 5.1 Overview ...... 112 5.1.1 Architecture...... 117 5.1.2 Target audience ...... 118 5.1.3 Product versions ...... 118 5.2 Compute-intensive and data-intensive workloads...... 119 5.2.1 Basic concepts ...... 120 5.2.2 Core components ...... 121 5.2.3 Application implementation ...... 124 5.2.4 Application Deployment ...... 125 5.2.5 Symexec ...... 126 5.3 Data-intensive workloads ...... 126 5.3.1 Data affinity ...... 127 5.3.2 MapReduce...... 128 5.4 Reporting...... 134 5.5 Getting started...... 135 5.5.1 Planning for Symphony...... 136 5.5.2 Installation preferred practices ...... 139 5.6 Sample workload scenarios ...... 156 5.6.1 Hadoop ...... 156 5.7 Symphony and IBM Platform LSF multihead environment ...... 164 5.7.1 Planning a multihead installation ...... 167 5.7.2 Installation...... 167 5.7.3 Configuration...... 171 5.8 References ...... 179
Chapter 6. IBM Platform High Performance Computing ...... 181 6.1 Overview ...... 182 6.1.1 Unified web portal ...... 182 6.1.2 Cluster provisioning ...... 184 6.1.3 Workload scheduling...... 185 6.1.4 Workload and system monitoring and reporting ...... 186 6.1.5 MPI libraries ...... 186 6.1.6 GPU scheduling ...... 187 6.2 Implementation ...... 187 6.2.1 Installation on the residency cluster ...... 188 6.2.2 Modifying the cluster ...... 203 6.2.3 Submitting jobs ...... 205 6.2.4 Operation and monitoring ...... 205 6.3 References ...... 209
Chapter 7. IBM Platform Cluster Manager Advanced Edition ...... 211 7.1 Overview ...... 212 7.1.1 Unified web portal ...... 212 7.1.2 Physical and virtual resource provisioning ...... 213 7.1.3 Cluster management...... 214 7.1.4 HPC cluster self-service ...... 214 7.1.5 Cluster usage reporting...... 216 7.1.6 Deployment topology ...... 216 7.2 Implementation ...... 218 7.2.1 Preparing for installation ...... 218 7.2.2 Installing the software ...... 219 7.2.3 Deploying LSF clusters ...... 235 7.2.4 Deploying IBM Platform Symphony clusters...... 242 iv IBM Platform Computing Solutions 7.2.5 Baremetal provisioning ...... 273 7.2.6 KVM provisioning ...... 276 7.3 References ...... 292
Appendix A. IBM Platform Computing Message Passing Interface ...... 293 IBM Platform MPI ...... 294 IBM Platform MPI implementation ...... 294
Appendix B. Troubleshooting examples...... 297 Installation output for troubleshooting ...... 298
Appendix C. IBM Platform Load Sharing Facility add-ons and examples ...... 321 Submitting jobs with bsub ...... 322 Adding and removing nodes from an LSF cluster ...... 334 Creating a threshold on IBM Platform RTM ...... 337
Appendix D. Getting started with KVM provisioning ...... 341 KVM provisioning ...... 342
Related publications ...... 351 IBM Redbooks ...... 351 Other publications ...... 351 Online resources ...... 352 Help from IBM ...... 352
Contents v vi IBM Platform Computing Solutions Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.
Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
© Copyright IBM Corp. 2012. All rights reserved. vii Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml
The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:
AIX® InfoSphere® Redbooks (logo) ® BigInsights™ Intelligent Cluster™ Symphony® GPFS™ LSF® System x® IBM SmartCloud™ Power Systems™ System z® IBM® POWER® Tivoli® iDataPlex® Redbooks®
The following terms are trademarks of other companies:
Adobe, the Adobe logo, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.
Intel Xeon, Intel, Intel logo, Intel Inside logo, and Intel Centrino logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other company, product, or service names may be trademarks or service marks of others.
viii IBM Platform Computing Solutions Preface
This IBM® Platform Computing Solutions Redbooks® publication is the first book to describe each of the available offerings that are part of the IBM portfolio of Cloud, analytics, and High Performance Computing (HPC) solutions for our clients. This IBM Redbooks publication delivers descriptions of the available offerings from IBM Platform Computing that address challenges for our clients in each industry. We include a few implementation and testing scenarios with selected solutions.
This publication helps strengthen the position of IBM Platform Computing solutions with a well-defined and documented deployment model within an IBM System x® environment. This deployment model offers clients a planned foundation for dynamic cloud infrastructure, provisioning, large-scale parallel HPC application development, cluster management, and grid applications.
This IBM publication is targeted to IT specialists, IT architects, support personnel, and clients. This book is intended for anyone who wants information about how IBM Platform Computing solutions use IBM to provide a wide array of client solutions.
The team who wrote this book
This book was produced by a team of specialists from around the world that worked at the International Technical Support Organization, Poughkeepsie Center.
Dino Quintero is a complex solutions project leader and IBM Senior Certified IT Specialist with the ITSO in Poughkeepsie, NY. His areas of knowledge include enterprise continuous availability, enterprise systems management, system virtualization, technical computing, and clustering solutions. He is currently an Open Group Distinguished IT Specialist. Dino holds a Master of Computing Information Systems degree and a Bachelor of Science degree in Computer Science from Marist College.
Scott Denham is a Senior Architect with the IBM Technical Computing team, focused on HPC and data management in the upstream energy sector. After serving in the US Air Force Reserve as an Radar and Inertial Navigation specialist, he studied Electrical Engineering at the University of Houston while embarking on a 28-year career with the leading independent geophysical contractor. He held roles in software development and data center operations and support, and specialized in data acquisition systems that are used by the seismic industry and numerical acceleration through attached array processors and native vector computers. Scott held several leadership roles that involved HPC with the IBM user group SHARE. Scott joined IBM in 2000 as a member of the IBM High Performance Computing Center of Competency and later Deep Computing Technical Team. Scott has worldwide responsibility for pre-sales support of clients in the Petroleum, Automotive, Aerospace, Life Sciences, and Higher Education sectors, and has worked with clients worldwide on implementing HPC solutions. Scott holds the IBM Consulting IT Specialist Certification as a cross-brand specialist and the Open Group IT Specialist Certification. Scott participated in the evolution and commercialization of the IBM remote visualization solution Desktop Cloud Visualization (DCV). He has designed and implemented Linux HPC clusters by using x86, Power, and CellBE technologies for numerous clients in the Oil and Gas sector, using xCAT, IBM General Parallel File System (GPFS™), and other IBM technologies. Scott was a previous Redbooks author of Building a Linux HPC Cluster with xCAT, SG24-6623.
© Copyright IBM Corp. 2012. All rights reserved. ix Rodrigo Garcia da Silva is an Accredited IT Architect and a Technical Computing Client Technical Architect with the IBM System and Technology Group in Brazil. He joined IBM in 2007 and has 10 years experience in the IT industry. He holds a B.S. in Electrical Engineering from Universidade Estadual de Campinas, and his areas of expertise include HPC, systems architecture, OS provisioning, Linux, systems management, and open source software development. He also has a strong background in intellectual capital protection, including publications and patents. Rodrigo is responsible for Technical Computing pre-sales support of clients in the Oil and Gas, Automotive, Aerospace, Life Sciences, and Higher Education industries in Brazil. Rodrigo is a previous IBM Redbooks author with IBM POWER® 775 HPC Solutions, SG24-8003.
Alberto Ortiz is a Software IT Architect for IBM Mexico where he designs and deploys cross-brand software solutions. Since joining IBM in 2009, he participated in project implementations of varying sizes and complexities, from tuning an extract, transform, and load (ETL) platform for a data warehouse for a bank, to an IBM Identity Insight deployment with 400+ million records for a federal government police intelligence system. Before joining IBM, Alberto worked in several technical fields for projects in industries, such as telecommunications, manufacturing, finance, and government, for local and international clients. Throughout his 16 years of professional experience, he has acquired deep skills in UNIX, Linux, networking, relational databases and data migration, and most recently big data. Alberto holds a B.Sc. in Computer Systems Engineering from Universidad de las Americas Puebla in Mexico and currently studies for an MBA at Instituto Tecnologico y de Estudios Superiores de Monterrey (ITESM).
Aline Guedes Pinto is a Staff Software Engineer at the Linux Technology Center (LTC) in Brazil. She has five years of experience in Linux server management and web application development and currently works as the team leader of the LTC Infrastructure team. She holds a degree in Computer Science from the Universidade Estadual de Campinas (UNICAMP) with an MBA in Business Management and is Linux Professional Institute Certified (LPIC-1).
Atsumori Sasaki is a an Advisory IT Specialist with STG Technical Sales in Japan. His areas of expertise include cloud, system automation, and technical computing. He is currently a Service Delivery Manager at IBM Computing on Demand Japan. He has six years of experience in IBM. He holds a Masters degree in Information Technology from Kyushu Institute of Technology.
Roger Tucker is a Senior IT Specialist working in the System x and IBM Platform pre-sales Techline team for the United States. He has over 25 years of experience in systems management and HPC. His areas of expertise include large-scale systems architecture, data center design, Linux deployment and administration, and remote visualization. He started with IBM in 1999 as an Instructor Mentor at Tivoli®.
Joanna Wong is an Executive IT Specialist for the IBM STG Worldwide Client Centers focusing on IBM Platform Computing solutions. She has extensive experience in HPC application optimization and large systems scalability performance on both x86-64 and IBM Power architecture. Additional industry experience included engagements with Oracle database server and in Enterprise Application Integration. She holds an A.B. in Physics from Princeton University, M.S. and Ph.D. in Theoretical Physics from Cornell University, and an M.B.A from the Walter Haas School of Business with the University of California at Berkeley.
Elsie Ramos is a a Project Leader at the International Technical Support Organization, Poughkeepsie Center. She has over 28 years of experience in IT, supporting various platforms including IBM System z® servers.
x IBM Platform Computing Solutions Thanks to the following people for their contributions to this project:
David Bennin, Richard Conway, Ella Buslovich International Technical Support Organization, Poughkeepsie Center
Greg Geiselhart, Magnus Larsson, Kailash Marthi IBM Poughkeepsie
Rene-Paul G. Lafarie, Charlie Gonzales, J.D. Zeeman, Louise Westoby, Scott Campbell IBM US
Sum Huynh, Robert Hartwig, Qingda Wang, Zane Hu, Nick Werstiuk, Mehdi Bozzo-Rey, Renita Leung, Jeff Karmiol, William Lu, Gord Sissons, Yonggang Hu IBM Canada
Bill McMillan, Simon Waterer IBM UK
Thank you to QLogic Corporation for the InfiniBand fabric equipment loaner that we used during the residency.
Now you can become a published author, too!
Here’s an opportunity to spotlight your skills, grow your career, and become a published author—all at the same time! Join an ITSO residency project and help write a book in your area of expertise, while honing your experience using leading-edge technologies. Your efforts will help to increase product acceptance and customer satisfaction, as you expand your network of technical contacts and relationships. Residencies run from two to six weeks in length, and you can participate either in person or as a remote resident working from your home base.
Find out more about the residency program, browse the residency index, and apply online at: ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about this book or other IBM Redbooks publications in one of the following ways: Use the online Contact us review Redbooks form found at: ibm.com/redbooks Send your comments in an email to: [email protected] Mail your comments to: IBM Corporation, International Technical Support Organization Dept. HYTD Mail Station P099 2455 South Road Poughkeepsie, NY 12601-5400
Preface xi Stay connected to IBM Redbooks