GRID COMPUTING FOR DEVELOPERS LIMITED WARRANTY AND DISCLAIMER OF LIABILITY

THE CD-ROM THAT ACCOMPANIES THE BOOK MAY BE USED ON A SINGLE PC ONLY. THE LICENSE DOES NOT PERMIT THE USE ON A NETWORK (OF ANY KIND). YOU FURTHER AGREE THAT THIS LICENSE GRANTS PERMISSION TO USE THE PRODUCTS CONTAINED HEREIN, BUT DOES NOT GIVE YOU RIGHT OF OWNERSHIP TO ANY OF THE CONTENT OR PRODUCT CONTAINED ON THIS CD-ROM. USE OF THIRD-PARTY SOFTWARE CONTAINED ON THIS CD-ROM IS LIMITED TO AND SUBJECT TO LICENSING TERMS FOR THE RESPECTIVE PRODUCTS.

CHARLES RIVER MEDIA, INC. (“CRM”) AND/OR ANYONE WHO HAS BEEN INVOLVED IN THE WRITING, CREATION, OR PRODUCTION OF THE ACCOMPANYING CODE (“THE SOFTWARE”) OR THE THIRD-PARTY PRODUCTS CONTAINED ON THE CD-ROM OR TEXTUAL MATERIAL IN THE BOOK, CANNOT AND DO NOT WARRANT THE PERFORMANCE OR RESULTS THAT MAY BE OBTAINED BY USING THE SOFTWARE OR CONTENTS OF THE BOOK. THE AUTHOR AND PUBLISHER HAVE USED THEIR BEST EFFORTS TO ENSURE THE ACCURACY AND FUNCTIONALITY OF THE TEXTUAL MATERIAL AND PROGRAMS CONTAINED HEREIN. WE, HOWEVER, MAKE NO WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, REGARDING THE PERFORMANCE OF THESE PROGRAMS OR CONTENTS. THE SOFTWARE IS SOLD “AS IS” WITHOUT WARRANTY (EXCEPT FOR DEFECTIVE MATERIALS USED IN MANUFACTURING THE DISK OR DUE TO FAULTY WORKMANSHIP).

THE AUTHOR, THE PUBLISHER, DEVELOPERS OF THIRD-PARTY SOFTWARE, AND ANYONE INVOLVED IN THE PRODUCTION AND MANUFACTURING OF THIS WORK SHALL NOT BE LIABLE FOR DAMAGES OF ANY KIND ARISING OUT OF THE USE OF (OR THE INABILITY TO USE) THE PROGRAMS, SOURCE CODE, OR TEXTUAL MATERIAL CONTAINED IN THIS PUBLICATION. THIS INCLUDES, BUT IS NOT LIMITED TO, LOSS OF REVENUE OR PROFIT, OR OTHER INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THE PRODUCT.

THE SOLE REMEDY IN THE EVENT OF A CLAIM OF ANY KIND IS EXPRESSLY LIMITED TO REPLACEMENT OF THE BOOK AND/OR CD-ROM, AND ONLY AT THE DISCRETION OF CRM.

THE USE OF “IMPLIED WARRANTY” AND CERTAIN “EXCLUSIONS” VARIES FROM STATE TO STATE, AND MAY NOT APPLY TO THE PURCHASER OF THIS PRODUCT.

GRID COMPUTING FOR DEVELOPERS

VLADIMIR SILVA

CHARLES RIVER MEDIA, INC.
Hingham, Massachusetts

Copyright 2006 by THOMSON DELMAR LEARNING. Published by CHARLES RIVER MEDIA, INC. All rights reserved. No part of this publication may be reproduced in any way, stored in a retrieval system of any type, or transmitted by any means or media, electronic or mechanical, including, but not limited to, photocopy, recording, or scanning, without prior permission in writing from the publisher.

Cover Design: Tyler Creative

CHARLES RIVER MEDIA, INC.
10 Downer Avenue
Hingham, Massachusetts 02043
781-740-0400
781-740-8816 (FAX)
[email protected]
www.charlesriver.com

This book is printed on acid-free paper.

Vladimir Silva. Grid Computing for Developers.
ISBN: 1-58450-424-2
eISBN: 1-58450-651-2

All brand names and product names mentioned in this book are trademarks or service marks of their respective companies. Any omission or misuse (of any kind) of service marks or trademarks should not be regarded as intent to infringe on the property of others. The publisher recognizes and respects all marks used by companies, manufacturers, and developers as a means to distinguish their products.

Library of Congress Cataloging-in-Publication Data
Silva, Vladimir, 1969-
Grid computing for developers / Vladimir Silva.
p. cm.
Includes index.
ISBN 1-58450-424-2 (alk. paper)
1. Computational grids (Computer systems) 2. Computer software--Development. I. Title.
QA76.9.C58S56 2005
004'.36--dc22
2005026627

Printed in the United States of America 05 7 6 5 4 3 2 First Edition

CHARLES RIVER MEDIA titles are available for site license or bulk purchase by institutions, user groups, corporations, etc. For additional information, please contact the Special Sales Department at 781-740-0400.

Requests for replacement of a defective CD-ROM must be accompanied by the original disc, your mailing address, telephone number, date of purchase, and purchase price. Please state the nature of the problem, and send the information to CHARLES RIVER MEDIA, INC., 10 Downer Avenue, Hingham, Massachusetts 02043. CRM’s sole obligation to the purchaser is to replace the disc, based on defective materials or faulty workmanship, but not on the operation or functionality of the product.

To my dear parents Manuel and Annissia, who raised me with love and values; and to my brothers Natasha, Alfredo, Ivan, and Sonia, from whom I drew inspiration for this project.

Contents

Dedication v

Preface xxi

Part I Theory and Foundations 1

1 The Roadmap to High-Performance Computing 3
    1.1 Evolution of Grid Technologies 3
    1.2 The Grid in a Nutshell 5
        1.2.1 What Can Be Called a Grid Application 6
    1.3 Models 6
        1.3.1 Internet Computing 6
        1.3.2 Peer-to-Peer (P2P) 7
        1.3.3 Grid Architectures 8
    1.4 Computing on Demand (COD) 11
        1.4.1 How It Works 11
        1.4.2 User Authorization 12
        1.4.3 Limitation 12
    1.5 Grids in Science and Technology 12
        1.5.1 Grid Consortiums and Open Forums 12
        1.5.2 Grids in Science and Engineering 13
    1.6 Summary 15

2 Enterprise Computing 17
    2.1 Existing Enterprise Infrastructures 18


    2.2 Integration with Grid Architectures 19
        2.2.1 Workload and Policy Management 20
        2.2.2 Shared Services 21
        2.2.3 Stateless Business Components 21
        2.2.4 Client State Management 21
        2.2.5 Pooled Resources and Lightweight Server Processes 22
        2.2.6 Database Concurrency Maximization 22
    2.3 Grid-Enabled Resource Management Models 23
        2.3.1 Hierarchical 24
        2.3.2 Abstract Owner Model (AO) 24
        2.3.3 Economy/Market Model 25
    2.4 Enter Open Grid Services 25
        2.4.1 Grids on the Enterprise 26
        2.4.2 Challenges Facing Grid Computing on the Enterprise 28
        2.4.3 The Jump to Web Services Resource Framework (WSRF) 29
    2.5 The Competitive IT Organization 29
    2.6 Computational Economy 30
        2.6.1 Why Computational Economy? 30
        2.6.2 Economy-Driven Grid Resource Management 32
        2.6.3 Resource Management Models 33
        2.6.4 Grid Architecture for Computational Economy (GRACE) 34
    2.7 Summary 35

3 Core Grid Middleware 39
    3.1 P2P 41
        3.1.1 Definition 41
        3.1.2 Advantages and Disadvantages 41
        3.1.3 P2P vs. the Client-Server Model 42
        3.1.4 P2P Architectures 43
    3.2 Globus 45
    3.3 Grid Computing and Business Technologies (GRIDBUS) 46

    3.4 Summary 46

Part II Grid Middleware 49

4 Grid Portal Development 51
    4.1 Resource Management 52
        4.1.1 Apache Jetspeed 53
        4.1.2 A Resource Management JSP Portlet for Jetspeed 53
        4.1.3 WebSphere® Portal Server v5.0 (WPS) 89
        4.1.4 IBM Portlet API 89
        4.1.5 Job Submission Portlet for WPS 91
        4.1.6 Portlet Program 91
        4.1.7 Portlet View JSP 95
        4.1.8 Testing 98
    4.2 Data Management Portlets 99
        4.2.1 Global Access to Secondary Storage (GASS) 99
        4.2.2 GridFTP 100
        4.2.3 Remote I/O 100
        4.2.4 Globus Executable Management (GEM) 100
        4.2.5 Remote Data Access Portlet for Jetspeed 100
        4.2.6 Remote Data Access Portlet for IBM WPS 107
    4.3 Troubleshooting 115
        4.3.1 Certificate Not Yet Valid Errors 115
        4.3.2 Defective Credential Errors 115
    4.4 Summary 116

5 Schedulers 119
    5.1 An Overview of Popular Schedulers and Grid Middleware Integration 120
        5.1.1 Job Schedulers versus Job Managers 121
    5.2 Portable Batch System—OpenPBS 121
        5.2.1 Overview 121
        5.2.2 OpenPBS in Red Hat (RHL) 9 122

        5.2.3 Configuring PBS on a Single Machine Cluster 123
        5.2.4 Tweaking PBS GUIs for RHL 9 124
        5.2.5 Configuring the 3.2 PBS 125
        5.2.6 Customizing the PBS Job Manager 126
        5.2.7 Troubleshooting PBS MMJFS Runtime Errors 127
        5.2.8 Customizing PBS Scheduler Execution Parameters with RSL 131
    5.3 Silver/Maui Metascheduler 134
        5.3.1 Reservations 135
        5.3.2 Fairshare 135
        5.3.3 Backfill 135
        5.3.4 Job Prioritization 136
        5.3.5 Throttling Policies 136
        5.3.6 Quality of Service (QoS) 136
        5.3.7 Node Allocation 136
    5.4 Sun Grid Engine (SGE)—N1 Grid Engine 6 137
        5.4.1 Workload Management 137
        5.4.2 Architecture 137
        5.4.3 Step-by-Step SGE Installation Transcript 138
        5.4.4 Installation Troubleshooting Tips 140
        5.4.5 Installation Verification 142
    5.5 Condor-G 143
        5.5.1 Condor Features 143
        5.5.2 Condor-MMJFS Installation Transcript 144
    5.6 MMJFS Integration with Other Job Schedulers 147
        5.6.1 Step 1: Create a Job Manager Script 148
        5.6.2 Step 2: Update the GT 3.2 Deployment Descriptors 161
        5.6.3 Step 3: Verification 165
        5.6.4 Integration Troubleshooting 170
        5.6.5 Packaging Advice 176
    5.7 Factors for Choosing the Right Scheduler 178
        5.7.1 Features 178

        5.7.2 Installation, Configuration, and Usability 178
        5.7.3 User Interfaces 178
        5.7.4 Support for Open Standards 178
        5.7.5 Interoperability 179
        5.7.6 Organization’s Requirements 179
        5.7.7 Support for Grid Middleware 179
    5.8 Summary 180

6 Open Grid Services Architecture (OGSA) 181
    6.1 Web Services versus Grid Services 182
    6.2 OGSA Service Model 184
    6.3 Standard Interfaces 184
    6.4 Other Service Models 185
    6.5 Service Factories 186
    6.6 Lifetime Management 187
    6.7 Handles and References 187
    6.8 Service Information and Discovery 187
    6.9 Notifications 188
    6.10 Network Protocol Bindings 188
    6.11 Higher-Level Services 189
    6.12 Other Interfaces 189
    6.13 Hosting Environments 189
    6.14 Virtual Organizations (VOs) 190
    6.15 The Globus Toolkit 191
        6.15.1 Core Services 192
        6.15.2 Security 192
        6.15.3 Data Management 192
    6.16 Sample Grid Services: Large Integer Factorization 193
        6.16.1 Overview: Large Integer Factorization and Public Key Cryptosystems 193
        6.16.2 Service Implementation 196
        6.16.3 Factorization with the Quadratic Sieve 196

        6.16.4 Service Architecture 197
        6.16.5 Step 1: Obtaining the Factorization Code 197
        6.16.6 Step 2: Building a Factorization Service Interface 198
        6.16.7 Step 3: Creating the Service Implementation 198
        6.16.8 Step 4: Creating a Service Deployment Descriptor 200
        6.16.9 Step 5: Creating the Schema Files and Grid Archives (GARs) 201
        6.16.10 Step 6: Deploying the Service to the GT3 Container 210
        6.16.11 Step 7: Writing a Service Client 211
        6.16.12 Step 8: Running the Factorization Client 214
    6.17 Summary 214

Part III The Globus Toolkit 217

7 Core 219
    7.1 Concepts 219
    7.2 Architecture 220
    7.3 Hosting Environments 221
        7.3.1 Apache Tomcat 222
        7.3.2 WebSphere™ Application Server v5.1 222
        7.3.3 Troubleshooting WebSphere 227
        7.3.4 Other Hosting Environments 229
    7.4 Summary 229

8 Security 231
    8.1 Grid Security Infrastructure (GSI) Overview 232
    8.2 Certificate Basics 232
        8.2.1 Understanding X.509 Certificates 232
        8.2.2 Typical Public Key Infrastructure (PKI) Model 233
        8.2.3 Certificate Fields 234
        8.2.4 GSI Proxies 235
        8.2.5 Certificate Generation 236
        8.2.6 Creating a Self-Signed User Certificate and Private Key 237

        8.2.7 Creating Credentials from User Certificates 238
        8.2.8 Creating a Certificate Request 241
    8.3 Java Certificate Services—Web Tools for X509 Certificates 247
        8.3.1 Overview 247
        8.3.2 Installing JCS on a Windows or Linux Host 248
        8.3.3 Deploying the Web-Based Tool on Tomcat 248
        8.3.4 Accessing the Web-Based Tool 248
        8.3.5 Creating a Certificate Request 248
        8.3.6 Signing a Certificate Request 250
        8.3.7 Creating a Self-Signed (CA) Certificate and Private Key 250
        8.3.8 Using the Command-Line–Based Tools 250
        8.3.9 Creating a Certificate Request from the Command Line 251
        8.3.10 Signing a Certificate Request from the Command Line 251
        8.3.11 Getting Information on a X509 Certificate 251
        8.3.12 Creating a Binary Distribution from Source 251
    8.4 Globus Toolkit 3 Security Libraries 252
        8.4.1 Creating Proxies from Default Certificates Within a Web Application 252
        8.4.2 Configuring Local Certificates for Testing 259
        8.4.3 Creating Certificates and Private Keys 260
        8.4.4 Troubleshooting: Security Provider Problems 267
    8.5 Summary 268

9 Resource Management 269
    9.1 Web Services GRAM Architecture (GT3) 270
        9.1.1 Fault Tolerance 271
        9.1.2 Resource Specification Language (RSL) 272
        9.1.3 MMJFS Configuration 274
        9.1.4 Troubleshooting 275
        9.1.5 A Custom GRAM Client for MMJFS 278
        9.1.6 Listening for Job Status Updates and Job Output 280

        9.1.7 The GASS Server 282
        9.1.8 RSL Submission 282
        9.1.9 Security 287
        9.1.10 MMJFS Performance 290
    9.2 Pre-Web Services GRAM Architecture (GT2) 291
        9.2.1 Jobs in GT2 291
        9.2.2 Job Submission Modes 291
        9.2.3 Resource Specification Language (RSL) 292
        9.2.4 Security Infrastructure 292
        9.2.5 A Java Program for Job Submission on GT2 292
        9.2.6 Listening for Job Output 294
        9.2.7 Handling Received Output 295
        9.2.8 Sending the GRAM Job Request 297
        9.2.9 Job Submission Test 299
    9.3 Summary 301

10 Data Management 303
    10.1 GridFTP 304
        10.1.1 Overview 304
        10.1.2 Connecting to a GridFTP Server 305
        10.1.3 Converting Legacy Proxies to GSS Credentials 307
        10.1.4 Transferring Data 307
        10.1.5 Transferring Multiple Files 309
        10.1.6 Parallel Transfers 309
        10.1.7 Testing the Transfer 311
        10.1.8 Troubleshooting Tips 315
    10.2 Transferring Files Using Multiple Protocols 316
    10.3 Summary 319

11 Information Services 321
    11.1 WS Information Services (MDS3) 322
        11.1.1 Information Model 322

        11.1.2 Data Collection 322
        11.1.3 Aggregation 322
        11.1.4 Queries 323
        11.1.5 User Interfaces 323
        11.1.6 Security 324
    11.2 Querying Default Data Providers 324
        11.2.1 Enable the ServiceDataProvider and DataAggregation in the Service Browser GUI 324
        11.2.2 Test the Default SystemInformation Service Data Provider via the OGSA Service Browser GUI 325
    11.3 Custom MDS3 Data Providers 326
        11.3.1 GT 3.2 Core Data Providers 326
        11.3.2 Provider Interfaces 327
        11.3.3 The Simplest Case 328
        11.3.4 The Real World: An MDS3 Information Provider for Remote Schedulers Using SSH 330
    11.4 Pre-WS Information Services (MDS2) 362
        11.4.1 Architecture: GRIS and GIIS 362
        11.4.2 Implementing a Grid Information Provider for MDS2 363
        11.4.3 Custom MDS2 Clients 373
    11.5 Summary 379

12 Commodity Grid Kits (CoGS) 381
    12.1 Overview 382
    12.2 Language Bindings 382
    12.3 Java 383
        12.3.1 Installation Requirements and Configuration 383
        12.3.2 Basic Java CoG Services 383
        12.3.3 Security 384
        12.3.4 Resource Management 393
        12.3.5 Data Management 393

        12.3.6 Data Transfer Troubleshooting 396
    12.4 Other Language Bindings 397
        12.4.1 Python 397
        12.4.2 PERL 398
    12.5 Summary 399

13 Web Services Resource Framework (WSRF) 401
    13.1 Understanding WSRF 402
        13.1.1 Stateless versus Stateful Services 402
        13.1.2 WS-Resource 402
    13.2 WSRF and OGSI 403
        13.2.1 OGSI Evolution 403
        13.2.2 From OGSI to WSRF 404
    13.3 WSRF Normative Specifications 405
        13.3.1 WS-ResourceProperties 406
        13.3.2 WS-Addressing 408
        13.3.3 WS-Resource Lifecycle 409
        13.3.4 WS-ServiceGroup 410
        13.3.5 WS-BaseFaults 410
        13.3.6 WS-Notification 410
        13.3.7 WS-Resource Security 411
    13.4 GT4 and WSRF: Stateful Services for Grid Environments 411
        13.4.1 Overview 411
        13.4.2 Newcomers: WS-Components 411
        13.4.3 Old Protocols, New Faces 412
    13.5 Service Example: A WSRF Large Integer Factorization (LIF) Service 414
        13.5.1 Web Service Description (WSDL) File 414
        13.5.2 Resource Implementation 417
        13.5.3 Service Implementation 421
        13.5.4 Service Client 423
        13.5.5 Enabling Security 428
        13.5.6 Build and Deployment 428

        13.5.7 Runtime Test 428
    13.6 Summary 429

Part IV The Message Passing Interface (MPI) Standard 431

14 The Message Passing Interface (MPI) Standard 433
    14.1 Overview 434
    14.2 Procedures and Arguments 435
    14.3 Data Types 435
        14.3.1 Opaque Objects 435
        14.3.2 Arrays and Constants 436
        14.3.3 Other Data Types 436
    14.4 Processes 436
    14.5 Error Handling 437
    14.6 Platform Independence 437
    14.7 Point-to-Point Communication 437
        14.7.1 Message Datatype Conversions 439
        14.7.2 Message Envelopes 440
        14.7.3 Data Type Matching and Conversion 441
        14.7.4 Communication Modes 442
        14.7.5 Buffering 442
        14.7.6 Nonblocking Communication 443
        14.7.7 Probe and Cancel 444
        14.7.8 Derived Data Types 446
        14.7.9 Pack/Unpack 447
    14.8 Collective Communication 448
        14.8.1 Barrier Synchronization (MPI_BARRIER) 449
        14.8.2 Broadcast (MPI_BCAST) 449
        14.8.3 Gather (MPI_GATHER) 449
        14.8.4 Scatter (MPI_SCATTER) 450
        14.8.5 Gather-to-All (MPI_ALLGATHER) 451
        14.8.6 All to All (MPI_ALLTOALL) 451

        14.8.7 Global Reduction 451
        14.8.8 Reduce Scatter (MPI_REDUCE_SCATTER) 454
        14.8.9 Scan 454
        14.8.10 Avoiding Deadlocks 454
        14.8.11 Cyclic Dependencies 454
    14.9 Groups, Contexts, and Communicators 455
        14.9.1 Predefined Communicators 456
        14.9.2 Miscellaneous Group and Communicator Examples 456
        14.9.3 Intra-Communicators versus Inter-Communicators 458
    14.10 Process Topologies 458
        14.10.1 Overview 458
        14.10.2 Virtual Topologies 459
    14.11 Summary 459

15 Message Passing Interface (MPI) Standard 2.0 461
    15.1 Overview 462
    15.2 Changes Since Version 1 462
        15.2.1 Deprecated Names and Functions 462
        15.2.2 Processes 463
    15.3 Miscellaneous Enhancements 463
        15.3.1 The MPI-2 Process Model 463
        15.3.2 Process Management and Resource Control 464
        15.3.3 Memory Allocation 464
        15.3.4 New Predefined Data Types 464
        15.3.5 MPI-I/O 465
        15.3.6 File Views 466
    15.4 MPI Examples: Common Parallel Algorithms 467
        15.4.1 Prime Generation—The Sieve of Eratosthenes 467
        15.4.2 Sparse Matrix Multiplication 471
        15.4.3 MPI-IO: Parallel Writes and Reads 480
    15.5 Summary 486

16 MPICH2—A Portable Implementation of MPI 489
    16.1 Overview 490
        16.1.1 Language Bindings 490
    16.2 Quick Start Installation Transcript 490
        16.2.1 Software Prerequisites 490
        16.2.2 Unpack the Tar File into a Temporary Directory 491
        16.2.3 Create an Installation Directory (Default Is /usr/local/bin) 491
        16.2.4 Configure MPICH2, Specifying the Installation Directory 491
        16.2.5 Build and Install MPICH2 491
        16.2.6 Add the bin Subdirectory of the Installation Directory to Your Path 491
        16.2.7 Test the MPICH2 Daemon Ring 491
        16.2.8 Check the Sanity of Your MPD Daemon 492
        16.2.9 Create a MPD Ring and Test It 492
        16.2.10 Run Simple Commands 493
        16.2.11 Run the Sample MPI Programs 493
    16.3 Large Integer Factorization Revisited: The Number Field Sieve (NFS) 494
        16.3.1 GNFS Algorithm 494
        16.3.2 Public Domain GNFS—Sieving Over a Factor Base 496
        16.3.3 Sieve Algorithm 496
        16.3.4 LIF Execution Times 501
        16.3.5 Parallel Sieve Implementation in MPI 502
        16.3.6 Avoiding Interleaved Messages 515
        16.3.7 Compilation and Runtime Data 519
    16.4 MPICH2–WS-GRAM Integration 519
        16.4.1 Scheduler Interface for MPICH2 519
        16.4.2 MMJFS Service Configuration 525
        16.4.3 Packaging Advice 528
        16.4.4 Testing and Troubleshooting 529
    16.5 MPICH2—MDS3 Integration 532

        16.5.1 An Information Service Script for MPICH2 532
        16.5.2 Configuration and Testing 534
    16.6 Summary 535

Appendix A Source Code 537
    Program Name: SimpleCA.java 537
    Creating a Certificate Request 537
    Signing a Certificate Request 537
    Getting Information on a X.509 Certificate 537
    Program Name: CertGenerator.java 538
    Program Name: CertSigner.java 538
    Program Name: CertManaager.java 538
    Program Name: Certrq.jsp 538
    Program Name: self-sign.jsp 538
    Program Name: userhost.jsp 538

Appendix B About the CD-ROM 539

Index 541

Preface

BOOK RATIONALE

The goal in writing this book is to share the knowledge I have acquired as a software engineer in the new and exciting field of grid computing. I have been lucky to participate in such projects as IBM's internal grid initiative and the IBM Grid Toolbox product for on-demand and grid computing. I have also collaborated with engineers of the Globus Alliance in development efforts related to the Globus Toolkit, the de facto standard for grid computing. This book focuses on three main areas:

Provision of a “Swiss Army knife” of small programs and tools covering most of today's grid middleware protocols, along with sample grid services. These tools can be used to build bigger components in a grid environment.
Middleware-scheduler integration: I have personally found this topic to be very obscure, with little or no documentation available.
Parallel systems development using the Message Passing Interface (MPI) and its integration with grid middleware, specifically the Globus Toolkit.

TARGET AUDIENCE

Primary audience: Grid computing is quickly expanding into all fields of technology, and thus this book will be useful to the following:

Computer science students, developers, researchers, and professors: This particular group would benefit from the vast array of tools provided to cover all major grid middleware protocols.
IT professionals and application developers: These readers should be able to exploit the code to build grid services for their organizations.
System administrators: This group will benefit from the installation transcripts and especially from the middleware-backend integration explored in Part II.


Secondary audience: This group is composed of system administrators and grid computing enthusiasts. Although this book is very technical, the first part has lots of theory and concepts.

Other IT-related developers: Although this book requires strong programming skills, it includes theory on distributed computing and grid market report information that could be of interest to strategists and distributed business market analysts.

BOOK OVERVIEW

Each chapter includes plenty of source code listings, installation transcripts, figures, and troubleshooting tips. The book is divided into four parts:

Part I: Theory and Foundations

This part provides a very simple and short introduction to the history of high-performance computing and grid computing concepts, including the following:

A time table of high-performance computing (HPC)
Grid computing concepts such as computational economy, resource management, peer-to-peer (P2P), and enterprise computing
Current grid computing initiatives in science and technology

This part is intentionally short because it is not my goal to cover theory but, rather, practical grid middleware algorithms and tools.

Part II: Grid Middleware

This part covers two key grid middleware technologies: the Open Grid Services Architecture (OGSA) and resource management software (schedulers).

OGSA explores all the protocols used to implement grid services with a real-world service implementation: a grid service for large integer factorization (a very popular subject among cryptographers and HPC developers).
Resource management explores today's most popular open source schedulers and their integration with OGSA.

Part III: The Globus Toolkit

The Globus Toolkit is the de facto standard for grid computing [Globus03]. It is supported by major companies such as IBM, Sun, and Hewlett-Packard. This part explores the major protocols used by the toolkit, with source code samples in the areas of

Public Key Infrastructure (PKI), including tools for X509 certificate and key generation, the Grid Security Infrastructure (GSI), and the Community Authorization Service (CAS)
Resource allocation and management programs and tools for Web Services (WS-GRAM) and pre–Web Services (GT2)
Data management and transfer using GridFTP, Replica Location Services (RLS), and Reliable File Transfer (RFT)
Information services, covering the Globus Monitoring and Discovery Services (WS-MDS and MDS2)
Commodity Grid Kits, which provide code listings for commodity frameworks and toolkits that cooperate with grid technologies to enhance the functionality, maintenance, and deployment of grid services
Web Services Resource Framework (WSRF), which covers the latest grid and web services standards

Part IV: The Message Passing Interface (MPI) Standard

This part covers the MPI standard used to write parallel applications. MPI has been adopted by all major software vendors as the standard for multiprocessor parallel programs. Applications presented in this part are as follows:

Simple examples of parallel programs such as numerical integration calculations, factorial computations, Fibonacci sequences, and so on
A parallel algorithm for large integer factorization (LIF) based on the number field sieve (NFS)

This part differs from other MPI books because integration with OGSA protocols will be explored, including the following:

A provider for OGSA information services: Monitoring and Discovery Services (MDS3)
Integration with the Resource Management Protocol (WS-GRAM)

Appendix A: Source Code

Appendix A, available on the companion CD-ROM, includes source code used for X.509 certificate manipulation and signing, with the following features:

A simple Certificate Authority (CA) implementation for the Java language (used for security configuration of protocols such as GSI)
X509 certificate request (CSR) and key generation
Certificate signatures
Web tools for these programs

A GRID-ENABLED APPLICATION CHECKLIST

According to Foster and Kesselman [GridAnatomy01], when thinking about grid computing, one should think in terms of applications rather than frameworks. I couldn't agree more; rushing to build a grid framework without considering potential users and applications may lead to a significant waste of resources in the long run. As a software engineer, I worked on many grid-related projects in this fashion that led to powerful grids incapable of running any real applications. Three characteristics define a grid-enabled application (see Figure P.1): distributed resources, open standards and protocols, and quality of service (for example, response time, throughput, availability, and security).

Distributed resources: Resources that are not subject to centralized control. These resources might be scattered across your organization or among multiple virtual organizations. Well-known examples of such applications are media distribution networks such as Kazaa, Napster, and Gnutella. Kazaa and Napster fail to comply with the second characteristic, so they cannot be considered grids. Examples of programs that some people believe to be grids but that fail in this regard are cluster management systems such as Portable Batch System (PBS), Sun Grid Engine (SGE), and LoadLeveler. Such systems are not grids themselves because they exercise centralized control over the hosts they manage: centralized control of system state, user requests, and individual components.
Open standards and general-purpose protocols: Needed to address issues such as authentication, authorization, resource discovery, and resource access. Open standards are necessary for interoperability and industry acceptance. Kazaa and Napster fail in this regard too. On the other hand, the Gnutella network is built

on open standards and delivers quality of service so it can be considered a media distribution grid. Quality of service (QoS): To satisfy user demands. This QoS could be in the form of response time, throughput, availability, or security.

FIGURE P.1 The basic foundation of a grid-enabled application.

Ultimately, a grid should be considered in terms of applications and quality of service, rather than frameworks. Installing grid middleware on all the servers within your organization will only leave you with a networked cluster and nothing to run on it. As a software engineer, I faced this problem many times: significant time and resources were spent installing grid software on all available machines without considering what type of applications could take advantage of such an environment. Focusing on applications and quality of service first will let you quickly identify potential solutions and put you on the right track to building a grid-enabled environment within your organization.

WILL GRID COMPUTING MAKE IT INTO THE ENTERPRISE?

Perhaps you find yourself wondering if you should go ahead and teach yourself grid computing—just to see what all the hype is about. After all, many IT vendors have invested significant resources to grid-enable some of their products. Will grid computing make it into the enterprise? The answer is simple—we do not know; the future is uncertain. However, some numbers look encouraging, and there is indeed great momentum to push the technology into the mainstream.

The following sections give you an overview of the grid computing market today. If you are a developer or software engineer eager to learn an exciting new technology, take a look at the market data in this section and draw your own conclusions. Consider the following market report from Insight Research (IR). In a study called “A Vertical Market Perspective 2003–2008,” IR forecasts growth in worldwide grid spending from $250 million in 2003 to $4.89 billion in 2008 (see Figure P.2). This represents a compound annual growth rate of 81.1% over the five-year period [IRGridMarket03].
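The 81.1% figure is the compound annual growth rate implied by those two endpoints; as a back-of-the-envelope check (not part of the report itself):

```latex
\mathrm{CAGR} = \left(\frac{\$4.89\ \text{billion}}{\$250\ \text{million}}\right)^{1/5} - 1
             = 19.56^{0.2} - 1 \approx 0.81 \quad (\approx 81\%\ \text{per year})
```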

FIGURE P.2 Insight Research—grid computing: a vertical market perspective 2003–2008.

This study positions grid computing as an emerging technology that will form the foundation of a fourth wave in IT, comparable to the birth of the World Wide Web. Insight's market analysis examines the implications of grid computing for the telecommunications industry as the essential backbone for the success of grid computing.

MARKET SEGMENTATION

A report from Light Reading published in May 2003 indicates interest in grid computing from vertical markets such as life sciences, energy and oil, manufacturing, financial services, and government [LRReport03] (see Table P.1). As with any emerging market, it is important to understand the market's segmentation. Although it is a bit early, several research firms have put numbers on the expected market size. For example, Grid Technology Partners, a firm specializing in grid computing, estimates the market (including hardware and software) growing to as much as $4.1 billion by 2005 [IRGridMarket].

TABLE P.1 Grid Computing Market Segmentation

Market           Applications
Life Sciences    Drug discovery; development and testing for bioinformatics
Energy           Gas and oil exploration; data-set visualization
Manufacturing    Chip design; simulation-based test and evaluation
Financial        Portfolio risk analysis
Government       Simulation and design; distributed database coordination; service utilities

The Light Reading report estimates most of the opportunity to be in education and short-term research. Regardless of market forecasts, the market is expected to spread into the enterprise through server aggregation and clusters, as well as enterprise data-center virtualization. This should translate into significant growth, although it will not be as quick as some may anticipate. Higher-growth opportunities will exist in the long term with the creation of grid services and service providers [LRReport03].

MARKET DATA

A study performed by IDC in 2003 estimates the worldwide software market in 2002 to be worth $169.8 billion based on software licensing revenue [IRGridMarket]. According to this study, the market where grid computing can make a significant impact can be partitioned as follows (see Figure P.3):

System Infrastructure Software (29%—$49.2 billion)
    System Management software (23%)
    Security software (14.3%)
    System Infrastructure (11.6%)
    Server-ware (5.4%—$2.6 billion)
        Clustering software (13.6%)
        Web Server (10.9%)
        Filesystem software (3.6%)
        Virtual user interface software (71.9%)
    Mobile Infrastructure (1.3%)
    Operating environments (38.5%)
    Other (6%)
Application Development and Deployment (22.1%)
Applications (48.9%)

FIGURE P.3: IDC Software Market Report 2003 [LRReport03].

REFERENCES

[Globus03] Globus Project. University of Chicago, June 2003. http://www.unix.globus.org.
[GridAnatomy01] Ian Foster, Carl Kesselman, and Steven Tuecke. “The Anatomy of the Grid.” International Journal of High Performance Computing Applications, 2001.
[IRGridMarket] Insight Research. “A Vertical Market Perspective 2003–2008.” A market research study by Insight Research Corporation.
[LRReport03] Light Reading Research Corporation. “A Report on Grid Computing Market Outlook and Segmentation.” May 07, 2003.

Part I Theory and Foundations

This part covers theory describing all the major initiatives in distributed and grid computing, along with market reports and segmentation data, including the following:

Overview of High Performance Computing (HPC): Chapter 1 describes terminology used in distributed and grid computing and gives a quick summary of the evolution of grid technologies. It also describes current grid initiatives in science and technology.
Enterprise Computing: Chapter 2 explains distributed computing concepts, focusing on the enterprise or virtual organization. Concepts such as infrastructures, shared services, business components, client state, pooled resources, and database concurrency are covered; all of these play a role in today's competitive IT organization. Also in Chapter 2 is a discussion of computational economy, an emerging field in distributed computing whose main goal is to drive grid technologies into the mainstream business world. This chapter describes exciting research conducted in the fields of resource management models, computational economy, and economy-driven grid applications on the enterprise.
Core Grid Middleware: Chapter 3 describes some of the middle-tier technologies used in grid computing, including peer-to-peer (P2P) architectures, the Globus Toolkit, and research conducted in the grid and business fields.

1 The Roadmap to High-Performance Computing

In This Chapter

Evolution of Grid Technologies
The Grid in a Nutshell
Distributed Computing Models
Computing on Demand (COD)
Grids in Science and Technology
Summary
References

This chapter provides background information on the evolution of high-performance computing (HPC) from the early decades of networked computers to the latest technological advances in distributed systems. This chapter includes the following topics:

A brief history of HPC
An overview of the concept of the grid
Explanations of the latest distributed computing models, such as Internet computing, peer-to-peer, and grid architectures
A brief discussion of some of the latest grid projects in science and technology

1.1 EVOLUTION OF GRID TECHNOLOGIES

High-performance computing has its roots in the early 1940s with the Manhattan Project and early efforts by the Department of Energy (DOE) to develop advanced


computing capabilities to solve critical problems of interest. In those days, parallel computing was done by mathematicians, and the types of problems that could be solved were very limited. Many algorithms and computational methods developed by the DOE are still used today. For example, the Monte Carlo method, in which statistical samples are used to predict the behavior of a large group, was developed by John von Neumann and others in 1946, and it is still used today in stock market forecasting, medicine, traffic flow, and other fields. In the early 1950s, researchers pushed state-of-the-art computing to its limits by building their own computers and looking into commercial models. Thanks to the groundbreaking work of mathematicians like Alan Turing, ideas on machine intelligence were developed. Major milestones of the 1950s and the decades that followed include the following:

Vacuum tubes gave way to the transistor, allowing for the mass production of computers.
The invention of the modem allowed scientists to access systems remotely.
The Department of Defense created ARPANET, the first nationwide network, connecting labs together and allowing researchers access to a wide area network of computing systems.
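Returning to the Monte Carlo method mentioned earlier: the statistical-sampling idea is easy to demonstrate with the classic estimate of π, a generic illustration rather than any DOE code.

```python
import random

def estimate_pi(samples: int, seed: int = 42) -> float:
    """Estimate pi by sampling random points in the unit square and
    counting how many land inside the quarter circle of radius 1."""
    rng = random.Random(seed)  # seeded for reproducibility
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # (points inside) / (total points) approximates pi/4
    return 4.0 * inside / samples

print(estimate_pi(100_000))  # converges toward 3.14159... as samples grow
```

The same pattern, drawing many random samples and averaging, underlies the method's modern uses in finance, medicine, and traffic modeling.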

Also in that decade, computers were mostly built to solve specific applications. Companies pushed the bounds of computational research in their effort to achieve greater memory density. Physics discoveries such as the ion-channeling effect allowed chip manufacturers to draw transistors inside blocks of silicon. More powerful computers provided broader support for energy research in applications such as fusion plasma modeling and atmosphere and emissions modeling. In the 1970s, computers were first linked by networks, giving birth to the idea of harnessing idle computing power. Scientists at the Xerox Palo Alto Research Center (PARC) created the first Ethernet network, beginning the first experiments in distributed computing. Scientists John F. Shoch and Jon Hupp developed the famous "worm" programs, which were designed to move around the network using idle CPU cycles for beneficial purposes. Software to perform computations and cooperate with other machines on the network was developed. In the early 1980s, more powerful computers were developed, leaving researchers with the task of writing their own operating systems. Advances such as timesharing (sharing computing power among multiple simultaneous users) were developed for Cray computers. NSFNET (which would later become part of the Internet) took center stage after researchers realized the critical role of networking in scientific computing. A common filesystem was developed, allowing remote computers to share storage resources. The first multiprocessor vector computers were built, allowing researchers to work on more complex problems using parallel systems.

The evolution of the Internet in the 1990s saw the creation of two groundbreaking distributed projects. The first was distributed.net, which used thousands of computers around the world to break encryption codes. The second, and one of the most successful, is the popular SETI@home project [SETI]. The goal of SETI is to look for radio signal fluctuations that could indicate a signal from intelligent life in outer space. It originated at the University of California at Berkeley in 1999, becoming the most successful Internet distributed project, with more than two million volunteers installing the SETI software agent. As we move into the 21st century, high-performance computing plays a phenomenal role in scientific advances: environmental simulations, unlocking the genetic code, and exploring the basic structure of matter and the universe. All these accomplishments have been made possible through advances in computing and simulation science. But we have only begun to grasp what is to come in the decades ahead!

1.2 THE GRID IN A NUTSHELL

As industry has become involved in distributed systems, the term grid has become a marketing slogan, so much so that any type of distributed filesystem can be called a storage grid, a scheduler deployed in a cluster can be called a cluster grid, and a user connecting through a file-sharing application could be said to be using a digital media distribution grid. Experts suggest that a grid must be evaluated for the applications, business value, and scientific results that it delivers, rather than for its architecture. Carl Kesselman [GridAnatomy01] defined the grid as a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities. This definition suggests a form of on-demand access to computing, data, and services that evolved from early ideas of computer utilities. Indeed, in the 1960s, Len Kleinrock predicted the spread of computer utilities, similar to electric and telephone utilities, that would serve people [FosterChecklist02]. Foster and colleagues suggested addressing social and policy issues with the idea of coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations [GridAnatomy01]. This implies resource-sharing arrangements between providers and consumers in pursuit of a common purpose. This sharing of computers, software, data, and other resources is highly controlled, with resource providers and consumers defining clearly what is shared, who is allowed to share, and the conditions under which sharing occurs. A logical group of resource providers and consumers is known as a virtual organization (VO) [FosterChecklist02]. Foster suggests the following checklist of characteristics for a grid-enabled application:

1. Decentralized resource coordination: A grid should integrate and coordinate resources and users in different domains and address issues of security, policy, payment, and membership.
2. Standards and open source protocols: These should be used for authentication, authorization, resource discovery, and resource access.
3. Quality of service delivery: Resources should be used in a coordinated fashion to deliver quality of service, response times, throughput, and availability to meet complex user demands.

1.2.1 What Can Be Called a Grid Application

According to this checklist, scheduler software such as the Portable Batch System (PBS) or Sun Grid Engine (SGE), even though it delivers quality of service, does not constitute a grid because of its centralized host management policies. According to Foster and Kesselman, the Web is not yet a grid itself: even though it is built on top of open, general-purpose protocols for access to distributed resources, it does not coordinate the use of those resources to deliver quality of service. Lately, schedulers such as Platform's MultiCluster can be called grids, as can distributed computing systems such as Condor® and SETI, which harness idle desktops; peer-to-peer (P2P) systems such as Gnutella, which support file sharing among participating peers; and the Storage Resource Broker, which supports distributed access to data resources [FosterChecklist02].

1.3 DISTRIBUTED COMPUTING MODELS

Modern distributed computing frameworks can be classified into three distinct branches: Internet computing, which seeks to harness idle CPU cycles; peer-to-peer, for serverless communication; and the grid, which bridges the gap between client/server and Web Services. The following sections introduce the concepts, similarities, and differences among them.

1.3.1 Internet Computing

Decentralized distributed models seek to harness the computing power of millions of devices worldwide, such as personal computers and pervasive devices such as personal digital assistants (PDAs), laptops, and others. Large-scale experiments such as the Search for Extraterrestrial Intelligence (SETI) launched this model into the mainstream. Radio telescope data is delivered on tapes to SETI@home headquarters in Berkeley, California, where researchers chop the raw data into small "work-units." These work-units are then distributed to users around the world, who analyze the data on their PCs.
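The work-unit cycle described above can be sketched in a few lines of Python. The class and method names here are invented stand-ins for illustration, not the actual SETI@home protocol.

```python
from collections import deque

class WorkUnitServer:
    """Toy sketch of a SETI@home-style master: chop raw data into
    small work-units and hand them out to transient volunteer nodes."""
    def __init__(self, raw_data, unit_size):
        # Split the raw data into fixed-size chunks ("work-units").
        self.pending = deque(raw_data[i:i + unit_size]
                             for i in range(0, len(raw_data), unit_size))
        self.results = []

    def checkout(self):
        # A volunteer asks for work; None means nothing is left.
        return self.pending.popleft() if self.pending else None

    def submit(self, result):
        self.results.append(result)

# Volunteers connect, grab a unit, analyze it locally, and report back.
server = WorkUnitServer(raw_data=list(range(10)), unit_size=3)
while (unit := server.checkout()) is not None:
    server.submit(sum(unit))  # stand-in for the real signal analysis
print(server.results)         # [3, 12, 21, 9]
```

A production system would additionally track which units are checked out and reissue those lost to nodes that disconnect, which is what makes the model tolerant of transient volunteers.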

This model is useful in environments where thousands and possibly millions of pervasive nodes, nodes with relatively low processing power, work in a transient manner. This basically means that users will go online and offline, connecting or disconnecting all the time. This method has the advantage of achieving a degree of sensitivity that could only be dreamed of by other methods such as mainframe processing. It also scales massively, as millions of nodes can connect and disconnect many times while placing little load on the network. On the other hand, the public nature of this model raises serious security issues and can make it prone to malicious attacks. This model also lacks advanced clustering features such as time sharing, resource management and brokering, information services, and search capabilities (see Figure 1.1).

FIGURE 1.1 SETI@home architecture, a typical Internet computing application that harnesses idle desktop power.

1.3.2 Peer-to-Peer (P2P)

The infamous Napster software is the prime example of the P2P distributed model. In a P2P network, computers are called peers and take advantage of resources available on the Internet: storage, cycles, content, and human presence. Peers access decentralized resources in an environment of unstable connectivity and unpredictable IP addresses. Thus, P2P nodes operate outside the Domain Name System (DNS).

The main difference in this model is that peers have significant or total autonomy from central servers. This makes P2P unique. All distributed models seek to leverage unused resources by aggregating cycles, sharing digital content, and working with the variable connectivity of possibly millions of devices.

1.3.2.1 Advantages and Disadvantages

An important goal in P2P networks is that the bandwidth of clients is aggregated, so the available download bandwidth for a given user grows with the number of nodes. Another advantage is the leveraging of unused resources. Disadvantages include very high network latency and inefficient search mechanisms; in addition, although peers claim to have autonomy from central servers, applications such as Napster and Kazaa rely on a centralized indexing server to store metadata about peers. P2P applications have also caused serious legal controversies by violating copyright laws on digital media distribution. Besides file sharing, P2P is gaining ground in the online gaming industry and in collaboration, with applications such as Groove and others.
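The bandwidth-aggregation point can be made concrete with a back-of-the-envelope model; the function and the numbers below are purely illustrative.

```python
def aggregate_download_rate(n_peers, per_peer_upload_mbps, my_link_mbps):
    """Toy model of P2P bandwidth aggregation: pulling chunks from
    n peers in parallel sums their upload rates, capped by the
    downloader's own link capacity."""
    return min(n_peers * per_peer_upload_mbps, my_link_mbps)

# One peer serving at 1 Mbps gives 1 Mbps; twenty such peers
# saturate a hypothetical 16 Mbps downstream link.
print(aggregate_download_rate(1, 1.0, 16.0))   # 1.0
print(aggregate_download_rate(20, 1.0, 16.0))  # 16.0
```

The model also hints at the latency disadvantage noted above: total throughput grows with peer count, but each chunk still pays the round-trip cost to whichever peer serves it.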

1.3.3 Grid Architectures

According to Foster and colleagues, three characteristics define a grid application: It should provide decentralized resource coordination, addressing issues of security, policy, payment, and membership. It should be based on standards and open source protocols for authentication, authorization, resource discovery, and access. Finally, it should deliver quality of service by using resources in a coordinated fashion, providing low response times and high throughput to meet user demands. A grid architecture should be an extensible, open structure designed to solve key VO requirements. Foster and colleagues suggest the following grid architecture, defined by a set of layers (from the bottom to the top) [GridAnatomy01]:

Fabric: This is the bottom layer. It provides the resources (computational and storage systems, catalogs, and network resources) to which shared access is mediated by grid protocols, implementing the local, resource-specific operations invoked by the resource and connectivity protocols above it. Richer fabric functionality enables more sophisticated sharing operations; on the other hand, a simpler fabric simplifies the deployment of grid infrastructure. An example of richer functionality is advance reservation, which allows scheduling of resources in ways otherwise impossible to achieve; however, advance reservation increases the cost of incorporating new resources into a grid. At a minimum, the fabric layer should implement enquiry mechanisms for discovery of resource structure, state, and capabilities, and resource management mechanisms to deliver quality of service. The types of resources manipulated by the fabric can be computational resources, storage, networks, code repositories, and databases.
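The advance-reservation capability mentioned above can be illustrated with a toy booking calendar. The Reservable class and its interface are invented for this sketch; they are not a Globus or fabric-layer API.

```python
class Reservable:
    """Toy fabric resource that keeps a calendar of reserved time
    slots and refuses any booking that overlaps an existing one."""
    def __init__(self):
        self.slots = []  # list of (start_hour, end_hour) tuples

    def reserve(self, start, end):
        for s, e in self.slots:
            if start < e and s < end:  # half-open intervals overlap
                return False
        self.slots.append((start, end))
        return True

cpu = Reservable()
print(cpu.reserve(9, 12))   # True  - the slot is free
print(cpu.reserve(11, 13))  # False - overlaps the 9-12 booking
print(cpu.reserve(12, 14))  # True  - starts exactly when the first ends
```

This also shows why advance reservation raises the cost of adding resources: every new resource must carry this extra bookkeeping and honor commitments made in advance.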

Connectivity: This layer defines communication and authentication protocols for network transactions. The goal is to provide easy and secure communications. Communication protocols include Internet (IP), transport (TCP, UDP), and application (DNS, and so on) protocols, with space for new protocols as the need arises. Authentication protocols should be able to provide the following: single sign-on (log in once and access multiple resources defined by the fabric layer); delegation, to allow a program to run on the user's behalf so that it can access the resources on which the user is authorized; integration with local security solutions; and user-based trust relationships, so that resource providers are not required to interact with each other before a user can access resources on either provider.
Resource: This layer defines protocols for secure negotiation, initiation, monitoring, control, accounting, and payment of sharing operations on individual resources. These protocols deal with individual resources and ignore global state and atomic actions across distributed collections, which are handled by the collective layer. Resource layer protocols can be information protocols, used to obtain information about configuration, load, or usage policies, and management protocols, which negotiate access to shared resources by handling resource requirements and the operation(s) to be performed. Management protocols ensure consistency of operations for a given shared resource.
Collective: This layer defines protocols and services that are global in nature and capture interactions across collections of resources. Examples of collective protocols and services are the following:
a. Directory services: for discovery of resource properties.
b. Coallocation, scheduling, and brokering services: for allocation of one or more resources for a specific purpose and the scheduling of tasks.
c. Monitoring and diagnostics: for failure, intrusion detection, overload, and so on.
d. Data replication services: for storage management to maximize data access performance.
e. Grid-enabled programming systems: to provide a programming model for resource discovery, security, allocation, and others.
f. Workload management systems: for description and management of multicomponent workflows.
g. Software discovery services: for optimal software selection.
h. Community authorization servers: to enforce community policies governing resource access.
i. Accounting/payment and collaboration services.
Applications: This layer includes user applications that operate within a VO. Applications call on services defined at any layer (resource management, data access, resource discovery, and so forth) to perform desired actions. Applications rely on application programming interfaces (APIs) implemented by software development kits (SDKs), which in turn call grid protocols to interact

with network services that provide capabilities to the end user. Protocols may implement local functionality or interact with other protocols. A typical example of a grid-enabled framework should include some of the components specified in Figure 1.2.

FIGURE 1.2 The typical framework of a grid-enabled application.

1.3.3.1 Virtual Organizations

Two or more organizations that share resources become a VO. The policies governing access to those resources vary according to the actual organizations involved, creating an environment of providers and consumers. Resources are made available by owners with constraints on when, where, and what can be done with them. Resource consumers may also place constraints on the properties of the resources they are prepared to work with. For example, a consumer may accept a resource only over a secure channel. This environment effectively creates a dynamic sharing relationship between providers and consumers. These dynamic relationships may be defined by policies that govern access to resources. Mechanisms for discovering and characterizing the nature of the relationships are required in this environment. For example, new users should be able to discover what resources are available in the VO and the quality of those resources. Because VOs enable disparate organizations or individuals to share resources in a controlled fashion to achieve a common goal, they are emerging as fundamental entities in modern computing [GridAnatomy01].
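The two-sided constraint idea, providers stating conditions on their resources and consumers stating what they will accept, can be illustrated with a small matching sketch. All field names here (such as secure_channel and require_secure) are invented for the example.

```python
# Hypothetical VO resource catalog: each provider publishes a
# resource along with the conditions under which it is shared.
providers = [
    {"name": "clusterA", "secure_channel": True,  "hours": "9-17"},
    {"name": "clusterB", "secure_channel": False, "hours": "0-24"},
]

def acceptable(resource, consumer):
    """Consumer-side constraint check, e.g. 'accept a resource
    only over a secure channel.'"""
    if consumer.get("require_secure") and not resource["secure_channel"]:
        return False
    return True

consumer = {"name": "analysis-job", "require_secure": True}
matches = [r["name"] for r in providers if acceptable(r, consumer)]
print(matches)  # ['clusterA']
```

Real VO middleware generalizes this in both directions, with provider policies, consumer requirements, and discovery services that evaluate the match, but the sharing relationship it establishes is of the same shape.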

The management of VO sharing relationships requires new technology. This technology has been dubbed a grid architecture, which identifies components and how those components interact with one another.

1.3.3.2 The Need for Standards-Based Open Middleware

Interoperability is a central issue to be addressed in networked environments. Interoperability usually means common protocols, which define the mechanisms by which users and resources connect, negotiate, and establish sharing relationships. A standards-based open architecture facilitates extensibility, interoperability, and portability, making it easy to define standard services. This technology and architecture is also known as middleware. Interoperability is important because without it, users and applications are forced to use bilateral sharing arrangements, with no assurance that the mechanisms used will be extensible to other parties. This makes dynamic VO formation all but impossible. Standard protocols are fundamental for general resource sharing.

1.4 COMPUTING ON DEMAND (COD)

Computing on demand (COD) extends high-throughput computing abilities to include a method for running short-term jobs on instantly available resources. Job management in COD-enabled systems includes interactive, compute-intensive jobs, giving these jobs immediate access to the computing power they need over a relatively short period. Many of the applications that are well suited for COD involve a cycle: blocking on user input, a computation burst to compute results, blocking again on user input, and so forth. When the resources are not being used for the bursts of computation that service the application, they should continue to execute long-running batch jobs. Examples of applications that require COD capabilities are the following [Condor04]:

Graphics-rendering applications
Visualization tools for data mining
A large spreadsheet with many complex formulas that take a long time to recalculate; when the user recalculates, the nodes work on the computation and send the results back to the master application

1.4.1 How It Works

Resources on a pool of nodes run batch jobs. When a COD job appears at a node, the lower-priority (currently running) batch jobs are suspended, allowing the COD job to run immediately. After completion, the batch jobs resume execution. Administratively, COD applications put claims on nodes. While the COD application does not need the nodes, the claims are suspended, allowing batch jobs to run [Condor04].
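The suspend/run/resume cycle can be sketched as follows. The Node class and its methods are an invented API loosely modeled on the COD description above, not Condor's actual interface.

```python
class Node:
    """Toy pool node: long-running batch work yields immediately
    to a short COD job, then resumes when the burst is done."""
    def __init__(self):
        self.log = []
        self.batch_running = False

    def run_batch(self, name):
        self.batch_running = True
        self.log.append(f"batch {name} running")

    def activate_cod_claim(self, job):
        if self.batch_running:
            self.log.append("batch suspended")  # lower priority yields
        self.log.append(f"cod {job} running")   # burst runs immediately
        self.log.append(f"cod {job} done")
        if self.batch_running:
            self.log.append("batch resumed")    # long-running work continues

node = Node()
node.run_batch("simulation")
node.activate_cod_claim("render-frame")
print(node.log)
```

Running this prints the sequence batch running, batch suspended, COD running, COD done, batch resumed, which is exactly the ordering the scheduler must guarantee.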

1.4.2 User Authorization

The system works by users putting claims on nodes for a COD job. A user with a claim on a resource can then suspend and resume a COD job at will. This gives the user a great deal of power over the claimed resource, even if it is owned by another user. Because of this, it is essential that users can be trusted not to abuse this power. Thus, privileges should be granted by cluster administrators, and strong authentication methods should be used [Condor04].

1.4.3 Limitations Support for COD in current schedulers faces a few limitations:

Applications and data must be prestaged at a given node.
Limits should be defined for how long a claim can be active, how often it is run, and so on.
There is a lack of accounting for applications under COD claims.
There is a lack of claim persistency in daemons.

COD provides high-throughput computing abilities in an environment where short-term jobs should be run on instantly available resources. It delivers on-demand computing power for demanding environments such as visualization and graphics. COD capabilities are usually implemented by schedulers such as Condor [Condor04].

1.5 GRIDS IN SCIENCE AND TECHNOLOGY

The following is a compilation of popular consortiums and projects related to grid computing, aimed at providing the latest technologies as well as collaboration, news, documentation, and software standards in all things grid.

1.5.1 Grid Consortiums and Open Forums The following forums contain the latest news and technologies related to grid ini- tiatives all over the world.

1.5.1.1 Global Grid Forum (GGF)—http://www.ggf.org/ GGF is a community-initiated forum of individuals from industry and research aiming for a global standardization of grid technologies and applications. The Roadmap to High-Performance Computing 13

GGF promotes the development, deployment, and implementation of grid technologies via documentation of “best practices”—technical specifications, user experiences, and implementation guidelines.

1.5.1.2 Peer-to-Peer Working Group (P2Pwg)—http://p2p.internet2.edu/ P2Pwg leads efforts to investigate and explore the many aspects of peer-to-peer, be- yond the resource management issues that bring the most notoriety (e.g., Internet file downloads). Its mission is as follows [P2PInternet]:

Report on recent occurrences and future trends within P2P and distributed computing.
Develop collaboration between the education community and corporate entities to investigate new P2P and distributed computing applications.
Provide best practices for resource management and P2P technologies.

1.5.1.3 Asia Pacific Grid (ApGrid)—http://www.apgrid.org/ ApGrid aims at building an international grid test bed among organizations in the Asia Pacific region and provides venues for sharing and exchanging ideas and infor- mation, new projects, and collaboration and interfacing with global efforts such as the GGF [ApGrid04].

1.5.2 Grids in Science and Engineering The general drive for most current grid projects is to enable the resource interactions that facilitate large-scale science and engineering projects such as bioinformatics, high-energy physics data analysis, climatology, large-scale remote instrument oper- ation, and so forth. In addition to government and military projects (NASA, DOE), grids are being developed by an increasing community of people who work together through coordinating organizations such as the GGF. From efforts such as this, grids will become a reality and an important component of the practice of science and engineering.

1.5.2.1 The DataGrid Project—http://eu-datagrid.web.cern.ch/eu-datagrid/ The DataGrid project is funded by the European Union (EU). The objective is to build the next generation computing infrastructure by providing intensive compu- tation and analysis of shared large-scale databases, from hundreds of terabytes to petabytes, across widely distributed scientific communities. The DataGrid project will be included in the new EU grid project (Enabling Grids for E-sciencE [EGEE]). 14 Grid Computing for Developers

EGEE aims to build a service grid infrastructure in Europe available to scientists 24 hours a day [EGEE04].

1.5.2.2 Grid Physics Network (GriPhyN)—http://www.griphyn.org/ The GriPhyN Project is developing grid technologies for scientific and engineering projects that must collect and analyze distributed, petabyte scale datasets. GriPhyN research will enable the development of Petascale Virtual Data Grids (PVDGs) through its virtual data toolkit (VDT). VDT is an ensemble of grid middleware that aims to make it as easy for users to deploy, maintain, and use grid middleware. VDT is composed by the following [GriPhyN05]:

Basic grid services, including Condor-G® and Globus®
Virtual data tools to work with virtual data, particularly the virtual data system
Utility software such as Grid Security Infrastructure (GSI)-enabled OpenSSH, software to update GSI certificate revocation lists, and monitoring software like MonALISA

1.5.2.3 Particle Physics DataGrid (PPDG)—http://www.ppdg.net/ The Particle Physics DataGrid Pilot (PPDG) is a collaboration of computer scien- tists with a strong record in grid technology and physicists with leading roles in the software and network infrastructures for major high-energy and nuclear experi- ments [PPDG04].

1.5.2.4 Petascale Data-Intensive Computing (Grid Datafarm)— http://datafarm.apgrid.org/ Grid Datafarm is a Petascale data-intensive computing project initiated in Japan. The project is the result of collaboration among the High Energy Accelerator Research Organization (KEK), the National Institute of Advanced Industrial Science and Tech- nology (AIST), the University of Tokyo, and the Tokyo Institute of Technology. The challenge involves the construction of a petascale to exascale parallel filesystems exploiting local storages of PCs spread over the worldwide grid [Grid- Farm04].

1.5.2.5 Resource Modeling and Simulation (GridSim)—http://www.gridbus.org/gridsim/ The focus of this project is to investigate effective resource allocation techniques based on computational economy through simulation. This project aims to simu- late millions of resources and thousands of users with varied requirements and study the scalability of systems and algorithms, the efficiency of resource allocation policies and the satisfaction of users [GridSim04]. The Roadmap to High-Performance Computing 15

1.6 SUMMARY

The term grid suggests a computing paradigm analogous to a power grid. In such an environment, a shared pool of resources is created for many consumers to access as needed. Resources include processors, memory, and storage. Grid computing is still in early development, but efforts are underway to develop open standards, thus promoting its mass adoption. Major software vendors are currently working actively on those efforts.

REFERENCES

[ApGrid04] Asia Pacific Grid. National Institute of Advanced Industrial Science and Technology (AIST). May 2004. Available online from http://www.apgrid.org/.
[Condor04] Condor Team. Condor Version 6.6.6 Manual. University of Wisconsin–Madison. Accessed online July 28, 2004, at http://www.cs.wisc.edu/condor/manual/v6.6.6/.
[EGEE04] The DataGrid Project. The European Union. Available online March 2004 from http://eu-datagrid.web.cern.ch/eu-datagrid/.
[FosterChecklist02] Ian Foster. What is the Grid? A Three Point Checklist. Argonne National Laboratory & University of Chicago, July 20, 2002.
[GridAnatomy01] Ian Foster, Carl Kesselman, and Steven Tuecke. "The Anatomy of the Grid." International Journal of Supercomputer Applications, 15(3), 200–222 (2001). Available online at www.globus.org/research/papers/anatomy.pdf.
[GridFarm04] Grid Datafarm for Petascale Data Intensive Computing. High Energy Accelerator Research Organization (KEK), National Institute of Advanced Industrial Science and Technology (AIST), the University of Tokyo, and the Tokyo Institute of Technology. Available online November 2004 from http://datafarm.apgrid.org/.
[GridSim04] Grid Computing and Distributed Systems. Department of Computer Science and Software Engineering, University of Melbourne, Australia. Available online 2004 from http://www.gridbus.org/gridsim/.
[GriPhyN05] GriPhyN (Grid Physics Network). University of Florida and Argonne National Laboratory. Available online February 2005 from http://www.griphyn.org/.
[P2PInternet] Peer-to-Peer Working Group. The Internet2 Consortium. Available online at http://www.internet2.edu/.
[PPDG04] Particle Physics Data Grid. Argonne National Laboratory, Brookhaven National Laboratory, California Institute of Technology, Fermi National Laboratory, Lawrence Berkeley National Laboratory, San Diego Supercomputer Center, Stanford Linear Accelerator Center, Thomas Jefferson National Accelerator Facility, University of California at San Diego, University of Florida, University of Wisconsin–Madison, Harvard University, University of Manchester, University of Glasgow, State University of New York–Stony Brook, Boston University, University of Chicago, University of Texas–Arlington, University of Houston, University of Southern California Information Sciences Institute. Available online October 2004 from http://www.ppdg.net/.
[SETI] The Search for Extraterrestrial Intelligence. SETI@home. Accessed online 2001 at http://setiathome.ssl.berkeley.edu/.

2 Enterprise Computing

In This Chapter

Existing Enterprise Infrastructures
Integration with Grid Architectures
Grid-Enabled Resource Management Models
Enter Open Grid Services
The Competitive IT Organization
Computational Economy
Summary
References

Enterprise applications within IT units are responsible for information management, reporting, database and system administration, and supplying a consistent approach to management, coordination, and integration. Fortunately for today's enterprise, the past few years have seen this environment and landscape rapidly improve into one that can readily and effectively support modern enterprise demands and operational criteria [ShanRalph97]. This chapter includes information on the following topics:

A discussion of existing enterprise infrastructures and their integration with grid architectures
An overview of resource management models available to enterprise computing
A discussion of the benefits provided by the latest Open Grid Services and Web Services Resource Framework technologies


Today's computing platforms are inexpensive and easy to use, and they are significantly more reliable and extensible than their predecessors. Public marketplace demands and the rapid pace of technological innovation dictate that today's management teams must embrace new mechanisms for providing their organizations with expandable tools to enhance existing tactical and strategic advantages. The needs of enterprise computing in our world today are driven by the following:

The need for flexible workload management systems to enhance existing computing environments
Queuing and scheduling of computational workload across complex networks to optimize hardware and software utilization and minimize job turnaround times
High-performance computing workload management
Applications in a broad range of computing environments, such as engineering analysis, financial services, life sciences, visual effects, animation, meteorology, and others
Reduction of operational expenses, improvement of operational efficiencies, and increases in the timeliness and accuracy of data collection and dissemination [Condor04]

Strategic factors play a role as well:

- The ability to stay closer to your customers, vendors, prospects, and partners
- The ability to efficiently and effectively capture operational metrics for quick, accurate, and informed strategic decision making
- Smooth and shortened turnaround for investment

As competition increases in the marketplace and companies seek to achieve a competitive advantage, one common issue is the reduction of costs without affecting the level of service or quality [Frankel03]. By allowing managers and teams to more quickly and accurately communicate information across a widespread workforce, productivity can be greatly increased while critical decision-making time can be dramatically reduced. The availability of accurate data over short periods and the capability to analyze it from any point or location can increase overall decision-making skills and provide a critical strategic advantage [Condor04].

2.1 EXISTING ENTERPRISE INFRASTRUCTURES

Today's enterprise infrastructure relies on back-end or legacy enterprise systems. Any successful business solution must be tightly coupled with the existing enterprise infrastructure to have an impact. This close coupling and integration is imperative for creating a solution that is powerful, effective, efficient, and widely accepted by the target audience. An overall strategy that can leverage such tight integration is one that provides powerful solutions to complex business problems. The power of these types of applications lies almost entirely in the cohesiveness shared with the existing enterprise infrastructure, for it's the (often preexisting) infrastructure that provides users access to the mission-critical, line-of-business information needed to do a better job. Subsequently, before you have a complete and integrated solution, many specific factors, such as network connectivity, database and knowledge base integration, intranet resources, and enterprise-level security, must be carefully considered and properly addressed. An effective business solution links corporate-level resources such as enterprise resource planning systems, sales force automation tools, manufacturing and planning resources, and more [ShanRalph97].

A demanding marketplace has driven rapid enhancements in a number of areas such as usability, speed, size, and form. These areas will continue to improve, offering new alternatives as new development takes place.

2.2 INTEGRATION WITH GRID ARCHITECTURES

Enterprise computing systems provide standard resource interfaces, remote invocation mechanisms, and trading services for discovery to make it easy to share resources within an organization. Examples of enterprise computing systems are the Common Object Request Broker Architecture (CORBA), the Distributed Component Object Model (DCOM), Enterprise JavaBeans (EJB), and the Java 2 Platform, Enterprise Edition (J2EE). These systems have the advantage of providing resource sharing and quality of service. On the other hand, these mechanisms do not address specific virtual organization (VO) requirements such as the following [GridAnatomy01]:

- They lack dynamic reconfiguration of resources: for example, accessing data located on storage managed by two or more separate storage service providers (SSPs).
- They lack load sharing across resource providers.
- They cannot extend dynamically to encompass other resources, and they do not provide the remote resource provider with any control over when and whether to share its resources.

Other limitations include static resource sharing limited to a single organization, and interaction via client-server models rather than the coordinated use of multiple resources. The addition of grid protocols can provide enhanced capability (for example, inter-domain security) and enable interoperability with other clients. Customers can negotiate access to particular resources and then use grid protocols to dynamically provision those resources to run customer-specific applications. Flexible delegation and access control mechanisms would allow customers to grant an application running on an application service provider (ASP) direct, efficient, and secure access to data on an SSP, or to couple resources from multiple ASPs and SSPs with their own [GridAnatomy01]. Some of the issues related to enterprise computing and grid integration, discussed in the next section, include the following:

- Managing workload
- Managing shared services
- Using stateless business components
- Managing client state
- Handling pooled resources and lightweight server processes
- Maximizing database concurrency

2.2.1 Workload and Policy Management

Workload management means that the use of shared resources is controlled to best achieve an enterprise's goals, such as productivity, timeliness, level of service, and so forth. Workload management is accomplished through managing resources and administering policies. Workload management should provide the following major capabilities [SGEUserGruide04]:

- Dynamic scheduling and resource management to enforce site-specific management policies.
- Dynamic collection of performance data to provide schedulers with up-to-the-minute job-level resource consumption and system load information.
- Enhanced security through encryption of messages sent between client and server.
- High-level policy administration for the definition and implementation of enterprise goals such as productivity, timeliness, and level of service.
- Support for checkpointing, to migrate jobs from workstation to workstation without user intervention, depending on system load.

Workload management provides users with the means to submit computationally demanding tasks to the grid for transparent distribution of the associated workload. Users can submit batch jobs, interactive jobs, and parallel jobs to the grid. Workload management software orchestrates the delivery of computational power based on enterprise resource policies set by the organization's staff. The system uses these policies to examine the available resources; it then gathers, allocates, and delivers resources automatically, optimizing usage across the network. This functionality is achieved through cooperation among the different components of the workload management software. To enable this cooperation, resource owners must do the following:

- Negotiate policies
- Have flexibility in the policies for unique project requirements
- Have the policies automatically monitored and enforced

Effective workload management software should be capable of handling millions of jobs at a time, without users needing to be concerned about where the jobs run.
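As a concrete illustration, the dispatch loop described above (collect load data, apply policy, pick a host) can be sketched in a few lines of Python. This is not the API of any real workload manager; the class, method names, and the simple least-loaded policy are all illustrative assumptions.

```python
class WorkloadManager:
    """Toy workload manager: dispatches a job to the least-loaded
    resource allowed by a site policy (all names are illustrative)."""

    def __init__(self, max_load=1.0):
        self.max_load = max_load   # site policy: skip overloaded hosts
        self.resources = {}        # host -> current load, reported dynamically

    def report_load(self, host, load):
        # dynamic performance data collection (up-to-the-minute load)
        self.resources[host] = load

    def dispatch(self, job):
        # pick the least-loaded host below the policy threshold
        candidates = [(load, host) for host, load in self.resources.items()
                      if load < self.max_load]
        if not candidates:
            return None            # no eligible host: the job stays queued
        load, host = min(candidates)
        self.resources[host] = load + job["cost"]
        return host

wm = WorkloadManager(max_load=0.9)
wm.report_load("node-a", 0.7)
wm.report_load("node-b", 0.2)
print(wm.dispatch({"name": "render", "cost": 0.3}))  # -> node-b
```

A production system would add queue priorities, checkpointing, and policy negotiation, but the control loop (collect load data, apply policy, dispatch) is the same.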

2.2.2 Shared Services

Services are logical groupings of components that encapsulate behavior and data, accessible through a well-formed interface. Services are another way of providing general separation and compartmentalization by interacting with one another as peers. They provide scalability, ease of maintenance, transaction management, application partitioning, and the capability to add service instances as demand requires.

2.2.3 Stateless Business Components

Business services should be stateless to achieve scalability and to make transactional components reusable by many requesters. For example, a system that keeps multiple active component instances per user for long periods won't scale very far. Stateless business services mean two things:

- The state of the transaction must not be retained by service components across transaction boundaries.
- The state of the client must not be retained by the service components across transaction boundaries.

The way to achieve this is to store the transaction state in a database or other storage medium. For example, if a customer order is missing data, the incomplete order can be stored in a database for future retrieval and the object instance(s) manipulating that order destroyed. The result is that fewer resources are held for less time, which is the mission of a scalable system.

2.2.4 Client State Management

Managing state on a server without persisting it somewhere is expensive. As you increase the number of clients in your system, the resources used will grow out of control [Frankel03]. Effective choices for managing state that provide a reasonable programming model and scalability are the following:

- Store the client state on the server: This may sound expensive, but it's one of the consequences of having a limited programming model on the client. Although there's extra overhead associated with reading and writing this state from and to the server, the increased scalability will more than make up for it.
- Use a client-side API such as ActiveX to manage the state locally: If this approach conforms to your strategy, it can be an effective alternative.

2.2.5 Pooled Resources and Lightweight Server Processes

Resource pools are created when the overhead associated with creating a sharable resource is expensive. Pooled resources can be used to gain efficiencies when using sharable objects such as sockets, CORBA objects, database connections, and files. The idea behind pooling is to create the resource in advance and store it away so it can be reused [King02]. The behavior of a pooling framework is dictated by a set of policies such as the following:

- Load balancing: Uses optimization algorithms such as round-robin, least-used, or other selection schemes.
- Low-water/high-water: A minimum number of shared objects (low-water) is instantiated up front. On subsequent requests, new resources are created until the maximum (high-water) is reached.
- Blocking versus no-wait: Under a blocking policy, if no objects are available in the pool, the client waits for the first available resource, and an exception is thrown if a timeout is reached. A no-wait policy throws an error back immediately if no resources are available in the pool.
- Failed resource recovery: The framework attempts to recover pooled resources if a failure occurs, such as a network error, an I/O error, or a system crash.
- Recycler idiom: The framework attempts to repair (recycle) failed objects and return them to the pool. This avoids the cost of removing pooled objects and creating them from scratch.
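The low-water/high-water and blocking versus no-wait policies can be sketched as a small Python pool. The class, its thresholds, and the factory are illustrative, not taken from any particular pooling framework:

```python
import queue

class ResourcePool:
    """Toy pool with low-water/high-water and blocking vs. no-wait
    policies (all names are illustrative)."""

    def __init__(self, factory, low=2, high=4):
        self.factory, self.high = factory, high
        self.created = 0
        self.idle = queue.Queue()
        for _ in range(low):              # low-water: pre-create a minimum
            self.idle.put(factory())
            self.created += 1

    def acquire(self, block=True, timeout=None):
        try:
            return self.idle.get_nowait()
        except queue.Empty:
            if self.created < self.high:  # grow until high-water is reached
                self.created += 1
                return self.factory()
            if not block:                 # no-wait policy: fail immediately
                raise RuntimeError("pool exhausted")
            return self.idle.get(timeout=timeout)  # blocking policy

    def release(self, resource):
        self.idle.put(resource)

pool = ResourcePool(factory=lambda: object(), low=1, high=2)
a = pool.acquire()
b = pool.acquire()            # second object created on demand
pool.release(a)
print(pool.acquire() is a)    # -> True (recycled, not re-created)
```

Failed-resource recovery and the recycler idiom would hook into release(), validating or repairing an object before returning it to the idle queue.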

Lightweight server processes, also called server threads, are single sequential flows of control within a program. Threads are used to isolate tasks that require significant processor time to run. They are useful for doing two or more tasks at once. However, they are difficult to handle and present many implementation challenges, such as priority synchronization and life-cycle management.
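A minimal thread sketch in Python, showing the two recurring challenges the text mentions: synchronizing access to shared state and managing the thread life cycle. The task itself is a placeholder:

```python
import threading

# Two lightweight server processes (threads) run a task concurrently;
# a lock synchronizes access to the shared result list, and join()
# handles the life cycle by waiting for both threads to finish.
results = []
lock = threading.Lock()

def task(name):
    # the unit of work: record completion under the lock
    with lock:
        results.append(name)

threads = [threading.Thread(target=task, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # -> ['t1', 't2']
```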

2.2.6 Database Concurrency Maximization

Concurrency control deals with the issues involved in allowing multiple people simultaneous access to shared entities such as objects, data records, or some other representation. The two big issues with concurrency are maximizing the number of clients through the database without compromising data integrity, and long-term locking across transactions for single-threaded objects [Frankel03]. To deal with these issues, concurrency control implementations focus on collisions and transactions.
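One common collision-control technique, optimistic locking, can be illustrated with a per-row version number; the table, account key, and field names below are assumptions made up for the sketch:

```python
# Optimistic locking sketch: each row carries a version number; an update
# only succeeds if the version is unchanged since the row was read.
# Otherwise a collision is detected and reported to the caller.
table = {"acct-1": {"balance": 100, "version": 1}}

def update(key, new_balance, expected_version):
    row = table[key]
    if row["version"] != expected_version:
        return False                  # collision detected: let the caller retry
    row["balance"] = new_balance
    row["version"] += 1
    return True

# Two clients read version 1; only the first write wins.
print(update("acct-1", 80, expected_version=1))   # -> True
print(update("acct-1", 90, expected_version=1))   # -> False (collision)
```

A pessimistic scheme would instead lock the row for the duration of the transaction, avoiding the collision at the cost of throughput.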

2.2.6.1 Collisions and Transactions

A collision is said to occur when two activities, which may or may not be full-fledged transactions, attempt to change the same entities within a system of record. To deal with collisions, a database management system (DBMS) can take a pessimistic locking approach that avoids collisions but reduces system performance. Alternatively, it can use an optimistic locking strategy that detects collisions so they can be resolved, or an overly optimistic locking strategy that ignores the issue completely [Ambler04].

Transactions are usually divided into business transactions and online transactions. A business transaction is an interaction in the real world, usually between an enterprise and a person, in which something is exchanged. An online transaction is the execution of a program that performs an administrative or real-time function, often by accessing shared data sources, usually on behalf of an online user. An example of a transaction is a transfer of funds between two bank accounts: the transaction consists of debiting the source account, crediting the target account, and recording the facts. In modern software development projects, concurrency control and transactions are issues that affect all architectural tiers, including databases, middle tiers, and client tiers [Ambler04].

Another concept that is becoming increasingly important for the enterprise is the idea of coupling geographically distributed resources to solve large-scale problems. The management of these resources becomes complex when they are heterogeneous in nature and owned by different individuals within different organizations, each having their own resource management policies. In this scenario, an efficient resource management model is required. Grid computing provides new ideas in this field that promise to change the way enterprises deal with this issue. The next section discusses such models.

2.3 GRID-ENABLED RESOURCE MANAGEMENT MODELS

Choosing the right model for resource management plays a major role in coupling resource producers and consumers and, thus, in the success of your organization. There are a number of approaches to grid-enabled resource management. Among the most important are the following [ResMgrModels00]:

- Hierarchical Model
- Abstract Owner Model
- Market/Economy Model

2.3.1 Hierarchical

Hierarchical resource management introduces the idea of passive and active components. Passive components can be any of the following:

- Resources, which can be shared by owners who may charge others for using them. Examples of resources are CPU, RAM, hard disk, network bandwidth, and so forth.
- Resource consumers, also known as tasks. Tasks can be computational or noncomputational, such as file staging and communication.
- Jobs, which are logical groupings of one or more tasks.
- Schedulers, which are components that map jobs to resources over time.

Active components are the following:

- Information services used to describe resources, jobs, schedulers, and agents within the resource management system.
- Users that submit jobs for execution.
- Agents for the following: domain control (also known as local resource managers), which commits resources for use within a local domain; deployment, which implements schedules by negotiating with domain control agents to obtain resources and start tasks; admission control, which accepts or rejects jobs depending on system load; and job control, which coordinates between the different components of the resource management system. For example, agents can be used to monitor job coordination within a local scheduler.

These components may interact with each other in many ways. For example, when a user submits a job, it is intercepted by the job control and admission agents, which determine whether it is safe to add the job to the system's work pool. The job is then sent to the scheduler, which performs resource discovery and queries the domain control agents to determine resource availability. The scheduler then sends the job to the deployment agent, which negotiates for the required resources and schedule and obtains reservations for the resources. The reservations go back to the job control agent, which coordinates with the domain control agents to start the tasks. Monitors track the progress of the job and can reschedule it depending on load and performance [ResMgrModels00].
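The submission flow just described can be walked through with a toy Python sketch; the agents are reduced to plain functions, and every name and number is illustrative:

```python
# Toy walk-through of the hierarchical flow: admission control, then
# scheduling against available resources, then deployment/reservation.
# Component names and numbers are illustrative.
RESOURCES = {"cpu-pool": 4}     # domain control agent: 4 free CPUs
WORK_POOL = []                  # jobs admitted into the system

def admit(job, max_jobs=10):
    # admission control agent: reject jobs when the system is overloaded
    if len(WORK_POOL) >= max_jobs:
        return False
    WORK_POOL.append(job)
    return True

def schedule(job):
    # scheduler: resource discovery via the domain control agent
    if RESOURCES["cpu-pool"] >= job["cpus"]:
        return deploy(job)
    return None                 # not enough resources right now

def deploy(job):
    # deployment agent: obtain a reservation and start the tasks
    RESOURCES["cpu-pool"] -= job["cpus"]
    return {"job": job["name"], "reserved_cpus": job["cpus"]}

job = {"name": "simulate", "cpus": 2}
if admit(job):
    print(schedule(job))  # -> {'job': 'simulate', 'reserved_cpus': 2}
```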

2.3.2 Abstract Owner Model (AO)

This model is based on the notion of abstract owners of resources, also known as brokers. In reality, users don't care who owns a specific resource; they are interested in access, cost, and means of payment. The entity the user deals with may not be the owner of the resource but a broker, who may in turn deal with the owners, or perhaps with other brokers. In any case, the broker is an abstraction for all the owners of the resource: an abstract owner [ResMgrModels00].

At the foundation of this model, all resources, from individual processors and instruments to the grid itself, are assigned abstract owners tightly related to schedulers. Clients negotiate with abstract owners or brokers by sending resource objects through remote procedure calls. Some of the information sent by the client includes resource object attributes, negotiation style, pickup approach, authorization, and so forth. The negotiation style specifies whether the job is to be run immediately, or otherwise its status: pending, confirmed, canceled, and so on. The pickup approach specifies the protocol used between the AO and the client for output delivery. Authorization is a capability or key that allows the AO to determine the authority of the client to access resources [ResMgrModels00].

The AO model still requires more detailed work on functionality, for example, resource discovery between clients and brokers. Before an approach like AO has a likelihood of being accepted, it must address these challenges and should coexist with other contemporary approaches. Thus, it is important to understand how AOs and other systems can build upon or mimic each other.

2.3.3 Economy/Market Model

This model includes components of both the hierarchical and AO models described earlier. It works well when resources are geographically distributed and owned by different organizations, each with its own resource management model and pricing policies. It offers better incentives for resource owners to share resources by offering financial services for profit. This translates into return on investment for users and owners, enhancing and expanding computational services and effectively creating a market of supply and demand. In this market, resource owners want to maximize their return on investment, while resource users want to minimize their expenses [ResMgrModels00].

To achieve these goals, the economy model provides a resource management system with tools and services that allow both resource users and owners to express their requirements. Its basic components are user applications, the grid resource broker, grid middleware, and the local scheduler. The importance of market models for grid computing is that they tackle accounting and resource costs (for example, a price for idle processor cycles). This model is described in further detail in Computational Economy, later in this chapter.
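The supply-and-demand pricing at the heart of this model can be sketched with a one-line adjustment rule; the formula, rates, and equilibrium point are illustrative assumptions, not part of any published model:

```python
# Minimal supply-and-demand pricing sketch for idle CPU cycles: the
# resource owner raises the price when utilization (demand) is high and
# lowers it when cycles sit idle. Rates and formula are illustrative.
def next_price(current_price, utilization, sensitivity=0.5):
    # utilization in [0, 1]; 0.5 is taken as the equilibrium point
    return round(current_price * (1 + sensitivity * (utilization - 0.5)), 4)

price = 1.0
print(next_price(price, utilization=0.9))  # demand exceeds supply -> 1.2
print(next_price(price, utilization=0.1))  # idle capacity        -> 0.8
```

Iterating this rule over time lets the market converge toward a price at which supply matches demand, which is the behavior the model relies on.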

2.4 ENTER OPEN GRID SERVICES

Open grid services aim for the integration of services across distributed and heterogeneous virtual organizations with disparate resources and relationships. The Open Grid Services Architecture (OGSA) addresses these challenges by merging grid and Web Services technologies. The goals of OGSA are to provide standard mechanisms for creating, naming, and discovering transient grid service instances; to provide location transparency and multiple protocol bindings for service instances; and to support integration with native platforms. This allows for the creation of sophisticated distributed systems that handle lifetime management, reliable invocation, authentication, authorization, and delegation of services [GridPhysiology04].

In the OGSA world, a grid is an extensible set of services that may be aggregated in different ways to meet the needs of VOs, which in turn operate and share those services. A grid service defines functionality to handle resource allocation and management for secure, reliable service creation; information discovery through soft-state registration; and a security infrastructure with single sign-on, delegation, and credential mapping. OGSA assigns a set of well-defined standard interfaces for managing transient service instances. These interfaces address the following [GridPhysiology04]:

- Discovery: to discover available services and their characteristics
- Dynamic service creation: to create and manage new service instances
- Lifetime management: to reclaim services and resources associated with failed operations
- Notification: for asynchronous notification of changes in state among collections of dynamic, distributed services

A grid service also defines a set of service protocol bindings to address the following:

- Authentication: Allows the identity of individuals and services to be established for policy enforcement. This is implemented as a transport protocol for mutual authentication of client and service instance, as well as for the delegation of proxy credentials.
- Reliable invocation: Guarantees that a service has received a message. This provides the foundation for higher-level per-operation semantics, such as transactions.

The separation of standard interface functionality from protocol bindings increases the generality of the architecture without compromising its functionality. On the other hand, OGSA has received many critiques from the Web Services community. According to Foster and colleagues [WSRF04], OGSA is a bulky and complex architecture that is overly object-oriented and lacks support for the latest WSDL 2.0 extensions. These issues, coupled with emerging Web Services standards, have given rise to the Web Services Resource Framework specification.

2.4.1 Grids in the Enterprise

Adoption of grid computing in the enterprise remains modest. A research study conducted by Nucleus in 2003 interviewed Fortune 500 companies but found none that was using any form of grid technology. Furthermore, 80% of IT managers said they had no immediate plans to implement grids [GridEconomy]. However, awareness is growing; according to an Insight Research survey, 37% of respondents said they are planning to evaluate grid technologies. Investments in grid computing are expected to grow to US $4.9 billion in 2008, a 20-fold increase from US $250 million in 2003. The pace of grid computing adoption is expected to mimic the rate of Internet penetration within corporations. Enabling Grids for E-sciencE in Europe (EGEE), a project started in April 2004 with more than 70 institutions from 27 countries, reports that 10 of its members are from private industry. Other projects, such as the Singapore National Grid initiative, will be used in science, research, and education, as well as for commercial purposes (see Figure 2.1).

FIGURE 2.1 Economist Intelligence Unit (EIU) survey of grid computing penetration within Fortune 500 companies.

Business sectors with large computing needs, such as manufacturing and financial services, are among the early adopters. According to a 2003 study on the use of grids in the financial services market conducted by the Tabb Group, grid computing expenditures are expected to rise to US $683 million in 2008 from US $54 million in 2003, a compound annual growth rate of roughly 60% over five years [GridEIU04]. Other early adopters include energy, pharmaceuticals, aerospace, life sciences, telecommunications, automotive, and chemical companies. Example projects include seismic interpretation applications and financial-risk calculations in life insurance research. Growing corporate interest in grids is driven by the following:

- Business continuity improvements
- The ability to shift computing power easily
- Flexibility provided by breaking calculations among many machines
- Cost savings
- The ability to cope cost-effectively with demand spikes

A survey of IT managers at 177 Fortune 500 companies conducted by the Economist Intelligence Unit (EIU) indicates that cost savings top the list of advantages sought by corporations. Companies showed overall satisfaction; in the EIU survey, 84% said they were satisfied with grids, and 11% said they were extremely satisfied [GridEIU04] (see Figure 2.2).

FIGURE 2.2 Business areas where grid computing is expected to have the greatest impact. Source: Economist Intelligence Unit survey, March–April 2004.

2.4.2 Challenges Facing Grid Computing in the Enterprise

Harnessing unused CPU cycles can give corporations huge benefits and provide cost savings, too. Nevertheless, companies planning the move toward a grid-enabled infrastructure face many challenges:

- Changing IT environments without disrupting the business can be very challenging.
- Modifying applications to run on the grid may require significant hardware and software resources.
- Grid-enabled applications take more time to deploy than traditional applications.
- Complex monitoring as well as security and quality-assurance controls are required.
- Many mainstream enterprise applications aren't yet modified to run on a grid, and many grid vendors are small and unknown.

- Many companies are resistant to the changes grid computing will bring.
- There are concerns that the grid's promise and capabilities may be overhyped.
- Management may be reluctant to accept the advantages of a grid.

As with any new technology, there are many concerns to address, including security, business continuity, implementation costs, financial benefits, standardization, and market understanding. Furthermore, the potential for competing standards could leave those who've adopted a losing standard in the dust. Nearly three-quarters of the EIU survey respondents named lack of agreement or momentum on open standards as a significant roadblock to the commercial use of grid computing. According to this survey, corporate adoption of grid computing is still years off, but it ultimately will be driven by concrete business needs: cost savings, faster processing time, and greater resilience [GridEIU04].

2.4.3 The Jump to Web Services Resource Framework (WSRF)

In 2004, the WSRF was proposed as an evolution of the Open Grid Services Infrastructure (OGSI) to exploit new Web Services standards such as WS-Addressing. WSRF partitions OGSI functionality into five distinct specifications, along with the related WS-Notification family [WSRF04]:

- WS-ResourceProperties: Associates stateful resources and their properties with Web Services. It includes operations to retrieve, change, and delete resource properties.
- WS-ResourceLifetime: Creates or destroys a WS-Resource, either immediately or at a future time.
- WS-RenewableReferences: Retrieves a new endpoint reference to a service.
- WS-ServiceGroup: Manipulates collections of Web Services.
- WS-BaseFault: Describes base types for error reporting.
- WS-Notification: Uses publish/subscribe techniques for notification.

These specifications capture all of the functionality provided by OGSI, but do so in a way that integrates with evolving Web Services standards. In addition, the WSRF expresses the OGSI definitions in a way that is more consistent and more familiar to Web Services developers in general. Furthermore, the changes required to port OGSI services to WSRF are simple and straightforward [WSRF04].

Grid services evolved from a need for collaboration, data sharing, and other new modes of interaction that involve distributed resources, focusing on the interconnection of systems across enterprises. In addition, companies have realized that they can save costs by outsourcing elements of their IT environment to various service providers. These new requirements for distributed applications have led to the development of grid technologies, which have been successfully adopted in scientific and technical fields.

2.5 THE COMPETITIVE IT ORGANIZATION

All the factors mentioned in prior sections influence the way enterprises handle their computing needs, and they represent the challenges enterprises face when modeling business processes and managing mission-critical applications in networked computing environments. Some of the key factors for building a competitive IT organization include the following:

- Focus on people issues and challenges, organization structure, and processes.
- Build a cost-effective and competitive infrastructure with an infrastructure development life cycle (IDLC).
- Use key methodologies for implementing and supporting a global IT organization.
- Partner with the business and become part of the business rather than being apart from it.
- Recognize and communicate value to the enterprise.

The implementation of an ideal IT organization extends well beyond technology. Companies must ensure that their IT initiatives are closely aligned with their business objectives. And surprisingly, technology is the easy part. The key is taking a comprehensive approach that truly includes people, process, and organizational disciplines.

2.6 COMPUTATIONAL ECONOMY

Experts predict that global computational grids will drive the economy of the 21st century in much the same way that the electric power grid drove the economy of the 20th century. The need for an economy-driven resource management and scheduling system comes as the next logical step in the evolution of this new technology. Just as the idea of networking computers together gave rise to the World Wide Web a decade or so ago, so will grid computing make the leap into a global economic market. Whether the grid-enabled computational economy will become the next killer application remains to be seen. In the meantime, this section describes research work conducted on the following topics:

- A rationale for computational economy
- Resource management models for economy-driven grids
- A general discussion of the Grid Architecture for Computational Economy (GRACE) research project
- Information about market studies on the penetration of grids in the commercial world, including market data and projections

2.6.1 Why Computational Economy?

The transition of grid computing into the business world requires the development of a new information technology model: a new way of thinking, if you will. Such a way of thinking has been called computational economy. Computational economy expands on the concepts of resource management and scheduling. It deals with the coupling of heterogeneous resources distributed across various organizations and administrative domains [GridEconomy].

To date, the incentives for contributing resources toward building grids have been public good, prizes, fun, or fame. Public Internet computing research projects that harness idle desktops, such as SETI@home [SETI] and Distributed.net [DistNet04], have been motivated by prizes or collaborative advantage. The chances of gaining access to such computational grids for solving commercial problems are low. Furthermore, access to all other resources is not guaranteed (contributing your PC's CPU cycles to SETI@home will not give you access to the rest of the CPUs on the project). Some kind of collaboration is needed.

A business model that encourages resource owners to let others use their resources is computational economy. In such a model, users are charged for access at a variable rate, buying computing power on demand from computational grids or resource owners. A grid computing environment needs to support this economy of computations by allowing users and resource owners to maximize their profits (i.e., the owners want to earn more money, and the users want to solve their problems at the minimum possible cost). In a commercial computational grid, the resource owners act as sellers and the users act as buyers. The pricing of resources will be driven by supply and demand [GridEconomy]. According to the work of computational economy researchers, economy-based resource management provides the following benefits:

- Helps in building large-scale computational grids, because it motivates resource owners to contribute their idle resources for others to use and to profit from doing so.
- Provides a fair basis for access to grid resources for everyone.
- Helps in regulating demand and supply.
- Offers an incentive for users to back off when solving low-priority problems and, thus, encourages the solution of time-critical problems first.
- Offers uniform treatment of all resources. That is, it allows trading of everything, including computational power, memory, storage, network bandwidth and latency, and devices or instruments.
- Helps in developing scheduling policies that are user centric rather than system centric.
- Offers an efficient mechanism for allocation and management of resources.

Helps in building a highly scalable system because the decision-making process is distributed across all users and resource owners.
Places power with both resource owners and users because they can make their own decisions to maximize utility and profit.
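The variable, supply-and-demand pricing at the heart of this model can be sketched in a few lines. Everything below — the class name, the pricing formula, and the rates — is an illustrative assumption for this book, not part of any grid toolkit:

```java
// Hypothetical sketch of supply-and-demand pricing for grid resources.
public class SpotPricing {

    // Returns a per-CPU-hour price that rises as demand outstrips supply.
    // basePrice is the owner's minimum acceptable rate.
    public static double price(double basePrice, int pendingJobs, int idleCpus) {
        if (idleCpus <= 0) idleCpus = 1;                // avoid division by zero
        double demandRatio = (double) pendingJobs / idleCpus;
        return basePrice * Math.max(1.0, demandRatio);  // never sell below base
    }

    public static void main(String[] args) {
        // Light load: the price stays at the owner's base rate.
        System.out.println(price(0.10, 5, 10));
        // Heavy load: the price scales up with contention.
        System.out.println(price(0.10, 40, 10));
    }
}
```

The key property is the one the text describes: when demand is low, users pay the owner's floor price; when demand exceeds supply, the price rises, giving low-priority users an incentive to back off.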

2.6.2 Economy-Driven Grid Resource Management

Economy-driven grid resource management architectures define the following entities that interact with each other [GridEconomy]:

Resources: These are geographically distributed and owned by different individuals or organizations. They have their own access policies and cost mechanisms.
Resource owners: Resource owners manage resources using their own local schedulers and specific policies. They charge a variable price for resource usage.
Grid users: Grid users consume resources based on the policies established by the resource owners.

All these entities interact to achieve their goals (see Figure 2.3): resource owners by maximizing profits, and grid users by minimizing the cost required to run their applications. This architecture includes the following components for a resource management system:

FIGURE 2.3 Components of an economy-driven grid resource management system.

2.6.2.1 Meta Scheduler

The Meta Scheduler, also known as the Grid Resource Broker, provides a layer between users and resources via middleware services. Its main duties are resource selection and computation initiation. It provides the user a view of the grid as a single entity [GridEconomy01].

2.6.2.2 Grid Middleware

Grid middleware offers services that allow the Meta Scheduler to interact with the local resource schedulers. Among the services provided are resource allocation, data transfer, security, authorization and authentication, storage allocation, information, and resource reservation services. Software such as Globus [Globus05] implements many of these services.

2.6.2.3 Local Scheduler

Local schedulers are responsible for executing users' requests/programs on local resources. They obey a predefined set of policies imposed by resource owners and may offer access to local devices, hardware, and software. Examples of local schedulers are the Portable Batch System (PBS), Platform's Load Sharing Facility (LSF), Sun Grid Engine (SGE), and Condor.

2.6.2.4 User Applications

User applications are the set of programs to be executed by local schedulers on specific resources. Applications may be sequential if they execute a set of instructions one after the other, parametric if they take a set of input parameters, or parallel if they are capable of running instructions in parallel.
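A toy sketch shows how these pieces fit together: a broker selects among priced resources and hands the job to a stand-in for a local scheduler queue. The class and method names are hypothetical; in a real deployment the broker would go through Globus middleware and a scheduler such as PBS or SGE:

```java
import java.util.*;

// Illustrative sketch of the broker/resource/local-scheduler roles above.
// All names here are assumptions for this example, not Globus or GRACE APIs.
public class MetaSchedulerSketch {

    // A resource advertised by its owner, with a price and a local job queue.
    static class Resource {
        final String name;
        final double pricePerHour;
        final Deque<String> queue = new ArrayDeque<>(); // stands in for PBS/LSF/SGE/Condor

        Resource(String name, double pricePerHour) {
            this.name = name;
            this.pricePerHour = pricePerHour;
        }

        void submit(String job) { queue.add(job); }
    }

    // The broker gives the user a single view of the grid: here, the cheapest resource.
    static Resource select(List<Resource> grid) {
        return Collections.min(grid, Comparator.comparingDouble(r -> r.pricePerHour));
    }

    public static void main(String[] args) {
        List<Resource> grid = List.of(new Resource("clusterA", 0.30),
                                      new Resource("clusterB", 0.12));
        Resource chosen = select(grid);
        chosen.submit("render-frame-42"); // local scheduler now owns the job
        System.out.println("submitted to " + chosen.name);
    }
}
```

A real Meta Scheduler would weigh deadlines and budgets, not just price, but the division of labor is the same: the broker selects, the local scheduler executes.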

2.6.3 Resource Management Models

Resource management models are tightly coupled to the design of the resource scheduler. A resource scheduler, or simply scheduler, is a software component that accounts for the following:

Number of resources to manage
Location of those resources
Types of jobs and the computations to be performed

Common scheduling models include centralized, decentralized, and hierarchical.

2.6.3.1 Centralized

In this model, a centralized resource manager is capable of scheduling single or multiple resources on single or multiple domains. This is suitable for queuing systems and supports uniform policies. It is not, however, suitable for grid scheduling, because a grid scheduler is expected to honor the policies imposed by each resource owner.

2.6.3.2 Decentralized

In this model, multiple schedulers interact to decide which job should be assigned to a specific resource. There is no central scheduler. This model has the advantages of being highly scalable and fault tolerant. It has the disadvantages of being difficult to implement and of posing scheduling optimization problems. Nevertheless, it suits the grid computing model.

2.6.3.3 Hierarchical

This model is a combination of the centralized and decentralized models. A resource broker, or super scheduler, sits on top of local schedulers or resources. This broker allows remote resource owners to enforce their own policies on external users; thus, it is suitable for grid computing.

Grid-enabled economies such as GRACE build on these resource management models to bridge the gap between the scientific and business worlds.

2.6.4 Grid Architecture for Computational Economy (GRACE)

Low-cost, high-performance machines have transformed the Internet into a ubiquitous commodity communication medium. As grid computing moves from the scientific into the business field, the concept of computational economy is emerging to address the transition [GridEconomy01]. Scientists from the GRACE project have proposed the following requirements for an economy-driven computational grid framework:

Resource discovery, brokering, and an economy of computations
Cost-based scheduling mechanisms driven by user-supplied application deadlines and resource access budgets
Dynamic resource trading services to facilitate flexible application scheduling
A middleware infrastructure for trading resources to support dynamic scheduling capabilities

GRACE addresses these issues with economy-driven resource management features [GridEconomy]. GRACE provides services for trading resources dynamically and complements the resource management features provided by middleware such as the Globus Toolkit.

2.6.4.1 GRACE Components

The GRACE framework has been designed to support the computational economy by means of resource brokering and dynamic trading capabilities. GRACE works with middleware services such as Globus and may include the following components:

2.6.4.2 Trade Manager and Trade Server

The Trade Manager (TM) interacts with the Trade Server to provide access to resources at low cost. It uses resource selection algorithms to identify access costs. The Trade Server (TS) is an agent that negotiates with users and sells access to resources. It uses pricing algorithms to maximize resource utility and interacts with the accounting system to track resource usage.

2.6.4.3 Grid Trading Protocols

The trading protocols define the rules for exchanging messages between the TM and TS under the GRACE architecture. Messages can contain information such as a quote/bid, resource requirements, or a deal template (DT). The contents of a DT include CPU time units, expected usage duration, storage requirements, and so on.

2.6.4.4 Grid Open Trading APIs

The Trade APIs are a set of application programming interfaces provided by GRACE to support software development for the computational economy. Clients may use these functions to communicate with trade agents to establish connections, request resource quotes, and negotiate trades.
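As a rough illustration of how a TM and TS might haggle over a deal template, the sketch below pairs a DT with a simple concession loop. The pricing formula, concession schedule, and every name here are invented for this example; they are not the GRACE API:

```java
// Hypothetical TM/TS price negotiation in the spirit of GRACE-style trading.
public class TradeNegotiation {

    // Deal template: what the user wants to buy (fields follow the text above).
    static class DealTemplate {
        int cpuTimeUnits;
        int durationHours;
        int storageMb;

        DealTemplate(int cpu, int hours, int mb) {
            cpuTimeUnits = cpu;
            durationHours = hours;
            storageMb = mb;
        }
    }

    // Trade Server: quotes a price, conceding a little on each counter-offer.
    static double quote(DealTemplate dt, int round) {
        double base = dt.cpuTimeUnits * 0.05 + dt.storageMb * 0.001;
        return base * (1.20 - 0.05 * round); // start 20% above base, drop 5% per round
    }

    // Trade Manager: accepts the first quote within the user's budget.
    static double negotiate(DealTemplate dt, double budget, int maxRounds) {
        for (int round = 0; round < maxRounds; round++) {
            double offer = quote(dt, round);
            if (offer <= budget) return offer; // deal struck
        }
        return -1; // no deal within budget
    }

    public static void main(String[] args) {
        DealTemplate dt = new DealTemplate(100, 4, 500);
        System.out.println(negotiate(dt, 6.0, 4));
    }
}
```

The shape matters more than the numbers: the TS maximizes utility by starting high, the TM minimizes cost by holding out for a quote under budget, and either a deal is struck or the negotiation times out.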

2.7 SUMMARY

According to the EIU survey, a substantial number of IT managers expect grid computing to have a major impact on business around the world within the next decade. One-third of the respondents said they expect grid computing to have a significant impact on their industry in the next five years; 4% expect the impact to be massive [GridEIU04] (see Figure 2.4).

To gauge the potential impact of grids on the enterprise, one need only look back a decade or so. Those who remember how LANs developed in companies in the years before the Web was born can easily picture how grid computing in corporations may evolve in the years to come. In the early days of the Internet, there was strong opposition to linking computers together in a network. Ultimately, however, the Internet became a ubiquitous tool, and many experts predict the same outcome for grid computing in the enterprise. There are still concerns to consider and obstacles to overcome, but the momentum behind corporate grid computing is quickly gathering pace.

FIGURE 2.4 Survey of IT managers' beliefs about the impact of grids on their core business.

The integration of grid technologies into enterprise computing systems can provide a much richer range of possibilities. For example, grid services and protocols can be used to decouple hardware and software; portable adaptors could use the grid resource management protocol to access resources spread across a VO; or grid information services could be used by naming and trading services to query distributed information sources. This integration should provide the enhanced capabilities and interoperability needed to meet current VO demands.

REFERENCES

[Ambler04] Scott W. Ambler. "Concurrency Control." In Bringing Data Professionals and Application Developers Together. John Wiley & Sons, 2004, summary of Chapter 17.
[Condor04] Condor Team. Condor Version 6.6.6 Manual. University of Wisconsin–Madison. Accessed online July 28, 2004, at http://www.cs.wisc.edu/condor/manual/v6.6.6/.
[DistNet04] Distributed.Net Distributed Computing Network. Accessed online May 2004 at http://www.distributed.net/.
[Frankel03] David S. Frankel. Model Driven Architecture: Applying MDA to Enterprise Computing. John Wiley & Sons, 2003.
[Globus05] The Globus Alliance. University of Chicago. Accessed online January 2005 at http://www.globus.org/.
[GridAnatomy01] Ian Foster, Carl Kesselman, and Steven Tuecke. "The Anatomy of the Grid." International Journal of Supercomputer Applications, 15(3), 200–222 (2001). Available online at www.globus.org/research/papers/anatomy.pdf.
[GridEconomy] Rajkumar Buyya, David Abramson, and Jonathan Giddy. Economy Driven Resource Management Architecture for Global Computational Power Grids. School of Computer Science and Software Engineering, Monash University, and CRC for Enterprise Distributed Systems Technology, University of Queensland, Melbourne, Australia.
[GridEconomy01] Rajkumar Buyya, Jonathan Giddy, and David Abramson. An Evaluation of Economy-based Resource Trading and Scheduling on Computational Power Grids for Parameter Sweep Applications. School of Computer Science and Software Engineering, Monash University, and CRC for Enterprise Distributed Systems Technology, University of Queensland, Melbourne, Australia.
[GridEIU04] Grid Computing Corporate Prospects. The Economist Intelligence Unit survey, March–April 2004.
[GridPhysiology04] Ian Foster, Carl Kesselman, Steven Tuecke, and Jeffrey Nick. The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Argonne National Laboratory; University of Chicago; Information Sciences Institute, University of Southern California; IBM Corporation. http://www-unix.globus.org/ogsa/docs/alpha/physiology.pdf.
[King02] Paul King. "Object Resource Pooling." Object Computing, Inc. (OCI) Java News Brief. Available online at http://www.ociweb.com/jnb/jnbMar2002.html, March 2002.
[ResMgrModels00] Rajkumar Buyya, Steve Chapin, and David DiNucci. "Architectural Models for Resource Management in the Grid." Intl. Conf. on Parallel and Distributed Processing Techniques and Applications (PDPTA 2000), USA.
[SETI] The Search for Extraterrestrial Intelligence. SETI@home. Accessed online 2001 at http://setiathome.ssl.berkeley.edu/.
[SGEUserGuide04] N1 Grid Engine 6 User's Guide. Sun Microsystems, 2004.
[ShanRalph97] Yen-Ping Shan and Ralph H. Earle. Enterprise Computing with Objects: From Client/Server Environments to the Internet. Addison-Wesley Professional, 1997, Chs. 1, 3, 5.
[WSRF04] K. Czajkowski, D. Ferguson, I. Foster, J. Frey, S. Graham, and T. Maguire. From Open Grid Services Infrastructure to WS-Resource Framework: Refactoring & Evolution. Copyright Fujitsu Limited, International Business Machines Corporation, and The University of Chicago, 2003, 2004. All Rights Reserved. Available online at http://www.chinagrid.net/dvnews/upload/2005_04/05040200359561.pdf.

3 Core Grid Middleware

In This Chapter

P2P
Globus
Grid Computing and Business Technologies (GRIDBUS)
Summary
References

Grid middleware offers services that couple users with remote resources through resource brokers. The services offered include services for remote process management, co-allocation of resources, storage access, information, security, authentication, and quality of service (QoS), such as resource reservation and trading. Depending on the needs of your organization, different types of grid middleware are available. Among the most popular are peer-to-peer (P2P), Globus, and grid-economy architectures. This chapter discusses the most popular grid middleware available today, including:

A discussion of the characteristics and structure of peer-to-peer networks, as well as their benefits and drawbacks
The Globus Toolkit, which is emerging as the de facto standard for grid middleware
Grid Computing and Business Technologies (GRIDBUS), a framework that merges grid technology with business requirements


P2P goes beyond client-server sharing modalities and computational structures, and it has much in common with grid technologies. Nevertheless, P2P and the grid have not yet overlapped, because P2P vendors, in their quest for dominance over their competitors, have not defined common protocols that would allow for shared infrastructure and interoperability. Another disadvantage of P2P is that applications target limited sharing capabilities: for example, file sharing with no access control, or computational sharing that depends on a centralized server. Grid middleware such as Globus provides the following benefits:

Resource allocation and process management
Authentication and security services
Resource information and discovery services
Remote access to data via sequential and parallel interfaces
Advanced resource reservation

On the other hand, Globus lacks certain important features, such as a centralized management system for security credentials. Currently, grid security infrastructure (GSI) certificates are scattered all over the resource locations. Globus also lacks advanced resource reservation features to deliver quality of service. The Globus project has been working to address such drawbacks with projects such as the Community Authorization Service (CAS) and Advanced Reservation (GARA).

Moreover, the OGSA framework has faced intense criticism from the Web Services community, which claims that OGSA is too bulky (there is too much in the specification), that it does not work well with existing Web Services and XML tooling, that it is too object oriented, and that it lacks support for WSDL 2.0 capabilities. Current Globus efforts aim to move OGSA to the latest Web Services specification, the Web Services Resource Framework (WSRF), which simplifies and clarifies some ideas expressed in OGSA, such as transport-neutral mechanisms to address Web Services and mechanisms for obtaining information about a published service: its WSDL description, XML schema, and policy information.

An economy-based grid architecture expands these services by providing additional features, such as dynamic resource trading services. These services make it possible to trade for the best resources available based on price and performance, and to schedule computations on those resources to meet user requirements. This requires dynamic interaction between resource brokers and resource owners using standard, open protocols. The benefits of this economy are that it does the following:

Regulates demand and supply by providing a fair basis for resource access for everyone
Helps in building a large computational grid and encourages owners to contribute their resources

Helps in developing scheduling policies that are user centric rather than system centric
Offers uniform treatment of all resources by using efficient mechanisms for resource allocation and management

3.1 P2P

P2P gained popularity in 2001 with the birth of Napster's music community on the Web. P2P later became infamous through the music companies' lawsuits over digital rights management. Nevertheless, P2P is increasingly becoming an important technique in areas such as distributed and collaborative computing. It has received attention from many industrial partners such as Intel, Hewlett-Packard, Sony, and a number of startup companies. Open source P2P initiatives such as Sun's JXTA are leading the way, a number of academic events are dedicated to P2P, and many P2P projects are in progress at universities and research institutes.

P2P computing goes beyond the client-server architecture by using enhanced sharing modalities and computational structures. For this reason, P2P has much in common with grid technologies, although the domains have not overlapped. One reason is the lack of common P2P protocols that would allow shared infrastructure and interoperability, typical of a market where participants want to acquire a monopoly. Another is the lack of access control, and computational sharing that depends on centralized servers. As the need for interoperability and standards arises, we will see a strong convergence of P2P, Internet, and grid computing technologies. For example, single sign-on, delegation, and authorization may become important for resource-sharing interoperability as the access policies to those resources become more complex.

3.1.1 Definition

P2P can be defined as a set of software applications that consume distributed resources (computing power, content, network bandwidth, and the presence of computers, humans, and other resources) to perform functions in a decentralized manner. These functions may include distributed computing, data and content sharing, and communication and collaboration [P2PComputing02].

3.1.2 Advantages and Disadvantages

As with any software system, the goal of P2P is to support applications that satisfy the needs of users. According to Milojicic and colleagues [P2PComputing02], the technology offers the following advantages:

Low cost of interoperability by aggregating resources through decentralization
Low cost of ownership by using existing infrastructure and distributing maintenance costs
Privacy by allowing peers autonomous control over their data and resources

However, P2P raises the following concerns or disadvantages:

It is a relatively new technology, still in development, and lacks standards.
It raises legal and accountability concerns for businesses (for example, the music companies' lawsuits over digital rights management).
Networking issues: P2P applications drain so much bandwidth that most companies currently forbid their use within corporate intranets.

3.1.3 P2P Versus the Client-Server Model

The client-server model is built on a centralized server (single or a small cluster) and many clients, whereas P2P presents an alternative in which autonomous peers depend on other peers. Peers are not controlled by each other or by the same authority, and they depend on one another for information, resources, and the sending and receiving of requests.

3.1.3.1 Scalability and Reliability

The client-server model relies on a strong central authority (the server), which manages its clients. In contrast, P2P presents a self-organized model that has no concept of a server; rather, all participants are peers. A client-server application is organized hierarchically; it is usually static, preconfigured, and dependent on a server. P2P, on the other hand, resembles a mesh structure where all nodes can connect to each other. It is dynamic and ad hoc, and it relies on the independent lifetimes of its peers.

3.1.3.2 Interoperability

P2P creates a highly dynamic network of peers with a complex topology unrelated to the physical network. This topology defines an overlay network, created ad hoc, in which resources are aggregated or removed as peers connect to or disconnect from the network.

3.1.3.3 Autonomy and Privacy

P2P allows all data and work to be kept and performed locally, whereas the client-server model requires persistent connections for any kind of work. Because local nodes can do the work, P2P provides a higher degree of privacy; in the client-server model, the server is able to identify the client by IP address and other information.

3.1.3.4 Network Structure and Fault Tolerance

In P2P, nodes and other resources enter and leave the network continuously, effectively creating an ad hoc topology; this is a natural fit for P2P. The client-server model, on the other hand, implements a hierarchical model where clients connect to a central server and all traffic gets routed through the central node. This model has the disadvantage that if the central server goes down, so does the network. A P2P network is fault tolerant in the sense that if a node goes down, it gets replaced dynamically by the network itself.

3.1.4 P2P Architectures

A Survey of Peer-to-Peer File Sharing Technologies by Stephanos Androutsellis-Theotokis classifies P2P architectures by their degree of centralization and their network structure. Degree of centralization refers to the level at which peers rely on central servers for interaction. Network structure refers to how content is located relative to the network topology (e.g., preconfigured versus ad hoc networks). According to the degree of centralization, P2P architectures can be classified as hybrid decentralized, pure decentralized, and partially centralized [SurveyP2P02].

3.1.4.1 Hybrid Decentralized

This architecture is not really considered a true P2P system, because peers rely on a central server to store metadata about the users connected to the network and the files they share. The central server facilitates peer interaction by performing lookups and tracking the nodes on the network where files live. Once a specific file has been located on a node, the peers open a direct connection and proceed with the file transfer.

Hybrid systems have the advantage of being simple and able to perform searches quickly and efficiently. Major disadvantages include vulnerability to attacks and failures because of their centralized nature. Furthermore, hybrid systems are not scalable; they are limited by the size of their database and its query capacity. An example of a hybrid P2P system is Napster.
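The lookup flow of a hybrid system can be sketched as a central index that answers "who has this file?" while the transfer itself stays peer-to-peer. The class and method names below are illustrative only, not any real file-sharing protocol:

```java
import java.util.*;

// Minimal sketch of a hybrid (Napster-style) lookup: a central index maps
// file names to the peers hosting them; transfers then happen peer-to-peer.
public class CentralIndex {

    private final Map<String, List<String>> filesToPeers = new HashMap<>();

    // A peer registers the files it shares when it joins the network.
    void register(String peer, String file) {
        filesToPeers.computeIfAbsent(file, f -> new ArrayList<>()).add(peer);
    }

    // The lookup happens at the server; it returns peers to contact directly.
    List<String> locate(String file) {
        return filesToPeers.getOrDefault(file, List.of());
    }

    public static void main(String[] args) {
        CentralIndex index = new CentralIndex();
        index.register("peer-1", "song.mp3");
        index.register("peer-2", "song.mp3");
        System.out.println(index.locate("song.mp3")); // every peer hosting the file
    }
}
```

The sketch also makes the weaknesses concrete: the whole map lives on one server, so the server is both the single point of failure and the scalability bottleneck the text describes.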

3.1.4.2 Pure Decentralized

In the pure decentralized model, applications act as both clients and servers, with no centralization of the activities in the network. This model builds a virtual overlay network with its own routing mechanisms. An example of this model is the Gnutella network, which works as a distributed storage system, allowing users to specify the directories they want to share with their peers [SurveyP2P02].

Because of its unstructured nature, the pure decentralized model uses random searches when probing for resources over the network. Under this architecture, peers have no way of guessing where a specific file resides. For example, in a Gnutella network, messages are forwarded to all neighbors as soon as they are received, and, to avoid flooding the network, a Time to Live (TTL) value is attached to each message. The TTL is decreased on each node hop, and once a file has been identified on a target system, a direct transfer between the peers is initiated. This system has the disadvantage that the TTL value segments the network, creating a threshold beyond which messages cannot reach. This causes scalability problems, although alternative communication methods have been described. Another disadvantage is that significant network traffic is generated with each query. Proposed solutions to this problem include using parallel random walks and other node discovery methods [SurveyP2P02].
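The horizon effect of the TTL can be demonstrated with a small breadth-first simulation over a hypothetical chain of peers. Nothing below is Gnutella protocol code; it only illustrates how the TTL bounds a query's reach:

```java
import java.util.*;

// Sketch of TTL-bounded flooding: a query spreads one hop per TTL decrement,
// so peers beyond TTL hops never see it (the network "segmentation" above).
public class TtlFlood {

    // Returns the set of nodes a query starting at 'origin' can reach.
    static Set<Integer> reach(Map<Integer, List<Integer>> neighbors, int origin, int ttl) {
        Set<Integer> seen = new HashSet<>(List.of(origin));
        List<Integer> frontier = List.of(origin);
        for (int hop = 0; hop < ttl; hop++) {          // TTL decrements each hop
            List<Integer> next = new ArrayList<>();
            for (int node : frontier)
                for (int n : neighbors.getOrDefault(node, List.of()))
                    if (seen.add(n)) next.add(n);      // forward only to unseen peers
            frontier = next;
        }
        return seen;
    }

    public static void main(String[] args) {
        // A chain of 10 peers: 0 - 1 - 2 - ... - 9
        Map<Integer, List<Integer>> chain = new HashMap<>();
        for (int i = 0; i < 9; i++) {
            chain.computeIfAbsent(i, k -> new ArrayList<>()).add(i + 1);
            chain.computeIfAbsent(i + 1, k -> new ArrayList<>()).add(i);
        }
        // With TTL 4, the query never reaches peers more than 4 hops away.
        System.out.println(reach(chain, 0, 4));
    }
}
```

Raising the TTL widens the horizon but multiplies the per-query traffic, which is exactly the trade-off that motivates random walks and the structured networks described below.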

3.1.4.3 Partially Centralized

Partially centralized systems introduce the concept of super nodes: nodes in charge of servicing subsections of the peer network. Super nodes are selected based on bandwidth and CPU power, and their duties include indexing and caching files. When a client connects to the network, it queries a central server for a list of super nodes to connect to.

An advantage of this architecture is the reduction of node discovery times compared with pure decentralized systems. Another advantage is a high degree of fault tolerance, because super nodes can be dynamically replaced in case of system failures. Examples of this model include the popular Morpheus and Kazaa programs (see Figure 3.1).

FIGURE 3.1 Client-server and P2P architectures.

According to the network structure or topology, P2P systems can be classified as unstructured, structured, and loosely structured.

3.1.4.4 Unstructured

In unstructured networks, the placement of files is independent of the overlay topology. Because there is no central repository of resource information, clients have no way of knowing where a file resides; searches rely on random walks, generating a significant amount of traffic with each query. The advantage is that such networks easily accommodate transient node populations. The disadvantages are poor scalability and high query traffic.

3.1.4.5 Structured

Structured networks emerged to address the scalability issues of unstructured networks. In a structured network, resources such as files are closely tied to the network topology: distributed routing tables map files to their respective locations. The disadvantage of this model is the cost of maintaining the routing-table structure under transient node populations, where nodes join and leave at high rates [SurveyP2P02].
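The core idea, deterministically mapping a file to a node so that no search is needed, can be shown with a toy hash mapping. Real structured networks (distributed hash tables) use consistent hashing and per-node routing tables; this sketch conveys only the flavor:

```java
// Toy illustration of structured placement: a hash of the file name
// deterministically picks the node, so any peer can compute where a file
// lives without flooding the network. This is not a real DHT.
public class StructuredLookup {

    static int home(String file, int nodeCount) {
        // Math.floorMod keeps the result non-negative for negative hash codes.
        return Math.floorMod(file.hashCode(), nodeCount);
    }

    public static void main(String[] args) {
        int nodes = 8;
        // Every peer computes the same answer, with no query traffic at all.
        System.out.println("paper.pdf lives on node " + home("paper.pdf", nodes));
        System.out.println("song.mp3 lives on node " + home("song.mp3", nodes));
    }
}
```

The maintenance cost the text mentions also falls out of this sketch: if `nodeCount` changes because peers join or leave, most files map to new homes, which is why real systems use consistent hashing to limit that churn.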

3.1.4.6 Loosely Structured

Loosely structured networks sit between the previous two models. They use a mix of routing tables and random walks to locate files over the network. An example of this model is the open source Freenet project.

3.2 GLOBUS

The Globus Toolkit (GT) (www.globustoolkit.org) is one of the middleware software products that helped launch grid computing onto center stage. GT's origins date to late 1994, when the director of the mathematics and computer science division at Argonne National Laboratory and the director of the Electronic Visualization Laboratory at the University of Illinois, Chicago, proposed establishing temporary links between 11 high-speed research networks to create a national grid (the "I-WAY") during the Supercomputing '95 conference.

This successful experiment led to funding from the Defense Advanced Research Projects Agency (DARPA), and 1997 saw the first version of the Globus Toolkit, which was soon deployed across 80 sites worldwide. The U.S. Department of Energy (DOE) pioneered the application of grids to science research, the National Science Foundation (NSF) funded the creation of the National Technology Grid to connect university scientists with high-end computers, and NASA started similar work on its Information Power Grid.

The Globus Toolkit has been called the de facto standard for grid computing by the New York Times. In 2002, the project earned a prestigious R&D 100 award, given by R&D Magazine in a ceremony where the Globus Toolkit was named "Most Promising New Technology" among the year's top 100 innovations. In recent months, the Globus Toolkit has seen widespread adoption in commercial settings as well as in the scientific community. Since 2000, companies like Avaki, DataSynapse, Entropia, Fujitsu, Hewlett-Packard, IBM, NEC, Oracle, Platform, Sun, and United Devices have pursued grid strategies based on the Globus Toolkit. The project has spurred a revolution in the way science is conducted. Nevertheless, Globus is committed to preserving the open source, nonprofit ideology on which the project was built, while seeding commercial grids based on open standards.

3.3 GRID COMPUTING AND BUSINESS TECHNOLOGIES (GRIDBUS)

GRIDBUS (www.gridbus.org) merges grid technology with business requirements. The project was started by the Grid Computing and Distributed Systems (GRIDS) Laboratory at the University of Melbourne in Australia. The GRIDBUS project team is developing middleware, tools, and applications that deliver end-to-end quality of service based on user requirements. Its main components are the following:

Economic Grid Scheduler: A local scheduler that leases services on distributed resources depending on their availability, performance, cost, and users' quality-of-service requirements
Cluster Scheduler (Libra): An economy-based scheduler for clusters
GridSim: A toolkit for modeling and simulation
Data Grid broker: A component for scheduling distributed applications across multiplatform (Windows/Unix) grid resources
GridBank: An accounting, authentication, and payment management infrastructure
GUI tools for workflow management and composition

3.4 SUMMARY

Grid middleware helps in building a highly scalable system because the decision-making process is distributed across all users and resource owners. This places the power in the hands of both resource owners and users, so they can make their own decisions to achieve the goals of their organizations.

When thinking about grid middleware, think of applications and quality of service rather than frameworks. If you find yourself with the task of building a grid within your organization, ask yourself: which already-available application can take advantage of resource sharing and quality of service? Is it possible to build a component or service to coordinate those resources? Is user authorization/authentication a requirement? What about open standards and protocols? Once you identify such an application, the most difficult part is over. Thinking only of frameworks, on the other hand, could leave your organization with many interconnected resources and services, but nothing to run on them.

REFERENCES

[P2PComputing02] D. Milojicic, V. Kalogeraki, R. Lukose, K. Nagaraja, J. Pruyne, B. Richard, S. Rollins, and Z. Xu. Peer-to-Peer Computing. HP Laboratories Palo Alto, HPL-2002-57. Accessed online March 8, 2002, at http://www.cs.umbc.edu/~pmundur/courses/CMSC691M-04/p2p-survey.pdf.
[SurveyP2P02] Stephanos Androutsellis-Theotokis. "A Survey of Peer-to-Peer File Sharing Technologies." Athens University of Economics and Business, Greece, 2002. Accessed online at http://duch.mimuw.edu.pl/~alx/ask/androutsellis-theoto02survey.pdf.

Part II Grid Middleware

This part covers a comprehensive description of the components of a grid-enabled framework. There is plenty of source code and figures, including the following:

Grid Portal Development: Chapter 4 describes software for grid portal environments. Source code and applications are included for two of the main grid protocols: resource management and data management. The software is written for two popular portal servers: Apache Jetspeed and IBM WebSphere Portal Server. This chapter includes plenty of source code, figures, and troubleshooting tips.
Schedulers: Schedulers are at the foundation of any grid system. Their job is to schedule programs or jobs in clusters of machines, among other duties. Chapter 5 describes development for many open source schedulers, including OpenPBS, Sun Grid Engine (SGE), Condor, and others. This chapter also includes integration concepts for the Managed Job Factory Service (MJS) provided by the Globus Toolkit, plus troubleshooting tips.
Open Grid Services Architecture (OGSA): Open Grid Services Architecture is the middle-tier software that glues grid clients and scheduler services together. Chapter 6 provides a comprehensive overview of OGSA, including service models, interfaces, factories, lifetime management, service discovery, notifications, and higher-level services.

The chapter also describes the relationship between grid services and Web Services within a virtual organization (VO) and the role the Globus Toolkit plays in that scenario. Software described in this chapter includes a grid service for large integer factorization using a quadratic sieve.

4 Grid Portal Development

In This Chapter

Resource Management
Data Management
Portlets
Troubleshooting
Summary
References

This chapter provides information on development of portals that use the APIs provided by the Globus Toolkit. All source code provided here is available from the companion CD-ROM along with installation and deployment information: source code, binaries, read-me files, and so on. Topics in this chapter include the following:

Resource management (Globus Resource Allocation Manager [GRAM]) portlets for Apache Jetspeed and IBM WebSphere Portal Server (WPS)
Data management portlets implementing all major data transfer protocols: GridFTP, Globus Access to Secondary Storage (GASS), and HTTPS
An extensive troubleshooting section on installation, deployment, and testing of all the code implemented in this chapter

To run the programs in this chapter, the following software is required:

WebSphere Studio Application Developer 5.1.x with the Portal Toolkit, or Apache Tomcat with the Jetspeed portal engine
Globus Toolkit version 2.x or 3.2.x


Installation Instructions

IBM WPS: Import the folder CH04_WPSPortlets into your workspace directory, or create a new portlet project and copy the files to your favorite location; then right-click the project name and select Run on Server. Make sure the Portal Toolkit is properly installed and the imported project is a portlet project. All the files should be visible within your project workspace.

Apache Jetspeed: Make sure Tomcat with Jetspeed is properly installed on your system, then follow the instructions provided in the Jetspeed sections of this chapter on copying the required files to the proper locations.

Portals are important components in grid computing that provide an open, standards-based user interface to grid middleware. An enterprise portal consists of middleware, applications (called portlets), and development tools for building and managing business-to-business (B2B), business-to-consumer (B2C), and business-to-employee (B2E) portals. A portal is a Web site that provides users with a single point of access to Web-based resources by aggregating those resources in one place and by requiring that users log in only to the portal itself, and not to each portlet they use.

An enterprise portal makes network resources such as applications and databases available to end users. The user can access the portal via a Web browser, Wireless Application Protocol (WAP) phone, pager, or any other device. The portal acts as the central hub where information from multiple sources is made available in an easy-to-use manner [IBMWPS04].

One of the advantages of a portal is that the data presented is independent of content type. This means that content from XML, RSS, or SMTP can be integrated with the portal itself. The actual presentation of the data is handled via Extensible Stylesheet Language Transformations (XSLT) and delivered to the user via a combination of Java Server Pages (JSPs) and HTML. An enterprise portal provides tools for both portal developers and user interface designers that facilitate building XML portlets and content syndication.

4.1 RESOURCE MANAGEMENT

The following section explores a series of JSP portlets for the GRAM protocol, used for remote job submission against a remote node on a grid. Portlets for two popular platforms are implemented: Apache Jetspeed and IBM WebSphere Portal Server.

4.1.1 Apache Jetspeed

Jetspeed is an open source implementation of an Enterprise Information Portal using Java and XML. It supports the following features, as described in Apache Portals—Jetspeed [JetSpeed04]:

Template-based layouts, including JSP and Velocity
Remote XML content feeds via Open Content Syndication
Custom configuration
Database user authentication
Rich Site Summary (RSS) support for syndicated content
Wireless Markup Language (WML) support
XML-based configuration registry of portlets
Full Web Application Archive (WAR) support
Profiler Service to access portal pages based on user, groups, roles, access control lists (ACLs), media types, and language
Skins that allow users to choose colors and display attributes
Customizer, which allows users to select portlets and define layouts for individual pages
User, group, role, and permission administration via Jetspeed security portlets
Role-based security access to portlets

4.1.2 A Resource Management JSP Portlet for Jetspeed

A JSP portlet in Jetspeed is straightforward and requires the following files:

A JSP implementation of the portlet itself (gram.jsp). This file must be placed in the Jetspeed portlets directory [JETSPEED_HOME]\WEB-INF\templates\jsp\portlets\html.
A portlet registry descriptor (gram.xreg), which contains portlet metadata information such as name, implementation class, and so on. This file must be placed in the Jetspeed configuration directory [JETSPEED_HOME]\WEB-INF\conf.
GRAM client implementation Java classes in the portlet binaries directory [JETSPEED_HOME]\WEB-INF\classes.

4.1.2.1 GRAM Client UML Diagram

Our first JSP GRAM portlet requires a set of Java clients to access resources via the Globus GRAM protocol. The client programs presented here include the following functionality:

Support for the GRAM2 protocol from the Globus Toolkit 2.2.x using Resource Specification Language (RSL) templates

Support for the GRAM3 Master Managed Job Factory Service (MMJFS) using XML-based templates
Support for batch and nonbatch job submission modes, dry-run submissions, and staging of executables

The following Unified Modeling Language (UML) diagram depicts the relationships between these client programs, implemented as four classes in the Java language (see Figure 4.1).

FIGURE 4.1 UML diagram of the Java clients used by the GRAM portlets.

4.1.2.2 GRAM Client Programs

The programs in the following sections assume familiarity with the Globus Toolkit resource management protocol as well as terminology such as RSL and the Globus command-line interface (CLI), specifically the inner workings of the globusrun command. Further information is available at the Globus project Web site.

The client code is implemented in two straightforward Java classes, one per protocol version, with common functionality implemented in an abstract base class. The abstract program shown in Listing 4.1 implements the common functionality.

LISTING 4.1 AbstractGRAMJob.java—Abstract Java Class That Implements Common Functionality for the GRAM2 and GRAM3 Client Programs

package gram.client;

/**
 * Title: AbstractGRAMJob
 * Description: Abstract class for the GRAM 2.x/3.x protocols
 * @author Vladimir Silva
 * @version 1.0
 */

import org.globus.io.gass.server.*;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public abstract class AbstractGRAMJob
    // listen for job output from the GASS server
    implements JobOutputListener {

    private static Log logger =
        LogFactory.getLog(AbstractGRAMJob.class.getName());

    // GASS server: required to get job output
    GassServer m_gassServer;

    // URL of the GASS server
    String m_gassURL = null;

    // Job output variables:
    // Used for non-batch mode jobs to receive output from the
    // gatekeeper through the GASS server
    JobOutputStream m_stdoutStream = null;
    JobOutputStream m_stderrStream = null;

    // job output as a string
    StringBuffer m_jobOutput = new StringBuffer("");

    // Submission modes: batch = do not wait for output
    boolean m_batch = false;

    // Globus job id of the form:
    // https://server.com:39374/15621/1021382777/

    public AbstractGRAMJob() {
    }

    /**
     * Called whenever the job's output has been updated.
     *
     * @param output new output
     */
    public void outputChanged(String output) {
        m_jobOutput.append(output + "\n");
    }

    /**
     * Called whenever the job finishes
     * and no more output will be generated.
     */
    public void outputClosed() {
    }

    void notifyThreads() {
        synchronized (this) {
            notify();
        }
    }

    public abstract String Submit(String rsl);

    /**
     * Start the GASS server.
     * Used to transfer data between the MJFS and the GRAM client.
     */
    boolean startGASSserver(int gassOptions) {
        logger.debug("Starting the GASS server.");
        try {
            m_gassServer = new GassServer();
            if (gassOptions != 0) {
                m_gassServer.setOptions(gassOptions);
            }
            m_gassServer.registerDefaultDeactivator();
            m_gassURL = m_gassServer.getURL();

            // Listen for GASS stdout/stderr output
            initJobOutListeners();
        } catch (Exception e) {
            logger.error(
                "Exception while starting the GASS server: ", e);
            return false;
        }
        logger.info(m_gassServer.toString());
        return true;
    }

    /**
     * Initialize job output listeners for non-batch mode jobs.
     */
    void initJobOutListeners() throws Exception {
        if (m_stdoutStream != null) {
            return;
        }
        // job output vars
        m_stdoutStream = new JobOutputStream(this);
        m_stderrStream = new JobOutputStream(this);

        m_gassServer.registerJobOutputStream("err", m_stderrStream);
        m_gassServer.registerJobOutputStream("out", m_stdoutStream);

        return;
    }

    public String getGassUrl() {
        return m_gassServer.getURL();
    }
}

Listing 4.1 defines the Java class AbstractGRAMJob with common methods used by both GRAM2 and GRAM3 protocols. When the protocol negotiation starts, the GASS service is required. GASS allows transferring job information back and forth between client and server. The GASS server can be started as follows:

GassServer m_gassServer = new GassServer();

// A deactivator is used for GASS shutdown
m_gassServer.registerDefaultDeactivator();

The next step is to enable the client to listen for job output (initJobOutListeners). For example, the following code will listen for standard output and error from the server:

JobOutputStream m_stdoutStream = new JobOutputStream(this);
JobOutputStream m_stderrStream = new JobOutputStream(this);

// Register stdout/stderr with GASS
m_gassServer.registerJobOutputStream("err", m_stderrStream);
m_gassServer.registerJobOutputStream("out", m_stdoutStream);

To allow AbstractGRAMJob to listen for server output, the interface JobOutputListener must be implemented. This interface defines the methods

public void outputChanged(String output)
public void outputClosed()

These methods will be called whenever the output changes or is closed by the server, thus allowing the client to receive it. The next code listing defines the class Gram2Job, which extends AbstractGRAMJob and submits a job against GT2.
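The coordination between the listener callbacks and the blocked submitting thread is the subtle part of these clients: the callback runs on a notification thread and must wake the thread parked in Submit(). The following self-contained sketch shows the same wait/notify pattern with no Globus dependencies; the class NotifyDemo and its string statuses are illustrative stand-ins, not part of the Globus API.

```java
// Minimal sketch of the wait/notify handshake used by the GRAM clients.
public class NotifyDemo {
    private final StringBuffer output = new StringBuffer();
    private boolean done = false;

    // Plays the role of statusChanged(): invoked from another thread
    public synchronized void statusChanged(String status) {
        output.append("status=" + status + "\n");
        if (status.equals("DONE")) {
            done = true;
            notifyAll(); // wake the thread blocked in submit()
        }
    }

    // Plays the role of Submit(): blocks until the job reports DONE
    public synchronized String submit() throws InterruptedException {
        while (!done) { // loop guards against spurious wakeups
            wait();
        }
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        final NotifyDemo client = new NotifyDemo();
        Thread notifier = new Thread(new Runnable() {
            public void run() {
                client.statusChanged("ACTIVE");
                client.statusChanged("DONE");
            }
        });
        notifier.start();
        System.out.println(client.submit().trim());
    }
}
```

Note the while loop around wait(): checking the done flag rather than waiting unconditionally means the handshake still works if the notification arrives before the submitter starts waiting.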

4.1.2.3 GRAM2 Protocol Client Program

The GRAM2 client program (see Listing 4.2) acts essentially as a globusrun command in GT2. It supports most of the functionality of the globusrun command, including the following:

Job submission modes: batch versus nonbatch
RSL
Staging of the executable and standard input files

LISTING 4.2 Gram2Job.java—Client for the GT 2.x GRAM Protocol

package gram.client;

/**
 * Title: Gram2Job
 * Description: Job submission client program for GT 2.x
 * @author Vladimir Silva
 * @version 1.0
 */
import org.globus.gram.*;
import org.globus.io.gass.server.*;
import org.globus.util.deactivator.Deactivator;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
 * GT2 - GRAM2 job submission client program
 */
public class Gram2Job extends AbstractGRAMJob
    // listen for job status messages
    implements GramJobListener {

    private static Log logger =
        LogFactory.getLog(Gram2Job.class.getName());

    private static String USAGE_POST =
        " [GRAM Contact] [RSL String]\n"
        + "Example:\n"
        + "\tGRAM Contact"
        + " => 192.168.74.131:32779:/O=Grid/OU=GlobusTest/"
        + "OU=simpleCA-vm-rhl8-2..com/OU=ibm.com/Cn=Globus"
        + "\n\tRSL String => &(executable=/bin/date)(directory=/bin)";

    private static String USAGE = "GRAMClient" + USAGE_POST;

    // GRAM job to be executed
    private GramJob m_job = null;

    // host where the job will run
    private String m_remoteHost = null;

The Java class Gram2Job implements the interface GramJobListener, which listens for job status changes; for example, if the job fails or completes. The listener fires the method statusChanged(GramJob job), where logic must be inserted to handle the different job statuses. In this particular instance, if the job succeeds, the waiting thread is notified so that execution continues. If the job fails, an error message is appended to the output:

    /**
     * This method is used to notify the caller when the status of a
     * GramJob has changed.
     *
     * @param job The GramJob whose status has changed.
     */
    public void statusChanged(GramJob job) {
        try {
            if (job.getStatus() == GramJob.STATUS_DONE) {
                if (m_batch) {
                    m_jobOutput.append(
                        "Job sent. url=" + job.getIDAsString());
                }
                // notify the waiting thread when the job is ready;
                // if notify is enabled, return the URL as output
                this.notifyThreads();
            }

            if (job.getStatus() == GramJob.STATUS_FAILED) {
                m_jobOutput.append("GRAM Job Failed because "
                    + GramException.getMessage(job.getError())
                    + ". (error code " + job.getError() + ")");
                this.notifyThreads();
            }
        } catch (Exception ex) {
            System.out.println(
                "statusChanged Error:" + ex.getMessage());
        }
    }

The class constructor is in charge of initialization. It requires a contact string, which defines the remote GT2 gatekeeper. The contact can include a server name or IP address as well as a port number and a credential subject. For example, the contact string

foo.com:32779:/O=Grid/OU=GlobusTest/Cn=Globus

will contact the GT2 gatekeeper running at server foo.com on port 32779 using the credential subject /O=Grid/OU=GlobusTest/Cn=Globus. The port and credential information are optional arguments.
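Since the port and subject are optional, a client has to split the contact string defensively. The helper below is purely illustrative (ContactParser is not part of the Globus API); it assumes the subject itself contains no colons, and it falls back to 2119, the customary GT2 gatekeeper port, when no port is given.

```java
// Illustrative only: splits a GT2 contact string of the form
// host[:port[:/subject]] into its three parts.
public class ContactParser {
    public static String[] parse(String contact) {
        // limit 3 keeps any extra colons out of the first two fields
        String[] parts = contact.split(":", 3);
        String host = parts[0];
        // 2119 is the conventional gatekeeper port (assumption)
        String port = parts.length > 1 ? parts[1] : "2119";
        String subject = parts.length > 2 ? parts[2] : "";
        return new String[] { host, port, subject };
    }

    public static void main(String[] args) {
        String[] full = parse("foo.com:32779:/O=Grid/OU=GlobusTest/Cn=Globus");
        System.out.println(full[0] + "|" + full[1] + "|" + full[2]);

        String[] bare = parse("foo.com"); // port and subject omitted
        System.out.println(bare[0] + "|" + bare[1] + "|" + bare[2]);
    }
}
```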

    /**
     * Constructor
     * @param Contact remote host,
     *   example: "192.168.74.131:32779:
     *   /O=Grid/OU=GlobusTest
     *   /OU=simpleCA-vm-rhl8-2.ibm.com
     *   /OU=ibm.com/Cn=Globus"
     * @param batch job submission mode
     * @throws GramException
     */
    public Gram2Job(String Contact, boolean batch)
        throws GramException {
        USAGE = this.getClass().getName() + USAGE_POST;

        // remote host
        m_remoteHost = Contact;

        // submission mode
        m_batch = batch;

        // Start the GASS server
        if (!super.startGASSserver(0)) {
            throw new GramException("Unable to start GASS server.");
        }
    }

The heavy lifting of Gram2Job is implemented in the Submit method. This method should at least perform the following tasks:

Initialize the output listeners.
Format the RSL string for submission to the server according to the submission mode. Submission modes can be batch or nonbatch. In batch mode, the client submits the job and returns immediately with a job ID; this mode is useful for long-lived jobs. A nonbatch submission waits for the server to complete the job execution, thus returning the output.
Create an instance of a GramJob and submit the request.
Wait for the job to complete if nonbatch mode is used.

    /**
     * Submit the GRAM job
     * @param RSL example: &(executable=/bin/date)(directory=/bin)
     */
    public synchronized String Submit(String RSL) {
        try {
            // set up job output listeners
            initJobOutListeners();

            // Append the GASS URL to the job string
            // so we can get some output back
            String newRSL = null;

            // if non-batch, then get some output back
            if (!m_batch) {
                newRSL = "&" + RSL.substring(0, RSL.indexOf('&'))
                    + "(rsl_substitution=(GLOBUSRUN_GASS_URL "
                    + m_gassURL + "))"
                    + RSL.substring(RSL.indexOf('&') + 1, RSL.length())
                    + "(stdout=$(GLOBUSRUN_GASS_URL)/dev/stdout)"
                    + "(stderr=$(GLOBUSRUN_GASS_URL)/dev/stderr)";
            } else {
                // format batch RSL to retrieve output
                // later on using any GTK commands
                newRSL = RSL
                    + "(stdout=x-gass-cache://$(GLOBUS_GRAM_JOB_CONTACT)stdout anExtraTag)"
                    + "(stderr=x-gass-cache://$(GLOBUS_GRAM_JOB_CONTACT)stderr anExtraTag)";
            }

            logger.info("RSL:" + newRSL);
            m_job = new GramJob(newRSL);

            // if non-batch, then listen for output
            if (!m_batch) {
                m_job.addListener(this);
            }

            logger.debug("Sending job request to: " + m_remoteHost
                + " batch=" + m_batch);

            m_job.request(m_remoteHost, m_batch, false);

            // Wait for the job to complete
            if (!m_batch) {
                synchronized (this) {
                    try {
                        logger.debug("Waiting for output...");
                        wait();
                    } catch (InterruptedException e) {
                    }
                }
            } else {
                // do not wait for the job; return immediately
                m_jobOutput.append(m_job.getIDAsString() + "\n");
            }
        } catch (Exception ex) {
            if (m_gassServer != null) {
                m_gassServer.unregisterJobOutputStream("err");
                m_gassServer.unregisterJobOutputStream("out");
            }
            m_jobOutput.append("Error submitting job: "
                + ex.getClass() + ":" + ex.getMessage());
        }
        // cleanup
        Deactivator.deactivateAll();
        return m_jobOutput.toString();
    }
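The non-batch branch of Submit rewrites the user's RSL to splice in a GASS URL substitution plus stdout/stderr redirection clauses. That string surgery can be exercised in isolation; the sketch below mirrors the same splice (RslRewrite is my name for it, and the GASS URL is a dummy value), under the assumption the RSL begins with '&' as in the chapter's examples.

```java
// Stand-alone reproduction of the non-batch RSL rewrite in Submit().
public class RslRewrite {
    public static String redirect(String rsl, String gassUrl) {
        int amp = rsl.indexOf('&');
        // splice in the substitution after '&', then append redirections
        return "&" + rsl.substring(0, amp)
            + "(rsl_substitution=(GLOBUSRUN_GASS_URL " + gassUrl + "))"
            + rsl.substring(amp + 1)
            + "(stdout=$(GLOBUSRUN_GASS_URL)/dev/stdout)"
            + "(stderr=$(GLOBUSRUN_GASS_URL)/dev/stderr)";
    }

    public static void main(String[] args) {
        System.out.println(redirect(
            "&(executable=/bin/date)(directory=/bin)",
            "https://client:1234")); // dummy GASS URL
    }
}
```

Running this prints the rewritten RSL: the substitution clause first, the original relations unchanged, and the two redirection clauses appended, which is exactly the shape the gatekeeper receives in non-batch mode.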

Finally, the main method parses the command-line arguments and submits the job. For example, the following call submits a date command to a remote GT2 gatekeeper:

    /**
     * Main. Sample call:
     * java gram.client.Gram2Job gt2.acme.com "&(executable=/bin/date)"
     */
    public static void main(String[] args) {
        // Sample job
        // String RSL = "&(executable=/bin/date)(directory=/bin)";
        // String GRAMContact = "192.168.74.131:32779:
        //   /O=Grid/OU=GlobusTest
        //   /OU=simpleCA-vm-rhl8-2.ibm.com
        //   /OU=ibm.com/Cn=Globus";

        if (args.length != 2) {
            System.err.println(USAGE);
        } else {
            try {
                // args[0] = GRAM contact
                // args[1] = RSL string
                Gram2Job Job1 = new Gram2Job(args[0], false);
                System.out.println(Job1.Submit(args[1]));
            } catch (Exception ex) {
                System.err.println("Error:" + ex.getClass()
                    + ":" + ex.getMessage());
            }
        }
    }
    // end class Gram2Job
}

4.1.2.4 GRAM3-MMJFS Protocol Client Program

The GRAM3 protocol has many similarities with its version 2 counterpart. Both use the GASS API to transfer job output between the client and server. Both also implement the same interfaces to listen for server events. However, job information in GT3 is represented as XML. The program in Listing 4.3 implements a GRAM client for the MMJFS for the Globus Toolkit 3.x.

LISTING 4.3 Gram3Job.java—A Client for the MMJFS Protocol

package gram.client;

/**
 * Title: Custom GRAM client for GT 3.x
 * @author Vladimir Silva
 * @version 1.0
 */
import java.io.*;
import java.net.URL;

import org.globus.io.gass.server.*;
import org.globus.gram.GramException;

import org.globus.ogsa.impl.base.gram.client.*;
import org.globus.ogsa.base.gram.types.JobStateType;
import org.globus.ogsa.impl.security.authentication.Constants;
import org.globus.ogsa.impl.security.authorization.Authorization;
import org.globus.ogsa.impl.security.authorization.HostAuthorization;
import org.globus.ogsa.impl.security.authorization.IdentityAuthorization;
import org.globus.ogsa.impl.security.authorization.SelfAuthorization;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
 * GRAM MMJFS client Java program.
 * Sample call:
 * java gram.client.Gram3Job
 *   -factory http://192.168.74.131:8080/ogsa/
 *     services/base/gram/MasterForkManagedJobFactoryService
 *   -file C:/GridBookSource/GRAMClients/etc/echo.xml -o
 */

public class Gram3Job
    // listen for job status messages
    extends AbstractGRAMJob
    implements GramJobListener {

    private String _USAGE = "-factory -file [options] "
        + "\nOptions:"
        + "\n\t-help Help"
        + "\n\t-auth Authorization: "
        + "[host,self,(identity)e.g. /ou=ACME/o=globus/cn=John Doe"
        + "\n\t-o Redirect output from the GASS server to the GRAM client"
        + "\n\t-sec 'sig' for XML signature, 'enc' for XML encryption"
        + "\n\t-b Submit in 'batch' mode. Mutually exclusive with -o"
        + "\n\t-quiet No verbose messages";

    /**
     * The variable name to use in input RSL when referring to the URL
     * of the local GASS server, if started.
     */
    public static final String GLOBUSRUN_GASS_URL = "GLOBUSRUN_GASS_URL";

    private static Log logger =
        LogFactory.getLog(Gram3Job.class.getName());

    boolean quiet = false;
    private boolean done = false;

    // private GassServer m_gassServer;
    private GramJob job;

    // Command-line arguments
    private Authorization _authorization =
        SelfAuthorization.getInstance();

    private Integer _xmlSecurity = Constants.SIGNATURE;

    // Redirect output from GASS; must be true to receive GASS output.
    // Batch mode and output redirection are mutually exclusive.
    private boolean _redirOutput = true;
    private boolean _batch = false;

    private int _gassOptions = 0;
    private int _timeout = GramJob.DEFAULT_TIMEOUT;
    private String _factory;   // MJFS service factory URL
    private String _rslString; // RSL file path

    // Security
    public static final int SECURITY_ENC = 0;
    public static final int SECURITY_SIG = 1;

    // Authorization
    public static final int AUTH_SELF = 2;
    public static final int AUTH_HOST = 3;

    /**
     * Callback as a GramJobListener.
     * Will not be called in batch mode.
     */
    public void statusChanged(GramJob job) {
        String stringStatus = job.getStatusAsString();
        if (!quiet) {
            System.out.println("====== Status Notification ======");
            System.out.println("Status: " + stringStatus);
            System.out.println("=================================");
        }

        if (stringStatus.equals(JobStateType._Failed)) {
            m_jobOutput.append("GRAM Job Failed because "
                + job.getFault().getDescription(0));
        }

        synchronized (this) {
            if (stringStatus.equals(JobStateType._Done)
                // this class should be hidden from this layer?
                || stringStatus.equals(JobStateType._Failed)) {
                this.done = true;
                notifyAll();
            }
        }
    }

    /**
     * Called whenever the job's output has been updated.
     *
     * @param output new output
     */
    public void outputChanged(String output) {
        m_jobOutput.append(output);
    }

    /**
     * Called whenever the job finishes
     * and no more output will be generated.
     */
    public void outputClosed() {
        //System.out.println("outputClosed");
    }

    /**
     * Return the output from the GRAM job submission
     */
    public String getJobOutput() {
        return m_jobOutput.toString();
    }

As with Gram2Job, Gram3Job extends AbstractGRAMJob and implements the GramJobListener interface to listen for job status messages. The job submission modes apply to both GT2 and GT3. The next method, submitRSL, is very different: in GT3, RSL is represented as XML, whereas in GT2, RSL is represented as plain text. This method must also deal with OGSA authentication and XML security, which allows for XML encryption.

Dealing with Security

It's important for custom clients to handle OGSA features such as authorization and XML security. Authorization can be one of host, self, or identity. XML security defines the way XML messages are protected: XML signature or XML encryption. For example, to set authorization:

Authorization authorization = SelfAuthorization.getInstance();
GramJob job = new GramJob();
job.setAuthorization(authorization);

Integer xmlSecurity = Constants.SIGNATURE;
job.setMessageProtectionType(xmlSecurity);

Apart from these differences, the submit method works the same way as in the GT2 version: the submission mode is checked, the GASS server started, a request sent, and the output handled appropriately.

    /**
     * Constructor
     */
    public Gram3Job(String[] args) throws Exception {
        if (parseArgs(args)) {
            submitRSL(_factory, _rslString, _authorization,
                _xmlSecurity, _batch, _gassOptions,
                _redirOutput, _timeout);
        }
    }

    /**
     * Constructor
     * @param factory service factory,
     *   example: http://192.168.74.131:8080/ogsa/services/base
     *     /gram/MasterForkManagedJobFactoryService
     * @param batch submission mode
     * @throws Exception
     */
    public Gram3Job(String factory, boolean batch) throws Exception {
        _factory = factory;
        _batch = batch;
        _redirOutput = !batch;
    }

    /**
     * Submit the MMJFS job to the back-end server
     * @param RSL job XML string
     */
    public String Submit(String RSL) {
        submitRSL(_factory, RSL, _authorization, _xmlSecurity,
            _batch, _gassOptions, _redirOutput, _timeout);
        return m_jobOutput.toString();
    }

    /**
     * GRAM submission routine
     * @param factoryUrl URL of the MJS factory
     * @param rslString RSL string as XML
     * @param authorization host, self, or identity
     * @param xmlSecurity sig for signature, enc for encryption
     * @param batchMode batch submission (don't wait for the job)
     * @param gassOptions GASS server options
     * @param redirOutput redirect GASS output to the client
     * @param timeout job timeout in milliseconds
     */
    void submitRSL(String factoryUrl, String rslString,
        Authorization authorization, Integer xmlSecurity,
        boolean batchMode, int gassOptions,
        boolean redirOutput, int timeout) {

        _batch = batchMode;
        // In a multi-job, -batch is not allowed. Dryrun is.
        if (batchMode) {
            _verbose("Warning: Will not wait for job completion, "
                + "and will not destroy job service.");
        }

        this.job = new GramJob(rslString);
        job.setTimeOut(timeout);
        job.setAuthorization(authorization);
        job.setMessageProtectionType(xmlSecurity);

        // now must ensure the JVM will exit on an uncaught exception:
        try {
            startGASSserver(gassOptions);

            // check isSubstitutionReferenced?
            // move XML string building to GramJobAttributes
            // or JobAttributesImpl
            String xmlValue = "";
            job.setSubstitutionDefinition(GLOBUSRUN_GASS_URL, xmlValue);
            // check isSubstitutionDefined

            if (redirOutput) {
                // add output path to GASS redirection
                _verbose("Adding output path to GASS redirection");
                job.addStdoutPath(GLOBUSRUN_GASS_URL, "/dev/stdout");
                job.addStderrPath(GLOBUSRUN_GASS_URL, "/dev/stderr");
            }

            processJob(job, factoryUrl, _batch);
        } catch (Exception e) {
            cleanup();
            logger.error("Caught exception: ", e);
            m_jobOutput.append("Job Failed:" + e.getClass()
                + ":" + e.getMessage());
        }
    }

    /**
     * processJob: send the actual GRAM request
     */
    private void processJob(GramJob job, String factoryUrl,
        boolean batch) throws GramException {
        try {
            logger.debug("Factory URL " + factoryUrl);
            job.request(new URL(factoryUrl), batch);
        } catch (Exception e) {
            throw new GramException(
                "error submitting job request: " + e.getMessage());
        }

        if (batch) {
            _verbose("CREATED MANAGED JOB SERVICE WITH HANDLE: "
                + job.getHandle());
            m_jobOutput.append(job.getHandle());
        } else {
            _verbose("Added job listener (this).");
            job.addListener(this);
        }

        try {
            logger.debug("Starting job");
            job.start(); // leave this free to interrupt?
        } catch (GramException ge) {
            logger.error("Error starting job", ge);
            throw new GramException(
                "Error starting job: " + ge.getMessage());
        }

        logger.debug("Job has started.");

        if (!batch) {
            _verbose("WAITING FOR JOB TO FINISH");
            synchronized (this) {
                while (!this.done) {
                    try {
                        wait();
                    } catch (InterruptedException ie) {
                        logger.error(
                            "interrupted waiting for job to finish", ie);
                    }
                }
            }
            if (this.job.getStatusAsString()
                .equals(JobStateType._Failed)) {
                cleanup();
                throw new GramException(
                    "Job Fault detected: " + this.job.getFault());
            }
        }

        // GRAM submission complete. Cleanup: stop the GASS server
        cleanup();
    }

    /**
     * Set job security
     * @param security
     */
    public void setSecurity(int security) {
        switch (security) {
            case SECURITY_ENC:
                _xmlSecurity = Constants.ENCRYPTION;
                break;
            case SECURITY_SIG:
                _xmlSecurity = Constants.SIGNATURE;
                break;
            default:
                break;
        }
    }

    /**
     * Set job authorization
     * @param auth
     */
    public void setAuthorization(int auth) {
        switch (auth) {
            case AUTH_HOST:
                _authorization = HostAuthorization.getInstance();
                break;
            case AUTH_SELF:
                _authorization = SelfAuthorization.getInstance();
                break;
            default:
                break;
        }
    }

    /**
     * Cleanup: shut down the GASS server
     */
    private void cleanup() {
        if (m_gassServer != null) {
            m_gassServer.shutdown();
        }
    }

Finally, this class can be called with the following arguments:

$ java gram.client.Gram3Job
    -factory http://192.168.74.131:8080/ogsa/services/base/gram/MasterForkManagedJobFactoryService
    -file echo.xml -o

The file echo.xml describes the job arguments. It defines a simple echo command that redirects stdout and stderr to local files in the user's home directory; like the rest of the source in this chapter, it is included on the companion CD-ROM.


The final section of the Gram3Job.java client program from Listing 4.3 defines three utility methods that parse the program arguments, read a file from the filesystem, and display debug output.

    /**
     * Parse command-line args
     */
    private boolean parseArgs(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println(_USAGE);
            return false;
        }
        for (int i = 0; i < args.length; i++) {
            if (args[i].equalsIgnoreCase("-help")) {
                System.err.println(_USAGE);
                return false;
            }
            if (args[i].equalsIgnoreCase("-factory")) {
                _factory = args[++i];
            } else if (args[i].equalsIgnoreCase("-file")) {
                _rslString = readFile(args[++i]);
            } else if (args[i].equalsIgnoreCase("-o")) {
                _redirOutput = true;
                _batch = false;
            } else if (args[i].equalsIgnoreCase("-b")) {
                _redirOutput = false;
                _batch = true;
            } else if (args[i].equalsIgnoreCase("-sec")) {
                if (args[++i].equalsIgnoreCase("sig")) {
                    _xmlSecurity = Constants.SIGNATURE;
                } else if (args[i].equalsIgnoreCase("enc")) {
                    _xmlSecurity = Constants.ENCRYPTION;
                } else {
                    throw new Exception(
                        "Invalid xml security arg:" + args[i]);
                }
            } else if (args[i].equalsIgnoreCase("-auth")) {
                if (args[++i].equalsIgnoreCase("host")) {
                    _authorization = HostAuthorization.getInstance();
                } else if (args[i].equalsIgnoreCase("self")) {
                    _authorization = SelfAuthorization.getInstance();
                } else {
                    _authorization = new IdentityAuthorization(args[i]);
                }
            } else if (args[i].equalsIgnoreCase("-quiet")) {
                quiet = true;
            } else {
                throw new Exception("Invalid arg:" + args[i]);
            }
        }
        return true;
    }

    /**
     * Read a text file from disk
     * @param path
     * @return the file contents
     * @throws IOException
     * @throws FileNotFoundException
     */
    private static String readFile(String path)
        throws IOException, FileNotFoundException {
        RandomAccessFile f = new java.io.RandomAccessFile(path, "r");
        try {
            byte[] data = new byte[(int) f.length()];
            f.readFully(data);
            return new String(data);
        } finally {
            // release the file descriptor
            f.close();
        }
    }

    private void _verbose(String text) {
        if (!quiet) {
            System.out.println(
                this.getClass().getName() + ": " + text);
        }
    }

    // End Gram3Job.java
}

4.1.2.5 Generating RSL and Job XML with the Velocity Template Engine

A tedious task when writing a user interface to a GRAM client is generating the RSL script or the XML required for job submission. A template engine such as Velocity (http://jakarta.apache.org/velocity/) can provide great assistance. Velocity allows for generation of files from templates on the fly by encoding parameters dynamically and manipulating those parameters within the template. The result: a simple RSL or XML template can generate submission files for both versions of the GRAM protocol (see Listing 4.4).

LISTING 4.4 A Velocity Template to Generate an RSL Script

## ----------------------------------------------------------------
## Velocity template for the Resource Specification Language (RSL)
## http://jakarta.apache.org/velocity/
## Statements begin with a single (#)
## Comments with (##)
## ----------------------------------------------------------------
#if ( $t.get("stage:executable") && $t.get("stage:executable").booleanValue() )
&(executable=$(GLOBUSRUN_GASS_URL) # "$t.get("gram:executable")")
#else
&(executable="$t.get("gram:executable")")
#end
##
## Arguments
##
#if ( $t.get("gram:arguments") && !$t.get("gram:arguments").equals("") )
(arguments = "$t.get("gram:arguments")")
#end
##
## Directory
#if ( $t.get("gram:directory") && !$t.get("gram:directory").equals("") )
(directory = $t.get("gram:directory"))
#end
##
## Environment
#if ( $t.get("gram:environment") && !$t.get("gram:environment").equals("") )
(environment = $t.get("gram:environment"))
#end
##
## Standard input (STDIN)
#if ( $t.get("gram:stdin") && !$t.get("gram:stdin").equals("") )
#if ( $t.get("stage:stdin").booleanValue() )
(stdin = $(GLOBUSRUN_GASS_URL) # $t.get("gram:stdin"))
#else
(stdin = $t.get("gram:stdin"))
#end
#end
##
## STDOUT
##
#if ( $t.get("gram:stdout") && !$t.get("gram:stdout").equals("") )
(stdout = $t.get("gram:stdout"))
#end
##
## STDERR
##
#if ( $t.get("gram:stderr") && !$t.get("gram:stderr").equals("") )
(stderr = $t.get("gram:stderr"))
#end

The template in Listing 4.4 can generate an RSL script compatible with the Globus Toolkit 2.2.x using simple Java code such as that shown in Listing 4.5.

LISTING 4.5 Code for Loading the Template from Listing 4.4

/**
 * Build an RSL string by loading a Velocity template.
 * @param params Job parameters. For example:
 *   params.put("gram:executable", "/bin/echo");
 *   params.put("stage:executable", new Boolean(true));
 *   params.put("gram:directory", "/bin");
 *   params.put("gram:arguments", "Hello World");
 *   params.put("gram:stdin", "/dev/null");
 *   params.put("stage:stdin", new Boolean(true));
 *   params.put("gram:stdout", "/dev/null");
 *   params.put("gram:stderr", "/dev/null");
 * @return GT2 RSL string
 */
public static String buildRSL(Hashtable params) throws Exception {
    return loadVelocityTemplate(params, "templates/rsl.vm")
        .replace('\r', ' ')
        .trim();
}

/**
 * Load a Velocity template.
 * The engine must be set up to load templates from the class path.
 * By default it loads from the filesystem, not the class path.
 * @param params Hash table of parameters passed to the template
 * @param name Template file name
 * @return String representation of the parsed template
 * @throws Exception if an error occurs
 */
public static String loadVelocityTemplate(Hashtable params, String name)
    throws Exception {
    VelocityContext context = new VelocityContext();
    context.put("t", params);

    Template template = null;
    StringWriter sw = new StringWriter();

    // Load the Velocity template
    template = Velocity.getTemplate(name);

    template.merge(context, sw);
    sw.close();

    return sw.toString();
}
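The replace('\r', ' ') and trim() calls in buildRSL() are there because Velocity preserves the template's line endings and any blank lines left behind by directive lines. A self-contained sketch of the effect (the input string and class name here are made up for illustration):

```java
public class CrStripDemo {
    // Same normalization buildRSL() applies to the raw template output
    static String normalize(String raw) {
        return raw.replace('\r', ' ').trim();
    }

    public static void main(String[] args) {
        // Simulated Velocity output: Windows line endings plus a leading
        // blank line left over from the template's directive lines
        String raw = "\r\n&(executable=\"/bin/echo\")\r\n(directory = /bin)\r\n";
        System.out.println(normalize(raw));
    }
}
```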

For the previous code to work, Velocity must be configured to load templates from the class path. By default, Velocity loads files from the filesystem, so some initialization code is required (see Listing 4.6).

LISTING 4.6 Velocity Class Path Resource Loader Initialization

static {
    // Initialize the Velocity class path resource loader
    Properties props = new Properties();
    props.put("resource.loader", "class");
    props.put("class.resource.loader.description",
        "Velocity Classpath Resource Loader");
    props.put("class.resource.loader.class",
        "org.apache.velocity.runtime.resource.loader."
            + "ClasspathResourceLoader");

    try {
        Velocity.init(props);
    } catch (Exception ex) {
        System.err.println("Unable to load Velocity class resource loader: "
            + ex.getMessage());
    }
}

Once all the pieces are in place, RSL scripts can be generated quite simply, as shown in Listing 4.7.

LISTING 4.7 RSL Sample Script Generation Using Velocity and Sample Output

/*
 * Sample program output:
 *   &(executable="/bin/echo")
 *   (directory = /bin)
 */
public static void main(String[] args) {
    Hashtable params = new Hashtable();
    params.put("gram:executable", "/bin/echo");
    params.put("gram:directory", "/bin");

    try {
        System.out.println(buildRSL(params));
    } catch (Exception e) {
        System.err.println(e);
    }
}

The same technique is used to generate XML for jobs using the GRAM3-MMJFS protocol, and both GRAM portlets implemented throughout this chapter rely on it. At the time of this writing, the version of the Velocity engine used from the Apache Jakarta project is 1.4.
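To make the template's staging conditional concrete, here is a self-contained plain-Java sketch that mirrors the #if logic of Listing 4.4 for the executable clause (illustrative only; the real code renders the Velocity template, and the class and method names here are made up):

```java
import java.util.Hashtable;

public class RslSketch {
    // Mirrors the template's #if on "stage:executable"; staged binaries
    // are prefixed with the GASS URL so the server can fetch them
    static String executableClause(Hashtable t) {
        String exe = (String) t.get("gram:executable");
        Boolean stage = (Boolean) t.get("stage:executable");
        if (stage != null && stage.booleanValue()) {
            return "&(executable=$(GLOBUSRUN_GASS_URL) # \"" + exe + "\")";
        }
        return "&(executable=\"" + exe + "\")";
    }

    public static void main(String[] args) {
        Hashtable t = new Hashtable();
        t.put("gram:executable", "/bin/echo");
        t.put("stage:executable", Boolean.TRUE);
        // prints: &(executable=$(GLOBUSRUN_GASS_URL) # "/bin/echo")
        System.out.println(executableClause(t));
    }
}
```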

4.1.2.6 Portlet View JSP

With the client programs in place, we can work on the user interface of the portlet. A JSP portlet in Jetspeed is very easy to implement. It requires two files to be placed in the proper locations:

A portlet registry descriptor (xreg) (see Listing 4.8): This file contains portlet arguments and metadata such as the display name, implementation class, and others. The xreg file must be placed in the folder [JETSPEED_HOME]\WEB-INF\conf.
A portlet JSP: This file contains the actual user interface as a JavaServer Pages (JSP) file.

LISTING 4.8 Jetspeed GRAM Portlet Registry Descriptor (xreg)

The user interface JSP (see Listing 4.9) reads job information such as the server contact/factory, executable, arguments, directory, protocol type (GRAM2 or MMJFS), and options such as security, authentication, and others. This file must be placed in the folder [JETSPEED_HOME]\WEB-INF\templates\jsp\portlets\html.

LISTING 4.9 Jetspeed GRAM Portlet JSP

<%@ page import="java.util.*" %>
<%@ page import="gram.client.*" %>
<%@ page import="gram.client.utils.*" %>

<%@ taglib uri='/WEB-INF/templates/jsp/tld/template.tld' prefix='jetspeed' %>

<%
String jspeid = (String) request.getAttribute("js_peid");
String action = (String) request.getParameter("gram_action");
String error = null;
String output = null;

if (action != null) {
    /*
     * Call the GRAM client here...
     */
    String factory = request.getParameter("FACTORY");
    String proto = request.getParameter("PROTOCOL");
    String exe = request.getParameter("EXE");
    String args = request.getParameter("ARGS");
    String dir = request.getParameter("DIR");
    String stdin = request.getParameter("STDIN");
    String stdout = request.getParameter("STDOUT");
    String stderr = request.getParameter("STDERR");
    String batch = request.getParameter("BATCH");
    String dryRun = request.getParameter("DRY");
    String sec = request.getParameter("SECURITY");
    String auth = request.getParameter("AUTH");

    String rsl = null;

For the Web developer, the code is straightforward. It defines an HTML user interface with the arguments for job submission. For example, to extract the executable argument from the servlet request, use the following:

String exe = request.getParameter("EXE");

A nice feature of Listing 4.9 is the ability to build the RSL (GT2) or XML (GT3) using a Velocity template. This avoids having to build the string manually, which can be very complicated in GT3. By setting a simple hash table with the submission arguments, the XML can be built easily.

Hashtable params = new Hashtable();
params.put("gram:count", new Integer(2));
params.put("gram:hostCount", new Integer(1));
params.put("gram:project", "Foo Project");
params.put("gram:jobType", "single");

// Build GT3 GRAM XML using an Apache Velocity template
String XML = GramClientUtil.buildGramXML(params);

The rest of the portlet code is straightforward HTML that builds the user interface with the required arguments. See Figures 4.2 and 4.3 to view this code in action.

// Job parameters
Hashtable params = new Hashtable();

params.put("gram:executable", exe);
params.put("gram:directory", dir);
params.put("gram:arguments", args);
params.put("gram:stdin", stdin);
params.put("gram:stdout", stdout);
params.put("gram:stderr", stderr);

try {
    if (proto.equals("GRAM2")) {
        // GRAM 2
        rsl = GramClientUtil.buildRSL(params);

        if (dryRun == null) {
            Gram2Job job = new Gram2Job(factory, batch != null);
            output = job.Submit(rsl);
        } else {
            output = rsl;
        }
    } else {
        // GRAM3 - MMJFS
        params.put("schema.loc",
            GramClientUtil.GRAM3_SCHEMA_LOCATION.replace('\\', '/'));

        //params.put("gram:count", new Integer(2));
        //params.put("gram:hostCount", new Integer(1));
        //params.put("gram:project", "Foo Project");
        //params.put("gram:jobType", "single");

        rsl = GramClientUtil.buildGramXML(params);

        if (dryRun == null) {
            Gram3Job job = new Gram3Job(factory, batch != null);

            job.setSecurity(Integer.parseInt(sec));
            job.setAuthorization(Integer.parseInt(auth));

            output = job.Submit(rsl);
        } else {
            output = rsl;
        }
    }
} catch (Exception e) {
    System.err.println(e);
    output = error = e.getClass() + ":" + e.getMessage();
}
}
%>


Globus GRAM Portlet client compatible with GT 2.4.x and GT 3.2.x

<% if (error != null) {%> <%= error%> <% }%>

Gram2 sample    Gram3 sample


The user interface of the portlet defines the job submission arguments for both GT2 and GT3. Common arguments are a remote executable, standard input, output, and error, and the protocol type. Other options include the submission mode (batch/nonbatch), authorization, and security.


Factory/GRAM Contact
Protocol
Executable
Arguments
Directory
StdIn
StdOut
StdErr
Job options
Batch mode      Dry run (return rsl only)
Security (GT3)
Authorization (GT3)

<% if (output != null) {%> Job Output
<% } %>

For a preview of the portlet in Jetspeed 1.5, see Figures 4.2 and 4.3.

FIGURE 4.2 GRAM portlet in Jetspeed 1.5.
FIGURE 4.3 MMJFS client portlet for Jetspeed 1.5.

4.1.3 WebSphere® Portal Server v5.0 (WPS)

WebSphere Portal is IBM's infrastructure software for dynamic e-business. WebSphere Portal can deliver Web content to WAP-enabled devices, i-Mode phones, and various Web browsers. WPS aims to deliver the following key features [IBMWPS04]:

Versatile framework: Allows users to define specific sets of applications presented in a single context. Applications are rendered depending on the requesting device.
Customization: Provides user and administrative portlets to customize look and feel, layout, and personalization.
Portlets: Allows portlets, which are small portal applications shown as small boxes in a Web page, to be developed, deployed, managed, and displayed independently, allowing for Web page customization.
Content management: Allows for large-scale deployment, search, and delivery of personalized content. WPS supports syndicated content and integration with Web content management systems.
Application integration: Provides access to content, data, and services that are located throughout the enterprise.
Security: Provides administration of users and groups, third-party authentication, single sign-on (SSO), and a credential vault.
Collaboration: Allows people in the organization to work together and share information to achieve their business goals.

4.1.4 IBM Portlet API

From the viewer's perspective, a portlet is a window on a portal site that provides a specific service or piece of information, for example, a stock quote feed or weather information. Portlets are a special type of servlet: the abstract portlet class extends HTTPServlet, and all portlets extend this abstract class. IBM is working with other companies to standardize the Portlet API, making portlets interoperable between portal servers that implement the specification [IBMWPS04].

4.1.4.1 Deployment Descriptor

The deployment descriptor (portlet.xml) (see Listing 4.10) defines the portlet's capabilities to the portal server, including configuration parameters and general information that all portlets provide.

LISTING 4.10 Sample WPS portlet Deployment Descriptor

<?xml version="1.0" encoding="UTF-8"?>
<portlet-app-def>
   <portlet-app uid="sample.portlet.app.1">
      <portlet-app-name>My portlet application</portlet-app-name>
      <portlet id="Portlet_1" href="WEB-INF/web.xml#Servlet_1">
         <portlet-name>My portlet</portlet-name>
      </portlet>
   </portlet-app>
   <concrete-portlet-app uid="sample.portlet.app.1.1">
      <portlet-app-name>My concrete portlet application</portlet-app-name>
      <concrete-portlet href="#Portlet_1">
         <portlet-name>My concrete portlet</portlet-name>
         <default-locale>en</default-locale>
         <language locale="en">
            <title>My portlet</title>
         </language>
      </concrete-portlet>
   </concrete-portlet-app>
</portlet-app-def>

4.1.4.2 Portlet Life Cycle

The abstract portlet class defines the following life cycle methods:

init(): Called after construction and initialization. The portal instantiates only one instance of the portlet, and this instance is shared among all users, the same way a servlet is shared among all users.
initConcrete(): Called after construction and before the portlet is accessed for the first time.
service(): Called when the portlet renders its content. It is typically called many times.

destroyConcrete(): Used to take the portlet out of service, for example, when an administrator deletes a concrete portlet from the server.
destroy(): Called when the portal is terminating; it destroys the portlet and releases resources back to the system.
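The call ordering above can be sketched with a plain-Java stand-in for the abstract portlet class (the real IBM API differs and manages these calls for you; this skeleton only mirrors the method names listed):

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the abstract portlet class; records each life cycle call
abstract class PortletLifeCycleSketch {
    final List<String> calls = new ArrayList<String>();
    void init()            { calls.add("init"); }            // once per portlet class
    void initConcrete()    { calls.add("initConcrete"); }    // before first access
    void service()         { calls.add("service"); }         // every render request
    void destroyConcrete() { calls.add("destroyConcrete"); } // concrete portlet removed
    void destroy()         { calls.add("destroy"); }         // portal shutdown
}

public class LifeCycleDemo extends PortletLifeCycleSketch {
    public static void main(String[] args) {
        LifeCycleDemo p = new LifeCycleDemo();
        p.init();
        p.initConcrete();
        p.service();   // typically called many times
        p.service();
        p.destroyConcrete();
        p.destroy();
        System.out.println(p.calls);
    }
}
```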

4.1.5 Job Submission Portlet for WPS

The WebSphere Studio Application Developer (WSAD) 5.1.2 environment makes the tasks of writing, compiling, and testing portlets easy. It includes a test environment that you can use to run and debug your portlets without having to deploy them manually to the server. Three components are required by a portlet in WPS:

A portlet deployment descriptor (portlet.xml)
A portlet implementation program that extends the abstract class PortletAdapter
A JSP file that contains the user interface of the portlet

WSAD automates these tasks with a series of easy-to-follow wizards, so we will discuss only the implementation here. For more information on WPS and WSAD, see the product documentation [IBMWPS04].

4.1.6 Portlet Program

The GramPortlet.java program implements the logic required to load the user interface and to access the GRAM client programs described in the previous section. It also handles the remote job submission against the remote server. When the server loads the portlet, the doView() method is called; it in turn loads the user interface (GramPortletView.jsp). When the submit button is pressed in the browser window, the actionPerformed() method fires and submits the GRAM job to the remote server, as shown in Listing 4.11.

LISTING 4.11 GramPortlet.java: WPS Portlet for the GRAM Protocol of GT2 and GT3-MMJFS

package portlet.gram;

import java.io.IOException;
import java.util.Hashtable;

import org.apache.axis.utils.ClassUtils;
import org.apache.jetspeed.portlet.*;
import org.apache.jetspeed.portlet.event.*;
import org.globus.gram.GramException;

import gram.client.*;
import gram.client.utils.GramClientUtil;

/**
 * A sample portlet based on PortletAdapter
 */
public class GramPortlet extends PortletAdapter implements ActionListener {
    // JSP file name to be rendered in view mode
    public static final String VIEW_JSP =
        "/portlet_gram/jsp/GramPortletView.jsp";
    // Action name for the job submission form
    public static final String FORM_ACTION =
        "portlet.gram.GramPortletFormAction";
    // Parameter name for the general submit button
    public static final String SUBMIT =
        "portlet.gram.GramPortletSubmit";
    // Parameter name for the general cancel button
    public static final String CANCEL =
        "portlet.gram.GramPortletCancel";

    /**
     * @see org.apache.jetspeed.portlet.Portlet#init(PortletConfig)
     */
    public void init(PortletConfig portletConfig)
        throws UnavailableException {
        super.init(portletConfig);
    }

    /**
     * @see org.apache.jetspeed.portlet.PortletAdapter#doView
     */
    public void doView(PortletRequest request, PortletResponse response)
        throws PortletException, IOException {
        // Set the actionURI in the view mode bean
        PortletURI formActionURI = response.createURI();
        formActionURI.addAction(FORM_ACTION);

        PortletSession session = request.getPortletSession();
        session.setAttribute("ACTION", formActionURI.toString());

        // Invoke the JSP to render
        try {
            getPortletConfig().getContext()
                .include(VIEW_JSP, request, response);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Portlet action event handler method
     */
    public void actionPerformed(ActionEvent event) throws PortletException {

        // ActionEvent handler
        String actionString = event.getActionString();
        PortletRequest request = event.getRequest();

        if (FORM_ACTION.equals(actionString)) {
            /*
             * Call the GRAM client here...
             */
            String factory = request.getParameter("FACTORY");
            String proto = request.getParameter("PROTOCOL");
            String exe = request.getParameter("EXE");
            String args = request.getParameter("ARGS");
            String dir = request.getParameter("DIR");
            String stdin = request.getParameter("STDIN");
            String stdout = request.getParameter("STDOUT");
            String stderr = request.getParameter("STDERR");
            String batch = request.getParameter("BATCH");
            String dryRun = request.getParameter("DRY");
            String sec = request.getParameter("SECURITY");
            String auth = request.getParameter("AUTH");

            System.out.println("f=" + factory + ", proto=" + proto
                + ", exe=" + exe + ", args=" + args + ", dir=" + dir
                + ", stdin=" + stdin + ", stdout=" + stdout
                + ", stderr=" + stderr + ", batch=" + batch
                + " dry run=" + dryRun + ", sec=" + sec
                + ", auth=" + auth);

            // Job parameters
            Hashtable params = new Hashtable();

            params.put("gram:executable", exe);
            params.put("gram:directory", dir);
            params.put("gram:arguments", args);
            params.put("gram:stdin", stdin);
            params.put("gram:stdout", stdout);
            params.put("gram:stderr", stderr);

            String output = null;
            String rsl = null;

            try {
                if (proto.equals("GRAM2")) {
                    // GRAM 2
                    rsl = GramClientUtil.buildRSL(params);

                    if (dryRun == null) {
                        Gram2Job job = new Gram2Job(factory, batch != null);
                        output = job.Submit(rsl);
                    } else {
                        output = rsl;
                    }
                } else {
                    // GRAM3 - MMJFS
                    params.put("schema.loc",
                        GramClientUtil.GRAM3_SCHEMA_LOCATION
                            .replace('\\', '/'));

                    rsl = GramClientUtil.buildGramXML(params);

                    if (dryRun == null) {
                        Gram3Job job = new Gram3Job(factory, batch != null);

                        job.setSecurity(Integer.parseInt(sec));
                        job.setAuthorization(Integer.parseInt(auth));

                        output = job.Submit(rsl);
                    } else {
                        output = rsl;
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
                output = e.getClass() + ":" + e.getMessage();
            }

            // Set form text in the session bean
            request.getPortletSession().setAttribute("OUTPUT", output);
        }
    }
}

4.1.7 Portlet View JSP

The portlet view JSP (see Listing 4.12) renders the user interface elements used by the job submission clients when the request is submitted to the server.

LISTING 4.12 WPS GRAM Portlet View JSP

<%@ page session="false" contentType="text/html"
    import="java.util.*, portlet.gram.*"%>
<%@ taglib uri="/WEB-INF/tld/portlet.tld" prefix="portletAPI" %>

<%
String actionURI = (String)
    portletRequest.getPortletSession().getAttribute("ACTION");
%>

Globus GRAM Portlet client compatible with GT 2.4.x and GT 3.2.x
Gram2 sample    Gram3 sample



Factory/GRAM Contact
Protocol
Executable
Arguments
Directory
StdIn
StdOut
StdErr
Job options
Batch mode      Dry run (return rsl only)
Security (GT3)
Authorization (GT3)

<%String formText = (String) portletRequest.getPortletSession().getAttribute("OUTPUT");

if (formText != null) {%> Job Output
<%}%>

4.1.8 Testing

This portlet was tested extensively in WPS 5.0 or later with OGSA 3.2.1. Before it can run properly, certain system configuration is required. The following system variables should be created on the server for proper OGSA initialization:

-Djava.endorsed.dirs=[OGSA_LOCATION]\endorsed
-Dorg.globus.ogsa.server.webroot=[OGSA_LOCATION]
-Dorg.globus.ogsa.client.timeout=180000
-Djava.net.preferIPv4Stack=true

The following are the required OGSA libraries to run the GRAM clients:

Java Cryptography Extension (JCE): cryptix-asn1.jar, cryptix.jar, cryptix32.jar, jce-jdk13-120.jar, and puretls.jar
Apache Velocity Template Engine: velocity-1.4.jar
OGSA 3.2.1: jaxrpc.jar, jgss.jar, ogsa.jar, axis.jar, mjs.jar, grim.jar, filestreaming.jar, cog-axis.jar, xalan.jar, xercesImpl.jar, xindice-1.1b3.jar, xml-apis-1.1.jar, xmlParserAPIs.jar, xmlrpc-1.1.jar, xmlsec.jar, and wsdl4j.jar

For a preview of the final application, see Figure 4.4.

FIGURE 4.4 GRAM portlet for WPS.

4.2 DATA MANAGEMENT PORTLETS

The Globus Toolkit provides the following interfaces for remote data access: GASS, GridFTP, Remote I/O, and GEM.

4.2.1 Global Access to Secondary Storage (GASS)

GASS provides read, write, and append operations on remote files. GASS is the protocol used by the resource management APIs to transfer job submission data back and forth between client and server. The GASS API has the following features:

Eliminates the need to manually log in to sites and transfer files, and to install a distributed filesystem
Allows the reuse of programs that use Unix or standard C I/O with little or no modification
Has a deliberately simple design; it is not meant to be another distributed filesystem
Provides support for URL syntax including HTTP, HTTPS, and FTP
Provides support for a security model based on process-to-process authentication performed by the Grid Security Infrastructure (GSI)
Provides access to SSL-authenticated FTP or HTTP servers

The GASS API also provides support for a remote file caching mechanism with the following characteristics:

Allows the use of local resource management facilities by associating caches with users
Allows cache management by reference counting using tags on open and close operations
Allows programs to access files through a cache API (no persistent cache daemon)
Allows multiple caches per user, thus allowing the staging of data in different locations
Allows remote cache management via GRAM requests: list contents, add items, delete items

4.2.2 GridFTP

GridFTP is a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide area networks. The GridFTP protocol is based on the highly popular Internet File Transfer Protocol (FTP) [GridFTP00]. The white paper "GridFTP: Universal Data Transfer for the Grid" describes the protocol in detail. Its features can be summarized as follows:

Based on GSI
Multiple data channels for parallel transfers
Partial file transfers
Third-party (direct server-to-server) transfers
Authenticated data channels
Command pipelining

4.2.3 Remote I/O

Remote I/O is a distributed implementation of MPI-IO, the parallel I/O API of the Message Passing Interface (MPI).

4.2.4 Globus Executable Management (GEM)

GEM enables loading and executing a remote file through the GRAM resource manager.

4.2.5 Remote Data Access Portlet for Jetspeed

This remote data access JSP portlet is a client that allows users to transfer files remotely, including third-party (server-to-server) transfers (see Listing 4.13). It supports all the major data management protocols, including GASS (HTTPS), GridFTP, HTTP, and FTP.

LISTING 4.13 Remote Data Access Portlet for Jetspeed

<%@ page import="org.globus.io.urlcopy.*" %>
<%@ page import="org.globus.util.*" %>
<%@ page import="org.globus.gsi.gssapi.auth.IdentityAuthorization" %>

<%@ taglib uri='/WEB-INF/templates/jsp/tld/template.tld' prefix='jetspeed' %>

<%
/**
 * Listen for URL transfer events
 */
class TransferListener implements UrlCopyListener {
    Exception _exception;
    long transferedBytes;
    String transResult = "Transfer complete.";

    // fires multiple times
    public void transfer(long transferedBytes, long totalBytes) {
        this.transferedBytes += transferedBytes;
    }

    // fires if a transfer error occurs
    public void transferError(Exception e) {
        System.out.println("transferError:" + e);
        _exception = e;
    }

    // fires when the transfer is complete
    public void transferCompleted() {
        if (_exception == null) {
            transResult = "Transfer complete. Bytes: " + transferedBytes;
        } else {
            transResult = "Transfer failed: " + _exception.getMessage();
        }
    }

    public String getTransResult() {
        return transResult;
    }
}
%>

<%
TransferListener listener = new TransferListener();
UrlCopy uc = new UrlCopy();
%>

<%
String jspeid = (String) request.getAttribute("js_peid");
String action = (String) request.getParameter("ra_action");
String error = null;
String output = null;

if (action != null && action.equals("submit")) {
    /*
     * Call GlobusUrlCopy here...
     */
    // From URL
    String fproto = request.getParameter("FPROTO");
    String fserver = request.getParameter("FSERVER");
    String fport = request.getParameter("FPORT");
    String ffile = request.getParameter("FFILE");

    // To URL
    String tproto = request.getParameter("TPROTO");
    String tserver = request.getParameter("TSERVER");
    String tport = request.getParameter("TPORT");
    String tfile = request.getParameter("TFILE");

    // opts
    String subject = request.getParameter("SUBJECT");

    // opts (ftp only)
    boolean thirdp = request.getParameter("TP") != null;
    boolean dcau = request.getParameter("DCAU") != null;

    try {
        String fromURL = (fproto.indexOf("file") >= 0)
            ? fproto + ffile
            : fproto + fserver + ":" + fport + "/" + ffile;

        String toURL = (tproto.indexOf("file") >= 0)
            ? tproto + tfile
            : tproto + tserver + ":" + tport + "/" + tfile;

        GlobusURL from = new GlobusURL(fromURL);
        GlobusURL to = new GlobusURL(toURL);

        uc.setSourceUrl(from);
        uc.setDestinationUrl(to);
        uc.setUseThirdPartyCopy(thirdp);
        uc.setDCAU(dcau);

        // set authorization subject (if any)
        if (subject != null && !subject.equals("")) {
            uc.setSourceAuthorization(new IdentityAuthorization(subject));
            uc.setDestinationAuthorization(
                new IdentityAuthorization(subject));
        }

        // fire the transfer thread
        uc.addUrlCopyListener(listener);
        uc.run();

        // wait for output? - No need...
        // listener.waitForTransfer(10000);
        output = listener.getTransResult();
    } catch (Exception e) {
        //e.printStackTrace();
        System.err.println(e);
        output = error = e.getClass() + ":" + e.getMessage();
    }
}
%>
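The from/to URL assembly in the scriptlet above can be exercised in isolation with plain Java (no Globus classes; the host, port, and file values below are examples):

```java
public class UrlBuildDemo {
    // Mirrors the scriptlet's ternary: file: URLs take no host or port
    static String buildUrl(String proto, String server,
                           String port, String file) {
        return (proto.indexOf("file") >= 0)
            ? proto + file
            : proto + server + ":" + port + "/" + file;
    }

    public static void main(String[] args) {
        System.out.println(
            buildUrl("file:///", null, null, "tmp/data.txt"));
        System.out.println(
            buildUrl("gsiftp://", "host.example.org", "2811", "tmp/data.txt"));
    }
}
```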

Globus Remote Data Access Portlet client compatible with GT 2.4.x and GT 3.2.x

<%if (error != null) {%> <%=error%> <%}%>

Example   




From (URL)
Protocol Server Port File
To (URL)
Protocol Server Port File
Job options
  Subject
  Third party transfer (FTP Only)
  Data Channel Auth (DCAU - FTP Only)

<%if (output != null) {%> Job Output

The file above should be placed in the folder [JETSPEED_HOME]\WEB-INF\templates\jsp\portlets\html (for Jetspeed 1.5), and a registry descriptor should be created in [JETSPEED_HOME]\WEB-INF\conf, as shown in Listing 4.14.

LISTING 4.14 Remote Data Access Portlet Descriptor

Grid Grid.demo

Once the two files just described are set up correctly, you should be able to transfer files among your GASS or GridFTP servers (see Figure 4.5).

FIGURE 4.5 Remote data access portlet for Jetspeed 1.5.

4.2.6 Remote Data Access Portlet for IBM WPS

Portlets in WPS are built using the Model-View-Controller (MVC) pattern; thus, our remote data access portlet consists of a view JSP file and a controller Java class, as shown in Listing 4.15.

LISTING 4.15 Remote Data Access View JSP for WPS

<%@ page import="portlet.remote.access.*"%>
<%@ taglib uri="/WEB-INF/tld/portlet.tld" prefix="portletAPI" %>

<%
String actionURI = (String)
    portletRequest.getPortletSession().getAttribute("ACTION");
%>

Example   


From (URL)
Protocol Server Port File
To (URL)
Protocol Server Port File
Job options
  Subject
 Third party transfer (FTP Only)
  Data Channel Auth (DCAU - FTP Only)

<%String output = (String) portletRequest.getPortletSession().getAttribute("OUTPUT");

if (output != null) { %> Job Output
<%} %>

The controller Java class for the portlet is shown in Listing 4.16.

LISTING 4.16 Remote Data Access Portlet Controller for WPS

package portlet.remote.access;

import java.io.IOException;

import org.apache.jetspeed.portlet.*;
import org.apache.jetspeed.portlet.event.*;

import org.globus.io.urlcopy.*;
import org.globus.util.*;
import org.globus.gsi.gssapi.auth.IdentityAuthorization;

/**
 * Remote Data Access Portlet for GT
 * Supports: GASS - HTTPS, GSIFTP, HTTP and FTP
 * Built with WebSphere Studio 5.1.2 (Portal Toolkit)
 * @author Vladimir Silva
 */
public class RemAccessPortlet
    extends PortletAdapter
    implements ActionListener {

    // JSP file name to be rendered in view mode
    public static final String VIEW_JSP =
        "/portlet_remote_access/jsp/RemAccessPortletView.jsp";
    // Action name for the transfer submission form
    public static final String FORM_ACTION =
        "portlet.remote.access.RemAccessPortletFormAction";
    // Parameter name for the general submit button
    public static final String SUBMIT =
        "portlet.remote.access.RemAccessPortletSubmit";
    // Parameter name for the general cancel button
    public static final String CANCEL =
        "portlet.remote.access.RemAccessPortletCancel";

    /**
     * Listen for URL transfer events
     */
    class TransferListerner implements UrlCopyListener {
        Exception _exception;
        long transferedBytes;
        String transResult = "Transfer complete.";

        // fires multiple times
        public void transfer(long transferedBytes, long totalBytes) {
            this.transferedBytes += transferedBytes;
            //System.out.println("transfer:" + transferedBytes);
        }

        // fires if a transfer error occurs
        public void transferError(Exception e) {
            System.out.println("transferError:" + e);
            _exception = e;
        }

        // fires when the transfer is complete
        public void transferCompleted() {
            if (_exception == null) {
                transResult = "Transfer complete. Bytes: "
                    + transferedBytes;
            } else {
                transResult = "Transfer failed: "
                    + _exception.getMessage();
            }
        }

        public String getTransResult() {
            return transResult;
        }
    }

    /**
     * @see org.apache.jetspeed.portlet.Portlet#init(PortletConfig)
     */
    public void init(PortletConfig portletConfig)
        throws UnavailableException {
        super.init(portletConfig);
    }

    /**
     * Fires when the portlet gets rendered
     * @see org.apache.jetspeed.portlet.PortletAdapter#doView
     */
    public void doView(PortletRequest request, PortletResponse response)
        throws PortletException, IOException {

        // Set the actionURI in the view mode bean
        PortletURI formActionURI = response.createURI();
        formActionURI.addAction(FORM_ACTION);

        PortletSession session = request.getPortletSession();
        session.setAttribute("ACTION", formActionURI.toString());

        // Invoke the JSP to render
        getPortletConfig().getContext()
            .include(VIEW_JSP, request, response);
    }

    /**
     * Fires on form submission
     */
    public void actionPerformed(ActionEvent event) throws PortletException {

        // ActionEvent handler
        String actionString = event.getActionString();
        PortletRequest request = event.getRequest();
        String output = null;

        if (FORM_ACTION.equals(actionString)) {
            /*
             * Call GlobusUrlCopy here...
             */
            // From URL
            String fproto = request.getParameter("FPROTO");
            String fserver = request.getParameter("FSERVER");
            String fport = request.getParameter("FPORT");
            String ffile = request.getParameter("FFILE");

            // To URL
            String tproto = request.getParameter("TPROTO");
            String tserver = request.getParameter("TSERVER");
            String tport = request.getParameter("TPORT");
            String tfile = request.getParameter("TFILE");

            // opts
            String subject = request.getParameter("SUBJECT");

            // opts (ftp only)
            boolean thirdp = request.getParameter("TP") != null;
            boolean dcau = request.getParameter("DCAU") != null;

            try {
                String fromURL = (fproto.indexOf("file") >= 0)
                    ? fproto + ffile
                    : fproto + fserver + ":" + fport + "/" + ffile;

                String toURL = (tproto.indexOf("file") >= 0)
                    ? tproto + tfile
                    : tproto + tserver + ":" + tport + "/" + tfile;

                System.out.println("from: " + fromURL + ", to: " + toURL
                    + ", Third party:" + thirdp + ", dcau=" + dcau
                    + ", sub:" + subject);

                GlobusURL from = new GlobusURL(fromURL);
                GlobusURL to = new GlobusURL(toURL);
                TransferListerner listener = new TransferListerner();
                UrlCopy uc = new UrlCopy();

                uc.setSourceUrl(from);
                uc.setDestinationUrl(to);
                uc.setUseThirdPartyCopy(thirdp);
                uc.setDCAU(dcau);

                // set authorization subject
                if (subject != null && !subject.equals("")) {
                    uc.setSourceAuthorization(
                        new IdentityAuthorization(subject));
                    uc.setDestinationAuthorization(
                        new IdentityAuthorization(subject));
                }

                // fire the transfer thread
                uc.addUrlCopyListener(listener);
                uc.run();

                // wait for output? - No need...
                // listener.waitForTransfer(10000);
                output = listener.getTransResult();
            } catch (Exception e) {
                //e.printStackTrace();
                System.err.println(e);
                output = e.getClass() + ":" + e.getMessage();
            }

            // Set form text in the session bean
            request.getPortletSession()
                .setAttribute("OUTPUT", output);
        }
    }
}

A preview of this portlet can be seen in Figure 4.6.

FIGURE 4.6 Remote data access portlet for WPS.

4.3 TROUBLESHOOTING

The following section explains the most common errors encountered when running the programs in this chapter.

4.3.1 Certificate Not Yet Valid Errors

This error occurs when the clocks on the client and server are not synchronized. Some Linux/Unix servers do not use the Network Time Protocol (NTP) to synchronize their clocks. Thus, if the time on a Win32 client happens to be out of sync with the server by 15 or more minutes, this error may be thrown.

4.3.2 Defective Credential Errors

Defective credential errors are thrown when either the user or host certificates are invalid. This error can occur when the client user certificates and host certificates are signed by two different Certificate Authorities (CAs). To check the integrity of your certificates, issue the following command:

$ openssl verify [-CApath directory] [certificate1] [certificate2]

where

[-CApath directory]: A directory of trusted certificates. The certificates should have names of the form hash.0.
[certificate1]: A user-defined digital certificate.

Trusted certificates, also called CA certificates, are the certificates used to sign both server and user certificates, as explained in the GSI section of Chapter 8. More details can be found in the OpenSSL documentation at http://www.openssl.org/.
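As an illustration of how the hash.0 naming works, the following sketch creates a throwaway self-signed certificate, names it after its subject hash as openssl verify -CApath expects, and verifies it against itself. With real credentials you would point -CApath at your trusted-certificates directory (typically /etc/grid-security/certificates in a Globus install) and pass your usercert.pem; the paths below are illustrative.

```shell
# Illustrative only: a throwaway self-signed "CA" standing in for a real one.
workdir=$(mktemp -d)

# Create a self-signed certificate (acts as its own CA here)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=TestCA" \
        -keyout "$workdir/ca.key" -out "$workdir/ca.pem" 2>/dev/null

# -CApath expects trusted certificates named <subject-hash>.0
hash=$(openssl x509 -hash -noout -in "$workdir/ca.pem")
cp "$workdir/ca.pem" "$workdir/$hash.0"

# A certificate signed by a CA present in -CApath verifies cleanly
openssl verify -CApath "$workdir" "$workdir/ca.pem"
```

If verification fails with a real user certificate, compare the issuer of the user certificate with the subjects of the CA files in the trusted directory; a mismatch confirms the two-CA problem described above.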

4.4 SUMMARY

Portals can provide an open standards-based user interface to grid middleware such as the Globus Toolkit. Portals have many features that make them the platform of choice for accessing grid services. They provide a single point of access to Web-based resources and aggregate those resources in one place. A user can access the portal via a Web browser, WAP phone, pager, or any other device. The portal acts as the central hub where information from multiple sources is made available in an easy-to-use manner.

An enterprise portal consists of middleware, applications (called portlets), and development tools for building and managing business-to-business (B2B), business-to-consumer (B2C), and business-to-employee (B2E) portals. This chapter has introduced the development of portals that use the APIs provided by the Globus Toolkit.

REFERENCES

[GridFTP00] “GridFTP: Universal Data Transfer for the Grid.” White paper. Copyright 2000, The University of Chicago and The University of Southern California. Accessed online September 5, 2000, at http://www.globustoolkit.org/toolkit/docs/2.4/datagrid/deliverables/C2WPdraft3.pdf.

[IBMWPS04] The Information Center for IBM WebSphere Portal for Multiplatforms Version 5.0.2. Copyright IBM Corporation 1994, 2004. All rights reserved.

[JetSpeed04] Apache Portals—Jetspeed. Apache Software Foundation, 2004. Available online at http://portals.apache.org/jetspeed-1/index.html.

5 Schedulers

In This Chapter

An Overview of Popular Schedulers and Grid Middleware Integration
Portable Batch System—OpenPBS
Silver/Maui Metascheduler
Sun Grid Engine (SGE)—N1 Grid Engine 6
Condor-G
MMJFS Integration with Other Job Schedulers
Factors for Choosing the Right Scheduler
Summary
References

Schedulers are advanced resource management tools for heterogeneous distributed computing environments. The goal of schedulers is to allow organizations to best achieve the enterprise’s goals such as productivity, timeliness, level of service, and so forth. Schedulers should support jobs in a single system or in many systems grouped together; they should also provide the following functionality:

Accept batch jobs
Preserve and protect the job until run
Run the jobs
Deliver output back to the submitter


Schedulers use complex management policies and sophisticated scheduling algorithms to achieve such functionality as the following:

Dynamic collection of performance data
Enhanced security
High-level policy administration
Quality of service (QoS)
Computationally demanding tasks
Transparent distribution of the associated workload
Advanced resource reservations
Support for check-pointing programs

The topics covered in this chapter include the following:

A description of the differences between job schedulers and job managers.
An extensive guide to the integration of OpenPBS and the Open Grid Services Architecture (OGSA), including installation, customization, and troubleshooting.
A discussion of metascheduler software such as Silver/Maui.
Integration of the Master Managed Job Factory Service (MMJFS) provided by OGSA with the most popular open source schedulers available today: OpenPBS, Sun Grid Engine (SGE), Condor, and others.

The chapter ends with a description of the factors you may face while choosing the right scheduler for your organization, always focusing on grid middleware integration.

5.1 AN OVERVIEW OF POPULAR SCHEDULERS AND GRID MIDDLEWARE INTEGRATION

This chapter starts with a feature description of the most popular schedulers available today. The discussion focuses on four major schedulers supported by the Globus Toolkit: OpenPBS/PBS Pro, the Silver/Maui metascheduler, SGE, and Condor-G. Among the features covered are the following:

Software installation and service configuration
Common features such as scheduling algorithms, job management policies, quality of service, and node allocation
Usability: graphical and command-line user interfaces
Grid middleware support: GT scheduler interface support and configuration

The chapter ends by looking at the factors that influence the selection of the right scheduler for your organization. You must weigh the pros and cons of these software packages and select the one that best fits the needs of your organization.

5.1.1 Job Schedulers versus Job Managers

The Globus Toolkit and OGSA provide a powerful standards-based framework for grid services. Among the services GT3 provides, the MMJFS is called a job manager. The function of MMJFS is to provide a single interface for requesting and using remote system resources for the execution of jobs. These sets of interfaces are called the Globus Resource Allocation Manager (GRAM) and are designed to provide a flexible interface to scheduling systems—more like a remote shell with features, if you will.

Job schedulers are networked systems for submitting, controlling, and monitoring the workload of batch jobs in one or more computers. The jobs or tasks are scheduled for execution at a time chosen by the subsystem according to an available policy and the availability of resources. Popular job schedulers include the Portable Batch System (PBS), Load Sharing Facility (LSF), and LoadLeveler. This chapter discusses the most popular schedulers available today, focusing on job manager integration, specifically with grid middleware such as the Globus Toolkit.

5.2 PORTABLE BATCH SYSTEM—OPENPBS

OpenPBS is a popular open source scheduler originally developed for the U.S. government and later commercialized by Altair Engineering.

5.2.1 Overview

PBS is resource management software developed by NASA to conform to the Portable Operating System Interface (POSIX) batch environment standards [PBSAdmin02]. PBS allows you to do the following:

Accept batch jobs
Control their attributes
Preserve and protect the jobs until run
Deliver the jobs’ output back to the caller

The four major components in PBS are the following:

Client-side commands: These are used to submit, control, monitor, or delete jobs and can be installed on any supported platform. They include commands such as qsub and qdel and several graphical tools such as xpbs.

Server (pbs_server): This provides the main entry point for batch services such as creating, modifying, and protecting jobs against system crashes. All the clients and other daemons communicate with this server over TCP/IP.

Scheduler (pbs_sched): This controls the policies or set of rules used to submit jobs over the network. PBS was designed to allow each cluster to create its own scheduler or policies. When started, the scheduler queries the server for jobs to be run and the executor for system resource availability.

Job Executor (pbs_mom): This is in charge of executing the job by emulating user sessions identical to the user’s login session. It delivers the output back to the caller when directed.

In a typical cluster configuration, a pbs_mom would run on each system where jobs are to be run. The server and scheduler may run on the same machine, and the client commands can be placed on the machines that will submit the jobs. PBS provides an application programming interface (API) for sites that want to implement their own scheduling policies.

5.2.2 OpenPBS on Red Hat Linux (RHL) 9

In this chapter, we will set up an OpenPBS cluster on a Red Hat Linux (RHL) 9 server. The best way is to install the Red Hat Package Manager (RPM) package provided by the PBS Web site at www.openpbs.org. The PBS RPM version used is 2.3.2. Because this RPM requires Tool Command Language/Tool Kit (Tcl/Tk) and Wish 8.0 and was compiled for an older kernel, some tweaking is required to get PBS running (see Figure 5.1).

FIGURE 5.1 Installing OpenPBS 2.3.2 in Red Hat Linux 9. Schedulers 123

As you can see in the previous figure, the RPM must be installed using the --nodeps flag because of the Tcl/Tk 8.0 requirements. OpenPBS should be up and running now, with a service automatically installed to start at boot time. The next step is configuring the cluster.

5.2.3 Configuring PBS on a Single Machine Cluster If the installation is successful, the next step is to configure the PBS cluster and start the daemons as shown in the following command sequence:

# export PATH=$PATH:/usr/pbs/bin:/usr/pbs/sbin
# pbs_mom
# pbs_server -t create    (first time only)
# pbs_sched

The first line sets the path to the PBS binaries, and pbs_mom starts the execution server. The first time only, start the server with the -t create option to initialize various files. While starting the servers, you may see the following message: “Incorrectly built binary which accesses errno or h_errno directly. Needs to be fixed.” This is because the binaries were built against an older kernel version. The message should not affect PBS execution in any way. A faster way is to simply issue the command:

[root@sharky pbs]# service pbs start
Starting PBS daemons:
Starting pbs_mom:                    [ OK ]
Starting pbs_sched:                  [ OK ]
Starting pbs_server:                 [ OK ]
[root@sharky pbs]#

For simplicity, we have set up a single execution system, which means that the daemon and scheduling server run on the same machine and no nodes file is needed. For multiple execution systems, where a nodes file is required, see the OpenPBS Administration guide [PBSAdmin02]. Now log on to the system as root to configure the execution queue and activate the server:

# qmgr
Max open servers: 4
Qmgr: create queue dque queue_type=e
Qmgr: s q dque enabled=true, started=true
Qmgr: s s scheduling=true
Qmgr:

For more information on the qmgr options, see the OpenPBS administration manuals or man pages.

5.2.4 Tweaking PBS GUIs for RHL 9

Because the PBS GUI (xpbs) requires Tcl/Tk 8.0, some tweaking is required to make it work on RHL 9, which uses Tcl 8.3 by default:

[root@sharky tmp]# /usr/pbs/bin/xpbs
-bash: /usr/pbs/bin/xpbs: /usr/bin/wish8.0: bad interpreter: No such file or directory
[root@sharky tmp]# vi /usr/pbs/bin/xpbs
#!/usr/bin/wish8.0 -f
################################################################
# xpbs-gui: The GUI front end to the PBS user and operator commands.
#
# Written by Albeaus Bayucan
# History
# Initial Date: 1/17/96

Change the line #!/usr/bin/wish8.0 -f to match your Tcl version (#!/usr/bin/wish8.3 -f on our test system), then type xpbs to display the main window (see Figure 5.2).
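The same edit can be scripted. This is a sketch under the assumptions from the text (xpbs installed at /usr/pbs/bin/xpbs, Tcl/Tk 8.3 as the replacement); the file-existence guard keeps it a no-op on machines without OpenPBS.

```shell
# Patch the interpreter line of xpbs in place, backing the file up first.
# /usr/pbs/bin/xpbs and wish8.3 are the locations from this chapter's setup.
f=/usr/pbs/bin/xpbs
if [ -f "$f" ]; then
    cp "$f" "$f.orig"
    sed -i '1s|wish8\.0|wish8.3|' "$f"
    head -1 "$f"    # the first line should now reference wish8.3
fi
```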

FIGURE 5.2 XPBS main GUI.

For the job submission dialog of xpbs, see Figure 5.3.

FIGURE 5.3 XPBS job scheduling dialog.

Type the command you want to run in the top right panel, enter an output filename in the Stdout File Name text box, then click confirm submit. You should not run any jobs as root, or you may see the error message “Bad UID for job execution.” Try running a job in xpbs as any user but root, and you should see the message 4.[my_host_domain], which indicates a successful submission. For this particular job, the output of the date command will be saved in /tmp/foo.txt.

5.2.5 Configuring the Globus Toolkit 3.2 PBS Job Scheduler

Now that PBS is up and running, the next step is to set up the Globus Toolkit 3.2 PBS job scheduler. For this step, we assume that GT3 has been successfully installed and GSI set up correctly, including the MMJFS, which is required to interface OGSA to the backend scheduler. The following commands show a quick way to test whether MMJFS is running properly:

$ grid-proxy-init
Your identity: /O=Grid/OU=GlobusTest/OU=simpleCA-sharky.office.aol.com/OU=office.aol.com/CN=globus
Enter GRID pass phrase for this identity:
Creating proxy ...... Done
Your proxy is valid until: Thu Jul 22 23:50:32 2004

$ bin/managed-job-globusrun -factory http://`hostname`:8080/ogsa/services/base/gram/MasterForkManagedJobFactoryService -file schema/base/gram/examples/test.xml -o
12 abc 34 pdscaex_instr_GrADS_grads23_28919.cfg
pgwynnel was here

The following command sequence can be used to install the job scheduler bundles distributed with GT3:

[globus] cd schedulers
[globus] gpt-build scheduler-pbs-3.2-src_bundle.tar.gz gcc32dbg
[globus] gpt-postinstall

More information on installing or configuring GRAM services can be found in the Globus installation section at www.globus.org.

5.2.6 Customizing the PBS Job Manager

The file $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/pbs.pm contains the configuration options used by Globus when submitting jobs to the backend scheduler. Figure 5.4 shows the PBS configuration options for GT3. These options may have to be changed to match your cluster configuration; for example, some clusters may use ssh instead of rsh (see Figure 5.4).

FIGURE 5.4 OpenPBS Job Manager script—pbs.pm.

For an RHL 9 single-execution system, mpirun must be set to ‘no’ and cluster must be 0 (single execution). Failing to set the cluster variable to 0 on a single execution system will produce the following error:

[globus@sharky gt32]$ cd $GLOBUS_LOCATION/
[globus@sharky gt32]$ bin/managed-job-globusrun -factory http://`hostname`:8080/ogsa/services/base/gram/MasterPbsManagedJobFactoryService -file $GLOBUS_LOCATION/schema/base/gram/examples/pbs_gram_rsl_example_1.xml -o
[07/22/2004 12:54:13:389 ] org.globus.ogsa.impl.base.gram.client.GramJob [setStatusFromServiceData:1216] ERROR: The executable could not be started.
Error: Job failed: The executable could not be started.

5.2.7 Troubleshooting PBS MMJFS Runtime Errors

The following section contains a compilation of the errors we have found when installing and working with MMJFS on Linux systems running on VMWare 4 workstations.

5.2.7.1 PBS Resource Specification Language (RSL) Parsing Errors

The following error may be thrown by the MMJFS client:

org.globus.ogsa.impl.base.gram.client.GlobusRun [submitRSL:1068] ERROR: Caught exception: org.globus.gram.GramException: The provided RSL could not be properly parsed [Root error message: exception during RSL parsing Embedded exception class org.xml.sax.SAXParseException: cvc-elt.1: Cannot find the declaration of element 'rsl:rsl'.].

Possible cause: This error seems to occur when the managed-job-globusrun is run using absolute paths. For example, the following command should produce this error:

$GLOBUS_LOCATION/bin/managed-job-globusrun -factory http://`hostname`:8080/ogsa/services/base/gram/MasterPbsManagedJobFactoryService -file /opt/gt32/schema/base/gram/examples/pbs_gram_rsl_example_1.xml -o

Possible solution: Run the command from the $GLOBUS_LOCATION folder.

5.2.7.2 “The executable could not be started” Messages Returned by the MMJFS Client

Message:

ERROR: The executable could not be started. Error: Job failed: The executable could not be started.

Source: This message sometimes appears when running the MMJFS test script

bin/managed-job-globusrun -factory http://`hostname`:8080/ogsa/services/base/gram/MasterPbsManagedJobFactoryService -file schema/base/gram/examples/pbs_gram_rsl_example_1.xml -o

Possible cause: Common mistakes in the PBS RSL XML script or in the Job Manager configuration Practical Extraction and Report Language (Perl) script may throw this error. In this case, the error occurs when the Job Manager configuration file is set up to run in a cluster but the backend scheduler has been set up as a single execution system. To figure out what is going on during the MMJFS execution process, the GT3 container can be set up in debug mode, displaying messages that can help you track down the problem.

1. Edit the file $GLOBUS_LOCATION/log4j.properties.
2. Add the lines shown in Listing 5.1 to debug the GRAM Job Manager.

LISTING 5.1 Enabling DEBUG for the MMJFS Job Manager

# DEBUG Job Manager Classes (PBS Fork)
#log4j.category.org.globus.jobmanager.JobManager=DEBUG
#log4j.category.org.globus.jobmanager.JobManagerService=DEBUG
#log4j.category.org.globus.ogsa.impl.base.gram.mmjfs.MasterManagedJobFactory=DEBUG
#log4j.category.org.globus.ogsa.impl.base.gram.jobmanager.ManagedJobFactoryImpl=DEBUG
#log4j.category.org.globus.ogsa.impl.base.gram.jobmanager.JobManager=DEBUG
log4j.category.org.globus.ogsa.impl.base.gram.jobmanager.JobManagerScript=DEBUG
#log4j.category.org.globus.ogsa.impl.base.gram.jobmanager.ManagedJobImpl=DEBUG
#log4j.category.org.globus.ogsa.impl.base.gram.jobmanager.ManagedJobFactoryImpl=DEBUG

3. Select the classes you want to debug by removing the comments from their respective lines. For our purposes, the class gram.jobmanager.JobManagerScript is enough to see what is going on.
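To double-check which categories ended up enabled after editing, a quick grep over the container's log4j configuration shows the uncommented DEBUG lines (the /opt/gt32 fallback path below is the install location assumed in this chapter):

```shell
# List enabled (uncommented) DEBUG categories in the container's log4j config.
grep '^log4j\.category.*=DEBUG' "${GLOBUS_LOCATION:-/opt/gt32}/log4j.properties" \
    2>/dev/null || echo "no DEBUG categories enabled (or file not found)"
```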

Before testing the changes, the User Hosting Environment (UHE) must be refreshed. There is no need to restart the GT container.

1. To refresh the UHE: rm -Rf $HOME/.globus/uhe-*
2. Rerun the PBS job and monitor the UHE logs: $HOME/.globus/uhe-* is created dynamically; then type tail -f $HOME/.globus/uhe-*/log

The UHE log file will show messages such as those in Listing 5.2:

LISTING 5.2 UHE Job Manager Log

23010 [Thread-21] DEBUG org.globus.ogsa.impl.base.gram.jobmanager.JobManagerScript - script is done, setting done flag
23010 [Thread-21] DEBUG org.globus.ogsa.impl.base.gram.jobmanager.JobManagerScript - script is done, signaling callback
23019 [Thread-13] DEBUG org.globus.ogsa.impl.base.gram.jobmanager.JobManagerScript - Initializing script thread
23022 [Thread-21] DEBUG org.globus.ogsa.impl.base.gram.jobmanager.JobManagerScript - returned from callback
23023 [Thread-22] DEBUG org.globus.ogsa.impl.base.gram.jobmanager.JobManagerScript - Executing /opt1/gt32/libexec/globus-job-manager-script.pl -m pbs -f /tmp/gram_job_mgr37073.tmp -c submit
23306 [Thread-22] DEBUG org.globus.ogsa.impl.base.gram.jobmanager.JobManagerScript - Read line GRAM_SCRIPT_GT3_FAILURE_MESSAGE: qsub: Job exceeds queue resource limits
23306 [Thread-22] DEBUG org.globus.ogsa.impl.base.gram.jobmanager.JobManagerScript - Read line GRAM_SCRIPT_ERROR:17
23310 [Thread-22] DEBUG org.globus.ogsa.impl.base.gram.jobmanager.JobManagerScript - script is done, setting done flag

Solution: The message “Job exceeds queue resource limits” tells us that Globus is trying to submit a PBS script meant for a multiple execution system to a single-machine cluster. Once the cluster flag is set to 0 in pbs.pm, the output will look something like the following:

[vsilva@sharky gt32]$ bin/managed-job-globusrun -factory http://`hostname`:8080/ogsa/services/base/gram/MasterPbsManagedJobFactoryService -file schema/base/gram/examples/pbs_gram_rsl_example_1.xml -o
12 abc 34 pdscaex_instr_GrADS_grads23_28919.cfg

Another possible cause for this error could be invalid paths to the PBS client executables in your job-manager script (pbs.pm). Edit this file and make sure the paths to the clients point to the proper locations. The script can be found in $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/pbs.pm.

$qsub  = '/usr/pbs/bin/qsub';
$qstat = '/usr/pbs/bin/qstat';
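A quick sanity check of those paths (locations as configured above) flags a missing client before the job manager does:

```shell
# Verify that the PBS client executables configured in pbs.pm actually exist.
for c in /usr/pbs/bin/qsub /usr/pbs/bin/qstat; do
    if [ -x "$c" ]; then
        echo "$c: ok"
    else
        echo "$c: missing or not executable -- fix the path in pbs.pm"
    fi
done
```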

5.2.7.3 “Job failed” Errors Returned by the MMJFS Client

The following is a common error message reported by the MMJFS PBS client:

[01/19/2005 14:29:41:523 ] org.globus.ogsa.impl.base.gram.client.GramJob [setStatusFromServiceData:1226] ERROR: Invalid ns1:value path "/home".
Error: Job failed: Invalid ns1:value path "/home".

Possible cause: Mistakes in the job submission XML. For example:

A nonexistent executable: The submission XML will produce this error if its executable element points to a file that does not exist on the execution host.


An invalid directory: Your job should write its output to a valid remote directory with the appropriate permissions.

Invalid standard input: This is a major source of headaches. If your program does not read from STDIN, make sure you set these values to /dev/null.
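The reason /dev/null works as a stand-in for standard input is that it yields an immediate end-of-file, so a job that would otherwise block on a read terminates normally:

```shell
# /dev/null returns EOF immediately: the command below prints a count of 0
# and exits instead of hanging while waiting for input.
wc -l < /dev/null
```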

5.2.8 Customizing PBS Scheduler Execution Parameters with RSL

The final step is to customize the RSL XML file that MMJFS uses to submit a job to the scheduling system. Arguments such as the executable, environment, and number of CPUs can be controlled in the submission XML.

5.2.8.1 Executable and Arguments

For example, to run the program echo from the /bin directory with the arguments “Hello World,” the submission XML specifies the executable path and an argument list:

...

5.2.8.2 Standard Input and Error

Setting the right parameters for your job’s input and output is important for preventing and debugging execution errors. If the program does not use standard input, then the NULL device (/dev/null) can be used.

5.2.8.3 Cluster Parameters

Cluster parameters specify the standard PBS scheduling options, such as the following: host count, number of processors, execution queue, project name, CPU time, and so forth.


5.2.8.4 Job Type

The job type specifies how the PBS Job Manager should execute the requested job. The types supported by Globus are single, multiple, and mpi. Single means that the job will be dispatched to a single execution system, multiple will be dispatched to a multinode PBS cluster, and mpi will use the Message Passing Interface (MPI) manager (mpirun) to dispatch it to a parallel system.


5.3 SILVER/MAUI METASCHEDULER

Maui was designed and written by the High Performance Computing Center to address limitations of the resource managers of the time, such as poor system utilization and job prioritization. It is now maintained by the Supercluster Research Group (http://www.supercluster.org). Maui is an external scheduler designed to work with resource managers such as PBS or LoadLeveler. Even though the resource manager is still responsible for managing resources, the Maui scheduler decides which nodes to use and the start time of the jobs. As with all external schedulers, Maui takes advantage of the scheduling API provided by the resource manager to retrieve job and node information. A set of policies or decisions is then applied to decide where and when jobs should run. The following policies are available [LoadLeveler00]:

Reservations
Fairshare
Backfill
Job prioritization
Throttling policies
Quality of service
Node allocation

5.3.1 Reservations

Advanced reservation is a method of allocating resources for a specific period according to a resource list, a start time, and a lifetime. The reservation is run against an access list, which the scheduler enforces for the lifetime of the reservation. There are two types of reservations: administrative and standing. Administrative reservations are useful for situations of an unpredictable nature, such as system maintenance, reboots, or exclusive access to a set of nodes. Standing reservations are methods for repetitively reserving resources for a constant period, for example, serving login sessions Monday through Friday between 9:00 A.M. and 5:00 P.M. This approach provides flexibility, allowing a resource to be dedicated to potentially fast job turnaround but at the expense of utilization [LoadLeveler00].

5.3.2 Fairshare

Fairshare is a method of allocating equal priority to all the jobs in a specific queue. Maui attempts to achieve fairshare by adjusting, not enforcing, a job’s priority based on the historical system usage of the job’s user, group, account, or quality of service. An example of fairshare would be an environment that allocates a specific number of processor seconds to each user. In such an environment, Maui will adjust the priority of every job in an attempt to match each user’s requirements for processor time.

5.3.3 Backfill Backfill is used to improve job throughput and system utilization by allowing the scheduler to run lower priority jobs with available resources as long as the highest priority queued job is not delayed in running. In backfill, each job should have a wall-clock time allowing the scheduler to perform a look ahead to find out when the job will complete at the latest. Maui orders the queue by priority creating a reservation when the job starts until a job that cannot be started is found. Maui prevents backfilled jobs from de- laying the highest priority jobs from running. It then attempts to run jobs that will fit in available resource gaps without delaying reserved priority jobs. Backfill algo- rithms include the following:

Firstfit: Each entry in the list of candidate jobs is considered for backfill, starting with the highest priority idle job.

Bestfit: From the list of eligible jobs, those that would fit in the current backfill window are considered for dispatch. The job that has the best degree of fit is started, and the backfill window size is adjusted accordingly. This process continues while there are backfill jobs and free resources available.

Greedy: Greedy behaves like Bestfit, but rather than attempting to start the best jobs one at a time, it attempts to determine the best possible combination of jobs that meet certain criteria.

Preempt: Backfill jobs are optimistically started even if they may potentially conflict with higher priority blocked idle jobs. Priority jobs may stop them and use their resources as soon as aggregate idle and preemptible resources become available. This policy overcomes many of the drawbacks of other backfill methods, where job wall-clock inaccuracies cause scheduling inefficiencies.

5.3.4 Job Prioritization

Job prioritization is defined by factors such as wall-clock time, queue duration, the user’s system usage, and others. Prioritization factors can be classified into categories such as the following:

Service factors: job service metrics, backfill time, and queue time
Resource factors: processors, memory, swap, wall time, and so on
Fairshare factors: system utilization or quality of service
Other factors: credential and scheduling targets, and so on

5.3.5 Throttling Policies

These are policies to control resource consumption. The goal is to prevent the system from being dominated by an individual user, group, account, or QoS, for example, restricting user john-doe to a total of 1 job and 4 processors at any time. If a user submits a job that exceeds these policy limits, the job will be placed in a deferred state until it is able to join the normal job queue.

5.3.6 Quality of Service (QoS)

This method allows the special consideration of classes of jobs, users, groups, and so on. QoS can be thought of as a container that holds privileges, such as job prioritization, policy exemptions, and resource access.

5.3.7 Node Allocation

Nodes are allocated according to rules such as minimum resources, CPU load, last available, site priority, and speed.

5.4 SUN GRID ENGINE (SGE)—N1 GRID ENGINE 6

To the delight of the open source community, powerful scheduling software dubbed N1 Grid Engine 6 has been released by Sun Microsystems. N1 Grid Engine is resource management software that accepts jobs submitted by users. It uses resource management policies to schedule jobs to be run on appropriate systems in the cluster. Users can submit millions of jobs at a time without being concerned about where the jobs run. The grid engine software delivers computational power that is based on enterprise resource policies set by the organization’s staff. The grid engine system uses these policies to examine the available computational resources. The system gathers these resources and then allocates and delivers them automatically, optimizing usage across the grid [SGEUSer04].

5.4.1 Workload Management

Heterogeneous distributed computing environments require advanced resource management tools. Workload management requires resource management and policy administration. SGE achieves workload management by controlling shared resources to best achieve an enterprise’s goals such as productivity, timeliness, level of service, and so forth. Organizations can maximize usage and throughput by controlling system timeliness (job deadlines) and importance (user share and job priority). The grid engine system provides the following major capabilities [SGEUSer04]:

Advanced policy administration for enhanced productivity, timeliness, and level of service
Dynamic scheduling and resource management to enforce site-specific management policies
Dynamic collection of performance data to provide the scheduler with up-to-the-minute job-level resource consumption and system load information
Enhanced security by way of Certificate Security Protocol (CSP)–based encryption, in which messages are encrypted with a secret key

5.4.2 Architecture

SGE is built on a client-server architecture that matches resources to requests. Different requests with different needs require different types of service. Needs are immediately recognized and then matched to available resources. The basic entities handled by SGE are jobs and queues. Each job has a different set of requirements, such as memory, processors, and storage. By using requirement profiles, SGE determines which jobs are suitable for any given queue. The system immediately dispatches the job that has either the highest priority or the longest waiting time. Queues allow concurrent execution of many jobs. The grid engine system tries to start new jobs in the least-loaded and most-suitable queue [SGEUSer04].

5.4.2.1 Usage Policies

Policy management controls the use of shared resources in the cluster to best achieve the administration’s goals. The grid engine software monitors the progress of all jobs and adjusts their priorities accordingly with respect to the policy goals. Four types of policies are available [SGEUSer04]:

Urgency: Job priority is based on urgency values derived from resource requirements, deadlines, and waiting times.
Functional: Jobs are given special treatment by affiliation with certain groups or projects.
Share-based: The level of service depends on share entitlement, resource usage, and user presence.
Override: Policies are controlled by manual intervention of the cluster administrator.

5.4.2.2 Ticket-Based Policies

Tickets are used to control the functional, share-based, and override policies. A ticket is analogous to company stock: the more stock employee “A” holds, the more important he becomes. Likewise, the more tickets assigned to a particular job, the higher priority it has. Administrators control the number of tickets assigned to each policy type, allowing the system to run in different policy modes.

5.4.2.3 Daemons

As with most scheduling systems, the grid engine system is composed of three daemons. The Master Daemon maintains information about hosts, jobs, queues, system load, and user permissions. The Scheduler Daemon connects to the server for a view of the status of the cluster; its main duty is to take jobs from the queues and dispatch them based on priorities, deadlines, and resources. The Execution Daemon is responsible for running the jobs on the remote resources; it periodically sends information such as job status and system load back to the server.

5.4.3 Step-by-Step SGE Installation Transcript

The following installation transcript was used to perform an automatic install of SGE in a single-machine cluster running Red Hat Linux. For a complete set of installation instructions, consult the installation guide available from the grid engine Web site at http://gridengine.sunsource.net/. The transcript uses the following hardware and software:

Hardware: IBM ThinkPad T30 Software: Red-Hat Linux 8 workstation running on VMWare 4.5 Install Type: automated Shell: Bash

For the SGE software download and documentation, see http://gridengine. sunsource.net/ As user root:

Create a user to administer the cluster:

useradd sge
passwd sge

Create the root install folder SGE_ROOT:

mkdir /opt/sge6

Download and unzip the binaries. We assume the binaries are located in /tmp:

cd /opt/sge6
tar zxvf /tmp/sge-6.0-common.tar.gz
tar zxvf /tmp/sge-6.0-bin-lx24-x86.tar.gz

Adjust file ownership:

chown -R sge /opt/sge6

As user sge:

Create an automated install configuration script:

su - sge
cp ./util/install_modules/inst_template.conf ./util/install_modules/myconfig.conf

Edit the configuration script myconfig.conf and adjust its parameters:

vi ./util/install_modules/myconfig.conf

# Adjust parameters
SGE_ROOT="/opt/sge6"
SGE_QMASTER_PORT="536"
SGE_EXECD_PORT="537"
QMASTER_SPOOL_DIR="/opt/sge6/default/spool/qmaster"
EXECD_SPOOL_DIR="/opt/sge6/default/spool"
GID_RANGE="20000-20100"
ADMIN_HOST_LIST="[YOUR_HOSTNAME]"
SUBMIT_HOST_LIST="[YOUR_HOSTNAME]"
EXEC_HOST_LIST="[YOUR_HOSTNAME]"
EXECD_SPOOL_DIR_LOCAL=""

Perform the automated install:

export SGE_ROOT=/opt/sge6
cd $SGE_ROOT
./inst_sge -m -x -auto ./util/install_modules/myconfig.conf

5.4.4 Installation Troubleshooting Tips

If your automated installation fails, log files are created in two places: under the /tmp folder as install.n, and under $SGE_ROOT/default/spool/qmaster/messages. Listing 5.3 shows a sample installation failure log.

LISTING 5.3 Sample SGE Install Failure Log

Starting qmaster installation!
Reading configuration from file ./myconfig.conf
Your $SGE_ROOT directory: /opt/sge6
Using SGE_QMASTER_PORT >536<.
Using SGE_EXECD_PORT >537<.
Using >default< as CELL_NAME.
Using >/opt/sge6/default/spool/qmaster< as QMASTER_SPOOL_DIR.
Using >true< as IGNORE_FQDN_DEFAULT. If it's >true<, the domain name will be ignored.

Making directories
Setting spooling method to dynamic
Dumping bootstrapping information
Initializing spooling database

Using >20000-20100< as gid range.
Using >/opt/sge6/default/spool< as EXECD_SPOOL_DIR.
Using >none< as ADMIN_MAIL.
Reading in complex attributes.
Adding default parallel environments (PE)
Reading in parallel environments:
PE "make".
Reading in usersets:
Userset "defaultdepartment".
Userset "deadlineusers".

starting sge_qmaster
sge_qmaster didn't start! Please check the messages file
starting sge_schedd
error: getting configuration: unable to contact qmaster using port 536 on host "vm-rhl8"

error: can't get configuration from qmaster -- backgrounding

Adding ADMIN_HOST localhost
ERROR: unable to contact qmaster using port 536 on host "vm-rhl8"
Adding SUBMIT_HOST localhost
ERROR: unable to contact qmaster using port 536 on host "vm-rhl8"

Creating the default queue and hostgroup
unable to contact qmaster using port 536 on host "vm-rhl8"

Command failed: ./bin/lx24-x86/qconf -Ahgrp /tmp/hostqueue1562

Probably a permission problem. Please check file access permissions.
Check root read/write permission. Check if SGE daemons are running.

Failure log file: $SGE_ROOT/default/spool/qmaster/messages

08/23/2004 12:05:40|qmaster|vm-rhl8|E|database directory spooldb doesn't exist
08/23/2004 12:05:40|qmaster|vm-rhl8|E|startup of rule "default rule" in context "berkeleydb spooling" failed
08/23/2004 12:05:40|qmaster|vm-rhl8|C|setup failed

Messages in these two files give us a clue to the reason for this failure: the spooling method was set up incorrectly in the configuration file myconfig.conf. Setting the option SPOOLING_METHOD="classic" should fix the problem. If your automated installation still fails, try an interactive install and follow the prompts. Master host interactive install:
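The repair can be sketched with sed; the stand-in file below plays the role of util/install_modules/myconfig.conf from the transcript above:

```shell
# Sketch of the fix: force classic (file-based) spooling in the
# auto-install configuration. /tmp/myconfig.conf is a stand-in for
# util/install_modules/myconfig.conf from the install transcript.
CONF=/tmp/myconfig.conf
printf 'SPOOLING_METHOD="berkeleydb"\n' > "$CONF"
sed -i 's/^SPOOLING_METHOD=.*/SPOOLING_METHOD="classic"/' "$CONF"
cat "$CONF"
```

After the edit, rerun the automated install so the qmaster spools to plain files instead of the missing Berkeley DB directory.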

export SGE_ROOT=/opt/sge6
cd $SGE_ROOT
./inst_sge -m

Execution Host interactive install:

./inst_sge -x

The SGE Administration Guide [SGEUSer04] has a complete set of instructions and messages for the interactive install.

5.4.5 Installation Verification

1. Make sure the three SGE daemons are running:

# ps -ef | grep sge
sge 16818 1 0 13:36 ? 00:00:00 /opt/sge6//bin/lx24-x86/sge_qmaster
sge 16836 1 0 13:36 ? 00:00:00 /opt/sge6//bin/lx24-x86/sge_sched
sge 17085 1 0 13:37 ? 00:00:00 /opt/sge6//bin/lx24-x86/sge_execd

Start or stop the service daemons with:

service sgemaster [start/stop]
service sgeexecd [start/stop]

2. Run simple verification commands:

$> . $SGE_ROOT/default/common/settings.sh
$> qconf -sconf

Output:

global:
execd_spool_dir /opt/sge6//default/spool
mailer /bin/mail
xterm /usr/bin/X11/xterm
load_sensor none
prolog none
epilog none
shell_start_mode posix_compliant
login_shells sh,ksh,csh,tcsh
min_uid 0
min_gid 0
user_lists none
xuser_lists none
projects none
xprojects none

3. Run simple jobs as user sge:

. $SGE_ROOT/default/common/settings.sh
sge> qsub $SGE_HOME/examples/jobs/simple.sh

The standard error and output files for this script can be found at $HOME/simple.sh.e1 and $HOME/simple.sh.o1, respectively.

4. Monitor the queue using the qstat command client:

>qstat -f
queuename qtype used/tot. load_avg arch states
----------------------------------------------------------------------------
all.q@vm-rhl8 BIP 0/1 3.31 lx24-x86 a

###################################################################
 - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
###################################################################
3 0.56000 data root qw 08/30/2004 12:45:34 1 1-3:1
4 0.56000 data globus qw 08/30/2004 12:52:01 1 1-3:1
5 0.56000 data sge qw 08/30/2004 12:56:03 1 1-3:1
6 0.56000 simple.sh sge qw 08/30/2004 13:00:03 1

5.5 CONDOR-G

Condor® is a software package developed by the University of Wisconsin–Madison. The primary focus of Condor is high throughput computing (HTC). HTC is a computing environment that delivers large amounts of computational power over a long period of time. In contrast, high-performance computing (HPC) environments deliver a tremendous amount of compute power over a short period [Condor04]. HTC is ideal for large-scale research and engineering projects where the goal is to complete as many jobs as possible over a long period rather than to complete individual jobs as fast as possible.

In the early days, the scientific community relied on centralized mainframes or supercomputers to efficiently harness all available resources. Such computers were extremely expensive, and users had limited mainframe time allocated. Even though such environments were inconvenient, system utilization was very high. The world has since moved away from centralized mainframes because the same goals can be achieved with many smaller, more affordable PCs. In this world of distributed ownership, the total computational power available to a company increases, but system utilization is very low because machines are idle most of the time.

Condor is a software system that creates an HTC environment by harnessing all available resources under distributed ownership. Condor is capable of detecting machine availability on the network. It can checkpoint jobs and migrate them to idle resources to maximize the number of machines that can run a job. This allows machines across the enterprise, even in different domains, to run jobs. It uses remote system call technology to send read/write operations over the network and requires no account login on remote machines. Condor provides powerful resource management by matching resource owners with resource consumers. Condor acts as a broker by matching resource offer ads with resource request ads, making certain that the requirements in both ads are satisfied [Condor04].
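The matchmaking idea can be sketched as a simple bilateral check. This is only a conceptual sketch, not Condor's actual ClassAd language; all attribute names and values below are invented:

```shell
# Conceptual sketch of matchmaking (not actual ClassAd syntax): a job's
# resource request matches a machine's resource offer when the
# requirements expressed in both ads hold.
awk 'BEGIN {
    # resource offer ad (the machine)
    offer_mem = 512; offer_arch = "INTEL"; offer_wants_owner_idle = 1
    # resource request ad (the job)
    req_mem = 256; req_arch = "INTEL"; owner_is_idle = 1
    # both sides must be satisfied for the broker to declare a match
    if (req_mem <= offer_mem && req_arch == offer_arch &&
        (!offer_wants_owner_idle || owner_is_idle))
        print "MATCH"
    else
        print "NO MATCH"
}' > /tmp/match_result.txt
cat /tmp/match_result.txt
```

The real ClassAd mechanism generalizes this: each side carries an arbitrary Requirements expression evaluated against the other ad's attributes.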

5.5.1 Condor Features

Condor has the following features not commonly found in open schedulers:

Checkpoint and Migration: Allows Condor to migrate jobs from busy resources to idle ones, ensuring that your job will eventually run.
Remote System Calls: Allows Condor to transfer data files and executables across machines. The program behaves as if it were running as the user that submitted the job, on the workstation where it was originally submitted, no matter which machine it actually ends up executing on.
Pool Flocking: Allows jobs submitted in one pool to execute in another. Each pool can set policies over the conditions under which jobs are executed.
Owner Sensitivity: Gives the owner of a machine complete priority over the use of the machine. Control is returned to the machine owner automatically.
Job Ordering: Uses directed acyclic graphs where each job is a node in the graph.
ClassAds: Provide a flexible, expressive framework for matching resource requests with resource offers. Users can easily describe both job requirements and job preferences. Job requirements, preferences, and resource availability constraints can be described by powerful expressions, resulting in Condor adapting to nearly any desired policy [Condor04].

5.5.2 Condor-MMJFS Installation Transcript

The following section contains a complete transcript of a Condor 6.6 installation and GT 3.2 MMJFS integration. The install was performed on a Red Hat Linux 8.0 system running on a VMWare Workstation 4.x.

5.5.2.1 Condor Software Installation

To be performed as user root:

[tmp]# tar zxvf condor-6.6.6-linux-x86-redhat80.tar.gz
[tmp/condor-6.6] ./condor_configure --install --install-dir=/opt/condor-6.6.6/ \
    --type=submit,execute,manager --central-manager=`hostname` --owner=condor

Condor has been installed into

/opt/condor-6.6.6

In order for Condor to work properly you must set your CONDOR_CONFIG environment variable to point to your Condor configuration file:

/opt/condor-6.6.6/etc/condor_config

before running Condor commands/daemons.

5.5.2.2 Set Up a Startup Condor Service

cp /opt/condor-6.6.6/etc/examples/condor.boot /etc/init.d/condor
ln -s /etc/init.d/condor /etc/rc5.d/S86condor

[root@vm-rhl8 condor-6.6.6]# chkconfig --list | grep condor
condor 0:off 1:off 2:off 3:off 4:off 5:on 6:off

[root@vm-rhl8 root]# service condor start

Starting Condor:

[root@vm-rhl8 root]# ps -ef | grep condor_
condor 3222 1 0 13:14 ? 00:00:00 /opt/condor-6.6.6/sbin/condor_ma
condor 3223 3222 9 13:14 ? 00:00:07 condor_startd -f
condor 3224 3222 0 13:14 ? 00:00:00 condor_schedd -f

5.5.2.3 Condor Install Verification

Listing jobs in a queue:

[condor@vm-rhl8 examples]$ condor_q

-- Submitter: vm-rhl8.ibm.com : <192.168.74.128:32788> : vm-rhl8.ibm.com
 ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
 1.0 condor 9/1 13:33 0+00:00:00 I 0 0.0 date

1 jobs; 1 idle, 0 running, 0 held

Checking the pool status:

[condor@vm-rhl8 condor-6.6.6]$ condor_status

Name OpSys Arch State Activity LoadAv Mem ActvtyTime

vm-rhl8.ibm.c LINUX INTEL Owner Idle 1.040 249 0+00:00:20

Machines Owner Claimed Unclaimed Matched Preempting

INTEL/LINUX 1 1 0 0 0 0

Total 1 1 0 0 0 0

5.5.2.4 Install the Condor MMJFS Scheduler

[globus@vm-rhl8 root]# cd /tmp/gt3.2.1-all-linux-glibc2.2-installer/schedulers/
[globus@vm-rhl8 schedulers]$ /opt/gt32/sbin/gpt-build ./scheduler-condor-3.2-src_bundle.tar.gz gcc32dbg

The output should look like the following:

gpt-build ====> CHECKING BUILD DEPENDENCIES FOR globus_core
...

[globus@vm-rhl8 schedulers]$ /opt/gt32/sbin/gpt-postinstall
running /opt/gt32/setup/globus/setup-globus-gaa-authz-callout-message

*********************************************************************

Note: If you wish to complete the setup of the Globus GAA Authorization Callout, you will need to run the following script as root:

/opt/gt32/setup/globus/setup-globus-gaa-authz-callout

This script will create the /etc/grid-security/gsi-authz.conf file if it does not already exist and register the callout. For further information on using the setup-globus-gaa-authz-callout script, use the -help option. The -nonroot option can be used on systems where root access is not available.

...
running /opt/gt32/setup/globus/setup-condor-provider...
loading cache ./config.cache
checking for condor_q... /opt/condor-6.6.6/bin/condor_q
checking for condor_status... /opt/condor-6.6.6/bin/condor_status
updating cache ./config.cache
creating ./config.status
creating /opt/gt32/etc/globus-gram-condor-provider
running /opt/gt32/setup/globus/setup-mjs-condor...
Determining system information...

...

BUILD SUCCESSFUL
Total time: 32 seconds

5.5.2.5 Submit a Condor MMJFS Test Job

Make sure your GT 3.2 container is running and your system is properly configured. The output should look like the following:

[globus@vm-rhl8 gt32]$ bin/managed-job-globusrun -factory \
http://192.168.74.128:8080/ogsa/services/base/gram/CondorIntelLinuxManagedJobFactoryService \
-file schema/base/gram/examples/scheduler_gram_rsl_example_1.xml

WAITING FOR JOB TO FINISH
====== Status Notification ======
Job Status: Pending
====== Status Notification ======
Job Status: Done
======
DESTROYING SERVICE
SERVICE DESTROYED

At the same time, the Condor queue can be monitored with the following command:

[condor@vm-rhl8 condor]$ condor_q

-- Submitter: vm-rhl8.ibm.com : <192.168.74.128:33744> : vm-rhl8
 ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
 1.0 condor 9/1 13:33 0+00:00:00 I 0 0.0 date
 2.0 condor 9/1 14:47 0+00:00:00 I 0 0.0 date
 3.0 globus 9/1 19:00 0+00:00:00 I 0 0.0 echo 12 abc
 3.1 globus 9/1 19:00 0+00:00:00 I 0 0.0 echo 12 abc
 3.2 globus 9/1 19:00 0+00:00:00 I 0 0.0 echo 12 abc 34

8 jobs; 8 idle, 0 running, 0 held

5.6 MMJFS INTEGRATION WITH OTHER JOB SCHEDULERS

The Globus Project provides Grid Packaging Tool (GPT) bundles for several open source schedulers, including PBS, LSF, and Condor. If you use a custom scheduler, for example SGE or LoadLeveler, you must write your own packaging bundle, which can be a very time-consuming process. This section explores some simple changes that will allow you to integrate your custom scheduler with MMJFS. For this exercise, a scheduler interface for SGE is implemented in the following sections. The code in this section is based on the SGE integration with Globus by the London e-Science Centre [SGEGlobus03].

5.6.1 Step 1: Create a Job Manager Script

The first step in the integration of MMJFS with SGE is to create a job manager Perl script that will be used by MMJFS to interface with the master daemon. The file should be called lib/perl/Globus/GRAM/JobManager/[SCHEDULER_NAME].pm. In this case, the file name will be lib/perl/Globus/GRAM/JobManager/sge.pm. This script must implement a set of standard methods used to interface with the SGE daemon. Among the most important are the following [WSGRAMScheduler04]:

Submit: This method takes care of dispatching the job to the server daemon. It should call the corresponding command-line client, for example, qsub in PBS or SGE.
Poll: This method is used to poll the server for the status of a running job. It may call a command such as qstat.
Cancel: This method allows MMJFS to cancel a submitted job.

The following listings present a basic scheduler interface for SGE that can be used as a template for integration with other popular schedulers. A complete SGE scheduler interface package for GT3 has been developed by the London e-Science Centre (http://www.lesc.ic.ac.uk/projects/epic-gt3-sge.html).

5.6.1.1 Initialization

Every scheduler script should have an initialization section where the paths to the client programs and global variables are defined. For example, the initialization subroutine for SGE may look like that shown in Listing 5.4.

LISTING 5.4 Sample SGE Scheduler Interface Initialization Segment

# SGE Initialization parameters.
# This code is based on the SGE scheduler interface
# by the London e-Science Centre
#
BEGIN
{
    # paths to the programs which interact with the scheduler
    $qsub    = '/opt/sge6/bin/lx24-x86/sge-qsub';
    $qstat   = '/opt/sge6/bin/lx24-x86/sge-qstat';
    $qdel    = '/opt/sge6/bin/lx24-x86/sge-qdel';
    $qselect = '/opt/sge6/bin/lx24-x86/qselect';
    $qhost   = '/opt/sge6/bin/lx24-x86/qhost';
    $qconf   = '/opt/sge6/bin/lx24-x86/qconf';
    $qacct   = '/opt/sge6/bin/lx24-x86/qacct';

    # MPI arguments
    $mpirun = 'no';
    $sun_mprun = 'no';
    $mpi_pe = 'no';

    $cat = '/bin/cat';

    # SGE Environment
    $SGE_ROOT = '/opt/sge6';
    $SGE_CELL = 'default';
    $SGE_MODE = 'SGE';
    $SGE_RELEASE = 'N1GE 6.0';
}

If you are planning to package your scheduler interface for distribution, you must write a set of GPT configuration scripts, and the previous section should be written as follows:

$qsub    = '@QSUB@';
$qstat   = '@QSTAT@';
$qdel    = '@QDEL@';
$qselect = '@QSELECT@';
$qhost   = '@QHOST@';
$qconf   = '@QCONF@';
$qacct   = '@QACCT@';

For details on packaging the scheduler interface, see the section on packaging advice.
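To illustrate what the GPT setup machinery does with these placeholders, here is a configure-style substitution sketch; the file names, the sed invocation, and the discovered paths are all illustrative, not GPT's actual implementation:

```shell
# Sketch of placeholder substitution: a configure-style pass fills in
# the client paths discovered at install time. File names are
# illustrative stand-ins for the packaged template and its output.
printf "\$qsub = '@QSUB@';\n\$qstat = '@QSTAT@';\n" > /tmp/sge.pm.in
sed -e "s|@QSUB@|/opt/sge6/bin/lx24-x86/qsub|" \
    -e "s|@QSTAT@|/opt/sge6/bin/lx24-x86/qstat|" \
    /tmp/sge.pm.in > /tmp/sge.pm
cat /tmp/sge.pm
```

The generated sge.pm then contains concrete paths instead of @...@ markers, which is why the distributable form of the script must use the placeholders.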

5.6.1.2 Overloading the Submit Method

The submit method (see Listing 5.5) is the longest and toughest to write. All scheduler modules must implement this method, which gets called on job submission. The information in the original job request RSL string is available to the scheduler interface through the JobDescription data member of its hash [WSGRAMScheduler04].

LISTING 5.5 Sample Job Description Input Hash to the Submit Method

$description = {
    'directory' => [ '/home/globus' ],
    'stdin' => [ '/dev/null' ],
    'count' => [ '1' ],
    'jobtype' => [ 'multiple' ],
    'grammyjob' => [ 'collective' ],
    'environment' => [
        [ 'X509_USER_PROXY',
          '/home/globus/.globus/job/vm-rhl8-1/1775.1099013881/x509_up' ],
        [ 'GLOBUS_LOCATION', '/opt/gt32' ],
        [ 'GLOBUS_GRAM_JOB_CONTACT',
          'https://vm-rhl8-1.ibm.com:32911/1775/1099013881/' ],
        [ 'GLOBUS_GRAM_MYJOB_CONTACT',
          'URLx-nexus://vm-rhl8-1.ibm.com:32912/' ],
        [ 'HOME', '/home/globus' ],
        [ 'LOGNAME', 'globus' ]
    ],
    'dryrun' => [ 'no' ],
    'executable' => [ '/bin/date' ],
    'stdout' => [ '/dev/null' ],
    'stderr' => [ '/dev/null' ],
    'logfile' => [ '/dev/null' ],
    'uniqid' => [ '1775.1099013881' ],
    'cachetag' => [ 'https://vm-rhl8-1.:32911/1775/1099013881/' ],
    'jobdir' => [ '/home/globus/.globus/job/vm-rhl8-1/1775.1099013881' ],
    filestagein => [ ],
    filestageinshared => [ ],
    filestageout => [ ]
};

The job description in Listing 5.5 submits a /bin/date command to the backend scheduler on host vm-rhl8-1.ibm.com for user globus, whose home directory resides in /home/globus. This description is passed to the submit method shown in Listing 5.6, which parses it and calls the appropriate scheduler command (qsub for SGE).

LISTING 5.6 A Submit Method for the SGE Scheduler

#
# Submit method
# All scheduler modules must implement the submit method.
# This method is called when the job manager wishes to submit
# the job to the scheduler.
# The information in the original job request RSL string
# is available to the scheduler interface through the
# JobDescription data member of its hash.
#
sub submit
{
    my $self = shift;
    my $description = $self->{JobDescription};
    my $tag = $description->cache_tag() or $ENV{GLOBUS_GRAM_JOB_CONTACT};
    my $status;
    my $sge_job_script;
    my $sge_job_script_name;
    my $errfile = "";
    my $queue;
    my $job_id;
    my $rsh_env;
    my $script_url;
    my @arguments;
    my $email_when = "";
    my $cache_pgm = "$Globus::Core::Paths::bindir/globus-gass-cache";
    my %library_vars;

    $self->log("Entering SGE submit");

    #
    # The first step is to check for a valid job type
    #
    if(defined($description->jobtype()))
    {
        if($description->jobtype !~ /^(mpi|single|multiple)$/)
        {
            return Globus::GRAM::Error::JOBTYPE_NOT_SUPPORTED;
        }
        elsif($description->jobtype() eq 'mpi' && $mpirun eq "no")
        {
            return Globus::GRAM::Error::JOBTYPE_NOT_SUPPORTED;
        }
    }

    #
    # check directory
    #
    if($description->directory eq "")
    {
        return Globus::GRAM::Error::RSL_DIRECTORY();
    }
    chdir $description->directory() or
        return Globus::GRAM::Error::BAD_DIRECTORY();

    #
    # check that executable exists and can be run
    #
    if($description->executable eq "")
    {
        return Globus::GRAM::Error::RSL_EXECUTABLE();
    }
    elsif(! -f $description->executable())
    {
        return Globus::GRAM::Error::EXECUTABLE_NOT_FOUND();
    }
    elsif(! -x $description->executable())
    {
        return Globus::GRAM::Error::EXECUTABLE_PERMISSIONS();
    }
    elsif($description->stdin() eq "")
    {
        return Globus::GRAM::Error::RSL_STDIN;
    }
    elsif(! -r $description->stdin())
    {
        return Globus::GRAM::Error::STDIN_NOT_FOUND();
    }

    #
    # Times are given in minutes.
    # They must be converted to (h:m:s)
    #
    $self->log("Job WALL time");
    if(defined($description->max_wall_time()))
    {
        $wall_time = $description->max_wall_time();
        $self->log(" of $wall_time minutes");
    }
    elsif(defined($description->max_time()))
    {
        $wall_time = $description->max_time();
        $self->log(" max_wall_time of $wall_time minutes");
    }
    else
    {
        $wall_time = 0;
        $self->log(" using default");
    }

    #
    # max_cpu_time
    #
    $self->log("Job CPU time");
    if(defined($description->max_cpu_time()))
    {
        $cpu_time = $description->max_cpu_time();
        $self->log(" max_cpu_time of $cpu_time minutes");
    }
    elsif(defined($description->max_time()))
    {
        $cpu_time = $description->max_time();
        $self->log(" max_cpu_time of $cpu_time minutes");
    }
    else
    {
        $cpu_time = 0;
        $self->log(" using default");
    }

    #
    # start building the job script:
    # open the script file
    #
    $script_url = "$tag/sge_job_script.$$";
    system("$cache_pgm -add -t $tag -n $script_url file:/dev/null");

    $sge_job_script_name = `$cache_pgm -query -t $tag $script_url`;
    chomp($sge_job_script_name);
    if($sge_job_script_name eq "")
    {
        return Globus::GRAM::Error::TEMP_SCRIPT_FILE_FAILED();
    }

    $sge_job_script = new IO::File($sge_job_script_name, '>');

    $self->log("JOB SCRIPT: $sge_job_script_name");

    #
    # Script header
    #
    $sge_job_script->print("#!/bin/sh\n");
    $sge_job_script->print("# SGE job script built by ");
    $sge_job_script->print("Globus job manager\n");
    $sge_job_script->print("\n");
    $sge_job_script->print("#\$ -S /bin/sh\n");

    #
    # To whom to send email
    #
    if($description->email_address() ne '')
    {
        $sge_job_script->print("#\$ -M " .
            $description->email_address() . "\n");
    }

    #
    # Queue used to execute this job
    #
    if(defined($description->queue()))
    {
        $sge_job_script->print("#\$ -q " . $description->queue() . "\n");
        $self->log("QUEUE: " . $description->queue);
    }

    #
    # wall_time is in minutes. It must be converted to time format (h:m:s)
    #
    if($wall_time != 0)
    {
        $wall_m = $wall_time % 60;
        $wall_h = ( $wall_time - $wall_m ) / 60;

        $self->log("Using max WALL time (h:m:s) of $wall_h:$wall_m:00");
        $sge_job_script->print("#\$ -l h_rt=$wall_h:$wall_m:00\n");
    }

    #
    # Convert cpu_time to (h:m:s)
    #
    if($cpu_time != 0)
    {
        $cpu_m = $cpu_time % 60;
        $cpu_h = ( $cpu_time - $cpu_m ) / 60;

        $self->log("Using max CPU time (h:m:s) of $cpu_h:$cpu_m:00");
        $sge_job_script->print("#\$ -l h_cpu=$cpu_h:$cpu_m:00\n");
    }

    #
    # RSL attribute for max_memory is given in Mb
    #
    $max_memory = $description->max_memory();
    if($max_memory != 0)
    {
        $self->log("Total max memory flag is set to $max_memory Mb");
        $sge_job_script->print("#\$ -l h_data=$max_memory" . "M\n");
    }

    #
    # Standard output and error
    #
    if(($description->jobtype() eq "single") && ($description->count() > 1))
    {
        #
        # use job arrays for a single job
        #
        $sge_job_script->print("#\$ -o " .
            $description->stdout() . ".\$TASK_ID\n");
        $sge_job_script->print("#\$ -e " .
            $description->stderr() . ".\$TASK_ID\n");
    }
    else
    {
        $sge_job_script->print("#\$ -o " . $description->stdout() . ".real\n");
        $sge_job_script->print("#\$ -e " . $description->stderr() . ".real\n");
    }

    #
    # Load SGE settings
    #
    $sge_job_script->
        print(". $SGE_ROOT/$SGE_CELL/common/settings.sh\n");

    #
    # Change to directory requested by user
    #
    $sge_job_script->print("# Change to directory requested by user\n");
    $sge_job_script->print('cd ' . $description->directory() . "\n");

    #
    # Job arguments:
    # Quote and escape the strings in the argument list so that the
    # values of the arguments will be identical to those in the
    # initial job request: double-quote each argument, escaping the
    # backslash (\), dollar-sign ($), double-quote ("), and
    # backquote (`) characters.
    #
    @arguments = $description->arguments();

    foreach(@arguments)
    {
        if(ref($_))
        {
            return Globus::GRAM::Error::RSL_ARGUMENTS;
        }
    }

    if($arguments[0])
    {
        foreach(@arguments)
        {
            $self->log("Transforming argument \"$_\"");
            $_ =~ s/\\/\\\\/g;
            $_ =~ s/\$/\\\$/g;
            $_ =~ s/"/\\\"/g;
            $_ =~ s/`/\\\`/g;
            $self->log("Transformed to \"$_\"");

            $args .= '"' . $_ . '" ';
        }
    }
    else
    {
        $args = '';
    }

    #
    # JOB TYPE:
    # Depending on the job type of this submission, start either one or
    # more instances of the executable, or the mpirun program which
    # will start the job with the executable count
    #
    $self->log("JOB TYPE " . $description->jobtype());
    if($description->jobtype() eq "mpi")
    {
        #
        # Use mpirun
        #
        $sge_job_script->print("$mpirun -np " . $description->count() . " " .
            $description->executable() . " $args < " .
            $description->stdin() . "\n");
    }
    elsif($description->jobtype() eq "multiple")
    {
        #
        # Multiple job: fork multiple requests
        #
        for(my $i = 0; $i < $description->count(); $i++)
        {
            $sge_job_script->print($description->executable() .
                " $args < " . $description->stdin() . "&\n");
        }
        $sge_job_script->print("wait\n");
    }
    else
    {
        #
        # Single execution job
        #
        $sge_job_script->print($description->executable() .
            " $args < " . $description->stdin() . "\n");
    }

    #
    # Submit the job to the scheduler.
    # Be sure to close the script file before trying to redirect
    #
    $sge_job_script->close();

    if($description->logfile() ne "")
    {
        $errfile = "2>>" . $description->logfile();
    }

    #
    # Submit...
    #
    $self->log("Submitting...");

    $ENV{"SGE_ROOT"} = $SGE_ROOT;

    chomp($job_id = `$qsub $sge_job_script_name`);

    if($? == 0)
    {
        # Success; extract the job ID
        $job_id = (split(/\s+/, $job_id))[2];

        return { JOB_ID => $job_id,
                 JOB_STATE => Globus::GRAM::JobState::PENDING };
    }

    # Comment this out to prevent job script cleanup
    system("$cache_pgm -cleanup-url $tag/sge_job_script.$$");

    return Globus::GRAM::Error::INVALID_SCRIPT_REPLY;
}
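A note on the job-ID extraction near the end of the submit method: SGE's qsub acknowledges a submission with a one-line reply whose third whitespace-separated field is the numeric job ID, which is what (split(/\s+/, $job_id))[2] picks out in Perl. A quick shell sketch (the reply text below is illustrative of SGE 6):

```shell
# Sketch: extracting the job ID from a qsub acknowledgment line.
# Field 3 in awk corresponds to index [2] of Perl's split list.
reply='Your job 42 ("simple.sh") has been submitted.'
job_id=$(echo "$reply" | awk '{ print $3 }')
echo "$job_id"
```

If a scheduler reports the ID in a different position, this index is the one line that must change.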

5.6.1.3 Polling for Job Status

The purpose of the poll method (see Listing 5.7) is to check for updates to a job's status. This method receives as input a job ID returned from the submit method. It then uses the scheduler client (qstat for SGE) to query the job status and reports back to the caller [WSGRAMScheduler04].

LISTING 5.7 Poll Method for the SGE Scheduler

# A poll method for the SGE scheduler.
# It queries the job status by using the qstat command
#
sub poll
{
    my $self = shift;
    my $description = $self->{JobDescription};
    my $job_id = $description->job_id();
    my $state;
    my $status_line;
    my $job_out = $description->stdout();
    my $job_err = $description->stderr();

    # Separate the job_out path from the job_id() string.
    # A workaround for the fact that job_out() and job_err()
    # are missing
    $job_id =~ /(.*)\|(.*)\|(.*)/;
    $job_id = $1;
    $job_out = $2;
    $job_err = $3;

    $self->log("polling job $job_id");

    #
    # The env var SGE_ROOT is required by qstat
    $ENV{"SGE_ROOT"} = $SGE_ROOT;

    # Query job_id by number.
    my (@output) = `$qstat -j $job_id 2>&1`;

    # Obtain the first line of output (STDOUT or STDERR)
    my ($notexist) = $output[0];

    #
    # Parse the job status from the qstat command
    #
    if ($notexist =~ /do not exist/)
    {
        # Job no longer exists in the SGE job manager.
        # It must have finished.
        $self->log("Job $job_id has completed.");
        $state = Globus::GRAM::JobState::DONE;

        $self->log("Writing job STDOUT and STDERR to cache files.");

        if(($description->jobtype() eq "single") &&
           ($description->count() > 1))
        {
            #
            # Jobtype is single and count > 1; therefore, we used job
            # arrays. Merge the individual output/error files into one.
            #
            system("$cat $job_out.* >> $job_out");
            system("$cat $job_err.* >> $job_err");
        }
        else
        {
            # Append the job output to the GASS cache file manually
            system("$cat $job_out.real >> $job_out");
        }
    }
    else
    {
        # Job has not completed yet.
        # Determine its current state from the qstat output
        $_ = pop @output;  # Obtain scheduler details from output, if any.

        # Look at the qstat output and return a job state
        if (/"error"/)
        {
            $state = Globus::GRAM::JobState::FAILED;
        }
        elsif(/queue|pending/)
        {
            $state = Globus::GRAM::JobState::PENDING;
        }
        elsif(/hold|suspend/)
        {
            $state = Globus::GRAM::JobState::SUSPENDED;
        }
        else
        {
            $state = Globus::GRAM::JobState::ACTIVE;
        }
    }

    return {JOB_STATE => $state};
}

5.6.1.4 Canceling a Job

As with the poll method, this method is given a job ID. If the job was cancelled successfully, its state can be changed to FAILED. Otherwise, return an empty hash reference and let the poll method report the state change the next time it is called (see Listing 5.8).

LISTING 5.8 Cancel Method for SGE

sub cancel
{
    my $self = shift;
    my $description = $self->{JobDescription};
    my $job_id = $description->jobid();

    $self->log("cancel job $job_id");

    system("$qdel $job_id >/dev/null 2>/dev/null");

    if($? == 0)
    {
        return { JOB_STATE => Globus::GRAM::JobState::FAILED };
    }

    return Globus::GRAM::Error::JOB_CANCEL_FAILED();
}

5.6.2 Step 2: Update the GT 3.2 Deployment Descriptors

Two deployment descriptors must be updated with the SGE/MMJFS service factory description XML:

$GLOBUS_LOCATION/server-config.wsdd: The container (MHE) deployment descriptor.
$GLOBUS_LOCATION/local-server-config.wsdd: The UHE deployment descriptor.

The service XML shown in Listing 5.9 must be included.

LISTING 5.9 OGSA Deployment Descriptor for SGE

use="literal">


value="SGE ManagedJob" />

5.6.3 Step 3: Verification

The final step is to start the GT 3.2 container, make sure the configuration is correct, and submit a sample job.

5.6.3.1 Start the GT 3.2 Container and Monitor the Standard Output

bin/globus-start-container
...
http://192.168.74.128:8080/ogsa/services/base/gram/PbsManagedJobFactoryService
http://192.168.74.128:8080/ogsa/services/base/gram/ForkManagedJobFactoryService
http://192.168.74.128:8080/ogsa/services/base/gram/SGEManagedJobFactoryService
http://192.168.74.128:8080/ogsa/services/base/gram/MasterPbsManagedJobFactoryService
http://192.168.74.128:8080/ogsa/services/base/gram/MasterForkManagedJobFactoryService
http://192.168.74.128:8080/ogsa/services/base/gram/MasterSGEManagedJobFactoryService

Two services must show up in the standard output: MasterSGEManagedJobFactoryService and SGEManagedJobFactoryService. These are the managed job services for SGE. If they don't appear, stop and backtrack. You have done something wrong!

5.6.3.2 Check the User Hosting Environment (UHE) Propagation

Check the UHE deployment descriptor, $HOME/.globus/uhe-[hostname]/server-config.wsdd, and make sure an entry for SGEManagedJobFactoryService exists.

If the entry doesn't exist, refresh the UHE by removing its directory: $HOME/.globus/uhe-`hostname`

5.6.3.3 Adjust Parameters in the Sample RSL Submission Script

The Globus Toolkit 3.2 provides a generic scheduler script that can be used to submit a sample job. Note: this script is not available in GT 3.0. Adjust the script parameters to match your system. Among the most common parameters are the following:


If you have GT 3.0, use the following generic MMJFS-scheduler script to run a sample job. The script in Listing 5.10 should work for any type of scheduler: PBS, SGE, Condor, LoadLeveler, and so on.

LISTING 5.10 MMJFS/SGE Sample RSL Script for Job Execution




5.6.3.4 Run a Quick MMJFS/Scheduler Job

Finally, the moment of truth has come. Run the following command to test your configuration. In GT 3.2:

bin/managed-job-globusrun -factory \
http://`hostname`:8080/ogsa/services/base/gram/MasterSGEManagedJobFactoryService \
-file schema/base/gram/examples/scheduler_gram_rsl_example_1.xml -o

The output should look like:

12 abc 34 pdscaex_instr_GrADS_grads23_28919.cfg

5.6.4 Integration Troubleshooting

5.6.4.1 Service Not Found Errors The most likely cause of these errors is that the service configuration has not been replicated from the MHE to the UHE. Check the UHE deployment descriptor and make sure the entry SGEManagedJobFactoryService is present.

5.6.4.2 managed-job-globusrun Job Failure Errors
Consider the following error message:

WAITING FOR JOB TO FINISH
[08/30/2004 12:10:46:805] org.globus.ogsa.impl.base.gram.client.GramJob
[setStatusFromServiceData:1226] ERROR: The ns1:value mpi feature
is not available on this resource.
====== Status Notification ======
Job Status: Failed
Error: Job failed: The ns1:value mpi feature is not available on this resource.

This error is most likely because of mistakes in the job XML. Common mistakes can be made in the executable, the standard input, and especially in the job type. For example, consider the XML:

This XML is a common mistake made when installing both PBS and SGE MMJFS services. It requires your scheduler to use MPI for job dispatching, which is most likely not installed in your cluster by default:

Single-machine clusters use the job type single, multimachine clusters use multiple, and MPI-enabled clusters use mpi.

Other common XML mistakes include the following:

1. Wrong job output folder:

2. Invalid input files:

3. Invalid executable:

172 Grid Computing for Developers

5.6.4.3 The Jobs Are Put in Pending State
For example, some jobs take an unusually long execution time and return the output:

WAITING FOR JOB TO FINISH
====== Status Notification ======
Job Status: Pending

A possible cause for this error is that the system load is too high. GT 3.2 requires significant computing power from your system. If you run the GT container and the scheduler daemons on the same host, this error is likely to occur. To diagnose, you can monitor the queue using the client commands. For SGE, use the following command:

>qstat -f
queuename          qtype  used/tot.  load_avg  arch      states
----------------------------------------------------------------
all.q@vm-rhl8      BIP    0/1        3.31      lx24-x86  a

####################################################################
 - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
####################################################################
      3 0.56000 data       root    qw 08/30/2004 12:45:34  1 1-3:1
      4 0.56000 data       globus  qw 08/30/2004 12:52:01  1 1-3:1
      5 0.56000 data       sge     qw 08/30/2004 12:56:03  1 1-3:1
      6 0.56000 simple.sh  sge     qw 08/30/2004 13:00:03  1

As we can see, four jobs are in "Pending" state, thus unable to be dispatched. For example, to diagnose why job number 4 cannot be dispatched, run the following command for SGE:

>qstat -j 4
job_number:         4
exec_file:          job_scripts/4
submission_time:    Mon Aug 30 12:52:01 2004
owner:              globus
uid:                501
group:              globus
gid:                501
sge_o_home:         /home/globus
sge_o_log_name:     globus
sge_o_workdir:      /tmp
sge_o_host:         vm-rhl8
account:            sge
stderr_path_list:   /home/globus/.globus/job/192.168.74.128/hash-32459003-1093884590771/stderr.$TASK_ID
hard resource_list: h_rt=3600,h_cpu=3600,h_data=2048M
mail_list:          [email protected]
notify:             FALSE
job_name:           data
stdout_path_list:   /home/globus/.globus/job/192.168.74.128/hash-32459003-1093884590771/stdout.$TASK_ID
jobshare:           0
hard_queue_list:    all.q
shell_list:         /bin/sh
env_list:
script_file:        /home/globus/.globus/.gass_cache/local/md5/97/6c7bd37461e926114d8a3d743f5c57/md5/aa/121153a3fc6e4e3ffe5218e813de28/data
job-array tasks:    1-3:1
scheduling info:    queue instance "all.q@vm-rhl8" dropped because it is
                    overloaded: np_load_avg=2.730000 (no load adjustment) >= 1.75

All queues dropped because of overload or full

Reason: System load too high 2.73 >= 1.75

The reason this job cannot be dispatched is that the system load is over the threshold required for job submission by SGE. This can be easily fixed by stopping applications that are consuming high levels of system resources.
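Before stopping anything, it is worth confirming the diagnosis from the shell. A sketch for Linux hosts (np_load_avg is roughly the load average divided by the processor count; exact command options vary by distribution):

```shell
#!/bin/sh
# Compare the 1-minute load average against the processor count to see
# whether SGE's np_load_avg threshold (1.75 in the example) is exceeded.
load=`uptime | sed 's/.*load average[s]*: *//' | cut -d, -f1`
cpus=`grep -c ^processor /proc/cpuinfo 2>/dev/null || echo 1`
echo "1-minute load: $load across $cpus processor(s)"

# The usual suspects: list the top CPU consumers.
ps -eo pid,pcpu,comm --sort=-pcpu | head -5
```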

5.6.4.4 Troubleshooting the Scheduler Interface Script
If none of these suggestions fixes the errors, it is likely that there is a problem with the scheduler script. Fortunately, this can be easily tested from the command line with a few easy scripts.

Scheduler Interface Test
When MMJFS attempts to submit a job to a backend scheduler, it runs the PERL script globus-job-manager-script.pl using the following format:

$GLOBUS_LOCATION/libexec/globus-job-manager-script.pl -m SCHEDULER -f SCRIPT_FILE -c METHOD

where SCHEDULER is the scheduler type (pbs, fork, lsf, or sge), SCRIPT_FILE is the job description script file, and METHOD is the scheduler method to be run (submit, poll, or cancel). By creating a simple wrapper to this command and a few input job scripts, you can write a generic script that can be used to test all the methods of any scheduler interface (see Listing 5.11).

LISTING 5.11 Generic Scheduler Interface Test Script (test_scheduler.sh)

#
# Test a Scheduler interface
#
USAGE="$0 SCRIPT_FILE SCHEDULER METHOD"

if [ -z "$GLOBUS_LOCATION" ] ; then
    echo "Must set GLOBUS_LOCATION"
    exit 1
fi

if [ "$1" == "" ] ; then
    echo "Scheduler script is required. $USAGE"
    exit 1
fi

script=$1

if [ "$2" == "" ] ; then
    echo "Scheduler type is required. $USAGE"
    exit 1
fi

scheduler=$2

if [ "$3" == "" ] ; then
    echo "A method is required. $USAGE"
    exit 1
fi

method=$3

# run
$GLOBUS_LOCATION/libexec/globus-job-manager-script.pl \
    -m $scheduler -f $script -c $method

Testing the Submit Method
To test the submit method, a job description file is needed. For example, see Listing 5.12.

LISTING 5.12 Submit Method Job Description File (submit_job_description.10465)

$description = {
    'jobdir' => [ '/home/globus/.globus/job/192.168.74.131/hash-18618854-1106935620669' ],
    'count' => [ '1' ],
    'stdin' => [ '/dev/null' ],
    'cachetag' => [ 'http://192.168.74.131:33786/ogsa/services/base/gram/PbsManagedJobFactoryService/hash-18618854-1106935620669' ],
    'stderr' => [ '/home/globus/.globus/job/192.168.74.131/hash-18618854-1106935620669/stderr' ],
    'directory' => [ '/tmp' ],
    'environment' => [
        [ 'GLOBUS_DUROC_SUBJOB_INDEX', '0' ],
        [ 'PI', '3.141' ],
        [ 'X509_USER_PROXY', '/home/globus/.globus/job/192.168.74.131/hash-18618854-1106935620669/x509_up' ],
        [ 'HOME', '/home/globus' ],
        [ 'GLOBUS_LOCATION', '/opt/gt32' ],
        [ 'X509_CERT_DIR', '/etc/grid-security/certificates' ],
        [ 'LOGNAME', 'globus' ]
    ],
    'uniqid' => [ 'hash-18618854-1106935620669' ],
    'executable' => [ '/bin/echo' ],
    'arguments' => [ 'Hello World' ],
    'jobtype' => [ 'single' ],
    'grammyjob' => [ 'collective' ],
    'dryrun' => [ 'no' ],
    'stdout' => [ '/home/globus/.globus/job/192.168.74.131/hash-18618854-1106935620669/stdout' ],
    '_description_file' => '/tmp/gram_job_mgr2582.tmp',
};

To start the test for the submit method of the PBS scheduler, run:

test_scheduler.sh submit_job_description.10465 pbs submit

Sample output:

GRAM_SCRIPT_GT3_FAILURE_MESSAGE:qsub: Job exceeds queue resource limits

GRAM_SCRIPT_ERROR:17

This output is telling us that there is an error with the submit method or the initialization parameters in pbs.pm. It turns out that the section

BEGIN {
    ...
    $cluster = 1;
    $cpu_per_node = 1;
    $remote_shell = '/usr/bin/ssh';
    ...
}

is invalid for PBS single-node systems. In this case, we ran this test on a single-node IBM ThinkPad; the value of $cluster should be 0 for single-node systems and 1 for multinode systems. After an update to pbs.pm, the test displays the following:

./test_scheduler.sh submit_job_description.10465 pbs submit

GRAM_SCRIPT_JOB_ID:9.vm-rhl8-1.ibm.com
GRAM_SCRIPT_JOB_STATE:1

This indicates a successful submission with job ID 9.[HOST_NAME].

Testing the Poll Method
For the poll method, a different job description file is needed. For example, see Listing 5.13.

LISTING 5.13 Poll Method Job Description File (poll_job_description.16898) and Test Run

$description = {
    'jobid' => [ '9.vm-rhl8-1.ibm.com' ]
};

test_scheduler.sh poll_job_description.16898 pbs poll

GRAM_SCRIPT_JOB_STATE:8

In this particular case, a job state of 8 means that the job is finished (DONE).
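The remaining method, cancel, can be exercised the same way. A hypothetical run (the description file name is made up; the jobid must be one of your own submissions):

```shell
#!/bin/sh
# Build a cancel job description in the same Perl-hash format used for
# poll (Listing 5.13) and feed it to the generic test wrapper.
cat > cancel_job_description <<'EOF'
$description = { 'jobid' => [ '9.vm-rhl8-1.ibm.com' ] };
EOF

# Guarded so the sketch is harmless where the wrapper is not present.
if [ -x ./test_scheduler.sh ]; then
    ./test_scheduler.sh cancel_job_description pbs cancel
else
    echo "test_scheduler.sh not found in the current directory"
fi
```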

5.6.5 Packaging Advice
If you want to distribute your scheduler interface, you must follow a significant set of steps to bundle the files using the GPT. For example, your script should not hardcode any paths; thus, the initialization section should be as shown in Listing 5.14.

LISTING 5.14 Scheduler Initialization for GPT

BEGIN {
    $qsub    = '@QSUB@';
    $qstat   = '@QSTAT@';
    $qdel    = '@QDEL@';
    $qselect = '@QSELECT@';
    $qhost   = '@QHOST@';
    $qconf   = '@QCONF@';
    $qacct   = '@QACCT@';

    $mpirun    = '@MPIRUN@';
    $sun_mprun = '@SUN_MPRUN@';
    $mpi_pe    = '@MPI_PE@';

    $cat = '@CAT@';

    $SGE_ROOT    = '@SGE_ROOT@';
    $SGE_CELL    = '@SGE_CELL@';
    $SGE_MODE    = '@SGE_MODE@';
    $SGE_RELEASE = '@SGE_RELEASE@';
}
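At build time, GPT's configure step replaces the @NAME@ placeholders with paths discovered on the target system. The real packaging uses autoconf-style substitution; the following is only a toy illustration of that idea, with made-up paths:

```shell
#!/bin/sh
# Toy stand-in for configure-time substitution: rewrite @QSUB@-style
# placeholders in a .pm.in template into a concrete .pm file.
cat > sge.pm.in <<'EOF'
BEGIN {
    $qsub  = '@QSUB@';
    $qstat = '@QSTAT@';
}
EOF

QSUB=/opt/sge/bin/qsub        # hypothetical install locations
QSTAT=/opt/sge/bin/qstat
sed -e "s|@QSUB@|$QSUB|" -e "s|@QSTAT@|$QSTAT|" sge.pm.in > sge.pm
cat sge.pm
```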

A set of configuration scripts must be created, including the following:

Scheduler setup script: This script must perform the following steps: install the GRAM scheduler PERL module and register a gatekeeper service; install an RSL validation file defining extra scheduler-specific RSL attributes; and update the GPT metadata to indicate that the job manager service has been set up.

Packaging scripts: These are required by GPT to build and install your package and include a configure script, a package metadata file, auto-make, and bootstrap files. Globus provides extensive documentation on packaging scheduler interfaces in the "Scheduler Tutorial" of the GRAM developer's guide, available at http://www-unix.globus.org/toolkit/docs/3.2/gram/ws/developer/scheduler.html.

5.7 FACTORS FOR CHOOSING THE RIGHT SCHEDULER

There are many factors to consider when choosing the right scheduler for your organization. Among the most important are features; installation, configuration, and usability; user interfaces; support for open standards; interoperability; the organization's requirements; and support for grid middleware.

5.7.1 Features
All schedulers covered in this chapter are sufficient for the basic workload management functionality, that is, job submission, collection of performance data, security, and management policies. For more advanced management policies such as metascheduling, reservations, and quality of service, the SILVER/Maui Metascheduler may be a good choice, although it requires advanced configuration that is not properly documented. The lack of good documentation seems to be a common problem for most open software. SGE provides other advanced features such as ticket-based policies, fairshare, and node allocation.

5.7.2 Installation, Configuration, and Usability
Easy installation can provide significant time-savings that will translate into productivity increases. Of the schedulers covered in this chapter, PBS offers the simplest and easiest installation. It is also the most widely used scheduler today. NASA pioneered PBS, and it has gained a large user base over the years. It is currently owned by Altair Engineering. SGE and Condor-G have more complex installation processes. There are no binaries available, so they must be built from source.

5.7.3 User Interfaces
Good user interfaces improve usability, which translates into increased productivity. All schedulers support the basic command-line interface. Many of these commands have the same names (SGE and PBS), which is confusing when both packages are run on the same machine. Both OpenPBS and SGE provide an X Window-based graphical interface for job submission and miscellaneous monitoring and administration tasks. However, SGE, which is a commercial product by Sun Microsystems, offers the best features. PBS Pro includes support for Windows systems and an extra Web-based interface for submission and monitoring.

5.7.4 Support for Open Standards
Workload management software was developed in the mainframe era for the mainframe. In those days, software companies were not interested in open standards but, rather, profit and competition. Almost every scheduler available uses a closed architecture. Client-server interaction is usually performed by sending messages in a proprietary binary format, thus providing no interoperability. Condor-G makes a nice attempt to support standards by allowing the output to be presented in XML format. It would be nice to find a scheduler that sends messages through Simple Object Access Protocol (SOAP) and includes support for the latest Web Services and Web standards.

5.7.5 Interoperability
If what your company is looking for is interoperability among different platforms, you will be very disappointed. By interoperability, we mean support for SOAP/Web Services, which allows a scheduler to interact with virtually any Web Services-enabled application. None of the schedulers covered in this chapter includes support for any open standard, although OGSA can provide the glue between workload management and Web Services.

5.7.6 Organization’s Requirements Each organization has a different set of requirements when selecting the right ap- proach to workload management. For example, some organizations may require a multiplatform environment (both Unix and Windows) system to be used. Others may be Unix only. The hardware available could be Intel-compatible only or it may be a cluster of multiprocessor machines:

If your organization requires a multiplatform cluster (Windows/Unix), only PBS and its commercial sibling PBS Pro provide such functionality. For a Unix-only or a multiprocessor cluster, any package will suffice.

Budget is another important factor to consider when choosing the right scheduler. For companies with low budgets, open source is the way to go. All schedulers described here are freely available, although OpenPBS offers an enhanced commercial version (PBS Pro) that supports Windows environments. PBS Pro is the only scheduler that can run on both Windows and Unix systems.

5.7.7 Support for Grid Middleware
Globus has included support for major schedulers through a scheduler interface. This interface is basically an OGSA or WSRF service that calls a set of PERL scripts, which in turn build a job definition file for submission by the scheduler CLI. Currently, Globus bundles out-of-the-box support for PBS, LSF, Condor-G, and SGE, although other schedulers can be manually integrated. These factors should help you determine which package best fits your company's needs. Ultimately, it is up to the reader to weigh the pros and cons of these packages and make the right selection.

5.8 SUMMARY

Schedulers are advanced resource management tools for heterogeneous distributed computing environments. The goal of schedulers is to allow organizations to best achieve enterprise goals such as productivity, timeliness, and level of service. Schedulers use complex management policies and sophisticated scheduling algorithms to achieve dynamic collection of performance data, enhanced security, high-level policy administration, and quality of service. Schedulers can handle computationally demanding tasks through the transparent distribution of the associated workload. This chapter has presented a comprehensive overview of popular schedulers and focused on their integration with grid middleware. Schedulers constitute one of the foundations on which Grid Services are built.

6 Open Grid Services Architecture (OGSA)

In This Chapter

Web Services versus Grid Services
OGSA Service Model
Standard Interfaces
Other Service Models
Service Factories
Lifetime Management
Handles and References
Service Information and Discovery
Notifications
Network Protocol Bindings
Higher-Level Services
Other Interfaces
Hosting Environments
Virtual Organizations (VOs)
The Globus Toolkit
Simple Grid Services: Large Integer Factorization
Summary
References

OGSA is a refinement and an extension of the Web Services architecture and the Web Service Description Language (WSDL). WSDL was designed to provide for extensions to the core language through a series of hooks to make that possible. OGSA defines extensions, including the concept of service types, that allow us to describe families of services defined by collections of ports of specific types. These extensions provide a mechanism to describe service semantic evolution and versioning [OGSAAnalysis]. The topics covered in this chapter include the following:


OGSA service models
Standard interfaces
Service factories
Lifetime management
Handles and references
Service information and discovery
Notifications
Network protocol bindings
Higher-level services and interfaces
Hosting environments
Virtual organizations
Sample grid services

6.1 WEB SERVICES VERSUS GRID SERVICES

Web Services emerged from the need for a simple framework for business-to-business (B2B) computing. Web Services have created an exciting service provider industry that has been the goal of the business marketplace for distributed information management [OGSAAnalysis]. Web Services provide a standard means of interoperating between different software applications, running on a variety of platforms and frameworks. A Web Services architecture identifies global elements in a network that are required to ensure interoperability between Web Services [WebServices01].

A Web Service is designed to support interoperable machine-to-machine interaction over a network. It has an interface described by the WSDL. Other systems interact with the Web Service using Simple Object Access Protocol (SOAP) messages, typically using HTTP with an XML serialization in conjunction with other Web standards [SOAP1.2.1.03].

Message exchange is defined by a Web Service description (WSD). The WSD is a specification of the Web Service's interface, written in WSDL. It defines the message formats, data types, transport protocols, and transport serialization formats that should be used between the requester agent and the provider agent. It also specifies one or more network locations at which a provider agent can be invoked and may provide some information about the message exchange pattern that is expected [URIRFC98].

The Web Services model is based on the following standards:

SOAP defines the mechanism for message exchange through XML [SOAP1.2.1.03]. SOAP uses XML envelopes and a remote procedure call convention. SOAP is protocol independent, so it can exchange messages through HTTP, FTP, or any other network protocol.

The WSDL defines the XML schema and language used to describe a Web Service. Each Web Service is an entity, which is defined by ports that are service endpoints capable of exchanging a set of messages defined by the port type [OGSAAnalysis].

The Universal Description, Discovery and Integration (UDDI) and the Web Services Inspection Language (WSIL) provide the mechanism needed to discover WSDL documents [WebServices01]. UDDI is a specification for a registry that can be used by a service provider as a place to publish WSDL documents. Clients can then search the registry looking for services and fetch the WSDL documents needed to access them. WSIL provides a simple way to find WSDL documents on a Web site. These discovery mechanisms are equivalent to the Grid Information Services provided by Globus.

The Web Services model provides two advantages to grid services:

Support for dynamic discovery and composition of services in heterogeneous distributed environments through WSDL. WSDL provides the mechanism for discovering and defining interface definitions and endpoint implementation descriptions [GridPhysiology02].

The widespread adoption of Web Services mechanisms allows the framework to exploit new tools and technologies such as WSDL processors, workflow systems, language bindings, and hosting environments.

A grid service is a Web Service that provides a set of well-defined interfaces and that follows specific conventions. The interfaces address discovery, dynamic service creation, lifetime management, notification, and manageability [GridPhysiology02]. The basic characteristics that separate a grid service from a Web Service are the following [OGSAAnalysis]:

A grid service must be an instance of a service implementation of some service type.

It must have a grid service handle (GSH), which is a type of uniform resource identifier (URI) for the service instance. The GSH is not a direct link to the service instance but, rather, is bound to a grid service reference (GSR). The GSH provides a way to locate the current GSR for the service instance because the GSR may change if the service instance changes or is upgraded.

A grid service instance must implement the GridService interface with three operations:
a. FindServiceData: for service metadata discovery and information
b. Destroy: for clients to destroy instances
c. SetTerminationTime: for service lifetime management

6.2 OGSA SERVICE MODEL

The OGSA service model represents computational resources such as filesystems, databases, programs, networks, and so forth as services, thus creating a virtual representation of the environment. Grid services are an enhancement of Web Services that follow a specific convention and a set of standard interfaces from which all services are implemented. In this way, services can be treated in a uniform manner, providing a layer of abstraction [GridPhysiology02].

A grid service may implement many interfaces, which define a set of operations that exchange a defined sequence of messages. These interfaces correspond to PortTypes and are defined as extensions to WSDL. Grid services maintain internal state for the lifetime of the service through a grid service instance. Grid services interact by exchanging messages, and their interfaces are bound by a protocol to define delivery semantics. Another protocol binding behavior defines mutual authentication during communication.

6.3 STANDARD INTERFACES

A set of basic OGSA interfaces (WSDL PortTypes) can be used to manipulate service abstractions and provide a rich range of services. OGSA has defined the standard interfaces shown in Table 6.1 [GridPhysiology02].

TABLE 6.1 Standard OGSA Interfaces

PortType            Operation                     Description

GridService         FindServiceData               Provides information about the service instance such as handle, reference, primary key, and service-specific information (e.g., service instances known to a registry). Provides extensible support for various query languages.
                    SetTerminationTime            Set termination time.
                    Destroy                       Terminate service.
NotificationSource  SubscribeToNotificationTopic  Subscribe to notifications of service-related events, based on message type and interest statement. Allows for delivery via third-party messaging services.
NotificationSink    DeliverNotification           Carry out asynchronous delivery of notification messages.
Registry            RegisterService               Soft-state registration of a GSH.
                    UnregisterService             Unregister a GSH.
Factory             CreateService                 Create new grid service instance.
HandleMap           FindByHandle                  Return GSR associated with supplied GSH.

Besides these interfaces, OGSA includes other characteristics such as service discovery, factories, dynamic service creation and destruction, notification, and lifetime management.

6.4 OTHER SERVICE MODELS

The Web Service model provides OGSA with more interoperability than other, more mature technologies such as the Common Object Request Broker Architecture (CORBA). Many people have debated the use of CORBA as the OGSA model; however, CORBA has always focused on source code compatibility rather than on interoperability between different implementations, which leaves little flexibility. Because the Web Service model is primarily concerned with the specification of service properties such as the interface and the specification of the port-to-protocol bindings, it allows great flexibility in the way the service's hosting environment is implemented.

New architectural styles for Open Grid Services and network-based software have received attention and are worth mentioning. One of these new ideas is called the representational state transfer (REST), which is based on the principles that have helped the Web achieve success. These principles include statelessness, low entry barriers, and an emphasis on a small number of operations (get, put) applied to a large number of network URLs that constitute the Web [NetArch00].

Grid services are not stateless and rely on a small number of methods to retrieve data, resulting in a more extensible and interoperable framework. Nevertheless, some problems with the Web Service model make OGSA less suitable for grid computing:

The use of WSDL extensions to specify additional semantics may be risky to preserving interoperability with other WSDL clients. Some toolkits do not properly ignore extensibility elements that are not understood. It should be clearly stated, in the OGSA specifications, which WSDL extensions are required and which are not.

Current extensibility mechanisms provided by Web Services may be inadequate. No community process exists to define new extensions in an orderly manner, and no mechanisms exist for the discovery of new extensions.

Ports are not separately addressable, but are merely a different interface to the service. Thus, multiple ports with different names but the same port types are defined as semantically equivalent. This serves well for business transactions, where the notion of port identity is meaningless. However, in the scientific world, we may need to access multiple instruments using multiple protocols, in which case port identity becomes relevant. For example, consider controlling an electron microscope possessing two electron guns, with identical characteristics, through a Web Service via two different protocols. The OGSA document for this service will now have four ports with the same PortType. No distinction can be made, however, between two ports that control the same electron gun via different protocols, or two ports that control different electron guns.

6.5 SERVICE FACTORIES

Service factories in OGSA are interfaces that create other services. They return the GSH and GSR for the new service instance. Factories do not define how instances are created. This gives the hosting environment the flexibility to define how interfaces are implemented, so service implementation is transparent to service factories. Factories can create other factories by delegation.

6.6 LIFETIME MANAGEMENT

Lifetime management defines when a service should be terminated so resources can be reclaimed by the system. Traditionally, services are created to perform a task and terminate on completion or by request of some other service or entity. In distributed systems, where services communicate through messages, services may fail, thus rendering other services unable to terminate. OGSA overcomes this issue by giving a service an initial lifetime. Lifetimes can be adjusted by the client or by another service acting on the client's behalf. On lifetime expiration, either the hosting environment or the instance itself is at liberty to terminate the instance and release resources. Lifetime is defined by the SetTerminationTime operation in the GridService interface [GridPhysiology02]. Lifetime management in OGSA has been implemented with the following characteristics:

Clients know when a service should terminate and control lifetime and status update requests. Hosting environments can terminate services if the lifetime has been exceeded.

6.7 HANDLES AND REFERENCES

Grid services are dynamic and stateful; thus, they can be created and destroyed dynamically. GSHs are unique identifiers used to distinguish instances from one another. A GSH carries no instance- or protocol-specific information such as network addresses or protocol bindings. Instead, this information is encapsulated in a GSR. Unlike a GSH, which is invariant, the GSRs for a grid service instance can change over that service's lifetime. A GSR carries an expiration time and allows services to be upgraded during their lifetimes [GridPhysiology02].

GSRs present a challenge when obtaining references after the service creation operation expires. OGSA handles this issue by transforming handles to references using a handle-to-reference mapper interface. This interface can have access control so requests can be denied, but it does not guarantee the service can be contacted, for example, if the service failed or was explicitly terminated.

6.8 SERVICE INFORMATION AND DISCOVERY

Service information allows for the discovery of service characteristics through the concept of service data. Service data, or metadata, are a set of XML elements that describe characteristics of a service. Among the basic service data elements defined by OGSA are the GSH, GSR, primary key, and handleMap. The findServiceData operation from the GridService interface can be used to query for service data, and it is extensible to allow any query language to be used [GridPhysiology02].

Service discovery is provided by a registry service. A GSH is capable of registering with the registry service, and the findServiceData operation from the GridService interface can be used to query for specific metadata.

6.9 NOTIFICATIONS

OGSA clients are capable of receiving notifications about particular messages delivered asynchronously. Clients wanting to receive notifications must implement the NotificationSink interface. Services wanting to accept subscriptions must implement NotificationSource to handle subscriptions. The notification process starts with a client requesting a subscription from the notification source, sending the GSH of the sink. The source will then push messages that meet a certain criterion to the sink, which will respond with periodic keep-alive messages. The notification framework allows both for direct service-to-service notification delivery and for integration with various third-party services, such as messaging services commonly used in the commercial world, or custom notification services [GridPhysiology02].

6.10 NETWORK PROTOCOL BINDINGS

Web Services can use a variety of protocol bindings such as SOAP, HTTP, or SSL for security. OGSA addresses four requirements when selecting network protocol bindings:

Reliability for transport service invocation.

Authentication and delegation. As discussed earlier, the grid services abstraction can require support for communication of proxy credentials to remote sites. One way to address this requirement is to incorporate appropriate support within the network protocol binding, as for example in TLS extended with proxy credential support.

Ubiquity. The grid goal of enabling the dynamic formation of VOs from distributed resources means that, in principle, it must be possible for any arbitrary pair of services to interact.

GSR format. Recall that the GSR can take a binding-specific format. One possible GSR format is a WSDL document; CORBA is another.

6.11 HIGHER-LEVEL SERVICES

The OGSA community is committed to providing many high-level services required by today’s business and scientific applications including [GridPhysiology02]:

Distributed data management services, supporting access to and manipulation of data such as databases or files. Services of interest include database access, data translation, replica management, replica location, and transactions.

Workflow services, supporting the execution of multiple application tasks on multiple distributed grid resources.

Auditing services, supporting the recording of usage data, secure storage of that data, analysis of that data for purposes of fraud and intrusion detection, and so forth.

Instrumentation and monitoring services, supporting the discovery of "sensors" in a distributed environment, the collection and analysis of information from these sensors, the generation of alerts when unusual conditions are detected, and so forth.

Problem determination services for distributed computing, including dump, trace, and log mechanisms with event tagging and correlation capabilities.

Security protocol mapping services, enabling distributed security protocols to be transparently mapped onto native platform security services to support distributed authentication and access control mechanisms.

6.12 OTHER INTERFACES

Many optional interfaces have been proposed for OGSA. Among the most relevant are a manageability interface, which allows large numbers of service instances to be managed by consoles and other tools, and a concurrency interface, which provides concurrency control operations.

6.13 HOSTING ENVIRONMENTS

A hosting environment is required to instantiate grid services. The role of the hosting environment is to provide an implementation, programming model, and language for the semantics of a grid service. OGSA delegates issues of implementation, programming model, language, tools, and execution to the hosting environment [GridPhysiology02]. The hosting environment of choice for Globus is an enhanced version of the Apache Axis Web Services container. Nevertheless, OGSA can be deployed in other servlet containers such as Tomcat, WebSphere, and others. A hosting environment should be able to do the following:

Map grid service handles to platform-specific entities such as pointers or objects

Dispatch grid invocations and notifications into implementation-specific events or procedure calls

Provide protocol processing and the formatting of data for network transmission

Provide lifetime management of service instances and authentication

6.14 VIRTUAL ORGANIZATIONS (VOS)

In today’s world where information is power, VOs are becoming ubiquitous. A VO can be defined as a geographically distributed organization whose members seek a long-term common interest, and who communicate and coordinate their work through information technology [NetVOs98]. VOs provide the following features:

Team-based and distributed structures

Informal communication methods such as email, instant messaging, and so on

Decentralized and nonhierarchical organization

Flexible and fast responses to changing business needs

Historically, virtual organizations have used electronic mail to share informa- tion and coordinate their work. Recently, new communication technologies have facilitated lateral communication with little regard for traditional hierarchy. VOs differ from traditional organizations by the degree of centralization and hierarchy they are built on. In many traditional organizations, the centralization or hierarchy is in the authority structure and is related to status and tenure differences. Because VOs are highly decentralized structures, they are becoming an in- creasingly popular type of work environment. Enter the concept of grid computing. Grid computing enables and simplifies collaboration among the members of a VO. Evolved from many concepts on distributed computing, grids take these distributed capabilities even further through open standards to enable distributed heteroge- neous systems to work together as a single virtual system. Users in a VO can access these resources by means of policies, information sharing, resource management, and security. The users of a grid can be members of several real or virtual organizations. Grids enforce security rules and implement policies to resolve priorities among Open Grid Services Architecture (OGSA) 191

competing users. A grid virtualizes disparate and geographically dispersed resources to present a simpler view to a virtual organization [IBMGrid02]. Virtual organizations can benefit from computational grids in many ways:

By providing access to increased quantities of computing resources and special devices, thus increasing resource usage efficiency.

By maximizing underutilized resources. It has been estimated that desktop machines are busy less than 5% of the time.

By better balancing resource utilization. Organizations have activity peaks that demand more resources. Grids are capable of moving applications to underutilized machines during such peaks, thus providing dynamic load balancing.

By providing software-based fault tolerance against power failures. Current fault tolerance systems use expensive hardware to increase reliability. Grid management software can dynamically resubmit tasks to other machines when failures are detected.

By providing advanced resource management to better manage a large and dispersed IT infrastructure; for example, visualizing resource capacity and utilization, thus making it easier for an IT department to control expenditures on computing resources in a large organization.

One of the most attractive features provided by grid computing is the potential for massive parallel CPU capacity. Tapping into such power is driving the evolution of many of today's industries, such as biomedical, financial, oil, motion picture, and others.

6.15 THE GLOBUS TOOLKIT

The Globus Toolkit was conceived as an open community of architectures, services, and software libraries that support grids and grid applications [GridPhysiology02]. The Globus Toolkit (GT) software was created by the Globus Alliance, an R&D group based at Argonne National Laboratory, the University of Southern California's Information Sciences Institute, the University of Chicago, the University of Edinburgh, and the Swedish Center for Parallel Computers. Its previous release, Globus Toolkit 2.0, won the 2002 R&D 100 award from R&D magazine—a very impressive achievement for such a relatively new technology.

Because of this success, GT has been called the de facto standard for grid computing. It has been embraced by major vendors such as Compaq, Cray, SGI, Sun, Fujitsu, Hitachi, NEC, and IBM, which have implemented optimized versions of the toolkit for their systems. The Globus Toolkit provides services in the following major areas of distributed systems: core services, security, and data management.

6.15.1 Core Services

Core services provide the basic infrastructure needed to create grid services, including authorization, message-level security, and system-level services such as ping, log management, and management services to monitor the current status and load of grid containers and to shut them down.

6.15.2 Security

Security is implemented in two components: the grid security infrastructure (GSI) and the Community Authorization Service (CAS). GSI lays the foundation for security in GT. It is built on top of open standards such as public key cryptography, digital signatures, certificates, mutual authentication, delegation, and single sign-on. CAS builds on GSI to provide communities with access control policies, delegating access control policy management to the community itself [CASGrid02]. The CAS server takes care of day-to-day policy administration tasks such as adding, deleting, and changing users.

6.15.3 Data Management

Data management is composed of four protocols: GridFTP, Reliable File Transfer (RFT), Replica Location Service (RLS), and Extensible Input/Output (XIO).

GridFTP: This is a high-performance, secure, and reliable data transfer protocol based on the standard FTP. GridFTP enhances the FTP standard with GSI security, multiple data channels for parallel transfers, third-party transfers, authenticated reusable data channels, and command pipelining.

Reliable File Transfer: RFT is an OGSA-compliant service that provides reliability and fault tolerance for file transfers. RFT can checkpoint and restart transfers in case of service failures. Users can query the service for transfer status or to receive completion notifications. For example, a system administrator in a VO may need to transfer hundreds of database files across multiple systems over many weeks. RFT is capable of keeping socket connections open for weeks and tracking the transfers by storing transaction data in a database. Finally, RFT can notify the user by email or another method when the transfer has completed.

Replica Location Service: RLS is designed as a component of the data grid architecture. RLS provides access to mapping information from logical names for data items to target names. Its goal is to reduce access latency, improve data locality, and increase robustness, scalability, and performance for distributed applications. RLS produces local replica catalogs (LRCs), which are mappings between logical and physical files scattered across storage systems. An LRC can be indexed for faster performance.

XIO: The goal of Globus XIO is to provide a single API for input and output operations in all grid protocols. XIO has the following characteristics:

It is extensible, simple, and intuitive. It implements open/close and read/write operations.

It is designed to support new protocols in the form of drivers. Drivers are transparent to the application developer. No source code changes are required if the underlying protocol driver changes.

It relieves developers from the headaches of error checking, asynchronous message delivery, and timeouts in protocol implementation. For example, a GRAM driver can be written to run on TCP or UDP.

The Globus Toolkit has been an evolutionary process dedicated to the open-source philosophy. Since its inception in 1996, the Globus Toolkit has evolved from a monolithic remote shell-like interface to an open services infrastructure and beyond. With the soon-to-be-released GT4, the Globus Toolkit is certainly at the vanguard of the open high-performance computing world.

6.16 SAMPLE GRID SERVICES: LARGE INTEGER FACTORIZATION

The following sections describe a sample grid service that factors integers greater than 100 digits in size, which may interest any reader involved in the field of cryptography.

6.16.1 Overview: Large Integer Factorization and Public Key Cryptosystems

Public-key algorithms use a pair of related keys. One key is private and must be kept secret, whereas the other is public and can be distributed; it should also be infeasible to generate one key from the other. Diffie and Hellman invented this concept in the 1970s. Since then, many public-key algorithms have been developed; among the simplest and most practical are RSA, ElGamal, and Rabin. Of these three, RSA is the most popular and easiest to understand [Schneier96].

Public-key algorithms are slower than symmetric or single-key algorithms and in reality are impractical for bulk data encryption. They are commonly used to encrypt random session keys, which in turn are used to encrypt the bulk of the data.

6.16.1.1 A Little Mathematical Background

One of the reasons large integer factorization samples are used throughout this book is their application in public-key cryptosystems (such as PKI). They are also ideal applications for a distributed or parallel computer system and are easy enough to illustrate. Of all the PKI algorithms, RSA is the easiest to understand and implement. It is named after its inventors Ron Rivest, Adi Shamir, and Leonard Adleman. According to Applied Cryptography by Bruce Schneier, RSA gets its security from the difficulty of factoring large integers (in the range of 150 digits or more). Recovering the plaintext from the public key and the ciphertext is equivalent to factoring the product of the two primes [Schneier96]. The algorithm is as follows: to generate the keys, first choose two large primes p and q of equal length for maximum security. Then compute the product:

n = pq (6.1)

Then choose a random encryption key e less than pq such that e and (p – 1)(q – 1) are relatively prime, which means they have no prime factors in common. e does not have to be prime, but it must be odd; (p – 1)(q – 1) can't be prime because it's an even number [RSAGuts01]. Finally, use the extended Euclidean algorithm to compute a decryption key, d, such that

ed = 1 (mod (p – 1)(q – 1)) (6.2)

Thus, d equals

d = e^–1 mod ((p – 1)(q – 1)) (6.3)
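To make the computation of d concrete, the extended Euclidean algorithm can be sketched in a few lines of Java. The class and method names here are our own illustration, not code from the book's CD-ROM:

```java
public class ExtendedEuclid {
    // Returns {g, x, y} such that a*x + b*y = g = gcd(a, b).
    static long[] extGcd(long a, long b) {
        if (b == 0) {
            return new long[] { a, 1, 0 };
        }
        long[] r = extGcd(b, a % b);
        return new long[] { r[0], r[2], r[1] - (a / b) * r[2] };
    }

    // d = e^-1 mod phi; only valid when gcd(e, phi) == 1.
    static long modInverse(long e, long phi) {
        long x = extGcd(e, phi)[1];
        return ((x % phi) + phi) % phi;
    }

    public static void main(String[] args) {
        // For p = 61, q = 53: phi = (p - 1)(q - 1) = 3120 and e = 17,
        // which matches the worked RSA example later in this section.
        System.out.println(modInverse(17, 3120)); // prints 2753
    }
}
```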

To encrypt a message m, it should be divided into numerical blocks smaller than the modulus, n.

c = m^e mod n (6.4)

where c is the ciphertext and m is the plaintext, both expressed as positive integers. The message being encrypted, m, must be less than the modulus n. To decrypt the message,

m = c^d mod n (6.5)

where c is the ciphertext and m is the plaintext, both expressed as positive integers. The following example, based on [RSAGuts01], can illustrate this point. Given the following:

p = 61        (p and q are primes and must remain secret)
q = 53
n = pq = 3233 (modulus; available to others, discard p and q after)
e = 17        (public exponent)
d = 2753      (private exponent; must be kept secret)

The public key is (e, n). The private key is d.

To encrypt the message 123:

E(123) = 123^17 mod 3233
       = 337587917446653715596592958817679803 mod 3233
       = 855

To decrypt the ciphertext 855:

D(855) = 855^2753 mod 3233 = 123
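The numbers in this example can be checked with java.math.BigInteger, whose modPow and modInverse methods implement exactly the operations above. This is a self-contained sketch; the class name RsaToyExample is ours:

```java
import java.math.BigInteger;

public class RsaToyExample {
    public static void main(String[] args) {
        BigInteger p = BigInteger.valueOf(61);
        BigInteger q = BigInteger.valueOf(53);
        BigInteger n = p.multiply(q);                            // 3233
        BigInteger phi = p.subtract(BigInteger.ONE)
                          .multiply(q.subtract(BigInteger.ONE)); // 3120
        BigInteger e = BigInteger.valueOf(17);
        BigInteger d = e.modInverse(phi);                        // 2753

        BigInteger m = BigInteger.valueOf(123);
        BigInteger c = m.modPow(e, n);                           // E(123) = 855
        BigInteger back = c.modPow(d, n);                        // D(855) = 123

        System.out.println("d = " + d + ", c = " + c + ", m = " + back);
    }
}
```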

Table 6.2 summarizes RSA.

TABLE 6.2 RSA

Public Key    (n, e). n = modulus, the product of 2 large primes, 150 or more
              digits (p and q must remain secret); e = public exponent,
              relatively prime to (p – 1)(q – 1)

Private Key   d = e^–1 mod ((p – 1)(q – 1)); d is the secret exponent

Encrypting    c = m^e mod n

Decrypting    m = c^d mod n

6.16.1.2 Further Information on Large Integer Factorization Algorithms

Large integer factorization is a complex subject that requires some advanced mathematics, especially in the fields of linear algebra and modular arithmetic. We have attempted to give as much context as possible on the algorithms without confusing the reader. If you are looking for an in-depth study of the inner workings of the quadratic or number field sieves, we suggest [BriggsThesis98] as a beginner's guide to the number field sieve. For the more technical reader, [Cohen96, LenstraNFS, Lenstra93, Montgomery95] should keep you busy for a while. The CD-ROM provided with this book contains a Java implementation of the quadratic sieve, which is roughly 10,000 lines in size. Another open source algorithm studied in this book, the public domain GNFS by Chris Monico explained in Chapter 16, is 50,000 lines in size. This should give the reader an idea of the complexity of these algorithms.

6.16.2 Service Implementation

Good candidates for grid services are scientific applications in the fields of mathematics, biology, and business. For this implementation, we have chosen an application in the field of cryptography: large integer factorization. Cryptographic research requires intensive computational power, making it an ideal topic for a grid service. Grid services are designed for distributed systems; thus, our problem should be designed for such environments. There are currently many algorithms for factorization tasks. Among the fastest is the quadratic sieve (QS), invented by Carl Pomerance in 1981, extending earlier ideas of Kraitchik and Dixon. The QS is the fastest algorithm for numbers up to 110 digits long [QuadSieve01].

6.16.3 Factorization with the Quadratic Sieve

The QS is a fairly complex factorization algorithm that requires understanding of several concepts of numerical analysis and modular algebra. The algorithm can be described by the following set of formulas:

x ≢ ±y (mod n) (6.6)

x^2 ≡ y^2 (mod n) (6.7)

where n is the number to factor, and x and y are two numbers that satisfy equations 6.6 and 6.7. From the previous equations, we obtain

(x – y)(x + y) ≡ 0 (mod n) (6.8)

then we compute gcd(x – y, n) using the Euclidean algorithm to see whether it is a nontrivial divisor of n. There is a 50% chance of this being true. Then we define the following:

Q(x) = (x + ⌊√n⌋)^2 – n (6.9)

and compute Q(x_1), Q(x_2), ..., Q(x_k). From these values, we pick a subset Q(x_i1), Q(x_i2), ..., Q(x_ir) whose product is a square.

So finally, we have the equation:

Q(x_i1) Q(x_i2) ... Q(x_ir) ≡ (x_i1 x_i2 ... x_ir)^2 (mod n) (6.10)

If the conditions above hold, then we have found the factors of n [QuadSieve01].
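The congruence-of-squares principle behind equations 6.6 through 6.10 can be illustrated with a much simpler (and far slower) Fermat-style search: look for an x such that x^2 – n is a perfect square y^2, then take gcd(x – y, n). This is our own illustrative sketch, not the book's quadratic sieve implementation; it uses BigInteger.sqrt(), available since Java 9:

```java
import java.math.BigInteger;

public class CongruentSquares {
    // Search upward from ceil(sqrt(n)) for an x with x^2 - n a perfect
    // square y^2; then x^2 ≡ y^2 (mod n) and gcd(x - y, n) is a factor.
    static BigInteger factor(BigInteger n) {
        BigInteger x = n.sqrt().add(BigInteger.ONE);
        while (true) {
            BigInteger diff = x.multiply(x).subtract(n);
            BigInteger y = diff.sqrt();
            if (y.multiply(y).equals(diff)) {
                return x.subtract(y).gcd(n);
            }
            x = x.add(BigInteger.ONE);
        }
    }

    public static void main(String[] args) {
        // n = 3233 = 61 * 53 from the RSA example: x = 57, y = 4
        System.out.println(factor(BigInteger.valueOf(3233))); // prints 53
    }
}
```

The quadratic sieve replaces this blind search with the sieved Q(x) values and linear algebra over the exponent vectors, but the final gcd step is the same.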

6.16.4 Service Architecture

Our OGSA implementation of the large integer factorization service has the following components (see Figure 6.1):

A standalone factorization engine implementing a quadratic sieve.

An OGSA-compliant layer composed of a service factory interface and a service implementation.

A client program that can be used to call instances created by the GT3 container on a given host.

FIGURE 6.1 The quadratic sieve factorization service architecture.

6.16.5 Step 1: Obtaining Factorization Code

The factorization engine can be run in standalone mode and implements a QS algorithm in the Java language, with code based on an elliptic curve algorithm available from Alpern's Elliptic Curve Method Algorithm [ECMAlg]. The algorithm is included in the companion CD-ROM of this book. To run the engine in standalone mode, the following commands can be used:

# Note: Your Java CLASSPATH must be set first!
# Linux : export CLASSPATH=$CLASSPATH:./classes
# Win32 : set CLASSPATH=%CLASSPATH%;classes

java javax.math.factorization.Factorizer -n 213123123123123121321312312345
5 * 379 * 338999 * 331759173429155703289

6.16.6 Step 2: Building a Factorization Service Interface

We start our OGSA service by creating a "service interface" that defines a method to be called remotely by clients (see Listing 6.1).

LISTING 6.1 Factorization.java OGSA Service Interface for Large Integer Factorization

package factorization.service.impl;

/**
 * Title: Factorization Library
 * Description: A factorization lib based on the quadratic sieve
 *
 * @author Vladimir Silva
 * @version 1.0
 */
public interface Factorization {
    /**
     * Remote factorization method
     * @param bigNum  Number to factorize
     * @param outFile Output file path (contains factors)
     * @param options -verbose to display process output
     */
    public String factorize(String bigNum, String outFile, String options);
}

This method takes a number as its argument and dumps the output to the filesystem on the server. Because the factorization process can take months for large numbers, an XML status string is returned by the server. The server logs can be monitored for the actual process output.

6.16.7 Step 3: Creating the Service Implementation

The service implementation contains the actual code to be executed whenever a factory instance is invoked. This code calls the QS factorization engine with the required arguments (see Listing 6.2).

LISTING 6.2 FactorizationImpl.java Service Implementation

package factorization.service.impl;

/**
 * Title: Factorization Library
 * Description: Factorization Service Implementation
 *
 * @author Vladimir Silva
 * @version 1.0
 */

import org.globus.ogsa.impl.ogsi.GridServiceImpl;
import factorization.service.Factorization.FactorizationPortType;
import java.rmi.RemoteException;

import javax.math.factorization.Factorizer;

/**
 * Factorization service implementation
 */
public class FactorizationImpl extends GridServiceImpl
    implements FactorizationPortType {

    public FactorizationImpl() {
        super("Factorization Service");
    }

    /**
     * Factorization method
     * @param bigNum  Number to factor
     * @param outFile Full path to the server-side output file
     *                (contains factors of the form p(1)^exp(1) * ...)
     * @param options Use "-verbose" to display debug messages
     */
    public String factorize(String bigNum, String outFile, String options)
        throws RemoteException {
        if (options == null) {
            options = "";
        }
        String[] args = { "-n", bigNum, "-out", outFile, options };

        try {
            Factorizer f = new Factorizer(args);
            // dump output: to the console or a file
            f.dumpOutput();
            return buildXML(false, "Out in: " + outFile);
        } catch (Exception ex) {
            return buildXML(true,
                ex.getClass().getName() + ": " + ex.getMessage());
        }
    }

    /*
     * An XML string is returned with the service status
     */
    private String buildXML(boolean error, String text) {
        return "" + error + "" + text + "";
    }
}

6.16.8 Step 4: Creating a Service Deployment Descriptor

The service deployment descriptor tells the GT3 container where to find the service components such as schema (WSDL) files, the service interface, and the implementation (see Listing 6.3).

LISTING 6.3 Large Integer Factorization OGSA Service Deployment Descriptor

value="Factorization Service Instance" />

6.16.9 Step 5: Creating the Schema Files and Grid Archives (GARs)

Many steps are required before you can use this service within GT3. Fortunately, all steps can be automated using Ant and shell scripting. Globus provides shell scripts written to build this service automatically. These scripts require the file build.properties, which contains system-dependent information such as the path to the GT installation and other parameters. This file must be changed to match your platform.

6.16.9.1 Tools for Building Grid Services

Globus provides scripts to automate the service build process. These tools perform calls to a series of Ant scripts. The shell script in Listing 6.4 can be used to build a service in Linux.

LISTING 6.4 Shell Script to Build an OGSA Service

#!/bin/bash

INTERFACE_PATH=$1

if [ ! -e $INTERFACE_PATH -o ! -f $INTERFACE_PATH ]
then
    echo "$INTERFACE_PATH does not exist or is not a file."
    exit 1;
fi

PATH_AUX=$(echo $INTERFACE_PATH | cut -f 1 -d .)
DESCRIPTION_TYPE=$(echo $INTERFACE_PATH | cut -f 2 -d .)

IFS=/
set -- $*
NUM_FIELDS=$#

i=1
for n in $PATH_AUX
do
    if [ $i -eq 1 ]
    then
        PACKAGE=$n
        NAMESPACE="$n"
    else
        if [ $i -eq $NUM_FIELDS ]
        then
            INTERFACE=$n
        else
            if [ $i -lt $[NUM_FIELDS-1] ]
            then
                NAMESPACE="$n.$NAMESPACE"
                PACKAGE="$PACKAGE.$n"
            fi
        fi
    fi
    i=$[i+1]
done
IFS=""

PACKAGE_DIR=$(echo $PACKAGE | sed "s/\./\//g")

case $DESCRIPTION_TYPE in
    "java")
        ant -Djava.interface=true -Dpackage=$PACKAGE \
            -Dinterface.name=$INTERFACE -Dpackage.dir=$PACKAGE_DIR \
            -Dservices.namespace=$NAMESPACE
        ;;
    "gwsdl")
        ant -Dgwsdl.interface=true -Dpackage=$PACKAGE \
            -Dinterface.name=$INTERFACE -Dpackage.dir=$PACKAGE_DIR
        ;;
    *)
        echo "$INTERFACE_PATH must be a Java interface or a GWSDL description."
esac

6.16.9.2 Ant Build Script

The Ant script in Listing 6.5 is very useful for building your grid service. It requires a configuration file (build.properties) with path and library information.

LISTING 6.5 build.xml Ant Script for Building OGSA Services


value="namespace2package.mappings" />


value="${interface.name}.wsdl" />

destdir="${build.dest}" debug="${debug}" deprecation="${deprecation}" classpathref="classpath">


6.16.9.3 Building the Service Using the Ant Tools

The shell script and Ant build files described earlier allow the grid service build process to be automated by running the simple command:

build_service.sh [SERVICE_INTERFACE_NAME]

In this case:

build_service.sh factorization/service/impl/Factorization.java

6.16.10 Step 6: Deploying the Service to the GT3 Container

The final step before using this service is to deploy it into a GT3 grid services container. A command such as the following can be used:

To deploy:

ant deploy -Dgar.name=[PATH_TO_GAR]factorization.service.Factorization.gar

To undeploy:

ant undeploy -Dgar.id=factorization

Make sure the GLOBUS_LOCATION environment variable has been set up on your system and the binaries are included in your PATH variable.

Make sure all the JAR libraries are included in your CLASSPATH. Globus provides a class path setup script (globus-devel-env) under GLOBUS_LOCATION/etc/. For example, in Linux:

export GLOBUS_LOCATION=/opt/gt32 export PATH=$PATH:$GLOBUS_LOCATION/bin . $GLOBUS_LOCATION/etc/globus-devel-env

Once the service has been successfully deployed, the service browser GUI can be used to look at the service and create an instance (see Figure 6.2).

FIGURE 6.2 OGSA service browser showing the deployed factorization service.

6.16.11 Step 7: Writing a Service Client

The service client is the program used to call remote instances of the factorization service on a given host (see Listing 6.6).

LISTING 6.6 FactorizationClient.java, the Client Program to Access the Factorization Service

package factorization.client;

/**
 * Title: Factorization Library
 * Description: A factorization lib based on the quadratic sieve
 *
 * @author Vladimir Silva
 * @version 1.0
 */

import factorization.service.Factorization.FactorizationServiceGridLocator;
import factorization.service.Factorization.FactorizationPortType;
import java.net.URL;

/**
 * Factorization Service client
 * Usage: FactorizationClient -factory [URL] -n [BIG_NUM]
 *        -out [OUTFILE_PATH] {-verboseServer}
 * {-verboseServer} : optional
 */
public class FactorizationClient {
    private static final String PROGRAM_NAME = "FactorizationClient";
    private static final String USAGE = PROGRAM_NAME
        + " -factory URL -n [NUMBER2FACTOR] "
        + "-out [OUT_FILE_PATH] [-verboseServer]";

    // Factorization factory service URL.
    // Example:
    // http://localhost:8080
    //   /ogsa/services/factorization/FactorizationService/{INSTANCE-NAME}
    private URL _GSH = null;

    // Big number to factor
    private String _number = null;

    // Server-side output file
    private String _outFile = null;
    private String _options = "";
    private String _message = null;

    /*
     * Constructor
     */
    public FactorizationClient(String[] args) throws Exception {
        if (!parseArgs(args))
            throw new Exception(_message);
        if (!validateArgs())
            throw new Exception(_message);

        // Get a reference to the factorization service instance
        FactorizationServiceGridLocator loc =
            new FactorizationServiceGridLocator();
        FactorizationPortType fac = loc.getFactorizationService(_GSH);

        // Call remote method: factorize(number, output file, options)
        String ret = fac.factorize(_number, _outFile, _options);

        // Display remote service output
        System.out.println("Factorization Service returned:" + ret);
    }

    /*
     * Parse input arguments
     */
    boolean parseArgs(String args[]) throws Exception {
        String arg;
        for (int i = 0; i < args.length; i++) {
            arg = args[i].trim();
            if (arg.length() == 0)
                continue;

            if (arg.equals("-factory"))
                _GSH = new URL(args[++i]);
            else if (arg.equals("-n"))
                _number = args[++i];
            else if (arg.equals("-out"))
                _outFile = args[++i];
            else if (arg.equals("-verboseServer"))
                _options = "-verbose";
            else {
                _message = "Invalid argument(s): " + args[i];
                return false;
            }
        }
        return true;
    }

    /*
     * Validate program arguments
     */
    boolean validateArgs() {
        if (_GSH == null || _number == null) {
            _message = "Service factory and number are required.\n" + USAGE;
            return false;
        }
        if (_number.length() == 0) {
            _message = "A valid number is required.\n" + USAGE;
            return false;
        }
        return true;
    }

    static void usage() {
        System.out.println(USAGE);
    }

    /*
     * Main
     */
    public static void main(String[] args) {
        try {
            FactorizationClient c = new FactorizationClient(args);
        } catch (Exception ex) {
            System.err.println(ex);
        }
    }
}

6.16.12 Step 8: Running the Factorization Client

The client program can be used to factor numbers by submitting remote calls to instances created on specific containers, as shown here:

%GLOBUS_LOCATION%\etc\globus-devel-env

java factorization.client.FactorizationClient
    -factory http://localhost:8080/ogsa/services/factorization/factory/FactorizationService/hash-19521418-1093629122756
    -n 123 -out c:\temp\out.txt -verboseServer

For this client to run, your OGSA environment should be set up properly, including the required environment variable GLOBUS_LOCATION and the OGSA Java libraries in your class path.

6.17 SUMMARY

Today’s enterprise IT infrastructures have become increasingly concerned with the need for management and creation of distributed resources and services. These virtual organizations (VOs) seek common goals such as quality of service, common security, distributed workflow, resource management, coordinated fail-over, problem determination, and other metrics. Thus, the need arises for a mechanism to provide those goals in practical settings. Enter the Open Grid Services Architecture, which supports the creation, maintenance, and application of ensembles of services maintained by VOs [GridAnatomy01]. This became the foundation for building enterprise-level grids.

REFERENCES

[BriggsThesis98] Matthew E. Briggs, "An Introduction to the General Number Field Sieve." Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University, Blacksburg, Virginia, April 17, 1998.

[CASGrid02] L. Pearlman, V. Welch, I. Foster, C. Kesselman, and S. Tuecke, "A Community Authorization Service for Group Collaboration." Proceedings of the IEEE 3rd International Workshop on Policies for Distributed Systems and Networks, 2002.

[ECMAlg] Dario Alejandro Alpern, Elliptic Curve Method Algorithm. Available online at http://www.alpertron.com.ar/ECM.htm.

[GridAnatomy01] Ian Foster, Carl Kesselman, and Steven Tuecke, "The Anatomy of the Grid: Enabling Scalable Virtual Organizations." International Journal of High Performance Computing Applications, 15(3), 200–222, 2001. Available online at http://www.cs.utk.edu/~angskun/cns04/02-anatomy.pdf.

[GridPhysiology02] Ian Foster, Carl Kesselman, Jeffrey Nick, and Steven Tuecke, The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Globus Project, 2002. Available online at http://www-unix.globus.org/ogsa/docs/alpha/physiology.pdf.

[IBMGrid02] Viktors Berstis, Fundamentals of Grid Computing. IBM Redbooks Paper, November 11, 2002. Available online at http://www.redbooks.ibm.com/redpapers/pdfs/redp3613.pdf.

[MonicoGNFS] GGNFS, a public domain implementation of GNFS by Chris Monico. Available online at http://www.math.ttu.edu/~cmonico/software/ggnfs/index.html.

[Lenstra93] A. Lenstra and H. Lenstra (Eds.), "The Development of the Number Field Sieve." Lecture Notes in Mathematics 1554, Springer-Verlag, Berlin, 1993.

[NetArch00] Roy Thomas Fielding, Architectural Styles and the Design of Network-based Software Architectures. Ph.D. dissertation, University of California, Irvine, 2000.

[NetVOs98] Manju K. Ahuja and Kathleen M. Carley, "Network Structure in Virtual Organizations." Journal of Computer-Mediated Communication, 3(4), June 1998.

[OGSAAnalysis] Dennis Gannon, Kenneth Chiu, Madhusudhan Govindaraju, and Aleksander Slominski, An Analysis of the Open Grid Services Architecture. Department of Computer Science, Indiana University, Bloomington, IN.

[QuadSieve01] Eric Landquist, The Quadratic Sieve Factoring Algorithm. MATH 488: Cryptographic Algorithms, December 14, 2001. Available online at http://www.math.uiuc.edu/~landquis/quadsieve.pdf.

[RSAGuts01] Francis Litterio, The Mathematical Guts of RSA Encryption, 1999–2001. Available online at http://world.std.com/~franl/crypto/rsa-guts.html.

[Schneier96] Bruce Schneier, Applied Cryptography, 2nd edition. John Wiley & Sons, 1996.

[SOAP1.2.1.03] M. Gudgin, M. Hadley, N. Mendelsohn, J.-J. Moreau, and H. Nielsen, SOAP Version 1.2 Part 1: Messaging Framework. W3C Recommendation, June 24, 2003. Available online at http://www.w3.org/TR/2003/REC-soap12-part1-20030624/.

[SOAP1.2.2.03] M. Gudgin, M. Hadley, N. Mendelsohn, J.-J. Moreau, and H. Nielsen, SOAP Version 1.2 Part 2: Adjuncts. W3C Recommendation, June 24, 2003. Available online at http://www.w3.org/TR/2003/REC-soap12-part2-20030624/.

[URIRFC98] T. Berners-Lee, R. Fielding, and L. Masinter, Uniform Resource Identifiers (URI): Generic Syntax. IETF RFC 2396, August 1998.

[WebServices01] M. Paolucci, N. Srinivasan, and K. Sycara, OWL Ontology of Web Service Architecture Concepts. Available online at http://www.daml.org/services/swsa/note/swsa-note_v2.html, October 2004.

Part III

The Globus Toolkit

This part includes a comprehensive study of the main grid protocols from a developer's perspective, with plenty of source code and figures, including the following:

Core: Core grid services provide basic interfaces and functionality for hosting environments. Chapter 7 describes integration for two popular hosting environments, Apache Tomcat and IBM WebSphere. The chapter also includes configuration and troubleshooting tips for both environments as well as others.

Security: Security is one of the foundations of grid computing. Chapter 8 includes an overview of the grid security infrastructure (GSI) as well as software to take advantage of all the GSI features. The chapter goes beyond traditional development guides, providing a set of Web components that exploit all GSI features, including certificates, private keys, grid proxies, certificate requests, and signatures. It also includes troubleshooting advice for integrating GSI with Web application software such as WebSphere.

Resource Management: Resource management allows running jobs remotely, as well as monitoring and terminating them. It also provides fault tolerance and interfaces for integration with resource schedulers. Chapter 9 describes the Globus Resource Allocation Manager (GRAM) with plenty of source code, installation scripts, and troubleshooting tips.

Data Management: The goal of data management is the implementation of high-performance, secure, robust data transfer mechanisms for computational grids. Chapter 10 includes a comprehensive guide to developing applications that use the GridFTP protocol for data transfer. Source code is described for connecting and transferring data between hosts, as well as for third-party transfers and other techniques.

Information Services: Information services provide for resource discovery, selection, and optimization. The protocol described in Chapter 11 is the Monitoring and Discovery Service (MDS) and its use within virtual organizations (VOs) to obtain a view of the capabilities and services provided by a computational grid. Source code for this chapter includes custom data providers for the MDS3 and MDS2 families as well as default providers.

Commodity Grid Kits (CoGs): Commodity frameworks are designed to provide service reusability and integration with grid services to enhance their functionality, maintenance, and deployment. In Chapter 12, the most popular language bindings are explained, including Java, Python, and Practical Extraction and Report Language (PERL). Source code for the APIs of the Java CoG Kit is included for resource and data management as well as security.

Web Services Resource Framework (WSRF): WSRF attempts to make the transition from stateless services to stateful services. A stateful service is able to keep track of the changes made to its state through the propagation of Simple Object Access Protocol (SOAP) messages between client and server. WSRF represents the evolution of Web services and their convergence with grid services to form the next generation of distributed systems. Source code for Chapter 13 includes a WSRF-enabled large integer factorization service as well as deployment, testing, and troubleshooting instructions.

7 Core

In This Chapter
Concepts
Architecture
Hosting Environments
Summary
References

This chapter provides background information on the core component of the Globus Toolkit. The topics discussed here include the following:

Basic concepts on the structure of the core and its architecture
A set of installation transcripts for the GT core under the Tomcat and WebSphere application server hosting environments
Common installation errors as well as troubleshooting tips for the core installation

7.1 CONCEPTS

Grids are environments that enable software applications to integrate devices, displays, and computational and information resources that are managed by diverse organizations in geographically dispersed locations. Grid computing enables sharing resources that are located in different places, based on heterogeneous architectures, and belonging to different management domains. Computer grids create a powerful pool of computing resources capable of running the most demanding scientific and engineering applications required by researchers and businesses today.

The Globus Toolkit is considered by many to be today's de facto standard for grid applications. The Globus Toolkit is an open-architecture, open source software package that focuses on the following areas: resource management, data management and access, application development environments, information services, and security.

Resource management focuses on providing uniform and scalable mechanisms for naming and locating computational and communication resources on remote systems, and for incorporating these resources into parallel and distributed computations.

Data management focuses on handling large amounts of data (terabytes or petabytes). Next-generation applications may also require access to distributed data applications, such as collaborative environments, Grid-enabled RDBMS systems, and so on [GridRDT04].

Information services require high-performance execution in distributed computing environments, with careful selection and configuration of computers, networks, and other resources, as well as of the protocols and algorithms used by applications.

Security establishes secure relationships between a large number of dynamically created objects and across a range of administrative domains, each with its own local security policy.

7.2 ARCHITECTURE

The hosting environment is the component that implements the Web server functionality (see Figure 7.1). Tomcat and WebSphere are examples of hosting environments. Within the hosting environment, the Grid Services Container is an Open Grid Services Infrastructure (OGSI)–compliant runtime that provides a layer of abstraction from specific implementation settings such as persistence mechanisms. It also controls the lifecycle of services and the dispatching of remote requests to service instances. The main design goal of the Grid Services Container is to make it as implementation independent as possible.

FIGURE 7.1 GT3 Core architecture.

Within the service container, the OGSI component implements standard grid service interfaces such as GridService, Factory, Notification, HandleResolver, and ServiceGroup. Service providers do not interact with the interfaces directly but, rather, configure them for the provided services [SandholmGawor03].

Grid security infrastructure (GSI) implements transport-level and message-level security. Transport-level security is provided for compatibility with previous C clients. Globus discourages its use and does not guarantee its support in future versions. Instead, Globus recommends message-level security based on the WS-Security, XML-Signature, and XML-Encryption standards. Other standards used by GSI include Public Key Infrastructure (PKI), X.509 certificates, the Java Cryptography Extension (JCE), and the Java Authentication and Authorization Service (JAAS) [SandholmGawor03].

GT3 ships with a set of user services built on top of OGSI and GSI. These services are divided into system-level services and base services. System-level services are general-purpose services designed to facilitate the use of other services. They encapsulate tasks such as administration, logging, and management. Base services provide functionality in the areas of remote execution (GRAM), data management (RFT, RLS, GridFTP), and information services (MDS3). Base services are built on top of the system services, OGSI, and GSI.

7.3 HOSTING ENVIRONMENTS

The purpose of this chapter is to explore the deployment of the GT3 OGSA service container in popular hosting environments, also known as application servers. Integration of the OGSA container with custom application servers such as WebSphere or JBoss can be a time-consuming process. OGSA provides a built-in hosting environment and supports Apache Tomcat out of the box (see Figure 7.1). Integration with WebSphere and other Java-based containers can cause many headaches, especially if the container uses its own implementation of the JCE for security.

7.3.1 Apache Tomcat

Deploying the OGSA services container in Apache Tomcat is as simple as running the following command:

C:\ogsa-3.2.1>ant deployTomcat -Dtomcat.dir="C:\Program Files\Apache Software Foundation\Tomcat 5.0"

This command will deploy the required libraries and deployment descriptors to the proper locations and set things up (see Figures 7.2 and 7.3).

FIGURE 7.2 OGSA default hosting environment.

7.3.2 WebSphere Application Server v5.1

Deploying your OGSA container in IBM WebSphere represents a bit more of a challenge. Many developers might be tempted to just create a Web project for OGSA and copy the required libraries from the Globus distribution. This approach will certainly create a hosting environment for your services container, but it will create problems when dealing with GSI, specifically when you try to decrypt private keys. Follow these steps to set up your OGSA container within WebSphere:

FIGURE 7.3 OGSA services container under Apache Tomcat.

Step 1: Create a Web Project and Enterprise Application for the OGSA Services Container

Copy the required libraries, schema files, and deployment descriptors from your OGSA distribution to the enterprise application. This step can easily be done within the WebSphere Studio Application Developer (see Figure 7.4).

FIGURE 7.4 An OGSA enterprise application under WebSphere Studio v5.1.1.

Start the WebSphere server and verify this step by opening a Web browser to the local container at http://localhost:9080/ogsa/services/. The Axis welcome page should be displayed. Check the Axis servlet to make sure the list of deployed services shows up and that the Web Service Description Language (WSDL) for each service can be displayed.

When creating the enterprise application, make sure you set the context root of the project to OGSA.

Step 2: Create a Shared Library Location for GSI Security Libraries

At this point, you should be able to start your GT3 service browser and run services without security. Before running secure services, you should create a shared library for GSI to prevent many of the security problems explained in the troubleshooting section. Log in to your WebSphere Administration Console and, under Environment-Shared Libraries, create a shared library location for GSI; then add the required JCE libraries (see Figure 7.5).

FIGURE 7.5 A shared library location for GSI under WebSphere Application Server 5.1.

Step 3: Associate the New Shared Library with the Enterprise Application

Now a reference must be created to link the new shared library to the OGSA enterprise application. In the administrator console, click Applications/Enterprise Applications/OGSA, then scroll down to Shared Libraries and add a reference to the newly created GSI library (see Figure 7.6).

FIGURE 7.6 Shared library location reference to the OGSA enterprise application.

If you are an experienced WebSphere system administrator, or you don't have the administrator console available in your distribution, these changes can be made manually. Simply look for the files libraries.xml and deployment.xml under the WebSphere installation directory. libraries.xml contains the library classpaths in the following form:


C:\ogsa-3.2.1\lib\cryptix-asn1.jar
C:\ogsa-3.2.1\lib\puretls.jar
C:\ogsa-3.2.1\lib\jce-jdk13-120.jar
C:\ogsa-3.2.1\lib\cryptix32.jar
C:\ogsa-3.2.1\lib\cryptix.jar

deployment.xml contains a corresponding reference to the newly created library.


7.3.3 Troubleshooting WebSphere

The following section contains an overview of some of the errors found when deploying OGSA on WebSphere, especially when attempting to run secure services. This overview can help you set up your OGSA environment in WebSphere.

7.3.3.1 Class Loader Errors

Message:

E SRVE0026E: [Servlet Error]-[Error while defining class:
org.globus.axis.transport.GSIHTTPTransport]

This error indicates that the class:
org.globus.gsi.GSIConstants
could not be located while defining the class:
org.globus.axis.gsi.GSIConstants
This is often caused by having the class at a higher point in the
classloader hierarchy.

Possible Cause: This message is thrown by the Axis servlet when attempting to display a service WSDL. It is probably caused by missing OGSA libraries in the classpath.
Solution: Make sure all required OGSA libraries are included in your enterprise application classpath.

7.3.3.2 Configuration Errors (InvalidHandlerFaultType)

Message:

Failed to obtain WSDL:
org.gridforum.ogsi.InvalidHandlerFaultType:
http://127.0.1:8080/ogsa/services/core/registry/ContainerRegistryService

Possible Cause: This error is thrown by the service browser client when attempting to connect to the OGSA container. It might be caused by an invalid hostname or port number.
Solution: Make sure all required OGSA libraries, schema files, and configuration files are placed in the right locations. Also check that the hostname and port number are correct in the connection URL.

7.3.3.3 Application Server Errors (Class Cast Exceptions)

Message:

TRAS0014I: The following exception was logged
java.lang.ClassCastException: com.ibm.ws.webservices.engine.Message
    at org.apache.axis.MessageContext.setMessage(MessageContext.java:607)

Possible Cause: This error is thrown by the application server when running a secure service from the Globus service browser. Its probable cause is a collision between the Axis and the WebSphere Application Server (WAS) Web services implementations.
Solution: One solution is to rename the WebSphere Web services implementation library (webservices.jar) to avoid the collision: rename $WAS_ROOT\webservices.jar to $WAS_ROOT\webservices.jar.old.

7.3.3.4 Security Errors (NoSuchAlgorithm)

Message:

java.lang.InternalError: java.security.NoSuchAlgorithmException:
class configured for Cipher: com.ibm.crypto.provider.DESedeCipher
is not a subclass of xjava.security.Cipher
    at COM.claymoresystems.crypto.PEMData.writePEMObject(PEMData.java:172)
    at COM.claymoresystems.crypto.EAYEncryptedPrivateKey.writePrivateKey(EAYEncryptedPrivateKey.java:83)
    ...

Possible Cause: This error has been a major source of headaches for many people involved in OGSA and WebSphere. It is thrown because the GSI security libraries have not been properly configured. If you take a look at the directory structure of the OGSA deployment under Tomcat, you will notice that the GSI security libraries cryptix.jar, cryptix32.jar, cryptix-asn1.jar, jce-jdk13-120.jar, and puretls.jar have been saved under the $TOMCAT_ROOT/common/lib folder. A similar configuration is required for all hosting environments.
Solution: The JCE should be loaded at server boot time to prevent these errors. To fix this error, create a shared library directory in WebSphere containing the JCE libraries, as explained in the previous section.
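When chasing a NoSuchAlgorithmException of this kind, it can help to see which security providers the server JVM actually loaded. The small probe below uses only standard JDK classes; the class name JceCheck is mine, not a Globus or WebSphere utility:

```java
import java.security.Provider;
import java.security.Security;
import javax.crypto.Cipher;

// Quick diagnostic for NoSuchAlgorithmException trouble: list the
// security providers the running JVM actually loaded, and probe whether
// a given cipher transformation can be instantiated at all.
public class JceCheck {

    public static boolean cipherAvailable(String transformation) {
        try {
            Cipher.getInstance(transformation);
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (Provider p : Security.getProviders())
            System.out.println(p);   // provider name and version
        System.out.println("DESede available: "
            + cipherAvailable("DESede/CBC/PKCS5Padding"));
    }
}
```

Running the main method on the affected server JVM shows whether a provider for the failing transformation is present at all; if the cipher resolves from the command line but not inside the container, the problem is the container's classloader setup rather than the JDK itself.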

7.3.4 Other Hosting Environments

When attempting to install the OGSA Web services container in other hosting environments, keep in mind that all the libraries, deployment descriptors, configuration files, and schema files must be placed in the correct locations. Most hosting environments will require a context root, which should be OGSA. The security libraries are a major source of problems and should be the first thing to look into if any problems arise. The OGSA installation under Apache Tomcat is a valuable template for how and where the files should be placed.

7.4 SUMMARY

Grids are environments that enable software applications to integrate devices, displays, and computational and information resources that are managed by diverse organizations in geographically dispersed locations. Grid computing enables sharing resources that are located in different places, based on heterogeneous architectures, and belonging to different management domains. This chapter has introduced the core component of the Globus Toolkit, including basic concepts, installation transcripts, common installation errors, and troubleshooting tips.

REFERENCES

[GridRDT04] W. E. Allcock, I. Foster, and R. Madduri. "Reliable Data Transport: A Critical Service for the Grid." Building Service Based Grids Workshop, Global Grid Forum 11, June 2004.
[SandholmGawor03] Thomas Sandholm and Jarek Gawor. Globus Toolkit 3 Core—A Grid Service Container Framework. Accessed online July 2, 2003, from http://www-unix.globus.org/toolkit/3.0/ogsa/docs/gt3_core.pdf.

8 Security

In This Chapter
Grid Security Infrastructure (GSI) Overview
Certificate Basics
Java Certificate Services—Web Tools for X.509 Certificates
Globus Toolkit 3 Security Libraries
Summary
References

This chapter includes the following topics:

An overview of the Grid Security Infrastructure (GSI)
An explanation of the basic structure of a digital certificate (also known as an X.509 certificate), along with a set of programs for certificate, key, and grid proxy generation and signature
A sample Web-based application for certificate manipulation, very similar in functionality to a certificate authority (CA), along with command-line tools and installation transcripts for two popular application servers: Apache Tomcat and IBM WebSphere
Tips on the integration of the Globus Toolkit security libraries with other Open Grid Services Architecture (OGSA) hosting environments, along with common error messages and troubleshooting


8.1 GRID SECURITY INFRASTRUCTURE (GSI) OVERVIEW

GSI enables secure authentication and communication over computer networks. It is a Globus Toolkit component that provides a number of useful services for grids, including mutual authentication and single sign-on. GSI is based on proven standards such as public key encryption, X.509 certificates, and the Secure Sockets Layer (SSL). Extensions to these standards have been added for single sign-on and delegation [GridPhysiology02].

Manipulating grid certificates using the Globus Toolkit 3.0 in Windows environments is cumbersome. It usually requires system administrators or users to install GT3 on Linux systems, use the command-line scripts to generate certificates, and then move those certificates to their Windows systems. An easier alternative, provided by the Java CoG Kit [CogKit01], is to write your own certificate generation software, as demonstrated by the Java programs throughout this chapter. These programs are used to create certificate requests and sign them, to create user, host, or self-signed (CA) certificates, and to create grid proxies.

8.2 CERTIFICATE BASICS

One of the main concepts in GSI authentication is the certificate. Users and services are identified in grid environments via certificates, which contain information about the user or service, such as an identity, a time stamp, and a signature. A GSI certificate includes the following information about a subject [GSIGlobus03]:

The identity of the person or object that the certificate represents.
The public key that belongs to the subject.
The identity of the CA that has signed the certificate, certifying that the public key and the identity both belong to the subject.
The digital signature of the named CA.

A third-party certificate authority is used to certify the link between the public key and the subject in the certificate.

8.2.1 Understanding X.509 Certificates

Certificates are data structures that bind public key values to subjects by having a trusted CA digitally sign each certificate. A certificate has a limited valid lifetime that is indicated in its signature; certificates are usually distributed via nontrusted communications and can be stored in unsecured media. Certificates can be concatenated in chains. Thus, a certificate for user A signed by one CA can have additional

certificates of CAs signed by other CAs. Such chains, or certification paths, are required when the public key user does not already hold a guaranteed copy of the public key of the CA that signed the certificate. For example, if user A has a certificate signed by CA1 but requires a signature by CA2, then a certification path can be established by signing user A's certificate with CA1 and then signing CA1's certificate with CA2's certificate [X509Pki99].

8.2.2 Typical Public Key Infrastructure (PKI) Model

A public key infrastructure is composed of several entities that interact with each other in different fashions (see Figure 8.1).

FIGURE 8.1 Typical PKI infrastructure.

Certification Authorities (CAs): Trusted third-party entities that sign certificates and establish a chain of trust between two entities.
Registration Authority (RA): An optional entity to which a CA delegates management functions. RAs are in charge of invalidating certificates when the validity period expires or when other circumstances occur, such as a change of name, or an employee termination that results in the private key being compromised. Revocation is accomplished with a time-stamped list (a certificate revocation list, or CRL) identifying revoked certificates, signed by a CA and freely available in a public repository [X509Pki99].
Operational Protocols: Used to deliver certificates via the Lightweight Directory Access Protocol (LDAP), HTTP, FTP, or X.500.
Management Protocols: Management protocols provide interactions between PKI users and management entities (CAs), or between two CAs that cross-certify each other. This functionality includes the following:

1. Registration: Where the user makes itself known to the CA.
2. Initialization: Where the user key pairs are initialized and public keys are exchanged among the participating entities.
3. Certification: Where a CA issues a certificate to a given user or system.
4. Key pair recovery: Where backup systems are provided for public and private keys.
5. Revocation and cross-certification: Where certificates are invalidated, and two CAs exchange certificate information.

8.2.3 Certificate Fields

An X.509 v3 certificate is composed of the following sequence of fields [X509Pki99]:

Certificate ::= SEQUENCE {
    tbsCertificate       TBSCertificate,
    signatureAlgorithm   AlgorithmIdentifier,
    signatureValue       BIT STRING }

TBSCertificate ::= SEQUENCE {
    version         [0]  EXPLICIT Version DEFAULT v1,
    serialNumber         CertificateSerialNumber,
    signature            AlgorithmIdentifier,
    issuer               Name,
    validity             Validity,
    subject              Name,
    subjectPublicKeyInfo SubjectPublicKeyInfo,
    issuerUniqueID  [1]  IMPLICIT UniqueIdentifier OPTIONAL,
                         -- If present, version shall be v2 or v3
    subjectUniqueID [2]  IMPLICIT UniqueIdentifier OPTIONAL,
                         -- If present, version shall be v2 or v3
    extensions      [3]  EXPLICIT Extensions OPTIONAL
                         -- If present, version shall be v3 }

Certificates are usually instantiated using certificate factories. The JCE provides this functionality and makes the task of using such factories simple. For example, the following code demonstrates how to instantiate an X.509 certificate:

InputStream in = new FileInputStream("cert-filename");
CertificateFactory cf = CertificateFactory.getInstance("X.509");
X509Certificate cert = (X509Certificate) cf.generateCertificate(in);
in.close();

8.2.4 GSI Proxies

A proxy is a data structure used by GSI to provide mutual authentication, confidential communication, delegation, and single sign-on in grid environments. Internally, a proxy is composed of a new certificate (with a new public key) containing the owner's identity, slightly modified to identify it as a proxy, and an unencrypted private key. The certificate is signed by the owner, rather than by a CA, and it has an associated lifetime. Because the private key is unencrypted, it must be kept secure. However, because the proxy has a lifetime or expiration time, the private key doesn't have to be kept as secure as the owner's private key. Thus, the proxy can be stored on local media as long as the access permissions on the file prevent unauthorized access. GSI uses the proxy certificate and private key for mutual authentication without entering a password [GSIGlobus03].
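The point about file permissions can be made concrete with plain JDK file APIs. This is only a generic sketch of storing secret material so that just the owner can read it; the class and method names are mine, and it does not produce the actual Globus proxy file format:

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

// Generic sketch (not the Globus proxy format): store sensitive material,
// such as a proxy's unencrypted private key, in a file that only the
// owner can read and write.
public class ProxyFileDemo {

    public static File writeOwnerOnly(String path, String contents) {
        try {
            File f = new File(path);
            try (FileWriter w = new FileWriter(f)) {
                w.write(contents);
            }
            // Revoke access for everyone, then restore it for the owner
            // alone (the second boolean true means "owner only"). In
            // production, restrict permissions before writing any secret.
            f.setReadable(false, false);
            f.setWritable(false, false);
            f.setReadable(true, true);
            f.setWritable(true, true);
            return f;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```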

8.2.4.1 Mutual Authentication

For mutual authentication to occur between two users A and B, both entities must have copies of each other's CA certificates, in addition to their own certificate and key pair [GSIGlobus03]. A connection must be established to proceed with the authentication steps (see Figure 8.2).

FIGURE 8.2 GSI mutual authentication.

1. A establishes a connection and sends its certificate and public key.
2. B checks A's certificate for tampering by validating the signature using A's CA certificate, a technique known as nonrepudiation.
3. B issues a random challenge to be encrypted by A.
4. A encrypts B's challenge using its private key and sends it back.
5. B decrypts the challenge using A's public key.
6. If successful, then B has authenticated A.

The same process is repeated for user A to authenticate user B. The process finishes when both entities have authenticated each other.
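The challenge-response in steps 3 through 5 can be sketched with nothing but JDK security classes. This stand-in uses a modern signature operation rather than raw private-key encryption of the challenge, and none of the class or method names below are GSI APIs; they are illustrative only:

```java
import java.security.*;

// Sketch of the GSI-style challenge-response above, using only JDK
// classes. B authenticates A by having A sign a random challenge with
// its private key; B verifies the result with A's public key.
public class ChallengeResponseDemo {

    public static KeyPair newRsaKeyPair() {
        try {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
            kpg.initialize(2048);
            return kpg.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    // B's side: generate a fresh random challenge for each authentication.
    public static byte[] newChallenge() {
        byte[] challenge = new byte[32];
        new SecureRandom().nextBytes(challenge);
        return challenge;
    }

    // A's side: prove possession of the private key by signing the challenge.
    public static byte[] sign(PrivateKey key, byte[] challenge) {
        try {
            Signature sig = Signature.getInstance("SHA256withRSA");
            sig.initSign(key);
            sig.update(challenge);
            return sig.sign();
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    // B's side: verify the response with the public key from A's certificate.
    public static boolean verify(PublicKey key, byte[] challenge, byte[] response) {
        try {
            Signature sig = Signature.getInstance("SHA256withRSA");
            sig.initVerify(key);
            sig.update(challenge);
            return sig.verify(response);
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }
}
```

If the challenge or the response is altered in transit, verification fails, which is what lets B reject an impostor who does not hold A's private key.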

8.2.4.2 Confidentiality

By default, GSI does not encrypt the data sent between mutually authenticated parties; once mutual authentication has completed, subsequent data is sent in the clear. This improves performance and reduces communication overhead. However, GSI can easily be used to implement a key exchange algorithm for confidentiality (data encryption) [GSIGlobus03].
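As a rough illustration of the kind of key-exchange layer the text alludes to, the following self-contained sketch derives a shared AES key via Diffie-Hellman and encrypts a payload with it. This is demo code using only JDK classes, not GSI's actual mechanism, and the fixed IV would be unacceptable outside a demo:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyAgreement;
import javax.crypto.interfaces.DHPublicKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.*;

// Toy key-exchange layer (not GSI code): both sides agree on an AES key
// via Diffie-Hellman and use it to encrypt the payload.
public class KeyExchangeDemo {

    public static KeyPair newDhKeyPair() {
        try {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("DH");
            kpg.initialize(2048);
            return kpg.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    // The second party must generate its pair over the same DH parameters.
    public static KeyPair matchingDhKeyPair(PublicKey peer) {
        try {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("DH");
            kpg.initialize(((DHPublicKey) peer).getParams());
            return kpg.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    // Combine my private key with the peer's public key; both sides end up
    // with the same shared secret, hashed down to a 128-bit AES key.
    public static SecretKeySpec deriveAesKey(PrivateKey mine, PublicKey peers) {
        try {
            KeyAgreement ka = KeyAgreement.getInstance("DH");
            ka.init(mine);
            ka.doPhase(peers, true);
            byte[] secret = MessageDigest.getInstance("SHA-256")
                .digest(ka.generateSecret());
            return new SecretKeySpec(secret, 0, 16, "AES");
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    public static byte[] encrypt(SecretKeySpec key, byte[] plain) {
        return crypt(Cipher.ENCRYPT_MODE, key, plain);
    }

    public static byte[] decrypt(SecretKeySpec key, byte[] cipherText) {
        return crypt(Cipher.DECRYPT_MODE, key, cipherText);
    }

    private static byte[] crypt(int mode, SecretKeySpec key, byte[] data) {
        try {
            Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
            c.init(mode, key, new IvParameterSpec(new byte[16])); // fixed IV: demo only
            return c.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }
}
```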

8.2.4.3 Integrity

Message integrity prevents unauthorized third parties from sniffing communications and then changing the message contents. GSI provides built-in message integrity.
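The idea behind an integrity check can be illustrated with an HMAC, again using only JDK classes; this is a generic sketch, not GSI's wire format:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;

// Toy integrity check (not GSI internals): an HMAC over the message lets
// the receiver detect any modification made in transit, because a changed
// message no longer matches the tag computed with the shared key.
public class IntegrityDemo {

    public static byte[] tag(byte[] key, byte[] message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    // MessageDigest.isEqual performs a time-constant comparison, which
    // avoids leaking how many leading bytes of the tag matched.
    public static boolean intact(byte[] key, byte[] message, byte[] expectedTag) {
        return MessageDigest.isEqual(tag(key, message), expectedTag);
    }
}
```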

8.2.5 Certificate Generation

The program in Listing 8.1 demonstrates the APIs available in the Java CoG Kit 1.1, part of GT3, to generate a user certificate or certificate request, sign certificates, and create proxies. CoG can serve as the basis for a Java CA implementation if you don't want to use the Unix-based SimpleCA available at the Globus Web site. We will start by creating a certificate generator class.

LISTING 8.1 CertificateGenerator.java, a Program to Generate X.509 Certificates

package com.gsi;

import java.io.*;
import java.net.*;
import java.util.Properties;
import java.util.Vector;
import java.security.*;
import java.security.cert.*;

import org.globus.gsi.*;
import org.globus.gsi.gssapi.*;
import org.globus.gsi.bc.*;
import org.globus.util.Base64;

// for cert request generation
import COM.claymoresystems.cert.*;
import cryptix.util.mime.Base64OutputStream;

/**
 * A class to automate the Globus proxy creation process,
 * analogous to the setup-gsi script on Linux.
 */

public class CertificateGenerator {
    private final boolean DEBUG = false;

    // Common Name + Org Unit used to create X.509 DNs
    private String m_CommonName = null;
    private String m_OrgUnit = null;

    // Certs in PEM format
    private String m_PEMUserCert = null;
    private String m_PEMUserKey = null;

    /**
     * Constructor for CertificateGenerator.
     * @param cn X.509 Common Name, such as an email address
     * @param orgUnit Organization Unit
     * Arguments are used to create an X.509 DN (Distinguished Name).
     */
    public CertificateGenerator(String cn, String orgUnit) throws Exception {
        m_CommonName = cn;
        m_OrgUnit = orgUnit;
    }

8.2.6 Creating a Self-Signed User Certificate and Private Key

The first step in setting up GSI is to create a signed user certificate and a private key. These two artifacts can later be used to generate a Globus credential, also known as a "proxy," that provides access to your grid resources. On Unix systems, the $GLOBUS_LOCATION/setup-gsi.sh script will start by creating a "certificate request" that must be submitted to the Globus CA for signature. A certificate authority testifies that you are indeed the person you claim to be. In this sample class, we will use the COM.claymoresystems.cert.CertRequest class to make a "self-signed" user certificate along with a private key encrypted with a user's password. This technique is demonstrated by the code in Listing 8.2.

LISTING 8.2 Creating a Self-Signed Certificate and Private Key

/**
 * createUserCertAndKey: Creates a signed user certificate and a private key
 * by generating a self-signed user certificate.
 * The private key is encrypted with Pwd.
 * @param Pwd challenge password (used to encrypt the key)
 */

public void createUserCertAndKey(String Pwd)
        throws NoSuchAlgorithmException, Exception {
    debug("CertKeyGenerator OrgUnit=" + m_OrgUnit
        + " CommonName=" + m_CommonName + " Pwd=" + Pwd);

    // sw/bw are buffers for the private key PEM
    StringWriter sw = new StringWriter();
    BufferedWriter bw = new BufferedWriter(sw);

    // generate a 512-bit RSA key pair
    KeyPair kp = CertRequest.generateKey("RSA", 512, Pwd, bw, true);

    // certs are valid for 1 year: 31536000 secs
    byte[] reqBytes = CertRequest.makeSelfSignedCert(kp,
        makeGridCertDN(m_OrgUnit, m_CommonName), 31536000);

    // buffer for the key
    m_PEMUserKey = sw.toString();

    // encode cert in PEM format
    m_PEMUserCert = writePEM(reqBytes,
        "-----BEGIN CERTIFICATE-----\n",
        "-----END CERTIFICATE-----\n");
}

A call to CertRequest.generateKey("RSA", 512, Pwd, bw, true) should do the trick. The arguments to this call represent: the cipher used for generation (RSA), the strength of the certificate keys in bits (512), a password used to encrypt the private key, and a buffered writer to store the private key. This class is contained in the puretls.jar archive distributed with the GT3/Java CoG Kit. The call returns a key pair object (from the java.security package) that can then be used by the following instruction to generate a self-signed certificate: CertRequest.makeSelfSignedCert(kp, makeGridCertDN(m_OrgUnit, m_CommonName), 31536000). Both the user certificate and the private key are encoded in base64 Privacy Enhanced Mail (PEM) format (take a look at the writePEM function).
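The writePEM helper mentioned above is not reproduced in this excerpt; a minimal stand-in built on the JDK's MIME Base64 encoder (which wraps at the 64-character line length PEM requires) might look like this. The signature is assumed to mirror the listing's usage:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Minimal stand-in for the writePEM helper the listing calls but does not
// show: base64-encode the DER bytes in 64-character lines and wrap them in
// the given header and footer, which is exactly the PEM layout.
public class PemWriter {

    public static String writePEM(byte[] der, String header, String footer) {
        Base64.Encoder enc =
            Base64.getMimeEncoder(64, "\n".getBytes(StandardCharsets.US_ASCII));
        return header + enc.encodeToString(der) + "\n" + footer;
    }
}
```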

8.2.7 Creating Credentials from User Certificates

With a user certificate and private key, we are ready to create a GSI proxy. Note that the package org.globus.security.* has been deprecated and replaced with org.globus.gsi.*, which uses the open Generic Security Service Application Program Interface (GSS-API) [GSS-API93]. The method shown in Listing 8.3 will create a credential compatible with GT3 or GT 2.2x.

LISTING 8.3 Creating a Globus Proxy

/**
 * Create a Globus proxy required to run jobs on the grid.
 * This method is used to create a proxy if you have a cert file
 * and a key file encoded in PEM format. Certs come from input streams.
 * @param inUserCert user cert input stream
 * @param inUserKey private key input stream
 * @param pwd password used to decrypt the private key
 * @param bits strength of the proxy
 * @param hours proxy lifetime
 */
public static GlobusCredential createGlobusProxy(
        InputStream inUserCert, /* certificate */
        InputStream inUserKey,  /* private key */
        String pwd,             /* key passphrase */
        int bits,               /* cert strength: 512, 1024, etc. */
        int hours               /* lifetime in hrs */
        ) throws Exception

{
    // Load cert and key
    X509Certificate userCert = CertUtil.loadCertificate(inUserCert);
    OpenSSLKey key = new BouncyCastleOpenSSLKey(inUserKey);

    if (key.isEncrypted()) {
        try {
            key.decrypt(pwd);
        }
        catch (GeneralSecurityException e) {
            throw new Exception("Wrong password or other security error");
        }
    }

PrivateKey userKey = key.getPrivateKey();

    // Load certificate factory
    BouncyCastleCertProcessingFactory factory =
        BouncyCastleCertProcessingFactory.getDefault();

    return factory.createCredential(
        new X509Certificate[] {userCert},
        userKey,
        bits,
        hours * 3600,
        GSIConstants.GSI_2_PROXY,
        null);
}

If you have a PEM-encoded proxy file (downloaded from your Unix system), the utility method shown in Listing 8.4 can be used to build a GlobusCredential object.

LISTING 8.4 Creating a Credential from a PEM-Encoded File

/**
 * Create a Globus proxy from a PEM encoded proxy string.
 * @param sProxyPEM Globus proxy as a PEM encoded string
 */
public static GlobusCredential createGlobusCredFromPEM(String sProxyPEM)
    throws Exception
{
    // user certificate input stream
    ByteArrayInputStream is =
        new ByteArrayInputStream(sProxyPEM.getBytes());

    GlobusCredential p = null;

    try {
        // load default trusted (CA) certificates
        TrustedCertificates.loadDefaultTrustedCertificates();
    } catch (Exception e) {
        throw new Exception("Unable to load trusted certs: "
            + e.getMessage());
    }

    // generate a grid proxy
    try {
        p = new GlobusCredential(is);
    } catch (Exception e) {
        throw new Exception("Unable to load Proxy: " + e.getMessage());
    }
    return p;
}

8.2.8 Creating a Certificate Request

If you just want to submit your certificate request to be signed by another CA, you can create it by using the function shown in Listing 8.5.

LISTING 8.5 Creating a Certificate Request

/**
 * Cert request generator.
 * @param CommonName Common name
 * @param OrgUnit Organization unit
 * @param Pwd Password
 * @return a PEM encoded string (certificate request)
 */
public static String createCertRequest(
        String CommonName, /* Certificate owner. Example: John Doe */
        String OrgUnit,    /* Organizational unit. Example: Info Technology */
        String Pwd         /* Certificate passphrase, used to encrypt the key */
        )
    throws IOException, NoSuchProviderException, NoSuchAlgorithmException
{
    StringWriter sw = new StringWriter(); // Private key PEM buffer
    BufferedWriter bw = new BufferedWriter(sw);

    // generate a 512 bit RSA key pair
    KeyPair kp = CertRequest.generateKey("RSA", 512, Pwd, bw, true);

    // certificate request bytes
    byte[] req = CertRequest.makePKCS10Request(kp,
        makeGridCertDN(OrgUnit, CommonName));

    // return the cert request as a PEM encoded string
    return writePEM(req,
        "-----BEGIN CERTIFICATE REQUEST-----\n",
        "-----END CERTIFICATE REQUEST-----\n");
}

Our certificate generator is complemented with the utility functions (see Listing 8.6) used for tasks such as the following:

Encoding certificate bytes in PEM format
Creating X.509 distinguished names (DNs) for generation purposes

LISTING 8.6 Certificate Encoding Utility Methods

/** Write a sequence of bytes as a PEM encoded string */
public static String writePEM(byte[] bytes, String hdr, String ftr)
    throws IOException
{
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    Base64OutputStream b64os = new Base64OutputStream(bos);
    b64os.write(bytes);
    b64os.flush();
    b64os.close();

    ByteArrayInputStream bis = new ByteArrayInputStream(bos.toByteArray());
    InputStreamReader irr = new InputStreamReader(bis);
    BufferedReader r = new BufferedReader(irr);

    StringBuffer buff = new StringBuffer();
    String line;
    buff.append(hdr);

    while ((line = r.readLine()) != null) {
        buff.append(line + "\n");
    }
    buff.append(ftr);
    return buff.toString();
}
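On JDK 8 and later, the Base64 wrapping that writePEM performs with the Cryptix Base64OutputStream can be done with java.util.Base64 alone. This is a sketch under that assumption, not the book's code; the class name is mine:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class PemSketch {

    /**
     * Wrap raw bytes in a PEM header/footer with base64 lines of at
     * most 64 characters, mirroring what writePEM produces.
     */
    public static String writePem(byte[] bytes, String hdr, String ftr) {
        Base64.Encoder enc = Base64.getMimeEncoder(
            64, "\n".getBytes(StandardCharsets.US_ASCII));
        return hdr + enc.encodeToString(bytes) + "\n" + ftr;
    }

    public static void main(String[] args) {
        System.out.print(writePem(
            "hello".getBytes(StandardCharsets.US_ASCII),
            "-----BEGIN TEST-----\n", "-----END TEST-----\n"));
    }
}
```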

/**
 * Create an X509 certificate distinguished name.
 * @param OrgUnit Organizational unit
 * @param CommonName Certificate owner
 */
private static X509Name makeSimpleDN(String OrgUnit, String CommonName)
{
    Vector tdn = new Vector();
    String[] o1 = new String[2];
    String[] o2 = new String[2];
    String[] ou = new String[2];
    String[] cn = new String[2];

    o1[0] = "O";  o1[1] = "IBM";
    o2[0] = "O";  o2[1] = "Grid";
    ou[0] = "OU"; ou[1] = OrgUnit;
    cn[0] = "CN"; cn[1] = CommonName;

    tdn.addElement(o1);
    tdn.addElement(o2);
    tdn.addElement(ou);
    tdn.addElement(cn);

    // COM.claymoresystems.cert (puretls.jar)
    return CertRequest.makeSimpleDN(tdn);
}

private void debug(String text) {
    if (DEBUG) System.out.println(this.getClass() + ": " + text);
}

public String getUserCert() { return m_PEMUserCert; }

public String getUserKey() { return m_PEMUserKey; }
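Note that the listings call a helper named makeGridCertDN that is never shown; only makeSimpleDN is. Judging from the generated subjects (for example, O=IBM,O=Grid,OU=ACME,CN=Vladimir Silva), it assembles the distinguished-name components in that fixed order. The string-level reconstruction below is an assumption, not the actual helper (which returns an X509Name rather than a String):

```java
public class DnSketch {

    /**
     * Hypothetical reconstruction of makeGridCertDN: build a grid DN
     * with the fixed O=IBM,O=Grid prefix seen in the generated certs.
     */
    public static String makeGridCertDn(String orgUnit, String commonName) {
        return "O=IBM,O=Grid,OU=" + orgUnit + ",CN=" + commonName;
    }

    public static void main(String[] args) {
        System.out.println(makeGridCertDn("ACME", "Vladimir Silva"));
    }
}
```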

Finally, a certificate generator main function for testing purposes can be seen in Listing 8.7.

LISTING 8.7 Certificate Generator Test Function

/**
 * For testing purposes only
 */
public static void main(String[] args) {
    try {
        // Test 1: cert request
        String rqPEM = CertificateGenerator
            .createCertRequest("Vladimir Silva", "ACME", "test");

        System.out.println("Cert Request\n" + rqPEM);

        // Test 2: user cert & encrypted private key (PEM encoded)
        CertificateGenerator gen =
            new CertificateGenerator("Vladimir Silva", "ACME");

        gen.createUserCertAndKey("secret");

        System.out.println("User Cert\n" + gen.getUserCert());
        System.out.println("User Key\n" + gen.getUserKey());

        // Test 3: Globus proxy (credential) from cert and key
        ByteArrayInputStream inCert =
            new ByteArrayInputStream(gen.getUserCert().getBytes());
        ByteArrayInputStream inKey =
            new ByteArrayInputStream(gen.getUserKey().getBytes());

        GlobusCredential cred =
            createGlobusProxy(inCert, inKey, "testpwd", 512, 12);

        // Test 4: save proxy in PEM format
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        cred.save(bos);

        System.out.println("Globus Credential Info\n" + cred.toString());
        System.out.println("Globus Credential PEM\n" + bos.toString());

        // load a proxy from a PEM encoded string
        GlobusCredential cred1 =
            CertificateGenerator.createGlobusCredFromPEM(bos.toString());
        System.out.println("Globus Credential Verification\n"
            + cred1.toString());

    } catch (Exception ex) {
        System.err.println("Main Error: " + ex.getMessage());
    }
}

The previous code requires the following libraries to be in your Java classpath:

Commodity Grid Kit: cog-jglobus.jar, log4j-core.jar

OGSA Security (JCE): cryptix32.jar, cryptix-asn1.jar, jce-jdk13-117.jar, and puretls.jar

Both sets of libraries are included in the GT distribution. A run of the main program will produce output such as the following:

Cert Request
-----BEGIN CERTIFICATE REQUEST-----
MIH7MIGmAgEAMEUxDDAKBgNVBAoTA0lCTTENMAsGA1UEChMER3JpZDENMAsGA1UE
CxMEQUNNRTEXMBUGA1UEAxYOVmxhZGltaXIgU2lsdmEwWjALBgkqhkiG9w0BAQED
SwAwSAJBAK2XHqN5dYgCgJ1ZVs/6JmijnLlUSl7RVMVV0ECf7dHg0mVGbwA1ZEJQ
RaL0yGCYUv7rOF3MyGxRobfzkO6NkrMCAwEAATANBgkqhkiG9w0BAQUFAANBAGyF
F1UBdVn09qpT6pf2zeSDLM5JY5nSLeaR/eau1BkyBDJ2hgrcqnroHAxcJ8MfCMJ0
A2THH/cPIyrxPtRqnWw=
-----END CERTIFICATE REQUEST-----

User Cert
-----BEGIN CERTIFICATE-----
MIIBcjCCARwCAQAwDQYJKoZIhvcNAQEFBQAwRTEMMAoGA1UEChMDSUJNMQ0wCwYD
VQQKEwRHcmlkMQ0wCwYDVQQLEwRBQ01FMRcwFQYDVQQDFg5WbGFkaW1pciBTaWx2
YTAeFw0wMzA4MDkxNTEzNDFaFw0wNDA4MDgxNTEzNDFaMEUxDDAKBgNVBAoTA0lC
TTENMAsGA1UEChMER3JpZDENMAsGA1UECxMEQUNNRTEXMBUGA1UEAxYOVmxhZGlt
aXIgU2lsdmEwWjALBgkqhkiG9w0BAQEDSwAwSAJBAK2/BQtVdEgdORfCFkIupxyC
La/duZuN5Xln3Hw4B/eScyJKd4JpCxLlbBOywXZ/By63UFEww+CwkJnIg1sTMbcC
AwEAATANBgkqhkiG9w0BAQUFAANBAIJELfGdC2NS54EWMTrySLgqjjN/5/PP6tKV
APReCBMjOhAPFK19EOP+DtM/2G52WwXzZDL7z+U6laziNoKSu/s=
-----END CERTIFICATE-----

User Key
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,A94D69693CBB8D86

1qKvARBYM+Dz72nzcc5ZzYrzXHkIox1pbBjwrK4L1VQqZVM8s9/Tbau/W84guQOy
fcTVNJ4CBVonYdgiT5kTyYsamzHQKCiVCpC/XW0w7PTuiHODWl2gDXoEHH7SPkAg
wRuZdFhO281HfmKjIC0OBtIndBW4+98dFc9usRXBVPNLNW7GELxqFueGISueLP0Y
yOzsYVTs1ReZX1XgAV7vLgx/I939UHrpaoHS4i4orqTENsbLUUfcCYraU4cwiBZ2
AwCPLKCMV6hTanK/JlTOJParsC7TXwgCgSo27Jfv7Ioq20qPR8ZiMlJ+Rxuepnox
zH2AlxXPy9X6Cx+VNDDIw4WVUE22D/aEaBQQaumiVEMCiKU7zdXWYtirpod8Lg9w
MWX9BUTrSphW7xb7PYYjudPGU7xfSpaaimnnJtwxTBA=
-----END RSA PRIVATE KEY-----

Globus Credential Info
subject    : O=IBM,O=Grid,OU=ACME,CN=Vladimir Silva,CN=proxy
issuer     : O=IBM,O=Grid,OU=ACME,CN=Vladimir Silva
strength   : 512 bits
timeleft   : 43199 sec
proxy type : full legacy globus proxy

Globus Credential PEM
-----BEGIN CERTIFICATE-----
MIIBhzCCATGgAwIBAgIBADANBgkqhkiG9w0BAQUFADBFMQwwCgYDVQQKEwNJQk0x
DTALBgNVBAoTBEdyaWQxDTALBgNVBAsTBEFDTUUxFzAVBgNVBAMWDlZsYWRpbWly
IFNpbHZhMB4XDTAzMDgwOTE1MDg1MFoXDTAzMDgxMDAzMTM1MFowVTEMMAoGA1UE
ChMDSUJNMQ0wCwYDVQQKEwRHcmlkMQ0wCwYDVQQLEwRBQ01FMRcwFQYDVQQDFg5W
bGFkaW1pciBTaWx2YTEOMAwGA1UEAxMFcHJveHkwWjANBgkqhkiG9w0BAQEFAANJ
ADBGAkEAnrnvlZ4tOOa+8pfiHzyVRTn+PAyezGzAbSNH9DX45VTiSAQe/WW1EczR
VTuIeRcnz8fXPiu/q55l3rVWhipdUwIBETANBgkqhkiG9w0BAQUFAANBAKRvKN8c
DRbswd8fDJoPonyiwnQUpk3OarmVeyaDxEcreMHKZwbDEBPh/KGAzyrRjSKNXuEM
YapXoT8UqCziJtg=
-----END CERTIFICATE-----

-----BEGIN RSA PRIVATE KEY-----
MIIBOQIBAAJBAJ6575WeLTjmvvKX4h88lUU5/jwMnsxswG0jR/Q1+OVU4kgEHv1l
tRHM0VU7iHkXJ8/H1z4rv6ueZd61VoYqXVMCARECQEFbnuNBIa4EqPp6xoVVLmfM
ldx9qsylQCzwaOwWOU9eiU1Ls+d2dDSkxXL2viEjvhB5S5hCko8y1o6pdxzDllkC
IQC4A6cCD75zj2JntByMZUDkqEQ8WmZOm2MOHzZ/Uwu19wIhANzRydW7h5UCkRxl
WgnDWuNIpixx/zG0drxlB9p/jMyFAiEAjLd/tkhGWF6Wi4m7emuL+iZSTEUhDu9L
v4FHyscI9I8CIE3vsKXJt2HEq6+rTPRjEQTsduKClk3HOPcyt3pLIqKJAiEAgLoE
bcj80kwKXcCTlJE1WxK+Dk4IK/XQNi/jtpUZs4g=
-----END RSA PRIVATE KEY-----

Globus Credential Verification
subject    : O=IBM,O=Grid,OU=ACME,CN=Vladimir Silva,CN=proxy
issuer     : O=IBM,O=Grid,OU=ACME,CN=Vladimir Silva
strength   : 512 bits
timeleft   : 43198 sec
proxy type : full legacy globus proxy

8.3 JAVA CERTIFICATE SERVICES—WEB TOOLS FOR X.509 CERTIFICATES

The following sections describe a Web and command-line application called Java Certificate Services (JCS), included in the companion CD-ROM of this book. JCS is a set of Web and command-line tools created to assist system administrators and developers with the tedious task of managing user and host certificates in development grids. JCS is written to work specifically with the Globus and Java CoG Toolkits, and it demonstrates techniques for certificate, key, and proxy generation.

8.3.1 Overview

JCS provides the following functionality:

Certificate signing request (CSR) creation
CSR signature
X.509 certificate creation (useful for GT3 user or host certificates)
Self-signed (CA) certificate creation services

JCS is built on top of the JCE used by GT3 to implement the GSI. The providers currently used are Cryptix (http://www.cryptix.org) and the Legion of the Bouncy Castle (http://www.bouncycastle.org).

8.3.2 Installing JCS on a Windows or Linux Host

Source and binary bundles for JCS are provided in the companion CD-ROM. To install JCS, download the source bundle and decompress it into a folder on your host:

On Windows systems, unzip the binary into a working folder such as C:\
On Linux systems, untar the binary bundle: tar zxvf cert-services.tar.gz --directory=/opt

Look for the jCertServices-1.1 folder that is created, and read the Readme.txt or readme.html files for an in-depth description of this software.

8.3.3 Deploying the Web-Based Tool on Tomcat

JCS provides a very convenient Web-based tool that can be easily deployed on Apache Tomcat (http://www.apache.org/), or any J2EE/servlet container for that matter. JCS has been tested with Tomcat 4.1.27 and IBM WebSphere 4.x/5.x. To deploy the Web-based tool and build the source, you will need Apache Ant from http://ant.apache.org/. On Windows or Linux hosts, at the command prompt, type the following:

In Linux: cd /opt/jCertServices-1.1
In Windows: cd c:\jcertServices-1.1
In Windows/Linux: ant -Dtomcat.dir=<TOMCAT_DIR> deployTomcat

Make sure to replace <TOMCAT_DIR> with the full path to your Tomcat installation, for example, /opt/tomcat-4.1-27 in Linux.

8.3.4 Accessing the Web-Based Tool

To access the JCS Web-based tool, start your Tomcat server (assuming the Tomcat server is installed and configured on your system; see the server documentation for how to do this). Once the server is running, open a browser to the following URL: http://localhost:8080/certservices/. You should see the CA certificate installation dialog. A CA certificate must be configured first. This certificate will be used later on to sign any subsequent CSR request or X.509 certificate (see Figure 8.3). After the "root" CA certificate is installed, the main JCS menu will be displayed, as in Figure 8.4.

FIGURE 8.3 CA certificate installation page.

FIGURE 8.4 The JCS Web-tool main menu.

8.3.5 Creating a Certificate Request

To create a certificate request, click on the JCS "Certificate Request" main menu and fill in all the fields on the form. Save the output of your certificate request and private key in two different files such as usercert_request.pem and userkey.pem (see Figure 8.5).

FIGURE 8.5 Certificate request sample output.

8.3.6 Signing a Certificate Request

To sign a certificate request, click "Sign cert request" from the main menu, upload your PEM-encoded CSR file, and save the output to a file, for example, usercert.pem (see Figure 8.6).

FIGURE 8.6 Certificate request signature.

8.3.7 Creating a Self-Signed (CA) Certificate and Private Key

Creating a self-signed (CA) certificate becomes a trivial task with JCS. Simply click the "Self-signed certificate" link from the main menu, fill in the form values, and save the certificate and private key contents into two separate files: cacert.pem and cakey.pem.

8.3.8 Using the Command-Line-Based Tools

JCS provides command-line tools similar to GT3's openssl executable. These tools provide functionality similar to the Web application, including CSR generation and signature plus certificate information.

To enable certificate signature, a trusted or CA certificate (cacert.pem) and private key (cakey.pem) must exist in $HOME/.globus/CA.

8.3.9 Creating a Certificate Request from the Command Line

To create a certificate request, the following command can be used in Windows/Linux (make sure the JCS_INSTALL_PATH environment variable is set to the installation directory):

jcs req -out /tmp/rq.pem -keyput /tmp/key.pem -pwd "mypwd"

Optional arguments are -dn "O=Grid, OU=OGSA, OU=IT, CN=John Doe" and -bits 1024. (Note: O, OU, and CN are case sensitive.)

8.3.10 Signing a Certificate Request from the Command Line

To sign a certificate request, run the following command from your OS prompt (Windows/Linux):

jcs ca -rq /tmp/req.pem -out /tmp/cert.pem

For this command to work, a trusted (CA) certificate must be installed in the user's .globus/CA directory; thus, $HOME/.globus/CA must exist and contain two files: cacert.pem and cakey.pem. Optional arguments are -cacert, -cakey, and -capwd.

8.3.11 Getting Information on an X.509 Certificate

To get information on an X.509 certificate, use the following command (Windows/Linux):

jcs x509 -in /tmp/cert.pem -info

A sample output of this command would be

Subject: C=US, O=Grid, OU=simpleCA, CN=John Doe
Hash: 945769

8.3.12 Creating a Binary Distribution from Source

To build JCS from source, you will need Apache Ant configured in your system:

Unzip the source distribution in your filesystem.
Change to the source directory: cd jCertServices
Type: ant all

The binary distribution can be found within the source directory as build/jCertServices-1.1.

8.4 GLOBUS TOOLKIT 3 SECURITY LIBRARIES

GT3 uses a completely new security library provided by the Java CoG Kit 1.1. The new security library is based on GSS-API and is implemented entirely with open-source SSL and certificate-processing libraries. Web applications written for custom application servers that require the manipulation of X.509 certificates and proxies require extra tweaking. Most conflicts are the result of the JCE used by the security provider. The code provided in the companion CD-ROM has been tested on Apache Tomcat and the IBM WebSphere platforms.

8.4.1 Creating Proxies from Default Certificates Within a Web Application

We will start by writing a simple Java server page (JSP) to create a Globus credential, also known as a proxy (see Listing 8.8).

LISTING 8.8 Using proxy.jsp JSP File to Create a Globus Credential

<%@ page language="java"
    contentType="text/html; charset=ISO-8859-1"
    import="java.util.Properties,
        java.io.*,
        java.security.cert.*,
        org.globus.gsi.*,
        org.globus.gsi.bc.*,
        org.apache.log4j.*"
%>

<%!
/* Sub to create a Grid proxy certificate */
public synchronized GlobusCredential gridProxyInit(
        InputStream inUserCert, /* certificate */
        InputStream inUserKey,  /* encrypted key */
        String pwd,             /* passphrase used to encrypt the key */
        int bits,               /* cert strength (512, 1024, etc.) */
        int hours               /* lifetime in hours */
        ) throws IOException, java.security.GeneralSecurityException
{
    // Load cert & key from input streams
    X509Certificate userCert = CertUtil.loadCertificate(inUserCert);
    OpenSSLKey key = new BouncyCastleOpenSSLKey(inUserKey);

    System.out.println("gridProxyInit: User Cert=" + userCert
        + " User key encrypted=" + key.isEncrypted());

    if (key.isEncrypted()) {
        try {
            key.decrypt(pwd);
        } catch (java.security.GeneralSecurityException e) {
            throw new java.security.GeneralSecurityException(
                "Wrong password or other security error");
        }
    }
    System.out.println("gridProxyInit: User Priv key: "
        + key.getPrivateKey());

    java.security.PrivateKey userKey = key.getPrivateKey();

    BouncyCastleCertProcessingFactory factory =
        BouncyCastleCertProcessingFactory.getDefault();

    return factory.createCredential(
        new X509Certificate[] { userCert },
        userKey, bits, hours * 3600,
        GSIConstants.GSI_2_PROXY,
        (org.globus.gsi.X509ExtensionSet) null);
}

%>
<%
Logger.getRootLogger().setLevel(Level.DEBUG);

String action = (request.getParameter("ACTION") != null)
    ? request.getParameter("ACTION") : "INIT";
String pwd = request.getParameter("txtPWD");
String bits = request.getParameter("ST");
String hours = request.getParameter("LIFETIME");

String Msg = request.getParameter("MSG");

String subject = "N/A";   // proxy subject (O=Grid,OU=Globus,CN=...)
String issuer = "N/A";
String strength = "N/A";  // proxy strength (512, 1024)
String sTimeLeft = "N/A"; // proxy time left (hours)

// is the proxy ready?
boolean proxyReady = false;

try {
    if (action.equalsIgnoreCase("CREATE")) {

        // Load user certificate & key from the CoG properties file
        // located in $USER_HOME/.globus/cog.properties
        String path = System.getProperty("user.home")
            + "/.globus/cog.properties";

        org.globus.common.CoGProperties props =
            new org.globus.common.CoGProperties();

        props.load(path);

        // get the user cert & key from cog.properties
        String certPath = props.getProperty("usercert");
        String keyPath = props.getProperty("userkey");

        System.out.println("test-proxy.jsp::CREATE pwd=" + pwd
            + " bits=" + bits + " hours=" + hours);

        // create a Globus credential; certs read from default locations
        InputStream inCert = new FileInputStream(certPath);
        InputStream inKey = new FileInputStream(keyPath);

        int stren = Integer.parseInt(bits);
        int life = Integer.parseInt(hours);

        // create globus proxy
        GlobusCredential cred =
            gridProxyInit(inCert, inKey, pwd, stren, life);

        // get proxy information to be displayed
        subject = cred.getSubject();
        issuer = cred.getIssuer();
        strength = new Integer(cred.getStrength()).toString();
        sTimeLeft = org.globus.util.Util
            .formatTimeSec(cred.getTimeLeft());

        // save proxy in default location
        ByteArrayOutputStream bos = new ByteArrayOutputStream();

        cred.save(bos);

        proxyReady = true;
        Msg = "Proxy created.";
    }
} catch (Exception e) {
    e.printStackTrace();
    response.sendRedirect("proxy.jsp?MSG=GridProxyInit+Error:"
        + e.getMessage());
}
%>

<LINK href="../../theme/Master.css" rel="stylesheet" type="text/css">

Proxy.jsp

Grid Proxy Init

Certificates will be loaded from default locations


<% if (Msg != null) { %>
<%= Msg %>
<% } %>

<FORM method="POST" name="F1" onsubmit="return Proxy_OnSubmit()">

Proxy Options
Lifetime 12h 24h 1week 1month
Strength 512 1024 2048 4096

Pass Phrase
&nbsp; &nbsp; &nbsp;

Proxy Info
CN
Subject <%= subject %>
Issuer <%= issuer %>
Time left <%= sTimeLeft %>
Strength <%= strength %>


To successfully run this JSP file in your favorite JSP container, make sure the following Java archive (JAR) files are in your WEB-INF/lib folder: cog-jglobus.jar, cryptix-asn1.jar, cryptix32.jar, jce-jdk13-117.jar, jgss.jar, log4j-core.jar, and puretls.jar. All these files are included in the GT3 distribution. Your local certificates must also be configured, as shown in the following section.

8.4.2 Configuring Local Certificates for Testing

You must configure local certificates to be used by this JSP file, or by the Java CoG kit for that matter. The file cog.properties must be present under your home .globus folder. Thus, if you are logged in to your win32 system as "Administrator," then you must create:

C:\Documents and Settings\Administrator\.globus\cog.properties

Sample contents of this file are shown in Listing 8.9.

LISTING 8.9 cog.properties Sample File

#Tue Aug 26 11:46:58 EDT 2003
usercert=C\:\\Documents and Settings\\Administrator\\.globus\\usercert.pem
userkey=C\:\\Documents and Settings\\Administrator\\.globus\\userkey.pem
proxy=C\:\\DOCUME~1\\ADMINI~1\\LOCALS~1\\Temp\\x509up_u_vladimir
cacert=C\:\\Documents and Settings\\Administrator\\.globus\\simpleCA\\cacert.pem

This file will be set up automatically if you download and install the Java CoG kit from the Globus Web site. A user certificate and private key are also required. Once you have all the pieces in place, this JSP file will look something like Figure 8.7 in your Web browser. The output of this JSP will look something like Figure 8.8. This software provides functionality for grid proxy creation within a Web application. The following section demonstrates the same technique for certificates and private keys.

FIGURE 8.7 Proxy init test. FIGURE 8.8 Creating a Globus credential.
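Because cog.properties is an ordinary Java properties file, backslashes and colons in the Windows paths must be escaped, which is why Listing 8.9 shows sequences such as C\:\\. A JDK-only sketch (the class name is mine) of how such an entry is parsed:

```java
import java.io.StringReader;
import java.util.Properties;

public class CogPropsSketch {

    /** Read the usercert entry from cog.properties-style contents. */
    public static String userCert(String contents) throws Exception {
        Properties props = new Properties();
        props.load(new StringReader(contents));
        return props.getProperty("usercert");
    }

    public static void main(String[] args) throws Exception {
        // Escaped form as it appears on disk; load() unescapes it
        String sample = "usercert=C\\:\\\\temp\\\\usercert.pem\n";
        System.out.println(userCert(sample)); // C:\temp\usercert.pem
    }
}
```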

8.4.3 Creating Certificates and Private Keys

Let us look at the JSP code used to create a certificate and private key pair (see Listing 8.10).

LISTING 8.10 Using the certkey.jsp JSP File to Create a Certificate and Private Key

<%@ page language="java"
    contentType="text/html; charset=ISO-8859-1"
    import="java.util.*,
        java.io.*,
        java.security.*,
        java.security.cert.*,
        org.globus.gsi.*,
        org.globus.gsi.bc.*,
        org.apache.log4j.*,
        COM.claymoresystems.cert.*,
        cryptix.util.mime.*"
%>

<%!
/*
 * Write bytes into a PEM string
 */
public static String writePEM(byte[] bytes, String hdr, String ftr)
    throws IOException
{
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    Base64OutputStream b64os = new Base64OutputStream(bos);
    b64os.write(bytes);
    b64os.flush();
    b64os.close();

    ByteArrayInputStream bis = new ByteArrayInputStream(bos.toByteArray());
    InputStreamReader irr = new InputStreamReader(bis);
    BufferedReader r = new BufferedReader(irr);

    StringBuffer buff = new StringBuffer();
    String line;
    buff.append(hdr);

    while ((line = r.readLine()) != null) {
        buff.append(line + "\n");
    }
    buff.append(ftr);
    return buff.toString();
}

/* Create an X509 Name used for cert creation */
private static X509Name makeCertDN(String subject) throws Exception
{
    Vector tdn = new Vector();
    StringTokenizer st = new StringTokenizer(subject, ",");

    while (st.hasMoreTokens()) {
        String s = st.nextToken(); // [key=value]
        if (s.indexOf("=") == -1) {
            throw new Exception("Invalid subject format: " + subject
                + " Offending value: " + s);
        }

        String key = s.substring(0, s.indexOf("=")).trim();
        String val = s.substring(s.indexOf("=") + 1).trim();

        if (val == null || val.equals("")) {
            throw new Exception("Invalid subject format: " + subject
                + " Offending value: " + s);
        }

        String[] temp = { key, val };
        tdn.addElement(temp);
    }

    // COM.claymoresystems.cert (puretls.jar)
    return CertRequest.makeSimpleDN(tdn);
}

/* Create a self-signed cert and key */
public void generateSelfSignedCertAndKey(
        String subject,     /* Example: c=us,o=ibm,ou=it,cn=John Doe */
        int bits,           /* cert strength: 512, 1024, etc. */
        String Pwd,         /* key passphrase */
        StringWriter swKey, /* OUT: encrypted key */
        StringWriter swCert /* OUT: PEM encoded cert */
        ) throws NoSuchAlgorithmException, Exception
{
    X509Name _subject = makeCertDN(subject);

    // buffer for the key PEM
    BufferedWriter bw = new BufferedWriter(swKey);

    KeyPair kp = CertRequest.generateKey("RSA", bits, Pwd, bw, true);

    // certs are valid for 1 year: 31536000 secs
    byte[] certBytes = CertRequest
        .makeSelfSignedCert(kp, _subject, 31536000);

    BufferedWriter bw1 = new BufferedWriter(swCert);

    // encode cert in PEM format
    String _certPEM = writePEM(certBytes,
        "-----BEGIN CERTIFICATE-----\n",
        "-----END CERTIFICATE-----\n");
    bw1.write(_certPEM);
    bw1.close();
}

%>

<%
org.apache.log4j.Logger.getRootLogger()
    .setLevel(org.apache.log4j.Level.DEBUG);

String action = (request.getParameter("ACTION") != null)
    ? request.getParameter("ACTION") : "INIT";
String Msg = request.getParameter("MSG");

// private key & cert PEM(s)
String keyPEM = null;
String certPEM = null;

try {
    if (action.equalsIgnoreCase("GENERATE")) {
        String cn = request.getParameter("txtCN");
        String org = request.getParameter("txtORG");
        String ou = request.getParameter("txtOU");
        String pwd = request.getParameter("txtPWD");

        String subject = "O=" + org + ",OU=" + ou + ",CN=" + cn;
        System.out.println("Gen certs. Subject: " + subject);

        // cert PEM
        StringWriter swCert = new StringWriter();

        // priv key PEM
        StringWriter swKey = new StringWriter();

        generateSelfSignedCertAndKey(subject, 1024, pwd, swKey, swCert);

        // Private key
        keyPEM = swKey.toString();

        // cert
        certPEM = swCert.toString();
    }
} catch (Exception e0) {
    response.sendRedirect("certkey.jsp?MSG=Error+installing+certs:"
        + e0.getMessage());
}
%>

setup - gsi.jsp

Setup Certificate / Private key


<% if (Msg != null) { %>
<%= Msg %>
<% } %>

All fields are required


Common Name Enter your name or email
Organizational Unit
Organization Organization
Pass phrase Password used to encrypt the private key


<% if (certPEM != null || keyPEM != null) { %>

Certificate

Key

<% } %>

On your Web browser, this code will look like Figure 8.9.

FIGURE 8.9 X.509 certificate and private key generation within a Web application.

8.4.4 Troubleshooting: Security Provider Problems

When attempting to create a certificate/private key set, the following exception may be thrown by your application server:

java.lang.InternalError: java.security.NoSuchAlgorithmException:
class configured for Cipher: com.ibm.crypto.provider.DESedeCipher
is not a subclass of xjava.security.Cipher
    at COM.claymoresystems.crypto.PEMData.writePEMObject(PEMData.java:172)
    at COM.claymoresystems.crypto.EAYEncryptedPrivateKey.writePrivateKey(EAYEncryptedPrivateKey.java:83)
    at COM.claymoresystems.cert.CertRequest.generateKey(CertRequest.java:102)
    at jsp.article._test_2D_creds_jsp_0.generateSelfSignedCertAndKey(_test_2D_creds_jsp_0.java:108)
    at jsp.article._test_2D_creds_jsp_0._jspService(_test_2D_creds_jsp_0.java:210)

The message "class configured for Cipher: com.ibm.crypto.provider.DESedeCipher is not a subclass of xjava.security.Cipher" indicates that IBM's JCE is used as the default cryptography provider by the Web container. The security provider should be Cryptix/Bouncy Castle. This exception is thrown when decrypting a private key with certain application servers such as IBM WebSphere because of security provider collisions between the application server and the Java CoG kits. A solution to this conflict is described in the Hosting Environments section of Chapter 7.
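When diagnosing collisions like this one, it helps to see which JCE providers are registered and in what order, because the first provider that claims an algorithm wins. A JDK-only diagnostic sketch (the class name is mine):

```java
import java.security.Provider;
import java.security.Security;

public class ProviderCheck {

    /** Return the names of the installed providers in precedence order. */
    public static String[] providerNames() {
        Provider[] ps = Security.getProviders();
        String[] names = new String[ps.length];
        for (int i = 0; i < ps.length; i++) {
            names[i] = ps[i].getName();
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : providerNames()) {
            System.out.println(name);
        }
    }
}
```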

8.5 SUMMARY

Security is at the foundation of grid computing. A good and efficient security model provides for enhanced authentication and communication over computer networks. This chapter presented the following security-related topics:

An overview of the Grid Security Infrastructure (GSI)
An explanation of the basics of digital certificates, with a set of programs for certificate, key, and grid proxy generation and signature
A sample Web-based application for certificate manipulation, very similar in functionality to a certificate authority (CA), along with command-line tools and installation transcripts for two popular application servers: Tomcat and WebSphere
Integration of the Globus Toolkit security libraries with other OGSA hosting environments, along with common error messages and troubleshooting

REFERENCES

[CogKit01] Gregor von Laszewski, Ian Foster, Jarek Gawor, and Peter Lane, "A Java Commodity Grid Kit," Concurrency and Computation: Practice and Experience, vol. 13, no. 8–9, pp. 643–662, 2001. Available online at http://www.cogkit.org/.
[GridPhysiology02] Ian Foster, Carl Kesselman, Jeffrey Nick, and Steven Tuecke, "The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration." Globus Project, 2002. Available online at http://www-unix.globus.org/ogsa/docs/alpha/physiology.pdf.
[GSIGlobus03] Overview of the Grid Security Infrastructure. Globus Project and Globus Toolkit are trademarks held by the University of Chicago. Available online 2004 at http://www-unix.globus.org/security/overview.html.
[GSS-API93] RFC 1508—Generic Security Service Application Program Interface. Accessed online September 1993 at http://www.faqs.org/rfcs/rfc1508.html.
[X509Pki99] R. Housley, W. Ford, W. Polk, and D. Solo, Network Working Group RFC 2459, Internet X.509 Public Key Infrastructure. Available online January 1999 at http://www.ietf.org/rfc/rfc2459.txt.

9 Resource Management

In This Chapter

Web Services GRAM Architecture (GT3)
Pre–Web Services GRAM Architecture (GT2)
Summary
References

This chapter includes information on the following topics:

An overview of the Globus Resource Allocation Manager (GRAM) architecture for GT3
A description of the Resource Specification Language (RSL) used for job submission
A description of the Master Managed Job Factory Service (MMJFS), including installation, configuration, and troubleshooting
A custom GRAM client program for MMJFS with security features
Performance evaluation tests of various MMJFS clients provided by Globus
The pre–Web services GRAM architecture (GT2) for job submission
Java programs for job submission on GT2

All code shown in this chapter is available on the companion CD-ROM.


9.1 WEB SERVICES GRAM ARCHITECTURE (GT3)

The Globus Resource Allocation Manager (GRAM) is at the core of the Globus remote program execution infrastructure. It allows you to run jobs remotely, using a set of Web Service Description Language (WSDL) client interfaces for submitting, monitoring, and terminating a job. A job request is written in RSL and processed by the Managed Job Service (MJS). MJS and RSL together are also known as the Master Managed Job Factory Service (MMJFS). GRAM on GT3 has been divided into two big containers: the Master Hosting Environment (MHE) and the User Hosting Environment (UHE). The MHE is in charge of authorizing the GRAM request against the grid-map file and starting or pinging the UHE. The MHE then uses a redirector to forward the request to the UHE process, which in turn creates an MJS. The MJS submits the job into the back-end scheduling system. Both environments interact in a typical GRAM request (see Figure 9.1) [WSGramGuide04].

FIGURE 9.1 Workflow of a typical WS-GRAM request.

1. The Master is configured to use the Redirector to redirect calls to it and to use the Starter UHE module when there is not a running UHE for the user. A createService call on the Master uses the Redirector to invoke the Starter UHE module and start a UHE.
2. The Master publishes its handle to a remote registry (optional).
3. A client submits a createService request that is received by the Redirector.

4. The Redirector calls the Starter UHE class, which authorizes the request via the grid-map file to determine the local username and port to be used and constructs a target URL.
5. The Redirector attempts to forward the call to the target URL. If it is unable to forward the call because the UHE is not up, the Launch UHE module is invoked.
6. The Launch UHE creates a new UHE process under the authenticated user's local user id (uid).
7. The Starter UHE waits for the UHE to be started (ping loop), then returns the target URL to the Redirector.
8. The Redirector forwards the createService call to the MJFS unmodified, and mutual authentication/authorization can take place.
9. MMJFS creates a new MJS.
10. MJS submits the job into a back-end scheduling system.
11. Subsequent calls to the MJS from the client will be redirected through the Redirector.
12. The Resource Information Provider Service (RIPS) gathers data from the local scheduling system such as filesystem info, host info, and so on.
13. FindServiceData requests to the Master will forward service information SDEs to the MMJFS of the requestor's UHE.
14. To stream stdout/stderr back to the client, the MJS creates two File Stream Factory Services (FSFS), one for stdout and one for stderr.
15. The MJS then creates the File Stream Services (FSS) instances as specified in the job request.
16. The Grid Resource Information Manager (GRIM) handler is run in the UHE to create a user host certificate. The user host certificate is used for mutual authentication between the MJS service and the client.

One of the main reasons behind the Master and User hosting environments is fault tolerance. If a GRAM request crashes the UHE for any reason, the MHE will be able to restart the process the next time a request comes along. There will be a UHE process for each local user in the grid-map file but only one MHE instance.

9.1.1 Fault Tolerance

Fault tolerance is achieved by using several background container handlers. Among the most important are the following:

UHERestartHandler: Runs once when the MHE is started. It attempts to restart any previously started UHEs. There will be a UHE process for each entry in grim-port-type.xml.

UHESweeperTask: Runs every two hours in the MHE and is responsible for restarting crashed UHEs and updating files when a UHE is shutting down.

UHEActivityTask: Runs on the UHE and monitors activity, updating admin services accordingly.

9.1.2 Resource Specification Language (RSL)

The RSL now supports a new XML schema as well as the previous 2.2.x format [CzajkowskiFoster99]. Two equivalent RSL representations of an echo job request, in GT2 and GT3, are shown in Listing 9.1 and Listing 9.2.

LISTING 9.1 Echo RSL Request in GT2

&(executable=/bin/echo)
 (directory="/bin")
 (arguments="Hello World")
 (stdin=/dev/null)
 (stdout="stdout")
 (stderr="stderr")
 (count=1)

LISTING 9.2 Echo RSL Request in GT3



A full description of the new RSL XML Schema can be found at [GramRSL04].

9.1.3 MMJFS Configuration

MMJFS configuration can be difficult for beginners. The following sections describe the basic steps to configure this service for GT3.

9.1.3.1 Testing your MMJFS Installation

After carefully following the installation transcript provided by Globus, we are ready to test our MMJFS installation by running the managed-job-globusrun script provided. It is assumed that host and user certificates have already been set up and that the container is up and running. Use the following command to send an MMJFS test request:

$ source $GLOBUS_LOCATION/etc/globus-user-env.{sh,csh}
$ grid-proxy-init
$ $GLOBUS_LOCATION/bin/managed-job-globusrun
    -factory http://`hostname`:8080/ogsa/services/base/gram/MasterForkManagedJobFactoryService
    -file $GLOBUS_LOCATION/etc/test.xml -personal -o

Some output should be displayed in your terminal. If you get a stack trace of errors, here are some tips for identifying the cause:

Monitor the UHE log file for the user that is submitting the GRAM request. The file is located in $HOME/.globus/uhe-[HOSTNAME]/log.

Monitor the container stdout terminal for messages.

Enable log4j messages by editing the file $GLOBUS_LOCATION/log4j.properties. Specifically, change the following lines:

log4j.rootCategory=WARN, CONSOLE

should be changed to

log4j.rootCategory=INFO, CONSOLE

log4j.appender.CONSOLE.Threshold=WARN

should be changed to

log4j.appender.CONSOLE.Threshold=INFO

See the troubleshooting section later in this chapter if you run into trouble. The Globus Web site also provides a more complete installation guide and FAQ, which, along with the Globus forums, are good sources of information.

9.1.3.2 Scheduler Installation

If you have a job scheduler such as OpenPBS, LSF, or Condor, run the following commands to set it up:

$ gpt-build scheduler-[name]-3.2-src_bundle.tar.gz gcc32dbg
$ gpt-postinstall

The scheduler interface file can be found at $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/[scheduler].pm. Edit this file if you need to adjust scheduler options such as cluster type, paths, job submission parameters, and so on.

9.1.4 Troubleshooting

This section presents some of the problems you might find when configuring MMJFS on a Red Hat Linux 8.x system running on VMware Workstation 4.x.

9.1.4.1 Authentication Failed Messages

Message:

init_sec_context.c:251: gss_init_sec_context: Mutual authentication failed:
The target name (/O=Grid/OU=GlobusTest/OU=simpleCA.ibm.com/CN=host/vm-vlad-lap.pok.ibm.com)
in the context, and the target name (/CN=host/localhost.localdomain)
passed to the function do not match (error code 7)

Possible cause: The hostname used to create your host cert hostcert.pem is invalid. This happens when the hostname of the machine changes for some reason. Globus strongly recommends using static IP addresses for your hosts.

9.1.4.2 Connection Timeout Messages

Message:

Service is not up - Connection timeout
Error: error submitting job request:

Could not connect to job manager [Root error message: ; nested exception is: java.net.SocketTimeoutException: Read timed out]

Possible causes:

1. Invalid path to the Java Virtual Machine (JVM) in $GLOBUS_LOCATION/bin/launch-uhe.sh.
2. OPTIONS environment variable not set.
3. On slow systems, the very first connection sometimes times out before the MHE can start the UHE; the second and successive submissions should work (assuming that MMJFS has been properly set up).

Solution:

1. Check the JVM path in $GLOBUS_LOCATION/bin/launch-uhe.sh.
2. Set the environment variable OPTIONS=-Dorg.globus.ogsa.client.timeout=180000 -Djava.net.preferIPv4Stack=true

9.1.4.3 Unknown Policy Errors

Message:

Error processing message
org.globus.gsi.proxy.ProxyPathValidatorException: Unknown policy: 1.3.6.1.4.1.3536.1.1.1.7

Possible cause: File grim-port-type.xml not properly configured.

Solution: For each user in your grid, there must be an entry in both the grid-map and grim-port-type.xml files. See the earlier MMJFS configuration links for more details.

9.1.4.4 Defective Credential Errors

Message (on the server):

Defective credential detected [Root error message: Host authorization failure.
Expected target: '/CN=host/vm-rhl8-1.ibm.com'.
Target returned: '/O=Grid/OU=GlobusTest/OU=simpleCA-vm-rhl8-2.ibm.com/CN=host/vm-rhl8-2.ibm.com/CN=501']

Message (on the client):

Error: error submitting job request: Could not connect to job manager
[Root error message: java.lang.Exception: Service is not up]

Possible cause: Client and host certificates do not match or were not signed by the same Certificate Authority (CA). (By looking at the hostnames, we can see that the names do not match: vm-rhl8-1.ibm.com versus vm-rhl8-2.ibm.com.)

Solution: Re-create the host certificate by issuing the following commands.

Create a cert request (as root):

grid-cert-request -host [HOST_NAME]

Sign the new request. This step requires the Globus SimpleCA Certificate Authority package:

grid-ca-sign -dir [SIMPLE CA HOME] -in /etc/grid-security/hostcert_request.pem -out /etc/grid-security/hostcert.pem

Verification tips: To verify a host certificate identity, run the grid-cert-info script. The Common Name (CN) part of the output should match the name of the host:

grid-cert-info -file /etc/grid-security/hostcert.pem -subject
/O=Grid/OU=GlobusTest/OU=simpleCA-vm-rhl8-2.ibm.com/CN=host/vm-rhl8-1.ibm.com

Resubmit the MMJFS test:

$GLOBUS_LOCATION/bin/managed-job-globusrun -factory http://[HOSTNAME]:8080/ogsa/services/base/gram/MasterForkManagedJobFactoryService -file $GLOBUS_LOCATION/schema/base/gram/examples/test.xml -o

Globus recommends installing the toolkit as a nonroot user (i.e., globus). ProxyPathValidatorException errors can occur when testing MMJFS submissions as user "globus." To test your MMJFS installation, we recommend using a user different from globus or root. The following are sample grid security configuration files used by MMJFS, commonly found under /etc/grid-security:

Grid-map file:

"/O=Grid/OU=GlobusTest/OU=simpleCA-vm-vlad-lap.pok.ibm.com/CN=vsilva" vsilva
"/O=Grid/OU=GlobusTest/OU=simpleCA-vm-vlad-lap.pok.ibm.com/CN=globus" globus
"/C=GB/ST=Berkshire/O=ACME/OU=IT/CN=John Doe" jdoe

Grim-port-type.xml (one port type entry per authorized user):

http://www.globus.org/namespaces/managed_job/managed_job/ManagedJobPortType
http://www.globus.org/namespaces/managed_job/managed_job/ManagedJobPortType

For this configuration, both users vsilva and globus are authorized to run GT3 GRAM jobs. User jdoe will be able to run GT2 GRAM requests only.
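The grid-map format above is simple enough to parse by hand: each line pairs a quoted certificate subject with a local account name. The following helper is hypothetical (it is not part of the Globus Toolkit) and is shown only to make the file format concrete:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class GridMapReader {
    /** Parse grid-map lines of the form: "<subject DN>" <local user>. */
    static Map<String, String> parse(String contents) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String line : contents.split("\n")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) continue;   // skip comments
            int close = line.lastIndexOf('"');
            if (line.charAt(0) != '"' || close <= 0) continue;      // malformed line
            String dn = line.substring(1, close);                   // quoted subject
            String user = line.substring(close + 1).trim();         // local account
            map.put(dn, user);
        }
        return map;
    }

    public static void main(String[] args) {
        String gridMap =
            "\"/O=Grid/OU=GlobusTest/OU=simpleCA-vm-vlad-lap.pok.ibm.com/CN=vsilva\" vsilva\n"
          + "\"/C=GB/ST=Berkshire/O=ACME/OU=IT/CN=John Doe\" jdoe\n";
        // Maps each certificate subject to its local account
        System.out.println(GridMapReader.parse(gridMap));
    }
}
```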

9.1.5 A Custom GRAM Client for MMJFS

The next section describes a custom GRAM client for GT3 that can be run from the command line or included in a Web application or servlet. The libraries required to run this code can be found in your GLOBUS_LOCATION/lib folder and should be included in your project classpath, specifically mjs.jar, mmjfs.jar, and filestreaming.jar. There are remarkable similarities with GRAM 2.x. Consider Listing 9.3.

LISTING 9.3 CustomGRAMClient.java, a Custom GRAM Client for MMJFS

package grid;

/**
 * Title: Custom GRAM client for GT3.x
 * Description:
 * @author Vladimir Silva
 * @version 1.0
 */
import java.io.*;
import java.net.URL;

import org.globus.io.gass.server.*;
import org.globus.gram.GramException;

import org.globus.ogsa.impl.base.gram.client.*;
import org.globus.ogsa.base.gram.types.JobStateType;
import org.globus.ogsa.impl.security.authentication.Constants;
import org.globus.ogsa.impl.security.authorization.Authorization;
import org.globus.ogsa.impl.security.authorization.HostAuthorization;
import org.globus.ogsa.impl.security.authorization.IdentityAuthorization;
import org.globus.ogsa.impl.security.authorization.SelfAuthorization;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
 * Custom GRAM client Java program
 */
public class CustomGRAMClient
    implements GramJobListener,  // listen for job status messages
               JobOutputListener // listen for job output from GASS server
{
    private String _USAGE = "-factory <factory URL> -file <RSL file> [options]"
        + "\nOptions:"
        + "\n\t-help Help"
        + "\n\t-auth Authorization: "
        + "[host, self, (identity) e.g. /ou=ACME/o=globus/cn=John Doe]"
        + "\n\t-o Redirect output from the GASS server"
        + "\n\t-sec 'sig' for XML signature, 'enc' for XML encryption"
        + "\n\t-b 'batch' mode. Mutually exclusive with -o"
        + "\n\t-quiet No verbose messages";

    /**
     * The variable name to use in input RSL when referring to the URL
     * of the local GASS server, if started.
     */
    public static final String GLOBUSRUN_GASS_URL = "GLOBUSRUN_GASS_URL";

    private static Log logger =
        LogFactory.getLog(CustomGRAMClient.class.getName());

    boolean quiet = false;
    private boolean done = false;
    private GassServer gassServer;
    private GramJob job;

    // Output received from the GASS server
    StringBuffer _jobOutput = new StringBuffer("");

    // Command-line arguments
    private Authorization _authorization = SelfAuthorization.getInstance();
    private Integer _xmlSecurity = Constants.SIGNATURE;

    // Redirect output from the GASS server.
    // The batch and output redirection flags are mutually exclusive.
    private boolean _redirOutput = true;
    private boolean _batch = false;

    private int _gassOptions = 0;
    private int _timeout = GramJob.DEFAULT_TIMEOUT;

    private String _factory;   // MJFS service factory URL
    private String _rslString; // RSL string (read from the -file argument)

One of the basic differences with a GT2 GRAM client is the location of the client classes. For GT3, the package org.globus.ogsa contains the required code. The packages org.globus.io.gass and org.globus.gram are provided by the Java Commodity Grid (CoG) kit and apply to both versions.

9.1.6 Listening for Job Status Updates and Job Output

The way a GT3 client listens for job status and output is identical to that of GRAM2. The following interfaces provide for job status updates (see Listing 9.3.1):

GramJobListener: Implement this interface to listen for GRAM job status updates:

public void statusChanged(GramJob job)

JobOutputListener: Implement this interface to receive output from the Global Access to Secondary Storage (GASS) server:

public void outputChanged(String output)

Fires when the output changes

public void outputClosed()

Fires when the output is closed by the server.

LISTING 9.3.1 Job Status and Output Callbacks

/**
 * Callback as a GramJobListener.
 * Will not be called in batch mode.
 */
public void statusChanged(GramJob job) {
    String stringStatus = job.getStatusAsString();
    if (!quiet) {
        System.out.println("== Status Notification ==");
        System.out.println("Status: " + stringStatus);
        System.out.println("======");
    }

    synchronized (this) {
        if (stringStatus.equals(JobStateType._Done)
                || stringStatus.equals(JobStateType._Failed)) {
            this.done = true;
            notifyAll();
        }
    }
}

/**
 * Called whenever the job's output has been updated.
 *
 * @param output new output
 */
public void outputChanged(String output) {
    _jobOutput.append(output);
}

/**
 * Called whenever the job finishes
 * and no more output will be generated.
 */
public void outputClosed() {
    //System.out.println("outputClosed");
}

9.1.7 The GASS Server

Global Access to Secondary Storage (GASS) runs on the GRAM client and negotiates data transfers with the remote service. For GT2, that is the Globus gatekeeper; for GT3, the MJS service. The GASS server is required if the client is to receive output from the server. Of course, this does not apply to batch mode submissions.

9.1.7.1 Starting the GASS Server

gassServer = new GassServer();
gassServer.registerDefaultDeactivator();

9.1.7.2 Receiving Output from GASS

In GT2:

JobOutputStream stdoutStream = new JobOutputStream(this);
JobOutputStream stderrStream = new JobOutputStream(this);

String jobid = String.valueOf(System.currentTimeMillis());

gassServer.registerJobOutputStream("err-" + jobid, stderrStream);
gassServer.registerJobOutputStream("out-" + jobid, stdoutStream);

In GT3:

JobOutputStream stdoutStream = new JobOutputStream(this);
JobOutputStream stderrStream = new JobOutputStream(this);

// register output listeners
gassServer.registerJobOutputStream("err", stderrStream);
gassServer.registerJobOutputStream("out", stdoutStream);

9.1.8 RSL Submission

After you start the GASS server, the GRAM submission can proceed. An RSL XML string is required for this step, as shown in Listing 9.4.

LISTING 9.4 MMJFS GRAM Job Submission

/**
 * Return the output from the GRAM job submission
 */
public String getJobOutput() {
    return _jobOutput.toString();
}

/**
 * Constructor
 */
public CustomGRAMClient(String[] args) throws Exception {
    if (parseArgs(args)) {
        submitRSL(_factory, _rslString, _authorization,
                  _xmlSecurity, _batch,
                  _gassOptions, _redirOutput, _timeout);
    }
}

/**
 * GRAM Submission sub
 * @param factoryUrl URL of the MJS factory
 * @param rslString RSL string as XML
 * @param authorization Host, Self, or Identity
 * @param xmlSecurity sig for signature, enc for encryption
 * @param batchMode Batch submission (will not wait for job)
 * @param gassOptions GASS server options (I/O enable, disable, etc.)
 * @param redirOutput Redirect output to the client (non-batch)
 * @param timeout Job timeout in millis
 */
void submitRSL(String factoryUrl, String rslString,
               Authorization authorization, Integer xmlSecurity,
               boolean batchMode, int gassOptions,
               boolean redirOutput, int timeout)
        throws Exception {

    _batch = batchMode; // In multi-job, -batch is not allowed. Dryrun is.

    if (batchMode) {
        _verbose("Warning: Will not wait for job completion,"
                 + " and will not destroy job service.");
    }

    this.job = new GramJob(rslString);
    job.setTimeOut(timeout);

    job.setAuthorization(authorization);
    job.setMessageProtectionType(xmlSecurity);

    // now must ensure JVM will exit if uncaught exception:
    try {

this.startGASSserver(gassOptions);

        // check isSubstitutionReferenced?
        // move XML string building to GramJobAttributes
        // or JobAttributesImpl
        String xmlValue = "";
        job.setSubstitutionDefinition(GLOBUSRUN_GASS_URL, xmlValue);
        // check isSubstitutionDefined

        if (redirOutput) {
            // add output path to GASS redirection
            _verbose("Adding output path to GASS redirection");
            job.addStdoutPath(GLOBUSRUN_GASS_URL, "/dev/stdout");
            job.addStderrPath(GLOBUSRUN_GASS_URL, "/dev/stderr");
            // TODO factor into GASSServer.STDOUT|STDERR...
        }

        processJob(job, factoryUrl, _batch);
    }
    catch (Exception e) {
        logger.error("Caught exception: ", e);
        throw new Exception("submitRSL: Caught exception: " + e.getMessage());
    }
}

/**
 * processJob: send the actual GRAM request
 */
private void processJob(GramJob job, String factoryUrl, boolean batch)
        throws GramException {

    try {
        logger.debug("Factory URL " + factoryUrl);
        job.request(new URL(factoryUrl), batch);
    }
    catch (Exception e) {
        logger.error("Exception while submitting the job request: ", e);
        throw new GramException("error submitting job request: " + e.getMessage());
    }

    if (batch) {
        _verbose("MJS WITH HANDLE: " + job.getHandle());
        _jobOutput.append(job.getHandle());
    }
    else {
        job.addListener(this);
    }

    try {
        logger.debug("Starting job");
        job.start(); // leave this free to interrupt?
    }
    catch (GramException ge) {
        logger.error("Error starting job", ge);
        throw new GramException("Error starting job: " + ge.getMessage());
    }

logger.debug("Job has started.");

    if (!batch) {
        _verbose("WAITING FOR JOB TO FINISH");
        synchronized (this) {
            while (!this.done) {
                try {
                    wait();
                }
                catch (InterruptedException ie) {
                    logger.error("Interrupted waiting for job to finish", ie);
                }
            }
        }
        if (this.job.getStatusAsString().equals(JobStateType._Failed)) {
            throw new GramException("Job Fault detected: " + this.job.getFault());
        }
    }

    // GRAM submission complete. Cleanup: stop GASS server
    cleanup();
}

Listing 9.5 starts the GASS server and initializes the job output callbacks.

LISTING 9.5 Starting the MMJFS GASS Server

/**
 * Start the GASS Server.
 * Used to transfer data between MJFS and GRAM client.
 */
private void startGASSserver(int gassOptions) {
    logger.debug("Starting Gass server.");
    try {
        gassServer = new GassServer();
        if (gassOptions != 0) {
            gassServer.setOptions(gassOptions);
        }
        gassServer.registerDefaultDeactivator();

        // Listen for GASS stdout/stderr output
        initJobOutListeners();
    }
    catch (Exception e) {
        logger.error("Exception while starting the GASS server: ", e);
    }
    _verbose(gassServer.toString());
}

/*
 * Init Job Output listeners
 */
private void initJobOutListeners() throws Exception {
    // job output vars
    JobOutputStream stdoutStream = new JobOutputStream(this);
    JobOutputStream stderrStream = new JobOutputStream(this);

    /* GT2:
    String jobid = String.valueOf(System.currentTimeMillis());
    gassServer.registerJobOutputStream("err-" + jobid, stderrStream);
    gassServer.registerJobOutputStream("out-" + jobid, stdoutStream);
    */

    // register output listeners
    gassServer.registerJobOutputStream("err", stderrStream);
    gassServer.registerJobOutputStream("out", stdoutStream);

    _verbose("initJobOutListeners. Registered out listeners.");
    return;
}

/**
 * Cleanup: Shut down GASS server
 */
private void cleanup() {
    if (gassServer != null) {
        gassServer.shutdown();
    }
}

9.1.9 Security

It is important for custom clients to handle OGSA security features such as authorization and XML security. Authorization can be one of host, self, or identity. XML security defines how messages are protected: XML signature or XML encryption.

Example of setting authorization:

Authorization authorization = SelfAuthorization.getInstance();
GramJob job = new GramJob(…);
job.setAuthorization(authorization);

XML security:

…
Integer xmlSecurity = Constants.SIGNATURE;
…
job.setMessageProtectionType(xmlSecurity);

The program is finally complete with the main subroutines shown in Listing 9.6.

LISTING 9.6 Command-Line Parsing and Main Subroutines

/**
 * Parse command line args
 */
private boolean parseArgs(String[] args) throws Exception {
    if (args.length == 0) {
        System.err.println(_USAGE);
        return false;
    }

    for (int i = 0; i < args.length; i++) {
        if (args[i].equalsIgnoreCase("-help")) {
            System.err.println(_USAGE);
            return false;
        }

        /* factory URL */
        if (args[i].equalsIgnoreCase("-factory"))
            _factory = args[++i];
        else if (args[i].equalsIgnoreCase("-file"))
            _rslString = readFile(args[++i]);
        else if (args[i].equalsIgnoreCase("-o")) {
            _redirOutput = true;
            _batch = false;
        }

        /* submission mode: batch, non batch */
        else if (args[i].equalsIgnoreCase("-b")) {
            _redirOutput = false;
            _batch = true;
        }

        /* Security options */
        else if (args[i].equalsIgnoreCase("-sec")) {
            if (args[++i].equalsIgnoreCase("sig"))
                _xmlSecurity = Constants.SIGNATURE;
            else if (args[i].equalsIgnoreCase("enc"))
                _xmlSecurity = Constants.ENCRYPTION;
            else
                throw new Exception("Invalid xml security arg:" + args[i]);
        }

        /* Authorization */
        else if (args[i].equalsIgnoreCase("-auth")) {
            if (args[++i].equalsIgnoreCase("host"))
                _authorization = HostAuthorization.getInstance();
            else if (args[i].equalsIgnoreCase("self"))
                _authorization = SelfAuthorization.getInstance();
            else
                _authorization = new IdentityAuthorization(args[i]);
        }

        /* Others */
        else if (args[i].equalsIgnoreCase("-quiet")) {
            quiet = true;
        }
        else
            throw new Exception("Invalid arg:" + args[i]);
    }
    return true;
}

/*
 * Utility subs
 */
private static String readFile(String path)
        throws IOException, FileNotFoundException {
    RandomAccessFile f = new java.io.RandomAccessFile(path, "r");
    try {
        byte[] data = new byte[(int) f.length()];
        f.readFully(data);
        return new String(data);
    }
    finally {
        f.close();
    }
}

private void _verbose(String text) {
    if (!quiet)
        System.out.println(this.getClass().getName() + ": " + text);
}

/**
 * Main sub
 */
public static void main(String[] args) {
    org.apache.log4j.Logger.getRootLogger()
        .setLevel(org.apache.log4j.Level.WARN);

    try {
        CustomGRAMClient gram = new CustomGRAMClient(args);
        System.out.println(gram.getJobOutput());
    }
    catch (Exception ex) {
        ex.printStackTrace();
    }
}

A sample run of this program will produce the following output:

# The OGSA environment should be set properly
# RSL XML runs the remote command: echo Hello World

java -Djava.endorsed.dirs=C:\ogsa-3.0\endorsed
     -Dorg.globus.ogsa.server.webroot=C:\ogsa-3.0
     -Dorg.globus.ogsa.client.timeout=180000
     -Djava.net.preferIPv4Stack=true
     grid.CustomGRAMClient
     -factory http://192.168.220.128:8080/ogsa/services/base/gram/MasterForkManagedJobFactoryService
     -file rsl\echo.xml -o

grid.CustomGRAMClient: initJobOutListeners. Registered out listeners.
grid.CustomGRAMClient: GassServer: https://192.168.184.1:1052 options (r:+ w:+ so:+ se:+ rc:-)

grid.CustomGRAMClient: Adding output path to GASS redirection
grid.CustomGRAMClient: WAITING FOR JOB TO FINISH
== Status Notification ==
Status: Done
======
Hello World

9.1.10 MMJFS Performance

Figure 9.2 presents a simple performance evaluation of a single GRAM job submission test against some of the GRAM clients available from Globus. The clients used are GT2 globus-run (C client), Java CoG GRAM 2.x, and OGSA GlobusRun GRAM (MJS). This test was run on an IBM ThinkPad T21 for both client and server. The server runs RHL 8.x using VMware Workstation 4.x. MJS has a performance disadvantage compared with its GT2 counterparts. If your job will run for a long time, such as hours or weeks, then a few seconds' difference becomes irrelevant. On the other hand, if you need quick response times, then you may want to consider running the GT2 C client. CERN has an interesting set of GT3 performance benchmarks that may be of interest [CERNGT3Perf].

FIGURE 9.2 Performance evaluations of different GRAM clients.

9.2 PRE–WEB SERVICES GRAM ARCHITECTURE (GT2)

The code implemented in this chapter is based on the Java CoG kit, which combines Java technology with grid computing to develop advanced grid services and access to basic Globus resources. Other CoG implementations include CORBA, Perl (Practical Extraction and Report Language), and Python.

9.2.1 Jobs in GT2

In Globus terminology, a job is a binary executable or command to be run on a remote resource (machine). For this job to run, the remote server must have the Globus Toolkit installed. This remote server is also referred to as a "contact" or "gatekeeper." Currently, toolkit implementations exist for all major flavors of Unix and Linux. An example of a job string would be "/bin/ls", which produces a listing of the current working directory (similar to the "dir" command in Windows).

9.2.2 Job Submission Modes

When a job is submitted to a remote gatekeeper (server) for execution, it can run in two different modes: batch and nonbatch. When a job runs in batch mode, the remote submission call returns immediately with a job ID string. Job IDs use the following format: https://server.com:39374/15621/1021382777/. This job ID can later be used to obtain the output of the call using standard Globus Toolkit commands. In nonbatch job submission, the client waits for the remote gatekeeper to follow through with the execution and return the output. Batch mode submission is useful for jobs that take a long time, such as process-intensive computations.
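A job ID of this form is an ordinary HTTPS URL, so its components can be inspected with standard Java. A minimal sketch (the handle value is the sample format above; the class name is ours):

```java
import java.net.URL;

public class JobHandleParts {
    public static void main(String[] args) throws Exception {
        // Sample GT2 job ID from the text: https://<gatekeeper>:<port>/<pid>/<timestamp>/
        URL handle = new URL("https://server.com:39374/15621/1021382777/");

        // The host and port identify the job manager listening for this job
        System.out.println("host = " + handle.getHost()); // server.com
        System.out.println("port = " + handle.getPort()); // 39374
        System.out.println("path = " + handle.getPath()); // /15621/1021382777/
    }
}
```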

9.2.3 Resource Specification Language (RSL)

RSL provides a common interchange language to describe resources. The various components of the Globus resource management architecture manipulate RSL strings to perform their management functions in cooperation with the other components in the system. An example of an RSL string is &(executable=/bin/ls)(directory=/bin)(arguments=-l), which produces a long listing of the current working directory on a Unix-like system.
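Because GT2 RSL strings are plain parenthesized attribute=value pairs, assembling one is straightforward. The builder below is a hypothetical illustration (not a CoG kit API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RslBuilder {
    /** Build a GT2-style RSL string from ordered attribute/value pairs. */
    static String toRsl(Map<String, String> attrs) {
        StringBuilder sb = new StringBuilder("&");
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            // Each relation is rendered as (attribute=value)
            sb.append('(').append(e.getKey()).append('=').append(e.getValue()).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("executable", "/bin/ls");
        attrs.put("directory", "/bin");
        attrs.put("arguments", "-l");
        // Prints: &(executable=/bin/ls)(directory=/bin)(arguments=-l)
        System.out.println(toRsl(attrs));
    }
}
```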

9.2.4 Security Infrastructure

The Globus Toolkit uses the Grid Security Infrastructure (GSI) to enable secure authentication and communication over an open network. GSI provides a number of useful services for grids, including mutual authentication and single sign-on. GSI is based on public key encryption, X.509 certificates, and the Secure Sockets Layer (SSL) communication protocol. Extensions to these standards have been added for single sign-on and delegation. The Globus Toolkit implementation of GSI adheres to the Generic Security Service API (GSS-API), which is a standard API for security systems promoted by the Internet Engineering Task Force (IETF). For this chapter's implementation, a guest certificate has been set up for clients to access grid resources. The system administrator of the remote machine can then control access to those resources. It is assumed that the Java CoG kit has been installed on the client and that a user certificate has been requested and set up properly.

9.2.5 A Java Program for Job Submission on GT2

The program shown in Listing 9.7 implements a minimal job submission method. Make sure the Java CoG kit libraries are in your classpath so the code compiles properly. User certificates are required for authentication. The implementation begins by creating a Java class to encapsulate a GRAM job. This class will be used to submit a job request against a machine (also known as a gatekeeper) in either batch or nonbatch mode. Output is returned on completion.

LISTING 9.7 A Java Program for GRAM2 Job Submission

package globus.services;

import org.globus.security.*;
import org.globus.gram.*;
import org.globus.io.gass.server.*;
import org.globus.util.deactivator.Deactivator;

/**
 * Java CoG Job submission class
 */
public class GridJob implements GramJobListener, JobOutputListener {

// GASS Server: required to get job output
private GassServer m_gassServer;

// URL of the GASS server
private String m_gassURL = null;

// GRAM JOB to be executed
private GramJob m_job = null;

// job output
private String m_jobOutput = "";

// Submission modes:
// true: batch, do not wait for output
// false: non-batch, wait for output
private boolean m_batch = false;

// host where job will run
private String m_remoteHost = null;

// Globus proxy used for authentication against the gatekeeper
private GlobusProxy m_proxy = null;

// Job output variables:
// used in non-batch mode to receive output from the
// gatekeeper through the GASS server
private JobOutputStream m_stdoutStream = null;
private JobOutputStream m_stderrStream = null;

// Globus job id of the form:
// https://server.com:39374/15621/1021382777/
private String m_jobid = null;

/**
 * Constructor
 * @param Contact String remote host
 * @param batch boolean submission mode
 */
public GridJob(String Contact, boolean batch) {
    m_remoteHost = Contact; // remote host
    m_batch = batch;        // submission mode
}

Note the interfaces implemented: GramJobListener and JobOutputListener. The GramJobListener interface is required if our class is to wait for job status. Job status can be one of PENDING, ACTIVE, DONE, FAILED, SUSPENDED, or UNSUBMITTED. The JobOutputListener interface is required for the actual job output notifications. The constructor of this class takes two arguments: a "Contact," or remote host where the job will run, and a Boolean value representing the submission mode: true=batch, false=nonbatch. Batch means "do not wait for output." In this mode, the GRAM request returns immediately with a unique job ID that can be used later to retrieve the output. Batching is useful for long-lived jobs (jobs that take hours or even days). If the request is to take a short time (seconds or minutes), set the batch flag to false (nonbatch).

9.2.6 Listening for Job Output

Globus uses the GASS service to listen for output. For this output to be transferred back and forth between client and server, the code shown in Listing 9.8 can be used.

LISTING 9.8 Starting the GASS Server for Output Transfer

/**
 * Start the Globus GASS Server. Used to get the output from the server
 * back to the client.
 */
private boolean startGassServer(GlobusProxy proxy) {
    if (m_gassServer != null) {
        return true;
    }
    try {
        m_gassServer = new GassServer(proxy, 0);
        m_gassURL = m_gassServer.getURL();
    } catch (Exception e) {
        System.err.println("gass server failed to start!");
        e.printStackTrace();
        return false;
    }
    m_gassServer.registerDefaultDeactivator();
    return true;
}

/**
 * Init job out listeners for non-batch mode jobs.
 */
private void initJobOutListeners() throws Exception {
    if (m_stdoutStream != null) {
        return;
    }

    // job output vars
    m_stdoutStream = new JobOutputStream(this);
    m_stderrStream = new JobOutputStream(this);
    m_jobid = String.valueOf(System.currentTimeMillis());

    // register output listeners
    m_gassServer.registerJobOutputStream("err-" + m_jobid, m_stderrStream);
    m_gassServer.registerJobOutputStream("out-" + m_jobid, m_stdoutStream);
    return;
}

A GASS server must be started before sending the GRAM request. The initJobOutListeners method registers stdout/stderr streams so that output can be received through the GRAM protocol. For this to work, an ID value is generated using System.currentTimeMillis(), and the streams are registered with the GASS server using that value (see Listing 9.9).
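The labeling scheme can be shown in isolation: the stream names registered with the GASS server are simply the channel name plus the clock-derived job ID. A small self-contained sketch (the label helper is ours, not a Globus API):

```java
public class StreamLabels {
    /** Build the GASS stream label for a given channel ("out" or "err"). */
    static String label(String channel, String jobid) {
        return channel + "-" + jobid;
    }

    public static void main(String[] args) {
        // A unique job id derived from the clock, as in initJobOutListeners()
        String jobid = String.valueOf(System.currentTimeMillis());

        // Labels under which the GASS server registers the output streams
        System.out.println(label("out", jobid));
        System.out.println(label("err", jobid));
    }
}
```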

9.2.7 Handling Received Output

The GramJobListener and JobOutputListener interfaces implemented by our class require several methods to handle output sent by the server (see Table 9.1).

TABLE 9.1 GramJobListener and JobOutputListener Events

Method                                    Description

public void statusChanged(GramJob job)    Notifies the implementer when the status of a job has changed.
public void outputChanged(String output)  Called whenever the job's output is updated.
public void outputClosed()                Called whenever the job is finished and no more output will be generated.

LISTING 9.9 Output Events Fired Through the GASS Protocol in GT2

/**
 * This method is used to notify the implementer when the status of a
 * GramJob has changed.
 *
 * @param job The GramJob whose status has changed.
 */
public void statusChanged(GramJob job) {
    try {
        if (job.getStatus() == GramJob.STATUS_DONE) {
            // notify waiting thread when job ready
            m_jobOutput = "Job sent. url=" + job.getIDAsString();

            // if notify enabled return URL as output
            synchronized (this) {
                notify();
            }
        }
    } catch (Exception ex) {
        System.out.println("statusChanged Error:" + ex.getMessage());
    }
}

/** * It is called whenever the job's output * has been updated. * * @param output new output */ Resource Management 297

public void outputChanged(String output) { m_jobOutput += output; }

/** * It is called whenever job finished * and no more output will be generated. */ public void outputClosed() { }

Note the synchronization mechanism implemented by statusChanged. When the job completes execution (status equals DONE), a notification is sent to the job execution thread to return. We will see in the actual request implementation that, if the submission fires in nonbatch mode, the current thread must wait for the gatekeeper to complete and return the output back to the client. This scheme is recommended for jobs that take a short time to complete. If your job is to be run for hours or days, use a “batch” mode.
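The wait/notify handshake described above can be distilled into a small self-contained sketch. The class and method names below are illustrative stand-ins, not part of the CoG Kit; the callback thread simulates GRAM delivering a STATUS_DONE event.

```java
// Minimal sketch of the submit-and-wait pattern used by GlobusRun:
// the submitting thread blocks in wait() until a callback thread
// (here simulated) stores the output and calls notify().
public class JobMonitor {
    private String jobOutput;  // set by the callback thread
    private boolean done;      // guards against a missed notify

    // simulates the statusChanged(GramJob) callback firing with STATUS_DONE
    public synchronized void statusChanged(String output) {
        jobOutput = output;
        done = true;
        notify();  // wake the waiting submitter
    }

    // simulates the non-batch branch of GlobusRun
    public synchronized String submitAndWait() throws InterruptedException {
        while (!done) {  // loop handles spurious wakeups
            wait();
        }
        return jobOutput;
    }

    public static void main(String[] args) throws Exception {
        JobMonitor monitor = new JobMonitor();
        // the callback arrives from another thread, as GRAM would deliver it
        new Thread(() -> {
            try { Thread.sleep(100); } catch (InterruptedException e) { }
            monitor.statusChanged("Job sent. url=https://example:1234/5678");
        }).start();
        System.out.println(monitor.submitAndWait());
    }
}
```

Unlike the listing, this sketch guards wait() with a done flag inside a loop; that protects against the race where the job finishes before the client reaches wait(), and against spurious wakeups.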

9.2.8 Sending the GRAM Job Request

The main part of the GRAM request is implemented in Listing 9.10. The steps required for job submission include the following:

Loading a proxy. The GlobusProxy class provides a convenient method: GlobusProxy.getDefaultUserProxy(). Note that the Java CoG kit and user certificates must be set up properly.
Starting the GASS server.
Setting up job output listeners for the client to receive output and error streams.
Formatting the RSL string properly according to the submission mode.
Sending the actual GRAM job request and waiting for output (if nonbatch).

The Java CoG kit must be installed properly and the user certificates set up correctly, or the proxy load call will fail, throwing an exception. A host certificate must also be installed on the remote gatekeeper.

LISTING 9.10 A GRAM Request in GT2

public synchronized String GlobusRun(String RSL) {
    try {
        // load default Globus proxy. Java CoG kit must be installed
        // and a user certificate set up properly
        m_proxy = GlobusProxy.getDefaultUserProxy();

        // start GASS server
        if (!startGassServer(m_proxy)) {
            throw new Exception("Unable to start GASS server.");
        }

        // set up job output listeners
        initJobOutListeners();

        System.out.println("proxy Issuer=" + m_proxy.getIssuer()
            + "\nsubject=" + m_proxy.getSubject()
            + " Strength=" + m_proxy.getStrength()
            + " Gass: " + m_gassServer.toString());

        // Append GASS URL to the job string so we can get some output back
        String newRSL = null;

        // if non-batch, then get some output back
        if (!m_batch) {
            newRSL = "&" + RSL.substring(0, RSL.indexOf('&'))
                + "(rsl_substitution=(GLOBUSRUN_GASS_URL " + m_gassURL + "))"
                + RSL.substring(RSL.indexOf('&') + 1, RSL.length())
                + "(stdout=$(GLOBUSRUN_GASS_URL)/dev/stdout-" + m_jobid + ")"
                + "(stderr=$(GLOBUSRUN_GASS_URL)/dev/stderr-" + m_jobid + ")";
        } else {
            // format batch RSL so output can be retrieved
            // later on using any GTK commands
            newRSL = RSL
                + "(stdout=x-gass-cache://$(GLOBUS_GRAM_JOB_CONTACT)"
                + "stdout anExtraTag)"
                + "(stderr=x-gass-cache://$(GLOBUS_GRAM_JOB_CONTACT)"
                + "stderr anExtraTag)";
        }

        m_job = new GramJob(newRSL);

        // Load a GSI proxy.
        // CoG kit and user credentials must be installed and set up properly
        m_job.setCredentials(m_proxy);

        // if non-batch then listen for output
        if (!m_batch) {
            m_job.addListener(this);
        }

        System.out.println("Sending job request to: " + m_remoteHost);
        m_job.request(m_remoteHost, m_batch, false);

        // Wait for job to complete
        if (!m_batch) {
            synchronized (this) {
                try {
                    wait();
                } catch (InterruptedException e) {
                }
            }
        } else {
            // do not wait for job. Return immediately
            m_jobOutput = "Job sent. url=" + m_job.getIDAsString();
        }
    } catch (Exception ex) {
        if (m_gassServer != null) {
            // unregister from GASS server
            m_gassServer.unregisterJobOutputStream("err-" + m_jobid);
            m_gassServer.unregisterJobOutputStream("out-" + m_jobid);
        }
        m_jobOutput = "Error submitting job: " + ex.getClass()
            + ":" + ex.getMessage();
    }

    // cleanup
    Deactivator.deactivateAll();
    return m_jobOutput;
}

This method is implemented as synchronized for thread safety. Also, if the request is sent in nonbatch mode, the running thread must wait for the server to complete execution and return the output. When the job completes, a statusChanged event will fire, setting the status to DONE. At this point, the running thread will notify waiting threads to proceed with execution, returning the output back to the client.

9.2.9 Job Submission Test

Finally, the last step is to write a test method for the GRAM2 client class. Listing 9.11 submits a test request for a date command on a remote GT2 host.

LISTING 9.11 GRAM2 Submission Test

/**
 * GRAM2 submission test
 * @param args String[]
 */
public static void main(String[] args) {
    // GRAM request test
    // run /bin/date on myserver.mygrid.com
    String RSL = "& (executable = /bin/date)(directory=/bin)";

    GridJob Job1 = new GridJob("myserver.mygrid.com", false);

    // wait for job to complete and display output
    String jobOut = Job1.GlobusRun(RSL);
    System.out.println(jobOut);
}

Note the RSL string & (executable = /bin/date)(directory=/bin). It identifies the actual job to be executed on the remote server and can be changed to represent any executable or program present there; for instance, & (executable = /bin/ls)(directory=/bin)(arguments=-l) would run ls -l instead. The output of this program should be as shown in Listing 9.12.
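The non-batch rewriting that GlobusRun applies to such an RSL string is pure string manipulation, so it can be sketched and exercised on its own. The GASS URL and job ID below are placeholder values:

```java
// Sketch of the non-batch RSL rewrite performed in GlobusRun:
// inject a GLOBUSRUN_GASS_URL substitution and redirect
// stdout/stderr to the client's GASS server.
public class RslFormat {
    static String toNonBatch(String rsl, String gassUrl, String jobid) {
        int amp = rsl.indexOf('&');
        return "&" + rsl.substring(0, amp)
            + "(rsl_substitution=(GLOBUSRUN_GASS_URL " + gassUrl + "))"
            + rsl.substring(amp + 1)
            + "(stdout=$(GLOBUSRUN_GASS_URL)/dev/stdout-" + jobid + ")"
            + "(stderr=$(GLOBUSRUN_GASS_URL)/dev/stderr-" + jobid + ")";
    }

    public static void main(String[] args) {
        String rsl = "& (executable = /bin/date)(directory=/bin)";
        // placeholder GASS URL and job ID
        System.out.println(toNonBatch(rsl, "https://9.45.188.210:4294", "123"));
    }
}
```

Running this prints the rewritten relation: the rsl_substitution clause comes first, followed by the original attributes and the two redirection clauses, which is exactly the shape the gatekeeper receives in nonbatch mode.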

LISTING 9.12 GT2 GRAM Request Output

proxy Issuer=CN=BlueGrid Guest,OU=Guest_Project,O=BlueGrid,O=IBM
subject=CN=proxy,CN=BlueGrid Guest,OU=Guest_Project,O=BlueGrid,O=IBM
Strength=512
Gass: GassServer: https://9.45.188.210:4294 options (r:+ w:+ so:+ se:+ rc:-)
GASS URL: https://9.45.188.210:4294
Sending job request to: 9.45.124.126

The current date is: Mon 02/07/2005

This code demonstrates the GRAM2 Java API for job submission using the CoG Kit. It can be easily enhanced into a Web service or servlet. Such a service has the advantage of allowing clients to access grid resources without requiring the grid API libraries to be installed on the client. Furthermore, clients can be written in any computer language that supports the Simple Object Access Protocol (SOAP), such as Java, Visual Basic, and so on. Currently, there are efforts by the Globus project to develop an open grid services infrastructure.

9.3 SUMMARY

Resource allocation is at the core of the Globus remote program execution infrastructure. It allows you to run jobs remotely, using a set of WSDL client interfaces for submitting, monitoring, and terminating a job. This chapter introduced the following resource allocation topics:

An overview of the GRAM architecture
A description of grid services for job submission, including installation, configuration, and troubleshooting
Sample programs for resource management and security
Performance evaluation tests of various MMJFS clients provided by Globus
Pre-Web Services GRAM architecture (GT2) for job submission
Sample programs for GT2 job submission

10 Data Management

In This Chapter

GridFTP
Transferring Files Using Multiple Protocols
Summary
References

This chapter describes the basics of data transfer on grid environments using the GridFTP APIs provided by the Java Commodity Grid (CoG) Kit 1.2 and the Globus Toolkit. The source code for all programs can be found in the companion CD-ROM of this book. This chapter includes information on the following topics:

An overview of the GridFTP protocol for remote file transfer
A sample Java program for connecting to the GridFTP server, performing single, multiple, or parallel transfers
Common mistakes and error messages seen when working with GridFTP, and troubleshooting tips for those errors
Transferring files using other grid protocols such as Globus Access to Secondary Storage (GASS)


10.1 GRIDFTP

GridFTP is a high-performance file transfer protocol designed specifically for grid environments. The following sections describe GridFTP features along with source code.

10.1.1 Overview

Computational grids provide the infrastructure for powerful new tools for investigation, including desktop computing, smart instruments, collaboration, and distributed computing. The Globus Project is currently engaged in defining and developing a persistent data grid with the following capabilities [ReliableDataTransfer04]:

High-performance, secure, robust data transfer mechanisms A set of tools for creating and manipulating replicas of large datasets A mechanism for maintaining a catalog of dataset replicas

GridFTP is a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks. The GridFTP protocol is based on FTP, the popular Internet file transfer protocol [ReliableDataTransfer04]. This protocol and family of tools were born from a realization that the grid environment needed a fast, secure, efficient, and reliable transport mechanism. According to Allcock and colleagues [GridDataManagement02], large decentralized computational grids require a robust transport mechanism with the following features:

Parallel data transfer: Multiple TCP streams to improve bandwidth over using a single TCP stream. Parallel data transfer is supported through FTP command extensions and data channel extensions.
Grid Security Infrastructure (GSI) and Kerberos authentication support: User-controlled settings of various levels of data integrity and confidentiality. This feature provides a robust and flexible authentication, integrity, and confidentiality mechanism for transferring files.
Third-party control of data transfer: Support for managing large data sets for large distributed communities. This provides third-party control of transfers between storage servers.
Striped data transfer: Capabilities to partition data across multiple servers to improve aggregate bandwidth. GridFTP supports striped data transfers through extensions defined in the Grid Forum draft.
Partial file transfer: New FTP commands to support transfers of regions of a file, unlike standard FTP, which requires the application to transfer the entire file.
Reliable data transfer: Fault recovery methods for handling transient network failures and server outages and for restarting failed transfers.
Manual control of TCP buffer size: Support for achieving maximum bandwidth with TCP/IP.
Integrated instrumentation: Support for returning restart and performance markers.

10.1.2 Connecting to a GridFTP Server

The Java program in Listing 10.1 implements a basic GridFTP transfer to the local filesystem.

LISTING 10.1 GridFTP.java, a Client Program to Transfer Files via GridFTP

package grid.ftp;

import java.io.*;

import org.globus.ftp.*;
import org.globus.ftp.exception.*;
import org.globus.gsi.*;
import org.globus.gsi.gssapi.*;
import org.ietf.jgss.GSSCredential;

import java.util.*;
import org.apache.log4j.Logger;
import org.apache.log4j.Level;

import java.security.cert.X509Certificate;

/**
 * GridFTP: Transfer files via the GridFTP protocol.
 * Run grid-proxy-init before running this program.
 *
 * Both client and server must have each other's CA certificates
 * for mutual authentication to work.
 */
public class GridFTP {
    private static Logger logger =
        Logger.getLogger(GridFTP.class.getName());

    // Protocol client, provided by the CoG API
    private GridFTPClient client = null;

    // default port
    private static int GRIDFTP_PORT = 2811;

    /**
     * Constructor
     * @param host remote GridFTP host
     * @param port remote GridFTP port (default is 2811)
     */
    public GridFTP(String host, int port)
        throws ServerException, IOException {
        client = new GridFTPClient(host, port);

        /**
         * Authenticate using the default credentials.
         * Requires GSI to be configured properly on the client,
         * including the user cert/key pair and CA certificates.
         *
         * Note: You should run grid-proxy-init before running this
         * program.
         */
        client.authenticate(null);
    }

The Globus Toolkit uses the standard Apache log4j package from http://jakarta.apache.org/log4j/docs/ to display log messages by defining a static logger:

private static Logger logger = Logger.getLogger(MyGridFTP.class.getName());

This package is very popular among Java programmers and can be very helpful when debugging your classes. The class constructor takes the hostname and port as arguments and authenticates against the server using GSI credentials.

This class works only with the Globus Toolkit 2.2 or later. GSI changed significantly after version 2.0. To ensure that you use the correct version of the Java Commodity Grid (CoG) Kit, look at Table 12.1 for a summary of the main GSI changes from Java CoG Kit 0.9.13 to the current version, Java CoG Kit 1.2:

The functionality of the org.globus.security.GlobusProxy class is largely replaced by the org.globus.gsi.GlobusCredential class. However, Globus recommends not using the org.globus.gsi.GlobusCredential class because it is a representation of Public Key Infrastructure (PKI) credentials that is specific to one security protocol. Instead, Globus recommends using the Generic Security Service (GSS) abstractions as much as possible.

TABLE 12.1 Summary of Changes in the API Provided by the Java CoG Kit

Deprecated packages    Replaced by

org.globus.security    org.ietf.jgss

org.globus.io.ftp      org.globus.ftp

org.globus.mds.MDS     Obsolete: Globus recommends the use of JNDI with an LDAP provider or the Netscape Directory SDK to access MDS directly.

10.1.3 Converting Legacy Proxies to GSS Credentials

If you are porting applications from GT2 to GT3, you may in some cases need to convert legacy proxies, known as GSI_2_PROXY, to GSSCredentials to authenticate to the GridFTP server. The following code snippet shows how to do this conversion:

GlobusCredential globusCred = new GlobusCredential(...);
GSSCredential cred = new GlobusGSSCredentialImpl(globusCred,
    GSSCredential.DEFAULT_LIFETIME);

For legacy proxies (level 2), the second argument must be one of the following: DEFAULT_LIFETIME, INITIATE_AND_ACCEPT, or INITIATE_ONLY. Otherwise, you may see the following exception: Authentication failed [Root error message: Defective credential error]. Root exception is Defective credential error. (Mechanism level: Invalid credential usage.)

10.1.4 Transferring Data

The code in Listing 10.2 implements a single file transfer through GridFTP.

LISTING 10.2 Single GridFTP Transfer

/**
 * Transfer a file from a remote host
 * @param remoteFile Remote file. It must exist on the server
 * @param localFile Where should the remote file be stored?
 * @param transferType FTP transfer type. One of:
 *        GridFTPSession.ASCII
 *        GridFTPSession.BINARY
 */
public void download(String remoteFile, String localFile, int transferType)
    throws ServerException, ClientException, IOException {

    // remote file size
    long size = client.getSize(remoteFile);

    // check if remote file exists
    // if not, an exception will be thrown...
    if (client.exists(remoteFile)) {
        client.setType(transferType);

        /** required to transfer multiple files **/
        client.setLocalPassive();
        client.setActive();

        final FileOutputStream fos = new FileOutputStream(localFile);

        // get the file, use the DataSink interface to write incoming data.
        // Implement this interface to provide your own ways
        // of storing data.
        // It must be thread safe; in parallel transfer mode several
        // streams may attempt to write.
        client.get(remoteFile, new DataSink() {
            public synchronized void write(Buffer buffer)
                throws IOException {
                System.err.println("received "
                    + buffer.getLength() + " bytes");
                fos.write(buffer.getBuffer());
            }

            public void close() throws IOException {
                // close file output stream
                fos.flush();
                fos.close();
            }
        }, null);

        // transfer ok
        logger.info("Successfully transferred: " + remoteFile
            + " to " + localFile + " size: " + size);
    } else {
        System.err.println(remoteFile + " doesn't exist");
    }
}

The DataSink interface is useful for writing the incoming data. Implement it to provide your own ways of storing data. Note: the code must be thread safe. In parallel transfer mode, several streams may attempt to write to the interface.
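The thread-safety requirement can be demonstrated without the CoG Kit at all. The class below is a stand-in for a DataSink implementation (the real interface belongs to org.globus.ftp); its write method is synchronized so that several transfer threads can append concurrently without corrupting the buffer:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// A self-contained stand-in for a thread-safe data sink: several
// transfer threads may call write() concurrently, so the method is
// synchronized and the underlying buffer is only touched while the
// lock is held.
public class SafeSink {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    public synchronized void write(byte[] chunk) throws IOException {
        out.write(chunk);
    }

    public synchronized int size() {
        return out.size();
    }

    public static void main(String[] args) throws Exception {
        SafeSink sink = new SafeSink();
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new Thread(() -> {
                try {
                    for (int j = 0; j < 100; j++) {
                        sink.write(new byte[10]);
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            writers[i].start();
        }
        for (Thread t : writers) {
            t.join();
        }
        // 4 threads x 100 writes x 10 bytes = 4000 bytes
        System.out.println(sink.size());
    }
}
```

Without the synchronized keyword, the final size would be nondeterministic under contention; with it, every byte written is accounted for.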

10.1.5 Transferring Multiple Files

Transferring multiple files requires the client to set the transfer mode to passive for the listening side and to active for the sending side. It is not enough to use multiple get calls because the data channel is automatically closed after each transfer. For each get call, the client side is receiving data (passive) and the remote host is sending (active). Before each get call, issue the following lines of code:

client.setLocalPassive();
client.setActive();

Before each put operation, the inverse is required:

client.setLocalActive();
client.setPassive();

10.1.6 Parallel Transfers

At first glance, a parallel transfer may sound as though the client is capable of transferring multiple files from multiple servers in Kazaa or Morpheus style. In reality, however, parallel transfer means simply that multiple streams will be opened to transfer the same file from the same server. For two-party transfers, GridFTP will only add overhead in single-processor machines [ReliableDataTransfer04]. In any case, the client will transfer multiple copies of the same file without any slicing whatsoever, although it may increase performance if you have a multiprocessor system.

In the case of two-party transfer, parallelism should be carefully chosen. The advantage of having multiple streams has mostly to do with low-level TCP procedures and is also related to the TCP window size. Using twice the number of parallel streams will not necessarily produce twice the performance. Actually, from a certain point, you will experience a decrease in performance. The current implementation of the FTP package handles each data pathway in a separate thread, so unless your machine has multiple CPUs, you only add computing overhead by increasing parallelism [ReliableDataTransfer04].

A parallel transfer requires extended mode. Furthermore, the transfer type must be image, and the data sink/source must support random data access and be thread safe. Multiple threads may write to it. See Listing 10.3 for a method that implements a parallel get call:

LISTING 10.3 Parallel Transfer

/**
 * Transfer a file in parallel.
 * Useful if you have a multiprocessor system.
 * Note: Not to be confused with file-slicing transfers like Kazaa!
 * In reality, it opens 2 parallel transfer streams to the
 * same remote file.
 * Requires transfer type IMAGE and transfer mode EBLOCK.
 */
public void parallelDownload(String remoteFile, String localFile)
    throws ServerException, ClientException,
           FileNotFoundException, IOException {

    // check if remote file exists
    // if not, an exception will be thrown by the server
    if (client.exists(remoteFile)) {
        // transfer type must be IMAGE for parallel...
        client.setType(GridFTPSession.TYPE_IMAGE);

        /** extended mode is required by parallel transfers **/
        client.setMode(GridFTPSession.MODE_EBLOCK);

        /** required to transfer multiple files **/
        client.setLocalPassive();
        client.setActive();

        /** set parallelism **/
        // Number of parallel streams to be opened
        client.setOptions(new RetrieveOptions(2));

        // get the file, use the DataSink interface to write incoming data.
        // Implement it to provide your own ways of storing data.
        // It must be thread safe; in parallel transfer
        // mode several streams may attempt to write.
        DataSink sink = new FileRandomIO(
            new RandomAccessFile(localFile, "rw"));

        long size = client.getSize(remoteFile);
        client.extendedGet(remoteFile, size, sink, null);

        // transfer ok
        logger.info("Successfully transferred: " + remoteFile
            + " to " + localFile + " size: " + size);
    } else {
        System.err.println(remoteFile + " doesn't exist");
    }
}

10.1.7 Testing the Transfer

To test the remote file transfer, use Listing 10.4.

LISTING 10.4 Testing the Transfer

/**
 * Clean up FTP connection & certs
 */
public void close() throws ServerException, IOException {
    client.close();
}

/**
 * Simple test
 */
public static void main(String[] args) {
    /**
     * This line is very useful for debugging errors.
     * It displays log4j debug messages including
     * server information.
     */
    Logger.getRoot().setLevel(Level.INFO);

    /**
     * Simple GridFTP transfer test
     */

    // remote files
    String rf1 = "/tmp/testfile1.txt";
    String rf2 = "/tmp/testfile2.txt";

    String host = "myhost.mygrid.com";

    // local files
    String lf1 = "c:\\temp\\testfile1.txt";
    String lf2 = "c:\\temp\\testfile2.txt";

    int TRANSFER_TYPE = GridFTPSession.TYPE_ASCII;

    try {
        GridFTP myGridFTP = new GridFTP(host, GridFTP.GRIDFTP_PORT);

        // regular transfer test
        myGridFTP.download(rf1, lf1, TRANSFER_TYPE);
        myGridFTP.download(rf2, lf2, TRANSFER_TYPE);

        // parallel transfer test
        myGridFTP.parallelDownload(rf1, lf1);
        myGridFTP.parallelDownload(rf2, lf2);

        myGridFTP.close();
    } catch (Exception e) {
        System.err.println(e);
    }
}

The log4j tool is useful both for debugging and for monitoring the FTP transfer:

For basic messages, use Logger.getRoot().setLevel(Level.INFO).
For extended information, use Logger.getRoot().setLevel(Level.DEBUG).

An INFO level execution of this class renders the following sample output:

[main] INFO test.MyGridFTP - subject: O=IBM,O=intraGrid,OU=Guest Project,CN=intraGrid Guest,CN=limited proxy
issuer: O=IBM,O=intraGrid,OU=Guest Project,CN=intraGrid Guest
strength: 512 bits
timeleft: 31568251 sec
proxy type: limited legacy globus proxy

[main] INFO vanilla.FTPControlChannel - Control channel received: 220 dhcp126.adtech.internet.ibm.com GridFTP Server 1.5 GSSAPI type Globus/GSI wu-2.6.2 (gcc32dbg, 1032298778-28) ready.
[main] INFO vanilla.FTPControlChannel - Control channel sending: AUTH GSSAPI
[main] INFO vanilla.FTPControlChannel - Control channel received: 334 Using authentication type GSSAPI; ADAT must follow
[main] INFO auth.HostAuthorization - Authorization: HOST
[main] INFO vanilla.FTPControlChannel - Control channel received: 235 GSSAPI Authentication succeeded
[main] INFO vanilla.FTPControlChannel - Control channel sending: USER :globus-mapping:
[main] INFO vanilla.FTPControlChannel - Control channel received: 331 GSSAPI user /O=IBM/O=intraGrid/OU=Guest Project/CN=intraGrid Guest is authorized as ig_guest
[main] INFO vanilla.FTPControlChannel - Control channel sending: PASS dummy
[main] INFO vanilla.FTPControlChannel - Control channel received: 230 User ig_guest logged in.
[main] INFO vanilla.FTPControlChannel - Control channel sending: SIZE /home/intraGrid/ig_guest/.shared_files/libuser.conf
[main] INFO vanilla.FTPControlChannel - Control channel received: 213 2378
[main] INFO vanilla.FTPControlChannel - Control channel sending: SIZE /home/intraGrid/ig_guest/.shared_files/libuser.conf
[main] INFO vanilla.FTPControlChannel - Control channel received: 213 2378
[main] INFO vanilla.FTPControlChannel - Control channel sending: TYPE A
[main] INFO vanilla.FTPControlChannel - Control channel received: 200 Type set to A.
[main] INFO vanilla.FTPControlChannel - Control channel sending: PASV
[main] INFO vanilla.FTPControlChannel - Control channel received: 227 Entering Passive Mode (9,45,124,126,164,7)
[main] INFO vanilla.FTPControlChannel - Control channel sending: RETR /home/intraGrid/ig_guest/.shared_files/libuser.conf
[Thread-0] INFO auth.SelfAuthorization - Authorization: SELF
received 2048 bytes
received 330 bytes
[Thread-1] INFO vanilla.FTPControlChannel - Control channel received: 150 Opening ASCII mode data connection.
[Thread-1] INFO vanilla.FTPControlChannel - Control channel received: 226 Transfer complete.
[main] INFO test.MyGridFTP - Successfully transferred: /home/intraGrid/ig_guest/.shared_files/libuser.conf to d:\temp\libuser.conf size: 2378
[main] INFO vanilla.FTPControlChannel - Control channel sending: QUIT
[main] INFO vanilla.FTPControlChannel - Control channel received: 221-You have transferred 0 bytes in 1 files.
221-Total traffic for this session was 6260 bytes in 1 transfers.
221-Thank you for using the FTP service on dhcp126.adtech.internet.ibm.com.
221 Goodbye.
[main] INFO vanilla.FTPControlChannel - Control channel sending: QUIT

Useful information in the preceding sample output includes the version of the GridFTP server. In this particular case, it is GridFTP Server 1.5 GSSAPI type Globus/GSI wu-2.6.2 (gcc32dbg, 1032298778-28).

10.1.8 Troubleshooting

Most of the problems in writing this code relate to dealing with legacy proxies and converting them to GSSCredentials. Take a careful look at the constructor of this class to make sure you understand the conversion process. Also ensure that you are running the correct versions of the Globus Toolkit and the GridFTP servers. Older versions (before 1.5) are not enabled for GSSAPI. Also make sure your client-side Java libraries, provided by the CoG kit, are properly configured in your class path. A set of user certificates is also required to connect through GSI.

10.1.8.1 Runtime or Defective Credential Errors

Defective credentials indicate a problem with your client and server certificates. The following tips should be kept in mind when you run the code in this chapter:

Make sure your Java environment is set up properly. This is done by setting the environment variable GLOBUS_LOCATION to the install directory and running the Java environment configuration script $GLOBUS_LOCATION/etc/globus-dev-env.{csh,sh}.
A GSI proxy should be generated first on the client by running the command grid-proxy-init.

10.1.8.2 Authentication Failed Errors

If you run into an authentication failed error, here are some tips that can help you fix it:

For mutual authentication to succeed, both client and server require a user certificate, private key, and CA certificate to be properly configured. On the client, the path to these files is described in the file cog.properties, found in the user's home directory $HOME/.globus (on Unix systems). See, for example, Listing 10.5.

LISTING 10.5 Sample Client GSI Configuration File cog.properties

#Java CoG Kit Configuration File
#Wed Feb 02 11:15:41 EST 2005
usercert=C\:\\Documents and Settings\\Administrator\\.globus\\usercert.pem
userkey=C\:\\Documents and Settings\\Administrator\\.globus\\userkey.pem
proxy=C\:\\DOCUME~1\\vsilva\\LOCALS~1\\Temp\\x509up_u_vsilva
cacert=C\:\\Documents and Settings\\Administrator\\.globus\\CA\\9d8d5dcd.0,C\:\\Documents and Settings\\Administrator\\.globus\\certificates\\c8af9b73.0
ip=9.42.87.139
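Because cog.properties is a standard Java properties file, its entries can be inspected programmatically with java.util.Properties. The sketch below parses an in-memory copy with illustrative paths; note that in the properties format, \: and \\ are escape sequences, so the loaded values contain plain ':' and '\' characters:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Reads cog.properties-style entries with java.util.Properties.
public class CogProps {
    static Properties parse(String text) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(text));
        return p;
    }

    public static void main(String[] args) throws IOException {
        // illustrative entries, not real certificate paths
        String sample =
            "usercert=C\\:\\\\temp\\\\usercert.pem\n" +
            "ip=9.42.87.139\n";
        Properties p = parse(sample);
        System.out.println(p.getProperty("usercert")); // C:\temp\usercert.pem
        System.out.println(p.getProperty("ip"));       // 9.42.87.139
    }
}
```

This also explains why the listing shows doubled backslashes: Properties.load unescapes them, so the CoG Kit receives ordinary Windows paths.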

Mutual authentication requires that both client and server trust each other. This means that either all certificates must be signed by the same CA, or both CA certificates must be installed on the client and the server. This is very important and a source of many problems for developers and system administrators.

10.2 TRANSFERRING FILES USING MULTIPLE PROTOCOLS

If your organization requires transferring data using multiple protocols, the CoG API provides everything you need to make these transfer types a snap. The package org.globus.io.urlcopy provides a set of interfaces for transferring files using multiple grid protocols, including GASS, GridFTP, HTTPS, and HTTP. The transfers are performed using URLs of the form:

PROTOCOL://HOST:PORT/FILE

For example, to transfer a file between two GridFTP servers:

Source URL: gsiftp://host1:2811/c:/temp/file.xml Destination URL: gsiftp://host2:2811/c:/temp/file.xml

Or, to perform a third-party transfer from a GASS server to a GridFTP server:

Source URL: https://host1:3154/c:/temp/file.xml Destination URL: gsiftp://host2:2811/tmp/file.xml
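These URLs follow the standard scheme://host:port/path shape, so they can be taken apart with java.net.URI (a sketch; the host names are placeholders):

```java
import java.net.URI;

// Decomposes a GridFTP-style transfer URL into its components.
public class TransferUrl {
    public static void main(String[] args) {
        URI u = URI.create("gsiftp://host2:2811/tmp/file.xml");
        System.out.println(u.getScheme()); // gsiftp
        System.out.println(u.getHost());   // host2
        System.out.println(u.getPort());   // 2811
        System.out.println(u.getPath());   // /tmp/file.xml
    }
}
```

The scheme component is what selects the protocol handler (gsiftp, https, http), while host, port, and path identify the endpoint and the remote file.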

A program to perform these types of transfers in GT 3.2.x is simple (see Listing 10.6).

LISTING 10.6 URLCopy.java, a Program to Transfer Data Using Multiple Protocols

package grid.datatransfer;

import java.net.MalformedURLException;

import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.globus.io.urlcopy.*;
import org.globus.util.*;
import org.globus.gsi.gssapi.auth.IdentityAuthorization;

/**
 * Sample program for data transfer using multiple protocols:
 *   GASS    - Globus Access to Secondary Storage
 *   GridFTP - A high-performance, secure, reliable data
 *             transfer protocol optimized for high-bandwidth
 *             wide-area networks.
 *
 * @author Vladimir Silva
 */
public class URLCopy implements UrlCopyListener {
    long transferredBytes;

    // UrlCopyListener interface methods
    // fires multiple times
    public void transfer(long transferredBytes, long totalBytes) {
        this.transferredBytes += transferredBytes;
    }

    // fires if a transfer error occurs
    public void transferError(Exception e) {
        System.err.println("transferError:" + e);
    }

    // fires when transfer is complete
    public void transferCompleted() {
        System.out.println("transferCompleted: bytes" + transferredBytes);
    }

    /**
     * A method to transfer data between machines
     * @param fromURL Source URL. For example:
     *        https://localhost:3154/c:/temp/foo.xml (for GASS)
     *        gsiftp://localhost:2811/c:/temp/foo.xml (GridFTP)
     * @param toURL Destination URL.
     * @param subject Certificate subject or NULL to use default
     * @param thirdParty Third-party transfer (GsiFTP only)
     * @param dcau Data channel authentication (GsiFTP only)
     */
    void transferData(String fromURL, String toURL,
        String subject, boolean thirdParty, boolean dcau)
        throws MalformedURLException, UrlCopyException {

        GlobusURL from = new GlobusURL(fromURL);
        GlobusURL to = new GlobusURL(toURL);
        UrlCopy uc = new UrlCopy();

        // FTP-only options: thirdParty, dcau
        uc.setSourceUrl(from);
        uc.setDestinationUrl(to);
        uc.setUseThirdPartyCopy(thirdParty);
        uc.setDCAU(dcau);

        // set authorization subject, if any
        if (subject != null && !subject.equals("")) {
            uc.setSourceAuthorization(
                new IdentityAuthorization(subject));
            uc.setDestinationAuthorization(
                new IdentityAuthorization(subject));
        }

        // fire transfer thread
        uc.addUrlCopyListener(this);
        uc.run();
    }

    /* for test purposes only */
    public static void main(String[] args) {
        try {
            Logger.getRootLogger().setLevel(Level.INFO);

            /**
             * Test #1: GASS server transfer test between
             * 2 instances running in the same host
             */
            String fromURL = "https://localhost:3154/c:/temp/foo.xml";
            String toURL = "https://localhost:3155/c:/temp/foo1.xml";

            // cert subject. Set to null to use the default
            String subject = null;

            // GridFTP-only options
            boolean thirdp = false;
            boolean dcau = false;

            // fire transfer
            URLCopy transClient = new URLCopy();
            transClient.transferData(fromURL, toURL,
                subject, thirdp, dcau);

            /**
             * Other transfer samples.
             * Third-party transfer between 2 FTP servers:
             *   (from) gsiftp://host1:2811/c:/temp/foo.xml
             *   (to)   gsiftp://host2:2811/c:/temp/foo.xml
             *
             * GASS to FTP transfer:
             *   (from) https://localhost:3154/c:/temp/foo.xml
             *   (to)   gsiftp://host1:2811/c:/temp/foo.xml
             */
        } catch (Exception e) {
            System.err.println(e);
        }
    }
}

10.3 SUMMARY

GridFTP is a secure, reliable data transfer protocol optimized for high-performance networks, based on FTP, the popular Internet protocol. This chapter has provided a developer's overview of GridFTP features with sample programs based on the Java CoG Kit API provided by the Globus Toolkit. GridFTP is the foundation of data management services, which constitute the second pillar of the Grid Services Architecture. The third major pillar, Information Services, is discussed in the next chapter.

REFERENCES

[GridDataManagement02] B. Allcock, J. Bester, J. Bresnahan, A. L. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnal, and S. Tuecke. "Data Management and Transfer in High Performance Computational Grid Environments." Parallel Computing Journal, 28(5), May 2002, pp. 749–771.

[ReliableDataTransfer04] W. E. Allcock, I. Foster, and R. Madduri. "Reliable Data Transport: A Critical Service for the Grid." Building Service Based Grids Workshop, Global Grid Forum 11, June 2004.

11 Information Services

In This Chapter

WS Information Services (MDS3)
Querying Default Data Providers
Custom MDS3 Data Providers
Pre-WS Information Services (MDS2)
Summary
References

Information Services are also known as Monitoring and Discovery Services (MDS). MDS provides information for resource discovery, selection, and optimization. It can be used by applications or virtual organizations (VOs) to obtain a general overview of the capabilities and services provided by a computational grid. Globus has divided MDS into pre–Web Services (MDS2) and WS Information Services (MDS3) [InfoServices01]. This chapter provides information on the following topics:

Information Services for GT3, also known as MDS3, including the information model, data collection, aggregation, queries, and security.
Sample programs to query the default data providers.
A real-world application of a custom information provider for GT3: an information provider for remote schedulers using the SSH protocol. Supported schedulers are OpenPBS, Condor, and SGE.


Sample programs for pre-WS Information Services (MDS2), including step-by-step implementation, installation, configuration, and troubleshooting.
A custom MDS2 client program written in Java.

11.1 WS INFORMATION SERVICES (MDS3)

MDS3 is implemented as a set of grid services. Grid services can be classified into two big groups: persistent, if they outlive the process that created them, or transient otherwise. Services are identified by one or more instances and express their state via Service Data Elements (SDEs) [MDSGuide03].

11.1.1 Information Model

The model used by MDS is based on mechanisms provided by the Open Grid Services Architecture (OGSA), such as:

Factories: Objects that create service instances. They return a GSH and maintain service data elements.
Grid Service Handle (GSH): A unique identifier for a service. A GSH must be converted to a GSR before the service can be used.
Grid Service Reference (GSR): Includes the GSH plus binding information for the transport protocol and data encoding format.
Registry Services: A repository for GSHs. Services can register their GSHs for service discovery.
Notification Services: Used by client subscriptions to send asynchronous messages between services.

11.1.2 Data Collection

Resource information is collected via service data providers. Providers are external programs that generate service data dynamically. Some providers are part of the core Globus Toolkit; others can be implemented by developers. Data providers can be connected to service instances [InfoServices01].

11.1.3 Aggregation

Service data generated by providers or delivered from other grid services can be presented in different aggregate data views. Notification or subscription mechanisms can then be applied to service data by command-line or GUI clients [InfoServices01].

11.1.4 Queries

MDS3 provides a standard, extensible query interface to service data elements. Queries can be executed by SDE name or by using a more complex language such as XPath or XQuery. XPath and XQuery specifications and tutorials are available online [XQuery01, XPath01, XPath02].

Client subscription: Clients may subscribe to index services to receive specific messages defined by arbitrary values or linked to service data elements. Subscriptions are handled via the NotificationSource interface, which provides asynchronous delivery of notification messages.
Query modes: Queries can be executed in a simple synchronous pull fashion (FindServiceData) or in an asynchronous response-push fashion (notification subscription).
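In MDS3 the XPath evaluation happens inside the index service, but the expressions themselves are plain XPath, so they can be prototyped locally against a saved service-data document using the JDK's standard javax.xml.xpath API. The following sketch is illustrative only: the Host fragment below is made up for the example and is not actual MDS output.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class XPathQueryDemo {

    /** Evaluate an XPath expression against an XML string. */
    public static String query(String xml, String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (String) xpath.evaluate(expr, doc, XPathConstants.STRING);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical service-data fragment
        String sde = "<Host><osName>Linux</osName>"
                   + "<cpuCount>4</cpuCount></Host>";

        // Select a single element, much like a query by SDE name
        System.out.println(query(sde, "/Host/osName/text()"));

        // A predicate query, the kind of expression an XPath
        // query against an index service could carry
        System.out.println(query(sde, "/Host[cpuCount > 2]/osName/text()"));
    }
}
```

Both queries print Linux; the second one matches only because the hypothetical host reports more than two CPUs.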

Index services can be arranged in a variety of topologies to build VOs, combining data aggregation, service group, and data provider components.

11.1.5 User Interfaces

GT3 provides two types of user interfaces to query Information Services: the Service Data Browser GUI (see Figures 11.1 and 11.2) and a set of command-line query tools, ogsi-find-service-data-by-name and ogsi-find-service-data-by-xpath.

FIGURE 11.1 Service Data Browser showing a query against the SystemInformation data provider.

FIGURE 11.2 Command-line query tool.

11.1.6 Security

The index service is compatible with the Grid Security Infrastructure (GSI). GSI provides a single sign-on authentication service and is built on open standards such as X.509 and proxy certificates. GSI implements message-level and transport-level security and sits on top of a Secure Sockets Layer/Transport Layer Security (SSL/TLS) library. GSI defines protocols for mutual authentication, credential delegation, proxy signing, message protection, and authorization. By default, security is not enabled on index services [GridSecurity98].

11.2 QUERYING DEFAULT DATA PROVIDERS

Several default data providers are available to the Index Service:

SimpleSystemInformation: A Java-based provider for CPU count, memory statistics, OS type, and logical disk volumes.
HostScriptProvider: A provider capable of running shell scripts on Unix-like systems.

The following sections demonstrate the use of the SystemInformation data provider.

11.2.1 Enable the ServiceDataProvider and DataAggregation in the Service Browser GUI

Edit the file $GLOBUS_LOCATION/client-gui-config.xml to add the ServiceDataProvider and DataAggregation panels.

11.2.2 Test the Default SystemInformation Service Data Provider via the OGSA Service Browser GUI

1. Start the OGSA container: ant startContainer.
2. Start the service browser: ant gui.
3. Create an index service instance:
   a. Double-click the index service in the service group list.
   b. Select SystemInformation from the Service Data Providers list in the Provider Execution panel.
   c. In the New ServiceData from Provider section, enter a refresh frequency and click Create.
   d. In the Grid Service query panel, enter mds:Host or Host in the Name textbox and click Query. System information XML is displayed (see Figures 11.1 and 11.3).

FIGURE 11.3 A default System Information provider.

4. Queries can also be run using the command-line tools (see Listing 11.1). For example, to query for system information, use the following:

LISTING 11.1 Querying for Service Information Using MDS3 Command-Line Tools

C:\ogsa-3.2.1\bin>ogsi-find-service-data-by-name mds:Host http://9.42.87.138:8080/ogsa/services/base/index/IndexService

The Service Data Provider configuration file is located in $GLOBUS_LOCATION/etc/index-service-config.xml. This file should contain the configuration for the SimpleSystemInformationProvider.

11.3 CUSTOM MDS3 DATA PROVIDERS

The following sections describe custom information providers that can be used to obtain information about the status of the resources on a grid, such as host information, resource availability, and network status.

11.3.1 GT 3.2 Core Data Providers

Core data providers are available out of the box and provide resource information such as OS type, number of CPUs, and amount of RAM. They also provide simple shell execution capabilities (see Table 11.1).

TABLE 11.1 Core Data Providers in GT3

Provider Name                      Description

SimpleSystemInformationProvider    Provides information about the host: CPU, memory, OS, and disk volumes
HostScriptProvider                 A set of Unix shell scripts that provide host resource information
ScriptExecutionProvider            A provider to execute shell scripts
AsyncDocumentProvider              An asynchronous utility that reads an XML document periodically; implements the AsyncDataProvider interface

11.3.2 Provider Interfaces

Provider interfaces allow developers to use MDS services on GT3. For example, a custom information provider can query for system information or display an overall view of the status of a grid within a VO. MDS3 interfaces are classified as follows:

SimpleDataProvider: The basic interface all providers must implement. It produces output in XML format and defines the following methods:

// Returns the display name of the provider.
String getName();

// Returns a description of the provider's functionality.
String getDescription();

// If the provider has a set of default arguments,
// they can be retrieved with this function.
String getDefaultArgs();

// The provider should return a string representation
// of the current error, if any.
String getErrorString();

// Triggers the execution of the provider in order
// to update the provider's internal state,
// sending the output to the specified OutputStream.
void run(String args, java.io.OutputStream outStream);

DOMDataProvider: Inherits from SimpleDataProvider and is useful for providers that return XML as a DOM document. The interface is defined as:

public interface DOMDataProvider extends SimpleDataProvider {
    public org.w3c.dom.Document run(String args) throws Exception;
}

AsyncDataProvider: An asynchronous version of SimpleDataProvider. It takes a callback name and a valid ServiceDataProviderDocumentCallback object as arguments and defines the following methods:

// Triggers the asynchronous execution of the provider,
// which will call the callbackName method on the specified
// ServiceDataProviderDocumentCallback object.
// Context is defined by the calling thread.
void run(String args, String callbackName,
         ServiceDataProviderDocumentCallback callback,
         Object context) throws Exception;

// Signals the provider to shut down, cease data callbacks,
// and free any associated resources.
void terminate() throws Exception;

// Retrieves the current state.
int getState();

11.3.3 The Simplest Case

The easiest and fastest way to create an MDS3 information provider is to write a shell script and run it with the ScriptExecution provider. The shell script should emit well-formed XML output. Consider the simple shell script in Listing 11.2, which queries for host system information.

LISTING 11.2 Server Script to Emit System Information for MDS3

#!/bin/bash

# Globals
prefix="rips"

# Emit system information
emit_sysinfo () {

    # uname -a
    echo "<$prefix:sysinfo xmlns:$prefix=\
\"http://www.globus.org/namespaces/2003/04/rips\">"

    uname -a | awk '{ \
        prefix="rips"; \
        print "\t<"prefix":os>"$1"</"prefix":os>"; \
        print "\t<"prefix":host>"$2"</"prefix":host>"; \
        print "\t<"prefix":kernel>"$3"</"prefix":kernel>"; \
    }'

    uptime | awk -F, '{ \
        prefix="rips"; \
        print "\t<"prefix":uptime>"$1"</"prefix":uptime>"; \
        print "\t<"prefix":users>"$2"</"prefix":users>"; \
        print "\t<"prefix":userload>"$3"</"prefix":userload>"; \
        print "\t<"prefix":sysload>"$4"</"prefix":sysload>"; \
        print "\t<"prefix":kernelload>"$5"</"prefix":kernelload>"; \
    }'

    echo "</$prefix:sysinfo>"
}

#
# Main
#
emit_sysinfo

A run of this script from the command line will produce the following output:

$ /tmp/sysinfo.sh
<rips:sysinfo xmlns:rips="http://www.globus.org/namespaces/2003/04/rips">
        <rips:os>Linux</rips:os>
        <rips:host>vm-rhl8.ibm.com</rips:host>
        <rips:kernel>2.4.18-14</rips:kernel>
        <rips:uptime> 2:23pm up 6:22</rips:uptime>
        <rips:users> 1 user</rips:users>
        <rips:userload> load average: 1.04</rips:userload>
        <rips:sysload> 0.53</rips:sysload>
        <rips:kernelload> 0.43</rips:kernelload>
</rips:sysinfo>
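The tag-per-field transformation the awk program performs is easy to mirror in Java. The FieldTagger helper below is hypothetical (it is not part of the toolkit or the book's code); it shows the same idea of wrapping positional fields of a command's output line in namespaced elements:

```java
import java.util.StringTokenizer;

public class FieldTagger {

    /**
     * Wrap the whitespace-separated fields of a line in the given
     * element names. For example, "Linux host1 2.4.18" with names
     * {os, host, kernel} and prefix "rips" becomes
     * <rips:os>Linux</rips:os><rips:host>host1</rips:host>...
     */
    public static String tagFields(String prefix, String line,
                                   String[] names) {
        StringTokenizer st = new StringTokenizer(line);
        StringBuffer buf = new StringBuffer();
        // Pair each field with its element name, stopping at the
        // shorter of the two sequences
        for (int i = 0; i < names.length && st.hasMoreTokens(); i++) {
            String tag = prefix + ":" + names[i];
            buf.append("<" + tag + ">" + st.nextToken() + "</" + tag + ">");
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        // First three fields of a sample `uname -a` line
        String unameOutput = "Linux vm-rhl8.ibm.com 2.4.18-14";
        System.out.println(tagFields("rips", unameOutput,
            new String[] {"os", "host", "kernel"}));
    }
}
```

This produces the same os/host/kernel elements the shell script emits, without depending on awk being present on the host.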

Using this information provider is simple:

1. Start your service browser GUI. Make sure the GUI has been configured to display the information provider panels (see earlier).
2. Start the index service by double-clicking it in the Service Group Entry list in your browser.
3. Scroll down to the Provider Execution panel and select ScriptExecution.
4. In the New Service Data from Provider panel, enter the path to the script in the Arguments text box (see Figure 11.4), then click Create.

FIGURE 11.4 Creating a ScriptExecution MDS3 provider.

5. You should now be able to query the service data by entering the name of the XML root node in the query Name field (see Figure 11.5).

11.3.4 The Real World: An MDS3 Information Provider for Remote Schedulers Using SSH

It is time to create our first information service for MDS3. For this implementation, we have chosen a remote cluster information provider. The rationale for this service is simple: a single service that can query node, queue, and job information from many disparate schedulers remotely. Supported schedulers include OpenPBS, Condor, and Sun Grid Engine (SGE). This provider has several advantages over the default Globus providers:

1. Supports many simultaneous schedulers: Condor 6.6.6, OpenPBS, and SGE 6.x.
2. Can be run from any GT3 container, whether on Windows or Unix.
3. Can easily be extended to support new schedulers.

FIGURE 11.5 Querying information from the provider.

The only requirement is that the remote server must have a Secure Shell (SSH) daemon running.

11.3.4.1 Provider Architecture

The Remote Cluster Information Provider uses an SSH client API to send node, queue, and job requests to a given scheduler. The returned information is translated into XML and sent back to the index service. Clients can then use the standard query mechanisms to access it. Scheduler connection information is kept in an XML configuration file, which is the only argument to the provider (see Figure 11.6).

FIGURE 11.6 MDS provider architecture.

11.3.4.2 Configuration File

Server connection information such as protocol, hostname, and port is kept in an XML configuration file. The supported protocols are SSH 1.5 and 2; authentication can be by password or RSA private key.

LISTING 11.3 clusters.xml MDS3 Remote Cluster Information Provider Sample Configuration File

<?xml version="1.0"?>
<!-- Element and attribute names below are taken from the provider's
     parsing code; hostnames and paths are placeholders. -->
<clusters>
  <param name="SSH_THREAD_TIMEOUT" value="60000"/>

  <cluster type="pbs" sshurl="user@pbshost" bindir="/usr/pbs/bin/">
    <auth type="passwd" encrypted="false">2p2dkdt</auth>
  </cluster>

  <cluster type="condor" sshurl="user@condorhost" bindir="/opt/condor/bin/">
    <auth type="passwd" encrypted="false">2p2dkdt</auth>
  </cluster>

  <cluster type="sge" sshurl="ssh2:user@sgehost:22" bindir="/opt/sge6/bin/">
    <auth type="passwd" encrypted="false">2p2dkdt</auth>
    <env var="SGE_ROOT" value="/opt/sge6"/>
  </cluster>
</clusters>

11.3.4.3 UML Class Diagram

The provider design is fairly simple. An SSH thread is fired for each cluster in the configuration file; then a set of parsers is run on the returned output to extract node, queue, and job information (see Figure 11.7).

FIGURE 11.7 MDS provider UML diagram.

11.3.4.4 Implementation

We will start by writing a program that implements the DOMDataProvider interface, which is useful for providers that return XML. The following Globus Toolkit libraries are required to compile this code: ogsa.jar, wsdl4j.jar, xerces.jar, commons-logging.jar, commons-discovery.jar, mds-providers.jar, and axis.jar. Listing 11.4 shows the implementation for remote clusters.

LISTING 11.4 ClusterInformationProvider.java, an MDS3 Provider for Remote Clusters Using SSH

package org.globus.mds.providers;

/**
 * Remote cluster info provider for MDS3.
 * Given an XML config file describing the clusters (see Listing 11.3
 * for a sample), queries the remote clusters for Queue, Node & Job
 * info using SSH (ver 1 or 2).
 * SSH authentication can be password or RSA key based.
 *
 * Supported cluster types are: PBS, Condor, SGE
 *
 * @author Vladimir Silva
 */
import org.w3c.dom.Element;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import org.globus.mds.providers.parsers.CondorOuputParser;
import org.globus.mds.providers.parsers.OutputParser;
import org.globus.mds.providers.parsers.PBSOutputParser;
import org.globus.mds.providers.parsers.SGEOuputParser;

import org.globus.ogsa.utils.XmlFactory;
import com.ibm.wsdl.util.xml.DOM2Writer;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

import java.io.File;
import java.io.OutputStream;
import java.net.InetAddress;
import java.util.Arrays;
import java.util.Hashtable;
import java.util.List;
import java.util.ArrayList;

import org.globus.ogsa.impl.base.providers.servicedata.DOMDataProvider;

public class ClusterInformationProvider implements DOMDataProvider {

    private static String NS =
        "http://www.globus.org/namespaces/2003/04/rips";
    private static String PREFIX = "rips";

    private final String name = "ClusterInformation";
    private final String desc =
        "Provides GLUE formatted information about PBS, "
        + "Condor or SGE remote clusters thru SSH.";

    static Log logger =
        LogFactory.getLog(ClusterInformationProvider.class.getName());
    boolean isDebug;

    // XML return vars
    private Document doc;
    private Element docRoot;
    private String errorString = "";

    // Config XML params
    Hashtable _params = new Hashtable();

    /**
     * Main sub
     * @param args
     */
    public static void main(String args[]) {
        try {
            ClusterInformationProvider sysInfo =
                new ClusterInformationProvider();
            String argstr = "";

            if (args.length != 0) {
                for (int i = 0; i < args.length; i++) {
                    argstr += args[i] + " ";
                }
            } else {
                argstr = sysInfo.getDefaultArgs();
            }
            sysInfo.run(argstr, System.out);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Creates a new instance of ClusterInformationProvider
     * @param namespacePrefix Example: rips, pbs, condor, sge, etc.
     * @param namespaceURI
     */
    public ClusterInformationProvider(String namespacePrefix,
                                      String namespaceURI) throws Exception {
        PREFIX = namespacePrefix;
        NS = namespaceURI;

        this.doc = XmlFactory.newDocument();
        init();
    }

    /** Creates a new instance of ClusterInformationProvider */
    public ClusterInformationProvider() throws Exception {
        this.doc = XmlFactory.newDocument();
        init();
    }

    // DOMDataProvider interface methods
    public String getName() {
        return this.name;
    }

    public String getDescription() {
        return this.desc;
    }

    public String getDefaultArgs() {
        return "clusters.xml";
    }

    public String toString() {
        return DOM2Writer.nodeToString(this.doc);
    }

    /**
     * Main provider sub: load the config XML and fire SSH threads
     * @param args Path to the XML config
     */
    public void run(String args, OutputStream outStream) throws Exception {
        try {
            logger.info(
                "Remote Cluster Info Provider has started. Config file:"
                + args);

            // Get the cluster XML config file
            Document d = XmlFactory.newDocument(args);
            processConfigXML(d);

            // Print output
            outStream.write(DOM2Writer.nodeToString(this.doc).getBytes());

            if (this.isDebug) {
                logger.debug(DOM2Writer.nodeToString(this.doc));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Main provider sub: load the config XML and fire SSH threads
     * @param args Path to the XML config
     */
    public Document run(String args) throws Exception {
        try {
            logger.info(
                "Remote Cluster Info Provider has started. Config file:"
                + args);

            Document d = XmlFactory.newDocument(args);
            processConfigXML(d);

            if (isDebug) {
                logger.debug(DOM2Writer.nodeToString(this.doc));
            }
        } catch (Exception e) {
            // Send error back as XML
            logger.error(e);
            docRoot.appendChild(doc.createCDATASection(e.toString()));
        }
        return this.doc;
    }

    /**
     * Parse the cluster config XML and fire SSH scheduler threads
     * @param d Config XML doc
     */
    private void processConfigXML(Document d) throws Exception {
        NodeList clusters = d.getElementsByTagName("cluster");
        Element e, auth;

        String clusterType;  // cluster type
        String sshURL;       // SSH connect string
        String binDir;       // location of cluster binaries

        String passwd;       // SSH host pwd, or passphrase if auth == rsa
        String authType;     // SSH auth type: passwd | rsa
        String privKey;      // private key path if auth == rsa
        String encrypted;    // TODO: not implemented

        // Get param values from the config XML
        getParams(d.getElementsByTagName("param"));

        // SSH scheduler shell
        SSHInformationProvider shell;

        // Provider args
        String[] args;

        // Number of running threads
        int threadCount = 0;

        for (int i = 0; i < clusters.getLength(); i++) {
            e = (Element) clusters.item(i);
            auth = (Element) e.getElementsByTagName("auth").item(0);
            args = new String[7];

            clusterType = e.getAttribute("type");       // pbs, condor, sge
            encrypted = auth.getAttribute("encrypted"); // pwd encrypted?

            sshURL = args[0] = e.getAttribute("sshurl");
            authType = args[2] = auth.getAttribute("type");
            passwd = args[3] = auth.getFirstChild().getNodeValue();
            privKey = args[4] = auth.getAttribute("privatekey");
            binDir = args[5] = e.getAttribute("bindir");

            // cluster type
            args[1] = clusterType;

            // Environment from the XML env tag.
            // Example: "export SGE_ROOT=/opt/sge6"
            args[6] = getEnvironmentData(e.getElementsByTagName("env"));

            if (authType.equalsIgnoreCase("passwd")) {
                privKey = args[4] = "NULL";
            }

            logger.debug("parseConfigXML CLUSTER type:" + clusterType
                + " SSH Host:" + sshURL
                + " bin:" + binDir
                + " auth=" + authType
                + " pwd:" + passwd
                + " pk:" + privKey
                + " enc:" + encrypted);

            // Pwd encryption not implemented yet.
            if (encrypted.equalsIgnoreCase("true")) {
                logger.info("Encryption not implemented for cluster type: "
                    + clusterType + " SSH: " + sshURL);
            } else {
                logger.info("Firing SSH Thread for cluster type:"
                    + clusterType
                    + " Arguments:" + Arrays.asList(args));

                // Create an SSH shell and run the Nodes, Queues, and
                // Jobs executables defined in OutputParser
                shell = new SSHInformationProvider(clusterType, NS,
                                                   args, doc);
                shell.setOutputCallbach(this);

                // Start the SSH provider thread
                new Thread(shell).start();

                threadCount++;
            }
        }

        // Wait for all running threads
        for (int i = 0; i < threadCount; i++) {
            synchronized (this) {
                wait(Long.parseLong(
                    _params.get("SSH_THREAD_TIMEOUT").toString()));
            }
        }
        logger.info("Received Cluster DOM\n"
            + DOM2Writer.nodeToString(doc));
    }

    /* Get params from the config XML */
    private void getParams(NodeList params) {
        for (int i = 0; i < params.getLength(); i++) {
            Element e = (Element) params.item(i);
            _params.put(e.getAttribute("name"), e.getAttribute("value"));
        }
    }

    /**
     * Build an environment string from the cluster config XML.
     * Example: "export SGE_ROOT=/opt/sge6;export GLOBUS_LOCATION=/opt/gt3"
     */
    private String getEnvironmentData(NodeList envNodes) throws Exception {
        StringBuffer buff = new StringBuffer("");

        Element e;
        e = (Element) envNodes.item(0);

        if (e != null) {
            buff.append("export " + e.getAttribute("var") + "="
                + OutputParser
                    .checkInvalidChars(e.getAttribute("value")));
        }

        for (int i = 1; i < envNodes.getLength(); i++) {
            e = (Element) envNodes.item(i);

            buff.append(";" + "export " + e.getAttribute("var") + "="
                + OutputParser
                    .checkInvalidChars(e.getAttribute("value")));
        }
        return buff.toString();
    }

    /*
     * XML doc initialization
     */
    private void init() throws Exception {
        // internal objects
        this.isDebug = logger.isDebugEnabled();

        // init document structure
        this.docRoot = this.doc.createElementNS(NS,
            PREFIX + ":" + "Cluster");
        this.doc.appendChild(this.docRoot);

        // default params init
        _params.put("SSH_THREAD_TIMEOUT", "60000");
    }

    /**
     * The provider should return a string representation
     * of the current error, if any
     */
    public String getErrorString() {
        return this.errorString;
    }

    /**
     * SSH data provider callback.
     * Fires when the SSH data provider sends XML data
     * @param xmlDoc
     */
    public synchronized void OnClusterData(Document xmlDoc) {
        // Notify waiting threads
        notify();
    }
}

11.3.4.5 SSH Provider API

The program in Listing 11.5 encapsulates a thread that does the actual work of connecting to the remote server and querying the required information. It uses an SSH client shell compatible with SSH protocols 1.5 and 2. The authentication mechanism can be password or RSA keys.

LISTING 11.5 SSHInformationProvider.java SSH1-2 Client API

package org.globus.mds.providers;

import org.w3c.dom.Element;
import org.w3c.dom.Document;

import org.globus.mds.providers.parsers.*;

import org.globus.ogsa.utils.XmlFactory;
import com.ibm.wsdl.util.xml.DOM2Writer;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

import java.io.File;
import java.io.OutputStream;
import java.net.InetAddress;
import java.util.Arrays;
import java.util.Vector;

import org.globus.ogsa.impl.base.providers.servicedata.DOMDataProvider;
import ssh.tools.SecureShell;

/**
 * A program to query Node, Queue & Job information from schedulers
 * thru SSH. Returns a GT3 Index Service compatible XML document.
 * Example SSH arguments:
 *   -url [SSHURL] -auth [passwd|rsa] -pwd [PWD] -pk [RSA_KEY]
 * Example SSH URL: [ssh2:]user@host[:22]
 * @author Vladimir Silva
 */
public class SSHInformationProvider implements Runnable {

    static Log logger =
        LogFactory.getLog(SSHInformationProvider.class.getName());

    public static final String CRLF = "\n\r";
    public static String PREFIX = "rips";
    private final String name = "SSHInformationProvider";

    public static String NS =
        "http://www.globus.org/namespaces/2003/04/rips";

    private static final String sep = java.io.File.pathSeparator;

    // program arguments
    String[] args;
    boolean isDebug;

    private Element docRoot;
    private String errorString;
    private Document doc;

    // Caller cluster provider
    private ClusterInformationProvider _parent;

    // Object used to parse output
    OutputParser _parser;
    String _clusterType;

    /**
     * Creates a new instance of SSHInformationProvider
     * @param clusterType pbs, sge, or condor
     * @param namespaceURI
     * @param args SSH arguments.
     * Example:
     *   -url [SSHURL] -auth [passwd|rsa] -pwd [PWD] -pk [RSA_KEY]
     * Example SSH URL: [ssh2:]user@host[:22]
     */
    public SSHInformationProvider(String clusterType,
                                  String namespaceURI,
                                  String[] args,
                                  Document doc) throws Exception {
        _clusterType = clusterType;
        NS = namespaceURI;

        this.doc = doc;
        this.args = args;
        init();

        setParser(clusterType);
    }

    /**
     * Select a parser: pbs, condor, or sge are supported
     * @param clusterType String
     * @throws Exception
     */
    private void setParser(String clusterType) throws Exception {
        if (clusterType.equalsIgnoreCase("pbs")) {
            setOutputParser(new PBSOutputParser(PREFIX, NS, doc));
        } else if (clusterType.equalsIgnoreCase("condor")) {
            setOutputParser(new CondorOuputParser(PREFIX, NS, doc));
        } else if (clusterType.equalsIgnoreCase("sge")) {
            setOutputParser(new SGEOuputParser(PREFIX, NS, doc));
        } else {
            errorString += "Unknown cluster type: " + clusterType;
            throw new Exception(errorString);
        }
    }

    /**
     * Initialize the return XML
     * @throws Exception
     */
    private void init() throws Exception {
        if (doc == null) {
            this.doc = XmlFactory.newDocument();
        }

        // internal objects
        this.isDebug = logger.isDebugEnabled();

        // init document structure
        this.docRoot = this.doc.createElementNS(NS,
            PREFIX + ":" + "Scheduler");
        this.docRoot.setAttributeNS(NS, "type", _clusterType);

        if (!this.doc.hasChildNodes()) {
            this.doc.appendChild(this.docRoot);
        } else {
            this.doc.getDocumentElement().appendChild(this.docRoot);
        }
    }

    /**
     * Set the cluster XML data callback
     * @param p
     */
    public void setOutputCallbach(ClusterInformationProvider p) {
        _parent = p;
    }

    /**
     * The provider should return a string representation
     * of the current error, if any
     */
    public String getErrorString() {
        return this.errorString;
    }

    public String getName() {
        return this.name;
    }

    public String toString() {
        return DOM2Writer.nodeToString(this.doc);
    }

    /**
     * Main thread sub:
     * fires SSH shell command batches for Nodes, Queues & Jobs
     */
    public void run() {
        try {
            getSchedulerXMLData();

            if (_parent != null) {
                _parent.OnClusterData(doc);
            }
        } catch (Exception e) {
            // Send error back as XML
            Element elem = doc.createElementNS(NS, PREFIX + ":Error");
            elem.setAttribute("src", getName());
            elem.appendChild(doc.createCDATASection(e.toString()));

            docRoot.appendChild(elem);

            // send doc to parent callback
            if (_parent != null) {
                _parent.OnClusterData(doc);
            }

            logger.error("SSH Arguments: " + Arrays.asList(args)
                + " Error:" + e);
        }
    }

    /* Set the object used to parse the result data */
    public void setOutputParser(OutputParser p) {
        _parser = p;
    }

    /**
     * Main logic
     */
    public Document getSchedulerXMLData() throws Exception {

        String sshUrl = args[0];
        String clusterType = args[1];
        String authType = args[2];
        String passwd = args[3];
        String privKey = args[4];
        String binDir = args[5];
        String env = "";

        if (args.length > 6) {
            env = args[6];
        }

        // Simple check for script injection.
        // Example: rm -F /*;/usr/pbs/bin/
        OutputParser.checkInvalidChars(binDir);

        logger.debug("getSchedulerXMLData: SSH Host=" + sshUrl
            + " auth:" + authType
            + " pwd:" + passwd
            + " pk: " + privKey
            + " bin:" + binDir
            + " Env:" + env
            + " Cluster type: " + clusterType);

        String[] args1 = {
            "-url", sshUrl,
            "-auth", authType,
            "-pwd", passwd,
            "-pk", privKey};

        SecureShell shell = new SecureShell(args1);

        // Append the hostname to the XML docRoot element
        docRoot.setAttributeNS(NS, "host", shell.getHostName());

        // connect to the remote host thru SSH
        shell.connect();

        // Run a command batch consisting of:
        // [ENV STR];{BIN-DIR/EXECUTABLE}
        // commands contains the cmd batch
        Vector commands = new Vector();
        String cmdPrefix = (!env.equals("")) ? env + ";" : "";

        // Add executables. There must be 3 executables.
        // Nodes info executable
        commands.add(cmdPrefix + binDir + _parser.getNodesExe());

        // Queues info executable
        commands.add(cmdPrefix + binDir + _parser.getQueuesExe());

        // Jobs info executable
        commands.add(cmdPrefix + binDir + _parser.getJobsExe());

        logger.debug("getSchedulerXMLData: Remote SSH Command vector="
            + commands);

        java.util.Vector outV = shell.runCommandBatch(commands);

        shell.disconnect();

        _parser.setDocument(this.doc);

        docRoot.appendChild(_parser.parseNodes(outV.get(0).toString()));
        docRoot.appendChild(_parser.parseQueues(outV.get(1).toString()));
        docRoot.appendChild(_parser.parseJobs(outV.get(2).toString()));

        return this.doc;
    }
}

11.3.4.6 Parsing the Output

Output parsing is done through an abstract class, OutputParser (see Listing 11.6), and a series of parser subclasses, one for each supported cluster type (see Figure 11.7). These classes encapsulate cluster-type-specific information such as the executables required to query for the desired data, parsing rules, and so on.

LISTING 11.6 OutputParser.java, an Abstract Class for Scheduler Node, Queue, and Job Information

/*
 * Created on Sep 20, 2004
 */
package org.globus.mds.providers.parsers;

import java.io.ByteArrayInputStream;

import org.globus.ogsa.utils.XmlFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Document;
import org.w3c.dom.Text;

/**
 * Remote Cluster Information Provider for MDS3.
 * Queries remote clusters for Queue, Node & Job data
 * using SSH (ver 1 or 2).
 *
 * SSH authentication can be password or RSA key based.
 * Supported cluster types are: PBS, Condor 6.6.6, SGE 6.x
 *
 * @author Vladimir Silva
 */
public abstract class OutputParser {

    static final int CLUSTER_TYPE_PBS = 0;
    static final int CLUSTER_TYPE_CONDOR = 1;
    static final int CLUSTER_TYPE_SGE = 2;

    int CLUSTER_TYPE = CLUSTER_TYPE_PBS;

    /**
     * Executables used to query Node, Queue & Job info
     * for PBS, CONDOR, and SGE.
     * Example: (PBS) pbsnodes -a
     *          (CONDOR) condor_status -xml | sed -e 's///'
     */
    static final String[] NODES_EXES = {
        "pbsnodes -a",
        "condor_status -xml | sed -e 's///'",
        "qhost"};

    static final String[] QUEUES_EXES = {
        "qstat -q",
        "condor_q -xml | sed -e 's/--.*\\w//' -e 's///'",
        "qstat -F -xml"};

    static final String[] JOBS_EXES = {
        "qstat -a",
        "condor_q -xml | sed -e 's/--.*\\w//' -e 's///'",
        "qstat -xml"};

    // XML output vars
    String _CRLF = org.globus.mds.providers.SSHInformationProvider.CRLF;
    String _NS = "http://www.globus.org/namespaces/2003/04/rips";
    String _prefix = "rips";

    Document doc;

    public OutputParser(int type) {
        CLUSTER_TYPE = type;
    }

    public String getNodesExe() throws Exception {
        return NODES_EXES[CLUSTER_TYPE];
    }

    public String getQueuesExe() throws Exception {
        return QUEUES_EXES[CLUSTER_TYPE];
    }

    public String getJobsExe() throws Exception {
        return JOBS_EXES[CLUSTER_TYPE];
    }

    public abstract Element parseNodes(String data);
    public abstract Element parseQueues(String data);
    public abstract Element parseJobs(String data);

    public void setDocument(Document doc) {
        this.doc = doc;
    }

    /* String to DOM Element */
    Element XmlString2DOM(String xml) {
        Document d = null;
        try {
            d = XmlFactory.newDocument(
                new ByteArrayInputStream(xml.getBytes()));
        } catch (Exception e) {
            Element ex = doc.createElementNS(_NS, _prefix + ":Nodes");
            ex.appendChild(doc.createCDATASection(e.toString()));
            return ex;
        }
        // import the xml argument into the current DOM doc
        return (Element) doc.importNode(d.getDocumentElement(), true);
    }

    /* Standard error node: Used to return error msgs */
    Element createErrorNode(String source, String text) {
        Element e = String2DOM("Error", text);
        e.setAttributeNS(_NS, "src", source);
        return e;
    }

    /* Convert a text message into a DOM element */
    Element String2DOM(String nodeName, String text) {
        Element e = doc.createElementNS(_NS, nodeName);
        e.appendChild(doc.createCDATASection(text));
        return e;
    }

    /**
     * Create a DOM Node given a name-value pair
     * @param nodeName
     * @param value
     * @return
     */
    Element createNode(String nodeName, String value) {
        // no need to prepend a prefix. nodeName already includes it
        Element e = doc.createElementNS(_NS, nodeName);
        Text t = doc.createTextNode(value);
        e.appendChild(t);
        return e;
    }

    /* Sub to prevent script injection in env vars or bin dirs */
    static public String checkInvalidChars(String value) throws Exception {
        // use >= 0 so a leading semicolon is caught as well
        if (value.indexOf(";") >= 0) {
            throw new Exception("Invalid characters detected in: " + value);
        }
        return value;
    }
}

11.3.4.7 Portable Batch System (PBS) Output Parser Program

The main function of this program is to convert the output of the PBS qstat command, returned over SSH, into an XML Document Object Model (DOM) document, as shown in Listing 11.7.

LISTING 11.7 PBSOutputParser.java, a Program to Parse PBS Scheduler Information

package org.globus.mds.providers.parsers;

import java.util.StringTokenizer;

import org.globus.mds.providers.*;
import org.w3c.dom.Element;
import org.w3c.dom.Document;

import org.w3c.dom.Text;
import org.globus.ogsa.utils.XmlFactory;

import com.ibm.wsdl.util.xml.DOM2Writer;

/**
 * Remote Cluster Information Provider for MDS3.
 * Query remote clusters for Queue, Node & Job Info using SSH (ver 1 or 2).
 * SSH authentication can be password or RSA key based.
 * Supported cluster types are: PBS, Condor 6.6.6, SGE 6.x
 *
 * Sample call of this program:
 *
 *   // SSH CN args
 *   String[] args1 = {"-url", "ssh2:[email protected]", "-pwd", "2p2dkdt"};
 *
 *   ssh.tools.SecureShell shell = new ssh.tools.SecureShell(args1);
 *   shell.connect();
 *
 *   // Query for some remote PBS info thru SSH
 *   java.util.Vector v = new java.util.Vector();
 *
 *   // 3 commands
 *   v.add("/usr/pbs/bin/pbsnodes -a");
 *   v.add("/usr/pbs/bin/qstat -q");
 *   v.add("/usr/pbs/bin/qstat -a");
 *
 *   java.util.Vector outV = shell.runCommandBatch(v);
 *   shell.disconnect();
 *
 *   // Parse PBS output
 *   PBSOutputParser p = new PBSOutputParser(
 *       org.globus.ogsa.utils.XmlFactory.newDocument());
 *
 *   // out from CMD #1
 *   System.out.println(DOM2Writer.nodeToString(
 *       p.parseNodes(outV.get(0).toString())));
 *   // out from CMD #2
 *   System.out.println(DOM2Writer.nodeToString(
 *       p.parseQueues(outV.get(1).toString())));
 *   // out from CMD #3
 *   System.out.println(DOM2Writer.nodeToString(
 *       p.parseJobs(outV.get(2).toString())));
 *
 * @author Vladimir Silva
 */
public class PBSOutputParser extends OutputParser {

    public PBSOutputParser(Document doc) throws Exception {
        super(OutputParser.CLUSTER_TYPE_PBS);
        this.doc = doc;
    }

    public PBSOutputParser(String prefix, String ns, Document doc)
        throws Exception {
        super(OutputParser.CLUSTER_TYPE_PBS);
        this.doc = doc;
        _prefix = prefix;
        _NS = ns;
    }

    /**
     * Parse output from the command:
     *
     *   [globus@vm-rhl8 etc]$ pbsnodes -a
     *   vm-rhl8.ibm.com
     *        state = free
     *        np = 1
     *        ntype = cluster
     *
     * @param data command output
     * @return An XML representation of node info
     */
    public Element parseNodes(String data) {
        StringTokenizer st = new StringTokenizer(data, _CRLF);
        int numToks = st.countTokens();
        boolean close = false;

        Element nodes = doc.createElementNS(_NS, _prefix + ":Nodes");
        Element node = null;

        if (data.indexOf("state") > 0) {
            StringTokenizer st1;

            for (int i = 0; i < numToks; i++) {
                String tok = st.nextToken();

                if (tok.startsWith("\t") || tok.startsWith(" ")) {
                    st1 = new StringTokenizer(tok, "=");

                    if (st1.countTokens() > 1) {
                        node.setAttributeNS(_NS,
                            st1.nextToken().trim(),
                            st1.nextToken().trim());
                    }
                } else {
                    node = doc.createElementNS(_NS, _prefix + ":Node");
                    node.setAttributeNS(_NS, "name", tok);
                    nodes.appendChild(node);
                }
            }
        }
        return nodes;
    }

    /**
     * Parse output of the PBS command: qstat -q
     *
     *   server: vm-rhl8
     *
     *   Queue   Memory CPU Time Walltime Node Run Que Lm State
     *   ------  ------ -------- -------- ---- --- --- -- -----
     *   workq     --      --       --     --    2   0 --  E R
     *   work1     --      --       --     --    2   0 --  E R
     *                                          --- ---
     *                                            2   0
     *
     * @param data
     * @return an XML representation of Queue information
     */
    public Element parseQueues(String data) {
        StringTokenizer st = new StringTokenizer(data, _CRLF);
        int numToks = st.countTokens();

        Element queues = doc.createElementNS(_NS, _prefix + ":Queues");
        Element queue = null;

        if (numToks > 2) {
            // skip headers
            for (int i = 0; i < 3; i++) {
                st.nextToken();
            }

            // num queues = num of lines minus 2 total lines
            int numQueues = st.countTokens() - 2;

            for (int i = 0; i < numQueues; i++) {
                String qLine = st.nextToken();

                StringTokenizer st1 = new StringTokenizer(qLine, " ");
                int size = st1.countTokens();
                //System.out.println("q num toks=" + size);

                // extract data from line:
                // workq -- -- -- -- 2 0 -- E R
                if (size >= 9) {
                    queue = doc.createElementNS(_NS, _prefix + ":Queue");
                    queues.appendChild(queue);

                    queue.setAttributeNS(_NS, "name", st1.nextToken());
                    queue.appendChild(createNode(_prefix + ":maxJobMemory",
                        st1.nextToken().trim()));
                    queue.appendChild(createNode(_prefix + ":maxCPUtime",
                        st1.nextToken().trim()));
                    queue.appendChild(createNode(_prefix + ":maxWalltime",
                        st1.nextToken().trim()));
                    queue.appendChild(createNode(_prefix + ":maxReqNodes",
                        st1.nextToken().trim()));
                    queue.appendChild(createNode(_prefix + ":numRunningJobs",
                        st1.nextToken().trim()));
                    queue.appendChild(createNode(_prefix + ":numJobsInQueue",
                        st1.nextToken().trim()));
                    queue.appendChild(createNode(_prefix + ":numMaxConcurrJobs",
                        st1.nextToken().trim()));
                    queue.appendChild(createNode(_prefix + ":queueStatus",
                        st1.nextToken().trim()));
                }
            }
        }
        return queues;
    }

    /**
     * Parse job output of the form:
     *
     *                                                  Req'd  Req'd   Elap
     *   Job ID          Username Queue Jobname SessID NDS TSK Memory Time S Time
     *   --------------- -------- ----- ------- ------ --- --- ------ ---- - ----
     *   4.vm-rhl8.ibm.c globus   workq STDIN     1828  --  --    --   --  R  --
     *
     * @param data
     * @return an XML representation of Job Information
     */
    public Element parseJobs(String data) {
        StringTokenizer st = new StringTokenizer(data, _CRLF);
        int numToks = st.countTokens();

        Element jobs = doc.createElementNS(_NS, _prefix + ":Jobs");
        Element job = null;

        if (numToks > 1) {
            // skip headers
            for (int i = 0; i < 4; i++) {
                st.nextToken();
            }

            // num jobs = num of lines
            int numJobs = st.countTokens();

            for (int i = 0; i < numJobs; i++) {
                String jLine = st.nextToken();
                StringTokenizer st1 = new StringTokenizer(jLine, " ");

                int size = st1.countTokens();

                // extract data from line:
                // 4.host globus workq STDIN 1828 -- -- -- -- R --
                if (size >= 11) {
                    job = doc.createElementNS(_NS, _prefix + ":Job");
                    jobs.appendChild(job);

                    job.appendChild(createNode(_prefix + ":jobID",
                        st1.nextToken().trim()));
                    job.appendChild(createNode(_prefix + ":userName",
                        st1.nextToken().trim()));
                    job.appendChild(createNode(_prefix + ":Queue",
                        st1.nextToken().trim()));
                    job.appendChild(createNode(_prefix + ":jobName",
                        st1.nextToken().trim()));
                    job.appendChild(createNode(_prefix + ":sessID",
                        st1.nextToken().trim()));
                    job.appendChild(createNode(_prefix + ":NDS",
                        st1.nextToken().trim()));
                    job.appendChild(createNode(_prefix + ":TSK",
                        st1.nextToken().trim()));
                    job.appendChild(createNode(_prefix + ":reqMem",
                        st1.nextToken().trim()));
                    job.appendChild(createNode(_prefix + ":reqTime",
                        st1.nextToken().trim()));
                    job.appendChild(createNode(_prefix + ":status",
                        st1.nextToken().trim()));
                    job.appendChild(createNode(_prefix + ":elapsedTime",
                        st1.nextToken().trim()));
                }
            }
        }
        return jobs;
    }
}

11.3.4.8 Condor Output Parser

The Condor scheduler (version 6.6.6) is already XML aware, so it can return query results in XML format directly (see Listing 11.8).

LISTING 11.8 CondorOutputParser.java, a Program to Parse Condor Scheduler Information

package org.globus.mds.providers.parsers;

import java.io.ByteArrayInputStream;
import java.util.StringTokenizer;

import org.globus.mds.providers.*;
import org.w3c.dom.Element;
import org.w3c.dom.Document;
import org.w3c.dom.Text;
import org.globus.ogsa.utils.XmlFactory;
import com.ibm.wsdl.util.xml.DOM2Writer;

/**
 * Program to parse the output from the Condor Scheduler 6.6.6.
 * The scheduler is already XML enabled, so there is not much to do.
 * Data from the following commands is sent to this program:
 *   Nodes:  $CONDOR_BIN/condor_status -xml | sed -e 's///'
 *   Queues: already returned by the command below
 *   Jobs:   condor_q -xml | sed -e 's/--.*\\w//' -e 's///'
 *
 * @author Vladimir Silva
 */
public class CondorOutputParser extends OutputParser {
    // SSH is unable to capture server stderr msgs,
    // thus use our own
    static final String CONDOR_ERROR_MSG =
        "An Error has occurred in your Condor scheduler.";

    /**
     * Condor output parser
     * @param doc DOM document that should contain the parsed output
     * @throws Exception
     */
    public CondorOutputParser(Document doc) throws Exception {
        super(OutputParser.CLUSTER_TYPE_CONDOR);
        this.doc = doc;
    }

    /**
     * Condor output parser
     * @param prefix
     * @param ns
     * @throws Exception
     */
    public CondorOutputParser(String prefix, String ns, Document doc)
        throws Exception {
        super(OutputParser.CLUSTER_TYPE_CONDOR);
        this.doc = doc;
        _prefix = prefix;
        _NS = ns;
    }

    /**
     * Convert the following SSH command into an XML DOM:
     * $CONDOR_BIN/condor_status -xml | sed -e 's///'
     */
    public Element parseNodes(String data) {
        // data is xml so send it back
        if (data.equals("")) {
            return createErrorNode("CondorOutputParser", CONDOR_ERROR_MSG);
        } else {
            return XmlString2DOM(data);
        }
    }

    /**
     * No queues cmd in condor so return an empty xml element
     */
    public Element parseQueues(String data) {
        if (data.equals("")) {
            return createErrorNode("CondorOutputParser", CONDOR_ERROR_MSG);
        } else {
            return doc.createElementNS(_NS, _prefix + ":Queues");
        }
    }

    /**
     * Convert the following SSH command into an XML DOM:
     * condor_q -xml | sed -e 's/--.*\\w//' -e 's///'
     */
    public Element parseJobs(String data) {
        // Condor Jobs cmd already returns xml so send it back
        if (data.equals("")) {
            return createErrorNode("CondorOutputParser", CONDOR_ERROR_MSG);
        } else {
            return XmlString2DOM(data);
        }
    }
}

11.3.4.9 Integration with Other Schedulers

This provider can easily be enhanced to support other schedulers by changing the abstract class OutputParser to include the required shell commands and implementing a parser for the new scheduler.
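As a sketch of such an extension, suppose you wanted to support Platform LSF. The fragment below parses plain-text scheduler output into rips:Node DOM elements in the same style as the PBS parser, using only JDK classes. The class name, the bhosts column layout, and the sample output are illustrative assumptions, not part of the toolkit; a real subclass would extend OutputParser and register an "LSF" cluster type and command strings instead.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import java.util.StringTokenizer;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class LsfParserSketch {
    static final String NS = "http://www.globus.org/namespaces/2003/04/rips";

    // A hypothetical parseNodes for LSF "bhosts" output: skip the header
    // row, then emit one rips:Node element per host line, mirroring how
    // the PBS parser tokenizes pbsnodes output.
    static Element parseNodes(Document doc, String data) {
        Element nodes = doc.createElementNS(NS, "rips:Nodes");
        StringTokenizer lines = new StringTokenizer(data, "\n");
        if (lines.hasMoreTokens()) {
            lines.nextToken(); // skip "HOST_NAME STATUS ..." header row
        }
        while (lines.hasMoreTokens()) {
            StringTokenizer cols = new StringTokenizer(lines.nextToken(), " ");
            if (cols.countTokens() < 2) {
                continue; // ignore malformed lines
            }
            Element node = doc.createElementNS(NS, "rips:Node");
            node.setAttributeNS(NS, "name", cols.nextToken());
            node.setAttributeNS(NS, "status", cols.nextToken());
            nodes.appendChild(node);
        }
        return nodes;
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        // Assumed sample output; real bhosts columns may differ.
        String sample = "HOST_NAME STATUS JL/U MAX\n"
                + "hostA ok - 4\n"
                + "hostB closed - 2\n";
        Element nodes = parseNodes(doc, sample);
        Element first = (Element) nodes.getFirstChild();
        System.out.println(nodes.getChildNodes().getLength()
                + " " + first.getAttributeNS(NS, "name"));
    }
}
```

The parseQueues and parseJobs counterparts would follow the same skip-header/tokenize/append pattern shown in Listings 11.6 and 11.7.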

11.3.4.10 Provider Installation

The final step in this project is to deploy the provider to your OGSA container. This is just a matter of dropping the project libraries, ssh.jar and myprovider.jar, into the $GLOBUS_LOCATION/lib directory of the OGSA container and creating a clusters.xml configuration file with your custom scheduler parameters. The provider can be started from the services browser GUI, as shown at the beginning of this chapter. Once the provider is started, queries can be run (see Figure 11.8). The file $GLOBUS_LOCATION/etc/index-service-config.xml (Listing 11.9) must be updated to include the new provider.

LISTING 11.9 Index Service Configuration File

FIGURE 11.8 Remote cluster MDS provider test.

11.4 PRE-WS INFORMATION SERVICES (MDS2)

In a computational grid, information is a critical resource, and gathering this information is a vital activity. The information services component of the Globus Toolkit, the MDS2, gathers information about grid resources by means of the Grid Resource Information Service (GRIS) and the Grid Index Information Service (GIIS). The MDS component of the Globus Toolkit uses an extensible framework for managing static and dynamic information about the status of a computational grid and all its components: networks, computer nodes, storage systems, and instruments. The benefits of MDS include the following:

- Access to static and dynamic information about system components
- Uniform, flexible access to information
- Access to multiple information sources
- A basis for configuration and adaptation in heterogeneous, dynamic environments
- Decentralized maintenance

You can use MDS to answer the following questions about the status of your grid:

- What resources are available?
- What is the state of the computational grid?
- How can applications be optimized based on the configuration of the underlying system?

MDS uses the Lightweight Directory Access Protocol (LDAP) as the interface to this information. However, you can use many commercial or open source software components to implement the basic features of MDS, including establishing connections, querying for contents, displaying results, and disconnecting from the server.

11.4.1 Architecture: GRIS and GIIS

MDS provides ways to discover properties of the machines, computers, and networks in your grid, such as the number of processors available at the moment, the bandwidth provided, and the storage type (tape or disk). Using an LDAP server, MDS provides middleware information in a common interface to put a unifying picture on top of disparate resources.

The GRIS provides a uniform means to query resources on a computational grid, including current configuration, capabilities, and status. The GRIS is a distributed information service that can answer queries about a particular resource by directing the query to an information provider deployed as part of the Globus services on a grid resource. Examples of information provided by this service include host identity (operating systems and versions) as well as more dynamic information (CPU and memory availability).

The GIIS combines arbitrary GRIS services to provide exploring and searching capabilities for grid applications. Within a coherent image of the computational grid, GIIS provides the means to identify resources of particular interest, such as the availability of computational resources, storage, data, and networks (see Figure 11.9).

FIGURE 11.9 Information Services model within a VO.

11.4.2 Implementing a Grid Information Provider for MDS2

The following sections describe the steps required to implement an information provider for MDS2, which is compatible with Globus version 2 and earlier.

11.4.2.1 Overview

The MDS2 is the Information Services component of the Globus Toolkit versions 2 and earlier. MDS2 is maintained for compatibility with older versions of the toolkit and provides legacy grid information, such as the resources that are available and the state of the computational grid. This information may include properties of the machines, computers, and networks in your grid, such as the number of processors available, CPU load, network interfaces, filesystem information, bandwidth, storage devices, and memory. The MDS2 architecture is significantly different from MDS3, as it uses LDAP to provide middleware information through a common interface. MDS2 includes two components: the GRIS and the GIIS. With MDS2, you can publish information about almost anything in your grid.

11.4.2.2 LDAP Configuration

In writing an implementation, you should aim to have each GRIS on the grid publish information about shared files on each resource and machine available. You can then use the Java Commodity Grid (CoG) Kit API to search for downloadable files in an application that resembles the music and file sharing application Morpheus. Finally, you can use GridFTP to download and upload files on the remote resources. For this sample implementation, the code has been tested on Red Hat Linux 7.3 systems running the Globus Toolkit 2.2. It is not guaranteed to work on other platforms or toolkit versions. A good starting point is to look at the LDAP configuration file that MDS uses. This file, located in $GLOBUS_LOCATION/etc/grid-info-slapd.conf, is shown in Listing 11.10.

LISTING 11.10 MDS-LDAP Daemon Configuration File, grid-info-slapd.conf

schemacheck off

include /opt/globus/etc/openldap/schema/core.schema
include /opt/globus/etc/grid-info-resource.schema
...
pidfile /opt/globus/var/resourceslapd.pid
argsfile /opt/globus/var/resourceslapd.args

modulepath /opt/globus/libexec/openldap/gcc32pthr
moduleload libback_ldif.la
moduleload libback_giis.la
...

The schemacheck off line indicates that an LDAP schema is not required for the provider. However, if your grid requires schema checks, you must create a custom LDAP schema for use with your custom provider.

11.4.2.3 Creating a Custom LDAP Schema

The first step in writing an MDS provider is to write a schema file and include this file in the grid-info-slapd.conf file under the $GLOBUS_LOCATION/etc directory. Writing the schema file can be difficult and time-consuming, and one mistake can render your MDS unusable. Therefore, you should take great care with this step of the process. Listing 11.11 provides a sample.

LISTING 11.11 A Custom LDAP Schema to Publish Shared File Information

attributetype ( 1.3.6.1.4.1.3536.2.6.3536.3.0.123.1
    NAME 'Mds-File-name'
    DESC 'File Name'
    EQUALITY caseIgnoreMatch
    ORDERING caseIgnoreOrderingMatch
    SUBSTR caseIgnoreSubstringsMatch
    SYNTAX 1.3.6.1.4.1.1466.115.121.1.44
    SINGLE-VALUE )

attributetype ( 1.3.6.1.4.1.3536.2.6.3536.3.0.123.2
    NAME 'Mds-File-location'
    DESC 'File Location'
    EQUALITY caseIgnoreMatch
    ORDERING caseIgnoreOrderingMatch
    SUBSTR caseIgnoreSubstringsMatch
    SYNTAX 1.3.6.1.4.1.1466.115.121.1.44
    SINGLE-VALUE )

attributetype ( 1.3.6.1.4.1.3536.2.6.3536.3.0.123.3
    NAME 'Mds-File-size'
    DESC 'File Size(bytes)'
    EQUALITY caseIgnoreMatch
    ORDERING caseIgnoreOrderingMatch
    SUBSTR caseIgnoreSubstringsMatch
    SYNTAX 1.3.6.1.4.1.1466.115.121.1.44
    SINGLE-VALUE )

objectclass ( 1.3.6.1.4.1.3536.2.6.3536.3.0.123
    NAME 'MdsDataGrid'
    SUP 'Mds'
    STRUCTURAL
    MUST ( Mds-File-name $ Mds-File-location $ Mds-File-size ) )

To include this schema, add the following line to grid-info-slapd.conf:

include /opt/globus/etc/grid-info-resource.schema

Each attribute to be published must include an object ID (OID). All attributes are in turn grouped in an objectclass. As the developer of the provider, you must define all OIDs and must follow a registered format. Globus has registered the following OIDs with IANA Private Enterprise Numbers:

1.3.6.1.4.1.3536.*        Globus OID subspace
1.3.6.1.4.1.3536.2.*      Globus Info Services OID subspace
1.3.6.1.4.1.3536.2.6.*    MDS2 OID subspace

Information to be published by this provider includes the file name, the location (to be used by GridFTP for download), and the file size. Once the schema is ready, the next step is to modify the grid-info-resource-ldif.conf file.

11.4.2.4 Modifying the grid-info-resource-ldif.conf File

The grid-info-resource-ldif.conf file describes how the information provider will be called and what it should emit. This file, located in $GLOBUS_LOCATION/etc, is presented in Listing 11.12.

LISTING 11.12 The grid-info-resource-ldif.conf File

# generate shared file info every 5 min
#
dn: Mds-Software-deployment=DataGrid, Mds-Host-hn=myhost.com, Mds-Vo-name=local, o=grid
objectclass: GlobusTop
objectclass: GlobusActiveObject
objectclass: GlobusActiveSearch
type: exec
path: /opt/globus/libexec
base: grid-info-fileshare

# args
args: -devclassobj -devobjs -dn Mds-Host-hn=myhost.com,Mds-Vo-name=local,o=grid -validto-secs 60 -keepto-secs 900
# end args

cachetime: 300
timelimit: 20
#sizelimit: 20

This file must use the following format:

dn          Distinguished name (Dn): where the object lives in the
            Directory Information Tree (DIT).

objectclass: GlobusTop
objectclass: GlobusActiveObject
objectclass: GlobusActiveSearch
type: exec
            These lines are required.

path        Path to the information provider program.
base        Name of the information provider program.
args        Arguments to be passed to the information provider program.
cachetime   In seconds, how long the GRIS will consider the data to not
            be stale.
timelimit   In seconds, how long the GRIS should wait for the information
            provider to return data before giving up on it.
sizelimit   Maximum number of LDIF objects to be read from the output of
            the information provider.

11.4.2.5 Writing the Information Provider Program

The next step in developing the custom MDS provider is to write a program (such as the one presented in Listing 11.13) that will emit the information you want to publish. You can use any language and can pass any arguments that you need. The only requirements are the following:

- The I/O interface of the program must be callable by the UNIX system calls fork and exec in the LDAP main process (slapd).
- The data returned must match the LDAP schema described in the LDAP Data Interchange Format (LDIF) Internet document (RFC 2849).
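To make the second requirement concrete, the helper below formats one shared-file record as LDIF text: a dn line followed by objectclass and attribute lines, matching the custom MdsDataGrid schema defined earlier. The helper class and the host/file values are illustrative assumptions; the provider that ships with this chapter emits the same shape from a shell script.

```java
public class LdifEntrySketch {
    // Build one RFC 2849-style LDIF entry for a shared file, using the
    // attribute names from the custom MdsDataGrid schema.
    static String fileEntry(String host, String name,
                            String location, long size) {
        return "dn: Mds-File-name=" + name
                + ", Mds-Software-deployment=DataGrid, Mds-Host-hn=" + host
                + ", Mds-Vo-name=local, o=grid\n"
                + "objectclass: Mds\n"
                + "objectclass: MdsDataGrid\n"
                + "Mds-File-name: " + name + "\n"
                + "Mds-File-location: " + location + "\n"
                + "Mds-File-size: " + size + "\n";
    }

    public static void main(String[] args) {
        // Hypothetical host and file values for illustration only.
        System.out.print(fileEntry("myhost.com", "data.log",
                "/home/guest/.data_grid", 6537));
    }
}
```

Whatever language the provider is written in, its standard output must consist of entries in exactly this line-oriented shape, separated by blank lines.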

LISTING 11.13 MDS Information Provider Shell Script Program

#!/bin/sh -f

# default libexecdir used to bootstrap scripts
libexecdir=${GLOBUS_LOCATION}/libexec

# load GRIS common code and initialization
. ${libexecdir}/grid-info-common

###################################################

# Globals
USERS_HOME_DIR=/home
GRID_MAP_FILE="/etc/grid-security/grid-mapfile"
#

#
# Modified Globus sub to work with my stuff
#
emit_timestamps () {
    cat <

#
# Display Top DN info
#
emit_datagrid_dn() {
    # Display output
    cat <
}

#
# Emit file info by reading /etc/grid-security/grid-mapfile,
# grabbing user names and reading files under the users' home/.data_grid
#
emit_file_info() {
    # TODO: remove duplicates from: $grid_map_users
    grid_map_users=`cat ${GRID_MAP_FILE} | awk -F\" '{print $3}'`

    # debug
    # echo $grid_map_users

    for user in $grid_map_users; do

        # Any files in home dir?
        if [ "`ls -l $USERS_HOME_DIR/$user/.data_grid`" != "total 0" ]
        then
            # For each user print: file name, location & size
            # of each file under $HOME/.data-grid
            _cmd="ls -l $USERS_HOME_DIR/$user/.data_grid | awk '{ if (length >10) print \"dn: Mds-File-name=\" \$9 \", Mds-Software-deployment=DataGrid,$1\", \"\n${_line_class_oc}objectclass: Mds\", \"\n${_line_class_oc}objectclass: MdsDataGrid\", \"\n${_line_class_av}Mds-File-name:\" \$9, \"\n${_line_class_av}Mds-File-location: ", "$USERS_HOME_DIR/$user/.data_grid\", \"\n${_line_class_av}Mds-File-size:\" \$5, \"\n`emit_timestamps`\" }' | sed 's/ //' "
            #
            eval $_cmd
        fi
    done
}

#
# MAIN
#
# Required: globus stuff
probe_mds_object_timestamps

# emit Parent DN
emit_datagrid_dn
echo ""

# emit file info
emit_file_info "${_suffix}"

11.4.2.6 Running the Information Provider Program

You can run this program from the command line by issuing a command such as the following:

./grid-info-fileshare -devclassobj -devobjs \
    -dn Mds-Host-hn=[HOSTNAME],Mds-Vo-name=local,o=grid \
    -validto-secs 60 -keepto-secs 900

The output should look like this:

dn: Mds-Software-deployment=DataGrid, Mds-Software-deployment=DataGrid, Mds-Host-hn=dhcp126.adtech.internet.ibm.com, Mds-Vo-name=local, o=grid
objectclass: Mds
objectclass: MdsDataGrid
Mds-validfrom: 20030222183857Z
Mds-validto: 20030222183857Z
Mds-keepto: 20030222183857Z

dn: Mds-File-name=gram_job_mgr_12725.log, Mds-Software-deployment=DataGrid, Mds-Software-deployment=DataGrid, Mds-Host-hn=dhcp126.adtech.internet.ibm.com, Mds-Vo-name=local, o=grid
objectclass: Mds
objectclass: MdsDataGrid
Mds-File-name: gram_job_mgr_12725.log
Mds-File-location: /home/bg_guest/.data_grid
Mds-File-size: 6537
Mds-validfrom: 20030222183733Z
Mds-validto: 20030222183733Z
Mds-keepto: 20030222183733Z

dn: Mds-File-name=gram_job_mgr_12738.log, Mds-Software-deployment=DataGrid, Mds-Software-deployment=DataGrid, Mds-Host-hn=dhcp126.adtech.internet.ibm.com, Mds-Vo-name=local, o=grid
objectclass: Mds
objectclass: MdsDataGrid
Mds-File-name: gram_job_mgr_12738.log
Mds-File-location: /home/bg_guest/.data_grid
Mds-File-size: 2996
Mds-validfrom: 20030222183733Z
Mds-validto: 20030222183733Z
Mds-keepto: 20030222183733Z

11.4.2.7 Information Provider Installation

Now it is time to test the custom MDS information provider. Restart the Globus service on a Red Hat Linux 7.3 system with the following commands:

1. service globus stop
2. service globus start

Use the installation script in Listing 11.14 to modify and copy the required files to the proper locations.

LISTING 11.14 Provider Installation Script

#!/bin/sh

# Only root can run this stuff...
if [ "`whoami`" != root ]; then
    echo "You need to be root to execute."
    exit 1;
fi

# GLOBUS_LOCATION must be defined
if [ -z "$GLOBUS_LOCATION" ]; then
    echo "Environment variable 'GLOBUS_LOCATION' is not set."
    exit 1
fi

GLOBUS_LOCATION_SED=`echo $GLOBUS_LOCATION | sed -e 's/\//\\\\\//g'`

#
# Copy files to globus/etc and /libexec directories
#
cp -f ./grid-info-fileshare ${GLOBUS_LOCATION}/libexec

HOSTNAME=`hostname`

# Modify resources.ldif to take into account the new reporter.
# Remove old config if present.
echo "Modifying ${GLOBUS_LOCATION}/etc/grid-info-resource-ldif.conf"

TOT=`more ${GLOBUS_LOCATION}/etc/grid-info-resource-ldif.conf | wc -l`

NUM=`grep -n "## INTRAGRID FILE REPORTER ##" ${GLOBUS_LOCATION}/etc/grid-info-resource-ldif.conf | tail -1 | sed -e 's/:.*//'`

if [ "$NUM" != "" ]; then
    NUM=`expr $NUM - 1`

    NUM2=`grep -n "## INTRAGRID FILE REPORTER END ##" ${GLOBUS_LOCATION}/etc/grid-info-resource-ldif.conf | tail -1 | sed -e 's/:.*//'`

NUM2=`expr $TOT - $NUM2`

    head -$NUM ${GLOBUS_LOCATION}/etc/grid-info-resource-ldif.conf > top
    tail -$NUM2 ${GLOBUS_LOCATION}/etc/grid-info-resource-ldif.conf > bottom

    cat top ./resource-ldif-fileshare.conf bottom | \
        sed -e 's/_-GLOBUS_LOCATION-_/'$GLOBUS_LOCATION_SED'/' | \
        sed -e 's/_-HOSTNAME-_/'$HOSTNAME'/' > \
        ${GLOBUS_LOCATION}/etc/grid-info-resource-ldif.conf

    rm -rf top bottom
else
    cat ./resource-ldif-fileshare.conf | \
        sed -e 's/_-GLOBUS_LOCATION-_/'$GLOBUS_LOCATION_SED'/' | \
        sed -e 's/_-HOSTNAME-_/'$HOSTNAME'/' >> \
        ${GLOBUS_LOCATION}/etc/grid-info-resource-ldif.conf

fi

# # SETUP CODE #

# Globals
USERS_HOME_DIR=/home
GRID_MAP_FILE="/etc/grid-security/grid-mapfile"

#
setup_homedirs() {
    #
    grid_map_users=`cat ${GRID_MAP_FILE} | awk -F\" '{print $3}'`

    for user in ${grid_map_users}; do
        if [ -d $USERS_HOME_DIR/$user/.data_grid ]
        then
            echo "$USERS_HOME_DIR/$user/.data_grid already exists."
        else
            cmd1=`mkdir $USERS_HOME_DIR/$user/.data_grid`
            cmd2=`chown $user:$user $USERS_HOME_DIR/$user/.data_grid`
            # echo $cmd1 $cmd2
        fi
    done
}

# Create ~$user/.data_grid dirs for each user to store shared files
echo "$0: Setting up default shared directories..."
setup_homedirs

echo "$0: Done."

11.4.2.8 Troubleshooting Tips

If the LDAP server fails to start after you make configuration changes, check all the schemas defined in the grid-info-slapd.conf file for OID collisions. If the schemacheck off line is present, you do not need to include the schema in the configuration file. Note that this code has been tested only on Red Hat Linux 7.3 servers running Globus Toolkit 2.2.

11.4.3 Custom MDS2 Clients

The Java CoG Kit provides a Java API to query both GRIS and GIIS services on your grid. The program in Listing 11.15 demonstrates a simple client capable of submitting queries against MDS2 using this API.

LISTING 11.15 MDSService.java, a Custom Client for MDS2

package org.globus.services.mds2;

import org.globus.mds.*;

import java.util.Hashtable;
import java.util.Enumeration;

/**
 * A Sample Program to query information from the MDS2 service
 */
public class MDSService {
    // Default MDS server parameters.
    // Change to fit your Global MDS server values
    private final String _MDS_SERVER = "giis.myvo.com";
    private final String _MDS_BASE_DN = "Mds-Vo-name=VO_NAME,o=grid";

    // Our Grid uses port 2135. Default LDAP port is 389
    private final String _MDS_PORT = "2135";
    private MDS _mds = null;

    /**
     * Constructors
     * @param _MDS_SERVER LDAP Host
     * @param _MDS_PORT LDAP port
     * @param _MDS_BASE_DN LDAP search Dn (Distinguished name)
     */
    public MDSService() {
        _mds = new MDS(_MDS_SERVER, _MDS_PORT, _MDS_BASE_DN);
    }

    public MDSService(String host, String port, String baseDN) {
        _mds = new MDS(host, port, baseDN);
    }

    /**
     * This method performs the actual MDS search.
     * @param filter search filter to be used.
     * Example: (objectClass=*)
     */
    public String HostInfoLookup(String filter) {
        try {
            // Attributes to be returned by the search
            String[] attribs = null;
            _mds.connect();

            // The MDS pkg returns a Multi-valued hashtable
            Hashtable t = _mds.search(filter, attribs, _mds.SUBTREE_SCOPE);
            _mds.disconnect();

            return "\n" + buildStatusXML(false,
                this.getClass() + "HostLookup", "ok")
                + "\n\n" + ht2XML(t) + "\n";
        } catch (Exception ex) {
            ex.printStackTrace();
            return "" + buildStatusXML(true,
                this.getClass().getName(),
                ex.getMessage()) + "";
        }
    }

    /*
     * Return xml is composed by a status tag
     */
    private String buildStatusXML(boolean bError, String source, String desc) {
        String XML = "\n\t" + "" + bError + "\n\t"
            + source + "\n\t"
            + "" + desc + "\n";
        return XML;
    }

    /**
     * Convert an MDS Search HashTable (Multi-valued) to XML
     */
    private String ht2XML(Hashtable t) throws Exception {
        Enumeration e = t.elements();

        // MDS result is a multi-val hash table
        MDSResult mvRecord = null;

        // Hash table key is a dn
        String dn = "";

        StringBuffer xml = new StringBuffer();

        // Convert an MDS2 result multi-valued hash table
        // into XML
        for (Enumeration dnEnum = t.keys(); dnEnum.hasMoreElements(); ) {
            mvRecord = (MDSResult) e.nextElement();
            dn = (String) dnEnum.nextElement();

            if (mvRecord.size() > 0) {
                xml.append("\t\n\t");

                Enumeration recKeys = mvRecord.keys();
                String sKey = "";

                for (int i = 0; i < mvRecord.size(); i++) {
                    sKey = (String) recKeys.nextElement();
                    xml.append("\t<" + sKey + ">"
                        + mvRecord.getValueAt(sKey, 0)
                        + "\n\t");
                }
                xml.append("\n");
            }
        }
        return xml.toString();
    }

    /**
     * For testing purposes only
     */
    public static void main(String[] args) {
        String baseDN = "Mds-Vo-name=AKgrid,o=grid";
        String host = "host.mygrid.com";

        // Query a GRIS machine
        MDSService mds = new MDSService(host, "2135", baseDN);
        String xml = mds.HostInfoLookup("(objectclass=*)");

        System.out.println("GRIS output\n" + xml);
    }
}

The org.globus.mds package supports connecting to an MDS server, querying for contents, printing the results of a query, and disconnecting from the server. It provides an intermediate application layer that you can easily adapt to different LDAP client libraries such as Java Naming and Directory Interface (JNDI), Netscape SDK, and Microsoft SDK. The current release is based on JNDI.

11.4.3.1 Calling the MDS Constructor

The first step in using this Java class is to establish a connection (that is, call the MDS constructor) by issuing the following line of code:

MDS mds = new MDS("www.globus.org", "389", "o=Globus, c=US");

The parameters of this class are the name of the MDS server, the port number (389 by default), and the distinguished name or base directory. LDAP queries conduct searches on the base directory (DN). The default is o=Globus, c=US. Note: Your grid may use a different setup.

11.4.3.2 Opening Connections and Searching

After you have called the MDS constructor, the next step before searching is to open a connection using the connect() method of the MDS class. Then you are ready to search with a call such as the following:

result = mds.search("(objectClass=*)", MDS.ONELEVEL_SCOPE);

The filter (objectClass=*) means “get all data.” Another parameter specifies the search scope (ONELEVEL searches the current DN level, while SUBTREE descends the directory structure).
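Because the org.globus.mds package is layered on JNDI, these two scopes correspond directly to the standard javax.naming.directory.SearchControls constants. The fragment below is a JDK-only illustration of that mapping (it is not CoG code): ONELEVEL searches only the entries immediately under the base DN, while SUBTREE descends the entire branch.

```java
import javax.naming.directory.SearchControls;

public class ScopeSketch {
    public static void main(String[] args) {
        // ONELEVEL searches only the entries directly under the base DN
        SearchControls one = new SearchControls();
        one.setSearchScope(SearchControls.ONELEVEL_SCOPE);

        // SUBTREE descends the whole directory branch below the base DN
        SearchControls sub = new SearchControls();
        sub.setSearchScope(SearchControls.SUBTREE_SCOPE);

        System.out.println(one.getSearchScope() + " " + sub.getSearchScope());
    }
}
```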

11.4.3.3 Translating the Search Results to an XML String

The result of the search() method of the MDS class is a hash table of type org.globus.common.MVHashtable wrapped around the MDSResult class. It is basically a hash table of multivalued hash tables. The client translates the results from the MDS from the hash table into an XML string that other processes can easily understand. The values of the host and the base DN determine whether a query runs against the GRIS or the GIIS. The org.globus.mds package provides several constructors to manipulate these arguments. A run of this program produces the output shown in Listing 11.16.

LISTING 11.16 MDS2 Client Sample Output

false class com.ibm.globus.services.mds.MDSServiceHostLookup ok
MdsDevice /usr 20030212112509Z 20030212112509Z 15592
20030212111009Z 17078 /usr MdsDevice
eth0 10.30.10.15 20030212151853Z 10.30.0.0/16 20030212151853Z
eth0 20030212150353Z

GIIS output
false class com.ibm.globus.services.mds.MDSServiceHostLookup ok
ISE Server System Grid Computing - hyperion
hyperion.fscjapan.ibm.com intraGrid [email protected]
ISE Server System Grid Computing - hyperion
Mds-Vo-name=ISE Server System Grid Computing - hyperion, o=grid
hyperion.fscjapan.ibm.com 2135
CHIBA CITY-CHIBA, 12, Japan (140.11,35.59)

This output provides a great deal of information about the machines, networks, filesystems, and pools of resources as well as a mechanism for identifying resources of particular interest. Because MDS is built on top of LDAP, you can use any LDAP browsing tool to explore the machines on your grid. There are many free tools available for such a task.

11.5 SUMMARY

Information Services, also known as Monitoring and Discovery Services (MDS), provide resource discovery, selection, and optimization. They can be used by applications to obtain a general overview of the capabilities and services provided by a computational grid. The Globus Toolkit MDS offers the means of combining services to provide a coherent system image that grid applications can explore or search. This chapter has presented a set of Java programs to easily query and discover information in your organization's grid. A set of enhanced tools is provided by Globus to develop applications that use MDS as well as other services. Those tools are described in Chapter 12.

REFERENCES

[GridSecurity98] Ian Foster, Carl Kesselman, G. Tsudik, and S. Tuecke. "A Security Architecture for Computational Grids." Proceedings of the 5th ACM Conference on Computer and Communications Security Conference, pp. 83–92, 1998.

[InfoServices01] K. Czajkowski, S. Fitzgerald, Ian Foster, and Carl Kesselman. "Grid Information Services for Distributed Resource Sharing." Proceedings of the Tenth IEEE International Symposium on High-Performance Distributed Computing (HPDC-10), IEEE Press, August 2001.

[MDSGuide03] The Globus Alliance. Information Services (MDS) Documentation. Available online at http://www-unix.globus.org/toolkit/docs/3.2/infosvcs/.

[XPath01] XPath Specification. Available online at http://www.w3.org/TR/xpath.html.

[XPath02] XPath Tutorial. Available online at http://www.zvon.org/xxl/XPathTutorial/General/examples.html.

[XQuery01] XQuery Specification. Available online at http://www.w3.org/TR/xquery/.

12 Commodity Grid Kits (CoGs)

In This Chapter

- Overview
- Language Bindings
- Java
- Other Language Bindings
- Summary

This chapter includes information on the following topics:

- An overview of the different language bindings for the Java Commodity Grid Kit (CoG), including the basic services it provides and the security libraries it uses
- Sample programs for self-signed certificate and key generation, including grid proxy certificates
- Sample programs for the data management services provided, including file transfer code, testing scripts, common error messages, and troubleshooting tips
- A basic overview of other language bindings such as Python and Practical Extraction and Report Language (PERL)

All code is available from the companion CD-ROM.


12.1 OVERVIEW

Commodity frameworks are designed to provide service reusability. The integration of commodity and grid service technologies aims to enhance the functionality, maintenance, and deployment of Grid Services. Building on this philosophy, the Argonne National Laboratory (ANL) provides a set of CoG kits that offer simple application programming interfaces (APIs) for grid service management. These kits include several language bindings such as Java, Python, and PERL.

12.2 LANGUAGE BINDINGS

The most widely used CoG kit is written in Java. As a matter of fact, the Java CoG kit is distributed as an integral part of the Globus Toolkit 3.x. Other important language bindings include Python and PERL. The following compatibility matrix describes each language binding and its compatibility with the different versions of the Globus Toolkit. It can help you decide which CoG kit to use depending on your Globus or operating system version:

TABLE 12.1 CoG Kits Compatibility Matrix

Java (Unix, Windows); Globus Toolkit compatibility: 3.x, 4.x
    Resource Management: Globus Resource Allocation Manager (GRAM) 1.5 client, GRAM server
    Data Management: GridFTP, Globus Access to Secondary Storage (GASS)
    Security: Grid Security Infrastructure (GSI)

Python, pyGlobus (Unix); Globus Toolkit compatibility: 2.x, 3.x
    Implemented as a set of wrappers to the Globus Toolkit 2 and 3 versions. Requires a Globus installation plus additional configuration bundles.

PERL, GridPort 2 (Unix); Globus Toolkit compatibility: 2.2.x
    Resource Management: GRAM 1.5 client
    Data Management: GridFTP, GASS
    Security: Grid Security Infrastructure (GSI)

12.3 JAVA

At the time of this writing, the stable release is version 1.2 for use with the Globus Toolkit 2 and 3. For the upcoming release of GT4, version numbering has been synchronized for both packages; thus, all code shown throughout this chapter has been written for GT2 or GT3 and Java CoG 1.2.

12.3.1 Installation Requirements and Configuration

Installation is the first step toward proper operation: it ensures that the client APIs exist on the local machine with the right configuration. The two basic installation requirements are a Java Development Kit and Apache Ant (for development purposes). For download and installation instructions, follow the Java CoG Kit User Manual [JavaCoGManual03].

12.3.2 Basic Java CoG Services

The Java CoG Kit provides a complete API implementation for managing grid services. Among the most important packages are the following:

- org.globus.security: A partial implementation of the GSI
- org.globus.gram: A full implementation of the GRAM client and server APIs for resource management
- org.globus.io.gass: A pure Java implementation of the GASS client and server for transferring files via HTTPS
- org.globus.io.ftp: A client API for accessing and transferring files from GSI-enabled FTP servers

Other services include the General-purpose Architecture for Reservation and Allocation (GARA), a mechanism that allows advanced and immediate reservations for quality of service; Gatekeeper services; Information Services through the Monitoring and Discovery Service (MDS); and the MyProxy client API for grid proxy management.

12.3.3 Security

The GSI is built on top of the Internet X.509 Public Key Infrastructure Certificate and certificate revocation list (CRL) Profile standards. This API provides a standard way to access all the attributes of an X.509 certificate through the use of the Java Cryptography Extensions (JCE). X.509 certificates are widely used to support authentication and other functionality in Internet security systems. Examples of such systems are the following:

- Privacy Enhanced Mail (PEM)
- Transport Layer Security (TLS)
- Secure Electronic Transactions (SET)

Certificates are usually managed by certificate authorities (CAs). CAs provide services for creating certificates in the X.509 standard and then digitally signing them. A CA is basically a trusted third party: it makes introductions between entities that have no direct knowledge of each other. A CA certificate can be signed by the CA itself (in which case the CA is known as a root CA) or by another CA. The structure of an X.509 certificate as described by ISO/IEC and ANSI X9 uses the Abstract Syntax Notation number One (ASN.1) format:

Certificate ::= SEQUENCE {
    tbsCertificate        TBSCertificate,
    signatureAlgorithm    AlgorithmIdentifier,
    signature             BIT STRING
}

A tbsCertificate as described by ASN.1 consists of the following:

TBSCertificate ::= SEQUENCE {
    version               [0] EXPLICIT Version DEFAULT v1,
    serialNumber          CertificateSerialNumber,
    signature             AlgorithmIdentifier,
    issuer                Name,
    validity              Validity,
    subject               Name,
    subjectPublicKeyInfo  SubjectPublicKeyInfo,
    issuerUniqueID        [1] IMPLICIT UniqueIdentifier OPTIONAL,
                          -- If present, version must be v2 or v3
    subjectUniqueID       [2] IMPLICIT UniqueIdentifier OPTIONAL,
                          -- If present, version must be v2 or v3
    extensions            [3] EXPLICIT Extensions OPTIONAL
                          -- If present, version must be v3
}

Certificates should be instantiated through a certificate factory. Globus relieves the developer from most of the burden of creating X.509 certificates through the cryptography extensions. For a more in-depth overview of X.509 certificates and grid proxies, read Chapter 8.
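In plain JCE terms, the certificate factory mentioned above is java.security.cert.CertificateFactory. The following is a minimal sketch; the commented file path is hypothetical and should be replaced with your own certificate location.

```java
import java.security.cert.CertificateFactory;

public class LoadCert {
    public static void main(String[] args) throws Exception {
        // JCE factory entry point for X.509 certificates
        CertificateFactory cf = CertificateFactory.getInstance("X.509");
        System.out.println("Factory type: " + cf.getType()); // Factory type: X.509

        // Loading a real certificate would look like this
        // (path is hypothetical; substitute your own usercert.pem):
        //
        // try (java.io.InputStream in = new java.io.FileInputStream(
        //         System.getProperty("user.home") + "/.globus/usercert.pem")) {
        //     java.security.cert.X509Certificate cert =
        //         (java.security.cert.X509Certificate) cf.generateCertificate(in);
        //     System.out.println(cert.getSubjectX500Principal().getName());
        // }
    }
}
```

The Globus CertUtil.loadCertificate() helper used later in this chapter wraps this same factory pattern.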

12.3.3.1 CoG Configuration File

The configuration file for the CoG kit can be found in $USER_HOME/.globus/cog.properties, where $USER_HOME represents the user's home directory. For example,

C:\Documents and Settings\Administrator\.globus (for Windows systems)
/home/vsilva/.globus (for Red Hat Linux)

This configuration file records the locations of the four pieces of information required by GSI:

- Location of the user certificate
- Location of the encrypted private key
- Location of the generated user proxy
- Location of the CA (trusted) certificate(s)

A typical configuration file may look something like this:

#Java CoG Kit Configuration File
#Thu Oct 21 09:51:57 EDT 2004
usercert=C\:\\Documents and Settings\\vsilva\\.globus\\usercert.pem
userkey=C\:\\Documents and Settings\\vsilva\\.globus\\userkey.pem
proxy=C\:\\DOCUME~1\\vsilva\\LOCALS~1\\Temp\\x509up_u_vsilva
cacert=C\:\\Documents and Settings\\vsilva\\.globus\\CA\\9d8d5dcd.0
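Since cog.properties follows the standard Java properties format, it can be read with java.util.Properties, which also resolves the backslash escaping seen above. The sketch below parses inline sample data (the paths are illustrative); in practice you would load System.getProperty("user.home") + "/.globus/cog.properties".

```java
import java.io.StringReader;
import java.util.Properties;

public class ReadCogProperties {
    public static void main(String[] args) throws Exception {
        // Inline sample in the same escaped form as the file itself
        String sample =
            "usercert=C\\:\\\\globus\\\\usercert.pem\n"
            + "userkey=C\\:\\\\globus\\\\userkey.pem\n";

        Properties props = new Properties();
        props.load(new StringReader(sample));

        // Properties.load resolves the \: and \\ escapes
        System.out.println(props.getProperty("usercert"));
        // prints: C:\globus\usercert.pem
    }
}
```

The CoGProperties class used in the listings below performs the equivalent lookup against the real file.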

12.3.3.2 Security Libraries

To create or manipulate certificates, a set of security libraries must be in your compilation classpath. These libraries are distributed as part of the Globus Toolkit:

Cryptography extensions: As of GT 3.x, Globus uses the open source cryptography extensions provided by The Legion of the Bouncy Castle (http://www.bouncycastle.org/), as well as a Java-only implementation of the SSLv3 and TLSv1 standards provided by Cryptix and Claymore Systems:

- Cryptix*.jar
- puretls.jar
- jce-jdk-xxx.jar

CoG API:

- cog-jglobus.jar

Logging, used for message debugging:

- log4j-xxx.jar

GSI, implementations of the Generic Security Service (GSS) API:

- jgss.jar
- j2ee.jar (not part of the OGSA distribution)
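One quick sanity check, using only the JDK, is to list which JCE providers the running JVM has registered; a Globus setup typically adds Bouncy Castle ("BC") to the JDK defaults, although the exact provider list below depends on your installation.

```java
import java.security.Provider;
import java.security.Security;

public class ListJceProviders {
    public static void main(String[] args) {
        // Prints the JCE providers registered with this JVM
        for (Provider p : Security.getProviders()) {
            System.out.println(p.getName());
        }
    }
}
```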

12.3.3.3 Proxy and User Certificate Generation

The Java CoG kit provides a complete set of APIs for proxy, user, and CA certificate manipulation, as shown in the following code sections. Listing 12.1 demonstrates a technique used to generate a grid proxy certificate given user-specific information.

LISTING 12.1 Generating a Proxy Certificate

/* for logging */
private static Logger logger =
    Logger.getLogger(GridProxy.class.getName());

/* CoG kit properties */
private CoGProperties props = CoGProperties.getDefault();

/**
 * Generate a Globus proxy certificate
 * @param inUserCert User certificate input stream
 * @param inUserKey Private key input stream
 * @param pwd Private key decryption pass phrase
 * @param bits Proxy strength in bits (512, 1024, etc.)
 * @param hours Proxy lifetime in hours (12 recommended)
 * @return a GlobusCredential object
 * @throws IOException
 * @throws java.security.GeneralSecurityException
 */
public synchronized GlobusCredential gridProxyInit(
    InputStream inUserCert,
    InputStream inUserKey,
    String pwd,
    int bits,
    int hours)
    throws IOException, java.security.GeneralSecurityException {

    X509Certificate userCert = CertUtil.loadCertificate(inUserCert);
    OpenSSLKey key = new BouncyCastleOpenSSLKey(inUserKey);

    logger.debug("gridProxyInit: User Cert:\n" + userCert
        + "\nUser key encrypted=" + key.isEncrypted());

    /* decrypt the private key with the passphrase */
    if (key.isEncrypted()) {
        try {
            key.decrypt(pwd);
        } catch (java.security.GeneralSecurityException e) {
            throw new java.security.GeneralSecurityException(
                "Wrong password or other security error");
        }
    }

    /* proxy generation */
    java.security.PrivateKey userKey = key.getPrivateKey();

    BouncyCastleCertProcessingFactory factory =
        BouncyCastleCertProcessingFactory.getDefault();

    return factory.createCredential(
        new X509Certificate[] { userCert },
        userKey,
        bits,
        hours * 3600,
        GSIConstants.GSI_2_PROXY,
        (org.globus.gsi.X509ExtensionSet) null);
}

The code in Listing 12.1 can be used to create a grid proxy with default credentials. This information is read from the CoG configuration file (cog.properties), as shown in Listing 12.2.

LISTING 12.2 Creating a Proxy Certificate Using Default Values

/**
 * Create a Globus proxy using the default values
 * from $HOME/.globus/cog.properties
 * @param passphrase Private key decryption passphrase
 * @throws FileNotFoundException
 * @throws IOException
 * @throws GeneralSecurityException
 */
public void gridProxyInitDefault(String passphrase)
    throws FileNotFoundException, IOException,
        GeneralSecurityException {

    int bits = 1024;
    int hours = 12;

    InputStream inCert = new FileInputStream(props.getUserCertFile());
    InputStream inKey = new FileInputStream(props.getUserKeyFile());

    // create globus proxy
    GlobusCredential cred =
        gridProxyInit(inCert, inKey, passphrase, bits, hours);

    logger.debug(cred);

    // save proxy in default location
    logger.info("Saving proxy as: " + props.getProxyFile());

    OutputStream bos = new FileOutputStream(props.getProxyFile());
    cred.save(bos);
}

Another useful piece of functionality provided by this API is the ability to display certificate attribute information, such as subject, issuer, strength, or lifetime (see Listing 12.3).

LISTING 12.3 Displaying Default Certificate Information

/**
 * Display information from the default proxy
 * @throws GlobusCredentialException
 */
public void gridProxyInfoDefault() throws GlobusCredentialException {
    ProxyInfo(props.getProxyFile());
}

/**
 * Display proxy information
 * @param file Full path to the globus proxy
 * @throws GlobusCredentialException
 */
public void ProxyInfo(String file) throws GlobusCredentialException {
    GlobusCredential cred = new GlobusCredential(file);

    System.out.println("Subject:" + cred.getSubject()
        + "\nIssuer: " + cred.getIssuer()
        + "\nStrength: " + new Integer(cred.getStrength()).toString()
        + "\nTime left: "
        + org.globus.util.Util.formatTimeSec(cred.getTimeLeft()));
}

12.3.3.4 User and Self-Signed Certificates

Depending on the needs of your virtual organization (VO), you may find yourself in need of creating user or CA certificates. The Java CoG kit provides capabilities for you to create your own CA or manage user and host certificates within your organization. For example, a self-signed certificate may require the following parameters:

- A certificate subject of the form O=ACME,OU=Information Technology,CN=John Doe
- A password used to encrypt the private key
- Certificate strength in bits (512, 1024, etc.)

Such a certificate can be easily created (see Listing 12.4).

LISTING 12.4 Creating a Self-Signed Certificate

/**
 * Create a self-signed (CA) certificate and encrypted private key
 * @param subject String of the form
 *        O=ACME,OU=Information Technology,CN=Vladimir Silva
 * @param bits Certificate strength in bits (512, 1024, ...)
 * @param Pwd Passphrase used to encrypt the private key
 * @param swKey Buffer for the encrypted key
 * @param swCert Buffer for the certificate
 */
public void generateSelfSignedCertAndKey(String subject, int bits,
    String Pwd, StringWriter swKey, StringWriter swCert)
    throws NoSuchAlgorithmException, Exception {

    X509Name _subject = makeCertDN(subject);

    logger.debug("generateSelfSignedCertAndKey Cert subject: "
        + _subject.getNameString() + " Strength=" + bits
        + " Pwd=" + Pwd);

    // Buffer for the encrypted key
    BufferedWriter bw = new BufferedWriter(swKey);

    // Start by generating an RSA key pair
    KeyPair kp = CertRequest.generateKey("RSA", bits, Pwd, bw, true);

    // cert validity is set for 1 year: 31536000 secs
    byte[] certBytes =
        CertRequest.makeSelfSignedCert(kp, _subject, 31536000);

    // Encode cert bytes in PEM format and store them in the
    // certificate buffer
    BufferedWriter bw1 = new BufferedWriter(swCert);
    String _certPEM = writePEM(certBytes,
        "-----BEGIN CERTIFICATE-----\n",
        "-----END CERTIFICATE-----\n");
    bw1.write(_certPEM);
    bw1.close();
}

This method requires two utility subroutines. The first one (writePEM) is used to encode the raw certificate bytes in the PEM format. The second (makeCertDN) is used to split the certificate subject into a format that is understandable by the cryptography extensions, also known as an X509Name (see Listing 12.5).

LISTING 12.5 Encoding a Certificate in Privacy Enhanced Mail (PEM) Format

/**
 * Encode certificate bytes in the Privacy Enhanced Mail (PEM)
 * format
 * @param bytes Certificate bytes to encode
 * @param hdr Certificate header: -----BEGIN CERTIFICATE-----
 * @param ftr Footer: -----END CERTIFICATE-----
 * @return PEM encoded string
 * @throws IOException
 */
public static String writePEM(byte[] bytes, String hdr, String ftr)
    throws IOException {

    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    Base64OutputStream b64os = new Base64OutputStream(bos);

    b64os.write(bytes);
    b64os.flush();
    b64os.close();

    ByteArrayInputStream bis =
        new ByteArrayInputStream(bos.toByteArray());
    InputStreamReader irr = new InputStreamReader(bis);
    BufferedReader r = new BufferedReader(irr);

    StringBuffer buff = new StringBuffer();
    String line;
    buff.append(hdr);

    while ((line = r.readLine()) != null) {
        buff.append(line + "\n");
    }
    buff.append(ftr);
    return buff.toString();
}

/**
 * Create an X509 Distinguished Name (DN) used for cert creation
 * @param subject Certificate subject string.
 *        Example: O=ACME,OU=Information Technology,CN=Vladimir Silva
 * @return X509Name
 */
private static X509Name makeCertDN(String subject) throws Exception {
    Vector tdn = new Vector();
    Vector elems = new Vector();

    StringTokenizer st = new StringTokenizer(subject, ",");

    // The string of the form O=ACME,OU=...,CN=...
    // must be converted into a Vector of String[2] arrays
    // containing (key, value) pairs
    for (; st.hasMoreTokens(); ) {
        // (key=value) pairs
        String s = st.nextToken();

        if (s.indexOf("=") == -1) {
            throw new Exception("Invalid subject format: " + subject
                + " Offending value: " + s);
        }

        String key = s.substring(0, s.indexOf("=")).trim();
        String val = s.substring(s.indexOf("=") + 1).trim();

        if (val == null || val.equals("")) {
            throw new Exception("Invalid subject format: " + subject
                + " Offending value: " + s);
        }

        String[] temp = { key, val };
        tdn.addElement(temp);
    }
    // COM.claymoresystems.cert (puretls.jar)
    return CertRequest.makeSimpleDN(tdn);
}
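For comparison, later JDKs provide standard-library counterparts to both utility subroutines: java.util.Base64 (Java 8 and later) handles the 64-column MIME line wrapping that writePEM builds by hand, and javax.security.auth.x500.X500Principal parses a comma-separated subject string much like makeCertDN. This sketch is independent of the Globus libraries and is not a drop-in replacement (makeCertDN returns a puretls X509Name, not an X500Principal):

```java
import java.util.Base64;
import javax.security.auth.x500.X500Principal;

public class CertUtils {
    // Stdlib counterpart of the book's writePEM
    public static String writePEM(byte[] bytes, String hdr, String ftr) {
        Base64.Encoder enc = Base64.getMimeEncoder(64, "\n".getBytes());
        return hdr + new String(enc.encode(bytes)) + "\n" + ftr;
    }

    public static void main(String[] args) {
        System.out.print(writePEM("raw cert bytes".getBytes(),
            "-----BEGIN CERTIFICATE-----\n",
            "-----END CERTIFICATE-----\n"));

        // Stdlib counterpart of makeCertDN: parses and validates an
        // RFC 2253 subject string in one call
        X500Principal dn = new X500Principal(
            "O=ACME, OU=Information Technology, CN=John Doe");
        System.out.println(dn.getName());
    }
}
```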

As demonstrated in the previous sections, the Java CoG Kit provides powerful security features for certificate and proxy manipulation. Appendix A includes an implementation of a Web-based CA that can be used to satisfy the security needs of your organization.

12.3.4 Resource Management

For a complete overview of the GRAM APIs, including source code, read Chapter 9.

12.3.5 Data Management

Data management in the Globus Toolkit is built on two basic protocols:

- GridFTP: Also known as GsiFTP, a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks
- Globus Access to Secondary Storage (GASS): A protocol for single file transfers via HTTPS

Transferring files across multiple machines is a snap using the Java CoG Kit APIs (see Listing 12.6).

LISTING 12.6 Transferring Files Across Multiple Machines Using GridFTP

package data.management;

import java.net.MalformedURLException;

import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.globus.io.urlcopy.*;
import org.globus.util.*;
import org.globus.gsi.gssapi.auth.IdentityAuthorization;

/**
 * Sample program for transferring data using the
 * GASS (Globus Access to Secondary Storage) and
 * GridFTP (a high-performance, secure, reliable data transfer
 * protocol optimized for high-bandwidth wide-area networks)
 * protocols
 */
public class DataTransfer implements UrlCopyListener {

    long transferredBytes;

    // UrlCopyListener interface methods
    // fires multiple times during the transfer
    public void transfer(long transferredBytes, long totalBytes) {
        this.transferredBytes += transferredBytes;
    }

    // fires if a transfer error occurs
    public void transferError(Exception e) {
        System.err.println("transferError:" + e);
    }

    // fires when the transfer completes
    public void transferCompleted() {
        System.out.println("transferCompleted: bytes "
            + transferredBytes);
    }

    /**
     * A method to transfer data between machines
     * @param fromURL Source URL. For example:
     *        https://localhost:3154/c:/temp/foo.xml (for GASS)
     *        gsiftp://localhost:2811/c:/temp/foo.xml (GridFTP)
     * @param toURL Destination URL.
     * @param subject Certificate subject or NULL to use defaults
     * @param thirdParty Third-party transfer (GridFTP only)
     * @param dcau Data channel authentication (GridFTP only)
     */
    void transferData(String fromURL, String toURL, String subject,
        boolean thirdParty, boolean dcau)
        throws MalformedURLException, UrlCopyException {

        GlobusURL from = new GlobusURL(fromURL);
        GlobusURL to = new GlobusURL(toURL);
        UrlCopy uc = new UrlCopy();

        // ftp options: thirdparty, dcau
        uc.setSourceUrl(from);
        uc.setDestinationUrl(to);
        uc.setUseThirdPartyCopy(thirdParty);
        uc.setDCAU(dcau);

        // set authorization subject
        if (subject != null && !subject.equals("")) {
            uc.setSourceAuthorization(
                new IdentityAuthorization(subject));
            uc.setDestinationAuthorization(
                new IdentityAuthorization(subject));
        }

        // fire transfer thread
        uc.addUrlCopyListener(this);
        uc.run();
    }

    /* for test purposes only */
    public static void main(String[] args) {
        try {
            Logger.getRootLogger().setLevel(Level.INFO);

            String fromURL = "https://localhost:3154/c:/temp/foo.xml";
            String toURL = "https://localhost:3155/c:/temp/foo1.xml";
            String subject = null;

            // GridFTP-only options
            boolean thirdp = false;
            boolean dcau = false;

            DataTransfer data = new DataTransfer();
            data.transferData(fromURL, toURL, subject, thirdp, dcau);

        } catch (Exception e) {
            System.err.println(e);
        }
    }
}

12.3.5.1 Transfer Tests

An easy transfer test can be performed by using the GASS protocol in the following manner:

1. Generate a grid proxy:

   grid-proxy-init

   Your identity: O=Grid,OU=GlobusTest,OU=simpleCA-vm-rhl8-2.ibm.com,OU=ibm.com,CN=globus
   Enter GRID pass phrase for this identity: 2p2dkdt
   Creating proxy, please wait...
   Proxy verify OK
   Your proxy is valid until Thu Jan 06 02:54:10 EST 2005

2. Start two instances of the GASS server:

c:\> globus-gass-server.bat https://9.42.87.130:3154

c:\> globus-gass-server.bat https://9.42.87.130:3155

3. Run a transfer with the following parameters:

String fromURL = "https://localhost:3154/c:/temp/foo.xml";
String toURL = "https://localhost:3155/c:/temp/foo1.xml";

4. The output should look like this:

[main] INFO gssapi.GlobusGSSManagerImpl - Getting default credential
[main] INFO auth.SelfAuthorization - Authorization: SELF
[main] INFO urlcopy.UrlCopy - Source size: 890
[main] INFO gssapi.GlobusGSSManagerImpl - Getting default credential
[main] INFO auth.SelfAuthorization - Authorization: SELF
transferCompleted: bytes 890

12.3.6 Data Transfer Troubleshooting

12.3.6.1 Class Not Found Errors

A message such as the following is very common when dealing with data management:

java.lang.NoClassDefFoundError: javax/security/auth/Subject
    at org.globus.gsi.jaas.GlobusSubject.getSubject(GlobusSubject.java:33)
    at org.globus.gsi.jaas.JaasSubject.getCurrentSubject(JaasSubject.java:100)
    at org.globus.gsi.gssapi.GlobusGSSManagerImpl.createCredential(GlobusGSSManagerImpl.java:91)

The most common cause of this error is that one or more of the required libraries are missing from your Java classpath. Make sure all the libraries mentioned in the previous sections are included.
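One quick diagnostic, plain Java rather than part of the CoG kit, is to probe each class you expect with Class.forName before launching a transfer. The commented Globus class names below are taken from the jar list and the stack trace in this chapter; uncomment them once those jars are on your classpath.

```java
public class ClasspathCheck {
    public static void main(String[] args) {
        String[] required = {
            "javax.security.auth.Subject",          // JDK / JAAS
            // From the CoG jars:
            // "org.globus.gsi.GlobusCredential",   // cog-jglobus.jar
            // "org.apache.log4j.Logger",           // log4j-xxx.jar
        };
        for (String name : required) {
            try {
                Class.forName(name);
                System.out.println("OK      " + name);
            } catch (ClassNotFoundException e) {
                System.out.println("MISSING " + name);
            }
        }
    }
}
```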

12.3.6.2 Defective Credential Errors

These types of errors are produced when the GSI certificates have not been properly installed or a grid proxy has not been created. For example, the following error text indicates that the proxy file has not been created on the client system.

org.globus.io.urlcopy.UrlCopyException: UrlCopy transfer failed.
[Root error message: Security error
[Root error message: Defective credential error.
[Root error message: Proxy file
(C:\DOCUME~1\vsilva\LOCALS~1\Temp\x509up_u_vsilva) not found.]]]

To fix these errors, make sure the client has a user certificate, encrypted private key, and CA (trusted) certificate installed properly, and that the full paths to those files are included in the cog.properties configuration file.

12.4 OTHER LANGUAGE BINDINGS

There are bindings to the Globus Toolkit APIs from two popular Unix scripting languages: Python and PERL. These bindings support version 2.x of the toolkit only, although there are projects under way to support the latest OGSA and WSRF standards.

12.4.1 Python

If you are a Python and grid developer enthusiast, then pyGlobus is for you. pyGlobus is a set of wrappers to the Globus Toolkit executables provided by the Computational Research Division at Lawrence Berkeley National Laboratory (LBNL) [pyGlobus]. The goals of this project are as follows:

- Provide an object-oriented interface to the Globus Toolkit.
- Provide a high-performance set of wrappers as close to the underlying C code as possible.
- Make the Globus Toolkit as natural to use from Python as possible.

pyGlobus is useful for portal development in Unix environments. Although it is important to have variety in the interfaces available for grid development, Java has quickly become the platform to use for any kind of middle-tier development. Thus, if your goal is to learn about grid portal development, you should seriously consider the Java language as a development platform. Chapter 4 covers many of the topics related to this subject.

12.4.2 PERL

Yet another grid interface for Unix portal development, PERL is a very popular language among systems administrators that provides a robust way to program Web applications. This commodity kit is also known as GridPort and consists of a collection of technologies for science portals on computational grids [GridPort]. It is currently used by the National Partnership for Advanced Computational Infrastructure (NPACI) to provide the following:

- Information about computational resources
- Job submission and control
- File and directory manipulation and transfer

It is not the goal of this book to provide samples or code for the PERL or Python grid bindings. Nevertheless, following are some script comparisons between the C command line and PERL for resource and data management:

12.4.2.1 Resource Management

C command line: The following command will submit a date command to the backend job manager (Portable Batch System [PBS]) running on host acme:

globus-job-run acme.host.com/jobmanager-pbs -np 16 /bin/date

GridPort script: Here is the equivalent GridPort PERL script:

$job = Cog::Globus::Job->new(
    contact    => 'acme.host.com/jobmanager-pbs',
    executable => '/bin/date',
    cpus       => '16'
);

$job->submit();
print $job->get_stdout();

12.4.2.2 Data Management

C command line: The following command will transfer two files located in the /user filesystem using the GridFTP protocol:

globus-url-copy gsiftp://host1.acme.edu/user/big.file gsiftp://host2.acme.edu/user/bigcopy.file

GridPort script: An equivalent PERL script:

use Cog::Globus::URLCopy;

@output = Cog::Globus::URLCopy::copy(
    source_url => 'gsiftp://host1.acme.edu/user/big.file',
    dest_url   => 'gsiftp://host2.acme.edu/user/bigcopy.file'
);

# Returns: @output. If all goes well, this array will be undef.
# Else, an array with the server error messages in it.
if (defined(@output)) {
    # Print server errors...
    foreach (@output) { print; }
}

GridPort and pyGlobus are APIs for Unix-style systems that give portal developers access to grid functionality such as secure user authentication, job submission/management, and file transfer. Although currently used in the development of science portals, they leverage standard, portable technologies to provide grid services that can be incorporated into your VOs.

12.5 SUMMARY

The code shown in this chapter will get you started with the basic services provided by the Java CoG Kit within your organization or computational grid. Those services include security management policies for X.509 certificates and private keys, and data management interfaces for data transfer, including GridFTP, GASS, and others. For a more in-depth overview of GridFTP and other data transfer protocols, read Chapter 10. 400 Grid Computing for Developers

REFERENCES

[GridPort] The Grid Portal Toolkit. Available online at https://gridport.npaci.edu/.

[JavaCoG01] Gregor von Laszewski, Ian Foster, Jarek Gawor, and Peter Lane. "A Java Commodity Grid Kit." Concurrency and Computation: Practice and Experience, 13(8–9), pp. 643–662, 2001. Available online at http://www.cogkit.org/.

[JavaCoGManual03] Gregor von Laszewski, Beulah Alunkal, Kaizar Amin, Jarek Gawor, Mihael Hategan, and Sandeep Nijsure. The Java CoG Kit User Manual, Draft Version 1.1. ANL/MCS-TM-259, revised March 14, 2003. Available online at http://www-unix.globus.org/cog/manual-user.pdf, July 18, 2003.

[pyGlobus] Python Globus. Lawrence Berkeley National Laboratory, Computational Research Division. Available online at http://dsd.lbl.gov/gtg/projects/pyGlobus/.

13 Web Services Resource Framework (WSRF)

In This Chapter

- Understanding WSRF
- WSRF and OGSI
- WSRF Normative Specifications
- GT4 and WSRF: Stateful Services for Grid Environments
- Service Example: A WSRF Large Integer Factorization (LIF) Service
- Summary
- References

This chapter includes information on the following topics:

- An overview of the Web Services Resource Framework (WSRF), which is the latest specification in Web Services technology. This section covers stateless versus stateful services, and the WS-Resource construct.
- The integration of WSRF with the Open Grid Services Infrastructure (OGSI) framework.
- A description of the WSRF normative specifications: WS-ResourceProperties, WS-Addressing, WS-Resource Lifecycle, WS-ServiceGroup, WS-BaseFaults, and WS-Notification.
- An overview of the upcoming Globus Toolkit 4, the latest and most advanced grid middleware software available today.
- A sample grid service for WSRF in the topic of large integer factorization, which is very common in asymmetric cryptography.


Complete code listings can be found in the companion CD-ROM.

13.1 UNDERSTANDING WSRF

To understand how WSRF can provide stateful services, we must first understand Web Services and define their state in a standard manner. A Web service is a software component capable of machine-to-machine interaction, with a network address and an interface described in the Web Service Description Language (WSDL). Service interaction is described using Simple Object Access Protocol (SOAP) messages, typically composed of an XML serialization sent over a transport protocol such as HTTP or other Web standards [StatefulWebServices04]. The concept of a service-oriented architecture (SOA) has emerged to define a distributed system in which Web Services coordinate by sending messages. Web Services are stateless; they exchange messages with no access to or use of information not contained in the input message.

13.1.1 Stateless versus Stateful Services

A stateful service is a service that has access to, or manipulates, logical stateful resources through the propagation of execution context in the headers of message exchanges. In general, a stateless service enhances reliability and scalability: after a failure, for example, a service can be restarted without concern for previous interactions, and new service instances can be created or destroyed in response to system load. Thus, stateless services are considered good practice by the Web Services community [StatefulWebServices04].

However, there are situations where a stateful service (a service that manipulates stateful resources based on message exchanges) may be desirable. Such scenarios involve interoperability among services. It is therefore important to standardize the patterns by which state is represented, to facilitate the construction of interoperable services [StatefulWebServices04].
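The contrast can be sketched in a few lines of plain Java (an illustration of ours, not part of any WSRF API): the stateless operation's result depends only on its input message, while the stateful one consults a resource that outlives the call.

```java
import java.util.HashMap;
import java.util.Map;

public class StatefulVsStateless {
    // Stateless: the result depends only on the input message.
    public static int statelessAdd(int a, int b) {
        return a + b;
    }

    // Stateful: the service consults a logical resource keyed by an
    // identifier carried in the message exchange (here, a map entry).
    private static final Map<String, Integer> resources = new HashMap<>();

    public static int statefulIncrement(String resourceId) {
        int next = resources.getOrDefault(resourceId, 0) + 1;
        resources.put(resourceId, next);
        return next;
    }

    public static void main(String[] args) {
        System.out.println(statelessAdd(2, 3));      // same output for same input, always
        System.out.println(statefulIncrement("r1")); // result depends on prior calls
        System.out.println(statefulIncrement("r1"));
    }
}
```

Restarting the stateless half after a failure loses nothing; restarting the stateful half loses the map, which is exactly why standardized state representation matters.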

13.1.2 WS-Resource

A WS-Resource is defined as the composition of a Web Service and a stateful resource. The stateful resource can be used in the Web Service's message exchanges. WS-Resources can be created and destroyed, and their state can be queried or modified via message exchanges.

A WS-Resource has four characteristics that are very important in software engineering, known as the ACID properties. Most of these properties are described in the Web Services Atomic Transaction specification [WS-AtomicTransaction]:

Atomicity: Stateful resource updates within a transactional unit are made in an all-or-nothing fashion.
Consistency: Stateful resources should always be in a consistent state, even after failures.
Isolation: Updates to stateful resources should be isolated within a given transactional unit of work.
Durability: Stateful resource updates made under the transactional unit of work are permanent.

13.1.2.1 Stateful Resource

A stateful resource can be defined as a component with three characteristics:

1. It is composed of state data defined in XML format.
2. It has a life cycle.
3. It can be manipulated by one or more Web Services.

Some examples of stateful resources are files, Java objects, or rows in a database. Stateful resources can be compound (e.g., contain other resources), and their instances can be created or destroyed via service factories. A stateful resource instance should be identified by an identity or resource identifier; moreover, applications using a resource may assign additional identities or aliases [StatefulWebServices04].

Relationships between Web Services and stateful resources are defined through the concept of a resource pattern. A resource pattern defines the mechanisms that associate a stateful resource with the message exchanges of a Web Service. This relationship can be static, if a resource is associated with a service at deployment, or dynamic, if the resource is associated at message-exchange time. Resource patterns are implemented using standards such as XML, WSDL, and WS-Addressing.
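The three characteristics above, plus the resource identifier, can be made concrete with a small Java sketch (class and method names are ours, purely illustrative): the resource carries XML-expressible state, has an explicit create/destroy life cycle, and is resolved by services through its identifier.

```java
import java.util.HashMap;
import java.util.Map;

public class StatefulFileResource {
    private final String id;       // resource identifier, carried in message exchanges
    private long sizeBytes;        // state data
    private boolean destroyed;     // life cycle flag

    // Registry a Web Service would consult to resolve an identifier to a resource.
    private static final Map<String, StatefulFileResource> registry = new HashMap<>();

    private StatefulFileResource(String id) { this.id = id; }

    public static StatefulFileResource create(String id) {   // life cycle: creation
        StatefulFileResource r = new StatefulFileResource(id);
        registry.put(id, r);
        return r;
    }

    public static StatefulFileResource lookup(String id) { return registry.get(id); }

    public void destroy() {                                   // life cycle: destruction
        destroyed = true;
        registry.remove(id);
    }

    public boolean isDestroyed() { return destroyed; }

    public void setSizeBytes(long n) { sizeBytes = n; }       // manipulated by a service

    public String toXml() {                                   // state data in XML format
        return "<file id=\"" + id + "\"><size>" + sizeBytes + "</size></file>";
    }
}
```

Several services could share the registry (the resource is "manipulated by one or more Web Services"), and an alias would simply be a second registry key for the same object.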

13.2 WSRF AND OGSI

WSRF and OGSI are the latest attempts at standardizing grid services. They are being evaluated carefully by the Web Services community. The following sections describe their advantages and disadvantages.

13.2.1 OGSI Evolution

OGSI jumped to center stage to address the needs of applications in distributed environments, such as the following:

A framework for creating, addressing, inspecting, and managing the lifetime of stateful services in grid environments
Mechanisms to deal with service creation and discovery
The need for controlled, fault-tolerant, and secure management of state

By relying on open specifications like WSDL and XML Schema, OGSI introduced new concepts into the Web Services world: instances, common metadata and inspection, asynchronous notification, instance references, collections of service instances, and service state. To accomplish these goals, OGSI has enhanced Web Services with extensions such as the following:

Support for WSDL 2.0 extensions
WSDL extensions for representing, querying, and updating service data
Grid service handle and grid service reference constructs
Fault information through WSDL fault messages
A set of operations for creating and destroying grid services
Mechanisms for asynchronous notifications

This remarkable work created an aura of competition and compatibility issues between Web Services and OGSI, and the need for a brand-new specification to address the common ground of these two important emerging technologies [OGSA2WSRF04].

13.2.2 From OGSI to WSRF

In July 2003, the OGSI specification was released to address the need for stateful grid services. OGSI defines interfaces for creating, addressing, inspecting, and managing the lifetime of stateful grid services. A grid service, as defined by OGSI, is a Web Service that implements a specific set of interfaces and behaviors. Through the use of extended WSDL and XML Schema definitions, OGSI introduces the following [OGSA2WSRF04]:

Stateful Web Service instances
Common service metadata and inspection
Service asynchronous notification
Collections of service instances
Service state data

At the same time, the Web Services world has evolved significantly; specifically, a number of new specifications have emerged that overlap ideas expressed by OGSI. For example:

WS-Addressing: Provides transport-neutral mechanisms to address Web Services through XML elements (service endpoints) within the messages.
WS-MetaDataExchange: Provides mechanisms for obtaining information about a published service, including its WSDL description, XML Schema, and policy information.

This overlap prompted the OGSI community to pursue integration with those Web Service specifications rather than maintain a redundant specification with the same functionality. The integration of OGSI with the Web Services community is based on the following critiques of OGSI [StatefulWebServices04]:

Too bulky: There is no clean separation to support incremental adoption. For example, event notification and metadata should be partitioned to support flexible composition. This issue is addressed by two new specifications: WS-Resource and WS-Notification.
Compatibility with existing Web Services and XML tooling: By extending the WSDL 1.1 portType definition, OGSI made it tough for Web Service developers to use standard XML Schema mechanisms such as the Java API for XML-based Remote Procedure Calls (JAX-RPC). WSRF tackles this problem by using standard XML Schema and portType annotations to associate the XML information model of the resource with Web Service operations, thus preserving compatibility with WSDL 1.1.
Too object oriented: OGSI couples Web Services and resource state into a single entity. The Web Service community argues that Web Services do not have state or instances. WSRF addresses this problem by making a clear distinction between the service and the stateful entities acted upon. WSRF defines the means by which a Web Service and a stateful resource are composed through the concept of the implied resource pattern.
WSDL 2.0 compatibility: Delays in the WSDL 2.0 specification made it difficult to support the OGSI WSDL extensions with existing Web Services runtimes.

13.3 WSRF NORMATIVE SPECIFICATIONS

WSRF is composed of five specifications that describe the way stateful resources interact with each other.

13.3.1 WS-ResourceProperties

WS-ResourceProperties defines the constructs by which the state of a WS-Resource can be manipulated through the Web Service interface. A resource property maps to an individual component of the resource state. WS-ResourceProperties describes WS-Resources by associating stateful resources and Web Services. It also defines operations to retrieve, change, and delete the visible properties of a WS-Resource.

13.3.1.1 Properties Document

The properties document is a view, or projection, of the actual state of the WS-Resource. The WS-Resource properties document is expressed as an XML Schema declaration within a namespace, comprising a set of references to the XML declarations of the individual resource properties. An example is a WS-Resource properties document named ResourceExample with two property references, prop1 and prop2.
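Sketched as an XML Schema declaration, such a properties document might look like the following (prefixes and layout are illustrative, patterned on the WS-ResourceProperties draft rather than copied from the book's listing):

```xml
<!-- Global element declaring the resource properties document -->
<xsd:element name="ResourceExample">
  <xsd:complexType>
    <xsd:sequence>
      <!-- each reference points to an individually declared resource property -->
      <xsd:element ref="tns:prop1"/>
      <xsd:element ref="tns:prop2"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
```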

The WS-Resource properties document declaration is associated with the WSDL portType definition via the ResourceProperties attribute, thus enabling service requestors to retrieve resource properties from the Web Service description.
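Schematically, the association is a single attribute on the portType (element and prefix names here are illustrative):

```xml
<wsdl:portType name="ResourceExamplePortType"
               wsrp:ResourceProperties="tns:ResourceExample">
  <!-- operations ... -->
</wsdl:portType>
```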


13.3.1.3 Accessing Property Values

Resource property values can be read, modified, and queried by exchanging messages between the Web Service client and server. For example, a "get" operation retrieves the value of a resource property, and a "set" operation modifies it. The WS-ResourceProperties specification provides more details on these operations [WS-ResourceProperties]. A "get" operation for property prop1 is a message exchange in which the client's request identifies the target resource (MY_RESOURCE_ID) and names the property tns:prop1, and the server's response carries the property's current value, Hello World.
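The SOAP payloads of such an exchange look roughly like this (element and namespace names follow the WS-ResourceProperties draft and are illustrative; SOAP envelope and headers abbreviated):

```xml
<!-- Client "get" request (SOAP body); the resource identifier
     MY_RESOURCE_ID travels in the SOAP header as a reference property -->
<wsrp:GetResourceProperty>tns:prop1</wsrp:GetResourceProperty>

<!-- Server response -->
<wsrp:GetResourcePropertyResponse>
  <tns:prop1>Hello World</tns:prop1>
</wsrp:GetResourcePropertyResponse>
```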

13.3.2 WS-Addressing

WS-Addressing is a construct used to standardize endpoint references (see Listing 13.1). An endpoint reference represents the address of a Web Service deployed at a network endpoint. It is represented as an XML serialization, usually returned by a Web Service request that creates a new resource. Besides the Web Service address, an endpoint reference may contain metadata such as a service description and reference properties.

LISTING 13.1 WS-Addressing Endpoint Reference

<wsa:EndpointReference>
    <wsa:Address>http://helloworld.com/myWebService</wsa:Address>
    <wsa:ReferenceProperties>
        <tns:ResourceID>ID-12345</tns:ResourceID>
    </wsa:ReferenceProperties>
</wsa:EndpointReference>

The use of endpoint references is demonstrated in Figures 13.1 and 13.2, which depict the message exchange for the creation of a Web Service called the Large Integer Factorization (LIF) service.

FIGURE 13.1 Web Service creation SOAP request.

FIGURE 13.2 Web Service creation SOAP responses.

13.3.3 WS-Resource Lifecycle

A lifecycle is defined as the period between a WS-Resource's creation and its destruction. It takes into account the following:

13.3.3.1 Creation

Stateful resources are usually created by a resource factory. A creation call returns an endpoint reference to the new stateful resource.

13.3.3.2 Destruction

Destruction defines the means by which a stateful resource is destroyed and system resources are reclaimed.

13.3.3.3 Resource Identifier

Stateful resources must have at least one resource identifier. It is returned as part of the endpoint reference and can be made available to other Web Services in a distributed system.
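A toy Java factory (ours, unrelated to the GT APIs) ties the three lifecycle pieces together: creation mints an identifier and returns it packaged with the service address as an endpoint reference, and destruction reclaims the entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class ResourceFactory {
    /** Minimal stand-in for a WS-Addressing endpoint reference. */
    public record EndpointReference(String address, String resourceId) {}

    private final String serviceAddress;
    private final Map<String, Object> resources = new HashMap<>();

    public ResourceFactory(String serviceAddress) {
        this.serviceAddress = serviceAddress;
    }

    // Creation: build the stateful resource and hand back an endpoint
    // reference carrying the new resource identifier.
    public EndpointReference create(Object initialState) {
        String id = UUID.randomUUID().toString();
        resources.put(id, initialState);
        return new EndpointReference(serviceAddress, id);
    }

    // Destruction: reclaim the resource behind an endpoint reference.
    public boolean destroy(EndpointReference epr) {
        return resources.remove(epr.resourceId()) != null;
    }

    // Services resolve the identifier back to the state it names.
    public Object lookup(EndpointReference epr) {
        return resources.get(epr.resourceId());
    }
}
```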

13.3.4 WS-ServiceGroup

WS-ServiceGroup is used to organize collections of WS-Resources to build registries, or to build services that can perform collective operations. WS-ServiceGroup defines the means for managing heterogeneous collections of Web Services. A WS-ServiceGroup uses memberships, rules, constraints, and classifications to define groups. A group is a collection of members that meet some constraints defined using resource properties [WS-ServiceGroup].
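A membership rule over resource properties can be modeled as a predicate; the following Java sketch (names ours, not a WSRF API) admits a member only when its properties satisfy the group's constraint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class ServiceGroup {
    // Membership constraint expressed over a member's resource properties.
    private final Predicate<Map<String, String>> constraint;
    private final List<Map<String, String>> members = new ArrayList<>();

    public ServiceGroup(Predicate<Map<String, String>> constraint) {
        this.constraint = constraint;
    }

    /** Adds the member only if it satisfies the group's constraint. */
    public boolean add(Map<String, String> resourceProperties) {
        if (!constraint.test(resourceProperties)) {
            return false;
        }
        members.add(resourceProperties);
        return true;
    }

    public int size() {
        return members.size();
    }
}
```

For example, a registry of Linux compute nodes would be constructed with the constraint `p -> "linux".equals(p.get("os"))`, and heterogeneous members failing the rule are rejected.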

13.3.5 WS-BaseFaults

WS-BaseFaults defines a consistent way of dealing with faults generated by operations within a Web Service. This specification is used by all of the other WS-Resource framework specifications for fault reporting related to WS-Resource use.

13.3.6 WS-Notification

This is a separate family of specifications that enables publish and subscribe (pub/sub) interactions among Web Services. WS-Notification defines interfaces that clients use to subscribe to topics of interest and to receive notifications asynchronously, for example, when a resource property value changes. WS-Notification consists of the following members [WS-Notification]:

13.3.6.1 WS-BaseNotification

This defines mechanisms that allow a subscriber to register to receive notifications from a producer.

13.3.6.2 WS-Topics

This defines a hierarchical organization of notification messages that subscribers can understand and subscribe to. Topics use an XML representation and can be further decomposed into child topics. They are usually qualified by a namespace, much as XML elements are.

13.3.6.3 WS-BrokeredNotification

This defines the interface to a notification broker that manages subscriptions in the system.
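The pub/sub pattern with hierarchical topics can be sketched in Java (a toy of ours, not the WS-Notification API): a subscriber to a parent topic also receives messages published on its child topics.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class TopicBroker {
    private final Map<String, List<Consumer<String>>> subs = new HashMap<>();

    public void subscribe(String topic, Consumer<String> listener) {
        subs.computeIfAbsent(topic, t -> new ArrayList<>()).add(listener);
    }

    // Deliver to subscribers of the topic itself and of every ancestor
    // topic, mirroring the hierarchical decomposition of WS-Topics.
    public void publish(String topic, String message) {
        for (Map.Entry<String, List<Consumer<String>>> e : subs.entrySet()) {
            if (topic.equals(e.getKey()) || topic.startsWith(e.getKey() + "/")) {
                e.getValue().forEach(l -> l.accept(message));
            }
        }
    }
}
```

A brokered deployment simply moves this dispatch logic out of the producer and into a third service that owns the subscription table.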

13.3.7 WS-Resource Security

Security is defined via the WS-Policy and WS-SecurityPolicy specifications, which are part of the Web Services security roadmap. They state the set of policies used to secure message exchanges between clients and Web Services.

13.4 GT4 AND WSRF: STATEFUL SERVICES FOR GRID ENVIRONMENTS

The following sections describe the new features of the upcoming version of the Globus Toolkit as well as a sample WSRF service.

13.4.1 Overview

The Globus Toolkit 4 (GT4) (31 Jan 2005) features a new implementation of the WSRF and Web Services Notification (WSN) [OASIS-WSNotification] standards. GT4 provides an API for building stateful Web Services targeted at distributed, heterogeneous computing environments [OASIS-WSRF]. The purpose of this chapter is to describe the new standards used by GT4, with a real-world example to get you started modeling stateful services for your organization.

13.4.2 Newcomers: WS-Components

Among the new Web Services components in GT4 are:

13.4.2.1 WS Authentication Authorization

WS Authentication Authorization replaces the Grid Security Infrastructure (GSI) and is subdivided into Message-Level Security and the Authorization Framework. Message-level security implements two standards: WS-Security and WS-SecureConversation. These standards provide SOAP message encryption, integrity, and replay protection.

The Authorization Framework component is designed to handle many authorization schemas, such as grid-mapfile, access control lists (ACLs), and custom authorization handlers via the SAML protocol [OASIS-WSRF].

13.4.2.2 WS Core

WS Core is an implementation of two brand-new standards: WSRF and WSN. Other new features include an Apache Tomcat-based JNDI registry, HTTP/1.1 client and server support, a resolver service for URI to WS-Addressing conversion, and others [GT4StatusPlan].

13.4.2.3 C WS Core

Hey! Here is something C programmers will appreciate. GT4 features a basic toolset in C for creating WSRF-enabled Web Services and clients conforming to WS-Resource and WS-Notification. The most exciting features of this C core include the following:

A standalone service container API for embedding services in a C application
A resource API for managing resources from within a service
HTTP/1.1 client and server support
Generation of pure C stubs (blocking or asynchronous) directly from WSDL schemas
Dynamically loadable operation providers and service modules based on the new extension API

13.4.3 Old Protocols, New Faces

All the well-known GT3 protocols (WS-GRAM for resource management, RFT for data management, and MDS for information services) have been redesigned to use WSRF. The security protocol GSI is now called WS Authentication Authorization. The following compatibility matrix, shown in Table 13.1, outlines the basic features and compatibility issues of the major protocols of the upcoming Globus Toolkit release. More information can be found in Status and Plans for the Globus Toolkit [GT4StatusPlan].

TABLE 13.1 WSRF/OGSA Comparison Matrix

Service: Data Transfer. Protocol: Reliable File Transfer (RFT)
  Features:
    1. Controls and monitors third-party file transfers using GridFTP
    2. Exponential back-off
    3. Transfer all or none
    4. Parallel streams
    5. TCP buffer size
    6. Recursive directory transfer
  Backward compatibility: Not backward compatible with OGSI (GT3.2).

Service: Resource Management. Protocol: WS-GRAM
  Features:
    1. Improved job performance: concurrency, throughput, latency
    2. Improved reliability/recovery
    3. Support for mpich-g2 jobs, including:
       a. multijob submission
       b. process coordination in a job
       c. subjob coordination in a multijob
  Backward compatibility: The protocol has been changed to be WSRF compliant. There is no backward compatibility between this version and previous versions.

Service: Information Services. Protocol: MDS4
  Features:
    Index Service:
      1. Based on WSRF rather than OGSI
      2. Xindice support has been removed
      3. Persistent configuration of aggregations has been refactored
    Brand-new services:
      1. Trigger Service
      2. Aggregator
      3. Archive Service
  Backward compatibility: Incompatible with the GT3.2 index service, as the service has been remodeled to use WSRF instead of OGSI.

13.5 SERVICE EXAMPLE: A WSRF LARGE INTEGER FACTORIZATION (LIF) SERVICE

Current cryptographic methods such as Secure Sockets Layer (SSL) rely on the mathematical property that factoring very large numbers requires significant computing power. Given current computational resources, it would take an individual many years to find the factors (numbers that multiplied together give the original number) of a very large number. The security of cryptosystems such as RSA relies on the difficulty of factoring integers. There have been successful factorizations of large numbers, including the factoring of a 129-digit RSA modulus. Currently, RSA moduli of 512 bits, or about 155 digits, are feasible to factor and, in fact, have been factored. In August 1999, a team including Arjen Lenstra and Peter Montgomery factored a 512-bit RSA modulus using the Number Field Sieve in about 8400 MIPS-years (millions of instructions per second, times years) [QuadSieve01]. Estimates at the time said that a 768-bit modulus would be good until 2004, so for short-term or personal use such a key size is adequate. For corporate use a 1024-bit modulus is suggested, and a 2048-bit modulus is suggested for much more permanent usage. These suggestions take into account possible advances in factoring techniques and increases in processor speed [QuadSieve01].

Our first WSRF-enabled Web Service implements a large integer factorization algorithm using a quadratic sieve. This is a popular topic among cryptographers, given that most of today's secure communications are based on the difficulty of factoring large integers (e.g., 100+ digits). The mathematical background and rationale for this example can be found in Chapter 8.
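The LIF service itself uses a quadratic sieve (Chapter 8). As a far simpler, self-contained illustration of the problem the service solves (and not the algorithm the service implements), here is Pollard's rho method using java.math.BigInteger; it finds a nontrivial factor quickly when the factors are small, but is nowhere near sieve-class performance on RSA-sized moduli.

```java
import java.math.BigInteger;

public class PollardRho {
    /** Returns a nontrivial factor of a composite n > 3. */
    public static BigInteger factor(BigInteger n) {
        if (!n.testBit(0)) {
            return BigInteger.valueOf(2); // even numbers are easy
        }
        // Retry with a different polynomial constant c if a cycle
        // collapses to the trivial divisor n.
        for (long c = 1; ; c++) {
            BigInteger d = rho(n, BigInteger.valueOf(c));
            if (!d.equals(n)) {
                return d;
            }
        }
    }

    private static BigInteger rho(BigInteger n, BigInteger c) {
        BigInteger x = BigInteger.valueOf(2);
        BigInteger y = x;
        BigInteger d = BigInteger.ONE;
        while (d.equals(BigInteger.ONE)) {
            x = x.multiply(x).add(c).mod(n);  // tortoise: x = x^2 + c mod n
            y = y.multiply(y).add(c).mod(n);
            y = y.multiply(y).add(c).mod(n);  // hare moves twice as fast
            d = x.subtract(y).abs().gcd(n);   // a collision mod a prime p reveals p
        }
        return d; // either a proper factor or n itself (caller retries)
    }
}
```

The running time grows roughly with the fourth root of the smallest prime factor, which is exactly why 100+ digit moduli with two large factors remain out of reach for methods like this.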

13.5.1 Web Service Description (WSDL) File

The interface to any Web Service is described in WSDL. WSDL allows you to define the operations exposed by a Web Service as well as the network endpoints required to reach it. This particular service exposes an operation called factor, which takes a string argument and returns a string result (see Listing 13.2).

LISTING 13.2 Large Integer Factorization Web Service Description File

<wsdl:definitions ...
    xmlns:gtwsdl="http://www.globus.org/namespaces/2004/01/GTWSDLExtensions"
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">

    ...

    <wsdl:portType
        name="LIFPortType"
        gtwsdl:implements="wsntw:NotificationProducer
                           wsrlw:ImmediateResourceTermination
                           wsrlw:ScheduledResourceTermination"
        wsrp:ResourceProperties="tns:LIFRP">
        ...
    </wsdl:portType>

    ...
</wsdl:definitions>


16.4.3 Packaging Advice

Packaging of the scheduler interface must be done using the Grid Packaging Toolkit (GPT). This will make it easy for users to build and install. Even though this section has demonstrated a quick, manual way to integrate, you should package the interface files if you plan to distribute them. Such a task can be time consuming; see the developer's guide [WS-GRAMGuide03] for details on building a distributable package.

16.4.4 Testing and Troubleshooting

The following section describes testing and debugging tips for interfacing MPI with WS-GRAM.

16.4.4.1 Pre WS-GRAM (GRAM2): Registering a Gatekeeper Service

An entry in the Globus Gatekeeper's service directory must be created, and the Perl module must be copied to the correct location. The program globus-job-manager-service performs both of these tasks. When run, it expects the scheduler Perl module to be located in the $GLOBUS_LOCATION/setup/globus directory.

$GLOBUS_LOCATION/libexec/globus-job-manager-service -add -m mpich2 -s jobmanager-mpich2

Start a gatekeeper configured to run a job manager using the new scripts. Running this command will output a contact string, which can be used to connect to the new service.

% globus-personal-gatekeeper -start -jmtype mpich2

GRAM contact: vm-rhl8.ibm.com:32778:/O=Grid/OU=GlobusTest/OU=simpleCA-vm-rhl8.ibm.com/OU=ibm.com/CN=globus

A test job can be run with globus-job-run, passing the contact string. For example, to run the default MPICH2 cpi program, which computes the value of pi by numerical integration, on three nodes, run the following:

% globus-job-run vm-rhl8.ibm.com:32778:/O=Grid/OU=GlobusTest/OU=simpleCA-vm-rhl8.ibm.com/OU=ibm.com/CN=globus -np 3 /opt/mpich2/examples/cpi

Process 0 of 3 is on vm-rhl8.ibm.com
Process 1 of 3 is on vm-rhl8.ibm.com
Process 2 of 3 is on vm-rhl8.ibm.com
pi is approximately 3.1415926544231323, Error is 0.0000000008333392
wall clock time = 0.041725

16.4.4.2 WS-GRAM (GRAM3)

1. Start the OGSA container. Verify that the MPICH2 job factories are shown on standard output (see Figure 16.4).

FIGURE 16.4 Standard job factories in GT3.

2. Create a Job Submission RSL file, as shown in Listing 16.10.

LISTING 16.10 MPICH2 WS-GRAM Job Submission RSL File


3. Run the managed-job-globusrun script as shown here:

$ managed-job-globusrun -factory http://192.168.74.130:8080/ogsa/services/base/gram/MasterMPICH2ManagedJobFactoryService -file mpich2.xml -o

Process 0 of 3 is on vm-rhl8.ibm.com
Process 1 of 3 is on vm-rhl8.ibm.com
Process 2 of 3 is on vm-rhl8.ibm.com
pi is approximately 3.1415926544231323, Error is 0.0000000008333392
wall clock time = 0.041725

16.5 MPICH2–MDS3 INTEGRATION

MPICH2 jobs can be easily integrated with MDS by writing an information provider script in combination with the default Globus ScriptExecution provider. The following script demonstrates this technique. The script should be named globus-gram-mpich-provider and placed in your $GLOBUS_LOCATION/etc folder.

16.5.1 An Information Service Script for MPICH2

The following script uses the MPICH2 command mpdlistjobs to query for job information. It then parses the data and emits XML compatible with the GT3 index service. It can be run from the command line for testing purposes (see Listing 16.11).

LISTING 16.11 globus-gram-mpich-provider, an Information Service Script for MPICH2

#!/bin/sh

# MPD Job info command
mpdlistjobs=/opt/mpich2/bin/mpdlistjobs

#
# Emit MPD Job information in XML format
#
emit_JobXML()
{
    # XML namespace
    NS="xmlns:rips=\"http://www.globus.org/namespaces/2003/04/rips\""

    # MPD job info cmd
    data=`$mpdlistjobs`

    if [ -z "$data" ] ; then
        return 1
    fi

    # check if the MPD daemons are running
    echo $data | awk '/mpdlistjobs failed/ {exit 1};'
    if [ $? != 0 ] ; then
        return 1
    fi

    # MPD daemons are running
    # Query Job info
    echo ""
    $mpdlistjobs | awk '{
        NS="xmlns:rips=\"http://www.globus.org/namespaces/2003/04/rips\""
        if ($1 != "") {
            if ($1 == "jobid")
                print "\t"
            else if ($1 == "pgm") {
                print "\t\t"$3""
                print "\t"
            } else
                print "\t\t"$3""
        }
    }'
    echo ""
}

#
# Main
#
emit_JobXML

16.5.2 Configuration and Testing Figures 16.5 and 16.6 show the XML format created by the script in Listing 16.11, as well as the script execution with the Globus service browser.

FIGURE 16.5 MPICH2 MDS information provider XML.

FIGURE 16.6 MPICH2 information provider service browser.

16.6 SUMMARY

MPICH2 is a free implementation of the MPI standard and runs on a wide variety of systems, including the following:

Workstation networks, Beowulf clusters, individual workstations, and SMP nodes running MPMD programs
Grids, through a globus2-enabled implementation of MPI
Symmetric multiprocessors for shared-memory systems
MPPs and others, by means of their own MPI implementations

MPICH2 supports the Fortran 77, Fortran 90, and C/C++ languages. It has been tested on every major OS platform, including many flavors of Unix, Linux, Windows, and others. This chapter has provided a practical overview of MPICH2, including installation, configuration, and testing scripts. Furthermore, practical examples and integration with the Globus Toolkit GRAM and MDS3 protocols were provided.

REFERENCES

[ANLMPI01] ANL Mathematics and Computer Science. The Message Passing Interface (MPI) Standard. http://www-unix.mcs.anl.gov/mpi/.
[BriggsThesis98] Matthew E. Briggs. "An Introduction to the General Number Field Sieve." Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University, Blacksburg, Virginia, April 17, 1998.
[Cohen96] H. Cohen. "A Course in Computational Algebraic Number Theory." GTM 138, Springer, Berlin. Third corrected printing, 1996.
[Lenstra93] A. Lenstra and H. Lenstra (Eds.), "The Development of the Number Field Sieve." Lecture Notes in Mathematics 1554, Springer-Verlag, Berlin, 1993.
[LenstraNFS] A. Lenstra, B. Manasse, and M. Pollard. "The Number Field Sieve." Tidmarsh Cottage, Manor Farm Lane, Tidmarsh, Reading, Berkshire, RG8 8EX, United Kingdom.
[MonicoGNFS] GGNFS. A public domain implementation of GNFS by Chris Monico. http://www.math.ttu.edu/~cmonico/software/ggnfs/index.html.
[Montgomery95] P. Montgomery, "A Block Lanczos Algorithm for Finding Dependencies Over GF(2)." Advances in Cryptology—Eurocrypt '95, Lecture Notes in Computer Science 921, 1995.

[MPICHAdminGuide96] Gropp, Lusk, Ashton, Buntinas, Butler, Chan, and Ross. MPICH2 Installer's Guide Version 0.4. Mathematics and Computer Science Division, Argonne National Laboratory. http://www-unix.mcs.anl.gov/mpi/mpich2/downloads/mpich2-doc-install.pdf.
[MPICHUserGuide96] William D. Gropp and Ewing Lusk. "User's Guide for MPICH, a Portable Implementation of MPI." ANL-96/6, Mathematics and Computer Science Division, Argonne National Laboratory, 1996.
[MPIStandard97] The Message Passing Interface (MPI) Standard. (c) 1995, 1996, 1997 University of Tennessee, Knoxville, Tennessee. Permission to copy without fee all or part of this material is granted provided the University of Tennessee copyright notice and the title of this document appear, and notice is given that copying is by permission of the University of Tennessee. http://www.mpi-forum.org/docs/mpi-11-html/mpi-report.html.
[WS-GRAMGuide03] WS GRAM: Developer's Guide. http://www-unix.globus.org/toolkit/docs/3.2/gram/ws/developer/scheduler.html.

Appendix A
Source Code

The following programs are included on the companion CD-ROM in the directory Code/SupportSoftware/WebSphere/CH11_jCertServices.

Program Name: SimpleCA.java
Requirements: Java CoG Kit 1.1

This program is an implementation of a simple Certificate Authority (CA) for the Java language. It has the following features:

Certificate request generation in Privacy Enhanced Mail (PEM) format
Certificate request signature
Certificate information such as issuer, lifetime, strength, and so on

Creating a Certificate Request

To create a certificate request, the following command can be used:

java org.globus.grid.ca.SimpleCA req -out /tmp/rq.pem -keyput /tmp/key.pem -pwd "mypwd"

Optional arguments are as follows: -dn "O=Grid, OU=Ogsa, OU=IT, CN=John Doe" -bits 1024 (note: O, OU, and CN are case sensitive).

Signing a Certificate Request

To sign a certificate request, run the following command from your OS prompt (Windows/Linux):

java org.globus.grid.ca.SimpleCA ca -rq /tmp/req.pem -out /tmp/cert.pem

For this command to work, a trusted CA certificate must be installed in the user's $HOME/.globus/CA directory, with the names cacert.pem and cakey.pem for the certificate and key, respectively. Optional arguments are as follows: -cacert -cakey -capwd

Getting Information on an X.509 Certificate

To get information about an X.509 certificate, use the following command (Windows/Linux):


jcs x509 -in /tmp/cert.pem -info

Subject: C=US, O=Grid, OU=simpleCA, CN=John Doe
Hash: 945769

A sample output of this command is available on the companion CD-ROM.

Program Name: CertGenerator.java
Requirements: Java CoG Kit 1.1

CertGenerator.java is a program to generate X.509 certificates and private keys using the RSA PKI APIs provided by the Java CoG Kit. It is also capable of creating Globus credentials used by the Grid Security Infrastructure (GSI) for mutual authentication and authorization. The code for creating a certificate request and private key is available on the CD-ROM.

Program Name: CertSigner.java
Requirements: Java CoG Kit 1.1

This is a program to sign X.509 certificate requests; it is used by the simple CA implementation mentioned earlier. It emulates functionality of software such as OpenSSL, and it can easily be modified to work with any type of J2EE application.

Program Name: CertManager.java
Requirements: Java CoG Kit 1.1

This is a program to load, save, and initialize user certificates and Globus proxies.

Program Name: Certrq.jsp

This, and the following programs, form a Web-enabled application that uses most of the subroutines implemented in previous sections within a Web environment. The purpose of these programs is to aid administrators in managing security on a grid within an organization by providing simple Web-based tools for user and host certificate generation and signature.

Program Name: self-sign.jsp

This is a Web tool for self-signed certificate creation.

Program Name: userhost.jsp

This is a Web tool for user or host certificate creation.

Appendix B
About the CD-ROM

The following are included on the companion CD-ROM:

CH05_GridPortletsJetspeed: The portlet projects for Chapter 5, written for the Apache Jetspeed portal server, including resource management (GRAM), remote data transfer (GridFTP), and Information Services
CH05_GridPortletsWPS: Grid portlet applications for the IBM WebSphere portal server
CH08_OGSAFactorizationService: A large integer factorization service for GT 3.2 or later
CH10_jCertServices: Web tool for digital certificate creation and signature
CH10_ProxyCert: A Web application to demonstrate grid proxy creation, compatible with GT 2.x or later
CH11_GRAMClients: Miscellaneous Java clients for resource management under GT3
CH12_GridFTP: Remote data transfer programs
CH13_MDS2Client: Information Services client
CH13_MDS3RemoteSchedulerProvider: An MDS3-based remote cluster information provider; supports OpenPBS, Condor, and Sun Grid Engine
CH14_JavaCoG: Java CoG Kit examples
CH15_WSRFFactorizationService: A large integer factorization service for GT 3.9 Alpha, a precursor to GT4
CH17_MPIExamples: Miscellaneous MPI examples
CH18_MPIParallelSieve_ggnfs302: A parallel sieve implementation used by the Number Field Sieve for large integer factorization; written in C and based on the open source algorithm GGNFS


541 542 Index

economy-based grid architecture, benefits of, MPICH2 MDS information provider GNFS algorithm 40–41. See also grid architectures XML, 534 factor base creation stage of, 495 economy-based resource management, benefits OGSA default hosting environment, 222 finding dependencies stage of, 495 of, 31–32 OGSA enterprise application under polynomial selection state of, 494–495 economy-driven grid resource management, WebSphere studio v5.1.1, 223 sieving stage of, 495 overview of, 32–33 OGSA service browser showing deployed square root stage of, 495 economy/market grid-enabled resource factorization service, 211 See also public domain GNFS management model, overview of, 25 OGSA services container under Apache government, applications of grid computing in, encryption, example of, 193–195 Tomcat, 223 xxvii energy market, applications of grid computing OpenPBS scheduler installed in RHL 9, GPT, scheduler initialization for, 177 in, xxvii 122 GRACE (Grid Architecture for Computational engineering, grids in, 13–14 parallel sieve architecture, 503 Economy), overview of, 34–35 enterprise application, associating shared PKI infrastructure, 233 GRAM (Globus Resource Allocation Manager), library with, 225–227 proxy init test, 260 significance of, 270 enterprise computing systems quadratic sieve factorization service GRAM client for MMJFS, creating, 278–280 example of, 19 architecture, 197 GRAM client programs, using with JSP Portlet factors involved in, 18 querying information from provider, 331 for Jetspeed, 55–58 integrating with grid architectures, 19–23 remote data access portlet for Jetspeed 1.5, GRAM client UML diagram, using with JSP enterprise infrastructures, overview of, 18–19 107 Portlet for Jetspeed, 53–54 enterprises, adopting grid computing on, 26–28 remote data access portlet for WPS, 115 GRAM developer’s guide, Web address for, 177 EPR (endpoint reference), obtaining for WSRF ScriptExecution MDS3 provider, 330 GRAM job requests, sending in GT2, 297–299 LIF service, 
424 Service Data Browser, 323 GRAM portlet for WPS, testing, 98–99 ìThe executable could not be startedî message, SETI@home architecture, 7 GRAM2 protocol client program, using with troubleshooting, 128 shared library location for GSI under JSP Portlet for Jetspeed, 58–64 execution errors, debugging, 132 WebSphere Application Server 5.1, 224 Gram2.java code, 59 shared library location reference to OGSA Gram3Job.java code, 65–68 enterprise application, 225 GRAM3-MJS protocol client program, using F sieve performance data, 519 with JSP Portlet for Jetspeed, 65–77 fabric layer, explanation of, 8 System Information provider, 325 GramJobListener and JobOutputListener factorization client, running for grid services UML diagram of JAVA clients used by events, 296 example, 214 GRAM portlets, 54 GramJobListener interface, implementing, 60 factorization code, obtaining for grid services Web service creation SOAP request and GramPortlet.java program, overview of, 91–95 example, 197–198 responses, 409 grid applications, characteristics of, 6, 8 factorization service interface, building for grid WS-GRAM request workflow, 270 grid architectures services example, 198 WSRF LIF service test output, 428 integrating enterprise computing with, FactorizationImpl.java service implementation, XPBS job scheduling dialog, 125 19–23 199–200 XPBS main GUI, 124 layers of, 8–11 Factory PortType, operation and description files, transferring using multiple protocols, See also economy-based grid architecture of, 185 316–319 grid computing fairshare policy, using with Maui external financial market, applications of grid comput- adopting on enterprise, 26–28 scheduler, 135 ing in, xxvii and business technologies, 46 fault tolerance, achieving in WS-GRAM, forums, list of, 12–13 challenges related to, 28–29 271–272 concepts of, 219–220 Figures market for, xxvi–xxix business areas for future grid computing, G overview of, 5–6 28 GARs (grid archives), creating for grid services in science and 
technology, 12–14 CA certificate installation page, 249 example, 201–202 grid consortiums and open forums, 12–13 certificate request sample output, 249 GASS (Global Access to Secondary Storage) Grid Datafarm Web site, 14 certificate request signature, 250 interface in GT grid information provider, implementing for client-server and P2P architectures, 44 features of, 99–100 MDS2, 363–373 command-line query tool, 324 receiving output from, 282 grid middleware economy-driven grid resource manage- starting, 282 benefits of, 40 ment system, 32 GASS protocol, performing transfer tests with, explanation of, 33 EIU survey of grid computing, 27 395–396 GT (Globus Toolkit), 45–46 GGNFS factorization stages, 502 GASS service, using with JSP Portlet for overview of, 39–41 Globus credential, 260 Jetspeed, 58 P2P, 41–45 GRAM client performance, 291 Gatekeeper service, registering, 529 Grid Resource Broker, explanation of, 33 GRAM portlet for WPS, 99 GEM (Globus Executable Management) grid services GRAM portlet in Jetspeed 1.5, 88 interface in GT, features of, 100 tools for building of, 202–203 grid computing impact on enterprise, 36 GGF (Global Grid Forum) Web site, 12 versus Web Services, 182–184 grid-computing market perspective for GGNFS, description of, 496 grid services example 2003-2008, xxvi GGNFS sieve algorithm, 497–501 Ant build script for, 203–210 grid-enabled application, xxv GIIS, role in MDS2, 363 building factorization service interface in, grid-enabled application framework, 10 global reduction, relationship to MPI, 451–453 198 GSI mutual authentication, 235 Globus credential, creating, 252–259 creating schema and GARs for, 201–202 GT3 core architecture, 221 Globus grid middleware, features of, 40 creating service deployment descriptor for, IDC Software Market Report for 2003, Globus providers, versus Remote Cluster 200–201 xxix Information Provider, 330–331 creating service implementation for, information services model within VO, Globus proxy, creating, 239–240 
198–200 363 GNFS (General Number Field Sieve) deployinig to GT3 container, 210 JCS Web-tool main menu, 249 avoiding interleaved messages in, 515–518 factorization with QS (quadratic sieve) in, MDS provider architecture, 331 compilation and runtime data in, 519 196–197 MDS provider UML diagram, 333 LIF execution times in, 501–502 mathematical background for, 193–196 MMJFS client portlet for Jetspeed 1.5, 88 and MPICH2, 494 obtaining factorization code for, 197–198 MPICH2 information provider service and parallel sieve implementation in MPI, overview of, 193 browser, 534 502–515 running factorization client for, 214 Index 543

service architecture in, 197 explanation of, 220 IT organization competitiveness, building, 30 service implementation in, 196 WebSphere Application Server v5.1, using Ant tools with, 210 222–227 writing service client for, 211–214 GT2 J grid technologies defining job submission arguments for, Java, compatibility with CoGs, 382 evolution of, 3–5 86–88 Java CoG statistics related to, 27 handling received output in, 295–297 configuration file for, 385 GRIDBUS Java program for job submission in, and data management, 393–395 features of, 46 292–294 data transfer troubleshooting in, 396–397 Web site for, 46 job submission test in, 299–300 generating proxies and user certificates in, grid-enabled checklist, foundation of, xxv–xvi jobs on, 291 386–389 grid-enabled resource management models listening to job output in, 294–295 installation requirements for and configu- AO (abstract owner) model, 24–25 and RSL (Resource Specification Lan- ration of, 383 economy/market model, 25 guage), 292 security libraries for, 385–386 hierarchical model, 24 security infrastructure of, 292 security of, 384–385 GridFTP interface in GT, features of, 100 sending GRAM job request in, 297–299 services for, 383–384 GridFTP protocol submission modes in, 291 transfer tests for, 395–396 and converting legacy proxies to GSS WPS portlet for, 91–95 user and self-signed certificates in, credentials, 307 GT3 389–393 overview of, 304–305 configuring local certificates for testing See also CoGs (Commodity Grid Kits) parallel transfers with, 309–311 with, 259 JCS (Java Certificate Services) relationship to GT, 192 creating certificates and private keys with, accessing Web-based tool in, 248 testing remote file transfer with, 311–314 260–267 creating binary distribution from source transferring data with, 307–309 creating proxies from default certificates for, 251–252 transferring files across multiple machines from Web application with, 252–259 creating certificate request from command with, 393–395 
defining job submission arguments for, line with, 251 transferring multiple files with, 309 86–88 creating certificate request in, 248–249 troubleshooting, 315–316 WPS portlet for, 91–95 creating self-signed CA and private key in, GridFTP servers, connecting to, 305–307 GT3 clients, listening for job status updates and 250 grid-info-resource.ldif.conf file, modifying in job output with, 280–281 deploying Web-based tool on Tomcat in, MDS2, 366–367 GT3 container, deploying grid service to, 210 248 GridPort API, explanation of, 399 GT4 and WSRF, overview of, 411–413 functionality of, 247 GridService PortType, operation and descrip- getting information on X.509 certificates tion of, 184–185 with, 251 GridSim Web site, 14 H installing on Windows or Linux host, 248 GriPhyN (Grid Physics Network) Web site, 14 HandleMap PortType, operation and descrip- signing certificate request from command GRIS, role in MDS2, 362–363 tion of, 185 line with, 251 GSHs (grid services handles) handles and references in OGSA, significance signing certificate request in, 250 explanation of, 183 of, 187 using command-line based tools with, 250 purpose of, 187 hierarchical management model, explanation Jetspeed GSI (Grid Security Infrastructure) of, 34 features of, 53 overview of, 232 hierarchical model grid-enabled resource remote data access portlet for, 101–107 role in GT core, 221 management model, overview of, 24 using, 52 role of certificates in, 232–233 high-performance computing, roots of, 3–4 job arguments, describing in echo.xml, 74–75 GSI proxies, relationship to certificates, hosting environments ìJob exceeds queue resource limitsî message, 235–236 for GT core, 220–221 meaning of, 130 GSI security libraries, creating shared library for OGSA, 189–190 job execution, MMJFS/SGE RSL script for, location for, 224 HTC (high throughput computing), relation- 167–170 GT (Globus Toolkit) ship to Condor-G, 143 ìJob failedî errors, troubleshooting, 130–131 components of, 220 HUE (User Hosting 
Environment), refreshing, job manager script, creating for MMJFS features of, 45–46 129 integration with SGE, 148–161 GASS (Global Access to Secondary hybrid decentralized P2P architecture, descrip- job output Storage) interface in, 99–100 tion of, 43 listening for, 58 GEM (Globus Executable Management), listening for with GT3 clients, 280–281 100 job prioritization policy, using with Maui I external scheduler, 136 GridFTP interface in, 100 IBM Portlet API and OGSA (Open Grid Services Architec- job schedulers deployment descriptor for, 89–90 integrating MMJFS with, 147–148 ture), 191–193 and portlet life cycle, 90–91 Remote I/O interface in, 100 versus job managers, 121 Index Service, default data providers available job status, polling for, 158–160 scheduler interface in, 179 to, 324 Web site for, 45 job status updates, listening for with GT3 information provider program clients, 280–281 GT 3.2 container, starting for MMJFS integra- installing in MDS2, 371–373 tion with SGE, 165–170 job submission running in MDS2, 369–370 implementing on GT2, 292–294 GT 3.2 core data providers, overview of, writing in MDS2, 367–369 326–327 testing in GT2, 299–300 information services job submission arguments, defining for GT2 GT 3.2 deployment descriptors, updating for in GT, 220 MMJFS integration with SGE, 161–165 and GT3, 86–88 in WSRF versus OGSA, 413 job submission portlet for WPS, description of, GT 3.2, generic scheduler script in, 166–170 integer factorization example. 
See grid services GT 3.2 PBS job scheduler, configuring, 91 example job type, relationship to PBS, 133–134 125–126 Internet, evolution of, 5 GT core jobs, canceling, 161 Internet computing, overview of, 6–7 jobs in pending state errors, troubleshooting in architecture of, 220–221 interoperability, addressing, 11 hosting environments for OGSA, overview MMJFS integration with SGE, 172–173 IntraCommunicators versus InterCommunica- JSP portlet for Jetspeed of, 221 tors in MPI, 458 GT core hosting environments files required for, 53 isolation ACID property, relationship to WS- generating RSL and job XML for, 77–82 Apache Tomcat, 221 Resource, 403 544 Index

GRAM client programs for, 55–58 globus-gram-mpich-provider information querying for service information using GRAM client UML diagram for, 53–54 service script for MPICH2, 532–533 MDS3 command-line tools, 326 GRAM2 protocol client program for, GRAM request in GT2, 297–299 remote data access portlet controller for 58–64 GRAM2 job submission, 292–294 WPS, 110–115 GRAM3-MJS protocol client program for, GRAM2 submission test, 300 remote data access portlet descriptor, 65–77 Gram2.java, 59 106–107 and portlet view JSP, 82–88 Gram3Job.java, 65–68 remote data access portlet for Jetspeed, See also Apache Jetspeed; portlets GramPortlet.java, 91–95 101–106 GridFTP transfer, 307–309 remote data access view JSP for WPS, GridFTP.java, 305–307 107–110 L grid-info-resource.ldif.conf file, 366 RSL script generation using Velocity and large integer factorization example. See grid grid-info-slapd.conf MDS-LDAP daemon sample output, 81 services example configuration file, 364 scheduler initialization for GPT, 177 LDAP (Lightweight Directory Access Protocol), group and communicators in MPI, scheduler interface test script, 174–175 configuring for MDS2, 364 456–458 self-signed certificate in CoG, 390 LDAP schema, creating in MDS2, 364–366 GT2 GRAM request output, 300 self-signed user certificate and private key, Legion of the Bouncy Castle Web site, 247 index service configuration file, 361 237–238 LIF (large integer factorization), resources for, interleaved message in MPI_Recv fixed, server deployment descriptors configura- 496 518 tion for MMJFS and MPICH2, 525–528 life sciences market, applications of grid Jetspeed GRAM portlet JSP, 82–83 server script to emit system information computing in, xxvii Jetspeed GRAM portlet registry descriptor, for MDS3, 328–329 lifetime management in OGSA, overview of, 82 service implementation in WSRF LIF, 187 job description input hash to submit 421–423 lightweight server processes, relationship to method, 149 SGE install failure log, 140–141 pooled 
resources, 22 job status and output callbacks, 281 SGE scheduler interface initialization Linux host, installing JCS on, 248 large integer factorization OGSA service segment, 148–149 Listings deployment descriptor, 200–201 shell script to build OGSA service, AbstractGRAMJob.java, 55–58 large integer factorization WSDL file, 202–203 build.xml Ant script for building OGSA 414–417 SSHInformationProvider.java SSH1-2 services, 203–210 LDAP schema for publishing shared file client API, 343–349 cancel method for SGE, 161 information, 365 submit method for SGE scheduler, certificate encoded in PEM format, loading Velocity template, 79–80 150–158 391–392 MDS information provider shell script submit method job description file, 175 certificate encoding utility methods, program, 367–369 testing remote file transfer with GridFTP, 242–244 MDS2 client sample output, 377–379 311–312 certificate generator test function, 244–245 MDSService.java custom client for MDS2, UHE job manager log, 129–130 certificate request, 241–242 373–376 URLCopy.java program for transferring CertificateGenerator.java, 236–237 message interleave, 516–517 data with multiple protocols, 316–319 certkey.jsp JSP file for creating certificate MMJFS GASS server startup, 286–287 Velocity class path resource loader and private key, 260–267 MMJFS GRAM job submission, 282–286 initialization, 80–81 ClusterInformationProvider.java MDS3 MMJFS/SGE RSL script for job execution, Velocity template to generate RSL script, provider for remote clusters using SSH, 167–170 78–79 334–342 mpi_eratosthenes.c, 468–471 Web Service client program form WSRF cluster.xml MDS3 Remote Cluster mpi_sieve.c parallel sieve program for LIF service, 424–427 Information Provider configuration file, GGNFS, 503–515 WPS GRAM portlet view JSP, 95–98 332 mpi_sparsemat.cpp sparse matrix multi- WPS portlet development descriptor, 90 cog.properties sample client GSI configu- plication, 472–480 WS-Addressing endpoint reference, 408 ration file, 315–316 
MPICH2 WS-GRAM job submission RSL WSRF stateful resource implementation, cog.properties sample file, 259 file, 530–532 417–420 command-line parsing and main subrou- mpich2.pm scheduler interface, 520–521 local scheduler, explanation of, 33 tines, 287–290 mpich2.pm scheduler interface cancel log4j package CondorOutputParser.java, 358–360 method, 524 debugging and monitoring FTP transfers creating proxy certificate using default mpich2.pm scheduler interface poll with, 312 values, 388 method, 523 obtaining, 306 credential created from PEM encoded file, mpich2.pm scheduler interface submit London e-Science Centre Web site, 148 240–241 method, 521–522 loosely structured P2P architecture, description CustomGRAMClient.java, 278–280 OGSA deployment descriptor for SGE, of, 45 DEBUG enabled for MMJFS job manager, 161–165 128–129 output events fired through GASS derived data types in MPI, 446–447 protocol in GT2, 296–297 M displaying default certificate information, OutputParser.java for Remote Cluster managed-job-globusrun job failure errors, 389 Information Provider, 349–352 troubleshooting for MMJFS integration echo RSL request in GT2, 272 Pack/Unpack in MPI, 448 with SGE, 170–172 echo RSL request in GT3, 272–274 parallel transfer with GridFTP, 310–311 Manhattan Project, significance of, 3–4 FactorizationClient.java, 211–214 PBSOutputParser.java, 352–358 manufacturing market, applications of grid Factorization.java OSGA service interface point-to-point communication in MPI, computing in, xxvii for large integer factorization, 198 438 market-segmentation table, xxvii files transferred across multiple machines poll method for SGE scheduler, 158–160 Maui external scheduler using GridFTP, 393–395 poll method job description file, 176 development of, 134 GASS server started for output transfer, Probe and Cancel techniques in MPI, policies used with, 134–136 294–295 444–445 See also Condor-G; schedulers generating proxy certificate in Java CoG, provider installation script in 
MDS2, MDS constructor, calling, 376–377 386–388 371–373 MDS provider architecture, illustration of, 331 global reduction in MPI, 452–453 proxy.jsp JSP file for creating Globus MDS2 Globus proxy, 239–240 credential, 252–259 architecture of, 362–363 calling MDS constructor in, 376–377 Index 545

creating custom LDAP schema in, 364–366 collective communication in, 448–455 and VOs (virtual organizations), 190–191 implementing grid information provider data type matching and conversion in, OGSA container, deploying provider to, for, 363–373 441–442 360–361 installing information provider in, derived data types in, 446–447 OGSA features, security of, 68–77 371–373 and message data type conversions, OGSA Service Browser GUI, testing SystemIn- modifying grid-info-resource.ldif.conf file 439–440 formation service data provider with, in, 366–367 and message envelopes, 440–441 325–326 opening connections and searching in, 377 modes of, 442 OGSA service model, overview of, 184 overview of, 362 nonblocking calls in, 443–444 OGSA services container, creating Web project running information provider program in, overview of, 437–439 and enterprise application for, 223–224 369–370 pack and unpack functions in, 447–448 OGSA/WSRF comparison matrix, 413 translating search results to XML string in, Probe and Cancel techniques in, 444–445 OGSI (Open Grid Services Infrastructure) 377–379 MPI-2 evolution of, 403–404 troubleshooting, 373 deprecated names and functions in, versus WSRF, 404–405 writing information provider program in, 462–463 opaque objects in MPI, overview of, 435–436 367–369 file views in, 466 OpenPBS job manager, customizing, 126–127 MDS2 clients, customizing, 373–376 memory allocation in, 464 OpenPBS scheduler MDS3 predefined data types in, 464–465 components of, 121–122 aggregation in, 322 prime generation in, 467–471 configuring on single machine cluster, data collection in, 322 process management and resource control 123–124 information model of, 322 in, 464 overview of, 121–122 queries in, 323 process model for, 463–464 in RHL (Red Hat Linux) 9, 122–123 security of, 324 and sieve of Eratosthenes, 467–471 See also PBS GUIs user interfaces in, 323 MPICH2 OSGA interfaces, overview of, 184–185 MDS3 command-line tools, querying for and GNFS (General Number 
Field Sieve), service information with, 326 494 MDS3 data providers information service script for, 532–533 P GT 3.2 core data providers, 326–327 integrating with MDS3, 532–534 P2P (peer-to-peer) provider interfaces, 327–328 integration with WS-GRAM, 519–532 advantages and disadvantages of, 41–42 See also data providers language bindings for, 490 architectures of, 43–45 MDS3 information providers, creating, MMJFS service configuration for, 525–528 versus client-server model, 42–43 328–330. See also Remote Cluster Informa- overview of, 490 definition of, 41 tion Provider packaging advice for, 528 overview of, 41 message truncation error, example of, 517–518 quick start installation transcript for, P2P model, overview of, 7–8 Meta Scheduler, explanation of, 33 490–494 P2Pwg (Peer-to-Peer Working Group) Web MMJFS (Master Managed Job Factory Service) scheduler interface for, 519–524 site, 13 integrating with job schedulers, 147–148 testing and troubleshooting, 529–532 parallel sieve architecture, overview of, 502 integrating with SGE, 148–170 MPI-IO parametric application, explanation of, 33 testing for GT 3.2 PBS job scheduler, development of, 465–466 partially centralized P2P architecture, descrip- 125–126 parallel read example in, 483–486 tion of, 44–45 MMJFS configuration in WS-GRAM parallel write example in, 480–483 PBS (portable batch system). See OpenPBS overview of, 274–275 mutual authentication, relationship to certifi- scheduler troubleshooting, 275–278 cates, 235–236 PBS GUIs, tweaking for RHL 9, 124–125. 
See MMJFS execution process, analyzing, 128 also OpenPBS scheduler MMJFS GASS server, starting up, 286–287 PBS MMJFS runtime errors, troubleshooting, MMJFS GRAM job submission, 282–286 N 127 MMJFS integration with SGE, troubleshooting, Napster P2P distributed model, significance of, PBS output parser program Remote Cluster 170–176 7 Information Provider, 352–358 MMJFS job manager, enabling DEBUG for, network protocol bindings, relationship to PBS scheduler execution parameters, customiz- 128–129 OGSA, 188 ing with RSL, 131–134 MMJFS performance, relationship to WS- NFS (number field sieve), resources for, 496 PBS scheduler, starting test for submit method GRAM, 290 node allocation policy, using with Maui of, 175–176 MMJFS scheduler job, testing, 170 external scheduler, 136 PEM encoded files, creating credentials from, MMJFS service configuration for MPICH2, nonlocal communication mode in MPI, 240–241 525–528 explanation of, 442 PEM format, encoding certificates in, 391–392 Monte Carlo method, significance of, 4 notifications in OGSA, overview of, 188 PERL MPI (Message Passing Interface) Notification-Sink PortType, operation and compatibility with CoGs, 383 communicator examples in, 456–458 description of, 185 and data management, 399 communicators in, 455 Notification-Source PortType, operation and and resource management, 398 data types in, 435–436 description of, 185 PERL script, creating for integration of MMJFS error handling in, 437 with SGE, 148–161 Petascale Data-Intensive Computing Web site, groups in, 458 O IntraCommunicators versus InterCommu- 14 OGSA (Open Grid Services Architecture) PKI model, overview of, 233–234 nicators in, 458 criticism directed at, 40 overview of, 434–435 policies disadvantages of, 186 for Maui external scheduler, 134–136 parallel sieve implementation in, 502–515 and GT (Globus Toolkit), 191–193 platform independence of, 437 for SGE, 138 handles and references in, 187 poll method, using with MPICH2 scheduler predefined 
communicators in, 456 and higher-level services, 189 procedures and arguments in, 435 interface, 522–523 hosting environments for, 189–190 poll_job_description.16898 file, 176 process topologies in, 458–459 and lifetime management, 187 processes in, 436 pooled resources, relationship to lightweight and network protocol bindings, 188 server processes, 22 virtual topologies in, 459 notifications in, 188 MPI C++ implementation of sparse matrix portals, importance and advantages of, 52 overview of, 25–29 portlet program, overview of, 91–95 multiplication, 471–480 service factories in, 186 MPI point-to-point communication portlet registry descriptor, example of, 53 service information and discovery in, portlet view JSP buffering in, 442–443 187–188 546 Index

developing, 82–88 for IBM WPS, 107–115 server output, listening for, 58 overview of, 95–98 for Jetspeed, 101–107 service, definition of, 21 portlets, definition of, 89. See also JSP portlet Remote I/O interface in GT, features of, 100 service address locator, obtaining for WSRF for Jetspeed reservations policy, using with Maui external LIF service, 424 portlet.xml portlet descriptor, using in IBM scheduler, 135 Service Browser GUI, enabling ServiceDat- Portlet API, 89 resource layer, explanation of, 9 aProvider and DataAggregation in, PPDG (Particle Physics DataGrid) Web site, 14 resource management 324–325 protocols in GT, 220 service client, writing grid service for, 211–214 relationship to PKI model, 233–234 and PERL, 398 service deployment descriptor, creating for grid transferring files with, 316–319 resource management models, overview services example, 200–201 provider interfaces, relationship to MDS3 data of, 52 service factories in OGSA, overview of, 186 providers, 327–328 in WSRF versus OGSA, 413 service not found errors, troubleshooting for proxies resource management models, overview of, MMJFS integration with SGE, 170 creating, 239–240 33–34 ServiceDataProvider, enabling in Service generating in Java CoG, 386–389 Resource Modeling and Simulation Web site, Browser GUI, 324–325 relationship to certificates, 235–236 14 SETI (Search for Extraterrestrial Intelligence), public domain GNFS, overview of, 496. 
…See also GNFS algorithm
public-key cryptosystems, use of large integer factorization in, 193–194
pure decentralized P2P architecture, description of, 43–44
Python (PyGlobus)
  compatibility with CoGs, 382
  overview of, 397–398

Q
QoS (quality of service) policy, using with Maui external scheduler, 136
QS (Quadratic Sieve) factorization, implementing in grid services example, 196–197

R
RA (revocation authority), explanation of, 233
ready communication mode in MPI, explanation of, 442
References, xxix
  for CoGs (Commodity Grid) Kits, 400
  for data management, 320
  for enterprise computing, 36–37
  for grid middleware, 47
  for grid portal development, 117
  for GT core, 230
  for high-performance computing, 15–16
  for information services, 379–380
  for MPI (Message Passing Interface), 459–460
  for MPI-2, 487
  for MPICH2, 535–536
  for OGSA (Open Grid Services Architecture), 215–216
  for resource management, 301
  for schedulers, 180
  for security, 268
  for WSRF (Web Services Resource Framework), 429–430
  See also Web sites
references and handles in OGSA, significance of, 187
Registry PortType, operation and description of, 185
Remote Cluster Information Provider
  architecture of, 331
  Condor output parser for, 358–360
  configuration file for, 332
  versus Globus providers, 330–331
  implementing, 334–342
  integrating with other schedulers, 360
  parsing output for, 349–352
  PBS output parser program for, 352–358
  provider installation for, 360–361
  SSH provider API for, 343–349
  UML class diagram for, 332–333
  See also MDS3 information providers
remote data access portlets
resource pool
  definition of, 22
  significance of, 5, 6
REST (representational state transfer), explanation of, 186
RFT (Reliable File Transfer) protocol, relationship to GT, 192
RHL (Red Hat Linux) 9
  OpenPBS in, 122–123
  tweaking PBS GUIs for, 124–125
RLS (Replica Location Service) protocol, relationship to GT, 192
RSL (Resource Specification Language)
  customizing PBS scheduler execution parameters with, 131–134
  and GT2, 292
  and Web Services GRAM architecture, 272–274
RSL parsing errors, troubleshooting, 127
RSL scripts, generating for GRAM clients, 77–82
RSL submission, implementing, 282–286
RSL submission script, adjusting parameters in, 166–170
runtime errors in GridFTP, troubleshooting, 315

S
scheduler interface
  creating for MPICH2, 519–524
  packaging advice for, 176–177
scheduler interface script (test_scheduler.sh), 174–176
schedulers
  choosing, 178–179
  features of, 119–120
  integrating Remote Cluster Information Provider with, 360
  job schedulers versus job managers, 121
  setting up in WS-GRAM, 275
  See also Condor-G; Maui external scheduler
science, grids in, 13–14
security
  enabling in WSRF LIF service, 428
  in GT (Globus Toolkit), 220
  of GT2, 292
  of Java CoG, 384–385
  in MDS3, 324
  setting authorization, 68–77
  in Web Services GRAM architecture, 287–290
security errors in WebSphere, troubleshooting, 228–229
security provider problems, troubleshooting, 267
self-signed user certificates
  in Java CoG, 389–393
  and private keys, 237–238
  See also certificates; user certificates
sequential application, explanation of, 33
SGE (Sun Grid Engine)
  architecture of, 137–138
  description of, 137
  integrating MMJFS with, 148–170
  OGSA deployment descriptor for, 161–165
  workload management capabilities of, 137
SGE installation transcript
  example of, 138–140
  troubleshooting, 140–141
  verifying installation of, 142–143
SGE scheduler
  cancel method for, 161
  poll method for, 150–158
  submit method for, 150–158
SGE/MMJFS service factory description XML, updating deployment descriptors for, 161–165
shared services, overview of, 21
sieve algorithm from GGNFS, 496–501
sieve of Eratosthenes, relationship to MPI-2, 467–471
SimpleDataProvider, example of, 327
SOAP (Simple Object Access Protocol), explanation of, 182
sparse matrix multiplication, MPI C++ implementation of, 472–480
SSH, using with Remote Cluster Information Provider, 331
standard communication mode in MPI, explanation of, 442
stateless business components, overview of, 21
statistics
  on market data for 2002, xxviii
  on use of grid technology, 27
structured P2P architecture, description of, 45
submit method
  for MPICH2 scheduler interface, 521–522
  overloading for integration of MMJFS with SGE, 149–158
  security of, 69–74
  for SGE scheduler, 150–158
  testing for PBS scheduler, 175–176
  using with GRAM2 Job, 61
submit_job_description.10465 file, 175
Supercluster Research Group Web site, 134
synchronous communication mode in MPI, explanation of, 442
SystemInformation service data provider, testing via OGSA Service Browser GUI, 325–326

T
test_scheduler.sh script, 174–175
throttling policy, using with Maui external scheduler, 136
timesharing, development of, 4
TM (Trade Manager), role in GRACE, 35
Tomcat
  hosting environment for, 222
  Web site for, 248
transactions, relationship to collisions, 23
troubleshooting
  certificate not yet valid errors, 115
  data transfer in Java CoG, 396–397
  defective credential errors, 116
  "The executable could not be started" message, 128–130
  GridFTP protocol, 315–316
  "Job failed" errors returned by MMJFS client, 130–131
  jobs in pending state in MMJFS integration with SGE, 172–173
  managed-job-globusrun job failure errors in MMJFS integration with SGE, 170–172
  MDS2, 373
  MMJFS configuration in WS-GRAM, 275–278
  MPICH2, 529–532
  PBS MMJFS runtime errors, 127
  scheduler interface script, 173–176
  security provider problems, 267
  service not found errors in MMJFS integration with SGE, 170
  SGE installation transcript, 140–141
  WebSphere Application Server v5.1, 227–229

U
UDDI (Universal Description, Discovery and Integration), explanation of, 183
UHE propagation, checking for MMJFS integration with SGE, 166
UML (Unified Modeling Language) diagram
  of Java clients used by GRAM portlets, 54
  for Remote Cluster Information Provider, 332–333
UNIX systems, running MPICH on, 490–494
unknown policy errors, troubleshooting in MMJFS configuration in WS-GRAM, 276
unstructured P2P architecture, description of, 45
URLCopy.java program for transferring data with multiple protocols, 316–319
user applications, explanation of, 33
user authorization, implementing in COD, 12
user certificates
  creating credentials from, 238–241
  generating in Java CoG, 386–393
  See also certificates; self-signed user certificates

V
Velocity template engine, generating RSL scripts and XML with, 77–82
VOs (virtual organizations)
  information services model in, 363
  and OGSA, 190–191
  overview of, 10–11

W
Web Services GRAM architecture
  fault tolerance in, 271–272
  integrating MPICH2 with, 519–532
  interfacing MPI with, 529–532
  MMJFS configuration in, 274–275
  and MMJFS performance, 290
  overview of, 270–271
  and RSL (Resource Specification Language), 272–274
  security in, 287–290
Web Services versus grid services, 182–184
Web sites
  Apache log4j package, 306
  Apache Tomcat, 248
  ApGrid (Asia Pacific Grid), 13
  Cryptix, 247
  DataGrid Project, 13–14
  GGF (Global Grid Forum), 12
  GRAM developer's guide, 177
  Grid Datafarm, 14
  GRIDBUS, 46
  GridSim, 14
  GriPhyN (Grid Physics Network), 14
  GT (Globus Toolkit), 45–46
  Legion of the Bouncy Castle, 247
  London e-Science Centre, 148
  P2Pwg (Peer-to-Peer Working Group), 13
  PPDG (Particle Physics DataGrid), 14
  Supercluster Research Group, 134
  Velocity template engine, 77
  See also References
WebSphere Application Server v5.1
  overview of, 222–227
  troubleshooting, 227–229
Windows host, installing JCS on, 248
workload management
  overview of, 20–21
  performing with SGE, 137
work-unit, definition of, 6
WPS (WebSphere Portal Server) v5.0
  features of, 89
  job submission portlet for, 91
  remote data access portlet for, 107–115
  using, 52
WS Authentication Authorization in GT4, overview of, 411–412
WS Core in GT4, overview of, 412
WS information services. See MDS3
WS-Addressing WSRF normative specification, overview of, 408
WS-BaseFaults WSRF normative specification, overview of, 410
WSDL (Web Services Description Language), explanation of, 182
WSDL file, creating for WSRF LIF service, 414–417
WS-GRAM. See Web Services GRAM architecture
WS-Notification WSRF normative specification, overview of, 410–411
WS-Resource, relationship to WSRF, 402–403
WS-Resource Lifecycle WSRF normative specification, overview of, 409–410
WS-Resource Security WSRF normative specification, overview of, 411
WS-ResourceProperties WSRF normative specification
  accessing property values in, 407–408
  properties document in, 406–407
  property composition in, 407
WSRF (Web Services Resource Framework)
  and GT4, 411–413
  and OGSI, 403–405
  overview of, 29, 402
  and stateful resources, 403
  stateless and stateful services in, 402
  WS-Resources in, 402–403
WSRF LIF (large integer factorization) service
  building and deployment of, 428
  enabling security in, 428
  resource implementation in, 417–420
  runtime test for, 428
  service client in, 423–427
  service implementation in, 421–423
  WSDL file for, 414–417
WSRF normative specifications
  WS-Addressing, 408
  WS-BaseFaults, 410
  WS-Notification, 410–411
  WS-Resource Lifecycle, 409–410
  WS-Resource Security, 411
  WS-ResourceProperties, 406–408
  WS-ServiceGroup, 410
  WSRF/OGSA comparison matrix, 413
WS-ServiceGroup WSRF normative specification, overview of, 410

X
X.509 certificates
  getting information about in JCS, 251
  overview of, 232–233
  using with Java CoG, 384
XIO, relationship to GT, 192
XML, generating for GRAM clients, 77–82
XML security, handling, 68–69, 287
xreg portlet registry descriptor code, 82