Building Computational Grids with Apple’s Xgrid Middleware

Baden Hughes
Department of Computer Science and Software Engineering
The University of Melbourne, Parkville VIC 3010, Australia
Email: [email protected]

Abstract

Apple’s release of the Xgrid framework for distributed computing introduces a new technology solution for loosely coupled distributed computation. In this paper we systematically describe, compare and evaluate the Apple native Xgrid solution and a range of third party components which can be substituted for the native versions. This description and evaluation is grounded in practical experience of deploying a small scale, internationally distributed, heterogeneous computational infrastructure based on the Xgrid framework.

Keywords: Xgrid, Apple, computational grid, middleware

Copyright © 2005, Australian Computer Society, Inc. This paper appeared at Twenty-Ninth Australasian Computer Science Conference (ACSC2005), Hobart, Australia. Conferences in Research and Practice in Information Technology, Vol. 54. Rajkumar Buyya and Tianchi Ma, Ed. Reproduction for academic, not-for profit purposes permitted provided this text is included.

1 Introduction

The release in 2005 of Apple’s Xgrid framework for grid computing introduces a new technology solution for loosely coupled, distributed computation. Xgrid has been widely promoted as an extremely usable solution for less technical user communities and challenges the systems management paradigm incumbent in many computational grid solutions currently deployed. As such, the uptake of grid computing by ad hoc groups of researchers with non-dedicated infrastructure using the Xgrid framework has been significant, both in numerical terms and in terms of the visibility of the solution. A particular point to note is that the simple Xgrid framework has the potential to fundamentally change the delineation between grid users and grid maintainers, and as such to promote new types of research enabled by a self-sustaining model for managing computational grid infrastructure.

For researchers in the grid computing space and for systems managers of production grid facilities, Xgrid has often been viewed as a toy solution. This paper seeks to counter this perception in two ways: first by describing the Xgrid architecture and its components in detail, and secondly by adopting an analysis model more prevalent in the grid computing domain. A notable point here is that this paper does not simply cover the native Xgrid components distributed by Apple but also considers a range of third party components which can be used to extend the Xgrid framework in directions more amenable to the types of production environments currently occupied by solutions such as the widely used Globus toolkit.

The structure of this paper is as follows. First we consider the overall positioning of the Apple Xgrid solution, and its high level systems architecture. Next we review in depth the Xgrid architecture and components officially distributed by Apple in Xgrid 1.0. Following this we review a range of interoperable third party components that can be used to complement or replace the proprietary Apple components under certain circumstances. We report experience in using a heterogeneous Xgrid to perform some experiments in natural language processing, and report on a range of other production uses of Xgrid based on published papers and user group surveys. Finally we conduct an evaluation of the overall strengths and weaknesses of the Xgrid solution, consider the niche(s) into which Xgrid based solutions may effectively be deployed, and offer some concluding thoughts.

It is worth stating upfront that this paper specifically seeks neither to conduct empirical performance comparisons between Xgrid and alternative grid computing solutions nor to report the results of a specific scientific experiment which is enabled by Xgrid-based infrastructure. Rather the purpose of this paper is to describe, and where relevant compare and evaluate, the overall Xgrid architecture and its components from a functional perspective.

2 A Brief History of Xgrid

Xgrid was first introduced by Apple in January 2004 as a Technology Preview (TP1). Xgrid TP1 was considered a proof of concept, and was not designed for production applications owing to reliability, security and scalability issues. Rather it was designed to draw feedback from early adopters as to the viability of an Apple grid computing product.

Xgrid Technology Preview 2 (TP2) was released in November 2004, retaining most of the functionality of TP1, but with some underlying CLI and data format changes. This was a very widely adopted release: Xgrid based computational environments were deployed for production use, and third party components began to emerge.

Xgrid 1.0 was released with Mac OS X 10.4 ‘Tiger’ in April 2005. Version 1.0 introduced a significant number of changes. Perhaps controversially, Xgrid 1.0 included a dependency on Mac OS X Server (TP1 and TP2 did not require a server grade operating system), which allowed Apple to leverage the significant investment it had made in Mac OS X Tiger Server in the areas of scalability, single sign on, job specific authorization, server local and remote administration, server grade documentation, and the inclusion of a GUI based interaction model (TP1 and TP2 only had a CLI).

3 Solution Architecture

The Apple Xgrid architecture is a standard three tier architecture consisting of a Client, Controller and Agent. We will review each of these tiers in turn. The Controller, Agent and Client can all exist on a single machine, although in practice these are more typically distributed.

3.1 Client

An Xgrid client provides the user interface to an Xgrid system. The client is responsible for finding a suitable controller, submitting jobs, and retrieving the results. Clients can rely on the controller to mediate all job submissions; they do not need to be aware of the actual job execution schedule across available computational agents. It is useful to note that an Xgrid client is detachable from the network even while jobs are being executed - completed jobs are retrieved by the client from the controller once network connectivity is re-established.

3.2 Controller

The controller, representing the middle tier, is the centre of the Xgrid framework. A controller typically runs on a dedicated system (like a cluster head node). The controller handles receiving jobs from clients, dividing them into tasks to execute on various agents, and collecting and returning the results.

3.3 Agent

The final tier in the Xgrid solution is the Agent. Typically there is one agent per compute node, not dissimilar to other grid computing frameworks, although natively Xgrid agents on dual-CPU nodes default to accepting one task per CPU (similar policies are often implemented in a cluster LRMS). As with other computational grid solutions, Xgrid allows for a range of agent types: full-time agents, part-time (cycle stealing) agents, and remote agents (for distributed computational grids).

4 Xgrid Systems Configuration and Management

The client, controller and agent software ships standard with Apple’s Mac OS X 10.4 (Tiger) operating system. As such, the systems management task is largely configuration oriented rather than installation oriented. A simplified configuration can be set up in less than 5 minutes, significantly reducing the barrier to entry for less technical users.

[...] authentication prevalent in other computational grid middleware suites. Service control is also enabled via the Server Admin environment - in addition, there is a command line control interface for Xgrid, which allows service management (with the exception of authentication policies) from a shell environment.

The discovery mechanism in Xgrid is that both clients and agents natively search for a controller. The controller is the only Xgrid component that requires an open TCP port for such discovery requests. All communications between clients, controllers and agents can be strongly encrypted over the network, ensuring in-transit security across segregated administrative domains.

Xgrid can automatically discover available resources on a local network via a number of Apple services including ZeroConf, Rendezvous or Bonjour. Discovery is recursive within a domain or subdomain; alternatively, Xgrid configurations can be created by manually entering IP addresses or hostnames.

5 Xgrid Job Management

Xgrid jobs are expressed in Apple’s standard plist format (Apple, 2005a). Further details of the expression are provided in a later section in the context of the Xgrid Command Line Interface.

Jobs can be submitted to the Xgrid controller using a range of client tools (discussed below). In all cases, an accompanying job specification is used by the controller to determine whether (and how) to decompose a submitted job, the code and data payloads for a given job, and whether the job submission is to be synchronous or asynchronous. Xgrid controllers schedule jobs in the order they are received, assigning each task to the fastest agent available at that time (determined by active probing of the current computational load of each agent). Alternatively, the job specification allows for dependencies among jobs and tasks to be expressed to ensure scheduling in the proper order. In most cases, if any job or task fails, the scheduler will automatically resubmit it to the next available agent. Xgrid tasks using password-based authentication will minimize possible interactions with the rest of the system by executing as unprivileged Xgrid users (by default the system user nobody in the system’s /tmp directory). Tasks using single sign-on for both clients and agents will run as the submitting user, allowing appropriate access to local files or network services.

Because of the three-tier architecture of Xgrid, clients can submit jobs to the controller asynchronously, then disconnect from the network as mentioned earlier. The controller will cache all
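To make the idea of a job expressed as a property list more concrete, the following is a minimal sketch in Python using the standard plistlib module. It is illustrative only: the key names (name, tasks, command, arguments) and the file names are hypothetical placeholders chosen for this example, and are not the job specification schema defined by Apple (2005a).

```python
import plistlib

# Hypothetical job description. The key names and file names below are
# illustrative assumptions, not the schema defined by Apple (2005a).
job = {
    "name": "word-count",
    "tasks": {
        "0": {"command": "/usr/bin/wc", "arguments": ["-w", "corpus-part-0.txt"]},
        "1": {"command": "/usr/bin/wc", "arguments": ["-w", "corpus-part-1.txt"]},
    },
}


def to_plist(spec: dict) -> bytes:
    """Serialise a job description to an XML property list."""
    return plistlib.dumps(spec, fmt=plistlib.FMT_XML)


def from_plist(data: bytes) -> dict:
    """Parse an XML property list back into a dictionary."""
    return plistlib.loads(data)


if __name__ == "__main__":
    # Print the XML form that a client tool could hand to a controller.
    print(to_plist(job).decode("utf-8"))
```

The point of the sketch is only that a job and its constituent tasks can be carried as nested dictionaries in a single, tool-independent document; the actual keys expected by an Xgrid controller should be taken from Apple's documentation.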
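The scheduling policy described in the job management discussion (tasks handled in arrival order, each offered to the fastest agent, with failed tasks resubmitted to the next agent) can be sketched as follows. This is a simplified, sequential model written for illustration and is not Xgrid's implementation: the Agent class, the speed score standing in for the result of load probing, and the run callback are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Agent:
    name: str
    speed: float  # hypothetical score from load probing; higher means faster


def schedule(
    tasks: List[str],
    agents: List[Agent],
    run: Callable[[str, Agent], bool],
) -> Dict[str, Optional[str]]:
    """Place tasks in arrival order, trying the fastest agent first.

    run(task, agent) is supplied by the caller and returns True on success.
    A task that fails on one agent is offered to the next fastest agent.
    The result maps each task to the agent that completed it, or to None
    if every agent failed.
    """
    ranked = sorted(agents, key=lambda a: a.speed, reverse=True)
    placement: Dict[str, Optional[str]] = {}
    for task in tasks:  # tasks are handled in the order received
        placement[task] = None
        for agent in ranked:  # fastest agent first
            if run(task, agent):
                placement[task] = agent.name
                break  # success: stop offering this task to other agents
    return placement


if __name__ == "__main__":
    pool = [Agent("slow", 1.0), Agent("fast", 3.0)]

    # Simulated execution: task "t2" fails on the fast agent only.
    def flaky(task: str, agent: Agent) -> bool:
        return not (task == "t2" and agent.name == "fast")

    print(schedule(["t1", "t2", "t3"], pool, flaky))
```

A real controller would additionally track which agents are busy and honour any dependencies declared in the job specification; both are omitted here to keep the sketch short.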