Many-Task Computing for Grids and Supercomputers

Ioan Raicu 1, Ian T. Foster 1,2,3, Yong Zhao 4
1 Department of Computer Science, University of Chicago, Chicago IL, USA
2 Computation Institute, University of Chicago, Chicago IL, USA
3 Mathematics and Computer Science Division, Argonne National Laboratory, Argonne IL, USA
4 Microsoft Corporation, Redmond, WA, USA
[email protected], [email protected], [email protected]

Abstract

Many-task computing aims to bridge the gap between two computing paradigms, high throughput computing and high performance computing. Many-task computing differs from high throughput computing in the emphasis on using large numbers of computing resources over short periods of time to accomplish many computational tasks (i.e. including both dependent and independent tasks), where the primary metrics are measured in seconds (e.g. FLOPS, tasks/sec, MB/s I/O rates), as opposed to operations (e.g. jobs) per month. Many-task computing denotes high-performance computations comprising multiple distinct activities, coupled via file system operations. Tasks may be small or large, uniprocessor or multiprocessor, compute-intensive or data-intensive. The set of tasks may be static or dynamic, homogeneous or heterogeneous, loosely coupled or tightly coupled. The aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. Many-task computing includes loosely coupled applications that are generally communication-intensive but not naturally expressed using the standard message passing interface commonly found in high performance computing, drawing attention to the many computations that are heterogeneous but not “happily” parallel.

Keywords: many-task computing, MTC, high-throughput computing, HTC, high performance computing, HPC

1. Defining Many Task Computing

We want to enable the use of large-scale distributed systems for task-parallel applications, which are linked into useful workflows through the looser task-coupling model of passing data via files between dependent tasks. This potentially larger class of task-parallel applications is precluded from leveraging the increasing power of modern parallel systems such as supercomputers (e.g. IBM Blue Gene/L [1] and Blue Gene/P [2]) because of the lack of efficient support in those systems for the “scripting” programming model [3]. With advances in e-Science and the growing complexity of scientific analyses, more scientists and researchers rely on various forms of scripting to automate end-to-end application processes involving task coordination, provenance tracking, and bookkeeping. Their approaches are typically based on a model of loosely coupled computation, in which data is exchanged among tasks via files, databases or XML documents, or a combination of these. Vast increases in data volume combined with the growing complexity of data analysis procedures and algorithms have rendered traditional manual processing and exploration unfavorable as compared with modern high performance computing processes automated by scientific workflow systems [4].

The problem space can be partitioned into four main categories (Figure 1 and Figure 2). 1) At the low end of the spectrum (low number of tasks and small input size), we have tightly coupled Message Passing Interface (MPI) applications (white). 2) As the data size increases, we move into the analytics category, such as data mining and analysis (blue); MapReduce [5] is an example for this category. 3) Keeping data size modest, but increasing the number of tasks, moves us into the loosely coupled applications involving many tasks (yellow); Swift/Falkon [6, 7] and Pegasus/DAGMan [8] are examples of this category. 4) Finally, the combination of both many tasks and large datasets moves us into the data-intensive many-task computing category (green); examples of this category are Swift/Falkon and data diffusion [9, 10], and Sawzall [11].

Figure 1: Problem types with respect to data size and number of tasks

High performance computing can be considered to be part of the first category (denoted by the white area). High throughput computing [12] can be considered to be a subset of the third category (denoted by the yellow area). Many-Task Computing can be considered as part of categories three and four (denoted by the yellow and green areas). This paper focuses on defining many-task computing, and the challenges that arise as datasets and computing systems grow exponentially.

Figure 2: An incomplete and simplistic view of programming models and tools

Clusters and Grids have been the preferred platform for loosely coupled applications that have traditionally been part of the high throughput computing class of applications, which are managed and executed through workflow systems or parallel programming systems. Various properties of a new, emerging class of applications, such as large numbers of tasks (i.e. millions or more), relatively short per-task execution times (i.e. seconds to minutes long), and data intensive tasks (i.e. tens of MB of I/O per CPU second of compute), have led to the definition of a new class of applications called Many-Task Computing. MTC emphasizes using much larger numbers of computing resources over short periods of time to accomplish many computational tasks, where the primary metrics are in seconds (e.g., FLOPS, tasks/sec, MB/sec I/O rates), while HTC requires large amounts of computing for long periods of time with the primary metrics being operations per month [12]. MTC applications are composed of many tasks (both independent and dependent tasks) that can be individually scheduled on many different computing resources across multiple administrative boundaries to achieve some larger application goal.

MTC denotes high-performance computations comprising multiple distinct activities, coupled via file system operations or message passing. Tasks may be small or large, uniprocessor or multiprocessor, compute-intensive or data-intensive. The set of tasks may be static or dynamic, homogeneous or heterogeneous, loosely coupled or tightly coupled. The aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. Is MTC really different enough to justify coining a new term? There are certainly other choices we could have used instead, such as multiple program multiple data (MPMD), high throughput computing, workflows, capacity computing, or embarrassingly parallel computing.

MPMD is a variant of Flynn’s original taxonomy [13], used to denote computations in which several programs each operate on different data at the same time. MPMD can be contrasted with Single Program Multiple Data (SPMD), in which multiple instances of the same program each execute on different processors, operating on different data. MPMD lacks the emphasis that a set of tasks can vary dynamically. High throughput computing [12] is a term coined by Miron Livny within the Condor project [14] to contrast workloads for which the key metric is not floating-point operations per second (as in high performance computing) but operations “per month or year.” MTC applications are often just as concerned with performance as is the most demanding HPC application; they just don’t happen to be SPMD programs.
The term “workflow” was first used to denote sequences of tasks in business processes, but the term is sometimes used to denote any computation in which control and data pass from one “task” to another. We find it often used to describe many-task computations (or MPMD, HTC, MTC, etc.), making its use too general. “Embarrassingly parallel computing” is used to denote parallel computations in which each individual (often identical) task can execute without any significant communication with other tasks or with a file system. Some MTC applications will be simple and embarrassingly parallel, but others will be extremely complex and communication-intensive, interacting with other tasks and shared file systems.

Is “many task computing” a useful distinction? Perhaps we could simply have said “applications that are communication-intensive but are not naturally expressed in MPI”, but there are also loosely coupled applications composed of many independent tasks. Through the new term MTC, we are drawing attention to the many computations that are heterogeneous but not “happily” parallel.

2. MTC for Clusters, Grids, and Supercomputers

We claim that MTC applies not only to traditional HTC environments such as clusters and Grids, assuming appropriate support in the middleware, but also to supercomputers. Emerging petascale computing systems, such as IBM’s Blue Gene/P [2], incorporate high-speed, low-latency interconnects and other features designed to support tightly coupled parallel computations. Most of the applications run on these systems have an SPMD structure, and are commonly implemented by using MPI to achieve the needed inter-process communication. We believe MTC to be a viable paradigm for supercomputers.
As the computing and storage scale increases, the set of problems that must be overcome to make MTC practical (ensuring good efficiency and utilization at large scale) is exacerbated. The challenges include local resource manager scalability and granularity, efficient utilization of the raw hardware, shared file system contention and scalability, reliability at scale, application scalability, and understanding the limitations of the HPC systems in order to identify promising and scientifically valuable MTC applications.

One could ask, why use petascale systems for problems that might work well on terascale systems? We point out that petascale systems are more than just many processors with large peak petaflop ratings. They normally come well balanced, with proprietary, high-speed, and low-latency network interconnects that give tightly-coupled applications good opportunities to scale well at full system scales. Even IBM has proposed in their internal project Kittyhawk [15] that Blue Gene/P can be used to run non-traditional workloads (e.g. HTC). We identify four factors that motivate the support of MTC applications on petascale HPC systems:

1) The I/O subsystems of petascale systems offer unique capabilities needed by MTC applications. For example, collective I/O operations [16] could be implemented to use the specialized high-bandwidth and low-latency interconnects. MTC applications could be composed of individual tasks that are themselves parallel programs, many tasks operating on the same input data, and tasks that need considerable communication among them. Furthermore, the aggregate shared file system performance of a supercomputer can be potentially larger than that found in a distributed infrastructure (i.e., Grid), with data rates in the 10GB+/s range, rather than the more typical 0.1GB/s to 1GB/s range at most Grid sites.

2) The cost to manage and run on petascale systems like the Blue Gene/P is less than that of conventional clusters or Grids [15]. For example, a single 13.9 TF Blue Gene/P rack draws 40 kilowatts, for 0.35 GF/watt (a short arithmetic check follows this list). Two other systems that get good compute power per watt consumed are the SiCortex with 0.32 GF/watt and the Blue Gene/L with 0.23 GF/watt. In contrast, the average power consumption of the Top500 systems is 0.12 GF/watt [17]. Furthermore, we also argue that it is more cost effective to manage one large system in one physical location, rather than many smaller systems in geographically distributed locations.

3) Large-scale systems inevitably have utilization issues. Hence it is desirable to have a community of users who can leverage traditional back-filling strategies to run loosely coupled applications on idle portions of petascale systems.

4) Perhaps most importantly, some applications are so demanding that only petascale systems have enough compute power to get results in a reasonable timeframe, or to exploit new opportunities in such applications. With petascale processing capabilities applied to ordinary applications, it becomes possible to perform vast computations with quick turn-around, thus answering questions in a timeframe that can make a quantitative difference in addressing significant scientific challenges or responding to emergencies.
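As a quick sanity check of the power-efficiency figure quoted in factor 2 above, the 0.35 GF/watt number follows directly from the rack size and power draw stated in the text; the short Python sketch below simply reproduces that arithmetic (all numbers are taken from the text, none are new measurements).

    # Reproduce the GF/watt figure quoted above for a single Blue Gene/P rack.
    rack_teraflops = 13.9      # 13.9 TF per rack, as stated in the text
    rack_watts = 40_000        # 40 kilowatts per rack, as stated in the text

    gf_per_watt = rack_teraflops * 1000 / rack_watts
    print(f"Blue Gene/P rack: {gf_per_watt:.2f} GF/watt")   # -> 0.35 GF/watt

    # Other efficiencies quoted in the text, for comparison:
    for name, eff in [("SiCortex", 0.32), ("Blue Gene/L", 0.23), ("Top500 average", 0.12)]:
        print(f"{name}: {eff:.2f} GF/watt")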
3. The Data Deluge Challenge and the Growing Storage/Compute Gap

Within the science domain, the data that needs to be processed generally grows faster than computational resources and their speed. The scientific community is facing an imminent flood of data expected from the next generation of experiments, simulations, sensors and satellites. Scientists are now attempting calculations requiring orders of magnitude more computing and communication than was possible only a few years ago. Moreover, in many currently planned and future experiments, they are also planning to generate several orders of magnitude more data than has been collected in the entire human history [18].

For instance, in the astronomy domain the Sloan Digital Sky Survey [19] has datasets that exceed 10 terabytes in size. They can reach up to 100 terabytes or even petabytes if we consider multiple surveys and the time dimension. In physics, the CMS detector being built to run at CERN’s Large Hadron Collider [20] is expected to generate over a petabyte of data per year. In the bioinformatics domain, the rate of growth of DNA databases such as GenBank [21] and the European Molecular Biology Laboratory (EMBL) [22] has been following an exponential trend, with a doubling time estimated to be 9-12 months. A large class of applications in Many-Task Computing will be applications that analyze large quantities of data, which in turn would require that data and computations be distributed over many hundreds and thousands of nodes in order to achieve rapid turnaround times.

Many applications in scientific computing generally use a shared infrastructure such as TeraGrid [23] and Open Science Grid [24], where data movement relies on shared or parallel file systems. The rate of increase in the number of processors per system is outgrowing the rate of performance increase of parallel file systems, which requires rethinking existing data management techniques. For example, a cluster that was placed in service in 2002 with 316 processors has a parallel file system (i.e. GPFS [25]) rated at 1GB/s, yielding 3.2MB/s per processor of bandwidth. The second largest open science supercomputer, the IBM Blue Gene/P from Argonne National Laboratory, has 160K processors, and a parallel file system (i.e. also GPFS) rated at 8GB/s, yielding a mere 0.05MB/s per processor. That is a 65X reduction in bandwidth between a system from 2002 and one from 2008. Unfortunately, this trend is not bound to stop, as advances in multi-core and many-core processors will increase the number of processor cores by one to two orders of magnitude over the next decade [4].
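The storage/compute gap described above reduces to simple division; the sketch below reproduces the per-processor bandwidth figures and the roughly 65X reduction quoted in the text (the system parameters are the ones stated above, not independent measurements).

    # Per-processor parallel file system bandwidth, using the figures quoted above.
    systems = {
        "2002 cluster (316 processors, 1 GB/s GPFS)": (1.0, 316),
        "2008 Blue Gene/P (160K processors, 8 GB/s GPFS)": (8.0, 160_000),
    }

    per_proc = []
    for name, (gb_per_sec, processors) in systems.items():
        mb_per_proc = gb_per_sec * 1000 / processors   # MB/s of bandwidth per processor
        per_proc.append(mb_per_proc)
        print(f"{name}: {mb_per_proc:.2f} MB/s per processor")

    print(f"Reduction: ~{per_proc[0] / per_proc[1]:.0f}X")  # close to the 65X cited above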
We believe that data locality is critical to the successful and efficient use of large distributed systems for data-intensive applications [26, 27] in the face of a growing gap between compute power and storage performance. Large scale data management needs to be a primary objective for any middleware aiming to support MTC workloads, to ensure data movement is minimized by intelligent data-aware scheduling both among sites (assuming that each site has a local area network shared storage infrastructure), and among compute nodes (assuming that data can be stored on compute nodes’ local disk and/or memory).

4. Middleware Support for MTC

As high throughput computing (HTC) is a subset of MTC, it is worth mentioning the various efforts in enabling HTC on large scale systems. Some of these systems are Condor [14], MapReduce [5], Hadoop [28], and BOINC [29]. MapReduce (including Hadoop) is typically applied to a data model consisting of name/value pairs, processed at the programming language level. Its strengths are in its ability to spread the processing of a large dataset to thousands of processors with minimal expertise in distributed systems; however, it often involves the development of custom filtering scripts and does not support “black box” application execution as is commonly found in MTC or HTC applications. BOINC is known to scale well to large numbers of compute resources, but lacks support for data intensive applications, due to the nature of the wide area network deployment BOINC typically has, as well as support for “black box” applications.

On the IBM Blue Gene supercomputer, various works [30, 31] have leveraged the HTC-mode [32] support in the Cobalt [33] system. These works have aimed at integrating their solutions as much as possible in Cobalt; however, it is not clear that the current implementations will be able to support the largest and most demanding MTC applications at full system scales. Furthermore, these works focus on compute resource management, and ignore data management altogether.

Condor and glide-ins [34] are the original tools to enable HTC, but their emphasis on robustness and recoverability limits their efficiency for MTC applications in large-scale systems. We found that relaxing some constraints from the middleware and encouraging the end applications to implement these constraints has enabled significant improvements in middleware performance and efficiency at large scale [7].

For example, some of our work with the Falkon light-weight task execution framework has shown that task granularity can be efficiently handled at scales of seconds on modest size clusters of thousands of processing cores, while prior resource management techniques would have required task granularity to be in minutes, or even hours, to allow good utilization of the raw resources at the same scale. The task granularity issues are exaggerated as system scales are increased to levels of hundreds of thousands of processor cores. Falkon is able to achieve these improvements by streamlining the dispatching process, making the scheduler run in constant time in relation to the number of tasks queued and number of processors managed, multi-threading the resource manager to leverage multi-core nodes, moving recoverability from the resource manager to the application, and leaving out many non-essential features (i.e. accountability, support for multi-node tasks such as MPI, etc.) from production managers that can be offloaded to other systems or the application [7].
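The streamlined dispatching described above can be illustrated with a toy sketch: a single FIFO queue feeding a pool of worker threads keeps the per-task dispatch cost independent of how many tasks are queued or how many workers exist. This is only an illustration of the idea, not Falkon's actual implementation, and the worker count and task durations below are made-up example values.

    import queue
    import threading
    import time

    # Toy sketch of streamlined, queue-based task dispatch (not Falkon itself):
    # one FIFO queue feeds a pool of worker threads, so per-task dispatch cost
    # stays constant regardless of queue length or number of workers.

    def worker(tasks, results, worker_id):
        while True:
            duration = tasks.get()
            if duration is None:          # sentinel: shut this worker down
                tasks.task_done()
                return
            time.sleep(duration)          # stand-in for executing one short task
            results.append((worker_id, duration))
            tasks.task_done()

    def run(num_workers=8, task_durations=(0.01,) * 100):
        tasks, results = queue.Queue(), []
        threads = [threading.Thread(target=worker, args=(tasks, results, i))
                   for i in range(num_workers)]
        for t in threads:
            t.start()
        for d in task_durations:          # heterogeneous durations work equally well
            tasks.put(d)
        for _ in threads:
            tasks.put(None)
        for t in threads:
            t.join()
        return len(results)

    if __name__ == "__main__":
        print(run(), "tasks completed")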
Swift [6, 35] and Falkon [7] have been used to execute MTC applications on clusters, multi-site Grids (e.g., Open Science Grid [24], TeraGrid [36]), specialized large machines (SiCortex [37]), and supercomputers (e.g., Blue Gene/P [2]). Swift enables scientific workflows through a data-flow-based functional parallel programming model. It is a parallel scripting tool for rapid and reliable specification, execution, and management of large-scale science and engineering workflows. The runtime system in Swift relies on the CoG Karajan [38] workflow engine for efficient scheduling and load balancing, and it integrates with the Falkon light-weight task execution dispatcher for optimized task throughput and efficiency, as well as improved data management capabilities to ensure good scalability with increasing compute resources. Large-scale applications from many domains (e.g., astronomy [7, 39], medicine [7, 40, 41], chemistry [42], molecular dynamics [43], and economics [44, 45]) have been run at scales of up to millions of tasks on up to hundreds of thousands of processors. While this paper strives to define MTC, as well as the theory and practice needed to enable MTC on a wide range of systems from the average cluster to the largest supercomputers, the Falkon middleware represents the practical aspects of enabling MTC workloads on these systems.

5. MTC Applications

We have found many applications that are a better fit for MTC than HTC or HPC. Their characteristics include having a large number of small parallel jobs, a common pattern observed in many scientific applications [6]. They also use files (instead of messages, as in MPI) for inter-process communication, which tends to make these applications data intensive.

While we can push hundreds or even thousands of such small jobs via GRAM to a traditional local resource manager (e.g. PBS [46], Condor [34], SGE [47]), the utilization of a modest to large resource set will be poor due to high queuing and dispatching overheads, which ultimately results in low throughput. A common technique to amortize the costs of the local resource management is to “cluster” multiple jobs into a single larger job. Although this lowers the per-job overhead, it is best suited when the set of jobs to be executed is homogeneous in execution times, or accurate execution time information is available prior to job execution; with heterogeneous job execution times, it is hard to maintain good load balancing of the underlying resource, causing low resource utilization. We claim that “clustering” jobs is not enough, and that the middleware that manages jobs must be streamlined and made as light-weight as possible to allow applications with heterogeneous execution times to execute with high efficiency without “clustering”.

In addition to streamlined task dispatching, scalable data management techniques are also required in order to support MTC applications. MTC applications are often data and/or meta-data intensive, as each job requires at least one input file and one output file, and can sometimes involve many files per job. These data management techniques need to make good use of the full network bandwidth of large scale systems, which is a function of the number of nodes and the networking technology employed, as opposed to the relatively small number of storage servers that sit behind a parallel file system or GridFTP server.
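One simple reading of the data-aware scheduling called for above is to prefer dispatching a task to a node that already holds the task's input file in its local cache, falling back to the least-loaded node otherwise. The sketch below is a hypothetical illustration of that policy only; it is not the data diffusion mechanism of [9], and the node and file names are invented.

    # Hypothetical data-aware dispatch policy (illustration only, not data diffusion [9]):
    # prefer a node that already caches the task's input file; otherwise pick the
    # least-loaded node, which will then hold a cached copy for later tasks.

    def choose_node(input_file, node_caches, node_load):
        holders = [n for n, cached in node_caches.items() if input_file in cached]
        candidates = holders if holders else list(node_load)
        return min(candidates, key=lambda n: node_load[n])

    node_caches = {"n1": {"a.dat"}, "n2": {"b.dat"}, "n3": set()}   # invented example state
    node_load = {"n1": 0, "n2": 0, "n3": 0}

    for task_input in ["a.dat", "a.dat", "b.dat", "c.dat"]:
        node = choose_node(task_input, node_caches, node_load)
        node_load[node] += 1
        node_caches[node].add(task_input)    # the chosen node now caches this file
        print(f"task({task_input}) -> {node}")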
The stacking improves the signal to domain, there are applications that screen KEGG [53] noise, and after coadding a large number of images, compounds and drugs against important metabolic there will be a detectable signal to measure the average protein targets using DOCK6 [43] to simulate the brightness/shape etc of these objects. This application “docking” of small molecules, or ligands, to the “active involved the SDSS dataset [19] (currently at 10TB with sites” of large macromolecules of known structure over 300 million objects, but these datasets could be called “receptors”. A compound that interacts strongly petabytes in size if we consider multiple surveys in both with a receptor (such as a protein molecule) associated time and space) [4], many tasks ranging from 10K to with a disease may inhibit its function and thus act as a millions of tasks, each requiring 100ms to seconds of beneficial drug. The economic and health benefits of compute and 100KB to MB of input and output data. speeding drug development by rapidly screening for Another related application in astronomy is promising compounds and eliminating costly dead-ends MONTAGE [49, 50], a national virtual observatory is significant in terms of both resources and human life. project [51] that stitches tiles of images of the sky from The parameter space is quite large, totaling to more various sky surveys (e.g. SDSS [19], etc) into a than one billion computations that have a large variance photorealistic single image. Execution times per task of execution times from seconds to hours, with an range in the 100ms to 10s of seconds, and each task average of 10 minutes. The entire parameter space involves multiple input images and at least one image would require over 22,600 CPU years, or over 50 days output. This application is both compute intensive and on a 160K processor Blue Gene/P supercomputer [2]. data intensive, and has been run as both a HTC (using This application is challenging as there many tasks, Pegasus/DAGMan [8], and Condor [14]) and a HPC each task has a wide range of execution times with little (using MPI) application, but we found its scalability to to no prior knowledge about its execution time, and be limited when run under HTC or HPC. involves significant I/O for each computation as the compounds are typically stored in a database (i.e. 10s to Astrophysics: Another application is from 100s of MB large) and must be read completely per astrophysics, which analyzes the Flash turbulence computation. dataset (simulation data) [52] from various perspectives, using volume rendering and vector Chemistry: Another application in the same domain is visualization. The dataset is composed of 32 million OOPS [54], which aims to predict protein structure and files (1000 time steps times 32K files) taking up about recognize docking partners. In chemistry, specifically in 15TB of storage resource, and contains both temporal molecular dynamics, we have an application MolDyn and spatial locality. In the physics domain, the CMS whose goal is to calculate the solvation free energy of detector being built to run at CERN’s Large Hadron ligands and protein-ligand binding energy, with Collider [20] is expected to generate over a petabyte of structures obtained from the NIST Chemistry WebBook data per year. Supporting applications that can perform database [55]. 
Solvation free energy is an important a wide range of analysis of the LHC data will require quantity in Computational Chemistry with a variety of novel support for data intensive applications. applications, especially in drug discovery and design. These applications have similar characteristics as the Economic Modeling: An application from the DOCK application previously discussed. economic modeling domain that we have investigated as a good MTC candidate is Macro Analysis of Bioinformatics: In bioinformatics, Basic Local Refinery Systems (MARS) [44], which studies Alignment Search Tool (BLAST), is a family of tools economic model sensitivity to various parameters. for comparing primary biological sequence information MARS models the economic and environmental (e.g. amino-acid sequences of proteins, nucleotides of DNA sequences). A BLAST search enables one to involve many tasks, many input files (or many disjoint compare a query sequence with a library or database of sub-sections of few files), and are data intensive. sequences, and identify library sequences that resemble Data Mining: Another set of applications that perform the query sequence above a certain threshold. [56]. data analysis can be classified in the “All-Pairs” class Although the BLAST codes have been implemented in of applications [60]. These applications aim to both HTC and HPC, they are often both data and understand the behavior of a function on two sets, or to compute intensive, requiring multi-GB databases to be learn the covariance of these sets on a standard inner read for each comparison (or kept in memory if product. Two common applications in All-Pairs are data possible), and each comparison can be done within mining and biometrics. Data mining is the study of minutes on a modern processor-core. MTC and its extracting meaning from large data sets; one phase of support for data intensive applications are critical in knowledge discovery is reacting to bias or other noise scaling BLAST on large scale resources with thousands within a set. Different classifiers work better or worse to hundreds of thousands of processors. for varying data, and hence it is important to explore Neuroscience Domain: In the neuroscience domain, many different classifiers in order to be able to we have the Computational Neuroscience Applications determine which classifier is best for that type of noise Research Infrastructure (CNARI), which aims to on a particular distribution of the validation set. manage neuroscience tools and the heterogeneous Biometrics: Biometrics aims to identifying humans compute resources on which they can enable large-scale from measurements of the body (e.g. photos of the face, computational projects in the neuroscience community. recordings of the voice, and measurements of body The analysis includes the aphasia study, structural structure). A recognition algorithm may be thought of equation modeling, and general use of R for various as a function that accepts two images (e.g. face) as data analysis. [57] The application workloads involve input and outputs a number between zero and one many tasks, relatively short on the order of seconds, indicating the similarity between the two input images. and each task containing many small input and output The application would then compare all images of a files making the application meta-data intensive at large database and create a scoring matrix which can later be scale. easily searched to retrieve the most similar images. 
MPI Ensembles: Finally, another class of applications involves managing an ensemble of MPI applications. One example is from the climate modeling domain, which has been studying climate trends and predicting global warming [61], and is already implemented as an HPC MPI application. However, the current climate models could be run as ensemble runs (many concurrent MPI applications) to quantify climate model uncertainty. This is challenging on large scale systems such as supercomputers (a typical resource such models would execute on), as the local resource managers (e.g. Cobalt) favor large jobs and have policies against running many jobs at the same time (i.e. where many is more than a single digit number of jobs per user).

All these applications pose significant challenges to the traditional resource management found in HPC and HTC, from both a job management and a storage management perspective, and are in critical need of MTC support as the scale of these resources grows.

6. Conclusions

We have defined a new paradigm, MTC, which aims to bridge the gap between two computing paradigms, HTC and HPC. MTC applications are typically loosely coupled applications that are communication-intensive but not naturally expressed using the standard message passing interface commonly found in high performance computing, drawing attention to the many computations that are heterogeneous but not “happily” parallel. We believe that today’s existing HPC systems are a viable platform to host MTC applications. We also believe MTC is a broader definition than HTC, allowing for finer grained tasks, independent tasks as well as ones with dependencies, and allowing tightly coupled applications and loosely coupled applications to co-exist on the same system.
Furthermore, having native support for data intensive applications is central to MTC, as there is a growing gap between the storage performance of parallel file systems and the amount of available processing power. As the size of scientific data sets and the resources required for analysis increase, data locality becomes crucial to the efficient use of large scale distributed systems for scientific and data-intensive applications [62]. We believe it is feasible to allocate large-scale computational resources and caching storage resources that are relatively remote from the original data location, co-scheduled together to optimize the performance of entire data analysis workloads which are composed of many loosely coupled tasks.

We identified challenges in running these novel workloads on large-scale systems, which can hamper both efficiency and utilization. These challenges include local resource manager scalability and granularity, efficient utilization of the raw hardware, shared file system contention and scalability, reliability at scale, application scalability, and understanding the limitations of HPC systems in order to identify promising and scientifically valuable MTC applications.

Clusters with 62K processor cores (e.g., the TACC Sun Constellation System, Ranger), Grids (e.g., TeraGrid) with over a dozen sites and 161K processors, and supercomputers with 160K processors (e.g., IBM Blue Gene/P) are now available to the scientific community. These large HPC systems are considered efficient at executing tightly coupled parallel jobs within a particular machine, using MPI to achieve the needed inter-process communication. We proposed using HPC systems for loosely-coupled applications, which involve the execution of independent, sequential jobs that can be individually scheduled, and which use files for inter-process communication.

We have already shown good support for MTC on a variety of resources, from clusters to grids to supercomputers, through our work on Swift [6, 35, 42] and Falkon [7, 63]. Furthermore, we have taken the first steps to address data-intensive MTC by offloading much of the I/O away from parallel file systems and into the network, making full utilization of caches (both on disk and in memory) and the full network bandwidth of commodity networks (e.g. gigabit Ethernet) as well as proprietary and more exotic networks (Torus, Tree, and Infiniband) [9, 16].

We believe that there is more to HPC than tightly coupled MPI, and more to HTC than embarrassingly parallel long running jobs. Like HPC applications, and science itself, applications are becoming increasingly complex, opening new doors for many opportunities to apply HPC in new ways if we broaden our perspective. We hope the definition of Many-Task Computing leads to a stronger appreciation of the fact that applications that are not tightly coupled via MPI are not necessarily embarrassingly parallel: some have just so many simple tasks that managing them is hard, some operate on or produce large amounts of data that need sophisticated data management in order to scale. There also exist applications that involve MPI ensembles, essentially many jobs where each job is composed of tightly coupled MPI tasks, and there are loosely coupled applications that have dependencies among tasks but typically use files for inter-process communication. Efficient support for these sorts of applications on existing large scale systems, including future ones, will involve substantial technical challenges and will have a big impact on science.

To extend the discussion of this paper with the broader community on many-task computing, we held a workshop at IEEE/ACM Supercomputing 2008 titled “IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS08)” [64], as well as a bird-of-a-feather (BOF) session titled “Megajobs08: How to Run a Million Jobs” [65]. The half-day workshop was a success; it attracted over 100 participants who attended the presentations of the six accepted papers [16, 66, 67, 68, 69, 70] and a keynote talk by the renowned Dr. Alan Gara (IBM Blue Gene Chief Architect). The BOF was also a success, with seven presentations and over 60 participants. We plan to have follow-up workshops and special issue publications on many-task computing in the near future.
7. Acknowledgements

This work was supported in part by the NASA Ames Research Center GSRP grant number NNA06CB89H; by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, Office of Science, U.S. Dept. of Energy, under Contract DE-AC02-06CH11357; and by the National Science Foundation under grant OCI-0721939. We also thank the Argonne Leadership Computing Facility for allowing us to test MTC in practice on the IBM Blue Gene/P. We also thank our many colleagues for their contributions to the implementations and applications that are making many task computing a reality; in no particular order, we would like to thank Michael Wilde, Zhao Zhang, Allan Espinosa, Ben Clifford, Kamil Iskra, Pete Beckman, Mike Kubal, Don Hanson, Matthew Cohoon, Fangfang Xia, Rick Stevens, Douglas Thain, Matei Ripeanu, and Samer Al-Kiswany.

8. References

[1] A. Gara, et al. “Overview of the Blue Gene/L system architecture”, IBM Journal of Research and Development 49(2/3), 2005
[2] IBM BlueGene/P (BG/P), http://www.research.ibm.com/bluegene/, 2008
[3] J. Ousterhout. “Scripting: Higher Level Programming for the 21st Century”, IEEE Computer, March 1998
[4] Y. Zhao, I. Raicu, I. Foster. “Scientific Workflow Systems for 21st Century e-Science, New Bottle or New Wine?”, IEEE Workshop on Scientific Workflows 2008
[5] J. Dean, S. Ghemawat. “MapReduce: Simplified Data Processing on Large Clusters”, OSDI 2004
[6] Y. Zhao, M. Hategan, B. Clifford, I. Foster, G. von Laszewski, I. Raicu, T. Stef-Praun, M. Wilde. “Swift: Fast, Reliable, Loosely Coupled Parallel Computation”, IEEE Workshop on Scientific Workflows 2007
[7] I. Raicu, Y. Zhao, C. Dumitrescu, I. Foster, M. Wilde. “Falkon: A Fast and Lightweight Task Execution Framework”, IEEE/ACM SC 2007
[8] E. Deelman et al. “Pegasus: A Framework for Mapping Complex Scientific Workflows onto Distributed Systems”, Scientific Programming Journal 13(3), pp. 219-237, 2005
[9] I. Raicu, Y. Zhao, I. Foster, A. Szalay. “Accelerating Large-Scale Data Exploration through Data Diffusion”, ACM International Workshop on Data-Aware Distributed Computing 2008
[10] M. Isard, M. Budiu, Y. Yu, A. Birrell, D. Fetterly. “Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks”, European Conference on Computer Systems (EuroSys) 2007
[11] R. Pike, S. Dorward, R. Griesemer, S. Quinlan. “Interpreting the Data: Parallel Analysis with Sawzall”, Scientific Programming Journal, Special Issue on Grids and Worldwide Computing Programming Models and Infrastructure 13(4), pp. 227-298, 2005
[12] M. Livny, J. Basney, R. Raman, T. Tannenbaum. “Mechanisms for High Throughput Computing”, SPEEDUP Journal 1(1), 1997
[13] M. Flynn. “Some Computer Organizations and Their Effectiveness”, IEEE Trans. Comput. C-21, pp. 948, 1972
[14] D. Thain, T. Tannenbaum, M. Livny. “Distributed Computing in Practice: The Condor Experience”, Concurrency and Computation: Practice and Experience 17(2-4), pp. 323-356, 2005
[15] J. Appavoo, V. Uhlig, A. Waterland. “Project Kittyhawk: Building a Global-Scale Computer”, ACM SIGOPS Review, 2008
[16] Z. Zhang, A. Espinosa, K. Iskra, I. Raicu, I. Foster, M. Wilde. “Design and Evaluation of a Collective I/O Model for Loosely-coupled Petascale Programming”, IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS08) 2008
[17] Top500, June 2008, http://www.top500.org/lists/2008/06, 2008
[18] T. Hey, A. Trefethen. “The Data Deluge: An e-Science Perspective”, Grid Computing: Making the Global Infrastructure a Reality, Wiley, 2003
[19] SDSS: Sloan Digital Sky Survey, http://www.sdss.org/, 2008
[20] CERN’s Large Hadron Collider, http://lhc.web.cern.ch/lhc, 2008
[21] GenBank, http://www.psc.edu/general/software/packages/genbank, 2008
[22] European Molecular Biology Laboratory, http://www.embl.org, 2008
[23] C. Catlett, et al. “TeraGrid: Analysis of Organization, System Architecture, and Middleware Enabling New Types of Applications”, HPC 2006
[24] Open Science Grid (OSG), http://www.opensciencegrid.org/, 2008
[25] F. Schmuck, R. Haskin. “GPFS: A Shared-Disk File System for Large Computing Clusters”, FAST 2002
[26] A. Szalay, A. Bunn, J. Gray, I. Foster, I. Raicu. “The Importance of Data Locality in Distributed Computing Applications”, NSF Workflow Workshop 2006
[27] J. Gray. “Distributed Computing Economics”, Technical Report MSR-TR-2003-24, Microsoft Research, Microsoft Corporation, 2003
[28] A. Bialecki, M. Cafarella, D. Cutting, O. O’Malley. “Hadoop: A Framework for Running Applications on Large Clusters Built of Commodity Hardware”, http://lucene.apache.org/hadoop/, 2005
[29] D.P. Anderson. “BOINC: A System for Public-Resource Computing and Storage”, IEEE/ACM Workshop on Grid Computing, 2004
[30] J. Cope, M. Oberg, H.M. Tufo, T. Voran, M. Woitaszek. “High Throughput Grid Computing with an IBM Blue Gene/L”, Cluster 2007
[31] A. Peters, A. King, T. Budnik, P. McCarthy, P. Michaud, M. Mundy, J. Sexton, G. Stewart. “Asynchronous Task Dispatch for High Throughput Computing for the eServer IBM Blue Gene Supercomputer”, Parallel and Distributed Processing (IPDPS), 2008
[32] IBM Corporation. “High-Throughput Computing (HTC) Paradigm”, IBM System Blue Gene Solution: Blue Gene/P Application Development, IBM RedBooks, 2008
[33] N. Desai. “Cobalt: An Open Source Platform for HPC System Software Research”, Edinburgh BG/L System Software Workshop, 2005
[34] J. Frey, T. Tannenbaum, I. Foster, M. Livny, S. Tuecke. “Condor-G: A Computation Management Agent for Multi-Institutional Grids”, Cluster Computing, 2002
[35] Swift Workflow System, www.ci.uchicago.edu/swift, 2008
[36] C. Catlett et al. “TeraGrid: Analysis of Organization, System Architecture, and Middleware Enabling New Types of Applications”, HPC and Grids in Action, ed. Lucio Grandinetti, IOS Press Advances in Parallel Computing series, Amsterdam, 2007
[37] SiCortex, http://www.sicortex.com/, 2008
[38] G. von Laszewski, M. Hategan, D. Kodeboyina. “Java CoG Kit Workflow”, in I.J. Taylor, E. Deelman, D.B. Gannon, M. Shields, eds., Workflows for eScience, pp. 340-356, 2007
[39] J.C. Jacob, et al. “The Montage Architecture for Grid-Enabled Science Processing of Large, Distributed Datasets”, Earth Science Technology Conference 2004
[40] The Functional Magnetic Resonance Imaging Data Center, http://www.fmridc.org/, 2007
[41] T. Stef-Praun, B. Clifford, I. Foster, U. Hasson, M. Hategan, S. Small, M. Wilde, Y. Zhao. “Accelerating Medical Research using the Swift Workflow System”, HealthGrid 2007
[42] Y. Zhao, I. Raicu, I. Foster, M. Hategan, V. Nefedova, M. Wilde. “Realizing Fast, Scalable and Reliable Scientific Computations in Grid Environments”, Grid Computing Research Progress, Nova Pub., 2008
[43] D.T. Moustakas et al. “Development and Validation of a Modular, Extensible Docking Program: DOCK 5”, J. Comput. Aided Mol. Des. 20, pp. 601-619, 2006
[44] D. Hanson. “Enhancing Technology Representations within the Stanford Energy Modeling Forum (EMF) Climate Economic Models”, Energy and Economic Policy Models: A Reexamination of Fundamentals, 2006
[45] T. Stef-Praun, G. Madeira, I. Foster, R. Townsend. “Accelerating Solution of a Moral Hazard Problem with Swift”, e-Social Science 2007
[46] B. Bode, D.M. Halstead, R. Kendall, Z. Lei, W. Hall, D. Jackson. “The Portable Batch Scheduler and the Maui Scheduler on Linux Clusters”, Usenix, 4th Annual Linux Showcase & Conference, 2000
[47] W. Gentzsch. “Sun Grid Engine: Towards Creating a Compute Power Grid”, 1st International Symposium on Cluster Computing and the Grid, 2001
[48] I. Raicu, I. Foster, A. Szalay, G. Turcu. “AstroPortal: A Science Gateway for Large-scale Astronomy Data Analysis”, TeraGrid Conference 2006
[49] G.B. Berriman, et al. “Montage: A Grid Enabled Engine for Delivering Custom Science-Grade Image Mosaics on Demand”, SPIE Conference on Astronomical Telescopes and Instrumentation, 2004
[50] J.C. Jacob, et al. “The Montage Architecture for Grid-Enabled Science Processing of Large, Distributed Datasets”, Earth Science Technology Conference 2004
[51] US National Virtual Observatory (NVO), http://www.us-vo.org/index.cfm, 2008
[52] ASC / Alliances Center for Astrophysical Thermonuclear Flashes, http://www.flash.uchicago.edu/website/home/, 2008
[53] KEGG’s Ligand Database, http://www.genome.ad.jp/kegg/ligand.html, 2008
[54] PL protein library, http://protlib.uchicago.edu/, 2008
[55] NIST Chemistry WebBook database, http://webbook.nist.gov/chemistry/, 2008
[56] S.F. Altschul, W. Gish, W. Miller, E.W. Myers, D.J. Lipman. “Basic Local Alignment Search Tool”, J Mol Biol 215(3), pp. 403-410, 1990
[57] Computational Neuroscience Applications Research Infrastructure, http://www.ci.uchicago.edu/wiki/bin/view/CNARI/WebHome, 2008
[58] The Functional Magnetic Resonance Imaging Data Center, http://www.fmridc.org/, 2007
[59] J. Dean, S. Ghemawat. “MapReduce: Simplified Data Processing on Large Clusters”, Symposium on Operating System Design and Implementation (OSDI’04), 2004
[60] C. Moretti, J. Bulosan, D. Thain, P. Flynn. “All-Pairs: An Abstraction for Data-Intensive Cloud Computing”, IPDPS 2008
[61] D. Bernholdt, S. Bharathi, D. Brown, K. Chanchio, M. Chen, A. Chervenak, L. Cinquini, B. Drach, I. Foster, P. Fox, J. Garcia, C. Kesselman, R. Markel, D. Middleton, V. Nefedova, L. Pouchard, A. Shoshani, A. Sim, G. Strand, D. Williams. “The Earth System Grid: Supporting the Next Generation of Climate Modeling Research”, Proceedings of the IEEE 93(3), pp. 485-495, 2005
[62] A. Szalay, A. Bunn, J. Gray, I. Foster, I. Raicu. “The Importance of Data Locality in Distributed Computing Applications”, NSF Workflow Workshop 2006
[63] I. Raicu, Z. Zhang, M. Wilde, I. Foster, P. Beckman, K. Iskra, B. Clifford. “Towards Loosely-Coupled Programming on Petascale Systems”, IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SuperComputing/SC08), 2008
[64] I. Raicu, Y. Zhao, I. Foster. “IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS08)”, co-located with IEEE/ACM Supercomputing 2008, http://dsl.cs.uchicago.edu/MTAGS08/, 2008
[65] M. Pierce, I. Raicu, R. Pordes, J. McGee, D. Repasky. “How to Run One Million Jobs (Megajobs08)”, Bird-of-a-Feather Session at IEEE/ACM Supercomputing 2008, http://gridfarm007.ucs.indiana.edu/megajobBOF/index.php/Main_Page, 2008
[66] Y. Gu, R. Grossman. “Exploring Data Parallelism and Locality in Wide Area Networks”, IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS08) 2008
[67] E.V. Hensbergen, R. Minnich. “System Support for Many Task Computing”, IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS08) 2008
[68] L. Hui, Y. Huashan, L. Xiaoming. “A Lightweight Execution Framework for Massive Independent Tasks”, IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS08) 2008
[69] A.T. Thor, G.V. Zaruba, D. Levine, K. De, T.J. Wenaus. “ViGs: A Grid Simulation and Monitoring Tool for Grid Workflows”, IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS08) 2008
[70] E. Afgan, P. Bangalore. “Embarrassingly Parallel Jobs Are Not Embarrassingly Easy to Schedule on the Grid”, IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS08) 2008