SCHEDULING IN METACOMPUTING SYSTEMS

By Heath A. James, B.Sc. (Ma. & Comp. Sc.) (Hons), July 1999

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN THE DEPARTMENT OF COMPUTER SCIENCE, UNIVERSITY OF ADELAIDE

Contents

Abstract
Declaration
Acknowledgements

1 Introduction
  1.1 Project Motivation
  1.2 Why Metacomputing?
  1.3 Focus and Contributions of this Thesis
  1.4 Organisation of this Thesis

2 Cluster Computing
  2.1 Resource Managers
    2.1.1 Batch Systems
    2.1.2 Extended Batch Systems
  2.2 Middleware and Metacomputing Systems
    2.2.1 Middleware Tools
    2.2.2 Globus
    2.2.3 Harness
    2.2.4 Infospheres
    2.2.5 Legion
    2.2.6 DOCT
    2.2.7 WebFlow
  2.3 DISCWorld
  2.4 Discussion and Comparison of Metacomputing Systems
  2.5 Conclusion

3 A Review of Scheduling
  3.1 Scheduling Models
    3.1.1 Static Scheduling
    3.1.2 Dynamic Scheduling
    3.1.3 Hybrid Static-Dynamic Scheduling
  3.2 Independent Tasks
  3.3 Dependent Tasks
    3.3.1 List Schedules
    3.3.2 Clustering Algorithms
  3.4 System State Information
  3.5 Historical Review of Scheduling
  3.6 Discussion
  3.7 Conclusion

4 Scheduling Independent Tasks
  4.1 Introduction
  4.2 Scheduling Independent Jobs
  4.3 Scheduling Algorithms
  4.4 Algorithms Test Results
  4.5 Discussion and Conclusion

5 Modeling Scheduling and Placement
  5.1 Terminology
  5.2 Features and Constraints of Model
    5.2.1 Distributed Scheduling with Partial System State
    5.2.2 Model Inputs
    5.2.3 Characteristics of System Components
    5.2.4 Restricted Placement
    5.2.5 Heuristic Optimal Schedules
    5.2.6 Duplication of Services and Data
    5.2.7 Clustering of Services
    5.2.8 Non-Preemptive Scheduling
  5.3 A Distributed Job Placement Language
  5.4 Scheduling Model
    5.4.1 Cost-Minimisation Function

    5.4.2 Schedule Creation and Processor Assignment
  5.5 Discussion
  5.6 Conclusion

6 Implementing Scheduling in DISCWorld
  6.1 Schedule Execution Models
  6.2 DISCWorld Global Naming Strategy
  6.3 DISCWorld Remote Access Mechanism
    6.3.1 Operations on DRAMs
  6.4 DRAM Futures
  6.5 Execution in DISCWorld
    6.5.1 Schedule Creation
    6.5.2 Global Execution
    6.5.3 Localised Execution
  6.6 Summary

7 DISCWorld Performance Analysis
  7.1 Example Services
  7.2 A Detailed Example
  7.3 Performance Considerations
  7.4 Conclusion

8 Conclusions and Future Work
  8.1 Conclusions
  8.2 Future Work

A Distributed Geographic Information Systems
  A.1 Introduction
  A.2 Online Geospatial Data and Services
    A.2.1 Standards and APIs
    A.2.2 Implementation of a Standardised Geospatial Data Archive
  A.3 Distributed Systems for Decision Support
  A.4 Decision Support Applications requiring Distributed GIS
    A.4.1 Land Management and Crop Yield Forecasting
    A.4.2 Defence and C3I
    A.4.3 Emergency Services
  A.5 Conclusions

B ERIC
  B.1 Introduction
  B.2 Repository Design
  B.3 Imagery Operations
  B.4 ERIC Architecture
  B.5 Performance Issues
  B.6 Discussion and Conclusions

Bibliography

List of Tables

1 Metacomputing system characteristics summary
2 Middleware tools characteristics summary
3 Comparison of models found in scheduling literature
4 Simulation job duration times
5 Execution time and assignment latency for independent jobs
6 Description and examples of node characteristics
7 Formal definitions of DRAM operations
8 Comparison of service execution overheads for different execution environments
9 Performance of DISCWorld prototype using multiple services
10 Performance of DISCWorld prototype using single service
11 Single image compressed HDF to JPEG conversion times for images
12 Times to compute/retrieve cloud cover for Australia

List of Figures

1 Inspiration and creativity
2 Cluster example
3 An idealised batch queueing system
4 An idealised wide-area manager
5 Metacomputing system architecture
6 High-level DISCWorld architecture
7 DISCWorld daemon architecture
8 Level of integration of cluster computing packages with user programs
9 Graphical representation of scheduling precedence
10 List scheduling algorithms rely on task execution times and relationships
11 Clustering together nodes of a task graph
12 Complete system state information
13 Consequences of partial system state information

14 dploader program structure
15 Round-robin scheduling
16 Round-robin scheduling with clustering
17 Minimal adaptive sche...

25 Example DJPL script
26 Service and data placement pseudo-algorithm
27 Symbols used to explain the DISCWorld Remote Access Mechanism (DRAM) notation
28 DRAMs are more than pointers
29 DRAM Java base class
30 Service DRAM (DRAMS) Java class
31 Data DRAM (DRAMD) Java class
32 Example contents of a DRAMS
33 Example contents of a DRAMD
34 DRAM operations
35 Future DRAM (DRAMF) Java class
36 Example contents of a DRAMF
37 Example state of the DISCWorld environment
38 Example un-annotated DISCWorld processing request
39 Example annotated DISCWorld processing request
40 DISCWorld execution, steps i)-iv)
41 DISCWorld execution, steps v)-viii)
42 DISCWorld execution, steps ix)-xii)
43 A graphical user interface to DISCWorld
44 Selected components of DISCWorld architecture
45 DISCWorld Java classes and inner classes
46 DISCWorld thread hierarchy
47 Pictorial representation of a complex DJPL request
48 Complex processing request event timing sequence
49 Screen dump from GIAS implementation
50 Dedicated decision support system
51 Value-adding feeding chain
52 Hierarchical relationships between value-adders
53 Value-adding to data
54 Benefit of caching intermediate and final data
55 Special products can be developed for users
56 Breaking up a complex query into smaller, computable parts
57 GMS-5 satellite image and corresponding cloud cover bitmask

58 ERIC architecture
59 ERIC: Single image query form
60 ERIC: Single image results screen

Abstract

The problem of scheduling in a distributed heterogeneous environment is that of deciding where to place programs, and when to start execution of the programs. The general problem of scheduling is investigated, with focus on jobs consisting of both independent and dependent programs. We consider program execution within the context of metacomputing environments, and the benefits of being able to make predictions on the performance of programs. Using the constraint of restricted placement of programs, we present a scheduling system that produces heuristically good execution schedules in the absence of complete global system state information. The scheduling system is reliant on a processor-independent global naming strategy and a single-assignment restriction for data.

Cluster computing is an abstraction that treats a collection of interconnected parallel or distributed computers as a single resource. This abstraction is commonly used to refer to the scope of resource managers, most often in the context of queueing systems. To express greater complexity in a cluster computing environment, and in the programs that are run on the environment, the term metacomputing [74] is now being widely adopted. This may be defined as a collection of possibly heterogeneous computational nodes which have the appearance of a single, virtual computer for the purposes of resource management and remote execution.

We review current technologies for cluster computing and metacomputing, with focus on their resource management and scheduling capabilities. We critically review these technologies against our Distributed Information Systems Control World (DISCWorld).

We develop novel mechanisms that enable and support scheduling and program placement in the DISCWorld prototype. We also discuss a mechanism by which a job's internal structure can be represented, and high-level processing requests controlled. We formulate this using the Extensible Markup Language (XML).

To enable processing requests, which consist of a directed acyclic graph of programs, we develop a mechanism for their composition, scheduling and placement. A rich data pointer and a complementary futures mechanism are implemented as part of the DISCWorld remote access mechanism (DRAM). They also form the basis for a model of wide-area computation independent of DISCWorld.

We have implemented geospatial imagery applications using both simple RMI and the common gateway interface, and using the novel mechanisms developed in this thesis. After analysing and measuring our system, we find significant performance and scalability benefits.

Declaration

This work contains no material which has been accepted for the award of any other degree or diploma in any university or other tertiary institution and, to the best of my knowledge and belief, contains no material previously published or written by another person, except where due reference has been made in the text.

I give consent to this copy of my thesis, when deposited in the University Library, being available for loan and photocopying.

Although DISCWorld is a joint development of the Distributed & High Performance Computing project group and other students, the work reported here on scheduling in DISCWorld is my own work.

Heath A. James, B.Sc.(Ma. Comp. Sc.)(Hons) July 7, 1999


Figure 1: Calvin and Hobbes © Watterson. Reprinted with permission of Universal Press Syndicate. All rights reserved.

Acknowledgements

There are of course a great many people to whom I am indebted in various ways for their support during the years that I have been engrossed in the work reported herein.

First and foremost I would like to thank my wife and best friend, Christie, for everything. For everything including, but not limited to: encouraging me when I was down; listening to me when I was trying to work through a problem; leaving me alone when I needed it; making me work when I didn't want to but should have; for making me take breaks; for simply everything.

I am indebted to Mum and Dad, for their love and support during my candidature. For letting us come over and simply be ourselves, without the need for any ceremony. Thanks, too, Dad for being my house-hold fixer while I've been too busy to take care of the odd-jobs around the house.

To Sue and Bruce I extend my warmest gratitude for the continual love, support and encouragement that you have shown to me (and us) while I have so often been distracted by thoughts on my work. You, too, have been a pillar of support.

My sincerest thanks to my supervisor, Ken Hawick, for all his encouragement, patience and friendship over the past 3 years. The long conversations over coffee and cake will always be remembered fondly.

To my fellow members of the Distributed & High Performance Computing group: Ken, Paul, Francis, Kevin, Andrew, Duncan, Katrina, Din, Craig, James and Jesudas. Thanks for the good times. I have really enjoyed working with you all, and hope to continue doing so in the future.

Thanks to Kim Mason for all his Java Advanced Imaging help. I surely wouldn't have solved some of the implementation problems without the benefit of his experiences.

To all those people with whom I have shared offices: Dean, Matt, Ken, Paul, James, Jesudas and Duncan. I realize working next to me can be somewhat interesting at times - I just hope I've inflicted a small piece of insanity on you all.

Yay ... I finished!

This work was carried out under the Distributed High Performance Computing Infrastructure Project (DHPC-I) of the On-Line Data Archives Program (OLDA) of the Advanced Computational Systems (ACSys) Cooperative Research Centre (CRC), and funded by the Research Data Networks (RDN) CRC. ACSys and RDN are funded by the Australian Commonwealth Government CRC Program.

I would also like to thank Bob Gingold and Judy Henderson, from the Australian National University Supercomputer Facility, for supplying me with performance data from the Silicon Graphics Power Challenge and the Fujitsu VPP300.


Chapter 1

Introduction

Metacomputing [74, 211] is an abstraction by which clusters of computers in a distributed system can be treated as a single virtual computing resource. Scheduling is an important aspect of metacomputing systems, particularly where the scale of the system and the heterogeneity of its resources give rise to a large number of scheduling choices.

We have reviewed the current technologies in metacomputing systems, and specifically the way in which scheduling, or program placement across distributed systems, is approached. We have concluded that, with few exceptions, scheduling is performed in an ad hoc manner. Therefore, there exists an opportunity to study the way in which schedules for distributed systems are created and executed, and to apply this research to metacomputing systems.

We consider processing requests that can be decomposed into relatively coarse-grained computations interconnected by data dependencies. We explicitly include the case in which it may be more economical to move the computation to where the data resides, rather than to move the data to the computation.

We initially consider the effects of a number of scheduling algorithms on the total execution time of independent programs with execution times chosen from a given distribution. We conclude that no single scheduling algorithm performs optimally for all distributions of execution time across both homogeneous and heterogeneous distributed clusters. We suggest the use of an adaptive scheduling algorithm which initially creates a static execution schedule that is then adapted, using knowledge of partial system state, to minimise execution time.

The task of maintaining complete, accurate system state information in a distributed computational environment is very difficult. We hypothesise that an adaptive scheduling algorithm should be used to enable the performance of a metacomputing system to be optimised from two differing points of view: that of the user, in turnaround time; and that of the system, in the impact that computations have on the system in the long term. The scheduling algorithm makes use of an individual node's knowledge of the remainder of the global system state to generate optimal schedules, according to a given heuristic.

We present a set of abstractions with which we can describe the characteristics of data and programs, and the assumptions which make them valid in a general distributed system. These abstractions are used to create adaptive execution schedules with which a processing request may be satisfied and optimised. The experimental framework is implemented as an integral part of the broader Distributed Information Systems Control World (DISCWorld) metacomputing system.

This thesis does not attempt to provide solutions to all the research problems faced when implementing a metacomputing system. The focus of this work is on scheduling; we develop an algorithm and mechanisms for enabling smart scheduling of programs in the DISCWorld metacomputing system.
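The hybrid static-dynamic idea outlined above can be sketched briefly. The class names, the simple cost model and the re-placement threshold below are assumptions made for exposition only; this is not the DISCWorld implementation.

```java
import java.util.*;

/**
 * Minimal sketch of a hybrid static-dynamic scheduler: a static schedule is
 * built first from estimated execution times, then adapted at dispatch time
 * using whatever (possibly stale, partial) load information is available.
 */
class AdaptiveScheduler {
    static class Task { String id; double estimatedCost; Task(String id, double c) { this.id = id; this.estimatedCost = c; } }
    static class Node { String name; Node(String n) { name = n; } }

    /** Static phase: greedy assignment, longest estimated task first. */
    Map<Task, Node> staticSchedule(List<Task> tasks, List<Node> nodes) {
        Map<Node, Double> finish = new HashMap<>();
        nodes.forEach(n -> finish.put(n, 0.0));
        Map<Task, Node> plan = new LinkedHashMap<>();
        tasks.sort((a, b) -> Double.compare(b.estimatedCost, a.estimatedCost));
        for (Task t : tasks) {
            Node best = Collections.min(nodes, Comparator.comparingDouble(finish::get));
            plan.put(t, best);
            finish.put(best, finish.get(best) + t.estimatedCost);
        }
        return plan;
    }

    /** Dynamic phase: just before dispatch, move a task if partial system
     *  state shows a clearly less loaded node (20% threshold is arbitrary). */
    Node adapt(Task t, Node planned, Map<Node, Double> partialLoadState) {
        Node best = planned;
        for (Map.Entry<Node, Double> e : partialLoadState.entrySet()) {
            if (e.getValue() < partialLoadState.getOrDefault(best, Double.MAX_VALUE) * 0.8) {
                best = e.getKey();
            }
        }
        return best;
    }
}
```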

1.1 Project Motivation

In the nearly forty years in which computers have been available, the user base has broadened from isolated groups of mathematicians and engineers to the point where computers have become widely accessible and affordable commodity items with applications in scientific research, business management and entertainment. Their ubiquity is encouraging users from many disciplines to use computers.

For those users with large high-performance computing requirements, the trend has been to move away from expensive massively parallel machines to medium- or small-scale, general-purpose parallel machines and clusters of workstations, as well as combinations of distributed heterogeneous machines [6]. Together with the widespread adoption of computers into daily life, computers are also being used to gather and store data on a wide variety of phenomena [133, 167]. New equipment is being deployed which can produce many terabytes of data per day, such as multi-spectral Earth observation satellites [17]. With a proliferation of data being collected every day on various physical phenomena (including visual spectra, invisible spectra, geothermal, seismic and hydrological data), people are using this data to produce new, value-added data.

The application domains of environmental science and decision support are areas that require access to a wide variety of data that is characterised by being very large and expensive to produce. Common software packages that are available for use by environmental scientists [19] are unable to handle the volume of data required for processing. Thus, more users are turning to high performance computers, and collections of computers, to provide the necessary processing capabilities. Unfortunately, these people are not often computer science specialists, and may not possess the skills to effectively program and use the high-performance computing hardware that may be beneficial to them.

One of the characteristics of these application domains is the fact that, as the data is very large, it is also expensive to transfer across a network. Data is also being collected by different organisations [17, 35, 133, 164, 225] using different methods (including satellite, ground observations and radar data), and each owner may stipulate different conditions on the use of their data. The data may also be stored in repositories that reside in different parts of the same country, or even in different countries. Decision support systems often require near real-time results, so performance is a major consideration. The issues involved in implementing a distributed system that provides a computational infrastructure for the development of decision support and research applications are presented in appendix A.

The problem of enabling non-computer scientist users access to large-scale distributed systems is not trivial. Technical issues, such as ensuring each machine has a standardised interface which hides the heterogeneity of the platform and interconnection networks, must be addressed. In summary, what these users require is an abstraction over the computers and data repositories that make up their computational infrastructure, which does not require them to be computer science experts.

A prototype system which provides this functionality is described in appendix B. The Earth Observation Information Catalogue (ERIC) system allows authorised parties access to satellite imagery stored in an online data archive, and permits simple operations on the images and metadata using basic web-based client-server technologies. The inadequacies in that system have contributed to the design and philosophy behind this project.

1.2 Why Metacomputing?

With the advent of networking technologies such as Ethernet [50] and ATM [103] it has become possible to connect computers for the widespread, efficient sharing of data [221]. As high performance local- and wide-area networks have become less expensive, and as the price of commodity computers has dropped, it is now possible to connect a number of relatively cheap computers with a high-speed interconnect, to effect a local distributed computing cluster [115].

The granularity of code that is being run across collections of computers is becoming greater, and the need for frequent synchronisation is decreasing. For example, in massively parallel computers, such as the Thinking Machines CM-2 and the CM-5 running in single-instruction, multiple data (SIMD) mode, processors synchronously execute instructions, albeit on different data. This is feasible due to the high bandwidth and low latency of communications between processors. In local-area clusters of computers, where the interprocessor bandwidth may be comparable to that found in a supercomputer but latency is higher, stand-alone tasks that operate on a fraction of the total data to be processed by a program may be run on each processor (MIMD). Thus, while the cluster of computers and the SIMD supercomputer may have the same amount of data to process, the application executing on the cluster will probably perform less synchronisation between processors. Furthermore, the trend is to move towards service-based computing, where stand-alone programs, requiring very little, if any, synchronisation, are run on distributed computers, which may themselves be massively-parallel processors (task parallelism). Hence, variable interprocessor bandwidth and latency characteristics are tolerable.

Clusters of distributed computers, as well as high performance serial and parallel machines, can be interconnected, speaking common communications protocols, to form a large virtual supercomputer [2, 39, 69, 95, 113]. The general trend for computational capacity has thus been to move from monolithic single-processor computers in the early 1970's, to multi-processor parallel computers, in which the interprocessor bandwidth was high and the latency quite low, to clusters of commodity workstations with comparatively high bandwidth and latency, and ultimately to so-called metacomputing environments, which connect heterogeneous clusters of high-end and low-end computers to form a virtual supercomputer.

The term metacomputer [74, 211] was coined to describe a collection of possibly heterogeneous computational nodes which can be treated as a single virtual computer for both resource management and remote execution purposes [19]. General metacomputing environments allow users to submit serial or parallel programs and have tasks or jobs run on the virtual computer. An alternative definition of metacomputing is provided by Gehring and Reinefeld [7s]: "a monolithic computational resource provided by software that allows the transparent use of a network of heterogeneous, distributed computers".

1.3 Focus and Contributions of this Thesis

Users are attempting to solve larger problems, using data which may not be stored at one physical location, or which cannot be contained within a single cluster of computers or a single multi-processor. This provides incentives for users to join metacomputing environments, and for owners to join larger clusters of computers, providing a higher aggregate compute capacity for users, which increases the chances of higher resource utilisation. Of course, by allowing users access to other clusters of computers, owners open their own resources to use by the wider community in a quid pro quo relationship.

We have examined a number of different systems [2, 39, 69, 95, 153, 194] that provide approaches to metacomputing, but we believe that each of these systems suffers from deficiencies which make them inappropriate for the problem domain of high-level decision support services. For example, some systems focus more on the co-allocation of processors [69], while others focus on the effective use of idle machines [153]. Still others focus on the compositional nature of collaborative systems [39]. We address these deficiencies by introducing an environment model in which users can submit high-level processing requests to a daemon which runs on a local machine. The request is parsed by the daemon and decomposed into smaller, self-contained programs which together can access and process the data to fulfil the user's request. From the user's point of view, the system consists of a single virtual machine to which they submit requests, and from which they can retrieve results.

One of the most important aspects of a metacomputing system is the method by which programs are allocated to processors in a fashion that will yield the best results from the viewpoint of both the user and the owners of the resources. This problem is called the scheduling and resource management problem, and it is the main focus of this thesis.

El-Rewini [54] and Ahmad [6], amongst others, recognise the need for developing novel techniques for the management of computing resources through highly efficient dynamic task allocation, scheduling and load balancing algorithms. One of the major problems that the wider community studying scheduling systems has not addressed is that of not being able to place tasks onto every node in the distributed system. While this limitation has been alluded to by the introduction of restrictions based on machine (and hence binary code) heterogeneity, we wish to address the problem of controlling the system from the viewpoint of a platform independent language, such as the Java programming language, in which tasks, or services, are written and are distributed throughout the machines in the system. Some of the services in the system may be implemented as native programs, encapsulated in Java wrappers. Although services implemented in pure Java are physically able to execute on any of the nodes (due to the platform independence of the object code or byte code), they can be restricted in their choice of machine to be run on by virtue of machine-dependent policies. Policies are controlled by the machine's owner or administrator. This has an effect on the manner in which the data and programs in the system can be used and transferred, and also on the way that the scheduling system must adapt to the conditions.
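As a rough illustration of this restricted-placement idea, a scheduler could carry a small descriptor for each service recording whether it is pure Java or native, and which hosts its owner's policy permits. The class and field names below are assumptions made for exposition, not DISCWorld's actual interfaces.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Illustrative sketch only: a per-service descriptor a scheduler could
 * consult when deciding which hosts are eligible to run a service.
 */
class ServiceDescriptor {
    String name;
    boolean pureJava;              // false if the service wraps native code
    String requiredArch;           // only meaningful when pureJava == false
    Set<String> policyAllowedHosts = new HashSet<String>(); // empty means no policy restriction

    /** A host is eligible if the owner's policy permits it and, for native
     *  services, the host architecture matches the binary. */
    boolean eligibleOn(String host, String hostArch) {
        if (!policyAllowedHosts.isEmpty() && !policyAllowedHosts.contains(host)) {
            return false;          // excluded by the machine owner's policy
        }
        return pureJava || requiredArch.equals(hostArch);
    }
}
```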

1.4 Organisation of this Thesis

In order to properly place metacomputing into its historical perspective, a review of current and influential cluster computing projects is presented in chapter 2. The classical approaches of cluster control via batch queueing systems and wide-area managers are discussed. The concept of metacomputing, and the characteristics of metacomputing systems, are then discussed. The chapter concludes with an introduction and discussion of the Distributed Information Systems Control World (DISCWorld) metacomputing project, in which this scheduling work is developed. We critically compare DISCWorld to the other systems reviewed in chapter 2.

A literature survey on scheduling research, with focus on processor selection and allocation, is presented in chapter 3. The conclusion from this work is that there is very little scheduling research that considers the problem of restricted placement of code, and that considers both code and data to be movable between nodes in a distributed system.

Chapter 4 presents work that was performed in the area of scheduling independent tasks across a cluster of distributed computational resources. Simulations were performed using a number of different static and dynamic scheduling algorithms under conditions that restrict the use of the systems, as mentioned in chapter 2. We reach two important conclusions: firstly, that there is no single scheduling algorithm that produces the best execution schedule when there are greatly varying numbers of jobs to be executed, with different job execution time distributions; and secondly, that the classification of scheduling algorithms into either static or dynamic is too restrictive. We suggest that new descriptive terms, representing hybrid approaches in the presence of complete or partial system state information, are needed.

Scheduling, in the presence of restricted placement and with movable code and data, is the focus of chapter 5. Scheduling is modeled in accordance with the features and restrictions found in the prototype DISCWorld metacomputing system. The resulting model is general to distributed systems, even in the presence of partial system state information. Partial system state information is evidenced in typical distributed environments, where elements of the system are perhaps owned by different organisations and access by the metacomputing system is subject to the owner's policies. Our model requires that there exist a global naming scheme for data and code, and single assignment of data. In this context, single assignment of data implies that once data is created and given a canonical name, any modification of the data will result in new data being created, having a different although related name. A description and discussion of the implementation of the scheduling model is presented. Also featured is a brief discussion of work on a distributed job placement language (DJPL). User requests are represented by process networks of high-level programs which share data in a producer-consumer relationship. The purpose of the DJPL is twofold. Firstly, it is used to express the user query when decomposed into programs and shared data; secondly, it also describes the static assignment of programs to machines within the DISCWorld system. In addition, the DJPL contains information that describes the user request and the parameters under which it may run, and also details the appropriate behaviour of the system when an exception condition is raised. We believe that the DJPL is sufficiently general that it may be effectively integrated with other high-level metacomputing systems.

Execution mechanisms associated with the scheduling model are presented in chapter 6. We describe a novel data access mechanism, the DISCWorld Remote Access Mechanism (DRAM), and discuss the manner in which it is used in our implementation of the DISCWorld model. The DRAM concept is extended to refer to data which has not yet been created, the Future DRAM (DRAMF). The DRAMF facilitates the execution of schedules in the DISCWorld metacomputing environment and allows executing schedules to be dynamically optimised to achieve the best performance using available state information. This optimisation is evidenced by possibly faster request execution time for the user and a lower overall system impact.

Performance analysis of the DISCWorld prototype is presented in chapter 7. The system is compared with ERIC, a simple client-server system of the same functionality, and with an RMI implementation of ERIC. Limitations of the implementation are discussed.

The thesis is concluded, and future work is described, in chapter 8. Other relevant work is summarised and presented in appendices A and B. Appendix A is a summary of a paper detailing the motivation for the concept behind the DISCWorld system philosophy. Appendix B describes a user interface to a satellite data repository that was built at the beginning of this work. It describes the way in which a simple client-server technology, such as CGI [100], can be effectively used to provide server-side computation across the WWW. We show that the technology is inadequate for the types of systems detailed in appendix A due to: its speed; the total reliance on server-side computation; the poor client support; and the fact that the technology is not scalable to complex distributed systems.

Chapter 2

Cluster Computing

This chapter provides a review of the software that has been developed in response to the trends mentioned in section 1.2. The remainder of this chapter is divided into four sections: a summary of resource management technologies; a description of metacomputing projects; an introduction to our prototype metacomputing project, Distributed Information Systems Control World (DISCWorld); and a discussion in which other systems are compared and contrasted with DISCWorld.

The evolution of computer systems from single monolithic mainframes, to massively parallel machines, to groups of tightly (and loosely) interconnected distributed workstations, has forced a change in the manner in which the systems are used, and in what can be achieved with the systems. As computers, and groups of computers, become more complex, greater abstractions are necessary to ensure that users are able to exploit the computational power effectively, and to ensure that the resources are fairly and efficiently used.

While there is much debate about exactly what a cluster is [120], in this thesis we use the definition proposed in [186]: a cluster is a collection of interconnected parallel or distributed machines that may be viewed and used as a single, unified computing resource. Clusters can consist of homogeneous or heterogeneous collections of von Neumann (serial) and parallel architecture computers, or even sub-clusters. An example of a cluster is shown in figure 2, where servers 1...n-1 are workstations and server n is an interface to a sub-cluster. The term cluster usually refers to a tightly connected group of computers, connected by a high-speed interconnection network.

A resource manager is the name given to the software that provides management functions and abstractions that allow clusters to be treated as a single resource.




Figure 2: A cluster of computers, where servers 1...n-1 are workstations and server n is either an interface to a sub-cluster or a multi-processor machine. Any machine may provide computational and cluster-management functions.

Furthermore, resource management software is employed by system managers to ensure that resources are available according to pre-defined criteria. Resource management software can be used, for example, to ensure that all users' jobs are executed in turn and hence receive the same amount of system resource; or, by not allowing all users to immediately execute their jobs, to spread load on the machine out so that the shared resource can be effectively utilised.

There are a number of specialised resource management software products available. These may be divided into batch queueing systems and extended batch systems. Batch queueing systems are designed for use on tightly interconnected clusters, which usually feature shared file systems. Extended batch systems, designed for use in loosely interconnected clusters, do not usually make assumptions about shared file systems, and often offer increased functionality over typical batch queueing systems. Examples of batch queueing systems are: DQS [91, 220]; GNQS [139]; PBS [116]; EASY [151]; LSF [138]; and LoadLeveler [123], while examples of extended batch systems are: Condor [153]; PRM [172]; CCS [195]; and Codine [81].

2.1 Resource Managers

Baker, Fox and Yau [20, 21] divided cluster computing software into two groups: cluster management software (CMS); and distributed cluster computing environments (DCCE). Their definition of CMS maps onto our classifications of batch queueing systems and extended batch systems. Cluster management software systems are typically characterised by the user submitting a job description file specifying the programs to be run and the types of resources that are necessary for them to run. The resource management system selects an appropriate host on which to run the job, and plays no further part in the execution until either the job completes successfully, or needs to be killed for consuming too much resource. Finally, the resource manager places the outputs of the program execution at a place nominated by the user for later collection.

There are various forms of result notification, including the manager placing the results in the directory from which the batch job was submitted, or the user being sent email with the location of the results. Thus, batch queueing systems do not individually schedule the programs that comprise a job; they merely place the job onto a sufficiently unloaded server. The only form of scheduling they perform is the matching of available computational capacity with the job's requirements. The user typically submits their job to a queue that best reflects the job's characteristics (e.g. short, medium, or long job execution time; priority based; requiring parallel processors). In addition, batch queueing systems have no real support for platform heterogeneity. Users must ensure that binaries are available for the architecture on which the job will be placed.

An idealised batch queueing system is shown in figure 3, where jobs are submitted to a centralised queue master, placed by the queue master onto a compute server, and the output is returned to the user after processing is complete. Users typically have no control over which server executes their job. In contrast, an extended batch system is shown in figure 4. Users can submit their job to any server participating in the cluster. If the server to which the job is submitted does not have the available computational capacity to handle the job, it is moved to another server.

Network Jobs assigned' to servers Queue Servers have no Master choice but to accePt jobs from queue master

Jobs submitted

Figure 3: Architecture of a typical batch queueing system. Jobs are submitted to the queue master. The queue master assigns jobs to a server. When the job has finished executing, the results are returned to the user.


Server Server Server Server 1 2 n-1 n

+ Jdbs + transferred between servers

Local Non-Local Servers Servers

the local Figure 4: A wide-area manâger. Jobs may be submitted to any server. If the job' ,"*., is unable to execute the submitted job, the server decides where to send job Some systems use limits on the maximum number of times a can be transferred can be to ensure it is eventually executed. The scale of network is larger and servers more loosely interconnected than in a batch queueing system. z.L.L Batch SYstems

The two main reasons for using a batch system are: to maximise the use of shared resources; and, to make sure evelyone can effectively and equitably utilise that resource. They are not predominantly used to utilise spare cycles on users' machines, althougkr rrany packages provide this functionality [153]' As the name suggests, batch systems are only useful for the initiation and controlling of batch jobs. Although it is common to submit jobs to a batch system as a shell script, queue masters do not interpret the instructions contained within the script. Most systems merely parse the script for job-identification tokens' such as the , r r L--:a1^l rt^^ :^L --.L^+ rrêôdô ic ovnonf prl hw thp name oI t,ne user lllat suullubLEu utrts JUU .r,Ilu^-l wlrd,u rçùuurvv uùuõv job. The script is then executed on the server to which the job was assigned, which runs the instructions contained therein. Thus, batch queueing systems are of little use when presented with a job that consists of a series of dependent programs which may be interactive, and may be possible to execute in parallel. These are the type of jobs often submitted to a metacomputing environment' While the batch queueing systems described in this section are used successfully 2.1. RESOURCE MANAGERS 13

was that with clusters of workstations, one of the conclusions of Baker et al lzll and distributed compute software being deveroped for compute cruster environments workstation clusters' cluster environments provide the most viable means of utilising and distributed The main reason for this is that compute cluster environments accessing machines compute cluster environments provide more flexibility than the only schedulers individually. Batch queueing systems often assume that they are executing on the resources' DQS and PBS are fragile Through experience, we have discovered that GNQS, network interfaces' If the when the machine that runs the queue master has multiple of machines that can only queue-master machine is used as a front-end to a cluster fail' we believe that this is communicate through the front-end, these batch systems the front-end machine's primary due to the batch queueing system only recognising connect to the cluster, are network interface. secondary interfaces, such as those that resource manager designed ignored. This problem can be solved by using a wide-area for such conditions, such as Condor' providing infrastructure for In a distributed computing context, with the aims of entirely useful' While they decision support systems, batch queueing systems are not requests which are provide effective resource utilisation, they are not able to handle programs can be logically run in comprised of dependent programs' especiall¡' if the the data to parallel. In addition, they do not support the wide-area model in which be used by a program is not immediately accessible'

GNQS driven by the need The development of Network Queueing system (GNQS) was for a good uNIX based batch and device queueing facility capable of supporting 139]' Later such requests from a networked environment of UNIX machines [117' as a common interface superseded by Generic NQS (GNQS), it was implemented proglams to provide both the batch between machines, using a collection of user-space GNQS relies on and device queueing capabilities for each machine in the network. user database' the cluster of workstations sharing a common file-system and common within GNQS, a batch request is defined as a shell-script containing commands, command that can be executed without any user intervention by an appropriate interpreter. The commands contained within a batch request cannot require any as a set of specific peripherals. A device request, on the other hand, is defined instructions requiring the direct services of a specific device for execution' 14 CHAPTE,R 2. CLUSTER COMPUTING

GNQS has support for all the resource quotas that can be enforced by the underlying UNIX kernel implementation, and can support remote queueing and routing of batch and queue requests throughout the network of machines running GNQS. GNQS aiso supports queue access restrictions, and allows user accounts to be mapped across machine boundaries. Command standard output and error can be returned to the user on the originating (possibly remote) machine' Remote status a machine operations are supported to alleviate the need for remote users to log into the inclusion in order to obtain information. The designers of GNQS also provided for this of file staging, which was to be implemented in a future release; unfortunately has not happened.

PBS and written Portable Batch system (PBS) [23,116] is a package which was designed by the Numerical Aerodynamic Simulation Complex, NASA, from L994' PBS is a designed successor to NQS, and addressed many of the deficiencies of NQS' It was to provide additional controls over the initiating, or scheduling, of execution of batch jobs. pBS extends the UNIX operating system by the addition of user-level services. Unlike NQS, it had detailed design and implementation documentation for developers. It was designed to be easy to add functionality and improved over NQS in a number of rffays, including the provision for parallel jobs. PBS also included features that allow the scheduling policy to be modified according to a site's needs. A batch scheduling Ianguage has been designed for use with PBS, but the documentation recommends that it not be used, as there are implementation problems'

DQS from the Distributed Queueing System (DQS) lgl,220l is a batch queueing system Supercomputer Computations Research Institute, Florida State University, which +^ usg¡i--- slra,Ieu-L^*^r cì^^IIIYÐ uu rrrórr.1Éç'*ôn4õ^,'oo.-crrlrmiftod Llùçr-ùuv¡r¡¡uulu Jvvu'inhq Tt rrses ìimited,svstem infOfmation (the instantaneous and past load on a machine) and has no real support for binary heterogeneity. Managers can set up queue complexes, but the implementation is fragile, and must be monitored. It is fault tolerant, able to restart jobs on machines that have failed. In order to use the system, users must be familiar with the creation of shell-scripts in which they place the program(s) to be run. DQS supports parallelism via the PVM [79] system. 2.1. RESOURCE MA.IVAGERS 15

LSF a very popular Load Sharing Facility (LSF) [187,188], by Platform Computing, is LSF relies commercial batch queueing system. Like most batch queueing systems, logs' addition to on the existence of shared frles to implement queues' Iocks and In some measure of fault standard features found in batch queueing systems, LSF has available, the degree of fault tolerance inbuilt. In the event a shared file system is not scheduling decisions) fails, tolerance is reduced. If the master host (which makes the master' Hosts are elected another host in the cluster is automatically voted to be file which must be visible to all to be the master in the order they appear in a static becomes partitioned by network machines in the cluster. In the event that the cluster working' while the failure, the partition that has access to the LSF Iog files continues remaining Partitions sit idie' logical names' unlike GNQS and PBS, LSF supports machines with multiple hosts, their resource Jobs are scheduled according to the availability of available before the queue ceases to requirements, and whether the job should be abie to finish for some machines run. LSF provides job check-pointing and migration facilities [188]' jobs so that dependent jobs It is possible to express dependencies between complete Because LSF cannot be are only run once their dependencies have been completed' has the ability to submit jobs expected to run on all nodes in a distributed cluster, it to an NQS batch queue. The user LSF has the ability to run parallel jobs through the use of PVM [79]' When the job is must submit their job using a non-standard command (prrnjob)' and then starts the executed, LSF creates PVM daemons on each host in the cluster user's job.

LoadLeveler version of the Condor batch queueing Loadleveler [123], bY IBM' is a modified [153] machines' system. Loadleveler supports clusters of workstations and multi-processor jobs, with NQS' It has parallel support for batch and interactive and is compatible allows TCP/IP Loadleveler has an application programmer interface (API) which applications to directly connect with least-loaded cluster resources' into The Extensible Argonne scheduling sYstem (EASY) [150] was incorporated scheduling Loadleveler to produce EASY-LL [149,210]. EASY provided a better the algorithm through which jobs could be selected to run' Jobs are ordered in execution in submission queue by their submission time. Jobs are considered for 16 CHAPTER 2. CLUSTER COMPUTING this order. If the scheduling system has enough free nodes to run the first job, it is run. Otherwise, the first job is assigned the start time of the currently-running job's finish time. Jobs requiring fewer resources are executed as long as they will complete before the first job is due to start. Thus, jobs are back-filled to make better use of machine resources. The EASY-LL scheduling system was one of the first to provide deterministic, yet opportunistic scheduling algorithm based on the bin-packing algorithm [76].

2.1.2 Extended Batch Systems

Extended batch systems are characterised by having additional functionality over batch queueing systems. For example, some systems have the ability to act as disk caches [12], to harness idle CPU cycles on participating machines while the owner is not using them [12, 153], or to manage software licenses [81]. Still others make searching the nodes in the cluster easier for fine-grained parallel tasks [122]. The remainder of this sub-section provides a description of four significant extended batch systems: Prospero Resource Manager, Codine, Condor and Computer Center Software.

Prospero Resource Manager

The Prospero Resource Manager (PRM) [171, 172] is designed for use with distributed and parallel programs implemented in PVM [79]. It presents a uniform and scalable model for scheduling tasks by providing mechanisms through which nodes on multiprocessors can be allocated to jobs.

The main contribution of PRM is its hierarchical organisation of system resources into a framework that can be easily searched for use by fine-grained parallel applications. Architecturally, PRM is divided into two distinct parts: the system, and jobs. The system part is concerned with the resources that comprise the system, which are global and local. When job execution is requested by the user, the tasks that comprise a job are allocated to different resources. System management resources are allocated to jobs as they are needed, and resources are managed separately for each job. Multiple resource managers are used to achieve resource management scalability, each controlling a subset of the resources. There are three types of managers: system managers, job managers and node managers, each of which uses a different level of information abstraction.

The system manager controls a collection of physical resources, allocating them to jobs when requested. It collects and manages all the information pertaining to the resources under its control. The job manager requests resources needed by a particular job, and assigns them to individual tasks in the job. The job manager's only concern is the job to which it is allocated. The node manager loads and executes the tasks on a local machine, as requested by the job manager and authorised by the system manager. The managers organise information using the Prospero Directory Service [169, 170], which is based on the Virtual System Model. System managers are organised in a hierarchical fashion.

On job execution, decisions on which resources to use are determined by the job manager querying the directory service for a suitable resource set. During execution, the job manager monitors the tasks, and passes any requests for additional resources to system managers. Resource heterogeneity is handled by the use of architecture identifiers in the directory service.

Unfortunately, the only way in which Prospero can execute a parallel job is by the use of the PVM system. This introduces restrictions on the programs that can be used with the system. In the context of decision support infrastructure, we do not wish to constrain developers to writing parallel code in PVM.
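The three-level division of responsibility described above might be pictured as follows. The interfaces and method names are illustrative assumptions only, not the Prospero API.

```java
import java.util.List;

/** Illustrative sketch of the three PRM manager roles described in the text. */
interface SystemManager {
    /** Owns a set of physical resources; grants some of them to a job on request. */
    List<String> allocateNodes(String jobId, int nodesRequested);
}

interface JobManager {
    /** Acts for a single job: requests resources and maps them onto the job's tasks. */
    void acquireResources(SystemManager system, int taskCount);
    void assignTaskToNode(int taskId, String node);
}

interface NodeManager {
    /** Runs on one machine: loads and executes tasks when authorised. */
    void startTask(int taskId, String executable, String[] args);
}
```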

Codine

Codine [81], by Genias Software, is a widely-used commercial resource management software system. It has the ability to manage not only physical computational resources such as memory and disk space, but also software licenses. It supports check-pointing and migration of user jobs, and is accessible through a user-level API. Although the level of functionality is greatly increased over other batch queueing and wide-area management systems, the architecture of Codine is queue-based, as found in many other batch queueing systems such as DQS, NQS and PBS. It does, however, handle wide-area clusters. There must exist a queue master, which performs the scheduling of user requests onto computational resources, and each machine comprising the cluster must execute a daemon. Jobs are scheduled according to a first-in first-out (FIFO) scheme, and the cluster administrator can determine the way in which jobs are placed onto resources. The system allows dependencies to exist between jobs submitted to the queues. Parallel execution packages, such as PVM and MPI, are directly supported. When the user submits a request in which the program uses one of the supported parallel packages, multiple resources are co-scheduled while the job executes.

There is no direct support for jobs to be split across multiple resources.

Codine was later merged with other tools to form the Global Resource Director (GRD) [82]. Directed at critical resource management issues, GRD was initially released for the Cray and SGI platforms. Codine provides job management capabilities, while PerfStat [83] provides performance monitoring capabilities. An advanced scheduler, GDS, is used to implement multiple scheduling policies such as functional priority (by user, department, job class or project), share-based (user share tree or project share tree) and urgency-based (initiate time or deadline) policies [66]. In addition, GDS allows the system to be temporarily overridden when the need arises. GDS allows dynamic scheduling of user jobs and utilisation management of resources.

Condor

Condor [153] is a distributed batch system developed at the University of Wisconsin-Madison from 1988 to execute long-running jobs on workstations that are otherwise idle. The emphasis of Condor is on high-throughput computing. The Condor system recognises that not all users utilise their systems all of the time, and that while they are not using their systems, they may be used for other processing tasks. While Condor provides the user with a uniform view of processor resources, thus making remote access to a compute resource easy, it also guarantees that the workstation owner will receive the immediate response of their workstation, should they wish to use it. Condor also guarantees that the user's program will be run on an unloaded machine, and allows a user's jobs to be run on any machine in the processor pool, whether or not the user has a valid account on the target machine.

Scheduling and resource management in Condor is achieved through the use of matchmaking [192]. Matchmaking addresses the fact that in metacomputing environments it is highly likely that resources will be owned by different organisations, each of which has its own usage and management policies. For example, a management policy may state that a particular resource is only available to core staff between certain hours, but is to be available to all bona fide Condor users outside of those hours. Matches are made using a semi-structured classified advertisements data model, which is used to make queries and publish services. A designated matchmaker is used to match queries with services and inform the entities of the match.
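The essence of matchmaking can be sketched as follows. This is a deliberate simplification of Condor's classified-advertisement model, and the attribute names and interfaces are invented for illustration.

```java
import java.util.Map;

/**
 * Illustrative sketch of matchmaking: a resource offer and a job request each
 * publish attributes and a requirements predicate; the matchmaker pairs them
 * only when both sides' requirements accept the other's attributes.
 */
class MatchMaker {
    interface Ad {
        Map<String, Object> attributes();                      // e.g. "Memory" -> 256, "Owner" -> "core-staff"
        boolean requirementsMetBy(Map<String, Object> other);  // owner or user policy as a predicate
    }

    /** A match requires mutual acceptance. */
    static boolean match(Ad resourceOffer, Ad jobRequest) {
        return resourceOffer.requirementsMetBy(jobRequest.attributes())
            && jobRequest.requirementsMetBy(resourceOffer.attributes());
    }
}
```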

Computer Center Software (CCS)

Based at the University of Paderborn, the Computing Center Software [194-196] project is concerned with the resource access and allocation problems in metacomputing environments. It offers the user the option of running jobs in either interactive or batch mode.

Scheduling within CCS is performed via an Implicit Voting Scheme (IVS) [193] coupled with a priority system to express the urgency of tasks. If the system is not congested, a simple first-come first-serve (FCFS) algorithm is used. The FCFS algorithm is explained further in section 4.3. Once the system becomes congested, however, the job mix is inspected and, if most of the jobs are batch programs, a first-fit decreasing-height (FFDH) algorithm, which is an instance of the bin packing algorithm, selects the first available job that takes the greatest number of processors. Tasks are assigned so as to maximise the average utilisation of the processors that comprise the metacomputer. If most of the jobs in the request list are interactive, another algorithm, first-fit increasing-height, is used, which gives preference to shorter jobs and reduces the average waiting times. One of the significant contributions of CCS was the development of a resource definition language (RDL) [22], which targets and specifies the types of interconnections between reconfigurable Transputer-based parallel computers.
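The first-fit decreasing-height selection mentioned above can be sketched generically, taking "height" to be the number of processors a job requests. This is a bin-packing-style illustration with invented names, not the CCS code.

```java
import java.util.*;

/**
 * Illustrative first-fit decreasing-height selection: consider pending jobs
 * in decreasing order of requested processors and start each one that still
 * fits in the remaining free processors.
 */
class FfdhSelector {
    static class Job {
        final String id;
        final int processors;
        Job(String id, int processors) { this.id = id; this.processors = processors; }
    }

    static List<Job> select(List<Job> pending, int freeProcessors) {
        List<Job> ordered = new ArrayList<>(pending);
        ordered.sort((a, b) -> Integer.compare(b.processors, a.processors)); // decreasing "height"
        List<Job> chosen = new ArrayList<>();
        for (Job j : ordered) {
            if (j.processors <= freeProcessors) {   // first fit
                chosen.add(j);
                freeProcessors -= j.processors;
            }
        }
        return chosen;
    }
}
```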

2.2 Middleware and Metacomputing Systems

The primary purpose of a resource management system is simply to provide fair, sometimes efficient, use of shared computational resources. Users are unwilling, and unable, to understand and implement the framework associated with fault tolerant distributed systems. This is further discussed in appendix A. One of the fundamental purposes of metacomputing environments is to allow larger and more complex problems to be solved without the need for detailed technical knowledge of the computers that provide the computational services. Although one of the mechanisms for providing this is through resource management systems, they do not provide enough functionality to be classified as metacomputing systems.

computing environments do not support single jobs that comprise a number of largely independent tasks that can be run in parallel. In those systems that do, this is achieved through the use of a shell script in which the programs to be run are contained. Upon execution, the separate programs are executed by the target machine as if they were a single program. Although batch queueing systems and wide-area resource management systems have provided good solutions to the problem of managing a cluster of workstations, they fail to scale up to thousands of machines, which may dynamically join and leave the extended cluster. They are also unable to manage and reason about the re-use of results. We characterise metacomputing environments as having the ability to process a request for execution that comprises independent tasks, and to schedule each as a separate request, organising task intercommunication appropriately. In addition, metacomputing environments have the ability to re-use results, and sometimes partial results, in further computations.

Baker and Fox point out that cluster management systems and distributed cluster control environment software either come as a software layer positioned between the native operating system and user applications, or as a partial or complete replacement for the native operating system [20]. Metacomputing environments, too, can either be positioned as a software layer above the operating system, or as a replacement for the operating system, from the application's point of view.

Middleware is a layer of software that provides high-level services to applications, abstracting over low-level details that may differ between platforms. This software layer allows multiple processes running on one or more machines to interact transparently across a network. The relationship between middleware, metacomputing and traditional computer architecture is shown in figure 5. Middleware exists on top of the host computer's operating system, and in some cases takes over most of its functionality [95]. Enabling technologies include DCE [197] and CORBA [177]. User queries are made to the middleware, which executes applications on behalf of the user. The most typical example of this is a remote execution request. Middleware tools are described in the next sub-section, where examples of middleware tools are described and discussed.

Metacomputing systems make use of the services provided by middleware, and focus on larger-scale distribution and larger problems. There are a number of metacomputing systems in existence [11,26,39,61,69,89,95,113,158,200]. Some well-known metacomputing systems are Globus [69] and Legion [95]. A review is given in [19], which provides a description of a number of metacomputing systems, as well

[Figure 5 diagram: two servers, each layered as user applications, middleware and enabling technologies, operating system, and hardware; user queries enter above the applications, logical message transfer occurs between the middleware layers and physical data transfer between the hardware layers.]

Figure 5: The relationship between middleware software and modern, large-scale computing. Metacomputing systems can be constructed to use the services provided by middleware. In this diagram, user queries are placed slightly higher than applications to indicate that the applications are used to satisfy user queries. Enabling technologies include those which provide naming and directory services, such as DCE [197] and CORBA [177].

as what may be termed middleware tools. Middleware tools are tools or systems that are independent of a particular metacomputing system, but are designed to run in the framework provided by a complete system. The Globus and Legion systems recognise several general characteristics of metacomputing systems, some of which may be mutually exclusive in a real implementation:

Scale and the need for selection. There is a need for metacomputing environments to be scalable to large numbers of distributed machines. The system should also have the ability to sensibly choose to use a particular machine, or group of machines, in preference to others.

Heterogeneity at multiple levels. With the proliferation of different computer architectures and interconnection network architectures, systems must be able to adapt in order to make the best use of such heterogeneity. It follows that the

software that runs on the distributed machines must also be capable of being heterogeneous, from the point of view of machine usage policies and the software that provides high- and low-level services.

Unpredictable structure. Unlike traditional supercomputers, it is not possible to completely specify in advance what the processor topology or interconnection network characteristics will be when running a processing job. For this reason, metacomputing environments must be able to withstand changes in environmental characteristics.

Dynamic and unpredictable behaviour. In contrast to supercomputers that may run a single processing job at a time, it is almost a certainty that users will be sharing resources in a metacomputing environment, from processors and interconnection network bandwidth to I/O devices.

Multiple administrative domains. The problem of allowing a user to run their code at a different site introduces issues of security and authentication to the general problem of metacomputing, as do the issues of charging for system time and of access policies for users of the metacomputing environment.

A selection of metacomputing projects, representative of the state-of-the-art, is presented in subsections 2.2.2 through 2.2.7. Our metacomputing project, DISCWorld, is introduced in section 2.3. DISCWorld is compared against other metacomputing projects in section 2.4, and the chapter is concluded in section 2.5.

2.2.1 Middleware Tools

In this sub-section we discuss tools and systems that, while not recognised as metacomputing systems in their own right, are designed to run in a metacomputing framework, or to provide some of the functionality of a metacomputing system.

There exist a range of tools that facilitate the writing of parallel and distributed programs, such as the MPI-CH [97] implementation of the Message Passing Interface standard [98] and PVM [79]. Both systems present the user with a message-passing environment which may be composed of a heterogeneous collection of computers. While not true metacomputing systems, these tools have played a large and important part in increasing the popularity and accessibility of parallel and distributed applications. MPI-CH and PVM provide an application programmer interface (API) and run-time communications library against which a programmer can write and execute their applications. PVM additionally provides a run-time environment with which the user can interrogate the virtual machine, add or delete hosts, and control running parallel jobs. The virtual machine is independent of any single parallel program execution. In contrast, the virtual machine created by MPI-CH exists for the duration of the currently-running parallel program only; additional nodes are unable to be requested or released at run-time. In both systems, issues such as fault tolerance and reliability must be addressed by the user. Programs which are built over an existing message passing library such as PVM, for example PLUS [31], re-produce some functionality of a metacomputing system; and some projects emulate parallel computers with off-the-shelf systems, such as the Berkeley NOW project [12], which schedules parallel workloads across clusters of computers [16,52].

DCE

The Distributed Computing Environment (DCE) [180, 181, 197, 198] from the Open Software Foundation (OSF) is one of the most influential distributed computing products on the market. DCE provides services and tools that support the creation, use and maintenance of distributed applications in a heterogeneous computing environment. It is a standard set of tools, such as DCE RPC and DCE Threads, and services, such as the DCE Directory Service, DCE File Service, Security Service and Distributed Time Service. Each of the tools and services is integrated with the aim of providing interoperability and portability across a heterogeneous collection of platforms.

AppLeS

The Application Level Scheduler (AppLeS) [24] project is an application-level scheduling system developed at the University of California, San Diego from 1996. AppLeS is not a resource management system; it is what is commonly called middleware. It interacts with metacomputing systems such as Globus [69] and Legion [95], or cluster management applications such as PVM [79] and MPI [63]. AppLeS is an agent-based methodology for application-level scheduling. AppLeS agents are based on the application-level scheduling paradigm, where everything about the system is evaluated in terms of its impact on the application [25]. Each application has its own AppLeS, and each AppLeS combines both static and dynamic information to determine a customised application-specific schedule and implement that schedule on the distributed resources of the metacomputer.

AppLeS relies on application-specific and system-specific information to produce good schedules. In this context, good schedules are judged by the application's performance criteria. Schedules derived from predictions of application and system state are only as accurate as the predictions themselves, and because the system is dynamic, the predictions have a finite lifetime, beyond which they become out of date.

An AppLeS agent has a single active agent, called the coordinator, and four subsystems: the resource selector; the planner; the performance estimator; and the actuator. The resource selector chooses and filters different resource combinations for the application's execution. The planner generates resource-dependent schedules for given resource combinations. The performance estimator generates performance estimations for candidate schedules according to the user's (or application's) performance metric, and the actuator implements the best schedule on the target resource management system [25].

Information, contained within the AppLeS information pool, is supplied to the AppLeS agents in a number of ways. The Network Weather Service [237] provides CPU load predictions and network statistics for the period in which the application will be scheduled. Through the user interface, the user provides specific information such as the structure and characteristics of the application, performance criteria, execution constraints, and any login information that may be needed to use remote machines. Performance estimation and resource selection information is gathered by any default models that have been built up through use of the metacomputing environment. Available information is used to filter infeasible resource sets from the resource pool. Remaining resources are then prioritised according to an application-specific notion of distance between resources, and promising sets of resources are identified by the use of the selector. The planner is then invoked to propose a schedule, which is evaluated by the performance estimator. Once the best schedule, according to the application's performance criteria, has been identified, the actuator implements the schedule.
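The coordinator's control flow can be summarised in a short sketch. The interfaces below are invented for exposition and are not the AppLeS API; they show resources being filtered, candidate schedules being proposed and evaluated against the application's performance metric, and the best schedule being handed to the actuator:

```java
import java.util.*;
import java.util.function.*;

// Illustrative skeleton of an application-level scheduling agent in the
// AppLeS style. All interfaces and names here are hypothetical.
public class AgentSketch {
    record Schedule(List<String> resources, double predictedTime) {}

    static Schedule coordinate(List<String> resourcePool,
                               Predicate<String> feasible,                      // resource selector filter
                               Function<List<String>, List<Schedule>> planner,  // proposes candidate schedules
                               ToDoubleFunction<Schedule> estimator,            // application's performance metric
                               Consumer<Schedule> actuator) {                   // hands schedule to the RMS
        // 1. Resource selection: drop infeasible resources.
        List<String> candidates = resourcePool.stream().filter(feasible).toList();
        // 2. Planning: propose schedules over promising resource combinations.
        List<Schedule> proposals = planner.apply(candidates);
        // 3. Performance estimation: pick the schedule minimising the metric.
        Schedule best = proposals.stream()
                .min(Comparator.comparingDouble(estimator))
                .orElseThrow();
        // 4. Actuation: implement the chosen schedule on the target system.
        actuator.accept(best);
        return best;
    }

    public static void main(String[] args) {
        List<String> pool = List.of("hostA", "hostB", "hostC");
        Schedule chosen = coordinate(pool,
                h -> !h.equals("hostC"),                                        // e.g. hostC fails a memory constraint
                cs -> List.of(new Schedule(cs, 42.0), new Schedule(cs.subList(0, 1), 55.0)),
                Schedule::predictedTime,
                s -> System.out.println("actuating " + s));
        System.out.println("best: " + chosen);
    }
}
```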

MARS

The Metacomputer Adaptive Runtime System (MARS) [78] from the University of Paderborn is a metacomputing framework that aims to solve a number of problems concerned with running parallel codes on a metacomputer: how

to determine a good initial task-to-processor mapping; how to cope with changing network performance; how to cope with varying node performance (e.g. concurrent usage); how to migrate tasks on heterogeneous systems; and how to cope with processor/network failures or extensions.

The runtime system uses application-specific data, such as communication phases, processor utilisation and program characteristics, and system-specific information, such as node performance, network throughput and latency, to determine where to run, and whether to migrate, tasks between heterogeneous processors. MARS uses application- and system-specific information to predict applications' future resource utilisations in an effort to improve task migration. Migration decisions are made by a Migration Manager. MARS is currently written in C and uses MPI; a preprocessor inserts MPI calls to facilitate the collection of statistics by the runtime system.

MARS has two fundamental instances: Monitors, which collect statistical data; and Managers, which use the statistical data in order to make initial task placement and task migration decisions. Due to the fact that not all machines are binary-compatible, task migration is restricted to certain places in a computation; these places are identified by the addition of certain MPI calls, which are manually added. User code is linked to the MARS runtime library, which allows the Monitors to intercept send and receive calls and generate application and system statistics. Task dependency graphs are generated by the Application Manager, and are consolidated by a Program Information Manager. Nodes in the dependency graphs represent serial portions of code between a pair of send and receive calls, and are called Independent Blocks (IBs); edges denote the precedence relationship between IBs. Consolidation is necessary as tasks may generate different dependency graphs on subsequent runs. Dependency graphs of multiple task runs are consolidated to produce a graph of the likely communication patterns of a given task, which will aid in the placement of the tasks onto processors.

It is remarked that although most applications benefit from the use of the dependency graph in task placement, it may be difficult to construct for applications with highly irregular communications structures. The statistical variance of the dependency graph is used to modify the decision algorithm. The authors also justify the use of another system call when issuing a communications send or receive by the knowledge that network latencies are an order of magnitude higher in WAN environments than in tightly coupled parallel systems [78].
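The consolidation step can be pictured as an averaging of the communication edges observed over several runs. The sketch below is illustrative only (MARS itself operates on MPI traces of C programs, and the names here are invented); it merges the dependency edges recorded on each run into weights that estimate how likely each communication pattern is on a future run:

```java
import java.util.*;

// Illustrative consolidation of task dependency graphs recorded over several
// runs, in the spirit of MARS's Program Information Manager. Names invented.
public class GraphConsolidationSketch {
    record Edge(String fromBlock, String toBlock) {}   // precedence between independent blocks (IBs)

    // Merge the edge sets observed on each run into a weight in [0,1] that
    // estimates how likely the communication pattern is on a future run.
    static Map<Edge, Double> consolidate(List<Set<Edge>> runs) {
        Map<Edge, Integer> counts = new HashMap<>();
        for (Set<Edge> run : runs)
            for (Edge e : run)
                counts.merge(e, 1, Integer::sum);
        Map<Edge, Double> weights = new HashMap<>();
        counts.forEach((e, c) -> weights.put(e, c / (double) runs.size()));
        return weights;
    }

    public static void main(String[] args) {
        Set<Edge> run1 = Set.of(new Edge("IB1", "IB2"), new Edge("IB2", "IB3"));
        Set<Edge> run2 = Set.of(new Edge("IB1", "IB2"), new Edge("IB1", "IB3"));
        // Edge IB1->IB2 appears on every run (weight 1.0); the others on half the runs.
        System.out.println(consolidate(List.of(run1, run2)));
    }
}
```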

Nimrod

The Nimrod [1,2] project addresses the problem of performing a large number of parameterised simulations on a set of distributed computers, each simulation having a different parameter set. It does not address the problem of parallelising an individual program, or allow a series of programs with interdependencies to be executed in parallel, nor does it address the issue of fault tolerance. The original implementation of Nimrod required all participating resources to run DCE [193].

Resource management in Nimrod is handled by whatever queueing system is present on the computer that is chosen to run the jobs. Nimrod submits the jobs that it generates to the target computer's queueing system, where each is executed in turn, and the results are returned to Nimrod. The decision to let Nimrod submit jobs to the target's queue management system, rather than actively support the distribution of jobs, was an effort to reduce the complexity of the application program. It is pointed out in [146] that the skill-set of programming in truly distributed environments is different to that found in ordinary application programming.

The main contributions that Nimrod has made to the field of metacomputing are those of job transfer and the user-centric view of the metacomputing system. Nimrod takes care of the job transfer on behalf of the user by sending the appropriate input files to the target processor via a remote file transfer server. It was also one of the first systems to provide a user-centric view of the computational environment. Recently, a new version of Nimrod, Nimrod-G, is being ported to operate under the Globus metacomputing environment [3]. It is intended that the new version will utilise the functionality of Globus that allows the user to specify time and cost constraints on experiments.

2.2.2 Globus

Globus [69, 70] is a metacomputing environment under development at Argonne National Laboratories and the California Institute of Technology. It consists of a metacomputing infrastructure toolkit, which provides basic capabilities and interfaces for communications, resource location, scheduling and data access [69]. Each toolkit component has a well-defined interface which, in combination, define a metacomputing abstract machine, upon which higher level services can be built. The Globus project does not aim to re-invent existing technologies such as PVM [79], MPI [98], Condor [153] or Legion [95], but provides basic infrastructure through the development of low-level mechanisms that can be used by higher-level services. The project also aims to provide techniques that allow such services to observe and guide the mechanisms. Some of the lower-level mechanisms are: resource location and allocation, communications, a unified resource information service, an authentication interface, process creation, and data access. The unified resource information service contains information about the status of the system, for example: static characteristics of a processing node, instantaneous performance information and application-specific information. This information is gathered from different sources, and can be accessed by a single mechanism within Globus.

Globus modules can be influenced in the decisions that they make by higher-level services. This is accomplished by the use of rule-based selection, resource property inquiry and notification mechanisms. Rule-based selection is used to identify resource strategies with which the low-level module can perform a given task. The resource property inquiry module requests information from the Globus unified information service, which contains the current state of the environment. The notification module allows a call-back mechanism between the higher-level service and the low-level mechanism such that, in the event of an exception, the mechanism can notify the service of the event.

Resource management in Globus is achieved through the interaction of the Globus toolkit with any schedulers that may run on the local system. An extensible resource specification language (RSL) [48] is used to communicate requests for resources between components. The RSL is a simple language that allows the system to request resources containing characteristics embedded in the language. The RSL describes, principally, the physical machines on which to run programs. Local and global services that the programs use are considered to be fine-grained enough that they can be run on any of the machines in the computing environment.

The Globus resource management system revolves around the concept of resource brokers, software that acts as an interface and translator between higher-level specifications of requests and more concrete representations of the request (in RSL). These brokers are application specific, having the ability to understand high-level requests from user clients. They progressively refine the client's request until it becomes expressible as a request for specific resources. After translating the specific requests into RSL, they are dispatched to the Globus Resource Allocation Manager (GRAM). The GRAM [48] provides the local component for resource management.

Each GRAM is responsible for a set of resources operating under the same site-specific allocation policy, often implemented by a local resource management system such as LSF [155] or Condor [153]. GRAM provides a standard network-enabled interface to local resource management systems. While individual sites are not constrained in their choice of resource management tools, the computational grid tools [71] and applications can express resource allocation and process management requests in terms of a standard API. Resource and computation management services are implemented in a hierarchical fashion. An individual GRAM supports the creation and management of a set of processes on a set of local resources. A computation created by a global service may then consist of one or more jobs, each created by a request to a GRAM and managed via management functions implemented by that GRAM. To implement a global directory, Globus uses a Metacomputing Directory Service (MDS) [67], which is based on the Lightweight Directory Access Protocol (LDAP) [230].
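The broker chain can be pictured as the successive refinement of an abstract request until it names concrete resources, at which point it is handed to a GRAM. The sketch below is purely illustrative: the interfaces and the RSL-like string it builds are invented for exposition and do not reproduce the Globus API or exact RSL syntax.

```java
import java.util.*;
import java.util.function.*;

// Illustrative refinement of an abstract request into a concrete,
// RSL-like resource request handed to a per-site resource manager (GRAM).
// All names and the request syntax here are hypothetical.
public class BrokerSketch {
    interface Gram { void submit(String rslLikeRequest); }   // stands in for a site's GRAM

    // Each broker refines the request; the last step must yield concrete resources.
    static void dispatch(Map<String, String> request,
                         List<UnaryOperator<Map<String, String>>> brokers,
                         Gram gram) {
        for (UnaryOperator<Map<String, String>> broker : brokers)
            request = broker.apply(request);
        // Translate the refined request into a flat, RSL-like string.
        StringBuilder rsl = new StringBuilder("&");
        request.forEach((k, v) -> rsl.append("(").append(k).append("=").append(v).append(")"));
        gram.submit(rsl.toString());
    }

    public static void main(String[] args) {
        Map<String, String> request = new LinkedHashMap<>(Map.of("application", "render", "deadline", "1h"));
        UnaryOperator<Map<String, String>> appBroker = r -> {
            Map<String, String> out = new LinkedHashMap<>(r);
            out.put("count", "16");                  // the application-specific broker knows how many nodes are needed
            out.put("executable", "render-bin");
            return out;
        };
        dispatch(request, List.of(appBroker), r -> System.out.println("GRAM receives: " + r));
    }
}
```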

2.2.3 Harness

Harness [158] is a metacomputing framework designed for dynamic reconfigurability. A collaborative project between Emory University, UTK and ORNL, it is based on IceT [89] from Emory University.

Dynamic reconfigurability within Harness is achieved through the use of a "plug-in" mechanism by which services and computational resources can be added to and removed from the system [159]. The model describing the Harness distributed virtual machine consists of four layers: the abstract distributed virtual machine (DVM); heterogeneous computational resources; services; and applications and users. Applications and users are able to utilise a consistent baseline, to which the DVM conforms.

Services consist of an interface specification; instances of the services, plug-ins, are plugged into the DVM to provide processing capabilities. Services have attributes such as their shareability, exportability and threadability. Shareability refers to a service's ability to be used simultaneously by multiple applications; exportability determines whether different DVMs are able to access the service; and threadability refers to the ability to execute multiple instances of the service simultaneously in a single DVM. Services are divided into three classes: kernel level services; basic services; and specialised services. Kernel level services are crucial to the operation of the DVM; they must be portable across the DVM. Basic services are those very

common services, which may or may not be exportable. Specialised services include those that have non-standard platform requirements (e.g. very large memory requirements or massively-parallel vector processors) or performance requirements which restrict their execution on every machine in the DVM. As Harness is based on IceT, the remainder of this section describes IceT.

The IceT model of resources consists of multiple clusters of virtual environments belonging to multiple users, each with distinct levels of security and accessibility. Written in Java, native IceT processes and data are transportable. They can be uploaded to remote locations, and in the case of processes, can be executed without regard for the remote architecture or file system structure.

Fundamental to the IceT process model is the concept of process spoking, which refers to the ability of a process and its data to be uploaded to a remote computational resource for execution, as opposed to the common model of remote requests and responses. The main difference is that when a process is uploaded to a remote resource, it can exert independence from the requesting resource (unlike the CORBA and RMI models), and can manage the uploading of any other processes that are necessary for the computation or maintain persistent communications with other processes.

IceT is implemented in Java. Java servers allow byte-codes to be executed on remote nodes where the requesting user does not have normal access privileges. This is achieved through Java's portability and security measures, and the uniform, system-independent view of the underlying architecture. Java's just-in-time features are relied upon extensively to achieve better execution performance than pre-compiled Java byte-code running in a standard Java Virtual Machine. Virtual environments are created by the user adding hosts into the configuration held by the local daemon. Remote virtual environments can be added, which causes any hosts contained within the remote virtual environment to become accessible through the local daemon.

The architecture and operation of Harness and IceT are similar to PVM. In order to execute pre-compiled IceT Java byte-code, as in PVM, the user must either run the program from the command-line or spawn an IceT task from a local daemon's console. Providing the node's security manager allows the byte-code to be executed, it is sent to the target node, and is parsed for dependencies on other byte-codes (Java packages) or shared system-dependent libraries. In the case of dependence on other byte-codes, they are uploaded to the target host, and in the case of shared libraries, if

there is a version of the library that is available for the architecture, it is uploaded [90]. No indication is given in the literature of the behaviour when a shared library of the appropriate architecture is not available. No mention is made of the method by which scheduling in either Harness or IceT is achieved.
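A minimal sketch of the plug-in idea is given below. The interfaces and attribute names are invented rather than taken from Harness; the sketch simply shows a service declaring its shareability, exportability and threadability and being plugged into, and looked up from, a DVM:

```java
import java.util.*;

// Illustrative plug-in registration for a distributed virtual machine (DVM)
// in the Harness style. Interface and attribute names are hypothetical.
public class DvmSketch {
    interface Service {
        String name();
        boolean shareable();    // usable by several applications at once
        boolean exportable();   // visible to other DVMs
        boolean threadable();   // several instances may run in one DVM
        void invoke(Object request);
    }

    static class Dvm {
        private final Map<String, Service> plugged = new HashMap<>();

        // Plug a service instance into the DVM; kernel-level services would
        // additionally be required to be portable across every node.
        void plugIn(Service s) { plugged.put(s.name(), s); }
        void unplug(String name) { plugged.remove(name); }
        Optional<Service> lookup(String name) { return Optional.ofNullable(plugged.get(name)); }
    }

    public static void main(String[] args) {
        Dvm dvm = new Dvm();
        dvm.plugIn(new Service() {
            public String name() { return "matrix-solve"; }
            public boolean shareable() { return true; }
            public boolean exportable() { return false; }   // a specialised, local-only service
            public boolean threadable() { return true; }
            public void invoke(Object request) { System.out.println("solving " + request); }
        });
        dvm.lookup("matrix-solve").ifPresent(s -> s.invoke("A*x=b"));
    }
}
```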

2.2.4 Infospheres

Infospheres [36-39] is a project from Caltech that addresses dynamic scalable distributed systems. The research is concerned with developing infrastructure to support structuring distributed applications by composing components using sequential, choice and parallel composition. It is implemented in Java [88] and TCP/IP, for use over the World-Wide Web (WWW). The project is one of the first to approach the metacomputing problem from the point of view of composing existing, well-characterised components together to perform a task. Infospheres is not intended to support seamless parallelism, high-performance computation or fault tolerant transactions.

Compositional units are abstractions of processes and sessions. Processes can be composed in parallel. Sessions are collections of processes composed in parallel. Sessions may be composed using sequential and choice composition. The infrastructure supports distributed applications, which can be structured by nesting processes and sessions. The object model is state-based, where every object has a persistent state for the lifetime of its corresponding entity. The execution of the system is regarded as a set of states, where a state is an assignment of values to a given set of variables.

Inter-process communication is achieved by the use of asynchronous RPC [27]. Incoming and outgoing message queues are local to each process, and may be created and destroyed at will. Queues may have different priorities. Output queues may be bound to an arbitrary number of input queues and messages are sent in a FIFO order. When multiple output queues are sent to an input queue, the result is a fair merge of the sequence of messages. Input queues are typed in the messages they are allowed to accept. Parallel computation is implemented by an appropriate binding between input and output queues. Processes may be mobile over the network, but only between sessions; they must be immobile within sessions. Sessions are instances of applications, implemented as networks of processes. Processes and sessions have specifications, which are precise definitions of their behaviours. Infospheres

supports three specifications: process specifications, interface specifications and session specifications.

Finding specific processes is done using standard web technology: processes are found by looking up the appropriate home page. Finding an appropriate process type is harder: the type specification must be clear enough to identify and compose processes; and an arbitration scheme must exist if the interfaces of two types do not match. Distributed objects are compared by checking to see if their interfaces match.

The target applications of the Infospheres project are collaborative applications, such as distributed design frameworks and distributed calendars. One of the features of the Infospheres system is the provision for long-lived collaborations by the incorporation of the idea of persistent components that may, when not actively undergoing computation, have their state serialised and stored on a persistent medium, only to be de-serialised when needed again.

Resources in the Infospheres environment are strictly computational [191], and clients are expected to reserve resources in order to be able to use them. To facilitate this, clients' systems can send Java agents to remote machines in order to request that a resource be reserved. Agents are directed to resource managers, which act as brokers for the resources that they control.
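The queue model can be illustrated compactly. The sketch below is not the Infospheres library; it is an invented illustration of an output queue bound to several typed input queues, where each input queue accepts only messages of its declared type and delivers them in FIFO order:

```java
import java.util.*;

// Illustrative message queues in the Infospheres style: an output queue is
// bound to any number of typed input queues. All names are hypothetical.
public class QueueSketch {
    static class InQueue<T> {
        private final Class<T> accepted;                 // the message type this queue is willing to accept
        private final Deque<T> fifo = new ArrayDeque<>();
        InQueue(Class<T> accepted) { this.accepted = accepted; }
        boolean offer(Object msg) {                      // rejects messages of the wrong type
            if (!accepted.isInstance(msg)) return false;
            fifo.addLast(accepted.cast(msg));
            return true;
        }
        T poll() { return fifo.pollFirst(); }            // FIFO delivery within the queue
    }

    static class OutQueue {
        private final List<InQueue<?>> bindings = new ArrayList<>();
        void bind(InQueue<?> in) { bindings.add(in); }
        void send(Object msg) { bindings.forEach(q -> q.offer(msg)); }   // asynchronous, no reply expected
    }

    public static void main(String[] args) {
        InQueue<String> text = new InQueue<>(String.class);
        InQueue<Integer> numbers = new InQueue<>(Integer.class);
        OutQueue out = new OutQueue();
        out.bind(text);
        out.bind(numbers);
        out.send("hello");   // accepted by the String queue only
        out.send(42);        // accepted by the Integer queue only
        System.out.println(text.poll() + " / " + numbers.poll());
    }
}
```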

2.2.5 Legion

Legion [95] is a research project aimed at providing a highly usable, efficient and scalable system, based on solid object-oriented principles. The output of the project is a single, coherent virtual machine that addresses the issues of scalability, programming ease, fault tolerance, security and site autonomy, and has an extensible core [94]. Legion is similar to a number of other projects, namely Nexus [73], Castle [46], NOW [12] and Globe [118].

Legion achieves multiple language interfaces and interoperability through object wrappers for legacy codes, similar to those used by CORBA [177] and DCE [197]. As well as providing wrappers for legacy objects, the Legion designers achieved widespread use through allowing the system to be the target of parallel compilers. High performance is achieved via two methods: resource selection, and parallel computation. Resource selection is performed using resource availability and affinity, and is an extension of Condor [153], DQS [91] and Loadleveler [123]. Written in a parallel variant of C++ called Mentat [92,93], Legion is an attempt to create a single nationwide metacomputer using loosely coupled workstations. Legion

was designed to transparently schedule application components on processors, manage data transfer and coercion, and provide communication and synchronisation in such a manner as to minimise execution time via parallel execution of the application components. Legion is based on Mentat [92], an object-oriented parallel processing system. It is intended that Legion be used almost completely for parallel applications. The concept of resource transparency is used, where the user (and application program) do not depend on a certain number or type of processor. Latency-tolerant, relatively large-grained parallelism is targeted. Object wrappers are provided for Legion components, and Legion supports parallel method invocation. The Legion runtime system is exposed as an 'open system', and has an associated message-passing API. Memory within Legion is treated as a single, persistent object space [96]. Legion does not make resource allocation decisions, but provides the basic mechanisms needed to make informed mapping decisions between resource objects and to carry out these mapping decisions. If an object needs to contact another object, and the target object does not already exist, it is created, as is described in [137].

Fault tolerance is addressed with the knowledge that in a truly distributed heterogeneous system, hardware and software failures will be routine, and that each physical resource and application will have its own concept of what is necessary. Knowing that writing programs to achieve fault tolerance is a difficult and error-prone task, Legion does not mandate policies; instead, applications can select the level of fault tolerance they require. Legion facilitates this by encapsulating fault tolerance protocols in base classes, which may be extended by users. Legion implements different levels of fault tolerance, depending on the penalty that the application-writer is willing to pay.

Scheduling data parallel components in Legion is static, and is broken into three distinct phases: processor selection, load selection and placement. First, candidate processors are identified. Secondly, the number and type of processors to be used are decided and the data domain is decomposed. Lastly, tasks are mapped to chosen processors so that communication time is reduced.

Objects create mappers and provide them with a description of the particular placement problem. The mapper then marshals the objects that will be involved and asks the system for a snapshot of the current system state (i.e. available resources). It then makes the appropriate decisions, based on a heuristic search of the solution space, and passes the decision on to a module that tries to implement the decision,

the implementor. Mappers perform the function of taking a high-level specification of a placement problem and converting it into a list of possible placement decisions. Although Legion puts the onus of supplying a default mapper on to the application writer, if other mappers are available, the user can choose between them in order to personalise the selection process. Legion also has the concept of a jurisdiction magistrate, which contains and enforces the local policies. It is this object which handles all placement failures, security breaches and querying of placement decisions.

Programs are represented in Mentat by graphs (or DAGs), and parallel execution is based on a macro data flow model, where edges in the graph denote dependence relations between nodes, which represent operations (or actors). In Mentat, future lists are sent out with actors, so that actors are aware of the nodes on which the output data depends. The data dependency is very fine-grained, operating at the instruction level. Parallelism is encapsulated between objects by allowing subsequent actors to use the results of previous actors, which may have not yet been computed. Coherency of variables that are to be shared is maintained by enforcing single-assignment of future variables. Thus, by the use of futures, parallelism opportunities are exploited without the danger of variables being modified between uses. The run-time system detects data dependencies and organises task scheduling.
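The single-assignment future idea can be shown in a few lines. The sketch below uses Java's standard CompletableFuture as a stand-in; it is not Mentat or Legion code, but it shows a consumer actor being wired to a producer's result before that result has been computed, with single assignment guaranteeing the shared value is never modified between uses:

```java
import java.util.concurrent.*;

// Illustrative macro data-flow with futures: a consumer is wired to a
// producer's not-yet-computed result. CompletableFuture stands in for
// Mentat's future variables; the producer/consumer roles are invented.
public class FutureSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);

        // Producer actor: its output is a single-assignment future.
        CompletableFuture<int[]> produced = CompletableFuture.supplyAsync(() -> {
            int[] data = new int[1000];
            for (int i = 0; i < data.length; i++) data[i] = i;
            return data;                                   // the future is assigned exactly once
        }, pool);

        // Consumer actor: scheduled now, runs when its input dependency resolves.
        CompletableFuture<Long> consumed = produced.thenApplyAsync(data -> {
            long sum = 0;
            for (int v : data) sum += v;
            return sum;
        }, pool);

        System.out.println("sum = " + consumed.get());     // 499500
        pool.shutdown();
    }
}
```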

2.2.6 DOCT

The Distributed Object Computation Testbed (DOCT) [200] is a collaborative project between the San Diego Supercomputer Center (SDSC), Caltech, NCSA, Old Dominion University, Open Text Corporation, Science Applications International Corporation, the University of California at San Diego and the University of Virginia. DOCT is a large metacomputing framework being constructed using Legion and AppLeS. Designed to manage a tera-byte sized persistent document handling system [162], the software included in DOCT includes HPSS [122], MDAS (from SDSC), Legion, AppLeS, the Network Weather Service and IBM's DB2 parallel object-relational database. The system features intelligent agents which will be able to perform many diverse actions. Intelligent agents use the Legion framework as a basis for distributed objects, and Nexus is being considered for distributed communications. Resource usage is coordinated by the AppLeS scheduling system.

2.2.7 WebFlow

WebFlow from Syracuse University is a metacomputing project which aims to produce a general-purpose Web-based visual interactive programming environment for coarse-grain distributed computing [26]. The project aims to provide coarser-grained units for Java distributed computing than the Java class. WebFlow is a programming paradigm implemented over WebVM. WebFlow is part of a larger project, WebSpace [174], to create a Web-based collaborative environment. WebFlow is implemented over a mesh of WebVM servers. WebVM servers are implemented using Jeeves [219], which is a Java Web server. WebFlow allows modules (or servlets, atomic encapsulations of WebVM computation) to be run on demand, supports communication between modules, and allows users to create and destroy applications (sets of interconnected modules). WebFlow is very similar to Infospheres in its support for code modules, communication between modules, and sessions. In both systems modules communicate by the use of ports. Sessions define the application that a user creates using modules. There is very little information available on the development and internal workings of the WebFlow system.

2.3 DISCWorld

Distributed Information Systems Control World (DISCWorld) [113] is a prototype metacomputing model and system being developed at the University of Adelaide. The project was started in 1996. The basic unit of execution in the DISCWorld is a service. Services are pre-written pieces of Java software, or legacy code that has been provided with a Java wrapper. Users can compose a number of services together to form a complex processing request.

The DISCWorld architecture consists of a number of peer-based computer hosts that participate in DISCWorld by running either a DISCWorld server daemon program or a DISCWorld-compliant client program. Client programs can be constructed using Java wrappers to existing legacy programs, or can take the form of a special DISCWorld client environment which runs as a Java applet inside a WWW browser. This client environment can itself contain Java applet programs (Java Beans) [58] which can act as client programs communicating with the network of servers. The peer-based nature of DISCWorld clients and servers means that these servers can be clients of one another for carrying out particular jobs, and can

broker or trade services amongst one another. Jobs can be scheduled [128] across the participating nodes. A DISCWorld server instance is potentially capable of providing any of the portable services any other DISCWorld server provides, but will typically be customised by the local administrator to specialise in a chosen subset, suited to local resources and needs. The nature of the DISCWorld architecture means that we use the terms client and server somewhat loosely, since these terms best refer to a temporary relationship between two running programs rather than a fixed relationship between two host platforms.

DISCWorld is targeted at wide-area systems and applications where it will typically be worthwhile or necessary to run them over wide areas. These will be applications that require access to large specialist datasets stored by custodians at different geographically separated sites. An example application might be a land planning one [43], where a client application requires access to land titles information at one site, digital terrain map data at another, and aerial photography or satellite imagery stored at another site. A human decision-maker may be seated at a low performance compute platform running a WWW browser environment, but can pose queries and data processing transactions of a network of connected DISCWorld servers to extract decisions from these potentially very large datasets without having to download them to their own site. This is shown in figure 6.

Our vision for DISCWorld is an integrated query-based environment where users may connect to a "cloud" of high performance computing (HPC) resources (where the heterogeneity and exact composition of the cloud is hidden from the user) and request data retrieval and processing operations. Users themselves may only be connected into the cloud by a low bandwidth network link such as that provided by a modem line. This action-at-a-distance query-based approach appears an appropriate one for decision support applications, where the user is provided with a collection of application components that run in situ in a WWW browser and help him control remote HPC resources. Much of our research to date has considered the multi-threaded software daemon (DWd) that runs on each participating DISCWorld service providing host.

DISCWorld is aimed at providing high-performance services, with which non-specialist scientists can compose challenging, big and complex problems for solving over a distributed collection of possibly heterogeneous computers. One of the design goals of the DISCWorld metacomputing infrastructure is that any parallelism that can be extracted from the problem that the user has posed will be seamless to the user.

[Figure 6 diagram: client applications running DISCWorld clients (DWc) connect to a cloud of cooperating HPC resources, each of which runs the DISCWorld daemon (DWd).]

Figure 6: Conceptual view of the DISCWorld architecture. A number of cooperating server hosts communicate and share work, brokered by the DISCWorld daemon. Each server is capable of acting as a gateway for client programs running on server hosts or for specialised graphical interface clients.

[Figure 7 diagram: the DISCWorld daemon (DWd) runs on each host participating in a DISCWorld and comprises the Portal Manager, QuarterMaster, Maitre'D, Chef, Waiter, Cook and Store Manager modules, executing in multiple threads; communications with clients, other DWds and consoles are kept separate for security, and the Store Manager contains data, products, codes and configuration information.]

Figure 7: Detailed component breakdown of the DISCWorld daemon. Code modules are represented by boxes; modules executing in separate threads are shown.

This approach differs somewhat from Infospheres, the only other research project that has a service- or component-based approach, in that Infospheres is designed to facilitate collaboration between users, and to allow programmers to design and build distributed system components. When user queries are submitted to the DISCWorld system, they are decomposed into services; services are the items which are scheduled in the DISCWorld. Data and services may be moved to the host at which the least cost is found; a sketch of this style of composition is given below. This thesis uses DISCWorld as an implementation framework.
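As an illustration of the service-composition idea (the interfaces below are invented for exposition and are not the DISCWorld API), a processing request can be expressed as a chain of opaque, pre-written services whose outputs feed later services; each service in the chain is a separately schedulable unit:

```java
import java.util.*;
import java.util.function.*;

// Illustrative composition of pre-written services into a processing request,
// in the spirit of DISCWorld. Service names and interfaces are hypothetical.
public class CompositionSketch {
    // A service is opaque to the user: only its interface (input/output) is visible.
    record Service<I, O>(String name, Function<I, O> body) {
        O run(I input) { return body.apply(input); }
    }

    public static void main(String[] args) {
        // Pre-written services supplied by developers, not by end users.
        Service<String, byte[]> retrieve = new Service<>("RetrieveImagery", key -> new byte[1024]);
        Service<byte[], byte[]> crop = new Service<>("CropRegion", img -> Arrays.copyOf(img, 256));
        Service<byte[], String> classify = new Service<>("ClassifyLandUse", img -> "cropland");

        // A user query is decomposed into a pipeline of services; each service
        // in the chain could be placed on a different host, near the data it needs.
        String result = classify.run(crop.run(retrieve.run("GMS5/1998-07-01")));
        System.out.println(result);
    }
}
```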

2.4 Discussion and Comparison of Metacomputing Systems

DISCWorld has a constrained objective, with less ambitious goals than the Globus and Legion systems. The notable difference between these systems and DISCWorld is that most users of our system will not be programmers per se, but will merely be interested in the output of a service request. DISCWorld is a high-level, object-oriented system that uses Java as an implementation vehicle. As DISCWorld does not directly address fine-grained parallelism, resources do not need to be co-scheduled as in Globus, but services are expected to run in a reasonable amount of time. In addition to scheduling frameworks, some metacomputing approaches favour reserving the resources to be used [72,191].

[Figure 8 diagram: systems arranged by how user processes integrate with them, from pre-compiled binaries, through binaries linked with a runtime library, to object-oriented integration. Metacomputing systems shown include Java Market, DISCWorld, Globus, Harness, Legion, SNIPE and Infospheres; quasi-metacomputing systems include CCS and MARS; resource managers include Condor, PRM, DQS, PBS, NetSolve, LSF, Ninf and Loadleveler/EASY-LL.]

Figure 8: Metacomputing systems, middleware tools and resource manager integration with user processes.

The DISCWorld project addresses the issue of constrained metacomputing, by making the decision that only restricted datasets and pre-defined operations on data will be allowed. As such, all functions that can be applied to data will be written by developers; end users will not have to concern themselves with the implementation details. This means that the services that a user can request are constrained to those which have been defined and written for the DISCWorld system. This is in contrast to unconstrained systems, such as Globus, where users can submit arbitrary binaries for execution.

We believe that being constrained is an important characteristic of a metacomputing system for a number of fundamental reasons. Allowing users to submit arbitrary binaries is a security hazard, both because of the possibility of malicious attacks and because of the lack of control over what information is being collected and disseminated by user programs. We are not able to mandate that all users must submit source- or byte-code that can be analysed for performance characteristics or malicious code. Therefore, if users can submit their own binaries, the task of characterising programs which may only be run once or twice, in order to make intelligent scheduling decisions, is very difficult. Together with the benefit of being able to characterise services more easily and effectively, pre-written services, with knowledge of the DISCWorld, can be written to take advantage of DISCWorld features, and any available parallelism inherent in the high-level processing request can be exploited.

Services share similarities with Infospheres components, especially in the way in which services are characterised. As mentioned in [37], the services are created using the methodology of component technology; services themselves are opaque, presenting the user with an interface specification but giving no clues as to the service's implementation; they can have dynamic interfaces and undergo dynamic composition; and services can be selected from a world-wide pool. Services must either be written in a platform-independent way which supports reflection, or, at the very least, a wrapper interface to the code must be supplied. Thus, it is up to the service writer to ensure that the services they make available to the system can be queried in a standard, platform-independent way, even if the code that is actually executed is architecture-specific. Most of the metacomputing systems discussed in this chapter, with the notable exception of Globus, are tightly integrated with the code that runs as a result of a user query.

Tables 1 and 2 show the characteristics of the systems described in the previous section. This is represented pictorially in figure 8. Systems that define a resource as either an object or an applet have user processes that are tightly integrated with the system. This typically allows more information to be collected by the system in order to make sensible object-placement or scheduling decisions. In none of the systems is the user able to directly control the placement of their processes; in virtually all cases, the system can make a better decision than the user. Harness is similar in design philosophy to DISCWorld. Both projects present a constrained metacomputing environment.
The main difference is that DISCWorld services are written by developers, and users simply use the services, while in Harness, the user is responsible for supplying their own programs. Harness and DISCWorld both recognise the need to be able to move service code and data. Harness does not perform any organised scheduling of service execution; the environment uses the nominated hosts in a manner very similar to that found in PVM. Objects within the Legion system have the concept of sovereignty, which can restrict copying and movement. This is similar to the DISCWorld concept in which services and code may be restricted in movement.

2.5 Conclusion

A cluster has been defined as a group of machines that may be viewed as a single entity for the purposes of control and job assignment. In this chapter the topic of cluster control has been addressed from two viewpoints: that of resource management, and that of metacomputing. There are fundamental differences between the objectives of resource management software and metacomputing environments.

Resource management software is designed to provide all users with fair access to the machines that comprise the cluster, while at the same time ensuring that the machines are being efficiently utilised. Metacomputing environments, on the other hand, are designed to allow access to remote resources, where the term resource can be defined as anything from remote machines, data and processing services to specialised hardware such as visualisation equipment.

We have presented summaries of a number of metacomputing projects, and have compared them with our prototype metacomputing environment, DISCWorld. The conclusion of this chapter is that although solutions to the resource management and metacomputing problems do exist, none of the solutions is directly applicable to the case in which a user does not have a great deal of knowledge about the system they are using.

System      | Integration with system | Parallel job support | Available system information | User controls placement | Handling of heterogeneity | Definition of a resource
DISCWorld   | Tight    | Y(a) | lots(b)   | N    | Java          | objects(c)
Globus      | Loose    | Y(d) | lots(e)   | N(f) | RSL(g), MDS   | machines
Harness     | Med      | Y(h) | little(i) | N    | Java          | machines
Infospheres | Tight    | Y(j) | ?(k)      | N    | Java          | objects(l)
Java Market | Tight    | Y(m) | lots(n)   | N    | generators(s) | machines, applets(o)
Legion      | V. Tight | Y(p) | lots(q)   | N(r) | objects       | objects

(a) services can be implemented in parallel, unknown to the rest of the system
(b) system information is low, but job information is high
(c) services, results, nodes, networks
(d) co-scheduled tasks and parallel codes
(e) using MDS and architecture dependent software
(f) each application has a GRAM or broker
(g) RSL has architecture dependent information
(h) fine-grained PVM-style tasks
(i) PVM's load information
(j) serial, parallel object composition
(k) agent-based requests for number and types of resources
(l) strictly computational objects
(m) individual applets independently run in parallel
(n) instructions per second, cost
(o) machines, applets
(p) object-based parallelism
(q) OS-level system information
(r) placement done by system mappers
(s) equivalent to Makefiles

Table 1: Summary of metacomputing system characteristics

The system that currently provides the most comprehensive metacomputing environment is Globus. However, this system requires users to supply binary programs to achieve their goals, and is aimed toward the execution of parallel programs. We feel that in order to make metacomputing a more accessible technology to those users that wish to pose high-level queries, the details of the system, especially how the programs they use work, must be hidden from them.

In systems targeted towards fine-grained parallelism, scheduling or process placement is a matter of co-scheduling available resources. The DISCWorld model features high-level services in a producer-consumer relationship, where the results of a service are available to be used as inputs to another. We do not address fine-grained parallelism directly; services can be implemented as parallel code but the service

System    | Integration with system | Parallel job support | Available system information | User controls placement | Handling of heterogeneity | Definition of a resource
AppLeS(a) | Med   | Y    | lots(b)   | N(c) | N/A    | machine
MARS(d)   | Med   | Y    | lots(e)   | N    | MPI    | machine
Nimrod(g) | Loose | Y(h) | little(f) | Y(i) | binary | machine

(a) this is really a scheduler, not a resource manager
(b) CPU load, network statistics and program characteristics
(c) each application has its own AppLeS
(d) this is really a scheduler, not a resource manager
(e) application-specific data gained through a preprocessor, and historical information
(f) historical and predictive information
(g) not strictly a metacomputing environment; the tool is designed to operate at too high a level, over a metacomputing environment
(h) independent tasks submitted to a queue manager
(i) user chooses the machine and queue to which jobs are sent

Table 2: Summary of middleware tool characteristics

interface presented to the DISCWorld is identical to the same service implemented as serial code. The task of scheduling DISCWorld services and data access is the focus of this thesis.

In the next chapter, we review some of the research concerning the placement of processes (jobs) onto nodes in a distributed system. In chapter 4 we present a simulation of placing and scheduling independent jobs across distributed heterogeneous processors. Chapter 5 introduces a general model that is used for scheduling in the prototype DISCWorld metacomputing system, and chapter 6 discusses implementation issues.

Chapter 3

A Review of Scheduling

Scheduling is a problem that is grounded in many different levels of computer science and computer hardware engineering. Various scheduling and sequencing problems have been addressed since the 1950's by researchers in computer science, operations research and discrete mathematics [64].

The scheduling problem in its most general form is known to be NP-complete [147], as is the creation of optimal execution schedules under a number of conditions. This is due to the large number of inter-related factors that directly and indirectly contribute to the execution time of the individual tasks. Consequently, many heuristics have been developed to generate adequate (but sub-optimal) schedules [56]. The problem that we consider is how to distribute (or schedule) processes among processing elements to achieve performance goal(s), such as minimising execution time, minimising communication delays, and/or maximising resource utilisation [34]. In a system consisting of real-time transactions, each of which requires computational, communication and data resources to be processed, scheduling is the problem of allocating resources to satisfy the requirements of those transactions [142].

We recognise four main levels of instruction scheduling: machine-code instruction scheduling; interpreted program code (converted at run-time to machine-code); the scheduling of threads within a program; and the scheduling of programs within a distributed system. These are listed in order of increasing granularity and abstraction from the physical code that is executed by processors. The systems described in chapter 2 operate at the last two levels: the scheduling of threads within a program; and the scheduling of programs within a distributed system. This thesis is concerned primarily with the scheduling of coarse-grained programs across distributed systems.


In this chapter we review some of the research in the area of scheduling, across all levels of granularity. We provide discussion of the applicability of the research to scheduling coarse-grained programs. The problem of scheduling on a local and global scale is considered. Different models of scheduling, including static, dynamic and hybrid approaches, are discussed in section 3.1. The scheduling of independent programs is discussed in section 3.2 and of dependent programs in section 3.3. A discussion of the state information that is available to schedulers is presented in section 3.4, and an historical review of scheduling is presented in section 3.5. The applicability of the existing research to the problem we consider is discussed in section 3.6. The review is summarised in section 3.7.

At the highest level, a distinction is drawn between local and global scheduling. The local scheduling discipline determines how the processes resident on a single CPU are allocated and executed; a global scheduling policy uses information about the system to allocate processes to multiple processors with the view of optimising a system-wide performance objective. Under the classification of global scheduling, the problem domain is further broken into the cases where the tasks to be scheduled are independent or related [7].

When executing a complex job in a distributed system, scheduling occurs in many places and on many levels. Proceeding in a top-down manner, the program is submitted into a queue of waiting jobs. At some point, the scheduler will select the job to run. The program (or current job) is broken into a number of tasks to be executed, each of which is scheduled, possibly on different machines. When each task arrives at the target machine, if it involves parallel computation, then each of those parallel components is then distributed to target processors. Once the code arrives at the processor where it will be executed, it is scheduled by the operating system. The scheduling of code onto a processor is done in conjunction with system processes and quite probably with other users' processes. Shown in figure 9 is the manner in which most of the batch queueing systems discussed in chapter 2 operate.

Choosing the processors on which the tasks that comprise a program will run is a difficult decision. There are many factors, from the viewpoints of both the application and the system as a whole, that affect the decision, including: the number of tasks that comprise the program; the priority of the program; the current load of the nodes; whether all of the nodes can execute each task; and the resource owner's usage policies.


Figure 9: Graphical representation of scheduling. The user's program is submitted to a queue. When a program is selected to run, the scheduler allocates it to a target machine, where the local scheduler controls the local execution. If the target machine has a parallel architecture, the program may be distributed across the local processors.

There are a number of reasons why scheduling programs, or the tasks that comprise the programs, is important to both the users and owners of machines. From the user's point of view, it is important that the programs they wish to run are executed as quickly as possible, using the resources that are best suited to the problem, which are also reasonably accessible to the user. In contrast, the machine owners wish to make the best utilisation of their resources. The term resource does not only refer to the machines that make compute cycles available, but also includes clusters of computers, communications link utilisation and storage services that may be offered, as well as whatever other peripherals each may have attached. The two objectives, fast turnaround time and optimal resource utilisation, are not always complementary. Owners are not usually willing to let a single user utilise all their resources, and users are not usually willing to wait an arbitrarily long time before they are allowed to access particular resources. Scheduling, from both points of view, is the process by which each party achieves a satisfactory quality of service, where the time spent waiting for a number of dedicated nodes on a supercomputer that is connected to a massive storage engine may be traded off against the actual opportunity to use the resources. There is a complex trade-off between the user using resources that an owner may charge little (or nothing) to use, but which may be of lower performance or have fewer desirable attributes (such as storage systems or high-bandwidth interconnections to other resources), and a larger resource which costs more to use but on which the problem will be solved more quickly and will be able to utilise any specialised

peripherals that are attached to the system.

There is no single set of terminology that is used throughout the scheduling literature. For example, the terms program, job and task, and the terms host, processor and node, are often used interchangeably in the literature. When we use the term program, we refer to code that is directly executable by the user of a system, such as a command shell or a compiled application. We use the term job to refer to a collection of programs that are run concurrently or sequentially for a single purpose. An example of a job is a shell script that calls several programs. The input- and output-dependencies between the programs that comprise a job are sometimes termed the job's processing pipeline. We use the term task to refer to the code that runs on a single processor. The task may consist of a number of threads of control. As multi-processor workstations become commodity items, a new distinction is being made between the term processor and the terms host and node. When only uniprocessor machines were available, the terms were equivalent; now one must be careful to use the correct terminology. We use the term node, or processing entity (PE), to refer to a processor which is not normally accessed directly by the user. For example, as the processors that comprise the parallel processing array in a Thinking Machines CM-5 are not directly accessible by the user, they are termed nodes. We use the term host to describe the processors to which users usually have direct access.

3.1 Scheduling Models

There are different approaches to the selection of processors onto which programs will be placed for execution. The models range from static, where each of the programs is assigned once to a processor before execution of the program commences, to dynamic, where a program may be reassigned to different processors before being executed. Finally, we describe a hybrid approach, which combines those taken by the static and dynamic models.

One of the first taxonomies of scheduling in distributed computing was produced by Casavant and Kuhl [34]. They divided the problem domain into static and dynamic scheduling. Static scheduling involves assigning the programs that comprise the job to processors and then not allowing them to move. This minimises turnaround time for the user but is not responsive to changes in the execution environment. Dynamic scheduling, or load balancing, involves the averaging of load over the chosen processors, in an effort to maximise average resource utilisation and reduce turnaround time for the user.

Many theoretical studies consider off-line systems, and search for optimal solutions using the assumption that everything is known a priori. They often also assume nothing changes in the system's environment. Real systems, however, operate in a dynamic on-line environment, and need to contend with unpredictable arrivals of new work [62].

3.1.1 Static Scheduling

In the static model, every task comprising the job is assigned once to a (possibly different) node. Thus, the placement of a program is static, and a firm estimate of the cost of the computation can be made. Heuristic models for static task scheduling are discussed in [200].

One of the major benefits of the static model is that it is easier to program from a scheduling and placement view. The placement of tasks is fixed a priori; it is easy to monitor the progress of the computation and hence termination of processing is simplified. By the same reasoning, estimating the cost of jobs is simplified. Processors can give estimates of the time that will be spent processing programs. On completion of the program, the processor can be instructed to supply the precise time that was spent in processing. This allows the cost to the user to be updated, as well as any internal representations that are used for making performance estimates of new programs.

Using this model, it is also possible to reserve resources. Thus, instead of the (dynamic) scheduler that is creating the placement information simply looking up a table of host names and the programs available on those hosts, the system could send a reservation message to the host, enquiring as to its willingness to run a program.

When scheduling jobs in a static fashion, only the scheduler which creates the job placement and cost estimates needs to know the existence and relative costs of the other hosts in the system. Although other hosts may know about the programs that another host can execute, all they must know is how to send messages to the next (or previous) host in the job's processing pipeline. Using the static model, each host is told from where to get the input parameters that are needed, or where to send the outputs of the programs.

The static model allows for a 'global view' of programs and costs. Heuristics are used to decide whether to incur slightly higher processing costs in order to keep all the programs involved in a job on the same or tightly-coupled nodes, or to search for lower computational costs and be penalised with slightly higher communication costs. Unfortunately, the static model does not allow for the very real possibility that one of the nodes selected to perform a service may have failed, be isolated from the system due to network failure, or at least be so heavily laden with jobs that it is not responding.

3.1.2 Dynamic Scheduling

General-purpose dynamic scheduling operates on two levels: the local scheduling discipline, and a load distribution strategy. The load distribution strategy determines how programs will be placed on remote machines. It uses an information policy to determine which information is to be collected from each machine in the processor pool, at what frequency, and also how often the local information should be exchanged with other machines. Scheduling may be approached from many different aspects, including the viewpoints of the user and the resource owner.

In a strictly dynamic scheduling model, the tasks that comprise a parallel or distributed job are assigned to processors based on whether a processor predicts that it can provide an adequate quality of service to a task. The meaning of quality of service is dependent on the application. The term includes: whether a maximum bound can be placed on the time a job will have to wait before starting execution; the minimum time quanta that a given job will be able to execute without interruption; and the relative speed of a processor when compared to others in the processing pool. If the processor is assigned too many tasks, it may invoke a transfer policy to decide whether to transfer some tasks, and which tasks to transfer. A location policy determines which processor(s) will receive the tasks. There are two important models for location policies: sender-initiated and receiver-initiated. These depend on whether the sender polls potential targets for task transfer, or whether processors that are willing to receive extra tasks advertise the fact, respectively.

The advantage of dynamic load balancing over static scheduling is that the system need not be aware of the run-time behaviour of the application before execution. Dynamic load balancing is particularly useful in a system where the primary performance goal is the maximising of processor utilisation as opposed to the minimisation of runtime for individual jobs [207, 208]. This is often found in systems consisting of networks of workstations.
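As an illustration of the transfer and location policies just described, the sketch below shows a sender-initiated scheme in Java: an overloaded node sheds a task by probing a limited number of candidate nodes and choosing the least-loaded one. The queue-length probe, the threshold and the probe limit are assumptions made purely for this example, and are not taken from any of the systems reviewed in this chapter.

    // Minimal sketch of a sender-initiated transfer/location policy. The queueLength
    // field stands in for whatever load information a real probe would return; the
    // threshold and probe limit are illustrative values only.
    import java.util.*;

    public class SenderInitiatedPolicy {
        static final int TRANSFER_THRESHOLD = 4; // invoke the transfer policy above this load
        static final int PROBE_LIMIT = 3;        // poll at most this many candidate nodes

        // Stand-in for a remote node: only its advertised queue length is modelled.
        static class Node {
            final String name;
            final int queueLength;
            Node(String name, int queueLength) { this.name = name; this.queueLength = queueLength; }
        }

        // Transfer policy: decide whether the local node is overloaded enough to shed work.
        static boolean shouldTransfer(Node local) {
            return local.queueLength > TRANSFER_THRESHOLD;
        }

        // Location policy: probe a limited number of candidates and pick the least loaded,
        // provided it is less loaded than the sender itself.
        static Node chooseTarget(Node local, List<Node> candidates) {
            Collections.shuffle(candidates);
            Node best = null;
            for (Node n : candidates.subList(0, Math.min(PROBE_LIMIT, candidates.size()))) {
                if (best == null || n.queueLength < best.queueLength) best = n;
            }
            return (best != null && best.queueLength < local.queueLength) ? best : null;
        }

        public static void main(String[] args) {
            Node local = new Node("local", 6);
            List<Node> pool = new ArrayList<>(Arrays.asList(
                    new Node("hostA", 2), new Node("hostB", 5), new Node("hostC", 1)));
            if (shouldTransfer(local)) {
                Node target = chooseTarget(local, pool);
                System.out.println(target == null ? "no suitable target" : "transfer one task to " + target.name);
            }
        }
    }

A receiver-initiated variant would simply reverse the roles: lightly-loaded nodes would advertise their spare capacity and request work, rather than being polled by the sender.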

The main consideration involved in mapping threads to PEs is the requirement to balance the loads. It has been shown that load balancing is crucial to achieving adequate performance [55, 200]. Load balancing can be performed in one of two fundamental ways, where the decision of where to place the new thread is made using either a local queue or a global queue. In a local queue, each PE makes its own decision; in a global queue there is a single point from where the threads are dispatched. There are a number of methods that have been proposed for mapping a thread to a processor using local queues (from Feitelson [62], section 4.1.1), for example:

o choose a PE at random [53];

o choose a PE at random, then assign the thread to its least-loaded neighbour (in the case of a non-switched network topology); and,

o probe a limited number of nodes and choose the least-loaded one.

Global queues are easy to implement in shared memory machines; they are not realistic options for distributed memory machines due to the need to maintain coherent lists of available tasks [62]. In most systems that use a global queue, tasks are allocated to a PE, execute for a time quantum, and are returned to the queue for re-allocation. The main advantage of using a global queue is that of load sharing, in which all processors get a roughly equal proportion of the workload. The disadvantages are the contention for the shared global queue, and the lack of memory locality.
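The global queue scheme can be illustrated with the toy model below: a small number of PEs repeatedly take a task from a shared queue, 'execute' it for one time quantum, and return it to the queue if it is unfinished. The task representation and the quantum counting are assumptions made for the example only.

    // Toy model of a global queue: tasks are taken by a PE, run for one quantum, and
    // returned to the queue for re-allocation until no quanta remain.
    import java.util.concurrent.*;

    public class GlobalQueueDemo {
        record Task(int id, int quantaLeft) {}

        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<Task> queue = new LinkedBlockingQueue<>();
            for (int i = 0; i < 4; i++) queue.put(new Task(i, 2));   // each task needs 2 quanta

            Runnable pe = () -> {
                Task t;
                while ((t = queue.poll()) != null) {
                    // "execute" one quantum, then return the task if work remains
                    if (t.quantaLeft() > 1) queue.add(new Task(t.id(), t.quantaLeft() - 1));
                    else System.out.println(Thread.currentThread().getName() + " finished task " + t.id());
                }
            };
            Thread a = new Thread(pe, "PE-0");
            Thread b = new Thread(pe, "PE-1");
            a.start(); b.start();
            a.join(); b.join();
        }
    }

On a shared memory machine the queue itself is cheap to share; the contention for it, and the loss of memory locality as tasks migrate between PEs, are exactly the disadvantages noted above that this sketch cannot capture.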

3.1.3 Hybrid Static-Dynamic Scheduling

In the dynamic model, no placement information is assigned to the job. Consider the case in which a computation can be divided into a number of dependent parts. Initially, a node advertising the job's first service will be passed the job, and once the task is complete, the node will consult its own records to determine the location (and the cost information) of a server that can perform the next task, and will dispatch the job to that host.

This approach forces all hosts to be aware of, if not all, then at least a large part of the system. It does, however, provide for the truly dynamic nature of the system, in which nodes and services may or may not be available, and communication links may be down. This suggests that there may be problems with the dissemination of

server (and service) information to the remainder of the system. Problems may arise if the network becomes segmented.

The server estimates the cost of each part of the job's processing pipeline, and assigns each part to the processors in a manner that minimises the total execution time. Thus, the assignment of computation parts to processors is a static assignment. Problems arise if, after assignment, one of the servers selected to participate in the computation is aware of an alternative server that can perform the part at less expense. If the original schedule is used for execution, the computation will take longer than it could have through using the alternate server. The onus, therefore, is on the current server to find the next, most appropriate server to continue the computation. It may be advisable to configure the system with facilities to 'backtrack' servers, so that if a server does not know about any appropriate server to continue the computation, the job may be passed back to the last host to follow it. The system would have to be carefully monitored, and trained not to follow local cost minima, which allow the computation to be passed to more remote, but more communications-costly, nodes when it comes to return the results to the user.

This model runs into difficulties where the computation forks into a number of concurrent processing stages. One possible solution to this problem is to require that concurrent computations send periodic updates to the server which was used before the concurrent execution was started, to synchronise and continue the computation.

There needs to be some semblance of a priority system. To prevent users from abusing the priority system by requesting everything be run at the highest priority, it may be decided to adopt the solution that has been taken in CCS [193], where the user is charged for the priority level at which they submit their job.

3.2 Independent Tasks

The performance of dynamic load balancing models in a heterogeneous system was studied by Chow and Kohler [42], where they considered deterministic and non-deterministic routing strategies. The non-deterministic strategy uses a probability, p_i, of a process being routed to processor i, which is either chosen arbitrarily or based on a function of the system parameters.

Deterministic routing involves a job dispatcher which routes jobs to processors according to some policy. Three deterministic policies are investigated: minimum response time, minimum system time, and maximum throughput. Chow and Kohler's

study concludes that the maximum throughput policy, which uses information on the arrival rate as well as the service rate and system state, performs better than those policies that base their decisions on the service rate and system state only.

We discuss the topic of independent tasks further in chapter 4. A tool is presented that allows the comparison of different static and dynamic scheduling algorithms for the placement of independent jobs. We use a number of simple placement algorithms and also a variant of Chow and Kohler's algorithm.
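A minimal sketch of the non-deterministic routing strategy follows: each incoming job is routed to processor i with probability p_i. The probability values used here are placeholders; Chow and Kohler choose them either arbitrarily or from the system parameters, as described above.

    // Minimal sketch of probabilistic (non-deterministic) job routing: processor i is
    // chosen with probability p[i]. The probabilities are placeholder values.
    import java.util.Random;

    public class ProbabilisticRouter {
        static int route(double[] p, Random rng) {
            double r = rng.nextDouble();
            double cumulative = 0.0;
            for (int i = 0; i < p.length; i++) {
                cumulative += p[i];
                if (r < cumulative) return i;
            }
            return p.length - 1;   // guard against floating-point rounding
        }

        public static void main(String[] args) {
            double[] p = {0.5, 0.3, 0.2};        // must sum to 1
            Random rng = new Random(7);
            for (int job = 0; job < 5; job++)
                System.out.println("job " + job + " -> processor " + route(p, rng));
        }
    }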

3.3 Dependent Tasks

There are a number of different ways in which schedules for the execution of related tasks can be generated in the situation where the resources that are to be used are dedicated. Feitelson and Rudolph [63] recognise a number of different static approaches to parallel job scheduling, including critical path methods, list scheduling methods, and the partitioning of a DAG into clusters of nodes. As creating an optimal schedule is NP-complete, heuristic algorithms based on list schedules and clustering techniques are common [4, 86, 119].

3.3.1 List Schedules

List scheduling is relatively straightforward to implement. Most commonly, list scheduling is used to place the tasks that comprise a parallel program [4]. Tasks comprising a program are assigned priorities and placed in a list ordered in decreasing priority. Whenever tasks contend for processors, the highest priority task that is immediately executable is assigned first. List schedules may be preemptive or non-preemptive; if there are two tasks with the same priority that can execute, one is chosen randomly.

List scheduling algorithms rely on the knowledge of task execution times, and on precedence relationships. They are most often used to schedule dependent tasks on clusters of homogeneous processors [4]. Given a set of tasks T = {T_1, T_2, ..., T_n} which has the precedence graph G, we let the execution time of task T_i be t_i. The length of a directed path is defined as the sum of all the weights along the path, including the initial and final vertex. The level of a task within a program is defined as the length of the longest path from a vertex T_i to an exit node (a node which has no successors). Similarly, the co-level of a task is defined as the maximum distance from the vertex T_i to the entry vertex (which has no predecessors). A table can therefore be created in which each

task is assigned a level and co-level. Figure 10 shows a precedence graph and a table detailing each task's level and co-level. The level and co-level are used to order the tasks for execution. No allowance is made for tasks that are unable to execute on some of the processors in the pool.

Figure 10: List scheduling algorithms rely on task execution times and the precedence relations between tasks. A precedence graph is shown, with each task's level and co-level. The task number is represented by a vertex; the task's execution time is shown next to the vertex. The level of a task is the longest distance between the task and an exit node (which has no successors). The co-level is the longest distance between a task and an entry node (which has no predecessors). Tasks can be ordered by either the level or the co-level when being assigned to processors.
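The level computation that such a list schedule relies on can be sketched as follows; the co-level is computed analogously over predecessors. The small DAG, task numbering and execution times used here are assumptions for the example, and are not those of figure 10.

    // Compute task levels (longest path, including the task itself, to an exit node)
    // and order the tasks by decreasing level, the classical list-scheduling priority.
    import java.util.*;

    public class LevelListSchedule {
        static int level(int task, int[] time, List<List<Integer>> succ, int[] memo) {
            if (memo[task] != 0) return memo[task];
            int longest = 0;
            for (int s : succ.get(task)) longest = Math.max(longest, level(s, time, succ, memo));
            return memo[task] = time[task] + longest;  // path length includes the task itself
        }

        public static void main(String[] args) {
            // Task execution times (index 0 unused) and successor lists of a small DAG.
            int[] time = {0, 2, 3, 1, 4, 2};
            List<List<Integer>> succ = Arrays.asList(
                    List.of(), List.of(3, 4), List.of(4), List.of(5), List.of(5), List.of());
            int[] memo = new int[time.length];

            Integer[] order = {1, 2, 3, 4, 5};
            Arrays.sort(order, (a, b) -> level(b, time, succ, memo) - level(a, time, succ, memo));
            System.out.println("list order by decreasing level: " + Arrays.toString(order));
        }
    }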

The trade-off between cost and accuracy of list schedules is discussed by Adam, Chandy and Dickson [4], where they consider the problem of scheduling two or more homogeneous processors to minimise the execution time of a program which consists of partially-ordered tasks. Independent tasks are not considered. They found that when the tasks that comprise a program are organised into a precedence graph, the best schedules are found when using the levels of tasks, incorporating the task execution times to refine the estimate of the priority with which to assign the task. In this case, the generated list schedules were found to be within 4.4% of the optimal execution time. Communications time between tasks was not considered in this study.

3.3.2 Clustering Algorithms

Clustering, or processor assignment, involves the collection of tasks that exchange a large amount of data onto the same processors, while at the same time distributing the tasks in order to achieve good load balancing [183]. Heuristics have been suggested for clustering [86].

Yang and Gerasoulis' dominant sequence clustering (DSC) algorithm [240] is one part of a multi-part graph scheduling system [84]. The algorithm clusters the dominant sequence of a directed acyclic graph (DAG), which is the critical path of a DAG whose nodes have been allocated to processors. The critical path of a DAG is the path connecting the programs which dominate the execution time of the DAG. If the programs on the critical path are not optimally scheduled, the resulting execution time for the job will be non-minimal. Once the dominant sequence of the DAG has been clustered, the remainder of the DAG is placed so as to minimise total execution time. The work on DSC was later applied to iterative task graphs [77, 239, 241].

An example of clustering is shown in figure 11. In this figure, nodes 1, 2 and 4 have been clustered together and assigned to host 1; nodes 3 and 5 have been clustered and assigned to host 2; and nodes 6 and 7 have been clustered and assigned to host 3. The decision of which nodes to cluster together is based on heuristics, including the amount of data to be shared between the nodes, and the relative speeds of each of the hosts. Clustering is discussed in the context of our model in section 5.2.3. We use an algorithm based loosely on DSC.
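A greatly simplified, greedy form of such clustering is sketched below: the heaviest communication edges are 'zeroed' first by merging their endpoint tasks into one cluster. The edge weights and the target number of clusters are assumptions chosen so that the result reproduces the grouping of figure 11; the sketch is not the DSC algorithm itself.

    // Greedy edge-zeroing clustering sketch: merge the endpoints of the heaviest
    // communication edges until the desired number of clusters (hosts) remains.
    import java.util.*;

    public class GreedyClustering {
        static int[] parent;                     // union-find over task ids
        static int find(int x) { return parent[x] == x ? x : (parent[x] = find(parent[x])); }
        static void union(int a, int b) { parent[find(a)] = find(b); }

        record Edge(int from, int to, int dataVolume) {}

        public static void main(String[] args) {
            int tasks = 7;
            parent = new int[tasks + 1];
            for (int i = 1; i <= tasks; i++) parent[i] = i;

            // Hypothetical data volumes exchanged along the edges of a small task graph.
            List<Edge> edges = new ArrayList<>(List.of(
                    new Edge(1, 2, 90), new Edge(1, 3, 10), new Edge(2, 4, 80),
                    new Edge(3, 5, 70), new Edge(4, 6, 5), new Edge(5, 7, 15), new Edge(6, 7, 60)));

            edges.sort((a, b) -> b.dataVolume() - a.dataVolume());
            int clusters = tasks, wantedClusters = 3;
            for (Edge e : edges) {
                if (clusters <= wantedClusters) break;
                if (find(e.from()) != find(e.to())) { union(e.from(), e.to()); clusters--; }
            }
            for (int i = 1; i <= tasks; i++)
                System.out.println("task " + i + " -> cluster " + find(i));
        }
    }

With these hypothetical weights the merges leave three clusters, {1, 2, 4}, {3, 5} and {6, 7}, matching the example assignment to hosts 1, 2 and 3 described above.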

3.4 System State Information

The situation in which everything is known about the nodes participating in a distributed system is called complete system state information. This concept is shown in figure 12, where each node is aware of all operations that each of the others can support. In the case in which all nodes are under dedicated control of a master node, complete system state is not an unreasonable assumption. In a real distributed system, characteristics including the load on each machine, and the number of tasks awaiting execution comprise the reported system information. In practice, the use of complete system state information means that the algorithm performing task scheduling does not need to estimate any of the characteristics of remote nodes. This allows a scheduler to make accurate predictions on the behaviour of a remote node. Most resource management systems (see section 2.1) assume

Figure 11: The nodes of a task graph may be clustered together to minimise the amount of data to be transferred between hosts in a distributed system.

they have complete system information, which is updated either by the master explicitly polling the slave machines; the slave machines regularly reporting their state information to the master; or, as found in [172], there may be a hierarchical arrangement through which complete information is propagated and can be searched. If the nodes are regular in behaviour and use, maintaining complete system state information may not be difficult. In the case where nodes are dynamically joining and leaving the distributed system, or are not dedicated to the use of the distributed environment, such maintenance is a non-trivial problem. In the case where maintaining complete system state information is too difficult or impractical, partial system state information may still be of value.

Partial system state information allows nodes to keep their own view of the complete system state. This idea is illustrated in figure 13. Figure 13 i) shows

Figure 12: Complete system state information. Each node has complete, accurate information about all other nodes. For example: node A can support operations F and G; node B can support operations H and I; node C can support operations J and K; and, node D can support operations L and M. This information is replicated across all nodes in the distributed system.

the initial state of the distributed system, where each node is unaware of the other nodes. Through some mechanism, nodes B and C may need to communicate. The results of this communication are shown in figure 13 ii). This may occur as: a product of a multi-cast request by either node; a manager at one of the nodes may be aware that the other is now online; or, after restoring some saved state information, either node may see some information that references the other, and seek to update its information.

The problem of partial system state information is similar to that of assuring consistency amongst distributed databases. An example of this problem is that found in the Domain Name System (DNS) [160, 161]. Machines connected to the Internet always have a unique address, called the IP address [221], which is used for all communications. In addition, they usually have a human-readable name. For example, the host named opal.cs.adelaide.edu.au has the IP address 129.127.8.80. Within DNS, each logical group of computers is termed a domain. Domains can be split into further sub-domains, and many domains can be part of a larger domain in a tree-like fashion [9]. The information describing the mapping is stored locally to a domain. DNS exists as a mechanism for the exchange of local information between distributed domains. It relies on a central authority to delegate the authority to create domain names, and because of its hierarchical nature, when a hostname is unable to be resolved by the local domain, the request is passed to the larger, encapsulating domain [199]. The design of DNS is not a directly useful approach to the problem we consider in the remainder of this thesis, as we do not impose a hierarchical ordering of logical domains in the DISCWorld.


Figure 13: Consequences of partial system state information. When the state information maintained by a distributed system is not complete, some nodes may know about resources that others do not. i) Initially, each node only knows about itself and the operations it supports; ii) nodes B and C exchange knowledge; iii) nodes C and D exchange knowledge; iv) nodes A and B exchange knowledge. These exchanges result in partial system state knowledge being maintained about the system. If node A needs to know where an operation, L, is supported, then it can ask those nodes that it knows about, in the hope that they know of such a node. In the case of static information, complete system state information is attained quickly; the dynamic case is more complicated as information needs to be updated periodically.
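The exchanges of figure 13 can be sketched as a simple merging of per-node views, as below. The node and operation names follow the figure; the map representation is an assumption made for illustration only.

    // Each node keeps its own, possibly incomplete, view of which operations other
    // nodes support; communicating nodes merge their views.
    import java.util.*;

    public class PartialStateExchange {
        static Map<String, Map<String, List<String>>> views = new HashMap<>();

        static void addNode(String name, List<String> ops) {
            Map<String, List<String>> own = new HashMap<>();
            own.put(name, ops);
            views.put(name, own);
        }

        // When two nodes communicate, both learn about every node the other already knows.
        static void exchange(String a, String b) {
            Map<String, List<String>> merged = new HashMap<>(views.get(a));
            merged.putAll(views.get(b));
            views.put(a, merged);
            views.put(b, new HashMap<>(merged));
        }

        public static void main(String[] args) {
            addNode("A", List.of("F", "G"));
            addNode("B", List.of("H", "I"));
            addNode("C", List.of("J", "K"));
            addNode("D", List.of("L", "M"));
            exchange("B", "C");   // step ii)
            exchange("C", "D");   // step iii)
            exchange("A", "B");   // step iv)
            // Node A now knows about A, B and C but not D, so a request for operation L
            // would have to be forwarded to a node that A already knows about.
            System.out.println("A's view: " + views.get("A").keySet());
        }
    }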

Although not as accurate as complete system state information, partial information is easier to maintain. Schedulers can use partial system state information when constructing placement schedules. We impose the restriction of partial system state information in the model and algorithm developed in chapter 5.

Static scheduling is simpler to implement and it seems to be the more widely-used approach to scheduling. The literature features many examples of static scheduling, where the authors have made different assumptions about such details as:

o the number of processors that are available for allocation [4, 28, 215];

o whether the processors can run any of the tasks that make up the application [42, 202, 224];

o the model used to evaluate the scheduling efficiency, and the input to the model [40, 41, 138];

o whether all tasks in the application have constant execution time [4, 41];

o whether communication time between the processors is taken into account [4, 28, 47];

o whether the model incorporates parallel execution of tasks, concurrent execution, or both [41, 154, 224];

o whether loops or conditionals are allowed in the input to the model [202, 224]; and,

o whether the processors that make up the working set are homogeneous or heterogeneous [30, 138, 215].

3.5 Historical Review of Scheduling

This section presents an historical review of scheduling literature. Of course, this is not an exhaustive list of all the literature; it encompasses what we believe to be important studies. We are not interested in the complexity of the models - just the models themselves. Table 3 shows a comparison of models found in the scheduling literature. The majority of the systems studied were formulated in the context of parallel program

Table 3 compares, one row per study, the models of: Stone [215]; Kaufman [138]; Adam, Chandy and Dickson [4]; Chow and Kohler [42]; Bokhari [28]; Chou and Abraham [41]; Towsley [224]; Eager, Lazowska and Zahorjan [53]; Sarkar and Hennessy [202]; Cvetanovic [47]; Shen and Tsai [204]; Bokhari [29]; Polychronopoulos and Kuck [189]; Bokhari [30]; Litzkow, Livny and Mutka [153]; Kruatrachue and Lewis [143]; Pulidas, Towsley and Stankovic [190]; Lo [154]; and Fernandez. For each study the table records the scheduling type, task allocation, model inputs, task execution time, inter-task communication time and processor type.

Table 3: Comparison of models found in the scheduling literature. The type of scheduling can either be Static, Hybrid (static then dynamic), or Dynamic. Task allocation can either be Restricted, Semi-Restricted or Unrestricted. The model inputs can be an Arbitrary Graph, a Tree, a Precedence Graph, a DAG, or Unrelated Tasks. Task execution time is either Known or Unknown, and inter-task communication time in the model is either included as a constant (Y), included but variable (V), or not included (N). The processors used by the model are either homogeneous or heterogeneous.

execution on tightly-clustered homogeneous processors. It can be seen that most of the systems do not incorporate the concept of restricted program placement between nodes - they assume that all programs can be placed on any node in the processor pool. In addition, the input to most of the models is a tree, or graph, of programs (or tasks). Tasks organised into a tree have precedence relations; they are instances of task graphs with multiple source or sink nodes. In most cases, an arbitrary task graph, or a precedence graph, is implicitly directed. If the directed graph does not contain cycles, it is termed a directed acyclic graph (DAG). DAGs are used to represent the processing requests in the model and algorithm developed in chapter 5. Some of the systems featured in table 3 are described in the remainder of this section.

Stone [215] used a modified commodity flow algorithm and cutsets for scheduling an arbitrary graph over two and three heterogeneous processors. Each task had known, but not identical, execution time. Each branch in the network had an associated amount of information to be transferred. The capacity of the network was used to measure the actual amount of information flowing between the tasks at runtime. Cutsets divide the graph into information sources and sinks. The model incorporated variable communication time between processors. It also allowed concurrent execution, but neither parallel execution nor loops and conditionals in the input graph. The main characteristic of the model was that the input task graph is undirected, thus representing two-way communication between modules; the cost of moving a computation to another processor is traded off against the communication costs of transferring the results between processors. This model was one of the first to consider semi-restricted tasks. Some tasks were assigned to processors; others were able to 'float' between processors during program execution.

Kaufman [138] used a longest path method to statically schedule a tree of tasks onto n homogeneous processors, in an unrestricted fashion. The longest path method is equivalent to a list scheduling algorithm, ordering by decreasing task level. Trees of tasks have precedence relations between tasks. Tasks were assumed to have known but non-identical execution times, but communication time was not incorporated into the cost model. Concurrent execution of tasks was supported, but not parallel execution, or loops and conditionals in the input tree. This was the first study to consider tasks with non-homogeneous execution times.

List schedule algorithms were compared in [4], across n homogeneous processors, using a precedence graph as input. Tasks were assumed to be unrestricted in assignment, and had random, but deterministic, execution times.

Communication time was not incorporated into the cost model. Concurrent execution was allowed, but not parallel execution, and loops and conditionals were not supported. If tasks are treated as programs, and task intercommunication is ignored (i.e. tasks are independent), list scheduling is equivalent to the first-come first-serve scheduling algorithm used in the tool described in chapter 4.

Bokhari [28] featured a static then dynamic scheduling algorithm, using unrestricted placement of a tree of tasks, across n (n > 4) heterogeneous processors. Tasks had known but non-identical execution time, and variable communication time was incorporated into the cost model. This study was one of the first to consider variable communication times. Parallelism between tasks, and loops and conditionals in the input tree, were not considered. All processors are assumed to be dissimilar, but are able to execute any task in the tree. The shortest tree algorithm is used to minimise the sum of execution time and inter-processor communication time. The algorithm suggested by this study exhaustively iterates through all possible assignments of modules to processors, using the shortest tree. This algorithm is similar to the model that we develop in chapter 5.

Chou and Abraham [41] suggested seven program descriptors for dynamic execution on distributed systems and policy iteration. The descriptors are: the execution time of a task; the communication time for the results of a task; the probability that a task fails on a processor; the time to create a checkpoint for a task; the time to restart a failed task; the time to initiate a set of concurrent tasks on a processor; and, the communication time for the results of a set of concurrent tasks. The paper presents an algorithm for optimal task assignment on n heterogeneous processors, featuring probabilistic conditional branches in the arbitrary input task graph. The study was one of the first to consider parallel as well as concurrent execution of application tasks, and although communication was considered in the cost model, the communication cost between processors was fixed.

Towsley [224] extended Bokhari's shortest path method, and incorporated the concept of processor reliability from [41] to produce run-time estimations by loop-unravelling. An arbitrary task graph is scheduled for execution on n heterogeneous processors, where tasks are either allowed to execute on any processor, or they are all assigned to certain processors. Task execution time is known but non-identical, and communications time is incorporated into the cost model, albeit fixed. This research is significant because it supports conditionals and loops within the arbitrary task graph as input.

Chen and Eshaghian [40] present a fast recursive mapping algorithm for parallel applications on parallel systems. Their system involves the clustering of the algorithm's task graph, and the separate clustering of the parallel system's graph. The algorithm provides for the mapping between the two clustered graphs in O(MN) time, where M is the number of task modules and N is the number of underlying processors. In this study, the authors assume that each module (comprising the application) executes for a single time unit, and communicates a uniform amount of data to any child modules. They further assume that each processor is the same speed and that all have the same network characteristics (transmission rate and latency). Although this study is not applicable in the situation we consider for this thesis, it is applicable to the case where only a single variable (either the program graph or the architecture of the parallel machine) is changing. Thus, many program graphs, for example, could be mapped to a single parallel architecture graph.

The automatic heterogeneous supercomputing (AHS) system [49] uses a semi-dynamic strategy for scheduling user application programs. The system maintains a file for each application program containing an estimate of the execution time of that application on each of the available machines; when the user invokes the program, a machine is chosen so as to minimise the turnaround time, which is a function of the current load on the machine and the expected execution time on that machine. The system is semi-dynamic in that once the application is started on a given machine, there is no intervention from the scheduler. The amount of information that this scheduling algorithm uses is minimal, and while it benefits from low scheduling overhead, it does fail to take into account any issues arising from input data locality.
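A minimal sketch of the machine-selection rule described for AHS follows: the stored execution-time estimate is scaled by the current load, and the machine with the smallest resulting turnaround estimate is chosen. The cost function and machine names are assumptions made for the example; the real AHS system is not reproduced here.

    // Choose the machine with the smallest load-adjusted execution-time estimate.
    import java.util.*;

    public class TurnaroundSelection {
        record Machine(String name, double estimatedExecSeconds, double currentLoad) {}

        static Machine choose(List<Machine> machines) {
            return machines.stream()
                    .min(Comparator.comparingDouble(m -> m.estimatedExecSeconds() * (1.0 + m.currentLoad())))
                    .orElseThrow();
        }

        public static void main(String[] args) {
            List<Machine> pool = List.of(
                    new Machine("alpha1", 120.0, 0.2),
                    new Machine("alpha2", 90.0, 1.5),
                    new Machine("sparc1", 200.0, 0.0));
            System.out.println("run on " + choose(pool).name());
        }
    }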

The self-adjusting dynamic scheduling (SADS) algorithm [101] develops a detailed cost model which is used to trade off the effects of processor load imbalance, interprocessor communication delays, and scheduling and synchronisation overhead. It creates partial schedules, minimising the cost function by using an algorithm similar to branch-and-bound [236]. The time that is taken to create the partial schedules is bounded by the speed with which the least-loaded processor in the pool can complete its tasks. Once the least-loaded processor completes its tasks, the computed partial schedules are sent to all the machines in the pool, and the process begins again. This algorithm suffers from the need to have a dedicated processor to create the partial schedules, whether this processor is the same all the time, or whether it is chosen at execution time. This is extended in the self-adjusting scheduling for heterogeneous systems (SASH)

algorithm [102]. The SADS algorithm is similar to the continual adaptive scheduling algorithm for independent programs that is considered in chapter 4.

The scheduling scheme suggested by Shirazi, Chen and Marquis [205] involves the clustering of a program's task graph into linear tasks, and then statically scheduling them after optimisation. Linear clustering of nodes is used to decrease the effects of inter-processor communications, while parallelising the nodes reduces the execution time. The implementation of this algorithm has the drawback that it assumes that the task graph that is to be used as input has a single root node, that an unbounded number of target processors are available, and that they are homogeneous. No consideration is given to contention that may arise from running multiple programs at once, or to the effects of introducing heterogeneity into the processor pool.

Weissman and Grimshaw [233] present a framework for partitioning data parallel computations in heterogeneous environments, which is implemented in Mentat. This framework is designed to effectively utilise clusters of workstations and supercomputers with different communications topologies, with a view to reducing the total elapsed time as seen by a program. Different communication topologies are characterised by different cost functions. This study includes the notion of router delays and per-byte data coercion (between data formats) costs. The values used for communications costs in the study are for an ideal system in stand-alone mode, and do not consider the effects of real-time machine load. The process placement algorithms used are communication topology-dependent, and rely on callbacks from the tasks whilst they are in computation and communication phases to provide the information that is needed to make intelligent placement decisions.

Shirazi, Wang and Pathak [206] consider the effects of three static scheduling algorithms in the context of a multiprocessor environment, where the execution time of each of the components of the input is known and invariant. Their study concentrates on achieving the fastest turn-around time for an input DAG. The algorithms they consider

heterogeneity. El-Rewini and Lewis consider the problem of scheduling parallel program tasks onto arbitrary target machines [57] in the presence of contention between processing nodes. They use a scheduling heuristic, MH, which produces a static schedule based on list scheduling. A task graph and an undirected graph representing the target machine are used by the mapper algorithm, which produces a Gantt chart of the resulting schedule. Contention on interconnection links is modeled as an extra delay which is constant for a given link. Although the effects of contention on interconnection links are ignored, we consider the effects of assigning multiple tasks to a single processor in the model we develop in chapter 5.

Malloy, Lloyd and Soffa [156] extended existing research based on the critical path method to the fine-grained parallelism found in program instruction streams, and implemented their model on a two-processor homogeneous system. This research is too fine-grained to be applicable to the problem that is under consideration.

3.6 Discussion

While a variety of different scheduling and placement models have been studied in the literature, there is very little evidence of most of the models being used in practice. Although scheduling and process placement are vital parts of a distributed system, most of the cluster computing environments described in chapter 2 do not seem to have been designed with any scheduling algorithms in mind. There is a great deal of existing research that focuses on scheduling DAGs across a bounded or unbounded number of target nodes. Although useful, results which only consider a homogeneous collection of nodes, such as [4, 138, 203], are not helpful in the case of heterogeneous nodes. Still more research [156] focuses on exploiting the fine-grained parallelism found in instruction streams. That, too, is not helpful in the case where a processing pool is comprised of heterogeneous nodes separated by a wide-area broadband network. Unfortunately, the existing research (e.g. [4, 56, 57, 63, 84, 85, 156, 201, 242]) that considers a collection of heterogeneous nodes all assumes that either:

o components can run on every machine in the processor pool; or

o interconnection networks are all homogeneous; or

o nodes are dedicated and are always available to accept a new component for execution; or

o the size of the data to be transferred is the same; or

o the nodes do not fail; or

o a centralised scheduler is acting in isolation and is in possession of up-to-date global state information; or

o all the data needed by a program component is available from the node on which it runs.

The general case that we consider has the following characteristics, for which an adequate solution does not yet exist:

o program components, or services, are not available on every node;

o interconnection networks vary between nodes;

o resources need to be scheduled across administrative boundaries;

o nodes are owned and maintained by separate groups (and individuals) who may arbitrarily decide to remove the machine from the available processor pool;

o the data outputs of different services are not the same size, nor are they necessarily constant across executions of the same service with different input parameters;

o execution schedules can be created by any node within the processor pool, using the most up-to-date information available, with each node creating its own schedules; and,

o in some situations, the data that a service requires may not be readily copied to another machine, which may necessitate moving the service code to the remote machine.

3.7 Conclusion

The scheduling literature provides solutions for a number of special cases. However, we believe that the previous results are not completely applicable to the system model

that is found in some wide-area distributed systems, such as our DISCWorld prototype metacomputing environment. They do not address the fundamental constraints and features found in our system: lack of up-to-date global system information; restricted placement and mobility of data and services; a global naming strategy for data and services, by which data may be re-used; and the single assignment nature of data.

There are a number of popular approaches to this problem, most of which are related to the critical path method. We consider services that can be executed on a number of different nodes within the distributed environment, each with a different set of transfer and execution costs. Due to the sheer computational complexity of computing a critical path for every possible node assignment, we feel the classical approaches of list scheduling [4] and clustering [207, 240] are not applicable. In addition, the above techniques fail to take into account the probable need to transfer input data to each service before it can be run, which is different from the data created as the output of a previous processing step. Finally, Sarkar's two-stage algorithm [201] and DSC only consider the case of homogeneous compute nodes, which is too restrictive for the more general case we consider.

A number of heuristics and partial solutions may be combined to form a solution to the problem we consider. We use the general idea of clustering, and in particular, a modified version of Yang and Gerasoulis' DSC algorithm, to minimise the total execution time of an individual job. We consider a processing request to be made up of a number of programs, which may have inter-dependencies.

When all processors in a distributed system are controlled, and assigned programs, by a single front-end processor, static scheduling, as described in section 3.1, is best. This is because the complete state of the system is knowable. There are two major drawbacks to using a centralised scheduler. Firstly, the use of a centralised scheduler may cause a processing bottleneck, and secondly, the centralised scheduler introduces a single point of failure. This means that if the node hosting the scheduler suffers a failure, either of the physical machine or of the network interconnecting the scheduler with the remainder of the processors in the pool, no more jobs will be able to be scheduled.

If all processors have the ability to assign programs to run in the distributed system, then dynamic scheduling, or load balancing, is more appropriate because of the lack of control by an individual node over the whole system. Most dynamic scheduling algorithms seek to obtain the best performance for the individual programs under consideration, in the assumption that by doing so, the complete processing

request will be efficiently executed; they perform no forward-planning. In chapter 4 we present a tool we have developed that allows quantifiable analysis of static and dynamic scheduling algorithms when used to execute independent programs injected into a cluster of computers. When the distributed environment is relatively stable, we find that in the case where the programs to be run are easily characterised, a static scheduling algorithm provides good execution time; when the programs are not easily characterised, a dynamic algorithm is needed.

In the situation where the node from which a processing request is submitted has information on the system state (even if incomplete), there is benefit in using that information to produce a heuristically good execution schedule. If the execution schedule is created with static task-to-processor assignment, and then modified at execution time, we call this model hybrid static-dynamic. Thus, the node creates the best static schedule that is heuristically possible using its current system state information as an estimate. Since other nodes may have different system state information, at execution time they may modify the schedule to make it more efficient for the processing request's execution.

Some of the systems we review in chapter 2 provide solutions which incorporate subsets of the conditions which characterise our system. In chapter 5 we develop a model for scheduling in wide-area systems. The DISCWorld model is designed for jobs that consist of inter-dependent high-level programs, which we subsequently describe in chapter 6.

Chapter 4

Scheduling Independent Tasks

4.1 Introduction

Optimal scheduling of interacting jobs on metacomputer systems is a difficult problem. Here we focus on the simpler problem of scheduling non-interactive, independent tasks. This work is also reported in [131]. Independent tasks are described in chapter 3.

In implementing metacomputing environments it may be infeasible or impossible to have a pre-set-up point-of-presence, or daemon, running on each node in the system. The node may not be able to support the environment in which the daemon is written. For example, Java [88] is not available for some platforms such as the Thinking Machines CM-5. The node's administrator or owner may not want the machine directly accessible to the remainder of the metacomputing environment. For example, on a Beowulf cluster [115] a front-end node may be acting as a gateway to hidden worker nodes. It is therefore sensible to design a software scheduling system to allow a designated node to act as an active gateway for other passive nodes. Under this model, the active node must be able to schedule remote processes to provide good job throughput in the larger scale of the whole metacomputing environment, which may comprise a cluster of compute clusters. It must also ensure that fair access is granted to resources which may be shared. Metacomputing resources often comprise individual workstations and other computers accessed directly by their owners and other users, outside the control of any scheduling software.

In this chapter we focus on the problems of scheduling in cluster computing. There is already a large body of existing work in this area, including research projects such as: NQS [117]; PBS [23]; DQS [220]; EASY [151]; and the Prospero Resource


Manager (PRM) [172], as well as commercial products such as: LoadLeveler [123]; and Codine [81]. These software systems are broadly characterised as queueing software which runs on all of the participating nodes, and are described more fully in chapter 2. Although several support partial node availability, they are often hard to set up and maintain, with policy information spread across distributed nodes. These systems do, however, use individual node loads in load balance calculations, which is more difficult for a centralised scheduler that does not have access to daemons on each node.

In this chapter we experiment with a centralised scheduler model to manage a sub-cluster of resources. A full metacomputing environment may integrate many such transient sub-clusters together using the decentralised daemon approach, with each sub-cluster managed by a daemon. This is useful if transient sub-clusters comprise resources temporarily assigned from other duties, such as individual users' workstations.

We consider a number of scheduling algorithms. We measure their behaviour when all the jobs that are to be run on a particular sub-cluster are registered before the scheduling process begins, but the load state of the system is unknown. While a job might take a relatively well defined time to execute when all required input data is available on the local node, this can increase significantly when it must retrieve data either from the local gateway or a remote system.

The typical usage scenario we are interested in for our prototype DISCWorld [113] metacomputing environment is for high level user queries to be decomposed into well defined independent jobs. These jobs may be parameterised simulations, or simple data filtering jobs. The execution model is that of a gateway node which hosts a scheduling daemon which can remotely execute jobs independently on the sub-cluster it is responsible for.

4.2 Scheduling Independent Jobs

In a set of independent jobs, one of the defining characteristics is the execution time of each job. In the situation where each job has similar resource requirements, it is possible to test the performance of a scheduling algorithm on different distributions of job execution times. In this study, we approximate the time taken to copy the data necessary for the job to be executed, and also the effects of different sized data, by considering different distributions of execution time. We consider five different distributions of task execution time: homogeneous; bimodal; uniform random; normal; and Poisson. All results we report in this chapter were carried out on

a cluster of 8 DEC Alpha workstations running Digital Unix. We are experimenting with the effects of heterogeneous cluster nodes but do not report on that work here. The use of a homogeneous cluster of workstations allows us to see the impact of the different scheduling policies on different job mixtures. The job execution times that we used for the different distributions are shown in table 4.

Distribution       Characteristics
Homogeneous        job duration = 0.1 sec
Bimodal            job duration randomly chosen from 0.1 or 10 sec
Uniform Random     job duration in interval [0.1, 10) sec
Normal             mean = 0.1 sec, σ = 1.0 sec
Poisson            mean = 0.1 sec, σ = 1.0 sec

Table 4: Simulation job duration characteristics

As the name suggests, jobs with homogeneous execution times all run for the same duration. There are many examples of jobs with a homogeneous execution time distribution, most commonly found when requesting that the same application be run with either no inputs, or slightly different, but equivalent, inputs which are already present on the local node. Bimodal distributions occur when the jobs they represent take one of two durations to execute. In the context of a metacomputing environment, this distribution occurs most often when a single job is being executed which sometimes has to retrieve input data from a remote node. For the purposes of this study, when jobs have a bimodal execution time distribution, they have an equal (50%) chance of being randomly assigned to be a long or short job. Uniform random distributions must be used to model jobs when nothing is known about their execution time. An example of this is when a job has roughly equal probability of being able to either run without any parameters, or it may need parameters that can vary in size, and may be available from the local node or a remote node. A normal distribution occurs most frequently when a job is used with input data that is only loosely clustered around a mean size, which is most often available from the local node. An example of a Poisson job distribution is a job which is usually run with input data that is very tightly clustered around a mean size and which is nearly always resident on the local node.

4.3 Scheduling Algorithms

Efficient scheduling software should minimise idle processing time. No single algorithm can achieve optimal performance on all possible job execution time distributions. However, given extra information, such as which distribution dominates the job spectrum, a smart scheduling system can choose which policy will be near optimal. To investigate this, we tested five scheduling policy algorithms against some simple job time distributions. We implemented these algorithms in a prototype centralised scheduler program known as dploader. This multi-threaded program was implemented in C and uses a single controlling thread to assign jobs to remote processors, and an additional single thread per processor executing remote jobs. The program structure is shown in figure 14. It therefore acts as the central sub-cluster manager discussed above and provides a test framework. At run time, the dploader program is set up with a host list of those remote nodes or hosts to which it is able to assign jobs.


Figure 14: dploader program structure. The scheduler creates a thread per processor and a controlling thread. The per-processor threads use rsh to execute remote tasks.
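The structure of figure 14 can be sketched as follows: a controlling (scheduler) thread hands jobs to one thread per remote processor, and each of those threads launches its jobs in a remote shell. The sketch is written in Java for brevity and uses placeholder host names and jobs; the real dploader is a C program, and its scheduling policies are described below.

    // Toy dploader-like structure: one worker thread per host executes jobs via rsh,
    // while the main (scheduler) thread assigns jobs, here in simple round-robin order.
    import java.util.*;
    import java.util.concurrent.*;

    public class MiniLoader {
        public static void main(String[] args) throws Exception {
            String[] hosts = {"hostA", "hostB"};                     // placeholder host list
            List<BlockingQueue<String>> perHost = new ArrayList<>();
            List<Thread> workers = new ArrayList<>();

            for (int i = 0; i < hosts.length; i++) {
                perHost.add(new LinkedBlockingQueue<>());
                final int h = i;
                Thread t = new Thread(() -> {
                    try {
                        String job;
                        while (!(job = perHost.get(h).take()).equals("STOP")) {
                            // launch the job in a remote shell on the assigned host
                            new ProcessBuilder("rsh", hosts[h], job).inheritIO().start().waitFor();
                        }
                    } catch (Exception e) { e.printStackTrace(); }
                }, "worker-" + hosts[h]);
                workers.add(t);
                t.start();
            }

            // The scheduler thread assigns jobs to hosts and then signals the workers to stop.
            String[] jobs = {"sleep 1", "sleep 1", "sleep 1", "sleep 1"};
            for (int j = 0; j < jobs.length; j++) perHost.get(j % hosts.length).put(jobs[j]);
            for (BlockingQueue<String> q : perHost) q.put("STOP");
            for (Thread t : workers) t.join();
        }
    }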

A number of scheduling algorithms were tested, including: round-robin (perfect); round-robin with clustering (perfect-clustering); minimal adaptive (adaptive); continual adaptive (continual-adaptive); and first-come first-serve (fcfs). The names in parentheses are used as shorthand on the graphs and tables of results. Scheduling is said to be static when the processors on which the jobs will run are assigned at compile-time or before execution. Dynamic scheduling [6], or load balancing, is performed at run-time. This involves all jobs being assigned to processors, and the use of transfer policies and placement policies in order to decide when a job will be

Our problem is that we are not able to run daemons that report accurate and up-to-date sub-cluster node status information to the centralised scheduling software. The only information that we receive from a remote node is the execution time of the last job or, if multiple jobs were run, the average execution time. In our metacomputing context, the classic terminology of static and dynamic scheduling is no longer entirely appropriate, due to its inability to describe run-time job assignment algorithms which use minimal information about remote nodes. These algorithms are not strictly static, as job placement is performed at run-time. They are also not strictly load balancing algorithms, in the sense that a centralised algorithm decides job placement and defers assignment until jobs are actually ready to be executed.

We define performance as the total time taken to schedule and execute the jobs assigned. This metric incorporates not only the time taken to execute jobs on the remote machine, but also overheads associated with the establishment of remote shells, and any other time spent waiting for a job to be assigned to a processor. The time taken to establish a job (or group of jobs) on a remote machine is termed the job assignment latency. We assume that the additional increase in latency incurred when requesting a reasonable number of jobs is negligible, because the majority of the overhead making up this latency is caused by contacting the machine and creating a remote shell or process.

In perfect or round-robin scheduling, each node receives an equal proportion of the jobs. Thus, if there are n jobs and m nodes in the workstation cluster, then each machine will receive n/m jobs to process. Nodes are cyclically assigned a single job, in the order of their appearance in the host list, after the preceding host has been assigned its job; thus node i + 1 will not receive a new job until it is possible to assign a job to node i. Processor 0 receives job 0, processor 1 receives job 1, ..., processor 0 receives job m, processor 1 receives job m + 1, and so forth. This is illustrated in figure 15. It can be seen that this implementation of perfect scheduling may lead to node i + 1 being idle if node i is relatively slower. This algorithm is best suited to the case in which the processors that comprise the sub-cluster are homogeneous, both with respect to node performance and to the actual load on the processors. Perfect scheduling is designed to average out the number of jobs executed on each host. Unfortunately, this algorithm is expected to be the worst performing of those studied, for several reasons, including those of job and node non-homogeneity in practice, as well as high individual job assignment latency.
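The sketch below illustrates the cyclic assignment order and a simplified accounting of the performance metric defined above. The latency and duration values are placeholders, not measured data, and the model deliberately ignores the serialisation of assignments between successive processors.

#include <stdio.h>

#define N 12                   /* number of jobs (placeholder)            */
#define M 3                    /* number of processors (placeholder)      */
#define ASSIGN_LATENCY 1.5     /* illustrative per-job rsh start-up cost  */

int main(void)
{
    double duration[N];        /* job execution times; homogeneous here   */
    double busy[M] = { 0.0 };  /* accumulated time on each processor      */

    for (int j = 0; j < N; j++)
        duration[j] = 5.0;

    /* Cyclic (perfect) assignment: job j always goes to processor j mod M,
     * and every job pays its own assignment latency. */
    for (int j = 0; j < N; j++) {
        int p = j % M;
        busy[p] += ASSIGN_LATENCY + duration[j];
    }

    /* Performance, as defined above, is approximated here by the finishing
     * time of the most heavily loaded processor. */
    double total = 0.0;
    for (int p = 0; p < M; p++)
        if (busy[p] > total)
            total = busy[p];
    printf("estimated total time: %.1f s for %d jobs on %d processors\n",
           total, N, M);
    return 0;
}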


Figure 15: Round-robin scheduling of independent jobs. Processors are not assigned a job until the preceding processor has finished its job. This scheduling algorithm ensures a perfect proportion of the total jobs will be assigned to each processor.

In the clustered round-robin algorithm, the total number of jobs is divided amongst the host list. Therefore, a processor receives all of the jobs it must execute at startup. Thus, processor 0 receives jobs 0 to n/m - 1, processor 1 receives the next n/m jobs, and so on. This algorithm has the advantages of being the easiest of those studied to implement and of reducing the job assignment latency. However, it suffers from the deficiency of not being able to control the number of jobs to execute if a processor is found to be relatively slower than the others. In terms of job assignment latency, this scheduling algorithm is the least expensive of those studied, as it only performs a single job assignment on each remote processor. Thus the execution time of the whole program is that of the slowest processor or, if the processors are homogeneous, that of the processor that gets assigned the greatest number of longer jobs in a distribution.

The algorithm we term minimal adaptive scheduling is that of testing the relative performance of nodes before placing the remaining jobs onto them. The object of this is to avoid the Amdahl effect incurred by the clustered round-robin algorithm. Provided n > m, where n is the number of jobs and m is the number of processors, we select and place the first m jobs onto the nodes, one job per node. After timing how long it takes each node to process these single jobs, a relative ranking of the nodes is made, and the remaining n - m jobs are proportionally assigned to the nodes, as shown in figure 17. This algorithm does not incur as much job assignment latency as the perfect scheduling algorithm, but as it performs two sets of job assignments (one