Job Scheduling Strategies for Networks of Workstations

Job Scheduling Strategies for Networks of Workstations 1 1 2 3 B B Zhou R P Brent D Walsh and K Suzaki 1 Computer Sciences Lab oratory Australian National University Canb erra ACT Australia 2 CAP Research Program Australian National University Canb erra ACT Australia 3 Electrotechnical Lab oratory Umezono sukuba Ibaraki Japan Abstract In this pap er we rst intro duce the concepts of utilisation ra tio and eective sp eedup and their relations to the system p erformance We then describ e a twolevel scheduling scheme which can b e used to achieve go o d p erformance for parallel jobs and go o d resp onse for inter active sequential jobs and also to balance b oth parallel and sequential workloads The twolevel scheduling can b e implemented by intro ducing on each pro cessor a registration oce We also intro duce a lo ose gang scheduling scheme This scheme is scalable and has many advantages over existing explicit and implicit coscheduling schemes for scheduling parallel jobs under a time sharing environment Intro duction The trend of parallel computer developments is toward networks of worksta tions or scalable paral lel systems In this typ e of system each pro cessor having a highsp eed pro cessing element a large memory space and full function ality of a standard op erating system can op erate as a standalone workstation for sequential computing Interconnected by highbandwidth and lowlatency networks the pro cessors can also b e used for parallel computing To establish a truly generalpurp ose and userfriendly system one of the main problems is to provide users with a single system image By adopting the technique of dis tributed shared memory for example we can provide a single addressing space for the whole system so that communication for transferring data b etween pro cessors is completely transparent to the client programs In this pap er we discuss another very imp ortant issue relating to the provision of single system image that is eective job scheduling strategies for b oth sequential and paral lel processing on networks of workstations Many job scheduling schemes have b een intro duced in the literature and some of them implemented on commercial parallel systems These scheduling schemes for parallel systems can b e classied into either space sharing or time sharing or a combination of b oth With space sharing a system is partitioned into subsystems each containing a subset of pro cessors There are b oundary lines laid b etween subsystems and so only pro cessors of the same subsystem can b e co ordinated to solve problems assigned to that subsystem During the computation each subsystem is allo cated only for a single job at a time The space partition can b e either static or adaptive With static partitioning the system conguration is determined b efore the system starts op erating The whole system has to b e stopp ed when the system needs to b e recongured With adaptive partitioning pro cessors in the system are not divided b efore the computation When a new job arrives a job manager in the system rst lo cates idle pro cessors and then allo cates certain numb er of those idle pro cessors to that job according to some pro cessor allo cation p olicies eg those describ ed in Therefore the b oundary lines are drawn during the computation and will disapp ear after the job is terminated Normally the static partitioning is used for very large systems while the adaptive partitioning is adopted in systems or subsystems of small to medium size One disadvantage of space partitioning is that short jobs can easily b e blo cked by long ones for a long time b efore b eing executed However in practice short jobs usually demand a short turnaround time To alleviate this problem jobs can b e group ed into classes and a sp ecial treatment will b e given to the class of short jobs However it can only partially solve the problem Thus time sharing needs to b e considered Many scheduling schemes for timesharing of a parallel system have b een prop osed in the literature They may b e classied into two basic typ es The rst one is local scheduling With lo cal scheduling there is only a single queue on each pro cessor Except for higher or lower priorities b eing given pro cesses asso ciated with parallel jobs are not distinguished from those asso ciated with sequential jobs The metho d simply relies on existing lo cal schedulers on each pro cessor to schedule parallel jobs Thus there is no guarantee that the pro cesses b elonging to the same parallel job can b e executed at the same time across the pro cessors When many parallel programs are simultaneously running on a sys tem pro cesses b elonging to dierent jobs will comp ete for resources with each other and then some pro cesses have to b e blo cked when communicating or syn chronising with nonscheduled pro cesses on other pro cessors This eect can lead to a great degradation in overall system p erformance One metho d to alleviate this problem is to use twophase blo cking which is also called implicit coscheduling in In this metho d a pro cess waiting for communication spins for some time in the hop e that the pro cess to b e communicated with on the other pro cessor is also scheduled and then blo cks if a resp onse has not b een received The rep orted exp erimental results show that for parallel workloads this scheduling scheme p erforms b etter than the simple lo cal scheduling However the problem is that the scheduling p olicy is based on communication require ments Then it tends to give sp ecial treatment to jobs with a high frequency of communication demands The p olicy is also indep endent of service times The p erformance of parallel computation is thus unpredictable The second typ e of scheduling schemes for time sharing is coscheduling or gang scheduling which may b e a b etter scheme in adopting shortjobrst p olicy Using this metho d a numb er of parallel programs is allowed to enter a service queue as long as the system has enough memory space The pro cesses of the same job will run simultaneously across the pro cessors for only certain amount of time which is called scheduling slot When a scheduling slot is ended the pro cessors will contextswitch at the same time to give the service to pro cesses of another job All programs in the service queue take turns to receive the service in a co ordinated manner across the pro cessors Thus programs never interfere with each other and short jobs are likely to b e completed more quickly There are also certain drawbacks asso ciated with coscheduling A signicant one is that it is designed only for parallel workloads For networks of workstations we need an eective scheduling strategy for b oth sequential and parallel pro cessing The simple coscheduling technique is not a suitable solution The future networks of workstations should provide a programmingfree en vironment to general users By providing a variety of highp erformance com puting libraries for a wide range of applications plus userfriendly interfaces for the access to those libraries parallel computing will no longer b e considered just as clients sp ecial requests but b ecome a natural and common phenomenon in the system Along with many other critical issues therefore highly eective job management strategies are required for the system to meet various clients requirements and to achieve high eciency of resource utilisation Because of the lack of ecient job scheduling strategies most networks of workstations are currently used exclusively either as an MPP for pro cessing parallel batch jobs or as a group of separate pro cessors for interactive sequential jobs The p oten tial p ower of this typ e of system are not exploited eectively and the system resources are not utilised eciently under these circumstances In this pap er we discuss some new ideas for eectively scheduling b oth se quential and parallel workloads on networks of workstations To achieve a de sired p erformance for a parallel job on a network of workstations with a variety of comp etitive background workloads it is essential to provide a sustained ra tio of CPU utilisation to the asso ciated pro cesses on each pro cessor to allo cate more pro cessors to the job if the assigned utilisation ratio is small and then to co ordinate the execution across the pro cessors We rst intro duce the concepts of utilisation ratio and eective speedup and their relations to the system p er formance in Section In this section we also argue that b ecause the resources in a system are limited one cannot guarantee every parallel job to have a sus tained CPU utilisation ratio in a time sharing environment One way to solve the problem is that we give short jobs sustained utilisation ratios to ensure a short turnaround time while to each large job we allo cate a large numb er of pro cessors and assign a utilisation ratio which can vary in a large range according to the current system workload so that small jobs will not b e blo cked and the resource utilisation can b e kept high we then present in Section a twolevel scheduling scheme which can b e used to achieve go o d p erformance for paral lel jobs and go o d resp onse for interactive sequential jobs and also to balance b oth parallel and sequential workloads The twolevel scheduling can b e imple mented by intro ducing on each pro cessor a registration oce which is describ ed in Section We discuss a scalable coscheduling scheme loose gang schedul ing in Section This scheme requires b oth global and lo cal job managers It is scalable b ecause the coscheduling

Job Scheduling Strategies for Networks of Workstations

Three Level Scheduling

Balancing Efficiency and Fairness in Heterogeneous GPU Clusters for Deep Learning

Rotating Between Scheduling Algorithms

CPU Scheduling

CPU Scheduling

Flexible Coscheduling: Mitigating Load Imbalance Title: and Improving Utilization of Heterogeneous Resources

Processor Scheduling

Gang Scheduling Java Applications with Tessellation

CPU Scheduling

Lecture 4: Scheduling

COS 318: Operating Systems CPU Scheduling

Technical Report Nr. 745