Infrastructure for Load Balancing on Mosix Cluster

MadhuSudhan Reddy Tera and Sadanand Kota, Computing and Information Science, Kansas State University. Under the guidance of Dr. Daniel Andresen.

Abstract

The complexity and size of software are increasing at a rapid rate, which results in increased build and execution times. Cluster computing is proving to be an effective and economical way to reduce these times. Currently, most available cluster computing software tools that achieve load balancing by process migration schemes do not consider all characteristics, such as CPU load, memory usage and network bandwidth usage, during migration. For instance, Mosix, a cluster computing software tool for Linux, does not consider network bandwidth usage, nor does it consider CPU usage and memory characteristics together. In this paper we present an infrastructure for efficient load balancing on a Mosix cluster through intelligent scheduling techniques.

Introduction

As computers increase their processing power, software complexity grows at an even larger rate in order to consume all of those new CPU cycles. Not only does running the new software require more CPU cycles, but the time required to compile and link the software also increases. The basic idea behind the clustering approach is to make a large number of individual machines act like a single, very powerful machine. With the power and low prices of today's PCs and the availability of high performance Ethernet connections, it makes sense to combine them to build a High Performance Computing and Parallel Computing environment. This is the concept behind any typical clustering environment, such as the Beowulf parallel computing system, which comes with free versions of UNIX and public domain software packages.

Mosix is a software package that was specifically designed to enhance the kernel with cluster computing capabilities. It is a tool consisting of kernel-level resource sharing algorithms that are geared for performance scalability in a cluster computer. Mosix supports resource sharing by dynamic process migration. It relieves the user from the responsibility of allocating processes to nodes by distributing the workload dynamically. In this project we are concentrating on homogeneous systems, wherein we have machines with the same family of processors running the same kernel.

The resource sharing algorithm of Mosix attempts to reduce the load differences between pairs of nodes (systems in the cluster) by migrating processes from higher loaded nodes to lesser loaded nodes. This is done in a decentralized manner, i.e. all nodes execute the same algorithms and each node performs the reduction of loads independently. Also, Mosix considers only the balancing of loads on processors, and responds to changes in those loads as long as there is no extreme shortage of other resources such as free memory and empty process slots. Mosix does not consider certain parameters such as the network bandwidth used by a process running on a node. In addition, Mosix distributes the load evenly and does not give the user control of the load distribution, i.e. if the user wants only a few of the machines to be evenly loaded and a few others to be heavily or lightly loaded, he will not be able to do this. Our project aims to overcome these shortcomings. Our initial scheduling technique is decentralized and tries to give the user control of balancing the load on the various machines.

The scheduling algorithms we use try to achieve balance in load, memory and network bandwidth by collecting performance metrics of a process through Performance Co-Pilot (PCP), a framework and services from SGI that support system-level performance monitoring and performance management. We also propose an implementation based on a centralized scheduler, which tries to eliminate the problems of decentralized scheduling, such as every node trying to move its CPU intensive processes to the same lightly loaded node. The centralized scheduler also takes care of the network bandwidth usage of a process and tries to reduce the overall network bandwidth consumption by migrating communicating processes to a single node, in addition to balancing the load on the individual machines as required by the user.

Load Balancing

The notion in a Mosix cluster is that whenever a system in the cluster becomes heavily loaded, the load is redistributed within the cluster. The dispatching of tasks from a heavily loaded system and scheduling them onto a lightly loaded system in the cluster is called load balancing. Load balancing can be divided into the following phases:

a. Load Evaluation phase: "The usefulness of any load balancing scheme is directly dependent on the quality of load measurement and prediction." [Watts98] Any good load balancing technique not only has a good measurement of load, but also sees that it does not affect the actual load on the system.

b. Profitability Determination phase: We should perform load balancing only when the cost of imbalance is greater than the cost of load balancing. This comparison of the cost of imbalance vs. the cost of load balancing is profitability determination. Generally, if cost is not considered during actual migration, an excessive number of tasks can be migrated, and this will have a negative influence on system performance.

c. Task Selection phase: Now we must select a set of tasks to be dispatched from the system so that the imbalance is removed. This is done in the task selection phase. A task should be selected in such a way that moving it off the system removes the imbalance to a large extent. For instance, we can look at the proportion of CPU usage of the task on the system. We should also consider the cost of moving the task over the link in the cluster and the size of the transfer, since larger tasks will take longer to move than smaller ones.

d. Task Migration phase: This is the final phase of load balancing in the cluster. This step must be done carefully and correctly to ensure continued communication integrity.

In the following sections, we will explore two of the most popular cluster computing technologies, namely Mosix and Condor, and also explain why we chose Mosix. We continue the paper with our implementation and provide sample test results.

Mosix

Mosix is a cluster-computing enhancement of Linux which allows multiple uniprocessors and SMPs running the same version of the kernel to share resources by preemptive process migration and dynamic load balancing. Mosix implements resource-sharing algorithms which respond to load variations on individual computer systems by migrating processes from one workstation to another, preemptively.

The goal is to improve the overall performance and to create a convenient multi-user, time-sharing environment for the execution of applications. The unique features of Mosix are:

a. Network transparency: For all network related operations, application-level programs are provided with a virtual machine that looks like a single machine, i.e. the application programs do not need to know the current state of the system configuration.

b. Preemptive process migration: Mosix can migrate any user's process, transparently, to any available node at any time. Transparency in migration means that the functional aspects of the system's behavior should not be altered as a result of the migration.

c. Dynamic load balancing: As explained earlier, Mosix has resource sharing algorithms which work in a decentralized manner.

The granularity of work distribution in Mosix is the process. Each process has a Unique Home Node (UHN) where it was created. Processes that migrate to other nodes (called 'remote') use local (in the remote node) resources whenever possible, but interact with the user's environment through the UHN; for example, gettimeofday() would get the time from the UHN. Preemptive process migration in Mosix is implemented by dividing the migrating process into two contexts: the user context ('remote'), which can be migrated, and the system context ('deputy'), which is UHN dependent and cannot be migrated (see figure 1). The 'remote' consists of the stack, data, program code, memory maps and registers. The 'deputy' consists of a description of the resources to which the process is attached, and a kernel stack for the execution of system code on behalf of the process. The interaction of the 'deputy' and the 'remote' is implemented at the link layer as shown in figure 1, which also shows two processes sharing a UHN, one local and one deputy.

Remote processes are not accessible to other processes that run at the same node, and vice versa. They do not belong to any particular user, nor can they be sent signals or otherwise manipulated by any local process. They can only be forced to migrate out by the system administrator. The deputy does not have a memory map of its own. Instead, it shares the main kernel map, similar to a kernel thread. The system calls executed by the remote process are intercepted by the remote site's link layer. If a system call is site independent, it is executed by the 'remote' locally; otherwise, the system call is forwarded to the 'deputy'. The 'deputy' then executes the call and returns the result back to the remote site.

Condor

Condor is a High Throughput Computing environment that can manage very large collections of distributively owned workstations. The environment is based on a novel layered architecture that enables it to provide a powerful and flexible suite of Resource Management services to sequential and parallel applications. The following are the features of Condor:
a. Checkpoint and Migration: Where programs can be linked with Condor libraries, users of Condor may be assured that their jobs will eventually complete, even in the ever-changing environment that Condor utilizes. As a machine running a job submitted to Condor becomes unavailable, the job can be checkpointed. The job may continue after migrating to another machine. Condor's periodic checkpoint feature periodically checkpoints a job, even in lieu of migration, in order to safeguard the accumulated computation time of a job from being lost in the event of a system failure such as the machine being shut down or a crash.

b. Remote System Calls: Despite running jobs on remote machines, the Condor standard universe execution mode preserves the local execution environment via remote system calls. Users do not have to worry about making data files available to remote workstations or even obtaining a login account on remote workstations before Condor executes their programs there. The program behaves under Condor as if it were running as the user that submitted the job on the workstation where it was originally submitted, no matter on which machine it really ends up executing.

c. Jobs can be ordered: The ordering of job execution required by dependencies among jobs in a set is easily handled. The set of jobs is specified using a directed acyclic graph, where each job is a node in the graph. Jobs are submitted to Condor following the dependencies given by the graph.

d. Condor enables Grid Computing: As grid computing becomes a reality, Condor is already there. The technique of glide-in allows jobs submitted to Condor to be executed on grid machines in various locations worldwide. As the details of grid computing evolve, so does Condor's ability, starting with Globus-controlled resources.

e. Sensitive to the Desires of Machine Owners: The owner of a machine has complete priority over the use of the machine. An owner is generally happy to let others compute on the machine while it is idle, but wants it back promptly upon returning. The owner does not want to take special action to regain control. Condor handles this automatically.

f. ClassAds: The ClassAd mechanism in Condor provides an extremely flexible, expressive framework for matchmaking resource requests with resource offers. Users can easily state both job requirements and job desires. For example, a user can require that a job run on a machine with 64 Mbytes of RAM, but state a preference for 128 Mbytes, if available. A workstation owner can state a preference that the workstation run jobs from a specified set of users. The owner can also require that there be no interactive workstation activity detectable at certain hours before Condor could start a job. Job requirements/preferences and resource availability constraints can be described in terms of powerful expressions, resulting in Condor's adaptation to nearly any desired policy.

Condor has some limitations on the jobs that it can transparently checkpoint and migrate, which are the following:

a. Multi-process jobs are not allowed. This includes system calls such as fork(), exec(), and system().

b. Inter-process communication is not allowed. This includes pipes, semaphores, and shared memory.

c. Network communication must be brief. A job may make network connections using system calls such as socket(), but a network connection left open for long periods will delay checkpointing and migration.

d. Sending or receiving the SIGUSR2 or SIGTSTP signals is not allowed. Condor reserves these signals for its own use. Sending or receiving all other signals is allowed.
e. Alarms, timers, and sleeping are not allowed. This includes system calls such as alarm(), getitimer(), and sleep().

f. Multiple kernel-level threads are not allowed. However, multiple user-level threads are allowed.

g. Memory mapped files are not allowed. This includes system calls such as mmap() and munmap().

h. All files must be opened read-only or write-only. A file opened for both reading and writing will cause trouble if a job must be rolled back to an old checkpoint image. For compatibility reasons, a file opened for both reading and writing will result in a warning but not an error.

i. A fair amount of disk space must be available on the submitting machine for storing a job's checkpoint images. A checkpoint image is approximately equal to the virtual memory consumed by a job while it runs. If disk space is short, a special checkpoint server can be designated for storing all the checkpoint images for a pool.

The following are the reasons for selecting Mosix over Condor for our implementation:

a. Condor has too many limitations on the type of process it can migrate, as we have seen above.

b. Condor is not an open source project, and it does not provide any API for migrating processes. The Mosix API and its source code are available free of cost under the GPL for programmers to extend the existing product or to utilize it in upper layers. Mosix also has an option to restrict its resource sharing algorithms, so that users can develop their own resource sharing/scheduling algorithms in order to gain better control over load distribution.

Performance Co-Pilot (PCP)

PCP is a framework and set of services to support system-level performance monitoring and performance management. Performance data may be collected and exported from multiple sources, most notably the hardware platform, the IRIX kernel, layered services and end-user applications. The diagram below shows the architecture of PCP.

Figure 2: PCP architecture

PCP consists of several monitoring and collecting tools. The monitoring tools consume and process performance data using a public interface, the Performance Metrics Application Programming Interface (PMAPI). Below the PMAPI level is the Performance Metric Collector Daemon (PMCD) process, which acts in a coordinating role: accepting requests from clients, routing requests to one or more Performance Metrics Domain Agents (PMDAs), aggregating responses from the PMDAs, and responding to the requesting client. Each performance metric domain (such as IRIX, some database management system, or netstat in our case) has a well-defined name space for referring to the specific performance metrics it knows how to collect. Each PMDA encapsulates domain-specific knowledge and methods about performance metrics that implement the uniform access protocols and functional semantics of PCP. There is one PMDA for the operating system, another for process-specific statistics, one each for common DBMS products, and so on. Connections between the PMDAs and the PMCD are managed by PMDA functions. There can be multiple monitor clients and multiple PMDAs on one host, but there may be only one PMCD process. PCP also allows its functionality to be extended by writing agents to collect performance metrics from uncharted domains, or by programming new analysis or visualization tools using the PMAPI.
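To make the PMAPI flow concrete, the following is a minimal, illustrative C sketch of a monitor client that connects to a PMCD and fetches one metric. The standard kernel.all.load metric is used purely as an example, and error handling is kept to a bare minimum; this is an illustration of the API, not code taken from our implementation.

    #include <stdio.h>
    #include <pcp/pmapi.h>

    int main(void)
    {
        char     *names[] = { "kernel.all.load" };   /* metric of interest */
        pmID      pmids[1];
        pmDesc    desc;
        pmResult *result;
        int       sts, i;

        /* connect to the PMCD on the local host */
        if ((sts = pmNewContext(PM_CONTEXT_HOST, "localhost")) < 0) {
            fprintf(stderr, "pmNewContext: %s\n", pmErrStr(sts));
            return 1;
        }

        /* map the metric name to a PMID and get its descriptor */
        if ((sts = pmLookupName(1, names, pmids)) < 0 ||
            (sts = pmLookupDesc(pmids[0], &desc)) < 0) {
            fprintf(stderr, "lookup: %s\n", pmErrStr(sts));
            return 1;
        }

        /* fetch the current values (one per instance, e.g. 1, 5, 15 min) */
        if ((sts = pmFetch(1, pmids, &result)) < 0) {
            fprintf(stderr, "pmFetch: %s\n", pmErrStr(sts));
            return 1;
        }
        for (i = 0; i < result->vset[0]->numval; i++) {
            pmPrintValue(stdout, result->vset[0]->valfmt, desc.type,
                         &result->vset[0]->vlist[i], 8);
            putchar('\n');
        }
        pmFreeResult(result);
        return 0;
    }

A program of this shape, linked against libpcp, is roughly what the PMClient module described later amounts to, with the process and netstat metrics substituted for kernel.all.load.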

Design and Implementation

The figure below shows how the various modules interact in the final application (on every machine).

[Figure 3 diagram: on each machine, the process and netstat PMDAs feed a PMClient; the Scheduler and Migration Module act on the local processes P1 ... Pn.]

Figure 3: Interaction of Modules

The following are the modules in our implementation:

a. PMDA: PCP provides PMDAs for collecting process metrics such as the number of processes running, user time, system time, memory used and so on. We have used these PMDAs through the PMClient in order to get the process metrics. As PCP does not have any PMDA that gives the network characteristics of a process, we have written a PMDA, based on the common netstat utility, that gives metrics concerning the network characteristics of each process, such as the intensity of communication (in bytes/sec), the address of the machine on which the peer process is running, the source port number and the destination port number (the port number of the process with which it is communicating). The name of this PMDA is 'netstat' and users can access its functionality through that name; for example, pminfo -f netstat gives the metrics of the communicating processes.

Precisely, the metrics of a process presently being considered are the system time and the netstat metrics. A CPU intensive process will have a high value for the ratio of system time to the total time taken. Whenever there is a load imbalance, we consider the process with the maximum CPU intensity for migration. This step corresponds to the 'Load Evaluation Phase' mentioned earlier.

b. PMClient: It is responsible for talking with the PMDAs through the PMAPI. It fetches the metric values, which are used for making the decision about process migration.

c. Scheduler: It acquires all the performance metric values of all processes and then determines whether any scheduling is required, based upon the scheduling algorithm. This corresponds to the 'Task Selection Phase'. (We do not consider the 'Profitability Determination Phase', as the cost of load balancing, such as the time and load taken for running the scheduler and the time taken for the actual migration of a process, is very small compared to the load of a process.) A sketch of the selection step is given below.
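The following minimal C sketch illustrates that selection step: it picks the migration candidate with the highest CPU intensity, defined as the ratio of system time to total elapsed time, as described above. The proc_metrics structure, its fields and the is_migratable flag are assumptions made for the example, standing in for the values the PMClient actually fetches; this is not the literal code of our scheduler.

    #include <stddef.h>

    /* simplified per-process record, filled in from the PMDA metrics */
    struct proc_metrics {
        int    pid;
        double system_time;     /* CPU time consumed (seconds)          */
        double total_time;      /* elapsed time since start (seconds)   */
        double net_bytes_sec;   /* communication intensity, netstat PMDA */
        int    is_migratable;   /* e.g. not locked to its home node     */
    };

    /* CPU intensity = system time / total time taken */
    static double cpu_intensity(const struct proc_metrics *p)
    {
        return (p->total_time > 0.0) ? p->system_time / p->total_time : 0.0;
    }

    /* Load Evaluation / Task Selection: return the index of the most
     * CPU intensive migratable process, or -1 if none qualifies.     */
    int select_victim(const struct proc_metrics *procs, size_t n)
    {
        int    best = -1;
        double best_intensity = 0.0;
        size_t i;

        for (i = 0; i < n; i++) {
            double intensity = cpu_intensity(&procs[i]);
            if (procs[i].is_migratable && intensity > best_intensity) {
                best_intensity = intensity;
                best = (int)i;
            }
        }
        return best;
    }

The thresholds that decide whether to act on the selected candidate belong to the scheduling algorithms given below.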

We have implemented two different scheduling techniques for balancing CPU loads. (All the algorithm implementations were done on a cluster of two machines.)

Algorithm 1:
1: fetch metrics for all processes.
2: fetch the load on the two machines.
3: check whether the loads differ by more than a threshold value.
3a: if the load on the current machine is lower, sleep for a few seconds and jump to step 1.
3b: if the load on the current machine is much higher than the load on the other machine:
3b1: select the process that is causing the maximum load and is also running on the current machine. If the process is migratable, move it to the other machine. Then sleep for a few seconds and go to step 1.
3b2: if that process has already migrated, repeat step 3b1 for the remaining processes.

Algorithm 2 differs from Algorithm 1 in the handling of processes that have already been migrated: Algorithm 1 waits for a migrated process to return by itself (when its execution ends), whereas Algorithm 2 brings the migrated process back if it sees that the load on the machine to which the process migrated has grown higher than that of the current machine by a certain threshold.

Algorithm 2:
1: fetch metrics for all processes.
2: fetch the load on the two machines.
3: check whether the loads differ by more than a threshold value.
3a: if the load on the current machine is much lower:
3a1: check whether any process has migrated to the other machine. If so, bring the process with the highest load back to the current machine. Wait for a few seconds and jump to step 1.
3b: if the load on the current machine is much higher than the load on the other machine:
3b1: select the process that is causing the maximum load (by its system time value) and is also running on the current machine. If the process is migratable, move it to the other machine. Then sleep for a few seconds and go to step 1.
3b2: if that process has already migrated, repeat step 3b1 for the remaining processes.
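The control flow shared by the two algorithms can be sketched as follows. This is an illustrative skeleton only: get_local_load(), get_remote_load(), pick_victim_pid(), pull_back_heaviest_migrated() and request_migration() are assumed placeholder names for the PMClient queries and for the Migration Module (which in our implementation performs the actual move through the Mosix API), and LOAD_THRESHOLD stands for the user-specified threshold.

    #include <unistd.h>   /* sleep() */

    #define LOAD_THRESHOLD 0.25   /* user-specified imbalance threshold */
    #define SLEEP_SECS     5      /* pause between scheduling decisions */

    /* Placeholders for the PMClient and Migration Module interfaces;
     * these names are assumptions made for the sketch.               */
    extern double get_local_load(void);                 /* via PMAPI   */
    extern double get_remote_load(void);                /* via PMAPI   */
    extern int    pick_victim_pid(void);                /* e.g. backed by select_victim() above */
    extern int    request_migration(int pid, int node); /* Migration Module */
    extern int    pull_back_heaviest_migrated(void);    /* Algorithm 2 only */

    void scheduler_loop(int remote_node, int use_algorithm2)
    {
        for (;;) {
            double local  = get_local_load();   /* steps 1-2: fetch metrics and loads */
            double remote = get_remote_load();

            if (local - remote > LOAD_THRESHOLD) {          /* step 3b  */
                int pid = pick_victim_pid();                /* step 3b1 */
                if (pid > 0)
                    request_migration(pid, remote_node);
            } else if (use_algorithm2 &&
                       remote - local > LOAD_THRESHOLD) {   /* step 3a  */
                pull_back_heaviest_migrated();              /* step 3a1 */
            }
            sleep(SLEEP_SECS);                              /* then start over */
        }
    }

Algorithm 2 corresponds to running this loop with use_algorithm2 set; everything else, including the actual migration, is delegated to the Migration Module.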

These algorithms correspond to the 'Task Selection and Migration Phase'. The 'Migration Module' handles the actual migration by using the Mosix API. Both algorithms give better performance than static process assignment without any scheduling, as they try to balance the load on both machines. Algorithm 2 is more efficient than Algorithm 1, as the latter does not migrate processes back home even when the load on the other machine becomes high. The graphs later in the paper show the improvement in the completion times of processes scheduled with the algorithms over the ones run without them.

The advantage of the above scheduling algorithms over Mosix's resource sharing algorithm is that they give the user control of load balancing: the user can specify the load threshold on each machine, allowing the scheduler to make better decisions. The above methodologies have certain shortcomings, however. 1) They are decentralized, so every machine runs the scheduling algorithms, and it might happen that all highly loaded machines try to migrate their processes onto the same lightly loaded machine. This causes the lightly loaded machine (say 'X') to become heavily loaded within a short span of time. Now 'X' tries to migrate its own processes, or all the other machines which migrated their processes might take them back. This series of actions leads to frequent increases and decreases of load and does not help load balancing in any way. 2) Every machine might also try to migrate a communicating process to the machine with which it is communicating. This might lead to processes merely swapping machines, which does not reduce the communication intensity. Later we propose a solution for all of the above issues, along with providing the flexibility of scheduling control to the user, by means of a central scheduler.

Results

The diagrams below show the performance of the machines with sample test programs which are either CPU bound or I/O bound. The CPU bound processes benefited from our scheduling algorithms, which is evident from their faster execution, as seen in the graphs. Figures 4 and 5 show the performance of algorithm1 and algorithm2 respectively. The results show that algorithm2 is more efficient than algorithm1, as expected. The I/O intensive processes do not benefit much from the schedulers, as the system calls for files are always redirected towards the 'deputy' in the home node (see figure 6).

Figure 4: Performance of scheduling algorithm1 for CPU bound jobs. Graph of program size (X axis) vs. execution time (Y axis); series: with scheduling algorithm1, without scheduling algorithm.

Figure 5: Performance of scheduling algorithm2 for CPU bound jobs. Graph of program size (X axis) vs. execution time (Y axis); series: with scheduling algorithm2, without scheduling algorithm.

Figure 6: Performance of the scheduling algorithm for I/O bound jobs. Graph of program size (X axis) vs. execution time (Y axis); series: with scheduler algorithm2, without scheduler algorithm.

Centralized Scheduler

This is our next step in scheduling; we present the planned design and implementation method here.

[Figure 7 diagram: on each machine, the process and netstat PMDAs feed a PMClient, and a Local Scheduler with its Migration Module manages the local processes P1 ... Pn; the local schedulers on all machines report to a single Centralized Scheduler.]

Figure 7: Interaction of Modules with Central Scheduler

In this model, all the scheduling decisions are made by the centralized scheduler. An RPC communication mechanism is the best way to pass information from all the machines to the centralized scheduler. All individual machines would behave as RPC servers and the centralized scheduler behaves as the RPC client, asking for the process information from the individual machines (servers) at regular intervals of time through RPC function calls. The centralized scheduler would schedule the jobs in the following manner:

a. For all CPU bound jobs (with zero or very little network intensity), the centralized scheduler tries to balance the load (subject to the user's constraints, if any) on all machines. This would eliminate the earlier problem of frequently changing loads.

b. For all jobs with considerable network intensity, the centralized scheduler tries to create a graph/table which determines the communicating pairs (or sets) of processes. The scheduler then tries to reduce the network intensity by placing the communicating processes on the same machine. The central scheduler also tries to balance the CPU load caused by the network intensive processes. The phases of getting the metrics, constructing the graph of communicating processes, and getting the loads of the various machines are separated from the actual scheduling mechanism. This will help in changing the scheduling mechanism based on the user's requirements as and when needed. The centralized scheduler will then send a message to the individual machines (servers) to actually migrate the processes.

However, there are certain issues to be considered in the centralized scheduler implementation. The amount of information passed from the individual machines to the central scheduler might be enormous. For example, suppose a machine passes the following structure on every request for metrics (the value in parentheses indicates the size of each field in bytes):

    struct info {
        char  pid[2];          // (2) pid
        float CPU_Load;        // (4) the CPU load of this process
        int   sport;           // (2) source port
        int   dport;           // (2) destination port
        char  daddr[8];        // (8) destination IP address
        float net_intensity;   // (4) network intensity
    };

The total size of the structure is 22 bytes. The values of sport, dport and net_intensity will be zero for non-communicating processes. If a machine is loaded with many processes, then sending the above information for all processes frequently (say every 5-10 seconds) will be an overhead on the network, in addition to the actual communication of the processes themselves. We can reduce this overhead to a large extent by making the local schedulers more intelligent. All processes with only CPU load can be handled for migration by the local schedulers themselves. The local scheduler then passes on only the information about the processes which have some communication. The total amount of information passed would be much less than before, provided the network intensive processes are a small subset of the total processes running in the system. This scheme makes the whole process efficient by dividing the scheduling overhead among the different machines, and it also reduces the network load when scheduling is done frequently. If scheduling is done only occasionally (say, once every 15 minutes), then making the central scheduler handle all the work (scheduling both CPU intensive and network intensive processes) will be the more efficient way.
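As a rough illustration (the process and machine counts here are assumptions, not measurements): if each machine runs about 100 processes and reports the 22-byte record for every process once every 5 seconds, each machine sends roughly 100 x 22 = 2200 bytes per report, i.e. about 440 bytes/sec, to the central scheduler, and a cluster of 50 such machines would impose around 22 KB/sec of reporting traffic on the scheduler's link before any actual process communication is counted. If only, say, 10% of the processes are communicating and the local schedulers filter out the rest, this reporting traffic drops by roughly a factor of ten.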
Conclusions and Future Work

In the work done so far, we have prepared the infrastructure for load balancing on a Mosix cluster. We have tested it with our (local) schedulers, based on the simple load balancing algorithms discussed in the earlier sections. As mentioned earlier, the local schedulers have their own disadvantages, such as frequent variation of loads and frequent swapping of communicating processes. We plan to overcome these disadvantages by implementing the central scheduler. Also, extensive scripts will facilitate large scale testing with varying parameters for more complex scheduling algorithms.

References

[1] http://www.mosix.org/
[2] http://oss.sgi.com/projects/pcp
[3] http://www.cs.wisc.edu/condor/manual
[4] Watts, Jerrell and Taylor, Stephen, "A Practical Approach to Dynamic Load Balancing," IEEE Transactions on Parallel and Distributed Systems, Vol. 9, No. 3, March 1998.
[5] Amnon Barak, Oren La'adan and Amnon Shiloh, "Scalable Cluster Computing with MOSIX for LINUX."
[6] Amnon Barak, Avner Braveman, Ilia Gilderman and Oren Laden, "Performance of PVM with the MOSIX Preemptive Process Migration Scheme."
[7] Steve McClure and Richard Wheeler, "Mosix: How Linux Clusters Solve Real World Problems."