Network RAM Based Process Migration for HPC Clusters
Total Page:16
File Type:pdf, Size:1020Kb
Journal of Information Systems and Telecommunication, Vol. 1, No. 1, Jan – March 2013 39 Network RAM Based Process Migration for HPC Clusters Hamid Sharifian* Department of Computer Engineering, Iran University of Science and Technology, Tehran, Iran [email protected] Mohsen Sharifi Department of Computer Engineering, Iran University of Science and Technology, Tehran, Iran [email protected] Received: 04/Dec/2012 Accepted: 09/Feb/2013 Abstract Process migration is critical to dynamic balancing of workloads on cluster nodes in any high performance computing cluster to achieve high overall throughput and performance. Most existing process migration mechanisms are however unsuccessful in achieving this goal proper because they either allow once-only migration of processes or have complex implementations of address space transfer that degrade process migration performance. We propose a new process migration mechanism for HPC clusters that allows multiple migrations of each process by using the network RAM feature of clusters to transfer the address spaces of processes upon their multiple migrations. We show experimentally that the superiority of our proposed mechanism in attaining higher performance compared to existing comparable mechanisms is due to effective management of residual data dependencies. Keywords: High Performance Computing (HPC) Clusters, Process Migration, Network RAM, Load Balancing, Address Space Transfer. 1. Introduction and data access locality in addition to an enhanced degree of dynamic load distribution [1]. A standard approach to reducing the runtime Upon migration of a process, the process of any high performance scientific computing must be suspended and its context information in application on a high performance computing the source node extracted and transferred to the (HPC) cluster is to partition the application into destination node. The process can only then several portions that can be run in parallel by resume executing from the point it was multiple cluster nodes simultaneously. suspended. Two critical challenges of process HPC clusters generally consist of three main migration are the transfer of the process address parts: a collection of off-the-shelf (COTS) space from the source node to the destination computing and storage nodes, a network node, and access to the opened files in the connecting the nodes, and a cluster manager destination node after process migration [2]. system software that manages all nodes and In this paper we only focus on resolving the presents a single system image to applications first challenge of process migration by while exploiting the parallel processing power of introducing a new process migration mechanism multiple nodes. by using the network RAM feature of HPC The cluster manager system software clusters, wherein the aggregate main memory of provides a set of global services that aim at all cluster nodes in a cluster represents the making resource distribution transparent to all network RAM of that cluster [3]. In addition to applications, managing resource sharing between achieve comparably higher performance than applications, deploying as much cluster resources existing process migration mechanisms, our as possible for demanding applications, and proposed mechanism is intended to allow for scheduling parallel processes on all cluster nodes. multiple-migration of each process that is a The services include global resource missing feature in existing process migration management, distributed scheduling, load mechanisms that is accounted as a source of sharing, process migration, and network RAM. inefficiency. Dynamic load sharing can be achieved by The rest of paper is organized as follows. moving processes from heavily-loaded nodes to Sections 2 and 3 introduce process migration and lightly-loaded nodes at runtime. This can lead to network RAM. Section 4 reviews related works fault resilience, ease of system administration, on process migration with respect to transfer of * Corresponding Author 40 Sharifian & Sharifi, Network RAM Based Process Migration 48 for HPC Clusters process address space. Sections 5 and 6 present The context of a process to be migrated our network RAM based mechanism and its includes the process’s running state; stack evaluation, and Section 7 concludes the paper. contents; processor registers; address space; heap data; and process’s communication state (like open files or message channels). The whole 2. Process Migration context must be transferred to the destination node before the process can continue its Process migration is the act of transferring an execution on the destination node. The process active process between two computers and address space is the largest part of the process restoring its execution in a destination node from context that might have hundreds of megabyte of the point it left off in the source node. The goals data [5] taking longest to be transferred to the of process migration are closely tied to destination node. This can adversely affect the applications that use migration. The primary performance of process migration. Therefore, the goals include resource locality, resource sharing, performance of any process migration dynamic load balancing, fault resilience, and mechanism largely depends on how long it takes ease of system administration [1]. Any process to transfer the context of migrant processes to migration mechanism can thus be benchmarked destination nodes. and evaluated with respect to the degree it Various data transfer techniques have been satisfies these goals. presented in the literature that try to reduce the Considering an HPC cluster, process high cost of address space transfer [6]. A well- migration has three main phases [4] (note that known technique is to transfer only parts of these phases are applicable to process migration process address space to allow resumption of in all types of networks of computers in general processes on destination nodes without waiting including HPC clusters that are the main focus of for the transfer of the whole process address this paper): space and context. Though very attractive on 1. Detaching Phase that involves the grounds of improving the performance of process suspension and the extraction of the migration, this technique leaves parts (pages) of context of a migrant process in its current process address space on different nodes when node. These activities must be done in a multiple migrations in the lifetime of process is way that none of the other processes allowed and occurs. In other words, the process running on the current node or in other address space is scattered on multiple nodes nodes of the cluster experience any resulting in residual dependencies. Management execution inconsistencies. At the start of of residual dependencies increases the this phase, the execution of the migrant implementation complexity of process migration process is frozen. that in turn results in performance degradation of 2. Transfer Phase that involves the transfer process migration. of the extracted context of the migrant Our proposed mechanism is particularly process to the destination node. useful to strong migrations wherein the entire 3. Attaching Phase that involves the process state (rather than code and some reconstruction of the migrant process on initialization data in weak code migration) must the destination node. The reconstruction in be transferred to destination. turn involves the allocation of resources on the target node to the migrant process, informing the beneficiaries and/or brokers 3. Network RAM of the migrant process about the current executing place of the migrant process, Large-memory high-performance applications and resuming the execution of the migrant such as scientific computing, weather prediction process on the destination node from the simulations, data warehousing and graphic point it left off on the source node. rendering applications need massive fast The time interval between freezing a migrant accessible address spaces [7] that are not provided process on a source node and resuming its by even high capacity DRAMs. The runtime execution on a destination node is called the performance of these applications degrades freeze time representing the status wherein the quickly when system encounters memory shortage migrant process is neither executing on the and starts swapping memory to local disks. source nor executing on the destination node. In today’s clusters with very low-latency The longer the freeze time the lower will be the networks, the idle memory of other nodes can be performance of the process migration. used as storage media faster than local disks, called network RAM. The goal of network RAM Journal of Information Systems and Telecommunication, Vol. 1, No. 1, Jan – March 2013 41 is to improve the performance of memory 4.3 DSM-based Techniques intensive workloads by paging to idle memory over the network rather than to disk [8]. Some HPC clusters have used the distributed Some common uses of network RAM are shared memory (DSM) mechanism to transfer remote memory paging [9], network memory file process address spaces between nodes upon systems, and swap block devices [10,11]. process migrations. Pages stored on DSM need Locating unused memory in every node requires not be transferred at all during process migration. that network RAM keeps up-to-date information Only pages accessed by the migrant after about all unused memories. migration are provided to the migrant process Since network RAM stores