A Case for High Performance Computing with Virtual Machines

Wei Huang†  Jiuxing Liu‡  Bulent Abali‡  Dhabaleswar K. Panda†

† Computer Science and Engineering, The Ohio State University, Columbus, OH 43210
‡ IBM T. J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY 10532

{huanwei, [email protected]  {jl, [email protected]

ABSTRACT

Virtual machine (VM) technologies are experiencing a resurgence in both industry and research communities. VMs offer many desirable features such as security, ease of management, OS customization, performance isolation, checkpointing, and migration, which can be very beneficial to the performance and the manageability of high performance computing (HPC) applications. However, very few HPC applications are currently running in a virtualized environment due to the performance overhead of virtualization. Further, using VMs for HPC also introduces additional challenges such as management and distribution of OS images.

In this paper we present a case for HPC with virtual machines by introducing a framework which addresses the performance and management overhead associated with VM-based computing. Two key ideas in our design are: Virtual Machine Monitor (VMM) bypass I/O and scalable VM image management. VMM-bypass I/O achieves high communication performance for VMs by exploiting the OS-bypass feature of modern high speed interconnects such as InfiniBand. Scalable VM image management significantly reduces the overhead of distributing and managing VMs in large scale clusters. Our current implementation is based on the Xen VM environment and InfiniBand. However, many of our ideas are readily applicable to other VM environments and high speed interconnects.

We carry out detailed analysis on the performance and management overhead of our VM-based HPC framework. Our evaluation shows that HPC applications can achieve almost the same performance as those running in a native, non-virtualized environment. Therefore, our approach holds promise to bring the benefits of VMs to HPC applications with very little degradation in performance.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ICS'06 June 28-30, Cairns, Queensland, Australia.
Copyright 2006 ACM 1-59593-282-8/06/0006 ...$5.00.

1. INTRODUCTION

Virtual machine (VM) technologies were first introduced in the 1960s [9], but are experiencing a resurgence in both industry and research communities. A VM environment provides virtualized hardware interfaces to VMs through a Virtual Machine Monitor (VMM) (also called hypervisor). VM technologies allow running different guest VMs in a physical box, with each guest VM possibly running a different guest operating system. They can also provide secure and portable environments to meet the demanding requirements of computing resources in modern computing systems.

Recently, network interconnects such as InfiniBand [16], Myrinet [24] and Quadrics [31] have emerged, providing very low latency (less than 5 µs) and very high bandwidth (multiple Gbps). Due to those characteristics, they are becoming strong players in the field of high performance computing (HPC). As evidenced by the Top 500 Supercomputer list [35], clusters, which are typically built from commodity PCs connected through high speed interconnects, have become the predominant architecture for HPC over the past decade.

Although originally more focused on resource sharing, current virtual machine technologies provide a wide range of benefits such as ease of management, system security, performance isolation, checkpoint/restart and live migration. Cluster-based HPC can take advantage of these desirable features of virtual machines, which is especially important when ultra-scale clusters are posing additional challenges on performance, scalability, system management, and administration of these systems [29, 17].

In spite of these advantages, VM technologies have not yet been widely adopted in the HPC area. This is due to the following challenges:

• Virtualization overhead: To ensure system integrity, the virtual machine monitor (VMM) has to trap and process privileged operations from the guest VMs. This overhead is especially visible for I/O virtualization, where the VMM or a privileged host OS has to intervene in every I/O operation. This added overhead is not favored by HPC applications where communication performance may be critical. Moreover, memory consumption for a VM-based environment is also a concern because a physical box is usually hosting several guest virtual machines.

• Management Efficiency: Though it is possible to utilize VMs as static computing environments and run applications in pre-configured systems, high performance computing cannot fully benefit from VM technologies unless there exists a management framework which helps to map VMs to physical machines, dynamically distribute VM OS images to physical machines, and boot up and shut down VMs with low overhead in cluster environments.

In this paper, we take on these challenges and propose a VM-based framework for HPC which addresses various performance and management issues associated with virtualization. In this framework, we reduce the overhead of network I/O virtualization through VMM-bypass I/O [18]. The concept of VMM-bypass extends the idea of OS-bypass [39, 38], which takes the shortest path for time critical operations through user-level communication. An example of VMM-bypass I/O was presented in Xen-IB [18], a prototype we developed to virtualize InfiniBand under Xen [6]. Bypassing the VMM for time critical communication operations, Xen-IB provides virtualized InfiniBand devices for Xen virtual machines with near-native performance. Our framework also provides the flexibility of using customized kernels/OSes for individual HPC applications. It also allows building very small VM images which can be managed very efficiently. With detailed performance evaluations, we demonstrate that high performance computing jobs can run as efficiently in our Xen-based cluster as in a native, non-virtualized InfiniBand cluster. Although we focus on InfiniBand and Xen, we believe that our framework can be readily extended for other high-speed interconnects and other VMMs. To the best of our knowledge, this is the first study to adopt VM technologies for HPC in modern cluster environments equipped with high speed interconnects.

In summary, the main contributions of our work are:

• We propose a framework which allows high performance computing applications to benefit from the desirable features of virtual machines. To demonstrate the framework, we have developed a prototype system using Xen virtual machines on an InfiniBand cluster.

• We describe how the disadvantages of virtual machines, such as virtualization overhead, memory consumption, management issues, etc., can be addressed using current technologies with our framework.

• We carry out detailed performance evaluations on the overhead of using virtual machines for high performance computing (HPC). This evaluation shows that our virtualized InfiniBand cluster is able to deliver almost the same performance for HPC applications as those in a non-virtualized InfiniBand cluster.

The rest of the paper is organized as follows: To further justify our motivation, we start with more discussion on the benefits of VM for HPC in Section 2. In Section 3, we pro-

2. BENEFITS OF VIRTUAL MACHINES FOR HPC

With the deployment of large scale clusters for HPC applications, management and scalability issues on these clusters are becoming increasingly important. Virtual machines can greatly benefit cluster computing in such systems, especially from the following aspects:

• Ease of management: A system administrator can view a virtual machine based cluster as consisting of a set of virtual machine templates and a pool of physical resources [33]. To create the runtime environment for certain applications, the administrator need only pick the correct template and instantiate the virtual machines on physical nodes. VMs can be shut down and brought up much more easily than real physical machines, which eases the task of system reconfiguration. VMs also provide clean solutions for live migration and checkpoint/restart, which are helpful in dealing with hardware problems such as hardware upgrades and failures, which happen frequently in large-scale systems.

• Customized OS: Currently most clusters are utilizing general purpose operating systems, such as Linux, to meet a wide range of requirements from various user applications. Although researchers have been suggesting that light-weight OSes customized for each type of application can potentially gain performance benefits [10], this has not yet been widely adopted because of management difficulties. However, with VMs, it is possible to highly customize the OS and the run-time environment for each application. For example, kernel configuration, system software parameters, loaded modules, as well as memory and disk space can be changed conveniently.

• System Security: Some applications, such as system level profiling [27], may require additional kernel services to run. In a traditional HPC system, this requires either the service to run for all applications, or users to be trusted with privileges for loading additional kernel modules, which may lead to compromised system integrity. With VM environments, it is possible to allow normal users to execute such privileged operations on VMs. Because the resource integrity for the physical machine is controlled by the virtual machine monitor instead of the guest operating system, faulty or malicious code in a guest OS may in the worst case crash a virtual machine, which can be easily recovered.

3. BACKGROUND

In this section, we provide the background information for our work. Our implementation of VM-based computing is based on the Xen VM environment and InfiniBand.
