A Comparison of Virtualization Technologies for HPC∗
Total Page:16
File Type:pdf, Size:1020Kb
22nd International Conference on Advanced Information Networking and Applications A Comparison of Virtualization Technologies for HPC∗ J. P. Walters, Vipin Chaudhary, and Minsuk Cha Salvatore Guercio Jr. and Steve Gallo University at Buffalo, SUNY The Center for Computational Research Computer Science and Engineering University at Buffalo, SUNY {waltersj, vipin, mcha}@buffalo.edu {sguercio, smgallo}@ccr.buffalo.edu Abstract to provide “sandbox-like” environments with little perfor- mance reduction from the user’s perspective. Virtualization is a common strategy for improving the In light of the benefits to virtualization, we believe that utilization of existing computing resources, particularly virtual machine/virtual server-based computational clusters within data centers. However, its use for high performance will soon gain widespread adoption in the high performance computing (HPC) applications is currently limited despite computing (HPC) community. However, to date, a compre- its potential for both improving resource utilization as well hensive examination of the various virtualization strategies as providing resource guarantees to its users. This paper and implementations has not been conducted, particularly systematically evaluates various VMs for computationally with an eye towards its use in HPC environments. We fill intensive HPC applications using various standard bench- this literature gap by examining multiple aspects of cluster marks. Using VMWare Server, Xen, and OpenVZ we ex- performance with standard HPC benchmarks. In so doing amine the suitability of full virtualization, paravirtualiza- we make the following two contributions: tion, and operating system-level virtualization in terms of network utilization, SMP performance, file system perfor- 1. Single Server Evaluation: In Section 3 we evaluate mance, and MPI scalability. We show that the operating several virtualization solutions for single node perfor- system-level virtualization provided by OpenVZ provides mance and scalability. Rather than repeat existing re- the best overall performance, particularly for MPI scala- search, we focus our tests on industry-standard sci- bility. entific benchmarks including SMP (symmetric mul- tiprocessor) tests through the use of OpenMP (open multiprocessing) implementations of the NAS Paral- 1 Introduction lel Benchmarks (NPB) [2]. We examine file system and network performance (using IOZone [3] and Net- perf [4]) in the absence of MPI (message passing in- The use of virtualization in computing is a well- terface) benchmarks in order to gain insight into the established idea dating back more than 30 years [1]. Tradi- potential performancebottlenecks that may impact dis- tionally, its use has meant accepting a sizable performance tributed computations. reduction in exchangefor the convenienceof the virtual ma- chine. Now, however, the performance penalties have been 2. Cluster Evaluation: We extend our evaluation to the reduced. Faster processors as well as more efficient virtual- cluster-level and benchmark the virtualization solu- ization solutions now allow even modest desktop computers tions using the MPI implementation of NPB. Draw- to host virtual machines. ing on the results from our single node tests, we con- Soon large computational clusters will be leveraging the sider the effectiveness of each virtualization strategy benefits of virtualization in order to enhance the utility of by examining the overall performance demonstrated the cluster as well as to ease the burden of administering through industry-standard scientific benchmarks. such large numbers of machines. Indeed, virtual machines allow administrators to more accurately control their re- sources while simultaneously protecting the host node from 2 Existing Virtualization Technologies malfunctioning user-software. This allows administrators ∗ To accurately characterize the performance of different This research was supported in part by NSF IGERT grant 9987598, the Institute for Scientific Computing at Wayne State University, virtualization technologies we begin with an overview of MEDC/Michigan Life Science Corridor, and NYSTAR. the major virtualization strategies that are in common use 1550-445X/08 $25.00 © 2008 IEEE 861 DOI 10.1109/AINA.2008.45 Authorized licensed use limited to: SUNY Buffalo. Downloaded on November 5, 2008 at 12:12 from IEEE Xplore. Restrictions apply. for production computing environments. In general, most itself to aid in the virtualization effort. It allows multi- virtualization strategies fall into one of four major cate- ple unmodified operating systems to run alongside one gories: another, provided that all operating systems are capa- ble of running on the host processor directly. That 1. Full Virtualization: Also sometimes called hardware is, native virtualization does not emulate a processor. emulation. In this case an unmodified operating sys- This is unlike the full virtualization technique where tem is run using a hypervisor to trap and safely trans- it is possible to run an operating system on a fictional late/execute privileged instructions on-the-fly. Be- processor, though typically with poor performance. In cause trapping the privileged instructions can lead to x86 64 series processors, both Intel and AMD sup- significant performance penalties, novel strategies are port virtualization through the Intel-VT and AMD-V used to aggregate multiple instructions and translate virtualization extensions. x86 64 Processors with vir- them together. Other enhancements, such as binary tualization support are relatively recent, but are fast- translation, can further improve performance by reduc- becoming widespread. ing the need to translate these instructions in the fu- ture [5, 6]. For the remainder of this paper we use the word “guest” to refer to the virtualized operating system utilized within 2. Paravirtualization: Like full virtualization, paravir- any of the above virtualization strategies. Therefore a guest tualization also uses a hypervisor, and also uses the can refer to a VPS (OS-level virtualization), or a VM (full term virtual machine to refer to its virtualized operat- virtualization, paravirtualization). ing systems. However, unlike full virtualization, par- In order to evaluate the viability of the different virtual- avirtualization requires changes to the virtualized op- ization technologies, we compare VMWare Server version erating system. This allows the VM to coordinate with 1.0.21, Xen version 3.0.4.1, and OpenVZ based on kernel the hypervisor, reducing the use of the privileged in- version 2.6.16. These choices allow us to compare full vir- structions that are typically responsible for the major tualization, paravirtualization, and OS-level virtualization performance penalties in full virtualization. The ad- for their use in HPC scenarios. We do not include a com- vantage is that paravirtualized virtual machines typ- parison of native virtualization in our evaluation as the ex- ically outperform fully virtualized virtual machines. isting literature has already shown native virtualization to be The disadvantage, however, is the need to modify the comparable to VMWare’s freely available VMWare Player paravirtualized virtual machine/operating system to be in software mode [7]. hypervisor-aware. This has implications for operating systems without available source code. 2.1 Overview of Test Virtualization Im- 3. Operating System-level Virtualization: The most plementations intrusive form of virtualization is operating system- level virtualization. Unlike both paravirtualization and Before our evaluation we first provide a brief overview of full virtualization, operating system-level virtualiza- the three virtualization technologies that we will be testing: tion does not rely on a hypervisor. Instead, the op- VMWare Server [8], Xen [9], and OpenVZ [10]. erating system is modified to securely isolate multi- VMWare is currently the market leader in virtualization ple instances of an operating system within a single technology. We chose to evaluate the free VMWare Server host machine. The guest operating system instances product, which includes support for both full virtualization are often referred to as virtual private servers (VPS). and native virtualization, as well as limited (2 CPU) vir- The advantage to operating system-level virtualization tual SMP support. Unlike VMWare ESX Server, VMWare lies mainly in performance. No hypervisor/instruction Server (formerly GSX Server) operates on top of either the trapping is necessary. This typically results in system Linux or Windows operating systems. The advantage to this performanceof near-native speeds. The primary disad- approach is a user’s ability to use additional hardware that is vantage is that all VPS instances share a single kernel. supported by either Linux or Windows, but is not supported Thus, if the kernel crashes or is compromised, all VPS by the bare-metal ESX Server operating system (SATA hard instances are compromised. However, the advantage to disk support is notably missing from ESX Server). The dis- having a single kernel instance is that fewer resources advantage is the greater overhead from the base operating are consumed due to the operating system overhead of system, and consequently the potential for less efficient re- multiple kernels. source utilization. 4. Native Virtualization: Native virtualization leverages 1We had hoped to test VMWare ESX Server, but hardware incompati- hardware support for virtualization within a processor bilities prevented us from