Evaluation of Type-1 Hypervisors on Desktop-Class Virtualization Hosts
IADIS International Journal on Computer Science and Information Systems
Vol. 12, No. 2, pp. 86-101
ISSN: 1646-3692

EVALUATION OF TYPE-1 HYPERVISORS ON DESKTOP-CLASS VIRTUALIZATION HOSTS

Duarte Pousa (1) and José Rufino (2)
(1) Polytechnic Institute of Bragança, 5300-253 Bragança, Portugal
(2) Polytechnic Institute of Bragança, 5300-253 Bragança, Portugal. Laboratory of Instrumentation and Experimental Particle Physics, University of Minho, 4710-057 Braga, Portugal

ABSTRACT

System Virtualization has become a fundamental IT tool, whether it is type-2/hosted virtualization, mostly exploited by end-users in their personal computers, or type-1/bare metal, well established in IT departments and thoroughly used in modern datacenters as the very foundation of cloud computing. Though bare metal virtualization is meant to be deployed on server-grade hardware (for performance, stability and reliability reasons), properly configured desktop-class systems or workstations are often used as virtualization servers, due to their attractive performance/cost ratio. This paper presents the results of a study conducted on commodity virtualization servers, aiming to assess the performance of a representative set of the type-1 platforms most in use today: VMware ESXi, Citrix XenServer, Microsoft Hyper-V, oVirt and Proxmox. Hypervisor performance is indirectly measured through synthetic benchmarks performed on Windows 10 LTSB and Linux Ubuntu Server 16.04 guests: PassMark for Windows, UnixBench for Linux, and the cross-platform Flexible I/O Tester and iPerf3 benchmarks. The evaluation results may be used to guide the choice of the best type-1 platform (performance-wise), depending on the predominant guest OS, the performance patterns (CPU-bound, IO-bound, or balanced) of that OS, its storage type (local/remote) and the required network-level performance.

KEYWORDS

Bare Metal Virtualization, Synthetic Benchmarking, Performance Assessment, Commodity Hardware

1. INTRODUCTION

System Virtualization allows a physical host to run (simultaneously) many guest operating systems in self-contained, software-defined virtual machines, sharing the host hardware under the control of a hypervisor. It comes in two basic forms: a) in hosted (or type-2) virtualization, an operating system (typically desktop-oriented) hosts virtual machines alongside other applications; this is achieved by type-2 platforms, like VMware Workstation (Vmware-a, 2016) or Oracle VM VirtualBox (VirtualBox, 2016); b) in bare metal (or type-1) virtualization, the platform hypervisor has direct hardware access, ensuring a more efficient use of the underlying resources. A common approach to type-1 virtualization is to install a thin hypervisor on the host, along with a management console; this console may be minimal, as in VMware ESXi (Vmware-b, 2016), or a management virtual machine, as in XenServer (XenServer, 2016) or Hyper-V (Hyper-V, 2016). An alternative approach is to have a server-class operating system with kernel modules that convert the operating system into a type-1 platform, as in KVM-based (KVM, 2017) platforms such as oVirt (oVirt, 2017) and Proxmox (Proxmox, 2017).

Bare metal virtualization is commonly deployed on server-grade hardware, often vendor-certified for specific type-1 virtualization platforms. This ensures adequate performance, stability and reliability, along with advanced management features that are essential for mission-critical applications. However, it is usually possible to run most modern type-1 platforms on commodity hardware, and the compromises entailed by such an option are acceptable in many scenarios, often those constrained by tight IT budgets (home labs, public schools, SMEs, etc.).
This paper presents the results of benchmarks conducted on Windows 10 LTSB and Linux Ubuntu Server 16.04 virtual machines, hosted on well-known bare metal platforms (VMware ESXi, Citrix XenServer, oVirt, Proxmox and Microsoft Hyper-V) running on desktop-class hardware with reasonable (though not high-end) configurations. The benchmark applications used in this study included OS-specific benchmarks, namely PassMark for Windows and UnixBench for Linux, as well as cross-platform I/O-centric benchmarks, specifically the Flexible I/O Tester (FIO) and iPerf3. FIO was run within virtual machines deployed both on local and remote storage, and several remote storage options were evaluated (NFS and iSCSI). iPerf3 was run both between virtual machines on the same host and on different hosts (linked by 1 Gbps Ethernet). In order to provide a comparison baseline, all benchmarks were also conducted on native instances of the operating systems selected for this study, installed on the same host hardware.

The data collected during this study will help answer performance-centric questions like: what is (are) the most suitable type-1 platform(s) to host modern Windows and Linux guests on commodity servers for different usage patterns (CPU-bound, IO-bound, mixed)? What is the impact of different guest storage options? How is network performance affected? What is the overhead introduced by each type-1 hypervisor, compared to a native scenario?

The rest of the paper is organized as follows. Section 2 introduces the testbed: type-1 platforms; guest operating systems; benchmarks; hosts, guests and remote storage configuration; evaluation methodology. Section 3 presents the benchmarking results. Section 4 revises related work on virtualization benchmarking. Section 5 concludes and sets future work.

2. TESTBED

2.1 Type-1 Virtualization Platforms

The bare metal virtualization platforms selected for this study include proprietary, closed source platforms – VMware ESXi 6.0 (Free License) and Hyper-V (Windows Server 2016 Datacenter Edition) – as well as open source alternatives – Citrix XenServer 7.0, oVirt 4.1 and Proxmox 4.4. The specific versions used were the latest ones available when this study began (fall 2016).

Bearing in mind the distinction between a virtualization platform and a hypervisor, each platform builds on the following hypervisors, respectively: ESXi, Hyper-V, Xen and QEMU/KVM (for both oVirt and Proxmox). These hypervisors rely on different virtualization techniques (Hwang, J. et al, 2013): i) Para-Virtualization (Hyper-V, Xen, KVM); ii) Hardware-Assisted Virtualization, based on ISA extensions like Intel's VT-x and AMD-V (all hypervisors); iii) Full Virtualization (ESXi, Xen, KVM). In our study, the type-1 hypervisors automatically choose the virtualization technique used to run each guest. Also, from this point onwards, the concepts of type-1 platform and type-1 hypervisor will be used interchangeably.

2.2 Guest Operating Systems

Benchmarks were run on a small, yet representative, set of guest operating systems (OSs), each playing a specific role in this study: Microsoft Windows 10 LTSB and Ubuntu Server 16.04 LTS.

The first OS represents the modern line of Windows desktop-centric operating systems. The Windows 10 LTSB (Long Term Servicing Branch) variant is a long-term support version of Windows 10 Enterprise. It is designed for stability, meaning it will seldom (if ever) be updated with new features, but will receive security updates and bug fixes for 10 years after its release.

The second OS is a "proxy" for the many Linux-based distributions used in infrastructure servers. Canonical's Ubuntu, specifically, has been consistently present at the top of usage rankings.
Also, Ubuntu's relevance in the cloud space has been steadily growing, making it a major player both at the infrastructure layer (IaaS) and as an instance (virtual machine). As with the LTSB Windows variant, a Long Term Support (LTS) Ubuntu version was selected.

2.3 Benchmark Applications

One of the original experimental conditions devised for this work was to use only multi-platform, free (if possible, open-source) and broad (not focused on a single system component) synthetic benchmarks. This would allow different guest operating systems to be compared on the same type-1 virtual hardware. However, considering the choice of guest operating systems, no such benchmarks were found, and we resorted to PassMark Performance Test v9 (PassMark-a, 2016) for Windows, and UnixBench v5.1.3 (UnixBench, 2016) for Linux. These are not multi-platform (and PassMark is not open-source), but all other requirements are fulfilled.

PassMark generates CPU, 2D graphics, 3D graphics, Disk and Memory ratings, which are combined into a global PassMark score (PassMark-b, 2017). UnixBench is a system (OS-level) benchmark for Unix-like systems whose results mostly depend on the OS, libraries and compilers used. It includes tests on general CPU+memory performance (Dhrystone 2), CPU arithmetic performance (Whetstone), File Copying, Process Creation & Substitution, Context Switching, System Call Overhead, Shell Scripting and 2D/3D Graphics (unsuited to Ubuntu Server and thus excluded from our study). An overall system index combines the scores of all tests.

In order to still be able to establish some comparison between different guest operating systems, two multi-platform benchmarks were selected: Flexible I/O Tester (FIO, 2016) and iPerf3 (iPerf3, 2017). FIO was used to assess the impact of different guest storage options (local disk-based, remote NFS-based, and remote iSCSI-based).
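The paper does not reproduce the exact FIO job parameters here; purely as an illustration, a job file of the kind typically used for this sort of guest storage test might look as follows (all parameter values below are hypothetical, not taken from the study):

```ini
; Hypothetical FIO job file: mixed 4 KiB random read/write workload
; against a test file in the guest's file system.
; Parameter values are illustrative only, not the paper's settings.
[guest-randrw]
ioengine=libaio      ; Linux asynchronous I/O engine (Windows guests would use windowsaio)
rw=randrw            ; mixed random reads and writes
bs=4k                ; 4 KiB block size
size=1g              ; per-job I/O footprint
direct=1             ; bypass the guest page cache (O_DIRECT)
runtime=60           ; run for a fixed time (seconds)...
time_based           ; ...rather than stopping once 'size' bytes are transferred
```

Such a job would be launched inside each guest with "fio guest-randrw.fio", repeating the run with the guest's virtual disk placed on local, NFS and iSCSI storage. Similarly, an iPerf3 bandwidth measurement pairs a server instance ("iperf3 -s") on one virtual machine with a client ("iperf3 -c <server-address>") on another.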
iPerf3 was used to measure IPv4 TCP network bandwidth between virtual machines on the same host (with