
OSv — Optimizing the Operating System for Virtual Machines

Avi Kivity, Dor Laor, Glauber Costa, Pekka Enberg, Nadav Har'El, Don Marti, and Vlad Zolotarov — Cloudius Systems
{avi,dor,glommer,penberg,nyh,dmarti,vladz}@cloudius-systems.com

Abstract

Virtual machines in the cloud typically run existing general-purpose operating systems such as Linux. We notice that the cloud's hypervisor already provides some features, such as isolation and hardware abstraction, which are duplicated by traditional operating systems, and that this duplication comes at a cost.

We present the design and implementation of OSv, a new guest operating system designed specifically for running a single application on a virtual machine in the cloud. It addresses the duplication issues by using a low-overhead library-OS-like design. It runs existing applications written for Linux, as well as new applications written for OSv. We demonstrate that OSv is able to efficiently run a variety of existing applications. We demonstrate its sub-second boot time, small OS image, and how it makes more memory available to the application. For unmodified network-intensive applications, we demonstrate up to 25% increase in throughput and 47% decrease in latency. By using non-POSIX network APIs, we can further improve performance and demonstrate a 290% increase in Memcached throughput.

Figure 1: Software layers in a typical cloud VM.

1 Introduction

The cloud (Infrastructure-as-a-Service, or IaaS) was born out of the realization that virtualization makes it easy and safe for different organizations to share one pool of physical machines. At any time, each organization can rent only as many virtual machines as it currently needs to run its application.

Today, virtual machines on the cloud typically run the same traditional operating systems that were used on physical machines, e.g., Linux, Windows, and *BSD. But as the IaaS cloud becomes ubiquitous, this choice is starting to make less sense: the features that made these operating systems desirable on physical machines, such as familiar single-machine administration interfaces and support for a large selection of hardware, are losing their relevance. At the same time, different features are becoming important: the VM's operating system needs to be fast, small, and easy to administer at large scale.

Moreover, fundamental features of traditional operating systems are becoming overhead, as they are now duplicated by other layers of the cloud stack (illustrated in Figure 1).

For example, an important role of traditional operating systems is to isolate different processes from one another, and all of them from the kernel. This isolation comes at a cost, in performance of system calls and context switches, and in complexity of the OS. This was necessary when different users and applications ran on the same OS, but on the cloud, the hypervisor provides isolation between different VMs, so mutually-untrusting applications do not need to run on the same VM. Indeed, the scale-out nature of cloud applications has already resulted in a trend of focused single-application VMs.

A second example of duplication is hardware abstraction: an OS normally provides an abstraction layer through which the application accesses the hardware. But on the cloud, this "hardware" is itself a virtualized abstraction created by the hypervisor. Again, this duplication comes at a performance cost.

This paper explores the question of what an operating system would look like if we designed it today with the sole purpose of running on virtual machines on the cloud, and not on physical machines.

We present OSv, a new OS we designed specifically for cloud VMs. The main goals of OSv are as follows:

• Run existing cloud applications (Linux executables).

• Run these applications faster than Linux does.

• Make the image small enough, and the boot quick enough, that starting a new VM becomes a viable alternative to reconfiguring a running one.

• Explore new APIs for new applications written for OSv, that provide even better performance.

• Explore using such new APIs in common runtime environments, such as the Java Virtual Machine (JVM). This will boost the performance of unmodified Java applications running on OSv.

• Be a platform for continued research on VM operating systems. OSv is actively developed as open source, it is written in a modern language (C++11), its codebase is relatively small, and our community encourages experimentation and innovation.

OSv supports different hypervisors and processors, with only a minimal amount of architecture-specific code. For 64-bit x86 processors, it currently runs on the KVM, Xen, VMware and VirtualBox hypervisors, and also on the Amazon EC2 and Google GCE clouds (which use a variant of Xen and KVM, respectively). Preliminary support for 64-bit ARM processors is also available.

In Section 2, we present the design and implementation of OSv. We will show that OSv runs only on a hypervisor, and is well-tuned for this (e.g., by avoiding spin-locks). OSv runs a single application, with the kernel and multiple threads all sharing a single address space. This makes system calls as efficient as function calls, and context switches quicker. OSv supports SMP VMs, and has a redesigned network stack (network channels) to lower socket API overheads. OSv includes other facilities one expects in an operating system, such as standard libraries and a thread scheduler, and we will briefly survey those. OSv's scheduler incorporates several new ideas including lock-free algorithms and floating-point based fair accounting of run-time.

In Section 3, we begin to explore what kind of new APIs a single-application OS like OSv might have beyond the traditional POSIX APIs to further improve performance. We suggest two techniques to improve JVM memory utilization and garbage-collection performance, which boost the performance of all JVM languages (Java, Scala, JRuby, etc.) on OSv. We then demonstrate that a zero-copy, lock-free API for packet processing can result in a 4x increase of Memcached throughput.

In Section 4, we evaluate our implementation, and compare OSv to Linux on several micro- and macro-benchmarks. We show minor speedups over Linux in computation- and memory-intensive workloads such as the SPECjvm2008 benchmark, and up to 25% increase in throughput and 47% reduction in latency in network-dominated workloads such as Netperf and Memcached.

2 Design and Implementation of OSv

OSv follows the library OS design, an OS construct pioneered by exokernels in the 1990s [5]. In OSv's case, the hypervisor takes on the role of the exokernel, and VMs the role of the applications: each VM is a single application with its own copy of the library OS (OSv). Library OS design attempts to address performance and functionality limitations in applications that are caused by traditional OS abstractions. It moves resource management to the application level, exports hardware directly to the application via safe APIs, and reduces abstraction and protection layers.

OSv runs a single application in the VM. If several mutually-untrusting applications are to be run, they can be run in separate VMs. Our assumption of a single application per VM simplifies OSv, but more importantly, eliminates the redundant and costly isolation inside a guest, leaving the hypervisor to do isolation. Consequently, OSv uses a single address space — all threads and the kernel use the same page tables, reducing the cost of context switches between application threads or between an application and the kernel.

The OSv kernel includes an ELF dynamic linker which runs the desired application. This linker accepts standard ELF dynamically-linked code compiled for Linux. When this code calls functions from the Linux ABI (i.e., functions provided on Linux by the glibc library), these calls are resolved by the dynamic linker to functions implemented by the OSv kernel. Even functions which are considered "system calls" on Linux, e.g., read(), are in OSv ordinary function calls and do not incur special call overheads, nor do they incur the cost of user-to-kernel parameter copying, which is unnecessary in our single-application OS.

Aiming at compatibility with a wide range of existing applications, OSv emulates a big part of the Linux programming interface. Some functions, like fork() and exec(), are not provided, since they have no meaning in the one-application model employed by OSv.
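To make the "system calls are ordinary function calls" point concrete, here is a minimal sketch in the spirit of this design. The names (osv_sketch, lookup_fd, do_read, osv_read) are invented for illustration; this is not OSv's actual source, and in OSv the function would be exported under the Linux ABI name read() itself.

```cpp
// Sketch only: hypothetical names, not OSv's implementation. In a single
// address space a "system call" such as read() can be a plain exported
// function that the ELF dynamic linker resolves directly, with no trap into
// a separate privilege level and no user-to-kernel buffer copy.
#include <cstdio>
#include <cstring>
#include <sys/types.h>

namespace osv_sketch {

// Stand-ins for the kernel's per-file object and fd table (details omitted).
struct file { const char* data; size_t size; };
static file fd_table[16] = { { "hello from the kernel\n", 22 } };

inline file* lookup_fd(int fd) {
    return (fd >= 0 && fd < 16) ? &fd_table[fd] : nullptr;
}

// The filesystem entry point fills the caller's buffer in place: both the
// application and the kernel share one set of page tables.
inline ssize_t do_read(file* f, void* buf, size_t len) {
    size_t n = len < f->size ? len : f->size;
    std::memcpy(buf, f->data, n);
    return static_cast<ssize_t>(n);
}

// In OSv proper this would carry the Linux ABI name read(); it is renamed
// here only so the sketch links cleanly against a host libc.
extern "C" ssize_t osv_read(int fd, void* buf, size_t count) {
    file* f = lookup_fd(fd);
    return f ? do_read(f, buf, count) : -1;  // real code would also set errno
}

}  // namespace osv_sketch

int main() {
    char buf[64] = {};
    ssize_t n = osv_sketch::osv_read(0, buf, sizeof(buf) - 1);
    std::printf("read %zd bytes: %s", n, buf);
    return 0;
}
```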

The core of OSv is new code, written in C++11. This includes OSv's loader and dynamic linker, memory management, thread scheduler and synchronization mechanisms such as mutex and RCU, virtual-hardware drivers, and more. We will discuss below some of these mechanisms in more detail.

Operating systems designed for physical machines usually devote much of their code to supporting diverse hardware. The situation is much easier for an operating system designed for VMs, such as OSv, because hypervisors export a simplified and more stable hardware view. OSv has drivers for a small set of traditional PC devices commonly emulated by hypervisors, such as a keyboard, VGA, serial port, SATA, IDE and HPET. Additionally, it supports several paravirtual drivers for improved performance: a paravirtual clock is supported on KVM and Xen, a paravirtual NIC using virtio [25] and VMXNET3 [29], and a paravirtual block device (disk) using virtio and pvscsi.

For its filesystem support, OSv follows a traditional Unix-like VFS (virtual filesystem) design [12] and adopts ZFS as its major filesystem. ZFS is a modern filesystem emphasizing data integrity and advanced features such as snapshots and volume management. It employs a modified version of the Adaptive Replacement Cache [18] for page cache management and consequently it can achieve a good balance between recency and frequency hits.

Other filesystems are also present in OSv. There is an in-memory filesystem for specialized applications that may want to boot without a disk (ramfs), and a very simple device filesystem for device views (devfs). For compatibility with Linux applications, a simplified procfs is also supported.

Some components of OSv were not designed from scratch, but rather imported from other open-source projects. We took the C library headers and some functions (such as stdio and math functions) from the musl libc project, the VFS layer from the Prex project, the ZFS filesystem from FreeBSD, and the ACPI drivers from the ACPICA project. All of these are areas in which OSv's core value is not expected to be readily apparent, so it would make less sense for these to be written from scratch, and we were able to save significant time by reusing existing implementations.

OSv's network stack was also initially imported from FreeBSD, because it was easier to start with an implementation known to be correct, and later optimize it. As we explain in Section 2.3, after the initial import we rewrote the network stack extensively to use a more efficient network channels-based design.

It is beyond the scope of this article to cover every detail of OSv's implementation. Therefore, the remainder of this section will explore a number of particularly interesting or unique features of OSv's implementation, including: 1. memory management in OSv, 2. how and why OSv completely avoids spinlocks, 3. network channels, a non-traditional design for the networking stack, and 4. the OSv thread scheduler, which incorporates several new ideas including lock-free algorithms and floating-point based fair accounting of run-time.

2.1 Memory Management

In theory, a library OS could dictate a flat physical memory mapping. However, OSv uses virtual memory like general-purpose OSs. There are two main reasons for this. First, the x86_64 architecture mandates virtual memory usage for long mode operation. Second, modern applications following traditional POSIX-like APIs tend to map and unmap memory and use page protection themselves.

OSv supports demand paging and memory mapping via the mmap API. This is important, for example, for a class of JVM-based applications that bypass the JVM and use mmap directly through JNI. Such applications include Apache Cassandra, a popular NoSQL database running on the JVM.

For large enough mappings, OSv will fill the mapping with huge pages (2MB in size for the x86_64 architecture). The use of larger page sizes improves the performance of applications by reducing the number of TLB misses [24].

Since mappings can be partially unmapped, it is possible that one of these huge pages needs to be broken into smaller pages. By employing a mechanism similar to Linux's Transparent Huge Pages, OSv handles this case transparently.

As an OS that aims to support a single application, page eviction is not supported. Additional specialized memory management constructs are described in Section 3.
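As a toy illustration of the huge-page decision just described, the sketch below picks a page size for a mapping. The threshold logic, constants, and names are invented for illustration, not OSv's mapping code:

```cpp
// Sketch: choosing a page size for a new mapping, in the spirit of the policy
// described above. Names and the exact policy are hypothetical, not OSv code.
#include <cstddef>
#include <cstdint>
#include <cstdio>

namespace osv_sketch {

constexpr std::size_t kSmallPage = 4 * 1024;         // 4 KB base page (x86_64)
constexpr std::size_t kHugePage  = 2 * 1024 * 1024;  // 2 MB huge page (x86_64)

// A mapping that is large and aligned enough is backed by 2MB pages, cutting
// the number of TLB entries it needs by a factor of 512; smaller or misaligned
// regions fall back to 4KB pages. A later partial munmap() would split a huge
// page back into small ones, which OSv handles transparently (similar to
// Linux's Transparent Huge Pages).
inline std::size_t page_size_for(std::uintptr_t vaddr, std::size_t len) {
    bool aligned = (vaddr % kHugePage) == 0;
    return (aligned && len >= kHugePage) ? kHugePage : kSmallPage;
}

}  // namespace osv_sketch

int main() {
    std::printf("64 KB mapping  -> %zu byte pages\n",
                osv_sketch::page_size_for(0x200000, 64 * 1024));
    std::printf("512 MB mapping -> %zu byte pages\n",
                osv_sketch::page_size_for(0x200000, 512u * 1024 * 1024));
    return 0;
}
```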


2.2 No Spinlocks

One of the primitives used by contemporary OSs on SMP machines is the spin-lock [2]. On a single-processor system, it is easy to protect a data structure from concurrent access by several contexts by disabling interrupts or context switches while performing non-atomic modifications. That is not enough on multi-processor systems, where code running on multiple CPUs may touch the data concurrently. Virtually all modern SMP OSs today use spin-locks: one CPU acquires the lock with an atomic test-and-set operation, and the others execute a busy-loop until they can acquire the lock themselves. SMP OSs use this spin-lock primitive to implement higher-level locking facilities such as sleeping mutexes, and also use spin-locks directly in situations where sleeping is forbidden, such as in the scheduler itself and in interrupt-handling context.

Spin-locks are well-suited to a wide range of SMP physical hardware. However, when we consider virtual machines, spin-locks suffer from a significant drawback known as the "lock-holder preemption" problem [28]: while physical CPUs are always running if the OS wants them to, virtual CPUs may "pause" at unknown times for unknown durations. This can happen during exits to the hypervisor, or because the hypervisor decides to run other guests or even hypervisor processes on this CPU.

If a virtual CPU is paused while holding a spin-lock, other CPUs that want the same lock spin needlessly, wasting CPU time. When a mutex is implemented using a spin-lock, this means that a thread waiting on a lock can find itself spinning and wasting CPU time, instead of immediately going to sleep and letting another thread run. The consequence of the lock-holder preemption problem is lower performance — Friebel et al. have shown that multitasking two guests on the same CPU results in performance drops from 7% up to 99% in extreme cases [7].

Several approaches have been proposed to mitigate the lock-holder preemption problem [7], usually requiring changes to the hypervisor or some form of cooperation between the hypervisor and the guest. However, in a kernel designed especially to run in a virtual machine, a better solution is to avoid the problem completely. OSv does not use spin-locks at all, without giving up on lock-based algorithms in the kernel or restricting it to single-processor environments.

One way to eliminate spin-locks is to use lock-free algorithms [19]. These algorithms make clever use of various atomic instructions provided by the SMP machine (e.g., compare-exchange, fetch-and-add) to ensure that a data structure remains in a consistent state despite concurrent modifications. We can also avoid locks by using other techniques such as Read-Copy-Update (RCU) [17]. But lock-free algorithms are very hard to develop, and it is difficult to completely avoid locking in the kernel [16], especially considering that we wanted to re-use existing kernel components such as ZFS and the BSD network stack. Therefore, our approach is as follows:

1. Ensure that most work in the kernel, including interrupt handling, is done in threads. These can use lock-based algorithms: they use a mutex (which can put a thread to sleep), not a spin-lock.

2. Implement the mutex itself without using a spin-lock, i.e., it is a lock-free algorithm.

3. The scheduler itself cannot be run in a thread, so to protect its data structures without spin-locks, we use per-CPU run queues and lock-free algorithms.

OSv executes almost everything in ordinary threads. Interrupt handlers usually do nothing but wake up a thread which will service the interrupting device. Kernel code runs in threads just like application code, and can sleep or be preempted just the same. OSv's emphasis on cheap thread context switches ensures that the performance of this design does not suffer.

Our mutex implementation is based on a lock-free design by Gidenstam & Papatriantafilou [8], which protects the mutex's internal data structures with atomic operations in a lock-free fashion. With our lock-free mutex, a paused CPU cannot cause other CPUs to start spinning. As a result, kernel and application code which uses this mutex is free from the lock-holder preemption problem.

Finally, the scheduler itself uses per-CPU run queues, so that most scheduling decisions are local to the CPU and need no locking. It uses lock-free algorithms when scheduling cooperation is needed across CPUs, such as waking a thread that belongs to a different CPU. OSv's scheduler is described in more detail in Section 2.4.

2.3 Network Channels

An operating system designed for the cloud must, almost by definition, provide a high-quality TCP/IP stack. OSv does this by applying Van Jacobson's net channel ideas [10] to its networking stack.

We begin by observing that a typical network stack is traversed in two different directions:

• Top-down: the send() and recv() system calls start at the socket layer, convert user buffers to TCP packets, attach IP headers to those TCP packets, and finally egress via the network card driver,

• Bottom-up: incoming packets are received by the network card driver, parsed by the IP layer, forwarded to the TCP layer, and are then appended to socket buffers; blocked send(), recv(), and poll() system calls are then woken as necessary.

As illustrated in Figure 2a, both the interrupt contexts (hard- and soft-interrupt) and the application thread context perform processing on all layers of the network stack. The key issue is that code from both contexts accesses shared data structures, causing lock and cache-line contention on heavily used connections.

In order to resolve this contention, under OSv almost all packet processing is performed in an application thread. Upon packet receipt, a simple classifier associates it with a channel, which is a single-producer/single-consumer queue for transferring packets to the application thread. Each channel corresponds to a single flow, such as a TCP connection or a UDP path from an interface to a socket.

As can be seen in Figure 2b, access to shared data structures from multiple threads is completely eliminated (save for the channel itself).


Figure 2: Control flow and locking in (left to right): (a) a traditional networking stack, (b) OSv’s networking stack prior to lock merging, and (c) OSv’s complete networking stack

In addition, since there is now just one thread accessing the data, locking can be considerably simplified, reducing both run-time and maintenance overhead.

Switching to a net channel approach allows a significant reduction in the number of locks required, leading to the situation in Figure 2c:

• The socket receive buffer lock has been merged with the socket send buffer lock; since both buffers are now populated by the same thread (running either the send() or recv() system calls), splitting that lock is unnecessary,

• The interleave prevention lock (used to prevent concurrent writes from interleaving) has been eliminated and replaced by a wait queue using the socket lock for synchronization, and

• The TCP layer lock has been merged with the socket layer lock; since TCP processing now always happens within the context of a socket call, it is already protected by that lock.

We expect further simplifications and improvements to the stack as it matures.

2.4 The Thread Scheduler

The guiding principles of OSv's thread scheduler are that it should be lock-free, preemptive, tick-less, fair, scalable and efficient.

Lock-free As explained in Section 2.2, OSv's scheduler should not use spin-locks and it obviously cannot use a sleeping mutex.

The scheduler keeps a separate run queue on each CPU, listing the runnable threads on the CPU. Sleeping threads are not listed on any run queue. The scheduler runs on a CPU when the running thread asks for a reschedule, or when a timer expiration forces preemption. At that point, the scheduler chooses the most appropriate thread to run next from the threads on this CPU's run queue, according to its fairness criteria. Because each CPU has its own separate run queue, this part of the scheduler needs no locking.

The separate run queues can obviously lead to a situation where one CPU's queue has more runnable threads than another CPU's, hurting the scheduler's overall fairness. We solve this by running a load balancer thread on each CPU. This thread wakes up once in a while (10 times a second), and checks if some other CPU's run queue is shorter than this CPU's. If it is, it picks one thread from this CPU's run queue, and wakes it on the remote CPU.

Waking a thread on a remote CPU requires a more elaborate lock-free algorithm: for each of the N CPUs, we keep N lock-free queues of incoming wakeups, for a total of N² queues. We also keep a bitmask of nonempty queues for each CPU. When CPU s wants to wake a thread on CPU d, it adds this thread to the queue (s,d), atomically turns on bit s in CPU d's bitmask, and sends an inter-processor interrupt (IPI) to CPU d. The interrupt leads CPU d to perform a reschedule, which begins by looking for incoming wakeups. The bitmask tells the scheduler which of the incoming queues it needs to inspect.
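A compact sketch of the data structures this paragraph describes follows. The lock-free queue and the IPI call are stand-ins, and none of this is OSv's actual code; it only makes the queue-per-source-CPU plus bitmask arrangement concrete.

```cpp
// Sketch of the per-CPU incoming-wakeup queues and bitmask described above.
// lockfree_queue and send_ipi() are stand-ins, not OSv's real implementations.
#include <atomic>
#include <cstdint>
#include <vector>

struct thread_t {};                              // opaque placeholder for a thread

template <typename T>
struct lockfree_queue {                          // stand-in for a real SPSC queue
    void push(T) { /* omitted */ }
    bool pop(T&) { return false; /* omitted */ }
};

inline void send_ipi(unsigned /*dest_cpu*/) { /* platform IPI path, omitted */ }

struct cpu_state {
    // incoming[s] holds wakeups sent *from* CPU s to this CPU, so every queue
    // has exactly one producer and one consumer and needs no lock.
    std::vector<lockfree_queue<thread_t*>> incoming;
    std::atomic<std::uint64_t> nonempty_mask{0}; // bit s set => incoming[s] may be nonempty

    explicit cpu_state(unsigned ncpus) : incoming(ncpus) {}
};

// CPU `src` wakes a thread that belongs to CPU `dst` (whose state is dst_state).
void wake_on(cpu_state& dst_state, unsigned src, unsigned dst, thread_t* t) {
    dst_state.incoming[src].push(t);
    dst_state.nonempty_mask.fetch_or(1ull << src, std::memory_order_release);
    send_ipi(dst);                               // forces a reschedule on CPU dst
}

// On the destination CPU, the reschedule starts by draining only the queues
// the bitmask points at.
void handle_incoming_wakeups(cpu_state& self) {
    std::uint64_t mask = self.nonempty_mask.exchange(0, std::memory_order_acquire);
    while (mask != 0) {
        unsigned src = static_cast<unsigned>(__builtin_ctzll(mask)); // lowest set bit
        mask &= mask - 1;
        thread_t* t = nullptr;
        while (self.incoming[src].pop(t)) {
            // insert t into this CPU's run queue (omitted)
        }
    }
}
```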

Preemptive OSv fully supports preemptive multitasking: while threads can voluntarily cause a reschedule (by waiting, yielding, or waking up another thread), one can also happen at any time, preempted by an interrupt such as a timer or the wakeup IPI mentioned above. All threads are preemptable, and as with the rest of the system, there is no difference between application and kernel threads. A thread can temporarily avoid being preempted by incrementing a per-thread preempt-disable counter. This feature can be useful in a number of cases including, for example, maintaining per-CPU variables and RCU [17] locks. An interrupt while the running thread has preemption disabled will not cause a reschedule, but when the thread finally re-enables preemption, a reschedule will take place.

Tick-less Most classic kernels, and even many modern kernels, employ a periodic timer interrupt, also known as a tick. The tick causes a reschedule to happen periodically, for example, 100 times each second. Such kernels often account the amount of time that each thread has run in whole ticks, and use these counts to decide which thread to schedule at each tick.

Ticks are convenient, but also have various disadvantages. Most importantly, excessive timer interrupts waste CPU time. This is especially true on virtual machines, where interrupts are significantly slower than on physical machines, as they involve exits to the hypervisor.

Because of the disadvantages of ticks, OSv implements a tickless design. Using a high-resolution clock, the scheduler accounts to each thread the exact time it consumed, instead of approximating it with ticks. Some timer interrupts are still used: whenever the fair scheduling algorithm decides to run one thread, it also calculates when it will want to switch to the next thread, and sets a timer for that period. The scheduler employs hysteresis to avoid switching too frequently between two busy threads. With the default hysteresis setting of 2ms, two busy threads with equal priority will alternate 4ms time slices, and the scheduler will never cause more than 500 timer interrupts each second. This number will be much lower when there aren't several threads constantly competing for the CPU.

Fair On each reschedule, the scheduler must decide which of the CPU's runnable threads should run next, and for how long. A fair scheduler should account for the amount of run time that each thread got, and strive to either equalize it or achieve a desired ratio if the threads have different priorities. However, using the total runtime of the threads will quickly lead to imbalances. For instance, if a thread was out of the CPU for 10 seconds and becomes runnable, it will monopolize the CPU for 10 whole seconds as the scheduler seeks to achieve fairness. Instead, we want to equalize the amount of run-time that runnable threads have gotten in recent history, and forget about the distant past.

OSv's scheduler calculates the exponentially-decaying moving average of each thread's recent run time. The scheduler will choose to run next the runnable thread with the lowest moving-average runtime, and calculate exactly how much time this thread should be allowed to run before its runtime surpasses that of the runner-up thread.

Our moving-average runtime is a floating-point number. It is interesting to mention that while some kernels forbid floating-point use inside the kernel, OSv fully allows it. As a matter of fact, it has no choice but to allow floating point in the kernel due to the lack of a boundary between the kernel and the application.

The biggest stumbling block to implementing moving-average runtime as described above is its scalability: it would be impractical to update the moving-average runtimes of all threads on each scheduler invocation. But we can show that this is not actually necessary; we can achieve the same goal by updating just the runtime of the single running thread. It is beyond the scope of this article to derive the formulas used in OSv's scheduler to maintain the moving-average runtime, or to calculate how much time we should allow a thread to run until its moving-average runtime overtakes that of the runner-up thread.

Scalable OSv's scheduler has O(lg N) complexity in the number of runnable threads on each CPU: the run queue is kept sorted by moving-average runtime, and as explained, each reschedule updates the runtime of just one thread. The scheduler is totally unaware of threads which are not runnable (e.g., waiting for a timer or a mutex), so there is no performance cost in having many utility threads lying around and rarely running. OSv indeed has many of these utility threads, such as the load-balancer and interrupt-handling threads.
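Since the paper leaves its exact formulas out of scope, the sketch below shows only the general flavor of exponentially-decaying runtime accounting: one standard decay form with an invented time constant, not the formulas OSv actually uses.

```cpp
// Sketch: one standard form of exponentially-decaying runtime accounting.
// This is NOT OSv's formula, only an illustration of weighting recent runtime
// more heavily than the distant past.
#include <cmath>
#include <vector>

struct sched_thread {
    double avg_runtime = 0.0;   // exponentially-decaying runtime measure (ns)
    bool   runnable    = true;
};

constexpr double kTauNs = 50e6; // decay time constant (~50 ms); invented value

// Charge the thread that just ran for `ran_ns`, decaying its history so that
// runtime from long ago gradually stops mattering.
inline void account(sched_thread& t, double ran_ns) {
    double decay = std::exp(-ran_ns / kTauNs);
    t.avg_runtime = t.avg_runtime * decay + ran_ns;
}

// Pick the runnable thread with the lowest decayed runtime. OSv keeps the run
// queue sorted by this key, giving O(lg N) scheduling; a linear scan is used
// here only to keep the sketch short.
inline sched_thread* pick_next(std::vector<sched_thread>& threads) {
    sched_thread* best = nullptr;
    for (auto& t : threads) {
        if (t.runnable && (!best || t.avg_runtime < best->avg_runtime)) {
            best = &t;
        }
    }
    return best;
}
```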

Efficient Beyond the scheduler's scalability, OSv employs additional techniques to make the scheduler and context switches more efficient. Some of these techniques include:

• OSv's single address space means that we do not need to switch page tables or flush the TLB on context switches. This makes context switches significantly cheaper than those on traditional multi-process operating systems.

• Saving the floating-point unit (FPU) registers on every context switch is also costly. We make use of the fact that most reschedules are voluntary, caused by the running thread calling a function such as mutex_wait() or wake(). The x86_64 ABI guarantees that the FPU registers are caller-saved. So for voluntary context switches, we can skip saving the FPU state.

As explained above, waking a sleeping thread on a different CPU requires an IPI. These are expensive, and even more so on virtual machines, where both sending and receiving interrupts cause exits to the hypervisor. As an optimization, idle CPUs spend some time before halting in a polling state, where they ask not to be sent these IPIs, and instead poll the wakeup bitmask. This optimization can almost eliminate the expensive IPIs in the case where two threads on two CPUs wait for one another in lockstep.

3 Beyond the Linux APIs

In this section, we explore what kind of new APIs a single-application OS like OSv might have beyond the standard Linux APIs, and discuss several such extensions which we have already implemented, as well as their benefits.

The biggest obstacle to introducing new APIs is the need to modify existing applications or write new applications. One good way around this problem is to focus on efficiently running a runtime environment, such as the Java Virtual Machine (JVM), on OSv. If we optimize the JVM itself, any application run inside this JVM will benefit from this optimization.

As explained in the previous section, OSv can run unmodified Linux programs, which use the Linux APIs — a superset of the POSIX APIs. We have lowered the overhead of these APIs, as described in the previous section and quantified in the next section. One of the assumptions we have made is that OSv runs a single application, in a single address space. This allowed us to run "system calls" as ordinary functions, reducing their overhead.

However, in this section we show that there remain significant overheads and limitations inherent in the Linux APIs, which were designed with a multi-process multi-user operating system in mind. We propose to reduce these remaining overheads by designing new APIs specifically for applications running on a single-process OS like OSv.

The socket API, in particular, is rife with such overheads. For example, a socket read or write necessarily copies the data, because on Linux the kernel cannot share packet buffers with userspace. But on a single-address-space OS, a new zero-copy API can be devised where the kernel and user space share the buffers. For packet-processing applications, we can adopt a netmap-like API [23]. The OSv kernel may even expose the host's virtio rings to the application (which is safe when we have a single application), completely eliminating one layer of abstraction. In Section 4 we demonstrate a Memcached implementation which uses a non-POSIX packet processing API to achieve a 4-fold increase in throughput compared to the traditional Memcached using the POSIX socket APIs.

Another new API benefiting from the single-application nature of OSv is one giving the application direct access to the page table. Java's GC performance, in particular, could benefit: the Hotspot JVM uses a data structure called a card table [22] to track write accesses to references to objects. To update this card table to mark the memory containing a reference as dirty, the code generated by the JVM has to be followed by a "write barrier". This additional code causes both extra instructions and cache-line bounces. However, the MMU already tracks write access to memory. By giving the JVM access to the MMU, we can track reference modifications without a separate card table or write barriers. A similar strategy is already employed by Azul C4 [27], but it requires heavy modifications to the Linux memory management system.

In the rest of this section, we present two new non-Linux features which we implemented in OSv. The first feature is a shrinker API, which allows the application and the kernel to share the entire available memory. The second feature, the JVM balloon, applies the shrinker idea to an unmodified Java Virtual Machine, so that instead of manually choosing a heap size for the JVM, the heap is automatically resized to fill all memory which the kernel does not need.

3.1 Shrinker

The shrinker API allows the application or an OS component to register callback functions that OSv will call when the system is low on memory. The callback function is then responsible for freeing some of that application or component's memory. Under most operating systems, applications or components that maintain a dynamic cache, such as a Memcached cache or the VFS page cache, must statically limit its size to a pre-defined amount of memory or number of cache entries. This imposes sometimes contradicting challenges: not to consume more memory than available in the system and not to strangle other parts of the system, while still taking advantage of the available memory. This gets even more challenging when there are a few heavy memory consumers in the system that work in a bursty manner, wherein the memory needs to "flow" from one application or component to another depending on demand. The shrinker API provides an adaptable solution by allowing applications and components to handle memory pressure as it arises, instead of requiring administrators to tune in advance.
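The paragraph above describes the contract; a minimal sketch of what registration and invocation could look like follows. The names and signatures here are hypothetical, and OSv's real shrinker interface may differ.

```cpp
// Sketch of a shrinker-style interface: components register a callback that
// releases memory on demand. Hypothetical names, not OSv's actual API.
#include <cstddef>
#include <functional>
#include <vector>

namespace osv_sketch {

// A shrinker callback is asked to release roughly `target` bytes and reports
// how much it actually freed.
using shrinker_fn = std::function<std::size_t(std::size_t target)>;

std::vector<shrinker_fn>& shrinkers() {
    static std::vector<shrinker_fn> list;
    return list;
}

void register_shrinker(shrinker_fn fn) { shrinkers().push_back(std::move(fn)); }

// Called by the memory allocator when it runs low: walk the registered
// callbacks until enough memory has been reclaimed.
std::size_t reclaim(std::size_t needed) {
    std::size_t freed = 0;
    for (auto& s : shrinkers()) {
        if (freed >= needed) break;
        freed += s(needed - freed);
    }
    return freed;
}

}  // namespace osv_sketch
```

With a hook of this kind, a cache such as Memcached's can register a callback that evicts entries on demand and otherwise grow to fill whatever memory is free, instead of being capped by a static "-m"-style limit.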


We have demonstrated the usefulness of the shrinker API in two cases — Memcached [6], and the JVM. Ordinarily, Memcached requires the in-memory cache size to be specified (with the "-m" option), and the JVM requires the maximum heap size to be specified (the "-Xmx" option). Setting these sizes manually usually results in wasted VM memory, as the user decreases the cache or heap size to leave "enough" memory to the OS. Our Memcached re-implementation described in Section 4 uses the shrinker API and does not need the "-m" option: it uses for its cache all the memory which OSv doesn't need. We can similarly modify the JVM to use the shrinker to automatically size its heap, and even achieve the same on an unmodified JVM, as we will explain now.

3.2 JVM Balloon

The JVM balloon is a mechanism we developed to automatically determine the JVM heap size made available to the application. Ballooning is a widely used mechanism in hypervisors [30, 31], and the JVM balloon draws from the same core idea: providing efficient dynamic memory placement and reducing the need to do complex planning in advance. OSv's JVM balloon is designed to work with an unmodified JVM. As a guest-side solution, it will also work on all supported hypervisors.

It is possible to modify the JVM code to simplify this process. But the decision to run it from the OS side allows for enhanced flexibility, since it avoids the need to modify the various extant versions and vendor implementations of the JVM.

The JVM allocates most of its memory from its heap. This area can grow from its minimum size but is bounded by a maximum size, both of which can be specified by initialization parameters. The size of the JVM heap directly influences performance for applications, since having more memory available reduces the occurrence of GC cycles.

However, a heap size that is too big can also hurt the application, since the OS will be left without memory to conduct its tasks — like buffering a large file — when it needs to. Although any modern OS is capable of paging through the virtual-memory system, the OS usually lacks information during this process to make the best placement decision. A normal OS will see all heap areas as pages whose contents cannot be semantically interpreted. Consequently, it is forced to evict such pages to disk, which generates considerable disk activity and suboptimal cache growth. At this point, an OS that is blind to the semantic content of the pages will usually avoid evicting too much, since it cannot guarantee that those pages will not be used in the future. This results in less memory being devoted to the page cache, where it would potentially bring the most benefit. We quantify this effect in Section 4, and show that OSv's JVM balloon allows pages to be discarded without any disk activity.

OSv's approach is to allocate almost all available memory to the JVM when it is started¹, therefore setting that memory as the de facto JVM maximum heap. The OS allocations can proceed normally until pressure criteria are met.

¹ 90% in the current implementation.

Upon pressure, OSv will use JNI [13] to create an object in the JVM heap with a size big enough to alleviate that pressure, and acquire a reference to it. The object chosen is a ByteArray, since these are laid down contiguously in memory and it is possible to acquire a pointer to their address from JNI.

This object is referenced from the JNI, so a GC will not free it, and at this point the heap size is effectively reduced by the size of the object, forcing the JVM to count on a smaller heap for future allocations. Because the balloon object still holds the actual pages as backing storage, the last step of the ballooning process is to give the pages back to the OS by unmapping that area. The JVM cannot guarantee or force any kind of alignment for the object, which means that in this process some memory will be wasted: it will neither be used by the Java application nor given back to the OS. To mitigate this we use reasonably large minimum balloon sizes (128MB).
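As a rough sketch of the inflation step just described — simplified, with invented names, assuming GetPrimitiveArrayCritical returns a direct (non-copied) pointer into the heap, and with error handling omitted; this is not OSv's balloon code:

```cpp
// Sketch of balloon inflation via JNI, following the description above:
// allocate a large byte array in the heap, pin a reference so the GC keeps it,
// locate its backing pages, and return the aligned interior to the OS.
#include <jni.h>
#include <sys/mman.h>
#include <cstdint>

constexpr jsize kBalloonBytes = 128 * 1024 * 1024;   // minimum balloon size

jobject inflate_balloon(JNIEnv* env) {
    // 1. Create the balloon object inside the JVM heap.
    jbyteArray balloon = env->NewByteArray(kBalloonBytes);
    if (balloon == nullptr) return nullptr;           // heap too small

    // 2. Hold a global reference so the collector never frees it; the heap is
    //    effectively smaller by kBalloonBytes for future allocations.
    jobject ref = env->NewGlobalRef(balloon);

    // 3. Find the array's backing memory and a 2MB-aligned interior address.
    //    (A real implementation also tracks the unaligned head/tail that is
    //    wasted, as discussed above.)
    void* addr = env->GetPrimitiveArrayCritical(balloon, nullptr);
    auto  base    = reinterpret_cast<std::uintptr_t>(addr);
    auto  aligned = (base + 2 * 1024 * 1024 - 1) & ~std::uintptr_t(2 * 1024 * 1024 - 1);
    env->ReleasePrimitiveArrayCritical(balloon, addr, 0);

    // 4. Give the aligned interior pages back to the OS. The kernel keeps the
    //    range unmapped and relies on the page-fault handling described next
    //    to follow the object if the GC later moves it.
    munmap(reinterpret_cast<void*>(aligned), kBalloonBytes - 4 * 1024 * 1024);

    return ref;
}
```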


Balloon movement

The reference to the object, held by OSv, guarantees that the object will not be disposed of by the JVM or taken into account when making collection decisions. However, it does not guarantee that the object is never touched again. When the JVM undergoes a GC cycle, it moves old objects to new locations to open up space for new objects to come. At this point, OSv encounters a page fault.

OSv assumes that nothing in the JVM directly uses that object, and therefore is able to make the following assumptions about page faults that hit the balloon object:

• all reads from it are part of a copy to a new location,

• the source and destination addresses correspond to the same offset within the object,

• whenever that region is written to, it no longer holds the balloon.

With that in mind, OSv's page fault handler can decode the copy instruction — usually a rep mov on x86 — and find its destination operand. It then recreates the balloon in the destination location and updates all register values to make the copier believe the copy was successfully conducted. OSv's balloon mechanism is expected to work with any JVM or collector in which these assumptions hold.

The old location is kept unmapped until it is written to. This has both the goal of allowing the remap to be lazy, and of correctly supporting GCs that may speculatively copy the object to more than one location. Such is the case, for instance, for OpenJDK's Parallel Scavenge garbage collector.

4 Evaluation

We conducted some experiments to measure the performance of OSv as a guest operating system, and demonstrate improvement over a traditional OS: Linux. In all runs below, for "Linux" we used a default installation of Fedora 20 with the iptables firewall rules cleared. We look at both micro-benchmarks measuring the performance of one particular feature, and macro-benchmarks measuring the overall performance of an application. The host used in the benchmarks was a 4-CPU 3.4GHz Intel Core i7-4770, with 16GB of RAM and an SSD disk. The host was running Fedora 20 Linux and the KVM hypervisor.

Macro Benchmarks

Memcached is a high-performance in-memory key-value storage server [6]. It is used by many high-profile Web sites to cache results of database queries and prepared page sections, to significantly boost these sites' performance. We used the memaslap benchmark to load the server and measure its performance. Memaslap runs on a remote machine (connected to the tested host with a direct 40 GbE cable), sends random requests (concurrency 120, 90% get and 10% set) to the server, and measures the request completion rate. In this test, we measured a single-vCPU guest running Memcached with one service thread. Memcached supports both UDP and TCP protocols — we tested the UDP protocol, which is considered to have lower latency and overhead [20]. We set the combination of Memcached's cache size (5 GB) and memaslap test length (30 seconds) to ensure that the cache does not fill up during the test.

Table 1 presents the results of the memaslap benchmark, comparing the same unmodified Memcached program running on OSv and Linux guests. We can see that Memcached running on OSv achieves 22% higher throughput than when running on Linux.

Table 1: Memcached and Memaslap benchmark

Guest OS   Transactions/sec   Score
Linux      104394             1
OSv        127275             1.22

One of the stated goals of OSv was that an OSv guest boots quickly and has a small image size. Indeed, we measured the time to boot OSv and Memcached, until Memcached starts serving requests, to be just 0.6 seconds. The guest image size was just 11MB. We believe that both numbers can be optimized further, e.g., by using ramfs instead of ZFS (Memcached does not need persistent storage).

In Section 3 we proposed to further improve performance by implementing in OSv new networking APIs with lower overheads than the POSIX socket APIs. To test this direction, we re-implemented part of the Memcached protocol (the parts that the memaslap benchmark uses). We used a packet-filtering API to grab incoming UDP frames, process them, and send responses in-place from the packet-filter callback. As before, we ran this application code in a single-vCPU guest running OSv and measured it with memaslap. The result was 406750 transactions/sec — 3.9 times the throughput of the baseline Memcached on Linux.
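To illustrate the style of processing this paragraph describes, the sketch below shows a packet-filter callback that parses a request directly from the receive buffer and writes its reply over the same buffer. The hook type, names, and simplified request format are hypothetical; they only suggest the shape of such an API, not the one used for the measurement above.

```cpp
// Sketch: a packet-filter style callback handling a request in place, in the
// spirit of the reimplemented Memcached path. Hypothetical API and names.
#include <cstddef>
#include <cstdint>
#include <cstring>

struct udp_frame {
    std::uint8_t* payload;   // points into the driver's receive buffer
    std::size_t   len;       // payload length
    std::size_t   capacity;  // room available for an in-place reply
};

// Return the reply length to transmit from the same buffer, or 0 to drop.
using packet_filter = std::size_t (*)(udp_frame&);

std::size_t memcached_filter(udp_frame& f) {
    // Parse the request directly from the receive buffer (zero copy)...
    if (f.len >= 4 && std::memcmp(f.payload, "get ", 4) == 0) {
        // ...look up the value and write the response over the same buffer.
        static const char reply[] = "VALUE k 0 1\r\nv\r\nEND\r\n";
        std::size_t n = sizeof(reply) - 1;
        if (n <= f.capacity) {
            std::memcpy(f.payload, reply, n);
            return n;        // driver transmits this frame back
        }
    }
    return 0;                // not handled; drop (or pass up the stack)
}
```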

SPECjvm2008 is a Java benchmark suite containing a variety of real-life applications and benchmarks. It focuses on the performance of the JVM executing a single application, and reflects the performance of CPU- and memory-intensive workloads, having low dependence on file I/O and including no network I/O across machines.

SPECjvm2008 is not only a performance benchmark, it is also a good correctness test for OSv. The benchmarks in the suite use numerous OS features, and each benchmark validates the correctness of its computation. Table 2 shows the scores for both OSv and Linux for the SPECjvm2008 benchmarks. For both guest OSs, the guest is given 2GB of memory and two vCPUs, and the benchmark is configured to use two threads. The JVM's heap size is set to 1400MB.

Table 2: SPECjvm2008 — higher is better

Benchmark            OSv     Linux
Weighted average     1.046   1.041
compiler.compiler    377     393
compiler.sunflow     140     149
compress             111     109
crypto.aes           57      56
crypto.rsa           289     279
crypto.signverify    280     275
derby                176     181
mpegaudio            100     100
fft.large            35.5    32.8
lu.large             12.2    12.2
sor.large            27.3    27.1
sparse.large         27.7    27.2
fft.small            138     114
lu.small             216     249
sor.small            122     121
sparse.small         159     163
monte-carlo          159     150
serial               107     107
sunflow              56.6    55.4
xml.transform        251     247
xml.validation       480     485

We did not expect a big improvement, considering that SPECjvm2008 is computation-dominated, with relatively little use of OS services. Indeed, on average, the SPECjvm2008 benchmarks did only 0.5% better on OSv than on Linux. This is a small but statistically-significant improvement (the standard deviation of the weighted average was only 0.2%). OSv did slightly worse than Linux on some benchmarks (notably those relying on the filesystem) and slightly better on others. We believe that with further optimizations to OSv we can continue to improve its score, especially on the lagging benchmarks, but the difference will always remain small in these computation-dominated benchmarks.

Micro Benchmarks

Network performance: We measured the network stack's performance using the Netperf benchmark [11] running on the host. Tables 3 and 4 show the results for the TCP and UDP tests respectively. We can see that OSv consistently outperforms Linux in these tests. RR (request/response) is significantly better for both TCP and UDP, translating to a 37%-47% reduction in latency. TCP STREAM (single-stream throughput) is 24%-25% higher for OSv.

Table 3: Netperf TCP benchmarks: higher is better

Test        STREAM (Mbps)   RR (Tps)
Linux UP    44546 ± 941     45976 ± 299
Linux SMP   40149 ± 1044    45092 ± 1101
OSv UP      55466 ± 553     74862 ± 405
OSv SMP     49611 ± 1442    72461 ± 572

Table 4: Netperf UDP benchmarks: higher is better

Test        RR (Tps)
Linux UP    44173 ± 345
Linux SMP   47170 ± 2160
OSv UP      82701 ± 799
OSv SMP     74367 ± 1246

JVM balloon: To isolate the effects of the JVM balloon technique described in Section 3.2, we wrote a simple microbenchmark in Java to be run on both Linux and OSv. It consists of the following steps:

1. Allocate 3.5 GB of memory in 2MB increments and store them in a list,

2. Remove from the list and write each 2MB buffer to a file sequentially until all buffers are exhausted,

3. Finally, read that file back to memory.

In both guest OSs, the application ran alone in a VM with 4GB of RAM. For OSv, the JVM heap size was automatically calculated by the balloon mechanism to be 3.6 GB. For Linux, the same value was manually set.

As shown in Table 5, OSv fared better in this test than Linux by around 35%. After the first round of allocations the guest memory is almost depleted. As Linux needs more memory to back the file, it has no option but to evict JVM heap pages. That generates considerable disk activity, which not only is detrimental per se, but at this particular moment also competes with the application's disk writes.

We observe that not only is the execution slower on Linux, it also has a much higher standard deviation. This is consistent with our expectation. Aside from deviations arising from the I/O operations themselves, the Linux VM lacks the information to make the right decision about which pages are best to evict.

Table 5: JVM balloon micro-benchmark: lower is better

Guest OS   Total (sec)   File Write (sec)   File Read (sec)
Linux      40 ± 6        27 ± 6             10.5 ± 0.2
OSv        26 ± 1        16 ± 1             7.4 ± 0.2

OSv can be more aggressive when discarding pages because it doesn't have to evict pages to make room for the page cache, while Linux will be a lot more conservative in order to avoid swap I/O. That also speeds up step 3 ("File Read"), as can be seen in Table 5. In the absence of eviction patterns, both Linux and OSv achieve consistent results with a low deviation. However, Linux reaches this phase with a smaller page cache, to avoid generating excessive disk activity. OSv does not need to make such a compromise, leading to a 30% performance improvement in that phase alone.

Context switches: We wrote a context-switch micro-benchmark to quantify the claims made earlier that thread switching is significantly cheaper on OSv than it is on Linux. The benchmark has two threads, which alternate waking each other with a pthreads condition variable. We then measure the average amount of time that each such wake iteration took. The benchmark is further subdivided into two cases: in the "colocated" case, the two alternating threads are colocated on the same processor, simulating the classic uniprocessor context switch. In the "apart" case, the two threads are pinned to different processors.

Table 6: Context switch benchmark

Guest OS   Colocated   Apart
Linux      905 ns      13148 ns
OSv        328 ns      1402 ns

The results are presented in Table 6. They show that thread switching is indeed much faster on OSv than on Linux — between 3 and 10 times faster. The "apart" case is especially helped on OSv by the last optimization described in Section 2.4, of idle-time polling.
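A simplified re-creation of the colocated measurement might look like the sketch below. It uses C++ standard threading rather than raw pthreads, omits the CPU pinning that the "apart" case needs, and is not the benchmark program used for Table 6.

```cpp
// Sketch of a ping-pong wake benchmark: two threads alternately wake each
// other through one condition variable; report the average time per wake.
#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>

int main() {
    constexpr int kIters = 100000;
    std::mutex m;
    std::condition_variable cv;
    bool ping_turn = true;

    auto runner = [&](bool i_am_ping) {
        for (int i = 0; i < kIters; ++i) {
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [&] { return ping_turn == i_am_ping; });
            ping_turn = !i_am_ping;      // hand the turn to the other thread
            cv.notify_one();             // ...and wake it
        }
    };

    auto start = std::chrono::steady_clock::now();
    std::thread ping(runner, true), pong(runner, false);
    ping.join();
    pong.join();
    auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                  std::chrono::steady_clock::now() - start).count();

    // Each loop iteration wakes the peer once, so there are 2 * kIters wakeups.
    std::printf("avg %lld ns per wake iteration\n",
                static_cast<long long>(ns / (2 * kIters)));
    return 0;
}
```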


5 Related Work

Containers [26, 3] use a completely different approach to eliminate the feature duplication of the hypervisor and guest OS. They abandon the idea of a hypervisor, and instead provide OS-level virtualization — modifying the host OS to support isolated execution environments for applications while sharing the same kernel. This approach improves resource sharing between guests and lowers per-guest overhead. Nevertheless, the majority of IaaS clouds today use hypervisors. These offer tenants better-understood isolation and security guarantees, and the freedom to choose their own kernel.

Picoprocesses [4] are another contender to replace the hypervisor. While a container host exposes to its guests the entire host kernel's ABI, picoprocesses offer only a bare-minimum API providing basic features like allocating memory, creating a thread and sending a packet. On top of this minimal API, a library OS is used to allow running executables written for Linux [9] or Windows [21]. These library OSs are similar to OSv in that they take a minimal host/guest interface and use it to implement a full traditional-OS ABI for a single application, but the implementation is completely different. For example, the picoprocess POSIX layer uses the host's threads, while OSv needs to implement threads and schedule these threads on its own.

If we return our attention to hypervisors, one known approach to reducing the overheads of the guest OS is to take an existing operating system, such as a Linux distribution, and trim it down as much as possible. Two examples of this approach are CoreOS and Tiny Core Linux. OSv differs from these OSs in that it is a newly designed OS, not a derivative of Linux. This allowed OSv to make different design decisions than Linux made, e.g., our choice not to use spinlocks, or to have a single address space despite having an MMU.

While OSv can run applications written in almost any language (both compiled and high-level), some VM OSs focus on running only a single high-level language. For example, Erlang on Xen runs an Erlang VM directly on the Xen hypervisor. Mirage OS [14] is a library OS written in OCaml that runs on the Xen hypervisor. It takes the idea of a library OS to the extreme, where an application links against separate OS service libraries and unused services are eliminated from the final image by the compiler. For example, a DNS server VM image can be as small as 200 KB.

Libra [1] is a library OS for running IBM's J9 JVM in a VM. Libra makes the case that as the JVM already has sandboxing, a memory model, and a threading model, a general-purpose OS is redundant. However, Libra does not replace the whole OS, but instead relies on Linux running in a separate hypervisor partition to provide networking and a filesystem.

ClickOS [15] is an optimized operating system for VMs specializing in network processing applications such as routing, and achieves impressive raw packet-per-second figures. However, unlike OSv, which runs on multiple hypervisors, ClickOS can only run on Xen, and requires extensive modifications to Xen itself. Additionally, ClickOS is missing important functionality that OSv has, such as support for SMP guests and a TCP stack.

6 Conclusions and Future Work

We have shown that OSv is, in many respects, a more suitable operating system for virtual machines in the cloud than are traditional operating systems such as Linux. OSv outperforms Linux in many benchmarks, it makes for small images, and its boot time is barely noticeable. OSv is a young project, and we believe that with continued work we can further improve its performance.

While OSv improves the performance of existing applications, some of the most dramatic improvements we've seen came from adding non-POSIX APIs to OSv. For example, the shrinker API allows an OSv-aware application to make better use of available memory, and a packet-filtering API reduces the overheads of the standard socket APIs. We plan to continue to explore new interfaces to add to OSv to further improve application performance. Areas of exploration will include network APIs and cooperative scheduling.

Instead of modifying many individual applications, a promising future direction is to modify a runtime environment, such as the JVM, on which many applications run. This will allow us to run unmodified applications, while still benefiting from new OSv APIs. The JVM balloon we presented is an example of this direction.

Finally, we hope that the availability of OSv, with its small modern codebase and broad usability (not limited to specific languages, hypervisors or applications), will encourage more research on operating systems for VMs.

7 Acknowledgments

Being an open source project, we would like to thank our community contributors, especially those working as volunteers.

8 Availability

OSv is BSD-licensed open source, available at:

https://github.com/cloudius-systems/osv

More information is available at http://osv.io/.

References

[1] AMMONS, G., APPAVOO, J., BUTRICO, M., DA SILVA, D., GROVE, D., KAWACHIYA, K., KRIEGER, O., ROSENBURG, B., VAN HENSBERGEN, E., AND WISNIEWSKI, R. W. Libra: a library operating system for a JVM in a virtualized execution environment. In Proceedings of the 3rd international conference on Virtual execution environments (2007), ACM, pp. 44–54.

[2] ANDERSON, T. E. The performance of spin lock alternatives for shared-memory multiprocessors. IEEE Transactions on Parallel and Distributed Systems 1, 1 (1990), 6–16.

[3] DES LIGNERIS, B. Virtualization of Linux based computers: the Linux-VServer project. In International Symposium on High Performance Computing Systems and Applications (2005), IEEE, pp. 340–346.

[4] DOUCEUR, J. R., ELSON, J., HOWELL, J., AND LORCH, J. R. Leveraging legacy code to deploy desktop applications on the web. In OSDI (2008), pp. 339–354.

[5] ENGLER, D. R., KAASHOEK, M. F., ET AL. Exokernel: An operating system architecture for application-level resource management. In ACM SIGOPS Operating Systems Review (1995), vol. 29, pp. 251–266.

[6] FITZPATRICK, B. Distributed caching with memcached. Linux Journal, 124 (2004).

[7] FRIEBEL, T., AND BIEMUELLER, S. How to deal with lock holder preemption. Xen Summit North America (2008).

[8] GIDENSTAM, A., AND PAPATRIANTAFILOU, M. LFthreads: A lock-free thread library. In Principles of Distributed Systems. Springer, 2007, pp. 217–231.

[9] HOWELL, J., PARNO, B., AND DOUCEUR, J. R. How to run POSIX apps in a minimal picoprocess. In 2013 USENIX ATC (2013), pp. 321–332.

[10] JACOBSON, V., AND FELDERMAN, R. Speeding up networking. Linux Conference Australia (2006).

[11] JONES, R. A. A network performance benchmark (revision 2.0). Tech. rep., Hewlett Packard, 1995.

[12] KLEIMAN, S. R. Vnodes: An architecture for multiple file system types in Sun UNIX. In USENIX Summer (1986), vol. 86, pp. 238–247.

[13] LIANG, S. The Java Native Interface: Programmer's Guide and Specification. Addison-Wesley, 1999.

[14] MADHAVAPEDDY, A., MORTIER, R., ROTSOS, C., SCOTT, D., SINGH, B., GAZAGNAIRE, T., SMITH, S., HAND, S., AND CROWCROFT, J. Unikernels: Library operating systems for the cloud. In ASPLOS (2013), ACM.

[15] MARTINS, J., AHMED, M., RAICIU, C., OLTEANU, V., HONDA, M., BIFULCO, R., AND HUICI, F. ClickOS and the art of network function virtualization. In USENIX NSDI (2011).

[16] MASSALIN, H., AND PU, C. A lock-free multiprocessor OS kernel. ACM SIGOPS Operating Systems Review 26, 2 (1992), 108.

[17] MCKENNEY, P. E., AND SLINGWINE, J. D. Read-copy update: Using execution history to solve concurrency problems. In Parallel and Distributed Computing and Systems (1998), pp. 509–518.

[18] MEGIDDO, N., AND MODHA, D. Outperforming LRU with an adaptive replacement cache algorithm. Computer 37, 4 (April 2004), 58–65.

[19] MICHAEL, M. M., AND SCOTT, M. L. Nonblocking algorithms and preemption-safe locking on multiprogrammed shared memory multiprocessors. Journal of Parallel and Distributed Computing 51, 1 (1998), 1–26.

[20] NISHTALA, R., FUGAL, H., GRIMM, S., KWIATKOWSKI, M., LEE, H., LI, H. C., MCELROY, R., PALECZNY, M., PEEK, D., SAAB, P., ET AL. Scaling Memcache at Facebook. In USENIX NSDI (2013).

[21] PORTER, D. E., BOYD-WICKIZER, S., HOWELL, J., OLINSKY, R., AND HUNT, G. C. Rethinking the library OS from the top down. ACM SIGPLAN Notices 46, 3 (2011), 291–304.

[22] PRINTEZIS, T. Garbage collection in the Java HotSpot virtual machine, 2005.

[23] RIZZO, L. netmap: a novel framework for fast packet I/O. In USENIX ATC (2012).

[24] ROMER, T. H., OHLRICH, W. H., KARLIN, A. R., AND BERSHAD, B. N. Reducing TLB and memory overhead using online superpage promotion. In Computer Architecture, 1995. Proceedings., 22nd Annual International Symposium on (1995), IEEE, pp. 176–187.

[25] RUSSELL, R. virtio: towards a de-facto standard for virtual I/O devices. ACM SIGOPS Operating Systems Review 42, 5 (2008), 95–103.

[26] SOLTESZ, S., PÖTZL, H., FIUCZYNSKI, M. E., BAVIER, A., AND PETERSON, L. Container-based operating system virtualization: a scalable, high-performance alternative to hypervisors. In ACM SIGOPS Operating Systems Review (2007), vol. 41, ACM, pp. 275–287.

[27] TENE, G., IYENGAR, B., AND WOLF, M. C4: the continuously concurrent compacting collector. ACM SIGPLAN Notices 46, 11 (2011), 79–88.

[28] UHLIG, V., LEVASSEUR, J., SKOGLUND, E., AND DANNOWSKI, U. Towards scalable multiprocessor virtual machines. In Virtual Machine Research and Technology Symposium (2004), pp. 43–56.

[29] VMWARE INC. ESX Server 2 - architecture and performance implications. Tech. rep., VMware, 2005.

[30] WALDSPURGER, C. A. Memory resource management in VMware ESX server. SIGOPS Oper. Syst. Rev. 36, SI (Dec. 2002), 181–194.

[31] ZHAO, W., WANG, Z., AND LUO, Y. Dynamic memory balancing for virtual machines. ACM SIGOPS Operating Systems Review 43, 3 (2009), 37–47.
