The Real Difference Between Emulation and Paravirtualization of High-Throughput I/O Devices

Total pages: 16 | File type: PDF, size: 1020 KB

The Real Difference Between Emulation and Paravirtualization of High-Throughput I/O Devices

Research Thesis

Submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science

Arthur Kiyanovski

Submitted to the Senate of the Technion - Israel Institute of Technology
Haifa, Av 5777, August 2017

Technion - Computer Science Department - M.Sc. Thesis MSC-2017-19 - 2017

This research was carried out under the supervision of Prof. Dan Tsafrir, in the Faculty of Computer Science.

Acknowledgements

I would like to dedicate this thesis to my late grandfather, Ben-Zion Kiyanovski, who passed away while I was doing the research for this thesis. My grandfather fought courageously against the Nazis in World War II. Without people like him, none of us would be here today. I would like to thank my dear wife, Assya, for her infinite support; without it I would not have been able to finish this research. I would like to thank my advisor, Prof. Dan Tsafrir, for his help and guidance along the way. The research leading to the results presented in this paper was partially supported by the Israel Science Foundation (grant No. 605/12). The generous financial help of the Technion is gratefully acknowledged.

Contents

List of Figures
Abstract
Abbreviations and Notations
1 Introduction
2 Background
  2.1 TCP Essentials
    2.1.1 TCP Checksum Offloading
    2.1.2 TCP Segmentation Offloading
    2.1.3 TCP Congestion Control
    2.1.4 TCP SRTT
  2.2 Linux Network Stack Implementation Essentials
    2.2.1 The Socket Buffer
    2.2.2 NAPI
  2.3 QEMU Essentials
    2.3.1 Main Threads of QEMU
    2.3.2 The qemu_global_mutex
  2.4 The Intel Pro/1000 PCI/PCI-X NICs (Bare Metal E1000)
    2.4.1 Control Registers
    2.4.2 Main Actions During Normal Operation of the Bare Metal E1000
  2.5 The QEMU Emulated Intel Pro/1000 PCI/PCI-X NIC (E1000)
    2.5.1 Interrupt Coalescing
  2.6 The QEMU Virtio-Net Paravirtual NIC
    2.6.1 Interrupt and Kick Suppression
    2.6.2 TX Interrupts
  2.7 Virtio-Net TCP Send Sequences in Throughput Workloads
    2.7.1 Virtio-Net Dual Core Send Sequence
    2.7.2 Virtio-Net Single Core TCP Throughput Send Sequence
3 Motivation
  3.1 Interposition
  3.2 Emulated I/O Devices
  3.3 Paravirtual I/O Devices
  3.4 Emulated vs Paravirtual Devices
    3.4.1 Guest Modification
    3.4.2 Performance
  3.5 Emulated vs Paravirtual NICs in Different Hypervisors
  3.6 Emulation vs Paravirtualization Comparison Model
4 Experimental Setup
  4.1 Hardware Setup
  4.2 Benchmarks
    4.2.1 Single Core Throughput Benchmark
    4.2.2 Dual Core Throughput Benchmark
5 Single Core Configuration
  5.1 Baseline Comparison
  5.2 Removal of TCP Checksum Calculation
  5.3 Removal of TCP Segmentation
  5.4 Improved Interrupt Coalescing
    5.4.1 ITR and TADV Conflict
    5.4.2 Static Set Interrupt Rate
    5.4.3 Interrupt Rate Considering ITR
    5.4.4 Evaluation
  5.5 Send from the I/O Thread
    5.5.1 Interrupt Coalescing in Virtio-Net
  5.6 Exposing PCI-X to Avoid Bounce Buffers
  5.7 Dropping Packets to Improve TSO Batching in Linux Guests
  5.8 Vectorized Send
  5.9 SRTT Calculation Algorithm Bug in Linux
    5.9.1 SRTT Calculation in the Linux Kernel
    5.9.2 Bug Description
    5.9.3 Effects of the Bug
    5.9.4 Bug Fix
  5.10 Final Throughput Comparison
  5.11 Improvements Summary
6 Initial Work on a Dual Core Configuration
  6.1 Baseline Comparison
  6.2 Scalability of the Emulated E1000
  6.3 Sidecore
    6.3.1 Partial Sidecore Implementation
7 Related Work
8 Future Work
  8.1 Challenges on the Way to Full Sidecore Emulation of E1000
    8.1.1 ICR
    8.1.2 IMC and IMS
9 Conclusion

List of Figures

2.1 CWND values over time, for two TCP connections with the same source and destination, one starting transmission at t=0 and the other at t=100 s, both using the Cubic congestion avoidance algorithm
2.2 Baseline Eh register exits
2.3 Eh emulation of ICR reading (some implementation details removed)
2.4 Vh dual core setup, single batch send sequence
2.5 Vh single core setup, single batch send sequence
2.6 Baseline Vh exits
3.1 Google search results illustrating the problems with VMware Tools
3.2 Throughput comparison of Eb emulation vs a paravirtual NIC in different hypervisors; in QEMU/KVM and VirtualBox the paravirtual device is virtio-net, and in VMware Workstation it is vmxnet3
5.1 Throughput comparison between baseline Vh and baseline Eh
5.2 Change in throughput caused by removing the calculation of TCP checksum in Eh
5.3 Change in throughput caused by removing the TCP segmentation code in Eh
5.4 Throughput difference achieved when using the two types of interrupt coalescing heuristics described in subsections 5.4.2 (static) and 5.4.3 (ITR-sensitive)
5.5 Change in throughput caused by using the improved static interrupt coalescing setting in Eh
5.6 Change in throughput caused by moving the sending of packets from the VCPU thread to the I/O thread in Eh
5.7 Change in throughput caused by enabling PCI-X mode in Eh
5.8 Vh compared to Eh with improvements 1-5
5.9 Throughput and CWND values over time, with Eh, running netperf TCP_STREAM with the default 16KB message size; at around 25 seconds packet dropping is enabled
5.10 Change in throughput caused by adding packet dropping for better TSO batching in Eh
5.11 Change in throughput caused by adding packet dropping for better TSO batching in Vh
5.12 Change in throughput caused by using vectorized sending in Eh
5.13 The routine in Linux kernel 3.13 that calculates the new SRTT given the RTT of the currently ACKed packet and the previous SRTT (irrelevant code omitted)
5.14 The code of tcp_rtt_estimator() after fixing the bug in SRTT calculation (irrelevant code omitted)
5.15 tp->srtt values as they react to RTT values over time, in both the original implementation of tcp_rtt_estimator() (left) and the fixed version (right)
5.16 Difference in throughput of Eh with all improvements but packet dropping, with the SRTT bug and after fixing it
5.17 Difference in throughput of Vh with the SRTT bug and after fixing it
5.18 Throughput comparison of the best versions of Vh and Eh
5.19 Throughput increase achieved by adding our improvements to Eh
5.20 Throughput increase caused by each of the improvements added to Eh, out of the highest achieved throughput
6.1 Throughput comparison between baseline Vh and baseline Eh when running our dual core basic throughput benchmark
6.2 Throughput difference between the best version of Vh when run on a single core vs on two cores
6.3 Throughput difference between the best version of Eh when run on a single core vs on two cores
6.4 Throughput difference between the best single core version of Eh and the best partial sidecore-emulated version of Eh
6.5 Throughput difference between the best partial sidecore-emulated version of Eh and the best version of Vh when running our dual core basic throughput benchmark
6.6 Throughput difference between the best partial sidecore-emulated version of Eh and the best version of Vh when running our dual core basic throughput benchmark without packet dropping
8.1 State machine of the interrupt raising algorithm for sidecore emulation of IMC and IMS

Abstract

Emulation of high-throughput Input/Output (I/O) devices for virtual machines (VMs) is appealing because an emulated I/O device works out of the box, without the need to install a new device driver in the VM when moving the VM from one hypervisor to another. The problem is that fully emulating a hardware device can be costly due to multiple virtualization exits. Installations therefore often prefer paravirtual I/O devices, which reduce the number of exits by making VMs aware that they are being virtualized, at the cost of having to install a new device driver when moving from one hypervisor to another.
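Sections 2.1.1 and 5.2 of the thesis concern TCP checksum offloading: the guest skips computing the checksum in software and lets the (virtual) NIC fill it in. For reference, the computation being offloaded is the standard Internet checksum of RFC 1071; a minimal illustrative sketch (not code from the thesis) looks like this:

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 Internet checksum: one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # add the next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF
```

A useful property for verification: summing a buffer together with its stored checksum yields zero, which is exactly the check a receiver (or receiving NIC) performs.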
Recommended publications
  • Oracle VM VirtualBox User Manual
    Oracle VM VirtualBox® User Manual, Version 5.1.20, © 2004-2017 Oracle Corporation, http://www.virtualbox.org
    Contents:
    1 First steps
      1.1 Why is virtualization useful?
      1.2 Some terminology
      1.3 Features overview
      1.4 Supported host operating systems
      1.5 Installing VirtualBox and extension packs
      1.6 Starting VirtualBox
      1.7 Creating your first virtual machine
      1.8 Running your virtual machine
        1.8.1 Starting a new VM for the first time
        1.8.2 Capturing and releasing keyboard and mouse
        1.8.3 Typing special characters
        1.8.4 Changing removable media
        1.8.5 Resizing the machine's window
        1.8.6 Saving the state of the machine
      1.9 Using VM groups
      1.10 Snapshots
        1.10.1 Taking, restoring and deleting snapshots
        1.10.2 Snapshot contents
      1.11 Virtual machine configuration
      1.12 Removing virtual machines
      1.13 Cloning virtual machines
      1.14 Importing and exporting virtual machines
      1.15 Global Settings
  • Understanding Full Virtualization, Paravirtualization, and Hardware Assist
    VMware: Understanding Full Virtualization, Paravirtualization, and Hardware Assist
    Contents:
    Introduction
    Overview of x86 Virtualization
    CPU Virtualization
      The Challenges of x86 Hardware Virtualization
      Technique 1 - Full Virtualization using Binary Translation
      Technique 2 - OS Assisted Virtualization or Paravirtualization
      Technique 3 - Hardware Assisted Virtualization
    Memory Virtualization
    Device and I/O Virtualization
    Summarizing the Current State of x86 Virtualization Techniques
      Full Virtualization with Binary Translation is the Most Established Technology Today
      Hardware Assist is the Future of Virtualization, but the Real Gains Have…
  • Paravirtualization (PV)
    Full and Para Virtualization - Dr. Sanjay P. Ahuja, Ph.D., Fidelity National Financial Distinguished Professor of CIS, School of Computing, UNF

    x86 Hardware Virtualization. The x86 architecture offers four levels of privilege, known as Rings 0, 1, 2 and 3, to operating systems and applications to manage access to the computer hardware. While user-level applications typically run in Ring 3, the operating system needs direct access to the memory and hardware and must execute its privileged instructions in Ring 0. [Figure: x86 privilege level architecture without virtualization]

    Technique 1: Full Virtualization using Binary Translation. This approach relies on binary translation to trap (into the VMM) and virtualize certain sensitive and non-virtualizable instructions, replacing them with new sequences of instructions that have the intended effect on the virtual hardware. Meanwhile, user-level code is executed directly on the processor for high-performance virtualization. [Figure: binary translation approach to x86 virtualization] This combination of binary translation and direct execution provides full virtualization, as the guest OS is completely decoupled from the underlying hardware by the virtualization layer. The guest OS is not aware it is being virtualized and requires no modification. The hypervisor translates all operating system instructions on the fly at run time and caches the results for future use, while user-level instructions run unmodified at native speed. VMware ESXi and Microsoft Virtual Server are examples of full virtualization. The performance of full virtualization may not be ideal, because binary translation at run time is time-consuming and can incur a large performance overhead.
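The translate-the-sensitive-instructions idea described above can be illustrated with a deliberately tiny toy (instruction names invented for illustration; this is not a real binary translator): scan a basic block, leave innocuous instructions for direct execution, and rewrite sensitive ones into explicit traps to the VMM.

```python
# Toy stand-ins for privileged/sensitive x86 operations (illustrative names).
SENSITIVE = {"cli", "sti", "hlt", "mov_cr3"}

def translate_block(block):
    """Rewrite sensitive instructions into hypervisor traps; pass the rest through."""
    out = []
    for insn in block:
        if insn in SENSITIVE:
            out.append(f"trap_to_vmm({insn})")  # must be virtualized by the VMM
        else:
            out.append(insn)  # safe: executes directly at native speed
    return out
```

A real translator also caches translated blocks so each block is rewritten only once, which is the "caches the results for future use" step mentioned above.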
  • Vulnerability Assessment
    Security Patterns for AMP-based Embedded Systems. Doctoral thesis (dissertation) to be awarded the degree of Doktor-Ingenieur (Dr.-Ing.), submitted by Pierre Schnarz from Alzenau, approved by the Department of Informatics, Clausthal University of Technology, 2018. Dissertation Clausthal, SSE-Dissertation 19, 2018. Chairperson of the Board of Examiners: Prof. Dr. Jörg P. Müller. Chief Reviewer: Prof. Dr. Andreas Rausch. 2nd Reviewer: Prof. Dr. Joachim Wietzke. 3rd Reviewer: Prof. Dr. Jörn Eichler. Date of oral examination: December 19, 2018. For Katrin.

    Declaration: I hereby declare that, except where specific reference is made to the work of others, the contents of this dissertation are original and have not been submitted in whole or in part for consideration for any other degree or qualification in this, or any other, university. This dissertation is my own work and contains nothing which is the outcome of work done in collaboration with others, except as specified in the text and acknowledgements. Pierre Schnarz, April 2019.

    Acknowledgements: "The probability that we may fail in the struggle ought not to deter us from the support of a cause we believe to be just." - Abraham Lincoln. Many people accompanied me on the long road to completing this work, and I would like to thank all those who helped make it possible. I thank Prof. Dr. Joachim Wietzke for supervising my doctorate; working in his research group, and beyond it, gave me the persistence needed to bring this large project to completion.
  • Paravirtualization Poojitha, Umit Full Virtualization
    Paravirtualization - Poojitha, Umit

    Full virtualization:
    • Unmodified guest OS; it does not know about the hypervisor
    • Going back and forth between the hypervisor and an MMU-visible shadow page table is inefficient
    • Unprivileged instructions that are sensitive are difficult to handle (binary translation, as in VMware ESX)
    • The guest cannot access hardware in privileged mode
    • What if the guest OS wants real resource information (timers, superpages)?

    Paravirtualization:
    • Modify the guest OS; it knows about the hypervisor
    • Applications are not modified
    • Some exposure to hardware and real resources, such as time
    • Improved performance (fewer redirections; the guest OS can use real hardware resources in a secure manner)
    • Allows virtualization without hardware support

    Discussion - Xen: memory management, CPU, protection, exceptions, system calls, interrupts, time, device I/O.

    Protection:
    • The privilege of the guest OS must be lower than Xen's
    • x86 offers four privilege levels: generally Ring 3 for applications and Ring 0 for the OS
    • The guest OS is downgraded to Ring 1 or 2, while Xen runs in Ring 0

    Exceptions (system calls, page faults):
    • The guest registers a descriptor table of exception handlers with Xen
    • No back-and-forth between Xen and the guest OS as in full virtualization
    • Fast system-call handlers: when an application executes a system call, it goes directly to the guest OS handler in Ring 1, not to Xen (the page-fault handler, however, must go through Xen)
    • Handlers are validated before being installed in the hardware exception table

    Time:
    • The guest OS can see real time, virtual time, and wall-clock time
    • Why see time? E.g., TCP needs it for timeouts and RTT estimates

    Memory management:
    • A TLB flush on every guest-to-guest context switch is undesirable
    • A software TLB can be virtualized without flushing between switches
    • A hardware TLB can be tagged with an address-space identifier
  • KVM: Linux-Based Virtualization
    KVM: Linux-based Virtualization - Avi Kivity, [email protected], Columbia University Advanced OS/Virtualization course

    Agenda: quick view; features; execution loop; memory management; Linux integration; paravirtualization; I/O; power management; non-x86 KVM; real time; Xenner; community; conclusions.

    At a glance: KVM, the Kernel-based Virtual Machine, is a Linux kernel module that turns Linux into a hypervisor. It requires hardware virtualization extensions and supports multiple architectures: x86 (32- and 64-bit), s390 (mainframes), PowerPC, and ia64 (Itanium). It offers competitive performance and features, advanced memory management, and tight integration into Linux.

    The KVM approach: reuse Linux code as much as possible; focus on virtualization and leave other things to their respective developers; integrate well into the existing infrastructure, codebase, and mindset; benefit from semi-related advances in Linux.

    [Architecture diagrams: VMware, with the hypervisor and its own drivers below the VMs; Xen, with Domain 0 holding the drivers alongside the VMs; KVM, with VMs running as ordinary Linux processes and the KVM modules and drivers inside the Linux kernel; and the KVM process model, with guest vCPUs as tasks alongside ordinary tasks]

    KVM model benefits: reuse the Linux scheduler, memory management, and bringup; reuse the Linux driver portfolio; reuse the I/O stack; reuse the management stack.

    KVM execution model: three modes for thread execution instead of the traditional two: user mode, kernel mode, and guest mode. A virtual CPU is implemented using a Linux thread, and the Linux scheduler is responsible for scheduling a virtual CPU as it would any normal thread.

    Copyright © 2007 Qumranet, Inc. All rights reserved.
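The execution model in these slides, where a vCPU is an ordinary schedulable thread that leaves guest mode whenever the host must intervene, can be caricatured in a few lines. This is a toy model for illustration only; the real interface is the KVM ioctl API on /dev/kvm, which is not shown here.

```python
import threading

def vcpu_thread(workload, exits):
    """Each vCPU is just a host thread; the host scheduler decides when it runs."""
    for op in workload:
        if op == "compute":
            pass  # pure guest-mode execution: no exit required
        else:
            exits.append(op)  # exit guest mode so the host can emulate I/O

# Two vCPUs running concurrently as ordinary threads (toy workloads).
exits = []
threads = [threading.Thread(target=vcpu_thread, args=(w, exits))
           for w in (["compute", "mmio_read", "compute"], ["out_port", "compute"])]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(exits))  # → ['mmio_read', 'out_port']
```

The point of the model: because vCPUs are plain threads, KVM inherits the Linux scheduler for free, which is exactly the "reuse" benefit the slides list.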
  • Vmware Security Best Practices
    VMware Security Best Practices - Jeff Turley, Lewis University, April 25, 2010

    Abstract: Virtualization of x86 server infrastructure is one of the hottest trends in information technology. With many organizations relying on virtualization technologies from VMware and other vendors to run their mission-critical systems, security has become a concern. In this paper, I will discuss the different types of virtualization technologies in use today. I will also discuss the architecture of VMware vSphere and security best practices for a VMware vSphere environment.

    Table of contents: Introduction; Business Case for Virtualization; Understanding Virtualization (CPU Virtualization; The Hypervisor); Understanding VMware vSphere (vSphere Architecture; vSphere Components; The Virtual Data Center; Distributed Services; Virtual Networking; Virtual Storage Architecture; vCenter Server); Securing a VMware Virtual Environment (Network Security; Virtual Machine Security; ESX/ESXi Host Security; vCenter Server Security; Console Operating System (COS) Security); Conclusion; References.

    Table of figures: Figure 1: Operating system and applications running directly on physical hardware. Figure 2: Operating system and applications running with…
  • Open Enterprise Server 2015 SP1 OES Cluster Services Implementation Guide for Vmware
    Open Enterprise Server 2015 SP1: OES Cluster Services Implementation Guide for VMware, July 2020. Legal notices: for information about legal notices, trademarks, disclaimers, warranties, export and other use restrictions, U.S. Government rights, patent policy, and FIPS compliance, see https://www.microfocus.com/about/legal/. Copyright © 2020 Micro Focus Software, Inc. All Rights Reserved.
    Contents:
    About This Guide
    1 Getting Started with OES Cluster Services in an ESXi Virtualized Environment
      1.1 Configuration Overview
      1.2 Understanding Virtualization
        1.2.1 Where Is Virtualization Today?
        1.2.2 Why Use Virtualization?
        1.2.3 Why Use Novell Cluster Services?
        1.2.4 Server versus Service Virtualization
      1.3 Architectural Scenarios
        1.3.1 Only Service Virtualization
        1.3.2 Only Server Virtualization
        1.3.3 NCS on Host Machines Managing Xen or KVM Guest Machines
        1.3.4 NCS Managing Services on Guest Machines
        1.3.5 NCS Managing Services on a Cluster of Physical and Guest Machines
      1.4 Design and Architecture Considerations
        1.4.1 Challenges of Server Virtualization
        1.4.2 Cost of Server Virtualization
        1.4.3 Challenges of Service Virtualization
        1.4.4 Cost of Service Virtualization
        1.4.5 Fault Tolerance and Scalability
        1.4.6 Planning Considerations
        1.4.7 Full Virtualization versus Paravirtualization
        1.4.8 Comparing Architectures
    2 Planning for the Virtualized Environment
      2.1 Things to Explore
      2.2 Infrastructure Dependencies
      2.3 LAN and SAN Connectivity…
  • Operating System Virtualization
    Introduction to Virtualisation Technology - Predrag Buncic, CERN, CERN School of Computing 2009

    History: Credit for bringing virtualization into computing goes to IBM. IBM VM/370 was a reimplementation of CP/CMS and was made available in 1972; it added virtual memory hardware and operating systems to the System/370 series. Even in the 1970s, anyone with any sense could see the advantages virtualization offered: it separates applications and operating systems from the hardware, and with VM/370 you could even run MVS on top, along with other operating systems such as Unix. In spite of that, VM/370 was not a great commercial success, but the idea of abstracting computer resources continued to develop.

    Resource virtualization, i.e. virtualization of specific computer system resources:
    • Memory virtualization: aggregates RAM resources from networked systems into a virtualized memory pool
    • Network virtualization: creation of a virtualized network addressing space within or across network subnets; multiple links can be combined to work as though they offered a single, higher-bandwidth link
    • Virtual memory: allows uniform, contiguous addressing of physically separate and non-contiguous memory and disk areas
    • Storage virtualization: abstracting logical storage from physical storage, e.g. RAID, disk partitioning, logical volume management

    Metacomputing: A computer cluster is a group of linked computers working together closely so that in many respects they form a single computer; the components of a cluster are commonly connected to each other through fast local area networks. Grids are usually computer clusters, but are focused more on throughput, like a computing utility, than on running fewer, tightly-coupled jobs.
  • Virtualization Basics: Understanding Techniques and Fundamentals
    Virtualization Basics: Understanding Techniques and Fundamentals - Hyungro Lee, School of Informatics and Computing, Indiana University, 815 E 10th St., Bloomington, IN 47408, [email protected]

    Abstract: Virtualization is a fundamental part of cloud computing, especially in delivering Infrastructure as a Service (IaaS). Exploring different techniques and architectures of virtualization helps us understand the basics of virtualization and of server consolidation in the cloud on the x86 architecture. This paper describes virtualization technologies, architectures and optimizations regarding the sharing of CPU, memory and I/O devices on x86 virtual machine monitors.

    Early technologies and developments in virtualization were accomplished by companies such as IBM from 1967 and VMware from 1998, an example of growing involvement in virtualization technologies with the cloud. In open-source communities, Xen, KVM, Linux-VServer, LXC and others have supported virtualization on different platforms with different approaches; this paper discusses x86 architecture virtualization along with these historical changes. In cloud computing, Infrastructure-as-a-Service (IaaS) provides on-demand virtual machine instances with virtualization technologies, and has been broadly used to provide required compute resources in shared-resource environments. Amazon Web Services (AWS), Google Compute Engine, Microsoft Windows Azure, and HP Cloud offer commercial cloud services; OpenStack, Eucalyptus, SaltStack, Nimbus, and many others provide open-source private cloud platforms with community support in development.

    Categories and Subject Descriptors: C.0 [General]: Hardware/software interface; C.4 [Performance of Systems]: Performance attributes; D.4.7 [Operating Systems]: Organization and design. General Terms: Performance, Design.
  • Virtio Networking: a Case Study of I/O Paravirtualization
    Virtio networking: A case study of I/O paravirtualization - Ing. Vincenzo Maffione, 1/12/2016

    Outline: 1. How NICs work; 2. How Linux NIC drivers work; 3. How NICs are emulated; 4. Performance analysis of emulated e1000; 5. I/O paravirtualization ideas; 6. The VirtIO standard; 7. The VirtIO network adapter (virtio-net); 8. Performance analysis of virtio-net.

    How NICs work (1): Ethernet Network Interface Cards (NICs) are used to attach hosts to Ethernet Local Area Networks (LANs). NICs are deployed everywhere - laptops, PCs, high-end machines in data centers - and many vendors and models are available, e.g. Intel, Broadcom, Realtek, Qualcomm, Mellanox. But how does a NIC work? How does the Operating System (OS) control it?

    How NICs work (2): All modern NICs are DMA-capable PCI devices exposing a model-specific set of registers. Most of the registers are memory-mapped, which means that x86 CPUs can access them with regular MOV instructions. Some registers can be mapped in the CPU I/O space, which means they can be accessed with IN/OUT instructions on x86 CPUs. On x86, memory-mapped I/O is preferred over port I/O because it is more flexible: IN/OUT instructions can only use DX or an immediate operand as the I/O port, and EAX/AX/AL as the value. Direct Memory Access (DMA) allows PCI devices to read (write) data from (to) memory without CPU intervention, which is a fundamental requirement for high-performance devices. ((Very) old devices, e.g. ISA, non-DMA-capable, are not considered here. In the PCI standard, DMA is also known as Bus Mastering.)

    How NICs work (3): [Diagram: BARs mapping the NIC's registers - BAR0 and BAR1 in the memory space, BAR2 in the I/O space]
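An emulated NIC such as QEMU's e1000 (analyzed at length in the thesis above) implements such a register file in software: every guest MMIO access traps to the hypervisor, which computes the register's behavior. A common hardware convention, which the e1000 family's interrupt cause register (ICR) follows, is "read to clear". The sketch below is a toy model of that semantics, not QEMU's actual code; the register offset is borrowed from the e1000 for illustration.

```python
class ToyNicRegisters:
    """Register file of a toy emulated NIC; reading ICR clears it."""
    ICR = 0x00C0  # interrupt cause register offset (e1000-style, illustrative)

    def __init__(self):
        self.regs = {}

    def raise_cause(self, bit):
        # Device-side: OR a new interrupt cause into ICR.
        self.regs[self.ICR] = self.regs.get(self.ICR, 0) | bit

    def mmio_read(self, offset):
        # Guest-side: an MMIO read trapped and emulated by the hypervisor.
        value = self.regs.get(offset, 0)
        if offset == self.ICR:
            self.regs[self.ICR] = 0  # reading ICR acknowledges all pending causes
        return value
```

Each such emulated read costs a virtualization exit, which is precisely the overhead that paravirtual designs like virtio-net try to avoid.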
  • Paper: Xen and the Art of Virtualization
    Xen and the Art of Virtualization - Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield; University of Cambridge Computer Laboratory, 15 JJ Thomson Avenue, Cambridge, UK, CB3 0FD; {firstname.lastname}@cl.cam.ac.uk

    Abstract: Numerous systems have been designed which use virtualization to subdivide the ample resources of a modern computer. Some require specialized hardware, or cannot support commodity operating systems. Some target 100% binary compatibility at the expense of performance. Others sacrifice security or functionality for speed. Few offer resource isolation or performance guarantees; most provide only best-effort provisioning, risking denial of service. This paper presents Xen, an x86 virtual machine monitor which allows multiple commodity operating systems to share conventional hardware in a safe and resource-managed fashion, but without sacrificing either performance or functionality. This is achieved by providing an idealized virtual machine abstraction to which operating systems such as Linux, BSD and Windows XP can be ported with minimal effort.

    1. Introduction: Modern computers are sufficiently powerful to use virtualization to present the illusion of many smaller virtual machines (VMs), each running a separate operating system instance. This has led to a resurgence of interest in VM technology. In this paper we present Xen, a high-performance resource-managed virtual machine monitor (VMM) which enables applications such as server consolidation [42, 8], co-located hosting facilities [14], distributed web services [43], secure computing platforms [12, 16] and application mobility [26, 37]. Successful partitioning of a machine to support the concurrent execution of multiple operating systems poses several challenges. Firstly, virtual machines must be isolated from one another: it is not acceptable for the execution of one to adversely affect the performance…