Virtual Machine Monitors

Virtual Machine Monitors

spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth ADRIAN PERRIG & TORSTEN HOEFLER Our small quiz Networks and Operating Systems (252-0062-00) Chapter 11: Virtual Machine Monitors . True or false (raise hand) . Spooling can be used to improve access times A SIGINT in time saves a kill -9 . Buffering can cope with device speed mismatches . The Linux kernel identifies devices using a single number . From userspace, devices in Linux are identified through files . Standard BSD sockets require two or more copies at the host . Network protocols are processed in the first level interrupt handler . The second level interrupt handler copies the packet data to userspace . Deferred procedure calls can be executed in any process context . Unix mbufs (and skbufs) enable protocol-independent processing . Network I/O is not performance-critical . NAPI’s design aims to reduce the CPU load . NAPI uses polling to accelerate packet processing . TCP offload reduces the server CPU load . TCP offload can accelerate applications 2 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth Receive-side scaling Receive-side scaling . Observations: . Too much traffic for one core to handle Flow table Received . Cores aren’t getting any faster packet Must parallelize across cores . Key idea: handle different flows on different cores pointer Flow state: . But: how to determine flow for each packet? • IP src + dest • Ring buffer . Can’t do this on a core: same problem! • TCP src + dest • Message-signalled interrupt Etc. Solution: demultiplex on the NIC Hash of . DMA packets to per-flow buffers / queues packet . Send interrupt only to core handling flow header DMA Core to address interrupt spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth Receive-side scaling . Can balance flows across cores . Note: doesn’t help with one big flow! . Assumes: . n cores processing m flows is faster than one core . Hence: . Network stack and protocol graph must scale on a multiprocessor. Multiprocessor scaling: topic for later (see DPHPC class) Virtual Machine Monitors Literature: Barham et al.: Xen and the art of virtualization and Anderson, Dahlin: Operating Systems: Principles and Practice, Chapter 14 6 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth Virtual Machine Monitors What is a Virtual Machine Monitor? . Basic definitions . Virtualizes an entire (hardware) machine . Why would you want one? . Contrast with OS processes . Structure . Interface provided is “illusion of real hardware” . How does it work? . Applications are therefore complete Operating Systems themselves . Terminology: Guest Operating Systems . CPU . MMU . Memory . Old idea: IBM VM/CMS (1960s) . Devices . Recently revived: VMware, Xen, Hyper-V, kvm, etc. • Acknowledgement: . Network Thanks to Steve Hand for some of the slides! spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth VMMs and hypervisors Why would you want one? . Server consolidation (program assumes own machine) . Performance isolation . Backward compatibility . Cloud computing (unit of selling cycles) App App App App Some folks . OS development/testing distinguish the . Something under the OS: replay, auditing, trusted computing, Virtual Machine Guest Guest Monitor from the rootkits operating operating Hypervisor system system (we won’t) Creates illusion of VMM VMM hardware Hypervisor Real hardware spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth Running multiple OSes on one machine Server consolidation . Application . Many applications compatibility assume they have . I use Debian for almost everything, the machine to but I edit slides in themselves App App App App App App PowerPoint . Each machine is Application Application . Some people Application compile Barrelfish in mostly idle a Debian VM over Windows 7 with Consolidate Hyper-V servers onto a single physical . Backward machine compatibility Hypervisor Hypervisor . Nothing beats a Windows 98 virtual Real hardware machine for playing Real hardware old computer games spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth Resource isolation Cloud computing . Surprisingly, . Selling computing modern OSes do capacity on Application not have an Application demand abstraction for a . E.g. Amazon EC2, Application Application single application Hypervisor GoGrid, etc. Application Application Application Application Application Real hardware . Performance Hypervisor . Hypervisors Application Real hardware Application isolation can be Hypervisor decouple Application critical in some Real hardware Application allocation of Hypervisor Application enterprises Real hardware Application resources (VMs) Hypervisor . Use virtual Real hardware from provisioning Hypervisor Real hardware of infrastructure Hypervisor machines as resource (physical machines) Real hardware containers spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth Operating System development Other cool applications… . Building and . Tracing testing a new OS . Debugging without needing . Execution replay to reboot real . Lock-step Visual Visual Studio Editor hardware Tracer Application Application Compiler execution . VMM often gives you more . Live migration information about . Rollback faults than real . Speculation hardware anyway . Etc…. Hypervisor Hypervisor Real hardware Real hardware spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth How does it all work? Hosted VMMs . Note: a hypervisor is basically an OS . With an “unusual API” . Many functions quite similar: . Multiplexing resources . Scheduling, virtual memory, device drivers App App Examples: . Different: • VMware workstation . Creating the illusion of hardware to “applications” Guest • Linux KVM . Guest OSes are less flexible in resource requirements operating • Microsoft Hyper-V Application Application system • VirtualBox VMM Host operating system Real hardware spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth Hypervisor-based VMMs How to virtualize… . The CPU (s)? . The MMU? . Physical memory? . Devices (disks, etc.)? App App App App Mgmt. Examples: . The Network Console • VMware ESX Console • IBM VM/CMS Guest Guest and? (Mgmt) • Xen operating operating operating system system system VMM VMM VMM Hypervisor Real hardware spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth Virtualizing the CPU Virtualizing the CPU . A CPU architecture is strictly virtualizable if it can be perfectly . A strictly virtualizable processor can execute a complete native emulated over itself, with all non-privileged instructions Guest OS executed natively . Guest applications run in user mode as before . Guest kernel works exactly as before . Privileged instructions trap . Kernel-mode (i.e., the VMM) emulates instruction . Problem: x86 architecture is not virtualizable . Guest’s kernel mode is actually user mode . About 20 instructions are sensitive but not privileged Or another, extra privilege level (such as ring 1) . Mostly segment loads and processor flag manipulation . Examples: IBM S/390, Alpha, PowerPC spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth Non-virtualizable x86: example Solutions . PUSHF/POPF instructions 1. Emulation: emulate all kernel-mode code in software . Push/pop condition code register . Very slow – particularly for I/O intensive workloads . Includes interrupt enable flag (IF) . Used by, e.g., SoftPC 2. Paravirtualization: modify Guest OS kernel . Unprivileged instructions: fine in user space! . Replace critical calls with explicit trap instruction to VMM . IF is ignored by POPF in user mode, not in kernel mode . Also called a “HyperCall” (used for all kinds of things) . Used by, e.g., Xen VMM can’t determine if Guest OS wants interrupts disabled! 3. Binary rewriting: . Can’t cause a trap on a (privileged) POPF . Protect kernel instruction pages, trap to VMM on first IFetch . Prevents correct functioning of the Guest OS . Scan page for POPF instructions and replace . Restart instruction in Guest OS and continue . Used by, e.g., VMware 4. Hardware support: Intel VT-x, AMD-V . Extra processor mode causes POPF to trap spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth Virtualizing the MMU Virtual/Physical/Machine . Hypervisor allocates memory to VMs . Guest assumes control over all physical memory Guest Guest Machine . VMM can’t let Guest OS to install mappings Virtual AS Physical AS Memory . Definitions needed: . Virtual address: a virtual address in the guest 17 . Physical address: as seen by the guest Guest 1: . Machine address: real physical address 2 As seen by the Hypervisor 5 6 9 Guest 2: 5 spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth MMU virtualization Paravirtualization approach . Critical for performance, challenging to make fast, especially . Guest OS creates page tables the hardware uses SMP . VMM must validate all updates to page tables . Hot-unplug unnecessary virtual CPUs . Requires modifications to Guest OS . Use multicast TLB flush paravirtualizations etc. Not quite enough… . Xen supports 3 MMU virtualization modes . VMM must check all writes to PTEs 1. Direct (“Writable”) pagetables . Write-protect all PTEs to the Guest kernel 2. Shadow pagetables . Add a HyperCall to update PTEs 3. Hardware Assisted Paging . Batch updates to avoid trap overhead . OS Paravirtualization compulsory for #1, optional (and very . OS is now aware of machine addresses beneficial) for #2&3 . Significant overhead! spcl.inf.ethz.ch spcl.inf.ethz.ch @spcl_eth @spcl_eth Paravirtualizing the MMU Writeable Page Tables : 1 – Write fault . Guest OSes allocate and manage own PTs . Hypercall to change PT base guest reads . VMM must validate

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    11 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us