
Xen and the Art of Virtualization
Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt & Andrew Warfield, SOSP 2003
Additional source: Ian Pratt on Xen (Xen source),
www.cl.cam.ac.uk/research/srg/netos/papers/2005-xen-may.ppt

Paravirtualization

Virtualization approaches
- Full virtualization
  - OS sees exact h/w; OS runs unmodified
  - Requires a virtualizable architecture, or workarounds
  - Example: VMware
- Paravirtualization
  - OS knows about the VMM
  - Requires porting (source-code changes)
  - Some execution overhead
  - Examples: Xen, Denali

The Xen approach
- Support for unmodified application binaries (but not an unmodified OS) is essential
  - Important for app developers
  - The virtualized system exports the same Application Binary Interface (ABI)
- Modify the guest OS to be aware of virtualization
  - Gets around the problems of the x86 architecture
  - Allows better performance to be achieved
- Expose some effects of virtualization
  - A "translucent" VM: the OS can use this information to optimize for performance
- Keep the hypervisor layer as small and simple as possible
  - Resource management and device drivers run in the privileged VMM
  - Enhances security and resource isolation

Paravirtualization
- A solution to the issues with the x86 instruction set
  - Don't allow the guest OS to issue sensitive instructions
  - Replace sensitive instructions that don't trap with ones that will trap
- The guest OS makes "hypercalls" (like system calls) to interact with system resources
  - Allows the hypervisor to provide protection between VMs
- Exceptions are handled by registering a handler table with Xen
  - A fast handler for OS system calls is invoked directly
  - The page-fault handler is modified to read the faulting address from a replica location
- Guest OS changes are largely confined to arch-specific code
  - Compile for ARCH=xen instead of ARCH=i686
  - The original port of Linux required only 1.36% of the OS to be modified

Para-virtualization in Xen
- Arch xen_x86: like x86, but Xen hypercalls are required for privileged operations
  - Avoids binary rewriting
  - Minimizes the number of privilege transitions into Xen
  - Modifications are relatively simple and self-contained
- Modify the kernel to understand the virtualized environment
  - Wall-clock time vs. virtual processor time: Xen provides both types of alarm timer
  - Expose real resource availability: enables the OS to optimise its behaviour

x86 CPU virtualization
- Xen runs in ring 0 (most privileged)
- Rings 1/2 are for the guest OS, ring 3 for user space
- General Protection Fault if the guest attempts to use a privileged instruction
- Xen lives in the top 64MB of the linear address space
  - Segmentation is used to protect Xen, as switching page tables is too slow on standard x86
- Hypercalls jump into Xen in ring 0
- The guest OS may install a 'fast trap' handler
  - Allows direct user-space-to-guest-OS system calls
- MMU virtualisation: shadow vs. direct mode

x86_32
- Xen reserves the top of the 4GB virtual address space; segmentation protects Xen from the kernel
- System call speed is unchanged
- Xen 3.0 now supports >4GB memory with Physical Address Extension (and 64-bit builds)
[Diagram: 32-bit layout — User (ring 3) from 0GB, Kernel (ring 1) up to 3GB, Xen (ring 0) in the top segment below 4GB]

Xen VM interface: CPU
- The guest runs at lower privilege than the VMM
- Exception handlers must be registered with the VMM
- A fast system-call handler can be serviced without trapping to the VMM
- Hardware interrupts are replaced by a lightweight event-notification system
- Timer interface: both real and virtual time

Xen virtualizing the CPU
- Many processor architectures provide only 2 privilege levels (0/1)
  - Guest and apps share level 1, VMM in 0
  - Run the guest and its apps as separate "processes"
  - The guest OS uses the VMM to pass control between address spaces
  - A software TLB with address-space tags minimizes context-switch overhead

Xen: virtualizing the CPU on x86
- x86 provides 4 rings (even the VAX provided 4)
- Xen leverages the availability of multiple rings
  - Intermediate rings have not been used in practice since OS/2; this is x86-specific
- An OS written to use only rings 0 and 3 can be ported; the kernel must be modified to run in ring 1

CPU virtualization
- Exceptions that are called often:
  - Software interrupts for system calls
  - Page faults
- Allow the guest to register a 'fast' exception handler for system calls that is invoked directly by the CPU in ring 1, without switching to ring 0/Xen
  - The handler is validated before being installed in the hardware exception table, to make sure nothing executes with ring-0 privilege
- This doesn't work for page faults
  - Only code in ring 0 can read the faulting address from the CR2 register

[Figure slide: Xen]

Some Xen hypercalls
- See http://lxr.xensource.com/lxr/source/xen/include/public/xen.h

  #define __HYPERVISOR_set_trap_table  0
  #define __HYPERVISOR_mmu_update      1
  #define __HYPERVISOR_sysctl          35
  #define __HYPERVISOR_domctl          36

Xen VM interface: memory
- The guest cannot install highest-privilege-level segment descriptors; the top end of the linear address space is not accessible to it
- The guest has direct (untrapped) read access to hardware page tables; writes are trapped and handled by the VMM
- Physical memory presented to the guest is not necessarily contiguous

Memory virtualization choices
- Virtualizing the TLB is challenging
  - A software TLB can be virtualized without flushing TLB entries between VM switches
  - Hardware TLBs tagged with address-space identifiers can also be leveraged to avoid flushing between switches
  - The x86 TLB is hardware-managed and has no tags…
- Decisions:
  - Guest OSes allocate and manage their own hardware page tables, with minimal involvement of Xen, for better safety and isolation
  - The Xen VMM lives in a 64MB section at the top of every VM's address space that is not accessible from the guest

Xen memory management
- The x86 TLB is not tagged
  - Must optimise context switches: allow the VM to see physical addresses
  - Xen is mapped into each VM's address space
- PV: the guest OS manages its own page tables
  - Allocates new page tables and registers them with Xen
  - Can read them directly
  - Updates are batched, then validated and applied by Xen

Memory virtualization
- The guest OS has direct read access to hardware page tables, but updates are validated by the VMM
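As a concrete illustration, this validate-and-apply path can be sketched as a small C simulation. The struct is modelled loosely on Xen's mmu_update_t (a pointer/value pair submitted in batches); the frame-range check standing in for Xen's real validation, and all the constants, are simplifying assumptions:

```c
#include <stdint.h>

#define NUM_FRAMES     1024
#define XEN_FIRST_MFN  1008   /* top machine frames reserved for the hypervisor */
#define PT_ENTRIES     64

/* One request in a batched update, loosely modelled on Xen's mmu_update_t. */
typedef struct {
    unsigned ptr;   /* index of the PTE to modify (simplified from a machine address) */
    uint64_t val;   /* new PTE value; low bits here = machine frame number */
} mmu_update_t;

static uint64_t page_table[PT_ENTRIES];   /* guest PT, writable only via "Xen" */

/* "Hypercall": validate and apply a batch of PTE updates in one hypervisor
 * entry. Returns the number of updates applied; updates that would map a
 * Xen-reserved machine frame are rejected. */
static int do_mmu_update(const mmu_update_t *req, int count)
{
    int applied = 0;
    for (int i = 0; i < count; i++) {
        uint64_t mfn = req[i].val & (NUM_FRAMES - 1); /* extract frame number */
        if (req[i].ptr >= PT_ENTRIES) continue;       /* bad PTE index */
        if (mfn >= XEN_FIRST_MFN) continue;           /* would map into Xen: reject */
        page_table[req[i].ptr] = req[i].val;
        applied++;
    }
    return applied;   /* one trap into the VMM, many updates: cost amortized */
}
```

A guest building a new address space would queue many such records and flush them with a single call, instead of trapping once per PTE.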
- Updates go through "hypercalls" into Xen; the same mechanism is used for segment descriptor tables
- The VMM must ensure that access to the 64MB Xen section is never granted
- The guest OS may "batch" update requests to amortize the cost of entering the hypervisor

Virtualized memory management
- Each process in each VM has its own virtual address space
- The guest OS deals with real ("pseudo-physical") pages; Xen maps pseudo-physical pages to machine pages
- For PV, the guest OS uses hypercalls to interact with memory
- For HVM, Xen keeps shadow page tables (VT instructions help)
[Diagram: two VMs, each with two processes mapping virtual pages 1,2 to pseudo-physical pages 2,3 and 5,6; Xen then maps VM1's pseudo-physical pages 2,3,5,6 to machine frames 8,12,14,9 and VM2's to 4,16,17,18]

TLB when VM1 is running
  VP  PID  PN
   1   1   ?
   2   1   ?
   1   2   ?
   2   2   ?

MMU virtualization: shadow mode
[Diagram: shadow page table maintained by Xen between the guest's tables and the hardware MMU]

Shadow page table
- The hypervisor is responsible for trapping accesses to the virtual page table
- Updates need to be propagated back and forth between the guest OS and the VMM
- Increases the cost of managing page-table flags (modified, accessed bits)
- The guest can view physical memory as contiguous
- Needed for full virtualization

MMU virtualization: direct mode
- Takes advantage of paravirtualization
  - The OS can be modified so that Xen is involved only in page-table updates
- Restrict guest OSes to read-only access
  - Classify page frames, separating out those that hold page tables
  - Once registered as a page-table frame, the frame is made read-only
- Avoids the use of shadow page tables

Single PTE update
[Diagram: guest reads go directly to the virtual→machine page table; on the first guest write to a PTE, the write traps and Xen decides whether to emulate it]
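The trap-and-emulate step in the diagram above can be sketched as a simulation. The frame-typing scheme (a frame registered as a page table becomes read-only, with writes validated and emulated by Xen) follows the slides; the function names and the safety check are illustrative assumptions, not Xen's real code:

```c
#include <stdint.h>
#include <stdbool.h>

#define PT_ENTRIES     64
#define XEN_FIRST_MFN  1008   /* machine frames reserved for Xen itself */

typedef struct {
    uint64_t pte[PT_ENTRIES];
    bool pinned;   /* registered with Xen as a page-table frame => read-only */
} pt_frame_t;

/* Xen-side check: a PTE may not point at a Xen-reserved frame. */
static bool pte_is_safe(uint64_t val) { return (val & 0x3ff) < XEN_FIRST_MFN; }

/* Register a frame as a page table: validate every existing entry,
 * then mark the frame read-only to the guest. */
static bool pin_as_page_table(pt_frame_t *f)
{
    for (int i = 0; i < PT_ENTRIES; i++)
        if (!pte_is_safe(f->pte[i])) return false;   /* refuse to pin */
    f->pinned = true;
    return true;
}

/* Guest write to a PTE. Writes to a pinned frame "trap" into Xen and are
 * emulated only if the new value passes validation. */
static bool guest_write_pte(pt_frame_t *f, unsigned idx, uint64_t val)
{
    if (idx >= PT_ENTRIES) return false;
    if (f->pinned) {                          /* write faults into Xen */
        if (!pte_is_safe(val)) return false;  /* refuse unsafe mapping */
        f->pte[idx] = val;                    /* Xen emulates the write */
        return true;
    }
    f->pte[idx] = val;                        /* ordinary data page: direct write */
    return true;
}
```

Reads never trap in either case, which is what makes direct mode cheaper than shadow paging for read-heavy page-table traffic.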
Bulk update
- Useful when creating new virtual address spaces
  - A new process (via fork) and a context switch require creation of several PTEs
- Multipurpose hypercall:
  - Update PTEs
  - Update the virtual→machine mapping
  - Flush the TLB
  - Install a new page-table base register (PTBR)

Batched update interface
[Diagram: guest writes to the page tables are collected, validated by Xen, and then applied to the hardware MMU]

Writeable page tables: create new entries
[Diagram: the guest writes new entries directly into a page-table page that has been disconnected from the active page directory]

Writeable page tables: first use — validate mapping via TLB
[Diagram: the first use of the new mapping page-faults into Xen, which validates the modified page table]

Writeable page tables: re-hook
[Diagram: after validation, Xen reconnects the page table to the page directory]

Physical memory
- Memory allocation for each VM is specified at boot
  - Statically partitioned; no overlap in machine memory
  - Strong isolation
- Allocations may be non-contiguous (sparse)
- Balloon driver
  - Adds or removes machine memory from the guest OS

Xen memory management
- Xen does not swap out memory allocated to domains
  - Provides consistent performance for domains
  - By itself this would create an inflexible system (static memory allocation)
- The balloon driver allows guest memory to grow and shrink
  - The memory target is set as a value in the XenStore
  - If the guest is above target, it frees or swaps out pages, then releases them to Xen
  - If the guest is below target, it can increase its usage
- Hypercalls allow guests to see and change the state of memory
  - Physical-to-machine mappings
  - "Defragment" allocated memory

Xen VM interface: I/O
- Virtual devices (device descriptors) are exposed to guests as asynchronous I/O rings
- Event notification is by means of an upcall, as opposed to interrupts

I/O
- Handle interrupts
- Data transfer
  - Data is written to I/O buffer pools in each domain
  - These page frames are pinned by Xen

Details: I/O
- I/O descriptor ring:

I/O rings
[Diagram: circular I/O descriptor ring with a request producer (the guest) and a request consumer (Xen)]
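The request ring in the final diagram can be modelled with free-running counters, in the style of Xen's shared rings. This is a simplified one-direction sketch under stated assumptions: real Xen rings carry responses as well as requests, and pair the counter updates with memory barriers:

```c
#include <stdint.h>
#include <stdbool.h>

#define RING_SIZE 8   /* must be a power of two */

/* Shared descriptor ring: the guest produces requests, Xen consumes them.
 * The counters run freely; an index is reduced modulo RING_SIZE on access. */
typedef struct {
    uint32_t req_prod;           /* advanced by the guest */
    uint32_t req_cons;           /* advanced by "Xen" */
    int      ring[RING_SIZE];    /* request descriptors (just ints here) */
} io_ring_t;

static bool ring_full(const io_ring_t *r)  { return r->req_prod - r->req_cons == RING_SIZE; }
static bool ring_empty(const io_ring_t *r) { return r->req_prod == r->req_cons; }

/* Guest side: publish a request if there is space. */
static bool produce(io_ring_t *r, int req)
{
    if (ring_full(r)) return false;
    r->ring[r->req_prod % RING_SIZE] = req;
    r->req_prod++;   /* in real Xen, a write barrier precedes this update */
    return true;
}

/* Hypervisor side: consume the next request, if any. */
static bool consume(io_ring_t *r, int *req)
{
    if (ring_empty(r)) return false;
    *req = r->ring[r->req_cons % RING_SIZE];
    r->req_cons++;
    return true;
}
```

Because the counters are decoupled from notification, the guest only needs to send an event-channel upcall when the consumer may be idle, so many requests can be batched per notification.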