
System Virtual Machines

• Introduction
• Key concepts
• Resource virtualization
  – processors
  – memory
  – I/O devices
• Performance issues
• Applications

EECS 768 Virtual Machines

Introduction

• System virtual machine
  – capable of supporting multiple system images
  – each guest OS manages a set of virtualized hardware resources
  – VMM owns the real system resources
  – VMM allocates resources to each guest OS by partitioning or time-sharing
  – guest OS allocates resources to its user programs
  – a virtual resource need not have a corresponding physical resource
  – see Figure 8.1

Introduction (2)

[Figure: three guest systems (Linux, Windows, OS/2), each running its applications on its own OS and its own virtual Intel x86 machine, all on top of shared Intel x86 hardware]

Key Concepts

• Outward appearance
• State management
• Resource control
• Native and hosted virtual machines

• Several concepts are simplified
  – same guest and host ISA
  – uniprocessor systems

Outward Appearance

• Create an illusion of multiple machines.
• Illusion with no replication of hardware resources
  – a switch moves devices from one VM to another
• Replication of hardware resources
  – devices used directly by each user are replicated
  – rest of the hardware is shared
• Hosted VM
  – one OS is more important than the other
  – the UI of the host OS provides a way to display the UI of the second OS

State Management

• Refers to how the state of each individual guest is managed by the system VM (SVM)
  – each guest OS has an architected state
  – host resources may not be adequate to hold it
• Fixed location for all guest state in host
  – VMM manages a pointer and switches it between guests
  – indirection can be very inefficient (Figure 8.3a)
• Copy approach
  – more efficient (Figure 8.3b)
  – essential to allow direct execution of native code

State Management (2)

[Figure: two ways of holding guest state. In one panel, VMM memory holds the register values for VM1, VM2, and VM3, and the processor's register block pointer selects the active block. In the other, VMM memory holds the register values for VM1, VM2, and VM3, and the active VM's values are copied into the processor registers.]
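
The two state-management schemes (a block pointer into VMM memory versus copying into processor registers) can be contrasted in a minimal Python sketch. The class names, the three-register state, and the method names are illustrative assumptions, not something the slides define.

```python
from dataclasses import dataclass, field

NUM_REGS = 3  # assumed tiny register file, for illustration only


@dataclass
class PointerVMM:
    """Indirection scheme: guest state stays in VMM memory."""
    blocks: dict      # vm_id -> list of register values held in VMM memory
    active: str = ""  # plays the role of the register block pointer

    def switch_to(self, vm_id):
        self.active = vm_id  # a switch only moves the pointer

    def read_reg(self, n):
        return self.blocks[self.active][n]  # every access goes through memory

    def write_reg(self, n, value):
        self.blocks[self.active][n] = value


@dataclass
class CopyVMM:
    """Copy scheme: the active guest's state lives in the processor registers."""
    blocks: dict
    cpu_regs: list = field(default_factory=lambda: [0] * NUM_REGS)
    active: str = ""

    def switch_to(self, vm_id):
        if self.active:
            # Save the outgoing guest's registers back to VMM memory.
            self.blocks[self.active] = list(self.cpu_regs)
        # Load the incoming guest's registers into the processor.
        self.cpu_regs = list(self.blocks[vm_id])
        self.active = vm_id

    def read_reg(self, n):
        return self.cpu_regs[n]  # direct register access while the guest runs

    def write_reg(self, n, value):
        self.cpu_regs[n] = value
```

The copy scheme pays for the save and load on every switch, but between switches the guest works on real registers, which is what direct native execution needs.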

Resource Control

• Refers to how the VMM maintains overall control of all hardware resources.
• Scheme similar to a time-sharing OS
  – VMM intercepts accesses to privileged resources
  – VMM gains control on all privileged instructions
  – VMM handles the timer interrupt (Figure 8.4)
  – similar tradeoff between fair scheduling and large switching overhead

VMM Timesharing

• VMM timeshares resources among guests
  – similar to OS timesharing

[Figure: timeline of a VM switch, in three phases: first VM active, VMM active, next VM active]

1. Timer interrupt occurs.
2. VMM saves architected state of running VM.
3. VMM determines next VM to be activated.
4. VMM restores architected state for next VM.
5. VMM sets timer interval and enables interrupts.
6. VMM sets PC to timer interrupt handler of OS in next VM.
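
The timer-driven switch can be sketched as a small Python simulation. Round-robin selection, the dictionary used as "architected state", and the `interrupted_pc` field are assumptions made for illustration; the slides do not specify a scheduling policy or a state layout.

```python
from collections import deque


class TimesharingVMM:
    def __init__(self, vm_states, quantum):
        self.saved = {vm: dict(s) for vm, s in vm_states.items()}
        self.ready = deque(vm_states)        # round-robin order (assumed policy)
        self.quantum = quantum
        self.running = self.ready.popleft()  # first VM is active
        self.cpu = dict(self.saved[self.running])
        self.timer = quantum

    def on_timer_interrupt(self, guest_timer_handler_pc):
        # VMM saves architected state of the running VM.
        self.saved[self.running] = dict(self.cpu)
        self.ready.append(self.running)
        # VMM determines the next VM to be activated.
        next_vm = self.ready.popleft()
        # VMM restores architected state for the next VM.
        self.cpu = dict(self.saved[next_vm])
        # VMM sets the timer interval and enables interrupts.
        self.timer = self.quantum
        self.cpu["interrupts_enabled"] = True
        # VMM sets the PC to the timer interrupt handler of the OS in the
        # next VM; the guest's old PC is kept so its handler can resume it.
        self.cpu["interrupted_pc"] = self.cpu["pc"]
        self.cpu["pc"] = guest_timer_handler_pc[next_vm]
        self.running = next_vm
        return next_vm
```

Entering the guest at its own timer handler lets the guest OS believe it received the timer interrupt itself, so it can run its own scheduler.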

Native and Hosted VM

• Refers to the runtime privilege level of the VMM (and OS).
  – for efficiency, some part of the VMM should have higher privileges
• Figure 8.5
  – relationship analogous to that between an OS and user programs
  – native VM system: only the VMM operates in privileged mode
  – hosted VM: VMM installed on top of an existing OS
    • some or no part of the VMM runs in privileged mode

Native and Hosted VM (2)

[Figure: four system organizations, each layered over hardware and split into non-privileged and privileged modes: traditional uniprocessor system (applications over OS), native VM system (virtual machines over VMM), user-mode hosted VM system (virtual machines and VMM over host OS), and dual-mode hosted VM system (virtual machines over a VMM that is split between the modes, alongside the host OS)]

Resource Virtualization – Processors

• Techniques to execute guest instructions
  – emulation
  – direct native execution
• Emulation
  – interpretation or binary translation
  – necessary if host and guest ISAs differ
• Direct native execution
  – same host and guest ISAs required
  – may still require emulation of some instructions

Conditions for ISA Virtualizability

• Conditions laid out by Popek and Goldberg.
• Assume native system VMs
  – VMM runs in system mode
  – all other software runs in user mode
• Other assumptions
  – hardware consists of a processor and uniformly addressable memory
  – processor can operate in two modes, user and system
  – some subset of instructions is available only in system mode
  – relocation register available for memory addressing

Privileged Instructions

• Instructions that trap if the machine is in user mode and do not trap if the machine is in system mode.
• LPSW (Load Processor Status Word)
  – PSW can change the state of the CPU
  – can also change the instruction address (PC)
• SPT (Set CPU Timer)
  – can change the CPU interval timer

Privileged Instructions (cont…)

• LRA (Load Real Address)
  – translates a virtual address and saves the corresponding real address in a specified GPR
• POPF (Pop Stack into Flags Register)
  – replaces the flags register with the top of the memory stack
  – interrupt-enable flag can only be modified in system mode
  – no trap in user mode, so POPF is not a privileged instruction

Sensitive Instructions

• These are instructions that interact with hardware.
• Control-sensitive instructions
  – can change the configuration of system resources
  – e.g., physical memory assigned to a program
  – e.g., LPSW: changes mode of operation
  – e.g., SPT: changes CPU timer value

Sensitive Instructions (cont…)

• Behavior-sensitive instructions
  – behavior or results depend on the configuration of resources
  – e.g., LRA: depends on the mapping (state) of the real memory resource
  – e.g., POPF: depends on the mode of operation
• Innocuous instructions
  – neither control-sensitive nor behavior-sensitive
• see Figure 8.6

Instruction Types – Summary

[Figure: instructions classified as privileged or non-privileged, and as innocuous or sensitive (behavior-sensitive or control-sensitive)]

Components of a VMM

[Figure: when an instruction trap occurs, the dispatcher receives the privileged instruction and passes it either to the allocator or to one of the interpreter routines 1 to n]

• Allocator: handles privileged instructions that desire to change machine resources, e.g., Load Relocation Bounds Register
• Interpreter routines: handle privileged instructions that do not change machine resources but access privileged resources, e.g., IN, OUT, Write TLB

Components of a VMM (2)

• Dispatcher
  – control module, called on hardware traps
  – decides the next module to be invoked
• Allocator
  – decides allocation of system resources
  – invoked by dispatcher to change machine resource allocation for VMs
• Interpreter routines
  – emulate trap functionality for each user VM
  – handle all traps except resource re-assignment traps
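
The division of labor between dispatcher, allocator, and interpreter routines can be sketched as follows. The opcode strings, the `SketchVMM` class, and the per-VM port table are hypothetical names chosen for illustration; a real allocator would also check that the requested resources are available.

```python
class SketchVMM:
    def __init__(self):
        self.bounds = {}  # vm_id -> relocation bounds granted by the allocator
        self.ports = {}   # (vm_id, port) -> last value written (virtual device)
        # One interpreter routine per trapping instruction that only
        # accesses privileged resources without re-assigning them.
        self.interpreters = {"IN": self._emulate_in, "OUT": self._emulate_out}

    def dispatch(self, vm_id, opcode, operand=None):
        """Control module: called on every instruction trap."""
        if opcode == "LOAD_RELOCATION_BOUNDS":
            # Resource re-assignment traps go to the allocator.
            return self.allocator(vm_id, operand)
        routine = self.interpreters.get(opcode)
        if routine is None:
            raise ValueError(f"no interpreter routine for {opcode}")
        return routine(vm_id, operand)

    def allocator(self, vm_id, requested_bounds):
        # Simplification: grant every request unchanged.
        self.bounds[vm_id] = requested_bounds
        return "allocated"

    def _emulate_out(self, vm_id, operand):
        port, value = operand
        self.ports[(vm_id, port)] = value
        return "emulated"

    def _emulate_in(self, vm_id, operand):
        return self.ports.get((vm_id, operand), 0)
```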

Properties for a True VMM

• Efficiency
  – all innocuous instructions must be natively executed
• Resource control
  – impossible for guest software to directly change system resource configurations
• Equivalence
  – guarantee of identical behavior
  – allowed exceptions
    • performance can be slower
    • available resources can be limited
    • differences in performance due to changed timings

Requirement for True VMM

• For any third-generation computer, a VMM may be constructed if the set of sensitive instructions for that computer is a subset of the set of privileged instructions
  – sensitive instructions always trap in user mode
  – non-privileged instructions execute natively
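
The subset condition is easy to state as code. The sketch below uses toy instruction sets built from the examples on these slides (LPSW, SPT, LRA, POPF); the set contents and function names are illustrative assumptions, not a complete ISA classification.

```python
def is_virtualizable(sensitive, privileged):
    """Popek-Goldberg condition: every sensitive instruction is privileged."""
    return set(sensitive) <= set(privileged)


def problem_instructions(sensitive, privileged):
    """Sensitive instructions that do not trap in user mode."""
    return set(sensitive) - set(privileged)


# Toy ISA in which every sensitive instruction traps in user mode.
trapping_isa = {
    "sensitive": {"LPSW", "SPT", "LRA"},
    "privileged": {"LPSW", "SPT", "LRA"},
}

# Toy ISA with one sensitive instruction that executes silently in user mode.
leaky_isa = {
    "sensitive": {"LPSW", "SPT", "POPF"},
    "privileged": {"LPSW", "SPT"},
}
```

For `leaky_isa` the check fails, and `problem_instructions` returns `{"POPF"}`, the instruction a VMM would have to find and handle by other means.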

Handling Privileged Instruction

• Privileged instruction traps in the guest OS

[Figure: handling a privileged instruction]

• Guest OS code in VM (user mode)
  – a privileged instruction (LPSW) traps to the VMM
  – execution later resumes at the next instruction (target of LPSW)
• VMM code (privileged mode)
  – dispatcher passes control to the LPSW routine
  – LPSW routine:
    • change mode to privileged
    • check privilege level in VM
    • emulate instruction
    • compute target
    • restore mode to user
    • jump to target
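
A minimal sketch of such an LPSW routine follows, assuming the VMM tracks a virtual mode for each guest. The `GuestState` fields, the `trap_handler_pc` entry point, and the return strings are hypothetical. Reflecting the trap back to the guest OS when the guest was in virtual user mode is the usual trap-and-emulate treatment rather than something the slide spells out.

```python
from dataclasses import dataclass


@dataclass
class GuestState:
    virtual_mode: str     # "system" or "user", as the guest believes
    pc: int
    trap_handler_pc: int  # guest OS entry point for privilege violations (assumed)


def emulate_lpsw(guest, new_mode, new_pc):
    """Run by the VMM in privileged mode after LPSW trapped in the guest."""
    # Check the privilege level inside the VM, not the real machine mode
    # (the real mode is always user while guest code runs).
    if guest.virtual_mode != "system":
        # A guest user program executed LPSW: hand the trap to the guest OS.
        guest.virtual_mode = "system"
        guest.pc = guest.trap_handler_pc
        return "reflected"
    # Emulate the instruction on the guest's virtual PSW and compute the target.
    guest.virtual_mode = new_mode
    guest.pc = new_pc
    # The caller then restores real user mode and jumps to guest.pc.
    return "emulated"
```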

Problem Instructions

• Sensitive instructions that are not privileged
  – POPF instruction in the Intel IA-32
  – IA-32 is not virtualizable by the Popek-Goldberg rules
• Handling problem instructions
  – interpret all guest software
  – scan and patch problem instructions before execution
  – see Figure 8.11

Patching Problem Instructions

[Figure: the scanner and patcher in the VMM turns the original program into a patched program; each discovered critical instruction is replaced by a control transfer (e.g., a trap) to a code patch in the VMM]

Patching Problem Instructions (2)

• VMM takes control at the head of each basic block.
• Scan to find problem instructions.
• Replace each with a trap to the VMM.
• Place another trap at the end of the basic block.
• Patched basic blocks can be chained.
• Mostly similar to binary translation.
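
The scan-and-patch steps can be sketched over a toy program represented as a list of mnemonics. The `PROBLEM` and `BLOCK_END` sets and the `TRAP_VMM(...)` notation are assumptions for illustration; a real patcher works on machine code and must keep the original bytes so it can emulate them.

```python
PROBLEM = {"POPF"}                       # sensitive but non-privileged (assumed set)
BLOCK_END = {"JMP", "JZ", "CALL", "RET"}  # control transfers ending a basic block


def patch_basic_block(program, start):
    """Scan one basic block from `start` and return its patched form."""
    patched = []
    for op in program[start:]:
        if op in BLOCK_END:
            # Trap at the end of the block so the VMM regains control
            # and can scan the successor block before it runs.
            patched.append(f"TRAP_VMM(block_end {op})")
            break
        if op in PROBLEM:
            # Replace the problem instruction with a trap to the VMM.
            patched.append(f"TRAP_VMM(emulate {op})")
        else:
            patched.append(op)  # innocuous: left to execute natively
    return patched
```

Chaining would later replace a `TRAP_VMM(block_end ...)` with a direct branch once the successor block has itself been patched.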

Caching Emulation Code

• A high frequency of sensitive instructions requiring interpretation increases VMM overhead
  – cache interpreter actions in a code cache
  – cache the block containing the problem instruction
• Each instance of the problem instruction in the code is associated with a distinct piece of cached code
  – make code instance-specific and optimize
• Simpler management of cached code
  – cached code executed in system mode
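
A code cache keyed by the address of each problem-instruction instance might look like the sketch below. The class, the generator callback, and the flag dictionary are hypothetical. The only hardware facts assumed are that the IA-32 interrupt-enable flag is bit 9 (0x200) of the flags register and that POPF is a one-byte instruction.

```python
class EmulationCodeCache:
    def __init__(self, generate):
        self._generate = generate  # builds an instance-specific routine
        self._by_address = {}      # guest address -> cached routine
        self.generated = 0

    def lookup(self, address, instruction):
        routine = self._by_address.get(address)
        if routine is None:
            # First encounter of this instance: generate and cache its code.
            routine = self._generate(address, instruction)
            self._by_address[address] = routine
            self.generated += 1
        return routine


def generate_popf_routine(address, instruction):
    """Specialize POPF emulation for the instance at `address`."""
    def routine(guest_flags, popped_value):
        # Update the guest's *virtual* interrupt-enable flag.
        guest_flags["IF"] = bool(popped_value & 0x200)
        return address + 1  # resume after the one-byte POPF
    return routine
```

Because each routine is tied to one address, the resume address is a constant baked into the cached code instead of being recomputed on every trap.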

Resource Virtualization – Memory

• Generalizes the concept of virtual memory
• Virtual memory basics
  – application is provided a logical view of memory
  – OS manages actual real memory
  – page tables provide the logical-physical mapping
  – TLBs cache the most common mappings
• VMM distinguishes between real memory and physical memory
  – VMM does the real-physical memory mapping

VMM Memory Support

• VMM mechanism to virtualize memory
  – maintain a per-VM real-map table
  – maps real pages to physical pages
  – VMM maintains a distinct swap space
• Additional indirection layer may be inefficient
  – virtualize architected page tables
  – virtualize architected TLB
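
The extra indirection layer can be made concrete with a small sketch: the guest page table maps virtual pages to real pages, and the per-VM real-map table maps real pages to physical pages. Page size, table contents, and function names are illustrative assumptions. Composing the two maps into one table is the usual way to avoid the double lookup and is shown here as an assumed technique, not as the slides' stated design.

```python
PAGE_SIZE = 4096  # assumed page size


def translate(vaddr, guest_page_table, real_map):
    """Two-step translation: virtual -> real (guest OS), real -> physical (VMM)."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    real_page = guest_page_table[vpn]  # mapping maintained by the guest OS
    phys_page = real_map[real_page]    # per-VM real-map table kept by the VMM
    return phys_page * PAGE_SIZE + offset


def compose_tables(guest_page_table, real_map):
    """Collapse both levels into one virtual -> physical table."""
    return {
        vpn: real_map[real_page]
        for vpn, real_page in guest_page_table.items()
        if real_page in real_map  # pages the VMM has swapped out are skipped
    }
```

With a composed table the hardware can translate in one step, at the cost of the VMM having to rebuild entries whenever the guest changes its own page table.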

Resource Virtualization – I/O

• Virtualization of I/O is difficult
  – large number of I/O devices
  – each I/O device is controlled in a specific manner
  – operating systems have to deal with similar issues
• Virtualization scheme
  – provide VMs with a virtual version of a device
  – intercept VM requests made to the virtual device
  – convert the request and perform the equivalent action

Virtualizing Devices

• Dedicated devices
  – display, keyboard, mouse, etc., for each user
  – no device virtualization necessary
  – device requests could bypass the VMM
• Partitioned devices
  – smaller virtual disks for each user
  – VMM uses a map to translate parameters
  – status information from the device also needs translation and reflection in the map

Virtualizing Devices (cont…)

• Shared devices
  – network: virtual network address for each VM
  – maintain state information for each guest VM
  – translate outgoing and incoming device requests
• Spooled devices
  – shared at higher granularity, e.g., printer
  – two-level spool tables in OS and VMM
  – VMM schedules requests from multiple guest VMs
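
For partitioned devices, the map the VMM uses to translate request parameters can be as simple as a base and size per virtual disk. The sketch below assumes contiguous allocation and block-granular addressing; the class and method names are hypothetical.

```python
class PartitionedDisk:
    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.next_free = 0
        self.map = {}  # vm_id -> (base physical block, size in blocks)

    def create_virtual_disk(self, vm_id, size):
        if self.next_free + size > self.total_blocks:
            raise ValueError("not enough physical blocks")
        self.map[vm_id] = (self.next_free, size)
        self.next_free += size

    def translate(self, vm_id, virtual_block):
        """Convert a guest block number into a physical block number."""
        base, size = self.map[vm_id]
        if not 0 <= virtual_block < size:
            # Keep each guest inside its own partition.
            raise IndexError("block outside the virtual disk")
        return base + virtual_block
```

The bounds check is what enforces isolation: a guest cannot name a block belonging to another VM's partition.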

Performance Degradation

• Setup
  – initialize the state of the machine
• Emulation
  – sensitive instructions, maybe others
• Interrupt handling
  – handling interrupts generated by the guest programs
• State saving
  – save/restore state of VM during switch to VMM
• Book-keeping
  – special operations to maintain equivalent behavior
• Time elongation
  – some instructions may require more processing time

VM Assists

• Piece of hardware that improves the performance of an application when running on a VM.
• May not always improve performance
  – improve user program-VMM system-mode switches
  – improve address translation performance
• VM assists can help improve performance of
  – instruction emulation
  – other aspects of the VMM
  – the user-mode system running as the guest
  – specific types of guest systems

Instruction Emulation Assists

• Emulation of privileged instructions in the guest VM is a fundamental cause of overhead
  – causes an interrupt that is handled by the VMM
  – VMM emulation depends on whether the guest VM was in user or system mode when the instruction was executed
• Assist for LPSW on System/370
  – checks state of guest VM
  – provides hardware-assisted instructions to modify physical resources
• Depends on VM system implementation
• The overhead of the trap still remains
  – eliminates overhead of emulation and mode switching

VMM Assists

• Context switch
  – hardware to save/restore registers, machine state
• Decoding of privileged instructions
  – helps certain critical parts of the emulation
• Virtual interval timer
  – precise timer for all guest VMs to schedule jobs
• Adding to the instruction set
  – to help commonly executed VMM parts

Improve Guest VM Performance

• Requires guest system to be VMM aware
  – avoid duplication of functionality
  – provide VMM with additional information (handshaking), e.g., DIAGNOSE instruction
• Example: non-paged mode
  – disable dynamic address translation in guest VM
  – no translation from virtual to real address
  – translation from real to physical done by VMM
• Guest OS modification
  – works around difficult-to-virtualize ISA features
  – for the IA-32, eliminates code detection, patching, shadow page tables

Applications

• Implementing multiprogramming
• Multiple single-application virtual machines
• Multiple secure environments
• Mixed-OS environments
• Legacy applications
• Multiplatform application development
• Gradual migration to new system
• New system software development
• Training
• Help desk support
• Operating system instrumentation
• Event monitoring
• System encapsulation and checkpointing
