ANALYSIS OF VIRTUAL COMPUTER SYSTEMS: PERFORMANCE AND SECURITY

Nidhi Aggarwal B.S., California State University, Fresno, 2006

PROJECT

Submitted in partial satisfaction of the requirements for the degree of

MASTER OF SCIENCE

in

COMPUTER ENGINEERING

at

CALIFORNIA STATE UNIVERSITY, SACRAMENTO

FALL 2008

ANALYSIS OF VIRTUAL COMPUTER SYSTEMS: PERFORMANCE AND SECURITY

A Project

by

Nidhi Aggarwal

Approved by:

__________________________, Committee Chair
Dr. Behnam Arad

__________________________, Second Reader
Dr. William Mitchell

Date

Student: Nidhi Aggarwal

I certify that this student has met the requirements for format contained in the University format manual, and that this project is suitable for shelving in the Library and credit is to be awarded for the Project.

__________________________, Graduate Coordinator          Date
Dr. Suresh Vadhva

Department of Computer Engineering

Abstract

of

ANALYSIS OF VIRTUAL COMPUTER SYSTEMS: PERFORMANCE AND SECURITY

by

Nidhi Aggarwal

Virtualizing the physical resources of a computer system can improve resource sharing and utilization. Virtualization is the pooling and abstraction of resources in a way that masks the physical nature and boundaries of those resources from their users. The goal of this project was to analyze primarily the performance aspects of virtualization and to understand its security implications. This project report presents an overview of virtualization and discusses the key technologies behind it. The report then analyzes the key features of the Intel® Virtualization Technology and the AMD® SVM Technology for x86, outlining the new instructions and hardware extensions introduced. A detailed performance analysis of various virtual environments and technologies is presented. Initially, a comparison between physical and virtual environments is made at the architectural level by analyzing the perl, anagram, and gcc benchmarks in the Simics execution environment. The report then presents performance data for another benchmark suite (SPEC2006) for three different Virtual Machine Monitors (VMMs) and provides a detailed performance analysis of the VMMs. A detailed analysis of Xen, based on profiling done using Xenoprof, is included to highlight the causes behind the performance bottlenecks. Finally, the security aspects of virtualization are discussed and analyzed.

__________________________, Committee Chair
Dr. Behnam Arad

Date

ACKNOWLEDGMENTS

I want to acknowledge Dr. Behnam Arad and Dr. William Mitchell for their guidance and co-operation throughout the Project.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
GLOSSARY & KEY WORDS
Chapter
1 VIRTUALIZATION AND VIRTUAL MACHINES OVERVIEW
  1.1 INTRODUCTION
  1.2 VIRTUAL MACHINES OVERVIEW
  1.3 VMM OVERVIEW
    1.3.1 Xen
    1.3.2 Microsoft
    1.3.3 Parallels
    1.3.4 VMware
2 ARCHITECTURAL ANALYSIS OF HARDWARE VT EXTENSIONS
  2.1 INTEL VT-x
    2.1.1 Life Cycle of VMM Software
    2.1.2 VMCS Overview
    2.1.3 VMX Instruction Set
  2.2 AMD-V
    2.2.1 SVM Hardware Overview
    2.2.2 New Instructions
    2.2.3 Intercept Operation
    2.2.4 IOIO Intercepts
    2.2.5 TLB Control
    2.2.6 New Processor Model: Paged Real Mode
    2.2.7 Event Injection
    2.2.8 SMM Support (System Management Mode)
    2.2.9 External Access Protection
    2.2.10 Nested Paging Facility
3 PERFORMANCE ANALYSIS
  3.1 VIRTUALIZATION OVERHEAD ANALYSIS EXPERIMENTAL SETUP
  3.2 BENCHMARKS
    3.2.1 SPEC CPU2006
    3.2.2 Integer Benchmarks
    3.2.3 Floating Point Benchmarks
  3.3 SYSTEM CONFIGURATION
  3.4 RESULTS
  3.5 ANALYSIS
  3.6 XEN PROFILING AND PERFORMANCE ANALYSIS
    3.6.2 Benchmarks Considered
    3.6.3 Experimental Results and Analysis
      3.6.3.1 OSDB Results
    3.6.4 Profiling of Xen Environment
      3.6.4.1 Experiments & Analysis
4 SECURITY
  4.1 INTRODUCTION
  4.2 ARCHITECTURAL EXTENSIONS FOR SECURITY IN VIRTUAL MACHINES
    4.2.1 SKINIT Instruction
    4.2.2 Automatic Memory Clear
    4.2.3 Security Exception
  4.3 TEST FOR SECURITY OF VMMs
    4.3.1 Test Programs
    4.3.2 System Configuration and VMMs Used
    4.3.3 Test Results
      4.3.3.1 Crashme
      4.3.3.2 Xensploit
      4.3.3.3 Host-to-Guest Shared Folder
    4.3.4 Analysis of Results
5 CONCLUSION
REFERENCES

LIST OF TABLES

1. Table 1: List of integer benchmarks in SPEC2006
2. Table 2: List of floating-point benchmarks in SPEC2006
3. Table 3: Statistics for sim-profile with perl in real mode
4. Table 4: Statistics for sim-profile with perl in virtual mode
5. Table 5: Statistics for sim-cache with perl in real mode
6. Table 6: Statistics for sim-profile with anagram in virtual mode
7. Table 7: Statistics for sim-cache with anagram in real mode
8. Table 8: Statistics for sim-cache with anagram in virtual mode
9. Table 9: Statistics for sim-outorder with anagram in real mode
10. Table 10: Statistics for sim-outorder with anagram in virtual mode
11. Table 11: Statistics for sim-profile with gcc in real mode
12. Table 12: Statistics for sim-profile with gcc in virtual mode
13. Table 13: OSDB performance for desktop setup
14. Table 14: Profiling samples of the execution of OSDB
15. Table 15: Profiling samples of the execution of osdb-my
16. Table 16: Profiling samples of the write routines
17. Table 17: Level of security for different VMMs

LIST OF FIGURES

1 Figure 1: Hosted Virtualization
2 Figure 2: Hypervisor Virtualization
3 Figure 3: Virtual Machine Execution Snapshot
4 Figure 4: Xen split driver design
5 Figure 5: Microsoft Virtual PC architecture
6 Figure 6: Ring Architecture for Virtualization
7 Figure 7: Interactions of the VMM and Guests
8 Figure 8: The Pacifica SVM architecture
9 Figure 9: Experimental setup for Virtualization Overhead
10 Figure 10: SPEC CINT2006 scores
11 Figure 11: SPEC CFP2006 scores
12 Figure 12: Percentage overhead for sim-profile with perl benchmark
13 Figure 13: Percentage overhead for sim-cache with perl benchmark
14 Figure 14: Percentage overhead for sim-outorder with perl benchmark
15 Figure 15: Percentage overhead for sim-profile with anagram benchmark
16 Figure 16: Percentage overhead for sim-cache with anagram benchmark
17 Figure 17: Percentage overhead for sim-profile with gcc benchmark
18 Figure 18: Physical vs. virtual scores for SPEC CFP2006
19 Figure 19: Virtual vs. physical overhead for SPEC CFP2006
20 Figure 20: Physical vs. virtual machine SPECINT2006 scores
21 Figure 21: Percentage virtualization overhead for SPECINT2006
22 Figure 22: SPEC CINT2006 native scores
23 Figure 23: SPEC CINT2006 virtualization overhead
24 Figure 24: SPEC CFP2006 native scores
25 Figure 25: SPEC CFP2006 virtualization overhead
26 Figure 26: SPECINT2006 scores for VMMs
27 Figure 27: SPEC CFP2006 scores for VMMs
28 Figure 28: Comparison of VMware 32-bit vs. 64-bit
29 Figure 29: OSDB performance for desktop setup

GLOSSARY & KEY WORDS

VM Virtual Machine

VMM Virtual Machine Monitor

HVM Hardware Virtual Machine

SVM Security and Virtual Machine

UML User-Mode Linux

VT-x/VT Intel Virtualization Technology

Pacifica Codename for AMD Virtualization Technology

OS Operating System

VMCS Virtual Machine Control Structure

VMCB Virtual Machine Control Block

Chapter 1

VIRTUALIZATION AND VIRTUAL MACHINES OVERVIEW

1.1 Introduction

Virtualization is a framework or methodology of dividing the resources of a computer into multiple execution environments, by applying one or more concepts or technologies such as hardware and/or software partitioning, time-sharing, partial or complete machine simulation, emulation, and quality of service. Established and emerging applications motivate strong support for virtualization in both server and client computing systems. For example, server consolidation is currently a hot topic in industry and has already proven itself as a way of reducing enterprise power consumption, alleviating IT administrative overhead, and making use of the huge amount of processing power available in modern multi-processor servers and workstations. Furthermore, it yields significant cost savings by reducing the number of redundant servers.

In a non-virtualized system, a single Operating System (OS) controls all hardware resources. Virtualization of all system resources, including processors, memory, and I/O devices, makes it possible to run multiple operating systems on a single physical platform. A virtualized system includes a new layer of software, the virtual machine monitor (VMM). The VMM's principal role is to arbitrate accesses to the underlying physical host platform's resources so that multiple operating systems (which are guests of the VMM) can share them. The VMM presents to each guest OS a set of virtual platform interfaces that constitute a virtual machine (VM). There are two popular approaches to virtualization: hosted virtualization and hypervisor virtualization. Figures 1 and 2 [1] depict the two approaches:

Figure 1: Hosted Virtualization    Figure 2: Hypervisor Virtualization

Earlier architectures imposed many challenges in providing virtualization support, as they were not originally designed for it. Because it is necessary for the VMM to run in ring 0 of the processor privilege hierarchy to maintain system-wide control, ring deprivileging occurs, and the OS must be pushed out to ring 1 or all the way out to ring 3 to run with the user-level applications. This behavior is not desirable because it forces the operating system to run at the same privilege level as the user applications and hence presents great security threats to the system. There are also issues with instructions that can access privileged state without trapping to the monitor, and the small amount of address-space compression that occurs when the monitor sets aside a portion of the memory space for itself that the guest should not access. Recently, Intel Corporation [1, 2] and AMD [3] have implemented extensions to the architecture which are designed to alleviate these virtualization difficulties.

1.2 Virtual Machines Overview

A VM is a piece of software that creates a virtualized environment between the computer platform and the end user, in which the end user can run applications and interact with the system as he/she does with a physical system. In other words, it is a self-contained environment that behaves as if it were a separate computer. A VM provides a complete system environment in which many processes, possibly belonging to multiple users, can coexist. By using VMs, a single host hardware platform can support multiple guest Operating System (OS) environments simultaneously. The history of VMs dates back to the 1960s and early 1970s, when they were first introduced by IBM in its S/360 range of servers to enable sharing of its mainframes by multiple legacy OSes. Recently, however, there has been a renewed interest in this field. Perhaps the most important feature of today's VMs is that they provide isolation and security, where software running on one guest system is completely isolated from software running on other guest systems.

Furthermore, if security on one guest system is compromised or if the guest OS suffers a failure, the software running on other guest systems is not affected.

In addition, the ability to support different operating systems simultaneously on the same physical machine is one more reason for their appeal. The present state-of-the-art VMs include commercial solutions such as VMware and Microsoft Virtual PC, and open source solutions such as Xen and User-Mode Linux. Figure 3 shows a Windows-based execution environment hosting a Linux VM and another Windows-based VM on the same hardware platform.

Figure 3: Virtual Machine Execution Snapshot

1.3 VMM Overview

In virtualization technology, the VMM is the host program that allows a single computer or server to support multiple, identical execution environments. It is the job of the VMM to manage and allocate the system processor, memory, and other resources according to each virtual machine's requirements. All the different virtual machines see their systems as self-contained computers isolated from other users, even though they are all served by the same physical machine.

1.3.1 Xen

Xen is the most prominent open source VMM. It was originally designed to be used with paravirtualized guests, but has since been the first system to incorporate support for the new virtualization instructions. Xen is installed by patching the kernel of a host operating system such as Linux with Xen-specific code. This allows both the host and all of the guests to have access to the full set of drivers already available for the hardware. If this were not done, Xen would not be practical to use, since it would be limited to a very small subset of enterprise hardware, like the ESX Server product from VMware. Since Xen can rely on a host operating system for the basic infrastructure of an operating system, it can be made relatively simple in terms of the number of lines of code. This implies that it can be made highly reliable. There are basically three key areas of the system that it must virtualize: memory, processor, and I/O.

The virtualization of memory is the most complex part of the Xen hypervisor and is made even more difficult because of the design choices of the x86 architecture. The fact that TLB lookups are handled by the hardware means that Xen cannot directly intercept and modify the lookups, and that all translations need to be available in hardware to achieve reasonable performance. In addition, the lack of system identification tagging in the TLB means that each time a new guest operating system is switched in, the TLB must be flushed. These requirements led to Xen being placed at the top of memory for all of the guests to avoid being flushed out. It also led Xen to allow the guests to manage their own hardware page tables with as little interference from the Xen monitor as possible. The guest operating system was then allowed to allocate pages for itself, but it was required to register them with Xen so all subsequent writes could be checked and their validity assured.

For virtualizing the processor, the initial design of Xen placed the guest operating systems in ring 1 of the four privilege rings. This allowed Xen to run in ring 0 and maintain control over the operating systems, and subsequently the operating systems to maintain control over the applications. The problem is that the guest operating systems must be modified to make them aware that they are being run in ring 1. This complicates the virtualization process and eliminates the ability to use closed source operating systems such as Microsoft Windows, but it provides the possibility for a great deal of performance enhancements. Since the guest is made aware that it is being virtualized, it can also be optimized to know that it does not have direct control over the physical machine memory. It can also be modified to expect different behavior from timer interrupts and to know the difference between wall clock time and virtual run time. In addition, the privileged instructions that do not trap are not required to have special binary translation or code modifications applied; they are simply paravirtualized to trap to the monitor when they are executed. All of these optimizations made the original version of Xen the highest-performance monitor available, at the cost of paravirtualizing the guests.

The next challenge in Xen is virtualizing the I/O in the system. Xen attempts to provide a clean, simple interface that allows the guests to make use of the drivers in the host. This split-driver design uses shared memory pages to set up what are referred to as event channels to communicate between machines. These event channels are modeled as asynchronous, buffered rings and are monitored by Xen to ensure that guests only access the memory pages they have been allocated. It should be noted that it is also possible to communicate small amounts of information directly between virtual machines through the use of interrupts with specific bit masks set. The split driver design works by installing the "real" driver, known as a backend driver, into the host virtual machine and stub drivers in each of the guests. When the guest makes calls to its driver, the front-end driver, the requests are packaged up and sent to the host to be actually executed. Figure 4 depicts the front-end and backend drivers for the block devices (hard drive, CD-ROM) and the network devices (network interface card). It shows how the event channel is used as a communication medium, with the shared memory pages available to implement the ring buffers needed to store the data being transferred. The host is referred to as domain 0 (Dom0) and the guests are referred to as domain U (DomU). A sketch of the shared-ring idea follows Figure 4.


Figure 4: Xen split driver design
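The split-driver data path can be pictured with a small sketch. The following C fragment is a minimal illustration of the shared-ring idea only, not Xen's actual ring API; the structure layout, names, and the event-channel notification hook are all invented for illustration.

    #include <stdint.h>

    #define RING_SIZE 32  /* entries; a power of two so indices wrap cheaply */

    struct ring_request  { uint64_t sector; uint32_t op; uint32_t gref; };
    struct ring_response { uint64_t id; int32_t status; };

    /* Lives in a memory page shared between DomU (front end) and Dom0 (back end). */
    struct shared_ring {
        volatile uint32_t req_prod, req_cons;   /* request producer/consumer indices */
        volatile uint32_t rsp_prod, rsp_cons;   /* response producer/consumer indices */
        struct ring_request  req[RING_SIZE];
        struct ring_response rsp[RING_SIZE];
    };

    /* Front-end side: queue a request and notify the back end. */
    static int frontend_submit(struct shared_ring *r, struct ring_request rq,
                               void (*notify_via_event_channel)(void))
    {
        if (r->req_prod - r->req_cons == RING_SIZE)
            return -1;                        /* ring full; try again later */
        r->req[r->req_prod % RING_SIZE] = rq;
        __sync_synchronize();                 /* publish the entry before the index */
        r->req_prod++;
        notify_via_event_channel();           /* kick Dom0's back-end driver */
        return 0;
    }

The back end runs the mirror image of this loop, consuming requests, executing them against the real driver, and producing responses on the response ring.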

1.3.2 Microsoft

Microsoft Virtual PC is a hosted-virtualization VMM which emulates a physical computer so exactly that the applications users install in it cannot distinguish the virtual machine from a physical computer. Virtual PC emulates much of the virtual machine's hardware in software. Emulated hardware components include the interrupt controller, DMA controller, IDE/ATA controller, non-volatile RAM, real-time clock, buses, I/O controller, keyboard controller, memory controller, programmable timers, and power-management hardware. Virtual PC then uses the host operating system to interact with any external devices such as the CD-ROM, floppy, keyboard, mouse, or physical display. Virtual PC supports virtual hard disks in a number of powerful and flexible ways. Users can associate several virtual hard disks with each virtual machine. Virtual PC supports the following types of disks:

• Dynamically expanding virtual hard disks. Each virtual hard disk is a single file that users create on the physical computer's hard disk. The virtual hard disk file dynamically expands as users write data to it. It initially uses very little space, and expands up to the maximum size of the disk.

• Fixed-size virtual hard disks. Like dynamically expanding virtual hard disks, fixed-size virtual hard disks are a single file that users create on the physical computer's hard disk. The file is approximately the same size as the virtual hard disk and does not grow or shrink in size. For example, if a virtual hard disk is 2 GB, the virtual hard disk file is 2 GB.

• Linked virtual hard disks. Virtual PC supports linked virtual hard disks, which link directly to a disk in the physical computer. This advanced capability allows users to leverage already existing hard disk configurations.

The VM Additions package is installed on the guest operating system and is particular to the given guest operating system. It provides a high level of integration between the host and guest operating systems. Features include integrated mouse, time synchronization, cut & paste, drag & drop, folder sharing, and arbitrary screen resolutions. Full integration support is only available for Windows guest operating systems. Partial levels of integration support are available for other supported platforms. Figure 5 shows a more detailed view of the Virtual PC architecture. The ring 0 monitor has been separated into three parts:

• the internal monitor, which is installed as a driver in the host operating system and utilizes kernel APIs to interact with the host OS;

• the external monitor, which operates outside of the standard host environment and provides VMM support while the VM is running; and

• the springboard, which alters the execution environment as needed to carry out the transition to the host operating system or VM.


Figure 5: Microsoft Virtual PC Architecture

1.3.3 Parallels

Parallels Workstation and Parallels Desktop are the first solutions based on hypervisor technology, which was originally developed in the 1960s to maximize the power of large mainframes. Using a thin layer of software inserted between the machine's hardware and the primary operating system to directly control critical hardware profiles and resources, hypervisor technology dramatically improves the stability and security of applications running in a virtual machine. The inclusion of hypervisor technology ensures that virtual PCs built using Parallels Workstation are the most stable and efficient available. Parallels Workstation's lightweight hypervisor also fully supports the benefits of Intel's VT architecture and AMD's Secure VM Technology (AMD SVM). Its shared folders utility allows easy sharing of files and folders between operating systems. Furthermore, its full Unicode support lets users name files and directories in any language, and it has support for multi-interface USB devices.

1.3.4 VMware

VMware is a VMM based on the hosted virtualization approach and uses binary translation to accomplish virtualization. The major properties of the translator are listed below; a sketch of the translate-on-demand idea follows the list.

• Binary: Input is binary x86 code, not source code.

• Dynamic: Translation happens at runtime, interleaved with execution of the generated code.

• On demand: Code is translated only when it is about to execute.

• System level: The translator makes no assumptions about the guest code. Rules are set by the x86 ISA, not by a higher-level Application Binary Interface (ABI). In contrast, an application-level translator might assume that "return addresses are always produced by calls" to generate faster code [6].

• Subsetting: The translator's input is the full x86 instruction set, including all privileged instructions; output is a safe subset (mostly user-mode instructions).

• Adaptive: Translated code is adjusted in response to guest behavior changes to improve overall efficiency.

Chapter 2

ARCHITECTURAL ANALYSIS OF HARDWARE VT EXTENSIONS

All of the challenges mentioned in Chapter 1 have been dealt with previously on the Intel x86 architecture by different VMM vendors. These solution providers implemented a number of "tricks" such as binary translation and instruction interception/emulation. Another commonly used technique was paravirtualization, in which the guest operating systems were modified to make them aware that they are being run in a virtualized environment. This approach has been successful, as VMware has been able to produce a broad range of products that can host nearly any operating system on the market with acceptable overhead. Xen [7] has also produced very high-performance VMMs based on paravirtualization, although these have been limited to Linux and BSD guests since the source code must be modified. Such techniques have significantly complicated the creation of a VMM and limited the number of players in the field as well as the potential for innovation in the application of the technology.

However, with the introduction of the Vanderpool technology from Intel and Pacifica from AMD, an attempt has been made to resolve the challenges associated with system virtualization at the hardware level and hence simplify the task of the VMM. New instructions have been introduced in the x86 ISA which are designed to alleviate these virtualization difficulties. These instructions employ a new mode of operation that essentially allows the VMM to run completely below the existing 4-ring architecture, so it can maintain control over the OSes while allowing them to continue to run in ring 0. Doing so eliminates the tricks necessary to avoid the complications associated with system virtualization while also ensuring that any guest operating system can be run. This ring privileging and deprivileging is illustrated in Figure 6:

Figure 6: Ring Architecture for Virtualization

Figure 6(a) represents the case of a non-virtualized system, with the OS operating at level 0 and all software applications running at level 3. Figure 6(b) shows the 0/1/3 model for ring deprivileging, where the OS is pushed to ring 1, and Figure 6(c) represents the 0/3/3 model, where the OS has been pushed all the way to ring 3 and is running at the same privilege level as the guest applications. Lastly, Figure 6(d) represents the VT-x approach to solving the ring deprivileging problem, where the VMM runs in the privileged ring and the guest OS can also run in ring 0 in the non-privileged mode.

2.1 Intel VT-x

With the introduction of Intel VT-x, processor support for virtualization is provided by a new form of processor operation called VMX. There are two kinds of VMX operation: root and non-root. The VMM runs in root mode, and VMs run in non-root mode. A VM entry (via the Vmlaunch or Vmresume instruction) puts the processor into non-root mode, and a Vmexit returns the processor to root mode. Functionality in VMX root mode is largely similar to that of a non-VT-x x86 processor, except that new instructions are added to control VMs.

2.1.1 Life Cycle of VMM Software

The processor enters VMX root mode with the Vmxon instruction. VT-x [1] extensions are not enabled if the processor has not entered VMX root mode. The VMM can enter guest VMs through VM entries via the Vmlaunch and Vmresume instructions, and gains back control on Vmexit. The VMM can leave VMX operation with the Vmxoff instruction. Figure 7 represents the life cycle of a VMM and its guest software by illustrating the interactions between them; an illustrative code sketch follows the figure.

Figure 7: Interactions of the VMM and Guests
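The life cycle above can be summarized in code. The following ring-0 style C sketch (GCC inline assembly) strings the transitions together; it assumes CR4.VMXE is already set and that the VMXON region, the current VMCS, and its host-state area (including the host RIP that receives control on Vmexit) have been initialized as the Intel manuals require. Error handling is omitted throughout, so this is illustrative rather than runnable as an ordinary program.

    #include <stdint.h>

    static inline void vmxon(uint64_t vmxon_region_pa)
    {
        /* Enter VMX root mode; the operand is the physical address of the
         * 4-KB VMXON region. */
        __asm__ volatile("vmxon %0" : : "m"(vmxon_region_pa) : "cc", "memory");
    }

    void vmm_life_cycle(uint64_t vmxon_region_pa)
    {
        vmxon(vmxon_region_pa);            /* Vmxon: enable VMX operation */

        __asm__ volatile("vmlaunch");      /* Vmlaunch: first entry into the guest */

        /* On each Vmexit the CPU loads host state from the VMCS host-state
         * area; a real VMM sets the host RIP there to its exit handler,
         * which services the exit and then re-enters the guest: */
        for (;;) {
            /* handle_vmexit(); */
            __asm__ volatile("vmresume");  /* Vmresume: subsequent VM entries */
        }

        /* When shutting down, the VMM executes Vmxoff to leave VMX
         * operation:  __asm__ volatile("vmxoff");  */
    }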

2.1.2 VMCS Overview

VMX non-root operation and VMX transitions are managed by the VMCS. It determines, in part, the events and instructions that cause Vmexits, the loading of guest context on Vmentry, and the VMM entry point on Vmexit. The VMCS consists of six logical groups, as follows:

i. Guest State Area: Processor state is saved into and loaded from the guest-state area on Vmexit and Vmentry, respectively. The fields in the guest-state area correspond to processor state and include the control registers CR0, CR3, and CR4, the debug register DR7, and RSP, RIP, and RFLAGS. The selector, base address, and segment limit fields also correspond to processor state. In addition to the register state, the guest state includes the interruptibility state, which is a 32-bit field. Normally, external interrupts are blocked only if RFLAGS.IF=0, and NMIs are never blocked.

ii. Host State Area: All fields in the host-state area correspond to processor registers. The fields are the control registers CR0, CR3, and CR4; RSP, RIP, and RFLAGS; selector fields; and the base-address fields for FS, GS, TR, GDTR, and IDTR.

iii. VM-Exit Control Fields: These manage the operation of Vmexits. Two Vmexit controls are currently defined: bit 9 and bit 15. There are also Vmexit controls for MSRs; the following fields determine how MSRs are stored on Vmexit: a) VM-exit MSR-store count (32 bits), and b) VM-exit MSR-store address (64 bits).

iv. VM Execution Control Fields: The VMCS includes several fields that control VMX non-root operation. The most important of these are:

• Pin-based VM execution controls: These govern the handling of asynchronous events. Two pin-based execution controls are currently defined: bit 0 and bit 3.

• Processor-based VM execution controls: These govern the handling of synchronous events, mainly those caused by the execution of certain instructions.

• Exception bitmap: A 32-bit vector that contains one bit for each IA-32 exception.

• Page-fault controls: These determine which page faults cause VM exits.

• I/O bitmap addresses: The VM execution control fields include the 64-bit physical addresses of the I/O bitmaps.

• Time-stamp counter offset: The VM execution control fields include a 64-bit TSC offset field.

• Guest/host masks and read shadows for CR0 and CR4: The VM execution control fields include guest/host masks and read shadows for CR0 and CR4.

• CR3 target controls: The VM execution control fields include a set of CR3 target values.

• Controls for CR8 accesses: The CR8 register can be used to access the TPR in the processor's APIC.

v. VM Entry Control Fields: These manage the operation of Vmentries. Two Vmentry controls are currently defined: bit 9 and bit 15. There are also Vmentry controls for MSRs; the following fields determine how MSRs are loaded on Vmentry: a) VM-entry MSR-load count (32 bits), and b) VM-entry MSR-load address (64 bits). Finally, there are Vmentry controls for event injection; Vmentry includes a feature whereby it may conclude by injecting an event through the guest IDT.

vi. VM-Exit Information Fields: The fields here contain information about the most recent Vmexit.

VM entries load processor state from the guest-state area of the VMCS. A VMM can optionally configure a Vmentry to follow this loading by injecting an interrupt or exception. The CPU effects this injection using the guest IDT, just as if the injected event had occurred immediately after Vmentry. This feature removes the need for a VMM to emulate delivery of these events. Vmexits save processor state into the guest-state area and load processor state from the host-state area. All Vmexits use a common entry point to the VMM. To simplify the design of a VMM, every Vmexit saves into the VMCS detailed information specifying the reason for the exit; many exits also record an exit qualification, which provides further details. For example, if the MOV CR instruction causes a Vmexit, the exit reason would indicate "control-register access"; the exit qualification would indicate (1) the identity of the control register (for example, CR0); (2) whether the MOV was to or from the control register; and (3) which general-purpose register was the source or destination of the instruction. Both VM entries and VM exits load CR3 (the base address of the page-table hierarchy). This implies that the VMM and the guest can run in different linear address spaces.

2.1.3 VMX Instruction Set

• Vmxon - Enables VMX (VT) operation.

• Vmxoff - Disables VMX (VT) operation.

• Vmptrld - Loads the VMCS pointer from its operand, so that the VMCS points to the referenced memory.

• Vmptrst - Stores the current VMCS pointer into memory.

• Vmclear - Sets a VMCS to its original state, prior to Vmlaunch.

• Vmread - Reads data from the VMCS.

• Vmwrite - Writes data to the VMCS.

• Vmcall - Allows VMX non-root code to send a message to VMX root code.

• Vmlaunch - Launches a VMCS and transfers control to the VM.

• Vmresume - Resumes a previously launched VMCS and transfers control to the VM.

2.2 AMD-V

The AMD Security and Virtual Machine (SVM) architecture, codenamed "Pacifica," is designed to provide enterprise-class server virtualization technology that facilitates virtualization development and deployment. It is designed to enhance 64-bit client and server virtualization technologies for x86-based servers, workstations, desktops, and mobile computers. It extends AMD64 technology with Direct Connect Architecture to enhance virtualization for clients and servers by introducing a new processor mode and new features into the processor and memory controller. Designed to enhance and extend traditional software-only virtualization approaches, these new features help reduce the complexity and increase the security of new virtualization solutions, while protecting IT investments through backward compatibility with existing virtualization software.

2.2.1 SVM Hardware Overview

SVM processor support provides a set of hardware extensions designed to enable economical and efficient implementation of virtual machine systems. Hardware support falls into two complementary categories: virtualization support and security support. Virtualization support is provided by mechanisms for a fast world switch between the VMM and the guest, and by the ability to intercept selected instructions or events in the guest.

• Guest mode: This new processor mode is entered through the Vmrun instruction. When in guest mode, the behavior of some x86 instructions changes to facilitate virtualization.

• External access protection: Guests may be granted direct access to selected I/O devices. Hardware support is designed to prevent devices owned by one guest from accessing memory owned by another guest (or the hypervisor).

• Tagged TLB: In the SVM model, the VMM is mapped in a different address space than the guest. To reduce the cost of world switches, the TLB is tagged with an address space identifier (ASID), distinguishing host-space entries from guest-space entries.

To facilitate efficient virtualization of interrupts, the following support is provided under control of VMCB flags:

• Intercepting physical interrupt delivery: The VMM can request that physical interrupts cause a running guest to exit, allowing the VMM to process the interrupt.

• Virtual interrupts: The VMM can inject virtual interrupts into the guest. Under control of the VMM, a virtual copy of the EFLAGS.IF interrupt mask bit and a virtual copy of the APIC's task priority register are used transparently by the guest instead of the physical resources.

• Sharing a physical APIC: SVM allows multiple guests to share a physical APIC while guarding against malicious or defective guests that might leave high-priority interrupts unacknowledged forever (and thus shut out other guests' interrupts).

• Restartable instructions: SVM is designed to safely restart, with the exception of task switches, any intercepted instruction after the intercept. Instructions are either atomic or idempotent.

Figure 8: The Pacifica SVM architecture


2.2.2 New Instructions

The Vmrun instruction is the cornerstone of SVM. Vmrun takes, as a single argument, the physical address of a 4-KB-aligned page, the virtual machine control block (VMCB), which describes a virtual machine (guest) to be executed. The VMCB contains: a list of which instructions or events in the guest (e.g., writes to CR3) to intercept, various control bits that specify the execution environment of the guest or that indicate special actions to be taken before running guest code, and guest processor state (such as control registers).

The Vmrun instruction saves some host processor state in main memory at the physical address specified in the VM_HSAVE_AREA MSR; it then loads corresponding guest state from the VMCB state-save area. Vmrun also reads additional control bits from the VMCB that allow the VMM to flush the guest TLB and inject virtual interrupts into the guest. The Vmrun instruction then checks the guest state just loaded. If illegal state has been loaded, the processor exits back to the host. Otherwise, the processor runs the guest code until an intercept event occurs, at which point the processor suspends guest execution and resumes host execution at the instruction following the Vmrun. This is called a Vmexit. Vmrun saves or restores a minimal amount of state information to allow the VMM to resume execution after a guest has exited. This allows the VMM to handle simple intercept conditions quickly. If additional guest state information must be saved or restored (e.g., to handle more complex intercepts or to switch to a different guest), the VMM can employ the Vmsave and Vmload instructions. The Vmsave and Vmload instructions take the physical address of a VMCB in the (implicit) rAX operand. These instructions are intended to complement the state save/restore abilities of the Vmrun instruction. They provide access to hidden processor state that software cannot otherwise access, as well as additional privileged state. Besides loading guest state, the Vmrun instruction reads various control fields from the VMCB; most of these fields are not written back to the VMCB on Vmexit (since they cannot change during guest execution).

The Vmmcall instruction is meant as a way for a guest to explicitly call the VMM.

The global interrupt flag (GIF) is a bit that controls whether interrupts and other events can be taken by the processor. The STGI and CLGI instructions set and clear, respectively, the GIF. A sketch of the resulting world-switch loop follows.
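The following ring-0 style C sketch shows the Vmload/Vmrun/Vmsave pattern this section describes. It assumes the VMCB and the VM_HSAVE_AREA MSR have already been set up per the AMD manuals and that the caller handles the intercept after the world switch; it is illustrative only, not a working VMM.

    #include <stdint.h>

    static inline void svm_world_switch(uint64_t vmcb_pa)
    {
        /* Vmload, Vmrun, and Vmsave all take the VMCB physical address in
         * the implicit rAX operand. */
        __asm__ volatile(
            "vmload\n\t"   /* restore additional guest state from the VMCB */
            "vmrun\n\t"    /* world switch: run the guest until an intercept */
            "vmsave\n\t"   /* stash hidden guest state the VMM may inspect */
            : : "a"(vmcb_pa) : "cc", "memory");
        /* Execution resumes here after the Vmexit; the VMM then examines
         * the VMCB's exit code to decide how to handle the intercept. */
    }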

2.2.3 Intercept Operation

Various instructions and events (such as exceptions) in the guest can be intercepted by means of control bits in the VMCB. The two primary classes of intercepts supported by SVM are instruction intercepts and exception intercepts.

• Exception intercepts: These are checked when normal instruction processing must raise an exception. For some exceptions, the processor still writes certain exception-specific registers even if the exception is intercepted. When an external or virtual interrupt is intercepted, the interrupt is left pending.

• Instruction intercepts: These occur at well-defined points in instruction execution, before the results of the instruction are committed, but ordered in an intercept-specific priority relative to the instruction's exception checks. Generally, instruction intercepts are checked after simple exceptions have been checked, but before exceptions related to memory accesses (such as page faults) and exceptions based on specific operand values. There are several exceptions to this guideline, e.g., the RSM instruction. Instruction breakpoints for the current instruction and pending data-breakpoint traps from the previous instruction are designed to be checked before instruction intercepts.

2.2.4 IOIO Intercepts

The VMM can intercept IOIO instructions (IN, OUT, INS, OUTS) on a port-by-port basis by means of the SVM I/O permissions map. The I/O Permissions Map (IOPM) occupies 12 Kbytes of contiguous physical memory. The table is structured as a linear array of 64K+3 bits (two 4-Kbyte pages, and the first three bits of a third 4-Kbyte page) and must be aligned on a 4-Kbyte boundary; the physical base address of the IOPM is specified in the IOPM_BASE_PA field in the VMCB and loaded into the processor by the Vmrun instruction.
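A small sketch shows how a VMM might populate the IOPM just described; the array name and helper are invented, but the size, alignment, and one-bit-per-port layout follow the description above.

    #include <stdint.h>
    #include <string.h>

    #define IOPM_BYTES (12 * 1024)   /* two 4-KB pages plus the tail bits */

    /* The IOPM must sit in contiguous physical memory on a 4-KB boundary. */
    static uint8_t iopm[IOPM_BYTES] __attribute__((aligned(4096)));

    static void iopm_intercept_port(uint16_t port)
    {
        iopm[port / 8] |= (uint8_t)(1u << (port % 8));  /* one bit per port */
    }

    static void iopm_init(void)
    {
        memset(iopm, 0, sizeof iopm);    /* start with no ports intercepted */
        iopm_intercept_port(0x64);       /* e.g., trap keyboard-controller I/O */
        /* The IOPM's physical base address then goes in the VMCB's
         * IOPM_BASE_PA field before Vmrun. */
    }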

2.2.5 TLB Control

The last item of note for memory is the tagged Translation Look-aside Buffer (TLB). A TLB is a cache that stores the translations of recently looked-up pages. A tagged TLB assigns an ID to each TLB entry. This allows the hardware to know that an entry belongs to a specific guest OS, avoiding the conflicts that would otherwise arise. TLB entries are tagged with Address Space Identifier (ASID) bits to distinguish different host and/or guest address spaces. The VMM can choose a software strategy in which it keeps multiple shadow page tables (SPTs) up to date and allocates one ASID per SPT. This allows switching to a new process in a guest without flushing the TLBs.

2.2.6 New Processor Model: Paged Real Mode

To facilitate virtualization of real mode, the Vmrun instruction may legally load a guest CR0 with PE cleared but PG set, that is, real mode with paging enabled. (Likewise, the RSM instruction is permitted to return to paged real mode.) This processor mode behaves in every way like real mode, with the exception that paging is applied. The intent is that the VMM run the guest in paged real mode, with page faults intercepted. The VMM is responsible for setting up a shadow page table that makes guest physical memory appear at the proper virtual addresses inside the guest. The behavior of running a guest in paged real mode without also intercepting page faults to the VMM is undefined.

2.2.7 Event Injection

The VMM can inject exceptions or interrupts (events) into the guest by setting bits in the VMCB's EVENTINJ field prior to executing the Vmrun instruction. When an event is injected by means of this mechanism, the Vmrun instruction causes the guest to unconditionally take the specified exception or interrupt before executing the first guest instruction. Injected events are treated in every way as though they had occurred normally in the guest; in particular, injected events are not subject to intercept checks. (Note, however, that if secondary exceptions occur during delivery of an injected event, those exceptions are subject to exception intercepts.)
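The sketch below shows how an EVENTINJ value might be assembled. The bit layout (vector in bits 7:0, type in bits 10:8, error-code-valid in bit 11, valid in bit 31, error code in bits 63:32) is abbreviated from the AMD manuals; the helper function and the example are otherwise invented for illustration.

    #include <stdint.h>

    #define EVENTINJ_TYPE_INTR 0ull          /* external interrupt */
    #define EVENTINJ_TYPE_EXC  3ull          /* exception */
    #define EVENTINJ_EV        (1ull << 11)  /* error code valid */
    #define EVENTINJ_VALID     (1ull << 31)

    static uint64_t eventinj_exception(uint8_t vector, uint32_t errcode,
                                       int has_errcode)
    {
        uint64_t e = (uint64_t)vector | (EVENTINJ_TYPE_EXC << 8) | EVENTINJ_VALID;
        if (has_errcode)
            e |= EVENTINJ_EV | ((uint64_t)errcode << 32);
        return e;   /* written to the VMCB's EVENTINJ field before Vmrun */
    }

    /* Example: inject a page fault (#PF, vector 14) with error code 0x2:
     *   vmcb->eventinj = eventinj_exception(14, 0x2, 1);                  */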

2.2.8 SMM Support (System Management Mode)

In some usage scenarios, the VMM may not trust the existing platform SMM code. To address this case, SVM provides the ability to containerize SMM code, i.e., run it inside a guest with the full protection mechanisms of the VMM in place. A simple solution is for the VMM to create its own trusted SMM handler and to use the handler as a trampoline to invoke the platform SMM code inside a container. The main function of the trampoline code is to set up a guest and associated VMCB, and to copy relevant state between the trampoline's SMM save area and the guest's (virtual) SMM save area. The guest executes the platform SMM code in paged real mode with appropriate SVM intercepts in place, thus ensuring security. For this approach to work, the VMM must be able to write the SMM_BASE MSR, as well as related SMM control registers. For more efficient and flexible operation, the new SMM_CTL MSR is designed to allow the VMM to control SMM entry and exit explicitly. With this hardware support, the VMM can enter and exit SMM, and the VMM code should be simplified.

2.2.9 External Access Protection

By securing the virtual-address translation mechanism, the VMM can restrict guest CPU accesses to memory. However, should the guest have direct access to DMA-capable devices, an additional protection mechanism is required. SVM provides multiple protection domains which can restrict device access to physical memory on a per-page basis. This is accomplished via control logic in the Northbridge's host bridge, which governs any external access to memory. The Northbridge's host bridge provides a number (initially four) of protection domains. Each protection domain has associated with it a device exclusion vector (DEV) that specifies the per-page access rights of devices in that domain. Devices are identified by a device ID, and the host bridge contains a lookup table of fixed size that maps device IDs to protection domains. A DEV is a contiguous array of bits in physical memory; each bit in the DEV (in little-endian order) corresponds to one 4-Kbyte page in physical memory.

2.2.10 Nested Paging Facility

The SVM Nested Paging facility provides two levels of address translation, eliminating the need for the VMM to maintain shadow page tables. Nested paging is an optional feature of SVM and is not available in all implementations of SVM-capable processors; the CPUID instruction should be used to determine nested paging support on a particular processor. A guest page table (gPT), mapping guest virtual addresses to guest physical addresses, is located in guest physical space. A host page table (hPT), mapping host virtual addresses to host physical addresses, is located in host physical space. After translating a guest virtual address using the guest page tables, the resulting (guest physical) address is treated as a host virtual address and is further translated, using the host page tables, into a host physical address. The resulting translation from guest virtual to host physical address is cached in the TLB and used on subsequent guest accesses. A simplified model of this two-level walk follows.

Chapter 3

PERFORMANCE ANALYSIS

This chapter focuses on the performance analysis of software VMMs and also quantifies the overhead associated with virtualization. To analyze the performance of virtual computer systems, the following tasks were performed:

i. Running benchmarks such as perl, anagram, and gcc on a virtual system and a physical machine to evaluate performance degradation in virtual environments.

ii. Profiling/instrumentation of a software VMM to understand the causes of overhead at the architectural level. This is done using a profiler, with which the system calls from the VMM can be intercepted and classified into different sub-categories, giving a better understanding of the causes of overhead. For instance, using an open source VMM such as Xen [7] and profiling it with Xenoprof [8], the causes of overhead can be identified, such as memory reads/writes and page faults.

iii. Using different VMM solutions, such as VMware Workstation [9], Parallels [10], and Microsoft Virtual PC [11], to do a comparative analysis of VMM performance by running benchmarks in the individual VMs created using these VMMs.

iv. Quantifying the performance and overhead of the Intel VT and AMD-V extensions, specifically using VMware Workstation and setting up the VM with hardware virtualization extensions enabled, to do a comparative performance analysis of these two different architectural extensions.

v. Finally, quantifying the performance improvement made by virtual architectures. For instance, software VMMs existed long before the hardware virtualization extensions were introduced. Experiments are done by setting up a VMware VM on 32-bit Windows XP (which does not take advantage of VT extensions) and another on 64-bit Windows XP, or by enabling/disabling VT in the BIOS (if that option exists), and comparing the performance of benchmark applications such as SPEC2006.

3.1 Virtualization Overhead Analysis Experimental Setup

This section presents the analysis of the overhead associated with virtualization, based on experiments conducted in the Simics environment. The experimental setup is shown in Figure 9. The main objective of the experiments was to evaluate the performance of running an application on a virtual machine as compared to a physical machine. As shown in Figure 9, Ubuntu Linux was used as the host, and Simics 2.2.19 was installed on top of it to gather the hardware data. Simics simulated a RedHat 7.3 Linux distribution, which was installed as a workstation with no firewall and contained the KDE and Software Developer package groups. This distribution is available as enterprise3-rh73.craff from [26].

Figure 9: Experimental setup for virtualization overhead. (Physical case: benchmark on Linux on Simics on the Linux host. Virtual case: benchmark on a virtual machine on Linux on Simics on the Linux host.)

Benchmarks were run on top of this simulated workstation. For evaluating performance on the VM, the steps up to running the Linux workstation on Simics stayed the same; however, the User-Mode Linux (UML) virtual machine was installed on top of this RedHat workstation and another operating system was installed on it. In this case, the operating system running on top of the UML virtual machine was a Debian Woody distribution, which is available from [27], and the benchmarks were run on top of this Debian virtual machine.

The sim-profile, sim-cache, and sim-outorder programs from the SimpleScalar suite were run on the perl, anagram, and gcc benchmarks. It is worth mentioning that these SimpleScalar tools were used only as application programs; the simulation statistics they produce are of no concern here. Each program was run in both real mode and virtual mode, where real mode means running the program directly on the Simics workstation with no virtual machine involved, and virtual mode means running the program on the Debian virtual machine.

3.2 Benchmarks

3.2.1 SPEC CPU2006

SPEC CPU2006 is a standard CPU-intensive benchmark suite which stresses a system's processor, memory subsystem, and compiler. SPEC CPU2006 was designed to provide a comparative measure of compute-intensive performance across the widest practical range of hardware, using workloads developed from real user applications. These benchmarks are provided as source code and require the user to be comfortable using compiler commands, as well as other commands via a command interpreter using a console or command prompt window, in order to generate executable binaries. The benchmarks can be divided into two categories: integer benchmarks and floating-point benchmarks. The individual benchmarks in each category are listed next.

3.2.2 Integer Benchmarks

The integer benchmark suite contains the following benchmarks:

Benchmark Number    Benchmark Name
400                 perlbench
401                 bzip2
403                 gcc
429                 mcf
445                 gobmk
456                 hmmer
458                 sjeng
462                 libquantum
464                 h264ref
471                 omnetpp
473                 astar
483                 xalancbmk

Table 1: List of integer benchmarks in SPEC2006

3.2.3 Floating Point Benchmarks

The floating point benchmark suite contains the following benchmarks:

Benchmark Number    Benchmark Name
410                 bwaves
416                 gamess
433                 milc
434                 zeusmp
435                 gromacs
436                 cactusADM
437                 leslie3d
444                 namd
447                 dealII
450                 soplex
453                 povray
454                 calculix
459                 GemsFDTD
465                 tonto
470                 lbm
481                 wrf
482                 sphinx3
999                 specrand

Table 2: List of floating-point benchmarks in SPEC2006

3.3 System Configuration

The hardware configurations of the systems used for performance analysis are:

• Intel Desktop System

o Intel Core Duo Processor

o 2 GB main memory

o 160 GB Hard Drive

• AMD System

o AMD FX-62 Processor

o 2 GB main memory

o 160 GB Hard Drive

The software configurations used in the analysis are: 34

• Windows XP Pro with SP2; the OS was updated to have the latest patches (patch id KB835935)

• Standard PC HAL was used

• Auto-updating was disabled

• Screen saver and screen blanking options were disabled

• For VMware measurements, VMware Workstation was used

• For Parallels measurements, Parallels Workstation was used

• For Microsoft measurements, Virtual PC was used

• For timing synchronization of the VM and the host, VMware Tools were used

• For Parallels, the Parallels Tools were installed for timing synchronization

• Virtual Machine Additions were used in the case of Microsoft for timing synchronization

3.4 Results

This section presents the results obtained from running the different programs on the different benchmarks. The performance metrics consisted of the number of instructions executed and the number of I/O read (logical read) and write (logical write) operations. The initial step was to set up a base image with the configuration described in Section 3.3, record the statistics for it, and then compare each of the other statistics with the base-image statistics. Tables 3 through 12 give the statistics for the different programs and benchmarks in real mode as well as virtual mode.

                        User            Supervisor      Total
Instructions Executed   617980844       946544062       1564524906
Logical Reads           3499            11239           14738
Logical Writes          134718          4098            138276

Table 3: Statistics for sim-profile with perl in real mode

                        User            Supervisor      Total
Instructions Executed   913444154       1083265944      1996710098
Logical Reads           2430            32564           34994
Logical Writes          176915          27096           204011

Table 4: Statistics for sim-profile with perl in virtual mode

                        User            Supervisor      Total
Instructions Executed   1968645233      982550854       2951196087
Logical Reads           2483            18161           20644
Logical Writes          148266          4464            152730

Table 5: Statistics for sim-cache with perl in real mode

                        User            Supervisor      Total
Instructions Executed   10747738374     2371284283      13119022657
Logical Reads           7045            121101          128146
Logical Writes          184155          66606           250761

Table 6: Statistics for sim-profile with anagram in virtual mode

                        User            Supervisor      Total
Instructions Executed   31240796888     4017581010      35258377898
Logical Reads           15912           227040          242952
Logical Writes          179691          60900           240591

Table 7: Statistics for sim-cache with anagram in real mode

                        User            Supervisor      Total
Instructions Executed   32662834043     5226575255      37889409298
Logical Reads           20846           344434          365280
Logical Writes          234777          170010          404787

Table 8: Statistics for sim-cache with anagram in virtual mode

                        User            Supervisor      Total
Instructions Executed   284468046450    415988438456    700456484906
Logical Reads           292038          3973090         4265128
Logical Writes          577434          496846          1074280

Table 9: Statistics for sim-outorder with anagram in real mode

                        User            Supervisor      Total
Instructions Executed   320156968943    1747765549752   2067922518695
Logical Reads           844739          11520321        12365060
Logical Writes          1314471         1286688         2601159

Table 10: Statistics for sim-outorder with anagram in virtual mode

                        User            Supervisor      Total
Instructions Executed   142200636070    26352561176     168553197246
Logical Reads           72857           1263908         1336765
Logical Writes          208016          443366          651382

Table 11: Statistics for sim-profile with gcc in real mode

                        User            Supervisor      Total
Instructions Executed   147414501814    39947403092     187361904906
Logical Reads           81897           1349789         1431686
Logical Writes          501506          434254          935760

Table 12: Statistics for sim-profile with gcc in virtual mode

Figure 10 shows the scores for the SPEC CINT2006 benchmarks for the different VMMs. Figure 11 presents the same information for the SPEC CFP2006 benchmarks. The configurations compared include Intel 32-bit native; VMware 32-bit with forced and non-forced VT; Parallels with VT; Microsoft Virtual PC; Microsoft Virtual Server; Intel 64-bit native and VMware 64-bit; and AMD FX-62 32-bit native and under VMware.

Figure 10: SPEC CINT2006 comparative study

Figure 11: SPEC CFP2006 comparative study

3.5 Analysis

The percentage overhead of running an application on the virtual machine as compared to the physical machine can be calculated from the results obtained in Tables 3 through 12. Figures 12 through 17 plot the percentage overhead for the various cases studied; the short program below shows the calculation.

Figure 12: Percentage overhead for sim-profile with perl benchmark

Figure 13: Percentage Overhead for sim-cache with perl benchmark (bars for IORead, IOWrite, and TotalInstruction; peak value 30.54%)

Figure 14: Percentage Overhead for sim-outorder with perl benchmark (bars for IORead, IOWrite, and TotalInstruction)

Figure 15: Percentage Overhead for sim-profile with anagram benchmark (bars for IORead, IOWrite, and TotalInstruction)

Figure 16: Percentage Overhead for sim-cache with anagram benchmark (IOWrite overhead 68.25%, TotalInstruction overhead 7.46%)

Figure 17: Percentage Overhead for sim-profile with gcc benchmark (IOWrite overhead 43.66%)

From Figure 12 through Figure 17, we can observe that the number of I/O read/write operations executed in the virtual environment is much higher than in the physical environment. Furthermore, it is worth noting that the overhead for the number of read operations is much higher than that for write operations. Since most of the read/write operations are associated with page misses in the application's virtual memory space, this very high number of I/O read/write operations while the application is running in the virtual environment could be related to the number of page faults generated by the application. This suggests that read operations, or page faults, are much more expensive in virtual environments as compared to physical machines.

Based on the data in Table 3 through Table 12 and the percentage overhead depicted in Figure 12 through Figure 17, it can be concluded that there is a significant amount of overhead involved in running an application on a VM as compared to the physical machine. This overhead can be attributed to the frequent switches between the guest and the VMM for the execution of privileged instructions. From the architectural extensions discussed in Chapter 2, we know that Vmenter and Vmexit are the instructions executed during a world switch. Hence, it can be concluded that a large share of processor cycles is dedicated to these instructions while executing applications inside virtual machines. This clearly supports the case, in future architectures, for reducing the number of Vmexits and Vmenters caused by the world switches between the host and the guest, or for reducing the number of cycles these instructions take to execute.

Apart from the virtualization overhead analysis using Simics, the overhead between the physical and the virtual machine was also measured by running the SPEC2006 benchmark. The SPEC2006 benchmark was run inside the VMware virtual machine and on the physical machine, and the scores obtained from both runs were compared to calculate the virtualization overhead. Figures 18 through 21 show the scores obtained and the virtualization overhead calculated.

Figure 18: Physical vs. Virtual Scores for SPECCFP 2006

Figure 19: Virtual vs. Physical Overhead for SPECCFP 2006 (% virtual overhead by benchmark)

Figure 20: Physical vs. Virtual Machine SPECINT 2006 Scores

Figure 21: % Virtualization Overhead for SPECINT 2006

Figure 22 shows the comparison of the SPEC2006 benchmark scores for all the VMMs with respect to the native scores on the Intel and AMD systems. All the scores have been normalized with respect to the native run involving no virtualization. The equation used for normalizing the score can be described as:

% Native score = [(VMM score) / (Native score)] * 100

Similarly, Figure 23 shows the normalized scores for SPECCFP2006. Figures 24 and 25 summarize the virtualization overhead for all the VMMs. The equation used to calculate the overhead can be described as:

% Virtualization Overhead = [(VMM score - Native score) / (Native score)] * 100
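A minimal C sketch of these two formulas, applied to a hypothetical pair of scores (the 30.0 and 25.5 values are invented for illustration):

    #include <stdio.h>

    /* % Native score: how close a VMM run comes to the native run. */
    static double pct_native(double vmm, double native)
    {
        return vmm / native * 100.0;
    }

    /* % Virtualization overhead, per the formula above; negative when
     * the VMM score falls below native (SPEC scores: higher is better),
     * so the figures appear to plot its magnitude. */
    static double pct_overhead(double vmm, double native)
    {
        return (vmm - native) / native * 100.0;
    }

    int main(void)
    {
        double native = 30.0, vmm = 25.5;   /* invented example scores */
        printf("%% of native: %.1f\n", pct_native(vmm, native));    /* 85.0  */
        printf("%% overhead:  %.1f\n", pct_overhead(vmm, native));  /* -15.0 */
        return 0;
    }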

Results in Figure 22 through Figure 25 show that, of all the VMMs, VMware Workstation has the minimum overhead, followed by Parallels Desktop, while Microsoft Virtual PC has the maximum overhead among the VMMs under study. The VMware and Parallels virtual machines have benchmark scores very close to the native runs. We can conclude that VMware's VMM is the most efficient among the VMMs under study.

Figure 22: SPECCINT2006 Comparative Study (% of native for 32-bit binaries)

Figure 23: SPECCFP2006 Comparative Study (% of native for 32-bit binaries)

Figure 24: Comparative Study of Virtualization Overhead for SPECCINT2006

Figure 25: Comparative Study of Virtualization Overhead for SPECCFP2006

In terms of integer benchmarks, gcc runs considerably slower than the others on all the VMs. On the other hand, libquantum performs almost the same as native and has the minimum overhead on all the VMMs, as shown in Figure 26. Figure 27 shows that milc and tonto have the highest virtualization overhead, whereas soplex and zeusmp have very minimal overhead among the floating point benchmarks.

Figure 26: SPECINT 2006 scores for VMMs (Intel_Vmware_32b_VT, Intel_Parallels_32b_VT, Intel_MSVPC_32b_VT)

Figure 27: SPECCFP 2006 scores for VMMs (Intel_Vmware_32b_VT, Intel_Parallels_32b_VT, Intel_MSVPC_32b_VT)

Another important observation is that with VT extension support in the hardware, there is a substantial improvement in the performance of VMMs. For instance, as shown in Figure 28, when running 32-bit binaries for SPEC2006 there is a significant virtualization overhead inside the VM, but when running 64-bit binaries, which made use of the hardware VT extensions, the performance was much better.

Figure 28: Comparison of VMware 32-bit (no VT support) vs. 64-bit (with VT support)

The analysis in the previous paragraphs indicates that the virtualization overhead also depends heavily on the VMM used to create the virtual machine. For instance, when running benchmarks inside the VM created using the Microsoft Virtual PC software, the overhead is much higher than with the one created using VMware Workstation. Another important conclusion is that the use of hardware virtualization extensions reduces the overhead considerably. This was observed by running the SPEC2006 benchmark over one VMware virtual machine making use of the hardware VT extensions and another which was not: the virtual machine utilizing the hardware virtualization extensions has much less overhead than the virtual machine that is not.

3.6 Xen Profiling and Performance Analysis

This section presents a detailed performance analysis of the Xen VMM, along with profiling data used to identify the root causes of the performance issues. Xen is the most prominent open source VMM. It was originally designed to be used with paravirtualized guests, but has since been the first system to incorporate support for the new hardware virtualization instructions. Xen itself runs directly on the hardware, but it depends on a privileged host domain created by patching the kernel of a host operating system such as Linux with Xen-specific code. This allows both the host and all of the guests to have access to the full set of drivers already available for the hardware. Without this, Xen would be practical only on a very small subset of enterprise hardware, as is the case with the ESX Server product from VMware.

3.6.2 Benchmarks Considered

OSDB [20] is an open source transaction processing workload based on the AS3AP database benchmark. It can perform a variety of queries on pre-generated databases. It supports most of the mainstream databases, including Informix, MySQL, and PostgreSQL. It can exercise the database with different typical workloads, including a single-user workload and multiple-user workloads such as IR (Information Retrieval) and OLTP (Online Transaction Processing).

3.6.3 Experimental Results and Analysis

3.6.3.1 OSDB Results

To evaluate the performance of the virtual machines for commercial applications, we used OSDB (Open Source Database Benchmark) with a MySQL version 4 database as the benchmark. Different phases and usage scenarios of databases were studied, including:

• Database generation: Two different sizes of databases were created in the experiments, 4MB and 40MB. Transactions were generated to represent both a light and an intensive usage case of databases.

• Single user test: This setup modeled a personal usage case of databases.

• Multiple user test: This setup modeled a commercial usage case of databases, where two different representative workloads were applied to the database: IR (Information Retrieval) and OLTP (Online Transaction Processing).

The experimental results from tests executed on the desktop setup are listed in Table 13. The comparison of the virtual machines' relative performance, with respect to the native physical machine, is also illustrated in Figure 29. The results demonstrate that the paravirtualized virtual machine consistently delivers better performance for OSDB than the VT-supported virtual machine, because the executions of OSDB are mainly I/O-intensive and the paravirtualized virtual machine contains optimizations for I/O virtualization that are not available with the unmodified OS. Note that the performance of Mixed OLTP is particularly poor for both virtual machines: the numbers of tuples per second are close to zero.

Benchmark        DB Creation (second)   Single User Test (second)   Mixed IR (tuple/second)   Mixed OLTP (tuple/second)
                 Mean      Stdev        Mean      Stdev             Mean       Stdev          Mean       Stdev
Paravirt 40MB    27.63     0.24         12.73     0.07              491.25     51.96          154.83     28.52
Paravirt 4MB     2.89      0.09         1.33      0.02              2606.50    83.98          3315.35    230.05
VMX 40MB         59.79     0.96         35.35     0.94              387.74     2.53           0.21       0.08
VMX 4MB          4.79      0.01         4.17      0.07              2040.16    13.43          2404.77    158.77
Native 40MB      24.35     -            12.18     -                 420.45     -              508.1      -
Native 4MB       2.44      -            1.24      -                 3617.62    -              3508.14    -

Table 13: OSDB performance for Desktop setup

Figure 29: OSDB performance for Desktop setup, relative to native (Paravirtualized vs. Native and VT vs. Native, for the 40MB and 4MB databases)

3.6.4 Profiling of Xen Environment

For profiling of the Xen VM, Xenoprof was used. Xenoprof is a system-wide statistical profiling toolkit implemented for the Xen virtual machine environment, based on the OProfile profiling tool [22]. It supports system-wide coordinated profiling in a Xen environment to obtain the distribution of hardware events such as clock cycles, instruction execution, and TLB and cache misses. It allows profiling of concurrently executing virtual machines, including both guest OSes and applications, and the Xen VMM itself. It provides profiling data at the fine granularity of individual processes and routines executing in either the virtual machine or in the Xen VMM [8].

Xenoprof implements extensions to the Xen hypervisor to support system-wide statistical profiling. Hardware performance counters were programmed to generate sampling interrupts at regular event count intervals. The program counter (PC) value was sampled and stored in a per-domain (VM) sample buffer. Each domain interacts with Xenoprof using a specific hypercall, which enables the domain to define the hardware performance events to be sampled and their parameters, as well as to control the start and end of profiling [8].

Xenoprof leverages the OProfile kernel module to interpret the PC samples received from Xenoprof and to map each PC sample to the appropriate routine at user, kernel, or hypervisor level. The original OProfile kernel module for Linux was modified to use the Xenoprof interface instead of accessing the hardware counters directly [8]. Xenoprof also leverages the OProfile user-level daemon and tools to enable a user to start and stop a profiling session, and to collect and store the performance event samples for later processing and reporting. The user-level tools are also slightly modified in order to be used in a Xen environment [8].

3.6.4.1 Experiments & Analysis

OSDB database benchmarks were run again on both the native physical machine and the paravirtualized virtual machine, this time with profiling enabled. OProfile was used for the native physical machine, while Xenoprof was used for the paravirtualized virtual machine. Both of these tools use statistical sampling to profile program execution, so the number of samples reported by the tools for a process, thread, or function represents the execution time spent on it. The OSDB benchmark exercises both the OSDB program (osdb-my) and the database program (mysqld). Table 14 compares these two environments based on their sample counts:

Samples      Native    Paravirtualized
mysqld       13125     23026
osdb-my      323       1278

Table 14: Profiling samples of the execution of OSDB

It can be noticed that osdb-my took about four times longer to execute on the virtual machine than on the native one (1278 samples versus 323). The numbers of samples used by different routines of osdb-my are shown in Table 15. Only the routines of interest are listed in the table. The percentages of runtime they account for during the entire execution of osdb-my are also listed inside the parentheses. The results show that mu_ir_select, and especially mu_oltp_update, took a much larger share of time in the virtual machine than on the native machine. This explains why, in Section 3.6.3.1, the performance of OLTP was poor in the virtual machines.

Samples           Native          Paravirtualized
mu_oltp_update    72 (22.29%)     747 (58.45%)
mu_ir_select      55 (17.34%)     190 (14.86%)
other routines    (60.37%)        (26.69%)

Table 15: Profiling samples of the execution of osdb-my

It is clear from Table 15 that the mu_oltp_update routine is the most time-consuming function. Since mu_oltp_update involves mostly writes, a closer look was taken at the writes that occurred during the execution of the OSDB benchmarks. The sample numbers of the relevant routines are listed in Table 16.

Samples                              Native    Paravirtualized
write                                226       423
__generic_file_aio_write_nolock      1         196
__block_prepare_write                n/a       172
generic_file_buffered_write          72        154
vfs_write                            n/a       142
do_pwrite64                          33        116
generic_file_write                   n/a       104
__block_commit_write                 n/a       103
sys_write                            n/a       97
pwrite64                             8         54
sock_aio_write                       n/a       43
do_sync_write                        n/a       39
block_prepare_write                  n/a       25
generic_commit_write                 n/a       19
sys_pwrite64                         n/a       19

Table 16: Profiling samples of the write routines

This data breaks down the overhead incurred by the virtualization of I/O. The results show that the various write routines are much slower in the virtual machine than on the native one, and that some routines are required only in the virtual machine.

Chapter 4

SECURITY

4.1 Introduction

Recently, a lot of research focus has shifted to the security of virtual machines. Virtual machines need to operate independently. However, the inherent design of virtual machines can open some serious security holes. An early discussion of VMMs and security argued that the isolation provided by a combined VMM/OS provided better software security than a conventional multiprogramming operating system [15]. It was also suggested that the redundant security mechanisms found in the VMM and the OS executing in one of its virtual machines enhance security [15]. Penetration experiments indicated that redundant weak implementations are insufficient to secure a system [15].

It has also been shown in [14] that it is not necessary to aim for the highest levels of assurance when designing a secure VMM for commodity hardware. However, when absolute isolation is required, a multi-system, separate-hardware architecture is recommended [14].

The ring architecture also helps in creating a secure environment for virtual machines. As discussed in Chapter 2 (Figure 6), on non-virtualized machines the operating system runs at ring 0 and user-level applications run at ring 3, where applications can send privileged instructions to the operating system, and the operating system can decide whether each instruction is valid or not. In a virtual environment, if a VMM runs at ring 3 because it is a user-level program, the guest OS also runs at ring 3. This implementation allows no security, as a compromised VM can send privileged instructions to access another VM's resources or data. In addition, some instructions are privileged and, if not executed in the most privileged hardware domain, cause a general protection exception; a VMM running only in ring 3 is unable to provide direct access to the most privileged hardware domain. As a result, VT-x allows the VMM to run in ring 0, while the guest OS runs in ring 1 and all user applications run in ring 3. The VMM executes in privileged mode, and a VM runs in user mode. When privileged instructions are executed in a VM, they cause a trap to the VMM. After trapping, the VMM executes code to emulate the proper behavior of the privileged instruction for the virtual machine [15]. Since the VMM runs at ring 0 (driver level), this opens up security holes, both at the architectural level and at the application level.
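This trap-and-emulate flow can be pictured as a dispatch loop inside the VMM: enter the guest, handle whatever privileged operation forced an exit, and resume. The sketch below is a conceptual illustration only; the exit codes, vcpu structure, and helpers are invented and do not match the actual VT-x or AMD-V programming interface, where the world switch is performed by hardware instructions (Vmenter/Vmexit, VMRUN) and the exit reason is read from the VMCS or VMCB.

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative exit reasons; the real VT-x/AMD-V encodings differ. */
    enum exit_reason { EXIT_CPUID, EXIT_IO, EXIT_HLT };

    struct vcpu {
        uint64_t rip;   /* guest instruction pointer */
    };

    /* Stub standing in for the hardware world switch: pretend the guest
     * executed CPUID, then an I/O instruction, then HLT. */
    static enum exit_reason run_guest(struct vcpu *v)
    {
        static const enum exit_reason script[] = { EXIT_CPUID, EXIT_IO, EXIT_HLT };
        static int step;
        (void)v;
        return script[step++];
    }

    static void emulate_cpuid(struct vcpu *v)
    {
        printf("emulating CPUID at rip=%#llx\n", (unsigned long long)v->rip);
        v->rip += 2;   /* skip the 2-byte CPUID opcode and resume after it */
    }

    static void emulate_io(struct vcpu *v)
    {
        (void)v;
        printf("forwarding I/O access to the device model\n");
    }

    /* Trap-and-emulate dispatch loop: the guest runs at reduced privilege,
     * each privileged operation traps to the VMM, the VMM emulates the
     * proper behavior, and the guest is resumed. */
    static void vmm_loop(struct vcpu *v)
    {
        for (;;) {
            switch (run_guest(v)) {     /* world switch into the guest */
            case EXIT_CPUID: emulate_cpuid(v); break;
            case EXIT_IO:    emulate_io(v);    break;
            case EXIT_HLT:   return;           /* guest halted this vcpu */
            }
        }
    }

    int main(void)
    {
        struct vcpu v = { .rip = 0x1000 };
        vmm_loop(&v);
        return 0;
    }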

One possible security issue in a virtual environment is Direct Memory Access (DMA) attacks, where one VM's application can break isolation by issuing DMA instructions that copy into or out of the memory used by another VM's applications. Another case is a security agent within a VM that requires protected access to the actual network controller hardware. Such an agent can examine network traffic for malicious payloads or suspected intrusion attempts before the network packets are passed to the guest OS, where user applications might be affected. Commodity operating systems, therefore, can often be compromised and their privileges escalated.

It is essential that a VMM is 'root secure' [24], meaning that no level of privilege within the virtualized guest environment permits interference with the host system. The level of threat posed by a hostile virtualized environment that can subvert the normal operation of the virtual machine can be classified as follows:

• Total compromise: The VMM is subverted to execute arbitrary code on the host with the privileges of the VMM process.

• Partial compromise: The VMM leaks sensitive information about the host, or a hostile process interferes with the VMM (e.g., allocating more resources than the administrator intended) or contaminates state checkpoints.

• Abnormal termination: The VMM exits unexpectedly or triggers an infinite loop that prevents a host administrator from interacting with the virtual machine (for example, suspending or rolling back), particularly if accessible as an unprivileged user within the guest.

Another important aspect of security in a virtual machine is vulnerability to intrusion by a compromised VM into the confidential elements of another VM. Many techniques have been proposed for intrusion detection (IDS) in VMMs, of which the two main approaches are Network-based IDS (NIDS) and Host-based IDS (HIDS) [25]. NIDS is based on watching the network traffic flowing through the system being monitored. The problem with NIDS is that, since it monitors only network traffic, much of the system-level information is unavailable to it. HIDS is based on watching local activity on a host, such as system calls, network connections, and logs. The problem with HIDS is that it must be installed and run on the machine being monitored, where an intruder can easily deactivate or tamper with it to render it useless.

Many emerging technologies try to eliminate these limitations of virtual environments. One popular technique is the I/O MMU, which ensures that a VMM can control all memory accesses, especially those between mutually distrusting parties [14]. Most VMMs use one of two types of policies on inter-VM operations: Mandatory Access Control (MAC) and Discretionary Access Control (DAC). A MAC policy ensures that the VMM's security goals are achieved regardless of VM actions, while a DAC policy enables users to grant rights to the objects that they own. A MAC policy provides the most secure virtual environments and is enforced by a reference monitor. As stated in [14], a reference monitor ensures mediation of all security-sensitive operations and enables a policy to authorize all such operations. The reference monitor must possess the following properties to support a secure environment: (i) it mediates all security-critical operations, (ii) it shields itself from modification, and (iii) it is as simple as possible. The I/O MMU work also proposes a similar reference monitor design to ensure isolation of different VMs [14].
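As a toy illustration of these three properties, a reference monitor reduces to a single default-deny mediation hook backed by an administrator-controlled policy table. Everything below (the operation names, the rule table, the hook) is invented for illustration and is not the design from [14]:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Inter-VM operations considered security-sensitive. */
    enum op { OP_SHARE_MEMORY, OP_SEND_EVENT, OP_DMA };

    /* A MAC rule: a (subject VM, object VM, operation) tuple authorized
     * by the administrator. VMs themselves cannot extend this table. */
    struct rule { int subject; int object; enum op op; };

    static const struct rule policy[] = {
        { 1, 2, OP_SEND_EVENT },   /* VM1 may signal VM2, nothing else */
    };

    /* The mediation hook. Property (i): every sensitive operation is
     * routed through this one function. Property (ii): the table lives
     * inside the VMM, out of reach of guests. Property (iii): the check
     * is small enough to audit. Unlisted operations are denied. */
    static bool authorized(int subject, int object, enum op op)
    {
        for (size_t i = 0; i < sizeof policy / sizeof policy[0]; i++)
            if (policy[i].subject == subject &&
                policy[i].object == object && policy[i].op == op)
                return true;
        return false;   /* default deny */
    }

    int main(void)
    {
        printf("VM1 -> VM2 SEND_EVENT: %s\n",
               authorized(1, 2, OP_SEND_EVENT) ? "allowed" : "denied");
        printf("VM2 -> VM1 DMA:        %s\n",
               authorized(2, 1, OP_DMA) ? "allowed" : "denied");
        return 0;
    }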

4.2 Architectural Extensions for Security in Virtual Machines

Hardware virtualization extensions such as AMD's SVM provide additional hardware support designed to facilitate the construction of trusted software systems. While the security features described in this section are orthogonal to SVM's virtualization support (and are not required for processor virtualization), the two together form the building blocks for trusted systems. The following sub-sections provide a brief description of the security features introduced in hardware.

4.2.1 SKINIT Instruction

The SKINIT instruction and its associated system support (the Trusted Platform Module, or TPM) are designed to allow verifiable startup of trusted software (such as a VMM), based on secure hash comparison. The SKINIT instruction is essential for creating a "root of trust" starting from an initially untrusted operating mode. SKINIT reinitializes the processor to establish a secure execution environment for a software component called the secure loader (SL) and starts execution of the SL in a way that cannot be tampered with. SKINIT also copies the secure loader executable image to an external device, such as the TPM, for verification, using unique bus transactions that preclude the SKINIT operation from being emulated by software in a way that the TPM could not readily detect [3].

4.2.2 Automatic Memory Clear

Automatic memory clear (AMC) erases the contents of system memory after the processor is subjected to a cold reset, and under controlled circumstances after a warm reset. Automatic clearing of memory upon reset protects secrets stored in system memory from simple reset-based attacks.

4.2.3 Security Exception

A new Security Exception (SX) is used to signal certain security-critical events. The Security Exception fault signals security-sensitive events that occur while executing the VMM, in the form of an exception, so that the VMM may take appropriate action.

4.3 Test for Security of VMMs

Despite many technological improvements, security in virtual machines still remains an issue, primarily due to the inability of these improvements to provide complete isolation of VMs. This section focuses on the security aspects of virtualization. To analyze the security aspects of virtual computer systems, in particular the software VMMs, the following tasks were performed:

• Running the crashme [24] tool to execute random byte sequences over the software VMMs until a failure is encountered. This experiment stress-tested fault handling in the VMMs, similar to the scenario where a misbehaving application causes an unexpected crash of the VM.

• Using Xensploit [23] to examine memory attacks on VMMs. The Xensploit tool is used to manipulate the memory of a virtual machine and observe whether the VMMs can void the memory manipulations or not.

• Looking at VMM features such as sharing folders between host and guest.

4.3.1 Test Programs

The following test programs were used to analyze the security of the different VMMs:

1. Crashme: For the first analysis, a simple user-mode tool called crashme [24] was used. Crashme attempts to execute random byte sequences until a failure is encountered. This experiment exposed the VMM to a stress test of fault handling. The test was used to demonstrate a situation where a bad application can cause a VM to crash unexpectedly, amounting to a DoS (denial-of-service) attack. Crashme allows specifying the size of the random data string in bytes, the number of tries before exiting, and an input seed to the random number generator. It is suggested to run the tool for at least an hour [24]. The tool comes with four custom batch files with different inputs. In our experiment, four more batch files with random values were created and tested with the different VMMs. These input files acted as different workloads for the tool, which were then executed on Virtual PC and Parallels Desktop VMs running a Windows guest OS. A minimal sketch of crashme's core loop appears after this list.

2. Xensploit: Xensploit [23] is a custom tool that performs man-in-the-middle attacks during the live migration of virtual machines. This tool tries to manipulate the memory of a virtual machine during its live migration over the network. The Xensploit tool has previously been shown to be capable of affecting host machine memory on a Xen VMM [23]. In this test, previous studies were extended to Parallels Desktop and Microsoft Virtual PC to demonstrate that similar security issues exist with other popular VMMs as well.

3. Host-to-Guest shared folder: VMware has a serious security flaw that can easily be exploited. This flaw comes from the VMware host-to-guest shared folder, which allows the user to transfer data from a virtual machine to a host machine. This can give the virtual machine complete access to the host machine's file system and security-sensitive information. To test this vulnerability, a simple batch file that could delete some system files and startup applications was created on the virtual machine and shared with the host machine.
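For reference, the core of crashme's approach can be sketched in a few lines of POSIX C: fill a buffer with random bytes, make it executable, jump into it, and recover from the resulting fault with a signal handler. This is a minimal illustration of the technique, not crashme's actual source; the real tool also guards against runaway sequences with timers and subprocesses.

    #include <setjmp.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    static sigjmp_buf recover;

    /* Fault handler: unwind back into the test loop. */
    static void on_fault(int sig)
    {
        siglongjmp(recover, sig);
    }

    int main(void)
    {
        /* A writable, executable page to hold each random byte sequence. */
        unsigned char *code = mmap(NULL, 4096,
                                   PROT_READ | PROT_WRITE | PROT_EXEC,
                                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (code == MAP_FAILED)
            return 1;

        srandom(42);   /* crashme takes the seed and try count as inputs */
        signal(SIGSEGV, on_fault);
        signal(SIGILL, on_fault);
        signal(SIGBUS, on_fault);
        signal(SIGFPE, on_fault);

        for (int attempt = 0; attempt < 1000; attempt++) {
            int sig = sigsetjmp(recover, 1);   /* 1: restore signal mask */
            if (sig != 0) {
                printf("attempt %d faulted with signal %d\n", attempt, sig);
                continue;                      /* a clean fault is the good case */
            }
            for (int i = 0; i < 64; i++)       /* fresh random byte sequence */
                code[i] = (unsigned char)random();
            ((void (*)(void))code)();          /* jump into the random bytes */
        }
        return 0;   /* surviving every attempt without taking down the OS
                       (or the VM hosting it) is the expected outcome */
    }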

4.3.2 System Configuration and VMMs Used

The system configuration used was the same as that of the performance analysis, described in Section 3.3. The VMMs covered in the experiments were:

1. VMware: For VMware measurements, VMware Workstation was used.

2. Microsoft Virtual PC: For Microsoft, Virtual PC was used.

3. Parallels Desktop: For Parallels measurements, Parallels Workstation was used.

4. Xen: Xen is an open source, industry-standard VMM for virtualization.

4.3.3 Test Results

4.3.3.1 Crashme

The crashme tool was run inside the VMware virtual machine and caused it to crash or exit abnormally on only one out of the eight tests run. However, the Microsoft Virtual PC, Parallels Desktop, and Xen VMs crashed or exited abnormally on three, three, and five out of the total eight tests, respectively. Since crashme attempts to execute random byte sequences until a failure is encountered, VMware appears to have the best security against applications generating random bytes and trying to exploit a security loophole, while Xen demonstrates some serious problems in this regard. This is especially important in the scenario where frequent crashes can reveal the piece of code in the VMM that is the source of security flaws and can be exploited to launch security attacks against the VM.

4.3.3.2 Xensploit

The Xensploit tool was used to test the security vulnerability of the VMMs. Xensploit attempts to modify the memory of the guest OS during its migration to another host machine, and changes data stored in a text file to indicate whether the memory corruption was successful or not.

During the test, the tool failed to manipulate the memory for Xen, because the Xen developers had already introduced a fix against the tool after it was brought to their attention in a previous study [23]. VMware was also found to be secure against the tool. However, the memory of the host machine was successfully modified for Parallels Desktop and Microsoft Virtual PC. This demonstrates serious problems with the security implications of Parallels Desktop and Microsoft Virtual PC.

4.3.3.3 Host-to-Guest shared folder

The batch file was easily shared and executed with the VMware VMM, exposing a serious security flaw in VMware. This flaw could not be exploited with the other VMMs, namely Microsoft Virtual PC, Parallels Desktop, and Xen, due to the absence of a host-to-guest shared folder option.

4.3.4 Analysis of Results

Table 17 summarizes the level of security of the different VMMs tested, based upon their responses to the different tests:

Programs                      VMware              Parallels Desktop      Microsoft Virtual PC    Xen
crashme                       HIGH                MODERATE               MODERATE                LOW
                              (one test failed)   (three tests failed)   (three tests failed)    (five tests failed)
Xensploit                     SECURED             UNSECURED              UNSECURED               SECURED
                              (tool failed)       (tool succeeded)       (tool succeeded)        (tool failed)
host-to-guest shared folder   UNSECURED           SECURED                SECURED                 SECURED
                              (feature present)   (feature absent)       (feature absent)        (feature absent)

Table 17: Levels of Security for Different VMMs

It can be concluded from Table 17 that, despite many architectural changes and software improvements in the VMMs to ensure a more secure and isolated environment for different VMs, some serious security flaws remain. For example, VMware is vulnerable to threats originating from the host-to-guest shared folder; Microsoft Virtual PC and Parallels Desktop are vulnerable to threats that try to modify the guest OS during live migration of virtual machines; and Xen is vulnerable to threats that expose the VMM to a stress test of fault handling. Therefore, these security flaws need to be addressed to ensure that real environments are safe sitting behind virtual environments. Many promising technologies are emerging in response to these security threats, and they are discussed in detail in [14]. As stated in [14], the I/O MMU and improvements to the reference monitor look the most promising, because sandboxing the I/O of the VMM has emerged as one of the most critical security concerns.

Chapter 5

CONCLUSION

This project focused on architectural analysis of hardware virtualization extensions, performing a detailed performance analysis of virtualization (including multiple VMMs and quantifying virtualization overhead) and understanding its security implications. The architectural analysis included Intel's VT and AMD's AMD-V extensions. The performance analysis component included quantifying the overhead associated with virtualized environments by running benchmarks, performing a comparative performance analysis of different VMM software and profiling the Xen VMM using Xenoprof to identify performance bottlenecks.

Based on the Simics results, we can conclude that there is a significant amount of overhead involved in running an application on a virtual machine as compared to the physical machine. Running these benchmarks inside the software VMMs confirmed this. Furthermore, in the case of running benchmarks inside the VM created using the Microsoft Virtual PC software, the overhead was much higher than with the one created using VMware Workstation. The virtualization overhead can be attributed to the frequent switches between the host and the guest for the execution of privileged instructions, which ultimately entail the execution of the Vmenter and Vmexit instructions. This clearly supports the case for reducing, in future architectures, the number of Vmexits and Vmenters caused by the switches between the host and the guest, or for reducing the number of cycles these instructions take to execute.

The difference observed in the behavior of the Virtual PC based VM and that of the VMware Workstation based VM points to a very important conclusion: the virtualization overhead is strongly dependent on the VMM used to create the virtual machine. Another very important result was that the use of hardware virtualization extensions reduced the overhead considerably. This was observed by running the SPECCPU 2006 benchmark over one VMware virtual machine using the hardware VT extensions and another which was not.

Many architectural changes and improvements have made the virtual machines more secure and resilient to potential attacks. However, there remain many security loopholes that can be exploited by a compromised VM. The results obtained therefore demonstrate the need for further research into virtualization security and prove that virtualization is no security panacea.

Overall, it can be concluded that despite many architectural changes and software improvements in the VMMs to ensure less overhead and a more secure and isolated environment for different VMs, considerable overhead and security risks due to virtualization still remain.

REFERENCES

[1] Intel Corporation, "Intel Virtualization Technology Specification for the IA-32 Intel Architecture," whitepaper, April 2005.

[2] R. Uhlig, G. Neiger, D. Rodgers et al., "Intel Virtualization Technology," IEEE Computer Magazine, May 2005.

[3] Advanced Micro Devices, "AMD 'Pacifica' Virtualization Technology," whitepaper, 2005.

[4] SPEC 2006, http://www.spec.org/

[5] Freebench, http://www.freebench.org

[6] Lmbench, http://www.bitmover.com/lmbench

[7] Pratt, I., Fraser, K., Hand, S., Limpach, C., Warfield, A., Magenheimer, D., Nakajima, J., and Mallick, A., "Xen 3.0 and the Art of Virtualization," In Proceedings of the Ottawa Linux Symposium (2005).

[8] Xenoprof - System-wide profiler for Xen VM, http://xenoprof.sourceforge.net/

[9] VMware Workstation, http://www.vmware.com/products/ws/

[10] Parallels Workstation, http://www.parallels.com/en/products/workstation/

[11] Microsoft Virtual PC, http://www.microsoft.com/windows/products/winfamily/virtualpc/default.mspx

[12] Perfmon, http://msdn2.microsoft.com/en-us/library/aa645516(VS.71).aspx

[13] R. Sailer, "sHype Hypervisor Security Architecture - A Layered Approach Towards Trusted Virtual Domains," 1st Workshop on Advances in Trusted Computing, March 2006, Tokyo, Japan.

[14] R. Sailer, T. Jaeger et al., "Building a General-purpose Secure Virtual Machine Monitor," Research Report, IBM T.J. Watson Research Center, February 2005.

[15] J. S. Robin and C. E. Irvine, "Analysis of the Intel Pentium's Ability to Support a Secure Virtual Machine Monitor," In Proceedings of the 9th USENIX Security Symposium, USENIX Association, 2000.

[16] D. Abramson et al., "Intel Virtualization Technology for Directed I/O," Intel Technology Journal, Vol. 10, Issue 03, August 2006.

[17] Xen Memory Management, wiki available at www.xensource.com.

[18] B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, I. Pratt, A. Warfield, P. Barham, and R. Neugebauer, "Xen and the Art of Virtualization," In Proc. of the ACM Symposium on Operating Systems Principles, October 2003.

[19] B. Clark, T. Deshane, E. Dow, S. Evanchik, M. Finlayson, J. Herne, and J. N. Matthews, "Xen and the Art of Repeated Research," In Proc. of the USENIX Annual Technical Conference, FREENIX Track, July 2004.

[20] Iperf: The TCP/UDP Bandwidth Measurement Tool, http://dast.nlanr.net/Projects/Iperf/

[21] Pratt, I., Fraser, K., Hand, S., Limpach, C., Warfield, A., Magenheimer, D., Nakajima, J., and Mallick, A., "Xen 3.0 and the Art of Virtualization," In Proceedings of the Ottawa Linux Symposium (2005).

[22] OProfile, http://oprofile.sourceforge.net/

[23] J. Oberheide, E. Cooke, F. Jahanian, "Empirical Exploitation of Live Virtual Machine Migration," http://www.eecs.umich.edu/techreports/cse/2007/CSE-TR-539-07.pdf, March 2008.

[24] T. Ormandy, "An Empirical Study into the Security Exposure to Hosts of Hostile Virtualized Environments," http://taviso.decsystem.org/virtsec.pdf, 2007.

[25] T. Garfinkel, M. Rosenblum, "A Virtual Machine Introspection Based Architecture for Intrusion Detection," In Proc. Network and Distributed Systems Security Symposium, February 2003.

[26] www.simics.net

[27] http://user-mode-linux.sourceforge.net/