DEGREE PROJECT IN COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS STOCKHOLM, SWEDEN 2016

A Study of OpenStack Networking Performance

PHILIP OLSSON

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF COMPUTER SCIENCE AND COMMUNICATION

Master’s Thesis at CSC
Supervisor: Dilian Gurov
Examiner: Johan Håstad
Supervisor at Ericsson AB: Max Shatokhin

June 17, 2016

Abstract

Cloud computing is a fast-growing sector among software companies. Cloud platforms provide services such as spreading out storage and computational power over several geographic locations, on-demand resource allocation and flexible payment options. Virtualization is a technology used in conjunction with cloud technology and offers the possibility to share the physical resources of a host machine by hosting several virtual machines on the same physical machine. Each virtual machine runs its own operating system, which makes the virtual machines hardware independent. The cloud and virtualization layers add additional layers of software to the server environments to provide the services. The additional layers add latency overhead, which can be problematic for latency-sensitive applications. The primary goal of this thesis is to investigate how the networking components impact the latency in an OpenStack cloud compared to a traditional deployment. The networking components were benchmarked under different load scenarios, and the results indicate that the additional latency added by the networking components is not particularly significant in the network setup used. Instead, a significant performance degradation could be seen in the applications running in the virtual machine, which caused most of the added latency in the cloud environment.

Referat

En studie av Openstack nätverksprestanda

Cloud services are a fast-growing sector among software companies. Cloud platforms provide services such as spreading out storage and computational power over different geographic areas, on-demand resource allocation and flexible payment methods. Virtualization is a technique used together with cloud technology and offers the possibility to share the physical resources of a host machine between different virtual machines running on the same physical computer. Each virtual machine runs its own operating system, which makes the virtual machines hardware independent. The cloud and virtualization layers add further software layers to server environments in order to make these techniques possible. The extra software layers add an overhead to the response time, which can be a problem for applications that require fast response times. The primary goal of this degree project is to investigate how the extra networking components in the OpenStack cloud platform affect the response time. The networking components were evaluated under different load scenarios, and the results indicate that the extra response time caused by the additional networking components does not matter much in the network setup used. A significant performance degradation was instead seen in the applications running on the virtual machine, which accounted for the larger part of the increased response time.

Glossary

blade Server computer optimized to minimize power consumption and physical space.

IaaS Infrastructure as a Service.

IP Internet Protocol.

KVM Kernel-based Virtual Machine.

OS Operating System.

OVS Open vSwitch.

QEMU Quick Emulator.

SLA Service Level Agreement.

VLAN Virtual Local Area Network.

VM Virtual Machine.

Contents

Glossary

1 Introduction
  1.1 Motivation
  1.2 Problem statement
  1.3 Approach
  1.4 Contributions
  1.5 Delimitations
  1.6 Structure Of The Thesis

2 Background
  2.1 Virtualization and Cloud Computing
    2.1.1 Virtualization
    2.1.2 Cloud Computing
  2.2 OpenStack
    2.2.1 Keystone Identity Service
    2.2.2 Nova Compute
    2.2.3 Neutron Networking
    2.2.4 Cinder Block Storage Service
    2.2.5 Glance Image Service
    2.2.6 Swift Object Storage

3 Related work

4 Experimental Setup
  4.1 Physical Setup For Native And Virtual Deployment
  4.2 Native Blade Deployment Architecture
  4.3 Virtual Deployment Architecture

5 Method
  5.1 Load Scenarios
  5.2 Measuring Network Performance
  5.3 Measuring Server Performance
  5.4 Measuring Load Balancer Performance
  5.5 Measuring Packet Delivery In And Out From VM

6 Results
  6.1 Time Spent In Blade Native Versus Virtual Deployment
  6.2 Time Distribution Native Versus Virtual Deployment
  6.3 Network Components Impact On Latency
  6.4 Server Performance Impact On Latency
  6.5 Load Balancer Impact On Latency
  6.6 Packet Delivery In And Out From VM Impact On Latency

7 Discussion and Analysis
  7.1 Network Components Performance
  7.2 Load Balancer Performance
  7.3 Server Performance
  7.4 Packet Delivery In And Out From The VM

8 Conclusion

9 Future Work

10 Social, Ethical, Economic and Sustainability aspects

Bibliography

Chapter 1

Introduction

This chapter introduces the concepts of the thesis, the motivation for the project, the investigated problem and the chosen approach. It also provides the delimitations of the project and finally describes the structure of the thesis.

1.1 Motivation

Migrating applications to a cloud environment has in recent years become a popular strategy among software companies. Deploying architecture and software in cloud environments, such as OpenStack, provides benefits such as spreading out storage and computational power over several geographic locations, on-demand resource allocation, pay-as-you-go services, and small hardware investments [18]. Data centers exploit virtualization techniques, which can increase the resource utilization of physical servers by letting several virtual machines (VMs), isolated from each other, run simultaneously on the same physical machine [10, 26]. Virtualization and cloud techniques make it possible to create systems that are easy to scale horizontally, i.e., by adding more servers to the server environment, and can make maintenance of both software and hardware easier.

Depending on who the stakeholder is, a cloud environment can provide different benefits. By using virtualization techniques, standardized hardware can be used, which can make it less costly to buy hardware for large data centers [10]. Customers who want to deploy arbitrary applications inside VMs in the cloud have the opportunity to pay only for the hardware and bandwidth needed for their applications to run. Furthermore, customers or companies that use a cloud platform can easily scale their resource needs, which limits the financial costs and the investment burden usually associated with scaling up or down.

The cloud environment adds additional layers of software abstraction to the server environments, compared to traditional server environments, in order to provide the services. The additional layers are, for example, the extra networking components and the hypervisor layer. The hypervisor is responsible for hosting one or several VMs on a physical host. Open vSwitch (OVS) and Linux Bridges are referred to as networking components, responsible for switching traffic inside a cloud. It is possible to configure the networking in a cloud in several different ways, and in this thesis an OpenStack provider network setup with OVS and Linux Bridges is studied. OVS and Linux Bridges are commonly used in cloud computing platforms [22]. The additional layers in a cloud environment add latency overhead, which can be crucial for latency-sensitive applications. Therefore, it is important to understand where the additional latency derives from, to be able to prevent it if possible.

Currently, Ericsson is using many different specialized hardware components on which their systems run. By migrating products to a cloud environment, it is possible to use standardized hardware and, by doing so, possibly lower the costs of investments in new hardware and of maintenance of both hardware and software. However, reducing costs by using standardized hardware is sometimes referred to as a myth. Even though cheaper standardized hardware could be used, it might not lower the costs, since specialized hardware can have other characteristics that provide benefits that standardized hardware does not have. These can, for example, be better performance per price unit, lower power consumption, and improved cooling technology. By being able to run systems on both standardized and specialized hardware, customers are offered the option to choose what they want.

In traditional deployment, also referred to as native deployment, both upgrading of software and scaling up are complicated. The real benefit of using virtualization and cloud technologies for Ericsson is the ability to horizontally scale up the system when needed and to make software maintenance easier, which implies lower costs.

When Ericsson migrates their MTAS product to run in the cloud, also referred to as virtual deployment, they are experiencing higher latency from their system in comparison to native deployment. The MTAS product is, for example, responsible for setting up different kinds of calls between subscribers and for handover of calls to different subsystems when subscribers move between various network zones, such as moving from a 4G to a 3G or 2G network zone. There are well-defined requirements on latency, and in native deployment the latency is well below these requirements. In virtual deployment the latency moves closer to the threshold of what can be tolerated. The latency is not allowed to exceed the requirements, in order to provide the best possible user experience.

1.2 Problem statement

The main question to answer in this thesis is: Given a provider network setup in virtual deployment, how much impact on the latency do the added networking components have, in comparison to the latency in native deployment?


1.3 Approach

The networking components in virtual deployment were benchmarked under different load scenarios using TCP as the transport protocol with a payload of 1000 or 2000 bytes. Identical tests were performed on virtual and native deployment, where the latency of the native deployment gives a baseline reference. The benchmarking was done by calculating how much time, on average, the TCP packets spent between the different networking components in virtual deployment under the different scenarios. In chapter 5 a detailed description of the method is provided.

1.4 Contributions

The focus of this thesis is to determine how much latency a set of commonly used networking components (OVS and Linux Bridges) adds to a virtualized environment. Other components in the system were also benchmarked on latency to investigate how much of the added latency they contributed. More specifically, the other investigated components were an echo server, a load balancer and the process of passing IP packets in and out from the VM. A full description of this can be found in chapter 5.

1.5 Delimitations

The latency of interest is how much longer a packet spends inside a physical node in virtual deployment compared to a physical node in native deployment. The focus of the thesis is to determine how much latency the networking components add in virtual deployment. The experimental setup consists of a single compute node responding to requests, also referred to as a payload. Other aspects could be considered, such as CPU consumption, memory usage, disk I/O performance or maximum network throughput, but they are out of the scope of this project. The load balancer is optimized for a cluster containing two or more payloads. Therefore, it is not possible to guarantee that the profiling results of the load balancer will be the same in a production cluster.

1.6 Structure Of The Thesis

This thesis report consists of ten chapters. Chapter 1 introduced the concepts and the research goal of this thesis. Chapter 2 brings up the necessary background knowledge. Chapter 3 presents previous related work. Chapter 4 presents the testbed used in both native and virtual deployment. Chapter 5 gives a description of the method used in this thesis. Chapter 6 presents the results from the conducted experiments. Chapter 7 discusses and analyzes the obtained results from the experiments. Chapter 8 presents conclusions from the obtained results. Chapter 9 suggests some topics for further research related to cloud computing and virtualization. Finally, in chapter 10, social, ethical, economic and sustainability aspects related to this project are discussed.

Chapter 2

Background

This chapter brings up the necessary background knowledge needed to answer the research question. In particular, the chapter introduces the concepts of virtualization and cloud computing and gives a brief description of the OpenStack cloud platform.

2.1 Virtualization and Cloud Computing

2.1.1 Virtualization

Virtualization in computer science refers to creating virtual versions of computer components including, but not limited to, network and storage devices, hardware platforms and operating systems. A virtual machine (VM) is an emulation of a computer system running inside another computer system. A VM is often referred to as a guest running inside a host. A hypervisor is a software abstraction of hardware responsible for hosting one or several guest operating systems (OSs) simultaneously on a single physical machine [29]. In Figure 2.1 a high-level overview of the traditional computer architecture is shown. The operating system runs on top of the hardware, and the applications run on the operating system.

Figure 2.1. Traditional computer architecture with the application, OS and hardware layers.

There are two types of hypervisors, often referred to as Type 1 and Type 2 hypervisors [5]. In Figure 2.2 an overview of a virtual computer architecture is shown, illustrating the two types of hypervisor architecture.

A Type 1 hypervisor, often referred to as a bare-metal hypervisor, runs directly on the hardware of the host. Usually, a chosen guest system is responsible for the management and supervision of new guests on the hypervisor [5]. KVM and VMware ESXi are examples of bare-metal hypervisors [25].

A Type 2 hypervisor requires an OS first to be installed on the computer. The installed OS is referred to as the host OS. A Type 2 hypervisor runs on top of the host OS, and each guest OS runs as a normal process on the host OS [5]. An example of a Type 2 hypervisor is Oracle VirtualBox.

Figure 2.2. Overview of the computer architecture with a Type 1 and Type 2 hypervisor.

2.1.2 Cloud Computing

Cloud computing is a technology that utilizes the outcome of virtualization technology. It is a service that delivers a platform to manage virtualized resources such as hardware and VMs by letting end users add or remove VM instances in a cloud cluster, configure the IP infrastructure and provide monitoring of service level agreements (SLAs). SLAs make sure that only the agreed resources are used, and a cloud platform should offer the possibility of resource extension or contraction to easily scale up or down dynamically. OpenStack is a free open source cloud computing platform used by hundreds of the world's largest companies to run their businesses [12].


2.2 OpenStack

OpenStack is an operating system consisting of a set of open source software tools allowing its users to create private and public clouds [12, 17, 26]. The OpenStack operating system manages pools of compute, storage and networking resources, which are configurable via both a web interface and a command line interface [17]. OpenStack clouds are powered by modular components called OpenStack projects [17]. Each OpenStack project has its own area of responsibility, and it is possible to add any number of projects to an OpenStack cloud to satisfy the requirements that need to be met. Together the projects build up a complete Infrastructure as a Service (IaaS) platform. The main components are the six core services of OpenStack, which are:

• Swift - Object storage

• Keystone - Authentication and authorization service

• Nova - Compute

• Neutron - Networking

• Cinder - Block storage

• Glance - Image service

A typical optional service to include in an OpenStack cloud is Horizon. Horizon is a web-based interface that lets end users manage and configure the cloud. A typical OpenStack cloud consists of several nodes, where each node hosts one or several services, for example:

• A controller node - manages the cloud

• A network node - providing network services to the cloud

• One or several compute nodes - running the virtual machines

• One or several storage nodes - responsible for storing data and virtual machine images

2.2.1 Keystone Identity Service

Keystone is the OpenStack service for authentication and authorization. Keystone is used to authenticate and authorize users and API calls from other OpenStack services.


2.2.2 Nova Compute

The Nova compute service manages running instances in the OpenStack cloud. Nova interacts with several other services, such as Keystone to perform authentication and authorization, Horizon to provide an administrative web interface and Glance to provide images. The compute service has the responsibility to boot the VMs with the virtual machine images provided by Glance, schedule the VMs and connect them to virtual networks inside the cloud.

Nova consists of several components, and the most important ones are the API server, the scheduler, and the messaging queue. The API server allows end users and other OpenStack services to communicate with the cloud controller [9]. The collection of compute components is called the cloud controller and represents the global state of the cloud. The cloud controller communicates with other OpenStack services through the messaging queue [11]. The scheduler's task is to allocate physical resources to virtual resources by identifying the most suitable compute node. The messaging queue provides communication between processes in OpenStack [15].

Nova supports several hypervisors by providing an abstraction layer for compute drivers. QEMU/KVM with libvirt is the default and the only hypervisor fully supported by OpenStack [13, 14]. Other hypervisors supported by Nova are, for example, Hyper-V, VMware, XenServer and Xen via libvirt. The support for all hypervisors is not equal, and not all of them support the same features [2, 14]. Kernel-based Virtual Machine (KVM) is a kernel module that provides virtualization infrastructure for Linux machines using x86 hardware. QEMU is a generic machine emulator and virtualizer. When QEMU is used as an emulator, it can run programs and operating systems made for a specific machine on another machine, for example running software made for an ARM board on a personal desktop. To run QEMU as a virtualizer, it has to run together with Xen or KVM. When used together with KVM, it can be used as virtualization software for the x86 architecture capable of achieving close to native performance by letting the guest code execute directly on the physical host's CPU. [19]
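To make the division of responsibilities concrete, the sketch below boots a VM through the Nova API using the openstacksdk Python client. Nothing in the sketch comes from the thesis setup: the client library, the cloud name and the image, flavor and network names are placeholder assumptions chosen only for illustration.

```python
# Hypothetical sketch: booting a VM via the Nova API with openstacksdk.
# The cloud name, image, flavor and network names are placeholders.
import openstack

conn = openstack.connect(cloud="example-cloud")  # credentials read from clouds.yaml

image = conn.compute.find_image("ubuntu-14.04")      # image registered in Glance
flavor = conn.compute.find_flavor("m1.large")        # defines vCPUs/RAM of the guest
network = conn.network.find_network("provider-net")  # network managed by Neutron

# Nova schedules the instance on a compute node and asks QEMU/KVM to boot it,
# while Neutron plugs the vNIC into the requested network.
server = conn.compute.create_server(
    name="payload-vm",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)
print(server.status)  # "ACTIVE" once the instance is up
```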

2.2.3 Neutron Networking

The Neutron networking service provides networking as a service for other OpenStack services, e.g., OpenStack Compute. Neutron uses Keystone for authentication and authorization of all API requests. Neutron handles the virtual networking infrastructure, which includes creation and management of networks, switches, subnets, routers, firewalls and virtual private networks (VPNs). When a new VM is created, the Nova compute API communicates with the Neutron API to connect the VM correctly to the specified networks. It does so by plugging the virtual network interface cards (vNICs) of the VM into the particular virtual networks with the use of Open vSwitch (OVS) bridges. OVS is an open source virtual switch used to bridge traffic between VMs and external networks by connecting the interfaces of VMs and physical network interface cards. OVS is intended to be used in multi-node virtualization deployments, for which the Linux Bridge is not well suited [27].

Tenant Networks

OpenStack supports multitenancy, where a tenant is a group of OpenStack users [11]. Each tenant in the cloud requires its own logical network to isolate access to compute resources. In OpenStack, this is provided by network isolation. Neutron provides support for four different types of network isolation and overlay technologies to isolate applications and tenants from each other in a cloud environment. [16]

Flat

All hosts and VMs exist on the same network. There is no Virtual Local Area Network (VLAN) tagging or network segregation taking place, making it possible for two VMs belonging to different tenants to see each other's traffic.

VLAN

VLAN allows separation of provider or tenant network traffic by using VLAN IDs that map to real VLANs in the data center. Neutron enables users to create multiple provider or tenant networks that correspond to the physical network in a data center. [4, 16]
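As an illustration of how such a VLAN-backed network is typically defined, the sketch below creates a provider network with openstacksdk. This is not part of the thesis configuration; the physical network label "physnet1", the VLAN ID 101 and the subnet range are placeholder assumptions.

```python
# Hypothetical sketch: defining a VLAN provider network with openstacksdk.
# "physnet1", VLAN ID 101 and the CIDR are placeholder values.
import openstack

conn = openstack.connect(cloud="example-cloud")

net = conn.network.create_network(
    name="provider-vlan-101",
    provider_network_type="vlan",           # tagged with a real VLAN in the data center
    provider_physical_network="physnet1",   # label of the physical network in the ML2 config
    provider_segmentation_id=101,           # the VLAN ID carried on the wire
    shared=True,
)
conn.network.create_subnet(
    network_id=net.id,
    ip_version=4,
    cidr="203.0.113.0/24",
    gateway_ip="203.0.113.1",
)
```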

GRE and VXLAN

Generic Routing Encapsulation (GRE) and Virtual Extensible LAN (VXLAN) are protocols used for encapsulation to create overlay networks that increase the scalability of large computing deployments. The techniques provide the possibility to create a layer-2 network on top of a layer-3 network to provide and control communication between VMs across different networks. The source and destination switches are then allowed to act as if they have a virtual point-to-point connection between them.

Network Deployment scenario

There are many different ways to configure the Neutron networking service. One scenario is to use the Neutron ML2 plugin with Open vSwitch and a provider network. Provider networks map to existing physical networks in a data center [16]. The advantages of using provider networks are simplicity, better performance, and reliability, with the drawback of less flexibility. The networking software components handling layer-3 operations are the ones that impact performance and reliability the most. Better performance and reliability are achieved by moving layer-3 operations to the physical network infrastructure. [22]

To send traffic between VMs and the external network in a provider network scenario, the minimum requirement is one controller node and one compute node. The controller node requires two network interfaces, management and provider. The physical network infrastructure switches/routes traffic to external networks from a generic network to which the provider interface is connected. The difference between the provider network and the external network is that the provider network is available to instances, while the external network is only available via a router. A provider network is a specific VLAN, and a generic network is a network providing one or more VLANs. [22]

The compute node also needs the management and provider interfaces. The provider interface also connects to a generic network that the physical network infrastructure routes to external networks. A general overview of the provider network layout can be seen in Figure 2.3. As the figure shows, all nodes connect to the physical network infrastructure, which takes care of the switching and routing. All nodes run switch services to provide connectivity to the VMs within the nodes. The controller node runs the Dynamic Host Configuration Protocol (DHCP) service.

Figure 2.3. Overview of a provider network layout where all nodes connect directly to the physical network infrastructure. [22]

Figure 2.4 shows the networking components of the controller node. A tap port, or tap device, is a virtual network kernel device operating on layer-2 Ethernet frames. Hypervisors use tap devices to deliver Ethernet frames to guest operating systems. Patch ports are ports that connect OVS bridges. When traffic is sent from a VM to the external network, the integration bridge, br-int, adds an internal tag for the provider network and forwards the traffic to the provider bridge, br-ex. The provider bridge replaces the internal tag with the real VLAN segmentation ID and forwards the traffic to the physical network. Figure 2.4 contains two provider networks to illustrate that it is possible to have several provider networks connected to the same physical network. The controller node has a DHCP agent for each provider network that provides the network with DHCP services.

Figure 2.4. Network components of the controller node. [22]

As seen in Figure 2.5, the compute node also has the integration and provider bridges, like the controller node. In addition to this, the compute node has a Linux Bridge to manage security groups for instances, due to limitations in Open vSwitch and iptables [22, 8]. Figure 2.5 also contains two provider networks for illustration purposes.


Figure 2.5. Network components of the compute node. [22]

Figure 2.6 describes the network traffic flow and the components involved when IP packets are sent between a VM and an external network. In essence, when an IP packet is sent from a VM to the Internet, it goes through the three bridges on the compute node and gets delivered to the physical network infrastructure. The physical network infrastructure does the switching and routing out to the Internet. Each IP packet sent from a VM to the external network is processed by 13 different virtual or physical network devices before reaching the public Internet.


Figure 2.6. Traffic flow between a virtual machine and an external network. [22]

2.2.4 Cinder Block Storage Service

Cinder is the block storage service for OpenStack. It is designed to provide block storage resources to end users through the use of a reference implementation such as Logical Volume Management (LVM) or Network File System (NFS), or other plugin drivers for storage. Cinder provides end users with basic API requests to create, delete and attach volumes to virtual machines, as well as more advanced functions such as extending a volume, creating snapshots or cloning a volume. Cinder lets end users request and consume resources without requiring knowledge of where the storage is located or on what type of device it is deployed. [3]


2.2.5 Glance Image Service

Glance is the image service used in OpenStack. Images contain already installed operating systems. Glance provides functionality to discover, register and retrieve virtual machine images. The Glance API allows users to retrieve both the actual image and metadata about the VM image. Glance supports multiple back-end systems that can be used as storage, e.g., simple file systems or object storage systems like Swift. [6]

2.2.6 Swift Object Storage

Swift is built to provide storage for large data sets with a simple API. It scales well and uses eventual consistency to provide high availability and durability for the stored data. [23]

Chapter 3

Related work

Ristov et al. [21] investigated how the performance of compute- and memory-intensive web services changed as they were migrated to a cloud environment. According to their study, the performance in a cloud setup could drop by around 73% compared to when the same hardware setup was used without virtualization.

Xie et al. [28] studied the maximum speed of database transactions with 30 users by comparing a bare-metal physical computer and a virtual computer launched by the VMware hypervisor. The experimental setup was identical on both machines, and the database used was Oracle. Their results revealed that the bare-metal machine had a performance gain of 12.42% compared to the virtual machine.

Barker and Shenoy [1] studied how varying background load from other virtual machines running on the same physical cloud server interfered with the performance of latency-sensitive tasks. The measurements were done in a Xen-based laboratory cloud, and the background load was systematically introduced to the system level by level. The background load consisted of CPU and disk jitter from other virtual machines. According to the results, the throughput could decrease due to the background load from other virtual machines. The CPU throughput was fair when the CPU allocations were capped by the hypervisor. Due to significant disk interference, up to 75% degradation in disk latency was experienced when the system was under sustained background load.

Rathore et al. [20] compared Kernel-based Virtual Machines (KVM) and Linux Containers (LXC) as techniques to be used for virtual routers. They found that KVM is a potential bottleneck at high loads due to packet switching between kernel space and user space.

Yamato [30] compared the performance of KVM, containers and bare-metal machines. Compared to the bare-metal machine, the results showed that the Docker containers had a performance degradation of around 75%, and KVM had a performance degradation of around 60%.

Wang and Ng [7] studied the end-to-end networking performance in an Amazon EC2 cloud. They observed that when the physical resources are being shared, higher latency and unstable throughput of TCP and UDP messages in the instances were experienced. Their conclusion was that virtualization and processor sharing were causing the unstable network characteristics.

Chapter 4

Experimental Setup

This chapter describes the architecture of the testbed. It describes the setup for the physical architecture for both native and virtual deployment. The setup in virtual deployment is similar to the OpenStack reference configuration as described in section 2.2.3 with a few changes.

4.1 Physical Setup For Native And Virtual Deployment

Figure 4.1 shows the physical architecture for native and virtual deployment. The physical testbed used in the experiments consisted of a set of blade servers and two routers located in the server cabinet. The load machine was connected to a Juniper lab router which forwarded the traffic to the cabinet. When the load machine sends traffic to the blade server, the traffic passes via the Juniper router and one of the two routers attached to the backplane of the cabinet.

To ensure high availability, there are two routers connected to the backplane, one active and one passive, both Open Shortest Path First (OSPF) and Bidirectional Forwarding Detection (BFD) compatible. OSPF is a routing protocol that calculates the shortest path in a network with Dijkstra's algorithm. The protocol detects link failures and recalculates the path if a link goes down. BFD is a low-overhead protocol used to detect link failures in a network and is used in conjunction with OSPF to detect link failures faster. The physical blades in a cabinet are identical and have 64 GB of RAM and a 10-core 2.40 GHz CPU with hyper-threading, which makes 20 cores available for the hypervisor. The backplane of the cabinet is connected to the two routers with a 10 Gb link.

On both native and virtual deployment, there are two cluster controllers (CC1 and CC2) and one active payload, all running on separate blades. In virtual deployment, there is one additional active blade for the cloud controller. For the scope of this project, the cluster controllers only let the payload boot from them via the network. The payload is the machine that ends up processing the traffic. The experimental setup is a minimal configuration, and a real production cluster consists of several active payloads. To balance the load between the machines in a production cluster, a set of the payloads is equipped with load balancing functionality which distributes the traffic among the payloads in a round-robin way. The active payload used in the experiments was equipped with the load balancer, and all traffic sent in and out from the payload passed through it.

Figure 4.1. A general overview of the physical testbed used in the experiments.


4.2 Native Blade Deployment Architecture

On native deployment, there are in total three active blades: the two cluster controllers and the payload which serves traffic. The physical architecture of the native testbed is shown in Figure 4.2.

Figure 4.2. The physical architecture of the native deployment testbed.

An overview of the components in a payload blade when an app is running on a native node is shown in Figure 4.3. The physical blade is running SUSE Linux as the OS, and the app is running on top of the OS. The load balancer listens on the eth0 interface for incoming traffic and forwards incoming traffic to the O&M interface. For outgoing traffic, the load balancer listens on the O&M interface and forwards outgoing traffic to the backplane of the cabinet via the eth0 interface. The server running on the node listens and sends traffic on the O&M interface.


Figure 4.3. The components of the compute blade in the native deployment.

4.3 Virtual Deployment Architecture

The physical architecture of the virtual testbed is shown in Figure 4.4. The physical setup consists of four identical physical blades in one OpenStack installation. One of the blades is running the payload as a VM which serves traffic, two blades are running the two cluster controllers (CC1 and CC2) as VMs, and the fourth blade is running the cloud controller that manages OpenStack. Note that there is no network node controlling the routing. As explained in section 2.2.3, the physical network infrastructure is handling the layer-3 operations. The physical nodes are running Ubuntu 14.04 as the host operating system. The nodes are using QEMU/KVM version 2.0.0 as the hypervisor. The payload is a VM running SUSE Linux as the OS, as in the native case. The virtual hardware used by the payload consists of 10 virtual CPUs and 58 GB of RAM. The reason why the virtual payload only has 58 GB of RAM available, compared to 64 GB of RAM in the native case, is that the VM shares memory with the hypervisor and the host OS. Mirantis1 7.0 was used to deploy the OpenStack cloud on the nodes.

In Figure 4.5 an overview of the components inside a virtual blade hosting the payload is shown. The tap device is connected to the Ethernet interface of the VM and the OVS integration bridge (br-int). Br-int and the OVS provider bridge (br-prv) are connected to each other via a patch port. The Linux Bridge (br-aux) connects the provider bridge with the physical network interface eth0 by sharing the interface (pe) between the Linux Bridge and the provider bridge. The experiments conducted in this thesis used OVS version 2.3.1 and Linux bridge-utils version 1.5. When an IP packet in a virtual node travels from the physical interface of the node to the application, it has to travel through five additional virtual interfaces compared to native deployment.

1https://www.mirantis.com/


Figure 4.4. The physical architecture of the virtual deployment test set up.


Figure 4.5. The components of the compute blade in the virtual deployment.

Chapter 5

Method

This chapter presents the method and tools used to profile the performance of the components in native and virtual deployment with Ericsson software installed on the nodes. In both the native and the virtual case, a Python client located on the load machine sent concurrent requests to a Java server located on the payload. The server was multithreaded to be able to respond to the concurrent requests simultaneously. A new thread was created by the server for each incoming TCP connection, and the operating system on the node scheduled the threads on the different cores of the CPU.

The nodes used in the experiments had pre-installed Ericsson software. One of the installed software components was a load balancer that distributes traffic over several payloads in a production cluster. The load balancer internally uses a tunnel technology to send packets between different payloads. To measure how much time it takes for a TCP packet to travel from the physical network interface of the blade to the O&M interface of the payload, only one payload could be active. The reason for this is that the load balancer corrupts the network flow, so it is not possible to determine which payload a request will end up in when several payloads are active at the same time. By having only one active payload, it is also possible to measure how much time the load balancer spends on processing an incoming and outgoing request, as well as how long it takes to send packets in and out from the VM in virtual deployment. An important note is that the result from the profiling of the load balancer does not necessarily imply that the performance of the load balancer will be the same in a production cluster with several payloads active, since it is optimized for a cluster with several active payloads.

5.1 Load Scenarios

Identical tests were performed on native and virtual deployment. The results from the native deployment are used as a baseline for comparison between the two systems, and the latency was measured with respect to how long a packet spent inside the physical blade on both setups.

From the load machine, TCP requests carrying a payload of 1000 or 2000 bytes were sent to the server inside the payload. The server responded with an equal amount of bytes to the load machine. The reason for choosing 1000 and 2000 bytes of payload in a TCP packet was to investigate whether segmentation of TCP packets affected the performance.

The experiments were carried out by varying the load sent from the client to the server. The load is defined by how many requests/second were sent from the load machine. In the experiments, the load varied from 100 up to 1000 requests/second. The client on the load machine gradually increased the load from 100 to 1000 requests/second by letting each load level run for 10 seconds and then increasing the load by 100 requests/second, up to 1000 requests/second.

To be able to track the correct TCP packets, a specific session ID was inserted into the payload of each TCP packet. By doing so, it was possible to track how long a particular packet spent between different interfaces and how long it took for the server to process a specific request. The results are presented as the average times for the different components under the different load scenarios. There are other metrics by which latency can be measured, such as percentiles of the measured times, but these were not considered in this thesis. In addition to this, a relative cost is calculated, showing how much time a specific component added in latency in virtual deployment compared to native deployment. The relative cost is presented in percent and is defined in equation 5.1 below.

$$ cost = \frac{time_{measured}}{latency_{diff}} \times 100 \qquad (5.1) $$

where $time_{measured}$ is the time taken by the component that was measured and $latency_{diff}$ is the latency difference between virtual and native deployment. For example, at 100 requests/second with 1000-byte payloads the networking components took 0.057 ms on average (Table 6.3), while the total added latency was 0.89 ms (Table 6.2), giving a relative cost of 0.057/0.89 × 100 ≈ 6.4%.
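The load generation described above can be summarized by the following Python sketch. It is not the client used in the experiments; the server address, the session-ID format and the padding of the payload are assumptions made for the example.

```python
# Sketch of the load client (assumed details: server address, session-ID format,
# payload padding). Ramps the load from 100 to 1000 requests/second, 10 s per level.
import socket
import threading
import time

SERVER = ("198.51.100.10", 5000)  # placeholder address of the payload blade
PAYLOAD_SIZE = 1000               # 1000 or 2000 bytes in the experiments

def send_request(session_id: int) -> None:
    """Send one tagged TCP request and read back the echoed bytes."""
    tag = f"SID={session_id:010d};".encode()      # session ID used to track the packet
    body = tag + b"x" * (PAYLOAD_SIZE - len(tag))
    with socket.create_connection(SERVER) as sock:
        sock.sendall(body)
        received = 0
        while received < PAYLOAD_SIZE:            # the server echoes an equal amount of bytes
            chunk = sock.recv(4096)
            if not chunk:
                break
            received += len(chunk)

session = 0
for rate in range(100, 1100, 100):                # 100, 200, ..., 1000 requests/second
    deadline = time.time() + 10                   # each load level runs for 10 seconds
    while time.time() < deadline:
        threading.Thread(target=send_request, args=(session,), daemon=True).start()
        session += 1
        time.sleep(1.0 / rate)
```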

5.2 Measuring Network Performance

To measure the amount of time that packets spent inside the node in native and virtual deployment, and to profile where time was being lost in the virtual node, tcpdump1 was used. Tcpdump is a Unix-based tool used to sniff network traffic on a given network interface and lets the user know at what time a packet arrived at the interface. The timestamp provided by tcpdump reflects the time when the kernel applied the timestamp to the packet, and the time is as accurate as the kernel's clock [24]. The clock source used by the system was tsc.

1 http://www.tcpdump.org/

Tcpdump was used on a set of the network interfaces that the packets traveled through in a node on both deployments. By sniffing at the network interfaces, it was possible to calculate how much time the intermediate networking steps took in the virtual setup. It was also possible to calculate the amount of time that the load balancer took to process traffic, the amount of time taken to send packets in and out from the VM, and the total amount of time a packet spent inside the physical blade. More specifically, tcpdump was used on the following network interfaces in virtual deployment:

• eth0 - the physical interface of the blade to which the incoming TCP packets first arrive

• pe - the shared interface between the Linux Bridge, br-aux, and the OVS bridge br-prv

• tap - the tap device used by the hypervisor to inject the packets to the network stack of the virtual machine

• eth2 - the interface of the payload to which the incoming TCP packets first arrive before being processed by the load balancer

and for native deployment, tcpdump was used on the following interface:

• eth0 - the physical interface of the blade to which the incoming TCP packets first arrive

The time spent between two network components in virtual deployment is calculated as the average time that was spent between two interfaces on the way in and out from the blade. In other words, the times calculated were the average times it took for a packet to travel (a sketch of this calculation is given after the list below):

• from eth0 to the interface pe and vice versa, and

• from pe to the tap device and vice versa
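The per-hop calculation can be sketched as follows. The sketch assumes that tcpdump was run per interface with epoch timestamps and ASCII payload output (for example tcpdump -tt -A) and that the session ID is visible in the payload dump; the file names are placeholders, and only the inbound leg (eth0 to pe) is shown.

```python
# Sketch: pairing per-interface tcpdump captures on session ID and averaging the
# per-hop delay. Assumes "tcpdump -tt -A" text output; file names are placeholders.
import re

SID = re.compile(rb"SID=(\d{10})")
TS = re.compile(rb"^(\d+\.\d+) ")

def first_timestamps(dump_file: str) -> dict:
    """Map session ID -> first kernel timestamp seen for it on this interface."""
    seen, current_ts = {}, None
    with open(dump_file, "rb") as f:
        for line in f:
            ts = TS.match(line)
            if ts:                      # the packet header line carries the timestamp
                current_ts = float(ts.group(1))
            sid = SID.search(line)      # the session ID appears in the ASCII payload lines
            if sid and current_ts is not None and sid.group(1) not in seen:
                seen[sid.group(1)] = current_ts
    return seen

eth0 = first_timestamps("eth0.dump")
pe = first_timestamps("pe.dump")

# Average inbound delay from the physical interface eth0 to the shared interface pe.
deltas = [pe[sid] - eth0[sid] for sid in eth0 if sid in pe]
print(f"eth0 -> pe: {sum(deltas) / len(deltas) * 1000:.3f} ms on average")
```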

5.3 Measuring Server Performance

For each request processed by the server on native and virtual deployment, the processing time was measured. This was done to investigate if there was a difference between the processing time on the native and the virtual setup. The server accepted incoming requests and then created a new thread of the type java.lang.Thread, in which the rest of the computational work was done. The thread read all bytes from the payload of the TCP packet and responded with an equal amount of bytes. The timing was started when a new thread was created and was stopped after all bytes had been written to the output stream. To get the processing time, the Java System.nanoTime method was used.
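The server used in the experiments was written in Java; the Python sketch below only mirrors its structure (one thread per TCP connection, timing from thread start until the response has been written) so that the timing boundaries are explicit. The port number and payload size are assumptions.

```python
# Python analogue of the Java echo server (structure and timing boundaries only;
# the actual server used java.lang.Thread and System.nanoTime). Port is a placeholder.
import socket
import threading
import time

PAYLOAD_SIZE = 1000  # 1000 or 2000 bytes, matching the client

def handle(conn: socket.socket) -> None:
    start = time.perf_counter_ns()           # timing starts when the worker thread runs
    data = b""
    while len(data) < PAYLOAD_SIZE:
        chunk = conn.recv(4096)
        if not chunk:
            break
        data += chunk
    conn.sendall(data)                        # respond with an equal amount of bytes
    elapsed_ms = (time.perf_counter_ns() - start) / 1e6
    print(f"server processing time: {elapsed_ms:.3f} ms")
    conn.close()

with socket.create_server(("0.0.0.0", 5000)) as srv:
    while True:
        client, _ = srv.accept()              # one worker thread per incoming connection
        threading.Thread(target=handle, args=(client,), daemon=True).start()
```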


5.4 Measuring Load Balancer Performance

The average amount of time taken by the load balancer to process packets, $time_{LB}$, is calculated as shown in equation 5.2:

$$ time_{LB} = (eth2_{out} - eth2_{in}) - serverProcessingTime \qquad (5.2) $$

where $eth2_{out}$ is the time when a packet reached the eth2 interface on the way out from the VM, $eth2_{in}$ is the time when a packet reached the eth2 interface on the way into the VM, and $serverProcessingTime$ is the amount of time that the server spent on processing the packet.

5.5 Measuring Packet Delivery In And Out From VM

The average time taken to send packets from the tap device to the virtual interface of the VM and vice versa, $time_{tap-vNIC}$, is calculated as shown in equation 5.3:

$$ time_{tap-vNIC} = (tap_{out} - tap_{in}) - (eth2_{out} - eth2_{in}) \qquad (5.3) $$

where $tap_{out}$ is the time when a packet reached the tap device on the way out from the VM, $tap_{in}$ is the time when a packet reached the tap device before going into the VM, $eth2_{out}$ is the time when a packet reached the eth2 interface on the way out from the VM, and $eth2_{in}$ is the time when a packet reached the eth2 network interface when going into the VM. The reason why the time difference for this case is not calculated in the same manner as for the networking components described in Section 5.2 is that the tap device and the vNIC eth2 are located on different machines (the host and the guest). Even though they are located on the same physical machine, their relative clocks are out of sync, and therefore the time cannot be calculated in the same way.
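Equations 5.2 and 5.3 amount to simple differences between timestamps that come from different clocks, which the small sketch below makes explicit. The numeric values in the usage line are placeholders, not measured data.

```python
# Sketch of equations 5.2 and 5.3. All arguments are timestamps (or, for
# server_processing_time, a duration) in milliseconds; example values are placeholders.
def load_balancer_time(eth2_in: float, eth2_out: float, server_processing_time: float) -> float:
    """Equation 5.2: time spent in the load balancer."""
    return (eth2_out - eth2_in) - server_processing_time

def tap_to_vnic_time(tap_in: float, tap_out: float, eth2_in: float, eth2_out: float) -> float:
    """Equation 5.3: time to move packets between the tap device and the VM's vNIC.

    The tap timestamps come from the host clock and the eth2 timestamps from the
    guest clock, so only the two differences (not the raw timestamps) are compared.
    """
    return (tap_out - tap_in) - (eth2_out - eth2_in)

# Placeholder usage: eth2_in=0.00, eth2_out=0.50, server time 0.30 -> 0.20 ms in the LB.
print(load_balancer_time(0.00, 0.50, 0.30))
```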

Chapter 6

Results

This chapter presents the results of the conducted measurements in native and virtual deployment. The results consist of a comparison of how much time a packet spends inside a native and a virtual blade, how the time is distributed over different components in the blade, and how the networking components impact the latency in virtual deployment. The performance of the load balancer and the server is also presented. Finally, the impact on latency of packet delivery in and out from the VM is presented.

6.1 Time Spent In Blade Native Versus Virtual Deployment

Figure 6.1 shows the average time a packet spent inside the blade on native and virtual deployment with varying load.

Figure 6.1. Average time packets spent inside a blade on native and virtual deployment for different loads and payload sizes.


Table 6.1 details the average factor of how much longer a packet spent inside a blade on virtual deployment compared to a blade on native deployment for different loads and payload sizes. The results indicate that adding a cloud layer increases the latency by a factor of at least 2.7 and at most 5.2. The results suggest that as the load increases, the latency gap between virtual and native deployment increases as well. In eight out of the ten cases, the factor between virtual and native latency was higher with a payload of 2000 bytes.

Load (requests/second) | Factor virtual/native, 1000 B | Factor virtual/native, 2000 B
100  | 2.7 | 3.8
200  | 3.2 | 3.5
300  | 3.8 | 3.5
400  | 3.2 | 4.0
500  | 3.2 | 4.9
600  | 3.6 | 5.2
700  | 3.5 | 4.5
800  | 4.8 | 4.7
900  | 3.9 | 4.1
1000 | 5.0 | 5.2

Table 6.1. The average factor of how much longer a packet spent inside a blade on virtual deployment compared to a blade on native deployment for different loads and payload sizes.

The added latency for the different load scenarios and payload sizes in virtual deployment in comparison to native deployment is detailed in Table 6.2.

Load (requests/second) | Time (ms), 1000 B | Time (ms), 2000 B
100  | 0.89 | 1.2
200  | 0.77 | 0.85
300  | 0.89 | 0.79
400  | 0.66 | 0.91
500  | 0.61 | 1.2
600  | 0.59 | 1.2
700  | 0.57 | 0.96
800  | 0.85 | 0.95
900  | 0.71 | 0.84
1000 | 0.89 | 1.1

Table 6.2. Difference in latency in milliseconds between virtual and native deployment for various loads and payload sizes.


6.2 Time Distribution Native Versus Virtual Deployment

This section presents the results of how the time was distributed over the different components inside a blade on native and virtual deployment. Figures 6.2 and 6.3 show the average time distribution for the different components inside the blade on native deployment. In the figures, the line Load balancer + O&M in/out avg is the average amount of time spent by the load balancer to process the packet and send it to the server via the O&M interface, and vice versa, for different loads and payload sizes. The line Server processing time avg is the average amount of time it took for the server to process a request for different loads and payload sizes.

Figure 6.2. Results of the average time distribution when packets traveled between the components on the native blade with a TCP payload size of 1000 bytes.

Figure 6.3. Results of the average time distribution when packets traveled between the components on the native blade with a TCP payload size of 2000 bytes.


Figures 6.4 and 6.5 show the average time distribution for the different components inside a blade on virtual deployment for different loads and payload sizes. The line eth from/to prv avg is the average amount of time it took for a packet to travel from the pNIC eth0 of the blade to the vNIC pe and vice versa. The line prv from/to tap is the average time it took for a packet to travel from pe to the tap device of the host via the OVS integration bridge and vice versa. The line VM in/out is the average time it took for a packet to travel from the tap device of the host to the eth2 vNIC of the VM and vice versa.

Figure 6.4. Results of the average time distribution when packets traveled between the components on the virtual blade with a TCP payload size of 1000 bytes.

Figure 6.5. Results of the average time distribution when packets traveled between the components on the virtual blade with a TCP payload size of 2000 bytes.


6.3 Network Components Impact On Latency

The extra networking components added in the virtual deployment are the Linux Bridge (br-aux) and the two OVS bridges, the provider bridge (br-prv) and the integration bridge (br-int). Table 6.3 shows the average amount of time, and the relative cost of the total added latency in virtual deployment, when a packet traveled between the interfaces eth0 and pe on the way in and out of the blade under different loads and payload sizes.

Load (requests/second) | Time (ms), 1000 B | Relative cost (%), 1000 B | Time (ms), 2000 B | Relative cost (%), 2000 B
100  | 0.057 | 6.4 | 0.055 | 4.4
200  | 0.045 | 5.8 | 0.042 | 4.9
300  | 0.041 | 4.6 | 0.035 | 4.4
400  | 0.035 | 5.2 | 0.041 | 4.5
500  | 0.029 | 4.7 | 0.025 | 2.1
600  | 0.029 | 5.0 | 0.023 | 2.0
700  | 0.023 | 4.0 | 0.022 | 2.3
800  | 0.021 | 2.5 | 0.027 | 2.8
900  | 0.022 | 3.0 | 0.020 | 2.3
1000 | 0.021 | 2.4 | 0.021 | 2.0

Table 6.3. Average amount of time, and the relative cost of the total added latency in virtual deployment, when a packet traveled between the interfaces eth0 and pe on the way in and out of the blade under different loads and payload sizes.

Table 6.4 shows the average amount of time, and the relative cost of the total added latency in virtual deployment, when a packet traveled between the interface pe and the tap device of the host on the way in and out of the blade under different loads and payload sizes.


Load (requests/second) | Time (ms), 1000 B | Relative cost (%), 1000 B | Time (ms), 2000 B | Relative cost (%), 2000 B
100  | 0.0062 | 0.70 | 0.0067 | 0.54
200  | 0.0056 | 0.73 | 0.0055 | 0.64
300  | 0.0055 | 0.62 | 0.0050 | 0.63
400  | 0.0051 | 0.77 | 0.0048 | 0.52
500  | 0.0048 | 0.78 | 0.0037 | 0.31
600  | 0.0046 | 0.79 | 0.0033 | 0.28
700  | 0.0034 | 0.59 | 0.0031 | 0.32
800  | 0.0028 | 0.33 | 0.0032 | 0.34
900  | 0.0027 | 0.38 | 0.0029 | 0.35
1000 | 0.0028 | 0.31 | 0.0031 | 0.28

Table 6.4. Average amount of time, and the relative cost of the total added latency in virtual deployment, when a packet traveled between the interface pe and the tap device on the way in and out of the blade under different loads and payload sizes.

Table 6.5 shows the average amount of time, and the relative cost of the total added latency in virtual deployment, that the networking components contributed under different loads and payload sizes.

Load (requests/second) | Time (ms), 1000 B | Relative cost (%), 1000 B | Time (ms), 2000 B | Relative cost (%), 2000 B
100  | 0.063 | 7.1 | 0.062 | 5.0
200  | 0.051 | 6.5 | 0.047 | 5.5
300  | 0.046 | 5.2 | 0.040 | 5.0
400  | 0.040 | 6.0 | 0.046 | 5.1
500  | 0.034 | 5.5 | 0.029 | 2.4
600  | 0.034 | 5.7 | 0.027 | 2.3
700  | 0.027 | 4.6 | 0.025 | 2.6
800  | 0.024 | 2.9 | 0.030 | 3.2
900  | 0.024 | 3.4 | 0.023 | 2.7
1000 | 0.024 | 2.7 | 0.024 | 2.2

Table 6.5. The total amount of time, and the relative cost of the total added latency in virtual deployment, that the networking components contributed.


6.4 Server Performance Impact On Latency

Figure 6.6 details the relationship between the server performance on native and virtual deployment under different loads and payload sizes. The results show a significant increase in the processing time of the server in virtual deployment in comparison to native deployment.

Figure 6.6. Average server performance for different loads and payloads.

Table 6.6 shows a more detailed relationship between native and virtual server performance and the impact the server had on the total added latency in virtual deployment.

Load (requests/second) | Factor virtual/native, 1000 B | Relative cost (%), 1000 B | Factor virtual/native, 2000 B | Relative cost (%), 2000 B
100  | 2.3 | 56 | 2.4 | 46
200  | 2.4 | 54 | 2.2 | 45
300  | 2.6 | 50 | 2.4 | 52
400  | 2.4 | 54 | 2.4 | 44
500  | 2.3 | 48 | 2.5 | 33
600  | 2.6 | 50 | 3.3 | 46
700  | 2.6 | 49 | 2.8 | 43
800  | 3.0 | 40 | 3.1 | 47
900  | 2.7 | 45 | 2.5 | 40
1000 | 3.4 | 44 | 3.1 | 41

Table 6.6. The relationship between virtual and native server performance and the relative cost of the added latency in virtual deployment the server contributed to.


6.5 Load Balancer Impact On Latency

Figure 6.7 shows an overview of the performance of the load balancer related to latency in native and virtual deployment. As in the case of the server performance in section 6.4, the results show that there is an increase in processing time in virtual deployment. Table 6.7 details the relationship between the processing time of the load balancer in virtual and native deployment. It also shows how much of the added latency in virtual deployment the load balancer contributed under different loads and payload sizes.

Figure 6.7. Average load balancer performance for different loads and payloads.

Load (requests/second) | Factor virtual/native, 1000 B | Relative cost (%), 1000 B | Factor virtual/native, 2000 B | Relative cost (%), 2000 B
100  | 3.0 | 26 | 12  | 42
200  | 5.6 | 30 | 21  | 41
300  | 8.6 | 37 | 14  | 36
400  | 5.3 | 31 | 16  | 45
500  | 5.7 | 38 | 23  | 60
600  | 5.4 | 36 | 14  | 48
700  | 5.2 | 37 | 11  | 49
800  | 9.2 | 51 | 10  | 44
900  | 6.6 | 45 | 8.9 | 52
1000 | 8.2 | 49 | 16  | 52

Table 6.7. The relationship between virtual and native load balancer performance and the relative cost of the added latency in virtual deployment the load balancer contributed to.


6.6 Packet Delivery In And Out From VM Impact On Latency

In virtual deployment, the payload runs as a VM inside the host machine, and because of that, when a TCP packet is sent from outside the VM to the VM, the packet has to be injected onto the network stack of the VM from the host machine. The tap device is a software component located on the host machine responsible for injecting the packets from the host machine into the VM. It is also the device on the physical host that first receives the packets when they are sent out from the VM. Table 6.8 shows the average amount of time it took to send packets from the tap device of the host to the eth2 network interface of the VM, and vice versa, for different loads and payload sizes. It also displays the average relative cost that this process contributed to the total added latency in virtual deployment.

Load (requests/second) | Time (ms), 1000 B | Relative cost (%), 1000 B | Time (ms), 2000 B | Relative cost (%), 2000 B
100  | 0.093 | 10  | 0.087 | 7.0
200  | 0.075 | 9.7 | 0.068 | 7.9
300  | 0.068 | 7.6 | 0.061 | 7.6
400  | 0.059 | 8.9 | 0.054 | 5.9
500  | 0.052 | 8.6 | 0.052 | 4.4
600  | 0.051 | 8.6 | 0.052 | 4.4
700  | 0.050 | 8.7 | 0.050 | 5.1
800  | 0.048 | 5.6 | 0.050 | 5.3
900  | 0.047 | 6.5 | 0.045 | 5.4
1000 | 0.045 | 5.1 | 0.046 | 4.3

Table 6.8. The average time taken for a packet to be delivered from the tap device to the eth2 interface of the VM, and vice versa, for different loads and payload sizes, together with the relative cost of the total added latency in virtual deployment that this process contributed.


Chapter 7

Discussion and Analysis

This chapter presents a discussion and an analysis of the observed results. The performance of the networking components is evaluated and a discussion of the other investigated components is presented.

7.1 Network Components Performance

The load scenarios tested in this thesis are realistic and even exceed the normal limit of what a set of payloads would serve in a production environment. A set of payloads in a production cluster usually serves a maximum of 300 requests/second with real traffic data. Even though the payload in the TCP packets was not real telephony data, this does not affect how the networking components perform, since they are not responsible for handling the payload but only for forwarding packets. The TCP payload sizes used in the experiments are also realistic, since packets sent within a production cluster, as well as packets arriving from the external network to the cluster, are one to two MTUs in size, i.e., 1500 to 3000 bytes.

The goal of the thesis was to investigate how the added networking components affected the latency. The results show that the networking components are responsible for 2.2% up to 7.1% of the total added latency in the virtual deployment. This corresponds to an added latency of between 0.023 and 0.063 milliseconds. This impact is regarded as good, and a maximum addition of 0.063 milliseconds is not considered significant. However, if a lot of traffic is passed between VMs internally in the cluster, the significance of this overhead can theoretically increase.

The latency added between the pe interface and the tap device is a factor of 10 lower than the latency added between the eth0 and pe interfaces. The time columns of Tables 6.3 and 6.4 show a trend where the average travel time between the eth0 and pe interfaces, and between the pe interface and the tap device, decreases as the load increases. This is most probably due to caching mechanisms. The same trend can be seen in Table 6.5, where the aggregated times for the networking components are calculated. In addition, the share of the total added latency in the virtual deployment that the networking components account for also decreases, as can be seen in the percentage columns of Tables 6.3, 6.4 and 6.5. The results also indicate that there is no significant difference in the performance of the networking components when the TCP payload varies between 1000 and 2000 bytes.

As stated in the OpenStack documentation [22], networking with a provider network setup performs better than the classic scenario. The classic scenario requires one or more network nodes, typically located on another blade, which traffic has to pass through when traveling to and from an external network. If that setup is used, the added latency is likely to increase further compared to the setup used in this thesis.
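To make the relative cost figures above easier to interpret, the following is a minimal sketch of how such a percentage can be derived from average round-trip times. The variable names and latency values are hypothetical placeholders, not measurements from this thesis.

    # Minimal sketch of how a relative cost percentage can be derived.
    # The latency values below are hypothetical placeholders, not measured results.

    t_native_total = 0.50    # average round-trip time in the native deployment (ms)
    t_virtual_total = 1.40   # average round-trip time in the virtual deployment (ms)
    t_component = 0.05       # average time spent in one networking component (ms)

    added_latency = t_virtual_total - t_native_total        # latency added by virtualization
    relative_cost = 100.0 * t_component / added_latency     # component's share of the addition

    print(f"added latency: {added_latency:.3f} ms")
    print(f"relative cost of component: {relative_cost:.1f} %")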

7.2 Load Balancer Performance

The performance of the load balancer in the virtual deployment varies a lot in comparison to the native deployment, as shown in Figure 6.7. As the load increases, the relative cost of the added latency that the load balancer is responsible for increases. In the virtual deployment, an increasing TCP payload size also seems to degrade the performance of the load balancer. As mentioned in Chapter 5, having just one active payload in the cluster serving traffic is not a realistic scenario in a production cluster. The load balancer is optimized to give the best performance when several payloads are active in the cluster, and therefore it cannot be stated that it performs 3.0 to 23 times worse in the virtual deployment, as the results in Table 6.7 suggest. To properly investigate how the load balancer performs in virtual versus native deployment, several payloads have to be active and serve traffic. However, this was not studied in this thesis since the goal was to investigate how the networking components affected the latency. It is still interesting to see that the hypervisor changes the performance characteristics of the load balancer in the single-node case, even though this cannot be confirmed from a full production cluster point of view.

7.3 Server Performance

The server and the load balancer were the two largest contributors to increased latency in the virtual deployment. The processing time of the server was 2.2 to 3.4 times longer in the virtual deployment compared to the native deployment, which implied that the server was responsible for 40% up to 56% of the added latency in the virtual deployment. As the load increased, the factor between virtual and native server performance increased, but the relative cost of the added latency decreased. The average processing time for different loads and payload sizes shows a decreasing trend in both virtual and native deployment, which is probably a result of caching mechanisms. The increasing factor between the virtual and native server is most likely caused by the overhead that the QEMU/KVM hypervisor introduces in a virtual deployment.

The amount of work done on the server side was minimal. Theoretically, as the server executes more complex code resulting in longer execution times, it will presumably be the component that adds the most latency to a virtualized deployment. In contrast to the other investigated components, the server is the only component whose execution time is dynamic as the system is further developed.

The QEMU/KVM project [19] states that when QEMU is used as a virtualizer it achieves close to native performance. However, the results in this thesis show that the server has 2.27 up to 3.44 times longer execution time in the virtual deployment, which cannot be considered close to native performance for latency-sensitive applications. Java, the language the server was written in, might not be optimal, and the choice of another language might change the performance characteristics.

The results indicate that the hypervisor layer decreases the performance of the load balancer and the server significantly. The hypervisor layer is most probably the main bottleneck in the system, since these two components together were responsible for 82.46% up to 93.45% of the total added latency in the virtual deployment.
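As a worked illustration of how a processing-time slowdown factor translates into a share of the added latency, the short sketch below uses hypothetical placeholder values; none of the numbers or variable names come from the measurements in this thesis.

    # Minimal sketch relating the server slowdown factor to its share of the added
    # latency. All values are hypothetical placeholders, not measured results.

    t_server_native = 0.20   # average server processing time, native deployment (ms)
    slowdown_factor = 2.8    # virtual processing time divided by native processing time
    t_total_added = 1.00     # total latency added by the virtual deployment (ms)

    t_server_virtual = slowdown_factor * t_server_native
    t_server_added = t_server_virtual - t_server_native   # latency the server itself adds

    share = 100.0 * t_server_added / t_total_added
    print(f"server adds {t_server_added:.2f} ms, i.e. {share:.0f} % of the added latency")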

7.4 Packet Delivery In And Out From The VM

The average additional latency that message passing in and out of the VM contributed in the virtual deployment was between 0.045 and 0.093 milliseconds, which corresponded to a relative cost of between 4.3% and 10% of the total added latency. A maximum addition of 0.093 milliseconds is not regarded as a significant increase in latency in the system. However, as with the networking components, if a lot of traffic is sent between the VMs inside the cluster, the significance of this overhead can increase.


Chapter 8

Conclusion

The question to answer in this thesis was: Given a provider network setup in a virtual deployment, how much impact on the latency do the added networking components have in a virtual deployment in comparison to the latency in a native deployment? The experiments were carried out by sending TCP requests from a client located on a load machine to a server located on a payload. The client sent the requests at a frequency ranging from 100 up to 1000 requests per second with payload sizes of 1000 and 2000 bytes.

The results indicate that the networking components add between 0.023 and 0.063 milliseconds of latency, corresponding to between 2.2% and 7.1% of the total added latency in the virtual deployment compared to the native deployment. These results can be considered good, since they do not represent a significant increase in latency, and they indicate that the chosen network setup is suitable for the virtual deployment.

In addition, the server and the load balancer were benchmarked with respect to latency in native and virtual deployment. The time it took to pass packets in and out between the VM and the host was also benchmarked. The message passing in and out of the VM added a latency of 0.045 up to 0.093 milliseconds, which corresponded to a relative cost of between 4.3% and 10% of the total added latency in the virtual deployment. This is not considered a significant increase in latency either. The average processing time on the server was extended by a factor of 2.2 to 3.4 in the virtual deployment, which corresponded to 40% up to 56% of the total added latency.

The performance of the load balancer changed significantly in the virtual deployment. Even if the results of the load balancer benchmark cannot be directly applied to how it will perform in a production cluster, they suggest that its performance is degraded in a virtual environment. However, a more detailed study with several active payloads in the cluster has to be done to determine the load balancer's performance.

The results suggest that the QEMU/KVM hypervisor layer is the main bottleneck in the system due to the significant increase in computational time on the server and the load balancer.

Chapter 9

Future Work

The server running inside the VM was used as a minimal tool to time the different components in the system and cannot be considered a complete benchmark of how applications inside a VM perform with the QEMU/KVM virtualizer. Further investigations should be done into how the performance of an application changes as it is moved to run inside a virtual machine instead of on a host machine. For a complete understanding of how migrating applications into a virtualized environment affects performance, other metrics than just latency should be considered, for example how virtualization affects the CPU and memory utilization of the host. Disk read and write performance is another aspect that should be considered. How the relationship between the virtual hardware and the physical hardware affects the performance of an application running inside a virtual machine is also a topic of interest.

There are other hypervisors that OpenStack supports, and the hypervisor chosen for this thesis is not necessarily the best performing one. A further investigation of the performance of other hypervisors could be of great interest.

To completely understand how the performance of the load balancer is affected in a virtualized environment, a more complete study has to be done on both native and virtual deployment with several active payloads to simulate a real production cluster. This thesis only investigated how the networking components affected the latency in one particular case, using a provider network setup. Another possible case that could be investigated is how the classic network setup affects the latency in an OpenStack cloud.


Chapter 10

Social, Ethical, Economic and Sustainability Aspects

Virtualization generally offers the possibility to run several virtual machines on the same physical compute resource. By running several virtual machines on the same physical host, it is possible to reduce the number of active machines in a data center and thereby reduce power consumption. Shared resources imply both lower costs for power consumption and lower emissions of carbon dioxide. Virtualization also makes the virtual machines hardware independent, which could mean that providers of large data centers save money by using standardized hardware.

OpenStack makes it possible to deploy servers all over the world, and sometimes the users do not know where the resources are located. When it comes to sharing compute and storage resources, not all countries have the same laws and regulations regarding ownership of data, which can be problematic. Using shared resources also opens up the possibility for intruders to get access to information they do not own and to exploit it. In theory, an intruder could also mount a denial-of-service attack on the shared resources, even though the service level agreement is supposed to prevent this. If critical systems are running on public cloud infrastructure, it could in theory be possible for an intruder to take out systems belonging to private persons, companies or even nations.


Bibliography

[1] Sean Kenneth Barker and Prashant Shenoy. “Empirical evaluation of latency-sensitive application performance in the cloud”. In: Proceedings of the first annual ACM SIGMM conference on Multimedia systems. ACM. 2010, pp. 35–46.
[2] Meenakshi Bist, Manoj Wariya, and Abhishek Agarwal. “Comparing delta, open stack and Xen Cloud Platforms: A survey on open source IaaS”. In: Advance Computing Conference (IACC), 2013 IEEE 3rd International. IEEE. 2013, pp. 96–100.
[3] Cinder wiki. https://wiki.openstack.org/wiki/Cinder. Accessed: 2016-03-06.
[4] James Denton. Rackspace Developer. https://developer.rackspace.com/blog/neutron-networking-vlan-provider-networks/. Accessed: 2016-03-15.
[5] Michael Fenn et al. “An evaluation of KVM for use in cloud computing”. In: Proc. 2nd International Conference on the Virtual Computing Initiative, RTP, NC, USA. 2008.
[6] Glance. http://docs.openstack.org/developer/glance/. Accessed: 2016-02-25.
[7] Wang Guohui and Eugene Ng T.S. “The impact of virtualization on network performance of Amazon EC2 data center”. In: INFOCOM, 2010 Proceedings IEEE. IEEE. 2010, pp. 1–9.
[8] Open vSwitch OpenStack Docs. http://openvswitch.org/openstack/documentation/. Accessed: 2016-03-06.
[9] OpenStack Compute architecture. http://docs.openstack.org/admin-guide-cloud/compute_arch.html. Accessed: 2016-02-29.
[10] OpenStack Documentation. http://docs.openstack.org/icehouse/training-guides/content/operator-getting-started.html. Accessed: 2015-11-04.
[11] OpenStack glossary. http://docs.openstack.org/admin-guide-cloud/common/glossary.html. Accessed: 2016-02-29.
[12] OpenStack Home page. http://www.openstack.org/. Accessed: 2016-02-29.


[13] OpenStack Hypervisor. http://docs.openstack.org/kilo/config-reference/content/hypervisor-configuration-basics.html. Accessed: 2016-02-29.
[14] OpenStack Hypervisor Support Matrix. https://wiki.openstack.org/wiki/HypervisorSupportMatrix. Accessed: 2016-02-29.
[15] OpenStack messaging. http://docs.openstack.org/security-guide/messaging.html. Accessed: 2016-02-29.
[16] OpenStack Networking Overview. http://docs.openstack.org/mitaka/networking-guide/intro-os-networking-overview.html. Accessed: 2016-05-12.
[17] OpenStack Software. http://www.openstack.org/software/. Accessed: 2016-02-29.
[18] Simon Ostermann et al. “A performance analysis of EC2 cloud computing services for scientific computing”. In: Cloud computing. Springer, 2009, pp. 115–131.
[19] QEMU and KVM. http://wiki.qemu.org/Main_Page. Accessed: 2016-02-29.
[20] Muhammad Siraj Rathore, Markus Hidell, and Peter Sjödin. “KVM vs. LXC: comparing performance and isolation of hardware-assisted virtual routers”. In: American Journal of Networks and Communications 2.4 (2013), pp. 88–96.
[21] Ristov Sasko et al. “Compute and memory intensive web service performance in the cloud”. In: ICT Innovations 2012. Springer, 2013, pp. 215–224.
[22] Scenario: Provider networks with Open vSwitch. http://docs.openstack.org/liberty/networking-guide/scenario-provider-ovs.html. Accessed: 2016-03-06.
[23] Swift documentation. http://docs.openstack.org/developer/swift/. Accessed: 2016-03-06.
[24] Tcpdump man page. http://www.tcpdump.org/manpages/tcpdump.1.html. Accessed: 2016-04-25.
[25] VMware ESXi. http://www.vmware.com/se/products/esxi-and-esx/overview. Accessed: 2016-04-18.
[26] Xiaolong Wen et al. “Comparison of open-source cloud management platforms: OpenStack and OpenNebula”. In: Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on. IEEE. 2012, pp. 2457–2461.
[27] Why Open vSwitch. https://github.com/openvswitch/ovs/blob/master/WHY-OVS.md. Accessed: 2016-03-02.
[28] Jun Xie et al. “Bare metal provisioning to OpenStack using xCAT”. In: Journal of Computers 8.7 (2013), pp. 1691–1695.


[29] Sonali Yadav. “Comparative Study on Open Source Software for Cloud Computing Platform: Eucalyptus, Openstack and Opennebula.” In: International Journal Of Engineering And Science (2013).
[30] Yoji Yamato. “OpenStack hypervisor, container and Baremetal servers performance comparison”. In: IEICE Communications Express 4.7 (2015), pp. 228–232.
