DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS
STOCKHOLM, SWEDEN 2019

Comparing Cloud Architectures in terms of Performance and Scalability

Perttu Jääskeläinen

KTH ROYAL INSTITUTE OF TECHNOLOGY
ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

Authors

Perttu Jääskeläinen; [email protected]
Degree Programme in Computer Engineering
KTH Royal Institute of Technology

Place for Project

Stockholm, Sweden

Examiner

Fadil Galjic, KTH Royal Institute of Technology

Supervisor

Anders Sjögren, KTH Royal Institute of Technology

Abstract

Cloud Computing is becoming increasingly popular, with a large share of corporations' revenue coming from various cloud solutions offered to customers. When it comes to choosing a solution, multiple options exist for the same problem from many competitors. This report focuses on the solutions offered by Microsoft in their Azure platform, and compares the architectures in terms of performance and scalability.

In order to determine the most suitable architecture, three architectures offered by Azure are considered: Cloud Services (CS), Service Fabric Mesh (SFM) and Virtual Machines (VM). By developing and deploying a REST Web API to each service and performing a load test, average response times in milliseconds are measured and compared. To determine scalability, the point at which each service starts timing out requests is identified. The services are tested both by scaling up, by increasing the power of a single machine instance, and, where possible, by scaling out, by duplicating instances of machines running in parallel.

The results show that VMs fall considerably behind both CS and SFM in both performance and scalability for a regular use case. For low amounts of requests, all services perform about the same, but as soon as the number of requests increases, it is clear that both SFM and CS outperform VMs. In the end, CS comes out ahead in terms of both scalability and performance.

Further research may be done into other platforms which offer the same service solutions, such as Amazon Web Services (AWS) and Google Cloud, or into other architectures within Azure.

Keywords

Software as a Service, SaaS, Architectures, Scaling, Performance, Microsoft Azure, Virtual Machines, Cloud Services, Mesh Network

Abstract (Swedish)

Cloud services are becoming increasingly popular in today's industry, where a large share of companies' revenue consists of services offered as cloud solutions. When it comes to choosing a solution, many exist for the same problem, and it is up to the customer to choose the one that fits best. This report focuses on services offered by Microsoft's Azure platform, in a comparison of architectures that are load tested to measure performance and scalability.

To determine which architecture is most suitable, three services offered in Azure are measured: Cloud Services (CS), Service Fabric Mesh (SFM) and Virtual Machines (VM). This is done by developing and deploying a REST Web API that is load tested with simulated users, where performance is measured as the average response time in milliseconds per request. To determine scalability, the point is identified at which the service can no longer handle the number of incoming requests and starts returning error codes. The machines for each service are tested both by scaling up, by strengthening a single machine, and by scaling out, where several instances of the same machine are created.

The results show that Virtual Machines fall considerably behind both CS and SFM in both performance and scalability for a regular use case. For low amounts of requests all services perform very similarly, but as soon as the number of requests increases, it becomes clear that SFM and CS perform better than Virtual Machines. In the end, CS comes out ahead in terms of both performance and scalability.

Further research may be done into the platforms offered by competitors, such as Amazon Web Services (AWS) and Google Cloud, as well as other architectures within Azure.

Keywords

Software as a Service, SaaS, Architectures, Scaling, Performance, Microsoft Azure, Virtual Machines, Cloud Services, Mesh Network

Acknowledgements

I would like to thank Anders Sjögren for the guidance given throughout the project, in the form of constructive feedback in monthly meetings, and for ensuring that the project stayed within its scope and the requirements set by KTH.

I would also like to thank Fadil Galjic for being the examiner of the project, providing valuable constructive feedback on how the project is conducted and presented in a clear, concise way.

Lastly, I would like to thank Niko Rosenquist and Tim Liberg from Triona for giving me the opportunity to perform this project for them, with special thanks to Tim for providing guidance and support throughout the project.

Contents

1 Introduction 1
1.1 Background ...... 1
1.2 Problem ...... 2
1.3 Purpose ...... 3
1.4 Goal ...... 3
1.5 Methodology ...... 4
1.6 Delimitations ...... 4
1.7 Outline ...... 5

2 Theoretical Background 7
2.1 Cloud Computing and Architectures ...... 7
2.2 Microsoft Azure ...... 12
2.3 Measuring Performance and Scalability ...... 14
2.4 Related Works ...... 15

3 Methodology 17
3.1 Research Strategies ...... 17
3.2 Research Methods ...... 19
3.3 Managing the degree project ...... 23
3.4 Tools ...... 25

4 Developing and Deploying The API, Gathering Data 27
4.1 Developing the API ...... 27
4.2 Deploying the API ...... 30
4.3 Gathering Performance and Scalability Data ...... 36

5 Measured Data and Service Comparison 43
5.1 Gathered Data ...... 43
5.2 Service Comparison ...... 64

6 Discussion and Conclusions 69
6.1 Research Methods ...... 69
6.2 Validity and Reliability ...... 71
6.3 Conclusion ...... 73
6.4 Future Work ...... 74

References 75

7 Appendices 78

A API Specification 78

B API Sequence Diagram 79

1 Introduction

Cloud computing consists of delivering on-demand computing resources from a cloud provider, typically on a "pay for what you use" basis. This can be in the form of different types of services, most commonly: IaaS (Infrastructure as a Service), PaaS (Platform as a Service) and SaaS (Software as a Service) [3, 20, 25].

By hosting parts of an application with an external party, businesses can avoid large upfront investments in hardware, infrastructure, maintenance and management, paying only for what they use and remaining free to focus on developing and deploying applications.

Cloud computing is becoming increasingly popular in the industry, accumulating $266B in revenue in 2017 and predicted to grow up to $411B by 2020 [13].

1.1 Background

When it comes to choosing a service, multiple platforms and solutions exist for the same problem. Different portions of an application can be hosted and managed by the service provider, while the remainder is managed by the individual utilizing the service. In addition to the difficulty of choosing a suitable service provider and solution, it is often unclear how the offerings perform in categories such as performance and scalability.

Microsoft has a platform for cloud-based solutions, referred to as Microsoft Azure [11]. Here, they offer various cloud solutions with different price plans and capabilities, such as Virtual Machines [9], Cloud Services [5] and Service Fabric Mesh [7].

1.2 Problem

Triona [32] is a software solutions company based in the Nordics, with headquarters located in Dalarna, Sweden, and offices located in Norway, Sweden and Finland. They employed 120 people in 2017 and had an annual revenue of about 150 MSEK [1]. Their software is used by various customers, including but not limited to, the forest industry, transportation, live-streaming and oil/gas.

Triona utilizes Microsoft Azure on a daily basis, hosting most of their software, databases and development environments, among other things, using services provided by Azure. Even though many of the services are used daily, there is not a clear picture of exactly how much they differ in performance and scalability. An overall difference is detectable and known, but they are looking to analyze each service more specifically, to better differentiate between them in order to save money and improve performance for existing and future systems.

A sudden increase of users (surge) utilizing their services is common, in cases such as a popular hockey game being live-streamed at a certain time. This surge may cause spikes in latencies and response times, which is a specific problem they are looking to solve.

Since Cloud Computing is still relatively new, with additional services arriving every year, choosing the correct service may be difficult, and information about which one performs best and handles large numbers of users well is of interest. In order to gather information to answer their questions, the following research question can be defined:

RQ: How do various Cloud Architectures differ in terms of performance and scalability?

1.3 Purpose

Due to the increasing popularity of Cloud Computing, analyzing services in terms of performance and scalability is of interest for both small and large organizations. By doing a study into services within Azure, organizations will have an idea of the benefits from moving to the cloud and how the different services may differ.

1.4 Goal

The goal is to provide a guidance model for Triona and other parties, so that they may choose a suitable cloud solution for their projects, by being able to easily distinguish differences between services when it comes to scalability and performance.

1.4.1 Ethics and Sustainability

Since the project will focus on measuring performance and scalability data for services provided by Azure, data related to real customers or people will most likely not be encountered. The project will focus solely on the services themselves and on simulating a real-world situation, which means that ethics will not be relevant to the scope of this project.

By creating a guidance model for Triona and other parties, individuals looking into Azure will have an easier time choosing which service is most suitable for their needs without having to develop specific applications and test them themselves. This may cause parties to move their current, on-premise solutions into the cloud, minimizing unused hardware and possibly improving their work environment by avoiding having to manage and maintain a local infrastructure.

1.5 Methodology

The project uses a qualitative research method by performing case studies to answer the research question. A case study focuses on and analyzes a specific topic and use case, which goes hand in hand with a qualitative study.

The project is introduced with a literature study into the services to be compared, Cloud Computing in general, and how to determine and compare performance and scalability metrics within the services of interest.

A case study was done by developing and deploying a REST Web API to each service, where the API is used by customers of Triona to access live-streaming data without gaining access to their larger, main API, referred to as MPP5.

By gathering data from each service and comparing it using tools and methods from the literature study, the research question defined in section 1.2 can then be answered.

1.6 Delimitations

The project will exclusively focus on services provided by Microsoft Azure, as this is what is of interest to Triona.

There are also different architectures available as solutions for the same problems encountered in this project - only the ones of interest to Triona are analyzed. These include Cloud Services, Virtual Machines and Service Fabric Mesh.

1.7 Outline

The remainder of the report is structured as follows:

• Chapter 2 describes the theoretical background to the project, declaring and explaining areas which are important in order to understand the remainder of the project.

• Chapter 3 describes the research methodology of the project, which defines reasoning used in the project, along with how data is collected, analyzed and compared.

• Chapter 4 describes the development and deployment of a REST Web API to each service, while also specifying which machine sizes and variations were used for testing. The test plans used for gathering data are also defined here.

• Chapter 5 presents the results gathered from load testing each architecture. By analyzing and comparing the results, conclusions can be made in terms of performance and scalability differences.

• Chapter 6 includes a discussion about the project, the results gathered from the services along with suggestions about possible future work.

2 Theoretical Background

This chapter gives a detailed technical description of areas explored and used in this project. This includes the basics of cloud computing, service architectures, the Azure platform and how performance and scalability are measured.

2.1 Cloud Computing and Architectures

Cloud Computing is the on-demand delivery of computing resources, such as servers, storage, databases, applications, networking and more, accessed via the Internet with pay-as-you-go pricing, where the services provided are hosted in the cloud provider's data centers [3, 20, 25]. With Cloud Computing, businesses can avoid large upfront investments into hardware and maintenance of a local server infrastructure. Instead, they can choose exactly how much power they need for their system, where resources can be increased to as much as needed, almost instantly, while only paying for what is used. Cloud Computing can be broken into five essential characteristics, three deployment models and three service models [22].

2.1.1 The Five Essential Characteristics

The five essential characteristics of a Cloud Computing provider are the following [22]:

On-demand self-service. A consumer can provision computing capabilities, such as network storage, server time and computing power, as needed, without requiring human interaction with the provider.

Broad network access. The cloud computing capabilities are available over the internet and are widely accessible through many different devices, such as phones, tablets, laptops and workstations.

Resource Pooling. The cloud resources accessed by a consumer are not limited to only one customer, but are shared among multiple, using a multi-tenant model with physical and virtual resources assigned and reassigned according to consumer demand by the provider.

Rapid Elasticity. The resources provided to the consumer can be scaled rapidly either up or down, and in some cases automatically, proportionate to the current consumer demand.

Measured Service. The Cloud Systems automatically optimize and control the use of resources through metering services. The resources used can be monitored, controlled and reported to the customer, providing transparency to both the user and cloud provider.

2.1.2 The Three Service Models

Cloud Computing services can be broken down into three service models, providing different levels of abstraction and control to the consumer. These are Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). In addition to the three service models, on-premise solutions unrelated to cloud computing may be used.

Separate from the service models, on-premise solutions are hosted and owned by the consumer, where all parts of the application stack are handled and managed by them.

Infrastructure as a Service (IaaS) is the most basic version of cloud computing services, providing the consumer with the basic IT infrastructure needed for hosting applications - network, servers, storage and hosting in their data centers, leaving the consumer to handle the operating systems, middleware and applications.

Platform as a Service (PaaS) enables the consumer to deploy their created or acquired applications, which use programming languages, libraries, services and tools supported by the provider, onto a cloud infrastructure. As with SaaS, the consumer does not control the underlying cloud infrastructure, including network, servers, operating systems, or storage, but controls the deployed applications. The provider updates and maintains the infrastructure, while the consumer focuses on deploying and maintaining their applications.

Software as a Service (SaaS) offers the provider's application(s) running and hosted on their own cloud infrastructure. The applications are accessible through various client devices, usually using a web browser (such as web-mail). The consumer does not control or manage any of the underlying infrastructure, which is updated, maintained and upgraded by the provider without any need for consumer interaction.

Figure 2.1 visualizes the levels of abstraction when comparing on-premise, IaaS, PaaS and SaaS solutions, with cloud providers handling various levels of the application stack. Here, the differences can be seen more clearly, with each level of the stack being handled by the user or provider [2, 20, 22, 25].

Figure 2.1: Levels of abstraction when considering on-premise, SaaS, PaaS and IaaS infrastructure solutions. Different solutions provide different levels of abstraction on the application stack to suit the needs of the consumer [26].

2.1.3 The Three Deployment Models

Different deployment models exist when deploying applications to the cloud. The resources can either be shared with other consumers in a public cloud, isolated for the sole consumer in a private cloud, or a combination of both, where some part of an application may be hosted on a public cloud with more compute-intensive applications being hosted on a dedicated, private cloud [20, 22, 24].

A Public Cloud is the most common deployment model, where the cloud is owned and hosted by a third-party, such as Microsoft Azure, where the services offered are open to the public. Computing resources from the provider are shared among multiple consumers in a multi-tenant model, where each consumer can access their resources and services through a web browser. The public cloud services provide the infrastructure needed to host applications, such as hardware, software and supporting infrastructure, which is owned and managed by the provider.

A Private Cloud is reserved for a single organization, and may be hosted either on-premise at the consumer or in a provider's data center. The hardware used for the private cloud is solely dedicated to the single consumer, making it easy for organizations to customize the resources needed to suit their specific needs. This type of cloud model suits organizations with business-critical operations where they want better control of their environment. Such organizations may include government agencies and financial institutions [24].

Hybrid Cloud combines both public and private clouds to take advantage of "the best of both worlds". Businesses who want to host their business-critical operations in a controlled environment can do so in a private cloud, while less critical operations can be hosted on a public cloud, which is cheaper and more suitable for high-volume, non-critical operations.

2.2 Microsoft Azure

Microsoft’s Azure platform is a Cloud Computing service provider, where customers can build, test, deploy and manage their applications and services through their managed data centers [11]. They offer a multitude of services and solutions, including all types of Service and Deployment models.

2.2.1 Virtual Machines

Virtual Machines (IaaS) are hosted on a server within the provider's data centers [9, 10], where the machine can be accessed through a virtual desktop. The consumer can install whatever they wish on the virtual machine, run whatever software they want, and configure all types of network rules. The machine's hardware can be scaled up or down [8], depending on the needs of the consumer. As specified for IaaS solutions, load balancing is not handled on this type of infrastructure and has to be done by the consumer themselves.

2.2.2 Cloud Services

Similar to virtual machines, Azure’s Cloud Services (PaaS) is a management-type tool for multiple Virtual Machines [5, 6]. Unlike Virtual Machines, Cloud Services handle load balancing, scaling, and deployment, which makes it easier to both deploy applications and automatically handle scaling. It is possible to deploy multiple machines for balancing, or scale a single machine up or down.

The difference between Virtual Machines and Cloud Services is the infrastructure - VMs are IaaS, while Cloud Services are PaaS. For VMs, the consumers create and configure the environment that the application will run in, after which the application is deployed into that environment. The consumer is responsible for managing and updating the machine. In PaaS, with Cloud Services, the environment is automatically created by Azure, where the consumer only has to deploy the application into the already existing environment. All updating and management is handled by Azure for the consumer [6].

Behind the scenes, VMs and Cloud Service machines are the same when it comes to hardware and model numbers, the difference being how the environment is created. Two types of automatic environments exist for consumers, so-called Worker and Web roles. A Worker role has the IIS (Internet Information Services) web server disabled by default, while a Web role has it enabled, through which the application is hosted upon deployment. In a Worker role, the consumer's app is instead run standalone.

For example, a simple application hosting a website might just need a single Web role to receive incoming requests, while a more complex application might use the Web role to receive requests and forward the more resource-demanding tasks to a Worker.

2.2.3 Service Fabric Mesh

Service Fabric Mesh (SFM) [7] is different from both Cloud Services and VMs, since the user does not choose a virtual machine size for their application. SFM consists of a cluster of thousands of machines, where multiple consumers' applications are hosted using container images of those applications. The consumer does not have to manage any VMs, storage or networking, and only has to deploy the application and specify a CPU limit (in cores) and a RAM limit (in GB). All the operations of the cluster are hidden from the developer/consumer; the cluster automatically allocates the infrastructure and handles failures, scaling and resource balancing, making sure that the consumer's applications are highly available.

SFM supports any programming language or framework that can run in a container. This enables developers to keep their applications simple, without having to change their coding habits or structure in order to deploy them.

2.3 Measuring Performance and Scalability

This section covers how performance and scalability can be measured in systems.

2.3.1 Measuring Performance

Performance can be defined in different ways depending on context. When hosting applications on a service, performance is considered for individual machines in the case of VMs and Cloud Services, and for clusters of machines in the case of Service Fabric Mesh.

Computer performance can be measured using metrics such as Response Time, Latency and Throughput, or by measuring how well the computer performs in a benchmark test, which may imitate the work the computer would do during actual use [28, 30].
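
As a minimal illustration of the response time metric, the sketch below times a single HTTP GET request with a stopwatch; the endpoint URL is hypothetical and not part of the project setup.

using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;

class ResponseTimeProbe
{
    // Times one HTTP GET request and prints the response time in milliseconds.
    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            var timer = Stopwatch.StartNew();
            await client.GetAsync("http://example.com/api/content"); // hypothetical endpoint
            timer.Stop();
            Console.WriteLine($"Response time: {timer.ElapsedMilliseconds} ms");
        }
    }
}

Repeating such a measurement over many concurrent requests, and averaging the results, is essentially what a load testing tool automates.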

2.3.2 Determining Scalability

Scalability of computers or software describes their capability to perform well and cope with an increased or expanding workload. A well scalable system would be able to maintain or increase its performance or efficiency, even when tested by larger and larger operational demands [21].

Ideally, a well scalable system would have a linear resource requirement as the load increases. For example, duplicating a machine that can handle 1000 active users, or doubling its power, should result in a system that can handle 2000 users [21].

Scaling can be done either by scaling up, increasing the power of a single machine, or by scaling out, adding more machines. Scaling up and scaling out are also referred to as vertical and horizontal scaling, respectively [29, 31].

2.4 Related Works

Victor Delgado explores the limits of cloud computing architectures in his master's thesis, done at KTH in 2010 [15]. Relevant experience in measuring performance can be drawn from his thesis and applied to this project.

Mikael Rapp summarizes technological (both software and hardware) solutions to scaling, while also covering financial and managerial aspects. This was done in his bachelor's thesis at KTH, Stockholm, in 2010 [27].

3 Methodology

This chapter describes the methodology and research strategies used, along with tools utilized in the project.

3.1 Research Strategies

3.1.1 Quantitative and Qualitative Methods

Choosing an appropriate method when conducting scientific research is important, as it is vital in steering the work in the correct direction and producing accurate, proper results. Scientific methods can generally be divided into two categories: quantitative and qualitative [19].

Quantitative methods are often used for measurable data, such as numbers and percentages. This is done in the form of mathematical, computational and statistical techniques and experiments to conclude or verify theories and hypotheses from large sets of data.

Qualitative methods focus on smaller sets of data, which are also often non-numeric. The data is also often gathered from multiple sources, instead of just numerical data from the same large data set.

3.1.2 Inductive and Deductive Reasoning

In addition, different reasoning strategies may be used, in the form of inductive and deductive reasoning. An inductive approach establishes theories, usually through collection of data in qualitative studies, in order to analyze and gain understanding of it. A deductive approach makes conclusions from already known data. This known data can be theories or hypotheses, which are to be confirmed by performing a deductive study [19].

3.1.3 Case Study

Usually, case studies focus on a single instance of the thing that is to be investigated, analyzing it in-depth in order to provide insights that would possibly not have been detected if a research strategy tried to cover a large number of instances. Case studies go hand-in-hand with qualitative studies, focusing on smaller data sets and individual instances of larger problems [16].

3.1.4 Bunge’s Scientific Method for Technology

Bunge's scientific method may be used to ensure that a technological approach is followed, rather than relying on common sense [12]. Bunge's method is defined according to the steps below:

1. Identify a problem.

2. State the problem clearly.

3. Search for information, methods or instruments.

4. Try to solve the problem with the help of the means collected in step 3.

5. Invent new ideas, produce new empirical data, design new artifacts.

6. Obtain a solution.

7. Derive the consequences.

8. Check the solution.

9. Correct possibly incorrect solution.

10. Examine the impact.

3.1.5 Applied Strategies

This is a qualitative project, using a case study to measure and compare performance and scalability metrics for the various Azure services, by developing a Web API which is then load tested. The project uses inductive reasoning to draw conclusions and form theories from the data gathered throughout the project. It also follows Bunge's scientific method, to ensure that a technological approach is maintained.

3.2 Research Methods

This section goes further in depth into possible project research methods and strategies.

3.2.1 Research Outline

The research was divided into two different phases: a literature study and the case study.

Literature study phase

The literature study begins with a study of the services within Microsoft Azure which are of interest for the project.

A basic understanding of Cloud Computing and services was needed to be able to fully understand what was being compared and how they differed. This was done by reading literature related to the services being compared (IaaS and PaaS), and cloud computing in general. The services researched were Virtual Machines, Cloud Services and Service Fabric Mesh.

In order to be able to correctly measure performance and scalability, additional research was done into how the measurements should be formed and which tools to use. The load testing tool JMeter was found, which simulates users concurrently accessing a system through web requests and is therefore suitable for testing a service.

Additional literature and articles were read to gain a deeper understanding of cloud computing, Azure architectures and measuring data throughout the rest of the project.

By performing the literature study, the first three steps from Bunge’s scientific method, referred to in section 3.1.4, could be defined:

1. Identify a problem: By understanding which services exist within Azure, the problem can be stated by defining that the differences between them are uncertain.

2. State the problem clearly: By discovering different services within Azure to compare, it is uncertain how they differ in performance and scalability.

3. Search for information, methods or instruments: By using the JMeter load tool, multiple users performing web requests to a service could be simulated by deploying and testing an API on each service.

Constructing the Case Studies

When the literature study was completed, development of a REST Web API (defined in Appendix A) was started. By following the requirements set by Triona (as seen in figure A.1), development continued until all functionality was implemented and working locally, after which the API was deployed to each service.

To ensure that differences between systems stem from their architectures, and not from code differences, the same API is deployed to each, with as few differences as possible in the framework(s) and libraries used.

Before developing the Web API, it is necessary to establish what the application to be deployed and tested on each architecture should include. This is done using the MoSCoW method.

With the case study completed, the next steps in Bunge's method could be defined:

4. Try to solve the problem with the help of the means collected in step 3: By developing and deploying an API to each service and performing a load test with the JMeter load tool, data could be gathered to analyze and determine the differences in performance and scalability between all services.

5. Invent new ideas, produce new empirical data, design new artifacts: By testing different sizes of requests, various results could be gathered, analyzed and compared between each service.

6. Obtain a solution: By using the data gathered from steps 4 and 5, a solution to the problem can be found.

7. Derive the consequences: Using the solution provided in step 6, differentiation between each service can be done.

Evaluating the results

With a completed case study, the results may be analyzed to determine the differences between services for performance and scalability.

It is also of interest to know how reliable and valid the gathered data is. To ensure it is valid, the possible bottlenecks of the system need to be determined. The test strategy used has to be analyzed to make sure it does not perform incorrect operations, and that it performs the same amount of operations on each service.

By completing the evaluation of the results, the final steps of Bunge’s method can be defined.

8. Check the solution: looking at the results and identifying if there are differences between tests. Do some tests return very different response times and/or error rates? Is there some possible explanation for it happening?

9. Correct possibly incorrect solutions: if an incorrect solution is found, re-run the tests in a way that avoids the cause of the incorrect results. Re-evaluate the results and compare the services.

10. Examine the impact: comparing the results received from the correct solutions can provide an accurate comparison between services, where their impact can be determined.

3.2.2 Responding to the Research Question

By exploring ways to measure performance and scalability, and by defining what was necessary to include in the application to support those measurements, the research question can be revisited to describe how it will be answered in the project.

How do various Cloud Architectures differ in terms of performance and scalability?

By deploying the REST API application to each architecture, the JMeter load tool can be used to simulate a large amount of users utilizing the system at the same time. By measuring the average response time for different amounts of requests and determining when a machine's performance degrades due to increased load, it can be determined how well each machine performs and where its scalability limits lie. By comparing the different machine sizes for each service at equivalent price and power levels (such as CPU and RAM), a comparison between all architectures can be made to determine how they differ.

3.3 Managing the degree project

To manage the project and ensure that it does not exceed its budget and/or other constraints, the project management triangle was used. In addition, the MoSCoW method was used to map out the functionality necessary for the finished project, without exceeding the budget or time limit.

3.3.1 Project management triangle

To ensure the project stays within budget, time schedule and desired functionality, the Project Management Triangle, also referred to as The Iron Triangle, was used [4]. It defines a project by the corners of a triangle, representing the categories cost, time and quality, as seen in figure 3.1. In this project, the cost is the amount of hours spent to write the report and perform the research, which is set at 400 hours. The time is also set at a specific date, ending in June. This leaves the scope/quality of the project flexible - and to ensure that the project stays within the limits of time and budget, the MoSCoW method is employed.

Figure 3.1: The project triangle used to ensure the project stays within the time and cost limit, while allowing the quality of the API to be flexible in terms of functionality implemented and comparisons made.

3.3.2 The MoSCoW Method

In order to keep the project within the given time frame, the MoSCoW method [14] is employed to ensure the main goals of the project are achieved, while keeping within the time limits and project requirements.

The MoSCoW method is defined in four different categories: Must, Should, Could, Won’t - to define which project goals must, should, could, and won’t be done.

All requirements in the ’Must’ category are to be fulfilled in the project, with the ’Should’ category planned into the budget, but could be skipped if there isn’t enough time. Goals in the ’Could’ category may be done in case there is excess time, with the ’Won’t’ category being skipped completely.

Must:

1. A working Web API which fulfills the requirements set by Triona.

2. A performance measurement for one variant of each architecture (price or power, such as RAM).

3. A scalability measurement for one variant of each architecture (price or power, such as RAM).

4. Comparison between all architectures, using performance and scalability metrics.

Should:

1. Performance measurements for multiple (two or more) variants of each architecture.

2. Scalability measurements for multiple (two or more) variants of each architecture.

Could:

1. Measure performance/scalability for additional services within Azure.

2. Measure additional metrics for the services chosen within Azure.

Won't:

1. Deploy the same API within different platforms, such as Amazon Web Services or Google Cloud.

3.4 Tools

In order to load test the application, JMeter was used to create test plans in which web requests were performed on the specified API. Using its built-in features, a surge of users accessing the API simultaneously could be simulated. The tool works by sending HTTP requests of any type to simulate a real-world situation for the system.
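
As a usage illustration (not taken from the project files), a JMeter test plan saved as a .jmx file can also be run without the GUI from the command line; the file names below are hypothetical:

jmeter -n -t login_plan.jmx -l login_results.jtl

Here -n runs JMeter in non-GUI mode, -t points to the test plan and -l writes every sample to a results file, from which average response times and error rates can later be read.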

All coding was done in Visual Studio using C#, since it is easily supported by all Azure platform deployment methods and automatic publishing tools. The built-in .NET framework was used to create the REST Web API that was deployed to each architecture.

In order to test the API locally and after deployment, Postman is used to send single, manual web requests to test the functionality before performing load testing.

4 Developing and Deploying The API, Gathering Data

This chapter describes how the API was developed and deployed on the various Azure services, and explains how the performance and scalability data was gathered. Before the steps in this chapter could be carried out, Visual Studio had to be installed, an Azure account had to be created or accessed in order to use the services within Azure, and additional components had to be installed in Visual Studio to be able to deploy the specific types of applications needed for Cloud Services and Service Fabric Mesh.

4.1 Developing the API

The app was developed according to the requirements set by Triona, creating a subset of their main APIs, referred to as MPP5. The functionality of the API was to be able to access the security (login), live-event and VOD (Video On Demand) APIs of MPP5, in order to access data related to live-streams. It should be possible to access data related to both live events and VODs, along with being able to create new/future ones, so that consumers using the API can develop their own front-end interface to represent the data stored by Triona. The requirement document provided by Triona is seen in appendix A.1, where all API calls are specified. The code was stored on GitHub using a private repository.

All code projects follow the Model-View-Controller (MVC) software pattern, where the controller was used to receive, forward and return requests from the MPP5 API, with no view and only minor model logic. In addition to the MVC pattern, some additional functionality from Web API libraries was used to be able to map incoming HTTP requests to controllers. The projects differed slightly in terms of how HTTP requests were received and returned, with most code being exactly the same.

For Virtual Machines and Service Fabric Mesh, the projects used the AspNetCore library, while the Cloud Service project used System.Web.Http. This only caused minor differences in the controllers, which extend different classes, but the same overall functionality exists, with minor differences in syntax and HTTP request mapping. An example mapping in both projects would be http://ip-address/api/holder-id/controller-name/ - to reach a controller within the project.
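
As an illustration of this kind of mapping, a minimal controller sketch for the AspNetCore-based projects (VM and Mesh) could look as follows; the controller name, route parameter and placeholder response are hypothetical and only mirror the URL pattern above.

using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

// Reached at http://ip-address/api/{holderId}/Content/ once hosted.
[Route("api/{holderId}/[controller]")]
public class ContentController : ControllerBase
{
    [HttpGet]
    public async Task<IActionResult> Get(string holderId)
    {
        // In the real project the request would be forwarded to the MPP5 API;
        // here a placeholder payload is returned instead.
        var payload = await Task.FromResult("{}");
        return Ok(payload);
    }
}

In the Cloud Service project, the corresponding controller would instead extend ApiController from System.Web.Http, with the route configuration handled by that library.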

In figure 4.1, the inheritance of ApiController from the System.Web.Http library within the Cloud Service project can be seen. Similarly, the Mesh and VM projects extend the ControllerBase class from the AspNetCore.Mvc library, which can be seen in figure 4.2. The two libraries are different implementations of URL mapping to methods within controllers in each project. The settings are configured so that each function call works exactly the same, regardless of which library is used.

Figure 4.1: The library used within the cloud service project, utilizing the System.Web.Http library class ’ApiController’ to map HTTP requests to controllers located within the project.

All projects used the WebRequest class from the System.Net namespace to create and send requests to the external API. This was done asynchronously, to ensure that the machines did not get stuck waiting for responses and could handle other incoming requests in parallel. Figure 4.3 shows the code used to call the live-event API to retrieve data related to the live-events available to the current user.
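
A hedged sketch of such an asynchronous call using the WebRequest class is shown below; the endpoint URL and token are placeholders rather than the actual MPP5 details shown in figure 4.3.

using System.IO;
using System.Net;
using System.Threading.Tasks;

public static class LiveEventClient
{
    // Calls an external API asynchronously and returns the raw response body.
    public static async Task<string> GetLiveEventsAsync(string endpointUrl, string token)
    {
        var request = WebRequest.Create(endpointUrl);            // e.g. the live-event endpoint
        request.Headers["Authorization"] = token;                // placeholder authorization header

        using (var response = await request.GetResponseAsync())  // await instead of blocking the thread
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return await reader.ReadToEndAsync();                // body returned to the calling controller
        }
    }
}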

28 Figure 4.2: In both the Virtual Machine and Service Fabric Mesh projects, the AspNetCore.Mvc library class ’ControllerBase’ was used to map HTTP requests to controllers within a project.

Figure 4.3: How an external API call was made within the project, using the System.Net.WebRequest class. The URI in the request maps the call to the correct endpoint in the live-event API, in order to get all data related to live-events. The authorization header provides the authentication needed to get the data.

4.2 Deploying the API

Once the API was completed, it was deployed to the different services, with deployment methods differing slightly depending on the service deployed to. The machine sizes were chosen to match in price and, in the case of Mesh, to use similar CPU and RAM to the VM and Cloud Services machines, since Mesh pricing is not yet available.

4.2.1 Virtual Machine

The machines chosen for VMs can be seen in table 4.1, where ID describes the type of machine chosen and the individual specifications for it. The amount of cores refers to how many CPU cores the machine has, and RAM specifies how much RAM the machine has available. The VM sizes were found on the Azure website [8].

Table 4.1: Virtual Machines chosen for deployment and measuring data.

ID      Cores   RAM (GB)   Price (month)
B2S     2       4          $39.13
B2MS    2       8          $72.47

The Virtual Machine was first created using the Azure Portal, from which a machine size was specified. Once the machine was created, it was accessed using the Windows feature ’Remote Desktop’, along with credentials for the machine. Using the built-in tool for publishing an application from Visual Studio, a folder containing all data related to the project was created. The folder was then placed within a web server in the Virtual Machine, in this project, using IIS [23].

In addition to placing the project in the web server, certain Windows features had to be installed in order to be able to run the specific AspNetCore class calls. In this project, this included the ASP.NET Core Runtime Hosting package.

Figures 4.4 and 4.5 show the publishing option within Visual Studio, which created the compressed files needed to run the project in the web server. It was also possible to upload the project directly to a VM, but since the permission rules had conflicts, the solution used was to move the files manually.

Figure 4.4: The VM Project publish option, located by right-clicking the project within Visual Studio.

Figure 4.5: The targets possible for publishing the VM Application.

4.2.2 Cloud Service

As with VMs, a Cloud Service project had to first be created within the Azure Portal. Figure 4.6 shows an example of the window that appears when creating a Cloud Service, where information such as server location and resource group (a group unique within the Azure portal, used to map projects and services together) had to be specified.

Figure 4.6: An example of what a Cloud Service creation window looks like, where server location and resource group have to be specified.

To upload the application, a built-in publishing tool was used to create the two files containing the necessary information for the project. Accessing the Cloud Service project created within the Azure Portal, an 'Update' button could be pressed to access a window similar to the one used when creating the cloud project. The two files created using the publishing tool were selected here, after which the Azure software automatically allocated a machine to run the software. The machine size and amount of replicas were chosen within the project, with Azure automatically reading the specified values and allocating the resources needed when the files were uploaded to the portal.

Figures 4.7 and 4.8 show the ’Update’ button to update the project with new machine sizes, code changes or replica amounts, along with what to specify within the window that showed up upon pressing ’Update’.

The machines backing Cloud Services use the same size IDs as Virtual Machines, but the environment is configured by the Azure portal. The prices are slightly different, since Microsoft includes the price of the software that handles load balancing between the running replicas, as well as the environment setup and all other features included with Cloud Services. The sizes chosen correspond to ones currently used internally by Triona (A1), along with a smaller machine to see how much it can handle (A0). The machine specifications can be seen in table 4.2.

Table 4.2: Machine IDs for Cloud Service used for deployment and measuring data.

ID               Cores   RAM (GB)   Price (month)
A0/Extra Small   1       0.75       $14.60
A1/Small         1       2          $58.40

Figure 4.7: The 'update' button which is pressed to modify the current distribution of cloud services.

Figure 4.8: The information needed to update the cloud project, including the location to store the project files along with which roles the update is concerned with.

4.2.3 Service Fabric Mesh

The same code from the VM project could be used when deploying to Service Fabric Mesh, the only difference being that a Mesh project had to be created within Visual Studio to obtain the features needed for deployment. These features included specifying the CPU limit (in number of cores) and how much RAM each application may need. The limits were specified within the project, in a 'service.yaml' file.

As with Cloud Service, using a built-in publishing tool, the application could be automatically deployed to the Azure portal, where the Mesh network automatically allocates the resources needed to run the project.

The limits for Service Fabric Mesh were chosen to line up with the cores and GB of RAM chosen for VMs and Cloud Services, since pricing is not yet officially released. The chosen sizes can be seen in table 4.3.

Table 4.3: Service Fabric Mesh machine limits chosen for measuring data and deploying the API.

ID    Cores   RAM (GB)
M1    1       2
M2    2       4

4.3 Gathering Performance and Scalability Data

In order to determine the performance of each architecture, a certain amount of requests was performed on the API while it was deployed on that architecture. By measuring the average response time (ART) over the course of a test, the architectures could then be compared, using the ART as the performance metric. The amount of errors encountered during the load tests is also of interest, where an increase in errors can be attributed to the system not being able to handle the amount of incoming requests.
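
As an illustration of how these two metrics can be computed after a test run, the sketch below reads a JMeter results file saved in the default CSV (.jtl) format and prints the average response time and error rate; this is an assumed post-processing step, not code from the project.

using System;
using System.IO;
using System.Linq;

class JtlSummary
{
    // Reads a JMeter .jtl results file (CSV with a header row) and prints
    // the average response time and error rate over all samples.
    // Naive comma split: assumes no embedded commas inside any field.
    static void Main(string[] args)
    {
        var lines = File.ReadAllLines(args[0]);
        var header = lines[0].Split(',');
        int elapsed = Array.IndexOf(header, "elapsed");   // response time in ms
        int success = Array.IndexOf(header, "success");   // "true" or "false"

        var samples = lines.Skip(1).Select(l => l.Split(',')).ToList();
        double averageMs = samples.Average(s => double.Parse(s[elapsed]));
        double errorRate = samples.Count(s => s[success] != "true") / (double)samples.Count;

        Console.WriteLine($"ART: {averageMs:F0} ms, errors: {errorRate:P2}");
    }
}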

To ensure that the performance metrics are purely based on the API and not bottlenecked by some other back-end, the database call limits and service sizes are considered. For example, the live-event service caches a live-event if multiple requests are made for the same event, while the login service does not cache credentials and may therefore become a very large bottleneck for the requests being made to the Web API, impacting the gathered performance metrics.

The tests were performed from three different locations: the offices of Triona, the student's home, and another office. All three locations were in Stockholm, with download and upload speeds being about the same at each location (between 80-120 MB/s download, 10-20 MB/s upload). Most tests were performed at Triona and at home, amounting to about 90% of all tests. The average values calculated are from tests across all locations, since there did not seem to be a significant difference when testing at different locations. The tests are also not marked with the location they were performed at. Most test results (90+%) are included in the report, with major deviations excluded since they do not contribute to the actual average that is to be calculated.

For scalability of the architecture, the amount of requests is increased both until the ART begins to deviate from previous tests where the ART has been about the same, and until errors are encountered. By identifying at which points each service reaches certain response times and error rates, it is then easy to compare how well they scale with increased request amounts, both when scaling up and when scaling out.

Two test plans were created in JMeter to gather the data, seen in figure 4.9 for the 'Login Plan' and figure 4.12 for the 'Performance Plan'. Both test plans run for a 5-minute period, during which all requests are performed.

Figure 4.9: JMeter ’Login’ Plan, used to simulate a use case often encountered in production.

The Login Plan simulates a normal use case, seen in figure 4.9, where a user logs in to the system and accesses a specific broadcast. This is similar to how most use cases are in production, where sudden surges of users want to access the same broadcast.

The plan performs a POST request to the ’Login’ method, after which it performs a GET request to the ’GetContent’ method. Each thread created in this test plan performs both these operations, seen in figures 4.10 and 4.11.

37 Figure 4.10: Login call performed on the API during a ’Login’ Test Plan.

Figure 4.11: Content Request performed on the API during a ’Login’ Test Plan.

The second plan, referred to as the 'Performance Plan' and seen in figure 4.12, performs only the second call of the 'Login Plan', where a GET request is made to the 'GetContent' method, seen in figure 4.13. This is done due to restrictions related to the 'Login' calls, which are limited to a certain amount of calls on the database containing credentials. The method call for a single broadcast does not have the same restrictions; instead, if the same broadcast is requested a certain amount of times in a short span of time, it is cached for 30 seconds. This means that far fewer database calls are made, which ensures that the bottleneck of the services is not the database limit but the machine receiving the requests.

Figure 4.12: JMeter ’Performance’ Plan, used to reach the maximum capacity of each service.

The details of how the plans call the API, along with how the API itself forwards requests and retrieves data, can be seen in figure 4.14, with a larger, rotated image in appendix B. Here it can be seen how the API communicates with MPP5, along with how data is retrieved from an Azure Cosmos DB, which stores both the login and content data.

39 Figure 4.13: Content request made to the API during the ’Performance’ Test Plan.

Figure 4.14: Sequence diagram of the developed API, where it is referred to as 'Client API'. Here, MPP5 is visualized by including some of its APIs: live-event, VOD and Security. In addition, the Cosmos DB from which the APIs retrieve and store data is included.

To measure and gather data for performance, the same amount of incoming requests is simulated on all services. To analyze the data, the average response time and error rate for each service was noted. This was done with both test plans, to see how the services perform in a regular use case and when simulating a massive increase of incoming requests. By comparing each service using the same amount of requests, it is then easy to identify how they differ in response times and error rates.

For scalability, a maximum response time was considered. When a service reached a certain response time threshold, it was considered unable to handle the incoming requests, since the queue of incoming requests grows as time passes and response times just keep getting worse. It is of interest to figure out at which point each machine, or combination of machines, reaches this point. The threshold was defined as situations where the average response time increases drastically from previous tests, or where the service starts returning timeouts. An example is a test of 6000 requests having an average response time of 300 ms, while increasing to 8000 requests results in an average of 10 seconds.

To replicate the data gathered, a similar REST Web API could be hosted on Azure services, where a call to another external API is made from the service itself. Results will vary depending on where the second API is hosted, the load it is experiencing and the response time from the external API. In the case of this project, all services are hosted within the same subscription in Azure, which may impact the results due to the services being hosted close to each other.

Examples of sending and receiving the login and content requests can be seen in figures 4.15 and 4.16, where an actual request was made along with a response from the API.

41 Figure 4.15: Login request made to the API using the Postman tool.

Figure 4.16: Content request made to the API using the Postman tool.

5 Measured Data and Service Comparison

This chapter is a summary of the data gathered for all services. The detailed results of each service will be listed, with the variations of each service and how they performed. After presenting the results of each individual service, the metrics will be compared side-by-side, by using the results from each variation of each service next to each other.

5.1 Gathered Data

This section will present the data gathered for each service. By performing two test plans, defined in section 4.3, enough data to differentiate between the services for both performance and scalability metrics was gathered.

Depending on how the data is presented, it will be used differently when calculating overall averages and error rates.

Results presented using normal text are data that will be included when calculating the final result/average of a service. For example, if three tests are done on a machine and one of them is written in italic, it means that the specific test is an anomaly and does not represent the average case. This may be due to issues with bandwidth, database or the services, so it is not considered in the final result.

The average of all results will be presented in bold, where the average of all tests which are not in italic is calculated. This is the average that will be used when comparing the services between each other.

An example of a result table can be seen in table 5.1, where one invalid test is not considered when calculating the overall average of the remaining tests.

Table 5.1: Example of result data, where italic represents data not used when calculating the overall average, shown in bold.

Requests   Average Response Time (ms)   Error Amount/%
1000       100                          0/0%
           150                          0/0%
           3820                         480/48%
           125                          0/0%

5.1.1 Virtual Machines

VMs seem to perform well for lower amounts of requests, ending at around or below 200 ms response times for the login plan when performing 2000 or fewer requests. When increasing to 3000, the error rates and response times skyrocket and almost every request times out.

About the same results can be seen for the performance test plan, where the machines perform about equally when below or at 8000 requests. When increasing to 10000 and 12000, B2S performs better, but both machines time out or have very large response times at 12000.

The results for B2S can be seen in tables 5.2 and 5.3, and for B2MS in tables 5.4 and 5.5.

Table 5.2: B2S - login tests done for the B2S Virtual Machine.

Login
Requests   Average Response Time (ms)   Error Rate (Requests/Total%)
1000       123                          0/0%
           124                          0/0%
           123                          0/0%
1500       128                          0/0%
           144                          0/0%
           136                          0/0%
2000       158                          0/0%
           234                          0/0%
           196                          0/0%
3000       27776                        2506/83.55%
           26776                        2400/80%
           27276                        2453/81.77%

Table 5.3: B2S - performance tests done for the B2S Virtual Machine.

Performance
Requests   Average Response Time (ms)   Error Rate (Requests/Total%)
4000       251                          0/0%
           161                          0/0%
           172                          0/0%
           195                          0/0%
6000       220                          0/0%
           220                          0/0%
8000       2036                         0/0%
           686                          0/0%
           823                          0/0%
           1182                         0/0%
10000      19965                        2100/21.00%
           3052                         0/0%
           3256                         0/0%
           3154                         0/0%
12000      25079                        7681/64%
           24776                        7359/61.33%
           18371                        3553/30%
           22742                        6198/52%

Table 5.4: B2MS - login tests done for the B2MS Virtual Machine.

Login Test
Requests   Average Response Time (ms)   Error Rate (Requests/Total%)
1000       143                          0/0%
           109                          0/0%
           194                          0/0%
           116                          0/0%
           140                          0/0%
1500       159                          0/0%
           162                          0/0%
           137                          0/0%
           153                          0/0%
2000       18620                        1006/50.30%
           146                          0/0%
           172                          0/0%
           192                          0/0%
           170                          0/0%
3000       22655                        2486/83%
           27855                        2530/84.33%
           27227                        2405/80.17%
           25912                        2474/82.45%

45 Table 5.5: B2MS - performance tests done for the B2MS Virtual Machine.

Performance Test
Requests   Average Response Time (ms)   Error Rate (Requests/Total%)
4000       251                          0/0%
           206                          0/0%
           203                          2/0.05%
           220                          0/0%
6000       203                          1/0.02%
           238                          1/0.02%
           220                          1/0.02%
8000       2036                         0/0%
           1204                         0/0%
           1284                         1/0.01%
           1508                         0/0%
10000      19965                        2100/21.00%
           14437                        0/0%
           3962                         0/0%
           21546                        4197/41.97%
           9200                         0/0%
12000      25079                        7681/64%
           24675                        7147/60.34%
           21285                        4953/41.27%
           23680                        6594/55.20%

5.1.2 Cloud Service

Regarding the differences between A0 and A1, the cheaper, less powerful A0 seems to perform the same as or better than A1 for the login plan, with differences being around 200 ms (about 150 ms for A0, 350 ms for A1) for 1000 requests, and around 20-30 ms for 2000 requests (155-185 ms for A0 and 153-171 ms for A1). Both A0 and A1 start timing out at the same rate for 3000 requests, with response times being between 5-7 seconds for both A0 and A1 and 20-30% error rates.

For the performance plan, the average response times for both A0 and A1 are almost exactly the same, both when using one and two replicas, up until 6000 requests. At 8000 and 10000 requests, A1 performs better than both variants of A0, even when using a single replica, with response times being around 2 seconds faster than A0, which sits at around 3 seconds for 8000 requests and 18 seconds for 10000. Both machines fail to hold up to 12000 requests, resulting in around 20-60% error rates with high variance between tests.

The tests performed on the A0 machine can be seen in tables 5.6 and 5.7 for a single replica, and tables 5.8, 5.9 for two replicas.

For A1, the tables 5.10 and 5.11 include the single replica results, while tables 5.12 and 5.13 contain the two replica tests.

Table 5.6: A0/ExtraSmall-1 - login tests performed on the A0/ExtraSmall Cloud Service using a single replica of the machine.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
1000 | 111 | 0/0%
1000 | 139 | 0/0%
1000 | 204 | 1/0.05%
1000 | 161 | 1/0.05%
1000 | 154 (overall average) | 0/0%
1500 | 137 | 1/0.03%
1500 | 161 | 3/0.10%
1500 | 130 | 1/0.03%
1500 | 212 | 1/0.03%
1500 | 160 (overall average) | 1/0.05%
2000 | 159 | 0/0%
2000 | 191 | 0/0%
2000 | 166 | 0/0%
2000 | 223 | 1/0.03%
2000 | 185 (overall average) | 0/0%
3000 | 8324 | 618/20.53%
3000 | 4868 | 605/20.17%
3000 | 3748 | 602/20.07%
3000 | 21606 (excluded) | 2822/47.03%
3000 | 5647 (overall average) | 608/20.26%

Table 5.7: A0/ExtraSmall-1 - performance tests performed on the A0/ExtraSmall Cloud Service using a single replica of the machine.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
4000 | 234 | 3/0.07%
4000 | 220 | 0/0%
4000 | 224 | 0/0%
4000 | 179 | 0/0%
4000 | 214 (overall average) | 0/0%
6000 | 193 | 0/0%
6000 | 229 | 0/0%
6000 | 303 | 0/0%
6000 | 291 | 0/0%
6000 | 254 (overall average) | 0/0%
8000 | 654 | 0/0%
8000 | 4463 | 0/0%
8000 | 4673 | 0/0%
8000 | 3551 | 1/0.01%
8000 | 3335 (overall average) | 0/0%
10000 | 5085 | 0/0%
10000 | 22862 | 4034/40.34%
10000 | 23225 | 4006/40.06%
10000 | 24194 | 5309/53.09%
10000 | 18841 (overall average) | 3337/33.37%
12000 | 26924 | 6091/50.76%
12000 | 26878 | 8719/72.66%
12000 | 25413 | 7590/63.25%
12000 | 26405 (overall average) | 7467/62.22%

Table 5.8: A0/ExtraSmall-2 - login tests performed on the A0/ExtraSmall Cloud Service using two replicas of the machine.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
1000 | 146 | 0/0%
1000 | 2145 (excluded) | 35/1.75%
1000 | 122 | 0/0%
1000 | 161 | 0/0%
1000 | 143 (overall average) | 0/0%
1500 | 124 | 0/0%
1500 | 130 | 0/0%
1500 | 145 | 0/0%
1500 | 133 (overall average) | 0/0%
2000 | 149 | 20/0.5%
2000 | 151 | 0/0%
2000 | 165 | 0/0%
2000 | 155 (overall average) | 6/0.1%
3000 | 2942 | 609/20.30%
3000 | 13388 | 1205/39.97%
3000 | 4435 | 611/20.37%
3000 | 6921 (overall average) | 808/26.94%

Table 5.9: A0/ExtraSmall-2 - performance tests performed on the A0/ExtraSmall Cloud Service using two replicas of the machine.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
4000 | 194 | 0/0%
4000 | 136 | 0/0%
4000 | 180 | 0/0%
4000 | 167 | 0/0%
4000 | 169 (overall average) | 0/0%
6000 | 254 | 0/0%
6000 | 150 | 0/0%
6000 | 256 | 0/0%
6000 | 348 | 0/0%
6000 | 252 (overall average) | 0/0%
8000 | 3084 | 0/0%
8000 | 649 | 0/0%
8000 | 4750 | 0/0%
8000 | 3375 | 0/0%
8000 | 2964 (overall average) | 0/0%
10000 | 21719 | 4380/43.80%
10000 | 3473 | 0/0%
10000 | 23980 | 5376/53.75%
10000 | 22053 | 3625/36.25%
10000 | 22199 | 3650/36.50%
10000 | 23570 | 4245/42.45%
10000 | 19499 (overall average) | 3546/35.46%
12000 | 7181 | 0/0%
12000 | 2138 | 0/0%
12000 | 3125 | 0/0%
12000 | 26579 | 8190/68.25%
12000 | 23959 | 5674/47.28%
12000 | 12596 (overall average) | 2772/23.10%

Table 5.10: A1/Small-1 - login tests performed on the A1/Small Cloud Service using a single replica of the machine.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
1000 | 295 | 0/0%
1000 | 632 | 0/0%
1000 | 127 | 0/0%
1000 | 351 (overall average) | 0/0%
1500 | 133 | 0/0%
1500 | 135 | 0/0%
1500 | 136 | 0/0%
1500 | 141 | 0/0%
1500 | 136 (overall average) | 0/0%
2000 | 155 | 0/0%
2000 | 147 | 0/0%
2000 | 161 | 0/0%
2000 | 148 | 0/0%
2000 | 153 (overall average) | 0/0%
3000 | 3884 | 614/20.47%
3000 | 8476 | 758/25.27%
3000 | 9043 | 612/20.40%
3000 | 9750 | 743/24.77%
3000 | 7788 (overall average) | 682/22.72%

Table 5.11: A1/Small-1 - performance tests performed on the A1/Small Cloud Service using a single replica of the machine.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
4000 | 157 | 0/0%
4000 | 301 | 0/0%
4000 | 223 | 0/0%
4000 | 227 (overall average) | 0/0%
6000 | 201 | 0/0%
6000 | 216 | 0/0%
6000 | 226 | 0/0%
6000 | 214 (overall average) | 0/0%
8000 | 425 | 0/0%
8000 | 554 | 0/0%
8000 | 1569 | 0/0%
8000 | 849 (overall average) | 0/0%
10000 | 4118 | 0/0%
10000 | 19083 | 68/0.68%
10000 | 17227 | 2/0.02%
10000 | 13476 (overall average) | 23/0.23%
12000 | 22478 | 4195/34.96%
12000 | 25883 | 4055/33.79%
12000 | 18314 | 1018/8.48%
12000 | 22225 (overall average) | 3089/25.74%

Table 5.12: A1/Small-2 - login tests performed on the A1/Small Cloud Service using two replicas of the machine.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
1000 | 296 | 0/0%
1000 | 148 | 0/0%
1000 | 627 | 0/0%
1000 | 357 (overall average) | 0/0%
1500 | 133 | 0/0%
1500 | 128 | 0/0%
1500 | 143 | 1/0.03%
1500 | 135 (overall average) | 0/0%
2000 | 155 | 0/0%
2000 | 210 | 0/0%
2000 | 148 | 0/0%
2000 | 171 (overall average) | 0/0%
3000 | 3886 | 614/20.47%
3000 | 3998 | 684/22.47%
3000 | 3687 | 614/20.47%
3000 | 4105 | 608/20.27%
3000 | 5225 (overall average) | 630/20.92%

Table 5.13: A1/Small-2 - performance tests performed on the A1/Small Cloud Service using two replicas of the machine.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
4000 | 148 | 0/0%
4000 | 154 | 0/0%
4000 | 354 | 2/0.05%
4000 | 218 (overall average) | 0/0%
6000 | 193 | 0/0%
6000 | 129 | 0/0%
6000 | 473 | 0/0%
6000 | 265 (overall average) | 0/0%
8000 | 359 | 0/0%
8000 | 166 | 0/0%
8000 | 3191 | 3/0.04%
8000 | 1239 (overall average) | 1/0.01%
10000 | 5665 | 0/0%
10000 | 2961 | 0/0%
10000 | 17787 | 1739/17.39%
10000 | 8804 (overall average) | 580/5.80%
12000 | 23137 | 5005/41.71%
12000 | 21295 | 3905/32.52%
12000 | 27169 | 8207/68.39%
12000 | 23867 (overall average) | 5705/47.54%

5.1.3 Service Fabric Mesh

The differences between the Mesh machines are minimal, both when using stronger machines and when using multiple replicas. For M1, using one or two replicas for the login tests results in differences of about ±10 ms. For M2, the difference is more noticeable, with two replicas of the machine producing better results for all amounts of login requests.

With the performance plan, two replicas of M1 allow the machine to handle far more requests before timing out. Up to 10000 requests, two replicas yield average response times up to 5-10 seconds faster than a single replica. For 12000 requests, both replica variants time out around 30-50% of requests, with response times being about the same at 20-25 seconds.

The same occurs when comparing one and two replicas of M2, but the other way around: using a single replica seemed to result in shorter response times. This can be attributed to the large variance in the tests, which sometimes produced zero errors and around 2000-3000 errors on the next test run. To determine exactly how much the variants differ, more tests would be needed to narrow down the averages. When testing at and below 6000 requests, both replica variants have about the same response times with zero errors, as with M1.

Comparing M1 to M2 with the same number of replicas, the results are about the same, with M2 coming out ahead for larger request sizes. Comparing two replicas, M1 comes out ahead of M2, but this can be attributed to the large variance in the test results. If the best test from each machine were used, M1 would still come out ahead at about half the response times of M2, which may imply that the gathered averages are reasonably accurate.

The results for M1 can be seen in tables 5.14 and 5.15 for a single replica, and in tables 5.16 and 5.17 for the two-replica tests. The M2 results can be seen in tables 5.18 and 5.19 for a single replica, and in tables 5.20 and 5.21 for two replicas.

Table 5.14: M1-1 - login tests performed on the Mesh machine referred to as M1, using a single replica.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
1000 | 222 | 0/0%
1000 | 213 | 0/0%
1000 | 218 (overall average) | 0/0%
1500 | 251 | 0/0%
1500 | 237 | 0/0%
1500 | 244 (overall average) | 0/0%
2000 | 292 | 0/0%
2000 | 297 | 0/0%
2000 | 295 (overall average) | 0/0%
3000 | 3847 | 626/20.87%
3000 | 2963 | 615/20.50%
3000 | 3405 (overall average) | 620/20.68%

Table 5.15: M1-1 - performance tests performed on the Mesh machine referred to as M1, using a single replica.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
4000 | 238 | 0/0%
4000 | 331 | 0/0%
4000 | 285 (overall average) | 0/0%
6000 | 345 | 0/0%
6000 | 330 | 0/0%
6000 | 338 (overall average) | 0/0%
8000 | 4810 | 0/0%
8000 | 1584 | 0/0%
8000 | 3197 (overall average) | 0/0%
10000 | 20612 | 3107/31.07%
10000 | 24326 | 4283/42.83%
10000 | 14251 | 51/0.51%
10000 | 19730 (overall average) | 2480/24.80%
12000 | 20925 | 2224/18.53%
12000 | 22670 | 5642/47.02%
12000 | 21798 (overall average) | 3933/32.78%

Table 5.16: M1-2 - login tests done on the Mesh machine referred to as M1, using two replicas.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
1000 | 225 | 0/0%
1000 | 244 | 0/0%
1000 | 204 | 0/0%
1000 | 226 | 0/0%
1000 | 225 (overall average) | 0/0%
1500 | 256 | 0/0%
1500 | 240 | 0/0%
1500 | 262 | 0/0%
1500 | 239 | 0/0%
1500 | 249 (overall average) | 0/0%
2000 | 295 | 0/0%
2000 | 257 | 0/0%
2000 | 280 | 0/0%
2000 | 323 | 0/0%
2000 | 289 (overall average) | 0/0%
3000 | 2926 | 615/20.50%
3000 | 3586 | 665/21.37%
3000 | 3373 | 637/21.23%
3000 | 3295 (overall average) | 639/21.03%

Table 5.17: M1-2 - performance tests done on the Mesh machine referred to as M1, using two replicas.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
4000 | 262 | 0/0%
4000 | 264 | 0/0%
4000 | 322 | 0/0%
4000 | 282 (overall average) | 0/0%
6000 | 249 | 0/0%
6000 | 290 | 0/0%
6000 | 315 | 0/0%
6000 | 284 (overall average) | 0/0%
8000 | 538 | 0/0%
8000 | 2701 | 0/0%
8000 | 1347 | 0/0%
8000 | 1529 (overall average) | 0/0%
10000 | 2676 | 0/0%
10000 | 17056 | 302/3.02%
10000 | 3625 | 0/0%
10000 | 7785 (overall average) | 100/1%
12000 | 22340 | 6167/51.39%
12000 | 23448 | 8693/72.44%
12000 | 19628 | 1740/14.50%
12000 | 21805 (overall average) | 5533/46.11%

Table 5.18: M2-1 - login tests done on the Mesh machine referred to as M2, using a single replica of the application. There are minor deviations in tests, but mostly all results are around the same values across many tests.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
1000 | 269 | 0/0%
1000 | 265 | 0/0%
1000 | 272 | 0/0%
1000 | 268 (overall average) | 0/0%
1500 | 263 | 0/0%
1500 | 264 | 0/0%
1500 | 239 | 0/0%
1500 | 255 (overall average) | 0/0%
2000 | 260 | 0/0%
2000 | 322 | 0/0%
2000 | 4436 (excluded) | 470/20.15%
2000 | 291 (overall average) | 0/0%
3000 | 7243 | 776/25.87%
3000 | 3466 | 621/20.67%
3000 | 8401 | 1144/25.67%
3000 | 6370 (overall average) | 847/24.04%

Table 5.19: M2-1 - performance tests done on the Mesh machine referred to as M2, using a single replica of the application. There are minor deviations in tests, but mostly all results are around the same values across many tests.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
4000 | 289 | 0/0%
4000 | 302 | 0/0%
4000 | 261 | 0/0%
4000 | 284 (overall average) | 0/0%
6000 | 378 | 0/0%
6000 | 308 | 0/0%
6000 | 301 | 0/0%
6000 | 329 (overall average) | 0/0%
8000 | 1127 | 0/0%
8000 | 652 | 0/0%
8000 | 1260 | 0/0%
8000 | 1013 (overall average) | 0/0%
10000 | 6539 | 0/0%
10000 | 10226 | 0/0%
10000 | 17557 | 570/5.70%
10000 | 11440 (overall average) | 190/1.90%
12000 | 23264 | 8874/73.95%
12000 | 22414 | 4912/40.93%
12000 | 21937 | 3594/29.95%
12000 | 22538 (overall average) | 5793/48.28%

Table 5.20: M2-2 - login tests done on the Mesh machine referred to as M2, using two replicas.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
1000 | 237 | 0/0%
1000 | 226 | 0/0%
1000 | 231 | 0/0%
1000 | 221 | 0/0%
1000 | 229 (overall average) | 0/0%
1500 | 243 | 0/0%
1500 | 240 | 0/0%
1500 | 238 | 0/0%
1500 | 235 | 0/0%
1500 | 239 (overall average) | 0/0%
2000 | 258 | 0/0%
2000 | 283 | 0/0%
2000 | 267 | 0/0%
2000 | 299 | 0/0%
2000 | 277 (overall average) | 0/0%
3000 | 3321 | 613/20.43%
3000 | 5784 | 619/20.63%
3000 | 3425 | 592/19.73%
3000 | 9234 | 1017/33.83%
3000 | 5441 (overall average) | 710/23.65%

Table 5.21: M2-2 - performance tests done on the Mesh machine referred to as M2, using two replicas.

Requests | Average Response Time (ms) | Error Rate (Requests/Total%)
4000 | 401 | 0/0%
4000 | 247 | 0/0%
4000 | 594 | 0/0%
4000 | 269 | 0/0%
4000 | 378 (overall average) | 0/0%
6000 | 387 | 0/0%
6000 | 349 | 0/0%
6000 | 368 | 0/0%
6000 | 308 | 2/0.03%
6000 | 353 (overall average) | 0/0%
8000 | 2556 | 0/0%
8000 | 4402 | 2/0.03%
8000 | 4960 | 0/0%
8000 | 2857 | 2/0.03%
8000 | 3694 (overall average) | 1/0.01%
10000 | 12412 | 26/0.26%
10000 | 22683 | 5384/53.84%
10000 | 22283 | 3470/34.70%
10000 | 23457 | 5344/53.44%
10000 | 20209 (overall average) | 3556/35.56%
12000 | 23336 | 9070/75.58%
12000 | 23511 | 9582/79.85%
12000 | 23967 | 9383/78.19%
12000 | 23117 | 7913/65.94%
12000 | 23483 (overall average) | 8987/74.89%

5.2 Service Comparison

Using the gathered average response times and error rates, the services can be compared for the different request amounts that were tested. By placing all the results in tables, the differences between the services can be seen clearly.

Tables 5.23 and 5.24 show the exact results for each service and enable them to be compared side by side.

To summarize all the machines tested, table 5.22 shows the machine sizes used in the project and their prices. The listed prices apply when using a single replica; adding replicas multiplies the original price by the number of replicas used. For example, using 2 replicas of A0 would cost 2 * $14.60 = $29.20.

Table 5.22: Machines used in the project for measuring performance and scalability, along with their monthly prices. As of writing, Mesh is still in preview and does not have an official price point, although it is predicted to be considerably cheaper than machines with equivalent CPU and RAM in Cloud Services or Virtual Machines. The prices listed are for a single replica; 2 replicas of A0 or A1 would cost $29.20 and $116.80 respectively.

ID | Cores | RAM (GB) | Price (month)
B2S | 2 | 4 | $39.13
B2MS | 2 | 8 | $72.47
A0/Extra Small | 1 | 0.75 | $14.60
A1/Small | 1 | 2 | $58.40
M1 | 1 | 2 | -
M2 | 2 | 4 | -
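As a small illustration of the pricing rule above, the following sketch (with the single-replica prices from table 5.22 hard-coded, and Mesh omitted since it had no official price at the time) estimates the monthly cost of a configuration by multiplying the single-replica price by the replica count; this assumes the price scales linearly with the number of replicas, as described above:

    using System;
    using System.Collections.Generic;

    class CostExample
    {
        // Single-replica monthly prices from table 5.22.
        static readonly Dictionary<string, decimal> MonthlyPrice = new()
        {
            ["B2S"] = 39.13m,
            ["B2MS"] = 72.47m,
            ["A0/Extra Small"] = 14.60m,
            ["A1/Small"] = 58.40m
        };

        // Assumes the price scales linearly with the number of replicas.
        static decimal MonthlyCost(string machine, int replicas) => MonthlyPrice[machine] * replicas;

        static void Main()
        {
            Console.WriteLine(MonthlyCost("A0/Extra Small", 2)); // 29.20
            Console.WriteLine(MonthlyCost("A1/Small", 2));       // 116.80
        }
    }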

Table 5.23: Table containing all average results across services when performing different amounts of login requests, defined in figure 4.9.

Requests: 1000
Service | ID | Replicas | ART (ms) | Error Amount
Virtual Machine | B2S | 1 | 123 | 0/0%
Virtual Machine | B2MS | 1 | 140 | 0/0%
Cloud Service | A0 | 1 | 154 | 0/0%
Cloud Service | A0 | 2 | 143 | 0/0%
Cloud Service | A1 | 1 | 351 | 0/0%
Cloud Service | A1 | 2 | 357 | 0/0%
Mesh | M1 | 1 | 218 | 0/0%
Mesh | M1 | 2 | 225 | 0/0%
Mesh | M2 | 1 | 268 | 0/0%
Mesh | M2 | 2 | 229 | 0/0%

Requests: 2000
Service | ID | Replicas | ART (ms) | Error Amount
Virtual Machine | B2S | 1 | 196 | 0/0%
Virtual Machine | B2MS | 1 | 170 | 0/0%
Cloud Service | A0 | 1 | 185 | 0/0%
Cloud Service | A0 | 2 | 155 | 0/0%
Cloud Service | A1 | 1 | 153 | 0/0%
Cloud Service | A1 | 2 | 171 | 0/0%
Mesh | M1 | 1 | 295 | 0/0%
Mesh | M1 | 2 | 249 | 0/0%
Mesh | M2 | 1 | 291 | 0/0%
Mesh | M2 | 2 | 277 | 0/0%

Requests: 3000
Service | ID | Replicas | ART (ms) | Error Amount
Virtual Machine | B2S | 1 | 27276 | 2453/81.77%
Virtual Machine | B2MS | 1 | 25912 | 2474/83.60%
Cloud Service | A0 | 1 | 5647 | 608/20.26%
Cloud Service | A0 | 2 | 6921 | 808/26.94%
Cloud Service | A1 | 1 | 7788 | 682/22.72%
Cloud Service | A1 | 2 | 5225 | 630/20.92%
Mesh | M1 | 1 | 3405 | 620/20.68%
Mesh | M1 | 2 | 3295 | 639/21.03%
Mesh | M2 | 1 | 6370 | 847/24.04%
Mesh | M2 | 2 | 5441 | 710/23.65%

Table 5.24: Table containing all average response times and error amounts across all services when performing the performance test plan, defined in figure 4.12.

Requests: 4000
Service | ID | Replicas | ART (ms) | Error Amount
Virtual Machine | B2MS | 1 | 220 | 0/0%
Virtual Machine | B2S | 1 | 195 | 0/0%
Cloud Service | A0 | 1 | 214 | 0/0%
Cloud Service | A0 | 2 | 169 | 0/0%
Cloud Service | A1 | 1 | 227 | 0/0%
Cloud Service | A1 | 2 | 218 | 0/0%
Mesh | M1 | 1 | 285 | 0/0%
Mesh | M1 | 2 | 282 | 0/0%
Mesh | M2 | 1 | 284 | 0/0%
Mesh | M2 | 2 | 378 | 0/0%

Requests: 6000
Service | ID | Replicas | ART (ms) | Error Amount
Virtual Machine | B2MS | 1 | 220 | 1/0.02%
Virtual Machine | B2S | 1 | 220 | 0/0%
Cloud Service | A0 | 1 | 254 | 0/0%
Cloud Service | A0 | 2 | 252 | 0/0%
Cloud Service | A1 | 1 | 214 | 0/0%
Cloud Service | A1 | 2 | 265 | 0/0%
Mesh | M1 | 1 | 338 | 0/0%
Mesh | M1 | 2 | 284 | 0/0%
Mesh | M2 | 1 | 329 | 0/0%
Mesh | M2 | 2 | 353 | 0/0%

Requests: 8000
Service | ID | Replicas | ART (ms) | Error Amount
Virtual Machine | B2MS | 1 | 1508 | 2508/83.60%
Virtual Machine | B2S | 1 | 1182 | 0/0%
Cloud Service | A0 | 1 | 3335 | 0/0%
Cloud Service | A0 | 2 | 2964 | 0/0%
Cloud Service | A1 | 1 | 849 | 0/0%
Cloud Service | A1 | 2 | 1239 | 1/0.01%
Mesh | M1 | 1 | 3197 | 0/0%
Mesh | M1 | 2 | 1529 | 0/0%
Mesh | M2 | 1 | 1013 | 0/0%
Mesh | M2 | 2 | 3694 | 1/0.01%

Table 5.25: Performance Tests Table 2 - continuing from the previous performance test table 5.24.

Requests: 10000
Service | ID | Replicas | ART (ms) | Error Amount
Virtual Machine | B2MS | 1 | 9200 | 0/0%
Virtual Machine | B2S | 1 | 3154 | 0/0%
Cloud Service | A0 | 1 | 18841 | 3337/33.37%
Cloud Service | A0 | 2 | 19499 | 3546/35.46%
Cloud Service | A1 | 1 | 13476 | 23/0.23%
Cloud Service | A1 | 2 | 8804 | 580/5.80%
Mesh | M1 | 1 | 19730 | 2480/24.80%
Mesh | M1 | 2 | 7785 | 100/1%
Mesh | M2 | 1 | 11440 | 190/1.90%
Mesh | M2 | 2 | 20209 | 3556/35.56%

Requests: 12000
Service | ID | Replicas | ART (ms) | Error Amount
Virtual Machine | B2MS | 1 | 23680 | 6594/55.20%
Virtual Machine | B2S | 1 | 22742 | 6198/52.00%
Cloud Service | A0 | 1 | 26405 | 7467/62.22%
Cloud Service | A0 | 2 | 12596 | 2772/23.10%
Cloud Service | A1 | 1 | 22225 | 3089/25.74%
Cloud Service | A1 | 2 | 23867 | 5705/47.54%
Mesh | M1 | 1 | 21798 | 3933/32.78%
Mesh | M1 | 2 | 21805 | 5533/46.11%
Mesh | M2 | 1 | 22538 | 5793/48.28%
Mesh | M2 | 2 | 23483 | 8987/74.89%

6 Discussion and Conclusions

This chapter discusses the gathered results and the validity and reliability of the data and the project methods, and answers the research question.

6.1 Research Methods

Since there was no prior experience in any of the areas explored, the project began with a literature study into Cloud Computing, Microsoft Azure, and measuring performance and scalability, before any development started. The learning platform Pluralsight proved very useful for this purpose, offering multiple guides on development in .NET and Azure, alongside documentation provided by Microsoft and Amazon and articles found on various tech blogs. Since the services being tested are very new, with Mesh barely released, it was sometimes difficult to find sufficient information, but the literature study provided enough information to perform the case study.

The case study was performed to simulate a real-life situation using an API that will be used in production, in order to measure how well each service performs with actual data and use cases. The case study was limited, however, since it does not perform any compute-intensive tasks and therefore does not compare the services in that regard. The study was also limited to the Azure platform and does not compare different providers and their Cloud Computing services. These measurements are left as suggestions for future work.

The load tool simulation was designed to mimic a real-world situation where multiple consumers access the same broadcast. The simulated load is well above the number of requests coming in on a day-to-day basis, but it makes the differences and limits of each service clearer when larger amounts are encountered. By using a larger number of consumers, the limits of each service become more apparent, and decisions made in future projects can save money and time.
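The actual load tests were defined in the load tool's test plans described in chapter 4. Purely as an illustration of the idea, a minimal sketch of firing a batch of concurrent requests against an endpoint and recording the average response time and error count could look as follows (the URL and request count are placeholders, not the project's real API):

    using System;
    using System.Diagnostics;
    using System.Linq;
    using System.Net.Http;
    using System.Threading.Tasks;

    class LoadSketch
    {
        static readonly HttpClient Client = new HttpClient();

        // Sends 'count' GET requests in parallel and returns the average
        // response time in milliseconds together with the number of errors.
        static async Task<(double AverageMs, int Errors)> RunBatch(string url, int count)
        {
            var tasks = Enumerable.Range(0, count).Select(async _ =>
            {
                var watch = Stopwatch.StartNew();
                try
                {
                    var response = await Client.GetAsync(url);
                    watch.Stop();
                    return (Ms: watch.Elapsed.TotalMilliseconds, Error: !response.IsSuccessStatusCode);
                }
                catch (HttpRequestException)
                {
                    watch.Stop();
                    return (Ms: watch.Elapsed.TotalMilliseconds, Error: true);
                }
            });

            var results = await Task.WhenAll(tasks);
            return (results.Average(r => r.Ms), results.Count(r => r.Error));
        }

        static async Task Main()
        {
            // Placeholder endpoint; the real API and test plans are described in chapter 4.
            var (averageMs, errors) = await RunBatch("https://example.com/api/resource", 1000);
            Console.WriteLine($"Average: {averageMs:F0} ms, errors: {errors}");
        }
    }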

Problems encountered during the study were mostly related to installation, deployment and the Azure portal. Many deployment issues arose, especially with VMs and Service Fabric Mesh, where unusual error codes appeared. This was mostly due to a lack of experience in deploying .NET applications to web servers, but also due to how new Mesh is, still being in a preview phase.

To ensure that the project followed a scientific method, Bunge's method, mentioned in section 3.1.4, was applied at the beginning of the project so that the steps necessary for a scientific study were followed. This method was chosen because it had proven relevant in previous projects, which also made it a reliable method for scientific research.

To ensure that the project was kept within the limits defined in figure 3.3.1, the Iron Triangle was employed. By ensuring that the functionality necessary to answer the research question was defined, the project had a higher chance of being completed on time. Using the MoSCoW method from 3.3.2, the necessary functionality was defined and followed throughout the course of the project. This was done by first completing the 'Must have' requirements, followed by the 'Should have' requirements, without going into the 'Could have' category. By following this prioritisation, the research question could be answered.

6.2 Validity and Reliability

Validity refers to choosing the right elements to measure, so that the data that is gathered represents the information that is relevant to reaching the correct conclusion [17].

Reliability refers to how reproducible or recreatable the results gathered are. If the same experiment would be done by anyone using Azure, would they be able to gather the same result [18]?

In order to determine validity and reliability of both the research strategy and the results, we have to reason according to these definitions.

6.2.1 Research Method

The project began with a literature study to gather information regarding the Azure services, Cloud Computing, and measuring performance and scalability. By using multiple credible sources, such as Microsoft, Amazon and IBM, as well as learning resources such as Pluralsight, reliable information was gathered. By studying which tools to use and how both performance and scalability could be measured, the literature study ensured that a case study could be performed to gather data that could answer the research question at the end of the project, validating the project method.

By performing a case study using a few chosen services within Azure, the same results could be reproduced by anyone willing to test the method chosen in the project. The case study is thoroughly documented in chapter 4, where each step of the development and deployment is documented and explained, so that the exact same tests may be performed by anyone willing to reproduce them. Differences can occur, however, due to different testing and hosting locations, along with disturbances in external databases and in the services themselves. This is a negative aspect in terms of reliability; a more accurate method would have included more hosting and testing locations, but this was limited in the project due to time constraints.

There is, however, a constant time difference in the tests: because of the outgoing, external API calls, the data gathered in the project may show the services handling fewer requests than they would if only a single service inside Azure had been used, without any external calls.

6.2.2 Result

The tests performed to gather the data were done from three locations, ensuring that the results are not dependent on the latency and network speed of a single location. In addition to testing from multiple locations, the tests were also run multiple times throughout each day, ensuring that the data is not dependent on occasional disturbances. By taking the average of all the tests conducted, a valid conclusion could be made for the performance and scalability metrics of each service, since minor differences are cancelled out by averaging over multiple tests.

Since the case study represents a real-world situation where consumers are accessing data, the measurements gathered reflect how they would experience interacting with any REST Web API. The data is simple and easy to reproduce, with only minor differences arising from hosting location, testing location and external API calls. The differences caused by external calls are minimal, and a duplicated test would most likely be able to handle slightly more requests. By concluding that the case-study-specific differences have close to no impact, the results can be deemed reliable and reproducible.

6.3 Conclusion

Using the data gathered in chapter 5, an answer and conclusion to the research question could be provided.

• How do various Cloud Architectures differ in terms of performance and scalability?

When it comes to performance, all three services perform similarly when small amounts of users are accessing the systems. The difference becomes noticeable both when multiple instances are run at the same time, where the duplicated instances are able to handle far more requests before timing out, and when the overall amount of incoming requests increases. Virtual Machines fall off heavily between 2000 and 3000 concurrent users, ending up almost 20 seconds behind the slowest machine of both Cloud Services and Service Fabric Mesh.

However, when using only a single request type, all services seem to handle up to 6000 users without any issues. When reaching higher amounts, all machines start to fall off, with results fluctuating between 8000 and 12000 requests within the 5-minute test plan. All services seem to stop responding completely at around 10000-12000 requests, with error rates varying between 25-75%. For 8000 requests, Cloud Services are ahead, with Virtual Machines coming in second and Mesh in third. Overall, it seems that for smaller request amounts at or below 6000, all services perform fast with only minimal differences between them.

For a normal use case, the conclusion can be made that VMs fall behind both Cloud Services and Mesh, in terms of both performance and scalability. The error rates for VMs escalate far more quickly than for Cloud Services and Service Fabric Mesh, with Cloud Services and Mesh sitting at 20-25% error rates versus 80%+ for VMs. When simulating a single request type in larger quantities to measure performance and scalability limits, no conclusion can be made about which service performs best. All three services start to fail at around the same point, with error rates and response times being about the same or fluctuating heavily. To determine how many requests each service can handle, more extensive and detailed testing is needed to pinpoint exactly where the services differ, which was not possible in this project due to budget constraints.

6.4 Future Work

Future work could look into comparing the computing capabilities of each service, where both CPU and RAM play a bigger role in how well the machine performs. This could give a better picture of exactly how each service differs, both in responding to large amounts of small requests and in handling compute-heavy tasks. Comparing the speed of storing and writing files in each service could also be done, since the services differ considerably in this regard as well. Multiple platforms could also be compared, such as Amazon Web Services (AWS) and Google Cloud.

References

[1] allabolag.se. Triona Bokslut. URL: https://www.allabolag.se/5565594123/triona-ab (visited on 06/16/2019).

[2] Amazon. Types of Cloud Computing. URL: https://aws.amazon.com/types-of-cloud-computing/ (visited on 06/16/2019).

[3] Amazon. What is cloud computing? URL: https://aws.amazon.com/what-is-cloud-computing/ (visited on 06/16/2019).

[4] Atkinson, Roger. International Journal of Project Management. Elsevier, 1999. URL: https://doi.org/10.1016/S0263-7863(98)00069-6.

[5] Azure, Microsoft. Cloud Services. URL: https://azure.microsoft.com/en-gb/services/cloud-services/ (visited on 06/16/2019).

[6] Azure, Microsoft. Cloud Services Documentation. URL: https://docs.microsoft.com/en-gb/azure/cloud-services/cloud-services-choose-me (visited on 06/16/2019).

[7] Azure, Microsoft. Service Fabric Mesh Documentation. URL: https://docs.microsoft.com/en-gb/azure/service-fabric-mesh/service-fabric-mesh-overview/ (visited on 06/16/2019).

[8] Azure, Microsoft. Sizes for Windows virtual machines in Azure. URL: https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes (visited on 06/16/2019).

[9] Azure, Microsoft. Virtual Machines. URL: https://azure.microsoft.com/en-gb/services/virtual-machines/ (visited on 06/16/2019).

[10] Azure, Microsoft. Virtual Machines Documentation. URL: https://docs.microsoft.com/en-gb/azure/virtual-machines/windows/overview/ (visited on 06/16/2019).

[11] Azure, Microsoft. What is Azure? URL: https://azure.microsoft.com/en-gb/overview/what-is-azure/ (visited on 06/16/2019).

[12] Bunge, M. Epistemology & Methodology I: Exploring the World. Springer Science & Business Media, 1983.

[13] Columbus, Louis. "Cloud Computing Market Projected To Reach $411B By 2020". In: Forbes (2017). URL: https://www.forbes.com/sites/louiscolumbus/2017/10/18/cloud-computing-market-projected-to-reach-411b-by-2020/ (visited on 06/16/2019).

[14] Consortium, Agile Business. MoSCoW Prioritisation. URL: https://www.agilebusiness.org/content/moscow-prioritisation (visited on 06/16/2019).

[15] Delgado, Victor. Exploring the limits of Cloud Computing. URL: https://people.kth.se/~maguire/DEGREE-PROJECT-REPORTS/101118-Victor_Delgado-with-cover.pdf (visited on 06/16/2019).

[16] Denscombe, Martyn. The good research guide. Open University Press, 2010. URL: https://www.academia.edu/2240154/The_Good_Research_Guide_5th_edition_.

[17] Dudovskiy, John. Validity. URL: https://research-methodology.net/research-methodology/reliability-validity-and-repeatability/research-validity/ (visited on 06/16/2019).

[18] Golafshani, Nahid. Understanding Reliability and Validity in Qualitative Research. URL: https://nsuworks.nova.edu/tqr/vol8/iss4/6/ (visited on 06/16/2019).

[19] Håkansson, Anne. Portal of Research Methods and Methodologies for Research Projects and Degree Projects. 2013. URL: http://kth.diva-portal.org/smash/get/diva2:677684/FULLTEXT02.pdf (visited on 06/16/2019).

[20] IBM. Cloud computing: A complete guide. URL: https://www.ibm.com/cloud/learn/cloud-computing (visited on 06/16/2019).

[21] Kenton, Will. Scalability. URL: https://www.investopedia.com/terms/s/scalability.asp (visited on 06/16/2019).

[22] Mell, Peter and Grance, Tim. The NIST Definition of Cloud Computing. URL: https://csrc.nist.gov/publications/detail/sp/800-145/final (visited on 06/16/2019).

[23] Microsoft. IIS Web Server. URL: https://www.iis.net/overview (visited on 06/16/2019).

[24] Microsoft. What are public, private, and hybrid clouds? URL: https://azure.microsoft.com/en-us/overview/what-are-private-public-hybrid-clouds/ (visited on 06/16/2019).

[25] Microsoft. What is cloud computing? A beginner's guide. URL: https://azure.microsoft.com/en-us/overview/what-is-cloud-computing/ (visited on 06/16/2019).

[26] Watts, Stephen. SaaS vs PaaS vs IaaS: What's The Difference and How To Choose. URL: https://www.bmc.com/blogs/saas-vs-paas-vs-iaas-whats-the-difference-and-how-to-choose/ (visited on 06/16/2019).

[27] Rapp, Mikael. Scalability Guidelines for Software as a Service. URL: https://people.kth.se/~maguire/.c/DEGREE-PROJECT-REPORTS/100607-Mikael_Rapp-with-cover.pdf (visited on 06/16/2019).

[28] Rouse, Margaret. Performance Definition. URL: https://whatis.techtarget.com/definition/performance (visited on 06/16/2019).

[29] Schoeb, Leah. Cloud Scalability: Scale Up vs Scale Out. URL: https://blog.turbonomic.com/blog/on-technology/cloud-scalability-scale-vs-scale (visited on 06/16/2019).

[30] Study.com. Computer Performance Evaluation. URL: https://study.com/academy/lesson/computer-performance-evaluation-definition-challenges-parameters.html (visited on 06/16/2019).

[31] Techopedia. What is the difference between scale-out versus scale-up? URL: https://www.techopedia.com/7/31151/technology-trends/what-is-the-difference-between-scale-out-versus-scale-up-architecture-applications-etc (visited on 06/16/2019).

[32] Triona. Triona. URL: https://www.triona.se/ (visited on 06/16/2019).

7 Appendices

A API Specification

Figure A.1: API Requirement Specification provided by Triona.

B API Sequence Diagram

TRITA-EECS-EX-2019:221

www.kth.se