
DEGREE PROJECT IN INFORMATION AND COMMUNICATION TECHNOLOGY, SECOND CYCLE, 30 CREDITS STOCKHOLM, SWEDEN 2020

Is QUIC a Better Choice than TCP in the 5G Core Network Service Based Architecture?

PETHRUS GÄRDBORN

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE


Master in Communication Systems
Date: November 22, 2020
Supervisor at KTH: Marco Chiesa
Supervisor at Ericsson: Zaheduzzaman Sarker
Examiner: Peter Sjödin
School of Electrical Engineering and Computer Science
Host company: Ericsson AB
Swedish title: Är QUIC ett bättre val än TCP i 5G Core Network Service Based Architecture?


Abstract

The rapid development of 5G required a new 5G Core Network and has put higher requirements on its protocols. For decades, TCP has been the transport protocol of choice on the Internet. In recent years, major Internet players such as Google, Facebook and CloudFlare have opted to use the new QUIC transport protocol. The design assumptions of the Internet (best-effort delivery) differ from those of the Core Network. The aim of this study is to investigate whether QUIC’s benefits on the Internet will translate to the 5G Core Network Service Based Architecture. A testbed was set up to emulate traffic patterns between two Network Functions. The results show that QUIC reduces average request latency to half of that of TCP in a majority of cases, and doubles the throughput even under optimal network conditions with no packet loss and low (20 ms) RTT. Additionally, by measuring request start and end times “on the wire”, without taking into account QUIC’s shorter connection establishment, we believe the results indicate QUIC’s suitability also under the long-lived (standing) connection model. In conclusion, from a performance perspective, QUIC appears to be a better candidate than TCP in the 5G Core Network Service Based Architecture.

Sammanfattning

Den snabba utvecklingen av det mobila nätverket 5G har medfört högre krav för 5G Core Network och dess protokoll. Under årtionden har TCP dominerat Internet som transportprotokoll. De senaste åren har dock stora aktörer som Google, Facebook och CloudFlare börjat använda det relativt nya transportprotokollet QUIC. Designantagandena för Internet (best-effort delivery) och 5G Core skiljer sig dock åt. Målet med denna studie är att undersöka huruvida QUIC är lika fördelaktigt att använda i 5G Core Network Service Based Architecture som på Internet. En testbädd sattes upp för att emulera trafiken mellan två nätverksfunktioner. Resultaten visar att QUIC halverar latensen jämfört med TCP i en majoritet av fallen, samt dubblerar genomströmningen även under mycket goda nätverksförhållanden där inga paket förloras och tur-och-retur-tiden är låg (20 ms). Genom att mäta ett HTTP-requests start- och sluttider “rakt på ledningen”, utan att inbegripa QUICs kortare förbindelseetableringstid, menar vi att resultaten indikerar QUICs lämplighet också vid användande av långlivade kommunikationsförbindelser. Med avseende på prestanda framstår QUIC som ett bättre alternativ än TCP också i 5G Core Network Service Based Architecture.

Acknowledgements

I would like to express my gratitude to several experts at Ericsson for the opportunity to work on this interesting topic and for the enriching discussions that emerged. I would like to thank my industrial supervisor, Zaheduzzaman Sarker, for his knowledgeable guidance, help, and feedback during this project. I would like to thank Patrick Sellstedt for providing me the platform at Ericsson for this work. I would also like to thank Magnus Westerlund and Mirja Kühlewind for additional very valuable feedback and insights. I am also grateful to my KTH supervisor, Marco Chiesa, for his valuable feedback and insights, and to my examiner, Peter Sjödin, for reviewing and approving the initial thesis proposal and providing helpful academic feedback on the report. I would also like to thank my parents for always being supportive, my father for the encouragement and support during all my studies, and my mother for believing in a positive outcome of this work. Finally, I want to thank my wife for enabling me to spend time, sometimes on nights and weekends, to track down bugs and finalize the report. Without your support, I would not have been able to have both a Master’s Thesis and a beautiful family.

Contents

1 Introduction 1
   1.1 Overview ...... 1
   1.2 QUIC vs. TCP ...... 2
   1.3 Problem Area ...... 3
   1.4 Goals ...... 4
   1.5 Methodology ...... 4
   1.6 Delimitations ...... 5
   1.7 Ethics and Sustainability ...... 5
   1.8 Outline ...... 5

2 Background 7
   2.1 Mobile Systems Architecture ...... 7
       2.1.1 Common Logical Building Blocks ...... 7
       2.1.2 5G Core Service Based Architecture ...... 8
   2.2 Protocols ...... 9
       2.2.1 Introduction ...... 9
       2.2.2 TCP/TLS ...... 11
       2.2.3 HTTP/1.1 and HTTP/2 ...... 12
       2.2.4 QUIC and HTTP/3 ...... 13
   2.3 QUIC on the Web vs. the 5G Core ...... 15
   2.4 Related Work ...... 16
       2.4.1 QUIC in the 5G Core Service Based Architecture ...... 16
       2.4.2 QUIC on the Web ...... 17

3 Method 19
   3.1 Testbed Requirements ...... 19
       3.1.1 Overview ...... 19
       3.1.2 Requirements to Ensure Comparability ...... 20
   3.2 Testbed Components ...... 24


       3.2.1 Platform & Network Tools ...... 24
       3.2.2 QUIC/TCP Client & Server Implementations ...... 25
   3.3 Benchmark Parameter Settings ...... 26
       3.3.1 Main Parameters ...... 26
       3.3.2 Sub-parameters ...... 27
   3.4 Key Performance Indicators ...... 27
   3.5 Test Design ...... 28

4 Results 30
   4.1 Sub-parameters ...... 30
       4.1.1 Number of Multiplexed Requests ...... 31
       4.1.2 Request Payload ...... 35
       4.1.3 Response Payload ...... 39
   4.2 Main Parameters ...... 46
       4.2.1 Round Trip Time ...... 46
       4.2.2 Packet Loss Rate ...... 47

5 Discussion 50
   5.1 Faster Connection Establishment – Not the Only Factor for Improving Latency ...... 50
   5.2 QUIC has a Significant Advantage Also in QoS Networks ...... 52
   5.3 Performance Difference is Negligible With Few Multiplexed GET Requests ...... 52
   5.4 Impacts of the Method used to Measure Request Duration ...... 53
   5.5 Limitations & Future Work ...... 53
   5.6 Conclusions ...... 54

Bibliography 56

Chapter 1

Introduction

A short background overview of the research is presented, followed by a brief comparison of QUIC and TCP. We then lay out the problem area, research question and hypotheses, followed by the goals, methodology and a brief discussion of ethical concerns and impacts on sustainability. Chapter 2 contains a more thorough background presentation.

1.1 Overview

Mobile systems evolve continuously in order to meet new and increasing demands such as massive IoT deployment, time-sensitive remote control of robots over a network and increased throughput. Between 2010 and 2020, demands were expected to grow to 1000 times more devices, 100 times higher data rates and 10 times lower latency. The fifth generation (5G) mobile telecommunication system has been designed to meet these demands. [1]

3G, 4G and 5G architectures all have in common that they consist of two networks: the Radio Access Network and the Core Network. The role of the Radio Access Network is to provide a connection between the User Equipment – a cell phone or other device – and the Core Network. The Core Network is responsible for handling everything related to connection management and the forwarding of data to and from the Internet. It is logically divided into two parts, the Control Plane and the User Plane. The User Plane only handles the forwarding of data and the Control Plane handles all other functionality such as authentication, authorization, policy control and so forth. [2]

In order to modularize network components and make them independently scalable and evolvable, the new Service Based Architecture (SBA) was introduced with 5G for the Control Plane of the Core Network [3]. SBA is a microservices-type architecture where the goals are to provide easier scaling and more rapid development [4, p. 12]. According to the "System architecture for the 5G System" specification [3], Network Functions (NFs) communicate over Representational State Transfer (REST) interfaces via request-response and subscribe-notify communication patterns. The authors argue that by isolating NFs and using generic protocols such as HTTP for communication, scaling and evolvability of individual components become easier; thereby, SBA also paves the way for eventual transitioning into a cloud-based environment where NFs run in containers in data centers.

The choice of protocols impacts all types of communication. With the introduction of REST interfaces in SBA, the protocol stack of the higher layers – layer 3 to layer 5 – becomes very similar to the protocol stack used on the Web [4, p. 22]. Therefore, any new protocol development in the Web stack could also be of use in the SBA protocol stack. In this thesis, we are interested in evaluating transport protocol performance within the context of the 5G Core Network Control Plane SBA.
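The request-response pattern between Network Functions can be illustrated with a small sketch: one thread acts as a "server" NF exposing a toy status service over HTTP, and the main thread acts as a "client" NF requesting it. The service path and JSON body are invented for illustration and are not part of any 3GPP-defined API.

```python
# Toy illustration of NF-to-NF request-response over HTTP (invented service,
# not a 3GPP-defined API).
import http.server
import json
import threading
import urllib.request

class ToyNF(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Stateless response: everything needed is carried in the request.
        body = json.dumps({"service": "toy-status", "status": "ok"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

def serve_once() -> dict:
    """Start a toy server NF on an ephemeral port, issue one GET, return JSON."""
    server = http.server.HTTPServer(("127.0.0.1", 0), ToyNF)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/toy-status") as resp:
        data = json.load(resp)
    server.shutdown()
    return data
```

Because each request carries all the information the server needs, the exchange is stateless in the REST sense described above.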

1.2 QUIC vs. TCP

The Transmission Control Protocol (TCP) has been the transport protocol of choice on the Internet for decades and provides reliability as well as congestion control [5]. In conjunction with the Transport Layer Security (TLS) protocol, authentication and encryption are also ensured [6]. However, in 2012 Google laid out the guidelines for a new transport protocol called QUIC [7]. The protocol was submitted to the Internet Engineering Task Force (IETF), which has since continued the development [8] and made substantial changes to the protocol. It is the IETF draft of QUIC [9] that this thesis is concerned with.

To understand why companies such as Google, CloudFlare and others opt for QUIC whenever possible [10, 11], we need to understand the design goals of QUIC and in what regards QUIC performs better than TCP. One of the main goals was to reduce latency in order to make web applications more responsive [7]. QUIC achieves this goal by reducing the connection establishment time by one Round Trip Time (RTT) when compared with TCP over TLS 1.3 and two RTTs when compared with TCP over TLS 1.2 [12].

Another goal was to eliminate the Head-of-Line blocking problem plaguing TCP [7]. The problem arises since TCP is not able to differentiate multiple HTTP/2 streams sent over the same TCP connection [13]. A packet loss in one stream therefore brings all other streams to a halt until the lost packet has been recovered. With HTTP/3 over QUIC, however, streams are differentiated also at the transport layer and the Head-of-Line blocking problem is eliminated [13].

1.3 Problem Area

Although QUIC brings certain improvements to the Web, these benefits do not necessarily translate to the Core Network. The design assumptions of the Internet are different from those of the Core Network. The Internet only provides best-effort delivery guarantees, which means that packets can be lost or reordered in the network [14]. In the Core Network, however, it is possible to have much more control over the network itself since it is often proprietary. Guarantees with regards to throughput, latency, Packet Loss Rate (PLR) and Packet Error Rate (PER) can thus be ensured. Connection characteristics also need to be taken into consideration. After all, the transport protocol is only implemented in the end hosts (the NFs in our case) [15]; therefore, even if identical network characteristics are used across multiple tests, they may still yield different results depending on differences in the transport protocol design. What we investigate in this thesis is:

Does QUIC perform significantly better than TCP in the 5G Core Service Based Architecture?

In order to provide a framework for evaluating the research question, we first present three hypotheses followed by a motivation for each:

Hypothesis 1 QUIC allows the 5G Core Network to tolerate a significantly higher Packet Loss Rate than TCP.

Hypothesis 2 QUIC allows the 5G Core Network to tolerate a significantly higher degree of packet reordering than TCP.

Hypothesis 3 QUIC enables significantly better performance than TCP in the 5G Core Network when a majority of connections are short-lived.

The motivation for Hypotheses 1 and 2 is that some of QUIC’s greatest benefits are likely to come into play when packets are lost and reordered, since that is where QUIC’s HOL blocking elimination should enable higher throughput and lower latency than TCP [16, 17]. The motivation behind Hypothesis 3 is that QUIC has a faster connection establishment time than TCP and should therefore benefit from an environment where many connections are short-lived [18].

It is also important to note that each hypothesis includes a parameter that can be varied across a spectrum: the Packet Loss Rate may be adjusted for Hypothesis 1, the amount of packet reordering for Hypothesis 2, and the fraction of long-lived connections as well as their duration for Hypothesis 3. The idea is to identify the optimal performance region of QUIC under these premises and then see if it suits the specific use cases of SBA.

1.4 Goals

The first goal is to produce a test environment in which the performance of QUIC and TCP can be tested. This environment can then be used in future research, building on the work of this thesis. The second goal is to design tests in such a way that they will give an indication as to whether QUIC would increase overall performance in the 5G Core Network SBA. The third goal is to present data indicating whether QUIC performs significantly better than TCP in the 5G Core Network SBA with respect to latency and throughput.

1.5 Methodology

Chapter 3 gives a thorough description of the methodology used. In summary, the method of the thesis follows the steps listed below:

1. Perform a literature study.

2. Set up a test environment where it is possible to simulate different network characteristics such as throughput, packet loss rate and latency, as well as including a number of end hosts (NFs).

3. Choose QUIC and TCP implementations and provide a framework in which it is possible to switch between QUIC and TCP in the same test scenarios.

4. Ensure that it is possible to log relevant data.

5. Test each hypothesis over a spectrum of measurements.

6. Find out under what circumstances QUIC/TCP performs the best.
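As a concrete sketch of step 2, network characteristics such as latency and packet loss can be emulated with the Linux tc/netem tool. The helper below only builds the command; the device name "eth0" and the parameter values are illustrative, and actually applying the queueing discipline requires root privileges.

```python
# Sketch: build a tc/netem command that adds delay and random packet loss
# on a network device (device name and values are illustrative).
import subprocess

def netem_command(dev: str, delay_ms: int, loss_pct: float) -> list[str]:
    """Build a tc/netem command adding delay and random loss on dev."""
    cmd = ["tc", "qdisc", "add", "dev", dev, "root", "netem",
           "delay", f"{delay_ms}ms"]
    if loss_pct > 0:
        cmd += ["loss", f"{loss_pct}%"]
    return cmd

# Example: 10 ms one-way delay (20 ms RTT when applied on both hosts)
# and a 1% packet loss rate; requires root to actually run:
# subprocess.run(netem_command("eth0", 10, 1.0), check=True)
```

Applying the delay on both communicating hosts yields the round trip time used in the tests, and the loss parameter models the Packet Loss Rate of Hypothesis 1.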

1.6 Delimitations

The 5G Core Network SBA is a vast ecosystem and it is virtually impossible to perform a comprehensive comparison of QUIC and TCP in all aspects of the network. We therefore delimit this study to emulating communication between two virtual Network Functions within a controlled virtual environment using either QUIC or TCP. The focus is on comparing the performance of each protocol, with respect to latency and throughput, under varying network conditions, using message size ranges common to the 5G Core Network SBA. Although simplified, we consider our testbed to be a valid model of communication between two Network Functions.

1.7 Ethics and Sustainability

From an individual viewpoint, we do not see any ethical issues with this project since no sensitive data is collected on individuals. Neither do we see any ethical issues when it comes to potentially revealing trade secrets, since all documents and source code used in this project have been publicly available. The aim of the study is to find out whether QUIC is a better protocol than TCP in the 5G Core Network SBA. If the study proves helpful in selecting the most efficient protocol, it could lead to more efficiently designed mobile networks, consequently saving energy resources in 5G networks all over the world, which would be beneficial in creating a more sustainable technological infrastructure. Mobile communication infrastructure is an essential part of societies all over the world. A more efficient architecture could help bring down overall costs and raise the quality of mobile communication also in lesser developed parts of the world.

1.8 Outline

In Chapter 2, we provide the necessary background needed to understand mobile systems architecture in general and the 5G Core Network SBA in particular. We continue with a thorough presentation of QUIC, TCP and related studies. In Chapter 3, we describe the testbed in detail and motivate the setup and test design. We then present the results of our study in Chapter 4. Finally, in Chapter 5, we point out the most interesting findings from the results with a view towards our initial hypotheses, research question and related work.

We discuss the limitations of the work, give suggestions for future work and present our final conclusions.

Chapter 2

Background

First, we present an overview of mobile systems architecture with an emphasis on the 5G Core Network and more specifically the Service Based Architecture. Then follows a description of the protocols of interest for this project, mainly QUIC and TCP, and a discussion of related research. Finally, a brief overview of the tools used for the experiment is provided.

2.1 Mobile Systems Architecture

To begin with, the common traits of mobile systems architecture across different generations are accounted for and explained, followed by a more detailed look at 5G, and more specifically the 5G Core Network Service Based Architecture.

2.1.1 Common Logical Building Blocks

Even though each generation of mobile networks was made possible due to new technology and architectural designs, certain logical building blocks remain the same throughout generations 2, 3, 4 and 5 [19]. These building blocks are displayed in Figure 2.1.

Figure 2.1: Common Traits of Mobile Architecture

The User Equipment is any device that uses the mobile network. It connects to the mobile network via the Radio Access Network. The responsibility of the Radio Access Network is to establish a connection between the User Equipment and the Core Network [2]. The Core Network handles connection management – authorization, authentication and so on – and the forwarding of data to an external network [2]. The Core Network is logically divided into the Control Plane, which is responsible for connection management, and the User Plane, which handles data forwarding [2]. The User Equipment, Radio Access Network and Core Network have persisted from the first generations of mobile systems up until 5G [19] but the internals of each of these building blocks have changed radically.

Figure 2.2: 5G System Architecture

2.1.2 5G Core Service Based Architecture

The general design principles that have guided the formation of 5G are recorded in the 3GPP standards [3]. They include separation between Control Plane and User Plane. The motivation for logically separating the Control Plane and User Plane is to enable independent scaling and evolution. Another motivation is to enable a more flexible deployment, centralized or distributed. Higher-level Network Functions could thus be moved to the cloud and serve many base stations [1]. Another design principle of the 3GPP was to take a modularized approach to Network Function design. Each Network Function provides a specific service in the Control Plane, such as authentication, authorization and so on [3]. The Service Based Architecture (SBA) is the result of modularizing Network Functions and is, in essence, a microservices based design approach

[20]. Many of the Network Functions can be seen in the top of Figure 2.2. Instead of having Network Functions communicating over point-to-point interfaces, in SBA services are exposed – offered – to other Network Functions and they can request the services as needed [4, p. 19]. Each Network Function is logically connected to a common networking infrastructure and the principles of HTTP REST are used for communication [4, pp. 22-23]. Important principles in REST, as described by 3GPP [21], are that in a communication scenario with two entities, one acts as a client and another as a server. The client requests services from the server. Every request needs to be stateless; in other words, the server should not have to remember what the client did in a previous request to be able to process the next. Even though the use of HTTP is a cornerstone also of Web traffic, in contrast to the Web, the Core Network is not deployed on the Internet but resides between the Access Network and an External Network as seen in Figure 2.1. Since most studies of QUIC have been done on the Web, it is of interest to note some important differences between the Web and Core Network since the underlying network will affect the choice of transport protocol. We will discuss these differences in-depth in Section 2.4.2 after we have presented the protocols.

2.2 Protocols

Protocols form the basis for any type of communication. They describe the format and order of messages communicated between two or more entities and what actions should be performed when sending and receiving messages [22, p. 37]. After a brief introduction on the role of different protocol layers, we will focus on describing the transport layer protocols, TCP and QUIC in particular, which are the focus of this work.

2.2.1 Introduction

The 5G Core Control Plane uses the HTTP protocol stack [4, p. 347] where each protocol corresponds to a certain layer. Each layer is shown in Figure 2.3 and their respective roles are described below according to RFC 1122 [15].

Figure 2.3: Protocol Layers of 5GC Control Plane

End hosts implement the full protocol stack, as seen in Figure 2.3 – application, transport, network, link and physical layers – whereas routers in the network only implement the network, link and physical layers. The application layer protocol is what the application uses to communicate with an application at the other end. The application hands over its messages to a transport layer protocol. The transport layer provides certain services to both ends, which will be discussed in detail below. The transport protocol hands over the message to the network layer protocol, which encapsulates the data, creates a packet and sends it out in the network. The network layer is used by routers in the network to forward the packet to its destination. The link layer is responsible for handling communication between two physically adjacent nodes. [15]

Broadly speaking, there are two main types of transport protocols: connectionless and connection-oriented. Connectionless transport protocols encapsulate application messages into datagrams and send them off without knowing anything about the receiving end or the network. Many factors could prevent a packet from reaching its destination. The receiving end might be down or overloaded and not able to receive the packet at the moment, the network might lose the packet and so on. The benefit of a connectionless service is that it is simple and straightforward. The other main type is connection-oriented transport protocols, which solve many of the problems of connectionless protocols. Connection-oriented protocols establish a connection with the receiving end and send packets over that connection. Different parameters of the connection itself can be adjusted to provide reliable delivery and other services to the end hosts. [15]

Another important aspect that could belong to the transport protocol layer itself or be put in between the transport and application layers is security [9][23]. Nowadays, we expect a secure channel of communication that provides confidentiality, authentication and integrity [23]. Confidentiality refers to data only being visible to the endpoints, hindering any eavesdropping between two communicating endpoints. Authentication is the ability to verify the sender of a message. Integrity is the ability to ensure that a message has not been tampered with.

Figure 2.4: RTTs before transmitting first data

2.2.2 TCP/TLS

The two most commonly used transport protocols are the User Datagram Protocol (UDP) [24] and the Transmission Control Protocol (TCP) [25]. UDP offers a connectionless service, as described in Section 2.2.1, which does not guarantee that packets will arrive at their destination. TCP, on the other hand, has several built-in features to enable applications to transparently send and receive continuous reliable byte-streams over an unreliable network, as described in the original RFC [25]. Reliability is ensured by having the receiver acknowledge each packet it receives so that the sender knows it has reached its destination; otherwise the sender will resend the packet. Flow control is the ability to ensure that neither endpoint will send more data than the receiver can handle. Congestion control refers to the ability of ensuring that network routers will not be overloaded by a sender and cause packets to be dropped. Simplified, TCP congestion control infers network congestion when ACKs are not received for packets sent [22, p. 298]. It is then taken as a sign that buffers are full and the network is therefore starting to drop packets. Several congestion control algorithms can be used that are independent of the TCP implementation itself. Common congestion control algorithms used today include reno, cubic and bbr [26].

The connection establishment is the first thing that happens in a TCP connection and is referred to as a three-way handshake. The first step, as shown in the leftmost illustration of Figure 2.4, is when the client sends a SYN message to the server. If the server is ready to receive the connection, it will respond with a SYN,ACK, acknowledging the received SYN and indicating readiness to open a new connection. The third step is for the client to send an ACK acknowledging the SYN received from the server. Each connection is identified by the source and destination IP and port numbers. [25]

Important to note for our purposes is that the TCP handshake induces one full Round-Trip-Time (RTT) before the client application can start to transmit data. However, TCP does not provide confidentiality, authentication or integrity. For these purposes, TLS (Transport Layer Security) is widely used as an additional layer in between the transport and application layers [12]. After TCP’s three-way handshake has taken place, TLS then needs to perform its crypto handshake to establish the shared key and other parameters [23]. TLS 1.2, which is still widely used, induces two additional RTTs, as seen in the middle illustration of Figure 2.4, whereas TLS 1.3 reduces it to one additional RTT before the server can start transmitting data [12].
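These handshake costs can also be observed empirically. The sketch below times connection establishment from the client side; host and port are placeholders, and the TLS variant assumes the server presents a certificate trusted by the default context. On an otherwise idle path, the first measurement approximates one RTT and the second adds the TLS handshake on top.

```python
# Sketch: time client-side connection establishment (host/port are placeholders).
import socket
import ssl
import time

def tcp_connect_time(host: str, port: int) -> float:
    """Seconds spent on the TCP three-way handshake, seen from the client."""
    start = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=5)
    elapsed = time.perf_counter() - start
    sock.close()
    return elapsed

def tcp_tls_connect_time(host: str, port: int) -> float:
    """Seconds for the TCP handshake plus the TLS handshake on top of it."""
    ctx = ssl.create_default_context()
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=5) as sock:
        # wrap_socket performs the TLS handshake before returning
        with ctx.wrap_socket(sock, server_hostname=host):
            elapsed = time.perf_counter() - start
    return elapsed
```

Comparing the two values for the same server gives a rough picture of how many extra RTTs the TLS version in use costs.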

2.2.3 HTTP/1.1 and HTTP/2

After a brief overview of the development of HTTP, we will have laid the groundwork for understanding the interplay problems between the application and transport layers, which resulted in the inception of QUIC.

The authors of RFC 7231 define an HTTP message to be either a response or a request [27]. They further explain that the entity making a request is called a client. Every request has a target identified by a Uniform Resource Identifier (URI). The entity receiving the request is called a server and should respond to the requests of the client. Various HTTP methods can be used by the client to indicate the purpose of the request to the server. The methods of interest in this study are GET – which requests the current representation of a resource found at the server – and POST, where the client requests processing of data carried in its payload [27].

Saxcé et al. describe one of the major drawbacks of HTTP/1.1 to be that it is not possible to send more than one request at a time over a single TCP connection [28]. In order to send multiple requests concurrently, additional TCP connections need to be opened. Since TCP connections are identified through their port numbers and IP addresses, as mentioned in Section 2.2.2, multiple ports would have to be used for a single client-server pair in order to achieve parallelization of requests. But as Roskind et al. noted: “pairs of IP addresses and sockets are finite resources” [7]; in other words, it is possible to run out of port numbers if used excessively. Google therefore initiated the development of SPDY, which later resulted in HTTP/2 and enabled multiplexing of several requests over a single connection [28]. Each HTTP request/response exchange is multiplexed onto its own stream [29].

Table 2.1: Connection Establishment Latency Comparison

                  QUIC     TCP/TLS1.3   TCP/TLS1.2
  Unknown server  1 RTT    2 RTTs       3 RTTs
  Known server    0 RTTs   1 RTT        2 RTTs

HTTP/2 provided additional improvements as well due to binary message framing – instead of text-based as in HTTP/1.1 – and a compression scheme for the HTTP headers called HPACK [29]. Currently, HTTP/2 over TCP is used by 3GPP in the 5G Core Control Plane [4, p. 348]. However, despite the improvements of HTTP/2, it still had some weaknesses, and that is why the development of QUIC and the corresponding HTTP/3 started.

2.2.4 QUIC and HTTP/3

In 2012, Jim Roskind and his team at Google laid down the design goals for QUIC [7]. One of the main goals, according to Roskind, was to increase the responsiveness of Web applications by reducing connection establishment latency. Another important goal was to eliminate the Head-of-Line (HOL) blocking problem that had been plaguing HTTP/2 over TCP. The strategies used in QUIC to resolve these issues are described in more detail below. Google submitted QUIC to the IETF for consideration in 2016 [8] and several changes were made to the original protocol, such as incorporating the TLS 1.3 handshake [30]. When referring to QUIC in this thesis henceforth, it is the IETF QUIC version that is meant.

Figure 2.5: The Head-of-Line Blocking Problem in HTTP2/TCP. A packet loss in one stream results in all streams being put to a halt until the packet has been recovered.

From a latency point of view, a major problem with connection-oriented protocols such as TCP [25] and SCTP [31] is the RTTs induced by the initial handshakes in order to establish a secure connection, as discussed in Section 2.2.2. In Figure 2.4, we saw that TCP over TLS gives a delay of at least 2 RTTs before any data can start to be transmitted. This is because TCP and TLS require separate handshakes. The main way in which QUIC reduces the connection establishment latency is by combining the cryptographic handshake and the transport handshake [32]. In that way, QUIC never needs more than 1 RTT for setting up a secure connection. A comparison of connection establishment latency between QUIC and TCP/TLS is presented in Table 2.1. If the server is previously known, then connection establishment latency can even be reduced to zero by sending a cryptographic cookie that identifies the client to the server [12].

The Head-of-Line blocking problem in HTTP/2 stems from the way that HTTP/2 interacts with TCP; HTTP/2 multiplexes requests by sending them on several individual streams – one stream per request-response interaction [29]. However, these streams are only visible at the application layer [32]. Once the application messages are handed over to TCP, they are treated as a single stream by TCP [32]. The advantage of HTTP/2 over TCP is that several requests can be sent in parallel over a single TCP connection. The disadvantage, however, is that if a packet loss occurs in one of the HTTP/2 streams, all of the other streams over that connection are put to a halt until TCP has recovered the lost packet [32], as seen in Figure 2.5.

Figure 2.6: Elimination of the Head-of-Line Blocking Problem. With QUIC, only the stream that is experiencing the packet loss is put to a halt; packets in other streams will keep flowing.

In QUIC, the Head-of-Line blocking problem is eliminated by introducing streams also at the transport layer [30]. If a packet is lost in one of the streams, only that stream will be put to a halt until the packet is recovered, while packets will keep flowing on the other streams [32], as seen in Figure 2.6. This is one reason that QUIC is more robust to packet loss than TCP, according to some studies [16, 17, 33].

Packet loss detection in QUIC is simplified since its packet number space is solely used to indicate order of transmission [34]. This is in contrast to TCP's sequence number, which is also used by the receiver to indicate the next expected segment in its ACK [25]. QUIC also enables more precise calculation of the RTT by having the endpoints provide the duration between receiving a packet and responding with an ACK [30]. The RTT measurement is more accurate since it is now possible to deduct the time spent processing the packet.

The reason that QUIC necessitated the new HTTP/3 protocol, and could not simply use HTTP/2, is largely that HTTP/2's compression scheme HPACK requires total ordering of frames across streams while QUIC only requires ordering within individual streams [10]. Therefore, a new HTTP compression scheme called QPACK needed to be created [32]. Many other framing concepts from HTTP/2 are dropped in HTTP/3 since they are now taken care of at the transport layer instead [32].

Improving on existing TCP kernel implementations is described by Roskind as feasible, albeit cumbersome, since complete operating system kernels need to be updated in order for the changes to take effect [7]. Therefore, he and his team decided to build a connection-oriented protocol on top of the ubiquitously available UDP protocol. Thus, QUIC packets are encapsulated within UDP datagrams [30]. Moreover, all functionality provided by QUIC such as reliability, security, flow and congestion control etc. is implemented in user space [12]. The advantage is that there is no need to update the operating system in order to experiment with the QUIC protocol. The downside is that an implementation in user space could be less efficient than one in the kernel [35].

Other features of QUIC not directly applicable to our study include a new way of identifying a connection. Instead of identifying it based on the IP addresses and ports of the communicating parties – as in TCP – QUIC gives each connection a unique ID, which makes it easier to retain the connection when changing to new ports or IPs [30].
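The ACK-delay correction described above can be sketched in a few lines. This is a simplified illustration of the idea, not the full loss-recovery algorithm (which, among other things, also clamps the reported delay):

```python
def adjusted_rtt_sample(send_time: float, ack_time: float, ack_delay: float) -> float:
    """One RTT sample, corrected for the peer's reported processing delay.

    QUIC acknowledgements carry the time the peer spent holding the packet
    before acknowledging it, so that time can be deducted from the raw
    measurement to approximate the pure network round trip.
    All times are in seconds.
    """
    latest_rtt = ack_time - send_time
    # Never return a negative sample if the reported delay is implausibly large.
    return max(latest_rtt - ack_delay, 0.0)

# A packet sent at t=0.000 s and acknowledged at t=0.025 s, where the peer
# reports 5 ms of ack delay, yields a 20 ms network RTT sample:
assert abs(adjusted_rtt_sample(0.000, 0.025, 0.005) - 0.020) < 1e-9
```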

2.3 QUIC on the Web vs. the 5G Core

There are two major differences between the Web and the 5G Core Network that are of interest when comparing transport protocols:

1. The guarantees provided by the network layer.

2. The type of traffic sent over the network.

The reason that there even is a need for a transport protocol has to do with the characteristics of the underlying network. In the case of the Web, the Internet and the Internet Protocol (IP) are used for its communications. IP is a best-effort

protocol; as the name implies, it will do its best to deliver data packets sent over its network. However, it does not give any guarantees. More specifically, it does not guarantee that a packet will actually be delivered, that a sequence of packets will be delivered in order, or that the content of a packet remains unchanged (integrity) [22, p. 220]. The Core Network, however, does offer Quality of Service guarantees on packet latency, throughput and error rate [4, ch. 9]. As we shall see in Section 2.4, most studies of QUIC have been done with the Web in mind. A different type of underlying network, such as the 5G Core Network, will also affect the performance of transport protocols.

The second difference has to do with the type of traffic sent over the network. When downloading a Web site, multiple objects of various sizes need to be retrieved, such as images with sizes of several KBs. Most of the request/response traffic in the 5G Core Network, however, consists of smaller payloads between 20–300 bytes, as can be seen in the OpenAPI Specification files for the 3GPP 5G Core Network [36]. These numbers have also been confirmed verbally by people working in the industry; it has not been possible to retrieve other written sources on these numbers since many of those are proprietary. However, we believe the estimates used are certainly reasonable. In summary, the 5G Core Network is a more reliable network than the Internet and has a higher frequency of requests and responses with a low payload.

2.4 Related Work

To the best of our knowledge, there are no publicly available studies on QUIC within the 5G Core Network SBA that also present measurement data. The primary source of information regarding QUIC's feasibility in the 5G Core Network SBA is the technical report TR 29.893 from the 3GPP, which we discuss below. However, this report mostly relies on comparing TCP and QUIC features from a theoretical point of view. Our work aims to provide real measurement results applicable to the 5G Core Network SBA. Although there is a lack of studies done on QUIC within the 5G Core SBA, there are several studies on QUIC within the context of the Web; we also discuss those studies and their findings as they relate to the 5G Core SBA context.

2.4.1 QUIC in the 5G Core Service Based Architecture

Although HTTP/2 over TCP is the current 3GPP standard in the 5G Core Network Control Plane [4, p. 348], in June 2019 the technical report TR 29.893

[37] was published by 3GPP on the potential of using QUIC in the 5G Core Network SBA. Five requirements for a transport protocol were listed on page 19. They were:

• Reliable message delivery.

• Flow control and congestion control mechanisms.

• Support of connection semantics (One HTTP connection maps to one TCP or QUIC connection).

• Failure to deliver one message shall not block subsequent messages (Head-of-Line blocking mitigation).

• The transport protocol supports mechanisms to authenticate the peer endpoint and secure the transfer of application messages.

All of the requirements are said to be fulfilled by QUIC. Yet, the authors concluded that QUIC is not yet ready to be used as a basis for 5G Core Network Control Plane signaling, due to the immaturity of the implementations at the time and the difficulty of using QUIC over a proxy. However, the IETF is currently working on a draft describing discovery mechanisms for non-transparent proxies, which would be part of providing a solution for QUIC over proxies [38]. Moreover, one year is a long time for implementations to mature, and a lot of work is being done by major Internet players such as Facebook and Cloudflare on the development of QUIC implementations [39, 40]. Still, these implementations require a lot of testing. Finally, the 3GPP concludes by encouraging further studies on QUIC's feasibility specifically in the 5G Core Network SBA [37]; the aim of this study is to make a contribution within this area.

2.4.2 QUIC on the Web

Most studies of QUIC are concerned with the Web. There seems to be a fairly general consensus that the setting where QUIC provides significant advantages, compared with TCP, is when network conditions are poor, such as high RTT, high packet loss and low bandwidth. The authors of Does QUIC make the Web faster? noted that 90% of web pages loaded faster with QUIC over 2G, compared to 60% when using 4G LTE [16]. The authors of QUIC: Better for what and for whom? [17] similarly conclude that “QUIC outperforms HTTP/2 over TCP/TLS in unstable networks”. The initial tests made by Google [33] also noted that QUIC outperformed TCP under “poor network conditions” and that

over the slowest connection, a Google Search page loaded a full second faster with QUIC. A study made by Megyesi et al. [18] comparing Google QUIC with SPDY (the predecessor of HTTP/2) found that the “network conditions determine which protocol performs the best”. They also noted that the condition where QUIC would shine the most is high RTTs. One weakness of Google QUIC was its performance on high-speed links. They attribute this weakness to its packet pacing mechanism not being able to reach the full capacity of a high-speed link.

At first glance, it seems like Kakhki et al. in their paper Taking a Long Look at QUIC [35] contradict the previously mentioned studies by stating that “QUIC’s performance diminishes on mobile devices and over cellular networks”. However, on closer inspection, we find that it is not necessarily the features of the protocol itself that cause the worse performance over cellular networks; rather, QUIC’s reliance on application-level processing and encryption makes it slower on mobile phones (from 2013 and 2014), which are not as powerful as their desktop counterparts.

In summary, the consensus seems to be that QUIC has its most significant advantage on lossy networks with high RTTs. High-bandwidth links could be a problem if the processing power of the device is poor.

Chapter 3

Method

In the following chapter, we describe and motivate the procedure undertaken to identify whether QUIC performs significantly better than TCP in the 5G Core Network Control Plane SBA. First, a brief overview of the testbed architecture is presented followed by the requirements we have on the testbed. We then present the measurement/network tools and QUIC/TCP implementations that were used to adhere to our requirements. Finally, we describe and motivate the benchmark parameter settings, choice of key performance indicators and test design.

3.1 Testbed Requirements

In order to compare QUIC and TCP, we first need an environment that can run identical benchmarks using one protocol or the other. It is important to configure the QUIC and TCP setups in a way that will reduce the impact of other sources affecting the results. We begin with an overview of the testbed setup followed by a description of each requirement that we have on the testbed to ensure a fair comparison of the two protocols.

3.1.1 Overview

An overview of the testbed is shown in Figure 3.1. In the 5G Core Network SBA, Network Functions (NFs) make requests to other Network Functions over HTTP. As described in Section 2.2, HTTP runs over a transport layer protocol. We needed one HTTP client/server pair (represented by NF A and NF B) running over TCP and another running over QUIC. Requests are sent from the client (NF A) and responses received from the server (NF B).

Figure 3.1: Testbed Architecture

In the case of TCP, we decided to use HTTP/2 since it is the current standard in the 5G Core Network SBA [21]. The TCP setup also needs an additional protocol layer to provide encryption. TLS v1.3 was chosen for this purpose. The reasoning behind selecting this specific version is explained in Section 3.1.2. HTTP/3 is used over QUIC since it is the only HTTP version that works over QUIC. The client, bridge and server each reside in their own virtual network as described in Section 3.2.1. Finally, network emulation was needed in order to emulate real network characteristics such as link delay, packet loss, packet reordering and so forth. Emulation of packet loss happens on both NF interfaces (client and server) and emulation of RTT happens on the bridge interfaces.

3.1.2 Requirements to Ensure Comparability

The goal of this study is to determine whether QUIC gives any significant improvements over TCP in the 5G Core Network Control Plane SBA. An overarching guideline is to compare default QUIC and TCP implementations that would be widely available in various microservices rather than highly customized variants.

Also, when comparing results from two different setups, there is always the risk that characteristics other than the ones under comparison affect the resulting data and create unwanted noise. In our case, we wanted to compare the difference between the TCP and QUIC protocols and not, for instance, congestion algorithms. We have therefore strived towards, as far as possible, eliminating other sources of noise in the measurement data by carefully considering the different configuration options in the TCP and QUIC setups respectively. Each configuration option was set with the goal of providing a comparison that would be as fair and valid as possible.

Option                     QUIC                    TCP
QUIC Version               IETF Draft 27           N/A
TCP Version                N/A                     Linux Kernel 5.4.0
TLS Version                1.3                     1.3
Hardware Acceleration      N/A                     TSO, GSO, LRO, GRO, TLS disabled
Congestion Algorithm       Cubic                   Cubic
TCPFastOpen                N/A                     NO
0–RTT                      NO                      N/A
Connection Type            Simplified Long-lived   Simplified Long-lived
Multiplexing of Requests   YES                     YES

Table 3.1: QUIC and TCP Configuration Option Decisions

In this section, we present the configuration options that we deemed especially important to consider, without pointing out specific implementations; the tools and QUIC/TCP and HTTP/2/HTTP/3 implementations used are presented in Section 3.2. The configuration options considered and the choices made for each are summarized in Table 3.1. Some configuration choices were not necessarily more right than others. Nevertheless, it was still important to make a conscious decision for each option in order to interpret the results accurately. We will now proceed to explain the reasoning behind the inclusion and setting of each option.

QUIC Version

As stated in Section 2.2, Google developed the initial QUIC protocol, now referred to as gQUIC, but submitted it to the IETF in 2016 to begin the standardization process. It is the IETF QUIC protocol that we are interested in investigating, and it is therefore important to ensure we are using the IETF version. Since, at the time of writing, it is not yet standardized, it was also important to consider the draft version used. We wanted to use a draft that was as up to date as possible and are therefore using draft 27, the most current at the time of writing.

TCP Version

Even though there are specialized implementations of TCP running in user space, we wanted to use TCP from the Linux kernel, since most microservices are not likely to provide their own TCP implementations. We use Linux kernel 5.4.0.

TLS Version

In Table 2.1, we show that there is a full RTT difference between using TLS version 1.2 and 1.3 with TCP. Since we are using the most recent version of QUIC, and TLS 1.3 is already widely deployed [41], there is no reason not to use the most recent version of TLS also with TCP. Therefore, version 1.3 was selected in this study.

Hardware Acceleration

One of the major advantages of TCP is that it has been around for such a long time and therefore has a rich ecosystem supporting it. As discussed in Section 2.2, there are several hardware offloading mechanisms that can improve performance substantially. The offloading mechanisms identified by 3GPP [37] as able to increase TCP performance were: TCP Segmentation Offloading (TSO), crypto offloading (TLS), Generic Segmentation Offloading (GSO), Large Receive Offloading (LRO), Generic Receive Offload (GRO), and checksum offloading. By far, TCP Segmentation Offloading would have the most substantial impact, reducing CPU usage by up to a factor of 50 [37]. All of these were disabled except for checksum offloading (due to limitations of the virtual machine environment) in order to compare the protocols themselves and not hardware offloading capabilities. Checksum offloading is not expected to affect the result significantly.
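On Linux, these offloads can be toggled per interface with ethtool. A sketch of how they might be disabled (the interface name eth0 is a placeholder; the exact set of supported features depends on the NIC driver):

```shell
# Disable segmentation and receive offloads on the interface under test.
# tso = TCP Segmentation Offload, gso = Generic Segmentation Offload,
# gro = Generic Receive Offload, lro = Large Receive Offload.
ethtool -K eth0 tso off gso off gro off lro off
# Verify the resulting feature flags:
ethtool -k eth0 | grep -E 'segmentation|offload'
```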

Congestion Algorithm

Cubic was selected as the congestion algorithm on all implementations since it was the only congestion algorithm available on every implementation. However, Cubic is a widely used congestion algorithm, as mentioned in Section 2.2.2, and should therefore be a valid choice.

Connection Length

A connection can close immediately after a single request-response pair has been processed – a short-lived connection – or remain open – a long-lived connection. Due to time restrictions, we decided to choose only one connection model. Since the multiplexing aspect is such an essential part of both HTTP/2 and QUIC, because of its relation to the Head-of-Line blocking problem (as discussed in Section 2.2), we decided to study the long-lived connection model. Due to implementation restrictions, we chose a somewhat simplified variant of the long-lived connection model in which all our requests are issued at once, whereas in a true long-lived connection model they would be sent stochastically.

TCPFastOpen and QUIC 0–RTT

As discussed in Section 2.2, QUIC’s 0–RTT and TCP’s TCPFastOpen settings enable a client to immediately start sending data to a server that is previously known. Our QUIC implementation lacks the ability to use 0–RTT. However, since we opted to use long-lived connections with multiplexing of many requests, the effect that these early-data mechanisms would have on either protocol is reduced. As we shall see in Chapter 4, there is still a significant difference in request duration in QUIC’s favor, even without taking into account the initial connection setup time. Worth noting is also that the QUIC draft acknowledges that the 0–RTT feature is not always used since it has some security drawbacks; increased latency at connection setup can thus be traded for the stronger protection against replay attacks that comes with the 1–RTT handshake [9].

Multiplexing

Some implementations that we tried were not able to perform multiplexing of requests. Multiplexing is an essential feature of both HTTP/2 and HTTP/3, and one of the most interesting aspects to compare because of its relation to the Head-of-Line blocking problem, as discussed in Section 2.2. It was therefore necessary to choose implementations that could handle multiplexing. The ability to multiplex requests also contributed to the choice of the long-lived connection model. All of the nodes are therefore able to multiplex requests.

3.2 Testbed Components

Apart from deciding on the requirements to ensure comparability, described in Section 3.1.2, another important aspect is the selection of components for the testbed; the implementations of QUIC/TCP should adhere to the requirements listed in Table 3.1 and be stable and trustworthy. Moreover, good documentation and ease-of-use are preferable, and sometimes necessary, in order to be able to perform the tests we desire. In the following sections, we present the components selected and the reasoning behind them.

3.2.1 Platform & Network Tools

All benchmarks were run on the Ubuntu 18.04 operating system with four Intel i7 cores and 20 GB of memory. In order to emulate a real-world communication scenario between networks, several tools described below were used to enable specific RTT and PLR selections.

Virtual Networks: Linux Network Namespaces

Running benchmarks locally poses some problems in providing the desired RTT and packet loss characteristics, since only one interface is available – the loopback interface. However, with Linux Network Namespaces it is possible to have logically separate copies of the network stack, each with its own interfaces, firewall rules and tables [42]. This is very useful when using the tools for controlling the RTT and PLR described below.
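A simplified two-namespace topology can be sketched with a few iproute2 commands. The names (nfa, nfb, veth-a, veth-b) and addresses are placeholders of our own, the commands require root, and our actual testbed additionally places a bridge between the namespaces:

```shell
# Create one namespace per network function.
ip netns add nfa
ip netns add nfb
# Create a veth pair and move one end into each namespace.
ip link add veth-a type veth peer name veth-b
ip link set veth-a netns nfa
ip link set veth-b netns nfb
# Assign addresses and bring the links up.
ip netns exec nfa ip addr add 10.0.0.1/24 dev veth-a
ip netns exec nfb ip addr add 10.0.0.2/24 dev veth-b
ip netns exec nfa ip link set veth-a up
ip netns exec nfb ip link set veth-b up
```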

Round-Trip-Time: netem

The netem Linux network emulator is a tool designed to emulate the characteristics of Wide Area Networks by controlling the RTT and PLR [43]. For our purposes, it is only used to apply a delay to packets in order to increase the RTT to a desired number. Even though netem is capable of also creating packet loss, the implementors do not guarantee accurate performance when running benchmarks locally [43]. Although we have a bridge in between the network namespaces, it is virtual, and probably for that reason we found that packet loss did not work as expected when using netem.
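With netem, the desired RTT is obtained by adding half of it as one-way delay in each direction of the path. A sketch for a 20 ms RTT (the interface names br-a and br-b are placeholders for the two bridge interfaces):

```shell
# 10 ms of one-way delay in each direction gives a 20 ms round-trip time.
tc qdisc add dev br-a root netem delay 10ms
tc qdisc add dev br-b root netem delay 10ms
# Inspect the configured qdiscs:
tc qdisc show dev br-a
```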

Packet Loss: iptables

The iptables program [44] is used to set up and maintain firewall rules and works very well in conjunction with netem. Since each network namespace has its own firewall rules, it is possible to set a desired amount of packet loss on the client and server interfaces respectively using iptables together with its statistic module; this is how it is used in our testbed.
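A sketch of how the statistic module can drop a given fraction of incoming packets (here 1%, run inside a namespace so the rule only affects that interface's traffic):

```shell
# Drop each incoming packet independently with probability 0.01.
iptables -A INPUT -m statistic --mode random --probability 0.01 -j DROP
# List the rule together with its packet counters:
iptables -L INPUT -v -n
```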

Logging: tcpdump

The tcpdump program captures packets on a network interface and writes that information to a .pcap file [45]. In our testbed, it is used to capture raw packets on the client interface for later higher-level processing.

Logging: Wireshark

Wireshark is a network protocol analyzer [46]. Its terminal variant tshark [47] is used in our testbed to interpret the different HTTP/2 and QUIC fields found in the .pcap file generated by tcpdump, in order to identify the beginning and end of a request-response exchange, the number of bytes transmitted and the length of a complete connection.
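The capture-then-decode pipeline can be sketched as follows; the interface name and the choice of fields are illustrative (here, for the HTTP/2 case, per-packet timestamps and stream IDs are extracted to locate the first request packet and last response packet of each stream):

```shell
# Capture everything on the client interface to a pcap file.
tcpdump -i veth-a -w client.pcap
# Later, extract per-packet timestamps and HTTP/2 stream IDs from the capture.
tshark -r client.pcap -Y http2 -T fields -e frame.time_epoch -e http2.streamid
```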

3.2.2 QUIC/TCP Client & Server Implementations

There are two separate client programs for the TCP and QUIC implementations. One server program is used that is able to run both TCP and QUIC.

QUIC Client: lsquic

The IETF QUIC working group maintains a website that lists all the available QUIC and HTTP/3 implementations [48]. Several good candidates are available, and we tried out several of them. The lsquic implementation [49], which is our choice, is provided by LiteSpeed, a company that specializes in server software design [50]. Their LiteSpeed QUIC server has been demonstrated to be highly performant [51]. The lsquic implementation also uses draft version 27 and enables multiplexing of requests with ease. The main downside of this implementation is that it does not offer the 0–RTT feature. However, even though we are not able to see the reduction in connection setup offered by this feature, we are still able to get valuable information on request duration once the connection is open, as demonstrated in Chapter 4.

TCP Client: nghttp2

nghttp2 is a mature HTTP/2 implementation [52] offering the HTTP/2 terminal client nghttp, which runs over the default Linux TCP stack (kernel version 5.4.0). Since the default Linux TCP stack is used in our testbed, it was easy to turn off the hardware acceleration features listed in Table 3.1 by using the standard configuration tools provided in Ubuntu.

QUIC/TCP Server: nginx with quiche

nginx is a well-known server that profiles itself as a highly performant web server [53]. It has default support for HTTP/2 and runs over the default Linux TCP stack. The Content Delivery Network company Cloudflare has implemented a QUIC extension for nginx named quiche [54, 55]. Benefits of using nginx for both TCP and QUIC include ease of configuration, for instance only allowing TLS v1.3.
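As a rough illustration, a server block serving both protocols could look something like the following. This is a hypothetical sketch under the assumption of Cloudflare's quiche patch for nginx; directive names and the advertised HTTP/3 draft token may differ between patch versions, and certificate paths are placeholders:

```nginx
server {
    # TCP path: HTTP/2 over TLS 1.3.
    listen 443 ssl http2;
    # QUIC path: HTTP/3, as added by the quiche patch.
    listen 443 quic reuseport;

    ssl_protocols TLSv1.3;
    ssl_certificate     cert.crt;
    ssl_certificate_key cert.key;

    # Advertise HTTP/3 support to clients (draft-27 token assumed).
    add_header alt-svc 'h3-27=":443"; ma=86400';
}
```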

3.3 Benchmark Parameter Settings

There are a multitude of parameters that should be considered in order to make a thorough and comprehensive comparison of TCP and QUIC. It is, however, not feasible to do a fully comprehensive comparison within the scope of this thesis and therefore some delimiting choices have been made. The focus is on QUIC and TCP performance within the 5G Core Network Service Based Architecture; that also affects the choice of what parameters to include in the benchmarks. We first present the main parameters of our study and then what we refer to as the sub-parameters.

3.3.1 Main Parameters

One of the main areas in which the future 5G Core Network SBA would differ from previous architectures is the physical distance between the communicating nodes (Network Functions). 3GPP has clearly stated that the SBA microservice-like architecture is intended to enable Network Functions to be deployed in different data centers far apart from each other, rather than very close to each other as is the case today [3]. A more remote deployment of Network Functions will increase the RTT. It is therefore important to investigate how an increasing RTT affects protocol performance. As Network Functions are deployed further from each other, the risk of a higher PLR also increases. Therefore, it is also important to

look at the effect an increasing PLR would have on QUIC and TCP respectively. Since both an increasing RTT and PLR are important implications of the future 5G Core Network SBA with remotely located Network Functions, these will be the main parameters that we focus on in our experiments.

3.3.2 Sub-parameters

The RTT and PLR, however, cannot be investigated without setting other parameters that also affect protocol performance. The ones chosen to be included in this study are:

• Number of Multiplexed Requests

• Request Payload Size

• Response Payload Size

The number of multiplexed requests over a single connection is a parameter that only needs to be set when the long-lived connection model is being used, since the short-lived connection model only sends one request at a time and then closes down the connection. Our implementation of QUIC did not have the 0–RTT feature available; hence, we only studied the long-lived connection model. The request payload and response payload are basic features of an HTTP request that always need to be set to zero or another value. In our implementation, a zero-byte payload request is represented by HTTP GET, and as soon as payload is added, it is represented by HTTP POST. It is important that the sub-parameters are set in a way that is as similar as possible to the 5G Core Network SBA traffic. Since the 5G Core Network SBA payload size varies between 20–300 bytes, as described in Section 2.4.2, we are not investigating the effect of trying to POST objects larger than 1000 bytes or several MB. The response payload, however, could be large for implementational reasons, and therefore we investigate response payload sizes of up to 10 MB.

3.4 Key Performance Indicators

It is important to be able to measure the effect that the different parameters have on QUIC and TCP protocol performance. Two of the most important and somewhat lowest common denominators when evaluating the performance of

network communication in general are throughput and latency. Throughput measures the amount of data transferred per unit of time within a network. Latency, on the other hand, measures how fast data propagates. It is fully possible for a protocol to exhibit high throughput but high latency, or vice versa. However, it is also possible to measure higher-level parameters such as Page Load Time (PLT), which is very important for the web, where the end user is often loading a web page and reacting to how fast it loads. Since we are not dealing with the loading of web pages in the 5G Core Network Control Plane SBA, we reasoned that it is most interesting to focus on the latency of individual requests and on throughput. When evaluating a protocol, it could also be interesting to look at other aspects, such as CPU usage, that may also affect the final choice of protocol in the end. In this study, we decided to focus on pure characteristics of the protocols, for which throughput and latency make good candidates.

There are also different ways of measuring throughput, but we use the model displayed in Equation 3.1. When measuring time, we always retrieve the timestamps from the client side. The beginning of the connection is represented by the very first packet sent over the connection, and the end by the very last packet, right before the client closes down the connection; thus, the full connection duration also includes the handshakes.

Throughput = Bytes From Server / Connection Duration    (3.1)

When it comes to latency, we are mostly interested in how fast an HTTP request completes, and the KPI chosen for measuring latency is therefore request duration. The request duration measurement model is shown in Equation 3.2.

Request Duration = Response RX − Request TX    (3.2)

The request duration is the difference between the timestamp of the last response packet and the timestamp of the first corresponding request packet.
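Both KPIs can be computed directly from the client-side packet timestamps. A minimal sketch of Equations 3.1 and 3.2 (function and parameter names are our own):

```python
def throughput_bytes_per_sec(bytes_from_server: int,
                             first_pkt_ts: float,
                             last_pkt_ts: float) -> float:
    """Equation 3.1: bytes received from the server divided by the full
    connection duration, handshakes included (timestamps in seconds)."""
    return bytes_from_server / (last_pkt_ts - first_pkt_ts)

def request_duration(request_tx_ts: float, response_rx_ts: float) -> float:
    """Equation 3.2: time from the first request packet to the last
    corresponding response packet."""
    return response_rx_ts - request_tx_ts

# A 10 MB response over a connection lasting 2 s gives 5 MB/s:
assert throughput_bytes_per_sec(10_000_000, 100.0, 102.0) == 5_000_000.0
# A request sent at t=100.00 s completing at t=100.04 s took 40 ms:
assert abs(request_duration(100.00, 100.04) - 0.04) < 1e-9
```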

3.5 Test Design

Given five parameters to vary, it is unfeasible to vary them all to any greater extent because of the time it takes to run the benchmarks and perform the subsequent analysis. In order to have statistical significance, each identical benchmark is run 50 times. Thus, generating a plot with 10 points necessitates the running of 500 benchmarks. Because of the rapidly increasing number of

possible benchmarks that could be run when varying five different parameters, we have had to impose some limitations and decide which parameters are the most important for the purpose of answering our research question. As mentioned in Section 3.3, the focus is primarily on determining the protocol efficiency of QUIC and TCP as RTT and PLR increase due to more remote placement of Network Functions. At first glance, it might seem straightforward to simply run benchmarks varying the RTT and PLR and observe the performance of TCP and QUIC with regard to throughput and request duration. However, the other three parameters also influence the result. Therefore, measuring the performance of TCP and QUIC with regard to PLR and RTT is not just a one-dimensional measurement; we also need to look at the performance when we are multiplexing few or many requests, when the request/response payload is small or large, and so on.

It is important to choose these values in a way that reflects the 5G Core Network SBA as closely as possible. As discussed in Section 2.3, the request/response payload within the 5G Core Network is typically between 20–300 bytes. The response payload may be higher for implementational reasons, and we therefore try payloads up to 1 MB. The number of requests multiplexed over a single connection also varies, and we have tried a whole spectrum, from 1–500 requests multiplexed over a single connection.

Chapter 4

Results

In Section 3.3, we discuss in depth the reasoning behind the selection of parameters to study. We point out that an increased RTT is a direct consequence of a future deployment of the 5G Core Network SBA where Network Functions are placed in different data centers much further apart than what is the case today. It is also likely that the PLR would increase in this scenario. Therefore, it is of great interest to see if either protocol performs significantly better than the other with regard to request duration and throughput under varying degrees of RTT and PLR. We refer to these parameters as the main parameters. It is equally important to understand how the tuning of the number of multiplexed requests, the request payload and the response payload influences protocol performance; in this study, we refer to these parameters as the sub-parameters. The impact of these sub-parameter settings on protocol performance is valuable to understand on its own merits; additionally, an accurate and precise understanding of sub-parameter impact is important in order to make an informed decision on what their settings should be when evaluating PLR and RTT impact on protocol performance.

We begin by studying the impact of the sub-parameters on protocol performance in Section 4.1. Based on our findings, we then choose the sub-parameter settings to use when we conclude by varying the main parameters – PLR and RTT – in Section 4.2.

4.1 Sub-parameters

Each main parameter and sub-parameter can only have one of two possible states at a time: static or dynamic. In a given benchmark, only one parameter can be dynamic; in other words, we only vary one parameter at a time in order


to isolate the impact of each parameter on protocol performance. Yet, the parameters that are not varied in a particular benchmark – the static parameters – also need to be set to some value. To give an overview of the different static values that were used in at least one benchmark throughout Chapter 4, we present Table 4.1. The RTT static value is set to 20 ms to emulate inter-regional deployments. The PLR static value is set to 0% to provide a baseline of request duration and throughput results not affected by congestion control mechanisms. The static values for each sub-parameter are selected based on the outcome of their results as presented throughout Section 4.1. The reasoning behind sometimes selecting more than one value, and why these particular values were chosen, is also presented in the subsection of each sub-parameter. Note also that when the request body is zero bytes, an HTTP GET request is used; when it is a non-zero value, an HTTP POST request is used.

We begin by varying the number of multiplexed requests in Section 4.1.1, followed by varying the request payload in Section 4.1.2 and the response payload in Section 4.1.3.

Parameter                        Value
Number of Multiplexed Requests   10; 25; 200
Round Trip Time                  20 ms
Packet Loss Rate                 0%
HTTP Request Body                0; 25 bytes
HTTP Response Body               200 bytes

Table 4.1: Even if a particular parameter is not being varied in a given benchmark, it still needs to be set to a value. This table lists the various static values that a parameter can have in at least one of the benchmarks presented throughout Chapter 4.
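The one-dynamic-parameter-at-a-time scheme described above can be sketched as follows; the dictionary keys and the helper function are illustrative, not the testbed's actual code.

```python
STATIC = {
    # Static values from Table 4.1 (where a parameter lists several
    # options, one is chosen per benchmark; 10 requests shown here).
    "multiplexed_requests": 10,
    "rtt_ms": 20,
    "plr_pct": 0.0,
    "request_body_bytes": 0,    # zero bytes -> HTTP GET, non-zero -> POST
    "response_body_bytes": 200,
}

def benchmarks(dynamic_param, values):
    """Yield one configuration per value: only `dynamic_param` varies,
    every other parameter keeps its static value."""
    for value in values:
        config = dict(STATIC)
        config[dynamic_param] = value
        yield config

# The sweep of Section 4.1.1: vary only the number of multiplexed requests.
runs = list(benchmarks("multiplexed_requests", [10, 25, 50, 100, 200, 500]))
```

Each generated configuration differs from the static baseline in exactly one key, which is what isolates the impact of that parameter.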

4.1.1 Number of Multiplexed Requests

To begin with, we study the impact on request duration when varying the number of multiplexed requests, as seen in Figure 4.1. The number of multiplexed requests is varied between 10 and 500. The values of the other parameters during this benchmark are presented in Table 4.1, with the HTTP request body set to zero bytes.

Figure 4.1: The request duration mean, 5th and 95th percentiles as a function of the number of multiplexed requests. The RTT is 20 ms, the PLR is 0%, the HTTP request body is zero bytes and the HTTP response body is 200 bytes.

At 10 multiplexed requests, the difference in average request duration between QUIC and TCP is negligible; however, TCP's 95th-percentile request duration is twice as long as QUIC's. Between 20–75 requests, QUIC reduces average request latency to less than half of that of TCP. Between 50–75 requests, the request duration of both TCP and QUIC rises significantly. Between 75–175 requests, the TCP average continues to rise until it reaches a peak at 100 requests. At 75 requests, QUIC reaches a plateau and its average request duration stays virtually the same until 175 requests, after which it drops again. At 200 requests, QUIC's average request duration is 60% of TCP's, and close to half above 200 requests. Between 150–175 requests, the difference between the QUIC and TCP averages is very small; moreover, neither protocol's 95th percentile is significantly better than the other's.

To summarize, the number of multiplexed requests has a significant impact on request duration. On average, QUIC enables requests to complete twice as fast as TCP both when a low number of requests is multiplexed (20–75) and when a high number is multiplexed (> 200). At very low numbers of multiplexed requests (< 10), the difference in average request duration is negligible. The difference is also very small when multiplexing 150–175 requests. However, as we shall see, the request payload also has a significant impact on request duration. An important detail in this result is that we are using an HTTP GET request with a zero-byte request payload. Once we use an HTTP POST request with a payload of 25 bytes, we will see different results, as presented in Section 4.1.2. We should also keep in mind that these results hold at an RTT of 20 ms and a PLR of 0%.

It is also noteworthy that, overall, QUIC's 95th percentile is further away from its average than TCP's is; in other words, QUIC's spread is larger than TCP's. Although a majority of QUIC requests complete significantly faster than TCP's, 5% of the QUIC requests have a significantly larger delay, which can be important to be aware of for time-critical applications, for example.

Figure 4.2: The throughput mean, 5th and 95th percentiles as a function of the number of multiplexed requests. The RTT is 20 ms, the PLR is 0%, the HTTP request body is zero bytes and the HTTP response body is 200 bytes.

In Figure 4.2, the throughput is displayed as a function of the number of multiplexed requests. Noticeably, TCP's sharp increase in average request duration between 10 and 25 requests is not reflected in the throughput, as summarized by Table 4.2. At first glance, one might think that an increasing average request duration should also lead to a decreasing throughput, since fewer requests would be able to complete within the same time frame. Although counterintuitive at first, there is a valid explanation for this phenomenon. We will look at it in more detail, since the same reasoning also applies when we look at the request payload parameter in Section 4.1.2.

    No. of reqs.   Avg. req. dur.   Avg. throughput   Total no. of packets
    10             23 ms            55.3 KB/s         40
    20             42 ms            88.7 KB/s         42

Table 4.2: Example of increase in average request duration and average throughput over TCP.

    Request ID   Response ID   Req. TX   Resp. RX
    1–10                       47 ms
                 1–8                     68 ms
                 9–10                    89 ms

Table 4.3: Individual request TX and response RX timestamps when multiplexing 10 requests over TCP.

In Table 4.3, which shows a representative run of this benchmark with 10 multiplexed requests, we see that each TCP request receives the same timestamp when sent out, 47 ms (they were all in the same packet). The corresponding responses 1–8 receive timestamp 68 ms, but the last two responses arrive quite a lot later and receive timestamp 89 ms. However, since only 2 out of 10 responses receive the later timestamp, the average request duration is not affected too much.

However, when increasing the number of multiplexed requests to 20, the situation changes, as seen in Table 4.4. All 20 requests still receive the same request TX timestamp and the responses are received in two batches, as in the example with 10 requests. But now, the later response timestamp accounts for 55% of all of the responses, thus increasing the average request duration substantially.

    Request ID   Response ID   Req. TX   Resp. RX
    1–20                       48 ms
                 1–9                     70 ms
                 10–20                   93 ms

Table 4.4: Individual request TX and response RX timestamps when multiplexing 20 requests over TCP.

At the same time, as seen in Table 4.2, the total number of packets is 40 when sending 10 requests and 42 when sending 20 requests, an increase of only 5%. The reason that the example with 20 requests has a higher throughput than the example with 10 requests – although the average request duration is also longer – is simply that the packets sent when issuing 20 requests contain much more data than those sent when issuing 10 requests. As explained in Section 3.4, throughput according to our method is the total number of bytes sent from the server divided by the connection duration. As long as the connection duration stays roughly the same but we receive more bytes, we will have a higher throughput. This is the case here.

Since we see a large difference in request duration depending on the number of multiplexed requests, we select a few different numbers of multiplexed requests to use as static parameter values in subsequent benchmarks, namely:

• 10 requests

• 25 requests

• 200 requests

With 10 requests, we know that the request duration difference is negligible when the benchmark is run with a request payload of zero bytes. With 25 multiplexed requests, however, requests sent over QUIC complete almost twice as fast as over TCP. The throughput does not show such drastic changes in either interval, other than when reaching 450 requests and above. The choice of 200 requests, used in some cases, gives an even greater sample of requests from which to calculate the average and the 5th and 95th percentiles.
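As a concrete illustration of the measuring method recapped above (request duration from wire timestamps, throughput as server bytes over connection duration, per Section 3.4), here is a minimal sketch using the representative timestamps of Table 4.3; the function names are ours, not the testbed's.

```python
def request_durations_ms(req_tx, resp_rx):
    # Duration of each request: response RX timestamp minus request TX
    # timestamp, both taken "on the wire" (connection setup excluded).
    return {rid: resp_rx[rid] - req_tx[rid] for rid in req_tx}

def throughput_kbs(server_bytes, conn_duration_ms):
    # Total bytes sent from the server divided by the connection duration.
    return server_bytes / conn_duration_ms  # bytes per ms, i.e. ~KB/s

# Representative TCP run from Table 4.3: all 10 requests TX at 47 ms,
# responses 1-8 RX at 68 ms, responses 9-10 RX at 89 ms.
req_tx = {rid: 47 for rid in range(1, 11)}
resp_rx = {rid: (68 if rid <= 8 else 89) for rid in range(1, 11)}
durations = request_durations_ms(req_tx, resp_rx)
average = sum(durations.values()) / len(durations)
# Only 2 of 10 responses carry the late 89 ms timestamp, so the average
# (25.2 ms for these illustrative numbers) stays close to the early batch.
```

Note how a shared TX timestamp plus a late second batch of responses raises the average only in proportion to how many responses land in that batch, which is the mechanism behind Tables 4.3 and 4.4.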

4.1.2 Request Payload

After gaining knowledge of how the number of multiplexed requests affects request duration and throughput, the next sub-parameter to be investigated is the request payload. The default values are those of Table 4.1, with the number of requests initially set to 10. Figure 4.3 shows the request duration results of this benchmark. At a request payload of zero bytes, the difference in request duration between QUIC and TCP is negligible. Beginning at a payload of 25 bytes, however, QUIC's average request duration is consistently close to half of that of TCP, no matter the request payload size. In Figure 4.4, we see that QUIC's average throughput is consistently more than twice as high as TCP's.

Again, we see the same phenomenon with TCP as described in Section 4.1.1. In this case, the throughput remains largely unaffected while TCP request duration increases substantially between a payload of 0 and 25 bytes.

Figure 4.3: The request duration mean, 5th and 95th percentiles as a function of the request payload. The number of multiplexed requests is 10, the RTT is 20 ms, the PLR is 0% and the HTTP response body is 200 bytes.

Figure 4.4: The throughput mean, 5th and 95th percentiles as a function of the request payload. The number of multiplexed requests is 10, the RTT is 20 ms, the PLR is 0% and the HTTP response body is 200 bytes.

Figure 4.5: The request duration mean, 5th and 95th percentiles as a function of the request payload. The number of multiplexed requests is 25, the RTT is 20 ms, the PLR is 0% and the HTTP response body is 200 bytes.

Figure 4.6: The throughput mean, 5th and 95th percentiles as a function of the request payload. The number of multiplexed requests is 25, the RTT is 20 ms, the PLR is 0% and the HTTP response body is 200 bytes.

Figure 4.7: The total number of packets mean as a function of the request payload. The number of multiplexed requests is 25, the RTT is 20 ms, the PLR is 0% and the HTTP response body is 200 bytes.

Yet again, the explanation for this behavior is the same as described in Section 4.1.1: the TCP average request duration increases substantially when the request payload increases from 0 to 25 bytes, since a much larger fraction of responses end up at a later time slot when we increase the payload. Throughput remains virtually unchanged. If we increase the number of multiplexed requests to 25 and keep the other parameters as before, the previous outlier at a request payload of zero bytes disappears, as seen in Figure 4.5. The throughput does not change substantially, as seen in Figure 4.6. Another interesting observation is that the total number of packets used by QUIC increases much more rapidly than for TCP, as seen in Figure 4.7. It is possible that this is one reason why QUIC requests complete faster than TCP requests.

An important lesson learnt so far is that the average request duration can be largely affected by varying the number of requests, the request payload, or a combination of both. The special case occurs when we have a low number of multiplexed requests (10 in our case) coupled with an HTTP GET request carrying a zero-byte payload. Under those conditions, there is no difference in average request duration between TCP and QUIC.

Apart from the special circumstances described above, we can say that:

• On average, QUIC enables requests to complete twice as fast compared to TCP.

• Throughput over QUIC is consistently more than twice as high compared to TCP.

Since we note a significant difference in request duration when multiplexing 10 requests using a request payload of zero bytes compared to non-zero values, we select the following static values for the HTTP request payload in subsequent benchmarks:

• Zero bytes

• 25 bytes

4.1.3 Response Payload

We see in Figure 4.8 that when varying the response payload between 0–1000 bytes, the request duration mean for both TCP and QUIC is largely unaffected; the total variation in average request duration is within 1 ms. Moreover, the difference between TCP and QUIC performance is negligible. However, throughput is consistently twice as high for QUIC compared with TCP, as seen in Figure 4.9. The explanation for the significant difference in throughput in QUIC's favor, despite the negligible difference in request duration, can be found when considering the 95th-percentile TCP request duration in Figure 4.8. In our method, throughput is the total number of bytes sent from the server divided by the connection duration, as explained in Section 3.4. The connection will not close until the very last request has completed. Even if only a small fraction of requests complete at a much later time than the others, they will cause the connection to stay open significantly longer. In other words, the throughput is inversely proportional to the connection duration. The reason that the throughput is significantly lower for TCP is these very late outlier requests.

Even though many responses within the 5G Core Network are of shorter length, it was deemed interesting, for implementation purposes, to also investigate protocol performance when downloading larger files. Therefore, we also include the example of downloading a file, varying its size between 1 KB–1 MB. Looking at the lower end first, we see in Figure 4.10 that the request duration

Figure 4.8: The request duration mean, 5th and 95th percentiles as a function of a response payload of 25–1000 bytes. The number of multiplexed requests is 10, the RTT is 20 ms, the PLR is 0% and the HTTP request payload is zero bytes.

Figure 4.9: The throughput mean, 5th and 95th percentiles as a function of a response payload of 25–1000 bytes. The number of multiplexed requests is 10, the RTT is 20 ms, the PLR is 0% and the HTTP request payload is zero bytes.

Figure 4.10: The request duration mean, 5th and 95th percentiles as a function of a response payload of 1–15 KB. The number of multiplexed requests is 10, the RTT is 20 ms, the PLR is 0% and the HTTP request payload is zero bytes.

Figure 4.11: The throughput mean, 5th and 95th percentiles as a function of a response payload of 1–15 KB. The number of multiplexed requests is 10, the RTT is 20 ms, the PLR is 0% and the HTTP request payload is zero bytes.

Figure 4.12: The request duration mean, 5th and 95th percentiles as a function of a response payload of 1 KB–1 MB. The number of multiplexed requests is 10, the RTT is 20 ms, the PLR is 0% and the HTTP request payload is zero bytes.

Figure 4.13: The throughput mean, 5th and 95th percentiles as a function of a response payload of 1 KB–1 MB. The number of multiplexed requests is 10, the RTT is 20 ms, the PLR is 0% and the HTTP request payload is zero bytes.

average is negligible at 1 KB. However, already at a download size of 5 KB, requests over QUIC complete 1.5 times faster than over TCP on average. From 5–8 KB, requests over QUIC complete around twice as fast as over TCP. The throughput, as depicted in Figure 4.11, is on average around twice as high for QUIC at 1 KB, and around three times as high from 5–15 KB, compared with TCP. At the higher end, we see in Figure 4.12 that requests over QUIC on average complete around twice as fast as over TCP also between 10 KB–1 MB. Moreover, QUIC's average throughput, as seen in Figure 4.13, is three times higher than TCP's also at 100 KB. At 1 MB, it is twice as high. To summarize the impact that the response payload has on protocol performance, we can say the following:

• The average request duration difference is negligible at a response payload size between 25–1000 bytes.

• The TCP request duration 95th percentile is almost twice as high compared with QUIC at a response payload size between 25–1000 bytes.

• Requests over QUIC on average complete 1.5 times faster than over TCP at a response payload size of 5 KB, and twice as fast from 8 KB–1 MB.

• QUIC throughput is consistently almost three times as high as for TCP when the response payload size is between 5–100 KB.

• QUIC throughput is consistently twice as high as that of TCP at response payload sizes of 25–1000 bytes and at 1 MB.

Although it would be of interest to study the higher end more thoroughly – since QUIC's advantages, both in terms of request duration and throughput, increase with increasing response payload – we will focus on the lower end for subsequent benchmarks, since that is the most common use case in the 5G Core Network. As we have seen, between 25–1000 bytes it does not matter too much which response payload size is chosen. We choose 200 bytes as the static response payload value for subsequent benchmarks, since many responses in the 5G Core Network are rather short.

Figure 4.14: The request duration mean, 5th and 95th percentiles as a function of the RTT. The number of multiplexed requests is 10, the PLR is 0%, the HTTP request payload is zero bytes and the HTTP response payload is 200 bytes.

Figure 4.15: The request duration mean, 5th and 95th percentiles as a function of the RTT. The number of multiplexed requests is 200, the PLR is 0%, the HTTP request payload is zero bytes and the HTTP response payload is 200 bytes.

Figure 4.16: The throughput mean, 5th and 95th percentiles as a function of the RTT. The number of multiplexed requests is 10, the PLR is 0%, the HTTP request payload is zero bytes and the HTTP response payload is 200 bytes.

Figure 4.17: The throughput mean, 5th and 95th percentiles as a function of the RTT. The number of multiplexed requests is 200, the PLR is 0%, the HTTP request payload is zero bytes and the response payload is 200 bytes.

4.2 Main Parameters

4.2.1 Round Trip Time

In Sections 4.1.1 and 4.1.2, we demonstrate that particular settings of either the number of multiplexed requests or the request payload size make the difference in average request duration between QUIC and TCP negligible when using an HTTP GET request with a zero-byte payload. We therefore present two benchmarks when evaluating the RTT: one represents the case where the difference in average request duration was negligible, as seen in Section 4.1.1 (using only 10 multiplexed requests), and the other represents the cases where requests sent over QUIC completed significantly faster (using 200 multiplexed requests).

The request duration results when multiplexing 10 HTTP GET requests with a zero-byte payload can be viewed in Figure 4.14. As could be expected based on previous findings, the difference in mean request duration between TCP and QUIC is negligible. When using a request payload of zero bytes with only 10 multiplexed requests, the majority of the responses over TCP are received in the first round of packets, as described in Section 4.1.1, making the average request duration low also for TCP. These parameter settings are optimal for TCP's average request duration. However, although the average request duration difference between the two protocols is negligible, TCP's 95th percentile increases at a significantly steeper angle than QUIC's. In other words, TCP's spread increases with increasing RTT while QUIC's spread stays the same – an important consideration for time-critical applications.

As described in Sections 4.1.1 and 4.1.2, increasing either the number of multiplexed requests or the request payload will move TCP away from its "sweet spot", and the difference in average request duration will increase in QUIC's favor. We increased the number of multiplexed requests to 200 and the plot looks completely different, as seen in Figure 4.15.
An important general trend here is that TCP's curve is much steeper than QUIC's; in other words, as the RTT increases, the benefit of using QUIC with regards to average request duration also increases. Even though the average request duration difference varies significantly depending on whether we are using 10 or 200 multiplexed requests, throughput is not impacted by this shift: QUIC consistently has 2–3 times higher throughput than TCP, as seen in Figures 4.16 and 4.17. We summarize the most important observations below:

• When multiplexing 10 HTTP GET requests with a zero-byte request payload, the difference in average request duration between TCP and QUIC is negligible, but TCP's 95th percentile is significantly larger than QUIC's. TCP's spread also increases with increasing RTT, in contrast to QUIC's, which stays more or less the same.

• When instead increasing the number of multiplexed requests to 200 (and most likely already from 20 and above, as demonstrated in Section 4.1.1), an increasing RTT also increases the benefit of using QUIC with regards to average request duration.

4.2.2 Packet Loss Rate

Figure 4.18: The request duration mean, 5th and 95th percentiles as a function of the PLR. The number of multiplexed requests is 200, the RTT is 20 ms, the HTTP request payload is zero bytes and the HTTP response payload is 200 bytes.

As we test the effect of packet loss on QUIC and TCP performance, we want to take into account the effect of QUIC's HOL blocking mitigation. As described in Section 2.2.4, a packet loss in a TCP stream blocks all subsequent packets regardless of which HTTP/2 stream they belong to since TCP itself has no notion of multiple streams; it is only HTTP/2 which is aware of the streams.

Each request–response pair is mapped to its own HTTP/2 stream, but they all share the same TCP stream. If a packet loss occurs in TCP, then all requests that were multiplexed and still on the wire are blocked. With QUIC, a packet loss in one stream does not result in all subsequent packets being blocked – only the requests (QUIC streams) within the lost packet have to wait for the retransmission, while requests belonging to other QUIC streams can continue to be delivered.

Therefore, the benchmark that would in theory demonstrate QUIC's largest benefit from HOL blocking mitigation is sending a large number of multiplexed requests with small request and response payload sizes. Multiple requests then fit within each packet, and therefore more requests are on the wire at the same time than if sending, for instance, a few very large requests.
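The difference described above can be made concrete with a toy model (ours, not either protocol's actual loss-recovery logic): packets each carry several multiplexed request streams, and one packet is lost.

```python
def stalled_streams_tcp(packets, lost):
    # TCP delivers bytes strictly in order: the lost segment and every
    # later segment wait for the retransmission, regardless of stream.
    stalled = set()
    for i, streams in enumerate(packets):
        if i >= lost:
            stalled |= set(streams)
    return stalled

def stalled_streams_quic(packets, lost):
    # QUIC retransmits per stream: only streams with data in the lost
    # packet wait; streams in later packets are delivered independently.
    return set(packets[lost])

# Four packets, two multiplexed request streams each; packet 1 is lost.
packets = [[1, 2], [3, 4], [5, 6], [7, 8]]
tcp_stalled = stalled_streams_tcp(packets, lost=1)    # streams 3-8 wait
quic_stalled = stalled_streams_quic(packets, lost=1)  # only streams 3 and 4
```

With many small requests per packet, the set of TCP-stalled streams grows with everything still on the wire, while QUIC's stalled set stays bounded by the lost packet's contents, which is why this benchmark shape should favor QUIC.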

Figure 4.19: The throughput mean, 5th and 95th percentiles as a function of the PLR. The number of multiplexed requests is 200, the RTT is 20 ms, the HTTP request payload is zero bytes and the HTTP response payload is 200 bytes.

We ran a benchmark multiplexing 200 HTTP GET requests with a payload of zero bytes and a response payload of 200 bytes. The request duration results can be seen in Figure 4.18. On average, requests sent over QUIC complete 3 times faster than over TCP at 0% PLR, 1.6 times faster between 1–3% and 1.7 times faster between 4–6%, and then increasingly faster until 10%, where QUIC's average request duration rises significantly.

Noticeably, QUIC has a substantial loss in performance between 0% and 1% PLR. After that, the QUIC average almost reaches a plateau between 1 and 9% PLR and only rises slowly. After 2%, however, we do see a significant increase in the 95th percentile. TCP starts with an average request duration that is almost three times longer than QUIC's. At 1% PLR, however, the difference has decreased to 1.5 times longer than QUIC's. The TCP request duration 95th percentile is twice as long as QUIC's already at 0% PLR and keeps rising between 0% and 3%, while QUIC's 95th percentile does not change much.

QUIC's average throughput is significantly higher than TCP's, as seen in Figure 4.19. At 0% PLR, it is more than twice as high, and it stays around twice as high for most of the indicated PLR rates, although the gap becomes smaller at 10% PLR. Yet, it is noteworthy that QUIC's spread in throughput is larger than TCP's. The spread starts to increase substantially already at 1% PLR, and from 4% onwards the 5th percentile is very low, even much lower than TCP's average. To summarize the important points:

• QUIC request duration average is a third of that of TCP at 0% PLR.

• QUIC's average request duration increases substantially between 0–1% and 9–10% PLR.

• QUIC requests on average complete more than 1.5 times faster than TCP from 1–9% PLR.

• QUIC throughput stays around twice as high as that of TCP from 0–9% PLR.

• QUIC's throughput spread is significantly higher than TCP's from 1% PLR onwards. At 4% PLR, its 5th percentile is the same as TCP's 5th percentile, even though TCP's average throughput is half of QUIC's.

Chapter 5

Discussion

We present a review of the hypotheses from Section 1.3 and discuss whether they hold based on our results. We also discuss our results in light of the literature review and point out what we believe to be valuable findings for the 5G Core Network SBA. Suggestions are given for future work and limitations are pointed out. Finally, we conclude by summarizing the most important findings and answering whether QUIC is indeed a better choice for the 5G Core Network SBA or not.

5.1 Faster Connection Establishment – Not the Only Factor for Improving Latency

Hypothesis 3, from Section 1.3, states that QUIC enables significantly better performance than TCP in the 5G Core Network when a majority of connections are short-lived. The reasoning behind it is that QUIC's faster connection establishment should benefit QUIC in terms of lower latency. This is what the literature emphasizes; see for instance [16, 17, 37]. For implementation reasons, our study does not evaluate QUIC under the strict short-lived connection model, where only one request is sent at a time and the connection immediately closes once the response is received [37]. In our benchmarks, we multiplex at least 10 requests each time and the connection is closed immediately upon completion of these requests. We labeled this a simplified variant of the long-lived connection model in Section 3.1.2, since a more complete variant – a standing connection – would let the connection remain open even after requests have completed, and would also send requests at different time slots rather than all at the same time.
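The intuition that handshake savings matter less under a long-lived connection model can be stated as simple arithmetic; the numbers below are illustrative, with the "one or two RTT" setup saving being the figure TR 29.893 cites.

```python
def setup_saving_per_request_ms(rtt_ms, rtts_saved, requests_per_conn):
    # The handshake is paid once per connection, so the latency saved by
    # a shorter connection establishment is amortized over all requests
    # that share that connection.
    return rtt_ms * rtts_saved / requests_per_conn

# At our 20 ms RTT, saving two RTTs on setup:
short_lived = setup_saving_per_request_ms(20, 2, 1)    # one request per conn
long_lived = setup_saving_per_request_ms(20, 2, 1000)  # many requests per conn
```

For a strictly short-lived connection the full 40 ms saving applies to every request, while over a long-lived connection it shrinks toward zero; this is why our wire-level measurements, which exclude setup entirely, are the interesting comparison for the long-lived case.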


When analyzing our results, it is important to remind ourselves of the method we use for measuring request duration. In fact, the connection setup time does not affect the request duration at all according to our method of measuring; this is because we register the request TX and response RX timestamps on the wire, when the connection is consequently already set up. This way of measuring request duration has advantages and disadvantages, discussed in further detail in Section 5.4. In this section, we focus on the fact that we are able to compare request duration between TCP and QUIC after a connection has already been established.

This way of measuring is especially important for the long-lived connection model, where multiple requests are sent during the lifetime of a connection. Under these conditions, our results show that when multiplexing 10 or 25 HTTP POST requests with a payload between 25–1000 bytes, QUIC's average request duration is less than half of TCP's. One possible reason for QUIC's significant advantage in terms of average request duration could be that QUIC is optimized for minimizing latency. In Figure 4.7 we see that QUIC uses more packets to send out its requests than TCP; by sending more packets, requests and responses do not have to wait as long to be sent once they are ready, thereby reducing latency. TCP, on the other hand, prioritizes the sending of larger packets to reduce overhead, but at the cost of increased latency. We have used the default settings of both TCP and QUIC, but it is also possible to tune TCP, as we mention in Section 5.5 where we discuss future work; perhaps the results would then be different. However, a latency reduction by a factor of two is significant, and it is likely that QUIC would still have a significant advantage even if TCP were tuned for lower latency.
The authors of 3GPP's Technical Report 29.893 [37] make the following remark on the difference between using QUIC with a short-lived or a long-lived connection model, in its Section 5.4.4 (emphasis added):

When long lived connection is used in between NFs which uses SBI for communication even if QUIC provides faster connection it will not impact the performance of the inter-NF communication significantly as only the initial request for a connection will see any improvement. However, if short lived connection models are used where NF-NF connection will be created for each request and response pair, QUIC will provide a faster experience of executing task via an HTTP request/response as one or two RTT are saved.

We would like to point out that even if a short-lived connection model would probably benefit QUIC even more than a long-lived one, our results indicate that QUIC will also have significantly lower latency when using the long-lived connection model.

5.2 QUIC has a Significant Advantage Also in QoS Networks

Hypothesis 1 states that QUIC allows the 5G Core Network to tolerate a significantly higher Packet Loss Rate than TCP. As discussed in Section 2.4.2, most studies seem to conclude that the greatest benefits of QUIC are seen under poor network conditions (high PLR, high RTT and low bandwidth). However, we used 0% PLR and 20 ms RTT as our baseline, and QUIC performed significantly better than TCP in almost every case. In fact, QUIC's average request duration was a third of TCP's at 0% PLR and 1.6 times shorter at 1% PLR. Our results therefore show that QUIC is a better candidate, in terms of latency, not only under poor network conditions but also under optimal network conditions. This speaks in favor of using QUIC also in a network with QoS guarantees such as the 5G Core Network.

5.3 Performance Difference is Negligible With Few Multiplexed GET Requests

So far, we have pointed out the advantages QUIC presents with respect to the 5G Core Network. However, it is also important to mention a few cases where the performance difference between QUIC and TCP is negligible or where QUIC performs worse.

We found that when using an HTTP GET request with a zero-byte payload, the results were very different from other cases, provided that we also multiplexed a low number of requests – 10 in our case. Under those particular circumstances, the difference in average request duration between QUIC and TCP was negligible, although the TCP 95th percentile was still significantly higher. However, by changing to an HTTP POST request with a payload of at least 25 bytes, or by issuing more than 10 HTTP GET requests, QUIC again became significantly better than TCP. The details of this behavior are accounted for in Sections 4.1.1 and 4.1.2. This is a special case where the difference between TCP and QUIC is negligible in terms of average request duration, although it is not negligible for time-critical applications, since the 95th percentile of TCP is still significantly higher.

5.4 Impacts of the Method used to Measure Request Duration

We should also reflect on the choice of method for measuring request duration. As explained in Section 3.4, to measure the duration of a request, we took the RX timestamp of the last frame of a response and subtracted from it the TX timestamp of the first frame of its corresponding request. This method, however, has some downsides. When multiplexing requests, several requests are sent in the same packet, as explained in Section 4.1.1; yet, the TX timestamp for all of these requests will be the same. What we do not measure specifically is the processing time of the different protocol layers – application, transport, network and link. However, we do not believe that this difference would be large enough to affect the results in a significant way. An advantage of this method is that the testbed can be used with other protocol implementations without needing many changes.

5.5 Limitations & Future Work

It is virtually impossible to model a full-scale 5G Core Network; a study like this will therefore necessarily be limited in scope. In our case, the focus has been on two Network Functions communicating with each other, using request sizes common to the 5G Core Network and different levels of multiplexing. There are several ways in which this line of research could be continued. One would be to use more than two nodes and mimic a real 5G Core call flow, with messages that actual Network Functions would exchange in real scenarios. It would also be of interest to perform benchmarks with a long-lived connection in which requests are sent stochastically rather than all at the same time, and to see whether this affects the results, especially since our results seem to indicate a more favorable performance of QUIC under simplified long-lived connections than perhaps was expected. It would likewise be of interest to study QUIC's benefits under a strict short-lived connection model, where the connection closes immediately after a single request has completed.

We were able to gather information on request sizes in the 5G Core Network; however, we could not find large amounts of data to statistically support these ranges. An important reason for this is that much of such data is proprietary. Another reason is that the data itself is not readily available but needs to be collected and packaged. Nevertheless, in discussion with several 5G Core experts, these ranges were considered reasonable assumptions.

In our study, we deliberately focused on examining the default behavior of TCP and QUIC. Our results show that QUIC has half the average request duration of TCP in a majority of cases. QUIC is optimized for lowering request latency by sending more packets, whereas TCP is optimized towards filling packets more, thereby delaying requests and responses. However, it is possible to change this behavior in TCP by setting the TCP_NODELAY socket option, which bypasses the so-called Nagle's algorithm [56]. It would be interesting to see whether TCP's average request duration decreases when this option is enabled. One could also investigate more deeply the design rationale for having QUIC send more packets than TCP by default, in order to gain an even better understanding of QUIC.

Hypothesis 2 was not investigated due to the necessity of limiting the scope of the study in a reasonable way; instead, more emphasis was put on investigating the influence of the sub-parameters – number of multiplexed requests, request payload and response payload. This was deemed necessary since these parameters also had to be set in order to examine the effect of PLR and RTT.
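The TCP_NODELAY option mentioned above is set per socket. As a minimal sketch (standard BSD socket API, shown here in Python), disabling Nagle's algorithm amounts to:

```python
import socket

# Minimal sketch: disable Nagle's algorithm on a TCP socket via the
# standard TCP_NODELAY option. With the option set, small writes are
# sent immediately instead of being coalesced while waiting for ACKs,
# trading fuller packets for lower per-request latency.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Confirm the option took effect (getsockopt returns a nonzero value).
assert sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

A benchmark variant of our testbed would set this option on the HTTP/2 client and server sockets before connecting; whether the implementations used here expose it directly would need to be checked per library.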

5.6 Conclusions

Most of the literature states that QUIC's lower latency is the result of its connection establishment being at least one RTT shorter than TCP's. While this is an important factor in general, our results show that it is not the only factor in achieving a significantly lower latency than TCP: QUIC reduces average request latency to less than half of that of TCP in a majority of cases using the simplified long-lived connection model. This indicates that QUIC would perform significantly better than TCP also under the long-lived connection model, and not just the short-lived connection model, although further investigation is needed for use cases where multiple requests are issued at different points in time rather than all at once. A majority of the literature also states that QUIC excels under poor network conditions with high PLR, high RTT and low bandwidth. However, our results show that QUIC reduces latency to a third of that of TCP and doubles the throughput also under very favorable network conditions, where the PLR is 0% and the RTT is 20 ms.

Both of these observations are important with regard to the 5G Core Network SBA, since one topic of discussion in 3GPP's Technical Report 29.893

[37] concerns the benefits of using QUIC under the short-lived versus the long-lived connection model. If QUIC performs significantly better in both cases, that makes it an even stronger candidate for future standards. Since the 5G Core Network has QoS enabled, it is of great interest to see that QUIC excels also under very good network conditions. Like other studies, we note that QUIC's advantages increase with higher RTT. Given certain latency and throughput requirements, QUIC therefore allows Network Functions to be placed further away from each other. As such, it is an enabler of the future 5G Core Network SBA.

A particular finding that might be of interest from an implementation point of view is that when multiplexing 10 HTTP GET requests with zero-byte payloads, QUIC does not show significant advantages in terms of average request duration, although the 95th percentile is still significantly lower with QUIC than with TCP. From a performance point of view, there is no question that QUIC performs significantly better than TCP, both in terms of latency and throughput. This also holds for the specific cases of the 5G Core Network SBA, where requests have small payloads and the network quality is high.

Bibliography

[1] Patrick Kwadwo Agyapong et al. "Design Considerations for a 5G Network Architecture". In: IEEE Communications Magazine 52.11 (Nov. 2014), pp. 65–75. issn: 1558-1896. doi: 10.1109/MCOM.2014.6957145.
[2] Sarp Köksal. Core Network Evolution (3G vs. 4G vs. 5G). url: https://medium.com/@sarpkoksal/core-network-evolution-3g-vs-4g-vs-5g-7738267503c7 (visited on 02/11/2020).
[3] Specification # 23.501. url: https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3144 (visited on 01/23/2020).
[4] Stefan Rommer et al. 5G Core Networks. Academic Press, 2020. isbn: 978-0-08-103009-7.
[5] Ka-Cheong Leung, Victor O. K. Li, and Daiqin Yang. "An Overview of Packet Reordering in Transmission Control Protocol (TCP): Problems, Solutions, and Challenges". In: IEEE Transactions on Parallel and Distributed Systems 18.4 (Apr. 2007), pp. 522–535. issn: 2161-9883. doi: 10.1109/TPDS.2007.1011.
[6] RFC 8446 – The Transport Layer Security (TLS) Protocol Version 1.3. url: https://datatracker.ietf.org/doc/rfc8446/ (visited on 02/10/2020).
[7] Jim Roskind. "Multiplexed Stream Transport over UDP". url: https://docs.google.com/document/d/1RNHkx_VvKWyWg6Lr8SZ-saqsQx7rFV-ev2jRFUoVD34/edit (visited on 01/21/2020).
[8] What's Happening with QUIC. url: https://www.ietf.org/blog/whats-happening-quic/ (visited on 04/24/2020).


[9] Draft-Ietf-Quic-Transport-25 – QUIC: A UDP-Based Multiplexed and Secure Transport. url: https://datatracker.ietf.org/doc/draft-ietf-quic-transport/ (visited on 02/20/2020).
[10] HTTP/3: The Past, the Present, and the Future. url: https://blog.cloudflare.com/http3-the-past-present-and-future/ (visited on 02/11/2020).
[11] "QUIC Deployment Experience at Google". url: https://www.ietf.org/proceedings/96/slides/slides-96-quic-3.pdf (visited on 02/11/2020).
[12] Yong Cui et al. "Innovating Transport with QUIC: Design Approaches and Research Challenges". In: IEEE Internet Computing 21.2 (Mar. 2017), pp. 72–76. issn: 1941-0131. doi: 10.1109/MIC.2017.44.
[13] Mike Bishop. Hypertext Transfer Protocol Version 3 (HTTP/3). url: https://tools.ietf.org/html/draft-ietf-quic-http-24 (visited on 01/21/2020).
[14] Rob Austein. The Rise of the Middle and the Future of End-to-End: Reflections on the Evolution of the Internet Architecture. url: https://tools.ietf.org/html/rfc3724 (visited on 02/11/2020).
[15] R. Braden. Requirements for Internet Hosts – Communication Layers. url: https://tools.ietf.org/html/rfc1122 (visited on 02/11/2020).
[16] Prasenjeet Biswal and Omprakash Gnawali. "Does QUIC Make the Web Faster?" In: 2016 IEEE Global Communications Conference (GLOBECOM). Dec. 2016, pp. 1–6. doi: 10.1109/GLOCOM.2016.7841749.
[17] Sarah Cook et al. "QUIC: Better for What and for Whom?" In: 2017 IEEE International Conference on Communications (ICC). May 2017, pp. 1–6. doi: 10.1109/ICC.2017.7997281.
[18] Péter Megyesi, Zsolt Krämer, and Sándor Molnár. "How Quick Is QUIC?" In: 2016 IEEE International Conference on Communications (ICC). May 2016, pp. 1–6. doi: 10.1109/ICC.2016.7510788.

[19] 3G4G. "Advanced: 5G Service Based Architecture (SBA)". url: https://www.slideshare.net/3G4GLtd/advanced-5g-service-based-architecture-sba-87217349 (visited on 04/21/2020).
[20] Jiegang Lu et al. "5G Enhanced Service-Based Core Design". In: 2019 28th Wireless and Optical Communications Conference (WOCC). May 2019, pp. 1–5. doi: 10.1109/WOCC.2019.8770604.
[21] Specification # 29.891. url: https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3176 (visited on 04/24/2020).
[22] Jim Kurose and Keith Ross. Computer Networking: A Top-Down Approach. 7th ed. Pearson, 2017. isbn: 978-1-292-15359-9.
[23] E. Rescorla. The Transport Layer Security (TLS) Protocol Version 1.3. url: https://www.rfc-editor.org/rfc/rfc8446.html (visited on 04/25/2020).
[24] J. Postel. User Datagram Protocol. url: https://tools.ietf.org/html/rfc768 (visited on 04/25/2020).
[25] J. Postel. Transmission Control Protocol. url: https://tools.ietf.org/html/rfc793 (visited on 04/25/2020).
[26] Kanon Sasaki et al. "TCP Fairness Among Modern TCP Congestion Control Algorithms Including TCP BBR". In: 2018 IEEE 7th International Conference on Cloud Networking (CloudNet). Oct. 2018, pp. 1–4. doi: 10.1109/CloudNet.2018.8549505.
[27] Roy Fielding and Julian Reschke. Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content. url: https://tools.ietf.org/html/rfc7231#ref-REST (visited on 04/26/2020).
[28] Hugues de Saxcé, Iuniana Oprescu, and Yiping Chen. "Is HTTP/2 Really Faster than HTTP/1.1?" In: 2015 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). Apr. 2015, pp. 293–299. doi: 10.1109/INFCOMW.2015.7179400.

[29] Mike Belshe, Martin Thomson, and Roberto Peon. Hypertext Transfer Protocol Version 2 (HTTP/2). url: https://tools.ietf.org/html/rfc7540#section-8.1 (visited on 04/27/2020).
[30] Martin Thomson and Jana Iyengar. QUIC: A UDP-Based Multiplexed and Secure Transport. url: https://tools.ietf.org/html/draft-ietf-quic-transport-27 (visited on 04/27/2020).
[31] Randall Stewart. Stream Control Transmission Protocol. url: https://tools.ietf.org/html/rfc4960 (visited on 04/27/2020).
[32] Draft-Ietf-Quic-Http-27 – Hypertext Transfer Protocol Version 3 (HTTP/3). url: https://tools.ietf.org/html/draft-ietf-quic-http-27 (visited on 04/27/2020).
[33] A QUIC Update on Google's Experimental Transport. url: https://blog.chromium.org/2015/04/a-quic-update-on-googles-experimental.html (visited on 02/11/2020).
[34] Jana Iyengar and Ian Swett. QUIC Loss Detection and Congestion Control. url: https://tools.ietf.org/html/draft-ietf-quic-recovery-27 (visited on 04/27/2020).
[35] Arash Molavi Kakhki et al. "Taking a Long Look at QUIC: An Approach for Rigorous Evaluation of Rapidly Evolving Transport Protocols". In: Proceedings of the 2017 Internet Measurement Conference (IMC '17). London, United Kingdom: Association for Computing Machinery, Nov. 1, 2017, pp. 290–303. isbn: 978-1-4503-5118-8. doi: 10.1145/3131365.3131368. url: https://doi.org/10.1145/3131365.3131368 (visited on 02/12/2020).
[36] Jesus de Gregorio. Jdegre/5GC_APIs. Aug. 20, 2020. url: https://github.com/jdegre/5GC_APIs (visited on 08/20/2020).
[37] Specification # 29.893. url: https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3494 (visited on 12/16/2019).
[38] Mirja Kühlewind and Zaheduzzaman Sarker. Discovery Mechanism for QUIC-Based, Non-Transparent Proxy Services. url: https://tools.ietf.org/html/draft-kuehlewind-quic-proxy-discovery-01 (visited on 04/27/2020).
[39] Cloudflare/Quiche. Cloudflare, Apr. 27, 2020. url: https://github.com/cloudflare/quiche (visited on 04/27/2020).

[40] Facebookincubator/Mvfst. Facebook Incubator, Apr. 26, 2020. url: https://github.com/facebookincubator/mvfst (visited on 04/27/2020).
[41] Qualys SSL Labs – SSL Pulse. url: https://www.ssllabs.com/ssl-pulse/ (visited on 04/24/2020).
[42] Ip-Netns(8) – Linux Manual Page. url: https://man7.org/linux/man-pages/man8/ip-netns.8.html (visited on 08/16/2020).
[43] Networking:Netem [Wiki]. url: https://wiki.linuxfoundation.org/networking/netem (visited on 08/16/2020).
[44] Iptables(8) – Linux Man Page. url: https://linux.die.net/man/8/iptables (visited on 08/16/2020).
[45] Manpage of TCPDUMP. url: https://www.tcpdump.org/manpages/tcpdump.1.html (visited on 08/16/2020).
[46] Wireshark · Go Deep. url: https://www.wireshark.org/ (visited on 08/16/2020).
[47] Tshark – The Wireshark Network Analyzer 3.2.6. url: https://www.wireshark.org/docs/man-pages/tshark.html (visited on 08/16/2020).
[48] IETF QUIC WG. url: https://github.com/quicwg (visited on 08/16/2020).
[49] LiteSpeed Tech. Litespeedtech/Lsquic. Aug. 16, 2020. url: https://github.com/litespeedtech/lsquic (visited on 08/16/2020).
[50] About Us – LiteSpeed Technologies. url: https://www.litespeedtech.com/company/about (visited on 08/16/2020).
[51] HTTP3 LiteSpeed vs. Nginx Benchmark Tests – LiteSpeed Blog. Nov. 25, 2019. url: https://blog.litespeedtech.com/2019/11/25/http3-litespeed-vs-nginx/ (visited on 08/16/2020).
[52] Nghttp2: HTTP/2 C Library – Nghttp2.Org. url: https://nghttp2.org/ (visited on 08/16/2020).
[53] NGINX | High Performance Load Balancer, Web Server, & Reverse Proxy. url: https://www.nginx.com/ (visited on 08/16/2020).
[54] Cloudflare/Quiche. url: https://github.com/cloudflare/quiche (visited on 08/16/2020).
[55] Cloudflare/Quiche. Cloudflare, Aug. 15, 2020. url: https://github.com/cloudflare/quiche (visited on 08/16/2020).

[56] J. Nagle. Congestion Control in IP/TCP Internetworks. url: https://tools.ietf.org/html/rfc896 (visited on 09/05/2020).

TRITA-EECS-EX-2020:867

www.kth.se