Evaluating the Impact of IPv6 on a High Frequency Trading System

Nereus Lobo, Vaibhav Malik, Chris Donnally, Seth Jahne, Harshil Jhaveri

[email protected] , [email protected] , [email protected] , [email protected] , [email protected]

A capstone paper submitted as partial fulfillment of the requirements for the degree of Masters in Interdisciplinary at the University of Colorado, Boulder, 4 May 2012. Project directed by Dr. Pieter Poll and Professor Andrew Crain.

1 Introduction

Employing latency-dependent strategies, financial trading firms rely on trade execution speed to obtain a price advantage on an asset in order to earn a profit of a fraction of a cent per asset share [1]. Through successful execution of these strategies, trading firms are able to realize profits on the order of billions of dollars per year [2]. The key to success for these trading strategies is ultra-low latency processing and networking systems, henceforth called High Frequency Trading (HFT) systems, which enable trading firms to execute orders with the necessary speed to realize a profit [1]. However, competition from other trading firms magnifies the need to achieve the lowest latency possible. A 1 µs latency disadvantage can result in unrealized profits on the order of $1 million per day [3]. For this reason, trading firms spend billions of dollars on their data center infrastructure to ensure the lowest propagation delay possible [4]. Further, trading firms have expanded their focus on latency optimization to other aspects of their trading infrastructure, including application performance, inter-application messaging performance, and network infrastructure modernization [5]. As new networking technologies are introduced into the market, it is imperative that they be evaluated for their impact on end-to-end system latency.

The objective of this research is to determine whether emerging networking technologies have a significant impact on HFT system latency that could result in a competitive disadvantage. Specifically, this paper develops a latency-optimized HFT system performance model and contrasts latency performance between IPv4 and IPv6 through this model. For the purposes of measurement and evaluation, this paper defines a latency difference of 1 µs as significant, based on the magnitude of forgone profit that can result from such a disadvantage. Our research contributes a latency-optimized end-to-end HFT system model for comparing latency performance with other HFT systems and platforms. Additionally, this paper provides recommendations for HFT system optimization.

This research is important because the profitability of low latency trading strategies is highly coupled to end-to-end HFT system latency performance, where a small change in latency performance can result in a large swing in realized profits. Further, IP is a foundational networking technology implemented across an HFT system. Changing the IP implementation from IPv4 to IPv6 alters the latency characteristics of these devices due to additional overhead processing and serialization delays. While each individual change in latency is arguably small, these small latencies, accumulated over an end-to-end HFT system, may result in a significant 1 µs latency increase.

The salient feature of an HFT system is the speed with which it can execute a trade. In an HFT system, pre-trade analytics determine whether a trade should be executed. The latency of pre-trade analytics, which occurs at the application layer inside an HFT market analysis engine, is beyond the scope of this paper. Additionally, inter-software messaging and trade execution software processing latency are beyond the scope of this paper. Instead, this research focuses solely on the trade execution path latency performance across the processing platform and networking devices that provide connectivity to the financial exchange.
The main audiences for this research are financial trading firms that employ low latency trading strategies and network service providers that service these firms. The findings from this research serve to assist with a financial trading firm’s HFT system IP modernization planning. Further, the resulting latency-optimized HFT system model can be used by trading firms and network service providers as a point of comparison to assess the performance of their HFT systems and to identify latency optimization opportunities for those systems.

2 Assumptions

This paper makes two assumptions to establish a common networking interface across an HFT system. The first assumption is the message size for a financial trading firm’s trade execution order is 256 bytes. This assumption is based on the largest average trade order message size used by a financial exchange [6]. The second assumption is that all processing and networking devices implement a 10G networking adapter, which is derived from trading firms’ data center equipment expenditures [4].

3 Prior Study

Our initial literature review served to confirm the novelty of our research question. We were able to find general latency studies for specific computing and networking technologies, but did not find any studies specific to the latency of an end-to-end HFT system. We were also able to find general studies on latency issues caused by IPv6, especially as they relate to transition technologies; however, we were unable to find any studies that described potential IPv6 latency issues in an HFT network.

4 Methodology and Results

To determine the latency impact of IPv6 on an HFT system, we decomposed the system into three segments: the trade execution computing platform segment, the IP network segment, and the optical transport segment. Next, we developed latency models to identify sources of latency for the processing, networking, and transmission platforms used in the HFT system segments. From there, a literature survey was conducted to identify high performance platforms likely deployed in HFT systems and to identify where differences between IPv4 and IPv6 implementations would impact latency. With the HFT system segment platforms identified, the literature survey was expanded to obtain latency performance measures for each platform and to assess any performance differential between IPv4 and IPv6. Finally, with consideration to the latency contributions of IPv4 and IPv6, platform latency performances are compiled into the HFT system model to establish a latency-optimized performance baseline.

4.1 Trade Execution Computing Platforms

Trade execution computing platforms are simply high performance computers that run a financial trading firm's trade execution software application [7]. To identify potential sources of latency on the computing platform, an IP packet processing model was developed to illustrate the different paths through an operating system (OS) available to trade execution software executing a trade order. The first path is through the OS's networking stack, where IP protocol overhead processing occurs along with the processing of other protocols. The second path is through a TCP Offload Engine (TOE), which instantiates the networking stack on the platform's network adapter. The computing platform IP packet processing model is illustrated in Figure 1. From this model, the pertinent computing platform processing aspects that contribute to trade execution latency are the OS networking stack, the TOE, IPSec, and IPv6 transition protocol processing. Further, HFT literature identifies that trade execution computing platform deployment configurations are non-standard and can be implemented with a variety of OS and hardware configurations [7]. Therefore, the literature survey focused on locating latency performance measures for High Performance Computing (HPC) platforms running modern versions of either the Windows or Linux OS configured with a latency-optimized networking stack.

[Figure 1 depicts the application, socket, transport, network, and device driver layers of the OS networking stack alongside the NIC hardware, with TCP off-loading, zero-copy, IPSec, and IPv6 transition processing paths and interrupt coalescing at the network adapter.]

Figure 1: Computing Platform IP Packet Processing Model

The literature survey produced two independent research studies identifying round-trip time (RTT) latency performance on similar HPC platforms. The first study was conducted on a 3 GHz Intel Xeon 5160 processor with the following configuration: Linux 2.6.18, a latency-optimized network stack, and a Chelsio T110 10 Gigabit Ethernet (GigE) adapter. The platform's RTT latency performance was 10.2 µs [8]. The second study was conducted on a 2.2 GHz AMD Opteron processor with the following configuration: Linux 2.6.6, a latency-optimized network stack, and a Chelsio T110 10GigE adapter. The platform's RTT latency performance was 10.37 µs [9]. Further, the first study contrasted the platform's OS networking stack latency performance against the TOE on the Chelsio T110 10GigE adapter; the TOE provided a 1.3 µs latency improvement, which lowered the total platform latency performance to 8.9 µs [9]. In contrast, non-optimized computing platforms and networking stacks have latency performance on the order of 40 µs, which is significantly higher than that of optimized platforms [10][11]. Finally, the computing platform's total latency performance is decomposed into transmit and receive latency performance. Utilizing the TOE, the computing platform's transmit latency is 2.9 µs and its receive latency is 6.0 µs [8][9].

The measured IPv4 processing latency for the optimized networking stack on the Intel platform, which utilized the network adapter for IPv4 header checksum offloading, is 450 ns [8]. While latency performance data for IPv6 is unavailable from the platform study, investigation into the Linux 2.6 network stack shows that increased processing is needed for IPv6 socket creation due to the inclusion of ICMP, NDP, and IGMP functions in IPv6 [12]. An independent study contrasting IPv4 and IPv6 socket creation times confirms this finding [13]. However, a financial firm holds a persistent connection to a financial exchange during trading hours, so socket creation latency is not a factor in trade order execution speed. Once the socket is created, and assuming IPv4 checksum offloading, the main processing performed at the IP layer is fragmenting large TCP segments [14]. Based on the assumed 256 byte message size, the IP layer will not need to fragment the TCP segment. Therefore, the OS processing demands for IPv6 are equivalent to those for IPv4 and will not incur a significant latency penalty. However, a latency penalty is incurred when transmitting an IPv6 packet due to the serialization delay from the larger header size: the IPv6 serialization delay increases the computing platform latency performance by 0.016 µs. Table 1 identifies the trade execution computing platform latency performance values for IPv4 and IPv6 used in the latency-optimized HFT system model.

Table 1: Trade Execution Computing Platform Latency Performance Model

Trade Execution Computing Platform   IPv4 Latency (µs)   IPv6 Latency (µs)
Model Tx latency                     2.9                 2.916
Model Rx latency                     6.0                 6.016
Total Latency                        8.9                 8.932
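As a worked illustration of where the 0.016 µs penalty in Table 1 comes from, the following sketch computes the time needed to serialize the assumed 256 byte trade order onto a 10 Gb/s interface. The TCP and Ethernet overhead values are illustrative assumptions rather than figures taken from the cited studies; only the 20 byte difference between the IPv4 and IPv6 headers drives the result.

```python
# Minimal sketch: serialization delay of a 256-byte trade order over 10GigE.
# TCP and Ethernet overhead sizes are assumptions for illustration.
LINE_RATE_BPS = 10e9           # 10GigE line rate
MESSAGE_BYTES = 256            # assumed trade order size [6]
TCP_HEADER_BYTES = 20
ETHERNET_OVERHEAD_BYTES = 18   # MAC header + FCS (preamble and gap omitted)

def serialization_delay_ns(ip_header_bytes: int) -> float:
    """Time to clock the full frame onto the wire, in nanoseconds."""
    frame_bits = (MESSAGE_BYTES + TCP_HEADER_BYTES + ip_header_bytes
                  + ETHERNET_OVERHEAD_BYTES) * 8
    return frame_bits / LINE_RATE_BPS * 1e9

ipv4_ns = serialization_delay_ns(20)   # ~251 ns
ipv6_ns = serialization_delay_ns(40)   # ~267 ns
print(f"IPv4: {ipv4_ns:.1f} ns, IPv6: {ipv6_ns:.1f} ns, delta: {ipv6_ns - ipv4_ns:.1f} ns")
```

Regardless of the framing assumptions, the delta is fixed by the extra 20 header bytes: 160 bits at 10 Gb/s is 16 ns, the 0.016 µs applied to the transmit and receive rows of Table 1.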

Other optional IP technologies that may impact computing platform latency performance are IPSec and IPv6 transition mechanisms. IPSec appends additional headers to IPv4 and IPv6 packets for increased IP communications security [15]. When IPSec is configured to have the lowest impact on latency, i.e., tunnel mode with AES-128 Encapsulating Security Payload (ESP), the Intel platform incurs an additional 1.9 µs latency penalty [8][16]. The IPSec latency penalty is equal across IPv4 and IPv6 [15]. Further, IPSec use is optional in both IPv4 and IPv6, even though its implementation is mandatory in IPv6 [17]. Finally, a survey of IPv6 transition mechanism performance studies exposed the potential for up to a 200 ms latency penalty on non-optimized platforms [18][19].


4.2 IP Network

From the perspective of a financial trading firm, the main purpose of networking devices in an HFT system is to process and route trading orders to the financial exchange as fast as possible. A general architecture for networking devices was constructed to identify potential IP packet processing latency sources [20]. Based on the functions identified in this architecture, the potential latency sources are the networking device's processing delay, fragmentation delay, serialization delay, queuing delay, and checksum delay. This architecture is illustrated in Figure 2.

Figure 2: Networking Platform IP Packet Processing Model

When an IP packet enters a networking device, a processing latency is incurred because the device must read in the IP packet header and make a routing decision based on the IP header address [20]. Additionally, IPv4 packets incur further latency due to verification of the header checksum, which determines whether any header errors were introduced during transmission of the packet [21]. IPv6 does not contain a header checksum and is therefore not subject to this additional checksum latency penalty [16]. This paper combines IPv4 checksum processing into the IPv4 processing latency for evaluation purposes. The processing delay for a networking device is defined as the inverse of the device's packet processing speed [22]. Once the path through the networking device is determined, the IP packet is sent to the queue on the selected egress port for transmission. The queuing delay is the amount of time a packet is held prior to transmission and is determined by the queuing algorithm efficiency and the egress port buffer size [23]. The queuing delay for a networking device is defined as the product of the network load and the store-and-forward switching latency [24]. As the IP packet exits the networking device's egress port, the packet is serialized and transmitted one bit at a time onto the physical transmission medium. The serialization delay is defined as the sum of the message size and the networking overhead, in bits, divided by the data rate [24]. The last source of latency identified is fragmentation delay, where a transmitting networking device fragments a large IP packet into smaller packets for transmission over a path with a smaller Maximum Transmission Unit (MTU). The receiving networking device must wait until all packet fragments are received before a routing decision can be made. Based on the assumed trade order packet size, IP fragmentation will not occur in an HFT system [6]. Additionally, it is worth noting that while fragmentation is supported in IPv4, IPv6 implements path MTU discovery, which eliminates the need for packet fragmentation on intervening devices such as routers [16]. From this model, the relevant networking device processing aspects that contribute to trade execution latency are the processing delay, queuing delay, and serialization delay.

The main objective of the literature survey was to identify latency performance measures for the highest performing networking devices. The router literature survey produced a study comparing the performance of the Cisco NX7000 and Juniper EX8000 series routers, which are marketed as high performance data center routers [25]. Based on the data provided in this study, and using the processing delay calculation from the model, the processing delay for the Cisco NX7000 series router was calculated to be 16.67 ns for IPv4 and 33.3 ns for IPv6. The processing delay for the Juniper EX8000 series router was calculated to be 8.3 ns for both IPv4 and IPv6. To find the worst case queuing and serialization latency for a trade execution order, the largest average message size of 256 bytes is used in the calculations [6]. The queuing latency is 66.2 ns for IPv4 and 66.8 ns for IPv6. The serialization latency over a 10GigE network adapter is 251 ns for IPv4 and 267 ns for IPv6. Based on the latency sources identified in the model, the aggregate latency performance for the Cisco NX7000 is 333.9 ns for IPv4 and 367.1 ns for IPv6, and the aggregate latency performance for the Juniper EX8000 is 325.5 ns for IPv4 and 342.1 ns for IPv6. Therefore, the latency performance difference between IPv4 and IPv6 is not significant.

However, a large discrepancy exists between the networking device IP packet processing model and the measured overall device latency. From the study, the aggregate latency performance of the Juniper EX8000 ranges between 8 and 15 µs and that of the Cisco NX7000 ranges between 20 and 40 µs [25]. Investigation into the difference between the modeled and measured latency performance of the Cisco and Juniper routers is left to further study. Instead, the literature survey expanded in scope to compare the latency performance of TCP/IP on latency-optimized technologies such as Infiniband and Myrinet. From the literature survey, one study measured the latency performance of TCP/IP at 60 µs, of Myrinet at 8.3 µs, and of Infiniband at 5.3 µs [8]. Another study evaluated networking devices specifically optimized for latency performance; it evaluated the Fujitsu XG1200 10GigE switch and measured its latency performance at 0.45 µs [9]. The latency performance of optimized networking devices and technologies more closely matches the model's calculated latency performance. Table 2 contains the modeled IP network device latency performance values used in the latency-optimized HFT system model.

Table 2: IP Network Device Latency Performance Model

IP Network Device     IPv4 Latency (µs)   IPv6 Latency (µs)
Processing Delay      0.008               0.008
Queuing Delay         0.066               0.067
Serialization Delay   0.251               0.267
Total Latency         0.326               0.342
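The per-device values in Table 2 follow from the three delay definitions above. The sketch below restates those definitions as code; the forwarding rate and network load are illustrative assumptions chosen to approximate the Juniper figures, not vendor specifications.

```python
# Minimal sketch: networking device latency model (processing + queuing + serialization).
# Forwarding rate and load values are illustrative assumptions.
LINE_RATE_BPS = 10e9
FRAME_BYTES = {"IPv4": 256 + 20 + 20 + 18,   # message + TCP + IPv4 header + Ethernet framing
               "IPv6": 256 + 20 + 40 + 18}   # IPv6 header is 20 bytes larger

def processing_delay_s(packets_per_second: float) -> float:
    # Processing delay modeled as the inverse of the packet processing rate [22].
    return 1.0 / packets_per_second

def queuing_delay_s(load: float, store_and_forward_s: float) -> float:
    # Queuing delay modeled as network load times store-and-forward latency [24].
    return load * store_and_forward_s

def serialization_delay_s(frame_bytes: int) -> float:
    # Serialization delay: frame bits divided by the line rate [24].
    return frame_bytes * 8 / LINE_RATE_BPS

for version, frame in FRAME_BYTES.items():
    ser = serialization_delay_s(frame)
    total = processing_delay_s(120e6) + queuing_delay_s(0.25, ser) + ser
    print(f"{version}: {total * 1e9:.0f} ns")   # on the order of 0.32 to 0.34 µs
```

With these assumed parameters the model lands within a few nanoseconds of the Table 2 totals; the dominant term is serialization, which is also where the IPv4/IPv6 difference originates.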

4.3 Optical Network

Financial trading firms are conscious of propagation delay and have been actively reducing it, primarily by co-locating their datacenters with the financial exchanges [4]. Pragmatism unfortunately dictates that trading firms cannot co-locate a datacenter at every exchange and must therefore rely on high-speed long-haul optical networks for many connections. Given the ubiquitous need for trade execution speed, latency sources in the long-haul optical network need to be evaluated for their impact on HFT system performance. The architecture model, illustrated in Figure 3, identifies three categories of potential latency sources present in the optical network.

[Figure 3 depicts an optical route built from ADM/multiplexor terminals, optical amplifiers, and dispersion compensation equipment spaced at 80 to 150 km spans, contributing processing delay at the equipment and distance delay along the fiber.]

Figure 3: Optical Network Latency Model

The first category is distance delay, which is a function of lightwave propagation delay, fiber cable construction, and cable routing directness. Lightwave propagation delay is a function of the speed of light and the fiber's index of refraction. Modern optical fiber cables, including OFS TrueWave Reach and Corning LEAF, exhibit latency characteristics of approximately 4.9 ms over 1,000 km [26][27]. Air Core Fiber is a future generation fiber technology that exhibits significantly improved latency characteristics of 3.36 ms over 1,000 km [28]. Next, fiber cable construction impacts latency because most deployed fiber cables use a loose buffer tube design, where several fiber-filled tubes are twisted around a central core. This twisting increases the lightwave's travel distance with respect to the fiber cable length. Alternatively, fiber cables can be constructed with a single central tube, which reduces the lightwave's travel distance within the cable [29]. The loose buffer tube design can add 59 to 398 µs of latency over 1,000 km, whereas single-tube ribbon cable adds only 15 to 24 µs over 1,000 km [29].

Finally, cable routing directness relates the total installed fiber cable length to the shortest distance between two geographic points. For example, based on the Level 3 fiber route map between New York and Chicago, we estimated a distance of 1,485 km and a resultant latency of 7.28 ms. One approach to reducing the overall fiber distance is to remove installed fiber maintenance coils, which would shorten the route by 36 km and lower the latency to 7.1 ms [30]. Alternatively, new fiber routes could be installed to further reduce the fiber route distances between two points. Spread Networks applied this approach and was able to reduce the fiber distance by 158 km, lowering its network latency to 6.5 ms [31].

To construct an optimized distance latency model, the minimum distance between the New York and Chicago exchanges is calculated from their latitude and longitude coordinates. Using this method, the distance between the New York and Chicago exchanges is 1,040 km. This distance is the optimized lightwave propagation distance and is used, along with the optimized fiber index of refraction and cable construction type, to find the optimized lightwave distance latency. Table 3 identifies the lightwave distance latencies for the optimized and non-optimized models.

Table 3: New York Exchange to Chicago Exchange Lightwave Distance Latency Model

Optical Network Distance   Optimized Latency                Non-Optimized Latency
Refraction Index           3.36 ms per 1,000 km             4.9 ms per 1,000 km
                           (Air Core Fiber)                 (OFS TrueWave)
Helix Factor               0.015 ms per 1,000 km            0.398 ms per 1,000 km
                           (Single Tube Ribbon)             (Loose Buffer Tube)
Routing Directness         1,040 km                         1,485 km
                           (Theoretical minimum)            (Level 3 Fiber Map)
Total Distance Latency     3.510 ms                         7.868 ms
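The Table 3 totals compose multiplicatively with route length. The sketch below reproduces the optimized and non-optimized figures from the per-1,000 km values cited above; the values themselves are taken from the table rather than re-derived from fiber parameters.

```python
# Minimal sketch: lightwave distance latency = (propagation + helix penalty) per km x route length.
def distance_latency_ms(propagation_ms_per_1000km: float,
                        helix_ms_per_1000km: float,
                        route_km: float) -> float:
    return (propagation_ms_per_1000km + helix_ms_per_1000km) * route_km / 1000.0

# Air Core Fiber, single-tube ribbon, theoretical minimum New York to Chicago route.
optimized = distance_latency_ms(3.36, 0.015, 1040)
# OFS TrueWave, loose buffer tube, Level 3 fiber route.
non_optimized = distance_latency_ms(4.9, 0.398, 1485)
print(f"Optimized: {optimized:.3f} ms, Non-optimized: {non_optimized:.3f} ms")
# -> Optimized: 3.510 ms, Non-optimized: 7.868 ms (Table 3 totals)
```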

The second optical network latency category is the deployed optical transmission systems, which include optical amplifiers, Add/Drop Multiplexers (ADMs), muxponders, dispersion compensation modules (DCMs), and Forward Error Correction (FEC). For optical amplifiers, the implementation of an Erbium Doped Fiber Amplifier (EDFA) adds approximately 30 m of erbium doped fiber per amplifier, which serves as an all-optical amplification medium for the optical signal. Over 1,000 km, approximately 10 EDFAs are needed, which would increase the overall fiber length by 300 m and result in an additional latency of 1 µs [32]. However, Raman amplifiers are able to amplify the optical signal without introducing additional fiber length to the route. For muxponders, which adapt layer 2 networking protocols to and from the optical medium, the processing and queuing latency for adapting a 10GigE input onto the layer 1 optical network is 6 µs [33]. ADMs, which perform a similar adaptation function for SONET networks, did not have any 10GigE latency data available. Next, DCMs correct chromatic dispersion, which would otherwise render the optical signal unintelligible to the receiver. The latency characteristics of dispersion compensation are technology dependent. Based on Level 3's New York to Chicago fiber route, older style DCMs on SMF-28e fiber increase the latency by 1.09 ms [29]. Alternatively, along the same route, current generation coherent compensation, based on specialized ASICs/DSPs, reduces the latency penalty for dispersion compensation to 3 µs. Finally, FEC processing, which is employed to improve the optical signal-to-noise ratio via coding gain, incurs a latency penalty ranging from 15 µs to 150 µs [28][34]. Table 4 identifies the optical equipment processing latencies for the optimized and non-optimized models.

The final latency category is the serialization of the optical protocol onto the fiber, which is impacted by the overhead differences in the IP header and the choice of layer 2 adaptation protocol. To contrast the latency performance between IPv4 and IPv6, a 256 byte packet was modeled through an OC-192c (10G) Packet over SONET (POS) link and a 10G optical Ethernet link. For the 10G POS link, we calculated the latency performance for IPv4 at 246 ns and for IPv6 at 263 ns. For the 10G optical Ethernet link, we calculated the latency performance for IPv4 at 254 ns and for IPv6 at 270 ns. Table 5 identifies the optical network IP serialization latencies for the optimized and non-optimized models.


Table 4: Optical Network Equipment Processing Latency Model

Optical Network Equipment       Optimized Latency     Non-Optimized Latency
Optical Amplifier               0.0 µs (Raman)        1.5 µs (EDFA)
Terminal Equipment Processing   6.0 µs (ADM)          6.0 µs (Muxponder)
Dispersion Compensation         3.0 µs (ASIC/DSP)     1,090.0 µs (DCM)
FEC Processing                  15.0 µs               150.0 µs
Total Processing Latency        24.0 µs               1,247.5 µs

Table 5: Optical Network IP Serialization Latency Model

                                   Optimized (10G POS)             Non-Optimized (10G Ethernet)
Optical Network IP Serialization   IPv4 (µs)       IPv6 (µs)       IPv4 (µs)       IPv6 (µs)
                                   0.246           0.263           0.254           0.270

Table 6 identifies the composite optimized optical network model comprised of the optimized lightwave distance latency, optical equipment processing latency, and IP serialization latency. The optimized optical network latency performance values are used in the latency-optimized HFT system model.

Table 6: Composite Optimized Optical Network Latency Performance Model

Optical Network Latency   IPv4 Model Latency (µs)   IPv6 Model Latency (µs)
Lightwave distance        3,510.00                  3,510.00
Equipment processing      24.00                     24.00
IP serialization          0.25                      0.26
Total                     3,534.25                  3,534.26
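For clarity on units, the composite figures in Table 6 simply convert the optimized distance latency from milliseconds to microseconds and add the equipment processing and serialization terms, as in the short sketch below (values carried over from Tables 3 through 5).

```python
# Minimal sketch: composite optimized optical network latency (µs), per Table 6.
distance_us = 3.510 * 1000                          # optimized distance latency, ms -> µs (Table 3)
equipment_us = 24.0                                 # optimized equipment processing (Table 4)
serialization_us = {"IPv4": 0.246, "IPv6": 0.263}   # optimized 10G POS serialization (Table 5)

for version, ser in serialization_us.items():
    print(f"{version}: {distance_us + equipment_us + ser:,.2f} µs")
# -> IPv4: 3,534.25 µs / IPv6: 3,534.26 µs
```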

4.4 HFT System Model

Based on our research, the latency-optimized HFT system model provides the minimum latency achievable between a trading firm and a financial exchange. Some aspects of the HFT system model are not immediately achievable in practice because they are based on either future technologies or direct paths between two geographic locations. This paper examines two trading scenarios for latency-dependent trading strategies to characterize the HFT system model's performance.

The first strategy is low latency arbitrage, which exploits a condition at financial exchanges where trade orders at the front of a queue receive a lower asset price, in the range of a cent or two per share, than orders deeper in the queue [1]. A typical HFT system implementation strategy is to co-locate the trade execution computing platform with the financial exchange [1]. Under this scenario, the transmit latency of a trade order is critical. From the developed HFT system model and our co-location assumption, the HFT system segments that apply to this scenario are the trade execution computing platform and the IP network. Figure 4 illustrates the HFT low latency arbitrage scenario.

[Figure 4 depicts a co-located HFT firm (Firm A) and a remote firm (Firm B) each submitting a buy order to the financial exchange; Firm A's order sits ahead of Firm B's in the exchange queue and is filled at a $10.00 stock price versus $10.01.]

Figure 4: High Frequency Trading Scenario – Low Latency Arbitrage

For the computing platform, the transmit latency for IPv4 is 2.9 µs and for IPv6 is 2.916 µs [8][9]. For the IP network, the transmit latency for IPv4 is 326 ns and for IPv6 is 342 ns. Therefore, based on the developed HFT system latency performance model for this scenario, the total latency for IPv4 is 3.226 µs and for IPv6 is 3.258 µs. Table 7 contains the latency-optimized HFT system model values for IPv4 and IPv6. Under the low latency arbitrage scenario, the HFT system latency model shows a non-significant latency performance penalty for IPv6 network implementations.

Table 7: Latency Optimized HFT System Model for Low Latency Arbitrage Scenario

HFT System Model                     IPv4 Latency (µs)   IPv6 Latency (µs)
Trade Execution Computing Platform   2.900               2.916
IP Network Device                    0.326               0.342
Total Latency                        3.226               3.258

The second strategy is HFT scalping, where a trading firm identifies, between two distant exchanges, a higher asset bid price on one exchange than the asking price on the other. The firm purchases the asset at the cheaper asking price on one exchange and immediately resells it at the higher bid price on the other, realizing a profit of a few cents per share [1]. A trading firm may capitalize on these scalping opportunities at any financial exchange. Under this scenario, the round trip latency is critical because the trading firm must complete the buy transaction before the selling opportunity disappears. In addition to the round trip latency of the buy transaction, the transmit latency to the distant exchange for the sell transaction must also be considered. From the developed HFT system model, the HFT system segments that apply to this scenario are the trade execution computing platform, the IP network, and the optical network. Additionally, for this scenario we model the HFT system latency performance of a trading firm in New York performing HFT scalping in Chicago. Figure 5 illustrates the HFT scalping scenario.

Figure 5: High Frequency Trading Scenario – Scalping

For the trade execution computing platform, the total latency for IPv4 is 11.8 µs and for IPv6 is 11.848 µs. For the IP network, the total latency for IPv4 is 0.978 µs and for IPv6 is 1.026 µs. For the optical network, the total latency for IPv4 is 3,534.25 µs and for IPv6 is 3,534.26 µs. Therefore, based on the developed HFT system latency performance model for this scenario, the total latency for IPv4 is 3,547.028 µs and for IPv6 is 3,547.134 µs. Table 8 contains the latency-optimized HFT system model values for IPv4 and IPv6. Under the scalping scenario, the HFT system latency model shows a non-significant latency performance penalty for IPv6 network implementations.

Table 8: Latency Optimized HFT System Model for Scalping Scenario

HFT System Model                     IPv4 Latency (µs)   IPv6 Latency (µs)
Trade Execution Computing Platform   11.800              11.848
IP Network Device                    0.978               1.026
Optical Network                      3,534.250           3,534.260
Total Latency                        3,547.028           3,547.134
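To make the scenario composition explicit, the sketch below assembles the segment values from Tables 1, 2, and 6 into the totals of Tables 7 and 8. The segment weights (one platform transmit and one IP device traversal for arbitrage; a buy round trip plus a sell transmit, three IP device traversals, and one optical traversal for scalping) are our reading of how the model tables combine, stated here as an assumption.

```python
# Minimal sketch: composing segment latencies (µs) into the two trading scenarios.
# Segment weights are our reading of the model tables, stated as an assumption.
SEGMENTS = {                         # (IPv4, IPv6) latency in µs
    "platform_tx": (2.9, 2.916),
    "platform_rx": (6.0, 6.016),
    "ip_device":   (0.326, 0.342),
    "optical":     (3534.25, 3534.26),
}

def scenario_total(weights: dict) -> tuple:
    """Sum weighted segment latencies, returning (IPv4, IPv6) totals in µs."""
    ipv4 = sum(SEGMENTS[name][0] * count for name, count in weights.items())
    ipv6 = sum(SEGMENTS[name][1] * count for name, count in weights.items())
    return round(ipv4, 3), round(ipv6, 3)

# Low latency arbitrage: one platform transmit plus one IP device traversal (Table 7).
print("Arbitrage:", scenario_total({"platform_tx": 1, "ip_device": 1}))
# Scalping: buy round trip plus sell transmit, three IP device traversals,
# and one optical network traversal (Table 8).
print("Scalping:", scenario_total({"platform_tx": 2, "platform_rx": 1,
                                   "ip_device": 3, "optical": 1}))
# -> Arbitrage: (3.226, 3.258)   Scalping: (3547.028, 3547.134)
```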


5 Conclusion

Based on 1 µs as the latency of significance for HFT systems, we conclude that a trading firm implementing a latency-optimized native IPv6 HFT system will not incur a latency disadvantage when compared to a native IPv4 implementation. Our results show that among all of the various delay types, the serialization of the additional IP header bits is the prominent performance differentiator between IPv4 and IPv6, producing a performance difference on the order of tens of nanoseconds. Additionally, our results show that the optical layer is the largest source of latency in an HFT network and provides the greatest opportunity for latency optimization.

5.1 Recommendations

For HFT system administrators reviewing their path to IPv6, we offer a few recommendations. First, our research finds that transitional deployments will be significantly disadvantaged; administrators should plan for a direct cutover to a native IPv6 environment. Second, high performance platforms, which are generally hardware based and software optimized, are essential to eliminate any processing delay differential between IPv4 and IPv6. Third, faster interfaces, such as the 10GigE used in our model, are essential to minimize the serialization delay difference between IPv4 and IPv6. Finally, rigorous lab testing, based on the firm's production network, will need to be performed to ensure that platforms and their interoperation do not contain any latency-degrading hardware or software bugs.

5.2 Further Study

During the course of our research, we found several areas that warrant further study. The first area to investigate is the latency performance difference between IPv4 and IPv6 on layer 4 processing platforms, including firewalls and WAN accelerators. While we expect, based on our research findings, that the performance difference is negligible, a systematic study of these platforms would serve to enhance the end-to-end HFT system latency performance model. Next, our research focused on a static model of an HFT system. Dynamic modeling, defined as characterizing latency performance under different load scenarios, presents another opportunity for further study. Due to the cost and complexity of latency-optimized equipment, a study of this nature would need significant funding to be performed in an academic environment. Finally, while HFT firms are currently focused on microsecond optimizations, we project that nanosecond latencies will be emphasized in the future. As latency becomes optimized at the optical layer, or for direct co-location applications, the IPv4 to IPv6 serialization delay differential, even at higher future interface speeds, may again become a question of significance. For this future possibility, further study into the latency advantages and feasibility of Infiniband and Myrinet would be warranted.


References:

[1] A. Golub, "Overview of high frequency trading," presented at the Marie Curie Initial Training Network on Risk Management and Risk Reporting Mid-Term Conference, Berlin, 2011.
[2] C. Reed. (2012, Feb. 27). A fair transaction tax for U.S. stock trading (1st ed.) [Online]. Available: http://www.seekingalpha.com
[3] J. Goldstein. (2010, June 8). The million dollar microsecond (1st ed.) [Online]. Available: http://www.npr.org/
[4] K. McPartland. (2010, June 21). Long distance latency: Straightest and fastest equals profit (1st ed.) [Online]. Available: http://www.tabbgroup.com
[5] M. Rabkin. (2010, Sep. 27). TABB says banks are taking holistic approach to finding weak links within a trade's lifecycle (1st ed.) [Online]. Available: http://www.tabbgroup.com
[6] M. Simpson, "Market data optimized for high performance," presented at the FIA Futures and Options Expo Conference, Chicago, IL, 2005.
[7] "Low latency framework," presented at the Oracle Trading Applications Developer Workshop, London, 2011.
[8] S. Larsen et al., "Architectural breakdown of end-to-end latency in a TCP/IP network," in 19th Intl. Symp. on Computer Architecture and High Performance Computing, Rio Grande do Sul, 2007, pp. 195-202.
[9] W. Feng et al., "Performance characterization of a 10-gigabit ethernet TOE," in 13th Symp. on High Performance Interconnects, 2005, pp. 58-63.
[10] S. Narayan and Y. Shi, "TCP/UDP analysis of Windows operating systems with IPv4 and IPv6," in 2nd Intl. Conf. on Signal Processing Systems, Dalian, 2010, pp. 219-222.
[11] S. Narayan et al., "Network performance evaluation of internet protocols IPv4 and IPv6 on operating systems," in Intl. Conf. on Wireless and Optical Communications Networks, Cairo, 2009, pp. 1-5.
[12] T. Herbert, "Internet protocol version 6 (IPv6)," in The Linux TCP/IP Stack: Networking for Embedded Systems, 1st ed. Hingham, 2004, ch. 11, sec. 11.8, pp. 425-426.
[13] S. Zeadally and L. Raicu, "Evaluating IPv6 on Windows and Solaris," IEEE Internet Computing, 2003, pp. 51-57.
[14] T. Herbert, "Internet protocol version 6 (IPv6)," in The Linux TCP/IP Stack: Networking for Embedded Systems, 1st ed. Hingham, 2004, ch. 11, sec. 11.5.1, pp. 417-420.
[15] S. Kent and R. Atkinson. (1998, Nov.). Security architecture for the internet protocol [Online]. Available: http://www.ietf.org/rfc/rfc2401.txt
[16] H. Niedermayer et al., "The networking perspective of security performance – a measurement study," in 13th GI/ITG Conference on Measuring, Modeling and Evaluation of Computer and Communication Systems, 2006, pp. 1-17.
[17] S. Deering and R. Hinden. (1998, Dec.). Internet protocol, version 6 (IPv6) specification [Online]. Available: http://www.ietf.org/rfc/rfc2460.txt
[18] S. Narayan and S. Tauch, "IPv4-v6 transition mechanisms network performance evaluation on operating systems," in 3rd IEEE Intl. Conf. on Computer Science and Information Technology, Chengdu, 2010, pp. 664-668.
[19] S. Tauch, "Performance evaluation of IP version 4 and IP version 6 transition mechanisms on various operating systems," M.S. thesis, Computing & Technology, Unitec Inst. of Technology, NZ, 2010.
[20] J. Aweya, "IP router architecture: An overview," J. of Systems Architecture, vol. 46, pp. 483-511, 1999.
[21] J. Postel. (1981, Sep.). Internet protocol: DARPA internet protocol specification [Online]. Available: http://www.ietf.org/rfc/rfc791.txt
[22] J. Kurose and K. Ross, "Delay and loss in packet-switched networks," in Computer Networking: A Top-Down Approach Featuring the Internet, Boston: Addison-Wesley, 2000.
[23] Latency on a switched Ethernet network, RuggedCom, 2008 [Online]. Available: http://www.ruggedcom.com/pdfs/application_notes/latency_on_a_switched_ethernet_network.pdf
[24] E. Gamess and N. Morales, "Modeling IPv4 and IPv6 performance in Ethernet networks," International Journal of Computer and Electrical Engineering, vol. 3, no. 2, pp. 285, 2011.
[25] Juniper EX8200 vs. Cisco Nexus 7000, Great Lakes Computer, 2012 [Online]. Available: www.glcomp.com
[26] Corning LEAF optical fiber product information, Corning Incorporated, 2002.
[27] TrueWave REACH fiber, OFS Fitel, LLC, 2011.


[28] P. Schoenau. (2011, Apr. 28). Optical networks for low latency applications [Online]. Available: http://www.a-teamgroup.com/
[29] J. Jay, "Low signal latency in optical fiber networks," in Proc. of the 60th IWCS Conference, Charlotte, NC, 2011, pp. 429-437.
[30] B. Quigley. (2011, May 4). Another building block of low-latency trading: Efficient optical transport [Online]. Available: http://blog.advaoptical.com/another-building-block-low-latency-trading-efficient-optical-transport/
[31] Network map, Spread Networks, 2012 [Online]. Available: http://www.spreadnetworks.com/network-map/
[32] Low latency—how low can you go?, Transmode, 2011 [Online]. Available: http://www.transmode.com/doc_download/262-low-latency-design
[33] Cisco ONS 15454 40Gbps enhanced FEC full band tuneable muxponder cards, Cisco Systems, Inc., 2011.
[34] A sensible low-latency strategy for optical transport networks, Optelian, 2011.
