DEGREE PROJECT IN ELECTRICAL ENGINEERING, SECOND CYCLE, 30 CREDITS STOCKHOLM, SWEDEN 2016

Linux Kernel Packet Transmission Performance in High-speed Networks

CLÉMENT BERTIER

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF INFORMATION AND COMMUNICATION TECHNOLOGY
Kungliga Tekniska Högskolan

Master thesis

Linux Kernel packet transmission performance in high-speed networks

Clément Bertier

August 27, 2016

Abstract

The Linux Kernel protocol stack is getting more and more additions as time goes by. As new technologies arise, more functions are implemented, which might result in a certain amount of bloat. However, new methods have been added to the kernel to circumvent common throughput issues and to maximize overall performance under certain circumstances. To assess the ability of the kernel to produce packets at a given rate, we will use the pktgen tool. Pktgen is a loadable kernel module dedicated to UDP-based traffic generation. Its philosophy is to sit at a low position in the kernel protocol stack to minimize the amount of overhead caused by the usual APIs. As measurements are usually done in packets per second rather than bandwidth, the UDP protocol makes perfect sense to minimize the time spent creating a packet. It has several options which will be investigated, and for further insight its transmission algorithm will be analysed. But software is not just a compiled piece of code, it is a set of instructions run on top of hardware. And this hardware may or may not comply with the design of one's software, making the execution slower than expected or, in extreme cases, not functional at all. This thesis aims to investigate the maximum capabilities of Linux packet transmission in high-speed networks, e.g. 10 or 40 Gigabit per second. To go deeper into the understanding of the kernel behaviour during transmission we will use profiling tools, such as perf and the newly adopted eBPF framework.

Abstract

Linux Kernel protokollstacken blir fler och fler tillägg som tiden går. Som ny teknik uppstår, fler funktioner har genomförts och kan leda till en viss mängd svälla. Men nya metoder har lagts till kärnan för att kringgå vanliga genomströmningproblem och att maximera den totala föreställningar, med tanke på vissa omständigheter. Att fastställa förmågan hos kärnan för att producera paket med en given hastighet, kommer vi att använda pktgen verktyget. Pktgen är en laddbar kärnmodul tillägnad trafikgeneration baserad på UDP. Dess filosofi var att vara i en låg position i kärnan protokollstacken för att minimera mängden av overhead orsakad av vanliga API:er. Som mätningarna görs vanligtvis i paket per sekund i stället för bandbredd, gör UDP-protokollet vettigt att minimera mängden tid på att skapa ett paket. Det har flera alternativ som kommer att undersökas, och för ytterligare insikter sin sändningsalgoritmen kommer att analyseras. Men en programvara är inte bara en kompilerad bit kod, är det en uppsättning instruktioner sprang ovanpå hårdvara. Och den här maskinvaran kan eller inte kan följa med utformningen av en programvara, vilket gör utförandet långsammare än väntat eller i extrema fall även fungerar inte. Denna avhandling syftar till att undersöka de maximala kapacitet Linux paketsändningar i höghastighetsnät, t.ex. 10 gigabit eller 40 Gigabit. För att gå djupare in i förståelsen av kärnan beteende under överföringen kommer vi att använda profilverktyg, som perf och det nyligen antagna ramen eBPF.

Contents

1 Introduction
  1.1 Problem
  1.2 Methodology
  1.3 Goal
  1.4 Sustainability and ethics
  1.5 Delimitation
  1.6 Outline

2 Background
  2.1 Computer hardware architecture
    2.1.1 CPU
    2.1.2 SMP
    2.1.3 NUMA
    2.1.4 DMA
    2.1.5 Ethernet
    2.1.6 PCIe
    2.1.7 Networking terminology
  2.2 Linux
    2.2.1 OS Architecture design
    2.2.2 /proc pseudo-filesystem
    2.2.3 Socket Buffers
    2.2.4 xmit_more API
    2.2.5 NIC drivers
    2.2.6 Queuing in the networking stack
  2.3 Related work – Traffic generators
    2.3.1 iPerf
    2.3.2 KUTE
    2.3.3 PF_RING
    2.3.4 Netmap
    2.3.5 DPDK
    2.3.6 Moongen
    2.3.7 Hardware solutions
  2.4 Pktgen
    2.4.1 pktgen flags
    2.4.2 Commands
    2.4.3 Transmission algorithm
    2.4.4 Performance checklist
  2.5 Related work – Profiling
    2.5.1 perf
    2.5.2 eBPF

3 Methodology
  3.1 Data yielding
  3.2 Data evaluation
  3.3 Linear statistical correlation

4 Experimental setup
  4.1 Speed advertisement
  4.2 Hardware used
    4.2.1 Machine A – KTH
    4.2.2 Machine B – KTH
    4.2.3 Machine C – Ericsson
    4.2.4 Machine D – Ericsson
  4.3 Choice of
  4.4 Creating a virtual development environment
  4.5 Empirical testing of settings
  4.6 Creation of an interface for pktgen
  4.7 Enhancing the system for pktgen
  4.8 pktgen parameters clone conflict

5 eBPF Programs with BCC
  5.1 Introduction
  5.2 kprobes
  5.3 Estimation of driver transmission function execution time

6 Results
  6.1 Settings tuning
    6.1.1 Influence of kernel version
    6.1.2 Optimal pktgen settings
    6.1.3 Influence of ring size
  6.2 Evidence of faulty hardware
  6.3 Study of the packet size scalability
    6.3.1 Problem detection
    6.3.2 Profiling with perf
    6.3.3 Driver latency estimation with eBPF

7 Conclusion
  7.1 Future work

A Bifrost install
  A.1 How to create a bifrost distribution
  A.2 Compile and install a kernel for bifrost

B Scripts

C Block diagrams

List of Figures

2.1  Caches location in a 2-core CPU
2.2  Theoretical limits of the link according to packet size on a 10G link
2.3  Theoretical limits of the link according to packet size on a 40G link
2.4  Tux, the mascot of Linux
2.5  Overview of the kernel [4]
2.6  How pointers are mapped to retrieve data within the socket buffer [18]
2.7  Example of a shell command to interact with pktgen
2.8  pktgen transmission algorithm
2.9  Example of call-graph generated by perf record -g foo [38]
2.10 Assembly code required to filter packets on eth0 with tcp ports 22
3.1  Representation of the methodology algorithm used
3.2  Pearson product-moment correlation coefficient formula
4.1  Simplification of block diagram of the S7002 motherboard configuration [46, p. 19]
4.2  Simplification of block diagram of the ProLiant DL380 Gen9 motherboard configuration
4.3  Simplification of block diagram of the S2600IP [47] motherboard configuration
4.4  Simplification of block diagram of the S2600CWR [48] motherboard configuration
4.5  Output using the --help parameter on the pktgen script
6.1  Benchmarking of different kernel versions under bifrost (Machine A)
6.2  Performance of pktgen on different machines according to burst variance
6.3  Influence of ring size and burst value on the throughput
6.4  Machine C parameter variation to amount of cores
6.5  Machine C bandwidth test with MTU packets
6.6  Throughput to packet size, in million of packets per second
6.7  Throughput to packet size, in Mbps
6.8  Superposition of the amount of cache misses and the throughput "sawtooth" behaviour
C.1  Block diagram of motherboard Tyan S7002
C.2  Block diagram of the motherboard S2600IP
C.3  Block diagram of the motherboard S2600CW
C.4  Patch proposed to fix the burst anomalous cloning behaviour

List of Tables

2.1  PCIe speeds
2.2  Flags available in pktgen
6.1  Comparison of throughput with eBPF program

Chapter 1

Introduction

Throughout the evolution of network interface cards towards high speeds such as 10, 40 or even 100 Gigabit per second, the amount of packets to handle on a single interface has increased drastically. Whilst the enhancement of the NIC is the first step for a system to handle more traffic, there is an inherent consequence to it: the remainder of the system must be capable of handling the same amount of traffic. We are in an era where the bottleneck of the system is shifting towards the CPU [1], due to a more and more bloated protocol stack. To ensure the capability of the operating system to produce or receive a given amount of data, we need to assess it through the help of network testing tools. There are two main categories of network testing tools: software and hardware based. Hardware network testing tools are usually seen as accurate, reliable and powerful in terms of throughput [2], but expensive nonetheless. While software-based testing might in fact be less trustworthy than hardware-based, it has the tremendous advantage of malleability. Modifying the behaviour of the software (e.g. after a protocol update) is easily realized; on the other hand it is not only complex in the case of hardware but also likely to increase the price of the product [3], and usually impossible for the consumer to tamper with, as such testers are commonly proprietary products. Neither approach is inherently better: they are two different approaches to the same problem, and hence testing a system from both perspectives, if possible, is recommended. However, in this document we will focus solely on software testing, as we did not have specialised hardware at our disposal.

The Linux operating system will be used to conduct this research as it is fully open-source and recent additions aiming to enable high performance have been developed for it. It is based on a monolithic-kernel design, meaning the OS can be seen as split into two parts: kernel-space and user-space [4]. The kernel-space is a contiguous chunk of memory in which everything related to the hardware is handled, as well as core system functions, for instance process scheduling. The user-space is where regular user programs are executed; they have much more freedom, as they ignore the underlying architecture and access it through system calls: secure interfaces to the kernel. The issue in this model for a software-based network tool is the trade-off regarding the level at which the software will be located: a user-space network testing program is likely to be slowed down by the numerous system calls it must perform, and has no control over the path the packet is going to take through the stack. A kernel-space network testing program will be faster but much more complex to design, as the rules within the kernel are paramount to its stability: since it is a single address space, any part of the kernel can call any other function located in the kernel. This paradigm can result in disastrous effects on the system if not manipulated cautiously.

As we require high performance to achieve line rate, we will therefore use a kernel-space network testing tool: pktgen [5]. It is a purely packet-oriented traffic generator which does not mimic actual protocol behaviour, located at the lowest level of the stack, allowing minimum overhead. Its design allows it to take full advantage of the symmetric multiprocessing capabilities found on commodity hardware nowadays, enabling parallelization of tasks by having dedicated queues for each CPU. Due to the overhead caused by the treatment of each packet, we will orient our research towards performance with minimum-sized packets, which is also the common practice within network testing. However, MTU-sized packets are a good way to benchmark the ability of a system to handle a maximum

amount of throughput, as smaller-sized packets should never yield a higher throughput given the same parameters. A notable advantage of pktgen is the fact that the module is part of the official kernel and therefore does not require any further installation, and can be found in all common distributions. Having a low-level traffic generator is not enough to tell if the system is correctly optimized, since it does not always reveal the bottleneck of the system. To go deeper into the performance we must get a profile: an overview of the current state of the system. In order to perform such investigations we will use perf events [6], a framework to monitor the performance of the kernel by monitoring well-known bottleneck functions or hardware events likely to reveal problems, and outputting a human-readable summary. To complete the profiling done by perf, we will use the extended Berkeley Packet Filter, aka eBPF [7]. It is an in-kernel VM which is supposedly secure (e.g. does not crash, always terminates) thanks to strict verification of the code before it is executed. It can be hooked onto functions and will be used to monitor certain points of interest by running small programs inside the kernel and reporting their results to user-space.

1.1 Problem

While the speed of network interface cards increases, Linux's protocol stack is also gaining more additions, for instance to implement new protocols or to enhance already-existing features. More packets to treat, as well as more instructions per packet, intrinsically result in heavier CPU loads. However, some countermeasures have been introduced to mitigate the performance loss due to outdated designs in certain parts of the kernel: for instance NAPI [8], which reduces the flow of interrupts by switching to a polling mode when overwhelmed, or the recently added xmit_more API [9], which allows bulking of packets to defer the usual per-packet actions to groups of packets. Considering all the recent improvements, can the vanilla kernel scale up to performance high enough to saturate 100G links?

We will assess the kernel's performance at the lowest possible level to minimize overhead and therefore allow maximum packet throughput, hence the use of pktgen. It is important to understand that pktgen's results will not reflect any kind of realistic behaviour, as its purpose is to determine the performance of a system by doing aggressive packet transmission, and the absence of overhead is the key to its functionality: it has to be seen as a tool to reveal underlying problems rather than a model of regular protocol stack overhead. In other words, it is the first step in verifying a system's transmission abilities and should therefore be seen as an upper bound to real-life transmission. This implies that if the results fall below the maximum NIC speed, actual transmission scenarios cannot exceed that result either. The follow-up question being: can pktgen's performance scale up to 100G-link saturation?

Ideally the performance indicated by pktgen should be double-checked, meaning a second method is needed to testify to the accuracy of pktgen's reported performance. Hence we will use eBPF as a way to bind a program onto the NIC driver's transmission function in order to measure the throughput. Can eBPF correctly quantify the amount of outgoing packets, knowing that each call potentially lasts on the order of nanoseconds? If so, do the measured performances match pktgen's results?

We hypothesize that with the current technologies added to the kernel we will be able to reach line rate at 100G with minimum-sized packets using the pktgen module, given proper hardware.

1.2 Methodology

We will use an empirical approach throughout this thesis. The harvesting method will consist of running pktgen while modifying certain parameters to assess the impact of each given parameter on the final result. This will be done by iterating over the parameters with a stepping big enough to finish within a reasonable amount of time but small enough to pinpoint any important change in the results. The value of the stepping will require prior tuning. Each experiment, in order to assert its validity, has to be run several times and on different machines with similar software configuration. To make the results

human-readable and concise, they will be processed into relevant figures, comparing the parameters which were adjusted with their related pktgen experiment results. To realize the performance assessment, the following experiments will be performed in order:

• Draw a simple baseline with straightforward parameters.

• Verify whether the kernel version is improving or downgrading the performances, and select the best-suited one for the rest of the experiments.

• Assess the performance of the packet bulking technique through the xmit_more API option of pktgen, and verify whether it improves the packet throughput.

• Tamper with the size of the NIC’s buffer as an attempt to increase the performance of packet bulking.

• Find the optimal performance parameter of pktgen.

We will also be monitoring certain metrics through profiling frameworks which are not guaranteed to be directly correlated with the experiment. To test the linear correlation of two variables (i.e. an experiment result and a monitored metric) we will use a complementary statistical analysis through the help of the Pearson product-moment correlation coefficient.
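For reference, the coefficient can be computed directly over two series of measurements. The helper below is a small illustrative sketch in C written for this report, not part of any existing tool:

#include <math.h>
#include <stddef.h>

/* Pearson product-moment correlation coefficient of two series x and y of
 * length n. Returns a value in [-1, 1]; values close to +1 or -1 indicate a
 * strong linear correlation, values close to 0 indicate none. */
static double pearson(const double *x, const double *y, size_t n)
{
    double sx = 0, sy = 0;
    for (size_t i = 0; i < n; i++) { sx += x[i]; sy += y[i]; }
    double mx = sx / n, my = sy / n;

    double num = 0, dx2 = 0, dy2 = 0;
    for (size_t i = 0; i < n; i++) {
        double dx = x[i] - mx, dy = y[i] - my;
        num += dx * dy;                 /* covariance term    */
        dx2 += dx * dx;                 /* variance of x      */
        dy2 += dy * dy;                 /* variance of y      */
    }
    return num / sqrt(dx2 * dy2);
}

A result close to +1 or -1 indicates a strong linear relation between the monitored metric and the measured throughput, while a result close to 0 indicates the absence of such a relation.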

1.3 Goal

As the performance reached by NICs is now extremely high, we need to know if the systems they are supposed to be used in are capable of handling the load without any extra products or libraries. The purpose is to understand whether or not 100Gb/s NICs are in fact of any use for vanilla Linux kernels. Therefore the goal is to provide a study of the current performance of the kernel.

1.4 Sustainability and ethics

As depicted above, the goal is to assess the ability of the kernel to output a maximum amount of packets. In other words, with perfect hardware and software tuning the system should be able to reach a certain value of packets per second, sent over a wire. Whilst this does not address the environmental aspect directly (e.g. power-saving capabilities are disabled in favour of performance), assessing the global tuning of the system will logically help to understand if a system is using more resources than it should, hence also indirectly assessing its power consumption. If an issue that somehow reduces the global throughput is revealed, it could imply that machines running under the same configuration also have to spend extra computing power to counteract the loss, which raises ecological issues on a larger scale.

1.5 Delimitation

To limit the length of the thesis and impose boundaries, we will solely focus on the transmission side. Only simple examples of the use of eBPF will be provided, to avoid going into too much detail about that framework. Regarding the bulking of packets, we will exclusively look into (packet) throughput performance, while in reality such an addition might in fact introduce some latency and therefore could create dysfunctions in latency-sensitive applications. The kernel should not be modified with extra libraries specialized in packet processing.

1.6 Outline

The thesis is divided as follows:

• Chapter 2 will provide a background on hardware, software, profiling and pktgen uses.

• Chapter 3 will explain the methodology used behind the experiments.

• Chapter 4 will summarize the experimental setup including:

– Detailed hardware description.
– Research behind the performance optimization.
– Practical description of the realization of the experiments and how they were exported into consequential data.
– How a prototype of an interface for pktgen was realized to standardize the results.

• Chapter 5 will be a brief introduction to BCC programming, presenting the structure to create programs with the framework.

• Chapter 6 will hold the most probing results from the experiments into graphical data and their associated analysis.

• Chapter 7 will conclude and wrap-up the results.

Chapter 2

Background

This section will be dedicated to providing the required knowledge to the reader to fully understand the results at the end of the thesis. Going into deep details of the system was necessary to interpret the results and hence a great part of this thesis was dedicated to understanding various software/hardware techniques and technologies. To do so we will follow a path divided in several sections:

• Firstly we will introduce technical terms related to hardware, as those factors will be investigated to give a deeper overview of the system. This will be done by examining different bottlenecks like the speed of a PCIe bus or the maximum theoretical throughput on an Ethernet wire.

• Secondly we will dig into the inner working of the Linux operating system, to mainly understand the global architecture of the system but also to provide insights of how the structures and different sections interact together to transmit a packet over the wire. This will include interaction with the hardware, hence a brief study of the drivers.

• Then we will review the related work on software traffic generation to compare the perks and drawbacks of each tool. Afterwards, a thorough study of the pktgen module will be realised, from its internal workings to the parameters with the most influence on throughput performance.

• Last but not least, there will be a brief introduction to profiling, which consists of tracing the system to assess its choke-points by analysing the amount of time spent executing functions. We will also explain how eBPF, an extended version of the Berkeley Packet Filter originally created for simple packet analysis, has become a fully functional in-kernel virtual machine that may now be used to investigate parts of the kernel by binding small programs to certain functions.

2.1 Computer hardware architecture

As we are going to introduce numerous terms that are closely acquainted with the hardware of the machine, this section will be there to clarify most of those to the reader.

2.1.1 CPU A CPU, or central processing unit, is the heart of the system as it executes all the instructions stored in the memory.

CPU Caches are a part of the CPU that stores data which is supposedly going to be needed again by the CPU. An entry in the cache table is called a cache line. When the CPU needs to access data, it first checks the cache, which is directly implemented inside the CPU. If the needed data is found, it is a hit, otherwise a miss. In case of a miss, the CPU must fetch the needed data from the main memory, making the whole process slower. In principle, the size of the cache needs to be small, for two reasons: first, it is implemented directly in the CPU, making the lack of space an issue; and secondly, the bigger the cache, the longer the lookup, therefore introducing latency inside the CPU.

Multi-level caches are a way to counteract the trade-off between cache size and lookup time. There are different levels of caches, which are all "on-chip", meaning on the CPU itself.

• The first-level cache, abbreviated L1 Cache, is small, fast, and the first one to be checked. Note that in real-life scenarios this cache is actually divided in two: one that stores instructions and one that stores data.

• The second-level cache, abbreviated L2 Cache, is bigger than the L1 cache, with about 8 to 10 times more storage space.

• The third and last level cache, abbreviated L3 Cache, is much larger than the L2 cache, however this characteristic varies vastly with the price of the CPU. This cache is not implemented in all brands of CPUs, however the ones that were used for this thesis did have it (cf. Methodology – Hardware used, 4.2). Moreover, L3 caches have the particularity of being shared between all the cores, which leads us to the notion of Symmetric Multiprocessing.


Figure 2.1: Caches location in a 2-core CPU.

Please note that Figure 2.1 is a simplification of the actual architecture.
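As an illustration of why cache behaviour matters for packet processing, the following user-space sketch (hypothetical, written for this report, not used in the experiments) touches the same amount of memory twice, once sequentially and once with a stride larger than a cache line; the strided walk misses the cache far more often and is typically several times slower:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (16 * 1024 * 1024)            /* 16 Mi ints, larger than any cache */

static double walk(volatile int *a, size_t stride)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t s = 0; s < stride; s++)          /* same total number of accesses */
        for (size_t i = s; i < N; i += stride)   /* but jumping across cache lines */
            a[i]++;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    int *a = calloc(N, sizeof(*a));
    if (!a)
        return 1;
    printf("sequential: %.2f s\n", walk(a, 1));   /* cache friendly              */
    printf("strided:    %.2f s\n", walk(a, 64));  /* 64 ints = 256 B per jump    */
    free(a);
    return 0;
}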

2.1.2 SMP Symmetric Multiprocessing involves two or more processing units on a single system which will run the same operating system, share a common memory and I/O devices, e.g. hard drives or network interface cards. The notion of SMP applies to both completely separate CPUs and CPUs that have several cores. The obvious aim of having such an architecture is benefiting from the parallelism of programs to maximize the speed of the overall tasks to be executed by the OS.

Hyperthreading is Intel's proprietary version of SMT (Simultaneous Multi-Threading), another technique to improve parallel thread execution, which adds logical cores on top of the physical ones.

2.1.3 NUMA

Non-Uniform Memory Access is a design in SMP architectures which states that CPUs should have dedicated spaces in memory which can be accessed much faster than the others due to their proximity. This is done by segmenting the memory and assigning a specific part of it to a CPU. CPUs are joined by a dedicated bus (called the QPI, for Quick Path Interconnect, on modern systems). The memory segment assigned to a specific CPU is called the local memory of that CPU. If it needs to access another part of the memory than its own, that part is designated as remote memory, since the CPU must go through a network of bus connections in order to access the requested data. This technique aims to mitigate the issue of memory access on an SMP architecture, as a single bus for all the CPUs is a latency bottleneck in modern system architecture [10]. A NUMA system is sub-divided into NUMA nodes, which represent the combination of a local memory and its dedicated CPU. With the help of the command lscpu one can view all the NUMA nodes that are present on a system. It also reports the latency of accessing one memory node from another.
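On such a system, memory placement can also be controlled explicitly from a program through the libnuma API. The fragment below is a minimal sketch (assuming libnuma is installed; link with -lnuma) of allocating a buffer backed by pages local to a chosen node:

#include <numa.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    if (numa_available() < 0) {           /* kernel or hardware without NUMA */
        fprintf(stderr, "NUMA not available\n");
        return 1;
    }

    int node = 0;                         /* target NUMA node */
    size_t len = 16 * 1024 * 1024;

    /* Memory allocated on the local bank of 'node'. */
    void *buf = numa_alloc_onnode(len, node);
    if (!buf)
        return 1;

    /* ... use buf, e.g. as a buffer kept close to the NIC's node ... */

    numa_free(buf, len);
    return 0;
}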

2.1.4 DMA

Direct Memory Access is a technique to avoid having the CPU intervene between an I/O device and the memory to copy data from one to the other. The CPU simply initiates the transfer with the help of a DMA controller, which then takes care of the transfer between the two entities. When the transfer is done, the device involved raises an interrupt in order to notify the OS, and therefore the CPU, that the operation has been completed and that the consequent actions should be taken, e.g. processing the packets in the case of a reception, or cleaning up the buffer memory in the case of a transmission.

2.1.5 Ethernet

Ethernet is the de-facto standard for layer-2 frame transmission and the one we will be using throughout this thesis. The minimum size of a frame in Ethernet was originally 64 bytes, due to the CSMA/CD technique being used on the link. The idea was to have a minimum time-slot, ensured by this fixed size, so that the time taken sending those bits on the wire would be enough for all stations within a maximum cable radius to hear the transmission of the frame before it ended. Therefore, if two stations started transmitting over the common medium (i.e. the wire), they would be able to detect the collision. When a collision happens, a jam sequence is sent by the station noticing it. Its aim is to make the CRC (located at the end of the frame) bogus, making the NIC discard the entire frame. The minimum packet size of 64 bytes makes sense in 10Mb/100Mb Ethernet networks, as the maximum length of the cable is respectively 2500 meters and 250 meters. However, if we push the same calculation to 1000Mb aka 1G Ethernet, the maximum length of 25 meters can be considered too small, not to mention 2.5 meters on 10G Ethernet. Whilst there are techniques in 1G Ethernet to extend the slot size while keeping the minimum frame size at 64 bytes, we will not consider them in this thesis as we will be using 10G Ethernet, which is full-duplex and therefore has no need for medium-sharing techniques. The 64-byte minimum will still be used as a standard.

In reality, when one sends a 64-byte packet on the wire, there are in total 84 bytes that can be counted per frame.

• A 64-byte frame composed of:

– a 14-byte MAC header: destination MAC, source MAC and packet type;
– a 46-byte payload, typically an IP packet with TCP or UDP on top of it;
– a 4-byte CRC at the end.

• An 8-byte preamble, for the sender and receiver to synchronise their clocks.

• 12 bytes of inter-frame gap. There is no actual transmission, but it is the required amount of bit-time that must be respected between frames.

Theoretical limit As shown above, for a 60-byte payload (including IP and TCP/UDP headers) we must in reality count 84 bytes on the wire. This implies that for a 10-Gigabit transmission we will have a maximum of:

Max = Bandwidth / Frame size = (10 × 10^9) / (84 × 8) = (10 × 10^9) / 672 ≈ 14 880 952 ≈ 14.88 × 10^6 frames per second

We can conclude that the maximum number of minimum-sized frames that can be sent over a 10G link is 14.88 million per second. By applying the same calculation to a 40G and a 100G link we find respectively 59.52 and 148.81 million per second.
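The same arithmetic generalizes to any link speed and frame size. The small stand-alone C program below (an illustrative sketch written for this report, not part of pktgen) reproduces the numbers above by adding the preamble and inter-frame gap to the frame size:

#include <stdio.h>

/* Theoretical packets-per-second limit of an Ethernet link.
 * frame_len is the frame size including the CRC (64..1518 bytes);
 * 8 bytes of preamble and 12 bytes of inter-frame gap are added on the wire. */
static double max_pps(double link_bps, unsigned frame_len)
{
    double wire_bits = (frame_len + 8 + 12) * 8.0;
    return link_bps / wire_bits;
}

int main(void)
{
    const double speeds[] = { 10e9, 40e9, 100e9 };
    for (unsigned i = 0; i < 3; i++)
        printf("%3.0fG link, 64-byte frames: %.2f Mpps\n",
               speeds[i] / 1e9, max_pps(speeds[i], 64) / 1e6);
    return 0;
}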

Figure 2.2: Theoretical limits of the link according to packet size on a 10G link.

Figure 2.3: Theoretical limits of the link according to packet size on a 40G link.

Figure 2.3 will be useful as a benchmark during our experiments, as it represents the upper bound.

2.1.6 PCIe

Peripheral Component Interconnect Express, usually called PCIe, is a type of bus used to attach components to a motherboard. It was developed in 2004 and as of 2016 its latest release is version 3.1, although only 3.0 products are available. A new 4.0 standard is expected in 2017. PCIe 3.0 (sometimes called Revision 3) is the most common type of bus found among high-speed NICs, because older standards are in fact too slow to provide the required bus speed to sustain 40 or even 10 Gigabit per second if the number of lanes is too small (see the next paragraphs).

Bandwidth To actually understand the speed of PCIe buses we must define the notion of "transfer", as the speed is given in "Gigatransfers per second" (GT/s) in the specification [11]. A transfer is the action of sending one bit on the channel; however it does not by itself specify the amount of useful data sent, because one needs the channel encoding to compute it. In other words, without the number of payload bits sent per transfer we cannot calculate the actual bandwidth of the channel. Leaving aside the complex design details, PCIe versions 1.0 and 2.0 use an 8b/10b encoding [11, p. 192]. This forces 10 bits to be sent for every 8 bits of data, implying an overhead of 1 − 8/10 = 20% for every transfer. The 3.0 revision uses a 128b/130b encoding, limiting the overhead to 1 − 128/130 ≈ 1.5%. Now that we know the encoding, we can calculate the bandwidth B per direction:

B = Transfers × (1 − overhead) × number of lanes

Table 2.1 holds the results of the bandwidth calculation. We highlighted the bandwidths compatible with 10G in blue and with 40G in red (10G being compatible with 40G). Using a bus whose bandwidth is lower than the theoretical throughput of the NIC will still function (provided the device fits the number of lanes), but packets will be throttled by the bus speed.

Version        1.1        2.0        3.0
Speed          2.5 GT/s   5 GT/s     8 GT/s
Encoding       8b/10b     8b/10b     128b/130b
Bandwidth 1x   2 Gb/s     4 Gb/s     7.88 Gb/s
Bandwidth 4x   8 Gb/s     16 Gb/s    31.50 Gb/s
Bandwidth 8x   16 Gb/s    32 Gb/s    63.01 Gb/s
Bandwidth 16x  32 Gb/s    64 Gb/s    126.03 Gb/s

Table 2.1: PCIe speeds
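The entries of table 2.1 follow directly from the formula above; the short sketch below (written for illustration only) reproduces them:

#include <stdio.h>

/* Usable PCIe bandwidth in Gb/s, per direction, for a given transfer rate
 * (GT/s), encoding efficiency and number of lanes. */
static double pcie_gbps(double gts, double efficiency, int lanes)
{
    return gts * efficiency * lanes;
}

int main(void)
{
    const int widths[] = { 1, 4, 8, 16 };
    for (unsigned i = 0; i < 4; i++)
        printf("x%-2d  gen1: %6.2f  gen2: %6.2f  gen3: %6.2f Gb/s\n",
               widths[i],
               pcie_gbps(2.5, 8.0 / 10.0,    widths[i]),   /* 8b/10b   */
               pcie_gbps(5.0, 8.0 / 10.0,    widths[i]),   /* 8b/10b   */
               pcie_gbps(8.0, 128.0 / 130.0, widths[i]));  /* 128b/130b */
    return 0;
}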

2.1.7 Networking terminology

DUT Device Under Test: the targeted device whose performance we aim to assess.

Throughput The throughput is the fastest rate at which the count of test frames transmitted by the DUT is equal to the number of test frames sent to it by the test equipment. [12]

2.2 Linux

The Linux operating system started in 1991 as a common effort, initiated by Linus Torvalds, to provide a fully open-source operating system. It is UNIX-based and the usage of Linux is between 1 and 5% of the global market, implying that it is scarcely used by end users. However this data is quite unreliable, as most companies or researchers rely on publicly available data, for instance the User-Agent header passed in an HTTP request, which can however be forged, or worldwide device shipments, which tend to be unreliable as well since most laptops will at least allow dual-booting a second OS. While not frequently used by the major part of the population, it is extremely popular in the server market; its stability, open-source code and constant updates make it a weapon of choice for most system administrators. Whilst it will be referred to as "Linux" in this document, the correct term would be GNU/Linux, as the operating system is a collection of programs on top of the Linux kernel and Linux depends on GNU software.

Figure 2.4: Tux, the mascot of Linux

2.2.1 OS Architecture design

Linux uses a monolithic kernel design [4, p. 7], meaning that it is loaded as a single binary image at boot, stored and run in a single address space. In other words, the base kernel is always loaded into one big contiguous area of real memory, whose real addresses are equal to its virtual addresses [13]. The main perk of such an architecture is the ability to run all needed functions and drivers directly from kernel space, making it fast. However it comes at the price of stability: as the whole kernel runs as a single entity, if there is an error in any subset of it, the stability of the system as a whole cannot be guaranteed.

Whilst such drawbacks could seem an impediment for the OS, monolithic kernels are not only mature nowadays but the almost-exclusive design used in industry. They stand in opposition to micro-kernels, which we will not detail as they are outside the scope of this study. It is however not realistic to talk about a "pure" monolithic kernel, as Linux actually has ways to dynamically load code inside kernel space, more precisely pre-compiled portions of code called loadable kernel modules, or LKMs. As this code cannot be loaded inside the same address space that the base kernel uses, its memory is allocated dynamically [13]. The flexibility offered by LKMs is absolutely crucial to Linux's malleability: if every component had to be loaded at boot, the size of the boot image would be colossal.

The operating system can be seen as being split into three parts: the hardware, which is obviously fixed, the kernel-space and the user-space. This segmentation makes sense when it comes to memory allocation, as explained above. The kernel-space is static and contiguous; it runs all the functions that interact directly with the hardware (drivers) and its code cannot change (unless the code being executed is an LKM). The user-space is much more free in its actions, as the memory it uses can be allocated dynamically, therefore making the loading and evolution of programs quite seamless. However, to interact with hardware, e.g. memory or I/O devices, user-space programs must go through system calls: functions that request a service from the kernel while abstracting away the underlying complexity.

Figure 2.5: Overview of the kernel [4]
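To give a concrete idea of the loadable kernel modules mentioned above, the listing below shows a minimal module, a classic hello-world sketch unrelated to pktgen; it only needs an init and an exit function:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>

/* Called when the module is inserted (insmod / modprobe). */
static int __init hello_init(void)
{
    printk(KERN_INFO "hello: module loaded\n");
    return 0;                       /* a non-zero value would abort the load */
}

/* Called when the module is removed (rmmod). */
static void __exit hello_exit(void)
{
    printk(KERN_INFO "hello: module unloaded\n");
}

module_init(hello_init);
module_exit(hello_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Minimal example of a loadable kernel module");

Once compiled against the headers of the running kernel, it can be loaded with insmod and removed with rmmod, and its messages appear in the kernel log.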

2.2.2 /proc pseudo-filesystem

The /proc folder is actually a whole other file-system on its own, called procfs [4, p. 126]. Mounted at boot, its purpose is to be a way to harvest information from the kernel. In reality it does not contain any physical files (i.e. files written to hard disks): all of the files represented inside of it are actually stored in the memory of the computer (a RAM-based file-system) rather than on a hard drive, which also implies they disappear at shutdown. It was designed to gather any kind of information the user could need to inspect about the kernel, often related to performance. A lot of programs interact directly with this information to gain knowledge of the system; for instance the well-known command ps makes use of different statistics included in /proc. However it is even more powerful than that, as we can directly "hot-plug" functionalities inside the kernel by interacting with /proc: for instance, which CPUs are pinned to a particular interrupt can be changed with its help. Needless to say, not all functionalities inside the kernel can be changed by simply writing a number or a string inside /proc. This becomes a key element when linked not only to the vanilla kernel but also to its modules. As explained previously, we can load or unload LKMs, and as they are technically part of the kernel we will therefore find their status and configuration interfaces in /proc.

Other information systems

: another ram-based files-system this time whose goal is to export kernel data structures, their attributes, and the linkages between them to userspace [14]. Usually mounted on /sys.

• configfs: complementary to sysfs, it allows the user to create, configure and delete kernel objects [15].

2.2.3 Socket Buffers

Socket buffers, or SKBs, are single-handedly the most important structure in the Linux networking stack. For every packet present in the operating system, an SKB must be affiliated to it in order to store its data in memory. This has to be done in kernel space, as the interaction with the driver happens inside the kernel [16]. The structure sk_buff is implemented as a doubly linked list in order to loop through the different SKBs easily. Since the content of the sk_buff structure is gigantic we will not go into too much detail here, but here are the basics [17]:

Figure 2.6: How pointers are mapped to retrieve data within the socket buffer [18].

The socket buffers were designed to encapsulate any kind of protocol easily, hence there are ways to access the different parts of the payload by moving a pointer around and mapping its content onto a structure.

As shown in figure 2.6, the data is located in a contiguous chunk of memory and pointers indicate the location of each part of the structure. When going up the stack, extra pointers are mapped to easily recognize and access the desired part of the packet, e.g. the IP header or the TCP header. Important note: the data pointer DOES NOT refer to the payload of the packet, and reading from it will most likely end up in gibberish values for the user. With the help of system calls, SKBs are abstracted away from user-space programs, which most likely will not make use of the underlying stack architecture. However those system calls are not accessible from inside kernel-space. To decode data easily from within the kernel, pre-existing structures with the usual fields of the protocols are available, and by mapping a pointer onto such a structure one can make the packet content trivial to interpret.

Reference counter Another very important variable the structure holds is called atomic_t users. It is a reference counter: a simple integer that counts the number of users currently holding the SKB. It is implemented as an atomic integer, meaning that it must be modified only through specific functions that ensure the integrity of the data among all cores. It is initialized to the value 1, and if it reaches 0 the SKB ought to be deleted. Users should not interact with such counters directly; however, as we will see with pktgen, that rule is not always respected.
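As an illustration, the hedged sketch below (not taken from the Linux sources) shows how kernel code typically maps those pointers onto protocol structures, and how the users counter is manipulated through helpers rather than directly:

#include <linux/skbuff.h>
#include <linux/ip.h>
#include <linux/udp.h>

/* Assumes the network and transport header offsets of the SKB have already
 * been set, as they are once the stack has parsed the packet. */
static void inspect_udp(struct sk_buff *skb)
{
    struct iphdr  *iph = ip_hdr(skb);     /* maps a struct onto the IP header  */
    struct udphdr *uh  = udp_hdr(skb);    /* maps a struct onto the UDP header */

    pr_debug("UDP %pI4:%u -> %pI4:%u, %u bytes\n",
             &iph->saddr, ntohs(uh->source),
             &iph->daddr, ntohs(uh->dest), ntohs(uh->len));

    /* Keep the buffer alive: atomically bump the 'users' counter ...          */
    skb_get(skb);
    /* ... and drop the reference later; the SKB is only freed once it hits 0. */
    kfree_skb(skb);
}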

2.2.4 xmit_more API

Since kernel 3.18, efforts have been made to optimize the global throughput through batching, i.e. bulking packets as a block to be sent instead of treating them one by one. Normally, when a packet is given to the hardware through the driver, several actions are performed, like locking the queue, copying the packet to the hardware buffer, telling the hardware to start the transmission, etc. [9]. The idea was to simply communicate to the driver that several more packets are coming, so that it can delay some of those actions, knowing it is a better fit to postpone them until there are no more packets to be sent. It is important to note that the driver is not forced in any way to delay its usual procedures, and remains the one taking the decision. To make this functionality available to drivers while not breaking compatibility with old ones, a new boolean named xmit_more has been added to the SKB structure. If set to true, the driver knows there are more packets to come.
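On the driver side the contract is simple. The fragment below is a simplified, driver-like sketch (hypothetical foo_* names, not taken from any real driver) of how the flag is typically honoured in an ndo_start_xmit() implementation on a 4.x kernel, where the flag still lives in the SKB itself:

#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/io.h>

struct foo_priv {                       /* hypothetical driver state */
    void __iomem *doorbell;
    u32 tx_tail;
};

static void foo_post_descriptor(struct foo_priv *p, struct sk_buff *skb);

static netdev_tx_t foo_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
    struct foo_priv *priv = netdev_priv(dev);
    struct netdev_queue *txq =
            netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));

    foo_post_descriptor(priv, skb);     /* place the packet in the TX ring */

    /* The doorbell (an expensive MMIO write) is only rung for the last
     * packet of a batch, or if the queue was stopped in the meantime.   */
    if (!skb->xmit_more || netif_xmit_stopped(txq))
        writel(priv->tx_tail, priv->doorbell);

    return NETDEV_TX_OK;
}

Deferring the doorbell write in this way is precisely what pktgen's burst option, described in section 2.4, takes advantage of.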

2.2.5 NIC drivers NIC drivers handle the communication between the NIC and the OS, primarily to handle the packet sending and reception. There are two solutions to receiving packets:

• Interrupt: in case of reception of a packet, the NIC sends an interrupt to the OS in order for it to retrieve the packet. But in case of a high-speed reception, the CPU will most likely be overwhelmed by the interrupts, as they are all executed with a higher priority than other tasks.

• NAPI: to mitigate the latter issue, interrupts are temporarily disabled and the driver switches to a polling mode. This is done through the New API, an extension to the packet processing framework [8]: the interrupts of a NIC are switched off when it reaches a certain threshold fixed at driver initialization [16]. (The usual polling pattern is sketched right after this list.)
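The polling pattern referred to above usually looks as follows; this is a generic sketch with hypothetical foo_* helpers, not code from an actual driver:

#include <linux/netdevice.h>
#include <linux/interrupt.h>

static struct napi_struct foo_napi;     /* registered elsewhere with netif_napi_add() */

/* Hypothetical helpers touching device registers and the RX ring. */
static void foo_disable_rx_irq(void *data);
static void foo_enable_rx_irq(void);
static int  foo_rx_clean(int budget);

/* Interrupt handler: mask further RX interrupts and defer to polling. */
static irqreturn_t foo_isr(int irq, void *data)
{
    foo_disable_rx_irq(data);
    napi_schedule(&foo_napi);
    return IRQ_HANDLED;
}

/* Poll function: called by the kernel with a packet budget. */
static int foo_poll(struct napi_struct *napi, int budget)
{
    int done = foo_rx_clean(budget);    /* hand up to 'budget' packets to the stack */

    if (done < budget) {                /* ring empty: go back to interrupt mode */
        napi_complete_done(napi, done);
        foo_enable_rx_irq();
    }
    return done;
}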

Here are the common pitfalls that can influence the NIC driver performance [19]:

• DMA should have better performance than programmed I/O, however due to the high overhead caused by it, one should not allow DMA under a certain threshold.

• For PCI network cards (the only relevant type for high-speed networks nowadays) the DMA burst size is not always fixed and must be determined. It should coincide with the cache size of the CPU, making the process faster.

• Some drivers have the ability to compute the TCP checksums, offloading the work from the CPU and gaining efficiency thanks to optimized hardware.

2.2.6 Queuing in the networking stack

The queuing system in Linux is implemented through an abstraction layer called Qdisc [20]. Its uses range from a classical FIFO algorithm to more advanced QoS-oriented queuing (e.g. HTB or SFQ). Those methods can however be circumvented if one user-level application fires multiple flows at once [21].

Driver queues are the lowest-level networking queues that can be found in the OS. They interact directly with the NIC through DMA. However this interaction is done asynchronously between the two entities (the opposite would make the communication extremely inefficient), hence the need for locks to ensure data integrity. The lowest-level function one can use to interact directly with the driver queue is dev_hard_start_xmit(). Nowadays most NICs have multiple queues, to benefit best from the SMP capabilities of the system [22]. For instance, the 82580 NIC from Intel and its variants support multiple queues. Some frameworks (e.g. DPDK, cf. 2.3.5) allow direct access to the NIC registers for better analysis and tuning of the hardware.

2.3 Related work – Traffic generators

2.3.1 iPerf

iPerf [23] is a user-space tool made to measure the bandwidth of a network. Due to its user-space design, it cannot achieve high packet rates because of the need to use system calls to interact with the lower interfaces, e.g. NIC drivers or even qdiscs. To mitigate this overhead issue the user might use a zero-copy option to make access to the packet content faster. It is able to saturate links through the use of large packets, and can even report the MTU if unknown to the user. It may measure jitter/latency through UDP packets. Both a server and a client instance of iPerf must be running for a measurement to take place. An interesting new option is the handling of the SCTP protocol in version 3. The simplicity of installation and use makes it a weapon of choice for network administrators who wish to check their configurations. It is important to note that this project is still maintained and updated frequently at the time of this thesis. Note that this is the only purely user-space-oriented traffic generation tool that we will describe here, as the performance of such tools cannot match the other, more optimized frameworks. Other user-space examples include (C)RUDE [24], NetPerf [25], Ostinato [26] and lmbench [27].

2.3.2 KUTE

KUTE [28] is a UDP in-kernel traffic generator. It is divided into two LKMs, a sender and a receiver. Once loaded, the sender computes a static inter-frame gap based on the speed specified by the user during setup. One improvement they advertise is to use the cycle counter located in CPU registers directly, instead of the usual kernel function to check the time, as the latter was considered not precise enough. Note that as this technology is from 2005, this information might be outdated. An interesting feature is that they do not handle the layer-2 header, making it theoretically possible to use it over any L2 network. The receiver module provides statistics to the user at the end, when it is unloaded.

2.3.3 PF_RING

PF_RING [29] is a packet processing framework developed by the ntop company. The idea was, as for pktgen and KUTE, to put the entire program inside the kernel. However it goes a step further by proposing actual kernel to user-space communication. The architecture, as the name suggests, is based on a ring buffer. It polls packets from the driver buffers into the ring [30] through the use of NAPI. While it does not require particular drivers, the addition of PF_RING-aware drivers is possible and should provide extra efficiency. Entirely implemented as an LKM, they advertise a speed of 14.8 Mpps "and above" on a "low-end 2,5GHz Xeon". However they do not state clearly whether that concerns packet transmission or capture, leaving the statement ambiguous.

PF_RING ZC is a proprietary variant of PF_RING, and is not open-source. On top of the previous features it offers an API which is able to handle NUMA nodes, as well as zero-copy packet operation, supposedly enhancing the global speed of the framework. In this version traffic generation is explicitly possible. It can also share data among several threads easily, thanks to the ring buffer architecture coupled with zero copying.

2.3.4 Netmap

Netmap [31] aims to reduce kernel-related overhead issues by bypassing the kernel with its own home-brewed network stack. They advertise a 10G wirespeed (i.e. 14.88 Mpps) transfer with a single core at 1.2 GHz. Among their improvements, they:

• Do a shadow copy (snapshot) of the NIC’s buffer into their own ring buffer to support batching, bypassing the need of skbuffers, hence gaining speed on (de)allocations.

20 • Efficient synchronization to make the best use of the ring buffer.

• Natively supports multi-queues for SMP architectures through the settings of interrupt affinities.

• The API is still completely independent from the hardware used. The devices driver ought to be modified to interact correctly with the netmap ring buffer, but those changes should always be minimal.

• Netmap does not block any kind of “regular” transmission from or to the host even with a NIC being used by their drivers.

• They also handle the widely used libpcap library by implementing their own version on top of the native API.

• The interaction with the API is done through /dev/netmap, and the content is updated by polling. The packets are checked by the kernel for consistency.

It is also implemented as a LKM making it easy to install however drivers might need to be changed for full usability of the framework.

2.3.5 DPDK

The Data Plane Development Kit [32] is a "set of libraries and drivers for fast packet processing". It was developed by Intel and is only compatible with Intel's x86 processor architecture. They advertise a speed of 80 Mpps on a single Xeon CPU (8 cores), which is enough to saturate a 40G link. DPDK moves its entire process into user-space, including ring buffers, NIC polling and other features usually located inside the kernel. It does not go through the kernel to push those changes or actions, as it features an Environment Abstraction Layer: an interface that hides the underlying components and bypasses the kernel by loading its own drivers. They offer numerous enhancements regarding software and hardware, e.g. prefetching or setting up core affinity, among many other concepts.

2.3.6 Moongen

Moongen [33] is based on the DPDK framework, therefore inheriting its perks and drawbacks. Moongen brings new capabilities to the latter framework by adding several paradigms as "rules" for the software: it must be fully implemented in software, and therefore run on off-the-shelf hardware; it must be able to saturate links at 10G wirespeed (i.e. 14.88 Mpps); it must be as flexible as possible; and, last but not least, it must support precise time-stamping and rate control (i.e. inter-packet latency). They found that these requirements were best fulfilled by implementing malleability through Lua scripting, as the language also performs well thanks to JIT support (cf. 2.5.2). The architecture behind Moongen lies on a master/slave interaction, set up within the script the user must provide. The master process sets up the counters, including the ones located on the NICs, and the slaves perform the traffic generation. An interesting feature introduced in this traffic generator is a new approach to rate control. As explained previously, NICs have an asynchronous buffer to take care of packet transmission, and the usual approach to control the rate is to wait between packets. However the NIC might not send the packets exactly as they are handed over. Instead of waiting, Moongen fills the inter-packet gap with a faulty packet: it forges a voluntarily incorrect packet checksum so that the receiving NIC will discard it upon arrival. This method is however limited by the NIC having a minimum-size packet acceptance, forcing the faulty packets to be of a certain size, which can be impractical in some situations. They advertise a speed of 178.5 Mpps at 120 Gbit/s, with a CPU clock at 2 GHz.

2.3.7 Hardware solutions

There are numerous examples of hardware technologies oriented towards network testing that we could provide, but as this work mostly focuses on the software spectrum of traffic generation, we will not expand too much on this topic. Companies like Spirent [34] or IXIA [35] provide such solutions.

2.4 Pktgen

Introduction

pktgen is a module of the Linux kernel that aims to analyse the networking performance of a system by sending as many packets as possible [5]. It was developed by Robert Olsson and was integrated into the Linux main tree in 2002 [36]. The usage of the tool is made through procfs: all the pktgen-related files mentioned in the following paragraphs are located in /proc/net/pktgen. To interact with the module, one must write into pre-existing files representing the kernel threads dedicated to pktgen. There are as many threads as there are cores; for instance the file "kpktgend_0" is the file bound to the thread for core number 0. This information is important as nowadays CPUs all have SMP, hence the need to support such architectures. The user then passes commands by directly writing inside those files.

# echo "add_device eth0" > /proc/net/pktgen/kpktgend_0

Figure 2.7: Example of a shell command to interact with pktgen.

Figure 2.7 shows a typical example of interaction between the pktgen module and the user. By redirecting the output of the echo command, we pass the command "add_device" with the argument "eth0" to thread 0. Please note that all writing operations in the proc filesystem must be done as superuser (aka root). If the operation is unsuccessful, there will be an I/O error on the command. While this might seem slightly disturbing at first, this interface is a consequence of the module being in-kernel, making a proc directory the simplest design to allow interaction with the user.

Example Now that the interaction between the user and the module has been clarified, here is a representative description of how to typically use pktgen, that can be logically split in 3 steps.

1. Binding: The user must bind one or more NICs to a kernel thread. Fig 2.7 is an example of such action.

2. Setting: If the operation is successful, a new file will be created, matching the name of the NIC (or associated queue). For instance by executing the command in Fig 2.7, a new file eth0 will be created in the folder. The user must then pass the parameters that he or she wishes by writing in the latter file. A non exhaustive list of parameters would be:

• count 100000 – Send an amount of 100000 packets.
• pkt_size 60 – Set the packet payload to 60 bytes. This does include the IP/UDP headers. Note that 4 extra bytes are added by the CRC on the frame.
• dst 192.168.1.2 – Set the destination IP.

3. Transmitting: When all the parameters are set, the transmission may start by passing the parameter start to the pktgen control file pgctrl. The transmission will stop either by interrupting the writing operation (typically CTRL+C in the terminal) or when the total amount of packets to be sent is matched by the pktgen counter. The transmission statistics, such as the time spent transmitting or the number of packets per second, will be found in the file(s) matching the name of the interfaces used in the second step, e.g. eth0.

While it is possible to associate one thread with several NICs, the opposite is not possible. However pktgen has a workaround to be able to profit from multi-core capacities, by adding the number of the core after the name of the NIC: eth0@0 will result in interacting with the NIC eth0 through core 0.
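Putting the three steps together, the same sequence can be driven from a small user-space helper instead of echo. The sketch below was written purely for illustration (assuming interface eth0, a single thread, and root privileges); it simply writes the commands of this section into the pktgen proc files:

#include <stdio.h>
#include <stdlib.h>

/* Write one pktgen command into a /proc/net/pktgen file. */
static void pg_write(const char *path, const char *cmd)
{
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); exit(1); }
    fprintf(f, "%s\n", cmd);
    if (fclose(f) != 0) {               /* a failed flush means the command was rejected */
        perror(path);
        exit(1);
    }
}

int main(void)
{
    /* 1. Binding: attach eth0 to the thread of core 0. */
    pg_write("/proc/net/pktgen/kpktgend_0", "rem_device_all");
    pg_write("/proc/net/pktgen/kpktgend_0", "add_device eth0");

    /* 2. Setting: minimum-sized packets, destination, amount to send. */
    pg_write("/proc/net/pktgen/eth0", "pkt_size 60");
    pg_write("/proc/net/pktgen/eth0", "dst 192.168.1.2");
    pg_write("/proc/net/pktgen/eth0", "count 100000000");
    pg_write("/proc/net/pktgen/eth0", "flag QUEUE_MAP_CPU");   /* see 2.4.1 below */

    /* 3. Transmitting: blocks until the count is reached or interrupted. */
    pg_write("/proc/net/pktgen/pgctrl", "start");
    return 0;
}

The resulting statistics can then be read back from /proc/net/pktgen/eth0, exactly as when the commands are issued from a shell.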

2.4.1 pktgen flags

pktgen has several flags that can be set upon configuration of the software. The following list is complete and up to date, as it was directly fetched and interpreted from the latest version of the code (v2.75 – Kernel 4.4.8).

Flag             Purpose
IPSRC_RND        Randomize the IP source.
IPDST_RND        Randomize the destination IP.
UDPSRC_RND       Randomize the UDP source port of the packet.
UDPDST_RND       Randomize the UDP destination port of the packet.
MACSRC_RND       Randomize the source MAC address.
MACDST_RND       Randomize the destination MAC address.
TXSIZE_RND       Randomize the size of the packets to send.
IPV6             Enable IPv6.
MPLS_RND         Get random MPLS labels.
VID_RND          Randomize the VLAN ID label.
SVID_RND         Randomize the SVLAN ID label.
FLOW_SEQ         Make the flows sequential.
IPSEC            Turn IPsec on for flows.
QUEUE_MAP_RND    Match packet queue randomly.
QUEUE_MAP_CPU    Match packet queue to the bound CPU.
NODE_ALLOC       Bind memory allocation to a specific NUMA node.
UDPCSUM          Include UDP checksums.
NO_TIMESTAMP     Do not include a timestamp in packets.

Table 2.2: Flags available in pktgen.

The flags highlighted in grey in table 2.2 represent the most important ones for enforcing the performance of the system. QUEUE_MAP_CPU is a huge performance boost because of the threading behaviour of pktgen. In short, when the pktgen module is loaded it creates a thread for each CPU core detected on the system, including logical cores; then a queue is created to handle the packets to be sent (or received) for each thread, so that they can all be used independently instead of a single queue that would require great concurrency to function. It also benefits from the ability of recent NICs to do multi-queuing. Setting this flag ensures the queue the packet will be sent to is handled by the same core as the one currently treating the packet. NODE_ALLOC is obviously only needed in a NUMA-based system, and allows binding an interface (or queue, as explained) to a particular NUMA memory bank, avoiding the latency caused by having to fetch from remote memory. Note that within the scope of this thesis we will not be treating pktgen options that change or modify the protocol used during the transmission, e.g. VLAN tagging, IPsec, or MPLS. These are outside the scope as we only care about maximum throughput and therefore have no use for such technologies.

2.4.2 Commands there are quite a few commands that can be passed to the module.

1. The commands used on the threads ”kpktgend X” are straightforward: add device to add a device, append ’@core’ to the device name create new queue associated, and rem device all removes all associated devices (and their configuration) of a thread.

23 2. The commands on used on the ”pgctrl” file are also obvious: start begins the transmission (or reception)and stop ends it.

3. Most of the commands passed to the device are easily understandable and well documented in [37]. We will only list commands that need to be explained:

• node <integer>: when the NODE_ALLOC flag is on, this binds the selected device to the wanted memory node.

• xmit_mode <mode>: sets the mode pktgen should run in. By default the value is start_xmit, the normal transmission mode, which we detail further in the next paragraph. The other mode is netif_receive, which turns pktgen into a receiver instead of a transmitter. We will not go into the details of that algorithm as it is not charted here; however it is summarized through a diagram in the appendix.

• count <integer>: selects the number of packets to be sent. A value of zero results in an infinite loop until stopped. It is important to note that, because of the granularity of the timestamping inside pktgen, too small a packet count will result in a biased advertised speed. As a recommendation the program must run for at least a few milliseconds, therefore the count must be matched to the speed of the medium.

• clone_skb <integer>: this option aims to mitigate the overhead caused by having to do a memory allocation for each packet sent. This is done by "recycling" the SKB structure used, hence sending a carbon copy of the packet over the wire. It works through a simple increment of the reference counter, to avoid the structure's destruction by the system. The integer passed as an argument is the number of copies sent over the network for one SKB allocation. For example, with clone_skb 1000, packets 1 to 1000 will be identical, then packets 1001 to 2000 will be identical, and so on.

• burst <integer>: this option is the most important one for maximum throughput, as testified by the experiments further on. It makes use of the xmit_more API, hence allowing bulk transmission as explained previously. (A minimal configuration sketch using these commands follows.)
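As an illustration of the per-device commands just listed, the sketch below writes a typical configuration to the device control file; the device name eth0@0 and every value are examples only, not recommendations.

#!/usr/bin/env python
# Sketch of the per-device commands discussed above, written to the pktgen
# /proc file of an already-added device. "eth0@0" and all values are examples.

DEV = "/proc/net/pktgen/eth0@0"

def pgset(cmd):
    with open(DEV, "w") as f:
        f.write(cmd + "\n")

pgset("count 10000000")        # 0 would mean "send until stopped"
pgset("clone_skb 1000")        # reuse each allocated SKB for 1000 packets
pgset("burst 10")              # queue 10 packets per xmit_more bulk
pgset("delay 0")               # no artificial inter-packet delay
pgset("flag QUEUE_MAP_CPU")    # map the TX queue to the transmitting core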

2.4.3 Transmission algorithm

Through a code review, we will now explain the internal workings of pktgen when it comes to packet transmission. The following explanation concerns the pktgen_xmit function located in net/core/pktgen.c. Everything commented in this section is condensed in figure 2.8.

1. At the start, options such as burst (equal to 1 by default) are retrieved, through atomic access if necessary. The device is checked to be up and to have a carrier; if not, the function returns. This implies that if the device is not up, no error is returned to the user.

2. If there is no valid SKB to be sent or it is time for a new allocation, pktgen frees the current SKB pointer with kfree_skb (if it is null the function simply returns). A new packet is then allocated and filled with the correct headers through the fill_packet() function. If the latter fails, the function returns.

3. If inter-packet delay is required, the spin() function is fired.

4. The final steps before sending out packets are to retrieve the correct transmission queue, disable software IRQs (as bottom halves could delay the traffic), and lock the queue for this CPU.

5. Increment the reference counter by the amount of bursting data about to be sent. This should not happen here, and is discussed in section 4.8.

6. Start the sending loop: send a packet with the xmit_more-compliant function netdev_start_xmit(). The latter takes as an argument, among others, a boolean indicating whether there is more data to come: if the SKB is unique it is set to false, otherwise it is set to true until we run out of bursting data to send.

7. If netdev_start_xmit() returns a transmission error, the loop exits, except when the device was busy, in which case one more attempt is made.

8. In case of success, update the counters: number of packets sent, number of bytes sent and sequence number.

9. If there is still data to be sent (i.e. burst > 0) and the queue is not frozen, go back to the start of the sending loop. Otherwise, exit the loop.

10. Exit the loop: unlock the queue bound to the CPU and re-enable software IRQs.

11. If all programmed transmissions are complete, pktgen checks that the reference counter of the last SKB is 1, then stops the program.

12. Otherwise the function ends here.

Figure 2.8: pktgen transmission algorithm
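To make the control flow of steps 1-12 easier to follow, the following is a schematic simulation written in Python; it is not the kernel C code, and the FakeSkb class and driver_ok callback are placeholders standing in for the real SKB and driver. Only the burst/reference-counter bookkeeping described above is mirrored.

# Schematic simulation of the pktgen_xmit() control flow summarized above.
# All names are placeholders that mimic the kernel operations; this is not
# the kernel code, only an illustration of the burst/refcount bookkeeping.

class FakeSkb:
    def __init__(self):
        self.users = 1                  # reference counter, as in struct sk_buff

def pktgen_xmit_sketch(burst, driver_ok=lambda more: True):
    skb = FakeSkb()                     # step 2: allocate and fill a packet
    skb.users += burst                  # step 5: bump refcount by the burst size
    sent = 0
    while burst > 0:                    # step 6: sending loop
        more = burst > 1                # "xmit_more" hint passed to the driver
        if not driver_ok(more):         # step 7: an error (other than busy) exits
            break
        sent += 1                       # step 8: update counters
        skb.users -= 1                  # the "driver" consumes one reference
        burst -= 1                      # step 9: loop while data remains
    return sent, skb.users              # step 11 expects the refcount back at 1

print(pktgen_xmit_sketch(burst=4))      # -> (4, 1): four copies sent, one reference left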

2.4.4 Performance checklist

Turull et al. [36] issued a series of recommendations to make sure the system is properly configured to yield the best performance from pktgen traffic generation.

• Disable frequency scaling, as we will not focus on energy matters.

• The same goes for CPU C-states: their purpose being power saving, we should limit their use so that the CPU avoids creating latency by falling into a "sleep" state.

• Pin the NIC queue interrupts to the matching CPU (or core), aka "CPU affinity". This recommendation was already issued by Olsson [5].

• Because of the latter recommendation, one should also deactivate interrupt load balancing, as it spreads the interrupts among all the cores.

• NUMA affinity, which maps a packet to a NUMA node, can be a problem if the node is far from the CPU used. As explained previously, pktgen supports assigning a device to a specific node.

• Ethernet flow control has the possibility of sending a "pause frame" to temporarily stop the transmission of packets. We will disable this as we will not focus on the receiver side.

• Adaptive Interrupt Moderation (IM) must be kept on for maximum throughput and to minimize CPU overhead.

• Place the sender and receiver on different machines to avoid having the bottleneck located on the bus of a single machine.

We will later be carefully adjusting the parameters of the machines used through the help of scripting and/or Kernel/BIOS settings if possible.

2.5 Related work – Profiling

Profiling is the act of collecting records of a system (or several systems), called the profile. It is commonly used to evaluate the performance of a system by estimating whether certain parts of it are too greedy or too slow, e.g. taking too many CPU cycles for their operations compared to the rest of the actions to be executed. We will only pay attention to Linux profiling, as the entire subject was based on this specific OS, and therefore will talk about techniques that might not be shared among other commonly used operating systems (e.g. Windows or BSD based). Two profiling systems were investigated during this thesis: the first one is perf [38], and the second one is in fact more than a profiling tool, as it has several other purposes and was only recently turned into a profiling tool in the latest kernel versions: eBPF [7].

2.5.1 perf

Perf, also called perf_events, has a fairly broad spectrum of profiling capabilities. It is based on the notion of events, which are measurement points that perf pre-programmed inside the kernel. The tool has, by default, several events from different sources [39]:

• Hardware events: use CPU performance counters to gain knowledge of CPU cycles used, cache misses and so on.

• Software events: low-level events based on kernel counters, for example minor faults, major faults, etc.

• Tracepoint events: based on ptrace, which is the same library used by gdb to debug user-space programs, perf has several pre-programmed tracepoints inside the kernel. They are located on "important" functions, meaning functions that are almost mandatory for a system call to function correctly. For example, the tracepoint to deallocate an SKB structure is called sock:skb_free. The list of tracepoints usable by perf can be found by running sudo perf list tracepoints.

• Dynamic tracing: this is NOT exclusive to perf, it is a kernel facility that perf uses for monitoring. The principle is to create a "probe", called kprobe if located in kernel code or uprobe if in user code. This is an interesting functionality as it brings the ability to monitor any precise function we wish to investigate, instead of relying on general-purpose functions (tracepoints).

• Sampling frequency: perf is able to take snapshots at a given frequency to check the CPU usage. The more a function is called, the more samples are aggregated and the more CPU the function is considered to take (as a percentage of total samples). One of the perks is perf's ability to also record the PID and call stack, providing full knowledge of what and who caused the system to use the CPU, as a single function name might not only be hard to pinpoint but might also be called from several spots.

Kernel symbols

The kernel keeps a mapping of addresses to names to be able to translate the results of executions into a human-readable output. The names can be matched to several things, but we will only pay attention to function and variable names.

Overhead calculation

With the -g option, which walks the entire call stack when calculating the total percentage of utilization, perf shows two percentages per function: self and children. This is because a function can obviously call other functions recursively, making the "actual" total amount of time spent in the caller function biased. Therefore the split makes perfect sense: the "self" number represents the percentage of samples taken within the function itself, and the "children" number corresponds to the total percentage induced by the function, including the function calls it performs, whose percentages are therefore included in that number too.

Figure 2.9: Example of call-graph generated by perf record -g foo [38]

2.5.2 eBPF

BPF

Historically, there was a need for efficient packet capturing. Other programs existed, but they were usually costly. Along came BPF, for Berkeley Packet Filter, with the idea of making user-level packet capture efficient. eBPF is the extended version of BPF, which has been greatly enhanced in recent versions of the kernel; we will discuss those differences shortly. The idea is to run user-space programs, i.e. filters, inside the kernel space. While this may sound dangerous, the code produced by user space MUST be secure: only a few instructions can actually be put inside such a filter. To restrict the available instructions, BPF has created its own interpreted language, a small assembly-like instruction set.

There is a structure (linux/filter.h) that can be used by the user-space defined program to explicitly pass the BPF code to the kernel:

struct sock_filter {    /* Filter block */
        __u16 code;
        __u8  jt;
        __u8  jf;
        __u32 k;
};

Listing 2.1: Structure of a BPF program

The variables within this structure being:

• code: unsigned integer which contains the opcode to be executed.

• jt: unsigned char containing the address to jump to in case of test being true

• jf: unsigned char containing the address to jump to in case of the test being false

• k: unsigned 32-bit value usually containing a test constant or an address to load from/store to.

To attach a filter to a socket (as it was originally designed for) one must pass through another structure:

struct sock_fprog {             /* Required for SO_ATTACH_FILTER. */
        unsigned short len;     /* Number of filter blocks */
        struct sock_filter __user *filter;
};
[...]
struct sock_fprog val;
[...]
setsockopt(sock, SOL_SOCKET, SO_ATTACH_FILTER, &val, sizeof(val));

Listing 2.2: Binding to a socket

The first field is simply the number of instructions and the second one a pointer to the previous structure. The __user macro adds an attribute telling the kernel that the code it is about to run shall not be trusted; this is needed for security. Last but not least, to actually make the connection between the sock_fprog structure and the socket itself, setsockopt() is called as shown above, assuming we correctly opened a socket with the file descriptor sock.

The complexity of BPF programming resides in being forced to write this pseudo-assembly. Tools such as libpcap or tcpdump generate it automatically; however, for programs written in C this quickly becomes too inconvenient and should not be done by hand.

Figure 2.10: Assembly code required to filter packets on eth0 with tcp ports 22.

Figure 2.10 illustrates how complex creating even a simple BPF program is. As you might recognize from listing 2.1, each row is indeed divided into the four fields explained above.

Extended BPF

Over the last few years, BPF has been remodelled. It is not limited to packet filtering any more, and can now be seen as a virtual machine inside the kernel thanks to its specific instruction set [7]. Breaking the shackles of packet filtering came with a wealth of new features, which we will now explain.

• The size of the registers and arithmetic operations switched from 32 bits to 64 bits, unlocking the power of today's CPUs, which are 64-bit oriented; at least for performance-oriented systems.

• While eBPF programs are not backward-compatible with classical BPF, the translation between the two is done before execution, making it seamless for the user.

• Instead of being bound to a socket, there is now a dedicated system call, bpf(), to be able to insert eBPF programs easily from the user-space.

– The system call is unique and takes as parameters the different actions that can be executed. There are wrapper functions that abstract the use of the system call, making it more human-readable.

– On execution of the system call, a verifier checks that the instructions are considered "secure". The program is in fact simulated to check whether any access might be problematic for the security of the system.

– Since kernel 4.4 (released at the beginning of this thesis), the bpf() syscall does not require root to be launched; however this is of course only relevant for what is accessible to a regular user, and is therefore limited to socket filters [40].

• The framework now has integrated maps:

– The maps are a simple key/value storage format.

– They can be accessed either from user space or kernel space.

– The key and value formats can be custom structures.

• eBPF programs can be used as kprobes, mainly because their "secure" property makes sure they will not leave the system hanging. However, certain functions are not allowed to be used as kprobes.

• eBPF programs can be used as a tc classifier.

Tool chain

eBPF uses Just-In-Time (JIT) compilation, which makes the compilation into machine code happen at run-time [41]. We will not get into the details of how this actually makes the process faster, but it is said to increase performance 3.5x to 9x for interpreted languages [42]. Note that it has to be turned on through the procfs to function, and the kernel must have the CONFIG_BPF_JIT option turned on. To generate optimized code through JIT, the tool chain behind it is complex, but it is black-boxed through the use of the Low Level Virtual Machine (LLVM): a project that provides a modern source- and target-independent optimizer, along with code generation support for many popular CPUs. Its libraries are built around the IR language used to represent data [43]. To compile from the C language, the clang program is used; developed alongside LLVM, it is a fast compiler front-end that produces code for LLVM.
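Assuming the usual sysctl location, enabling the JIT at runtime amounts to a single write; the snippet below is a sketch that requires root privileges as well as a kernel built with CONFIG_BPF_JIT.

#!/usr/bin/env python
# Enable the BPF JIT compiler at runtime (root required).
# /proc/sys/net/core/bpf_jit_enable is the usual sysctl path; the kernel must
# have been built with CONFIG_BPF_JIT for this knob to exist.
with open("/proc/sys/net/core/bpf_jit_enable", "w") as f:
    f.write("1\n")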

BCC

The above paragraph shows how complex getting an eBPF program from C code to execution truly is. And because of the lack of examples on how to actually compile eBPF, even producing a typical "hello world" is not straightforward. The silver bullet to this problem is brought by the Iovisor project, which provides a "compiler collection" to automate and simplify the creation of eBPF programs: the BPF Compiler Collection [44]. Understanding and building programs with BCC was a great part of the work of this thesis, and will be detailed in the BCC programming section.

Chapter 3

Methodology

In this chapter we will be describing how the experiments were carried out.

3.1 Data yielding

During the experiments, the ability to create a lot of data without having to constantly monitor the execution quickly became a need. The principle is to create convenient methods that automatically vary the parameters, more or less aggressively according to the needs, and store the results, along with the experiment settings, into a file to be post-processed into human-understandable data, e.g. plots. The solution was to create a program taking as parameters the setting(s) to vary, the stepping and the limit to be reached.

Figure 3.1: Representation of the methodology algorithm used

3.2 Data evaluation

The data was acquired through empirical testing, adjustment and tuning to fit the situation. The data acquired must follow several rules in order to be kept as a final result of this work:

• Reproducibility: the experiment must yield the same results when run over the same settings. While this may sound obvious, a lot of data has been discarded after several hundred tests due to the behaviour not being exactly reproducible. This does not necessarily mean that the results are bogus; it is either due to bad measuring or to the anomaly being spurious and taking too much time to pinpoint.

• Granularity: As this thesis focused mostly on high performance, a single byte might or might not change the outcome of an experiment. Therefore the experiments were first run with an average stepping; meaning with settings variation large enough to end a set of experiments in a reasonable amount of time (e.g. few hours) but small enough to reduce an anomaly to a particular range of settings. Of course finding this trade-off has also been part of the work and required experimenting.

• Interpretation: to ensure that the results are correctly interpreted, extra profiling tests were always run to be certain that they were not being compromised by another program that could conflict in any way.

3.3 Linear statistical correlation

Throughout the thesis we used profiling whose goal was to find a correlation between a problem and its origin. The idea was to create a batch of tests, measure a particular event along with each test, and see whether there is a possible match between the two sets. For instance, we would run a throughput test with pktgen while increasingly growing the size of the packet; for every experiment, we would also record the number of cache misses. To find out whether or not those factors are linearly correlated, we use the Pearson product-moment correlation coefficient:

r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}

Figure 3.2: Pearson product-moment correlation coefficient formula.

Without getting into the details of the formula, the r value it yields lies in the range −1 ≤ r ≤ 1. The interpretation goes as follows: a value of 1 indicates a positive correlation between the two sets of data, −1 indicates a negative correlation, and 0 implies no correlation. With realistic data none of these exact values will ever occur; r will rather be a real value between −1 and 1, and the results will be interpreted as follows [45]:

• 0.00 ≤ |r|≤ 0.19 ”Very weak”

• 0.20 ≤ |r|≤ 0.39 ”Weak”

• 0.40 ≤ |r|≤ 0.59 ”Moderate”

• 0.60 ≤ |r|≤ 0.79 ”Strong”

• 0.80 ≤ |r|≤ 1 ”Very Strong”
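As a concrete illustration of the formula, here is a minimal computation of r for two made-up measurement series; the values are hypothetical and only serve to show the interpretation scale above.

#!/usr/bin/env python
# Minimal Pearson product-moment correlation, following the formula above.
# The two series are hypothetical examples (packet size vs. cache misses).
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    num = n * sxy - sx * sy
    den = sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2)
    return num / den

pkt_size     = [64, 128, 256, 512, 1024]           # example settings
cache_misses = [1.2e6, 1.9e6, 3.1e6, 5.8e6, 11e6]  # example measurements
print(pearson_r(pkt_size, cache_misses))           # close to 1: "very strong"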

Chapter 4

Experimental setup

While most of the documented effort of this thesis is aimed towards unveiling the underlying framework and its associated performance bottlenecks, a lot of the concrete work resided in installing and setting up environments. Needless to say, the reader does not need to know every detail; however, interesting issues that arose will be mentioned.

4.1 Speed advertisement

During this research, a lot of traffic generators and libraries have been examined. No matter what their perks were, they share a common problem: the "speed advertised" by almost all of them is unreliable. This is a direct consequence of the lack of common practices within the traffic-generation community; as a result there are no academic benchmarks set for performance reviews, making the advertised speed more of a marketing figure than a factual processing indicator. There are three major factors that should be advertised to be able to assess the throughput of a traffic generator accurately.

Hardware

The first and most important factor should always be an accurate description of the machine's architecture. Describing it merely as "commodity hardware" is too vague: even though it heavily implies off-the-shelf components were used, such a wide range should not be tolerated in a careful investigation. The most important criteria are (but not exclusively):

• CPU(s): model, clock speed, number of cache level including their respective sizes, number of cores, maximum PCIe version capability.

• NIC(s): model, maximum theoretical speed, PCIe version and number of lanes, multi-queue capability.

• Motherboard: model, block diagram, QPI if needed.

Those key performance criteria are obviously subject to change especially with new features being added.

Underlying software

While this factor is less relevant in some cases, e.g. DPDK which bypasses the Linux kernel, the configuration may still be relevant as it is very often subject to change and might still affect the overall performance of the system. The criteria that should be reported are straightforward: the version of the kernel used and any kind of performance-affecting options. The drivers and their versions should also be mentioned, along with any optimizations that could affect performance.

Scalability

SMP being effectively the only architecture one can buy nowadays, giving a single example of a software's performance is not good enough: the scalability of the process must be documented. And this is not limited to a single NIC; as link aggregation is a very common technique, showing the results of the software running several processes over different NICs is an excellent way to testify to its scalability.

4.2 Hardware used

This section is about hardware-specific information. In total, four machines were provided as support for this thesis, two from the KTH CoS laboratory and two from the Ericsson performance lab. A thorough description of their components is given, along with a block diagram as a summary.

4.2.1 Machine A – KTH

First and foremost, this is the machine that helped calibrate and carry out most of the experiments, serving as a benchmark. It was chosen not because it possessed the most recent hardware, but rather because of its convenient accessibility in the laboratory at Electrum, Kista.

CPU (×1): Xeon E5520 @ 2.27 GHz
  L1i cache: 32K, L1d cache: 32K
  L2 cache: 256K, L3 cache: 8192K
  QPI: 5.86 GT/s

Motherboard  The motherboard used was the Tyan S7002. While it supports up to 2 CPUs, only one was present while carrying out the tests. This implies there are NUMA nodes; however, the machine was set up so that the memory bank is always local to the unique CPU, making NUMA nodes almost irrelevant in our case. The CPU and the NIC are linked through a northbridge.

Memory Total available memory: 31GB.

NIC  The network interface card assessed was the 82599ES 10-Gigabit controller, using an SFP+ transceiver. The driver used was ixgbe version 4.3.15.


Figure 4.1: Simplification of block diagram of the S7002 motherboard configuration [46, p. 19]

4.2.2 Machine B – KTH

CPU (×2): Xeon E5-2650 v3 @ 2.30 GHz
  L1i cache: 32K, L1d cache: 32K
  L2 cache: 256K, L3 cache: 25600K
  QPI: 9.6 GT/s

Motherboard  The exact name of the product is ProLiant DL380 Gen9, an HP server board from 2014. Two CPUs were used, hence implying NUMA nodes. Important note: the official block diagram was not found.

Memory A total of 98GB of RAM were present on the system.

NIC  The card from Machine A was moved over to this machine to check performance differences between the two, hence the same 82599ES 10-Gigabit controller was present. The driver used was ixgbe version 4.3.15.

Distribution  To carry out experiments on pktgen, the same Bifrost 7.2 distribution along with kernel 4.4 was tested on this machine. A Fedora 23 server version, also with a 4.4 kernel, was used to work on eBPF.


Figure 4.2: Simplification of block diagram of the ProLiant DL380 Gen9 motherboard configuration.

4.2.3 Machine C – Ericsson

CPU (×2): Xeon E5-2658 v2 @ 2.40 GHz
  L1i cache: 32K, L1d cache: 32K
  L2 cache: 256K, L3 cache: 25600K
  QPI: 8 GT/s

Motherboard  The Intel motherboard S2600IP was used to carry out the experiments on this machine. Please note that this board was faulty, as explained in the results section, and in no way do we endorse its use for experiments related to high-speed networks.

Memory A total of 32GB of RAM were present on the system.

NIC Intel’s Ethernet Controller XL710 for 40GbE QSFP+ transceivers was connected to this machine. The driver used was i40e version 1.5.16.



Figure 4.3: Simplification of block diagram of the S2600IP [47] motherboard configuration.

On a side note, we did not have physical access to this machine, but we were allowed to supervise its configuration to check that the hardware was put in the correct PCI slots, as one cannot map a bus number to a physical slot from the command line.

4.2.4 Machine D – Ericsson

CPU (×1): Xeon E5-2680 v4 @ 2.40 GHz
  L1i cache: 32K, L1d cache: 32K
  L2 cache: 256K, L3 cache: 35840K
  QPI: 9.6 GT/s

Motherboard  The Intel motherboard S2600CWR was used to carry out the experiments on this machine.

Memory A total of 132GB of RAM were present on the system.

NIC Intel’s Ethernet Controller XL710 for 40GbE QSFP+ transceivers was connected to this machine. The driver used was i40e version 1.5.16. As previously, we did not have direct access to the machine but we were granted a permission to check MACHINE D Memory channel QPI 9.6 GT/s DDR3 Xeon DDR3 E5-2680 v4 Empty DDR3 CPU slot DDR3 2.40GHz 14 cores


Figure 4.4: Simplification of block diagram of the S2600CWR [48] motherboard configuration.

4.3 Choice of Linux distribution

ELX

To follow Ericsson's policy on security, it was strongly advised to install ELX, which is Ericsson's homebrew version of Ubuntu with enhanced security updates. It was primarily used to compile whatever version of the kernel was needed for the experiments and to transfer the resulting boot image to the target distribution. The installation is trivial as there is an included GUI (as for Ubuntu) that makes all the choices for the user, e.g. encrypting the hard drive by default.

Arch Linux [49]

On the recommendation of an employee at Ericsson we installed Arch Linux, the principal reasons being the very active community and the constant updates being brought to the distribution. The drawback is that it does not include any kind of graphical interface by default, making the installation fairly lengthy on the command line. On the other hand, seeing that, preferably, the latest stable release of the kernel should be used to carry out experiments, it was the best choice to easily compile and use new kernel versions. This distribution was also mandatory, as Machine C from Ericsson was pre-configured with it and we did not have the rights to modify it.

Bifrost – 7.2 [50]

Bifrost is a distribution that aims to be a small, network-oriented Linux distribution. Its small size is the result of a no-frills mentality, stripping out a lot of commands and programs that are commonly found (e.g. Python, Perl), but it comes with extra packages designed to monitor and help manage network-related attributes of the machine. This distribution's kernel is not trivial to modify, as it has a special initramfs that has to be compiled with the kernel in order for it to work. On another side note, in order to easily boot different kernels, a Syslinux bootloader is included with the Bifrost image by default. While this avoids the need for the user to install one, tweaking its content is a rather painful manoeuvre, and we made the choice of installing GRUB to simplify the update process to a single command. The installation of Bifrost is fairly simple, as there are two ways to install it: either decompress the OS directly onto the root of the key (but several commands must then be executed in order to install the boot-loader), or do a carbon copy of the image provided on their website. This second method comes with the drawback of having a mandatory file system with a fixed size of 1 GiB; however, this can be extended through several commands, cf. appendix A.1.

Ubuntu – 16.04 [51] For the same reason as Arch Linux, Ubuntu was mandatory as it was installed on Machine D from Ericsson and we did not have the rights to modify it.

Fedora – 23 [52]

As we ran into numerous troubles with the installation of the BCC framework (notably a total breakdown of the package manager pacman on Arch), we ultimately decided to switch to a distribution we were used to manipulating, and that had pre-compiled binaries for the framework. We went for the server version to get fewer GUI-bloated packages, as having a GUI on a system installed on a USB stick may cause severe latency at boot.

4.4 Creating a virtual development environment

As we did not have direct access to machines upon our arrival at Ericsson, we decided to set up a virtual machine to be able to develop without risking the safety of our machine and especially the office network. Therefore we installed, on referral from a colleague, Arch Linux. We then compiled our own version of the kernel 4.4 to acclimatize ourselves to the procedure. However, the limits of such an infrastructure were quickly reached: not only performance-wise, but as

a virtualized architecture is substantially different from an actual one, several problems occurred. For instance, trying to profile the virtual machine became a hassle as perf did not have access to hardware events.

4.5 Empirical testing of settings

A good portion of our time was spent trying to find settings that could influence the overall performance of pktgen, and hence of the kernel itself. The first step was to check a large range of pktgen parameters and see whose presence or absence led to the most significant change. Quickly, the burst parameter along with clone_skb turned out to make the overall traffic sky-rocket. Running a "vanilla" pktgen experiment, meaning without options supposed to enhance the speed or latency of the system, turned out to be quite slow but drew a good baseline to compare with.

An important note regarding pktgen experiments using the Bifrost distribution: version 7.0 and onwards include patches from Daniel Turull that have not been added to the official kernel tree. This matters little since those patches concern the receiver side, and since we only care about transmission we can disregard the change. However, during our profiling of the system, the functions introduced by those patches often turned out to be among those with the largest number of samples collected, implying they are still somehow called from the sender side and perhaps lowering the maximum throughput. Note that this could be a side-effect of perf rather than an actual problem.

This process of repeated trials, even though automated through scripts, took at least a hundred hours to carry out all the required experiments, usually because we ran several nested loops to see whether two parameters somehow conflict or benefit from each other depending on their values.

The scripting itself was first realised with a bash script. An interesting note: on Bifrost, the built-in echo command does not function correctly when redirected towards /proc/net/pktgen, hence one should use /bin/echo instead. To avoid having to constantly monitor the experiment and manually stop it, we always limit the number of packets. As explained in the literature review, pktgen must run for at least a few milliseconds to ensure reliable results. To enforce this rule, we always set the count to at least a million packets when running minimum-sized packet transfers; this is usually enough for 10G and 40G networks. After a pktgen experiment has run, the results are caught and stored in a simple text file, as there is no requirement for complex encoding or compression. Moreover, it makes the operation fast, making the loop run faster.

Post-processing  To make interesting data out of the harvested results, we used Perl scripts to easily loop through them. With the -n or -p switch, Perl adopts a behaviour similar to that of the awk command, but provides more flexibility with its built-in regular expression parser, making the recognition of text patterns easy. As we usually ran the script on 1 to 8 or 10 cores, the expected number of result lines is easily calculated (e.g. if run on 5 cores, 5 lines of results are expected), which allowed short and elegant parsing solutions like the one provided in appendix B, even when looped over several hundred times.

4.6 Creation of an interface for pktgen

While we believe that pktgen is not suited to people who have little interest in kernel performance, who may prefer plain bandwidth tests perfectly served by tools like iperf, we think that having data from a larger set of users would be interesting. But we believe the kernel interaction through the procfs is too esoteric for pktgen to be used by some users. On top of that, the documentation provided in [37] never in fact clearly stipulates how to interact with the /proc files, which can be misleading for neophytes. In the same documentation the links at the bottom are not reachable any more; on the

other hand, in the kernel source tree the directory samples/pktgen is filled with concrete examples. We created a program whose aim was to kill two birds with one stone:

• Provide a simple command-line interface for pktgen. This includes short-cuts to different settings to be provided, and the possibility to aggregate them to several threads instead of having to program each thread one by one. Also it stores the current configuration to a subsidiary configuration file for the user to re-create a carbon copy of the experiment.

• Standardize the performance results of pktgen through the ability to easily export its results and the numerous system metrics that might be of influence. As said previously, performance figures turn out to be meaningless if not coupled with this context, hence the program aimed to export in a portable format that could be:

– Parsed by the original program to produce a simple, and if needed reduced, output.

– Understood by browsers, as sharing on blogs/websites is a common practice in the kernel development community.

– Reasonably easy for a human to read if needed.

To fulfil the above requirements, the output was produced in the JSON format.

This program was also created because of a simple problem: constantly varying parameters with scripts quickly became messy, as constant editing of the same file, or having different versions of the same file, often ended up in confusion, at least on the scale of thousands of different experiments over several machines. The program was written in the Perl language for several reasons:

• We already had a certain affinity with it.

• It is included with most Linux distributions.

• It has more advanced features than Bash.

• Several sample scripts included with the Linux kernel are already written in Perl.

• It is allegedly the language with the most performance in text parsing [53].

The software is about 300 lines of code and stores the custom configuration in a temporary file, to allow re-using the same configuration very easily. It prints the available options when called with the --help parameter.

The primary strength of this script is to allow configuring all threads on a single line: when passing the -t argument you can either give:

• A single integer

• A list separated by commas

• A range separated by a dash.

For example, pktgen -t 0-3 -d eth0 -c 10000 -b 10 will configure pktgen threads 0 to 3 included to use the interface (device) eth0 with 10000 packets and an associated burst value of 10. To launch the program, simply do pktgen run or append run to the previous command. Re-issuing the same command will launch the exact same configuration. To ensure two instances of the script cannot be run concurrently, lockfiles were added. When given the -w FILE parameter, the output is written to the given file; it contains the results from each thread along with the full set of parameters pktgen took.

Here is the helper:

pktgen v0.1
 -p            Print current configuration.
 -r            Remove all devices.
 -f            Flush: clean all configuration and remove all devices.
 -t NUMBER     Bind actions to a specific thread number.
 -d INTERFACE  Bind actions to a specific interface. Mandatory.
 -c NUMBER     Set number of packets. 0 for infinite generation.
 -s NUMBER     Set size of packet. Minimum 60, maximum should be MTU.
 -D NUMBER     Set delay between packets.
 -C NUMBER     Set amount of cloned packets.
 -b NUMBER     Set amount of bursted packets.
 -md MAC       Modify MAC destination address.
 -ms MAC       Modify MAC source address.
 -ad IP        Modify IP destination address.
 -as IP        Modify IP source address.
 -w FILE       Output the results to a JSON file.

Figure 4.5: Output using the --help parameter on the pktgen script.

4.7 Enhancing the system for pktgen

As explained in 2.4.4 there are several things we can tune for pktgen to achieve maximum throughput.

Disabling frequency scaling  The purpose is to avoid frequency scaling, which could skew the results, especially if the transmission is short. Frequency scaling is a great power saver and should not be disabled on a normal basis [54]. On kernels after version 3.9, frequency scaling is in fact regulated through a driver. For Intel CPUs, which were the only brand tested here, the driver called intel_pstate manages the frequency scaling. There are ways to interact with the driver; however, the commands are not available on all distributions, so to facilitate and generalize the procedure we simply disabled it. To do so, one must add intel_pstate=disable to the kernel boot line. For instance, if you use Syslinux:

1. Search for the configuration file. Usual locations are /boot/syslinux/syslinux.cfg, /syslinux/syslinux.cfg and /syslinux.cfg.

2. With a text editor, open the file and find the section matching your kernel version (uname -r prints it).

3. On the line starting with "APPEND", add intel_pstate=disable.

4. Reboot

You can now set a CPU frequency governor for your cores [55]; it is the policy your CPU will follow. We have to select the "performance" governor to make sure the frequency stays at its maximum without variation. To do so, you must write into the /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor file. Example: echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor. To check that the frequency is pinned correctly you can monitor the frequency of all cores by running watch 'grep "cpu MHz" /proc/cpuinfo'. If you see variation, the governor was not set properly or the driver is still in place.
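For completeness, here is a small sketch that applies the same governor to every core; it is the scripted equivalent of the echo command above and assumes the standard sysfs layout.

#!/usr/bin/env python
# Apply the "performance" governor to every core (root required); this is the
# scripted equivalent of the echo command above.
import glob

for path in glob.glob("/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor"):
    with open(path, "w") as f:
        f.write("performance\n")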

IRQ Pinning The kernel sets an interrupt ”affinity” for each interrupt that is registered, which can be translated as a

list of CPUs allowed to catch and treat the interrupt. This is implemented as a bit-mask corresponding to the allowed cores. The list of interrupts registered by the OS, with their associated numbers, can be found in /proc/interrupts. To check the allowed cores we must read the value of /proc/irq/X/smp_affinity, with X being the corresponding interrupt number. When you need to pin an interrupt to a particular core, you calculate the bit-mask as 2^core; you can also stack several cores by adding their bit-masks together. Example, to pin interrupt 40 to core 3: 2^3 = 8, then echo "8" > /proc/irq/40/smp_affinity. Keep in mind that the core numbering starts from 0.
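The same pinning can be scripted; the sketch below computes the mask and writes it in hexadecimal, with the IRQ number and core taken from the example above.

#!/usr/bin/env python
# Pin an interrupt to a single core by writing the hexadecimal bit-mask to
# smp_affinity (root required). IRQ number and core are example values.

def pin_irq(irq, core):
    mask = 1 << core                     # e.g. core 3 -> 0b1000 = 0x8
    with open("/proc/irq/%d/smp_affinity" % irq, "w") as f:
        f.write("%x\n" % mask)           # smp_affinity expects a hex mask

pin_irq(40, 3)                           # same example as in the text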

Interrupt load balancing  On certain systems there is a daemon that takes care of setting the interrupt masks to balance the system load. It is called "irqbalance"; however, as we do manual IRQ pinning, it will collide with and change our settings, hence if it is present on your system you must disable it.

C-states  These could be an issue as they were shown to introduce added latency. To disable them one must go into the BIOS, look for the C-state settings and set them to "performance" or an equivalent setting to have minimum issues with them.

Adaptive Interrupt Moderation  The hardware creates interrupts at a certain interval when receiving and sending frames. With one interrupt per frame, the overhead caused can result in such a CPU usage that it ultimately becomes the bottleneck. Adaptive Interrupt Moderation has to be left on to achieve maximum throughput, as it saves CPU consumption. This can be set up by passing a parameter to the modprobe command; for instance, with the i40e driver, modprobe i40e InterruptThrottleRate=1 enables it, although this is the default value.

Segmenting sender and receiver  This scenario was always respected: as we tried to gather the best possible performance from one machine, we did not want issues caused by sharing the same system for the two functions. However, the receiver was never investigated, as it was used as a black hole for packets; its only purpose was to give the interface a carrier and to check that connectivity was established correctly, to avoid flooding a regular network with pktgen packets.

4.8 pktgen parameters clone conflict

While examining the code of pktgen we found out that the current implementation of the xmit_more API (i.e. the burst parameter) collides with the cloning of packets (i.e. the clone_skb parameter). The code manually tampers with the reference counter, incrementing it by the value of the burst parameter, effectively making a clone even though no clone_skb value had been passed.

atomic_add(burst, &pkt_dev->skb->users);

xmit_more:
        ret = netdev_start_xmit(pkt_dev->skb, odev, txq, --burst > 0);

Listing 4.1: Lines proving the incoherent behaviour in pktgen.c (v2.75)

This does not create any problems when using both parameters, but according to the program's specification there is no obligation to clone the packet when using the xmit_more API. In the current state of things you cannot use burst without an inherent clone_skb. A patch was crafted to fix this issue (cf. appendix C.4); however, due to the lack of review it received, it was not applied. For our experiments this problem is not critical, as we are looking for the best performance achievable, hence stacking the xmit_more capabilities on top of cloning would have been mandatory anyway.

Chapter 5

eBPF Programs with BCC

This section is dedicated to showing the reader how the eBPF programs were created. Figuring out how to create eBPF programs was a lengthy process. For the reader to understand the code created in the results section, we give a short overview of its structure with the help of BCC [44].

5.1 Introduction

Programming with BCC is divided into two parts: the eBPF program, which is the in-kernel code executed, and the front-end interaction, written in Python, to read the results from the execution of the former. eBPF programs can be attached at several points in the kernel:

• socket: as originally intended, you can bind an eBPF program to a socket to inspect the traffic passing through it.

• xt_bpf: kernel module for netfilter.

• cls_bpf: kernel module for traffic-shaping classifiers.

• act_bpf: BPF-based action (since Linux 4.1).

• kprobes: BPF-based kprobes.

The above list is not exhaustive; however, we only use the kprobe hook.

5.2 kprobes

While they are not a new technology, kprobes are an efficient way of putting a tracing point in the kernel. The traditional way of adding them requires compiling a whole new kernel module (or modifying an existing one) and adding it to the system, for the OS to register and activate the kprobe. This method is complex, especially as the user should see kprobes as a dynamic, convenient way to add breakpoints to functions that require monitoring or tracing. The perf_events tool allows adding kprobes but cannot run dedicated programs on them. This is where eBPF comes into play.

Hello, world!  The bcc/examples/hello_world.py file gives a good and simple overview of how to add a kprobe; however, for the sake of segmenting the code into relevant parts we tweak it into the following two code listings. This segmentation between C and Python should always be used for the sake of code clarity, even though the original hello_world.py found in the repository does not follow it.

#include <uapi/linux/ptrace.h>

void handler(void *ctx) {
        bpf_trace_printk("Hello, World!\n");
}

Listing 5.1: hello_world.c

45 The listing 5.1 code shows the minimum amount of code needed to create a kprobe with a .c file attached.

1. Inclusion of the user-space API of ptrace (uapi/linux/ptrace.h), required to bind a kprobe with BCC.

2. The handler function will be called when the probe is hit, and MUST have a context pointer, as all other eBPF functions do when in kprobes. If needed, extra arguments can be added to fit the prototype of the function probed and then access their values.

3. bpf_trace_printk() is a helper included in helpers.h that simplifies printing from kernel space towards user space, as the original printk normally redirects to the dmesg output.

1  #!/usr/bin/python
2  from bcc import BPF
3  b = BPF(src_file="hello_world.c")
4  b.attach_kprobe(event="sys_clone", fn_name="handler")
5  b.trace_print()

Listing 5.2: hello_world.py

The above code represents a classical kprobe monitoring program with BCC. The first two lines are mandatory, and will not work without the kernel headers being installed on the distribution; this is because the .c file includes the user-space API for ptrace, but the error printed is not explicit about it. The 3rd line initializes the program by giving it a source file for the eBPF program to be run. Note that the BPF object recognizes keywords inside the program and automatically adds the corresponding headers if they are missing; however, that list is small and the user should not rely on it. When initializing, you can pass the program either as a separate C file, as recommended, or as a string of text; doing both will not function correctly. The 4th line creates and attaches the kprobe to the given event, which is a kernel symbol; the handler function must be provided in the "fn_name" parameter. Last but not least, the 5th line simply waits indefinitely for data from a bpf_trace_printk call and prints it to the user.

5.3 Estimation of driver transmission function execution time

We aimed to use eBPF to check whether the performance reported by pktgen was accurate or not. In this case we aimed to verify whether some unusual driver latency could be revealed, so we used eBPF to calculate the latency by binding kprobes onto the driver. This experiment was based on the assumption that, when running a single pktgen thread, only a single core should be generating traffic, hence the driver functions should not be called concurrently. Therefore, one can calculate the amount of time the driver took to send packets by taking a timestamp at the beginning of a driver function and taking the difference with a timestamp taken at the return of the same function. With pktgen there is no way to uniquely identify each SKB: because of the burst option, the same packet is being passed to the driver for copy, hence there is no way to differentiate two SKBs, as they are exactly the same when cloned. In our case, the function traced was i40e_lan_xmit_frame. The program was split into two parts as explained in the work section. The C code is composed of two handler functions:

BPF_TABLE("hash", u64, u64, start, 262144);

int handler_entry(struct pt_regs *ctx, struct sk_buff *skb, struct net_device *netdev)
{
        u64 ts = bpf_ktime_get_ns();
        u64 key = 0;
        start.update(&key, &ts);
        return 0;
}

Listing 5.3: kprobe at the entry of i40e_lan_xmit_frame

The BPF_TABLE macro creates the eBPF map, called start. The handler_entry function executes as follows:

• First, we fetch the current timestamp through the eBPF helper function bpf_ktime_get_ns().

• We store the value of the timestamp in the start map under the key 0.

• Important note: Anything that has to be stored inside an eBPF map has to be done through a variable pointer with an initialized value. If one tries to store by giving an immediate and getting its pointer (e.g. &0) the program will not compile. Hence the ”key” variable with value 0.

• We store the timestamp value in the map under key 0. This is done to avoid having to dedicate a second map to this sole variable. It is not a problem since the other keys, representing the length of time taken by the function to run, cannot be equal to 0.

void handler_return(struct pt_regs *ctx, struct sk_buff *skb, struct net_device *netdev)
{
        u64 *tsp = NULL, delta = 0;
        u64 key = 0;

        tsp = start.lookup(&key);
        if (tsp != 0) {
                delta = bpf_ktime_get_ns();
                delta -= *tsp;
                start.increment(delta);
        }
}

Listing 5.4: latency measurement from driver interaction

For handler_return:

• We create a u64 pointer to hold the timestamp value, and initialize a delta value which will be the difference between the current time and the one stored in the map.

• The key has the same purpose as in the handler, i.e. retrieving the timestamp and respecting the eBPF access paradigms.

• We retrieve the value of the previous timestamp in tsp.

• If the timestamp has a value ≠ 0, we retrieve the current time, take the difference and store it in delta.

• The increment method of the table looks up the key and increments the associated value. Hence the map content is a set of pairs, each key representing an execution time and each value the number of occurrences of that execution time; e.g. key 450 with value 12 means that an execution time of 450 nanoseconds happened 12 times.

The associated Python code is minimal.

#!/usr/bin/env python
from bcc import BPF
from time import sleep

b = BPF(src_file="xmit_latency.c")
b.attach_kprobe(event="i40e_lan_xmit_frame", fn_name="handler_entry")
b.attach_kretprobe(event="i40e_lan_xmit_frame", fn_name="handler_return")
print "Probe attached"
try:
    sleep(100000)
except KeyboardInterrupt:
    for k, v in b["start"].items():
        # Calculate mean, variance and standard deviation
        pass

Listing 5.5: Python code to attach the probes and retrieve the map data

• We must import the bcc module in order for it to function.

• We associate the C program with the python front-end.

• Then the functions are bound as a kprobe and a kretprobe onto i40e_lan_xmit_frame, which we want to investigate.

• We wait for the user to create an interrupt (CTRL+C)

• We then loop through the key/value pairs present in the map, from which the statistics are computed (a possible implementation of this post-processing is sketched below). Be cautious to ignore key 0, as its value holds a timestamp and would completely skew the statistics.
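A possible implementation of that post-processing step, assuming the key/occurrence-count layout described above and skipping key 0, is sketched below; the helper name is ours, not part of BCC.

#!/usr/bin/env python
# Possible post-processing for the "start" map: each key is an execution time
# in nanoseconds and each value the number of times it was observed; key 0 is
# skipped because it stores the entry timestamp.
from math import sqrt

def histogram_stats(hist):
    """hist: {execution_time_ns: occurrence_count}, key 0 already excluded."""
    total = sum(hist.values())
    mean = sum(t * c for t, c in hist.items()) / float(total)
    var = sum(c * (t - mean) ** 2 for t, c in hist.items()) / float(total)
    return mean, var, sqrt(var)

# Inside the KeyboardInterrupt handler of listing 5.5 (BCC table entries are
# ctypes values, hence .value):
#   hist = {k.value: v.value for k, v in b["start"].items() if k.value != 0}
#   print histogram_stats(hist)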

Chapter 6

Results

This chapter will be dedicated to the results of the pktgen experiments along with the profiling realised with perf and eBPF.

6.1 Settings tuning

This section regroups the results from the different settings tested on each machine, to establish which ones currently produce the greatest amount of throughput. All the data in this section uses a packet size of 64 bytes.

6.1.1 Influence of kernel version

It was required to investigate the influence of the kernel version on the global results. To do so, all the long-term versions from 3.18 (which introduced the xmit_more API) onwards were tested: 3.18, 4.1, 4.4, 4.5 and 4.6 (a release candidate at the time of the tests). Figures 6.1a and 6.1b clearly demonstrate how close the performance is between the different kernel versions. Kernel 3.18.32 seems to have a slightly off throughput calculation, as it showed performance above the achievable theoretical limit. On the other hand, kernel 4.5.3 seems to be slightly under the others, perhaps due to another miscalculation or simply because the kernel is less efficient in that version. Because of this benchmark we decided to settle for version 4.4:

• It was the latest long term version available at the beginning of this thesis.

• It seemed to show accurate performance.

• As of April, machines C and D were patched to version 4.4 by the administrator, hence using it on machines A and B for parity seemed a fair strategy.

• It had extra eBPF features compared to older versions, which could come in handy for later.

6.1.2 Optimal pktgen settings

There are only a few options that can be modified to vary the performance: clone_skb and burst. The following graphs only show a few of the tested parameters; the actual testing range was in fact much greater. For instance, if a graph shows burst values of 10 and 1000, the range from 10 to 100 was also tested, as well as 1000, 10000, and so on; but as they did not show any significant difference they are hidden for readability purposes. As it turns out, the clone_skb parameter is currently implied when using burst (cf. 4.8), and is not shown on the graphs, also for readability purposes. A value of "burst 1" is the baseline: it is the default setting and does in fact not profit from the xmit_more API.

49 (a) pktgen throughput with no options

(b) pktgen throughput with burst 10

Figure 6.1: Benchmarking of different kernel versions under Bifrost (Machine A)

Interpreting the results:

• Figure 6.2a shows a slight speed advantage with a burst value of 10, until the different profiles merge into the line rate (14.88 Mpps) at around 4 cores.

• Figure 6.2b, on the other hand, shows a small speed disadvantage with a burst value of 10, until the different profiles merge into the line rate (14.88 Mpps) at around 4 cores.

• Figure 6.2c is on another scale than the others because of its 40G NIC. While the results are similar during the starting phase (1 to 4 cores), from 5 cores onwards there seems to be an advantage by a great distance.

On a machine with default ring settings, it seems a burst value of around 10 provides the best and most consistent performance across all machines.
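For reference, the 14.88 Mpps figure is simply the theoretical 10 Gbit/s line rate for minimum-sized frames, counting the 8-byte preamble and 12-byte inter-frame gap occupied on the wire:

#!/usr/bin/env python
# Theoretical line rate for minimum-sized Ethernet frames on a 10 Gbit/s link.
link_bps = 10e9
frame    = 64            # minimum Ethernet frame, in bytes
preamble = 8
ifg      = 12            # inter-frame gap
on_wire  = (frame + preamble + ifg) * 8   # bits actually occupied per packet
print(link_bps / on_wire / 1e6)           # ~14.88 Mpps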

50 (a) burst variance on machine A – 10G

(b) burst variance on machine B – 10G

(c) burst variance on machine D – 40G

Figure 6.2: Performance of pktgen on different machines according to burst variance.

6.1.3 Influence of ring size

The default transmission ring size can be obtained with the help of ethtool -g <dev>. It represents the size of the ring buffer shared between the kernel and the NIC, and is managed by the driver. In the official pktgen documentation [37] it is advised to increase the size of the original buffer, as "the default NIC settings are (likely) not tuned for pktgen's artificial overload type of benchmarking, as this could hurt the normal use-case.". This recommendation was likely issued by Jesper Brouer [56], as he is also the author of the xmit_more API. Therefore we created a nested loop to test the influence of the ring size along with the associated burst value, while monitoring the throughput. This was done on a single core, on machine D.

Figure 6.3: Influence of ring size and burst value on the throughput

The above figure starts its burst value at 5 and ends at 100. This choice was made because the baseline (burst 1) yielded results too small to be shown on the graph properly, far under 8 million packets per second at any ring size. The collected data depicts two things:

• The original ring size (512) is indeed too small to reach maximum performance. Setting a value between 1024 and 4096 does not seem to influence the maximum throughput of pktgen.

• Whilst a burst size of 10 may in fact be the best setting for small ring sizes (512-640 on this graph), the best results ever achieved were around a burst value of 25 to 40, with a ring size from 800 to 4096.

The recommended value of 1024 seems to be more of a mnemonic than a factual threshold that greatly influences the overall performance of the software, but increasing the ring size until maximum performance is reached is necessary.

6.2 Evidence of faulty hardware

We will now showcase the discovery of a hardware-related problem on machine C, which led us to completely discard the results acquired from it.

Figure 6.4: Machine C, parameter variation versus number of cores

The above figure shows a classical benchmark of the system, varying the burst and clone_skb values. However, the plateau reached when burst is set to 10 or 100 barely exceeds a total of 22 Mpps. While one might think this is due to a limitation of either the CPU or the kernel, a test with MTU-sized packets revealed that even a simple 40G bandwidth test did not work correctly. Figure 6.5 clearly shows a bandwidth ceiling maintained around 26 to 28 Gigabits per second. Since figure 6.4 shows the hardware is capable of producing at least 20 million packets per second, it is very unlikely that the NIC cannot produce the 3.289 million packets per second required to saturate the link with MTU-sized packets. A good hint to the underlying issue was found in the kernel buffer output, read with dmesg | grep "i40e": "PCI-Express bandwidth available for this device may be insufficient for optimal performance". We first assumed the administrator had not placed the NIC in the correct slot, as the block diagram showed another slot with PCIe 2.0 instead of 3.0. But even after moving the NIC to the fastest slot available (Slot 1 – PCIe 3.0 x16), the results stalled at the same total throughput and the kernel message remained. After further research we found that Intel has issued several technical advisories [57] [58] describing issues with PCIe connections. It is therefore very likely that the board automatically downgrades the PCIe 3.0 link to 2.0. According to [57], upgrading the BIOS should fix the issue; however, we did not have physical access to the machine, which is located in a data-center. We asked an administrator to perform the upgrade, but were told that a previous BIOS-upgrade attempt had not been successful and that it would therefore not be attempted on live machines. This is what led us to discard all performance results from this machine.
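For reference, the 3.289 Mpps figure quoted above follows from dividing the 40 Gbit/s line rate by the time each MTU-sized packet occupies the wire. It matches the convention behind the 14.88 Mpps figure used for 10G (packet size plus 20 bytes of framing overhead); the exact overhead is an assumption on our part, as the text does not spell it out:

\[
  \frac{40 \times 10^{9}\ \mathrm{bit/s}}{(1500 + 20)\ \mathrm{bytes} \times 8\ \mathrm{bit/byte}} \approx 3.289 \times 10^{6}\ \mathrm{packets/s}
\]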

Figure 6.5: Machine C bandwidth test with MTU packets.

6.3 Study of the packet size scalability

The experiments in this section were carried out entirely on machine D, as it was the only one with a 40G NIC. We ran similar tests on machines A and B, but the same behaviour could not be reproduced, most probably because the throughput was too low for the issue to be noticeable.

6.3.1 Problem detection

The idea of the test was fairly simple: incrementally scale the size of the packet until line rate is reached.

Figure 6.6: Throughput versus packet size, in millions of packets per second.

The expected behaviour is a constant packet rate until the theoretical line-rate limit is reached, after which the achievable packet rate must follow that limit downwards as the packet size grows. Figure 6.6 instead clearly shows regular drops in packet throughput at growing intervals, for instance at packet sizes of 606, 703, 840, 1044 and 1386 bytes. Note that varying the burst size (which must be greater than 1, otherwise line rate is not reached at all) or the ring buffer size does not prevent those drops.
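Using the same framing convention as the 3.289 Mpps figure of section 6.2 (packet size plus 20 bytes of overhead, an assumption on our part), the theoretical limit for a line rate R in bit/s and a packet size s in bytes is:

\[
  \mathrm{pps}_{\max}(s) = \frac{R}{(s + 20) \times 8}
\]

For R = 40 x 10^9 this gives roughly 59.5 Mpps at s = 64 and roughly 3.3 Mpps at s = 1500, which is the envelope the measured curve should follow once the CPU is no longer the bottleneck.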

Another way to visualize the problem is to simply plot the throughput in Mbps. This is done in figure 6.7, and another problem becomes obvious: because of the drops, the 40G line speed is not met when using MTU-sized (1500 bytes) packets, whereas smaller packets under the exact same configuration (e.g. 1200 bytes) do achieve 40G bandwidth. This implies that on a 100G board, because of this "sawtooth" profile, it is very unlikely that pktgen would be able to saturate the link with a single core.

Figure 6.7: Throughput versus packet size, in Mbps.

6.3.2 Profiling with perf

As we did not have another machine with a 40G NIC to compare against, we first tried to monitor events with perf. The idea was to run a batch of perf stat measurements over particular events (mostly hardware events, as we suspected cache issues on a hunch) and to compute the Pearson product-moment correlation coefficient between the event counts and the throughput. Since that formula only captures linear relationships and might hide other behaviours, we also plotted the event counts on top of the graph of figure 6.7 to see whether any events matched the drops. Needless to say, the same event-monitoring experiment had to be run several times, as spurious events might have skewed the results and turned a coincidence into an apparently meaningful relation between problem and explanation. However, none of the hardware metrics tested corroborate a direct implication of the hardware in this issue: Pearson's formula always yielded r values considered "very weak", often between 0.05 and 0.1. Figure 6.8 is an example of this procedure; the green data shows the same pattern as figure 6.7, slightly compressed because the range of the y axis has been extended. This choice was made to align the two data sets and make it easier to spot a correlation.
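For reference, the Pearson product-moment correlation coefficient between the throughput samples x_i and the event-count samples y_i (n samples each) is computed as:

\[
  r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
           {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^{2}} \; \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^{2}}}
\]

where \bar{x} and \bar{y} are the sample means. An r close to +1 or -1 indicates a strong linear relation, while values around 0.05 to 0.1, as found here, indicate close to none.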

By interpreting this specific result, one cannot see an obvious match between the two data sets, implying there might not be one at all. The same sort of result was found for all the perf hardware events tested, including: LLC-store-misses, LLC-stores, branch-load-misses, branch-loads, iTLB-loads, node-load-misses, node-loads, node-store-misses, dTLB-load-misses, dTLB-loads, dTLB-store-misses, iTLB-load-misses, cpu-migrations, page-faults, context-switches. So far none of them revealed an issue that would rationalize the problem.

Figure 6.8: Superposition of the amount of cache misses and the throughput ”sawtooth” behaviour.

Applying the Pearson product-moment correlation coefficient formula to the data in figure 6.8 yields an r value of ≈ 0.09, which we interpret as close to no linear correlation at all.

6.3.3 Driver latency estimation with eBPF

Another attempt at explaining the problem was to attach a kprobe through an eBPF program and measure the time the driver takes to send each packet. Unusual latency there could indicate hardware problems on the NIC or driver issues. The procedure was explained in section 5.3.
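For illustration, a minimal bcc-style eBPF probe along the lines of the procedure of section 5.3 could look as follows. It is a sketch, not the exact program we used: the kprobe/kretprobe pair is attached from BCC's Python bindings (attach_kprobe and attach_kretprobe) to the driver's transmit routine, whose name depends on the driver and is therefore an assumption here (for instance i40e_lan_xmit_frame on an i40e NIC).

// Sketch of a per-packet driver latency probe (bcc restricted C).
#include <uapi/linux/ptrace.h>

BPF_HASH(start_ts, u32, u64);   /* entry timestamp, keyed by CPU */
BPF_HISTOGRAM(xmit_lat_ns);     /* log2 histogram of driver latencies */

int xmit_entry(struct pt_regs *ctx)
{
        u32 cpu = bpf_get_smp_processor_id();
        u64 ts  = bpf_ktime_get_ns();

        start_ts.update(&cpu, &ts);          /* remember when we entered */
        return 0;
}

int xmit_return(struct pt_regs *ctx)
{
        u32 cpu  = bpf_get_smp_processor_id();
        u64 *tsp = start_ts.lookup(&cpu);

        if (tsp) {
                u64 delta = bpf_ktime_get_ns() - *tsp;
                xmit_lat_ns.increment(bpf_log2l(delta));  /* bucket the latency */
                start_ts.delete(&cpu);
        }
        return 0;
}

Every probe hit executes both handlers plus two map operations, which at several million packets per second is enough to explain the throughput collapse visible in table 6.1.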

Execution overhead

While the program works on the system, it is inapplicable in real-life high-speed network situations because of the overhead it causes.

Size of packet   Without eBPF   With eBPF    Mean latency measured
64 bytes         5200 Mb/s      686 Mb/s     520 ns
500 bytes        33100 Mb/s     4410 Mb/s    542 ns
1000 bytes       39600 Mb/s     10800 Mb/s   533 ns
1500 bytes       38100 Mb/s     16170 Mb/s   550 ns

Table 6.1: Comparison of throughput with eBPF program

The table shows the overhead created by the eBPF probe. Note that the run with 1500-byte packets and the eBPF program loaded is in fact the only scenario found where a better throughput is achieved than with a size of 1000 or 1200 bytes. The interpretation is straightforward: with minimum-sized packets there are more packets per second and therefore more kprobe hits, causing more overhead; a bigger packet size causes less overhead because fewer packets are sent. Moreover, the mean latency did not increase with packet size, perhaps because the xmit_more API delays the actual sending, making the function execution time almost static. The measured latencies allow us to draw two conclusions:

• The time spent executing the function is in fact not related to the size of the packet. Although the function name contains "xmit" (for transmission) and it is the lowest-level function the kernel has access to for packet transmission, it does not immediately transfer the packet to the NIC. Instead, it copies the packet content located inside the SKB to the driver's ring buffer. Hence the roughly constant latency measurements are expected.

• It is quite hard to assess the accuracy of such measurements because of the extreme granularity required. Nanosecond-precise measurements are very hard to achieve, since any instruction takes several nanoseconds to execute and can therefore not be neglected. In this case we cannot assess the time spent storing and retrieving the timestamps in the eBPF maps, so these operations (or any other eBPF instruction, for that matter) might skew the results dramatically. Moreover, a measured latency of ≈ 500 nanoseconds (cf. table 6.1) would imply that copying a 64-byte (512-bit) packet into the ring buffer takes as long as sending 2500 bytes (20000 bits) over the wire, which is unrealistic for high-speed transmission (see the check after this list).
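A quick check of the figure quoted in the last item: at 40 Gbit/s, 500 ns of wire time corresponds to

\[
  40 \times 10^{9}\ \mathrm{bit/s} \times 500 \times 10^{-9}\ \mathrm{s} = 20\,000\ \mathrm{bits} = 2500\ \mathrm{bytes},
\]

which is why a per-packet cost of ≈ 500 ns cannot plausibly be the raw copy time of a 64-byte packet.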

While this attempt at using eBPF for profiling was not successful, there are currently efforts to add a new eBPF hook optimized for throughput performance assessment [59].

Chapter 7

Conclusion

The results gathered provide us with several insights:

• The clone_skb option of pktgen is currently useless when coupled with burst, since burst provides the same advantage and is even more efficient thanks to the xmit_more API.

• To use the full potential of the machine, a ring size of 512 is not enough. However, the recommended value of 1024 is not mandatory either, as anything in the range of 800 to 4096 works as well. An associated burst value of 30 to 40 is the best setting found, with a rate of 13.2 million packets per second achieved on a single core of machine D.

• There is an issue with the packet size on machine D: experiments with a single core and MTU-sized packets produce less bandwidth than packets with a size of 1000 bytes. Hardware profiling did not reveal the cause.

• eBPF profiling seems like an interesting option but introduces too much overhead to provide usable data.

We are now able to answer our initial hypothesis. With pktgen, the 10G line rate is reached with 2 to 4 cores depending on the machine. The 40G line rate with minimum-sized packets, however, is not reached; approximately 40 million packets per second were achieved on high-end hardware. We therefore conclude that aiming for the 100G line rate is currently unrealistic given current kernel and hardware conditions. During this thesis pktgen was investigated in depth, and we also gave a solid technical background ranging from computer hardware to profiling tools. We also gave the reader an insight into the eBPF technology and its possible uses, notably through kprobes. As this project is constantly evolving, it might become a very powerful technology for profiling entire frameworks in the future, perhaps even drivers.

7.1 Future work

It is paramount that the packet-size scaling problem is addressed, and that the problem is reproduced on machines with different configurations but equipped with a 40G NIC, as we believe 10G is too slow to trigger it. Additional profiling techniques combining eBPF and perf could provide a new angle of approach and help pinpoint the problem. Rewriting the pktgen interface, currently written in Perl, in C should help make it portable and usable on all distributions, hence growing the community of users. And the more data the community provides, the more opportunities there are to study the capabilities of a Linux-based system through pktgen.

Bibliography

[1] Sebastian Gallenmüller et al. "Comparison of Frameworks for High-Performance Packet IO". In: ANCS '15 Proceedings of the Eleventh ACM/IEEE Symposium on Architectures for Networking and Communications Systems (2015), pp. 29–38.
[2] Alessio Botta, Alberto Dainotti, and Antonio Pescapé. "Do You Trust Your Software-Based Traffic Generator?" In: IEEE Communications Magazine (2010), pp. 158–165.
[3] Olof Hagsand, Robert Olsson, and Bengt Gördén. "Open-source routing at 10Gb/s". In: (2009).
[4] . Linux Kernel Development. 4th ed. Addison-Wesley, 2010.
[5] Robert Olsson. "pktgen the linux packet generator". In: Proceedings of the Linux Symposium 2 (2005). url: https://www.kernel.org/doc/ols/2005/ols2005v2-pages-19-32.pdf.
[6] Arnaldo Carvalho de Melo. "The New Linux 'perf' tools". In: Linux Kongress (2010).
[7] Jonathan Corbet. Extending extended BPF. 2 July 2014. url: https://lwn.net/Articles/603983/.
[8] . NAPI. 2009. url: http://www.linuxfoundation.org/collaborate/workgroups/networking/napi.
[9] Jonathan Corbet. Bulk network packet transmission. 17 May 2016. url: https://lwn.net/Articles/615238/.
[10] Christoph Lameter. "NUMA (Non-Uniform Memory Access): An Overview". In: acmqueue 11.7 (2013). url: https://queue.acm.org/detail.cfm?id=2513149.
[11] PCI-SIG. PCI Express Base Specification. Specification. Version Rev. 3.0. PCI-SIG, Nov. 2010, pp. 192–200.
[12] S. Bradner and J. McQuaid. Benchmarking Methodology for Network Interconnect Devices. RFC. IETF, 1999.
[13] Bryan Henderson. Linux Loadable Kernel Module HOWTO. 10. Technical Details. Version v1.09. 2016. url: http://www.tldp.org/HOWTO/Module-HOWTO/x627.html.
[14] Patrick Mochel and Mike Murphy. sysfs - The filesystem for exporting kernel objects. 16 August 2011. url: https://www.kernel.org/doc/Documentation/filesystems/sysfs.txt.
[15] Joel Becker. configfs - Userspace-driven kernel object configuration. 31 March 2005. url: https://www.kernel.org/doc/Documentation/filesystems/configfs/configfs.txt.
[16] Thomas Petazzoni. Network drivers. Free Electrons. 2009. url: http://free-electrons.com/doc/network-drivers.pdf.
[17] David S. Miller. David S. Miller Linux Networking Homepage. 2016. url: http://vger.kernel.org/~davem/skb.html.
[18] Hyeongyeop Kim. Understanding TCP/IP Network Stack & Writing Network Apps. CUBRID. 2013. url: http://www.cubrid.org/blog/dev-platform/understanding-tcp-ip-network-stack/.
[19] Sreekrishnan Venkateswaran. Essential Linux Device Drivers. 2008.
[20] Dan Siemon. "Queueing in the Linux network stack". In: 2013.231 (July 2013).
[21] Martin A. Brown. Traffic Control HOWTO. 2006-10-28. url: http://www.tldp.org/HOWTO/Traffic-Control-HOWTO/classless-qdiscs.html.

[22] Tom Herbert and Willem de Bruijn. Scaling in the Linux Networking Stack. 2015. url: https://www.kernel.org/doc/Documentation/networking/scaling.txt.
[23] Jon Dugan et al. iPerf. 2016-04-12. url: https://iperf.fr/.
[24] Juha Laine, Sampo Saaristo, and Rui Prior. RUDE & CRUDE. 17 May 2016. url: http://rude.sourceforge.net/.
[25] rick jones. RUDE & CRUDE. 17 May 2016. url: http://rude.sourceforge.net/.
[26] P. Srivats. Ostinato. 17 May 2016. url: http://ostinato.org/.
[27] Larry McVoy. lmbench. 17 May 2016. url: http://www.bitmover.com/lmbench/.
[28] Sebastian Zander, David Kennedy, and Grenville Armitage. KUTE A High Performance Kernel-based UDP Traffic Engine. Technical report. Centre for Advanced Internet Architecture, 2005.
[29] ntop. PF_RING Website. 17 May 2016. url: http://www.ntop.org/products/packet-capture/pf_ring/.
[30] L. Deri. "nCap: wire-speed packet capture and transmission". In: End-to-End Monitoring Techniques and Services, 2005. Workshop on (15 May 2005), pp. 47–55.
[31] Luigi Rizzo. netmap. 2016-04-12. url: http://info.iet.unipi.it/~luigi/netmap/.
[32] Intel. DPDK. 2016-04-12. url: http://dpdk.org/.
[33] Paul Emmerich et al. "MoonGen: A Scriptable High-Speed Packet Generator". In: Internet Measurement Conference 2015 (IMC'15). Tokyo, Japan, Oct. 2015.
[34] Spirent Communications. Website. 2016-04-12. url: http://www.spirent.com/.
[35] IXIA. Website. 2016-04-12. url: https://www.ixiacom.com/.
[36] Daniel Turull, Peter Sjödin, and Robert Olsson. "Pktgen: Measuring performance on high speed networks". In: Computer Communications 82 (Mar. 2016), pp. 39–48.
[37] Robert Olsson. HOWTO for the linux packet generator. 17 May 2016. url: https://www.kernel.org/doc/Documentation/networking/pktgen.txt.
[38] Stephane Eranian. Perf tutorial. 14 May 2016. url: https://perf.wiki.kernel.org/index.php/Tutorial.
[39] Brendan Gregg. Linux Perf Examples. 14 May 2016. url: http://www.brendangregg.com/perf.html#Events.
[40] Kernel Newbies. Linux 4.4. 14 May 2016. url: http://kernelnewbies.org/Linux_4.4.
[41] Jay Schulist, Daniel Borkmann, and Alexei Starovoitov. Linux Socket Filtering aka Berkeley Packet Filter (BPF). 24 Aug 2015. url: https://www.kernel.org/doc/Documentation/networking/filter.txt.
[42] Suchakra Sharma. BPF Internals - I. 24 Aug 2015. url: https://github.com/iovisor/bpf-docs/blob/master/bpf-internals-1.md.
[43] Alexei Starovoitov. LLVM Website. 17 May 2016. url: http://llvm.org/.
[44] Iovisor project. BCC repository. 17 May 2016. url: https://github.com/iovisor/bcc.
[45] statstutor. Pearson's correlation. 20 June 2014. url: http://netoptimizer.blogspot.se/2014/06/pktgen-for-network-overload-testing.html.
[46] MiTAC Computer Corporation. S7002 technical specification. 2009.
[47] Intel. Server Board S2600IP. March 2015. url: http://www.intel.com/content/dam/support/us/en/documents/motherboards/server/sb/g34153004_s2600ip_w2600cr_tps_rev151.pdf.
[48] Intel. Server Board S2600CW. April 2016. url: http://www.intel.com/content/dam/support/us/en/documents/server-products/S2600CW_TPS_R2_1.pdf.
[49] Judd Vinet and Aaron Griffin. Arch Linux. 2016-04-12. url: https://www.archlinux.org/.
[50] Bifrost Network Project. bifrost. 2016-04-12. url: http://www.bifrost-network.org/.

[51] Canonical Ltd. Ubuntu Website. 14 May 2016. url: http://www.ubuntu.com/.
[52] Red Hat, Inc. Fedora. 2016-04-12. url: https://getfedora.org/.
[53] Tim O'Reilly and Ben Smith. The Importance of Perl. 17 May 2016. url: http://archive.oreilly.com/pub/a/oreilly/perl/news/importance_0498.html.
[54] Kihwan Choi, Ramakrishna Soma, and Massoud Pedram. "Fine-Grained Dynamic Voltage and Frequency Scaling for Precise Energy and Performance Trade-off based on the Ratio of Off-chip Access to On-chip Computation Times". In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 24.1 (27 December 2004), pp. 18–28.
[55] Dominik Brodowski and Nico Golde. CPU frequency and voltage scaling code in the Linux(TM) kernel. 17 May 2016. url: https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt.
[56] Jesper Dangaard Brouer. Pktgen for network overload testing. 4 June 2014. url: http://netoptimizer.blogspot.se/2014/06/pktgen-for-network-overload-testing.html.
[57] Intel Corporation. PCI Express* 3.0 Add-in Adapter Support Issue. Technical advisory. Intel, 2014. url: http://www.intel.com/content/dam/support/us/en/documents/motherboards/server/sb/ta_102105.pdf.
[58] Intel Corporation. PCIe link width may intermittently downgrade to x4 or x2 with one third party PCIe add-in card. Technical advisory. Intel, 2012. url: http://www.intel.com/content/dam/support/us/en/documents/server-products/ta1000.pdf.
[59] Tom Herbert and Alexei Starovoitov. eXpress Data Path. 10 June 2016. url: https://github.com/iovisor/bpf-docs/blob/master/Express_Data_Path.pdf.

Appendix A

Bifrost install

A.1 How to create a bifrost distribution

The following instructions assume the device you are writing to is a USB stick, most likely mounted asynchronously. In case there was sensitive data on the device you are about to overwrite, consider using the shred command first. Before you get started, it is important to check which of your devices maps to /dev/sdX, for instance by running fdisk -l with administrator rights.

Burning the image on the device

• mkdir /tmp/bifrost && cd /tmp/bifrost

• wget http://bifrost-network.org/files/distro/bifrost-7.2.img.gz #Or download thru browser

• gzip -d bifrost-7.2.img.gz

• dd if=bifrost-7.2.img of=/dev/sdX bs=4096, where X is the letter of the device you want to burn to. You can check the list of connected devices by running fdisk -l with admin rights.

Scaling the filesystem, in case you wish to use the entire space of your stick:

• parted /dev/sdX

• (parted) resizepart 1 -1s

• (parted) quit

• resize2fs /dev/sdX1

• fsck /dev/sdX1

• sync

A.2 Compile and install a kernel for bifrost

You probably want to have the same configuration as the original kernel provided through the image. You can copy the previous configuration with zcat /proc/config.gz > .config

• Assuming you have downloaded and extracted the kernel source, are currently in its directory, and have mounted the bifrost distribution on /media/user/bifrost:

• wget http://jelaas.eu/pkg64/bifrost-initramfs-15.tar.gz

• tar xvf bifrost-initramfs-15.tar.gz ./boot/initramfs.cpio -O > initramfs.cpio

• make (this should take a while)

• cp arch/x86/boot/bzImage /media/user/bifrost/boot/kernel-XXX

• make modules_install

• sync

Appendix B

Scripts

Example of a simple Perl post-processing script to yield data for gnuplot.

#!/usr/bin/perl -nw
# Aggregates per-core pktgen result lines into one total per core count.
use List::Util qw(sum);

BEGIN { $nbcore = 1; $i = 0; }

if (grep /bps/, $_) {
    # Extract the packets-per-second figure and scale it down.
    (my $pps) = $_ =~ /(\d+?)pps/;
    $pps =~ s/\d{3}$//;
    push @res, $pps;

    # Extract the bandwidth in Mb/sec.
    (my $bps) = $_ =~ /(\d+)Mb\/sec/;
    push @res2, $bps;
    $i++;
}

if ($i == $nbcore) {
    $i = 0;
    print $nbcore++, " ", sum(@res) / 1000, " Mpps ";
    print sum(@res2) / 1000, " Mbps\n";
    @res  = ();
    @res2 = ();
}

../pkt-dat.pl

Appendix C

Block diagrams

Figure C.1: Block diagram of motherboard Tyan S7002

Figure C.2: Block diagram of the motherboard S2600IP

Figure C.3: Block diagram of the motherboard S2600CW

--- pktgen_old.c	2016-06-04 15:47:00.881493623 +0200
+++ pktgen.c	2016-06-04 15:46:17.953491778 +0200
@@ -3447,10 +3447,8 @@
 		pkt_dev->last_ok = 0;
 		goto unlock;
 	}
-	atomic_add(burst, &pkt_dev->skb->users);
-
-xmit_more:
-	ret = netdev_start_xmit(pkt_dev->skb, odev, txq, --burst > 0);
+	atomic_inc(&(pkt_dev->skb->users));
+	ret = netdev_start_xmit(pkt_dev->skb, odev, txq, pkt_dev->sofar % burst != 0);
 
 	switch (ret) {
 	case NETDEV_TX_OK:
@@ -3458,8 +3456,6 @@
 		pkt_dev->sofar++;
 		pkt_dev->seq_num++;
 		pkt_dev->tx_bytes += pkt_dev->last_pkt_size;
-		if (burst > 0 && !netif_xmit_frozen_or_drv_stopped(txq))
-			goto xmit_more;
 		break;
 	case NET_XMIT_DROP:
 	case NET_XMIT_CN:
@@ -3478,8 +3474,7 @@
 		atomic_dec(&(pkt_dev->skb->users));
 		pkt_dev->last_ok = 0;
 	}
-	if (unlikely(burst))
-		atomic_sub(burst, &pkt_dev->skb->users);
+
 unlock:
 	HARD_TX_UNLOCK(odev, txq);

Figure C.4: Patch proposed to fix the burst anomalous cloning behaviour

TRITA-ICT-EX-2016:118

www.kth.se