Linux Kernel Packet Transmission Performance in High-Speed Networks
DEGREE PROJECT IN ELECTRICAL ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2016

Linux Kernel Packet Transmission Performance in High-Speed Networks

CLÉMENT BERTIER

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF INFORMATION AND COMMUNICATION TECHNOLOGY
Kungliga Tekniska högskolan

Master thesis, August 27, 2016

Abstract

The Linux kernel protocol stack gains more and more additions as time goes by. As new technologies arise, more functions are implemented, which may result in a certain amount of bloat. However, new methods have been added to the kernel to circumvent common throughput issues and to maximize overall performance under certain circumstances.

To assess the ability of the kernel to produce packets at a given rate, we will use the pktgen tool. Pktgen is a loadable kernel module dedicated to UDP-based traffic generation. Its design philosophy is to sit low in the kernel protocol stack so as to minimize the overhead incurred by the usual APIs. Since measurements are usually expressed in packets per second rather than bandwidth, UDP is a natural choice because it minimizes the time spent creating each packet. Pktgen has several options, which will be investigated, and its transmission algorithm will be analysed for further insight.

But software is not just a compiled piece of code; it is a set of instructions run on top of hardware. That hardware may or may not suit the design of one's software, making execution slower than expected or, in extreme cases, not functional at all.

This thesis aims to investigate the maximum capabilities of Linux packet transmission in high-speed networks, e.g. 10 Gigabit or 40 Gigabit Ethernet. To gain a deeper understanding of kernel behaviour during transmission, we will use profiling tools such as perf and the newly adopted eBPF framework.
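The emphasis on packets per second rather than bandwidth follows from Ethernet's fixed per-packet overhead: on the wire, every frame carries a 7-byte preamble, a 1-byte start-of-frame delimiter and a 12-byte inter-frame gap on top of the frame itself (minimum 64 bytes including the FCS). A minimal sketch of the theoretical line-rate limits behind figures 2.2 and 2.3:

```python
PREAMBLE = 7   # bytes
SFD = 1        # start-of-frame delimiter
IFG = 12       # inter-frame gap, expressed in byte times
OVERHEAD = PREAMBLE + SFD + IFG  # 20 extra bytes per frame on the wire

def max_pps(link_bps, frame_bytes):
    """Theoretical packet-per-second ceiling of an Ethernet link."""
    bits_per_frame = (frame_bytes + OVERHEAD) * 8
    return link_bps / bits_per_frame

# Minimum-size (64-byte) frames:
print("10G: %.2f Mpps" % (max_pps(10e9, 64) / 1e6))   # 10G: 14.88 Mpps
print("40G: %.2f Mpps" % (max_pps(40e9, 64) / 1e6))   # 40G: 59.52 Mpps
```

These ceilings (about 14.88 Mpps at 10 Gbit/s and 59.52 Mpps at 40 Gbit/s for minimum-size frames) are the targets against which the kernel's transmission rate is judged throughout the thesis.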
Contents

1 Introduction
  1.1 Problem
  1.2 Methodology
  1.3 Goal
  1.4 Sustainability and ethics
  1.5 Delimitation
  1.6 Outline
2 Background
  2.1 Computer hardware architecture
    2.1.1 CPU
    2.1.2 SMP
    2.1.3 NUMA
    2.1.4 DMA
    2.1.5 Ethernet
    2.1.6 PCIe
    2.1.7 Networking terminology
  2.2 Linux
    2.2.1 OS architecture design
    2.2.2 /proc pseudo-filesystem
    2.2.3 Socket buffers
    2.2.4 xmit_more API
    2.2.5 NIC drivers
    2.2.6 Queuing in the networking stack
  2.3 Related work – Traffic generators
    2.3.1 iPerf
    2.3.2 KUTE
    2.3.3 PF_RING
    2.3.4 Netmap
    2.3.5 DPDK
    2.3.6 MoonGen
    2.3.7 Hardware solutions
  2.4 Pktgen
    2.4.1 pktgen flags
    2.4.2 Commands
    2.4.3 Transmission algorithm
    2.4.4 Performance checklist
  2.5 Related work – Profiling
    2.5.1 perf
    2.5.2 eBPF
3 Methodology
  3.1 Data yielding
  3.2 Data evaluation
  3.3 Linear statistical correlation
4 Experimental setup
  4.1 Speed advertisement
  4.2 Hardware used
    4.2.1 Machine A – KTH
    4.2.2 Machine B – KTH
    4.2.3 Machine C – Ericsson
    4.2.4 Machine D – Ericsson
  4.3 Choice of Linux distribution
  4.4 Creating a virtual development environment
  4.5 Empirical testing of settings
  4.6 Creation of an interface for pktgen
  4.7 Enhancing the system for pktgen
  4.8 pktgen parameters clone conflict
5 eBPF Programs with BCC
  5.1 Introduction
  5.2 kprobes
  5.3 Estimation of driver transmission function execution time
6 Results
  6.1 Settings tuning
    6.1.1 Influence of kernel version
    6.1.2 Optimal pktgen settings
    6.1.3 Influence of ring size
  6.2 Evidence of faulty hardware
  6.3 Study of the packet size scalability
    6.3.1 Problem detection
    6.3.2 Profiling with perf
    6.3.3 Driver latency estimation with eBPF
7 Conclusion
  7.1 Future work
A Bifrost install
  A.1 How to create a Bifrost distribution
  A.2 Compile and install a kernel for Bifrost
B Scripts
C Block diagrams

List of Figures

2.1 Cache locations in a 2-core CPU
2.2 Theoretical limits of the link according to packet size on a 10G link
2.3 Theoretical limits of the link according to packet size on a 40G link
2.4 Tux, the mascot of Linux
2.5 Overview of the kernel [4]
2.6 How pointers are mapped to retrieve data within the socket buffer [18]
2.7 Example of a shell command to interact with pktgen
2.8 pktgen transmission algorithm
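As figure 2.7 indicates, pktgen is driven entirely through control files under /proc/net/pktgen: commands are echoed to a kernel-thread file (kpktgend_0), to a per-device file, and finally to pgctrl to start transmission. A minimal sketch of that sequence, with the command names taken from the pktgen documentation; the device name eth0 and the count/size values are placeholders, and a dry_run flag is added so the sequence can be inspected without root or the module loaded:

```python
import os

PKTGEN_DIR = "/proc/net/pktgen"

def pgset(ctrl_file, cmd, dry_run=False):
    """Write one pktgen command (e.g. 'pkt_size 60') to a control file.

    With dry_run=True the write is skipped, so the command sequence can be
    examined on machines where the pktgen module is not loaded.
    """
    path = os.path.join(PKTGEN_DIR, ctrl_file)
    if not dry_run:
        with open(path, "w") as f:
            f.write(cmd + "\n")
    return (path, cmd)

def configure(dev="eth0", count=1000000, pkt_size=60, dry_run=False):
    """Minimal pktgen session: bind a device to the first kernel thread,
    set packet count and size, then start transmission via pgctrl."""
    steps = [
        ("kpktgend_0", "rem_device_all"),      # detach previously bound devices
        ("kpktgend_0", "add_device " + dev),
        (dev, "count %d" % count),
        (dev, "pkt_size %d" % pkt_size),
        ("pgctrl", "start"),                   # blocks until count packets are sent
    ]
    return [pgset(f, c, dry_run=dry_run) for f, c in steps]
```

The shell scripts used in the experiments (appendix B) follow the same pattern, with further per-device commands for destination address, flags and delays layered on top.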