Comparing and Improving Current Packet Capturing Solutions based on Commodity Hardware

Lothar Braun, Alexander Didebulidze, Nils Kammenhuber, Georg Carle
Technische Universität München
Institute for Informatics, Chair for Network Architectures and Services
{braun,didebuli,kammenhuber,carle}@net.in.tum.de

ABSTRACT

Capturing network traffic with commodity hardware has become a feasible task: Advances in hardware as well as software have boosted off-the-shelf hardware to performance levels that some years ago were the domain of expensive special-purpose hardware. However, the capturing hardware still needs to be driven by a well-performing software stack in order to minimise or avoid packet loss. Improving the capturing stack of Linux and FreeBSD has been an extensively covered research topic in the past years. Although the majority of the proposed enhancements have been backed by evaluations, these have mostly been conducted on different hardware platforms and software versions, which renders a comparative assessment of the various approaches difficult, if not impossible.

This paper summarises and evaluates the performance of current packet capturing solutions based on commodity hardware. We identify bottlenecks and pitfalls within the capturing stack of FreeBSD and Linux, and give explanations for the observed effects. Based on our experiments, we provide guidelines for users on how to configure their capturing systems for optimal performance, and we also give hints on debugging bad performance. Furthermore, we propose improvements to the operating system's capturing processes that reduce packet loss, and evaluate their impact on capturing performance.

Categories and Subject Descriptors

C.2.3 [Network Operation]: Network Monitoring

General Terms

Measurement, Performance

1. INTRODUCTION

Packet capture is an essential part of most network monitoring and analysing systems. A few years ago, using specialised hardware—e.g., network monitoring cards manufactured by Endace [1]—was mandatory for capturing Gigabit or Multi-gigabit network traffic if little or no packet loss was a requirement. With recent development progress in bus systems, multi-core CPUs and commodity network cards, off-the-shelf hardware can nowadays be used to capture network traffic at near wire-speed with little or no packet loss in 1 GE networks, too [2, 3]. People are even building monitoring devices based on commodity hardware that can be used to capture traffic in 10 GE networks [4, 5].

However, this is not an easy task, since it requires careful configuration and optimisation of the hardware and software components involved—even the best hardware will suffer packet loss if its driving software stack is not able to handle the huge number of network packets. Several subsystems, including the network card driver, the capturing stack of the operating system and the monitoring application, are involved in packet processing. If only one of these subsystems faces performance problems, packet loss will occur, and the whole process of packet capturing will yield bad results.

Previous work analysed [2, 4, 6] and improved [5, 7, 8] packet capturing solutions. Comparing these works is quite difficult because the evaluations have been performed on different hardware platforms and with different software versions. In addition, operating systems like Linux and FreeBSD are subject to constant changes and improvements. Comparisons that were performed years ago might therefore no longer be valid today. In fact, when we started our capturing experiments, we were not able to reproduce the results presented in several papers. When we dug deeper into the operating systems' capturing processes, we found that some of our results can be explained by improved drivers and general operating system improvements. Other differences can be explained by the type of traffic we analysed and by the way our capturing software works on the application layer. While it is not a problem to find comparisons that state the superiority of a specific capturing solution, we had difficulties finding statements on why one solution is superior to another. Information about this topic is scattered throughout different papers and web pages. Worse yet, some information proved to be highly inconsistent, especially when it came from Web sources outside academia. We therefore encountered many difficulties when debugging the performance problems we ran into.

This paper tries to fill the gap that we needed to step over when we set up our packet capturing environment with Linux and FreeBSD. We evaluate and compare different capturing solutions for both operating systems, and try to summarise the pitfalls that can lead to bad capturing performance.

The paper aims at future developers and users of capture systems and serves as a resource that helps them not to repeat the pitfalls we and other researchers have encountered. A special focus of this paper is on explaining our findings. We try to identify the major factors that influence the performance of a capturing solution, providing users of packet capture systems with guidelines on where to search for performance problems in their systems. Our paper also targets developers of capturing and monitoring solutions: We identify potential bottlenecks that can lead to performance problems and thus packet loss. Finally, we propose a modification that can be applied to popular capturing solutions. It improves capturing performance in a number of situations.

The remainder of this paper is organised as follows: Section 2 introduces the capturing mechanisms in Linux and FreeBSD that are in use when performing network monitoring, and presents related improvements and evaluations of packet capturing systems. Section 3 presents the test setup that we used for our evaluation in Section 4. Our capturing analysis covers scheduling issues in Section 4.1 and focuses on the application and operating system layer with low application load in Section 4.2. Subsequently, we analyse application scenarios that pose higher load on the system in Section 4.3, where we furthermore present our modifications to the capturing processes and evaluate their influence on capturing performance. In Section 4.4, we move downwards within the capturing stack and discuss driver issues. Our experiments result in recommendations for developers and users of capturing solutions, which are presented in Section 5. Finally, Section 6 concludes the paper with a summary of our findings.

2. BACKGROUND AND RELATED WORK

Various mechanisms are involved in the process of network packet capturing. The performance of a capturing process thus depends on each of them to some extent. On the one hand, there is the hardware that needs to capture and copy all packets from the network to memory before the analysis can start. On the other hand, there are the driver, the operating system and the monitoring applications that need to carefully handle the available hardware in order to achieve the best possible packet capturing performance. In the following, we introduce popular capturing solutions on Linux and FreeBSD in Section 2.1. Afterwards, we summarise comparisons and evaluations that have been performed on the different solutions in Section 2.2.

2.1 Solutions on Linux and FreeBSD

Advances made in hardware development in recent years, such as high-speed bus systems, multi-core systems or network cards with multiple independent reception (RX) queues, offer performance that was only provided by special-purpose hardware some years ago. Meanwhile, operating systems and hardware drivers have come to take advantage of these new technologies, thus allowing higher capturing rates.

Hardware:
The importance of carefully selecting suitable capturing hardware is well known, as research showed that different hardware platforms can lead to different capturing performance. Schneider et al. [4] compared capturing hardware based on Intel Xeon and AMD Opteron CPUs with otherwise similar components. Assessing an AMD and an Intel platform of comparable computing power, they found the AMD platform to yield better capturing results. AMD's superior memory management and bus contention handling mechanism was identified as the most reasonable explanation. Since then, Intel has introduced Quick Path Interconnect [9] in its recent processor families, which has improved the performance of the Intel platform; however, we are not able to compare new AMD and Intel platforms at this time due to lack of hardware. In any case, users of packet capturing solutions should carefully choose the CPU platform, and should conduct performance tests before buying a particular hardware platform.

Apart from the CPU, another important hardware aspect is the speed of the bus system and the memory used. Current PCI-E buses and current memory banks allow high-speed transfer of packets from the capturing network card into memory and to the analysing CPUs. These hardware advances have thus shifted the bottlenecks, which were previously located at the hardware layer, into the software stacks.

Software stack:

Figure 1: Subsystems involved in the capturing process

There are several software subsystems involved in packet capture, as shown in Figure 1. Passing data between and within the involved subsystems can be a very important performance bottleneck that can impair capturing performance and thus lead to packet loss during capturing. We will discuss and analyse this topic in Section 4.3.

A packet's journey through the capturing system begins at the network interface card (NIC). Modern cards copy the packets into the operating system's kernel memory using Direct Memory Access (DMA), which reduces the work the driver and thus the CPU has to perform in order to transfer the data into memory. The driver is responsible for allocating and assigning memory pages to the card that can be used for DMA transfer. After the card has copied the captured packets into memory, the driver has to be informed about the new packets through a hardware interrupt. Raising an interrupt for each incoming packet will result in packet loss, as the system gets busy handling the interrupts (a situation also known as an interrupt storm). This well-known issue has led to the development of techniques like interrupt moderation or device polling, which were proposed several years ago [7, 10, 11]. However, even today hardware interrupts can be a problem because some drivers are not able to use these hardware features or do not use polling.

For example, when we used the igb driver in FreeBSD 8.0, which was released in late 2009, we experienced bad performance due to interrupt storms. Hence, bad capturing performance can be explained by bad drivers; therefore, users should check the number of generated interrupts if high packet loss rates are observed. (FreeBSD reports interrupt storms via kernel messages; Linux exposes the number of interrupts via the proc file system in /proc/interrupts.)

The driver's hardware interrupt handler is called immediately upon the reception of an interrupt, which interrupts the normal operation of the system. An interrupt handler is supposed to fulfil its tasks as fast as possible. It therefore usually does not pass the captured packets on to the operating system's capturing stack by itself, because this operation would take too long. Instead, packet handling is deferred by the interrupt handler: a kernel thread is scheduled to perform the packet handling at a later point in time. The system scheduler chooses a kernel thread to perform the further processing of the captured packets according to the system's scheduling rules. Packet processing is deferred until there is a free thread that can continue the packet handling.

As soon as the chosen kernel thread is running, it passes the received packets into the network stack of the operating system. From there on, packets need to be passed to the monitoring application that wants to perform some kind of analysis. The standard Linux capturing path leads to a subsystem called PF_PACKET; the corresponding system in FreeBSD is called BPF (Berkeley Packet Filter). Improvements for both subsystems have been proposed.

Software improvements:
The most prominent replacement for PF_PACKET on Linux is called PF_RING and was introduced in 2004 by Luca Deri [8]. Deri found that the standard Linux networking stack at that time introduced some bottlenecks, which led to packet loss during packet capture. His capturing infrastructure was developed to remove these bottlenecks. He showed that a higher capturing rate can be achieved with PF_RING when small packets are to be captured. PF_RING ships with several modified drivers. These are modified to copy packets directly into PF_RING and therefore completely circumvent the standard Linux networking stack. This modification further boosts the performance for network cards with a modified driver.

Figure 2: PF_RING and PF_PACKET under Linux

Figure 2 shows the difference between the two capturing systems. One important feature of PF_RING is the way it exchanges packets between user space and kernel: Monitoring applications usually access a library like libpcap [12] to retrieve captured packets from the kernel. Libpcap is an abstraction from the operating systems' capturing mechanisms and allows running a capturing application on several operating systems without porting it to each special capturing architecture. Back in 2004, the then current libpcap version 0.9.8 used a copy operation to pass packets from the kernel to the user space on Linux. An unofficial patch against that libpcap version from Phil Woods existed, which replaced the copy operation by a shared memory area that was used to exchange packets between kernel and application [13]. This modification will be called MMAP throughout the rest of the paper. PF_RING uses a similar structure to exchange packets by default. Libpcap version 1.0.0, which was released in late 2008, is the first version that ships built-in shared memory (SHM) exchange support; hence the patch from Phil Woods is no longer necessary. We will analyse the performance of these different solutions in Section 4.2.

All capturing mechanisms on Linux have something in common: They handle individual packets, meaning that each operation between user space and kernel is performed on a per-packet basis. FreeBSD packet handling differs in this point by exchanging buffers containing potentially several packets, as shown in Figure 3.

Figure 3: Berkeley Packet Filter and Zero-Copy Berkeley Packet Filter

Both BPF and its improvement Zero-Copy BPF (ZCBPF) use buffers that contain multiple packets for storing and exchanging packets between kernel and monitoring application. BPF and ZCBPF use two buffers: The first, called HOLD buffer, is used by the application to read packets, usually via libpcap. The other buffer, the STORE buffer, is used by the kernel to store new incoming packets. If the application (i.e., libpcap) has emptied the HOLD buffer, the buffers are switched, and the STORE buffer is copied into user space. BPF uses a copy operation to switch the buffers, whereas ZCBPF has both buffers memory-mapped between the application and the kernel. Zero-Copy BPF is expected to perform better than BPF as it removes the copy operation between kernel and application. However, as there are fewer copy operations in FreeBSD than in non-shared-memory packet exchange on Linux, the benefits of ZCBPF over normal BPF in FreeBSD are expected to be smaller than the benefits of shared memory support in Linux.
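For illustration, a minimal sketch of the classic BPF read path on FreeBSD is shown below: a single read() returns a whole kernel buffer that may contain many packets, which the application then walks record by record. The device node, the interface name and the omitted error handling are placeholders and not taken from our setup.

    /* Sketch of the buffer-oriented BPF read path on FreeBSD. */
    #include <sys/types.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>
    #include <net/if.h>
    #include <net/bpf.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/dev/bpf0", O_RDONLY);               /* example BPF device  */

        struct ifreq ifr;
        memset(&ifr, 0, sizeof(ifr));
        strncpy(ifr.ifr_name, "em0", sizeof(ifr.ifr_name)); /* example interface   */
        ioctl(fd, BIOCSETIF, &ifr);                         /* attach to interface */

        u_int buflen = 0;
        ioctl(fd, BIOCGBLEN, &buflen);                      /* size of HOLD buffer */
        char *buf = malloc(buflen);

        for (;;) {
            /* One read() hands over a complete buffer holding several packets. */
            ssize_t n = read(fd, buf, buflen);
            if (n <= 0)
                break;
            for (char *p = buf; p < buf + n; ) {
                struct bpf_hdr *bh = (struct bpf_hdr *)p;
                const unsigned char *pkt = (unsigned char *)p + bh->bh_hdrlen;
                /* ... process bh->bh_caplen bytes at pkt ... */
                (void)pkt;
                p += BPF_WORDALIGN(bh->bh_hdrlen + bh->bh_caplen);
            }
        }
        free(buf);
        return 0;
    }

This buffer-oriented interface is exactly what distinguishes BPF from the per-packet exchange used by PF_PACKET.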

208 that today’s off-the-shelf hardware is able to do, whereas it remains a very challenging task in 10 GE networks [4]. Apart from the ten-fold increased throughput, the difficulties also lie in the ten-fold increased packet rate, as the number of packets to be captured per time unit is a factor that is even more important than the overall bandwidth. As an exam- ple, capturing a 1 GE stream that consists entirely of large packets, e.g., 1500 bytes, is easy; whereas capturing a 1 GE stream consisting entirely of small packets, e.g., 64 bytes, is a very difficult task [8, 4]. This is due to the fact that each packet, regardless of its size, introduces a significant handling overhead. Driver issues that arise with a high number of packets have been studied very well [7, 10, 11]. Problems concern- ing interrupt storms are well understood and most network cards and drivers support mechanisms to avoid them. Such mechanisms include polling or interrupt moderation. In 2007, Schneider et al. compared FreeBSD and Linux on Intel and AMD platforms [4]. They determined that device Figure 3: Berkeley Packet Filter and Zero Copy polling on Linux reduces the CPU cycles within the ker- Berkeley Packet Filter nel and therefore helps to improve capturing performance. On FreeBSD however, device polling actually reduced the performance of the capturing and furthermore reduced the cates new pages and assign them to the card for DMA. Deri stability of the system. Hence, they recommend using the changed this behaviour in two ways: Non-TNAPI drivers re- interrupt moderation facilities of the cards instead of polling ceive an interrupt for received packets and schedule a kernel on FreeBSD. In their comparison, FreeBSD using BPF and thread to perform the packet processing. The processing is no device polling had a better capturing performance than performed by a kernel thread that is not busy doing other Linux with PF PACKET and device polling. This trend work and is chosen according to the scheduling rules of the is enforced if multiple capturing processes are simultane- operating system. Deri’s driver creates a separate kernel ously deployed; in this case, the performance of Linux drops thread that is only used to perform this processing. Hence, dramatically due to the additional load. Schneider et al. there is always a free kernel thread that can continue, and find that capturing 1 GE in their setup is possible with the packet processing can almost immediately start. standard facilities because they capture traffic that contains The second major change is the memory handling within packets originated from a realistic size distribution. How- the driver: As soon as the thread is notified about a new ever, they do not capture the maximum possible packet rate, packet arrival, the new packet is copied from the DMA mem- which is about 1.488 million packets per second with 64 bytes ory area into the PF RING ring buffer. Usually, network packets [5]. Another important aspect about the workload of drivers would allocate new memory for DMA transfer for Schneider et al. is that during each measurement, they send new packets. Deri’s drivers tell the card to reuse the old only one million packets (repeating each measurement for 7 memory page, thus eliminating the necessity of allocating times). This is a very low number of packets, considering and assigning new memory. that using 64 byte packets, it is possible to send 1.488 mil- Furthermore, his drivers may take advantage of multiple lion packets within one second. 
Some of the effects we could RX queues and multi-core systems. This is done by creating observe are only visible if more packets are captured. Based a kernel thread for each RX queue of the card. Each kernel on their measurement results, they recommend a huge buffer thread is bound to a specific CPU core in the case of multi- size in the kernel buffers, e.g., for the HOLD and STORE core systems. The RX queues are made accessible to the buffers in FreeBSD, to achieve good capturing performance. user space, so that users can run monitoring applications We can see a clear correlation between their recommenda- that read from the RX queues in parallel. tion and the number of packets they send per measurement A rather non-standard solution for wire-speed packet cap- and will come back to this in Section 4.2. ture is ncap. Instead of using the operating system’s stan- Deri validated his PF RING capturing system against the dard packet processing software, it uses special drivers and a standard Linux capturing PF PACKET in 2004 [8]. He special capturing library. The library allows an application finds PF RING to really improve the capturing of small (64 to read packets from the network card directly [3]. bytes) and medium (512 bytes) packets compared to the cap- turing with PF PACKET and libpcap-mmap. His findings 2.2 Previous Comparisons were reproduced by Cascallana and Lizarrondo in 2006 [6] who also found significant performance improvements with Previous work compared the different approaches to each PF RING. In contrast to Schneider et al., neither of the other. We will now summarise the previous findings and PF RING comparisons consider more than one capturing determine which experiments we need to repeat with our process. hardware platforms and new software versions. Results from Using TNAPI and the PF RING extensions, Deri claims related work can also give hints on further experiments we to be able to capture 1 GE packet streams with small packet need to conduct in order to achieve a better understanding sizes at wire-speed (1.488 million packets) [5]. of the involved processes. Capturing traffic in 1 GE networks is seen as something
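For reference, the wire-speed figure of 1.488 million pps follows directly from the Ethernet framing overhead: a minimal 64 byte frame occupies 84 bytes on the wire once the 8 byte preamble and the 12 byte inter-frame gap are added, i.e., 672 bit times per packet; 10^9 bit/s divided by 672 bit yields approximately 1.488 million packets per second on 1 GE, and ten times that rate on 10 GE.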

3. TEST SETUP

In this section, we describe the hardware and software setup that we used for our evaluation. Our test setup (see Figure 4) consists of three PCs, one for traffic generation and the other two for capturing.

Figure 4: Evaluation Setup

The traffic generator is equipped with several network interface cards (NICs) and uses the Linux Kernel Packet Generator [14] to generate a uniform 1 GE packet stream. It was necessary to generate the traffic using several NICs simultaneously because the deployed cards were not able to send out packets at maximum speed when generating small packets. Even this setup was only able to generate 1.27 million packets per second (pps) with the smallest packet size. However, this packet rate was sufficient to show the bottlenecks of all analysed systems. We had to deploy a second generator for our experiments in Section 4.4, where we actually needed wire-speed packet generation (1.488 million pps).

In contrast to [4], we did not produce a packet stream with a packet size distribution that is common in real networks. Our goal is explicitly to study the behaviour of the capturing software stacks at high packet rates. As we do not have 10 GE hardware available for our tests, we have to create 64 byte packets in order to achieve a high packet rate in our 1 GE setup.

Each of our test runs is configured to produce a packet stream with a fixed number of packets per second, runs over a time period of 100 seconds, and is repeated five times. As our traffic generator is software based and has to handle several NICs, the number of packets per second is not completely stable and can vary to some extent. Hence, we measure the variations in our measurements and plot them where appropriate.

Capturing is performed with two different machines with different hardware in order to check whether we can reproduce any special events we may observe with a different processor, bus system and network card, too. The first capturing PC is operated by two Intel Xeon CPUs with 2.8 GHz each. It has several network cards, including an Intel Pro/1000 (82540EM), an Intel Pro/1000 (82546EB) and a Broadcom BCM5701, all connected via a PCI-X bus. In our experiments, we only use one NIC at a time.

The second capturing PC has an AMD Athlon 64 X2 5200+ CPU and is also equipped with several network cards, including an Intel Pro/1000 (82541PI) and an nVidia Corporation CK804 onboard controller. Both cards are connected to the system via a slow PCI bus. Furthermore, there are two PCI-E based network cards connected to the system. One is an Intel Pro/1000 PT (82572EI), the other is a Dual Port Intel Pro/1000 ET (82576). It should be noted that the AMD platform is significantly faster than the Xeon platform.

Using different network cards and different architectures, we can check whether an observed phenomenon emerges due to a general software problem or whether it is caused by specific hardware or a specific driver. Both machines are built from hardware that is a few years old and therefore cheap; our machines are thus not high-end systems. We decided not to use more modern hardware for our testing for the following reasons:

• We did not have 10 GE hardware.

• We want to identify problems in the software stack that appear when the hardware is fully utilised. This is quite difficult to achieve with modern hardware on a 1 GE stream.

• Software problems that exist on old hardware which monitors 1 GE packet streams still exist on newer hardware that monitors a 10 GE packet stream.

Both machines are installed with Ubuntu Jaunty Jackalope (9.04) with a vanilla Linux kernel version 2.6.32. Additionally, they have an installation of FreeBSD 8.0-RELEASE for comparison with Linux.

We perform tests with varying load at the capturing machines' application layer in order to simulate the CPU load of different capturing applications during monitoring. Two tools are used for our experiments:

First, we use tcpdump 4.0.0 [15] for capturing packets and writing them to /dev/null. This scenario emulates a simple single-threaded capturing process with very simple computations, which thus poses almost no load on the application layer. Similar load can be expected on the capturing thread of multi-threaded applications that have a separate thread which performs capturing only. Examples of such multi-threaded applications are the Time Machine [16, 17] or the network monitor VERMONT [18].

The second application was developed by us and is called packzip. It poses variable load onto the thread that performs the capturing. Every captured packet is copied once within the application and is then passed to libz [19] for compression. The compression level can be configured by the user from 0 (no compression) to 9 (highest compression level). Increasing the compression level increases CPU usage and thus can be used to emulate an increased CPU load for the application processing the packets. Such packet handling has been performed before in [2] and [4]. We used this tool in order to make our results comparable to the results presented in this related work.

4. EVALUATION

This section presents our analysis results of various packet capture setups involving Linux and FreeBSD, including a performance analysis of our own proposed improvements to the Linux capturing stack. As multi-core and multi-processor architectures are common trends, we focus in particular on this kind of architecture. On these hardware platforms, scheduling issues arise when multiple processes involved in packet capturing need to be distributed over several processors or cores. We discuss this topic in Section 4.1. Afterwards, we focus on capturing with low application load in Section 4.2 and see whether we can reproduce the effects that have been observed in related work. In Section 4.3, we proceed to capturing with higher application load. We identify bottlenecks and provide solutions that lead to improvements to the capturing process. Finally, in Section 4.4 we present issues and improvements at the driver level and discuss how driver improvements can influence and improve the capturing performance.

4.1 Scheduling and packet capturing performance

On multi-core systems, scheduling is an important factor for packet capturing: If several threads, such as kernel threads of the operating system and a multi-threaded capturing application in user space, are involved, distributing them among the available cores in a clever way is crucial for the performance.

Obviously, if two processes are scheduled to run on a single core, they have to share the available CPU time. This can lead to a shortage of CPU time in one or both of the processes, and results in packet loss if one of them cannot cope with the network speed. Additionally, the processes then do not run simultaneously but alternately. As Schneider et al. [4] already found in their analysis, packet loss occurs if the CPU processing limit is reached. If the kernel capturing and the user space analysis are performed on the same CPU, the following effect can be observed: The kernel thread that handles the network card dominates the user space application because it has a higher priority. The application therefore has only little time to process the packets; this leads to the kernel buffers filling up. If the buffers are filled, the kernel thread will take captured packets from the card and throw them away because there is no more space to store them. Hence, the CPU gets busy capturing packets that will be immediately thrown away instead of processing the already captured packets, which would empty the buffers.

Conversely, if the processes are scheduled to run on two cores, each of them has more available CPU power, and furthermore they can truly run in parallel instead of interleaving on one CPU. However, sharing data between two cores or CPUs requires memory synchronisation between both of the threads, which can lead to severe performance penalties.

Scheduling can be done in two ways: Processes can either be scheduled dynamically by the operating system's scheduler, or they can be statically pinned to a specific core by the user. Manual pinning involves two necessary operations, as described in [5, 20] (a sketch of both operations is given at the end of this subsection):

• The interrupt affinity of the network card interrupts has to be bound to one core.

• The application process must be bound to another core.

We check the influence of automatic vs. manually pinned scheduling in nearly all our experiments.

Figure 5: Scheduling effects

Figure 5 presents a measurement with 64 byte packets at varying packet rates and low application load. Experiments were run where the kernel and user space application are processed on the same core (A:A), are pinned to run on different cores (A:B), or are scheduled automatically by the operating system's scheduler. Figure 5 shows that scheduling both processes on a single CPU results in more captured packets than running both processes on different cores when packet rates are low. This can be explained by the small application and kernel load at these packet rates. Here, the penalties of cache invalidation and synchronisation are worse than the penalties of the involved threads running alternately instead of in parallel. With increasing packet rate, the capturing performance in case A:A drops significantly below that of A:B.

Another interesting observation can be made if automatic scheduling is used. One would expect the scheduler to place a process on the core where it performs best, depending on system load and other factors. However, the scheduler is not informed about the load on the application and is therefore not able to make the right decision. As can be seen from the error bars in Figure 5, the decision is not consistent over the repetitions of our experiments in all cases, as the scheduler tries to move the processes to different cores and sometimes sticks with a wrong decision whereas sometimes it makes a good one.

Real capturing scenarios will almost always have a higher application load than the ones we have shown so far. In our experiments with higher application load, which we will show in Section 4.3, we can almost always see that running the processes in the A:B configuration results in better capturing performance. Static pinning always outperformed automatic scheduling, as the schedulers on Linux and FreeBSD made wrong decisions very frequently.
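The two pinning operations can be carried out as sketched below, assuming Linux: the application binds itself to one core via sched_setaffinity(), and the interrupt of the network card is bound to another core by writing a CPU mask to /proc/irq/<n>/smp_affinity. The IRQ number and the core numbers are examples and must be taken from /proc/interrupts on the system at hand.

    /* Sketch of manual pinning on Linux (IRQ and core numbers are examples). */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        /* Bind this capturing/analysis process to core 1. */
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(1, &set);
        if (sched_setaffinity(0, sizeof(set), &set) != 0)
            perror("sched_setaffinity");

        /* Bind the NIC interrupt (example: IRQ 24) to core 0 (mask 0x1).
         * Requires root; the IRQ number is listed in /proc/interrupts. */
        FILE *f = fopen("/proc/irq/24/smp_affinity", "w");
        if (f != NULL) {
            fprintf(f, "1\n");
            fclose(f);
        }
        return 0;
    }

FreeBSD offers comparable facilities for binding processes and interrupts to cores, but the exact interfaces differ.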

4.2 Comparing packet capture setups under low application load

Applications that capture network traffic usually build on libpcap [12], as presented in Section 2.1. FreeBSD ships libpcap version 1.0.0 in its base system, and most Linux distributions use the same version today (an exception is the current stable version of Debian 5.0 "Lenny", which still contained libpcap version 0.9.8-5 at the time this paper was written). Not long ago, libpcap version 0.9.8 was the commonly used version on Linux-based systems and FreeBSD. As already explained in Section 2, libpcap-0.9.8 or libpcap-0.9.8-mmap was used in most earlier evaluations.

In this paper, we want to use the standard libpcap-1.0.0, which ships with shared-memory support. As the shared-memory extensions in libpcap 0.9.8 and 1.0.0 were developed by different people, we first want to compare both versions. The results of this comparison are plotted in Figure 6 for the AMD and Xeon platforms. We can see that libpcap-0.9.8 performs slightly better than libpcap-1.0.0 on Xeon, while 1.0.0 performs better on AMD. However, both differences are rather small, so we decided to use the now standard 1.0.0 version for our experiments.

Figure 6: Comparison of different libpcap versions on Linux

Having a closer look at the figure, one can see that the Xeon platform performs better than the AMD platform. This is very surprising, as the AMD system's hardware is otherwise much faster than the aged Xeon platform. Things get even stranger if we include libpcap-0.9.8 without MMAP in our comparison (not shown in our plot): As the copy operation is much more expensive than the MMAP, one would expect MMAP to perform better than the copy operation. This assumption is true on the Xeon platform. On the AMD platform, however, we can observe that libpcap-0.9.8 without MMAP performs better than libpcap-0.9.8-mmap or libpcap-1.0.0. This points to some unknown performance problem which prevents the AMD system from showing its superior hardware performance.

The next comparison is between standard Linux capturing with PF_PACKET and capturing on Linux using the PF_RING extension from Deri [8]. We therefore use Deri's patches to libpcap-1.0.0, which enable libpcap to read packets from PF_RING. Two important PF_RING parameters can be configured. The first one is the size of the ring buffer, which can be configured as the number of packets that can be stored in the ring. Our experiments with different ring sizes reveal that in our setup, the size of the memory-mapped area is not of much influence. We conducted similar experiments with the buffer sizes of the Berkeley Packet Filter on FreeBSD and other buffer sizes on Linux. All these experiments showed that increasing the buffer size beyond a certain limit does not measurably boost capturing performance. Instead, we found evidence that too large buffers have a negative impact on capturing performance. These findings are contrary to the findings of Schneider et al. [4], who found large buffers to increase the capturing performance.

The biggest factor regarding the performance is our capturing application—since our hardware, at least the AMD platform, is able to transfer all packets from the network card into memory. If the software is able to consume and process incoming packets faster than wire-speed, the in-kernel buffers will never fill up and there will be no packet loss. However, if the application is not able to process the packets as fast as they come in, increasing the buffer will not help much—rather, it will only reduce the packet loss by the number of elements that can be stored in the buffer, until the buffer is filled.

Schneider et al. sent only 1,000,000 packets per measurement run, whereas we produce packets at 1 GE speed (i.e., more than 100 Megabytes per second), which usually amounts to much more than 1,000,000 packets per second, over a time interval of 100 seconds. Increasing kernel buffer sizes to 20 megabytes, as recommended by Schneider et al., allows them to buffer a great share of their total number of packets, but does not help much in the long term. If we increase the buffers to the recommended size, we cannot see any significant improvements in our experiments.

Buffer size can be crucial, though: This is the case when the monitoring application is not able to process packets at wire-speed, e.g., it can consume up to N packets per second (pps), and bursty Internet traffic is captured. If this traffic transports less than or equal to N pps on average but has bursts with a higher pps rate, then having a sufficiently dimensioned buffer to store the burst packets is obviously very important.
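With libpcap 1.0.0, the size of this kernel-side buffer can also be requested directly from the application through the pcap_create()/pcap_activate() interface. The sketch below is illustrative only; the helper name and the 4 MiB value are examples, not a recommendation derived from our measurements.

    /* Sketch: opening a capture handle with an explicit kernel buffer size
     * via the libpcap 1.0.0 API (values and helper name are examples). */
    #include <pcap/pcap.h>
    #include <stddef.h>

    pcap_t *open_with_buffer(const char *dev, int buffer_bytes)
    {
        char errbuf[PCAP_ERRBUF_SIZE];
        pcap_t *p = pcap_create(dev, errbuf);
        if (p == NULL)
            return NULL;
        pcap_set_snaplen(p, 65535);
        pcap_set_promisc(p, 1);
        pcap_set_timeout(p, 100);                  /* milliseconds          */
        pcap_set_buffer_size(p, buffer_bytes);     /* e.g. 4 * 1024 * 1024  */
        if (pcap_activate(p) < 0) {                /* negative on error     */
            pcap_close(p);
            return NULL;
        }
        return p;
    }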

The second important parameter is a configuration option called transparent_mode. It configures how PF_RING handles packets:

• Transparent mode 0: Captured packets are inserted into the ring via the standard Linux socket API.

• Transparent mode 1: The network card driver inserts the packets directly into the ring (which requires an adapted network driver). Packets are also inserted into the standard Linux network stack.

• Transparent mode 2: Same as mode 1, but received packets are not copied into the standard Linux network stack (for capturing with PF_RING only).

It is obvious that transparent mode 2 performs best, as it is optimised for capturing. We conducted some comparisons using different packet sizes and packet rates and indeed found PF_RING to perform best in this mode.

We expected PF_RING to outperform the standard Linux capturing due to the evaluations performed in [8] and [6] when using small packet sizes. This performance benefit should be seen with small packets (64 bytes) and a high number of packets per second, and disappear with bigger packet sizes. According to Deri's evaluation, capturing small packet streams using PF_PACKET in Linux 2.6.1 with libpcap-mmap only captured 14.9% of all packets, while PF_RING within the same kernel version was able to capture 75.7% of all packets. Our comparison confirms that PF_RING indeed performs better than PF_PACKET, as can be seen in Figure 7. We can also see that PF_RING is better than PF_PACKET on both systems. Furthermore, PF_RING on the faster AMD hardware performs better than PF_RING on Xeon. However, the performance difference is small if one considers the significant differences in hardware, which again points to the performance problems we pointed out before. One more observation concerns the difference between PF_PACKET and PF_RING within the same platform. Although there is some difference, it is not as pronounced as in previous comparisons. We explain this by the improvements that have been made in the capturing code of PF_PACKET and within libpcap since Deri's and Cascallana's evaluations in 2004 and 2006.

Figure 7: PF_PACKET vs. PF_RING

We now compare the measurement results of FreeBSD against the Linux measurements. FreeBSD has two capturing mechanisms: the Berkeley Packet Filter (BPF) and the Zero-Copy Berkeley Packet Filter (ZCBPF), as described in Section 2.1. At first, we compared both against each other, but we could not find any significant differences in any of our tests. Schneider et al. found FreeBSD to perform amazingly well even though FreeBSD employed this packet copy operation. This might indicate that the copy operation is indeed not a significant factor influencing the performance of the FreeBSD capturing system. We are uncertain about the true reason for Zero-Copy BPF not performing better than BPF; therefore, we do not include ZCBPF in our further comparisons.

Figure 8: Linux vs. FreeBSD

Our comparison with the standard BPF is shown in Figure 8 and presents the differences between PF_PACKET and FreeBSD. We can see some differences between capturing with PF_PACKET on Linux and capturing with BPF on FreeBSD. FreeBSD performs slightly better on both platforms, which confirms the findings of Schneider et al. [2, 4]. Differences increase if more than one capturing process is running on the systems, as shown in Figure 9. This figure shows the capturing performance of FreeBSD, Linux with PF_PACKET and Linux with PF_RING capturing a packet stream of 1270 kpps on the AMD system with one, two and three capturing processes running simultaneously.

Figure 9: Capturing with multiple processes

Capturing performance on both systems decreases due to the higher CPU load caused by the multiple capturing processes. This is an obvious effect and has also been observed in [4]. Linux suffers more from the additional capturing processes than FreeBSD, which has also been observed in related work. Amazingly, Linux with PF_RING does not suffer much from these performance problems and is better than FreeBSD and Linux with PF_PACKET. A strange effect can be seen if two capturing processes are run with PF_PACKET: Although system load increases due to the second capturing process, the overall capturing performance increases. This effect is only visible on the AMD system and not on the Xeon platform, and it also points to the same strange effect we have seen before and which we will explain in Section 4.3.

If we compare our analysis results of the different capturing solutions on FreeBSD and Linux with the results of earlier evaluations, we can see that our results confirm several prior findings: We can reproduce the results of Deri [8] and Cascallana [6], who found PF_RING to perform better than PF_PACKET on Linux. However, we see that the difference between both capturing solutions is not as strong as it used to be in 2004 and 2006, due to general improvements in the standard Linux software stack.

Furthermore, we can confirm the findings of Schneider et al. [4], who found that FreeBSD outperforms Linux (with PF_PACKET), especially when more than one capturing process is involved. In contrast, our own analyses reveal that PF_RING on Linux outperforms both PF_PACKET and FreeBSD. PF_RING is even better than FreeBSD when more than one capturing process is deployed, which was the strength of FreeBSD in Schneider's analysis [4]. We now compare the capturing solutions with increased application load and check whether our findings are still valid in such setups.

4.3 Comparing packet capture setups under high application load

So far, we analysed the capturing performance with very low load on the application by writing the captured packets to /dev/null. We now increase the application load by performing more computational work for each packet using the tool packzip, which compresses the captured packets using libz [19]. The application load can be configured by changing the compression level libz uses. Higher compression levels result in higher application load; however, the load does not increase linearly. This can be seen from the packet drop rates in Figure 10.

Figure 10: Packzip on 256 and 512 byte packets

The figure presents the results of the capturing process on various systems when capturing a stream consisting of 512 and 256 byte packets at maximum wire-speed on the AMD platform. It can clearly be seen that a higher application load leads to higher packet loss. This happens on both operating systems with every capturing solution presented in Section 2.1, and is expected due to our findings and the findings in related work. Packet loss kicks in as soon as the available CPU processing power is not sufficient to process all packets. We note that FreeBSD performs worse on our AMD platform compared to Linux with PF_PACKET under higher application load. This observation can be made throughout all our measurements in this section.

However, if we look at a packet stream that consists of 64 byte packets, we can see an amazing effect, shown in Figure 11. At first, we see that the overall capturing performance is worse with no application load (compression level 0) compared to the capturing results with 256 and 512 byte packets. This was expected, because capturing a large number of (small) packets is more difficult than capturing a small number of (big) packets. However, if the application load increases, the capturing performance increases as well. This effect is quite paradoxical and points to the same strange effect as seen in the experiments before. It can be observed with PF_PACKET and with PF_RING but is limited to Linux. FreeBSD is not affected by this weird effect; instead, its capturing performance is nearly constant with increasing application load, but almost always below Linux' performance.

Figure 11: Packzip on 64 byte packets

In order to better understand what is causing this bewildering phenomenon, we have to take a closer look at the capturing process in Linux. An important task within the capturing chain is the passing of a captured packet from the kernel to the application. This is done via a shared memory area between the kernel and the application within libpcap. The shared memory area is associated with a socket, in order to allow signalling between the kernel and the application, if this is desired. We will call this memory area SHM in the remainder of this section.

Packets are sent from kernel to user space using the following algorithm:

• The user application is ready to fetch a new packet.

• It checks the SHM for new packets. If a new packet is found, it is processed.

• If no new packet is found, the system call poll() is issued in order to wait for new packets.

• The kernel receives a packet and copies it into the SHM.

• The kernel "informs" the socket about the available packet; subsequently, poll() returns, and the user application will process the packet.

This algorithm is problematic because it involves many system calls if the application consumes packets very quickly, which has already been found to be problematic in Deri's prior work [8]. A system call is a quite expensive operation, as it results in a context switch, cache invalidation, new scheduling of the process, etc.
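In code, this algorithm corresponds roughly to the following consumer loop over the memory-mapped PF_PACKET ring (a simplified sketch for the TPACKET_V1 layout used by Linux 2.6.32; the ring dimensions are examples, and error handling as well as binding to a specific interface are omitted):

    /* Simplified PF_PACKET (PACKET_MMAP, TPACKET_V1) consumer loop.
     * Whenever the ring is empty, the process sleeps in poll() and is
     * currently woken for every single new packet. */
    #include <sys/socket.h>
    #include <sys/mman.h>
    #include <linux/if_packet.h>
    #include <linux/if_ether.h>
    #include <arpa/inet.h>
    #include <poll.h>
    #include <stdint.h>
    #include <string.h>

    int main(void)
    {
        int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));

        struct tpacket_req req;
        memset(&req, 0, sizeof(req));
        req.tp_frame_size = 2048;                 /* one slot per packet  */
        req.tp_block_size = 4096;                 /* two frames per block */
        req.tp_block_nr   = 64;
        req.tp_frame_nr   = 128;
        setsockopt(fd, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req));

        uint8_t *ring = mmap(NULL, (size_t)req.tp_block_size * req.tp_block_nr,
                             PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

        unsigned int i = 0;
        for (;;) {
            struct tpacket_hdr *h =
                (struct tpacket_hdr *)(ring + (size_t)i * req.tp_frame_size);

            if ((h->tp_status & TP_STATUS_USER) == 0) {
                /* SHM is empty: issue poll() and wait for the kernel to
                 * signal the next packet. */
                struct pollfd pfd = { fd, POLLIN, 0 };
                poll(&pfd, 1, -1);
                continue;
            }

            const uint8_t *pkt = (const uint8_t *)h + h->tp_mac;
            /* ... process h->tp_snaplen bytes at pkt ... */
            (void)pkt;

            h->tp_status = TP_STATUS_KERNEL;      /* hand the slot back   */
            i = (i + 1) % req.tp_frame_nr;
        }
    }

Every return from poll() is one context switch; the faster the application drains the ring, the more often this code path is taken.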

When we increase the compression level, the time spent consuming the packets increases as well; thus, fewer system calls are performed (we confirmed this by measuring the number of calls to poll()). Obviously, reducing the number of system calls is beneficial to the performance of the packet capture system.

There are several ways to achieve such a reduction. The most obvious solution to this problem is to skip the call to poll() and to perform an active wait instead. However, this solution can pose problems to the system: If capturing and analysis are scheduled to run on the same processor (which is not recommended, as we pointed out before), polling in a user space process eats valuable CPU time, which the capturing thread within the kernel would need. If capturing and analysis run on different cores, there are still penalties: The kernel and the user space application are trying to access the same memory page, which the kernel wants to write and the user space application wants to read. Hence, both cores or CPUs need to synchronise their memory access, continuously leading to cache invalidations and therefore to bad performance. We patched libpcap to perform the active wait and found some, but only little, improvement, due to the reasons discussed above.

Hence, we searched for ways to reduce the number of calls to poll() that do not increase the load on the CPU. The first one was already proposed by Deri [8]. The basic idea is to perform a sleep operation for several nanoseconds if the SHM is found to be empty. Although the sleep operation still implies a system call, it is much better than multiple calls to poll(), as the kernel capturing gets some time to copy several packets into the SHM. Deri proposed an adaptive sleep interval that is changed according to the incoming packet rate. We implemented this modification into the libpcap that uses PF_PACKET. We found that it is not an easy task to choose a proper sleep interval, because sleeping too long will result in filled buffers, whereas a sleep interval that is too short will result in too many system calls and therefore does not solve the problem. Unfortunately, the optimal sleep interval depends on the hardware, the performed analysis and, even worse, on the observed traffic. Hence, a good value has to be found by the end-user through experiments, which requires quite some effort. Deri's user space code that uses PF_RING does not implement his proposed adaptive sleep, probably for the same reasons. Instead, poll() avoidance is achieved by calls to sched_yield(), which interrupts the application and allows the scheduler to schedule another process. A call to poll() is only performed if sched_yield() was called several times and still no packets arrived. Figure 11 shows that the algorithm used by PF_RING yields better performance than the simple call to poll() with PF_PACKET. However, we can also see negative effects of calling sched_yield() often.

We therefore propose a new, third solution that works without the necessity to estimate a timeout and works better than calling sched_yield(): We propose to change the signalling of new incoming packets within poll(). As of now, PF_PACKET signals the arrival of every packet into user space. We recommend to signal packet arrival only if one of the two following conditions is true:

• N packets are ready for consumption.

• A timeout of m microseconds has elapsed.

Using these conditions, the number of system calls to poll() is reduced dramatically. The timeout is only necessary if fewer than N packets arrive within m, e.g., if the incoming packet rate is low. In contrast to the sleep timeout discussed before, choosing m properly is not necessary to achieve good capturing performance.
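Stated in code, the proposed wake-up rule looks as follows; this is an illustration of the condition only, not the actual kernel patch, and the structure and field names are placeholders:

    /* Illustrative wake-up condition for deferred signalling: user space is
     * only woken once N frames are ready or once m microseconds have passed
     * since the last wake-up (and at least one frame is pending). */
    #include <stdint.h>

    struct wake_policy {
        unsigned int batch;          /* N: frames per wake-up       */
        uint64_t     timeout_us;     /* m: maximum signalling delay */
    };

    struct ring_state {
        unsigned int ready;          /* frames ready for user space */
        uint64_t     last_wake_us;   /* time of the last wake-up    */
    };

    static int should_wake(const struct wake_policy *p,
                           const struct ring_state *s, uint64_t now_us)
    {
        if (s->ready >= p->batch)
            return 1;                              /* enough packets queued */
        if (s->ready > 0 && now_us - s->last_wake_us >= p->timeout_us)
            return 1;                              /* bound the latency     */
        return 0;                                  /* keep accumulating     */
    }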

215 kernel thread that is only responsible for the network card 100 can really help to improve performance. This is especially 80 useful if more than multi-core or multi-CPU systems are 60 available. With current hardware platforms tending to be 40 20 multi-core systems and the ongoing trend of increasing the number of cores in a system, this feature can be expected Captured packets [%] 0 1x PF_PACKET 2x PF_PACKET 3x PF_PACKET 1x modified PF_PACKET 2x modified PF_PACKET 3x modified PF_PACKET 1x PF_PRING 2x PF_RING 3x PF_RING to become even more important in the future. In addition, reusing memory pages as DMA areas and working with stat- ically allocated memory blocks should be preferred over us- ing dynamically allocated pages. However, it is unclear to us which of this two recommendations results in the greatest performance boosts. Signalling between different subsystems, especially if they are driven by another thread, should always be done for accumulated packets. This assumption is valid for all sub- Number of capturing processes on different TNAPI solutions systems, ranging from the driver, over the general operat- ing system stacks, up to the user space applications. We therefore recommend the integration of our modifications to Figure 13: TNAPI with different capturing solutions PF PACKET into Linux. on Linux Other areas besides packet capturing may also benefit from our modification: A generic system call that allows to a bigger influence on the performance. If TNAPI is used wait for one or more sockets until N elements can be read with PF PACKET instead of PF RING, an additional copy or a timeout is seen, whichever happens first, would be a operation is performed to copy the packet from the DMA generalised interface for our modifications to PF PACKET. memory area into a new memory area. The DMA area is Such system calls could be of use in other applications that reused afterwards. need to process data on-line as fast as the data arrives. We tested these improvements with different solutions on Linux but could not consider FreeBSD as there where no 5.2 For Users modified drivers at the time this paper was written. Our Configuring a capture system for optimal performance is comparison is plotted in Figure 13. In contrast to our pre- still challenging task. It is important to choose proper hard- vious experiment, we deployed one more traffic generator, ware that is capable of capturing the amount of traffic in which allowed us to generate traffic at real wire-speed (i.e., high-speed networks. 1.488 million packets). The plot shows results on the AMD The software that drives the hardware is very important system and compares PF PACKET against our modified for capture as well, since it highly influences the perfor- PF PACKET and PF RING. As can be seen, the TNAPI- mance of the capturing process. Our findings conclude that aware drivers result in improvements compared to normal all parts of the software stack have great influence on the drivers. PF PACKET capturing also benefits from TNAPI, performance—and that, unfortunately, a user has to check but the improvements are not very significant. all of them in order to debug performance problems. Using TNAPI with our modified PF PACKET results in Performance pitfalls can start at the network card drivers, good capturing results which are better than the results with if interrupt handling does not work as expected, which we standard drivers and also better than the results with stan- found to be true with one of the drivers we used. 
If TNAPI is used with PF_PACKET instead of PF_RING, an additional copy operation is performed to copy the packet from the DMA memory area into a new memory area. The DMA area is reused afterwards.

We tested these improvements with different solutions on Linux but could not consider FreeBSD, as there were no modified drivers at the time this paper was written. Our comparison is plotted in Figure 13. In contrast to our previous experiment, we deployed one more traffic generator, which allowed us to generate traffic at real wire-speed (i.e., 1.488 million packets per second). The plot shows results on the AMD system and compares PF_PACKET against our modified PF_PACKET and PF_RING. As can be seen, the TNAPI-aware drivers result in improvements compared to normal drivers. PF_PACKET capturing also benefits from TNAPI, but the improvements are not very significant.

[Figure 13: TNAPI with different capturing solutions on Linux. Captured packets (%) for one to three capturing processes using PF_PACKET, modified PF_PACKET, and PF_RING on TNAPI-aware drivers.]

Using TNAPI with our modified PF_PACKET results in good capturing results which are better than the results with standard drivers and also better than the results with standard FreeBSD. The best performance, however, can be found when TNAPI is combined with PF_RING, resulting in a capturing process that is able to capture 64 byte packets at wire-speed.

5. RECOMMENDATIONS

This section summarises our findings and gives recommendations to users as well as system developers. Our recommendations for developers of monitoring applications, drivers and operating systems are listed in subsection 5.1. Users of capturing solutions can find advice on how to configure their systems in subsection 5.2.

5.1 For Developers

During our evaluation and comparison, we found some bottlenecks within the software which can be avoided with careful programming. Developers of monitoring applications, operating systems or drivers should consider these hints in order to increase the performance their systems provide.

Our first advice is targeted at developers of network card drivers. We were able to determine that having a separate kernel thread that is only responsible for the network card can really help to improve performance. This is especially useful if multi-core or multi-CPU systems are available. With current hardware platforms tending to be multi-core systems and the ongoing trend of increasing the number of cores in a system, this feature can be expected to become even more important in the future. In addition, reusing memory pages as DMA areas and working with statically allocated memory blocks should be preferred over using dynamically allocated pages. However, it is unclear to us which of these two recommendations results in the greater performance boost.

Signalling between different subsystems, especially if they are driven by another thread, should always be done for accumulated packets. This holds for all subsystems, ranging from the driver, through the general operating system stack, up to the user-space applications. We therefore recommend the integration of our modifications to PF_PACKET into Linux.

Other areas besides packet capturing may also benefit from our modification: A generic system call that allows to wait for one or more sockets until N elements can be read or a timeout is seen, whichever happens first, would be a generalised interface for our modifications to PF_PACKET. Such system calls could be of use in other applications that need to process data on-line as fast as the data arrives.

5.2 For Users

Configuring a capture system for optimal performance is still a challenging task. It is important to choose proper hardware that is capable of capturing the amount of traffic in high-speed networks.

The software that drives the hardware is very important for capture as well, since it highly influences the performance of the capturing process. Our findings show that all parts of the software stack have a great influence on the performance, and that, unfortunately, a user has to check all of them in order to debug performance problems.

Performance pitfalls can start at the network card drivers, if interrupt handling does not work as expected, which we found to be true with one of the drivers we used. Checking for an unexpectedly high number of interrupts should be one of the first performance debugging steps. Here, enabling polling on the driver could help to solve the issue. However, we found that the POLLING option, a static compile-time option for many drivers, did not improve the performance in our tests.

We also recommend to use PF_RING with TNAPI, as its performance is superior to the standard capturing stack of FreeBSD or Linux. If using TNAPI or PF_RING is not an option, e.g., because there is no TNAPI-aware driver for the desired network interface card, we recommend to use Linux with our modifications to PF_PACKET.

Regardless of the capturing solution used, pinning all the involved threads and processes to different cores is highly recommended in most application cases. Using the default scheduler is only recommended when low packet rates are to be captured with low load at the application layer. Kernel and driver threads should also be pinned to a given core if possible. This can be achieved by explicitly setting the interrupt affinity of a network card's interrupts; a sketch of both steps is given below.
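The following user-space sketch illustrates this advice: it pins the capturing process to one core and documents, in comments, where the interrupt affinity is configured. The core number and the IRQ number are examples that have to be adapted to the system at hand; /proc/interrupts and /proc/irq/<n>/smp_affinity are the standard Linux interfaces for inspecting and setting interrupt affinity.

```c
/* Sketch: pin the capturing process to a fixed core (core 1 here) and
 * note where the NIC interrupt affinity is set. Core and IRQ numbers
 * are examples only and must be adapted to the actual system. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(1, &set);                 /* run the capturing process on core 1 */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return EXIT_FAILURE;
    }

    /* The NIC interrupt should be bound to a core as well, e.g. core 0:
     *   - check the per-core interrupt counts in /proc/interrupts
     *   - write a CPU mask to /proc/irq/<irq>/smp_affinity, for example
     *     "echo 1 > /proc/irq/24/smp_affinity" (IRQ 24 is only an example)
     * Kernel and driver threads can likewise be pinned with taskset(1). */
    puts("capture process pinned to core 1");

    /* ... open the capture socket and enter the capturing loop here ... */
    return EXIT_SUCCESS;
}
```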

Finally, if it is not an option to apply one of our proposed techniques for reducing the number of system calls (cf. Sections 4.3 and 5.1), the user should check whether performance improves if a higher load is put on the capturing application process. As we have seen in Section 4.3, a higher application load can reduce the number of calls to poll() and therefore improve the performance. This holds especially for applications that do not involve much CPU workload, e.g., writing a libpcap stream to disk.
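As an example of such a low-CPU consumer, the following sketch writes every captured packet straight to a trace file with libpcap [12]. The interface name eth1, the snapshot length and the file name are assumptions; error handling is kept to a minimum.

```c
/* Minimal libpcap consumer that writes the capture stream to disk.
 * Interface and file names are examples; compile with -lpcap. */
#include <pcap/pcap.h>
#include <stdio.h>

int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];

    pcap_t *handle = pcap_open_live("eth1", 65535, 1, 1000, errbuf);
    if (handle == NULL) {
        fprintf(stderr, "pcap_open_live: %s\n", errbuf);
        return 1;
    }

    pcap_dumper_t *dumper = pcap_dump_open(handle, "capture.pcap");
    if (dumper == NULL) {
        fprintf(stderr, "pcap_dump_open: %s\n", pcap_geterr(handle));
        return 1;
    }

    /* pcap_dump() is used directly as the per-packet callback and simply
     * appends each packet to the output file. */
    pcap_loop(handle, -1, pcap_dump, (u_char *)dumper);

    pcap_dump_close(dumper);
    pcap_close(handle);
    return 0;
}
```

Because such a consumer does almost no per-packet work, it frequently finds the buffer empty; this is exactly the situation in which the number of poll() calls grows and in which the measures discussed above pay off.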
6. CONCLUSION

This paper summarised and compared different capturing solutions on Linux and FreeBSD, including some improvements proposed by researchers. We compared the standard capturing solutions of Linux and FreeBSD to each other, leading to a reappraisal of past work from Schneider et al. [4] with newer versions of the operating systems and capturing software. The evaluation revealed that FreeBSD still outperforms standard Linux PF_PACKET under low application load. FreeBSD is especially good when multiple capturing processes are run.

Our comparison between standard Linux capturing with PF_PACKET and PF_RING confirmed that performance with PF_RING is still better when small packets are to be captured. However, the differences are not as big as they were in 2004 or 2006, when the last evaluations were published. Further analyses showed that PF_RING performs better than PF_PACKET if multiple capturing processes are run on the system, and that performance with PF_RING is even better than with FreeBSD. During our work, we found a performance bottleneck within the standard Linux capturing facility PF_PACKET and proposed a fix for this problem. Our fix greatly improves the performance of PF_PACKET with small packets. Using our improvements, PF_PACKET performs nearly as well as PF_RING.

Finally, we evaluated Luca Deri's TNAPI driver extension for Linux and found increased performance with all Linux capturing solutions. The best performance can be achieved if TNAPI is combined with PF_RING.

Acknowledgements

We gratefully acknowledge support from the German Research Foundation (DFG) funding the LUPUS project in the scope of which this research work has been conducted. Additional work was funded through ResumeNet, an EU FP7 project within the FIRE initiative.

The authors would like to thank Luca Deri and Gerhard Münz for their valuable support throughout our research.

7. REFERENCES

[1] Endace Measurement Systems, http://www.endace.com/.
[2] F. Schneider and J. Wallerich, "Performance evaluation of packet capturing systems for high-speed networks," in CoNEXT '05: Proceedings of the 2005 ACM Conference on Emerging Network Experiment and Technology. New York, NY, USA: ACM Press, Oct. 2005, pp. 284–285.
[3] L. Deri, "ncap: Wire-speed packet capture and transmission," in Proceedings of the IEEE/IFIP Workshop on End-to-End Monitoring Techniques and Services (E2EMON), May 2005.
[4] F. Schneider, J. Wallerich, and A. Feldmann, "Packet capture in 10-gigabit ethernet environments using contemporary commodity hardware," in Proceedings of the 8th International Conference on Passive and Active Network Measurement, Apr. 2007.
[5] L. Deri and F. Fusco, "Exploiting commodity multi-core systems for network traffic analysis," http://luca.ntop.org/MulticorePacketCapture.pdf, July 2009.
[6] G. A. Cascallana and E. M. Lizarrondo, "Collecting packet traces at high speed," in Proc. of Workshop on Monitoring, Attack Detection and Mitigation (MonAM) 2006, Tübingen, Germany, Sep. 2006.
[7] J. Mogul and K. K. Ramakrishnan, "Eliminating receive livelock in an interrupt-driven kernel," ACM Transactions on Computer Systems, vol. 15, no. 3, pp. 217–252, 1997.
[8] L. Deri, "Improving passive packet capture: Beyond device polling," in Proceedings of the Fourth International System Administration and Network Engineering Conference (SANE 2004), Sep. 2004.
[9] Intel Corporation, "An Introduction to the Intel QuickPath Interconnect," http://www.intel.com/technology/quickpath/introduction.pdf, 2009.
[10] I. Kim, J. Moon, and H. Y. Yeom, "Timer-based interrupt mitigation for high performance packet processing," in 5th International Conference on High-Performance Computing in the Asia-Pacific Region, 2001.
[11] L. Rizzo, "Device Polling Support for FreeBSD," in BSDCon Europe Conference 2001, 2001.
[12] V. Jacobson, C. Leres, and S. McCanne, "libpcap," http://www.tcpdump.org.
[13] P. Woods, "libpcap MMAP mode on linux," http://public.lanl.gov/cpw/.
[14] R. Olsson, "pktgen the linux packet generator."
[15] tcpdump, http://www.tcpdump.org.
[16] S. Kornexl, V. Paxson, H. Dreger, A. Feldmann, and R. Sommer, "Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic," in Proc. of ACM SIGCOMM Internet Measurement Conference (IMC) 2005, Berkeley, CA, USA, Oct. 2005.
[17] G. Maier, R. Sommer, H. Dreger, A. Feldmann, V. Paxson, and F. Schneider, "Enriching Network Security Analysis with Time Travel," in Proc. of ACM SIGCOMM 2008 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, Seattle, WA, USA, Aug. 2008.
[18] R. T. Lampert, C. Sommer, G. Münz, and F. Dressler, "Vermont - A Versatile Monitoring Toolkit for IPFIX and PSAMP," in Proceedings of Workshop on Monitoring, Attack Detection and Mitigation (MonAM) 2006, Tübingen, Germany, Sep. 2006.
[19] Homepage of the zlib project, http://www.zlib.net/.
[20] C. Satten, "Lossless gigabit remote packet capture with linux," http://staff.washington.edu/corey/gulp/, University of Washington Network Systems, Mar. 2008.
