Flow-Based Network Accounting with Linux

Flow-based network accounting with Linux Harald Welte netfilter core team / hmw-consulting.de / Astaro AG [email protected] Abstract Once a connection is evicted from the state table, its accounting relevant data is transferred to userspace to a special accounting daemon for Many networking scenarios require some form further processing, aggregation and finally stor- of network accounting that goes beyond some age in the accounting log/database. simple packet and byte counters as available from the ‘ifconfig’ output. When people want to do network accouting, the 1 Network accounting past and current Linux kernel didn’t provide them with any reasonable mechanism for doing Network accounting generally describes the so. process of counting and potentially summariz- ing metadata of network traffic. The kind of Network accounting can generally be done in metadata is largely dependant on the particular a number of different ways. The traditional application, but usually includes data such as way is to capture all packets by some userspace numbers of packets, numbers of bytes, source program. Capturing can be done via a num- and destination ip address. ber of mechanisms such as PF_PACKET sock- ets, mmap()ed PF_PACKET, ipt_ULOG, or There are many reasons for doing accounting ip_queue. This userspace program then ana- of networking traffic, among them lyzes the packets and aggregates the result into per-flow data structures. • transfer volume or bandwisth based billing • monitoring of network utilization, band- Whatever mechanism used, this scheme has a width distribution and link usage fundamental performance limitation, since all packets need to be copied and analyzed by a • research, such as distribution of traffic userspace process. among protocols, average packet size, . The author has implemented a different approach, by which the accounting information is 2 Existing accounting solutions for stored in the in-kernel connection tracking table of the ip_conntrack stateful firewall state Linux machine. On all firewalls, that state table has to be kept anyways—the additional overhead in- There are a number of existing packages to do troduced by accounting is minimal. network accounting with Linux. The follow- • 265 • 266 • Flow-based network accounting with Linux ing subsections intend to give a short overview kernel message ring buffer. This mechanism about the most commonly used ones. is called ipt_LOG (or LOG target). Such messages are then further processed by klogd and syslogd, which put them into one or 2.1 nacctd multiple system log files. nacctd also known as net-acct is proba- As ipt_LOG was designed for logging policy bly the oldest known tool for network account- violations and not for accounting, its overhead ing under Linux (also works on other Unix- is significant. Every packet needs to be inter- like operating systems). The author of this pa- preted in-kernel, then printed in ASCII format per has used nacctd as an accounting tool to the kernel message ring buffer, then copied as early as 1995. It was originally developed from klogd to syslogd, and again copied into by Ulrich Callmeier, but apparently abandoned a text file. Even worse, most syslog installa- later on. The development seems to have con- tions are configured to write kernel log mes- tinued in multiple branches, one of them being sages synchronously to disk, avoiding the usual the netacct-mysql1 branch, currently at version write buffering of the block I/O layer and disk 0.79rc2. subsystem. Its principle of operation is to use an AF_ To sum up and anlyze the data, often custom PACKET socket via libpcap in order to cap- perl scripts are used. Those perl scripts have to ture copies of all packets on configurable net- parse the LOG lines, build up a table of flows, work interfaces. It then does TCP/IP header add the packet size fields and finally export the parsing on each packet. Summary information data in the desired format. Due to the inefficient such as port numbers, IP addresses, number of storage format, performance is again wasted at bytes are then stored in an internal table for analyzation time. aggregation of successive packets of the same flow. The table entries are evicted and stored in a human-readable ASCII file. Patches ex- 2.3 ipt_ULOG based (ulogd, ulog-acctd) ist for sending information directly into SQL databases, or saving data in machine-readable data format. The iptables ULOG target is a more efficient version of the LOG target described As a pcap-based solution, it suffers from the above. Instead of copying ascii messages via performance penalty of copying every full the kernel ring buffer, it can be configured to packet to userspace. As a packet-based solu- only copies the header of each packet, and tion, it suffers from the penalty of having to in- send those copies in large batches. A special terpret every single packet. userspace process, normally ulogd, receives those partial packet copies and does further in- terpretation. 2.2 ipt_LOG based ulogd2 is intended for logging of security vi- The Linux packet filtering subsystem iptables olations and thus resembles the functionality of offers a way to log policy violations via the LOG. it creates one logfile entry per packet. It 1http://netacct-mysql.gabrovo. 2http://gnumonks.org/projects/ com ulogd 2005 Linux Symposium • 267 supports logging in many formats, such as SQL 2.5 ipt_ACCOUNT (iptaccount) databases or PCAP format. ulog-acctd3 is a hybrid between ulogd ipt_ACCOUNT5 is a special-purpose iptables and nacctd. It replaces the nacctd libp- target developed by Intra2net AG and avail- cap/PF_PACKET based capture with the more able from the netfilter project patch-o-matic-ng efficient ULOG mechanism. repository. It requires kernel patching and is not included in the mainline kernel. Compared to ipt_LOG, ipt_ULOG reduces the amount of copied data and required ker- ipt_ACCOUNT keeps byte counters per IP nel/userspace context switches and thus im- address in a given subnet, up to a ‘/8’ net- proves performance. However, the whole work. Those counters can be read via a special mechanism is still intended for logging of se- iptaccount commandline tool. curity violations. Use for accounting is out of its design. Being limited to local network segments up to ‘/8’ size, and only having per-ip granularity are two limiteations that defeat ipt_ACCOUNT as 2.4 iptables based (ipac-ng) a generich accounting mechainism. It’s highly- optimized, but also special-purpose. Every packet filtering rule in the Linux packet filter (iptables, or even its predecessor 2.6 ntop (including PF_RING) ipchains) has two counters: number of packets and number of bytes matching this particular rule. ntop6 is a network traffic probe to show network usage. It uses libpcap to cap- By carefully placing rules with no target (so- ture the packets, and then aggregates flows in called fallthrough) rules in the packetfilter rule- userspace. On a fundamental level it’s there- set, one can implement an accounting setup, fore similar to what nacctd does. i.e., one rule per customer. From the ntop project, there’s also nProbe, a A number of tools exist to parse the iptables network traffic probe that exports flow based in- command output and summarized the coun- formation in Cisco NETFLOW v5/v9 format. ters. The most commonly used package is It also contains support for the upcoming IETF ipac-ng4. It supports advanced features such IPFIX7 format. as storing accounting data in SQL databases. The approach works quite efficiently for small To increase performance of the probe, the au- PF_RING8 installations (i.e., small number of accounting thor (Luca Deri) has implemented , rules). Therefore, the accounting granularity a new zero-copy mmap()ed implementation for can only be very low. One counter for each 5http://www.intra2net.com/ single port number at any given ip address is opensource/ipt_account/ certainly not applicable. 6http://www.ntop.org/ntop.html 7IP Flow Information Export 3http://alioth.debian.org/ http://www.ietf.org/html.charters/ projects/pkg-ulog-acctd/ ipfix-charter.html 4http://sourceforge.net/ 8http://www.ntop.org/PF_RING. projects/ipac-ng/ html 268 • Flow-based network accounting with Linux packet capture. There is a libpcap compatibil- 3.1 ip_conntrack_acct ity layer on top, so any pcap-using application can benefit from PF_RING. ip_conntrack_acct is how the in-kernel ip_conntrack counters are called. There is PF_RING is a major performance improve- a set of four counters: numbers of packets and ment, please look at the documentation and the bytes for original and reply direction of a given paper published by Luca Deri. connection. However, ntop / nProbe / PF_RING are If you configure a recent (>= 2.6.9) kernel, all packet-based accounting solutions. Every it will prompt you for CONFIG_IP_NF_CT_ packet needs to be analyzed by some userspace ACCT. By enabling this configuration option, process—even if there is no copying involved. the per-connection counters will be added, and Due to PF_RING optimiziation, it is probably the accounting code will be compiled in. as efficient as this approach can get. However, there is still no efficient means of reading out those counters. They can be ac- cessed via cat /proc/net/ip_conntrack, but that’s 3 New ip_conntrack based ac- not a real solution. The kernel iterates over counting all connections and ASCII-formats the data. Also, it is a polling-based mechanism. If the The fundamental idea is to (ab)use the connec- polling interval is too short, connections might tion tracking subsystem of the Linux 2.4.x / get evicted from the state table before their fi- 2.6.x kernel for accounting purposes. There are nal counters are being read. If the interval is too several reasons why this is a good fit: small, performance will suffer.

Flow-Based Network Accounting with Linux

Ngenius Collector Appliance Scalable, High-Capacity Appliance for Collection of Cisco Netflow and Other Flow Data

Free and Open Source Software Is Not a “Free for All”: German Court Enforces GPL License Terms

Open Source Software Notice

Beej's Guide to Unix IPC

Mmap and Dma

An Introduction to Linux IPC

Netflow Traffic Analyzer Real-Time Network Utilization and Bandwidth Monitoring

Netflow Optimizer™

Netfilter's Connection Tracking System

Table of Contents

Flow-Tools Tutorial

Brocade Vyatta Network OS Data Sheet