A Modeling Framework for Network Processor Systems
Total Page:16
File Type:pdf, Size:1020Kb
A Modeling Framework for Network Processor Systems Patrick Crowley & Jean-Loup Baer Department of Computer Science & Engineering University of Washington Seattle, WA 98195-2350 pcrowley, baer ¡ @cs.washington.edu Abstract cation W at the target line rate and number of inter- faces? This paper introduces a modeling framework for network ¢ If not, where are the bottlenecks? processing systems. The framework is composed of inde- ¢ If yes, can S support application W’ (that is, application pendent application, system and traffic models which de- W plus some new task) under the same constraints? scribe router functionality, system resources/organization ¢ Will a given hardware assist improve system perfor- and packet traffic, respectively. The framework uses the mance relative to a software implementation of the Click Modular router to describe router functionality. Click same task? modules are mapped onto an object-based description of ¢ How sensitive is system performance to input traffic? the system hardware and are profiled to determine maxi- mum packet flow through the system and aggregate resource The framework is composed of independent application, utilization for a given traffic model. This paper presents system and traffic models which describe router function- several modeling examples of uniprocessor and multipro- ality, system resources/organization and packet traffic, re- cessor systems executing IPv4 routing and IPSec VPN en- spectively. The framework uses the Click Modular router cryption/decryption. Model-based performance estimates to describe router functionality. Click modules are mapped are compared to the measured performance of the real sys- onto an object-based description of the system hardware tems being modeled; the estimates are found to be accurate and are profiled and simulated to determine maximum within 10%. The framework emphasizes ease of use and packet flow through the system and aggregate resource uti- utility, and permits users to begin full system analysis of ex- lization for a given traffic model. This paper presents sev- isting or novel applications and systems very quickly. eral modeling examples of uniprocessor and multiproces- sor systems executing Internet protocol (IPv4) routing and IP security (IPSec) virtual-private network (VPN) encryp- 1. Introduction tion/decryption. Model-based performance estimates are compared to the measured performance of the real sys- Network processors provide flexible support for commu- tems being modeled; the estimates are found to be accu- nications workloads at high performance levels. Designing rate within 10%. The framework emphasizes ease of use a network processor can involve the design and optimiza- and utility, and permits users to begin full system analysis tion of many component devices and subsystems, includ- of existing or novel applications and systems very quickly. ing: (multi)processors, memory systems, hardware assists, Framework users can include those writing router software, interconnects and I/O systems. Often, there are too many router system architects or network processor architects. options to consider via detailed system simulation or proto- The remainder of the paper is structured as follows. Sec- typing alone. tion 2 introduces the modeling framework, and describes System modeling can be an economical means of evalu- its components, operation and implementation. Examples ating design alternatives; effective system models provide a of system models executing an IP router application are pre- fast and sufficiently accurate description of system perfor- sented and analyzed in Section 3. Section 4 uses an IPSec mance. The modeling framework introduced here has been VPN router extension to demonstrate application modeling designed to help answer questions such as: within the framework. Likewise, Section 5 explores traffic modeling for the systems and applications from preceding ¢ Is a system S sufficiently provisioned to support appli- sections. Section 6 considers future work . The paper con- cludes with a discussion in section 7. completely. This is of great benefit; Click can forward pack- ets around 5 times faster than Linux on the same system due 2. Framework Description to careful management of I/O devices [9]. Click describes router functionality with a directed graph of modules called elements; such a graph is referred to This section describes the framework component mod- as a Click configuration. A connection between two ele- els, the Click modular router, and the overall operation of ments indicates packet flow. Upon startup, Click is given a the framework. It begins with a high-level description of configuration which describes router functionality. An ex- the models. ample click configuration implementing an Ethernet-based IPv4 router is shown in Figure 1. In this paper, configura- ¢ Application - A modular, executable specification of tions are often referred to as routers, even if they do more router functionality described in terms of program ele- than route packets. The Click distribution comes with a ments and the packet flow between them. Examples of fairly complete catalog of elements – enough to construct IP program elements for an Ethernet IPv4 router include: routers, firewalls, quality-of-service (QoS) routers, network IP Fragmentation, IP address lookup, and packet clas- address-translation (NAT) routers and IPSec VPNs. Click sification. users can also build their own elements using the Click C++ ¢ System - A description of the processing devices, API. Configurations are described in text files with a simple, memories, and interconnects in the system being mod- intuitive configuration language. eled. For instance, one or more processors could be The authors ported Click to the Alpha ISA [4] in or- connected to SDRAM via a high-speed memory bus. der to profile and simulate the code with various proces- All elements of the application model are mapped to sor parameters using the Simplescalar toolkit [1]. Within system components. the framework, Click configurations function as application ¢ Traffic - A description of the type of traffic offered to specifications and implementations. Using Click fulfills the the system. Traffic characteristics influence A) which authors’ goal to use real-world software rather than bench- paths (sequence of elements) are followed in the ap- mark suites whenever possible; previous experience in eval- plication model and B) the resources used within each uating network processor architectures [3] has shown the element along the path. importance of considering complete system workloads. In Click, the mechanism used to pass packets between As previously mentioned, Click configurations are used elements depends on the type of output and input ports in- to describe applications in terms of modular program ele- volved. Ports can use either push or pull processing to de- ments and the possible packet flow between them. The traf- liver packets. With push processing, the source element is fic model dictates the packet flow between elements. The the active agent and passes the packet to the receiver. In system model describes system resources in a manner com- pull processing, the receiver is active and requests a packet patible with the work description admitted by the applica- from an element on its input port. Packet flow through a tion model. The general approach is to approximate appli- Click configuration has two phases corresponding to these cation characteristics statistically, based on instruction pro- two types of processing. When a FromDevice element is filing and program simulation when considering a software scheduled, it pulls a packet off the inbound packet queue implementation of a task, and to use those approximations for a specified network device and initiates push process- to form system resource usage and contention estimates. ing; push processing generally ends with the packet being In addition to these models, the user provides a mapping placed in a Queue element near a ToDevice element; this of the application elements onto the system; this mapping can be seen by examining paths through Figure 1. When a describes where each element gets executed in the system ToDevice element gets scheduled, it initiates pull process- and how the packets flow between devices when they move ing, generally by dequeueing a packet from a Queue. As a from one program element to another. general rule, push processing paths are more time consum- ing that pull processing paths. 2.1. The Click Modular Router Click maintains a worklist of schedulable elements and schedules them in round-robin fashion. FromDevice, This section describes the Click modular router. A good PollDevice (the polling version of FromDevice used and introduction to Click is found in [9]; Kohler’s thesis [8] described later in this paper) and ToDevice are the only describes the system in greater detail. Click is fully func- schedulable elements. On a uniprocessor-based system, tioning software built to run on real systems (primarily x86 there is no need for synchronization between elements or Linux systems). When run in the Linux kernel, Click han- for shared resources like free-buffer lists, since there is only dles all networking tasks; Linux networking code is avoided one packet active at a time and packets are never pre-empted FromDevice(eth0) FromDevice(eth1) terest. Classifier(...) Classifier(...) Click code is profiled and simulated with the Sim- ARP ARP ARP ARP queries responses IP queries responses IP pleScalar toolkit [1], version 3.0, targeted for the