An Extensible System-On-Chip Internet Firewall
ABSTRACT
A single-chip firewall has been implemented that performs packet filtering, content scanning, and per-flow queuing of Internet packets at Gigabit/second rates. All of the packet processing operations are performed using reconfigurable hardware within a single Xilinx Virtex XCV2000E Field Programmable Gate Array (FPGA). The SOC firewall processes headers of Internet packets in hardware with layered protocol wrappers. The firewall filters packets using rules stored in Content Addressable Memories (CAMs). The firewall scans payloads of packets for keywords using a hardware-based regular expression matching circuit. Lastly, the SOC firewall integrates a per-flow queuing module to mitigate the effect of Denial of Service attacks. Additional features can be added to the firewall by dynamic reconfiguration of FPGA hardware.

Categories and Subject Descriptors
I.5.3 [Pattern Recognition]: Design Methodology; B.4.1 [Data Communications]: Input/Output Devices; C.2.1 [Computer-Communication Networks]: Network Architecture and Design

General Terms
Design, Experimentation, Network Security

Keywords
System On Chip, FPGA, Internet, Firewall, Packet Scanning, Per-Flow Queuing, Network Intrusion Detection

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Design Automation Conference '03, June 2-6, 2003, Anaheim, CA.
Copyright 2003 ACM 1-58113-000-0/00/0000…$5.00.

1. INTRODUCTION
As the Internet has grown, demand for network security has significantly increased. Internet-connected machines are continuously the target of malicious attacks from machines located around the world. Internal hosts can be protected from remote attacks by filtering traffic through a firewall. As shown in Figure 1, firewalls typically reside between the backbone switches and the internal hosts. Firewalls drop packets that are known to be malicious and rate-limit traffic flows that attempt to transmit excessively large amounts of traffic. By placing multiple firewalls throughout a network, individual subnets can be isolated from each other and be protected from other hosts on the Internet.

[Figure 1: Internet Firewall Configuration. Labels: Internet, Internet Packets, Backbone Switch, Firewall, Switch, Internal Hosts (PC 1, PC 2), Fiber, Ethernet.]

Recently, new types of firewalls have been introduced with an increasing set of features. While some types of attacks have been thwarted by dropping packets based on the value of packet headers, new types of firewalls must scan the bytes in the payload of the packets as well. Further, new types of firewalls need to defend internal hosts from Denial of Service (DoS) attacks, which occur when remote machines flood traffic to a victim host at high rates [1].

Few existing firewalls have the ability to scan the full packet payload or provide protection against DoS attacks. Of the systems that do, most run in software and are not fast enough to perform those functions at high speeds [3]. There exists a need for hardware-accelerated packet processing firewalls which maintain high throughput.

Custom Integrated Circuits (ICs) can be used to implement firewall functions at Gigabit/second rates. They achieve high throughput by performing operations in parallel and by processing packets in deep pipelines. In the past, hardware-based packet processing systems required multiple ASICs to filter and forward packets in hardware. Today, an integrated circuit with tens of millions of transistors can implement a firewall as a single System On Chip (SOC). A challenge in building firewalls is to make the device capable of protecting against both current and future threats [6]. Reconfigurable hardware provides both the logic density to implement a complex firewall and the flexibility to reconfigure and implement new functions.

2. SYSTEM ON CHIP FIREWALL
A System-On-Chip Internet firewall has been implemented on a Xilinx Virtex XCV2000E FPGA. In order to protect against current threats, the SOC firewall integrates circuits to filter headers, scan payloads, and buffer traffic. In order to protect against future threats, the SOC is extensible, allowing insertion of new packet processing hardware modules.

[Figure 2: Block Diagram of System-On-Chip Firewall. Blocks: Layered Protocol Wrappers, Payload Scanner, CAM Filter, Flow Buffer, Queue Manager, Packet Scheduler, Free List Manager, and SDRAM and SRAM Controllers (Interfaces to Off-Chip Memories); Data Input and Data Output.]

The top-level architecture of the System On Chip firewall is shown in Figure 2. When data first enters the SOC, a set of layered protocol wrappers parse the headers of the Internet packets. Next, the payload scanner examines the content of the packets to identify keywords and/or regular expressions. Next, the CAM filter compares the fields in the header of the packet with a set of rules stored in Ternary Content Addressable Memory (TCAM). Some rules can cause the CAM filters to outright drop packets, while other rules are used to classify the packet and assign it a flow identifier. After classification, the queue manager schedules when packets are transmitted from the flow buffer, which stores the packet in off-chip memory. Once scheduled, data is read from the flow buffer and transmitted out of the firewall. Additional features can be added to the system by inserting blocks along the data processing path.

2.1 Header Processing
Internet protocol packets contain both a header and a payload. The header contains multiple fields that specify the type of packet, the protocol of the packet, where the packet has come from, where the packet is destined, the length of the packet, and other options relevant to the Internet protocols.

2.1.1 Layered Protocol Wrappers
To simplify the processing of the protocol fields on the SOC firewall, a set of layered protocol wrappers was implemented to process protocols at multiple layers [2]. At the lowest layer, data is segmented and reassembled from short cells into complete frames. At another layer of the protocol stack, the fields of the Internet Protocol (IP) packets are computed and verified. At the highest level of the protocol processing, the user-level data is separated from the headers and transport fields used by the network.

2.1.2 Content Addressable Memory Filters
Once the header has been processed, a Ternary Content Addressable Memory (TCAM) classifies packets as belonging to a specific flow. A diagram of a two-entry TCAM is shown in Figure 3. When a packet arrives, the packet's source address, destination address, source port, destination port, and protocol are simultaneously compared to the value fields in all of the rows of the TCAM. After the bits are compared, a mask register is applied to the vector to select which bits of each row must match and which bits can be ignored. If all of the values match in all of the bit locations that are unmasked, then that row of the TCAM is considered to be a match. The flow identifier associated with the rule in the highest-priority matching row of the TCAM is then assigned to the flow.

[Figure 3: Ternary CAM Filter. Fields: Src IP, Dest IP, Src Port, Dest Port, Proto, Content. Registers: CAM_MASK_1, CAM_VALUE_1, CAM_VALUE_2, CAM_MASK_2. Bit positions marked: 0, 103, 111.]

2.2 Payload Processing
Many types of Internet traffic cannot be classified by examination of the packet headers. For example, the KaZaA program sometimes disguises packet headers to appear as though they were being sent from a web server. For network administrators who care about the security of their networks, it is important to be able to classify packets based on their content rather than just the values that appear in the packet headers.

2.2.1 Regular Expression Matching
In order to scan the payload of packets, a regular expression matching circuit was implemented. Regular expressions provide a shorthand means to specify the value of a string, a wildcard character (specified by ‘?’), or a string of multiple characters (specified by ‘*’). For example, the string “{A|a}lbert ? {E|e}instein” matches all four case variations of the name Albert Einstein and allows the middle initial to be an arbitrary character.

2.2.2 Implementation of the Payload Scanner
To generate high-speed hardware that searches for the regular expression, a design flow was created to automatically generate finite state machines from the specification of regular expressions. A match is detected when the sequence of arriving bytes causes the state machine to reach a matching state. In order to scan for multiple regular expressions, a sequence of scanning engines is instantiated. In order to achieve higher performance, pipelines can operate in parallel. A payload scanner searching for eight Regular Expressions (RE1-RE8) using four parallel search flows is illustrated in Figure 4.

[Figure 4: Regular Expression (RE) Payload Scanner. A Flow Dispatcher distributes incoming packets across four Parallel Search Flows, each a pipeline of RE scanning engines RE1-RE8; a Flow Collector gathers the outgoing packets.]

When a packet arrives, the packet’s data is delivered to the flow buffer and the packet’s flow identifier is passed to the En-queue FSM in the Queue Manager. Using the flow ID, the En-queue

2.2.3 Application using the Payload Scanner
A payload processing circuit has been implemented on the SOC firewall that scans email for unwanted messages, commonly referred to as SPAM.
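As a software illustration of the kind of keyword scan described in Sections 2.2.1 and 2.2.2, the sketch below translates the paper-style pattern syntax into Python's `re` module and tests a payload against a list of patterns, one result per pattern (loosely analogous to one scanning engine per expression). This is only a minimal sketch: the firewall itself uses generated finite state machines in FPGA hardware, not a software regex library. The function names `to_python_regex` and `scan_payload` are hypothetical, and the mapping of `{x|y}`, `?`, and `*` onto `re` syntax is an assumption based on the description in Section 2.2.1.

```python
import re


def to_python_regex(pattern):
    """Translate a paper-style pattern into Python `re` syntax.

    Assumed mapping (not taken from the paper's tool flow):
      {x|y} -> (?:x|y)   alternation group
      ?     -> .         any single character
      *     -> .*        any string of characters
    Every other character is matched literally.
    """
    out = []
    for ch in pattern:
        if ch == "{":
            out.append("(?:")
        elif ch == "}":
            out.append(")")
        elif ch == "|":
            out.append("|")
        elif ch == "?":
            out.append(".")
        elif ch == "*":
            out.append(".*")
        else:
            out.append(re.escape(ch))
    return "".join(out)


def scan_payload(payload, patterns):
    """Return one boolean per pattern: True if it occurs anywhere in payload.

    The payload is decoded as latin-1 so that every byte maps to exactly
    one character, mirroring a byte-at-a-time scan.
    """
    text = payload.decode("latin-1")
    return [
        re.search(to_python_regex(p), text, re.DOTALL) is not None
        for p in patterns
    ]


if __name__ == "__main__":
    rules = ["{A|a}lbert ? {E|e}instein", "viagra"]
    print(scan_payload(b"From: Albert Q einstein", rules))
```

A pattern such as “{A|a}lbert ? {E|e}instein” accepts either capitalization of each name but rejects an all-uppercase spelling, since only the first letter of each word is given as an alternation.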