Computer Networks 51 (2007) 588–605 www.elsevier.com/locate/comnet

Fast binary and multiway prefix searches for packet forwarding

Yeim-Kuan Chang *

Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, ROC

Received 24 June 2005; received in revised form 24 January 2006; accepted 12 May 2006. Available online 21 June 2006.

Responsible Editor: J.C. de Oliveira

Abstract

Backbone routers with tens-of-gigabits-per-second links are indispensable communication devices deployed on the Internet. The IP lookup operation is the most critical task that must be improved in routers. In this paper, we first present a systematic method to compare prefixes of different lengths. The list of prefixes can then be sorted and stored in a sequential array, in contrast to the linked lists used in most trie-based structures. Next, fast binary and multiway prefix searches assisted by auxiliary prefixes are proposed. We also develop a 32-bit representation to encode the prefixes of different lengths. For the large routing tables currently available on the Internet, the proposed multiway prefix search can achieve a worst-case number of memory accesses of three and four if the sizes of the CPU cache lines are 64 and 32 bytes, respectively. The IPv4 simulation results show that the proposed prefix searches outperform the existing IP lookup schemes in terms of lookup times and memory consumption. The simulations using IPv6 routing tables also show the performance advantages of the proposed binary prefix searches. We also analyze the performance of the existing lookup schemes by concurrently considering the lookup speed, the update speed, and the memory consumption. Although the update speed of the proposed prefix search is worse than that of the dynamic routing table schemes with log(N) complexity for a table of N prefixes, our analysis shows that the overall performance of the proposed binary prefix search outperforms all the existing schemes. © 2006 Elsevier B.V. All rights reserved.

Keywords: Binary search; CPU caches; Longest prefix match; Routing table

* Tel.: +886 6 2757575. E-mail address: [email protected]

1. Introduction

Internet traffic continues to grow at an unprecedented rate due to the advent of the World Wide Web. This has put tremendous pressure on Internet providers who have to set up the necessary infrastructure to support this growth. A crucial part of this infrastructure is the router. Backbone routers with link speeds of several tens of gigabits per second (Gbps), such as OC-192 (10 Gbps) and OC-768 (40 Gbps), are commonly deployed. These backbone routers have to forward millions of packets per second at each port. All tasks that have to be executed by the router after receiving a packet can be divided into time-critical (fast path) and non-time-critical (slow path) operations, depending on the packet type and its frequency. Time-critical operations that are performed on the majority of the packets must be implemented in a highly

1389-1286/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.comnet.2006.05.005

efficient and optimized manner to keep up with the high link speed and router bandwidth. As described in ''Requirements for IP Version 4 Routers'' [26], this includes IP packet validation, packet lifetime control, checksum recalculation, destination address parsing, and IP table lookups. Among all the tasks performed by routers, the IP table lookup entails the most time-consuming process, in which the destination addresses are looked up against a forwarding table by a forwarding engine that determines the next-hops in the network where the packets should be sent.

The forwarding table must maintain an entry for every allocated network address block, which is represented as a route prefix. However, the continuous growth of the number of hosts and networks has made the forwarding tables in the backbone routers grow very rapidly. To efficiently use the IP address space and slow down the growth of the forwarding tables, the classless IP subnet scheme called Classless Inter-Domain Routing (CIDR) was introduced. While CIDR reduces the size of the forwarding tables, the table lookup problem now becomes more complex. With CIDR, each route prefix in the forwarding table can vary from 1 to 32 bits instead of 8, 16, or 24 bits as in the classful address scheme. As a result, the search in a forwarding table can no longer be performed by exact matching because the length of the prefix cannot be derived from the address itself. The table lookup problem becomes the ''Longest Prefix Match'' problem, which determines the longest route prefix matching a destination address, because more than one prefix may match the destination address.

Router designers were challenged to come up with fast and efficient algorithms for solving the Longest Prefix Match (LPM) problem. Three metrics are primarily considered in designing routers:

1. Search time. This is the time taken by the forwarding engine to look up the forwarding table for the destination address of an incoming packet.
2. Storage requirement. This is the memory space required for the table lookup data structure.
3. Update time. This is the time required to insert/delete a route prefix into/from the forwarding table.

Many IP table lookup algorithms have been proposed to solve LPM problems. For instance, Ruiz-Sanchez et al. classified a large variety of table lookup algorithms and compared their worst-case complexities of lookup latency, update time, and storage usage [17]. The binary trie is the basic data structure used in most IP lookup algorithms. The binary trie allows the search for the LPM to proceed in a bit-by-bit fashion. It is mostly implemented using linked lists in which each trie node has left and right pointers pointing to its left and right subtries, respectively. We cannot store all the nodes of the binary trie in a sequential array and then apply the binary search because there is no mechanism to compare two prefixes of different lengths.

In this paper, we shall propose two new IP lookup algorithms called binary prefix search and multiway prefix search, based on a new mechanism to sort the prefixes in the forwarding table. Our goal is to store the prefixes in a linear array, and thus a faster search speed and a smaller memory requirement can be achieved. By generating fewer than N auxiliary prefixes for a routing table of N prefixes, the naïve binary search can be applied. As a result, a worst-case lookup complexity of log(N) can be obtained, and the storage complexity of the proposed binary prefix search can be as good as O(N). In order to further reduce the storage requirement, we also propose a 32-bit representation to encode prefixes of any length. The comparison and matching operations can be performed efficiently by using 32-bit CPU instructions. The proposed binary prefix search can be easily extended to a multiway search scheme. Using a special array index technique, our multiway prefix search can achieve a maximum tree degree of up to 17 and 33 using 32- and 64-byte cache lines, respectively. Therefore, the proposed multiway scheme can achieve at most four and three memory accesses for an IP lookup based on a forwarding table containing more than 120K route prefixes. These simulation results show that the proposed prefix search algorithms perform better than other existing IP lookup algorithms.

The rest of the paper is organized as follows. Section 2 describes existing IP lookup schemes, and Section 3 illustrates the basic data structure and the detailed lookup algorithms proposed in this paper. Section 4 shows the efficient encoding schemes for prefixes, and Section 5 improves the lookup performance by extending the proposed algorithms onto the fast cache architecture. Section 6 presents the results of the performance comparisons using real routing tables, and finally, the last section gives the concluding remarks.

2. Existing schemes and discussions

In this section, we classify the existing schemes based on the three aforementioned metrics, namely, search speed, update speed, and memory requirement, instead of their data structures. These three metrics are the most important metrics used for evaluating different IP lookup schemes. However, it should be noted that improving one may significantly degrade another. A comprehensive review of the existing IP lookup schemes before 2001 can be found in [17]. In this paper, we are interested in the software-based lookup schemes that can be implemented in routers using network processors. We will not address designs with special hardware support such as ternary content-addressable memory (TCAM) [24] and pipelined ASIC-based engines [1].

2.1. Schemes for optimizing search speed

Ideally, we can use 4 GB to store the 1-byte port numbers of all IP addresses. Thus, only one memory access is needed for a lookup. However, the update speed is slow, and the memory requirement is too high.

Multibit tries [20,18] basically avoid consuming too much memory by using hierarchical data structures. A multibit trie in which each node has 2^k children is called a k-bit trie, and k is called the stride of the trie. A k-bit trie can speed up the search performance by inspecting not just one bit but k bits at a time in an IP lookup operation. The stride size can be varied in any trie node at different levels. Therefore, if all nodes at the same level have the same stride size, it is considered a fixed stride; otherwise, it is a variable stride.

Gupta et al. [6] proposed a two-level multibit trie with fixed strides such as hierarchical 24-8 and 16-16 tries. Using the same expansion technique, hierarchical 16-8-8 or 8-8-8-8 tries can be constructed. Larger strides result in faster searches, but the memory requirement is high, and the update process is slow because many entries need to be modified.

In another study [12], the authors converted the prefix search problem to a range search problem since a prefix is a special case of a range. The prefixes are encoded by their start and end addresses, which are also called their endpoints. These endpoints are sorted and stored in a sequential array. Additional information such as the ''>'' and ''='' ports is employed to make the binary search on the sequential array work. The primary idea of the binary range search is to precompute the port numbers when the target IP is equal to one of the endpoints or is located between two consecutive endpoints. Waldvogel et al. [25] proposed a scheme (denoted BS-Length) that performs a binary search on hash tables organized by prefix length. Both schemes in [12,25] need precomputation to obtain fast lookup performance.

2.2. Schemes for optimizing memory requirement

The small forwarding table (SFT) scheme [5] used a compressed version of the 16-8-8 trie to reduce memory consumption. Nilsson and Karlsson [16] proposed a scheme called the Level-Compressed (LC) trie, which recursively transforms binary tries with prefixes into multibit tries. In the LC trie, a fill factor is used to determine if a level can be compressed, and the indices of the pre-allocated array are used as the pointers for linking nodes of different levels. Huang et al. [7] proposed a compressed 16-x (C-16-x) scheme based on the run-length encoding technique. In another study [3], a hashing technique is proposed to reduce the memory size for the 8-8-8-8 tries. In [21], the authors proposed a compression technique to compact the endpoints of the prefixes, and they used shared pointers to reduce the memory requirement.

2.3. Schemes for optimizing update time

Sahni et al. proposed many dynamic IP lookup schemes in order to achieve an update complexity of O(logN) for a routing table of N prefixes. The prefix binary tree on binary tree scheme (PBOB) [13], the priority search tree scheme (PST) [14], and the collection of red–black trees scheme (CRBT) [19] are also studied in this paper. Their algorithms are basically based on balanced trees such as the segment tree, interval tree, and priority search tree.

2.4. Discussions

The balanced tree-based algorithms and 1-bit trie-based algorithms, such as the binary trie, use pointers to implement the desired data structures. Searches have to follow pointers to traverse the tree and thus involve expensive memory references. In addition, the primary memory of the trie or balanced tree is used for storing pointers and other auxiliary information for speeding up the search and update processes. In general, the balanced tree and trie-based schemes have a high memory requirement. They also have a very good update performance but only a fair search speed.

The compression-based schemes mostly use arrays to implement the desired data structures. For example, the LC trie [16] and the SFT scheme [5] use arrays to implement the desired data structure, and thus the update process cannot be done without reconstructing the whole data structure. In general, the schemes using an array implementation result in a faster search speed, and the memory requirement is low. However, the update speed is very slow because a lot of precomputation is needed.

Among all the existing algorithms, only the binary range search [12] can store the lookup data structure in a sequential array without other auxiliary data structures. Thus, its memory consumption is fair, and the search speed is fast. Similar to the compression-based schemes, the update speed is very slow because precomputations such as shifting array elements to allocate space for new entries are needed.

In this paper, we will propose a new encoding format to store prefixes, which is simpler than the method proposed by Yazdani and Min [23]. Based on the new encoding format, prefixes can be compared and sorted easily. Auxiliary prefixes are generated in order to allow the prefixes to be stored in the sequential array, and thus the naïve binary search can be applied to find the LPM. Since the proposed scheme uses an array implementation that results in a slow update process, we improve the update speed by using a 16-bit segmentation table as used in other studies [4–7,16]. The 16-bit segmentation table allows the search to be performed on the prefixes belonging to only one segment. The searches for IP addresses belonging to other segments are not affected by the ongoing update process.

3. Proposed data structure

Below are the notations and terminology used in this study.

b_{n-1}...b_0/l/p: the length format of prefixes. It represents a prefix of length l associated with a next port number p in the n-bit address space. In IPv4, the notation b_{n-1}...b_0/l or d3.d2.d1.d0/l will be used when no confusion is incurred.

b_{n-1}...b_0/m_{n-1}...m_0/p: the mask format of prefixes. It is similar to the length format except that the n-bit mask m_{n-1}...m_0 is used. For the mask, there exists an index i with n-1 ≥ i ≥ 0 such that m_j = 1 for all j ≥ i, and m_j = 0 otherwise. The notation b_{n-1}...b_0/m_{n-1}...m_0 or d3.d2.d1.d0/m3.m2.m1.m0 for IPv4 will be used when no confusion is incurred.

b_{n-1}...b_i*...*/p: the ternary format of prefixes. It represents a prefix of length n-i which is associated with a next port number p, where b_j = 0 or 1 for n-1 ≥ j ≥ i. When we use t_{n-1}...t_1t_0 as the ternary format of a prefix, where t_i = 0, 1, or * (don't care), we must follow the rule that if t_k is *, then t_j must also be * for all j < k. For simplicity, a single don't-care bit is used to denote a series of don't-care bits, e.g., prefix 1* denotes 1**** in the 5-bit address space.

Prefix enclosure: A prefix is said to be enclosed or covered by another prefix if the address space covered by the former is a subset of that covered by the latter. We use B ⊇ A or A ⊆ B to represent that prefix B encloses prefix A, where ⊇ or ⊆ is the enclosure operator.

Disjoint prefixes: Two prefixes are said to be disjoint if neither of them is enclosed by the other.

It is known that the binary search works only for sorted lists. Therefore, we must have a mechanism to compare prefixes. Comparing prefixes is not easy because the lengths of the prefixes may vary. In this section, we will first propose a systematic method to compare prefixes of various lengths. Also notice that a straightforward application of the binary search on a sorted list of prefixes may not work, because the binary search may direct the search to places that are far from the LPM corresponding to the target IP. To solve this problem, we will insert additional prefixes, called auxiliary prefixes, at some points in the sorted list. These points are the eventual places toward which the binary search moves. These auxiliary prefixes inherit the routing information of the original LPM. Although the search may move away from the original LPM, the match will be found on one of these auxiliary prefixes.

Now, we introduce our definition for comparing two prefixes in order to sort a list of prefixes of various lengths based on the ternary format.

Definition 1 (Prefix comparison). The inequality 0 < * < 1 is used to compare two prefixes in the ternary format.
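As a minimal sketch of Definition 1 (the helper names below are ours, not the paper's), a prefix written as a bit string can be padded with don't-care digits to the full address width and compared digit by digit under the order 0 < * < 1:

```python
# Sketch of Definition 1: compare prefixes in ternary format using the
# digit order 0 < * < 1. A prefix is a bit string such as "01", meaning
# 01* in the paper's notation. Helper names are ours, not the paper's.
ORDER = {"0": 0, "*": 1, "1": 2}

def to_ternary(prefix, n):
    """Extend a prefix to n ternary digits, e.g. '01' -> '01******' for n = 8."""
    return prefix + "*" * (n - len(prefix))

def prefix_key(prefix, n=8):
    """Sort key realizing the 0 < * < 1 ordering of Definition 1."""
    return [ORDER[t] for t in to_ternary(prefix, n)]

# Sorting a few prefixes of different lengths in the 8-bit address space:
prefixes = ["01", "0100", "01011", "1"]
prefixes.sort(key=prefix_key)
```

Under this ordering, 0100* and 01011* sort before the shorter 01* (the don't-care digit is larger than 0 but smaller than 1), which is consistent with the later observation in Section 4 that 01* is larger than 01010*.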

Fig. 1 shows a list of 15 sorted prefixes in the 8-bit address space by the above prefix comparison rule. If the same table is illustrated in the form of a binary trie, we can see that the relative positions of these sorted prefixes actually agree with the projected positions of the prefixes on the horizontal line. Let us study an example to see why performing a binary search on the list of sorted prefixes may encounter a failure. Consider the binary search operation for address Dst = 01011000 in Fig. 1. The first prefix to be compared is the middle one, which is B = 01*. Although B matches Dst, we cannot ensure that B is the LPM. Prefix B is temporarily stored, and the search continues. Since Dst is smaller than B, the search will continue on the list of G, E, H, L, K, O, F. Now, the middle one, L = 01001100, is compared with Dst. Since Dst is larger than L, the list of K, O, F is searched. The search ends after prefixes O and K are checked. Since prefix K does not match the address Dst, only prefix B appears as the final result. Obviously, this result is incorrect since the LPM should be F = 01011*. Moreover, prefix F did not get any chance to be examined in the process of the binary search.

[Fig. 1. List of sorted prefixes based on the rule of 0 < * < 1.]

By carefully analyzing the binary search on the sorted prefixes, we found that the main cause of the failure comes from the enclosure property between prefixes of different lengths [23,9]. In the above example, prefix O is enclosed by F. The binary search failure for address 01011000 is caused by the absence of a prefix that covers 01011000 and is also smaller than prefix O.

We did not try to revise the binary search operation in such a way that the search will end up locating F as the final match. What we did was to generate some auxiliary prefixes that inherit the routing information of the original LPM (e.g., F) and put them where the binary search operations can find them. For example, if we insert an auxiliary prefix 01011000 inheriting F's routing information, then the search operation for address 01011000 will be successful. Therefore, it is feasible to split prefix F into two parts such that both sides of prefix O are covered. A simple solution is to remove all the enclosures by making the binary trie a full tree.¹

The full tree expansion and prefix merges. The full tree expansion splits the enclosure prefixes into many longer ones and disjoints all the resulting prefixes. The sequential list of the sorted prefixes is then obtained by an inorder traversal of the full tree. Many auxiliary prefixes may inherit the same routing information of a common enclosure prefix. Some of these auxiliary prefixes from the same enclosure prefix may also be consecutively located. Therefore, these prefixes can be merged into one without affecting the correctness of the binary search because they inherit the same routing information. The merge operation is defined as follows.

Definition 2 (Prefix merge). The prefix obtained by merging a set of consecutive prefixes is the longest common ancestor of these consecutive prefixes in the binary trie.

[Fig. 2. The full tree expanded from the binary trie for prefixes in Fig. 1.]

Fig. 2 shows the full tree constructed by expanding the binary trie. The gray and black circle nodes represent the original prefixes in Fig. 1. The triangle nodes are the auxiliary prefixes expanded from the gray nodes. The full tree after the merge operations is shown in Fig. 3. Auxiliary prefixes B4 and B5 are merged into B7; B2 and B3 are merged into B6; F1 and F2 are merged back into F; A1 and A2 are merged back into A; and D1 and D2 are merged back into D. Keeping prefix E1 is equivalent to keeping prefix E. We suggest keeping the original prefixes. Therefore, prefixes E, K, B, and I are retained.

¹ We follow the traditional notations used in most data structure and algorithms books, in which a node either has no child or has two children in the full binary tree, as opposed to the complete binary tree.

[Fig. 3. The tree after merge operations.]

The merge operations may reintroduce the enclosure property in the sorted list. Since only the auxiliary prefixes generated from the same original prefix may be merged, this time the enclosure property does not cause any search failure. This is because only the final two prefixes examined in the last stage of the binary search may be the LPM: when the target IP is located between two prefixes, the LPM will be the longer prefix if both prefixes match the target IP.

The following two lemmas illustrate how the merge operations can reduce the number of auxiliary prefixes, and where the longest common ancestor prefix of two disjoint prefixes is located.

Lemma 1. For a prefix that is enclosed by its immediate parent prefix, at most one auxiliary prefix, which is inherited from the parent prefix, may be generated based on the full tree expansion with merging.

Proof. Assume a prefix y covers k consecutive original prefixes x_1 to x_k. If there is a region on the left side of x_1, a region between x_i and x_{i+1} (for i = 1 to k-1), or a region on the right side of x_k that is covered by y, then the full tree expansion must generate many auxiliary prefixes from y. These auxiliary prefixes are consecutive and thus can be merged into their longest common ancestor. At most k+1 such regions exist, and thus at most k+1 auxiliary prefixes may be generated from y. However, one of the auxiliary prefixes is the same as y. As a result, at most k auxiliary prefixes from y are newly generated. We can conclude that one prefix covered by its parent prefix contributes at most one auxiliary prefix based on the full tree expansion with merging. Thus, the lemma follows. □

The following lemma shows the relative positions between two disjoint prefixes and their longest common ancestor.

Lemma 2. The longest common ancestor of two disjoint prefixes must be located between them in the sorted list.

Proof. Let two disjoint prefixes be A = b_{n-1}...b_k a_{k-1}...a_i*...* and C = b_{n-1}...b_k c_{k-1}...c_j*...*. Since they are disjoint, we assume a_{k-1} = 0 and c_{k-1} = 1. The longest common ancestor of these two prefixes must be B = b_{n-1}...b_k*...*. Based on Definition 1, we have A < B < C. Thus, the lemma follows. □

The merge operation involving many prefixes may be time consuming. The following lemma shows an efficient operation in which merging the leftmost and rightmost prefixes is sufficient to find the longest common ancestor of a list of consecutive prefixes.

Lemma 3. The longest common ancestor of a list of consecutive prefixes is the same as that of the leftmost and the rightmost prefixes in the list.

Proof. Consider a list of three consecutive prefixes l, m, and r that represent the prefixes on the left, in the middle, and on the right of the list. We first show by contradiction that the longest common ancestor of prefixes l and m is not disjoint from that of prefixes l and r. Let prefixes lm and lr be the longest common ancestors of prefixes l and m and of prefixes l and r, respectively. Assume prefixes lm and lr are disjoint. Then any prefix enclosed by lm must also be disjoint from any prefix enclosed by prefix lr. This implies that the prefix l is disjoint from itself. Therefore, the assumption is contradictory. We can conclude that one of the prefixes lm and lr is enclosed by the other. If prefix lm is the same as prefix lr, then we are done. Therefore, we only need to consider the case where lm ≠ lr.

Next, we show that prefix lm is enclosed by lr, again by contradiction. Assume that prefix lr is enclosed by prefix lm. Let p be any prefix enclosed by lr. From Definition 1, we know that p < lm or lm < p. Consider the case where lm < p. We can let p = l because l is one of the prefixes enclosed by lr. Therefore, we have m < lm < l. In short, m < l contradicts the original assumption.
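A sketch of the merge operation of Definition 2 (helper names are ours): since prefixes correspond to paths in the binary trie, the longest common ancestor of two prefixes is simply their longest common leading bit string, and by Lemma 3 a whole run of consecutive prefixes can be merged by looking only at its leftmost and rightmost members:

```python
def common_ancestor(p, q):
    """Longest common ancestor of two prefixes in the binary trie:
    their longest common leading bit string."""
    i = 0
    while i < min(len(p), len(q)) and p[i] == q[i]:
        i += 1
    return p[:i]

def merge(run):
    """Merge a run of consecutive auxiliary prefixes (Definition 2).
    Per Lemma 3, only the leftmost and rightmost prefixes matter."""
    return common_ancestor(run[0], run[-1])

# Merging the consecutive run 0100*, 01010*, 01011* gives the common
# ancestor 010*, regardless of how many prefixes lie in between.
```

This is why the overall merge pass stays linear in the number of prefixes: each run costs one ancestor computation of at most n digit comparisons, matching the O(n · N) bound stated above.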

Now, assume p < lm. We can let p = r and get r < lm because r is one of the prefixes enclosed by lr. Since l < lr < r, we have the condition l < lr < r < lm < m, which in turn contradicts the condition r < m. Therefore, the assumption that prefix lr is enclosed by prefix lm is also contradictory.

We now need to prove that the longest common ancestor of prefixes m and r must be enclosed by that of prefixes l and r. The proof is similar to that of the previous case. Thus, the lemma is proved. □

Obviously, the complexity of a merge operation is O(n) in the n-bit address space. The overall merge operation can be done by scanning the prefixes one by one from the smallest prefix in order to group consecutive prefixes containing the same route information. Thus, the overall complexity is O(n · N). Now, we prove that the number of auxiliary prefixes generated by the proposed method is less than N for a routing table of N original prefixes.

Theorem 1. The total number of prefixes in the proposed sorted prefixes is less than 2N, where N is the original number of prefixes in the routing table.

Proof. In the set of N prefixes, there are at most N-1 prefixes that can be covered by a parent prefix. Therefore, based on Lemma 1, at most N-1 auxiliary prefixes can be generated. In this case, the theorem is proved. □

Building and updating the sequential list. Now, we formally implement the building process of the proposed binary prefix search. Instead of performing the inorder traversal in the binary trie and then merging, we speed up the building process by performing the merging process and the inorder traversal in the binary trie at the same time. Fig. 4 shows the pseudo-code of the implementation. Lines 1–8 and 25–30 demonstrate the merging operations before and after processing a subtrie, respectively.

// List stores the resulting sequential list of prefixes built by the full tree expansion with merges.
// EncStack denotes the enclosure stack, which stores the prefixes that cover the prefix
// currently processed; APQueue denotes the auxiliary prefix queue.
inorder_traversal_build_list(TrieNode node)
Begin
01  If node is Original_Prefix Then
02    If APQueue is not empty Then
03      Merge the auxiliary prefixes in APQueue into prefix MP
04      If the parent of MP is not equal to the previously added prefix Then
05        Add MP in List and reset APQueue
06    If the node is a leaf trie node Then
07      Add the node's prefix in List and reset APQueue
08    Push node's prefix in EncStack
09  If the node's left pointer is not null Then
10    Recursively call inorder_traversal_build_list using node's left pointer
11  Else { /* generate an auxiliary prefix */
12    If EncStack is not empty and node is not Original_Prefix Then
13      Create the node's left child as the auxiliary prefix and store it in APQueue }
14  If node is Original_Prefix Then
15    If APQueue is not empty Then
16      Merge the auxiliary prefixes in APQueue into prefix MP
17      If the parent of MP is not equal to the node's prefix Then
18        Add MP in List and reset APQueue
19    Add the node's prefix in List
20  If node's right pointer is not null Then
21    Recursively call inorder_traversal_build_list using node's right pointer
22  Else { /* generate an auxiliary prefix */
23    If EncStack is not empty and node is not Original_Prefix Then
24      Create the node's right child as the auxiliary prefix and store it in APQueue }
25  If node is Original_Prefix Then
26    Pop an entry from EncStack
27    If APQueue is not empty Then
28      Merge the auxiliary prefixes in APQueue into prefix MP
29      Add MP in List and reset APQueue
End

Fig. 4. Building the sorted list for the proposed binary prefix search.

Recursive calls into the left and right subtries and the creation of auxiliary prefixes are performed in lines 9–13 and 20–24. It is possible that the auxiliary prefix after merging inherits the same routing information as its left or right neighboring prefix in the list. In this case, this merged prefix is not inserted into the sorted list, as shown in lines 2–5 and 15–18.

When inserting or deleting a prefix, one might hope that it is possible to develop an update algorithm of worst-case complexity O(logN), as in the balanced tree-based schemes proposed in [13,14,19]. However, such an update algorithm is not possible without the pointer implementation, which violates our intention of using an array implementation in order to obtain a very fast search process. Therefore, there does not appear to be any update technique that uses an array implementation and is faster than simply rebuilding the sorted prefixes from scratch. In order to reduce the impact of the update process on the overall router performance, we propose to use a 16-bit segmentation table.

Searching the sequential list. Since each prefix in the sorted list is not a single address but a range of addresses, the binary search applied on the sorted prefixes must be modified. The formal implementation of the proposed binary prefix search (BPS) is shown in Fig. 5. The lookup process starts by calling Proposed_BPS(List, 0, max-1, target_ip). For example, in Fig. 3, when the target IP is 01011000, the final two prefixes are B6 and F3 after four successive probes on prefixes F, L, F3, and B6. After matching B6 and F3, prefix F3 is determined to be the final LPM because F3 is longer than B6.

4. The prefix representation

We have proposed a novel scheme to sort the prefixes on which a binary search can be applied to find the LPM. Recall that the two most important operations in IP lookups are comparing two prefixes of various lengths and matching a prefix with a 32-bit IP address. We have previously defined the comparison of two prefixes in Definition 1. What we mean by match is to determine if a target IP belongs to the subnetwork represented by the prefix.

The comparison and match operations involve ternary operations because 0, 1, and * (the don't-care bit) are checked. Using the binary ALU instructions provided by current processors, there is no efficient way to perform the ternary operations. In this section, we first propose a 33-bit binary representation to encode the prefixes of any length in the context of IPv4. The proposed representation can be straightforwardly extended to IPv6. Based on the 33-bit representation, the operations of comparing two prefixes and matching a prefix with the 32-bit target IP can be performed easily using the 32-bit wide instructions of current processors. However, the efficiency of the comparison and matching operations is brought down by the fact that a prefix encoded by the 33-bit representation still needs two 32-bit storage words. Therefore, we propose an optimized 32-bit representation to remedy this drawback.

There are two commonly used binary representations for the prefixes of different lengths in IPv4, namely, the mask format and the length format. Compared to the mask format, the length format

// List is an array of sorted prefixes after full tree expansion and merge.
// L and R are the left and right indices of the List array, and IP is the target IP address.
// ⊇ is the enclosure operator.
Proposed_BPS(Prefix List[], Index L, Index R, Address IP)
Begin
01  If (L+1 = R) Then
02    If (length(List[L]) ≥ length(List[R])) Then
03      If (List[L].addr ⊇ IP) Then Output port corresponding to prefix List[L];
04      If (List[R].addr ⊇ IP) Then Output port corresponding to prefix List[R];
05    Else
06      If (List[R].addr ⊇ IP) Then Output port corresponding to prefix List[R];
07      If (List[L].addr ⊇ IP) Then Output port corresponding to prefix List[L];
08    Output default port;
09  M = floor((L+R)/2);
10  If (List[M].addr = IP) Then Output port corresponding to prefix List[M];
11  Else If (List[M] < IP) Then recursively call Proposed_BPS(List, M, R, IP);
12  Else recursively call Proposed_BPS(List, L, M, IP);
End

Fig. 5. Algorithm for the proposed binary prefix search on a list with the enclosure property.

has the advantage that only 5 bits are required for the length in IPv4 instead of a 32-bit netmask. Matching can be done easily by the following steps. The IP part of the prefix is first XORed with the target IP address. Then the result is ANDed with the netmask (or right shifted 32 - length bits) if the mask (or length) format is used. If the final result is zero, then a match is found.

However, no matter which format is used, comparing two prefixes is not easy. Consider two prefixes m1.m2.m3.m4/mask1 and n1.n2.n3.n4/mask2 with lengths len1 and len2, respectively. Assuming mask1 ≤ mask2, we perform two AND operations, m1.m2.m3.m4 & mask1 and n1.n2.n3.n4 & mask1. If these two ANDed results allow us to determine which of the two prefixes is larger, then we are done. The difficulty arises in the situation in which m1.m2.m3.m4 & mask1 equals n1.n2.n3.n4 & mask1. We must then perform an additional check to see if the (31-len1)th bit of n1.n2.n3.n4 is 0 or 1. If the (31-len1)th bit of n1.n2.n3.n4 is 0, then m1.m2.m3.m4/mask1 is larger than n1.n2.n3.n4/mask2. Otherwise, m1.m2.m3.m4/mask1 is smaller than n1.n2.n3.n4/mask2. All these operations are complex to implement. We will introduce our proposed representation and demonstrate the efficient implementation of prefix comparisons and matching using 32-bit CPU instructions as follows.

The 33-bit prefix representation. A simple 4-bit case is studied first. There are 31 different prefixes of various lengths in a 4-bit address space. Using a binary representation to identify 31 different items, we need a 5-bit address space. There are many ways to map these 31 prefixes to the 5-bit address space. We select a mapping that is consistent with Definition 1. This mapping preserves the relative positions of the prefixes in the binary trie projected on the horizontal line. The address 00000 is unused. The binary representation for prefixes of various lengths can be easily extended to the IPv4 32-bit or IPv6 128-bit address space. We formally define the binary representation of a prefix in an n-bit address space as follows.

Definition 3 (Definition of the (n+1)-bit prefix representation in the n-bit address space). For a prefix of length i, b_{n-1}...b_{n-i}*, where b_j = 0 or 1 for n-1 ≥ j ≥ n-i, its binary representation is b_{n-1}...b_{n-i}10...0 with n-i trailing zeros.

Consider two prefixes 01* and 01010* in an 8-bit address space. From Definition 1, we have 01* > 01010*. Based on Definition 3, we represent 01* and 01010* as 011000000 and 010101000, respectively. Thus, it is easy to use a 9-bit binary comparison operation to determine that 01* is larger than 01010*. However, for the 32-bit address space, a 33-bit binary comparison operation is required. This means that two 32-bit binary comparison operations may be needed in the worst case, since only 32-bit arithmetic and logic operations are available in current 32-bit processors. It also means that two 32-bit memory reads are needed if the size of the registers is 32 bits. In addition, two 32-bit words are needed to store a prefix in the 33-bit representation. To solve these problems, an optimized 32-bit representation is proposed.

The 32-bit prefix representation. Based on the 33-bit prefix representation, 33 bit-by-bit comparisons or two 32-bit comparisons can fully determine the order of the prefixes. If there is no prefix of length 32, we can use only 32 bits by ignoring the least significant bit, which must always be zero. Notice that the smallest subnet in classless inter-domain routing (CIDR) is at least 2 bits long, containing one network address, one broadcast address, and two usable IP addresses. Thus, assuming there are no prefixes of lengths 31 and 32 is reasonable. Notice that the routing cache design using CPU caches [4] makes the same assumption. However, by investigating the routing tables of current routers available on the Internet, we find a small number of prefixes whose lengths are 31 or 32. The prefixes of length 32 may come from direct connections between the router and the computers or devices. The prefixes of length 31 may be configured for testing purposes or result from configuration errors in one router, and are inadvertently passed into the inter-domain routers [8]. Therefore, unless the prefixes of lengths 31 and 32 are filtered out, distinguishing two prefixes using only 32 bits is the main problem to be solved.

The ambiguity caused by removing the least significant bit from the 33-bit representation is solved by a special layout design in the sorted list of prefixes. For simplicity, an 8-bit address space is used to illustrate the idea. Consider two prefixes P1 = 01001100/port1 and P2 = 01001*/port2. The 9-bit prefix representations of P1 and P2 are 010011001 and 010011000, respectively. The first 8 bits cannot distinguish P1 from P2. In general, when two prefixes have the same first 8 bits, one of them must be of length 8, and the other may be of any length except 8. We call one of these two prefixes the buddy prefix of the other. Any prefix may have its buddy prefix coexisting in the sorted list. In order to make the binary search on the sorted list of 8-bit prefixes correct, every time a prefix is matched against the target IP, we would need a further check of whether its buddy prefix also exists on its left or right side. This additional check significantly slows down the search process.

We solve this problem by means of the following rule: only the prefix of length n-1 is allowed to have a buddy prefix of length n coexisting in the n-bit address space. This rule is achieved by the following conversion technique. Based on this technique, we only need to check whether a buddy prefix coexists when a prefix of length n or n-1 is examined. This conversion technique may generate at most the same number of additional prefixes as the number of original prefixes of length n.

Definition 4 (Definition of prefix conversion in the n-bit address space). (a) If a prefix of length n, b_{n-1}...b_1 0/n/p1, exists, it is converted to b_{n-1}...b_1 0/n-1/p1, and prefix b_{n-1}...b_1 1/n/p2 is created if it does not exist in the list, where p2 is the port of the prefix that covers b_{n-1}b_{n-2}...b_1 1. (b) If prefix b_{n-1}...b_1 1/n/p1 exists in the list, then b_{n-1}...b_1 0/n-1/p2 is created if it does not exist in the list, where p2 is the port of the prefix that covers b_{n-1}b_{n-2}...b_1 0. (c) If only b_{n-1}...b_1 0/n-1/p1 exists in the list, do nothing.

Fig. 6 shows how the prefix conversion operations work. Based on the above definition, prefix b_{n-1}...b_1 0/n/p1 is not allowed to occur in the list since its buddy prefix with a length shorter than n-1 may also exist. Fig. 6(a) shows that when A = b_{n-1}...b_1 0/n/p1 exists, it is first converted to b_{n-1}...b_1 0/n-1/p1. If prefix b_{n-1}...b_1 0/n-1/p2 already exists, it will be converted to B = b_{n-1}...b_1 1/n/p2. Otherwise, prefix B = b_{n-1}...b_1 1/n/p2 will be created, where p2 is the port number of the longest prefix that covers B. Fig. 6(b) shows that when A = b_{n-1}...b_1 1/n/p1 exists, prefix B = b_{n-1}...b_1 0/n-1/p2 is created when B does not exist, and p2 is the port number of the longest prefix that covers B. Fig. 6(c) shows that if only A = b_{n-1}...b_1 0/n-1/p1 exists, no conversion is needed.

If both b_{n-1}...b_1 0/n-1/p1 and b_{n-1}...b_1 1/n/p2 exist after conversion, the latter is stored as 0...0/p2 if the 32-bit format is used. When 0...0/p2 is found in the search, we know that a prefix and its buddy prefix must coexist in the list. However, when the search encounters b_{n-1}...b_1 1/p1, which is of length n-1, an additional operation must be performed to check whether its buddy prefix (stored as 0...0/p2) coexists in the list. If only b_{n-1}...b_1 0/n-1/p1 exists, it must be stored as b_{n-1}...b_1 1/p1 instead of 0...0/p1. Fig. 7 illustrates the full tree after the conversions from Fig. 3. For example, prefixes L and K are converted to L0 and K1. Thus, when we perform a comparison with K1, which is stored as 0...0, we only know that K1's length is 8 but not its value. The value of K1 is known only after examining L0.

Now we consider how to match a prefix encoded in the 32-bit representation against an IP using the 32-bit instructions of current processors. The matching process can be done by the following steps. First, the position of the least significant set bit in the 32-bit prefix must be computed to determine the length of the prefix. If the least significant

Fig. 7. The list of sorted prefixes after the prefix conversions.
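Definition 3 can be sketched in a few lines of Python; the helper name `encode` is ours, not the paper's, and strings of bits are used purely for illustration:

```python
def encode(prefix_bits: str, n: int) -> int:
    """(n+1)-bit representation of Definition 3: the prefix bits,
    then a single 1, then trailing zeros, n+1 bits in total."""
    assert len(prefix_bits) <= n
    return int(prefix_bits + "1" + "0" * (n - len(prefix_bits)), 2)

# In the 8-bit address space, 01* -> 011000000 and 01010* -> 010101000,
# so a plain integer comparison reproduces 01* > 01010* from Definition 1.
```

With this encoding, sorting prefixes of mixed lengths reduces to sorting plain integers, which is what makes the sequential-array layout possible.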

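The conversion rules of Definition 4 can be sketched as follows. The dictionary table layout and the `covering_port` helper are simplified assumptions for illustration, and the collision case of Fig. 6(a), where the length n-1 entry already exists, is omitted:

```python
def convert(prefixes, covering_port, n=8):
    """Sketch of Definition 4 on a table {(bits, length): port}.
    covering_port(bits) stands in for looking up the port of the
    longest prefix covering bits in the real table."""
    out = dict(prefixes)
    for (bits, length), p1 in prefixes.items():
        if length != n:
            continue  # rule (c): shorter prefixes need no conversion
        if bits[-1] == "0":                      # rule (a)
            del out[(bits, n)]
            out[(bits, n - 1)] = p1              # re-labelled with length n-1
            buddy = bits[:-1] + "1"
            out.setdefault((buddy, n), covering_port(buddy))
        else:                                    # rule (b)
            buddy = bits[:-1] + "0"
            out.setdefault((buddy, n - 1), covering_port(buddy))
    return out
```

After conversion, only a prefix of length n-1 can have a length-n buddy, which is the invariant the search relies on.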

Fig. 6. Illustrations of the different cases in the prefix conversion rule.

set bit is at bit 0, then the length of the prefix is n-1, and thus we need to check whether its buddy prefix of length n also exists. The computation time for this operation seems to depend on the position of the least significant set bit. Fortunately, there exists an instruction called bit scan forward (BSF) in the Intel processor family [10], available since the Intel 80386, which performs the required task. Second, let i be the position of the bit computed by BSF on the 32-bit prefix. Thus, we compute (P XOR IP) >> (i + 1), where ">>" denotes the right shift operation. If the result is zero, then prefix P matches IP.

The lookup process using the 32-bit representation is similar to the algorithm shown in Fig. 5. The only extra step required is that when the prefix P with address zero or a prefix of length n-1 is examined, the existence of its buddy prefix must be checked. Since the number of prefixes of length 32 is small and is equal to the number of prefixes with address zero in the list, the impact on the lookup performance is negligible.

5. m-Way search tree using cache lines

Modern general-purpose CPUs support various kinds of caches for speeding up data processing. Most modern CPUs have a cache line size of 32, 64, or 128 bytes. The primary idea of using cache lines for searching is to reduce the number of searches in memory by performing the searches in cache lines. Therefore, putting as much search information as possible in a cache line is the key factor in designing efficient m-way search tree algorithms.

We use a 16-bit segmentation table because the remaining 16 bits of the prefixes are byte aligned and can be efficiently fit into cache lines of 32 or 64 bytes. Also, the update process then only causes one segment to be reconstructed. The basic element in the sequential list takes 3 bytes to store the remaining 16 bits of a prefix and its associated 8-bit port number, which is denoted as a 2-tuple (prefix, port). We develop two basic encoding schemes to encode the prefixes or their corresponding port numbers in a memory block. The first scheme encodes ten 3-byte 2-tuples in a 32-byte cache block, where 2 bytes are left unused. The second scheme encodes sixteen 16-bit prefixes in a 32-byte cache block. Based on these two encoding schemes, different hierarchical structures are proposed as the m-way prefix search trees.

The hierarchical structure of prefixes is designed in a block-by-block fashion, where the block size is the same as that of a cache line. The base of the block array is assumed to be known globally. Thus, instead of global addresses, the indices of the memory blocks are used to access the prefixes stored in the list. We use binary search inside the cache line. The entry in the 16-bit segmentation table is of size 32 bits, and it contains a Format field, a Number field recording the number of prefixes in the segment, an Index field or prefix field, and a Port field. The data structures of the different entry formats in the 16-bit segmentation table will be described in detail when necessary.

We first consider two special cases, when there is zero or one prefix in the segment. If there is no prefix of length longer than 16 in the segment, the lookup operation should return the port number belonging to that segment or the global default port number. All the fields in the corresponding entry of the segmentation table are set to zero except for the Port field. If there is only one prefix of length longer than 16 in the segment, the proposed 16-bit representation of the prefix and the corresponding port number are stored in the entry of the 16-bit segmentation table. These two cases are categorized as format 0 segments.

If there are k prefixes, 2 ≤ k ≤ 10, in a segment, then these k prefixes can be stored in a memory block of size 32 bytes. The data structures of the various segments are illustrated in Fig. 8. The Format field (3 bits) stores the format number, which is 1. The Number field (5 bits) stores the value k. The Index field (16 bits) stores the index of the memory block for this segment. The Port field (8 bits) indicates the local default port in this segment. Storing the default port number avoids generating too many auxiliary prefixes. We can see that two memory accesses are required to obtain the output port number: one for accessing the segmentation table and another for accessing the memory block.

If there are more than 10 prefixes in the segment, then more than two memory accesses are required. For a segment containing 11–32 prefixes, as shown in Fig. 8(b), exactly three blocks are needed. The binary search inside these three blocks takes at most two memory accesses. Overall, the worst-case search time is three memory accesses. The data structure of the format 2 entry is the same as that of format 1. The binary search on the prefixes stored in the memory blocks can be applied directly. If there are k prefixes, 33 ≤ k ≤ 120, in a segment (format 3), a 2-level 11-way search tree is constructed as shown in Fig. 8(c). This 2-level 11-way search tree is physically laid out as a sequential list. The first
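The 32-bit match test described in Section 4 (find the least significant set bit, as BSF does, then XOR and shift) can be sketched in Python; `matches` is an illustrative name, and the all-zero entry that encodes a stored buddy is handled by the separate buddy check, not here:

```python
def matches(p: int, ip: int) -> bool:
    """True if the 32-bit-encoded prefix p covers address ip.
    (p & -p).bit_length() - 1 finds the least significant set bit,
    the job done by the x86 BSF instruction in the paper."""
    i = (p & -p).bit_length() - 1   # a prefix of length n-1 has i == 0
    return ((p ^ ip) >> (i + 1)) == 0

# 8-bit illustration: 01* is stored as 01100000 and covers 01011000,
# while 01010* (stored as 01010100) does not.
```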

Fig. 8 (panels (a)–(d)): (a) format 1: structure of the segment containing 2 to 10 prefixes; (b) format 2: structure of the segment containing 11 to 32 prefixes; (c) format 3: a 2-level 11-way prefix segment containing 33 to 120 prefixes; (d) format 4: a 2-level 17-way prefix segment containing 121 to 186 prefixes. The 32-bit segmentation-table entry in formats 1 and 2 holds the Format (3 bits), Number (5 bits), Index (16 bits), and Port (8 bits) fields; in formats 3 and 4 the Port field is dropped and the default port number is moved into the unused block areas.
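The degrees and capacities quoted for the segment formats follow directly from the cache line size (CLS); a small illustrative helper, with names of our own choosing:

```python
def tree_degrees(cls_bytes: int):
    """Maximum m-way tree degrees for a given cache line size:
    k1 + 1 when 3-byte (prefix, port) tuples fill a block,
    k2 + 1 when only 2-byte prefixes fill a block."""
    k1 = cls_bytes // 3
    k2 = cls_bytes // 2
    return k1 + 1, k2 + 1

# Two-level capacities quoted for 32-byte lines:
# format 3 holds up to (k1+1)**2 - 1 = 120 prefixes and
# format 4 up to (k2+1)*(k1+1) - 1 = 186 prefixes.
```

For example, `tree_degrees(32)` gives (11, 17), matching the 11-way and 17-way trees of formats 3 and 4, and 64- and 128-byte lines give (22, 33) and (43, 65).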

Fig. 8. Structures of the segments.

block in the sequential list is the root block, which is duplicated in the figure for illustration purposes. The root block stores the prefixes whose indices are multiples of 11. Since the 32-bit entry of the segmentation table is limited, the default port number is moved to the unused 2-byte area.

To pack more prefixes in a block, another hierarchical structure (format 4), based on the second encoding scheme, is proposed, as shown in Fig. 8(d). At most 186 keys can be stored in a two-level 17-way search tree. The root block encodes 16 keys (16-bit representations of prefixes) without the associated port numbers. The port numbers corresponding to the prefixes in the root block are stored in the unused 2-byte areas in the second level. In addition, we also distribute the default port number over the unused areas of the second-level blocks. In Fig. 8(d), the entries denoted by indices 11P,...,176P are the originally unused 2-byte areas that now store the port numbers of the 11th,...,176th prefixes in the list, respectively, and the default segment port. Since the default port is not stored in the root block, searches in a 2-level 17-way search tree always take three memory accesses.

The hierarchical structures based on the proposed encoding schemes can be extended to more than two levels. As we know, the cache line size can be 32, 64, or 128 bytes, and the results are shown in Table 1. We briefly describe formats 5, 6, and 7 using a block size of 32 bytes as follows. In format 5, the blocks in all the levels contain 10 prefixes and their associated port numbers. In format 6, only the root block records 16 prefixes; the associated port numbers are stored in the second-level blocks. In format 7, the root block and the second-level blocks contain 16 prefixes without the associated port numbers, which are distributed over the third-level blocks accordingly.

Default port of a segment. We have shown that the delete and insert processes have a worst-case time complexity of O(N) for a list of N prefixes. Using the 16-bit segmentation table allows us to reduce the worst-case update complexity to O(Nseg), where Nseg, much less than N, is the maximum number of prefixes among all 65,536 segments. The prefixes of length less than or equal to 16 are not used to generate auxiliary prefixes. The total number of auxiliary prefixes is therefore reduced, and the search and update performance is improved.

Consider an example in which there are 11 original prefixes in one segment. Assume that one prefix encloses the other 10 disjoint prefixes. If we use the method proposed in Section 3 to build the sequential list, at most 10 auxiliary prefixes may be generated, and the total number of prefixes will be 21. Thus, the worst-case number of memory accesses will be three (one to the segmentation table and two to the list), assuming format 2. If we make the port of the enclosure prefix the default port, no auxiliary prefix is generated. Consider the 32-byte block and format 1. The port and the corresponding enclosure prefix can be stored in the Port field of the segmentation table entry and the unused 2-byte area of the root block. The worst-case number of memory accesses will then be two (one to the segmentation table and one to the list). This technique can be applied to the other formats.

For 32-byte cache lines, the maximum degree of the tree in the proposed multiway schemes is 11 or 17, depending on whether or not the port numbers are stored along with the prefixes in the memory blocks. The maximum degree of the tree becomes 22 or 33 and 43 or 65 for cache lines of 64 and 128 bytes, respectively. This is a big improvement over the multiway range search and the multiway range tree. Notice that the maximum degree of the multiway range tree [22] is smaller than that of the multiway range search because each cache line needs to record the heads of the equal list and the span list. Please refer to the definitions of the equal and span lists in [22]. Therefore, the multiway range tree is deeper than the tree based on the multiway

Table 1
The numbers of prefixes in different segment formats

Format  Tree level  n = Number of entries in the segment   n (CLS = 32)  n (CLS = 64)   Block index/16-bit prefix  Port
0       0           0                                      0             0              0                          Default port #
0       0           1                                      1             1              Prefix                     Port #
1       1           2 to k1                                2-10          2-21           Index                      Default port #
2       2           k1+1 to CLS                            11-32         22-64          Index                      Default port #
3       2           CLS+1 to (k1+1)^2 - 1                  33-120        65-483         Index                      n/a
4       2           (k1+1)^2 to (k2+1)(k1+1) - 1           121-186       484-725        Index                      n/a
5       3           (k2+1)(k1+1) to (k1+1)^3 - 1           187-1330      726-10,647     Index                      n/a
6       3           (k1+1)^3 to (k2+1)(k1+1)^2 - 1         1331-2057     10,648-15,971  Index                      n/a

CLS = cache line size in bytes, k1 = floor(CLS/3), k2 = floor(CLS/2); n/a = not applicable.

range search. Although the child-sibling linear array is used in [22] to reduce the overhead of storing child pointers, the memory usage must still be increased.

6. Performance evaluation

In this section, we first use a real large routing table to analyze the impact of the auxiliary prefixes generated by the proposed scheme, and then compare the result with the binary range search [12]. Second, we conduct trace-driven simulations to evaluate the lookup times and the sizes of the memory required for the proposed schemes, and compare them with other existing lookup schemes.

Our experiments are performed on a Redhat platform with a 2.4-GHz Pentium IV processor containing 8 KB L1 and 256 KB L2 caches with 64-byte cache lines. Since it is almost impossible to obtain the actual IP traffic being routed through the router at the time when the routing table is logged, we use simulated IP traffic, which is described as follows. Assume the routing entries in the routing table follow the length format. The simulated traffic is constructed by first removing the length and port of the routing entries, and then performing a random permutation. The simulated traffic assumes that every prefix in the routing table has the same probability of being accessed. The same method is also used in [16,17]. The detailed simulation designs can be found in [3].

We have proved in Theorem 1 that the number of prefixes in the sorted list after the merging operations is less than 2N, where N is the number of original prefixes in the routing table. Therefore, the worst-case complexities of the proposed binary search scheme in lookup time, update time, and memory usage are similar to those of the range search schemes [12]. However, there are subtle differences in memory requirements and lookup times between the proposed prefix searches and the range searches, which are described as follows. Consider a routing table of N prefixes. The binary range search in fact needs a sequential array of almost 2N elements, each of which uses 6 bytes to store the 32-bit endpoint address and two 1-byte port numbers (the ">" and "=" ports). For the Oix-120k table containing N = 120,629 original prefixes, the binary range search requires 232,887 endpoint addresses, as shown in Table 2. Without the 16-bit segmentation table, the full tree scheme and the scheme with the merging operations need around 1.65N and 1.2N prefixes, respectively. Notice that each prefix only needs 5 bytes to store the 32-bit prefix and the 1-byte port number. Thus, from Table 2, the total amounts of memory required for the binary range search, the proposed full tree scheme, and the proposed scheme with merging operations can be computed as 1365 KB, 975 KB, and 712 KB, respectively. The number of auxiliary prefixes decreases from 89,225 to 35,211 when the merge operations are performed. This accounts for a 270 KB memory reduction.

The last row of Table 2 shows the total numbers of prefixes when a 16-bit segmentation table is employed. The total numbers of prefixes for the full tree scheme and for the scheme with merging are 133,160 and 117,968, respectively. The latter is even smaller than the number of original prefixes. The main reason is that the original prefixes of length less than or equal to 16 are embedded into the segmentation table. However, the 16-bit segmentation table requires an additional memory of 256 KB.

Table 3 shows the worst-case numbers of memory accesses, the search time, the update time, and the amount of memory required for various schemes using the Oix-120k table. Unless otherwise stated, a 16-bit segmentation table, denoted by the suffix "16", is assumed. The proposed prefix searches are generally better than most of the existing schemes. The memory reduction is mostly attributed to the 32-bit prefix representation. Specifically, the proposed prefix searches outperform the range searches both in memory consumption and in the number of memory accesses for a lookup operation. The multiway prefix search has the same worst-case number of memory accesses as the C-16-x scheme [7]. However, the C-16-x scheme consumes more memory. The amounts of memory required for the proposed binary prefix searches with or without a 16-bit segmentation table

Table 2
The numbers of endpoints and prefixes needed in the range search and the proposed prefix search for the Oix-120k routing table

16-bit segmentation table  Original  Range search     Prefix search               Prefix search with merge
                                     Total endpoints  Total prefixes  Auxiliary   Total prefixes  Auxiliary
Without                    120,629   232,887          199,751         89,225      145,737         35,211
With                       120,629   217,146          133,160         28,293      117,968         13,101
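The memory totals quoted in the text can be reproduced from the counts in Table 2, assuming 6 bytes per range-search element (a 32-bit endpoint plus two 1-byte ports) and 5 bytes per prefix (a 32-bit prefix plus a 1-byte port):

```python
# Without the 16-bit segmentation table:
range_kb = 232_887 * 6 / 1024   # binary range search
full_kb = 199_751 * 5 / 1024    # proposed full tree scheme
merge_kb = 145_737 * 5 / 1024   # proposed scheme with merging

print(round(range_kb), round(full_kb), round(merge_kb))  # prints 1365 975 712
```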

Table 3
Performance for the routing table of 120,629 prefixes (Oix-120k), where the proposed multiway and binary prefix searches are denoted by MPS and BPS, and the suffix "16" denotes the usage of a 16-bit segmentation table

Scheme          Worst case # of memory references  Search time (Ts) in μs  Update time (Tu) in μs  Memory (Mem) in KB
BTrie-16        17                                 0.47                    0.84                    2447
C-16-x [7]      3                                  0.17                    9.89                    1147
BS-Length [25]  5                                  0.35                    210                     2315
SFT [5]         12                                 0.21                    1454                    650
BRS-16 [12]     4                                  0.28                    9.56                    1104
MRS-16 [12]     4                                  0.23                    29.3                    4695
BPS-16          4                                  0.16                    6.05                    601
MPS-16          3                                  0.17                    17.82                   954
PBOB-16 [13]    10                                 0.34                    0.97                    3550
PST-16 [14]     10                                 0.44                    1.58                    7932
CRBT-16 [19]    10                                 0.65                    4.20                    19,535

are even smaller than that of the small forwarding table scheme [5]. Notice that if the original prefixes are stored in a linear array, a linear search can be applied. Obviously, the search and update speeds will then be very slow. However, this arrangement has the lowest memory requirement. If we assume that each original prefix is stored in the length format (45 bits per prefix), then the size of the memory required for the Oix-120k table is 663 KB.

Since the proposed prefix search and the range search are similar and both use an array implementation, we also obtained the performance results for routing table traces of different sizes to gain a complete understanding of their differences. The results are summarized in Table 4. We can see that the performance improvement of the proposed prefix searches over the range searches in lookup times increases as the size of the routing table increases. For all the simulated routing tables, the binary and multiway prefix searches consume 25–48% and 72–81% less memory than the binary and multiway range searches, respectively. For Funet-40k, the lookup times of the proposed prefix searches are only a little better than their range search counterparts (2–12% less). However, for larger tables, the lookup times of the proposed prefix searches are 26–42% less than their range search counterparts.

We also ran the performance simulations for various IPv6 routing tables, where two of them are real small tables obtained from the Internet [2], and the other three bigger ones are generated synthetically with length distributions similar to those of the first two real tables. The results are shown in Table 5. In terms of lookup times, the proposed binary prefix search performs better than the binary range search when the routing table contains 10,000 entries or more. Moreover, the proposed binary prefix searches consume significantly less memory than the binary range searches.

The main drawback of storing the sorted prefixes in a sequential list is its update process, which requires a reconstruction of the whole list when inserting or deleting a prefix. A feasible remedy is therefore to use the 16-bit segmentation table, with which the update process is reduced to reconstructing the sublist belonging to the segment of the prefix being inserted or deleted. As shown in Table 3, we measure the update time of the binary prefix search and compare it with the other existing schemes. The update times are calculated with the

Table 4
Average lookup times in μs and amount of memory in KB required for the range searches and the proposed prefix searches with a 16-bit segmentation table

Scheme          Funet-40k [16]     Oix-80k [15]       Oix-120k [15]       Oix-160k [15]
                (41,709 prefixes)  (89,088 prefixes)  (120,629 prefixes)  (159,930 prefixes)
Binary range    0.126 (525.5 KB)   0.234 (846.9 KB)   0.267 (1104 KB)     0.304 (1500 KB)
Multiway range  0.148 (1719 KB)    0.219 (3543 KB)    0.237 (4696 KB)     0.262 (6133 KB)
Binary prefix   0.123 (395.5 KB)   0.145 (558.5 KB)   0.159 (601.6 KB)    0.188 (843.3 KB)
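The relative savings quoted in the text follow from Table 4; a quick check for Oix-120k, binary prefix versus binary range (the helper `pct_less` is ours):

```python
def pct_less(new, old):
    """Percentage by which `new` undercuts `old`."""
    return round(100 * (1 - new / old), 1)

# Oix-120k, binary prefix vs binary range search (values from Table 4):
print(pct_less(0.159, 0.267))  # lookup time: 40.4% less
print(pct_less(601.6, 1104))   # memory: 45.5% less
```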

Table 5
Normalized average lookup times in μs and amounts of memory in KB for the binary prefix searches (BPS) over the binary range searches (BRS) using IPv6 tables

                        V6table (274)       Asmap (593)         Random5k (5000)     Random10k (10,000)  Random20k (20,000)
Time ratio (BPS/BRS)    1.12 (0.188/0.168)  1.01 (0.214/0.212)  1.00 (0.236/0.236)  0.95 (0.253/0.266)  0.83 (0.282/0.34)
Memory ratio (BPS/BRS)  0.62 (5.3/8.6)      0.72 (13.68/19.1)   0.49 (86.0/175.5)   0.50 (176.6/350.5)  0.53 (370.1/700.7)

assumption that insertion and deletion have equal probability. We can see that the update speeds of Btrie-16, PBOB-16, and PST-16 are among the best. The update time of the collection of red–black trees (CRBT) is a little worse because many trees need to be modified when a prefix is inserted or deleted. The average update time of the proposed binary prefix search (BPS-16) is better than that of the binary range search (BRS-16) and the C-16-x scheme. The multiway versions of the range and prefix searches do not perform well in terms of update times, especially for the worst-case update times (not shown in the table), which are 50 times slower than those of PBOB-16. Since the operations of inserting and deleting a prefix are very simple, the update times for the binary trie with or without a 16-bit segmentation table are much faster than those of the other schemes.

Integrated performance analysis. Since different lookup schemes have their merits in terms of search speed, update speed, and memory consumption, we try to analyze the performance of the various schemes by considering all three factors together in the performance test. We first propose a simple model to analyze the maximum number of lookups (Ns) that a lookup scheme can sustain in 1 s when there are Nu update packets to be processed in the same 1 s period. Nu can be up to a few hundred to a few thousand in a real routing environment [8,11]. Assume that the search and update operations take Ts and Tu microseconds, respectively. The following equation gives the result:

Ts * Ns + Tu * Nu = 1,000,000.  (1)

To simplify our analysis, we assume Nu = a * Ns. Thus, we have

Ns = 1,000,000 / (Ts + a * Tu).  (2)

To include the memory consumption in the analysis, we treat the maximum number of sustained lookups and the memory size as the performance and the cost of a lookup scheme, respectively. Thus, we use the performance-cost ratio (ratio = Ns/Mem) as the metric to compare all the schemes.

Table 6 shows the maximum number of lookups that can be performed using the different lookup schemes for the Oix-120k table. Other tables of different sizes show similar results, which are not given in this paper. We assume that an ideal scheme can be developed in such a way that it achieves the best search speed and the best update speed among all the existing lookup algorithms. We also assume that the ideal scheme needs the same amount of memory as the original routing table stored in the length format. Therefore, the ideal scheme has the performance of Ts = 0.1 μs and Tu = 0.8 μs, and needs 663 KB of memory, as computed earlier. The parameter a is set to 0.05–0.001. For the ideal case, this amounts to an update rate of 10k to 370k updates per second.

As shown in Table 6, the proposed binary prefix search performs best in terms of the maximum number of sustained lookups and the performance-cost ratio. The only exception is when the update rate is one-twentieth of the lookup rate, which is equivalent to more than 100k updates per second. Such a high update rate is not expected in the near future. In terms of Ns, the C-16-x scheme [7] performs only a little worse than the proposed binary
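Eqs. (1) and (2) and the performance-cost ratio are easy to check numerically; the sketch below reproduces the "Ideal" row of Table 6 (Ts = 0.1 μs, Tu = 0.8 μs, Mem = 663 KB), with function names of our own choosing:

```python
def sustained_lookups(ts_us, tu_us, a):
    """Eq. (2): Ns = 1,000,000 / (Ts + a * Tu), lookups per second,
    when updates arrive at a fraction a of the lookup rate."""
    return 1_000_000 / (ts_us + a * tu_us)

# The 'Ideal' scheme of Table 6 with a = 0.01:
ns = sustained_lookups(0.1, 0.8, 0.01)
print(round(ns), round(ns / 663))  # prints 9259259 13966 (Ns and Ns/Mem)
```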

Table 6
Integrated performance comparisons in terms of the maximum number of sustained lookups and the performance-cost ratio for the Oix-120k table

Scheme     Ns (a = 0.05 / 0.01 / 0.005 / 0.001)             Ratio (a = 0.05 / 0.01 / 0.005 / 0.001)
Ideal      7,142,857 / 9,259,259 / 9,615,385 / 9,920,635    10,173 / 13,966 / 14,503 / 14,963
Btrie-16   1,953,125 / 2,090,301 / 2,108,815 / 2,123,864    798 / 854 / 862 / 868
BRS-16     1,485,884 / 3,219,575 / 3,769,318 / 4,365,668    1,346 / 2,916 / 3,414 / 3,954
BPS-16     2,285,714 / 4,819,277 / 5,594,406 / 6,420,546    3,903 / 8,019 / 9,308 / 10,683
SFT        13,712 / 67,893 / 134,147 / 611,658              21 / 104 / 206 / 941
C-16-x     1,115,449 / 3,341,129 / 4,451,369 / 6,063,178    972 / 2,913 / 3,881 / 5,286
BS-len     92,165 / 408,163 / 714,285 / 1,785,714           40 / 176 / 309 / 771
PBOB-16    2,574,003 / 2,859,594 / 2,899,812 / 2,932,809    725 / 805 / 817 / 826
PST-16     1,933,804 / 2,203,634 / 2,242,751 / 2,275,059    244 / 278 / 283 / 287
CRBT-16    1,160,964 / 1,442,544 / 1,487,646 / 1,525,810    59 / 74 / 76 / 78

However, the C-16-x scheme performs much worse than the proposed binary prefix search when the update rate is high. If we consider the performance-cost ratio, the C-16-x scheme performs consistently worse than the proposed binary prefix scheme. The small forwarding table (SFT) [5] and the collection of red–black trees (CRBT) [19] are two extreme cases whose performance-cost ratios exhibit very low values: SFT has a fast search speed and a low memory requirement, but its update speed is low; in contrast, CRBT has a very fast update speed, but its memory requirement is high and its search speed is only fair.

Limitation of the proposed schemes. With reference to the segmentation table in Fig. 8, our design supports eight different multiway segment formats; therefore, three bits are sufficient to encode the format. Thirteen bits are used to represent the number of prefixes in a segment, so at most 8192 prefixes can be supported in one segment. We also use 16 bits to record the starting block number of the segment, so up to 64K memory blocks are supported. If we assume that the average number of prefixes in a block is five, the supported routing table can have up to 320K prefixes, which is sufficiently larger than the tables of the backbone routers currently employed on the Internet.

It is worth noting that each entry of the 16-bit segmentation table in the multiway range search is 16 bits, and one bit is used to distinguish between information and node pointers. Therefore, the information array can have at most 32K elements. This means that the multiway range search scheme proposed in [12] can support no more than 32K routing entries, which is smaller than the tables of current backbone routers. To remedy this drawback, an entry larger than 16 bits must be used, and the required memory increases accordingly. Therefore, the proposed multiway prefix search is better than the multiway range search in this respect.

7. Conclusions

In this paper, we introduced a new method to compare prefixes of different lengths. By inserting some auxiliary prefixes, a sequential list of prefixes can be constructed and searched using binary search. The proposed scheme uses only 32 bits to encode one prefix. We showed that the maximum number of total prefixes generated is less than 2N, where N is the number of original prefixes. The sequential list can easily be extended to a multiway search structure and enhanced with a 16-bit segmentation table. By considering CPU caches with 32- and 64-byte cache lines, our scheme achieves worst-case numbers of memory accesses of 4 and 3, respectively. The performance analyses and measurements using both IPv4 and IPv6 tables showed that the proposed prefix search schemes perform better than the existing lookup schemes in terms of both lookup latency and memory consumption.

Acknowledgements

We would like to express our sincere thanks to the editors and the reviewers, who gave very insightful and encouraging comments. This work was supported in part by the National Science Council, Republic of China, under Grant NSC-93-2213-E-006-085.

References

[1] A. Basu, G. Narlikar, Fast incremental updates for pipelined forwarding engines, in: Proceedings of IEEE INFOCOM, 2003.
[2] CERNET BGP VIEW Global Internet, http://bgpview.6test.edu.cn/.
[3] Y. Chang, A small IP forwarding table using hashing, IEICE Transactions on Communications E88-B (1) (2005) 239–246.
[4] T. Chiueh, P. Pradhan, High performance IP routing table lookup using CPU caching, in: Proceedings of IEEE INFOCOM, 1999, pp. 1421–1428.
[5] M. Degermark, A. Brodnik, S. Carlsson, S. Pink, Small forwarding tables for fast routing lookups, in: Proceedings of ACM SIGCOMM, 1997, pp. 3–14.
[6] P. Gupta, S. Lin, N. McKeown, Routing lookups in hardware at memory access speeds, in: Proceedings of IEEE INFOCOM, 1998, pp. 1240–1247.
[7] N.F. Huang, S.M. Zhao, J.Y. Pan, C.A. Su, A fast IP routing lookup scheme for switching routers, in: Proceedings of IEEE INFOCOM, 1999.
[8] G. Huston, Analysis of the Internet’s BGP routing table, Internet Protocol Journal 4 (1) (2001).
[9] M. Akhbarizadeh, M. Nourani, IP routing based on partitioned lookup table, in: Proceedings of the IEEE International Conference on Communications (ICC), April 2002, pp. 2263–2267.
[10] Intel Corporation, IA-32 Intel Architecture Software Developer's Manual, Volume 2: Instruction Set Reference, 2003.
[11] C. Labovitz, G. Malan, F. Jahanian, Internet routing instability, in: Proceedings of ACM SIGCOMM, September 1997, pp. 115–126.
[12] B. Lampson, V. Srinivasan, G. Varghese, IP lookups using multiway and multicolumn search, IEEE/ACM Transactions on Networking 7 (3) (1999) 324–334.

[13] H. Lu, S. Sahni, Enhanced interval trees for dynamic IP router-tables, IEEE Transactions on Computers 53 (12) (2004) 1615–1628.
[14] H. Lu, S. Sahni, O(log n) dynamic router-tables for prefixes and ranges, IEEE Transactions on Computers 53 (10) (2004) 1217–1230.
[15] D. Meyer, University of Oregon Route Views Archive Project: oix-damp-snapshot-2002-12-01-0000.dat.gz, http://archive.routeviews.org/.
[16] S. Nilsson, G. Karlsson, IP-address lookup using LC-tries, IEEE Journal on Selected Areas in Communications 17 (6) (1999) 1083–1092.
[17] M.A. Ruiz-Sanchez, E.W. Biersack, W. Dabbous, Survey and taxonomy of IP address lookup algorithms, IEEE Network Magazine 15 (2) (2001) 8–23.
[18] S. Sahni, K.S. Kim, Efficient construction of multibit tries for IP lookup, IEEE/ACM Transactions on Networking 11 (4) (2003) 650–662.
[19] S. Sahni, K. Kim, O(log n) dynamic router-table design, IEEE Transactions on Computers 53 (3) (2004) 351–363.
[20] V. Srinivasan, G. Varghese, Fast address lookups using controlled prefix expansion, ACM Transactions on Computer Systems 17 (1) (1999) 1–40.
[21] X. Sun, Y.Q. Zhao, An on-chip IP address lookup algorithm, IEEE Transactions on Computers 54 (7) (2005).
[22] S. Suri, G. Varghese, P.R. Warkhede, Multiway range trees: scalable IP lookup with fast updates, in: Proceedings of IEEE GLOBECOM, 2001.
[23] N. Yazdani, P.S. Min, Fast and scalable schemes for the IP address lookup problem, in: Proceedings of IEEE High Performance Switching and Routing, 2000.
[24] F. Zane, G. Narlikar, A. Basu, CoolCAMs: power-efficient TCAMs for forwarding engines, in: Proceedings of IEEE INFOCOM, 2003.
[25] M. Waldvogel, G. Varghese, J. Turner, B. Plattner, Scalable high speed IP routing lookups, in: Proceedings of ACM SIGCOMM '97, Cannes, France, September 1997.
[26] F. Baker, Requirements for IP Version 4 Routers, RFC 1812, June 1995.

Yeim-Kuan Chang received the M.Sc. degree in computer science from the University of Houston at Clear Lake in 1990 and the Ph.D. degree in computer science from Texas A&M University, College Station, Texas, in 1995. He is currently an assistant professor in the Department of Computer Science and Information Engineering at National Cheng Kung University, Tainan, Taiwan, Republic of China. His research interests are in the areas of computer architecture, parallel processing, Internet technology, and computer networking.