Scalable High-Speed Prefix Matching

Scalable High-Speed Prefix Matching Marcel Waldvogel Washington University in St. Louis and George Varghese University of California, San Diego and Jon Turner Washington University in St. Louis and Bernhard Plattner ETH Zürich The work of Marcel Waldvogel was supported in part by KTI grant 3221.1. The work of George Varghese was supported in part by an ONR Young Investigator Award and NSF grants NCR-940997 and NCR-9628218. Parts of this paper were presented in ACM SIGCOMM ’97 [Waldvogel et al. 1997]. Name: Marcel Waldvogel Affiliation: Washington University in St. Louis Address: Department of Computer Science; Campus Box 1045; Washington University in St. Louis; St. Louis, MO 63130-4899; USA; [email protected] Name: George Varghese Affiliation: University of California, San Diego Address: Computer Science and Engineering, MS 0114; University of California, San Diego; 9500 Gilman Drive; La Jolla, CA 92040-0114; [email protected] Name: Jon Turner Affiliation: Washington University in St. Louis Address: Department of Computer Science; Campus Box 1045; Washington University in St. Louis; St. Louis, MO 63130-4899; USA; [email protected] Name: Bernhard Plattner Affiliation: ETH Zurich¨ Address: TIK, ETZ G89; ETH Zurich;¨ 8092 Zurich;¨ Switzerland; [email protected] Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted with- out fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for com- ponents of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works, requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept, ACM Inc., 1515 Broadway, New York, NY 10036 USA, fax +1 (212) 869-0481, or [email protected]. 2 · M. Waldvogel, G. Varghese, J. Turner, and B. Plattner Finding the longest matching prefix from a database of keywords is an old problem with a number of applications, ranging from dictionary searches to advanced memory management to computational geometry. But perhaps today’s most frequent best matching prefix lookups occur in the Internet, when forwarding packets from router to router. Internet traffic volume and link speeds are rapidly increasing; at the same time, an increasing user population is increasing the size of routing tables against which packets must be matched. Both factors make router prefix matching extremely performance critical. In this paper, we introduce a taxonomy for prefix matching technologies, which we use as a basis for describ- ing, categorizing, and comparing existing approaches. We then present in detail a fast scheme using binary search over hash tables, which is especially suited for matching long addresses, such as the 128 bit addresses proposed for use in the next generation Internet Protocol, IPv6. We also present optimizations that exploit the structure of existing databases to further improve access time and reduce storage space. Categories and Subject Descriptors: C.2.6 [Computer-Communication Networks]: Internetworking—Routers; E.2 [Data Storage Representations]: Hash-table representations; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems—Pattern matching General Terms: Algorithms, Performance Additional Key Words and Phrases: collision resolution, forwarding lookups, high-speed networking 1. INTRODUCTION The Internet is becoming ubiquitous: everyone wants to join in. Since the advent of the World Wide Web, the number of users, hosts, domains, and networks connected to the Internet seems to be growing explosively. Not surprisingly, network traffic is doubling every few months. The proliferation of multimedia networking applications (e.g., Napster) and devices (e.g., IP phones) is expected to give traffic another major boost. The increasing traffic demand requires four key factors to keep pace if the Internet is to continue to provide good service: link speeds, router data throughput, packet forwarding rates, and quick adaptation to routing changes. Readily available solutions exist for the first two factors: for example, fiber-optic cables can provide faster links and switch- ing technology can be used to move packets from the input interface of a router to the corresponding output interface at multi-gigabit speeds [Partridge et al. 1998]. Our paper deals with the other two factors: forwarding packets at high speeds while still allowing for frequent updates to the routing table. A major step in packet forwarding is to lookup the destination address (of an incoming packet) in the routing database. While there are other chores, such as updating TTL fields, these are computationally inexpensive compared to the major task of address lookup. Data link Bridges have been doing address lookups at 100 Mbps [Spinney 1995] for many years. However, bridges only do exact matching on the destination (MAC) address, while Inter- net routers have to search their database for the longest prefix matching a destination IP address. Thus, standard techniques for exact matching, such as perfect hashing, binary search, and standard Content Addressable Memories (CAM) cannot directly be used for Internet address lookups. Also, the most widely used algorithm for IP lookups, BSD Patri- cia Tries [Sklower 1993], has poor performance. Prefix matching in Internet routers was introduced in the early 1990s, when it was fore- seen that the number of endpoints and the amount of routing information would grow enormously. At that time, only address classes A, B, and C existed, giving individual sites either 24, 16, and 8 bits of address space, allowing up to 16 Million, 65,534, and 254 host Scalable High-Speed Prefix Matching · 3 addresses, respectively. The size of the network could easily be deduced from the first few address bits, making hashing a popular technique. The limited granularity turned out to be extremely wasteful on address space. To make better use of this scarce resource, especially the class B addresses, bundles of class C networks were given out instead of class B addresses. This would have resulted in massive growth of routing table entries over time. Therefore, Classless Inter-Domain Routing (CIDR) [Fuller et al. 1993] was introduced, which allowed for aggregation of networks in arbitrary powers of two to reduce routing table entries. With this aggregation, it was no longer possible to identify the number of bits relevant for the forwarding decision from the address itself, but required a prefix match, where the number of relevant bits was only known when the matching entry had already been found in the database. To achieve maximum routing table space reduction, aggregation is done aggressively. Suppose all the subnets in a big network have identical routing information except for a sin- gle, small subnet with different information. Instead of having multiple routing entries for each subnet in the large network, just two entries are needed: one for the overall network, and one entry showing the exception for the small subnet. Now there are two matches for packets addressed to the exceptional subnet. Clearly, the exception entry should get preference there. This is achieved by preferring the more specific entry, resulting in a Best Matching Prefix (BMP) operation. In summary, CIDR traded off better usage of the limited IP address space and a reduction in routing information for a more complex lookup scheme. The upshot is that today an IP router’s database consists of a number of address prefixes. When an IP router receives a packet, it must compute which of the prefixes in its database has the longest match when compared to the destination address in the packet. The packet is then forwarded to the output link associated with that prefix, directed to the next router or the destination host. For example, a forwarding database may have the prefixes P 1 = 0000∗, P2 =0000111∗ and P3 =00001111 0000∗, with ∗ meaning all further bits are unspecified. An address whose first 12 bits are 0000 0110 1111 has longest matching prefix P1. On the other hand, an address whose first 12 bits are 0000 1111 0000 has longest matching prefix P3. The use of best matching prefix in forwarding has allowed IP routers to accommodate various levels of address hierarchies, and has allowed parts of the network to be oblivious of details in other parts. Given that best matching prefix forwarding is necessary for hierarchies, and hashing is a natural solution for exact matching, a natural question is: “Why can’t we modify hashing to do best matching prefix?” However, for several years now, it was considered not to be “apparent how to accommodate hierarchies while using hashing, other than rehashing for each level of hierarchy possible” [Sklower 1993]. Our paper describes a novel algorithmic solution to longest prefix match, using binary search over hash tables organized by the length of the prefix. Our solution requires a worst case of log W hash lookups, with W being the length of the address in bits. Thus, for the current Internet protocol suite (IPv4) with 32 bit addresses, we need at most 5 hash lookups. For the upcoming IP version 6 (IPv6) with 128 bit addresses, we can do lookup in at most 7 steps, as opposed to longer for current algorithms (see Section 2), giving an order of magnitude performance improvement. Using perfect hashing [Fredman et al. 1984], we can lookup 128 bit IP addresses in at most 7 memory accesses. This is significant because on current processors, the calculation of a hash function is usually much cheaper than an off-chip memory access. 4 · M.

Load more