Multiway Range Trees: Scalable IP Lookup with Fast Updates

Multiway Range Trees: Scalable IP Lookup with Fast Updates Subhash Suri George Varghese Priyank Ramesh Warkhede Department of Computer Science Washington University St. Louis, MO 63130. Abstract—In this paper, we introduce a new IP lookup scheme with A. Our Contribution log n n worst-case search and update time of O , where is the number of prefixes in the forwarding table. Our scheme is based on a new data structure, a multiway range tree. While existing lookup schemes are good We introduce a new IP lookup scheme whose worst-case log n n for IPv4, they do not scale well in both lookup speed and update costs search and update time is O , where is the number when addresses grow longer as in the IPv6 proposal. Thus our lookup of prefixes in the forwarding table. Our scheme is based on a scheme is the first lookup scheme to offer fast lookups and updates for new data structure, a multiway range tree, which achieves the IPv6 while remaining competitive for IPv4. optimal lookup time of binary search, but can also be updated The possibility of a global Internet with addressable appli- fast when a prefix is added or deleted. (By contrast, ordinary ances has motivated a proposal to move from the older In- binary search [5] is a precomputation-based scheme whose n ternet Ipv4 routing protocol with 32 bit addresses to a new worst-case update time is .) protocol, IPv6 with 128 bit addresses. While there is uncer- Our main contribution is to modify binary search for tainty about IPv6, any future Internet protocol will require at prefixes to provide fast update times. In standard binary least 64 bit, and possibly even 128 bit addresses as in IPv6. search [5], each prefix maps to an address range. A set of n Thus for this paper we simply use IPv6 and 128 bits to refer n prefixes partition the address line into at most inter- to the next generation IP protocol and its address length. vals. We build a tree, whose leaves correspond to interval Our paper deals with address lookup, a key bottleneck endpoints. All packet headers mapped to an interval have the log n for Internet routers. Our paper describes the first IP lookup same longest prefix match. Search time is O . scheme that scales well as address lengths increase as in IPv6, The problem is update: a prefix range can be split into n while providing fast search times and fast update times. intervals, which need to be updated when the prefix is There are four requirements for a good lookup scheme: added or deleted. Thus our first idea is associating an address v search time, memory, scalability, and update time. Search span with every tree node v ; the span of is the maximal time is important for lookup to not be a bottleneck. Schemes range of addresses into which the final search can fall after that are memory-efficient lead to faster search because com- reaching v . Thus the root span is the entire address range. pact data structures fit in fast but expensive Static RAM. Scal- We then associate with every node v the set of prefixes that ability in both number of prefixes and address length is im- contain the span of v . However, this results in storing redun- portant because prefix databases are growing, and a switch to dant prefixes. IPv6 will increase address prefix lengths. Finally, schemes To remove this redundancy, we store a prefix with a node v with fast update time are desirable because i) instabilities in if and only if the same prefix is not stored at the parent w of w backbone protocols [4] can cause rapid insertion and deletion the node. Since the span of v is a subrange of the span of , of prefixes. ii) multicast forwarding support requires route any prefix that contains the span of w , will contain the span entries to be added as part of the forwarding process and not of v . Using this compression rule we can prove that only a by a separate routing protocol iii) Any lookup scheme that small number of prefixes need be precomputed and stored; log n does not support fast incremental updates will require two thus updates change information only at O nodes. By copies of the lookup structure in fast memory, one for up- increasing the arity of the tree, we can reduce the height of log n d d dates and one for lookup. the tree to O , for any integer . We pick so that the d The main drawback of existing lookup schemes is their entire node can fit in one cache line. For instance, a 256 bit lack of both scalability and fast update. Current IP lookup cache line and arity-14 yields a tree height of about 5 or 6 for schemes come in two categories: those that use precomputa- the largest IPv4 databases. tion, and those that do not. Schemes like “binary search on Our third contribution is to extend multiway range trees prefix lengths” [9], “Lulea compressed tries” [1], or “binary to IPv6. On a 32-bit machine, a single 128-bit address will search on intervals” [5] perform precomputation to speed up require 4 memory accesses. Thus, an IP lookup scheme de- search; however, adding a prefix can cause the data struc- signed for IPv4 could slow down by a factor of 4 when used ture to be rebuilt. Thus, precomputation-based schemes have for IPv6. In practice, since we are assuming wide memories n a worst-case update time of . On the other hand, trie- of at least 128 bits, we can handle such wide prefixes in one based schemes (e.g., [8], [7]) do not use precomputation; memory access. Unfortunately, the use of 128 bit addresses however, their search time grows linearly with the prefix result in the use of a small branching factor and hence large length. tree heights. To avoid this problem, we generalize our scheme to the case of k -word prefixes, giving a worst-case search and with 32K additional 32-bit prefixes described by the sequence k log n n updates times of O , where is the total number . d k xy of prefixes present in the database. Notice that the factor of Thus all IP addresses of the form where the is additive. LSB of y will have as their best matching pre- The rest of this paper is organized as follows. Section I fix. The binary search method [5] handles this by setting match p describes basic binary search. Section II describes our main for all the 32K prefixes of length 32. data structure, the multiway range tree. Section III briefly Now consider adding the prefix Since this new pre- xy describes how to handle updates efficiently. Section IV de- fix is now a better match for prefixes of the form , match p scribes our extension to long addresses. Section V describes we have to update for 32K values of our experiments and measurements. Section VI states our p. Thus update can take linear time. We avoid this problem conclusions. using the following ideas. I. IP LOOKUP BY BINARY OR MULTIWAY SEARCH A. Address Spans We first review binary search [5] prefix matching. Suppose Our key idea is to associate address spans with nodes and r r r m keys of the tree; intuitively, the address span of a node is the m the database consists of prefixes . Each pre- b e fix r matches a contiguous interval of addresses. For range of addresses that can be reached through the node. For instance, the interval of prefix r begins at point the multiway tree in Figure 1, first consider a leaf node, such z x x and ends at . View the IP address as . For each key in it, we define the span of to be space as a line in which each prefix maps to a contiguous in- the range from its predecessor key to the key itself. (When terval, and each destination address is a point. Prefix intervals the predecessor key doesn’t exist, we use the artificial guard are disjoint but one prefix can completely contain another. key .) To ensure that spans are disjoint, each span in- The longest matching prefix problem reduces to determining cludes the right endpoint of its range, but not the left end- c spanaa the shortest interval containing a query point. point. Thus, the spans of a b and are , ba b spancb c span , . p p p n m n Let , where , denote the distinct endpoints of all the prefix intervals, sorted in ascending order. The span of a leaf node is defined to be the union of the spanz c match p p spans of all the keys in it. Thus, we have . i With each key i , we store two prefixes, and h p match p matc The span of a non-leaf node is defined as the union of the i i . The first, , is the longest prefix that match p p spans of all the descendants of the node. Thus, for in- i begins or ends at i . The second, , is the longest spanv i spanw i o p p stance, , and , and i prefix whose interval includes the range i , where spanuo p p . (In fact, it is easy to check that nodes at i i is the predecessor key of . the same level in the tree form a partition of the total address Given a destination address q , we perform binary search span defined by all the input prefixes.) We now describe how n q in log time to determine the successor of , which is the to associate prefixes with nodes of the tree.

Load more