An Internet Topology Data Collector

TraceNET: An Internet Topology Data Collector M. Engin Tozal Kamil Sarac Department of Computer Science Department of Computer Science The University of Texas at Dallas The University of Texas at Dallas Richardson, TX 75080 U.S.A. Richardson, TX 75080 U.S.A. [email protected] [email protected] ABSTRACT This paper presents a network layer Internet topology collec- Destination Vantage tion tool called tracenet.Comparedtotraceroute, trace- a net can collect a more complete topology information on an Traceroute end-to-end path. That is, while traceroute returns a list of IP addresses each representing a router on a path, tracenet attempts to return all the IP addresses assigned to the inter- Destination Vantage faces on each visited subnetwork on the path. Consequently, b TraceNET the collected information (1) includes more IP addresses be- longing to the traced path; (2) represents“being on the same LAN”relationship among the collected IP addresses; and (3) Figure 1: Traceroute vs tracenet on a path trace. annotates the discovered subnets with their observed subnet Traceroute collects a single IP address vs tracenet masks. Our experiments on Internet2, GEANT, and four collects a subnet at each hop. major ISP networks demonstrate promising results on the utility of tracenet for future topology measurement studies. maps enrich the router level maps with subnet level connec- Categories and Subject Descriptors tivity info; and AS level maps demonstrate the adjacency relationship between ASes. C.2 [COMPUTER-COMMUNICATION NETWORKS]: Adverse to the benefits of having a network topology map, Network Architecture and Design the main tools used to collect router or IP address level topology data are a few and operationally limited. Trace- General Terms route [12] and ping are the main data collection tools. Tra- Measurement ceroute collects a list of IP addresses one for each router on the path between two hosts and ping is mainly used to check whether an IP address is in use or not. Almost all topology Keywords mapping projects use data collected by traceroute from Internet, Network, Subnet, Topology, Traceroute multiple vantage points [19]. In this study, we propose a new end-to-end topology collection tool called tracenet. An accurate and complete In- 1. INTRODUCTION ternet topology map at the router level requires identifying Many successful research projects and efforts have been all routers and subnets among them. Traceroute attempts introduced attempting to derive an accurate and large scale to collect an IP address at each router on a path between two topology map of the Internet [17, 18, 22, 16]. These efforts hosts whereas tracenet attempts to collect a subnet at each focus on different but correlated topology maps: IP level router on the same path. In the worst case, tracenet returns maps show IP addresses that are in use on the Internet; the exact path that would be returned by traceroute, and, router level maps group the interfaces hosted by the same in the best case, it collects the complete topology of each router into a single unit (via alias resolution); subnet level subnet visited on the path. Consequently, a single session of tracenet (1) discovers new IP addresses that are missed by traceroute, (2) marks multi-access and point-to-point links, (3) reveals subnet relationship among IP addresses on Permission to make digital or hard copies of all or part of this work for the path, and (4) annotates the subnets with their observed personal or classroom use is granted without fee provided that copies are subnet masks. Traceroute’s ability to collect a similar data not made or distributed for profit or commercial advantage and that copies is often limited in practice due to the difficulties of obtain- bear this notice and the full citation on the first page. To copy otherwise, to ing a reverse path trace and due to the dynamics of the republish, to post on servers or to redistribute to lists, requires prior specific underlying routing behavior between the two systems. As permission and/or a fee. IMC’10, November 1–3, 2010, Melbourne, Australia. an example, consider the use of traceroute and tracenet Copyright 2010 ACM 978-1-4503-0057-5/10/11 ...$10.00. to collect router level topology info between a vantage point 356 R R R 12 R12 R12R A R R R A R R R R R R 3 4 5 3 4 5 A 3 4 5 R R R R R R R R R 6 7 8 9 6 8 9 6 R7 8 R9 B B P B P 1 1 P2 P2 Network Topology P Traceroute P TraceNET C D 3 C D 3 C D (a) (b) (c) Figure 2: A network topology section among hosts A, B, C,andD with unweighed links. P1 = {A, R1,R2,R5,R9,D} and P2 = {A, R3,R4,R5,R9,D} are two paths from A to D. P3 = {B,R6,R3,R4,R8,C} is apathfromB to C. Figures show the original network topology, traceroute view of the paths and tracenet view of the paths respectively. and a destination as shown in Figure 1. Figures 1.a and our approach allows us to collect more information at each 1.b show the data acquired by traceroute and tracenet path trace issued at a vantage point. respectively over a network segment. In the figure, small One critical observation about the existing topology dis- circles attached to the routers show the interfaces on the covery studies is that accuracy is always considered as a routers. An interface whose IP address is revealed during posterior process after data collection. Accuracy objective a trace is shown in black and otherwise is shown in white. is achieved by addressing several functional steps in convert- The lines represent point-to-point or multi-access LANs and ing raw topology data to the corresponding topology maps the arrows on the links show the routing direction of the and it involves IP alias resolution, anonymous router reso- trace between the vantage point and the destination. Note lution, and subnet inference steps. Most of these tasks are that in order for traceroute to return a similar topology shown to be computationally expensive due to the large vol- information as tracenet, one needs to run traceroute in ume of data [10, 8, 7]. Tracenet combines some of these the reverse direction (from the destination to the vantage functional steps, e.g., subnet inference, into topology point) as well. However, a key limitation for traceroute is collection phase significantly reducing the computational that one may not have access to the destination node to run complexity in converting the raw data into corresponding a traceroute query in the reverse direction. Even if one has topology maps. the required access, the paths between the two nodes may Note that both completeness and accuracy conditions af- not be symmetric and therefore the reverse trace returned fect practical utility of the resulting topology maps. As an by traceroute might not capture the missing information example, consider the use of a collected network topology from the first path trace. In fact, such a trace may return map in designing resilient overlay network systems where another incomplete network topology data for the reverse the goal is to use the topology map to identify node and path. Tracenet, on the other hand, attempts to collect the link disjoint overlay paths between two neighboring overlay complete topology information on the path in a single trace nodes as shown in Figure 2. Figure 2.a shows the physical from the vantage point to the destination. topology of a network that includes several point-to-point This valuable information comes with extra probing over- and multi-access links. Assume that our goal is to identify head. However, taking into account that acquiring similar node and link disjoint paths between A and D as well as information with traceroute requires extensive tracing con- between B and C in this network. Figure 2.b shows the net- ducted from many vantage points and a careful post process- work topology collected by traceroute where P1 and P2 are ing [10, 8, 7], tracenet can be regarded as a cost effective two paths between A and D and P3 is a single path between solution in terms of bandwidth and computation. B and C. Based on this topology map one would infer that Two important objectives in router level Internet topology the use of P1 for A to D path along with the use of P3 for mapping studies are completeness and accuracy. Complete- B and C path would satisfy the node and link disjointness ness objective requires discovering each and every alive IP requirement. However, this would be an inaccurate conclu- address on a given network and accuracy objective requires sion as routers R2,R4,R5 and R8 are sharing a multi-access grouping together IP addresses that are on the same router link and P1 and P3 are not really link disjoint. On the other and establishing both multi-access and point-to-point links hand, a tracenet collected topology info as shown in Fig- between the routers. ure 2.c would include the subnet information hence, would A common goal in most topology discovery studies is to help to avoid the incorrect conclusion above. increase the coverage of the underlying network as much Our experimental evaluations of tracenet on Internet2 as possible. This is typically implemented by increasing and GEANT topology, with data collected from a single the number of vantage points and destination addresses in vantage point at UT Dallas, resulted in a topology map topology discovery.

Load more