Evaluating Anycast in the Domain Name System

Evaluating Anycast in the Domain Name System Xun Fan John Heidemann Ramesh Govindan University of Southern California / Information Sciences Institute and Computer Science Dept. Abstract—IP anycast is a central part of production DNS. enumerate anycast use in DNS. As we show in Section V, While prior work has explored proximity, affinity and load such a capability can aid in diagnosing abnormal name service balancing for some anycast services, there has been little attention configurations, and help understand anycast deployment. to third-party discovery and enumeration of components of an anycast service. Enumeration can reveal abnormal service con- The first contribution of our work is to evaluate different figurations, benign masquerading or hostile hijacking of anycast approaches to automatically discover and enumerate all nodes services, and help characterize anycast deployment. In this paper, of an anycast service. To understand the challenges in anycast we discuss two methods to identify and characterize anycast discovery, we first taxonomize anycast deployment configura- nodes. The first uses an existing anycast diagnosis method based tions (Section II). Anycast discovery is challenging because on CHAOS-class DNS records but augments it with traceroute to resolve ambiguities. The second proposes Internet-class DNS anycast configurations can be complex, existing diagnosis records which permit accurate discovery through the use of methods are not standardized that leads to measurement ambi- existing recursive DNS infrastructure. We validate these two guity, and the visibility of anycast servers can be topologically methods against three widely-used anycast DNS services, using a scoped requiring a large number of vantage points (unique very large number (60k and 300k) of vantage points, and show observation locations). that they can provide excellent precision and recall. Finally, we use these methods to evaluate anycast deployments in top-level We then discuss the design of two methods to enumer- domains (TLDs), and find one case where a third-party operates ate anycast nodes. The first method uses an existing any- a server masquerading as a root DNS anycast node as well as cast diagnosis technique based on CHAOS-class TXT DNS a noticeable proportion of unusual DNS proxies. We also show records [41], but augments it with traceroute to identify that, across all TLDs, up to 72% use anycast. non-cooperative anycast nodes and resolve ambiguities (Sec- tion III-A). This approach requires specific measurement sup- I. INTRODUCTION port, sometimes limiting its coverage. Our second method Rapid response and high availability requires that large (Section III-B) proposes the use of Internet-class (IN) TXT network services be distributed widely, often with a single DNS records to enable the use of tens of thousands of recursive logical service is provided by distributed replicas accessed DNS servers as vantage points. using a single logical identifier. Content delivery networks A careful validation of these methods, using 60k and 300k (for example, [13]), mirroring services (for example, [12]), vantage points, reveals interesting trade-offs (Section IV). URNs [38], and IP anycast [33] all fit this model. CHAOS queries issued from 60k Netalyzr clients discover In IP anycast, as standardized by the IETF [1], [33], an (measured using recall from information retrieval) 93% of anycast service operates on a single IP address, but multiple F-root anycast servers. However, because the CHAOS query anycast nodes replicate that service at different physical lo- format is not standardized, different providers use different cations. Each node may be implemented with one or more conventions to identify anycast servers; this results in a servers (a physical or virtual machine), each of which listens measurement ambiguity that can be resolved using traceroute to the anycast address and often also one or more unicast probes. A smaller scale experiment on PlanetLab using 238 addresses. Standard interdomain routing directs clients to the nodes reveals that precision of CHAOS queries can be im- nearest replica and handles fail-over to other nodes as required. proved from 65% to 89% using traceroute, and to 100% if the (We review details and terms in Section II.) provider’s CHAOS labeling conventions are known. Finally, Anycast is used for many core services of the Internet today. we show that, up to 90% recall is possible on-demand, when It is widely used for DNS [19]: as of April 2011, 10 out of we shift to IN queries and 300k recursive DNS servers, as 13 root name servers employ anycast [35]. Other uses include evaluated on the AS112 anycast service. discovering IPv6-to-IPv4 relay routers [21] and sinkholes [18] More important, we find that 10,000 or more vantage points and for load distribution [14], [39]. Anycasted services benefit are required to reach a recall of 80% for either method from anycast’s load balancing and ability to mitigate denial- (Section IV-B). For context, almost all prior work on anycast of-service attacks [1], and research proposals have discussed performance (the exception being [8]) has used only hundreds improvements to scale to many anycast destinations [23]. of vantage points. Interesting future work may be to examine The use of anycast for core Internet services suggests we whether their conclusions would be significantly altered by a need to understand its performance, use, and robustness. In broader perspective as suggested by our approaches. this paper, we focus on understanding anycast use in DNS. Our second contribution is to understand how anycast is Extensive prior work (Section VI) has measured server prox- used in practice over many services (Section V). Until recently, imity, the affinity between clients and anycast services, and the the AS112 anycast service used manual methods to track their performance of load balancing of anycasted DNS. However, extent of deployment; our evaluations find that the manual list to date there has been no effort to discover, map, and fully is out-of-date and incomplete, with about 26% of listed nodes 2 each node may be implemented by one or more anycast servers. Figure 1 lists configurations that influence our design. The top row (T1) shows the simplest case, where a single server provides service at a given anycast node. This server listens to traffic on the service anycast address. Since anycast nodes are often placed in IXPs with complex local topologies, row T2 of Figure 1 shows a single server with links to multiple adjacent routers, either connected by a shared LAN or with multiple network interfaces. For large services such as a top-level domain server, a Fig. 1: Anycast node configurations. Observed labels in italics and service at an anycast node may be provided by multiple penultimate-hop traceroute routers in bold. “VP” indicates vantage points, “R1, R2” indicates penultimate routers, physical servers. Cases T3 and T4 show multiple servers “N1a, N1b” indicates servers in the same node. behind one (T3) or two or more (T4) routers. III. METHODS FOR ANYCAST DISCOVERY no longer operational. Recently, AS112 operators have adopted We wish to allow a third-party to fully enumerate all anycast a discovery method similar to what we propose. Second, we nodes of a service. Our method sends special DNS queries evaluate anomalous anycast usage (Section V-A). We found from vantage points in many anycast catchments to elicit one third-party DNS server masquerading as an anycast node unique information from each anycast node. In addition, since for a public root server, and hundreds of users observe what one anycast node may have multiple anycast servers (cases are likely in-path proxies. This demonstrates the importance T3 and T4 of Figure 1). We must therefore associate anycast of dynamic discovery methods to audit critical Internet infras- servers with nodes to avoid node overcounts (Section IV-C). tructure. Finally, in Section V-C, we apply anycast discovery We describe our two active probing methods below. First, to servers for all top-level domains, showing that up to 72% we extend the existing CHAOS queries by adding traceroutes. of TLDs may now be using anycast. Thus, our methods can Second, we develop a new proposal for a standardized type lead to new insights about anycast usage and, in the future, of IN query. These methods differ in what information they enable an understanding of how this usage evolves over time. return and what vantage points they can use. Data we generated for this paper is no cost [17]. A. CHAOS Queries Anycast providers require methods to observe and debug II. ATAXONOMY OF ANYCAST CONFIGURATIONS their services. Their current methods use DNS records to IP anycast provides clients of a distributed service with a identify individual anycast nodes and servers as documented single-server abstraction [33]. Clients send traffic to a specific in RFC-4892 [41]. Although not mandatory, we find these IP address identifying the anycast service. However, the conventions used widely (Section V-C). service itself is implemented by a service provider using one or Since anycast is often used for DNS services, and DNS more anycast nodes that can be physically distributed around provides a convenient external query mechanism, RFC-4892 the Internet. Standard routing protocols such as BGP ensure uses distinct DNS records to identify specific anycast servers. that the user’s packets are sent to a nearby anycast node. Each It re-purposes CHAOS-class network records from the now anycast node covers a catchment of nearby users, and, the defunct Chaosnet to provide anycast diagnosis. Standard DNS whole Internet is divided into as many catchments as anycast records [30] with class CHAOS, type TXT, and name host- nodes. Catchments may change when routing changes, so a name.bind or id.server are defined to return a unique string per user may switch between two or more catchments at any time.

Load more