Internet Background Radiation Revisited
Eric Wustrow, Networking Research and Development, Merit Network Inc., Ann Arbor, MI 48104 USA
Manish Karir, Networking Research and Development, Merit Network Inc., Ann Arbor, MI 48104 USA
Michael Bailey, Department of EECS, University of Michigan, Ann Arbor, MI 48109 USA
Farnam Jahanian, Department of EECS, University of Michigan, Ann Arbor, MI 48109 USA
Geoff Huston, Asia Pacific Network Information Centre, Brisbane QLD 4064 Australia

ABSTRACT
The monitoring of packets destined for routeable, yet unused, Internet addresses has proved to be a useful technique for measuring a variety of specific Internet phenomena (e.g., worms, DDoS). In 2004, Pang et al. stepped beyond these targeted uses and provided one of the first generic characterizations of this non-productive traffic, demonstrating both its significant size and diversity. However, the six years that followed this study have seen tremendous changes in both the types of malicious activity on the Internet and the quantity and quality of unused address space. In this paper, we revisit the state of Internet "background radiation" through the lens of two unique datasets: a five-year collection from a single unused /8 network block, and week-long collections from three recently allocated /8 network blocks. Through the longitudinal study of the long-lived block, comparisons between blocks, and extensive case studies of traffic in these blocks, we characterize the current state of background radiation, specifically highlighting those features that remain invariant from previous measurements and those which exhibit significant differences. Of particular interest in this work is the exploration of address space pollution, in which significant non-uniform behavior is observed. However, unlike previous observations of differences between unused blocks, we show that increasingly these differences are the result of environmental factors (e.g., misconfiguration, location), rather than algorithmic factors. Where feasible, we offer suggestions for cleanup of these polluted blocks and identify those blocks whose allocations should be withheld.

Categories and Subject Descriptors
C.2 [Information Systems Applications]: Network Operations

General Terms
Measurement

Keywords
Darknet, Internet background traffic, Network pollution

1. INTRODUCTION
The monitoring of allocated, globally routeable, but unused Internet address blocks has been widely used by the security, operations, and research communities to study a wide range of interesting Internet phenomena. As there are no active hosts in these unused blocks, packets destined to these IP addresses must be the result of worm propagation [1, 2, 3], DDoS attacks [4], misconfiguration, or other unsolicited activity. Systems that monitor unused address spaces have a variety of names, including darknets [5], network telescopes [6], blackhole monitors [7], network sinks [8], and network motion sensors [9].

While this monitoring technique had seen heavy use in the measurement of specific phenomena, it was not until 2004, when Pang et al. [10] published their seminal paper "Characteristics of Internet Background Radiation", that a detailed characterization of this incessant non-productive traffic became available. Through passive measurement and active elicitation of connection payloads over several large unused blocks, the authors characterized the behavior of sources and the activities prevalent in Internet background radiation.
Most notable in their analysis was the ubiquity of Internet background radiation, its scale, its rich variegation in targeted services, and the extreme dynamism in many aspects of the observed traffic.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. IMC'10, November 1–3, 2010, Melbourne, Australia. Copyright 2010 ACM 978-1-4503-0057-5/10/11 ...$10.00.

The six years since this landmark paper have seen significant changes both in the size, shape, and traffic carried by the Internet and in the methods and motivations of the malicious traffic that makes up Internet background radiation. While scanning, both as a reconnaissance activity and as a propagation method, is alive and well, the emergence and growth of botnets [11, 12] have changed the threat landscape significantly for most operators. This view of compromised hosts as a resource worth protecting highlights a tension in botnet design between the degree of detection, as evidenced by how noisy malicious behaviors are, and the desire to maintain the useful resource and avoid detection. As with any design tradeoff, there are malicious botnets that continue to be noisy in how they use and acquire hosts (e.g., Conficker); nevertheless, the last six years have seen a marked change in how malicious code behaves [13].

Additionally, today's Internet continues to witness tremendous year-over-year growth, fueled in large part by demand for video [14]. New content delivery mechanisms have changed how traffic flows, and user demands continue to change the applications of interest. These changes impact the behaviors observed in background radiation as new services become more desirable to discover and new network services offer new ways to misconfigure themselves.

Our study is primarily motivated by the dramatic shifts in attack behaviors and in the Internet as a whole since the original 2004 Internet background radiation study [10]. Additionally, as IPv4 address exhaustion nears [15] and dirty network blocks can no longer be returned for newer allocations, there is an increasing need to both identify and quantify address pollution to determine the quality of a network address block and to determine the utility of any cleanup effort. The purpose of this paper is to revisit Internet background radiation in order to determine any evolution in the nature of this traffic and to explore any new features that might have emerged. To provide as broad and detailed a characterization as possible, we draw on two unique sources of data for our analysis. First, we examine five week-long datasets taken from the same routed /8 unused address block, representing the first week in February over the last five years. Second, we examine three week-long datasets built by announcing and capturing traffic to three separate /8 networks recently allocated to APNIC and ARIN from IANA. These three datasets are compared with each other, as well as with three matching week-long collections from the /8 used in the longitudinal study.

To summarize, the value of our work is threefold:

• Revisiting Internet Background Radiation. In this paper we present the first thorough study of Internet background radiation since 2004. We study and characterize this traffic in an attempt to answer two specific questions:

  – Temporal Analysis of Internet Background Radiation. The first question is an attempt to understand how this traffic has evolved over a 5 year

of significant volumes of non-uniform environmental factors, a class of behaviors we collectively label as address space pollution.

• Availability of these Traces. We will make all 11 datasets, nearly 10 TB of compressed PCAP data, available through the Protected Repository for the Defense of Infrastructure Against Cyber Threats (PREDICT) [16] dataset archive in an effort to further expand our knowledge of these interesting phenomena and encourage additional exploration.

The rest of this paper is organized as follows: in Section 2 we describe some directly related work; Section 3 provides an overview of our data collection methodology and describes our datasets; in Section 4 we revisit Internet background radiation, providing both temporal and spatial studies of this traffic; Section 5 outlines our study of address pollution; Section 6 summarizes our results and offers some conclusions and future work.

2. RELATED WORK
Directly related work in this area can generally be categorized into two areas. The first is concerned with the design, operation, and scalability of systems for monitoring Internet background radiation, while the second focuses on the analysis of the data collected via these systems.

There have been several attempts at building Internet background radiation monitoring systems; here we describe the three most popular systems. In [6] the authors discuss perhaps the most popular and visible monitor, at CAIDA. They describe how the size of the monitored address space can influence its ability to detect events. They also present several alternative models for building distributed network monitors. Data from this monitor has been made available to the broad network research community, which has served to increase its visibility. The iSink monitor at the University of Wisconsin was first published in [8], where the authors describe their experience in building this system as well as using it for both active and passive monitoring for detection of possible network abuse activity. One of the chief characteristics of the system was its ability to filter the traffic as well as incorporate application-level responders. The Internet Motion Sensor (IMS) system at the University of Michigan has been described in [9]. The IMS system was perhaps the most distributed and extensive system of the three we have described here. A main finding of this work was the value in a distributed monitoring system, as different blocks in different networks reported significantly different behaviors. The spatial analysis in our study re-confirms these differences, both in space and time, but highlights a growing trend toward pollution as the cause of these differences.
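
The observation in [6] that the size of the monitored address space influences its ability to detect events can be illustrated with a short calculation. The Python sketch below is not taken from [6] or from this study. It assumes a hypothetical source that probes IPv4 addresses uniformly at random and independently, which real scanners and misconfigured hosts often do not do, and the function names are illustrative only. Under that assumption, a monitored block with prefix length n is hit by a single probe with probability 2^-n, so k probes produce at least one hit with probability 1 - (1 - 2^-n)^k.

```python
import math


def detection_probability(prefix_len: int, probes: int) -> float:
    """Chance that at least one of `probes` probes lands in a monitored block.

    Assumption (not from the paper): every probe targets an IPv4 address
    chosen uniformly at random, independently of all other probes.
    """
    if not 1 <= prefix_len <= 32:
        raise ValueError("prefix_len must be between 1 and 32")
    if probes < 0:
        raise ValueError("probes must be non-negative")
    p_hit = 2.0 ** -prefix_len  # fraction of IPv4 space covered by the block
    # 1 - (1 - p_hit)^probes, computed with log1p/expm1 for numerical stability
    return -math.expm1(probes * math.log1p(-p_hit))


def probes_for_confidence(prefix_len: int, confidence: float) -> int:
    """Smallest probe count whose detection probability reaches `confidence`."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    if not 1 <= prefix_len <= 32:
        raise ValueError("prefix_len must be between 1 and 32")
    p_hit = 2.0 ** -prefix_len
    return math.ceil(math.log1p(-confidence) / math.log1p(-p_hit))


if __name__ == "__main__":
    # Compare a /8, a /16, and a /24 monitor against the same probe budget.
    for prefix_len in (8, 16, 24):
        prob = detection_probability(prefix_len, probes=1000)
        print(f"/{prefix_len}: P(at least one hit in 1000 probes) = {prob:.4f}")
```

Under these assumptions a /8 monitor needs far fewer probes from a source before observing it than a /16 or /24 monitor does. The model does not capture the non-uniform, environment-driven traffic that this paper labels address space pollution.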