DNS for Massive-Scale Command and Control Kui Xu Member, IEEE, Patrick Butler, Sudip Saha, Danfeng (Daphne) Yao Member, IEEE

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 1 DNS for Massive-Scale Command and Control Kui Xu Member, IEEE, Patrick Butler, Sudip Saha, Danfeng (Daphne) Yao Member, IEEE Abstract—Attackers, in particular botnet controllers, use stealthy messaging systems to set up large-scale command and control. In order to systematically understand the potential capability of attackers, we investigate the feasibility of using domain name service (DNS) as a stealthy botnet command-and-control channel. We describe and quantitatively analyze several techniques that can be used to effectively hide malicious DNS activities at the network level. Our experimental evaluation makes use of a two-month-long 4.6GB campus network dataset and 1 million domain names obtained from alexa.com. We conclude that the DNS-based stealthy command-and-control channel (in particular the codeword mode) can be very powerful for attackers, showing the need for further research by defenders in this direction. The statistical analysis of DNS payload as a countermeasure has practical limitations inhibiting its large-scale deployment. Index Terms—Network security, DNS security, botnet detection, and command and control. F 1 INTRODUCTION The decentralized nature of domain name systems (DNS) with a series of redundant servers potentially Botnet command and control (C&C) channel refers to provides an effective channel for covert communication the protocol used by bots and botmaster (i.e., botnet of a large distributed system, including botnets. To play controller) to communicate to each other, e.g., for bots the devil’s advocate, we focus on systematically ana- to receive new attack commands and updates from lyzing the feasibility of a pure DNS-based command- botmaster, or to submit stolen data. A C&C channel for a and-control 1. Such a study has never been reported botnet needs to be reliable, redundant, non-centralized, in the literature. Our C&C system is compatible with and easily disguised as legitimate traffic. Many botnet existing DNS infrastructure without enlisting any Web operators used the Internet Relay Chat protocol (IRC) or special-purpose servers. The DNS channel is aided by or HTTP servers to pass information. Botnet operators being a high traffic channel such that data can be easily constantly explore new stealthy communication mech- hidden. As virtually anyone can create and register their anisms to evade detection. HTTP-based command and own domain names (if available) and DNS servers, it is control is difficult to distinguish from legitimate Web a system that can be easily infiltrated by hackers and traffic. The feasibility of email as a stealthy botnet com- botnet operators. mand and control protocol was studied by researchers DNS tunneling is a technique known for transmit- in [29]. In this paper, we systematically investigate the ting arbitrary data via DNS protocol, e.g., DNScat and feasibility of solely using Domain Name System (DNS) DeNiSe. One application of DNS tunneling is to bypass queries for botnet command and control. DNS provides firewalls, as both inbound and outbound DNS con- a distributed infrastructure for storing, updating, and nections are usually allowed by organizational firewall disseminating data that conveniently fits the need for rules. Because DNS is often overlooked in current secu- a large-scale command and control system. The HTTP rity measures, it offers a command-and-control channel protocol is for the end-to-end communication between a that is unimpeded. Because nearly all traffic requires client and a server. In comparison, DNS provides not DNS to translate domain names to IP addresses and only a means of communication between computers, back, simple firewall rules cannot easily be created with- but also systematic mechanisms for naming, locating, out harming legitimate traffic. Recently, Dietrich and distributing, and caching resources with fault tolerance. colleagues reported Feederbot that used DNS as a These features of DNS may be utilized to fulfill a more communication channel for C&C traffic [5]. However, effective command-and-control system than what HTTP Feederbot fails to utilize any distributed storage and servers may provide. query mechanisms offered by DNS. This botnet simply tunnels its command and control traffic by sending it in • Xu, Butler, and Saha are currently Ph.D. candidates at Virginia Tech DNS format for the end-to-end communication between Department of Computer Science. • Yao is with the Department of Computer Science, Virginia Tech, Blacksburg bots and the bot master. The domains used by them are VA 24060. not registered and cannot be resolved. Email: see http://www.cs.vt.edu/user/yao While using DNS tunneling for command and control The preliminary version of this work appeared in the 9th International has been observed [13], it was still unclear how effec- Conference on Applied Cryptography and Network Security (ACNS ’11), Lecture Notes in Computer Science 6715, pages 238-254 [2]. This work was supported in part by National Science Foundation grants CNS-0831186, 1. Other C&C protocols (e.g., HTTP) also involve DNS queries for CAREER CNS-0953638, ARO grant STIR-450080, and NSF S2ERC. name translation, but they do not use DNS for command and control. JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 2 tive and feasible to use DNS to maintain stealthy large Section 6. Conclusions and open problems are given in botnets. Specifically, three items to consider in order to Section 7. evade detection for attackers are 1) Query activities. When and how frequent do bots 2 COMMUNICATION MODES issue DNS queries to pull updates from or submit data to the bot master? How to modify the victim’s In this section, we describe protocols that pass mes- operating system to implement automatic query sages over the DNS between distributed entities, and strategies? illustrate the ease of setting up large-scale command- 2) Domain names. What domain names to use for com- and-control via DNS. We describe two forms of communication and how to synchronize the generation munication modes: codeword mode and tunneled mode. of new domain names between bots and the bot Codeword communication allows one-way communication master? from botmaster to a bot client, which is suitable for 3) DNS payload. How to evade the detection through issuing attack commands. Tunneled communication allows deep packet inspection by defenders on DNS pay- for the transmitting of arbitrary data in both directions load? between bot and botmaster, which may be used for both issuing commands and collecting stolen data. The former Our work systematic addresses these questions using only requires the ability to set a particular domain name system engineering, networking, and data mining tech- response, this could be done via any free DNS service, niques. Our technical contributions are summarized as while the latter requires setting up an authoritative follows. domain server. • We describe techniques for hiding query activities, The controller of the botnet first needs to create a including i) piggybacking query strategy – a bot blends domain or subdomain, which is administered from a its (outbound) DNS queries with legitimate DNS special DNS server. This DNS server waits for special queries and ii) exponentially distributed query strategy name lookups, which it then translates into incoming – a bot probabilistically distributes DNS queries data. The DNS server then responds with the appropri- so that inter-arrival times follow an exponential ate data using the agreed-upon semantics. We assume distribution. We demonstrate the ability for a bot that the botnet controller (i.e., botmaster) has access to to send piggybacking DNS traffic through traffic the authoritative domain name server for some domains sniffing in Linux. or sub-domains. Bots across the Internet frequently re- • For automatic domain flux, where the domain ceive commands and updates from a botmaster and names used for communications in the botnet are launch attacks accordingly, as well as submit stolen data changed frequently and in a synchronized fashion to the botmaster. We give brief background information across all bots and their controllers, we describe on DNS records. a practical automatic domain flux method with DNS Resources Records The DNS system allows a Markov chain, and experimentally evaluate it with name server administrator to associate different types of 1 million domain names from alexa.com. data with either a fully qualified domain name or an IP • Statistical methods can be used by defenders to address. To send a message to a bot, an adversary can detect anomalies in the content of DNS packets, store data in any one of these types of records. through comparing the probability distributions of • A record specifies an IP address for a given host normal DNS traffic and tunneling traffic. We evalu- name. ate these methods as countermeasures and point out • CNAME and MX records can point to textual data the practical limitations that hinder the large-scale representing the alias or mailing host of a particular deployment by defenders. host name. We perform comprehensive experiments to evaluate • TXT records are designed to store arbitrary textual the behaviors of proposed query strategies in terms data up to 255 characters. of how quickly new commands are disseminated to a • EDNS0 record allows storing up to a 1280 byte large number of bots. Our analysis utilizes a 4.6GB two- payload [24]. EDNS0 was introduced in RFC261 in month-long wireless network trace obtained from an order to extend the DNS protocol. When a capable organization. We conclude that the DNS-based botnet server or client encounters this field, it can decode command-and-control channel is feasible, powerful, and the packets, allowing several improvements to the difficult to detect and block. basic DNS protocol. These features include larger Organization We describe the basic DNS tunneling UDP packet size, a list of attribute value pairs, and mechanisms in Section 2.

Load more