Botnet Communication Patterns” in the Journal ”Communications Surveys & Tutorials”

This article is the accepted version of ”Botnet Communication Patterns” in the journal ”Communications Surveys & Tutorials”. Citation information: DOI 10.1109/COMST.2017.2749442 Abstract: http://ieeexplore.ieee.org/document/8026031/ 1 Botnet Communication Patterns Gernot Vormayr, Tanja Zseby, and Joachim Fabini Abstract—Malicious botnets have become a common threat A. Malicious Botnets and pervade large parts of the Internet today. Existing surveys Malicious botnets are used for various tasks including infor- and taxonomies focus on botnet topologies, Command and Control (C&C) protocols, and botnet objectives. Building on these mation theft [1], [2] or abuse of online services [3]. Another research results, network-based detection techniques have been large percentage of botnets is employed for service disruption, proposed that are capable of detecting known botnets. Methods which is used for stifling other players (e.g., competition, for- for botnet establishment and operation have evolved significantly eign states) [4], [5]. Those malicious botnets pose a real threat over the past decade resulting in the need for detection methods to individuals on the Internet [2], [6], networked infrastructures that are capable of detecting new, previously unknown types of botnets. [7], and even the Internet overall [4], [8]. Therefore, various In this paper we present an in-depth analysis of all network taxonomies [9]–[11] and recommendations [12] categorizing communication aspects in botnet establishment and operation. different aspects of botnets and ways of detection have been We examine botnet topology, protocols, and analyze a large set published. In line with the majority of earlier publications, of very different and highly sophisticated existing botnets from the remainder of this work uses the term botnets to refer to a network communication perspective. Based on our analysis, we introduce a novel taxonomy of generalized communication malicious botnets. patterns for botnet communication using standardized Unified A botnet consists of i) several bots, ii) a Command and Modeling Language (UML) sequence diagrams. We furthermore Control (C&C) server, and iii) a botmaster. Additionally, examine data exchange options and investigate the influence of this work uses the term victim for the target of an attack encryption and hiding techniques. Our generalized communi- or a bot infection. The diagram in fig. 1 illustrates botnet cation patterns provide a useful basis for the development of sophisticated network-based botnet detection mechanisms and participants, roles, and communication in a hypothetical botnet can offer a key component for building protocol- and topology- with centralized command structure. independent network-based detectors. Bots (also called zombies [13]) are infected machines that execute the bot executable. Those machines process the tasks Index Terms—bot, botnet, C&C, botnet detection supplied by the C&C server. The C&C server is the central rallying point for the botnet. Since this server is capable of controlling every bot, it has I. INTRODUCTION to be protected from takeover by third-parties like, e.g., law enforcement, researchers, or rivaling bot masters. Modern ETWORKED computers enable distributed computing botnet topologies (e.g., Peer to Peer (P2P)) can utilize regular N and sharing of resources. Distributing tasks over multiple bots that are part of the botnet as C&C server. Thus, the threat machines allows the execution of tasks whose demands exceed of bringing down the botnet via a single point of failure is the resources available on one single computer. This technique remedied. can also speed up the processing of a single task by splitting it The botmaster is the person controlling the botnet via the up into sub tasks that can be performed on several computers C&C server. Botmasters try to stay as stealthy as possible, simultaneously. because unmasking the identity of the botmaster can lead to One option to design such systems is by combining re- potential prosecution. This can be achieved for example by sources on networked computers, which are called bots, de- using as little traffic as possible, or by not using a direct signed to perform a task or sub tasks. connection to the rest of the botnet. An additional networked computer, called the master, is needed for coordinating the bots. Hence, the given name B. Botnet Detectors botnet, which is a combination of bot and network. Botnets taking part in possibly illegal activity can cause Since the Internet contains vast amounts of unused pro- unwanted network traffic, or interfere with day to day business. cessing power and network bandwidth, malicious actors found Thus, detection and removal of botnets are important tasks. ways to use those machines for their goals. This is achieved Since botnets tend to hide their operation, detection of botnets by infecting machines with malicious software (malware), and is an active field of research. Botnet detection can target one therefore building a malicious botnet, which uses compro- of the three parts of a botnet (bot, C&C server, or botmaster, mised machines for computation. as has been explained above and in fig. 1). Although this work concentrates on bot detection, the topologies described Manuscript received . in section III and the protocols described in section IV also G. Vormayr, T. Zseby, and J. Fabini are with Institute of Telecommunica- tions, TU Wien, Vienna, Austria (email: ffi[email protected]) concern the C&C server. Furthermore, the C&C server is part Digital Object Identifier . of the presented communication patterns in section VI. A 0000–0000/00$00.00 c 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://wwww.ieee.org/publications standards/publications/rights/index.html for more information. c 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. This article is the accepted version of ”Botnet Communication Patterns” in the journal ”Communications Surveys & Tutorials”. Citation information: DOI 10.1109/COMST.2017.2749442 Abstract: http://ieeexplore.ieee.org/document/8026031/ 2 CC Bot Bot Operating System Application A Bot Binary detector (b) Application C Host Access router Internet detector (a) Bot Bot Bot Bot Botmaster Fig. 1. Botnet overview example consisting of a centralized botnet including botmaster, C&C server, communication channels, bots, and bot binary. Additionally, example network-based detector positions (a) and (b) are depicted. Detector (a) acts as an intermediate node while (b) is included in a router. communication pattern is a sequence of exchanged messages “abnormal” behavior (e.g., botnet C&C traffic) and to find needed for a specific communication scenario. Thus, this work characteristic patterns or properties (e.g., packet timing, packet can also be used as a base for building a C&C server detector. count, packet length). These properties are called features. Since network communication is paramount for a botnet, it The same can be achieved by observing “normal” behavior has to be present and can therefore be used for bot detection. (i.e., legitimate traffic captured during network operation), This is called network-based bot detection. Example positions extracting features, and defining network traffic that does not for network-based botnet detectors can be seen in fig. 1 match these patterns as an anomaly. (Detectors (a) and (b)). No inspection of the payload is needed, and therefore The least complex approach for implementing a network- this approach works with encrypted or obfuscated network based detector is signature-based detection or syntactic detec- traffic. Since this technique tries to find generalized properties tion [9]. This technique works by first collecting communica- of unwanted traffic, it can theoretically be used to detect tion from known botnets. Next, representative byte-sequences previously unknown botnets. or text excerpts used by the targeted botnets during C&C A simple way to implement this procedure is to use machine communication can be extracted from these samples. In the learning. Machine learning is a process which modifies the last step these results are compared with unknown traffic. If parameters of an algorithm until the algorithm output matches there is a match, bots have been detected. the expected output for given inputs. This is called training. A Since the above mentioned byte-sequences or text excerpts trained machine learning algorithm can be an approximation need to match exactly, a straightforward countermeasure for of the dependencies between the trained parameters and the hiding from such detection techniques is to modify the used expected results. Since there is not necessarily an underlying C&C commands. A generalization of this idea is to add correlation in the data, the results have to be validated and random sequences at the beginning or at arbitrary positions used with caution. of a C&C command, which is called padding. Adding data Various detectors have been proposed that train different at the beginning can be used to exploit length restrictions in machine learning techniques with observed botnet traffic. Saad detectors, while adding patterns at other

Load more