2012-ENST

EDITE - ED 130

Ph.D. ParisTech

DISSERTATION

In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy from

TELECOM ParisTech

Specialization « Internet and Systems Security »

presented and defended by

Leyla BILGE on the 15th of December 2011

Network-Based Botnet Detection

Thesis supervisor : Prof. Engin Kirda

Jury
Mr. Herbert BOS, Prof., Vrije Universiteit, Amsterdam, Reviewer
Mr. Christopher Kruegel, Prof., University of California, Santa Barbara, Reviewer
Mr. Marc Dacier, Prof., Symantec, Examiner
Mr. Refik Molva, Prof., Eurecom, Sophia Antipolis, Examiner

TELECOM ParisTech
École de l'Institut Télécom - membre de ParisTech

2012-ENST

EDITE - ED 130

Doctorat ParisTech

THÈSE

pour obtenir le grade de docteur délivré par

TELECOM ParisTech

Spécialité « Sécurité d’Internet et des systèmes »

présentée et soutenue publiquement par

Leyla BILGE le 15 Décembre 2011

La Détection des Botnets par l'Analyse du Réseau

Directeur de thèse : Prof. Engin KIRDA

Jury
Mr. Herbert BOS, Prof., Vrije Universiteit, Amsterdam, Rapporteur
Mr. Christopher Kruegel, Prof., University of California, Santa Barbara, Rapporteur
Mr. Marc Dacier, Prof., Symantec, Examinateur
Mr. Refik Molva, Prof., Eurecom, Sophia Antipolis, Examinateur

TELECOM ParisTech
École de l'Institut Télécom - membre de ParisTech

Bubacığıma...

Abstract

The days when the Internet used to be an academic network with no malicious activity are long gone. Today, there is a high incentive for cyber-criminals to engage in malicious, profit-oriented illegal activity on the Internet. A popular tool of choice for digital criminals today are bots. Compared to other types of malware, the distinguishing characteristic of a bot is its ability to establish a command and control (C&C) channel that allows an attacker to remotely control or update a compromised machine. A number of bot-infected machines that are combined under the control of a single, malicious entity are referred to as a botnet. Such botnets are often abused as platforms to launch denial of service attacks, to send spam, or to host scam pages.

In this thesis, we propose three network-based botnet detection techniques. Each technique builds its detection models by analyzing a different type of network data: the first detection technique performs packet-level inspection. The second one analyzes DNS traffic to find domains that are abused for different kinds of malicious purposes, including serving as command and control servers. Finally, the last one detects command and control servers by analyzing NetFlow data.

We propose a detection approach to identify single, bot-infected machines without any prior knowledge about command and control mechanisms or the way in which a bot propagates. Our detection model leverages the characteristic behavior of a bot, which is that it (a) receives commands from the botmaster, and (b) carries out some actions in response to these commands. The basic idea of our system is that we can generate detection models by observing the behavior of bots that are captured in the wild. Based on the observations of commands and responses, we generate detection models that can be deployed to scan network traffic for similar activity, indicating the fact that a machine is infected by a bot.

We introduce a passive DNS analysis approach and a detection system, Exposure, to effectively and efficiently detect domain names that are involved in malicious activity. We use a set of features that allow us to characterize different properties of DNS names and the ways that they are queried. Our experiments with a large, real-world data set consisting of 100 billion DNS requests, and a real-life deployment for two weeks in an ISP, show that our approach is scalable and that we are able to automatically identify unknown malicious domains that are misused in a variety of malicious activity (such as for botnet command and control, spamming, and phishing).

Finally, we present Disclosure, a large-scale, wide-area botnet detection system that incorporates a combination of novel techniques to overcome the challenges imposed by the use of NetFlow data. In particular, we identify several groups of features that allow Disclosure to reliably distinguish C&C channels from benign traffic using NetFlow records: (i) flow sizes, (ii) client access patterns, and (iii) temporal behavior. We demonstrate that these features are not only effective in detecting current C&C channels, but that they are relatively robust against expected countermeasures future botnets might deploy against our system. Furthermore, these features are oblivious to the specific structure of known botnet C&C protocols.

Résumé

La période où Internet était un réseau universitaire sans activités malveillantes est révolue depuis longtemps. De nos jours, les cyber-criminels ont beaucoup plus de raisons et de motivations pour conduire des activités illégales à but lucratif. L'un des outils les plus reconnus des criminels numériques est le bot. Un botnet correspond à un ensemble de plusieurs machines « infectées » qui sont sous le contrôle d'une seule entité malveillante. Ces botnets sont souvent utilisés comme des plateformes pour lancer des attaques de déni de service, pour envoyer des spams ou pour héberger des pages d'arnaque.

Cette thèse propose trois techniques de détection de botnets en se basant sur l'analyse du réseau. Chaque technique modélise ces détections en analysant différents types de données : la première technique effectue une inspection des données au niveau paquet. La deuxième technique analyse le trafic DNS pour retrouver des domaines qui sont utilisés à des fins malveillantes, notamment comme serveurs de commande et de contrôle. Enfin, la dernière technique détecte des serveurs de commande et de contrôle en analysant les données de flux réseau.

Table des matières

Abstract  i
Résumé  iii
Contents  v
List of Figures  ix
List of Tables  x
Acronyms  xiii

1 Introduction  1
1.1 Botnet Detection Through Network Level Packet Inspection  3
1.2 Botnet Detection Through Passive DNS Analysis  5
1.3 Botnet Command and Control Server Detection Through Netflow Analysis  7
1.4 Contributions  9
1.5 Outline  10

2 The Malware Evolution: Botnets  11
2.1 Botnet Characteristics  12
2.1.1 Botnet Command and Control Infrastructures  12
2.1.2 Bot Propagation Mechanisms  15
2.1.3 Malicious Activities  16
2.2 Historical Evolution of Botnets  19

3 State of the Art  23
3.1 General malware detection  23
3.2 Network intrusion detection  24
3.3 Signature generation  24
3.4 Botnet analysis and defense  25
3.5 Using DNS Analysis Techniques for Detecting Botnets  26
3.5.1 Identifying Malicious Domains  26


3.5.2 Identifying Infected Machines by Monitoring Their DNS Activities  28
3.5.3 Generic Identification of Malicious Domains Using Passive DNS Monitoring  28
3.6 Anomaly Detection Through NetFlow Analysis  28
3.7 Botnet Detection with NetFlow Analysis  30

4 Botnet Detection Through Network Level Packet Inspection  31
4.1 System Overview  32
4.1.1 Detection Models  32
4.1.2 Model Generation  34
4.2 Analyzing Bot Activity  36
4.2.1 Locating Bot Responses  36
4.2.2 Extracting Model Generation Data  39
4.3 Generating Detection Models  40
4.3.1 Command Model Generation  41
4.3.2 Response Model Generation  43
4.3.3 Mapping Models into Bro Signatures  44
4.4 Evaluation  44
4.4.1 Capturing Bot Traffic  45
4.4.2 Generating Signatures  46
4.4.3 Detection Capability  48
4.4.4 Real-World Deployment  49
4.4.5 Examples And Comparison To Hand-tuned Signatures  51
4.5 Limitations  54

5 Botnet Detection Through Passive DNS Analysis  55
5.1 Overview  56
5.1.1 Extracting DNS Features for Detection  56
5.1.2 Architecture of EXPOSURE  57
5.1.3 Real-Time Deployment  58
5.2 Feature Selection  58
5.2.1 Time-Based Features  58
5.2.2 DNS Answer-Based Features  63
5.2.3 TTL Value-Based Features  64
5.2.4 Domain Name-Based Features  65
5.3 Building Detection Models  66
5.3.1 Constructing the Training Set  66
5.3.2 The Initial Period of Training  67

5.3.3 The Classifier  67
5.4 Evaluation  69
5.4.1 DNS Data Collection for Offline Experiments  69
5.4.2 Evaluation of the Classifier  70
5.4.3 Experiments with the Recorded Data Set  71
5.4.4 Real-World, Real-Time Detection with EXPOSURE  73
5.4.5 Comparison with Previous Work  75
5.5 Real-World Deployment of EXPOSURE  78
5.6 Limitations  81

6 Botnet Command and Control Server Detection Through Netflow Analysis  83
6.1 System Overview  84
6.2 Feature Selection and Classification  85
6.2.1 NetFlow Attributes  85
6.2.2 Disclosure Feature Extraction  87
6.2.3 Building the Detection Models  91
6.3 False Positive Reduction  92
6.4 Evaluation  94
6.4.1 The NetFlow Data Sets  95
6.4.2 The Ground-Truth Data Sets  95
6.4.3 Labeled Data Set Detection and False Positive Rates  96
6.4.4 Real-Time Detection  99
6.4.5 Performance Evaluation  102
6.4.6 Deployment Considerations  102
6.4.7 Resilience to Evasion  103

7 Concluding Remarks 105

Appendices 107

A Résumé étendu  109
A.1 Détection Botnet Par Packet Inspection Réseau  111
A.2 Détection Botnet par DNS Analyse Passive  113
A.3 Détection des Serveurs de Commande et de Contrôle en Analysant les Données du Flux Réseau  116
A.4 Contributions  118

Table des figures

2.1 Star C&C Topology  13
2.2 Multi-Server C&C Topology  14
2.3 Hierarchical C&C Topology  14
2.4 Random C&C Topology  15

4.1 Automatically-generated Bro signature and corresponding behavior profile for an IRC bot.  47
4.2 Automatically-generated Bro signature and corresponding behavior profile for Kraken.  51
4.3 Hand-tuned Snort signature for a family of HTTP bots.  52
4.4 Automatically-generated Bro signature and corresponding behavior profile for the Storm Worm.  52
4.5 Hand-tuned Snort signature for the Storm Worm.  52
4.6 Automatically-generated detection model for an IRC bot using encrypted communication.  53

5.1 Overview of EXPOSURE  57
5.2 Percentage of misclassified instances  68
5.3 Classification accuracy (AUC = Area Under the ROC Curve)  70
5.4 The effect of the minimum request count on detection rate  71
5.5 The first time a domain is queried and the first time it is detected  75
5.6 Number of IP addresses mapped to the malicious domains detected by EXPOSURE  78
5.7 Number of domains detected on a daily basis by EXPOSURE  79
5.8 The percentage of malicious domains with specific top-level domains  80
5.9 The distribution of malicious domains according to their lifetime  81


6.1 The system architecture of Disclosure. In the training phase (upper half), labeled training samples are used to build detection models. In the detection phase (lower half), the detection models are used to classify IP addresses as benign or associated with C&C communications.  84
6.2 Detection rates (DT) and false positive (FP) rates for different feature combinations. We note that the DT:FP ratio is most favorable when all features are used in the detection procedure.  90
6.3 Area under ROC curves with different training set lengths for N1 and N2.  97
6.4 Classification accuracy for each data set (N1 and N2) with MinFlows ∈ {20, 50}.  98
6.5 Port distributions of the C&C servers detected by Disclosure for both N1 and N2, with and without AS reputation scores.  101

Liste des tableaux

4.1 Network features to characterize bot behavior.  37
4.2 Number of detection models (DM) and token sequences (TS) for each bot family.  46
4.3 Results from real-world deployments.  49
4.4 Comparison of the detection performance of our detection models vs. BotHunter.  51

5.1 Features (LMS = Longest Meaningful Substring)  59
5.2 Tests for False Positives  74
5.3 Information on the detected malicious domains  76

6.1 Summary statistics for each of the two NetFlow data sets for N1 and N2.  94
6.2 IP addresses in our labeled data set derived from data observed in N1 and N2.  95
6.3 Servers flagged as malicious by Disclosure for each of the networks N1 and N2 (without incorporating reputation scores).  99
6.4 Servers flagged as malicious by Disclosure for each of the networks N1 and N2 (incorporating reputation scores).  100

Acronyms

These are the main acronyms used in this document. The meaning of an acronym is usually indicated once, when it first occurs in the text.

C&C  Command and Control
P2P  Peer-to-Peer
DNS  Domain Name System
TTL  Time to Live
IRC  Internet Relay Chat
HTTP  Hypertext Transfer Protocol
AUC  Area Under the ROC Curve
DT  Detection Rate
FP  False Positive Rate
LMS  Longest Meaningful Substring
SCADA  Supervisory Control and Data Acquisition
AV  Anti-Virus
ISP  Internet Service Provider
DDoS  Distributed Denial of Service Attack
DGA  Domain Generation Algorithm
IDS  Intrusion Detection System
URL  Uniform Resource Locator
CPD  Change Point Detection Algorithm
SMTP  Simple Mail Transfer Protocol
AS  Autonomous System

Chapitre 1

Introduction

The days when the Internet used to be an academic network with no malicious activity are long gone. Today, the Internet has become a critical infrastructure, and it now plays a crucial role in communication, finance, commerce, and information retrieval. It has been reported that there are more than 2.7 billion web pages on the Internet now [108].

Unfortunately, as a technology becomes popular, it also attracts people with malicious intentions. In fact, digital crime is a growing challenge for law enforcement agencies. As Internet-based attacks are easy to launch and difficult to trace back, such crimes are not easy to prosecute and bring to justice. As a result, there is a high incentive for cyber-criminals to engage in malicious, profit-oriented illegal activity on the Internet. Regrettably, the number and sophistication of Internet-based attacks have been steadily increasing in the last ten years [97].

A popular tool of choice for digital criminals today are bots. A bot is a type of malware that is created with the intent of compromising and taking control of hosts on the Internet. It is typically installed on the victim's computer either by exploiting a software vulnerability in the web browser or the operating system, or by using social engineering techniques to trick the victim into installing the bot herself. Compared to other types of malware, the distinguishing characteristic of a bot is its ability to establish a command and control (C&C) channel that allows an attacker to remotely control or update a compromised machine [40].

A number of bot-infected machines that are combined under the control of a single, malicious entity (called the botmaster) are referred to as a botnet. Such botnets are often abused as platforms to launch denial of service attacks [71], to send spam [54, 84], or to host scam pages [13]. Using a botnet, the attackers also typically steal sensitive information from a victim's machine (e.g., credit card numbers, chat logs, social network account credentials, etc.) [92].

Botnets have also been reported to have been used in attacks against nations – both intentionally, and by coincidence. For example, in 2007, there was a deliberate and organized bot-based distributed denial of service attack against the critical infrastructures of Estonia [41]. In 2009, Conficker infected the computers of three European armies. Hence, French fighter planes were prevented from taking off, the networks of the British and German army bases were partially shut down, and a number of UK police forces had to disconnect their networks from the Internet [98]. Recently, the Stuxnet botnet attacked a critical infrastructure of a specific nation [74, 96]. In fact, Stuxnet appears to have been written specifically to attack a particular brand of Supervisory Control and Data Acquisition (SCADA) systems. SCADA systems are typically responsible for the operation of key components in power plants, pipelines, power distribution networks, and other similar industrial systems.

Traditional means of defense against bots, as well as other types of malware, rely on anti-virus (AV) software installed on end users' machines. Unfortunately, as the existence of numerous botnets demonstrates, these systems are insufficient. The reason is that they rely on signatures of known samples, a well-documented limitation [32] that makes it difficult to keep up with the fast evolution of malware. To mitigate this limitation, a number of host-based defense systems have been introduced. These systems use static [33, 62] or dynamic [59, 111] code analysis techniques to capture the behavior of unknown programs. By comparing the observed behavior to a model that specifies characteristics of certain types of malware, previously unknown instances of malicious code can be identified. However, although useful, these systems are problematic in practice, as they incur a considerable runtime overhead and require each user to install the analysis platform.

To complement host-based analysis techniques, it is desirable to have a network-based detection system available that can monitor network traffic for indications of bot-infected machines. In practice, network-based detection systems are preferable to host-based ones. One reason is that network-based detection systems have complete visibility over the network, while host-based techniques target only single individuals. Unfortunately, in the security ecosystem, users are usually the weakest link. Therefore, network administrators often employ network-based detection systems that can be deployed at a vantage point in the network. In this way, they are able to monitor the network activities of all individuals without depending on whether each computer is already protected by a host-based malware detection system. Another reason is that the nature of botnets lends itself to network-based detection. The characteristic feature of botnets is the command and control protocol they adopt.
As the command and control infrastructure relies on network communication, and is therefore observable in the network traffic, it is considered to be the weak spot of botnets. Even if botmasters apply obfuscation techniques to hide the semantics of the command and control protocol, since it is transmitted over the network, the packets remain visible on the network.

On these grounds, in this thesis we propose three network-based botnet detection techniques. Each technique builds its detection models by analyzing a different type of network data: the first detection technique performs packet-level inspection. The second one analyzes DNS traffic to find domains that are abused for different kinds of malicious purposes, including serving as command and control servers. Finally, the last one detects command and control servers by analyzing NetFlow data.

1.1 Botnet Detection Through Network Level Packet Inspection

Previous work to detect bots by performing network-level packet inspection has proceeded along two main lines. The first line of research uses vertical correlation techniques. These techniques focus on the detection of individual bots, typically by checking for traffic patterns or content that reveal command and control traffic or malicious, bot-related activities. These systems require prior knowledge about the command and control channels and the propagation vectors of the bots that they can detect. For example, Rishi [49] analyzes IRC traffic for nicknames that are frequently used by bots, while the system proposed by Binkley and Singh [26] checks for suspicious IRC traffic statistics. BotHunter [52] is more advanced in that the tool combines alerts from both anomaly-based and signature-based intrusion detection systems to identify bot-related traffic. Nevertheless, as pointed out by the authors in a follow-up paper [51], BotHunter relies on the fact that the “bot behavior follows a pre-defined infection life cycle dialog model,” which is geared toward bots that use random scanning and well-known bot commands.

The second line of research to detect bots uses horizontal correlation approaches to analyze the network traffic for patterns that indicate that two or more hosts behave similarly. Such similar patterns are often the result of a command that is sent to several members of the same botnet, causing the bots to react in the same fashion (e.g., by starting to scan or to send spam). The salient property of techniques that use horizontal correlation, such as BotSniffer [53], BotMiner [51], and TAMD [86], is that they do not require a priori information about the way in which the command and control channel is implemented. The drawback of these approaches is that they cannot detect individual bots. That is, it is necessary that at least two hosts in the monitored network(s) are members of the same botnet. Given the general trend towards smaller botnets [36] and the possibility for a botmaster to assign two bots in the same network range to different botnets, this is a significant drawback.

The first botnet detection technique [109] presented in this thesis proposes a detection approach to identify single, bot-infected machines without any prior knowledge about command and control mechanisms or the way in which a bot propagates. Our detection model leverages the characteristic behavior of a bot, which is that it (a) receives commands from the botmaster, and (b) carries out some actions in response to these commands. Similar to previous work, we assume that the command and response activity results in some kind of network communication that can be observed.

The basic idea of our system is that we can generate detection models by observing the behavior of bots that are captured in the wild. More precisely, by launching a bot in a controlled environment and recording its network activity (traces), we can observe the commands that this bot receives as well as the corresponding responses. To this end, we present techniques that allow us to identify points in a network trace that likely correlate with response activity. Then, we analyze the traffic that precedes this response to find the corresponding command. Based on the observations of commands and responses, we generate detection models that can be deployed to scan network traffic for similar activity, indicating the fact that a machine is infected by a bot. Our approach produces specific detection models that are tailored to bot families or groups of bots related by a common C&C infrastructure. Because the system is automated, however, it is easy to quickly generate new models for bots that implement novel commands and responses. This is independent of any prior knowledge of the protocol or the commands that the bot uses.
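As a purely illustrative sketch of the kind of command-and-response model described above, the fragment below matches a hypothetical token sequence against an observed payload and checks whether the subsequent network behavior looks like a response. The token sequence, the thresholds, and the flow representation are invented for illustration; they are not the models actually generated in Chapter 4, which are expressed as Bro signatures.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    """Simplified unidirectional flow: payload plus a few aggregate counters."""
    payload: bytes
    new_connections: int   # outgoing connection attempts observed after this flow
    bytes_sent: int

def matches_token_sequence(payload: bytes, tokens: list[bytes]) -> bool:
    """Return True if all tokens occur in the payload in the given order."""
    pos = 0
    for tok in tokens:
        idx = payload.find(tok, pos)
        if idx < 0:
            return False
        pos = idx + len(tok)
    return True

# Hypothetical command model: token sequence observed in C&C traffic.
COMMAND_TOKENS = [b"PRIVMSG", b".advscan", b"-s"]

# Hypothetical response model: thresholds on network-level behavior.
def looks_like_response(flow: Flow) -> bool:
    return flow.new_connections > 20 or flow.bytes_sent > 50_000

def detect(incoming: Flow, subsequent: Flow) -> bool:
    """Flag a host if a modeled command is followed by modeled response activity."""
    return matches_token_sequence(incoming.payload, COMMAND_TOKENS) and \
           looks_like_response(subsequent)

if __name__ == "__main__":
    cmd = Flow(b":op PRIVMSG #chan :.advscan dcom135 -s", 0, 120)
    resp = Flow(b"", new_connections=87, bytes_sent=4_000)
    print(detect(cmd, resp))  # True: command tokens seen, scanning-like response follows
```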

1.2 Botnet Detection Through Passive DNS Analysis

Since the first malware appeared in the wild, there has been an ongoing arms race between malware authors and malware defense mechanisms. Each time a malware detection system was developed, malware modified itself to evade it. This arms race has driven the evolution of malware. Botnets are another type of malware, and they too have developed techniques to thwart existing botnet detection mechanisms. As a first step, they applied encryption or obfuscation to hide the internals of their C&C infrastructure. Unfortunately, most of the botnet detection systems that perform network-level packet inspection, including the system described above, are limited by the fact that they cannot cope with botnets that obfuscate or encrypt their traffic. Therefore, the need for a new, complementary botnet detection system is obvious.

One of the technical problems that attackers face when designing their malicious infrastructures is the question of how to implement a reliable and flexible server infrastructure and command and control mechanism. Ironically, the attackers are faced with the same engineering challenges as global enterprises that need to maintain a large, distributed, and reliable service infrastructure for their customers. For example, in the case of botnets, the attackers need to efficiently manage remote hosts that may easily consist of thousands of compromised end-user machines. Obviously, if the IP address of the command and control server is hard-coded into the bot binary, there exists a single point of failure for the botnet. That is, from the point of view of the attacker, whenever this address is identified and taken down, the botnet is lost.

The Domain Name System (DNS) is a hierarchical naming system for computers, services, or any resource connected to the Internet. Clearly, as it helps Internet users locate resources such as web servers, mailing hosts, and other online services, DNS is one of the core and most important components of the Internet. Unfortunately, besides being used for obvious benign purposes, domain names are also popular for malicious use. For example, domain names are increasingly playing a role in the management of botnet command and control servers, download sites where malicious code is hosted, and phishing pages that aim to steal sensitive information from unsuspecting victims.

In order to better deal with the complexity of a large, distributed infrastructure, botnets have been increasingly making use of domain names. By using DNS, botmasters acquire the flexibility to change the IP address of the malicious servers that they manage. Furthermore, they can hide their critical servers behind proxy services (e.g., using Fast-Flux [100]) so that their malicious server is more difficult to identify and take down. Using domain names gives botnet controllers the flexibility of migrating their servers with ease. That is, the botnet infrastructures become more “fault-tolerant” with respect to the IP addresses where they are hosted.

Our key insight is that as malicious services (e.g., botnets) are often as dependent on DNS services as benign services, being able to identify malicious domains as soon as they appear would significantly help mitigate many Internet threats that stem from botnets. Also, our premise is that when looking at large volumes of data, DNS requests for benign and malicious domains should exhibit enough differences in behavior that they can automatically be distinguished.
With our second botnet detection technique [25], we introduce a passive DNS analysis approach and a detection system, EXPOSURE, to effectively and efficiently detect domain names that are involved in malicious activity. We use 15 features (9 of which are novel and have not been proposed before) that allow us to characterize different properties of DNS names and the ways that they are used (i.e., queried).

Note that researchers have used DNS before as a way to analyze, measure, and estimate the size of existing botnets (e.g., [48, 56, 93]). Some solutions have then attempted to use DNS traffic to detect malicious domains of a certain type (e.g., [78, 100]). However, all these approaches have only focused on specific classes of malware (e.g., only malicious Fast-Flux services). Our approach, in comparison, is much more generic and is not limited to certain classes of attacks (e.g., only botnets).

In our approach, based on the features that we have identified and a training set that contains known benign and malicious domains, we train a classifier for DNS names. Being able to passively monitor real-time DNS traffic allows us to identify malware domains that have not yet been revealed by pre-compiled blacklists. Furthermore, in contrast to active DNS monitoring techniques (e.g., [100]) that probe for domains that are suspected to be malicious, our analysis is stealthy, and we do not need to trigger specific malicious activity in order to acquire information about the domain. The stealthy analysis that we are able to perform has the advantage that our adversaries, the cyber-criminals, have no means to block or hinder the analysis that we perform (in contrast to approaches such as [100]).

To date, only one system has been proposed that aims to detect malicious domains generically using passive DNS analysis. In a concurrent and independent work that was very recently presented by Antonakakis et al. [14], the authors present Notos. Notos dynamically assigns reputation scores to domain names whose maliciousness has not been discovered yet. In comparison, our approach is not dependent on large amounts of historical maliciousness data (e.g., IP addresses of previously infected servers), requires less training time, and, unlike Notos, is also able to detect malicious domains that are mapped to a new address space each time and never used for other malicious purposes again.
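The following sketch illustrates the general training step described above: a decision-tree classifier is fit on a handful of aggregate DNS features per domain. EXPOSURE itself uses 15 features and a J48 (C4.5) decision tree; the feature names, the toy training values, and the use of scikit-learn here are simplifying assumptions for illustration only.

```python
# Minimal sketch: train a decision-tree classifier on a few aggregate DNS
# features per domain. The feature values below are invented; the real system
# (Chapter 5) uses 15 features and a J48 (C4.5) tree.
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["daily_query_count", "distinct_ips", "avg_ttl", "numeric_char_ratio"]

# Each row: one domain, aggregated over the monitoring window (hypothetical values).
train_X = [
    [12000,  1, 86400, 0.00],   # popular benign domain
    [  300,  2, 43200, 0.05],   # small benign domain
    [ 4500, 95,   120, 0.30],   # fast-flux-like malicious domain
    [   40,  1,    60, 0.55],   # DGA-like malicious domain
]
train_y = [0, 0, 1, 1]          # 0 = benign, 1 = malicious

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(train_X, train_y)

# Classify a new, unseen domain from its aggregated DNS behavior.
unknown = [[700, 60, 180, 0.40]]
print("malicious" if clf.predict(unknown)[0] == 1 else "benign")
```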

1.3 Botnet Command and Control Server Detection Through Netflow Analysis

Although EXPOSURE performs well in detecting botnets that utilize DNS names for contacting the C&C server, one current problem of the system is that it can be evaded if the botnet's DNS usage is designed to be similar to that of benign servers (e.g., normal TTL values, a mapping to a small number of IPs, etc.). A second problem with DNS-based detection is that a request from the client to a malicious DNS domain does not necessarily indicate that the client has been infected. In fact, such a request might simply have been caused by a failed infection attempt. Furthermore, there is also a chance that the queried domain is not really malicious (e.g., if the attackers are using a benign, compromised domain to host a C&C server). A third problem is that a DNS query for a malicious domain might be cached and, hence, might not appear in the monitored DNS traffic. As a result, connections by the compromised hosts to the C&C server may be missed.

Another problem that all previous network-based botnet detection techniques share is that, even though they are effective under certain circumstances, they do not scale beyond a single administrative domain while retaining useful detection accuracy. This limitation restricts the application of automated botnet detection systems to those entities that are informed or motivated enough to deploy them. Thus, we have the current state of botnet mitigation, where small pockets of the Internet are fairly well protected against infection while the majority of endpoints remain vulnerable.

This situation is not ideal. Botnets are an Internet-wide problem that spans individual administrative domains and, therefore, a problem that requires an Internet-scale solution. In particular, botnets can continue to wreak havoc upon the Internet despite the deployment of localized detection systems by focusing on propagation through less well-protected populations.

Two of the primary factors preventing the development of effective large-scale, wide-area botnet detection systems are seemingly contradictory. On the one hand, technical and administrative restrictions result in a general unavailability of raw network data that would facilitate botnet detection on a large scale. On the other hand, were this data available, real-time processing at that scale would be a formidable challenge. While the ideal data source for large-scale botnet detection does not currently exist, there is, however, an alternative data source that is widely available today: NetFlow data [35].

NetFlow data is often captured by large ISPs using a distributed set of collectors for auditing and performance monitoring across backbone networks. While it is otherwise extremely attractive, NetFlow data imposes several challenges for performing accurate botnet detection. First, and perhaps most critically, NetFlow records do not include packet payloads; rather, flow records are limited to aggregate metadata concerning a network flow, such as the flow duration and the number of bytes transferred. Second, NetFlow records are half-duplex; that is, they only record one direction of a network connection. Third, NetFlow data is often collected by sampling the monitored network, often at rates several orders of magnitude removed from the real traffic.

Each of these characteristics of NetFlow data complicates the development of an effective botnet detector over this domain.
The detector must be able to distinguish between benign and malicious network traffic without access to network payloads, which are the component of network data that carries direct evidence of malicious behavior. The detector must also be able to recognize weak signals indicating the presence of a botnet, due to the combined effects of half-duplex capture and aggressive sampling.

As a novel follow-up to our previous botnet detection techniques, in the third part of the thesis we present Disclosure, a large-scale, wide-area botnet detection system that incorporates a combination of novel techniques to overcome the challenges imposed by the use of NetFlow data. In particular, we identify several groups of features that allow Disclosure to reliably distinguish C&C channels from benign traffic using NetFlow records: (i) flow sizes, (ii) client access patterns, and (iii) temporal behavior. We demonstrate that these features are not only effective in detecting current C&C channels, but that they are relatively robust against expected countermeasures future botnets might deploy against our system. Furthermore, these features are oblivious to the specific structure of known botnet C&C protocols.

While the aforementioned features are sufficient to capture core characteristics of generic C&C traffic, they also generate false positives in isolation. To reduce Disclosure's false positive rate, we incorporate a number of external reputation scores into our system's detection procedure. These additional signals function as a filter that reduces Disclosure's false positive rate to a level where the system can feasibly be deployed on large-scale networks.
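To illustrate how statistical features can be derived from payload-free NetFlow records, the sketch below computes simple flow-size and inter-arrival statistics per server. The record format and the specific features are simplified assumptions that only approximate the feature groups used by Disclosure in Chapter 6.

```python
# Minimal sketch of per-server feature extraction from NetFlow-style records,
# approximating the flow-size and temporal-behavior feature groups of Chapter 6.
# The record format and feature set are simplified assumptions.
import statistics
from collections import defaultdict

# (server_ip, client_ip, start_time_seconds, bytes_transferred)
flows = [
    ("198.51.100.7", "10.0.0.2",    0, 312),
    ("198.51.100.7", "10.0.0.2",  600, 310),
    ("198.51.100.7", "10.0.0.5", 1200, 315),
    ("203.0.113.9",  "10.0.0.2",   40, 1500),
    ("203.0.113.9",  "10.0.0.3",  900, 88000),
]

def extract_features(records):
    by_server = defaultdict(list)
    for server, client, start, size in records:
        by_server[server].append((start, size))

    features = {}
    for server, recs in by_server.items():
        sizes = [size for _, size in recs]
        starts = sorted(start for start, _ in recs)
        gaps = [b - a for a, b in zip(starts, starts[1:])] or [0.0]
        features[server] = {
            "flow_size_mean": statistics.mean(sizes),
            "flow_size_stdev": statistics.pstdev(sizes),     # low for machine-generated C&C polling
            "unique_sizes": len(set(sizes)),
            "inter_arrival_stdev": statistics.pstdev(gaps),  # low for periodic check-ins
        }
    return features

for server, feats in extract_features(flows).items():
    print(server, feats)
```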

1.4 Contributions

In summary, this thesis makes the following contributions:
– We present three network-based botnet detection techniques that perform their analysis on different types of data: full network traffic, DNS traffic, and NetFlow traffic.
– We present a botnet detection system that performs network-level packet inspection. This system is a fully automated mechanism that generates bot detection models by observing the actual behavior of bot instances in a controlled environment, without making assumptions about the C&C mechanisms. This work has been published in ESORICS 2009.
– With the experiments we performed with the first detection method, we demonstrate the feasibility of our approach by generating detection models for various bot families (including those controlled via IRC and HTTP, as well as P2P). These models are effective in detecting bots with few false positives.
– We present another novel analysis technique for the detection of malicious domains that is based on passive DNS request analysis. Our technique does not rely on prior knowledge about the kind of service the malicious domain provides (e.g., phishing, Fast-Flux services, spamming, botnets that use a domain generation algorithm, etc.). This is significantly different from existing techniques that only target Fast-Flux domains used in botnet operations. Furthermore, our approach requires less training time and less training data than Notos [14], and does not have some of its limitations. This work has been published in NDSS 2011.
– We describe the implementation of our real-time detection system, which we call EXPOSURE. Our experimental results show that the technique we propose is scalable, and is able to accurately distinguish between malicious and benign domains with a low false positive rate.
– We present Disclosure, a large-scale, wide-area botnet detection system that reliably detects botnet C&C channels in readily-available NetFlow data using a novel set of robust statistical features. In particular, Disclosure does not assume a priori knowledge of particular C&C protocols.
– We incorporate several external reputation systems into Disclosure's detection procedure to further refine the accuracy of the system.
– We evaluate Disclosure over two real-world networks, and demonstrate its ability to detect both known and unknown botnet C&C servers at scales not previously achieved.

1.5 Outline

This dissertation is organized as follows:

Chapter 2: The Malware Evolution: Botnets. In this chapter, we explain botnet infrastructures, command and control topologies, and the historical evolution of botnets.

Chapter 3: State of the Art. In this chapter, we review the most recent work on botnet-related malware analysis topics.

Chapter 4: Botnet Detection Through Network Level Packet Inspection. In this chapter, we present our first botnet detection technique. This technique performs network-level packet inspection to generate detection models for bot-infected machines.

Chapter 5: Botnet Detection Through Passive DNS Analysis. In this chapter, we explain EXPOSURE, a real-time detection system that aims to identify domains that are abused for nefarious purposes, such as hosting a botnet C&C server or serving as a dropzone that collects the sensitive information sent by bots.

Chapter 6: Detecting Botnet Command and Control Servers Through NetFlow Analysis. In this chapter, we explain DISCLOSURE, a system that detects botnet C&C servers by performing NetFlow analysis.

Finally, in Chapter 7, concluding remarks are given.

Chapitre 2

The Malware Evolution: Botnets

A botnet is a network that consists of a large number of compromised machines that can be controlled by malicious entities. These compromised machines are typically termed bots; they are also known as zombies or drones. The botnet controller (i.e., the botmaster or bot herder) sets up a command and control (C&C) channel to command her bots to perform malevolent activities such as sending spam, launching denial of service attacks, or stealing sensitive information from the user of the compromised machine. Unlike other types of attacks, botnets, which may consist of thousands of compromised hosts, can assemble a tremendous volume of aggregate computing power and can perform a variety of attacks against a wide range of targets.

In many respects, the botnets found in the wild today are a hybrid of previous malware. However, their C&C mechanism distinguishes them from other types of malware. The command and control protocol not only allows the attackers to remotely control their bots for nefarious purposes, but also gives them great flexibility to change their malicious infrastructure completely in response to emerging botnet detection mechanisms.

Botnets typically infect victim computers using many different techniques.

The most typical infection schemes include exploiting known vulnerabilities, luring Internet users into clicking links through social engineering techniques, and drive-by downloads. Botnets inherit their propagation mechanisms from worms, which are known to be their ancestors.

2.1 Botnet Characteristics

Today, botnet characteristics are still not well understood. Botnets are still evolving and are therefore considered an emerging threat. Botnets exhibit all the characteristics of other classes of malware such as worms, trojan horses, and rootkits. However, their defining characteristic is the communication channel established between the botmasters and the bots. In the following sections, while we mostly focus on the communication topologies that botnets use, we also briefly explain other characteristics such as bot propagation, exploitation, and attack mechanisms.

2.1.1 Botnet Command and Control Infrastructures

The botnet command and control infrastructure gives the bots the ability to communicate with the botmaster(s). In the wild, there are many different types of botnets that vary in size, in the malicious activities they perform, and in the network topologies they construct. Consequently, they use various C&C topologies. Most botnet detection techniques leverage the existence of command and control activity in the network. While some try to identify the command and control channel to find the location of the C&C server, others tackle the problem of finding the infected machines (i.e., the bots). That is, the component of botnets that is attacked by botnet defense mechanisms is the command and control protocol. The response of botnet creators is to enhance their command and control topology. Therefore, the botnet evolution is realized through the enhancements made to the command and control topology.

The command and control topologies that are widely employed can be categorized into two main groups: centralized and decentralized command and control. Star, multi-server, and hierarchical C&C topologies fall into the centralized category. On the other hand, the random C&C topology is decentralized.

– Star C&C Topology. The star topology has a centralized C&C server under the botmaster's control, as seen in Figure 2.1. All bot agents connect to this server to receive commands. The typical behavior of a newly infected machine is to establish a connection with the C&C server to join the botnet. This preconfigured behavior is also known

Figure 2.1: Star C&C Topology

as “phoning home”. Employing the star topology allows the botnet an efficient command and control transfer. However, if the server fails, the botnet fails; it is therefore very vulnerable to shutdowns. IRC and HTTP bots usually employ the star topology.
– Multi-Server C&C Topology. The multi-server C&C topology is an extended version of the star C&C topology. It has multiple C&C servers that are responsible for maintaining subsets of the botnet, as seen in Figure 2.2. The C&C servers in the topology communicate among themselves to integrate the sub-botnets. This topology is more robust than the star topology since it is more tolerant to failures: if a number of the servers fail, the remaining servers maintain the integrity of the botnet. Typically, the botmasters distribute the C&C servers across different countries such that the bots in those locations can communicate with the server in an efficient manner.
– Hierarchical C&C Topology. In this topology, there is a hierarchy among the bots. At the infection phase, a bot either becomes a proxy or a regular bot that contributes to the malicious activities commanded by the botmaster. The proxy bots are responsible for forwarding the commands they receive to the botnet. The components the bots are in contact with are the proxy bots, not the botmaster. The real C&C servers are hidden behind the proxies, as shown in Figure 2.3. Therefore, the bots are not aware of the rest of the botnet or of where the C&C server is located.
Botnets that employ a hierarchical C&C topology are very difficult to take down and to reverse-engineer in order to make estimations about their size and structure. This is because, even if one of the bots is captured by researchers, it is impossible to obtain more information than the IP

Figure 2.2: Multi-Server C&C Topology

Figure 2.3: Hierarchical C&C Topology

address or the domain name of the responsible proxy bot.
Hierarchical C&C topologies also allow the botmaster to split the botnet into sub-botnets so that she can rent or sell services to other botmasters. Clearly, this type of topology brings several advantages to the botnet owners. However, there can be a noticeable latency in transmitting commands to the bots, due to the fact that the commands traverse multiple points. This delay makes some malicious activities difficult to realize.
– Random C&C Topology. The random C&C topology is not centralized like the topologies listed above, but decentralized, as seen in Figure 2.4. The commands are not distributed from a central point; all bots play a role in relaying a command through the botnet. When a bot receives a command, it immediately triggers the command

Figure 2.4: Random C&C Topology

propagation module that transmits the command to its neighbors. A good example of botnets with random C&C topologies are peer-to-peer botnets.
Botnets with a random C&C topology are more resilient to shutdowns because they do not have a centralized C&C server and have multiple paths over which to conduct commands. However, once one of the bot agents is compromised by researchers or malware analysts, it is relatively easy to identify other members of the botnet through reverse-engineering. The reasoning behind this is that if the malware analysts are able to reverse-engineer the communication algorithm, they can join the botnet and start communicating with the other bot-infected machines.
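The structural difference between centralized and decentralized topologies can be illustrated with a small graph experiment: remove the best-connected node (the natural takedown target) and see how many bots can still exchange commands. The graphs below are toy examples, not measurements of real botnets.

```python
# Toy comparison of C&C topology resilience: after the best-connected node is
# taken down, how large is the biggest group of bots that can still exchange
# commands? Both graphs are invented examples, not real botnet measurements.
from collections import defaultdict, deque

def build(edges):
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def largest_component(adj, removed):
    """Size of the largest connected component once `removed` nodes are gone."""
    seen, best = set(), 0
    for start in adj:
        if start in removed or start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            for nxt in adj[queue.popleft()]:
                if nxt not in comp and nxt not in removed:
                    comp.add(nxt)
                    queue.append(nxt)
        seen |= comp
        best = max(best, len(comp))
    return best

bots = [f"bot{i}" for i in range(8)]

# Star: every bot talks only to a single central C&C server.
star = build([("c2", b) for b in bots])

# Random/P2P-like: bots relay commands among themselves (a ring plus two chords).
p2p = build([(bots[i], bots[(i + 1) % 8]) for i in range(8)]
            + [(bots[0], bots[4]), (bots[2], bots[6])])

for name, adj in [("star", star), ("p2p", p2p)]:
    hub = max(adj, key=lambda n: len(adj[n]))   # the natural takedown target
    print(name, "-> largest surviving group:", largest_component(adj, {hub}))
# Prints 1 for the star (the botnet collapses) and 7 for the P2P-like graph.
```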

2.1.2 Bot Propagation Mechanisms

One crucial phase in the lifecycle of a botnet is the infection phase. The malware infection might be realized in different ways: exploiting vulnerabilities on the host, luring the user into clicking a malicious URL, and employing social engineering techniques. Among these infection techniques, infection through vulnerability exploitation does not require user interaction.

Typically, botnets that propagate by exploiting known or zero-day vulnerabilities include a scanning component that is responsible for finding new vulnerable machines to infect. There exist several scanning mechanisms, which can be categorized into two groups: horizontal and vertical scanning. A horizontal scan probes a single port across a specified address space. In contrast, a vertical scan probes a range of ports on a single IP address. In order to find the IP address or the range to be scanned, the following methods are widely used:

– Random Scanning: The target to be scanned is determined by a random number generator. Thus, the effectiveness of the scanning strictly depends on the random number generator. Since it is quite difficult to develop a random number generator that can find vulnerable hosts or valid IP addresses, random scanning is not as effective as other scanning methods.
– Permutation Scanning: One reason why random scanning is not effective enough is the overlap it produces. Permutation scanning was designed to deal with this problem. It applies simple cryptography such that each malware sample can generate a different set of target IP addresses: the list of IP addresses is computed by a pseudo-random permutation function that takes a private key as a parameter (a brief sketch is given at the end of this section). In this way, permutation scanning solves the overlap problem by using cryptography.
– Hit-List Scanning: Hit-list scanning is the most efficient scanning technique, because the hit-list comes hard-coded in the malware. Since the binary includes the list, the size of the binary can be unacceptably large. During the spreading process, the IP addresses already scanned are removed from the list in the binary; therefore, the size of the binary decreases over time. Obviously, since each malware sample receives a different snippet of the list, hit-list scanning does not have the overlap problem.
– Combining the Techniques: Some of the more sophisticated worms seen in the wild, such as the Warhol worm, employ a combination of scanning techniques. For example, by combining permutation scanning and hit-list scanning, Warhol was able to attack all vulnerable machines in the world in less than fifteen minutes.

Although current botnets employ more sophisticated techniques to find and infect their victims, historically, very famous botnets such as Agobot, SDBot, SpyBot and GTBot had fairly simple propagation mechanisms that apply vertical and horizontal scanning. In order to detect such botnets, it is possible to develop statistical fingerprinting methods to identify bot scans.
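As a rough illustration of the permutation-scanning idea, the sketch below derives a key-dependent ordering of a small address block, so that two samples with different keys cover the same space in different orders. The block size, the key handling, and the use of HMAC-SHA-256 as the pseudo-random function are illustrative assumptions and do not correspond to any particular worm's implementation.

```python
# Illustrative keyed permutation over a small IPv4 block: each "sample" sorts
# the candidate addresses by a keyed hash, giving a deterministic but
# sample-specific scan order. Assumptions: /24 block, HMAC-SHA-256 as the
# pseudo-random function; real worms used far simpler (and weaker) schemes.
import hmac
import hashlib
import ipaddress

def scan_order(network: str, key: bytes):
    """Return all addresses of `network` in a key-dependent pseudo-random order."""
    addrs = list(ipaddress.ip_network(network).hosts())
    def rank(addr):
        digest = hmac.new(key, addr.packed, hashlib.sha256).digest()
        return int.from_bytes(digest[:8], "big")
    return sorted(addrs, key=rank)

order_a = scan_order("192.0.2.0/24", key=b"sample-A")
order_b = scan_order("192.0.2.0/24", key=b"sample-B")

print([str(ip) for ip in order_a[:5]])   # first targets for sample A
print([str(ip) for ip in order_b[:5]])   # a different starting sequence for sample B
assert set(order_a) == set(order_b)      # same coverage, different order
```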

2.1.3 Malicious Activities

The primary goals of botnets include information dispersion, information harvesting, and information processing. Information dispersion attacks include sending spam, launching distributed denial of service attacks, and click fraud. On the other hand, information harvesting involves attacks that steal sensitive information from the infected hosts to obtain identity data, financial data, private data, e-mail address books, or any other type of data that may exist on the host. The main intention of information processing is to use the distributed structure of botnets to crack passwords, hashes, etc.

Although some botmasters construct their botnets for fun or fame, the majority of current botnets are built for financial gain. For example, an organization that needs to advertise its products might wish to pay a botmaster to have its advertisements sent through spam. As another example, the stolen secret information might be directly used by the botmaster or sold to third parties.

Distributed Denial of Service Attacks

Botnets are widely used to perform distributed denial of service (DDoS) attacks. A DDoS attack is an attack that targets either a computer or a network in order to make a resource unavailable to its users. Typically, the loss of service or network connectivity is realized by consuming the bandwidth of the network or by overloading the network stack of the computer. A DDoS attack can be performed in a number of different ways, including:
– Consuming computational resources, e.g., bandwidth, disk space, or processor time.
– Corrupting configuration information, such as the routing table configuration.
– Disrupting state information, such as the unsolicited resetting of TCP sessions.
– Obstructing the communication media in order to prevent users from communicating with each other.

Today, it is very easy to mount DDoS attacks with the help of off-the-shelf tools [44]. There are different kinds of attacks that target TCP, UDP, or protocols at a higher level in the network stack:
1. TCP SYN flooding: A TCP SYN flooding attack is performed by sending many connection requests to the target computer in order to stress its processing ability. The half-open connections on the target machine exhaust data structures in the kernel; thus, the computer cannot accept new connections.
2. UDP flooding: The attacker aims to consume the network bandwidth and computational resources by sending a large number of UDP packets to several ports.
3. DDoS attacks targeting high-level protocols: These are the most dangerous form of DoS attacks, and they are not restricted to web services only. By creating more specific attacks that target high-level protocols, more effective results can be obtained. The web spidering attack, which starts from a given web site and then recursively requests all links on that site, is a good example of a DDoS attack that targets high-level protocols.

During the last ten years, Internet users have faced several serious DDoS attacks. In February 2000, an attacker launched DDoS attacks on several e-commerce companies and web sites. The attacks took the servers' services down for several hours. Later, the threat of DDoS attacks turned into real cybercrime. For example, a botnet targeted a betting company during the European soccer championship in 2004 and demanded money in exchange for letting the system operate again.

Spam

E-mail spamming, a.k.a. bulk e-mail or junk e-mail, is an attack that sends massive volumes of nearly identical messages to recipients by e-mail. Generally, such messages have commercial content. An e-mail is spam only if it is unsolicited and sent in bulk. The e-mail addresses used by spammers are collected from chat rooms, newsgroups, and websites, and by malware that harvests e-mail addresses from users' address books.

It has been reported that 80% of spam e-mails are sent by botnets. Typically, bots start a SOCKS v4/v5 proxy on the compromised host in order to use it for sending spam e-mails. Once they receive the spamming activation command from the botmaster, thousands of members of the botnet initiate a large-scale spam activity on the Internet. Some botnets concentrate only on harvesting valid e-mail addresses; such botnets sell the lists they gather to spammer botnets. Besides collecting the infected users' e-mail addresses, botnets crawl the web to gather e-mail addresses from web sites, newsgroups, special-interest group (SIG) postings, and chat-room conversations.

Click Fraud

In addition to the malicious activities that are directly involved in attacks, botnets can perform click fraud. Click fraud is performed by automatically and periodically querying particular websites in order to inflate the number of clicks, increase the search scores of the website, or manipulate votes.

Identity Theft

To steal sensitive information from infected hosts, botnets install keyloggers or sniff the traffic. Moreover, they install simple programs that collect usernames and passwords from the host and, by sniffing the network traffic, the passwords of other users in the network.

Phishing Mails

Phishing is a type of identity theft attack which aims to compromise sensitive information (e.g., passwords, credit card numbers) by masquerading as a trustworthy entity in an electronic communication. Phishing attacks use sophisticated social engineering techniques to persuade users to give away their secret information. There are different types of phishing attacks:
– Spoofing Mails and Web Sites: The earliest phishing attacks were e-mail based. The attackers tried to persuade the victim users to send their passwords and account information by sending spoofed e-mails. Although there are still many users that can be deceived by phishing mails, most users are aware that sensitive information must not be sent by e-mail. Thus, the attackers employed more sophisticated phishing techniques to deceive the victims. For example, some phishing attacks combine phishing mails and websites by sending the URL of the phishing web page in a mail that appears to come from a legitimate organization. After the user clicks the link in the mail, he is directed to a web site that looks identical to a familiar web site. Then, the user performs his normal actions, such as logging into the site or sending account information. Clearly, this reveals all the secret information to the attacker.
– Exploit-Based Phishing Attacks: Exploit-based phishing attacks are more complicated in nature because they exploit known vulnerabilities to install a backdoor. The backdoor collects the sensitive information and sends it back to the attackers.

2.2 Historical Evolution of Botnets

Today, botnets are known to be the most sophisticated malware. Ironically, the earliest bots were invented for benign purposes. These bots operated on the IRC [55] network, which was developed in the late 1980s. The IRC platform allows data dissemination among a large number of users with point-to-point or point-to-multipoint communication schemes. The bots created on the IRC network were used to entertain users by offering them game and messaging services. Unfortunately, not long after, the first malicious botnet, GTbot, emerged in the wild. The command and control infrastructure of GTbot was based on the mIRC client (mIRC.exe) with slight modifications. Afterwards, in late 1999, SANS Institute researchers discovered remotely executable code on thousands of Windows machines. Inspired by the remote-control nature of the code, they named the infected computers robots, which was later shortened to bots.

Before 2004, most of the famous botnets, such as SDbot, Agobot and Spybot, had IRC-based command and control infrastructures. However, effective botnet detection methods against IRC-based botnets led the botnets to evolve and build more sophisticated command and control topologies such as peer-to-peer (P2P) structures. Instead of using IRC as the C&C protocol, they adopted different protocols (e.g., HTTP). Moreover, in order to be more robust and fault tolerant against shutdowns, they built Fast-Flux Service Networks.

The most critical requirement for a botnet is to have a reliable C&C infrastructure, to prevent its bots from being left stranded when the connection with the C&C server cannot be established. The first botnets seen in the wild had centralized C&C servers whose IP addresses were hard-coded in the bot binaries. Such botnets were very vulnerable to shutdowns: to take the complete botnet down, it was enough to blacklist the IP address of the C&C server or shut it down. Botnet designers therefore developed a number of techniques to provide a malicious infrastructure whose members (i.e., bot agents) always remain connected to the botnet. These techniques not only drove the botnet evolution but also made botnets more resilient and robust against shutdowns.

The most effective technique for increasing lookup resilience is called fluxing. Fluxing can be implemented in two ways: IP fluxing and domain fluxing.

IP fluxing is realized by assigning a large number of IP addresses to a single domain name. Botnets that employ IP fluxing abuse the Round Robin feature of the Domain Name System (DNS). Typically, the botnet operators set the Time-to-Live (TTL) value of the domain to a low value (e.g., less than 300 seconds), and each time the domain receives a query, a different list of IP addresses is returned. There exist two types of IP fluxing methods: single-flux and double-flux. Single-flux employs the simple IP fluxing method: a domain name is associated with hundreds or even thousands of IP addresses, and each time a different set of them is returned within the DNS record. Double-flux, which is a more sophisticated version of single-flux, fluxes both the IP addresses of the domain name and the IP addresses of the DNS server.

Domain flux, on the other hand, associates a large number of domain names with a single or a few IP addresses of the C&C servers. Domain fluxing is realized either by domain wildcarding or by domain generation algorithms.
Domain wildcarding abuses the native DNS wildcard option (i.e., *) at a higher domain level to associate all sub-domains with the same IP address. This feature is employed mostly by spambots and by botnets that apply phishing techniques in their attacks, in order to hide sub-domains that consist of randomly generated strings. In recent years, botnets have also started to apply domain fluxing techniques in which the bot agents generate the domain name to be contacted at a specific time by means of a domain generation algorithm (DGA). The basic idea is that the botnet operators register a large number of domain names and associate each of them with the C&C server at a specific time. The domain generation module that ships with the bot binary generates the domain of the C&C server, taking the current time as a parameter. The domains generated by a DGA typically have a very short lifetime, such as one day. Therefore, it is very hard to investigate these domains to track the location of the C&C servers. The Conficker botnet [80], which was one of the ten most wanted botnets in 2009 and 2010, deploys DGA modules to its bot agents.
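To make the DGA mechanism described above concrete, the sketch below shows a minimal, hypothetical time-seeded generator. The hash function, label length, and TLD are illustrative assumptions only and do not reproduce the algorithm of Conficker or any other real malware family.

```python
import hashlib
from datetime import datetime, timezone

def generate_domains(date, count=50, tld=".info"):
    """Hypothetical DGA sketch : derive a daily list of candidate C&C domains
    from the current date, so that bot and operator compute the same list."""
    domains = []
    for i in range(count):
        seed = f"{date:%Y-%m-%d}-{i}".encode()
        digest = hashlib.md5(seed).hexdigest()
        # Map the hex digest onto a fixed-length lowercase label.
        label = "".join(chr(ord("a") + int(c, 16) % 26) for c in digest[:12])
        domains.append(label + tld)
    return domains

if __name__ == "__main__":
    today = datetime.now(timezone.utc).date()
    print(generate_domains(today)[:5])
```

Because the operator only needs to register one of the daily candidates shortly before it is used, defenders have to predict or block the entire candidate set, which is precisely what makes DGA domains short-lived and hard to track.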

Chapitre 3

State of the Art

Malware, and botnets in particular, pose a significant threat to the security of the Internet. As a result, there has been a strong interest in the research community to develop adequate defense solutions. Because this thesis proposes three botnet detection techniques that perform analysis on network packets, DNS data, and NetFlow data, it naturally touches on a number of related research areas.

3.1 General malware detection

Since bots are a certain type of malware, previous work on identifying malicious code is directly relevant. The most common approach to fighting malicious software is the commercial anti-virus scanner. Anti-virus scanners mostly rely on syntactic signatures that match parts of known malware samples. As a result, these systems cannot identify previously unknown malware programs or simple variations of existing ones [32].

To address the limitations of signature-based malware detection, researchers have proposed behavior-based techniques. These techniques attempt to characterize a program's behavior in a way that is independent of its binary representation. Christodorescu et al. use static program analysis techniques to identify semantically equivalent operations in malware variants [33]. Because malware authors can use code obfuscation techniques or self-modifying code, static malware analysis is difficult [72].

To address this shortcoming, researchers have proposed dynamic approaches [59, 111] that monitor the execution of a program to identify malicious behaviors. While the aforementioned techniques are powerful in detecting even previously unknown malware instances, they require extensive analysis of each suspicious program. This not only incurs a considerable performance overhead, it also requires that the analysis system be installed on every machine that should be protected. With the systems presented in this thesis, we provide network-based solutions that are more efficient than host-based malware detection systems.

3.2 Network intrusion detection

The purpose of network intrusion detection systems (IDS) is to monitor the network for the occurrence of attacks. Clearly, this is very similar to the purpose of our first detection technique, which analyzes network traffic for the presence of signs that indicate bot infections. In fact, we directly encode our detection models in the signature language of Bro [77], a well-known, network-based IDS.

Of course, neither content-based analysis nor the modeling of network-level properties to detect anomalies is novel. Content-based analysis has been used by signature-based IDSs (such as Snort [88] or Bro) for years. Also, network-level properties (such as the number of flows or the number of bytes that were transferred) have been used extensively to model normal network traffic and to detect deviations that indicate attacks [69]. Our work complements existing network-based IDSs by automatically generating the inputs needed by these systems to detect machines that are infected by bots.

3.3 Signature generation

As part of the detection model generation in our first botnet detection system, we extract token signatures from network traffic. Research on such automated signature generation started with the work on Early Bird [89] and Autograph [58], and was later extended with Polygraph [75] and Hamsa [63]. Of course, extracting command tokens is only a small part of the entire model generation process. In fact, we first have to record bot activity, identify likely bot responses, extract the corresponding traffic snippets, and cluster them based on behavioral similarities. Only then can we extract common tokens, using an improved version of previous algorithms.

3.4 Botnet analysis and defense

In addition to general research on malware detection, there is work that specifically focuses on the analysis [36, 48, 54, 83] and detection [26, 49, 51–53, 56, 86, 95] of botnets.

A number of botnet detection systems perform horizontal correlation. That is, these systems attempt to find similarities between the network-level behavior of hosts. The assumption is that similar traffic patterns indicate that the corresponding hosts are members of the same botnet, receiving the same commands and reacting in lockstep. While initial detection proposals [53, 56] relied on some protocol-specific knowledge about the command and control channel, subsequent techniques [51, 86] remove this shortcoming. The main limitation of systems that perform horizontal correlation is that they need to observe multiple bots of the same botnet to spot behavioral similarities 1. This is significant because, as botnets decrease in size [36], it becomes more difficult to protect small networks, and a botmaster can deliberately place infected machines within the same network range into different botnets.

A second line of research explored vertical correlation, a concept that describes techniques to detect individual bot-infected machines. One system, called Rishi [49], attempts to detect bots based on the structure of nicknames in IRC traffic. Other techniques [26, 95] aim to identify suspicious IRC connections based on traffic properties. In all cases, the detection approaches focus specifically on botnets that use IRC for their command and control. The most advanced system is BotHunter [52], which correlates the output of three IDS sensors – Snort [88], a payload anomaly detector, and a scan detection engine. A closer analysis of the results (which are publicly available [39]) reveals that the detection capability of BotHunter strongly relies on the human-created Snort rules.

The first botnet detection technique presented in this thesis performs vertical correlation, as BotHunter does. In contrast to BotHunter, however, our technique generates detection models in a completely automated fashion. Moreover, the stages that BotHunter uses to characterize the life cycle of a bot focus on scanning and remote exploitation. Our system, on the other hand, does not rely on a specific bot propagation strategy and does not require previous knowledge about command and control channels.

Independently of and concurrently with our work, a paper [54] has presented the idea of running bots in a controlled environment (called Botlab).

The proposed system is similar to ours in that bots are executed and monitored. The difference is that Botlab is exclusively focused on spam botnets and uses the monitored activity (in addition to other inputs) to produce information about spam mails (such as malicious URLs in the mail body). However, the approach does not provide any information about bot commands or responses, and it is not designed to detect bot-infected machines. Thus, its goals are very different from those of our system.

1. With the exception of a narrow, special case presented in [53].

3.5 Using DNS Analysis Techniques for Detecting Botnets

The Domain Name System (DNS) has increasingly been used by attackers to maintain and manage their malicious infrastructures. As a result, recent research on botnet detection has proposed a number of approaches that leverage the distinguishing features between malicious and benign DNS usage. The second botnet detection technique (EXPOSURE) presented in this thesis uses passive DNS analysis to identify domains that are involved in malicious activities, such as serving as the command and control server of a botnet.

The first study [105] in this direction proposed to collect real-world DNS data for analyzing malicious behavior. The results of the passive DNS analysis showed that malicious domains that are used in Fast-Flux networks exhibit behavior that is different from that of benign domains. Similarly, Zdrnja et al. [112] performed passive monitoring to identify DNS anomalies. In their paper, although they discuss the possibility of distinguishing abnormal DNS behavior from benign DNS behavior, the authors do not define DNS features that can be used to do so.

In general, botnet detection through DNS analysis follows two lines of research : the first tries to detect domains that are involved in malicious activities, with the goal of identifying infected hosts by monitoring the DNS traffic. The second focuses on the behaviors of groups of machines in order to determine whether they are infected (e.g., a collection of computers that always contacts the same domain repeatedly).

3.5.1 Identifying Malicious Domains

To detect malicious domains, previous approaches make use of passive DNS analysis, active DNS probing, and WHOIS [2] information. For example, recent work by Perdisci et al. [78] performs passive DNS analysis on recursive DNS traffic collected from a number of ISP networks with the aim of detecting malicious Fast-Flux services. Contrary to previous work [61, 73, 76, 100], Perdisci's work does not rely on analyzing blacklisted domains or domains that are extracted from spam mails. EXPOSURE distinguishes itself from their work in that it is able to detect many different kinds of malicious domains, such as phishing sites, spamming domains, dropzones, and botnet command and control servers. We do not focus only on detecting Fast-Flux service networks.

A second branch of studies that aim to detect malicious domains [68, 100] leverages active DNS probing methods. That is, domains that are advertised as malicious by various sources (e.g., spam mails) are repeatedly queried to detect abnormal behavior. The main drawback of active DNS analysis is the possibility of being detected by the miscreants who manage the domains under analysis. Passive DNS analysis, in comparison, is stealthier because of its non-intrusive nature.

Based on URL features they extract from spam mails, Ma et al. [68] study a number of statistical machine learning methods for classifying websites. In particular, they analyze spam URLs according to their lexical construction and the information contained in the host name part of the URL. To obtain the information from the host name, they perform active probing to determine the number of IP addresses associated with the domain. Once they obtain the IP address list, they analyze the location of each IP address and the ASN to which it belongs. The main limitation of this system is that it performs the analysis only on the domains that are included in spam mails. Hence, the system cannot see other classes of malicious domains, such as command and control servers.

Another type of study on detecting malicious domains leverages properties inherent to domain registrations and their appearance in DNS zone files [46]. That is, they associate the registration information and DNS zone properties of domains with the properties of known blacklisted domains for proactive domain blacklisting. This method relies completely on historical information. Therefore, it is not able to detect domains that do not share registration information and DNS zone commonalities with known blacklisted domains. Our work, on the other hand, does not require any historical information and is able to detect such domains.

3.5.2 Identifying Infected Machines by Monitoring Their DNS Activities

In [31], the authors propose an anomaly-based botnet detection mechanism that monitors group activities in the DNS traffic of a local network. The authors claim that there exist distinguishing features that differentiate DNS traffic generated by botnets from that generated by benign clients. Similarly, [102] also attempts to identify botnet DNS access behavior in a local network. The authors use a Bayesian algorithm. In comparison to these existing works, we aim to identify malicious domains from DNS traffic in general, and do not only focus on botnets.

3.5.3 Generic Identification of Malicious Domains Using Passive DNS Monitoring

To date, two systems besides EXPOSURE have been proposed that aim to detect malicious domains using passive DNS analysis. In concurrent and independent work, Antonakakis et al. [14] present Notos. Notos dynamically assigns reputation scores to domain names whose maliciousness has not been discovered yet. We have compared EXPOSURE with Notos in the evaluation section of Chapter 5. EXPOSURE eliminates several shortcomings of Notos : it does not require a wide overview of malicious activities on the Internet, it requires a much shorter training time, and it is able to classify domains that Notos would misclassify.

The second work was proposed after EXPOSURE and Notos by Antonakakis et al. [15]. Their malware domain detection system, named Kopis, employs DNS data collected from the upper DNS hierarchy and is therefore able to analyze global DNS query resolution patterns. Kopis leverages the fact that, in its data, the IP addresses of the clients who issued the DNS queries are visible. Although Kopis performs well in detecting emerging new botnets, it cannot be deployed as a real-time botnet detection system on a local network. EXPOSURE, on the other hand, can also be deployed in an individual network to monitor its clients' DNS activity.

3.6 Anomaly Detection Through NetFlow Analysis

To date, there has been a considerable amount of research on anomaly detection using NetFlow analysis. While some works proposed anomaly detection methods to detect specific kinds of malware such as worms [103], others tried to propose more general approaches to distinguish malicious traffic from benign traffic [29, 43, 90].

Wagner et al. [103] present an entropy-based approach to identify fast worms in real-time network traffic by calculating the entropies of traffic parameters such as IP addresses. They detect massive network activities by observing changes in the entropy of the traffic. Dewaele et al. [43] extract sub-traces from randomly chosen traffic traces, model them using Gamma laws, and identify anomalous traces by tuning the deviations in the parameters of the models. When the anomalous sub-traces are identified, the authors detect the anomalous source and destination IP addresses by analyzing flows that match the intersection of addresses that hash into anomalous sub-traces. Brauckhoff et al. [29] present a histogram-based anomaly detector that identifies anomalous flows by combining information extracted from multiple histogram-based anomaly detectors.

Another NetFlow-based anomaly detection method for distinguishing malicious and benign network traffic was proposed by Sperotto et al. [90]. The authors analyzed time series constructed from both flow and packet sizes, and tested whether they were sufficient for detecting general intrusions. Their analysis results show that it is sufficient to employ one type of time series for some attacks, but that multiple time series need to be combined for more accurate results.

Rehak et al. [85] propose a system that deploys trust modeling to decrease the high error rates that network-based anomaly detection techniques engender. The system deploys multiple agents, each of which applies one of the existing anomaly-based detection methods. The anomaly values determined by the agents are averaged to produce more accurate results.

Another line of research on NetFlow-based anomaly detection focuses on analyzing the impact of the sampling methods used to decrease the input data rate to a manageable volume. Mai et al. [70] analyze a set of sampling techniques applied for collecting, recording, or forwarding NetFlow traffic to identify the best sampling rate for the most accurate anomaly detection. The authors experiment with two classes of anomalies (i.e., volume anomalies and port scans). The results of their evaluation show that all types of sampling techniques introduce a significant bias into anomaly detection. Another work [30] studied the impact of packet sampling on anomaly detection metrics such as the number of bytes, packets, and flows. The authors claim that packet sampling does not affect anomaly detection metrics such as the number of bytes or the number of packets, but that it significantly changes the number of flows. They also evaluate the impact of packet sampling and conclude that entropy-based anomaly detection systems are more resilient to packet sampling because the sampling still preserves the distributional structure. Note that with Disclosure, we performed two sets of experiments : a set with sampled NetFlows, and a set with unsampled NetFlows.
We also show that, even though sampling degrades anomaly detection in the specific case of botnet identification, it is still possible to distinguish malicious botnet traffic from benign traffic if the NetFlow data set that is used is sufficiently large.

3.7 Botnet Detection with NetFlow Analysis

Only a few papers exist that propose to use NetFlow analysis to specifically detect botnets. For example, Livadas et al. [67] propose a system that identifies the command and control traffic of IRC-based botnets by using machine learning-based classification methods. Francois et al. [47] present instead a NetFlow-based method that uses the PageRank algorithm to detect peer-to-peer botnets. In their experiments, the authors created synthetic bot traces that simulate the NetFlow behavior of three P2P botnet families.

Both works succeeded in identifying a specific type of botnet traffic, IRC in the first case and peer-to-peer in the second. Disclosure, on the other hand, can successfully detect C&C servers without any prior knowledge about the internals of the C&C protocol. Moreover, our experiments show how Disclosure can be used to perform real-time detection on large datasets.

Chapitre 4

Botnet Detection Through Network Level Packet Inspection

In this chapter, we present a system that aims to detect bots, independent of any prior information about the command and control channels or propagation vectors, and without requiring multiple infections for correlation. Our system relies on detection models that target the characteristic behavior of every bot – the fact that it receives commands from the botmaster to which it responds in a specific way. A key feature is that these detection models are generated automatically. To this end, our system observes the network traffic that is generated by actual bot instances in a controlled environment. In these traffic traces, we first identify points in time that likely correspond to response activity. Then, we extract the corresponding commands that trigger these activities. We have implemented the proposed approach and demonstrate that it can extract effective detection models for a variety of different bot families. These models are precise in describing the activity of bots and raise very few false positives.


4.1 System Overview

This section provides an overview of our approach to generate network-based detection models to identify bot-infected machines.

The input to our system is a collection of bot binaries. These binaries are collected in the wild, for example, via honeynet systems such as Nepenthes [16], or through Anubis [19], a malware collection and analysis platform. The output of our system is a number of models that can be used to detect instances of different bot families.

The basic idea of our system is to launch a bot in a controlled environment and let it connect to the Internet. Then, we attempt to identify the commands that this bot receives as well as its responses to these commands. Afterwards, these observations are translated into detection models that analyze network traffic for symptoms of bot-infected machines. The two main questions that arise are : (a) how are detection models specified, and (b) how can we generate these models based on observing bot activity ?

4.1.1 Detection Models

The goal of a detection model is to specify network traffic activity that is indicative of the presence of a bot-infected machine.

Stateful models. In our system, a detection model has two states. The first state of the model specifies signs in the network traffic that indicate that a particular bot command is sent. For example, such a sign could be the occurrence of the string .advscan, which is a frequently-used command to instruct an IRC bot to start scanning. Once such a command is identified, the detection model is switched into the second state. This second state specifies the signs that represent a particular bot response. Such a sign could be the fact that the number of new connections opened by a host is above a certain threshold, which indicates that a scan is in progress. When a model is in the second state and the system identifies activity that matches the specified behavior, a bot infection is reported. If no activity is found that matches the specification of the second state for some time period t, the model is switched back to the first state.

Note that we maintain a different (logical) model instance for each host that is monitored. That is, when a command is found to be sent to host x, only the model for this host is switched to the second state. Therefore, there is no correlation between the activity of different hosts. For example, when a scan command is sent to host x, while immediately thereafter, host y initiates a scan, no alert is raised.

We make use of a stateful model that only labels a host as bot-infected if the system detects that a command is sent to the host and it witnesses a response within a certain period of time. This directly reflects the characteristic behavior of bots, which remotely receive commands from a botmaster and react accordingly. A stateful model has the advantage that we can use less restrictive specifications to capture both the command and the bot response, without risking an unacceptably high number of false positives. On the downside, our approach also presents a possible weakness that a botmaster can leverage to thwart detection. More precisely, a bot can delay the response to a command until after the detection model has been switched back to the first state. This is a limitation that we share with previously proposed systems [52, 53]. The reader is referred to Section 4.5 for a discussion of this limitation and possible solutions.
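The following sketch is a simplified, hypothetical rendering of this two-state logic, not the actual implementation : it keeps one model instance per monitored host, arms the model when a command token (here the .advscan string mentioned above) is seen in traffic sent to the host, and reports an infection if a response-like traffic feature exceeds an illustrative threshold before a timeout expires. The threshold and timeout values are assumptions for the example.

```python
import re
import time

# Hypothetical detection model : one command signature plus a minimal
# response threshold over one observation window (illustrative values).
COMMAND_RE = re.compile(rb"\.advscan")          # example command token
RESPONSE_THRESHOLDS = {"new_connections": 20}    # example response feature
RESPONSE_TIMEOUT = 50                            # seconds to wait for a response

class HostModel:
    """One logical model instance per monitored host."""
    def __init__(self):
        self.armed_at = None   # time the command was seen (state 2), else None

    def on_payload(self, payload: bytes, now: float) -> None:
        # State 1 : look for the command signature in traffic sent to the host.
        if COMMAND_RE.search(payload):
            self.armed_at = now

    def on_traffic_features(self, features: dict, now: float) -> bool:
        # State 2 : within the timeout, check whether the response profile is exceeded.
        if self.armed_at is None:
            return False
        if now - self.armed_at > RESPONSE_TIMEOUT:
            self.armed_at = None               # fall back to state 1
            return False
        if any(features.get(k, 0) >= v for k, v in RESPONSE_THRESHOLDS.items()):
            self.armed_at = None
            return True                        # bot infection reported
        return False

# Example : a command is observed, then a scan-like burst of new connections.
model = HostModel()
model.on_payload(b"PRIVMSG #chan :.advscan dcom135 200 5 0", now=time.time())
print(model.on_traffic_features({"new_connections": 57}, now=time.time() + 10))  # True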

In general, there are two possible approaches to specify commands and responses in the different states of the detection model : content-based and network-based specifications. Content-based specifications are basically signatures. In its simplest form, such a specification is a sequence of characters (i.e., a token) that has to be present in the network traffic. Of course, more sophisticated alternatives are possible, such as allowing regular expressions, or providing additional information about where a token is expected within the network stream. Network-based specifications focus on properties of the network activity that a host is engaged in. This is similar to anomaly-based intrusion detection systems. Simple models might only take into account the number of connections, while more sophisticated ones can also capture properties such as the number of characters that are transferred, the ports or protocols that are used, or entropy measurements of packets.

In our current system, we use content-based specifications to model commands and network-based specifications to model responses. This is a natural approach, where signatures capture commands and network models reflect the network activities due to responses (such as scanning, mass mailing, or binary downloads). However, observe that our general detection approach allows for alternative implementations. For example, one could use network-based properties to model commands. This might be advantageous when botnets use encrypted command and control channels. In that case, network-based properties could capture the frequency in which packets are exchanged, the size of packets, or the fact that encrypted payload is exchanged.

4.1.2 Model Generation

Given our notion of detection models, the question is how these models can be generated automatically. As mentioned previously, we do this based on the observation of bot activity. More precisely, for each bot binary, we first record a trace of its network activity over a certain period of time. Based on a trace, we have to identify those points where the bot receives a command and responds appropriately. This is not straightforward, since we do not want to assume any prior knowledge about the way in which the bots and the botmaster communicate. Also, it is possible and common that one trace contains a variety of commands with different responses. Thus, the problem of extracting detection models is quite different from previous signature generation algorithms [58, 63, 75] that simply extract common tokens from a set of network traces.

Finding responses. Our key insight for being able to identify previously unknown commands in a network trace is that we attack the problem from the opposite side. That is, instead of checking the traces for commands, we first look for the activity that indicates that a response has occurred. The reason for this approach is that a response launched by a bot is often more visible in the network trace than an incoming command. While a bot is in an idle state (i.e., it is not fulfilling requests of its botmaster), the network activity is typically limited to the traffic required to participate in the botnet (e.g., by exchanging IRC information or by polling web pages). However, when a command is issued, the bot has to act accordingly. This action almost always leads to additional network activity, for example, because the bot engages in scanning, downloads additional components, or sends mails. This activity stands out from the background noise and can be detected as an anomaly. Once a bot response is identified, it is characterized by a behavior profile. More precisely, a behavior profile models various properties of the network traffic that is associated with a bot response. More details on recording bot traffic and locating responses are presented in Section 4.2.

Finding commands. By scanning the trace for network anomalies, we can identify those points in time at which a bot has demonstrated a response. As a result, the network traffic before this point must contain the command that has caused this response. Thus, before each point at which a significant change in traffic behavior is detected, we extract a snippet, a small section of the network trace.

Typically, different commands will lead to responses that are different. Therefore, in the next step, we cluster those traffic snippets that lead to similar responses, assuming that they contain the same command. Once clusters of related network snippets have been identified, we search them for sets of common (string) tokens. As our results demonstrate, these tokens frequently represent the bot commands and can be used for detection. Section 4.3 provides more details on the way in which traffic snippets are clustered and analyzed for common bot commands.

Putting it all together. Extracted tokens can be directly used to represent the bot command in the first state of the detection model. For the second state (i.e., to specify the response), we leverage the network behavior profiles that characterize bot response activity. Thus, in our current system, a bot detection model consists of a set of tokens that represent the bot command, followed by a network-level description of the expected response. These models can be readily deployed on the network and can identify an infected host once this host receives a known command and responds as expected.

Bot families. An important question that has been omitted so far is what happens when our system has to generate detection models for multiple bot binaries. The simplest approach is to analyze the network traces for each bot individually. In this case, only a single bot program is considered, and similar responses are very likely caused by the same command. However, this approach is not optimal for two reasons. First, when checking only a single trace, there might not be sufficient command-response pairs available to extract meaningful signatures. The second drawback is that due to botnet-specific artifacts (such as an IRC channel name) present in many of the snippets, the produced token signatures would likely match traffic from one specific botnet only. Thus, it is desirable to combine samples from different botnets into bot families, as long as they use the same C&C mechanism.

The partitioning of samples into bot families can be performed either manually, based on malware names assigned by anti-virus scanners, or based on behavioral similarities. For example, previous work has introduced host-based analysis systems that can find similar malware instances based on the system calls that these malware programs invoke [17, 20, 87]. Moreover, the partitioning step does not need to be perfect. Our system can tolerate the case in which the pool of bot network traces is polluted.

For the following discussion, we assume that the set of bot samples has already been divided into consistent groups. Of course, the system is neither provided with any information about the way in which commands are exchanged, nor how and when responses are launched.

4.2 Analyzing Bot Activity

As a first step to creating bot detection models, our system requires captures of the network traffic that the bot-infected machines create. To this end, we run each bot binary in a controlled environment with Internet access for a period of several days. The goal is to let the bot connect to its C&C mechanism and keep it running long enough to observe a representative collection of the different bot commands and the activities they trigger. The observed traffic should contain the most frequently used commands, since these are the most helpful detection targets. On the other hand, the absence of rarely used commands is acceptable, since detection models targeting these commands would also rarely trigger when deployed. Note that we made a deliberate decision to observe the behavior of bots when they are connected to the actual botnet. This allows us to detect command and control traffic without any prior knowledge of the protocol and the commands that are used between the bot and the botmaster. Previous work [83] has proposed techniques to force bots to respond by exposing them to a barrage of likely IRC commands. While this might trigger a response quicker than in our case (and even provide the correct keyword), it only works when protocol and command tokens are known. Further technical details regarding our solution for recording bot traffic traces are presented in the Appendix. In this section, we discuss how these traces are analyzed for the presence of response activity. Once the response activity is located, we can extract a snippet from the network traffic that precedes the start of the response and thus, likely contains the corresponding command. Moreover, we can collect behavior profiles, which describe the properties of the bot response behavior.

4.2.1 Locating Bot Responses

Once a network trace is collected, the next step is to locate the points within this trace where the bot executes responses to previously received commands. We do this by checking for sudden changes in the network traffic (e.g., a surge in the number of packets, or the fact that many different hosts are contacted). The assumption is that such changes indicate bot activity that is launched when a command is received. Of course, this implies that we can only detect bot responses (and hence, commands) that lead to a change in network behavior. However, most current bot responses, such as sending spam mails, executing denial of service attacks, uploading stolen information, or downloading additional components, fall into this category.

Of course, it is possible that there are changes in the traffic that are not caused by commands. For example, a scan might end when the list of victims is exhausted. Our system will also consider the end of the scan as a potential response, and mark the location appropriately. Fortunately, this is of little concern, because it is likely that the subsequent analysis will fail to find an appropriate command for this (nonexistent) response. Sometimes, however, interesting detection models can be generated in such cases. For example, once a bot has finished scanning, it often sends a status notification to the botmaster, which can be recognized as an interesting content signature.

Locating bot responses in a network trace can be treated as a change point detection (CPD) problem. CPD algorithms operate on time series, that is, on chronologically ordered sequences of data values. Their goal is to find those points in time at which the data values change abruptly. Change point detection has been used previously to recognize spreading worms [110] and denial of service attacks [104]. However, we are not aware of any prior work that used it in the context of botnet detection.

To be able to apply a CPD algorithm, we first have to convert a traffic trace into a time series. To this end, the network traffic is partitioned into consecutive time intervals of equal length (our choice of a concrete interval length will be discussed later). Then, we compute a numeric description in the form of a vector that represents the network traffic for each interval. For this, we extract a number of low-level features from the network traffic. Each feature captures a different aspect of the network traffic and translates into one element of the vector. Currently, we consider eight network traffic features :

1. Number of packets
2. Cumulative size of packets (in bytes)
3. Number of different IPs contacted
4. Number of different ports contacted
5. Number of non-ASCII bytes in payload
6. Number of UDP packets
7. Number of HTTP packets (destination port 80)
8. Number of SMTP packets (destination port 25)

Table 4.1: Network features to characterize bot behavior.
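As an illustration of how the features of Table 4.1 could be computed for one time interval, the sketch below assumes each packet is represented as a small dictionary; this packet representation and the function name are assumptions made for the example, not the actual implementation.

```python
def traffic_profile(packets):
    """Compute the eight features of Table 4.1 over one time interval.

    Each packet is assumed to be a dict with keys : 'size', 'dst_ip',
    'dst_port', 'proto', and 'payload' (bytes).
    """
    return [
        len(packets),                                         # 1 number of packets
        sum(p["size"] for p in packets),                      # 2 cumulative size (bytes)
        len({p["dst_ip"] for p in packets}),                  # 3 different IPs contacted
        len({p["dst_port"] for p in packets}),                # 4 different ports contacted
        sum(b > 127 for p in packets for b in p["payload"]),  # 5 non-ASCII payload bytes
        sum(p["proto"] == "udp" for p in packets),            # 6 UDP packets
        sum(p["dst_port"] == 80 for p in packets),            # 7 HTTP packets (port 80)
        sum(p["dst_port"] == 25 for p in packets),            # 8 SMTP packets (port 25)
    ]

example_interval = [
    {"size": 60, "dst_ip": "198.51.100.7", "dst_port": 445, "proto": "tcp", "payload": b"\x90\x90"},
    {"size": 1200, "dst_ip": "198.51.100.9", "dst_port": 80, "proto": "tcp", "payload": b"GET / HTTP/1.1"},
]
print(traffic_profile(example_interval))  # [2, 1260, 2, 2, 2, 0, 1, 0]
```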

Using the features shown in Table 4.1, we can characterize the bot's behavior during a given time interval.

The characterization of bot activity is designed in a generic fashion, taking into account general features such as the number of packets, the number of different machines contacted, or the number of (binary) bytes in network streams. In addition, we include two features that are derived from our domain knowledge of common bot responses : the numbers of SMTP and HTTP packets. The reason is that sending spam mails typically results in a surge of SMTP packets. The HTTP feature was initially considered helpful to detect cases in which a bot downloads additional components via this channel. However, currently unknown bot activity could also be captured by our features, and it is certainly easy to add additional ones.

For every time interval, we calculate a vector that stores the absolute value for each feature. For example, when 50 packets are seen during a certain time interval, the corresponding element of the vector (number of packets) is set to 50. We call this vector a traffic profile of the bot for this time interval. To be able to compare behaviors obtained from different traces, this vector is normalized with regard to the maximum that was observed for the corresponding feature. This yields a value between 0 and 1 for all vector elements.

Change point detection. Once a network trace is converted into a sequence of traffic profiles, we apply a CPD algorithm to locate points that indicate interesting changes in the traffic. For this, we use CUSUM (cumulative sum), a well-known, robust algorithm that is known to deliver good results for many domains [18]. In principle, CUSUM is an online algorithm that detects changes as soon as they occur. Since we have the complete network trace (time series) available, we can leverage this fact and transform CUSUM into an off-line algorithm. This allows CUSUM to “look into the future” when a decision needs to be made, and thus, yields more precise results.

The algorithm to identify change points works as follows : First, we iterate over every time interval t, from the beginning to the end of the time series. For each interval t, we calculate the average traffic profile $P_t^-$ over the previous $\epsilon = 5$ time intervals and the average traffic profile $P_t^+$ over the subsequent $\epsilon$ intervals. Then, we compute the distance $d(t)$ between $P_t^-$ and $P_t^+$. The distance between two traffic profiles is defined as the Euclidean distance between the corresponding vectors. More precisely :

$$
P_t^- = \frac{1}{\epsilon} \sum_{i=1}^{\epsilon} P_{t-i} \qquad
P_t^+ = \frac{1}{\epsilon} \sum_{i=1}^{\epsilon} P_{t+i} \qquad
d(t) = \sqrt{\sum_{1}^{dim} \left( P_t^- - P_t^+ \right)^2} \tag{4.1}
$$

The ordered sequence of values $d(t)$ forms the input to the CUSUM algorithm. Intuitively, a change point is a time interval t for which d(t) is sufficiently large and a local maximum.

The CUSUM algorithm requires two parameters. One is an upper bound (local_max) for the normal, expected deviation of the present (and future) traffic from the past. For each time interval t, CUSUM adds d(t) − local_max to a cumulative sum S. The second parameter determines the upper bound (cusum_max) that S may reach before a change point is reported. To determine a suitable value for local_max, we require that each individual traffic feature may deviate by at most allowed_avg_dev = 0.04. Based on this, we can calculate the corresponding value local_max = $\sqrt{dim \times allowed\_avg\_dev^2}$. For cusum_max, we use a value of 0.25. We empirically determined the values for allowed_avg_dev and cusum_max. However, note that these values are robust and yield good results for a large variety of traffic produced by hundreds of different malware instances that belong to different bot types (IRC, HTTP, and P2P bots).

It is possible that the cumulative sum S exceeds cusum_max for a number of consecutive time intervals. To locate the actual change point in this case, we take that interval for which d(t) is maximal (since it is the time interval with the greatest discrepancy between past and future traffic composition). The precision with which a change point can be located also depends on the length of the time intervals. Shorter intervals increase the precision. Unfortunately, they also increase the probability that small traffic variations (e.g., bursts) are misinterpreted as a change point. This could introduce unwanted noise into the subsequent model generation process. To find a suitable length for the time intervals, we experimented with a variety of values between 20 and 100 seconds. We chose an interval of 50 seconds since it delivered the best results in our tests.
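The sketch below re-implements this offline change point detection step under the parameter values given above (ε = 5, allowed_avg_dev = 0.04, cusum_max = 0.25). It is a simplified approximation of the described procedure, not the original code; in particular, clamping the cumulative sum at zero and resetting it after a reported change point are assumptions of this sketch.

```python
import math

EPSILON = 5              # number of intervals averaged on each side
ALLOWED_AVG_DEV = 0.04   # allowed per-feature deviation
CUSUM_MAX = 0.25         # threshold on the cumulative sum

def detect_change_points(profiles):
    """Offline CUSUM over a list of normalized traffic-profile vectors.

    Returns the indices of intervals reported as change points
    (sketch of the approach in Section 4.2.1, not the actual system).
    """
    dim = len(profiles[0])
    local_max = math.sqrt(dim * ALLOWED_AVG_DEV ** 2)

    # d(t) : Euclidean distance between the average past and future profiles.
    def d(t):
        past = [sum(p[i] for p in profiles[t - EPSILON:t]) / EPSILON for i in range(dim)]
        future = [sum(p[i] for p in profiles[t + 1:t + 1 + EPSILON]) / EPSILON for i in range(dim)]
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(past, future)))

    change_points, s, run = [], 0.0, []
    for t in range(EPSILON, len(profiles) - EPSILON):
        s = max(0.0, s + d(t) - local_max)
        if s > CUSUM_MAX:
            run.append(t)                          # consecutive intervals above the threshold
        elif run:
            change_points.append(max(run, key=d))  # pick the interval with maximal d(t)
            run, s = [], 0.0
    if run:
        change_points.append(max(run, key=d))
    return change_points
```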

4.2.2 Extracting Model Generation Data

We assume that each change point indicates the time when a bot has received a command and initiated the corresponding response. Based on this assumption, we leverage change points to extract two pieces of information that are needed for the subsequent model generation step.

First, we extract a snippet of the traffic that is likely to contain the command that is responsible for the observed change. Clearly, the snippet contains the traffic within the time interval where the change point is located. Moreover, we take the first 10 seconds of the following interval. The reason is that when a change point occurs very close to the boundary between two intervals, the CPD algorithm might select the wrong one.

To compensate for this potential imprecision, the start of the subsequent traffic interval is included. Finally, we include the last 30 seconds of the previous interval, in order to cover typical command response delays. As a result, each snippet contains 90 seconds of network traffic.

The second piece of information required for creating a detection model is a description of the bot response behavior. To this end, we extract a response behavior profile. The behavior profile captures the network-level activities of the bot once a command is received. This profile is computed by taking the average of the traffic profile vectors over the complete period where the bot carries out its response. This period is considered to be the time from the start of the current response to the next change in behavior. That is, once the network traffic changes again, we assume that the bot has finished its task or received another command.
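As a small numeric illustration of these snippet boundaries, the helper below computes the 90-second capture window for a change point located in interval t, assuming 50-second intervals; the function name is illustrative.

```python
INTERVAL_LEN = 50  # seconds per time interval

def snippet_window(change_interval: int) -> tuple:
    """Return (start, end) in seconds of the 90-second snippet around a change point :
    the last 30 s of the previous interval, the full interval, and the first 10 s of the next."""
    start = change_interval * INTERVAL_LEN - 30
    end = (change_interval + 1) * INTERVAL_LEN + 10
    return max(start, 0), end

print(snippet_window(3))  # (120, 210) -> 90 seconds of traffic
```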

4.3 Generating Detection Models

Given a set of network traffic snippets, together with their associated response behavior profiles, we automatically generate suitable detection models. Recall that detection models should embody the correlation of two events : the first event is the appearance of a command in the network traffic; the second event is the appearance of a subsequent response. The patterns that each of the two events have to match are represented separately in our model.

At this point, the set of snippets contains a mix of network traffic that consists of different commands and some contents that are specific to the C&C protocol. For the subsequent processing performed by the token extraction algorithm, we require a two-phase clustering : first, we arrange the snippets such that those that likely contain the same command are put together in a cluster. Afterwards, we group the contents of the snippets in each cluster such that elements in a group share commonalities that can be leveraged by the token extraction algorithm (described in Section 4.3.1).

To cluster similar snippets (the first step), we make the following assumption : the network traffic of a bot responding to a certain command will look similar to the traffic generated by this bot executing the same command at some later time. On the other hand, the same bot executing a different command will generate traffic that looks different. That is, there is a correspondence between the command that is sent and the response that is invoked. This assumption can be leveraged by clustering the snippets according to the behaviors that we believe to be a response. That is, the goal is to find behavior clusters, where each such cluster represents a certain kind of bot activity, such as a scanning period, a denial of service attack, or any other kind of distinguishable network activity. Once such clusters have been found, we can expect that most snippets that are part of the same cluster contain common parts that are either directly responsible for triggering the bot reaction (the command itself), or at least always have to appear in order for a bot to react that way.

To identify clusters of related activity, we perform hierarchical clustering [42] based on the normalized response behavior profiles. The clustering is stopped when the minimal distance between any two clusters exceeds a threshold of 0.005. This threshold is empirically chosen such that only closely related snippets are clustered together. Otherwise, when unrelated snippets end up in the same cluster, the command token extraction algorithm will search for shared strings that do not exist, possibly producing incorrect results.

After the clustering step, each cluster holds a set of snippets that likely contain a command that led to the same response. In the next step, these snippets are used to extract the model of the bot command (as described in Section 4.3.1). The response behaviors associated with the snippets are then used to model the response activity (as discussed in Section 4.3.2).
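A minimal sketch of this behavior clustering step using SciPy is shown below. The thesis does not state the linkage criterion used for this first clustering phase, so single linkage is an assumption here, and the example profile vectors are invented for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_behavior_profiles(profiles, threshold=0.005):
    """Group normalized response-behavior profiles by hierarchical clustering,
    stopping once the merge distance exceeds the threshold (sketch only)."""
    X = np.asarray(profiles)
    Z = linkage(X, method="single", metric="euclidean")
    # The 'distance' criterion keeps observations together only while their
    # cophenetic distance stays at or below the threshold.
    return fcluster(Z, t=threshold, criterion="distance")

# Example : two nearly identical scan-like profiles and one mail-burst profile.
profiles = [
    [0.900, 0.400, 0.950, 0.100, 0.050, 0.920, 0.010, 0.000],
    [0.901, 0.400, 0.951, 0.100, 0.050, 0.920, 0.010, 0.000],
    [0.300, 0.700, 0.200, 0.050, 0.100, 0.020, 0.010, 0.950],
]
labels = cluster_behavior_profiles(profiles)
print(labels)  # e.g. [1 1 2] : the first two profiles fall into the same behavior cluster
```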

4.3.1 Command Model Generation

The objective of the command model generation step is to identify common elements in a set of network snippets that belong to a particular behavior cluster. These common elements are a set of character strings (tokens) that appear in the payloads of network connections. In particular, we are interested in finding tokens that appear frequently in the traffic snippets, since there is a chance that they encode bot commands.

To extract likely bot commands from network traces, we use a signature generation technique that produces token sequences. A token sequence consists of an ordered set of tokens. That is, the tokens have to appear in a certain order, but there can be arbitrary characters between each token. Token sequences can be easily encoded as regular expressions (which can serve directly as input to a network intrusion detection system).

To find common tokens in a set of network snippets, we use the longest common subsequence algorithm (based on suffix arrays). Unfortunately, the algorithm outputs a token sequence only if this sequence is present in all network traces. Thus, we cannot apply the algorithm directly. The reason is that we cannot require that all traffic snippets contain the same command.

For example, it is possible that different commands lead to a similar response, and hence, the corresponding snippets end up in the same cluster. Another problem might be due to an incorrectly detected change point, which can cause an unrelated snippet to become part of a cluster. Therefore, we require a second clustering refinement step that groups similar network packet payloads within each behavior cluster (i.e., snippets that lead to similar response behavior).

For the second clustering step, we employ a standard complete-link, hierarchical clustering algorithm to find payloads that are similar. The algorithm starts by putting each payload into a separate cluster. Then, a matrix is constructed that stores the similarity between each pair of clusters. We define the similarity S between two payloads $P_1$ and $P_2$ that share n common tokens $t_1, \ldots, t_n$ as :

$$
S = \sum_{i=1}^{n} \left( \frac{len(t_i)}{len(P_1)} + \frac{len(t_i)}{len(P_2)} \right) / 2 \tag{4.2}
$$

Once the similarity matrix is computed, the algorithm switches to the merging phase. During this phase, similar clusters are successively merged into larger clusters. Two clusters are merged only if the payloads in the clusters have a similarity score greater than a minimum threshold δ. The value for δ is the average of the similarity scores in the similarity matrix.

The longest common subsequence algorithm is applied to each set of similar payloads, generating one token sequence per set. Recall that the second clustering step is performed individually for each behavior cluster. Thus, it is possible (and common) that multiple token sequences are associated with a single behavior cluster. Each of these token sequences represents a potential command that leads to network activity that the corresponding response behavior profile captures.

Some of the generated token sequences may be overly generic. That is, they are likely to match benign traffic frequently. To improve the precision of our detection models, these token sequences should be identified and removed. This can be done in an automated fashion by matching all generated token sequences against network traffic that is known to be benign. Every match of a token sequence on the benign traffic is clearly undesirable, and thus suggests discarding the token sequence. To identify as many of the false positive candidates as possible, the benign traffic pool should cover a broad range of protocols and transmitted content. For our experiments, we recorded the traffic at the Secure Systems Lab in Vienna for a duration of one day. The network is well administered and used exclusively by computer science personnel who are generally security-aware. It is thus safe to assume that all traffic is benign.

In addition to removing token sequences that match benign traffic, we remove all token sequences whose longest token is shorter than five bytes. This is done because token sequences consisting only of very short tokens will trigger frequently just by chance, e.g., when large amounts of binary or encrypted data are transmitted.
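A small sketch of Equation 4.2 : given two payloads and the tokens they share, the similarity is the summed average relative coverage of each token in both payloads. The example payloads and tokens below are hypothetical.

```python
def payload_similarity(p1: bytes, p2: bytes, tokens) -> float:
    """Similarity S from Equation 4.2 : for each shared token, average its
    relative length in both payloads, then sum over all shared tokens."""
    return sum((len(t) / len(p1) + len(t) / len(p2)) / 2 for t in tokens)

# Example with two hypothetical IRC-style payloads sharing two tokens.
p1 = b"PRIVMSG #a :.advscan dcom135 200 5 0"
p2 = b"PRIVMSG #b :.advscan lsass445 150 3 0"
shared = [b"PRIVMSG #", b" :.advscan "]
print(round(payload_similarity(p1, p2, shared), 3))
```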

4.3.2 Response Model Generation

The second part of our detection model consists of a network-based description of the bot response. This description should capture the kind of network activity that is expected to be shown by a bot after the command has been received.

The input to this step is a behavior cluster. Recall that a behavior cluster is created by grouping similar response behavior profiles and their associated snippets. We generate the bot response model for a behavior cluster by computing the element-wise average of the (vectors of the) individual behavior profiles. The result is another behavior profile vector that captures the aggregate of the behaviors combined in the respective behavior cluster. As such, this behavior profile is suitable to model the expected bot response behavior associated with the bot commands that are described by the content-based models extracted from the snippets.

In the previous section, we mentioned that each token sequence must contain tokens of a minimal length (five bytes) to become part of a detection model; otherwise it might trigger too frequently on legitimate network traffic. A similar consideration applies to response models. More precisely, in some cases, the behavior profile of a bot response can be exceeded by sending a few HTTP packets or by contacting two other hosts. Clearly, such traffic is easily produced by regular users (e.g., surfing the web or using an instant messaging client). Thus, we introduce minimal bounds for certain network features. In particular, we define a threshold of 1,000 for the number of UDP packets that are sent within one time interval (50 seconds), 100 for HTTP packets, 10 for SMTP packets, and 20 for the number of different IPs. When a response profile exceeds none of these thresholds, the corresponding behavior cluster (and its token sequences) is not used to generate a detection model. This technique removes a small number of weak profiles that could potentially result in a large number of false positives.
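The following sketch illustrates this step under the thresholds listed above : the response model is the element-wise average of a cluster's behavior profiles, and a cluster is kept only if its model exceeds at least one of the minimal bounds. The feature names and example values are illustrative assumptions.

```python
MIN_BOUNDS = {"udp_packets": 1000, "http_packets": 100, "smtp_packets": 10, "distinct_ips": 20}

def response_model(profiles):
    """Element-wise average of the behavior profiles in one cluster."""
    keys = profiles[0].keys()
    return {k: sum(p[k] for p in profiles) / len(profiles) for k in keys}

def is_strong(model):
    """Keep the cluster only if at least one feature exceeds its minimal bound."""
    return any(model.get(k, 0) > v for k, v in MIN_BOUNDS.items())

cluster = [{"udp_packets": 2400, "http_packets": 3, "smtp_packets": 0, "distinct_ips": 310},
           {"udp_packets": 1800, "http_packets": 1, "smtp_packets": 0, "distinct_ips": 270}]
print(is_strong(response_model(cluster)))  # True : a scan-like response, worth a detection model
```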

4.3.3 Mapping Models into Bro Signatures

Bro is a network intrusion detection system designed to passively monitor network activity for suspicious or irregular events [77]. One of its key features is the integrated policy and signature scripting language, which enables custom rules for intrusion detection. Due to its flexibility, Bro is an appropriate platform to implement our detection models. To map a detection model into a Bro specification, we have to encode the model's set of token sequences as well as its behavior profile.

For each token sequence, one Bro signature is generated. The signature consists of the concatenation of the individual tokens of a token sequence, using the '.*' regular expression operator. That is, the sequence of the three tokens "this", "is", and "blue" would be represented as the regular expression ".*this.*is.*blue.*". Also, each signature is restricted to match only on inbound or outbound traffic, depending on the bot traffic it had been generated from.

When a token sequence matches, the corresponding detection model is advanced to the second state. At this point, Bro starts to record the traffic of the host that triggered a signature. This is done for a duration of 50 seconds. Then, the system creates a profile from the recorded traffic, using the following four features : number of UDP packets, number of HTTP packets, number of SMTP packets, and number of unique IP addresses. When the observed traffic exceeds, for at least one of these four features, the corresponding value stored in the response profile, we consider this a match. In that case, the host is considered to be bot-infected, and an alert is raised.
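The token-to-regex part of this mapping is straightforward; the sketch below joins an ordered token sequence with the '.*' operator (escaping regex metacharacters), reproducing the ".*this.*is.*blue.*" example. It emits a plain regular expression rather than an actual Bro signature file.

```python
import re

def token_sequence_to_regex(tokens):
    """Join ordered tokens with '.*', escaping regex metacharacters in each token."""
    return ".*" + ".*".join(re.escape(t) for t in tokens) + ".*"

pattern = token_sequence_to_regex(["this", "is", "blue"])
print(pattern)                                             # .*this.*is.*blue.*
print(bool(re.search(pattern, "this sky is deep blue")))   # True
```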

4.4 Evaluation

The purpose of the evaluation is to demonstrate that our system generates detection models that are capable of detecting bot-infected hosts with a low false positive rate. In this section, we will first explain how we captured the bot traffic that we analyzed to extract our botnet detection models. Afterwards, we will present our results and compare them with the most relevant network-based botnet detection system (i.e., BotHunter). We will conclude the section by showing some example signatures our system produced.

4.4.1 Capturing Bot Traffic

When recording bot traffic traces, it is difficult to predict the amount of time required to obtain a representative collection of bot commands and corresponding responses, since it depends on the degree and kind of activity of the botmaster during the observation period. In our experiments, we decided to aim for a capture period of five days. This ensures that we have a good chance of observing a large variety of different commands. However, most bots receive (common) commands after a short time, often within minutes. Thus, we can start to produce a first set of detection models quickly. Then, we wait for several days to capture less frequent commands as well.

Because of the long runtime of the bots as well as our desire to collect as many traces as possible, an important goal when designing the execution environment was to support as many parallel bot instances as possible. We set up a VMware environment on a server with two Intel Xeon 1.86GHz Quad-core processors, 8 GB of memory, and 300 GB of RAID5 disk space. Each VM runs a fully-patched instance of Windows XP with Service Pack 2, and is able to run with as little as 64 MB of main memory. Using this setup, we are able to simultaneously run up to 50 virtual machine instances on our server. Each of the guest virtual machines is assigned a static, public IP address, and infected with one bot. All network traffic is captured on the host. Since there are no other applications that run and generate network traffic, the bot accounts for all observed network traffic under its host VM's IP address.

Of course, a bot requires Internet connectivity, so that it can connect to the command and control infrastructure and receive commands from the botmaster. However, at the same time, we do not wish the bots that we are analyzing to engage in serious and destructive malicious activity such as denial of service attacks. Thus, we have a firewall that rate-limits all outbound network traffic. After each five-day capture period, all VMs are deleted and recreated in a clean state before the next set of bot samples is executed.

The fact that we use VMware to execute bots could be considered a potential limitation. It is well-known that VMware is easy to fingerprint, and we are aware that a bot could detect our system. However, the problem of VMware detection is not a conceptual limitation of our approach. In future versions, we could use another virtual machine or run the bots directly on real hardware.

We collected a set of 416 different (based on MD5 hash) bot samples. We obtained these malware programs through Anubis, a public malware

sis service [19]. Thus, the samples originate from a wide range of sources and include bots manually submitted by users, binaries collected with the help of honeypots and spam traps, as well as contributions from malware anal- ysis organizations (such as ShadowServer.org). The collection period was more than 8 months. All bot samples were executed in our traffic capturing environment, each producing a traffic trace with a length of five days.

4.4.2 Generating Signatures

The bot traffic traces that were collected were divided into families of bots. This was a manual process, based on the content of the traces. However, this step could be automated in the future [17, 20]. The classification process yielded a total of 16 different IRC bot families (with 356 traffic traces) and one HTTP bot family consisting of samples of Kraken (also known as Bobax, with 60 traffic traces). In addition, we obtained 30 network captures for the Storm Worm (also known as Peacomm and Zhelatin), which is the most well-known example of a botnet that uses a peer-to-peer protocol for its C&C communication [50]. The Storm Worm captures were separately generated at the University of Mannheim. Thus, in total, there were 446 network traces available as input for our detection model generation process.

Bot family   #DM   #TS     Bot family   #DM   #TS
IRC1         4     57      IRC2         9     50
IRC3         2     11      IRC4         4     94
IRC5         1     8       IRC6         1     20
IRC7         8     53      IRC8         3     72
IRC9         3     17      IRC10        2     7
IRC11        11    35      IRC12        7     21
IRC13        2     8       IRC14        5     38
IRC15        3     24      IRC16        1     1
HTTP         2     5       STORM        2     110
TOTAL        70    631

Table 4.2: Number of detection models (DM) and token sequences (TS) for each bot family.

Using these 446 network traces, our system produced a total of 70 detection models. A more precise breakdown of this number for the different bot families is shown in Table 4.2. The table also shows the numbers of token sequences produced. Recall from Section 4.3.1 that there can be multiple token sequences associated with a single detection model, but it is sufficient for a single one to trigger in order to switch the model into the second state (where it checks for suspicious response activity). As can be seen, our system succeeded in producing at least one detection model for each bot family. This is particularly interesting when considering that Storm (Peacomm) uses encrypted commands. When examining the Storm signatures, we observed that our system correctly identified that the byte string “.mpg;size=” is characteristic for this bot type. That is, even though we cannot precisely identify a command in the network trace, our algorithm is able to extract specific artifacts of the bot communication. Also, it should be noted that this automatically-generated token sequence is very close to the human-specified signature in Snort [88], a popular network intrusion detection system.

To understand the quality of our automatically-generated detection models, we compared them to the human-developed bot and C&C signatures used by Snort. This serves as an initial, qualitative assessment to determine whether the signatures are “reasonable” and match traffic that a human analyst would associate with bot activity. In many cases, we found that the signatures were very similar to the human-created references, which confirms that our approach is capable of delivering intuitively correct results. This was true for signatures for all three bot classes (IRC, HTTP, and P2P) that we examined. In other cases, we found that our signatures were overly specific, and contained artifacts of a particular bot that was analyzed (e.g., IRC channel names, IP addresses, time stamps). However, it is typically not problematic to include such specific signatures. While they likely do not detect any bots, they typically do not contribute to false alarms either.

signature irc1-000-2 {
    dst-ip == local_nets
    payload /.* PRIVMSG #.* :\.asc .*5 0 .*/
}

#DIFFERENT IPS > 20

Figure 4.1: Automatically-generated Bro signature and corresponding behavior profile for an IRC bot.

An example of an automatically-generated detection model for a family of IRC bots is shown in Figure 4.1. The token sequence consists of three tokens that need to be identified in an inbound IP packet. The first token (PRIVMSG #) contains a part of the IRC protocol header for transmitting a message. This token restricts the signature to match only on IRC traffic. The second token (:.asc) contains the command that instructs the receiving bot to begin scanning. The third token (5 0) contains parameters for the scan command. At first, it might seem that this token makes the signature overly restrictive. However, very often, the same set of parameters is used for a command. Thus, this is not a significant restriction. In comparison, a human-created Snort signature matches on “PRIVMSG .*:.*asc”. The network behavior that needs to be matched in the second detection phase (once the token sequence has been identified in the traffic) requires that a host contacts more than 20 distinct IPs within a time period of 50 seconds. This reflects the scan that a bot initiates when receiving the .asc command. Only if this second condition is fulfilled as well is the host reported as bot-infected. For additional examples of HTTP and P2P detection models, as well as encrypted C&C channels, the reader is referred to the Appendix.
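The second detection phase can be sketched as follows: after a token sequence match, the traffic of the suspected host is recorded for 50 seconds and the four behavioral features are compared against the response profile. The record format, thresholds, and names below are illustrative, not taken from the actual implementation.

from collections import namedtuple

# One entry per observed packet of the suspected host (illustrative format).
Packet = namedtuple("Packet", "timestamp proto dst_port dst_ip")

# Hypothetical response profile learned for one detection model, e.g.
# "more than 20 distinct IPs within 50 seconds" for the IRC scan command.
RESPONSE_PROFILE = {"udp_pkts": None, "http_pkts": None,
                    "smtp_pkts": None, "distinct_ips": 20}

def behavior_profile_matches(packets, profile, window=50):
    # Keep only packets within the 50-second recording window.
    start = packets[0].timestamp if packets else 0
    window_pkts = [p for p in packets if p.timestamp - start <= window]

    observed = {
        "udp_pkts":  sum(1 for p in window_pkts if p.proto == "udp"),
        "http_pkts": sum(1 for p in window_pkts if p.proto == "tcp" and p.dst_port == 80),
        "smtp_pkts": sum(1 for p in window_pkts if p.proto == "tcp" and p.dst_port == 25),
        "distinct_ips": len({p.dst_ip for p in window_pkts}),
    }
    # Raise an alert if at least one feature exceeds the stored profile value.
    return any(threshold is not None and observed[name] > threshold
               for name, threshold in profile.items())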

4.4.3 Detection Capability

To obtain a quantitative measure for the capability of our detection models to identify bot-related traffic, we decided to split our set of 446 network traces into training sets and test sets. Each training set contained 25% of one bot family’s traces, while the corresponding test set contained the remaining ones. We used the training sets to generate a new set of detection models. Then, this new set of models was loaded into Bro, and we analyzed the traffic traces in the test sets. In total, this procedure was performed four times per family (four-fold cross validation).

Our system reported a bot infection for 88% of the analyzed traces. The remaining 12% of traces did not even trigger a token sequence match. For all traces that did lead to at least one token sequence match, the behavior profile matching phase triggered as well, thus correctly confirming the bot infection.

To further put the detection results into context, we decided to perform a comparison between our system and BotHunter [52]. BotHunter is the current state-of-the-art tool for detecting individual bot infections. The system uses a number of phases that model different aspects of the bot life cycle (such as spreading, C&C, and malicious activity). To detect bot commands, BotHunter relies on manually-developed signatures (mainly the database of Snort and some custom signatures). To determine the performance of BotHunter, we ran its latest version (v1.0.2, with default settings) on all 446 bot traffic traces. BotHunter identified signs of bot infections for 69% of the traces. The automatically generated signatures produced by our system thus outperform BotHunter by nearly 20%.

                      IP space   Avg. pkts/h   Days   IPs flagged   Total alerts   Alerts/day
Aachen                2,048      40M           55     0             0              0
Greece                4,096      17M           102    11            60             0.59
BotHunter             4,096      17M           6      60            5,849          974.34
BotHunter w/o Blist   4,096      17M           6      5             60             10.00

Table 4.3: Results from real-world deployments.

4.4.4 Real-World Deployment

To analyze the number of false positives that our detection models generate, we extensively evaluated our system in two real-world network environments. More precisely, we deployed one Bro sensor with our detection models in front of the residential homes of RWTH Aachen University and one sensor at a Greek university network. In Aachen, our system monitored a densely-populated /21 network (2K IPs) for a duration of 55 days. In Greece, we monitored a medium-populated /20 network (4K IPs) for 102 days. On average, we observed about 40 million packets per hour in Aachen, while the number in Greece was about 17 million packets. Thus, our experimental evaluation comprises the analysis of traffic in the order of 94 billion network packets over a period of over three months at two different sites in Europe.

The results of our evaluation are summarized in Table 4.3. Our deployment in Aachen yielded no alerts at all over a duration of two months. There were 130 token sequence matches, which were all correctly invalidated by the behavior profile matching phase. This demonstrates the importance of the second phase of our detection models: random token sequence matches do not lead to an alert, because without the expected bot response, the behavior profile will not be matched. In the Greek network, our system raised only a few alerts, and over a period of over three months, reported a total of 11 hosts (IPs) as bot-infected. These 11 hosts were responsible for 60 alerts. To verify whether these alerts are false positives or indications of true bot infections, we performed a manual analysis of the traffic that caused the alarms. In most cases, this led us to the conclusion that an alarm was a false positive. This is also supported by the fact that both networks are well-maintained and bot infections are very rare. However, a definite decision is difficult to make, since we did not have access to the actual hosts.

Typically, all machines that are reported as bot-infected must be manually inspected. Thus, it is important that the system does not overload the administrator with incorrect warnings. Considering the average number of alerts per day that our system reports as well as the number of reported IP addresses (shown in Table 4.3), we believe that this goal is clearly met.

Again, in order to set our results in relation with the current state-of-the-art, BotHunter, we deployed a BotHunter sensor in the Greek network (we did not obtain permission to install such a sensor in Aachen). Unfortunately, due to performance limitations, we could run either BotHunter or our system on the machine that was provided to us, but not both systems at the same time. Thus, we deployed BotHunter for a period of only six days. Nevertheless, we feel that this period is sufficiently long to draw meaningful conclusions.

The comparison with BotHunter is instructive. We can see that an off-the-shelf BotHunter installation reports almost one thousand alerts per day. Within a period of only six days, 60 different IP addresses are reported as bot-infected, each of which would require manual inspection. Given this very high number of false alerts, we investigated the reasons and even attempted to tweak BotHunter to improve its performance. On closer inspection of the alerts, we observed that a significant number of them are due to two components (phases). These rely on blacklists of known DNS names and IP addresses that are related to malware domains and C&C servers. In an attempt to reduce the number of BotHunter’s false positives, we disabled these two components. An accordingly modified BotHunter setup produced only 10 alerts per day, reporting a total of 5 IP addresses as bot-infected during the six-day period. While, in contrast to the off-the-shelf setup, the number of alerts is now manageable by a human administrator, BotHunter still does not reach the low number of false alerts our system generates. Additionally, disabling the two components that are responsible for the vast majority of false alerts has a significant negative impact on BotHunter’s detection capabilities. When rerunning the experiments on the bot traces using the modified version of BotHunter, the number of bots that BotHunter detects drops to only 39%.

Finally, a large fraction (89%) of the alerts raised by our system in the real-world deployments were triggered by only three different detection models. The situation is different for BotHunter: we observed 155 different matching BotHunter C&C signatures during the evaluation in the Greek network. This large diversity of matching signatures makes it difficult to disable a few BotHunter models that are responsible for the bulk of false positives.

A summary of the results of our evaluation is presented in Table 4.4. Our automatically generated detection models clearly outperform the state-of-the-art solution for single bot detection, BotHunter, which relies on signatures hand-crafted by human experts.

                                                 Our detection models   BotHunter
Detection rate on bot traces                     88%                    69%
Incorrectly detected IPs in real-world traffic   11                     60

Table 4.4: Comparison of the detection performance of our detection models vs. BotHunter.

4.4.5 Examples and Comparison to Hand-Tuned Signatures

In the following, we present examples of automatically-generated detection models and compare them to human-generated signatures.

signature http-000-2 {
    src-ip == local_nets
    payload /.*GET \/reg\?u=1.*&v=187&s=.*&su=.*p=0&e=0&o=0&a=0&wr=75 HTTP\/1\.1 \x0d\x0aUser-Agent: Mozilla\/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1\) \x0d\x0aHost: .*/
}

#HTTP PACKETS > 440

Figure 4.2: Automatically-generated Bro signature and corresponding behavior profile for Kraken.

An example signature for a family of HTTP bots is shown in Figure 4.2. In contrast to IRC bots, where the botmaster pushes the command to the bots, an HTTP bot periodically queries the command & control server and pulls commands. The token sequence can recognize the characteristic HTTP request issued by Kraken bots. Since the request to (and not the response from) the C&C server is searched for, the signature matches on outbound traffic. The corresponding hand-tuned Snort signature is shown in Figure 4.3. Both signatures should match on (almost) the same set of bot requests. Note that our signature contains some additional parts that belong to the HTTP protocol. Interestingly, these artifacts implicitly encode that the signature is supposed to match in the context of an HTTP request, a fact that needs to be explicitly encoded in the Snort signature. Also, our signature contains more specific information about the typical requests issued by a bot, e.g., that certain variables are always set to 0 in a request. Moreover, the behavior profile requires observing at least 440 HTTP packets within the observation period of 50 seconds.

alert tcp $HOME_NET any -> $EXTERNAL_NET $HTTP_PORTS (
    msg: "E4[rb] BLEEDING-EDGE BOTNET HTTP Botnet reg"; flow: established;
    uricontent:"/reg?u="; nocase; content:"&v="; nocase; within: 15;
    content:"&s="; nocase; within: 15; content:"&su="; nocase; within: 15;
    content:"&p="; nocase; within: 15; classtype: trojan-activity;
    reference:url,www.honeynet.org/papers/bots; sid:2001899; rev:7; )

Figure 4.3: Hand-tuned Snort signature for a family of HTTP bots.


signature storm_u2-000-12 {
    src-ip == local_nets
    payload /.*\.mpg;size=.*/
}

#DIFFERENT IPS > 438 || #UDP PACKETS > 7445 || #SMTP PACKETS > 1720

Figure 4.4: Automatically-generated Bro signature and corresponding behavior profile for the Storm Worm.

The system was also successful in producing a signature for the Storm Worm (for this signature, please refer to Figure 4.4). Generating signatures for this bot is a challenging task. First, the commands are not sent by a central server as with IRC and HTTP bots, but the communication infrastructure relies on a peer-to-peer network. Second, the botnet uses an XOR-encrypted authentication scheme, and all subsequent communication between the peers is compressed with zlib, a data compression library. Thus, our approach of detecting fixed tokens, which represent commands in the network stream, seems to fail: due to encryption and compression, there are actually no fixed commands we can detect. Nevertheless, our system generated a token sequence that is also very close to the human-specified signature in Snort (see Figure 4.5).

alert udp $HOME_NET 1024:65535 -> $EXTERNAL_NET 1024:65535 (
    msg:"E7[rb] BOTHUNTER Storm(Peacomm) Peer Coordination Event [SEARCH RESULT]";
    content:"|E311|"; depth:5; rawbytes;
    pcre:"/[0-9]+\.mpg\;size\=[0-9]+/x"; rawbytes;
    classtype:bad-unknown; sid:9910013; rev:99;)

Figure 4.5: Hand-tuned Snort signature for the Storm Worm.

Our algorithm correctly identifies that the byte string .mpg;size= is characteristic for this bot type, and the Snort signature also uses this specific token sequence for detecting Storm. Thus, our signature captures the same properties of the communication phase, and the detection accuracy is comparable to a human-generated signature. Even if we cannot precisely identify a command in the network trace, our algorithm is able to extract characteristic artifacts of the bot communication. Furthermore, this signature confirms that we do not depend on knowledge about the actual communication protocol of the bot, but can generate signatures for arbitrary botnets.

The behavior profile reflects the participation in the P2P network, which comes with a high number of contacted IP addresses as well as a high number of UDP packets. It also captures the fact that Storm is frequently used for sending spam mails. The profile requires observing, within a time period of 50 seconds, more than 438 different contacted IPs, more than 7,445 UDP packets, or more than 1,720 SMTP packets.

signature irc6-003-0 {
    dst-ip == local_nets
    payload /.*:x\.hub\.x 332 .*##russia## : =PIzNr8s Xrm9CDF43axRn\+tNXcpoePVswOp3KTBdze\+6iX7Af3qMy34Y7W5pm.*/
}

#DIFFERENT IPS > 27

Figure 4.6: Automatically-generated detection model for an IRC bot using encrypted communication.

Moreover, we have generated a number of signatures for bots that use encrypted and obfuscated commands over IRC. An example is shown in Figure 4.6. Instead of using plain IRC, these botnets send the commands over IRC in an encrypted form. That is, the botmaster encrypts the command using a specific key and sends it to the IRC channel. Each bot has the decryption key embedded in the binary, decrypts the command, and then executes it. The token sequence contains botnet-specific IRC header artifacts such as the channel name and a part of the user name. It also contains the byte sequence ’332’, which indicates that an IRC topic message should be matched. The command itself is not intelligible to a human; however, the behavior profile, which requires the observation of at least 27 different IP addresses, points toward a scan.

Note that encrypted IRC botnets prohibit the use of gray-box testing approaches to trigger responses (such as those introduced by Rajab et al. [83]). Since we have no knowledge of the decryption key, it is very unlikely that we can send valid command tokens. Therefore, it is necessary to monitor the bot in a controlled environment and observe the actual commands sent by the botmaster.

4.5 Limitations

Although our current system is able to effectively detect real-world botnets, we note that it has several limitations. In this section, we discuss the limitations and present possible solutions.

To evade detection, a botmaster may instruct his bots to wait for a certain amount of time before reacting to the command (i.e., he might launch a threshold attack [91]). As a result, our analysis could miss the connection between a command and the appropriate response, both when generating detection models and once the models are deployed. Many other comparable systems rely on a time window of some sort, and thus are vulnerable to this same attack [51–53, 86]. A possible way of handling this evasion attempt is to randomize the time window, making it harder for the adversary to select an appropriate delay. Also, long time delays reduce the usefulness of botnets and increase the difficulty for the attacker [53, 91].

Another limitation of our current implementation is that it uses content-based analysis to detect command tokens. Thus, the system has problems with encrypted command channels. This is a limitation that our approach shares with all previous techniques that aim to detect single bots [26, 49, 52]. To avoid this problem, the most promising approach is to use network-level properties to recognize commands. Interestingly, even in the current version, our system can sometimes identify artifacts that are present in encrypted traffic (as presented in the Appendix). The best example is Storm, for which our system extracts a “command” token that is characteristic for this bot. Also, our system is resistant to simple obfuscation schemes in which a human-readable command is mapped to some unintelligible string. In fact, we have generated token sequences for IRC bot families that match obfuscated commands. This is different from previous approaches, such as BotHunter [52], that deploy manually-developed signatures and thus can be thwarted by bots that use non-standard commands.

Chapter 5

Botnet Detection Through Passive DNS Analysis

The domain name service (DNS) plays an important role in the operation of the Internet, providing a two-way mapping between domain names and their numerical identifiers. Given its fundamental role, it is not surprising that a wide variety of malicious activities involve the domain name service in one way or another. For example, bots resolve DNS names to locate their command and control servers, and spam mails contain URLs that link to domains that resolve to scam servers. Thus, it seems beneficial to monitor the use of the DNS system for signs that indicate that a certain name is used as part of a malicious operation.

In this chapter, we introduce EXPOSURE, a system that employs large-scale, passive DNS analysis techniques to detect domains that are involved in malicious activity. We use 15 features, extracted from the DNS traffic, that allow us to characterize different properties of DNS names and the ways that they are queried. Our experiments with a large, real-world data set consisting of 100 billion DNS requests, and a real-life deployment for two weeks in an ISP, show that our approach is scalable and that we are able to automatically identify unknown malicious domains that are misused in a variety of malicious activity (such as for botnet command and control, spamming, and phishing).


5.1 Overview

The goal of EXPOSURE is to detect malicious domains that are used as part of malicious operations on the Internet. To this end, we perform a passive analysis of the DNS traffic that we have at our disposal. Since the traffic we monitor is generated by real users, we assume that some of these users are infected with malware, and that some malware components will be running on their systems. These components are likely to contact the domains that are found to be malicious by various sources such as public malware domain lists and spam blacklists. Hence, by studying the DNS behavior of known malicious and benign domains, our goal is to identify distinguishable, generic features that are able to define the maliciousness of a given domain.

5.1.1 Extracting DNS Features for Detection

Clearly, to be able to identify DNS features that allow us to distinguish between benign and malicious domains, and that allow a classifier to work well in practice, large amounts of training data are required. As the offline dataset, we recorded the recursive DNS (i.e., RDNS) traffic from the Security Information Exchange (SIE) [7]. We performed offline analysis on this data and used it to determine DNS features that can be used to distinguish malicious domains from benign ones.

The part of the RDNS traffic we used as initial input to our system consisted of the DNS answers returned from the authoritative DNS servers to the RDNS servers. An RDNS answer consists of the name of the domain queried, the time the query is issued, the duration the answer is required to be cached (i.e., TTL), and the list of IP addresses that are associated with the queried domain. Note that the RDNS servers do not share the information of the DNS query source (i.e., the IP address of the user that issues the query) due to privacy concerns.

By studying large amounts of DNS data, we defined 15 different features that we use in the detection of malicious domains. Six of these features have been used in previous research (e.g., [76, 78, 100]), in particular in detecting malicious Fast-Flux services or in classifying malicious URLs [68]. The features that we use in the detection and our rationale for selecting these features are explained in detail in Section 5.2.
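For clarity, the pieces of information available to the system for each RDNS answer can be represented roughly as in the following sketch; the type and field names are ours, chosen for illustration, and are not part of the actual implementation.

from dataclasses import dataclass
from typing import List

@dataclass
class RdnsAnswer:
    # The information available for each answer returned by an authoritative
    # server to an RDNS server (no client IP addresses, for privacy reasons).
    domain: str        # queried domain name
    timestamp: float   # time the query was issued (epoch seconds)
    ttl: int           # how long the answer may be cached, in seconds
    ips: List[str]     # IP addresses associated with the queried domain

# Hypothetical example record:
example = RdnsAnswer(domain="example-malicious-domain.com",
                     timestamp=1291766400.0, ttl=300,
                     ips=["192.0.2.10", "198.51.100.7"])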

Figure 5.1: Overview of EXPOSURE

5.1.2 Architecture of EXPOSURE

Figure 5.1 gives an overview of the system architecture of EXPOSURE. The system consists of five main components.

The first component, the Data Collector, records the DNS traffic produced by the network that is being monitored.

The second component is the Feature Attribution component. This component is responsible for annotating the recorded domains with the features that we look for in the DNS traffic.

The third component, the Malicious and Benign Domains Collector, works independently of, and in parallel to, the Data Collector module. It collects domains that are known to be benign or malicious from various sources. Our benign domain set is composed of information acquired from Alexa [4] and a number of servers that provide detailed WHOIS [2] data. In contrast, the malicious domain set is constructed from domains that have been reported to be involved in malicious activities such as phishing, spamming, and botnet infections by external sources such as malwaredomains.com, Phishtank [79], and malware analyzers such as Anubis [21]. Note that these lists are constantly updated, and become even more comprehensive over time. The output of the Malicious and Benign Domains Collector is used to label the output of the Feature Attribution component.

Once the data is labeled, the labeled set is fed into the fourth component, the Learning Module. This module trains on the labeled set to build malicious domain detection models. Consequently, these models, and the unlabeled domains, become the input to the fifth component, the Classifier.

The Classifier component takes decisions according to the detection models produced by the Learning Module, so that the unlabeled domains are grouped into two classes: domains that are malicious, and those that are benign.

5.1.3 Real-Time Deployment

The deployment phase of EXPOSURE consists of two steps. In the first step, the features that we are interested in are monitored and the classifier is trained based on a set of domains that are known to be benign or malicious. In the second step, after the classifier has been trained, the detection starts and domains that are determined to be suspicious are reported. Note that after an initial period of seven days of training 1, the classifier is retrained every day. Hence, EXPOSURE can constantly keep up with the behavior of new malware.

5.2 Feature Selection

To determine the DNS features that are indicative of malicious behavior, we tracked and studied the DNS usage of several thousand well-known benign and malicious domains for a period of several months (we obtained these domains from the sources described in Section 5.3). After this analysis period, we identified 15 features that are able to characterize malicious DNS usage. Table 5.1 gives an overview of the components of the DNS requests that we analyzed (i.e., feature sets) and the features that we identified. In the following sections, we describe these features and explain why we believe that they may be indicative of malicious behavior.

5.2.1 Time-Based Features

The first component of a DNS record that we analyze is the time at which the request is made. Clearly, the time of an individual request is not very useful by itself. However, when we analyze many requests to a particular domain over time, patterns indicative of malicious behavior may emerge. In particular, we examine the changes in the volume (i.e., number) of requests for a domain. The time-based features that we use in our analysis are novel and have not been studied in previous approaches.

One of our insights is that malicious domains will often show a sudden increase followed by a sudden decrease in the number of requests. This is because malicious services often use a technique called domain flux [93] to make their infrastructures more robust and flexible against takedowns.

1. We have experimentally determined the optimal training period to be seven days (see Section 5.3.2).

Feature Set                   #    Feature Name
Time-Based Features           1    Short life
                              2    Daily similarity
                              3    Repeating patterns
                              4    Access ratio
DNS Answer-Based Features     5    Number of distinct IP addresses
                              6    Number of distinct countries
                              7    Number of domains share the IP with
                              8    Reverse DNS query results
TTL Value-Based Features      9    Average TTL
                              10   Standard Deviation of TTL
                              11   Number of distinct TTL values
                              12   Number of TTL change
                              13   Percentage usage of specific TTL ranges
Domain Name-Based Features    14   % of numerical characters
                              15   % of the length of the LMS

Table 5.1: Features. (LMS = Longest Meaningful Substring)

Each bot may use a domain generation algorithm (DGA) to compute a list of domains to be used as the command and control server or the dropzone. Obviously, all domains that are generated by a DGA have a short life span, since they are used only for a limited duration. Examples of malware that make use of such DGAs are Kraken/Bobax [12], the Srizbi bots [107], and the Conficker worm [80]. Similarly, malicious domains that have recently been registered and are involved in scam campaigns will show an abrupt increase in the number of requests as more and more victims access the site in a short period of time.

To analyze the changes in the number of requests for a domain during a given period of time, we divide this period into fixed-length intervals. Then, for each interval, we can count the number of DNS queries that are issued for the domain. In other words, the collection of DNS queries that target the domain under analysis can be converted into time series (i.e., chronologically ordered sequences of data values). Hence, we can leverage off-the-shelf algorithms [18]. We perform our time series analysis at two different scopes: first, we analyze the time series globally. That is, the start and end times of the time series are chosen to be the same as the start and end times of the entire monitoring period. Second, we apply local scope time series analysis, where the start and end times are the first and last time the domain is queried during the analysis interval. While the global scope analysis is used for detecting domains that either have a short life or have changed their behavior for a short duration, the local scope analysis focuses on how domains behave during their lifetime.
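The conversion of a domain's query log into global- and local-scope time series can be sketched as follows; the interval length of 3600 seconds follows the text, while the function names are illustrative.

def to_time_series(query_times, start, end, interval=3600):
    # Count the number of DNS queries per fixed-length interval.
    n_bins = int((end - start) // interval) + 1
    series = [0] * n_bins
    for t in query_times:
        if start <= t <= end:
            series[int((t - start) // interval)] += 1
    return series

def global_scope(query_times, monitor_start, monitor_end):
    # Series start/end = start/end of the entire monitoring period.
    return to_time_series(query_times, monitor_start, monitor_end)

def local_scope(query_times):
    # Series start/end = first/last time the domain was queried.
    return to_time_series(query_times, min(query_times), max(query_times))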

A domain is defined to be a short-lived domain (i.e., Feature 1) if it is queried only between times t0 and t1, and if this duration is comparatively short (e.g., less than several days). A domain that suddenly appears in the global scope time series and disappears after a short period of activity exhibits behavior that is fairly abnormal for a benign domain. Normally, if a domain is benign, even if it is not very popular, our thesis is that the number of queries it receives should exceed the threshold at least several times during the monitoring period (i.e., two and a half months in our experiments). Therefore, its time series analysis will not result in an abrupt increase followed by a decrease, as the time series produced by a short-lived domain does.

The main idea behind performing local scope analysis is to zoom into the lifetime of a domain and study its behavioral characteristics. We mainly focus on three features (i.e., Features 2, 3, 4) that may distinguish malicious and benign behavior either by themselves or when used in conjunction with other features. All the features involve finding similar patterns in the time series of a domain. Feature 2 checks whether a domain shows daily similarities in its request count change over time (i.e., an increase or decrease of the request count at the same intervals every day). Feature 3 aims to detect regularly repeating patterns. Finally, Feature 4 checks whether the domain is generally in an “idle” state (i.e., the domain is not queried) or is accessed continuously (i.e., a popular domain).

The problem of detecting both short-lived domains and domains that have regularly repeating patterns can be treated as a change point detection (CPD) problem. CPD algorithms operate on time series, and their goal is to find those points in time at which the data values change abruptly. The CPD algorithm that we implemented [18] outputs the points in time at which a change is detected and the average behavior for each duration. In the following section, we explain how we interpret the output of the CPD to detect short-lived domains and domains with regularly repeating patterns.

Detecting abrupt changes

As CPD algorithms require the input to be in a time series format, for each domain, we prepare a time series representation of its request count change over time. Our interval length for each sampling point is 3600 seconds (i.e., one hour). We chose 3600 seconds as the interval length after experimenting with different values (e.g., 150, 300, etc.). Before feeding the input into the CPD algorithm, we normalize the data with respect to the local maximum. Then, we make use of the well-known CUSUM (cumulative sum) robust CPD algorithm, which is known to deliver good results for many application areas [18]. CUSUM is an online algorithm that detects changes as soon as they occur. However, since we record the data to a database before analyzing it, our offline version of the CUSUM algorithm yields even more precise results (i.e., the algorithm knows in advance what the “future” traffic will look like, as we have already recorded it).

Our algorithm to identify change points works as follows: first, we iterate over every time interval t = 3600 seconds, from the beginning to the end of the time series. For each interval t, we calculate the average request count P_t^- for the previous ε = 8 time intervals and the traffic profile P_t^+ for the subsequent ε intervals. We chose ε to be 8 hours based on the insight that a typical day consists of three important periods: working time, evening, and night. Second, we compute the distance d(t) between P_t^- and P_t^+. More precisely:

\[
P_t^- = \frac{1}{\varepsilon} \sum_{i=1}^{\varepsilon} P_{t-i}, \qquad
P_t^+ = \frac{1}{\varepsilon} \sum_{i=1}^{\varepsilon} P_{t+i}, \qquad
d(t) = \left| P_t^- - P_t^+ \right| \qquad (5.1)
\]

The ordered sequence of values d(t) forms the input to the CUSUM algorithm. Intuitively, a change point is a time interval t for which d(t) is sufficiently large and is a local maximum.

The CUSUM algorithm requires two parameters. The first parameter is an upper bound (local_max) for the normal, expected deviation of the present (and future) traffic from the past. For each time interval t, CUSUM adds d(t) - local_max to a cumulative sum S. The second parameter determines the upper bound (cusum_max) that S may reach before a change point is reported. To determine a suitable value for local_max, we require that each individual traffic feature may deviate by at most allowed_avg_dev = 0.1. Based on this, we can calculate the corresponding value local_max = sqrt(dim × allowed_avg_dev²). Since in our application there is only one dimension, local_max = allowed_avg_dev. For cusum_max, we use a value of 0.4. Note that we determined the values for allowed_avg_dev and cusum_max based on empirical experiments and measurements.

The CPD algorithm outputs the average request count for each period in which a change is detected and the time at which the change occurs. Since we employ the CPD algorithm for two purposes (namely, to detect short-lived domains and domains that have repeating patterns), we run it twice: first with the global scope time series and then with the local scope time series as input. When the CPD is run with the global scope time series, it can detect short-lived domains. Short-lived domains tend to have two sudden behavioral changes, whereas domains that are continuously queried have multiple change points. On the other hand, to detect domains with repeating patterns in their local scope time series, we examine the number of changes and the standard deviation of the durations of the detected changes.
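The offline change point detection described above can be approximated with the sketch below. The parameter values are taken from the text; the exact bookkeeping (e.g., how the cumulative sum is reset after a reported change point) is an assumption and may differ from the thesis' implementation.

EPS = 8                        # epsilon: number of intervals on each side (8 hours)
ALLOWED_AVG_DEV = 0.1
LOCAL_MAX = ALLOWED_AVG_DEV    # sqrt(dim * allowed_avg_dev^2) with dim = 1
CUSUM_MAX = 0.4

def detect_change_points(series):
    # Normalize the request counts with respect to the local maximum.
    peak = max(series) or 1
    norm = [v / peak for v in series]

    change_points, s = [], 0.0
    for t in range(EPS, len(norm) - EPS):
        p_minus = sum(norm[t - EPS:t]) / EPS              # average of previous eps intervals
        p_plus = sum(norm[t + 1:t + 1 + EPS]) / EPS       # average of subsequent eps intervals
        d = abs(p_minus - p_plus)                         # d(t) from Equation 5.1
        s = max(0.0, s + d - LOCAL_MAX)                   # accumulate deviation above local_max
        if s > CUSUM_MAX:
            change_points.append(t)                       # report a change point ...
            s = 0.0                                       # ... and restart the accumulation
    return change_points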

Detecting similar daily behavior

A typical technique to measure the level of similarity of two time series is to calculate the distance between them [57]. To determine whether a domain produces similar time series every day, we calculate the Euclidean Distance between every pair of daily time series of a domain. Euclidean Distance is a popular distance measure that is often used in data mining [22, 101, 113]. We first need to break the local time series produced for each domain into daily time series pieces. Each day starts at 00:00 and finishes at 23:59. Assuming that a domain has been queried on n days during our analysis period, and that d_{i,j} is the Euclidean Distance between the ith day and the jth day, the final distance D is calculated as the average over (n-1)(n-2)/2 distance pairs, as shown in the following formula:

\[
D = \frac{\sum_{i=1}^{n} \sum_{j=i+1}^{n} d_{i,j}}{(n-1)(n-2)/2} \qquad (5.2)
\]

Using the Euclidean Distance, the results are sensitive to small variations in the measurements (e.g., 1000 requests between 9 and 10 am compared to 1002 requests in the same time period may fail to produce a correct similarity result, although the difference is not significant). A common technique to increase the correctness of the results is to apply preprocessing algorithms to the time series before calculating the Euclidean Distance [34]. In our preprocessing step, we transform the time series T = t_1, t_2, ..., t_n, where n is the number of intervals, in two phases. In the first phase, we perform offset translation by subtracting the mean of the series from each value (i.e., T = T - mean(T)). In the second phase, we scale the amplitude by dividing each value by the standard deviation (i.e., T = (T - mean(T))/std(T)) [34].
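A sketch of the daily-similarity computation (Feature 2): each day's series is normalized as described above, and the average pairwise Euclidean Distance is returned. The averaging denominator follows Equation 5.2 as stated in the text; function names are illustrative.

import math

def znorm(ts):
    # Offset translation and amplitude scaling, as in the preprocessing step.
    mean = sum(ts) / len(ts)
    std = math.sqrt(sum((v - mean) ** 2 for v in ts) / len(ts)) or 1.0
    return [(v - mean) / std for v in ts]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def daily_similarity(daily_series):
    # daily_series: one 24-value (hourly) time series per day the domain was queried.
    days = [znorm(d) for d in daily_series]
    n = len(days)
    dists = [euclidean(days[i], days[j]) for i in range(n) for j in range(i + 1, n)]
    # Average over (n - 1)(n - 2)/2 pairs, following Equation 5.2 in the text
    # (guarded against a zero denominator for very small n).
    return sum(dists) / ((((n - 1) * (n - 2)) / 2) or 1)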

5.2.2 DNS Answer-Based Features

The DNS answer that is returned by the server for a domain generally consists of several DNS A records (i.e., mappings from the host to IP addresses). Of course, a domain name can map to multiple IP addresses. In such cases, the DNS server cycles through the different IP addresses in a round-robin fashion [1] and returns a different IP mapping each time. This technique is useful in practice for load balancing.

Malicious domains typically resolve to compromised computers that reside in different Autonomous Systems (ASNs), countries, and regions. The attackers are opportunistic, and do not usually target specific countries or IP ranges. Whenever a computer is compromised, it is added as an asset to the collection. Also, attackers typically use domains that map to multiple IP addresses, and IPs might be shared across different domains. With this insight, we extracted four features from the DNS answer (i.e., feature set F2). The first feature is the number of different IP addresses that are resolved for a given domain during the experiment window (Feature 5). The second feature is the number of different countries that these IP addresses are located in (Feature 6). The third feature is the reverse DNS query results of the returned IP addresses (Feature 7). The fourth feature (Feature 8) is the number of distinct domains that share the IP addresses that resolve to the given domain. Note that Features 5, 6, and 7 have been used in previous work (e.g., [?, 14, 78, 100]).

Although uncommon, benign domains may also share the same IP address with many other domains. For example, during our experiments, we saw that one of the IP addresses that belongs to networksolutions.com is shared by 10,837 distinct domains. This behavior is sometimes exhibited by web hosting providers and shared hosting services. To determine if an IP is used by a shared hosting service, we query Google with the reverse DNS answer of the given IP address. Legitimate web hosting providers and shared hosting services are typically ranked in the top 3 query answers that Google provides. This helps us reduce false positives.
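The DNS answer-based features can be sketched as follows. The country lookup is passed in as a callable (e.g., backed by a GeoIP database), and the reverse-DNS and Google checks are omitted, since they depend on external services; all names are assumptions made for this sketch.

def dns_answer_features(answers, ip_to_domains, country_of):
    # answers:        list of RdnsAnswer records for one domain (see earlier sketch)
    # ip_to_domains:  mapping IP address -> set of domains seen resolving to that IP
    # country_of:     callable IP -> country code (e.g., backed by a GeoIP database)
    domain = answers[0].domain
    ips = {ip for a in answers for ip in a.ips}
    sharing = set()
    for ip in ips:
        sharing |= ip_to_domains.get(ip, set())
    sharing.discard(domain)
    return {
        "distinct_ips": len(ips),                                   # Feature 5
        "distinct_countries": len({country_of(ip) for ip in ips}),  # Feature 6
        "domains_sharing_ip": len(sharing),
        # The reverse-DNS query results and the Google check for shared-hosting
        # providers are not computed here, as they rely on external lookups.
    }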

5.2.3 TTL Value-Based Features

Every DNS record has a Time To Live (TTL) that specifies how long the corresponding response for a domain should be cached. It is recommended that the TTL is set to between 1 and 5 days, so that both the DNS clients and the name servers can benefit from the effects of DNS caching [3]. Systems that aim for high availability often set the TTL values of host names to lower values and use Round-Robin DNS. That is, even if one of the IP addresses is not reachable at a given point in time, since the TTL value expires quickly, another IP address can be provided. Representative examples of such systems are Content Delivery Networks (CDNs).

Unfortunately, setting lower TTL values and using Round-Robin DNS is useful for the attackers as well. Using this approach, malicious systems achieve higher availability and become more resistant against DNS blacklisting (DNSBL) [5] and takedowns. For example, Fast-Flux Service Networks (FFSN) [100] are malicious systems that abuse Round-Robin DNS. Most techniques to detect FFSNs are based on analyzing abnormal usage patterns of Round-Robin DNS. More precisely, to label a domain as being a member of an FFSN, previous research [78, 100] expects to observe a low TTL usage combined with a constantly growing DNS answer list (i.e., distinct IP addresses).

We extracted five features from the TTL value included in the DNS answers (see Table 5.1). The average TTL usage feature (Feature 9) was introduced in previous research [78]. The rest of the features (i.e., Features 10, 11, 12, 13) have not been used before in previous work.

During our experiments with large volumes of DNS traffic, we observed that frequent TTL changes are exhibited by malicious networks that have a sophisticated infrastructure. In such networks, some of the bots are selected to be proxies behind which other services (e.g., command and control servers) can be hidden. The managers of such malicious networks assign different levels of priority to the proxy bots by setting lower TTL values for the hosts that are less reliable. For example, there is a good chance that a proxy running on an ADSL line would be less reliable than a proxy running on a server in a university environment. To determine the validity of our assumption about this type of TTL behavior, we tracked the Conficker domains for one week. We observed that different TTL values were returned for the IPs associated with the Conficker domains. While the static IP addresses have higher TTL values, the dynamic IP addresses, which are most probably assigned to home computers by Internet service providers, have lower TTL values (e.g., adsl2123-goland.net would have a lower TTL value than a compromised host with the domain name workstation.someuniversity.edu).

We observed that the number of TTL changes and the total number of different TTL values tend to be significantly higher in malicious domains than in benign domains. Also, malicious domains exhibit more scattered usage of TTL values. We saw that the percentage of usage of some specific ranges of TTL values is often indicative of malicious behavior. Based on our empirical measurements and experiments, the TTL ranges that we investigate are [0, 1), [1, 10), [10, 100), [100, 300), [300, 900), [900, inf). Malicious domains tend to set their TTL values to lower values compared to benign domains. In particular, the range of [0, 100) exhibits a significant peak for malicious domains.
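The TTL value-based features (Features 9 through 13) can be sketched as follows; the range boundaries are those given in the text, and the dictionary keys are illustrative.

import statistics

TTL_RANGES = [(0, 1), (1, 10), (10, 100), (100, 300), (300, 900), (900, float("inf"))]

def ttl_features(ttls):
    # ttls: the TTL values observed in the DNS answers for one domain, in query order.
    changes = sum(1 for prev, cur in zip(ttls, ttls[1:]) if cur != prev)
    range_usage = {
        f"ttl_in_[{lo},{hi})": sum(1 for t in ttls if lo <= t < hi) / len(ttls)
        for lo, hi in TTL_RANGES
    }
    return {
        "avg_ttl": statistics.mean(ttls),        # Feature 9
        "std_ttl": statistics.pstdev(ttls),      # Feature 10
        "distinct_ttls": len(set(ttls)),         # Feature 11
        "ttl_changes": changes,                  # Feature 12
        **range_usage,                           # Feature 13: percentage usage per range
    }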

5.2.4 Domain Name-Based Features

The main purpose of DNS is to provide human-readable names to users, as they often cannot memorize the IP addresses of servers. Benign services therefore usually try to choose domain names that can be easily remembered. For example, a bank called “The Iceland Bank” might have a domain name such as “www.icelandbank.com”. In contrast, having an easy-to-remember domain name is not a concern for attackers. This is particularly true for domain names that are generated by a DGA. To detect such domains, we extracted two features from the domain name itself: first, the ratio of numerical characters to the length of the domain name (Feature 14), and second, the ratio of the length of the longest meaningful substring (i.e., a word in a dictionary) to the length of the domain name (Feature 15). Note that there exist popular domains such as yahoo.com and google.com that do not necessarily include “meaningful” words. To gain higher confidence about a domain, we query Google and check whether it returns a hit count for the domain that is above a pre-defined threshold.

When analyzing a domain, we only focus on the second-level domain (i.e., SLD). For example, for x.y.server.com, we would take server.com. To detect domain names that have possibly been automatically generated, we calculate the percentage of numerical characters (Feature 14) and the ratio of the length of the longest meaningful substring to the total length of the SLD (Feature 15). To extract all possible meaningful substrings from an SLD, we check the English dictionary. As some benign domains in China and Russia consist of combinations of alphabetical and numerical characters, Feature 15 produces a high false positive rate. However, when Features 14 and 15 are combined, the false positives decrease. Also, for domains that are determined to be suspicious, we check how many times they are listed by Google. The reasoning here is that sites that are popular and benign will have higher hit counts.
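Features 14 and 15 can be computed roughly as follows. The dictionary is a plain word list, and the second-level domain extraction is simplified (it does not handle multi-part public suffixes such as .co.uk); all names here are illustrative.

def second_level_domain(name):
    # Simplified SLD extraction: x.y.server.com -> server.com
    parts = name.lower().rstrip(".").split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else name

def numeric_ratio(sld):
    # Feature 14: ratio of numerical characters in the SLD label.
    label = sld.split(".")[0]
    return sum(c.isdigit() for c in label) / len(label)

def longest_meaningful_substring_ratio(sld, dictionary):
    # Feature 15: length of the longest dictionary word contained in the SLD label,
    # relative to the length of that label.
    label = sld.split(".")[0]
    longest = 0
    for i in range(len(label)):
        for j in range(i + 1, len(label) + 1):
            if label[i:j] in dictionary and j - i > longest:
                longest = j - i
    return longest / len(label)

words = {"iceland", "bank", "land", "ice"}     # stand-in for an English dictionary
sld = second_level_domain("www.icelandbank.com")
print(numeric_ratio(sld), longest_meaningful_substring_ratio(sld, words))  # 0.0, ~0.64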

5.3 Building Detection Models

5.3.1 Constructing the Training Set

The quality of the results produced by a machine learning algorithm strongly depends on the quality of the training set [99]. Our goal is to develop a classifier that is able to label domains as being benign or malicious. Thus, we require a training set that contains a representative sample of benign and malicious domains. To this end, we studied several thousand malicious and benign domains, and used them for constructing our training set.

We collected malicious domains from multiple sources. Specifically, we obtained malicious domains from malwaredomains.com [45], the Block List [66], Malware Domains List [65], Anubis [21] reports, a list of domains extracted from URLs suspected to be malicious that were analyzed by Wepawet [37], and Phishtank [79]. We also used the list of domains that are generated by the DGAs of the Conficker [80] and Mebroot [93] botnets. These malicious domain lists represent a wide variety of malicious activity, including botnet command and control servers, drive-by download sites, phishing pages, and scam sites that can be found in spam mails.

Note that we are conservative when constructing the malicious domain list. That is, we apply a preliminary check before labeling a domain as being malicious and using it in our training set. Malicious domain sources such as Wepawet and Phishtank operate on URLs that have been submitted by users. Hence, while most URLs in these repositories are malicious, not all of them are. Also, while some third-level domains (3LDs) of a domain extracted from a URL may behave maliciously, the rest may not (e.g., a.x.com might be malicious, while x.com might be benign). Assuming that a domain that is suspected to be malicious either by Wepawet or Phishtank has t_total possible 3LDs (the number of distinct 3LDs recorded by EXPOSURE during the analysis period) and t_mal 3LDs are thought to be malicious, we choose the domain to be representative of malicious behavior only if t_mal/t_total is greater than 0.75 (i.e., only if 75% of the 3LDs have been reported to be involved in malicious activities). The initial malicious domain list that we generated consists of 3500 domains.

As discussed in detail in Section 5.4.1, we assume that all of the Alexa top 1000 domains and domains that we have observed on our sensors that are older than one year are benign. Therefore, we construct our initial benign domain list using these domains. However, to ensure that our benign domain list does not include any domain that might have been involved in malicious activity, we perform a two-step verification process. First, we compare all the domains in the benign domain list with the malicious domain list and with the tools that test domains for their maliciousness, specifically with McAfee Site Advisor and Norton Safe Web. Second, we also cross-check the benign domains with the list provided by the Open Directory Project (ODP – a large, human-edited directory of the web constructed and maintained by volunteer editors). Our initial benign domain list consists of 3000 domains.

5.3.2 The Initial Period of Training

By experimenting with different values, we determined that the optimal period of initial training for our system is seven days. This period is mainly required to be able to use the time-based features that we described in Section 5.2. During this time, we can observe the time-based behavior of the domains that we monitor and can accurately decide on their maliciousness. After the initial week of training, we are able to retrain the system every day, hence increasing detection accuracy.

5.3.3 The Classifier

Our classifier is built on the J48 decision tree algorithm. J48 [106] is an implementation of the C4.5 algorithm [82] that is designed for generating either pruned or unpruned C4.5 decision trees. It constructs a decision tree from a labeled training set by using the concept of information entropy (i.e., the attribute values of the training set).

The J48 algorithm leverages the fact that the tree can be split into smaller subtrees with the information obtained from the attribute values. Whenever the algorithm encounters a set of items that can clearly be separated from the other class by a specific attribute, it branches out a new leaf according to the value of the attribute. Each time a decision needs to be taken, the attribute with the highest normalized gain is chosen. Among all possible values of the attributes, if there is any value for which there is no ambiguity, the branch is terminated and the appropriate label is assigned to it. The splitting procedure stops when all instances in all subsets belong to the same class. We use a decision tree classifier because these algorithms have been shown to be efficient while producing accurate results [82]. As the decision tree classifier builds a tree during the training phase, the features that are best at separating the malicious and the benign domains can be clearly seen.
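The thesis uses Weka's J48 (C4.5) implementation. Purely as an illustration of the training-and-classification step, the sketch below uses scikit-learn's DecisionTreeClassifier (a CART implementation) as a readily available stand-in, assuming scikit-learn is installed; the feature vectors and labels are placeholder values, not real data.

from sklearn.tree import DecisionTreeClassifier

# X: one row of the 15 feature values per training domain (placeholder data);
# y: 1 for domains from the malicious list, 0 for domains from the benign list.
X_train = [[0.9, 0.1, 3, 0.2, 12, 4, 30, 1, 60, 25.0, 5, 4, 0.8, 0.3, 0.2],
           [0.1, 0.8, 1, 0.9,  2, 1,  1, 0, 86400, 0.0, 1, 0, 0.0, 0.0, 0.9]]
y_train = [1, 0]

clf = DecisionTreeClassifier()          # stand-in for Weka's J48 / C4.5
clf.fit(X_train, y_train)

# Classify an unlabeled domain represented by its 15 feature values.
unlabeled = [[0.8, 0.2, 2, 0.3, 9, 3, 15, 1, 120, 40.0, 4, 3, 0.7, 0.4, 0.3]]
print("malicious" if clf.predict(unlabeled)[0] == 1 else "benign")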

Figure 5.2: Percentage of misclassified instances for each feature set combination (F1, F2, F3, F4, F12, F13, F14, F123, F134, F124, F-all) under three training schemes: full training set, 10-fold cross-validation, and 66% percentage split (error rate in %).

Recall that we divided the 15 features that we use into four different classes according to the type of information used: features that are extracted from the time series analysis (F1, Time-Based Features), the DNS answer analysis (F2, DNS Answer-Based Features), the TTL value analysis (F3, TTL Value-Based Features), and the analysis of the domain name (F4, Domain Name-Based Features).

To find the combination of features that produces the minimum error rate, we trained classifiers using different combinations of feature sets and compared the results. Figure 5.2 shows the percentage of misclassified items under three different training schemes: 10-fold cross-validation, 66% percentage split, and training on the whole training set. Note that the smallest error rates were produced by F1. Therefore, while experimenting with different combinations of feature sets, we excluded the combinations that do not include F1 (i.e., F23, F24, F34 and F234). The highest error rates are produced by F3 and F4. However, when all features are combined (i.e., F-all), the minimum error rate is produced. Hence, we use the combination of all the features in our system.

5.4 Evaluation

5.4.1 DNS Data Collection for Offline Experiments

Our sensors for the SIE DNS feeds receive a large volume of traffic (1 million queries per minute on average). Therefore, during our offline experimental period of two and a half months, we monitored approximately 100 billion DNS queries. Unfortunately, tracking, recording, and post-processing this volume of traffic without applying any filtering was not feasible in practice. Hence, we reduced the volume of traffic that we wished to analyze to a more manageable size by using two filtering policies. The goal of these policies was to eliminate as many queries as possible that were not relevant for us. However, we also had to make sure that we did not miss relevant, malicious domains.

The first policy we used whitelisted popular, well-known domains that were very unlikely to be malicious. To create this whitelist, we used the Alexa Top 1000 Global Sites [4] list. Our premise was that the most popular 1000 websites on the Internet would not likely be associated with domains that were involved in malicious activity. These sites typically attract many users, and are well-maintained and monitored. Hence, a malicious popular domain cannot hide its malicious activities for long. Therefore, we did not record the queries targeting the domains in this whitelist. The domains in the whitelist received 20 billion queries during the two and a half months. By applying this first filtering policy, we were able to reduce the traffic we were observing by 20%.

The second filtering policy targeted domains that were older than one year. The reasoning behind this policy was that many malicious domains are disclosed after a short period of activity, and are blacklisted. As a result, some miscreants have resorted to using domain generation algorithms (DGAs) to make it more difficult for the authorities to blacklist their domains. For example, well-known botnets such as Mebroot [93] and Conficker [80] deploy such algorithms for connecting to their command and control servers. Typically, the domains that are generated by DGAs and registered by the attackers are new domains that are at most several months old. In our data set, we found 45,000 domains that were older than one year. These domains received 40 billion queries. Hence, the second filtering policy reduced the remaining traffic by 50%, and made it manageable in practice.

Clearly, filtering out domains that do not satisfy our age requirements could mean that we may miss malicious domains older than one year for the training. However, our premise is that if a domain is older than one year and has not been detected by any malware analysis tool, it is not likely that the domain serves malicious activity. To verify the correctness of our assumption, we checked if we had filtered out any domains that were suspected to be malicious by malware analysis tools such as Anubis and Wepawet. Furthermore, we also queried reports produced by Alexa [4], McAfee Site Advisor [8], Google Safe Browsing [6] and Norton Safe Web [9]. 40 out of the 45,000 filtered-out domains (i.e., only 0.09%) were reported by these external sources to be “risky” or “shady”. We therefore believe that our filtering policy did not miss a significant number of malicious domains because of the pre-filtering we performed during the offline experiments.
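The two pre-filtering policies amount to a simple predicate over incoming queries, sketched below. The whitelist and the registration-age lookup are assumptions about how this information is made available (e.g., an Alexa list and WHOIS data); they are not part of the recorded DNS feeds themselves.

from datetime import datetime, timedelta

def should_record(domain, now, alexa_top_1000, registration_date):
    # Policy 1: skip queries for domains in the Alexa Top 1000 whitelist.
    if domain in alexa_top_1000:
        return False
    # Policy 2: skip domains registered more than one year ago.
    reg = registration_date(domain)          # e.g., looked up from WHOIS data
    if reg is not None and now - reg > timedelta(days=365):
        return False
    return True

# Example with hypothetical values:
print(should_record("example.com", datetime(2010, 9, 1),
                    {"google.com", "example.com"},
                    lambda d: datetime(1995, 8, 14)))   # False: whitelisted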

5.4.2 Evaluation of the Classifier

To evaluate the accuracy of the J48 decision tree classifier, we classified our training set with 10-fold cross-validation and percentage split, where 66% of the training set is used for training and the rest is used to check the correctness. Figure 5.3 reports the results of the experiment. The Area Under the ROC Curve [28] for the classifier is high for both methods.

                           AUC     Detection Rate   False Positives
Full data                  0.999   99.5%            0.3%
10-fold Cross-Validation   0.987   98.5%            0.9%
66% Percentage Split       0.987   98.4%            1.1%

Figure 5.3: Classification accuracy. (AUC=Area Under the ROC Curve)
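For concreteness, the two evaluation protocols can be reproduced with standard tooling. The snippet below is a rough sketch only: scikit-learn's DecisionTreeClassifier stands in for Weka's J48 (C4.5) used in the thesis, and the feature matrix and labels are random placeholders rather than the actual training set.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.random((1000, 15))          # placeholder feature vectors
y = rng.integers(0, 2, size=1000)   # placeholder labels (0 = benign, 1 = malicious)

clf = DecisionTreeClassifier()

# 10-fold cross-validation, scored by the area under the ROC curve.
auc_cv = cross_val_score(clf, X, y, cv=10, scoring="roc_auc").mean()

# 66% percentage split: train on two thirds, test on the remaining third.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.66, random_state=0)
auc_split = roc_auc_score(y_te, clf.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])

print(f"AUC (10-fold CV): {auc_cv:.3f}, AUC (66% split): {auc_split:.3f}")
```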

Note that the false positive rate is low (i.e., around 1% for both methods). After investigating the reasons for the misclassifications, we saw that the classifier had identified 8 benign domains as being malicious. These domains were requested only a small number of times during the two and a half months of experiments (leading the classifier to conclude that they were short-lived), and they exhibited

TTL behavior that looked anomalous (e.g., possibly because there was a configuration error, or because the site maintainers were experimenting to determine the optimal TTL value).

5.4.3 Experiments with the Recorded Data Set

During the two and a half month offline experimental period, we recorded and then analyzed 4.8 million distinct domain names that were queried by real Internet users. Note that a domain that receives only a few requests cannot produce a time series that is usable for the time-based features we analyze, because time series analysis produces accurate results only when the number of samples is high enough. To find a threshold for the minimum number of queries required per domain, we trained the classifier on our known malicious and benign domain lists with different threshold values. Figure 5.4 shows the detection and false positive rates for the threshold values we tested. Based on these empirical results, we set the threshold to 20 queries and excluded from our experiments the 4.5 million domains that received fewer than 20 requests during the two and a half months of monitoring.

[Plot omitted: true positive and false positive rates (y-axis, 0 to 1) versus the minimum request threshold (x-axis, 0 to 50 queries per domain).]

Figure 5.4: The effect of the minimum request count on detection rate

For further experiments, we then focused on the remaining 300,000 domains that were queried more than 20 times. EXPOSURE classified 17,686 of these 300,000 domains (5.9%) as malicious.

Evaluation of the Detection Rate

The percentage split and cross-validation evaluations on the training set show that the detection rate of our classifier is around 98%. Since our goal is to detect unknown malicious domains that have not been reported by any malicious domain analyzer, our evaluation needs to show that we are able to detect malicious domains that do not exist in our training set. To this end, we used malwareurls.com, a malware domain list that we had not used as a source for the initial malicious training set. During the period in which we performed our experiments, malwareurls.com reported 569 domains as being malicious. Out of these 569 domains, 216 were queried by the infected machines in the networks that we were monitoring; the remaining 353 malware domains were not requested. Therefore, in our detection rate evaluation, we take into account only the 216 requested domains.

5 of the 216 domains were queried fewer than 20 times during the entire monitoring period. Since we filter out domains that are requested fewer than 20 times, we only fed the remaining 211 domains to our system. In the experiments, all of these domains (which were previously unknown to us) were automatically detected as being malicious by EXPOSURE. Hence, the detection rate we observed was consistent with the detection rate (i.e., 98%) estimated by the percentage split and cross-validation evaluations on the training set.

Obviously, our approach is not comprehensive and cannot detect all malicious domains on the Internet. However, its ability to detect a high number of unknown malicious domains from DNS traffic is a significant improvement over previous work.

Evaluation of the False Positives

As the domains in our data set are not labeled, determining the real false positive rate is a challenge. Unfortunately, manually checking all 17,686 domains that were identified as being malicious is not feasible. This is because it is difficult, in practice, to determine with certainty (in a limited amount of time) that a domain that is engaged in suspicious behavior is indeed malicious. Nevertheless, we conducted three experiments to estimate the false positives of our detection.

In order to obtain more information about the domains in our list, we first tried to automatically categorize them into different groups. For each domain, we performed Google searches, checked well-known spam lists, and fed the domain into Norton Safe Web (i.e., Symantec provided us with internal access to the information they collect about web pages). We divided the domains into ten groups: spam domains (Spam), black-listed domains (Black-List), malicious Fast-Flux domains (FastFlux), domains that are queried by malware analyzed by malware analysis tools (Malware), Conficker domains (Conficker), domains that have adult content (Adult), domains that are suspected to be risky by Norton Safe Web and McAfee Site Advisor (Risky), phishing domains (Phishing), domains about which we were not able to get any information either from Google or from other sources (No Info), and finally, benign domains that were detected as malicious (False Positives) (see Table 5.2).

In the first experiment, we manually investigated 50 random malicious domains from our list of 17,686. We queried Google, checked websites that discuss malicious networks, and tried to identify web links that reported malicious behavior by the domain. Among the 50 randomly chosen domains, the classifier had detected three benign domains as being malicious. All of these domains had an abnormal TTL change behavior.

In the second experiment, we automatically cross-checked the malicious and suspicious domains that we had identified with our classifier against online site rating tools such as McAfee Site Advisor [8], Google Safe Browsing [6] and Norton Safe Web [9]. The results show that the false positive estimate is around 7.9% for the malicious domains that we identified. Note that EXPOSURE did not generate any false positives during the two-week real-time, real-world deployment in an ISP discussed in the next section.

5.4.4 Real-World, Real-Time Detection with EXPOSURE

To test the feasibility and scalability of EXPOSURE as a malicious domain detector in real life, we deployed it in the network of an ISP that provided us with complete access to its DNS servers for two weeks. These servers receive DNS queries from a network that serves approximately 30,000 clients. During the two-week experimental period, EXPOSURE analyzed and classified 100 million DNS queries. No pre-filtering was applied. At the end of the two weeks, EXPOSURE had detected 3,117 new malicious domains that were previously unknown to the system and had not been used in training. 2,821 of these domains fall into the category of domains generated by a DGA, and all of them belong to the same malicious entity. 5 of the remaining 296 domains were reported as malicious by security companies

Group             Rand. 50   Malicious
Spam              18         3691
Black-List        8          1734
FastFlux          -          114
Malware           6          979
Conficker         4          3693
Adult             3          1716
Risky             -          788
Phishing          3          0
No Info           5          2854
False Positives   3 (6%)     1408 (7.9%)

Table 5.2: Tests for False Positives

such as Avira, one month after we had detected them. We cross-checked the rest of the remaining domains we had detected; all of them were classified as being risky by McAfee Site Advisor [8]. Figures 5.5(a) and 5.7 show the number of new, previously unknown malicious domains detected every day. As can be seen, after the initial seven days of local training in the monitored network, EXPOSURE started to produce daily detections and identified 200 new malicious domains per day on average.

After the experiments, we provided the ISP with the list of clients that were potentially infected or had been victims of scams. The number of distinct IP addresses that queried the malicious domains detected by EXPOSURE was 3,451. Since the ISP assigns IP addresses to its clients dynamically, this number does not represent the exact number of infected machines in the network. To estimate the number of infected machines, we grouped the malicious domains according to the IP addresses they are mapped to; this yielded 5 different groups of malicious domains. We then calculated the average number of distinct IP addresses that issued DNS queries to the domains in these 5 groups every hour. We chose one hour as the interval under the assumption that users in the network stay online for at least an hour before they disconnect. Table 5.3 lists the number of clients that attempted to access the domains in the different malware groups. We estimate that there were about 800 machines on the network that issued requests to the malicious domains.
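The estimation procedure sketched above (group the malicious domains by the server IPs they share, then average the number of distinct querying clients per hour within each group) can be expressed as follows. The data structures and values are hypothetical placeholders; the real input would be the recorded DNS query log and the five domain groups.

```python
from collections import defaultdict

# (hour, client_ip, domain) tuples from the DNS log (placeholder data).
queries = [
    (0, "10.0.0.1", "a.example"), (0, "10.0.0.2", "a.example"),
    (1, "10.0.0.3", "b.example"), (1, "10.0.0.1", "b.example"),
]
# Mapping from malicious domain to its group (domains sharing server IPs).
domain_group = {"a.example": "group1", "b.example": "group1"}

# Count distinct client IPs per (group, hour).
clients_per_group_hour = defaultdict(set)
for hour, client, domain in queries:
    group = domain_group.get(domain)
    if group is not None:
        clients_per_group_hour[(group, hour)].add(client)

# Sum, over the groups, of the average number of distinct clients per hour.
estimate = 0.0
for group in set(domain_group.values()):
    counts = [len(c) for (g, _), c in clients_per_group_hour.items() if g == group]
    estimate += sum(counts) / len(counts)

print(round(estimate))  # rough estimate of the number of infected machines
```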

[Plots omitted: panel (a), "The Time Domains Appear in the Time Series", and panel (b), "Domain Detection Time", show the domain count (y-axis) over the 14 days of the deployment (x-axis).]

Figure 5.5: The first time a domain is queried and the first time it is detected

5.4.5 Comparison with Previous Work

The Fast-Flux Detectors

The results in Table 5.2 show that 114 of the malicious domains that our classifier identified during the initial training phase fall into the category

Group                  Avg. lifetime   Most frequent lifetime   # of infected clients
DGA domains            1.2 days        0.99 days                49
Iksmas Worm            11.9 days       11.9 days                70
Worm:Win32/Slenping    12.0 days       12.0 days                253
Trojan-Generic.dx      11.9 days       11.9 days                70
Other                  10.8 days       11.9 days                391
Total                                                           833

Table 5.3: Information on the detected malicious domains

of Fast-Flux Service Networks (FFSNs). Since we claim that our approach is able to detect a wide range of malicious domains including FFSNs, we compare our detection rate for this threat with the most recent published work in this area (i.e., [78]). Perdisci et al. [78] filter out all of the domains that are not likely to be classified as FFSNs. When we applied the same policy to our two and a half month data set, most of the 300,000 domains were filtered out and 5,771 remained as FFSN candidates. When we classified these candidates with the feature set Perdisci et al. use in their paper, we detected 114 FFSNs using their approach. Hence, our approach is as good at detecting FFSNs as that of Perdisci et al., although it is a much more generic system.

Notos : Reputation-based Malicious Domain Detection

Very recently, Antonakakis et al. [14] concurrently and independently proposed a detection scheme that is similar to our work. The proposed system, Notos, dynamically assigns reputation scores to domain names whose maliciousness has not yet been discovered. The detection scheme is built on the premise that agile malicious uses of DNS have unique characteristics and can therefore be distinguished from benign use. To define these unique characteristics, the authors analyze a number of features that are grouped into three categories: network-based features, zone-based features (i.e., features that are extracted from the domain name itself, either by string analysis or from information obtained from the whois service), and evidence-based features.

While the network-based features are employed to filter out the domains that do not exhibit fluxy behavior (i.e., that show stable DNS usage), the zone-based features are used to distinguish between legitimate CDNs and domains that are likely to be malicious. After this two-layer classification, reputation scores are assigned to the domains. In other words, all of the domains and the IP addresses they are mapped to are compared with already known lists of domains or IP addresses that host malicious entities. This third classification step is performed using the evidence-based features.

In their paper, the authors state as a limitation that Notos is not able to detect malicious domains that are mapped to a new address space each time and are never reused for other malicious purposes. This limitation stems from the fact that Notos strongly relies on network-based features. EXPOSURE does not have this limitation because it uses time-based features: since such domains have a short life, they appear in the time series and disappear immediately after they are deactivated by the attacker. Hence, unlike Notos, we are able to detect such domains. As discussed before, the 2,821 automatically generated malicious domains that we detected in the real-life traffic of the ISP had an average lifespan of 1.2 days (see Table 5.3). That is, all of these domains were short-lived. During their lifetime, they mapped on average to 3.14 distinct IP addresses. In the first phase of Notos' detection scheme, the domains are divided into two categories: domains that have a stable network model and domains that do not. Since the automatically generated malicious domains that we are able to detect do not use a wide range of IP addresses, Notos might classify these domains as domains with a stable network profile. In other words, we believe that Notos might misclassify them.

Also, as the authors discuss in their paper, because Notos is a reputation-based system, there may be cases where legitimate domains that are hosted in "bad neighborhoods" are identified as being malicious. In comparison, the features that EXPOSURE relies on do not cause such false positives, as no historical information on IPs or domains is utilized.

One main advantage of EXPOSURE over Notos is that Notos requires a large passive DNS collection and sufficient time to create an accurate passive DNS database. First, it is unclear how much time is required for this database to be comprehensive. Second, this database needs to be constantly updated with large data feeds in order to remain accurate and to maintain a wide overview of malicious activities on the Internet.
In comparison, as we show in our evaluation, EXPOSURE required only a week of local training, and much less DNS data, in the network of a medium-size ISP to be able to detect unknown domains.

[Plot omitted: number of domains (y-axis, up to 30,000) versus the number of IP addresses they map to (x-axis, 1 to 141).]

Figure 5.6: Number of IP addresses mapped to the malicious domains detected by EXPOSURE

5.5 Real-World Deployment of EXPOSURE

After we published EXPOSURE at NDSS [24], we deployed it as a public service that analyzes the passive DNS data obtained from SIE and reports, on a daily basis, malicious domains used in the wild. EXPOSURE has been running since 28 December 2010 and has reported thousands of domains involved in various malicious activities on http://exposure.iseclab.org. In this section, we present some statistics about the malicious domains detected by EXPOSURE during a period of ten months.

During these ten months, EXPOSURE detected 58,412 malicious domains, which are mapped to 8,562 distinct IP addresses. While the average number of IP addresses associated with a malicious domain is 7, several malicious domains employ hundreds of thousands of IP addresses. Figure 5.6 shows the number of IP addresses that are returned in DNS records for the malicious domains identified by EXPOSURE. Note that we excluded from the graph the domains that resolve to hundreds of thousands of IPs in order to clearly show the number of IP addresses mapped to the majority of the malicious domains.

[Plot omitted: number of malicious domains detected per day (y-axis, 0 to 1,000) from January to November 2011 (x-axis).]

Figure 5.7: Number of domains detected on a daily basis by EXPOSURE

On average, EXPOSURE has detected 245 domains per day since December 2010. Figure 5.7 shows the number of domains detected as malicious over time. As can be seen, we provided detections on a daily basis except for the period between 22 July 2011 and 25 September 2011. This two-month gap without detections was caused by technical problems in accessing the DNS data we acquire from SIE.

As Figure 5.8 shows, the top-level domains of the majority of the malicious domains detected by EXPOSURE are .info, .biz and .org. Among the 58,412 malicious domains, 15,642 are in .info, 14,718 in .org, and 11,672 in .biz. As we stated before, botnets that have emerged in recent years deploy DGAs in their bot agents so that the domain of the C&C server to be contacted can be generated automatically at the right time. Typically, such domains have a shorter lifetime than benign domains: 92% of the malicious domains were active for less than 30 days. Figure 5.9 shows the lifetime distribution of the malicious domains that were active for less than 30 days. The remaining domains were actively used, on average, for more than three months.

[Pie chart omitted: TLD shares of the detected malicious domains are info 26.7%, org 25.1%, biz 19.9%, tk 12.0%, com 7.9%, cc 5.3%, ru 1.7%, net 0.7%, others 0.8%.]

Figure 5.8: The percentage of malicious domains with specific top-level domains

[Histogram omitted: number of domains (y-axis, 0 to 1,500) by lifetime in days (x-axis, up to about 30 days).]

Figure 5.9: The distribution of malicious domains according to their lifetime

5.6 Limitations

A determined attacker who knows how EXPOSURE works and who is informed about the features we are looking for in DNS traffic might try to evade detection. To evade EXPOSURE, the attackers could try to avoid the specific features and behavior that we look for in DNS traffic. For example, an attacker could decide to assign uniform TTL values across all compromised machines. However, this would mean that the attackers could no longer distinguish between more reliable and less reliable hosts, and they would take a reliability hit on their malicious infrastructures. As another example, the attackers could try to reduce the number of DNS lookups for a malicious domain so that only a single lookup is performed every hour (i.e., so that the malicious domain does not exhibit the query patterns we detect). However, this is not trivial to implement, reduces the attack's impact, and requires a high degree of coordination on the attacker's side. Even though it is possible for an attacker to stay below our detection radar by avoiding the use of these features, we believe that this comes at a cost for the attacker. Hence, our system raises the difficulty bar for the attackers, forces them to abandon the use of features that are useful to them in practice, and makes it more complex for them to manage their infrastructures.

Clearly, our detection rate also depends on the training set. We do not train for families of malicious domains that constitute attacks that are conceptually unknown and have not been encountered before in the wild by malware analyzers, tools, or experts. However, the more malicious domains are fed to the system, the more comprehensive our approach becomes over time. Note that if the networks that we are monitoring and training our system on are not infected, we will obviously not see any malicious domains. We believe that we can improve our ability to observe more malicious attacks by having access to larger networks and more installations of EXPOSURE.

Chapter 6

Botnet Command and Control Server Detection Through Netflow Analysis

Botnets continue to be a significant problem on the Internet. Accordingly, a great deal of research has focused on methods for detecting and mitigating the effects of botnets. Two of the primary factors preventing the development of effective large-scale, wide-area botnet detection systems are seemingly contradictory. On the one hand, technical and administrative restrictions result in a general unavailability of raw network data that would facilitate botnet detection on a large scale. On the other hand, were this data available, real-time processing at that scale would be a formidable challenge. In contrast to raw network data, NetFlow data is widely available. However, NetFlow data imposes several challenges for performing accurate botnet detection.

In this chapter, we present Disclosure, a large-scale, wide-area botnet detection system that incorporates a combination of novel techniques to overcome the challenges imposed by the use of NetFlow data. In particular, we identify several groups of features that allow Disclosure to reliably distinguish C&C channels from benign traffic using NetFlow records (i.e., flow sizes, client access patterns, and temporal behavior). To reduce Disclosure's false positive rate, we incorporate a number of external reputation


scores into our system's detection procedure. Finally, we provide an extensive evaluation of Disclosure over two large, real-world networks. Our evaluation demonstrates that Disclosure is able to perform real-time detection of botnet C&C channels over data sets on the order of billions of flows per day.

6.1 System Overview

Figure 6.1: The system architecture of Disclosure. In the training phase (upper half), labeled training samples are used to build detection models. In the detection phase (lower half), the detection models are used to classify IP addresses as benign or associated with C&C communications.

Disclosure is a botnet detection system designed to identify C&C servers by employing NetFlow analysis. The NetFlow protocol allows for the collection of network flows, or unidirectional sequences of packets exchanged between two peers as part of the same connection. Our work is based on the observation that even though the information that can be extracted from a single NetFlow record is insignificant, it is possible to build statistical models that differentiate the behavior of C&C and benign servers by aggregating large volumes of data collected over an extended period of time.

Figure 6.1 shows an overview of the system architecture of Disclosure. The upper half of the figure describes the detection model generation process, where a supervised machine learning algorithm is used to train models on a subset of NetFlows targeting known benign and C&C servers. Because the training set must be labeled, Disclosure requires as input a list of IP addresses that are known to be either botnet C&C servers or benign servers. The flows in this labeled data set are first processed by the feature extraction module. This module reduces the flows to a number of distinct features, each of which belongs to one of three groups: flow size-based features, client access pattern-based features, and temporal features. The specific set of features Disclosure extracts is described in detail in Section 6.2. The features extracted from the training set are then forwarded to Disclosure's learning module, which is responsible for building detection models. The learning module can be tuned with several thresholds to obtain an optimal balance between detection and false positive rates.

The bottom half of the figure represents the detection phase, where the models that have been previously generated are applied to unlabeled NetFlows in order to distinguish benign traffic from malicious botnet communication. Since the aim of Disclosure is not to identify bot-infected machines but to detect C&C servers, the first task of the detection phase is to filter out those NetFlows that cannot be attributed to a server; this process is explained in Section 6.4.4. Then, the flows are forwarded to the feature extraction module. Finally, the resulting feature vectors are processed by the detection module to produce the final list of suspected C&C servers.

Note that the results of Disclosure can be further processed by a false positive reduction filter. The goal of this additional module is to correlate the results of Disclosure with information obtained from other security feeds in order to reduce the probability of misclassification. For example, in Section 6.3, we present a novel technique that associates a reputation score with the autonomous systems to which the C&C servers belong.

6.2 Feature Selection and Classification

In this section, we present the features extracted by Disclosure from NetFlow data in order to detect botnet C&C channels at scale, and discuss why these features are suitable for discriminating between C&C channels and benign traffic. We then describe the particular machine learning techniques we use to build detection models over these features.

6.2.1 NetFlow Attributes

NetFlow is a network protocol proposed and implemented by Cisco Systems [35] for summarizing network traffic as a collection of network flows.

Network elements such as routers and switches capture these NetFlows and forward them to NetFlow collectors.

A network flow is defined to be a unidirectional sequence of packets that share specific network properties (e.g., IP source and destination addresses, and TCP or UDP source and destination ports). Each flow has a number of associated attributes, or summary statistics that characterize various general aspects of its behavior. In this chapter, the NetFlow attributes we analyzed for extracting features to identify command and control servers are: the source IP address, the destination IP address, the TCP source port, the TCP destination port, the start and finish timestamps, and the number of bytes and packets transferred.
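For reference, the attributes listed above can be represented by a simple record type. The field names below are our own shorthand, not the field names of Cisco's NetFlow export format.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    src_ip: str      # source IP address
    dst_ip: str      # destination IP address
    src_port: int    # TCP source port
    dst_port: int    # TCP destination port
    start: float     # start timestamp (seconds)
    end: float       # finish timestamp (seconds)
    bytes: int       # number of bytes transferred
    packets: int     # number of packets transferred

f = Flow("192.0.2.10", "198.51.100.5", 49152, 80, 0.0, 1.3, 840, 6)
print(f.bytes, f.end - f.start)
```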

Since Disclosure is primarily focused on identifying botnet C&C channels, it is imperative that the system can reliably distinguish servers from clients. Therefore, as an intermediate pre-processing stage, NetFlow data is analyzed by a server classifier that labels each observed IP address according to whether it provides one or more network services. In particular, since multiple services can be made available for each IP address, we represent each server as a 2-tuple of IP address and port, $s_i = \langle a, p \rangle$, where $a \in A$ is an IP address, $A$ is the set of all IP addresses, $p \in P$ is a TCP port, and $P$ is the set of all ports.

A common, and legitimate, criticism of early attempts to perform machine learning-based detection over NetFlow data is that the features that were selected were often not robust (i.e., in the sense that these features were not necessarily correlated with intrinsic characteristics of how malware manifests in NetFlow data). Hence, the resulting detection systems would often overfit models to the specific behavior of malware represented in the training set, for instance, the particular server port used by a given malware sample. Such features, however, do not generalize to classes of malware such as botnets. For example, using our example of learning a model on server ports, it is clear that the use of a particular server port is not an intrinsic property of botnet behavior. Therefore, the design of Disclosure's feature extractor module emphasizes the selection of those NetFlow attributes that best capture invariants in botnet behavior without resorting to specialization to a particular C&C protocol.

In the following, we describe the specific features Disclosure extracts from NetFlow data.

6.2.2 Disclosure Feature Extraction

Disclosure extracts a number of features from NetFlow data in order to build its detection models. These features can be classified into three broad groups: flow size-based features, client access pattern-based features, and temporal features.

Flow Size-Based Features

The first class of features extracted from NetFlow data is based on flow sizes, which simply indicate the total number of bytes transferred in one direction between two endpoints for a particular flow. Our premise for analyzing flow sizes in NetFlow data is that the flow size distributions for C&C servers are significantly and necessarily different from the flow size distributions for benign servers. We attribute this difference to several factors.

First, the main role of the botnet C&C channel is to establish a connection between the bots and the C&C server in order to transfer commands to the bots and receive data from them. This channel should be both reliable and relatively innocuous in appearance. Thus, flows carrying botnet commands or information harvested from infected clients are kept as short as possible in order to minimize their observable impact on the network. Considering that network monitoring tools are widely used and that a botnet's local network impact usually scales linearly with the number of bot infections, tuning for stealth is an important goal. Moreover, due to the limited number of commands in typical C&C protocols, flow sizes tend not to fluctuate significantly. On the other hand, the flow sizes generated during accesses to a benign server usually assume a wide range of values. The preliminary analysis we performed on known sets of benign servers and C&C servers supports this premise. Hence, we designed a set of methods to extract features that capture the behavioral difference between C&C servers and benign servers with respect to flow size.

Disclosure extracts flow size-based features by first grouping all flows according to the server $s_i$ that they originate from or are destined to. Let $s_i \in S$ be a server and $c_j \in C$ be a client. Flow sizes are then grouped by time intervals, where $F_{i,j}$ denotes the series of flow sizes for flows from endpoint $i$ to endpoint $j$, and endpoints can be drawn from $C$ or $S$. The time interval is empirically chosen to be 300 seconds. Once this set has been derived, the following feature sets are extracted.

Statistical features. This group of features characterizes the regularity of

flow size behavior over time for both benign and C&C servers. In particular, we extract the mean $\mu_{F_{i,j}}$ and standard deviation $\sigma_{F_{i,j}}$ separately for the incoming and outgoing flows of each server.

Autocorrelation features. Autocorrelation is widely used in the signal processing domain for cross-correlating a signal with itself [27], and is useful for identifying repeating patterns in time series data. A series of flow sizes $F_{i,j}$ can be converted to a time series by ordering the sizes by time. Since the autocorrelation function requires a time series that is sampled periodically as input, we segment the time series into fixed intervals and take the mean over each interval; empirical testing suggested that a period of 300 seconds is appropriate. Once a periodically sampled time series $\hat{F}_{i,j}$ has been derived from $F_{i,j}$, the series is processed by the autocorrelation function, and features are extracted from the output. Here, we use a discrete autocorrelation coefficient with lag $j$, normalized by the variance $\sigma^2_{\hat{F}_{i,j}}$:

$$R_{\hat{F}_{i,j}\hat{F}_{i,j}} = \frac{\sum_{i=j}^{n} x_i x_{i-j}}{\sigma^2_{\hat{F}_{i,j}}}.$$

The autocorrelation function outputs the correlation results for each period in the time series. This output is further processed by taking the mean and standard deviation over these values to derive the final autocorrelation features.

Unique flow sizes. In addition to the statistical features described above, Disclosure also counts the number of unique flow sizes observed, and performs statistical measurements of the occurrence density of each of them during the analysis time. Specifically, an array is constructed in which the elements are the numbers of occurrences of the specific flow sizes. Afterward, statistical features are computed over this flow size incidence array to measure its regularity.
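A simplified sketch of these three flow size-based feature groups is given below. It follows the description above (per-server mean and standard deviation, autocorrelation of the 300-second-binned series, and unique flow size counts), but it is an illustration under our own assumptions rather than Disclosure's implementation.

```python
import numpy as np

def flow_size_features(sizes, times, bin_seconds=300):
    sizes = np.asarray(sizes, dtype=float)
    times = np.asarray(times, dtype=float)

    # Statistical features: mean and standard deviation of the flow sizes.
    mu, sigma = sizes.mean(), sizes.std()

    # Periodically sampled series: mean flow size per 300-second interval.
    bins = ((times - times.min()) // bin_seconds).astype(int)
    series = np.array([sizes[bins == b].mean() if (bins == b).any() else 0.0
                       for b in range(bins.max() + 1)])

    # Autocorrelation coefficients R(lag) = sum_k x_k * x_{k-lag} / sigma^2,
    # as in the formula above; summarized by their mean and standard deviation.
    var = series.var() or 1.0
    acf = [np.dot(series[lag:], series[:len(series) - lag]) / var
           for lag in range(1, min(10, len(series)))]

    # Unique flow sizes and the regularity of their occurrence counts.
    _, counts = np.unique(sizes, return_counts=True)

    return {"mean": mu, "std": sigma,
            "acf_mean": float(np.mean(acf)) if acf else 0.0,
            "acf_std": float(np.std(acf)) if acf else 0.0,
            "n_unique_sizes": len(counts),
            "count_std": float(counts.std())}

print(flow_size_features([300, 300, 310, 900], [0, 400, 650, 1000]))
```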

Client Access Pattern-Based Features

One typical property of botnets is that the bots frequently establish a connection with the C&C server. These control channels are established for many reasons, including checking whether there is a malicious task to be performed and sending status messages. The connections tend to be ephemeral, as longer-lived connections might draw undue attention to a bot's presence.

Our basis for selecting features that distinguish malicious client access patterns from benign ones is that all of the clients of a C&C server (i.e., bots) should exhibit similar access patterns, whereas the clients of a benign server should not. Since all bots share the same, or a nearly identical, malicious program, they tend to access C&C servers similarly unless specifically programmed otherwise. For example, if bots are programmed to query the server every 300 seconds, all of the bots will contact the server with the same frequency. On the other hand, clients of benign services tend to exhibit much more varied patterns due to the vagaries of human action. Furthermore, even if the client access patterns for a particular server cannot be analyzed because there is little or no data for that server in the data set, differences can be observed between the client access behavior of a single bot and that of a benign user. However, the strength of this signal in the input data can be attenuated by the NetFlow sampling rate.

Disclosure extracts two sets of features to characterize the client access patterns typical of C&C servers and those typical of benign servers.

Regular access patterns. For each server $s_i$ and client $c_j$, Disclosure prepares a time series $T_{i,j}$ of the flows observed during the analysis period. Then, a sequence of flow inter-arrival times $I_{i,j}$ is derived from the time series by taking the difference between consecutive connections; that is,

$$I_{i,j} = \bigcup_{k=1}^{n} \left\{\, t_{i,j,k} - t_{i,j,k-1} \,\right\},$$

where $t_{i,j,k}$ is the $k$-th element of $T_{i,j}$. Then, statistical features are computed over each inter-arrival sequence, including the minimum, maximum, median, and standard deviation. Finally, we derive the final features for each server $s_i$ by computing statistical features across the set of clients that accessed $s_i$. This allows Disclosure not only to find regular patterns in clients, but also to determine whether the set of clients accessing a server behaves similarly.

Unmatched flow density. When a bot is unable to communicate with a legitimate C&C server, it detaches from the rest of the botnet and becomes a zombie. This might happen because the C&C server was shut down by a botnet detection mechanism, or because the IP address of the C&C server has been blacklisted by the administrators of the network in which the bot resides. Since the zombie cannot distinguish between these situations and transient network errors, it continues querying the server. This can result in a significant number of flows to a server that do not have a matching flow in the opposite direction.

It is also possible that a benign server is unreachable for a period of time. However, the behavior of a benign server's clients is significantly different from the behavior of bots that lose access to their C&C server. This is because a benign user who is aware that a server is offline typically

Figure 6.2: Detection rates (DT) and false positive (FP) rates for different feature combinations. We note that the DT:FP ratio is most favorable when all features are used in the detection procedure.

does not insist on continuing to query the server indefinitely, whereas bots tend to exhibit exactly this behavior. Therefore, Disclosure extracts statistics regarding the number of unmatched incoming and outgoing flows to detect this behavior. Specifically, let $U_{i,j}$ be the number of unmatched flows for server $s_i$ in time interval $t_j$, where

$$U_{i,j} = \sum_{j \in C} \mathrm{abs}\left( \lvert F_{i,j} \rvert - \lvert F_{j,i} \rvert \right).$$

Then, Disclosure derives the mean and standard deviation over the time series of $U_{i,j}$ as statistical features.
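The two client access pattern feature groups can be sketched as follows. The input layout (per-client flow start times and per-interval unmatched flow counts) and the exact set of summary statistics are our own simplifications of the description above.

```python
import numpy as np

def access_pattern_features(client_times, unmatched_per_interval):
    """client_times: {client -> sorted flow start times toward the server};
    unmatched_per_interval: counts of flows with no reverse-direction match."""
    per_client = []
    for times in client_times.values():
        gaps = np.diff(np.asarray(times, dtype=float))   # inter-arrival times I_{i,j}
        if len(gaps):
            per_client.append([gaps.min(), gaps.max(),
                               float(np.median(gaps)), gaps.std()])
    per_client = np.asarray(per_client)

    # Aggregate each inter-arrival statistic across the server's clients, so the
    # features also capture whether the clients behave similarly to one another.
    regular_access = np.concatenate([per_client.mean(axis=0), per_client.std(axis=0)])

    u = np.asarray(unmatched_per_interval, dtype=float)   # U_{i,j} per interval
    return {"regular_access": regular_access,
            "unmatched_mean": u.mean(), "unmatched_std": u.std()}

feats = access_pattern_features(
    {"c1": [0, 300, 600, 900], "c2": [10, 310, 610]},
    unmatched_per_interval=[0, 2, 1, 0])
print(feats["regular_access"], feats["unmatched_mean"])
```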

Temporal Features

Connections to a benign server are subject to diurnal fluctuations representative of the server's user population. On the other hand, connections to C&C servers are dictated by the botmaster and require no user intervention. As previously mentioned, the majority of botnets configure their bots to contact the C&C server periodically and at relatively short intervals. Therefore, bot-infected machines connect to C&C servers during periods of the day in which benign clients do not. For example, many benign servers receive a high volume of traffic during the day, and very little, or none, during the night.

To capitalize on this observation, Disclosure extracts a set of temporal features that characterize the variability of client flow volume as a function of time, such that the system can discriminate between the uniform client flow distributions indicative of C&C servers and benign traffic that follows well-known diurnal patterns. Specifically, Disclosure segments a time series of client and flow volume into hour-long intervals per server $s_i$, and calculates statistical features over these intervals.
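A minimal sketch of the temporal features is shown below: flows are bucketed by hour of day and summarized so that a flat, around-the-clock access profile stands out from a diurnal one. The choice of summary statistics here is illustrative; the text above only specifies that statistical features are computed over hour-long intervals.

```python
import numpy as np

def temporal_features(flow_timestamps):
    # Bucket flows by hour of day and summarize the hourly volume profile.
    hours = (np.asarray(flow_timestamps, dtype=float) // 3600) % 24
    volume = np.bincount(hours.astype(int), minlength=24)
    # A low standard deviation indicates a flat (non-diurnal) access profile,
    # which is what we expect from bots polling a C&C server around the clock.
    return {"hourly_mean": float(volume.mean()),
            "hourly_std": float(volume.std()),
            "hourly_max": int(volume.max())}

# A server contacted every 20 minutes around the clock yields a flat profile:
print(temporal_features(np.arange(0, 86400, 1200)))
```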

6.2.3 Building the Detection Models

To build detection models for identifying C&C servers, we experimented with a number of machine learning algorithms, including the J48 decision tree classifier [81], support vector machines [38], and random forest algorithms [64]. Random forest classifiers, known to be among the most accurate learning algorithms, combine multiple classification methods to achieve more predictive results. In particular, the random forest classifier builds a number of decision trees, where each node in a tree encodes a decision using one or more features that partition the input data. The leaves of each decision tree correspond to the set of possible labels (i.e., {benign, malicious}), and the outputs of all of the trees are then ensembled such that the average behavior among all trees is produced as the final decision. In our testing, the best ratio between detection rate (DT) and false positive rate (FP) was produced by the random forest classifier. Furthermore, this classifier is efficient enough to perform online detection in our application. Consequently, Disclosure uses the random forest classifier to build its detection models.

We evaluated our detection models against NetFlow data collected from two networks: a university network (N1) that does not apply sampling, and a large Tier 1 network (N2) that samples one out of every 10,000 flows. Figure 6.2 shows the detection rates (DT1 for N1 and DT2 for N2) and false positive rates (FP1 for N1 and FP2 for N2) for individual feature sets and all possible combinations of different feature sets. The feature sets we evaluated are: (i) the statistical features extracted from the flow size (F1); (ii) the flow size-based features extracted from the output of the autocorrelation function (F2); (iii) unique flow sizes for each server (F3); (iv) the combination of all flow size-based features ($F_{all}$); (v) the features characterizing client access patterns (C1); (vi) unmatched flow density (C2); (vii) the combination of all client access pattern-based features ($C_{all}$); (viii)

temporal features ($T_{all}$); (ix) the combination of client access pattern and flow size-based features (F + C); (x) the combination of flow size and temporal features (F + T); (xi) the combination of client access pattern and temporal features (C + T); and, finally, (xii) the combination of all feature sets (F + C + T).

Figure 6.2 indicates that individual feature sets are not as effective as combinations of multiple feature sets. Furthermore, increased levels of feature aggregation result in better detection rates with fewer false positives. Finally, we note that the most promising results were achieved on both data sets by using all feature sets as input to the classification process. Hence, Disclosure uses detection models that include all feature sets (F + C + T) to detect botnet C&C channels.
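The model building step can be sketched with scikit-learn's random forest implementation as a stand-in for the classifier used by Disclosure. The feature matrices, their dimensionalities, and the hyperparameters below are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 500
F = rng.random((n, 8))          # flow size-based features (placeholder)
C = rng.random((n, 10))         # client access pattern-based features (placeholder)
T = rng.random((n, 4))          # temporal features (placeholder)
y = rng.integers(0, 2, size=n)  # labels: 0 = benign server, 1 = C&C server

X = np.hstack([F, C, T])        # the F + C + T combination used by Disclosure
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Scores can be thresholded (ClassThresh) to trade detection rate against
# false positives, as in the ROC curves discussed in the evaluation.
scores = model.predict_proba(X)[:, 1]
print((scores > 0.5).mean())
```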

6.3 False Positive Reduction

NetFlow data, by its nature, provides limited information about the real activities that are carried out in a network. As a consequence, a botnet detection system based only on the analysis of NetFlow data is likely to produce results that contain some false positives. As we explain in the evaluation presented in Section 6.4, Disclosure can be tuned to decrease the overall false positive rate of its decisions to 0.5% or below. However, given the volume of NetFlow data that must be processed every day in large networks, even a misclassification rate of less than a fraction of a percent can result in an unacceptably large number of false alarms.

Note that some existing malicious activity detection systems have been shown to be useful for specific classes of malware or attacks (e.g., fast-flux, domain generation). Clearly, it would be beneficial to correlate the detection results of our system with the results of such previously built systems. Therefore, our architecture includes a component that correlates the results Disclosure produces with the public feeds of other malware analysis or detection platforms. The main insight here is that by integrating different data sources, it is possible to further reduce Disclosure's false positive rate to a manageable level.

We have built a reputation-based component for false positive reduction that uses three public services providing reports about a wide range of malicious activities on the Internet. The first service we make use of is FIRE [10, 94]. FIRE is a system that identifies organizations and ISPs that have been observed to engage in malicious activities. FIRE's website reports detailed information about many autonomous systems (ASes), including a maliciousness score, relative rankings among other ASes, and the number of C&C servers, exploit servers, and spam and phishing servers the AS has been hosting over time. In our implementation, we separate each type of information into two time series: one representing the current year, and one containing previous historical data. Afterward, we compute statistical features for each time series. For instance, for the time series built from the number of C&C servers observed before 2011, we compute the minimum, mean, and maximum values. After repeating this step for each time series, we compute a final score by aggregating all the values together, assigning a weight of 0.8 to the value for the current year and 0.2 to the previous years.

The second public service we use in our false positive reduction component is EXPOSURE [23, 25]. EXPOSURE is a system that uses passive DNS analysis methods to detect malicious domains. EXPOSURE currently analyzes data obtained from a large number of recursive DNS servers, and reports its findings on a daily basis. For each domain, it provides the associated IP address list and the autonomous systems in which they are located. Leveraging this information, we count the number of malicious domains detected in each AS and build a reputation score according to the density of maliciousness for each AS reported by EXPOSURE.

The last source of information we use for false positive reduction is Google Safe Browsing [11], a service that reports maliciousness information about a large number of web sites. This tool can also be used to query specific AS numbers to obtain the percentage of web sites in that AS that host malicious services.
For each IP address that Disclosure labels as a potential botnet C&C server, the false positive reduction component fetches the associated AS number and the corresponding reputation scores from FIRE, EXPOSURE, and Google Safe Browsing. These individual reputation scores are then aggregated using a weighted linear combination. That is, given the reputation scores $r_1, r_2, r_3$ and corresponding weights $w_1, w_2, w_3$ for FIRE, EXPOSURE, and Google Safe Browsing such that $\sum_i w_i = 1$, the final reputation score $R$ is calculated as

$$R = \sum_{i=1}^{3} w_i r_i,$$

where $0 \le R \le 1$. If $R$ is below a tunable threshold RepThresh, this indicates that the particular server is located in a network that is historically not associated with malicious activities, and the corresponding alert is discarded as a false positive.
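The aggregation can be written directly from the formula above. The weights, the RepThresh value, and the per-AS scores in this sketch are placeholders; only the weighted-sum-and-threshold structure is taken from the text.

```python
def aggregate_reputation(r_fire, r_exposure, r_gsb, weights=(0.4, 0.4, 0.2)):
    """R = sum_i w_i * r_i with sum_i w_i = 1 and 0 <= r_i <= 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * r for w, r in zip(weights, (r_fire, r_exposure, r_gsb)))

REP_THRESH = 0.3   # tunable; below this the alert is treated as a false positive

def keep_alert(server_ip, as_scores):
    # as_scores maps a server IP to the (FIRE, EXPOSURE, Safe Browsing) scores
    # of the autonomous system hosting it.
    return aggregate_reputation(*as_scores[server_ip]) >= REP_THRESH

as_scores = {"203.0.113.7": (0.6, 0.5, 0.2), "198.51.100.9": (0.1, 0.0, 0.05)}
print([ip for ip in as_scores if keep_alert(ip, as_scores)])  # -> ['203.0.113.7']
```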

We are aware of the fact that the false positive reduction component can introduce an opportunity for an attacker to evade our system. For example, she could place her C&C server in a network with a high reputation score. However, note that this increases the burden on the attacker, and forces her to move away from more vulnerable targets located in ASes with lower reputation scores towards potentially better-protected networks. Therefore, we believe that, on a large scale, this is a favorable result.

6.4 Evaluation

Network                         Sampling   Flows per day   Unique IP addresses
Inter-University Network (N1)   1:1        1.2 billion     28 million
Tier 1 ISP (N2)                 1:10,000   400 million     50 million

Table 6.1: Summary statistics for each of the two NetFlow data sets for N1 and N2.

In this section, we present the design and results of several experiments we conducted to evaluate Disclosure's detection accuracy, false positive rate, and performance. We also present deployment considerations, and conclude with a discussion of resilience to evasion.

The accuracy of Disclosure's classification procedure greatly depends upon the environment in which the input NetFlows have been collected. For example, NetFlow collectors placed in a small company network versus those placed in a large ISP will likely observe significantly different volumes of traffic. To bound the storage requirements at each collector, sampling rates might be configured to match the particular traffic volume specific to each site. To measure how Disclosure responds to varying levels of sampling, we evaluated our system in two distinct environments: a medium-size network connecting multiple universities with no sampling, and a Tier 1 ISP network configured with a sampling rate of 1:10,000.

In the remainder of this section, we first present the two NetFlow data sets from these networks. We then describe the approach we adopted to build our ground truth. Finally, we present and discuss the results of our experiments.

Network                   C&C Servers   Benign Servers
University Network (N1)   892           1489
Tier 1 ISP (N2)           2000          1742

Table 6.2: IP addresses in our labeled data set derived from data observed in N1 and N2.

6.4.1 The NetFlow Data Sets

Our NetFlow data sets were drawn from two separate environments: a university network located in Europe, and an ISP network located in the USA and Japan. Hereinafter, we refer to the university network as N1 and to the Tier 1 ISP network as N2. Table 6.1 shows summary statistics for the two data sets.

The N1 data set was collected for a period of 18 days between the 7th and the 25th of September 2011. The NetFlow data of N1 is not sampled and, therefore, all network flows present in the monitored network are represented in the data. The sensor in N1 produced an average of 1.2 billion network flows per day. During this period, we collected 22 billion flows between 28 million unique IP addresses.

In contrast, we collected NetFlow data observed at N2 for a period of 40 days between the 1st of June 2011 and the 10th of July 2011. The sensors in N2 were configured to sample flows at a rate of 1:10,000. The data was harvested by 68 sensors, each of which was responsible for monitoring and forwarding NetFlow traffic collected from specific autonomous systems (ASes). The sensors collected approximately 400 million network flows per day between 50 million unique IP addresses.

6.4.2 The Ground-Truth Data Sets

The accuracy of the classification models generated by a machine learning algorithm greatly depends on the quality of the training set [99]. In our case, to train the features used by Disclosure, we required a ground-truth list containing both known C&C servers and known benign servers.

The malicious server data set consisted of 4,295 IP addresses associated with real C&C servers observed in the wild during the approximately three weeks preceding our experiments. The list of botnet C&C servers was provided to us by a company that specializes in threat intelligence. The list was manually verified by the company, and potential false positives were eliminated.

We constructed our benign server training set from ranking information provided by Alexa [4]. In this case, we assume that the top 1,000 popular web sites reported by Alexa are not involved in malicious activities and, in particular, are not responsible for hosting botnet C&C servers. Alexa also reports the top popular web sites grouped by geographical region. In order to obtain a comprehensive list of benign servers, we combined the "Alexa Top 1000 Global Sites" with the most visited websites in the regions where N1 and N2 are located.

Once the benign domain lists were compiled, we resolved each DNS name on both lists to obtain the corresponding lists of IP addresses. Note that we executed the DNS queries for each list from the geographical location of the corresponding network (Europe for N1 and the US for N2); hence, the number of IP addresses collected for each network differs. This process resulted in 2,958 unique IP addresses for N1 and 3,047 IP addresses for N2. Table 6.2 shows the number of benign and malicious servers in our labeled data set that were observed in the traffic of N1 and N2, respectively.

6.4.3 Labeled Data Set Detection and False Positive Rates

In the initial experiment, we evaluated Disclosure's ability to recognize known botnet C&C servers from the ground truth constructed in the previous section. Disclosure's detection and false positive rates were measured by generating ROC curves for each data set under two configurations that controlled the level of input data filtering performed prior to detection.

Disclosure requires a minimum number of observed flows to a particular server in order to provide accurate results. This minimum is a threshold we denote by MinFlows, and it can be set by a security administrator according to the volume of traffic at a particular site and any sampling that may be applied. We evaluated two values of MinFlows for each data set: 20 and 50. For each experiment, we excluded any servers that did not have at least one port that received more than MinFlows flows. We then evaluated the accuracy of Disclosure's detection models by performing a 10-fold cross-validation.

We also considered varying the size of the training set as an additional tunable parameter. Figure 6.3 shows a summary of Disclosure's accuracy, measured by computing the area under the ROC curve for different training windows. The curve for N2 is almost constant. In comparison, the curve for N1 steadily increases over the first 15 days before plateauing.

Figure 6.3: Area under ROC curves with different training set lengths for N1 and N2.

This is due to the fact that the number of known C&C servers observed in the university network is low (see Table 6.2). Therefore, more time is required to collect enough data to properly train the models. For this reason, we decided to train Disclosure with all the available data.

Figure 6.4 shows the individual ROC curves obtained by varying the classification threshold ClassThresh (i.e., the boundary separating benign scores from malicious scores) of Disclosure's detection module. Consequently, each point on the ROC curves represents a possible configuration of the system. Security administrators can thus precisely tune Disclosure to achieve a reasonable trade-off between false positives and false negatives based on the traffic characteristics of the network. Each graph also contains a short synopsis of possible working points. For example, configuring the system for a very high detection rate is usually too costly in terms of false positives. At the other end of the scale, it is often possible to achieve a 0% false positive rate if we accept the fact that only one out of three C&C servers will be detected.

Despite the differences between the two data sets, the results are similar. For instance, with MinFlows set to 20 flows and ClassThresh tuned to produce a 1% false positive rate, the system detects 64.3% of the C&C servers in the university network and 66.9% in the ISP network. This similarity emerges from the composition of all features, as the individual contribution of each feature is quite different in the two environments. For instance, most of the features are better suited to the unsampled data set, where traffic patterns are clearly preserved. However, some of the features, for instance

(a) N1 with MinFlows = 20. (b) N2 with MinFlows = 20.

(c) N1 with MinFlows = 50. (d) N2 with MinFlows = 50.

Figure 6.4: Classification accuracy for each data set (N1 and N2) with MinFlows ∈ {20, 50}.

the unmatched flow density, provide the best results when applied to large networks, even in the presence of a high sampling rate. The mixture of these two classes of features makes Disclosure less sensitive to variability in the NetFlow collection environment and, therefore, more robust.

Another important difference between the two experiments is that in the small network (N1), Disclosure provides better results for a higher value of the minimum flow threshold, while in the large network (N2) it performs better with a lower threshold. This phenomenon is due to the fact that in the second case, the sensors collect only 1 flow out of every 10,000. Therefore, a high value of MinFlows would filter out all small-to-medium size botnets, leaving only a few large ones for the analysis. As a result, the features would be trained on very few C&C samples and would therefore tend to produce inaccurate models. This is an important issue to keep in mind when configuring the system. In general, if MinFlows is set too low, the features are exposed to samples that do not show sufficient regularity because an insufficient number of flows is observed in the traffic. If, on the other hand, MinFlows is set too high, the majority of the botnets

Point on the ROC curve   Servers flagged as C&C (N1)   Servers flagged as C&C (N2)
1.0% FP                  12,383                        4,937
0.5% FP                  7,856                         3,166
0.3% FP                  6,295                         1,958
0.0% FP                  132                           960

Table 6.3: Servers flagged as malicious by Disclosure for each of the networks N1 and N2 (without incorporating reputation scores).

are discarded, and the features are trained on too few samples. In both extremes, the result is a set of poorly trained models.

Finally, we manually verified the features of the benign servers that Disclosure wrongly classified as botnet C&C servers. In several cases, the network or the server was probably malfunctioning, and the clients (in most cases fewer than 10) were repeatedly trying to send the same data over and over again at regular intervals while receiving no answer from the server. This behavior, even though not malicious per se, is indeed quite similar to that exhibited by bot-infected machines.

6.4.4 Real-Time Detection

In the previous section, we presented the results obtained with labeled data sets containing known benign and botnet C&C servers. In order to apply Disclosure to the remaining unlabeled data, we needed to perform three separate operations.

First, since Disclosure is meant to discover C&C servers and not infected machines, we need to restrict the analysis to servers only. In order to separate them from the clients, we apply the following heuristic: an IP address belongs to a server if the flows directed towards its top two ports (i.e., the two ports that receive the most connections) account for at least 90% of the flows towards that address. From the count, we removed the ports used fewer than 3 times to filter out the noise generated by the fact that servers may also have outgoing connections. By adopting this technique, we identified 82,580 servers in N1 and 530,011 servers in N2.

The second step consisted of setting the value of the MinFlows threshold. According to the results obtained on the labeled data set, we decided to perform the rest of the experiments with the threshold set to 50 flows for N1 and to 20 flows for N2. After applying the threshold, we were left with 53,426 servers in N1 and 48,713 in N2.
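The server identification heuristic translates almost directly into code. The sketch below mirrors the rule stated above (the top two ports must account for at least 90% of the flows, after discarding ports seen fewer than 3 times); the function name and input format are ours.

```python
from collections import Counter

def is_server(incoming_flow_ports, min_port_count=3, top2_share=0.90):
    counts = Counter(incoming_flow_ports)
    # Remove rarely used ports (noise caused by the host's outgoing connections).
    counts = Counter({p: c for p, c in counts.items() if c >= min_port_count})
    total = sum(counts.values())
    if total == 0:
        return False
    top_two = sum(c for _, c in counts.most_common(2))
    return top_two / total >= top2_share

print(is_server([80] * 50 + [443] * 20 + [5555] * 2))                # True
print(is_server([p for p in range(1024, 1124) for _ in range(3)]))   # False
```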

Point on the       C&C servers after the     C&C servers after the
ROC curve          reputation filter (N1)    reputation filter (N2)
1.0% FP            1,779                     1,516
0.5% FP            1,448                     688
0.3% FP            1,236                     271
0.0% FP            20                        91

Table 6.4: Servers flagged as malicious by Disclosure for each of the networks N1 and N2 (incorporating reputation scores).

Finally, we needed to select the operational point on the ROC curve, ClassThresh (i.e., the trade-off between detection and false positive rates). Table 6.3 shows the number of servers detected in the two networks for four different configurations of the system.

Despite the fact that the various configurations were chosen to minimize the number of false positives generated by the system, the number of IP addresses suspected of being C&C servers is still relatively high. Therefore, to further reduce the probability of misclassification, we combined the results of Disclosure with a reputation score based on the information provided by EXPOSURE [23, 25], FIRE [10, 94], and Google Safe Browsing [11]. As explained in Section 6.3, this approach has the effect of narrowing down the results to the servers that have a higher probability of being malicious.

The way in which the reputation score is computed can be tuned according to the desired results and the number of daily alerts that the security administrator can tolerate. The more aggressive the filtering, the smaller the set of IP addresses flagged as C&C servers. In our experiments, we increased the strength of the false-positive reduction until we reduced the number of alerts to a level that can be manually verified. The results are reported in Table 6.4.

(a) N1 port distribution. (b) N2 port distribution.

Figure 6.5: Port distributions of the C&C servers detected by Disclosure for both N1 and N2, with and without AS reputation scores.

Figure 6.5 shows the port distributions of the C&C servers detected by Disclosure in the 0.5% false positive setting for N1 and N2. The graphs report the two most frequently used protocols: HTTP-related (ports 80, 443, 8080, 8000) and SMTP/IMAP (ports 25, 143, and 993). The remaining ports are grouped in two categories: the reserved ports (0-1023), and the registered and ephemeral ports (1024-65535). This classification is based only on the port number and not on identification of the true protocol. For instance, a botmaster can run a C&C server on port 25 to avoid firewalls, but that does not mean that he will adopt the SMTP protocol as well. It is interesting to note that the majority of the services identified by Disclosure run on ports higher than 1024. However, the distribution changes significantly after the false positive reduction is applied. In fact, the reputation system filters out around half of the HTTP services, but cuts between 70 and 90% of the services running on high port numbers.

Finally, we manually investigated the C&C servers detected by Disclosure to gain some insight into the accuracy of the detection models and the reasons for misclassification. To this end, we chose the most conservative configuration: Disclosure configured for 0% FP + Reputation filter. With this setup, during one week of operation, Disclosure reported 91 previously unknown C&C servers on the ISP network, and 20 on the university network.

We first manually queried popular search engines for each of the 111 entries. In 36 cases (32.4%), we found evidence of malware that was related to those IP addresses.¹ The fact that one third of our reports were confirmed by other sources is strong support for the ability of Disclosure to successfully detect C&C servers. Of the remaining servers, 30 were associated with HTTP-related ports. After a manual investigation, 7 of them seemed to be legitimate web sites — even though it is unusual that a small real estate company or a personal page in the Philippines would receive the large number of connections we observed in our traffic. 4 pages were default placeholders obtained with a fresh installation of a web server; the number of NetFlow entries and the varying flow sizes are suspicious, although this could be attributed to the web server not having a default virtual host configured.

1. This evidence included reports from ThreatExpert, various sandbox malware analysis tools, MaliciousUrl.com, or the offensive IP database.

Four servers returned errors related to either unauthorized access or bad requests. Three of the HTTPS servers did not use the SSL/TLS protocol but some other form of binary protocol. The remaining servers were unreachable at the time we checked them, which was approximately three weeks after the data was collected. Of the non-HTTP services, only 4 were still running at the time the checks were performed. Three of these appeared legitimate, but the remaining service was a web server located in Russia listening on a non-standard port. Finally, and interestingly, 8 servers were located in the Amazon cloud network, which is rapidly growing in popularity for hosting ephemeral malicious services.
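For reference, the port grouping used in Figure 6.5 can be expressed as a small helper. The function below only encodes the classification rule stated above (based purely on port numbers, not on the protocol actually spoken on the port); the function and constant names are ours.

```python
HTTP_PORTS = {80, 443, 8080, 8000}
MAIL_PORTS = {25, 143, 993}          # SMTP/IMAP-related ports

def port_group(port: int) -> str:
    """Map a destination port to the categories used in Figure 6.5.
    The grouping is based only on the port number, not on the protocol
    actually spoken on that port."""
    if port in HTTP_PORTS:
        return "HTTP-related"
    if port in MAIL_PORTS:
        return "SMTP/IMAP"
    if 0 <= port <= 1023:
        return "reserved (0-1023)"
    return "registered/ephemeral (1024-65535)"
```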

6.4.5 Performance Evaluation

As described in Section 6.1, the detection phase consists of two modules: feature extraction and detection. The detection module is highly efficient, requiring only several minutes to process an entire day's worth of data. Hence, detection performance is constrained by the analysis of the input NetFlow data to extract the requisite features. However, since the extraction of each feature is an independent process, feature extraction is an example of an embarrassingly parallel problem that can easily be distributed over multiple machines should the need arise. Nevertheless, even with the large amount of input data for our evaluation networks, we have not found it necessary to parallelize feature extraction. The current prototype implementation of Disclosure consists of a number of Perl and Python scripts, all running on the same server: a 16-core Intel(R) Xeon(R) E5630 CPU at 2.53 GHz with 24 GB of RAM. In the course of our experiments, we ran all individual feature extraction modules in series in 10 hours and 53 minutes for 24 hours of data. Therefore, Disclosure is able to perform at approximately 2x real-time.
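Should parallelization ever become necessary, the embarrassingly parallel structure of the extraction step could be exploited with the standard library alone. The sketch below is purely illustrative: the extractor functions, their single-file interface, and the input path are placeholders, not part of the actual prototype.

```python
from multiprocessing import Pool

# Placeholder extractors: in a real system, each of these would parse a
# day's worth of NetFlow records and emit one feature table. The names
# and the single-path interface are assumptions made for this sketch.
def extract_flow_sizes(netflow_path):
    pass

def extract_client_access_patterns(netflow_path):
    pass

def extract_temporal_behavior(netflow_path):
    pass

EXTRACTORS = [extract_flow_sizes,
              extract_client_access_patterns,
              extract_temporal_behavior]

def run_job(job):
    extractor, path = job
    return extractor.__name__, extractor(path)

if __name__ == "__main__":
    netflow_path = "netflow-day.csv"   # hypothetical input file
    # Each job is independent, so the extractors can run as separate
    # worker processes (or on separate machines) without coordination.
    with Pool(processes=len(EXTRACTORS)) as pool:
        results = dict(pool.map(run_job, [(e, netflow_path) for e in EXTRACTORS]))
```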

6.4.6 Deployment Considerations

To deploy Disclosure on a real network, the administrator should configure three main settings: the minimum flow threshold MinFlows, the classification threshold ClassThresh, and the false positive reduction threshold RepThresh. This setup can be accomplished by performing the following steps (a toy sketch of the reputation filter in step 3 follows the list):

1. Choose the MinFlows threshold. This value should be selected according to the NetFlow sampling rate of the monitored network and the amount of available training data. If the threshold is set too high, the system will not have enough C&C samples to train properly. If it is set too low, the system will train on poor data and produce inaccurate models.

2. Choose an operational point on the ROC curve for ClassThresh. This value should be selected according to the traffic volume of the network and the misclassification rate that can be tolerated. At one extreme, the system will be able to detect most of the C&C servers, but it will also generate many false positives. At the other end of the scale, the system will miss many C&C servers, but the results will be much more precise.

3. (Optional) Apply and tune the false positive reduction module using RepThresh. To reduce the number of alerts in large networks, Disclosure can be coupled with other detection or verification techniques. In this chapter, we propose the use of an AS reputation-based score to filter out servers hosted in benign networks. The weights of the constituent reputation systems can be modified for more aggressive or more lightweight filtering, and the overall filtering strength can be adjusted by setting RepThresh.
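As a toy illustration of step 3, the following sketch combines per-source reputation signals with fixed weights and applies RepThresh. The weighting scheme, value ranges, and function names are assumptions made for illustration only; the actual score computation used by Disclosure is described in Section 6.3.

```python
# Hypothetical per-source weights. The constituent systems named in the
# text are EXPOSURE, FIRE, and Google Safe Browsing, but the weighting
# shown here is an illustrative assumption, not the real configuration.
WEIGHTS = {"exposure": 0.4, "fire": 0.4, "safe_browsing": 0.2}

def reputation_score(signals):
    """`signals` maps a source name to a maliciousness value in [0, 1]."""
    return sum(WEIGHTS[src] * value for src, value in signals.items())

def filter_candidates(candidates, rep_thresh):
    """Keep only the C&C candidates whose combined reputation score
    exceeds RepThresh. Raising rep_thresh makes the filtering more
    aggressive and shrinks the set of reported servers."""
    return [ip for ip, signals in candidates.items()
            if reputation_score(signals) >= rep_thresh]
```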

6.4.7 Resilience to Evasion

The detection approach presented in this chapter is predicated on the assumption that existing botnets often exhibit a regular, detectable pattern in their communication with the C&C server. However, we have not yet discussed how strong this requirement is, nor how difficult it might be for an attacker to perturb this regularity to avoid detection. To answer this question, we designed two synthetic botnet families (hereinafter B1 and B2) that attempt to evade our system by inserting a random delay between consecutive connections and random-length padding in each flow. In our implementation, we employed two different randomization functions. The first randomization function produces uniformly distributed values over a fixed range. This is intended to model a botnet in which the programmer uses a random number generator to select a value from a fixed range. The second family adopts a more sophisticated approach and generates random numbers from a Poisson distribution. In this case, we model a more complex scenario in which the botmaster tries to mimic the flow inter-arrival times of benign services, which are known to be well-approximated as a Poisson process [60].

In our experiment, we generated 300 C&C servers for both B1 and B2. First, we randomly specified the size of each botnet and the duration of its activity. Afterward, we created synthetic NetFlow data for each server, using one of the aforementioned randomization functions to generate random flow sizes and intervals between consecutive flows. Each botnet was created according to the following parameters:

Botnet lifetime         1 - 33 days
Number of bots          1,000 - 100,000
Flow sizes              4 - 3,076 bytes
Delay between flows     1 min - 1 hr

The only significant difference between the two botnet families is that for B1, the delay between consecutive flows between each bot and the C&C server was a uniformly distributed random value between 1 minute and 1 hour. For B2, the delay was instead drawn from a Poisson distribution whose mean was randomly chosen in the 1 minute to 1 hour range. We set 1 hour as an upper bound because, in order to maintain reasonable flexibility and control over the botnet, a botmaster must be able to send commands to the infected machines with a delay no longer than an hour or two.

Finally, we added the synthetically generated NetFlows to our labeled data set and re-ran the classification evaluation using 10-fold cross-validation. In both cases, Disclosure was able to detect all of the experimental C&C servers belonging to B1 and B2. In addition, adding these synthetic botnets to the training set had the side effect of actually increasing the overall detection rate. In other words, some of the real botnets that were not detected by Disclosure in our normal experiments were detected after we added the synthetic data. This implies that our detection models were not originally trained to capture this kind of variability in the C&C channel behavior. However, by supplementing the training set with many new samples exhibiting randomized behavior, Disclosure was able to subsequently detect real botnets that present similar access patterns.
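The synthetic B1/B2 traffic can be approximated with a short generator that follows the parameter ranges in the table above. The record layout is an assumption, and B2's behavior is modeled here with exponential inter-arrival times (which yield Poisson-process arrivals), a simplification of the Poisson-distributed delays described in the text.

```python
import random

def synth_cc_flows(poisson=False, seed=None):
    """Yield synthetic (bot_id, timestamp_seconds, flow_size_bytes) records
    for one C&C server, using the parameter ranges listed above."""
    rng = random.Random(seed)
    lifetime = rng.randint(1, 33) * 86400        # 1 - 33 days, in seconds
    n_bots = rng.randint(1_000, 100_000)
    mean_delay = rng.uniform(60, 3600)           # B2: mean delay, 1 min - 1 hr
    for bot in range(n_bots):
        t = rng.uniform(0, 86400)                # bot's first contact
        while t < lifetime:
            yield bot, t, rng.randint(4, 3076)   # flow size in bytes
            if poisson:
                # Exponential inter-arrival times give Poisson-process
                # arrivals, approximating the B2 behavior described above.
                t += rng.expovariate(1.0 / mean_delay)
            else:
                t += rng.uniform(60, 3600)       # B1: uniform delay
```

One generator call per server (for example, 300 servers per family, each with a different seed) would reproduce the scale of the experiment; a quick test would shrink the bot count to keep the output small.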

Chapter 7

Concluding Remarks

Malware continues to run rampant across the Internet, and among the myriad forms that modern malware can assume, botnets represent one of the gravest threats to the security of the Internet. Through the large-scale compromise of vulnerable end hosts, botmasters can both violate the confidentiality of sensitive user information — for instance, banking or social network authentication credentials — and leverage groups of bots as an underground computational platform for performing other illicit activities.

In this thesis, we propose three network-based botnet detection techniques. Each technique builds its detection models by analyzing a different type of network data: the first detection technique performs packet-level inspection; the second analyzes DNS traffic to find domains that are abused for various malicious purposes, including serving as command and control servers; and the last one detects command and control servers by analyzing NetFlow data.

With the first technique, we proposed a system that aims to detect bots independent of any prior information about the command and control channels or propagation vectors, and without requiring multiple infections for correlation. Our system relies on detection models that target the characteristic behavior of every bot – the fact that it receives commands from the botmaster to which it responds in a specific way. A key feature is that these detection models are generated automatically. To this end, our system observes the network traffic that is generated by actual bot instances in a controlled environment.

In these traffic traces, we first identify points in time that likely correspond to response activity. Then, we extract the corresponding commands that trigger these activities. We have implemented the proposed approach and demonstrated that it can extract effective detection models for a variety of different bot families. These models are precise in describing the activity of bots and raise very few false positives.

The second approach presented in this thesis, EXPOSURE, employs large-scale, passive DNS analysis techniques to detect domains that are involved in malicious activity. We use 15 features extracted from DNS traffic that allow us to characterize different properties of DNS names and the ways in which they are queried. Our experiments with a large, real-world data set consisting of 100 billion DNS requests, and a real-life deployment for two weeks in an ISP, show that our approach is scalable and that we are able to automatically identify unknown malicious domains that are misused in a variety of malicious activities (such as botnet command and control, spamming, and phishing).

The last botnet detection method we developed specifically focuses on the detection of botnet command and control servers. We presented Disclosure, a large-scale, wide-area botnet detection system that incorporates a combination of novel techniques to overcome the challenges imposed by the use of NetFlow data. In particular, we identify several groups of features that allow Disclosure to reliably distinguish C&C channels from benign traffic using NetFlow records (i.e., flow sizes, client access patterns, and temporal behavior). To reduce Disclosure's false positive rate, we incorporate a number of external reputation scores into our system's detection procedure. Finally, we provide an extensive evaluation of Disclosure over two large, real-world networks. Our evaluation demonstrates that Disclosure is able to perform real-time detection of botnet C&C channels over data sets on the order of billions of flows per day.

Appendices


Appendix A

Extended Summary (Résumé étendu)

The days when the Internet was an academic network with no malicious activity are long gone. Today, the Internet has become a critical infrastructure, playing a crucial role in communication, finance, commerce, and information retrieval. It has been reported that there are now more than 2.7 billion web pages on the Internet. Unfortunately, as a technology becomes popular, it also attracts people with malicious intentions. In fact, digital crime is a challenging problem for law enforcement agencies. Because Internet attacks are easy to launch and difficult to trace back, such crimes are not easy to prosecute and bring to justice. As a consequence, there is a strong incentive for cyber-criminals to engage in malicious, profit-oriented illegal activity on the Internet. Unfortunately, the number and sophistication of Internet attacks have kept increasing over the last ten years.

A popular tool of choice for today's digital criminals are bots. A bot is a type of malware that is created with the intent of compromising and taking control of hosts on the Internet. It is typically installed on the victim's computer either by exploiting a software vulnerability in the web browser or the operating system, or by using social engineering techniques to trick the victim into installing the bot herself.

Compared to other types of malware, the distinguishing characteristic of a bot is its ability to establish a command and control channel that allows an attacker to remotely control or update a compromised machine. A number of bot-infected machines that are combined under the control of a single, malicious entity (called the botmaster) are referred to as a botnet. Such botnets are often abused as platforms to launch denial of service attacks, to send spam, or to host scam pages. Using a botnet, attackers also commonly steal sensitive information from a victim's machine (for example, credit card numbers, chat credentials, social network account credentials, etc.).

Botnets have also been reported to have been used in attacks against nations, both deliberately and coincidentally. For example, in 2007 there was a deliberate, organized, bot-based distributed denial of service attack against critical infrastructure in Estonia. In 2009, Conficker infected computers of three European armed forces. As a result, French fighter planes were prevented from taking off, networks of British and German army bases were partially shut down, and a number of British police forces had to disconnect their networks from the Internet. More recently, the Stuxnet botnet attacked the critical infrastructure of a particular nation. In fact, Stuxnet appears to have been written specifically to attack a particular brand of Supervisory Control and Data Acquisition (SCADA) systems. SCADA systems are typically responsible for operating key components of power plants, pipelines, electricity distribution grids, and other similar industrial systems.

Traditional means of defending against bots rely on anti-virus (AV) software installed on end-user machines, as they do for other types of malware. Unfortunately, as the existence of numerous botnets demonstrates, these systems are insufficient. The reason is that they rely on signatures of known samples, a well-documented limitation that makes it difficult to keep up with rapidly evolving malware. In order to mitigate this limitation, a number of host-based defense systems have been introduced. These systems use static or dynamic code analysis techniques to capture the behavior of unknown programs. By comparing the observed behavior against a model that specifies the characteristics of certain types of malware, previously unknown instances of malicious code can be identified. However, although useful, these systems are problematic in practice, because they incur a considerable runtime overhead and require every user to install the analysis platform. To complement host-based analysis techniques, it is desirable to have a network-based detection system available that can monitor network traffic for indications of bot-infected machines.
In practice, network-based detection systems are often preferable to host-based ones. One reason is that network-based detection systems have complete visibility over the network, while host-based techniques only cover single hosts. Unfortunately, in security ecosystems, users are typically the weakest link. Therefore, network administrators often prefer network-based detection systems that can be deployed at a vantage point in the network. In this way, they are able to monitor the network activity of all hosts, without relying on every individual computer being protected by a host-based malware detection system.

Another reason is that the nature of botnets lends itself to network-based detection. The characteristic trait of botnets is the command and control protocol they adopt. Since the command and control infrastructure relies on network communication, and is therefore observable in network traffic, it is considered the weak spot of botnets. Even if botmasters apply obfuscation techniques to hide the semantics of the command and control protocol, the packets still have to traverse the network.

For these reasons, in this thesis we propose three network-based botnet detection techniques. Each technique builds its detection models by analyzing a different type of network data: the first detection technique performs packet-level inspection; the second analyzes DNS traffic to find domains that are abused for various malicious purposes, including serving as command and control servers; and the last one detects command and control servers by analyzing NetFlow data.

A.1 Botnet Detection Through Network Packet Inspection

Previous work on detecting bots by performing network-level packet inspection has progressed along two main axes. The first line of research uses vertical correlation techniques. These techniques focus on the detection of individual bots, usually by checking for traffic patterns or content that reveal command and control traffic or malicious, bot-related activity. These systems require prior knowledge about the command and control channels and propagation vectors of the bots they can detect. For instance, Rishi analyzes IRC traffic for nicknames that are frequently used by bots, while the system proposed by Binkley and Singh checks for suspicious IRC traffic statistics. BotHunter is a more advanced tool that combines alerts from both anomaly-based and signature-based intrusion detection systems in order to identify bot-related traffic. Nevertheless, as pointed out by the authors in a follow-up paper, BotHunter relies on the assumption that bot behavior "follows a pre-defined infection life cycle dialog model", which is geared towards bots that use random scanning and known bot commands.

The second line of research on detecting bots uses horizontal correlation approaches to analyze network traffic for patterns that indicate that two or more hosts behave similarly. Such similar patterns are often the result of a command that is sent to several members of the same botnet, causing the bots to react in the same fashion (for example, by starting to scan or to send mails). The salient property of techniques that use horizontal correlation, such as BotSniffer, BotMiner, and TAMD, is that they do not require a priori information about the way in which the command and control channel is implemented. The drawback of these approaches is that they cannot detect individual bots; that is, at least two hosts in the monitored network must be members of the same botnet. Given the general trend towards smaller botnets and the possibility for a botmaster to assign two bots within the same network range to different botnets, this is a significant limitation.

The first botnet detection technique presented in this thesis proposes a detection approach to identify single, bot-infected machines without any prior knowledge about command and control mechanisms or the way in which a bot propagates. Our detection model leverages the characteristic behavior of a bot, which is that it (a) receives commands from the botmaster, and (b) carries out some actions in response to these commands. Similar to previous work, we assume that the command and the response activities result in some kind of network communication that can be observed.

The basic idea of our system is that we can generate detection models by observing the behavior of bots that are captured in the wild.
More precisely, by launching a bot in a controlled environment and recording its network activity (traces), we can observe the commands that this bot receives as well as the corresponding responses. To this end, we present techniques that allow us to identify points in a network trace that likely correlate with response activity. Then, we analyze the traffic that precedes this response to find the corresponding command. Based on the observations of commands and responses, we generate detection models that can be deployed to scan network traffic for similar activity, indicating that a machine is infected by a bot. Our approach produces specific detection models that are tailored to bot families or to groups of bots related by a common C&C infrastructure. Because the system is automated, however, it is easy to quickly generate new models for bots that implement new commands and responses. This is independent of any prior knowledge of the protocol or the commands that the bot uses.
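One simple way to locate candidate response points in a recorded bot trace is change-point detection over a per-interval traffic statistic. The sketch below is a generic CUSUM-style detector over, for example, bytes per second; the drift and threshold values are illustrative, and this is a stand-in for, not a description of, the exact procedure used in the thesis.

```python
def cusum_change_points(series, drift=1.0, threshold=10.0):
    """Return indices where the cumulative positive deviation of `series`
    from its running mean exceeds `threshold`: a simple CUSUM-style
    detector for sudden increases in activity (e.g., bytes per second)."""
    change_points, cusum, mean, n = [], 0.0, 0.0, 0
    for i, x in enumerate(series):
        n += 1
        mean += (x - mean) / n                   # running mean
        cusum = max(0.0, cusum + x - mean - drift)
        if cusum > threshold:
            change_points.append(i)
            cusum = 0.0                          # restart after an alarm
    return change_points

# Example: a burst of scanning traffic starting at interval 5 is flagged.
trace = [2, 3, 2, 2, 3, 40, 45, 42, 3, 2]
print(cusum_change_points(trace))
```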

A.2 Botnet Detection Through Passive DNS Analysis

Since malware first appeared in the wild, there has been an ongoing arms race between malware authors and malware defense mechanisms. Whenever a malware detection system has been developed, malware has modified itself to evade that system. As a consequence, this arms race has driven the evolution of malware.

Botnets are another type of malware, and they too have developed techniques to defeat existing botnet detection mechanisms. As a first step, they applied encryption or obfuscation to hide the inner workings of their C&C infrastructure. Unfortunately, most botnet detection systems that perform network-level packet inspection, including the system we described above, are limited by the fact that they cannot cope with botnets that obfuscate or encrypt their traffic. Hence, the need for a new, complementary botnet detection system is evident.

One of the technical problems that attackers face when designing their malicious infrastructure is the question of how to implement a reliable and flexible server infrastructure and command and control mechanism. Ironically, attackers face the same global engineering challenges as companies that have to maintain a large, distributed, and reliable service infrastructure for their customers. For example, in the case of botnets, the attackers need to efficiently manage remote hosts that can easily consist of thousands of compromised end-user machines. Obviously, if the IP address of the command and control server is hard-coded into the bot binary, there is a single point of failure for the botnet. That is, from the attacker's point of view, whenever this address is identified and taken down, the botnet is lost.

The Domain Name System (DNS) is a hierarchical naming system for computers, services, or any other resource connected to the Internet. Clearly, since it helps users locate resources such as web servers, mail hosts, and other online services, DNS is one of the core and most important components of the Internet. Unfortunately, besides being used for obvious benign purposes, domain names are also popular for malicious use. For example, domain names increasingly play a role in managing botnet command and control servers, sites where downloadable malicious code is hosted, and phishing pages that aim to steal sensitive information from unsuspecting victims.

In order to better cope with the complexity of a large, distributed infrastructure, botnets have increasingly been making use of domain names. By using DNS, botmasters acquire the flexibility to change the IP addresses of the malicious servers they manage. Furthermore, they can hide their critical servers behind proxy services (for example, using Fast-Flux) so that their malicious servers are more difficult to identify and take down.
The use of domain names gives botnet controllers the flexibility to migrate their servers with ease. In other words, botnet infrastructures become more "fault-tolerant" with respect to the IP addresses at which they are hosted.

Our fundamental insight is that, since malicious services (e.g., botnets) are often as dependent on DNS as benign services are, being able to identify malicious domains as soon as they appear would significantly help mitigate many of the Internet threats that arise from botnets. Furthermore, our hypothesis is that, when looking at large volumes of data, DNS queries for benign and for malicious domains should exhibit enough behavioral differences that they can be distinguished automatically.

With our second botnet detection technique, we introduce a comprehensive passive DNS analysis approach and a detection system, EXPOSURE, to effectively and efficiently detect domain names that are involved in malicious activity. We use 15 features (9 of which are new and have not been proposed before) that allow us to characterize different properties of DNS names and the ways in which they are used (i.e., queried).

Note that researchers have previously used DNS as a means of analyzing, measuring, and estimating the size of existing botnets. Some solutions have also attempted to use DNS traffic to detect malicious domains of a certain type. However, all of these approaches focus only on specific classes of malware (for example, only malicious Fast-Flux services). Our approach, in comparison, is much more generic and is not limited to certain categories of attacks (e.g., only botnets).

In our approach, based on the features we have identified and a training set that contains known benign and malicious domains, we train a classifier for DNS names. Being able to passively monitor DNS traffic in real time allows us to identify malicious domains that have not yet been revealed by pre-compiled blacklists. Furthermore, in contrast to active DNS monitoring techniques that probe for domains suspected of being malicious, our analysis is stealthy, and we do not need to trigger any specific malicious activity in order to acquire information about a domain. The stealthy analysis that we are able to perform has the advantage that our adversaries, the cyber-criminals, have no means of blocking or hindering the analysis we perform (as opposed to active probing approaches).
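As a minimal illustration of this training step, the toy sketch below fits a decision-tree classifier on hand-made feature vectors. The two features, their values, and the choice of scikit-learn are assumptions made for the example; the actual 15 features and the classifier used by EXPOSURE are described in the corresponding chapter of this thesis.

```python
from sklearn.tree import DecisionTreeClassifier

# Toy feature vectors: each row describes one domain with a handful of
# numeric features (e.g., a TTL statistic and the number of distinct IPs).
# The values and the two-feature layout are purely illustrative.
X_train = [
    [300,   2],   # benign-looking: stable TTL, few IPs
    [86400, 1],
    [30,   45],   # suspicious: very low TTL, many IPs
    [10,   80],
]
y_train = ["benign", "benign", "malicious", "malicious"]

clf = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)

# Classify a newly observed domain from its feature vector.
print(clf.predict([[20, 60]]))   # likely 'malicious' on this toy data
```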

To date, only one other system has been proposed that aims to detect generic malicious domains using passive DNS analysis. In simultaneous and independent work that was very recently presented by Antonakakis et al., the authors introduce Notos. Notos dynamically assigns reputation scores to domain names whose maliciousness has not yet been discovered. In comparison, our approach does not depend on large amounts of historical maliciousness data (for example, IP addresses of previously infected servers), requires less training time, and, unlike Notos, is also able to detect malicious domains that are mapped to a new address space each time and are never used for other malicious purposes again.

A.3 Detecting Command and Control Servers by Analyzing Network Flow Data

Although EXPOSURE performs well in detecting botnets that use DNS names to contact their command and control server, one open problem of the system is that it can be evaded if the botnet's use of DNS is designed to resemble that of benign servers (for example, normal TTL values, a mapping to a small number of IP addresses, etc.). A second problem with DNS-based detection is that a client request for a malicious DNS domain does not necessarily indicate that the client has been infected. In fact, such a request could simply have been caused by a failed infection attempt. Moreover, there is also a chance that the requested domain is not really malicious (for example, if the attackers are using a benign, compromised domain to host a command and control server). A third problem is that a DNS request for a malicious domain might be cached and, consequently, might not appear in the monitored DNS traffic. As a result, connections from compromised hosts to the command and control server can be missed.

Another problem that all previous network-based botnet detection techniques share is that, although they are effective under certain circumstances, they do not scale beyond a single administrative domain while retaining useful detection accuracy. This limitation restricts the application of automated botnet detection systems to those entities that are informed or motivated enough to deploy them. Thus, we have the current state of botnet mitigation, where small pockets of the Internet are fairly well protected against infection while the majority of endpoints remain vulnerable.

This situation is not ideal. Botnets are an Internet-scale problem that spans individual administrative domains and, consequently, a problem that requires an Internet-scale solution. In particular, botnets can continue to wreak havoc on the Internet despite the deployment of localized detection systems, by focusing on propagating through less well-protected populations.

Two of the main factors preventing the development of effective large-scale, wide-area botnet detection systems are seemingly contradictory. On the one hand, technical and administrative restrictions result in a general unavailability of the raw network data that would facilitate large-scale botnet detection.

On the other hand, were such data available, processing it in real time at that scale would be a formidable challenge. While the ideal data source for large-scale botnet detection does not currently exist, there is, however, an alternative data source that is widely available today: NetFlow data.

NetFlow data is often captured by ISPs using large, distributed sets of collectors for auditing and performance monitoring across backbone networks. While it is otherwise extremely attractive, NetFlow data imposes several challenges for performing accurate botnet detection. First, and perhaps most critically, NetFlow records do not include packet payloads; rather, flow records are limited to aggregate metadata concerning a network flow, such as the flow duration and the number of bytes transferred. Second, NetFlow records are half-duplex, that is, they only record one direction of a network connection. Third, NetFlow data is often collected by sampling the monitored network, frequently at rates that are one or more orders of magnitude removed from the actual traffic. Each of these characteristics of NetFlow data complicates the construction of an effective botnet detector over this domain. The detector must be able to distinguish between benign and malicious network traffic without access to packet payloads, which are the component of network data that directly carries evidence of malicious behavior. The detector must also be able to recognize weak signals indicating the presence of a botnet, because of the combined effects of half-duplex capture and aggressive sampling.

As a novel follow-up to our previous botnet detection techniques, in the third part of this thesis we present a large-scale, wide-area botnet detection system that incorporates a combination of novel techniques to overcome the challenges imposed by the use of NetFlow data. In particular, we identify several groups of features that allow DISCLOSURE to reliably distinguish C&C channels from benign traffic using NetFlow records: (i) flow sizes, (ii) client access patterns, and (iii) temporal behavior. We demonstrate that these features are not only effective in detecting current C&C channels, but also that they are relatively robust against the countermeasures that future botnets might be expected to deploy against our system. Furthermore, these features are oblivious to the specific structure of known botnet C&C protocols.

Although the features above are sufficient to capture the core characteristics of generic command and control traffic, in isolation they also generate false positives.
In order to reduce DISCLOSURE's false positive rate, we incorporate a number of external reputation scores into our system's detection procedure. These additional signals function as a filter that reduces DISCLOSURE's false positive rate to a level at which the system can realistically be deployed on large-scale networks.
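As an illustration of the flow-size feature group mentioned above, the sketch below computes a few per-server statistics from NetFlow-like records. The record format and the specific statistics are assumptions made for this example; DISCLOSURE's real feature set is richer and is defined in Chapter 6.

```python
import statistics
from collections import defaultdict

def flow_size_features(records):
    """Compute simple per-server flow-size statistics from NetFlow-like
    records of the form (server_ip, flow_bytes). This only illustrates
    the intuition that C&C flows tend to have small, unusually regular
    sizes; it is not the feature set actually used by DISCLOSURE."""
    sizes = defaultdict(list)
    for server_ip, flow_bytes in records:
        sizes[server_ip].append(flow_bytes)

    features = {}
    for ip, values in sizes.items():
        features[ip] = {
            "mean_size": statistics.mean(values),
            "stdev_size": statistics.pstdev(values),
            "unique_size_ratio": len(set(values)) / len(values),
        }
    return features
```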

A.4 Contributions

In summary, this thesis makes the following contributions:
– We present three network-based botnet detection techniques that perform their analysis on different types of data: full network traffic, DNS traffic, and NetFlow traffic.
– We present a botnet detection system that performs network-level packet inspection. This system is a fully automated mechanism that generates bot detection models by observing the actual behavior of bot instances in a controlled environment, without making any assumptions about the command and control mechanisms. This work was published at ESORICS 2009.
– Through the experiments we performed with the first detection method, we demonstrate the feasibility of our approach by generating detection models for various bot families (including those controlled via IRC and HTTP, as well as P2P). These models are effective in detecting bots with few false positives.
– We present another novel analysis technique for the detection of malicious domains that is based on passive DNS request analysis. Our technique does not rely on prior knowledge about the type of service the malicious domain offers (for example, phishing, Fast-Flux services, spamming, botnets that use a domain generation algorithm, etc.). This is significantly different from existing techniques that only target Fast-Flux domains used in botnet operations. Furthermore, our approach requires less training time and less training data than Notos, and does not have some of its limitations. This work was published at NDSS 2011.

– We describe the implementation of our real-time detection system, which we call EXPOSURE. Our experimental results show that the technique we propose is scalable, and is able to accurately distinguish between malicious and benign domains with a low false positive rate.
– We present DISCLOSURE, a large-scale, wide-area botnet detection system that reliably detects botnet C&C channels in readily available NetFlow data using a novel set of robust statistical features. In particular, DISCLOSURE does not assume a priori knowledge of particular C&C protocols.
– We incorporate several external reputation systems into DISCLOSURE's detection procedure to refine the precision of the system.
– We evaluate DISCLOSURE on two real-world networks, and demonstrate its ability to detect both known and unknown botnet C&C servers at scales not previously achieved.

Bibliography

[1] RFC 1794 - DNS Support for Load Balancing. http://tools.ietf.org/html/rfc1794, 1995.
[2] RFC 1834 - Whois and Network Information Lookup Service, Whois++. http://www.faqs.org/rfcs/rfc1834.html, 1995.
[3] RFC 1912 - Common DNS Operational and Configuration Errors. http://www.faqs.org/rfcs/rfc1912.html, 1996.
[4] Alexa Web Information Company. http://www.alexa.com/topsites/, 2009.
[5] DNSBL - Spam Database Lookup. http://www.dnsbl.info/, 2010.
[6] Google Safe Browsing. http://www.google.com/tools/firefox/safebrowsing/, 2010.
[7] Internet Systems Consortium. https://sie.isc.org/, 2010.
[8] McAfee SiteAdvisor. http://www.siteadvisor.com/, 2010.
[9] Norton Safe Web. http://safeweb.norton.com/, 2010.
[10] FIRE: FInding RoguE Networks. http://www.maliciousnetworks.org/, 2011.
[11] Google Safe Browsing diagnostic page for autonomous systems. http://www.google.com/safebrowsing/diagnostic?site=AS:as_number, 2011.
[12] B. Amini. Kraken Botnet Infiltration. http://dvlabs.tippingpoint.com/blog/2008/04/28/kraken-botnet-infiltration, 2008.
[13] D. Anderson, C. Fleizach, S. Savage, and G. Voelker. Spamscatter: Characterizing Internet Scam Hosting Infrastructure. In USENIX Security Symposium, 2007.
[14] M. Antonakakis, R. Perdisci, D. Dagon, W. Lee, and N. Feamster. Building a Dynamic Reputation System for DNS. In 19th USENIX Security Symposium, 2010.


[15] M. Antonakakis, R. Perdisci, W. Lee, N. Vasiloglou, and D. Dagon. Detecting Malware Domains at the Upper DNS Hierarchy. In 20th USENIX Security Symposium, 2011.
[16] P. Baecher, M. Koetter, T. Holz, M. Dornseif, and F. C. Freiling. The Nepenthes Platform: An Efficient Approach to Collect Malware. In 9th Symposium on Recent Advances in Intrusion Detection (RAID), pages 165–184, 2006.
[17] M. Bailey, J. Oberheide, J. Andersen, Z. M. Mao, F. Jahanian, and J. Nazario. Automated Classification and Analysis of Internet Malware. In Recent Advances in Intrusion Detection, 2007.
[18] M. Basseville and I. V. Nikiforov. Detection of Abrupt Changes - Theory and Application. Prentice-Hall, 1993.
[19] U. Bayer. Anubis: Analyzing Unknown Binaries. http://analysis.seclab.tuwien.ac.at/.
[20] U. Bayer, P. M. Comparetti, C. Hlauschek, C. Kruegel, and E. Kirda. Scalable, Behavior-Based Malware Clustering. In 16th Symposium on Network and Distributed System Security (NDSS), 2009.
[21] U. Bayer, C. Kruegel, and E. Kirda. TTAnalyze: A Tool for Analyzing Malware. In 15th EICAR Conference, Hamburg, Germany, 2006.
[22] P. Berkhin. Survey of Clustering Data Mining Techniques. Technical report, 2002.
[23] L. Bilge. EXPOSURE: Exposing Malicious Domains. http://exposure.iseclab.org/, 2011.
[24] L. Bilge, E. Kirda, C. Kruegel, and M. Balduzzi. EXPOSURE: Detecting Malicious Domains Using Passive DNS Analysis, 2011.
[25] L. Bilge, E. Kirda, C. Kruegel, and M. Balduzzi. EXPOSURE: Finding Malicious Domains Using Passive DNS Analysis. In 18th Annual Network and Distributed System Security Symposium (NDSS'11), 2011.
[26] J. Binkley and S. Singh. An Algorithm for Anomaly-based Botnet Detection. In USENIX Steps to Reduce Unwanted Traffic on the Internet (SRUTI), 2006.
[27] G. E. P. Box, G. M. Jenkins, and G. Reinsel. Time Series Analysis: Forecasting and Control. 3rd edition. Prentice Hall, Upper Saddle River, NJ, 1994.
[28] A. P. Bradley. The Use of the Area Under the ROC Curve in the Evaluation of Machine Learning Algorithms. Pattern Recognition, volume 30, pages 1145–1159, 1997.

[29] D. Brauckhoff, X. Dimitropoulos, A. Wagner, and K. Salamatian. Anomaly Extraction in Backbone Networks Using Association Rules. In ACM Internet Measurement Conference (IMC'09), 2009.
[30] D. Brauckhoff, B. Tellenbach, A. Wagner, M. May, and A. Lakhina. Impact of Packet Sampling on Anomaly Detection Metrics. In Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement (IMC '06), 2006.
[31] H. Choi, H. Lee, and H. Kim. Botnet Detection by Monitoring Group Activities in DNS Traffic. In 7th IEEE International Conference on Computer and Information Technologies, 2007.
[32] M. Christodorescu and S. Jha. Testing Malware Detectors. In ACM International Symposium on Software Testing and Analysis (ISSTA), 2004.
[33] M. Christodorescu, S. Jha, S. Seshia, D. Song, and R. Bryant. Semantics-Aware Malware Detection. In IEEE Symposium on Security and Privacy (Oakland), 2005.
[34] S. Chu, E. Keogh, D. Hart, and M. Pazzani. Iterative Deepening Dynamic Time Warping for Time Series. In Proceedings of the 2nd SIAM International Conference on Data Mining, 2002.
[35] B. Claise. Cisco Systems NetFlow Services Export Version 9, 2004.
[36] E. Cooke, F. Jahanian, and D. McPherson. The Zombie Roundup: Understanding, Detecting, and Disrupting Botnets. In 1st Workshop on Steps to Reducing Unwanted Traffic on the Internet, pages 39–44, 2005.
[37] M. Cova. Wepawet. http://wepawet.iseclab.org/.
[38] N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, 2000.
[39] Cyber-TA. SRI Honeynet and BotHunter Malware Analysis Automatic Summary Analysis Table. http://www.cyber-ta.org/releases/malware-analysis/public/, 2007.
[40] D. Dagon, G. Gu, C. Lee, and W. Lee. A Taxonomy of Botnet Structures. In Annual Computer Security Applications Conference (ACSAC), 2007.
[41] J. Davis. Hackers Take Down the Most Wired Country in Europe. http://www.wired.com/politics/security/magazine/15-09/ff_estonia, 2007.

[42] M. de Hoon, S. Imoto, J. Nolan, and S. Miyano. Open Source Clustering Software. Bioinformatics, 20(9), 2004.
[43] G. Dewaele, K. Fukuda, P. Borgnat, P. Abry, and K. Cho. Extracting Hidden Anomalies Using Sketch and Non-Gaussian Multiresolution Statistical Detection Procedures. In Proceedings of the 2007 Workshop on Large Scale Attack Defense (LSAD '07), 2007.
[44] D. Dittrich. Distributed Denial of Service (DDoS) Attacks/Tools. http://staff.washington.edu/dittrich/misc/ddos/, 2005.
[45] M. Domains. Malware Domain Block List. http://www.malwaredomains.com/, 2009.
[46] M. Felegyhazi, C. Kreibich, and V. Paxson. On the Potential of Proactive Domain Blacklisting. In Proceedings of the Third USENIX Workshop on Large-scale Exploits and Emergent Threats (LEET), San Jose, CA, USA, April 2010.
[47] J. Francois, S. Wang, R. State, and T. Engel. BotTrack: Tracking Botnets Using NetFlow and PageRank. In IFIP Networking 2011, 2011.
[48] F. Freiling, T. Holz, and G. Wicherski. Botnet Tracking: Exploring a Root-Cause Methodology to Prevent Distributed Denial-of-Service Attacks. In 10th European Symposium On Research In Computer Security, 2005.
[49] J. Goebel and T. Holz. Rishi: Identify Bot Contaminated Hosts by IRC Nickname Evaluation. In Workshop on Hot Topics in Understanding Botnets, 2007.
[50] J. B. Grizzard, V. Sharma, C. Nunnery, B. B. H. Kang, and D. Dagon. Peer-to-Peer Botnets: Overview and Case Study. In 1st Workshop on Hot Topics in Understanding Botnets, April 2007.
[51] G. Gu, R. Perdisci, J. Zhang, and W. Lee. BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection. In USENIX Security Symposium, 2008.
[52] G. Gu, P. Porras, V. Yegneswaran, M. Fong, and W. Lee. BotHunter: Detecting Malware Infection Through IDS-Driven Dialog Correlation. In 16th USENIX Security Symposium, 2007.
[53] G. Gu, J. Zhang, and W. Lee. BotSniffer: Detecting Botnet Command and Control Channels in Network Traffic. In 15th Annual Network and Distributed System Security Symposium (NDSS), 2008.
[54] J. John, A. Moshchuk, S. Gribble, and A. Krishnamurthy. Studying Spamming Botnets Using Botlab. In 6th USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2009.

[55] C. Kalt. Internet Relay Chat: Architecture. RFC 2810.
[56] A. Karasaridis, B. Rexroad, and D. Hoeflin. Wide-scale Botnet Detection and Characterization. In USENIX Workshop on Hot Topics in Understanding Botnets, 2007.
[57] E. Keogh, K. Chakrabarti, M. Pazzani, and S. Mehrotra. Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases. In ACM SIGMOD Conference on Management of Data, pages 151–162, 2001.
[58] H. Kim and B. Karp. Autograph: Toward Automated, Distributed Worm Signature Detection. In 13th USENIX Security Symposium, pages 271–286, August 2004.
[59] E. Kirda, C. Kruegel, G. Banks, G. Vigna, and R. Kemmerer. Behavior-based Spyware Detection. In 15th USENIX Security Symposium, 2006.
[60] D. E. Knuth. Seminumerical Algorithms. The Art of Computer Programming, Volume 2. Addison-Wesley, 1969.
[61] M. Konte, N. Feamster, and J. Jung. Dynamics of Online Scam Hosting Infrastructure. In Passive and Active Measurement Conference, 2009.
[62] C. Kruegel, W. Robertson, and G. Vigna. Detecting Kernel-Level Rootkits Through Binary Analysis. In Annual Computer Security Applications Conference (ACSAC), 2004.
[63] Z. Li, M. Sanghi, Y. Chen, M. Kao, and B. Chavez. Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience. In IEEE Symposium on Security and Privacy, 2006.
[64] A. Liaw and M. Wiener. Classification and Regression by randomForest. R News, volume 2/3, page 18, 2002.
[65] M. D. List. Malware Domains List. http://www.malwaredomainlist.com/mdl.php, 2009.
[66] Z. B. List. Zeus Domain Blocklist. https://zeustracker.abuse.ch/blocklist.php?download=domainblocklist, 2009.
[67] C. Livadas, R. Walsh, D. Lapsley, and W. T. Strayer. Using Machine Learning Techniques to Identify Botnet Traffic. In 2nd IEEE LCN Workshop on Network Security (WoNS 2006), 2006.
[68] J. Ma, L. K. Saul, S. Savage, and G. M. Voelker. Beyond Blacklists: Learning to Detect Malicious Web Sites from Suspicious URLs. In Proceedings of the SIGKDD Conference, Paris, France, 2009.

[69] M. Mahoney and P. Chan. Learning Nonstationary Models of Normal Network Traffic for Detecting Novel Attacks. In Conference on Knowledge Discovery and Data Mining (KDD), 2002.
[70] J. Mai, C.-N. Chuah, A. Sridharan, T. Ye, and H. Zang. Is Sampled Data Sufficient for Anomaly Detection? In Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement (IMC '06), 2006.
[71] D. Moore, G. Voelker, and S. Savage. Inferring Internet Denial of Service Activity. In USENIX Security Symposium, 2001.
[72] A. Moser, C. Kruegel, and E. Kirda. Limits of Static Analysis for Malware Detection. In 23rd Annual Computer Security Applications Conference (ACSAC), 2007.
[73] J. Nazario and T. Holz. As the Net Churns: Fast-Flux Botnet Observations. In International Conference on Malicious and Unwanted Software, 2008.
[74] CNET News. Stuxnet: Fact vs. Theory. http://news.cnet.com/8301-27080_3-20018530-245.html?tag=topStories1, 2010.
[75] J. Newsome, B. Karp, and D. Song. Polygraph: Automatically Generating Signatures for Polymorphic Worms. In IEEE Symposium on Security and Privacy, pages 226–241, 2005.
[76] E. Passerini, R. Paleari, L. Martignoni, and D. Bruschi. FluXOR: Detecting and Monitoring Fast-Flux Service Networks. In Detection of Intrusions and Malware, and Vulnerability Assessment, 2008.
[77] V. Paxson. Bro: A System for Detecting Network Intruders in Real-Time. Computer Networks, 31, 1999.
[78] R. Perdisci, I. Corona, D. Dagon, and W. Lee. Detecting Malicious Flux Service Networks through Passive Analysis of Recursive DNS Traces. In 25th Annual Computer Security Applications Conference (ACSAC), 2009.
[79] PhishTank. http://www.phishtank.com/, 2009.
[80] P. Porras, H. Saidi, and V. Yegneswaran. A Foray into Conficker's Logic and Rendezvous Points. In USENIX Workshop on Large-Scale Exploits and Emergent Threats, 2009.
[81] J. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, 1993.
[82] J. Quinlan. Learning with Continuous Classes. In Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, pages 343–348. World Scientific, Singapore, 1995.

[83] M. A. Rajab, J. Zarfoss, F. Monrose, and A. Terzis. A Multifaceted Approach to Understanding the Botnet Phenomenon. In Internet Measurement Conference (IMC), 2006.
[84] A. Ramachandran and N. Feamster. Understanding the Network-Level Behavior of Spammers. In ACM SIGCOMM, 2006.
[85] M. Rehak, M. Pechoucek, K. Bartos, M. Grill, P. Celeda, and V. Krmicek. Improving Anomaly Detection Error Rate by Collective Trust Modeling. In Recent Advances in Intrusion Detection, 11th International Symposium, 2008.
[86] M. Reiter and T. Yen. Traffic Aggregation for Malware Detection. In DIMVA, 2008.
[87] K. Rieck, T. Holz, C. Willems, P. Duessel, and P. Laskov. Learning and Classification of Malware Behavior. In DIMVA, 2008.
[88] M. Roesch. Snort - Lightweight Intrusion Detection for Networks. In 13th Systems Administration Conference (LISA), 1999.
[89] S. Singh, C. Estan, G. Varghese, and S. Savage. Automated Worm Fingerprinting. In OSDI, 2004.
[90] A. Sperotto, R. Sadre, and A. Pras. Anomaly Characterization in Flow-based Traffic Time Series. In Proceedings of the 8th IEEE International Workshop on IP Operations and Management (IPOM '08), pages 15–27, 2008.
[91] E. Stinson and J. Mitchell. Towards Systematic Evaluation of the Evadability of Bot/Botnet Detection Methods. In USENIX Workshop on Offensive Technologies (WOOT), 2008.
[92] B. Stone-Gross, M. Cova, L. Cavallaro, B. Gilbert, C. Kruegel, G. Vigna, and R. Kemmerer. My Botnet is Your Botnet: Analysis of a Botnet Takeover. In 16th ACM Conference on Computer and Communications Security, Chicago, IL, November 2009.
[93] B. Stone-Gross, M. Cova, L. Cavallaro, B. Gilbert, M. Szydlowski, R. Kemmerer, C. Kruegel, and G. Vigna. Your Botnet is My Botnet: Analysis of a Botnet Takeover. In ACM Conference on Computer and Communications Security (CCS), 2009.
[94] B. Stone-Gross, C. Kruegel, K. Almeroth, A. Moser, and E. Kirda. FIRE: Finding Rogue Networks. In 2009 Annual Computer Security Applications Conference (ACSAC '09), 2009.
[95] W. Strayer, R. Walsh, C. Livadas, and D. Lapsley. Detecting Botnets with Tight Command and Control. In 31st IEEE Conference on Local Computer Networks (LCN), 2006.

[96] Symantec. Exploring Stuxnet's PLC Infection Process. http://www.symantec.com/connect/blogs/exploring-stuxnet-s-plc-infection-process, 2010.
[97] Symantec. Symantec Threat Report. http://www.symantec.com/business/theme.jsp?themeid=threatreport, 2010.
[98] The Telegraph. French Fighter Planes Grounded by Computer Virus. http://www.telegraph.co.uk/news/worldnews/europe/france/4547649/French-fighter-planes-grounded-by-computer-virus.html, 2010.
[99] S. Theodoridis and K. Koutroumbas. Pattern Recognition. Academic Press, 2009.
[100] T. Holz, C. Gorecki, K. Rieck, and F. Freiling. Measuring and Detecting Fast-Flux Service Networks. In Annual Network and Distributed System Security Symposium (NDSS), 2008.
[101] D. Turaga, M. Vlachos, and O. Verscheure. On K-Means Cluster Preservation Using Quantization Schemes. In IEEE International Conference on Data Mining (ICDM '09), 2009.
[102] R. Villamarín-Salomón and J. C. Brustoloni. Bayesian Bot Detection Based on DNS Traffic Similarity. In SAC '09: ACM Symposium on Applied Computing, 2009.
[103] A. Wagner and B. Plattner. Entropy Based Worm and Anomaly Detection in Fast IP Networks. In SIG SIDAR Graduierten-Workshop über Reaktive Sicherheit (SPRING '06), 2006.
[104] H. Wang, D. Zhang, and K. G. Shin. Change-Point Monitoring for Detection of DoS Attacks. IEEE Transactions on Dependable and Secure Computing, 1(4), December 2004.
[105] F. Weimer. Passive DNS Replication. In FIRST Conference on Computer Security Incident Handling, 2005.
[106] I. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, 2005.
[107] J. Wolf. Technical Details of Srizbi's Domain Generation Algorithm. http://tinyurl.com/6mdasc, 2008.
[108] WorldWideWebSize. Size of the WWW. http://www.worldwidewebsize.com/, 2010.
[109] P. Wurzinger, L. Bilge, T. Holz, J. Goebel, C. Kruegel, and E. Kirda. Automatically Generating Models for Botnet Detection. In 14th European Symposium on Research in Computer Security (ESORICS 2009), 2009.

[110] G. Yan, Z. Xiao, and S. Eidenbenz. Catching Instant Messaging Worms with Change-Point Detection Techniques. In 1st USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET), 2008.
[111] H. Yin, D. Song, M. Egele, C. Kruegel, and E. Kirda. Panorama: Capturing System-wide Information Flow for Malware Detection and Analysis. In ACM Conference on Computer and Communications Security (CCS), 2007.
[112] B. Zdrnja, N. Brownlee, and D. Wessels. Passive Monitoring of DNS Anomalies. In DIMVA, 2007.
[113] H. Zitouni, S. Sevil, D. Ozkan, and P. Duygulu. Re-ranking of Image Search Results Using a Graph Algorithm. In 9th International Conference on Pattern Recognition, 2008.