
Universidade Federal de Pernambuco
Centro de Informática
Pós-Graduação em Ciência da Computação

Daniel Araújo Melo

ARCA – Alerts Root Cause Analysis Framework

Master's Dissertation

This dissertation has been submitted to the Informatics Center of the Federal University of Pernambuco as a partial requirement to obtain the degree of Master in Computer Science.

Advisor: Djamel F. H. Sadok

Recife 2014

Cataloging at source
Librarian: Jane Souto Maior, CRB4-571

M528a Melo, Daniel Araújo
ARCA - Alerts root cause analysis framework / Daniel Araújo Melo. – Recife: The Author, 2014.
122 leaves: ill., fig., tab.

Advisor: Djamel Fawzi Hadj Sadok.
Dissertation (Master's) – Universidade Federal de Pernambuco. CIn, Computer Science, 2014.
Includes references.

1. Computer networks. 2. Information security. I. Sadok, Djamel Fawzi Hadj (advisor). II. Title.

004.6 CDD (23. ed.) UFPE-MEI 2015-42

Daniel Araújo Melo

ARCA - Alerts Root Cause Analysis

Dissertation presented to the Graduate Program in Computer Science of the Universidade Federal de Pernambuco as a partial requirement for obtaining the degree of Master in Computer Science.

Approved on: 08/09/2014

EXAMINATION COMMITTEE

Prof. Dr. Stênio Flávio de Lacerda Fernandes
Centro de Informática / UFPE

Prof. Dr. Arthur de Castro Callado
Master's and Doctoral Program in Computer Science / UFC

Prof. Dr. Djamel Fawzi Hadj Sadok (Advisor)
Centro de Informática / UFPE

To my family, wife and children.

Acknowledgments

First, I would like to thank my family, especially my mother, Carmem Dolores, my wife Juliana, my son Enos Daniel and my grandmothers, Olga and Inez. They have always stood by my side, even when I was absent working on this research.

I gratefully acknowledge the supervision of Professor Djamel Sadok. He provided important suggestions and encouragement during the course of this work and offered me the opportunity to join the GPRT research team.

My sincere thanks also go to Professor Judith Kelner for pulling my ears when needed and for helping me when I lost my enrollment. I would not have completed the academic requirements without her help.

I would like to thank my examination committee, Stenio Fernandes and Arthur Callado, for suggestions that enriched this work.

I cordially thank my colleagues from GPRT for their help and for reviewing my presentation, and my colleagues from SERPRO, especially those who always believed that this moment would come.

I want to express my gratitude to Andre Tio, Lalá, Tadeu, Noemi, Iuri, Nacho, Suana, Amanda and Maíra for the good vibrations.

And finally, thanks, Universe!

“If you know the enemy and know yourself you need not fear the results of a hundred battles.” - Sun Tzu

Abstract

Modern virtual plagues, or malware, focus on infecting internal hosts and employ evasive techniques to conceal themselves from antivirus systems and users. Traditional network security mechanisms, such as firewalls, Intrusion Detection Systems (IDS) and antivirus systems, have lost efficiency when fighting malware propagation. Recent research presents alternatives to detect malicious traffic and malware propagation through traffic analysis. However, the reported results are based on experiments with biased artificial traffic or with traffic too specific to generalize, do not consider the existence of background traffic related to local network services, or demand previous knowledge of the network infrastructure. In particular, they do not consider a well-known problem of intrusion detection systems: the high false positive rate, which may account for 99% of all alerts. This dissertation proposes a framework, ARCA (Alerts Root Cause Analysis), capable of guiding a security engineer or system administrator in identifying the root causes of alerts, malicious or not, thereby allowing the identification of malicious traffic and false positives. Moreover, it describes the propagation mechanisms of modern malware and presents methods to detect malware through the analysis of IDS alerts, as well as methods to reduce false positives. ARCA combines an aggregation method based on Relative Uncertainty with Apriori, a frequent itemset mining algorithm. Tests with two real datasets show an 88% reduction in the number of alerts to be analyzed, without previous knowledge of the network infrastructure.

Keywords: Intrusion detection. Malware. Alerts correlation. Advanced persistent threats.

Summary

Modern virtual plagues focus on contaminating stations in internal networks and employ evasive techniques to hide from antivirus systems and from system users. Traditional network security mechanisms, such as firewalls, intrusion detection systems (IDS) and antivirus systems, lose efficiency in combating malware propagation. Research presents alternatives to detect malicious traffic and malware propagation through traffic analysis, but it reports results based on biased artificial datasets or on real datasets too specific to be generalized, does not consider the existence of background traffic related to local network services, or requires prior knowledge of the network infrastructure. Specifically, it does not consider a well-known problem of IDS: the high rate of false positives, which can reach 99% of all alerts. This dissertation proposes a framework (ARCA – Alerts Root Cause Analysis) able to help a security engineer identify the root causes of alerts, malicious or not, allowing the identification of malicious traffic and false positives. Additionally, it describes the propagation mechanisms of modern malware, proposals for detecting malware through the analysis of alerts issued by IDS, and proposals for reducing false positives. ARCA combines an alert aggregation mechanism based on Relative Uncertainty with the frequent itemset analysis algorithm Apriori. Tests carried out with real data showed a reduction of up to 88% in the number of alerts to be analyzed, without prior knowledge of the network infrastructure.

Keywords: Intrusion detection. Malware. Alerts correlation. Advanced persistent threats.

List of Figures

Figure 1 Worm propagation model (ZOU et al., 2005)
Figure 2 Typical botnet elements (SILVA et al., 2013)
Figure 4 Typical life-cycle proposed in (FEILY; SHAHRESTANI; RAMADASS, 2009)
Figure 5 Botnet life cycle proposed in (RODRÍGUEZ-GÓMEZ; MACIÁ-FERNÁNDEZ; GARCÍA-TEODORO, 2013)
Figure 6 IRC-based botnet DDOS Attack (COOKE; JAHANIAN; MCPHERSON, 2005)
Figure 7 Hybrid P2P network
Figure 10 Gameover Zeus network topology. Dotted line indicates information flow.
Figure 11 Organizations Categories (MCAFEE, 2010)
Figure 12 Victim's Country of Origin (MCAFEE, 2010)
Figure 13 Model for APT stages proposed by (GIURA; WANG, 2012)
Figure 14 A targeted attack in action (SOOD; ENBODY, 2013)
Figure 15 Infected Hosts according to WAN IP (FALLIERE; MURCHU; CHIEN, 2011)
Figure 16 Overview of Malware Operation
Figure 17 Countries affected by Flame according to McAfee (GOSTEV, 2012b)
Figure 18 Countries affected by Flame according to Symantec (SYMANTEC, 2012b)
Figure 19 Flame C&C Platform (ZHIOUA, 2013)
Figure 20 An example of (a) bipartite graph and (b) one-mode projection
Figure 21 BotHunter System by (PORRAS, 2009)
Figure 22 Vulnerabilities reported to NVD (NIST, 2014)
Figure 23 Incidents reported to Cert.br (CERT.BR, 2014)
Figure 24 Layout of the proposed classification system in (PARIKH; CHEN, 2008)
Figure 25 A sample multi-step-attack (SOLEIMANI; GHORBANI, 2008)
Figure 26 Generic view of alarm correlation according to (HUBBALLI; SURYANARAYANAN, 2014)
Figure 27 Generic view of graph ordering (PAO et al., 2012)
Figure 28 ATLANTIDES architecture (BOLZONI; CRISPO; ETALLE, 2007)
Figure 29 Proposed Architecture (HUBBALLI; BISWAS; NANDI, 2011)
Figure 30 Normalized SrcIp and DstIp quantities per significant class (SID). [Max(SrcIp), Min(SrcIp)]=[309,1] and [Max(DstIp), Min(DstIp)]=[542,2]
Figure 31 ARCA Architecture
Figure 32 ARCA Workflow
Figure 33 Atable and Ctable
Figure 34 Job1 collects the alerts and runs RUA and FIM
Figure 35 Job2 imports one or more RCARs and removes the selected alerts
Figure 36 Histogram of Class Counter from SERPRO’s dataset
Figure 37 Histogram of SrcIP Counter from SERPRO’s dataset
Figure 38 Histogram of DstIP Counter from SERPRO’s dataset
Figure 39 Normalized alert quantities per significant alert class (SID)
Figure 40 Normalized SrcIp and DstIp quantities per significant class (SID)
Figure 41 Alert Reduction in 12 hours interval
Figure 42 Total Alerts versus Final Alerts in 12 hours interval
Figure 43 Histogram of Class Counter from MACCDC’s dataset
Figure 44 Histogram of SrcIP Counter from MACCDC’s dataset
Figure 45 Histogram of DstIP Counter from MACCDC’s dataset

List of Tables

Comparison of life-cycle models
APT’s model comparison
Methods comparison
Apriori parameters
Results from RU Algorithm. Class clustering from 8:00 am to 8:00 pm
Results from RU Algorithm. SrcIP clustering from 8:00 am to 8:00 pm
Results from RU Algorithm. DstIp clustering from 8:00 am to 8:00 pm
Root Cause Association Rules from Serpro’s dataset, between 8:00 am and 9:00 am
Apriori’s Association Rules for Rule 1
Apriori’s Association Rules for Rule 2
Apriori’s Association Rules for Rule 3
Apriori’s Association Rules for Rule 4
New RCARs created from new alerts detected between 15:00 and 17:00
RCAR Rules from MACCDC 2012 dataset
Alerts triggered by Rule 1
Destinations from alerts triggered by Rule 2

List of Algorithms

Algorithm 1 Simplified significant cluster extraction algorithm

List of Acronyms

IDS Intrusion Detection System

ARCA Alerts Root Cause Analysis

MLP Multilayer Perceptron

TP True Positive

FP False Positive

FQDN Fully Qualified Domain Name

RR Resource Record

NIDS Network-based Intrusion Detection System

HIDS Host-based Intrusion Detection System

IPS Intrusion Prevention System

RCAR Root Cause Association Rule

Contents

CHAPTER 1 INTRODUCTION
1.1 MOTIVATION
1.2 OBJECTIVES
1.3 DOCUMENT ORGANIZATION
CHAPTER 2 MALICIOUS SOFTWARE
2.1 MALWARE TYPES
2.1.1 WORMS
2.1.1.1 Propagation Model
2.1.1.2 P2P worms
2.1.2 BOTS AND BOTNETS
2.1.2.1 Botnet Life-Cycle
2.1.2.2 C&C Architectural Designs
2.1.2.3 Fast-Flux
2.1.2.4 Domain-flux
2.2 MODERN MALWARES
2.2.1 MARIPOSA
2.2.2 TDL4
2.2.3 GAMEOVER ZEUS
2.3 ADVANCED PERSISTENT THREATS
2.3.1 APT MODEL
2.3.2 STUXNET
2.3.3 FLAME
2.4 FIGHTING MALWARE PROPAGATION
2.5 CHAPTER SUMMARY
CHAPTER 3 INTRUSION DETECTION AND FALSE ALARM REDUCTION
3.1 IDS CLASSIFICATION
3.2 PROBLEMS WITH DARPA DATASET
3.3 FALSE ALARM GENERATION
3.3.1 SIGNATURE ENHANCEMENT
3.3.2 STATEFUL SIGNATURES
3.3.3 VULNERABILITY SIGNATURES
3.3.4 ALARM MINING
3.3.4.1 Clustering
3.3.4.2 Classification
3.3.4.3 Neural network approach
3.3.4.4 Frequent pattern mining
3.3.5 ALARM CORRELATION
3.3.5.1 Multi-step correlation
3.3.5.2 Causal relation based correlation
3.3.5.3 Attack graphs based correlation
3.3.6 ALARM VERIFICATION
3.3.7 HYBRID METHODS
3.4 CHAPTER SUMMARY
CHAPTER 4 ARCA FRAMEWORK
4.1 FUNDAMENTAL CONCEPTS
4.1.1 ROOT CAUSES
4.1.2 RELATIVE UNCERTAINTY CLUSTERING
4.1.2.1 Extracting Significant Cluster
4.1.3 FREQUENT ITEMSET MINING
4.2 ARCA ARCHITECTURAL DESIGN
4.3 IMPLEMENTATION
4.3.1 RUA – RELATIVE UNCERTAINTY AGGREGATOR
4.3.2 FIM – FREQUENT ITEMSET MINER
4.3.3 ALERTS AGGREGATION
4.4 EXPERIMENTS
4.4.1 ALERTS PREPROCESSING
4.4.2 EXPERIMENT WITH THE SERPRO DATASET
4.4.2.1 Results evaluation
4.4.3 EXPERIMENT WITH THE MACCDC´S DATASET
CHAPTER 5 CONCLUSIONS
5.1 CONTRIBUTIONS
5.2 DIFFICULTIES FOUND
5.3 LEARNED LESSONS
5.4 FUTURE WORK
REFERENCES

Chapter 1 Introduction

Incident report statistics and ongoing research at specialized centers such as Cert.br (CERT.BR, 2014), Enisa (ENISA, 2014) and Cert/cc (CERT, 2014) show an alarming increase in threats directed at end users and hosts. Many industry works also describe techniques adopted by malicious software (malware) to steal private data and to use infected computers to perpetrate network attacks (KAMLUK, 2009) (GONCHAROV, 2012).

Furthermore, recent research shows that malware has evolved from self-propagating programs, a.k.a. ‘worms’ (ZHOU, CHENFENG VINCENT; LECKIE; KARUNASEKERA, 2010), to machines controlled via Command and Control (C&C) servers, a.k.a. ‘bots’ (TSAI et al., 2011; YU et al., 2014). Moreover, the security community has devoted efforts to researching the rise of Advanced Persistent Threats (APT) and Remote Administration Tools (RAT), potentially harmful malware with political or industrial motivation (BAIZE; CORP, 2012; BRADBURY, 2010; GIURA; WANG, 2012; SOOD; ENBODY, 2013; TANKARD, 2011).

Given malware's code obfuscation techniques, each infection may produce new code and circumvent traditional signature-based antivirus systems (OUELLETTE; PFEFFER; LAKHOTIA, 2013; SZÖR; FERRIE, 2001; WONG; STAMP, 2006). As a consequence, malware signatures may be outdated by the time they are distributed to antivirus clients. The problem is amplified by the limitations of traditional network security countermeasures when fighting malware propagation or internal attacks (BAIZE; CORP, 2012; PORRAS, 2009). Therefore, academia and industry have directed efforts toward researching network techniques to track malware traffic (PORRAS, 2009).

Throughout this document we discuss malware evolution, how to improve Intrusion Detection Systems (IDS) to detect malware traffic, drawbacks that may negatively influence IDS, and a proposed framework, named ARCA (Alerts Root Cause Analysis), whose main objective is to group alerts and allow security engineers to analyze their root causes.

The remainder of this chapter describes the focus of this dissertation. Section 1.1 presents its motivation, Section 1.2 defines the objectives, and Section 1.3 describes how this dissertation is organized.

1.1 Motivation

Traditional network security countermeasures lose efficiency when fighting malware propagation or internal attacks (BAIZE; CORP, 2012; PORRAS, 2009). Firewalls are generally deployed to protect local networks from outsiders and cannot prevent internal attacks or attacks between workstations, unless a security policy demands firewall deployment on workstations and local servers. Intrusion Detection Systems (IDS) have been well utilized to spot inbound attacks or malicious outbound traffic, but infected hosts and internal attackers may direct attacks at other workstations and local network services while avoiding firewalls. Moreover, communication channels between infected machines and control servers may use encryption. Antivirus systems cannot keep up with malware's polymorphic capabilities, and a malware signature may be outdated by the time it is distributed (OUELLETTE; PFEFFER; LAKHOTIA, 2013; PORRAS, 2009; SZÖR; FERRIE, 2001; WONG; STAMP, 2006).

In recent years, a great deal of work has been dedicated to developing methods that classify traffic and separate malicious from normal traffic, as in (GU et al., 2007, 2009; MANIKOPOULOS; PAPAVASSILIOU, 2002a; SHAHRESTANI et al., 2009; XU; WANG; GU, 2011a; YU et al., 2014). According to (SAAD et al., 2011), detection through network traffic behavior is advantageous because it makes it possible to detect a malware's malicious activities during any phase of its life cycle and has a lower cost than deep packet inspection. On the other hand, (PORRAS, 2009) has presented the challenges faced by such methods: malware can be stealthy, irregular and deceptive, and therefore generates few anomalies in network traffic.

Modern malware is in constant evolution. Each new version or variant implements more deceptive techniques to conceal itself from traffic analysis and system administrators, as presented in Chapter 2. However, it is possible to observe a particular characteristic that, to this date, remains unchanged and common to modern malware: the majority of exploits used to infect new hosts target known, patchable vulnerabilities. The same was observed by McHugh et al. (MCHUGH; FITHEN; ARBAUGH, 2000) more than 10 years ago.

Contemporary open source NIDS, such as Snort and Suricata, have active communities and industry initiatives developing signatures to detect the exploitation of known vulnerabilities, network protocol anomalies and policy violations (EMERGING THREATS, 2013; SOURCEFIRE, 2013; SURICATA, 2014). Most of the vulnerabilities exploited by the malware presented in Chapter 2 have corresponding signatures; moreover, there are specific signature subsets intended to detect tools and protocols related to potential leaks, such as P2P protocols, binary downloads through HTTP, Internet anonymizers, instant messaging, and others. Therefore, a NIDS may provide useful information to detect malicious traffic related to malware propagation.

However, IDS have well-known drawbacks. The work presented in (HUBBALLI; SURYANARAYANAN, 2014) surveys several schemes that share a major concern, namely, how to minimize the false alarm rate in IDS. It also argues that hybrid approaches, mixing data mining schemes and filtering-based schemes, are better suited to dynamic environments such as an internal network perimeter. The survey's conclusion poses open questions to the research community to motivate future efforts, such as incremental learning, testing with common datasets and real-time capability.

Given the important role of IDS against potential malware propagation and the importance of reducing the False Positive (FP) rate, the research community must consider the existence of false positives and their influence on experimental results. So far, it seems to handle malicious behavior identification and false alert reduction as separate problems. Moreover, schemes have been tested either with private datasets from traffic too particular to generalize or with biased, artificially generated datasets (BRUGGER; CHOW, 2005; HUBBALLI; SURYANARAYANAN, 2014; MAHONEY; CHAN, 2003; MCHUGH, 2000; TJHAI et al., 2008).


1.2 Objectives

The main goal of this dissertation is to investigate and propose a method to fight malware propagation in internal networks through the enhancement of contemporary signature-based NIDS. The secondary goals are to:
• Evaluate how the alert aggregation method proposed in (FEITOSA, EDUARDO LUZEIRO, 2010) behaves when facing alerts from two distinct real traffic samples;
• Evaluate whether malicious activities generate regular, statistically significant alerts;
• Evaluate whether the proposed method is useful to detect malware spreading and to reduce alert volume;
• Survey the behavior and spreading techniques of modern malware;
• Survey relevant strategies for reducing false alerts.

1.3 Document Organization

This dissertation is organized as follows:
• Chapter 2 – Malicious Software – describes malware evolution, the rise of Advanced Persistent Threats (APT) and proposals to fight malware propagation;
• Chapter 3 – Intrusion Detection and False Alarm Reduction – describes the evolution of intrusion detection and the research on minimizing the false alarm rate problem;
• Chapter 4 – ARCA Framework – explains ARCA’s theoretical basis, describes implementation details and presents the test results;
• Chapter 5 – Conclusions – presents the final conclusions and discusses contributions and future work.

Chapter 2 Malicious Software

In this chapter modern malware is discussed: its fundamental concepts are presented and examples of the most relevant malware are described. Methods to detect malicious traffic related to malware are also presented.

Malicious software, or software with malicious purposes, namely malware, is a source of a significant amount of unwanted traffic on the Internet (FEITOSA, EDUARDO LUZEIRO, 2010). The first malware was created in the early 1980s, and since then malware has evolved with the objective of circumventing traditional security countermeasures, from simple code that infected boot sectors to complex software with multiple propagation vectors (AYCOCK, 2006; OUELLETTE; PFEFFER; LAKHOTIA, 2013).

Modern malware exploits technical and social weaknesses to propagate. Unsolicited e-mails (SPAM) use social engineering to persuade users to execute malicious code that exploits system vulnerabilities or simply takes advantage of the users' permissions. After a successful infection, if the infected station is part of a local network, attacks may be triggered to infect other stations or compromise internal servers (YU et al., 2014).

There is no consensus on the financial impact of malware on the global economy, but the participation of organized crime in malware development is well known, and industry estimates are alarming. McAfee estimates the global financial impact at between $300 billion and $1 trillion (CENTER OF STRATEGIC AND INTERNATIONAL STUDIES, 2013), and Symantec estimates that cybercrime costs online adults in 24 countries $388 billion (SYMANTEC, 2013).

In the following sections the terms virus and malware are used interchangeably.


2.1 Malware Types

(AYCOCK, 2006) classifies malware according to its operational method. Three characteristics are used in the classification scheme:
• Self-replication – whether the malware actively attempts to spread autonomously by creating new copies, without user interference;
• Population growth – the rate of growth of a malware’s population due to self-replication;
• Parasitic behavior – whether the malware requires another executable, or any other code such as boot block code on a disk or binary code, in order to exist.

2.1.1 Worms

A worm is a self-replicating program that spreads by exploiting vulnerabilities found in other machines (ANDROULIDAKIS; CHATZIGIANNAKIS; PAPAVASSILIOU, 2009). While a virus propagates by infecting other code, a worm searches for vulnerabilities across a network or dispatches e-mails with infected attachments, seeking to trick users or exploit vulnerabilities in e-mail clients. It also employs obfuscation techniques such as encryption, oligomorphism, polymorphism or metamorphism.
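The three characteristics of the classification scheme in Section 2.1 can be sketched as a small record. This is only an illustration: the class name, field names and the `differing_traits` helper are hypothetical, and the values assigned to the virus and worm profiles are assumptions drawn from the contrast made above (a virus infects other code, a worm spreads on its own), not a reproduction of the full taxonomy in (AYCOCK, 2006).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MalwareTraits:
    """Hypothetical record for the three classification characteristics."""
    self_replicating: bool   # spreads by creating new copies without user interference
    population_growth: str   # "zero" or "positive" growth due to self-replication
    parasitic: bool          # needs another executable or code block in order to exist


# Assumed, illustrative profiles: both self-replicate, only the virus is parasitic.
VIRUS = MalwareTraits(self_replicating=True, population_growth="positive", parasitic=True)
WORM = MalwareTraits(self_replicating=True, population_growth="positive", parasitic=False)


def differing_traits(a: MalwareTraits, b: MalwareTraits) -> list:
    """Return the names of the characteristics on which two profiles differ."""
    names = ("self_replicating", "population_growth", "parasitic")
    return [name for name in names if getattr(a, name) != getattr(b, name)]


if __name__ == "__main__":
    print(differing_traits(VIRUS, WORM))
```

Under these assumed values, the only characteristic separating the two profiles is parasitic behavior, which matches the virus/worm distinction drawn in the text.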

2.1.1.1 Propagation Model

Worms generally use multiple techniques, or propagation vectors, to spread. (ZOU; TOWSLEY; GONG, 2006) proposed two major classes of worms, according to the way they spread:
• Email worms – propagate through e-mails and infect hosts when users read the e-mail content or open attachments. Human interference is required to propagate, and thus the propagation speed is relatively slow;
• Scan-based worms – scan IP address prefixes and directly exploit vulnerabilities on target hosts. As no human interference is required, they are faster than email worms.


According to (ZOU; TOWSLEY; GONG, 2006; ZOU et al., 2005), the epidemic model is adequate to model a scan-based worm, or “uniform scan worm”, which uniformly picks IP addresses and scans for vulnerable targets.

The epidemic model assumes that each subject is in one of two states, susceptible or infected, with a single transition from the susceptible to the infected state; once infected, a subject remains infectious forever. Moreover, the model assumes that all subjects can directly contact each other and that infected subjects do not collaborate in their infection efforts. The model for a finite population is

$$\frac{dI_t}{dt} = \beta I_t [N - I_t] \qquad (1)$$

where $I_t$ is the number of infected subjects at time $t$ and $N$ is the size of the vulnerable population before any infection takes place. $\beta$ is called the pairwise rate of infection; it represents the “infection intensity” from infected to susceptible subjects and corresponds to

$$\beta = \frac{\eta}{\Omega} \qquad (2)$$

where $\eta$ is the average number of scans an infected host starts per unit time and $\Omega$ is the number of available IP addresses. Therefore, every scan has a probability of $1/\Omega$ of hitting any given IP address in this scanning space. At $t = 0$, $I_0$ subjects are initially infected while the remaining $N - I_0$ subjects are susceptible.

(ZOU et al., 2005) also argues that it is possible to roughly partition the propagation into three phases, as may be seen in Figure 1:
• Slow start phase – since $I_t \ll N$, the number of infected hosts grows exponentially;
• Fast spread phase – many hosts are infected and infect others at a roughly linear speed;
• Slow finish phase – the infection rate decreases because few susceptible vulnerable computers are left.
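The behavior of model (1) and its three phases can be explored numerically. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the parameter values in the usage section are illustrative and not taken from this dissertation, the 10% and 90% cut-offs used to label phases are arbitrary, and the closed form is the standard solution of the logistic equation rather than a formula quoted from the cited works.

```python
import math


def pairwise_rate(eta, omega):
    """Equation (2): beta = eta / Omega."""
    return eta / omega


def infected_closed_form(t, n, i0, beta):
    """Standard logistic solution of equation (1) with I(0) = i0."""
    return n * i0 / (i0 + (n - i0) * math.exp(-beta * n * t))


def simulate(n, i0, beta, t_end, dt):
    """Forward-Euler integration of equation (1); returns [(t, I_t), ...]."""
    infected = float(i0)
    curve = [(0.0, infected)]
    for k in range(1, int(round(t_end / dt)) + 1):
        infected += beta * infected * (n - infected) * dt
        curve.append((k * dt, infected))
    return curve


def phase(infected, n, low=0.1, high=0.9):
    """Label a point on the curve; the cut-offs are arbitrary assumptions."""
    if infected < low * n:
        return "slow start"
    if infected > high * n:
        return "slow finish"
    return "fast spread"


if __name__ == "__main__":
    # Illustrative, assumed parameters: vulnerable population, initial infections,
    # scans per unit time per infected host, and size of the scanned address space.
    N, I0, ETA, OMEGA = 360_000, 10, 358.0, 2 ** 32
    beta = pairwise_rate(ETA, OMEGA)
    for t, infected in simulate(N, I0, beta, t_end=800.0, dt=0.1)[::1000]:
        print(f"t={t:6.1f}  I_t={infected:10.1f}  {phase(infected, N)}")
```

Forward Euler is used only because it mirrors equation (1) term by term; the closed form is included so that the numerical curve can be checked against it for a given step size.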


Figure 1 Worm propagation model (ZOU et al., 2005)

The infection rate is the average number of vulnerable hosts that can be infected per unit of time by one infected host during the early stage of a worm’s propagation. It should be noted that model (1), for the sake of simplicity, does not consider two major factors affecting a worm’s spreading: human counteraction and network congestion. The former has to be considered when modeling a slow-spreading worm, such as an e-mail worm, while the latter has to be considered when modeling a fast-spreading worm, such as a uniform scan worm.
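As a worked step, the infection rate defined above can be related to model (1) by assuming that the early stage means $I_t \ll N$. The symbol $\alpha$ is introduced here only for illustration; it does not appear in the text above.

```latex
\begin{align}
\frac{dI_t}{dt} &= \beta I_t [N - I_t] \;\approx\; \beta N I_t = \alpha I_t,
\qquad \alpha = \beta N = \frac{\eta N}{\Omega} \quad (I_t \ll N) \\
I_t &\approx I_0 \, e^{\alpha t}
\end{align}
```

Under this reading, each infected host starts $\eta$ scans per unit time and each scan lands on one of the $N$ vulnerable hosts with probability $N/\Omega$, so $\alpha$ counts new infections per infected host per unit time. The exponential form is consistent with the growth described for the slow start phase, and it stops being a good approximation once $I_t$ is no longer small compared with $N$.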

2.1.1.2 P2P worms

Peer-to-peer attacks are an increasingly popular technique for worm propagation due to their simplicity (SZOR, 2005). After a successful infection, a worm searches for P2P download folders and copies itself into the folders found. Anything available in a download folder is shared on the P2P network, and worms may overwrite or infect legitimate binary files.


2.1.2 Bots and Botnets

Bots are compromised computers controlled by one or more human operators, commonly known as botmasters, with the intent to perform malicious activities; a network of such infected computers is known as a botnet (RODRÍGUEZ-GÓMEZ; MACIÁ-FERNÁNDEZ; GARCÍA-TEODORO, 2013; SILVA et al., 2013). According to the survey in (ZHU et al., 2008), a botnet is “a collection of software robots, or bots, which run autonomously and automatically”. The infection methods used to compromise systems are similar to those of other classes of malware: exploitation of vulnerabilities, code insertion and social engineering that leads users to download malicious code. According to (SILVA et al., 2013): “The primary purpose of botnets is for the controlling criminal, group of criminals or organized crime syndicate to use hijacked computers for fraudulent online activity”.

Industry reports have called attention to the severity of the botnet problem (SILVA et al., 2013). Botnets are responsible for 80% of all SPAM circulating on the Internet, and some botnets have infected millions of hosts. It was claimed that the Mariposa botnet had infected 12 million hosts in 190 countries (SINHA et al., 2010). Moreover, academic research has warned of the growing number of botnets (COOKE; JAHANIAN; MCPHERSON, 2005; RODRÍGUEZ-GÓMEZ; MACIÁ-FERNÁNDEZ; GARCÍA-TEODORO, 2013; ZHUGE et al., 2007).

The major characteristic of a botnet is the control channel, which allows the botmaster, or botnetmaster, to send commands and updates to the infected systems. The updates include new exploits or code updates to bypass signature-based antivirus. This command and control (C&C) channel can operate over different network topologies and use different network protocols. The general components of a botnet are illustrated in Figure 2, and the architectural design is discussed in detail in Section 2.1.2.2.


Figure 2 Typical botnet elements (SILVA et al., 2013)

The communication between a botmaster and bots in a P2P network can be push-based or pull-based, depending on whether a bot waits for commands from the botmaster (push) or asks the botmaster for commands (pull) (WANG, PING et al., 2009). Apart from the botnet elements already illustrated, (RODRÍGUEZ-GÓMEZ; MACIÁ-FERNÁNDEZ; GARCÍA-TEODORO, 2013) extend the model and include roles to represent the related social context:
• Developer – a person, or group, who designs and implements the botnet. The developer is not necessarily the botmaster, because development work may be subcontracted. There are development kits, commonly named Do-it-Yourself (DIY), that provide tools to assist botnet development and maintenance.
• Client – those who rent botnet services from a botmaster or seek to control a botnet and use it for their own purposes.
• Victim – a system, person, network or organization that is the target of the attack.
• Passive Participant – the owner of the infected host.


2.1.2.1 Botnet Life-Cycle

Three botnet life-cycle models have been proposed in the literature, each covering states observed in the dissection of bots and botnets reported by security practitioners and researchers. Although they differ in how the life-cycle is detailed and in the number of possible states, each draws attention to two common states: how the infection is initiated, i.e. the initial infection or recruitment, and how communication is established between C&C servers and bots, i.e. the C&C protocol and how the C&C servers are reached.

Sinha et al. (SINHA et al., 2010) observed that new-generation botnets tend to employ automated strategies to spread, as worms do. Several researchers have identified worms, such as Conficker (BURTON, 2010) and Sdbot (TREND MICRO, [S.d.]), as the main recruiting strategy of botnets. (SINHA et al., 2010) also observed that botnets combine capabilities of worms, viruses and Trojan horses. A new strategy has been identified in P2P botnets: propagation through existing P2P networks, as with VBS.Gnutella (SYMANTEC, 2007); however, the number of possible targets is limited by the size of the P2P network.

Wang et al. (WANG, PING et al., 2009) observed the rise of botnets with multiple spreading media, such as e-mail, instant messages and file exchange. In (POLYCHRONAKIS; MAVROMMATIS; PROVOS, 2008) and (COVA; KRUEGEL; VIGNA, 2010) a new method called the drive-by download attack is discussed. According to Polychronakis et al. (POLYCHRONAKIS; MAVROMMATIS; PROVOS, 2008): “In a drive-by download attack, a malicious web page exploits a vulnerability in a web browser, media player, or other client software to install and run malware on the unsuspecting visitor’s computer”.

Once infected, a bot has to communicate with its C&C servers; otherwise it will be an isolated infected host. Each C&C architecture has particularities, which are discussed in Section 2.1.2.2. Table 2.1 presents a comparison of the proposed models and shows their common steps.


Table 2.1 Comparison of life-cycle models

Ramadass et al. (FEILY; SHAHRESTANI; RAMADASS, 2009) | Wang et al. (WANG, PING et al., 2009) | Rodríguez-Gómez et al. (RODRÍGUEZ-GÓMEZ; MACIÁ-FERNÁNDEZ; GARCÍA-TEODORO, 2013)
- | - | Conception
Initial infection | Recruiting bot members | Recruitment
Secondary injection | - | -
Connection | Forming the botnet | -
Malicious command and control | Stand by for instructions | Interaction
Update and maintenance | - | -
- | - | Marketing
- | - | Attack Execution
- | - | Attack Success

Ramadass et al. depicted a life-cycle with five phases (FEILY; SHAHRESTANI; RAMADASS, 2009), as may be seen in Figure 3:
1. Initial infection – The attacker scans a network for a known vulnerability and exploits it to gain control of the attacked system;
2. Secondary injection – A shell-code is executed and downloads, via FTP, HTTP or P2P, the actual bot binary, which installs itself on the infected system; the host then becomes a “zombie”, fully controlled by the botmaster. The bot code is automatically executed at each system boot;
3. Connection – The bot establishes the C&C connection with the C&C server;
4. Malicious command and control – Bot programs receive and execute commands sent by the botmaster;
5. Update and maintenance – The bot code may be updated to evade detection, correct bugs or change the C&C server.


Figure 3 Typical botnet life-cycle proposed in (FEILY; SHAHRESTANI; RAMADASS, 2009)

In (WANG, PING et al., 2009) a new life-cycle model with three stages was proposed for P2P botnets:
1. Recruiting Bot members – Similar to the initial infection proposed in (FEILY; SHAHRESTANI; RAMADASS, 2009);
2. Forming the botnet – After infection, a host has to join the P2P network; otherwise it remains an isolated infected host. The initial procedure to join a P2P network is called “bootstrap” and, according to (WANG, PING et al., 2009), two methods are well known:
   a. An initial peer list is hardcoded in each P2P client, and the bot tries to contact the nodes in this list to update its neighbor list;
   b. A shared web cache stores the initial host list, and each bot has the address of this cache hardcoded;
3. Stand by for instructions – After a successful join, the bot waits for a command from the botmaster. The communication model may be push, pull or a combination of both. More details about the communication model in P2P botnets are found in Section 2.1.2.2.

Rodríguez-Gómez et al. (RODRÍGUEZ-GÓMEZ; MACIÁ-FERNÁNDEZ; GARCÍA-TEODORO, 2013) extended the botnet life-cycle model to cover everything from the conception of the botnet to the achievement of its (malicious) purpose. The proposed life-cycle is a linear sequence of stages, and the failure of any intermediate stage thwarts the aim of the botnet. The model is composed of six stages, depicted in Figure 4:
1. Conception – The main characteristics and purposes of the botnet are defined in this first stage;
2. Recruitment – Once conceived and created, the botnet needs to recruit/infect hosts;
3. Interaction – The communication between an infected machine and a botnet server is established. The information exchanged is composed of commands and maintenance operations;
4. Marketing – The developer needs to make the botnet and its capabilities public in order to attract clients and profit from it;
5. Attack Execution – The infected hosts may offer profitable private information to the attacker, such as financial data, and launch attacks, such as DDOS attacks or phishing dissemination, according to the client’s interests;
6. Attack Success – The objective of the botnet is fulfilled.


Figure 4 Botnet life cycle proposed in (RODRÍGUEZ-GÓMEZ; MACIÁ-FERNÁNDEZ; GARCÍA-TEODORO, 2013)

2.1.2.2 C&C Architectural Designs

According to (ZHU et al., 2008), the C&C architecture may be classified as:
• IRC bot – The first, and most prevalent, botnets used the Internet Relay Chat (IRC) protocol, with a centralized C&C mechanism, due to the flexibility and scalability of this protocol;
• HTTP bot – The C&C channel uses the Hypertext Transfer Protocol (HTTP), because it supports encryption and because network policies commonly allow Internet access through TCP ports 80 and 443;
• P2P bot – A P2P architecture offers a more stable C&C channel than a centralized one, which has a single point of failure;
• Fast-flux (FF) networks – An advanced technique, first presented in (HONEYNET PROJECT, 2008) and also surveyed in (SHENG YU; SHIJIE ZHOU; SHA WANG, 2010) and (ZHANG et al., 2011), used to avoid detection of the C&C channel. The idea is to rapidly change the mapping between one single domain and multiple IP addresses. More details are presented in Section 2.1.2.3.

The survey in (SILVA et al., 2013) classifies C&C channels according to their architecture (centralized, decentralized, hybrid or random) and their operational mode (persistent or periodic/sporadic).

Centralized C&C

This architecture implements the traditional client-server model, in which all bots establish connections with one or more C&C servers. The main advantage of a centralized architecture is the fast information exchange between server and clients, whereas the major drawback is that the C&C server is a central point of failure.

Earlier centralized botnets, such as Agobot, Phatbot and IRCbot, used IRC as their communication protocol in a push-based model, where the botmaster pushes commands to a bot, which then responds accordingly (FEDYNYSHYN; CHUAH; TAN, 2011). The advantages of using IRC as the C&C channel protocol are:
• Flexibility – Botmasters can split the bots into groups and send different commands to each group; moreover, IRC servers can forward messages to bots connected to different servers;
• Open source – There are several open source servers available on the Internet;
• Redundancy – Bots can connect to backup servers if the primary server is down, and IRC servers can be part of an IRC network, i.e. a group of interconnected IRC servers;
• Scalability – Tests comparing the performance of IRC servers demonstrated capacity for millions of users (PITCOCK, 2010). Moreover, IRC servers that are part of an IRC network can distribute the bot load among themselves;
• Versatility – Beyond message exchange, IRC servers can transfer files.


In Figure 5, the elements of an IRC-based botnet are presented as proposed in (COOKE; JAHANIAN; MCPHERSON, 2005). The botmaster (commander) sends commands through an IRC network, whose servers may be public or hidden from the general public. The commands may be directed to all bots or to a group of bots. A bot, or zombie, starts a malicious activity, e.g. a DDOS attack, immediately after receiving a message from the botmaster.

Figure 5 IRC-based botnet DDOS Attack (COOKE; JAHANIAN; MCPHERSON, 2005)

Contemporary IRC botnets have evolved to obfuscate IRC messages and evade signature-based detection, but the IRC C&C channel remains detectable because IRC traffic is not common in corporate networks. Therefore, a network administrator can prevent botnet activity by blocking IRC traffic at the firewall. Due to this limitation, HTTP became popular as a C&C protocol in botnets such as Storm and Bobax, because HTTP has considerable advantages over IRC: it is generally allowed between organizations; the bots poll the C&C server in a pull-based model, which means that C&C traffic behaves like normal HTTP traffic; and it offers cryptographic capabilities through TLS (Transport Layer Security).

Though advantageous, HTTP retains the main disadvantage of a centralized architecture: the central point of failure. In (WANG, PING; SPARKS; ZOU, 2010) C&C servers are identified as the fundamental weak point of contemporary botnets, for the following reasons:
• The limited number of IP addresses facilitates the detection of C&C servers;
• If a C&C server is shut down, the botmaster loses control over the infected hosts;
• If a C&C server is hijacked by authorities or security researchers, the entire botnet can be exposed.

Wang et al. (WANG, PING; SPARKS; ZOU, 2010) also argue that, as security practitioners develop means to disrupt botnets, cybercriminals will develop more resilient and evasive C&C architectures.

Decentralized C&C

Given the limitations of a centralized architecture, security researchers and law enforcement have succeeded in their attempts to take down and disrupt botnets (BARFORD; YEGNESWARAN, 2007; FEDYNYSHYN; CHUAH; TAN, 2011; RODRÍGUEZ-GÓMEZ; MACIÁ-FERNÁNDEZ; GARCÍA-TEODORO, 2013; STONE-GROSS et al., 2011; WANG, PING; SPARKS; ZOU, 2010). The cybercriminals' answer was the development of botnets with a decentralized and more resilient architecture, organized as P2P networks, such as Waledac, Mariposa and Torpig (ROSSOW et al., 2013). The research in (ROSSOW et al., 2013) argues that, even after being analyzed and disrupted, some P2P botnets remain in operation and their exact size is unknown; even estimating their size is a complex task.

Jelasity et al. (JELASITY; BILICKI, 2009) proposed that P2P botnets are based on a structured P2P overlay, such as Kademlia (CROWCROFT et al., 2005). This improves the resiliency of the botnet, because the failure of individual peers does not cause network-wide failure and data is replicated across multiple peers.

In (WANG, PING et al., 2009) P2P botnets are classified into three types, according to whether and how a P2P botnet subverts an existing P2P network:
• Parasite – All the bots are selected from vulnerable hosts within an existing P2P network, and the botnet uses this P2P network for command and control;
• Leeching – Members join an existing P2P network and depend on it for C&C communication, but the bots may be vulnerable hosts that were either inside or outside the existing P2P network, e.g. early versions of the Storm botnet;
• Bot-only – The P2P botnet builds its own P2P network, in which all members are bots, e.g. Stormnet and Nugache.

A parasite botnet uses available P2P protocols to allow bots to locate and communicate with each other; no design is required from the botmaster, and the bootstrap method is already implemented by the P2P client. In leeching and bot-only botnets the botmaster must design bootstrap modules in order to add an infected host that is not yet a member of the P2P network.

The C&C mechanism in P2P networks was evaluated in (WANG, PING et al., 2009), which discussed how push and pull methods can be applied. For leeching and parasite P2P botnets, the same mechanism that existing P2P protocols use for file search is adapted to request commands: in a pull-based method, bots send requests for commands and the botmaster answers with commands instead of files. Implementing a push method is more complex, but feasible in structured P2P networks. For bot-only P2P networks a new P2P communication protocol may be developed, or an existing P2P protocol may be extended.

Hybrid C&C

This architecture employs characteristics of both centralized and decentralized architectures. Wang et al. (WANG, PING; SPARKS; ZOU, 2010) argue that even with advanced designs, such as the absence of a bootstrap process in the Slapper worm and Sinit, the public-key cryptography used to authenticate users in Sinit, or the encrypted control channel in Nugache, P2P botnets have weaknesses and are not mature. A single captured bot can expose the whole network, and the complicated communication mechanisms facilitate detection through network flow analysis.


Figure 6 Hybrid P2P network

Given the weaknesses found in centralized and decentralized architectures, Wang et al. (WANG, PING; SPARKS; ZOU, 2010) proposed a hybrid model, depicted in Figure 6, with the following features:
• No bootstrap procedure is required, because the methods to detect bootstrapping are well known;
• Each bot has a limited list of peers, so if a bot is captured only a partial list of nodes is exposed;
• A botmaster can send report commands to a group of bots, and the answers are redirected to a different node, called a sensor node, every time a command is issued. This avoids the detection and blocking of sensor nodes;
• A botmaster can update the node list in each bot with a single update command;
• The bots with static IP addresses that are accessible from the Internet are candidates for being servent bots. In P2P terminology, servent nodes act as servers and clients simultaneously;
• Each servent bot listens for incoming connections and uses symmetric cryptography to ensure confidentiality and command and node authentication, and to evade network analysis.
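The effect of the bounded peer list can be illustrated with a small simulation. The sketch below is only illustrative: the population size, the peer-list size and the function names are assumptions made for this example, not values or code from (WANG, PING; SPARKS; ZOU, 2010).

```python
import random


def build_peer_lists(num_bots, list_size, seed=0):
    """Give every bot a random peer list of fixed size (hypothetical model)."""
    rng = random.Random(seed)
    bots = list(range(num_bots))
    peer_lists = {}
    for bot in bots:
        others = [b for b in bots if b != bot]
        peer_lists[bot] = set(rng.sample(others, list_size))
    return peer_lists


def exposed_fraction(peer_lists, captured_bot):
    """Fraction of the network revealed when a single bot is captured."""
    exposed = peer_lists[captured_bot] | {captured_bot}
    return len(exposed) / len(peer_lists)


# Assumed sizes: 1000 bots, each knowing 20 peers.
peer_lists = build_peer_lists(num_bots=1000, list_size=20)
fraction = exposed_fraction(peer_lists, captured_bot=0)
print(f"{fraction:.1%} of the nodes are exposed")
```

Under these assumptions, capturing one bot reveals only the bot itself and its 20 peers, whereas a design in which every bot knows the complete node list would expose the whole network.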

Random C&C

According to (COOKE; JAHANIAN; MCPHERSON, 2005), in random botnets no single bot knows about more than one other bot. In addition, when a botmaster wants to send a message to the bots, it starts a random scan of the Internet and, when a bot is found, a connection is established to exchange encrypted messages and is closed immediately afterwards. The protocol is simple and obscure, and a single captured bot cannot compromise the whole network; however, the message latency and the lack of delivery guarantees are major drawbacks. Moreover, even the random scanning behavior is detectable.

2.1.2.3 Fast-Flux

Fast-flux is a mechanism used in botnets to evade detection of the C&C channel, first introduced in (HONEYNET PROJECT, 2008). The main idea is to associate a fully qualified domain name (FQDN) with multiple, even thousands of, IP addresses, using a very short Time-to-Live (TTL) for any given DNS Resource Record (RR) (IETF, 1987). Therefore, a bot may establish a new connection to a different C&C server, or botnet node, every 3-10 minutes. In addition, the bots do not connect directly to C&C servers, but to blind proxies that forward content to backend servers.

Two types of fast-flux networks were categorized in (HONEYNET PROJECT, 2008): single-flux and double-flux. In a single-flux network, the DNS record is changed every 3-10 minutes and the bot starts a new DNS resolution, which returns the IP address of a new fast-flux redirector, responsible for forwarding content between the bot and the backend server, named the “mothership”. These redirectors are generally infected hosts, and if a redirector is shut down, another redirector on stand-by takes its place in the IP address pool. In a double-flux network, both the DNS A and NS records are continually changed in a round-robin manner and advertised into the fast-flux network.
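The two observable properties described above, short TTLs and a large pool of IP addresses behind one FQDN, suggest a simple screening heuristic. The sketch below is a minimal illustration under stated assumptions: the `DnsAnswer` record type, the function name and both thresholds are hypothetical (the 600-second ceiling merely mirrors the 3-10 minute interval mentioned in the text), and the addresses come from documentation ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnsAnswer:
    """One observed DNS response for a domain (hypothetical record type)."""
    ttl_seconds: int
    addresses: frozenset


# Assumed thresholds: 600 s mirrors the 3-10 minute interval in the text;
# the minimum number of distinct IP addresses is an arbitrary choice.
MAX_FLUX_TTL = 600
MIN_DISTINCT_IPS = 10


def looks_like_fast_flux(answers):
    """Flag a domain whose answers combine short TTLs with many distinct IPs."""
    if not answers:
        return False
    short_ttl = all(a.ttl_seconds <= MAX_FLUX_TTL for a in answers)
    distinct_ips = set().union(*(a.addresses for a in answers))
    return short_ttl and len(distinct_ips) >= MIN_DISTINCT_IPS


# Ten observations, each returning two fresh addresses with a 3-minute TTL.
flux = [
    DnsAnswer(180, frozenset({f"203.0.113.{i}", f"203.0.113.{i + 1}"}))
    for i in range(0, 20, 2)
]
# Three observations of a conventional domain with a one-day TTL.
stable = [DnsAnswer(86400, frozenset({"198.51.100.7"}))] * 3

print(looks_like_fast_flux(flux))
print(looks_like_fast_flux(stable))
```

A heuristic of this kind would also flag legitimate services that use short TTLs and many addresses, such as content delivery networks, so in practice it could only serve as a first filter.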


2.1.2.4 Domain-flux

Fast-flux networks have a single point of failure: DNS resolution. A bot, or fast-flux agent, needs to resolve the FQDN, and several techniques have been proposed to detect the DNS resolutions of botnets (ZHANG et al., 2011).

In (STONE-GROSS et al., 2011) a new evasion technique, named domain-flux, was presented, in which each bot independently uses a domain generation algorithm (DGA) to compute a list of domain names. In each round, instead of performing a new DNS resolution for the same FQDN, the bot generates a new FQDN previously registered by the attackers and requests its resolution; if the host at the resolved IP address provides a valid response, it is considered valid until the next round. In (ZHANG et al., 2011), several techniques to detect fluxing domains are also presented.
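The property that makes domain-flux workable, and also what lets a defender who has recovered the algorithm precompute the same list for blocking, is that the domain list is a deterministic function of shared inputs. The sketch below is a toy generator written for illustration only: it does not reproduce the algorithm of any real malware, and the seed, the hashing scheme, the function name and the reserved `.example` suffix are all assumptions.

```python
import hashlib
from datetime import date


def candidate_domains(seed, day, count, tld=".example"):
    """Derive `count` pseudo-random domain names for a given day (toy DGA)."""
    domains = []
    for index in range(count):
        material = f"{seed}|{day.isoformat()}|{index}".encode("ascii")
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits to the letters a-p to form the label.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(label + tld)
    return domains


# The same seed and date always yield the same list; a new date yields a new one.
today = candidate_domains("demo-seed", date(2014, 9, 8), count=5)
again = candidate_domains("demo-seed", date(2014, 9, 8), count=5)
tomorrow = candidate_domains("demo-seed", date(2014, 9, 9), count=5)
print(today == again, today != tomorrow)
```

Because both sides can run the same computation, the generated names can be registered in advance by the attackers or, symmetrically, sinkholed or blocked in advance by defenders.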

2.2 Modern Malwares

2.2.1 Mariposa

Mariposa is claimed to have infected around 12.7 million hosts in 190 countries until its disruption (GOODIN, 2010). Sinha et al. (SINHA et al., 2010) stated that Mariposa was extremely harmful because it could:
• Download and execute binary code on the fly, using Direct Code Injection (DCI) to inject malicious code into the address space of the explorer.exe program;
• Infect machines already infected with different bots.

Moreover, Mariposa implemented a proprietary UDP-based C&C protocol, named the Iserdo Transport Protocol. Three main spreading techniques were detected in the analysis of Mariposa:
• USB Spreading: the bot copies itself to a USB device when the device is connected to the infected host;
• MSN Spreading: if the infected host has MSN Messenger installed, maliciously crafted messages are sent to the recipients found in the infected host;
• P2P Spreading: if the infected host has a P2P application, such as Ares, BearShare, iMesh, Kazaa, DC++, eMule or LimeWire, the bot copies itself to the shared folder.

A successful infection occurs if the binary code is executed, regardless of the user’s permissions, because the code is injected into the explorer.exe address space and can download other modules with new functionalities, including modules from other bots such as Zeus, using HTTPS, HTTP, FTP or the Butterfly Network Protocol. In addition, the modules can turn the infected host into a DDOS participant or a reverse proxy server.

Sinha et al. (SINHA et al., 2010) summarized the Mariposa C&C architecture as:
• Bot Client – The infected host, with the spreading functionalities already presented;
• Bot Server – A mediator with two functions: it anonymizes the master and acts as a load balancer;
• Bot Master – The core of operations, which acts as a manager of multiple servers. It can enable and disable servers and clients.

There is currently no consensus about the exact number of servers, but several domains were identified, three of them hard-coded (SINHA et al., 2010) and the rest observed during analysis (DEFENCE INTELLIGENCE, 2010; ICS-CERT, 2010). To connect, a bot sends an encrypted message to a candidate server and waits for the reply. If the server does not respond, the bot tries another one until a connection is successfully established.

2.2.2 TDL4

TDL4, detected in June 2011, is the fourth generation of the previously detected TDSS bot, which has evolved into the most sophisticated contemporary bot and, according to the Kaspersky team (GOLOVANOV; SOUMENKOV; IGOR, 2011), had infected over 4.5 million hosts. Bots from the TDSS family spread using multiple techniques (SYMANTEC, 2008):
• Drive-by download infections, discussed in Section 2.1.2.1, through fake blogs, forum comments, hacked legitimate websites, forged websites and affiliate programs;
• Fake torrent files and P2P downloads;
• Cracks on warez websites.

On infection, TDL4 installs an advanced rootkit in the Master Boot Record (MBR) so that it loads before the operating system. The code in the MBR is encrypted and capable of evading most signature-based antivirus products; moreover, TDL4 removes approximately 20 other malicious programs.

The main purpose of TDL4 is to generate revenue for cybercriminals by redirecting the Internet access of infected hosts to affiliated sites. The C&C architecture is hybrid: TDL4 may use a centralized architecture with approximately 60 HTTP C&C servers or embed its C&C protocol in the P2P protocol of the Kad network. Hence, TDL4 uses centralized servers or a public P2P network to transmit commands to infected hosts; moreover, the communication is encrypted with an unknown algorithm, probably developed by the attackers. It is worth noting that TDL4 exploits the MS10-061 vulnerability, patched by Microsoft since 2010.

2.2.3 Gameover Zeus

Gameover Zeus, also called P2P Zeus, is to date the newest variant of the Zeus malware (ALAZAB et al., 2013; ANDRIESSE et al., 2013), a credential-stealing Trojan first discovered in 2007. This new variant introduced a decentralized P2P C&C protocol, whose network is divided into several virtual sub-botnets independently controlled by several botmasters.

According to the Dell SecureWorks Counter Threat Unit (STONE-GROSS, 2012), P2P Zeus uses Cutwail (TREND MICRO, 2009), a spam botnet, to send massive amounts of e-mail impersonating well-known online retailers, cellular phone companies, social networking sites and financial institutions. The e-mails contain links to fake webpages that use Blackhole (SURI, 2011), a commercial exploit kit that targets vulnerabilities in web browsers and in plugins such as Adobe Reader, Flash and Java.

According to (ANDRIESSE et al., 2013), the Gameover Zeus network topology is organized in three disjoint layers, as depicted in Figure 7:


Figure 7 Gameover Zeus network topology. Dotted line indicates information flow.

• P2P Layer – Formed by infected hosts, which can play two roles: harvester bot and proxy bot. A harvester bot steals information located on the infected host, sends it to proxy bots and waits for commands from them, while a proxy bot forwards commands from the C&C proxy servers and relays the information stolen by harvester bots. Moreover, proxy bots also act as harvester bots and are selected manually by the botmasters;
• C&C Proxy Layer – Proxy bots interact with the C&C proxy layer to update their command repository and to forward the stolen data collected from the bots to the C&C server in the upper layer;
• C&C Layer – The C&C server manages the C&C proxy servers and its bots.

The communication between bots is usually UDP-based, except for the C&C communication between harvester bots and proxy bots and for binary/configuration update exchanges, both of which are TCP-based. Moreover, critical messages are encrypted with RSA-2048.

Bootstrapping onto the network is achieved through a hardcoded bootstrap peer list. This list contains the IP addresses, ports and unique identifiers of up to 50 Zeus bots. Zeus port numbers range from 1024 to 10000 in versions after June 2013, and from 10000 to 30000 in older versions. Unique identifiers are 20 bytes long and are generated at infection time by taking a SHA-1 hash over the Windows ComputerName and the Volume ID of the first hard drive. These unique identifiers are used to keep the contact information of bots with dynamic IP addresses up to date. Moreover, bots check the responsiveness of their neighbors every 30 minutes. Each neighbor is contacted in turn and given 5 opportunities to reply. If a neighbor does not reply within 5 retries, it is discarded from the peer list. A Domain Generation Algorithm (DGA) is used to generate 1000 unique domains per week, which are the addresses of C&C proxy servers.
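The identifier derivation and the neighbor check described above can be sketched as follows. This is a simplified model and not the malware's code: the text only states that SHA-1 is taken over the ComputerName and the Volume ID, so the concatenation order and the encoding used here are assumptions, and the probe callback, the function names and the sample values are hypothetical.

```python
import hashlib

MAX_RETRIES = 5  # from the text: each neighbor is given 5 opportunities to reply


def bot_identifier(computer_name, volume_id):
    """Return a 20-byte identifier: SHA-1 over computer name and volume ID.

    The concatenation order and the UTF-8 encoding are assumptions.
    """
    material = (computer_name + volume_id).encode("utf-8")
    return hashlib.sha1(material).digest()


def prune_peer_list(peers, is_responsive):
    """Keep only the neighbors that answer within MAX_RETRIES attempts.

    `is_responsive(peer, attempt)` is a hypothetical probe callback.
    """
    alive = []
    for peer in peers:
        if any(is_responsive(peer, n) for n in range(1, MAX_RETRIES + 1)):
            alive.append(peer)
    return alive


ident = bot_identifier("WORKSTATION-01", "1A2B-3C4D")

# Simulated probe: "a" answers at once, "b" never answers, "c" answers on try 5.
first_reply = {"a": 1, "b": None, "c": 5}


def probe(peer, attempt):
    first = first_reply[peer]
    return first is not None and attempt >= first


survivors = prune_peer_list(["a", "b", "c"], probe)
print(len(ident), survivors)
```

Because the identifier depends only on host properties, it stays stable when the IP address of the bot changes, which is what allows peers to keep contact information for bots with dynamic addresses up to date.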

2.3 Advanced Persistent Threats

While worms and bots usually attack broadly, without a specific target, several academic studies and industry reports have warned of the growing number of targeted attacks, in which the attacker has a monetary or political motivation to attack a specific organization (SOOD; ENBODY, 2013), (TANKARD, 2011), (LI, FRANKIE; LAI; DDL, 2011), (DE VRIES et al., 2012), (BAIZE; CORP, 2012), (THOMSON, 2011), (MANDIANT, 2010), (MCAFEE, 2010), (ISACA, 2013).

The industry named such targeted attacks Advanced Persistent Threats, or APTs (MANDIANT, 2010; MCAFEE, 2010), because the attackers are professional and more insidious, stealthy and persistent. The motivation is not the immediate gain pursued by cybercriminals, but trade secrets, intellectual property or classified government information. According to (TANKARD, 2011), ‘persistent’ refers to “the fact that the goal of an APT is to gain access to targeted information and to maintain a presence on the targeted system for long-term control and data collection”. Moreover, according to (SOOD; ENBODY, 2013): “Persistence is a characteristic of targeted attacks because they persist in the face of adversity instead of moving on to weaker targets”. Giura and Wang (GIURA; WANG, 2012) explained APT as follows: Advanced means that the attackers are well trained, well funded and have a wide spectrum of intrusion technologies; Persistent means that the attack persists over time; Threat means that the attackers' intention is to inflict damage or steal proprietary data.


The first industry report to address APTs is “Revealed: Operation Shady RAT” (MCAFEE, 2010), which describes how McAfee's team detected malware variants with heuristic signatures that indicated an encrypted C&C HTML channel. After successfully gaining access to one C&C server, the team was able to identify the victim population since mid-2006, when log collection began. It must be noted that the malicious activity may have started before 2006, but the earliest evidence dates from 2006. Most alarming was the number of organizations identified as victims: 71 organizations from 14 countries. The organizations were classified into 32 unique categories, as seen in Figure 8, and the 14 countries are depicted in Figure 9. The term RAT means Remote Access Trojan, defined by (AYCOCK, 2006) as a program that allows a computer to be monitored and controlled remotely.

Figure 8 Organizations Categories (MCAFEE, 2010)


Figure 9 Victim's Country of Origin (MCAFEE, 2010)

Following (ZHIOUA, 2013), given the amount of effort required to build sophisticated malware such as APTs, and the consequences of the attacks, it is possible to conclude that the developers, or attackers, are not typical cybercriminals or hacktivists and, moreover, that this malware uses state-of-the-art hacking techniques.

2.3.1 APT Model

Giura and Wang (GIURA; WANG, 2012) analyzed industry reports and concluded that each APT is customized for its target. However, the stages of APTs are similar and differ mostly in the methods used at each stage. Therefore, Giura and Wang proposed a model of APT stages, as shown in Figure 10:

Figure 10 Model for APT stages proposed by (GIURA; WANG, 2012).

• Reconnaissance – Attackers gather public information about the target, identify the IP address range used by the organization and scan the targeted network in search of vulnerable servers. Information about the employees, gathered from social networks, is used to build profiles that support social engineering attacks.
• Delivery – Information gathered in the initial Reconnaissance stage is used to craft a spear-phishing e-mail, i.e. a phishing message specially crafted for the targeted employees. The e-mail might contain malicious attachments or a link to a malicious URL that the user is led to trust. E-mail is the main infection technique, but other infection channels may be used, such as USB-based malware and time-activated Trojans.
• Exploitation – Once a host in the targeted network is successfully infected, the APT establishes a connection with a C&C server and uploads information gathered from the infected host, including passwords, e-mails, network usernames and shared network resources.
• Operation – Attackers maintain a persistent presence and scan the internal network in search of potential targets that store sensitive information.
• Data Collection – Attackers use privileged credentials harvested in previous stages to collect sensitive data, then compress and encrypt it before uploading.
• Exfiltration – The data organized in the previous stage is uploaded to multiple servers, in order to prevent investigators from finding the final destination of the data.

Figure 11 A targeted attack in action (SOOD; ENBODY, 2013)


Sood and Enbody (SOOD; ENBODY, 2013) developed a model of targeted attacks with three phases, as shown in Figure 11:

• Intelligence Gathering – To perform reconnaissance, attackers collect information about the target from publicly available resources, such as DNS queries, WHOIS lookups and organizational webpages. Useful information regarding employees, vendors and daily operations can also be collected from social networks, such as Facebook or Twitter, or from personal webpages. With this information, attackers start to scan the target network looking for vulnerabilities, open ports, address ranges, outdated systems, virtualized platforms and any other available information about the target network infrastructure. Moreover, the webpages of the organization are scanned for known vulnerabilities, such as SQL Injection (SQLI) and Cross-site Scripting (XSS).

• Threat Modeling – The attackers create a profile of the target and its environment; they may even build a replica of the target, so that penetrations can be tested without raising suspicion at the target.

• Attacking and Exploiting Targets – In general, the attack aims to load malware onto a host of the target and to use it as a platform to analyze the internal infrastructure and compromise other hosts. Attacks can vary but exhibit common patterns:
   • Drive-by download and spear phishing;
   • Exploiting web infrastructure;
   • Exploiting communication protocols;
   • Exploiting co-location services;
   • Physical attacks.
  Several elements are frequently used in targeted attacks:
   • Malware infection frameworks;
   • RATs and rootkits;
   • Morphing and obfuscation toolkits;
   • Interface with the underground market.

Table 2.2 presents a comparison of the two proposed models. The model proposed by Giura and Wang (GIURA; WANG, 2012) is more detailed; its Reconnaissance step is equivalent to Intelligence Gathering and Threat Modeling in the model proposed by Sood and Enbody (SOOD; ENBODY, 2013). However, the latter offers more details about tools and techniques than the former.

Table 2.2 APT’s model comparison

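Giura and Wang           Sood and Enbody
(GIURA; WANG, 2012)      (SOOD; ENBODY, 2013)
-----------------------  --------------------------------
Reconnaissance           Intelligence Gathering
                         Threat Modeling
Delivery                 Attacking and Exploiting Targets
Exploitation             Attacking and Exploiting Targets
Operation                Attacking and Exploiting Targets
Data Collection          Attacking and Exploiting Targets
Exfiltration             Attacking and Exploiting Targets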

2.3.2 Stuxnet

Stuxnet is considered the first weapon in the history of security (LANGNER, 2011). According to Symantec (MCDONALD et al., 2013), it had been in development as early as November 2005, was in the wild from early November 2007 and was first noticed by the industry in 2008; four different versions are known: 0.500, 1.001, 1.100 and 1.101. Contrary to initial belief, Stuxnet’s objective was not industrial espionage but the physical destruction of an industrial controller, specific to one manufacturer (Siemens), attached to a SCADA system (GALLOWAY; HANCKE, 2013).

An industrial control network is a system of interconnected equipment used to monitor and control physical equipment in industrial environments (GALLOWAY; HANCKE, 2013). It is composed of specialized components and applications, such as Programmable Logic Controllers (PLCs), Supervisory Control and Data Acquisition (SCADA) systems and Distributed Control Systems (DCSs). SCADA is a software layer whose objective is to provide an interface between the PLC and user-level software; it captures signals from devices and sends high-level control commands, e.g. the instruction to start an engine or to change control parameters such as rotation speed.

Stuxnet took longer in the slow-start phase than conventional worms, mainly because its main spreading technique relied on local exploitation, through USB sticks and/or local networks. Moreover, the infection process included a fingerprinting procedure to deploy the payload only if the identified controller was a model used by Iran's Government to enrich uranium (LANGNER, 2011). Figure 12 presents the countries of origin of the infected hosts, according to Symantec (FALLIERE; MURCHU; CHIEN, 2011).

Figure 12 Infected Hosts according to WAN IP (FALLIERE; MURCHU; CHIEN, 2011)

According to (ZHIOUA, 2013), the Stuxnet attack operates at three levels: (1) Windows OS, (2) Step 7 software and (3) PLC. Figure 13 gives an overview of how Stuxnet operates. Its main goal is to compromise the PLC by infecting the Windows host connected to it.


Figure 13 Overview of Stuxnet Malware Operation

Stuxnet’s main infection technique is the LNK exploit (MS10-046) delivered on a USB drive (MICROSOFT, 2010a). The vulnerability allows the execution of malicious code inserted in shortcuts (.LNK files) when the shortcut icon is displayed. A Windows host is compromised when Windows Explorer is used to open the USB drive containing the malicious LNK file. During the infection process Stuxnet uses rootkit techniques to hide files and inject code into processes. If the host has Step 7 installed (SIEMENS, [S.d.]), Stuxnet hooks specific APIs used to open Step 7 projects so that it executes each time a project is loaded; this allows Stuxnet to propagate through the infected project files and to infect the host again after an OS update or replacement.

After a successful infection, Stuxnet initiates local network propagation (MCDONALD et al., 2013; ZHIOUA, 2013) through the exploitation of:
• The Print Spooler service vulnerability (MS10-061) (MICROSOFT, 2010b), which allows remote code execution through the printer service if a printer is shared on the local network;
• The Windows Server service vulnerability (MS08-067) (MICROSOFT, 2008), which allows remote code execution through Remote Procedure Call (RPC).

It is worth noting that these vulnerabilities were discovered during the analysis of Stuxnet and were unpatched at the time.


Stuxnet tries to communicate with its C&C servers and, if the connection is established, can get updates and additional binary code to execute on the infected machine, and can upload information about the infected host, including the installed Industrial Control Systems software. The control connection is not mandatory (MCDONALD et al., 2013): Stuxnet was developed to be autonomous, with a behavior similar to that of a worm. Therefore, the C&C protocol is simple and HTTP-based, with 2 domains, and encryption is used only when uploading host information; 4 servers in 4 countries were identified until the disruption of Stuxnet. Moreover, compromised hosts within the same local network establish a P2P network, and the host capable of communicating with the C&C server acts as a proxy and distributes information through the local P2P network.

The payload is dropped and executed only if the PLC uses a Profibus communication processor (TEXAS INSTRUMENTS, [S.d.]). The malicious code monitors the Profibus messaging bus and modifies the spinning frequency of the attached equipment, first to 1410 Hz, then to 2 Hz and then to 1064 Hz, with the objective of stressing and destroying the equipment.

2.3.3 Flame

Flame was an APT discovered in 2012 by (IRAN NATIONAL CERT, 2008) and initially mistaken as being related to Stuxnet. At first glance, Flame evaded 43 antivirus products, demonstrated multiple spreading and obfuscation techniques, and was related to a mass data loss in Iran.

The first in-depth study of Flame was conducted at the Budapest University of Technology and Economics by the Laboratory of Cryptography and System Security – CrySyS Lab (CRYSYS, 2012). Flame was characterized as info-stealer malware with a modular structure, which allows it to incorporate multiple techniques to propagate and to obfuscate itself, such as 5 different encryption methods, 3 different compression techniques and 5 different file formats.

According to Symantec (SYMANTEC, 2012f), Flame’s main characteristic is not to spread until asked to. After the initial infection process, the infected host takes no spreading action until the C&C connection is established and a command to spread arrives. Moreover, Flame is perhaps the first malware with a “suicide” routine (SYMANTEC, 2012c, d): after the details of Flame became public, a new module was distributed by the C&C servers to infected hosts, and a few weeks later a command was sent to execute this module and completely remove Flame. Flame activity has gradually ceased since then.

There is no consensus about where Flame attacked geographically or about its main spreading technique. Kaspersky (GOSTEV, 2012b) stated that Flame had attacked Middle East countries, mostly Iran and Israel, as seen in Figure 14, whereas Symantec (SYMANTEC, 2012b) said that the primary targets of this threat were located in the Palestinian West Bank, Hungary, Iran, and Lebanon, with additional reports indicating infections in Austria, Russia, Hong Kong, and the United Arab Emirates, as seen in Figure 15. A possible explanation for this discrepancy is that each company handles infections from different constituencies.

Figure 14 Countries affected by Flame according to Kaspersky (GOSTEV, 2012b)


Figure 15 Countries affected by Flame according to Symantec (SYMANTEC, 2012b)

Flame has multiple spreading techniques, including exploits for vulnerabilities already exploited by Stuxnet and patched by Microsoft since at least 2010: the Print Spooler Service vulnerability (MS10-061) and the Shortcut ‘LNK/PIF’ Files Automatic File Execution vulnerability (MS10-046). Researchers initially considered that Flame might be an evolution of Stuxnet, but this idea was discarded after a more in-depth analysis.

Efforts to identify Flame’s main spreading technique have been unsuccessful, i.e. no one has identified how the infection started. The Kaspersky team (GOSTEV, 2012a) reported that no zero-day vulnerability was found and that a fully patched Windows 7 was infected. However, one of the spreading techniques found may indicate how: the attackers forged Microsoft digital certificates (SYMANTEC, 2012g), since revoked, and intercepted Microsoft Update Service requests in order to execute code on the target host as if it came from Microsoft (GOSTEV, 2012a). A module found in Flame allows an infected host to act as a proxy for Windows Update requests, i.e. an infected host detects network clients configured for automatic proxy detection, announces itself as a proxy server, intercepts update requests and introduces malicious code signed with the forged Microsoft digital certificates. There is no evidence of this attack or interception at Internet Service Providers (ISP), but it could be applied to an ISP’s infrastructure as well.


Analyses from the CrySyS laboratory (CRYSYS, 2012) and Symantec (SYMANTEC, 2012a) drew attention to a particular Flame module that is able to enumerate devices around the infected host, to announce the host as a discoverable device, and to encode the status of the malware in the device information using base64 encoding. Symantec (SYMANTEC, 2012a) discusses what an attacker can do with this functionality:

• Identification of victim social networks – By monitoring devices within Bluetooth range, attackers may catalog the devices encountered and map the victim’s social and professional circles;
• Identification of victim physical locations – By measuring the strength of the radio waves it is possible to calculate the distance between hosts, so attackers can identify other nearby devices, including those owned by the organization’s employees; moreover, attackers can deploy Bluetooth monitoring devices in public places in order to track them;
• Enhanced information gathering – Attackers can steal contacts, SMS messages and any other data from mobile devices. Attackers may even turn on the microphone of mobile devices and record a conversation.

A Flame infection installs a Lua interpreter (LUA, 1993), which allows attackers to deploy new functionalities through multiple scripts. According to Symantec (SYMANTEC, 2012e), the attackers have something equivalent to an “app store” from which new modules can be retrieved. The scripts provide functionalities to extract data from infected hosts, capture user credentials (if the user has administrative clearance, the credentials are used to access domain servers and add user accounts with default passwords), distribute malicious code through network shares, and more, as found in (CRYSYS, 2012).

After a successful infection, the infected host establishes a connection with a C&C server, sends the initial data collected and waits for instructions. Figure 16 presents Flame’s C&C architecture: 80 domains were used to obfuscate 22 C&C servers. The protocol used for communication between servers and infected hosts was HTTPS, and the attackers accessed the servers through SSH, to perform system administration tasks, or through HTTPS, to access a web application used to control the infected hosts (SYMANTEC, 2012d).


Figure 16 Flame C&C Platform (ZHIOUA, 2013)

2.4 Fighting Malware Propagation

(SAAD et al., 2011) shows that malware detection through network traffic behavior has the following advantages:

• It is possible to detect bots during any phase of their life cycle and, as a consequence, also to detect the network behavior of worms;
• It has a lower cost than deep packet inspection or behavior analysis;
• A bot may be detected during the formation phase or through its C&C connection.

On the other hand, (PORRAS, 2009) has presented the challenges faced by such methods:

• Malware can be stealthy and embed its communication protocol in existing protocols already present in the network, such as HTTPS;
• The communication with a C&C server may take place at irregular intervals and at a rate low enough not to generate significant anomalies in network traffic.

Several researchers have dedicated efforts to detecting malware propagation through traffic analysis (GU et al., 2007, 2009; MANIKOPOULOS; PAPAVASSILIOU, 2002a; SHAHRESTANI et al., 2009; XU; WANG; GU, 2011a; YU et al., 2014).

Xu et al. (XU; WANG; GU, 2011b) proposed a method to cluster end hosts with similar behavior within the same network prefixes. Bipartite graphs are used to model the social behavior of end hosts, i.e. with whom a host communicates. A one-mode projection of the bipartite graph is used to capture social behavior similarity: edges connect hosts that share a destination or a source. Subsequently, a spectral clustering algorithm discovers inherent behavior within the same network prefix. Figure 17 presents an example of a bipartite graph and its projection, with edges connecting nodes that share a source or destination, e.g. a1 and a4 have b4 as destination, and hence an edge connects them.

Figure 17 An example of (a) bipartite graph and (b) one-mode projection.
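The one-mode projection described above can be illustrated with a minimal sketch. The function name, the flow list and the use of shared-destination counts as edge weights are assumptions made for illustration; they are not taken from (XU; WANG; GU, 2011b), and the spectral clustering step is not shown.

```python
from collections import defaultdict
from itertools import combinations


def one_mode_projection(flows):
    """Project a bipartite host graph onto its source side.

    `flows` is an iterable of (source, destination) pairs. Two sources are
    linked in the projection when they share at least one destination; the
    edge weight counts the destinations they share (an assumed weighting).
    """
    sources_by_destination = defaultdict(set)
    for source, destination in flows:
        sources_by_destination[destination].add(source)

    projection = defaultdict(int)
    for sources in sources_by_destination.values():
        for left, right in combinations(sorted(sources), 2):
            projection[(left, right)] += 1
    return dict(projection)


# Hypothetical flows: a1 and a4 both contact b4, a1 and a2 both contact b1.
flows = [("a1", "b1"), ("a2", "b1"), ("a1", "b4"), ("a4", "b4"), ("a3", "b2")]
edges = one_mode_projection(flows)
print(edges)
```

Host a3 shares no destination with any other host, so it stays isolated in the projection, which is the kind of structure a clustering algorithm would later separate.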

Tests were conducted with network traffic available at the Cooperative Association for Internet Data Analysis (CAIDA). Scanning activities and a DDOS attack were detected in the Internet backbone traffic, and a worm was also detected at an early stage of propagation in a sample containing the Witty worm; however, no evidence of performance was presented considering background traffic in an internal network.

The BotHunter system was proposed by (PORRAS, 2009). Its main objective is to detect internal hosts trying to propagate infections outwards. An infection dialog correlation strategy was modeled as a set of loosely ordered communication flows exchanged between an internal host and one or more external entities, i.e. bots are modeled as sharing a common set of underlying actions that occur during the infection life cycle: target scanning, infection exploit, binary egg download and execution, command and control channel establishment, and outbound scanning. The model is depicted in Figure 18.

Figure 18 BotHunter System by (PORRAS, 2009)
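A loosely ordered dialog correlation of this kind can be sketched as follows. The labels E1 to E5 map to the five actions listed above (E5 for outbound scanning is an assumed label), and the decision rule used here, exploit evidence plus at least one outward event or at least two distinct outward events, is an illustrative assumption rather than the rule stated in this text.

```python
# Dialog event classes for the five life-cycle actions (E5 is an assumed label).
INBOUND_SCAN, EXPLOIT, EGG_DOWNLOAD, CNC, OUTBOUND_SCAN = "E1", "E2", "E3", "E4", "E5"
OUTWARD_EVIDENCE = {EGG_DOWNLOAD, CNC, OUTBOUND_SCAN}


def infected_hosts(events):
    """Return internal hosts whose accumulated evidence satisfies the assumed rule.

    `events` is an iterable of (internal_host, event_class) pairs in any order;
    ignoring the order is what makes the correlation loosely ordered.
    """
    evidence = {}
    for host, event_class in events:
        evidence.setdefault(host, set()).add(event_class)

    flagged = set()
    for host, seen in evidence.items():
        outward = seen & OUTWARD_EVIDENCE
        if (EXPLOIT in seen and outward) or len(outward) >= 2:
            flagged.add(host)
    return flagged


# Hypothetical evidence stream for three internal hosts.
events = [
    ("10.0.0.5", CNC), ("10.0.0.5", EXPLOIT),            # exploit + outward event
    ("10.0.0.6", INBOUND_SCAN),                          # scanned only
    ("10.0.0.7", EGG_DOWNLOAD), ("10.0.0.7", OUTBOUND_SCAN),
]
print(sorted(infected_hosts(events)))
```

A host that was merely scanned (10.0.0.6) is not reported, which illustrates why correlating several dialog events can produce fewer alerts than reporting each event in isolation.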

Experiments were conducted using Snort rules to detect evidence of direct exploits (E2), binary download (E3) and C&C communication (E4). The rule set was specially customized for malware detection, and two preprocessors, SLADE and SCADE, were added to the Snort configuration in order to detect anomalies such as inbound scanning (E1). The results demonstrated significant performance in a controlled environment with honeypots, with a 95.1% true positive rate and a 4.9% false negative rate. The experiments in a university campus network were inconclusive: malicious traffic was injected into real background traffic and the detection rate was 100% for 10 malicious patterns; however, after 4 months 98 malicious patterns had been detected and approximately 61% of these were false positives. Experiments in a production internal network over 10 days were also inconclusive: the single detection was a false positive.

2.5 Chapter Summary

In this Chapter the most relevant malware threats, bots and worms, were described, and their spreading techniques were presented. The modern malware presented has demonstrated a continuous evolution in order to evade both local host detection and traffic detection, the latter by using techniques to obfuscate the C&C communication with botmasters. Moreover, botnets have absorbed autonomous spreading techniques from Trojans and worms, and rootkit capabilities to conceal themselves. However, the techniques to exploit vulnerabilities are common to most of them, and the vulnerabilities are generally already patched.

Solutions to detect malware through traffic analysis were also presented; however, they mostly presented positive results when tested on traffic without the background noise generated by regular services and network protocols.

Chapter 3 Intrusion Detection and False Alarm Reduction

This Chapter presents the most relevant methods to reduce false alerts in Intrusion Detection.

Due to the common flaws and vulnerabilities found in computer systems, even security mechanisms such as access control and firewalls cannot prevent security breaches. According to (DENNING, 1987), most existing systems have security flaws, and developing an absolutely secure system is generally impossible.

The number of vulnerabilities reported in the last few years demonstrates that Denning’s statements remain valid. Figure 19 presents the number of vulnerabilities with software flaws reported to the NVD - National Vulnerability Database (NIST, 2014), since 1988. Moreover, 8,495 high severity vulnerabilities have been reported since 2010, representing 36.83% of all vulnerabilities reported, and modern malware takes advantage of such flaws, as discussed in Chapter 2.

Figure 19 Vulnerabilities reported to NVD (NIST, 2014). [Chart: vulnerabilities reported per year, 1988 to 2014; vertical axis from 0 to 7000.]

The discussion in (MCHUGH; FITHEN; ARBAUGH, 2000) and the malware presented in Chapter 2 show attackers exploiting most systems through widely known security vulnerabilities. There are several reasons why administrators may fail to install software patches:

• Disruption: if a patch installation requires a system reboot and the service uptime is crucial, the system administrator may postpone it.
• Unreliability: software patches are typically released as soon as possible after a vulnerability is disclosed. A patch may not have been tested enough and may cause severe disruption or even damage to the host systems to which it is applied. Therefore, the system administrator may choose not to install it and accept the risk of a compromise.
• Irreversibility: most patches are not designed to be easily reversible, due to the ordering of the changes made to the system. Once a patch is applied, there is often no easy way of reverting to the original state. This factor increases the risk associated with applying a patch.
• Unawareness: an administrator may simply miss a patch announcement for some reason and therefore be unaware of it, or may have neglected to act on a received announcement.


The number of reported security incidents has grown as well. Figure 20 presents the number of incidents reported to Cert.br (CERT.BR, 2014) since its creation in 1999.

Figure 20 Incidents reported to Cert.br (CERT.BR, 2014)

Given this scenario, Intrusion Detection Systems (IDS) have risen as countermeasures, implemented as hardware or software, able to monitor and report attacks or attempts to exploit possible flaws (FEITOSA, EDUARDO LUZEIRO, 2010; HUBBALLI; SURYANARAYANAN, 2014). An intrusion, or malicious activity, is any activity that aims to compromise the confidentiality, integrity or availability of computer systems (MUKHERJEE; HEBERLEIN; LEVITT, 1994).

The idea of monitoring user activities with the objective of detecting malicious behavior was first introduced by (DENNING, 1987) and (ANDERSON, 1980) and, since then, several methods have been proposed by security researchers (HUBBALLI; BISWAS; NANDI, 2011; HUBBALLI; SURYANARAYANAN, 2014; KUMAR, 1995; MANIKOPOULOS; PAPAVASSILIOU, 2002b; MUKHERJEE; HEBERLEIN; LEVITT, 1994).

IDS are composed of sensors that generate and send events and security alerts to management stations whenever a malicious activity is detected. Each alert consists of information describing the attack, such as the type of attack, source address and destination address. Throughout this chapter, the terms alert and alarm are used interchangeably.


The remainder of this Chapter presents the types and classifications of IDS, a discussion of the major drawbacks of IDS, namely the alarm volume and the false alarm rate, and the state of the art in alarm reduction and false alarm minimization.

3.1 IDS Classification

An IDS may be classified according to the method used to detect an intrusion and the data source monitored. According to the method used (AXELSSON, 2000; FEITOSA, EDUARDO LUZEIRO, 2010), IDS are traditionally classified as:

• Signature-based (or misuse-based) – known attacks are described as signatures, or rules;
• Anomaly-based – deviations from what is considered normal behavior are classified as malicious.

The former approach considers everything that is known, i.e. described in rules, as malicious, while the latter considers the unknown as malicious. Moreover, signatures describe known attacks but new attacks can go unnoticed, while anomalies may indicate new attacks but new normal behavior can be mistaken for malicious behavior.

According to the data source, (MUKHERJEE; HEBERLEIN; LEVITT, 1994) defined IDS as:

• Host-based IDS (HIDS) – Monitors the host’s operating system parameters and audit trails to detect malicious behavior. Log files, process behavior and file system changes may also be monitored.
• Network-based IDS (NIDS) – Monitors network traffic to detect malicious behavior. A NIDS may be deployed as a passive monitor, collecting traffic from a switch mirror port or a network tap, or deployed as a bridge with the capacity to block malicious traffic. According to (CHRUN; CUKIER; SNEERINGER, 2008), when a NIDS is able to block traffic, it is called an Intrusion Prevention System (IPS).

A HIDS can identify a malicious process or binary file, and even evidence of a network attack found in audit trails, but if the host is successfully compromised an attacker can shut the HIDS process down and/or use rootkit techniques to remain concealed.


A NIDS can detect the host where the malicious traffic came from, but cannot identify the malicious process; however, if a host is compromised, the NIDS is not affected.

In (VIGNA et al., 2003) a new classification is proposed, application-based intrusion detection, which is tightly coupled with an application server, or web server, and in which requests are analyzed before being processed.

This dissertation is focused on signature-based NIDS because it has a lower false positive rate than anomaly-based detection (MUKHERJEE; HEBERLEIN; LEVITT, 1994) and because malware detection through traffic analysis is discussed in Chapter 2 as a possible solution to the problem of malware detection.
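The difference between the signature-based and anomaly-based methods described in this section can be illustrated with a minimal sketch. The signature strings, the function names and the three-standard-deviation threshold are assumptions chosen for illustration only; real IDS engines use far richer rule languages and behavior models.

```python
import statistics

# Hypothetical signatures: substrings that describe known attacks.
SIGNATURES = {
    "path-traversal": "../..",
    "sql-injection": "' OR '1'='1",
}


def signature_alerts(payload):
    """Signature-based detection: report only what matches a known rule."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]


def is_anomalous(value, baseline, threshold=3.0):
    """Anomaly-based detection: report values far from the learned baseline.

    `baseline` holds observations considered normal; a value more than
    `threshold` standard deviations away from their mean is reported.
    """
    mean = statistics.mean(baseline)
    deviation = statistics.pstdev(baseline)
    if deviation == 0:
        return value != mean
    return abs(value - mean) / deviation > threshold


# A known attack matches a rule; an unknown request raises no alert.
print(signature_alerts("GET /../../etc/passwd"))
print(signature_alerts("GET /index.html"))

# Requests per minute considered normal, then two new observations.
baseline = [100, 110, 95, 105, 90]
print(is_anomalous(1000, baseline), is_anomalous(130, baseline))
```

The value 130 may be nothing more than new legitimate behavior, yet it is reported, which is the false positive risk of anomaly-based detection mentioned above; conversely, an attack with no matching signature passes the first detector silently.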

3.2 Problems with DARPA Dataset

Given the research effort to minimize the false positive rate in IDS, as discussed in Section 3.3, research efforts have also been conducted to evaluate the performance of IDS in terms of detection rate and false positive rate (TJHAI et al., 2008). In 1998 DARPA recognized the need to provide a common dataset to allow comparisons between different IDS methods. Thus, MIT’s Lincoln Labs was contracted to work with the Air Force Research Laboratory in Rome, NY to build an evaluation dataset and perform an evaluation of the IDS research then being funded by DARPA (BRUGGER; CHOW, 2005). Since then, the DARPA dataset has kept the status of default dataset for comparing the performance of a new IDS strategy with previous research.

However, several criticisms have been raised, indicating flaws in the way the dataset was created and statistical problems which might make the results obtained from experiments with the DARPA dataset unrealistic:

• Statistics used to describe the real traffic and the measures used to establish similarity are not given (MCHUGH, 2000);
• The taxonomy used in the Lincoln Lab evaluation offers very little support for developing an understanding of intrusions and their detection (MCHUGH, 2000);
• Hostile IP packets have a TTL value which is lower by 1 than that of the background traffic (MAHONEY; CHAN, 2003);
• Several attacks can be detected by anomalies in the TCP window size field, without a reasonable explanation for why these anomalies should occur (MAHONEY; CHAN, 2003);
• Only 9 of the possible 256 TTL values were observed in DARPA, while 177 different values were observed in real traffic. For TOS, 4 values were observed in DARPA while 44 values were observed in real traffic (MAHONEY; CHAN, 2003);
• No fragmented traffic was found in the DARPA dataset, and the DF (Don’t Fragment) flag was set in all traffic (MAHONEY; CHAN, 2003);
• Only HTTP GET requests were observed in the DARPA dataset (MAHONEY; CHAN, 2003);
• The majority of malicious connections in the DARPA dataset come from denial of service attacks and probing activity (BRUGGER; CHOW, 2005).

3.3 False Alarm Generation

The major drawbacks identified in IDS research are the alert volume and the false alarm rate (JULISCH, KLAUS, 2003; PIETRASZEK; TANNER, 2005b). In fact, it has been estimated that 99% of alerts are not related to security issues (AXELSSON, 2000). According to (AXELSSON, 2000), research in process automation indicates that a human operator will completely lose faith in a device whose false alert rate reaches 50%.

(AXELSSON, 2000) also proposed that the effectiveness of an IDS is affected by the Bayesian base-rate fallacy. Let $I$ and $\neg I$ denote intrusive and nonintrusive behavior, respectively, and $A$ and $\neg A$ denote the presence or absence of an intrusion alert. Given the conditional probability:

$$P(A \mid B) = \frac{P(A) \cdot P(B \mid A)}{P(B)} \qquad (3)$$

the four possible cases are:

• True positive rate, or detection rate, is the probability $P(A \mid I)$;
• False positive rate, or false alarm rate, is the probability $P(A \mid \neg I)$;
• True negative rate is the probability $P(\neg A \mid \neg I)$;
• False negative rate is the probability $P(\neg A \mid I)$.
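A minimal sketch of how equation (3) turns these rates into the share of alerts that are false, i.e. the probability of no intrusion given an alert. The function name and parameters are illustrative assumptions; the input values are the scenario examined in this section (1,000,000 analyzed packets, 20 of them intrusions, and a false positive rate of 10^-5).

```python
def false_alert_fraction(total_events, intrusions, detection_rate, false_alarm_rate):
    """Return the share of alerts that are raised on non-intrusive events."""
    benign = total_events - intrusions
    true_alerts = intrusions * detection_rate    # P(A|I) times the number of intrusions
    false_alerts = benign * false_alarm_rate     # P(A|not I) times the number of benign events
    return false_alerts / (true_alerts + false_alerts)


# Perfect detection rate versus a detection rate of 0.7, same false alarm rate.
perfect = false_alert_fraction(1_000_000, 20, 1.0, 1e-5)
realistic = false_alert_fraction(1_000_000, 20, 0.7, 1e-5)
print(f"{perfect:.0%} {realistic:.0%}")
```

Because benign events outnumber intrusions by several orders of magnitude, about 10 false alerts compete with at most 20 true ones, so even a very small false alarm rate yields a large share of false alerts.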


Assuming that 1,000,000 packets were analyzed and only 20 were intrusions, even with a perfect detection rate of 1.0 and a very low false positive rate on the order of $10^{-5}$, 33% of alerts will be false positives, since roughly 10 false alerts compete with only 20 true ones. With a more realistic detection rate of 0.7, 42% of alerts will be false positives. This shows that building an IDS with a low false positive rate is, according to (PIETRASZEK; TANNER, 2005b), extremely difficult.

(HUBBALLI; SURYANARAYANAN, 2014) presented general reasons for false positive generation:

• Intrusion activity sometimes deviates only slightly from normal activity, and some cases are difficult to differentiate.
• The context in which a particular event happened often decides the usefulness of the alert. For example, the “Microsoft Distributed Transaction (MDT)” service was vulnerable to large packets, which caused a buffer overflow and triggered a denial of service of the MDT service. However, this vulnerability was exploitable only on Windows 2000 systems that had not been updated with the latest patches.
• Certain actions which are normal may be malicious under different prevailing circumstances. For example, a network scan is normal if done by a security administrator.
• Many IDS detect not only intrusions but also the number of attempts of intrusions. An attempt may not necessarily lead to a compromised system if the vulnerability does not exist or was corrected.
• An alarm may represent a stage in a multistage attack which may eventually fail for various other reasons.

With regard to signature-based IDS, (HUBBALLI; SURYANARAYANAN, 2014) also presented the following reasons for false positives:

• Good quality signatures are often difficult to write and depend heavily on expert knowledge. An attack may have several variations; if a signature fails to match a specific attack, it is a false negative, and if it matches non-intrusive behavior, it is a false positive. As new flaws and vulnerabilities are discovered, an expert has to understand the behavior of the flaw and needs sufficient data to analyze. Moreover, two conditions may affect the signature quality:
  o Analyzing an irrelevant portion of the related traffic;
  o Analyzing the wrong application data when looking for a match.
• The default signatures supplied with most IDS are not customized to the local network, and a signature which does not represent a threat to the organization, such as one for an attack aiming to exploit unavailable services or operating systems, has to be disabled. This demands expert and infrastructure knowledge.
• Latency in the deployment of newly created signatures. The signature database has to be updated regularly; if it is not, poor quality signatures will not be replaced by better ones.

Several false alert minimization techniques were surveyed in (HUBBALLI; SURYANARAYANAN, 2014). Following the taxonomy proposed there, the most relevant and recent ones related to this dissertation are presented in the following subsections.

3.3.1 Signature enhancement

Signature enhancement methods enhance regular signatures with context information. (SOMMER; PAXSON, 2003) and (MASSICOTTE et al., 2007) proposed signature models with context information, such as the type of the host’s operating system stack. Both obtained satisfactory results with a low false positive rate; however, signature modification is error prone and requires knowledge and experience, and the experiments were carried out with traffic from academic internet links.

3.3.2 Stateful signatures

A stateful IDS stores the state of the network, or previous packet information, while evaluating a newly arrived packet; in other words, a stateful signature is applied to a full stream of packets instead of a single packet.

In (ECKMANN; VIGNA; KEMMERER, 2002), the attack language STATL provides a high-level specification that allows multistep attacks and scenarios to be modeled using a state transition model which represents the evolution of an attack’s steps. Experiments demonstrated effectiveness using the DARPA dataset, but DARPA has several statistical problems, as discussed in Section 3.2.
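The idea of a state transition model for a multistep attack can be illustrated with a minimal sketch. This is not STATL syntax; the three-step scenario, the event names and the function name are hypothetical and serve only to show how state kept per source lets a signature span a stream of events.

```python
# Hypothetical three-step scenario; the event names are illustrative only.
TRANSITIONS = {
    ("start", "port_scan"): "scanned",
    ("scanned", "exploit_attempt"): "exploited",
    ("exploited", "outbound_shell"): "compromised",
}


def track_scenarios(events):
    """Follow each source through the scenario and return those that complete it.

    `events` is an ordered iterable of (source, event_name) pairs. State is
    kept per source, so no single event is ever judged in isolation.
    """
    state = {}
    completed = []
    for source, event_name in events:
        current = state.get(source, "start")
        following = TRANSITIONS.get((current, event_name))
        if following is None:
            continue  # the event does not advance this source's scenario
        state[source] = following
        if following == "compromised":
            completed.append(source)
    return completed


stream = [
    ("h1", "port_scan"), ("h2", "exploit_attempt"), ("h1", "exploit_attempt"),
    ("h2", "outbound_shell"), ("h1", "outbound_shell"),
]
print(track_scenarios(stream))
```

Source h2 produces two of the three events but never starts the scenario with a scan, so only h1 is reported; a stateless rule on any single event would have flagged both.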


Vigna et al. (VIGNA et al., 2003) extended STATL and proposed an attack language for use with application-based IDS. The experiments evaluated the throughput and response time, but no evaluation of effectiveness or false alarm rate was presented. Moreover, the model is specific to detecting attacks on web servers.

3.3.3 Vulnerability signatures

Signature-based IDS work by either string matching or regular expression matching, mechanisms which fail to detect new obfuscation techniques used by attackers. In order to detect these sophisticated attacks, application semantics and protocol awareness are used in the form of vulnerability signatures.

Wang et al. (WANG, HELEN J. et al., 2004) use the characteristics of a vulnerability to generate a signature for all variations of exploits of that vulnerability. The method is able to recognize and filter only the traffic that exploits a specific vulnerability, but the experiments tested vulnerabilities from a single network service and are not enough to prove that it is free from false positives. Moreover, it is implemented in the host’s network stack and therefore has to be deployed on the hosts.

Brumley et al. (BRUMLEY; NEWSOME; SONG, 2006) proposed a formal definition of a vulnerability and investigated the computational complexity of creating and matching vulnerability signatures. The experiments demonstrated the capacity to generate a vulnerability signature using a single exploit; however, only two vulnerabilities were tested.

Li et al. (LI, ZHICHUN et al., 2010) proposed an implementation of a vulnerability signature-based IDS which achieves multi-gigabit throughput. The experiments demonstrated performance, but used the DARPA dataset (see Section 3.2) and traffic from an academic internet link.

3.3.4 Alarm mining

Alerts contain information, such as IP addresses, port numbers and attack classes, that is useful to characterize real attacks and also to distinguish true positives (TPs) from false positives (FPs). Data mining techniques use these attributes to learn from and mine a set of given alarms, summarizing them into either TPs or FPs, and to classify future alarms.


There are many prominent techniques within the data mining domain; however, they generally fall into clustering, classification, neural network based and frequent pattern mining models.

3.3.4.1 Clustering

A clustering technique takes a set of unlabeled alerts and creates a set of clusters of similar alerts. Later, meaning is assigned to these clusters as false or true positives, so that all alarms in the same cluster are either FPs or TPs.

In a series of papers published by researchers from the IBM Zurich Research Laboratory (JULISCH, K; DACIER, 2002; JULISCH, K, 2001; JULISCH, KLAUS, 2001, 2002, 2003), clustering algorithms are used to group IDS alarms. The main idea is that FPs and TPs have root causes, which can be malicious or not and are the source of the alerts. Two different data mining algorithms, namely episode mining and clustering, are used in the experiments. The former mines sets of episodes and identifies which ones correspond to FPs and TPs; the latter uses a modified version of the classical Attribute Oriented Induction (AOI) algorithm to generalize the alarms and treats the generalized alerts as clusters. Experiments revealed a good reduction in the number of alerts; however, they were carried out with the DARPA dataset (see Section 3.2) and with an unidentified real dataset about which no details were presented, and AOI generalization requires previous knowledge of the network infrastructure.

(AL-MAMORY; ZHANG, 2008b) proposed improvements to the AOI algorithm, but the experiments were also based on the DARPA data and on an unidentified real dataset.
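The generalization step behind AOI-style alarm clustering can be sketched in a heavily simplified form. The sketch assumes a single generalization hierarchy (IPv4 address to progressively shorter prefixes) and a minimum cluster size; it is not the modified AOI algorithm of the cited papers, whose hierarchies are defined from knowledge of the network infrastructure.

```python
from collections import Counter


def generalize_ip(address, prefix_octets):
    """Replace the trailing octets of an IPv4 address with a wildcard."""
    octets = address.split(".")
    return ".".join(octets[:prefix_octets] + ["*"] * (4 - prefix_octets))


def cluster_alarms(alarms, min_size):
    """Generalize source addresses step by step until one cluster is large enough.

    `alarms` is a list of (source_ip, signature) tuples. Returns the first
    generalized alarm that covers at least `min_size` alarms together with
    its size, or None when no generalization level reaches that size.
    """
    if not alarms:
        return None
    for prefix_octets in (4, 3, 2, 1, 0):
        counts = Counter(
            (generalize_ip(source, prefix_octets), signature)
            for source, signature in alarms
        )
        pattern, size = counts.most_common(1)[0]
        if size >= min_size:
            return pattern, size
    return None


# Hypothetical alarms: three pings from one subnet and one unrelated alarm.
alarms = [
    ("10.0.1.5", "ICMP ping"), ("10.0.1.9", "ICMP ping"),
    ("10.0.1.77", "ICMP ping"), ("192.168.3.4", "SQL injection"),
]
print(cluster_alarms(alarms, min_size=3))
```

The generalized alarm summarizes three alerts in one line, which an analyst could then trace to a single root cause, for instance a monitoring host in that subnet.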

3.3.4.2 Classification

This method assumes a set of alerts labeled by an expert as TP or FP, which is used to train a classification algorithm. Once trained, the classifier can be used to classify future alarms.

(PIETRASZEK; TANNER, 2005a) proposes an alert classifier to classify alarms at run time. The proposed model combines two approaches: one part performs data mining on historical alarms and passes the knowledge learnt to a human expert, and the other part tunes the signature engine to reflect the learnt behavior. The scheme uses various network entities to build the classifier, and these network entities are represented as a topology tree, much like a decision tree. Alarms are clustered based on the similarity of their attributes, such as IP address and port numbers, after traversing the topology tree. The experiments validated the method with considerable results, and a large real dataset was used, but no details about the characteristics of the dataset were presented, and background knowledge about the network infrastructure is needed.

(PARIKH; CHEN, 2008) describes a hierarchical classifier ensemble method, which combines information from multiple sources and uses a cost minimization strategy, in a post-training step, to validate correct classifications.

As depicted in Figure 21, let the different sources of information (feature sets) available be denoted as $FS_k$, $k = 1, 2, \ldots, K$, where $K$ is the total number of data sources. An ensemble of classifiers is trained for each individual feature set and then combined, i.e. a binary classification system is trained for each class, FP and TP. When new alerts arrive, each classification system labels the new alert and the cost minimization strategy elects the final label, TP or FP.

Figure 21 Layout of the proposed classification system in (PARIKH; CHEN, 2008).

Several classifiers were tested and significant results were presented, but the dataset used was DARPA (see Section 3.2), and expert knowledge is required to classify the training dataset.


3.3.4.3 Neural network approach

An Artificial Neural Network (ANN) (HAYKIN, 1998) is an information processing method motivated by research on biological systems. It consists of a set of interconnected units, called neurons, arranged in a given topology; each neuron computes a weighted sum of several inputs, according to a set of weights, and then applies a function to obtain an output value. ANNs are used in: (1) supervised learning, where a set of labeled objects is used in a training phase to adapt the weights of all interconnections, and further unlabeled objects are then presented to the ANN, which classifies and labels them according to the weights “learned”; (2) unsupervised learning, where the weights and/or network topology adapt autonomously in the training phase.

(THOMAS; BALAKRISHNAN, 2008) proposed a neural network based alarm classification technique to classify each alarm as either TP or FP. The experiments demonstrated considerable results; however, previous knowledge is necessary to train the ANN, and the datasets used were a real one, from an academic internet link, and DARPA’s (see Section 3.2). No further information about the real dataset is presented.
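The computation performed by a single neuron, as described at the start of this section, can be sketched as follows. The sigmoid activation and the bias term are assumptions chosen for illustration; the text only states that a function is applied to the weighted sum.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs followed by a sigmoid activation (assumed)."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))


# Hypothetical alarm features and weights; training would adapt the weights.
features = [1.0, 2.0]
weights = [0.5, -0.25]
print(neuron_output(features, weights))
```

In supervised learning, the training phase would adjust `weights` so that outputs close to 1 correspond to one label (for instance TP) and outputs close to 0 to the other.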

3.3.4.4 Frequent pattern mining

Frequent pattern mining is a technique to identify frequent item sets in a given transaction database, i.e. IDS alarms are transactions and frequent alarm combina- tions indicate a sequence which is repeating. (HUBBALLI; SURYANARAYANAN, 2014) argues that repeating patterns might indicate the actions an intruder has tried before penetrating into the target host; however, as discussed and demonstrated in Chapter 4, frequent item sets may represent either FP or TP. (SOLEIMANI; GHORBANI, 2008) proposes a real time alarm classification scheme based on frequent structured patterns. Alerts are transformed into graphs rep- resenting the connectivity relationship between them, and then a frequent mining al- gorithm mines the frequent episodes, Figure 22 presents an example of a multistep attack, specifically a Distributed Denial of Service (DDOS) attack. In a next step, a security administrator will identify the most critical episodes, considering: source and destination IP address, attack type, destination port number, priority of attack and se- quence of alerts; therefore, this knowledge will be used to build a Decision Tree which


will classify further alerts. The experiments demonstrated a 90% reduction; however, previous knowledge is required to build the decision tree, the dataset used was DARPA's, see Section 3.2, and only DDOS attacks were evaluated.

Figure 22 A sample multi-step-attack (SOLEIMANI; GHORBANI, 2008)

In (SADODDIN; GHORBANI, 2009), the previous method was improved to satisfy real-time constraints but, similarly, only DARPA's dataset and artificial data were used to evaluate its effectiveness and performance.

3.3.5 Alarm correlation

According to (HUBBALLI; SURYANARAYANAN, 2014), the main objective of alarm correlation is to construct attack scenarios in order to identify the root cause of the alerts, basically by grouping alerts from distinct locations in the network and from different IDS engines.

(HUBBALLI; SURYANARAYANAN, 2014) also argue that there is no consensus about the definition of alert correlation in the IDS community. The term correlation has been used to mean:
• Correlation of alarms from a single IDS;
• Correlation of alarms from multiple IDS;
• Correlation of alarms from the same type of IDS;
• Correlation of alarms from different types of IDS;
• Correlation of IDS alarms with alarms from other security components (e.g. Netflow).

Therefore, (HUBBALLI; SURYANARAYANAN, 2014) proposed a definition for correlation: “Correlation can be seen as black box which receives a bunch of alarms possibly generated by different IDS and other security components and generates a condensed view of alarm for system administrator”. A generic view of alarm correlation is shown in Figure 23, and the presented steps are:


• Alert Normalization: Alerts from heterogeneous IDS deployed in the network are transformed to a common format, to keep consistency. The Internet Engineering Task Force (IETF) has proposed the Intrusion Detection Message Exchange Format (IDMEF) as a standard for alert message exchange (FEITOSA, EDUARDO LUZEIRO, 2010);
• Alarm Clustering: The action of grouping alarms possibly generated by different sensors. Common attributes, like source IP address, destination IP address, source port, destination port, attack name, service and user, are considered to measure the similarity of alarms. Since many sensors may have captured the same anomalous behavior, this step plays a significant role in reducing the volume of alarms. Some authors name this step alarm aggregation and others call it alarm fusion;
• Alarm Correlation: Analyzes the clusters of alarms and provides an output by merging clusters which are related;
• Intention Recognition: Identifies the attacker's plan;
• Report: Generates a condensed view of attack scenarios for the administrator.

Figure 23 Generic view of alarm correlation according to (HUBBALLI; SURYANARAYANAN, 2014).

Alarm correlation techniques can be further grouped under the following classes:
• Multi-step correlation;
• Knowledge-based correlation;
• Complementary evidence based correlation;
• Causal relation based correlation;
• Fusion based correlation;
• Attack graphs based correlation;
• Rule-based correlation.

3.3.5.1 Multi-step correlation

These schemes are based on the assumption that an attacker follows a sequence of actions, which may trigger alerts, before breaking into the system, and that earlier steps are prerequisites for later steps in the overall attack.

Zhou et al. (ZHOU, JINGMIN et al., 2007) extended the requires/provides model introduced in (TEMPLETON; LEVITT, 2000), which states that, in a multistage intrusion comprised of a sequence of attacks, the early attacks acquire certain advantages, e.g., information about the system under attack and the ability to perform actions on it, and use them to support the later attacks that require them. The experiments successfully captured scenarios and correlated alerts with a real dataset, from traffic collected in four honeynet machines (HONEYNET, 2014); however, without background traffic from local network services it is not possible to evaluate the effectiveness and the false alert rate.

Al-Mamory and Zhang (AL-MAMORY; ZHANG, 2008a) proposed an alert post-processing and correlation method for the detection of multi-step intrusions, called Alerts Parser, which treats alerts as tokens and uses a modified version of the Left-to-right (LR) parser algorithm to generate parse trees representing the scenarios in the alerts. An attribute context-free grammar (ACF-grammar) is used for representing the multi-step attacks. Experiments conducted with DARPA and DEFCON 8 CTF demonstrated the correctness of the method, i.e. its capacity to construct scenarios; however, the effectiveness test was inconclusive.

3.3.5.2 Causal relation based correlation

The objective of causal relationship discovery is to test and identify causal relationships among the variables under study. Given two or more random variables, it identifies how the variables are related to each other.


The Bayesian network is a form of causal relationship description method which is usually shown as a Directed Acyclic Graph (DAG). Each node in the graph represents a variable and an edge between any two nodes describes the dependency that exists between the variables. In our case, nodes represent alarms and edges represent relationships. By analyzing a set of alarms, this correlation creates a DAG without any other input.

Viinikka et al. (VIINIKKA et al., 2009) aggregate individual alerts into alert flows, and then process the flows instead of individual alerts. The relevancy of an individual alert is often indeterminable, but irrelevant alerts and interesting phenomena can be identified at the flow level. The proposed method can model regularities in flows consisting of alerts related to normal system behavior, and the last n sample measurements are used to predict the current measurement. This moving history time series modeling essentially allows the model to adapt to changes in the normal behavior. The experiments were conducted in a real intranet environment whose characteristics were presented, no previous knowledge was necessary, and the results demonstrated that the method can detect anomalies in alert flows; moreover, regularities and smooth changes in the alert intensity are considered echoes of normal system use, while anomalies are events that are worth further investigation.

Ren et al. (REN; STAKHANOVA; GHORBANI, 2010) proposed an automated adaptive approach for online correlation of intrusion alerts in two stages. In the first online stage, a Bayesian network is employed to automatically extract information about the constraints and causal relationships among alerts. Based on the extracted information, attack scenarios are reconstructed on-the-fly, providing the network administrator with the current network view and predicting the next potential steps of the attacker.

The experiments, conducted with a DARPA dataset and alerts from a honeynet (HONEYNET, 2014), demonstrated a low false positive rate; however, the DARPA dataset has well-known problems and traffic from a honeynet lacks the regular traffic found in an intranet.

3.3.5.3 Attack graphs based correlation

Attack graph based correlation techniques assume that a vulnerability in a host reveals little information if studied in isolation. If a particular host has a known low-impact vulnerability, the corresponding alarm may be rated as a low priority alarm. On the other hand, attackers may first break into such a host and use it as a hop to reach the most critical systems and servers in the network.

According to (PAO et al., 2012), an attack graph identifies a set of possible paths that an attacker can take to compromise and affect critical systems. If the IDS is deployed to detect any misbehavior covering all those paths, the administrator is more likely to know about the incident. Moreover, attack graphs show the dependencies that exist for compromising a host and also the interconnections between hosts in a network. Figure 24 presents a generic view of attack graphs. The x axis indicates time and the connections between alarms indicate their relation. There can be multiple paths between the first and last alarm in a scenario. The experiments were conducted with a proprietary dataset and yielded satisfactory results, but no information about the dataset characteristics was presented. Moreover, the dataset is labeled, which in a real environment requires expert knowledge.

Figure 24 Generic view of graph ordering (PAO et al., 2012).

3.3.6 Alarm verification

Alarm verification implements a mechanism that verifies whether an attempted attack was successful or impacted the targeted network. The verification can be active or passive. The former verifies online, immediately after the alert is raised, and the latter queries a database of possible success cases.

Bolzoni et al. (BOLZONI; CRISPO; ETALLE, 2007) proposed an engine, ATLANTIDES, to correlate alerts with an Output Anomaly Detector (OAD). The main idea is that a successful attack often causes an anomaly in the output of the service, thus modifying the normal output. Detecting this anomaly can help in reducing false alerts. The time window used to look for the correlation is critical for the correctness of the scheme: a very small time window may lead to missed attacks, while a large time window may increase false positives. Further, in some scenarios no response is seen. For example, a successful SQL Injection attack against a web application often causes the output of SQL table content, whereas a DOS attack leads the target host to shut down and no response is sent to the attacker. Figure 25 presents the proposed architecture and depicts the ATLANTIDES steps: (1) an alert is generated by a NIDS, (2) the OAD verifies the output generated and (3) if an anomaly is detected the alert is a TP, otherwise it is a FP. ATLANTIDES was evaluated on an academic internet link, focused on HTTP traffic, and with the DARPA dataset. The results were satisfactory; however, the tests with real data were too specific to generalize, whereas the DARPA dataset has well-known problems.

Figure 25 ATLANTIDES architecture (BOLZONI; CRISPO; ETALLE, 2007)

3.3.7 Hybrid methods

Hubballi and Suryanarayanan (HUBBALLI; SURYANARAYANAN, 2014) argue that false alarm reduction is essentially a subset selection problem. Given a set of alarms generated by an IDS, the minimization method should select the subset which has more impact on the target network, i.e. the effective alarms, and discard the remaining, ineffective ones. However, the methods proposed in the literature do not perform well in a dynamic network environment; moreover, data mining techniques are not always applicable to filter the false alarms.

Thus there is a need for a hybrid approach which can combine the best of filtering based schemes and data mining schemes to reduce false alarms. Hubballi et al. (HUBBALLI; BISWAS; NANDI, 2011) proposed a scheme for FP reduction, where a threat profile of the network is created and alerts are correlated using an Artificial Neural Network (ANN). The proposed method has the following two steps:
• Threat profile generation of the local network;
• Correlation of IDS alarms with the network threat profile.

Figure 26 presents the architecture proposed by Hubballi et al. If an alert has a corresponding vulnerability in the threat profile, the alert is labeled as effective; otherwise it is labeled ineffective. The labeled alerts are used to train an ANN, and further alerts are classified accordingly. A new training phase is necessary whenever the Threat Profile changes, i.e. for each update.

Figure 26 Proposed Architecture (HUBBALLI; BISWAS; NANDI, 2011).

Experiments were conducted in a controlled environment and the results were significant; however, the absence of real background traffic invalidates them. Moreover, the method requires previous knowledge to build the threat profile.


3.4 Chapter Summary

In this chapter the most recent relevant research on false alert reduction was presented, together with the way the experiments were conducted. Most of the methods with satisfactory results in a real environment are too specific to generalize or lack information about the characteristics of the traffic or background noise. Table 3.1 summarizes the presented methods and compares them according to the requirement of previous knowledge and the quality of the datasets used.

Table 3.1 Methods comparison

Reference | Type | Previous knowledge? | Tested with DARPA? | Real data? | Observation about experiments
(SOMMER; PAXSON, 2003) | Signature enhancement | Yes | No | Yes | Traffic from academic links
(MASSICOTTE et al., 2007) | Signature enhancement | Yes | No | Yes | Traffic from academic links
(ECKMANN; VIGNA; KEMMERER, 2002) | Stateful signatures | No | Yes | No | Experiments conducted only with the DARPA dataset
(VIGNA et al., 2003) | Stateful signatures | Yes | No | Yes | Specific to detecting attacks on web servers
(WANG, HELEN J. et al., 2004) | Vulnerability signatures | Yes | No | Yes | Experiments tested vulnerabilities from a single network service
(BRUMLEY; NEWSOME; SONG, 2006) | Vulnerability signatures | Yes | No | Yes | Only two vulnerabilities were tested
(LI, ZHICHUN et al., 2010) | Vulnerability signatures | Yes | Yes | Yes | Traffic from academic link and experiments focused on throughput
(JULISCH, K; DACIER, 2002; JULISCH, K, 2001; JULISCH, KLAUS, 2001, 2002, 2003) | Alarm mining (Clustering) | Yes | Yes | Yes | No details about the characteristics of the real dataset were presented
(AL-MAMORY; ZHANG, 2008b) | Alarm mining (Clustering) | Yes | Yes | Yes | No details about the characteristics of the real dataset were presented
(PIETRASZEK; TANNER, 2005a) | Alarm mining (Classification) | Yes | Yes | Yes | No details about the characteristics of the dataset were presented
(PARIKH; CHEN, 2008) | Alarm mining (Classification) | Yes | Yes | No | Well-known problems in DARPA and absence of real background traffic
(THOMAS; BALAKRISHNAN, 2008) | Alarm mining (ANN) | Yes | Yes | Yes | No information about the characteristics of the real traffic
(SOLEIMANI; GHORBANI, 2008) | Alarm mining (Frequent pattern mining) | Yes | Yes | No | Only DDOS attacks were evaluated
(SADODDIN; GHORBANI, 2009) | Alarm mining (Frequent pattern mining) | Yes | Yes | No | Artificial data were used
(ZHOU, JINGMIN et al., 2007) | Alert correlation (Multi-step) | No | No | Yes | Traffic from honeynet without real background traffic
(AL-MAMORY; ZHANG, 2008a) | Alert correlation (Multi-step) | No | Yes | Yes | Demonstrates capacity to construct scenarios, but effectiveness tests were inconclusive
(VIINIKKA et al., 2009) | Alert correlation (Causal relation) | No | No | Yes | Real intranet traffic, but without results to demonstrate effective reduction
(REN; STAKHANOVA; GHORBANI, 2010) | Alert correlation (Causal relation) | No | Yes | Yes | Traffic from a honeynet lacks the regular traffic found in a real intranet environment
(PAO et al., 2012) | Alert correlation (Attack graphs) | Yes | No | Yes | Proprietary dataset without information about its characteristics
(BOLZONI; CRISPO; ETALLE, 2007) | Alert correlation (Alarm verification) | No | Yes | Yes | Traffic from academic internet link, focused on HTTP traffic. Too specific to generalize
(HUBBALLI; BISWAS; NANDI, 2011) | Hybrid | Yes | No | No | The traffic was generated in a controlled environment, without real background traffic


Chapter 4 ARCA Framework

In this chapter the fundamental concepts of ARCA are presented along with the architectural design, implementation details and a discussion of the experimentation.

As discussed in the previous chapters of this dissertation, malware detection through traffic analysis has considerable advantages and an Intrusion Detection System (IDS) plays an important role in detecting malicious traffic. An IDS can detect malicious traffic within an internal network, according to the following stages of malware behavior:
• Scan-based worms scan the whole network in order to find vulnerable hosts, or vulnerable network services, and launch exploits to compromise whatever hosts they find. It must be noted that neighbor discovery is an important spreading strategy in IPv6;
• The recruiting stage, or initial infection, in a bot life cycle has the same behavior as a scan-based worm: an infected host launches exploits directed at other vulnerable hosts. Moreover, a botmaster may command infected hosts to scan the network and launch attacks;
• In a similar way, a host infected with an APT may launch internal attacks to leverage its presence within an internal network and to compromise hosts with valuable data.

Security practitioners and researchers have proposed improvements to well-known IDS, such as Snort (SOURCEFIRE, 2013), in order to detect malware network behavior and fight malware propagation. However, as discussed earlier in this document, they have not considered the influence of the major, and also well-known, IDS drawback: the alarm rate, or specifically the false alarm rate. Moreover, the proposed schemes have been tested with traffic too particular to generalize or with biased, artificially generated datasets.

Given this scenario we propose ARCA (Alerts Root Cause Analysis Framework) in order to help security and network administrators to extract patterns from IDS alerts and identify the alerts' root causes, which may be malicious or erroneous. Moreover, the experiments to evaluate the proposed framework were carried out with a real dataset, from SERPRO (Serviço Federal de Processamento de Dados), a Brazilian Government Organization (SERPRO, 2013), and a public dataset available in (MACCDC, 2012).

This chapter presents ARCA's fundamental concepts in Section 4.1 and the complete description of ARCA in Section 4.2; the experimental results are discussed in Section 4.3.

4.1 Fundamental Concepts

4.1.1 Root Causes

The concept of root cause was first introduced in a series of papers published in the early 2000s as a result of research conducted at the IBM Zurich Research Laboratory (JULISCH, K; DACIER, 2002; JULISCH, K, 2001; JULISCH, KLAUS, 2001, 2002, 2003). A root cause is the reason for which an alert occurs and, according to observations in (JULISCH, K, 2001), a small number of root causes generally accounts for over 90% of all alerts. Moreover, according to (GHORBANI; SADODDIN, 2006), different alerts tend to have similar root causes or similar effects on network resources.

4.1.2 Relative Uncertainty Clustering

The concept of Relative Uncertainty (RU), inherited from the field of information theory, was first introduced as a mechanism to extract behavior patterns from network traffic in (XU; ZHANG; BHATTACHARYYA, 2005) and was extended to cluster IDS alerts in (FEITOSA, EDUARDO LUZEIRO, 2010; FEITOSA, EDUARDO; SOUTO; SADOK, 2012; LINS; FEITOSA; SADOK, 2011).


The RU quantifies the uncertainty contained in a random variable $X$ that may take $N_X$ discrete values. Suppose $X$ is randomly observed $m$ times, which induces an empirical probability distribution on $X$, $p(x_i) = m_i/m$, $x_i \in X$, where $m_i$ is the frequency, i.e. the number of times $X$ was observed with value $x_i$. The entropy of $X$ is then defined as:

$$H(X) := -\sum_{x_i \in X} p(x_i) \log p(x_i) \qquad (4)$$

where by convention $0 \log 0 = 0$. Since entropy measures the “observational variety” in the observed values of $X$, it holds that $0 \le H(X) \le H_{max}(X) := \log \min(N_X, m)$, where $H_{max}(X)$ is the maximum entropy of $X$, reached when $p(x_i) = 1/n$ with $n = \min(N_X, m)$. Assuming that $m \ge 2$ and $N_X \ge 2$, RU is a standardized entropy which provides an index of variety or uniformity regardless of the support or sample size, defined as:

$$RU(X) = \frac{H(X)}{H_{max}(X)} = \frac{H(X)}{\log \min\{N_X, m\}} \qquad (5)$$

If $RU(X) = 0$, all observations of $X$ are of the same kind, i.e. $p(x_i) = 1$ for some $x_i$, and the observational variety is completely absent. On the other hand, when $m \le N_X$, $RU(X) = 1$ if and only if $|A| = m$ and $p(x_i) = 1/m$ for each $x_i \in A$, where $A$ is the subset of observed values of $X$. In that case all observed values of $X$ are different or unique and the observations have the highest degree of uncertainty. If $m > N_X$, $RU(X) = 1$ if and only if $m_i = m/N_X$, thus $p(x_i) = 1/N_X$ for every $x_i \in A = X$, i.e. the observed values are uniformly distributed over $X$. In this case $RU(X)$ measures the degree of uniformity in the observed values of $X$.

As a general measure of uniformity in the observed values of $X$, the conditional entropy $H(X|A)$ and the relative uncertainty $RU(X|A)$ are considered by conditioning $X$ on $A$. Then $H(X|A) = H(X)$, $H_{max}(X|A) = \log |A|$ and $RU(X|A) = H(X)/\log |A|$. Hence $RU(X|A) = 1$ if and only if $p(x_i) = 1/|A|$ for every $x_i \in A$. In general, $RU(X|A) \approx 1$ means that the observed values of $X$ are closer to being uniformly distributed, thus less distinguishable from each other, whereas $RU(X|A) \ll 1$ indicates that the distribution is more skewed, with a few values more frequently observed. This measure of uniformity is used for defining “significant clusters of interest”.
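The definitions above can be illustrated with a minimal Python sketch. It is not the Java code used in ARCA; the function name `relative_uncertainty`, the optional `support_size` argument and the sample IP addresses are assumptions made only for illustration. When `support_size` is omitted, the support is taken to be the set of observed values, which corresponds to the conditional form RU(X|A).

```python
import math
from collections import Counter

def relative_uncertainty(observations, support_size=None):
    """Return RU(X) = H(X) / log(min(N_X, m)) for a list of observed values."""
    m = len(observations)
    counts = Counter(observations)            # m_i for each observed value x_i
    # Assumption: without an explicit N_X, use |A| (distinct observed values).
    n_x = support_size if support_size is not None else len(counts)
    n = min(n_x, m)
    if n < 2:
        return 0.0                            # one possible value: no variety
    entropy = -sum((c / m) * math.log(c / m) for c in counts.values())
    return entropy / math.log(n)

# Hypothetical source addresses: one dominant value versus an even spread.
skewed = ["10.0.0.1"] * 97 + ["10.0.0.2", "10.0.0.3", "10.0.0.4"]
uniform = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"] * 25

print(relative_uncertainty(skewed))    # well below 1: a few values dominate
print(relative_uncertainty(uniform))   # approximately 1: values indistinguishable
```

A skewed feature such as the first list is the kind of distribution from which significant values can be separated, whereas the second list offers no value that stands out.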


4.1.2.1 Extracting Significant Cluster

Algorithm 1 presents a simplified draft, in pseudo-code, of the algorithm used for extracting the set $S$ of significant clusters from $A$. Starting with $\alpha_0 = 2\%$, the algorithm searches for the optimal cut-off threshold $\alpha^*$ from above via “exponential approximation” (reducing the threshold $\alpha$ by an exponentially decreasing factor $1/2^k$, with $k$ constant). As long as the relative uncertainty of the (conditional) probability distribution $P_R$ on the (remaining) feature set $R$ is less than $\beta$, the algorithm examines each feature value in $R$ and includes those whose probabilities exceed the threshold $\alpha$ in the set of significant feature values. The algorithm stops when the probability distribution of the remaining feature values is close to uniform ($RU(P_R) > \beta := 0.9$).

Algorithm 1 Simplified significant cluster extraction algorithm

Input: α := α0; β := 0.9; S := ∅
S := ∅; R := A; k := 1
compute probability distribution P_R and its RU θ = RU(P_R)
while θ ≤ β do
    α := α × 2^(−k)
    for each a_i ∈ R do
        if P_A(a_i) ≥ α then
            S := S ∪ {a_i}; R := R − {a_i}
        end if
    end for
    compute (conditional) probability distribution P_R and its RU θ = RU(P_R)
end while
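Algorithm 1 can be sketched in Python as follows. This is an illustrative reading of the pseudo-code, not the Java implementation used in ARCA: the function names and the sample data are hypothetical, and treating a remaining set with fewer than two values as uniform (so that the loop terminates) is an assumption that the pseudo-code does not state.

```python
import math
from collections import Counter

def relative_uncertainty(counts):
    """RU of the distribution induced by a {value: count} mapping."""
    if len(counts) < 2:
        return 1.0   # assumption: nothing left to separate, so stop the loop
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))

def extract_significant(values, alpha0=0.02, beta=0.9, k=1):
    """Return (S, alpha): the significant feature values and the final cut-off."""
    counts = Counter(values)        # frequencies over the full set A
    m = len(values)
    significant = set()             # S
    remaining = dict(counts)        # R, keeping counts for the conditional P_R
    alpha = alpha0
    theta = relative_uncertainty(remaining)
    while theta <= beta:
        alpha *= 2 ** -k            # lower the threshold
        for value in list(remaining):
            if counts[value] / m >= alpha:      # P_A(a_i) >= alpha
                significant.add(value)
                del remaining[value]
        theta = relative_uncertainty(remaining)
    return significant, alpha

# Hypothetical alert classes: two dominant ones and 100 that occur once each.
classes = ["scan"] * 600 + ["exploit"] * 300 + ["rare%d" % i for i in range(100)]
print(extract_significant(classes))
```

For this sample the two dominant classes are extracted in the first iteration, after which the remaining 100 values are uniformly distributed and the loop stops.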

4.1.3 Frequent Itemset Mining

Apriori is an algorithm used to discover frequently occurring combinations of attribute/value pairs. An attribute/value pair is an item and a combination of items is an itemset. It uses an iterative approach known as level-wise search to find the most frequent itemsets, where k-itemsets are used to explore (k + 1)-itemsets. The database is scanned to accumulate the count for each itemset found, and the itemsets that satisfy minimum support are collected. The support measures the number of transactions in which an itemset occurs among all transactions evaluated (SINGHAL, 2007). According to (SINGHAL, 2007) and (LU; BOEDIHARDJO; MANALWAR, 2005), Apriori's association rules are an effective method to extract knowledge from intrusion alerts.

First, the set of frequent 1-itemsets is found by scanning the database to accumulate the count for each item, and collecting those items that satisfy minimum support. The resulting set is denoted by L1. Next, L1 is used to find L2, the set of frequent 2-itemsets, which is used to find L3, and so on, until no more frequent k-itemsets can be found. Finding each Lk requires one full scan of the database.

The Apriori property, namely that all nonempty subsets of a frequent itemset must also be frequent, is used to improve the efficiency of the level-wise generation. It is based on the observation that, by definition, if an itemset I does not satisfy the minimum support threshold, I is not frequent; adding an item to I cannot make it occur more often, so no superset of I is frequent either.

Once the frequent itemsets have been found, association rules are generated to provide probabilistic statements about how frequent itemsets are associated or correlated. If an association rule satisfies both minimum support and minimum confidence, it is called a strong rule (HAN, JIAWEI; KAMBER; PEI, 2011). Confidence is computed using (6), as follows:

$$confidence(A \Rightarrow B) = P(B|A) = \frac{support\_count(A \cup B)}{support\_count(A)} \qquad (6)$$

where $support\_count(A \cup B)$ is the number of transactions containing the itemset $A \cup B$, and $support\_count(A)$ is the number of transactions containing the itemset $A$. Based on this equation, association rules can be generated as follows:
• For each frequent itemset $l$, generate all nonempty subsets of $l$;
• For every nonempty subset $s$ of $l$, output the rule “$s \Rightarrow (l - s)$” if $support\_count(l)/support\_count(s) \ge min\_conf$, where $min\_conf$ is the minimum confidence threshold.
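The level-wise search and the rule generation described above can be sketched in a few lines of Python. This is a didactic sketch only, not the Weka implementation used by FIM; the function names `apriori` and `strong_rules`, the `attribute=value` item encoding and the sample alerts are assumptions introduced for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    min_count = min_support * len(transactions)

    def count(candidates):
        # One full scan of the database per level.
        counts = {c: sum(1 for t in transactions if c.issubset(t)) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_count}

    level = count({frozenset([i]) for t in transactions for i in t})   # L1
    frequent = dict(level)
    k = 2
    while level:
        prev = list(level)
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
        level = count(candidates)                                      # Lk
        frequent.update(level)
        k += 1
    return frequent

def strong_rules(frequent, min_conf):
    """Return (antecedent, consequent, confidence) for rules meeting min_conf."""
    rules = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for subset in combinations(itemset, size):
                s = frozenset(subset)
                confidence = support / frequent[s]     # equation (6)
                if confidence >= min_conf:
                    rules.append((s, itemset - s, confidence))
    return rules

# Hypothetical alerts encoded as sets of attribute=value items.
alerts = [
    {"src=10.0.0.5", "dst=10.0.0.9", "class=scan"},
    {"src=10.0.0.5", "dst=10.0.0.9", "class=scan"},
    {"src=10.0.0.5", "dst=10.0.0.7", "class=scan"},
    {"src=10.0.0.8", "dst=10.0.0.9", "class=dos"},
]
frequent = apriori(alerts, min_support=0.5)
for antecedent, consequent, confidence in strong_rules(frequent, min_conf=0.9):
    print(sorted(antecedent), "=>", sorted(consequent), confidence)
```

The lookup `frequent[s]` in `strong_rules` is safe precisely because of the Apriori property: every nonempty subset of a frequent itemset is itself frequent and therefore present in the dictionary.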


4.2 ARCA Architectural Design

In order to extract the statistically significant alerts from an IDS dataset and analyze their root causes, the Aggregation Module (AM) proposed in (LINS; FEITOSA; SADOK, 2011) was extended to combine its aggregation capabilities, the RU clustering presented in Section 4.1.2, with Frequent Itemset Mining, see Section 4.1.3.

The results presented in (LINS; FEITOSA; SADOK, 2011) show that the AM is capable of identifying the alerts whose features source address (SrcIP), destination address (DstIP) and attack class (Class) are the most statistically significant. This method was proposed to lower the alarm rate and help the security team to identify false positives.

However, tests carried out on both datasets, sections 0 and 4.4.3, demonstrated that even after a successful aggregation, both alert clusters, significant and insignificant, present a high alert volume and many combinations of IP addresses and attack classes. For example, during the first one-hour interval in SERPRO's alerts, from 08:00 to 09:00 am, 524392 alerts were generated, with 47 attack classes, 413 sources and 1491 destinations. After the aggregation, the significant cluster represented 94% of all alerts, with 13 alert classes and a considerable range of IP addresses, as presented in Figure 27.

For each class an intrusion analyst must evaluate whether all listed IP addresses have the same root cause, without discarding the hypothesis that a root cause may trigger one or more classes, and that a class may be related to one or more root causes. Even an experienced analyst will spend considerable effort reasoning about the possible root causes; moreover, knowledge of the network infrastructure and the related network protocols is imperative.


[Figure 27: bar chart comparing the SrcIp and DstIp series for each of the 13 significant classes, SIDs 19187, 2002911, 2101411, 2100538, 9119019, 9138005, 2100368, 2100366, 9119031, 9129005, 9129015, 2013504 and 9129012.]

Figure 27 Normalized SrcIp and DstIp quantities per significant class (SID). [Max(SrcIp), Min(SrcIp)]=[309,1] and [Max(DstIp), Min(DstIp)]=[542,2].

With ARCA a new strategy is proposed: a second layer of aggregation is added to mine the most frequent itemsets found in the significant cluster. To this end, the proposed architecture adds the Frequent Itemset Miner (FIM), which uses Apriori to mine association rules on the significant cluster. The elements of ARCA, depicted in Figure 28, are:


Figure 28 ARCA Architecture

• IDS. A network-based intrusion detection system (Snort) with all signatures and preprocessors enabled;
• Alert database. The alerts from the IDS are stored in a relational database, provided with Snort's source code;
• RUA. Relative Uncertainty Aggregator module;
• FIM. Frequent Itemset Miner module;
• Operator. The human operator evaluates the results from RUA and FIM, and generates the RCAR (Root Cause Association Rule).

ARCA also introduces a significant change in how the alert aggregation is executed. The proposed workflow is presented in Figure 29: after the Significant Cluster Extraction using RUA, the Pattern Recognition step using FIM is executed. In the next step, the Operator evaluates the results from both modules and generates a Root Cause Association Rule (RCAR).

An RCAR identifies groups of alerts that will be removed from the original dataset and serves as a starting point to identify the root cause. The experiments in Section 4.3 demonstrate that an RCAR allows identifying malicious or false positive root causes.

If the alert volume reduction after alert removal reaches an acceptable level, no more rules are created and the Security Administrator will have, at this moment, rules to guide the team on root cause identification. If the alert volume continues to be high, the procedure is repeated with the remaining alerts. In each iteration, one or more rules are generated and a volume reduction takes place.
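The removal step of this workflow can be sketched as follows. The concrete RCAR format is not specified in this section, so the sketch assumes, purely for illustration, that an RCAR is a mapping from alert fields to the values they must match; the helper names `matches`, `remove_matching` and `reduction`, as well as the sample alerts, are hypothetical.

```python
def matches(alert, rcar):
    """True if the alert carries every field value required by the rule."""
    return all(alert.get(field) == value for field, value in rcar.items())

def remove_matching(alerts, rcars):
    """Return the alerts that are not covered by any of the given rules."""
    return [a for a in alerts if not any(matches(a, r) for r in rcars)]

def reduction(original, remaining):
    """Fraction of the original alert volume removed so far."""
    return 1.0 - len(remaining) / len(original)

# Hypothetical dataset: six alerts share one cause, two are unrelated.
alerts = (
    [{"srcIP": "10.0.0.5", "dstIP": "10.0.0.9", "class": "scan"}] * 6
    + [{"srcIP": "10.0.0.8", "dstIP": "10.0.0.9", "class": "dos"},
       {"srcIP": "10.0.0.7", "dstIP": "10.0.0.2", "class": "scan"}]
)
rcars = [{"srcIP": "10.0.0.5", "class": "scan"}]   # written by the operator

remaining = remove_matching(alerts, rcars)
print(len(remaining), reduction(alerts, remaining))
```

In an iterative run, the operator would inspect `remaining`, decide whether the reduction is acceptable, and otherwise feed `remaining` back into the extraction and mining steps to derive further rules.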

Figure 29 ARCA Workflow

4.3 Implementation

This section describes how the two main components of ARCA were implemented. The method used to collect the alerts and run the algorithms is also described.

4.3.1 RUA – Relative Uncertainty Aggregator

The RU aggregation algorithm was imported from the AM (Aggregation Module) presented in (FEITOSA, EDUARDO LUZEIRO, 2010; LINS; FEITOSA; SADOK, 2011). The RUA component, see Figure 28, like the former AM, was developed in Java, version 1.7. All received alerts are stored in two data structures, named ATable and CTable.

ATable is an array data structure that stores source IP address (srcIP), source port, destination IP address (dstIP), destination port, class of attack (class) and timestamp. Originally the alert severity was also used, but ARCA requires no knowledge of the infrastructure, and evaluating the severity of an alert requires understanding which services are the most relevant and whether the host is vulnerable. Therefore, the alert's severity was discarded.

In addition, ATable also has three alert pointers (next srcIP, next dstIP, and next class) to link alerts sharing the same feature value in the given dimension. This removes the need to duplicate the alerts and then group each alert into three clusters along each dimension, and it is both more scalable and more efficient regarding memory cost, especially when dealing with hundreds of millions of alerts. For example, in Figure 30, the next srcIP pointer of Alert 1 links to Alert 4 since they share the same source IP 192.168.0.51. Similarly, the next dstIP pointer of Alert 2 links to Alert 3 since they share the same destination IP 150.161.192.11, and the next class pointer of Alert 3 links to Alert 4 since they share the same class.

Once all the received alerts are correctly stored in ATable, CTables are put in operation to reference the first occurrence of an alert, providing a way to quickly find the “old” alerts of the same clusters. This feature will be useful in future work, to couple ARCA with the solution proposed in (FEITOSA, EDUARDO LUZEIRO, 2010); more details are available in Section 5.4 (Future Work). Since there are three types of clusters, three instances of CTable were created for managing clusters along the three dimensions. In spite of their simple design, CTables are essential to the computation and extraction of the significant clusters.

Each CTable stores an alert counter, recording the number of occurrences of a given value, and an alert pointer, referencing its first occurrence. For example, when evaluating alert 1 in Figure 30, the given source IP address (150.161.192.51) is compared with the srcIP cluster and, as it is the first time that this IP address occurs, it is inserted into the CTable. The alert count field is incremented by 1 and an alert pointer is linked to this alert in ATable.

The same occurs with the other dimensions, dstIP and class. However, when evaluating the dstIP value (150.161.192.11) from alert 3, one finds the first alert (alert 2) of the dstIP cluster (index 1), and updates the next dstIP pointer of alert 2 to alert 3. Next, the alert count is incremented by 1.

When the CTables are filled (with the insertion of all ATable elements), the process of cluster extraction is triggered. It results in three lists (one for each cluster) composed of a key (srcIP, dstIP or class of attack), a frequency and the pointer to the first occurrence in its respective CTable.
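The interplay between ATable and CTable can be illustrated with a short Python sketch. It is not the Java implementation: the dictionary layout, the function names and the extra `last` field (kept so that a new alert can be appended to its chain without walking it) are assumptions made for illustration, and indexes are 0-based, so Alert 1 of the text is row 0.

```python
DIMENSIONS = ("srcIP", "dstIP", "class")

def build_tables(alerts):
    """Build the ATable rows and one CTable per dimension."""
    atable = []                                  # one row per alert, arrival order
    ctables = {d: {} for d in DIMENSIONS}        # value -> count, first, last
    for index, alert in enumerate(alerts):
        row = dict(alert, **{"next_" + d: None for d in DIMENSIONS})
        atable.append(row)
        for d in DIMENSIONS:
            entry = ctables[d].get(alert[d])
            if entry is None:                    # first occurrence of this value
                ctables[d][alert[d]] = {"count": 1, "first": index, "last": index}
            else:                                # link previous alert to this one
                atable[entry["last"]]["next_" + d] = index
                entry["last"] = index
                entry["count"] += 1
    return atable, ctables

def cluster_members(atable, ctables, dimension, value):
    """Follow the next pointers from the first alert of a cluster."""
    index = ctables[dimension][value]["first"]
    while index is not None:
        yield index
        index = atable[index]["next_" + dimension]

# Hypothetical alerts reproducing the sharing pattern described in the text.
alerts = [
    {"srcIP": "192.168.0.51", "dstIP": "150.161.192.10", "class": "A"},
    {"srcIP": "192.168.0.52", "dstIP": "150.161.192.11", "class": "B"},
    {"srcIP": "192.168.0.53", "dstIP": "150.161.192.11", "class": "C"},
    {"srcIP": "192.168.0.51", "dstIP": "150.161.192.12", "class": "C"},
]
atable, ctables = build_tables(alerts)
print(list(cluster_members(atable, ctables, "srcIP", "192.168.0.51")))
```

Each alert is stored once, and its membership in the three clusters is expressed only through the pointers, which is the memory saving the text refers to.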


As the final result of the Aggregation module, each list is used to create a vector (one for each dimension) containing only the significant elements (alerts). Each vector element is composed of the following attributes: source IP address, source port, destination IP address, destination port, class of attack, and timestamp.

Figure 30 - ATable and CTable

4.3.2 FIM – Frequent Itemset Miner

The FIM was developed in Java 1.7 and the Apriori algorithm was imported from the Weka development package, class weka.associations.Apriori (PENTAHO, 2014b). The Apriori algorithm was parameterized with the following options:

90

Table 4.1 Apriori parameters

Parameter  Description                                                            Value
-N         The required number of rules                                           10
-T         The metric type by which to rank rules (0 = confidence)                0
-C         The minimum confidence of a rule                                       0.9
-D         The delta by which the minimum support is decreased in each iteration  0.05
-U         Upper bound for the minimum support                                    1.0
-M         Lower bound for the minimum support                                    0.1
-S         If used, rules are tested for significance at the given level          -1.0
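The interplay of the parameters in Table 4.1 can be illustrated with a small sketch. It assumes the usual behaviour of this kind of miner: the minimum support starts at the upper bound and is lowered by the delta until enough rules with the minimum confidence are found or the lower bound is reached. The functions `frequent_itemsets` and `mine_rules` are hypothetical, brute-force stand-ins written for this illustration; they are not the Weka implementation used by FIM and are only suitable for tiny inputs.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise search for itemsets whose relative support >= min_support."""
    n = len(transactions)
    current = [frozenset([i]) for i in {i for t in transactions for i in t}]
    frequent, size = {}, 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: k for c, k in counts.items() if k / n >= min_support - 1e-9}
        frequent.update(level)
        size += 1
        # Candidates of the next size are unions of two frequent itemsets.
        current = list({a | b for a in level for b in level if len(a | b) == size})
    return frequent

def mine_rules(transactions, num_rules=10, min_conf=0.9,
               upper=1.0, lower=0.1, delta=0.05):
    """Lower the support from `upper` to `lower` until `num_rules` rules exist."""
    steps = int(round((upper - lower) / delta))
    rules = []
    for k in range(steps + 1):
        support = upper - k * delta
        freq = frequent_itemsets(transactions, support)
        rules = []
        for itemset, count in freq.items():
            for r in range(1, len(itemset)):
                for ante in map(frozenset, combinations(itemset, r)):
                    conf = count / freq[ante]
                    if conf >= min_conf:
                        rules.append((ante, itemset - ante, count, conf))
        if len(rules) >= num_rules:
            break
    # Rank by confidence first (metric type 0), then by support count.
    rules.sort(key=lambda rule: (-rule[3], -rule[2]))
    return rules[:num_rules]

# Hypothetical alerts encoded as sets of attribute=value items.
alerts = [
    {"src=A", "sid=1", "sport=2049"},
    {"src=A", "sid=1", "sport=2049"},
    {"src=A", "sid=1", "sport=2049"},
    {"src=B", "sid=2", "sport=80"},
]
top_rules = mine_rules(alerts, num_rules=3)
```

In this toy input no itemset reaches a support of 0.80, so the loop keeps lowering the threshold and stops at 0.75, where the three co-occurring items yield rules with full confidence.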

4.3.3 Alerts Aggregation

The need to collect a huge amount of data and transform it, so that a mining algorithm can be used to extract information, is a well-known problem covered in the Data Warehouse research area. It is called the ETL process, from Extract, Transform and Load (KIMBALL; CASERTA, 2004). Therefore, an open-source ETL tool named Kettle (PENTAHO, 2014a) was used to organize ARCA's execution flow.

Figure 31 Job1 collects the alerts and runs RUA and FIM

Kettle is a visual tool that allows the creation of a data extraction flow and has two main components: jobs and transformations. A job is a group of transformations, and a transformation is an operation executed in the data flow. The extraction flow in ARCA was split into two jobs, Job1 and Job2, as depicted in Figure 31 and Figure 32, respectively.


Figure 32 Job2 imports one or more RCARs and removes the selected alerts

When Job1 finishes, the Security Administrator must evaluate the result and create one or more RCARs. Each RCAR must be inserted in a Java properties file (ORACLE, 2014), which is then imported by Job2, and the matching alerts are subsequently removed from the original dataset. According to the workflow presented in Figure 29, if the alert volume reduction after the removal reaches an acceptable level, no more rules are created; otherwise, the full operation is repeated.
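The removal step performed after the RCARs are imported can be sketched as below. This is only an illustration under stated assumptions: the dictionary layout of an RCAR, the use of `None` as a wildcard for a field that the rule leaves open, and the function names `matches` and `apply_rcars` are all hypothetical and do not reflect the actual properties file format or the Kettle transformation.

```python
def matches(alert, rcar):
    """True if the alert agrees with every field that the RCAR constrains."""
    return all(
        rcar.get(key) is None or alert[key] in rcar[key]
        for key in ("src", "sid", "dst")
    )

def apply_rcars(alerts, rcars):
    """Drop alerts explained by any RCAR and report the reduction achieved."""
    remaining = [a for a in alerts if not any(matches(a, r) for r in rcars)]
    reduction = 1 - len(remaining) / len(alerts) if alerts else 0.0
    return remaining, reduction

# Hypothetical alerts and one rule: a fixed source and class, any destination.
alerts = [
    {"src": "10.0.0.1", "sid": 9129012, "dst": "10.0.0.9"},
    {"src": "10.0.0.1", "sid": 9129012, "dst": "10.0.0.8"},
    {"src": "10.0.0.2", "sid": 2013504, "dst": "10.0.0.9"},
    {"src": "10.0.0.3", "sid": 19187, "dst": "10.0.0.7"},
]
rcars = [{"src": {"10.0.0.1"}, "sid": {9129012}, "dst": None}]
remaining, reduction = apply_rcars(alerts, rcars)
```

If the resulting reduction is not yet acceptable, the remaining alerts would be fed back into the aggregation and mining steps to derive further rules.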

4.4 Experiments

The experiments were guided by the following requirements:
• The IDS installation and configuration have to follow the default instructions, and all available signatures have to be enabled;
• No additional knowledge of the network infrastructure is required to configure the IDS or evaluate alerts;
• No artificial dataset will be used.

The IDS chosen was Snort (SOURCEFIRE, 2013), an open-source network-based IDS with two main detection engines: a signature engine and a protocol anomaly engine. As of March 2014, the former can use two rule sets (signature sets), one distributed by Sourcefire, Snort's sponsor, and the other by (EMERGING THREATS, 2013); both rule sets have approximately 38,000 distinct signatures. The latter is formed by several preprocessors responsible for traffic evaluation before the signature engine is applied to incoming packets. Moreover, Snort is a well-known IDS used in several academic studies (HAN, HONG; LU; REN, 2002; ISMAIL, 2009; RAFTOPOULOS; DIMITROPOULOS, 2013; RIKHTECHI, 2010; TRABELSI; BOX; AIN, 2013; WUU; HUNG; CHEN, 2006).

The experiments carried out in (RAFTOPOULOS; DIMITROPOULOS, 2013) demonstrated that Snort is effective for malware detection, but that knowledge of the network infrastructure and local analysis of the associated hosts are required to validate alerts. The results also demonstrated that, as of April 2011, 138 signatures were identified as the most effective to detect malware traffic without false positives.

Our experiments were organized in two phases. The first evaluates how the proposed method behaves on real alerts generated in SERPRO's local network infrastructure (SERPRO, 2013). The second evaluates its capacity to identify internal network attacks in alerts originating from traffic collected in a security competition (MACCDC, 2012). No knowledge of the infrastructure or traffic origin was used to analyze alerts from either dataset; the only information known is the evidence of network attacks in the second phase. More details about each experiment and the results obtained are found in Sections 4.4.2 and 4.4.3.

4.4.1 Alerts Preprocessing

An alert preprocessing operation was executed in order to create unique signature IDs (SID). We observed that signatures from distinct preprocessors may have the same SID but a distinct generator ID (GID). The GID represents the preprocessor which triggered the alert. Therefore, for each signature, a new SID was created using the original values as seed: the prefix 9 was concatenated with the GID and the SID. As an example, the signature "stream5: TCP Small Segment Threshold Exceeded" has GID 129 and SID 12, so the new SID is 9129012 (the SID appears zero-padded to three digits).
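The construction of the new SID can be sketched as a one-line helper. The function name `new_sid` is hypothetical, and the zero-padding of the original SID to three digits is an assumption inferred from the example above and from identifiers such as 9119019 that appear later in this chapter.

```python
def new_sid(gid: int, sid: int) -> int:
    """Concatenate the prefix 9, the GID, and the SID padded to three digits."""
    return int(f"9{gid}{sid:03d}")

# GID 129, SID 12 is "stream5: TCP Small Segment Threshold Exceeded".
example = new_sid(129, 12)
```

Under this assumption two preprocessors that reuse the same SID still produce distinct identifiers, since their GIDs differ.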

4.4.2 Experiment with the SERPRO dataset

Our first experiment tested RUA on a real alert dataset, to verify whether the alert reduction observed in the original work would aggregate alerts in such a way that a Security Administrator could manage the significant alerts without cognitive overload. A Snort IDS, version 2.5.9, with the default rule set from (EMERGING THREATS, 2013), was deployed in the internal network of a SERPRO subsidiary. The network consists of 400 workstations running Linux and Windows, local network services such as LDAP authentication, DNS, file sharing, printer services, DHCP and video streaming, and regular access to the Internet and to corporate Internet services, such as e-mail and Intranet applications.


The alerts were collected on a normal working day, from 8:00 am to 8:00 pm. During the first hour, from 8:00 to 9:00 am, 524392 alerts were triggered, with 47 attack classes, 413 sources and 1491 destinations. The CTables for Class, SrcIP and DstIP were generated and, as the reader may see in the histogram of the Class counter (Figure 33), the majority of the attack classes triggered few alerts while a minority triggered most of the alerts. Similar behavior was found in the SrcIP (Figure 34) and DstIP (Figure 35) counters. Moreover, the type 2 Kurtosis and Skewness coefficients of Class, SrcIP and DstIP were calculated in order to identify the shape characteristics of the three distributions, according to the research in (BINTI YUSOFF; BEE WAH, 2012).

Max=356046 Min=1 Mean=11157.28 S. Deviation=54187.12 Median=38 Kurtosis=37.79064 Skewness=6.011112 p-value = 1.819e-14

Figure 33 Histogram of Class Counter from SERPRO’s dataset

The positive Kurtosis coefficient in Figure 33 indicates a peaked distribution, which is said to be leptokurtic, or heavy-tailed, and the positive Skewness coefficient indicates that the data distribution is right-skewed, as in the Poisson, Chi-Square, Exponential, Log-Normal and Gamma distributions (BINTI YUSOFF; BEE WAH, 2012). Figure 34 and Figure 35 also indicate that the SrcIP and DstIP counters have leptokurtic, right-skewed distributions. In addition, the Shapiro-Wilk normality test yields a p-value lower than 0.05 for all three distributions, which indicates that they are not normal.
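The shape coefficients discussed above can be reproduced with a short routine. The sketch below assumes that "type 2" refers to the bias-adjusted sample estimators commonly given that label in statistical packages; the function name `type2_shape` is hypothetical, and the Shapiro-Wilk test is not covered here.

```python
import math

def type2_shape(values):
    """Return (skewness, excess kurtosis) using bias-adjusted estimators."""
    n = len(values)
    if n < 4:
        raise ValueError("at least four observations are required")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    g1 = m3 / m2 ** 1.5          # moment-based skewness
    g2 = m4 / m2 ** 2 - 3.0      # moment-based excess kurtosis
    skewness = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    return skewness, kurtosis

# A symmetric sample and a hypothetical counter dominated by one large value.
symmetric = type2_shape([1, 2, 3, 4, 5])
skewed = type2_shape([1, 1, 1, 2, 50])
```

A counter in which a single value dominates, as in the histograms of this section, produces a positive skewness, whereas a symmetric sample produces zero.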


Max=323285 Min=1 Mean=1269.714 S. Deviation=16042.6 Median=65 Kurtosis=396.8196 Skewness=19.75261 p-value < 2.2e-16

Figure 34 Histogram of SrcIP Counter from SERPRO’s dataset

Max=118945 Min=1 Mean=351.7049 S. Deviation=3338.981 Median=12 Kurtosis=1070.434 Skewness=30.48124 p-value < 2.2e-16

Figure 35 Histogram of DstIP Counter from SERPRO’s dataset

In the next step, the RU clustering algorithm was executed, and the results obtained from Class clustering differed from what is expected from Algorithm 1. According to the original method proposed in (FEITOSA, EDUARDO LUZEIRO, 2010), the algorithm should stop when the probability distribution of the remaining feature values is close to uniform. However, the Class dimension clustering had to be stopped at a lower RU of approximately 0.39, as presented in Table 4.2. If the next round is executed, the algorithm extracts all elements from vector A and inserts them in vector S, leaving vector A empty. Similar behavior was observed with SrcIP and DstIP clustering; the highest RU values were 0.84 and 0.83, respectively.
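The stopping criterion relies on how close the remaining values are to a uniform distribution. A minimal sketch of such a measure is given below, assuming that relative uncertainty is the Shannon entropy of the observed counters divided by its maximum possible value; the function name `relative_uncertainty`, the use of the observed number of distinct values as the normalizer, and the value returned for a single-valued input are assumptions made for this illustration, and Algorithm 1 itself is not reproduced.

```python
import math

def relative_uncertainty(counts):
    """Entropy of the counters normalized to [0, 1]; 1 means uniform."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    distinct = len(counts)
    if distinct < 2:
        return 0.0  # a single value carries no uncertainty (convention)
    entropy = -sum(c / total * math.log2(c / total) for c in counts)
    return entropy / math.log2(min(distinct, total))

# Hypothetical counters: one uniform set and one dominated by a single value.
ru_uniform = relative_uncertainty([10, 10, 10, 10])
ru_skewed = relative_uncertainty([997, 1, 1, 1])
```

As dominant values are moved from vector A to vector S, the counters left in A become more even and the measure rises towards 1, which is the trend visible in the RU column of the tables below.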

Table 4.2 Results from RU Algorithm. Class clustering from 8:00 am to 8:00 pm

Class
Round  RU    # Alerts  A   S
1      0.08  524392    47  0
2      0.22  27693     44  3
3      0.39  1267      34  13

Table 4.3 Results from RU Algorithm. SrcIP clustering from 8:00 am to 8:00 pm.

SrcIP
Round  RU    # Alerts  A    S
1      0.17  524392    413  0
2      0.39  141893    410  3
3      0.46  71084     394  19
4      0.54  13671     289  124
5      0.84  70        41   372

Table 4.4 Results from RU Algorithm. DstIp clustering from 8:00 am to 8:00 pm.

DstIP
Round  RU    # Alerts  A     S
1      0.31  524392    1491  0
2      0.37  303182    1481  10
3      0.54  51031     1429  62
4      0.54  23749     1360  131
5      0.83  1975      617   874

All alerts with a significant Class, SrcIp or DstIp were flagged 0, which resulted in 521237 alerts flagged as significant (cluster 0) and only 3155 as insignificant (cluster 1), leading to a 94% reduction. Figure 36 presents the significant alert quantities grouped by attack class, and Figure 37 shows the SrcIp and DstIp quantities per significant class. All values were normalized using min-max normalization and mapped to an interval between 10 and 100, in order to facilitate the visualization of the lower and higher values in the graphic without losing the relationship between the original values (HAN, JIAWEI; KAMBER; PEI, 2011).
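The min-max mapping onto the interval between 10 and 100 can be written as a small helper. The function name `min_max_scale` and the handling of a constant input are assumptions made for this sketch.

```python
def min_max_scale(values, new_min=10.0, new_max=100.0):
    """Linearly map values so that min -> new_min and max -> new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: every value is identical.
        return [new_min for _ in values]
    span = new_max - new_min
    return [(v - lo) / (hi - lo) * span + new_min for v in values]

# Hypothetical alert quantities.
scaled = min_max_scale([0, 50, 100])
```

Because the mapping is linear, the ordering and relative spacing of the original quantities are preserved while the smallest bar remains visible in the chart.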

SID      Normalized quantity
19187    10.0
2002911  10.0
2101411  10.1
2100538  10.4
9119019  10.5
9138005  10.8
2100368  10.8
2100366  10.8
9119031  10.9
9129005  11.3
9129015  16
2013504  39
9129012  100

Figure 36 Normalized alert quantities per significant alert class (SID).

SID      DstIp  SrcIp
19187    14     10
2002911  10     10
2101411  11     16
2100538  10     28
9119019  32     33
9138005  13     26
2100368  10     11
2100366  10     11
9119031  58     36
9129005  54     16
9129015  100    98
2013504  10     26
9129012  62     100

Figure 37 Normalized SrcIp and DstIp quantities per significant class (SID).


The reduction rate is an important evaluation parameter. However, without an analysis that indicates what these significant alerts represent, the Security Administrator still faces a large group of alerts with no reasoning about whether they stem from malicious or erroneous behavior. Despite the considerably high alert reduction, the Security Administrator cannot discard the significant alerts, because they might indicate erroneous or even malicious behavior that triggered frequent alerts. Some alert classes, like 19187 and 9129012, are found in both clusters, and we argue that an alert class may be triggered by distinct root causes; even a single SrcIp or DstIp may trigger alerts from distinct root causes.

The next step was running FIM. The association rules generated clustered 88.16% of the alerts in 5 rules, each with 100% confidence, as the reader may see in Table 4.5. The initial result, after RUA processing, was 94% of the alerts grouped within 13 classes, with several SrcIp and DstIp values and unknown relations between them. In other words, the System Administrator would face a complex task to manually find associations and alert root causes among the significant alerts (cluster 0). According to SERPRO's security policy, we are not allowed to disclose the IP addresses; therefore, the most significant 16 bits are obfuscated: "X.X." indicates internal addresses and "Y.Y." indicates Internet addresses.

Table 4.5 Root Cause Association Rules from Serpro’s dataset, between 8:00 am and 9:00 am.

Rule  SrcIp       Class             DstIp       N. of alerts  Reduction
1     X.X.220.65  9129012           (50)        323266        61.65%
2     (55)        2013504           X.X.220.50  115901        83.74%
3     (74)        9129015           Y.Y.174.60  8788          85.42%
4     X.X.64.43   9129012           X.X.66.22   7229          86.80%
5     X.X.220.33  2100366, 2100368  X.X.220.91  7164          88.16%
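The Reduction column of Table 4.5 is cumulative: each entry is the share of the 524392 alerts of the first hour covered by that rule and all rules before it. The short sketch below recomputes it from the per-rule counts in the table; the function name `cumulative_reduction` is hypothetical.

```python
TOTAL_ALERTS = 524392                               # alerts from 8:00 to 9:00 am
RULE_ALERTS = [323266, 115901, 8788, 7229, 7164]    # rules 1 to 5 of Table 4.5

def cumulative_reduction(rule_counts, total):
    """Percentage of alerts covered after each successive rule."""
    covered = 0
    percentages = []
    for count in rule_counts:
        covered += count
        percentages.append(100.0 * covered / total)
    return percentages

reductions = cumulative_reduction(RULE_ALERTS, TOTAL_ALERTS)
```

The recomputed values agree with the table to within 0.01 percentage points, the small differences being consistent with truncation rather than rounding of the last digit.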


4.4.2.1 Results evaluation

Rule 1

The IP address X.X.220.65 triggered 323266 alerts from a single class, 9129012 (stream5: TCP Small Segment Threshold Exceeded), directed to 50 different destinations, all in the internal network. All alerts are TCP-based, have source port 2049 (as shown in Table 4.6) and 55 different destination ports. The source port helps the Security Administrator to identify which network service may be related to the alerts. An examination of the rule documentation and of Snort's default configuration shows that each 9129012 alert indicates 3 segments with a size lower than 150 bytes.

Table 4.6 Apriori’s Association Rules for Rule 1

1. src=10.200.220.65 323285 ==> cluster_class=0 323285
2. novo_sid= 9129012 sport=2049 323267 ==> cluster_class=0 323267
3. src=10.200.220.65 sport=2049 323266 ==> novo_sid= 9129012 323266
4. src=10.200.220.65 novo_sid= 9129012 323266 ==> sport=2049 323266
5. src=10.200.220.65 novo_sid= 9129012 323266 ==> cluster_class=0 323266
6. src=10.200.220.65 sport=2049 323266 ==> cluster_class=0 323266
7. [...] novo_sid= 9129012 323266
8. src=10.200.220.65 novo_sid= 9129012 cluster_class=0 323266 ==> sport=2049 323266
9. src=10.200.220.65 novo_sid= 9129012 sport=2049 323266 ==> cluster_class=0 323266
10. src=10.200.220.65 sport=2049 323266 ==> novo_sid= 9129012 cluster_class=0 323266

Rule 2

55 different internal IP addresses triggered 115901 alerts from attack class 2013504 (ET POLICY GNU/Linux APT User-Agent Outbound likely related to package management), destined to IP address X.X.220.50, TCP destination port 80. Considering that the alert is meant to indicate that internal hosts are updating their Linux packages from an external server, and that the destination host here is internal, it may be considered a false positive.


Table 4.7 Apriori’s Association Rules for Rule2

1. dst=10.200.220.50 118945 ==> dport=80 118945
2. dst=10.200.220.50 cluster_class=0 118845 ==> dport=80 118845
3. novo_sid= 2013504 115923 ==> dport=80 115923
4. dst=10.200.220.50 novo_sid= 2013504 115901 ==> dport=80 115901
5. novo_sid= 2013504 cluster_class=0 115901 ==> dst=10.200.220.50 115901
6. dst=10.200.220.50 novo_sid= 2013504 115901 ==> cluster_class=0 115901
7. novo_sid= 2013504 cluster_class=0 115901 ==> dport=80 115901
8. novo_sid= 2013504 dport=80 cluster_class=0 115901 ==> dst=10.200.220.50 115901
9. dst=10.200.220.50 novo_sid= 2013504 cluster_class=0 115901 ==> dport=80 115901
10. dst=10.200.220.50 novo_sid= 2013504 dport=80 115901 ==> cluster_class=0 115901

Rule 3

74 internal IP addresses triggered 8788 alerts from attack class 9129015 (stream5: Reset outside window), destined to the Internet IP address Y.Y.174.60, which is a SERPRO Internet server, TCP destination port 443. There is no reference in the Snort documentation to what may trigger this alert; however, the System Administrator has a starting point to analyze its possible cause.

Table 4.8 Apriori’s Association Rules for Rule3

1. dst=161.148.174.60 8788 ==> novo_sid= 9129015 8788
2. dst=161.148.174.60 8788 ==> dport=443 8788
3. dst=161.148.174.60 8788 ==> cluster_class=0 8788
4. dst=161.148.174.60 dport=443 8788 ==> novo_sid= 9129015 8788
5. dst=161.148.174.60 novo_sid= 9129015 8788 ==> dport=443 8788
6. dst=161.148.174.60 8788 ==> novo_sid= 9129015 dport=443 8788
7. dst=161.148.174.60 cluster_class=0 8788 ==> novo_sid= 9129015 8788
8. dst=161.148.174.60 novo_sid= 9129015 8788 ==> cluster_class=0 8788
9. dst=161.148.174.60 8788 ==> novo_sid= 9129015 cluster_class=0 8788
10. dst=161.148.174.60 cluster_class=0 8788 ==> dport=443 8788

Rule 4

This is the first rule to demonstrate the influence of the Security Administrator on ARCA's rule creation. Table 4.9 presents the association rules generated by FIM during this round: 5 rules with 100% confidence, none of which includes an IP address. Association rule 2 indicated 8605 alerts with source port 49859 and destination port 22. A query using these ports returned 5 distinct source IP addresses generating 4 distinct attack classes destined to 5 distinct IP addresses. The analysis of these alerts revealed 7229 alerts from X.X.64.43 to X.X.66.22, with attack class 9129012 (stream5: TCP Small Segment Threshold Exceeded), TCP source port 49859 and TCP destination port 22.

Table 4.9 Apriori’s Association Rules for Rule4

1. sport=49859 8611 ==> dport=22 8611
2. sport=49859 cluster_class=0 8605 ==> dport=22 8605
3. sport=49859 8611 ==> cluster_class=0 8605
4. sport=49859 dport=22 8611 ==> cluster_class=0 8605
5. sport=49859 8611 ==> dport=22 cluster_class=0 8605

Rule 5

The IP address X.X.220.33 triggered 7164 alerts, from attack classes 2100366 (GPL ICMP_INFO PING BSDtype) and 2100368 (GPL ICMP_INFO PING *NIX), destined to IP address X.X.220.91. Both classes indicate ICMP echo requests originated in Unix systems. This behavior may be considered a false positive if the IP addresses are members of a server cluster, or the source is periodically verifying if the destination is up.

According to the purpose of this dissertation, the level of aggregation reached is suitable to demonstrate ARCA’s aggregation capability.

Evaluation within the next hours

In order to evaluate the stability of the system, i.e., how the alert reduction would behave during the following hours and whether any new knowledge could be learned, we tested whether the rules generated in the first time interval would maintain their performance during the following hours. As the reader may see in Figure 38 and Figure 39, the average reduction rate in the following hours was 80%, with a decrease to 56.52% at 16:00, where it is possible to identify an alert peak that may have caused this drop in performance.


[Chart: Alert Reduction (%), vertical axis from 0 to 100, horizontal axis hour of day from 8 to 19]

Figure 38 Alert Reduction in 12 hours interval

The clustering procedure was executed on the alerts from 16:00, in order to identify the possible cause of the alert burst and to evaluate the algorithm's capacity to adapt to new alerts. Two new rules were generated and the reduction rate increased to 70.89%, demonstrating that the framework can adapt to emerging alerts and incorporate new knowledge to improve the reduction rate.

[Chart: number of alerts per hour, series "total" and "final", vertical axis from 0 to 600000, horizontal axis hour of day from 8 to 19]

Figure 39 Total Alerts versus Final Alerts in 12 hours interval


Table 4.10 New RCARs created from new alerts detected between 15:00 and 17:00

Rule  SrcIp       Class             DstIp        N. of alerts  Reduction
6     X.X.64.105  9128004, 9129012  X.X.66.22    139628        68.96%
7     X.X.222.85  9128004           Y.Y.162.241  10597         70.89%

4.4.3 Experiment with the MACCDC dataset

The second experiment was conducted with the MACCDC 2012 dataset (MACCDC, 2012). The Mid-Atlantic Collegiate Cyber Defense Competition brings together college and university students in a competition focused on managing and protecting a network infrastructure. Throughout the competition, teams have to protect network services from an attack team while satisfying IT business requirements, in order to simulate a real environment. Our objective is to verify ARCA's performance when malicious behavior exists, while still assuming that no previous knowledge of the infrastructure is available. The traffic was collected during the competition, and a Snort IDS, version 2.6.1, with the default rule set from Sourcefire (SOURCEFIRE, 2013), generated 139401 alerts, the first at 2012-03-16 09:30:02 and the last at 2012-03-17 17:57:15.

Max=35519 Min=1 Mean=11157.28 S. Deviation=6194.197 Median=81 Kurtosis=19.53017 Skewness=4.461998 p-value < 2.2e-16

Figure 40 Histogram of Class Counter from MACCDC’s dataset


Max=72702 Min=1 Mean=824.858 S. Deviation=5799.203 Median=34 Kurtosis=136.9627 Skewness=11.40397 p-value < 2.2e-16

Figure 41 Histogram of SrcIP Counter from MACCDC’s dataset

Max=46662 Min=1 Mean=834.7365 S. Deviation=4104.647 Median=82 Kurtosis=92.32111 Skewness=9.002304 p-value < 2.2e-16

Figure 42 Histogram of DstIP Counter from MACCDC’s dataset

As the reader may see in Figure 40, Figure 41 and Figure 42, the histograms exhibit a behavior similar to that of SERPRO's dataset, i.e., the attack class, SrcIP and DstIP counters have leptokurtic, right-skewed distributions, and the Shapiro-Wilk normality test yields a p-value lower than 0.05 for all three distributions, which indicates that they are not normal.


Table 4.11 RCAR Rules From MACCDC 2012 dataset

Rule  SrcIp            Class    DstIp            N. of alerts  Reduction
1     192.168.202.102  (9)      192.168.24.101   68479         33.45%
2     192.168.202.102  15474    (7)              21837         49.12%
3     192.168.24.101   9120003  192.168.202.102  16036         60.63%
4     (34)             9120003  192.168.202.110  21251         75.89%
5     192.168.202.79   (10)     192.168.229.101  8321          81.84%

The RCARs had a performance similar to that of our first experiment and clustered 81.84% of the alerts in 5 rules, each with 100% confidence, as the reader may see in Table 4.11. However, it is now possible to identify malicious activity, and it is worth emphasizing that malicious activities also have statistical significance, similarly to the erroneous behavior found in misconfigurations. As the reader may observe in Table 4.11, Rule 1 indicates that IP address 192.168.202.102 perpetrated attacks directed to 192.168.24.101, related to 9 different attack types. The alerts triggered are shown in Table 4.12.

Table 4.12 Alerts triggered by Rule 1

SID      Alert Class Name
9119033  http_inspect: UNESCAPED SPACE IN HTTP URI
15474    BAD-TRAFFIC Microsoft ISA Server and Forefront Threat Management Gateway invalid RST denial of service attempt
9125003  ftp_pp: FTP parameter length overflow
9125001  ftp_pp: Telnet command on FTP command channel
9125002  ftp_pp: Invalid FTP command
9129015  stream5: Reset outside window
9119024  http_inspect: MULTIPLE HOST HEADERS DETECTED
9119019  http_inspect: LONG HEADER
9119031  http_inspect: UNKNOWN METHOD

Rule 2 indicates that IP address 192.168.202.102 also attacked 7 destinations using SID 15474, as presented in Table 4.13.

Table 4.13 Destinations from alerts triggered by Rule 2

DstIP           N. of alerts
192.168.24.101  13682
192.168.23.202  8859
192.168.26.202  7423
192.168.28.202  4123
192.168.24.202  1378
192.168.24.152  50
192.168.22.202  4

Chapter 5 Conclusions

As discussed throughout this document, modern malware has several strategies to spread within a local network. Moreover, its obfuscation techniques and ability to update itself make the detection of infected hosts a complex task. Academic researchers have proposed several methods to detect malware traffic using network-based IDS, as a solution to detect malware recruiting new hosts, i.e., to detect malware spreading through the exploitation of existing vulnerabilities.

However, the research community has not considered the major IDS drawbacks: the alarm volume and the false alarm rate. Moreover, several experiments were carried out with artificial, biased traffic or with traffic too specific to generalize the results.

In order to handle these problems in a synergetic way, i.e., to promote alert reduction and malicious traffic detection in an internal network environment, a framework capable of aggregating statistically significant alerts and analyzing alert root causes, malicious or not, was proposed in Chapter 4. The framework, named ARCA (Alerts Root Cause Analysis), evolved from the alert aggregation module first presented in (FEITOSA, EDUARDO LUZEIRO, 2010) and proposed the use of Apriori association rules to help security engineers identify alert root causes.

The experiments demonstrated that the former Relative Uncertainty Aggregation technique clustered the alerts with more statistical significance, but without details of how hosts and alerts were related. Following the experimentation in Section 4.4, ARCA was able to present a comprehensive and limited set of rules, where each rule indicates a relation between hosts and alert types and a frequency pattern that may be related to a root cause. The former RUA results presented a 94% aggregation, 13 alert classes and no information on how 372 source hosts were related to 874 destination hosts, while ARCA presented 88% aggregation and 5 rules with information on how 132 source hosts were related to 54 destination hosts.

The remainder of this chapter summarizes the contributions of this dissertation and presents the difficulties found during experimentation, along with suggestions for future work and improvements.

5.1 Contributions

This document first presents a study of the state of the art in malware behavior research and of current academic research focused on alert reduction and IDS enhancements. Moreover, Table 3.1 presents a comparison between the methods of false alarm reduction according to their requirement of previous knowledge and the quality of the datasets used.

The main contribution is a framework capable of identifying statistically significant alerts, which has demonstrated that false positives and malicious behavior may both exhibit a frequent pattern. The proposed framework was able to reduce the alert volume by aggregating up to 88% of the alerts in a few association rules and to generate rules related to malicious activities. It must be noted that no previous knowledge of the internal network infrastructure was necessary.

The results were validated in experiments with two distinct sets of real traffic, from SERPRO and MACCDC. The former presents alerts generated in an internal network with local services commonly found in Intranets worldwide, and with Microsoft and Linux workstations. The latter presents alerts from a hacking competition where several teams competed to compromise internal network servers.

5.2 Difficulties Found

Throughout the development of this work, the main difficulty was to find a public dataset containing both normal and malicious traffic. Although the malicious traffic in the MACCDC dataset is real, its normal traffic is simulated. The SERPRO dataset was collected in a real environment; nevertheless, its malicious traffic is absent or unknown.


5.3 Learned Lessons

According to the results in Section 4.4 and the surveys in Chapters 2 and 3, we consider the following lessons learned:
• Malicious activities may present a statistically significant behavior;
• It is imperative that any method that claims to minimize false positives be tested with traffic from a real environment. The background traffic found in an internal network may compromise the results obtained when experimenting with DARPA's dataset;
• Alert root causes may be identified without previous knowledge of the network infrastructure.

5.4 Future Work

As future work, we intend to:
• Investigate techniques to handle alerts with no statistical significance (cluster 1): the main idea is to evaluate methods of multi-step correlation in order to identify malicious alert sequences;
• Investigate whether visualization techniques may help to improve root cause analysis: graphical alert visualization may help intrusion analysts to identify the network topology and its alert traffic;
• Experiment with more and different IDSs: correlation between host-based and network-based IDS alerts may improve alert quality, since malicious network traffic can then be correlated with state alterations in compromised hosts. Moreover, alerts from antivirus systems may be correlated with previous alerts from a NIDS.

References

ALAZAB, Ammar et al. Crime Toolkits: The Productisation of Cybercrime. [S.l.]: IEEE, Jul. 2013. p. 1626–1632. Accessed: 19 Jul. 2014. ISBN 978-0-7695-5022-0.

AL-MAMORY, Safaa O.; ZHANG, Hongli. IDS alerts correlation using grammar-based approach. Journal in Computer Virology, v. 5, n. 4, p. 271–282, 15 Aug. 2008a. Accessed: 30 Jul. 2014.

AL-MAMORY, Safaa O.; ZHANG, Hongli. New data mining technique to enhance IDS alarms quality. Journal in Computer Virology, v. 6, n. 1, p. 43–55, 10 Sep. 2008b. Accessed: 3 Jul. 2014.

ANDERSON, James P. Computer Security Threat Monitoring and Surveillance. Technical Report, James P. Anderson Co, Fort Washington, PA 19034: [s.n.], 1980.

ANDRIESSE, Dennis et al. Highly resilient peer-to-peer botnets are here: An analysis of Gameover Zeus. [S.l.]: IEEE, Oct. 2013. p. 116–123. Accessed: 3 Aug. 2014. ISBN 978-1-4799-2535-3.

ANDROULIDAKIS, Georgios; CHATZIGIANNAKIS, Vasilis; PAPAVASSILIOU, Symeon. Network anomaly detection and classification via opportunistic sampling. IEEE Network, v. 23, n. 1, p. 6–12, 2009.

AXELSSON, Stefan. The base-rate fallacy and the difficulty of intrusion detection. ACM Trans. Inf. Syst. Secur., v. 3, n. 3, p. 186–205, 2000.

AYCOCK, John. Computer Viruses and Malware. [S.l.]: Springer US, 2006. 22 v. (Advances in Information Security). Accessed: 11 Jul. 2014. ISBN 978-0-387-30236-2.

BAIZE, Eric; CORP, E M C. Developing Secure Products in the Age of Advanced Persistent Threats. Security Privacy, IEEE, v. 10, n. 3, p. 88–92, 2012.

BARFORD, Paul; YEGNESWARAN, Vinod. An Inside Look at Botnets. In: CHRISTODORESCU, Mihai et al. (Eds.). Malware Detection. Advances in Information Security. [S.l.]: Springer US, 2007. 27 v. p. 171–191. ISBN 978-0-387-32720-4.

BINTI YUSOFF, Sarah; BEE WAH, Yap. Comparison of conventional measures of skewness and kurtosis for small sample size. [S.l.]: IEEE, Sep. 2012. p. 1–6. Accessed: 8 Oct. 2014. ISBN 978-1-4673-1582-1.

BOLZONI, Damiano; CRISPO, Bruno; ETALLE, Sandro. ATLANTIDES: an architecture for alert verification in network intrusion detection systems. p. 1–12, 1 Nov. 2007. Accessed: 30 Jul. 2014. ISBN 978-1-59327-152-7.

BRADBURY, Danny. Shadows in the cloud: Chinese involvement in advanced persistent threats. Network Security, v. 2010, n. 5, p. 16–19, May 2010. Accessed: 25 Jun. 2014.

BRUGGER, S Terry; CHOW, Jedadiah. An Assessment of the DARPA IDS Evaluation Dataset Using Snort. [S.l: s.n.].

BRUMLEY, D.; NEWSOME, J.; SONG, D. Towards automatic generation of vulnerability-based signatures. [S.l.]: IEEE, 21 May 2006. p. 15 pp.–16. Accessed: 29 Jul. 2014. ISBN 0-7695-2574-1.

BURTON, Kelly. The Conficker Worm. Accessed: 23 Jul. 2014.

CENTER OF STRATEGIC AND INTERNATIONAL STUDIES. The Economic Impact of Cybercrime and Cyber Espionage. [S.l: s.n.], 2013.

CERT. Cert Coordination Center. Accessed: 27 Jun. 2014.

CERT.BR. Centro de Estudos, Resposta e Tratamento de Incidentes de Segurança no Brasil. Accessed: 27 Jun. 2014.

CHRUN, Danielle; CUKIER, Michel; SNEERINGER, Gerry. On the Use of Security Metrics Based on Intrusion Prevention System Event Data: An Empirical Analysis. 2008 11th IEEE High Assurance Systems Engineering Symposium, p. 49–58, Dec. 2008. Accessed: 18 Apr. 2012. ISBN 978-0-7695-3482-4.

COOKE, Evan; JAHANIAN, Farnam; MCPHERSON, Danny. The Zombie roundup: understanding, detecting, and disrupting botnets. p. 6 , 7 jul. 2005. Disponível em: . Acesso em: 22 jul. 2014.

COVA, Marco; KRUEGEL, Christopher; VIGNA, Giovanni. Detection and analysis of drive-by-download attacks and malicious JavaScript code. 26 abr. 2010, New York, New York, USA: ACM Press, 26 abr. 2010. p.281. Disponível em: . Acesso em: 23 jul. 2014. 9781605587998. .

CROWCROFT, J. et al. A survey and comparison of peer-to-peer overlay network schemes. IEEE Communications Surveys & Tutorials v. 7, n. 2, p. 72–93 , 1 jan. 2005. Disponível em: . Acesso em: 22 jul. 2014.

CRYSYS. sKyWIper (a.k.a. Flame a.k.a. Flamer): A complex malware for targeted attacks. Budapest: [s.n.], 2012. Disponível em: .

DE VRIES, Johannes et al. Systems for Detecting Advanced Persistent Threats: A Development Roadmap Using Intelligent Data Analysis. 2012 International Conference on Cyber Security n. SocialInformatics, p. 54–61 , dez. 2012. Disponível em: . Acesso em: 25 jun. 2014.978-1-4799-0219-4.

DEFENCE INTELLIGENCE. Mariposa Botnet Analysis. [S.l: s.n.], 2010. Disponível em: .

DENNING, Dorothy E. An intrusion-detection model. IEEE Transactions on Software Engineering v. 13, n. 2, p. 222–232 , 1987.

ECKMANN, Steven T.; VIGNA, Giovanni; KEMMERER, Richard A. STATL: an attack language for state-based intrusion detection. Journal of Computer Security v. 10, n. 1-2, p. 71–103 , 1 jul. 2002. Disponível em: . Acesso em: 29 jul. 2014.

EMERGING THREATS. Emerging Threats. Disponível em: . Acesso em: 3 jul. 2013.

ENISA. European Network and Information Security Agency. Disponível em: . Acesso em: 27 jun. 2014.

FALLIERE, Nicolas; MURCHU, Liam O; CHIEN, Eric. W32.Stuxnet Dossier. [S.l: s.n.], 2011.

FEDYNYSHYN, Gregory; CHUAH, Mooi Choo; TAN, Gang. Detection and classification of different botnet C&C channels. p. 228–242 , 2 set. 2011. Disponível em: . Acesso em: 22 jul. 2014. 978-3-642-23495-8.

FEILY, M; SHAHRESTANI, A; RAMADASS, S. A Survey of Botnet and Botnet Detection. 2009, [S.l: s.n.], 2009. p.268–273.

FEITOSA, Eduardo Luzeiro. An Orchestration Approach for Unwanted Internet Traffic Identification. Doctorate in Computer Science, ago. 2010.

FEITOSA, Eduardo; SOUTO, Eduardo; SADOK, Djamel H. An orchestration approach for unwanted Internet traffic identification. Computer Networks v. 56, n. 12, p. 2805–2831 , 1 ago. 2012. Disponível em: . Acesso em: 14 jul. 2014.

GALLOWAY, Brendan; HANCKE, Gerhard P. Introduction to Industrial Control Networks. IEEE Communications Surveys & Tutorials v. 15, n. 2, p. 860–880 , jan. 2013. Disponível em: . Acesso em: 30 jul. 2014.

GHORBANI, Ali; SADODDIN, Reza. Alert Correlation Survey: Framework and Techniques. 2006 International Conference on Privacy, Security and Trust - PST ’06 p. 1–10 , 2006. 1595936041.

GIURA, Paul; WANG, Wei. A Context-Based Detection Framework for Advanced Persistent Threats. 2012 International Conference on Cyber Security n. SocialInformatics, p. 69–74 , dez. 2012. Disponível em: . Acesso em: 25 jun. 2014. 978-1-4799-0219-4.

GOLOVANOV, Sergey; SOUMENKOV, Igor. TDL4 - Top Bot. Disponível em: . Acesso em: 27 jul. 2014.

GONCHAROV, Max. Russian Underground 101. [S.l: s.n.], 2012.

GOODIN, Dan. Authorities dismantle botnet with 13 million infected PCs. Disponível em: . Acesso em: 27 jul. 2014.

GOSTEV, Alexander. Flame: Replication via Windows Update MITM proxy server. Disponível em: . Acesso em: 27 jul. 2014a.

GOSTEV, Alexander. The Flame: Questions and Answers. [S.l: s.n.], 2012b. Disponível em: .

GU, Guofei et al. Active Botnet Probing to Identify Obscure Command and Control Channels. 2009, [S.l: s.n.], 2009. p.241–253.

GU, Guofei et al. BotHunter: detecting malware infection through IDS-driven dialog correlation. 2007, Berkeley, CA, USA: USENIX Association, 2007. p.1–16.

HAN, Hong; LU, Xin-Liang; REN, Li-Yong. Using data mining to discover signatures in network-based intrusion detection. 2002, [S.l: s.n.], 2002. p.13 – 17 vol.1.

HAN, Jiawei; KAMBER, Micheline; PEI, Jian. Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems). 2nd. ed. [S.l.]: Morgan Kaufmann, 2011. Disponível em: . Acesso em: 25 jul. 2014. 0123814790, 9780123814791.

HAYKIN, Simon. Neural Networks: A Comprehensive Foundation. 2nd. ed. [S.l.]: Prentice Hall, 1998.

HONEYNET. Honeynet Project. Disponível em: . Acesso em: 29 jul. 2014.

HONEYNET PROJECT. Know Your Enemy: Fast-Flux Service Networks. Disponível em: . Acesso em: 18 jul. 2014.

HUBBALLI, Neminath; BISWAS, Santosh; NANDI, Sukumar. Network Specific False Alarm Reduction in Intrusion Detection System. Sec. and Commun. Netw. v. 4, n. 11, p. 1339–1349 , 2011. Disponível em: .

HUBBALLI, Neminath; SURYANARAYANAN, Vinoth. False alarm minimization techniques in signature-based intrusion detection systems: A survey. Computer Communications n. May , maio 2014. Disponível em: . Acesso em: 4 jun. 2014.

ICS-CERT. Advisory (ICSA-10-090-01) - Mariposa Botnet. Disponível em: . Acesso em: 27 jul. 2014.

IETF. Domain Names - Concept and Facilities. Disponível em: . Acesso em: 19 jul. 2014.

IRAN NATIONAL CERT. Identification of a New Targeted Cyber-Attack. Disponível em: . Acesso em: 27 jul. 2014.

ISACA. Advanced Persistent Threat Awareness. [S.l: s.n.], 2013.

ISMAIL, Mohd T Nazri Taha. Framework of Intrusion Detection System via Snort Application on Campus Network Environment. 2009, [S.l: s.n.], 2009. p.455–459.

JELASITY, Márk; BILICKI, Vilmos. Towards automated detection of peer-to-peer botnets: on the limits of local approaches. p. 3 , 21 abr. 2009. Disponível em: . Acesso em: 22 jul. 2014.

JULISCH, K. Mining Alarm Clusters to Improve Alarm Handling Efficiency. In 17th Annual Computer Security Applications Conference (ACSAC) p. 12–21 , 2001.

JULISCH, K; DACIER, M. Mining intrusion detection alarms for actionable knowledge. In The 8th ACM International Conference on Knowledge Discovery and Data Mining, July , 2002.

JULISCH, Klaus. Data Mining for Intrusion Detection: A Critical Review. In: Applications of Data Mining in Computer Security. Boston, USA: Kluwer Academic Publisher, 2002. Chapter 1.

JULISCH, Klaus. Clustering intrusion detection alarms to support root cause analysis. ACM Trans. Inf. Syst. Secur. v. 6, n. 4, p. 443–471 , 2003.

JULISCH, Klaus. Mining Alarm Clusters to Improve Alarm Handling Efficiency. [S.l.]: IBM Zurich Research Laboratory, 2001. Disponível em: .

KAMLUK, Vitaly. The Botnet Ecosystem. [S.l: s.n.], 2009. Disponível em: http://brazil.kaspersky.com/recursos/centro-de-recursos/botnet-ecosystem.

KIMBALL, Ralph; CASERTA, Joe. The Data Warehouse ETL Toolkit - Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data. Indianapolis: Wiley Publishing, Inc., 2004. p. 526. 0-764-57923-1.

KUMAR, Sandeep. Classification and Detection of Computer Intrusions. Purdue University, West Lafayette, Indiana, USA, 1995.

LANGNER, Ralph. Stuxnet: Dissecting a Cyberwarfare Weapon. IEEE Security & Privacy Magazine v. 9, n. 3, p. 49–51 , maio 2011. Disponível em: . Acesso em: 11 jul. 2014.

LI, Frankie; LAI, Anthony; DDL, Ddl. Evidence of Advanced Persistent Threat: A case study of malware for political espionage. 2011 6th International Conference on Malicious and Unwanted Software n. x, p. 102–109 , out. 2011. Disponível em: . 978-1-4673-0034-6.

LI, Zhichun et al. Netshield: massive semantics-based vulnerability signature matching for high-speed networks. ACM SIGCOMM Computer Communication Review v. 40, n. 4, p. 279 , 16 ago. 2010. Disponível em: . Acesso em: 29 jul. 2014. 978-1-4503-0201-2.

LINS, Bruno; FEITOSA, Eduardo Luzeiro; SADOK, Djamel. Uma Ferramenta de Agregação e Extração de Alertas para Soluções Colaborativas. , 2011.

LU, C.-T.; BOEDIHARDJO, A P; MANALWAR, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. 2005, [S.l: s.n.], 2005. p.512–517.

LUA. Lua - The programming language. Disponível em: . Acesso em: 27 jul. 2014.

MACCDC. Mid-Atlantic Collegiate Cyber Defense Competition. Disponível em: . Acesso em: 27 jun. 2014.

MAHONEY, Matthew V; CHAN, Philip K. An Analysis of the 1999 DARPA/Lincoln Laboratory Evaluation Data for Network Anomaly Detection. 2003, [S.l.]: Springer-Verlag, 2003. p.220–237.

MANDIANT. The Advanced Persistent Threat. [S.l: s.n.], 2010. Disponível em: .

MANIKOPOULOS, Constantine; PAPAVASSILIOU, Symeon. Network intrusion and fault detection: a statistical anomaly approach. Communications Magazine, IEEE v. 40, n. 10, p. 76–82 , 2002a.

MANIKOPOULOS, Constantine; PAPAVASSILIOU, Symeon. Network Intrusion and Fault Detection: A Statistical Anomaly Approach. IEEE Communications Magazine n. October, p. 76–82 , 2002b.

MASSICOTTE, Frederic et al. Model-driven, network-context sensitive intrusion detection. p. 61–75 , 30 set. 2007. Disponível em: . Acesso em: 29 jul. 2014. 3-540-75208-0, 978-3-540-75208-0.

MCAFEE. Revealed: Operation Shady RAT. [S.l: s.n.], 2010. Disponível em: .

MCDONALD, Geoff et al. Stuxnet 0.5: The Missing Link. [S.l: s.n.], 2013. Disponível em: .

MCHUGH, J.; FITHEN, W.L.; ARBAUGH, W.A. Windows of vulnerability: a case study analysis. Computer v. 33, n. 12, p. 52–59 , 2000. Disponível em: . Acesso em: 29 jul. 2014.

MCHUGH, John. Testing Intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Trans. Inf. Syst. Secur. v. 3, n. 4, p. 262–294 , 2000.

MICROSOFT. Microsoft Security Bulletin MS08-067. Disponível em: . Acesso em: 27 jul. 2014.

MICROSOFT. Microsoft Security Bulletin MS10-046. Disponível em: . Acesso em: 27 jul. 2014a.

MICROSOFT. Microsoft Security Bulletin MS10-061. Disponível em: .

MUKHERJEE, B; HEBERLEIN, L T; LEVITT, K N. Network intrusion detection. Network, IEEE v. 8, n. 3, p. 26–41 , 1994.

NIST. National Vulnerability Database. Disponível em: . Acesso em: 27 jul. 2014.

ORACLE. Java Properties. Disponível em: . Acesso em: 27 jul. 2014.

OUELLETTE, Jacob; PFEFFER, Avi; LAKHOTIA, Arun. Countering malware evolution using cloud-based learning. out. 2013, [S.l.]: IEEE, out. 2013. p.85–94. Disponível em: . Acesso em: 8 jul. 2014. 978-1-4799-2535-3.

PAO, Hsing-Kuo et al. An Intrinsic Graphical Signature Based on Alert Correlation Analysis for Intrusion Detection. nov. 2012, [S.l.]: IEEE, nov. 2012. p.243–262. Disponível em: . Acesso em: 30 jul. 2014. 978-1-4244-8668-7.

PARIKH, Devi; CHEN, Tsuhan. Data Fusion and Cost Minimization for Intrusion Detection. IEEE Transactions on Information Forensics and Security v. 3, n. 3, p. 381–389 , 1 set. 2008. Disponível em: . Acesso em: 29 jul. 2014.

PENTAHO. Kettle - Data Integration. Disponível em: . Acesso em: 27 jul. 2014a.

PENTAHO. Weka. [S.l: s.n.], 2014b. Disponível em: .

PIETRASZEK, Tadeusz; TANNER, Axel. Data mining and machine learning-Towards reducing false positives in intrusion detection. Inf. Secur. Tech. Rep. v. 10, n. 3, p. 169–183 , 2005a.

PIETRASZEK, Tadeusz; TANNER, Axel. Data Mining and Machine Learning: Towards Reducing False Positives in Intrusion Detection. Information Security v. 10, n. 3 , 2005b.

PITCOCK, William. Scalability testing of various IRC Services platforms. Disponível em: . Acesso em: 22 jul. 2014.

POLYCHRONAKIS, Michalis; MAVROMMATIS, Panayiotis; PROVOS, Niels. Ghost turns zombie: exploring the life cycle of web-based malware. p. 11 , 15 abr. 2008. Disponível em: . Acesso em: 23 jul. 2014.

PORRAS, P. Directions in Network-Based Security Monitoring. Security Privacy, IEEE v. 7, n. 1, p. 82–85 , 2009.

RAFTOPOULOS, Elias; DIMITROPOULOS, Xenofontas. Understanding Network Forensics Analysis in an Operational Environment. maio 2013, [S.l.]: IEEE, maio 2013. p.111–118. Disponível em: . Acesso em: 14 jul. 2014. 978-1-4799-0458-7.

REN, Hanli; STAKHANOVA, Natalia; GHORBANI, Ali A. An online adaptive approach to alert correlation. p. 153–172 , 8 jul. 2010. Disponível em: . Acesso em: 30 jul. 2014. 3-642-14214-1, 978-3-642-14214-7.

RIKHTECHI, Leila. Creating a Standard Platform for All Intrusion Detection/Prevention Systems. p. 0–3 , 2010.

RODRÍGUEZ-GÓMEZ, Rafael A.; MACIÁ-FERNÁNDEZ, Gabriel; GARCÍA-TEODORO, Pedro. Survey and taxonomy of botnet research through life-cycle. ACM Computing Surveys v. 45, n. 4, p. 1–33 , 1 ago. 2013. Disponível em: . Acesso em: 11 jul. 2014.

ROSSOW, Christian et al. SoK: P2PWNED - Modeling and Evaluating the Resilience of Peer-to-Peer Botnets. maio 2013, [S.l.]: IEEE, maio 2013. p.97–111. Disponível em: . Acesso em: 22 jul. 2014. 978-0-7695-4977-4.

SAAD, Sherif et al. Detecting P2P Botnets through Network Behavior Analysis and Machine Learning. Ninth Annual International Conference on Privacy, Security and Trust , 2011. 9781457705847.

SADODDIN, Reza; GHORBANI, Ali A. An incremental frequent structure mining framework for real-time alert correlation. Computers & Security v. 28, n. 3-4, p. 153–173 , maio 2009. Disponível em: . Acesso em: 26 jul. 2014.

SERPRO. Serviço Federal de Processamento de Dados - SERPRO. Disponível em: . Acesso em: 27 jun. 2014.

SHAHRESTANI, Alireza et al. Architecture for Applying Data Mining and Visualization on Network Flow for Botnet Traffic Detection. 2009, [S.l: s.n.], 2009. p.33–37.

SHENG YU; SHIJIE ZHOU; SHA WANG. Fast-flux attack network identification based on agent lifespan. jun. 2010, [S.l.]: IEEE, jun. 2010. p.658–662. Disponível em: . Acesso em: 18 jul. 2014. 978-1-4244-5850-9.

SIEMENS. SIMATIC STEP 7. Disponível em: . Acesso em: 27 jul. 2014.

SILVA, Sérgio S.C. et al. Botnets: A survey. Computer Networks v. 57, n. 2, p. 378–403 , 1 fev. 2013. Disponível em: . Acesso em: 15 jul. 2014.

SINGHAL, Anoop. Data Warehousing and Data Mining Techniques for Cyber Security. Disponível em: . Acesso em: 7 out. 2014.

SINHA, Prosenjit et al. Insights from the analysis of the Mariposa botnet. out. 2010, [S.l.]: IEEE, out. 2010. p.1–9. Disponível em: . Acesso em: 19 jul. 2014. 978-1-4244-8641-0.

SOLEIMANI, Mahboobeh; GHORBANI, Ali A. Critical Episode Mining in Intrusion Detection Alerts. 2008, [S.l: s.n.], 2008. p.157–164.

SOMMER, Robin; PAXSON, Vern. Enhancing byte-level network intrusion detection signatures with context. 27 out. 2003, New York, New York, USA: ACM Press, 27 out. 2003. p.262. Disponível em: . Acesso em: 29 jul. 2014. 1581137389.

SOOD, A K; ENBODY, R J. Targeted Cyberattacks: A Superset of Advanced Persistent Threats. Security Privacy, IEEE v. 11, n. 1, p. 54–61 , jan. 2013.

SOURCEFIRE. Snort IDS. Disponível em: . Acesso em: 3 jul. 2014.

STONE-GROSS, Brett et al. Analysis of a Botnet Takeover. IEEE Security & Privacy Magazine v. 9, n. 1, p. 64–72 , jan. 2011. Disponível em: . Acesso em: 19 jul. 2014.

STONE-GROSS, Brett. The Lifecycle of Peer-to-Peer (Gameover) ZeuS. Disponível em: . Acesso em: 27 jul. 2014.

SURI, Hardik. The BlackHole Theory. Disponível em: . Acesso em: 27 jul. 2014.

SURICATA. Suricata IDS. Disponível em: . Acesso em: 27 jul. 2014.

SYMANTEC. .Tidserv. Disponível em: . Acesso em: 27 jul. 2014.

SYMANTEC. Flamer: A Recipe for Bluetoothache. Disponível em: . Acesso em: 27 jul. 2014a.

SYMANTEC. Flamer: Highly Sophisticated and Discreet Threat Targets the Middle East. Disponível em: .

SYMANTEC. Flamer: Urgent Suicide. Disponível em: . Acesso em: 27 jul. 2014c.

SYMANTEC. Have I Got Newsforyou: Analysis of Flamer C&C Servers. Disponível em: .

SYMANTEC. Norton Report. [S.l: s.n.], 2013. Disponível em: .

SYMANTEC. Painting a Picture of W32.Flamer. Disponível em: .

SYMANTEC. VBS.Gnutella. Disponível em: . Acesso em: 23 jul. 2014.

SYMANTEC. W32.Flamer. Disponível em: . Acesso em: 27 jul. 2014f.

SYMANTEC. W32.Flamer: Leveraging Microsoft Digital Certificates. Disponível em: . Acesso em: 27 jul. 2014g.

SZOR, Peter. The Art of Computer Virus Research and Defense. [S.l.]: Addison-Wesley Professional, 2005. p. 744. 978-0321304544.

SZÖR, Péter; FERRIE, Peter. Hunting for Metamorphic. 2001, [S.l: s.n.], 2001.

TANKARD, Colin. Advanced Persistent threats and how to monitor and deter them. Network Security v. 2011, n. 8, p. 16–19 , ago. 2011. Disponível em: . Acesso em: 25 maio 2014.

TEMPLETON, Steven J; LEVITT, Karl. A Requires/Provides Model for Computer Attacks. Distribution p. 31–38 , 2000.

TEXAS INSTRUMENTS. PROFIBUS Communications Development Platform. Disponível em: . Acesso em: 27 jul. 2014.

THOMAS, C.; BALAKRISHNAN, N. Performance enhancement of Intrusion Detection Systems using advances in sensor fusion. Information Fusion, 2008 11th International Conference on p. 1–7 , 2008. Disponível em: . Acesso em: 29 jul. 2014. 978-3-8007-3092-6.

THOMSON, Gordon. APTs: a poorly understood challenge. Network Security v. 2011, n. 11, p. 9–11 , nov. 2011. Disponível em: . Acesso em: 21 jun. 2014.

TJHAI, Gina C. et al. The Problem of False Alarms: Evaluation with Snort and DARPA 1999 Dataset. In: SPRINGER (Org.). Trust, Privacy and Security in Digital Business. [S.l: s.n.], 2008. 978-3-540-85734-1.

TRABELSI, Zouheir; BOX, P O; AIN, Al. Using Network Packet Generators and Snort Rules for Teaching Denial of Service Attacks. p. 285–290 , 2013.9781450320788.

TREND MICRO. CUTWAIL. Disponível em: . Acesso em: 27 jul. 2014.

TREND MICRO. SDBOT. Disponível em: . Acesso em: 23 jul. 2014.

TSAI, Meng-Han et al. C&C tracer: Botnet command and control behavior tracing. 2011 IEEE International Conference on Systems, Man, and Cybernetics p. 1859–1864 , out. 2011. Disponível em: . 978-1-4577-0653-0.

VIGNA, G. et al. A stateful intrusion detection system for world-wide web servers. 2003, [S.l.]: IEEE, 2003. p.34–43. Disponível em: . Acesso em: 29 jul. 2014. 0-7695-2041-3.

VIINIKKA, Jouni et al. Processing intrusion detection alert aggregates with time series modeling. Information Fusion v. 10, n. 4, p. 312–324 , 1 out. 2009. Disponível em: . Acesso em: 30 jul. 2014.

WANG, Helen J. et al. Shield: vulnerability-driven network filters for preventing known vulnerability exploits. 30 ago. 2004, New York, New York, USA: ACM Press, 30 ago. 2004. p.193. Disponível em: . Acesso em: 29 jul. 2014. 1581138628.

WANG, Ping et al. A Systematic Study on Peer-to-Peer Botnets. 3 ago. 2009, [S.l.]: IEEE, 3 ago. 2009. p.1–8. Disponível em: . Acesso em: 22 jul. 2014.

WANG, Ping; SPARKS, Sherri; ZOU, Cliff C. An Advanced Hybrid Peer-to-Peer Botnet. IEEE Transactions on Dependable and Secure Computing v. 7, n. 2, p. 113–127 , 2010.

WONG, Wing; STAMP, Mark. Hunting for metamorphic engines. Journal in Computer Virology v. 2, n. 3, p. 211–229 , 11 nov. 2006. Disponível em: . Acesso em: 7 jul. 2014.

WUU, Lih-Chyau; HUNG, Chi-hsiang; CHEN, Sout-fong. Building intrusion pattern miner for Snort network intrusion detection system. Journal of Systems and Software v. 80, n. 10, p. 1699–1715 , 2006. Disponível em: .

XU, Kuai; WANG, Feng; GU, Lin. Network-aware behavior clustering of Internet end hosts. 2011 Proceedings IEEE INFOCOM p. 2078–2086 , abr. 2011a. Disponível em: . 978-1-4244-9919-9.

XU, Kuai; WANG, Feng; GU, Lin. Network-aware behavior clustering of Internet end hosts. 2011 Proceedings IEEE INFOCOM p. 2078–2086 , abr. 2011b. Disponível em: . 978-1-4244-9919-9.

XU, Kuai; ZHANG, Zhi-li; BHATTACHARYYA, Supratik. Profiling Internet Backbone Traffic : Behavior Models and Applications. 2005 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM ’05) , 2005.

YU, Shui et al. Malware Propagation in Large-Scale Networks. IEEE Transactions on Knowledge and Data Engineering v. 4347, n. c, p. 1–1 , 2014. Disponível em: . Acesso em: 28 jun. 2014.

ZHANG, Lei et al. A Survey on Latest Botnet Attack and Defense. 2011 IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications p. 53–60 , nov. 2011. Disponível em: . Acesso em: 24 maio 2014. 978-1-4577-2135-9.

ZHIOUA, Sami. The Middle East under Malware Attack Dissecting Cyber Weapons. jul. 2013, [S.l.]: IEEE, jul. 2013. p.11–16. Disponível em: . Acesso em: 11 jul. 2014. 978-1-4799-3248-1.

ZHOU, Chenfeng Vincent; LECKIE, Christopher; KARUNASEKERA, Shanika. A survey of coordinated attacks and collaborative intrusion detection. Computers & Security v. 29, n. 1, p. 124–140 , fev. 2010. Disponível em: . Acesso em: 27 maio 2014.

ZHOU, Jingmin et al. Modeling network intrusion detection alerts for correlation. ACM Trans. Inf. Syst. Secur. v. 10, n. 1, p. 4 , 2007.

ZHU, Zhaosheng et al. Botnet Research Survey. 2008, [S.l.]: IEEE, 2008. p.967–972. Disponível em: . Acesso em: 17 jul. 2014. 978-0-7695-3262-2.

ZHUGE, Jianwei et al. Characterizing the IRC-based Botnet Phenomenon. [S.l: s.n.], 3 dez. 2007. Disponível em: . Acesso em: 22 jul. 2014.

ZOU, Cliff C et al. The monitoring and early detection of internet worms. IEEE/ACM Trans. Netw. v. 13, n. 5, p. 961–974 , 2005.

ZOU, Cliff C.; TOWSLEY, Don; GONG, Weibo. On the performance of Internet worm scanning strategies. Performance Evaluation v. 63, n. 7, p. 700–723 , 1 jul. 2006. Disponível em: . Acesso em: 20 jul. 2014.