Early Stage Malware Classification Using Behavior Analysis


A thesis submitted in partial fulfillment of the requirements for the degree of Master of Technology

by Mugdha Gupta

Department of Computer Science and Engineering
INDIAN INSTITUTE OF TECHNOLOGY KANPUR
June 2018

Abstract

Name of the student: Mugdha Gupta
Roll No: 16111041
Degree for which submitted: M.Tech.
Department: Computer Science and Engineering
Thesis title: Early Stage Malware Classification using Behavior Analysis
Thesis supervisor: Dr. Sandeep Shukla
Month and year of thesis submission: June 2018

In recent years, there has been exponential growth in the number of malware samples captured and analyzed by antivirus companies. However, many of these samples are variants of already known malware. It has therefore become necessary to determine whether a malware sample belongs to a known family or exhibits new, hitherto unseen behavior that requires further analysis. The traditional approaches used by antivirus companies rely on signature-based detection and can be thwarted by zero-day exploit-based malware. Manual examination of such executables is extremely cumbersome because of their enormous number. It has also become necessary to speed up the detection process and to predict the nature of an executable before it releases its malicious payload.

In this work, we address all of the above issues using automated yet efficient malware analysis. We classify malicious executables into different malware classes in the earliest possible time using dynamic analysis. Dynamic analysis provides useful insights in the case of obfuscated or packed malware, where static analysis fails. Our experiments achieve an accuracy of 98.02% for classifying malware into classes within the initial 4 seconds of its execution using XGBoost. We also classify samples that the classifier has not seen before, in an attempt to classify zero-day malware.
Our solution is robust and scalable, as we have increased the number of samples used during analysis compared to prior work and drastically reduced the execution time. Our solution is also efficient: by comparison, the state-of-the-art accuracy for early-stage malware detection is 91% for the first 4 seconds of execution and 96% for the first 19 seconds using recurrent neural networks.

Acknowledgements

I would like to express my profound gratitude to Dr. Sandeep Shukla for guiding me in this project. I would also like to thank Pranjul Ahuja, Bhaskar Mukhoty and Rohit Singh Kharanghar for their help and support whenever I needed it. I am grateful to my parents and my siblings for the immense love they have given me. I am thankful to the VirusTotal community for generously providing me access to their private API. I would also like to take this opportunity to thank CDAC Mohali for their help in building the dataset and TCG Digital for their support in creating the virtual network.

Contents

Abstract
Acknowledgements
Contents
List of Figures
List of Tables

1 Introduction
  1.1 Need for Malware Classification
2 Background
  2.1 Malware and its classes
  2.2 Malware Nomenclature
  2.3 Available Defenses
  2.4 Malware Analysis techniques
    2.4.1 Static Analysis
      2.4.1.1 Limitations of Static Analysis
    2.4.2 Dynamic Analysis
3 Past Work
  3.1 Static analysis based feature extraction
  3.2 Dynamic analysis based feature extraction
  3.3 Time efficient detection
  3.4 Goals of this thesis
4 Machine Learning Background
  4.1 Classifiers
  4.2 Handling Imbalanced Data
  4.3 Cross Validation
  4.4 Evaluation Metrics
    4.4.1 Confusion Matrix
5 Classification of Existing malware
  5.1 Architecture of classification system
    5.1.1 Dataset collection, Generation and Labeling
      5.1.1.1 Dataset collection
      5.1.1.2 Dataset generation
      5.1.1.3 Labeling
    5.1.2 Feature Extraction
      5.1.2.1 Network related features
      5.1.2.2 Process related features
      5.1.2.3 API bins
      5.1.2.4 Signatures
    5.1.3 Training and Testing
    5.1.4 Comparison to Existing Approaches
6 Classification of Zero Day malwares
  6.1 Architecture
    6.1.1 Dataset Collection and Generation
    6.1.2 Feature Extraction
    6.1.3 Handling Imbalanced Data
    6.1.4 Training and Testing
7 Scope And Future Work
  7.1 Building a Hierarchical model
  7.2 Sliding window based approach for classification
  7.3 Building a robust classification system
A Appendix A
Bibliography

List of Figures

1.1 Growth of the malware over years [4]
2.1 Naming convention used by Microsoft [18]
4.1 Neural network with one hidden layer [20]
4.2 SMOTE oversampling technique [3]
4.3 Tomek undersampling technique [3]
4.4 K-Fold Cross Validation [2]
4.5 Confusion Matrix [1]
5.1 Architecture of our classification system
5.2 Cuckoo Architecture [6]
5.3 Protocol Hierarchy of a malware using Wireshark
5.4 LLMNR Poisoning [16]
5.5 TLS Connections
5.6 HTTP Requests by Sventore.A malware
5.7 Frequency of API Calls in each bin
5.8 Shortcuts created by worm family Yuner
5.9 Registry Keys modified by Backdoor Agent malware to install itself at startup
5.10 Polymorphic nature exhibited by malware Yuner
5.11 Exception raised by malware Renos
5.12 Confusion Matrix - XGBoost
6.1 Architecture of our classification system
6.2 tSNE - Test Set
6.3 Imbalanced Virus families
6.4 Imbalanced Trojan families
6.5 Confusion Matrix - XGBoost

List of Tables

3.1 Summary - Dynamic analysis based feature extraction
5.1 Dataset
5.2 Testing accuracy - Simple Neural Network, for various optimizers and loss functions
5.3 Test Results for all classifiers
5.4 Comparison to previous approaches
6.1 Number of samples in Training and Testing Set
6.2 Families in Training and Testing Set
6.3 Number of samples in Training Set after SMOTE
6.4 Accuracy for each type with corresponding FPR

Dedicated to my parents

Chapter 1
Introduction

1.1 Need for Malware Classification

The rise of the Internet has profoundly changed our day-to-day life. From shopping and online banking to entertainment and social networking, it has made our lives a lot easier. With the ease of information flow, almost every organization is now connecting to the Internet and becoming transparent about its operations and resources. But as the Internet economy has grown, more serious cyber crimes have evolved. Almost every device, from mobile phones and laptops to large systems such as power grids and nuclear plants, is subject to cyber attacks. Among the most serious cyber threats is malware, which evolves daily and has the capacity to disrupt almost every sector.

According to reports published by the AV-TEST Institute [4], the number of malicious samples has grown tremendously, as shown in Figure 1.1, with over 250,000 new malicious samples registered every day. Analyzing these samples manually using reverse engineering and disassembly is a tedious and cumbersome task, and is therefore not favored by security analysts. Thus there is a dire need for automated malware analysis systems that produce efficient results with minimal human intervention.
Antivirus systems use the most common and most primitive approach: signatures of known malware are generated beforehand, and newly downloaded executables are then compared against these signatures to predict their nature. This technique drastically fails in case of any
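
The signature-matching approach described above can be illustrated with a minimal sketch. It is not the mechanism of any particular antivirus product: it assumes, purely for illustration, that a signature is the SHA-256 digest of the whole file, and the family name `FamilyA`, the sample bytes, and the helper functions are all hypothetical.

```python
import hashlib
from typing import Dict, Optional


def signature(payload: bytes) -> str:
    """Return the hex SHA-256 digest, used here as a stand-in 'signature'."""
    return hashlib.sha256(payload).hexdigest()


def build_signature_db(known_samples: Dict[str, bytes]) -> Dict[str, str]:
    """Map each signature to the (hypothetical) family name of a known sample."""
    return {signature(data): family for family, data in known_samples.items()}


def lookup(payload: bytes, db: Dict[str, str]) -> Optional[str]:
    """Return the family whose signature matches exactly, or None if there is no match."""
    return db.get(signature(payload))


# Hypothetical known sample: placeholder bytes, not real malware.
known = {"FamilyA": b"MZ\x90\x00 placeholder bytes standing in for a known sample"}
db = build_signature_db(known)

exact_copy = known["FamilyA"]
variant = exact_copy + b"\x00"  # a single appended byte changes the digest

print(lookup(exact_copy, db))  # FamilyA
print(lookup(variant, db))     # None
```

In this sketch an exact copy of the known sample is matched, while a copy that differs by a single byte is not, which is one way to see why variants of known malware motivate the behavior-based classification pursued in this thesis.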