Botnet Detection: A Numerical and Heuristic Analysis
University of Minho
School of Engineering

Luís Miguel Ferreira Costa Mendonça

Botnet Detection: A Numerical and Heuristic Analysis

Master Thesis
Master in Communication Networks and Services Engineering

Research supervised by: Henrique Manuel Dinis dos Santos

January 2012

“Information is not knowledge”
Albert Einstein

Table of Contents

1. Introduction
    1.1 Goals
    1.2 Research Methodology
    1.3 Document Organization and Notes of Attention
    1.4 Research Calendar
2. Botnet Characterization
    2.1 Definition
    2.2 Propagation and Recruitment
    2.3 Command and Control
    2.4 Maintenance and Availability
    2.5 Attacks
    2.6 Botnet History: The Malware Chronicles
    2.7 Conclusions
3. Botnet Detection and Analysis
    3.1 Honeynets, Honeypots and SPAM Traps
    3.2 Signature-based Methods
    3.3 DNS-based Methods
    3.4 Anomaly-based Methods
    3.5 Related Work
    3.6 Conclusions
4. Botnet Anomaly-Based Detection
    4.1 Bot Behavior
    4.2 Botnet Behavior
    4.3 Useful Traffic Data for Anomalous-Based Detection
    4.4 Conclusions
5. Proposed Model and Prototype
    5.1 Initial Research
    5.2 Netflow as the Traffic Feature Source
    5.3 Traffic Attributes Heuristics
    5.4 Communications Fingerprints
    5.5 Detection Framework Implementation
    5.6 Prototype Implementation
    5.7 Conclusions
6. Experimental Setup, Results and Analysis
    6.1 Test Datasets Characterization
    6.2 Prototype Test Environment
    6.3 Testing Methodology
    6.4 Detection Sensitivity and Specificity Analysis
    6.5 Performance and Storage: Real-Time?
    6.6 Final Results and Conclusions
7. Discussion, Future Work and Conclusions
8. References
9. Appendix A: Botnet History

List of Figures

Figure 1: IRC Botnet Topology
Figure 2: HTTP Botnet Topology
Figure 3: HTTP Botnet Hierarchic Topology
Figure 4: P2P Botnet Topology
Figure 5: Number of distinct hosts with scan-like activity
Figure 6: Distinct source IPs per number of bytes and packets
Figure 7: Number of distinct Source IPs with crowd-like behavior per ten-minute period
Figure 8: Number of distinct hosts connecting to SQL Server ports per bytes per packet
Figure 9: Communication Fingerprints (example)
Figure 10: Scanner Communication Fingerprints
Figure 11: Malicious Host Fingerprints
Figure 12: HTTP Server Fingerprints
Figure 13: Clustering Database Tables
Figure 14: Padre Integrated Development Environment
Figure 15: SQL Server Scripts Test Examples
Figure 16: Typical University of Minho 24-hour traffic shown in Nfsen (12th January 2011)
Figure 17: Netflow Capture System
Figure 18: Distinct active hosts for each hourly period (16th November 2011)
Figure 19: Distinct active hosts for each hourly period present in blacklists (16th November 2011)
Figure 20: nfsen "testplugin" command-line tool
Figure 21: ROC Plot for CAST variation
Figure 22: First-Pass Detection - Sensitivity versus Specificity
Figure 23: MNACT ROC plots with (left) and without (right) clustering analysis
Figure 24: ROC Plot for detection without clustering analysis for MNACT=70 and HAST=950
Figure 25: Hosts detected without clustering analysis for MNACT=70 and HAST=950
Figure 26: Hosts detected using clustering analysis for MNACT=70 and HAST=950
Figure 27: ROC Plot for detection with clustering analysis for MNACT=70 and HAST=990
Figure 28: Hosts detected using clustering analysis for MNACT=70 and HAST=990
Figure 29: Performance Evaluation Chart

List of Tables

Table 1: Network Traffic Data Source Classification
Table 2: Top 5 bytes per packet clusters with higher SQL Server scan activity
Table 3: Netflow Traffic Attributes Selected
Table 4: Sample Communication Fingerprint Values
Table 5: Connection Anomaly