Implementation and Evaluation of Secure and Scalable Anomaly-Based Network Intrusion Detection
INSTITUT FÜR INFORMATIK
DER LUDWIG-MAXIMILIANS-UNIVERSITÄT MÜNCHEN

Bachelor's Thesis

Implementation and evaluation of secure and scalable anomaly-based network intrusion detection

Philipp Mieden

Supervisor: Prof. Dr. Helmut Reiser
Advisor: Dipl.-Inform. Stefan Metzger
Leibniz-Rechenzentrum München

Index terms: security, anomaly detection, intrusion detection systems

I hereby declare that I have written this bachelor's thesis independently and have used no sources or aids other than those indicated.

München, December 18, 2018
........................................... (Signature of the candidate)

Abstract

Corporate communication networks are frequently attacked with sophisticated and previously unseen malware or by insider threats. This makes advanced defense mechanisms, such as anomaly-based intrusion detection systems, necessary to detect, alert on, and respond to security incidents. Both signature-based and anomaly detection strategies rely on features extracted from the network traffic, which requires secure and extensible collection strategies that make use of modern multi-core architectures. Available solutions are written in low-level systems programming languages that require manual memory management, and they suffer from frequent vulnerabilities that allow a remote attacker to disable or compromise the network monitor. Others were not designed with research in mind and lack flexibility and data availability. To tackle these problems and ease future experiments with anomaly-based detection techniques, a research framework for collecting traffic features, implemented in a memory-safe language, is presented. It provides access to network traffic as type-safe structured data, either for specific protocols or for custom abstractions, by generating audit records in a platform-neutral format. To reduce storage space, the output is compressed by default.
The approach is implemented entirely in the Go programming language, has a concurrent design, is easily extensible, and can be used for live capture from a network interface or with PCAP and PCAPNG dump files. Furthermore, the framework offers functionality for creating labeled datasets, targeting applications in supervised machine learning. To demonstrate the developed tooling, a series of experiments on classifying malicious behavior in the CIC-IDS-2017 dataset is conducted, using TensorFlow and a deep neural network.

Contents

1 Introduction
  1.1 Acknowledgements
  1.2 Outline Of The Thesis
  1.3 Motivation
  1.4 Terminology
    1.4.1 Data Collection
    1.4.2 Feature Extraction
    1.4.3 Feature Selection
  1.5 Problem Definition
  1.6 Task Description
  1.7 Related Work
    1.7.1 Literature
    1.7.2 Articles

2 Requirement Analysis
  2.1 Functional Requirements
    2.1.1 Protocol Support Coverage
    2.1.2 Data Availability
    2.1.3 Abstraction Capabilities
    2.1.4 Concurrent Design
    2.1.5 File Extraction Capabilities
    2.1.6 Supported Input Formats
    2.1.7 Suitable Output Formats
    2.1.8 Real-Time Operation
  2.2 Non-Functional Requirements
    2.2.1 Memory Safety
    2.2.2 Open Source Codebase
    2.2.3 Scalability
    2.2.4 Performance
    2.2.5 Configurable Design
    2.2.6 Extensibility
    2.2.7 Reliability
    2.2.8 Usability
    2.2.9 Storage Efficiency
  2.3 Summary

3 State Of The Art
  3.1 Flow Formats
    3.1.1 NetFlow
    3.1.2 sFlow
    3.1.3 IPFIX
  3.2 Packet Level Formats
    3.2.1 PCAP
    3.2.2 PCAP-NG
  3.3 Data Collection Tools
    3.3.1 Argus
    3.3.2 nProbe
    3.3.3 Bro / Zeek
    3.3.4 CICFlowMeter
    3.3.5 ipsumdump
    3.3.6 tshark
  3.4 Requirement Evaluation
  3.5 Summary

4 Concept
  4.1 Design Goals
  4.2 Netcap Specification
  4.3 Protocol Buffers
  4.4 Delimited Protocol Buffer Records
  4.5 Data Pipe
  4.6 Parallel Processing
  4.7 Data Compression
  4.8 Writing Data To Disk
  4.9 Audit Records
  4.10 Netcap File Header
  4.11 Packet Decoding
  4.12 Workers
  4.13 Encoders
  4.14 Unknown Protocols
  4.15 Error Log
  4.16 Filtering and Export
  4.17 Dataset Labeling
  4.18 Sensors
  4.19 Sensor Data Pipe
  4.20 Collection Server

5 Implementation
  5.1 Why Go?
  5.2 Platform and Architecture Support
  5.3 Reading PCAP Files
  5.4 Reading Traffic from a Network Interface
  5.5 Concatenating Strings
  5.6 Atomic Writers
  5.7 Supported Protocols
  5.8 Protocol Sub Structure Types
  5.9 Available Fields
    5.9.1 Layer Encoders
    5.9.2 Custom Encoders
  5.10 TLS Handshakes
    5.10.1 JA3 Fingerprinting
  5.11 HTTP
  5.12 Flows and Connections
  5.13 Layer Flows
    5.13.1 Link Flow
    5.13.2 Network Flow
    5.13.3 Transport Flow
  5.14 Sensors & Collection Server
    5.14.1 Batch Encryption
    5.14.2 Batch Decryption
..