Implementation and Evaluation of Secure and Scalable Anomaly-Based Network Intrusion Detection

Institut für Informatik der Ludwig-Maximilians-Universität München

Bachelor's thesis (Bachelorarbeit)

Philipp Mieden

Examiner (Aufgabensteller): Prof. Dr. Helmut Reiser
Advisor (Betreuer): Dipl.-Inform. Stefan Metzger, Leibniz-Rechenzentrum München

Index terms: security, anomaly detection, intrusion detection systems

I hereby declare that I have written this bachelor's thesis independently and have used no sources or aids other than those indicated.

München, December 18, 2018
(Signature of the candidate)

Abstract

Corporate communication networks are frequently attacked with sophisticated and previously unseen malware or insider threats, which makes advanced defense mechanisms such as anomaly-based intrusion detection systems necessary to detect, alert on and respond to security incidents. Both signature-based and anomaly detection strategies rely on features extracted from the network traffic, which requires secure and extensible collection strategies that make use of modern multi-core architectures. Available solutions are written in low-level system programming languages that require manual memory management, and suffer from frequent vulnerabilities that allow a remote attacker to disable or compromise the network monitor. Others have not been designed with research in mind and lack flexibility and data availability. To tackle these problems and ease future experiments with anomaly-based detection techniques, a research framework for collecting traffic features, implemented in a memory-safe language, will be presented. It provides access to network traffic as type-safe structured data, either for specific protocols or custom abstractions, by generating audit records in a platform-neutral format. To reduce storage space, the output is compressed by default. The approach is entirely implemented in the Go programming language, has a concurrent design, is easily extensible and can be used for live capture from a network interface or with PCAP and PCAPNG dump files. Furthermore, the framework offers functionality for the creation of labeled datasets, targeting application in supervised machine learning. To demonstrate the developed tooling, a series of experiments is conducted on classifying malicious behavior in the CIC-IDS-2017 dataset, using TensorFlow and a Deep Neural Network.

Contents

1 Introduction
  1.1 Acknowledgements
  1.2 Outline Of The Thesis
  1.3 Motivation
  1.4 Terminology
    1.4.1 Data Collection
    1.4.2 Feature Extraction
    1.4.3 Feature Selection
  1.5 Problem Definition
  1.6 Task Description
  1.7 Related Work
    1.7.1 Literature
    1.7.2 Articles
2 Requirement Analysis
  2.1 Functional Requirements
    2.1.1 Protocol Support Coverage
    2.1.2 Data Availability
    2.1.3 Abstraction Capabilities
    2.1.4 Concurrent Design
    2.1.5 File Extraction Capabilities
    2.1.6 Supported Input Formats
    2.1.7 Suitable Output Formats
    2.1.8 Real-Time Operation
  2.2 Non-Functional Requirements
    2.2.1 Memory Safety
    2.2.2 Open Source Codebase
    2.2.3 Scalability
    2.2.4 Performance
    2.2.5 Configurable Design
    2.2.6 Extensibility
    2.2.7 Reliability
    2.2.8 Usability
    2.2.9 Storage Efficiency
  2.3 Summary
3 State Of The Art
  3.1 Flow Formats
    3.1.1 NetFlow
    3.1.2 sFlow
    3.1.3 IPFIX
  3.2 Packet Level Formats
    3.2.1 PCAP
    3.2.2 PCAP-NG
  3.3 Data Collection Tools
    3.3.1 Argus
    3.3.2 nProbe
    3.3.3 Bro / Zeek
    3.3.4 CICFlowMeter
    3.3.5 ipsumdump
    3.3.6 tshark
  3.4 Requirement Evaluation
  3.5 Summary
4 Concept
  4.1 Design Goals
  4.2 Netcap Specification
  4.3 Protocol Buffers
  4.4 Delimited Protocol Buffer Records
  4.5 Data Pipe
  4.6 Parallel Processing
  4.7 Data Compression
  4.8 Writing Data To Disk
  4.9 Audit Records
  4.10 Netcap File Header
  4.11 Packet Decoding
  4.12 Workers
  4.13 Encoders
  4.14 Unknown Protocols
  4.15 Error Log
  4.16 Filtering and Export
  4.17 Dataset Labeling
  4.18 Sensors
  4.19 Sensor Data Pipe
  4.20 Collection Server
5 Implementation
  5.1 Why Go?
  5.2 Platform and Architecture Support
  5.3 Reading PCAP Files
  5.4 Reading Traffic from a Network Interface
  5.5 Concatenating Strings
  5.6 Atomic Writers
  5.7 Supported Protocols
  5.8 Protocol Sub Structure Types
  5.9 Available Fields
    5.9.1 Layer Encoders
    5.9.2 Custom Encoders
  5.10 TLS Handshakes
    5.10.1 JA3 Fingerprinting
  5.11 HTTP
  5.12 Flows and Connections
  5.13 Layer Flows
    5.13.1 Link Flow
    5.13.2 Network Flow
    5.13.3 Transport Flow
  5.14 Sensors & Collection Server
    5.14.1 Batch Encryption
    5.14.2 Batch Decryption
  ...