Network Anomaly Detection Using Artificial Neural Networks

______________________________________________________PROCEEDING OF THE 20TH CONFERENCE OF FRUCT ASSOCIATION Network Anomaly Detection 8sing Artificial Neural Networks Sergey Andropov, Alexei Guirik Mikhail Budko, Marina Budko ITMO University ITMO University Saint-Petersburg, Russia Saint-Petersburg, Russia [email protected], [email protected] [email protected], [email protected] Line 4 - Author1, [email protected] Abstract—This paper presents a method of identifying and • semantic gap between results and their operational classifying network anomalies using an artificial neural network interpretation; for analyzing data gathered via Netflow protocol. Potential anomalies and their properties are described. We propose using a • high variance in network’s behavior; multilayer perceptron, trained with the backpropagation algorithm. We experiment both with datasets acquired from a • difficulties in evaluating and verifying results. real ISP monitoring system and with datasets modified to Dealing with these problems is necessary to construct an simulate the presence of anomalies; some Netflow records are effective anomaly detection system. One part of the solution is modified to contain known patterns of several network attacks. setting a narrower focus for the analysis program, such as We evaluate the viability of the approach by practical limiting the scope of the network that is being monitored. experimentation with various anomalies and iteration sizes. Any network will change over time, as the new hardware I. INTRODUCTION gets installed, the new programs with different behavior Information security of modern computer networks is a patterns emerge, etc. Usage of a neural network allows the growing source of concerns. The need for tools capable of system to adapt to the gradual changes as it can adjust its detecting the ever-increasing number of network attacks, weights to accommodate for the shifts in the behavior. viruses and other incidents continues to grow. There are Another problem of automated anomaly detection is the methods of anomaly detection based on known signatures and difficulty of obtaining proper training data and filtering the metrics; while giving acceptable results and having a low false network noise. Usage of Netflow protocol allows countering alarm rates, they are normally unable to detect a previously some of these difficulties, as it provides vital information unknown attack or vectors of a virus spreading, which means without overloading the system with too much data and thus such programs have to rely on databases being updated on a has relatively small storage requirements [10]. regular basis. The rest of the article is structured as follows: section II Anomaly can be defined as a deviation from the normal contains the overall design of the detection method; section III behavior. An anomaly can be defined as "an event (or object) describes network anomalies and their typical properties; that differs from some standard or reference event, in excess of sections IV and V contain Netflow packet structure and the some threshold, in accordance with some similarity or distance aggregation criteria; section VI shows the design and the metric on the event" [1]. Major sources of network anomalies training process of the neural network; sections VII presents are network attacks, hardware or software malfunctions and malware. Anomalies should be considered dangerous, as the dataset acquisition methods and evaluation of the detection breaking or disabling one component of the network can lead algorithm; finally, conclusions are drawn in section VIII. to the entire network being compromised. Detecting – and II. DESIGN classifying – an anomaly is an important step in preventing, or at least reducing, the potential damage caused by its The detection system has the following capacities: occurrence. 1) Offline traffic analysis. With this type of analysis a This article presents a method of detecting and classifying model of normal behavior for the network can be created. It an anomaly using an artificial neural network that analyses also provides information about different anomalies. data received with the Netflow protocol. Implementation of offline analysis requires a dataset of normal network behavior for a certain period. The topic of using machine learning for intrusion and anomaly detection is a well researched one [2], [3]. As stated 2) Online traffic analysis. This is the primary mode for in [2], using neural networks and other means of adaptable the system. Data from the Netflow protocol is received, systems for network monitoring presents a set of challenges, processed and stored by the system; different aspects of it are such as: analyzed by an artificial neural network in real time. If an anomaly is detected, a warning will be issued along with a • high cost of errors; report containing information about the incident. • difficulties of obtaining labeled data for specific kinds Data aggregation allows the system to see patterns in the of anomalies/attacks; otherwise highly variable traffic properties. Netflow protocol ISSN 2305-7254 ______________________________________________________PROCEEDING OF THE 20TH CONFERENCE OF FRUCT ASSOCIATION is used to receive information from the network devices, which is then filtered and aggregated based on a number of criteria such as number of packets per hour, average packet size, port usage. This data serves as an input to a neural network, which consists of three layers: input layer, hidden layer and output layer. The outputs of the neural network show if an anomaly is present. For every known anomaly the system is designed to detect, there is a neuron that shows the probability of that anomaly occurring. There is an additional neuron for an “unknown” anomaly, as well as one for a “normal” behavior. We discuss further details on the internal design of the neural network in section VI. III. NETWORK ANOMALIES Network anomaly is a situation when the network behaves differently from its established pattern. There are numerous reasons for anomalies to appear: Fig. 2. An example of network anomaly: abnormally high number of • hardware malfunctions discarded packets • unauthorized actions of the staff • intentional security breaches involving the staff • network attacks • virus infection process • "flash-crowd" behavior Anomalies will have different features depending on their source. For example, hardware malfunctions are likely to have distinct drops in incoming/outgoing traffic, as presented in Fig. 1. Fig. 3. An example of network anomaly: GRE active The topic of network attacks has been researched extensively [4], [5]. We will give short descriptions along with identifiable properties of each anomaly we plan to detect. • DoS and DDoS attacks Denial of service attack aims to make an online service unavailable. It is usually carried out by overwhelming the target with superfluous requests, using multiple sources in case of a Distributed DoS. This type of attack is regarded as one of Fig. 1. An example of network anomaly: uplink hardware failure the most common. (D)DoS attacks often use minimal packet sizes, set an incorrect protocol number and use the same Another examples are the rising number of discarded source/destination addresses. Targeted service experiences a packets, as shown in Fig. 2, and the result of GRE protocol noticeable increase of traffic flow and specific port number being active in Fig. 3. usage. There are multiple subtypes of this attack: UDP Flood, ICMP Ping Flood, SYN Flood, NTP Amplification etc. While unauthorized actions and intentional security breaches might be detectable through traffic analysis, in this • Network scans work we primarily focus on monitoring anomalies caused by network attacks, failures of hardware and changes in network Scanning attacks are performed by making multiple behavior. attempts to connect with different hosts in order to find ports ---------------------------------------------------------------------------- 27 ---------------------------------------------------------------------------- ______________________________________________________PROCEEDING OF THE 20TH CONFERENCE OF FRUCT ASSOCIATION and addresses that are open for connections. This procedure network monitoring" [6]. It became a de-facto standard and is can be performed by network administrators to check security used widely on Cisco devices. There are similar protocols, levels, as well as by potential adversaries to find most notably sFlow and IPFIX, however comparison between vulnerabilities. Network scans can be spotted by observing an them beyond the scope of this article. increase in connections from a certain host, or from multiple hosts to one. Traffic generated by this type of attack is Netflow uses UDP/SCTP protocols to transfer data from comparatively small. routers to special collectors. According to protocol description, all packets with the same source/destination IP address, • Flash-crowd source/destination ports, protocol interface and class of service are grouped into a flow and then packets and bytes are tallied This anomaly can be mistaken for a DDoS attack. It [6]. Then these flows are bundled together and transported to happens when public’s interest towards a particular resource is the Netflow collector server. significantly increased; an example of this is a release of long- awaited program that a group of people rushes to download or Most common version of Netflow is 5 [6]. The following access. One of the ways to

Network Anomaly Detection Using Artificial Neural Networks

A Review on Outlier/Anomaly Detection in Time Series Data

Image Anomaly Detection Using Normal Data Only by Latent Space Resampling

Unsupervised Network Anomaly Detection Johan Mazel

Machine Learning and Extremes for Anomaly Detection — Apprentissage Automatique Et Extrêmes Pour La Détection D’Anomalies

Machine Learning for Anomaly Detection and Categorization In

CSE601 Anomaly Detection

Anomaly Detection in Data Mining: a Review Jagruti D

Anomaly Detection in Raw Audio Using Deep Autoregressive Networks

Detecting Anomalies in System Log Files Using Machine Learning Techniques

Anomaly Detection with Adversarial Dual Autoencoders

Backpropagated Gradient Representations for Anomaly Detection

Machine Learning for Automated Anomaly Detection In