Mining Web User Behaviors to Detect Application Layer Ddos Attacks
Total Page:16
File Type:pdf, Size:1020Kb
JOURNAL OF SOFTWARE, VOL. 9, NO. 4, APRIL 2014 985 Mining Web User Behaviors to Detect Application Layer DDoS Attacks Chuibi Huang1 1Department of Automation, USTC Key laboratory of network communication system and control Hefei, China Email: [email protected] Jinlin Wang1, 2, Gang Wu1, Jun Chen2 2Institute of Acoustics, Chinese Academy of Sciences National Network New Media Engineering Research Center Beijing, China Email: {wangjl, chenj}@dsp.ac.cn Abstract— Distributed Denial of Service (DDoS) attacks than TCP or UDP-based attacks, requiring fewer network have caused continuous critical threats to the Internet connections to achieve their malicious purposes. The services. DDoS attacks are generally conducted at the MyDoom worm and the CyberSlam are all instances of network layer. Many DDoS attack detection methods are this type attack [2, 3]. focused on the IP and TCP layers. However, they are not suitable for detecting the application layer DDoS attacks. In The challenges of detecting the app-DDoS attacks can this paper, we propose a scheme based on web user be summarized as the following aspects: browsing behaviors to detect the application layer DDoS • The app-DDoS attacks make use of higher layer attacks (app-DDoS). A clustering method is applied to protocols such as HTTP to pass through most of extract the access features of the web objects. Based on the the current anomaly detection systems designed access features, an extended hidden semi-Markov model is for low layers. proposed to describe the browsing behaviors of web user. • Along with flooding, App-DDoS traces the The deviation from the entropy of the training data set average request rate of the legitimate user and uses fitting to the hidden semi-Markov model can be considered as the abnormality of the observed data set. Finally the same rate for attacking the server, or employs experiments are conducted to demonstrate the effectiveness large-scale botnet to generate low rate attack flows. of our model and algorithm. This makes DDoS attacks detection more difficult. • Burst traffic and high volume are the common Index Terms—HsMM, web user behaviors, DDoS, DDoS characteristics of App-DDoS attacks and flash Attacks , clustering crowds. "Flash crowd" refers to the situation when a very large number of users simultaneously access a popular Website, which produces a surge I. INTRODUCTION in traffic to the Website and might cause the site to Distributed Denial of Service (DDoS) attacks have be virtually unreachable [4]. It is not easy for become a serious problem in recent years. Generally, current techniques to distinguish APP-DDoS DDoS attacks are carried out at the network layer, such as attacks from flash crowds. ICMP flooding, SYN flooding and UDP flooding. From the literature, few exiting researches focus on the Without advance warning, a DDoS attack can easily detection of app-DDoS attacks during the flash crowd exhaust the computing and communication resources of event. This paper introduces the hidden semi-Markov its victim within a short period of time [1]. Because of the model (HsMM) to capture the user browsing patterns seriousness and destructiveness, many studies have been during flash crowd and to implement the app-DDoS conducted on this type of attacks. A lot of effective attacks detection. Our contributions in this paper are schemes have been proposed to protect the network and threefold: 1) We use the clustering method to extract the equipment from bandwidth attack, so it is not as effortless web objects' access features, which can well portray as in the past for attackers to launch the network layer current web user browsing behaviors. 2) We apply hidden DDoS attacks. In order to dodge detection, attackers shift semi-Markov model (HsMM) to describe the dynamics of their offensive strategies to application-layer attacks and access features and to implement the detection of app- establish more sophisticated types of DDoS attacks. They DDoS attacks. 3) Experiments based on real traffic are utilize legitimate application layer HTTP requests from conducted to validate our detection scheme. legitimately connected network machines to overwhelm The organization of the paper is as follows. In the next web server. These attacks are typically more efficient section, we review the related work of our research. In © 2014 ACADEMY PUBLISHER doi:10.4304/jsw.9.4.985-990 986 JOURNAL OF SOFTWARE, VOL. 9, NO. 4, APRIL 2014 Section 3, we explain our model and algorithm to detect crowd and app-DDoS attacks are unstable, bursty and the app-DDoS attacks. Experiment results are presented huge traffic volume. The app-DDoS attackers are in section 4. Finally, we conclude our paper in Section 5. increasingly moving away from pure bandwidth flooding to more surreptitious attacks that hide in normal flash II. RELATED WORK crowd of the website. It is a challenge to detect the app- DDoS attacks when they occur during a flash crowd Most current DDoS-related researches are conducted event. on the IP layer or TCP layer instead of the application To meet the above challenges, we focus on analysis of layer. These mechanisms typically utilize the specific the user behaviors. In this paper, we assume that it is network features to detect attacks. Mirkovic et al. [5] impossible for app-attacks to completely mimic the proposed a defense system called D-WARD located in normal web user behaviors. This assumption is based on edge routers to monitor the asymmetry of two-way packet the following consideration. Generally, web user access rates and to detect attacks. Yu et al. [6] discriminated behaviors can be described by three factors: HTTP DDoS attacks from flash crowds by using the flow request rate, page retention time, and request sequence. correlation coefficient as a similarity metric among The attackers can mimic the normal access behavior by suspicious flows. Zhang et al. [7] proposed the launching attacks with similar HTTP request rate and Congestion Participation Rate (CPR) metric and a CPR- retention time as the normal users. However, the app- based approach to detect and filter the TCP layer DDos DDoS attacks cannot simulate the dynamic process of the attacks. Cabrera et al. [8] depend on the MIB traffic web user behavior or the request sequence, because it is a variables collected from the systems of attackers to pre-designed routine and cannot capture the dynamics of achieve the early detection. Yuan et al. [9] capture the the users and the networks. Another assumption of our traffic pattern by using cross correlation analysis and then paper is that the user behavior can be described by the decide where and when a DDoS attack possibly arises. distribution of web objects popularity. Then the variation Soule et al. [10] used the traffic matrix, which represents of web objects popularity could represent the dynamic the traffic state, to identify various dynamic attacks at changing process of user behavior. Since app-DDoS early stage. In [11], DDoS attacks were discovered by attacker is unable to obtain historical access records from analyzing the TCP packet header against the well-defined the victim server, it cannot mimic the dynamic of normal rules and conditions and distinguished the difference user behaviors. We consider the app-DDoS attacks as between normal and abnormal traffic. In [12], attackes anomaly browsing behavior. Moreover, we also assume are detected by computing the rate of TCP flags to TCP that normal users always access the "hot webpages". packets received at a web server. However, there are little However, the attackers access the "cold webpages”. DDoS defense methods that utilize the application layer Therefore, we could build the browsing behavior model information. In [19], a suspicion assignment mechanism for both attacks and normal user by monitoring the Ranjan et al. [19] used statistical method to detect time dynamic change of the web objects popularity. Using this related characteristics of HTTP sessions, such as request model, we could distinguish the app-DDoS attack from interarrival time, session inter-arrival time and session normal user. arrival time. Yen et al. [21] defended the application DDoS attacks with constraint random request attacks by IV. MODEL AND DETECTION PRINCIPLE the statistical methods. Kandula et al. [3] designed a system to protect a web cluster from DDoS attacks by (i) A. Hidden Semi-Markov Model optimally dividing time spent in authenticating new Hidden semi-Markov model (HsMM) [14] is an clients and serving authenticated clients (ii) using extension of hidden Markov model (HMM) [23] with CAPTCHAs designing a probabilistic authentication variable state duration. It is a stochastic finite state mechanism, but the task of requiring users to solve machine, specified by λ = (,,,,Q π ABP )where: graphic puzzles causes additional service delay. As a • Q result, the graphic puzzle cause annoying legitimate users is a discrete set of hidden states with cardinality = ∈ as well as act as another DDoS attack points. [16] M , i.e., QN{1,...,} . qQt denotes the introduced a web browsing model represented by the state that the system takes at time t ; transition of the click pages. In a few previous studies, • π is the initial state probability distribution, i.e., analysis of user behaviors have been applied in many π =∈{π | mQ} π ≡=Pr⎡⎤qm . q research fields [13, 15, 17, 20, 22]. m , m ⎣⎦1 t denotes the state that the system takes at time t III. MODEL PRELIMINARIES AND ASSUMPTIONS and mQ∈ . The initial state probability π = 1; When detecting the app-DDoS attacks, we are faced distribution satisfies ∑ m m with the following challenges: 1) attacker may launch an • A is the state transition matrix with probabilities: app-DDoS attack by mimicking the normal web user a ≡=Pr⎡⎤qnqmmnQ | = , , ∈ ,and access behavior, so the malicious requests differ from the mntt⎣⎦−1 legitimate ones but not in traffic characteristics.