<<

Available online at www.sciencedirect.com ScienceDirect

Procedia Computer Science 93 ( 2016 ) 893 – 901

6th International Conference on Advances In Computing & Communications, ICACC 2016, 6-8 September 2016, Cochin, A Scalable Detection Technique for Real-time Transport Protocol (RTP) Flooding Attacks in VoIP Network

G.Vennilaa*, MSK Manikandanb

a Department of Electronics and Communication Engineering, Thiagarajar College of Engineering, Madurai, Tamilnadu, India-625015 b Department of Electronics and Communication Engineering, Thiagarajar College of Engineering, Madurai, Tamilnadu, India-625015

Abstract

Voice over Internet Protocol (VoIP) has become widely uses RTP as its media protocol that delivers multimedia sessions over IP based network. RTP is the hub of media flooding attacks that causes inconsistency of media transmitted which degrading Quality of Service (QoS) and unnecessary resource consumption in servers. Thus, the motto of the paper is revolved around detection of RTP flooding attacks. In this paper, Packet Inspection using Statistical (PIS) technique is proposed that detects RTP flooding attacks. The experimental results show that the proposed PIS technique has a detection rate of more than 98% with False Positive (FP) of less than 0.1%. © 2016 The Authors.Authors. Published Published by by Elsevier Elsevier B.V. B.V. This is an open access article under the CC BY-NC-ND license Peer-review(http://creativecommons.org/licenses/by-nc-nd/4.0/ under responsibility of the Organizing). Committee of ICACC 2016. Peer-review under responsibility of the Organizing Committee of ICACC 2016 Keywords —VoIP; RTP; SIP; Chi-square test; Hellinger distance; Statistical inspection;

1. Introduction

An increasing the number of IP based business applications invent different IP based services. Such applications attract consumers by offering low cost with greater flexibility for making a long call over Internet based model. Since this fragile architecture of the Internet model and their utility of open standards like Session Initiation Protocol (SIP) and RTP contains the provision for intellectual vulnerabilities. The survey papers1,2,3 discussed different types of VoIP attacks and solutions. The summarization of these paper revealed that the RTP attacks were less dealt. The flooding attacks can be launched by any combination of protocols with different from a number of distributed locations and spoofing of INVITE messages. In the real time system, those attacks can disturb the QoS along with ongoing communication between users.

* Corresponding author: Tel. 0452 2482430 Email address: [email protected]

1877-0509 © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the Organizing Committee of ICACC 2016 doi: 10.1016/j.procs.2016.07.278 894 G. Vennila and MSK Manikandan / Procedia Computer Science 93 ( 2016 ) 893 – 901

Therefore, both SIP and RTP requires security mechanism to provide a reliable service and hence, many researchers focusing towards the signaling attacks detection system that would be either flooding attacks4,5,6 or malformed message attacks7,8,9or both9,10,11,12. The authors4,5,6 detect SIP flooding by sketch data structure, which monitor and store specific features of SIP traffic. The authors assume that the attacker cannot deduce the normal sketch; this solution is not a feasible one in real time application, because the sketch method can change the standard SIP configuration during client registration phase. Similarly, the Hellinger Distance12 computes the deviation between training and test set probability distribution. But this method is ineffective when four attributes (INVITE, 200 OK, BYE, ACK) are proportionally flooded simultaneously and passive monitoring.

2. RTP Flooding Attack

VoIP network requires two level of security mechanism. One for protecting the signaling phases to establishment a media session and, other for protecting the media security. Thus, the media security is strongly coupled with signaling security, because the media sessions are described by the Session Description Protocol (SDP). Many researchers briefly described the SIP flooding2-10. This paper presents the RTP flooding attack; the attacker floods the media packets during media conversation phase as shown in Fig.1. Flooding generates a baggage of valid RTP voice messages (both the header and the payload are filled with random bytes) over UDP to the targeted clients in a dialog phase and makes the server prepare the response messages, with the objective of exhausting bandwidth. This can be achieved by hacking of legitimate user credential information (INVITE). Therefore, the sequence number in succeeding media packets should increase frequently else we must check the RTP packets of legitimate users5. Such consequence of flooding results server overload or crash, unwanted traffic rate, customer service to be delayed or frozen, unsuccessful call completion rate and finally, some innocent server to be overloaded. Thereby, the voice call quality is affected by intermittent voice transfer.

Fig. 1. An example RTP flooding attack

3. Proposed Packet Inspection Statistical Technique

The proposed PIS technique consists of two phases namely, feature extraction and decision phase. The PIS technique is the first entity to communicate with the clients and capture all SIP and RTP packets as shown in Fig.2. In this paper, we examine the media traffic information in terms of request message rate, payload and size of the message during the conversation phase. When the request rate increases or the sequence number varies from original one, those packets are reported to the SIP server. G. Vennila and MSK Manikandan / Procedia Computer Science 93 ( 2016 ) 893 – 901 895

In the existing flood detection based on HD with sketch5, the training set was captured during attacker in normal traffic. The HD method becomes ineffective when attacker changing their strategies. Therefore, in this work, the training dataset is said to be completely free of attack packets and used for comparison with the test set. The reason is that the legitimate user does not change their calling pattern. In the first phase, extract continuous stream of RTP packets without following any media encoding scheme and statistical observations were made for t sec sampling duration by running multiple VoIP applications with variable codec. The decision phase calculates the HD between the test and training set. If the measured distance exceeds the threshold, identified as the injected media attacks, else, the packet is added to the training set.

Fig. 2. The proposed system architecture

3.1 Feature extraction phase

The analysis is initiated by running multiple VoIP applications (, Zoiper, , X-lite and Viber) in ten systems running on Windows and kernel in a closed LAN environment. The call was established between client and Asterisk server and the traces of the applications were captured in Wireshark for a day, followed by a week and finally for a six month and feature extraction was done with a total 14523 calls. Out of different statistical features such as payload type, sequence number, timestamp, and SSRC, we concentrated on request message rate, payload and size of the message. Request message rate gives the information about the signalling (INVITE, BYE and ACK) and media packets for a given time. Payload type is a value indicating which codec is used to encode the audio payload and the size of the message indicates the value of message in bits/sec. In addition, the following analysis also included,

 The occurrence of individual events (signalling and media flow)  The sequence of the occurrence of the events  Duration of the events

Flooding typically consists of series of media data transferred by p users. The vector representation of the frequency of media transferred by individual users gives the least amount of information of the given user series. For obtaining the frequency property of individual event, we use a random variable to represent { 1 , 2 , 3 SSSS p },.... the occurrence of flooding attack that follows a given sequence of events such as INVITE followed by MEDIA conversation and then the corresponding BYE by each user. In addition, the sequences of events that are used for updating during media transfer such as MEDIA followed by UPDATE followed by INVITE succeeded with the MEDIA for each user was also observed. Because, during conversation phase the updating of the media sessions

896 G. Vennila and MSK Manikandan / Procedia Computer Science 93 ( 2016 ) 893 – 901

(i.e.,) audio to video and the reverse are updated using signalling messages. Hence, the sequence of these events was also observed. Flooding attacks manifest high frequency of a particular event and its corresponding duration

property is represented using  , 21 ,.... SSS p to represent the duration of p different users for the above mentioned events.

3.2 Decision phase

After extraction of the required payload and size of message features, the decision phase follow mathematical model of HD with chi-square test to decide the original media streams and the flooded media streams in RTP. Let S={S1,S2,S3,…SP}and R={R1,R2,R3,…RP}be the training and testing vectors of size n with S and R representing the frequency of the occurrence of discrete media streams at time t transmitted by each user. By using the data of size n,

the corresponding mean vector of the training vector S is calculated  { 1, 2 , 3 ,....SSSSS p } .The respective HD between and RS is given by,

2 1 d RS ),(  H N 2 (1) 2   RS  1 Where,

 S  1 and  R  1   The HD satisfies the inequality if S=R. The attributes of MEDIA, INVITE, BYE and ACK are considered, the probability measures for each attributes are obtained over the given training data such as SMEDIA, SINVITE, SBYE and SACK for the training set and RMEDIA, RINVITE, RBYE and RACK for the testing set. In order to model the traffic behavior along time, threshold is utilised to reflect normal condition and differentiate anomalies. Hence, the packet size of 150 bytes is maintained and the rate of RTP flooding is gradually increased from 50 packets per second (pps) to 1000 pps. However, there were fluctuations over time for normal traffic leading to non-stationary of the threshold and the maximum distance is identified to be 8*10-3 as shown in Fig.3. However, the existing HD method was incorporated for VoIP flooding taking different statistical method into account4,5. The proposed PIS technique follows HD along with chi-square test which reduces time complexity with less overhead. The probability at time instant t=0 is calculated by observing the number of incoming RTP packets for each stream of media identified with source address, destination address and sequence numbering RTPnobserved.

-3 x 10 9

8

7

6

5

4

3 Hellinger Distance

2

1

0 0 10 20 30 40 50 60 Time (minutes) Fig. 3. Hellinger distance for the given dataset

For obtaining the theoretical value RTPntheroritical for the RTP stream is seen from the communication between SIP and RTP. Thus, the individual media attribute for each call is integrated together by computed, N i RTPntheoretical   RTPntheoretical (2) i1 G. Vennila and MSK Manikandan / Procedia Computer Science 93 ( 2016 ) 893 – 901 897

where, i RTPntheoretic al =packets in the stream i for time t The value of RTP is assumed to be 50 pps for RTP stream coded with G.711 codec. Anything beyond theoritical 2500 pps is said to be a deviation. The HD is obtained from RTPtheoretical and RTPnobserved .

 RTP  (3)  ntheoretical  Stheoretical      RTP  RTP     ntheoretical nobserved  

 RTP   nobserved  (4) Sobserved      RTP  RTP     ntheoretical nobserved   Similarly for the test set, we obtain the values for testing time of t is,

 RTP * t  (5)  ntheoretical  R  theoretical     RTP * t  RTP     ntheoretical nobserved  

 RTP  (6)  nobserved  R  observed     RTP * t  RTP     ntheoretical nobserved   and for the corresponding media stream HD is calculated as, 2 2 (7) HD   Stheoretical  Rtheoretical    Sobserved Robserved The corresponding variance (Var) for the RTP stream of test and training data is obtained with, 1 n 1 n (8) Var   Si SS i  S))((  Var   Ri RR i  R))((  n  1 i1 , n  1 i1 We identify Si from the S vector. Let us consider S1be the sequence of events from the first user and Spbe the series of events generated by Pth user in the observed call vector. The duration between the first and second user is 2 calculated by TSP-TSP-1 at instant t. The required media flooding packet is identified by using the T statistic, which is calculated by using, 2  1   SST  Var   SS  (9) We calculate the mean, variance and the threshold are computed for each value of the training data. Owing to the high computational overhead required for the computation of HD and T2. A chi-square test is performed for the training set. The data is said to satisfy the criteria if, n 2  S  S  2 i i (10) S  i1 Si n 2  R  R  2  i i (11) R  i 1 Ri 2 However, unlike HD and T the chi-square test does not consider the correlation of variables of events generated by individual users in set S and R., rather the value of mean vector (i.e.) the series of events generated by individual

898 G. Vennila and MSK Manikandan / Procedia Computer Science 93 ( 2016 ) 893 – 901

users is enough to detect the mean deviations. The required deviation between the real RTP media packet and flooded RTP media packet is obtained for time instance t given by, 2 n O  E  (12)   n n i1 En For each RTP stream over UDP belonging to either of flooded or normal stream, the stream is coded to G group of users. The chi square test is computed for each block of b sub group of users such as i=1,2,3…G. b g 2 12 Oi  Ei )( (13) Cg   i0 Ei g th The Oi represents the occurrence of the g block that has a value of i .

4. Results and Discussion

In this section, we discuss the performance of the proposed technique by examining the results in the experimental test. From the table 1, it is observed that as the arrival rate for the server is increased, the number of RTP media floods keeps increasing and at a rate of the maximum arrival rate of λ=20, the attack rate increases without limit and becomes infinite. Table 1. Attack packet distribution

Server Arrival rate (λ/hr) Attack intensity (per sec) Trace (mins) No. of calls

2 32 5 52 5 76 10 76 Asterisk Server 8 92 15 97 10 97 15 153 15 283 20 500 18 575 25 730

4.1 RTP flooding rate

The attack rate for the PIS as shown in Fig. 4 is used as a measure for obtaining a statistical result of the number of RTP flooded packet generated for a call flow, the graph shows the increasing pattern of attack rate, or a call volume of 900 Erlangs, the attack rate is more than 90 packets i.e., attack packets are flooded at an average of 10 times of original RTP media stream.

100

90

80

70

60

50

40 Attack Rate

30

20

10

0 0 100 200 300 400 500 600 700 800 900 1000 Time Interval (mins) Fig. 4. Flooding attack rates G. Vennila and MSK Manikandan / Procedia Computer Science 93 ( 2016 ) 893 – 901 899

4.2 RTP flooding detection rate Fig. 5 depicts the ability of the proposed PIS technique detecting more attack rates than the HD method for increasing arrival rates. From the table I, it is clear that the attack rate linearly increases as the arrival rate of the packet increases, the PIS technique is seen to be 25% more effective in filtering RTP flooding packets.

Fig. 5. Comparision of HD and the proposed PIS technique

4.3 Testing computation time The time complexity for computation of HD for every new incoming packet is  mnO )( , where nO )( measures the HD computation time during dynamic updation and the time it takes for comparing it with previously obtained HD value, hence the testing phase outcome is said to have an additional time lag of mO )( . Whereas, in the PIS technique, the threshold is compared only with the metadata of possible values of HD from training phase, the overall time complexity for comparison is maintained to nO 2 )( . The proposed PIS technique offers reduced computation time than the HD is shown in Fig.6.

Fig. 6. Time complexity comparison

900 G. Vennila and MSK Manikandan / Procedia Computer Science 93 ( 2016 ) 893 – 901

4.4 False alarm rate (FPR)

The false alarm rate defines the number of wrongly classified RTP flooded packets. The result is obtained by comparing the call volume versus false alarm rate, the PIS technique is found to have a good effect on correctly identifying the RTP flooded packet. As the call volume increases, the false alarm rate of misclassified RTP flooded packet is less than 0.1 for the technique proposed is shown in Fig.7.

1

0.8

0.6

0.4 False Alarm Rate Alarm False

0.2

0 0 100 200 300 400 500 600 Call Volume (Erlang) Fig. 7. False alarm rate

4.5 Hit rate

The hit rate of the PIS technique is computed by Eq.14,

RTPclassified Hit rate  (14) RTPclassified  RTPflooded misclassified The hit rate gives the number of RTP flooded packets correctly identified in the total call volume generated. The proposed PIS technique exactly identifies more than 98% of the RTP flooded packets based on the threshold computed as shown in Fig.8.

100

90

80

70

60

50

40

30

Hit Rate(True Positive Rate) 20

10

0 0 100 200 300 400 500 600 700 800 900 1000 Call Volume (Erlangs) Fig. 8. Proposed PIS technique hit rate G. Vennila and MSK Manikandan / Procedia Computer Science 93 ( 2016 ) 893 – 901 901

Table 2. False positive and True positive rate

Method True positive (%) False positive (%)

Hellinger Distance with 92.831 8.76 sketch5

Proposed PIS technique 98.234 1.87

5. Conclusion

In this paper, we provided an innovative approach to design a VoIP security system with specific technique for RTP flooding attack. The proposed PIS technique detects RTP flooding with an increasing attack rate with a FPR less than 0.1 (i.e. less than 10% of misclassification). The experimental result proves that the detection rate of the algorithm has more than 98% and thus, from the classified results, the Asterisk server takes action over the packets, either to accept or drop the packets. The suspicion technique proposed here is more predictive towards RTP attacks. Furthermore, the proposed technique augments the robustness of the secured VoIP network.

References

1. Keromytis, Angelos D. A comprehensive survey of voice over IP security research. IEEE Communications Surveys & Tutorials 2012; 14(2): 514-537. 2. Ehlert, Sven, Dimitris Geneiatakis, and Thomas Magedanz. Survey of network security systems to counter SIP-based denial-of-service attacks. Journal of Computers & Security 2010; 29(2):225-243. 3. Geneiatakis D, Dagiuklas T, Kambourakis G, Lambrinoudakis C, Gritzalis S, Ehlert S, Sisalem D. Survey of Security Vulnerabilities in Session Initiation Protocol. IEEE Communication Surveys & Tutorials 2006; 8(3): 68-81. 4. Tang Jin, Yu Cheng, and Yong Hao. Detection and prevention of SIP flooding attacks in voice over IP networks. IEEE INFOCOM 2012: 1161-1169. 5. Tang J, Cheng, Y, Hao. Y. SIP Flooding Attack Detection with a Multi-Dimensional Sketch Design. IEEE Transactions on Dependable and Secure Computing 2014; 11(6): 582-595. 6. Tang Jin, Yu C. SIP Flooding Attack Detection. Springer Briefs in Computer Science 2013: 53-70. 7. Geneiatakis D, Kambourakis G, Dagiuklas T, Lambrinoudakis C, Gritzalis S. A framework for detecting malformed messages in SIP networks. 14th IEEE workshop on local and metropolitan area networks (LANMAN). Greece; September 2005. 8. Rieck K, Wahl S, Laskov P, Domschitz P, Muller KR. A self-learning system for detection of anomalous SIP messages. Principles, systems and applications of IP telecommunications (IPTComm 2008). Heidelberg, Germany; July 2008. 9. Wu YS, Bagchi S, Garg S, Singh N, Tsai T. SCIDIVE: A stateful and cross protocol intrusion detection architecture for Voice-over IP environments. International conference on dependable systems and networks (DSN-2004) 2004. Italy. 10. Dongwon S, Heejo L, and Ejovi N. SIPAD: SIP–VoIP anomaly detection using a stateful rule tree. Journal of Computer Communications 2013; 36(5): 562-574. 11. Sengar H, Wijesekera D, Wang H, Jajodia S. VoIP intrusion detection through interacting protocol state machines. International conference on dependable systems and networks (DSN-2006). USA; June 2006 12. Sengar H, Haining W, Duminda W, and Sushil J. Detecting VoIP floods using the Hellinger distance. IEEE Transaction on Parallel and Distributed Systems 2008; 19(6):794-805.