Classifying Service Flows in the Encrypted Skype Traffic

Maciej Korczyński and Andrzej Duda
Grenoble Institute of Technology, CNRS Grenoble Informatics Laboratory UMR 5217, Grenoble, France.
Email: [maciej.korczynski, andrzej.duda]@imag.fr

Abstract: In this paper, we consider the problem of detecting Skype traffic and classifying Skype service flows such as voice calls, skypeOut, video conferencing, chat, file upload and download. We propose a classification method for Skype encrypted traffic based on the Statistical Protocol IDentification (SPID) that analyzes statistical values of some traffic attributes. We have evaluated our method on a representative dataset to show excellent performance in terms of Precision and Recall.

I. INTRODUCTION

Accurate traffic identification and classification are essential for proper network configuration and security monitoring. Application-layer encryption can however bypass restrictions set by network configuration and security checks. In this paper, we focus on Skype as an interesting example of encrypted traffic and provide a method for identifying different Skype flows inside encrypted TCP traffic: we want to discriminate between voice calls, video conferencing, skypeOut calls, chat, and file sharing. Previous papers on Skype concentrated on its architecture and the authentication phase [1], [2], [3], on the mechanisms for firewall and NAT traversal [4] as well as on characterizing traffic streams generated by VoIP calls and Skype signaling [5], [6]. Bonfiglio et al. proposed identification methods for encrypted UDP Skype traffic [7], but no work has considered encrypted TCP Skype flows.

Skype exemplifies the problem of identifying encrypted flows, because it multiplexes several services using the same ports: VoIP calls, video conferencing, instant messaging, or file transfer. A network administrator may assign a higher priority to VoIP calls, but other flows may also benefit in an illegitimate way from a higher priority if we cannot distinguish them from VoIP calls.

We propose a classification method for Skype encrypted traffic based on the Statistical Protocol IDentification (SPID) [8] that analyzes statistical values of flow and application layer data. We consider a very special case of Skype traffic that is, in addition to proprietary encryption, tunneled over Transport Layer Security (TLS) protocol version 1.0. We propose an appropriate set of attribute meters to detect encrypted Skype TCP traffic and identify Skype service flows. Our method involves three phases with progressive identification. To select the right attribute meters for each phase, we applied a method called forward selection [9] that evaluates how a given attribute meter improves classification performance and promotes it to the traffic model if its influence is significant. Forward selection uses the Analysis of Variance (ANOVA) [10]. We have evaluated our classification method on a representative dataset to show excellent performance in terms of Precision and Recall.

To the best of our knowledge, this is the first work that proposes an accurate method for classifying encrypted Skype service TCP flows tunneled over the TLS protocol.

II. ISSUES IN THE ANALYSIS OF SKYPE TRAFFIC

Skype traffic presents a major challenge for detection and classification, because of proprietary software, several internal obfuscation mechanisms, and a complex connection protocol designed for bypassing firewalls and establishing communication regardless of network policies.

Skype differs from other VoIP applications, because it relies on a Peer-to-Peer (P2P) infrastructure while other applications use the traditional client-server model. Skype nodes include clients (ordinary nodes), supernodes, and servers for updates and authentication. An ordinary node with a public IP address, sufficient computing resources and network bandwidth may become a supernode. Supernodes maintain an overlay network, while ordinary nodes establish connections with a small number of supernodes. Authentication servers store the user account information. A Skype client communicates with other nodes directly or in an indirect way via other peers that relay packets. Skype can multiplex different service flows on an established connection: voice calls to another Skype node, skypeOut calls to phones, video conferencing, chat, file upload and download. Our goal is to detect and classify the service flows in Skype traffic. We cannot use traditional port-based flow identification methods, because Skype randomly selects ports and switches to port 80 (HTTP) or 443 (TLS 1.0) if it fails to establish a connection on chosen ports.

Another feature of the Skype design is the possibility of using both TCP and UDP as a transport protocol. Skype uses TCP to establish an initial connection and then it can interchangeably use TCP or UDP depending on network restrictions.

Skype encrypts its traffic with the strong 256-bit Advanced Encryption Standard (AES) algorithm to protect from potential eavesdropping. However, some information in the UDP payload is not encrypted so that a part of the Skype messages encapsulated in UDP can be obtained and used for identification [7]. We propose an accurate method for classification of service flows inside encrypted TCP Skype traffic tunneled

Table I
DEFINITION OF ATTRIBUTE METERS USED IN CLASSIFICATION

Attribute meter – Definition

byte-frequency – $M_1: \{(k, p_k)\},\ k = 0, 1, \ldots, 255;\ p_k = \frac{m_k}{\sum m_k},\ m_k = \sum_{i=1}^{8} \sum_{j=1}^{100} \delta_{x_j^i}$

action-reaction of first 3 bytes – $M_2: \{(h^i, p_{h^i}),\ \forall i \in (1,3)\},\ h^i: (y^i_{3\Delta}, z^i_{3\Delta}) \to h(y^i_{3\Delta}, z^i_{3\Delta}),\ p_{h^i} = \frac{m_{h^i}}{\sum m_{h^i}},\ m_{h^i} = \delta_{h(y^i_{3\Delta}, z^i_{3\Delta})}$

byte value offset hash – $M_3: \{(h, p_h)\},\ h: (j, x_j^i) \to h(j, x_j^i),\ p_h = \frac{m_h}{\sum m_h},\ m_h = \sum_{i=1}^{4} \sum_{j=1}^{32} \delta_{h(j, x_j^i)}$

first 4 packets byte reoccurring distance with byte – $M_4: \{(h, p_h)\},\ \forall d \le 16:\ h: (x_j^i, d) \to h(x_j^i, d),\ p_h = \frac{m_h}{\sum m_h},\ m_h = \sum_{i=1}^{4} \sum_{j=1}^{32} \delta_{h(x_j^i, d)}$

first 4 packets first 16 byte pairs – $M_5: \{(h, p_h)\},\ h: (x_j^i, x_{j+1}^i) \to h(x_j^i, x_{j+1}^i),\ p_h = \frac{m_h}{\sum m_h},\ m_h = \sum_{i=1}^{4} \sum_{j=1}^{16} \delta_{h(x_j^i, x_{j+1}^i)}$

first 4 ordered direction packet size – $M_6: \{(f, p_f)\},\ f: (i, s(x^i), dir(x^i)) \to f(i, s(x^i), dir(x^i)),\ p_f = \frac{m_f}{\sum m_f},\ m_f = \sum_{i=1}^{4} \delta_{f(i, s(x^i), dir(x^i))}$

first packet per direction first N byte nibbles – $M_7: \{(f, p_f)\},\ \forall x^1 \in \{z^1, y^1\}:\ f: (nib(x_j^1), j, dir(x^1)) \to f(nib(x_j^1), j, dir(x^1)),\ p_f = \frac{m_f}{\sum m_f},\ m_f = \sum_{j=1}^{8} \delta_{f(nib(x_j^1), j, dir(x^1))}$

direction packet size distribution – $M_8: \{(f, p_f)\},\ f: (s(x^i), dir(x^i)) \to f(s(x^i), dir(x^i)),\ p_f = \frac{m_f}{\sum m_f},\ m_f = \sum_{i=1}^{s(x)} \delta_{f(s(x^i), dir(x^i))}$

byte pairs reoccurring count – $M_9: \{(f, p_f)\},\ \forall x_j^i = x_j^{i+1}:\ f: (x_j^i, dir(x_j^i), dir(x_j^{i+1})) \to f(x_j^i, dir(x_j^i), dir(x_j^{i+1})),\ p_f = \frac{m_f}{\sum m_f},\ m_f = \sum_{i=1}^{s(x)} \sum_{j=1}^{32} \delta_{f(x_j^i, dir(x_j^i), dir(x_j^{i+1}))}$

Table II
NOTATION

$M: \{(k, p_k)\}$ – attribute meter
$m_k$ – attribute meter counter
$p_k,\ k = 0, 1, 2, \ldots$ – probability distribution of an attribute meter (corresponds to $Q(x)$ in traffic model generation and $P(x)$ in traffic classification)
$\delta$ – indicator function; $\delta: X \to \{0, 1\}$, $\delta_{x_j^i} = 1$ if $X = x_j^i$, $0$ if $X \neq x_j^i$
$h$ – hash function, $h = 0, 1, 2, \ldots$
$f$ – compressing function, $f = 0, 1, 2, \ldots$
$x_j^i$ – byte $j$ in packet $i$
$x_{j(m)}^i$ – bit $m$ in byte $j$ in packet $i$
$x \leftrightarrow \sum_i x^i$ – all packets in a TCP session
$y^i$ – packet $i$, $z^i$ – packet sent in a different direction than $y^i$
$x_{\Delta j}^i$ – first $j$ bytes in packet $i$
$d$ – distance between two identical bytes; if $x_j^i = x_{j-d}^i \Rightarrow d,\ 0 < d < j$
$s(x)$ – size of $x$; amount of packets in a TCP session
$s(x^i)$ – size of packet $x^i$ in bytes
$dir$ – packet direction
$nib$: $x_j^i \leftrightarrow x_{j(m \in (1 \ldots 8))}^i$; $x_{j(m \in (1 \ldots 4))}^i$ XOR $x_{j(m \in (5 \ldots 8))}^i \Rightarrow nib(x_j^i)$

We consider a set of $n$ attribute meters $x_1, \ldots, x_n \in X$ and a set of $m$ Skype services. We begin with a model that includes the most significant attribute in the initial analysis. More precisely, we compute F-Measure defined as:

$$\text{Precision} = \frac{TP}{TP + FP}, \quad \text{Recall} = \frac{TP}{TP + FN},$$

$$F\text{-Measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, \quad (2)$$

for a particular Skype service and for each individual attribute meter. The True Positive (TP) term refers to all Skype flows that are correctly identified, False Positives (FPs) refer to all flows that were incorrectly identified as Skype traffic. Finally, False Negatives (FNs) represent all flows of Skype traffic that were incorrectly identified as other traffic.

We select attribute $x_i \in X$ with the largest average F-Measure defined as $\max_{x \in X} \frac{1}{m} \sum_{a \in (1,m)} FM_a^x$, where $FM_a^x$ denotes $a$-th observation of F-Measure value corresponding to $x$-th attribute meter.

In the next step, each of the remaining attributes $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n \in X$ is tested for inclusion in the model. We run several F-tests (explained below) that compare the variance of F-Measure values obtained in the preliminary selection, i.e.

Let us focus on a particular F-test [10] that compares the influence of attribute meter $x_j \in x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n \in X$ with the first model based on $x_i \in X$. We examine two groups of F-Measure values $FM_a^{x_i}$ and $FM_a^{x_{ij}}$ that respectively correspond to attribute $x_i$ and to the set of two attribute
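As an illustration only, the F-Measure of equation (2) and the first forward-selection step (choosing the attribute meter with the largest average F-Measure over the Skype services) can be sketched in Python. This is not the authors' implementation: the function names, the TP/FP/FN counts, and the F-Measure values of the two-meter model are hypothetical. The text above breaks off before it describes the exact test, so `anova_f` uses the textbook one-way ANOVA F statistic as an assumption about what the F-test on two groups of F-Measure values refers to.

```python
from statistics import mean
from typing import Mapping, Sequence


def f_measure(tp: int, fp: int, fn: int) -> float:
    """F-Measure as in equation (2); returns 0.0 when it is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def best_meter(scores: Mapping[str, Sequence[float]]) -> str:
    """Return the meter with the largest average F-Measure over all services."""
    return max(scores, key=lambda name: mean(scores[name]))


def anova_f(groups: Sequence[Sequence[float]]) -> float:
    """Textbook one-way ANOVA F statistic (assumed form of the F-test)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0.0:
        raise ValueError("zero within-group variance: F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical (TP, FP, FN) counts, one tuple per Skype service.
counts = {
    "byte-frequency": [(90, 10, 10), (80, 20, 20)],
    "direction packet size distribution": [(95, 5, 5), (90, 10, 10)],
}
scores = {name: [f_measure(*c) for c in obs] for name, obs in counts.items()}
selected = best_meter(scores)

# Hypothetical F-Measures of the model extended with a second meter.
pair_scores = [0.96, 0.93]
f_value = anova_f([scores[selected], pair_scores])
print(selected, f_value)
```

In a complete forward selection, the F value would be compared against a critical value of the F distribution for the chosen significance level before the second meter is promoted to the model; that threshold is not given in the text above and is left out of the sketch.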