Using DPI and Statistical Analysis in Encrypted Network Traffic Monitoring

International Journal for Information Security Research (IJISR), Volume 10, Issue 1, 2020 Using DPI and Statistical Analysis in Encrypted Network Traffic Monitoring Luca Deri1, Daniele Sartiano2 1ntop/IIT-CNR 2IIT-CNR/University of Pisa Pisa, Italy Abstract The pervasive use of encrypted protocols and new • Availability of free and automated X.509 communication paradigms based on mobile and home certificates issued by non-profit certificate IoT devices has obsoleted traffic analysis techniques authority Let’s Encrypt has driven the adoption of that relied on clear text analysis. This has required HTTPS to new highs. new monitoring metrics being able to characterise, identify, and classify traffic not just in terms of • Computational overhead is no longer a problem network protocols but also behaviour and intended even on low-end devices, thus even home IoT use. This paper reports the lessons learnt while devices such as virtual assistants and smart home analysing traffic in both home networks and the devices relying on cloud-based services need to Internet, and it describes how monitoring metrics secure their communication with encryption. used in experiments have been implemented on an open source toolkit for deep packet inspection (DPI) As encryption is becoming pervasive with 87% of and traffic analysis developed by the authors. The the whole Internet traffic in 2019, it is becoming validation process confirmed that combining the important to provide network visibility in this new proposed metrics with DPI, it is possible to effectively changed scenario where clear-text protocols are used characterise and fingerprint encrypted traffic less frequently even though they are still relatively generated by home IoT and non-IoT devices, paving popular in LAN networks where obsolete operating the way to next generation DPI toolkit development. systems and outdated IoT devices will be used for some more years. This means that we need to 1. Introduction complement existing techniques with new measurements metrics able to inspect and characterise Network traffic has changed significantly in terms encrypted traffic for the purpose of identifying threats of network protocols and behaviour. Today most of and changes in network traffic behaviour. This is in the network traffic is encrypted and the reasons are particular because modern enterprises are rethinking manyfold: their network security moving off castle-and-moat approaches focusing on defending their perimeter to a • Changes in company network topologies with the new zero-trust model where no user is trusted based adoption of multi-cloud architectures require on the principle of “never trust always verify” [6]. In communications to be protected as they are no home networks the widespread use of IoT and longer limited to trusted LAN network segments healthcare devices that operate using cloud services traditionally protected by security devices. has created new security issues pushing towards the zero-trust model as users no longer interact directly • Devices such as mobile phone and portable PCs with the device but only through cloud services. This are used on public network and WiFi hotspots trend towards cloud-based security is present also on making compulsory to use encrypted products manufactured by leading firewall vendors communications in order to safely exchange that can be accessed solely using a cloud console and sensitive data while preserving privacy on no longer connecting to the firewall sitting on the potentially hostile networks. company premises. Providing network visibility is the base on which security of modern networks works, as it is • New multi-language cryptographic libraries such compulsory to implement mechanisms to enforce as Amazon s2n and Google Tink made encryption network policies that enable zero-trust and modern commodity for programers with respect to home networks to operate. This has been the obsolete libraries such as OpenSSL that were large motivation behind this work, being decryption of in size, difficult to use, and affected by severe encrypted traffic not practical for various reasons problems such as Heartbleed. Copyright © 2020, Infonomics Society 932 International Journal for Information Security Research (IJISR), Volume 10, Issue 1, 2020 including, but not limited to, ethical and technical 2. Related Work issues that prevent MITM (Man In The Middle) techniques [1] to operate on non-TLS (Transport This section first analyses TLS and SSH (Secure Layer Security) protocols such as SSH, BitTorrent Shell), the two leading encryption protocols and it and Skype. Contrary to previous research [2, 3, 4], describes various traffic analysis and fingerprint goal of this paper is not to define new methods for methods. Then it describes how IoT device traffic is identifying specific threats but rather to classify analysed and enforced in networks. network traffic in a generic way without searching specific traffic or malware fingerprints. As specified 2.1. SSL/TLS Fingerprinting later in this paper, this approach is able to classify traffic using specific protocol metrics and also detect TLS (Transport Security Layer) is the most changes in network behaviour. This fact is effective in popular cryptographic protocol used to secure particular on IoT and home networks, where the communications on computer networks. TLS has device behaviour should not change unless it is replaced its predecessor SSL (Secure Socket Layer) reconfigured or compromised. used for years on the Internet and now deprecated, and Another objective that has motivated this work, is it has been designed to provide privacy and data the definition of new metrics and techniques to be integrity between two communicating applications. used with encrypted traffic similar to those used with TLS uses TCP as transport protocol even though there clear text. For instance, in HTTP the User Agent has is also a variant called DTLS (Datagram TLS) mostly been used [5] to classify devices and identify used for VPNs and in some mobile applications (e.g. malware: how can this be implemented with the Signal messaging app) that is similar to the QUIC encrypted traffic? In essence, identify properties in protocol promoted by Google. TLS communications encrypted traffic analysis equivalent to those used for flow over an encrypted, bidirectional network tunnel years in clear text traffic so that it is possible to have that is encrypted using some cryptographic keys based the same level of visibility without decoding the on shared secrets negotiated at the start of the session encrypted traffic payload. named TLS handshake. During handshake the two In summary, the main contribution of this paper is communicating peers agree on algorithms, exchange to show in practice how existing network visibility certificates and cryptographic options before starting methods and algorithms have been enhanced to take encrypted data exchange. In this negotiation phase the into account encrypted traffic and to promote the TLS client sends a ClientHello message that contains creation of a next generation DPI engine that does a list of supported ciphers, compression methods and more than just identifying network protocols decoding various parameters including options on elliptic-curve a few packets. The novelty of this work is the cryptography used by TLS. The server responds with combination of existing protocol fingerprint a ServerHello message that contains the chosen TLS techniques coming from DPI with new traffic protocol version, ciphers and compression methods behavioural indicators that allow traffic not only to be selected out of the various options offered by the recognised in terms of application protocol, but also client in the ClientHello message. Then the server to be checked for compliance with the expected sends an optional certificate message containing the behavioral model. Doing this it is possible to improve public key used by the server. Handshake messages application protocol detection, and at the same time are exchanged in clear, so they can be decoded by spot suspicious traffic behaviour in a simple way with dissecting packets, with the exception of the server respect to what popular IDSs can do in a significantly certificate that in TLS 1.3 is encrypted. more complex fashion [6]. This is to create a comprehensive set of algorithms and metrics that can be effectively used to monitor both large and home/IoT networks as well Internet traffic. As described later in this paper, the results of this research have been implemented in a popular open source deep-packet inspection and traffic classification engine named nDPI [7] so that other people can benefit from this work. The rest of the paper is structured as follows. Section 2 analyses encryption protocols and standard Figure 1. Simplified TLS Handshake traffic fingerprint techniques used to classify (RFC 5246, 2008) encrypted traffic. Section 3 covers the proposed monitoring methodology, metrics and approach. Decoding the initial handshake packets allows Section 4 discusses the tool implementation and applications to inspect how data is exchanged and experiments, and finally Section 5 concludes the disclose information about both the client and server paper. configuration as well fingerprint and identify client Copyright © 2020, Infonomics Society 933 International Journal for Information Security Research (IJISR), Volume 10, Issue 1, 2020 applications. For instance, a typical misconception about TLS is that monitoring applications are unable to

Load more