Attacks Landscape in the Dark Side of the Web
Total Page:16
File Type:pdf, Size:1020Kb
Attacks Landscape in the Dark Side of the Web Onur Catakoglu Marco Balduzzi Davide Balzarotti Eurecom & Monaco Digital Trend Micro Research Eurecom Security Agency marco [email protected] [email protected] [email protected] ABSTRACT Area Impact Better in The Dark Web is known as the part of the Internet oper- Attack Identification Results Surface Web ated by decentralized and anonymous-preserving protocols Service Advertisement Operation Surface Web like Tor. To date, the research community has focused on Stealthiness Deployment Dark Web understanding the size and characteristics of the Dark Web Operational Costs Deployment Dark Web and the services and goods that are offered in its under- Collected Data Operation Surface Web ground markets. However, little is still known about the attacks landscape in the Dark Web. Table 1: Advantages and Disadvantages of Operat- For the traditional Web, it is now well understood how ing a High-Interaction honeypot in the Dark Web websites are exploited, as well as the important role played by Google Dorks and automated attack bots to form some sort of \background attack noise" to which public websites in marketplaces, money laundering, and assassination [12]. are exposed. Moreover, Tor has been reported to be leveraged in host- This paper tries to understand if these basic concepts and ing malware [18], and operating resilient botnets [9]. While, components have a parallel in the Dark Web. In particular, to a certain extent, these studies have shown how the Dark by deploying a high interaction honeypot in the Tor network Web is used to conduct such activities, it is still unclear if for a period of seven months, we conducted a measurement and how miscreants are explicitly conducting attacks against study of the type of attacks and of the attackers behavior hidden services, like a web application running within the that affect this still relatively unknown corner of the Web. Tor network. While web attacks, or attacks against exposed services on the Internet are common knowledge and have been largely studied by the research community [10, 11, 22], 1. INTRODUCTION no previous work has been conducted to investigate the vol- Based on the accessibility of its pages, the Web can be ume and nature of attacks in the Dark Web. divided in three parts: the Surface Web { which covers ev- To this extend, in this work we discuss the deployment of erything that can be located through a search engine; the a high-interaction honeypot within the Dark Web to collect Deep Web { which contains the pages that are not reached evidence of attacks against its services. In particular, we by search engine crawlers (for example because they require focus our study on web applications to try to identify how a registration); and the more recent Dark Web { which is attackers exploit them (without a search engine for localiza- dedicated to websites that are operated over a different in- tion) and what their purpose is after a service has been com- frastructure to guarantee their anonymity, and that often promised. Our preliminary measurement casts some light on require specific software to be accessed. the attackers' behavior and shows some interesting phenom- The most famous\neighborhood"of the Dark Web is oper- ena, including the fact that the vast majority of incoming ated over the Tor network, whose protocols guarantee anony- attacks are unintentional (in the sense that they were not mity and privacy of both peers in a communication, making targeted against Dark Web services) scattered attacks per- users and operators of (hidden) services in the Dark Web formed by automated scripts that reach the application from more resilient to identification and monitoring. the Surface Web through Tor2web proxies. As such, over the last years, miscreants and dealers in gen- eral have started to adopt the Dark Web as a valid platform to conduct their activities, including trading of illegal goods 2. HONEYPOT IN THE DARK WEB In this section, we discuss the main differences, in terms of Permission to make digital or hard copies of all or part of this work for personal or advantages and disadvantages, between deploying and main- classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation taining a honeypot in the Surface Web versus operating a on the first page. Copyrights for components of this work owned by others than the similar infrastructure in the Tor network. author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or In fact, the anonymity provided by Tor introduces a num- republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. ber of important differences. Some are positives, and make SAC 2017, April 03 - 07, 2017, Marrakech, Morocco the infrastructure easier to maintain for researchers. Some are instead negative, and introduce new challenges in the c 2017 Copyright held by the owner/author(s). Publication rights licensed to ACM. ISBN 978-1-4503-4486-9/17/04. $15.00 honeypot setup and in the analysis of the collected data. DOI: http://dx.doi.org/10.1145/3019612.3019796 Table1 summarizes the five main differences between the two environments, mentioning their impact (on the deploy- tract attackers by simply placing some keywords or specific ment, operation, or on the results collected by the honeypot) files as John et al. described in their work [14]. For exam- and which environment (Dark or Surface Web) provides bet- ple, including a known web shell or disclosing the vulnerable ter advantages in each category. version of an installed application along with its name is a widely used strategy to lure attackers. Attack identification These popular\advertisement"approaches are not straight- The most significant difference between a deployment on forward to apply to services hosted on the Tor network. As the surface Web and on the Tor network is the anonymity we later discuss in Section4, it is still possible for .onion of the incoming requests. In a traditional honeypot, indi- web sites to be indexed by Google. However, in order to gain vidual requests are typically grouped together in attack ses- popularity and attract attackers, researchers should care- sions [10] to provide an enriched view on the number and fully employ alternative techniques { such as advertising the nature of each attack. A single session can span several website in forums, channels, or link directories specific to the minutes and include hundreds of different requests (e.g., to Dark Web. probe the application, exploit a vulnerability, and install post-exploitation scripts). Operational costs Since many malicious tools do not honor server-side cook- Since, as explained above, operating a honeypot in the Dark ies, this clustering phase is often performed by combining Web does not require any special domain registration or ded- two pieces of information: the timestamp of each request, icated hosting provider, the total cost of the operation is and its source IP address. Thus, requests coming in the typically very low. Canali et al [10] had to register hun- same empirically-defined time window and from the same dreds of domain names (and routinely change them to avoid host are normally grouped in a single session. blacklisting) as well as several dedicated hosting providers { Unfortunately, the source of each connection is hidden in which are often difficult to handle because they often block the Tor network, and therefore the identification of individ- the accounts if they receive complains about possibly mali- ual attacks becomes much harder in the Dark Web. More- cious traffic. over, if the attacker uses the Tor browser, also the HTTP In comparison, an equivalent infrastructure on the Dark headers would be identical between different attackers. Web only requires the physical machines where the honey- pot is installed, as creating new domains is free and can be Stealthiness performed arbitrarily by the honeypot administrator. If on the one hand the anonymity provided by the Tor net- work complicates the analysis of the attacks, on the other it Nature of the collected data also simplifies the setup of the honeypot infrastructure. In Some criminals use the Tor network to host illicit content fact, a core aspect of any honeypot is its ability to remain like child pornography, since it protects both the visitors and hidden as the quality of the collected data decreases if at- the host by concealing their identities. Therefore, as we ex- tackers can easily identify that the target machine is likely plain in Section3, we had to take some special precautions a trap. to prevent attackers from using our honeypot to store and For instance, the nature of the Surface Web reveals in- distribute this material. Unless researchers work in collabo- formation like the Whois and SOA records associated with a ration with law enforcement, these measures are required to domain name, or the geo-location of the IP address the hon- safely operate a honeypot in the Dark Web. eypot resolves to. To mitigate this risk, Canali et al. [10] employed a distributed architecture including hundreds of transparent proxy-servers that redirected the incoming traf- 3. HONEYPOT SETUP AND DEPLOYMENT fic via VPN to the honeypots hosted on the researchers' lab. This solution successfully solve the problem of hiding the In this section we describe the setup of our honeypot. real location of the web applications, but it is difficult to Our deployment is composed of three types of web-based maintain and requires the proxies to be located on many honeypots and a system-based honeypot.