Circuit Fingerprinting Attacks: Passive Deanonymization of Tor Hidden Services

Total Page:16

File Type:pdf, Size:1020Kb

Circuit Fingerprinting Attacks: Passive Deanonymization of Tor Hidden Services Circuit Fingerprinting Attacks: Passive Deanonymization of Tor Hidden Services Albert Kwon†, Mashael AlSabah‡§†∗, David Lazar†, Marc Dacier‡, and Srinivas Devadas† †Massachusetts Institute of Technology, fkwonal,lazard,[email protected] ‡Qatar Computing Research Institute, [email protected] §Qatar University, [email protected] This paper sheds light on crucial weaknesses in the As a result, many sensitive services are only accessi- design of hidden services that allow us to break the ble through Tor. Prominent examples include human anonymity of hidden service clients and operators pas- rights and whistleblowing organizations such as Wik- sively. In particular, we show that the circuits, paths ileaks and Globalleaks, tools for anonymous messag- established through the Tor network, used to commu- ing such as TorChat and Bitmessage, and black markets nicate with hidden services exhibit a very different be- like Silkroad and Black Market Reloaded. Even many havior compared to a general circuit. We propose two non-hidden services, like Facebook and DuckDuckGo, attacks, under two slightly different threat models, that recently have started providing hidden versions of their could identify a hidden service client or operator using websites to provide stronger anonymity guarantees. these weaknesses. We found that we can identify the users’ involvement with hidden services with more than That said, over the past few years, hidden services 98% true positive rate and less than 0.1% false positive have witnessed various active attacks in the wild [12, 28], rate with the first attack, and 99% true positive rate and resulting in several takedowns [28]. To examine the se- 0.07% false positive rate with the second. We then re- curity of the design of hidden services, a handful of at- visit the threat model of previous website fingerprinting tacks have been proposed against them. While they have attacks, and show that previous results are directly ap- shown their effectiveness, they all assume an active at- plicable, with greater efficiency, in the realm of hidden tacker model. The attacker sends crafted signals [6] to services. Indeed, we show that we can correctly deter- speed up discovery of entry guards, which are first-hop mine which of the 50 monitored pages the client is visit- routers on circuits, or use congestion attacks to bias entry ing with 88% true positive rate and false positive rate as guard selection towards colluding entry guards [22]. Fur- low as 2.9%, and correctly deanonymize 50 monitored thermore, all previous attacks require a malicious client hidden service servers with true positive rate of 88% and to continuously attempt to connect to the hidden service. false positive rate of 7.8% in an open world setting. In this paper, we present the first practical passive attack against hidden services and their users called 1 Introduction circuit fingerprinting attack. Using our attack, an at- In today’s online world where gathering users’ per- tacker can identify the presence of (client or server) hid- sonal data has become a business trend, Tor [14] has den service activity in the network with high accuracy. emerged as an important privacy-enhancing technology This detection reduces the anonymity set of a user from allowing Internet users to maintain their anonymity on- millions of Tor users to just the users of hidden ser- line. Today, Tor is considered to be the most popular vices. Once the activity is detected, we show that the anonymous communication network, serving millions of attacker can perform website fingerprinting (WF) attacks clients using approximately 6000 volunteer-operated re- to deanonymize the hidden service clients and servers. lays, which are run from all around the world [3]. While the threat of WF attacks has been recently criti- In addition to sender anonymity, Tor’s hidden services cized by Juarez et al. [24], we revisit their findings and allow for receiver anonymity. This provides people with demonstrate that the world of hidden services is the ideal a free haven to host and serve content without the fear setting to wage WF attacks. Finally, since the attack of being targeted, arrested or forced to shut down [11]. is passive, it is undetectable until the nodes have been deanonymized, and can target thousands of hosts retroac- ∗Joint first author. tively just by having access to clients’ old network traffic. Approach. We start by studying the behavior of Tor cir- cuits on the live Tor network (for our own Tor clients and OP G1 hidden services) when a client connects to a Tor hidden extend service. Our key insight is that during the circuit con- extended struction and communication phase between a client and extend a hidden service, Tor exhibits fingerprintable traffic pat- extended begin terns that allow an adversary to efficiently and accurately Legend: identify, and correlate circuits involved in the communi- connected Received by G1 cation with hidden services. Therefore, instead of mon- data Relayed by G1 itoring every circuit, which may be costly, the first step in the attacker’s strategy is to identify suspicious circuits with high confidence to reduce the problem space to just hidden services. Next, the attacker applies the WF at- Figure 1: Cells exchanged between the client and the entry guard to build a general circuit for non-hidden streams after the tack [10, 36, 35] to identify the clients’ hidden service circuit to G1 has been created. activity or deanonymize the hidden service server. Contributions. This paper offers the following contri- butions: evaluation. In Section 7, we demonstrate the effective- ness of WF attacks on hidden services. We then discuss 1. We present key observations regarding the commu- possible future countermeasures in Section 8. Finally, nication and interaction pattern in the hidden ser- we overview related works in Section 9, and conclude in vices design in Tor. Section 10. 2. We identify distinguishing features that allow a pas- sive adversary to easily detect the presence of hid- 2 Background den service clients or servers in the local network. We evaluate our detection approach and show that We will now provide the necessary background on Tor we can classify hidden service circuits (from the and its hidden services. Next, we provide an overview of client- and the hidden service-side) with more than WF attacks. 98% accuracy. 3. For a stronger attacker who sees a majority of the 2.1 Tor and Hidden Services clients’ Tor circuits, we propose a novel circuit cor- relation attack that is able to quickly and efficiently Alice uses the Tor network simply by installing the detect the presence of hidden service activity using Tor browser bundle, which includes a modified Firefox a sequence of only the first 20 cells with accuracy browser and the Onion Proxy (OP). The OP acts as an of 99%. interface between Alice’s applications and the Tor net- work. The OP learns about Tor’s relays, Onion Routers 4. Based on our observations and results, we argue that (ORs), by downloading the network consensus document the WF attacker model is significantly more realis- from directory servers. Before Alice can send her traffic tic and less costly in the domain of hidden services through the network, the OP builds circuits interactively as opposed to the general web. We evaluate WF at- and incrementally using 3 ORs: an entry guard, middle, tacks on the identified circuits (from client and hid- and exit node. Tor uses 512-byte fixed-sized cells as its den service side), and we are able to classify hidden communication data unit for exchanging control infor- services in both open and closed world settings. mation between ORs and for relaying users’ data. The details of the circuit construction process in Tor 5. We propose defenses that aim to reduce the detec- proceeds as follows. The OP sends a create fast cell tion rate of the presence of hidden service commu- to establish the circuit with the entry guard, which re- nication in the network. sponds with a created fast. Next, the OP sends an extend command cell to the entry guard, which causes Roadmap. We first provide the reader with a back- it to send a create cell to the middle OR to establish ground on Tor, its hidden service design, and WF attacks the circuit on behalf of the user. Finally, the OP sends in Section 2. We next present, in Section 3, our obser- another extend to the middle OR to cause it to cre- vations regarding different characteristics of hidden ser- ate the circuit at exit. Once done, the OP will receive vices. In Section 4, we discuss our model and assump- an extended message from the middle OR, relayed by tions, and in Sections 5 and 6, we present our attacks and the entry guard. By the end of this operation, the OP 2 will have shared keys used for layered encryption, with every hop on the circuit.1 The exit node peels the last Hidden Service layer of the encryption and establishes the TCP connec- Client tion to Alice’s destination. Figure 1 shows the cells ex- changed between OP and the entry guard for regular Tor 4 3 5 connections, after the exchange of the create fast and created fast messages. IP Tor uses TCP secured with TLS to maintain the OP- RP HSDir to-OR and the OR-to-OR connections, and multiplexes circuits within a single TCP connection. An OR-to- 2 1 OR connection multiplexes circuits from various users, 6 whereas an OP-to-OR connection multiplexes circuits from the same user. An observer watching the OP-to-OR Hidden TCP connection should not be able to tell apart which Service TCP segment belongs to which circuit (unless only one circuit is active).
Recommended publications
  • A Decentralized Private Marketplace: DRAFT 0.1
    A Decentralized Private Marketplace: DRAFT 0.1 Ido Kaiser1 Abstract— The online services we use are increasingly de- structure provided by the Bitcoin blockchain but is equally manding more of our personal data, a disturbing trend that applicable to any of it derivatives, meaning the marketplace threatens the privacy of users on a global scale. Entities such as is indifferent about the underlying cryptocurrency used for Google, Facebook and Yahoo have grown into colossal, seem- ingly unaccountable corporations by monetizing their users’ payments. personal data. These entities are charged with keeping said data secure and, in the case of social and economic interactions, II. HIGH LEVEL OVERVIEW safeguarding the privacy of their users. Centralized security The overview consists of two main components: a models are not applicable to the new generation of technologies blockchain and a data storage network. Technically speaking such as Bitcoin. This paper discusses a system which combines these networks can operate over the same set of nodes. But a Bitmessage-style network with anonymous payment schemes to create a privacy-centric marketplace. Furthermore we apply for clarity we separate them to highlight that it does not have a multi-signature escrow technique involving insurance deposits to be the same set. should which deter fraudulent actors from participating in trades, given that their incentive is to make a profit. A. Blockchain The blockchain is typically tasked with processing pay- I. INTRODUCTION ments but for our purpose it will also be storing the market- Satoshi Nakamoto, the visionary and creator of Bitcoin[1], place index and the identities.
    [Show full text]
  • January 2020 Zillman Column
    2020 Guide to Online Privacy Resources and Tools By Marcus P. Zillman, M.S., A.M.H.A. Executive Director - Virtual Private Library http://www.VirtualPrivateLibrary.org The January 2020 Zillman Column features the 2020 Guide to Online Privacy Resources and Tools and is a very comprehensive listing of Internet and Web privacy resources, sources and sites on the Internet for the latest competent sources and research. The below list of sources is taken partially from my Subject Tracer™ white paper titled Privacy Resources 2020 and is constantly updated with Subject Tracer™ bots at the following URL: http://www.PrivacyResources.info/ http://www.StealthMode.info/ These resources and sources will help you to discover the many pathways available through the Internet to find the latest Internet and web search and discovery research, resources, sources and sites. As this site is constantly updated it would be to your benefit to bookmark and return to the above URL frequently. Figure 1: 2020 Guide to Online Privacy Resources and Tools 1 January 2020 Zillman Column – 2020 Guide to Online Privacy Resources and Tools http://www.zillmancolumns.com/ [email protected] eVoice: (800) 858-1462 © 2020 Marcus P. Zillman, M.S., A.M.H.A. 2020 Guide to Online Privacy Resources and Tools: 10 Best Security and Privacy Apps for Smartphones and Tablets http://drippler.com/drip/10-best-security-privacy-apps-smartphones-tablets 10 Minute Mail http://10minutemail.com/10MinuteMail/index.html 10 Privacy Gadgets To Help You Keep a Secret http://www.popsci.com/keep-your-secrets-a-secret
    [Show full text]
  • 5.Sustainability
    P2Pvalue More than 95% of the cases surveyed use centralized servers to store the users’ data. Over the whole population of cases this would be lower, as less than 88% has a centralized architecture allowing for central storage. Index infrastructure provision On a scale of 1 to 9, half of the cases have less than 3, and 84.1% of the cases are at the intermediate level of the index (between 4 and 5). None of the cases are at the highest range of the index. 5.Sustainability Regarding the question of profitability versus non profitability character of infrastructure provision, what results from the data on the legal type of infrastructure provision (see table above as part of infrastructure provision section) is that non-profit organizations make up the majority of cases (57%), something that makes sense with the voluntary dimension of the majority of CBPP experiences. Nevertheless, we consider it important to highlight that 28.9% of the cases are for profit organizations, something that is closely related to the diffusion of hybrid cases in CBPP. The data on the type of organization connected to the case (see table at section infrastructure provider) notes that 25.1% of the cases are businesses, which is the second type of most common organization. What we highlight about this data concerning the main strategies to achieve economic sustainability is the high level of importance that is given to the non- monetary contributions. For instance, 51% of respondents assign a value of 10 to non-monetary contributions. Instead, when we analyze all the other strategies of sustainability, the median is very low.
    [Show full text]
  • Rock in the Reservation: Songs from the Leningrad Rock Club 1981-86 (1St Edition)
    R O C K i n t h e R E S E R V A T I O N Songs from the Leningrad Rock Club 1981-86 Yngvar Bordewich Steinholt Rock in the Reservation: Songs from the Leningrad Rock Club 1981-86 (1st edition). (text, 2004) Yngvar B. Steinholt. New York and Bergen, Mass Media Music Scholars’ Press, Inc. viii + 230 pages + 14 photo pages. Delivered in pdf format for printing in March 2005. ISBN 0-9701684-3-8 Yngvar Bordewich Steinholt (b. 1969) currently teaches Russian Cultural History at the Department of Russian Studies, Bergen University (http://www.hf.uib.no/i/russisk/steinholt). The text is a revised and corrected version of the identically entitled doctoral thesis, publicly defended on 12. November 2004 at the Humanistics Faculty, Bergen University, in partial fulfilment of the Doctor Artium degree. Opponents were Associate Professor Finn Sivert Nielsen, Institute of Anthropology, Copenhagen University, and Professor Stan Hawkins, Institute of Musicology, Oslo University. The pagination, numbering, format, size, and page layout of the original thesis do not correspond to the present edition. Photographs by Andrei ‘Villi’ Usov ( A. Usov) are used with kind permission. Cover illustrations by Nikolai Kopeikin were made exclusively for RiR. Published by Mass Media Music Scholars’ Press, Inc. 401 West End Avenue # 3B New York, NY 10024 USA Preface i Acknowledgements This study has been completed with the generous financial support of The Research Council of Norway (Norges Forskningsråd). It was conducted at the Department of Russian Studies in the friendly atmosphere of the Institute of Classical Philology, Religion and Russian Studies (IKRR), Bergen University.
    [Show full text]
  • Download: Brill.Com/Brill-Typeface
    Poets of Hope and Despair Russian History and Culture Editors-in-Chief Jeffrey P. Brooks (The Johns Hopkins University) Christina Lodder (University of Kent) Volume 21 The titles published in this series are listed at brill.com/rhc Poets of Hope and Despair The Russian Symbolists in War and Revolution, 1914-1918 Second Revised Edition By Ben Hellman This title is published in Open Access with the support of the University of Helsinki Library. This is an open access title distributed under the terms of the CC BY-NC-ND 4.0 license, which permits any non-commercial use, distribution, and reproduction in any medium, provided no alterations are made and the original author(s) and source are credited. Further information and the complete license text can be found at https://creativecommons.org/licenses/by-nc-nd/4.0/ The terms of the CC license apply only to the original material. The use of material from other sources (indicated by a reference) such as diagrams, illustrations, photos and text samples may require further permission from the respective copyright holder. Cover illustration: Angel with sword, from the cover of Voina v russkoi poezii (1915, War in Russian Poetry). Artist: Nikolai K. Kalmakov (1873-1955). Brill has made all reasonable efforts to trace all rights holders to any copyrighted material used in this work. In cases where these efforts have not been successful the publisher welcomes communications from copyright holders, so that the appropriate acknowledgements can be made in future editions, and to settle other permission matters. The Library of Congress Cataloging-in-Publication Data is available online at http://catalog.loc.gov Typeface for the Latin, Greek, and Cyrillic scripts: “Brill”.
    [Show full text]
  • Considering PGP
    Security Now! Transcript of Episode #418 Page 1 of 38 Transcript of Episode #418 Considering PGP Description: This week, Steve and Leo continue covering the consequences of the Snowden leaks and, with that in mind, they examine the Pretty Good Privacy (PGP) system for securely encrypting eMail and attachments. High quality (64 kbps) mp3 audio file URL: http://media.GRC.com/sn/SN-418.mp3 Quarter size (16 kbps) mp3 audio file URL: http://media.GRC.com/sn/sn-418-lq.mp3 SHOW TEASE: It's time for Security Now!. Steve Gibson, our security guru, is here. This is a show everybody has to watch. In fact, share it with your friends, your neighbors, your colleagues: Using PGP to protect your email. Steve talks about it next on Security Now!. Leo Laporte: This is Security Now! with Steve Gibson, Episode 418, recorded August 21st, 2013: Considering PGP. It's time for Security Now!, the show that covers your security, your privacy, your safety online with this man here, the 'Splainer in Chief, Steven Gibson at GRC.com. Hey, Steverino. Steve Gibson: Hey, Leo. Great to be with you for Show No. 1 of Year No. 9. Leo: Wow. Steve: We begin our ninth year. Leo: Wow. Episode 418, and you've only missed one, and that was because we made you. Steve: Yeah. So we're not going to do that again. That was not pretty. There was an uprising among the natives. Security Now! Transcript of Episode #418 Page 2 of 38 Leo: Well, you've got to fight it out with Lisa because I don't - I never had the cojones to stop you, but she does.
    [Show full text]
  • Security Analysis of Instant Messenger Torchat
    TALLINN UNIVERSITY OF TECHNOLOGY Faculty of Information Technology Department of Informatics Chair of Software Engineering Security Analysis of Instant Messenger TorChat Master's Thesis Student: Rain Viigipuu Student code: 072125 Supervisor: Alexander Norta, PhD External Supervisor: Arnis Parˇsovs, MSc TALLINN 2015 Abstract TorChat is a peer-to-peer instant messenger built on top of the Tor network that not only provides authentication and end-to-end encryption, but also allows the communication parties to stay anonymous. In addition, it prevents third parties from even learning that communication is taking place. The aim of this thesis is to document the protocol used by TorChat and to analyze the security of TorChat and its reference implementation. The work shows that although the design of TorChat is sound, its implementation has several flaws, which make TorChat users vulnerable to impersonation, communication confirmation and denial-of-service attacks. 2 Contents 1 Introduction 6 2 Tor and Hidden Services 8 2.1 Hidden Services . .9 2.1.1 Hidden service address . 11 3 TorChat 12 3.1 Managing Contacts and Conversations . 12 3.2 Configuration Options . 16 4 TorChat Protocol 17 4.1 Handshake . 17 4.2 File Transfers . 18 4.3 Protocol Messages . 19 4.3.1 not implemented . 19 4.3.2 ping . 20 4.3.3 pong . 21 4.3.4 client . 22 4.3.5 version . 22 4.3.6 status . 22 4.3.7 profile name . 23 4.3.8 profile text . 23 4.3.9 profile avatar alpha . 24 4.3.10 profile avatar . 24 4.3.11 add me.............................
    [Show full text]
  • Unveiling the I2P Web Structure: a Connectivity Analysis
    Unveiling the I2P web structure: a connectivity analysis Roberto Magan-Carri´ on,´ Alberto Abellan-Galera,´ Gabriel Macia-Fern´ andez´ and Pedro Garc´ıa-Teodoro Network Engineering & Security Group Dpt. of Signal Theory, Telematics and Communications - CITIC University of Granada - Spain Email: [email protected], [email protected], [email protected], [email protected] Abstract—Web is a primary and essential service to share the literature have analyzed the content and services offered information among users and organizations at present all over through this kind of technologies [6], [7], [2], as well as the world. Despite the current significance of such a kind of other relevant aspects like site popularity [8], topology and traffic on the Internet, the so-called Surface Web traffic has been estimated in just about 5% of the total. The rest of the dimensions [9], or classifying network traffic and darknet volume of this type of traffic corresponds to the portion of applications [10], [11], [12], [13], [14]. Web known as Deep Web. These contents are not accessible Two of the most popular darknets at present are The Onion by search engines because they are authentication protected Router (TOR; https://www.torproject.org/) and The Invisible contents or pages that are only reachable through the well Internet Project (I2P;https://geti2p.net/en/). This paper is fo- known as darknets. To browse through darknets websites special authorization or specific software and configurations are needed. cused on exploring and investigating the contents and structure Despite TOR is the most used darknet nowadays, there are of the websites in I2P, the so-called eepsites.
    [Show full text]
  • Deep Web for Journalists: Comms, Counter-Surveillance, Search
    Deep Web for Journalists: Comms, Counter-surveillance, Search Special Complimentary Edition for Delegates attending the 28th World Congress of the International Federation of Journalists * By Alan Pearce Edited by Sarah Horner * © Alan Pearce June 2013 www.deepwebguides.com Table of Contents Introduction by the International Federation of Journalists A Dangerous Digital World What is the Deep Web and why is it useful to Journalists? How Intelligence Gathering Works How this affects Journalists 1 SECURITY ALERT . Setting up Defenses 2 Accessing Hidden Networks . Using Tor . Entry Points 3 Secure Communications . Email . Scramble Calls . Secret Messaging . Private Messaging . Deep Chat . Deep Social Networks 4 Concealed Carry 5 Hiding Things . Transferring Secret Data . Hosting, Storing and Sharing . Encryption . Steganography – hiding things inside things 6 Smartphones . Counter-Intrusion . 007 Apps 7 IP Cameras 8 Keeping out the Spies . Recommended Free Programs . Cleaning Up . Erasing History . Alternative Software Share the Knowledge About the Authors Foreword by the International Federation of Journalists Navigating the Dangerous Cyber Jungle Online media safety is of the highest importance to the International Federation of Journalists. After all, the victims are often our members. The IFJ is the world’s largest organization of journalists and our focus is on ways and means to stop physical attacks, harassment and the killing of journalists and media staff. In an age where journalism – like everything else in modern life – is dominated by the Internet, online safety is emerging as a new front. In this new war, repressive regimes now keep a prying eye on what journalists say, write and film. They want to monitor contacts and they want to suppress information.
    [Show full text]
  • A Security Analysis of Email Communications
    A security analysis of email communications Ignacio Sanchez Apostolos Malatras Iwen Coisel Reviewed by: Jean Pierre Nordvik 2 0 1 5 EUR 28509 EN European Commission Joint Research Centre Institute for the Protection and Security of the Citizen Contact information Ignacio Sanchez Address: Joint Research Centre, Via Enrico Fermi 2749, I - 21027 Ispra (VA), Italia E-mail: [email protected] JRC Science Hub https://ec.europa.eu/jrc Legal Notice This publication is a Technical Report by the Joint Research Centre, the European Commission’s in-house science service. It aims to provide evidence-based scientific support to the European policy-making process. The scientific output expressed does not imply a policy position of the European Commission. Neither the European Commission nor any person acting on behalf of the Commission is responsible for the use which might be made of this publication. All images © European Union 2015, except: Frontpage : © bluebay2014, fotolia.com JRC 99372 EUR 28509 EN ISSN 1831-9424 ISBN 978-92-79-66503-5 doi:10.2760/319735 Luxembourg: Publications Office of the European Union, 2015 © European Union, 2015 Reproduction is authorised provided the source is acknowledged. Printed in Italy Abstract The objective of this report is to analyse the security and privacy risks of email communications and identify technical countermeasures capable of mitigating them effectively. In order to do so, the report analyses from a technical point of view the core set of communication protocols and standards that support email communications in order to identify and understand the existing security and privacy vulnerabilities. On the basis of this analysis, the report identifies and analyses technical countermeasures, in the form of newer standards, protocols and tools, aimed at ensuring a better protection of the security and privacy of email communications.
    [Show full text]
  • How Do Tor Users Interact with Onion Services?
    How Do Tor Users Interact With Onion Services? Philipp Winter Anne Edmundson Laura M. Roberts Princeton University Princeton University Princeton University Agnieszka Dutkowska-Zuk˙ Marshini Chetty Nick Feamster Independent Princeton University Princeton University Abstract messaging [4] and file sharing [15]. The Tor Project currently does not have data on the number of onion Onion services are anonymous network services that are service users, but Facebook reported in 2016 that more exposed over the Tor network. In contrast to conventional than one million users logged into its onion service in one Internet services, onion services are private, generally not month [20]. indexed by search engines, and use self-certifying domain Onion services differ from conventional web services names that are long and difficult for humans to read. In in four ways; First, they can only be accessed over the Tor this paper, we study how people perceive, understand, and network. Second, onion domains are hashes over their use onion services based on data from 17 semi-structured public key, which make them difficult to remember. Third, interviews and an online survey of 517 users. We find that the network path between client and the onion service is users have an incomplete mental model of onion services, typically longer, increasing latency and thus reducing the use these services for anonymity and have varying trust in performance of the service. Finally, onion services are onion services in general. Users also have difficulty dis- private by default, meaning that users must discover these covering and tracking onion sites and authenticating them. sites organically, rather than with a search engine.
    [Show full text]
  • Detection and Analysis of Tor Onion Services
    Detection and Analysis of Tor Onion Services Martin Steinebach1;∗, Marcel Schäfer2, Alexander Karakuz and Katharina Brandl1 1Fraunhofer SIT, Germany 2Fraunhofer USA CESE E-mail: [email protected] ∗Corresponding Author Received 28 November 2019; Accepted 28 November 2019; Publication 23 January 2020 Abstract Tor onion services can be accessed and hosted anonymously on the Tor network. We analyze the protocols, software types, popularity and uptime of these services by collecting a large amount of .onion addresses. Websites are crawled and clustered based on their respective language. In order to also determine the amount of unique websites a de-duplication approach is implemented. To achieve this, we introduce a modular system for the real-time detection and analysis of onion services. Address resolution of onion services is realized via descriptors that are published to and requested from servers on the Tor network that volunteer for this task. We place a set of 20 volunteer servers on the Tor network in order to collect .onion addresses. The analysis of the collected data and its comparison to previous research provides new insights into the current state of Tor onion services and their development. The service scans show a vast variety of protocols with a significant increase in the popularity of anonymous mail servers and Bitcoin clients since 2013. The popularity analysis shows that the majority of Tor client requests is performed only for a small subset of addresses. The overall data reveals further that a large amount of permanent services provide no actual content for Tor users. A significant part consists instead of bots, services offered via multiple domains, or duplicated websites for phishing Journal of Cyber Security and Mobility, Vol.
    [Show full text]