Comprehensive Analysis of Web Privacy and Anonymous Web Browsers: Are Next Generation Services Based on Collaborative Filtering?

Total Page:16

File Type:pdf, Size:1020Kb

Comprehensive Analysis of Web Privacy and Anonymous Web Browsers: Are Next Generation Services Based on Collaborative Filtering? Comprehensive Analysis of Web Privacy and Anonymous Web Browsers: Are Next Generation Services Based on Collaborative Filtering? Gábor Gulyás, Róbert Schulcz, Sándor Imre Budapest University of Technology and Economics, Department of Telecommunications, Mobile Communication and Computing Laboratory (MC 2L) Magyar Tudósok krt.2, H-1117, Budapest, Hungary {gulyasg, schulcz, imre}@hit.bme.hu Abstract. In general, networking privacy enhancing technologies are better on larger user bases - such criteria that can be enhanced by combining them with community based services. In this paper we present main web privacy issues and today’s complex preventive solutions, anonymous web browsers, in several aspects including a comprehensive taxonomy as a result of our inquiry. Also, we suggest a next generation anonymous browser scheme based on collaborative filtering concerning issues on semantic web. Finally we analyze the benefits and drawbacks of such services, also examining the possible investors and raised moral considerations. Keywords: anonymous web browsers, user tracking, web privacy, collaborative filtering 1 Introduction The web has become a generic platform and takes a serious place in the everyday life of the digital age’s citizens. Several life-like transactions can be done on the web such as browsing items in a web shop, executing financial transactions, booking hotel rooms, while users require a high level of privacy, meaning as strong as they were committing these actions in real life. However, privacy on the web is not as strong as it is desired to be. Browsing items on web shops, or reading on-line magazines should be done anonymously if desired, but in many cases users are being observed and information is collected for profiling purposes. In most cases these profiles are later being used for direct marketing implying targeted advertising, dynamic pricing. Although user profiles can also be useful in determining content relevancy or in creating customized services. Privacy enhancing technologies (PET) are the solution against privacy vulnerabilities. The necessity of privacy enhancing technologies for the Internet emerged in the early beginnings [1] and since solutions evolve, however, there are still a lot of open questions. On the web anonymous web browsers represent the complex solution for sustaining anonymity and defending privacy. In this paper after outlining the current problems of web privacy and analysis of anonymous web browsers, we recommend a new solution based on collaborative filtering. This next generation service does not presume the existence of a semantic web, instead offering the possibility to create it, while it also strengthens user privacy by providing anonymous web surfing. We structure this paper as follows. In order to determine how seriously user privacy is endangered on the web, in Section 2 we discuss web privacy issues by inspecting participants concerned in violating user privacy and primal techniques. As a summary at the end of the section we propose a criterion determining the proper conditions for achieving anonymity. Anonymous web browsing services provide preceding solutions for the yielding privacy vulnerabilities. In Section 3 we present the architecture of today’s anonymous web browsing services, and publish a short taxonomy for classifying such service types, including a comparison as well. In Section 4 we suggest a solution describing how collaborative filtering should be applied to anonymous web browsers. In this section we analyze possible investors and examine moral questions raised as well. Finally, we give a conclusion about our work in this paper. 2 Web Privacy Web privacy issues can be divided in two main categories, correspondingly to the categorization of passive and active attacks in security: information leaking and technologies used to compromise privacy. However, information shared inadvertently can also result in compromising privacy (for instance, tracking) which raises the importance of total user control over shared or leaked information as well, not only preventing and detecting active attacks. Since several services rely on their advertising incomes, some privacy-friendly altering methods should be considered. For example instead of animated advertising, text based should be used, which is more audio-visual privacy friendly. Alternatively for tracking user activities and profiling, user preferences should be guessed by analyzing the web context in which the advertisement is shown.. Using this method users are classified into general preference groups instead of creating personal profiles. Otherwise, requesting the users’ consent is also a privacy-friendly approach if individual profiles needed to be created. 2.1 Participants, Goals and Motivation We divided the participants into two main groups: the users and others possibly compromising their privacy. However, in another aspect participants could be classified to be neutral, supportive or endangering user privacy. Intrinsically, actual participants can have several roles at the same time. The users ’ objective is to have total control over all their data : any information sent and received, and preserving anonymity. A user should also be aware of the information shared with the service provider or a third party any time, and should be able to defend herself against advertising and the leak of private information like e- mail addresses or login names. Neglecting the way the profile was acquired, the advertisers ’ goal is to achieve precise targeted advertising : to get the proper advertisements to the proper users (in large numbers). In some cases advertising is used together with profiling by tracking user activity and storing profile information on contextual or click-through bases. Advertisements, besides overloading system resources, can violate audio-visual privacy, by expanding over their designated area and playing sound effects or music. Web shops and stores might be using targeted advertising and profiling, however, they might use profiles for dynamic pricing , for extra profit they offer desired products more expensive and uninteresting ones cheaper. The user’s profile can be easily updated accordingly to her purchase statistics. Data collectors use special tracking techniques and often collaborate with service providers for profiling purposes. Their goal is to create accurate databases for merchandising or for some previously mentioned activities. The category of service providers includes Internet Service Providers (ISP) and web services providers as well. The IPS-s’ proxies are the bottleneck of the users’ whole traffic, which is ideal for logging user activities and blocking access to web service providers (politically motivated censorship). Web service providers are often the link between different participants by applying auditor services, placing third-parties’ advertisements on their pages or could be collaborating with other web service providers for merging logs or creating wired networks for tracking purposes. Censoring activities can be motivated by corporate policies or political regulations. In the world of the web the main goal of censorship is to block access to websites with certain URL addresses or any other sites providing specific content. Free Internet service providers may also be censoring web content or several URL addresses containing forbidden words or phrases (for instance Internet access available in libraries). 2.2 Passive Attacks: Public Information and Abuses Most of the information shared with websites, such as user agent or display properties are intended to be used for customizing services, but this information can also be used to create comprehensive profiles. Information like exact browser agent version, list of installed plug-ins can be used to check the existence of specific vulnerabilities, which can be used to install spyware on the user’s computer. There are several websites demonstrating severity of information leak by listing public information available of the computer which the user uses to access the web, like in [2]. Revealing the network address is a technical need due to the networking mechanisms and architecture of the Internet, however, these addresses are almost unique and allow tracking. The IP address can also be used for geo-locating users quite precisely, narrowing down the possibilities to the most likely country and city. For a visual demonstration, visit [3]. Since IP addresses can change and might be referring to several users and devices, other mechanism are used for tacking purposes, which are considered to be active attacks against user privacy as they require tracking identifiers bound to the user. Fig. 1. TCP/IP stack model and information can be used to abuse privacy. Different types of public information can be sorted by network layers as described in Fig. 1. This classification is required later for defining anonymity criterion (network and application level anonymity should be preserved in a different but related way). 2.3 Active Attacks: Techniques Violating Web Privacy The main purpose of active attacks is profiling, however, censorship should not be left out. Profiling is the method of mapping user activities by time and logging preferences plus interests altogether. This is mainly done by tracking user activity throughout the web, but due to caching and history preserving mechanisms built in the browsing agents there are other methods as well. In addition, we should consider the possibility of collaborating service providers.
Recommended publications
  • Copyrighted Material
    Anonymizing 1 Your Activities In our daily lives we like to have a certain level of privacy. We have curtains on our win- dows, doors for our offices, and even special screen protectors for computers to keep out prying eyes. This idea of wanting privacy also extends to the use of the Internet. We do not want people knowing what we typed in Google, what we said in our Instant Message conversations, or what websites we visited. Unfortunately, your private information is largely available if someone is watching. When doing any number of things on the Internet, there are plenty of reasons you might want to go incognito. However, that does not mean you’re doing anything wrong or illegal. he justification for anonymity when researching malware and bad guys is pretty Tstraightforward. You do not want information to show up in logs and other records that might tie back to you or your organization. For example, let’s say you work at a finan- cial firm and you recently detected that a banking trojan infected several of your systems. You collected malicious domain names, IP addresses, and other data related to the malware. The next steps you take in your research may lead you to websites owned by the criminals. As a result, if you are not taking precautions to stay anonymous, your IP address will show up in various logs and be visible to miscreants. If the criminals can identify you or the organization from which you conduct your research, they may COPYRIGHTEDchange tactics or go into hiding, MATERIAL thus spoiling your investigation.
    [Show full text]
  • Privacy-Enhancing Technologies for the Internet
    Privacy-enhancing technologies for the Internet Ian Goldberg David Wagner Eric Brewer University of California, Berkeley iang,daw,brewer ¡ @cs.berkeley.edu Abstract ing privacy issues on the Internet, and Section 3 provides some relevant background. We then discuss Internet pri- The increased use of the Internet for everyday activi- vacy technology chronologically, in three parts: Section 4 ties is bringing new threats to personal privacy. This pa- describes the technology of yesterday, Section 5 explains per gives an overview of existing and potential privacy- today’s technology, and Section 6 explores the technology enhancing technologies for the Internet, as well as moti- of tomorrow. Finally, we conclude in Section 7. vation and challenges for future work in this field. 2. Motivation 1. Introduction The threats to one’s privacy on the Internet are two-fold: your online actions could be (1) monitored by unauthorized Recently the Internet has seen tremendous growth, with parties and (2) logged and preserved for future access many the ranks of new users swelling at ever-increasing rates. years later. You might not realize that your personal infor- This expansion has catapulted it from the realm of academic mation has been monitored, logged, and subsequently dis- research towards new-found mainstream acceptance and in- closed; those who would compromise your privacy have no creased social relevance for the everyday individual. Yet incentive to warn you. this suddenly increased reliance on the Internet has the po- The threat of long-term storage and eventual disclosure tential to erode personal privacies we once took for granted. of personal information is especially acute on the Internet.
    [Show full text]
  • Identifying and Reducing Usability Issues in Anonymity Systems
    Why Johnny Can’t Blow the Whistle: Identifying and Reducing Usability Issues in Anonymity Systems Greg Norcie Jim Blythe Kelly Caine L Jean Camp Indiana University Information Sciences Institute Clemson University Indiana University [email protected] [email protected] [email protected] [email protected] Abstract—Tor is one of the most widely used anonymity Tor uses onion routing [24] to prevent third parties from networks in the world. The anonymity provided to any individual observing or transparently interjecting themselves into a user’s Tor user is a function of the total number of Tor users. Adoption web activity. The term “onion routing” refers to Tor’s layered of anonymity tools, including Tor, can be hindered by a lack of usability. Therefore, enhancing Tor’s usability and making it encryption, analogous to the layers in an onion, which prevents easier for more people to use Tor successfully increases security third parties (including Tor nodes) from observing the content for all Tor users. To enhance Tor usability we begin by identifying of packets in transit. Tor also prevents the linkage of the source stop-points which may act as barriers to using the Tor Browser and destination of any packet in the Tor network. Of course, Bundle (TBB). We suggest changes based on our analysis of there are attacks that threaten privacy even when using Tor, these stop-points. To follow through, in our second study we test whether the changes we recommended to the Tor Project are including subversion via control of malicious nodes and timing effective in increasing usability, and find a significant decrease attacks.
    [Show full text]
  • Information Security for Journalists
    Information Security for Journalists This handbook is a very important practical tool for journalists. And it is of particular importance to investigative reporters. For the first time journalists are now aware that virtually every electronic communication we make or receive is being recorded, stored and subject to analysis and action. As this surveillance is being conducted in secret, without scrutiny, transparency or any realistic form of accountability, our sources, our stories and our professional work itself is under threat. After Snowden’s disclosures we know that there are real safeguards and real counter measures available. The CIJ’s latest handbook, Information Security For Journalists, lays out the most effective means of keeping your work private and safe from spying. It explains how to write safely, how to think about security and how to safely receive, store and send information that a government or powerful corporation may be keen for you not to know, to have or to share. To ensure your privacy and the safety of your sources, Information Security For Journalists will help you to make your communications indecipherable, untraceable and anonymous. Although this handbook is largely about how to use your computer, you don’t need to have a computer science degree to use it. Its authors, and the experts advising the project are ensuring its practical accuracy and usability, and work with the latest technology. Gavin MacFadyen, Director of the Centre for Investigative Journalism By Silkie Carlo and Arjen Kamphuis 1 July 2014: Version 1.01 Acknowledgements I would like to express my deep gratitude to Arjen Kamphuis for his truly brilliant, patient and generous teaching, for his excellent work and for his friendship.
    [Show full text]
  • Improving Tor Using a TCP-Over-DTLS Tunnel
    Improving Tor using a TCP-over-DTLS Tunnel Joel Reardon∗ Ian Goldberg Google Switzerland GmbH University of Waterloo Brandschenkestrasse 110 200 University Ave W. Zurich,¨ Switzerland Waterloo, ON, Canada [email protected] [email protected] Abstract of the transport layer that can be changed to improve Tor’s performance. Specifically, each pair of routers The Tor network gives anonymity to Internet users by re- maintains a single TCP connection for all traffic that is laying their traffic through the world over a variety of sent between them. This includes multiplexed traffic for routers. All traffic between any pair of routers, even different circuits, and results in cross-circuit interference if they represent circuits for different clients, are mul- that degrades performance. We find that congestion con- tiplexed over a single TCP connection. This results in trol mechanisms are being unfairly applied to all circuits interference across circuits during congestion control, when they are intended to throttle only the noisy senders. packet dropping and packet reordering. This interference We also show how packet dropping on one circuit causes greatly contributes to Tor’s notorious latency problems. interference on other circuits. Our solution is to use a TCP-over-DTLS (Datagram Section 3 presents our solution to this problem—a new Transport Layer Security) transport between routers. We transport layer that is backwards compatible with the ex- give each stream of data its own TCP connection, and isting Tor network. Routers in Tor can gradually and in- protect the TCP headers—which would otherwise give dependently upgrade, and our system provides immedi- stream identification information to an attacker—with ate benefit to any pair of routers that choose to use our DTLS.
    [Show full text]
  • Bittersweet Cookies. Some Security and Privacy Considerations
    Bittersweet cookies. Some security and privacy considerations Abstract Cookies have emerged as one of the most convenient solutions to keep track of browser – server interaction. Nevertheless, they continue to raise both security and privacy concerns due to their continuous evolution, which addresses the needs of industries (i.e. advertising companies). In this paper, we identify and briefly analyse some of the most common types of cookies in terms of security vulnerabilities and the relevant privacy concerns. There is limited support for confidentiality, integrity and authentication in the way cookies are used. In this respect, the possibilities for misusing cookies are very real and are being exploited. Furthermore, due to the nature of the information they store and the way cookies are used for profiling, privacy concerns have been raised and policy initiatives have been initiated to address such issues. This paper takes into consideration the new types of cookies now being deployed in the online environment; these new cookies do not have enough exposure to demonstrate how they are being used and, as such, their security and privacy implications are not easily quantifiable. Studies are required to identify to what extent the new cookies can be used for tracking and also to evaluate the level of identification provided. Regarding policy perspective, the relevant EU legal framework that also addresses cookies is in transition. Currently, Member States are in the process of transposing EU directives addressing cookies and there is space for interpretations and clarifications. A study is required at the end of the transposition process to evaluate possible different requirements or interpretations.
    [Show full text]
  • A Survey of Anonymous Communication Systems
    On Anonymity in an Electronic Society: A Survey of Anonymous Communication Systems MATTHEW EDMAN and BULENT¨ YENER Rensselaer Polytechnic Institute The past two decades have seen a growing interest in methods for anonymous communication on the Internet, both from the academic community and the general public. Several system designs have been proposed in the literature, of which a number have been implemented and are used by diverse groups, such as journalists, human rights workers, the military, and ordinary citizens, to protect their identities on the Internet. In this work, we survey the previous research done to design, develop, and deploy systems for enabling private and anonymous communication on the Internet. We identify and describe the major concepts and technologies in the field, including mixes and mix networks, onion routing, and Dining Cryptographers networks. We will also review powerful traffic analysis attacks that have motivated improvements and variations on many of these anonymity protocols made since their introduction. Finally, we will summarize some of the major open problems in anonymous communication research and discuss possible directions for future work in the field. Categories and Subject Descriptors: C.2.0 [Computer-Communication Networks]: General—Security and protection (e.g. firewalls); C.2.2 [Computer-Communication Networks]: Network Protocols—Applica- tions (SMTP, FTP, etc.); C.2.4 [Computer-Communication Networks]: Distributed Systems—Distributed applications; K.4.1 [Computers and Society]: Public Policy Issues—Privacy General Terms: Security 5 Additional Key Words and Phrases: Anonymity, anonymous communication, DC-nets, mixes, mix networks, onion routing, privacy, traffic analysis ACM Reference Format: Edman, M. and Yener, B.
    [Show full text]
  • Take Control of Your Online Privacy
    EBOOK EXTRAS: v1.1 Downloads, Updates, Feedback TAKE CONTROL OF YOUR ONLINE PRIVACY by JOE KISSELL $10 Thank You, AgileBits This book is a gift from AgileBits, the company behind 1Password. One of the essential steps to keeping your personal information private is choosing passwords that are long, random, and different for every site and service you use. 1Password makes this process simple, and it has been my favorite password manager for many years. —Joe Kissell 2 From Dave at AgileBits To all our amazing customers. We wouldn’t be here without you. Thank you for enabling us to work at our dream jobs. —Dave Teare 3 Table of Contents Read Me First .............................................................. 7 Updates and More .......................................................... 7 Basics ......................................................................... 8 What’s New in Version 1.1 ............................................... 8 Introduction .............................................................. 10 Online Privacy Quick Start ......................................... 13 Learn What You Have to Hide .................................... 15 Things You Might Want to Keep Private ............................. 15 Personally Identifiable Information ................................... 18 Learn Who Wants Your Private Data (and Why) ....... 20 Advertisers ................................................................. 20 Local Villains ............................................................... 24 Hackers ....................................................................
    [Show full text]
  • Protocol-Level Attacks Against
    Computer Networks 57 (2013) 869–886 Contents lists available at SciVerse ScienceDirect Computer Networks journal homepage: www.elsevier.com/locate/comnet Protocol-level attacks against Tor ⇑ Zhen Ling a,1, Junzhou Luo a,2, Wei Yu b, ,3, Xinwen Fu c,4, Weijia Jia d,5, Wei Zhao e,6 a School of Computer Science and Engineering, Southeast University, Nanjing 211189, PR China b Department of Computer and Information Sciences, Towson University, Towson, MD 21252, United States c Department of Computer Science, University of Massachusetts Lowell, Lowell, MA 01854, United States d Department of Computer Science, City University of Hong Kong, Hong Kong Special Administrative Region, PR China e Rector of University of Macau, Macau SAR, China article info abstract Article history: Tor is a real-world, circuit-based low-latency anonymous communication network, sup- Received 31 January 2012 porting TCP applications over the Internet. In this paper, we present an extensive study Received in revised form 20 July 2012 of protocol-level attacks against Tor. Different from existing attacks, the attacks investigated Accepted 6 November 2012 in this paper can confirm anonymous communication relationships quickly and accurately Available online 16 November 2012 by manipulating one single cell and pose a serious threat against Tor. In these attacks, a malicious entry onion router may duplicate, modify, insert, or delete cells of a TCP stream Keywords: from a sender, which can cause cell recognition errors at the exit onion router. If an accom- Protocol-level attacks plice of the attacker at the entry onion router also controls the exit onion router and rec- Anonymity Mix networks ognizes such cell recognition errors, the communication relationship between the sender Tor and receiver will be confirmed.
    [Show full text]
  • Usability of Anonymous Web Browsing: an Examination of Tor Interfaces and Deployability
    Usability of Anonymous Web Browsing: An Examination of Tor Interfaces and Deployability Jeremy Clark P.C. van Oorschot Carlisle Adams School of Information and School of Computer Science School of Information and Technology (SITE) Carleton University Technology (SITE) University of Ottawa [email protected] University of Ottawa [email protected] [email protected] ABSTRACT To this end, a number of tools have been developed to as- Tor is a popular privacy tool designed to help achieve online sist a user in deploying and using Tor. By examining the anonymity by anonymising web traffic. Employing cogni- usability of these tools and offering suggestions for their im- tive walkthrough as the primary method, this paper evalu- provement, the motivation of this paper is to help increase ates four competing methods of deploying Tor clients, and a Tor's affability among novice users with hopes of expanding number of software tools designed to be used in conjunction its user base. In this paper, we examine the usability of Tor with Tor: Vidalia, Privoxy, Torbutton, and FoxyProxy. It and evaluate how easy it is for novice users to install, con- also considers the standalone anonymous browser TorPark. figure, and use Tor to anonymise the Firefox web-browser. Our results show that none of the deployment options are In a Tor network, anonymity is achieved by mixing the fully satisfactory from a usability perspective, but we offer sender's internet traffic with traffic from other users such suggestions on how to incorporate the best aspects of each that the true sender of a particular message is indistinguish- tool.
    [Show full text]
  • Dissent in Numbers: Making Strong Anonymity Scale
    Dissent in Numbers: Making Strong Anonymity Scale David Isaac Wolinsky, Henry Corrigan-Gibbs, and Bryan Ford Yale University Aaron Johnson U.S. Naval Research Laboratory Abstract tion tools [24,25,50,62], users discussing the wrong topic without protecting their identity risk jail or worse. Current anonymous communication systems make a Even in countries with strong free speech traditions, trade-off between weak anonymity among many nodes, anonymity can protect minority groups from discrimina- via onion routing, and strong anonymity among few tion [53]. Increasingly pervasive, profit-motivated track- nodes, via DC-nets. We develop novel techniques in Dis- ing practices [51] have made communication linkability sent, a practical group anonymity system, to increase by a widespread privacy concern [30]. Finally, anonymity over two orders of magnitude the scalability of strong, plays other well-established roles in modern societies, traffic analysis resistant approaches. Dissent derives its such as in voting [2, 19, 44] and auctions [52]. scalability from a client/server architecture, in which Anonymous relay tools such as Tor [26] offer the many unreliable clients depend on a smaller and more strongest practical identity protection currently available, robust, but administratively decentralized, set of servers. but exhibit several classes of weaknesses. First, relay sys- Clients trust only that at least one server in the set is hon- tems are vulnerable to traffic analysis [6,13,37,43,45]. A est, but need not know or choose which server to trust. state-controlled ISP, for example, who can monitor both Unlike the quadratic costs of prior peer-to-peer DC-nets a user’s “first-hop” link to Tor and the “last-hop” link schemes, Dissent’s client/server design makes communi- from Tor to the user’s communication partner, can cor- cation and processing costs linear in the number of clients, relate packets to de-anonymize flows [43].
    [Show full text]
  • A Survey of Anonymous Communication Channels
    Journal of Privacy Technology 20060701001 A Survey of Anonymous Communication Channels George Danezis [email protected] K.U. Leuven, ESAT/COSIC, Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium. Claudia Diaz [email protected] K.U. Leuven, ESAT/COSIC, Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium. Abstract We present an overview of the field of anonymous communications, from its establishment in 1981 from David Chaum to today. Key systems are presented categorized according to their underlying principles: semi-trusted relays, mix systems, remailers, onion routing, and systems to provide robust mixing. We include extended discussions of the threat models and usage models that different schemes provide, and the trade-offs between the security properties offered and the communication characteristics different systems support. Keywords: Anonymous communications, Mix systems, Traffic analysis, Privacy. 1. Introducing Anonymous Communications Research on anonymous communications started in 1981 with Chaum’s seminal paper “Untrace- able electronic mail, return addresses, and digital pseudonyms” (Chaum (1981)). Since then, a body of research has concentrated on building, analyzing and attacking anonymous communica- tion systems. In this survey we look at the definition of anonymous communications and the major anonymous communication systems grouped in families according to the key design decisions they are based on. Data communication networks use addresses to perform routing which are, as a rule, visible to anyone observing the network. Often addresses (such as IP addresses, or Ethernet MACs) are a unique identifier which appear in all communication of a user, linking of all the user’s transactions. Furthermore these persistent addresses can be linked to physical persons, seriously compromis- ing their privacy.
    [Show full text]