Comprehensive Analysis of Web Privacy and Anonymous Web Browsers: Are Next Generation Services Based on Collaborative Filtering?
Total Page:16
File Type:pdf, Size:1020Kb
Comprehensive Analysis of Web Privacy and Anonymous Web Browsers: Are Next Generation Services Based on Collaborative Filtering? Gábor Gulyás, Róbert Schulcz, Sándor Imre Budapest University of Technology and Economics, Department of Telecommunications, Mobile Communication and Computing Laboratory (MC 2L) Magyar Tudósok krt.2, H-1117, Budapest, Hungary {gulyasg, schulcz, imre}@hit.bme.hu Abstract. In general, networking privacy enhancing technologies are better on larger user bases - such criteria that can be enhanced by combining them with community based services. In this paper we present main web privacy issues and today’s complex preventive solutions, anonymous web browsers, in several aspects including a comprehensive taxonomy as a result of our inquiry. Also, we suggest a next generation anonymous browser scheme based on collaborative filtering concerning issues on semantic web. Finally we analyze the benefits and drawbacks of such services, also examining the possible investors and raised moral considerations. Keywords: anonymous web browsers, user tracking, web privacy, collaborative filtering 1 Introduction The web has become a generic platform and takes a serious place in the everyday life of the digital age’s citizens. Several life-like transactions can be done on the web such as browsing items in a web shop, executing financial transactions, booking hotel rooms, while users require a high level of privacy, meaning as strong as they were committing these actions in real life. However, privacy on the web is not as strong as it is desired to be. Browsing items on web shops, or reading on-line magazines should be done anonymously if desired, but in many cases users are being observed and information is collected for profiling purposes. In most cases these profiles are later being used for direct marketing implying targeted advertising, dynamic pricing. Although user profiles can also be useful in determining content relevancy or in creating customized services. Privacy enhancing technologies (PET) are the solution against privacy vulnerabilities. The necessity of privacy enhancing technologies for the Internet emerged in the early beginnings [1] and since solutions evolve, however, there are still a lot of open questions. On the web anonymous web browsers represent the complex solution for sustaining anonymity and defending privacy. In this paper after outlining the current problems of web privacy and analysis of anonymous web browsers, we recommend a new solution based on collaborative filtering. This next generation service does not presume the existence of a semantic web, instead offering the possibility to create it, while it also strengthens user privacy by providing anonymous web surfing. We structure this paper as follows. In order to determine how seriously user privacy is endangered on the web, in Section 2 we discuss web privacy issues by inspecting participants concerned in violating user privacy and primal techniques. As a summary at the end of the section we propose a criterion determining the proper conditions for achieving anonymity. Anonymous web browsing services provide preceding solutions for the yielding privacy vulnerabilities. In Section 3 we present the architecture of today’s anonymous web browsing services, and publish a short taxonomy for classifying such service types, including a comparison as well. In Section 4 we suggest a solution describing how collaborative filtering should be applied to anonymous web browsers. In this section we analyze possible investors and examine moral questions raised as well. Finally, we give a conclusion about our work in this paper. 2 Web Privacy Web privacy issues can be divided in two main categories, correspondingly to the categorization of passive and active attacks in security: information leaking and technologies used to compromise privacy. However, information shared inadvertently can also result in compromising privacy (for instance, tracking) which raises the importance of total user control over shared or leaked information as well, not only preventing and detecting active attacks. Since several services rely on their advertising incomes, some privacy-friendly altering methods should be considered. For example instead of animated advertising, text based should be used, which is more audio-visual privacy friendly. Alternatively for tracking user activities and profiling, user preferences should be guessed by analyzing the web context in which the advertisement is shown.. Using this method users are classified into general preference groups instead of creating personal profiles. Otherwise, requesting the users’ consent is also a privacy-friendly approach if individual profiles needed to be created. 2.1 Participants, Goals and Motivation We divided the participants into two main groups: the users and others possibly compromising their privacy. However, in another aspect participants could be classified to be neutral, supportive or endangering user privacy. Intrinsically, actual participants can have several roles at the same time. The users ’ objective is to have total control over all their data : any information sent and received, and preserving anonymity. A user should also be aware of the information shared with the service provider or a third party any time, and should be able to defend herself against advertising and the leak of private information like e- mail addresses or login names. Neglecting the way the profile was acquired, the advertisers ’ goal is to achieve precise targeted advertising : to get the proper advertisements to the proper users (in large numbers). In some cases advertising is used together with profiling by tracking user activity and storing profile information on contextual or click-through bases. Advertisements, besides overloading system resources, can violate audio-visual privacy, by expanding over their designated area and playing sound effects or music. Web shops and stores might be using targeted advertising and profiling, however, they might use profiles for dynamic pricing , for extra profit they offer desired products more expensive and uninteresting ones cheaper. The user’s profile can be easily updated accordingly to her purchase statistics. Data collectors use special tracking techniques and often collaborate with service providers for profiling purposes. Their goal is to create accurate databases for merchandising or for some previously mentioned activities. The category of service providers includes Internet Service Providers (ISP) and web services providers as well. The IPS-s’ proxies are the bottleneck of the users’ whole traffic, which is ideal for logging user activities and blocking access to web service providers (politically motivated censorship). Web service providers are often the link between different participants by applying auditor services, placing third-parties’ advertisements on their pages or could be collaborating with other web service providers for merging logs or creating wired networks for tracking purposes. Censoring activities can be motivated by corporate policies or political regulations. In the world of the web the main goal of censorship is to block access to websites with certain URL addresses or any other sites providing specific content. Free Internet service providers may also be censoring web content or several URL addresses containing forbidden words or phrases (for instance Internet access available in libraries). 2.2 Passive Attacks: Public Information and Abuses Most of the information shared with websites, such as user agent or display properties are intended to be used for customizing services, but this information can also be used to create comprehensive profiles. Information like exact browser agent version, list of installed plug-ins can be used to check the existence of specific vulnerabilities, which can be used to install spyware on the user’s computer. There are several websites demonstrating severity of information leak by listing public information available of the computer which the user uses to access the web, like in [2]. Revealing the network address is a technical need due to the networking mechanisms and architecture of the Internet, however, these addresses are almost unique and allow tracking. The IP address can also be used for geo-locating users quite precisely, narrowing down the possibilities to the most likely country and city. For a visual demonstration, visit [3]. Since IP addresses can change and might be referring to several users and devices, other mechanism are used for tacking purposes, which are considered to be active attacks against user privacy as they require tracking identifiers bound to the user. Fig. 1. TCP/IP stack model and information can be used to abuse privacy. Different types of public information can be sorted by network layers as described in Fig. 1. This classification is required later for defining anonymity criterion (network and application level anonymity should be preserved in a different but related way). 2.3 Active Attacks: Techniques Violating Web Privacy The main purpose of active attacks is profiling, however, censorship should not be left out. Profiling is the method of mapping user activities by time and logging preferences plus interests altogether. This is mainly done by tracking user activity throughout the web, but due to caching and history preserving mechanisms built in the browsing agents there are other methods as well. In addition, we should consider the possibility of collaborating service providers.