User and the Evolution of Third-party Tracking Mechanisms on the

Sonal Mittal

May 18, 2010

Abstract

Third-party tracking refers to tracking done by websites that a user never navigates to explicitly. Many Internet users are vaguely aware that their information may be collected online. However, data suggests there is relatively little knowledge about third-party tracking and its associated privacy risks. The FoxTracks software tool attempts to address this lack of knowledge about third-party online tracking for the benefit of interested users with varying levels of technical knowledge. FoxTracks is a Firefox add-on program that browses the web along with the user and collects information about three types of trackers that may be monitoring the user: HTTP cookies, Flash Local Shared Objects, and DOM Storage entries. The interface to FoxTracks displays the user’s information as it has been collected by the trackers; this highly personalized view of third-party tracking is uniquely accessible and informative for end-users. Beyond the development of FoxTracks, the analysis presented in this thesis discusses the history, key players, and motivations of third-party tracking, and how each influenced the design choices made in the software. In particular, the motivations of third-party entities, who are frequently online advertisers, are examined at length. A computer security rubric is then applied to the behavior and tracking methodologies of third parties in order to show their adversarial qualities in matters of user privacy.

Contents

1 Introduction

2 Third-Party Tracking in the Literature and Code

3 HTTP Cookies and Web bugs
3.1 The Introduction of State Management Mechanisms
3.2 Advertisers and Personally Identifiable Information
3.3 HTTP Cookies and Web Bugs in FoxTracks

4 Flash Local Shared Objects
4.1 A History of LSOs
4.2 Corporations and Market Incentives
4.3 Flash Cookies in FoxTracks

5 DOM Storage
5.1 in the W3C Standard
5.2 Case Study: Gmail Mobile Privacy
5.3 Community Approach to the Study of DOM Storage

6 Results
6.1 The FoxTracks Implementation
6.2 Third Parties as Privacy Adversaries

7 Conclusions

Acknowledgements

Bibliography

Chapter 1

Introduction

Protection of online privacy refers to freedom from unwanted interference with an Internet user’s digitally stored personal information. This includes data residing on a user’s local computer, data transmitted by a user to remote servers, and data that is generated in the process of browsing websites: mouse strokes, searches, history, and other page inputs. Evolving Internet protocols and standards have contributed to a strong emphasis on user self-management of online privacy rather than legal or regulatory restrictions on how remote servers can collect and use user information. Users’ general lack of knowledge about the transmission and uses of personal data, combined with little privacy jurisdiction on the Internet, has left many paths open for the aggregation and use of individuals’ web data without their consent.

Many non-expert Internet users are vaguely aware that their information and data may be collected online. However, surveys of users such as Internet-using college students suggest that individuals know little about the pervasiveness of online tracking and the kinds of personal information that can be recorded [9][3]. The surveys show there is particularly little awareness about the data collection and activity logging undertaken by third-party websites. Third-party websites are entities that a user never visits explicitly; they are contrasted with first-party websites, which correspond to URLs that a user enters in a browser address bar. Because third-party tracking is undertaken surreptitiously by unfamiliar entities, it is less transparent than tracking done by first-party websites. Users are able to consult the stated privacy policies of first-party websites in order to understand how their data and activity will be monitored. However, without knowing the identities of third parties, a user cannot examine their privacy policies or opt out of their tracking. Personal information collected in an opaque manner by remote entities poses an especially serious privacy risk since digital data can be copied and distributed to additional parties with ease.

To increase awareness about third-party tracking and survey common online tracking methodologies, I created the FoxTracks software tool. FoxTracks is designed as a Firefox add-on program and was created using JavaScript and XUL utilities. The ultimate goal of the software is to educate regular Internet users about the different kinds of tracking technologies employed by third parties and demonstrate the great extent to which their activity and data are monitored by unfamiliar entities.

To achieve this aim, I designed FoxTracks to show how three different tracking technologies personally affect the user as she browses the web. FoxTracks contains three panels, one for each of HTTP cookies, Local Shared Objects, and DOM Storage. Each panel aims to show the user which third parties are using the technology to track her online activities and what personal information they have collected. Each panel also provides links to web pages answering potential questions about the tracking technology, the types of user information at risk, and available opt-out mechanisms for the technology. The web content associated with each panel was generated from my research and reviewed by privacy experts at the Center for Democracy and Technology (CDT).¹ The CDT also kindly hosts the FoxTracks web content on their servers.² Beyond the FoxTracks development cycle, the analysis presented in this thesis discusses the history, key players, and motivations of third-party tracking, and how each influenced the design choices made in the software.
In the analysis, I also explore the following research questions: Does the evolution of tracking technologies suggest an adversarial relationship between users and third parties, as in the computer security paradigm? To what extent does the Internet infrastructure (e.g., HTML standards) facilitate third-party tracking? Should Internet users be comfortable with opaque information collection, and if not, what kinds of responses are effective?

For Internet users who are interested in learning more about online privacy and the risks posed by third-party tracking, FoxTracks is an accessible, all-in-one resource. By showing users the information that specific third parties have collected about them, FoxTracks demonstrates privacy risks in a novel, highly personalized way. With a better understanding of third-party tracking, users are able to make more informed decisions about how they browse the web. As such, FoxTracks has the ability to synchronize user beliefs about digital privacy with online behavior in a way that closes the information gap suggested by Internet user surveys. For instance, FoxTracks adopters may adjust their browser settings and browsing habits based on how they find their information is collected and used by third parties. At a minimum, users who learn about third-party tracking will continue their current browsing patterns in a more transparent environment, one in which third parties have the “informed consent” of users to track their online activities. In this way, FoxTracks plays an important public role in spreading information about third-party tracking to online content consumers. The accompanying written analysis on the history and identity of third parties and the potential uses of collected user data also has significance for the public discourse on privacy.

Through my research, I find that third parties have developed increasingly advanced technologies to combat user efforts to restrict access to personal data. Additionally, they have significant economic motives to track users and are heavily aided by new HTML standards. Taken together, these findings suggest that third parties act like adversaries to individual users under the computer security paradigm. It follows that users should take active, and perhaps organized, steps to restrict third-party activities online.

¹ CDT is a non-profit public interest organization working at the intersection of law, technology, and policy. It is headquartered in Washington, D.C. More information can be found on their website: http://www.cdt.org/
² See http://www.cdt.org/foxtracks/. Because of CDT’s contribution, their logo appears on the FoxTracks tab panels.

This thesis is organized as follows: Chapter 2 contextualizes my software and analysis within the existing body of computer science literature on third-party tracking. The following sections explore the three types of third-party tracking technologies included in the FoxTracks tool. Chapter 3 is an overview of HTTP cookies and web bugs in relation to third-party tracking, and Chapter 4 examines how third parties make use of Flash object technology. Chapter 5 explores the new DOM Storage feature of the HTML5 standard. Each of these chapters describes the basic technology, how the technology is modified to serve information collection purposes, types and potential uses of the information collected by the technology, and how these facts influenced the design of FoxTracks. Results related to the FoxTracks software and the proposed research questions are given in Chapter 6. Chapter 6 also reviews the results by considering some of the limitations of the software. Chapter 7 concludes the thesis and discusses further directions for research.

Chapter 2

Third-Party Tracking in the Literature and Code

HTTP cookies have been part of the IETF’s HTTP state management standard (RFC 2965) since 2000. Since their introduction, cookies and web bugs, which are functionally similar to cookies, have generated a great deal of academic interest. Kristol (2001), author of the original cookie standard, was among the first to give a general overview of how cookie mismanagement could result in serious security breaches involving information leakage across domains [6]. Kristol acknowledges the third-party profiling potential of cookies in correctly implemented cookie management systems and questions whether users are aware of the tracking potential of cookies. Similar explanations and concerns have been reiterated as tangential points in numerous academic computer security papers since 2001. Other researchers have taken a code-based approach to the analysis of HTTP cookies; motivated by a desire for web transparency, they have developed software programs for viewing and managing first-party and third-party cookies. A great deal of this research has come from the public sector, with individual programmers and non-profit advocacy groups undertaking software development and publicizing their work and findings. The extent of this research is evidenced by the myriad cookie management programs available for download today.

Ghostery, a popular Firefox add-on, is one of the few programs designed to exclusively identify and block third-party HTTP cookies and the web bugs associated with them. Unlike regular cookie managers, Ghostery strives to give users information about the third parties attempting to set cookies through first-party websites that users visit. Specifically, it alerts users about the identities and first-party associations of third-party trackers in real time using a menu located in the Firefox status bar. The menu also links to more information about the identified trackers. This informational quality of Ghostery addresses a deficiency of popular cookie managers: an average Internet user without a precise understanding of third-party cookies and web bugs can easily use Ghostery and learn about online tracking in the process. Though Ghostery is the closest content analog to FoxTracks, it does not provide a user with a complete picture of how third parties log user activity on the web. FoxTracks attempts to address this informational deficiency by compiling a history of all first-party websites on which a given third-party tracker has been found in order to show the browsing history profile that the tracker compiles.

Local Shared Objects (LSOs) refer to client-side, remotely accessible storage. Adobe makes use of LSOs in its popular Flash video player. Much of the interest in LSO implementations has come from the private sector, with various web company white papers outlining the potential of Adobe’s Flash LSOs as client-side storage bins. LSOs in general have been examined extensively in systems and networking literature as persistent storage bins for communication between machines. They have been explored as a mechanism for executing malicious attacks on host computers, and their third-party tracking potential is frequently noted in privacy papers. Soltani et al. (2009) were the first to present a holistic picture of Flash LSO usage and web practices [13].
They conclude that Flash LSOs present a substantial privacy threat because of their third-party capabilities and their obfuscation from users, which gives them an especially persistent nature.

As with HTTP cookies, a great deal of LSO privacy analysis has come from the public sector. Organizations such as the Electronic Privacy Information Center (EPIC) have compiled public fact sheets on how third parties use LSOs to track users, and many individual developers have created stand-alone and browser add-on programs for Flash LSO management. In the Firefox add-on tradition, BetterPrivacy is the most widely used LSO removal and editing tool. BetterPrivacy lists all the LSOs that currently exist on a user’s machine and allows a user to select the frequency and timing of LSO deletion. Despite a user-friendly design, BetterPrivacy adopts the “install and forget” browser add-on model and thus primarily benefits users with an understanding of LSOs and their risks. The add-on may be less useful for users who do not have a good understanding of first-party and third-party LSOs, or the nature of the privacy threats that they present. With FoxTracks, my goal was to build on BetterPrivacy’s control over obfuscated LSOs by providing a visual explanation of how third parties may set and use LSOs. By illuminating actual LSO tracking of a user’s web activities, I extend LSO control to a wider, non-technical audience.

Unlike HTTP cookies and LSOs, DOM Storage remains relatively unexplored in the academic and private sector literature. Where it does appear, it is studied for its efficiency properties as a remote-access storage space. Between public sector organizations and individual web developers, no software tools have been developed to specifically examine DOM Storage contents or to clarify the exact contents to users. Some web pages hosted by privacy interest groups like the Electronic Frontier Foundation (EFF) briefly describe the location of DOM contents without more specific information or instances of DOM Storage use in user tracking. FoxTracks aims to bring DOM Storage more fully into the privacy discourse by exposing its contents to users and beginning to clarify its role in third-party tracking. FoxTracks is thus a unique development in the body of work on DOM Storage.
In sum, there is a substantial amount of literature on HTTP cookies, significant work on LSOs, and relatively little information about the uses of DOM Storage in tracking. While the existing literature may inform interested users about these technologies, it provides high-level explanations rather than a personally relevant demonstration of privacy invasion. The latter is a more accessible and tangible educational experience for users with little background in online privacy. Because current software tools that do show how users are personally affected by tracking require intermediate technological knowledge, I designed FoxTracks to be relevant and accessible to all Internet users. Surpassing the traditional storage management model in favor of a personalized approach allows non-expert users to see how tracking figures actively in their browsing experience. The accessibility of FoxTracks comes from its interface and its all-in-one nature. Including all three tracking technologies in a single privacy management tool provides a novel, holistic survey of third-party tracking on the web. Moreover, each technology represents an advance in the capabilities and persistence of trackers, which informs users about the evolution and growth of third-party trackers. FoxTracks aims to increase general knowledge and awareness of third-party tracking through these educational and accessible qualities.

Chapter 3

HTTP Cookies and Web bugs

3.1 The Introduction of State Management Mechanisms

As the complexity of web applications grew in the late 1990s, the Internet Engineering Task Force (IETF)¹ recognized the value of adopting an HTTP state management mechanism.² The IETF believed such a mechanism could support virtual shopping carts for e-commerce and improve the user browsing experience by “remembering” preferences for websites [7]. The state management mechanism adopted was the HTTP cookie (cookie). Cookies are small pieces of text that servers can set on and read from a client computer in order to register its “state.” They have strictly specified structures and can contain no more than 4 KB of data each. When a user navigates to a particular domain, the domain’s server may instruct the browser to store a cookie on the user’s machine. The browser will send this cookie in all subsequent communication between the client and the server until the cookie expires or is reset by the server. As predicted by the IETF, cookies have been used to improve the functionality of many websites. For example, they have been used to implement shopping carts, cache data values, personalize website views, and transmit user credentials [11]. Such use of cookies improves the user browsing experience and in turn benefits websites, which receive more visitors.

¹ The IETF is an open standards organization that works with similar groups to propose and review Internet standards.
² HTTP refers to the HyperText Transfer Protocol, which governs how requests are sent over the Internet. These requests are stateless; in other words, they do not carry any configuration information about the systems exchanging requests.

However, cookies can also compromise user privacy in many ways. At the time of adoption, the IETF described the cookie’s potential for cross-domain information exchange, a particularly serious threat to user privacy. The following text appears under the header “Unexpected Cookie Sharing” in the IETF’s Request for Comments (RFC) 2965 document explaining the new cookie standard:

A user agent should make every attempt to prevent the sharing of session information between hosts that are in different domains. Embedded or inlined objects may cause particularly severe privacy problems if they can be used to share cookies between disparate hosts. For example, a malicious server could embed cookie information for host a.com in a URI for a CGI on host b.com.³ User agent implementors are strongly encouraged to prevent this sort of exchange whenever possible.

³ A URI (Uniform Resource Identifier) identifies a resource on the web; a CGI (Common Gateway Interface) program is a server-side program that generates a response to a request. Both refer to content that exists outside the immediate context.

Users can navigate to webpages that load content such as images or advertisements from third-party servers. Because the user’s browser connects to a third-party server when that server’s content is loaded on a first-party website, the third party is able to set a cookie on the user’s machine. Cookies set by these third parties have the potential to track a user’s browsing habits. To see how this is possible, consider an image that is stored on a.com’s servers and loaded on two websites: b.com and c.com. If a user navigates to b.com, a.com can set a cookie containing a unique alphanumeric string on the user’s machine and associate b.com with that string somewhere on its own servers. When the user next navigates to c.com, a.com will read the cookie it previously set on the user’s machine. It can then recognize the unique string contained in the cookie and associate c.com with the string. a.com now has a small profile of the user’s browsing habits and can grow this profile in proportion to the number of websites that host its content. Thus, agents interested in tracking users are able to exploit the cookie state management mechanism to capture users’ browsing habits without their knowledge or consent.

Web bugs are functionally similar to cookies set by third parties. Web bugs affiliated with particular third parties are embedded objects loaded from third-party servers that are invisible to users. Unlike third-party cookies, they do not set any data on a user’s computer. Rather, they collect a user’s IP address, browser type, and the current first-party URL, and read any unique-string cookies that have been set by the third party in the past. If such a cookie is found, the server is able to augment its profile of the user with the current first-party URL. In this way, web bugs can be used in conjunction with third-party cookies to facilitate user tracking.
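The a.com, b.com, and c.com example above can be sketched in a few lines of Python. This is a toy model rather than a description of any real tracker: the class name, the `uid` cookie name, and the dictionary standing in for the browser's cookie store are all assumptions made for illustration.

```python
import uuid
from collections import defaultdict


class ThirdPartyServer:
    """Toy model of a.com, whose image is embedded on several first-party sites."""

    def __init__(self):
        # Maps each unique cookie string to the first-party sites where it was seen.
        self.profiles = defaultdict(list)

    def serve_image(self, first_party, cookie_jar):
        """Handle one request for the embedded image.

        cookie_jar stands in for the browser's cookies scoped to a.com
        (an assumption; a real browser manages this store itself).
        """
        user_id = cookie_jar.get("uid")
        if user_id is None:
            # First contact: set a cookie holding a fresh unique string.
            user_id = uuid.uuid4().hex
            cookie_jar["uid"] = user_id
        # Associate the embedding site with the unique string.
        if first_party not in self.profiles[user_id]:
            self.profiles[user_id].append(first_party)
        return user_id


tracker = ThirdPartyServer()
browser_cookies_for_a_com = {}

first_id = tracker.serve_image("b.com", browser_cookies_for_a_com)
second_id = tracker.serve_image("c.com", browser_cookies_for_a_com)

print(tracker.profiles[first_id])  # ['b.com', 'c.com']
```

Because the same cookie is returned on both visits, the two first-party sites end up in a single profile, which is the growth in proportion to embedding sites described above.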

3.2 Advertisers and Personally Identifiable Information

The previous section explains how third-party cookies can be used to track a user’s browsing habits. However, understanding the full privacy consequences of third-party cookie tracking requires an examination of third parties and the uses of user browsing profiles.

I used the Ghostery Firefox add-on to examine which organizations are carrying out third-party tracking using cookies and web bugs. Ghostery identifies third-party trackers on a given web page by first searching the underlying HTML of the page for script tags. Content from third-party trackers is typically pulled in through script tags that reference a domain other than that of the primary URL. Once Ghostery has acquired all objects associated with script tags, it compares the objects to a database of known third-party trackers in order to determine whether any of the scripts loaded third-party trackers. This database is the most comprehensive list of known third-party trackers that make use of cookies and web bugs. It includes over 200 trackers, and a small sample is given in Table 3.1.

Table 3.1: Some third-party trackers contained in the Ghostery database.

Tracker Name
Analytics
Quantcast
SiteMeter
Omniture
Connect
Google Adsense
Doubleclick
Tacoda
WebTrends
AddThis
Revenue Science

The tracking agents listed in Table 3.1 are primarily ad networks and behavioral data providers. Ad networks connect advertisers who want to reach potential customers with sites that want to sell advertisement space. This business model allows ad networks to reach a wide spectrum of small and medium-sized websites interested in taking on advertisements. In 2009, 30% of the $8 billion in advertising spending went to ad networks rather than directly to websites selling advertising space [5]. One advantage ad networks have over individual websites selling advertising space is the ability to display the same ad across multiple websites that a single user visits. This advantage attracts advertisers who desire multiple ad impressions per user [10]. Thus, being able to accurately track user browsing habits has significant business consequences for ad networks. It follows that ad networks have a strong incentive to use third-party cookies and web bugs to track users across as many sites as possible.

Like ad networks, behavioral data providers aim to track and organize user browsing patterns. However, behavioral data providers are solely in the business of stratifying users for receiving targeted advertisements and are not involved with the advertising process itself. Such companies work with websites or ad networks to suggest relevant ads for different users based on browsing history data collected through third-party cookies and web bugs. For both types of companies, more user tracking yields more user data points that translate to improved business.

Datasets containing users’ browsing habits can be augmented by precise demographic data. Though the primary use of third-party tracking cookies and web bugs is the aggregation of a user’s browsing history, cookies can also be used to determine demographic information about the user. Cookies and web bugs have access to primary page URLs, which may leak pieces of personal data such as a name or submitted form data. Third parties may process this information and associate it with the browsing profile and unique string for the user [12]. This association is a serious threat to user privacy because it may de-anonymize the browsing history profile that was otherwise only connected to an alphanumeric string. The browsing profile newly associated with a specific person and/or her demographic information might be sold or publicized at the discretion of a tracking company, resulting in a serious breach of user privacy.
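To make the URL leakage described above concrete, the following minimal sketch shows how personal fields could be pulled out of a first-party page URL that a third party receives. The URL, the parameter names in `PERSONAL_KEYS`, and the function name are all hypothetical; the thesis does not specify which fields trackers actually look for.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical parameter names a tracker might look for; not taken from the thesis.
PERSONAL_KEYS = {"name", "email", "zip"}


def extract_personal_fields(page_url):
    """Return personal-looking query parameters found in a first-party URL."""
    query = parse_qs(urlsplit(page_url).query)
    return {key: values[0] for key, values in query.items() if key in PERSONAL_KEYS}


# A made-up first-party URL as a web bug might receive it.
url = "http://example.com/profile?name=Alice&zip=94305&page=2"
leaked = extract_personal_fields(url)
print(leaked)  # {'name': 'Alice', 'zip': '94305'}
```

Once such fields are stored alongside the unique cookie string, the browsing profile is no longer tied only to an anonymous identifier.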

3.3 HTTP Cookies and Web Bugs in FoxTracks

Following the Ghostery add-on model, FoxTracks works by reading in the underlying HTML of a web page on loading and searching for script tags, through which third-party tracker content is typically loaded from another domain. All lines containing scripts are then examined to see if they contain an external source and if that source can be identified as a third-party tracker. Specifically, the URL of the source is checked against the Ghostery database containing information about 200 known trackers. The Ghostery database is included as a raw file in the FoxTracks add-on. I have kept the original Ghostery database format and methodology to maximize compatibility with simultaneous use of Ghostery. Each “entry” in the file is specified as a {tracker name, tracker search pattern} pair. The search patterns were determined by the Ghostery development team, which collects new third-party tracker submissions from Ghostery users at large.

After verifying the presence of a third-party tracker, FoxTracks returns the name of the tracker and associates it with the first-party website on which it was found. FoxTracks manages a SQLite database to hold these associations. The database, called trackerBase.sqlite, is updated with a {third-party tracker, first-party website} pair whenever a third party is identified on a page. The HTTP cookies and web bugs tab in FoxTracks provides a table view of trackerBase.sqlite, which features two columns, “Tracker” and “Origin” (see Figure 3.1). An end-user can re-sort the table by column. The tracker-based sort displays all entries containing the same tracker consecutively; this view provides the user with a snapshot of the personal browsing profiles different trackers have compiled. These profiles approximate the browsing profiles that each tracker is able to store on its servers. The origin-based sort provides a history of all the third-party cookies and web bugs that have ever tracked the user on a particular website that the user has visited. This information is useful for users who may want to adjust their website usage based on concerns about third-party tracking.

To understand the FoxTracks interface, users require contextual information about cookies and web bugs. Thus, FoxTracks provides links to web content addressing general questions about cookies and web bugs, which third parties use these technologies, and what kinds of information third parties can collect. The software also provides links to information about opting out of third-party tracking with cookies and web bugs. Specifically, FoxTracks links to instructions for blocking third-party cookies using built-in browser settings. It also links to the latest version of Ghostery, which provides a mechanism for blocking web bug activity. By providing informational links as well as a personalized demonstration of third-party cookie and web bug tracking, FoxTracks engages and informs users about online privacy risks.
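The detection and storage steps described in this section can be sketched as follows. FoxTracks itself is written in JavaScript and XUL; this Python version is only an illustration of the logic. The two search patterns, the `tracks` table, and its column names are assumptions, since neither the Ghostery patterns nor the trackerBase.sqlite schema are reproduced here.

```python
import re
import sqlite3
from html.parser import HTMLParser

# Hypothetical {tracker name: search pattern} entries in the spirit of the
# Ghostery list; the real patterns are not reproduced in the thesis.
TRACKER_PATTERNS = {
    "Quantcast": re.compile(r"quantserve\.com"),
    "DoubleClick": re.compile(r"doubleclick\.net"),
}


class ScriptSourceCollector(HTMLParser):
    """Collects the src attribute of every script tag on a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_trackers(html):
    """Return names of known trackers whose pattern matches a script source."""
    collector = ScriptSourceCollector()
    collector.feed(html)
    return [
        name
        for src in collector.sources
        for name, pattern in TRACKER_PATTERNS.items()
        if pattern.search(src)
    ]


def record_page(db, origin, html):
    """Store one (tracker, origin) pair per tracker found on the page."""
    for tracker in find_trackers(html):
        db.execute(
            "INSERT OR IGNORE INTO tracks (tracker, origin) VALUES (?, ?)",
            (tracker, origin),
        )


# In-memory stand-in for trackerBase.sqlite; table and column names are assumed.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tracks (tracker TEXT, origin TEXT, UNIQUE (tracker, origin))")

quant = '<script src="http://edge.quantserve.com/quant.js"></script>'
dclick = '<script src="http://ad.doubleclick.net/adj.js"></script>'
record_page(db, "news.example", quant)
record_page(db, "shop.example", quant + dclick)

# Tracker-based sort: each tracker's rows are consecutive, forming its profile.
by_tracker = db.execute(
    "SELECT tracker, origin FROM tracks ORDER BY tracker, origin"
).fetchall()
# Origin-based sort: every tracker seen on a given site.
by_origin = db.execute(
    "SELECT tracker, origin FROM tracks ORDER BY origin, tracker"
).fetchall()
```

The two queries correspond to the “Tracker” and “Origin” column sorts in the panel: the first groups rows into per-tracker browsing profiles, and the second lists the trackers encountered on each visited site.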

Figure 3.1: Screenshot of the FoxTracks HTTP Cookies and Web Bugs panel.

Chapter 4

Flash Local Shared Objects

4.1 A History of LSOs

Local Shared Objects are a class of remotely accessible, client-side storage bins. Flash LSOs were first used to store settings preferences in Macromedia’s Flash Player 6 in 2002. They have been included in every subsequent version of the player, from Flash Player 7 through Flash Player 10 (Macromedia was acquired by Adobe in 2005). When a Flash application is loaded on a page, a website is able to set an associated Flash LSO without prompting the user for permission. These LSOs are formatted as .sol files and can hold up to 100 KB of data. Additionally, they do not have an expiration date and are located in a single system folder that is available to all users and browsers on a machine. These characteristics of LSOs suggest they are more persistent data stores than HTTP cookies.

Greater persistence and storage size offer a number of benefits as users consume more data-intensive web content like streaming music and video. LSOs are able to improve media playback by storing video preferences or caching large amounts of data that would otherwise have to be repeatedly retrieved from servers. However, this technology can also be detrimental to user privacy. Adobe Flash is a standalone program that is independent from the browser. Most browsers, including Firefox, do not provide any control mechanisms over the setting and accessing of Flash LSOs, nor do they prompt the user for permission to interact with Flash LSOs. Furthermore, Flash-based applications on a given web page may not be visible to the user. It follows that users who are unaware of LSOs have no control over the setting of LSOs on their machine. These concerns are aggravated by the wide variety of information that can be stored in LSOs. According to a Macromedia whitepaper on LSOs, the type of information that can be contained in a .sol file is limited only by the information to which the Flash application has access. This includes any content in the Flash application file, information that the user provides to the website or the Flash application, configuration information about the user’s machine for video content playback, and other LSOs associated with the same domain [4].

Flash LSOs can also be used for third-party tracking purposes. In 2005, United Virtualities, an online advertising company, published a statement promoting the use of LSOs in an online environment where users were increasingly aware of, and deleting, third-party HTTP cookies [15]. Like third-party HTTP cookies, third-party LSOs with unique identifying strings can be loaded through first-party websites. These third-party LSOs can then be used to compile an enhanced browsing profile of an individual who navigates to multiple websites that load content from the third party. Because this tracking methodology is very similar to that of third-party HTTP cookies, LSOs are also known as “Flash cookies.” There has been little work on identifying how and when Flash cookies are set on a user’s machine. Without this kind of information, it is difficult to distinguish first-party Flash cookies from third-party Flash cookies set by companies for tracking purposes.

4.2 Corporations and Market Incentives

Soltani et al. addressed the lack of Flash cookie data by using survey techniques to find out which websites regularly employed first-party and third-party Flash cookies. They surveyed the 100 most-visited websites (as of July 2009) and found that 54 sites set a total of 157 Local Shared Objects that produced 281 Flash cookies. Thirty-one of these sites also marked their Flash cookies with a unique identifying string that matched a unique identifier contained in an HTTP cookie set by the same site. Upon investigation, Soltani et al. found that when the corresponding HTTP cookies were deleted, a new HTTP cookie set by the website would contain the same identifier. This behavior suggests that Flash cookies actually “respawn” deleted HTTP cookies [13]. While their research does not explore the possibility that a website could instead identify users by the uniqueness of their browser settings, further evidence of cookie respawning is given by the United Virtualities statement on the use of Flash cookies to defend against cookie deletion. In a March 2005 statement, the company wrote,

All advertisers, websites and networks use [HTTP] cookies for targeted advertising, but cookies are under attack. ... [We] developed a backup ID system for cookies set by web sites, ad networks and advertisers, but increasingly deleted by users. UV’s ‘Persistent Identification Element’ (PIE) is tagged to the user’s browser, providing each with a unique ID just like traditional cookie coding. However, PIEs cannot be deleted by any commercially available anti-spyware, mal-ware, or adware removal program. They will even function at the default security setting for Internet Explorer.
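The identifier matching and respawning behavior reported by Soltani et al. can be illustrated with a minimal Python sketch. It is not their survey code and not part of FoxTracks. The function names, the dictionary layout, and the `min_length` threshold are assumptions made for illustration, and the cookie values are invented.

```python
def shared_identifiers(http_cookies, flash_values, min_length=8):
    """Return HTTP cookies whose value also appears in Flash storage.

    Short values (for example "en" or "1") are ignored because they are
    unlikely to be unique identifiers; min_length is an arbitrary cutoff.
    """
    flash_set = {v for v in flash_values.values() if len(v) >= min_length}
    return {name: value for name, value in http_cookies.items()
            if value in flash_set}


def respawned(before, after_revisit, flash_values, min_length=8):
    """Names of cookies that came back with the same value after deletion.

    `before` holds the cookies seen on the first visit. The caller is
    assumed to have deleted all HTTP cookies (but not Flash cookies)
    and revisited the site to obtain `after_revisit`.
    """
    candidates = shared_identifiers(before, flash_values, min_length)
    return sorted(name for name, value in candidates.items()
                  if after_revisit.get(name) == value)


# Hypothetical data: one identifier is mirrored in a Flash cookie.
before = {"uid": "a1b2c3d4e5f6", "lang": "en"}
flash = {"tracker.sol/userId": "a1b2c3d4e5f6", "player.sol/volume": "80"}
after = {"uid": "a1b2c3d4e5f6", "lang": "en"}
print(respawned(before, after, flash))
```

A cookie that returns with a fresh value after deletion is not flagged, since only a reappearing identifier that is also held in Flash storage points to respawning.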

Of the 31 domains with Flash cookies that respawned HTTP cookies, Soltani et al. identified eight as advertising companies and four as first-party domains. The eight advertisers in Table 4.1 constitute the only definitive list of third parties known to use Flash cookies in a way that intentionally circumvents user efforts to delete HTTP cookies. Others may be found by searching personal LSO collections, but this approach to identifying third parties is subject to scrutiny by the web community at large. Of the companies listed in Table 4.1, many publicly disclose their ability to collect large quantities of highly specific user information such as zip code and income bracket. VideoEgg alone has a 100 million-person user base through its distribution across 500 websites [2]. These advertisers have incentives to override user steps to protect privacy as outlined in Section 3.2.

Table 4.1: Companies using Flash cookies that respawn HTTP cookies.

Company Name
ClearSpring
Iesnare
InterClick
ScanScout
SpecificClick
QuantCast
VideoEgg
Vizu

The tractable number of advertisers known to use third-party Flash cookies also allowed me to examine more specific industry incentives to ignore concerns about user privacy on the web. Public records of venture funding show that three of the private advertisers in Table 4.1 (ClearSpring Technologies, QuantCast, and VideoEgg) received over $110 million in venture capital funding from 2005 to 2010 [2]. Other advertisers have recently been honored with accolades such as “a top 10 most innovative company.” This kind of monetary and industry support suggests that these companies are rewarded for intrusions into user privacy. It also suggests they face little to no opposition from organized web users or other interest groups that could weaken their business model by preventing tracking or inducing concern among venture funders. The lack of concern for user privacy demonstrated by funders and industry reinforces the need for an educational tool that increases awareness of tracking with Flash cookies.

4.3 Flash Cookies in FoxTracks

While third-party tracking is of particular research interest due to its intentionally obfuscated nature, the difficulty in determining when Flash cookies are set and accessed prevented me from focusing solely on third-party Flash cookies in FoxTracks. The current method of Flash cookie access detection is an examination of the local LSO folder for new LSOs and changes in last-access timestamps on every page load [8]. This method results in noticeable browser slow-down when the folder size is large and when significant numbers of Flash cookies are being accessed on a single page. Significant browser latency is a disincentive to add-on usage, so I chose not to use this method.

I have concluded that an ideal model for third-party Flash cookie detection would parallel the Ghostery method of finding third-party HTTP cookies and web bugs: scanning the HTML of a page for script tags and comparing the commands contained within them to strings naming known third-party trackers. While the eight companies identified by Soltani et al. constitute the beginnings of a database of known third-party Flash cookie trackers, I intend to use the Ghostery model of community-based input and review to compile a larger database for inclusion in later development. Once this database is a strong resource, FoxTracks can display companies that use third-party Flash cookies and how they have personally tracked the user over time.

In order to gather community input, FoxTracks must first be adopted by a user base. In this version of FoxTracks, I have opted to include a “view and delete” interface into a machine’s LSO folders (see Figure 4.1). My interface accesses and lists all Flash cookies on a user’s machine in table format. Information displayed about each Flash cookie includes origin, i.e., the domain with which the object is affiliated; name, e.g., “settings.sol”; size; and the date and time a specific cookie was last accessed by a website the user visited. The interface also includes information about the location of the LSO folder on the user’s machine and buttons to delete the listed Flash cookies individually or all together. The origin and name information is generally sufficient to understand the owner and purpose of a particular Flash cookie. When a user aims to delete tracking Flash cookies while keeping the preferences that other Flash cookies store for various websites, the origin and name can be used to decide whether a particular cookie should be deleted. The size and latest access time might also provide insight into the quantity and frequency of information collection by websites the user has never visited explicitly. The view generated in FoxTracks resembles a simplified version of the functionality in the most popular Firefox Flash cookie add-on, BetterPrivacy.

Figure 4.1: Screenshot of the FoxTracks Flash Objects panel.
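The folder examination and table view described in this section can be sketched in Python. This is an illustration only: FoxTracks itself is written in XUL and JavaScript, and the names `FlashCookie`, `list_flash_cookies`, and `changed_since` are hypothetical. The sketch assumes that `lso_root` is the directory whose immediate subfolders are named after origin domains.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FlashCookie:
    origin: str      # first folder under the LSO root, assumed to be a domain
    name: str        # file name, e.g. "settings.sol"
    size: int        # file size in bytes
    accessed: float  # last-access time, in seconds since the epoch
    path: str        # full path, used as the key when comparing snapshots


def list_flash_cookies(lso_root):
    """Walk lso_root and describe every .sol file found beneath it."""
    cookies = []
    for folder, _subdirs, files in os.walk(lso_root):
        for filename in files:
            if not filename.lower().endswith(".sol"):
                continue
            path = os.path.join(folder, filename)
            parts = os.path.relpath(path, lso_root).split(os.sep)
            origin = parts[0] if len(parts) > 1 else "(root)"
            info = os.stat(path)
            cookies.append(
                FlashCookie(origin, filename, info.st_size, info.st_atime, path))
    return sorted(cookies, key=lambda c: (c.origin, c.name))


def changed_since(previous, current):
    """Paths that are new, or whose last-access time moved forward."""
    old = {c.path: c.accessed for c in previous}
    return [c.path for c in current
            if c.path not in old or c.accessed > old[c.path]]
```

Comparing two snapshots with `changed_since` corresponds to the per-page-load detection method cited above. Because every snapshot walks the whole folder, the cost grows with the number of stored objects, which is consistent with the slow-down that led me to leave this method out.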

A version of BetterPrivacy’s automatic deletion feature is included in the advanced options pane of FoxTracks, which is shown in Figure 4.2. These options have been separated from the main Flash objects tab in order to simplify the tool and thereby further its educational and informational goals. The options for automatic Flash cookie deletion are more restricted than those offered by BetterPrivacy, which functions as an “install and forget” add-on for users with an intermediate understanding of Flash cookies. FoxTracks allows the user to select between complete deletion at the end of every session, timer-based deletion of infrequently accessed Flash cookies, and adding an option to clear Flash cookies to the built-in Firefox “Clear Recent History” dialog box. Other advanced options include clearing the Adobe Flash Player settings LSO, which contains playback preferences in addition to a history of all visited websites that use Flash, and clearing empty folders left over from deleted .sol files. By leaving out some BetterPrivacy functionality, such as a “white-list” of perpetually allowable Flash cookies, and by limiting the options for automatic Flash cookie deletion, FoxTracks aims to focus the user’s attention on the origins and purposes of the LSOs that have been set on her machine.

Figure 4.2: Screenshot of the FoxTracks options dialog box.
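Two of the options above, timer-based deletion of infrequently accessed Flash cookies and the clearing of empty folders, can be sketched as follows. This is a Python illustration under stated assumptions, not the add-on's JavaScript: the function names and the `max_idle_seconds` parameter are hypothetical, and "infrequently accessed" is interpreted here as "not accessed within a fixed idle period."

```python
import os
import time


def delete_stale_flash_cookies(lso_root, max_idle_seconds, now=None):
    """Delete .sol files not accessed within max_idle_seconds.

    Returns the sorted list of deleted paths. `now` can be supplied
    explicitly, which makes the policy easy to test.
    """
    now = time.time() if now is None else now
    deleted = []
    for folder, _subdirs, files in os.walk(lso_root):
        for filename in files:
            if not filename.lower().endswith(".sol"):
                continue
            path = os.path.join(folder, filename)
            if now - os.stat(path).st_atime > max_idle_seconds:
                os.remove(path)
                deleted.append(path)
    return sorted(deleted)


def remove_empty_folders(lso_root):
    """Remove folders left empty by deletion, deepest first.

    The root folder itself is always kept.
    """
    removed = []
    for folder, _subdirs, _files in os.walk(lso_root, topdown=False):
        if folder != lso_root and not os.listdir(folder):
            os.rmdir(folder)
            removed.append(folder)
    return removed
```

Walking bottom-up in `remove_empty_folders` means a parent folder is checked only after its children, so a chain of nested folders left behind by one deleted .sol file is cleared in a single pass.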

Like the HTTP cookies and web bugs tab, the Flash objects tab also includes a sidebar with informational links to relevant web content. These include descriptions of what Flash cookies are, which known organizations are using them to track user behavior on their own websites or across other websites, the types of information that can be gleaned about a user through the use of Flash cookies, and how to opt out of being tracked by Flash cookies, either by deleting them or managing them centrally through the Adobe website. Instead of deleting Flash cookies after they have been set, users may use a formal blocking and storage limitation scheme to stop tracking. Adobe’s website provides a Global Settings panel that allows users to block all third-party Flash cookies and/or set storage size capacities for all Flash cookies. Research done by the Electronic Frontier Foundation on surveillance technologies suggests the former option may seriously impair some websites’ functionality and recommends the latter approach, setting all storage capacities to zero. However, this may result in the loss of stored preferences for first-party websites in exchange for removing all possibility of tracking. The FoxTracks web reference for using Adobe’s central LSO manager describes the options available to users in full.

Chapter 5

DOM Storage

5.1 Web Storage in the W3C Standard

The third and final storage-based tracking technology I examined in the course of my research was DOM Storage. DOM Storage is proposed as an improved state management mechanism in working drafts of the HTML5 standard that is set to be adopted by the World Wide Web Consortium (W3C), a standards organization like the IETF, in late 2010. As of December 2009, DOM Storage specifications have been spun off into a distinct working document entitled “Web Storage” for independent review and adoption. Though DOM Storage has only recently been considered as a formal Internet standard, popular web browsers have included DOM Storage capabilities since 2006. Notably, DOM Storage space was first included in Firefox 2.0 and has been supported through the current version, Firefox 3.6 [1]. Although W3C adoption is still pending, it is also included in the latest versions of other major browsers, including Internet Explorer and Chrome, all of which were released between 2008 and 2009. DOM refers to the legacy term “Document Object Model,” and serves little purpose in describing this browser storage space.

Like HTTP cookies, DOM Storage is a mechanism for maintaining a user’s state with a particular website. It is designed as a large storage bin that exists locally on a client’s machine. According to the W3C working draft on DOM Storage, the mechanism offers two benefits over regular cookies. First, it prevents race conditions that can occur during simultaneous browsing sessions. For instance, when two browser windows navigate to the same site, cookie data that is transmitted in each session may get overwritten or aggregated in a way that results in unexpected behavior. The W3C specification solves this problem by providing a single session storage space for each browsing session. This space will only ever be accessed by one window and thus prevents state confusion from multiple connections to the same domain. Additionally, all session-only data will be discarded on window close or browser exit, so no conflicts will manifest under this model. The second advantage of DOM Storage is that it is much larger than regular cookies. Megabytes of persistent client-side storage allow websites to improve performance by keeping a large local cache.

While it offers some advantages over HTTP cookies, DOM Storage presents the same third-party tracking risks as regular cookies. Additionally, the collection of highly specific user data kept in DOM Storage increases the seriousness of any privacy intrusions by third parties. The W3C is conscious of the user privacy concerns posed by DOM Storage adoption:

A third-party advertiser (or any entity capable of getting content distributed to multiple sites) could use a unique identifier stored in its local storage area to track a user across multiple sessions, building a profile of the user’s interests to allow for highly targeted advertising. In conjunction with a site that is aware of the user’s real identity (for example an e-commerce site that requires authenticated credentials), this could allow oppressive groups to target individuals with greater accuracy than in a world with purely anonymous Web usage.

Like RFC 2965, the DOM Storage standard promotes the user agent’s role in protecting privacy. User agents are given the following suggestions: blocking third-party storage, expiring stored data, treating persistent storage like regular cookies, tracking the origins of stored data, and creating a blacklist or whitelist of websites accordingly.

These suggested approaches to user privacy are unsatisfying for several reasons. First, engaging in any of these defenses requires substantial knowledge of session-only and persistent data stores. A user would need an intermediate understanding of state management mechanisms, both at a high level and on a per-website basis, in order to determine whether DOM Storage was being used benignly or maliciously. Many users browse the web unaware of DOM Storage and other state mechanisms with tracking potential. It follows that users lack the knowledge to manage them effectively. Second, even presuming user understanding of DOM Storage, the standard does not propose an API or technical implementation of the suggested defenses. Rather, a user would need the technical expertise to implement a DOM Storage settings controller in order to realize many of these defenses. Finally, the document moves to excuse concerns about user privacy by referencing the futile nature of privacy protection. It suggests that a first-party domain may track user activity and later sell it to a third party, or that session-identifying data passed through URLs may be analyzed for user data regardless of any privacy protections that are in place. Thus DOM Storage poses unaddressed risks to user privacy.

5.2 Case Study: Gmail Mobile Privacy

Mobile versions of the major web browsers also support the HTML5 standard for local database storage. Persistent offline client-side storage is especially advantageous for mobile websites (here, mobile versions of regular websites), which frequently face limited bandwidth and inconsistent network connectivity. This is because keeping large amounts of data on the client device requires fewer requests for bandwidth-intensive data over a sporadic network connection. As a result, many mobile websites have been implemented using the HTML5 standard and local database storage. The Gmail website for the Apple iPhone is one such mobile website, and it provides an interesting case study in DOM/local storage risks to user privacy.

To see how the Gmail mobile website makes use of local database storage, I needed to examine the underlying program folders of the iPhone web browser. However, the Safari for iPhone folder contents cannot be examined on the iPhone itself because the device’s system folders are locked to users. Thus, I chose to mimic Safari for iPhone using Safari for Mac on the standard Mac OS. This required a simple change in Safari’s developer menu to use the iPhone user agent. Logging in to gmail.com in this Safari for iPhone mode had the following result: a folder titled “Databases” was silently created within the Safari program folder on the Mac OS. Within this folder, a management database called “Databases.db” was created along with a second folder containing storage databases for the domain “mail.google.com.” In this simulation, Gmail accessed and wrote to the simulated mobile device, yet the user was never prompted for permission or notified of this activity. Along with the privacy concerns described below, this local storage creation underscores the failure to achieve informed consent for tracking under the current privacy paradigm.

The database created within the mail.google.com folder corresponded to my Google profile, and was populated entirely in plain text. Without any kind of encryption or access security, the database could be opened with a regular SQLite browsing tool. After logging out of gmail.com and locally opening the database associated with my profile, I was able to read highly detailed information about the contents of my Gmail account. In particular, the cached messages and cached conversation headers tables exposed an alarming amount of personal information (see Figure 5.1). Together, these tables provide information about frequent contacts, contacts’ addresses, subject lines, and message content snippets. These data may be mined for further information such as site login names and passwords. As an example, I was able to retrieve a site password from a cached message snippet associated with a password reset email in my inbox.

Figure 5.1: The cached conversation headers table in my profile database.
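To show how little effort reading such an unencrypted database takes, the following Python sketch lists every table in a SQLite file together with its first few rows, using only the standard `sqlite3` module. The function name `dump_database` and the `row_limit` parameter are hypothetical, and the sketch makes no assumption about the actual schema of the Gmail database beyond its being an ordinary SQLite file.

```python
import sqlite3


def dump_database(path, row_limit=5):
    """Return {table: (column_names, first_rows)} for a SQLite file."""
    connection = sqlite3.connect(path)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        tables = [row[0] for row in connection.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' ORDER BY name")]
        contents = {}
        for table in tables:
            # Table names cannot be bound as parameters, so quote them.
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = connection.execute(
                f"SELECT * FROM {quoted} LIMIT ?", (row_limit,))
            columns = [description[0] for description in cursor.description]
            contents[table] = (columns, cursor.fetchall())
        return contents
    finally:
        connection.close()
```

No password, key, or browser session is required at any point: any program with read access to the file can retrieve the same rows.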

This storage mechanism presents a host of privacy concerns. Though the database files are not visible to other end-users through the iPhone interface, other mobile websites and third-party advertisers may use and exploit the same local storage area. The W3C working draft on web storage suggests that a user should restrict access to local storage databases to only scripts originating from the top-level website to which they navigate. However, where users lack knowledge about DOM Storage, this defense is difficult to implement. Domains may take steps to protect their local storage databases by using encryption or other techniques. However, Gmail’s mobile website suggests that at least some websites storing highly personal data do not obscure that data from third parties. Moreover, the W3C document suggests that third-party hosts may use fake domain names in order to gain access to the local storage databases set under the genuine domain name. Without any kind of host authentication, this could lead to information leakage or information spoofing activity, both of which can compromise the confidentiality of user data. In this example, information leakage might occur if an advertiser read and saved any of the mail.google.com database information available in the Databases folder. Information spoofing refers to the writing of data in another domain’s local storage. Here, a third party might set a user’s Gmail mobile session identifier to a known value and use this to track the user’s interaction with Gmail.

Though this example illustrated the use of DOM Storage by mobile websites, the same features of HTML5 are available for use by regular web browsers. Non-mobile websites may choose to make use of local storage in a similar manner to Gmail’s mobile website as DOM Storage is adopted as an Internet standard. Should websites and users fail to protect access to locally stored databases, third parties may be able to use DOM Storage to connect browsing history with many kinds of personally identifiable information.

5.3 Community Approach to the Study of DOM Storage

FoxTracks aims to be informational with regard to user privacy threats posed by each third-party tracking technology. For HTTP cookies and LSOs, I designed interfaces that are informative and that display databases and trackers in a way that minimizes confusion. DOM Storage tracking potential is substantially more difficult to convey using a Firefox extension. Despite the inclusion of DOM Storage in Firefox 2.0, no add-ons have been developed to explore its session-only or persistent storage. BetterPrivacy features a boolean option for clearing DOM contents on browser exit but does not provide a comprehensive view of the contents or explain how DOM Storage is used by websites.

Though no add-ons have been developed exclusively for viewing DOM Storage contents, Mozilla’s developer pages highlight that all persistent data resides in “webappsstore.sqlite,” a single database inside the Firefox user’s profile folder. The FoxTracks interface loads this database into a table view. However, the database entries are frequently obscure, and only in specific instances will the originating website and other information be intelligible. In particular, each entry in the database consists of the following fields: scope, key, value, secure, and owner. Secure is simply a boolean value related to accessibility of the database entry. Scope and owner refer to the originating website, which may be masked or non-obvious. The key is scope-specific and its significance is not always immediately clear to the user. The value field is the main storage space of the database entry and may contain user data in human-readable form. It may also store scripts that can be accessed and run from the originating websites. Risks to user privacy can only be demonstrated when entries’ originating websites and value stores are understandable. Thus, presenting an entire database view of webappsstore.sqlite is not the most effective demonstration of risks to user privacy posed by DOM Storage. It has the potential to confuse users who may recognize only some originating websites and certain pieces of data contained in entries. Moreover, the database view says nothing about the information leakage and information spoofing potential of DOM Storage contents (see Figure 5.2).

If database entries could be linked to known third-party companies and augmented with this information, the FoxTracks DOM Storage tab might be more effective. As with third-party LSO discovery, this improvement requires a reliable, substantial resource for associating third parties with the names of their DOM entries and the scripts they use to set DOM entries. To further this end, I intend to work with the technologists at the Center for Democracy and Technology (CDT) to begin a DOM contents exploratory project. Following the Ghostery model for third-party cookie discovery, we intend to uncover third-party DOM Storage usage by applying a community-based approach. Interested users will be able to anonymously submit their DOM content for review. This DOM content can be analyzed for the obscure-origin database entries that occur most frequently, and the first-party website DOM entries with which they tend to appear. Additionally, users will be able to submit comments about perceived uses of key and value fields for particular websites’ entries. A critical mass of comments can then be peer-reviewed, and facts about popular websites’ uses of DOM Storage can be posted in a central location. A link to this location will eventually be accessible through the DOM Storage tab in FoxTracks. With support from the privacy experts and technologists at CDT, this community-based solution to acquiring, analyzing, and spreading information about DOM Storage and its role in third-party tracking will lead to more effective interface design in future versions of the FoxTracks tool.

Figure 5.2: Screenshot of the FoxTracks DOM Storage panel.
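One way to make the scope field described in this section more intelligible, and to begin the frequency analysis proposed for the community project, is sketched below in Python. Two points are assumptions rather than facts drawn from this thesis: that entries live in a table named `webappsstore2`, and that a scope is stored as the reversed host name followed by the scheme and port (for example `moc.elpmaxe.www.:http:80`). The function names are hypothetical.

```python
import sqlite3
from collections import Counter


def decode_scope(scope):
    """Turn 'moc.elpmaxe.www.:http:80' into 'http://www.example.com:80'.

    Assumes the reversed-host layout described in the lead-in.
    """
    reversed_host, _, rest = scope.partition(":")
    host = reversed_host[::-1].lstrip(".")
    scheme, _, port = rest.partition(":")
    if not scheme:
        return host
    return f"{scheme}://{host}" + (f":{port}" if port else "")


def readable_entries(connection, table="webappsstore2"):
    """Return (origin, key, value, secure) tuples with decoded origins."""
    rows = connection.execute(
        f'SELECT scope, key, value, secure FROM "{table}" ORDER BY rowid')
    return [(decode_scope(scope), key, value, bool(secure))
            for scope, key, value, secure in rows]


def entries_per_origin(entries):
    """Count stored entries per decoded origin, most frequent first."""
    return Counter(origin for origin, _key, _value, _secure in entries)
```

Sorting origins by entry count, for example with `entries_per_origin(entries).most_common()`, is a first approximation of the "most frequently occurring obscure-origin entries" analysis. Linking those origins to named companies would still depend on the community-maintained resource described above.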

Chapter 6

Results

6.1 The FoxTracks Implementation

FoxTracks demonstrates how third-party HTTP cookies, Flash cookies, and DOM Storage contents can adversely affect the privacy of an end-user. FoxTracks relies on a Ghostery database of known trackers to identify third-party HTTP cookies loaded through the HTML of a web page. As a Firefox add-on, FoxTracks has access to the first-party domain a user is visiting when a third-party script attempts to get or set data on the user’s machine. Every time third-party HTTP cookie activity is recognized on a page, FoxTracks keeps a record of the third party and the website on which it appeared. When a user opens the tool, a XUL-generated interface populates a table with all of these records, and the user is given insight into the profiles different trackers have assembled from her browsing activity. These “snapshots” of partial browsing history mirror the browsing profiles kept on the servers of third parties. They are completely independent of the user-controlled browser history and, as such, demonstrate the end-user’s loss of control over privacy. In this way, FoxTracks achieves its aim to inform users about the privacy risks of third-party HTTP cookies.

The identities of third parties that use HTTP cookies, the information collected by cookies, and even the scripts used to set cookies are well-documented in the public domain. On the other hand, the third-party risks of Flash cookies and DOM Storage exist largely as hypothetical information leakages that are periodically supported by specific instances of privacy invasion. It was difficult to show how third parties use Flash cookies and DOM Storage in FoxTracks without resources like the Ghostery advertiser database for these technologies. For this reason, I chose to implement more general Flash cookie and DOM Storage interfaces and place greater emphasis on the accessibility of these interfaces.

FoxTracks provides a file view of all Flash cookies on a user’s machine. In most cases a Flash cookie’s origin and purpose are discernible from metadata fields. While these fields do not distinguish first-party Flash cookies from third-party Flash cookies, users are likely to recognize origin domains they have never explicitly visited. In this way, even a window into all Flash cookie files can expose third-party tracking with Flash objects in a manner that is personally relevant to the user. FoxTracks also provides single-object and all-object deletion options with the aim of encouraging users to browse the informational web links embedded in the interface prior to use. The user-friendliness of the Flash objects interface is also increased by the extraction of advanced deletion methods to a separate options menu.

The DOM Storage tab of FoxTracks also places an emphasis on user accessibility. However, because DOM Storage takes the form of a SQLite database in the Firefox web browser, a display of its contents is only partially telling for users. When origin fields are readable, users may find database entries that have been set by third parties. When storage contents and origin names are only readable by remote servers, FoxTracks is not as effective in informing users about third-party tracking with DOM Storage or the associated privacy risks. Nonetheless, an overview of DOM Storage is a valuable addition to the software. By including all three tracking technologies, FoxTracks achieves an all-in-one overview of tracking practices on the web. For interested users, an all-in-one resource provides a holistic, straightforward introduction to online privacy.

While FoxTracks succeeds in being educational and informative, it stands to benefit from a number of code-based improvements. The FoxTracks interface was implemented in XUL, and program functionality was added through JavaScript functions included in the standard Mozilla Firefox development API. Many of the functions called in the software have single-threaded and multi-threaded implementations. To avoid increases in program complexity, multi-threaded functions were not used in the initial development of FoxTracks. However, use of multi-threaded database functions would significantly improve the performance of both the HTTP cookie and DOM Storage tabs by allowing the SQLite queries that load database entries into tables to be executed in parallel. While this optimization is secondary to the goals of FoxTracks, slow program execution negatively affects the user’s browsing experience. If FoxTracks suffers from serious latency or prevents the user from browsing the web at a regular pace, users are unlikely to keep or use the add-on. It follows that future development of FoxTracks should consider performance improvements.

The most prominent limitation of FoxTracks is its inability to provide information solely on third-party Flash cookies and DOM Storage contents. As discussed in Section 5.3, FoxTracks would benefit enormously from community-based input and research. Information about third parties known to use these technologies and the scripts that set them would provide a basis for identifying additional third-party trackers on the web. Though I plan to work with CDT to address the lack of information about DOM Storage, such research might also be carried out in an academic setting or by other public interest organizations. Interested users who analyze their DOM contents for third-party activity and share their results can also further the development of informative software tools.
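The record keeping described at the start of this section, in which each recognized third party is stored together with the website on which it appeared, can be sketched in Python. This is a simplified illustration and not the FoxTracks source, which is JavaScript. The names `ScriptSourceCollector`, `find_trackers`, and `record_visit` are hypothetical, the tracker database is a stand-in dictionary rather than the Ghostery database, and only the `src` attribute of script tags is examined, not inline script contents.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlparse


class ScriptSourceCollector(HTMLParser):
    """Collect the src attribute of every <script> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_trackers(html, known_trackers):
    """Names of known trackers whose pattern occurs in a script URL."""
    collector = ScriptSourceCollector()
    collector.feed(html)
    return sorted({name
                   for name, pattern in known_trackers.items()
                   for src in collector.sources
                   if pattern in src})


def record_visit(profiles, first_party_url, html, known_trackers):
    """Add the visited site to the profile of every tracker on the page."""
    site = urlparse(first_party_url).hostname
    for tracker in find_trackers(html, known_trackers):
        profiles[tracker].add(site)


# Hypothetical usage: profiles maps each tracker to the sites it has seen.
profiles = defaultdict(set)
known = {"ExampleAds": "ads.example.net"}
page = '<script src="http://ads.example.net/a.js"></script>'
record_visit(profiles, "http://news.example.com/story", page, known)
```

Each entry in `profiles` then plays the role of one “snapshot”: the set of first-party sites on which a given third party has been observed.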

6.2 Third Parties as Privacy Adversaries

Third parties that set and use HTTP cookies and Flash cookies for tracking purposes are primarily players in the online advertising business. Many of these companies aggregate user browsing data to serve relevant advertisements to users. Within the privacy discourse, there is debate about the merits of this “behavioral advertising.” Some claim that behavioral advertisers provide useful content for online consumers. Others cite the privacy-eroding tracking methodologies of behavioral advertisers. I apply a computer security framework to the three tracking technologies discussed in this thesis to show how third-party advertisers might be considered “adversaries” to Internet users.

In the computer security literature, an adversary is an entity whose aim is to prevent users of a cryptosystem (any computer system that involves cryptographic techniques) from achieving a goal such as data confidentiality or integrity. Adversaries’ actions typically include attempts to uncover secret data, corrupt data, spoof communication messages and message sender identities, and force system failures [14]. The concept of an adversary is used to reason about cryptosystems as “games” between users and coordinated attackers. Web browsing can be considered a game between an Internet user and the websites she visits, where data passed to these websites is intended to be private. In this game, third parties that use HTTP cookies behave like passive adversaries in formal cryptosystems. Specifically, their HTTP cookies observe and record sessions between a user and a first-party website, and the third parties use this information to glean facts about the user. Third parties using Flash cookies and DOM Storage may also behave like passive adversaries but have immense potential to be active adversaries that spoof, corrupt, and divert communication between users and first-party websites. Flash cookies in particular have been found to respawn HTTP cookies, which constitutes a type of message spoofing since a user’s machine establishes an HTTP cookie communication channel with a third-party server where none should exist. Both third-party Flash cookies and DOM Storage contents have the ability to intercept and fabricate users’ communications with first-party websites, resulting in information leakage and information spoofing as described in the W3C web storage standard. This kind of action typifies active adversaries as they are described in the computer security paradigm.

In sum, the use of HTTP cookies, Flash cookies, and DOM Storage by third parties can be translated precisely into a computer security context. Within this context, tracking technologies represent means by which a third party attempts to break data confidentiality between a user and the first-party websites she visits. This allows me to characterize third parties as adversaries in the scheme of online privacy. Moreover, my research provides peripheral evidence of the actively invasive nature of third-party advertisers. Public comments such as the United Virtualities statement on the tracking potential of Flash LSOs demonstrate how advertisers as a whole have tried to circumvent user attempts to control privacy. Business successes of advertisers and data aggregators that use Flash cookies also demonstrate the economic incentives in place for third parties that battle user control of privacy. The characterization of third parties as adversaries to individual users reaffirms the need for greater user awareness of online tracking practices and a privacy baseline of informed consent.

Chapter 7

Conclusions

I implemented the FoxTracks software tool to increase awareness about third-party tracking and survey common tracking methodologies. FoxTracks was designed to be accessible and informational for average Internet users who have incomplete knowledge of third-party tracking according to survey data. FoxTracks examines the roles of HTTP cookies, LSOs, and DOM Storage in third-party tracking activities. For each technology, the tool provides information about the identities of third parties, how they use the technology to undertake tracking, what kinds of personal data can be exposed, and how users can opt out of tracking. Based on the existing literature and code for each tracking technology, FoxTracks provides these pieces of information through different interface implementations. The HTTP cookies and web bugs panel demonstrates how a third party tracks a user across multiple websites to compile a profile of the user’s browsing history. The Flash objects panel displays both first-party and third-party LSOs, and emphasizes learning about Flash cookies prior to using FoxTracks deletion options. The DOM Storage panel provides a basic database view into the user’s DOM Storage and strongly emphasizes interaction with the informational links included in the interface. Together, the three panels provide a novel, holistic survey of third-party tracking on the web.

The design choices in FoxTracks were informed by my analysis of the history, key players, and motivations of third-party trackers. A number of significant patterns emerged from this analysis. In both the HTTP cookie standard and the DOM Storage working document, standards organizations highlighted the serious privacy risks posed by third-party use of the technologies. Nonetheless, both documents place a strong emphasis on active user self-management of privacy. The current self-management privacy default, combined with a lack of user awareness about privacy risks, provides the basis for substantial tracking by third parties. Between the Ghostery database of known third parties and an examination of the companies using Flash cookies, the majority of third parties appear to be advertising companies and behavioral data aggregators. Because their business models rely on large amounts of accurate user profiling, these companies have economic incentives to circumvent user attempts to control privacy. Moreover, venture funding and industry recognition show they are encouraged to continue privacy-eroding practices such as HTTP cookie respawning.

Applying a computer security rubric to the sum of this analysis yields qualitative results about the nature of third parties. In particular, the tracking technologies and methods employed by third parties parallel the actions of adversaries in a computer security model. This suggests that third parties intentionally circumvent user efforts to control privacy. In this case, third parties and third-party tracking should be considered a serious privacy risk to users.

By applying my analysis to the FoxTracks tool, I was able to make the software conceptually true to its goals of accessibility, informational quality, and third-party tracking exposure. However, there is substantial room for revision of the software and a need for interface testing.
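The HTTP cookie respawning practice mentioned in this chapter can be illustrated with a minimal sketch. It models the browser's HTTP cookie store and a tracker's Flash LSO storage as two plain dictionaries; the function name `respawn_cookie`, the key `uid`, and the identifier value are hypothetical and do not describe any particular company's implementation.

```python
def respawn_cookie(http_cookies, flash_storage, key="uid"):
    """Synchronize an HTTP cookie with a backup copy in Flash storage.

    Both arguments are plain dictionaries standing in for the two
    storage mechanisms; this is an illustrative model only.
    """
    if key in http_cookies:
        # Cookie present: keep the Flash backup up to date.
        flash_storage[key] = http_cookies[key]
    elif key in flash_storage:
        # Cookie was deleted by the user: restore it from the backup.
        http_cookies[key] = flash_storage[key]
    return http_cookies


# First visit: the tracker sets a cookie and backs it up.
cookies = {"uid": "abc123"}
flash = {}
respawn_cookie(cookies, flash)

# The user clears HTTP cookies, but the Flash object survives.
cookies = {}
respawn_cookie(cookies, flash)
```

In this model, clearing HTTP cookies alone does not remove the identifier, because the next visit restores it from the Flash copy. This is why user attempts to control privacy through cookie deletion can be circumvented.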
Further research directions that would also enhance FoxTracks include an examination of DOM Storage contents or a focused study of Flash cookies. FoxTracks might separately inspire similar privacy-enhancing tools that explore different web technologies or convey information in creative ways. Alternatively, work might be undertaken on a holistic survey of security-enhancing technologies, especially as standards promote user self-management of online privacy. Further research in any of these directions would be supported by FoxTracks and the accompanying analysis, and would in turn further the tool’s goal of achieving a privacy standard of informed consent.

Acknowledgements

I would like to express my sincere thanks to Professor John Mitchell and Professor David Dill for their guidance on this research project. My deepest thanks also go to the team at CDT Labs who have advised me on technical matters and provided online support for the project.

Bibliography

[1] DOM Storage. ://developer-stage.mozilla.org/en/DOM/Storage, April 2010.

[2] VideoEgg. http://www.crunchbase.com/company/videoegg, March 2010.

[3] M. Ackerman, L. Cranor, and J. Reagle. Privacy in e-commerce: examining user scenarios and privacy preferences. Proceedings of the 1st ACM conference on Electronic commerce, pages 1–8, May 1999.

[4] M. Chambers. Macromedia Flash MX Security. Macromedia whitepaper describing the information accessible to LSOs, March 2002.

[5] R. Hof. Ad networks are transforming online advertising. BusinessWeek, March 2009.

[6] D. Kristol. HTTP cookies: Standards, privacy, and politics. ACM Transactions on Internet Technology, 1(2):151–198, 2001.

[7] D. Kristol and L. Montulli. RFC 2109: HTTP state management mechanism. Internet Engineering Task Force, Network Working Group, February 1997.

[8] W. Maes, T. Heyman, L. Desmet, and W. Joosen. Browser protection against cross-site request forgery. Proceedings of the first ACM workshop on Secure execution of untrusted code, pages 3–10, November 2009.

[9] Jonathan R. Mayer. “Any person... a pamphleteer”: Internet Anonymity in the Age of Web 2.0. Woodrow Wilson School undergraduate thesis containing relevant survey data about perceptions of third-party tracking on the web, April 2009.

[10] G. Nowak and J. Phelps. Understanding privacy concerns: An assessment of consumers’ information-related knowledge and beliefs. Journal of Direct Marketing, 6(4):28–39, August 2006.

[11] W. Peng and J. Cisna. HTTP cookies: a promising technology. Online Information Review, 24(1):150–153, April 2000.

[12] B. Pfitzmann and M. Waidner. Privacy in browser-based attribute exchange. Proceedings of the 1st ACM conference on Electronic commerce, 154(3):52–62, November 2002.

[13] A. Soltani, S. Canty, Q. Mayo, L. Thomas, and C. J. Hoofnagle. Flash Cookies and Privacy. UC Berkeley survey of flash cookie adoption on the web and related privacy concerns, August 2009.

[14] Douglas R. Stinson. Cryptography Theory and Practice, pages 355–363. Chapman & Hall/CRC, third edition, 2006.

[15] United Virtualities. United Virtualities develops ID backup to cookies: Browser-based persistent identification element will also restore erased cookie. March 2005.
