Cross-Device Tracking: Measurement and Disclosures
Total Page:16
File Type:pdf, Size:1020Kb
Proceedings on Privacy Enhancing Technologies ; 2017 (2):113–128 Justin Brookman, Phoebe Rouge, Aaron Alva, and Christina Yeung Cross-Device Tracking: Measurement and Disclosures Abstract: Internet advertising and analytics technology share common attributes — such as the same lo- companies are increasingly trying to find ways to link cal network and IP address — those services may behavior across the various devices consumers own. This be able to correlate user activity across devices. In cross-device tracking can provide a more complete view visiting 100 sites on two virtual devices, we con- into a consumer’s behavior and can be valuable for a nected to 861 different third party domains on both range of purposes, including ad targeting, research, and devices, including domains operated by dedicated conversion attribution. However, consumers may not be cross-device tracking companies. aware of how and how often their behavior is tracked – 96 out of 100 of the sites we tested allowed con- across different devices. We designed this study to try sumers to submit a username or email address that to assess what information about cross-device tracking could be shared to correlate users across devices. (including data flows and policy disclosures) is observ- – Six companies that collect login information as a able from the perspective of the end user. Our paper first party also collect extensive behavioral data as demonstrates how data that is routinely collected and a third party on other websites. shared online could be used by online third parties to – 16 out of 100 sites shared personally identifying in- track consumers across devices. formation with third parties, either in raw or hashed form. Keywords: privacy, cross-device, tracking, transparency – Dozens of third party services sync unique cookie ID DOI 10.1515/popets-2017-0017 values, which could facilitate the sharing of cross- Received 2016-08-31; revised 2016-09-30; accepted 2016-10-01. device graph information. These findings demonstrate that a broad range of com- 1 INTRODUCTION panies possess the capacity to correlate user behavior across different devices that the users own. However, al- Concerns about cross-device tracking often center on the though the data practices summarized above could be unknown: who is performing this tracking, when does it used for cross-device linkage, we could not definitively occur, and how is it performed? To help shine some light determine that the data was used by a company for that on this topic, we conducted a review of 100 popular web- purpose in any particular instance. From the perspective sites to determine which sites transmit data or otherwise of the consumer, it is challenging to know when cross- perform actions known to facilitate cross-device track- device tracking occurs, since companies can make de- ing. We observed the following data collection practices terminations of device correlation on their own servers, that could be used by tracking companies to make in- unobservable to end users (we did not detect compa- ferences about linkages among devices: nies using the same cookie IDs across devices to identify – We detected connections to the same third party linked devices). The data transmissions observed could services on different devices. When those devices be for purposes other than cross-device tracking, though a user may be unable to rule out the possibility based on the data alone. Because the purposes for which the data transmit- Justin Brookman: Federal Trade Commission, Office of ted was often ambiguous, we also looked at the pri- Technology Research and Investigation, E-mail: jbrook- vacy disclosures of these 100 websites in order to see [email protected] Phoebe Rouge: Federal Trade Commission, Office of Tech- if they made information available about cross-device nology Research and Investigation, E-mail: [email protected] tracking. Most of the policies we reviewed reserve broad Aaron Alva: Federal Trade Commission, Office of Technology rights to allow third parties to collect and use pseudony- Research and Investigation, E-mail: [email protected] mous browser data such as IP address and unique cookie Christina Yeung: Federal Trade Commission, Office of Tech- identifiers. However, the website privacy policies we re- nology Research and Investigation, E-mail: [email protected] viewed contained little explicit discussion of cross-device Cross-Device Tracking 114 tracking specifically, or whether consumers had the abil- ing, but also for research, security, and purchase attri- ity to turn off cross-device linkages. bution purposes (e.g., if an ad on one device resulted in a sale on another). The commercial benefits from cross-device tracking could potentially be passed on to 2 BACKGROUND consumers in the form of lower prices, though we have not analyzed that issue here. This paper does not seek to weigh the potential benefits of cross-device track- Since the late 1990s, the Federal Trade Commission ing with any privacy risks or interests; instead, it ex- (FTC)1 has sought to bring greater transparency and plores what information concerning cross-device track- user control to the issue of online behavioral data col- ing is discoverable by consumers — through disclosures lection as part of its work to protect and promote con- made to consumers in privacy policies, and through ob- sumer privacy.2 In 2007, Commission staff held a two- servable behavior of websites in browsers. day workshop3 focused specifically on behavioral target- Staff of the Office of Technology Research and In- ing. In 2009, staff published “Self-Regulatory Principles vestigation conducted this research in conjunction with for Online Advertising” [2] to encourage the adoption of the FTC staff’s November 16, 2015 workshop on Cross- more transparent practices. Behavioral advertising was Device Tracking. At that event, stakeholders from in- also a significant focus of the 2012 Report “Protecting dustry, academia, and civil society gathered to explore Consumer Privacy in an Era of Rapid Change: Recom- the unique privacy issues associated with the practice.4 mendations for Businesses and Policymakers” [3]. The This paper will initially describe some of the current Commission has also brought numerous enforcement ac- models that companies use to compile “device graphs” tions to stop unfair and deceptive practice related to for consumers — that is, lists of device identifiers that online behavioral advertising [4–6]. the company imputes to a particular individual. We Despite the progress to date addressing these issues, then describe the methodology for the study we ran on consumers continue to have concerns about online pri- 100 popular websites to observe the collection of data vacy and tracking. According to a 2016 TRUSTe sur- that could be used for cross-device tracking purposes. vey, [7] 92% of consumers worry about their privacy on- We then present the full findings of the study which line. With regard to online behavioral data collection, were briefly summarized in the abstract: we describe the a recent Pew survey [8] shows that over three quarters numerous cases in which data is transmitted that could of internet users are not confident that online advertis- be used to facilitate cross-device tracking. Next, we de- ers will maintain the privacy and security of their web scribe the results of our review of the privacy policies of browsing data. And, while most expressed an interest those 100 websites. Finally, we discuss the limitations in controlling the collection of their online data, few felt of our study, and conclude with suggestions for possible empowered to do so [8]. future research. Originally, online behavioral data collection was While several previous studies [9–12] have consid- limited to connecting users across multiples websites ered data collection online, including third-party data on one device. Today, advertising technology companies collection, we are unaware of any prior studies that are finding ways to track users across devices as well. considered such collection in the context of cross-device Consumers interact with more devices — and smarter tracking across such a large number of sites. Other work devices — than ever before, including computers, smart- [13–16] has focused on third-party data collection across phones, smart TVs and Blu-Ray players, gaming plat- mobile applicatons (as opposed to web browsers); while forms, and Internet of Things devices. Connecting this we do not measure leakage of information through apps, data can be valuable to companies not just for ad target- many of the same privacy interests are implicated. A number of papers [17, 18] have also looked specifically at the efficacy and completeness of disclosures in privacy 1 This research has been prepared by staff members of the Office policies though, again, without regard to cross-device of Technology Research and Investigation of the Federal Trade tracking. Finally, some researchers have looked at non- Commission. The views expressed do not necessarily reflect the cookies tracking mechanisms such as Flash Cookies [19] views of the Commission or any individual Commissioner. 2 See https://www.ftc.gov/news-events/events-calendar/1999/ 11/online-profiling-public-workshop and [1] 3 https://www.ftc.gov/news-events/events-calendar/2007/11/ 4 https://www.ftc.gov/news-events/events-calendar/2015/11/ ehavioral-advertising-tracking-targeting-technology cross-device-tracking Cross-Device Tracking 115 or digital fingerprinting [21]. Our paper focuses primar- Fig. 1. Probabilistic Matching ily on how often sites connect to third-party services; we do not analyze in detail what methods companies might use to keep state on user activity. We do look at how cookie syncing could be used to share device graphs among third-party services building off of previous re- search [20] into methods of detecting such syncing. 3 DESCRIPTION OF CROSS-DEVICE TRACKING MODELS Below we summarize some of the primary models that are in use today to link behavior across multiple devices.