On the Ubiquity of Web Tracking: Insights from a Billion-Page Web Crawl

Total Page:16

File Type:pdf, Size:1020Kb

On the Ubiquity of Web Tracking: Insights from a Billion-Page Web Crawl Journal of Web Science, 2018, 4: 53–66 On the Ubiquity of Web Tracking: Insights from a Billion-Page Web Crawl Sebastian Schelter1 and Jérôme Kunegis2 1Database Systems and Information Management Group, Technische Universität Berlin, Germany [email protected] 2Namur Center for Complex Systems, University of Namur, Belgium [email protected] ABSTRACT We perform a large-scale analysis of third-party trackers on the World Wide Web. We extract third-party embed- dings from more than 3.5 billion web pages of the CommonCrawl 2012 corpus, and aggregate those to a dataset containing more than 140 million third-party embeddings in over 41 million domains. We study this data on several levels and provide the following contributions: (1) Our work leverages the largest empirical web tracking dataset collected so far, and exceeds related studies by more than an order of magnitude in the number of domains and web pages analyzed. As our dataset also contains the link structure of the web, we are able to derive a ranking measure for tracker occurrences based on aggregated network centrality rather than simple domain counts. We make our extracted data and computed rankings available to the research community. (2) On a global level, we give a precise figure for the extent of tracking, give insights into the structural properties of the ‘online tracking sphere’ and analyse which trackers (and subsequently, which companies) are used by how many websites, leveraging our ranking measure derived from the link structure of the web. (3) On a country-specific level, we analyse which trackers are used by websites in different countries, and identify the countries in which websites choose significantly different trackers than in the rest of the world. In particular, the three tracking domains with the highest PageRank are all owned by Google. The only exception to this pattern are a handful of countries such as China and Russia. Our results suggest that this dominance is strongly associated with country-specific political factors such as freedom of the press. (4) We investigate whether the content of websites influences the choice of trackers they use, leveraging more than ninety thousand categorized domains. In particular, we analyse whether highly privacy-critical websites about health and addiction make different choices of trackers than other websites. Our findings indicate that websites with highly privacy-critical content are less likely to contain trackers (60% vs 90% for other websites), even though the majority of them still do contain trackers. Keywords: Web, Tracking, CommonCrawl ISSN 2332-4031; DOI 10.1561/106.00000014 ©2018 S. Schelter and J. Kunegis 1 Introduction to embed links (using various technologies) to third-party con- tent, and thereby allowed the providers of such content to track The ability of a website to track the pages read by its visitors users on a wide scale. The inclusion of third-party content oc- has been present since the beginnings of the World Wide Web. curs for a variety of reasons, of which not all require tracking, In recent years however, another tracking mechanism has be- e.g., advertising, conversion tracking, acceleration of content come widespread: that of third-party websites embedded into loading, and provision of widgets. Social media and advertising the visited site by mechanisms such as JavaScript and images. companies primarily use the data collected via these tracking mechanisms to improve their capability to show personalized, In fact, the majority of websites contain third-party con- tailored advertisements to their users, which represent their tent, i.e., content from another domain that a visitor’s browser main source of income. loads and renders upon displaying the website. Such an em- bedding of third-party content has always been possible, but Regardless of their primary purpose, third-party compo- was relatively rare, since most embedded images were located nents can (and in many cases do) track web users across many on the same server as the page itself, and in any case, the em- sites and record their browsing behavior. Therefore, these bedding of content was not intended for tracking. With the third-parties constitute a privacy hazard in many aspects, such rise of social media and Web 2.0,websitesincreasinglybegan as the collection of data about health conditions (Electronic 54 Schelter and Kunegis Frontier Foundation, 2015; Libert, 2015b), news consumption as opposed to other websites, even though the majority of them (Trackography, 2014), or their instrumentalization for mass still do contain trackers (Section 7). We also show that certain surveillance by intelligence agencies (The Intercept, 2015). In types of trackers are in fact present with higher probability order to understand and control these hazards, it is desirable on websites with privacy-critical content, indicating a lack of to gain a deeper understanding of the ‘online tracking sphere’ awareness of the tracking ability of their underlying services. as a whole. Previous research has studied small samples of this (5) Finally, we provide our dataset to the scientific community3 sphere, e.g., 1,200 English-language domains from Alexa’s pop- and contribute it to the Koblenz Network Collection (Kunegis, ular sites (Krishnamurthy and Wills, 2009b), while researchers 2013)4. have only recently started to study larger datasets, e.g., track- ing on the Alexa top 1 million domains (Libert, 2015a). The main reason for this is that, until lately, such a study would 2 Online Tracking Fundamentals only have been possible for large companies possessing their own ‘copy’ of the web. Recent developments however allow In this section, we give a technical introduction to web tracking, us to study this online tracking sphere at a large scale: the and highlight resulting privacy implications. availability of enormous web crawls comprised of hundreds of terabytes of web data (Spiegler, 2013), such as CommonCrawl 2012.1 2.1 Technical Foundations We process all 3.5 billion webpages from this corpus (which In its basic form, online tracking involves users browsing the amounts to more than 200 terabytes of data), and extract the Web and the website that they intentionally visit. In addi- bipartite ‘tracking graph’, which describes the tracking of more tion to these two actors, one or more services may be present than 41 million pay-level domains by 355 tracking services. that record users’ browsing to the website; we call such ser- Apay-leveldomain(PLD)isasub-domainofapublictop- vices third-parties. We refer to third-parties as trackers if their level domain that users pay for individually. PLDs allow us main purpose or the business model of their owning company to identify a realm, where a single organization is likely to depends on collecting browsing data of users. Therefore, we 2 be in control , e.g., the PLD for www.example.com would be categorize advertising services and social network plugins as example.com.Subsequently,weusethisdatatostudythree trackers, because their main source of income comes from tar- key research questions: (i) Which are the predominant track- geted, personalized advertisements which heavily depend on ing companies on the Web, and how many websites do they user data. On the other hand, we do not consider content track? (ii) How does the distribution of web trackers vary from delivery networks as trackers, because their main business is country to country? How is that variation related to different to accelerate website loading (see Krishnamurthy and Wills, political, socio-cultural and economic factors? (iii) How are 2009b; Englehardt and Narayanan, 2016). trackers distributed with respect to highly privacy-critical top- Specifically, the user visits a website, which means that ics such as health and addiction? Do authors of such websites her browser issues an HTTP “GET” request to the web server avoid web trackers? hosting the website. The web server returns an HTTP response To the best of our knowledge, the data extracted in the pa- containing the HTML code of the website to render. This re- per constitutes the largest empirical dataset of web trackers col- turned HTML code typically contains references to external re- lected so far, containing an order of magnitude more domains sources, such as style sheets, JavaScript code and images that than comparable studies. In this article, we extend our findings are required to render the page in the client browser. Next, from previous work (Schelter and Kunegis, 2016) in several di- the user’s browser automatically issues requests for these ad- rections: (1) We conduct an analysis of tracking in the long ditional resources. In many cases, resources used by a web tail of domains on the web, looking into 41 million pay-level page reside on a different server than the one hosting the web- domains (Section 5), and identify global structural properties site. This possibility allows third-parties to track the user by of the online tracking sphere. (2) We analyze the tracking com- recording and inspecting the external requests to their servers. pany distribution in top-level domains (TLDs) of non-English In the case of online tracking, the website’s HTML code speaking countries, and find that a small set of US-based com- embeds external resources from a third-party which aims to panies have a dominating role in the majority of countries; a track the user. A typical example for such an external resource correlation analysis suggests that this role is strongly associated is a piece of JavaScript code, which the user’s browser will au- with political factors such as freedom of the press (Section 6). tomatically load and execute from the third-party server. This Furthermore, we investigate two additional research directions: external loading enables the third-party to record a wide va- (3) We uncover country-specific as well as category-specific pat- riety of information about the user: (i) The third-party sees terns in the relations between third-party trackers by clustering the current IP address of the user machine, which reveals the their co-occurrences (Section 5.5).
Recommended publications
  • The Web Never Forgets: Persistent Tracking Mechanisms in the Wild
    The Web Never Forgets: Persistent Tracking Mechanisms in the Wild Gunes Acar1, Christian Eubank2, Steven Englehardt2, Marc Juarez1 Arvind Narayanan2, Claudia Diaz1 1KU Leuven, ESAT/COSIC and iMinds, Leuven, Belgium {name.surname}@esat.kuleuven.be 2Princeton University {cge,ste,arvindn}@cs.princeton.edu ABSTRACT 1. INTRODUCTION We present the first large-scale studies of three advanced web tracking mechanisms — canvas fingerprinting, evercookies A 1999 New York Times article called cookies compre­ and use of “cookie syncing” in conjunction with evercookies. hensive privacy invaders and described them as “surveillance Canvas fingerprinting, a recently developed form of browser files that many marketers implant in the personal computers fingerprinting, has not previously been reported in the wild; of people.” Ten years later, the stealth and sophistication of our results show that over 5% of the top 100,000 websites tracking techniques had advanced to the point that Edward employ it. We then present the first automated study of Felten wrote “If You’re Going to Track Me, Please Use Cook­ evercookies and respawning and the discovery of a new ev­ ies” [18]. Indeed, online tracking has often been described ercookie vector, IndexedDB. Turning to cookie syncing, we as an “arms race” [47], and in this work we study the latest present novel techniques for detection and analysing ID flows advances in that race. and we quantify the amplification of privacy-intrusive track­ The tracking mechanisms we study are advanced in that ing practices due to cookie syncing. they are hard to control, hard to detect and resilient Our evaluation of the defensive techniques used by to blocking or removing.
    [Show full text]
  • The Internet and Web Tracking
    Grand Valley State University ScholarWorks@GVSU Technical Library School of Computing and Information Systems 2020 The Internet and Web Tracking Tim Zabawa Grand Valley State University Follow this and additional works at: https://scholarworks.gvsu.edu/cistechlib ScholarWorks Citation Zabawa, Tim, "The Internet and Web Tracking" (2020). Technical Library. 355. https://scholarworks.gvsu.edu/cistechlib/355 This Project is brought to you for free and open access by the School of Computing and Information Systems at ScholarWorks@GVSU. It has been accepted for inclusion in Technical Library by an authorized administrator of ScholarWorks@GVSU. For more information, please contact [email protected]. Tim Zabawa 12/17/20 Capstone Project Cover Page: 1. Introduction 2 2. How we are tracked online 2 2.1 Cookies 3 2.2 Browser Fingerprinting 4 2.3 Web Beacons 6 3. Defenses Against Web Tracking 6 3.1 Cookies 7 3.2 Browser Fingerprinting 8 3.3 Web Beacons 9 4. Technological Examples 10 5. Why consumer data is sought after 27 6. Conclusion 28 7. References 30 2 1. Introduction: Can you remember the last time you didn’t visit at least one website throughout your day? For most people, the common response might be “I cannot”. Surfing the web has become such a mainstay in our day to day lives that a lot of us have a hard time imagining a world without it. What seems like an endless trove of data is right at our fingertips. On the surface, using the Internet seems to be a one-sided exchange of information. We, as users, request data from companies and use it as we deem fit.
    [Show full text]
  • Improving Transparency Into Online Targeted Advertising
    AdReveal: Improving Transparency Into Online Targeted Advertising Bin Liu∗, Anmol Sheth‡, Udi Weinsberg‡, Jaideep Chandrashekar‡, Ramesh Govindan∗ ‡Technicolor ∗University of Southern California ABSTRACT tous and cover a large fraction of a user’s browsing behav- To address the pressing need to provide transparency into ior, enabling them to build comprehensive profiles of their the online targeted advertising ecosystem, we present AdRe- online interests. This widespread tracking of users and the veal, a practical measurement and analysis framework, that subsequent personalization of ads have received a great deal provides a first look at the prevalence of different ad target- of negative press; users associate adjectives such as creepy ing mechanisms. We design and implement a browser based and scary with the practice [18], primarily because they lack tool that provides detailed measurements of online display insight into how their data is being collected and used. ads, and develop analysis techniques to characterize the con- Our paper seeks to provide transparency into the targeted textual, behavioral and re-marketing based targeting mecha- advertising ecosystem, a capability that has not been ex- nisms used by advertisers. Our analysis is based on a large plored so far. We seek to enable end-users to reason about dataset consisting of measurements from 103K webpages why ads of a certain category are being displayed to them. and 139K display ads. Our results show that advertisers fre- Consider a user that repeatedly receives ads about cures for a quently target users based on their online interests; almost particularly private ailment. The user currently lacks a way half of the ad categories employ behavioral targeting.
    [Show full text]
  • Cookie Swap Party: Abusing First-Party Cookies for Web Tracking
    Cookie Swap Party: Abusing First-Party Cookies for Web Tracking Quan Chen Panagiotis Ilia [email protected] [email protected] North Carolina State University University of Illinois at Chicago Raleigh, USA Chicago, USA Michalis Polychronakis Alexandros Kapravelos [email protected] [email protected] Stony Brook University North Carolina State University Stony Brook, USA Raleigh, USA ABSTRACT 1 INTRODUCTION As a step towards protecting user privacy, most web browsers perform Most of the JavaScript (JS) [8] code on modern websites is provided some form of third-party HTTP cookie blocking or periodic deletion by external, third-party sources [18, 26, 31, 38]. Third-party JS li- by default, while users typically have the option to select even stricter braries execute in the context of the page that includes them and have blocking policies. As a result, web trackers have shifted their efforts access to the DOM interface of that page. In many scenarios it is to work around these restrictions and retain or even improve the extent preferable to allow third-party JS code to run in the context of the of their tracking capability. parent page. For example, in the case of analytics libraries, certain In this paper, we shed light into the increasingly used practice of re- user interaction metrics (e.g., mouse movements and clicks) cannot lying on first-party cookies that are set by third-party JavaScript code be obtained if JS code executes in a separate iframe. to implement user tracking and other potentially unwanted capabil- This cross-domain inclusion of third-party JS code poses security ities.
    [Show full text]
  • Analysis and Detection of Spying Browser Extensions
    I Spy with My Little Eye: Analysis and Detection of Spying Browser Extensions Anupama Aggarwal⇤, Bimal Viswanath†, Liang Zhang‡, Saravana Kumar§, Ayush Shah⇤ and Ponnurangam Kumaraguru⇤ ⇤IIIT - Delhi, India, email: [email protected], [email protected], [email protected] †UC Santa Barbara, email: [email protected] ‡Northeastern University, email: [email protected] §CEG, Guindy, India, email: [email protected] Abstract—In this work, we take a step towards understanding history) and sends the information to third-party domains, and defending against spying browser extensions. These are when its core functionality does not require such information extensions repurposed to capture online activities of a user communication (Section 3). Such information theft is a and communicate the collected sensitive information to a serious violation of user privacy. For example, a user’s third-party domain. We conduct an empirical study of such browsing history can contain URLs referencing private doc- extensions on the Chrome Web Store. First, we present an in- uments (e.g., Google docs) or the URL parameters can depth analysis of the spying behavior of these extensions. We reveal privacy sensitive activities performed by the user on observe that these extensions steal a variety of sensitive user a site (e.g., online e-commerce transactions, password based information, such as the complete browsing history (e.g., the authentication). Moreover, if private data is further sold or sequence of web traversals), online social network (OSN) access exchanged with cyber-criminals [4], it can be misused in tokens, IP address, and geolocation. Second, we investigate numerous ways.
    [Show full text]
  • Track the Planet: a Web-Scale Analysis of How Online Behavioral Advertising Violates Social Norms
    University of Pennsylvania ScholarlyCommons Publicly Accessible Penn Dissertations 2017 Track The Planet: A Web-Scale Analysis Of How Online Behavioral Advertising Violates Social Norms Timothy Patrick Libert University of Pennsylvania, [email protected] Follow this and additional works at: https://repository.upenn.edu/edissertations Part of the Communication Commons, Computer Sciences Commons, and the Public Policy Commons Recommended Citation Libert, Timothy Patrick, "Track The Planet: A Web-Scale Analysis Of How Online Behavioral Advertising Violates Social Norms" (2017). Publicly Accessible Penn Dissertations. 2714. https://repository.upenn.edu/edissertations/2714 This paper is posted at ScholarlyCommons. https://repository.upenn.edu/edissertations/2714 For more information, please contact [email protected]. Track The Planet: A Web-Scale Analysis Of How Online Behavioral Advertising Violates Social Norms Abstract Various forms of media have long been supported by advertising as part of a broader social agreement in which the public gains access to monetarily free or subsidized content in exchange for paying attention to advertising. In print- and broadcast-oriented media distribution systems, advertisers relied on broad audience demographics of various publications and programs in order to target their offers to the appropriate groups of people. The shift to distributing media on the World Wide Web has vastly altered the underlying dynamic by which advertisements are targeted. Rather than rely on imprecise demographics, the online behavioral advertising (OBA) industry has developed a system by which individuals’ web browsing histories are covertly surveilled in order that their product preferences may be deduced from their online behavior. Due to a failure of regulation, Internet users have virtually no means to control such surveillance, and it contravenes a host of well-established social norms.
    [Show full text]
  • Pre-Requisites
    WordPress is very popular content management system and widely used by many big companies for their business web sites. It offers more flexibility for creating personal and professional web sites. With this increasing popularity of WordPress demand of skilled WordPress professional also increaes. This 3 module WordPress course is designed to learn eaverthing you need to become a professional WordPress developer. We includes all the WordPress concepts that are used while development of real time web sites using WordPress. Every concept is explain with the help of live practical example _________________________________________________________ Pre-requisites • Basic Knowledge of PHP/MySQL • Basic knowledge of CSS/HTML • You can use any suitable editor for programming like dreamweaver,notepad++ etc. Who Can Join • Anyone looking for easy and afforable way to create their own web site. • Any WordPress user or developer to expand their knowledge in WordPress development. • Any web developer want to become a highly skilled and professional WordPress Developer. Module-1 [ WordPress Basics ] Section 1 [WordPress Introduction] Section 2 [Getting Started with WordPress • Getting into WordPress • Using wordpress.com to create free blog • Understanding the common terms • How to Install WordPress on localhost • Why to use WordPress • wp-admin panel • Features of WordPress latest version • wp-admin dashboard . • WP resources,codex,plugin and • What is Gravatar theme Section 3 [Blog Content using post] Section 4 [Pages, Menus,Media libraries] • Adding
    [Show full text]
  • Automattic Drives Customer-Centric Improvements on Wordpress.Com
    CASE STUDY Automattic drives customer-centric improvements on Wordpress.com Headquarters: San Francisco, CA Founded: 2005 Unites a geographically distributed team to focus on optimizations Industry: Consumer technology with greatest impact The challenge The challenge Making product improvements It is estimated that about 25 percent of all websites are built using Wordpress, a based on support requests popular website platform managed by parent company Automattic. As an open source didn’t yield substantially better project run by the Wordpress Foundation, the platform is built and maintained by experiences hundreds of community volunteers as well as several ‘Automatticians.’ The solution Due to its open source nature and because there isn’t tracking analytics on Outside perspective provided by Wordpress.org, the team does not have much insight into what’s working and what’s UserTesting panel yields valuable, not working on the site. And this was a serious issue for the organization, which shareable customer insights actively cultivates a culture that is focused on its customers. The outcome Alignment across a distributed The solution team for optimized product releases results in higher quality The team turned to UserTesting to identify potential pain points and areas of friction products for users. Collecting user feedback and sharing findings has enabled the team to make strategic product improvements before each release. Additionally, without a central office and with teams meeting in-person only a few times each year, the ability to easily share insights has been invaluable. Design Engineer Mel Choyce notes, “It’s particularly useful for us as a distributed team because we can’t all do everything in person.
    [Show full text]
  • Staying Without Power a Case Study of the Drupal Content Management System by Qi Zhang
    Staying without Power A Case Study of the Drupal Content Management System By Qi Zhang Bachelor of Science in Engineering, Jiangxi Science and Technology Normal University (1999) Interfaculty initiative in information studies program, University of Tokyo (2006) Master of international studies, University of Tokyo (2008) Submitted to the System Design and Management Program in Partial Fulfillment of the Requirements for the degree of Master of Science In EngIneerIng and Management ARCHNES At the A/SSACHUSETTS INS fW E OF TECHNOLOGY Massachusetts Institute of Technology AUG 20 2013 May 2012 y2012 RRARIES C 2012 Massachusetts Institute of Technology Signature of Author Qi Zhang Certified by -. Michael A. Cusumano SMR Distinguished Professor of Management T( esis Supervisor Certified by_ Patrick Hale Director System Design and Management Program THIS PAGE IS INTENTIONALLY LEFT BLANK Acknowledgements I run the knowledge that I learned from M.I.T the most wisdom school of the world. It works AMAZING! I would also like to thank Michael, Imran and Pat who staying with me when I am without power. Love in M.I.T; Love in USA XlApq, M31 THIS PAGE IS INTENTIONALLY LEFT BLANK Staying without Power By Qi Zhang Submitted to the System Design and Management Program in Partial Fulfillment of the Requirements for the degree of Master of Science In Engineering and Management ABSTRACT This main focus of this thesis is not to describe the inner workings of the Ecosystem or software; it is to help young entrepreneurs with limited resources to not just survive, but thrive in a competitive business environment. Thesis Supervisor: Michael A.
    [Show full text]
  • Nidan a Security Search Engine for the World Wide Web
    Nidan A Security Search Engine for The World Wide Web Master’s Thesis - 10th Semester Software Engineering ds103f19 Cassiopeia Department of Computer Science Aalborg University Copyright © Aalborg University 2014 Written in LATEXfrom a template made by Jesper Kjær Nielsen. Software Engineering Aalborg University https://www.cs.aau.dk/ Title: Abstract: Nidan: A Security Search Engine for The World Wide Web In this report, the development and usage of Nidan and KNAS are described. Nidan Theme: is a systematic webcrawler which collects all Security in distributed systems loaded JavaScript, cookies, and related meta- data and stores it in a well-strutured rela- Project Period: tional database. KNAS is a data-processing Spring Semester 2019 tool that detects vulnerabilities connected to each visted website. These include vulner- Project Group: abilities in the implemented JavaScript li- ds103f19 braries, CMSs, and server software. Nidan and KNAS has been tested on around 2 % Participant(s): of the entire .dk zone file. This test showed Jesper Windelborg Nielsen that KNAS detected vulnerable software on Mathias Jørgen Bjørnum Leding 40.47 % of the websites. 92.49 % of the vulnerable websites have vulnerabilities from Supervisor(s): last year or older, meaning that the vast René Rydhof Hansen majority of vulnerable sites rarely update Thomas Panum their software. From the data collected by Nidan, it is also possible to analyze the cook- Copies: 4 ies. Since Nidan makes no interaction with the websites other than visiting, all tracking Page Numbers: 37 cookies sat break the GDPR and EU’s cookie law. Date of Completion: June 2, 2019 The content of this report is freely available, but publication (with reference) may only be pursued due to agreement with the author.
    [Show full text]
  • Wordpress and Make Your 2Nd Edition Blog the Best It Can Be Open the Book and Find
    Spine: .816” Internet/Web Page Design ™ 2nd Edition Discover why bloggers love Making Everything Easier! WordPress and make your 2nd Edition blog the best it can be Open the book and find: Blogs are as much a part of life today as the evening • Advice for creating a blog that ® WordPress newspaper was fifty years ago, and for much the same draws readers reason: Inquiring minds want to know. WordPress powers • Tips on managing comments, some of the most popular blogs on the Web, and with trackbacks, and spam this guide to help, it can work for you, too. Here’s what WordPress does, how to set it up and use it, and some cool • How to use the Dashboard WordPress bells and whistles to make your blog stand out. • Wonderful widgets and plugins to add • Pick your flavor — decide whether to use the WordPress.com hosted service or self-host your blog with WordPress.org • How to make permalinks work with your Web server • Customization — discover CSS and template tags and how to use them to create your own unique style • The standard templates and how to tweak them • Blogging 101 — find out about archiving, interacting with readers through comments, tracking back, and handling spam • Ten popular WordPress themes ® • Host with the most — get the scoop on domain registration, Web • Where to find help when you hosting providers, basic tools like FTP, and more need it • Do it yourself — install WordPress.org, set up a MySQL® database, explore RSS feeds, and organize a blogroll • Beef up your blog — insert audio, video, images, and photos Go to • Think theme — discover where to find WordPress themes, dummies.com® Learn to: explore various options, and work with template tags to create a for more! • Use the latest upgrades to WordPress 2.7 unique look • Explore theme development and tweak free WordPress themes • Create a unique blog theme and presentation by using tags with CSS $24.99 US / $26.99 CN / £15.99 UK Lisa Sabin-Wilson is a designer of blogs and Web sites and founder ISBN 978-0-470-40296-2 of E.Webscapes Design Studio.
    [Show full text]
  • Online Tracking and Zombie Cookies Today
    ONLINE TRACKING AND ZOMBIE COOKIES TODAY 1. tracking mechanisms 2. prevalence: web 3. prevalence: mobile 4. consumer choice 5. existing incentives 6. what to do BROWSERS PRIVACY BY DESIGN 3rd Party beacon HTTP beacon http://pixel.quantserve.com/seg/p-6fTutip1SMLM2.js GET /seg/p-6fTutip1SMLM2.js HTTP/1.1 Host: pixel.quantserve.com User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 Accept: */* Accept-Language: en-us,en;q=0.5 Accept-Encoding: gzip,deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive: 115 Connection: keep-alive Referer: http://www.huffingtonpost.com/2011/01/24/google-chrome-firefox-do-not-track_n_813189.html Cookie: mc=4d441f48-a100a-1f66c-277f6; d=EM0BPAGBBoHxDBi6IIgwALhQnyAAiCD8sQmnDMQLEAC_UA HTTP observer identifier activity http://pixel.quantserve.com/seg/p-6fTutip1SMLM2.js GET /seg/p-6fTutip1SMLM2.js HTTP/1.1 Host: pixel.quantserve.com User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 Accept: */* Accept-Language: en-us,en;q=0.5 Accept-Encoding: gzip,deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive: 115 Connection: keep-alive Referer: http://www.huffingtonpost.com/2011/01/24/google-chrome-firefox-do-not-track_n_813189.html Cookie: mc=4d441f48-a100a-1f66c-277f6; d=EM0BPAGBBoHxDBi6IIgwALhQnyAAiCD8sQmnDMQLEAC_UA PROFILE observer identifier activity + + = PROFILES cobalt - - [ 3/Aug/2011:15:45:36 -0700] "GET http://www.google.com/search?sourceid=chrome&ie=UTF- 8&q=artisan+hotel+las+vegas
    [Show full text]