On the Privacy Implications of Real Time Bidding

A Dissertation Presented by Muhammad Ahmad Bashir to The Khoury College of Computer Sciences in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science

Northeastern University
Boston, Massachusetts
August 2019

To my parents, Javed Hanif and Najia Javed, for their unconditional support, love, and prayers.

Contents

List of Figures
List of Tables
Acknowledgments
Abstract of the Dissertation

1 Introduction
  1.1 Problem Statement
    1.1.1 Information Sharing Through Cookie Matching
    1.1.2 Information Sharing Through Ad Exchanges During RTB Auctions
  1.2 Contributions
    1.2.1 A Generic Methodology For Detecting Information Sharing Among A&A companies
    1.2.2 Transparency & Compliance: An Analysis of the ads.txt Standard
    1.2.3 Modeling User’s Digital Privacy Footprint
  1.3 Roadmap

2 Background and Definitions
  2.1 Online Display Advertising
  2.2 Targeted Advertising
    2.2.1 Online Tracking
    2.2.2 Retargeted Ads
  2.3 Real Time Bidding
    2.3.1 Overview
    2.3.2 Cookie Matching
    2.3.3 Advertisement Served via RTB

3 Related Work
  3.1 The Online Advertising Ecosystem
  3.2 Online Tracking
    3.2.1 Tracking Mechanisms
    3.2.2 Users’ Perceptions of Tracking
    3.2.3 Blocking & Anti-Blocking
    3.2.4 Transparency
  3.3 Real Time Bidding and Cookie Matching

4 Tracing Information Flows Between A&A companies Using Retargeted Ads
  4.1 Methodology
    4.1.1 Insights and Approach
    4.1.2 Instrumenting Chromium
    4.1.3 Creating Shopper Personas
    4.1.4 Collecting Ads
  4.2 Image Labeling
    4.2.1 Basic Filtering
    4.2.2 Identifying Targeted & Retargeted Ads
    4.2.3 Limitations
  4.3 Analysis
    4.3.1 Information Flow Categorization
    4.3.2 Cookie Matching
    4.3.3 The Retargeting Ecosystem
  4.4 Summary

5 A Longitudinal Analysis of the ads.txt Standard
  5.1 Background
    5.1.1 Ad Fraud and Spoofing
    5.1.2 A Brief Intro to ads.txt
    5.1.3 ads.txt File Format
  5.2 Related Work
    5.2.1 Ad Fraud
    5.2.2 ads.txt Adoption
  5.3 Methodology
    5.3.1 Collection of ads.txt Data
  5.4 Adoption of ads.txt
    5.4.1 Publisher’s Perspective
    5.4.2 Seller’s Perspective
  5.5 Compliance
    5.5.1 Isolating RTB Ads
    5.5.2 Compliance Verification Metrics
    5.5.3 Results
  5.6 Summary

6 Diffusion of User Tracking Data in the Online Advertising Ecosystem
  6.1 Related Work on Graph Models
  6.2 Methodology
    6.2.1 Dataset
    6.2.2 Graph Construction
    6.2.3 Detection of A&A Domains
    6.2.4 Coverage
  6.3 Graph Analysis
    6.3.1 Basic Analysis
    6.3.2 Cores and Communities
    6.3.3 Node Importance
  6.4 Information Diffusion
    6.4.1 Simulation Goals
    6.4.2 Simulation Setup
    6.4.3 Validation
    6.4.4 Results
    6.4.5 Random Browsing Model
  6.5 Limitations
  6.6 Summary

7 Conclusion
  7.1 Contributions & Impact
  7.2 Limitations
  7.3 Lessons Learned
    7.3.1 Future Directions

Bibliography

8 Appendices
  8.1 Clustered Domains
  8.2 Selected Ad Exchanges

List of Figures

2.1 The display advertising ecosystem. Impressions and tracking data flow left-to-right, while revenue and ads flow right-to-left.
2.2 Examples of (a) cookie matching and (b) showing an ad to a user via RTB auctions. (a) The user visits publisher p1 ➊, which includes JavaScript from advertiser a1 ➋. a1’s JavaScript then cookie matches with exchange e1 by programmatically generating a request that contains both of their cookies ➌. (b) The user visits publisher p2, which then includes resources from SSP s1 and exchange e2 ➊–➌. e2 solicits bids ➍ and sells the impression to e1 ➎➏, which then holds another auction ➐, ultimately selling the impression to a1 ➑.
4.1 (a) DOM Tree, and (b) Inclusion Tree.
4.2 Overlap between frequent A&A domains and A&A domains from Alexa Top-5K.
4.3 Unique A&A domains contacted by each A&A domain as we crawl more pages.
4.4 Average number of images per persona, with standard deviation error bars.
4.5 Screenshot of our AMT HIT, and examples of different types of ads.
4.6 Regex-like rules we use to identify different types of ad exchange interactions. shop and pub refer to chains that begin at an e-commerce site or publisher, respectively. d is the DSP that serves a retarget; s is the predecessor to d in the publisher-side chain, and is most likely an SSP holding an auction. Dot star (.*) matches any domains zero or more times.
5.1 Adoption of the ads.txt standard by Alexa Top-100K publishers over time.
5.2 Publisher adoption over Alexa ranks.
5.3 Number of ads.txt records per publisher.
5.4 Number of publishers using the same ads.txt file.
5.5 Clusters of |x|, where x is the number of publishers using the same file.
5.6 Number of seller domains over time.
5.7 Authorized sellers over Alexa.
5.8 Sellers across two snapshots.
5.9 Number of sellers and associated publisher IDs (April 2019).
5.10 Sellers by publisher relationships (April 2019).
5.11 Number of unique publishers and total entries (across all publishers) for sellers.
5.12 Percentage of non-compliant (non-clustered) Seller/Buyer tuples per publisher.
5.13 Percentage of non-compliant (clustered) Seller/Buyer tuples per publisher.
5.14 Average distance of buyer from first seller across all publishers. Distances are shown for both compliant and non-compliant tuples.
6.1 An example HTML document and the corresponding inclusion tree, Inclusion graph, and Referer graph. In the DOM representation, the a1 script and a2 img appear at the same level of the tree; in the inclusion tree, the a2 img is a child of the a1 script because the latter element created the former. The Inclusion graph has a 1:1 correspondence with the inclusion tree. The Referer graph fails to capture the relationship between the a1 script and a2 img because they are both embedded in the first-party context, while it correctly attributes the a4 script to the a3 iframe because of the context switch.
6.2 k-core: size of the Inclusion graph WCC as nodes with degree ≤ k are recursively removed.
6.3 CDF of the termination probability for A&A nodes.
6.4 CDF of the mean weight on incoming edges for A&A nodes.
6.5 Examples of my information diffusion simulations. The observed impression count for each A&A node is shown below its name. (a) shows an example graph with two publishers and two ad exchanges. Advertisers a1 and a3 participate in the RTB auctions, as well as DSP a2 that bids on behalf of a4 and a5. (b)–(d) show the flow of data (dark grey arrows) when a user generates impressions on p1 and p2 under three diffusion models. In all three examples, a2 purchases both impressions on behalf of a5, thus they both directly receive information.