IDENTIFYING AND MITIGATING THREATS FROM EMBEDDING THIRD-PARTY CONTENT

A Dissertation Presented to The Academic Faculty

By

Wei Meng

In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the School of Computer Science

Georgia Institute of Technology

August 2017

Copyright © Wei Meng 2017

Approved by:

Dr. Wenke Lee, Advisor
School of Computer Science
Georgia Institute of Technology

Dr. Mustaque Ahamad
School of Computer Science
Georgia Institute of Technology

Dr. Taesoo Kim
School of Computer Science
Georgia Institute of Technology

Dr. Giovanni Vigna
Department of Computer Science
University of California, Santa Barbara

Dr. Nick Feamster
Department of Computer Science
Princeton University

Date Approved: July 20, 2017

To my parents, and those who have supported me.

ACKNOWLEDGEMENTS

This Ph.D. dissertation would not exist without the support of a number of people. I would like to take this opportunity to acknowledge them.

First of all, I am very grateful to my advisor, Wenke Lee, for his guidance and support throughout the Ph.D. program. Wenke has been an amazing advisor, who has trained me to become an independent researcher and to think critically. When I first started my Ph.D., I was a little bit disappointed that I did not receive much instruction on projects or concrete research topics from him. Instead, Wenke provided me with the freedom and the necessary resources to explore my own interests. When I met difficulties in exploring new directions, his insightful feedback helped me overcome many challenges. I learned what top-quality research is through the many debates with him, which I enjoyed a lot. Wenke has also influenced my judgment of what is important in life. I shall benefit from his advice in the rest of my career and life.

I would also like to thank Mustaque Ahamad, Giovanni Vigna, Taesoo Kim and Nick Feamster for serving on my dissertation committee. Their valuable suggestions and comments helped me make significant improvements to this dissertation.

The work presented in this dissertation would not have been possible without the help of my dear collaborators. In my Ph.D., I have been fortunate to have worked with the following brilliant people: Xinyu Xing, Anmol Sheth, Udi Weinsberg, Byoungyoung Lee, Simon P. Chung, Sangho Lee, Christopher Kruegel, Roberto Perdisci, Ren Ding, and Steven Han. Many experiments carried out in this work were supported by the administration team of the Institute for Information Security & Privacy at Georgia Tech, led by Paul Royal and Adam Allred. I am very appreciative of their professional support.

Finally, I must thank my parents for their unconditional love and support. They have always provided me with the freedom and support to pursue my dreams. I shall be able to overcome any challenge in the rest of my life with them standing by me.

TABLE OF CONTENTS

Acknowledgments
List of Tables
List of Figures
Summary

Chapter 1: Introduction
1.1 Problem Statement
1.2 Dissertation Overview
1.2.1 Pollution Attacks
1.2.2 Inferring User Profile with Personalized Mobile In-App Ads
1.2.3 Selective Control on Web Tracking
1.2.4 Monitoring and Understanding Web Content Access Behavior of JavaScript
1.2.5 Fine-Grained Access Control Policy for the Document Object Model
1.3 Dissertation Contributions

Chapter 2: Pollution Attacks against Personalized Web Services and Applications
2.1 Motivation
2.2 Pollution Attacks against Popular Websites
2.2.1 Overview and Attack Model
2.2.2 Pollution Attack against Amazon's Product Recommendation
2.3 Pollution Attack against Targeted Advertising
2.3.1 Ad Targeting and User Profiles
2.3.2 Profile Pollution Attack
2.3.3 Attack Setup
2.3.4 Validation and Effectiveness of the Attack
2.3.5 Revenue Estimation for Live Publishers
2.4 Countermeasures
2.5 Related Work
2.6 Summary

Chapter 3: Inferring User Profile with Personalized Mobile In-App Ads
3.1 Motivation
3.2 Background
3.2.1 Ecosystem of Mobile Advertising
3.2.2 Targeting in Mobile Advertising
3.2.3 (Lack of) Isolation for In-App Advertising
3.3 Methodology
3.3.1 Goals of Study
3.3.2 Challenges and Our Proposed Approaches
3.3.3 Experiment Design
3.4 Characterization of Mobile Ad Personalization
3.4.1 Dataset
3.4.2 Interest Profile Based Personalization
3.4.3 Demographics Based Personalization
3.4.4 Comparison with Previous Studies
3.5 Privacy Leakage through Personalized Mobile Ads
3.5.1 Technical Feasibility
3.5.2 Demographics Learning from Personalized Mobile Ads
3.5.3 Evaluation
3.5.4 Interpreting Our Results
3.6 Discussion
3.6.1 Limitations
3.6.2 Countermeasures
3.7 Related Work
3.8 Summary

Chapter 4: Selective Control on Web Tracking
4.1 Motivation
4.2 Background
4.3 Design
4.3.1 Overview
4.3.2 Tracking Preference Policy
4.3.3 Content Analysis Engine
4.3.4 Tracking Preference Checker
4.3.5 Browsing Context Switch
4.3.6 Discussion
4.4 Implementation
4.4.1 Browser's Architecture
4.4.2 Tracking Preference Policy
4.4.3 Content Analysis Engine
4.4.4 Tracking Preference Checker
4.4.5 Seamless Browsing Context Switch
4.5 System Performance Evaluation
4.5.1 Page Load Time
4.5.2 Memory
4.6 Anti-Tracking Evaluation
4.6.1 Experimental Design
4.6.2 Evaluation Result
4.7 Related Work
4.8 Summary

Chapter 5: Monitoring and Understanding Web Content Access Behavior of JavaScript
5.1 Motivation
5.2 DOM-Logger
5.2.1 Recording Accesses
5.2.2 Tracking Element Creation
5.2.3 Monitoring Dynamically Generated Scripts
5.2.4 Implementation
5.3 Collecting DOM Access Logs in the Real World
5.3.1 Setup
5.3.2 Grouping Logs
5.3.3 Privileges of Scripts
5.3.4 Crawling Results
5.4 Characterizing DOM Access Behaviors of JavaScript
5.4.1 Initiators of DOM Elements
5.4.2 DOM Accesses
5.4.3 Cross-Owner and Over-Privileged Accesses
5.4.4 Sensitive DOM API Accesses
5.4.5 Summary
5.5 Related Work
5.6 Summary

Chapter 6: Fine-Grained Access Control Policy for the Document Object Model
6.1 Motivation
6.2 Background
6.2.1 Web Permission Policies
6.2.2 Motivating Examples
6.3 DOM Access Control Policy
6.3.1 Design of DOM Access Control Policy
6.3.2 Writing an Access Control Policy
6.3.3 Policy Generator
6.4 Implementation of DOM-ACP
6.4.1 V8 Binding
6.4.2 Supporting Access Control Policy
6.4.3 Enforcing Access Control Policy
6.4.4 Generating Access Control Code
6.4.5 Summary
6.5 Case Studies
6.5.1 Protecting Real Websites
6.5.2 Protecting e-Commerce and e-Pay
6.6 Performance Evaluation
6.7 Related Work
6.8 Summary

Chapter 7: Conclusion

References

LIST OF TABLES

2.1 The websites we use for polluting users' profiles in the five ad categories.
2.2 Details of revenue experiments, showing the top 5 and bottom 5 websites we designated as fraudulent publishers, ranked by relative change in indexed CPM using profile pollution.
3.1 Demographics distribution of subjects.
3.2 Accuracy of classifiers of demographic categories.
3.3 Reorganized distribution of demographics of subjects.
3.4 Accuracy of classifiers of reorganized demographic categories.
4.1 Distribution of demographics of survey subjects.
4.2 The top-10 page categories in which users do not want to disclose their visits to vendors.
4.3 The top-10 page categories in which users are comfortable sharing their visits with vendors.
5.1 The top-10 JavaScript URLs.
5.2 The top-10 JavaScript origins.
5.3 The top-10 JavaScript domains.
5.4 Numbers of accesses by scripts in four privilege levels across 91,400 websites.
5.5 Top-10 websites accessed by Priv-3 scripts.
5.6 Top-10 Priv-3 scripts by average number of DOM accesses.
5.7 Numbers of cross-owner accesses across 86,809 websites.
5.8 Numbers of over-privileged accesses to Priv-0 elements across 82,327 websites.
5.9 Numbers of cross-owner and over-privileged accesses across scripts.
5.10 Top-10 scripts by average number of cross-owner accesses.
5.11 Top-10 over-privileged scripts by average number of Priv-0 element accesses.
5.12 Numbers of over-privileged accesses to Priv-0 sensitive elements across 35,961 websites.
5.13 Numbers of sensitive content accesses across over-privileged scripts.
5.14 Top-10 over-privileged scripts by number of websites where they accessed Priv-0 sensitive elements.
5.15 Numbers of over-privileged accesses to Cookies across 77,992 websites.
5.16 Top-10 scripts by number of websites where they accessed Cookies.

LIST OF FIGURES

2.1 Promotion rates across Amazon categories.
2.2 Cumulative promotion rates across varying product ranks for different Amazon pollution attacks.
2.3 An overview of the profile pollution attack.
2.4 Effectiveness of pollution attacks against re-marketing ad campaigns across different ad categories.
2.5 Distribution of ads across the 13 top-level Alexa categories.
2.6 Change in the distribution of ads.
2.7 Percentage increase in ads (pollution - no pollution) from the polluted category.
2.8 Average indexed CPM across the top 5 and bottom 5 selected websites before and after pollution.
2.9 Distribution of the relative increase in the indexed CPM across the 19 selected websites.
3.1 Impression distribution of unique ads.
3.2 User distribution of unique ads.
3.3 Number of unique ads of each user.
3.4 Number of ad impressions in interest categories.
3.5 Number of users in interest categories.
3.6 Number of interest categories in real user interest profiles and ad interest profiles.
3.7 Precision distribution of user profiles.
3.8 Recall distribution of user profiles.
3.9 Number of precise ad impressions of users.
3.10 Number of unique ads that are personalized based on demographics.
3.11 Number of ad impressions that are personalized based on demographics.
3.12 Number of unique ads that are personalized based on demographics across users.
3.13 Number of ad impressions that are personalized based on demographics across users.
3.14 Number of accurate predictions for demographic categories across users.
4.1 Online tracking workflow.
4.2 The overall workflow of TrackMeOrNot.
4.3 Tracking preference policy for TrackMeOrNot.
4.4 The ROC AUC score distribution of binary classifiers.
4.5 Page load time when visiting each of the Alexa top 100 US websites under the Content Analysis Only Configuration (CAOC).
4.6 Page load time when visiting each of the Alexa top 100 US websites under the Browsing Context Switching Configuration (BCSC).
4.7 Main frame HTML source load time (marked as white boxes) vs. extra page load time in TrackMeOrNot (marked as black boxes) when visiting each of the Alexa top 100 US websites.
4.8 Peak memory usage when visiting each of the Alexa top 100 US websites under the Content Analysis Only Configuration (CAOC).
4.9 Peak memory usage when visiting each of the Alexa top 100 US websites under the Browsing Context Switching Configuration (BCSC).
4.10 The number of whitelist and blacklist rules across 129 distinct privacy needs.
4.11 Tracking preferences using the persistent fallback browsing context.
4.12 Evaluation results on 129 tracking preferences with the persistent fallback browsing context.
4.13 Tracking preferences using the anonymous fallback browsing context.
4.14 Evaluation results on 129 tracking preferences with the anonymous fallback browsing context.
5.1 Website distribution by number of scripts in each privilege level.
5.2 Website distribution by number of origins in each privilege level.
5.3 Website distribution by number of domains in each privilege level.
5.4 Website distribution by percentage of scripts in each privilege level.
5.5 Website distribution by percentage of origins in each privilege level.
5.6 Website distribution by percentage of domains in each privilege level.
5.7 Website distribution by number of elements in each privilege level.
5.8 Website distribution by percentage of elements in each privilege level.
5.9 Website distribution by percentage of element accesses by scripts in each privilege level.
6.1 Overview of DOM-ACP's design.
6.2 Workflow of DOM-ACP's implementation when protecting the id attribute of an element.

SUMMARY

Embedding content from third parties to enrich features is a common practice in the development of modern web applications and mobile applications. For example, functionalities such as social integration, programming enhancement, visitor tracking and advertisements can be provided by including simple code snippets from third parties. Such practices can pose serious security and privacy threats to an end user, because sensitive data about a user in an application can be directly accessed by third-party content, which usually operates with the same privilege as first-party content. Third parties can often abuse their privilege to compromise the confidentiality and integrity of the hosting applications and harm end users. The confidentiality and integrity of a user's indirect data, such as a user profile, may also be compromised by such practices.

This dissertation aims to identify new threats posed to end users by the practices of embedding third-party content and to develop techniques to mitigate these threats. We first demonstrate how a malicious first-party application can pollute a user's indirect data in a third-party service or application by embedding it. In particular, we show that the personalization systems of popular websites can be abused by a malicious publisher to pollute a user's profile in a third-party website. In addition, we demonstrate that a malicious Android application can infer a user's profile that has already been learned by a third-party ad network. The pollution and inference of a user's indirect data in other applications are two new classes of threats to end users. We then propose defense techniques to mitigate these two new classes of threats.

This dissertation also aims to design mechanisms that enable end users and developers to limit the privilege of third-party content to prevent unintended behaviors. First, third-party content (e.g., trackers) has the ability to link a user's activities across multiple applications and then build a profile of the user to serve personalized content. Third-party tracking often happens without the user's consent and has raised serious privacy concerns.

In this dissertation, we present TrackMeOrNot, a client-side tracking control mechanism that allows end users to selectively opt out of third-party web tracking on demand. Second, existing web security mechanisms offer only all-or-nothing restriction on the privilege of third-party content, e.g., embedded JavaScript code. Most developers have to trade security for the benefit of embedding third-party JavaScript code, making these mechanisms ineffective. To this end, we study, through a large-scale measurement, how third-party JavaScript code accesses a user's direct data in web applications in general. Our results show that third-party JavaScript code is over-privileged and that such privilege is abused to access sensitive user data. To address the limitations of existing web permission mechanisms, we propose the fine-grained Access Control Policy for the Document Object Model (DOM-ACP). DOM-ACP can help web developers restrict the privilege of third-party JavaScript code in their applications. The new mechanism enables web developers to specify the access permission of a third-party script at per-object granularity to deny unauthorized accesses and protect the confidentiality and integrity of sensitive content in a web application.

CHAPTER 1
INTRODUCTION

1.1 Problem Statement

Modern web applications [1] and mobile applications extensively collect data about their end users to improve user experience. Data associated with a user can be categorized into two classes: direct data and indirect data. Direct data is any personal or personalized data that is directly presented (accessible) to a user in an application. For example, an application may display the user's name or email address in its interface after the user has been authenticated. On the other hand, indirect data, such as the recently browsed items of a user, is gathered and stored in the back-end servers of an application, which uses the collected indirect data to provide end users with better and personalized services. For instance, by building a user profile from the collected indirect data, an application can tailor content to be more relevant to a specific user. The compiled user profile is also one type of indirect user data.

In the development of modern web applications and mobile applications, embedding content served by other providers to enrich features is a common practice. For instance, functionalities such as programming enhancement (e.g., jQuery), social integration (e.g., widgets of Facebook and Twitter), visitor tracking (e.g., Google Analytics), and advertising (e.g., Google DoubleClick) can be easily implemented by embedding a simple code snippet provided by a third party into one's own application. Many users are not aware of and/or cannot recognize the presence of third-party content. As a result, an end user may unintentionally interact with third-party content in addition to the application that she or he currently uses. While interacting with an application, data about an end user is collected by the application the user directly interacts with, as well as by the third-party content that is embedded within the application.

[1] Throughout this thesis, we use websites and web applications, as well as JavaScript and script, interchangeably.

In many cases, such data is collected and used in ways that are out of the user's control and without the user's consent. Third-party content can pose serious security and privacy threats to an application and its users, because it usually operates with the same full privilege as the embedding application. Sensitive user data, e.g., user credentials, billing information and user inputs, can be directly accessed by third-party content if the data is not well isolated. A malicious third party can even modify the hosting application to cause unintended behavior. Web developers can rely on modern web browsers to isolate third-party content and limit its permissions if they embed the third-party content (e.g., a third-party web page) within an iframe tag, because the Same-Origin Policy (SOP) prohibits interaction between content loaded from different origins [2]. However, a lot of third-party content, such as JavaScript code and Cascading Style Sheets (CSS), can be directly embedded within a hosting application; such content is exempt from the restriction imposed by the SOP and hence violates fundamental security principles. In the Android mobile operating system, third-party content is not well isolated from the hosting application either, and can inherit all the privileges of the hosting application.

It is well known that third parties can often abuse their privilege to compromise the confidentiality and integrity of the hosting applications and harm end users by stealing or modifying direct user data. However, the question of whether a user's indirect data can be compromised by a malicious third party, or even by a first party, is not well explored. Moreover, neither end users nor developers can easily limit the privilege of third-party content to prohibit unintended behaviors.

[2] A web origin is defined as the tuple of protocol, host, and port number. Cross-origin requests can still be allowed with the Cross-Origin Resource Sharing (CORS) mechanism.
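To make the privilege gap between iframe-isolated and directly embedded third-party content concrete, the following is a minimal JavaScript sketch of our own (the third-party URLs are hypothetical placeholders):

    // Embedding a third-party page in an iframe: the SOP isolates it, so
    // code inside the frame cannot read the embedding page's DOM or cookies.
    const frame = document.createElement("iframe");
    frame.src = "https://third-party.example/widget.html"; // hypothetical URL
    document.body.appendChild(frame);

    // Directly including a third-party script: it runs with the full
    // privilege of the embedding page, so it can read document.cookie,
    // form fields, and any other first-party content, or modify the page.
    const script = document.createElement("script");
    script.src = "https://third-party.example/lib.js"; // hypothetical URL
    document.body.appendChild(script);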

1.2 Dissertation Overview

This dissertation aims to identify new threats posed to end users by the practices of embedding third-party content in modern web applications and mobile applications, and to develop techniques to mitigate these threats. Furthermore, we aim to advance the state-of-the-art in defense techniques against over-privileged third-party content for preventing unintended behaviors.



In the first part of this dissertation (§2 and §3), we study the security of indirect user profiles in the applications of popular service providers. Specifically, we have identified, and proposed mitigation techniques for, two new classes of threats posed by a malicious first-party application to a user's indirect data: a malicious first-party application can either pollute or infer a user's profile in another application by embedding it. The second part (§4) presents TrackMeOrNot, a system that allows end users to selectively opt out of unwanted third-party web tracking. In the last two chapters (§5 and §6), we study the threats posed to a user's direct data in a web application by third-party JavaScript code. We first conduct a large-scale measurement with DOM-Logger, a system that helps web developers monitor and understand the web content access behavior of embedded JavaScript code. We then design DOM-ACP, a fine-grained permission mechanism for web applications, as the last contribution of this dissertation. We introduce each of the studied problems and the research approaches taken in the following subsections.

1.2.1 Pollution Attacks

Personalization is an emerging technology that is embraced by modern applications. For example, personalization has been widely used in recommending search results (Google), videos (YouTube, Netflix, etc.) and items (Amazon), and in targeting certain users in online advertising (DoubleClick). Serving more relevant content not only improves user experience and increases user engagement, but can also help grow a business's revenue. To customize content based on a user's interests, gender, age and other personal traits, an application needs to first collect a user's past interactions with it and then build a user profile based on the collected historical indirect data. However, many applications do not properly validate the collected user interaction data. Furthermore, these web applications can be embedded by other web applications as third-party content, thus allowing a malicious application to trick a user's browser into loading the content of the embedded applications.

The above observations present a new attack surface for attackers to interfere with an application's personalization system and compromise the integrity of a user's indirect data. In §2, we study whether the personalization systems of popular web applications can be abused by a malicious publisher website. In particular, we experimented with popular web applications that a user interacts with either directly or indirectly. We attempted to inject artificial user interaction data into the personalization systems of these applications by either embedding their content or exploiting vulnerabilities such as Cross-site Request Forgery (CSRF). We observed that a malicious first-party application was able to alter the customized content of these applications because the input to their personalization systems was polluted. We have proposed several techniques, such as browsing context identification, to mitigate such threats.

1.2.2 Inferring User Profile with Personalized Mobile In-App Ads

Tracking and advertising libraries are among the most commonly embedded third-party content in today's web applications and mobile applications. By collaborating with thousands or even millions of applications, the providers of these third-party libraries have the ability to link a user's activities (indirect data) across multiple applications and build an accurate profile of the user. As a result, personalized ads are highly correlated with a user's personal information, such as her or his interests and demographic information. In web applications, personalized ads are usually isolated in iframes from other content by the Same-Origin Policy (SOP). In today's Android applications, such isolation does not exist. Much effort has been put into understanding and mitigating the security and privacy threats posed by embedded third-party content on Android. Little has focused on studying whether a malicious hosting application can also pose threats to user privacy. In §3, we estimate how much a mobile app can learn about a user by abusing its privilege to inspect the personalized ads loaded at app runtime. In particular, we sought ground truth regarding users' demographic and interest profiles, as well as the personalized ads served by Google, from more than 200 real users.

Our analysis shows that personalized mobile in-app ads are highly correlated with a user's profile. Furthermore, we find that a mobile app was able to predict some personal traits (e.g., gender, age group and parental status) with significantly higher accuracy than random guessing. These findings demonstrate the necessity of enforcing strong isolation of third-party content in Android applications to protect both the hosting application and the user data in the embedded application.

1.2.3 Selective Control on Web Tracking

A user's browsing activities across multiple applications can be observed and aggregated by providers of third-party content to infer her or his personal traits and provide targeted advertising and other personalized content. Some third-party tracking scripts do not directly read or modify the content of the embedding application, which is usually considered harmless. Many first-party applications also track their users' activities. However, such first-party and third-party web tracking practices generally occur without a user's consent and are considered abuses. To gain trust from end users, many web trackers (vendors) have provided control mechanisms, e.g., opt-out and DoNotTrack, that allow users to restrict how vendors use the collected data. These server-side mechanisms, however, do not prevent trackers from collecting data. Thus, many users seek protection from client-side controls, such as third-party Cookie blocking, ad blockers, and the private browsing mode in browsers. Client-side controls usually harm user experience, because vendors cannot serve relevant content without collecting users' online footprints.

To achieve a trade-off between privacy and user experience, we propose a new client-side tracking control mechanism, TrackMeOrNot, which automatically stops unwanted web tracking when a user visits privacy-sensitive content and shares only non-sensitive visits with trackers for a better user experience. TrackMeOrNot provides a user with two browsing contexts and enables seamless context switching within a browser tab. TrackMeOrNot starts a navigation with the anonymous browsing context and summarizes the web content with its content analysis engine.

The browser then seamlessly switches to the user's persistent profile within the same tab if the content does not violate any privacy policy (preference) defined by the user. Our evaluation with more than 140 user preferences shows that TrackMeOrNot achieves high accuracy with low performance overhead in selectively sharing users' online footprints with web trackers.
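The decision flow can be sketched as follows. This is purely illustrative pseudocode of ours; the function names are hypothetical placeholders and do not correspond to TrackMeOrNot's actual in-browser implementation:

    // Illustrative sketch only: all functions below are hypothetical
    // placeholders for components that live inside the modified browser.
    async function navigate(url, preference) {
      // 1. Start every navigation in the anonymous browsing context, so no
      //    persistent tracking state (e.g., Cookies) is exposed yet.
      const page = await loadInAnonymousContext(url);
      // 2. Summarize the page with the content analysis engine.
      const topics = analyzeContent(page);
      // 3. If no topic is privacy-sensitive under the user's preference,
      //    seamlessly switch the tab to the persistent browsing context.
      if (!topics.some((t) => preference.sensitiveTopics.includes(t))) {
        switchToPersistentContext(page);
      }
    }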

1.2.4 Monitoring and Understanding Web Content Access Behavior of JavaScript

Third-party JavaScript code is widely embedded in modern web applications. The embedded third-party scripts can help enhance the functionality and improve the experience of a web application. On the other hand, they inherit the full privilege of the websites that embed them. It is generally known that third-party JavaScript code can perform many privileged and unsafe operations, e.g., document.write() and window.eval(). It is, however, not well studied how third-party JavaScript code interacts with the content in the embedding web applications. The lack of knowledge about the behaviors of third-party scripts leads to an underestimation of the risk of embedding third-party JavaScript code. To help web developers understand how the embedded JavaScript code accesses their content, e.g., Document Object Model (DOM) elements and Cookies, we design and implement a new monitoring tool, DOM-Logger. By inserting access-monitoring code into the implementation of the DOM inside a web browser, DOM-Logger can report all the DOM accesses made by any script. Using DOM-Logger, we collected and analyzed DOM access logs from the Alexa top 100K websites. Our analysis shows that many third-party scripts were abusing their privileges, e.g., accessing Cookies and user inputs, and accessing (all) elements created by other scripts. Worse, those over-privileged scripts, including popular ones provided by Google and Facebook, were commonly found on most websites in our measurement.
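DOM-Logger hooks the DOM implementation inside the browser itself; as a rough page-level approximation of the same idea (ours, for illustration only), one can wrap a DOM accessor in JavaScript and log which script touches it:

    // Illustration only: wrap the document.cookie accessor so that every
    // read or write is logged together with a stack trace, from which the
    // URL of the accessing script can be recovered.
    const desc = Object.getOwnPropertyDescriptor(Document.prototype, "cookie");
    Object.defineProperty(document, "cookie", {
      get() {
        console.log("cookie read:", new Error().stack);
        return desc.get.call(document);
      },
      set(value) {
        console.log("cookie write:", new Error().stack);
        desc.set.call(document, value);
      },
    });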

1.2.5 Fine-Grained Access Control Policy for the Document Object Model

Embedded third-party JavaScript code has the same privilege as the hosting web application. It can further include scripts from other domains into the hosting application. As a result, a malicious third-party script can compromise the confidentiality and integrity of the hosting application. A web developer can define Content Security Policy (CSP) rules to prevent unknown JavaScript code and other resources from being loaded. However, CSP and other existing web security mechanisms offer only all-or-nothing restriction. Specifically, CSP does not further limit the privilege of permitted scripts, which can be loaded from a compromised web server. Furthermore, such strict and coarse-grained restriction usually harms functionality. To most developers, the benefits of embedding third-party content often outweigh the risks of embedding it. More alarmingly, many developers do not clearly understand what the risks of embedding content from a third party are. Thus, many of them do not enable these security mechanisms, rendering the protection they offer ineffective.

In order to protect sensitive user data in web applications, it is necessary to limit the privilege of third-party JavaScript code to its minimum. In particular, third-party JavaScript code does not need to access all the content (especially the sensitive content) presented in a web application. Existing solutions such as information flow control systems can prohibit sensitive data from being leaked. However, they do not prevent a third-party script from modifying the content of a hosting application. To help web developers protect both the confidentiality and integrity of their applications, we propose DOM-ACP, a fine-grained access control mechanism for the Document Object Model. The mechanism allows web developers to define the access permission (read and/or write) of any third-party script at per-object granularity through a policy language. The developer-defined policies are enforced by a web browser to mediate all JavaScript accesses. We design and implement a prototype of the new mechanism with the Chromium browser that can prevent direct read or write access to sensitive content in web applications. We also develop a tool to assist web developers in generating a basic policy for their application. Our evaluation with popular web applications shows that DOM-ACP is effective in preventing unauthorized JavaScript code from accessing protected DOM objects.

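To convey the idea of per-object, per-script permissions, the following sketch is an invented approximation of ours; the concrete policy language is defined in §6.3.2 and may differ substantially:

    // Invented approximation only, not DOM-ACP's actual syntax. The idea:
    // each third-party script is granted explicit read/write permission on
    // specific DOM objects, and everything else is denied by default.
    const policy = {
      "https://ads.example/ad.js": {        // hypothetical third-party script
        "#ad-slot": ["read", "write"],      // may render into its own container
        "#billing-form": [],                // no access to sensitive elements
        "document.cookie": [],              // no access to first-party Cookies
      },
    };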

1.3 Dissertation Contributions

In summary, this dissertation makes the following technical contributions to the security research community:

• New threats: This dissertation identifies two new classes of threats in which a malicious first-party application compromises the confidentiality and integrity of a user's indirect data in other applications.

• New observations: The findings and observations gained in this dissertation have advanced the community’s understanding of emerging threats posed by the practice of embedding third-party content.

• New techniques: This dissertation presents two new mitigation techniques to help end users and web developers limit the capability of over-privileged third-party content to prevent unintended behaviors.

CHAPTER 2
POLLUTION ATTACKS AGAINST PERSONALIZED WEB SERVICES AND APPLICATIONS

2.1 Motivation

Personalization is an emerging technology that is embraced by modern web services and applications. By collecting a user's past interactions with a service and then building a user profile based on the collected historical data, a service provider can tailor content to be more relevant to a particular user. Personalization has been widely used by popular web applications in recommending videos (YouTube, Netflix, etc.), products (Amazon) and search results (Google) to users. As the primary revenue source of millions of websites, advertisements are also becoming more personalized. The online advertising industry has put a concerted effort into increasing the relevance of ads targeted at users by tailoring the ads to their stated or inferred interests. Serving relevant content not only improves user experience and increases user engagement, but also contributes to growth in revenues. A recent study found that 11.5% of the revenue on a group of e-commerce sites was attributable to personalized product recommendations [1]. On the other hand, a great user experience can attract new users and keep existing users active, thus increasing the traffic volume and the ad impressions of a website. Studies have also shown that ads targeted based on a user's online interests have a 40% higher chance of leading to a financial conversion than non-targeted ads [2]. Consequently, the average price online advertisers pay for these targeted ads is 2.6 times higher than for non-targeted ads [3]. The increasing use of personalization in modern web applications has also presented a new attack surface for miscreants who try to exploit the new technology for their own interests. In this chapter, we demonstrate that contemporary personalization mechanisms are vulnerable to exploitation.

In particular, we show that popular web applications are vulnerable to a new class of attack, which we call the pollution attack, that allows an attacker to alter the customized content these applications return to users who have visited a web page controlled by the attacker. The pollution attack exploits the fact that a service or application employing personalization incorporates a user's past history to customize the content. More importantly, the personalization systems of some web applications do not properly validate the user interactions with them, and their content can be embedded as third-party content by attackers. As a result, an attacker can embed other applications as third-party content in the background and hence inject artificial user interactions into the personalization systems of these applications. Based on how a service or application collects user interaction data, pollution attacks can be categorized into two classes. The first class of pollution attacks targets applications that collect data as first parties that a user directly visits. For example, end users interact with websites such as YouTube, Amazon and Google directly. These applications personalize their content based on users' direct activities. The second class targets services that collect data as third parties. In particular, most advertisements are embedded as third-party content by the affiliated partner websites a user visits. By logging which websites a user visits, ad networks/exchanges can infer the user's interests and serve ads that target specific user groups. We first demonstrate that attackers can launch the first class of pollution attacks to pollute a victim's profiles on websites that the victim directly visits in §2.2. We then show that a malicious website can pollute the interest profile of a victim on third-party ad networks to display higher-paying (but irrelevant) ads to increase its revenue in §2.3. The ability to trivially abuse personalization systems to alter a user's content on other web services and applications is especially worrisome because such abuses do not exploit any vulnerability in the user's web browser. Rather, they leverage design defects in the personalization systems of web services and applications. We discuss defense techniques to stop such abuses in §2.4 and related work in §2.5.

2.2 Pollution Attacks against Popular Websites

In this section, we present the first class of pollution attacks, which pollute a user's profiles on popular websites that the user directly visits. We first discuss the overview and attack model in §2.2.1, and then present a concrete attack example against Amazon's product recommendation in §2.2.2. We also apply the pollution attack to YouTube's video recommendation and Google's personalized search; readers interested in these two attacks can find more details in our previous paper [4].

2.2.1 Overview and Attack Model

In this section, we first present a brief overview of personalization as it is used by popular web applications. We then present a model of pollution attacks against websites that end users directly visit.

Personalization

Personalization can help web applications deliver information that is tailored to users' interests and preferences. Users receive more relevant information; meanwhile, the service providers present content that the user is more likely to purchase or click, which potentially increases the revenue of the service providers. In addition to the current query (a visit or a search) from a user, the other primary input that personalization algorithms use to customize the content is the user's interaction history. For example, based on the items a user has recently viewed or the keywords a user has searched for, a web application can recommend similar items to the user. Besides the user's past interaction activities, other factors, including geo-location and time of day, may also affect the personalized result. In this section, we focus on how an attacker can pollute a user's interaction history to alter the customized content generated by the personalization algorithm of a web service or application.

Pollution Attacks

The goal of a pollution attack is to affect the personalized content of a website that a user directly visits, given a particular query from the user. To achieve this goal, an attacker can trick a user into visiting the attacker's website and use techniques such as Cross-site Request Forgery (CSRF) to inject artificial input into the victim's history log on the target website. This attack requires three steps:

1. Model the personalization algorithm. We assume that the attacker has some ability to model the personalization algorithm that a website uses to customize its results. In particular, we assume the attacker has knowledge of what inputs may be used by the personalization algorithm. Such knowledge is often available in published white papers, or can in some cases be learned through exploration and experimentation.

2. Craft "seed" input to pollute the user's interaction history. Having some knowledge of what type of input data may be used by a personalization algorithm, the attacker needs to carefully craft seed input that may affect the personalized results. Depending on the application and the query of a user, the seed inputs range from visits, clicks, and queries to any other activity that might be employed by the personalization algorithm. For example, the most recently browsed or searched products may affect the product recommendation results on the homepage of Amazon. The past queries and clicks that are related to a user's current search keywords may affect the personalized search results on Google.

3. Inject the seed input into the victim's interaction history. To pollute a user's history, an attacker needs to automatically create interaction events that will be logged by a personalization system. The interaction events have to be associated with the victim's current session. An attacker can directly steal a user's credentials to pollute her/his history, which is very difficult but still possible. A more practical scenario is that the attacker uses attack vectors such as CSRF to make it appear as though the victim user is interacting with a website (e.g., browsing an item, searching for some keywords).

2.2.2 Pollution Attack against Amazon’s Product Recommendation

Amazon's personalization is very direct and obvious to an end user. On the one hand, this makes pollution-based attacks less insidious, as they will be plainly visible to the observant user. On the other hand, Amazon has the most direct monetization of its screen real estate – users may directly purchase goods from Amazon – so it is possible that any exploitation of Amazon's personalization can be profitable to an enterprising attacker. Amazon tailors a customer's homepage based on the previous purchase, browsing and searching behavior of the user. Amazon's product recommendations consider each of these three activities individually, and Amazon explicitly labels its recommendations according to the aspect of the user's history it used to generate them [1]. We focused on the personalized recommendations Amazon generates based on the browsing and searching activities of a customer, because manipulating the previous purchase history of a customer may have unintended consequences.

[1] We performed the study in 2013. Amazon may have changed how it recommends products to users on its homepage today.

Amazon Recommendations

Amazon displays on a customer's homepage five separate recommendation lists that are ostensibly computed based on the customer's searching and browsing history. Four of these lists are derived from the products that the customer has recently viewed (view-based recommendation lists); the fifth is based on the latest search term the customer entered (search-based recommendation). For each of the view-based recommendation lists, Amazon uses relationships between products that are purchased together to compute the corresponding recommended products. For the recommendation list that is computed based on the latest search term of a customer, the recommended products are the top-ranked results for the latest search term. Because customers frequently browse Amazon without being signed in, both the latest viewed products and the latest search term of the customer are associated with the session cookies on the user's browser.

Identifying Seed Products and Terms

Because Amazon computes the view-based and search-based recommendation lists separately, the seed data required for exploiting each list must also be different.

Visit-based pollution. To promote a targeted product in a view-based recommendation list, an attacker needs to identify a seed product as follows. Given a targeted product that the attacker wishes to promote, the attacker visits the Amazon page of the product and retrieves the related products that are shown on the same page. To test the suitability of each related product, the attacker can visit the Amazon page of that product and then check the Amazon homepage. If the targeted product is shown in a recommendation list, the URL of the candidate related product can be used as a seed to promote the targeted product.

Search-based pollution. To promote a targeted product in a search-based recommendation list, it suffices to identify an appropriate search term. If automation is desired, an attacker could use a natural language toolkit to automatically extract a candidate keyword set from the targeted product's name. Any combination of these keywords that successfully isolates the targeted product can be used as the seed search term for promoting the targeted product. For example, to promote the product "Breville BJE200XL Compact Juice Fountain 700-Watt Juice Extractor", an attacker can inject the search term "Breville BJE200XL" to replace an Amazon customer's latest search term through CSRF.

Injecting Views/Searches

The attacker embeds the Amazon URLs of the desired seed items or search queries into a website that the victim's browser is induced to visit with CSRF. For example, if one seed search term is "Coffee Maker", the seed URL would be something like http://www.amazon.com/s/?field-keywords=Coffee+Maker. Similarly, an attacker could embed the URL of a seed product into an invisible img tag as the src of the image. When a victim visits the attacker's website, Amazon receives the request for that particular query or item and customizes the content of the victim's Amazon homepage based on that interaction event.
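Concretely, the injection can be as small as the following JavaScript sketch of ours (the search term is taken from the example above; the product ASIN is a placeholder):

    // Sketch of the camouflaged injection described above. The victim's
    // browser attaches the victim's Amazon session cookies to these GET
    // requests, so Amazon logs them as the victim's own activity. The
    // response is not a valid image, but the request (and its effect on
    // the victim's history) still goes through.
    function pollute(seedUrl) {
      const img = new Image(); // never attached to the page: invisible
      img.src = seedUrl;       // cross-site GET carrying the cookies
    }
    // Search-based pollution: replace the victim's latest search term.
    pollute("http://www.amazon.com/s/?field-keywords=Coffee+Maker");
    // Visit-based pollution: record a view of a seed product page
    // (B000EXAMPLE is a placeholder ASIN).
    pollute("http://www.amazon.com/dp/B000EXAMPLE");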

Experimental Design

To evaluate the effectiveness of the pollution attack against Amazon, we conducted two experiments. The first experiment measured the effectiveness of our attack when targeted toward popular items across different categories of Amazon products. The second quantified the effectiveness of our attack on randomly selected, mostly unpopular Amazon products.

Popular Products. Amazon categorizes sellers' products into 32 root categories. To select products from each category, we scraped the top 100 best-selling products in each category in January 2013, and launched a separate attack targeting each of these 3,200 items.

Random Products. To evaluate the effectiveness of the pollution attack for promoting arbitrary products, we also selected products randomly. We downloaded a list of Amazon Standard Identification Numbers (ASINs) [5] that includes 75,115,473 ASIN records. Since each ASIN represents an Amazon product, we randomly sampled ASINs from the list and constructed a set of 3,000 products that were available for sale. For every randomly selected product in the list, we recorded the sales ranking of that product in its corresponding category.

Evaluation

Because Amazon computes search-based and visit-based recommendations based entirely upon the most recent history, we can evaluate the effectiveness of the pollution attack without using Amazon accounts from real users. Thus, we measured the effectiveness of our attack by studying the success rate of promoting our targeted products for fresh Amazon accounts.

Promoting Products in Different Categories. To evaluate the effectiveness of the pollution attack for each targeted product, we checked whether the ASIN of the targeted product matches the ASIN of an item in the recommendation lists on the user's customized Amazon homepage. Figure 2.1 illustrates the promotion rate of target products in each category. The view-based and search-based attacks produced similar promotion rates across all categories, about 78% on average. Two categories had significantly lower promotion rates: Gift-Cards-Store and Movies-TV (5% and 25%, respectively). To understand why these categories yielded lower promotion rates, we analyzed the top 100 best-selling products for each category. For Gift-Cards-Store, we found that two factors distinguish gift cards from other product types. First, the gift cards all had similar names; therefore, using the keywords derived from the product name results in only a small number of specific gift cards being recommended. Second, we found that searching for any combination of keywords extracted from the product names always caused a promotion of Amazon's own gift cards, which may imply that it is more difficult to promote product types that Amazon competes with directly. Further investigation into the Movies-TV category revealed that Amazon recommends TV episodes differently. In our attempts to promote specific TV episodes, we found that Amazon instead recommends the first or latest episode of the corresponding TV series, or the entire series. Because we declared a promotion successful only if the exact ASIN appears in the recommendation lists, these alternative recommendations are counted as failures; however, they could also be considered successes, because the attack caused the promotion of very similar products. Therefore, we believe that for all categories except Gift-Cards-Store, an attacker has a significant chance of successfully promoting best-selling products.

[Figure 2.1: Promotion rates across Amazon categories.]

[Figure 2.2: Cumulative promotion rates across varying product ranks for different Amazon pollution attacks.]

Promoting Randomly Selected Products. We launched pollution attacks to promote 3,000 randomly selected products. We calculated the Cumulative Success Rate of products with respect to their rankings. The Cumulative Success Rate for a given range of product rankings is defined as the ratio of the number of successfully promoted products to the number of target products in that range. Figure 2.2 shows the cumulative promotion rates for different product rankings for the two types of pollution attacks. As target products decrease in popularity (i.e., have a higher ranking position within their category), pollution attacks become less effective. This phenomenon reflects a limitation of Amazon's recommendation algorithms, not of our attack. Products with low rankings might not be purchased as often; as a result, they may have few and weak co-visit and co-purchase relationships with other products. Our preliminary investigation finds that products which rank 2,000 or higher within their category have at least a 50% chance of being promoted by a visit-based pollution attack, and products with rankings 10,000 and higher have at least a 30% chance of being promoted using search-based attacks.
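Written as a formula (notation ours; the text defines the metric only in prose), for a ranking cutoff r:

    \mathrm{CSR}(r) = \frac{\left|\{\, p \in T : \mathrm{rank}(p) \le r \text{ and } p \text{ was successfully promoted} \,\}\right|}{\left|\{\, p \in T : \mathrm{rank}(p) \le r \,\}\right|}

where T is the set of 3,000 target products and rank(p) is the sales ranking of product p within its category.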

2.3 Pollution Attack against Targeted Advertising

In this section, we present the second class of pollution attacks, which pollute a user's profiles on services and applications that the user usually does not directly visit, i.e., third-party applications. In particular, we demonstrate this class of pollution attacks using targeted online advertising as an example. We present a new ad fraud mechanism that enables publishers to increase their ad revenue by exploiting the role played by the user's online interest profile in the ad selection process. Our attack exploits the fact that advertisers mainly set up campaigns to target users with specific online interests and are willing to pay more for such users. Since the user's interest profile is inferred by third-party advertising and tracking libraries based on the web pages a user visits, it is vulnerable to exploits that use Cross-Site Request Forgery (CSRF) [6], clickjacking [7] or cross-site scripting (XSS) [8] to pollute users' profiles by generating camouflaged requests to web pages not explicitly visited by them. A fraudulent publisher can use these exploits to pollute the profiles of users visiting the publisher's website to mislead advertisers and the ad exchange into delivering more lucrative ads to these users, and thereby increase the publisher's ad revenue.

While the above described attack seems intuitive, it is not trivial to design and launch the attack such that it is practical, effective, and lucrative. To the best of our knowledge, we are the first to design and successfully deploy a pollution attack on the existing targeted advertising ecosystem. Achieving this requires addressing the following challenges, which also form the main contributions of our work. First, the attack should not require any explicit cooperation from the ad exchange or advertisers, and should be effective for the two commonly used ad targeting mechanisms – behavioral targeting and re-marketing. Second, polluting user profiles should be effective even without explicit knowledge about external factors that impact ad revenue (campaign budgets, bid costs, publisher preferences, ad inventory, etc.). Third, it should be feasible to load the pollution content in a camouflaged manner such that it is not discernible by the users while deceiving the ad exchange and advertisers. Finally, the polluted user profile should result in biasing the ads targeted at the user towards the intended higher-paying advertisers.

To address the above challenges, we set up and validate the attack against one of the largest ad exchanges, DoubleClick, and study the monetary value of the attack for live publisher web pages. Instead of polluting live traffic, we emulate user traffic to the publisher websites by replaying web traces collected from 619 real users from 264 distinct IP addresses and recording all ads delivered to these emulated users. This setup enables an end-to-end characterization of the different aspects of the attack under controlled settings that is otherwise not feasible. Our results show that the attack is successful and effective in deceiving DoubleClick into delivering higher-paying ads on the fraudulent publisher's website. Using our attack, the polluter can influence up to 74% and 12% of the total ad impressions for re-marketing and behavioral pollution, respectively. Finally, we show that the attack is lucrative, enabling fraudulent publishers to increase their ad revenue on average by 33%. The main contributions of this section are:

• To the best of our knowledge, we are the first to demonstrate a practical application of a profile pollution attack with clear monetary benefits to the attacker. We perform an end-to-end validation and characterization of the attack on the online advertising ecosystem and study the monetary value to publishers.

• We present novel approaches for selecting content used for polluting users’ profiles in a way that influences the two most commonly used ad targeting mechanisms – re-marketing and behavioral targeting – while not being discernible by the polluted users.

• We validate our attack against one of the largest ad exchanges, DoubleClick, and show that it is successful and effective in deceiving DoubleClick into delivering higher-paying ads on the fraudulent publisher's website. Using our attack, the polluter can influence up to 74% and 12% of the total ad impressions for re-marketing and behavioral pollution, respectively.

• Finally, we show that the attack is lucrative, enabling the fraudulent publishers to increase their ad revenue by 4% to 23%. We also present a detailed study of the primary factors that influence the revenue generated and discuss potential countermeasures.

2.3.1 Ad Targeting and User Profiles

In this section, we describe the ad targeting mechanisms available to advertisers [9] and discuss the critical role played by a user's online interest profile in the existing ad ecosystem.

Ad Targeting Mechanisms

Contextual Targeting. Contextual targeting involves matching the ad with the context of the page that it is displayed on (and ignores the visitor's interest profile). The targeting is implicit, and the user's online interests are largely ignored: a car insurance company will place ads on auto-related sites because it is assumed that visitors to the site are likely to own a car (or want to) and will need insurance.

Re-Marketing. Re-marketing is used by advertisers to target users who, in the past, have indicated a very specific interest in a particular product. For example, consider a user who visits a car insurance website, clicks on a link to get a quote, but leaves without buying the insurance offered. The insurance company (via the ad exchange) can then target this user with re-marketing ads, e.g., showing insurance discounts. These ads will be delivered to the user on other websites, which may be completely unrelated to cars or insurance, to lure the user back to finish the purchase. Here, the advertiser targets a user by exploiting a very specific signal.

Behavioral Targeting. Behavioral targeting is used by advertisers that target users who have shown an interest in some categories (e.g., cars or college football). This mechanism goes beyond the "single domain" aspect of re-marketing, and selects ads that might relate to the user's online interests as observed from her browsing patterns. This form of targeting often results in ads that may be unrelated to the page being viewed [10]. For example, with behavioral targeting, a user might be targeted with car insurance related ads (potentially from a company she did not visit online) on a website about Food & Nutrition simply because she visited multiple different car insurance related websites, and the ad exchange profiled her to be interested in car insurance.

User Profiles and Targeted Ads

Behavioral targeting and re-marketing make explicit use of the user's online interests that are profiled by the ad exchange and other third party trackers. This is achieved by installing third party JavaScript tracking code provided by the ad exchange on websites that users browse. The tracking code extracts details about the page (e.g., exact URL, meta tags about keywords, description, etc. [11]) and transmits this along with the user's cookie identifier. This information, along with other information that the ad exchange has about the website, is used to profile the user's interests, which are offered to advertisers as targeting options. A user's interest profile for behavioral targeting is represented as a set of semantic categories, structured as a hierarchy (e.g., Movies → Action Films → Superhero films). For re-marketing, the ad exchange simply maintains a list of users (cookie IDs) that visit a specific page on the advertiser's website.

As is evident, the user's interest profile forms an integral component of the ad selection process. Advertisers assign a monetary value, using cost-per-click (CPC) or cost-per-mille (CPM), directly to the user's online interests and are willing to pay up to 2.6 times higher to target ads at users with a desired profile [3]. Moreover, as we show in Section 2.3.4, although a user may have online interests accumulated over a long time period, short term browsing activity can significantly impact the user's profile and consequently change the type of ads that a user receives. Our attack exploits this critical aspect and enables publishers to pollute user profiles towards ad categories that generate higher revenue. In the following section we provide an overview of the attack and present techniques for profile pollution that are specific to the ad ecosystem and the commonly used ad targeting mechanisms.

Figure 2.3: An overview of the profile pollution attack

2.3.2 Profile Pollution Attack

The profile pollution attack (which we also refer to as a fraud mechanism) introduces a new entity in the ad ecosystem that we call the profile polluter. Figure 2.3 shows the interaction of the profile polluter with the rest of the ad ecosystem (dashed lines). Specifically, the primary steps involved in a successful attack are:

1. The profile polluter identifies and downloads content in order to pollute user profiles.

2. A user visits the polluter page (which can be hosted at the publisher's website) and pollution content is loaded first in a camouflaged manner (steps 1 and 1a in the figure).

3. This signals the ad exchange of a legitimate browsing event by the user, and the user's profile is impacted (step 1b).

4. When the user navigates to another page on the publisher's website, the ad exchange is deceived into using this modified profile in soliciting bids for ads (steps 2-5).

5. The publisher's revenue increases if the winning ad is from an advertiser that bids higher to target the polluted user (step 6).

The attack focuses on polluting users to influence behavioral targeting and re-marketing ad campaigns, as they explicitly make use of the user's online interest profile. In order to simplify the description, we assume that the publisher also plays the role of the profile polluter. Readers should note that, under this assumption, the attack can only impact the ads of the user's next visit, as the website content and the pollution content are loaded by the browser in parallel in each visit.

Identifying Pollution Content

Pollution Content for Re-Marketing. A re-marketing campaign is set up by integrating a few lines of JavaScript code, i.e., the re-marketing script, which is provided by the ad exchange, into the advertiser's website. The JavaScript code encodes the unique identifier of the advertiser and the associated re-marketing campaign. When a user visits the re-marketing enabled advertiser website, these identifiers, along with the user's ad exchange cookie, are transmitted to the ad exchange. This enables the ad exchange to tag the user and track her interactions on the advertiser's website. The tagged user is then easily re-identified later on other websites and is targeted with ads from the advertiser. Consequently, a user's past browsing history and online interests do not impact re-marketing ads. This script can be easily detected by parsing the HTML code of a web page2. Thus, a simple approach to find content for re-marketing pollution is to parse web pages of advertisers belonging to high-paying categories and identify those that host re-marketing scripts.

Pollution Content for Behavioral Targeting. The approach of simply scanning websites in a directory service is not sufficient for finding content for behavioral pollution, as the ad exchange categories may not match those of the directory service. Alternatively, the polluter can exploit the ad preference dashboards made available by large ad exchanges to build an offline map between web pages and the category label assigned to these web pages. Specifically, the polluter can impersonate a user with a blank profile (delete all cookies and create a fresh browser profile), browse pages from a specific category, and record the corresponding profile generated by the ad exchange. This map can then be used to select pollution content. Unlike re-marketing based pollution, the impact of behavioral pollution on altering user profiles towards more lucrative advertisers depends on the users' existing online interest profile. We empirically evaluate this impact across diverse user profiles in Section 2.3.4.

2DoubleClick itself provides instructions on how their Tag Assistant detects re-marketing scripts. For more details, see https://support.google.com/tagassistant/answer/2954407?hl=en
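
To make the scanning step concrete, the following is a minimal sketch of how candidate advertiser pages could be checked for a re-marketing tag. This is not the tooling used in our experiments; the script URL pattern, the candidate URL list, and the helper names are illustrative assumptions.

```python
# Sketch: scan candidate advertiser pages for a DoubleClick re-marketing tag.
# Assumptions (not from this chapter's implementation): the tag can be
# recognized by the googleadservices.com conversion script URL, and
# candidate_urls comes from an Alexa category listing.
import requests
from bs4 import BeautifulSoup

REMARKETING_HINTS = ("googleadservices.com/pagead/conversion",)

def hosts_remarketing_script(url):
    """Fetch a page and check whether any <script> element references
    or inlines something that looks like a re-marketing tag."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    for script in soup.find_all("script"):
        blob = (script.get("src") or "") + (script.string or "")
        if any(hint in blob for hint in REMARKETING_HINTS):
            return True
    return False

candidate_urls = ["https://advertiser.example"]  # hypothetical input
pollution_pages = [u for u in candidate_urls if hosts_remarketing_script(u)]
```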

Hosting and Loading Pollution Content

The pollution content hosted by the fraudulent publisher should be loaded by the user's browser in a way that is not discernible by either the user or the ad exchange. While there are many ways to fabricate such camouflaged requests, such as CSRF, XSS, and clickjacking, in this chapter we assume the pollution content is loaded through cross-site references issued by hidden HTML iframes. These iframes are located outside the viewing area of the browser or layered underneath other content, and are used to reference and load pollution content. The loading of such content takes place in the background and is completely hidden from the user. Moreover, since approaches for frame-busting are not ubiquitously deployed, simple approaches can be used to hide the frame content from web crawlers.
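
As an illustration of this camouflaged loading, the sketch below generates a polluter page that appends off-screen iframes to the publisher's own content; the pollution URLs and the helper name are hypothetical placeholders.

```python
# Sketch of the camouflaged loading: the polluter page appends iframes
# positioned far outside the viewport, so the browser issues the tracked
# requests while the user sees only the normal page. URLs are placeholders.
POLLUTION_URLS = [
    "https://pollution-site-1.example/",
    "https://pollution-site-2.example/",
]

def polluter_page(content_html):
    """Return the publisher's page with hidden pollution iframes appended."""
    hidden_frames = "".join(
        f'<iframe src="{url}" width="0" height="0" '
        f'style="position:absolute;left:-9999px;top:-9999px;'
        f'visibility:hidden"></iframe>'
        for url in POLLUTION_URLS
    )
    return content_html + hidden_frames
```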

Attack Monetization – CPM and CPC

An important property of the attack is that it can be used to further boost the revenue generated by existing click and impression fraud mechanisms. This can be achieved if the bot master has control over the user's browser such that it can pollute user profiles to maximize the impact of the fraud. When deployed in isolation from existing fraud mechanisms, the attack is most effective for CPM-based ad campaigns. This is because CPM-based campaigns, which are the most common campaigns for display ads [11, 12], provide consistent cash flow to the publisher, regardless of whether visitors click on the potentially unrelated ads. In the rest of the chapter we focus on CPM-based campaigns and assume that the attack is deployed as a standalone attack without deploying additional fraud methods.

2.3.3 Attack Setup

We set up the attack as follows. Instead of driving live traffic, we emulate users browsing the websites with web traces. A few domains from the users' traces are selected as the fraudulent publishers. As we do not have control over these websites, the profile polluter is separated from the fraudulent publisher and is responsible for polluting the emulated user traffic immediately after loading the publisher's page, to approximate a publisher that pollutes his own users. A distributed testbed of 264 nodes spread across the world (using PlanetLab) is used to generate web traffic to ensure location diversity. Since the traffic is emulated from browsers that we control, an ad crawler is used to record all the DoubleClick ads delivered to the emulated users. The recorded ads are analyzed and the revenue is estimated using publicly available CPM index values published by DoubleClick [13]. We also set up our own website as a fraudulent publisher to characterize the effectiveness of the attack.

User Web Traces & Profiles

Our attack setup replays complete web traces from real users to characterize and validate the attack. This is important because the ad revenue is not only impacted by the frequency with which users visit the publisher page but also depends on the user's online interest profile before and after pollution; the pollution impact depends on websites visited prior to pollution, and the duration of the impact depends on websites visited after pollution. We use web traces of real users from a Chrome extension installed by more than 700 users who have been using the extension for 2 years for research purposes. The functionality of the extension was modified to record all the web-page URLs visited by the user for a one-week period (March 10th, 2014 - March 16th, 2014)3. In this time period, we collected a total of 224,855 page visits from 619 unique active users. Our dataset is diverse and consists of users using the extension across the world.

We create multiple copies of each user's profile to load under different pollution settings. The user web traces are replayed to generate profiles that are polluted by forwarding the request to the profile polluter after visiting the fraudulent publisher. We also create clean profiles by replaying user web traces while bypassing the profile polluter. To eliminate the impact of time and location on the distribution of ads, each user's profiles are generated by replaying her web trace at the same time from the same IP address. This provides a seamless approach to measure the extent to which the pollution impacts the type of ads targeted at the user. Thus, for every experiment presented in the following two sections, we record the ads targeted at the user with and without profile pollution.
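
The paired-replay protocol can be summarized by the following sketch; visit() and pollute() are hypothetical stand-ins for the trace emulator and the profile polluter.

```python
# Sketch of the paired-replay protocol. In the real setup both replays
# run at the same time from the same IP address, so that time and
# location are held constant across the clean and polluted copies.
def replay_user(trace, publisher_domains, visit, pollute):
    for url in trace:
        visit(url, profile="clean")     # clean copy: trace replayed unchanged
        visit(url, profile="polluted")  # polluted copy: same visit ...
        # ... plus a camouflaged trip to the polluter after each visit
        # to a fraudulent publisher.
        if any(domain in url for domain in publisher_domains):
            pollute(profile="polluted")
```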

Pollution Content

As described in Section 2.3.2, user profiles are polluted in order to mislead the ad exchange and advertisers from more lucrative verticals into targeting ads at users visiting the fraudulent publisher's web page. We pollute each user by generating camouflaged visits to three websites from the top three most expensive display ad categories of Health, Business and Education. Beyond the top three ad categories, we pick two additional categories of Sports and Shopping to study the attack on less valuable ad categories.

Polluting for Behavioral Targeting. In order to find websites that alter the user's interest profile towards the above mentioned categories, we first filter websites from the corresponding Alexa category that contain the DoubleClick tracking script. For each website in this list, we use the Google Ad Preferences Dashboard [14] to build a map between the websites and categories that are consistent with DoubleClick. Table 2.1 provides the three websites selected for each category.

3IRB approval was granted and users were notified about the type of data collected and the intent of use for research purposes

Table 2.1: The websites we use for polluting users' profiles in the five ad categories.

Google Category | Alexa Category | Re-marketing Pollution Contents | Behavioral Pollution Contents
Health | Health | eyemagic.net | intensemuscle.com, bimabazaar.com, allacqueredup.com
Business | Business | incorporate.com | bloomberg.com/news/insurance/, bloomberg.com/news/finance/, bloomberg.com/news/industries
Education | Reference | asuonline.asu.edu | universando.com, campusleader.com, graphs.net
Shopping | Shopping | teleflora.com | alterationsneeded.com, modernsalon.com, viloux.com
Sports | Sports | moenormangolf.com | bloguin.com, retospadel.com, golftechnic.com

Polluting for Re-Marketing Targeting. Similar to the previous approach, we filter websites in the Alexa category that host the re-marketing script from DoubleClick. In addition to verifying that the category matches, we also verify that the re-marketing campaign is active. Table 2.1 lists the websites used for re-marketing pollution for each category.

Publisher Web Page

The complete attack is validated on two different types of publisher web pages.

Live Websites. We validate the attack on existing live publishers whose ad revenue is impacted by the dynamic content they host as well as by pre-existing preferences about the types of ads that are allowed to be targeted. To this end, we select the top 19 most visited websites that host DoubleClick ads from the user web traces. Instead of compromising these websites to host pollution content, we set up the profile polluter as a separate entity. When emulating traffic traces, we forward the user to the profile polluter immediately after visiting any one of these 19 websites. We use results from these publishers primarily to estimate the revenue generated by the attack (Section 2.3.5).

Controlled Publisher. In order to form a baseline of the effectiveness of the attack, we set up our own publisher website and sign up with AdSense [9]. The publisher website has two display ad slots (a top banner display ad and a side display ad) and uses the default settings provided by AdSense. Since AdSense requires the website to host some content before approving it, we upload static content that describes the different ad targeting mechanisms available to advertisers. Visiting the web page with a blank profile results in DoubleClick profiling the user with interests belonging to the Computers category. Similar to the above setup, the profile polluter is separated from the controlled publisher.

Trace Emulator and Ad Crawler

A critical component of the attack setup is a distributed infrastructure to emulate web traffic by replaying the traces and recording all the ads delivered to the emulated user.

Trace Emulator. We develop a distributed infrastructure based on the PlanetLab testbed that is able to emulate real user web traffic. The trace emulator consists of a central control server and 264 worker nodes distributed across the world. The server maintains a list of tasks that are fetched by the distributed workers periodically. Given a task containing the URL to visit and a unique user ID, the worker node instance loads one profile of the corresponding user, visits the assigned URL, records all the ads displayed on the web page and the associated metadata, and sends this information back to the central server. The user's profile is updated accordingly after visiting the assigned URL.

Ad Crawler. Collecting measurements about display ads requires the ability to disassemble the elements of a web page, identify ad elements, and associate these with particular categories. Existing ad monitoring and blacklisting tools – AdBlock [15] and Ghostery [16] – work by matching URL patterns embedded in a web page against a set of blacklist patterns, and cannot look deeper into the element and reason about it. The task is made even more difficult by complex DOM structures, deep nesting of elements, and dynamic JavaScript execution, which are found on a large fraction of pages on the Internet today. To address these challenges we extend the PhantomJS headless browser4 to reliably extract the ad elements of a page, identify the actual landing pages for the ad elements, and associate the ads with specific semantic categories. The current implementation of the ad crawler is limited to ads delivered by DoubleClick.

4http://phantomjs.org
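
The worker-side logic of the trace emulator can be sketched as follows. The control-server endpoints (/task, /result) and the PhantomJS crawl script (crawl.js) are hypothetical names standing in for the infrastructure described above.

```python
# Sketch of one worker node in the trace emulator. crawl.js is assumed to
# render a page under the given user's cookie profile and print the
# DoubleClick ad elements it finds as JSON.
import json
import subprocess
import time

import requests

SERVER = "http://control-server.example"  # hypothetical control server

def worker_loop():
    while True:
        task = requests.get(f"{SERVER}/task").json()  # {"url": ..., "user_id": ...}
        if not task:
            time.sleep(30)
            continue
        result = subprocess.run(
            ["phantomjs", "crawl.js", task["url"], task["user_id"]],
            capture_output=True, text=True, timeout=120)
        ads = json.loads(result.stdout or "[]")
        requests.post(f"{SERVER}/result",
                      json={"user_id": task["user_id"], "ads": ads})
```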

Ad Revenue Estimation

The final component of our setup is to estimate the ad revenue the fraudulent publisher generates. We estimate the revenue by using the publicly available report provided by Google that ranks the CPM cost of different ad verticals (categories) and associates with each vertical a relative cost index [13]. In the Google report, the indices for the three most expensive categories of Health, Business and Jobs & Education were 257, 221 and 200, respectively. The least expensive category was Law & Government, with an index of 46. We further manually mapped each of the 13 top-level Alexa categories to one of the ad verticals used in the published Google report. Our revenue estimation analysis always compares the revenue generated by the pollution attack with the baseline revenue computed by running the exact same experiment without pollution.
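
The estimation boils down to mapping each recorded ad's Alexa category to a Google vertical and averaging the corresponding index values, as in the sketch below. Only the four index values quoted above come from the report; the category map shown is a partial, illustrative stand-in for our full manual mapping.

```python
# Sketch of the indexed-CPM estimate.
CPM_INDEX = {
    "Health": 257,
    "Business": 221,
    "Jobs & Education": 200,
    "Law & Government": 46,
}  # subset of the Google report [13]

ALEXA_TO_VERTICAL = {
    "Health": "Health",
    "Business": "Business",
    "Reference": "Jobs & Education",
}  # partial, illustrative mapping

def indexed_cpm(ad_categories):
    """Average CPM index over the Alexa categories of the recorded ads."""
    indices = [CPM_INDEX[ALEXA_TO_VERTICAL[c]] for c in ad_categories
               if ALEXA_TO_VERTICAL.get(c) in CPM_INDEX]
    return sum(indices) / len(indices) if indices else 0.0

def relative_change(polluted_ads, clean_ads):
    """Relative change in indexed CPM of the polluted vs. baseline replay."""
    baseline = indexed_cpm(clean_ads)
    return (indexed_cpm(polluted_ads) - baseline) / baseline if baseline else 0.0
```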

2.3.4 Validation and Effectiveness of the Attack

In this section we evaluate the extent to which profile pollution impacts the ads by deploying the attack on the controlled publisher web page. The primary metric used for the evaluation is the relative change in ads from the desired ad category (behavioral pollution) or domain (re-marketing pollution) with and without pollution. For both user profile sets, the trace emulator first visits every website in a user's trace to ensure that all users have an online interest profile. We then take one set of user profiles and pollute all users only once. Subsequently, users from both sets visit the controlled publisher page once every hour for a duration of 50 hours. For each visit to the publisher web page the ad crawler captures all the ads.
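
A sketch of the metric computation follows, assuming hypothetical crawler output keyed by hour.

```python
# Sketch of the primary metric: per hourly visit, the fraction of captured
# ads in the target category, and its change under pollution. ads_by_hour
# is hypothetical crawler output mapping hour -> list of category labels.
def ad_fraction(ads_by_hour, target_category):
    return {hour: sum(1 for c in ads if c == target_category) / len(ads)
            for hour, ads in ads_by_hour.items() if ads}

def increase_in_ads(polluted, clean, target_category):
    """Increase in ads (pollution - no pollution), as plotted in Figure 2.7."""
    frac_p = ad_fraction(polluted, target_category)
    frac_c = ad_fraction(clean, target_category)
    return {h: frac_p[h] - frac_c.get(h, 0.0) for h in frac_p}
```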

[Figure 2.4 plots, for each of the five pollution categories, the fraction of ads over time (in hours, 0-50) with and without pollution: (a) Sports, (b) Reference, (c) Business, (d) Health, (e) Shopping.]

Figure 2.4: Effectiveness of pollution attacks against re-marketing ad campaigns across different ad categories.

Pollution Using Re-Marketing Campaigns

Figure 2.4 shows, for each pollution category, the fraction of ads received with and without profile pollution across all users. As expected, we observe that only the polluted users receive ads from the category used in the pollution. Surprisingly, we observe that across all categories, re-marketing ads aggressively target users, both in terms of time between the pollution and the first ad shown, and in the number of ads: across all pollution categories users receive ads from the intended advertisers immediately, in the very first visit to the publisher's web page, and approximately 40–50% of all display ads are from these advertisers. We also verified the distribution of ads across users (not shown) and found that all the users received ads from the re-marketing campaigns used for pollution.

While the pollution is highly effective once it is triggered, advertisers may set up specific rules to trigger the campaign that can impact the publisher's ad revenue. First, as seen in Figure 2.4, advertisers can set up time based triggers. For example, the advertiser moenormangolf.com from the Sports category set up the campaign to only run during 8 hours of the day, causing a diurnal pattern in the targeted ads. Alternately, campaigns can be set up with frequency caps or they may be paused by the advertiser. Additionally, the advertiser may set up a more complex control flow of user actions (e.g., went to the homepage, placed things in the cart, but never checked out) and trigger the campaign only when a user completes all the steps. Thus, the primary challenge in effectively exploiting re-marketing campaigns is to select pollution content such that it accounts for specific trigger rules.

Pollution Impact on Ad Categories. Figure 2.5 shows the distribution of ads across the 13 top-level Alexa categories for user profiles that were not polluted. We observe that the ad distribution spans multiple categories as users have diverse online interests. We use this baseline distribution and compute the relative change in the distribution of ad categories after pollution. Figure 2.6 shows the relative change in the ad categories across users, which validates the effectiveness of our pollution attack – there is a clear increase in ads in the polluted category, with a maximum increase of 12% for the Shopping category. Moreover, the pollution manages to increase the number of ads in categories that a user already received prior to pollution. For example, the fraction of ads in the Health category increases from 23% to 31%.

Temporal Impact. Finally, we study the temporal effect of the pollution. Figure 2.7 shows the relative increase in the fraction of ads received from the category used for pollution. We observe that the effect of the pollution is immediate and leads to an increase in ads from the desired category. Moreover, the effect of the pollution persists over the entire time duration of the experiment. This indicates that categories introduced artificially as an effect of the pollution have a lasting influence on the ads received by the user.

[Figure 2.5: Distribution of ads across the 13 top-level Alexa categories.]

[Figure 2.6: Change in the distribution of ads across the 13 top-level Alexa categories after pollution, shown for five pollution categories: (a) Sports, (b) Reference, (c) Business, (d) Health, (e) Shopping.]

Figure 2.7: Percentage increase in ads (pollution - no pollution) from the polluted category, shown for five pollution categories: (a) Sports, (b) Reference, (c) Business, (d) Health, (e) Shopping.

2.3.5 Revenue Estimation for Live Publishers

In this section, we deploy the attack on live publisher websites and estimate the revenue generated by the attack for these publishers. Unlike the controlled publisher setting, there are a number of factors, like the popularity of the publisher website, the content hosted by the publisher, and pre-existing ad preferences, that impact the specific ad revenue. While it is not feasible to explain the specific factors that impact the publisher's revenue, we seek to empirically measure the overall impact of the pollution on the revenue of live publishers.

As described in Section 2.3.3, we select the top 19 most frequently visited websites from the web traces that host DoubleClick ads as the "fraudulent" publishers. When replaying the web traces, every visit to one of these 19 domains is followed by visiting the profile polluter. We emulate these traces four times in parallel for the following four pollution configurations: without pollution, behavioral pollution, re-marketing pollution, and hybrid (both) pollution, using the pollution content shown in Table 2.1. The revenue is estimated using the CPM index [13] data reported by DoubleClick.

[Table 2.2: Details of revenue experiments, showing the top 5 and bottom 5 websites designated as fraudulent publishers, ranked by relative change in indexed CPM under profile pollution. For each site (bleacherreport.com, samanyoluhaber.com, slideshare.net, stern.de, thenation.com, thinkprogress.org, mangahere.com, newyorker.com, download.cnet.com, reliancenetconnect.co.in), the table reports its Alexa global rank, average page views per user per day, number of users, average page views per day, and the percentage change in indexed CPM under behavioral, re-marketing, and hybrid pollution.]

[Figure 2.8: Average indexed CPM across the same top 5 and bottom 5 selected websites, before and after pollution, under no pollution, behavioral pollution, re-marketing pollution, and hybrid pollution.]

[Figure 2.9: Distribution (CDF) of the relative increase in the indexed CPM across the 19 selected websites, for behavioral, re-marketing, and hybrid pollution.]

Aggregate CPM Index Change

Figure 2.9 shows the relative change in the CPM index for the three pollution configurations across the 19 websites. Overall, we find that behavioral pollution is not as effective as re-marketing based pollution; for almost 80% of the websites the change in the indexed CPM is not significant (±5%). On the other hand, re-marketing based pollution does significantly and consistently increase the relative indexed CPM; an increase of 4–120% for about 80% of the domains. To better understand these distributions, Table 2.2 provides the traffic statistics along with the relative change of the CPM index for the top five and bottom five performing domains, ordered by the CPM index with hybrid pollution. Figure 2.8 shows the average indexed CPM for the same 10 websites. We make a number of observations from this data:

Website Ranking and Traffic Patterns. Across the five best and worst performing websites we do not observe any correlation between the website ranking or traffic patterns and the revenue generated by any of the three pollution configurations. This indicates that our attack is able to deceive the ad exchange into targeting high-value ads even on websites that are ranked much lower or have highly varying traffic patterns.

Varying Performance of Behavioral Pollution. We observe that behavioral pollution does not consistently increase the ad revenue for the fraudulent publisher. Among the top five websites listed in Table 2.2, bleacherreport.com, slideshare.net and thenation.com yield a negative or very low increase in the average CPM index. Looking into the logs, we find that the behavioral pollution of the emulated traffic to these websites was ineffective. For example, 83% and 85% of the ads targeted on bleacherreport.com were from a single advertiser, ford.com, before and after behavioral pollution, respectively. Similarly, 100% and 93% of the ads on slideshare.net were from academy.com before and after behavioral pollution, respectively. On the other hand, re-marketing and hybrid pollution for these domains was effective and led to a significant increase in ad revenue. This potentially indicates that these websites have pre-sold their ad inventory, and consequently behavioral pollution was ineffective. However, re-marketing based pollution manages to override this pre-sold ad inventory, potentially because of the higher CPM and CPC costs associated with these ads.

Low Yield Re-Marketing Pollution. As discussed in Section 2.3.4, re-marketing based pollution leads to aggressive targeting of users independent of their online profile. However, we observe that for newyorker.com, download.cnet.com, and reliancenetconnect.co.in all three pollution configurations are ineffective. None of the three domains received ads from the advertisers used for re-marketing pollution, even when users visiting other domains were targeted with the re-marketing ads. Moreover, the behavioral pollution was also ineffective for these domains. For example, on reliancenetconnect.co.in, between 65%-73% of the ads targeted at users before and after pollution (all three pollution types) were automobile related ads from domains like mazdausa.com, avis.com, budget.com and driveamazda.com. This potentially indicates a scenario where the publisher website is explicitly configured to only receive automobile related ads, making the different pollution mechanisms ineffective.

2.4 Countermeasures

In this section we discuss countermeasures and best practices that different entities can adopt in order to mitigate or at least minimize the attack surface.

General Countermeasures. Commonly, websites are not supposed to be framed within another website as part of an iframe [17]. Therefore, using X-Frame-Options or deploying a "frame-busting" method can make it more difficult for the polluter to abuse innocent websites for the purpose of pollution fraud (other methods, such as pop-unders, can still be used, but are easier to detect). A website that allows other websites to embed it should check the browsing context when a user interacts with it. The user's interaction should not be logged if its web page is not loaded in the main frame. Similarly, ad and tracking libraries should check the user's browsing context before logging any data. In addition, web services and applications should implement defenses against CSRF attacks for critical endpoints that may change a user's history. One effective defense is associating a user interaction event with a server-generated unforgeable token (a minimal sketch follows below). Attackers can then no longer inject artificial inputs, because they cannot obtain a valid token.

Advertisers. Advertisers should protect their ad campaigns against pollution attacks by targeting audiences that have very specific interests. This effectively raises the bar for the polluter to find relevant pollution content impacting a large number of users. For example, finding the appropriate pollution content for the category Jobs & Education → Education → Distance Learning may be more difficult compared to finding pollution content for Education. Similarly, a re-marketing campaign that targets users with a specific flow in the website, e.g., users who logged in, placed an item in a cart but did not check out, is more difficult to compromise compared to targeting all users who visited the web page of the advertiser. We note that the downside of such fine-grained audience targeting is that it may reduce the size of the target audience.
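
Returning to the token defense suggested under the general countermeasures, the following is a minimal sketch using Flask; the endpoint names and page content are hypothetical.

```python
# Minimal sketch of the token-based CSRF defense: a history-changing
# request is rejected unless it carries a token that only a page the
# server actually rendered for this session could contain.
import secrets
from flask import Flask, abort, request, session

app = Flask(__name__)
app.secret_key = secrets.token_bytes(32)

@app.get("/watch")
def render_page():
    # Issue an unforgeable per-session token along with the page.
    session["csrf_token"] = secrets.token_urlsafe(32)
    return (f'<form method="post" action="/log_view">'
            f'<input type="hidden" name="token" value="{session["csrf_token"]}">'
            f'<button>mark as watched</button></form>')

@app.post("/log_view")
def log_view():
    # A camouflaged cross-site request cannot read the token, so the
    # artificial input is never recorded in the user's history.
    token = request.form.get("token")
    if not token or token != session.get("csrf_token"):
        abort(403)
    return "view recorded"
```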

Ad Exchange and Ad Networks. Recent work, like ViceROI [18], aims to detect click spam by comparing the revenue per user for a fraudulent publisher with that of a baseline set of ethical publishers. While this approach is limited to catching click spam, ad networks and ad exchanges should deploy similar approaches to detect impression fraud by looking for anomalous changes in the fraudulent publisher's ad revenue. Even though the attacker has control over his ad revenue through configuring the attack settings (e.g., pollution content, ad preference, and the number of polluted users), the deployment of systems like ViceROI could reduce the ad revenue generated from profile pollution.

Ad exchanges like DoubleClick do not check the domain in which the re-marketing script is being executed. Consequently, it is sufficient for the polluter to simply copy the JavaScript provided by the ad exchange. To prevent this, the re-marketing script provided by the ad exchange should be bound to the designated domain, and at run time the script should verify that it indeed runs within the intended domain (a sketch of this binding appears at the end of this section).

A few ad exchanges and ad networks provide users the ability to inspect and modify the inferred online interest profile or opt out of personalized ads [14, 19, 20, 21, 22]. However, users have no visibility into how these profiles are generated or used to serve targeted ads [10]. Ad exchanges and ad networks should provide users easy mechanisms to flag suspicious ads that are not aligned with their real interests. Additionally, ad exchanges should also encourage users to manually adjust their online interests, and explicitly avoid being targeted in some categories. For example, a user might want to disallow all Health related ads. In such a case, a polluter attempting to influence the user's profile with the Health category would lead to no ads from this category being targeted at the user.

A key contributor to the success of the attack is that pollution content immediately impacts the user profile, so the polluter can almost immediately benefit from the attack. The ad networks can increase the delay between page visits and their impact on the user's profile, thus mitigating the impact of the attack by profiling user interests across a large set of websites visited by the user. However, this delay might be in contrast to the ad networks' desire for accurate and timely inference of user interests, especially for re-marketing campaigns.
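
The domain binding proposed above could, for example, be implemented on the exchange side by templating the advertiser's registered domain into the served script, which then refuses to run anywhere else. The sketch below is illustrative; the registry, endpoint, and campaign identifier are assumptions.

```python
# Sketch of the domain binding: the ad exchange templates the advertiser's
# registered domain into the re-marketing script it serves, and the emitted
# JavaScript refuses to tag the user anywhere else.
REGISTERED_DOMAINS = {"cmp-1234": "advertiser.example"}  # hypothetical registry

def remarketing_script(campaign_id):
    domain = REGISTERED_DOMAINS[campaign_id]
    return f"""
(function () {{
  // Run-time checks: only tag the user when the script executes on the
  // designated domain and in the top-level frame.
  if (window.top !== window.self) return;
  if (document.location.hostname !== "{domain}") return;
  new Image().src = "https://exchange.example/tag?c={campaign_id}";
}})();
"""
```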

2.5 Related Work

Content Manipulation. The line of work most closely related to our pollution attack is black-hat search engine optimization (bSEO), which manipulates search engines to promote certain search results. Since bSEO targets the general indexing and ranking process of search engines, the promoted results will be visible to all users. bSEO usually requires sophisticated infrastructure that may consist of hundreds of websites forming a link farm [23]. Building and maintaining these infrastructures requires a considerable amount of resources [24]. By contrast, our pollution attack targets the customized content of individual users and is much easier to launch. On the other hand, techniques that address bSEO are unlikely to be effective against pollution attacks.

Online Advertising Economy and Tracking. The economics of online advertising is discussed in detail in [25], which considers targeting users based on interests as a key difference between traditional and online advertising. More recently, Gill et al. [12] proposed a simple model for capturing the effect of the user profile (or "intent") on the revenue obtained by the ad network and the publishers. Using this model the authors stressed the significance of the user profile in the ecosystem by showing that incorporating mechanisms that block tracking, thereby essentially eliminating targeted advertising, can decrease the overall revenue of ad networks by 75%.

In order to build an accurate user profile, ad networks need to track users as they browse the web. Several recent papers measure the extent to which users are being tracked and targeted by ad networks [26, 27, 10]. Rosner et al. [27] showed that online tracking of users is ubiquitous and covers a large fraction of a user's browsing behavior. Liu et al. [10] focused on Google's DoubleClick network and showed that interest-based targeting is prominent and spans multiple ad categories, with up to 65% of the ad categories received by a user being targeted based on the user's inferred profile. In this work we leverage the strong relation between the user interest profile and the economics of online advertising to propose a method for polluting user interest profiles to increase the publisher's revenue.

Fraud in Online Advertising. Fraud in online advertising and countermeasures against these fraud mechanisms have been the focus of a long line of research efforts [28, 29, 30, 31, 32, 11, 33, 34, 35, 36]. The most common fraudulent activities include those where fraudulent publishers leverage click-spam networks or pay-per-view networks to increase the traffic to their sites, and thus increase their ad revenue. Click-spam networks cause fraudulent clicks on ads in order to increase the income of the publisher or sometimes deplete the budget of the advertiser. A recent study [32], in which the authors conducted a controlled experiment, shows that click-spam attacks account for 10–25% of the clicks, highlighting the prominence of such attacks. In a subsequent study, the authors used these results and presented a system [18] that ad networks can use for catching click-spam in search ad networks.

Different from click-spam networks are pay-per-view networks, which artificially increase the number of ad impressions of fraudulent publishers by framing the publisher's website within other websites in a camouflaged fashion. Fraudulent activities using pay-per-view networks typically result in impressions that are registered on the camouflaged pages without "genuine user interest", i.e., invalid traffic generation. A recent study [11] has shown that a pay-per-view network can generate hundreds of millions of fraudulent impressions per day.

Existing online advertising frauds focus solely on increasing the volume of ad clicks or impressions and have largely ignored the impact of user profiles. Our attack complements these existing fraud mechanisms by enabling the publisher to further boost the revenue obtained by participating in either of the networks. Compared to existing fraudulent activities, which are susceptible to traffic analysis [37, 11], our attack is more resilient against current fraud detection methods.

2.6 Summary

We have presented a new class of attacks that pollute a user's profile in a third-party application (service) by embedding content of the third-party application. The attacks exploit the role played by a user's profile in the personalization system of a third-party service, and leverage novel mechanisms to pollute a user's profile to promote certain content on a third-party service. We showed that the profile pollution based attacks are robust against diverse browsing patterns and online interests of users. The attacks are effective in deceiving popular websites into serving content of the attacker's choice and drawing higher-paying ads, resulting in a significant increase in ad revenue.

CHAPTER 3
INFERRING USER PROFILE WITH PERSONALIZED MOBILE IN-APP ADS

3.1 Motivation

In-app advertising allows mobile application developers to generate revenue despite publishing their work for free. As in traditional web-based advertising, personalization improves the effectiveness of in-app advertising (and thus increases the revenue earned by app developers). It is well understood that such personalization is only possible if certain user information (e.g., interests, demographic information) is available to the party that serves advertisements, and thus privacy leakage is always a concern. While ad personalization has been well studied for the web, relatively little research explores mobile ad personalization in terms of what user information is being collected. We believe research focused on mobile ad personalization is a significant pursuit for the following reasons: 1) Mobile devices are a lot more intimate to users; they are carried around at all times and are being used more and more for sensitive operations like personal communications, dating, banking, etc. Therefore, privacy concerns regarding what information is collected for ad personalization are more serious. 2) Unlike in-browser advertising, where the advertisement content is strictly isolated from the rest of the displayed page by the well-known "Same-origin policy", in-app advertising operates in a new and less understood environment. Thus, in this chapter we try to answer the following two questions, which will be of great interest not only to privacy-conscious users of mobile applications, but also to advertisers who try to target specific audience groups:

1. To what degree are in-app advertisements personalized to target different attributes of a user (e.g. interest, demographics)?

2. How much can an app learn about a user by observing personalized advertisements?

To achieve our goals, we collected ground truth demographic data from more than two hundred real users and tested the correlation between the demographic data and the advertisements observed by each volunteer user. This correlation allows us to establish that certain advertisements are statistically more likely to be shown to users of one demographic group than another. We also used the data collected to train models to predict a user's interest/demographic information based on the advertisements he/she receives. The accuracy of the generated model indicates how much an ad-hosting app can learn about the user by merely observing the personalized advertisements received.

The work presented in this chapter marks a significant improvement in the methodology for studying mobile ad personalization as well as an extension in scope for such studies. Specifically, our work is the first that is based on ground truth data collected from real users, while prior work [38, 39] mostly studies advertisements received by synthetic users that are expected to have certain interests or belong to certain demographic groups. We argue that results obtained using synthesized users can never be conclusive if we do not know how the studied ad network builds user profiles. For example, while one may try to make a user appear like a middle-aged white male to the advertiser/ad network by downloading and running apps that are predominantly used by the target population, one cannot know for sure that this is actually a signal that the advertiser/ad network is listening for, or whether this signal is strong enough for the advertiser to conclude the user is a middle-aged white male.

Previous studies have shown the possibility of privacy leakage through web advertising [40, 41]. We study whether the environment of in-app personalized mobile ads presents new privacy threats. Specifically, we investigate whether users' demographic profiles can be reconstructed based on the ad content delivered to mobile apps. Ideally, as in the case of in-browser advertising, personalized ads are delivered directly to users in the iframe of ad networks, and thus only users can view the ad content personalized on their information. However, unlike web advertising, mobile in-app advertising allows app developers to access users' personalized ads. These ads might reflect users' real interests and other demographics, because the ads run in the same process space as the app.
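
As an example of the correlation test, one can ask whether a particular ad was shown more often to one demographic group than another using a chi-squared independence test; the counts below are fabricated for illustration, and scipy supplies the test.

```python
# Example: does one particular ad appear significantly more often for one
# gender group than the other? Counts are hypothetical illustration data.
from scipy.stats import chi2_contingency

# Rows: ad shown / ad not shown; columns: male / female.
contingency = [[40, 12],
               [60, 88]]

chi2, p_value, dof, expected = chi2_contingency(contingency)
if p_value < 0.05:
    print("this ad is shown to one gender significantly more often")
```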

To maintain a reasonable scope for our work, we focus on the Android platform and Google's mobile ad network - AdMob. Given the major market share of Android (76.6%) in current mobile shipments and that of AdMob (35%) on Android devices [42, 43], we believe they are very good representatives for studying mobile advertisements. We note that even though we only studied one ad network, the same methodology can be applied to study other ad networks and to determine whether/how much other user data (e.g., sexual orientation) is effectively used in ad personalization and might be leaked to apps hosting those ads. The amount of leaked personal data depends on the degree of personalization of the studied ad network, e.g., adversaries are expected to learn less or even no personal information from a less sophisticated ad network than from Google's ad network.

3.2 Background

In this section, we briefly describe the ecosystem of mobile advertising and its related targeting mechanisms. We also discuss the differences between web advertising and mobile advertising that cause potential privacy leakage on mobile platforms.

3.2.1 Ecosystem of Mobile Advertising

Publishers, advertisers, and ad networks are the three main components of both web and mobile advertising. The main difference between web and mobile advertising is that in web advertising the publishers are owners of websites, while in the mobile case publishers can also be developers of apps who might spare some of the screen real estate for in-app ads (e.g., banners). Advertisers, on the other hand, set up ad campaigns to show their ads to specific users in apps if requested by the publishers. In return, the advertisers pay the publishers for serving their ads, which might potentially generate more transactions later from users if they are interested. In order for the publishers to connect with the advertisers, ad networks are formed. By partnering with millions of publishers, it is possible for the ad networks to integrate user information contributed by participating apps, generate profiles to predict various attributes of the user (e.g., age, gender, income level), and use these profiles to push targeted ad campaigns from advertisers to certain groups of users. Such data collection and profile building is of paramount importance to all three parties, since accurate targeting is crucial for both the effectiveness of ad delivery [2] and increasing publisher revenue [3].

3.2.2 Targeting in Mobile Advertising

Ad networks can monitor app activities, app lists, device models, etc. on mobile devices to automatically collect and infer the users' demographic and interest profiles. Information like demographics, geo-locations, etc. can also be provided by app developers through ad control APIs [44] for better quality targeting, in order to maintain a higher click-through rate of ads, resulting in higher revenue. On major platforms like Android, since most users log in to their Google accounts before starting to use the devices, more personal information can be gathered from these accounts. With all the potential paths for information collection, an ad network is able to use these personal features to create/update user profiles, and push personalized in-app ads to targeted users.

We have studied the interface provided by major ad networks (e.g., Google) for advertisers to specify their target population, and concluded that ad networks generally provide the following three types of targeting: topic targeting, interest targeting, and demographic targeting. Such offerings to the advertisers suggest that the ad networks have at least some estimate for each user regarding the attributes that can be used for targeting.

Topic Targeting. Topic targeting lets advertisers place their mobile ads in apps that are related to the ad content. Simply by selecting one or more ad topics through an ad network interface, advertisers can have the ad network deliver their ads to apps that are relevant. For example, by targeting the "Autos & Vehicles" topic, advertisers can ensure that auto-related ads are pushed to apps that include content about cars or other automotive themes. More precise subtopics, such as "Trucks & SUVs", are also included in the general topic of "Autos & Vehicles" to achieve more effective topic targeting [45].

Interest Targeting. Interest targeting involves reaching users interested in products and services similar to those the advertisers offer, even when they are using apps that are not directly related to the products or services that are advertised. The interest profiles of users can be pre-built by the ad network, based on users' usage patterns on mobile devices, ad categories that they have clicked on before, and so on. Cross-platform correlation of interest profiles might also be necessary for locating the same user across PCs and mobile devices. By having advertisers choose the interest categories, the ad network can advertise to those who have shown interest in the same categories before in their profiles [46].

Demographic Targeting. Advertisers use demographic targeting to deliver ads to users who are within a chosen demographic group. For example, if the advertised business caters to a specific set of users within a particular age range (e.g., younger people like sports cars better), then ads targeted at that group of people are more effective than others [47].

3.2.3 (Lack of) Isolation for In-App Advertising

In web advertising, in-browser ads are usually delivered directly in iframes from ad networks to users [10]. These ads on websites are isolated from the publishers of the websites in terms of their content and code due to the Same Origin Policy (SOP). Thus, usually only the users will be able to view the ad content that is personalized by ad networks based on their collected personal information. In-app advertising, however, has targeted ads running in the same processes as the apps themselves, with the same permission level. Therefore, all app developers are able to access users' personalized ads in their own apps, which can be reverse-engineered to reveal users' real interests and demographics. In fact, Shekhar et al. have also mentioned the necessity of separating processes between an application and its advertising for security purposes [48]. In the current study, we examine the same argument from the privacy perspective.

Even though a recent report shows that Google has considered utilizing the HTTPS protocol to encrypt ad-related traffic [49], we argue that the protection is not useful regarding current privacy concerns. Since encryption can only protect the communication channel between apps and ad networks, the ad content is in plain text by the time received ads are displayed to users. Hence, app developers can still access the targeted ads delivered to their apps in decrypted form.

3.3 Methodology

In this section, we first describe our research problem, then discuss the challenges and outline our approaches.

3.3.1 Goals of Study

We seek to answer two key questions regarding user privacy in personalized mobile advertising:

1. What personal information about real end users can a dominant mobile ad provider such as Google know and use in personalized mobile advertising? Specifically, we want to understand how much mobile ad providers know about real users and how that knowledge regarding real users is exploited for providing personalized ads.

2. Could personalized mobile in-app ads serve as a channel of private user information leakage? More specifically, could an adversary (i.e., a mobile app developer) with access to personalized mobile ads gain any information about real users?

To answer the above questions, we need to clearly define private user information on mobile devices with respect to personalized advertising. We study two classes of personal information in our work:

Interest Profile. A user interest profile models a user's behavior on the web and/or on the phone and is built by online trackers and ad providers. It consists of labels from tens or even hundreds of interest categories. Interest targeting in mobile advertising targets users who match the combination of interest categories specified by the advertiser. For instance, Bob is a fan of video games and he spends 2 hours playing games on his smart phone each day. Bob also reads many articles about sports news through specific applications on his phone. An interest profile like {Games, Sports} well represents Bob's interests. A developer of a new basketball game may ask ad providers like Google to target users that have interest profiles similar to Bob's. Bob may see and click an ad of the basketball game and then become a user of this game.

Demographics. In recent years, ad providers have started to provide a more sophisticated targeting option - demographic targeting - for advertisers. For example, advertisers can target users by gender, age and parental status on Google AdWords [50]. This indicates that ad providers are actively tracking and modeling private personal information other than interests. Google has confidently shown its knowledge of a user's gender, age and parental status in its personalized service. This raises the question of what other personal information online trackers are trying to learn from their users, which concerns both consumers and policy makers. In this study, we examine the following demographic categories: Age, Gender, Education, Income, Ethnicity, Political Affiliation, Religion, Marital Status, and Parental Status.

3.3.2 Challenges and Our Proposed Approaches

In the process of designing our experiment to determine which information is used by Google for personalizing advertisements, we have identified two challenges that any similar experiment will need to overcome.

Triggering personalization based on target attributes. To determine whether certain user information is collected and used for advertisement personalization, we need to devise a method to "provide" the ad network with that piece of user information. For example, if we want to determine whether the user's gender is used for advertisement personalization, we need to make sure that, if gender is indeed used, the ad network has high confidence in its estimate of the user's gender when it is serving ads to a user under observation.

A previous approach to answering this question was to build artificial user profiles by performing certain actions that are believed to be observed by the ad network (e.g., installing certain apps that are predominantly installed by one gender but not the other). We find this approach circular in nature. In particular, if we are trying to determine what user information is used in personalization, we must assume we do not know how the ad network deduces the personal information by observing the user's behavior. In fact, even the set of user behaviors observed/used by the ad network to form the user's profile is generally unknown to us. As such, there is no reliable way for us to say, for example, "if the ad network is providing gender based personalization, it must have concluded that the user under observation is a male after we have performed these operations". In other words, if our experiments based on synthesized user behavior come back negative, we cannot tell whether that is because the studied user attribute is not used for personalization or because of flaws in the profile synthesis process.

In this work, we overcame the above problem by recruiting real users and collecting their demographic information as well as the personalized advertisements observed by these users. By having ground truth from real users, negative results that show no difference in advertisements observed by people of different demographics can be concluded as "the ad network failed to provide advertisement personalization based on that piece of demographic information".

Isolate personalization from non-target attributes. A related problem we face in trying to determine whether certain user information is used in ad personalization is controlling for the other factors that are known to be used in ad personalization. For example, in in-browser advertising, we know that the user's geo-location is an important factor in determining which ads he/she sees. Similarly, if we collect advertisements seen by different users on apps they installed, the difference in the ads they receive may not be based on demographic differences; rather, it may be caused by the categories of ads requested by different apps using the ad control API. From our experience in designing similar experiments, we are certain that such noise must be eliminated if we are to draw any statistically significant conclusion confirming the existence of personalization based on any user attributes that are not previously known to be used for personalization. To this end, we chose to collect personalized advertisements seen by different users with our own tailored app that does not employ any ad control API, and to always send requests for advertisements from our own IP address.

3.3.3 Experiment Design

In this subsection, we present the details of our experiments, which were approved by our Institutional Review Board (IRB).

Subject recruitment. We recruited Android users located in the United States from Amazon Mechanical Turk [51] as subjects to complete surveys regarding the subject’s demographics and interests. Each subject was also required to install our Android app for ad traffic collection. Using the surveys, we were able to gather ground truth about end users for evaluating mobile ad personalization. This eliminates the artifacts of building synthesized user profiles. To ensure users paid attention to the survey, we inserted trick multiple choice questions at random positions. These questions are simple to solve and require no more than basic skills (e.g., 1+1=?). Subjects’ answers to the survey were rejected if they failed to answer those trick questions correctly. The order of the survey questions and of the multiple choice answers was also randomized, for the purpose of removing potential biases from subjects’ responses. Using survey responses along with the ads collected in our app, we were able to analyze the relationship between personal user information and mobile ads.

Ad collection. To isolate the impact of application-based targeting, we designed a blank Android app dedicated to collecting mobile ads. The app initiates 100 ad requests to ad networks without setting any targeting attributes. We selected Google AdMob as our target for ad data collection, due to Google’s dominance in the mobile advertising industry. Since location targeting is prevalent in practice, we established a secure VPN connection to our server from the user’s device to isolate the impact of location. In particular, all Internet

traffic through all the apps installed on the subject’s phone was tunneled through the VPN service provided by our app, but we only collected advertising traffic generated by our app during the data collection phase. To avoid collecting ads intended for applications other than ours, we instructed our Mechanical Turk subjects to keep our app running without operating other apps until the app finished data collection and turned off the VPN tunnel. The entire data collection process took about 2-15 minutes, depending on the network condition of the user’s device. We note that the VPN tunneling employed in the ad collection process makes all ad requests from our blank app originate from our IP addresses. As a result, ads collected by our app are personalized for users at our location, instead of the individual subject’s real location. While this can be considered a kind of noise we inject into our data, we argue that this is also an advantage of our experimental design: we can eliminate the influence of geo-location on ad profiling, and better study how the other aspects of a user’s profile (demographics, interests) affect the ads she/he receives. Furthermore, if geo-location were included in the study, we would need a significantly larger number of volunteers to cancel out its influence. We believe that our experiment had limited impact on the subject’s ad profile for two reasons: 1) we believe a subject’s profile is built upon long term observation of how he/she interacts with his/her phone/apps, and thus any influence we introduce over a short interval with 100 ad requests will be insignificant; and 2) “blank” ad requests (i.e., requests that do not specify any information regarding the intended audience) from a blank app, with no clicks on the received ads, should present the ad network with no useful signals for updating the subject’s profile.

Landing URL extraction. We use the landing URL (the destination URL that a user agent will be redirected to after clicking/touching an ad) as the representation of an ad in our analysis. Specifically, we tried to extract the landing URL directly from the metadata contained in the HTML source of an ad. Through our analysis of the HTML sources

representing ads, we identified several attributes and keywords that helped us extract the landing URL of an ad. The attributes and keywords we used for landing URL extraction were buildRhTextAd, adurl=, final_destination_url, destination_url, destinationUrl, click_url, and go.href. For ads from which we were not able to extract a valid landing URL, we replayed the ads on our server by reopening the HTML source in a desktop web browser and clicking through the ad to reach the final landing page. We also replayed some ads whose extracted landing URL pointed to known ad networks, which usually further relay the user’s visit to the final destination or to other ad networks. In the end, we had to replay only 2,372 out of 39,671 (5.98%) ads in our dataset to get the final landing URL; thus, our approach allowed us to minimize the impact of our study on the mobile ad ecosystem.

Landing URL post processing. The landing URLs extracted using the above approach usually contain a long list of tail attributes, which are used to identify the source of the visit, e.g., creative_id, campaign_id, mobile app name, and ad network. We cluster the landing URLs by removing those tail attributes and grouping them into ad campaigns. Ad campaigns are further merged if they share the same domain and prefix but different resource names, except for Play Store apps, which all start with https://play.google.com/store/apps/details?. A minimal sketch of this normalization step is shown below.
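The following is a minimal sketch of this post-processing step, using only Python’s standard library; the TAIL_ATTRIBUTES set and the example URL are hypothetical stand-ins for the full attribute list and the real ads in our dataset.

from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

# Hypothetical stand-in for the full list of tail attributes removed in the
# study (e.g., creative_id, campaign_id, app name, ad network).
TAIL_ATTRIBUTES = {"creative_id", "campaign_id", "app_name", "ad_network"}

PLAY_STORE_PREFIX = "https://play.google.com/store/apps/details"

def normalize_landing_url(url):
    """Reduce a landing URL to a campaign-level key."""
    parsed = urlparse(url)
    query = parse_qsl(parsed.query)
    if url.startswith(PLAY_STORE_PREFIX):
        # Play Store app ads are grouped by their app id alone.
        kept = [(k, v) for k, v in query if k == "id"]
        return urlunparse(parsed._replace(query=urlencode(kept)))
    # Other ads: drop tail attributes, and merge differing resource names
    # by keeping only the domain and the path prefix.
    kept = [(k, v) for k, v in query if k not in TAIL_ATTRIBUTES]
    prefix = parsed.path.rsplit("/", 1)[0]
    return urlunparse(parsed._replace(path=prefix, query=urlencode(kept)))

# Example with a hypothetical landing URL:
print(normalize_landing_url(
    "https://example.com/promo/shoes?campaign_id=42&creative_id=7&color=red"))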

Ad categorization. Each ad (landing URL) is categorized into one of the 24 root interest categories that Google provides as ad targeting options. If an ad is for a Play Store app (identified by the above URL pattern), we use a script to directly extract its corresponding interest category from its Play Store web page. For the other ads, we rely on Google Ad Preferences [52] for labelling. In particular, we built a Google Ad Preferences crawler using a headless browser, PhantomJS [53]. Starting with a blank browser profile, we visited each ad landing URL 10 consecutive times and crawled the interests that Google generated right afterwards. Such interests were used to label the ads correspondingly. The Yahoo Content Analysis API [54] is also used in cases where the previous method yields no assigned categories on the Google Ad Preferences page. All the Yahoo categories are mapped to their closest representation in the 24 Google root interest categories. For those ads that cannot be automatically categorized by the above steps, we manually assign the category that best matches the content of the ad landing page. We had three persons categorize the mobile ads independently. If the decisions on an ad from the three human labellers conflict, a final label is selected based on mutual agreement.

Ad-containing packages detection. Ad networks employ information like the list of installed apps and the apps a user has used to infer his/her personal information [47]. Specifically, all interactions with an ad network are captured through apps that contain the code of that ad network. Ad networks’ libraries are typically invoked in the same manner by different developers, and the UI elements that are used for rendering ads have the same name or identifier across different applications. This makes it possible to learn which applications on a user’s mobile device include the same ad library. Through the Android PackageManager, we detected all packages the user has installed that include Google’s AdMob library. Such information is helpful in understanding the profiling mechanism of black-box ad networks. We used the list of ad-containing packages as one class of features in our privacy evaluation in Section 3.5.

3.4 Characterization of Mobile Ad Personalization

In this section, we first present details of our dataset collected from 217 real users, then study the correlation between users’ demographic and interest profiles and the personalized mobile ads.

3.4.1 Dataset

We created a Human Intelligence Task (HIT) on Amazon Mechanical Turk for workers located in the United States. Each worker was asked to install our app, which collects 100 ads when executed. After the completion of the data collection process, the worker was asked to fill out a survey regarding his or her interests and demographic information (as defined in Section 3.3.1).

Figure 3.1: Impression distribution of unique ads.

We ran our HIT for 12 days on Amazon Mechanical Turk. In

the first 3 days of our experiment, each worker was compensated $1.00. To attract more workers to our HIT, we increased the reward to $1.25 on the 4th day. In total, we collected survey responses from 284 users and successfully collected 100 ads for each of 217 users. The other 67 users quit before the ad collection was complete, and we discarded the partial data collected. Table 3.1 shows the distribution of the demographics of the 217 users in our dataset. Out of the 39,671 ad impressions we captured, 33,135 ad impressions for 695 unique ads were issued from the 217 users. We observed that some users had run our data collection application multiple times, and for these users, we only used the first 100 ads collected. Figure 3.1 displays the distribution of the 695 unique ads in terms of ad impressions. Surprisingly, two ads (Zoosk - #1 Dating App1: 3,461, Samsung for Business2: 2,602) accounted for 28% of total ad impressions in our dataset. Of the 695 unique ads, 500 (72%) are for applications on the Google Play Store, which generate 9,124 impressions (42%). The remaining 195 (28%) ads contribute 12,576 impressions (58%). Figure 3.2 shows the distribution of the 695 unique ads in terms of number of users. Five ads were delivered to more than 50% of the 217 users. Figure 3.3 gives the number of unique ads displayed to each user, Figure 3.4 breaks down the ad impressions into interest categories, and Figure 3.5 presents the number of users that have received ads in each category.

1https://play.google.com/store/apps/details?id=com.zoosk.zoosk
2http://www.samsung.com/us/business/samsung-for-enterprise/

Table 3.1: Demographics distribution of subjects.

Gender: Female 95 (43.78%), Male 122 (56.22%)
Political Affiliation: Independent 108 (49.77%), Democrat 80 (36.87%), Republican 29 (13.36%)
Parental Status: Not a parent 128 (58.99%), Parent 89 (41.01%)
Income: < $30K 107 (49.31%), $30K-$60K 67 (30.87%), > $60K 43 (19.82%)
Marital Status: Single 124 (57.14%), Married 73 (33.64%), Separated, divorced or widowed 20 (9.22%)
Religion: Atheist 82 (37.79%), Non-Christian 47 (21.66%), Christian 88 (40.55%)
Age: 18-24 45 (20.74%), 25-34 106 (48.85%), 35-44 47 (21.66%), 45-54 14 (6.45%), 55+ 5 (2.30%)
Ethnicity: Other 8 (3.69%), Hispanic 12 (5.53%), Asian 12 (5.53%), African American 23 (10.60%), Caucasian 162 (74.65%)
Education: High school or less 78 (35.94%), Associates 50 (23.04%), Bachelor 71 (32.72%), Master or higher 18 (8.30%)

3.4.2 Interest Profile Based Personalization

In this subsection, we study interest profile based personalization. Specifically, we try to understand how well ad networks learn the real users’ interests.

We use P_user,i to denote the real user interest profile derived from the survey response of user i. The ad interest profile P_ad,i is defined as the set of interest categories of all ads delivered to user i. We use the following three metrics to evaluate the similarity between the real user interest profile and the ad interest profile.

1. Size of an interest profile, which is the number of categories in the profile.

2. Precision, which is defined as |P_user,i ∩ P_ad,i| / |P_ad,i|. Precision represents the fraction of categories in an ad interest profile that match the user’s real interest profile. It measures how precisely ad networks know a user’s real interests.

3. Recall, which is defined as |P_user,i ∩ P_ad,i| / |P_user,i|. Recall is the fraction of categories in the real user interest profile that are present in the ad interest profile. It represents the ad network’s coverage of the user’s real interests.

Figure 3.2: User distribution of unique ads.

Figure 3.3: Number of unique ads of each user.
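As a concrete illustration of these three metrics, the following is a minimal sketch in Python, treating each profile as a set of Google’s root interest categories; the example profiles are hypothetical.

def profile_metrics(user_profile, ad_profile):
    """Compute the three similarity metrics for one user.

    user_profile -- set of interest categories from the survey (P_user,i)
    ad_profile   -- set of interest categories of ads shown to the user (P_ad,i)
    """
    overlap = user_profile & ad_profile
    return {
        "user_size": len(user_profile),
        "ad_size": len(ad_profile),
        # Fraction of the ad profile that matches real interests.
        "precision": len(overlap) / len(ad_profile) if ad_profile else 0.0,
        # Fraction of real interests covered by the ad profile.
        "recall": len(overlap) / len(user_profile) if user_profile else 0.0,
    }

# Hypothetical example profiles drawn from the 24 root categories:
print(profile_metrics({"Games", "Sports", "Travel"}, {"Games", "Finance"}))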

We first show the sizes of the real user interest profile and the ad interest profile for each user in Figure 3.6. The sizes of the two interest profiles vary significantly across users, indicating a diverse distribution of user interests. As we can see in Figure 3.6, there is no clear correlation between the sizes of the two profiles, suggesting that the size of an interest profile is not a good metric for evaluating the similarity between the two profiles. The distributions of Precision and Recall are shown in Figure 3.7 and Figure 3.8, respectively. For over 79% of the users, at least 21% of the categories in the ad interest profiles are correct. For 11% of the users, at least 83% of the categories in the ad interest profiles are correct. In terms of Recall, Google could cover at least half of the real user interests for 60% of the users. The results demonstrate that ad networks like Google can build accurate interest profiles of mobile users and use the profiles built for personalizing mobile in-app advertisements. To understand how many ads are personalized based on real user interests, we further

looked for ads that match real user interests, which we refer to as precise ads. Figure 3.9 displays the number of precise ad impressions of the users. The result suggests that Google is actively personalizing a large fraction of its ad deliveries: for 41% of users, more than 57% of their ad impressions match their real interests.

Figure 3.4: Number of ad impressions in interest categories.

Figure 3.5: Number of users in interest categories.

Figure 3.6: Number of interest categories in real user interest profile and ad interest profile.

Figure 3.7: Precision distribution of user profiles.

Summary. Our analysis shows that mobile ads are highly personalized based on user interests. The ad interest profiles derived from observing close to 100 ad impressions match real users’ interest profiles with good precision and recall. More than 83% of the categories in the ad interest profiles are correct for 11% of the users, and more than 50% of real user interest categories are covered in the ad interest profiles for 60% of the users. We further find that a large fraction of mobile ads match real user interests: more than 57% of ad impressions fit the users’ real interests for 41% of the users.

3.4.3 Demographics Based Personalization

As discussed in Section 3.2, since ad networks allow advertisers to target their ads towards specific demographic groups, we strongly believe the ad networks already have profiles that capture various demographic information of the users. In this subsection, we seek to quantify to what extent real users’ demographics may have been used for ad personalization. Note that, as shown in [50, 47], gender, age and parental status are the only 3 demographic categories that Google explicitly allows advertisers to use for targeting purposes. Thus our observation of strong correlations between ads and other categories of demographic information may not be the result of explicit ad targeting. However, for the sake of brevity, in the following discussion we will attribute strong correlation between ads and demographic information to (possibly unintended) personalization. We grouped the 217 users into different demographic sets for each demographic category.

Figure 3.8: Recall distribution of user profiles.

Figure 3.9: Number of precise ad impressions of users.

We employed statistical tests to determine whether an ad is correlated with a given demographic category. Specifically, we counted the number of times an ad is shown to users in different demographic groups. Then the ad was tested for independence from the demographic category using Pearson’s chi-squared test. We excluded ads with an expected number of impressions fewer than 5 for any demographic group, which is a common practice when applying Pearson’s chi-squared test. The null hypothesis is that the ad being tested is independent of the demographic category. We set the significance level for our tests to 0.005. If the p-value for an ad is less than the significance level, we reject the null hypothesis and label the ad as personalized based on (correlated with) the demographic category under test. The number of unique personalized ads in each demographic category is presented in Figure 3.10. It is not surprising that many ads target users by gender. For example, the ad for the game Game of War - Fire Age3 was shown to 66 males (70%) for 614 impressions (64%), while only 28 females (30%) received the remaining 342 impressions (36%). On

3https://play.google.com/store/apps/details?id=com.machinezone.gow

the other hand, the ad impressions of the game Cookie Jam4 are dominated by female users: 182 impressions (96%) were shown to 13 females, while 4 males shared the remaining 7 impressions (4%). There are even 33 ads that were shown exclusively to one gender. Parental status is the second most personalized demographic category in our result. Ads of social applications and websites like Zoosk - #1 Dating App and Facebook5 lean toward users who are not parents: 2,312 ad impressions (67%) of Zoosk and 403 ad impressions (70%) of Facebook were shown to non-parent users. Interestingly, for marital status, we found that ads of social applications and websites also show a preference for non-married groups. We are surprised that there are many ads that depend on users’ income level. In our dataset we found that ads of many games (e.g., FINAL FANTASY Record Keeper6: 71%, Cookie Jam: 69%, League of Angels - Fire Raiders7: 67%, World Series of Poker - WSOP8: 67%) are shown more often to users in our low income group (with annual gross income below $30,000). We could not find any income targeting option on Google AdWords or Google AdMob for advertisers, so we do not think advertisers on Google are currently explicitly targeting users by income. The result suggests that Google may tailor ad deliveries based on users’ income as a side effect of personalization algorithms that aim to further increase click-through rate or conversion rate, although we cannot prove this for sure. However, the practice of ad syndication makes it possible that those impressions were purchased through other ad networks that offer income targeting and other demographics targeting [55]. It is also possible that the correlation with income we observed is a result of income’s correlation with age, parental status, or other demographic information. For example, in our dataset older people who have children generally have higher incomes. Our statistical test suggests 58 unique ads are correlated with users’ religion. However,

4https://play.google.com/store/apps/details?id=air.com.sgn.cookiejam.gp
5https://www.facebook.com/r.php
6https://play.google.com/store/apps/details?id=com.dena.west.FFRK
7https://play.google.com/store/apps/details?id=com.gtarcade.loa.ph
8https://play.google.com/store/apps/details?id=com.playtika.wsop.gp

we found only one ad, Peace With God9, which is clearly related to religion; 57 impressions (57%) of this ad were shown to Christians. We carefully conjecture that the dependencies of the other 57 ads may be a result of religion’s correlation with other demographic categories. Similarly, we could not find any evidence of explicit targeting by political affiliation, and we cannot explain the correlations of ads with users’ political affiliation due to a lack of insight into ad networks’ secret personalization algorithms.

9http://peacewithgod.net/mobile/

For age, education and ethnicity, we observed personalization to lower degrees in terms of the number of correlated ads. However, Figure 3.11 gives a different view of demographics based personalization: it presents the number of impressions of the personalized ads in each demographic category. Except for gender, all demographic categories have a number of personalized ad impressions greater than 10,000 (46%). Although there are fewer personalized ads in some demographic categories, the effects of the personalized ads in these categories are not significantly lower than those of personalized ads in other categories.

Figure 3.10: Number of unique ads that are personalized based on demographics.

Figure 3.11: Number of ad impressions that are personalized based on demographics.

We further quantify the impact of demographics based personalization on individual users. Figure 3.12 and Figure 3.13 show the distribution of demographically personalized ads and demographically personalized ad impressions across users, in terms of the number of personalized ads and the number of impressions they receive, respectively. 76% of the users have received at least 10 ads that are personalized based on demographics. Surprisingly, at least 73 out of the 100 ad impressions are personalized based on demographics for 92% of the users. Note that the values derived here include personalized ads and impressions displayed to demographic groups that are not the primary targets of the ads.

Summary. Using statistical tests, we found ads that are personalized based on (correlated with) demographics. Gender is the demographic category in which we observed the highest number of unique personalized ads. Personalization may be an explicit targeting option expressed by advertisers (age, gender and parental status), or it may be the result of an ad network’s proprietary personalization algorithms (income, religion, etc.). Our results in Section 3.5 will shed more light on our observations about demographics based personalization.

We also found that demographics based personalization is prevalent in mobile advertising in practice: 76% of our users have received at least 10 demographically personalized ads, and more than 73% of the ad impressions of 92% of the users are demographically personalized. Ads that are delivered exclusively to certain demographic groups are highly indicative of a real user’s personal information. Together with non-exclusive ads that are correlated with some demographic categories, those ads may be a good representation of a real user’s demographic profile. This creates a great opportunity for adversaries to learn the private personal information of real users. As we will demonstrate in Section 3.5, personalized ads can serve as a new channel for privacy leakage on mobile devices.

Figure 3.12: Number of unique ads that are personalized based on demographics across users.

Figure 3.13: Number of ad impressions that are personalized based on demographics across users.
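To make the per-ad independence test used in this subsection concrete, below is a minimal sketch assuming SciPy; the exact contingency-table layout and the example counts are our assumptions, not taken from the dissertation’s code.

from scipy.stats import chi2_contingency

ALPHA = 0.005  # significance level used throughout this subsection

def is_personalized(ad_impressions, other_impressions):
    """Pearson chi-squared independence test for one ad.

    ad_impressions[g]    -- impressions of this ad shown to demographic group g
    other_impressions[g] -- impressions of all other ads shown to group g
    The 2 x k contingency-table layout here is an assumption of this sketch.
    """
    chi2, p_value, dof, expected = chi2_contingency([ad_impressions,
                                                     other_impressions])
    # Ads whose expected impression count is below 5 for any group were
    # excluded from the analysis, as the chi-squared approximation is
    # unreliable there.
    if (expected < 5).any():
        return False
    return p_value < ALPHA

# Hypothetical counts for (female, male) groups of one ad:
print(is_personalized([342, 614], [15000, 14000]))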

3.4.4 Comparison with Previous Studies

Previous studies showed that demographic information of users was not commonly used for mobile in-app ad personalization (it was only found on Google’s ad network), and that the user’s geo-location played a significantly more important role than his/her interests/demographics in determining which in-app advertisements he/she receives [38]. In contrast, our results illustrate that both demographic and interest profiles of users have a statistically significant impact on how in-app ads are selected for the same ad network studied (we did not study geo-location, as discussed in Section 3.3). Some explanations for the discrepancy between the results across studies are as follows. First of all, what [38] measured to show the significance of demographic information in ad personalization is very different from what we are measuring. In particular, most of the results in [38] used keywords/ad-control attributes in ad requests generated by applications to

measure how often demographics based ad personalization is requested by apps. However, in this chapter, we studied how often one’s demographic information strongly correlates with the ads he/she receives and used it as an indication of successful demographics-based ad personalization. In other words, [38] measured personalization requested by the app developers, while we looked for evidence of successful personalization performed by the ad network, which may have a significantly better ability to profile a user than app developers. Secondly, instead of using synthesized user profiles to harvest mobile in-app ads as in the previous studies, we collected ads using real user profiles, which led us to observe more ads that are correlated with users’ interests and demographics. We believe that authentic user profiles generate the right signals to trigger ad personalization, while synthesized user profiles might not. In addition, we studied more complete demographic profiles of users than previous studies did, which enabled us to discover new types of demographics based targeting that were not observable in previous studies. We believe that our findings complement those from previous studies and help the research community better understand mobile in-app ad personalization.

3.5 Privacy Leakage through Personalized Mobile Ads

As shown in Section 3.4, mobile ads are highly personalized based on user interests and demographics. As a result, a set of ads collected from a user’s device can be seen as an accurate representation of that user’s real personal information, potentially including sensitive data. We have already shown in our real user study that the ad interest profiles inferred from 100 ad impressions closely match real user interest profiles. It is also possible to infer a user’s demographics from personalized mobile ads, as we will demonstrate in the following subsections. The rest of this section is organized as follows: we first discuss the feasibility of privacy leakage through personalized mobile ads in Section 3.5.1; our experiment setup and definition of evaluation metrics are presented in Section 3.5.2; in Section 3.5.3, we present the evaluation results; finally, Section 3.5.4 discusses the privacy

implications of our experiment.

3.5.1 Technical Feasibility

Previous studies have shown the possibility of reconstructing user interest profiles in web advertising under the threat model that the adversary can eavesdrop on a victim’s unencrypted ad traffic [40]. To provide stronger security in serving online advertisements, the online advertising industry is taking steps towards enhancing the security of ad transmission through the HTTPS protocol [49, 56]. The adoption of HTTPS in ad serving could certainly defeat the above attack in web advertising. However, we argue that in the scenario of personalized mobile advertising, the threat to user privacy is much greater even with the protection that HTTPS provides. In contrast with web advertising, where the personalized ad contents are protected from publishers and other third parties by the Same Origin Policy, there is no isolation of personalized ad contents from the application developers on a mobile platform such as Android. As such, an adversary does not need to sniff the ad traffic of a victim mobile device user. Even when HTTPS is enforced, any host application can still read the ad contents displayed within it, regardless of whether encryption is enabled during data transmission. The ability to observe personalized ads on mobile devices opens a new attack vector for private personal information leakage, which we will demonstrate next.

3.5.2 Demographics Learning from Personalized Mobile Ads

In this section, we will try to determine if the same can be done for the user’s demographic information, which is potentially more sensitive than the user’s interests. In particular, we applied machine learning algorithms to build models for predicting a user’s demographic information (for all studied categories) based on the ads he/she has seen, and evaluated the accuracy of the generated models. We use the combination of the number of impressions of ads that are correlated with each

demographic category and the list of installed apps that contain the Google AdMob library as features. Each sample is labeled with one class in the corresponding demographic category according to the survey response. We tested a set of basic classification algorithms (Decision Tree, Logistic Regression, Multinomial Naive Bayes, K-Nearest Neighbors, Random Forest, SVM) to estimate our ability to learn users’ demographic information. A dummy classifier that predicts by randomly guessing was used as the baseline classifier for comparison. The classifiers are implemented using the scikit-learn package [57] of Python. All classifiers are evaluated with 5-fold cross validation to avoid overfitting. Specifically, the 217 samples are randomly divided into 5 different sets (folds), and for each fold, the other 4 folds of samples are used as the training set to train the model. The resulting model is then validated using the remaining fold as the test set. For the sake of fairness, all classification algorithms are cross-validated using the same division of 5 folds. We used the accuracy of the prediction as the metric for measuring the severity of privacy leakage through personalized mobile ads. We define the accuracy of a classification model as the number of accurate predictions divided by the number of all predictions. Note that one prediction is accurate only when the predicted class is exactly the same as the label. Thus for binary classification problems (e.g., gender and parental status) the dummy classifier has an accuracy of 50%. For multi-class classification problems, which are harder than binary classification problems, the accuracy of the dummy classifier is 1 divided by the number of possible classes. We report the cross-validated accuracy (the mean accuracy of the 5 validations) as the accuracy of one classifier. A point worth emphasizing is that in a perfectly safe/privacy-preserving system, the adversary should have no advantage in knowing victims’ personal information, i.e., the adversary cannot have better accuracy than that obtained from tossing coins. Thus if the accuracy of an adversary’s model is significantly higher than that of the dummy classifier, it suggests that the adversary has a significant advantage in learning victims’ personal information. Our goal is to understand the possibility of privacy leakage in personalized mobile ads,

thus any result that is above the baseline accuracy (the accuracy of the dummy classifier) should be considered a potential source of privacy leakage. We present our findings in the next part of this section.
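The following is a minimal sketch of this evaluation protocol, assuming scikit-learn; the feature matrix and labels are hypothetical placeholders for the per-user ad and package features described above.

import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC

# Hypothetical placeholders: one row per subject, combining counts of
# correlated ad impressions with indicators for AdMob-containing packages.
rng = np.random.default_rng(0)
X = rng.random((217, 50))
y = rng.integers(0, 2, size=217)  # e.g., gender labels from the survey

# The same 5-fold division is reused for every classifier, for fairness.
folds = KFold(n_splits=5, shuffle=True, random_state=0)

classifiers = {
    "SVM": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Dummy (random guess)": DummyClassifier(strategy="uniform"),
}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=folds, scoring="accuracy")
    print(f"{name}: cross-validated accuracy = {scores.mean():.2f}")

The augmented dummy baseline introduced later in Section 3.5.3 would correspond to DummyClassifier(strategy="most_frequent"), i.e., always predicting the most popular label in the training folds.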

3.5.3 Evaluation

Table 3.2 lists the accuracy of all the classifiers and the accuracy of the dummy classifier for each demographic category. The cell in bold represents the highest accuracy score in each column. Overall, SVM performs the best in predicting all demographic categories. For all demographic categories, we could find at least one classifier that performs much better than the dummy classifier. This is particularly true for gender, for which three classifiers (Logistic Regression, Multinomial Naive Bayes, and SVM) are able to predict accurately for over 70% of the users. The same three classifiers also perform well for parental status, with accuracies above 65%. Surprisingly, four classifiers are able to accurately predict the ethnicity of more than 70% of users, in contrast with the 20% accuracy of the dummy classifier, but this can be attributed to the distribution of our sample. By comparing with the result in Table 3.1, we find that the distribution of our subjects is highly biased in terms of ethnicity: 74.7% of our subjects are Caucasians, and according to the United States Census Bureau, 72.4% of the U.S. population was Caucasian in 2010 [58]. With this knowledge, anyone is able to predict with an accuracy of around 72.4% on a randomly sampled data set. Thus, we do not claim that the high prediction accuracy of our classifier for ethnicity comes from the advantage of having access to a real user’s personalized ads. High bias in sample distribution (e.g., ethnicity) is known to result in biased classifiers. Unfortunately, we find the distributions of our subjects are biased for age, education, income, political affiliation, religion, and marital status as well, due to the small size of our dataset. To remedy the aforementioned limitation of our data set, we reorganized our data to make it more evenly distributed between different classes (i.e., different values in each studied category). To this end, we merged some of the less popular classes in age, marital

Table 3.2: Accuracy of classifiers of demographic categories.

Classifier           Age   Education  Ethnicity  Gender  Income  Marital  Parental  Political  Religion
Decision Tree        0.51  0.30       0.76       0.64    0.47    0.62     0.62      0.50       0.35
Logistic Regression  0.38  0.37       0.72       0.73    0.45    0.62     0.65      0.50       0.39
Multinomial NB       0.37  0.35       0.61       0.73    0.36    0.49     0.65      0.41       0.43
K-Nearest Neighbors  0.39  0.34       0.75       0.65    0.45    0.47     0.59      0.45       0.40
Random Forest        0.39  0.36       0.68       0.67    0.43    0.59     0.58      0.46       0.41
SVM                  0.49  0.40       0.75       0.74    0.47    0.59     0.66      0.49       0.46
Dummy                0.20  0.25       0.20       0.50    0.33    0.33     0.50      0.33       0.33

status, political affiliation, income and education; the distributions of demographics in the merged classes are shown in Table 3.3. As a result, some of the previous multi-class classification problems are reduced to binary classification problems. We also augmented the dummy classifier by using a majority selection strategy, i.e., it always outputs the most popular label in the training set. Table 3.4 lists the accuracies of all the classification models on the new classification problems. Since the number of classes in the 5 demographic categories was reduced, we observed improvement in the performance of the classifiers. The accuracies of our classifiers for age, income, marital status and political affiliation are better than random guess and the case with prior knowledge of the population distribution. For instance, we could accurately learn the income or marital status of more than 60% of users. The information derived from personalized ads indeed helps one predict a user’s personal information with better accuracy. However, the price of the performance improvement is the coarser granularity of the prediction. For example, the new classifier cannot differentiate people whose annual income is higher than $60K from people who earn less than $60K but more than $30K per year. None of the classifiers performs significantly better than the augmented dummy classifier for education and religion, which suggests the adversary has little advantage for these two categories. In addition to studying the possibility of predicting individual aspects of the user’s demographics, we also try to determine for each user how many of the 9 studied demographic categories the adversary may learn by monitoring his or her personalized ads. We record the accurate demographic predictions of each user in the 5-fold cross validation. Figure 3.14 presents the distribution of the number of correct predictions for demographic categories across users. For 91% of the users, at least 4 demographic categories of the users could be accurately predicted10.

Table 3.3: Reorganized distribution of demographics of subjects.

Age: 18-27 71 (32.72%), 28-33 71 (32.72%), 34+ 75 (34.56%)
Marital Status: Single 124 (57.14%), Not single 93 (42.86%)
Political Affiliation: Independent 108 (49.77%), Non-Independent 109 (50.23%)
Income: < $30K 107 (49.31%), > $30K 110 (50.69%)
Education: High school or less 78 (35.94%), Associates 50 (23.04%), Bachelor or higher 89 (41.02%)

Table 3.4: Accuracy of classifiers of reorganized demographic categories.

Classifier           Age   Education  Ethnicity  Gender  Income  Marital  Parental  Political  Religion
Decision Tree        0.52  0.38       0.75       0.64    0.54    0.60     0.63      0.54       0.38
Logistic Regression  0.54  0.41       0.73       0.72    0.57    0.59     0.64      0.59       0.42
Multinomial NB       0.45  0.44       0.73       0.75    0.62    0.61     0.64      0.59       0.41
K-Nearest Neighbors  0.41  0.37       0.74       0.60    0.52    0.62     0.56      0.54       0.43
Random Forest        0.42  0.40       0.71       0.70    0.51    0.60     0.60      0.58       0.38
SVM                  0.44  0.45       0.75       0.73    0.54    0.63     0.66      0.59       0.43
Dummy                0.33  0.33       0.20       0.50    0.50    0.50     0.50      0.50       0.33
Augmented Dummy      0.35  0.41       0.75       0.56    0.51    0.57     0.59      0.50       0.41

There are even 2 users for whom the predictions for all 9 demographic categories are correct.

Summary of Results. We demonstrated the possibility of leaking a user’s sensitive personal information through personalized mobile ads to third-party app developers. With data from about 200 users, we were able to build classifiers that predict gender with over 70% accuracy and parental status with over 65% accuracy. By balancing the subject distribution in age, income, political affiliation, and marital status, we could predict a user’s corresponding demographic class with significantly better accuracy than random guess and the case with prior knowledge of the population distribution. We were not able to build classification models that have a significant advantage over an augmented dummy classifier for education, ethnicity and religion. We discuss the privacy implications of our results next.

3.5.4 Interpreting our Results

Our results presented in Table 3.2 and Table 3.4 indicate that our ability to predict a user’s gender, age and parental status is significantly higher than our ability to predict other types of demographic information.

10This can be a different set of 4 categories for different users.

Figure 3.14: Number of accurate predictions for demographic categories across users.

On the surface, this is neither surprising nor alarming from a privacy perspective. It is unsurprising because, as seen in [50, 47], gender, age and parental status are the three targeting options offered in Google’s current ad product. Therefore, one can deduce a user’s demographic information in these categories based on what ads they see. Simply put, our results confirmed that Google can deliver on what it is offering advertisers: it can correctly deliver ads to the specified demographic groups. One can also argue that the privacy concern caused by leaking one’s gender, age and parental status is very minimal. However, the real surprise lies in the adversary’s non-trivial gain in his/her ability to predict aspects of the user’s demographics other than age, gender and parental status. Some of the other demographic information (e.g., political beliefs) is deemed so sensitive that Google explicitly stated [59, 60] that it will not even be collected as part of a user’s profile; for the rest, there is no known documentation that suggests Google is using them for personalization. We believe our success in gaining some knowledge of these other aspects of the user’s demographics can be explained by the Federal Trade Commission’s study on data brokers [61], which shows that general non-sensitive information collectively could be used to infer more sensitive information. In particular, it is very possible that there are very strong correlations between one’s age, gender and parental status and his/her other demographic information. For example, parents are much more likely to be married, and older people generally have higher incomes. This highlights the more profound privacy implication of our work.

Privacy Implication. In the traditional web setting, Google can protect its users by

enforcing published policies that prohibit the collection of the more sensitive user demographic information. This is an effective protection since none of Google’s profiling information is leaked to the websites that host advertisements. However, such protection is no longer effective in mobile settings, where any ad-hosting app can observe the personalized advertisement being shown to the user. In this new ad-hosting environment, Google is, to a very large extent, sharing its profile of the user with the app developers and potentially other ad networks, and Google cannot dictate how this shared/leaked information is used. As shown in our results and [61], even the sharing of the most benign user demographic information can have the adverse effect of allowing third parties to gain some sensitive demographic information about the users, and the threat from the privacy leak through this new channel is only going to increase when Google starts using other “benign” user demographic information for ad personalization. Due to the lack of separation between in-app advertisements and the rest of the host app logic, the host app can observe all personalized advertisements without needing any extra permission. Since ad personalization is inherently performed based on the personal information of the user, by revealing to the host app what ad is being displayed to the user, the ad network may be inadvertently leaking some of its collected user information to the app developer. Our study shows that such leaked information can be used to accurately derive some of the user’s demographic information. This is especially true for information like gender, age, and parental status, which are known to be used in ad targeting. In addition, some information that ad networks might not be explicitly collecting or using could also be leaked to the app developer. Our results indicate that, by observing the personalized ads that are served to a user, one can predict the user’s income, political affiliation, and marital status significantly better than random guess. The information thus inferred may then be used to request ads from other higher-paying ad networks. The leaked user information may also be used for price discrimination. For example, the same good could be sold at different prices to users in different income groups. Furthermore, the private information can also be sold or

transferred to other parties. Our results highlight the need for protecting private user data (personalized ads) from unauthorized parties (app developers) on Android. An equivalent of the Same Origin Policy should be provided on Android to isolate personalized ads from the application context to protect user privacy. Proposals like AdSplit [48] and AdDroid [62] are a good starting point for separating ads from applications on Android. Furthermore, before such isolation between in-app advertisements and the host app is established, ad networks should balance the gain in revenue against the risk to the user’s privacy when they decide to personalize ads using more detailed or sensitive user demographic information.

3.6 Discussion

3.6.1 Limitations

The main limitation of our study is the small sample size and the uneven demographic distribution of our data set. We argue that such limitations do not invalidate our results. The ads that are correlated with demographics are selected by applying statistical tests, and since we are using a significance level of 0.005 in our statistical tests, we are 99.5% confident that the correlations are not observed by chance. Similarly, by using 5-fold cross validation for evaluating our ability to learn a user’s demographic information based on the ads he/she receives, our results in Section 3.5 confirm that the threat of leaking sensitive user information through personalized ads is real. Furthermore, we argue that aggressive/malicious app developers or ad networks can achieve significantly better accuracy than what we have shown in Section 3.5 for two reasons: 1) they can invest more resources to obtain better ground truth data, and 2) they can observe ads received by users for a longer period of time (and thus have more highly personalized ads in their data set). In future work, we plan to apply our technique to other ad networks and to attempt to collect ad data for a longer period of time. We will also try to improve our results in Section 3.5 by experimenting with techniques to better clean our data set (e.g., removing users who appear to receive a lot of

non-personalized ads) as well as techniques like multi-task learning to better leverage our advantage in predicting the user’s gender, age and parental status.

3.6.2 Countermeasures

The root cause of the studied privacy leak from personalized ads to the hosting application is the lack of isolation between the ads and the app. Thus, adopting HTTPS to protect the ad traffic will not stop the problem. While previous work [48, 62] highlighted the need for isolating ad libraries largely from the perspective of separating permissions of ad-related code from code of the hosting app, our work in Section 3.5 shows that there is also a need to prevent the hosting app from reading the ad library’s data when that data is derived from the ad network’s private information. As the essential core of the mobile advertising ecosystem, ad networks are responsible for protecting users’ privacy. Since the above ad isolation techniques have not been widely adopted, ad providers should build defense mechanisms into their products to protect users’ privacy. One possible defense could be adding noise or randomness into the personalized results. For example, ad networks could make a larger fraction of their ad deliveries non-personalized or contextual ads instead of maximizing the personalization of every ad impression. A similar technique has been proposed for privacy protection in online searches (e.g., adding noise into a user’s search history) [63], and could make it more difficult for an adversary to learn a user’s personal information, mitigating part (if not all) of the privacy threat we identified in Section 3.5. Besides adding noise to personalized ads, ad networks may also provide coarser grained targeting options for advertisers. For example, instead of enabling advertisers to precisely target users that are 26 years old, ad networks may only provide a range (e.g., 25-34) for targeting. Such approaches may result in coarser granularity of adversary-accessible personal information and decrease the severity of privacy leakage. Google AdMob is already offering ad targeting only for coarse-grained age groups; we encourage other ad networks to

adopt a similar model in their targeting offerings. The idea behind both of the proposed countermeasures is to trade off the quality of ad personalization to limit the degree of privacy leakage through such ads. We cannot expect all ad networks to adopt such an approach because less personalized ads may contribute to a loss in ad revenue. We leave it as an open problem to identify a strategy that can avoid such a trade-off and still work in the current ad-hosting environment (where there is no isolation between the logic/data of the ad library and the main app).
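As an illustration of the first countermeasure, the following is a minimal sketch of probabilistically degrading personalization; the noise rate and function names are hypothetical, and a real deployment would need to tune the rate against the resulting revenue loss.

import random

def select_ad(personalized_ads, contextual_ads, noise_rate=0.3):
    """Probabilistically degrade personalization.

    With probability noise_rate (a hypothetical knob), serve a contextual ad
    chosen independently of the user's profile; otherwise serve a
    personalized ad as usual. The randomness blunts an observer's ability to
    infer the profile from the stream of displayed ads.
    """
    if random.random() < noise_rate:
        return random.choice(contextual_ads)
    return random.choice(personalized_ads)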

3.7 Related Work

Privacy in Online Advertising. The privacy issues related to online advertising have been the focus of quite a number of studies. For example, Roesner et al. [27] showed the prevalence of third-party web tracking and designed a browser extension for defending against social widget tracking. Acar et al. [64] studied three advanced web tracking mechanisms - canvas fingerprinting, evercookies and the use of "cookie syncing" in conjunction with evercookies - and suggested that even sophisticated users can face great difficulties in evading tracking. XRay [65] tried to identify how various tracking information is being utilized by targeted ads. Korolova [66] presented attacks that exploit Facebook’s advertising system to infer private user information. Barford et al. [67] found widespread use of ad targeting mechanisms on the web and showed significant correlation between user interest profiles and in-browser ads. Olejnik et al. [41] examined the leakage of users’ browsing histories through Cookie Matching and Real-Time Bidding. Datta et al. [68] explored how user behaviors, Google’s ads and Ad Settings interact in the web setting. Finally, Castelluccia et al. [40] demonstrated an attack very similar to what we presented in Section 3.5, where the adversary tries to reconstruct a user’s interest profile from the unencrypted personalized in-browser ad traffic of synthesized users. These works focus mainly on advertising in web pages. In contrast, our work focused on similar issues in an in-app advertising setting. As we have mentioned in the introduction, not

only is the personalization of in-app advertisements less understood, it also has the potential to raise more serious privacy concerns, due to the intimate nature of mobile phones. Also, compared to [40], which requires the adversary to have the capability to intercept the victim’s network traffic, the attack we presented in Section 3.5 can be carried out by any app on Android.

Privacy-Preserving Advertising. A number of systems have been proposed for privacy-preserving personalization. Privad [69, 70], Adnostic [71], and RePriv [72] achieved this goal by using generalized user profiles and moving ad personalization to the client side. ObliviAd [73] leveraged secure hardware to provide privacy guarantees. Mor et al. [74] designed Bloom cookies for encoding a user’s profile in a privacy-preserving manner. Hardt et al. [75] proposed a differentially private distributed protocol that simultaneously achieves a reasonable level of privacy, efficiency and quality in personalization on smart phones. While these proposals protect users’ private information or identifiers from being leaked to ad networks, they cannot stop the attack in Section 3.5, which only requires observation of the end results of personalization.

Mobile Ad Personalization. Nath [38] presented MAdScope for harvesting in-app ads and characterizing in-app targeted ads. By studying the keywords/ad-control attributes included in ad requests from different apps, the author found that only one of the top ten in-app ad networks is using behavioral targeting, and that demographic information is not commonly used in in-app ads. Book et al. [43, 39] surveyed how app developers used ad control APIs to show ads targeting their presumed user population, and studied mobile ad targeting using simulated user profiles, finding targeting based on user profiles. In contrast to these two studies, we focused on personalization in the absence of any input from the app developers, and instead of using synthesized user profiles, we harvested ads and demographic information from real users. Our results suggested that a large fraction of ad impressions are correlated with demographic information.

Ad Isolation on Android. Recent studies on isolating advertising from applications could

provide solutions to the privacy leakage problem we studied in Section 3.5. AdSplit [48] is an Android extension that allows an app and the ad library to run as separate processes with different permissions. AdDroid [62] also separates privileged advertising functionality from host applications on Android. Roesner et al. [76] designed LayerCake to support cross-principal embedded interfaces on Android.

3.8 Summary

In this chapter, we have studied how user information is utilized by major ad providers for in-app ad personalization on mobile devices and to what extent ad networks know about users’ interest and demographic information. We have also investigated whether in-app advertisements can be a channel for leaking user information collected by ad networks to the apps hosting these advertisements. We demonstrated that a malicious app can indirectly infer potentially sensitive personal information by hosting third-party personalized in-app ads. Specifically, we achieved high prediction accuracy for demographic categories that are explicitly used as targeting options, and showed that information that is not used in serving tailored ads can also be inferred by app developers. These findings illustrate that more protection is needed to shield indirectly exposed user data from malicious first-party mobile applications.

CHAPTER 4
SELECTIVE CONTROL ON WEB TRACKING

4.1 Motivation

Recent advances in online tracking technologies are putting an unprecedented amount of user information into online vendors’ hands. This information is surprisingly vast, from search queries to web browsing behavior to purchase history and more; taken together, it can be used to predict user preferences. As such, online tracking brings many privacy concerns [66, 40]. Unarguably, online vendors are generous investors in and untiring advocates of tracking techniques. Using tracking techniques to acquire more information about users, online vendors can improve their conversion rates and enable their business partners to be more economical with their advertising and marketing budgets. However, investments in tracking techniques can be severely undermined if users disable tracking and refuse to share any of their online footprints with vendors due to privacy concerns. Indeed, we have already seen a significant growth in the adoption of anti-tracking tools and software [77]. To combat users’ fear of privacy invasion, recent marketing research [78, 79] provides online vendors with many suggestions, such as being transparent and simple about privacy policies, building users’ trust in personal information use, and giving users options to manage the use of their shared information. Presumably in response to these suggestions, online vendors have taken active measures. For example, Uber summarizes its privacy policies in an easy-to-read bullet format that does not make users’ eyes glaze over [80]. Both Google and Yahoo provide tools that allow users to manage their privacy sensitive data [81, 82]. While vendors taking care of privacy concerns build users’ trust and increase users’ willingness to share information, there is still a significant number of users who actively limit involuntary sharing of data. The reason is that vendor-provided tools do not prevent

vendors from logging a user’s privacy sensitive visits but rather restrict how vendors use these data, meaning that the protection of such a user’s privacy relies completely on the vendors. Therefore, many users are concerned about whether vendors truly have proper mechanisms in place to protect their privacy sensitive information [83, 84, 85]. An intuitive solution to this conundrum is to provide users with a client side tool that shares the same functions as the tool provided by online vendors. Different from the vendor’s tool, however, the client side solution gives users the power to share only the information that they are comfortable with, without disclosing their privacy sensitive visits. To achieve this, this chapter proposes TrackMeOrNot, a novel anti-tracking mechanism that prevents online vendors from tracking privacy sensitive visits specified by users. More specifically, we augment conventional web browsers with the ability to take a user’s privacy demand and shield her browsing activities based on her need. Existing client side anti-tracking mechanisms completely impede vendors’ tracking, so vendors cannot obtain any information from users. Considering that online vendors also use user data to offer personalized online experiences, these solutions can severely disrupt user experience. To this end, the goal of TrackMeOrNot is to allow users to trade the information that they are comfortable sharing for a better user experience. In order to achieve such a goal, TrackMeOrNot has to address the following two unprecedented challenges. First, TrackMeOrNot needs to correctly understand the semantic meaning of a web page to determine if the visit violates the user’s privacy demand (a positive). With a high false negative rate, TrackMeOrNot may leak many privacy sensitive page visits to vendors; on the other side, TrackMeOrNot may negatively affect the user’s browsing experience if many false positives are triggered. Our first challenge is thus how to build an efficient scheme that accurately determines if disclosing a visit to vendors’ trackers violates a user’s privacy demand. Second, to avoid exposing privacy sensitive page visits to vendors’ trackers, TrackMeOrNot needs to examine if a page visit violates the user’s privacy demand in advance of loading the page into a web browser. Browser interception for protecting

Figure 4.1: Online tracking workflow.

In this chapter, we address the aforementioned challenges as follows. First, we introduce an efficient approach for checking a user's visit against her privacy demand. This approach decouples web page content from tracking party content and excludes unnecessary page rendering. Second, we introduce a lightweight browser interception scheme. It reduces memory consumption by maintaining only one additional browsing context for each browser tab. These two contexts allow a web browser to quickly and smoothly cloak a user's browsing session accordingly.

4.2 Background

An online vendor typically partners with many websites. Using a JavaScript snippet embedded on the partner sites, it places a persistent identifier (e.g., a tracking cookie) in a user's browser. Every time the user visits a vendor's partner site, the JavaScript snippet reports the visit to the vendor along with the persistent identifier associated with the user's browser. As such, the vendor can link a user's browsing activities across multiple sites, build a user profile, and tailor his or her browsing experience. Figure 4.1 illustrates this tracking process. Considering that users may not be comfortable disclosing all of their footprints to vendors, many vendors provide web portals that allow users to opt out of unwanted tracking.

The opt-out option reflects vendors' respect for user privacy and has been accepted by many users. However, a significant number of users are still concerned that vendors may not properly protect their privacy footprints, because opting out only restricts vendors from using the unwanted data rather than preventing them from collecting it. Existing client side anti-tracking mechanisms (e.g., disabling third-party cookies or blocking tracking traffic) can completely impede vendors' tracking, so that trackers cannot collect any information from users. While providing strong privacy protection, such mechanisms jeopardize user experience because vendors can no longer use a user's past footprints to tailor his or her online experience (in addition to losing the associated profits). To the best of our knowledge, none of the existing controls on web tracking can protect users' privacy while preserving usability.

We argue that privacy and usability do not have to be mutually exclusive, and we observe the need for new anti-tracking mechanisms that balance users' demands for privacy and usability. Such anti-tracking mechanisms should meet the following requirements: 1) a user's privacy sensitive browsing activities should not be known to any vendor; and 2) vendors should be able to collect data about browsing activities that the user has explicitly granted. In addition, any anti-tracking mechanism should not noticeably degrade the user's browsing experience, i.e., it should introduce only acceptable overheads in latency, computation, and memory usage.

In this work, we propose a new anti-tracking mechanism that allows users to selectively share their browsing activities with online trackers for a better user experience while shielding their sensitive browsing activities. We emphasize that our goal is not only to develop algorithms that determine whether a visit violates a user-specified privacy need, but also to develop a system that can effectively and efficiently decouple tracking identifiers from privacy sensitive visits. Our proposed anti-tracking mechanism is mainly designed to defend against stateful tracking techniques [26], such as HTTP cookies and supercookies, but it can be further extended with defenses against stateless tracking, e.g., browser fingerprinting [86].

Figure 4.2: The overall workflow of TrackMeOrNot.

4.3 Design

In this section, we present our design of TrackMeOrNot to support fine-grained content-aware control on first-party and third-party web tracking.

4.3.1 Overview

As described in §4.2, users have different and individual privacy needs regarding online web tracking. We therefore design TrackMeOrNot, whose overall workflow is depicted in Figure 4.2. To balance such privacy needs with usability, TrackMeOrNot employs a tracking preference policy, in which users can easily specify their own privacy needs regarding online tracking (§4.3.2). TrackMeOrNot separates a user's sensitive browsing activities from her normal browsing activities by leveraging isolated browsing contexts, where a browsing context consists of all local state (e.g., cache, cookie jar, local storage, etc.) associated with one browser instance. All navigation in TrackMeOrNot starts in an anonymous, transient browsing context. During navigation, TrackMeOrNot analyzes the semantic meaning of the target web content (§4.3.3) and checks the analysis results against the privacy policy. TrackMeOrNot then uses this information to determine which browsing context (persistent or anonymous) the navigation should run in to meet the user's tracking preference policy (§4.3.4). Finally, TrackMeOrNot carries out a seamless browsing context switch based on this decision (§4.3.5).

4.3.2 Tracking Preference Policy

Following general and well-known principles of user privacy protection, TrackMeOrNot allows users to specify their privacy needs on their own. These privacy needs can generally be expressed with the well-known access control concepts of blacklists and whitelists. Leveraging these basic schemes, TrackMeOrNot introduces two different types of protection entities: web content category and domain name. A web content category lets a user specify privacy needs by choosing from a bag of category words. For example, a user can add the category drugs to the category blacklist to hide the fact that she or he has browsed web pages related to drugs. Conversely, drugs can be whitelisted if the user does not consider such content privacy sensitive and wants to receive useful services based on it (e.g., targeted advertisements from third parties). The other protection entity, domain name, allows users to specify their privacy needs using the domain name itself. For example, a user can whitelist cnn.com if she or he believes the contents offered by cnn.com are not privacy sensitive, thereby allowing all third-party trackers loaded on cnn.com to trace her or his visits.

Similar to access control systems, TrackMeOrNot also defines override rules on privacy policies. A user can assign a high priority to a whitelist rule so that TrackMeOrNot disregards lower-priority blacklist rules whenever the whitelist rule matches. If two rules share the same priority, a blacklist rule always takes precedence over a whitelist rule. TrackMeOrNot supports three custom priorities for each policy rule: high, medium, and low, where high and low indicate the highest and lowest priority, respectively.

{
  "category-blacklist": {
    "drugs": "high"
  },
  "category-whitelist": {
    "sports": "low"
  },
  "domain-blacklist": {
    "aaa.com": "medium"
  },
  "domain-whitelist": {
    "bbb.com": "medium"
  },
  "fallback browsing mode": "anonymous"
}

Figure 4.3: Tracking preference policy for TrackMeOrNot.

For example, suppose the user specified drugs as a category blacklist rule with medium priority and cnn.com as a domain whitelist rule with high priority. TrackMeOrNot will first apply the rule for cnn.com and, if it matches, disregard the rule for drugs. By default, all rules have medium priority unless the user explicitly defines one. Furthermore, users can specify a fallback browsing mode, which determines which browsing context is used when none of the rules match. A privacy conscious user may select anonymous as the fallback browsing mode and specify only a few web content categories and domains as whitelist rules. On the other hand, a user who is more concerned with usability (e.g., the quality of personalization services) may use normal as the fallback browsing mode and blacklist only those categories and domains that she or he considers sensitive. We note that how a user defines her or his tracking preference policy impacts the performance of TrackMeOrNot, as we discuss in §4.6. Figure 4.3 depicts an example of a privacy preference policy that TrackMeOrNot accepts. Each protection entity (i.e., either category or domain) has both blacklist and whitelist policies. In each policy, the protection entity and its priority are specified as a key/value pair. The fallback browsing mode is specified as anonymous in this example, meaning that the anonymous browsing context is used as the fallback.
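To make these matching semantics concrete, the following Python sketch illustrates how a checker could combine rule priorities, blacklist precedence, and the fallback mode. It is a minimal illustration of the algorithm described above, not the actual Chromium implementation; the function and variable names are our own.

# Illustrative sketch (ours, not the actual Chromium implementation) of
# TrackMeOrNot-style rule matching: highest priority wins, blacklist
# beats whitelist on ties, and the fallback mode applies otherwise.
PRIORITY = {"high": 2, "medium": 1, "low": 0}

def choose_context(policy, domain, categories):
    """Return "anonymous" or "persistent" for one page visit."""
    matches = []  # (priority, kind) for every rule matching this visit
    for cat, prio in policy.get("category-blacklist", {}).items():
        if cat in categories:
            matches.append((PRIORITY[prio], "blacklist"))
    for cat, prio in policy.get("category-whitelist", {}).items():
        if cat in categories:
            matches.append((PRIORITY[prio], "whitelist"))
    for dom, prio in policy.get("domain-blacklist", {}).items():
        if domain == dom:
            matches.append((PRIORITY[prio], "blacklist"))
    for dom, prio in policy.get("domain-whitelist", {}).items():
        if domain == dom:
            matches.append((PRIORITY[prio], "whitelist"))
    if not matches:
        # No rule matched: fall back to the user's default mode.
        if policy.get("fallback browsing mode") == "anonymous":
            return "anonymous"
        return "persistent"
    # The highest-priority rule wins; at equal priority, a blacklist
    # rule takes precedence over a whitelist rule.
    top = max(p for p, _ in matches)
    kinds = {k for p, k in matches if p == top}
    return "anonymous" if "blacklist" in kinds else "persistent"

With the policy of Figure 4.3, for instance, a page on bbb.com that is classified as drugs is still browsed anonymously, because the high-priority drugs blacklist rule overrides the medium-priority bbb.com whitelist rule.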

4.3.3 Content Analysis Engine

To determine whether a website that a user is visiting matches any blacklist policy, TrackMeOrNot intercepts the web navigation process to analyze the web content that the browser receives and renders. In particular, TrackMeOrNot first intercepts all network responses from the first party, and then runs its built-in machine learning classifiers over them to obtain the representative category of the responses. TrackMeOrNot collects the network responses by intercepting the browser's event signaling the completed fetch of a resource. TrackMeOrNot focuses on collecting responses only from the first party for the following reasons. First, most of the content semantics are delivered by the first party (the third party mostly delivers external data such as images, tracking scripts, or advertisements), and these semantics are what users are concerned about from a privacy point of view. Furthermore, also considering third-party resources would require too much analysis time, because loading third-party resources can take quite a long time; again, third-party resources are mostly external data, which are usually significantly larger than first-party resources.

Once such an event is fired, TrackMeOrNot forwards the corresponding response contents to a content analysis engine. The analysis engine first parses the responses using its own HTML parser. The HTML parser provided by the underlying browser aims at rendering the complete web page from the responses and is not designed to parse incomplete responses. Thus, TrackMeOrNot implements a custom HTML parser that can handle incomplete responses and focuses on extracting text semantics. Specifically, this parser constructs the DOM tree without sending new network requests or evaluating JavaScript and Cascading Style Sheets, neither of which is related to the content semantics. Next, TrackMeOrNot scans the DOM tree and extracts text from it. Non-content text (e.g., navigation links and advertisements) is excluded from the final result, as it is not related to the real content semantics.

TrackMeOrNot then uses pre-built machine learning classifiers to summarize the extracted text. For example, one classifier may predict the likelihood that the current web page contains pornographic content, while another may predict whether the page is about education. The summarized classification results (probabilities) are then sent to the central controller to be checked against the user's tracking preference, which we describe in the next subsection. If the built-in machine learning classifiers are not satisfactory, users could instead select free online classification services to process a navigation request before it is sent. The correct browsing context could then be decided directly by checking the returned results against the user's privacy preference. However, users would have to base their privacy protection on trust in the third-party service providers, and might suffer unpredictable browsing latency. While the content analysis engine processes the content semantics, the page loading process is not blocked, so TrackMeOrNot introduces minimal overhead in page load time when it does not switch the browsing context.

4.3.4 Tracking Preference Checker

Once the content analysis is finished, TrackMeOrNot checks the privacy sensitivity of the target website. In other words, using the tracking preference policies and the target website's content semantics, TrackMeOrNot determines whether it needs to keep using the anonymous browsing context to prevent tracking, or to switch to the persistent browsing context to improve usability. As described in §4.3.2, TrackMeOrNot checks each policy rule in the order of its associated priority. For example, suppose a user specified drugs in the category blacklist as depicted in Figure 4.3, and the content analysis engine determined that the target website has the semantics drugs. In this case, TrackMeOrNot determines that the browsing context does not need to be changed. However, if drugs were specified in the category whitelist, TrackMeOrNot would switch to the persistent browsing context, as we describe in detail in the next section.

4.3.5 Browsing Context Switch

Based on the current browsing context and the decision made after checking the user's tracking preference, TrackMeOrNot may switch to the persistent browsing context for the current navigation request. If a browsing context switch is not needed, TrackMeOrNot simply does nothing, and the session stays in the anonymous browsing context. Otherwise, TrackMeOrNot terminates the pending navigation originated from the initial anonymous browsing context and restarts the navigation with the persistent browsing context. TrackMeOrNot also marks the new navigation request so that it will not be intercepted again. Since the machine learning classifiers cannot be 100% accurate, TrackMeOrNot may sometimes switch to a browsing context that the user does not want; in this case, the user can manually switch back to the correct browsing context.

4.3.6 Discussion

By isolating privacy sensitive browsing activities in anonymous browsing contexts, TrackMeOrNot effectively prevents online vendors from linking those activities with a user's persistent profile. However, a user may still feel unsafe because non-local state features may be used to track the user's browsing activities as well. For example, stateless fingerprinting does not use any local state of a browsing context to identify a browser instance. Such stateless tracking threats have already been addressed in [87]; those solutions could be integrated with TrackMeOrNot by generating a temporary fingerprint when the temporary browsing context is used. IP anonymity technologies [88] could similarly be integrated with TrackMeOrNot to defend against IP address based tracking. Another limitation of TrackMeOrNot is that, in an anonymous browsing context, it cannot extract the content of websites that provide services only after the user has logged in. For example, Facebook does not provide its service if the user does not sign in with her or his account. As a result, the user has to explicitly specify a whitelist rule for Facebook to use its service, or log in to her or his Facebook account in the anonymous browsing context as well. The current design of TrackMeOrNot may also allow a user's browsing activities on websites like Facebook to be tracked by other third-party trackers, if those websites explicitly embed other vendors' tracking scripts. However, such behavior is essentially the same as directly sharing first-party data with third-party trackers, which cannot be prevented if the user has agreed to the websites' terms and conditions explicitly stating that user data will be shared with third parties.

4.4 Implementation

To demonstrate the feasibility and effectiveness of TrackMeOrNot, we implemented a prototype based on one of the most popular modern browsers, Chromium (version 45.0.2426.3). Although we implemented the prototype only on Chromium, the design of TrackMeOrNot is generic and can easily be extended to other browser platforms. In terms of implementation complexity, the Chromium version of TrackMeOrNot adds 1,400 new lines of code (200 LoC for navigation interception, 700 LoC for the content analysis engine, and 500 LoC for tracking preference and browsing context switching) to Chromium. In the rest of this section, we first introduce the general architectural design of the Chromium browser, and then describe how each component of TrackMeOrNot is implemented within Chromium's design.

4.4.1 Chromium Browser’s Architecture

Chromium uses a multi-process architecture [89] that provides process-level isolation between the browser process and the renderer processes. The browser process is Chromium's main process, which manages the UI, I/O, tabs, configuration, etc. A renderer process renders web pages using the WebKit layout engine; multiple renderer processes are associated with and controlled by the browser process. At creation time, the browser process and renderer processes are strictly bound to a certain browsing context (either persistent or temporary), and the stock design does not allow switching between contexts after creation.

4.4.2 Tracking Preference Policy

TrackMeOrNot's tracking preference policy leverages the existing preference system in Chromium [90], which is managed by the browser process. While TrackMeOrNot internally reuses Chromium's preference system, it also provides an interface for users to easily configure their tracking preferences through Chromium's browser settings (i.e., we added a Tracking section to the Content settings option of Chromium's Privacy settings). The tracking preference is loaded into the browser process when Chromium is launched, and is later used to enforce the tracking policy based on the content analysis results.

4.4.3 Content Analysis Engine

The content analysis engine is implemented inside WebKit in the renderer process. We intercept each navigation request inside the WebKit layout engine. In our current design, we intercept only the network responses of first-party contents in the ResourceFetcher, as discussed in §4.3.3. The network requests for third-party contents are temporarily held inside WebKit's DOM parser when a new navigation starts. The hold is released when the renderer receives a notification from the browser process, or is terminated along with the navigation request if the browser process switches to a different renderer process with a different browsing context. If a new renderer process is selected to restart the navigation, the browser process sets a flag in that renderer process so that the new navigation request will not be intercepted again.

To understand the semantic meaning of a web page, TrackMeOrNot needs to parse the corresponding HTML source and extract the content from it. However, the DOM parser of WebKit might be blocked by the held network requests [91], so we cannot extract all the text contents with that parser. To overcome this problem, we implemented a lightweight DOM parser with the libxml2 library [92]. Our DOM parser creates the DOM tree from the HTML source without loading new resources or evaluating JavaScript and CSS. After the DOM tree is built, we extract all the text on the web page. However, the extracted text may contain much non-content text, such as navigation links, copyright disclaimers, or advertisements, which may introduce noise into the classification procedure. We implemented the CETD algorithm [93] inside WebKit to further remove such non-content text.
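To illustrate the idea of resource-free parsing, the Python sketch below performs the analogous steps with lxml (a libxml2 binding); it is only a schematic stand-in for the C++ parser inside WebKit, the function name is ours, and the CETD boilerplate-removal step is elided.

# Schematic Python analogue of the lightweight parser described above.
from lxml import html

def extract_text(partial_html):
    # lxml's HTML parser tolerates incomplete markup and never issues
    # network requests or evaluates JavaScript/CSS.
    tree = html.document_fromstring(partial_html)
    # Script and style bodies are not page content; drop them.
    for node in tree.xpath("//script | //style"):
        node.getparent().remove(node)
    # Collapse the remaining text nodes into one whitespace-normalized
    # string; a CETD-style pass would further strip links and ads.
    return " ".join(tree.text_content().split())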

A key feature of TrackMeOrNot is its use of machine learning algorithms for content analysis. While there are many online classification web services, we decided to implement local classification models because of the privacy and performance concerns discussed in §4.3. Due to the diversity of web pages on the Internet today, we had to train the machine learning models with a large set of diverse web pages. However, to the best of our knowledge, none of the widely used web page classification benchmark data sets [94] meets our requirement: they either contain very few documents that are not diverse enough, or use coarse-grained labels. We eventually selected the well-known AOL query logs [95] for training our classification models. The AOL query logs include 21 million search queries from 650k real users. The logs also contain over 19 million user click-through events on 1.6 million unique web pages, which had been visited by real users and represent diverse human interests. Some sensitive contents that people generally do not want to be tracked on (e.g., pornography) are also included in the data set.

Although the AOL data set is a good fit for our study, unlike other web classification data sets it is not labeled, and it is impractical to manually label all the URLs in the data set. We therefore used the natural language processing API of AlchemyAPI [96] to label the English web pages that were still accessible in September 2015. Each web page was pre-processed with the CETD algorithm to remove non-content text before calling the API. AlchemyAPI classifies each web page into a topic category up to five levels deep [97]. In total, we were able to label 369,179 web pages with the support of AlchemyAPI.1 Since some categories contain many more web pages than others, we further broke down the web pages in these popular categories into their second-level categories. For example, "art and entertainment" web pages were further broken into "books and literature", "movies and TV", "music", "shows and events", "visual art and design", and "arts/other". We also manually selected all (second-level) categories that are generally considered sensitive, i.e., "finance", "health and fitness/{disease, disorders, drugs}", "law, govt and politics/{armed forces, government, law enforcement, legal issues}", and "society/{crime, dating, sex}". Eventually, we obtained 78 categories in our data set. This labeled AOL data set was also used in our evaluation of TrackMeOrNot in §4.6.

For each of the 78 categories, we built a binary classification model using LinearSVC. The term frequency-inverse document frequency (tf-idf) of each term in the extracted text is used as a feature. All the web pages in a given category are positive samples; an equal number of randomly selected web pages from other categories are negative samples. The positive and negative samples were randomly and evenly split into a training set and a test set. We used 10-fold cross validation on the training set to learn the best parameters for each category, and used the test set for evaluation. The distribution of the ROC AUC scores of the 78 categories is shown in Figure 4.4; the minimum, mean, maximum, and standard deviation of the ROC AUC scores are 0.84, 0.92, 0.99, and 0.03, respectively.

The prediction methods of the trained models are implemented inside WebKit. For each navigation request, the extracted text is evaluated with all the classification models. The prediction confidence (the decision function in the case of LinearSVC) of each binary classifier is gathered to select the best label(s) for the web page. In our implementation, the label of the classification model that predicts the positive class with the highest confidence is assigned to the web page.

1 We would like to thank AlchemyAPI for granting us a special academic license for this study.

Figure 4.4: The ROC AUC score distribution of the binary classifiers.

The final prediction result is then sent back to the browser process for the tracking preference check. Since the topic of a web page is subjective, we plan to enable users to use their own customized classifiers in the future.
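As a rough sketch of this training and prediction pipeline, the following scikit-learn code mirrors the description above (tf-idf features, one LinearSVC per category, equal-sized negative samples, and decision-function-based label selection). The function names are ours, and cross-validated hyperparameter tuning is omitted.

# Illustrative per-category training and prediction (parameters are
# placeholders, not the values used in the actual system).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def train_category_classifier(positive_texts, negative_texts):
    # One binary model per category: positives are the pages in the
    # category, negatives an equal number of pages from other categories.
    texts = positive_texts + negative_texts
    labels = [1] * len(positive_texts) + [0] * len(negative_texts)
    model = make_pipeline(TfidfVectorizer(), LinearSVC())
    model.fit(texts, labels)
    return model

def best_label(models, text):
    # Gather each classifier's confidence (LinearSVC's decision function)
    # and assign the label of the most confident positive prediction.
    scores = {label: m.decision_function([text])[0]
              for label, m in models.items()}
    label, score = max(scores.items(), key=lambda kv: kv[1])
    return label if score > 0 else None  # None: no positive prediction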

4.4.4 Tracking Preference Checker

Given the content analysis result returned from the renderer process and the user-specified tracking preference, this phase determines whether the current navigation leads to content that is privacy sensitive to the user. Based on the result, TrackMeOrNot decides whether to switch the browsing context. This decision procedure follows the algorithm described in §4.3.2. If TrackMeOrNot determines that the browsing context should not be switched (i.e., it keeps using the anonymous browsing context), TrackMeOrNot notifies the corresponding renderer process to continue rendering the current web page and performs no additional operations, because the renderer process is already handling the navigation using the anonymous browsing context. On the other hand, if it has to switch to the persistent browsing context, TrackMeOrNot terminates the current navigation request being processed in the renderer process and then switches the browsing context, as we describe in detail in the next subsection.

4.4.5 Seamless Browsing Context Switch

In Chromium, one browser process is bound to one fixed browsing context. As a result, tabs of different browsing contexts (users) cannot be merged within one browser window. A naive implementation of TrackMeOrNot would pop up a new browser window each time the browsing context is switched, which is very annoying to users. To preserve a good user experience, we break the binding between one browsing context and one browser process in Chromium: TrackMeOrNot allows one browser process to be associated with multiple browsing contexts, so that the browser process can create and manage renderer processes with different browsing contexts. All browsing context switches in our implementation are done seamlessly in the same browser tab without annoying the user. To make the browsing context switch apparent to the user, we also switch the theme of the browser UI when switching the browsing context, so that the user can easily tell which browsing context she or he is browsing with. If the user ever wants to use a different browsing context for a navigation, she or he can click a "switch" button on the navigation bar to switch the browsing context manually.

4.5 System Performance Evaluation

In this section, we present the system performance evaluation of TrackMeOrNot. A vanilla build of Chromium (version 45.0.2426.3) was used as the baseline for comparison. We measured the web page load time and peak memory usage of the two browsers to show the impact of TrackMeOrNot on browser performance and user experience. All experiments were run on Debian Jessie (Linux kernel 3.16) with a quad-core 3.20 GHz CPU (Intel Xeon W3565) and 24 GB RAM. We selected the Alexa top 100 US websites as the data set for the system performance evaluation. Specifically, the two browsers (the vanilla Chromium and TrackMeOrNot) sequentially visited the main page of each of the top 100 websites three times. Note that some top websites (e.g., googleusercontent.com) cannot be visited directly from the browser; we selected the next-ranked websites to take their places. We report the average of the three measurements in all of the following results.

Depending on the user's tracking preference and the web page the user is navigating to, TrackMeOrNot can exhibit two different browsing behaviors: 1) staying in the anonymous browsing context; and 2) switching to the persistent browsing context. Intuitively, switching to the persistent browsing context imposes more run-time overhead in terms of both page load time and memory usage. To clearly understand how much overhead is imposed, we configured two user tracking preferences for each web page to instruct TrackMeOrNot either to stay in the current browsing context (Content Analysis Only Configuration, or CAOC) or to switch to a different browsing context (Browsing Context Switching Configuration, or BCSC) in the three visits.

4.5.1 Page load time

Regardless of user tracking preferences, TrackMeOrNot always has to perform content analysis and the tracking preference check, which results in navigation latency. Additionally, TrackMeOrNot may reload the web page using a different browsing context based on the result of the tracking preference check, which further extends the page load time. To precisely measure this extra latency, we implemented internal hooks in various critical event handlers in Chromium (e.g., the page load completion event), each of which reads a nanosecond-precision clock using clock_gettime(). Figure 4.5 and Figure 4.6 present the results of main page navigation under CAOC and BCSC, respectively. The page load time overhead is the ratio of the extra load time introduced by TrackMeOrNot to the complete page load time using the vanilla Chromium. When configured with CAOC, TrackMeOrNot introduced negligible extra latency: 0% to 10% overhead in page load time, with a mean of 1.93% and a standard deviation of 1.81%. We believe the content analysis engine of TrackMeOrNot is efficient enough not to disrupt the user's navigation experience in practice in the CAOC case, because the minimum, average, maximum, and standard deviation of the extra processing time were only 1.00 ms, 39.56 ms, 232.00 ms, and 37.40 ms, respectively. When configured with BCSC, TrackMeOrNot needs to restart the navigation with the persistent browsing context, thereby incurring more page load latency.

Figure 4.5: Page load time when visiting each of the Alexa top 100 US websites under the Content Analysis Only Configuration (CAOC).

From our evaluation, the overhead varied significantly: the minimum, average, median, maximum, and standard deviation of the load time overhead were 1.00%, 16.40%, 13.00%, 54.00%, and 12.92%, respectively. The extra load time (minimum: 51.00 ms, mean: 340.33 ms, median: 228.00 ms, maximum: 2,130.00 ms, standard deviation: 352.60 ms) does not track the relative overhead, because the raw page load time of the vanilla Chromium itself varied widely. For example, the vanilla browser loaded www.google.com in 451 ms, while TrackMeOrNot spent 198 ms resending the request to www.google.com (including content analysis), imposing a 44% (198/451) overhead in page load time. On the other hand, TrackMeOrNot took only 148 ms to switch the browsing context for www.nytimes.com, whose complete page needed 5,539 ms to load in the vanilla browser, resulting in only a 3% (148/5,539) overhead. Considering that half of the main pages of the US top 100 websites needed more than 2,145 ms (median) to load in the vanilla Chromium, we believe the page load delay (median: 228 ms) caused by TrackMeOrNot would not be noticeable to normal users.

To better understand the latency introduced under BCSC, we also recorded the time to load the main frame HTML source using the vanilla Chromium for each web page. Figure 4.7 presents a side-by-side comparison of the main frame HTML source load time of the vanilla Chromium and the extra page load time of TrackMeOrNot.

Figure 4.6: Page load time when visiting each of the Alexa top 100 US websites under the Browsing Context Switching Configuration (BCSC).

Figure 4.7: Main frame HTML source load time (white boxes) vs. extra page load time in TrackMeOrNot (black boxes) when visiting each of the Alexa top 100 US websites.

As is evident from the figure, the extra page load time closely matches the time needed to load the main frame HTML source. When the full page load time is not significantly higher than the main frame HTML source load time, TrackMeOrNot incurs a very high relative overhead in page load time. However, TrackMeOrNot could be enhanced by caching the main frame HTML source loaded in the previous browsing context and sharing the cached source with the new renderer process; no extra request would then need to be sent, which could significantly reduce the page load overhead introduced by TrackMeOrNot. We leave this implementation optimization as future work.


Figure 4.8: Peak memory usage when visiting each of Alexa top 100 US websites under Content Analysis Only Configuration (CAOC).

4.5.2 Memory

TrackMeOrNot may require more memory at run time because it includes 78 classification models (including a large vocabulary of features) in our implementation. In addition, if a browsing context switch is requested, TrackMeOrNot needs to create a new renderer process, which may also increase memory usage. For these reasons, we measured the peak memory usage of TrackMeOrNot and the vanilla Chromium for 15 seconds in each of the three visits to a web page. Figure 4.8 and Figure 4.9 show the peak memory usage of the vanilla Chromium and the extra peak memory usage of TrackMeOrNot when visiting the main pages under the Content Analysis Only Configuration (CAOC) and the Browsing Context Switching Configuration (BCSC), respectively. The memory overhead is the ratio of the extra peak memory usage of TrackMeOrNot to the peak memory usage of the vanilla Chromium. For both configurations, the memory overheads are between -15% and 22%; the means are 3.06% (CAOC) and 1.68% (BCSC), respectively. We observe that TrackMeOrNot sometimes consumed less memory than the vanilla Chromium, which may be because less dynamic content was loaded when TrackMeOrNot visited those web pages; similarly, the extra memory consumed by TrackMeOrNot may also be due to the dynamic nature of web content. Overall, TrackMeOrNot did not require significantly more memory than the vanilla build.


Figure 4.9: Peak memory usage when visiting each of Alexa top 100 US websites under Browsing Context Switching Configuration (BCSC).

4.6 Anti-Tracking Evaluation

As mentioned in §4.3, TrackMeOrNot allows users to specify unwanted page visits based on web content categories. Our implementation of TrackMeOrNot includes 78 classifiers for categorizing the web pages that users visit. TrackMeOrNot compares the output of the classification models with the user's specified needs and switches a browsing session between the browsing contexts accordingly. In this section, we demonstrate how effectively TrackMeOrNot uses the classifiers to conceal users' privacy sensitive visits. We first describe our experimental design, and then evaluate how effectively the classifiers satisfy users' privacy needs.

4.6.1 Experimental Design

To evaluate the effectiveness of TrackMeOrNot at hiding sensitive web browsing activities, such as browsing pornographic web pages and health-related forums, we need to simulate a series of web page visits and examine whether TrackMeOrNot accurately conceals user-specified privacy sensitive visits. As a user may specify any category of visits as privacy sensitive, we need a sequence of page visits that covers the full spectrum of categories. We selected web pages from the AOL data set discussed in §4.4 that had not been used for training any classifier as our evaluation web page corpus, which contains 43,807 English web pages in 78 unique categories.

To examine the effectiveness of TrackMeOrNot at hiding privacy sensitive visits, we also need to know users' browsing privacy needs; in other words, we need to know the page categories in which a user is (or is not) willing to disclose his or her visits to vendors. To obtain these needs, we conducted an online survey through Amazon Mechanical Turk; the survey was approved by the IRB of our institute. We presented all 78 categories to participants and asked them to choose the page categories in which they are not comfortable disclosing their visits. In addition, participants were asked to choose the page categories in which they are comfortable sharing their visits. Ultimately, we collected 145 valid responses. The demographic distribution of the 145 subjects is shown in Table 4.1. Table 4.2 and Table 4.3 show the top categories of web pages that users specified as blacklist rules or whitelist rules, respectively; in other words, they represent the page categories in which users want to hide their browsing activities from, or disclose them to, online vendors.

From the collected questionnaires, we found 14 participants who expressed strong privacy concerns and did not want to share any of their visits with vendors, and 2 participants who stated that they were not concerned with privacy and were willing to offer all of their footprints to vendors. Overall, we obtained 129 unique privacy needs from all the participants. We encode each unique privacy need by converting it into blacklist and whitelist rules as discussed in §4.3.2. We illustrate the number of blacklist and whitelist rules across the 129 unique privacy needs in Figure 4.10. Note that, for those page categories that a user specified neither as "comfortable to reveal" nor as "reluctant to disclose", we convert them into either whitelist rules or blacklist rules by assuming the user had specified persistent or anonymous, respectively, as the fallback browsing context. Thus, we have two different rule sets across the 129 unique privacy needs (see Figure 4.11 and Figure 4.13). Including two special whitelist and blacklist rule pairs that indicate "do not disclose any visits" and "reveal all visits", we configure TrackMeOrNot using the 256 whitelist and blacklist rule pairs shown in Figure 4.11 and Figure 4.13.
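The conversion just described can be sketched as follows; the function below is our own illustration of encoding one survey response into a rule set under a given fallback assumption (all rules defaulting to medium priority), not the exact code used in the experiments.

# Illustrative encoding (ours) of one survey response into blacklist and
# whitelist rules under an assumed fallback browsing context.
def encode_privacy_need(sensitive, comfortable, all_categories, fallback):
    blacklist = {c: "medium" for c in sensitive}
    whitelist = {c: "medium" for c in comfortable}
    unspecified = set(all_categories) - set(sensitive) - set(comfortable)
    if fallback == "persistent":
        # Persistent fallback: unspecified categories become whitelist rules.
        whitelist.update({c: "medium" for c in unspecified})
    else:
        # Anonymous fallback: unspecified categories become blacklist rules.
        blacklist.update({c: "medium" for c in unspecified})
    return {"category-blacklist": blacklist,
            "category-whitelist": whitelist,
            "fallback browsing mode": fallback}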

Table 4.1: Distribution of demographics of survey subjects.

Age:        18-24: 8 (5.52%);  25-34: 58 (40.00%);  35-44: 34 (23.45%);  45-54: 29 (20.00%);  55+: 16 (11.03%)
Gender:     Male: 60 (41.38%);  Female: 85 (58.62%)
Education:  High school or less: 32 (22.07%);  Associates: 25 (17.24%);  Bachelor or higher: 88 (60.69%)

Table 4.2: The top-10 page categories in which users do not want to disclose their visits to vendors.

Category                                    % of votes
society/sex                                 78.67
finance                                     58.67
society/dating                              58.67
health and fitness/disorders                52.67
society/crime                               51.33
law, govt and politics/armed forces         50.00
law, govt and politics/legal issues         50.00
religion and spirituality                   47.33
law, govt and politics/government           47.33
health and fitness/disease                  47.33
law, govt and politics/law enforcement      46.00

Table 4.3: The top-10 page categories in which users are comfortable to share their visits with vendors.

Category                                    % of votes
arts and entertainment                      64.00
food and drink                              54.00
hobbies and interests                       50.67
sports                                      45.33
pets                                        45.33
shopping                                    44.00
travel                                      42.67
home and garden                             42.67
news                                        42.00
education                                   39.33

Then, we simulate visits to the aforementioned 43,807 web pages using TrackMeOrNot, and examine the accuracy of hiding blacklisted visits and disclosing whitelisted visits. In addition, we study the False Negative Rate (FNR) and False Positive Rate (FPR) of TrackMeOrNot. Specifically, a page that needs to be browsed in an anonymous browsing context is a positive sample in our evaluation. The false negative rate is the ratio of the number of false negatives (positives incorrectly predicted as negative) to the number of positives (false negatives plus true positives). The false positive rate is the ratio of the number of false positives (negatives incorrectly predicted as positive) to the number of negatives (false positives plus true negatives). With a high false negative rate, the system would mistakenly disclose many privacy sensitive visits to vendors; with a high false positive rate, the system would hide many visits that the user is comfortable sharing with vendors. We present the evaluation results in the next subsection.
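For concreteness, the short sketch below computes these metrics from per-page decisions; it is our own illustrative helper (names and structure are ours), with a positive sample being a page that should be browsed in the anonymous context.

# Computing the metrics used in this evaluation; `predicted` and `actual`
# are lists of booleans where True means "browse anonymously" (a positive).
def evaluate(predicted, actual):
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum(not p and not a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    return {
        "accuracy": (tp + tn) / len(actual),
        "FNR": fn / (fn + tp),  # sensitive visits leaked to trackers
        "FPR": fp / (fp + tn),  # shareable visits needlessly hidden
    }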

Figure 4.10: The number of whitelist and blacklist rules across the 129 distinct privacy needs.

4.6.2 Evaluation Results

Figure 4.12 and Figure 4.14 show the performance of TrackMeOrNot in terms of accuracy, false positive rate, and false negative rate for the persistent and anonymous fallback tracking preferences, respectively. TrackMeOrNot achieved 0.86 accuracy on average in hiding and disclosing user visits for both the persistent and anonymous fallback tracking preferences. The minimum accuracies were 0.69 and 0.74 for the two settings, respectively, while the maximum accuracy in both settings was 1.00. The results indicate that TrackMeOrNot can effectively cloak user-specified privacy sensitive visits and disclose a certain amount of footprints regardless of the fallback browsing context. We also observed some interesting patterns in the false positive and false negative rates with respect to the number of blacklist rules in the tracking preference.

Figure 4.11: Tracking preferences using the persistent fallback browsing context.

Figure 4.12: Evaluation results on 129 tracking preferences with the persistent fallback browsing context.

As is evident in Figure 4.12 and Figure 4.14, the false positive rates increased when a user specified more page categories as blacklist rules. Conversely, the false negative rates decreased as more categories were blacklisted. The reason behind these patterns is that with more blacklisted categories, TrackMeOrNot relies on more binary classifiers to examine visits, and consequently a web page is more likely to be classified as positive. The average false negative rates of TrackMeOrNot were about 0.17 and 0.12, and the average false positive rates were about 0.24 and 0.36, when using the persistent and anonymous fallback browsing contexts, respectively.

As shown in Figure 4.12 and Figure 4.14, the false negative and false positive rates are complementary. Users concerned more with privacy than with a personalized user experience may configure TrackMeOrNot to achieve a low false negative rate and a high false positive rate by specifying more blacklist rules. In contrast, users may configure TrackMeOrNot with a high false negative rate but a low false positive rate, trading their need for privacy for better usability.

Figure 4.13: Tracking preferences using the anonymous fallback browsing context.

Figure 4.14: Evaluation results on 129 tracking preferences with the anonymous fallback browsing context.

4.7 Related Work

Defense against HTTP cookie tracking. HTTP cookie tracking is a stateful tracking technology. It stores in a user's browser a piece of data (including a unique identifier) set by an online vendor while the user is browsing a website that contains the vendor's content. The cookie is automatically sent to the vendor whenever the user later visits a website containing the vendor's content, and it can be used to analyze the user's web browsing behavior. Tracking cookies are widely adopted by advertising networks for serving "interest-based" or "behaviorally targeted" ads. To stop them from tracking a person's surfing habits, companies and non-profit organizations (e.g., Abine [98], NAI [99], and DAA [100]) implement cookie opt-out mechanisms that enable users to block and prevent advertising networks from installing future tracking cookies. A recent study demonstrated many drawbacks of this approach, including poor usability and unreliability [26]. As an alternative, user self-help anti-tracking tools have been developed [15, 16, 27]. They defend against trackers by blocking HTTP requests to the corresponding vendors. For example, Adblock Plus [15] and Ghostery [16] impede HTTP requests to advertising networks, so browsers cannot report user footprints to them. Both cookie opt-out and self-help anti-tracking tools are designed to defend against HTTP cookie tracking. Considering that a number of online vendors have been discovered using more advanced tracking technologies [101], the effectiveness of these approaches wanes. The anti-tracking mechanism proposed in this chapter impedes not only HTTP cookie tracking but also many other previously known stateful tracking technologies, e.g., supercookies.

Defense against supercookies. Apart from HTTP cookie tracking, another stateful tracking technology is supercookie tracking, in which online vendors encode a globally unique identifier, a supercookie, into a web browser. For example, vendors can abuse the HTML5 local storage feature [102] or Flash Local Shared Objects [103] to store a unique identifier on a user's hard drive for later tracking of the user's digital footprints. To the best of our knowledge, only two approaches can be adopted to impede such advanced tracking technologies. One is private browsing [104], which does not store any local data that could be retrieved at a later date; using private browsing, a user can therefore prevent vendors from using supercookies to track her surfing habits. The other is TrackingFree [105], an anti-tracking browser that impedes supercookie tracking by partitioning a user's visits into multiple isolation units based on URL. With this isolation, online vendors can still store supercookies but cannot correlate a user's browsing activities across websites. One problem with this approach is that the system overhead increases linearly as a user browses more websites, because TrackingFree maintains an isolated browser state for each unique website and the OS needs to allocate new memory for each isolated state. Considering that unexpected memory consumption can jeopardize user experience, our proposed anti-tracking mechanism follows a lightweight design principle that constructs in-memory isolation units based on browser tabs. In addition, our anti-tracking mechanism goes beyond the two aforementioned approaches by allowing users to instruct the web browser to block tracking activities selectively. Besides boosting profits, online vendors use the collected information to personalize user experience; from a usability perspective, our anti-tracking mechanism therefore allows users to enjoy a customized browsing experience without worrying about privacy invasion, while existing defenses restrict data sharing completely and leave users with none of the benefits of personalization.

Defense against stateless fingerprinting. Different from the stateful tracking technologies discussed above, a newer class of web tracking technologies uses stateless information to identify users and report their surfing habits. This class of techniques neither stores nor retrieves data on the user's hard drive; instead, it tracks a user by learning properties of her web browser and forming a unique or nearly unique identifier (i.e., fingerprinting) [86]. To counteract such tracking, anti-tracking technologies focus on making browser fingerprints non-deterministic across multiple browsing sessions [106, 87, 107], which makes it difficult for vendors to link a user's multiple visits. For example, Nikiforakis et al. designed and developed PriVaricator [87], which uses browser fingerprint randomization to break vendors' ability to connect the same fingerprint across multiple visits. Our anti-tracking mechanism is orthogonal to defenses against stateless tracking; in this chapter, we focus on building an anti-tracking mechanism that impedes stateful tracking based on user demand.

4.8 Summary

In this chapter, we have presented TrackMeOrNot, a new tracking control mechanism that allows users to selectively share their online footprints with vendors for a better user experience while shielding privacy sensitive browsing activities from online trackers. TrackMeOrNot provides a user with two browsing contexts, an anonymous context and a persistent context, and the web browser switches the user's browsing session between them based on user-specified privacy needs. We demonstrated how a user can specify his or her privacy needs and employ TrackMeOrNot to surf the web without disclosing privacy sensitive visits. As TrackMeOrNot allows users to selectively disclose their online footprints, users can enjoy a personalized online experience without worrying about privacy. From the perspective of online vendors, TrackMeOrNot may persuade users who are otherwise overly concerned with privacy to share footprints selectively for specific rewards, and vendors may use the shared browsing habits to yield more profits.

CHAPTER 5
MONITORING AND UNDERSTANDING WEB CONTENT ACCESS BEHAVIOR OF JAVASCRIPT

5.1 Motivation

JavaScript is broadly used in the creation of modern web applications. Scripts from third-party providers, e.g., social networks, analytics and user tracking services, and ad networks, are extensively included by web applications to provide a rich, dynamic, and interactive experience to end users. While extending the functionality of web applications, third-party scripts are particularly disconcerting given their flexibility and capability, as well as the wealth of sensitive data present in today's web applications (e.g., credentials, credit card numbers, and email addresses). Third-party JavaScript code is granted unrestricted privileges to access the web page that embeds it. A malicious third-party JavaScript provider can easily steal authorization cookies and confidential user information from the web pages that embed its scripts, or simply alter the content of those pages arbitrarily. Many security conscious developers embed scripts only from reputable third-party providers, such as Google and Facebook. By default, however, web browsers allow additional JavaScript code to be dynamically loaded by any third-party script that is already embedded, if no special restriction, i.e., a Content Security Policy (CSP), is configured by the developer. Even if a developer has set up a CSP rule to load remote scripts only from a list of trusted third-party providers, a permitted script still enjoys full access to the content of the web application, which can still threaten the application and its end users. A web application can be fully controlled by an adversary if a trusted remote provider is compromised. For example, the website of the popular jQuery library was compromised in September 2014 [108], putting millions of websites that embed this library directly from jquery.com at risk.

A lot of prior work has demonstrated the threats that can be posed by a malicious script [109, 110, 111]. Many systems and tools have also been proposed to help limit the functionality of a remote script [112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124]. However, many web developers do not configure security settings for their web applications, because many security mechanisms are only available as prototypes and are difficult for developers to learn. More importantly, many developers believe the benefit of embedding a third-party script outweighs the potential harm it can cause, because there is no good way to help them understand the behaviors of the included third-party scripts. It is very hard to reason about the behavior of JavaScript code because of its dynamic nature: JavaScript code is often obfuscated, dynamically loaded from remote servers at run time, and even dynamically generated on the fly, so neither static nor dynamic analysis techniques work well. Developers have to trust that the embedded third-party scripts will not abuse their full privileges.

In this chapter, we present DOM-Logger, a tool that helps web developers monitor and understand the behaviors of third-party scripts embedded in their applications. Specifically, DOM-Logger logs any JavaScript access to an HTML Element object in the Document Object Model (DOM) tree of a web page. DOM-Logger can also track the creation of DOM objects to help reason about what content a third-party script inserts into a web page. With DOM-Logger, we conduct a comprehensive study of how JavaScript code accesses web content in general. We seek to answer the following questions in our study:

• Who are the most popular third-party JavaScript providers?

• How much content is generated by third-party scripts in today’s web applications in general?

• How many DOM accesses does a script make in a web page?

• How frequently does a script access content created by other scripts?

• Is there any third-party JavaScript code that extensively accesses sensitive data in the web pages embedding it?

Note that we do not focus on analyzing the complete behavior of a specific script in detail in our study. Rather, we aim to empirically investigate the DOM access behaviors of third-party scripts in general. To the best of our knowledge, our study is the first large-scale study that reports the DOM access behaviors of JavaScript in modern web applications.

5.2 DOM-Logger

DOM-Logger is a tool for monitoring and logging accesses to HTML DOM Element objects (DOM objects) by JavaScript code. DOM-Logger tracks the creation of each DOM object and monitors how the objects are manipulated by JavaScript. We first explain how DOM-Logger records DOM access logs (§5.2.1), then discuss how DOM-Logger attributes the creation of a DOM object (§5.2.2) and supports monitoring the behaviors of dynamically generated scripts (§5.2.3), and finally describe how we implement DOM-Logger in the Chromium browser.

5.2.1 Recording Accesses

A DOM Element object can be accessed through the DOM APIs by JavaScript in the following ways. We use the keyword element to represent a DOM HTML Element object in the following examples.

1. Read. In a Read access, the attributes of an element are directly referenced by JavaScript to get the data associated with it. For example, the textContent attribute of the element object is read into the JavaScript variable text in the code snippet var text = element.textContent;. JavaScript can also call methods of an object to read data, e.g., var className = element.getAttribute("className");.

2. Write. In a Write access, JavaScript code directly assigns a value to an attribute of an element, e.g., element.innerHTML = "<p>data</p>";. Note that besides modifying the value of the innerHTML attribute, this code snippet causes a new <p> element to be created and inserted as a child of element. Some methods can also be called by JavaScript to modify the attributes of an element, e.g., element.setAttribute("id", "myID");. In particular, API calls like removeChild, appendChild, and replaceChild on an element object are also recorded as Write operations.

3. Execute. In an Execute access, the methods rather than the attribute members of an element are directly invoked, e.g., element.scrollIntoView();. As the examples above show, executing some methods can also trigger Read or Write accesses to the element; a schematic log record combining these operations is sketched below.
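To illustrate how these three operations might be combined in a single log entry, the Python sketch below schematizes such a record; it is our own illustration, not DOM-Logger's actual format, and the field names are hypothetical.

# Schematic of a DOM access log record (our illustration). An access
# type is one operation, or a combination, of Read, Write, and Execute.
from dataclasses import dataclass
from enum import Flag, auto

class Access(Flag):
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()

@dataclass
class DomAccessRecord:
    script_url: str   # identity of the accessing script
    receiver: str     # the DOM object being accessed
    member: str       # attribute or method accessed, e.g., "innerHTML"
    access: Access

# element.setAttribute("id", "myID") both executes a method and writes:
record = DomAccessRecord("https://third.party/lib.js", "HTMLDivElement",
                         "setAttribute", Access.EXECUTE | Access.WRITE)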

Besides the above three types of direct access, an element object can also be accessed indirectly through the DOM APIs. For instance, the HTML5 Canvas drawImage() method takes an Image, Video, or Canvas element as an argument to draw onto the canvas1; the element passed to drawImage() is read internally by this method. DOM-Logger is designed to log all direct HTML DOM Element object API calls and indirect DOM object accesses by JavaScript code. In DOM-Logger, an access is associated with an access type, which represents the operation on the receiver DOM object, i.e., the object being accessed. Specifically, an access type can be one, or a combination, of the three operations: Read, Write, and Execute. In addition to logging the access type, we record in the log the member (either an attribute or a method) of the object that is accessed. We also log all accesses to the global document object, because sensitive APIs, such as cookie access and createElement, are invoked through the document object. To attribute a DOM access to a script, we need to obtain the identity of the accessing JavaScript code. In HTML, JavaScript code is usually inserted between <script> tags as an inline script, or stored in an external JavaScript file and loaded with

1http://www.w3schools.com/tags/canvas_drawimage.asp
