
ABSTRACT

JUECKSTOCK, JORDAN PHILIP. Enhancing the Security and Privacy of the Web Browser Platform via Improved Web Measurement Methodology. (Under the direction of Alexandros Kapravelos.)

The web browser platform today serves as a dominant vehicle for commerce, communication, and content consumption, rendering the assessment and improvement of that platform’s user security and privacy important research priorities. Accurate web measurement via simulated user browsing across popular real-world web sites is essential to the process of assessing and improving web browser platform security and privacy, particularly when developing improved policies that can be deployed in production to millions of real-world users. However, the state of the art in web browser platform measurement instrumentation and methodology leaves much to be desired in terms of robust instrumentation, reproducible experiments, and realistic design parameters.

We propose that enhancing web browser policies to improve privacy while retaining compatibility with legacy content requires robust and realistic web measurement methodologies leveraging deep browser instrumentation. This document comprises research results supporting the above-stated thesis. We demonstrate the limitations of shallow, in-band JavaScript (JS) instrumentation in web browsers, then describe and demonstrate an open source out-of-band instrumentation tool, VisibleV8 (VV8), embedded in the V8 JS engine. We show that VV8 consistently outperforms equivalent in-band instrumentation and provides coverage unavailable to in-band techniques, yet has proved readily maintainable across numerous updates to Chromium and the V8 JS engine.

Next, we use a robustly controlled parallel web measurement experiment to test the assumption, implicit in typical web measurement studies, that automated crawls generalize to the experience of typical web users. The experiment compares observations from multiple network vantage points (VP) and from naive or realistic browser configurations (BC). Our results indicate that VP and especially BC selection result in measurable shifts in HTTP traffic and JS behaviors observed from third-party content providers, underscoring the importance of realism in web measurement experiment design.

Finally, we apply the insights gained from our work on instrumentation and experiment design to evaluate a novel web browser third-party storage policy designed to improve user protection against stateful online tracking while retaining compatibility with real-world content. Our evaluation results suggest that our proposed policy achieves its privacy and compatibility goals, as does Brave Software’s recent public deployment of a directly derived storage policy.

© Copyright 2021 by Jordan Philip Jueckstock

All Rights Reserved

Enhancing the Security and Privacy of the Web Browser Platform via Improved Web Measurement Methodology

by Jordan Philip Jueckstock

A dissertation submitted to the Graduate Faculty of North Carolina State University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Computer Science

Raleigh, North Carolina

2021

APPROVED BY:

Anupam Das
William Enck
Bradley Reaves
Alexandros Kapravelos, Chair of Advisory Committee

DEDICATION

To my parents, who laid the moral and mental foundations of my life at great personal cost. To my wife, who built with me a loving and stable home for our three children and sustained it through this entire saga despite my late nights and frayed nerves. And to my Creator, without Whom none of this would . Soli Deo gloria.

BIOGRAPHY

Jordan Jueckstock was born in Princeton, West Virginia, and raised near Vicenza, Italy. He was homeschooled by his mother, a former secretary who never encountered a job too unimportant to do carefully, and by his father, a musicologist and former music teacher with a fearless talent for practical engineering. Jordan earned his Bachelor of Science in Computer Science from Bob Jones University (BJU) in Greenville, SC, in May 2009. After starting a graduate program at Clemson University the following fall, he transferred to the NSF CyberCorps program at The University of Tulsa in Tulsa, OK, completing a Master of Science in Computer Science there in December 2011. Following two-and-a-half years of work at the National Security Agency in Ft. Meade, MD, Jordan returned to BJU as an instructor. He set out to complete his formal education in computer science by joining the doctoral program at NC State in the fall of 2017. He collaborated with privacy researchers at Brave Software as a summer intern in 2020. Following his graduation from NC State, he will be resuming full-time teaching at BJU.

ACKNOWLEDGEMENTS

This document and the work it represents have been possible only with tremendous support, help, and encouragement from many people and sources. The following deserve particular attention and thanks for their essential role in whatever success I have achieved in this process:

...my advisor: Dr. Alexandros Kapravelos. Thanks to his proactive outreach, I actually missed out on that most stressful of freshman-PhD-student activities: finding an advisor. My advisor found me! His practical approach to research removed my chief barriers to entry, and his personal manner made meeting and working with him a genuine pleasure. His bleeding-edge approach to lab infrastructure may have caused me some uncomfortably deep dives into documentation and code, but it forced me to grow both my technical and management skills. He made me a researcher, to the extent that I am one; a better teacher; and a better hacker.

...my committee members: Drs. Will Enck, Brad Reaves, and Anupam Das. Individually they have provided both encouragement and challenges to me in classrooms, lab meetings, and personal conversations. As a committee, they have provided a healthy blend of confirmation, criticism, and counsel in directing me to the conclusion of my studies and rounding out my education in the art and science of research.

...my WSPR lab colleagues, who shared valuable educational and technical advice, daily commiseration, and memorable life stories. At the risk of leaving out somebody important, memorable names (past and present) include: Micah Bushouse, Lucas Enloe, Abida Haque, Igibek Koishybayev, Nikolaos Pantelaios, and Isaac Polinsky. Two of my lab mates require special mention: Shaown Sarker and Kyle Martin. Shaown has shared with me friendship, serial coauthorship, intriguing philosophical discussion, and the special misery of debugging distributed systems written in NodeJS. Kyle has shared with me friendship, serial late-night collaboration at DARPA hackathons, and the mystical bond of brothers-in-arms formed in joint combat against recalcitrant routers, switches, servers, and Ansible playbooks. He does not even hate me (too much, anyway) for making him learn Rust for that compiler class project.

...the collaborators and mentors I met working with Brave Software: Pete Snyder, Matteo Varvello, Panos Papadopoulos, and Ben Livshits. Special thanks to Pete for multiple research project ideas and collaborations, for engineering my Brave internship at the last possible moment, and for tearing apart and reworking my writing when necessary (which was ... frequently).

...my family, already mentioned but impossible to thank enough. My parents, John and Judy Jueckstock, deserve all credit for whatever positive character traits and skills I possessed when starting my higher education saga, to say nothing of life in general. My in-laws, David and Deborah Andrews, are responsible for the raising of the most wonderful woman in the world: Jessica Jueckstock, née Andrews, my darling wife. Our three children, Johnny, Josie, and Jadyn, have suffered much in the way of an absent-minded if not simply absent father at various points over the last four years, but their love and joy and energy in spite of it are reflections of their mother’s steadfast home-making magic. This is yours, too, Jessica. It simply could not have happened without you.

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

Chapter 1 Introduction
  1.1 Thesis Statement
  1.2 Contributions
  1.3 Thesis Organization

Chapter 2 Background & Motivation
  2.1 Overview
  2.2 JavaScript Instrumentation for Browsers
    2.2.1 Trends and Trade-offs
    2.2.2 Fundamental Criteria
    2.2.3 The Case Against In-Band JS Instrumentation
    2.2.4 Summary
  2.3 Web Browser Storage & Security Policies
    2.3.1 Same-Origin Policy & Storage Basics
    2.3.2 User Tracking
    2.3.3 Threat Model
    2.3.4 Deployed Stateful Tracking Defenses
    2.3.5 Compatibility and Tracking Protections

Chapter 3 VisibleV8: In-browser Monitoring of JavaScript in the Wild
  3.1 Introduction
  3.2 System Architecture
    3.2.1 Chromium/V8 Internals
    3.2.2 VisibleV8 Implementation
    3.2.3 Performance
    3.2.4 Maintenance & Limitations
    3.2.5 Collection System
  3.3 Data Collection
    3.3.1 Methodology
    3.3.2 Data Post-Processing
    3.3.3 Results
  3.4 Bot Detection Artifacts
    3.4.1 Artifact Discovery Methodology
    3.4.2 Artifact Analysis Results
    3.4.3 Case Studies
  3.5 Related Work
  3.6 Conclusion
  3.7 Availability

Chapter 4 Towards Realistic and Reproducible Web Crawl Measurements
  4.1 Introduction
  4.2 Methodology
    4.2.1 Approach to Realism
    4.2.2 Realism Variables
    4.2.3 Control Constants
    4.2.4 Web Site Selection
    4.2.5 Implementation Details
    4.2.6 Precautions & Pilot Experiments
    4.2.7 Quantifying Measurement Bias
  4.3 Results
    4.3.1 Refusenik Sites
    4.3.2 Volume Biases in HTTP Traffic
    4.3.3 Content-Level Biases in JavaScript
  4.4 Discussion and Future Work
    4.4.1 Application to Future Research
    4.4.2 Limitations
    4.4.3 Future Work
  4.5 Related Work
    4.5.1 Generalizability
    4.5.2 Cloaking and Bot Detection
    4.5.3 Network Endpoint Discrimination
  4.6 Conclusion

Chapter 5 Page-Length Storage: A Solution to the Privacy vs. Compatibility Trade-off in Preventing Third-Party Stateful Tracking
  5.1 Introduction
  5.2 Design & Implementation
    5.2.1 Policy Design
    5.2.2 Prototype Implementation
    5.2.3 Implementation Remarks
  5.3 Methodology
    5.3.1 Stateful Crawl Methodology
    5.3.2 Primary Evaluation Methodology
    5.3.3 Performance Evaluation
  5.4 Results
    5.4.1 Stateful Crawl Statistics
    5.4.2 Privacy: Cross-Site Tracking Potential
    5.4.3 Privacy: Cross-Time Tracking Potential
    5.4.4 Compatibility: Quantitative Assessment
    5.4.5 Compatibility: Qualitative Assessment
    5.4.6 Performance Evaluation
  5.5 Discussion
    5.5.1 Limitations
    5.5.2 Next Steps
  5.6 Related Work
  5.7 Conclusion

Chapter 6 Conclusions & Future Work
  6.1 Thesis Statement Revisited
  6.2 Directions for Future Work

BIBLIOGRAPHY

LIST OF TABLES

Table 2.1 Survey of published JS instrumentation systems

Table 3.1 Final domain status after collection
Table 3.2 Bot detection seed artifacts
Table 3.3 Candidate artifacts classified
Table 3.4 Highest ranked visit domains probing identified bot artifacts
Table 3.5 Top security origin domains probing bot artifacts
Table 3.6 Most-probed bot artifacts

Table 4.1 Some “refusenik” sites always fail navigation from a single configuration but not its complements
Table 4.2 Total HTTP requests by EasyList/EasyPrivacy match and frame context
Table 4.3 Many domains showing no overall request volume bias serve script content with distinct VP/BC biases

Table 5.1 Candidate URL deviations as assessed by holistic manual grading (n=100)
Table 5.2 Per-policy/per-temperature mean JS execution time for third-party execution contexts, normalized relative to the permissive baseline

LIST OF FIGURES

Figure 2.1 Reference monitors (RM) in traditional OS & application security
Figure 2.2 Third-party storage (a) fully allowed, (b) fully blocked, (c) partitioned by first-party context, and (d) scoped to hosting page lifetime (our proposal). A, B, & T are distinct domains; T is embedded as a third-party within A & B.
Figure 2.3 Stock market graph broken by strict third-party storage blocking (left) and working with page-length storage (right)

Figure 3.1 V8 architecture with VV8’s additions
Figure 3.2 Instrumentation performance on BrowserBench.org [Bro] and Dromaeo [Dro]
Figure 3.3 The complete data collection and post-processing system
Figure 3.4 Cumulative feature use over the Alexa 50k

Figure 4.1 Workflow from Domain List to Target Server
Figure 4.2 Distributions of cross-VP request volume bias by 3rd-party domains (stealth BC only); nearly twice as many domains consistently favor residential VP over the cloud VP as vice versa (3.5% > 1.8%)
Figure 4.3 Distribution of stealth-vs.-naive traffic volume bias scores for 3rd-party domains (residential VP only); more symmetric than its cross-VP counterparts
Figure 4.4 Distributions of cross-VP ad/tracker request volume bias by 3rd-party domains (stealth BC only); little change from global cross-VP distributions
Figure 4.5 Distribution of stealth-vs.-naive ad/tracker traffic volume bias scores for 3rd-party domains (residential VP only); BC bias is more common among these domains than the global population
Figure 4.6 Distributions of stealth-vs.-naive ad/tracker traffic volume bias scores for 3rd-party domains (residential VP only) broken down by browser frame context; sub-frames show radically different (and less intuitive) BC bias distributions than main frames
Figure 4.7 Distribution of cross-VP execution frequency bias for families of JS code (stealth BC only)
Figure 4.8 Distribution of stealth-vs.-naive execution frequency bias for families of JS code (residential VP only)

Figure 5.1 Crawl success rate varied modestly across policies but was always reasonably high
Figure 5.2 Of our tested policies, all but permissive essentially eliminated stateful cross-site tracking potential
Figure 5.3 Cross-time tracking comparisons considering cookies and local storage
Figure 5.4 Our page-length policy produces page behaviors within third-party frames much closer to the permissive baseline than does the breakage-prone strict third-party storage blocking policy
Figure 5.5 Time to Largest Contentful Paint: page-length is always comparable/favorable to “cold” permissive
Figure 5.6 Policy overhead within JS execution time and HTTP request volume

CHAPTER 1

INTRODUCTION

1.1 Thesis Statement

The undeniable dominance of the web browser platform as a vehicle for commerce, communication, and content consumption makes assessment and improvement of the platform’s security and privacy compelling research priorities. Securing the control of the browser platform itself against, e.g., 0-day vulnerabilities, is an obvious imperative with obvious solutions. Browser vendors have become aggressively adept at finding and patching critical vulnerabilities and rolling out new browser versions at a dizzying pace. Securing user identity and assets against vulnerabilities in web applications is likewise a generally straightforward problem that calls for straightforward solutions (i.e., patches) or workarounds. Such solutions may or may not involve changing the web browser platform, though robustness against common vulnerabilities of the past is a criterion that has guided the development of modern browser mechanisms such as the SameSite cookie attribute.¹ Securing the privacy of user interests and associations is a much more subtle and ultimately harder problem, one that typically defies straightforward solutions. Such solutions as may be designed might call for web browser platform changes that are unpopular with prominent players in a complex online commerce ecosystem rife with paradoxical and even perverse incentives.

¹ https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie/SameSite

Essential to the process of assessing and improving web browser platform security and privacy is the ubiquitous and often haphazard practice of web measurement. A typical web measurement study attempts to approximate the user browsing experience across top ranked web sites while extracting and measuring features of interest from the pages visited. Security studies can quantify how many servers or sites are vulnerable to specific, newly identified attacks. Studies of platform and emerging threats can identify web browser platform features that are popular, obscure, obsolete, or being put to surprising (and potentially harmful) uses. Privacy studies measure the pervasiveness of online tracking to quantify just how widely (and accurately) user interests and associations can be tracked across sites.

The state of the art in web browser platform measurement instrumentation and methodology leaves much to be desired. To begin, even though web measurement is pervasive in security and privacy research involving the web platform, the specific measurement methodologies employed are unfortunately often poorly documented, much less repeatable. Furthermore, typical approaches to browser instrumentation (e.g., browser extensions or automation tools injecting JavaScript prototype patches into the global object) are subject to platform limitations in coverage and performance and may further be exposed to easy identification by evasive or even hostile content. Finally, crawls using unrealistic network endpoints or browser configurations are vulnerable to measurement bias caused by selective or even adversarial response.

Incomplete or biased measurement results stand as obstacles to improving the security and privacy of the web browser platform since the scale of a given problem, or the efficacy of a proposed solution, or both, may not be correctly observed. Furthermore, even where a novel browser mechanism or policy could demonstrably improve user security or privacy, the problem of evaluating the solution’s compatibility with real-world legacy content remains serious and unsolved in the general case. An effective but incompatible, site-breaking solution is no solution at all, as it cannot be deployed at scale to real-world users. The traditional approach to evaluating site-breakage inevitably relies on human agents to define “breakage,” preventing effective scaling.

We summarize these research problems impeding the improvement of web security and privacy, and our contributions toward solving them, in the following thesis statement:

Enhancing web browser policies to improve privacy while retaining compatibility with legacy content requires robust and realistic web measurement methodologies leveraging deep browser instrumentation.

The sufficiency of such measurement methodologies to assess the real-world practicality of novel privacy policies is demonstrated by the results presented in Chapter 5 and by the recent adoption and deployment within the Brave browser of a policy directly derived from that proposed and evaluated in our work. Their necessity is supported by findings from our study of in-browser JavaScript instrumentation (Chapter 3) compared to its popular alternatives and our study of measurement bias attributable to choice of network vantage point and browser realism when performing web crawls (Chapter 4).

1.2 Contributions

The work documented in this thesis makes the following specific research contributions:

• Documentation of critical limitations of in-band browser instrumentation. We illustrate how browser platform and JavaScript language peculiarities make comprehensive and stealthy JavaScript API instrumentation implemented in-band, via injected JavaScript (JS) code, impractical if not impossible.

• Presentation, release, and demonstration of effective and maintainable out-of-band JS instrumentation. Despite being deeply embedded into the V8 JS engine, VisibleV8 has been readily maintained from Chromium 63 to Chrome 84 without major effort. It enabled discovery of 46 client-side bot-detection artifacts, in use by content on 29% of the Alexa top 50k sites, that would not have been detectable with conventional in-band JS instrumentation. It has since proved useful to independent research efforts, such as a recently published study of JavaScript obfuscation in the wild [Sar20].

• Design, documentation, and evaluation of a robustly-controlled methodology for assessing web measurement bias attributable to design criteria such as network vantage point or browser configuration. We compare three vantage points (university, cloud, and residential networks), finding a modest but consistent bias effect of third-party content providers serving more traffic to non-cloud clients. To assess browser realism’s impact, we compare a naive headless client with a non-headless browser hardened against basic bot detection, finding significant bias in traffic volume served by third-party domains (up to 19% of domains associated with advertising and user tracking). Other results suggested our findings represented a lower bound on the real effect of unrealistic vantage points or browser configurations. We go to significant lengths to eliminate all other sources of client-side noise in web measurement, documenting and where possible empirically justifying all experiment parameters.

• Design, implementation, and evaluation of a novel third-party browser storage policy which preserves both user privacy and content compatibility. The policy uses elements of existing mechanisms (ephemeral storage, storage partitioning by first-party domain context) in a novel way, restricting third-party storage to the lifetime of a top-level document. This approach prevents not only traditional cross-site tracking but also more subtle threats such as cross-visit session linking within one site context. Our evaluation experiment uses the open-source PageGraph deep browser instrumentation system to collect and compare fine-grained elements of content behavior among our proposed policy and several representative alternatives. The results show tracking prevention equivalent to full blocking of third-party storage (effective, but known to break popular content) while maintaining compatibility comparable to the default baseline (allowing all third-party storage).

• Direct influence on a new privacy-preserving storage policy recently deployed in production by Brave Software. Despite modest relaxations in the policy strictness to facilitate multi-tab browsing scenarios, the deployed policy is substantially the same as ours, confirming both the effectiveness of our policy proposal and the general accuracy of our compatibility assessment.
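To make the bot-detection artifacts mentioned in the second contribution more concrete, the following is a minimal sketch of what probing for non-standard properties can look like and of what it means to observe such probes. Everything here is an assumption for illustration: the probe names are commonly cited automation-tool properties rather than the artifact list reported in Chapter 3, `looksAutomated` and `traceReads` are hypothetical helpers, and the `Proxy` merely stands in for engine-level visibility in a toy setting (it is not how VisibleV8 works).

```javascript
// Stand-in for a browser global object; real detection code would probe `window`.
const cleanWindow = { document: {}, navigator: { userAgent: "ExampleBrowser/1.0" } };

// Illustrative property names associated with automation tools.
// These are assumptions for the sketch, not the artifact list from Chapter 3.
const PROBE_NAMES = ["_phantom", "callPhantom", "__nightmare"];

// Hypothetical detection routine: any non-standard property that exists marks a bot.
function looksAutomated(win) {
  return PROBE_NAMES.some((name) => typeof win[name] !== "undefined");
}

// Record every property read, including reads of properties that do not exist.
// The Proxy only stands in for engine-level visibility in this toy setting.
function traceReads(target, log) {
  return new Proxy(target, {
    get(obj, key, receiver) {
      log.push(String(key));
      return Reflect.get(obj, key, receiver);
    },
  });
}

const probeLog = [];
const verdictClean = looksAutomated(traceReads(cleanWindow, probeLog));
const verdictBot = looksAutomated({ ...cleanWindow, callPhantom: () => {} });

console.log(verdictClean, verdictBot, probeLog);
```

The point of the sketch is that the interesting reads target properties that are absent from an ordinary browser, so instrumentation that only wraps known, existing API members has nothing to wrap and would not record them.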

1.3 Thesis Organization

In Chapter 3, we describe VisibleV8, a system for improving the coverage and stealth of JavaScript API instrumentation in web browsers. We identify in-band (i.e., non-browser-modifying) instrumentation approaches as the prevailing state of the art in comparable work, and we then identify fundamental limitations of the browser platform and JavaScript language that inhibit effective in-band instrumentation. We present the design and implementation of an out-of-band JavaScript native API instrumentation framework providing greatly improved visibility over existing alternatives. VisibleV8 allows us to detect pages’ attempts to probe for the existence of non-standard properties on built-in objects in the native browser API, a behavior associated with “bot detection,” the classification of browsers visiting a site as automated robots, or “bots.” We use VisibleV8 to discover novel bot detection artifacts and to measure the prevalence of testing for them across the Alexa top 50K sites, finding a significant number (29%) to engage in apparent bot detection via such probes.

In Chapter 4, we describe and execute a methodology for measuring the impact of expected realism factors on repeated measurements of web network traffic and JavaScript behavior. We perform a large-scale, repeated, synchronized stateless crawl of the Tranco top 25K sites, capturing network request data and JavaScript execution traces, the latter via VisibleV8. Realism factors compared are network vantage point (VP; cloud vs. residential vs. university) and browser configuration (BC; off-the-shelf automated headless browser vs. automated browser camouflaged to appear like a typical desktop browser). While overall traffic levels stayed comparable across all variables, significant numbers of domains showed consistent, large-scale differences in traffic volume depending on the realism of the VP or BC used, including up to 19% of domains serving traffic flagged by popular filter lists (EasyList, EasyPrivacy). JavaScript loading and execution analysis revealed similar patterns of selective behavior favoring one VP or BC over another. We conclude that simple realism factors, either unconsidered or left undocumented in too many web measurement studies [Ahm20], can significantly impact measurement results that are closely related to security and privacy (namely, context-sensitive script behavior and HTTP traffic to known advertising and tracking domains).

In Chapter 5, we propose and evaluate a policy improvement for the web browser platform to prevent stateful user tracking without breaking user experience on sites embedding third-party content. Our policy is inspired by a simple insight: third-party browser storage allows tracking, but only when accessed later in time or from the context of a different site. To prevent tracking without breaking content that assumes it can access third-party storage, we simply restrict the scope and lifetime of that storage to the lifespan of a single loaded Web page in the browser so it cannot be accessed on a subsequent visit to that or any other site. Our policy proves straightforward, if not easy, to prototype as minor modifications to Chromium. We perform quantitative and qualitative measurement studies to assess the privacy and compatibility efficacy of our proposed policy. Our quantitative measurements are heavily informed by our prior work: out-of-band (i.e., in-browser) instrumentation, in this case using PageGraph [Iqb; Che21], giving deep coverage with good performance and stealth; and a carefully documented crawl design with realism factors informed by our multi-endpoint study (a university VP and a non-headless BC). Our results support our expectations of improved privacy without a high cost in web compatibility. The Brave browser (on which PageGraph is based) has recently adopted a new third-party storage policy directly derived from ours, demonstrating that our approach can produce effective privacy enhancements that are sufficiently robust to deploy in the real world.
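The insight behind the storage policy summarized above can be sketched as a toy model of which storage "bucket" an embedded third party sees on each visit. This is only an illustration under stated assumptions, not the Chromium prototype: the policy labels, the `storageBucket` and `canLink` helpers, and the `pageLoad` identifier are all hypothetical names introduced here, and the model ignores the multi-tab relaxations of the deployed policy.

```javascript
// Toy model of the storage bucket a third-party frame sees; not Chromium code.
// Policy labels are hypothetical names for a permissive baseline, full blocking,
// first-party partitioning, and the page-length proposal.
function storageBucket(policy, visit, thirdParty) {
  switch (policy) {
    case "permissive":
      return thirdParty; // one bucket shared across every embedding site
    case "blocked":
      return null; // no third-party storage at all
    case "partitioned":
      return `${visit.site}|${thirdParty}`; // one bucket per embedding site
    case "page-length":
      return `${visit.pageLoad}|${thirdParty}`; // one bucket per page load
    default:
      throw new Error(`unknown policy: ${policy}`);
  }
}

// A tracker can link two visits only if it reads back the same bucket in both.
function canLink(policy, visitA, visitB, thirdParty = "T") {
  const a = storageBucket(policy, visitA, thirdParty);
  const b = storageBucket(policy, visitB, thirdParty);
  return a !== null && a === b;
}

// pageLoad is a hypothetical unique ID assigned to each top-level document.
const firstVisitToA = { site: "A", pageLoad: 1 };
const laterVisitToA = { site: "A", pageLoad: 2 };
const visitToB = { site: "B", pageLoad: 3 };

for (const policy of ["permissive", "blocked", "partitioned", "page-length"]) {
  console.log(policy, {
    crossSite: canLink(policy, firstVisitToA, visitToB),
    crossTime: canLink(policy, firstVisitToA, laterVisitToA),
  });
}
```

In this model only the blocked and page-length policies prevent both cross-site and cross-time linking. The model deliberately says nothing about compatibility, which is where the two differ: under page-length storage the embedded content still gets working storage for the duration of the page.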

CHAPTER 2

BACKGROUND & MOTIVATION

2.1 Overview

This section provides background material first on technical challenges and pitfalls our work faces and then on the policy landscape we seek to improve. Our thesis presumes a solid understanding of both the technical and policy details underlying the evolution of user privacy in web browsers. The choice of deep browser instrumentation over its traditional alternatives is motivated by a solid grasp of modern browser architecture and the distinctive semantics of the JavaScript language at the heart of rich web content and browser-based applications. Any proposed solution in the applied problem space of user tracking and legacy compatibility is constrained by details of both historical and modern browser security and privacy policies, or the lack thereof.

Web browsers are complex, not only in their implementation details but also in their conceptual structure and policy details. Modern browser implementations must parse dozens of complex data formats, perform typographic layout and graphical rendering of large documents with high quality and low latency, and compile and execute arbitrary programs written in a highly dynamic scripting language (JavaScript) with extremely high performance relative to traditional scripting language interpreters. Unsurprisingly, then, modern browser implementations are large, typically comprising millions of lines of code.

The rapid, ad hoc evolution of browser features over three decades has produced a sometimes incoherent platform stance toward security and privacy. Broadly standardized privacy policies (e.g., the Same Origin Policy, or SOP) exist but are marked by edge-cases within and inconsistencies between vendor implementations without. Recent societal and industry development of heightened privacy awareness has prompted various mechanisms and policies, from both browser vendors and independent privacy advocates, to mitigate traditional weaknesses in the browser’s privacy stance. This trend is encouraging but also increases the complexity of the platform and complicates any attempt to introduce universally helpful policy improvements.

2.2 JavaScript Instrumentation for Browsers

2.2.1 Trends and Trade-offs

The state of the art in web measurements for security and privacy relies on full browsers, not simple crawlers or browser-simulators, to visit web sites. Experience with the OpenWPM framework [Eng16] indicated measurements from standard browsers (e.g., Firefox and Chromium) to be more complete and reliable than those from lightweight headless browsers (e.g., PhantomJS [Pha]). However, the question of how best to monitor and measure JS activity within a browser remains open. Assuming an open-source browser, researchers can modify the implementation itself to provide in-browser (i.e., out-of-band) JS instrumentation. Alternatively, researchers can exploit the flexibility of JS itself to inject language-level (i.e., in-band) instrumentation directly into JS applications at run-time.

We provide a summary of recent security and privacy related research that measured web content using JS instrumentation in Table 2.1. Note that here “taint analysis” implies “dynamic analysis” but additionally includes tracking tainted data flows from source to sink. Fine-grained taint analysis is a heavy-weight technique, as is comprehensive forensic record and replay, so it is not surprising that these systems employed out-of-band (in-browser) implementations in native (C/C++) code. Lighter weight methodologies that simply log (or block) use of selected API features have been implemented both in- and out-of-band, but the in-band approach is more popular, especially in more recent works.
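As a minimal sketch of the in-band approach described above, the following replaces a built-in function with a logging wrapper written in JS itself. The helper names `instrumentMethod` and `looksNative` are hypothetical, and `Math.hypot` is chosen only because it is safe to patch in a demonstration; none of this is taken from the systems surveyed in Table 2.1.

```javascript
// Hypothetical in-band logger: replaces target[name] with a wrapper that records
// each call and then forwards it to the original function.
function instrumentMethod(target, name, log) {
  const original = target[name];
  target[name] = function (...args) {
    log.push({ api: name, args });
    return Reflect.apply(original, this, args);
  };
  return original;
}

// Built-in functions stringify with a native-code marker; JS wrappers show source.
function looksNative(fn) {
  return Function.prototype.toString.call(fn).includes("[native code]");
}

const callLog = [];
const nativeHypot = instrumentMethod(Math, "hypot", callLog);

const result = Math.hypot(3, 4); // "page" code runs unchanged but is now logged
const wrapperLooksNative = looksNative(Math.hypot);
const originalLooksNative = looksNative(nativeHypot);

Math.hypot = nativeHypot; // undo the patch
console.log(result, callLog, wrapperLooksNative, originalLooksNative);
```

The wrapper lives in the same JS context as the code it observes, so that code can inspect it: here a single `toString` check already distinguishes the patched function from the built-in one. A more careful wrapper could hide this particular signal, but the sketch shows why in-band instrumentation has to defend itself against the very code it is measuring.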

Table 2.1 Survey of published JS instrumentation systems

System                             Implementation  Role                    Problem           Platform               Availability
OpenWPM [Das18; Mir17; Eng16] (1)  In-band         Dynamic analysis        Various           Firefox SDK (2)
Snyder et al., 2016 [Sny16]        In-band         Dynamic analysis        Attack surface    Firefox SDK
FourthParty [May12]                In-band         Dynamic analysis        Privacy/tracking  Firefox SDK
TrackingObserver [Roe12]           In-band         Dynamic analysis        Privacy/tracking  WebExtension
JavaScript Zero [Sch18]            In-band         Policy enforcement      Side-channels     WebExtension
Snyder et al., 2017 [Sny17a]       In-band         Policy enforcement      Privacy/tracking  Firefox SDK
Li et al., 2014 [Li14]             Out-of-band     Dynamic analysis        Malware           Firefox (unspecified)
FPDetective [Aca13]                Out-of-band     Dynamic analysis        Privacy/tracking  Chrome 32 (3)
WebAnalyzer [Sin10]                Out-of-band     Dynamic analysis        Privacy/tracking  Internet Explorer 8
JSgraph [Li18]                     Out-of-band     Forensic record/replay  Malware/phishing  Chrome 48              (4)
WebCapsule [Nea15]                 Out-of-band     Forensic record/replay  Malware/phishing  Chrome 36
Mystique [Che18]                   Out-of-band     Taint analysis          Privacy/tracking  Chrome 58
Lekies et al. [Lek13; Lek17]       Out-of-band     Taint analysis          XSS               Chrome (unspecified)
Stock et al. [Sto15]               Out-of-band     Taint analysis          XSS               Firefox (unspecified)
Tran et al., 2012 [Tra12]          Out-of-band     Taint analysis          Privacy/tracking  Firefox 3.6

(1) Including only OpenWPM usage depending on JS instrumentation
(2) Supported only through Firefox 52 (end-of-life 2018-09-05)
(3) Used both PhantomJS and Chrome built from a common patched WebKit
(4) Binaries only

[Figure 2.1 diagram labels: "User Space" (JavaScript Code); "Kernel Space" (Browser Code); Traditional RM; Inlined RM; Execution Context. Legend: Protected Browser API; Instrumentation Layer.]

Figure 2.1 Reference monitors (RM) in traditional OS & application security

2.2.2 Fundamental Criteria

The problem of monitoring untrusted code dates to the very dawn of computer security and inspired the concept of a reference monitor [And72], a software mediator that intercepts and enforces policy on all attempted access to protected resources. The traditional criteria of correctness for reference monitors are that they be tamper proof, be always invoked (i.e., provide complete coverage), and be provably correct, though this last element may be lacking in practical implementations. For security and privacy measurements, we add the additional criterion of stealth: evasive or malicious code should not be able to trivially detect its isolation within the reference monitor and hide its true intentions from researchers, since such an evasion could compromise the integrity of the derived results.

A classic example of a practical reference monitor is an OS kernel: it enforces access control policies on shared resources, typically using a rings of protection scheme (Figure 2.1) assisted by hardware. In order to enforce security policies like access controls and audits, the kernel must run in a privileged ring. Untrusted user code runs in a less-privileged ring, where the kernel (i.e., the reference monitor) can intercept and thwart any attempt to violate system policy. Alternatively, inlined reference monitors (IRMs) [Erl03] attempt to enforce policy while cohabiting the user ring with the monitored application, typically by rewriting and instrumenting the application’s code on the fly at load or run time.

On the web, the browser and JS engine provide the equivalent of a kernel, while JS application code runs in a “user ring” enforced by the semantics of the JS language. JS instrumentation in general is a kind of reference monitor; implemented in-band, it constitutes an IRM. We argue that the JS language’s inherent semantics and current implementation details make it impossible to build sound, efficient, general-purpose IRMs in JS on modern web browsers.

2.2.3 The Case Against In-Band JS Instrumentation

The standard approach to in-band JS instrumentation, which we call “prototype patching,” is to replace references to target JS functions or objects with references to instrumented wrapper functions or proxy objects (see footnote 1). The wrappers can access the original target through references captured in a private scope inaccessible to any other code. Note that the target objects themselves are not replaced or instrumented, only the references to them (a potential pitfall highlighted in prior work [Mey10]).

Structural Limits. The JS language relies heavily on a global object (window in browsers) which doubles as the top level namespace. There is no mutable root reference to the global object, and thus no way to replace it with a proxy version. Specific properties of the global object may be instrumented selectively, but this process naturally requires a priori knowledge of the target properties. In-band instrumentation cannot be used to collect arbitrary global property accesses, as required for our methodology in Section 3.4. This limitation means that in-band JS instrumentation fails the complete coverage criterion.

Policy Limits. By design policy, not all Chrome browser API features can be patched or wrapped. These features can be identified using the WebIDL [Webb] (interface definition language) files included in the Chromium sources. For Chrome 64, these files defined 5,755 API functions and properties implemented for use by web content (there are more available only to internal test suites). Of these, 21 are marked Unforgeable and cannot be modified at all. Notably, this set includes window.location and window.document, preventing in-band instrumentation of arbitrary-property accesses on either of these important objects. Again, such a restriction would have eliminated many of our results in Section 3.4.
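As a rough illustration of prototype patching and of the policy limit just described, the following sketch runs outside a browser. FakeDocument, fakeWindow, and accessLog are hypothetical stand-ins introduced only for this example, and the non-configurable property is merely an assumed analogue of a WebIDL Unforgeable attribute, not the browser's actual implementation.

```javascript
// Hypothetical stand-in for a browser API object; in a real browser the
// target would be something like Document.prototype.createElement.
class FakeDocument {
  createElement(tag) {
    return { tagName: String(tag).toUpperCase() };
  }
}

const accessLog = [];

// Prototype patch: swap the *reference* held by the prototype for a wrapper.
// The original function survives only inside this private closure scope.
(function patch(proto, name) {
  const original = proto[name];
  proto[name] = function (...args) {
    accessLog.push({ name, args });
    return original.apply(this, args);
  };
})(FakeDocument.prototype, "createElement");

const doc = new FakeDocument();
const el = doc.createElement("div");

// Policy limit, simulated: a non-writable, non-configurable property
// (assumed analogue of an Unforgeable attribute) cannot be re-wrapped.
const fakeWindow = {};
Object.defineProperty(fakeWindow, "location", {
  value: { href: "https://example.test/" },
  writable: false,
  configurable: false,
});

let patchRejected = false;
try {
  // Attempt to replace the property with a logging accessor.
  Object.defineProperty(fakeWindow, "location", {
    get() {
      accessLog.push({ name: "location", args: [] });
      return null;
    },
  });
} catch (err) {
  patchRejected = err instanceof TypeError;
}
```

The wrapper only sees calls that go through the patched reference, so the instrumenter must know in advance which names to patch, and any property locked down like fakeWindow.location stays out of reach.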

1 Other forms of JS IRM exist, like nested interpreters [Cao12; Ter12] and code-rewriting systems [Chu15], but these have not yet proven fast enough for real-world measurement work.

    /* from ://cdn.flashtalking.com/xre/275/
       2759859/1948687/js/j-2759859-1948687.js */
    /* (all variable names original) */
    var badWrite = !(document.write instanceof Function &&
        ~document.write.toString().indexOf('[native code]'));

    /* (later on, among other logic checks) */
    if (badWrite || o.append) {
        o.scriptLocation.parentNode.insertBefore(
            /* omitted for brevity */);
    } else {
        document.write(div.outerHTML);
    }

Listing 2.1 Prototype patch evasion in the wild

Patch Detection. Prototype patches of native API functions (or property accessors) can be detected directly and thus fail the criterion of stealth. JS functions are objects and can be coerced to strings. In every modern JS engine, the resulting string reveals whether the function is a true JS function or a binding to a native function. Patching a native function (e.g., window.alert) with a non-native JS wrapper function is a dead giveaway of interposition. The function-to-string probe has been used to detect fingerprinting countermeasures [Vas18b] and appears commonly in real-world JS code. In many cases, such checks appear strictly related to testing available features for browser compatibility. But there also exist cases like Listing 2.1, in which the script changes its behavior in direct response to a detected patch. Function-to-string probe evasions abound, from the obvious (patch the right toString function, too) to the subtle. In Listing 2.2, the "[native code]" string literal in the patch function appears in the output of toString and will fool a sloppy function-to-string probe that merely tests for the presence of that substring.
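A minimal sketch of the function-to-string probe follows, runnable outside a browser. It assumes Math.max as a stand-in for a native browser API function; the names looksNative and api are hypothetical and introduced only for illustration.

```javascript
// Returns true when the function's source text looks like a native binding.
// Calling Function.prototype.toString directly ignores any own-property
// toString that a patcher may have attached to the wrapper.
function looksNative(fn) {
  return typeof fn === "function" &&
    Function.prototype.toString.call(fn).includes("[native code]");
}

// Hypothetical API table; Math.max stands in for a native browser function.
const api = { max: Math.max };
const beforePatch = looksNative(api.max);

// Naive prototype patch with a plain JS wrapper.
const original = api.max;
api.max = function (...args) {
  return original.apply(this, args);
};
const afterPatch = looksNative(api.max);

// A partial evasion: give the wrapper its own toString that mimics the
// native text. This fools probes that call fn.toString() ...
api.max.toString = () => Function.prototype.toString.call(original);
const fooledDirectCall = api.max.toString().includes("[native code]");

// ... but not probes that go through Function.prototype.toString, which is
// why a thorough evasion has to patch that function as well.
const fooledPristineCall = looksNative(api.max);
```

In this sketch the probe reports a native function before the patch and a non-native one afterwards, which is the giveaway the paragraph above describes.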

    /* from https://clarium.global.ssl.fastly.net
       ("..." comment means irrelevant portions elided for brevity) */
    patchNodeMethod: function (a) {
        var b = this,
            c = Node.prototype[a];
        Node.prototype[a] = function (a) {
            "[native code]";
            var d = a.src || "";
            return /*...*/ c.apply(this, arguments)
        }
        /*...*/
    }

Listing 2.2 Function patches hiding in plain sight
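To make the string-literal trick of Listing 2.2 concrete, the sketch below contrasts a substring-only probe with a stricter one. Math.min stands in for a native browser method, and strictProbe is an assumed hardening written for this example rather than a technique taken from the scripts discussed above.

```javascript
// Hypothetical native target: Math.min stands in for a browser API method.
const nativeMin = Math.min;

// Wrapper carrying a bare string literal, in the style of Listing 2.2.
const disguised = function (...args) {
  "[native code]";
  return nativeMin.apply(this, args);
};

// Sloppy probe: only checks that the substring appears somewhere.
function sloppyProbe(fn) {
  return fn.toString().indexOf("[native code]") !== -1;
}

// Stricter probe (assumed hardening): the entire function body must be the
// native placeholder, with nothing else before or after it.
function strictProbe(fn) {
  const text = Function.prototype.toString.call(fn);
  return /^function\s*[\w$]*\s*\([^)]*\)\s*\{\s*\[native code\]\s*\}$/.test(text);
}

const sloppyOnWrapper = sloppyProbe(disguised);
const strictOnWrapper = strictProbe(disguised);
const strictOnNative = strictProbe(nativeMin);
```

The sloppy probe accepts the wrapper because the literal is part of its source text, while the stricter pattern accepts only the native function. The exact native placeholder text is engine-specific, so the regular expression here should be read as an assumption about V8-style output.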

Let us assume a “perfect” patching system invisible to toString probes has been used to instrument createElement, the single most popular browser API observed in our data collection across the Alexa 50k (Section 3.3). Such a patch is still vulnerable to a probe that exploits JS’s type coercion rules with a Trojan argument to detect patches on the call stack at runtime (Listing 2.3). For brevity, the provided proof-of-concept calls the Error constructor, which could itself be patched, but there are other ways of obtaining a stack trace in JS. The Byzantine complexity of JS’s pathologically dynamic type system offers many opportunities for callback-based exposure of patches and proxies via stack traces. Here prototype patches face a cruel dilemma: either invoke the toString operation and open the gates to side-effects (allowing detection and evasion), or refuse to invoke it and break standard JS semantics (allowing detection and evasion).

    function paranoidCreateElement(tag) {
        return document.createElement({
            toString: function () {
                var callers = new Error().stack.split('\n').slice(1);
                if (/at paranoidCreateElement/.test(callers[1])) {
                    return tag; /* no patch */
                } else {
                    throw new Error("evasive action!"); /* patched! */
                }
            },
        });
    }

Listing 2.3 Trojan argument attack (Chrome variant)
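The same idea can be exercised outside a browser with a sketch like the following. It assumes a V8-style stack trace (as in Node.js) and uses the built-in parseInt, which also coerces its argument to a string, as a stand-in for createElement. Unlike Listing 2.3, this variant compares stack depth before and after patching instead of inspecting a fixed frame index; target, probeDepth, and the other names are hypothetical.

```javascript
// Hypothetical API table; the built-in parseInt stands in for a native
// browser function that coerces its argument to a string.
const target = { parse: parseInt };

// Calls target.parse with a Trojan argument and reports how many stack
// frames sit above this function when the toString callback fires.
function probeDepth() {
  let depth = -1;
  target.parse({
    toString() {
      // Drop the leading "Error" line; each remaining line is one frame.
      const frames = new Error().stack.split("\n").slice(1);
      depth = frames.findIndex((line) => line.includes("probeDepth"));
      return "42";
    },
  });
  return depth;
}

// Baseline taken while the native function is still in place.
const baselineDepth = probeDepth();

// Install a prototype-patch style wrapper around the native function.
const nativeParse = target.parse;
target.parse = function (...args) {
  return nativeParse.apply(this, args);
};

// The wrapper contributes at least one extra frame between the Trojan
// callback and the probing function.
const patchedDepth = probeDepth();
const patchDetected = patchedDepth > baselineDepth;
```

A wrapper that avoids the extra frame by skipping the coercion would return results that differ from the native function's, which is the other horn of the dilemma described above.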

Patch Subversion. Finally, prototype patches can be subverted through abuse of