Self-Exfiltration

Self-Exfiltration: The Dangers of Browser-Enforced Information Flow Control Eric Y. Chen, Sergey Gorbaty, Astha Singhal and Collin Jackson Carnegie Mellon University {eric.chen, sergey.gorbaty, collin.jackson}@sv.cmu.edu, [email protected] Abstract—Since the early days of Netscape, browser vendors Because the attack relies on exfiltrating users’ sensitive and web security researchers have restricted out-going data information through either the victim website itself or another based on its destination. The security argument accompanying whitelisted website, we call this attack a self-exfiltration these mechanisms is that they prevent sensitive user data from being sent to the attacker’s domain. However, in this attack. We demonstrate that an adversary can launch self- paper, we show that regulating web information flow based exfiltration attacks with or without executing JavaScript. on its destination server is an inherently flawed security To successfully launch a self-exfiltration attack, the ad- practice. It is vulnerable to self-exfiltration attacks, where versary must utilize an existing channel on the whitelisted an adversary stashes stolen information in the database of website to store stolen information. To confirm whether a whitelisted site, then later independently connects to the whitelisted site to retrieve the information. We describe eight these exfiltration channels exist in real world websites, we existing browser security mechanisms that are vulnerable to surveyed 100 websites and discovered at least one exfiltration these “self-exfiltration” attacks. Furthermore, we discovered channel for each of these websites. We conclude that none at least one exfiltration channel for each of the Alexa top of these existing data-exfiltration defenses can protect real 100 websites. None of the existing information flow control world websites from self-exfiltration attacks. mechanisms we surveyed are sufficient to protect data from being leaked to the attacker. Our goal is to prevent browser Organization vendors and researchers from falling into this trap by designing more systems that are vulnerable to self-exfiltration. Section II introduces data-exfiltration attacks, presents existing defenses, and outlines the threat model assumed for the rest of this paper. Section III introduces self exfiltration I. INTRODUCTION attacks and details the steps required to launch such an attack. As the World Wide Web matures into a ubiquitous comput- Section IV presents the results of our survey. Section V ing platform, people are growing comfortable with sharing discusses possible solutions, and finally Section VI concludes. their personal information with web applications they trust. However, this casual sharing of information is accompanied II. DATA-EXFILTRATION by serious privacy and security implications. Vulnerabilities The desire to safeguard users’ information from untrusted in web applications can lead to compromise of users’ sensitive websites led to the creation of the most important security data, resulting in embarrassment, inconvenience, and financial feature in browsers – the same origin policy. The same loss. origin policy states that JavaScript from one origin should Some of the most prominent attacks that exist on the web not be able to read the private documents loaded from another today are code injection attacks. In a code injection attack, origin. Although this policy prevents adversaries from trivially the adversary injects malicious JavaScript or HTML code obtaining users’ information, it is by no means a panacea. into a benign web page the user is viewing, allowing the Countless attacks have been discovered that compromise attacker to either perform sensitive actions on behalf of the users’ data despite the same origin policy’s restrictions. user or steal user’s sensitive information. In this paper, we One way for an adversary to circumvent the same origin focus on the scenario where a code injection attack leads policy is to steal sensitive information using cross site to the exfiltration of users’ personal information, which we scripting (XSS) attacks. Cross site scripting attacks happen refer to as a data-exfiltration attack. when an adversary is able to inject JavaScript code onto a Many browser vendors and researchers have come up victim site’s page and lure the user into visiting this page. with solutions to protect users from data-exfiltration attacks. When the user visits this malicious page, the attacker’s script However, much of the existing work is based on prohibiting will execute in the context of the victim’s origin. At this information flow to unauthorized web servers. This paper point, the attacker can exfiltrate any private data on that page presents a new class of data-exfiltration attacks that circum- and can often use certain private data on that page (such as vent these existing defenses for data-exfiltration. To launch CSRF tokens) to perform further unauthorized actions (e.g. this attack, the adversary first stores users’ information in transfer money to the attacker’s account). For the rest of this the database of a whitelisted site, then later independently paper, we will focus on the first step of the attack where connects to the whitelisted site to retrieve the information. the adversary simply wishes to obtain access to the sensitive Data type Script attacker Non-script attacker Data exfiltration defense Year of release Cookie X Navigator 3.0 1996 Password X X Noxes 2006 CSRF Token X X Dynamic tainting (Vogt et al.) 2007 Capability-bearing URL X X SOMA 2008 Personal data X X HTTP Fences 2008 Content Security Policy 2010 Table I Security Style Sheet 2011 PERSONAL DATA VULNERABLE TO A DATA-EXFILTRATION ATTACKS. Dynamic tainting (Jang et al.) 2011 Table II EXISTING DEFENSES FOR DATA EXFILTRATION AND THEIR YEAR OF RELEASE. data. Table I lists examples of private data that can be stolen by this attack. We proceed by describing these data types in detail. where the adversary exports user’s private data to a server controlled by the attacker, possibly using a code injection • Cookies – Cookies are commonly used as authentication vulnerability. It has to be noted that although cross site tokens. If an attacker manages to compromise a user’s scripting vulnerabilities can lead to data exfiltration attacks, session cookie, she can subsequently impersonate the an adversary doesn’t necessarily need JavaScript to exfiltrate user and initiate unauthorized transactions within the data. We will illustrate in Section III how an adversary can same site. To protect session cookies from being exfiltrate data without executing any JavaScript. compromised by the adversary, many websites label their cookies as “HttpOnly.” HttpOnly cookies cannot A. Existing Defenses be obtained using Javascript, hence a web attacker can Various defenses have been proposed to mitigate data- only steal a cookie if the HttpOnly flag is absent. exfiltration attacks by regulating outgoing data based on its • Password – Most modern browsers offer password destination address. When the victim page makes a request managers that store users’ passwords for websites they containing sensitive data to an unknown origin, the defense visit. These saved passwords will be automatically filled mechanism will assume the request to be malicious and deny in for future visits. An adversary could inject fraudulent it. Unfortunately, this does not solve the core of the problem; HTML elements into the victim’s website to trick the the attacker can still export users’ confidential information to browser’s password manager into exposing the user’s a white-listed origin. For the rest of this section, we describe login credentials. the current mitigations for data-exfiltration (summarized in • CSRF Tokens – Many websites use cross site request Table II) and explain why they are insufficient to stop real forgery (CSRF) tokens to distinguish legitimate requests world adversaries. from forgeries. In a nutshell, CSRF tokens are shared Dynamic tainting is a technique that tracks the information secrets between the client and the server. The server will flow of sensitive data. The goal is to ensure sensitive data only accept requests made from the client if the client is never sent to an unauthorized source. Dynamic tainting presents a valid CSRF token. This effectively prevents of web-based data was first seen in Netscape Navigator malicious websites from tricking the victim’s browser 3.0 [2]. Users of Navigator 3.0 may enable the taint feature into issuing a bogus request. However, if the attacker to enforce that data loaded from one website will never is able to obtain a CSRF token, she can use it to forge be exported to another website. Besides Netscape, many requests, bypassing the existing defense. academic researchers also proposed to use dynamic tainting • Capability-bearing URLs – Michal Zalewski sug- to mitigate web vulnerabilities. In a paper published in NDSS gested in his blog post [1] that capability-bearing URLs 2007 [3], Vogt et al. proposed to use dynamic tainting to are also vulnerable to data theft. These capability- mitigate cross site scripting attacks. Most recently, Jang et bearing URLs are utilized by many web applications to al. also suggested using dynamic taint tracking to protect issue invitations, enable sharing of private user content, browser plugins from leaking sensitive user information [4]. and to implement single sign-on flows. If an adversary Dynamic tainting requires the browser to differentiate be- is able to obtain a capability-bearing URL, she can gain tween trustworthy and non-trustworthy sites, because it needs access to the user’s private data, or even worse, full to decide whether or not sensitive information should be sent. access to the user’s account. However, the boundary of trustworthiness is not so clear; • Personal Data – Many websites include users’ personal the adversary can export the data to a trusted site, then later information, such as their addresses, phone numbers, independently connect to it to retrieve the information. content of their private emails, and even the contact HTTP Fences [5] uses the term “information leak attack” information of all their friends.

Load more