Protected Web Components: Hiding Sensitive Information in the Shadows
IT SECURITY

Philippe De Ryck, Katholieke Universiteit Leuven, Belgium
Nick Nikiforakis, Stony Brook University
Lieven Desmet, Frank Piessens, and Wouter Joosen, Katholieke Universiteit Leuven, Belgium

Third-party code inclusion is rampant, potentially exposing sensitive data to attackers. Protected Web components can keep private data safe from opportunistic attacks by hiding static data in the Document Object Model (DOM) and isolating sensitive interactive elements within a Web component.

The Web has evolved from including static images and document links to comprising Web applications with individual components provided by numerous service providers. When a Web application incorporates third-party components using remote scripts, the user’s browser will run the third-party code within the security context of the Web application. This not only exposes the code’s functionality to the Web application but also gives the included code full access to the Web application’s client-side context, including the page’s content, local data, and origin-protected functionality. This lack of code isolation can have severe consequences if the included code doesn’t behave correctly.

Consequently, by including potentially untrusted remote scripts, a Web application developer accepts a certain risk, both for the site’s integrity and for the safekeeping of user data.

Opportunistic attacks on the client-side content of a Web application can be mitigated by hiding private data and sensitive elements from potentially malicious scripts. For example, iframes support content isolation in a webpage, albeit with a large overhead and a lack in flexibility for integration in highly dynamic, visually streamlined Web applications.
Alternatively, JavaScript sandboxing techniques support code isolation [1,2] but don’t offer isolation of data in the Document Object Model (DOM) [3]. Finally, the recent Web Components specification lets developers instantiate custom HTML tags for use within the page [4]. A major feature of such custom elements is the support for a hidden DOM, known as the Shadow DOM [5]. Unfortunately, the Web Components specification focuses on functional separation of the DOM and doesn’t offer security features or code isolation.

Here, we motivate the need for a flexible mechanism that supports the isolation of the user’s private data in the DOM, as well as the isolation of sensitive elements, such as input elements of a login form. Furthermore, we investigate the properties of the Web Components specification and show that there’s a potential for offering the desired level of isolation without compromising the much-needed flexibility of modern Web applications.

IT Pro, January/February 2015. Published by the IEEE Computer Society. 1520-9202/15/$31.00 © 2015 IEEE

Use Cases and Existing Technologies

Integrating third-party components using remote scripts is common on the Web. Examples include programming APIs and development frameworks (such as jQuery and Bootstrap), advertising services (such as DoubleClick and AdSense), Web analytics tools (such as Google Analytics), and social media plug-ins (such as Facebook’s “like” button). A 2012 study of remote JavaScript inclusions on the Alexa top 10,000 sites showed that 88.45 percent include at least one remote script, and one site even included scripts from 295 remote hosts [6]. Furthermore, 68.37 percent of sites included the Google Analytics library, and 79.74 percent included at least one Google library. Finally, the study applied a set of metrics to show that 12 percent of sites that were deemed security conscious included scripts from sites that deployed weak security measures.

Including remote scripts not only creates a vector for attacks targeting a specific Web application, but it also presents an attack vector for opportunistic attackers, who aim to execute low-profile attacks on a large number of Web applications. Such attacks can yield large quantities of sensitive information, for example by scraping the webpage’s user-specific content, recording user-provided input in form fields, and extracting security tokens and session identifiers. Even when developers carefully select only trusted third parties for remote script inclusion, a certain risk persists, because third-party providers can be compromised as well. The dangers of third-party script inclusions are best illustrated by real-world examples, such as on-screen keyboard scraping malware [7], malware spread through advertisements [8], or actual compromises of third-party providers [9,10].

An opportunistic attacker can gain access to the Web application’s client-side context through several attack vectors, for example by compromising a remotely included script or advertisement, or through a cross-site scripting (XSS) attack. Because of the wide variety of sites that can be compromised through a malicious script or advertisement, opportunistic attackers carry out nontargeted attacks, such as looking for input elements of the type password, or scraping any user-specific displayed content, such as email messages, health records, and bank statements.

Use Cases

In light of the opportunistic attacker model, we propose three general use cases that benefit from effectively isolating data or HTML elements within the browser.

Displaying sensitive information. Many Web applications process and display user-specific information, which is often considered private and sensitive. Common examples of such private data are email messages, chat conversations, bank statements, and security challenges. Opportunistic attackers can easily inspect and collect such sensitive information because it isn’t isolated from the rest of the page, which includes third-party scripts.

An effective isolation mechanism for in-application content could prevent inspection or collection by an opportunistic attacker.

Protecting security tokens. Application-related, hidden security tokens, often associated with a user’s session, are a variant of displayed private information. For example, the security tokens protecting against cross-site request forgery (CSRF) attacks are embedded as hidden form elements [11].

Hiding such security tokens from opportunistic attackers raises the security level of the applied countermeasures, thereby eliminating alternative attack vectors.

Protecting sensitive input elements. A third use case focuses on protecting client-side input elements, in contrast to hiding server-delivered content. Most Web applications contain sensitive input elements, such as HTML password elements and on-screen keyboards. Opportunistic attackers can easily gather sensitive user-provided data by using generally applicable selectors for sensitive input elements.

Isolating such sensitive input elements from opportunistic attackers ensures that user-provided input cannot easily be stolen with a nontargeted attack. Note that such an isolation mechanism must extend to the event handlers associated with isolated input elements.

Motivating Empirical Evidence

The inclusion of potentially untrusted third-party code into a Web application is a common though potentially dangerous practice [6]. Two important industry-driven surveys of the most critical software errors warn of this risk. The Open Web Application Security Project (OWASP) Top Ten Project, which lists the 10 most dangerous risks for Web applications, gives “using components with known vulnerabilities” ninth place [12]. A similar initiative, the CWE/SANS Top 25 Most Dangerous Software Errors, puts “inclusion of functionality from untrusted control sphere” at the 16th spot [13].

To support the high rankings in these industry surveys, and to establish the relevance of the aforementioned use cases, we conducted two relatively small-scale experiments. To support the use cases for hiding sensitive data in the DOM, we investigate popular online password managers, where the DOM holds all of the user’s passwords to every website. The second experiment supports the use case for protecting sensitive input elements by measuring the exposure of login forms to third-party script providers.

Password managers. Online password managers are used to store the multitude of authentication credentials required on the modern Web. This private and highly sensitive data is often even stored in an encrypted container, which is decrypted at the client side when the client provides the correct master key. One might expect that in such a sophisticated setup, the decrypted data is handled with care, preventing any risk of stolen or leaked data.

For seven online password managers, gathered from the top 20 results for the Google query “free online password manager,” we investigated whether they include scripts from a third party on the page that hosts the passwords in the DOM, giving these scripts full access to the user’s credentials. As Table 1 shows, six of the seven (86 percent) include third-party scripts from at least one remote host on the page that displays the user’s passwords. The Ghostery browser extension (https://www.ghostery.com/en/) classifies all of these scripts as analytics. Additionally, two password managers include scripts from additional remote hosts on their main page, which is situated within the same origin as the sensitive page.

Table 1. Six of the seven highest ranking free online password managers include at least one remote script on the user password page.

Search ranking   Name                   No. of remote scripts
1                PassPack               1
3                LastPass               1
4                Norton Identity Safe   4
5                Keeper                 1
8                Dashlane               1
10               Clipperz               0
16               Mitto                  1

Login forms. Almost every webpage has a login form, which is a trivial target from which an opportunistic attacker can extract user credentials. We crawled the Alexa top 1,000 sites, looking for login forms situated on a page with third-party script inclusions, thereby giving the third party full access to the login form.