BrowserAudit: Automated Testing of Browser Security Features

Charlie Hothersall-Thomas, Netcraft Ltd, UK, [email protected]
Sergio Maffeis, Department of Computing, Imperial College London, UK, [email protected]
Chris Novakovic, Department of Computing, Imperial College London, UK, [email protected]

ABSTRACT
The security of the client side of a web application relies on browser features such as cookies, the same-origin policy and HTTPS. As the client side grows increasingly powerful and sophisticated, browser vendors have stepped up their offering of security mechanisms which can be leveraged to protect it. These are often introduced experimentally and informally and, as adoption increases, gradually become standardised (e.g., CSP, CORS and HSTS). Considering the diverse landscape of browser vendors, releases, and customised versions for mobile and embedded devices, there is a compelling need for a systematic assessment of browser security.

We present BrowserAudit, a tool for testing that a deployed browser enforces the guarantees implied by the main standardised and experimental security mechanisms. It includes more than 400 fully-automated tests that exercise a broad range of security features, helping web users, application developers and security researchers to make an informed security assessment of a deployed browser. We validate BrowserAudit by discovering both fresh and known security-related bugs in major browsers.

Categories and Subject Descriptors
D.4.6 [Operating systems]: Security and Protection

Keywords
Web security, testing, same-origin policy, Content Security Policy, Cross-Origin Resource Sharing, clickjacking, cookies

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author. Copyright is held by the owner/author(s).
ISSTA'15, July 12–17, 2015, Baltimore, MD, USA
ACM 978-1-4503-3620-8/15/07
http://dx.doi.org/10.1145/2771783.2771789

1. INTRODUCTION
Personal data, business transactions, critical infrastructure and even cars, refrigerators and lightbulbs are exposed through web interfaces to a wide variety of web browsers. Hence, the browser plays a key role in the modern information infrastructure, as the main gateway to access the information and capabilities made available online.

As such, browsers need to offer a variety of standardised security mechanisms which can be relied upon uniformly by the client side of web applications, in order to deliver security guarantees to their users. For example, the same-origin policy (SOP) [34] is effective at preventing a range of cross-site scripting (XSS) attacks [38] against users’ web browsers and is an integral aspect of modern web-based security. On the other hand, it is sometimes excessively strict; for instance, it forbids the sharing of information between different subdomains, a common requirement of large web sites. It is also coarse-grained, and several attempts have been made to enforce finer-grained access control [41, 39] and origins [22, 23, 29] in the browser. A variety of contemporary web browsers implement the Cross-Origin Resource Sharing (CORS) [46] standard, which may be used to control the flow of information between server-side resources and client-side scripts that attempt to access those resources via APIs.

However, even fully-compliant implementations of the SOP and CORS mechanisms in some cases do not regulate access to other resources, such as images, embedded objects and web fonts, that can leave web applications vulnerable to cross-site request forgery (CSRF) attacks [20], clickjacking [36], framebusting [43] and CSS-based attacks [33]. The Content Security Policy (CSP) standard [45] enables much finer-grained control over the loading of arbitrary resources on a web page, mitigating several of these issues. These are just some examples of established and emerging security mechanisms offered by modern browsers.

Such mechanisms are often introduced experimentally and informally. As adoption increases, they gradually become standardised, and after numerous security reviews and bug reports they can eventually be relied upon consistently across browsers [19, 37, 20]. Reaching that stage is not easy. For example, correctly implementing the CSP specification is non-trivial: it is a lengthy document with many cross-references to other standards and RFCs, many of which have been superseded by newer (and conflicting) standards and RFCs. It is possible that a browser vendor could incorrectly implement some part of the CSP and thus fail to provide some of its security guarantees to their users. There is therefore a need for an automated tool that enables browser developers to complement low-level unit tests targeted at isolated source code modules with high-level testing of the effectiveness of the implementation of the security features once the browser is deployed.

In this paper we introduce BrowserAudit, a framework for testing whether a deployed browser correctly enforces the security guarantees implied by the main standardised

security mechanisms. For practical purposes, we present BrowserAudit as a standalone web application that automatically tests the browser used to access it. BrowserAudit has been designed with different sets of users in mind. A casual web user can run the tests to gain a simple security assessment of their browser: critically vulnerable, non-critically vulnerable, or okay. With the recent surge of security breaches reported in the news, people are becoming increasingly security-conscious and we believe there is an increasing demand for tools that inform the public about security. A security researcher can benefit even more, viewing a detailed breakdown of each test result, and seeing which security features passed our tests and which had problems. We display textual descriptions for each category of tests and the client-side source code of the tests. Browser developers can use BrowserAudit to debug their security features and web developers can use it as a way to ascertain the security capability of users’ browsers (Section 2). We chose to implement a careful selection of tests that covers both the most important browser security mechanisms that should be implemented in any browser, and some of the most promising experimental ones that are not yet widely implemented. Starting from the code of individual test cases, we identified and generalised common patterns in order to automatically generate hundreds of tests. BrowserAudit automatically tests over 400 behaviours where a certain action should either be allowed or blocked according to an implied browser security policy (Section 3).

We designed BrowserAudit to be efficient and scalable, and we evaluated its performance and its accuracy extensively by running it on a number of browsers and platforms. Using BrowserAudit, we have discovered several previously unknown security bugs in recent versions of Mozilla Firefox, which we have reported to the developers (Section 4.4).

Whilst there are well-understood methodologies for generating unit tests for a given code base, there is no general solution to the problem of testing the end-to-end security behaviour of a family of applications (in our case web browsers) that must respect precise interoperability constraints (web standards) but can widely differ in implementation architectures, languages and design. Hence, we faced a significant challenge when developing our tests, carrying out a substantial amount of practical experimentation, guided by the official RFCs, our formal and informal models of web security, and a substantial body of academic and practical research on browser and web security (surveyed in Section 5.1). Although we believe that BrowserAudit is unique in its focus and breadth, we were inspired by a number of related web applications described in Section 5.2.

Contributions. Summarising, our main contributions are:

• We analysed the specifications of HTML5, CSP, CORS and HTTP Strict Transport Security (HSTS), identifying the concrete security guarantees implied by the proposed mechanisms. This allowed us to formulate precise goals for security test cases.

• We built a suite of more than 400 browser security tests, which brings together a wealth of explicit and implicit knowledge of the guarantees afforded by modern browser security mechanisms. We made the tests available to the community by open-sourcing the BrowserAudit code base [32].

• We implemented the first fully-automated web application that comprehensively tests browser security features and provides detailed information to a variety of user bases.

• We used BrowserAudit to discover previously unknown vulnerabilities in a major web browser.

2. DESIGN OVERVIEW
The goals underlying the design of BrowserAudit are the following:

• Wide coverage: BrowserAudit should demonstrate that a wide range of browser security mechanisms can be tested automatically, reliably and efficiently. Complete test coverage of any such mechanism is not practically feasible, and beyond the scope of this project.¹

• Extensibility: By its very nature, BrowserAudit will always be a work in progress. As the browser threat landscape evolves, more tests will be needed to cover new security mechanisms, or to extend the coverage of existing ones. Our design should ease the task of creating, debugging and integrating additional test cases.

• Ease of use: BrowserAudit should be easily accessible on any modern browser connected to the Internet, without the need to install additional software. It should require no interaction from the user, otherwise running hundreds of tests would be impractical. Moreover, relying on user interaction would prevent the desired aim of running the tests transparently in the background.

• Broad audience: Our design should support a diverse range of users. A report on the security effectively offered by a deployed browser should benefit browser developers, penetration testers, security researchers and web users.

• Scalability: Our design should be scalable on the server side. Several users may be testing their browser at the same time, and many security tests concern features that involve communicating with the server.

We now sketch the architecture of BrowserAudit and highlight the main design choices. We defer further implementation details to Sections 3 and 4.1.

2.1 User Experience
BrowserAudit is accessible by simply pointing the browser to be tested to https://browseraudit.com/. This is a landing page that briefly describes the aims of the project and contains a “Test me” button to move the user to the actual test page, hosted at https://browseraudit.com/test. This intermediate step avoids surprising users by actively requiring their consent to begin the testing phase. Once the user clicks to start the tests, the main testing loop initiates.² BrowserAudit is completely automated, and the user does not need to interact with the browser whilst it is being tested.

¹ For example, an exhaustive test of the same-origin policy would also need to demonstrate that, for any domains A and B, a page from domain A cannot access certain properties of a page from an incompatible domain B.

² Unless JavaScript is disabled, in which case we display a warning to the user. Automated tests cannot be run without JavaScript, and some security features need JavaScript in order to be exercised.
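The footnote on exhaustive same-origin testing explains why a suite has to settle for representative origin pairs. The following is a minimal sketch of how such pairs might be enumerated, not the BrowserAudit implementation: it assumes the four hostnames that BrowserAudit uses (listed in Section 2.2), and the function names and the shape of each generated case are hypothetical. It also ignores relaxations of the same-origin policy and considers only the default rule that scheme, hostname and port must all match.

```javascript
// Illustrative sketch only: not taken from the BrowserAudit code base.
// The hostnames are the four BrowserAudit domains; everything else
// (function names, case format) is a hypothetical choice for this example.
const HOSTS = [
  "browseraudit.com",
  "test.browseraudit.com",
  "browseraudit.org",
  "test.browseraudit.org",
];
const SCHEMES = ["http", "https"];

// Build one representative origin per (scheme, host) combination.
function representativeOrigins() {
  const origins = [];
  for (const scheme of SCHEMES) {
    for (const host of HOSTS) {
      origins.push(`${scheme}://${host}`);
    }
  }
  return origins;
}

// Two URLs share an origin when scheme, hostname and port all match.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Pair every source origin with every destination origin and record
// whether script access from source to destination should be blocked.
function generateSopCases(origins) {
  const cases = [];
  for (const source of origins) {
    for (const dest of origins) {
      cases.push({
        title: `${source} -> ${dest}`,
        source,
        dest,
        shouldBeBlocked: !sameOrigin(source, dest),
      });
    }
  }
  return cases;
}

const sopCases = generateSopCases(representativeOrigins());
console.log(`${sopCases.length} cases generated`);
```

Each generated case could then be handed to a test template that performs the access and compares the observed outcome with `shouldBeBlocked`; keeping one case per test, rather than bundling them, makes an individual failure easy to locate.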

As the tests are running, the user can see a progress bar advancing, and four test counters being incremented, as shown in Figure 1. For the benefit of typical web users, test runs are categorised using a simple Okay/Warning/Critical/Skipped traffic light indicator. Okay denotes passed tests, Warning and Critical denote failed tests, and Skipped denotes tests that are skipped because the feature being tested is not supported by the browser. Failures regarding SOP, cookies, and the Referer header, which we consider the most crucial security features, are reported as Critical; failures regarding CSP, CORS, HSTS and the X-Frame-Options header are reported as Warnings. This distinction is somewhat arbitrary, and will change as these features become more broadly supported and new ones are introduced.

Figure 1: The test summary box part-way through the execution of our tests.

After the test suite has finished running, the grey background of the summary box assumes the colour of the worst failed test, or green if all tests passed. This traffic light indicator provides a basic level of information about the current level of security offered by the browser.

More sophisticated users, such as security researchers or browser developers, need more information on the tests performed and on their outcomes. Clicking on the “Show/Hide Details” button displays a summary box that shows the various categories of tests (reflecting the security mechanisms that have been tested), and the number of failed tests for each of them, as shown in Figure 2.

Figure 2: BrowserAudit summary box.

Each category can be expanded and collapsed to show a description of the corresponding security mechanism, and a list of sub-headers that in turn can be expanded to reveal individual tests for a specific feature, as illustrated in Figure 3. For each individual test we show a descriptive title that can be clicked to show the client-side source code of the test itself. Our design uses the Bootstrap front-end framework [1], which makes it easy to produce a layout that works consistently across browsers and devices.

Figure 3: Some sub-categories of CSP tests, with expandable test titles and result indicators.

2.2 Architecture
The client side and server side of BrowserAudit work together in order to run tests in the browser: the server side exercises browser security features, and the client side tests that these features are implemented as expected.

When multiple concurrent users access BrowserAudit, we need to avoid congestion on the server side, as testing each browser causes a bursty interaction with the BrowserAudit server in the form of hundreds of requests per user per minute. For this reason, we adopt a standard three-tier server architecture, consisting of a public-facing Nginx [11] web server, a Go [16] application server and a PostgreSQL [13] database backend. The Nginx server is running as a reverse proxy in front of the Go server, which is not publicly accessible. When the Nginx server receives HTTP(S) requests for static resources, such as our JavaScript tests, it responds by directly fetching the resource from the local static/ directory. Dynamic requests are instead proxied to the Go server, and the responses are forwarded back to the client. Nginx also handles SSL termination, caching, gzip compression, URL rewriting, and keeps access and error logs. This architecture reduces the load on the Go server, which can focus on serving only dynamic requests that depend on the user’s session, and limits security risks because the Go server can run as a non-privileged user.

Certificates. In order to ensure good coverage of various security features that involve the use of HSTS and cross-origin testing, BrowserAudit makes use of four domains: browseraudit.com, test.browseraudit.com, browseraudit.org and test.browseraudit.org. The server presents a single SSL certificate that is valid for all of these domains.

Sessions. We use sessions to keep track of intermediate test results and other test-related data for each user whilst their tests are in progress. Sessions are needed because in many of our security tests, it is the server that makes the decision as to whether or not the browser passed the test, not the test framework running in the browser. In these cases, the client must send an additional request asking the server what the test result was, so that it can be displayed to the user.

Caching. In our tests, there are many cases in which a request is first made to store a default result on the server, and then a second request may be sent to overwrite this result, depending on whether or not the browser correctly

implements a given security feature. If a user runs the tests twice in short succession, and this second result was cached and therefore did not reach our server, our application would report an incorrect test result. We ensure that this cannot happen by preventing HTTP responses from being cached.

2.3 Tests
A typical test of a security feature involves making multiple AJAX or image requests to the server and checking if the actual responses match the expected responses.

JavaScript and libraries. Our tests are written directly in JavaScript, using the jQuery library [9] for convenience. We deploy our tests using the Mocha framework for browser-based JavaScript unit testing [10], with some custom modifications to improve the output layout.

    1 $.get("/del_httponly_cookie", function() {
    2   expect($.cookie("httpOnlyCookie")).to.be.undefined;
    3   $.get("/set_httponly_cookie", function() {
    4     expect($.cookie("httpOnlyCookie")).to.be.undefined;
    5     done();
    6   });
    7 });

Figure 4: The client side of a proof-of-concept HttpOnly cookie test.

Figure 4 shows a proof-of-concept test to check that the browser correctly implements HttpOnly cookies (see Section 3.4). Line 1 loads a page to clear any leftover cookies from previous test runs, line 2 checks that the cookie is not defined, line 3 loads a second page that sets the cookie, and line 4 checks that we are unable to read it via JavaScript. The call to done() on line 5 informs Mocha that the asynchronous test is complete. In order to make the source code of the tests easier to understand and maintain, we also leverage the Chai assertion library [7].

    1 function ajaxSopTest(globalTestId, shouldBeBlocked, sourcePrefix, destPrefix){
    2   // omitted code: variable initialisation
    3
    4   var test_template= function(done){
    5     $.get("/sop/"+defaultResult+"/"+id,
    6       function() {$("

generates Mocha code for testing AJAX calls with respect to the SOP. The choice of the right parameters for the resources to load (defaultResults, iframeSrc) is crucial to the correctness of each test instance. To favour modularity and coverage, we instantiate a separate Mocha test for each case to be tested, rather than bundling a large number of cases in the same test. To ensure maximum portability, we implement as much as possible on the client side using standard, browser-independent features.

Whenever possible, we write asynchronous tests using callback patterns rather than timeouts. We annotate the titles of tests whose results depend on timeouts with a small clock icon. We try to avoid using timeouts because, when a timeout expires, it is not possible to distinguish a true test failure from an anomalous delay in a browser event or network connection. Moreover, it is difficult to estimate appropriate timeout values for many events. For certain tests, however, we cannot avoid using timeouts.

For example, to detect whether a CSP policy that denies the use of JavaScript but allows the loading of fonts in an iframe is enforced correctly, the BrowserAudit test framework needs to give time for the iframe to try to load the font, and then ask the server if the font was requested. We are not allowed to run JavaScript in the iframe to inspect the page and detect whether the font was loaded; likewise, we cannot ask the user for confirmation, because our tests must run without user interaction.

3. BROWSER SECURITY MECHANISMS
In this section, we describe the range of security mechanisms currently exercised by BrowserAudit. Each mechanism induces, sometimes implicitly, a security policy. Our emphasis is on testing representative instances of behaviours that should be allowed or blocked according to the corresponding security policy.

3.1 Same-Origin Policy
In the early days of the web, there was little incentive to control the resources that could be included in a web page: most web pages were static, and web developers were free to include resources (e.g., images) from any source in their web pages. As web sites became dynamic and interactive, thus allowing web developers to include user-supplied content in

browser’s treatment of the Secure attribute both when the cookies are set by the server and set by JavaScript.
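The two cookie attributes mentioned in this paper, HttpOnly (Figure 4) and Secure, can be summarised as a pair of predicates. The sketch below is an illustrative model rather than BrowserAudit code: the cookie object shape and both function names are hypothetical, and domain, path and expiry matching are deliberately left out. It assumes the standard behaviour that a Secure cookie is only used over HTTPS and that an HttpOnly cookie is never exposed to scripts.

```javascript
// Illustrative model only; the cookie shape ({ secure, httpOnly }) and the
// function names are hypothetical and not part of BrowserAudit.
// Domain, path and expiry matching are intentionally ignored.

// True when the URL uses the HTTPS scheme.
function isSecureUrl(url) {
  return new URL(url).protocol === "https:";
}

// Should the browser attach this cookie to a request for requestUrl?
// A Secure cookie must only travel over a secure connection.
function sentWithRequest(cookie, requestUrl) {
  return !cookie.secure || isSecureUrl(requestUrl);
}

// Should document.cookie on pageUrl expose this cookie to scripts?
// HttpOnly cookies are never exposed; Secure cookies only on HTTPS pages.
function readableByScript(cookie, pageUrl) {
  if (cookie.httpOnly) {
    return false;
  }
  return !cookie.secure || isSecureUrl(pageUrl);
}

const httpOnlyCookie = { name: "httpOnlyCookie", secure: false, httpOnly: true };
const secureCookie = { name: "secureCookie", secure: true, httpOnly: false };

console.log(readableByScript(httpOnlyCookie, "https://browseraudit.com/test"));
console.log(sentWithRequest(secureCookie, "http://browseraudit.com/test"));
```

A browser test checks the same predicates from the outside: the client side observes what scripts can read, while the server side observes which cookies actually arrive with a request.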

Figure 8: The HTML for the outer iframe loaded by the test script shown in Figure 7.

Figure 7 shows the client-side code for a CSP test. The code runs on the main BrowserAudit page and loads an outer iframe from browseraudit.com with the CSP header sandbox allow-same-origin allow-scripts. This outer iframe is very simple (Figure 8), and its role is simply to load an inner iframe from browseraudit.com that is subject to the given policy: scripts can run, and have same-origin permissions. The inner frame, whose code is shown in Figure 9, tries to perform an XMLHttpRequest to test.browseraudit.com, which should be blocked. Note that since we cannot rely on user credentials to be sent with synchronous XMLHttpRequests, we pass the session cookie (abstracted for readability in Figure 9 as sessionCookie) as a parameter of the request. All of this information is also visible to the BrowserAudit user by clicking on the corresponding test title in the user interface.

Figure 9: The HTML for the inner iframe corresponding to the outer iframe shown in Figure 8.

3.5 Referer Header
The Referer header should not be included in a non-secure request if the referring page was served via a secure protocol; this behaviour is defined in RFC 2616 [31]. This requirement exists because the referrer might disclose an otherwise private information source. In BrowserAudit, we test this behaviour by loading a web page over HTTPS containing an image loaded over HTTP and checking that the Referer header was not sent to the server with the request for the image.

3.6 Response Headers

3.6.1 X-Frame-Options
X-Frame-Options, defined in RFC 7034 [42], is a server-side technique that can be used to prevent clickjacking attacks. X-Frame-Options is a response header that specifies whether or not the document being served is allowed to be rendered in a frame; more specifically, the header specifies the origin (scheme, hostname and port number) that is allowed to render the document in a frame. BrowserAudit tests for

browsers behave differently when dealing with nested frames, so we do not test these cases as there is no defined correct behaviour. Note also that not all browsers support the ALLOW-FROM directive.
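The decision that X-Frame-Options asks a browser to make can be sketched as a single function. This is a hedged illustration, not the BrowserAudit test code: the function names are hypothetical, only a single framing page is considered (nested frames are excluded, in line with the text above), and treating a missing or unrecognised header value as "no restriction" is an assumption of this sketch. The three directives handled are the ones defined by RFC 7034: DENY, SAMEORIGIN and ALLOW-FROM.

```javascript
// Hypothetical helper, not part of BrowserAudit: decides whether a document
// served with a given X-Frame-Options value may be rendered in a frame
// embedded by a single framing page. Nested frames are out of scope.

// Origin = scheme + hostname + port, as computed by the URL API.
function originOf(url) {
  return new URL(url).origin;
}

function isFramingAllowed(headerValue, documentUrl, framingUrl) {
  if (!headerValue) {
    return true; // no header present: framing is unrestricted
  }
  const value = headerValue.trim();
  const upper = value.toUpperCase(); // assumption: compare case-insensitively

  if (upper === "DENY") {
    return false;
  }
  if (upper === "SAMEORIGIN") {
    return originOf(documentUrl) === originOf(framingUrl);
  }
  if (upper.startsWith("ALLOW-FROM ")) {
    const allowed = value.slice("ALLOW-FROM ".length).trim();
    try {
      return originOf(allowed) === originOf(framingUrl);
    } catch (err) {
      return true; // assumption: a malformed origin is treated as unrecognised
    }
  }
  return true; // assumption: unrecognised values impose no restriction
}

console.log(
  isFramingAllowed(
    "SAMEORIGIN",
    "https://browseraudit.com/framed",
    "https://test.browseraudit.com/page"
  )
);
```

A test for this header compares the function's answer with what the browser actually does: the framed document either renders (and can report back to the server) or it does not.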

3.6.2 Strict-Transport-Security
HTTP Strict Transport Security (HSTS) is a security mechanism that allows a server to instruct browsers only to communicate with it over a secure (HTTPS) connection for the given domain. It exists primarily to defend against man-in-the-middle attacks in which an attacker is able to intercept the victim’s network connection [37]. The server sends this instruction via the Strict-Transport-Security header, as defined in RFC 6797 [35].

When HSTS is enabled on a domain, a compliant browser must rewrite any plain HTTP requests to that domain to use HTTPS. This includes both URLs entered into the navigation bar by the user, and resources included on a web page. The Strict-Transport-Security header should only be sent in an HTTPS response. If the browser receives the header in a response sent over plain HTTP, it should be ignored.

In BrowserAudit, we test the basic behaviour of HSTS and its includeSubDomains directive. We also ensure that the header is ignored when transferred via an insecure protocol, and that the HSTS state correctly expires based on the max-age value set in the header. All of these tests work by testing whether a request for an image at http://browseraudit.com/set_protocol is rewritten to use HTTPS or not.

Almost all current browsers support HSTS, with the notable exception of Internet Explorer 11 (the latest available version at the time of writing).

4. EVALUATION

4.1 Performance
A primary concern of BrowserAudit is scalability, given that a single invocation of the full test suite invokes approximately 1,500 requests and transfers around 3MB of data between the client and server. The server must handle all of these requests quickly (ideally in under 300ms), given the large number of tests in the BrowserAudit test suite and the reliance of some of the tests on timeouts (see Section 2.3).

The BrowserAudit web and database servers are currently hosted on a single virtualised server with two CPU cores and 2GB of memory, running Ubuntu 14.04. We evaluated BrowserAudit’s server-side performance by running the BrowserAudit test suite in 15 web browsers repeatedly and concurrently for 15 minutes. Over this period, the BrowserAudit server handled around 225,000 requests and served a total of 450MB of data. The 1- and 5-minute load averages on the BrowserAudit server are shown in Figure 10; the peak load averages over the 15-minute duration of the performance test are 1.2 and 0.7 respectively, where a load average of 1 indicates that a single CPU core is operating at capacity. Based on these performance figures, we estimate that a single BrowserAudit application server using this configuration could comfortably support up to 25 concurrent test suite executions.

As described in Section 2.2, our design is ready to be scaled up as the BrowserAudit user base grows. Nginx can be configured as a load balancer, passing requests to one of many application servers. Deploying Go application server instances is trivial thanks to Go’s ability to compile a program to a single statically-linked binary, so there is no dependency chain. In order to maintain session persistence, Nginx’s ip_hash directive can be used to ensure that all requests from the same IP address reach the same application server, maintaining the integrity of a single suite execution.

Figure 10: The 1- and 5-minute load averages on the BrowserAudit server during the performance evaluation. [Plot: load average (y-axis) against time in minutes (x-axis), with 1-minute and 5-minute series.]

Most client-side tests contain components that are loaded synchronously inside dynamically-created iframes, which become redundant as soon as the test result is reported in the browser; over time, the DOM of the main BrowserAudit window would therefore amass an overwhelming number of iframes, slowing down the execution of tests as the browser struggles to create and append additional iframes. We avoid this problem by dynamically removing any iframes appended to the DOM during each test’s tear-down phase (via Mocha’s afterEach() routine). We ran 15 repetitions of 10 concurrent executions of the whole test suite on a 64-bit Windows 7 machine with a 6-core Intel i7 4930K CPU and 64GB of memory, and Chromium 40.0.2205.0. Under these conditions, the average execution time for the test suite is just over a minute. By contrast, a single execution in Safari 8.0 on an iPhone 5 with iOS 8.1 takes on average 1.35 minutes, skipping 24 tests. The execution time varies broadly across browsers and platforms, but we consider this an acceptable cost for performing an in-depth browser security scan.

4.2 Correctness
Verifying the correctness of our tests is challenging, as they need to convey in a final pass or fail result a whole security-sensitive behaviour: a test containing a small bug could still pass, which is generally the expected result for browsers correctly implementing a given security mechanism.

Of course, no web browsers contain intentional security flaws that would allow us to verify the correctness of tests. Modifying the source code of existing open-source browsers to break their security features in order to ensure that tests fail when expected is possible but challenging given the complexity of modern web browser code bases.

However, it is a matter of public record that some web browsers either do not implement some of the security mechanisms tested by BrowserAudit, or only implement subsets of those security mechanisms. We leverage the results of browser-profiling projects such as Browserscope [3] and Can I Use... [6] to broadly identify the security features implemented by each web browser, and for those features we manually verify that the BrowserAudit test suite results are accurate.

Using BrowserStack [5], a web-based browser testing service, we have evaluated BrowserAudit in a range of browsers on a number of different operating systems, across both

desktop and mobile platforms. The full BrowserAudit test suite runs reliably in Safari 6, Firefox 13 and Chrome 25 or more recent versions, automatically skipping tests where a feature is not supported. BrowserAudit also runs correctly on Internet Explorer 11, but due to problems relating to Mocha and IE’s limited call stack, it cannot execute the whole test suite. In older versions of these browsers, it is instead possible to run a subset of the test suite.

4.3 Test Coverage
We noted in Section 2 that full coverage for browser security feature tests is unattainable. Here we discuss a number of security features not covered by BrowserAudit, but that we believe can be added to our framework.

We imply in Section 3 that there is no single same-origin

origin Worker and SharedWorker objects in scripts with the policy

    default-src 'none'; script-src 'unsafe-inline'.

In both cases, the 'unsafe-inline' declaration in the policy states that only inline stylesheets and scripts must be permitted: external resources, even those from the same origin, must be blocked. We reported both of these bugs to Mozilla during the version 29 release cycle, and they were fixed in version 33 of Firefox.

Firefox does not currently implement the sandbox CSP directive; this optional feature of the CSP 1.0 specification directs browsers to relax the given security controls on iframes embedded in the page, as if they had been supplied in the sandbox attribute of each