SSOScan: Automated Testing of Web Applications for Single Sign-On Vulnerabilities Yuchen Zhou and David Evans, University of Virginia https://www.usenix.org/conference/usenixsecurity14/technical-sessions/presentation/zhou

This paper is included in the Proceedings of the 23rd USENIX Security Symposium. August 20–22, 2014 • San Diego, CA ISBN 978-1-931971-15-7

Open access to the Proceedings of the 23rd USENIX Security Symposium is sponsored by USENIX SSOScan: Automated Testing of Web Applications for Single Sign-On Vulnerabilities

Yuchen Zhou David Evans University of Virginia [yuchen, evans]@virginia.edu http://SSOScan.org

Abstract To better understand and mitigate these risks, we de- Correctly integrating third-party services into web ap- veloped SSOScan, an automated vulnerability checker plications is challenging, and mistakes can have grave for applications using SSO. SSOScan takes a website consequences when third-party services are used for URL as input, determines if that site uses SSO, security-critical tasks such as authentication and autho- and automatically signs into the site using Facebook test rization. Developers often misunderstand integration re- accounts and completes the registration process when quirements and make critical mistakes when integrating necessary. Then, SSOScan simulates several attacks on services such as single sign-on APIs. Since traditional the site while observing the responses and monitoring programming techniques are hard to apply to programs network traffic to automatically determine if the appli- running inside black-box web servers, we propose to de- cation is vulnerable to any of the tested vulnerabilities. tect vulnerabilities by probing behaviors of the system. We focus only on Facebook SSO in this work, but our This paper describes the design and implementation of approach could be used to check SSO integrations using SSOScan, an automatic vulnerability checker for appli- other identity providers or other protocols. Many of our cations using Facebook Single Sign-On (SSO) APIs. We techniques could also be adapted to scan for vulnerabili- used SSOScan to study the twenty thousand top-ranked ties in integrating other security-critical services such as websites for five SSO vulnerabilities. Of the 1660 sites online payments and file sharing APIs. in our study that employ Facebook SSO, over 20% were found to suffer from at least one serious vulnerability. 1.1 Contributions 1 Introduction Our work makes two types of contributions: those related to the construction of our scanning tool which are largely Single Sign-On (SSO) services are increasingly used to independent of the particular vulnerabilities, and those implement authentication for modern applications. SSO- resulting from our large-scale study of Facebook SSO enabled applications allow users to log into an applica- implementations. tion using an established account (with a service such as Facebook or ) and connect their account on the SSOScan. We explain the design and implementation of new site to an established Internet identity. Should the SSOScan (Section 3), as well as how to handle some of application need more information from the user, it may the challenges in the automation process. We describe ask the user for extra permissions from the established techniques that automatically perform user interactions service. Once granted, the requested information is re- to walk through the SSO process (Section 3.1), includ- turned to the application, which can then be used in the ing clicking the correct buttons and filling in registration transparent account registration process. forms. We collected information of almost 30,000 click Although these services provide SDKs intended to en- attempts for sites that implement Facebook SSO which able developers without security expertise to integrate shows in detail how the individual heuristics are affecting their services, actually integrating security-critical third- SSOScan’s behavior (Section 5.2). This provides exper- party services correctly can be difficult. Wang et al. iden- imental evidence to support our design choices and shed tified several ways applications integrating SSO SDKs light on future research that shares a similar goal. SSO- can be vulnerable to serious attacks even when develop- Scan can detect whether a target application contains any ers closely follow the documentation [27]. of the five vulnerabilities listed in Section 2.2 with an

USENIX Association 23rd USENIX Security Symposium 495 average testing time of 3.5 minutes, and is able to check used in the first exchange. 792 (81%) of the 973 websites that implement functional A signed request is a base64 encoded Facebook SSO from the top 10,000 with no human inter- Signed request. string that contains a user identity, a code, and a signa- vention at all. ture that can be verified using an application’s app secret Large-scale study. We ran SSOScan on the top 20,000 and some other metainformation. Once issued, it is not US websites (Section 4). Key results from the study in- tied to Facebook (except for the enveloped code), and the clude finding at least one vulnerability in 345 of the 1660 signature can be verified locally. sites that use Facebook SSO (Section 4.1). We also learn how vulnerability rates vary due to different ways of in- tegrating Facebook SSO (Section 4.1.1). We manually 2.2 Vulnerabilities analyzed the 228 sites ranked in the top 10,000 that SSO- Our interest in building an automatic scanning tool was Scan cannot test automatically and report on the reasons initially motivated by the access token misuse vulnera- for failures (Section 4.2). Our study reveals the complex- bility reported by Wang et al. [27]. We further identified ity of automatically interacting with web sites that follow four new vulnerabilities that are both serious and suitable a myriad of designs, while suggesting techniques that for automatic testing. The first two vulnerabilities con- could improve future automated testing tools. In Sec- cern confusions about how authentication and authoriza- tion 6, we discuss our experiences reporting the vulnera- tion are done; the other three concern failures to protect bilities to site owners and possible ways SSOScan could important secrets. be deployed. Access token misuse. This vulnerability stems from confusion about authentication and authorization. In 2 Background OAuth 2.0, an access token is intended for authoriza- tion purposes only because it is not tied to any specific This section provides a brief introduction to single sign- application. When a service uses an access token to au- on systems, describes the vulnerabilities we checked, and thenticate users, it will also accept ones granted to any summarizes relevant previous work. other application. Figure 1 illustrates an impersonation attack that exploits this vulnerability: Alice visits Mal- lory’s website (step 1), logs in using Facebook SSO (2), 2.1 Single Sign-On and receives an access token from Facebook (3). Then, Mallory’s client-side code running in Alice’s browser A typical single sign-on process involves three parties. forwards the access token to Mallory (4), which presents Alice first visits a web application and elects to use SSO the token to a vulnerable application’s server (5). After to login. She is then redirected to the identity provider’s confirming the token represents Alice, Foo’s application SSO entry point (e.g., Facebook’s server). After she logs server authenticates Mallory as Alice (6). into Facebook, her OAuth credentials are issued to the application server. The application server confirms the Facebook User identity and authenticates the client. 2. Login OAuth uses three different types of (rather confu- singly-named) credentials: 3. Issue credentials 4. Forward 1. Visit Access token. An access token represents permissions credentials Foo app server granted by the user. For example, the application may Mallory request that user grant permission to access the birth- 6. Authenticated day and friend lists from her Facebook account. Upon 5. Reuse credentials the user’s consent, a token will be issued and forwarded to the application which may then use it to obtain the granted information from Facebook. An access token Figure 1: OAuth credential misuse eventually expires, but may be valid for a long time. Code. A code is used to exchange for an access token Signed request misuse. Sometimes developers have through the identity provider. This exchange requires the chosen the correct OAuth credentials to use, but still end application’s unique app secret to proceed. If the secret up with a vulnerable implementation. One way this hap- does not match, Facebook will not issue the token. This pens is when information decoded from a signed request means a code is bound to a user as well as a target appli- is used but the signature is never checked using the cation. With Facebook SSO, the code expires after being app secret. The attack to exploit this vulnerability is

2 496 23rd USENIX Security Symposium USENIX Association similar to the previous one, except that Mallory needs tive source-sink pairs. However, these techniques require to reuse the signed request in addition to access token. white-box access to the application (at least at the level of its binary), which is not available for remote web appli- When a developer registers an appli- App secret leak. cation testing. Automated web application testing tools cation with Facebook, she receives an app secret. It that work on the server implementation [1, 8, 16] do not is essential for the application owner to keep it a se- apply to large-scale vulnerability testing well. They ei- cret because the app secret is used as the key to cre- ther require access to application source code or other ate signed requests and to access many other privileged specific details such as UML or application states. For functionalities. However, careless developers may reveal our purposes, the test target (application server imple- this secret to clients, especially when using code flow to mentation) is only available as a black box. authenticate users. By design, the code and app secret must be sent from the application’s back end server to Automated security testing. Penetration testing is Facebook in exchange for an access token. When this widely used to check applications for vulnerabilities [15, exchange is carried out through the client instead of the 28]. The tester analyzes the system and performs sim- server, app secret is exposed to any malicious client. ulated attacks on it, often requiring substantial manual effort. More automated testing requires an oracle to de- User OAuth credentials leak. The last two vulnerabili- ties both leak a user’s OAuth credentials. When the Face- termine whether or not a test failed. Sprenkle et al. book OAuth landing page contains third-party content, developed a difference metric by comparing two web- requests to retrieve those contents will automatically in- pages based on DOM structure and n-grams [21] and im- clude OAuth credentials in the referer header, which proved results using machine learning techniques [22]. leaks them to the third-party. To thwart this leakage, SSOScan also requires an oracle (Section 3.2) to deter- Facebook offers a layer of protection by only allowing mine session identity. For our purposes, a targeted oracle access token and signed request to appear in the URL works better than their generic approach. fragments, which are not visible in the referer header. Automated GUI testing. SSOScan is also closely re- Therefore only code can be leaked via referer unless the lated to automated GUI testing. The GUI element trig- application intentionally pulls the credentials and puts it 1 gering approach we take shares some similarities with re- in the URL . In addition, credentials can be exfiltrated cent works to simulate random user interactions on GUI by third-party scripts if they are present in the page con- element to explore application execution space on An- tent. If a malicious party is able to obtain these creden- droid system [14], native Windows applications [29], and tials, it could carry out impersonation attacks or perform web applications [5, 10]. Their common goal is to ex- malicious actions using permissions the user granted the plore app execution space efficiently to discover buggy, original application, such as posting on the user’s time- abnormal or malicious behavior. By contrast, our goal is line or accessing sensitive information. Note the differ- to drive the application through a particular SSO process ence between embedding OAuth credentials in the URL rather than explore its execution space. Further, we need and in the body content is that the former will directly the tests to proceed fast enough for large-scale evalua- leak them to third parties, while the latter only leaks the tion. Since each simulated user interaction with the web credential when the embedded third party code accesses application involves round-trip traffic and a non-trivial it explicitly. delay to get the response, our primary focus is to develop useful heuristics to quickly prune search space before 2.3 Related Work triggering any user interactions. SmartDroid [32] and AppIntent [31] both aim to re- Our work builds on extensive previous work on automat- cover sequences of UI events required to reach a par- ically testing applications for vulnerabilities. We briefly ticular program state or follow an execution path ob- describe relevant approaches next, as well as previous tained from static analysis. These approaches target An- works that analyze vulnerabilities in SSO services. droid applications and rely on client-side information Program analysis. Program analysis techniques such as that is not available for our web application scanning static analysis [3] and dynamic analysis including sym- tool, where the necessary state only exists on the (inac- bolic execution [7, 17] automatically identify vulnera- cessible) server side. bilities with fast testing speed and good code coverage. Human cooperative testing. Off-the-shelf testing tools Runtime instrumentation techniques such as taint track- like Selenium [19] and TestingBot [24] can be used to ing [11] and inference [18] also help to safeguard sensi- discover bugs in web applications under developers’ as- 1Surpsingly, we found several sites doing this (e.g., dealchicken.com sistance. These tools replay user interactions based on and bloglovin.com). testing scripts that are manually created by the applica-

3 USENIX Association 23rd USENIX Security Symposium 497 No FB SSO tion developer. BugBuster [6] offers some automatic web Detected application exploration capabilities, but still does not un- No No SSO Yes See FB SSO SSO process Registration derstand the application context enough to perform any button finder traffic? finished automation non-trivial actions such as those involving authentication Yes Start SSO Homepage SSO process Oracle and business logic. Button clicked automation To reduce developer effort, Pirolli et al. [13], Elbaum Yes More strategies No Registration et al. [9], and the Zaddach tool [12] show promising re- to try? successful No Yes sults by collecting interactions from normal users and re- Give up/ Vulnerability playing them to learn application states and invariants for Manual analysis Tester vulnerability scanning. These works do not require extra manual effort from developers to write testing script or Figure 2: Enroller Overview. specify user interactions. However, one potential prob- Ovals represent testing states, curved rectangles represent different lem these works fail to address is user’s privacy con- modules in our tool, and diamonds represent control flow decisions. cerns when submitting interactions. This could be es- pecially sensitive when the actions involve passwords or ules necessary to understand the results in Section 4, but payments. SSOScan avoids this problem and is comple- defer the details of our heuristics to Section 5. mentary to this line of work — SSOScan attempts to scan applications in a fully automatic fashion and does not re- quire traces from any party. 3.1 Enroller Single sign-on security. Single sign-on has emerged as Figure 2 shows the workflow of the Enroller. Given a an important security service and has been well-studied target web application, our tool first removes all cook- in recent years. Previous works have discovered prob- ies from the browser and navigates to the target URL. lems in protocols, bugs in SDK code and missed assump- A short delay after the page has fired its onload event, tions in developers’ implementations [4, 20, 23, 25, 27]. the SSO button finder (Section 3.1.1) analyzes the DOM Automated scanning is especially valuable for vulnera- and outputs the most likely candidate elements for SSO bilities that cannot be simply fixed by upgrading SDKs button. The Enroller then simulates clicks on those ele- or improving the protocols, but stem from mistakes inte- ments, monitoring traffic to listen for the Facebook SSO grating the SSO service. traffic pattern. Once a click or sequence of clicks is found Integuard [30] and AuthScan [2] have similar goals that produces the recognizable SSO traffic, SSOScan au- with SSOScan. Integuard infers invariants across re- tomatically logs into Facebook and grants the requested quests and responses and uses them to perform intru- permissions to the application. sion detection on future activities. AuthScan [2] is an About 44% of sites we tested still require a user to reg- automated tool to extract specifications from SSO imple- ister when using SSO, so it is important to automate this mentations by using both static program analysis and dy- process. SSOScan combines heuristics with random in- namic behavior probing. Our goals differ in that we focus puts to fill in and submit the forms (Section 3.1.2), and on detecting specific vulnerabilities rather than generic then uses an oracle (Section 3.2) to determine if the sub- ones. This enables us to establish clear automation goals mission succeeds. If the oracle deems the registration to and build well-defined state machines for the scanner, be a failure, the Enroller tries using different strategies and removes the uncertainties the previous works incur (Section 5) until either the oracle passes or a threshold when inferring invariants or modeling unknown func- level of effort is exceeded. The entire process succeeds tions. The drawback is our approach relies on knowledge for 80% of the websites using Facebook SSO in the top of particular vulnerabilities. For many integrated web 10,000 sites (Section 4 presents detailed results). services, including SSO, many vulnerabilities are known or can be obtained using systematic explication [27]. 3.1.1 SSO Button Finder A typical starting page, taken from huffingtonpost.com, 3 SSOScan is shown in Figure 3. SSOScan needs to first find and click the “Log in” button on the main page, and then the SSOScan consists of two main parts: the Enroller and “Log in with Facebook” button on the overlay that pops the Vulnerability Tester. The Enroller automatically reg- up afterwards. As illustrated in Figure 4, SSOScan first isters two test accounts at a web application using Face- extracts a list of qualifying elements from all nodes in book SSO. The Vulnerability Tester simulates attacks an HTML page, and then extracts content strings from and monitors traffic to test for each vulnerability. In this such elements. The Button Finder relies on the assump- section, we describe the general workflow of these mod- tion that developers put one of a small pre-defined set

4 498 23rd USENIX Security Symposium USENIX Association Extract Regex of expected words in the text content or attributes of the HTML Filtering Content matching Element String Score SSO button. It computes a score for each element by 1 2 3 matching its content with regular expressions such as [ Ll ][ Oo][Gg][IiOo][Nn] which indicates its resemblance Figure 4: SSO button finder workflow to “login”. SSOScan forms a candidate pool consisting of the top-scoring elements and triggers clicks on them. (Section 5 describes the heuristic choices SSOScan uses 3.2 Oracle to filter elements and compute scores.) The Oracle analyzes the application and determines whether it is in an authenticated state, and if so, further 3.1.2 Completing Registration identifies the session identity. This module is necessary for SSOScan to decide if a registration attempt is suc- The required interactions to complete the registration cessful. It is also used by the Vulnerability Tester to de- process after single sign-on vary significantly across web termine if a simulated impersonation attack succeeds. applications. They range from simply clicking a submit The key observation behind the Oracle is that web ap- button (e.g., Figure 5, in which all input fields are pre- plications normally remove the original login button and populated using information taken from the SSO pro- display some identifying information about the user in cess), to very complicated registration processes that in- an authenticated session. For example, after a successful volve interactively filling in multiple forms. registration many websites display a welcome message SSOScan attempts to complete all forms on the SSO that includes the user’s first name. landing page by leaving pre-populated fields untouched After the page finishes loading, the Oracle searches and processing the remaining inputs in the order of ra- the entire DOM and document.cookie for test account dios, selects, checkboxes and finally text inputs. We user information (e.g., names, email, or profile images). found this ordering to be very important to achieve We evaluate the correctness of our assumptions and ef- higher automation success, as some forms may dynami- fectiveness of our Oracle in Section 4.2. cally change what needs to be filled upon selecting differ- ent radio or select elements. Processing these elements first allows SSOScan to rescan for dynamically generated 3.3 Vulnerability Tester fields and process them accordingly. After the Enroller successfully registers two test ac- For radio and select elements, SSOScan randomly counts, control is passed to the Vulnerability Tester chooses an option; for checkboxes, it simply checks which checks the target application for the vulnerabilities all of them. For text inputs, SSOScan tries to infer described in Section 2.2. We use two different probing their requirements using heuristics and provide satisfac- approaches to cover the five tested vulnerabilities: simu- tory mock values. Once all the inputs have been filled, lated attacks and passive monitoring. the next step is to reuse the SSO Button Finder (Sec- tion 3.1.1) with different settings designed to find submit Simulated Attacks.The two credential misuse vulnera- buttons. After SSOScan attempts to click on a submit bilities are tested using simulated impersonation attacks. button candidate, it refers to the oracle to determine if We describe how this is done for signed request misuses; the entire registration process is successful. the method for checking access token misuses is similar.

1 2

Figure 3: SSO Buttons (huffingtonpost.com) Figure 5: Registration Form (espn.go.com).

5 USENIX Association 23rd USENIX Security Symposium 499 To set up the tests, we created a test application Mal Vulnerable 20.3% which uses Facebook SSO, and obtained Alice’s sign- Timeout/error 7.6% ed request for Mal. This mimics the scenario where Buggy 2.3% Alice is tricked into visiting and signing into an arbi- Facebook SSO, 9.3% trary malicious website using Facebook. After the ac- No Facebook Not Vulnerable SSO, 83.1 % 77.4% count registration finishes, we use Bob’s credentials to sign into Facebook for target application, but replace the Valid Top 20,000 sites 1660 Sites using Facebook SSO signed request in Facebook’s response with the prior re- sponse received for Alice. For consistency, we also re- Figure 6: Results overview place all access tokens found in the traffic. The attack is successful if Bob is able to login as Al- ice using the replayed signed request. The Vulnerability fewer sites (<10%) and took a week to complete running Tester deems the site vulnerable if the Oracle determines on one machine with four concurrent sessions. that Alice is logged in after the simulated attack. In July 2014, we re-ran the tests on the vulnerable sites to see how many sites had corrected the vulnerabilities. Passive Monitoring. The three credential leakage vul- The results from that scan are reported in Section 6.2. nerabilities are detected using passive approaches. For brevity, we only explain how leaks through the referrer header are detected; the other leaks are detected similarly 4.1 Automated Test Results by observing network traffic and web page contents. To check if an application leaks the user’s OAuth cre- Figure 6 presents results purely based on automatic tests dentials through the referrer header, SSOScan monitors run by SSOScan. SSOScan found a total of 1660 sites all request data during the account registration process using Facebook SSO among the 17,913 sites (9.3% of and compares each referrer header to OAuth credentials the total). Figure 7 shows the number of Facebook SSO recorded in earlier stages. If a match is found, SSOScan supported sites, sites that misuse credentials, and sites then checks if the requesting page contains any third- that leak credentials distributed by site ranking. The dot- party content such as scripts, images, or other elements ted lines on top of the bars show the average stats of all that may generate an HTTP request. SSOScan reports a sites that are more popular than that rank. In Section 4.3, potential leakage when credentials are found in the refer- we report on our manual analysis on failed tests for sites rer header for a page that contains third-party content. ranked in the top 10,000. Facebook SSO integration. Figure 7 (a) shows that more popular sites are more likely to integrate Facebook 4 Results SSO. Of the top 1000 sites, 270 (27%) of them include Facebook SSO, compared to only 52 out of the 1000 We evaluated SSOScan by running it on the list of the lowest-ranked sites in our dataset. This supports our be- most popular 20,000 websites based on US traffic down- lieve that covering the top-ranked 20,000 websites is suf- loaded from quantcast.com as of 7 July 2013. Of those ficient to get a clear picture of prevailing Facebook SSO 20,000 sites, 715 of the sites are shown as hidden profile usage since less popular sites are both less visited and (that is, no URL is given, thus excluded from our study). less likely to use Facebook SSO. We ran SSOScan on the remaining 19,285 sites in September 2013, and found that homepages of 1372 sites Faulty implementations. To implement Facebook SSO, failed to load during two initial attempts (most likely due an application must be configured correctly in the Face- to either expired or incorrect domain name, server error, book developer center. Using incorrect parameters to call or downtime). We excluded these sites from our data set, the SSO entry point also result in errors that will prevent leaving a final test dataset containing 17,913 sites. any user from authenticating to that application through Completing the tests took about three days, running 15 SSO. Such cases, automatically identified by SSOScan, concurrent sessions spread across three machines. The were more common than we expected. The most popular average time to test a site is 3.5 minutes. We limited the errors include setting the application to ‘sandbox’ mode maximum stalling time for each site on any one module (for development stage only) in the developer center, or to four minutes, and the overall testing time to 25 min- providing a wrong application ID. SSOScan found 39 utes per site. If this timeout is reached, SSOScan restarts (2.3% out of 1660 sites that incorporate Facebook SSO and retries a second time before skipping it. We ran extra buttons) sites that display visible Facebook SSO buttons rounds on tests that failed or stalled during initial round but have implementations so buggy that no user could until either the test is completed or the four rounds max- ever login using them. A possible explanation is that the imum limit has been reached. The extra rounds involved buttons are there for SEO purposes and the developers

6 500 23rd USENIX Security Symposium USENIX Association 45%

40% 35% (a) Facebook SSO support 30% 25% 20% 15% 10%

% supporting % supporting FB SSO 5% 0% 1 10 20 30 40 50 60 70 80 90 100 Site rank (100 equally sized buckets, each containing 1% (179) of all valid test sites)

45% 45% 40% 40%

35% 35% 30% 30% 25% 25% 20% 20% 15% 15% 10% 10% % vulnerable % vulnerable 5% 5% 0% 0% 1 10 20 30 40 50 60 70 80 90 100 1 10 20 30 40 50 60 70 80 90 100 (b) Credential misuse vulnerabilities (c) Credential leakage vulnerabilities Each bucket represents all sites using Facebook SSO that are ranked in the corresponding range in (a)

Figure 7: Facebook integration results by site rank never actually bothered to implement it, or the develop- Facebook accounts directly, the fact that such vulnera- ers simply copied and pasted an SSO snippet customized bilities exist in high-profile websites is worrisome, as im- for another application without ever testing it. personation attacks carried against sites with millions of users have more severe consequences thank similar at- Vulnerability trends. We found 202 sites (12.1%) that tacks on lower-profile sites. misuse credentials (126 of which are misusing both ac- Of the top-1000-ranked sites, 60 of the 270 (22.2%) cess token and signed request) and 146 sites (8.6%) that that support Facebook SSO are found to have at least one leak Facebook SSO credentials (of which 72 sites are vulnerability. The vulnerability rate is 21.3% across all leaking through both referrer headers and DOM). A to- sites in the top 10,000 and 18.5% for sites ranked from tal of 345 sites (20.3%) suffered from at least one of the 10,001 to 20,000. This overal vulnerability rate suggests five tested vulnerabilities, and 3 sites suffered from both that development practices at larger companies do not ap- credential misuse and leakage problems. pear to be more stringent (at least with respect to SSO) It is also worth noting that SSOScan did not find any than they are at less popular sites. sites leaking their app secret to the public by calling the As we do not have access to server side source code, token exchange API on the client side. To verify that we cannot measure how reusing code may positively or we implemented the check correctly, we have confirmed negatively affect the vulnerability trend. However, we that SSOScan does correctly identify this vulnerability did notice that some sites use fourth party services (e.g. on our manually-crafted faulty application. This is an in- Janrain, Gigya) to implement the Facebook SSO. In such teresting result, especially compared to the high number scenarios, the user effectively does two SSO processes of sites that have at least one of the other vulnerabilities. during authentication — the user, Facebook (IdP) and We suspect this is partly due to explicit warnings in the Janrain (RP) initially; the user, Janrain (IdP) and the true documentation and the increased effort required to actu- relying party afterwards. As the Facebook SSO process ally implement the token exchange on the client side. is entirely handled by the fourth party and is hidden to the As shown in Figure 7 (b) and (c), more popular sites relying party, the RP’s behavior is not relevant to this vul- appear to be more likely to have credential misuse vul- nerability. We have manually tested both Janrain and Gi- nerabilities, while less popular sites tend to have more gya’s Facebook SSO implementation for credential mis- credential leakage problems. This fact certainly raises use vulnerabilities and confirmed that both of them cor- concern — credential leakage could potentially do dam- rectly implement the process by only using code flow to age to users’ Facebook accounts, and it would be hard to authenticate users. As a result, sites using these services contact numerous low-profile problematic sites to have contribute to a lower vulnerability rate. Note that the RP them all fixed. The victim’s Facebook account is in jeop- would still need to implement the second SSO process ardy if any of the applications he or she uses have such correctly to avoid vulnerabilties, but SSOScan currently problem. Even though credential misuse cannot harm does not check IdPs other than Facebook.

7 USENIX Association 23rd USENIX Security Symposium 501 4.1.1 Front-end Integration third-party scripts in its content. The scripts come from various sources including quantserve.com, fonts.com, There are three basic client-side methods to integrate yahooapis.com, and multiple domains owned by . Facebook SSO: a JavaScript SDK, a pre-configured wid- The permission Fodor’s requests includes user’s basic in- get, or a custom implementation. (We have no way to formation, email address, and more importantly, permis- determine how the developers are integrating Facebook sion to post to user’s wall on the his or her behalf. This SSO at the back end.) We used SSOScan to aggregate means if the access token is leaked to a malicious party, front-end integration choices and compare them with it can post to a user’s Facebook wall without consent in vulnerability reports. Table 1 summarizes the results. addition to accessing the user’s basic information. Websites using client side SDKs and pre-configured wid- gets are more likely to misuse credentials (29.1% and 15.5% vs. 1.3% in non SDK/widget implementations). 4.2 Detection Accuracy Our guess is that this is due to the way SDKs and widgets To evaluate the detection accuracy of SSOScan, we sam- conveniently expose raw access token, signed request, pled test cases from all results (including sites reported or even user name Facebook ID values. This convenience to have no Facebook SSO support, secure and vulnera- may lead to the developers to neglect to check the signa- ble cases) and manually examined them. We consider ture and the intended audience of the credential. How- two types of mistakes: misreporting whether the site ever, our results also show that websites using SDKs and integrates Facebook SSO, and incorrectly determining widgets are better in hiding credentials (3.6% and 2.2% whether or not a Facebook SSO-enabled website exhibits compared to 12.4% vulnerable rate in SDK/widget im- a vulnerability. plementations). This is likely because such applications use the Facebook-provided landing page which has safe Facebook Login Detection Correctness. SSOScan redirect URLs and no third-party content. Applications searches SSO button based on heuristics and cannot built this way are secure unless the developers explicitly guarantee success for all websites. Indeed, it is not possi- add the credentials in the page content or URL. ble for anyone to determine with complete confidence if a website uses Facebook SSO by just browsing the site. To roughly measure how many Facebook SSO-enabled 4.1.2 Examples websites were missed by SSOScan, we randomly sam- We describe two examples of vulnerabilities found by pled the 100 sites that were reported by SSOScan to have SSOScan here to illustrate the potential risks. Section 6 no Facebook SSO support and manually examined them. discusses our experiences reporting vulnerabilities to site To make the samples representative of the whole set, we owners and Facebook. picked one site out of every 200 sites ordered by their rank. From manually investigating these 100 sites, we Match.com. Ranked 118th on the list, Match.com is a could only find one site that included Facebook SSO but popular online dating website. SSOScan revealed that was missed by SSOScan. As we introduce later in Sec- match.com is also vulnerable to signed request replace- tion 6, we also deployed SSOScan as a web service that ment attacks. To use match.com services, users need is made available to use in our research group. The web to provide sensitive information including their birthday, service has received a total of 69 valid submissions so far location, photos, personal interests, and sexual orienta- and we have also manually examined the vulnerability tion. Impersonators will not only have access to this in- reports.2 We found four cases (5.8%) where a submitted formation, but also learn whom the victim is dating and site included Facebook SSO but SSOScan was not able possibly the time and location of the dates. to trigger it. Fodors.com. Fodor’s is a travel advice website that is The sites that SSOScan fails to find Facebook lo- the 217th-ranked US site. Its redirection landing page gin present unusual interfaces which our heuristics are contains access token information along with some other not able to navigate to. Specifically, oovoo.com and bitdefender.com do not show any login button on its homepage, but instead the user needs to click a ‘my ac- Method Number Misuse Leakage count’ button to initiate the login process. The .com SDK 578 29.1% 3.6% site displays a login button on its homepage, but the SSO Widget 132 15.5% 2.2% process is not initiated until the user interacts with the Custom Code 950 1.3% 12.4% popup window three times, which exceeds the maximum All 1660 12.1% 8.6% 2These have mostly been sites suggested by people we have demoed Table 1: Rate of credential misuse and credential leakage SSOScan scan to, since the service has not yet been publicized. Hence, it is a small and non-representative sample, so not clear what we can for different Facebook SSO front-end implementations conclude from this at this point.

8 502 23rd USENIX Security Symposium USENIX Association click depths (two) in this evaluation. We have also seen cookies could be issued even before SSOScan finishes one case (coursesmart.com) in which the login process is registration forms. This means that before the Enroller rather typical, but SSOScan still missed the correct login searches for registration forms to fill in, the Oracle deems button (that button is scored the 4th highest while SSO- registration as unnecessary because it concludes that the Scan only attempts to click the top 3 candidates.). Most application is already in an authenticated state. Although of these issues may be addressed with more relaxed re- SSOScan is able to proceed and determine vulnerability strictions and more regular expression matching as de- status, the application never enters an authenticated state scribed in Section 5.2. Finally, our prototype implemen- and false negatives might occur. tation is limited to English-language websites due to its For credential leakage string matching algorithm, but could be extended to in- Trusted Third-Party Domains. vulnerabilities, SSOScan reports an application as vul- clude keywords in other languages. nerable if it identifies visible credentials co-existing with SSOScan may also incorrectly conclude that a web- any content or script that comes from any origin other site supports Facebook SSO when it does not. We than the host or Facebook. This could overestimate the have seen sites (e.g., msn.com) that only use Facebook vulnerable sites because the host may own other domains SSO to download user activities and display them on and serve content over them, which should not be consid- the page, but do not integrate their identity system with ered untrusted. For example, content delivery networks Facebook SSO. Although SSOScan is designed to skip and sub-company scenarios (e.g., cnn.com embedding searching on typical Facebook-provided social plugins content from turner.com which owns CNN) are common and widgets, non-standard integration of such function- among popular websites. alities may rarely lead to false positives.

Vulnerability Status Correctness. Since SSOScan sim- 4.3 Automation Failures ulates potential attacks and verifies their success or fail- ure, detection is likely to be highly accurate. Never- For about 19% of the top 10,000 tested site that include theless, we consider several possible reasons that might functional Facebook SSO, SSOScan is not able to fully cause false positives/negatives to be reported. automate the checking process. Figure 8 shows the dis- SSOScan should be able to capture all credential leak- tribution of rank of failed test websites. age vulnerabilities with no false positives. A false neg- To better understand the reasons why SSOScan fails, ative may occur since SSOScan only looks for exact we manually studied all 228 failed cases reported by matches to the original OAuth credential string, so it will SSOScan for sites ranked in the top 10,000. We found not report a leakage if the credential is slightly trans- that although 47 out of these 228 cases set their Face- formed or encoded. Further, SSOScan only observes book application configurations and SSO entry points traffic involving the web client, so does not detect appli- properly, they never respond to credentials returned by cation that leak OAuth credentials outside the SSO pro- Facebook SSO, which means no users would be able to cess. successfully log into these sites through Facebook SSO. SSOScan only reports a credential misuse vulnerabil- Excluding these 47 left us a total of 181 failure cases. ity when it can successfully execute an impersonation Registration automation failure. By far the most com- attack. So, the only risk for incorrect reports is if the mon reason for SSOScan to fail is due to complicated or Oracle incorrectly determines the session identity. We highly-customized registration process. We found 43.7% designed the Oracle to minimize this risk. For example, information for the test account is chosen carefully to be unlikely to appear otherwise but to be close enough Percentage of failed tests vs. Site rank to real names to pass sanity checks. For example, the 45% randomly generated name “Syxvq Ldswpk” was rejected 40% by a small number of websites, but “Jiskamesda Qua- 35% 30% narista” always passed sanity checks and only appeared 25% Overall: 19% in an authenticated session in all of our tests. Barring an 20% % failed unlikely name collision, there does not appear to be any 15% way SSOScan would produce a false positive credential 10% misuse report. 5% 0% The Oracle checks the whole response for identifying 1 10 20 30 40 50 60 70 80 90 100 information instead of only the DOM content to han- Site rank (100 equally sized bucket) dle sites which only embed such information in first- party cookies after logging in. In some rare cases, these Figure 8: Failed tests rank distribution

9 USENIX Association 23rd USENIX Security Symposium 503 Failure reason Number Percent page. In other cases, SSO authentication serves only a linking/subscription 51 28.1% sub-service of the website such as its affiliated forum, CAPTCHAs 34 18.8% but not the homepage which does not display any identi- identity invisible to oracle 28 15.5% fying information. atypical input elements 20 11.0% Others. During the testing, we have also seen a number atypical submit buttons 19 10.5% of sites with extremely long loading time or inconsistent email verification 10 5.5% network latencies after Facebook SSO or upon navigat- non-HTTPS submission forms 9 5.0% ing to certain pages. While the latency spikes can likely other (e.g., timeouts) 10 5.5% be resolved by re-running the tests, frequent long delays Total failures 181 100.0% which accumulate to SSOScan’s maximum timeout will always halt the automation process. For example, this happens when SSOScan accidentally triggers a browser Table 2: Automation Failure Causes (top 10,000 sites) confirmation dialog that requires user interaction, or ask- ing users to stop a busy script execution. of the sites that implement Facebook SSO still require users to perform additional actions to complete the reg- 5 Heuristics Evaluation istration (roughly evenly distributed by site popularity). SSOScan failed to complete registration on 143 (33.6%) The ability of SSOScan to successfully complete the of them. Table 2 shows the major reasons contributing Facebook single sign-on and registration process de- to this failure ordered by their occurrences: 1) sites that pends on heuristics it uses to find buttons and fill in regis- require SSO users to link to an existing account or pro- tration forms. Since each attempted button click involves vide payment information to subscribe to the service; a high-latency round-trip with the server, early pruning Currently SSOScan cannot handle the “linking” action: of search space and prioritization of elements is impor- automatically registering a “traditional” account and per- tant for achieving successful completion within a reason- form the linking poses an out-of-scope challenge — do- able amount of time. This section describes and analyzes 3 ing so often requires solving CAPTCHAs . 2) registra- the heuristics SSOScan uses. We analyze the click data tion forms after the SSO process include CAPTCHAs; collected from the top 10,000 sites that use Facebook 3) special input elements (e.g. div, span or image as op- SSO and show how tweaking the heuristics significantly posed to input) cannot be found automatically, or spe- improves performance. cial requirements for the input that cannot be fulfilled; 4) sites where the registration submission button cannot be located; 5) sites that requires users to confirm email 5.1 Options addresses before continuing (usually this involves click- Each step in the automation process can be controlled ing a link in an email sent by the server to the user’s by many options, including filters that can be enabled to email address); and 6) sites that insecurely send regis- eliminate candidate elements that are unlikely to be the tration data using a non-HTTPS form which causes the correct target, weightings that adjust the contribution of testing browser to pop up a warning and stall. different element properties to its score, and other behav- Oracle confusion. SSOScan may also fail because the ior modifiers. The ones SSOScan used when running the oracle reports failure (15.5%), which occurs when it de- Section 4 study are described below; additional options tects the login button no longer exists after Facebook are described in our tech report [33]. SSO but cannot identify the session identity. We man- Candidate rank. The button finder produces a candidate ually analyzed such cases and found the biggest obstacle element list ranked by score. SSOScan will first attempt is that the application homepage does not include any clicking on the highest-ranked element, but sometimes identifying information at all. For example, instead of showing ‘Welcome, username ’, it shows ‘Welcome, { } customer’, or simply ‘Welcome’, and the user name is only displayed when accessing the account information

3On the contrary, most tested applications (942 out of 973, see Sec- tion 4.3) do not ask users to solve CAPTCHAs when an account is created through SSO. This is a reasonable practice, since the user who is able to provide a valid Facebook account should have already passed Facebook’s requirements, and adding additional CAPTCHAs would be unnecessarily annoying to the users. Figure 9: Example corner cases

10 504 23rd USENIX Security Symposium USENIX Association the correct element is ranked lower. This option controls ing duplicate work by detecting if a click attempt has the maximum number of click attempts SSOScan makes resulted in a previously visited or completely explored before succeeding or giving up. For Section 4’s experi- state (see our tech report for details [33]). ment, the lowest ranked element SSOScan clicks is the third. Visibility filter. Most websites only expect users to click 5.2 Experiment Setup on UI elements that are visible, so the button finder in- In theory, SSOScan could exhaustively trigger clicks on cludes a filter that ignores all invisible elements (e.g., el- every element on the page (and on all response pages up ements with zero height or width, shadowed in the back- to some maximum depth), which would result in nearly ground layer, or those which appear only when the user 100% success rate. This would be prohibitively slow in scrolls the initial screen position). practice, though, so the number of attempted clicks must Position filter. We noticed that SSOScan sometimes gets be limited for any realistic test. Given the time needed distracted by a search box submit button when complet- for each click attempt, it is important to configure our ing the registration form, even if it is able to correctly scoring heuristics well to maximize the probability of a fill in the required information in all input elements. To successful enrollment in the minimum amount of time. eliminate these misclicks, the position filter eliminates To gather statistics about the candidate elements, we the submit buttons which are displayed above any inputs modified SSOScan to try all possible strategies even if it based on our observation that submit buttons nearly al- has already found the correct login button and to record ways come last in a registration form. information about all attempted clicks, including for ex- ample their size, position, visibility to the user, content Registration form filter. As mentioned earlier in Sec- string feature and whether it is successful. We define a tion 4.3, many websites provide two actions for the user click as successful if it is included in any sequence of after SSO is completed: ‘create new account’ or ‘link an clicks from the start page to triggering the SSO process, existing account’. The latter option requires the user to regardless of whether it appeared in an attempt that failed enter the user name and password of an existing account to trigger the process. Because SSOScan skips previ- to finish the enrollment process. To avoid these, the reg- ously explored states to avoid redundant effort, it auto- istration form filter rejects a candidate submit button if matically rejects click sequences which involves cyclic its parent form contains only two visible text inputs, one state transitions such as clicking on an irrelevant link and has the meaning of ‘name’ or ‘email’ and the other is of then clicking on a logo which returns to the initial state. type password, since such an element is most likely to be We set up SSOScan to expand the candidate pool size a submit button of a linking form. for each configuration from 3 to 8, add more matching Element content matching. SSOScan searches for ele- regular expressions (e.g., to match the string “forum” ments whose labels are close to “login with Facebook” which occasionally leads to a login page on sites where for SSO buttons by default. However, quite a few pop- no login is visible on the start page), and use equal weight ular websites (e.g. coupons.com, right side of Figure 9) for each of them. We also removed all candidate filters only allow users to “sign up with Facebook” first before described in Section 5.1. Our goal is to capture as many logging in with Facebook. If the user has yet to do this, ways to trigger the SSO process as possible by doing attempting to login with Facebook will produce an er- as close to an exhaustive search as is feasible. This in- ror. To handle this situation, SSOScan will search for creases the time required to scan a typical site to almost elements with semantics similar to “sign up with Face- an hour (compared to a few minutes with the setup used book” when it fails to register using the “login” buttons. in the full study). A filter may significantly reduce the number of mis- We ran the test on the 973 sites from the top-10,000 clicks. However, it may also occasionally exclude cor- ranked sites that were detected by SSOScan to support rect elements. For example, not every correct submit Facebook SSO in our main study (Section 4). This bi- button is below all inputs (e.g., left of Figure 9, and ases the study slightly, since it only includes sites where expedia.com’s submit button would have been missed the configuration used in the initial study was able to with the element position filter enabled). find Facebook SSO. Ideally we would like to run all top- Hence, SSOScan is designed to explore target sites us- 10,000 sites to avoid any bias introduced by the data set, ing different option settings if enrollment does not suc- but the significantly increased testing time prohibits us to ceed with the initial settings. It will continue to attempt do so, and the result of our subsequent study on a random to complete the enrollment process using different set- sample of sites (Section 5.4) supports the claim that only tings until either all configurations have been exhausted few sites containing Facebook SSO were missed by the or the timeout threshold is reached. SSOScan avoids do- main study.

11 USENIX Association 23rd USENIX Security Symposium 505 Pixels 5.3 Results 400 The experiment recorded 29,539 unique4 click attempts, 300 Width, FP Width, TP of which 5086 (17.2%) are successful (that is, they either 200 Height, FP directly trigger SSO, or lead to subsequent clicks that 100 Height, TP trigger SSO). This amounts to approximately 30 unique 0 clicks attempted per site, but the number varies signifi- 0 10 20 30 40 50 60 70 80 90 99 Percentile cantly based on the site design, from a few up to 109. Figure 12: Impact of Login Button Size Element type and content. Figure 10 shows how dif- ferent button types and properties impact success rates. We report the success rate as the number of times that Figure 11 shows how the success rate varies with at- element appeared as a successful click divided by the to- tribute content (matched by the given regular expres- tal number of clicks attempted on elements of that type. sion). The keyword “oauth” rarely exists in any content, The number beneath the element feature gives the total but when it appears it is very likely to identify the target number of times that type of element occurred as a suc- element. The result also shows that “FB” is not a good cessful click target across all the test sites. For exam- indicator to predict the target, and we think this is proba- ple, the BUTTON element type has an excellent success bly because it is very short and may be used for similarly rate — 60% of all BUTTON candidates are true positives named JavaScript variables or random abbreviations. for the Facebook SSO button. But since it only appears Both figures include data for the first click only (but do as a successful click on 78 out of 973 sites in our sam- measure first click success based on subsequent clicks). ple, it is rarely useful. By contrast, clicking on DIV ele- Data for the second clicks are noticeably different from ments are much less likely to trigger the Facebook login, the first, and overall success rates are lower on second but such elements are more common. The right side of clicks. The most interesting fact we found is that “con- Figure 10 shows that elements that are directly visible nect” (39%) and “Facebook” (36%) become the most to the user has a higher success rate than invisible ones, successful matches of all regular expressions, followed and elements residing in iframes are twice as likely to be by “oauth” at (26%). No other regular expressions ex- the correct target as their counterparts in the main page. ceed 20% success for the second click. These results suggest ways of weighting element types to improve the scoring function and increase the likelihood Element size. Figure 12 gives the cumulative distribu- of finding successful clicks early. tion function of the width and height of target elements. For example, the 80th percentile width of the true positive 4If two clicks happens on pages with the same URL, same element elements is approximately 150px, compared to 300px for XPath and same element outerHTML, we consider them the same click. false positive elements. We did not find any significant difference between first and second clicks, so the figure

0.7 combines data from all clicks. The key result is that wide 0.6 elements are less likely to be true positives, possibly due 0.5 to SSOScan incorrectly including many large underlay 0.4 0.3 elements as candidates. The result is similar for element 0.2 height (the lower two lines in the figure). This suggests 0.1 True positives / all / all positives clicks True 0 that it would be useful to add a filter function that ex- Type: A DIV SPAN IMG BTN INPUT IFRM LI properties: visible invisible main iframe Occurrences as cludes candidates whose width is greater than 300px. We successful clicks : 1556 266 204 188 78 64 9 1 1223 1144 2330 37 would expect it to eliminate 20% of the false positives while hardly missing any of the true positives. Alterna- Figure 10: Login button type statistics tively, SSOScan could adjust the final score of a node according to its size based on these results.

0.6 0.5 Element position. Figure 13 shows the heatmap of the 0.4 login button’s position in a page. The intensity at a lo- 0.3

0.2 cation indicates the number of elements found there sat-

0.1 isfying the property. Only visible elements are shown, True positives / all clicks allpositives / True 0 and each successful click only attributes to the intensity Content: logion Facebook FB signion account connect forum oauth once. All four figures are normalized with respect to their Occurrences as : successful clicks 2816 1398 1196 896 422 297 218 74 maximum intensity (i.e., element density). The figures show an interesting distinction from first Figure 11: Login button content statistics click to second click: successful first clicks almost ex-

12 506 23rd USENIX Security Symposium USENIX Association First Click, True Positive First Click, False Positive heuristics on these sites, and further increased the max- imum click depth to three to see if more SSO integra- tions could be found. Individual tests took an average of 31 minutes to finish, but varies significantly from a few minutes up to an hour (threshold) based on site content. Four additional sites were found that support Face- book SSO from this sample in total. Two are found due to Second Click, True Positive Second Click, False Positive the added regular expression [Ff ][ Oo][Rr][Uu][Mm], one of which required three clicks to trigger the SSO process. Another site is found due to the improved candidate rank- ing algorithm, and the fourth was found using the new candidate selection method that includes all elements in the right corner of the page, even if they do not match any regular expressions. This provides a reasonable de- Figure 13: Login button location heatmap gree of confidence that our original study found a large enough fraction of all the popular sites using Facebook SSO to be representative, although likely missed around clusively appear in the upper right corner of the page, 1% of Facebook SSO sites. We did not try click depths while the second click appears generally in the upper- greater than 3 because of the exponential time growth re- middle part of a page. The false positives are relatively quired to complete each test, but we feel confident that more scattered everywhere on the page5. This result sug- the number of Facebook SSO interfaces that can only be gest we should assign a higher weight for elements for discovered by attempting more than 3 clicks is very low. these locations, and focus on elements in the vicinity of the upper right corner for the first click. We could po- tentially even ignore the other criteria and only consider 6 Discussion position to find login buttons on foreign-language sites. This section concludes by discussing limitations of SSO- Scan, sharing our experiences reporting vulnerabilities, 5.4 Validation and suggesting ways SSOScan can be deployed to help After incorporating what we learned from these results secure applications integrating SSO services. (e.g., weight adjustment for different button sizes and types), we reran the SSOScan with the new heuristics 6.1 Limitations on the sites ranked from 10,000 to 20,000 that SSOScan determined to support Facebook in the original study, While SSOScan is able to automatically synthesize ba- which were not included in the heuristics evaluation. We sic user interactions and analyze traffic patterns, this ap- compare the results with those obtained by using a “con- proach is not suitable for detecting all types of vulner- trol” version of SSOScan, with equal weights on all fea- abilities. It only works for vulnerabilities that can be tures and no candidate filtering. All other settings such as checked by observing traffic or simulating predictable candidate pool size are the same between two versions. user events, and falls short if the vulnerability testing in- The results support the hypothesis that adjusting heu- volves deep server-side application scanning or compli- ristics according to the results of the evaluation can im- cated interactions. For example, Wang et al. [27] point prove the speed and robustness of detection of Facebook out that the application’s app secret might be leaked to SSO integrations. The na¨ıve control version missed 72 arbitrary party if any page including Facebook’s PHP out of the 601 sites while the new heuristics missed only SDK invokes two functions in a specific way. This two. The average rank of correct candidate elements for type of vulnerability could be checked at the developer the first and second click is 1.32 and 1.23 for the con- side using program analysis techniques, but cannot be trol experiment, which improves to 1.23 and 1.17 respec- checked by an external tool with no awareness of the tively with the new heuristics. sites’ implementation details or internal state. We also randomly picked 500 random sites from the sites that SSOScan have yet to find Facebook support 6.2 Communication and Responses in the experiment in Section 4. We tested the expanded We started contacting the site owners shortly after obtain- 5 The figures also show a clear width boundary. In the experiments ing our first list of vulnerable sites, manually sending out the browser resolution is 1920x1200, and it seems that most develop- ers’ designs follow a standard width of approximately 960px, which is notifications to 20 vulnerable websites that we thought why the density appears to be cut off. were interesting. We contacted them either by submitting

13 USENIX Association 23rd USENIX Security Symposium 507 a form on their website or through email. The responses 27 were misusing credentials (one site, trove.com, fixed were very disappointing, especially compared with our both problems). We further examined these sites man- previous experiences reporting SDK-level vulnerabilities ually to investigate the possible reasons and measures to identity providers who tend to respond quickly and ef- to fix the problems. As for sites that fixed credential fectively to vulnerability reports [27]. The vulnerabili- misuses, we found that many had abandoned the token ties found by SSOScan, on the other hand, are primar- or signed request flow in favor of the more secure code ily in consumer-oriented sites without dedicated security flow, which automatically protects them from credential teams or clear ways to effectively report security issues. reuse attacks. For credential leakages, we found that Of the 20 notifications, we only received eight re- a number of sites redesigned their SSO process to fea- sponses, most of which appear to be automated. After the ture a smoother user experience, e.g., replaced traditional initial response, three websites sent us follow-up status redirection flows with AJAX operations, which naturally updates. ESPN.com thanked us and told us the message eliminated credential leakage via referer header. has been passed onto appropriate personnel, but no fol- Communication with Facebook. Due to the ineffec- low up actions ensued. One of answers.com’s engineers tiveness of our direct communication with site owners, asked us for more details, but failed to respond again af- we contacted Facebook’s platform integrity team in May ter we replied with proposed fix. As of July 2014, both 2014. Facebook’s engineers indicated that they are par- sites are still vulnerable. Four months after getting the ticularly worried about access token leakage through ref- automated reply from ehow.com, we received a response erer headers (because a malicious party in possess of the stating that they have removed Facebook SSO from their token may perform privileged Facebook actions on be- website due to “content deemed inappropriate”, and we half of the user, which potentially directly harms Face- have confirmed that the Facebook SSO button has in- book), but are also concerned with the credential misuse deed been removed. Sadly, we think their staff likely scenario. Facebook asked for a list of the vulnerable ap- did not (bother to) understand our explanation for the fix plications and contacted all the sites with access token and simply removed the feature. leakage and credential misuse vulnerabilities (a total of The other instance where a reported vulnerability was 95 sites that we were able to re-confirm at the time of re- fixed was for hipmunk.com. Hipmunk was found to be port), and informed us that they would “take enforcement vulnerable to both the access token and signed request action as necessary” upon the ten sites that are leaking replacement attacks. We did not get any response from access tokens in the referer headers. Facebook’s engi- Hipmunk when the vulnerability was reported through neers could not provide us with more information about the normal channels, but through a personal connection what this entails or any direct responses they received, we were able to contact them directly. This led to a quick but an SSOScan re-run one month later (early July 2014) response and series of emails with one of Hipmunk’s en- revealed that only four out of the 95 sites had fixed their gineers. We explained how to check the signature of problems (of the ten sites leaking access tokens, only a signed request, which should fix both vulnerabilities. two had been fixed). Even for Facebook, it appears to be However, when they got back to us believing that the fix difficult to convince consumer-focused websites to take was complete, we re-ran SSOScan and found that Hip- security vulnerabilities seriously. munk was still vulnerable to the access token replace- ment attack. This meant Hipmunk checked the signa- ture of signed request after the fix, but never decoded the 6.3 Deployment signed message body and compared its Facebook ID with Our experiences reporting vulnerabilities found by SSO- the one returned by exchanging access token. This sur- Scan suggest that notifying vendors individually will prised us, as we implicitly assumed the developers will have little impact, which is consistent with experiences consume the signed message body after verifying its sig- reported by Wang et al. with on-line stores [26]. Hence, nature, and thus only included ‘verifying signature’ in we consider two alternate ways of deploying SSOScan the proposed fix. After further explanation, the site was to improve the security of integrated applications. fixed and now passes all our tests. App center integration. We believe SSOScan would Retesting vulnerable sites. We retested all 345 vulnera- be most effective when used by an application distribu- ble sites in May 2014, nine months after our initial exper- tion center (e.g. Apple store, Google Play) or identity iment, including the 20 websites we had notified directly. provider (e.g., Facebook) as part of the application vali- SSOScan found that 48 of the sites had eliminated the dation process. The identity provider has a strong moti- vulnerabilities (including one out of the 20 sites we con- vation to protect users who use its service for SSO, and tacted, mapquest.com). Of the 48 fixed sites, 22 had pre- could use SSOScan to identify sites that can compro- viously been diagnosed as credential leaking sites, and mise those users. It could then deliver warning messages

14 508 23rd USENIX Security Symposium USENIX Association to visitors of vulnerable applications during the log in Automatic Extraction of Web Authentication Pro- through Facebook SSO process, or even go so far as to tocols from Implementations. In 20th Network and shut down SSO logins for that application. We also be- Distributed System Security Symposium, 2013. lieve our results can provide guidance to vendors devel- oping SSO services. The results in Section 4.1 indicate [3] T. Ball and S. K. Rajamani. The SLAM Project: that sites are more likely to misuse credentials when us- Debugging System Software via Static Analysis. In ing the Facebook JavaScript SDK. With Facebook’s help, 29th ACM Symposium on Principles of Program- this problem could be mitigated by placing detailed in- ming Languages, 2002. structions inside the SDK. The instructions could be pre- sented as (non-executable) code in the SDK rather than [4] C. Bansal, K. Bhargavan, and S. Maffeis. Discover- as comments, so that the developers cannot get by with- ing Concrete Attacks on Website Authorization by th out reading and removing them. Formal Analysis. In 25 Computer Security Foun- dations Symposium, 2012. Checking-as-a-service. Without involving an central- ized infrastructure, the best opportunity to deploy SSO- [5] M. Benedikt, J. Freire, and P. Godefroid. VeriWeb: Scan is as a vulnerability scanning service that devel- Automatically Testing Dynamic Web Sites. In 11th opers can use to check their implementations before International World Wide Web Conference, 2002. their applications are launched (our prototype service at http://www.ssoscan.org/ can be used for this now). For [6] BugBuster. BugBuster is a Software-as-a-Service a developer-directed test, it would be reasonable to ask to Test Web Applications. http://bugbuster.com/. the developer to either guide the tool through the reg- istration process or provide a special test account that [7] C. Cadar and D. Engler. Execution Generated Test bypasses this step in cases where it cannot be fully au- Cases: How to Make Systems Code Crash Itself. In th tomated. Even if we assume no aid from the developers, 12 International Conference on Model Checking they should at least be able to tolerate a longer testing Software, 2005. time than is feasible in doing a large-scale scan. [8] G. Di Lucca, A. Fasolino, F. Faralli, and U. De Car- lini. Testing Web applications. In Journal of Soft- Availability ware Maintenance, 2002.

SSOScan is available at http://www.SSOScan.org/ as a [9] S. Elbaum, S. Karre, and G. Rothermel. Improving public web service. The source code is available (linked Web Application Testing with User Session Data. from that site) under an open source license. In 25th International Conference on Software Engi- neering, 2003.

Acknowledgements [10] Y.-W. Huang, S.-K. Huang, T.-P. Lin, and C.- H. Tsai. Web Application Security Assessment We thank Jonathan Burket, Longze Chen, Shuo Chen, by Fault Injection and Behavior Monitoring. In Steve Huffman, Jaeyeon Jung, Haina Li, Chris Slowe, 12th International Conference on World Wide Web, Ankur Taly, Rui Wang, Westley Weimer, Eugene 2003. Zarakhovsky and anonymous reviewers for their valu- able inputs and constructive comments. This work has [11] F. Nentwich, N. Jovanovic, E. Kirda, C. Kruegel, been supported by a Research Award from Google and and G. Vigna. Cross-Site Scripting Prevention with research grants from the National Science Foundation Dynamic Data Tainting and Static Analysis. In 14th and Air Force Office of Scientific Research. Network and Distributed System Security Sympo- sium, 2007.

References [12] G. Pellegrino and D. Balzarotti. Toward Black-Box Detection of Logic Flaws in Web Applications. In [1] N. Alshahwan and M. Harman. Automated Web 21st Network and Distributed System Security Sym- Application Testing Using Search Based Software posium, 2014. Engineering. In 26th IEEE/ACM International Con- ference on Automated Software Engineering, 2011. [13] P. Pirolli, W.-T. Fu, R. Reeder, and S. K. Card. A User-tracing Architecture for Modeling Interaction [2] G. Bai, J. Lei, G. Meng, S. S. Venkatraman, P. Sax- with the World Wide Web. In First Working Con- ena, J. Suny, Y. Liuz, and J. S. Dong. AuthScan: ference on Advanced Visual Interfaces, 2002.

15 USENIX Association 23rd USENIX Security Symposium 509 [14] V. Rastogi, Y. Chen, and W. Enck. AppsPlay- Cashier-as-a-Service Based Web Stores. In 32nd ground: Automatic Security Analysis of Smart- IEEE Symposium on Security and Privacy, 2011. phone Applications. In Third ACM Conference on Data and Application Security and Privacy, 2013. [27] R. Wang, Y. Zhou, S. Chen, S. Qadeer, D. Evans, and Y. Gurevich. Explicating SDKs: Uncover- [15] Redspin Inc. Penetration Testing, Vulnerability As- ing Assumptions Underlying Secure Authentica- sessments and IT Security Audits. https://www. tion and Authorization. In 22nd USENIX Security redspin.com/. Symposium, 2013.

[16] F. Ricca and P. Tonella. Analysis and Testing of [28] Whitehat Security. Your Web Application Security Web Applications. In 23rd International Confer- Company. https://www.whitehatsec.com/. ence on Software Engineering, 2001. [29] Q. Xie and A. M. Memon. Model-Based Testing [17] P. Saxena, D. Akhawe, S. Hanna, F. Mao, S. McCa- of Community-Driven Open-Source GUI Applica- mant, and D. Song. A Symbolic Execution Frame- tions. In 22nd IEEE International Conference on work for JavaScript. In 31st IEEE Symposium on Software Maintenance, 2006. Security and Privacy, 2010. [30] L. Xing, Y. Chen, X. Wang, and S. Chen. In- [18] R. Sekar. An Efficient Black-box Technique for De- teGuard: Toward Automatic Protection of Third- feating Web Application Attacks. In 16th Network Party Web Service Integrations. In 20th Network and Distributed System Security Symposium, 2009. and Distributed System Security Symposium, 2013.

[19] Selenium development team. Selenium: Web ap- [31] Z. Yang, M. Yang, Y. Zhang, G. Gu, P. Ning, and plication testing system. https://selenium.org/. X. S. Wang. AppIntent: Analyzing Sensitive Data Transmission in Android for Privacy Leakage De- [20] J. Somorovsky, A. Mayer, J. Schwenk, M. Kamp- tection. In 20th ACM Conference on Computer and mann, and M. Jensen. On Breaking SAML: Be Communications Security, 2013. Whoever You Want to Be. In 21st USENIX Secu- rity Symposium, 2012. [32] C. Zheng, S. Zhu, S. Dai, G. Gu, X. Gong, X. Han, and W. Zou. SmartDroid: An Automatic System [21] S. Sprenkle, E. Gibson, S. Sampath, and L. Pol- for Revealing UI-based Trigger Conditions in An- lock. Automated Replay and Failure Detection droid Applications. In Second ACM Workshop on for Web Applications. In 20th IEEE/ACM Inter- Security and Privacy in Smartphones and Mobile national Conference on Automated Software Engi- Devices, 2012. neering, 2005. [33] Y. Zhou and D. Evans. Technical Report: SSO- [22] S. Sprenkle, E. Hill, and L. Pollock. Learning Ef- Scan: Automated Testing of Web Applications fective Oracle Comparator Combinations for Web for Single Sign-On Vulnerabilities. https://www. Applications. In International Conference on Qual- ssoscan.org/SSOScan-TR.pdf. ity Software, 2007.

[23] S.-T. Sun and K. Beznosov. The Devil is in the (Implementation) Details: An Empirical Analysis of OAuth SSO Systems. In 19th ACM Conference on Computer and Communications Security, 2012.

[24] TestingBot. Selenium Testing in the Cloud - Run Your Cross Browser Tests in Our Online Selenium Grid. http://testingbot.com/.

[25] R. Wang, S. Chen, and X. Wang. Signing Me onto Your Accounts through Facebook and Google: A Traffic-Guided Security Study of Commercially Deployed Single-Sign-On Web Services. In 33rd IEEE Symposium on Security and Privacy, 2012.

[26] R. Wang, S. Chen, X. Wang, and S. Qadeer. How to Shop for Free Online – Security Analysis of

16 510 23rd USENIX Security Symposium USENIX Association