Integuard: Toward Automatic Protection of Third-Party Web Service Integrations
Total Page:16
File Type:pdf, Size:1020Kb
InteGuard: Toward Automatic Protection of Third-Party Web Service Integrations Luyi Xing, Yangyi Chen, XiaoFeng Wang Shuo Chen Indiana University Bloomington Microsoft Research Bloomington, IN, USA Redmond, WA, USA fluyixing, yangchen, [email protected] [email protected] Abstract—A web application today often utilizes web APIs to has made the web applications complicated and error-prone, incorporate third-party services into its functionality. Such API as discovered in our recent studies [1], [2]. integration, however, is full of security perils: recent studies show that popular web sites using high-profile web services, Logic flaws in web-API integrations. The unique security such as PayPal/Amazon checkouts and Facebook/Google single- challenge such a hybrid web application (i.e., those inte- sign-on (SSO) services, are riddled with logic flaws, enabling a malicious party to shop for free or log into a victim’s account. grating third-party web services) faces is how to securely To address this new threat, techniques need to be developed to coordinate different parties it involves, including the one that facilitate secure integration of third-party web services. integrates the API services (called integrator), the provider To answer this urgent call, we present in this paper of the services and the web clients using the application. InteGuard, the first system that offers security protection Our prior research shows that this is very difficult to be to vulnerable web API integrations. InteGuard operates a done securely: given the diversity in the ways that an API proxy in front of the service integrator’s web site, performing security checks on a set of invariant relations among the HTTP service is integrated and the partial view the integrator and messages the integrator receives during a transaction (e.g., a the API provider share about each other’s code and runtime checkout from a web store or a web SSO). These invariants state, logic flaws in API integration are hard to avoid and link multiple HTTP sessions to a transaction and capture their can often be easily exploited by a malicious party to cause security-critical relations. They also characterize transaction- miscoordinations and bypass security protection [1]. An ex- related communication the proxy cannot directly observe, which happens between the client and the service provider. ample is NopCommerce’s integration of Amazon Payments, InteGuard includes a suite of novel techniques that automati- in which a shopper is supposed to place an order on the cally extract such invariants from a variety of communication web store powered by NopCommerce and pay for the order channels adopted by diverse integrations and achieve effective on Amazon. What has been found is that miscoordination false positive control in this process. Our evaluation shows that can be introduced by a malicious shopper who exploits a InteGuard can defeat complicated exploits on high-profile web services, with little impact on their normal operations. logic loophole in the integration to convince the store that the order has been paid through Amazon and Amazon that the payment should be made to the shopper’s own Amazon I. INTRODUCTION seller account [1]. As another example, some websites’ Recent years have witnessed the growing trend of moving integrations of PayPal Access SSO are found to be evadable, computing from the desktop to the web, which gives rise to a simply by convincing the integrators that a message field huge demand for web applications. To meet such a demand, signed by PayPal is a user’s PayPal ID and PayPal that it web developers increasingly utilize existing web services should contain the person’s mailing address [2]. (e.g., search, map, payment, authentication, etc.) to rapidly The studies we conducted in the past years show that such build up their own applications, through integrating the Ap- logic flaws are pervasive [1]: leading web stores (Buy.com, plication Programming Interfaces (APIs) provided by these JR.com), popular merchant systems (NopCommerce, In- services. For example, many web stores (e.g., Buy.com) terspire) and even mainstream payment service providers call third-party payment APIs provided by cashier services (Amazon) all contain loopholes in their integration code such as PayPal, Amazon Payments and Google Checkout; that enable miscreants to shop for free; important websites more and more websites include single sign-on (SSO) APIs integrating SSO mechanisms of Facebook Connect, Google of Facebook, Google, Yahoo, Twitter, PayPal and others to ID, PayPal Access, etc. are vulnerable to the threat that let their customers log in through these identity providers allows unauthorized parties to log into victims’ accounts (IdP). As a result, today’s web applications become increas- [2]. The problem cannot be solved by solely relying on ingly hybrid, with their program logic distributed across existing techniques like protocol verifications, due to the multiple parties, including not only the websites hosting diversity in the ways an API service is incorporated and these applications and their clients but also various third- used by different integrators, and the limited understanding party API service providers. Such a development, however, each party (the integrator, the provider) has about the other’s behavior (e.g., how the other party verifies a signed message) order ID) need to be found so as to link them to the checkout and the underlying runtime system (e.g., whether Adobe and ensure their consistency (e.g., payment matching price) Flash allows cross-domain communication [3]). So far, little in the presence of other checkouts and unrelated activities has been done to address this emerging security challenge. (e.g., choosing items, etc.). Even more challenging, the merchant-side protection cannot see the session between the Integration protection. Although the above security flaws shopper and PayPal (unlike Waler, which sees the whole can be introduced by all parties involved in an API inte- program, and BLOCK, which sees all the traffic), and has gration, prior research [1], [2] shows that the weakest link to infer its content from what the merchant sends to the remains on the integrator side, whose code appears to be shopper. This requires extracting the parameters related to way more error-prone than that of the provider. Effective the shopper-PayPal interaction from the merchant-shopper integrator protection, therefore, will significantly mitigate communication channels, which are very diverse in different the threats posed to vulnerable integrations. In our research, integrations of even the same API (e.g., HTTP redirect, form, we made the first attempt to find such a solution, with JavaScript, JSON, etc.). Also, the efficacy of this protection an aim to develop a generic and automated technique to critically depends on effective false positive control of traffic protect integrations from exploits. Note that our approach invariants, which needs a novel solution. cannot count on the availability of integration-related source code, as the integrator does not have provider-side code Our work. In this paper, we present a suite of new tech- and often utilizes off-the-shelf integration solutions such as niques to address these challenges, making a first step toward software development kit (SDK) released by the provider or automatic protection of vulnerable web API integrations. commercial software (e.g., Interspire). The approach is also Our approach, called InteGuard, performs security checks expected to work independently of integrations’ semantics on transaction traffic forwarded by a proxy sitting in front (e.g., payment, SSO), which needs human effort to specify. of the integrator’s website. Such a proxy can simply be a An observation we can leverage is that for a specific load balancer operated by most large websites to distribute integrator-side implementation, its interactions with the ser- web traffic to their clusters. For smaller websites, it can be vice provider are rather mechanic: given a small set of run by a trusted intermediary, such as Janrain that integrates client inputs (e.g., items to be purchased, usernames, etc.), major SSO services to offer its customers convenient access the HTTP messages to be exchanged during a multi-party to them [6]. Attached to the proxy is our InteGuard server, integration-related operation like checkouts or SSO (called a which inspects invariants within web traffic before it reaches transaction in the paper) are predictable and the parties in- web servers. The invariants we look at are specified on the volved typically do not perform complicated transformations parameters automatically identified through a simple data- on the content of the messages they receive for response flow analysis that links what the browser can see to the generation. This makes us believe that the invariant relations communication channels the InteGuard server observes, and among those messages’ parameters can be identified and extracted from such channels through an in-depth analysis on utilized to avoid the situation that necessary security checks related HTTP messages, HTML/XML files, JavaScript code, fall through the cracks. etc. From these parameters, invariants are automatically Program invariants have been used to detect faulty appli- generated to bind transactions to an integrator, messages to a cation logics. For example, Waler [4] infers likely invariants transaction and each message within a transaction to its legit- from a web application’s normal operation and analyzes its imate successor. InteGuard achieves effective false-positive source code to find violations. The approach is language- control through strategic selection of normal traffic traces specific and needs source code, which is often unavail- for invariant generation and carefully-designed differential able to an integrator-side protection, particularly when it analyses on the traces. comes to the provider’s program. Simple traffic invariants We put our design to the test against 11 real-world exploits (e.g., session parameters) are employed by BLOCK [5] to reported by prior research [1], [2].