Automated Security Testing of Web Widget Interactions
Cor-Paul Bezemer Ali Mesbah Arie van Deursen Delft Univ. of Technology Delft Univ. of Technology Delft Univ. of Technology & Exact Software The Netherlands The Netherlands The Netherlands [email protected] [email protected] [email protected]
ABSTRACT directly manipulates the browser’s DOM tree. Communica- We present a technique for automatically detecting secu- tion with the server is asynchronous, and typically affects rity vulnerabilities in client-side self-contained components, small parts of the DOM tree only [13]. called web widgets, that can co-exist independently on a Furthermore, Ajax pages can be composed from indepen- single web page. In this paper we focus on two security dent user interface components, often called web widgets. scenarios, namely the case in which (1) a malicious widget These widgets are mini-applications composed of a chunk changes the content (DOM) of another widget, and (2) a of code that can be embedded in an HTML page, and run widget steals data from another widget and sends it to the independently and next to each other, providing dynamic server via an HTTP request. We propose a dynamic analy- content and functionality for a wide variety of tasks such as sis approach for automatically executing the web application showing the latest news headlines, weather predictions, or a list of new email messages. Highly visible examples include and analyzing the runtime changes in the user interface, as 1 2 3 well as the outgoing HTTP calls, to detect inter-widget in- Mashup sites such as iGoogle, Netvibes, and Pageflakes, teraction violations. which allow users to select widgets from a catalog, and cus- Our approach, implemented in a number of open source tomize and host them on their own start page. They also provide APIs for external developers to build and include Atusa plugins, called Diva, requires no modification of ap- plication code, and has few false positives. We discuss the re- new widgets in the widget catalogs. sults of an empirical evaluation of the violation revealing ca- As any program code, widgets can be used for malicious pabilities, performance, and scalability of our approach, by purposes. Security becomes an important aspect when third- means of two case studies, on the Exact Widget Framework parties are allowed to include new widgets in public catalogs. and Pageflakes, a commercial, widely used widget frame- Example scenarios include when a malicious widget changes work. the content of another widget to trick the user into releasing sensitive information, or even worse, listens to the account details a user enters in another widget (e.g., PayPal or Email Categories and Subject Descriptors widgets) and sends the data to a malicious site. D.2.5 [Software Engineering]: Testing and Debugging Current protection-based solutions aim at avoiding the problem by disallowing inter-widget interactions, at the cost General Terms of limiting functionality. Traditional detection-based ap- proaches are generally static analysis-based, which has lim- Reliability, Security, Verification itations in revealing faults and violations in the distributed runtime behavior of modern rich web applications. Keywords Testing modern web applications for security vulnerabili- Web applications, security testing ties is far from trivial. In this paper, we propose a new dy- namic analysis approach that automatically detects and re- 1. INTRODUCTION ports inter-widget interaction violations. We have extended and used our Ajax testing tool Atusa [14], to implement the Web applications increasingly rely on client-side process- approach in a number of security plugins, called Diva. In ing to enable a user experience comparable to that of desktop addition, we present the results of an evaluation, performed applications. Using Ajax [2], the classic sequence of static to assess the effectiveness and performance of our approach. HTML pages is replaced by a single page, which is dynam- The results of our case studies performed on two external ically adjusted at the client side by JavaScript code that widget frameworks are positive; We believe the applicability of our approach is not limited to Ajax-based widgets, and can be used to test the security of any web application in Permission to make digital or hard copies of all or part of this work for which inter-widget interactions pose security threats. personal or classroom use is granted without fee provided that copies are This paper is further organized as follows. In the next not made or distributed for profit or commercial advantage and that copies section, we discuss two types of security vulnerabilities that bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific 1 http://www.igoogle.com permission and/or a fee. 2 http://www.netvibes.com ESEC-FSE’09,August 24–28, 2009, Amsterdam, The Netherlands. 3 Copyright 2009 ACM 978-1-60558-001-2/09/08 ...$5.00. http://www.pageflakes.com
81 we focus on in this paper. In Section 3, we present our over- range of security violation scenarios that are increasingly all approach to automatically detect inter-widget security becoming important as they threaten the safety of modern violations. In Sections 4 and 5 we describe the details of interactive web applications, in which widgets are DIV-based the automated detection methods. Section 6 covers the im- rather than IFrame-based and require dedicated attention. plementation of the methods and Section 7 reports on the results of our evaluation with two case studies. We conclude 3. OUR APPROACH our paper with a discussion, related work, and concluding remarks. Web widgets are generally placed on a single-page web interface [13] using Ajax technology. The goal of our ap- proach is to automatically identify security vulnerabilities in 2. BACKGROUND web widgets. To achieve this goal, our approach performs Placing interactive applications from multiple external ven- a dynamic analysis of the Ajax web application, by crawl- dors on a single web page requires a framework and a collec- ing through the various states and analyzing the widgets’ tion of widgets. The framework offers APIs and guidelines behavior. for developers to create new widgets. Each implemented In our previous work [12], we proposed a new type of widget must adhere to the framework guidelines, and can web crawler, called Crawljax, capable of exercising client- be hooked into a framework web page as a subtree, and exe- side code, detecting and executing doorways (clickables) to cuted without requiring additional compilation, in a mobile various dynamic states of Ajax-based applications within code style [1]. browser’s dynamically built DOM. While crawling, Crawl- The traditional way of placing content from multiple ex- jax infers a state-flow graph capturing the states of the ternal vendors on one page is by using IFrames. The advan- user interface, and the possible event-based transitions be- tage of IFrames is that they provide an isolated environment tween them, by analyzing the DOM before and after firing for the widgets because the Same Origin Policy (SOP) [15] an event. constrains them from accessing other parts of the page. The More recently, we proposed an approach for automatic main problem with using IFrames is, however, that they do testing of Ajax user interfaces, called Atusa [14]. Atusa not provide real integration on the page. IFrames impose is based on the crawling capabilities of Crawljax and pro- undesired limitations, for instance on, the page layout, use vides data-entry point detection and (pre-, in-, and post- of CSS style sheets of the framework, and rich widget inter- crawling) plugin hooks for testing Ajax applications through actions such as resizing. generic and application-specific invariants that serve as or- An alternative for overcoming such limitations is to place acle to detect faults. each widget inside a DIV container [18]. The trade-off of In this paper we propose to extend and use Atusa for using DIV containers is the lack of SOP support. Hence, automatically spotting security problems in widget interact- widgets that are placed in DIV containers run in the same ions. For each generalized violation scenario (V1, V2), we execution environment and are, therefore, capable of ac- first present our method for detecting the violation and then cessing and manipulating other widgets’ properties. This discuss the plugin that implements the approach. leaves the developers with the difficult choice between secu- In order to find DOM change violations (V1), we first rity and functionality. Existing widget frameworks appear need to automatically detect each widget’s boundary in the to be avoiding the security problems by either completely DOM tree. Once the boundaries are defined, we can analyze disallowing inter-widget interaction, or putting the respon- the elements receiving events and the actual changes taking sibility of possible risks of using external widgets on the place on the DOM tree to decide whether a state change is shoulders of the end users. a violation. Inter-widget interactions are desirable because they can For HTTP request violations (V2), the main challenge is provide highly interactive functionality, by for instance al- in coupling each outgoing request with the corresponding lowing to update a shopping basket widget, when an item is DOM element, from which it originated. Once we know selected in a product items widget. Therefore, placement in which element is causing the request, we can analyze the DIV containers is preferred over IFrames. behavior and decide whether a violation has occurred. Various violation scenarios in DIV-based malicious widgets To detect the widget in which the violation was initiated, can be sketched. In this paper, we focus on the following two we need to consider ways in which the state of a web page can generalized types: be changed. In Ajax-based applications, the web page usu- ally changes as a result of an event fired on an element with V1 A malicious widget (MAL) changes the DOM subtree of an event listener. Note that performing a full page refresh another innocent widget (VIC), e.g., MAL changes the updates the page as well but is rarely required in current action URL of a form in VIC so that when the user single-page web interfaces. In our approach, we assume a submits the form, the content is sent to a malicious violation is initiated by an event (e.g., click, mouseover, site. dblclick) fired on a DOM element. V2 MAL reads data from VIC and sends it to a server using In the next two sections, our methods for detecting V1 an HTTP request, e.g., user logs into an email account and V2 are presented respectively. using VIC, MAL logs the entered username and pass- word and sends it to a remote server. 4. DETECTING INTER-WIDGET CHANGE Although security testing is a broad term, current web VIOLATIONS application security research is mainly focused on prevent- The first violation scenario that we address involves a ma- ing and detecting Cross-site Scripting (XSS) and SQL in- licious widget MAL trying to make a change to the DOM jections. However, we believe that V1 and V2 cover a wide subtree belonging to an innocent widget VIC. Below, we
82 html html:{}|EXCLUDE; body:{}|EXCLUDE; div:{}|LOW; div:{class=widget}|VERY_HIGH; threshold = MEDIUM
body head Figure 2: Widget boundary rules for the example widget (separated by ;).
div title #wi_container Algorithm 1 IdentifyWidgetBoundary(fingerNode)
widget1 1: rules ← getIdentificationRules() widget2 2: node ← fingerNode div 3: bestMatchedNode ← null div #widget1 highestRating ← null #widget2 4: .widget .widget 5: while node 6= null do ANNOTATED 6: r ← node.match(rules) 7: if r 6= EXCLUDE && r > highestRating then 8: bestMatchedNode ← node 9: highestRating ← r image p image p #img1 #p1 #img2 #p2 10: end if 11: if r = VERY_HIGH then 12: break Figure 1: Example DOM tree of a widget framework 13: end if (# = id, . = class). 14: node ← node.getParent() 15: end while 16: if highestRating ≥ getThreshold() then 17: bestMatchedNode.annotate() first discuss the nature of widget boundaries, after which we 18: return bestMatchedNode explain how we can reverse engineer them, and how these 19: end if 20: return null boundaries can be used to detect illegal DOM changes. 4.1 Widget Boundaries MEDIUM, HIGH and VERY_HIGH, as well as EXCLUDE. The latter A widget is a subtree of the DOM. Widgets cannot be option indicates that a node mating a given pattern cannot nested, and do not share DOM elements. The top (root) be a widget boundary. element of a widget comprises its boundary with the top In order to determine the widget boundary for a given level framework DOM elements. node n, we traverse the tree upwards from n to the root of An example is shown in Figure 1: Two widgets are dis- the full DOM tree. Every ancestor we encounter on our way played, with root elements having id’s #widget1 and #wid- up we match with the given rules. The node with the highest get2. The framework element is the parent of these two ranking is marked as widget boundary, provided the rank- widgets, having id #wi_container. ing is higher than a minimal threshold. The corresponding In many frameworks, it is straightforward to identify the algorithm is displayed in Algorithm 1. boundary between a widget and the framework part of a DOM tree, often by means of the class=widget attribute. 4.3 Detecting Violations With the goal of defining a standard for automated widget The final step is using the widget boundaries to identify testing, we are an advocate of this policy. For example, DOM Change Violations. To that end, we need to identify in Figure 1, the root of each widget holds a class attribute elements triggering a change, and elements affected by a .widget. change. We define an active element as the element on which Not all frameworks, however, require this explicit identi- the last event was fired. fication. In those cases, a number of heuristics can be used The widget boundary of the widget that initiated the to determine if a particular DOM node constitutes a widget DOM change can be automatically detected using Algorithm boundary. First, some elements are certain not to belong 1 with the active element as input. to any boundary. Typical examples include HTML or BODY To detect the widget in which the DOM change took place, elements, or nodes that are known to belong to the given first any changed node(s) must be determined by comparing framework (the #wi_container in the example). the DOM tree before firing the event with the DOM tree In addition to that, some elements can be given a certain after firing the event. After detecting the node(s) in which probability that they constitute a boundary. An example is a the DOM change took place, an analysis of the DOM change DIV element, which can, but need not be a widget boundary. can be carried out. Assuming the DOM is successfully an- These probabilities are defined by the tester after manual notated during the detection of the widget boundary of the inspection of the DOM. active element, Algorithm 2 can be used to automatically 4.2 Reverse Engineering Boundaries analyze the DOM changes. In order to identify widget boundaries automatically, we propose the use of boundary rules. Each rule provides a 5. DETECTING REQUEST VIOLATIONS pattern as well as a rating. Example rules for Figure 1 are Our second violation scenario describes the generalized displayed in Figure 2. situation in which a widget reads data from another wid- The pattern syntax is equal to the pattern used to con- get, and sends it to a remote server using HTTP requests. figure the elements to crawl in Crawljax [14], resulting in We first offer some examples illustrating this violation (Sec- rules of the form tag :{attr =value }|rating . tion 5.1). Then, we explain how we can capture the widget The rating is an indicator for the likelihood that a given from which a request originates (Section 5.2), and how we pattern will match a boundary. Possible values include LOW, can use that to compare the request to a list of trusted re-
83 Algorithm 2 IsDomChangeViolation(changedNode) // before annotation Require: The DOM tree is annotated successfully during the widget boundary detection of the active element. // after annotation 1: node ← changedNode 3: if node.isAnnotated() then 4: return false 5: end if Figure 5: Example annotation of the unique at- 6: node ← node.getParent() tribute annotation proxy plugin. 7: end while 8: return true triggered the request. To be able to analyze HTTP requests, all requests must be intercepted. For this purpose, we pro- quests for the given widget (Section 5.3). In addition, we pose to place an HTTP proxy between the client browser explain how these lists can be generated automatically. and the server, which buffers all outgoing HTTP requests. 5.1 Request Violation Examples The only way to attach information about DOM elements to an HTTP request, without affecting the behavior of the A simplified example of code that causes an JavaScript web server handling the request, is by adding data to the re- HTTP request violation is given in Figure 3. quest query string (e.g., ?wid=w23&requestForProxyId=123). This data should be selected carefully, to ensure it does not // code in MAL interfere with other parameters being sent to the server. If url = "http://www.domain.com/mal_script.aspx"; the request parameters contain the value of a unique at- vic = document.getElementById("VIC.inputField"); vic.onkeydown = sendData(url, event.getKey()); tribute, such as the element’s ID, it can be extracted and used to identify the element in the DOM. Enforcing all Figure 3: MAL attaches an event listener to VIC, HTTP requests to contain a value with which the origin sending every user input in VIC to a remote server. widget can be detected requires having mechanisms for the enforcement of a unique attribute in each DOM element, It is important to realize that the attachment of the on- and the attachment of the unique attribute of the originat- keydown event handler does not affect the DOM tree of VIC ing element to outgoing requests. and, therefore, can not be detected using Algorithm 2. In First we need to consider ways HTTP requests can be this example, VIC makes an HTTP request every time a triggered in Ajax-based web applications. key is pressed in its input field, which, based on VIC’s orig- Static Elements. HTTP requests triggered by the src inal functionality, could be a violation. A second example attribute of an static element, for instance in a SCRIPT or is given in Figure 4, in which the event which triggers the IMG element in the source code of the HTML page, are sent request is attached to MAL itself. Because MAL reads data immediately when the browser parses them. This leaves us from VIC, this is also a violation. no time to dynamically annotate a unique value on these elements, as the requests are sent before we can access the DOM. The solution we propose is to use the proxy for inter- // code in MAL cepting responses as well. The responses can be adjusted by function getValue() { return document. the proxy to ensure that each element with a src attribute getElementById("VIC.inputField").value; is given a unique identifying attribute. } Figure 5 shows an example of annotating a requestFor- url = "http://www.domain.com/mal_script.aspx"; ProxyId attribute on a SCRIPT element. Note that the at- mal = document.getElementById("MAL.button"); tribute is annotated twice: in the URL so that it reaches mal.onclick = sendData(url, getValue()); the proxy, and as an attribute for easy identification on the DOM tree using XPath when the violation validation pro- Figure 4: MAL reads data in VIC and sends it to a cess is carried out. remote server through an HTTP request. Dynamic Elements. The src attribute of an element that is dynamically created on the client through JavaScr- This type of violation is very difficult to detect by static ipt and added to the DOM tree, can also trigger an HTTP analysis, because an event handler can be generated and at- request. Annotating attributes through the proxy has limi- tached to an element at runtime in JavaScript. In addition, tations for this type of request, since elements that are added it is difficult to detect by which widget an event handler was dynamically on the client-side are missed. During dynamic attached to an element. Our approach uses dynamic analy- annotation these elements are missed as well, because the sis and assumes that MAL has to send the data read from request is triggered before the element can be annotated. VIC to a remote server. Otherwise, the data never leaves Because we assume every element has a unique attribute in the client and hence no real violation can occur. We detect our approach, requests triggered from dynamically gener- HTTP request violations by analyzing all outgoing HTTP ated elements can be detected easily as they do not contain requests. a unique attribute. We believe dynamically generated ele- ments with a src attribute are rare in modern web applica- 5.2 HTTP Request Origin Identification tions, and since this attribute should point to, for instance, The main challenge of detecting the origin widget of a a JavaScript file or image, the HTTP request they trigger request is to couple the request to the DOM element from should be easy to verify manually by a tester. Therefore, which it originated. This is not a trivial task, since HTTP all requests made from elements which are not annotated, requests do not carry information about the element that should be flagged as suspicious and inspected by the tester.
84 Algorithm 3 IsHttpRequestViolation(request) Assuming a request contains the annotated attribute of Require: Access to the DOM history list. the origin element, Algorithm 3 can be used to automatically Require: Elements in the DOM are annotated with a unique at- detect the origin widget of the request and report HTTP tribute. Require: Request contains the annotated attribute of the DOM el- request violations. ement. Note that this approach also works for requests that do 1: id ← request.getQueryString().getElementId() not originate from a widget, but from a non-widget element 2: activeNode ← domHistory.findElement(id) instead. By crawling the framework with only an empty 3: widget ← IdentifyWidgetBoundary(activeNode) 4: list ← widget.getAllowedUrlList() widget, an allowed URL list can be created for the frame- 5: url ← request.getUrl() work. A request which originates from an element that does 6: return !list.contains(url) || isSuspicious(url) not have a widget boundary will be validated against the allowed URL list of the overall framework.
Ajax Calls. HTTP requests sent through an Ajax call, 6. IMPLEMENTATION via the XMLHttpRequest object, are the most essential form We have implemented our methods for detecting DOM of sending HTTP requests in modern single-page web appli- change violations and HTTP request violations as two sepa- cations [2]. These requests are often triggered by an event, rate plugins, called Diva (Detecting Inter-widget Violations e.g., click, mouseover, on an element with the corresponding with Atusa), for our Atusa Ajax testing and crawling in- event listener. Note that this type of elements could also be frastructure. Diva is available through our web site4 as an created dynamically, and therefore proxy annotation is not open source project. desirable. Hence, we propose to dynamically annotate such The full architectural processing view, of the crawler as elements. To that end, we annotate a unique attribute on well as the security plugins, is shown in Figure 6. Two in- the element right before an event is fired. Note that this Crawling plugins are shown in particular: DOM Violations, annotation is easiest to implement by means of aspects, as and HTTP violations. Furthermore, it shows the integrated explained in Section 6. proxy between the embedded browser and the web server. After the annotation, the attribute (and its value) must The implementation details of Atusa and the underlying be appended to all HTTP requests that the event triggers. Ajax crawler, Crawljax, and the role of the remaining To that end, we take advantage of a technique known as components such as the robot, state machine, controller, Prototype Hijacking[17], in which the Ajax call responsible and DOM analyzer are explained in [14, 12]. for client/server communication can be subverted using a The first issue to consider for the implementation of Diva wrapper function around the XMLHttpRequest object. Dur- is the way Atusa handles the DOM. Atusa uses three DOM ing the subversion, we can use the annotated attribute of objects: the element, on which the event initiating the call was fired, to add a parameter to the query string of the Ajax HTTP • The realDom - The runtime internal browser DOM, call. • The domBeforeEvent - A copy of the realDom before It is possible that the annotated origin element is removed the event is fired, from the DOM by the time the request is validated. To • The domAfterEvent - A copy of the realDom after the avoid this problem, we keep track of the DOM history. Af- event is fired. ter an event is fired, and a DOM change is occurred, the state is saved in the history list. Assuming the history size The DOM change violation detection plugin compares the is large enough, a request can always be coupled to its ori- domBeforeEvent and the domAfterEvent using the Diff func- gin element, and the state from which it was triggered, by tion of XmlUnit5 to find the changed nodes. XmlUnit re- searching the DOM history. turns a list of references to nodes in the domAfterEvent object, while the crawler returns a reference to the active 5.3 Trusted Requests element in the domBeforeEvent object. As mentioned ear- After detecting the origin widget of a request, the request lier, our approach requires both references to be in the same must be validated to verify whether the widget was allowed object. Therefore, we extended Atusa using AspectJ to an- to send this request. To this end, a method must be defined notate the widget boundary of the active element in the real- for specifying which requests a widget is allowed to make. Dom with a unique attribute, right before it fires an event on Our approach uses an idea often applied in firewall tech- it. We have chosen to annotate the widget boundary rather nology, in which each application has an allowed list of URLs than the active element itself, because we assume widgets [10]. For each widget, we can automatically create a list of cannot be removed during the crawling session. This im- allowed URLs by crawling it in an isolated environment. plies that the widget boundary is available in every DOM This way, every request intercepted by the proxy can be state, while the active element may be removed or changed. assigned to that specific widget. At the end of the crawling When realDom is copied into domAfterEvent, the annota- process, the proxy buffer contains all the requests the widget tion is copied along and the active element can be retrieved has triggered. This list can be saved, edited by the tester, using an XPath query for the annotated unique attribute. and retrieved during the validation phase of a request. In Using this approach, references to the widget boundary of addition, it is possible for a tester to manually flag URLs the active element and the changed nodes can be found in in the list as suspicious. If during the validation process the same DOM object. a request URL does not exist in the allowed URL list of The minimum rating an element must match, to be in- its origin widget, or if the URL is flagged as suspicious, we cluded in the detection process, can be configured. It is also 4 assume the widget does not have permission to trigger the http://spci.st.ewi.tudelft.nl 5 request and thus an HTTP request violation has occurred. http://xmlunit.sourceforge.net
85 UI Widget pair DOM change in VIC Description event Embedded D1 Form action URL Browser D2 Link description An event fired in MAL D3 Link destination changes the specified update event D4 Background colour element in VIC Robot Server D5 Image source DOM event Ajax HTTP D6 Framework element An event fired in MAL Engine Proxy changes a framework element, update e.g. the href of a menu item Widget pair Event attached to VIC Description H1 Key press MAL attaches an event fire H2 Focus handler, which sends an Crawljax Controller initialize event H3 Submit HTTP request, for the H4 Click specified event to an H5 Mouse movement element in VIC Analyze Dom Widget pairs Description D* All DOM violations on one page State DOM inCrawling HTTP H* All HTTP violations on one page update Machine Analyzer plugins Violations DH* All DOM and HTTP violations on one page
Legend Access DOM Test Case Table 1: Widget pairs causing DOM change and Violations Control flow Generation report HTTP request violations by MAL. Event invocation report Data component Validation Processing component Report Test Case
Static file 7. EMPIRICAL EVALUATION DB Implementation To assess the the usefulness and effectiveness of our ap- proach and the corresponding implemented plugins, we con- Figure 6: Processing view of our security violation ducted two case studies following Yin’s guidelines [21]. Our detection approach. evaluation addresses the following research questions:
RQ1 What are the violation revealing capabilities of the possible to use wildcards in the value specification of a rule, two approaches for DOM change and HTTP request e.g. div:{id=widget%}. violation detection? For dynamic injections as described in Section 5.2, Atusa RQ2 How well does our analysis perform, with respect to was adapted, through aspects, to inject a unique attribute crawling time and the number of states? into the element on which the next event will be fired, right RQ3 How scalable is our approach, with respect to the before the event is fired. number of widgets that can be verified at the same For our request violation detection approach, we have in- time? tegrated an HTTP proxy into Atusa. We use the proxy functionality of WebScarab,6 which is an open source Java proxy. Our integration allows initializing and accessing the 7.1 Subject Systems proxy through the Crawljax controller. Our experimental subject systems consist of EWF (Exact Di Paola and Fedon [17] have proposed a method for sub- Widget Framework) and Pageflakes. verting Ajax calls by diverting the XmlHttpRequest, called EWF. Exact Software8 is currently researching the pos- Prototype Hijacking. Unfortunately, this approach does not sibilities of building software solutions using widgets. As a work in Internet Explorer since the browser does not al- proof of concept, a widget framework is being prototyped, low prototype manipulation of the XmlHttpRequest object. which is referred to as EWF throughout this paper. One To overcome this limitation, we subvert an instance of the of the key components of the EWF will be the personalized object instead of the prototype itself. This solution is, how- start page for each user, on which widgets from a catalog ever, not generic, i.e., depending on the underlying Ajax can be placed using a drag and drop mechanism. The pro- framework and the way the calls are made, different subver- totype Ajax-based widget framework is implemented by the sion code is required. We have currently implemented diver- Research and Innovation (R&I) team of Exact Software in sion code for the ASP.NET and jQuery Ajax frameworks. ASP.NET and runs on IIS 7. The JavaScript wrapper code is automatically added to Pageflakes. Pageflakes is a commercial Ajax-based the page trough the proxy. start page, which allows its users to place widgets on their The output of Diva consists of a description of the vio- personalized pages. Pageflakes is a closed source framework, lation and the event sequence which has caused it. The de- which means we are not able to inspect or adapt the frame- scription of a DOM change violation contains a textual rep- work code. In Pageflakes, widgets are called flakes and at resentation of the DOM change found, generated by XmlU- the time of writing over 240,000 flakes were available. Page- nit. The description of an HTTP request violation contains flakes also provides APIs for external developers to create information about the origin widget, the requested URL, their own widgets. and the name of the list against which it was checked. In Note that both EWF and Pageflakes use DIV-based con- addition, a Selenium7 test case is generated for each detected tainers for their widgets. violation automatically. These test cases can be used by the tester to replay and investigate the detected violations man- 7.2 Setup ually. We designed a number of scenarios, each consisting of two widgets, a malicious widget (MAL) and a victim widget (VIC). Two groups of widget pairs were designed, as shown 6 http://www.owasp.org/index.php/Category:OWASP_WebScarab_ in Table 1. The first group contained six cases (D1-D6), Project 7 8 http://seleniumhq.org http://www.exactsoftware.com
86 each causing a DOM change violation in VIC by MAL, e.g., an event fired on MAL changes the action URL of a form, or the background color of an element in VIC. The second group contained five cases (H1-H5), each causing an HTTP request violation, e.g., MAL attaches an event listener to VIC so that the contents of a form field in VIC are sent to Widget pair(s) Violations seeded Violations detected False Positives Time (s) w.o. plugins Time (s) w. plugins Overhead time (%) Clickables States the server by MAL after a keypress event. D1 3 3 0 122 110 -9.8 3 4 D2 3 3 0 106 90 -15.0 3 4 Both studies were performed on an Intel Core 2 6400 2.13 D3 3 3 0 107 93 -13.0 3 4 Ghz CPU with 2 GB RAM running Windows XP. We config- D4 3 3 0 106 90 -15.0 3 4 D5 3 3 0 175 159 -9.1 3 4 ured Atusa properties to use Internet Explorer 7 as the in- D6 4 4 0 126 110 -12.7 4 5 ternal browser, included the A, Input, Img, Button, La- D* 19 19 0 552 538 -2.5 19 20 H1 1 2 1 74 75 1.4 3 4 bel tags for crawling, and the click, mouseover, mouse- H2 1 2 1 55 55 0.0 2 3 H3 ------down, mousemove, mouseup, blur, focus, keydown, key- H4 1 2 1 51 51 0.0 1 2 press, keyup, and select as the possible events for the H5 1 2 1 62 61 -1.6 4 5 H* 4 8 4 156 158 1.3 10 11 robot. DH* 23 43 10 703 760 8.1 29 30 In order to test the violation revealing capabilities of our approach (RQ1), we measure the number of violations seeded Table 2: Results of the EWF case study. and detected. We also measure the number of false positives to test the reliability of the output of our plugins. To test the performance of our approach (RQ2), we measure the During the execution of all widget pairs (DH*) both plu- crawling time for each case, and the number of clickables gins were enabled. and DOM states found by Crawljax. The set of clickables contains all elements which were not explicitly excluded by 7.4 Results configuration and on which firing an event resulted in a new Tables 2 and 3 show our measurements for the EWF and DOM state [12]. Finally, to test the scalability of our ap- Pageflakes case studies, respectively. The scenario with all proach (RQ3), we place multiple widget pairs on the page widgets on the page (DH*) required more crawling time on at the same time (case D*, H* and DH* in Table 1). Pageflakes than the maximum time allowed (annotated with We have configured the maximum crawling time for each * in Table 3). case at 60 minutes. If the crawling process was not finished The first run on the EWF case reported a large number by that time, the execution was stopped and the output was of false positives. After inspection, we found that the used analyzed. jQuery UI9 library in EWF, attaches onmousedown event handlers to many elements on the page that cause DOM 7.3 Execution changes for each fired event. To avoid these unnecessary false positives, Atusa was adapted to filter DOM changes of EWF. To prevent bias, 11 EWF widget pairs, including this type. During the second run, we found that widget pair their malicious behaviour, were implemented by a member H3 could not be validated due to invalid HTML code, which of the Exact R&I team, following the requirements in Ta- caused a problem after the JavaScript subversion code in- ble 1. Because our focus was on inter-widget interactions, jection. During this process, the HTML is parsed, the code all elements in the menu section and elements which may is added, and serialized. Therefore, a requirement of Diva remove or hide a widget were excluded from the crawling is that widgets contain valid HTML. Our proposal is to run process. We have manually determined the items to ex- the regular HTML validity checker plugins in Atusa first clude by inspecting the DOM, which could be done within [14], and reject widgets that contain invalid HTML code. 30 minutes. We defined the rules for EWF widget boundary Since Pageflakes does not allow the use of