Information-Flow-Based Access Control for Web Browsers
Total Page:16
File Type:pdf, Size:1020Kb
IEICE TRANS. INF. & SYST., VOL.E92–D, NO.5 MAY 2009 836 PAPER Special Section on Information and Communication System Security Information-Flow-Based Access Control for Web Browsers Sachiko YOSHIHAMA†,††a), Takaaki TATEISHI†, Naoshi TABUCHI†, Nonmembers, and Tsutomu MATSUMOTO††, Member SUMMARY The emergence of Web 2.0 technologies such as Ajax and box holds a DOM (Document Object Model) tree repre- Mashup has revealed the weakness of the same-origin policy [1], the current senting an HTML document. The same-origin policy pro- de facto standard for the Web browser security model. We propose a new hibits access between DOMs or JavaScript objects that be- browser security model to allow fine-grained access control in the client- ff side Web applications for secure mashup and user-generated contents. We long to di erent origins. External script files imported into propose a browser security model that is based on information-flow-based the HTML document by the <script src=’...’ / > access control (IBAC) to overcome the dynamic nature of the client-side elements are regarded as part of the HTML document, and Web applications and to accurately determine the privilege of scripts in the thus run within the same sandbox as the main document. event-driven programming model. key words: Web security, browser security, access control, information- The XMLHttpRequest (XHR) is a de facto standard flow control API that is implemented in most Web browsers. XHR al- lows a client-side script to issue an arbitrary HTTP request 1. Introduction to a remote server. The same-origin policy is also applied to XHR, which means an HTTP request can only be issued to a server that belong to the same-origin as the client-side Ajax offers new modes of Web application construction in- content. However, the emergence of Web 2.0 technologies volving asynchronous data exchanges between the clients such as Ajax and Mashup has revealed design flaws of the (i.e., browsers) and servers, and using JavaScript to update same-origin policy. the GUI via the DOM (Document Object Model) and style First, the same-origin policy assumes that the content sheets. Ajax supports interactive user interfaces without on the same server is mutually trusted, which does not hold reloading the webpages, and this makes possible desktop- for content generated by many users or mashup applications. application-like user experiences for Web applications. For It is quite possible that the content on the same server in- example, word processors and spreadsheets have been the cludes malicious scripts from some users or third parties. In major native desktop applications on PCs, but they can now fact, Cross-Site Scripting attacks have been the most serious be implemented as Ajax applications [2]. security threat to Web applications for some years [3]. Mashup is an application programing model that com- Second, the same-origin sandboxes still need mecha- bines multiple content sources into a single user experience. nisms to allow communication between content from the A typical mashup application might integrate a third-party different origins. A typical example that bypasses the re- map service and a list of stores to interchangeably search striction of XHR is to use the src attribute of an <img> for the shops near some location on the map, or to find the element to send arbitrary information to an arbitrary remote locations of the shops on the map. server, by passing the information as part of the URL request The de facto security policy of current browsers is parameters. This technique is widely used in XSS attacks to called the same-origin policy [1]. The same-origin policy steal session cookies and send them to remote attackers. If assumes that contents (e.g., HTML documents) downloaded an attacker uses the src attribute of a <script> element, from the same server can trust each other, and therefore lim- he can easily implement bi-directional communication be- its the communication between contents from different ori- tween the browser and a remote server, taking advantage gins. The origin of the content is determined by the protocol, of the fact that the returned script content will be executed port number, and the server name of the content URLs. on the client-side [4]. Since the network accesses via URL Each set of same-origin content is confined within a references in the HTML attributes are not under the con- sandbox, which is a browser window or a frame. Each sand- trol of the same-origin policy, an attacker can communicate Manuscript received August 4, 2008. with arbitrary servers without using XHR. Similar loop- Manuscript revised January 2, 2009. holes exist in communications between windows and frames †The authors are with the IBM Tokyo Research Laboratory, on a browser, which are deliberately used by Ajax frame- Yamato-shi, 242–8502 Japan. †† works [5], [6] to enable cross-domain communications. The authors are with the Graduate School of Environment and This paper proposes a new secure browser model that Information Sciences, Yokohama National University, Yokohama- shi, 240–8501 Japan. mitigates the design flaws of the same-origin policy and a) E-mail: [email protected] mitigates the threats to Web applications. In our proposed DOI: 10.1587/transinf.E92.D.836 browser model, all data, i.e., contents received from servers, Copyright c 2009 The Institute of Electronics, Information and Communication Engineers YOSHIHAMA et al.: INFORMATION-FLOW-BASED ACCESS CONTROL FOR WEB BROWSERS 837 are associated with the security label which identifies the attack. origin of the data. The labels represents the security domain DOM-based XSS. The DOM-based XSS is possible when of the URL from which the content is originated. In addi- the client-side application is designed to read a URL tion, the user and each content provider can define the ac- string from document.location, document.URL or cess control policy to control how content (i.e., script) can document.referrer, and inject part of it into the access resources, such as the browser’s internal DOM tree DOM [7]. and the document cookies. The access control policy can also control execution of script as well as network access Once a malicious script is triggered, it is executed by JavaScript (e.g., XMLHttpRequest) and HTML elements within the same-origin sandbox of the innocent content, and and attributes (e.g., by using an <img> element). The ac- therefore allows an attacker to steal document cookies or cess control policy is defined based on the origin of the con- user passwords, or compromise data in a Web application tent. In order to enforce the access control policy properly in for the purpose of phishing [8]. Likewise, Mashup applica- JavaScript, which has dynamic nature, the proposed access tions have the same vulnerabilities as XSS, because third- control model is built on top of information-flowbased ac- party content is often integrated and executed in the same- cess control (IBAC) [13]. In IBAC, the privilege of the con- origin context as trusted content. tent is judged based on the origin of the data used in each op- A common practice to prevent XSS is to filter out eration. The origin of the data is tracked through script ex- scripts on the server side. However, in case of DOM-based ecution. IBAC has advantages over the Stack-based access XSS, server-side filtering is not possible because malicious control (SBAC), a code-origin based access control mech- URLs are handled only on the browser side. Even when the ffi anism which is widely used in Java. Since IBAC assesses server side filtering is logically possible, it is di cult to cor- the privilege of the content not only based on stack inspec- rectly filter data because new ways of bypassing filters are tion but also the origin of method parameters, our proposed constantly invented [9], [10]. It is reported that 73% of Web ff model can enforce access control policies based on the orig- applications su er from XSS vulnerabilities [3]. Therefore, inator of an action even in dynamic and self-mutating client- it is important to provdide a mechanism to protect users side Web applications. from vulnerable Web applications. The rest of the paper is organized as follows: Section 2 describes some attack scenarios. Section 3 offers some ob- 2.1 DOM-Based XSS and Our Approach servations on the access control models. Section 4 describes the detailed design of a secure browser model and its op- This section describes the details of how a DOM-based XSS erational semantics. Section 5 shows the correctness of the works, and how our proposed browser model prevents the proposed model. Section 6 presents an example scenario attack. and evaluates the safety of the model with sample attacks. 1. The user receives a malicious URL, such as Section 7 makes observation on some design decisions. Sec- http://foo.com/bar?name=john#<script>... tion 8 reviews related work. Section 9 concludes the paper ...</script> via a channel outside of the browser, with our future research agenda. such as in spam e-mail. 2. When the URL is passed to the browser, e.g., by 2. Motivating Scenarios: Cross-Site Scripting the user double clicking on the link in the spam mail, the browser sends the HTTP request GET This section describes an attack scenario that is enabled by /bar?name=john to the server foo.com. The fragment the shortcomings of the current browser security model, and identifier after the # sign will not be sent to the server. then gives overview of how the proposed browser model 3. The server foo.com returns a HTML document which prevents the attacks.