SAFE: Formal Specification and Implementation of a Scalable Analysis Framework for Ecmascript

SAFE: Formal Specification and Implementation of a Scalable Analysis Framework for Ecmascript

SAFE: Formal Specification and Implementation of a Scalable Analysis Framework for ECMAScript Hongki Lee Sooncheol Won Joonho Jin Junhee Cho Sukyoung Ryu KAIST KAIST KAIST KAIST KAIST [email protected] [email protected] [email protected] [email protected] [email protected] Abstract 1 function Wheel4() { this.wheel = 4 } 2 function Car() { this.maxspeed = 200 } The prevalent uses of JavaScript in web programming have re- 3 Car.prototype = new Wheel4; vealed security vulnerability issues of JavaScript applications, 4 var modernCar = new Car; which emphasizes the need for JavaScript analyzers to detect such 5 issues. Recently, researchers have proposed several analyzers of 6 var beforeModern = JavaScript programs and some web service companies have devel- 7 modernCar instanceof Car; // true oped various JavaScript engines. However, unfortunately, most of 8 the tools are not documented well, thus it is very hard to understand 9 function Wheel6() { this.wheel = 6 } and modify them. Or, such tools are often not open to the public. 10 Car.prototype = new Wheel6; In this paper, we present formal specification and implemen- 11 var afterModern = SAFE tation of , a scalable analysis framework for ECMAScript, 12 modernCar instanceof Car; // false developed for the JavaScript research community. This is the very 13 var truck = new Car; first attempt to provide both formal specification and its open- 14 var aftertruck = source implementation for JavaScript, compared to the existing ap- 15 truck instanceof Car; // true proaches focused on only one of them. To make it more amenable for other researchers to use our framework, we formally define three kinds of intermediate representations for JavaScript used in Figure 1. Unintuitive behavior of JavaScript prototypes the framework, and we provide formal specifications of translations between them. To be adaptable for adventurous future research in- cluding modifications in the original JavaScript syntax, we actively its own implementation of the language, JScript, in the Internet use open-source tools to automatically generate parsers and some Explorer 3.0 browser in 1996, Ecma International develops the intermediate representations. To support a variety of program anal- standardized version of the language named ECMAScript [8, 9]. yses in various compilation phases, we design the framework to be JavaScript was first envisioned as a simple scripting language, but as flexible, scalable, and pluggable as possible. Finally, our frame- with the advent of Dynamic HTML, Web 2.0 [28], and most re- work is publicly available, and some collaborative research using cently HTML5 [1], JavaScript is now being used on a much larger the framework are in progress. scale than intended. All the top 100 most popular web sites accord- ing to the Alexa list [2] use JavaScript and its use outside web pages Categories and Subject Descriptors D.3.3 [Programming Lan- is rapidly growing. guages]: Language Constructs and Features As Brendan Eich, the inventor of JavaScript, says [7]: General Terms Languages, Formalization “Dynamic languages are popular in large part because pro- Keywords JavaScript, ECMAScript 5.0, formal semantics, formal grammers can keep types latent in the code, with type specification, compiler, interpreter checking done imperfectly (yet often more quickly and ex- pressively) in the programmers’ heads and unit tests, and therefore programmers can do more with less code writ- 1. Introduction ing in a dynamic language than they could using a static JavaScript is now the language of choice for client-side web pro- language.” gramming, which enables dynamic interactions between users and By sacrificing strong static checking, JavaScript enjoys aggres- web pages. By embedding JavaScript code that use event handlers sively dynamic features such as run-time code generation using such as onMouseOver and onClick, static HTML web pages be- eval and dynamic scoping using with. In addition, JavaScript come “Dynamic HTML” [12] web pages. JavaScript is originally provides quite different semantics from conventional programming developed in Netscape, released in the Netscape Navigator 2.0 languages like C [22] and Java [4]. For example, JavaScript al- browser under the name LiveScript in September 1995, and re- lows programmers to use variables and functions before defining named as JavaScript in December 1995. After Microsoft releases them, and to assign values to new properties of an object even be- fore declaring them in the object. Also, JavaScript allows users to access the global object of a web page via interactions with the DOM (Document Object Model) without requiring any permis- Permission to make digital or hard copies of all or part of this work for personal or sions. JavaScript provides “prototype-based” inheritance instead of classroom use is granted without fee provided that copies are not made or distributed classes. for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute Consider the code example in Figure 1. Unlike conventional to lists, requires prior specific permission and/or a fee. programming languages, the inheritance hierarchy may be changed FOOL ’12 October 22, 2012, Tocson, AZ, USA. after creation of objects. When modernCar is constructed at line 4, Copyright c 2012 ACM [to be supplied]. $10.00 it is an instance of the Car object. However, because the prototype of Car is changed from Wheel4 to Wheel6 at line 10, modernCar erated parsers and AST nodes from high-level, brief descriptions is not an instance of Car any more at line 12. In JavaScript, when thanks to various third-party open-source tools. To support a va- some properties of a constructor change, the objects constructed riety of analyses on various compilation phases, we provide three before and after the change may be considered different instances levels of intermediate representations and well-defined translation even though they are constructed from the same constructor. mechanisms between them. Using SAFE, some collaborative re- Due to such quirky semantics, understanding and analyzing search on JavaScript such as clone detection and code structure JavaScript applications are well known to be difficult, and they are analysis are in progress with both academia and industry. often targets of security attacks [19]. Because of the crude con- In short, our contributions are as follows: trol by the same-origin policy of HTML, once a web page trusts • SAFE is the very first attempt to support both formal specifica- third-party code it permits subsequent contents from the same ori- tion and its implementation for JavaScript. gin, which often allows malicious scripts to sneak in. Such code injections can easily allow attackers to get high access permissions • SAFE is based on the 5th edition of the ECMAScript specifica- to secure contents including session cookies and unprotected per- tion. sonal information. This security problem known as XSS (cross-site • SAFE formally defines every intermediate representation used scripting) shows up often in web pages and web applications. To re- in the framework and provides formal specifications of the solve the problem, web service companies have developed several translations between them. defense mechanisms such as cookie-based security policies and fil- tering out string inputs that may contain malicious scripts, but their • SAFE describes the formal semantics of its Intermediate Rep- functionalities are very limited. More robust approach might be us- resentation (IR) with the descriptions of the corresponding lan- ing a presumably safe subset of JavaScript: Yahoo! ADsafe [6], guage constructs in the ECMAScript specification. Facebook FBJS [10], and Google Caja [13]. While they are in- • SAFE consists of formally defined components that are ade- tended to be safe subsets of JavaScript, none of them has been quate for pluggable analysis extensions. shown safe. Rather, researchers have reported security vulnerabili- • SAFE makes its implementation available to the public for the ties with ADsafe and FBJS [24, 25, 31]. research community: Clearly, better analyses of JavaScript applications for devel- oping more reliable programs become indispensable. As more http://plrg.kaist.ac.kr/research/safe fundamental solutions to the security vulnerability problems of JavaScript, researchers recently have proposed formal specifica- tions [11, 17, 23], type systems [3, 18, 33], static analyses [16, 2. SAFE 20], and combinations of static and dynamic analyses [5, 24] Before describing the formal specification and the implementation for JavaScript. Web-service companies also have fertilized the of SAFE in detail in Sections 3 and 4, we describe the motivation JavaScript research community by open sourcing their JavaScript of our work and a big picture of the framework. engines such as Rhino [26] and SpiderMonkey [27] from Mozilla, and V8 [14] from Google. While each of them contributes various 2.1 Motivation aspects to solve the problem of JavaScript vulnerability issues, they We encountered several obstacles while using existing tools in are yet unsatisfactory in several reasons. First, most of them do not our previous research. Recently, we have worked on JavaScript- have a well-defined specification or a document to describe them; related topics: 1) adding modules to the existing JavaScript lan- it is hard for other researchers to understand them and utilize them guage via desugaring [21]

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    12 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us