JSZap: Compressing JavaScript Code Martin Burtscher Benjamin Livshits and Benjamin G. Zorn Gaurav Sinha University of Texas at Austin Microsoft Research IIT Kanpur
[email protected] {livshits,zorn}@microsoft.com
[email protected] Abstract bytes of compressed data. This paper argues that the current Internet infrastruc- JavaScript is widely used in web-based applications, and ture, which transmits and treats executable JavaScript as gigabytes of JavaScript code are transmitted over the In- files, is ill-suited for building the increasingly large and ternet every day. Current efforts to compress JavaScript complex web applications of the future. Instead of using to reduce network delays and server bandwidth require- a flat, file-based format for JavaScript transmission, we ments rely on syntactic changes to the source code and advocate the use of a hierarchical, abstract syntax tree- content encoding using gzip. This paper considers re- based representation. ducing the JavaScript source to a compressed abstract Prior research by Franz et al. [4, 22] has argued that syntax tree (AST) and transmitting it in this format. An switching to an AST-based format has a number of valu- AST-based representation has a number of benefits in- able benefits, including the ability to quickly check that cluding reducing parsing time on the client, fast checking the code is well-formed and has not been tampered with, for well-formedness, and, as we show, compression. the ability to introduce better caching mechanisms in With JSZAP, we transform the JavaScript source into the browser, etc. However, if not engineered properly, three streams: AST production rules, identifiers, and an AST-based representation can be quite heavy-weight, literals, each of which is compressed independently.