Statically Detecting JavaScript Obfuscation and Minification Techniques in the Wild Marvin Moog∗y, Markus Demmel∗, Michael Backesy, and Aurore Fassy ∗Saarland University yCISPA Helmholtz Center for Information Security: fbackes,
[email protected] Abstract—JavaScript is both a popular client-side program- inherently different objectives, these transformations leave ming language and an attack vector. While malware developers different traces in the source code syntax. In particular, transform their JavaScript code to hide its malicious intent the Abstract Syntax Tree (AST) represents the nesting of and impede detection, well-intentioned developers also transform their code to, e.g., optimize website performance. In this paper, programming constructs. Therefore, regular (meaning non- we conduct an in-depth study of code transformations in the transformed) JavaScript has a different AST than transformed wild. Specifically, we perform a static analysis of JavaScript files code. Specifically, previous studies leveraged differences in to build their Abstract Syntax Tree (AST), which we extend with the AST to distinguish benign from malicious JavaScript [5], control and data flows. Subsequently, we define two classifiers, [9], [14], [15]. Still, they did not discuss if their detectors benefitting from AST-based features, to detect transformed sam- ples along with specific transformation techniques. confounded transformations with maliciousness or why they Besides malicious samples, we find that transforming code did not. In particular, there are legitimate reasons for well- is increasingly popular on Node.js libraries and client-side intentioned developers to transform their code (e.g., perfor- JavaScript, with, e.g., 90% of Alexa Top 10k websites containing mance improvement), meaning that code transformations are a transformed script.