Dynamic Analysis for Javascript Code

Dynamic Analysis for JavaScript Code by Liang Gong A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Sciences in the GRADUATE DIVISION of the UNIVERSITY OF CALIFORNIA, BERKELEY Committee in charge: Professor Koushik Sen, Chair Professor David Wagner Professor Vern Paxson Professor Sara McMains Spring 2018 Dynamic Analysis for JavaScript Code Copyright © 2018 by Liang Gong Abstract Dynamic Analysis for JavaScript Code by Liang Gong Doctor of Philosophy in Computer Sciences University of California, Berkeley Professor Koushik Sen, Chair The eectiveness of the widely adopted static analysis tools is often limited by JavaScript’s dynamic nature and the need to over-approximate runtime behaviors. To tackle this challenge, we research robust dynamic analysis techniques for real-world JavaScript code. To analyze front-end web applications, we rst extend Jalangi which is a dynamic analysis framework based on source code instrumentation. Our extension of Jalangi intercepts and rewrites JavaScript code during network transmission. We also develop NodeSec, which is a dynamic instrumentation framework that traces and sandboxes the interactions between a Node.js program and the operating system. Based on the two frameworks, we research dynamic analysis techniques to detect correctness, performance, and security issues in JavaScript code. First, we present DLint, a dynamic analysis approach to check code quality rules in JavaScript. DLint consists of a generic framework and an extensible set of checkers that each addresses a particular rule. We formally describe and implement 28 checkers that address problems missed by state-of-the-art static approaches. Applying the approach in an empirical study on over 200 popular web sites shows that static and dynamic checking complement each other. On average per web site, DLint detects 49 problems that are missed statically, including visible bugs on the web sites of IKEA, Hilton, eBay, and CNBC. Second, we present JITProf, a proling framework to dynamically identify JIT-unfriendly code, which prohibits protable JIT optimizations. The key idea is to associate meta-information with JavaScript objects and code locations, to update this information whenever particular runtime events occur, and to use the meta-information to identify JIT-unfriendly operations. We use JITProf to analyze widely used JavaScript web applications and show that JIT-unfriendly code is prevalent in practice. We show that refactoring JIT-unfriendly code identied by JITProf leads to statistically signicant performance improvements of up to 26.3% in 15 popular benchmarks. Finally, we conduct the rst large-scale empirical study of security issues on over 330,000 npm packages. We adopted an iterative approach to dynamically analyze those packages and identied 360 previously unknown malicious or vulnerable packages, 315 of which have been validated by the community so far; 258 of those issues are considered as highly severe. All those packages with security issues in aggregate have 2,138 downloads per day, stressing the risks for the Node.js ecosystem. 1 Contents Contents i List of Figures v List of Tables vi Acknowledgments vii 1 Introduction 1 1.1 JavaScript — the Good, the Bad, and the Ugly . .1 1.2 Dealing with Problematic Features . .2 1.3 Outline and Previously Published Work . .5 2 Related Work 6 2.1 Static Analysis . .6 2.2 Symbolic Execution and Concolic Execution . .7 2.3 Dynamic Analysis Infrastructure . .8 2.3.1 Runtime Instrumentation . .8 2.3.2 Source Code Instrumentation . .8 2.3.3 Meta-circular Interpreter . .9 2.4 Correctness and Reliability . .9 2.4.1 Code Smells and Bad Coding Practices . 10 2.4.2 Program Repair . 11 2.4.3 Cross-browser Testing . 11 2.5 Performance . 12 2.5.1 Just-in-time Compilation . 12 2.5.2 Performance Analysis and Proling . 12 2.6 Security . 13 2.6.1 Node.js Security Analysis . 13 i 2.6.2 Manual Security Inspection . 14 2.6.3 Blacklist-based Static Checking Tools . 14 2.6.4 Runtime Monitoring Tools . 14 3 JavaScript Instrumentation 15 3.1 Dynamic Analysis and Instrumentation . 15 3.2 Jalangi Instrumentation Framework . 16 3.3 NodeSec Instrumentation Framework . 22 3.3.1 Node.js Background . 23 3.3.2 Dynamic Instrumentation Overview . 23 3.3.3 Intercepting and Monitoring Built-in System Functions . 24 3.3.4 Instrumentation to Wrap Asynchronous Callbacks. 26 3.3.5 Sandbox . 27 3.4 Rules, Events, and Runtime Patterns . 29 4 Checking Correctness Issues 31 4.1 Introduction . 31 4.2 Approach . 34 4.2.1 Rules, Events, and Runtime Patterns . 34 4.2.2 Problems Related to Inheritance . 36 4.2.3 Problems Related to Types . 38 4.2.4 Problems Related to Language Misuse . 40 4.2.5 Problems Related to API Misuse . 42 4.2.6 Problems Related to Uncommon Values . 44 4.3 Implementation . 45 4.4 Evaluation . 45 4.4.1 Experimental Setup . 46 4.4.2 Dynamic versus Static Checking . 46 4.4.3 Code Quality versus Web Site Popularity . 50 4.4.4 Performance of the Analysis . 50 4.4.5 Examples of Detected Problems . 50 4.4.6 Threats to Validity . 54 ii 4.5 Conclusion . 54 5 Checking Performance Issues 55 5.1 Introduction . 55 5.2 Approach . 57 5.2.1 Framework Overview . 57 5.2.2 Formalization of Prolers . 58 5.2.3 Patterns and Prolers . 59 5.2.4 Sampling . 68 5.3 Implementation . 69 5.4 Evaluation . 70 5.4.1 Experimental Methodology . 70 5.4.2 Prevalence of JIT Unfriendliness . 71 5.4.3 Proling JIT-Unfriendly Code Locations . 71 5.4.4 Runtime Overhead . 76 5.5 Conclusion . 76 6 Checking Security Issues 77 6.1 Introduction . 77 6.2 Background . 79 6.3 Overview . 79 6.3.1 Motivation behind Behavior-driven Study . 81 6.3.2 Identifying Suspicious Behaviors . 81 6.3.3 Dynamically Analyzing an Example . 82 6.4 Suspicious Behaviors . 83 6.4.1 File System Interaction . 84 6.4.2 Network Interaction . 85 6.4.3 Shell Interaction . 85 6.4.4 Ad hoc Attacks/Vulnerabilities . 86 6.4.5 Memory and CPU Monitoring . 86 6.4.6 Package Installation & NPM Scripts . 87 6.5 Implementation . 87 iii 6.5.1 Dynamic Instrumentation . 88 6.5.2 Virtual Machine . 90 6.5.3 Dynamic Analysis . 90 6.6 Empirical Study . 92 6.6.1 Directory Traversal . 97 6.6.2 Backdoor . 97 6.6.3 Virus . 98 6.6.4 Prank & Rockstar . 98 6.6.5 Denial-of-Service Attack . 99 6.6.6 Package Overwriting . 99 6.6.7 Unauthorized Access . 99 6.6.8 Privacy Issues . 100 6.6.9 File Overwrite Attack . 101 6.6.10 Runtime Install & Insecure Download . 101 6.6.11 Violation of Security Practices . 101 6.6.12 Known Vulnerabilities . 102 6.7 Conclusion . 103 7 Limitations 104 8 Summary 106 Bibliography 107 iv List of Figures 3.1 Overview of.

Load more