Change-Aware Dynamic Program Analysis for JavaScript


Dileep Ramachandrarao Krishna Murthy and Michael Pradel
Department of Computer Science, TU Darmstadt, Darmstadt, Germany
[email protected], [email protected]

This work was supported by the German Federal Ministry of Education and Research and by the Hessian Ministry of Science and the Arts within CRISP, by the German Research Foundation within the ConcSys and Perf4JS projects, and by the Hessian LOEWE initiative within the Software-Factory 4.0 project.

Abstract—Dynamic analysis is a powerful technique to detect correctness, performance, and security problems, in particular for programs written in dynamic languages, such as JavaScript. To catch mistakes as early as possible, developers should run such analyses regularly, e.g., by analyzing the execution of a regression test suite before each commit. Unfortunately, the high overhead of these analyses makes this approach prohibitively expensive, hindering developers from benefiting from the power of heavyweight dynamic analysis. This paper presents change-aware dynamic program analysis, an approach to make a common class of dynamic analyses change-aware. The key idea is to identify parts of the code affected by a change through a lightweight static change impact analysis, and to focus the dynamic analysis on these affected parts. We implement the idea based on the dynamic analysis framework Jalangi and evaluate it with 46 checkers from the DLint and JITProf tools. Our results show that change-aware dynamic analysis reduces the overall analysis time by 40%, on average, and by at least 80% for 31% of all commits.

Index Terms—program analysis, JavaScript

I. INTRODUCTION

Dynamic analyses help developers identify programming mistakes by analyzing the runtime behavior of a program. For example, the Valgrind analysis framework [1] is the basis for various tools, e.g., to identify references to undefined values via a memory analysis and unintended flows of data via a taint analysis. More recently, the Jalangi framework [2] has popularized dynamic analysis for dynamic languages, in particular JavaScript, and is the basis for several tools that find, e.g., inconsistent types [3], optimization opportunities [4], memory leaks [5], and various other code quality issues [6].

Given the power of these analyses, it would be desirable to use them frequently during the development process. For example, one could apply such analyses on each commit, e.g., by running a regression test suite and analyzing its execution. By running each analysis after each commit, the analyses could warn developers about mistakes right when the mistakes are introduced, similar to static checks performed by a compiler or lint-like tools. An important prerequisite for this promising usage scenario is that an analysis produces results quickly.

Unfortunately, most dynamic analyses implemented on top of general-purpose analysis frameworks, such as Valgrind and Jalangi, are not only powerful but also heavyweight. Typically, such an analysis imposes a significant runtime overhead, ranging between factors of tens and hundreds. For projects with large regression test suites, this overhead practically prohibits running the analysis after every change. Such large regression test suites are particularly common for dynamic languages, where static checks are sparse and likely to miss problems. For example, applying two dynamic analyses for JavaScript, DLint [6] and JITProf [4], to the regression test suite of the widely used underscore.js library (http://underscorejs.org) takes over five minutes on a standard PC, whereas running the test suite without any analysis takes only about five seconds.

As a motivating example, consider Figure 1, which shows a change in a piece of JavaScript code. Suppose we use a dynamic analysis that warns about occurrences of NaN (not a number). NaN values may arise in JavaScript when arithmetic operations are applied to non-numbers, yet they do not trigger any runtime exception. Furthermore, suppose that the project's regression test suite has three tests that call the functions a, b, and e, respectively, as indicated in the last three lines of the example. In the original code, the analysis reports a NaN warning at line 22. In the modified code, the analysis warns again about line 22 and also about the newly introduced NaN at line 12. Naively applying the dynamic analysis after the change will analyze the executions of all code in Figure 1. However, most of the code is not impacted by the change, and there is no need to re-analyze the non-impacted code.
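To make the NaN behavior concrete, the following stand-alone snippet (our illustration, not part of the paper) shows that arithmetic on a non-number silently yields NaN rather than raising an exception:

    // Arithmetic on a non-number silently produces NaN; no exception is thrown.
    var n = "hi" - 23;             // "hi" cannot be coerced to a number
    console.log(n);                // prints: NaN
    console.log(n === n);          // prints: false (NaN is not equal to itself)
    console.log(Number.isNaN(n));  // prints: true

Because nothing in the normal execution signals that the computation went wrong, such mistakes are a natural target for a dynamic checker.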
Old version:

 1 var x = 4, y, z;
 2
 3 function a() {
 4   x = 42;
 5 }
 6 function b() {
 7   c();
 8   d();
 9 }
10 function c() {
11
12   y = x / 2;
13 }
14 function d() {
15   /* expensive
16    * computation
17    * independent
18    * of x, y, z */
19 }
20 function e() {
21   // NaN warning
22   var n = "hi" - 23;
23   return n;
24 }
25
26 a();
27 b();
28 z = e();

New version:

 1 var x = 4, y, z;
 2
 3 function a() {
 4   x = "aaa"; // changed
 5 }
 6 function b() {
 7   c();
 8   d();
 9 }
10 function c() {
11   // new NaN warning
12   y = x / 2;
13 }
14 function d() {
15   /* expensive
16    * computation
17    * independent
18    * of x, y, z */
19 }
20 function e() {
21   // old NaN warning
22   var n = "hi" - 23;
23   return n;
24 }
25
26 a();
27 b();
28 z = e();

Fig. 1: Code change in a program to be analyzed dynamically.

Existing approaches that reduce the overhead of dynamic analysis use sampling. One approach is to sample the execution in an analysis-specific way, e.g., based on how often a code location has been executed [4]. While such sampling can significantly reduce the overhead, it still remains too high for practical regression analysis. Another approach is to distribute the instrumentation overhead across multiple users [7]. However, this approach is not suitable for regression analysis, where results are expected within a short time and without relying on users.

As a technique complementary to reducing the overhead of dynamic analysis, regression test selection [8] can reduce the number of tests to execute, which will also reduce the overall analysis time. Regression test selection partly addresses the problem but has the drawback that, even after reducing the number of tests, the analysis may still consider code that is unaffected by a change. For example, in Figure 1, selecting the test that calls function b will nevertheless cause the dynamic analysis of function d, which is called by b but not affected by the change. Furthermore, as a practical limitation, we are not aware of any generic regression test selection tool available for JavaScript, which is the target language of this paper.

This paper addresses the problem of reducing the time required to run a dynamic analysis that has already been applied to a previous version of the analyzed code. We present change-aware dynamic analysis, or CADA for short, an approach that turns a common class of dynamic analyses into change-aware analyses. The key idea is to reduce the number of code locations to dynamically analyze by focusing on functions that are impacted by a code change. When entering a function at runtime, the dynamic analysis checks whether this function is impacted, and only if it is impacted, analyzes the function. To determine which functions are impacted, CADA uses a lightweight static change impact analysis that produces a function dependence graph. Starting from functions that are modified in a particular change, the approach then propagates the change to all possibly impacted functions. Key to the success of our approach is that the change impact analysis is efficient. To this end, we use a scalable yet not provably sound analysis. We find that, despite the theoretical unsoundness, the analysis is sufficient in practice, in the spirit of soundiness [9].

For the example in Figure 1, the change impact analysis determines that function a has been changed and that, as a result of this change, also function c is impacted by the change. By only applying the dynamic analysis to these functions, but not to any others, the approach reduces the number of analyzed functions from five to two. As a result, the change-aware analysis reports the newly introduced warning at line 12. To obtain the set of all warnings in the program, one can combine the warnings found by the change-aware analysis for the current version with warnings found for previous versions of the program.
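The impact analysis itself is described later in the paper; purely as an illustration, the propagation step over an already-extracted dependence graph could look as follows, where the graph encoding and function names are our assumptions, not the paper's implementation:

    // Illustrative sketch: propagate a change over a function dependence
    // graph. An edge f -> g means that g may behave differently when f
    // changes, e.g., because g reads a variable that f writes.
    function computeImpacted(dependents, changedFunctions) {
      var impacted = new Set(changedFunctions);
      var worklist = changedFunctions.slice();
      while (worklist.length > 0) {
        var f = worklist.pop();
        (dependents[f] || []).forEach(function (g) {
          if (!impacted.has(g)) { // process each function at most once
            impacted.add(g);
            worklist.push(g);
          }
        });
      }
      return impacted;
    }

    // Figure 1: a writes x, which c reads, so a -> c; no other function
    // depends on a.
    var dependents = { a: ["c"], b: [], c: [], d: [], e: [] };
    console.log(computeImpacted(dependents, ["a"])); // Set { 'a', 'c' }

Because each function enters the impacted set at most once, the worklist terminates and the propagation runs in time linear in the size of the graph.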
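At runtime, the per-function check described above can then gate the analysis callbacks on the impacted set. The following sketch uses Jalangi2-style callbacks for a simple NaN checker; identifying functions by name and the hard-coded impacted set are illustrative simplifications, not the paper's actual mechanism:

    // Hypothetical sketch: a change-aware NaN checker on top of Jalangi2.
    (function (sandbox) {
      var impacted = new Set(["a", "c"]); // result of the change impact analysis
      var frames = [];                    // one entry per active call: impacted?

      sandbox.analysis = {
        functionEnter: function (iid, f, dis, args) {
          frames.push(impacted.has(f.name)); // name-based lookup is a simplification
        },
        functionExit: function (iid, returnVal, wrappedExceptionVal) {
          frames.pop();
        },
        binary: function (iid, op, left, right, result) {
          // Analyze an operation only if its enclosing function is impacted.
          if (frames.length > 0 && frames[frames.length - 1] &&
              typeof result === "number" && isNaN(result)) {
            console.log("NaN produced by '" + op + "' operation");
          }
        }
      };
    })(J$);

With the impacted set {a, c}, such a checker skips the expensive function d entirely and reports the new warning in c, while the unchanged warning in e is obtained from the previous version's results.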
To evaluate the CADA approach, we apply it to the version histories of seven popular JavaScript projects. Without our approach, analyzing the changed versions of the projects takes 133 seconds, on average over all commits. CADA reduces the analysis time by 40%, on average, and by at least 80% for 31% of all commits. Comparing CADA to naively running non-change-aware analyses on every version, we find that CADA does not miss any warnings reported by the class of dynamic analyses targeted in this paper.

In summary, this paper contributes the following:

• An approach for turning a common class of dynamic analyses into change-aware dynamic analyses without modifying the analyses themselves.

• A lightweight, static change impact analysis for JavaScript. We believe this analysis to be applicable beyond the current work, e.g., for helping developers to understand the impact of a potential change or for test case prioritization.

• Empirical evidence that the approach significantly reduces the time required to apply heavyweight dynamic analysis. As a result, such analyses become more amenable to regular use, e.g., at every commit.

II. PROBLEM STATEMENT

The following section describes the problem that this paper tackles. We consider dynamic analyses that reason about the behavior of a program to identify potential programming mistakes and that report warnings to the developer. Given a