Automatic Tracking of Null Dereferences to Inception with Causality Traces

View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by UCL Discovery Casper: Automatic Tracking of Null Dereferences to Inception with Causality Traces Benoit Cornu*, Earl T. Barr†, Lionel Seinturier*, Martin Monperrus* *University of Lille & INRIA †University College London Abstract: Fixing a software error requires under- 1 Exception in thread"main" java.lang.NullPointerException standing its root cause. In this paper, we introduce 2 at [..].BisectionSolver.solve(88) 3 at [..].BisectionSolver.solve(66) “causality traces”, crafted execution traces augmented 4 at ... with the information needed to reconstruct the causal Listing 1: The standard stack trace of a real null chain from the root cause of a bug to an execution dereference bug in Apache Commons Math. error. We propose an approach and a tool, called Casper, based on code transformation, which dy- namically constructs causality traces for null derefer- 1 Exception in thread"main" java.lang.NullPointerException ence errors. The core idea of Casper is to replace 2 Dereferenced parameter “f" // symptom null 3 at [..].BisectionSolver.solve(88) s with special objects, called “ghosts”, that track 4 at [..].BisectionSolver.solve(66) the propagation of the nulls from inception to their 5 at ... 6 Parameter f bound to field “f2” error-triggering dereference. Causality traces are ex- 7 at [..].BisectionSolver.solve(66) tracted from these ghosts. We evaluate our contribu- 8 Field “f2” set to null 9 at [..].UnivariateRealSolverImpl.<init>(55) // tion by providing and assessing the causality traces of cause 14 real null dereference bugs collected over six large, popular open-source projects. Listing 2: What we propose: a causality trace, an extended stack trace that contains the root cause. 1 Introduction assignment of a null value, by means of a causality Null pointer dereferences are frequent errors that trace — the execution path the null took through the cause segmentation faults or uncaught exceptions. code from its inception to its dereference. Li et al. found that 37.2% of all memory errors The literature offers different families of techniques in Mozilla and Apache are null dereferences [15]. to compute the root cause of bugs, mainly program Kimura et al. [14] found that there are between one slicing, dataflow analysis, or spectrum-based fault- and four null checks per 100 lines of code on aver- localization. However, they have been little studied age. Problematic null dereferences are daily reported and evaluated in the context of identifying the root in bug repositories, such as bug MATH#3051. A null cause of null dereferences [11,4, 24]. Those techniques dereference occurs at runtime when a program tries are limited in applicability ([11] is an intra-procedural to read memory using a field, parameter, or variable technique) or in accuracy (program slicing results in that points to “null”, i.e. nothing. The terminology large sets of instructions [3]). The fundamental prob- changes depending on the language, in this paper, lem not addressed in the literature is that the causal- we concentrate on the Java programming language, ity trace from null inception to the null symptom is where a null dereference triggers an exception called missing. This is the problem that we address in this NullPointerException, often called “NPE”. paper. We propose a causality analysis technique that Just like any bug, fixing null dereferences requires uncovers the inception of null variable bindings that understanding their root cause, a process that we call lead to errors along with the causal explanation of null causality analysis. At its core, this analysis is how null flowed from inception to failure during the the process of connecting a null dereference, where execution. While our analysis may not report the root the fault is activated and whose symptom is a null cause, it identifies the null inception point with cer- pointer exception, to its root cause, usually the initial tainty, further localizing the root cause and speeding 1https://issues.apache.org/jira/browse/MATH-305 debugging in practice. 1 Let us consider a concrete example. Listing 1 ciousness values and does not collect causality traces shows a null dereference stack trace which shows nor identifies null inception points with certainty. that the null pointer exception happens at line 88 We evaluate our contribution Casper by providing of BisectionSolver. Let’s assume that a perfect fault and assessing the causality traces of 14 real null deref- localization tool suggests that this fault is located erence bugs collected over six large, popular open- at line 55 of UnivariateRealSolverImpl (which is source projects. We collected these bugs from these where the actual fault lies). However, the developer project’s bug reports, retaining those we were able to is left clueless with respect to the relation between reproduce. Casper constructs the complete causal- line 55 of UnivariateRealSolverImpl and line 88 of ity trace for 13 of these 14 bugs. For 11 out of these BisectionSolver where the null dereference happens. 13 bugs, the causality trace contains the location of What we propose is a causality trace, as shown in the actual fix made by the developer. Listing 2. In comparison to Listing 1, it contains three To sum up, our contributions are: additional pieces of information. First, it gives the exact name, here f, and kind, here parameter (local • The definition of causality traces for null derefer- variable or field are other possibilities), of the variable ence errors from null inception to its dereference 2 that holds null . Second, it explains the inception of and the concept of “ghost” classes, which replace the null binding to the parameter, the call to solve null, collect causality traces, while being other- at line 66 with field f2 passed as parameter. Third, wise indistinguishable from null. it gives the root cause of the null dereference: the • A set of code transformations that inject null assignment of null to the field f2 at line 55 of class ghosts and collect causality traces of null deref- UnivariateRealSolverImpl. Our causality traces con- erences. tain several kinds of causal links, of which Listing 2 shows only three: the name of the wrongly derefer- • Casper, an Java implementation of our tech- enced variable, the flow of a null binding through pa- nique. rameter bindings, and null assignment. Section 2.2 • An evaluation of our technique on 14 real null presents the concept of null causality trace. dereference bugs collected over 6 large open- We present Casper, a tool that transforms Java source projects. programs to capture causality traces and facilitate the fixing of null deferences3. Casper takes as input the The remainder of this paper is structured as fol- program under debug and a main routine that triggers lows. Section2 presents our technical contribution. the null dereference. It first instruments the program Section3 gives the results of our empirical evalua- under debug by replacing null with “ghosts” that are tion. Section4 discusses the limitations of our ap- shadow instances responsible for tracking causal infor- proach. Sections5 and6 respectively discusses the mation during execution. To instrument a program, related work and concludes. Casper and our bench- Casper applies a set of 11 source code transforma- mark can be downloaded from https://github.com/ tions tailored for building causal connections. For Spirals-Team/casper. instance, x = y is transformed into o = assign(y) , where method assign stores an assignment causal links in a null ghost (Section 2.3). Section 2.4 details 2 Debugging Nulls with Casper these transformations. Casper tracks the propagation of a null binding dur- Casper Compared to the related work, is novel ing application execution in a causality trace. A null along three dimensions. First, it collects the complete dereference causality trace is a sequence of program el- causality trace from the inception of a null binding ements (AST nodes) traversed during execution from to its dereference. Second, it identifies the inception the source of the null to its erroneous dereference. of a null binding with certainty. Third, Casper is lightweight and easily deployable, resting on transformation rather than replacing the Java virtual ma- 2.1 Overview chine, a la Bond et al. [4]. The first two properties We replace nulls with objects whose behavior, from strongly differentiate Casper from the related work the application’s point of view, is same as null, ex- [20,4], which tentatively labels root causes with suspi- cept that they store a causality trace, defined in Sec- null ghosts 2 tion 2.2. We call these objects and detail if the dereference is the result of a method invocation, we Casper give the code expression that evaluates to null them in Section 2.3. rewrites the program 3We have named our tool Casper, since it injects “friendly” under debug to use null ghosts and to store a null’s ghosts into buggy programs. causality trace in those null ghosts, (see Section 2.4). 2 We instantiated Casper’s concepts in Java and there- Mnemonic Description Examples fore tailored our presentation in this section to Java Inception (Section 2.5). x = null; Casper makes minimal assumptions on the appli- L Object x = null cation under debug, in particular, it does not assume null literal return null; a closed world where all libraries are known and ma- nipulable. Hence, a number of techniques used in void foo(Object x) P null at method entry Casper comes from this complication. {...} Propagation 2.2 Null Dereference Causality Trace //foo returns null x = foo() R null at return site To debug a complex null dereference, the developer foo().bar() has to understand the history of a null binding from bar(foo()) its inception to its problematic dereference. When a variable is set to null, we say that a null binding is A null assignment x = e; created.

Load more