Does the Fault Reside in a Stack Trace? Assisting Crash Localization by Predicting Crashing Fault Residence
Total Page:16
File Type:pdf, Size:1020Kb
The Journal of Systems and Software 148 (2019) 88–104 Contents lists available at ScienceDirect The Journal of Systems and Software journal homepage: www.elsevier.com/locate/jss Does the fault reside in a stack trace? Assisting crash localization by predicting crashing fault residence ∗ Yongfeng Gu a, Jifeng Xuan a, , Hongyu Zhang b, Lanxin Zhang a, Qingna Fan c, Xiaoyuan Xie a, Tieyun Qian a a School of Computer Science, Wuhan University, Wuhan 430072, China b School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan NSW2308, Australia c Ruanmou Edu, Wuhan 430079, China a r t i c l e i n f o a b s t r a c t Article history: Given a stack trace reported at the time of software crash, crash localization aims to pinpoint the root Received 27 February 2018 cause of the crash. Crash localization is known as a time-consuming and labor-intensive task. Without Revised 7 October 2018 tool support, developers have to spend tedious manual effort examining a large amount of source code Accepted 6 November 2018 based on their experience. In this paper, we propose an automatic approach, namely CraTer, which pre- Available online 6 November 2018 dicts whether a crashing fault resides in stack traces or not (referred to as predicting crashing fault res- Keywords: idence ). We extract 89 features from stack traces and source code to train a predictive model based on Crash localization known crashes. We then use the model to predict the residence of newly-submitted crashes. CraTer can Stack trace reduce the search space for crashing faults and help prioritize crash localization efforts. Experimental re- Predictive model sults on crashes of seven real-world projects demonstrate that CraTer can achieve an average accuracy of Crashing fault residence over 92%. © 2018 Published by Elsevier Inc. 1. Introduction to 41% of crashing faults are outside stack traces. In general, lo- calizing crashing faults that reside outside of stack traces requires Software faults can hide everywhere in source code and could more effort since developers need to examine the function call cause software crashes. Once a crash happens, a stack trace of the graph and to check a larger amount of source code. For example, crash is logged to record the status of program execution. To fix Gong et al. (2014) studied crashes of Firefox 3.6. They found that the crash-causing fault ( crashing fault or fault for short), developers by expanding the call depth by 1 (i.e., checking all the functions need to locate the root cause of the crash in the source code. Such that are directly called by any functions in the stack trace), devel- localization is known as crash localization ( Wu et al., 2014 ). opers have to examine 624 more functions and only eight more Crash localization is challenging. A stack trace consists of a run- crashing faults can be discovered. By expanding the call depth by time exception and a function call sequence. Crash localization 2 (i.e., checking all the functions that are two call steps away from takes the stack trace and the source code as input and outputs the any function in the stack trace), 964 more functions have to be ex- location of the fault. Besides the stack trace, crash localization can amined and only five more crashing faults can be discovered. Al- leverage bug reports from bug tracking systems such as Bugzilla, though the number of discovered faults increases, the developers or consulting websites such as StackOverflow. However, the infor- have to spend much more effort in examining a large number of mation obtained from bug tracking systems or consulting websites functions outside the stack trace. Additionally, a crashing fault may can be incomplete or inaccurate; this makes it difficult to automate associate with a hidden or private function that does not appear in collection and validation. Hence, in this paper we only focus on the the API reference document, which increases the difficulty of crash stack traces and source code. In an empirical study of Mozilla crash localization. Hence, predicting whether a crashing fault resides in data, Wu et al. (2014) found that 59% to 67% of the crashing faults a stack trace or not can assist developers to speed-up crash local- can be found in functions that are inside stack traces while 33% ization and to prioritize debugging efforts ( Theisen et al., 2015 ). In this paper, we proposed an automatic approach, namely CraTer (short for Cra sh de Te cto r ), to address the problem of pre- ∗ Corresponding author. dicting crashing fault residence, which aims to predict whether a E-mail addresses: [email protected] (Y. Gu), [email protected] (J. Xuan), [email protected] (H. Zhang), [email protected] (L. Zhang), crashing fault resides in a stack trace. This problem is modeled as [email protected] (Q. Fan), [email protected] (X. Xie), [email protected] (T. Qian). a binary classification problem with two class labels: InTrace and https://doi.org/10.1016/j.jss.2018.11.004 0164-1212/© 2018 Published by Elsevier Inc. Y. Gu, J. Xuan and H. Zhang et al. / The Journal of Systems and Software 148 (2019) 88–104 89 OutTrace . That is, a crashing fault resides in or out of the state- tual Machine (JVM) pushes a function into the stack if a function ments that are recorded in a stack trace. For each crash, CraTer is called by the main program. 1 Once a crash appears, JVM aborts extracts 89 features from the faulty program as well as its stack the program execution and outputs the function calls stored in the trace to characterize the residence of the crashing fault in the stack stack based on their call sequences. Modern software projects col- trace. Examples of the features include the type of the exception in lect software crashes to facilitate program debugging and bug fix- the stack trace and the number of files included in the source code. ing. Many projects deploy a bug tracking system (such as Bugzilla 2 Each crash corresponds to a vector of 89 feature values. CraTer con- and Jira 3 ) to enable users to submit a bug report to record the sists of two major phases: the training phase and the deployment crash of projects. Large-scale crash reporting systems are also used phase. In the training phase, a predictive model is built by com- to automatically collect crash reports from end users. Examples of bining a decision tree classifier with a strategy of imbalanced data such systems include Microsoft Windows Error Reporting System processing. In the deployment phase, the trained model is used ( Dang et al., 2012 ), Mozilla Crash Reporter, 4 and Fedora Analysis to predict whether the crashing fault resides in the stack trace Framework. 5 Given a collected crash report, developers can repro- for a newly-submitted crash. The predicted results can assist the duce the crashing scenario and then fix the faulty code. The major manual crash localization work performed by developers. For in- part of a collected crash report is a stack trace, which consists of a stance, consider a stack trace with 10 lines. Analyzing all frames runtime exception and a function call sequence at the moment of and method calls that are listed in the stack trace requires the re- the crash. view of many lines of code. If our approach CraTer can identify the Fig. 1 shows a real-world crashing fault, Bug 718 in a prediction result of InTrace , i.e., the crashing fault is predicted in- widely-used open-source library, Apache Commons Math. 6 The side the stack trace, then we can focus on the review of the 10 exception type of this crash is ConvergenceException lines of code recorded in the stack trace. This can save the time and is directly thrown from the function evaluate() in a cost and the human labor. class ContinuedFraction . According to the bottom of the We evaluated our approach CraTer on seven real-world, open- stack trace, all functions in Fig. 1 are called by executing source Java projects: Apache Commons Codec, Ormlite-Core, JSql- a function inverseCumulativeProbability() in a class Parser, Apache Commons Collections, Apache Commons IO, Jsoup, AbstractIntegerDistribution . and Mango. We seeded faults using program mutation techniques A typical stack trace can be viewed as a list of n + 1 frames, to mimic real crashes and randomly sample 500 crashes for ten from Frame 0 to Frame n ( Wu et al., 2014; Chen and Kim, 2015 ). times. The overall accuracy on all the crashes of seven projects Frame 0 records the thrown exception of the crash; Frames 1 to reaches 92.7%. Experiments on each individual project show that n represent the function call sequence. Each frame in the function our approach can correctly predict the residence of crashing faults call sequence is a tuple of a class name, a function name, and a with the accuracy ranging from 86.0% to 95.7%. The F-measure of line ID. These records directly indicate the position of a function InTrace and OutTrace for individual projects are from 65.0% to call. For instance, the caller of the function regularizedBeta() 87.9% and from 90.8% to 97.9%%, respectively. In our experiments, at Frame 2 is regularizedBeta() at Frame 3; the callee of we also analyzed the most dominant features among the 89 fea- regularizedBeta() at Frame 2 is evaluate() at Frame 1. tures, which have the strongest correlation with the residence of According to a manually-written patch of Bug 718, the crashing faults.