Empir Software Eng https://doi.org/10.1007/s10664-017-9567-4

ChangeLocator: locate crash-inducing changes based on crash reports

Rongxin Wu1 · Ming Wen1 · Shing-Chi Cheung1 · Hongyu Zhang2

© Springer Science+Business Media, LLC 2017

Abstract Software crashes are severe manifestations of software bugs. Debugging crashing bugs is tedious and time-consuming. Understanding the software changes that induce a crashing bug can provide useful contextual information for bug fixing and is highly demanded by developers. Locating bug inducing changes is also useful for automatic program repair, since it narrows down the root causes and reduces the search space of bug-fix locations. However, there are currently no systematic studies on locating the changes to a source code repository that induce a crashing bug reflected by a bucket of crash reports. To tackle this problem, we first conducted an empirical study on characterizing the bug inducing changes for crashing bugs (denoted as crash-inducing changes). We also propose ChangeLocator, a method to automatically locate crash-inducing changes for a given bucket of crash reports. We base our approach on a learning model that uses features derived from our empirical study and train the model using data from historically fixed crashes. We evaluated ChangeLocator on six release versions of the NetBeans project. The results show that it can locate the crash-inducing changes for 44.7%, 68.5%, and 74.5% of the bugs by examining only the top 1, 5, and 10 changes in the recommended list, respectively. It significantly outperforms the existing state-of-the-art approach.

Communicated by: Martin Monperrus and Westley Weimer

 Rongxin Wu [email protected] Ming Wen [email protected] Shing-Chi Cheung [email protected] Hongyu Zhang [email protected]

1 Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China

2 School of Electrical Engineering and Computing, The University of Newcastle, Newcastle, Australia

Keywords Crash-inducing change · Software crash · Crash stack · Bug localization

1 Introduction

Software crashes are usually considered severe bugs. They cause unintended program behaviour, bad user experiences, or even disasters in safety-critical systems. Various crash reporting systems, such as Windows Error Reporting (Glerum et al. 2009), Apple CrashReporter (Technical note tn2123: Crashreporter 2015), Mozilla Crash Reports (Mozilla crash reports 2015) and NetBeans Exception Reports (Netbeans exception reports 2015), have been developed and deployed. These systems (Technical note tn2123: Crashreporter 2015; Glerum et al. 2009; Mozilla crash reports 2015; Netbeans exception reports 2015) automatically collect crash reports from end users and group similar crash reports into buckets, which have proved useful for prioritizing debugging effort (Glerum et al. 2009). However, automatic debugging of crashing bugs is not supported by existing crash reporting systems.

Over the years, various bug localization techniques have been proposed to help developers debug (e.g., Abreu et al. 2007; Jones et al. 2002; Liblit et al. 2003; Saha et al. 2013; Ye et al. 2014; Zhou et al. 2012; Wong et al. 2014). These techniques rank suspicious statements or modules based on execution traces or bug reports. Although they can be useful in some situations, their effectiveness is largely unsatisfactory in practice and they are rarely used by developers (Parnin and Orso 2011). One reason is that locating buggy statements or modules in isolation may not provide adequate information for developers to understand the bugs (Parnin and Orso 2011). Our previous study (Wen et al. 2016) found that, in order to understand and fix a bug, developers often refer to the bug inducing changes (i.e., the commits that initially introduce the bug) in the discussion of bug reports. In our investigation, we found that 1,851 bug reports (Bug report list 2015) about crashes in Mozilla include developers' discussion of bug inducing changes made to the repository. Developers extensively discussed the "regression range" (i.e., the range from the last known good revision to the first known bad revision) (Regression range 2015) in bug reports. We sampled some developers' discussions that show their desire to learn about bug inducing changes:

"given the regression range (for which I'm most grateful) ..."1 "Finding a regression range to find the change that busted it could be useful."2 "Alternatively, you could check if you can reproduce on some of the older (archived) UX builds and/or otherwise get an idea of the regression range."3 "And to be useful, the range (of course) needs to be narrow ... In my experience, given that narrow a regression range, it's often possible to guess which patch triggered a

1 https://bugzilla.mozilla.org/show_bug.cgi?id=448608
2 https://bugzilla.mozilla.org/show_bug.cgi?id=589191
3 https://bugzilla.mozilla.org/show_bug.cgi?id=941044

problem. Then I could do a test build with that patch backed out, and ask you guys to test it."4 "... help in narrowing down the regression range and finding the right product/component is appreciated."5

Based on our investigation of developers' discussions and the source code repository, we summarize four major benefits that developers can derive from understanding bug inducing changes. First, since a bug inducing change records the developer who committed it, it facilitates triaging the bug to the developer who is familiar with the corresponding code. As shown in Fig. 1, the bug was fixed by the developer Vladimir Kvashin, who was also the committer of the bug inducing change. Our previous study (Wen et al. 2016) confirmed that 77.86% of bugs were fixed by the committer of the bug inducing changes. Second, the modified code in the bug inducing changes provides contextual clues that can help explain and fix the bugs. Take bug #244574 as an example. This bug is a NullPointerException (Fig. 1a), and it crashes at Line 317 as shown in Fig. 1b. The crash is due to the null value of p.second(). From the bug inducing change, we can see why p.second() is null. At Line 449, the return value is an object of class Pair whose field second is assigned the null value. At Line 425, the object cur can be assigned an object of class Pair returned from Line 449 (dotted line 1). The variable cur can then be assigned to another object result in the method getEnv at Line 428 (dotted line 2), and result is then returned (dotted line 3). At Line 315, the object p is assigned the return value of the method getEnv and then used at Line 317 (dotted line 4). By examining the bug inducing code, we can understand how a null value is propagated to the variable used at the crash point. Moreover, in Fig. 1c, the bug-fixing code at Lines 317 and 319 is semantically equivalent to the code deleted before Line 315 in the bug inducing change, which indicates that developers can get hints on how to write the bug-fixing patch from the bug inducing change. Third, reverting bug inducing changes is one of the ways developers resolve bugs. In our investigation of the NetBeans project, more than 1,069 bug inducing changes were reverted to fix bugs in the source code repository. Furthermore, the ability to locate bug inducing changes is also beneficial for automatic program repair techniques (e.g., Le Goues et al. 2012; Arcuri and Yao 2008; Weimer et al. 2010), since it narrows down the root causes and reduces the search space of bug-fix locations.

Although bug inducing changes are important to program debugging in practice, there are no systematic studies on how to locate these changes for bugs discovered in the field. One relevant study is delta debugging (Zeller 1999), which intensively runs test cases on different combinations of changes to locate the bug introducing changes. Similar to spectrum-based bug localization techniques, it assumes the availability of failing test cases, which generally does not hold for bugs that occur in the field. Other relevant studies (Kamei et al. 2013; Kim et al. 2008; An and Khomh 2015; An et al. 2017) predict whether a committed change is buggy. However, these techniques cannot distinguish which bug a buggy-prone change should be blamed for, and developers may not know what to do without actionable information.

4 https://bugzilla.mozilla.org/show_bug.cgi?id=400291
5 https://bugzilla.mozilla.org/show_bug.cgi?id=446630

(a) Bug #244574
Bug: #244574   User: Exception Reporter   Reported: May 18, 2014 18:16 UTC
Description: NullPointerException at org.netbeans.modules.cnd.makeproject.ui.RemoteSyncActions$BaseAction.enable

(b) Bug Inducing Change for Bug #244574
changeset: e10374c2df360   user: Vladimir Kvashin   date: Fri Feb 28 14:56:50 2014
summary: additional fix for #242416 Hide "upload to" action in the case remote host uses "System-level-file-sharing"

      -  ExecutionEnvironment execEnv = getEnv(activatedNodes);
      -  if (execEnv != null && execEnv.isRemote()) {
      -      enabled = ServerList.get(execEnv).getSyncFactory().isCopying();
  315 +  Pair p = getEnv(activatedNodes);
  316 +  if (p != null && p.first() != null && p.first().isRemote()) {
  317 +      enabled = p.second().isCopying();          // crashing line
  318      } else {
  319          enabled = false;
  320      }
  ...
  421 +  private static Pair getEnv(Node[] activatedNodes) {
  422 +      Pair result = null;
  ...
  425 +      Pair curr = getEnv(project);
  426 +      if (curr != null) {
  427          if (result == null) {
      -             result = env;
  428 +            result = curr;
  429          }
  ...
  437      return result;
  438  }
  439 +  private static Pair getEnv(Project project) {
  ...
  449 +      return Pair.of(ServerList.getDefaultRecord().getExecutionEnvironment(), null);
  450  }

(c) Bug-Fixing Change for Bug #244574
changeset: 65f38533c2ea   user: Vladimir Kvashin   date: Tue May 20 17:42:33 2014
summary: fixed #244574 - NullPointerException at org.netbeans.modules.cnd.makeproject.ui.RemoteSyncActions$BaseAction.enable

  315      Pair p = getEnv(activatedNodes);
  316      if (p != null && p.first() != null && p.first().isRemote()) {
      -       enabled = p.second().isCopying();
  317 +       RemoteSyncFactory sync = p.second();
  318 +       if (sync == null) {
  319 +           sync = ServerList.get(p.first()).getSyncFactory();
  320 +       }
  321 +       enabled = (sync == null) ? false : sync.isCopying();
  322      } else {
  323          enabled = false;
  324      }

(d) Class Pair
public final class Pair<First, Second> {
    private First first;
    private Second second;
    private Pair(First first, Second second) { this.first = first; this.second = second; }
    public First first() { return first; }
    public Second second() { return second; }
    public static <First, Second> Pair<First, Second> of(First first, Second second) {
        return new Pair<>(first, second);
    }
}

Fig. 1 An example of bug inducing and fixing change in NetBeans

In this paper, we target locating the bug inducing changes for a specific type of bugs, namely crashing bugs, given only buckets of crash reports. We denote the target changes as crash-inducing changes. We select crash-inducing changes at the commit level as the target, because commits are the units frequently referred to by developers in bug reports. To understand crash-inducing changes, we conduct an empirical study on their characteristics using the popular and widely used open source project NetBeans. Based on the results, we propose ChangeLocator, a novel technique for locating crash-inducing changes given only buckets of crash reports. Our approach is based on a learning model that uses features derived from our empirical study and is trained on historically fixed buckets. To evaluate ChangeLocator, we conduct an experimental study using more data from the NetBeans project. The evaluation results are promising: using ChangeLocator, we can locate the crash-inducing changes for 44.7%, 68.5%, and 74.5% of crash buckets by examining only the top 1, 5, and 10 changes in the ranked list. Moreover, ChangeLocator significantly outperforms the state-of-the-art IR-based bug localization approach Locus (Wen et al. 2016).

In summary, the main contributions of this paper are as follows:

– We conducted an empirical study on characterizing crash-inducing changes, and proposed a novel technique to locate crash-inducing changes based on buckets of crash reports. To the best of our knowledge, this is the first study on locating crash-inducing changes based on buckets of crash reports.
– We implemented our technique as the tool ChangeLocator and evaluated it using six release versions of NetBeans, a popular, large-scale, open source system.

The remainder of this paper is organized as follows. Section 2 briefly introduces the background information. Section 3 presents our empirical study on crash-inducing changes. Based on the results of the study, we propose our technique in Section 4. Section 5 presents our experimental design and Section 6 shows our evaluation results. We discuss issues involved in our approach in Section 7 and the threats to validity in Section 8. We survey the related work in Section 9. Finally, Section 10 concludes this paper.

2 Background

2.1 Crash Reporting System

Software crashes are among the most severe manifestations of software bugs. Due to their severity, software crashes are often fixed with high priority. Since crash information from the field is useful for debugging, many crash reporting systems such as Windows Error Reporting (Glerum et al. 2009), Apple CrashReporter (Technical note tn2123: Crashreporter 2015), Mozilla Crash Reports (Mozilla crash reports 2015) and NetBeans Exception Reports (Netbeans exception reports 2015) have been developed and deployed. Once a crash happens at a deployment site, a crash report capturing crash related information (including crash stack, build id, component name, version, operating system and so on) is generated and sent to the crash reporting system. Among all the crash related information, one of the most important items is the crash stack. Figure 2 gives an example of a crash stack trace in NetBeans (Exception ID: 2793). A crash stack trace is composed of several frames, with the most recently executed frame at the top and the least recently executed at the bottom. Each frame contains a fully qualified method name and a source file position. To facilitate the explanation below, we refer to the line in each frame as the crashing line, the crashing line at the top frame as the crashing point, and the method in each frame as the crashing method.

java.lang.IndexOutOfBoundsException
Frame 0   org.netbeans.lib.editor.util.GapList.addArray(GapList.java:576)
Frame 1   org.netbeans.lib.editor.util.GapList.addArray(GapList.java:561)
Frame 2   org.netbeans.lib.editor.util.GapList.addAll(GapList.java:550)
Frame 3   org.netbeans.lib.lexer.TokenListList.replace(TokenListList.java:232)
...
Frame 42  java.awt.EventDispatchThread.run(EventDispatchThread.java:110)

Fig. 2 An example of a crash stack

In large-scale and widely-used systems such as Windows, Mozilla and NetBeans, crash reporting systems receive a large number of crash reports every day. For example, Microsoft's Windows Error Reporting system collected billions of crash reports during its first ten years of operation. Many crash reports are duplicates, since they are caused by the same bug. To prioritize debugging effort, crash reporting systems automatically group duplicate crash reports into buckets, based on crash signatures (Mozilla crash reports 2015) or similar crash stacks (Dang et al. 2012). A bug report is then generated for a crash bucket when the bucket is serious enough to catch the developers' attention (e.g., when the crash occurrence for a bucket exceeds a threshold (Dang et al. 2012; Netbeans report exception faqs 2015)). Ideally, each bucket should correspond to a unique crashing bug.
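To make the bucketing step concrete, the sketch below groups crash reports by a simple signature built from the top frames of their stacks. This is only an illustration under our own assumptions; production systems use more elaborate crash signatures or stack-similarity measures (Dang et al. 2012), and the record fields (`stack`, `id`) are hypothetical.

```python
from collections import defaultdict

def signature(stack, depth=3):
    """Build a simple crash signature from the top `depth` frames.

    `stack` is a list of frame strings such as
    "org.netbeans.lib.editor.util.GapList.addArray(GapList.java:576)".
    Only the method part is kept, so line-number differences across
    builds do not split a bucket.
    """
    methods = [frame.split("(")[0] for frame in stack[:depth]]
    return " | ".join(methods)

def bucket(crash_reports):
    """Group crash reports (dicts with a 'stack' field) into buckets."""
    buckets = defaultdict(list)
    for report in crash_reports:
        buckets[signature(report["stack"])].append(report)
    return buckets

# Example: two reports with the same top frames fall into one bucket.
reports = [
    {"id": 1, "stack": ["a.B.foo(B.java:10)", "a.C.bar(C.java:20)"]},
    {"id": 2, "stack": ["a.B.foo(B.java:12)", "a.C.bar(C.java:21)"]},
]
print({sig: [r["id"] for r in rs] for sig, rs in bucket(reports).items()})
```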

2.2 Collecting Bug Inducing Changes Based on Bug-Fixing Changes

As bug inducing changes are widely used in change-based software defect prediction (Kamei et al. 2013; Kim et al. 2008, 2011), several techniques (Kim et al. 2006; Śliwerski et al. 2005) have been proposed to identify bug inducing changes based on bug-fixing changes. Śliwerski et al. proposed the SZZ algorithm (Śliwerski et al. 2005). The algorithm works in three steps. First, it finds bug-fixing changes through the links between bug reports in the bug tracking system (e.g., BUGZILLA) and committed changes in the source code repository (e.g., CVS, Git, or Mercurial). Then, it identifies the source code modified by the bug-fixing changes. Finally, it traces back to the revisions where the modified source code was introduced, and filters out the revisions that cannot have induced the bug. Kim et al. (2006) improved the SZZ algorithm by removing non-semantic source code changes and outlier fixes.

Note that the existing techniques for collecting bug inducing changes assume that the bugs have been fixed and the bug-fixing changes are available. In contrast, our paper aims at locating inducing changes before the bug is fixed, given only buckets of crash reports. In this paper, since both our empirical study and our experimental evaluation require bug inducing changes, we also leveraged the improved SZZ algorithm (Kim et al. 2006) to collect data.
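The following sketch illustrates the flavor of the SZZ tracing step on a Mercurial repository. It is our own simplified illustration, not the authors' or Kim et al.'s implementation: it diffs a given bug-fixing changeset, collects the old line numbers it deleted or modified, and blames the fix's parent revision for those lines. The filtering of non-semantic changes and outlier fixes is omitted.

```python
import re
import subprocess

def hg(*args):
    """Run a Mercurial command in the current repository and return its stdout."""
    return subprocess.run(["hg", *args], capture_output=True, text=True, check=True).stdout

def deleted_lines(fix_rev):
    """Map each file touched by `fix_rev` to the old line numbers it deleted or modified."""
    result, path, old_line = {}, None, 0
    for line in hg("diff", "-c", fix_rev).splitlines():
        if line.startswith("--- a/"):
            path = line[6:].split("\t")[0]
            result.setdefault(path, [])
        elif line.startswith("@@"):
            old_line = int(re.match(r"@@ -(\d+)", line).group(1))
        elif line.startswith("-") and not line.startswith("---"):
            result[path].append(old_line)   # a removed or modified old line
            old_line += 1
        elif not line.startswith("+") and not line.startswith("diff"):
            old_line += 1                    # context line of the old file
    return result

def inducing_candidates(fix_rev):
    """SZZ-style: blame the fix's parent revision for the lines removed by the fix."""
    parent = hg("log", "-r", f"p1({fix_rev})", "--template", "{node|short}").strip()
    candidates = set()
    for path, lines in deleted_lines(fix_rev).items():
        annotated = hg("annotate", "--changeset", "-r", parent, path).splitlines()
        for n in lines:
            if 0 < n <= len(annotated):
                candidates.add(annotated[n - 1].split(":")[0].strip())
    return candidates
```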

2.3 Predicting Bug Inducing Changes

To help avoid introducing bugs when a change is committed, some researchers propose "Just-In-Time Quality Assurance". This line of studies builds models that predict whether a change is likely to be bug inducing (Kamei et al. 2013; Kim et al. 2008, 2011) or crash-inducing (An and Khomh 2015; An et al. 2017) before the change is integrated into the code repository. These techniques are useful for prioritizing the limited software quality assurance resources towards buggy-prone changes. Although they can predict changes as bug inducing or crash-inducing, they cannot distinguish which bug an inducing change should be blamed for. In other words, these techniques are not applicable to the problem targeted in this paper, which is to locate the crash-inducing changes given a bucket of crash reports.

2.4 Challenges

For large-scale projects, developers may commit a large number of changes, so locating bug inducing changes is non-trivial. We investigated three releases of the NetBeans project. As shown in Table 1, each release includes 110K to 176K revisions. One possible way to reduce the number of candidate changes is to extract only the revisions from which the lines of source code in the current release were introduced into the source code repository. The intuition behind this method is that most bug inducing changes insert some pieces of code into the current release, which allows us to trace the inducing changes. Although it is possible that a bug inducing change only deletes some pieces of code and thereby causes a bug, we found that this case is very rare: in our study, none of the crashing bugs was caused by a crash-inducing change that only deletes code. The Mercurial annotate command allows us to check the change from which each line of a given source file was introduced. In this way, we can obtain a smaller set of candidate bug inducing changes from the annotate command. However, the number of candidate changes obtained by this method is still large, ranging from 47K to 62K, as shown in Table 1.
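For example, the candidate revisions touching one source file can be collected by parsing the output of the annotate command, as in the sketch below; the path and revision in the usage comment are hypothetical placeholders, and unioning over all source files of a release yields the reduced pool reported in the last column of Table 1.

```python
import subprocess

def candidate_revisions(path, rev):
    """Changesets that introduced the lines of `path` as they exist at revision `rev`."""
    out = subprocess.run(
        ["hg", "annotate", "--changeset", "-r", rev, path],
        capture_output=True, text=True, check=True,
    ).stdout
    return {line.split(":", 1)[0].strip() for line in out.splitlines() if ":" in line}

# Hypothetical usage:
# pool = set().union(*(candidate_revisions(f, "release_rev") for f in source_files))
```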

3 Empirical Study

3.1 Setup of Empirical Study

Although crash-inducing changes provide developers with useful contextual information for debugging, to the best of our knowledge there are no systematic studies on characterizing crash-inducing changes. In this paper, we aim to fill this gap by conducting an empirical study on crash-inducing changes. We try to answer the following two questions:

– Question 1: Can we narrow down the candidate set of crash-inducing changes based on crash reports?

Crash-inducing changes reside in the code repository. Since a code repository contains a large number of software changes, exploring crash-inducing changes directly from the whole repository is non-trivial. Thus, we propose the first research question to find out whether we can narrow down the search scope of crash-inducing changes by leveraging crash reports. Crash reports provide abundant crash relevant information (e.g., the crash stack).

Table 1 Number of potential crash-inducing changes

Subject        Release date    #Revisions committed   #Revisions from Mercurial
                               before release         annotate command
Netbeans 6.5   Nov. 20, 2008   110,887                47,537
Netbeans 6.7   Jun. 29, 2009   139,189                56,017
Netbeans 6.8   Jun. 15, 2010   158,936                62,417

Prior studies (Schroter et al. 2010; Wu et al. 2014) have shown that crash stacks are useful in finding faulty modules. Inspired by these studies, we propose to leverage crash reports to explore candidate crash-inducing changes.

– Question 2: What are the characteristics of crash-inducing changes?

The second research question aims to facilitate the understanding of crash-inducing changes. The characteristics of crash-inducing changes are useful in building statistical models for locating them. Based on the results, we propose an automatic technique to locate crash-inducing changes in Section 4.

3.2 Data Collection

We chose NetBeans as the subject of our empirical study for three reasons. First, it is a popular and active open source project. Second, it maintains public tracking systems for both bug reports (Netbeans bugzilla 2015) and crash reports (Netbeans exception reports 2015). Third, the project is well maintained: the source code repository is hosted online (Netbeans source code repository 2015), and developers maintain high-quality change logs. In particular, 82.5% (9,211/11,170) of bug reports have links to bug-fixing changes, which allows us to identify the crash-inducing changes for crashing bugs.

To identify the crash-inducing changes for crash buckets, we mined the crash reports, bug reports, and software changes as follows:

1. We collected crash reports and their bucket information from NetBeans Exception Reports (Netbeans exception reports 2015), and then identified the crash buckets that have been assigned to a bug whose status is FIXED. Note that in NetBeans each crash bucket is assigned to only one bug, so each of our collected crash buckets has an explicit link to a fixed bug.
2. For each bucket extracted in Step 1, we leverage the ID of its linked bug, mine the change logs in the code repository as described in our previous work (Wu et al. 2011), and identify the crash-fixing changes for the linked bug.
3. Having obtained the bug-fixing changes, we leveraged the improved SZZ algorithm (Kim et al. 2006) to identify the bug inducing changes. Finally, we obtained the crash-inducing changes for each crash bucket.

We use the extracted crash buckets and their corresponding crash-inducing changes in the following empirical study, which enables us to answer the two research questions.
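As a rough illustration of Step 2 (not the exact heuristics of Wu et al. 2011), a bug-fixing changeset can be found by scanning commit messages for references to the bug id; the regular expression below is a simplified example of such a pattern.

```python
import re
import subprocess

def fixing_changesets(bug_id):
    """Return changesets whose commit message refers to `bug_id` (e.g., 'fixed #244574')."""
    log = subprocess.run(
        ["hg", "log", "--template", "{node|short}\t{desc|firstline}\n"],
        capture_output=True, text=True, check=True,
    ).stdout
    pattern = re.compile(r"(?:bug|issue|fix(?:ed|es)?)\s*#?\s*%d\b" % bug_id, re.IGNORECASE)
    return [rev for rev, desc in (l.split("\t", 1) for l in log.splitlines() if "\t" in l)
            if pattern.search(desc)]

# e.g. fixing_changesets(244574) should include 65f38533c2ea in the NetBeans repository
# (the fixing change shown in Fig. 1c).
```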

3.3 Data Validation

Since the quality of the collected data is important for drawing valid conclusions, we conducted manual inspections of the collected data, covering the subjects NetBeans 6.5, 6.7 and 6.8 shown in Table 1. Two steps may affect the data quality. One is the step that builds the links between the crashing bugs and the bug-fixing changes (Step 2 in Section 3.2). The other is the step that identifies the crash-inducing changes from the bug-fixing changes (Step 3 in Section 3.2). To guarantee the quality of the links between crashing bugs and bug-fixing changes, we manually inspected the link data. First, we verified whether the ID of the linked bug appears in the log message of the linked changes. Second, we verified whether the number appearing in the log message indeed represents a bug ID, by searching for keywords such as

"issue #", "bug #", and so on. Only if both conditions are satisfied do we label a link as a true link. To guarantee the quality of the collected crash-inducing changes, we also conducted a manual inspection, adopting an examination approach similar to the previous study (Kim et al. 2006). Two graduate students performed the verification independently. One manually marked each collected crash-inducing change as true or false; the other reviewed the marks. Only if the two students reached a consensus that a change is a true crash-inducing change did we use it in our empirical study. In total, 58.6% (242/417) of the candidate crash-inducing changes were considered noise and filtered out of our empirical study.

3.4 Results of Empirical Study

Observation 1 We can narrow down the candidate set of crash-inducing changes based on crash reports.

Crash reports provide crash relevant information, including the crash stack and the revision number at which the crash happened (we call it the crashing revision for short). Take the crash report in Fig. 3 as an example: it contains an 8-frame crash stack, and the crashing revision is c39b9046b510. The Mercurial built-in command annotate allows us to check the revision from which each line of a given source file was introduced into the source code repository. For example, Fig. 3 shows the annotated information of the source file B.java in revision c39b9046b510. Based on the annotated information for each frame in the crash stack, we explore three forms of crash-inducing change candidates, as follows:

– Form 1: Candidates associated with the methods in a crash stack. This candidate set contains all the revisions in which the lines of the crashing method of each frame were introduced. For instance, for the second frame in Fig. 3, the crashing method is foo(), so the annotated revisions of the whole of foo(), from Line 100 to Line 112, are included.
– Form 2: The subset of candidates from Form 1 that excludes irrelevant revisions using control flow analysis. As some statements in crashing methods cannot reach the crashing line, the revisions in which these statements were introduced are unlikely to be crash-inducing changes. Therefore, we follow our previous work CrashLocator (Wu et al. 2014) and use control flow analysis to reduce the candidate inducing changes from Form 1. For example, Fig. 3b shows the control flow analysis for the crashing method foo(). Through the analysis, we extract the reachable statements, which are Lines 100 to 105. We thus obtain a smaller candidate set than Form 1, including only 5 changes in the crashing method foo().
– Form 3: The subset of candidates from Form 1 that excludes irrelevant revisions using backward slicing. As some statements have no impact on the crash-related variables (i.e., the variables used in the statements at the crashing lines), the revisions in which these statements were introduced are unlikely to be crash-inducing changes. Therefore, we follow our previous work CrashLocator (Wu et al. 2014) and use backward slicing (Venkatesh 1991) to reduce the candidate inducing changes. For example, in Fig. 3b, the variables bar and a are crash-related. By backward slicing, we exclude Lines 101 and 104, since they do not affect the crash-related variables. Through this analysis, we include only 2 changes in the crashing method foo(). It should be noted that backward slicing also utilizes control flow analysis to explore only the reachable statements, so we obtain a smaller candidate set than Form 2. In other words, the candidate set from Form 2 subsumes the candidate set from Form 3. (A small sketch after this list illustrates how the three forms relate.)
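The sketch below shows how the three candidate forms relate, assuming the per-line introducing revisions (from annotate), the reachable lines (from control flow analysis), and the sliced lines are already available; the function and its inputs are our own illustration, and the example values are taken from the foo() method of Fig. 3.

```python
def candidates(line_to_rev, method_lines, reachable_lines, slice_lines):
    """Compute the three candidate forms for one crashing method.

    line_to_rev:      {line number -> revision that introduced it} (from annotate)
    method_lines:     all line numbers of the crashing method        -> Form 1
    reachable_lines:  lines reachable to the crashing line (CFG)     -> Form 2
    slice_lines:      lines in the backward slice of the crash line  -> Form 3
    """
    form1 = {line_to_rev[l] for l in method_lines if l in line_to_rev}
    form2 = {line_to_rev[l] for l in reachable_lines if l in line_to_rev}
    form3 = {line_to_rev[l] for l in slice_lines if l in line_to_rev}
    return form1, form2, form3

# The Fig. 3 example: lines 100-112 of foo(), reachable lines 100-105,
# sliced lines {102, 103, 105}.
line_to_rev = {100: "f2c5bd530146", 101: "bfda74157943", 102: "0f0b92f3cee5",
               103: "1b8a3e40ee3b", 104: "bb5436a892c1", 105: "0f0b92f3cee5",
               106: "bfda74157943", 107: "d745f9e69376", 108: "3e22d9c7b912",
               109: "bfda74157943", 110: "068793f65ef1", 111: "f2c5bd530146",
               112: "f2c5bd530146"}
f1, f2, f3 = candidates(line_to_rev, range(100, 113), range(100, 106), {102, 103, 105})
print(len(f1), len(f2), len(f3))   # 8, 5, 2 distinct revisions
```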

(a) Crash report and annotated source file
Crash report: crashing revision c39b9046b510; crash stack:
  Frame 0  A.java:10
  Frame 1  B.java:105
  ...
  Frame 8  C.java:121

Annotated source of B.java at revision c39b9046b510 (introducing revision per line):
  100 f2c5bd530146: public boolean foo() {
  101 bfda74157943:     String msg = "foo";
  102 0f0b92f3cee5:     A a = getA();
  103 1b8a3e40ee3b:     if (a != null) {
  104 bb5436a892c1:         msg = "bar";
  105 0f0b92f3cee5:         Bar bar = a.getBar();
  106 bfda74157943:         Log.i(msg);
  107 d745f9e69376:         return true;
  108 3e22d9c7b912:     } else {
  109 bfda74157943:         Log.i(msg);
  110 068793f65ef1:         return false;
  111 f2c5bd530146:     }
  112 f2c5bd530146: }

(b) Control flow graph of foo() and backward slicing at Line 105
The control flow graph shows that Lines 100 to 105 are reachable to the crashing line (Line 105). Backward slicing from Line 105 over the crash-related variables {bar, a} excludes Lines 101 and 104, which only affect msg.
Candidate changes from Form 2 (Lines 100-105): f2c5bd530146, bfda74157943, 0f0b92f3cee5, 1b8a3e40ee3b, bb5436a892c1
Candidate changes from Form 3 (Lines 102, 103, 105): 0f0b92f3cee5, 1b8a3e40ee3b

Fig. 3 Examples of extracting candidates of crash-inducing changes from crash stack

For each crash report, we can extract the candidate changes based on the above three forms. One bucket may contain multiple crash reports. Thus, for each crash bucket, we merge the candidate changes from each of its crash reports and use the merged set as the candidate changes for this bucket. Then, we verify whether the crash-inducing changes of a given bucket are contained in the three candidate forms. We take the three releases of the NetBeans project shown in Table 1 as the subjects of our empirical study. The basic information of the subjects and the results are shown in Table 2. The three subjects include 155 buckets in total. The crash-inducing changes of 75.5%, 74.8% and 72.9% of the buckets can be found in the candidate sets of Form 1, 2, and 3, respectively. The candidates selected from these three forms differ in their sizes, as well as in their potential capability of locating bugs.

Table 2 Subjects and results of the empirical study

Subject        #Buckets   Form 1        Form 2        Form 3
Netbeans 6.5   42         25 (59.5%)    24 (57.1%)    23 (54.8%)
Netbeans 6.7   61         51 (83.6%)    51 (83.6%)    49 (80.3%)
Netbeans 6.8   52         41 (78.8%)    41 (78.8%)    41 (78.8%)
Total          155        117 (75.5%)   116 (74.8%)   113 (72.9%)

(The column Form i (i = 1, 2, 3) gives the number/percentage of crash buckets whose inducing changes are in the candidate set of Form i.)

Figure 4 shows the number of candidates of these three forms. Candidates from Form 1 include the largest number of changes, on average 48.2 changes per bucket. Candidates from Form 2 include comparably fewer changes, on average 36.2 changes per bucket. Candidates from Form 3 include the fewest changes, on average 24.5 changes per bucket. The three candidate forms of inducing changes differ only slightly in the percentage of buckets that can be located, but they differ significantly in their size.

Observation 2 Crash-inducing changes exhibit certain characteristics. Through the empirical study, we make the following observations about the characteristics of crash-inducing changes.

Observation 2.1: Crash-inducing changes appear closer to the crash point. A previous study showed that faulty functions are closer to the crash point than non-faulty functions in a crash stack (Dang et al. 2012). This motivates us to investigate whether crash-inducing changes have a similar characteristic. We define the distance to the crashing point for a change r as the offset from the topmost frame in which we find r to the frame containing the crashing point. Figure 5a shows the distribution of these distances for the crash-inducing changes of the 155 buckets. The crash-inducing changes of 31.0% of the buckets have a distance of 0 to the crashing point, 67.7% are within a distance of 5, and 72.3% are within a distance of 10. The results show that crash-inducing changes appear closer to the crash point.

Observation 2.2: Crash-inducing changes appear closer to the crashing lines. In our study, we found that, within a crashing method, the lines closer to the crashing line are more likely to be buggy, so the changes that introduced these lines are more likely to be crash-inducing changes. We define the distance to the crashing line for a change r as the minimum offset between the lines introduced by r and the crashing lines.

Fig. 4 Number of changes from three candidate forms

Fig. 5 The statistics of distances to the crashing point and the crashing line for crash-inducing changes

Figure 5b shows the distribution of distances to the crashing line for the crash-inducing changes of the 155 buckets. The crash-inducing changes of 72.9% of the buckets are within a distance of 5 to the crashing lines. The empirical results show that crash-inducing changes appear closer to the crashing lines.

Observation 2.3: Crash-inducing changes for a bucket appear frequently in the crash reports of that bucket, and changes that appear in multiple buckets are less likely to be crash-inducing changes. A bucket may contain multiple crash reports. Intuitively, the change which induced a crash bucket is likely to appear frequently in its crash reports. In our empirical study, we investigated how frequently the crash-inducing changes appear in the corresponding bucket. We found that, for 109 (70.3%) of the buckets, the corresponding crash-inducing changes appear in 100% of their crash reports. The results show that changes which induced a crash bucket appear frequently in the crash reports of that bucket. A change may also appear in multiple crash buckets. In this study, we found that, of the changes which appear in more than 10% of the buckets, 89.1% are not crash-inducing changes; of the changes which appear in more than 50% of the buckets, 100% are not crash-inducing changes. The results indicate that if a change appears in many different buckets, it is less likely to be a crash-inducing change. This observation is similar to the observation on buggy functions for crashing bugs in our previous study (Wu et al. 2014).

Observation 2.4: If a change affects the source files in the component where the bucket of crash reports happened, it is more likely to be a crash-inducing change for that bucket. NetBeans is composed of hundreds of components. Since an execution of NetBeans usually involves interactions between different components, the crash stacks in crash reports contain methods from different components. In the existing crash reporting system (Netbeans exception reports 2015), each bucket of crash reports is automatically assigned to a specific component (Netbeans report exception faqs 2015). This raises the question of whether the assigned component can help locate the crash-inducing changes. We therefore investigated whether the crash-inducing changes of a bucket affect (add/delete/modify) any source file in the component assigned to that bucket. We found that, for 94.7% of the buckets, the crash-inducing changes affect at least one source file in the assigned component. The results show that crash-inducing changes are more likely to affect source files in the component assigned to the bucket.

Observation 2.5: The committed time of crash-inducing changes is closer to the reporting time of the crashes. A prior study (Kim et al. 2007) showed that if an entity was changed recently, it tends to introduce faults soon. This motivates us to study whether a change is more likely to be crash-inducing if its committed time is closer to the crash reporting time. In our study, we found that, for 90.3% of the buckets, the inducing changes were committed within 12 months before the reporting time of the corresponding bucket. Furthermore, we ranked all the candidate changes (extracted from the crashing methods) in a bucket in reverse chronological order of their committed time, and found that for nearly 70% of the buckets, the crash-inducing changes are ranked within the top 20. These empirical results show that the committed time of crash-inducing changes is closer to the reporting time of the crashes.

Although our study is based only on the NetBeans project, we believe the observations can also be generalized to other projects, since they are consistent with previous studies on different projects such as Firefox and Eclipse (Wu et al. 2014; Schroter et al. 2010). For example, the observation that crash-inducing changes appear closer to the crash point is consistent with the previous finding (Wu et al. 2014; Schroter et al. 2010), obtained from the Firefox and Eclipse projects, that buggy functions/files appear closer to the crash point. We also believe that, for projects across different domains, languages and technologies, the characteristics of crash-inducing changes may vary. For example, the distance considered "close to" the crash point may vary across projects. This motivates us to use a supervised learning approach (see Section 4) that characterizes the crash-inducing changes of a project using the project's own historical data.

4 ChangeLocator

4.1 Overview of ChangeLocator

In this section, we describe the proposed approach, ChangeLocator. The goal of ChangeLocator is to locate the crash-inducing changes from the candidates found in the previous step. We need to assign a suspicious score to each candidate crash-inducing change; the candidate changes can then be ranked by their scores. Since the problem of ranking crash-inducing changes for a crash bucket is similar to the problem of ranking relevant documents for a query in information retrieval (IR), we propose to leverage IR techniques to locate the crash-inducing changes. A class of IR techniques (Nallapati 2004; Robertson and Jones 1976) transforms the information retrieval problem into a classification problem, i.e., classifying the entire collection of documents into two classes: relevant and irrelevant. It then leverages classification algorithms to estimate the probability that a document is relevant to a query, and ranks the documents by this probability. Similarly, ChangeLocator transforms the problem of locating crash-inducing changes into a classification problem, i.e., classifying a candidate change as crash-inducing or non-crash-inducing. Classification algorithms are then used to estimate the probability of a change being crash-inducing for a crash bucket. Finally, ChangeLocator ranks the candidate changes. As classification is a supervised learning technique that requires a training dataset, ChangeLocator builds the training dataset from the historical crash buckets that have been fixed.

Figure 6 gives the overall structure of ChangeLocator. First, ChangeLocator collects existing crash buckets that have been fixed and their candidate crash-inducing changes from crash reports (Section 3), and then identifies the crash-inducing changes (Section 2.2). The candidate changes can be derived from the three candidate forms described in Section 3.4. Since candidate Form 3 includes the fewest candidate changes and can locate nearly the same number of crash buckets as the other two forms, ChangeLocator by default extracts the candidate changes from Form 3. Second, for each candidate change in a crash bucket, if it is a crash-inducing change of the crash bucket, ChangeLocator labels it as TRUE (crash-inducing); otherwise it labels it as FALSE (non-crash-inducing). At the same time, we extract several features for each candidate change. Combining the features and label of a change in a bucket, we create an instance. With the instances of existing buckets, we create a training corpus. Then, ChangeLocator uses a classification algorithm (e.g., Logistic Regression) and the training corpus to build the learning model. Finally, when a new crash bucket comes in, the features of its candidate crash-inducing changes are computed. For each change, the learning model estimates the probability that its label is TRUE. We use this probability as the suspicious score of each change, and rank the candidate crash-inducing changes for that bucket.

4.2 Feature Engineering

Since ChangeLocator is based on classification, its performance relies on the features used in the learning model. In this paper, we select features based on the observations from our empirical study and on prior studies. The features are as follows.

Inverse Average Distance to Crashing Point (IADCP): Our empirical study showed that if a revision appears in a crash frame closer to the crash point, it is more likely to be the crash-inducing change. Therefore, we define the feature Inverse Average Distance to Crashing Point (IADCP), as follows.

[Figure 6 depicts the ChangeLocator workflow: for resolved crash buckets, candidate changes extracted from crash reports are labeled as crash-inducing (TRUE) or non-crash-inducing (FALSE) and turned into feature instances that form the training corpus; a learning model is trained on this corpus; for a new bucket, the model scores its candidate changes and produces a ranked list of suspicious changes.]

Fig. 6 Overall structure of ChangeLocator

IADCP measures how close a revision r is to the crash point of the crash reports in a crash bucket B. The formal definition is:

IADCP(r, B) = \frac{1}{1 + \frac{1}{n}\sum_{j=1}^{n} DCP_j(r)}    (1)

where n is the number of crash reports in bucket B that contain revision r, and DCP_j(r) represents the minimum distance between revision r and the crash point in the j-th crash report of bucket B.

Inverse Average Distance to Crashing Line (IADCL): As shown in our study, if a revision appears closer to the crashing line of a frame, it is more likely to be the crash-inducing change. Thus, we define the feature Inverse Average Distance to Crashing Line (IADCL) to measure how close a revision r is to the crashing lines of the crash reports in a crash bucket B. The definition of IADCL is:

IADCL(r, B) = \frac{1}{1 + \min_{j=1...n} DCL_j(r)}    (2)

where n is the number of crash reports in bucket B, and DCL_j(r) represents the minimum distance between revision r and the crashing line of any frame in the j-th crash report of bucket B.

Inverse Time Distance to Crash Revision (ITDCR): Our study found that if the committed time of a revision is closer to the committed time of the revision where the crash occurred, it is more likely to be the crash-inducing change. Note that this observation is based on the chronological order of the committed times of the candidate crash-inducing changes. Therefore, we design the feature Inverse Time Distance to Crash Revision (ITDCR) as follows:

ITDCR(r, B) = \frac{1}{1 + Rank(r, B)}    (3)

where Rank(r, B) represents the rank of revision r among all the candidate crash-inducing changes, based on the reverse chronological order of the committed time of each change.

Revision Frequency (RF): In our empirical study, we found that if a revision appears frequently in the crash reports of a bucket, it is more likely to be the crash-inducing change. We define the feature Revision Frequency (RF) to measure the frequency of a revision r appearing in the crash reports of a crash bucket B:

RF(r, B) = \frac{N_{r,B}}{N_B}    (4)

where N_{r,B} is the number of crash reports in bucket B in which revision r appears, and N_B is the number of crash reports in bucket B.

Inverse Bucket Frequency (IBF): In our study, we found that if a revision appears in many different buckets of crash reports, it is less likely to be the crash-inducing change of a specific crashing fault. Based on this finding, we extract the feature Inverse Bucket Frequency (IBF) for a revision r as follows:

IBF(r) = \log\left(\frac{\#B}{\#B_r} + 1\right)    (5)

where #B is the total number of buckets, and #B_r is the number of buckets whose candidate changes include revision r. The design of IBF is analogous to IDF (Inverse Document Frequency) in information retrieval, which is used to decrease the importance of less meaningful terms (e.g., "a", "an", "the" and so on).

Crash Component (CC): Our study found that if a revision affects (i.e., adds, deletes or modifies) source files of the component that the crash bucket belongs to, it is more likely to be the crash-inducing change. Thus, we design the feature Crash Component (CC) for a revision r in bucket B as follows:

CC(r, B) = 1 if r affects source files in the component that B belongs to, and CC(r, B) = 0 otherwise.    (6)

Revision's Lines of Added/Deleted/Changed Code (RLOAC, RLODC, RLOCC): Prior studies (Moser et al. 2008; Nagappan and Ball 2005) showed that the size of a change is a good indicator of defect-prone modules: if a change is larger, it is more likely to be defective. Therefore, we use Revision's Lines of Added Code (RLOAC), Revision's Lines of Deleted Code (RLODC), and Revision's Lines of Changed Code (RLOCC) as features for a revision r. These features represent the number of lines of added, deleted, and changed code that a revision r commits.

Number of Affected Files (NAF): As shown in a previous study (Nagappan and Ball 2005), the number of files affected by a revision is a good indicator of defect-prone changes. If the number of files that a revision affects is large, the revision is more likely to be defective. Therefore, we also adopt Number of Affected Files (NAF) as a feature.

The features described above can have very different value ranges. Existing studies (Kotsiantis et al. 2006; Al Shalabi et al. 2006) showed that the accuracy and efficiency of classification algorithms can be improved if the features are normalized into a similar range. Therefore, we use min-max normalization (Witten et al. 2016) to normalize all features in the training and testing datasets, so that the values of each feature range from 0 to 1.
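The sketch below shows how a few of these features and the min-max normalization might be computed; the function inputs are our own simplified representation of the data, not the authors' implementation.

```python
import math

def iadcp(distances_to_crash_point):
    """IADCP (Eq. 1): the list holds DCP_j(r) for the n reports containing revision r."""
    n = len(distances_to_crash_point)
    return 1.0 / (1.0 + sum(distances_to_crash_point) / n)

def iadcl(distances_to_crash_line):
    """IADCL (Eq. 2): minimum distance to a crashing line over the bucket's reports."""
    return 1.0 / (1.0 + min(distances_to_crash_line))

def rf(n_reports_with_rev, n_reports_in_bucket):
    """RF (Eq. 4)."""
    return n_reports_with_rev / n_reports_in_bucket

def ibf(n_buckets, n_buckets_with_rev):
    """IBF (Eq. 5)."""
    return math.log(n_buckets / n_buckets_with_rev + 1)

def min_max_normalize(values):
    """Min-max normalization of one feature column to [0, 1]."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

print(iadcp([0, 2, 1]), rf(3, 4), ibf(155, 20))
```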

4.3 Training a Classification Model

We construct the training corpus from the historical crash buckets that have been fixed. Given unresolved crash buckets in the current version, we use the resolved buckets from the previous versions as the training dataset.

Since the number of crash-inducing changes is much smaller than the number of non-crash-inducing changes, the training dataset is typically imbalanced. An imbalanced dataset can significantly compromise the performance of most standard learning algorithms (Batista et al. 2004; Prati et al. 2004). To handle the imbalanced dataset problem in the training process, typical techniques include over-sampling and under-sampling. Over-sampling techniques duplicate instances of the minority class (e.g., the instances whose label is TRUE in our case), while under-sampling techniques remove instances of the majority class. In this paper, we adopt the random under-sampling technique, by which we randomly select and remove instances of the majority class until the numbers of instances in both classes are equal. There are two reasons for adopting this technique: (1) there is empirical evidence that random sampling techniques perform better than or as well as more sophisticated sampling techniques in handling the imbalanced dataset problem (Mani and Zhang 2003); (2) over-sampling may result in higher computational cost (Nallapati 2004). Having collected the data, we train a classification model using a standard classifier (such as Logistic Regression, Decision Tree, Naive Bayes, etc.), which classifies a candidate change as a crash-inducing change (TRUE) or a non-crash-inducing change (FALSE). We use Logistic Regression as the default classifier.
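A minimal training sketch, assuming scikit-learn, a feature matrix X, and labels y (1 for crash-inducing, 0 otherwise); the under-sampling shown is the plain random variant described above, not a tuned implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def undersample(X, y, rng):
    """Randomly drop majority-class (label 0) instances until both classes have equal size."""
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    keep_neg = rng.choice(neg, size=len(pos), replace=False)  # assumes label 0 is the majority
    keep = np.concatenate([pos, keep_neg])
    return X[keep], y[keep]

def train(X, y, seed=0):
    """Train the default Logistic Regression classifier on a balanced sample."""
    Xb, yb = undersample(np.asarray(X), np.asarray(y), np.random.default_rng(seed))
    return LogisticRegression(max_iter=1000).fit(Xb, yb)
```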

4.4 Ranking Based on Classification

Having trained the classification model, when a new crash bucket comes in, ChangeLocator first extracts the candidate crash-inducing changes from the crashing methods. Then, it extracts the features of each candidate crash-inducing change as described in Section 4.2. By applying the trained learning model, ChangeLocator outputs the probability that the label of each change is TRUE and uses this probability as the suspicious score of the change. Finally, we rank all candidate crash-inducing changes by their suspicious scores in descending order.
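Scoring and ranking a new bucket could then look like the sketch below, where `model` is the classifier trained in Section 4.3 and `features(c)` returns the (normalized) feature vector of candidate change c; both names are assumptions for illustration.

```python
def rank_candidates(model, candidates, features):
    """Rank the candidate changes of a new bucket by P(label = TRUE)."""
    scores = model.predict_proba([features(c) for c in candidates])[:, 1]
    return sorted(zip(candidates, scores), key=lambda cs: cs[1], reverse=True)

# Each entry of the returned list is (revision id, suspicious score), highest score first.
```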

5 Experimental Design

5.1 Subjects for Evaluation

NetBeans is a popular open source integrated development environment for application development across different operating systems. We collected the crash-inducing change dataset from NetBeans 6.5 to 7.2 (note that NetBeans does not have a version 6.6). Table 3 shows the basic information of the subjects. Each subject contains 25,558 to 53,264 source files, 5,524K to 11,632K lines of code, and 110,887 to 235,526 revisions. There are 313 buckets in total. ChangeLocator requires a dataset for training. For the subject NetBeans 6.5, there is no available training dataset if we consider the chronological order of the collected NetBeans versions. Therefore, we do not conduct testing on NetBeans 6.5 and only utilize it for training. For each other subject, we use all of the buckets in its previous versions as the training dataset. For example, to locate crash-inducing changes for NetBeans 6.8, we use the buckets in NetBeans 6.5 and 6.7 as the training dataset.

In our previous work (Wu et al. 2014), we used Mozilla products as subjects, since they also have publicly available crash data. However, we found it difficult to collect crash-inducing changes for Mozilla projects. Due to parallel development in multiple branches, developers often commit different bug-fixing patches for the same bug in different branches (this does not cause problems when collecting buggy functions (Wu et al. 2014) or buggy source files (Wang et al. 2016), since these patches mostly modify the same functions). We found that the number of crash-inducing changes is often large and that many of them are noise. The noise is due to two issues. (1) The bug-fixing patches for the same bug in different branches may modify different source lines, and the crash-inducing changes generated from different patches are sometimes very different. (2) Unlike the NetBeans project, where each crash bucket is linked to a single bug, a crash bucket in Mozilla often has links to multiple related bugs (these bugs are related because they share the same crash signature). However, not all of these related bugs are valid for the given crash bucket. Take the crash bucket "mozilla::ipc::FatalError | mozilla::layers::PLayerTransactionParent::Read" in Firefox 45.0 as an example. There are four related bugs linked to this crash bucket: Bug 1286437, 1248156, 1240975, and 1208226.

Table 3 Subjects for evaluation

Subject        #Files    KLOC     #Revs     #Buckets

Netbeans 6.5*  25,558    5,524    110,887   42
Netbeans 6.7   48,044    10,331   139,189   61
Netbeans 6.8   49,918    10,717   158,936   52
Netbeans 6.9   53,264    11,632   176,495   40
Netbeans 7.0   46,189    9,745    196,886   38
Netbeans 7.1   38,900    8,429    217,206   41
Netbeans 7.2   39,674    8,666    235,526   39

*: Netbeans 6.5 is only used for training. #Revs is the number of revisions committed before the release.

Our manual validation based on the information in the bug reports finds that Bug 1248156 duplicates Bug 1208226, and that they are valid bug reports that concern Firefox 45.0, while the other two are invalid for this version. Therefore, only Bug 1248156 and Bug 1208226 can be considered the real bugs of this crash bucket. If we used the fixing changes of invalid bugs to identify crash-inducing changes, we would include much noise. Eliminating all this noise requires much manual effort, and sometimes it is even impossible due to insufficient information in the bug reports. Due to this noise problem, we currently do not include Mozilla projects as evaluation subjects.

5.2 Research Questions

To evaluate the effectiveness of our technique, we design experiments to address the following research questions:

RQ1: Can ChangeLocator effectively locate crash-inducing changes?

In this RQ, we evaluate the effectiveness of ChangeLocator in locating crash-inducing changes, using the evaluation metrics described in Section 5.3. We extract the candidate changes from Form 3 as described in Section 3. We compare ChangeLocator with the state-of-the-art IR-based approach Locus (Wen et al. 2016), which can locate bugs at the change level. Locus takes a bug report as input, queries the code changes in the code repository, and locates the bug inducing changes. Since crash reports can be used as a bug report, we can adopt Locus to locate crash-inducing changes. To generate one bug report for each bucket of crash reports, we use the crash stacks as the content of the bug report. To verify whether it is reasonable to use only crash stacks as the bug reports for Locus, we manually inspected the bug reports corresponding to crash buckets in NetBeans; the description of most of the collected crashing bug reports includes only crash stacks or crash reports. Moreover, our proposed technique is designed to locate crash-inducing changes as soon as buckets of crash reports are collected. Therefore, to compare ChangeLocator with Locus fairly, we use crash stacks as the input for both techniques. To narrow down the search space of inducing changes, Locus also utilizes the candidates from Form 3, rather than all the changes in the repository, to locate crash-inducing changes.

Although there are many different kinds of bug localization techniques, as described in a previous comprehensive literature survey (Dit et al. 2013), we select only Locus for comparison, for the following reasons. First, the input of our target problem consists only of crash stacks, and only IR-based bug localization techniques are applicable to this problem. Other techniques, such as spectrum-based approaches that require dynamic analysis to collect execution traces, are not applicable. Second, Locus is the state-of-the-art IR-based approach. A recent study (Wen et al. 2016) showed that Locus outperforms other IR-based approaches, such as BRTracer (Wong et al. 2014), BLUiR (Saha et al. 2013), and AmaLgam (Wang and Lo 2014). Third, Locus is designed to locate bugs at the change level.

RQ2: How does each feature contribute to the effectiveness of ChangeLocator?

We proposed 10 features correlated with crash-inducing changes, based on the observations in Section 3. This RQ aims at evaluating the effectiveness of each feature. We evaluate the contribution of each feature in two steps. First, for all the evaluation subjects, we run ChangeLocator using a single feature at a time, and compare the results of each single feature with the result of learning with all features. Second, similar to the previous study (Wu et al. 2014), we run ChangeLocator by adding one feature at a time based on the performance of each single feature, following the order from the worst one to the best one. In this way, we know how much each feature contributes to ChangeLocator.

RQ3: How do different candidate forms of crash-inducing changes affect the effectiveness of ChangeLocator?

As introduced in the empirical study (see Section 3), we explored three forms of candidate crash-inducing changes for a bucket of crash reports. Form 1 includes the largest number of candidate changes and subsumes the candidates in Form 2 and Form 3. As we can see from Table 2, the candidates in Form 1 cover more bugs than the other two forms. This indicates that Form 1 is better than the other two forms in terms of the capability of locating bugs. However, the candidate set of Form 1 is larger, which can have a side effect on ranking crash-inducing changes: the larger the candidate set, the more difficult it is for ChangeLocator to rank the crash-inducing changes well among the candidates. Therefore, we need to find a candidate set that achieves a good balance between the capability of locating bugs and the ranking quality. The goal of RQ3 is to evaluate the effect of the three forms on the performance of ChangeLocator. To answer this question, we run ChangeLocator with the candidate crash-inducing changes extracted by each of the three forms, and then compare their performance.

5.3 Evaluation Metrics

ChangeLocator produces a ranked list of changes according to their suspicious scores. Developers can then examine the list from top to bottom and locate the inducing changes for a crash bucket. It is desirable that the crash-inducing changes are ranked high in the list. We evaluate the performance of ChangeLocator using the following metrics. Note that, since we employ random under-sampling in the training process, to ensure that the conclusions of our experiment are general, we repeat the sampling process 100 times, run ChangeLocator, and calculate the average value of the following evaluation metrics as the final results.

ChangeLocator, and report the average value of the following evaluation metrics as the final results.

1. Recall@N: This metric reports the percentage of buckets whose crash-inducing changes can be discovered by examining the top N (N = 1, 2, 3, ...) changes of the suspicious list returned by ChangeLocator. The higher the value, the less effort developers need to locate the crash-inducing changes, and thus the better the performance.

2. Mean Reciprocal Rank (MRR): This is a statistic for evaluating a process that produces a list of possible responses to a query (Manning et al. 2008). The reciprocal rank of a query is the multiplicative inverse of the rank of the first relevant answer found. The mean reciprocal rank is the average of the reciprocal ranks over a set of queries Q. The formal definition of MRR is:

$$MRR = \frac{1}{|Q|}\sum_{i=1}^{|Q|}\frac{1}{FR_i} \qquad (7)$$

Here, $FR_i$ represents the rank of the first relevant answer found in the returned list for query i. Clearly, the higher the value, the better the performance in locating crash-inducing changes. We use this metric to evaluate the ability of ChangeLocator to locate the first crash-inducing change for a bucket.

3. Mean Average Precision (MAP): MAP provides a single value measuring the quality of information retrieval performance (Manning et al. 2008). For a single query, it takes all the relevant answers into consideration together with their ranks. The average precision is computed as:

$$AvgP = \frac{\sum_{k=1}^{n} rel(k) \times P(k)}{N} \qquad (8)$$

where k is the rank in the sequence of retrieved answers, n is the total number of retrieved answers, and N is the total number of relevant answers. Here, rel(k) is an indicator function equal to one if the item at rank k is a relevant answer and zero otherwise, and P(k) is the precision at cut-off rank k, computed as:

$$P(k) = \frac{N_r}{k} \qquad (9)$$

where $N_r$ is the number of relevant answers in the top k of the returned list (Turpin and Scholer 2006). For a set of queries Q, the mean average precision is computed as:

$$MAP = \frac{\sum_{i=1}^{|Q|} AvgP(i)}{|Q|} \qquad (10)$$

This metric is used to evaluate the ability of ChangeLocator to locate all the crash-inducing changes for a bucket. Clearly, the higher the value, the better the performance.
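For illustration, the following is a minimal sketch of how Recall@N, MRR and MAP can be computed from per-bucket relevance lists. It assumes each bucket is represented as a list of 0/1 flags over the ranked candidate changes, and scores a bucket as 0 when no crash-inducing change appears in the list, which is an assumption rather than a detail stated in the paper.

```python
# Sketch of the three evaluation metrics over ranked candidate lists.
import numpy as np

def recall_at_n(ranked_buckets, n):
    """Fraction of buckets whose crash-inducing change appears in the top n.
    ranked_buckets: list of per-bucket 0/1 relevance flags in rank order."""
    hit = sum(1 for rel in ranked_buckets if any(rel[:n]))
    return hit / len(ranked_buckets)

def mrr(ranked_buckets):
    """Mean reciprocal rank of the first relevant (crash-inducing) change."""
    rr = []
    for rel in ranked_buckets:
        first = next((i + 1 for i, r in enumerate(rel) if r), None)
        rr.append(1.0 / first if first else 0.0)   # assumption: 0 when not found
    return float(np.mean(rr))

def mean_average_precision(ranked_buckets):
    """Mean of per-bucket average precision over all relevant changes."""
    aps = []
    for rel in ranked_buckets:
        total_relevant = sum(rel)
        if total_relevant == 0:
            aps.append(0.0)
            continue
        hits, score = 0, 0.0
        for k, r in enumerate(rel, start=1):
            if r:
                hits += 1
                score += hits / k        # P(k) at each relevant rank
        aps.append(score / total_relevant)
    return float(np.mean(aps))
```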

6 Evaluation Results

RQ1: Can ChangeLocator effectively locate crash-inducing changes?

Table 4 shows the results of our evaluation. On average, ChangeLocator can successfully locate the crash-inducing changes for 44.7%, 68.5%, and 74.5% of crash buckets by examining only the top 1, 5 and 10 changes, respectively. Table 4 also shows the comparison results between ChangeLocator and Locus for the six subjects.

Table 4 Results of ChangeLocator and Locus

Subject      | Approach      | Recall@1   | Recall@5   | Recall@10  | MAP   | MRR
Netbeans 6.7 | ChangeLocator | 12 (19.7%) | 35 (57.4%) | 42 (68.9%) | 0.353 | 0.360
Netbeans 6.7 | Locus         | 6 (9.8%)   | 22 (36.1%) | 36 (59.0%) | 0.222 | 0.233
Netbeans 6.8 | ChangeLocator | 20 (38.5%) | 34 (65.4%) | 39 (75.0%) | 0.469 | 0.493
Netbeans 6.8 | Locus         | 8 (15.4%)  | 23 (44.2%) | 28 (53.8%) | 0.253 | 0.265
Netbeans 6.9 | ChangeLocator | 18 (45.0%) | 26 (65.0%) | 26 (65.0%) | 0.520 | 0.528
Netbeans 6.9 | Locus         | 3 (7.5%)   | 12 (30.0%) | 20 (50.0%) | 0.168 | 0.182
Netbeans 7.0 | ChangeLocator | 19 (50.0%) | 25 (65.8%) | 27 (71.1%) | 0.534 | 0.566
Netbeans 7.0 | Locus         | 3 (7.9%)   | 12 (31.6%) | 21 (55.3%) | 0.197 | 0.203
Netbeans 7.1 | ChangeLocator | 23 (56.1%) | 33 (80.5%) | 36 (87.8%) | 0.642 | 0.672
Netbeans 7.1 | Locus         | 2 (4.9%)   | 14 (34.1%) | 24 (58.5%) | 0.201 | 0.205
Netbeans 7.2 | ChangeLocator | 23 (59.0%) | 30 (76.9%) | 31 (79.5%) | 0.634 | 0.660
Netbeans 7.2 | Locus         | 4 (10.3%)  | 13 (33.3%) | 19 (48.7%) | 0.218 | 0.218
Average      | ChangeLocator | 44.7%      | 68.5%      | 74.5%      | 0.525 | 0.546
Average      | Locus         | 9.3%       | 34.9%      | 54.2%      | 0.210 | 0.218

As mentioned above, we use Logistic Regression as the classifier in ChangeLocator. The results indicate that ChangeLocator outperforms Locus in terms of all evaluation metrics. On average, the improvement of ChangeLocator over Locus is 380.6% (89 more buckets), 96.3% (87 more buckets), and 37.5% (53 more buckets) for Recall@1, Recall@5, and Recall@10, respectively. For MAP, ChangeLocator outperforms Locus by 150.6% on average, with the improvement ranging from 58.8% to 219.7%. For MRR, the improvement of ChangeLocator over Locus is 151.0% on average, ranging from 54.3% to 227.6%. We also applied the Mann-Whitney U-Test (Mann and Whitney 1947) to the comparison between ChangeLocator and Locus, and the results show that ChangeLocator significantly outperforms Locus (p < 0.01).
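As an illustration of the statistical comparison, the following sketch applies SciPy's Mann-Whitney U-Test to per-subject scores. The six example values are the per-subject MAP figures from Table 4, and the choice of a one-sided alternative is an assumption, since the paper does not state whether the test was one- or two-sided.

```python
# Sketch: Mann-Whitney U-Test comparing ChangeLocator and Locus scores.
from scipy.stats import mannwhitneyu

changelocator_map = [0.353, 0.469, 0.520, 0.534, 0.642, 0.634]  # from Table 4
locus_map = [0.222, 0.253, 0.168, 0.197, 0.201, 0.218]          # from Table 4

stat, p = mannwhitneyu(changelocator_map, locus_map, alternative="greater")
print(f"U = {stat:.1f}, p = {p:.4f}")  # p < 0.01 would support a significant difference
```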

RQ2: How does each feature contribute to the effectiveness of ChangeLocator?

Figure 7 shows the effectiveness of ChangeLocator with each feature, using the default Logistic Regression classifier, in terms of the MAP and MRR metrics. We take the average value of the evaluation metrics over all subjects as the evaluation result. Among all these features, RF (Revision Frequency) and RLODC (Revision's Lines of Added Code) are less effective than the other features, while IADCP (Inverse Distance to Crashing Point) and CC (Crash Component) are the most effective ones. ChangeLocator using all features also significantly outperforms ChangeLocator using any single feature, with an improvement over the most effective single feature of 19.3% and 23.3% in terms of MAP and MRR, respectively. To further verify the contribution of each single feature to the performance of ChangeLocator, we incrementally apply the features RLODC, DLOAC, RF, RLOCC, NAF, IBF, IADCL, ITDCR, CC and ITDCP to ChangeLocator. The order of incrementally applying features follows the single-feature performance, from the worst feature to the best one. As shown in Fig. 8, the MAP and MRR values improve incrementally, and the best performance is achieved when all the features are combined. Overall, all the features contribute to the performance of ChangeLocator.

Fig. 7 The effectiveness of ChangeLocator with different features (in terms of MAP and MRR)

RQ3: How do different forms of candidate crash-inducing changes affect the effectiveness of ChangeLocator?

We run ChangeLocator on the three candidate forms of crash-inducing changes. The Recall@N results are shown in Fig. 9. Using candidate Form 3, ChangeLocator achieves the best Recall@N when N ≤ 5. For example, by examining the top 1 change, ChangeLocator with Form 3 locates 18.3% more buckets than with Form 1 and 10.9% more buckets than with Form 2. Using candidate Form 2, ChangeLocator achieves slightly better performance than with Form 1, with an improvement of 6% in Recall@1. When N increases to 6, ChangeLocator performs similarly on the three candidate forms. Figure 10 shows the MAP and MRR results for the three forms. Similar to the Recall@N metrics, ChangeLocator achieves the best MAP and MRR with candidate Form 3. For example, Form 3 outperforms Form 1 by 10.3% and 9.0% in terms of MAP and MRR, respectively, and outperforms Form 2 by 6.9% and 6.0%, respectively.

Fig. 8 The contribution of each feature (in terms of MAP and MRR). The labels 1–10 in the figure represent the feature sets that incrementally add one feature at a time in the order RLODC, DLOAC, RF, RLOCC, NAF, IBF, IADCL, ITDCR, CC and ITDCP

Fig. 9 The Recall@N of the three forms of candidate crash-inducing changes

Overall, we can conclude that ChangeLocator performs the best when using the candidate Form 3.

7 Discussions

(1) How is the performance of ChangeLocator in locating crash-inducing changes using different classification algorithms?

Since ChangeLocator is a framework in which different classification algorithms can be adopted, we evaluate its effectiveness with different classification algorithms. We adopt four popular and representative classification algorithms as the learning model of ChangeLocator: Logistic Regression (Logistic), Decision Tree (J48), Naive Bayes (NaiveBayes), and Bayesian Network (BayesNet), all implemented on top of the open-source software WEKA (Weka 2016). We use the default parameter settings of each classifier in WEKA. Figure 11 shows the Recall@N results of the four classification algorithms with N ranging from 1 to 10, averaged over the six subjects. ChangeLocator can locate the crash-inducing changes and rank them at top 1 among the candidate changes for 44.7%, 34.4%, 29.8% and 47.7% of the buckets using Logistic, Naive Bayes, Bayesian Network and J48, respectively. By examining the top 5 recommended changes for each bucket, developers can successfully locate 66.9% to 69.7% of crash buckets, depending on the classifier; the differences among classifiers are marginal in terms of Recall@5.
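The following sketch mirrors this setup using scikit-learn classifiers as stand-ins for the WEKA implementations the paper actually uses (scikit-learn has no direct BayesNet equivalent, and its decision tree is CART rather than WEKA's C4.5/J48); the feature matrices named here are hypothetical.

```python
# Sketch: train a classifier on labelled historical changes and rank the
# candidate changes of a bucket by predicted probability of being crash-inducing.
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

classifiers = {
    "Logistic": LogisticRegression(max_iter=1000),
    "J48-like": DecisionTreeClassifier(),   # CART stand-in for WEKA's J48
    "NaiveBayes": GaussianNB(),
}

def rank_candidates(clf, X_train, y_train, X_candidates):
    """Fit on historical labelled changes, then rank candidates by suspiciousness."""
    clf.fit(X_train, y_train)
    scores = clf.predict_proba(X_candidates)[:, 1]   # P(change is crash-inducing)
    return scores.argsort()[::-1]                    # indices, most suspicious first
```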

Fig. 10 The MAP and MRR of the three forms of candidate crash-inducing changes

Fig. 11 Recall@N results of different classifiers

Figure 12 shows the performance of ChangeLocator in terms of MAP and MRR using different classifiers. Consistent with the Recall@1 results, Logistic and J48 perform slightly better than the other classifiers. The results suggest that Logistic and J48 can be used as the preferred classifiers in ChangeLocator.

(2) How does the training data size affect the performance of ChangeLocator?

As ChangeLocator is a learning-based approach that requires training data, the training data size may affect its performance. For example, as shown in Table 4, ChangeLocator performs worse on versions 6.7 and 6.8 than on the other versions. This is mainly because the training datasets for versions 6.7 and 6.8 are smaller than those of the other versions (we only have the training dataset from version 6.6 when testing on version 6.7, and the training datasets from 6.6 and 6.7 when testing on version 6.8). To further evaluate the impact of the training dataset size, we conducted a controlled experiment on the subject Netbeans 7.2, which has the largest training dataset. By default we use 100% of the buckets from the previous versions (i.e., Netbeans 6.7 to 7.1). In this controlled experiment, we randomly sampled different ratios of buckets from the training set, from 0.1 to 1.0, used the sampled buckets as the training set, and evaluated the performance of ChangeLocator on Netbeans 7.2. We repeated the sampling process 100 times at each sampling ratio and report the average performance on each evaluation metric; a sketch of this procedure is given below. The results for MAP, MRR, Recall@1, Recall@5, and Recall@10 are shown in Fig. 13.
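The following is a minimal sketch of this sampling procedure under stated assumptions; `train_and_evaluate` is a hypothetical helper that trains on the sampled buckets and returns a dictionary of metric values for the held-out release.

```python
# Sketch of the training-size experiment: sample a ratio of training buckets,
# train, evaluate, repeat 100 times per ratio, and average each metric.
import random

def training_size_curve(training_buckets, train_and_evaluate, repeats=100):
    results = {}
    for ratio in [r / 10 for r in range(1, 11)]:            # 0.1, 0.2, ..., 1.0
        k = max(1, int(len(training_buckets) * ratio))
        runs = []
        for _ in range(repeats):
            sample = random.sample(training_buckets, k)      # sample without replacement
            runs.append(train_and_evaluate(sample))          # dict of metric -> value
        results[ratio] = {m: sum(r[m] for r in runs) / len(runs)
                          for m in runs[0]}                   # average each metric
    return results
```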

Fig. 12 MAP and MRR results of different classifiers

Fig. 13 The performance of ChangeLocator on different training data sizes in Netbeans 7.2 (using the Logistic Regression classifier)

The results indicate that the size of the training set affects the performance of ChangeLocator mainly when the training set is small: the performance increases significantly as the training data grows. For example, the MAP value increases from 0.418 to 0.604 as the random sampling ratio of the training dataset increases from 10% to 50%. However, once we use 50% of the buckets in the original training set, ChangeLocator almost reaches its optimal performance. The results shown in Fig. 13 are obtained with ChangeLocator using Logistic Regression; we also tried other classification algorithms and obtained similar results. These results indicate that ChangeLocator does not require a large amount of historical data for training. In our experiment, 50% of the crash buckets from version 6.5 to 7.1 (about 140 buckets) are sufficient to train an effective model, and ChangeLocator performs reasonably well once the number of training instances reaches about 140.

(3) Can we locate crashing bugs at a finer granularity with the help of ChangeLocator?

ChangeLocator can locate crash-inducing changes for crash buckets. A crash-inducing change is a committed revision submitted to the source code repository, and it may affect multiple source files and lines of code. Prior studies (Kamei et al. 2013; Wen et al. 2016) showed that predicting or locating bugs at the change level can save much of the effort of examining code. In this study, we also investigate the examination effort required to locate crashing bugs given the information of crash-inducing changes. As shown in Fig. 14, the median number of source files is 6, which indicates that developers need to examine no more than 6 source files for half of the located crashing bugs; likewise, the median number of lines of code to examine is 619.5. In some cases, however, a crash-inducing change can affect a large number of source files and lines of code. For example, the largest crash-inducing change modified 658 source files and 102,858 lines of code. Such cases would require much manual effort and may not save developers' effort.

Fig. 14 Examination effort for locating crashing bugs. "without" means crash-inducing changes without crash stacks; "with" means crash-inducing changes with crash stacks

To further reduce the effort, we propose to combine the crash stack information with the crash-inducing changes. Our empirical study and prior studies (Wu et al. 2014; Wong et al. 2014; Schröter et al. 2010) showed that many buggy components (including files, methods and crash-inducing changes) reside in crash stacks. Besides, examining the crash stacks is a conventional way for developers to diagnose crashes (Wu et al. 2014). Inspired by this, we examine only the hunks (i.e., groups of contiguous lines that are changed in one commit (Wen et al. 2016)) of the crash-inducing changes that lie in the source files covered by crash stacks; a sketch of this filtering step is shown after this paragraph. In this way, we can significantly reduce the manual effort. As shown in Fig. 14, by combining crash stacks with crash-inducing changes, developers only need to examine 1 source file and 165 lines of code to locate half of the crashing bugs. We conducted the Mann-Whitney U-Test and confirmed that the effort (the number of source files and lines of code) required for examining the hunks of the crash-inducing changes restricted to crash stacks is significantly less than the effort for examining all the crash-inducing changes (p-value < 0.01). Moreover, we find that these hunks are usually confined to a single source file, and a Mann-Whitney U-Test also confirms that the effort for examining them is significantly less than the effort for examining the entire buggy files (p-value < 0.01). However, examining only the hunks in crash stacks may miss some bugs, since the buggy code may be executed but not recorded in crash stacks. For example, in our study, examining the hunks in crash stacks fails to locate 25 buckets (9.2% of all crash buckets) that can be located by examining the entire crash-inducing changes. In the future, we will consider further improving ChangeLocator so that it can locate crashing bugs at a finer granularity, such as the hunk level, and combining ChangeLocator with existing crashing bug localization techniques such as CrashLocator (Wu et al. 2014), which can locate bugs outside crash stacks.
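The filtering step described above can be sketched as follows; the `Hunk` type and helper names are hypothetical, not part of ChangeLocator's actual implementation.

```python
# Sketch: of all hunks in a located crash-inducing change, keep only those whose
# source file also appears in the bucket's crash stacks, reducing the code a
# developer has to inspect.
from dataclasses import dataclass
from typing import List, Set

@dataclass
class Hunk:
    file_path: str            # e.g. ".../InnerToOuterTransformer.java"
    changed_lines: List[int]  # line numbers touched by the change

def hunks_to_examine(change_hunks: List[Hunk], stack_files: Set[str]) -> List[Hunk]:
    """Keep only hunks located in source files covered by the crash stacks."""
    return [h for h in change_hunks if h.file_path in stack_files]
```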

(4) Why are there some crash buckets that cannot be located by ChangeLocator?

Although ChangeLocator is effective, it still cannot locate all crash buckets: in total, 23.2% of the crash buckets cannot be located by ChangeLocator. This is mainly because ChangeLocator extracts the candidate inducing changes only from crash stacks, while crash-inducing changes may lie outside crash stacks. Our prior study (Wu et al. 2014) showed that the buggy code may be popped off the stack during execution and therefore not be recorded in crash stacks. Similarly, if a crash-inducing change introduces buggy code that was popped off during execution, ChangeLocator will fail to locate it. Figure 15 gives an example of such a case. Crash report 604701, which is caused by Bug #221173, crashed at line 289 in the source file InnerToOuterTransformer.java. However, the buggy statement is at line 528 of the same file, which was executed and popped off within the method refactorInnerClass. The buggy statement was introduced by revision a10666d51e59. Since this crash-inducing change does not modify the source code of the methods in the crash stack, ChangeLocator fails to locate the inducing change. To locate crash-inducing changes outside crash stacks, in the future we will consider leveraging the static call graph analysis adopted in prior studies (Wu et al. 2014; Seo and Kim 2012) to explore source code that may have been executed before the crash.

8 Threats to Validity

There are potential threats to the validity of our work:

– Subject selection bias: We only use data from the Netbeans project in our experiment because of the public availability of its crash data. Although it is also feasible to collect crash data from the bug reporting systems of some open source projects (e.g., Eclipse), crash stacks in bug reports are optional and not many users report them manually, which makes it difficult to collect a large amount of crash report data. Mozilla also has publicly available crash data. However, we found it difficult to collect crash-inducing changes for Mozilla projects: developers often commit different bug-fixing patches for the same bug in different branches, which makes the number of crash-inducing changes very large. We manually checked some of the crash-inducing changes and found that many of them are noisy. Therefore, we do not include Mozilla in this study. In the future, we will consider cleaning the data and including Mozilla projects in the evaluation.

– Oracle dataset: The oracle dataset is collected based on existing techniques for collecting bug inducing changes (Kim et al. 2006). To minimize the threat of data quality problems, we manually validated the collected data. However, it is still possible that the oracle dataset is incomplete or imprecise. In the future, we will investigate techniques to improve the quality of the oracle dataset.

– Empirical evaluation: In this work, we conduct experiments to evaluate the effectiveness of the proposed technique. Although there is some evidence that bug inducing changes are useful for developers, the usefulness of our technique should ultimately be evaluated by real developers in practice. In the future, we will conduct a user study to further evaluate our technique.

9 Related Work

9.1 Crash Analysis

Recently, researchers have devoted much effort to analyzing software crashes in large-scale systems. Microsoft developed WER (Windows Error Reporting System) (Glerum et al. 2009) to automatically collect crash reports from end users and facilitate debugging. Due to the large number of crash reports received daily, the crash reporting system needs to organize duplicate crash reports into "buckets".

Fig. 15 An example that ChangeLocator fails to locate: crash report 604701 crashed at InnerToOuterTransformer.java:289 (crash revision b6c037585768), while the buggy line 528 in refactorInnerClass was introduced by crash-inducing change a10666d51e59

Bucketing crash reports has been proved useful in practice according to the ten years of operation of WER (Glerum et al. 2009). To further improve the accuracy of bucketing, Kim et al. (2011) proposed to leverage the similarity of crash graphs to identify duplicate crash reports. Dang et al. (2012) proposed a method for bucketing duplicate crash reports based on call stack similarity. To better prioritize debugging efforts, Kim et al. (2011) proposed to predict the top crashes when crash reporting systems have received only a small number of crashes. To facilitate the maintenance of crashing faults, Seo and Kim (2012) proposed to predict recurring crashes due to incomplete or missing fixes. To improve bug management for crashing bugs, Wang et al. (2016) proposed an algorithm to locate buggy files for crashes via crash correlation groups, as well as a method to identify duplicate and related crashing bug reports. To facilitate the debugging of software crashes, many studies have worked on reproducing crashes. For example, ReCrashJ (Artzi et al. 2008) generates unit tests for crashes by capturing the state of method arguments. Chronicler (Bell et al. 2013) captures non-deterministic inputs and reproduces the crashes. BugRedux (Jin and Orso 2012) records different kinds of execution data and reproduces crashes using symbolic analysis. Cao et al. (2014) record the return values of hard-to-resolve functions and use concolic execution to generate test cases for reproducing crashes. White et al. (2015) proposed to generate reproducible bug reports for Android application crashes based on predefined natural language descriptions of app features and user execution profile information. Moran et al. (2016) proposed an approach to discover crashes and generate reproducible bug reports for Android application crashes, including screenshots and augmented natural language description steps. The above work mainly studied the collection and bucketing of crash reports, the prediction of crash-prone modules, and the reproduction of crashes. Our work also focuses on analyzing software crashes. Different from the above work, we focus on the problem of locating the crash-inducing changes for crash buckets.

9.2 Bug Localization

Bug localization is one of the important steps in debugging, which is often tedious and non-trivial. In recent years, various bug localization techniques have been proposed. Spectrum-based bug localization techniques (Abreu et al. 2007; Jones et al. 2002; Liblit et al. 2003, 2005) statistically analyze the passing and failing execution traces of test cases and rank suspicious statements. Parnin and Orso (2011) conducted a comparative study on the effectiveness of these techniques by comparing developers' debugging time with and without the ranked list of suspicious statements. Their study showed that simply examining faulty statements in isolation may not be sufficient for debugging, and that more context should be provided. Besides, most existing spectrum-based bug localization techniques require inputs, such as passing and failing test cases, that are not available in existing crash reporting systems. IR-based bug localization techniques (Rao and Kak 2011; Wong et al. 2014; Zhou et al. 2012; Moreno et al. 2014) mainly apply or adapt information retrieval techniques to query the relevant source files based on the textual descriptions of bug reports. Similar to our work, some of these techniques (Wong et al. 2014; Moreno et al. 2014) utilize crash stack traces as input to enhance the performance of bug localization. Wang et al. (2015) investigated the usefulness of these IR-based techniques through both an experimental study and a user study. Their study showed that the quality of bug reports significantly affects IR-based techniques, and that the results produced by IR-based techniques alone are not sufficient for developers to understand and fix the bugs. Both Parnin and Orso (2011) and Wang et al. (2015) showed that developers expect more contextual information. In this paper, we target locating crash-inducing changes for crashing faults in large-scale systems with crash reporting systems. As shown in our investigation, developers consider crash-inducing changes useful contextual information for debugging. Besides, our work only requires the source code repository and crash reports, which are easy to obtain in practice.

9.3 Bug Inducing Changes

Bug inducing changes are an important data source for software analysis. To identify them, Śliwerski et al. (2005) proposed the SZZ algorithm based on bug-fixing changes. Kim et al. (2006) further improved the SZZ algorithm by removing non-semantic changes and outlier fixes. da Costa et al. (2017) proposed a framework to evaluate the results generated by SZZ-based algorithms. Based on bug inducing changes, researchers have conducted many studies. For example, Śliwerski et al. (2005) performed a study on finding the most bug-prone day. Kim et al. (2006) used bug inducing changes to analyze micro pattern changes in source code. Kamei et al. (2013) and Kim et al. (2008) proposed to predict whether a committed change is bug-prone based on bug inducing changes. An et al. (2017) predicted whether a committed change would induce a crash in the future. Our work is related to the above work, since we leverage the bug inducing changes in the training process. Unlike the above work, our goal is to locate crash-inducing changes for crash buckets. Different from change-based defect prediction work (Kamei et al. 2013; Kim et al. 2008), which cannot tell which bug a bug-prone change is responsible for, our technique provides a suggested list of changes for a specific crash bucket.

10 Conclusion and Future Work

In this paper, we described an empirical study on crash-inducing changes. Based on its results, we proposed ChangeLocator, an automatic tool for locating crash-inducing changes. To evaluate the effectiveness of ChangeLocator, we conducted an experimental study on six release versions of the NetBeans project. Our evaluation results show that, using ChangeLocator, we can locate the crash-inducing changes for 44.7%, 68.5%, and 74.5% of crash buckets by examining only the top 1, 5 and 10 changes, respectively. We also compared our technique with the state-of-the-art bug localization technique Locus. The evaluation results show that ChangeLocator significantly outperforms Locus, with the improvement in MAP ranging from 58.8% to 219.7% and the improvement in MRR ranging from 54.3% to 227.6%. In the future, we will evaluate ChangeLocator on more projects, including industrial projects. We will also conduct a user study to evaluate the usefulness and effectiveness of ChangeLocator in practice. Moreover, we will consider combining ChangeLocator with other bug localization techniques to locate bugs at a finer granularity. Also, applying ChangeLocator to automated program repair is an interesting research direction to explore. Our tool and the experimental data used in this paper can be accessed via Git at the following URL: https://bitbucket.org/rongxin/changelocator-dataset.git

Acknowledgments We thank the anonymous reviewers for their insightful comments. This research is supported by Hong Kong SAR RGC/GRF grant 16202917, NSFC grant 61272089, and the 2016 Microsoft Research Asia Collaborative Research Program.

References

Abreu R, Zoeteweij P, Van Gemund AJ (2007) On the accuracy of spectrum-based fault localization. In: Testing: academic and industrial conference practice and research techniques - MUTATION (TAICPART-MUTATION 2007). IEEE, Piscataway, pp 89–98
Al Shalabi L, Shaaban Z, Kasasbeh B (2006) Data mining: a preprocessing engine. J Comput Sci 2(9):735–739
An L, Khomh F (2015) An empirical study of crash-inducing commits in Mozilla Firefox. In: Proceedings of the 11th international conference on predictive models and data analytics in software engineering. ACM, New York, p 5
An L, Khomh F, Guéhéneuc Y-G (2017) An empirical study of crash-inducing commits in Mozilla Firefox. Softw Qual J, 1–32
Arcuri A, Yao X (2008) A novel co-evolutionary approach to automatic software bug fixing. In: 2008 IEEE Congress on evolutionary computation (IEEE world congress on computational intelligence). IEEE, Piscataway
Artzi S, Kim S, Ernst MD (2008) ReCrash: making software failures reproducible by preserving object states. In: European conference on object-oriented programming, vol 8, pp 542–565
Batista GE, Prati RC, Monard MC (2004) A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter 6(1):20–29
Bell J, Sarda N, Kaiser G (2013) Chronicler: lightweight recording to reproduce field failures. In: Proceedings of the 2013 international conference on software engineering. IEEE Press, Piscataway, pp 362–371
Bug report list (2015) [online]. Available: https://bugzilla.mozilla.org/buglist.cgi?longdesc=regression%20range&longdesc type=casesubstring&query format=advanced&short desc=crash&short desc type=allwordssubstr&order=bug status%2cpriority%2cassigned to%2cbug id&limit=0
Cao Y, Zhang H, Ding S (2014) SymCrash: selective recording for reproducing crashes. In: Proceedings of the 29th ACM/IEEE international conference on automated software engineering. ACM, New York, pp 791–802
Mozilla crash reports (2015) [online]. Available: http://crashstats.mozilla.com
da Costa DA, McIntosh S, Shang W, Kulesza U, Coelho R, Hassan A (2017) A framework for evaluating the results of the SZZ approach for identifying bug-introducing changes. IEEE Trans Softw Eng 43(7):641–657
Dang Y, Wu R, Zhang H, Zhang D, Nobel P (2012) ReBucket: a method for clustering duplicate crash reports based on call stack similarity. In: Proceedings of the 34th international conference on software engineering. IEEE Press, Piscataway, pp 1084–1093
Dit B, Revelle M, Gethers M, Poshyvanyk D (2013) Feature location in source code: a taxonomy and survey. Journal of Software: Evolution and Process 25(1):53–95
Glerum K, Kinshumann K, Greenberg S, Aul G, Orgovan V, Nichols G, Grant D, Loihle G, Hunt G (2009) Debugging in the (very) large: ten years of implementation and experience. In: Proceedings of the ACM SIGOPS 22nd symposium on operating systems principles. ACM, New York, pp 103–116
Jin W, Orso A (2012) BugRedux: reproducing field failures for in-house debugging. In: Proceedings of the 34th international conference on software engineering. IEEE, Piscataway, pp 474–484
Jones JA, Harrold MJ, Stasko J (2002) Visualization of test information to assist fault localization. In: Proceedings of the 24th international conference on software engineering. ACM, New York, pp 467–477
Kamei Y, Shihab E, Adams B, Hassan AE, Mockus A, Sinha A, Ubayashi N (2013) A large-scale empirical study of just-in-time quality assurance. IEEE Trans Softw Eng 39(6):757–773
Kim S, Pan K, Whitehead EJ Jr (2006) Micro pattern evolution. In: Proceedings of the 2006 international workshop on mining software repositories. ACM, New York, pp 40–46
Kim S, Zimmermann T, Pan K, James E Jr et al (2006) Automatic identification of bug-introducing changes. In: Proceedings of the 21st IEEE/ACM international conference on automated software engineering. IEEE, Piscataway, pp 81–90
Kim S, Zimmermann T, Whitehead EJ Jr, Zeller A (2007) Predicting faults from cached history. In: Proceedings of the 29th international conference on software engineering. IEEE Computer Society, Washington, pp 489–498
Kim S, Whitehead EJ Jr, Zhang Y (2008) Classifying software changes: clean or buggy? IEEE Trans Softw Eng 34(2):181–196
Kim D, Wang X, Kim S, Zeller A, Cheung S-C, Park S (2011) Which crashes should I fix first?: predicting top crashes at an early stage to prioritize debugging efforts. IEEE Trans Softw Eng 37(3):430–447
Kim S, Zhang H, Wu R, Gong L (2011) Dealing with noise in defect prediction. In: Proceedings of the 33rd international conference on software engineering. IEEE, Piscataway, pp 481–490
Kim S, Zimmermann T, Nagappan N (2011) Crash graphs: an aggregated view of multiple crashes to improve crash triage. In: 2011 IEEE/IFIP 41st international conference on dependable systems & networks. IEEE, Piscataway, pp 486–493
Kotsiantis S, Kanellopoulos D, Pintelas P (2006) Data preprocessing for supervised learning. Int J Comput Sci 1(2):111–117
Le Goues C, Nguyen T, Forrest S, Weimer W (2012) GenProg: a generic method for automatic software repair. IEEE Trans Softw Eng 38(1)
Liblit B, Aiken A, Zheng AX, Jordan MI (2003) Bug isolation via remote program sampling. In: Proceedings of the ACM SIGPLAN 2003 conference on programming language design and implementation. ACM, New York, pp 141–154
Liblit B, Naik M, Zheng AX, Aiken A, Jordan MI (2005) Scalable statistical bug isolation. In: ACM SIGPLAN Notices, vol 40, no 6. ACM, New York, pp 15–26
Mani I, Zhang I (2003) kNN approach to unbalanced data distributions: a case study involving information extraction. In: Proceedings of the workshop on learning from imbalanced datasets
Mann HB, Whitney DR (1947) On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics
Manning CD, Raghavan P, Schütze H (2008) Introduction to information retrieval, vol 1. Cambridge University Press, Cambridge
Moran K, Linares-Vásquez M, Bernal-Cárdenas C, Vendome C, Poshyvanyk D (2016) Automatically discovering, reporting and reproducing android application crashes. In: 2016 IEEE international conference on software testing, verification and validation. IEEE, Piscataway, pp 33–44
Moreno L, Treadway JJ, Marcus A, Shen W (2014) On the use of stack traces to improve text retrieval-based bug localization. In: 2014 IEEE international conference on software maintenance and evolution. IEEE, Piscataway, pp 151–160
Moser R, Pedrycz W, Succi G (2008) A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. In: Proceedings of the 30th international conference on software engineering. ACM, New York, pp 181–190
Nagappan N, Ball T (2005) Use of relative code churn measures to predict system defect density. In: Proceedings of the 27th international conference on software engineering. IEEE, Piscataway, pp 284–292
Nallapati R (2004) Discriminative models for information retrieval. In: Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval. ACM, New York, pp 64–71
Netbeans bugzilla (2015) [online]. Available: https://netbeans.org/bugzilla
Netbeans exception reports (2015) [online]. Available: http://statistics.netbeans.org/analytics/list.do?query
Netbeans report exception faqs (2015) [online]. Available: http://wiki.netbeans.org/usecases
Netbeans source code repository (2015) [online]. Available: http://hg.netbeans.org
Technical note tn2123: CrashReporter (2015) [online]. Available: developer.apple.com/library/mac/#technotes/tn2004/tn2123.html
Parnin C, Orso A (2011) Are automated debugging techniques actually helping programmers? In: Proceedings of the 2011 international symposium on software testing and analysis. ACM, New York, pp 199–209
Prati RC, Batista GE, Monard MC (2004) Class imbalances versus class overlapping: an analysis of a learning system behavior. In: MICAI 2004: advances in artificial intelligence, vol 4, pp 312–321
Regression range (2015) [online]. Available: https://wiki.mozilla.org/firefox OS/performance/bisecting regressions
Rao S, Kak A (2011) Retrieval from software libraries for bug localization: a comparative study of generic and composite text models. In: Proceedings of the 8th working conference on mining software repositories. ACM, New York, pp 43–52
Robertson SE, Jones KS (1976) Relevance weighting of search terms. Journal of the Association for Information Science and Technology 27(3):129–146
Saha RK, Lease M, Khurshid S, Perry DE (2013) Improving bug localization using structured information retrieval. In: 2013 IEEE/ACM 28th international conference on automated software engineering. IEEE, Piscataway, pp 345–355
Schröter A, Bettenburg N, Premraj R (2010) Do stack traces help developers fix bugs? In: 2010 7th IEEE working conference on mining software repositories. IEEE, Piscataway, pp 118–121
Seo H, Kim S (2012) Predicting recurring crash stacks. In: Proceedings of the 27th IEEE/ACM international conference on automated software engineering. ACM, New York, pp 180–189
Śliwerski J, Zimmermann T, Zeller A (2005) When do changes induce fixes? ACM SIGSOFT Software Engineering Notes 30(4):1–5
Turpin A, Scholer F (2006) User performance versus precision measures for simple search tasks. In: Proceedings of the 29th annual international ACM SIGIR conference on research and development in information retrieval. ACM, New York, pp 11–18
Venkatesh GA (1991) The semantic approach to program slicing. In: ACM SIGPLAN Notices, vol 26, no 6. ACM, New York, pp 107–119
Wang S, Lo D (2014) Version history, similar report, and structure: putting them together for improved bug localization. In: Proceedings of the 22nd international conference on program comprehension. ACM, New York, pp 53–63
Wang Q, Parnin C, Orso A (2015) Evaluating the usefulness of IR-based fault localization techniques. In: Proceedings of the 2015 international symposium on software testing and analysis. ACM, New York, pp 1–11
Wang S, Khomh F, Zou Y (2016) Improving bug management using correlations in crash reports. Empir Softw Eng 21(2):337–367
Weimer W, Forrest S, Le Goues C, Nguyen T (2010) Automatic program repair with evolutionary computation. Commun ACM 53(5):109–116
Weka (2016) [online]. Available: http://www.cs.waikato.ac.nz/ml/weka
Wen M, Wu R, Cheung S-C (2016) Locus: locating bugs from software changes. In: Proceedings of the 31st IEEE/ACM international conference on automated software engineering. ACM, New York
White M, Linares-Vásquez M, Johnson P, Bernal-Cárdenas C, Poshyvanyk D (2015) Generating reproducible and replayable bug reports from android application crashes. In: 2015 IEEE 23rd international conference on program comprehension. IEEE, Piscataway, pp 48–59
Witten IH, Frank E, Hall MA, Pal CJ (2016) Data mining: practical machine learning tools and techniques. Morgan Kaufmann, Burlington
Wong C-P, Xiong Y, Zhang H, Hao D, Zhang L, Mei H (2014) Boosting bug-report-oriented fault localization with segmentation and stack-trace analysis. In: 2014 IEEE international conference on software maintenance and evolution. IEEE, Piscataway, pp 181–190
Wu R, Zhang H, Kim S, Cheung S-C (2011) ReLink: recovering links between bugs and changes. In: Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on foundations of software engineering. ACM, New York, pp 15–25
Wu R, Zhang H, Cheung S-C, Kim S (2014) CrashLocator: locating crashing faults based on crash stacks. In: Proceedings of the 2014 international symposium on software testing and analysis. ACM, New York, pp 204–214
Ye X, Bunescu R, Liu C (2014) Learning to rank relevant files for bug reports using domain knowledge. In: Proceedings of the 22nd ACM SIGSOFT international symposium on foundations of software engineering. ACM, New York, pp 689–699
Zeller A (1999) Yesterday, my program worked. Today, it does not. Why? In: ACM SIGSOFT Software Engineering Notes, vol 24, no 6. Springer, Berlin, pp 253–267
Zhou J, Zhang H, Lo D (2012) Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports. In: Proceedings of the 34th international conference on software engineering. IEEE, Piscataway, pp 14–24

Rongxin Wu received the doctoral degree in computer science and engineering from The Hong Kong University of Science and Technology in 2017. He is now a post-doctoral fellow at The Hong Kong University of Science and Technology. His research interests include program analysis, software crash analysis, and mining software repositories. He has served as a reviewer for reputable international journals and as a program committee member for several international conferences. He has received an ACM SIGSOFT Distinguished Paper Award. More information about him can be found at: http://home.cse.ust.hk/~wurongxin/.

Ming Wen received his BSc degree from the college of computer science and technology of Zhejiang Uni- versity (ZJU) in 2014. He is currently a PhD student in the department of computer science and engineering at the Hong Kong University of Science and Technology (HKUST). His research interests include program analysis, mining software repositories, fault localization and repair.

Shing-Chi Cheung received his doctoral degree in Computing from Imperial College London. He joined the Hong Kong University of Science and Technology (HKUST) in 1994, where he is a professor of Computer Science and Engineering. He founded the CASTLE research group at HKUST and co-founded the International Workshop on Automation of Software Testing (AST) in 2006. He was the General Chair of the 22nd ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE 2014) and an editorial board member of the IEEE Transactions on Software Engineering (TSE, 2006–2009). His research interests focus on the quality enhancement of software for mobile, web, deep learning, open-source and end-user applications. He is an ACM Distinguished Scientist. More information about his CASTLE research group can be found at http://sccpu2.cse.ust.hk/castle/people.html.

Hongyu Zhang is an Associate Professor at The University of Newcastle, Australia. Previously, he was a Lead Researcher at Microsoft Research Asia and an Associate Professor at Tsinghua University, China. He received his PhD degree from the National University of Singapore in 2003. His research is in the area of Software Engineering, in particular software analytics, testing, maintenance, metrics, and reuse. He has published more than 100 research papers in reputable international journals and conferences. He received two ACM Distinguished Paper awards. He has also served as a program committee member for many software engineering conferences. More information about him can be found at: https://sites.google.com/site/hongyujohn/.