Challenges and Issues of Mining Crash Reports

Le An, Foutse Khomh
SWAT, Polytechnique Montréal, Québec, Canada
{le.an, foutse.khomh}@polymtl.ca

Abstract: Automatic crash reporting tools built into many software systems allow software practitioners to understand the origin of field crashes and help them prioritise field crashes or bugs, locate erroneous files, and/or predict bugs and crash occurrences in subsequent versions of the software systems. In this paper, after illustrating the structure of crash reports in Mozilla, we discuss some techniques for mining information from crash reports, and highlight the challenges and issues of these techniques. Our aim is to raise the awareness of the research community about issues that may bias research results obtained from crash reports and to provide some guidelines to address certain challenges related to mining crash reports.

Index Terms: Crash report, bug report, mining software repositories.

I. INTRODUCTION

Nowadays, crash reporting tools are embedded in many software systems to collect information about crashes in the field (i.e., when a program stops functioning properly in a user environment). Crashes with the same crash signature, which is derived from the stack trace of the failing thread, are grouped into a crash type. Usually, software quality managers prioritise crash types by the number of crash occurrences, then file the top crash types into bug reports. Crash reports provide a useful reference to analyse and resolve bugs. They can help software practitioners locate erroneous code, understand the impact of failures, and prioritise crash-related bug reports.

Leveraging crash reports, previous studies proposed new ideas to improve current software maintenance techniques. Khomh et al. [1] analysed crash reports of Mozilla Firefox and proposed an entropy metric that reveals the distribution of the occurrences of crashes among end users. Wang et al. [2] studied the crash reports of Firefox and the stack traces of Eclipse. They proposed five rules to automatically identify correlated crash types. In a recent study, we also analysed crash reports from Mozilla products to identify bugs that affect a large population of users [3].

However, some inherent properties of crash reports may undermine research conclusions if researchers and software practitioners do not account for them when they mine information from crash reports. For example, user information is not always available in crash reports, and the many-to-many relationship between crashes and their related bugs increases the difficulty of analysis. Furthermore, since few crash collecting databases are publicly available, researchers can hardly estimate the risk posed by some of these issues (e.g., the lack of precise user information) or validate the generalisability of their findings.

In the rest of this paper, we describe the structure of a crash report before presenting some analytic techniques to mine implicit information from crash reports. For each of the techniques, we discuss its challenges and issues. After giving an overview of the related work, we conclude the paper with suggestions for future studies.

II. STRUCTURE OF CRASH REPORTS

In this section, we take crash reports from the Mozilla Socorro server (Mozilla's crash collecting database) [4] as an example to describe a typical crash collecting system.

Mozilla delivers its applications with a built-in automatic crash reporting tool, which sends a crash report to the Mozilla Socorro server once end users encounter an unexpected halt of the application. Each crash report records a stack trace of the failing thread and other information about the user's environment. A stack trace is an ordered set of frames where each frame indicates a method signature and a link to the corresponding source code. Source code information is not always available in the frames, especially when a frame belongs to a third party binary [1].

    Crash Report:   e1c1267874640-94324-32423
    Crash Time:     OCT 24, 2010 11:20:53
    Install Time:   SEP 22, 2010 10:20:15
    System Uptime:  1125 seconds
    Version:        3.6.13
    OS:             Windows NT 6.1 2600
    CPU:            x86
    User Comment:
    Stack Trace:

    Frame  Module      Signature                 Source
    0                  @0x654789
    1      User32.dll  UserCallWinProcCheckWow
    2      User32.dll  DispatchMethod
    3      User32.dll  DispatchMessage
    4      XUI.dll     ProcessNextNativeEvent    Src/win/nsAppShell.cpp:179
    5      XUI.dll     nsShell::OnProcess        Src/win/nsShell.cpp:77
    6      Nspr4.dll   mozilla::Pump             Ipc/glue/MessagePump.cpp:134
    7      XUI.dll     MessageLoop:Run           Ipc/glue/MessageLoop.c:784

    Annotations in the figure: each crash report is assigned a unique ID; OS and CPU form the user environment information; all crash reports with top signature as "UserCallWinCheckWow" are grouped together; not all frames have source information.

Fig. 1: A sample crash report from Firefox

Figure 1 shows a sample crash report from Mozilla Firefox. Among all information provided in a crash report, researchers and software practitioners may be interested in the following attributes when they study crashes for software maintenance:
• Crash signature: top method signature of a crash. Crashes with the same crash signature are grouped into one crash type in the Socorro server [1].
• Install age: the time in seconds from the installation until the crash occurred.
• Crash date: the point of time when the crash was reported.

/15/$31.00 © 2015 IEEE, SWAN 2015, Montréal, Canada

• OS name: name of the operating system on which the crash occurred.
• OS version: version of the operating system on which the crash occurred.
• CPU type: family model of the CPU of the machine on which the crash occurred.
• Bug list: list of bugs related to the crash.
• Up time: the number of seconds since the application startup.

Software researchers and practitioners can refer to the Socorro documentation [5] for the description of other attributes.

III. ANALYSIS OF CRASH REPORTS AND CHALLENGES

In this section, we describe some techniques for analysing crash reports for software maintenance and discuss their challenges and issues.

A. User Identification

When we study crash reports, an important step is to identify the unique users who encounter crashes in a software system. Mapping crashes to different users can help software practitioners understand the dispersion of these crashes in the user base and determine the priority of related crash types and bugs. However, user identity is not always available. For example, due to privacy concerns, Mozilla crash reports do not contain personal information that identifies the users reporting the crashes. Khomh et al. [1] used a heuristic in which the installation time of the studied system (obtained by subtracting the install age from the crash date) identifies different users (noted as install profiles). In our recent work [3], which studied the crashing impact of bugs on the user population, we used a vector of computer configuration (the combination of CPU type, OS name, and OS version) to represent different "users" (noted as machine profiles) in a system.

However, these user identification heuristics may lead to inconsistent results. For example, when we apply the install profile heuristic, two users who install an application at the same point of time (to the second) may encounter quite different types of crashes. When we apply the machine profile heuristic, users with the same computer configuration may also exhibit different crashing characteristics. In other words, users who happen to install a software system at the same time or on the same model of computer (especially enterprise users whose computers are uniformly configured or whose software is installed simultaneously) may behave differently, and should not be clustered together in crash analysis.

B. Mapping between Crashes and Bugs

When we study the characteristics of bugs related to field crashes, we should map crash reports to their corresponding bugs in order to understand the distribution of crash-related bugs in the user base. Using the information provided in crash reports and bug reports, there are two general ways to link related bugs and crashes.

On the one hand, we can use the bug lists indicated in crash reports to directly map crashes to bugs. However, some crashes are reported before their corresponding bugs are opened. If we apply the direct mapping only, we may omit the information about these crashes.

On the other hand, we can also perform an indirect mapping from crashes to bugs via crash types. In other words, we perform a two-step mapping as follows: we map each bug to a set of crash types (i.e., groups of crashes with the same crash signature), then map each crash type to a set of crash reports. Concretely, we build two hash tables, dict_bug and dict_crash. In any item of dict_bug, the key is a bug ID and the value is a set of crash signatures. In any item of dict_crash, the key is a crash signature and the value is a stack of crash reports (every new crash report, i.e., crash occurrence, is appended at the end of the structure). By associating the two hash tables, we can finally map a bug to a set of corresponding crash reports, including those lacking bug information (i.e., crashes reported before the creation of the bugs). Similarly, to investigate the crash distribution in the user base, we can also link different users (represented by install profiles or machine profiles) to a bug by analysing the bug's related crash reports. Algorithms 1 and 2 show the pseudocode to link a bug to the users suffering its related crashes. The mapping script in Python and example data are available in the following repository: http://swat.polymtl.ca/anle/data/SWAN2015/

However, in a software system, a crash report may be related to several bugs, and a bug may be related to more than one crash.

Algorithm 1: Map different crashed users to a crash type, and map different crash types to a bug
Input: signature, user, buglist

    if signature in dict_crash then
        stack_user ← dict_crash[signature]
        add user to stack_user
    else
        stack_user ← new stack with user
        dict_crash[signature] ← stack_user
    foreach bug in buglist do
        if bug in dict_bug then
            set_signature ← dict_bug[bug]
            add signature to set_signature
        else
            set_signature ← new set with signature
            dict_bug[bug] ← set_signature

Algorithm 2: Map crashed user occurrences to their bugs
Input: dict_bug

    foreach bug in dict_bug do
        set_signature ← dict_bug[bug]
        foreach signature in set_signature do
            stack_user ← dict_crash[signature]
            concatenate stack_user to occur_user
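The two user identification heuristics of Section III-A and the two-step mapping of Algorithms 1 and 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' published script: the function names, the dictionary keys ("signature", "user", "bug_list"), and the choice to keep one user list per bug are hypothetical and chosen only for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def install_profile(crash_date, install_age_seconds):
    """Install profile heuristic: installation time = crash date - install age."""
    return crash_date - timedelta(seconds=install_age_seconds)


def machine_profile(cpu_type, os_name, os_version):
    """Machine profile heuristic: the configuration tuple stands in for a user."""
    return (cpu_type, os_name, os_version)


def build_tables(crash_reports):
    """Algorithm 1: fill dict_crash (signature -> users) and dict_bug (bug -> signatures).

    Each report is assumed to be a dict with the hypothetical keys
    "signature", "user", and "bug_list".
    """
    dict_crash = defaultdict(list)  # one entry per crash occurrence, appended in order
    dict_bug = defaultdict(set)
    for report in crash_reports:
        signature = report["signature"]
        dict_crash[signature].append(report["user"])
        for bug in report["bug_list"]:
            dict_bug[bug].add(signature)
    return dict_crash, dict_bug


def users_per_bug(dict_crash, dict_bug):
    """Algorithm 2: concatenate the user occurrences of every crash type of a bug.

    Keeping a separate list per bug is an assumption of this sketch.
    """
    occurrences = {}
    for bug, signatures in dict_bug.items():
        users = []
        for signature in sorted(signatures):  # sorted only for a stable order
            users.extend(dict_crash[signature])
        occurrences[bug] = users
    return occurrences


# Illustrative data only; signatures and bug IDs are made up.
reports = [
    {"signature": "sig_a", "user": machine_profile("x86", "Windows NT", "6.1"), "bug_list": [101]},
    {"signature": "sig_a", "user": machine_profile("amd64", "Linux", "3.13"), "bug_list": []},
    {"signature": "sig_b", "user": machine_profile("x86", "Windows NT", "6.1"), "bug_list": [101, 202]},
]
dict_crash, dict_bug = build_tables(reports)
occurrences = users_per_bug(dict_crash, dict_bug)
```

In this example the second report carries no bug list, yet it is still attributed to bug 101 through its crash signature. This is the benefit of the indirect mapping, and also the reason it may attach crashes to a bug that are not actually related to it.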

The many-to-many relationship between crashes and bugs makes it harder to map a bug to its corresponding crash reports. Compared with the direct mapping, the concatenation operations (in Algorithm 2) in the indirect mapping require more execution time. Because crash analysis usually involves a large dataset (e.g., Mozilla receives 2.5 million crash reports on the peak day each week, namely, around 50GB of data need to be processed every day [6]), reducing the time complexity of the mapping technique helps researchers and software practitioners analyse the data efficiently and understand the impact of crash-related bugs early.

To mitigate the above mentioned problems, we can combine the direct and indirect mapping techniques. In other words, if the bug list is available in a crash report, we use the direct mapping; otherwise, we apply the indirect mapping. The combined approach can reduce the running time of the mapping while still linking a bug to all its possible crash reports. Besides, according to our case study [3], direct mapping and indirect mapping may lead to subtly different results. Suppose a crash C1 of crash type CT is caused by a bug B. Another crash C2, also classified as type CT, might not be related to the bug B. This happens because most software organisations, such as Mozilla, group crashes with the same "top frames" (in the stack trace of the failing thread) into a crash type. When a stack trace contains many frames, different software organisations may extract different numbers of "top frames" as crash signatures. Different criteria for the selection of the "top frames" may result in different distributions of crash types, and translate into different mappings between crashes and bugs. Therefore, to obtain mapping results with high precision, efficiency, and completeness, we suggest that software practitioners prioritise the direct mapping before resorting to the indirect mapping using crash types.

C. Relationship between Crashes and Bug Fixes

Furthermore, we can also map crash reports to bug fixes to investigate whether developers efficiently addressed reported field crashes and to estimate the effort required to fix the crashes. There are two ways to link a crash type to its corresponding bug fixes: via bug reports and by text analysis. In our previous work, we described a technique to map fixing commits to bug reports [7], so we can use bug reports as an intermediary to link crashes and commits. On the other hand, if a bug is related to several crash types that are in turn related to many bugs, it is better to directly link a crash type (in the crash collecting system) to bug fixes (in the version control system) by comparing the stack trace of the failing thread in crash reports with the revised code in bug fixes. Namely, we check whether the crash-related code (or, at a coarser level, the file containing this code) is changed in any revision. Having linked a crash to its revisions, we can compute the fixing duration and the number of developers involved, i.e., the crash's fixing effort.

Although bug fixing commits could be used to assess the effort (or estimated effort) required to fix a crash type, some factors may challenge the results of this approach. In particular, if we use bug reports to link a crash type and its corresponding bug fixes, some kinds of bugs may "exaggerate" the required effort of a crash type. For example, some re-opened bugs, i.e., bugs that have been opened more than once, may be due to an inactive attitude of software developers (i.e., these bugs have been prematurely closed) [7]. The bug fixes applied to these bugs before the re-opening should not be taken into consideration when computing the required effort. Moreover, crashes due to non-reproducible bugs [8] may bias effort estimations as well, because these bugs only show up on certain machines (from certain users), and do not affect other users. Joorabchi et al. [8] found that many non-reproducible bugs are mislabelled, because the available resolutions in the repositories do not cover all possible scenarios. These "defects" are mainly due to a wrong configuration of the software, and should not have been classified as bugs. If software managers compute the fixing effort of these false positive bugs in the same way as for other bugs, they may overestimate trivial issues while overlooking serious ones.

D. Localisation of Buggy Files

Crash reports can help with fault localisation as well. There are many studies on automatic bug localisation. For example, Liblit et al. [9] studied predicate patterns in correct and incorrect execution traces. They proposed an algorithm to separate the effects of different bugs and identify predictors associated with individual bugs. Nessa et al. [10] introduced a bug localisation algorithm based on N-gram analysis to rank the executable statements of a software system by level of suspicion. Wang et al. [2] proposed an algorithm based on crash correlation groups to locate and rank buggy files by analysing the stack traces in correlated crash types. Crash reports provide the failing source code location in their stack traces, but this information is not always complete (e.g., in the sample shown in Figure 1, source code information is not available in some frames). There are generally three types of source code information given in a crash report. In the best case, buggy file paths are directly indicated in the frames of a stack trace. Sometimes, only the crashed method signatures are provided in the stack trace; in that case, we have to search for the signatures in every file of the source code repository to locate the possible buggy files. In the worst case, only crashing memory addresses are given in the frames, from which we can hardly locate the source of a bug.

Nevertheless, we should not rely only on crash reports to locate erroneous code, because this technique is applicable only to crash-related bugs, and not all crash reports contain the location of related files or the crashed methods. Bug fixes are another source to discover buggy files. We can apply the heuristic described in [7] to link a bug to its related bug fixes, then identify the changed files in the bug fixes. In our case study with Mozilla, Eclipse, Netbeans, and Webkit, we found that a bug reference is not always available in bug fixes, i.e., some bugs cannot be linked to their bug fixes by this heuristic. Therefore, crash reports could be used as a complementary source to detect buggy file locations.
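The three kinds of source code information distinguished in Section III-D can be illustrated with a short Python sketch. It is only a hedged example: the frame layout (a dict with "signature" and "source" keys), the address pattern, and the plain substring search over an in-memory mapping of file paths to file contents are assumptions made for illustration, not part of Socorro or of the approaches cited above.

```python
import re

# Assumed shape of a raw address such as "@0x654789" (see frame 0 in Figure 1).
ADDRESS_RE = re.compile(r"^@?0x[0-9a-fA-F]+$")


def classify_frame(frame):
    """Return "path", "signature", or "address" for a stack frame.

    A frame is assumed to be a dict with optional "signature" and "source" keys.
    """
    if frame.get("source"):
        return "path"  # best case: the file path is given directly
    signature = frame.get("signature", "")
    if signature and not ADDRESS_RE.match(signature):
        return "signature"  # only a method signature is known
    return "address"  # worst case: only a memory address


def candidate_files(frame, repo_files):
    """List files that may contain the crashing code.

    repo_files is a hypothetical mapping from file path to file content.
    """
    kind = classify_frame(frame)
    if kind == "path":
        # Drop a trailing ":line" suffix such as "Src/win/nsShell.cpp:77".
        return [frame["source"].rsplit(":", 1)[0]]
    if kind == "signature":
        # Naive search of the signature in every file of the repository.
        name = frame["signature"]
        return sorted(path for path, text in repo_files.items() if name in text)
    return []  # a bare address gives no usable location
```

A substring search is the simplest possible stand-in for "searching the signatures in every file"; on a real code base it would return false positives (e.g., call sites as well as definitions), so the result should be read as a list of candidates rather than confirmed buggy files.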

E. Suggestions

When conducting empirical studies with crash report data, researchers should pay attention to the above mentioned issues. Before drawing any conclusion, the threats to validity (e.g., those of the user identification technique) should be carefully discussed. In addition, different software organisations may design their crash reports with different formats and structures. At the time of writing this paper, too few crash collecting databases are publicly available to researchers. Mozilla Socorro is the only source that we can explore for a pertinent case study. We hope that more software organisations will share their crash collecting databases to allow researchers to verify the generalisability of our proposed approaches as well as other related work, and to verify whether the above mentioned issues are relevant in a larger context (i.e., beyond the case of Mozilla). By analysing crash reports from different organisations, we can also propose ideas to improve the current structure of crash reports and crash collecting systems, to help software organisations increase their productivity and user satisfaction and reduce their maintenance effort.

IV. RELATED WORK

In this section, we introduce some related work on mining crash reports.

Podgurski et al. [11] introduced an automated failure clustering approach for the classification of crash reports, in order to facilitate their prioritisation and the diagnosis of their root causes. By mining crash reports in Mozilla Firefox, Khomh et al. [1] proposed an entropy-based approach that can be used to identify crash types not only by their crashing frequency but also by their crashing dispersion among users. Inspired by the work of Khomh et al., Wang et al. [2] analysed crash information in Firefox and Eclipse, and proposed an algorithm that can be used to locate and rank buggy files as well as a method to identify duplicate and related bug reports. Kim et al. [12] studied crash reports and impacted source files in Firefox and Thunderbird to predict top crashes before a new release of a software system.

Most of these researchers used the Mozilla Socorro crash reporting system as a subject system because, as far as we know, only the Mozilla Foundation has opened its crash reports to the public [2]. Though Wang et al. [2] studied another system, Eclipse, they could obtain crash information only through the stack traces contained in the bug reports (instead of using crash reports). However, stack trace information is not always available in bug reports for the majority of software systems. Dang et al. [13] proposed a method, ReBucket, to improve the current crash report clustering technique based on call stack matching. But their subject crash collecting database, Microsoft's Windows Error Reporting (WER) system, is not accessible to every researcher.

V. CONCLUSION AND FUTURE WORK

Previous studies on mining crash reports proposed approaches to improve crash triaging, locate buggy files and duplicate bug reports, as well as predict top crashes in future versions of a software system. However, researchers still have to face some challenges and issues when studying field crash reports, such as the lack of precise information about users or buggy files in crash reports, the many-to-many mapping between bugs and crash reports (which makes it difficult to link a crash to its bug fixing commits), and even the lack of publicly available crash report databases. These issues may challenge the conclusions drawn from these studies and should be carefully taken into account when conducting empirical studies in the future. In particular, the last issue greatly limits the generalisation of the ideas developed in previous studies.

In this paper, we illustrate the structure of crash reports in Mozilla, then describe some crash analysis techniques and discuss their corresponding problems. We call on more software organisations to share their crash report databases with researchers in order to improve our knowledge of these problems. Various subject systems can also provide different perspectives to help crash report designers improve current crash reporting systems and further help software practitioners improve the satisfaction of their end users.

REFERENCES

[1] F. Khomh, B. Chan, Y. Zou, and A. E. Hassan, "An entropy evaluation approach for triaging field crashes: A case study of mozilla firefox," in Reverse Engineering (WCRE), 2011 18th Working Conference on. IEEE, 2011, pp. 261–270.
[2] S. Wang, F. Khomh, and Y. Zou, "Improving bug management using correlations in crash reports," Empirical Software Engineering, pp. 1–31, 2014.
[3] L. An and F. Khomh, "An empirical study of highly-distributed bugs in mozilla firefox," École Polytechnique de Montréal, Tech. Rep., November 2014. [Online]. Available: http://swat.polymtl.ca/anle/technicalreports/highlydistributed.pdf
[4] "Socorro: Mozilla's crash reporting system," accessed 6th January, 2015. [Online]. Available: https://crash-stats.mozilla.com/home/products/Firefox
[5] "Socorro documentation," accessed 6th January, 2015. [Online]. Available: http://socorro.readthedocs.org/en/latest/index.html
[6] "Socorro: Mozilla's Crash Reporting Server," accessed 6th January, 2015. [Online]. Available: http://blog.mozilla.org/webdev/2010/05/19/socorro-mozilla-crash-reports/
[7] L. An, F. Khomh, and B. Adams, "Supplementary bug fixes vs. re-opened bugs," in Proceedings of the 14th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM), Victoria, BC, Canada, September 2014.
[8] M. Erfani Joorabchi, M. Mirzaaghaei, and A. Mesbah, "Works for me! characterizing non-reproducible bug reports," in Proceedings of the 11th Working Conference on Mining Software Repositories. ACM, 2014, pp. 62–71.
[9] B. Liblit, M. Naik, A. X. Zheng, A. Aiken, and M. I. Jordan, "Scalable statistical bug isolation," in ACM SIGPLAN Notices, vol. 40, no. 6. ACM, 2005, pp. 15–26.
[10] S. Nessa, M. Abedin, W. E. Wong, L. Khan, and Y. Qi, "Software fault localization using n-gram analysis," in Wireless Algorithms, Systems, and Applications. Springer, 2008, pp. 548–559.
[11] A. Podgurski, D. Leon, P. Francis, W. Masri, M. Minch, J. Sun, and B. Wang, "Automated support for classifying software failure reports," in Software Engineering, 2003. Proceedings. 25th International Conference on. IEEE, 2003, pp. 465–475.
[12] D. Kim, X. Wang, S. Kim, A. Zeller, S.-C. Cheung, and S. Park, "Which crashes should I fix first?: Predicting top crashes at an early stage to prioritize debugging efforts," Software Engineering, IEEE Transactions on, vol. 37, no. 3, pp. 430–447, 2011.
[13] Y. Dang, R. Wu, H. Zhang, D. Zhang, and P. Nobel, "Rebucket: A method for clustering duplicate crash reports based on call stack similarity," in Proceedings of the 2012 International Conference on Software Engineering. IEEE Press, 2012, pp. 1084–1093.
