SPAIN: Security Patch Analysis for Binaries Towards Understanding the Pain and Pills

Total Page:16

File Type:pdf, Size:1020Kb

SPAIN: Security Patch Analysis for Binaries Towards Understanding the Pain and Pills 2017 IEEE/ACM 39th International Conference on Software Engineering SPAIN: Security Patch Analysis for Binaries Towards Understanding the Pain and Pills Zhengzi Xu∗, Bihuan Chen∗‡, Mahinthan Chandramohan∗, Yang Liu∗ and Fu Song† ∗School of Computer Science and Engineering, Nanyang Technological University, Singapore †School of Information Science and Technology, ShanghaiTech University, China ‡Corresponding Author Abstract—Software vulnerability is one of the major threats to analysis has been proposed to discover n-day vulnerabilities, software security. Once discovered, vulnerabilities are often fixed whose patches have been released but not deployed to every by applying security patches. In that sense, security patches carry instance of the software in the world. Since patch analysis valuable information about vulnerabilities, which could be used to discover, understand and fix (similar) vulnerabilities. However, is relatively accurate to find vulnerabilities, it has gained the most existing patch analysis approaches work at the source code popularity in both industry and academia. Also, security patches level, while binary-level patch analysis often heavily relies on a lot are good entry points to understanding the program weaknesses of human efforts and expertise. Even worse, some vulnerabilities and how the vulnerabilities work inside the program, especially may be secretly patched without applying CVE numbers, or only for security participants who do not have access to the source the patched binary programs are available while the patches are not publicly released. These practices greatly hinder patch anal- code. Moreover, the underlying information of patches has ysis and vulnerability analysis. been used to build automatic bug-fixing tools [8, 9, 10], In this paper, we propose a scalable binary-level patch analysis generating valid patches for similar vulnerabilities. However, framework, named SPAIN, which can automatically identify se- up till today, most existing researches on patch analysis work curity patches and summarize patch patterns and their corre- at the source code level [11, 12], but very few works have sponding vulnerability patterns. Specifically, given the original and patched versions of a binary program, we locate the patched been done to tackle this problem at the binary-level. functions and identify the changed traces (i.e., a sequence of basic Binary-level patch analysis can only be performed on ma- blocks) that may contain security or non-security patches. Then chine instructions for closed source programs without symbol we identify security patches through a semantic analysis of these tables, which often requires a significant amount of human traces and summarize the patterns through a taint analysis on efforts and expertise to understand the semantics of instructions. the patched functions. The summarized patterns can be used to search similar patches or vulnerabilities in binary programs. Due to this complex nature, the existing techniques [13, 14] of- Our experimental results on several real-world projects have ten heavily rely on manual or heavy program analysis, which be- shown that: i) SPAIN identified security patches with high accu- comes infeasible for real-world programs. racy and high scalability; ii) SPAIN summarized 5 patch patterns On the other hand, software companies may tend to patch the and their corresponding vulnerability patterns for 5 vulnerability vulnerabilities they find themselves in a secret way instead of types; and iii) SPAIN discovered security patches that were not documented, and discovered 3 zero-day vulnerabilities. making them public and applying Common Vulnerabilities and Exposures (CVE) numbers due to their security regulations or policies. As a result, security analysts cannot know the existence I. INTRODUCTION of particular vulnerabilities, which hinders the understanding Program vulnerability is one of the major threats to software and analysis of vulnerabilities. Even worse, in some cases, security. However, it is almost impossible to avoid vulnerabili- only the patched binary programs are available; while the ties at the development stage; and it is even difficult to discover patches themselves may not be publicly released, which hinders vulnerabilities at the production stage. Security experts usually the existing patch analysis techniques that often rely on the leverage dynamic fuzzing (e.g., [1, 2]), symbolic execution (e.g., availability of patches. Moreover, due to the application of patch [3, 4]) or static code auditing (e.g., [5, 6]) to find vulnerabilities. obfuscation and patch modification techniques such as honey- However, none of these techniques can provide a complete patching [15], the patch patterns in open source binaries may solution to win the war against vulnerabilities. Dynamic fuzzing differ greatly from close source ones. suffers from the code coverage problem and the initial seeds To address these problems, we propose a scalable binary- problem [7]. Symbolic execution cannot scale well to real- level patch analysis framework, named SPAIN, to automatically world programs, due to path explosion and constraint solving identify security patches and summarize patch patterns and problems. Static code auditing often requires human expertise, their corresponding vulnerability patterns. In particular, given and cannot scale well when the program complexity increases. the original and patched versions of a binary program, SPAIN Vulnerabilities, once discovered, are often fixed by applying locates the functions that have been changed from the original security patches. In that sense, security patches carry important binary to the patched binary. Then, it detects the changed traces information about vulnerabilities. By focusing on the remedy of (i.e., a sequence of basic blocks) for each patched function to vulnerabilities instead of the vulnerabilities themselves, patch capture the function-level changes. These traces may contain 1558-1225/17 $31.00 © 2017 IEEE 460462 DOI 10.1109/ICSE.2017.49 security or non-security patches. Finally, it identifies security BB1 BB1 patches through a semantic analysis of these traces and sum- call BN new call BN new 1: mov edi, eax a: test eax, eax BB3 error: ... mov edi, eax marizes the patterns through a taint analysis on the patched ... 2: cmp eax, [edi+8] b: jz error functions. As we run more binary programs, we can maintain ... BB2 a database of patterns that will be continuously enriched to ... c: cmp eax, [edi+8] embrace the evolving and emerging of various vulnerabilities. ... There are many impactful applications of the learned patterns Original version Patched version in binary security like patch detection, vulnerability detection, Fig. 1: Running Example: NULL Pointer Dereference Vulner- automatic patch synthesis, and even patch/vulnerability trend ability and its Patch from OpenSSL 1.0.1l. analysis with the help of data analytic techniques. A more detailed discussion can be found in Section III-E. B. Running Example We evaluated SPAIN on several real-world projects. The ex- Figure 1 lists the partial X86 assembly code of one NULL perimental results demonstrated that SPAIN can identify pointer dereference vulnerability abstracted from the OpenSSL security patches with high accuracy and high scalability, and 1.0.1l that consists of the original version and patched version. summarize 5 security patch patterns and their corresponding In the original version, the program first calls the function vulnerability patterns for 5 types of vulnerabilities. Moreover, NB_new by which the return value is stored in the register we discovered undocumented security patches and found 3 eax. Then, it assigns the return value to the register edi at the zero-day vulnerabilities in Adobe PDF Reader. location 1. Later, the return value is directly used to dereference In summary, our work makes the following contributions. the memory [edi+8] at the location 2. This vulnerability is • We proposed a scalable binary-level patch analysis frame- patched in OpenSSL 1.0.1m by checking whether the return work SPAIN, which can identify security patches and value of the function call NB_new is NULL or not. This summarize patch patterns and their corresponding vulner- checking is implemented in the patched version by adding ability patterns. two instructions: test eax, eax and jz error. It first • We implemented SPAIN in a prototype and conducted tests the return value at the location a by test eax, eax experiments to demonstrate the accuracy, scalability, and which performs a bitwise AND operation on the operand eax. application of SPAIN. The test operation sets the carry flag cf and overflow flag • We discovered undocumented security patches and found of to 0. The sign flag sf is set to the most significant bit of 3 zero-day vulnerabilities in Adobe PDF Reader. the result of the bitwise AND operation. The zero flag zf is set to 1 if the result of AND operation is 0, 0 otherwise. The II. PRELIMINARIES AND OVERVIEW parity flag pf is set to 1 if the number of ones in that byte is A. Preliminaries even, 0 otherwise. The value of the adjust flag af is undefined. In this work, we assume that binary programs are in the After assigning the return value to the register edi, it checks X86 32bit format. Each binary program contains a number of whether the zero flag zf is 1 or not (i.e., the value of eax is functions with an entry point (starting function). This section Null or not) at the location b.Ifitis1 (i.e., eax is NULL), introduces several basic concepts
Recommended publications
  • APPLICATION of the DELTA DEBUGGING ALGORITHM to FINE-GRAINED AUTOMATED LOCALIZATION of REGRESSION FAULTS in JAVA PROGRAMS Master’S Thesis
    TALLINN UNIVERSITY OF TECHNOLOGY Faculty of Information Technology Marina Nekrassova 153070IAPM APPLICATION OF THE DELTA DEBUGGING ALGORITHM TO FINE-GRAINED AUTOMATED LOCALIZATION OF REGRESSION FAULTS IN JAVA PROGRAMS Master’s thesis Supervisor: Juhan Ernits PhD Tallinn 2018 TALLINNA TEHNIKAÜLIKOOL Infotehnoloogia teaduskond Marina Nekrassova 153070IAPM AUTOMATISEETITUD SILUMISE RAKENDAMINE VIGADE LOKALISEERIMISEKS JAVA RAKENDUSTES Magistritöö Juhendaja: Juhan Ernits PhD Tallinn 2018 Author’s declaration of originality I hereby certify that I am the sole author of this thesis. All the used materials, references to the literature and the work of others have been referred to. This thesis has not been presented for examination anywhere else. Author: Marina Nekrassova 08.01.2018 3 Abstract In software development, occasionally, in the course of software evolution, the functionality that previously worked as expected stops working. Such situation is typically denoted by the term regression. To detect regression faults as promptly as possible, many agile development teams rely nowadays on automated test suites and the practice of continuous integration (CI). Shortly after the faulty change is committed to the shared mainline, the CI build fails indicating the fact of code degradation. Once the regression fault is discovered, it needs to be localized and fixed in a timely manner. Fault localization remains mostly a manual process, but there have been attempts to automate it. One well-known technique for this purpose is delta debugging algorithm. It accepts as input a set of all changes between two program versions and a regression test that captures the fault, and outputs a minimized set containing only those changes that directly contribute to the fault (in other words, are failure-inducing).
    [Show full text]
  • Repairnator Patches Programs Automatically
    Repairnator patches programs automatically Ubiquity Volume 2019, Number July (2019), Pages 1-12 Martin Monperrus, Simon Urli, Thomas Durieux, Matias Martinez, Benoit Baudry, Lionel Seinturier DOI: 10.1145/3349589 Repairnator is a bot. It constantly monitors software bugs discovered during continuous integration of open-source software and tries to fix them automatically. If it succeeds in synthesizing a valid patch, Repairnator proposes the patch to the human developers, disguised under a fake human identity. To date, Repairnator has been able to produce patches that were accepted by the human developers and permanently merged into the code base. This is a milestone for human-competitiveness in software engineering research on automatic program repair. Programmers have always been unhappy that mistakes are so easy, and they need to devote much of their work to finding and repairing defects. Consequently, there has been considerable research on detecting bugs, preventing bugs, or even proving their absence under specific assumptions. Yet, regardless of how bugs are found, this vast research has assumed that a human developer would fix the bugs. A decade ago, a group of researchers took an alternative path: regardless of how a bug is detected, they envisioned that an algorithm would analyze it to synthesize a patch. The corresponding research field is called "automated program repair." This paradigm shift has triggered a research thrust with significant achievements [1]: Some past bugs in software repositories would have been amenable to automatic correction. In this following example, a patch fixes an if- statement with incorrect condition (a patch inserts, deletes, or modifies source code): - if (x < 10) + if (x ≤ 10) foo(); Armed with the novel mindset of repairing instead of detecting, and with opulent and cheap computing power, it is possible today to automatically explore large search spaces of possible patches.
    [Show full text]
  • Karampatsis2021.Pdf (1.617Mb)
    This thesis has been submitted in fulfilment of the requirements for a postgraduate degree (e.g. PhD, MPhil, DClinPsychol) at the University of Edinburgh. Please note the following terms and conditions of use: This work is protected by copyright and other intellectual property rights, which are retained by the thesis author, unless otherwise stated. A copy can be downloaded for personal non-commercial research or study, without prior permission or charge. This thesis cannot be reproduced or quoted extensively from without first obtaining permission in writing from the author. The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the author. When referring to this work, full bibliographic details including the author, title, awarding institution and date of the thesis must be given. Scalable Deep Learning for Bug Detection Rafael-Michael Karampatsis I V N E R U S E I T H Y T O H F G E R D I N B U Doctor of Philosophy Institute for Language, Cognition and Computation School of Informatics University of Edinburgh 2020 Dedicated to my brother Elias and to my lovely parents for encouraging me to never stop pursuing my dreams Declaration I declare that this thesis has been composed by myself and that this work has not been submitted for any other degree or professional qualification. I confirm that the work submitted is my own, except where work which has formed part of jointly-authored publications has been included. My contribution and those of the other authors to this work have been explicitly indicated at the beginning of each chapter.
    [Show full text]
  • Bug Assignment: Insights on Methods, Data and Evaluation
    Bug Assignment: Insights on Methods, Data and Evaluation by Ali Sajedi Badashian A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy Department of Computing Science University of Alberta © Ali Sajedi Badashian, 2018 Abstract The bug-assignment problem is prevalently defined as ranking developers based on their competence to fix a given bug. Previous methods in the area used machine-learning or information-retrieval techniques and considered textual elements of bug reports as evidence of expertise of developers to give each of the developers a score and sort the developers for the given bug report. Despite the importance of the subject and the substantial attention it has received from researchers during last 15 years, still it is a challenging, time-consuming task in large software projects. Even there is still no unanimity on how to validate and comparatively evaluate bug-assignment methods and, often times, methods reported in the literature are not repro- ducible. In this thesis, we make the following contributions. 1) We investigate the effect of three important experimental-design parameters in the previous research; the evaluation metric(s) they report, their definition of who the real assignee is, and the community of developers they consider as candidate assignees. Supported by our experiment on a comprehensive data set of bugs we collected from Github, we propose a systematic framework for evaluation of bug-assignment research. Addressing those aspects supports better evaluation, enables replication of the study and promotes its usage in other research or industrial applications. 2) We propose a new bug-assignment approach relying on the set of Stack Overflow tags as the thesaurus of programming keywords.
    [Show full text]
  • Dissertation Deep Pattern Mining for Program Repair
    PhD-FSTC-2019-79 The Faculty of Science, Technology and Communication Dissertation Presented on the 18/12/2019 in Luxembourg to obtain the degree of Docteur de l’Université du Luxembourg en Informatique by Kui Liu Born on 11th August 1988 in Jiangxi, China Deep Pattern Mining for Program Repair Dissertation Defense Committee Dr. Yves Le Traon, Dissertation Supervisor Professor, University of Luxembourg, Luxembourg Dr. Tegawendé Bissyandé, Chairman Assistant Professor, University of Luxembourg, Luxembourg Dr. Dongsun Kim, Vice Chairman Quality Assurance Engineer, Furiosa.ai, South Korea Dr. Jacques Klein, Expert Associate Professor, University of Luxembourg, Luxembourg Dr. Andreas Zeller Professor, Saarland University, Saarbrucken, Germany Dr. David Lo Associate Professor, Singapore Management University, Singapore Abstract Error-free software is a myth. Debugging thus accounts for a significant portion of software maintenance and absorbs a large part of software cost. In particular, the manual task of fixing bugs is tedious, error- prone and time-consuming. In the last decade, automatic bug-fixing, also referred to as automated program repair (APR) has boomed as a promising endeavor of software engineering towards alleviating developers’ burden. Several potentially promising techniques have been proposed making APR an increasingly prominent topic in both the research and practice communities. In production, APR will drastically reduce time-to-fix delays and limit downtime. In a development cycle, APR can help suggest changes to accelerate debugging. As an emergent domain, however, program repair has many open problems that the community is still exploring. Our work contributes to this momentum on two angles: the repair of programs for functionality bugs, and the repair of programs for method naming issues.
    [Show full text]
  • Genetic Improvement
    Emergent and Self-Adaptive Systems: Theory and Practice Data Science Institute, University of Lancaster, 19-20th Oct 2017 Genetic Improvement W. B. Langdon Computer Science, University College, London A Comprehensive Survey, IEEE TEVC Starting in 2015 annual workshops on GI 17.10.2017 Genetic Improvement W. B. Langdon Computer Science, University College, London A Comprehensive Survey, IEEE TEVC Starting in 2015 annual workshops on GI Genetic Improvement of Programs • What is Genetic Improvement • What has Genetic Improvement done – Technology behind automatic bug fixing – Improvement of existing code: speedup, transplanting, program adaptation • Goals of Genetic Improvement – more automated/higher level programming • What can GI do for Emergent and Self-Adaptive Systems? W. B. Langdon, UCL 3 What is Genetic Improvement W. B. Langdon, UCL 4 Genetic Improvement Use GP to evolve a population of computer programs – Start with representation of human written code – Programs’ fitness is determined by running them – Better programs are selected to be parents – New generation of programs are created by randomly combining above average parents or by mutation. – Repeat generations until solution found. Free Free PDF E-book kindle Charles Darwin 1809-1882 Typical GI Evolutionary Cycle Auto-clean up W. B. Langdon, UCL 6 GI Automatic Coding • Genetic Improvement does not start from zero • Use existing system – Source of non-random code – Use existing code as test “Oracle”. (Program is its own functional specification) – Can always compare against previous version – Easier to tell if better than if closer to poorly defined goal functionality. • Testing scales (sort of). Hybrid with “proof” systems 7 What has Genetic Improvement done W.
    [Show full text]
  • Fixing Recurring Crash Bugs Via Analyzing Q&A Sites
    Fixing Recurring Crash Bugs via Analyzing Q&A Sites Qing Gao, Hansheng Zhang, Jie Wang, Yingfei Xiong, Lu Zhang, Hong Mei Key Laboratory of High Confidence Software Technologies (Peking University), MoE Institute of Software, School of Electronics Engineering and Computer Science, Peking University, Beijing, 100871, P. R. China fgaoqing11, zhanghs12, wangjie14, xiongyf04, zhanglu, [email protected] framework, which indicates that Q&A sites are more or less a Abstract—Recurring bugs are common in software systems, reliable source for obtaining fixes for a large portion of bugs. especially in client programs that depend on the same framework. As the first step of fixing recurring bugs via analyzing Existing research uses human-written templates, and is limited to Q&A sites, we focus on a specific class of bugs: crash bugs. certain types of bugs. In this paper, we propose a fully automatic Crash bugs are among the most severe bugs in real-world approach to fixing recurring crash bugs via analyzing Q&A sites. software systems, and a lot of research efforts have been put By extracting queries from crash traces and retrieving a list of Q&A pages, we analyze the pages and generate edit scripts. Then into handling crash bugs, including localizing the causes of we apply these scripts to target source code and filter out the crash bugs [5], keeping the system running under the presence incorrect patches. The empirical results show that our approach of crashes [6], and checking the correctness of fixes to crash is accurate in fixing real-world crash bugs, and can complement bugs [7].
    [Show full text]
  • State-Of-The-Art in Self-Healing and Patch Generation
    ICT PROJECT 258109 Monitoring Control for Remote Software Maintenance Project number: 258109 Project Acronym: FastFix Project Title: Monitoring Control for Remote Software Maintenance Call identifier: FP7-ICT-2009-5 Instrument: STREP Thematic Priority: Internet of Services, Software, Virtualisation Start date of the project: June 1, 2010 Duration: 30 months D6.1: State-of-the-art in Self-Healing and Patch Generation Lead contractor: Lero Benoit Gaudin, Emil Vassev, Mike Hinchey, Paddy Author(s): Nixon, Dennis Pagano, João Coelho Garcia, Nitesh Narayan. Submission date: December 2010 Dissemination level: Public Project co-founded by the European Commission under the Seventh Framework Programme © S2, TUM, LERO, INESC ID, TXT, PRODEVELOP Abstract: As software maintenance processes still rely on human expertise and manual intervention, they usually require important efforts in terms of time and costs. Research has been conducted in the past recent years in order to try to overcome this issue by considering systems that would maintain themselves. This concept gave birth to Autonomic Computing (AC), which aims to transfer and automate part of the maintenance processes to the system itself. Self-healing is one of the properties that AC seeks to equip system with. Self-healing aims for the system to automatically recover from faults, possibly fixing them through patch generation. This document introduces self-healing approaches as well as other research areas that are very related to this concept, such as fault-tolerance, Automatic Diagnosis and Automatic Repair. The latter contains works related to automatic patch generation. This document has been produced in the context of the FastFix Project. The FastFix project is part of the European Community’s Seventh Framework Program for research and development and is as such funded by the European Commission.
    [Show full text]
  • Obstacles in Fully Automatic Program Repair: a Survey S
    Obstacles in Fully Automatic Program Repair: A survey S. Amirhossein Mousavi, Donya Azizi Babani, Francesco Flammini Abstract The current article is an interdisciplinary attempt to decipher automatic program repair processes. The review is done by the manner typical to human science known as “diffraction”. We attempt to spot a gap in the literature of self-healing and self-repair operations and further investigate the approaches that would enable us to tackle the problems we face. As a conclusion, we suggest a shift in the current approach to automatic program repair operations in order to attain our goals. The emphasis of this review is to achieve “full automation.” Several obstacles are shortly mentioned in the current essay but the main shortage that is covered is the “overfitting” obstacle, and this particular problem is investigated in the stream that is related to “full automation” of the repair process. 1. Introduction In today’s world, security and dependability happen to become the most significant priority of system requirements whereby the system functions even in the presence of faults or cyber- attacks. An exhaustive review; “Epic failures: 11 infamous software bugs”, [1] illustrates most famous software system disasters caused by the system bug which has resulted in massive financial losses and even human lives losses. A bug or failure can cause drastic and unrecoverable losses in crucial and sensitive systems such as aviation systems, highways or railways traffic control systems etc. Debugging processes almost impose up to 50% of the development expenses of software production [2], [3]. However, by thoroughly investigating the existing literature at hand, one may encounter various definitions, explanations, and synonyms for bugs, fault, failure, defects, etc.
    [Show full text]
  • Tailoring Programs for Static Analysis Via Program Transformation
    Tailoring Programs for Static Analysis via Program Transformation Rijnard van Tonder Claire Le Goues School of Computer Science School of Computer Science Carnegie Mellon University Carnegie Mellon University USA USA [email protected] [email protected] ABSTRACT analyses in practice. Analyses must approximate, both theoreti- Static analysis is a proven technique for catching bugs during soft- cally [22, 25] and in the interest of practicality [9, 11]. Developers ware development. However, analysis tooling must approximate, are sensitive to whether tools integrate seamlessly into their exist- both theoretically and in the interest of practicality. False positives ing workflows [21]; at minimum analyses must be fast, surface few are a pervading manifestation of such approximations—tool con- false positives, and provide mechanisms to suppress warnings [10]. figuration and customization is therefore crucial for usability and The problem is that the choices that tools make toward these ends directing analysis behavior. To suppress false positives, develop- are broadly generic, and the divergence between tool assumptions ers readily disable bug checks or insert comments that suppress and program reality (i.e., language features or quality concerns) spurious bug reports. Existing work shows that these mechanisms can lead to unhelpful, overwhelming, or incorrect output [20]. fall short of developer needs and present a significant pain point Tool configuration and customization is crucial for usability and for using or adopting analyses. We draw on the insight that an directing analysis behavior. The inability to easily and selectively analysis user always has one notable ability to influence analysis disable analysis checks and suppress warnings presents a significant behavior regardless of analyzer options and implementation: modi- pain point for analysis users [10].
    [Show full text]
  • Automatic Patch Generation Learned from Human-Written Patches
    A Critical Review of “Automatic Patch Generation Learned from Human-Written Patches”: Essay on the Problem Statement and the Evaluation of Automatic Software Repair Martin Monperrus University of Lille & INRIA, France [email protected] The automatic detection of bugs has been a vast research field for decades, with a large spectrum of static and dy- ABSTRACT namic techniques. Active research on the automatic repair1 of bugs is more recent. A seminal line of research started in At ICSE’2013, there was the first session ever dedicated to 2009 with the GenProg system [37, 15], and at the 2013 In- automatic program repair. In this session, Kim et al. pre- ternational Conference on Software Engineering, there was sented PAR, a novel template-based approach for fixing Java the first session ever dedicated to automatic program repair. bugs. We strongly disagree with key points of this paper. The PAR system [19] was presented there, it is an ap- Our critical review has two goals. First, we aim at explain- proach for automatically fixing bugs of Java code. The re- ing why we disagree with Kim and colleagues and why the pair problem statement is the same as GenProg [15] “given a reasons behind this disagreement are important for research test suite with at least one failing test, generate a patch that on automatic software repair in general. Second, we aim makes all test cases passing”. PAR introduces a new tech- at contributing to the field with a clarification of the essen- nique to fix bugs, based on templates. Each of PAR’s ten tial ideas behind automatic software repair.
    [Show full text]
  • Leveraging Light-Weight Analyses to Aid Software Maintenance
    Leveraging Light-Weight Analyses to Aid Software Maintenance A Dissertation Presented to the Faculty of the School of Engineering and Applied Science University of Virginia In Partial Fulfillment of the requirements for the Degree Doctor of Philosophy (Computer Science) by Zachary P. Fry May 2014 c 2014 Zachary P. Fry Abstract While software systems have become a fundamental part of modern life, they require maintenance to continually function properly and to adapt to potential environment changes [1]. Software maintenance, a dominant cost in the software lifecycle [2], includes both adding new functionality and fixing existing problems, or “bugs,” in a system. Software bugs cost the world’s economy billions of dollars annually in terms of system down-time and the effort required to fix them [3]. This dissertation focuses specifically on corrective software maintenance — that is, the process of finding and fixing bugs. Traditionally, managing bugs has been a largely manual process [4]. This historically involved developers treating each defect as a unique maintenance concern, which results in a slow process and thus a high aggregate cost for finding and fixing bugs. Previous work has shown that bugs are often reported more rapidly than companies can address them, in practice [5]. Recently, automated techniques have helped to ease the human burden associated with maintenance activities. However, such techniques often suffer from a few key drawbacks. This thesis argues that automated maintenance tools often target narrowly scoped problems rather than more general ones. Such tools favor maximizing local, narrow success over wider applicability and potentially greater cost benefit. Additionally, this dissertation provides evidence that maintenance tools are traditionally evaluated in terms of functional correctness, while more practical concerns like ease-of-use and perceived relevance of results are often overlooked.
    [Show full text]