Extracting Concise Bug-Fixing Patches from Human-Written Patches in Version Control Systems

Yanjie Jiang1, Hui Liu1∗, Nan Niu2, Lu Zhang3, Yamin Hu1

1School of Computer Science and Technology, Beijing Institute of Technology, China
2Department of Electrical Engineering and Computer Science, University of Cincinnati, USA
3Key Laboratory of High Confidence Software Technologies, Peking University, China

Email: {yanjiejiang,liuhui08,
[email protected],
[email protected],
[email protected]

Abstract—High-quality and large-scale repositories of real bugs and their concise patches collected from real-world applications are critical for research in the software engineering community. In such a repository, each real bug is explicitly associated with its fix. Therefore, on one side, the real bugs and their fixes may inspire novel approaches for finding, locating, and repairing software bugs; on the other side, the real bugs and their fixes are indispensable for rigorous and meaningful evaluation of approaches for software testing, fault localization, and program repair. To this end, a number of such repositories, e.g., Defects4J, have been proposed. However, such repositories are rather small because their construction involves expensive human intervention.

patches may also inspire novel ideas in finding, locating, and repairing software bugs. For example, by analyzing real bugs, researchers could identify what kinds of statements are more error-prone and thus try to repair such statements first during automatic program repair [15]. Another typical example is the common fix patterns learned from human-written patches [16]. Leveraging such patterns significantly increased the performance of automatic program repair [16]. Finally, data-driven and learning-based approaches in automatic pro-