Program Committee MSR 2005

Proceedings
2nd International Workshop on Mining Software Repositories
MSR 2005

Saint Louis, Missouri, USA
17th May 2005
Co-located with the International Conference on Software Engineering (ICSE 2005)

Edited by Ahmed E. Hassan, Richard C. Holt, and Stephan Diehl

Contents

Message from the Workshop Chairs
Program Committee
Additional Reviewers
Program

Understanding Evolution and Change Patterns
- Understanding Source Code Evolution Using Abstract Syntax Tree Matching
  Iulian Neamtiu, Jeffrey Foster, and Michael Hicks
- Recovering System Specific Rules from Software Repositories
  Chadd Williams and Jeffrey K. Hollingsworth
- Mining Evolution Data of a Product Family
  Michael Fischer, Johann Oberleitner, Jacek Ratzinger, and Harald Gall
- Using a Clone Genealogy Extractor for Understanding and Supporting Evolution of Code Clones
  Miryung Kim and David Notkin

Defect Analysis
- When do changes induce fixes?
  Jacek Śliwerski, Thomas Zimmermann, and Andreas Zeller
- Error Detection by Refactoring Reconstruction
  Carsten Görg and Peter Weißgerber

Education
- Software Repository Mining with Marmoset: An Automated Programming Project Snapshot and Testing System
  Jaime Spacco, Jaymie Strecker, David Hovemeyer, and William Pugh
- Mining Student CVS Repositories for Performance Indicators
  Keir Mierle, Kevin Laven, Sam Roweis, and Greg Wilson

Text Mining
- Toward Mining "Concept Keywords" from Identifiers in Large Software Projects
  Masaru Ohba and Katsuhiko Gondow
- Source code that talks: an exploration of Eclipse task comments and their implication to repository mining
  Annie Ying, James Wright, and Steven Abrams
- Text Mining for Software Engineering: How Analyst Feedback Impacts Final Results
  Jane Huffman Hayes, Alex Dekhtyar, and Senthil Karthikeyan Sundara

Software Changes and Evolution
- Analysis of Signature Change Patterns
  Sunghun Kim, James Whitehead, and Jennifer Bevan
- Improving Evolvability through Refactoring
  Jacek Ratzinger, Michael Fischer, Johann Oberleitner, and Harald Gall
- Linear Predictive Coding and Cepstrum coefficients for mining time variant information from software repositories
  Giuliano Antoniol, Vincenzo Fabio Rollo, and Gabriele Venturi

Process and Collaboration
- Repository Mining and Six Sigma for Process Improvement
  Michael VanHilst, Pankaj Garg, and Christopher Lo
- Mining Version Histories for Verifying Learning Process of Legitimate Peripheral Participants
  Shih-Kun Huang and Kang-Min Liu

Taxonomies & Formal Representations
- Towards a Taxonomy of Approaches for Mining of Source Code Repositories
  Huzefa Kagdi, Michael Collard, and Jonathan Maletic
- A Framework for Describing and Understanding Mining Tools in Software Development
  Daniel German, Davor Cubranic, and Margaret-Anne D. Storey
- SCQL: A formal model and a query language for source control repositories
  Abram Hindle and Daniel German

Integration and Collaboration
- Developer identification methods for integrated data from various sources
  Gregorio Robles and Jesús M. González-Barahona
- Accelerating Cross-Project Knowledge Collaboration Using Collaborative Filtering and Social Networks
  Masao Ohira, Naoki Ohsugi, Tetsuya Ohoka, and Ken-ichi Matsumoto
- Collaboration Using OSSmole: A repository of FLOSS data and analyses
  Megan Conklin, James Howison, and Kevin Crowston

Message from the Workshop Chairs
MSR 2005

Welcome to MSR 2005, the 2nd International Workshop on Mining Software Repositories. MSR 2005 brings together researchers and practitioners to consider methods of using data stored in software repositories to further the understanding of software development practices. We expect the presentations and discussions in this workshop to help define challenges, ideas, and approaches for transforming software repositories from static record-keeping systems into active repositories, used by researchers to gain an empirical understanding of software development and by software practitioners to predict and plan various aspects of their projects.

We received a large number of submissions: 38 papers from 14 countries. After the review process, 22 papers were chosen for publication. All accepted papers are presented. To fit all talks within the workshop day, and based on input from the Program Committee during the review process, 11 papers are presented as Regular talks and 11 papers are presented as Lightning talks. Following the Lightning talks, we have allocated an hour for informal discussions and demos, in which attendees are encouraged to interact with all presenters on topics of interest.

We are grateful to the reviewers for their excellent and professional work on such a tight schedule.

Ahmed E. Hassan and Richard C. Holt, University of Waterloo
Stephan Diehl, Catholic University Eichstätt

Program Committee
MSR 2005

- Alexander Dekhtyar, University of Kentucky, USA
- Premkumar T. Devanbu, University of California at Davis, USA
- Stephen G. Eick, SSS Research Inc., USA
- Harald Gall, University of Zurich, Switzerland
- Les Gasser, University of Illinois at Urbana Champaign, USA
- Daniel German, University of Victoria, Canada
- Jane Huffman Hayes, University of Kentucky, USA
- Katsuro Inoue, Osaka University, Japan
- Philip Johnson, University of Hawaii, USA
- Timothy C. Lethbridge, University of Ottawa, Canada
- Gail Murphy, University of British Columbia, Canada
- Audris Mockus, Avaya Labs Research, USA
- Thomas J. Ostrand, AT&T Research, USA
- Dewayne Perry, University of Texas, USA
- Jelber Sayyad Shirabad, University of Ottawa, Canada
- Annie Ying, IBM Research, USA
- Andreas Zeller, Saarland University, Germany

Additional Reviewers
MSR 2005

- Davor Cubranic, University of Victoria, Canada
- Cory Kapser, University of Waterloo, Canada
- Jingwei Wu, University of Waterloo, Canada
- Thomas Zimmermann, Saarland University, Germany

Program
MSR 2005: International Workshop on Mining Software Repositories
msr.uwaterloo.ca

9:00-9:15  Welcome and Introduction
  Ahmed E. Hassan, Richard C. Holt, and Stephan Diehl

9:15-10:30  Session 1: Understanding Evolution and Change Patterns
- Understanding Source Code Evolution Using Abstract Syntax Tree Matching
  Iulian Neamtiu, Jeffrey Foster, and Michael Hicks (University of Maryland)
- Recovering System Specific Rules from Software Repositories
  Chadd Williams and Jeffrey K. Hollingsworth (University of Maryland)
- Mining Evolution Data of a Product Family
  Michael Fischer, Johann Oberleitner, Jacek Ratzinger (Vienna University of Technology), and Harald Gall (University of Zürich)
- Using a Clone Genealogy Extractor for Understanding and Supporting Evolution of Code Clones
  Miryung Kim and David Notkin (University of Washington)

10:30-11:00  Coffee Break

11:00-11:45  Session 2: Defect Analysis
- When do changes induce fixes?
  Jacek Śliwerski (Max Planck Institute for Computer Science), Thomas Zimmermann, and Andreas Zeller (Saarland University)
- Error Detection by Refactoring Reconstruction
  Carsten Görg (Saarland University) and Peter Weißgerber (Catholic University Eichstätt)

11:45-12:30  Session 3: Education
- Software Repository Mining with Marmoset: An Automated Programming Project Snapshot and Testing System
  Jaime Spacco, Jaymie Strecker, David Hovemeyer, and William Pugh (University of Maryland)
- Mining Student CVS Repositories for Performance Indicators
  Keir Mierle, Kevin Laven, Sam Roweis, and Greg Wilson (University of Toronto)

12:30-1:45  Lunch

Session 4: Lightning Talks (5 mins each) and Walkaround Presentations

Session 4A: Text Mining
- Toward Mining "Concept Keywords" from Identifiers in Large Software Projects
  Masaru Ohba and Katsuhiko Gondow (Tokyo
