On the Evolution of Source Code and Software Defects

Doctoral Dissertation submitted to the Faculty of Informatics of the Università della Svizzera Italiana in partial fulfillment of the requirements for the degree of Doctor of Philosophy

presented by Marco D’Ambros

under the supervision of Prof. Dr. Michele Lanza

October 2010

Dissertation Committee

Prof. Dr. Carlo Ghezzi, Politecnico di Milano, Italy
Prof. Dr. Cesare Pautasso, Università della Svizzera Italiana, Switzerland

Prof. Dr. Harald C. Gall, University of Zurich, Switzerland
Prof. Dr. Hausi A. Müller, University of Victoria, Canada

Dissertation accepted on 19 October 2010

Prof. Dr. Michele Lanza
Research Advisor, Università della Svizzera Italiana, Switzerland

Prof. Dr. Michele Lanza
PhD Program Director

I certify that except where due acknowledgement has been given, the work presented in this thesis is that of the author alone; the work has not been submitted previously, in whole or in part, to qualify for any other academic award; and the content of the thesis is the result of work which has been carried out since the official commencement date of the approved research program.

Marco D’Ambros
Lugano, 19 October 2010

To Anna

Abstract

Software systems are subject to continuous changes to adapt to new and changing requirements. This phenomenon, known as software evolution, leads in the long term to software aging: The size and the complexity of systems increase, while their quality decreases. In this context, it is no wonder that software maintenance claims the largest part of a software system’s cost. The analysis of software evolution helps practitioners deal with the negative effects of software aging.

With the advent of the Internet and the consequent widespread adoption of distributed development tools, such as software configuration management and issue tracking systems, a vast amount of valuable information concerning software evolution has become available. In the last two decades, researchers have focused on mining and analyzing this data, residing in various software repositories, to understand software evolution and support maintenance activities. However, most approaches target a specific maintenance task and consider only one of the several facets of software evolution. Such approaches, and the infrastructures that implement them, cannot be extended to address different maintenance problems.

In this dissertation, we propose an integrated view of software evolution that combines different evolutionary aspects. Our thesis is that an integrated and flexible approach supports an extensible set of software maintenance activities. To this aim, we present a meta-model that integrates two aspects of software evolution: source code and software defects. We implemented our approach in a framework that, by retrieving information from source code and defect repositories, serves as a basis to create analysis techniques and tools. To show the flexibility of our approach, we extended our meta-model and framework with e-mail information extracted from development mailing lists.
To validate our thesis, we devised and evaluated, on top of our approach, a number of novel analysis techniques that achieve two goals:

1. Inferring the causes of problems in a software system. We propose three retrospective analysis techniques, based on interactive visualizations, to analyze the evolution of source code, software defects, and their co-evolution. These techniques support various maintenance tasks, such as system restructuring, re-documentation, and identification of critical software components.

2. Predicting the future of a software system. We present four analysis techniques aimed at anticipating the locations of future defects, and at investigating the impact of certain source code properties on the presence of defects. They support two maintenance tasks: defect prediction and software quality analysis.

By creating our framework and the mentioned techniques on top of it, we provide evidence that an integrated view of software evolution, combining source code and software defect information, supports an extensible set of software maintenance tasks.

Acknowledgements

Life is relationship, and relationships shape everything we do. The work presented in this thesis would not have been possible without the support and influence of a number of people whom I had the fortune and the privilege to meet.

First of all, I would like to thank my advisor Michele Lanza for his guidance and support during my Ph.D. Many thanks also for your patience in dealing with my strange and improvised ideas (e.g., “let’s write a paper”—a couple of days before the deadline). Michele, what I learned from you during these years goes far beyond the work presented in this thesis and research in general. I wish you, Marisa, Alessandro, and the new member of the family all the best.

Special thanks go to the members of my dissertation committee—Harald Gall, Carlo Ghezzi, Hausi Müller, and Cesare Pautasso—for dedicating a part of their precious time to evaluating my work, and for providing very valuable and detailed feedback. There is something special that I really appreciated about each of you: Harald, I learned a lot from you about how to be very focused and professional while also having fun; Carlo, you manage to be among the world’s top scientists in the software engineering scene and, at the same time, such a cool person; Hausi, I really admire the energy and passion that you put into whatever you do, from reviewing my thesis to playing soccer and climbing the Great Wall; Cesare, thanks for your positive and constructive feedback, which boosted my motivation in the last moments before the defense.

To my colleagues, the former and current members of the REVEAL research group: It has been a pleasure and a privilege to work with you all. I always enjoyed the easy atmosphere in our office, the jokes, and the sense of playfulness alternating with sessions of hard work. Romain “Omar” Robbes, you were an example for me, and I learned a lot from you. Thanks for always being kind and offering your help.
Mircea “the long” Lungu, I am glad that I met such a warm and cheery person as you. I will always remember the awesome trips we took around the world, and I am certain that our friendship will last no matter where we are or what we do. With Ricky “the vettel” Wettel I always had fun, exchanging jokes, reading funny blogs, and watching nonsensical clips. I already miss the time we spent together. My best wishes to you and Simi; I am sure that you will be fantastic parents. Lile “the thermostat” Hattori, thanks for putting up with my continuous jokes and for the Eclipse-related help... without forgetting the precious (?) Google stickers. Fernando “beautiful coder” Olivero, your peacefulness and passion for OO programming are contagious. It was fun playing soccer together, and I am looking forward to visiting you in the southernmost city of the world and having a full-fledged asado together (and perhaps a cup of tea, lad). Alberto “12 O’clock” Bacchelli, even if our personalities are opposites (i.e., the chaotic vs. the precise), we got along well from the very first moment. You are a very generous person whom one can always rely on. Working with you was always instructive and pleasant, both in teaching and in research: I especially enjoyed our missions impossible such as FASE, QSIC, and ICPC. I really hope that we will continue to collaborate, and I wish you and Sara a marvelous future together. Tommaso “graspa” Dal Sasso, you just joined the REVEAL group, but your sharp sense of humor is already spreading.

I am also grateful to all the people I had the pleasure to meet during my stay in Lugano: Mehdi, Shima, Alessio, Paolo, Alessandra, Domenico, Mostafa, Nicolas, Marco, Jeff, Giovanni, Francesco, Cédric, Dmitrijs, Edgar, Milan, Morgan, Sasa, Cyrus, Adina, Antonio, and many others too numerous to list. To our secretaries, Cristina, Danijela, Elisa and Nina: Thanks for always trying to fulfill my improbable requests while keeping a smile. Many thanks go to a number of people whom I regularly met at conferences and workshops: Martin, Ahmed, Ettore, Carl, Jonathan, Andi, Denys, Rocco, Filippo, Max, Doru, and many others. We had fruitful discussions, which influenced my work, and a great deal of fun too.

I owe very special thanks to many people who have nothing to do with my research, but who contributed to this work by making my life much better. I am grateful to Zamo and Lukino, my best friends: You were, are, and will always be a firm reference point for me. To my friends Polly, Carmelo, Sara, Marcoalto, Elegance, Vivandiere, Vittorio, Alice, Emi, Simo, Ema, Samba, Pablo, Franco, Dany, Michela, Panji, Davide, Giovanni, Betty, Giulio, Giuseppe, Barla, Lucky, Salvo, Pigei, Antonio, Cucia, Raffo, Vito, Nunzia, Rossella, Paolo, Orla, Griso, Ampio, Stefano, Tala, Nico, Luca, Elsa, and many others too numerous to mention: In different ways and at different times you all played an important role in my life. Thanks for all the crazy adventures and cool stuff we did together, which were always a source of fun, joy, and enthusiasm.

To the young families that share with us the experience of having a baby: Vale, Dany and Samu (the eagle king); Fra, Peppe, Gaia and Alice; Sheila, Vale and Marco; Vitto, Ricky, Giovanni and Lucia; Anna, Poldis and Francy. I would like to thank you for your support, the precious advice, and especially for the good times spent together.
To all the kids, boys, and girls whom I had the honor to accompany during a small stretch of their path: Thanks for teaching me so well the importance of playing, and of not taking ourselves too seriously.

Very special thanks go to the members of my amazing family, which I am really proud to be part of. To my mom and dad, thanks for your unconditional love and support, and for everything I have learned from you, which made me the way I am. My aunt Anna (“super”), you were always part of the family, and I consider you a wise older sister. Paola, my oldest sister, you have been an example for me ever since I was a kid. When you and your family—Ben, Michael, Johannes, and Anne—visited us, you always brought good cheer and a great deal of fantasy. Luca, you are a special brother and a mould-breaking person. Despite the thousands of kilometers that separate us, I could always rely on you in the most important moments of my life. To my sister Chiara, thanks for everything you did for me, and for always being available and helpful. With my youngest sister Marta I have always had a special relationship. Your spontaneity and cheerfulness make the people around you happy. I wish you and Paoleous all the very best. My gratitude goes also to my extended family. Andrea, Michela, Samu, Franco, and Mariangela: Your warmth and kindness made me feel part of the family from the very first moment.

Last but not least, my love and gratefulness go to my son Matteo: Thanks for your big and curious eyes, for reaching your little hands out to me, and especially for reminding me every day that happiness resides in the simplest things. To the new baby who is flourishing with his/her mom: I can hardly wait to meet you and hold you in my arms! And most of all, to the person whose smile changed my life: Anna, this work is dedicated to you, as it would have been impossible to accomplish without your love and support.
I cannot imagine my life without you: My love and gratitude extend beyond words and reasons.

Marco D’Ambros
October 2010

Contents

Contents ix

List of Figures xv

List of Tables xix

I Prologue 1

1 Introduction 3
1.1 Our Thesis ...... 5
1.2 Contributions ...... 6
1.2.1 Modeling ...... 6
1.2.2 Analysis ...... 6
1.2.3 Tools ...... 7
1.2.4 A Side Track: Collaborative Software Evolution Analysis ...... 7
1.3 Roadmap ...... 8

2 State of the Art 11
2.1 Genesis of our Approach ...... 11
2.1.1 The Seventies ...... 11
2.1.2 The Eighties ...... 12
2.1.3 From the Nineties to Mining Software Repositories ...... 12
2.1.4 The Advent of Mining Software Repositories ...... 13
2.1.5 Problems of Software Development Tools for MSR ...... 14
2.1.6 Summing Up and Contextualizing our Approach ...... 16
2.2 Modeling Software Evolution ...... 18
2.2.1 Summing Up ...... 22
2.3 Software Evolution Visualization ...... 22
2.3.1 Software Visualization ...... 22
2.3.2 Evolutionary Visualization ...... 25
2.3.3 Summing Up ...... 27
2.4 Change Coupling Analysis ...... 28
2.4.1 Change Coupling Visualization ...... 28
2.4.2 Summing Up ...... 29
2.5 Defect Prediction and Analysis ...... 30


2.5.1 Defect Analysis ...... 30
2.5.2 Defect Prediction ...... 32
2.6 Design Flaw Detection and Analysis ...... 35
2.6.1 Summing Up ...... 36
2.7 Summary ...... 36

3 Modeling and Supporting Software Evolution 39
3.1 Modeling an Evolving Software System ...... 39
3.1.1 Modeling Source Code ...... 40
3.1.2 Modeling Software Defects ...... 42
3.1.3 Mevo: A Meta-Model of Evolving Software Systems ...... 46
3.2 Our Approach in Action ...... 47
3.2.1 Retrieving the Data and Populating the Models ...... 48
3.2.2 Pre-processing and Linking the Data ...... 48
3.2.3 Limitations ...... 52
3.3 Applications ...... 54
3.4 Tool Support ...... 56
3.4.1 Churrasco’s Architecture ...... 57
3.4.2 Discussion ...... 60
3.5 Summary ...... 62

II Inferring Causes of Problems 63

4 Analyzing Integrated Change Coupling Information 67
4.1 The Evolution Radar ...... 68
4.1.1 Example ...... 70
4.1.2 Change Coupling Measure ...... 70
4.1.3 Interaction ...... 71
4.1.4 Discussion ...... 74
4.2 Using the Evolution Radar for Retrospective Analysis ...... 74
4.2.1 Azureus ...... 74
4.2.2 ArgoUML ...... 80
4.2.3 Discussion ...... 83
4.3 Supporting Maintenance with the Radar ...... 83
4.3.1 Integration in the IDE ...... 84
4.3.2 Experimental Evaluation ...... 86
4.3.3 Lessons Learned ...... 88
4.4 Summary ...... 88

5 Analyzing the Evolution of Software Defects 89
5.1 Visualizing a Bug Database ...... 90
5.1.1 The System Radiography View ...... 90
5.1.2 The Bug Watch View ...... 93
5.2 Experiments ...... 97
5.2.1 Analysis in the Large ...... 97
5.2.2 Analysis in the Small ...... 100
5.3 Discussion ...... 102

5.4 Summary ...... 102

6 Code and Bug Co-Evolution Analysis 103
6.1 Visualizing Evolving Software ...... 104
6.2 Co-Evolutionary Patterns ...... 108
6.2.1 Persistent ...... 109
6.2.2 Day-Fly ...... 110
6.2.3 Introduced for Fixing ...... 111
6.2.4 Stabilization ...... 112
6.2.5 Addition of Features ...... 113
6.2.6 Bug Fixing ...... 114
6.2.7 Refactoring / Code Cleaning ...... 115
6.2.8 Summing Up ...... 116
6.3 Experiments ...... 117
6.3.1 Case Studies ...... 117
6.3.2 Characterizing the Evolution of the Systems ...... 117
6.4 Limitations ...... 123
6.5 Summary ...... 124

III Predicting the Future 125

7 Predicting Defects with the Evolution of Source Code Metrics 129
7.1 Motivations ...... 130
7.2 Experiments ...... 131
7.2.1 Prediction Task ...... 131
7.2.2 Evaluating the Approaches ...... 131
7.2.3 Tool Support ...... 133
7.3 Bug Prediction Approaches ...... 133
7.3.1 Change Metrics ...... 133
7.3.2 Previous Defects ...... 134
7.3.3 Source Code Metrics ...... 135
7.3.4 Entropy of Changes ...... 135
7.3.5 Churn of Source Code Metrics ...... 137
7.3.6 Entropy of Source Code Metrics ...... 139
7.4 Benchmark Dataset ...... 140
7.5 Results ...... 142
7.6 Threats to Validity ...... 145
7.7 Summary ...... 146

8 Improving Defect Prediction with Information Extracted from E-Mails 147
8.1 Methodology ...... 148
8.1.1 Extending Mevo to Model E-Mail Data ...... 149
8.1.2 Popularity Metrics ...... 150
8.2 Experiments ...... 151
8.2.1 Correlations Analysis ...... 152
8.2.2 Defect Prediction ...... 153
8.3 Discussion ...... 155

8.4 Threats to Validity ...... 156
8.5 Related Work ...... 157
8.6 Summary ...... 157

9 On the Relationship Between Change Coupling and Software Defects 159
9.1 Measuring Change Coupling ...... 160
9.2 Case Studies ...... 163
9.3 Correlation Analysis ...... 163
9.3.1 Results ...... 164
9.3.2 Discussion ...... 166
9.4 Regression Analysis ...... 167
9.4.1 Results ...... 168
9.4.2 Discussion ...... 168
9.5 Threats to Validity ...... 170
9.6 Summary ...... 170

10 On the Relationship Between Design Flaws and Software Defects 171
10.1 Design Flaws and Detection Strategies ...... 172
10.2 Experimental Setup ...... 173
10.3 Experiments ...... 175
10.3.1 Correlation Analysis ...... 176
10.3.2 Delta Analysis ...... 179
10.3.3 Wrapping Up ...... 181
10.4 Threats to Validity ...... 181
10.5 Summary ...... 182

IV Epilogue 183

11 Conclusion 185
11.1 Contributions ...... 186
11.1.1 Modeling and Supporting Software Evolution ...... 186
11.1.2 Analyzing Software Evolution to Infer Causes of Problems ...... 186
11.1.3 Analyzing the Evolution of a Software System to Predict Its Future ...... 187
11.1.4 Tools ...... 188
11.2 Limitations and Future Work ...... 189
11.3 Closing Words ...... 191

V Appendices 193

A Supporting Collaborative Software Evolution Analysis 195
A.1 Churrasco’s Collaboration Support ...... 196
A.1.1 The Visualization Module ...... 196
A.1.2 The Annotation Module ...... 198
A.2 Collaboration in Action ...... 200
A.2.1 Analyzing ArgoUML ...... 200
A.2.2 First Collaboration Experiment: Analyzing JMol ...... 202

A.2.3 Second Collaboration Experiment: Collaborative Restructuring ...... 204
A.3 Discussion ...... 206
A.3.1 Tool Building Issues ...... 206
A.3.2 Wrapping Up ...... 207
A.4 Related Work ...... 207
A.5 Summary ...... 208

B The Meta-Base 209
B.1 Object Persistence ...... 209
B.2 Generating Descriptors with the Meta-base ...... 213
B.3 Example ...... 214
B.4 Summary ...... 215

Bibliography 217

Figures

1.1 The roadmap of our work ...... 9

2.1 A word cloud obtained from the abstracts of the 2008, 2009 and 2010 editions of the Mining Software Repositories conference ...... 14
2.2 An overview of the history of software evolution and MSR ...... 16
2.3 Related work in the context of our thesis roadmap ...... 17

3.1 An example of source code snapshots and SCM meta-data ...... 40
3.2 A class diagram of the versioning system meta-model ...... 41
3.3 Distribution of Mozilla’s bugs according to their lifetime. More than 50% of bugs lived more than six months. ...... 43
3.4 An example of a bug report ...... 44
3.5 A class diagram of the bug meta-model ...... 44
3.6 The Bug Status Transition Graph ...... 45
3.7 Mevo: A meta-model of an evolving software system. It is composed of three linked parts: The versioning system, the source code and the bug meta-models. ...... 46
3.8 The overall view of our approach in action, divided in data retrieval and pre-processing (links inference) and applications ...... 47
3.9 Reconstructing CVS transactions: Fixed and sliding time window ...... 49
3.10 Possible linkings among the different parts of the Mevo meta-model in our implementation ...... 49
3.11 An example of package nesting. With our linking approach the Java class Layout is translated in the path ../org/uml/ui/Layout.java to be found in the versioning system repository. ...... 50
3.12 Linking multiple versions of the Layout FAMIX class with artifact versions of the versioning system model ...... 50
3.13 Linking versioning system artifacts with bug reports: We first detect bug ids in commit comments and then check that the fix was committed (commit timestamp) after the bug was reported (bug creation timestamp). ...... 51
3.14 Linking fixes (commits fixing a bug) with a bug re-opened several times: If there are multiple fixes between two re-opened events, we filter them out, as the data is not consistent (a bug must be re-opened before being fixed again). ...... 52
3.15 The main components of the Churrasco framework and the analysis tools built on top of it ...... 56
3.16 The architecture of Churrasco ...... 57
3.17 The Churrasco web portal ...... 59


3.18 A screenshot of the Churrasco web portal showing a visualization of ArgoUML . . 61

4.1 Principles of the Evolution Radar ...... 68
4.2 An example Evolution Radar applied on the core3 module of Azureus ...... 70
4.3 An example of misleading results when considering the entire history of artifacts to compute the change coupling value: file1 and file2 are not coupled during the last year. ...... 71
4.4 Evolution of the change coupling with the Model module of ArgoUML. The Evolution Radar keeps track of selected files along time (yellow border). ...... 73
4.5 Spawning an auxiliary Evolution Radar ...... 73
4.6 Evolution Radars of the org.gudy.azureus2.ui package of Azureus ...... 75
4.7 Evolution Radars for GlobalManagerImpl and DownloadManagerImpl ...... 76
4.8 Evolution Radar for Azureus’s aelitis.core package and details of the coupling between this package and the class DHTPlugin, February – August 2006 ...... 78
4.9 Evolution Radars for the core3 package of Azureus. They are helpful to detect the transition from the old MainWindow.java to the new one. ...... 79
4.10 Evolution Radars applied to the Explorer module of ArgoUML ...... 80
4.11 Details of the change coupling between ArgoUML’s Explorer module and the classes FigActionState, FigAssociationEnd and FigAssociation ...... 81
4.12 Evolution Radars of the Explorer and Diagram modules of ArgoUML from June to December 2004 ...... 82
4.13 The Evolution Radar integrated in the System Browser IDE. It shows the change coupling between the CodeCityGlyphs module and the rest of CodeCity’s modules. ...... 84
4.14 The change coupling evolution of the class AbstractGlyph ...... 87

5.1 The principles of the System Radiography visualization ...... 91
5.2 A System Radiography of Eclipse from October 2001 to July 2009. Only bugs with new, assigned or reopened statuses are considered. ...... 92
5.3 Counting open bugs over time in a row of the System Radiography view ...... 93
5.4 A Bug Watch Figure visualizing Bug 5119 of Mozilla between Oct 19, 1999 and Oct 16, 2001 ...... 94
5.5 A Bug Watch visualization of the bugs affecting the JDT product of Eclipse. The reference time interval is 20/2/2003 – 5/25/2004. We represent each bug as a Bug Watch Figure. ...... 95
5.6 Clustering bugs according to the similarity of their life cycles. The clustering eases comparing bugs and spotting “exceptional” bugs. ...... 96
5.7 A System Radiography of Mozilla from June 1999 to April 2003. We consider bugs with new, assigned or reopened statuses only. ...... 98
5.8 An extract of a System Radiography of Mozilla from June 1999 to April 2003, zoomed on the y axis. We consider open bugs with critical or blocker severity only. ...... 99
5.9 A Bug Watch visualization of the bugs affecting the Browser::Networking component of Mozilla. The reference time interval is November 2002 – April 2003. ...... 101

6.1 One way of visualizing the co-evolution of code and bugs ...... 104
6.2 The structure of a Discrete Time Figure ...... 105
6.3 The color mapping used in the Discrete Time Figure ...... 105
6.4 Introducing phases in a Discrete Time Figure ...... 106

6.5 Discrete Time Figures applied to a directory tree. The internal color of the rectangles is not visible, while the color of the boundaries is. Thus, phases and addition/removal events are still intelligible. ...... 107
6.6 An example of a persistent pattern. The entity is also bug persistent. ...... 109
6.7 An intensive and persistent pattern. The entity is also bug persistent. ...... 109
6.8 A day-fly pattern ...... 110
6.9 A dead day-fly pattern ...... 110
6.10 An example of the introduced for fixing pattern ...... 111
6.11 A stabilization pattern ...... 112
6.12 An example of the addition of features pattern with the high stable (commits) – high stable (bugs) pair of phases ...... 113
6.13 An example of the bug fixing pattern with the high stable (commits) – stable (bugs) pair of phases ...... 114
6.14 An example of the refactoring / code cleaning pattern with the high stable (commits) – stable (bugs) pair of phases ...... 115
6.15 A visualization of the libjava module of gcc. The area marked as “B” is a visualization of the files belonging to the java/lang directory. ...... 119
6.16 A visualization of the CalendarClient module of Mozilla ...... 122
6.17 An example of a wrong classification: Bar.java is wrongly defined as an introduced for fixing. The cause of the error is that the developer committed a bug fix together with other changes. ...... 123

7.1 An example of entropy of code changes ...... 136
7.2 Computing metric deltas from sampled versions (FAMIX models) of a system ...... 138
7.3 The types of data used by different bug prediction approaches ...... 141

8.1 Overall schema of our approach ...... 149
8.2 Linking e-mails and classes ...... 150

9.1 Sample scenario of classes and transactions ...... 161
9.2 Example EWSOC and LWSOC computations ...... 162
9.3 Correlations between number of defects per class and class level change coupling measures, number of changes, and the best object-oriented metric (i.e., Fan out for Eclipse and ArgoUML, and CBO for Mylyn). The correlations are measured with the Spearman’s coefficient. ...... 164
9.4 Spearman’s correlations between number of major/high priority bugs and change coupling measures ...... 165
9.5 Results of the regression analysis for Eclipse ...... 169

10.1 The Brain Method detection strategy ...... 172
10.2 An example of design flaw matrix for brain methods ...... 175
10.3 Average number of design flaws per class, grouped by flaw ...... 176
10.4 Computing Spearman’s correlations over multiple versions of a system ...... 177
10.5 Spearman’s correlations between number of design flaws and number of post-release defects over multiple versions of software systems ...... 178
10.6 Percentages of systems’ versions with a strong correlation (Spearman’s > 0.4) with post-release defects ...... 179

10.7 Extracting and measuring design flaw addition events ...... 180
10.8 Spearman’s correlations between additions of design flaws and number of generated defects ...... 180

A.1 System Complexity and Correlation View principles ...... 197
A.2 A Correlation view applied to the ArgoUML software system. Nodes represent classes, nodes’ position represents number of lines of code (x) and number of post-release bugs (y), and nodes’ color maps the number of methods. ...... 198
A.3 A screenshot of the Churrasco web portal showing a System Complexity visualization of ArgoUML ...... 199
A.4 An excerpt of a report generated by Churrasco. The entities having one or more annotations are highlighted in red, and the corresponding annotations are provided. ...... 200
A.5 The web portal of Churrasco visualizing System Complexities of the Model package of ArgoUML on the left, and the entire ArgoUML system on the right ...... 201
A.6 Evolution Radars of ArgoUML ...... 202
A.7 A System Complexity of JMol. The color denotes the amount of annotations made by the users. The highlighted classes (green boundaries) are annotated classes. ...... 203

B.1 Types of inheritance in GLORP ...... 212
B.2 Generating and using GLORP descriptors with the Meta-base ...... 213
B.3 The UML class diagram of our example ...... 214
B.4 Examples of generated and populated database tables ...... 215

Tables

1.1 The different aspects of the evolution of source code, software defects and their co-evolution considered in our thesis, and the corresponding maintenance activities supported ...... 5

2.1 A summary of the approaches presented at MSR 2008, 2009 and 2010, classified according to the data they analyzed and the goals they achieved ...... 15
2.2 Approaches modeling software evolution ...... 19
2.3 Software evolution visualization approaches classified according to the abstraction level and the visualized data sources ...... 26
2.4 Confusion matrix: The outcome of classification approaches ...... 32

3.1 Requirements for a software evolution meta-model and framework ...... 39
3.2 Languages and technologies supported in our approach implementation ...... 47
3.3 Commit size in several software projects ...... 54

5.1 Statistics about the bug databases of ArgoUML, Eclipse and Mozilla ...... 90
5.2 The properties of the five areas highlighted in Figure 5.7 ...... 97

6.1 The colors used in the Discrete Time Figure and their meanings ...... 108
6.2 The catalog of co-evolutionary patterns ...... 116
6.3 The dimensions of the software systems used for the experiments ...... 117
6.4 Characterizing Apache and gcc in terms of patterns detected ...... 118
6.5 Characterizing the four biggest modules of Mozilla in terms of patterns detected ...... 121

7.1 Categories of bug prediction approaches ...... 133
7.2 Change metrics used by Moser et al. ...... 134
7.3 Class level source code metrics ...... 135
7.4 Systems in the benchmark ...... 140
7.5 Explanative power for all the bug prediction approaches ...... 142
7.6 Predictive power for all the bug prediction approaches ...... 143

8.1 Class-level popularity metrics ...... 150
8.2 Dataset ...... 152
8.3 Spearman’s and Pearson’s correlation coefficients between number of defects per class and class level popularity metrics (and LOC) ...... 153
8.4 Defect prediction results: Adjusted R² ...... 154


8.5 Defect prediction results: Spearman’s correlation coefficient ...... 155

9.1 Measures of the studied software systems ...... 163

10.1 Software systems used for the experiments ...... 173
10.2 The data set used for the experiments ...... 174

A.1 A subset of the annotations made on Jmol. NOA stands for number of attributes and NOM for number of methods. ...... 204
A.2 A subset of the results from the questionnaire, using a Likert Scale [Lik32] (SA=strongly agree, A=agree, N=Neutral, D=disagree, SD=strongly disagree) ...... 204
A.3 A subset of the annotations made on the Smalltalk web application ...... 205
A.4 A subset of the results from the questionnaire, using a Likert Scale (SA=strongly agree, A=agree, N=Neutral, D=disagree, SD=strongly disagree) ...... 206

Part I

Prologue


Chapter 1

Introduction

In the light of the sheer size and complexity of today’s software systems, it is no wonder that software maintenance takes up the largest part of a software system’s cost. In the last decades, while society has been increasingly relying on software, the cost of software maintenance, compared to the global cost of software, has grown: Researchers estimated it to be between 50% and 75% [Dav95; Som96] in 1995–96, while more recent estimates reach 85–90% [Erl00; SPL03].

The high cost of maintenance results from many factors. Among these, Corbi showed that the system understanding needed to perform maintenance tasks takes up to 50–60% of maintenance time [Cor89]. Outdated, or even missing, documentation [KC98] and the frequent turnover of developers contribute to this condition. Moreover, as a software system increases in size and complexity, maintenance becomes harder. For example, according to Purushothaman and Perry [PP05], 40% of bugs are introduced while fixing other bugs.

The first to observe that maintaining software becomes more complex over time, as software is continuously changed, were Lehman and Belady in 1985 [LB85]. In a set of laws they stated that a system must continuously change to remain useful in a mutating environment, and that—if nothing is done to prevent it—the complexity of the system increases and its quality decreases, i.e., the system undergoes a phenomenon known as “software aging” [Par94]. The only way to control the negative consequences of software aging is to place change and evolution at the center of the software development process [Nie04]. Without specific support for evolution, software systems become more complex, and thus harder to change and maintain. Lehman’s laws [LB85] used the term “software evolution” for the first time, but it was not until the nineties that the term received widespread acceptance.
In the last twenty years, problems concerning software evolution have continued to challenge researchers. Improvements of software development process models [Roy70; Gil77; YCM78; Gil81; Boe88; Bec99; Coc01] and advances in development tools have been influencing this research domain. Software configuration management (SCM) tools were first introduced in 1975 [Roc75], and are now considered fundamental for software development [ELH+05]. The first bug tracking system was implemented in 1992, and nowadays such systems count dozens of implementations. In the nineties, with the advent of the Internet and the improvement in network bandwidth, software development started to be distributed. In this scenario, SCMs obtained widespread attention and usage, and software repositories became available for analysis. The first approach

to analyze such repositories was proposed in 1997 [BAHS97]. Nowadays, as the usage of development tools—such as SCMs and bug tracking systems—has become an established practice, an entire research area is dedicated to mining software repositories (MSR) [HMHJ05]. Researchers, targeting different goals, have proposed a number of approaches to mine software repositories. Some techniques study developers' behavior with the aim of enhancing development tools and environments. Other approaches analyze the data residing in various repositories with two main objectives, related to software evolution and maintenance:

1. Infer the causes of current problems in a software system. Such causes are the focus of software maintenance activities, to keep the system in an evolvable and maintainable state. Approaches in this area aim at detecting poorly designed components [VRD04; FG06], identifying critical bugs [DLP07a], spotting highly coupled modules [GHJ98; PGFL05; BH06; DL06b], etc.

2. Predict the future of a software system. This prediction allows practitioners to anticipate problems and to optimize available maintenance resources. Research in this area addresses questions like: Which component will change the most [GDL04]? Which software entity will generate more bugs [GKMS00; NB05a; NBZ06; NZHZ07; ZPZ07; BEP07]? etc.

In this dissertation, we survey the state of the art in analyzing software evolution and mining software repositories (cf. Chapter 2). Based on the survey, we extract four requirements for an approach that models software evolution and supports various maintenance activities:

1. Integration. There is a problematic gap between the information produced during software development and the information needed for mining software repositories. One of the issues is the lack of integration among the distinct pieces of information produced during software development, such as SCM data, source code, problem reports, etc. With few exceptions [FPG03; GHJ04; uMSB05], all the approaches proposed in the literature focus on a single data source. We argue that the integration of different data sources, which describe different evolutionary aspects, better supports maintenance activities than the analysis of a single data source.

2. Flexibility. Most software evolution and MSR approaches tackle a specific maintenance problem and cannot be adapted to address a different one. We envision a more general approach, on top of which one can build several analysis techniques, supporting different maintenance tasks.

3. Modeling defects. None of the surveyed approaches that consider software defects model their evolution. We argue that defects are first class entities that, like source code, evolve over time.

4. Replicability. Replicability is a significant issue in the MSR research community [Rob10]. Robles observed that most of the experiments in MSR are difficult—when not impossible—to replicate, as the data is not available. We argue that publicly available data makes it possible not only to replicate the experiments, but also to compare and improve analysis techniques.

1.1 Our Thesis

The goal of our thesis work is to create an approach that, by modeling software evolution, supports an extensible set of software maintenance tasks. Various software repositories, containing different types of data, revolve around the evolution of software systems. Among them, we chose to focus on source code and software defects. Source code plays a key role in software maintenance activities, and it is often the only reliable information about a software system, since the documentation is outdated or missing [KC98]. Moreover, source code is where a considerable fraction of maintenance effort is spent: For example, 50% of maintenance time is used just to understand the code [FH83; Sta84; Cor89]. Software defects provide valuable information about software quality and might point to ailing system components, which represent candidates for bug fixing activities. We formulate our thesis as:

An integrated view of software evolution, combining historical information regarding source code and software defects, supports an extensible set of software maintenance tasks, targeted at inferring the causes of problems and predicting the future of software systems.

We analyze several aspects of source code, defects and their evolution, such as change coupling (the evolutionary dependency of software artifacts that frequently change together) or design flaws. Table 1.1 lists all of them and shows the types of data we use and the software maintenance tasks we support. Examples of such tasks are defect prediction, system re-documentation, and restructuring.

Table 1.1. The different aspects of the evolution of source code, software defects and their co-evolution considered in our thesis, and the corresponding maintenance activities supported

Evolutionary aspect                          Data analyzed                            Software maintenance task supported

Co-evolution of source code artifacts        SCM meta-data                            System re-documentation and restructuring;
(change coupling)                                                                     identification of candidates for re-engineering
                                             SCM meta-data and bug repository         Defect prediction

Evolution of defects                         Bug repository                           Characterization and identification of critical
                                                                                      software components and critical bugs

Co-evolution of code and defects             SCM meta-data and bug repository         Characterization and identification of critical
                                                                                      software components

Evolution of source code metrics             Source code snapshots, SCM meta-data     Defect prediction
                                             and bug repository

Popularity metrics extracted from e-mails    Source code snapshots, SCM meta-data,    Defect prediction
                                             mailing lists and bug repository

Evolution of design flaws                    Source code snapshots, SCM meta-data     Software quality analysis
                                             and bug repository
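To make the notion of change coupling used in Table 1.1 concrete, the following sketch counts, from SCM meta-data, how often pairs of artifacts are changed in the same commit. It is an illustration only: the `change_coupling` helper and the sample commit log are hypothetical and not part of the thesis infrastructure.

```python
from collections import Counter
from itertools import combinations

def change_coupling(commits):
    """Count how often each pair of files changes in the same commit.

    `commits` is a list of iterables of file names, one entry per commit.
    Returns a Counter mapping sorted file-name pairs to co-change counts.
    """
    pairs = Counter()
    for files in commits:
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical SCM log: each entry lists the files touched by one commit.
log = [
    {"Parser.java", "Scanner.java"},
    {"Parser.java", "Scanner.java", "AST.java"},
    {"AST.java"},
    {"Parser.java", "Scanner.java"},
]

coupling = change_coupling(log)
# Parser and Scanner changed together in 3 of 4 commits: strongest coupling.
print(coupling.most_common(1))  # [(('Parser.java', 'Scanner.java'), 3)]
```

In practice such raw counts are normalized (e.g., by the number of changes of each artifact) before being interpreted as coupling strength.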

To validate our thesis, we devised an approach that fulfills the four requirements introduced in the previous section. In particular, our approach

• is integrated. It provides an integrated meta-model that combines multiple versions (snapshots) of the source code, the detailed evolution of each source code artifact as recorded by an SCM (SCM meta-data), and the evolution of software defects (bug repository).

• is flexible. We implemented the approach as a framework that serves as a basis to create software evolution analysis techniques. The framework is flexible with respect to both the meta-model and the analysis techniques that can be created on top of it (we created seven of them). As a proof of concept, we extended the meta-model to describe e-mail information and we devised a technique—on top of the framework—that uses this information to support defect prediction (cf. Chapter 8).

• models defects as first class entities. In our meta-model we provide an extensive description of software defects, which includes their histories.

• enables replicability. Our framework features a publicly accessible web interface that allows one to download the models residing in the framework. Other researchers can use such models to replicate the experiments we present.
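To give an intuition of what such an integrated meta-model looks like, the sketch below links the three ingredients named above: code snapshots, SCM meta-data, and defect histories. All class and attribute names here are our own illustration, not the actual Mevo definitions (which are presented in Chapter 3).

```python
from dataclasses import dataclass, field

@dataclass
class BugEvent:          # one step in a defect's history
    timestamp: str
    status: str          # e.g., NEW, ASSIGNED, RESOLVED

@dataclass
class Bug:               # defects are first class entities with a history
    bug_id: int
    history: list = field(default_factory=list)

@dataclass
class FileVersion:       # one revision of an artifact (SCM meta-data)
    path: str
    revision: str
    author: str
    bugs: list = field(default_factory=list)   # links into the bug repository

@dataclass
class Snapshot:          # a full version of the source code
    label: str
    files: list = field(default_factory=list)

# With the links in place, one can navigate from a snapshot to the code
# artifacts it contains, and from there to the defects and their histories:
bug = Bug(42, history=[BugEvent("2009-01-10", "NEW"),
                       BugEvent("2009-02-01", "RESOLVED")])
fv = FileVersion("src/Parser.java", "1.17", "alice", bugs=[bug])
snap = Snapshot("release-1.0", files=[fv])
```

The point of the integration is exactly this navigability across data sources, which single-repository approaches do not offer.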

The validation of our thesis consists in devising and applying several analysis techniques on top of our framework. With these techniques, we show that, by analyzing the evolution of source code and defects, we support the software maintenance tasks listed in Table 1.1. To complete the validation of the thesis, we validate each proposed analysis technique on its own.

1.2 Contributions

The contributions of this dissertation can be classified in three categories: modeling, analysis and tools.

1.2.1 Modeling

1. We define Mevo [DL08b; DL10], an integrated meta-model of software evolution that combines source code information, software defect data and SCM meta-data. We present Mevo in Chapter 3.

2. We model software defects as first class entities, including their history in the Mevo meta-model.

1.2.2 Analysis

3. We developed the Evolution Radar (cf. Chapter 4), a visualization technique that integrates module- and file-level change coupling information [DLL06; DL06b; DGLP08; DLL09]. The technique supports system restructuring, re-documentation and detection of reengineering candidates. We evaluate the Evolution Radar by analyzing two open-source systems and by performing an experiment with a practitioner.

4. We created the System Radiography and Bug Watch views [DLP07a], two visualizations to analyze the evolution of software defects at different levels of granularity (cf. Chapter 5). The views are useful to detect critical software components, to characterize bugs based on their histories, and to identify critical defects. We apply them on the bug database of Mozilla.

5. We devised a visualization technique, called Discrete Time Figure [DL06c; DL09], that supports the understanding of source code and defects co-evolution (cf. Chapter 6). On top of the visualization, we define a catalog of co-evolutionary patterns that are useful to characterize software entities. We use Discrete Time Figures and the catalog of patterns to characterize three large open-source systems.

6. We present two novel defect prediction approaches based on the evolution of source code metrics [DLR10] (e.g., fanIn/Out, coupling) and their comparison with existing defect prediction techniques (cf. Chapter 7).

7. We produced a benchmark for defect prediction [DLR10], in the form of a publicly available dataset consisting of hundreds of versions of several software systems (cf. Chapter 7).

8. We extended Mevo to model e-mail information (cf. Chapter 8). Based on metrics extracted from the extended meta-model, we devised a defect prediction technique and we evaluated its performance on four open-source systems [BDL10].

9. We investigate the correlation between change coupling and software defects on three open-source software systems [DLR09]. We also present a novel defect prediction approach, based on change coupling information, and its evaluation and comparison with existing prediction techniques (cf. Chapter 9).

10. We empirically analyze the relationship between a catalog of design flaws and software defects, on six open-source software systems (cf. Chapter 10). We also study the evolution of the flaws over multiple versions of the systems, to empirically assess whether adding design flaws is likely to generate defects [DBL10].

1.2.3 Tools

11. We built Churrasco [DL08a], an extensible framework that implements the Mevo meta-model and serves as a basis for all our analysis techniques.

12. We created three interactive and scalable visualization tools: Evolution Radar [DL06a], Bug’s Life and BugCrawler [DL07]. These tools implement respectively the Evolution Radar visualization, the System Radiography and Bug Watch views and the Discrete Time Figure visualization.

13. We implemented Pendolino, a scriptable data analysis tool that computes and exports a variety of metrics extracted from Mevo models.

1.2.4 A Side Track: Collaborative Software Evolution Analysis

During the implementation of our software evolution framework, the need for collaboration in software development has gained increasing attention. Tools that support collaboration, such as

Jazz for Eclipse [Fro07], were only recently introduced, but hint at a larger current trend. Just as software development teams are geographically distributed, consultants and analysts are too. Specialists in different domains of expertise should be able to collaborate without being physically present in the same place. For these reasons, we implemented support for collaborative software evolution analysis in our framework and we evaluated it with two experiments [DLLR09; DLLR10]. As this topic lies outside the main target of our thesis, but is still relevant to software evolution analysis, we present it in an appendix (Appendix A).

1.3 Roadmap

Figure 1.1 shows a roadmap of our thesis work. Research topics are placed, according to their similarity, in two main spaces: modeling software evolution and supporting software maintenance. The latter is further divided into two sub-spaces, according to the goal we want to achieve: inferring causes of problems or predicting the future. Research topics that target both goals are placed on top of both spaces. In Figure 1.1, research topics are connected by a path representing the reading flow of this dissertation, where we also indicate the chapters. When applicable, we show the publications and the tools that we implemented next to each topic. The research about collaborative software evolution analysis is connected to the rest of the thesis with a dashed path, since it is a side track, not part of the main topic of the thesis. We structured the remainder of this dissertation as follows:

• In Chapter 2 (p.11) we provide a historical perspective on our work, by presenting the history of software evolution and mining software repositories. Subsequently, we explore approaches and techniques in the domains related to our thesis: modeling software evolution, software visualization, change coupling analysis, bug prediction and analysis, and design flaw detection and analysis.

• In Chapter 3 (p.39) we describe how we tackle the problem of modeling software evolution with Mevo, an integrated meta-model of code and defect evolution. We also present an extensible framework that implements the meta-model, and serves as a basis to create analysis techniques and tools. The framework provides a web interface that can be used to retrieve Mevo models, thus enabling the replication of our experiments.

Part II: Inferring Causes of Problems. In this part of our dissertation, we present different techniques, built on top of our software evolution approach, aimed at inferring causes of problems in software systems, by analyzing their evolution.

• Chapter 4 (p.67) presents the Evolution Radar, a visualization technique that supports system restructuring, re-documentation and identification of reengineering candidates. The Evolution Radar visualizes co-change information, a particular aspect of source code evolution that characterizes software artifacts frequently changed together. We apply the Evolution Radar on two open-source software systems and we let a developer use the tool for a re-documentation task, reporting on his experience.

• In Chapter 5 (p.89) we analyze the evolution of software defects with a visualization technique called Bug's Life. The technique provides two views to study software defects at different granularity levels, enabling the characterization and the detection of critical bugs. We use Bug's Life to analyze the bug repository of Mozilla.

[Figure 1.1 is a graphical roadmap: research topics are laid out in the spaces "Modeling software evolution" (extensible meta-model and integrated framework, Chapter 3) and "Supporting software maintenance", the latter split into "Inferring causes of problems" and "Predicting the future" (Chapters 4–10, concluded by Chapter 11). Each topic is annotated with its publications and tools (Churrasco, Evolution Radar, Bug's Life, BugCrawler, Pendolino), and the topics are connected by the reading-flow path of the dissertation; the collaborative software evolution analysis side track (Appendix A) is attached with a dashed path.]

Figure 1.1. The roadmap of our work

• Chapter 6 (p.103) discusses the co-evolution of source code and software defects. We introduce a visualization technique, called Discrete Time Figure, to study such co-evolution at different granularity levels. Based on the visualization, we define a catalog of co-evolutionary patterns that characterize software artifacts. We apply Discrete Time Figures to three open-source systems, we extract the patterns they contain and we characterize the systems based on the frequency of the patterns.

Part III: Predicting the Future. In the third part of our dissertation, we present different approaches to predict the future of a software system, and in particular to predict the locations of future software defects.

• Chapter 7 (p.129) describes two novel defect prediction approaches based on the evolution of source code metrics. We apply the prediction techniques on five open-source systems and compare their performance with several existing defect prediction models. To let other researchers repeat the experiments or compare other approaches with ours, we made the entire processed data set publicly available.

• We claimed that our meta-model and framework are extensible. In Chapter 8 (p.147), as a proof of concept, we extend them to include e-mail information and we devise a defect prediction technique that uses this new information. We apply the prediction technique on four software systems, showing that it provides an improvement over prediction approaches based only on source code information.

• In Chapter 9 (p.159) we observe that many studies considered change coupling harmful. However, since there was no research about the correlation between change coupling and software defects, we conduct such a study, providing empirical evidence—on three software systems—that such a correlation exists. Moreover, we show that change coupling data can be used to improve existing defect prediction models based on historical information and source code metrics.

• In Chapter 10 (p.171) we analyze the relationship between a catalog of design flaws and software defects, to investigate whether certain types of flaws are more harmful than others. We also study the evolution of the flaws over multiple versions of a system, to detect whether adding design flaws is likely to generate bugs. By examining six open-source systems, we find that, while there is no empirical evidence of one flaw being more harmful than the others, introducing design flaws is likely to generate defects, with patterns that vary from system to system.

Part IV: Epilogue

• In Chapter 11 (p.185) we take a step back from the individual analysis techniques by considering our work as a whole. We conclude this dissertation by discussing our approach, summarizing the contributions of this work and outlining future research directions.

Part V: Appendices

• Appendix A (p.195) discusses how we can support collaboration in software evolution analysis. We present a visualization technique that allows different users to collaboratively analyze a software system and we report on two collaboration experiments performed with students. We build the visualization technique on top of our Churrasco framework. As collaboration is not the main track of the dissertation, we present it in an appendix.

• Appendix B (p.209) provides technical details about the Meta-base component of Churrasco [DLP07b]. The Meta-base supports interoperability by providing flexibility and persistence to any meta-model described according to the EMOF specification.

Note. This dissertation makes intensive use of color pictures. We recommend reading it on screen or as a color-printed paper version.

Chapter 2

State of the Art

In our dissertation we analyze different aspects of the evolution of source code and software defects. These analyses are challenging because developers do not write code to support software evolution analysis. For example, source code repositories contain the history of the code, but extracting the evolution of software artifacts is not trivial and in some cases heuristics have to be used. Another example concerns data integration: Many software projects have both a code and a bug repository, but they are disconnected. To recover the relationship between the two data sources, i.e., which software artifact is affected by which defects, researchers have to devise linking techniques.
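A common family of such linking techniques searches commit messages for bug identifiers. The sketch below illustrates the idea; the regular expression and the sample log are simplified assumptions, since real projects need project-specific patterns and validation of the candidate links against the bug database.

```python
import re

# Simplified pattern matching "bug 123", "fix #123", "issue 123", etc.
# Real commit conventions vary per project; this regex is an assumption.
BUG_REF = re.compile(r"(?:bug|fix(?:es|ed)?|issue)\s*#?\s*(\d+)", re.IGNORECASE)

def link_commits_to_bugs(commits):
    """Map bug ids to the commits whose messages mention them.

    `commits` is a list of (commit_id, message) pairs.
    """
    links = {}
    for commit_id, message in commits:
        for bug_id in BUG_REF.findall(message):
            links.setdefault(int(bug_id), []).append(commit_id)
    return links

# Hypothetical commit log:
log = [
    ("r101", "Fixes bug 4321: NPE in parser"),
    ("r102", "Refactoring, no issue involved"),
    ("r103", "fix #4321 follow-up"),
]
print(link_commits_to_bugs(log))  # {4321: ['r101', 'r103']}
```

Since such links are only as reliable as the developers' commit discipline, approaches typically filter false positives, e.g., by checking that the referenced id exists in the bug repository and that the bug was closed near the commit date.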

2.1 Genesis of our Approach

Before presenting the research areas most related to our work, it is important to understand why there is a gap between software development and mining software repositories, i.e., why code is not developed to support software evolution analysis. To understand the reasons behind this gap, we look at the history of software evolution.

2.1.1 The Seventies

At the first conference on software engineering in 1968 [NR69], software maintenance was considered a post-production activity. The same idea was shared by Royce, who in 1970 introduced the Waterfall life-cycle model for software development [Roy70]. In this model, maintenance is the last phase of the process, after the deployment of the software product. The seventies were also the decade in which Software Configuration Management (SCM) emerged as a discipline. In 1975, the same year in which the first International Conference on Software Engineering (ICSE) was held, Rochkind introduced the first SCM, called Source Code Control System (SCCS) [Roc75]. To be precise, the discipline of configuration management started back in the fifties, when production in the aerospace industry experienced difficulties caused by inadequately documented engineering changes. However, SCCS was the first case in which configuration management was applied to software. In the second half of the seventies, the waterfall model received its first criticisms. In 1976, Mills argued that software development should be incremental with continuous user participation [Mil76]. One year later, in his book "Software Metrics", Gilb proposed evolutionary

project management, and introduced the terms evolution and evolutionary to the process lexicon [Gil77]. In 1978, Yau proposed the change mini-cycle model [YCM78], which introduced evolutionary elements in software development process models. In 1979, Feldman developed the program Make, making an important contribution to SCMs.

2.1.2 The Eighties

During the eighties there were three important events: the creation of software evolution as a discipline, the criticism of the waterfall model, and the introduction of evolutionary process models. In 1980, Manny Lehman in his seminal work [Leh80a; Leh80b] introduced the laws of software evolution. The laws were empirical, based on a study of the change process applied to IBM's OS/360 operating system. Lehman confirmed the software evolution laws on other software systems in 1985 [LB85]. Lehman was the first to use the term Software Evolution to emphasize the difference with the post-deployment activity of software maintenance. Moreover, he was the first to observe that software must change to adapt to a changing world. Lehman coined the term E-type software to stress the difference between software that changes to adapt to the real world—maintaining its usefulness—and software that does not change, and therefore dies because it is not used in the real world. However, it took until the end of the eighties for the term software evolution to be widely accepted [Art88; OL90]. Concerning process models, in this decade researchers proposed two important evolutionary models: Gilb's evolutionary development in 1981 [Gil81] and Boehm's spiral model in 1988 [Boe88]. These models were introduced in the context of a growing criticism of the waterfall model. In 1982, McCracken and Jackson argued against the waterfall model [MJ82], and in 1986 Parnas and Clements stated that, while they believed in the ideal of the waterfall model, they found it impractical [PC86]. Nine years later, in 1995, Brooks marked the end of the waterfall model with his keynote at ICSE titled "The waterfall model is wrong!". In the meantime, SCM systems continued their growth.
In 1982, Tichy introduced the Revision Control System (RCS) [Tic82], while four years later the Concurrent Versions System (CVS), a very popular versioning system (still used in a large number of open source projects and industrial settings), emerged.

2.1.3 From the Nineties to Mining Software Repositories

In the nineties, especially the second half, public awareness of evolutionary development significantly accelerated. Moreover, it was in the nineties that SCM received widespread attention and usage. With the advent of the Internet and the improvement in network bandwidth, software development started to be distributed; source code repositories started to be remote, and CVS played a key role in this transition, as it supports concurrent development. In the nineties, the first bug tracking systems were created: GNATS being the first in 1992, followed by Debbugs in 1994, Bugzilla in 1998, a service offered by SourceForge in 1999, and many others after 2000 (including the Google Code issue tracker in 2007). In 1995, many contributors from the Rational Corporation created the Rational Unified Process, in which daily builds and smoke tests were promoted, an influential practice institutionalized by Microsoft [McC95]. In 2000, Bennett and Rajlich proposed the staged model of software development [BR00].

The origin of agile methods and processes took place around the year 2000. In 1996, Kent Beck joined the Chrysler C3 payroll project, and in this context extreme programming practices (emphasis on communication, simplicity, testing) matured. It was after this experience that Kent Beck wrote the famous book "Extreme Programming Explained: Embrace Change" [Bec99]. In 1999, two more agile methods were proposed: Stapleton introduced the Dynamic Systems Development Method (DSDM) [Sta99], and De Luca created the Feature Driven Development method (FDD), which was first presented by Coad et al. in the book "Java Modeling in Color with UML" [CLL99]. One year later, Beedle introduced Scrum [BDS+00], an agile method destined to become famous. In February 2001, 17 process experts from DSDM, XP, Scrum, FDD and other communities created the Agile Alliance and coined the term "Agile methods". Later in 2001, one of the participants, Cockburn, published the book "Agile Software Development" [Coc01], while in 2002 Martin wrote about agile software development in a dedicated book [Mar02]. Concerning SCM, two important steps were the creation of Subversion (SVN), the successor of CVS, in 2000 and the release of Git, an open source distributed version control system, in 2006. In the second half of the nineties, SCMs became so widely used that researchers started to mine source code repositories. The first approaches were proposed by Ball et al. in 1997 to find clusters of files frequently changed together [BAHS97], by Graves et al. in 1998 to compute the effort necessary for developers to make changes [GM98], and by Atkins et al. in 1999 to evaluate the impact of tools on software quality [ABGM99]. These are among the seminal research works in which the field of mining software repositories has its roots.

2.1.4 The Advent of Mining Software Repositories

In the first half of the current decade, software repositories received more and more attention from researchers and practitioners. The usage of SCM systems became fundamental for software development. Estublier et al. stated that "[...] modern SCM systems are now unanimously considered to be essential to the success of any software development project [...] Furthermore, there is a lively international research community working on SCM, and a billion dollar commercial industry has developed" [ELH+05]. A number of new bug tracking systems were created (MantisBT, FogBugz, Savannah, FlySpray, etc.) and their usage started to become established practice in software development. In the meantime, software evolution became an active and well-respected research field in software engineering, and mining software repositories matured and started to be a research area of its own. In 2004, the first International Workshop on Mining Software Repositories (MSR) was held [HHM04]. In the following years, the topic gained increasing attention and the field continued to mature: Many software engineering and software maintenance conferences had sessions about mining, and the international workshop on MSR became a working conference in 2008 [HLG08]. Mining software repositories is not only about source code, SCM and defect tracking systems: In this broad field, researchers mine and analyze a variety of different repositories. To give an impression of the approaches under the umbrella of MSR, we show in Figure 2.1 a word cloud obtained from all the titles and abstracts of the articles presented at the Working Conference on Mining Software Repositories 2008, 2009 and 2010.1 Figure 2.1 shows that among the most mentioned concepts in MSR are the following: source code, changes, bugs/defects and developers. We model these concepts in our meta-model presented in Chapter 3.

1We chose the 2008, 2009 and 2010 editions of MSR because in 2008 it became a conference.

Figure 2.1. A word cloud obtained from the abstracts of the 2008, 2009 and 2010 editions of the Mining Software Repositories conference

To see which data MSR approaches analyze and which goals they achieve, we present a summary of all the approaches presented at MSR 2008, 2009 and 2010 in Table 2.1. We observe that MSR approaches focus on SCM meta-data, source code and defects, which are also the data sources that we consider in our meta-model, presented in the following chapter.

2.1.5 Problems of Software Development Tools for MSR

With the increasing importance of software repositories, researchers observed that software development tools (e.g., SCMs, bug tracking systems, integrated development environments, etc.) were not designed to support software evolution analysis and mining. In fact, as we discussed in our historical overview, the advent of MSR was a consequence of the widespread adoption of tools such as SCMs and defect tracking systems, which were created before researchers started to mine software repositories. Robbes and Lanza argued that existing SCMs, such as CVS, SVN and SourceSafe, are not adequate for software evolution analysis for several reasons [RL05]: They are file-based instead of entity-based (classes, methods, attributes in object-oriented languages), and thus operations like renaming and refactorings have to be reconstructed with heuristics [GW05; VRRD06]; they record changes only at commit time, and thus important pieces of information are lost [KM06]. Researchers also analyzed the efficiency of bug tracking systems in providing useful and relevant information that developers can use to understand and fix the reported bugs [BJS+08]. The studies showed that bug tracking systems used in large projects (e.g., Bugzilla used in Eclipse and Mozilla) do not provide good support to describe bugs in a useful way, and therefore researchers proposed methodologies to improve bug tracking systems [JPZ08; ZPSB09; BPSZ10]. Another problem of software development tools for MSR concerns their integration. In several projects, both an SCM and a bug tracking system are used, but they are not integrated, i.e., it is not possible to link software defects with the code they affect. In some cases (e.g., in several

Table 2.1. A summary of the approaches presented at MSR 2008, 2009 and 2010, classified according to the data they analyzed and the goals they achieved

Analyzed data (can be more than one per approach):

  Versioning system meta-data                                       27
  Source code                                                       26
  Bug data                                                          17
  Email archives                                                     6
  Documentation                                                      5
  Data recorded in integrated development environments               4
  System log files and stack traces                                  2
  Build configuration data                                           1
  Bytecode                                                           1
  Effort data                                                        1
  Unit tests                                                         1
  Work descriptions                                                  1

Achieved goal (one per approach):

  Improving development tools                                       13
    - Code search tools                                              4
    - Integrated development environments                            4
    - Recommender systems                                            2
    - SCM                                                            2
    - Bug tracking systems                                           1
  Analyzing developer practices to improve them or to detect
    erroneous behavior                                               8
  Studying and discovering general principles                        8
  Creating a common dataset / infrastructure to analyze
    software repositories                                            6
  Analyzing the impact of source code and SCM properties on bugs     5
  Defect prediction                                                  5
  Analyzing change properties from SCM                               3
  Challenges of mining new repositories                              3
  Extracting and analyzing developer expertise                       3
  Analyzing log files and stack traces to support debugging          2
  Automating bug report classification                               2
  Analyzing the evolution of the build system                        1
  Analyzing the replicability of approaches                          1
  Bug detection                                                      1
  Creating social networks of developers                             1
  Extracting structural information from email archives              1
  Generating documentation                                           1
  License identification                                             1
  Presenting large amounts of MSR data effectively                   1

Apache projects such as Derby, Lucene, Jackrabbit, Maven, Mina, etc.) there is a convention to write the id of the bug that was fixed as a comment when committing the changes (i.e., the fix). With such a convention, however, the links between bugs and code are not enforced by tools, and their quality and reliability depend on the diligence of developers. To ameliorate this situation, researchers proposed two types of approaches: "during development" and "after development".
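Recovering such convention-based links after the fact is essentially pattern matching over commit comments. The sketch below illustrates the idea in Python; the commit messages and the two patterns (a JIRA-style key and a plain "bug <number>" reference) are invented for illustration and do not come from the projects cited above.

```python
import re

# Hypothetical commit records following the "bug id in the commit
# comment" convention discussed above.
commits = [
    {"rev": "r1021", "message": "DERBY-2107: fix NPE in optimizer"},
    {"rev": "r1022", "message": "cleanup, no functional change"},
    {"rev": "r1023", "message": "Fix for bug 4711 (crash on empty input)"},
]

# Two common conventions: JIRA-style keys (e.g. DERBY-2107) and
# plain "bug <number>" references.
BUG_ID = re.compile(r"\b(?:[A-Z][A-Z0-9]+-\d+|bug\s+\d+)\b", re.IGNORECASE)

def linked_bugs(message):
    """Return the bug identifiers mentioned in a commit message."""
    return [m.group(0) for m in BUG_ID.finditer(message)]

links = {c["rev"]: linked_bugs(c["message"]) for c in commits}
```

Note that the sketch also shows the weakness discussed above: the commit `r1022` yields no link at all, whether or not a fix was actually committed, because nothing enforces the convention.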

"During development" approaches enhance integrated development environments (IDEs) by linking data from multiple repositories and showing only relevant artifacts to the user. One of these approaches is Jazz for Eclipse [Fro07]: It provides an integrated environment with a strong emphasis on collaboration between developers. Jazz integrates source code with defects, defects with failed unit tests, source code with builds, and builds with defects. Another integrated solution for Eclipse is Mylyn, a plugin created by Kersten and Murphy [KM05; KM06]. Mylyn models tasks2 as first-class entities and links them with the source code. It provides plugins to integrate Bugzilla and Jira repositories, allowing the user to retrieve and store tasks (e.g., bugs to fix) in these repositories.

"After development" approaches integrate evolutionary repositories (such as SCM repositories and bug databases) by analyzing the data they contain. The first approach to link software artifacts with bug reports was the Release History Database (RHDB) [FPG03], followed by others such as Kenyon [BEJWKG05], Hipikat [uMSB05] and SoftChange [GHJ04]. Since these approaches are closely related to ours, we discuss them in detail in Section 2.2.

2.1.6 Summing Up and Contextualizing our Approach

Figure 2.2 shows an overview of how software evolution, MSR, SCMs and bug tracking systems evolved over the last 40 years. The concept of software evolution was first introduced in 1980, and from then on problems concerning software evolution continued to challenge researchers. The progress of software evolution research was influenced not only by development process models, but also by tools used by developers on a daily basis, such as SCMs and IDEs. In recent times, software evolution became an active research field, and researchers started to mine software repositories to support software evolution analysis.

[Figure: a timeline from 1968 to 2010 with milestones including the first conference on software engineering, the first SCM (SCCS), the waterfall life cycle, Lehman's laws of software evolution, Gilb's evolutionary development model, RCS, Boehm's spiral model, GNATS, CVS, Bugzilla, agile software development, "Extreme Programming Explained: Embrace Change", Subversion, the first MSR approach (Ball et al.), the RHDB (Fischer et al.), the first workshop on MSR, Git, Mylyn, Jazz, and MSR becoming a conference.]

Figure 2.2. An overview of the history of software evolution and MSR

However, there is still a gap between the information produced during software development and the information needed for mining software repositories. One of the problems is the lack of integration between the various pieces of information produced during software development, such as SCM meta-data, source code, problem reports, etc. To tackle this problem, researchers proposed two types of approaches: "During development", i.e., IDE enhancements/plugins, such as Jazz or Mylyn, and "after development", i.e., integration for retrospective analysis. Although we believe that IDE enhancements are the better long-term solution, in practice many software projects do not adopt them, and the only possibility is to apply "after development" approaches.

2 A task can be one of a variety of activities, such as a bug to fix or a feature to implement.

Our thesis work is located in this context, and can be classified as an "after development" integration approach. Our goal is to show that integrating and analyzing different aspects of source code and defect evolution supports software maintenance activities aimed at inferring causes of problems and predicting the future of software systems. Figure 2.3 shows the roadmap of our dissertation, where we superimpose the areas of related work, representing them as semi-transparent shapes. Some parts of our thesis are related to more than one research area: In these cases, we place them in the intersection of two shapes.

[Figure: the dissertation roadmap, whose parts support software maintenance by inferring causes of problems and predicting the future. It comprises design flaw detection and analysis, the analysis of the relationship between design flaws and software defects, software evolution analysis and visualization, defect prediction, the analysis of source code and defect co-evolution, defect prediction with e-mail data, change coupling analysis, analysis and defect prediction with change coupling information, the analysis of the evolution of defects, defect prediction with source code metrics, and, as a foundation, modeling software evolution: an integrated framework to support software evolution analysis and an extensible meta-model for code and defect evolution.]

Figure 2.3. Related work in the context of our thesis roadmap

The research areas most related to our thesis work are:

• Modeling software evolution. In the first part of our thesis work we create a meta-model of software evolution and implement it in the context of a software evolution analysis framework. This part of the thesis is related to other approaches to model software evolution and to create general frameworks for software evolution analysis and mining software repositories.

• Software evolution visualization. We devise different software visualization techniques to support the analysis of code, defects and their co-evolution. Related work in this case comprises visualization approaches proposed as a means to analyze software evolution.

• Change coupling analysis. On top of our software evolution meta-model we built a number of analysis techniques. Two of them focus on change coupling information and are related to other approaches aimed at analyzing this type of data.

• Defect prediction and analysis. A number of techniques revolve around software defects with different goals: Understanding their properties and their relationship with source code, and predicting density and location of future bugs. We propose several techniques to predict defects and one visual approach to study their evolution.

• Design flaw detection and analysis. We exploit our evolutionary meta-model to study the relationship between the quality of the source code and the presence of software defects. We focus our investigation on a specific aspect of code quality, namely the presence of design flaws. Researchers have extensively studied software quality in general, and design flaws in particular, and proposed several approaches to detect, characterize and analyze design flaws.

In the following we detail the state of the art in the research areas listed above.

2.2 Modeling Software Evolution

Researchers proposed a number of approaches to model the evolution of software, summarized in Table 2.2. As shown in the table, these approaches differ in:

• The data sources considered. Some approaches model the source code; others also consider further data sources such as bug reports, e-mails, etc.

• The granularity of the evolution. An SCM repository contains two types of information that can be extracted: Versions (or snapshots) of the software system, i.e., the source code of the system at a particular version, and the SCM meta-data, i.e., the history of every versioned file, including the time, author and comment of each commit. Several software evolution approaches model the SCM meta-data without considering snapshots; others model the system's evolution as a collection of snapshots; others combine one or multiple snapshots with SCM meta-data.

• The purpose. While every software evolution analysis tool implements a model of software evolution, only a few approaches provide a meta-model as a basis to devise analysis techniques on top. In other words, approaches for modeling software evolution vary in how many techniques were proposed on top of them.
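The granularity distinction above can be made concrete with a minimal sketch: SCM meta-data reduces to a list of per-file commit records, while a snapshot carries the source code itself. The class and field names below are illustrative only and are not taken from any of the surveyed meta-models.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Commit:
    """SCM meta-data: one history entry of a versioned file."""
    file: str
    author: str
    timestamp: str
    comment: str

@dataclass
class Snapshot:
    """A version of the system: the source code at one point in time."""
    version: str
    files: Dict[str, str]  # path -> source text

@dataclass
class History:
    commits: List[Commit] = field(default_factory=list)
    snapshots: List[Snapshot] = field(default_factory=list)

# A meta-data-only model populates `commits`; a snapshot-based model
# populates `snapshots`; integrated approaches fill both.
h = History()
h.commits.append(Commit("Parser.java", "alice", "2009-03-01", "fix issue 42"))
h.snapshots.append(Snapshot("1.0", {"Parser.java": "class Parser {}"}))
```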

Table 2.2. Approaches modeling software evolution

Modeled data:

  Approach           | SCM | Snapshots          | Bug        | Others
  -------------------+-----+--------------------+------------+-----------------------
  CVSgrab & CVSscan  | Yes | In [VTvW05]        | In [VT06a] | None
  Deep Intellisense  | Yes | One                | Yes        | E-mail
  EvoOnt             | Yes | Yes                | Yes        | None
  Hipikat            | Yes | No                 | Yes        | E-mail, documentation
  Hismo              | Yes | Yes                | No         | None
  Kenyon             | Yes | Yes                | No         | None
  RHDB               | Yes | In [PGFL05; FG06]  | Yes        | None
  SoftChange         | Yes | No                 | Yes        | E-mail
  Tesseract          | Yes | No                 | Yes        | E-mail

Techniques built on top:

  CVSgrab & CVSscan  | None
  Deep Intellisense  | None
  EvoOnt             | iSPARQL [KBT07]
  Hipikat            | None
  Hismo              | Yesterday's Weather [GDL04], Ownership Map [GKSD05],
                       class hierarchies evolution visualization [GLD05]
  Kenyon             | YARN [HJK+07]
  RHDB               | Features visualization [FG04], RelVis [PGFL05],
                       EvoGraph [FG06]
  SoftChange         | None
  Tesseract          | None

Hismo

In his PhD thesis [Gîr05], Gîrba defined Hismo, a meta-model of software evolution in which history is an explicit entity [GD06]. The main idea of Hismo is to add a time layer on top of every software entity, such as classes, methods and attributes. In this way, software entities have histories composed of all their versions. Hismo is a generic meta-model, which can describe the history of any entity. In his PhD work [Gîr05], Gîrba used it to model the evolution of source code and CVS log files. To model source code evolution he added the Hismo layer on top of the FAMIX meta-model [DTD01]; to model CVS logs he created a Hismo-compliant meta-model of versioning system meta-data.

Gîrba validated the Hismo meta-model, and its implementation in the context of the Moose reengineering environment [DGN05], by implementing a number of software evolution analyses on top of it. Gîrba et al. presented "Yesterday's Weather" [GDL04], an approach that uses history measurements to detect candidates for reverse engineering activities. The approach is based on the following idea: Entities that suffered important changes in the recent past are likely to change in the near future. Another approach built on top of Hismo is the Ownership Map [GKSD05], a visualization that depicts the code ownership of all files in a CVS repository over time. The visualization is implemented in a tool called Chronia, built on top of Moose. Gîrba et al. also enriched Lanza's polymetric views [LD03] by adding a time layer on top of them [GLD05], just as Hismo adds a time layer on top of FAMIX. In particular, the authors devised a visualization of the evolution of class hierarchies and, based on the analysis of two large open source systems, classified class hierarchies based on their history.

Hismo is flexible and can be adapted to model different types of data, such as source code and CVS log files.
However, Hismo does not integrate source code with versioning system meta-data in a single meta-model: To describe them, Hismo is instantiated in two distinct meta-models. Moreover, the approach does not model software defects.
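Hismo's time-layer idea can be sketched as follows: a history is itself an entity that wraps all versions of some underlying entity and offers history measurements on top of them. The ranking below only mimics the spirit of Yesterday's Weather (recently changed entities are likely to change again); the measurement, the window, and the numbers are invented for illustration and are not Gîrba's actual formulation.

```python
class History:
    """A toy 'history as first-class entity': one measurement per version."""

    def __init__(self, name, versions):
        self.name = name          # e.g. a class name
        self.versions = versions  # one metric value (e.g. NOM) per version

    def change(self, i):
        """Absolute change of the measurement between version i-1 and i."""
        return abs(self.versions[i] - self.versions[i - 1])

    def recent_change(self, window=2):
        """Total change over the last `window` version transitions."""
        start = max(1, len(self.versions) - window)
        return sum(self.change(i) for i in range(start, len(self.versions)))

histories = [
    History("Parser",  [10, 10, 18, 25]),  # grew a lot recently
    History("Lexer",   [30, 31, 31, 31]),  # stable
    History("Printer", [5, 12, 12, 12]),   # changed long ago only
]

# Candidates for reverse engineering: largest recent change first.
candidates = sorted(histories, key=lambda h: h.recent_change(), reverse=True)
```

In this toy run, `Parser` ranks first because all of its change is recent, while `Printer`'s early growth is ignored by the window.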

Release History Database

The Release History Database (RHDB) [FPG03] was the first approach to link versioning system files (from a CVS repository) with bug reports (from a Bugzilla database). The Release History meta-model describes SCM artifacts, problem reports and program features. Fischer and Gall proposed a technique on top of RHDB: The authors used multidimensional scaling to visualize the evolution of features, with the aim of uncovering hidden dependencies between software features [FG04].

Although the original Release History meta-model did not include information about the source code, a number of techniques applied on top of RHDB added this information. Antoniol et al. conducted a study on how to integrate the meta-model with a source code meta-model (FAMIX) [APGP05]. Pinzger et al. integrated source code data with change coupling information computed from the RHDB [PGF05]. They proposed RelVis, a visualization technique based on Kiviat diagrams, to provide integrated views on source code metrics across different releases together with change coupling information [PGFL05]. The EvoGraph visualization approach [FG06] combines release history data and source code changes to assess structural stability and recurring modifications.

EvoOnt

EvoOnt [KBT07] is a set of software ontologies and a data exchange format based on the Web Ontology Language OWL3. EvoOnt is composed of three distinct but interconnected ontology models: (1) the software ontology, describing source code snapshots based on the FAMIX meta-model; (2) the version ontology, modeling SCM meta-data; and (3) the bug ontology, describing problem reports according to the Bugzilla meta-model. Since OWL describes the semantics of the data, rather than its structure, EvoOnt can be extended while maintaining the functionalities of existing tools built on top of it.

To complement EvoOnt, Kiefer et al. developed iSPARQL [KBT07], a query engine that extends the Semantic Web query language SPARQL4 with facilities to query for similar software entities (e.g., classes, methods, attributes) in OWL software repositories. The authors showed that iSPARQL, together with EvoOnt, can be used to perform a number of tasks sought in software repository mining projects, such as code evolution visualization and code smell detection.

Kenyon

Bevan et al. proposed the Kenyon framework [BEJWKG05], which provides an extensible infrastructure to retrieve the history of a software project from an SCM repository or a set of releases, and to process the retrieved information. It also provides a common interface based on object-relational persistence to access the processed data and to perform software evolution analysis.

3 http://www.w3.org/TR/owl-ref/
4 http://www.w3.org/TR/rdf-sparql-query/

Kenyon retrieves information from a number of SCM systems (CVS, SVN and ClearCase) and models multiple snapshots of a system. The authors designed the framework to be extensible, so that other types of information can be added. In the implementation presented by Bevan et al. [BEJWKG05], no other type of data (such as problem reports) was included in Kenyon.

One approach built on top of the framework is YARN [HJK+07]. YARN animates the evolution of a software system, based on source code changes, to support the understanding of the system architecture.

CVSgrab & CVSscan

CVSgrab [VT06c; VT06b], proposed by Telea and Voinea, is a visualization tool to analyze CVS-based software repositories. The tool allows the user to produce views, to interact with them, to perform querying and filtering, and to customize the visualizations through a set of metrics computed from the CVS data. While the first version of CVSgrab included only SCM-related information, in a subsequent work the authors extended it to also model bug-related information [VT06a]. The same authors presented another tool, called CVSscan [VTvW05], that samples multiple versions of a software system from a CVS repository and visualizes the evolution of lines of code. The authors described the entire toolset they developed [VT07], which models and visualizes information from SCM meta-data, source code and problem reports. To our knowledge, no analysis technique was built on top of this toolset.

Hipikat and SoftChange

Čubranić et al. introduced Hipikat [uMSB05], an approach for linking information from several data sources to form a so-called "implicit group memory" of a group of developers. The authors showed that such a group memory facilitates the integration of newcomers into the group, by recommending relevant artifacts for specific tasks. The Hipikat meta-model combines information from versioning systems, bug tracking systems, e-mail archives and documentation.

SoftChange [GHJ04] integrates data from SCM repositories, bug tracking systems and e-mail archives. The data retrieved and processed by SoftChange is used for two types of software evolution analysis: (1) statistics of the overall evolution of the project, and (2) analysis of the relationships among files and authors.

The meta-models of both Hipikat and SoftChange do not consider source code snapshots. We are not aware of any approach proposed on top of Hipikat or SoftChange.

Tesseract and Deep Intellisense

Sarma et al. developed Tesseract [SMWH09], an interactive visualization tool that displays relationships between versioned files, developers, bugs and communications. The tool provides four cross-linked displays which show: (1) the project activity over time with respect to the number of commits and number of communications; (2) a graph of change coupling dependencies among files; (3) a graph of communication dependencies among developers; and (4) the defects affecting the considered files. The meta-model implemented in the tool describes the following pieces of information: SCM files and meta-data, problem reports including the communications extracted from their comments, and e-mail data. Developer information is extracted from these data sources (for example from SCM accounts, or from the people sending e-mails or assigned to fix bugs). Tesseract's meta-model does not describe source code snapshots.

Deep Intellisense [HB08] is a plugin that displays information extracted from various software-related repositories to help developers understand the rationale behind the code, i.e., why the code looks the way it does. Deep Intellisense integrates the following data sources: (1) source code; (2) SCM repositories; (3) bug reports, feature requests and work items (stored in the same database at Microsoft); and (4) mailing list archives. Both Tesseract and Deep Intellisense are stand-alone tools, and no approach was built on top of them.

2.2.1 Summing Up

In recent years, researchers modeled software evolution in different ways. As discussed in Section 2.1.5, in our thesis we focus on a specific problem of software development tools for MSR, i.e., the lack of integration of software repositories. For this reason, we surveyed software evolution models that integrate different data sources and serve as a basis for subsequent analyses.

RHDB and EvoOnt are the only approaches that model source code snapshots, SCM meta-data and software defects. These models, however, were not extended beyond this, for example to model e-mail data or documentation. Only Hismo, Kenyon, RHDB and EvoOnt were used to build analysis techniques on top, and even then the number of techniques is limited. Although some of the surveyed software evolution approaches are flexible, only Hismo and RHDB were adapted or extended to model new pieces of information. In none of the presented approaches are software defects modeled as first-class entities that evolve over time.

2.3 Software Evolution Visualization

Software evolution visualization is software visualization applied to evolutionary information. Software visualization, in turn, is a specialization of information visualization focusing on software [Lan03].

2.3.1 Software Visualization

According to Stasko et al., software visualization is "the use of the crafts of typography, graphic design, animation, and cinematography with modern human-computer interaction and computer graphics technology to facilitate both the human understanding and effective use of computer software" [SDBE98]. The goal of software visualization is to support the understanding of large amounts of data, when the questions that one wants to answer about the data cannot be expressed as queries. According to Butler et al. [BAB+93] there are three categories of visualization:

1. Descriptive visualization. The visualized subject is known to a master user and the visualization is used to present the data to other people. Descriptive visualization is widely used for educational purposes.

2. Analytical visualization. Defined as the process of looking for something known in the available data. Visualization is particularly useful because it provides the context: For example when visualizing a class hierarchy, the hierarchy is the context for a given class.

3. Explorative visualization. Defined as the process of interpreting the nature of the available data. The user does not know (yet) what he is looking for, so he tries to discover recurring patterns or to spot outliers.

Software visualization approaches vary with respect to two dimensions: The type of visualized data and the level of abstraction. According to the type of data, visualizations can be classified as:5

• Static visualizations present information extracted by static analysis of the software. Their goal is to support the understanding of the structure of the system.

• Dynamic visualizations render information that is derived from instrumenting the execution of a program. Their goal is to support the understanding of the behavior of the system.

• Evolutionary visualizations present information extracted from the evolution of a software system, such as SCM meta-data, differences between system snapshots, etc. Their goal is to support the analysis of the causes of problems in the software and the prediction of future changes or problems. All the visualization techniques built on top of our software evolution approach lie in this category.

Since our focus is on evolutionary visualizations, we present them in detail, while for static and dynamic visualizations we provide a non-comprehensive overview.

Static and dynamic visualizations

Based on their abstraction level, we distinguish three main classes of software visualization approaches: Code-level, design-level and architectural-level.

Code-level visualization. Line-based software visualization was addressed in a number of approaches. In 1992, Eick et al. proposed SeeSoft [ESS92], the first tool that uses a direct code-line-to-pixel-line visual mapping to represent files in a software system. On top of this mapping, SeeSoft superimposes other types of information, such as which developer worked on a given line of code or which code fragments correspond to a given modification request.

Marcus et al. extended the visualization techniques of SeeSoft by exploiting the third dimension in a tool called sv3D [MFM03]. In particular, sv3D uses the third dimension to pack the line-based visualization more compactly. Ducasse et al. worked at a finer granularity level, using a character-to-pixel representation of methods in object-oriented systems. The authors enriched this mapping with semantic information to provide overviews of the methods in a system [DLR05].

The Tarantula tool [JHS02] maps colors on program statements to show their participation in a test suite and the corresponding outcome (passed or failed). Based on this visual mapping, a user can identify statements involved in failures and locate potentially faulty statements.

Design-level visualization. The next level of abstraction, after code, is the design level, where visualizations focus on self-contained pieces of code, such as classes in object-oriented systems. UML diagrams are the industry standard for representing object-oriented design. A number of tools, such as Rational Rose, ArgoUML and Enterprise Architect,6 support the generation of UML diagrams from code.

5 This classification is inspired by the taxonomy of software visualization tools presented by Price et al. [PBS93]. This taxonomy classifies software visualization tools in three main classes: (1) Algorithm animation, (2) dynamic visualization and (3) static visualization. We adapted this taxonomy by removing algorithm animation and by adding evolutionary visualization, since Price et al. proposed the taxonomy in 1993, before software evolution visualization approaches were proposed.
6 Available respectively at:

Researchers investigated techniques to enrich and extend standard UML diagrams. Termeer et al. developed the MetricView tool, which augments UML class diagrams with visual representations of class metrics extracted from the source code [TLTC05]. Knodel et al. proposed a tool called SAVE [KMNL06] that uses UML-like figures to represent architectural components, allowing the user to navigate from the components to the code. Through a user study, the authors showed that the visualization and the navigation capabilities offered by SAVE support the comprehension of relationships between source code and architectural models [KMN08].

Researchers also investigated different visualization techniques to represent source code at the design level. In his PhD thesis work, Lanza introduced polymetric views [Lan03], a lightweight software visualization technique. A polymetric view is a representation of software entities and software relationships enriched with software metrics: For example, in the System Complexity view [LD03], rectangles represent classes and edges connecting rectangles represent inheritance relationships. The height, width and fill color intensity of class rectangles are proportional, respectively, to the number of methods, the number of attributes and the lines of code of the corresponding class.

Polymetric views can be enriched with dynamic or semantic information. Greevy et al. exploited a 3D visualization to add execution trace information to polymetric views in a tool called TraceCrawler [GLW06]. The tool is a 3D extension of CodeCrawler [LDGP05], the tool in which Lanza originally implemented polymetric views. Ducasse and Lanza enriched polymetric views with information extracted from control flow analysis in a visualization called class blueprint [DL05]. The class blueprint renders classes and, inside them, attributes, methods and call flow information.
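As a rough illustration of such a metric-to-glyph mapping, the sketch below turns three class metrics into rectangle dimensions and a grayscale intensity, in the spirit of the System Complexity view; the metric values, scale factors and the normalization bound are invented for illustration and are not Lanza's actual mapping.

```python
classes = {
    # name: (number_of_methods, number_of_attributes, lines_of_code)
    "Parser":  (40, 6, 900),
    "Token":   (4, 2, 60),
    "Visitor": (25, 1, 400),
}

MAX_LOC = 1000  # assumed upper bound used to normalize the fill intensity

def to_glyph(nom, noa, loc):
    """Map class metrics to rectangle height, width and fill intensity."""
    return {
        "height": nom * 2,                           # taller = more methods
        "width": noa * 4,                            # wider = more attributes
        "shade": round(min(loc / MAX_LOC, 1.0), 2),  # darker = more code
    }

glyphs = {name: to_glyph(*metrics) for name, metrics in classes.items()}
```

Even without rendering, the mapping makes outliers visible in the numbers: a tall, wide, dark glyph such as `Parser`'s immediately flags a large, heavy class.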
The catalog of polymetric views presented in Lanza's PhD thesis is not limited to design-level visualizations, but also includes architectural-level visualizations.

Polymetric views are not the only visualization technique that combines static and dynamic information about a software system. Cornelissen et al. proposed a trace visualization method [CZH+08] based on a massive sequence and circular bundle view [Hol06], implemented in a tool called ExtraVis. ExtraVis shows the system's structural decomposition (e.g., in terms of package structure) and renders traces on top of it as bundled splines, enabling the user to interactively explore and analyze program execution traces. Cornelissen et al. showed that ExtraVis supports three program comprehension tasks: trace exploration, feature location, and feature comprehension. Later, Cornelissen et al. conducted a controlled experiment on the usage of ExtraVis to perform eight program comprehension tasks, showing that the group using ExtraVis was faster and produced better results than the group using Eclipse [CZVRvD09].

Ducasse et al. proposed a generic visualization technique, called Distribution Map [DGK06], to analyze how properties are distributed in a software system. The Distribution Map is based on the notions of focus (whether a property is well-encapsulated or cross-cutting) and spread (whether the property exists in several parts of the system). Given its generality, the Distribution Map technique can be applied in several types of analysis: Ducasse et al. applied it to study the distribution of linguistic concepts in software systems and to analyze the distribution of ownership among files.

Another direction of research is the use of metaphors to represent software. Wettel and Lanza argue that a city is an appropriate metaphor for the visual representation of software systems [WL07a] and implement it in their CodeCity tool [WL08a], where buildings represent classes and districts represent packages. Kuhn et al. used a cartography metaphor to represent software systems [KLN08]. In their Software Cartographer tool, the authors use a consistent layout for software maps, in which the position of a software artifact reflects its vocabulary and distance corresponds to similarity of vocabulary.

• Rational Rose: http://www-01.ibm.com/software/awdtools/developer/rose/
• ArgoUML: http://argouml.tigris.org
• Enterprise Architect: http://www.sparxsystems.com/

Architectural-level visualization. The highest level of abstraction is the architecture level, consisting of the system's modules and the relationships among them.

In 1988, Müller and Klashinsky introduced Rigi [MK88], the first architectural visualization tool. Rigi is a programmable reverse engineering environment that provides interactive visualizations of hierarchical typed graphs and a Tcl interpreter for manipulating the graph data. The tool offers various capabilities for collapsing, expanding and filtering the nodes, navigating the hierarchical models and making layouts. Rigi has been a very successful tool [KM10]: Other architecture visualization tools were built on top of it [KC98; OS01] and it inspired other architectural visualization projects. Two of them were Shrimp [SM95] and its Eclipse-based continuation Creole [LMSW03]. These tools display architectural diagrams using nested graphs where graph nodes embed source code fragments. Creole and Shrimp provide animated panning, zooming, and fisheye-view actions to let the user interact with the visualizations.

Lungu et al. introduced Softwarenaut [LL06], an architectural visualization and exploration platform on top of which they experimented with automatic exploration mechanisms [LLG06]. In a subsequent work, the authors extended Softwarenaut to analyze the evolution of the relationships between a system's modules [LL07]. Ducasse et al. proposed the Package Surface Blueprint [DPS+07], a visualization approach to study the relationships among the packages of a system. The visualization shows how a package under analysis references other packages, by rendering class references and inheritance relationships with different colors.

2.3.2 Evolutionary Visualization

Researchers employed visualization in software evolution analysis to break down the data quantity and complexity. Table 2.3 presents a summary of evolutionary visualizations, classified not only according to the abstraction level (as all software visualization approaches) but also according to the visualized evolutionary data. With respect to this second dimension, we distinguish three main classes of evolutionary visualizations: (1) approaches that extract and visualize information from multiple evolutionary snapshots of a software system, (2) techniques that visualize historical information about software systems as extracted from SCM log files, and (3) visualizations of data retrieved from various repositories, such as SCM repositories, problem report databases and e-mail archives.

Approaches based on multiple evolutionary snapshots

Many approaches consider different releases of a software system (evolutionary snapshots) and visualize their history and their differences at various abstraction levels. Telea et al. proposed a code level visualization technique called Code Flows, which displays the evolution of source code over several versions [TA08]. The visualization, based on a code matching technique that detects correspondences in consecutive ASTs [CAT07], is useful to both follow unchanged code and detect important events such as code drift, splits, merges, insertions and deletions.

Table 2.3. Software evolution visualization approaches classified according to the abstraction level and the visualized data sources

                     Visualized data
Abstraction level    Multiple evo. snapshots    SCM log files                   Integrated data sources
Code                 [TA08]                     [BE96]                          [FD04; VTvW05]
Design               [GLD05; Lan01; TG02]       [TM02; VRD04; WHH04; GKSD05]    [FG06; CKN+03; VT06c; SMWH09; HB08]
Architectural        [GJR99; Jaz02]             [GHJ98; GJK03]                  [PGFL05; HJK+07]

Several approaches were proposed at a higher level of abstraction: the design level. Gîrba et al. visualized the evolution of class hierarchies and classified them based on their history [GLD05]. Lanza's Evolution Matrix [Lan01] visualizes the system's history in a matrix where each row represents the history of a class, and each column an evolutionary snapshot of the system. A cell in the Evolution Matrix represents a version of a class, where its dimensions are mapped to evolutionary measurements computed on subsequent versions. Tu and Godfrey proposed an approach that integrates the use of metrics, software visualization and origin analysis for studying software evolution [TG02]. Jazayeri et al. used a three-dimensional visual representation at the architectural level for analyzing a software system's release history [GJR99]. Later, Jazayeri proposed a retrospective analysis technique to evaluate architectural stability, based on the use of colors to depict changes in different releases [Jaz02].

Approaches based on SCM logs

Another approach to software evolution visualization consists in retrieving the history of a software system from SCM log files. Working at the level of code, Ball and Eick focused on the visualization of different source code evolution statistics such as code version history, difference between releases, static properties of code, code profiling and execution hot spots, and program slices [BE96]. A number of evolutionary visualizations were proposed at the design level, i.e., rendering the files stored in the SCM repository, and able to scale to visualize the entire system. Taylor and Munro used visualization together with animation to study the evolution of a CVS repository [TM02]. The technique, called revision towers, allows one to find out where the active areas of the project are and how work is shared out across the project. Rysselberghe and Demeyer used a scatterplot visualization of CVS data [VRD04] to recognize relevant changes in a software system such as: (1) unstable components, (2) coherent entities, (3) design and architectural evolution, and (4) fluctuations in team productivity. Wu et al. used a spectrograph metaphor to visualize how changes occur in software systems [WHH04]. The Ownership Map [GKSD05], introduced by Gîrba et al., visualizes code ownership of files over time, based on information extracted from CVS logs. Gall et al. used a graph-based representation to visualize historical relationships among a system's modules extracted from the release history of a system [GHJ98]. The authors applied the visualization to analyze historical relationships among modules of a large telecommunications system and showed that it supported the understanding of the system architecture. Later, Gall et al. revisited the technique to work at the design level, visualizing classes and historical dependencies among them [GJK03].
They validated the approach on the history of an industrial system, demonstrating that their visualization can be used to detect architectural weaknesses.

Approaches based on integrated data sources

A number of approaches use information from both different releases of a software system and SCM log files. Augur [FD04] is a code level visualization tool that combines, within one visual frame, information about both software artifacts and the activities of a software project at a given moment (extracted from SCM logs). Although Augur works at the code level, i.e., it visualizes individual code lines, it scales to the design level showing information about multiple files in a single view. Another tool working at the code level is CVSscan [VTvW05], which samples multiple versions of a software system from a CVS repository, and visualizes the evolution of lines of code. Other approaches target the design level: The EvoGraph visualization [FG06], presented by Fischer and Gall, combines release history data and source code changes to assess structural stability and recurring modifications. Collberg et al. proposed a graph drawing technique for visualization of large graphs with a temporal component, with the aim of understanding the evolution of legacy software [CKN+03]. At a higher level of abstraction, the architectural level, Pinzger et al. [PGFL05] proposed a visualization technique based on Kiviat diagrams. The visualization provides integrated views on source code metrics in different releases together with coupling information computed from CVS log files. Hindle et al. developed the YARN tool [HJK+07], which animates the evolution of the relationships among a system's components over time to support the understanding of the system architecture. To generate the animations, YARN extracts source code changes from an SCM repository and maps them onto software components. Several evolutionary visualization approaches integrate source code data with information extracted from bug repositories. CVSgrab supports querying, analysis and visualization of CVS-based software repositories, also integrating Bugzilla information [VT06c].
Voinea and Telea applied CVSgrab to assess change propagation of buggy files in Mozilla [VT06a]. Two tools that integrate information extracted not only from source code and bug repositories but also from other data sources are Tesseract and Deep Intellisense. Tesseract provides interactive visualizations of relationships between files, developers, bugs and e-mails [SMWH09], while Deep Intellisense integrates source code, SCM repositories, bug reports, feature requests, work items and mailing list archives [HB08]. All the mentioned tools, i.e., CVSgrab, Tesseract and Deep Intellisense, provide visualizations at the design level.

2.3.3 Summing Up

We presented a survey of software visualization approaches pertinent to the research presented in this dissertation. We classified the approaches according to the type of visualized data (static, dynamic and evolutionary) and the abstraction level (code-, design- and architectural-level). With this survey, we showed that visualization is an effective means to analyze and understand large amounts of data, typical of software evolution studies, where researchers deal with large and long-lived systems. For this reason, in our thesis work we use visualization to analyze the evolution of source code (cf. Chapter 4), software defects (cf. Chapter 5) and their co-evolution (cf. Chapter 6).

2.4 Change Coupling Analysis

Change (or logical) coupling is the implicit dependency between two or more software artifacts that have been observed to frequently change together during the evolution of a system [GHJ98]. This co-change information can either be present in the versioning system, or must be inferred by analysis. For example, Subversion marks co-changing files at commit time as belonging to the same change set, while in CVS the logically coupled files must be inferred from the modification time of each file. The concept of co-change in versioning systems was first introduced by Ball et al. [BAHS97]. They used this information to visualize a graph of co-changed classes and detect clusters of classes that often changed together during the evolution of a system. The authors discovered that classes belonging to the same cluster were semantically related. Gall et al. revised the concept of co-change to detect implicit relationships between modules [GHJ98], and named it change (or logical) coupling. They analyzed the dependencies between modules of a large telecommunications system and showed that the concept helps to derive useful insights into the system architecture. Later, the same authors revisited the technique to work at a lower abstraction level. They detected change couplings at the class level [GJK03] and validated the approach on 28 releases of an industrial software system. The authors showed through a case study that architectural weaknesses, such as poorly designed interfaces and inheritance hierarchies, could be detected based on change coupling information. Other work was performed at finer granularity levels. Zimmermann et al. [ZWDZ05] used the co-change information to predict entities (classes, methods, fields, etc.) that are likely to be modified when one is being modified. Ying et al. proposed an approach, based on data mining techniques, to recommend potentially relevant source code artifacts to a developer performing a modification task [YMNCC04].
The authors showed that the approach can reveal valuable dependencies by applying it to the Eclipse and Mozilla open source projects. Breu and Zimmermann [BZ06] applied data mining techniques to co-changed entities to identify and rank crosscutting concerns in software systems. Bouktif et al. [BGA06] improved precision and recall of co-changing file detection with respect to previous approaches. They introduced the concept of change-patterns, in particular the "Synchrony" pattern for co-changing files, and proposed an approach to detect such change-patterns in CVS repositories using dynamic time warping. The analysis of change coupling has two major benefits:

1. It is more lightweight than structural analysis, as only the data provided by the SCM log files is needed, i.e., it is not necessary to parse and model the whole system. Moreover, as it works at the text level, it can be used to analyze systems written in multiple languages.

2. It can reveal hidden dependencies that are not present in the code or in the documentation.
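To illustrate how lightweight this analysis is, change coupling can be computed directly from SCM change sets without parsing any code. The following sketch (the file names and commit data are invented for illustration; it assumes each commit is available as a set of changed file paths, as Subversion or Git change sets provide) counts, for every pair of files, in how many commits both changed:

```python
from itertools import combinations
from collections import Counter

def change_coupling(commits):
    """Count, for each unordered pair of files, the number of
    commits in which both files were changed together."""
    coupling = Counter()
    for change_set in commits:
        # Sort the files so each unordered pair is counted once.
        for a, b in combinations(sorted(change_set), 2):
            coupling[(a, b)] += 1
    return coupling

# Hypothetical commit history: each commit is a set of changed files.
commits = [
    {"Parser.java", "Lexer.java"},
    {"Parser.java", "Lexer.java", "Token.java"},
    {"Parser.java", "Printer.java"},
]

coupling = change_coupling(commits)
# ("Lexer.java", "Parser.java") co-changed in 2 commits.
```

A threshold on the resulting counts (possibly normalized by the total number of changes of each file) then yields the logically coupled pairs.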

Since in our dissertation we propose a visual approach to analyze change coupling information, we review other visualization techniques.

2.4.1 Change Coupling Visualization

Ball et al. were the first to visualize information about change coupling [BAHS97]. They proposed a static graph visualization technique where nodes represent classes and two classes are connected with an edge if there is a modification report (a commit) in which both classes changed. The nodes are positioned according to the number of times the corresponding classes changed together, in such a way that coupled classes are close to each other. The color and shape of a node represent the module the class belongs to. Ratzinger et al. proposed a visualization technique to render the change coupling between Java classes, visualizing also module (package) information [RFG05]. Classes are rendered as small ellipses and grouped in larger ellipses representing the packages they belong to. The visualization shows the change coupling among classes through edges connecting the ellipses, whereas the thickness of the edges describes the "strength" of the visualized couplings. The technique does not scale to large systems, since visualizing coupling as edges suffers from over-plotting.7 Another change coupling visualization approach, already mentioned as an architectural-level evolutionary visualization, was presented by Pinzger et al. [PGFL05] with Kiviat diagrams. They represent coupling relationships as edges between modules and use surfaces to depict complete releases. Their work is not the first to represent coupling as graphs. In fact, the nodes-and-edges representation has been used since the first publications related to change coupling [GHJ98; GJK03]. The drawback of this representation is that it either represents only modules, thus being very coarse-grained, or it represents modules and files, but then incurs scalability and over-plotting problems.
Beyer and Hassan introduced the Evolution Storyboards [BH06]: A storyboard is a sequence of animated panels that shows the files composing a CVS repository, with an energy-based layout, where the distance of two files is computed according to their change coupling, similarly to the approach of Ball et al. [BAHS97]. The visualization allows the user to easily spot clusters of related files and to compare these clusters with the system's decomposition into modules, by rendering the module information on the color of the files (files belonging to the same module are rendered with the same color). Each panel is computed according to a particular time period, and the animation in the panel shows how the files move according to how their change coupling changed over the considered time. The visualization is scalable and the authors were able to apply it to large software systems. The Evolution Storyboard does not show the dependencies between modules, but only among files.

2.4.2 Summing Up

From our survey on co-change analysis approaches we draw the following conclusions:

• Lack of integration. The main problem with existing techniques is that they work either at the architecture level or at the file (or even lower) level. Working at the architecture level provides high-level insights about the system's structure, but low-level information about finer-grained entities is lost, and it is difficult to say which specific artifacts are causing the coupling. Working at the file level makes one lose the global view of the system, and it becomes difficult to establish which higher-level consequences the coupling of a specific file has. To overcome this problem, in Chapter 4 we propose a visualization that integrates change coupling information at the module level (which modules are coupled with each other) and at the file level (which files are responsible for the change couplings).

• Impact on software defects. Change coupling has been considered a bad symptom in a software system: At a fine-grained level because a developer who changes an entity might forget

7Over-plotting is the problem of multiple visual objects sharing the same space, and thus being positioned on top of one another. This makes it difficult, or even impossible, to distinguish individual objects, threatening the analysis.

to change related entities or, at the system level, because high coupling among modules points to design issues such as architecture decay. Researchers studied change coupling in order to address these two issues, using change recommendation systems and software evolution analysis approaches. However, no research has so far been conducted to empirically assess whether change coupling correlates with the presence of software defects, a tangible effect of design issues. We perform such a study in Chapter 9.

2.5 Defect Prediction and Analysis

Predicting the location and density of software defects in a system has been actively researched for more than a decade, with more than a hundred published research papers [MK10]. In recent years, top software engineering conferences (e.g., the International Conference on Software Engineering) and journals (e.g., IEEE Transactions on Software Engineering) featured articles and research tracks on defect prediction. Defect analysis has recently received increased attention, and researchers are especially active in developing bug triaging systems and in improving bug data and bug tracking systems.

2.5.1 Defect Analysis

With the term defect analysis we indicate approaches where the focus is on defects, and not approaches where defects are "only" a property of another entity that is the focus of the analysis. All the techniques that link problem reports with other software-related artifacts—to study where problems are located in a system [FG04; PGFL05; FG06; VT06a] or to create a comprehensive model of software evolution [uMSB05; GHJ04; SMWH09; HB08]—lie in the latter category, and we already presented them when surveying approaches for modeling (cf. Section 2.2) or visualizing (cf. Section 2.3) software evolution.

Bug triaging

Researchers exploited the knowledge stored in bug tracking systems to aid bug triaging. Anvik et al. presented an approach to semi-automatically assign a developer to a newly received bug report [AHM06]. They apply machine learning algorithms to recommend to a triager a set of developers who may be appropriate for resolving a given bug. A similar approach was presented by Čubranić and Murphy [uM04]. Jeong et al. observed that a relevant percentage of problem reports (between 37% and 44% for Mozilla and Eclipse) are reassigned to other developers to get fixed, thus increasing the time needed to fix the bug [JKZ09]. To improve the performance of bug triaging approaches, the authors proposed a graph model that describes the "reassignment" history of a problem report. Jeong et al. empirically demonstrated—on a dataset of 445,000 problem reports—that their graph model can reduce reassignment events by up to 72%, and improve the prediction accuracy of bug triaging approaches by up to 23%. Weiss et al. presented an approach that allows early effort estimation, supporting the task of assigning bugs to schedule stable releases of a software project [WPZZ07]. Given a new bug, the technique predicts the person-hours needed to fix it by first looking for similar (using text similarity techniques), earlier bugs in the bug database, and then by using their average fixing time as a prediction. The authors showed that, with a sufficient number of bugs, their prediction is close (one hour of deviation only) to the actual fixing time. Wang et al. focused on a specific task that the bug triager has to perform [WZX+08]: Given a new problem report, deciding whether it is a duplicate of an existing one. The authors augmented existing techniques, based on natural language, with execution information.
They showed that their improved approach can detect 67–93% of duplicate bugs, compared to 43–72% of previous techniques based only on natural language information.

Bug quality

Researchers investigated the quality of bug reports. Bettenburg et al. analyzed the impact of duplicate bugs on the developers in charge of fixing them [BPZK08]. The authors found that, in contradiction to popular wisdom, bug duplicates are not considered a serious problem for open source projects, and in many cases the additional information provided by duplicates helps to resolve bugs quicker. The authors also showed that this additional information can improve automatic triaging. In a follow-up work [BJS+08], Bettenburg et al. conducted a survey among developers and users of three large open source projects (Apache, Eclipse and Mozilla) to find out what makes a good bug report. The analysis of the responses revealed an information mismatch between what developers need and what users supply. To fill this gap, the authors developed a prototype that measures the quality of new bug reports and recommends which pieces of information should be added to improve the quality of the report. Hooimeijer and Weimer introduced a model of bug report quality, based on a statistical analysis of problem reports extracted from the Firefox Bugzilla database [HW07]. Based on their model, the authors showed that some problem report features are more important than others, and should be highlighted when creating new problem reports. Antoniol et al. showed that a considerable fraction of problem reports marked as bugs in Bugzilla (according to their severity) are indeed "non bugs", i.e., problems not related to corrective maintenance [AADP+08]. The authors presented a text-based classification technique that distinguishes bugs from non bugs, and empirically validated it on Mozilla, Eclipse and JBoss, obtaining a correct classification in approximately 80% of the cases.

Bug history

Few researchers have so far analyzed the history of software bugs. Halverson et al. presented a number of visualizations [HEDK06] to support the coordination of work in software development, namely the Work Item History and the Social Health Overview. The Work Item History shows status changes of bug reports and makes problematic patterns such as resolve-reopened visible. The Social Health Overview provides an interactive overview of bug reports with drill-down capabilities. Bug reports are represented as circles, whereas different bug measures such as the life-cycle patterns or the bug's heat can be mapped to the size and color of the circles. Aranda and Venolia performed a study of coordination activities around bug fixing in three major product divisions at Microsoft [AV09]. They first queried the bug database—containing rich bug histories—to extract the people involved in the bug fixes, and then contacted and interviewed them. The authors showed that social, organizational, and technical knowledge highly influences the history of bugs, whereas the information stored in the bug repository alone is not sufficient, and at times even misleading, to support developer coordination for bug fixing.

Summing up

While researchers proposed a number of approaches for bug triaging and for improving bug quality, little effort has been spent so far to model the evolution of bugs and to exploit the information residing in their histories. Moreover, all the mentioned approaches, except the one by Halverson et al. [HEDK06], tackle a specific task—such as detecting duplicated bugs—while no approach was devised to support the understanding of a bug repository as a whole and the analysis of bug evolution. We propose such an approach in Chapter 5.

2.5.2 Defect Prediction

Defect prediction approaches vary with respect to three dimensions: The data used for the prediction, the kind of prediction and its granularity. According to the used data, we distinguish techniques based on SCM log file data, on metrics extracted from the source code, and on other types of data. The granularity of the prediction can be either at the file/class level or at the module/package level. For the kind of prediction, we differentiate between classification and ranking techniques.

Classification techniques

The goal of classification techniques is to predict, for each software artifact, whether it will have at least one defect reported. The outcome of a classification approach is a classification table such as Table 2.4, called a confusion matrix.

Table 2.4. Confusion matrix: The outcome of classification approaches

                              Observation
                        Buggy              Non buggy
Prediction  Buggy       True positive      False positive
            Non buggy   False negative     True negative

To assess the performance of a classification model, typically used measures are:

• Precision. It is a measure of exactness defined as the ratio between the number of true positives and the number of artifacts predicted as buggy (true positives plus false positives), where a value close to one means that every artifact predicted as buggy actually had defects.

• Recall. It is a measure of completeness defined as the ratio between the number of true positives and the number of artifacts that actually had defects (true positives plus false negatives), where a value close to one means that every software artifact that had defects was predicted as buggy.

• F-measure. It is a measure that combines precision and recall, defined as their harmonic mean.

• Accuracy. It is a measure of classification goodness defined as the ratio between the number of correct classifications (true positives plus true negatives) and the total number of software artifacts, where a value close to one means that the model classified perfectly, without any misclassification errors.
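The four measures follow directly from the entries of the confusion matrix. A minimal sketch (the counts are invented for illustration):

```python
def classification_measures(tp, fp, fn, tn):
    """Compute precision, recall, F-measure and accuracy
    from the four entries of a confusion matrix."""
    precision = tp / (tp + fp)            # exactness
    recall = tp / (tp + fn)               # completeness
    f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f_measure, accuracy

# Example: 30 buggy artifacts correctly predicted, 10 false alarms,
# 20 buggy artifacts missed, 140 correctly predicted as non-buggy.
p, r, f, a = classification_measures(tp=30, fp=10, fn=20, tn=140)
# p = 0.75, r = 0.6, f ≈ 0.667, a = 0.85
```

Note the trade-off the measures capture: predicting every artifact as buggy yields perfect recall but poor precision, which is why the F-measure, balancing the two, is commonly reported.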

Ranking techniques

The goal of ranking techniques is to predict the order of software artifacts according to the number of defects they have (a list where the artifacts with most defects come first). To assess the quality of such a ranking, the typical measures are:

• The Pearson’s correlation coefficient, which measures the correlation between two variables, assuming a linear relationship.

• The Spearman’s rank correlation coefficient, which measures the correlation between a predicted and observed ranking, and is suitable for general associations [Tri06]. In a defect prediction context, the Spearman’s correlation is preferable to the Pearson’s one, as it is recommended with skewed data [Tri06] (with respect to a normal distribution), a typical characteristic of bug prediction datasets, where most of the software artifacts are non-buggy and only a few are buggy. A Spearman’s correlation close to 1 or -1 indicates an identical or opposite ranking, whereas a value close to 0 indicates no correlation.
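Spearman's coefficient is Pearson's coefficient computed on the ranks of the two variables, with tied values receiving the average of their ranks. A self-contained sketch (the defect counts are invented for illustration):

```python
def ranks(values):
    """Assign 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    return pearson(ranks(x), ranks(y))

# Predicted vs. observed defect counts for five artifacts.
predicted = [10, 3, 0, 7, 1]
observed = [9, 2, 0, 5, 1]
# The two rankings are identical, so the coefficient is 1.0.
```

Library implementations (e.g., scipy.stats.spearmanr) can be used instead; the point here is only that the measure compares rank positions, not raw defect counts.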

SCM approaches

SCM approaches use information extracted from SCM log files, assuming that recently or frequently changed files are the most probable source of future bugs. Khoshgoftaar et al. classified modules as defect-prone based on the number of past modifications to the source files composing the module [KAG+96]. They showed that the number of lines added or removed in the past is a good predictor for future defects at the module level. Graves et al. devised an approach based on statistical models to find the best predictors for modules' future faults [GKMS00]. The authors found that the best predictor is the sum of contributions to a module in its history. Nagappan and Ball performed a study on the influence of code churn (i.e., the amount of change to the system) on the defect density in Windows Server 2003. They found that relative code churn was a better predictor than absolute churn [NB05b]. Hassan introduced the entropy of changes, a measure of the complexity of code changes [Has09]. The entropy was compared to the amount of changes and the amount of previous bugs, and was found to be often better. The entropy metric was evaluated on six open-source systems: FreeBSD, NetBSD, OpenBSD, KDE, KOffice, and PostgreSQL. Moser et al. proposed a classification technique based on metrics (including code churn, past bugs and refactorings, number of authors, file size and age, etc.) to predict the presence/absence of bugs in Eclipse's files [MPS08]. The previous techniques do not make use of the defect archives to predict bugs, while the following ones do. Hassan and Holt's top ten list approach validates heuristics about the defect-proneness of the most and most recently changed and bug-fixed files, using the defect repository data [HH05]. The approach was validated on six open-source case studies: FreeBSD, NetBSD, OpenBSD, KDE, KOffice, and PostgreSQL. They found that recently modified and fixed entities were the most defect-prone. Ostrand et al.
predict faults on two industrial systems, using change and defect data [OWB05]. The bug cache approach by Kim et al. uses the same properties of recent changes and defects as the top ten list approach, but further assumes that faults occur in bursts [KZWZ07]. The bug-introducing changes are identified from the SCM logs. Seven open-source systems were used to validate the findings (Apache, PostgreSQL, Subversion, Mozilla, JEdit, Columba, and Eclipse). Bernstein et al. use bug and change information in non-linear prediction models [BEP07]. Six Eclipse plug-ins were used to validate the approach.
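The entropy of changes mentioned above is grounded in Shannon entropy: the more evenly the changes of a period are scattered across files, the more complex the change process is assumed to be. A simplified sketch (the per-file change counts are hypothetical; the actual approach adds refinements such as time-decay weighting, which are omitted here):

```python
from math import log2

def change_entropy(change_counts):
    """Normalized Shannon entropy of the distribution of changes
    across files in a period: 0 means all changes hit one file,
    1 means changes are spread evenly across all files."""
    total = sum(change_counts)
    probs = [c / total for c in change_counts if c > 0]
    if len(probs) <= 1:
        return 0.0  # all changes concentrated in a single file
    entropy = -sum(p * log2(p) for p in probs)
    # Normalize by the maximum entropy for this number of files.
    return entropy / log2(len(change_counts))

# Hypothetical: changes to four files in one period.
low = change_entropy([10, 0, 0, 0])   # all change in one file -> 0.0
high = change_entropy([5, 5, 5, 5])   # maximally scattered -> 1.0
```

Periods with high entropy are those where modifications are scattered across many files, which Hassan found to correlate with future faults.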

Source code metrics approaches

Defect prediction approaches based on code metrics assume that the current design and behavior of the program influences the presence of future defects. These approaches do not require the history of the system, but analyze its current state in more detail, using a variety of metrics. One standard set of metrics used is the Chidamber and Kemerer (CK) metrics suite [CK94]. Basili et al. used CK metrics on eight medium-sized information management systems based on the same requirements [BBM96]. Ohlsson and Alberg used several graph metrics including McCabe's cyclomatic complexity on an Ericsson telecom system [OA96]. El Emam et al. used the CK metrics in conjunction with Briand's coupling metrics [BDW99] to predict faults on a commercial Java system [EMM01]. Subramanyam and Krishnan used CK metrics on a commercial C++/Java system [SK03]; Gyimothy et al. performed a similar analysis on Mozilla [GFS05]. Nagappan and Ball estimated the pre-release defect density of Windows Server 2003 with a static analysis tool [NB05a]. Nagappan et al. used a catalog of source code metrics to predict post-release defects at the module level on five Microsoft systems, and found that it was possible to build predictors for one individual project, but that no predictor would perform well on all the projects [NBZ06]. Zimmermann et al. applied a number of code metrics on Eclipse [ZPZ07].

Other approaches

Ostrand et al. conducted a series of studies on the whole history of different systems to analyze how the characteristics of source code files can predict defects [OW02; OWB04; OWB07]. On this basis, they proposed an effective and automatable predictive model based on such characteristics (e.g., age, lines of code) [OWB07]. Zimmermann and Nagappan used dependencies between binaries in Windows Server 2003 to predict defects [ZN08]. Marcus et al. used a cohesion measurement based on LSI for defect prediction on several C++ systems, including Mozilla [MPF08]. Binkley and Schach devised a coupling dependency metric and showed that it outperforms several other metrics in predicting run-time failures [BS98]. Neuhaus et al. used a variety of features of Mozilla (past bugs, package imports, call structure) to detect vulnerabilities [NZHZ07]. In an empirical study of 52 Eclipse plug-ins, Schröter et al. showed that design data, such as import relationships, can predict post-release failures [SZZ06]. Pinzger et al. empirically investigated the relationship between the fragmentation of developer contributions and the number of post-release defects [PNM08]. To do so, they measured the fragmentation of contributions with network centrality metrics computed on a developer-artifact network. Wolf et al. analyzed the network of communications between developers to understand how they are related to issues in the integration of modules of a system [WSDN09]. They conceptualized communication as based on developers' comments on work items. Zimmermann et al. tackled the problem of cross-project defect prediction [ZNG+09], i.e., computing prediction models from a project and applying them to a different one. Their experiments showed that using models from projects in the same domain or with the same process does not lead to accurate predictions.
Therefore, the authors identified important factors influencing the performance of cross-project predictors. Bird et al. studied a problem common to many bug databases and affecting a number of defect prediction approaches [BBA+09]: They observed that the set of bugs that are linked to commit comments (and thus to software artifacts) is not a fair representation of the full population of bugs. Their analysis of several software projects showed that there is a systematic bias that threatens the effectiveness of bug prediction models. However, the technique used to link software artifacts with problem reports [FPG03; ZPZ07] still represents the state of the art.

Summing up

We observe that both the case studies and the granularity of the approaches vary. Varying case studies makes a comparative evaluation of the results difficult. Validations performed on industrial systems are not reproducible, because it is not possible to obtain the data that was employed. There is also some variation among open-source case studies, as some approaches have more restrictive requirements than others. Concerning granularity, some approaches predict defects at the class level, others consider files, while others consider modules or directories (subsystems), or even binaries. While the goal of some approaches is classification, i.e., predicting the presence or absence of bugs in each component, the goal of others is ranking, i.e., predicting the number of bugs that will affect each component in the future, producing a ranked list of components. These variations explain the lack of comparison between approaches and the occasionally diverging results when comparisons are performed. As a consequence, we identify the need for a benchmark to establish a common ground for comparison. We propose such a benchmark in Chapter 7.

2.6 Design Flaw Detection and Analysis

Researchers devised a number of approaches to address the problem of detecting and correcting design flaws in object-oriented software systems. Marinescu transformed informal design rules, guidelines, and heuristics [GHJV95; Rie96; FBB+99] into detection strategies [Mar04], metrics-based logical conditions that detect violations of design guidelines. Studying various large-scale software systems, Marinescu provided evidence that these strategies accurately spot design issues in object-oriented programs [Mar04]. Raţiu and Gîrba [RDGM04] applied the detection strategy concept to find design problems revealed by a software system's history. Trifu et al. proposed correction strategies to refactor design problems detected using the suite of detection strategies defined by Marinescu [TSG04]. Later, Lanza and Marinescu expanded the detection strategies previously proposed by Marinescu, and, analyzing statistical information from many industrial projects and generally accepted semantics, they deduced many single and combined threshold values for the detection [LM06]. They show in detail how to identify design flaws in code, which solution strategies can be used, and how to devise possible remedies. Salehie et al. proposed a metric-based heuristic framework to detect and locate object-oriented design flaws similar to those illustrated by Marinescu [SLT06]. Wettel and Lanza enriched the CodeCity visualization [WL07b] with design flaw information [WL08b]. Their approach, called disharmony map, locates software artifacts that are flawed according to the detection strategies mentioned above. They display software systems using a 3D visualization technique based on a city metaphor [WL07b], and enrich the visualization with the results returned by a number of detection strategies. Mäntylä and Lassenius conducted an empirical study in a Finnish software product company on the relationship between code smells and software evolvability [ML06].
They measured code smells both automatically, based on program analysis and source code metrics, and subjectively, based on developers' evaluations. The authors concluded that decisions concerning software evolvability improvement should be based on a combination of subjective evaluations and code metrics, as the empirical study showed that the two do not fully correlate. Further, Mäntylä in his PhD work [Mä09; Mä10] proposed a group of code smells based on a study of 563 evolvability issues found in industrial and student code reviews. Khomh et al. investigated the relationship between code smells and change-proneness [KPG09]. They analyzed multiple releases of Azureus and Eclipse to answer the following question: Are classes with code smells more change-prone than other classes? The results of their empirical study showed that classes with code smells are indeed more change-prone than others, and that specific smells are more correlated with change-proneness than others.
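As an illustration, a detection strategy is a logical composition of metric comparisons against thresholds. The following sketch mimics the well-known "God Class" strategy; the metric names (WMC, ATFD, TCC) come from the literature, but the threshold values and the function itself are illustrative, not the exact ones derived in [Mar04; LM06].

```python
# Sketch of a metrics-based detection strategy in the style of Marinescu's
# "God Class" rule. Metric names follow the literature (WMC, ATFD, TCC);
# the threshold values are illustrative assumptions.

def is_god_class(wmc: int, atfd: int, tcc: float) -> bool:
    """A class is flagged when it is complex (WMC), accesses foreign
    data (ATFD) and has low cohesion (TCC) -- all at the same time."""
    FEW = 5            # assumed threshold for "few" foreign data accesses
    VERY_HIGH = 47     # assumed threshold for "very high" complexity
    ONE_THIRD = 1 / 3  # assumed threshold for "low" cohesion
    return atfd > FEW and wmc >= VERY_HIGH and tcc < ONE_THIRD

# A cohesive, simple class is not flagged; a complex, data-grabbing one is.
print(is_god_class(wmc=12, atfd=2, tcc=0.8))   # False
print(is_god_class(wmc=60, atfd=9, tcc=0.1))   # True
```

The value of the formulation is that the rule is declarative: tuning a strategy means adjusting thresholds, not rewriting analysis code.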

2.6.1 Summing Up

The presence of design flaws in a software system has a negative impact on the quality of the software, as they indicate violations of design practices and principles, which make a software system harder to understand, maintain, and evolve. Design flaws have been thoroughly analyzed in the literature: to find good metrics and thresholds for their classification [Mar04; LM06; SLT06], to propose correction strategies and refactorings [TSG04; LM06], to visualize them [WL08b], and to relate them to code evolvability [ML06] or change-proneness [KPG09]. However, nobody has so far analyzed the relationship between design flaws and software defects, a tangible effect of poor software quality. We investigate such a relationship in Chapter 10, conducting an empirical study on six open-source systems.

2.7 Summary

We started this chapter by looking at the history of software evolution, to understand why software development tools are not adequate for software evolution analysis. One of the problems is the lack of integration between the various pieces of information produced during software development, such as SCM meta-data, source code, problem reports, etc. To tackle this problem, researchers proposed improvements to development tools (e.g., IDEs, SCMs) and approaches that integrate and model different software repositories to support software evolution analysis. We surveyed these approaches, concluding that:

• Integration. Approaches that model and combine different evolutionary aspects are a minority, as most software evolution analysis techniques focus on a single type of data. Moreover, the survey revealed that only one approach integrates source code snapshots, SCM meta-data and software defects [FPG03].

• Flexibility. According to our survey, only three approaches were used to build (few) analysis techniques on top, and only two approaches were extended to model new pieces of evolutionary information.

• Modeling defects. None of the surveyed approaches models software defects as first class entities that evolve over time.

To overcome these limitations, we propose an approach—presented in the next chapter— that integrates different aspects of source code and defects evolution. On top of this approach, we devise a number of analysis techniques, related to different research areas. We surveyed each of these research areas, identifying potentials for improvement:

• Change coupling analysis. We observed that existing change coupling analysis techniques work either at the architecture level, leading to a loss of detailed information, or at the file level, leading to an explosion of the data to be analyzed. None of the existing approaches integrates both levels of granularity. We also noticed that, while researchers considered the presence of change coupling a bad symptom, nobody empirically assessed whether it correlates with the presence of software defects.

• Bug Analysis. Our survey revealed that little effort was spent to model the evolution of bugs and to exploit the information residing in their histories. In particular, no approach was proposed to analyze the evolution of defects.

• Bug prediction. Bug prediction techniques vary with respect to the granularity of the prediction (file/class level or module/package level), the task they perform (classification or ranking) and the case studies they were validated on. These differences lead to a lack of comparison between approaches and to the need for a benchmark to establish a baseline.

• Design flaw detection and analysis. The presence of design flaws in a software system has a negative impact on software quality attributes such as maintainability and evolvability. However, no study was carried out to investigate the relationship between design flaws and the presence of software defects.

As we employ software visualization in three of our analysis techniques, we also presented a survey of visualization approaches, showing that visualization is an effective means to analyze the large amounts of data typical of large and long-lived systems. One last observation is orthogonal to the surveyed research areas and concerns all MSR approaches. In a recent work [Rob10], Robles reviewed all 171 articles published in the proceedings of the international workshop on MSR (2004-2007) and the working conference on MSR (2008-2009). He investigated whether the experiments presented in these articles can be replicated. The study revealed that in most cases MSR approaches cannot be replicated entirely, leading the author to conclude that replicability is a significant issue for MSR in particular, and software evolution analysis in general. In our approach, presented in the next chapter, we also address the issue of replicability.

Chapter 3

Modeling and Supporting Software Evolution

In our thesis work we created an approach that, by modeling software evolution, supports an extensible set of software maintenance tasks. Our approach consists of two parts: In the first one we created a meta-model of evolving software systems, and we devised a framework that implements the meta-model and serves as a basis for analysis. The meta-model and the framework overcome the limitations that we identified when surveying previous approaches. We convert such limitations into requirements, and list them in Table 3.1. In the second part of the dissertation, we propose a number of analysis techniques, built on top of our framework, to support several maintenance tasks. In this chapter, we present the first part of our research.

Table 3.1. Requirements for a software evolution meta-model and framework

R1 Integration of different evolutionary aspects. R2 Flexibility with respect to creating new techniques on top of the approach and extending the meta-model. R3 Modeling of software defects as first class entities. R4 Replicability of the analyses performed and availability of the data.

Structure of the chapter. In Section 3.1 we describe how we model an evolving software system, including source code and software defects. In Section 3.2 we present the software evolution framework that we implemented to validate our approach, and we outline the analysis techniques that we built on top of it in Section 3.3. In Section 3.4 we discuss the role of tools in our approach, and we conclude in Section 3.5.

3.1 Modeling an Evolving Software System

There are a number of aspects of a software system that can be considered when modeling its evolution, residing in various repositories such as defect databases, e-mail archives, and versioning system repositories. In our approach we focus on source code and software defects.


3.1.1 Modeling Source Code

Among the various evolutionary aspects we selected source code for two main reasons:

1. While there are many repositories revolving around software projects, many of them contain incomplete or outdated information. For example, requirements and design documents, use cases and, in general, documentation are often outdated [KC98]; information extracted from e-mail archives might refer to old system components that were removed or completely restructured. Source code, on the other hand, is the most reliable source of information: It generates the binaries that run the actual software system, and it is the final product of the development process.

2. Developers spend a considerable fraction of the maintenance effort on reading, restructuring and refactoring source code. For example, researchers estimated that software maintainers spend at least 50% of their time understanding the source code [FH83; Sta84].

For these reasons we claim that any software evolution meta-model should take source code into consideration. In our approach we model source code as an evolving entity, thus modeling its history. To do so, we consider two types of information that can both be extracted from an SCM (or versioning system) repository: source code snapshots and SCM meta-data recorded by the SCM at commit time. Figure 3.1 shows an example of these two types of information.

[Timeline from April 1 to May 11, 2010, showing six commits involving the files Foo.java, Bar.java and Viz.java, with the following SCM meta-data:]

Version  Author  Timestamp        Comment
1        Marco   4-1-2010 14:00   Initial version
2        Tom     4-5-2010 23:13   Extract method refactoring
3        Tom     4-28-2010 10:05  Isolated the responsibilities of the visualization and the viz
4        Marco   5-3-2010 16:46   Fixed bug #23
5        Marco   5-10-2010 13:30  Fixed a problem related to bug #23
6        Jack    5-11-2010 22:00  Refactorings

Figure 3.1. An example of source code snapshots and SCM meta-data

Source code snapshots are complete versions of a software system that can be retrieved, for example, from an SCM by performing a “check out” operation. Such an operation can be performed with respect to a particular version number or timestamp. For example, in the scenario depicted in Figure 3.1, one can check out a snapshot at version 3 or at any timestamp between April 28, 2010 - 10:05 and May 3, 2010 - 16:45, obtaining the same system version.

The obtained source code snapshot would include Foo.java and Viz.java as modified in version 3, and Bar.java as committed in version 1. SCM meta-data consists of information describing commit operations: the version number, who committed the code (and hence performed the changes), the timestamp (when the code was committed) and a comment that developers can write to describe the performed changes. Figure 3.1 shows some examples of SCM meta-data. This type of information has the benefit, over source code snapshots, of being lightweight, as only the commits are described. Moreover, it contains information that cannot be found in the source code, such as who modified the code and when. It is also possible to know the size of the change for each file, in terms of the number of lines added and removed.1 On the other hand, SCM meta-data does not provide any information about the source code itself: to retrieve this data, one has to check out the code from the repository. Checking out a version of a software system every three days, for example, might be very time consuming (especially for large systems), and the system might not change much in such a short time period. For these reasons we opted for a mixed approach, in which we consider SCM meta-data and bi-weekly system snapshots. In this way, we have all the pieces of information about the source code every two weeks, and, in between, the evolution is captured by the SCM meta-data, which also provides information orthogonal to the source code (e.g., authors, timestamps, comments).
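The bi-weekly sampling described above can be sketched as follows. The code is a minimal illustration, not part of our framework, and assumes the commit timestamps have already been extracted from the SCM log.

```python
# Sketch: selecting bi-weekly snapshot dates from SCM meta-data, under
# the mixed approach described in the text. The commit timestamps used
# in the example are illustrative.
from datetime import datetime, timedelta

def snapshot_dates(commit_times, interval_days=14):
    """Return one checkout timestamp every `interval_days`, starting at
    the first commit and never going past the last one."""
    first, last = min(commit_times), max(commit_times)
    dates, current = [], first
    while current <= last:
        dates.append(current)
        current += timedelta(days=interval_days)
    return dates

commits = [datetime(2010, 4, 1, 14, 0), datetime(2010, 4, 28, 10, 5),
           datetime(2010, 5, 11, 22, 0)]
# Three snapshots: Apr 1, Apr 15, Apr 29 (May 13 would fall past the last commit)
print(len(snapshot_dates(commits)))  # 3
```

Each returned date would then drive one "check out" operation against the repository.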

Versioning System Meta-Model

The versioning system meta-model describes the evolution of software artifacts as recorded by a versioning system (or SCM). Figure 3.2 shows a class diagram of this meta-model.

[Class diagram showing: Artifact (name, path), Version (versionNumber, action, linesAdded, linesRemoved), Transaction (timestamp), Comment (text), Person (name, email, scmAccounts), SCMAccount with subclasses CVSAccount and SVNAccount, Project (repositoryUrl, metaData), Module (name), and InferredLink (confidence) with subclasses BugVersionLink and VersionClassLink]

Figure 3.2. A class diagram of the versioning system meta-model

An artifact (typically a file) has a name, a path (the location in the repository) and several versions—one per transaction—where each version can be linked to a source code entity (i.e., a class) and one or more software defects. Since these links are not formally defined and we have to infer them with analysis techniques (detailed later in Section 3.2.2), we decided to model

them with two ad-hoc classes, where we also record the confidence, i.e., an indication of the quality of the inferred links. Given an artifact, each of its versions has a unique version number, the number of lines added and removed (with respect to the previous version) and an “action” identifier describing the type of change: addition or deletion if the artifact is added to or removed from the repository, modification otherwise. A transaction involves a set of artifacts and defines their versions; the transaction is performed at a certain timestamp by an author (modeled with Person), who also writes a comment to document the committed changes. An artifact, in the versioning system, belongs to a project and, if the project is modularized, it also belongs to a module. In some software projects the system modularization (the list of modules and the artifacts they contain) can be retrieved directly from the versioning system; in other projects we have to find the module decomposition somewhere else (e.g., in the documentation) and manually import it into the versioning system model. In the project class, we also keep the url of the repository and the raw meta-data that we use to extract and populate the versioning system model. The class Person has a particular role in our meta-model, as it is shared by the versioning system and the bug meta-models. A person has a name, can have one or more SCM accounts (account names retrieved from SCM meta-data), and can have an e-mail address (retrieved from a bug database). Since we retrieve data from both versioning system repositories and bug databases, we might find the same person represented by two different pieces of information, e.g., the e-mail address of a developer assigned to fix a bug and an SCM account.

1 In CVS and SVN a modified line is counted as one line added and one removed.
We tackle this aliasing problem, i.e., the same person represented by different pieces of information, by manually inspecting the data and figuring out when it refers to the same person. More sophisticated approaches can be used, as the one proposed by Bird et al. based on fuzzy string similarity, domain name matching, clustering, heuristics, and manual post-processing [BGD+06].
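To make the structure of the meta-model concrete, the following sketch renders its core entities as Python dataclasses. The field names follow the class diagram in Figure 3.2; the code is an illustration, not our actual implementation.

```python
# A compact sketch of the core of the versioning system meta-model as
# Python dataclasses. Illustrative only; field names follow Figure 3.2.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Person:
    name: str
    scm_accounts: List[str] = field(default_factory=list)
    email: Optional[str] = None          # retrieved from the bug database

@dataclass
class Version:
    version_number: str
    action: str                          # 'addition', 'modification' or 'deletion'
    lines_added: int
    lines_removed: int

@dataclass
class Transaction:
    author: Person
    timestamp: str
    comment: str
    versions: List[Version] = field(default_factory=list)

@dataclass
class Artifact:
    name: str
    path: str
    versions: List[Version] = field(default_factory=list)

marco = Person("Marco", scm_accounts=["mdambros"])
v1 = Version("1", "addition", lines_added=120, lines_removed=0)
t = Transaction(marco, "2010-04-01 14:00", "Initial version", [v1])
print(t.author.name, len(t.versions))  # Marco 1
```

The inferred bug-version and version-class links would be additional classes carrying a confidence value, as discussed above.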

Source Code Meta-Model

To model source code snapshots, we use the FAMIX meta-model, a language-independent representation of object-oriented code, including the concepts of classes, attributes, methods, packages, inheritance, accesses, invocations, etc. More details about FAMIX can be found in [DTD01].

3.1.2 Modeling Software Defects

Software defects, often considered an unwanted “side dish” of the evolution phenomenon, in fact represent a valuable source of information that can lead to interesting insights about a system that would be hard or impossible to obtain relying exclusively on the source code. In several software projects, developers keep track of defects by means of bug tracking systems such as Bugzilla, Jira, Scarab or Trac.2 Figure 3.3 shows the distribution of ca. a quarter of a million Mozilla bugs,3 as recorded in Bugzilla from September 1998 to April 2003, according to their lifetime. We define lifetime as the time elapsed between the moment in which the bug was reported and its last activity (the modification of one of its properties). With this example distribution, we want to show that bugs indeed live long, as more than 50% of the considered Mozilla bugs lived more than six months. For this reason, we argue that software defects should be modeled as first class entities:

2 Available respectively at: http://www.bugzilla.org, http://www.atlassian.com/software/jira/, http://scarab.tigris.org and http://www.mantisbt.org
3 http://www.mozilla.org


Figure 3.3. Distribution of Mozilla’s bugs according to their lifetime. More than 50% of bugs lived more than six months.

We consider not only several properties of bugs (such as the problem description, its severity, the conditions in which it was detected, the people involved in fixing and testing it, etc.), but also their histories, i.e., the sequences of states that they traverse.
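The lifetime computation behind Figure 3.3 can be sketched as follows; the bucket boundaries mirror the labels of the figure, and the example dates are made up.

```python
# Sketch: computing bug lifetimes (time between report and last activity)
# and bucketing them as in Figure 3.3. Bucket limits are expressed in
# days and are illustrative approximations of the figure's labels.
from datetime import datetime

BUCKETS = [("12 Hours", 0.5), ("1 Day", 1), ("1 Week", 7), ("1 Month", 30),
           ("6 Months", 182), ("1 Year", 365), ("2 Years", 730)]

def lifetime_bucket(reported: datetime, last_activity: datetime) -> str:
    """Classify a bug's lifetime into the first bucket it fits in."""
    days = (last_activity - reported).total_seconds() / 86400
    for label, limit in BUCKETS:
        if days <= limit:
            return label
    return "More"

print(lifetime_bucket(datetime(1999, 1, 1), datetime(1999, 1, 5)))   # 1 Week
print(lifetime_bucket(datetime(1999, 1, 1), datetime(2001, 3, 1)))   # More
```

Aggregating these buckets over all bugs in the database yields a distribution like the one shown in Figure 3.3.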

Bug Meta-Model

The bug meta-model describes the structure of a software defect. It is an abstraction of the Bugzilla implementation: We chose Bugzilla because it is among the most used bug tracking systems in the open source community, and because the models of other systems, such as Trac, Scarab and Jira, are simplifications of the Bugzilla one. Figure 3.4 provides an example of a bug report from the Eclipse project. A bug report has a number of properties that we describe in our bug meta-model, depicted in Figure 3.5. These properties are: the problem, the criticality, the state the bug is in, the condition in which the bug was detected, the people involved, the dependencies on other bugs and on the source code, and the history of the bug.

The problem. It includes the unique identifier, the (short) description of the problem, a list of keywords describing the problem, when the problem was reported (creationTimestamp) and its location in the system. The location is identified by the pair product-component, where a product contains several components. Each bug has a list of comments (BugComment), describing possible solutions, and a list of attachments, such as screenshots or patches.

The criticality. It is indicated by the fixing priority (from 1 to 5) and by the severity. The possible severities are, in order: Blocker (application unusable), critical, major, normal, minor, trivial (minor cosmetic issue), and enhancement (request for enhancement).

The state. The state of a bug is composed of its status and its resolution. The status identifies at which stage of the life cycle the bug is. The possible values are: Unconfirmed, new, assigned, resolved, verified, closed and reopened. The resolution indicates whether and how the problem was solved, once the bug reaches the resolved status. Possible values here are: Fixed, invalid, won't fix, not yet, remind, duplicate, and works for me. Figure 3.6 shows all possible bug statuses and transitions, based on the Bugzilla system.4 Each possible status is associated with a different color. Green colors represent statuses in which the bug is considered fixed (i.e., resolved, verified or closed), while red colors represent statuses in which the bug has to be fixed (i.e., new, assigned or reopened). We consider reopened the most critical status, since a first attempt did not fix the bug. Unconfirmed is associated with cyan, since it is not yet known whether the reported bug is real. In the remainder of this dissertation, and in particular in Chapter 5, we use the same color mapping for bug statuses.

Figure 3.4. An example of a bug report

[Class diagram showing: Bug and its properties, comments (BugComment), attachments (BugAttachment), activities (BugActivity: field, added, removed, author, timestamp), the people involved (reporter, assignedTo, qa, cc, modeled with Person), the dependencies (blocks, dependsOn) and the links to artifact versions (BugVersionLink)]

Figure 3.5. A class diagram of the bug meta-model


Figure 3.6. The Bug Status Transition Graph

The typical life cycle of a bug is the following: It is reported (either as new or unconfirmed, according to the privileges of the reporter), it is assigned to a developer for fixing (assigned), who then proposes a solution (resolved) or decides that a solution is not needed (for example when the resolution is set to duplicate, invalid or won't fix). When the bug is resolved, the quality assurance person tests the proposed solution and sets the bug status to one of the following: Verified, closed, reopened, unconfirmed. The bug status transition graph does not have a final state, because a bug can always be reopened.
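The transition graph of Figure 3.6 can be encoded as a simple lookup table. The transition set below is reconstructed from the life cycle as summarized in the text; the actual Bugzilla life cycle contains a few additional edges, so treat this as a sketch.

```python
# Sketch of the bug status transitions described in the text. The set of
# edges is an approximation of the Bugzilla life cycle, not a complete copy.
TRANSITIONS = {
    "unconfirmed": {"new", "assigned"},
    "new": {"assigned", "resolved"},
    "assigned": {"resolved"},
    "resolved": {"verified", "closed", "reopened", "unconfirmed"},
    "verified": {"closed", "reopened"},
    "closed": {"reopened"},            # no final state: bugs can be reopened
    "reopened": {"assigned", "resolved"},
}

def is_valid_history(statuses):
    """Check that a sequence of statuses only uses allowed transitions."""
    return all(b in TRANSITIONS[a] for a, b in zip(statuses, statuses[1:]))

print(is_valid_history(["new", "assigned", "resolved", "verified", "closed"]))  # True
print(is_valid_history(["closed", "new"]))                                      # False
```

Such a table is useful, for instance, to sanity-check the status histories reconstructed from bug activities.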

The condition. The settings in which the bug was detected, e.g., operating system and platform.

The people. The set of people involved includes the reporter of the bug, the developer in charge of fixing it (assignedTo), the quality assurance (qa) person who tests the solution and a list of people who want to be notified of the bug fixing progress (cc).

The dependencies. Bugs form graphs: Each bug b can have a list of other bugs that need to be fixed before b can be fixed (dependsOn). Symmetrically, each bug b can have a list of other bugs that can be fixed only after b is fixed (blocks).

The source code. Bugs might be linked with the software artifacts they affect. These links are important because they act as bridges between the versioning system and the bug meta-models. As previously mentioned when discussing the versioning system meta-model, bug-version links are not formally defined and hence we use a confidence parameter to model this fact.

4 See http://www.bugzilla.org/docs/2.18/html/lifecycle.html

The history. Our meta-model takes time into account. Every field of a bug can be modified over time, thus generating a bug activity (BugActivity). The activity records which field changed (field), when (timestamp), by whom (author) and the pair of old and new values (added and removed). Activities are important because they allow us to keep track of a bug's history and life cycle, i.e., the sequence of statuses the bug went through. Through bug activities we fulfill our requirement R3: Modeling software defects as first class entities that evolve over time and thus have histories.
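As a sketch, a bug's status history can be reconstructed by replaying its activities in timestamp order. The activity records below are illustrative; the dictionary keys mirror the BugActivity fields.

```python
# Sketch: reconstructing a bug's status history from its activities,
# as modeled by BugActivity (field, timestamp, removed -> added).
# The activity records below are made-up example data.
def status_history(initial_status, activities):
    """Replay the activities that changed the 'status' field, in
    timestamp order, yielding the sequence of statuses."""
    history = [initial_status]
    for act in sorted(activities, key=lambda a: a["timestamp"]):
        if act["field"] == "status":
            history.append(act["added"])
    return history

activities = [
    {"field": "status", "timestamp": 2, "removed": "new", "added": "assigned"},
    {"field": "priority", "timestamp": 3, "removed": "3", "added": "1"},
    {"field": "status", "timestamp": 5, "removed": "assigned", "added": "resolved"},
]
print(status_history("new", activities))  # ['new', 'assigned', 'resolved']
```

Non-status activities (e.g., priority changes) are retained in the model but do not contribute to the status sequence.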

3.1.3 Mevo: A Meta-Model of Evolving Software Systems

We discussed how to model versioning system data, source code and software defects in three distinct meta-models. Now we combine them in Mevo, a unique meta-model of evolving software systems, depicted in Figure 3.7. Other approaches for modeling software evolution exist (e.g., Hismo [GD06], RHDB [FPG03; APGP05], Kenyon [BEJWKG05], Hipikat [uMSB05], Tesseract [SMWH09]): We extensively discussed them when surveying related work in Section 2.2.

Figure 3.7. Mevo: A meta-model of an evolving software system. It is composed of three linked parts: The versioning system, the source code and the bug meta-models.

With the Mevo meta-model we fulfill our requirements for modeling an evolving software system. We wanted our meta-model to integrate different evolutionary aspects (requirement R1, cf. Table 3.1): Mevo combines the history of a software system (as recorded by a versioning system), multiple snapshots of the source code and information about software defects affecting the source code. The meta-model is extensible (requirement R2): We show in Chapter 8 how Mevo can be augmented to model e-mail data. Finally, in Mevo we model software defects as first class entities that evolve over time (requirement R3), and we model their histories by means of bug activities.

3.2 Our Approach in Action

We presented our meta-model to describe evolving software systems. Now we discuss how we use such a meta-model in practice, i.e., how we populate its instances and which analysis techniques we build on top of it. Figure 3.8 provides an overview of our approach “in action”, showing the two phases that compose it: data retrieval and pre-processing, and applications.

[Diagram: source code snapshots, a versioning system repository and a problem report database are processed by (A) data extraction and (B) links inference, populating the source code, versioning system and bug parts of the Mevo meta-model. Applications built on top: change coupling analysis, bug analysis, code-bug co-evolution analysis, bug prediction, change coupling-bug correlation analysis, software quality analysis]

Figure 3.8. The overall view of our approach in action, divided into data retrieval and pre-processing (links inference) and applications

Before being able to do any analysis, we need to retrieve and pre-process the data, i.e., to populate the Mevo meta-model. First we extract and process the data from object-oriented source code, versioning system repositories and bug databases. Table 3.2 shows which languages, versioning systems and bug tracking systems we support in our implementation. The data is then stored, according to the Mevo meta-model specification, into a database. Then we link the different parts together, i.e., we integrate the three parts of the meta-model.

Table 3.2. Languages and technologies supported in our approach implementation

Supported languages: C++, Java, Smalltalk
Supported versioning systems: CVS, SVN, Store [Sto00]
Supported bug tracking systems: Bugzilla, Jira

3.2.1 Retrieving the Data and Populating the Models

We obtain multiple source code snapshots of a software system either by downloading them from the system web site or by performing “check out” operations (one per snapshot) from the versioning system repository. Once we have obtained the source code snapshots, we parse them with inFusion5 (for Java and C++) or Moose [DGN05] (for Smalltalk), obtaining FAMIX models that we use to populate the corresponding part of Mevo. To populate the versioning system part of Mevo, we use different approaches: For CVS and SVN we parse the versioning system log files, which provide all the pieces of information about the history of all versioned artifacts. Since SVN logs, as opposed to CVS logs, do not contain information about lines added and removed, we have to do some additional data extraction for SVN: We perform an SVN diff for each version of each file to compute the lines added and removed. In the case of Store, historical information about versioned software artifacts is available as Smalltalk objects: We process these objects and populate the versioning system model. Finally, to instantiate the bug meta-model, we query bug databases through a web interface (available for both Bugzilla and Jira), which provides bug data in XML format. However, as the web interface does not provide bug activity information in XML format, to import the data concerning bug histories we have to parse HTML pages. When importing bug data, we filter out bug reports whose resolution is marked as “duplicate” or “invalid”. We also disregard bugs whose severity is “enhancement”, because they are not related to corrective maintenance.
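The additional extraction step for SVN can be sketched as counting added and removed lines in the unified diff output; the diff text below is a made-up example. Note that, consistently with the CVS/SVN convention mentioned earlier, a modified line counts as one line added and one removed.

```python
# Sketch: computing lines added/removed from the output of `svn diff`
# (unified diff format), as needed to complete SVN meta-data.
# The diff text below is an illustrative example.
def count_changes(diff_text):
    """Count added/removed lines in a unified diff, ignoring the
    '+++'/'---' file header lines."""
    added = removed = 0
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            added += 1
        elif line.startswith("-") and not line.startswith("---"):
            removed += 1
    return added, removed

diff = """--- Foo.java (revision 3)
+++ Foo.java (revision 4)
@@ -10,2 +10,3 @@
-    int x = 0;
+    int x = 1;
+    log(x);
"""
print(count_changes(diff))  # (2, 1)
```

In the example, one line was modified (counted as one addition and one removal) and one line was added, giving 2 lines added and 1 removed.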

3.2.2 Pre-processing and Linking the Data

Before linking the three parts of the Mevo meta-model, if the versioning system from which we imported the data is CVS, we need to pre-process the data. In Mevo, we model the concept of a transaction, and each transaction has a list of artifact versions. While SVN and Store mark co-changing artifacts at commit time as belonging to the same transaction, in CVS the artifacts committed in the same transaction must be inferred from the modification time of each of them.

Reconstructing transactions

To reconstruct the transactions, we take into account the following commit data: CVS account name, comment and timestamp. Given two or more commits, a necessary (but not sufficient) condition for them to be in the same transaction is that the user names and the comments coincide. We consider them to be in the same transaction if, in addition, a time condition holds. For that, two possible techniques are the “fixed time window” and the “sliding time window” approaches, proposed by Zimmermann and Weißgerber [ZW04] (depicted in Figure 3.9):

1. In the fixed time window approach, the beginning of the time window is fixed to the first commit (artifact1, version 1.1). All other commits with a timestamp included in the window are considered to be in the same transaction (only artifact2 version 1.4).

2. In Zimmermann's sliding window approach, the beginning of the time window is moved to the most recent commit recognized to be in the transaction. By doing this, artifact3 version 1.2 is also included in the transaction. The transactions reconstructed using this approach can include sequences of commits spanning more than the size of the time window. We use a time window of 200 seconds, as proposed by Zimmermann and Weißgerber [ZW04].
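The sliding time window reconstruction can be sketched as follows. This is a simplified illustration: the `Commit` record and all names are ours, and a real implementation would first parse the CVS log into such records.

```python
from dataclasses import dataclass

@dataclass
class Commit:
    artifact: str
    author: str
    comment: str
    timestamp: int  # seconds

def reconstruct_transactions(commits, window=200):
    """Group commits into transactions: same author and comment, and each
    commit at most `window` seconds after the most recent commit already in
    the transaction (the window "slides" with every accepted commit)."""
    open_tx = {}   # (author, comment) -> transaction still accepting commits
    transactions = []
    for c in sorted(commits, key=lambda c: c.timestamp):
        key = (c.author, c.comment)
        tx = open_tx.get(key)
        if tx is not None and c.timestamp - tx[-1].timestamp <= window:
            tx.append(c)  # slide the window to this commit
        else:
            tx = [c]
            transactions.append(tx)
            open_tx[key] = tx
    return transactions
```

Note that a transaction of three commits at times 0, 150 and 320 is kept together, even though its total span (320 s) exceeds the 200-second window, which is exactly the property that distinguishes the sliding from the fixed window.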

⁵ inFusion and Moose are available at http://www.intooitus.com and http://www.moosetechnology.org


Figure 3.9. Reconstructing CVS transactions: Fixed and sliding time window

Linking source code and versioning system data

After pre-processing the versioning system data, i.e., reconstructing CVS transactions, we link the versioning system part of Mevo with the source code and the bug parts, through the version-class and bug-version relationships respectively. Through these two relationships, software bugs and source code classes are also (indirectly) linked. However, Figure 3.10 shows that, in our implementation, the linking is possible only for certain combinations of languages, versioning systems, and bug tracking systems. On the other hand, it is always possible to populate individual parts of the meta-model, for example only the source code part or only the versioning system part.

Language     Versioning system   Bug tracking system
C++          CVS                 Bugzilla
Java         SVN                 Issuezilla
Smalltalk    Store               Jira

(feeding, respectively, the source code meta-model, the versioning system meta-model, and the bug meta-model)

Figure 3.10. Possible linkings among the different parts of the Mevo meta-model in our implementation

As Figure 3.10 shows, in our implementation it is not possible to link C++ classes with any versioning system artifact. Concerning Smalltalk and Store, the linking is straightforward, as classes and versions, both available as Smalltalk objects, are already linked to each other. To link Java classes with CVS or SVN artifacts, we use an approach based on the correspondence between the directory structure in the versioning system and the package nesting structure (or fully qualified name) in Java. With this approach, the class Layout shown in Figure 3.11 translates into the following path in the versioning system repository: ../org/uml/ui/Layout.java.


Figure 3.11. An example of package nesting. With our linking approach the Java class Layout is translated into the path ../org/uml/ui/Layout.java, to be found in the versioning system repository.
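The package-to-path translation amounts to joining the package segments with the directory separator; a minimal sketch (the function name and the repository-root prefix are ours):

```python
def class_to_path(qualified_name, root=".."):
    """Translate a fully qualified Java class name into the expected
    file path in the versioning system repository, exploiting the
    correspondence between packages and directories."""
    parts = qualified_name.split(".")
    return "/".join([root] + parts[:-1] + [parts[-1] + ".java"])
```

For instance, `class_to_path("org.uml.ui.Layout")` returns `"../org/uml/ui/Layout.java"`.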

In the Mevo meta-model we consider multiple source code snapshots: For this reason, a model can contain multiple FAMIX classes representing multiple versions of a class over time (one for each snapshot in which the class existed). We link each FAMIX class with an artifact version of the versioning system model. The selection of the artifact version is based on the timestamp of the transaction in which the version was committed. Let Foo be a FAMIX class and tFAMIX the timestamp of the source code snapshot containing Foo: The artifact version FooVer that we link to Foo is the most recent one⁶ for which the following condition holds

tFooVer ≤ tFAMIX    (3.1)

where tFooVer is the timestamp of the transaction in which FooVer was committed. Figure 3.12 shows an example of linking between multiple versions of a FAMIX class and artifact versions.
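Condition (3.1) selects, among all artifact versions committed no later than the snapshot, the most recent one. A sketch of the selection (the tuple representation is our simplification; in Mevo the timestamp belongs to the transaction, not to the version itself):

```python
def link_version(versions, t_snapshot):
    """Return the most recent (version_number, commit_timestamp) pair whose
    commit timestamp satisfies condition (3.1), i.e., is <= the snapshot
    timestamp; return None if no version qualifies."""
    candidates = [v for v in versions if v[1] <= t_snapshot]
    return max(candidates, key=lambda v: v[1]) if candidates else None
```

With versions committed at times 10, 20 and 30, a snapshot taken at time 25 is linked to the version committed at time 20.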


* timestamp is not an attribute of the class Version, but of the class Transaction. To make the diagram simpler and more readable we include timestamp directly in Version

Figure 3.12. Linking multiple versions of the Layout FAMIX class with artifact versions of the versioning system model

⁶ The most recent artifact version is the one committed in the most recent transaction.

Linking versioning system and bug data

The last part of the linking process deals with bugs and artifact versions, i.e., the connection between the versioning system and the bug meta-models, represented by the class BugVersionLink (cf. Figure 3.2). An artifact version in the versioning system contains a developer comment written at commit time, which often includes a reference to a problem report (e.g., "fixed bug 12345"). Such references allow us to link problem reports with versioning system artifacts, and therefore with source code entities, since versioning system artifacts and FAMIX classes are already linked. However, the link between a CVS / SVN artifact and a Bugzilla / Jira problem report has not yet been formally defined. To find references to problem report ids, we use pattern matching techniques on the developer comments, an approach widely used in practice [FPG03; ZPZ07]. To choose which regular expression to use to detect the links, we test candidate expressions on sample comments taken from the software project under study. Since some bug references are just plain numbers, without keywords such as "fix" or "bug", it is possible to obtain false positives. Thus, in our algorithm, each time we find a candidate reference to a bug report, we check that a bug with that id exists, and that the date on which the bug was reported precedes the timestamp of the commit in which the reference was found (a simple sanity check that the bug was fixed after being reported, cf. Figure 3.13).


Figure 3.13. Linking versioning system artifacts with bug reports: We first detect bug ids in commit comments and then check that the fix was committed (commit timestamp) after the bug was reported (bug creation timestamp).
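The regular expressions actually used vary from project to project; the sketch below shows one illustrative pattern (not the one used in Churrasco) together with the existence and timestamp sanity checks described above:

```python
import re

# Illustrative pattern: a keyword ("bug", "fix", "fixed", ...) followed by a
# number, or a bare "#123" reference.
BUG_REF = re.compile(
    r'(?:bugs?|fix(?:ed|es)?)[^0-9]{0,10}#?(\d+)|#(\d+)',
    re.IGNORECASE)

def find_bug_links(comment, commit_ts, bug_reports):
    """Detect candidate bug ids in a commit comment and keep only those for
    which a bug with that id exists and was reported before the commit.
    `bug_reports` maps bug id -> bug creation timestamp."""
    linked = []
    for m in BUG_REF.finditer(comment):
        bug_id = int(m.group(1) or m.group(2))
        if bug_id in bug_reports and bug_reports[bug_id] <= commit_ts:
            linked.append(bug_id)
    return linked
```

With a bug database `{23: 100}` (bug 23 reported at time 100), the comment "Fixed bug #23" committed at time 200 yields the link `[23]`, while the same comment at time 50 yields no link, as the fix cannot precede the report.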

As depicted in Figure 3.2, the class BugVersionLink has three attributes: Confidence (inherited from InferredLink), bugs, and versions. We use the confidence attribute to model the uncertainty of the links, i.e., the fact that they have to be inferred with heuristics. Possible values for the confidence range from 1 to 10, with a default value of 6. Bugs and versions model the many-to-many relationship between artifact versions and bug reports: In fact, a developer can fix multiple bugs in the same commit, thus writing a commit comment such as "fixed bugs #123 and #130". On the other hand, bugs can be fixed more than once, i.e., in different commits: This is the case of re-opened bugs. For such bugs it is not clear, without additional knowledge, whether the first attempt to fix the bug was wrong—and thus "overwritten" by subsequent fixes—or was right but incomplete—thus "completed" by subsequent fixes. In our approach, when we find multiple commits fixing the same bug, we perform the following operations:

1. We check that the fixes, i.e., the commits having the bug reference in the comment, were the only ones committed between two re-opened events or between the creation and a re-opened event (see Figure 3.14). In case there are multiple fixes between two re-opened events, we filter them out, as the data is not consistent (a bug must be re-opened before being fixed again).


Figure 3.14. Linking fixes (commits fixing a bug) with a bug re-opened several times: If there are multiple fixes between two re-opened events, we filter them out, as the data is not consistent (a bug must be re-opened before being fixed again).

2. We link all the fixes (satisfying the previous condition) to the bug report, by means of the BugVersionLink class.

3. We decrease the confidence value of BugVersionLink proportionally to the number of fixes, i.e., the greater the number of commits fixing a bug, the lower the confidence of each link between the fix and the bug. The rationale behind this choice is that the more attempts were required to fix a bug, the lower the impact of each attempt.
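The dissertation does not give the exact formula for lowering the confidence; the sketch below divides the default value by the number of kept fixes, which is one possible reading of "proportionally" (the function name, tuple representation and formula are our assumptions):

```python
def link_fixes_to_bug(fix_timestamps, reopen_timestamps, base_confidence=6):
    """Keep only the fixes that are the single commit in their interval
    (creation-to-reopen or reopen-to-reopen), then derive a per-link
    confidence that decreases with the number of kept fixes."""
    bounds = sorted(reopen_timestamps) + [float("inf")]
    kept, lo = [], float("-inf")
    for hi in bounds:
        in_interval = [t for t in fix_timestamps if lo < t <= hi]
        if len(in_interval) == 1:
            kept.extend(in_interval)  # single fix in the interval: keep it
        # more than one fix between two re-opened events: inconsistent, drop
        lo = hi
    confidence = max(1, base_confidence // max(len(kept), 1))
    return kept, confidence
```

For the scenario of Figure 3.14 (fixes at times 10, 30, 40 and 60, re-opened events at times 20 and 50), the two fixes between the re-opened events are discarded, and the two remaining links get confidence 3 instead of the default 6.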

3.2.3 Limitations

Since the data sources from which we populate our meta-model were not designed for this purpose, a number of technical issues have to be considered.

Generality vs Implementation. Although the Mevo meta-model is general and independent of the language (as long as it is object-oriented), the versioning system, and the bug tracking system, our implementation supports only certain languages (Java, C++ and Smalltalk), versioning systems (CVS, SVN and Store) and bug tracking systems (Bugzilla and Jira). Moreover, the complete meta-model—including the versioning system, the bug and the source code parts—can be exploited only with the Java language and the CVS or SVN versioning systems, as illustrated in Figure 3.10. In fact, in our implementation, the linking between source code entities and versioning system artifacts works for Java and Smalltalk (not for C++), but for Smalltalk / Store there is no link with a bug tracking system. For this reason, in our thesis work, we mainly focus on software projects developed in Java using CVS or SVN, with a few case studies in Smalltalk / Store and none in C++.

Java Inner Classes. A second issue comes from the file-based nature of versioning systems such as CVS and SVN, and from the way inner classes are defined in Java: They are defined in the same file as the container class, so the same versioning system artifact might point to several classes (the container class and the inner classes). For this reason, in our approach we filter out Java inner classes.

Renaming. In the Mevo meta-model, the identity of a versioning system artifact is based on its name. In the versioning system, renaming an artifact (or moving it to a different directory) is seen as the addition of a new artifact (with the new name) and the removal of the old artifact (with the old name). In our approach, we do not handle renaming, i.e., we model it as the versioning system does, with an addition and a removal. Researchers have proposed techniques to manage the identity of software artifacts that are robust to renaming events [APM04; KN06]; however, we did not implement these techniques in our approach.

Linking Bugs. Another technical issue concerns linking versioning system artifacts with bugs: The pattern matching technique we use to detect bug references in commit comments does not guarantee that all links are found. In fact, fixes that are not referenced in any commit comment cannot be found with our approach. Bird et al. recently studied this problem in bug databases [BBA+09]: They observed that the set of bugs that are linked to commit comments is not a fair representation of the full population of bugs. Their analysis of several software projects showed that there is a systematic bias that might threaten the effectiveness of techniques validated on bug datasets, such as bug prediction models. However, the approach based on pattern matching represents the state of the art in linking bugs to versioning system artifacts [FPG03; ZPZ07].

Non Bugs. Antoniol et al. showed that a considerable fraction of problem reports marked as bugs in Bugzilla (according to their severity) are actually "non bugs", i.e., problems not related to corrective maintenance [AADP+08]. To assess this issue, we manually inspected statistically significant samples of bugs linked with versioning system artifacts, and found that—in all the analyzed software projects—more than 95% of them were real bugs. This is not in contradiction with the findings of Antoniol et al. [AADP+08]: Bugs mentioned as fixes in commit comments are intuitively more likely to be real bugs, as they got fixed. As a consequence, the impact of this issue on our experiments is limited.

Large Commits. The way developers use versioning systems impacts the quality of our models. In particular, one problem that might pollute our data is large commits, i.e., transactions involving a large number of files. We distinguish two types of large commits: The first one—involving a relatively small number of files—groups two or more conceptual changes, while the second one concerns a single conceptual change but involves hundreds of files.

We illustrate the first type with the following example: A developer modified the files Foo.java and Bar.java to fix the bug 745, and the files Boo.java and Baa.java to add a new feature. He then committed all the changes together (in the same transaction), thus grouping two conceptual changes: a bug fix and a feature addition. The problem with these commits concerns the linking between software artifacts and bugs. In the previous example—supposing that the developer wrote "Fixed bug 745" as a commit comment—our approach would link the bug 745 not only with Foo.java and Bar.java (correct) but also with Boo.java and Baa.java (wrong). As a consequence, we rely on the developers' meticulousness for the quality of our data.

Regarding the second type of large commits, a representative example is a license update, which can involve all source code files. The problem with these commits concerns co-change analysis, where we count how many times software artifacts change together. One possible solution is to filter out the transactions that involve more than a certain number of artifacts (e.g., 50 or 100, depending on the size of the system). However, since the choice of the best threshold depends on the system and on the type of analysis that one wants to conduct (and might also require some manual inspection), we do not perform the filtering during the pre-processing of the data, but leave it as part of the subsequent analyses.
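When the filtering is eventually applied during co-change analysis, it can be a simple size threshold on transactions before counting co-changes; a sketch (the threshold value and names are illustrative):

```python
from collections import Counter
from itertools import combinations

def co_change_counts(transactions, threshold=100):
    """Count how often pairs of files change together, skipping
    transactions larger than `threshold` (e.g., system-wide license
    updates) that would otherwise pollute the coupling measure."""
    counts = Counter()
    for files in transactions:
        if len(files) > threshold:
            continue  # large commit: filtered out of the analysis
        for pair in combinations(sorted(set(files)), 2):
            counts[pair] += 1
    return counts
```

A transaction touching 150 files contributes no pairs, while the small transactions are counted normally.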

Table 3.3. Commit size in several software projects

Project          # commits   % 1 file   % 2-4 files   % 5-9 files   % 10+ files
Ant                 14,078      96.25        3.35          0.27          0.13
Django               4,812      87.43       12.57          0             0
Gcc                 87,900      47.26       40.64          6.64          5.46
Gimp                23,215      91.68        7.85          0.33          0.13
Glib                 5,684      88.79       10.66          0.44          0.11
Gnome-desktop        4,195      89.92        9.58          0.38          0.12
Gnome-utils          6,611      80.34       19.32          0.33          0.02
Httpd               39,801      56.17       40.76          2.89          0.17
Inkscape            14,519      90.92        8.83          0.22          0.03
Jakarta             70,654      77.43       20.74          1.64          0.18
Jboss                5,962      95.67        4.29          0.03          0
KDE                817,795      78.48       20.59          0.83          0.10
Lucene              14,078      80.52       18.45          0.93          0.10
Ruby on Rails        9,251      96.25        3.35          0.27          0.13
Spamassassin        10,270      91.17        8.26          0.50          0.08
Subversion          21,729      50.26       47.83          1.70          0.21
Total            1,158,824      75.69       22.41          1.38          0.52

Table 3.3 presents statistics gathered from 16 open-source software projects. The table shows that (1) most of the commits (∼75%) involve one file only, (2) about 22% of the commits are changes of two, three or four files, and (3) less than 2% of the commits involve more than four files. From this data we conclude that the impact of large commits (especially of the second type) is limited.

3.3 Applications

Once we have populated our Mevo meta-model, we can proceed with the subsequent analyses on top of it. In the following, we briefly introduce the analysis techniques; we detail each of them in a dedicated chapter in the remainder of the dissertation.

Change coupling analysis (cf. Chapter 4)
Meta-model: The versioning system part of Mevo
Tool: Evolution Radar

We devise a visualization-based approach that integrates co-change information (change coupling) at different levels of abstraction: The module level, to understand the relationships among the system's modules, and the file level, to uncover the causes of the couplings. The analysis technique supports the retrospective analysis of a software system and maintenance activities such as restructuring and re-documentation.

Bug analysis (cf. Chapter 5)
Meta-model: The bug part of Mevo
Tool: Bug's Life

We propose a visualization approach to support the analysis of software defects and their evolution. The technique is composed of two visualizations: The first one—aimed at studying the bug database in the large—is helpful to understand how bugs are distributed across system components and over time. The second visualization supports bug analysis in the small, i.e., the inspection of bugs affecting one or few software components. The visualization facilitates the characterization of bugs and the identification of the most critical ones.

Code-bug co-evolution analysis (cf. Chapter 6)
Meta-model: The versioning system and bug parts of Mevo
Tool: BugCrawler

We create a visual approach to uncover the relationship between evolving software and the way it is affected by software defects. By visually putting the two aspects close to each other, we characterize the evolution of software artifacts at different granularity levels. On top of the visualization, we define a catalog of co-evolution patterns that, thanks to their formal definition, can be automatically detected in a software system.

Bug prediction (cf. Chapter 7)
Meta-model: Mevo
Tool: Pendolino

We devise two defect prediction techniques based on the evolution of source code metrics, and we compare their performance with well-known bug prediction approaches. For the comparison, we propose a benchmark in the form of a publicly available data set consisting of several software systems.

Bug prediction with e-mail data (cf. Chapter 8)
Meta-model: An extension of Mevo
Tool: Pendolino

We investigate whether information extracted from development mailing lists can be used to predict post-release defects or to improve existing defect prediction models. To do so, we extend Mevo to model e-mail data, thus showing its flexibility.

Change coupling-bug correlation analysis (cf. Chapter 9)
Meta-model: The versioning system and bug parts of Mevo
Tool: Pendolino

Researchers have studied change coupling and observed that it points to design issues such as architectural decay. We analyze the relationship between change coupling and a tangible effect of design issues, i.e., software defects. We investigate whether change coupling correlates with defects, and whether the performance of bug prediction models based on software metrics can be improved with change coupling information.

Software quality analysis (cf. Chapter 10)
Meta-model: Mevo
Tool: Pendolino

The presence of design flaws in a software system has a negative impact on the quality of the software, as they indicate violations of design practices and principles. We study the relationship between software defects—tangible effects of poor software quality—and a number of design flaws, finding that no design flaw can be considered more harmful than the others. We also analyze the correlation between the introduction of new flaws in a software component and the generation of defects affecting that component.

3.4 Tool Support

Analyzing the evolution of large and long-lived software systems requires extensive tool support due to the amount and complexity of the data that needs to be processed. Our approach is unavoidably tied to the tools that we developed to implement our techniques in practice. We created a framework called Churrasco [DL08b], which implements the Mevo meta-model, with the following goals:

• Meta-model population. Churrasco supports the population of Mevo in batch mode, hiding the data retrieval and pre-processing tasks from the users. To create a model, i.e., import a software project to be analyzed, Churrasco provides a web interface.

• Base for analysis. The framework serves as a basis for the analysis, as it stores all the models in a centralized database and it provides an interface that allows us to build analysis tools on top of it.


Figure 3.15. The main components of the Churrasco framework and the analysis tools built on top of it

Figure 3.15 shows the main components of the Churrasco framework: (A) the importers, which retrieve and pre-process the data from the various data sources, (B) the web interface, which allows users to create new models, (C) the model repository, where the Mevo meta-model is implemented and all the models are stored, and (D) the data interface, which allows us to build analysis tools on top of Churrasco. Figure 3.15 also shows the analysis tools that we implemented on top of Churrasco, which realize the applications of our approach presented in the following chapters:

1. The Evolution Radar visualizes integrated change coupling information, supporting the analysis of the coupling at different levels of abstraction, i.e., the module and the file level.


Figure 3.16. The architecture of Churrasco

2. Bug’s Life is a visualization tool that supports the analysis of bugs and bug histories in the large (at the system level to identify critical components) and in the small (at the component level to identify the most critical bugs).

3. BugCrawler provides interactive visualizations of code and bug co-evolution, and it supports the automated detection of co-evolutionary patterns.

4. Pendolino is a scriptable data analysis tool that serves as a bridge to Matlab or other statistical tools. To act as a bridge, Pendolino computes and exports a variety of metrics and properties (e.g., source code metrics, historical metrics, bug metrics, number of design flaws, change coupling measures, etc.) about Mevo models in a format that can be imported in Matlab. We subsequently use the metrics in Matlab to create and evaluate regression models for different tasks, such as defect prediction.
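We do not know Pendolino's exact export format; purely as an illustration, per-entity metrics could be flattened to CSV, which Matlab imports directly (all names and metrics below are hypothetical):

```python
import csv
import io

def export_metrics_csv(rows, fieldnames):
    """Serialize per-class metric dictionaries to CSV text, a format that
    statistical tools such as Matlab can read directly."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

metrics = [{"class": "Layout", "loc": 120, "changes": 14, "bugs": 3}]
```

The resulting text starts with the header row `class,loc,changes,bugs`, followed by one row per class.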

3.4.1 Churrasco's Architecture

Figure 3.16 depicts Churrasco's architecture, consisting of:

1. The extensible Mevo meta-model. It can be extended using the facilities provided by the Meta-base module.

2. The Meta-base supports flexible and dynamic object-relational persistence. It uses the external component GLORP [Kni00] (Generic Lightweight Object-Relational Persistence), which provides object-relational persistence, to read from and write to a database. The Meta-base also uses the inFusion tool and the Moose reengineering environment to create a representation of the source code (C++, Java or Smalltalk) based on the FAMIX meta-model. For software systems written in Java or C++, we first parse them with inFusion and then import them into Moose and into the Meta-base. For systems developed in Smalltalk, we directly parse them in Moose and import them into the Meta-base.

3. The Bugzilla/Jira and SVN/CVS modules retrieve and process the data from Bugzilla/Jira and SVN/CVS repositories.

4. The Web portal represents the front-end of the framework, accessible through a web browser.

5. The Visualization module supports software evolution analysis by creating and exporting interactive visualizations.

6. The Annotation module supports collaborative analysis by enriching any entity in the system with annotations. It communicates with the web visualization module to depict the annotations within the visualizations.

The Meta-base

Churrasco's Meta-base [DLP07b] provides flexibility and persistence to any meta-model, and in particular to Mevo. It takes as input a meta-model described in EMOF and outputs a descriptor that defines the mapping between the object instances of the meta-model, i.e., the model, and tables in the database. EMOF (Essential Meta Object Facilities) is a subset of MOF,⁷ a meta-meta-model used to describe meta-models.

The Meta-base ensures persistence through the object-relational module GLORP. By generating descriptors of the mapping between the database and the meta-model, the Meta-base can be adapted dynamically and automatically to any meta-model. This allows us to modify and extend any meta-model dynamically. For more details, we refer the reader to Appendix B.

The SVN/CVS and Bugzilla/Jira modules

The SVN/CVS and Bugzilla/Jira modules retrieve and process data from, respectively, SVN/CVS and Bugzilla/Jira repositories. They take as input the URLs of the repositories and then populate the models by means of the Meta-base component. They are initially launched from the web importer (discussed later) to create the models; afterwards, they automatically update all the models in the database every night with the new information (new commits or bug reports).

The SVN/CVS module populates the versioning system model by checking out the project from the given repository and creating and parsing SVN/CVS log files. Checked out snapshots of the system are then used to create FAMIX models using the inFusion and Moose external components. The Bugzilla/Jira module retrieves and parses all the bug reports (in XML format) from the given repository. Then, it populates the corresponding part of the defect model. Finally, the module retrieves all bug activities from the given repository. Since Bugzilla and Jira do not provide this information in XML format, Churrasco parses HTML pages and populates the corresponding part of the model.

The web portal

The web portal is the front-end of Churrasco, developed using the Seaside framework [DLR07]. It allows users both to create models and to analyze them by means of different web-based visualizations. To create new models and access the visualizations, the user has to log in to the web portal.

⁷ MOF and EMOF are standards defined by the OMG (Object Management Group) for Model Driven Engineering. The specifications are available at: http://www.omg.org/mof/

(a) The importer page

(b) The project list page

Figure 3.17. The Churrasco web portal

Figure 3.17(a) shows the importer web page of Churrasco, ready to import the ArgoUML software project. The information needed to create a model is the URL of the SVN (or CVS) repository and the URLs of the Bugzilla (or Jira) repository (one URL for bug reports, one for bug activities). Since, depending on the size of the software system to be imported, retrieving the data can take a long time, the user can also indicate an e-mail address to be notified when the importing is finished. Figure 3.17(b) shows the project list page of Churrasco, which contains a list of projects available in the database, and—for a selected project—information such as the number of files and commits, the time period (time between the first and last commit), the number of bugs, a collection of FAMIX models corresponding to different versions of the system, etc. The page also provides a set of actions to the user, i.e., links to the web visualizations provided by Churrasco.

The visualization module

This module offers a set of interactive web-based visualizations. Their goal differs from the goals of the tools built on top of Churrasco: The visualizations are aimed at supporting collaboration in software evolution analysis. For this reason, we present them in detail in Appendix A, where we discuss collaborative software evolution analysis. Figure 3.18 shows an example visualization rendered in the Churrasco web portal. The main panel is the view where all the figures are rendered as SVG graphics.⁸ The figures are interactive: Selecting one of them highlights the figure (red boundary), generates a context menu, and shows the figure details in the information panel on the left. Churrasco provides other panels useful to configure and interact with the visualization, which we present in Appendix A.

The annotation module

The idea behind Churrasco’s annotation module is that each visualized model entity can be enriched with annotations to let different users collaborate in the analysis of a system. We discuss the use of annotations to support collaboration, together with two collaboration experiments, in Appendix A.

3.4.2 Discussion

With the Mevo meta-model, we already fulfilled some of the requirements that we selected for a software evolution analysis approach: The integration of different evolutionary aspects (R1) and the modeling of software defects as first class entities (R3). By means of the Churrasco framework, we fulfilled the remaining requirements. Churrasco provides flexible and extensible meta-model support (part of requirement R2), which allows us to change and extend the meta-model definition dynamically. This was helpful in the past, when we changed the Mevo meta-model to add new pieces of information (for example bug activities), and it allowed us to extend Mevo with e-mail data, as presented in Chapter 8. Churrasco is also flexible with respect to creating tools on top of it (the remaining part of requirement R2): The framework provides a data interface to access the information it stores, i.e., Mevo models. We exploited this interface in creating a number of tools, e.g., the Evolution Radar, Bug's Life, BugCrawler and Pendolino.

⁸ SVG (Scalable Vector Graphics) is a declarative XML-based language for vector graphics specification. It is available at: http://www.w3.org/Graphics/SVG/


Figure 3.18. A screenshot of the Churrasco web portal showing a visualization of ArgoUML

The last requirement to fulfill is the replicability of the analyses performed and the availability of the data (R4). The Churrasco framework, together with all the models it contains, is publicly available on the web. The analyses and the experiments that we present in the following chapters can be replicated, as the needed data can be downloaded from the Churrasco web portal. In some cases (e.g., for bug prediction and for software quality analysis) we even provide dedicated benchmarks to make the replication of the experiments easier. Other features of Churrasco include the annotation and collaboration support, discussed in Appendix A.

3.5 Summary

The approach presented in this dissertation is composed of two main parts: Modeling different aspects of software evolution and analyzing them to support an extensible set of software maintenance activities.

In this chapter, we presented the first part of the approach. We introduced Mevo, a meta-model that integrates three aspects of the evolution of a system: (1) multiple versions of the source code, (2) the history of software artifacts, as recorded by a versioning system, and (3) software defects with their histories. We discussed how to retrieve these pieces of information from distinct software repositories, and how to link them in a unique Mevo model.

To apply our approach in practice, we developed Churrasco, an extensible framework that implements the Mevo meta-model and serves as a basis for the subsequent analyses. We described Churrasco, its architecture and its main components.

At the beginning of this chapter, based on limitations identified in previous approaches, we defined four requirements for modeling and analyzing software evolution (cf. Table 3.1): (R1) integration, (R2) flexibility, (R3) modeling defects, and (R4) replicability. We discussed how the Mevo meta-model and the Churrasco framework fulfill these requirements. Mevo integrates various types of evolutionary information (R1) and models defects as first class entities (R3). Churrasco provides flexible meta-model support, and a data interface to create tools on top of it (R2). The framework also offers a web portal from which all the models used in our analyses and experiments can be retrieved, thus facilitating replicability (R4).

We briefly introduced the analysis techniques we devised on top of our approach, mentioning also the tools that implement them. These techniques constitute the second part of our thesis work, and thus we discuss them in detail in the remainder of this dissertation, together with their validation.

Part II

Inferring Causes of Problems

63

To analyze the evolution of software, one first has to model it. In the previous part of this dissertation, we introduced Mevo, a meta-model of evolving software systems that integrates source code, SCM meta-data and software defects information. We implemented such a meta-model in Churrasco, an extensible framework that serves as a basis for building various analysis techniques. In this part of the thesis, we present three such analysis techniques, aimed at detecting causes of problems in software systems. These techniques focus on different aspects of software evolution, i.e., the different parts of the Mevo meta-model, and support its analysis by means of interactive visualizations.

In Chapter 4 we focus on change coupling, evolutionary dependencies between source code artifacts. We present a visualization technique, called Evolution Radar, that supports the analysis of change coupling information at different levels of abstraction. Subsequently, in Chapter 5, we shift our attention to the evolution of software defects, introducing a visual approach to study a bug repository at different levels of granularity: In the large, visualizing the entire repository, and in the small, rendering individual bugs. Finally, in Chapter 6, after focusing on the evolution of code and defects in isolation, we analyze their co-evolution by means of a dedicated visualization called Discrete Time Figure. On top of the visualization, we define a catalog of co-evolutionary patterns that characterize the evolution of software entities.

Chapter 4

Analyzing Integrated Change Coupling Information

In the previous chapter, we introduced Mevo—our meta-model of evolving systems—and Churrasco, a framework that implements our approach to support various software evolution analysis techniques. In this chapter, we present one of the analysis techniques built on top of Churrasco. The technique focuses on the evolution of source code artifacts—as recorded by a versioning system and modeled in Mevo—and processes this evolutionary information to detect change coupling relationships.

Change coupling is the implicit dependency between two or more software artifacts that have been observed to frequently change together during the evolution of a system [GHJ98], although they are not necessarily structurally related (for example by means of inheritance, subsystem membership, usage, etc.). They are therefore linked to each other from a development process point of view: Coupled entities have changed together in the past and are likely to change together in the future. Change coupling information reveals potentially “misplaced” artifacts in a software system: Software artifacts that evolve together should be placed close to each other (e.g., in the same subsystem), so that a developer modifying a file does not forget to modify logically related files just because they reside in other subsystems or packages.

As we pointed out in Section 2.4, when surveying approaches dealing with change coupling analysis, such an analysis has two main benefits: (1) It is more lightweight than structural analysis and (2) it can reveal hidden dependencies that are not present in the code or in the documentation. However, our survey revealed also the main problem of previous techniques, i.e., the lack of integration: Existing approaches work either at the architecture level or at the file (or even lower) level.
Working at the architecture level provides high-level insights about the system’s structure, but low-level information about finer-grained entities is lost, and it is difficult to say which specific artifact is causing the coupling. Working at the file level makes one lose the global view of the system, and it becomes difficult to establish which higher-level consequences the coupling of a specific file has. We devised an approach that makes up for this dichotomy by integrating both levels of information, employing an interactive visualization that we named the Evolution Radar [DLL06; DL06b; DLL09]. The Evolution Radar visualizes information both at a module level (which modules are coupled with each other) and at a file level (which files are responsible for the change couplings).


With the Evolution Radar visualization we tackle the following problems:

• Presenting large amounts of evolutionary information in a scalable way.

• Identifying outliers among change coupling relationships.

• Enabling developers and analysts to study and inspect these relationships, and guiding them to the files that are responsible for the change couplings.

To validate our technique we applied it on ArgoUML and Azureus, two large and long-lived open source Java systems, and on CodeCity, a 3D visualization tool implemented in Smalltalk.

Structure of the chapter. In Section 4.1 we introduce our approach based on the Evolution Radar, a visualization technique that renders change coupling information, and discuss its benefits and shortcomings. We illustrate the use of the radar for retrospective analysis in Section 4.2 and show how it can also be used for maintenance tasks in Section 4.3. We conclude by summarizing our contributions in Section 4.4.

4.1 The Evolution Radar

The Evolution Radar is a visualization technique to render file-level and module-level change coupling information in an integrated way. It is interactive: It allows the user to navigate and query the visualized information.

Figure 4.1. Principles of the Evolution Radar (the module in focus is in the center; the other modules are sectors containing their files, each placed at distance d and angle θ)

The Evolution Radar shows the dependencies between a module in focus and all the other modules of a system. The module in focus is represented as a circle and placed in the center of a circular surface (cf. Figure 4.1). All the other modules are visualized as sectors, whose size is proportional to the number of files contained in the corresponding module. The sectors are sorted according to this size metric, and placed in clockwise order. Within each module sector, files belonging to that module are represented as colored circles and positioned using polar coordinates, where the angle and the distance to the center are computed according to the following rules:

• Distance d to the center is a linear function of the change coupling the file has with the module in focus, i.e., the more they are coupled, the closer the circle representing the file is placed to the center of the circular surface. The exact definition of the distance is:

d = (R / cc_max) × (cc_max − cc)    (4.1)

where R is the radius of the circular surface, cc_max the maximum value of the change coupling and cc the value of the change coupling.

• Angle θ. The files of each module are alphabetically sorted considering the entire directory path, and the circles representing them are uniformly distributed in the sectors with respect to the angle coordinates. In this way, files belonging to the same directory, or classes belonging to the same package in Java, are close to each other. Although this is not the only type of sorting possible, it was particularly useful in our experiments because it maintained the module decomposition (in directories or packages) in the visualization. Other types of sorting might expose different insights into the system.

Algorithm 1 shows the pseudo code of the layout algorithm. The Evolution Radar can map arbitrary metrics on the color and the size of the circle figures representing files.

Algorithm 1: The Evolution Radar layout algorithm

// Let M be the module in focus, R the radar’s radius, and CC(M, f) the change
// coupling between the module M and a file f
S := { all files f | f ∉ M }
cc_max := max_{f ∈ S} CC(M, f)
theta := 0
angle_step := 2π / |S|
forall m ∈ { all modules m̄ | m̄ ≠ M }, sorted by size do
    draw module sector initial boundary at theta
    forall f ∈ m, sorted by path do
        θ(f) := theta
        r(f) := (R / cc_max) × (cc_max − CC(M, f))
        theta := theta + angle_step
        draw f in polar coordinates (θ(f), r(f))
    end
    draw module sector final boundary at theta
end
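As an illustration, the layout can be sketched in Python. This is only a sketch of Algorithm 1: the function `radar_layout`, its parameters, and the descending sector order are our own assumptions, not part of the actual tool.

```python
import math

def radar_layout(modules, focus, coupling, radius=1.0):
    """Sketch of Algorithm 1: polar coordinates (theta, r) for every file
    outside the module in focus, using Equation 4.1 for the distance."""
    # S: all files not belonging to the module in focus
    files = [f for m, fs in modules.items() if m != focus for f in fs]
    cc_max = max(coupling(focus, f) for f in files) or 1  # guard against all-zero coupling
    angle_step = 2 * math.pi / len(files)
    theta = 0.0
    positions = {}
    # One sector per module; we assume sectors sorted by decreasing size
    for m in sorted(modules, key=lambda name: len(modules[name]), reverse=True):
        if m == focus:
            continue
        for f in sorted(modules[m]):  # alphabetical by full path within the sector
            cc = coupling(focus, f)
            r = radius / cc_max * (cc_max - cc)  # Equation 4.1
            positions[f] = (theta, r)
            theta += angle_step
    return positions
```

Strongly coupled files end up near the center (r close to 0), uncoupled files on the boundary (r equal to the radius).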


Figure 4.2. An example Evolution Radar applied on the core3 module of Azureus

4.1.1 Example

Figure 4.2 shows an example Evolution Radar visualizing the coupling between Azureus’ core3 module (represented as the cyan circle in the center) and all the other modules (represented as the sectors). The size of the figures is proportional to the number of lines changed, while their colors map the number of commits, using a heat-map from blue (lowest value) to red (highest value). We see that the ui module (on the top-right part of the radar) is the largest and most coupled module. The three files marked as 1 in the figure are the ones with the strongest coupling. They should be further analyzed to identify the most appropriate module to contain them: core3 or ui. Other modules do not have such a strong coupling, but we see the presence of some outliers, i.e., files for which the coupling measure is much higher with respect to the context. The two files marked as 2, belonging to the plugins and pluginsimpl modules, are such outliers and should also be analyzed and moved in case they belong to the wrong module.

Figure 4.2 also shows the structure of the Evolution Radar tool. The interactive visualization occupies the center and is set up using the panel on the right, by selecting the entities (e.g., source code files only or all files), the module in the center and the metrics. When an entity is selected in the visualization, the left panel shows the commit-related information about it (author, timestamp, comments, etc.). The tool allows the user to either consider all the commits or just the ones involved in the coupling. On the bottom part it displays information about the selected entity, such as its metric values.

4.1.2 Change Coupling Measure

In the Evolution Radar files are placed according to the change coupling they have with the module in focus. To compute the change coupling we use the following formula:

CC(M, f) = max_{f_i ∈ M} CC(f_i, f)    (4.2)

where CC(M, f) is the change coupling between the module in focus M and a given file f, and CC(f_i, f) is the coupling between the files f_i and f. It is also possible to use other group operators such as the average or the median. We use the maximum because it points us to the files with the strongest coupling, i.e., the main causes for the module dependencies. The value of the coupling between two files is equal to the number of transactions including both files. While SVN and Store mark co-changing files at commit time as belonging to the same transaction, in CVS transactions are not recorded and we need to reconstruct them, as described in Section 3.2.2.
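A minimal sketch of this computation in Python, assuming transactions are given as sets of file paths (the helper names `file_couplings` and `module_coupling` are ours, not the framework’s API):

```python
from collections import Counter
from itertools import combinations

def file_couplings(transactions):
    """File-level change coupling: number of transactions containing both files."""
    cc = Counter()
    for tx in transactions:
        for pair in combinations(sorted(tx), 2):
            cc[pair] += 1
    return cc

def module_coupling(transactions, module_files, f):
    """CC(M, f): maximum file-level coupling between f and any file of M (Eq. 4.2)."""
    cc = file_couplings(transactions)
    return max((cc[tuple(sorted((fi, f)))] for fi in module_files), default=0)
```

Replacing `max` with `statistics.mean` or `statistics.median` would yield the alternative group operators mentioned above.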

4.1.3 Interaction

The Evolution Radar is implemented as an interactive visualization. It is possible to inspect all the visualized entities, i.e., files and modules, to see commit-related information such as author, timestamp, comments, lines added and removed, etc. Moreover, it is possible to see the source code of selected files. Three important features to perform analyses with the Evolution Radar are (a) moving through time, (b) tracking and (c) spawning.

Moving through time

The change coupling measure is time-dependent. If we compute it considering the whole history of the system we can obtain misleading results.

Figure 4.3. An example of misleading results when considering the entire history of artifacts to compute the change coupling value: file1 and file2 share 9 of their 17 commits overall (7/7 in year 1, 0/4 in year 2, 2/6 in year 3), but are not coupled during the last year.

Figure 4.3 shows an example of such a situation. It depicts the history, in terms of commits, of two files: file1 and file2. The time is on the horizontal axis from left to right and commits are represented as circles. If we compute the coupling measure according to the entire history we obtain 9 shared commits out of a total of 17, a high value because the files changed together more than fifty percent of the time. Although this result is correct, it is misleading, since we could conclude that file1 and file2 are strongly coupled. Actually, file1 and file2 were strongly coupled in the past, but they are not coupled at all during the last year of the system.

Since we study change coupling information to detect architectural decay and design issues in the current version of a system, recent dependencies are more important than old ones. In other words, if two files were strongly coupled at the beginning of a system, but are not anymore in recent times (perhaps due to a reengineering phase), we do not consider them as a potential problem. For this reason the Evolution Radar is time-dependent, i.e., it can be computed either considering the entire history of files or a given time window. When creating the radar, the user can divide the lifetime of the system into time intervals. For each interval a different radar is created, and the change coupling is computed with respect to the given time interval. In each visualization all the files are displayed, even those inserted in the repository after the considered time interval or removed before. In this way, the theta coordinate of a file does not change in different radars and the position of the figures, with respect to the angle, is stable over time. This does not alter the semantics of the visualization, since these files are always at the boundary of the radar, their change coupling being zero.
The radius coordinate has the same scale in all the radars, i.e., the same distance in different radars represents the same value of the coupling. This makes it possible to compare radars and to analyze the evolution of the coupling over time. In our tool implementation the user “moves through time” by using a slider, which causes the corresponding radar to be displayed. However, having several radars raises the issue of tracking the same entity across different visualizations, discussed next.
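The time-window restriction can be sketched as follows (a hypothetical helper, not the actual tool code; commits are assumed to be timestamped sets of changed files):

```python
from datetime import datetime

def coupling_in_window(commits, f1, f2, start, end):
    """Shared and total commits of f1 (with respect to f2), restricted to [start, end).

    commits: iterable of (timestamp, set of changed files).
    """
    shared = total = 0
    for when, files in commits:
        if not (start <= when < end):
            continue  # only commits inside the window contribute
        if f1 in files:
            total += 1
            if f2 in files:
                shared += 1
    return shared, total
```

With the history of Figure 4.3, the whole-history window yields 9 shared of 17 commits, while the last-year window yields only 2 of 6.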

Tracking

Tracking allows the user to keep track of files over time. When a file is selected for tracking in a visualization related to a particular time interval, it is highlighted in all the radars (with respect to all the other time intervals) in which the file exists. Figure 4.4 shows an example of tracking through four radars, related to four consecutive time intervals, from January 2004 to December 2005. The highlighting consists of a yellow border for the tracked files and a text label with the name of the file (indicated with arrows in Figure 4.4). It is thus possible to detect files with a strong change coupling in the last period of time and then analyze the coupling in the past, allowing us to distinguish between persistent and recent coupling.

Spawning

The spawning feature is aimed at inspecting the change coupling details. Outliers indicate that the corresponding files have a strong coupling with certain files of the module in focus, but we do not know which ones. To uncover this dependency between files we spawn an auxiliary Evolution Radar as shown in Figure 4.5: The outliers are grouped to form a temporary module Mt, represented by a circle figure. The module in focus (M) is expanded, i.e., a circle figure is created for each file composing it. A new Evolution Radar is then created: The temporary module Mt is placed in the center of it. The files belonging to the module previously in focus (M) are placed around the center. The distance from the center is a linear function of the change coupling they have with the module in the center Mt. For the angle coordinate alphabetical sorting is used. Since all the files belong to the same module there is only one sector.
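The grouping step of spawning can be sketched as follows (a hypothetical helper; `file_cc` stands for any file-level coupling function, and the function name is our own):

```python
def spawn_ranking(file_cc, focus_files, outliers):
    """Rank the files of the former focus module M by their coupling with the
    temporary module M_t formed by the outliers (max over the group, as in the
    module-level coupling measure)."""
    cc = {f: max(file_cc(o, f) for o in outliers) for f in focus_files}
    return sorted(cc, key=cc.get, reverse=True)
```

The files at the top of the ranking are the ones that would appear closest to the center of the auxiliary radar.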

(a) January–June 2004 (b) June–December 2004

(c) January–June 2005 (d) June–December 2005

Figure 4.4. Evolution of the change coupling with the Model module of ArgoUML. The Evolu- tion Radar keeps track of selected files along time (yellow border).

Figure 4.5. Spawning an auxiliary Evolution Radar

4.1.4 Discussion

One of the advantages of the Evolution Radar is that it does not visualize the coupling relationships as edges and therefore does not suffer from overplotting: The radar always remains intelligible, i.e., it is easy to spot the heavily coupled modules (they are displayed as “spikes” pointing to the center). It is also easy to spot single files responsible for the coupling (they are placed close to the center). The Evolution Radar is a general visualization technique, i.e., it is applicable to any kind of entity. The only requirement is to define a grouping criterion and a distance metric. The radar can also be enriched by adding more structural information. A sector can be divided in sub-sectors to visualize sub-groups, e.g., sub-modules, as proposed by Stasko and Zhang [SZ00]. The main drawback, common to many visualizations, is that it requires a trained eye to interpret the displayed information.

4.2 Using the Evolution Radar for Retrospective Analysis

We implemented two versions of the Evolution Radar: One is a standalone tool to analyze systems developed using CVS or SVN, while the second version is integrated in an IDE used to develop Smalltalk code. In this section, we apply the standalone Evolution Radar tool to perform retrospective analysis on two open source software systems: ArgoUML and Azureus.1 ArgoUML is a UML modeling tool written in Java, consisting of 1,565 classes and more than 200,000 lines of code. Azureus is a BitTorrent client written in Java, with 4,222 classes and more than 300,000 lines of code. In the following we present a summary of the analyses of the systems, giving several examples of our approach. During the analyses, we use the term strong coupling (or strong dependency) between a file f and a group of files M when the following two conditions2 hold:

1. CC(M, f) ≥ 10, i.e., there is at least one file fM of M having a number of shared commits with f greater than or equal to 10.

2. The number of commits shared by f and fM divided by the total number of commits of f is greater than 0.35, i.e., f is modified more than 35% of the time together with fM.
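The two conditions above amount to a simple predicate; a sketch with our thresholds made explicit (the function name and signature are ours):

```python
def is_strong_coupling(shared_with_fm, total_commits_f,
                       min_shared=10, min_ratio=0.35):
    """Conditions 1 and 2: at least 10 commits shared with some file f_M of M,
    and more than 35% of f's commits shared with that f_M."""
    if total_commits_f == 0:
        return False
    return (shared_with_fm >= min_shared
            and shared_with_fm / total_commits_f > min_ratio)
```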

4.2.1 Azureus

Since Azureus does not have any documentation about its architecture, i.e., an explicit decomposition of the system in modules, we used the Java package structure to decompose the system. Throughout the analysis we use the following metric mappings: The size (area) of the figures representing files maps the number of lines changed and the color heat-map represents the number of commits. Both metrics are computed relative to the considered time interval.

Getting an overview

The first goal of our analysis is to obtain an initial understanding of the package histories, i.e., when they were introduced or removed from the system.

1 Available respectively at http://argouml.tigris.org and http://azureus.sourceforge.net
2 This definition is valid in the context of the analyzed software systems and its goal is to avoid mentioning the thresholds all the time.

(a) June 2003–August 2003 (b) August 2003–August 2004

(c) August 2004–August 2005 (d) August 2005–August 2006

Figure 4.6. Evolution Radars of the org.gudy.azureus2.ui package of Azureus

We apply the Evolution Radar with the package org.gudy.azureus2.ui in the center, since it is one of the packages existing from the first to the last version of the system. Figure 4.6 shows the result with a time interval of one year.3 Figures with cyan borders represent files that were removed during the considered time interval. During the first three months (Figure 4.6(a)) org.gudy.azureus2.ui was coupled with org.gudy.azureus2.core (gudy.core from now on), while the other packages did not exist. In the following year (Figure 4.6(b)) gudy.core was removed, since all the figures have a cyan border and in the following radars there is no activity in this package. The core of the system became org.gudy.azureus2.core3, coupled with ui (because of the figures close to the center) and com.aelitis.azureus.core (aelitis.core from now on) with very low activities (few figures). The plugins were also introduced with a clear separation from the ui, since the coupling was weak (no figures close to the center). From August 2004 to August 2005 (Figure 4.6(c)) the architecture decayed, as most of the packages were strongly coupled with the ui. Finally, during the last year (Figure 4.6(d)) the coupling decreased with all the packages, but in core3 there were still files with a strong dependency with the ui (figures close to the center).

3 The first radar refers to a time interval of only three months because the time intervals are computed backward starting from the latest version of the system.

Detailing the dependency between core3 and ui

In Figure 4.6 the two files indicated with the arrows and highlighted with the tracking feature (yellow border) are GlobalManagerImpl.java and DownloadManagerImpl.java. GlobalManagerImpl is a singleton responsible for keeping track of the DownloadManager objects that represent downloads in progress. They are the most coupled from 2003 to 2004 and from 2005 to 2006, and they have a strong dependency from 2004 to 2005. This strong and long-lived coupling indicates a design problem (a code smell), conforming to Martin’s Common Closure Principle [Mar00]: “classes that change together belong together”. The coupling indicates that the classes are misplaced or there are other design issues. We spawn two other radars having these files as the center to see which parts of the ui package they have dependencies with. Figure 4.7 shows the radars corresponding to August 2004–2005 and August 2005–2006. The dependency between GlobalManagerImpl and DownloadManagerImpl and the ui package is mainly due to the class MyTorrentsView, a God class (defined by Riel as a class that tends to centralize the intelligence of the system [Rie96]).

(a) August 2004–August 2005 (b) August 2005–August 2006

Figure 4.7. Evolution Radars for GlobalManagerImpl and DownloadManagerImpl

Such information is useful for (1) an analyst, because it points to design shortcomings, and for (2) a developer, because when modifying GlobalManagerImpl or DownloadManagerImpl in core3 she knows that MyTorrentsView is likely to need modifications as well.

Understanding the roles of modules

With this first series of radars we obtained an overall idea of the dependencies and evolution of the packages. We still do not have any idea of why there are two core packages and which roles they have. To obtain this information we select the files with the most intense activities (big and red) and we display their commit comments, as shown on the left of Figure 4.2. We find out that core3 has more to do with managing files, download, upload and priorities while aelitis.core is more related to the network layer (distributed hash tables, TCP and UDP protocols). In all the radars shown in Figure 4.6 the org.bouncycastle module has very low activity and coupling. A closer inspection reveals that the mentioned package was imported in the system at a given moment and never modified later. It implements an external cryptographic library4 used by the Azureus developers.

Analyzing the core package

As a second step, we want to understand the dependencies of aelitis.core with the rest of the system, and we want to detect which files are responsible for these dependencies. We create a radar for every six months of the system’s history. We start the study from the most recent one, since we are interested in problems in the current version of the system. Using a relatively short time interval (six months) ensures that the coupling is due to recent changes and is not biased by commits far in the past. Figure 4.8 shows the radar of aelitis.core from February to August 2006. Two packages are strongly coupled with aelitis.core: core3 and com.aelitis.azureus.plugins. Comparing the radar with the static dependencies (e.g., invocations and inheritances) of aelitis.core extracted from the source code, we see that:

• The module with the strongest static dependency is core3. In fact, the strength of the dependency between packages, measured by the total number of dependencies between the contained classes, is one order of magnitude stronger for core3 than for any other module. However, when we look at the radar we are surprised to see that the module with the strongest coupling is a different one: com.aelitis.azureus.plugins.

• Most of the static dependency with core3 comes from the sub-package util, responsible for 45% of the dependencies between the two packages. However, the radar shows that util is less coupled than its sibling sub-packages peer and download. The same holds also in the past: In the radars referring to previous time intervals util is always less coupled than download and peer.

These observations demonstrate that change coupling is an implicit dependency. It complements static analysis, but it can only be inferred from the evolution of the system. Figure 4.8 shows that the two classes with the strongest dependencies are DHTPlugin in the plugin package and PEPeerControlImpl in the core3 package. Using the tracking feature of the Evolution Radar we found out that, in the previous year, these two classes were outliers, i.e., they had a coupling much stronger than all the other classes in the package. To see the details of the dependency for DHTPlugin we spawn a new radar having the class as the center. The situation is different from the one shown in Figure 4.7 for DownloadManagerImpl and GlobalManagerImpl. For these two classes the coupling was mainly due to one single class (MyTorrentsView), while

4 The Bouncy Castle library is available at http://www.bouncycastle.org

Figure 4.8. Evolution Radar for Azureus’s aelitis.core package and details of the coupling be- tween this package and the class DHTPlugin, February – August 2006

for DHTPlugin it is scattered among many files. This means that the file DHTPlugin.java was often changed together with many files in the aelitis.core package. This is a symptom of a misplaced file. Looking at the source code of the ten most coupled files, we discovered that all of them use DHTPlugin or its subclasses, meaning that core classes use plugin classes. Moreover, the class with the strongest coupling is a test case using DHTPlugin. Therefore, a modification in the plugin package can break a test in the core package. This scenario repeats itself for the most coupled file in core3: The file PEPeerControlImpl.java is coupled with several files in aelitis.core. For the discussed reasons, DHTPlugin, PEPeerControlImpl, and their subclasses should be moved to aelitis.core.

Detecting renaming

The Evolution Radar can keep track of files, even if they are renamed or their code is moved to another place. This is possible because files are positioned according not only to their names, but also to their relationship with the center package. When a file fold is renamed to fnew, the relationship fold used to have with the package in the center will hold for fnew (note that CVS

(a) August 2003 – February 2004 (b) February – August 2004

(c) February – August 2006

Figure 4.9. Evolution Radars for the core3 package of Azureus. They are helpful to detect the transition from the old MainWindow.java to the new one.

and SVN record it as a removal of fold and an addition of fnew). Looking at similar couplings and removed files, we can detect such situations. Figure 4.9 shows three Evolution Radars having the core3 package in the center. From February to August 2006 core3 had dependencies with the plugins packages and aelitis.core, and a strong coupling with the ui. In the ui there are three outliers: MyTorrentsView.java (already detected and discussed), TableView.java (the superclass of MyTorrentView, a God class as well) and MainWindow.java. Figure 4.9(a) shows the radar corresponding to August 2003 – February 2004. In the ui there are again three outliers: MyTorrentsView.java, ConfigView.java and MainWindow.java.

This MainWindow.java is a different file from the one in Figure 4.9(c) and, in fact, it is not highlighted by the tracking feature: The two MainWindow classes belong to different sub-packages. However, since they have the same type of coupling with core3, they can represent the same logical entity. Figure 4.9(b) shows the transition between the old and the new MainWindow.java. They both have a strong coupling with core3 and the old one is removed, since it has a cyan border.

4.2.2 ArgoUML

We inferred ArgoUML’s system decomposition into modules from its web site. We omitted the modules for which the documentation says “They are all insignificant enough not to be mentioned when listing dependencies” and focused our analysis on the three largest modules: Model, Explorer and Diagram. From the documentation we know that Model is the central module that all the others rely and depend on. Explorer and Diagram do not depend on each other. We use the same analysis approach as for Azureus: We create a radar for every six months of the system’s history. As metric we use the change coupling for both the position and the color of the figures. The size is proportional to the total number of lines modified in the considered time interval.

(a) January – June 2005 (b) July – December 2005

Figure 4.10. Evolution Radars applied to the Explorer module of ArgoUML

The Explorer module

Figure 4.10(b) shows the Evolution Radar for the last six months of history of the Explorer module. This module is much more coupled with Diagram than with Model, although the documentation states that the dependency is with Model and not with Diagram. The most coupled files in Diagram are FigActionState.java, FigAssociationEnd.java and FigAssociation.java. Using the tracking feature, we discover that the coupling with these files is recent: In the radar for the previous six months (Figure 4.10(a)) they are not close to the center. This implies that the dependency is due to recent changes only. To inspect the change coupling details, we spawn an auxiliary radar: We group the three files and generate another radar centered on them, shown in Figure 4.11.

Figure 4.11. Details of the change coupling between ArgoUML’s Explorer module and the classes FigActionState, FigAssociationEnd and FigAssociation

We now see that the dependency is mainly due to ExplorerTree.java. The high-level dependency between two modules is thus reduced to a dependency between four files. These four files represent a problem in the system, because modifying one of them may break the others, and since they belong to different modules, it is easy to forget this hidden dependency. The visualization in Figure 4.10(b) shows that the file GeneratorJava.java is an outlier, since its coupling is much stronger with respect to all the other files in the same module (CodeGeneration). By spawning a group composed of GeneratorJava.java we obtain a visualization very similar to Figure 4.11, in which the file mainly responsible for the dependency is again ExplorerTree.java. Reading the code reveals that the ExplorerTree class is responsible for managing mouse listeners and generating names for figures. This explains the dependencies with FigActionState, FigAssociationEnd and FigAssociation in the Diagram module, but does not explain the dependency with GeneratorJava. The past (see Figure 4.10(a) and Figure 4.12(a)) reveals that GeneratorJava.java has been an outlier since January 2003. This long-lasting dependency indicates design problems. A further inspection is required for the ExplorerTree class in the Explorer module, since it is mainly responsible for the coupling with the modules Diagram and CodeGeneration.

Detecting a move operation

The radars in Figure 4.10(b) and Figure 4.10(a) show that during 2005 the file NSUMLModelFacade.java had the strongest coupling—in the Model module—with Explorer (the module in the center). Going six months back in time, from June to December 2004 (see Figure 4.12(a)), we see that the coupling with NSUMLModelFacade.java was weak, while there was a strong dependency with ModelFacade.java. This file was also heavily modified during that time interval, given its dimension with respect to the other figures (the area is proportional to the total number of lines modified). ModelFacade.java was also strongly coupled with the Diagram module (see

(a) Explorer module (b) Diagram module

Figure 4.12. Evolution Radars of the Explorer and Diagram modules of ArgoUML from June to December 2004

Figure 4.12(b)). By looking at its source code we find that it was a God class [Rie96] with thousands of lines of code, 444 public and 9 private methods, all static. The ModelFacade class is not present in the other radars (Figure 4.10(b) and Figure 4.10(a)) because it was removed from the system on January 30, 2005. By looking at the source code of the most coupled file in these two radars, i.e., NSUMLModelFacade.java, we discover that it is also a very large class with 317 public methods. Moreover, we find out that 292 of these methods have the same signature as methods in the ModelFacade class, with 75% of the code duplicated. The only difference is that NSUMLModelFacade's methods are not static. Also, it contains only two attributes, while ModelFacade has 114. ModelFacade represented a problem in the system and thus was removed. Since most of its methods were copied to NSUMLModelFacade, the problem was just relocated. This example shows how historical data reveals problems that are difficult to detect by looking at a single version of the system. Knowing the evolution of ModelFacade helped us understand the role of NSUMLModelFacade in the current version of the system.

Identifying system phases

As a final scenario, we analyze the evolution of the change coupling between the Explorer module and all the others. From Figure 4.12(a), we see that from June to December 2004 the couplings were very strong. Then, from January 2005 to June 2005 (Figure 4.10(a)), they decreased heavily. This suggests that in the earlier period the module was restructured and its quality was improved, since in the following time interval the coupling with the other modules was weak. The effort spent on the restructuring can be seen from the size of the figures, representing the total number of lines changed: In the radar for June – December 2004 (Figure 4.12(a)) the figures are bigger than in the radar for January – June 2005 (Figure 4.10(a)). At the end of the restructuring phase, the class ModelFacade was removed. From June to December 2005 (see Figure 4.10(b)) the coupling increased again. This can be related to a new restructuring phase.

4.2.3 Discussion

We applied the Evolution Radar to two open source software systems, showing that it helps answer questions about the evolution of a system that are useful to developers, analysts, and project managers. The Evolution Radar offers a visual way to assess the files that might change in the future, based on the prediction offered by change coupling. Due to the fine-grained level of the visualization, files can be inspected individually. For example, in Azureus we discovered that a change in the class GlobalManagerImpl or DownloadManagerImpl in the core3 package is likely to require a change in the class MyTorrentsView in the ui package. The Evolution Radar can be used to (1) understand the overall structure of the system in terms of module dependencies, (2) examine the structure of these dependencies at the file granularity level, and (3) gain insight into the impact of changes on a module over other modules. This knowledge supports developers and analysts in the following activities:

• Locating design issues such as God classes or files with a strong and long-lived dependency with a module. Examples of design issues detected in ArgoUML include the classes GeneratorJava in CodeGeneration and ExplorerTree in Explorer. GeneratorJava has a persistent coupling with the Explorer module, while ExplorerTree is coupled with both the CodeGeneration and Diagram modules. In Azureus we detected God classes (MyTorrentsView and its superclass TableView) having strong dependencies with files belonging to different packages.

• Deciding whether certain files should be moved to other modules. In Azureus’s case we discussed why DHTPlugin should be moved to the aelitis.core package.

• Understanding the evolution of the change coupling among modules. This activity can reveal phases in the history of the system, such as restructuring phases. In ArgoUML we identified different phases—with respect to the module Explorer—two of which are likely to be restructuring phases. The evolution of the dependencies, together with the information about the removed files, is also helpful to see when modules were introduced or removed from the system. We used this information to obtain an overall idea of Azureus' structure in terms of packages.

• Detecting when artifacts have been renamed or moved. For example, we discovered that in Azureus the class MainWindow was moved between packages. In ArgoUML we found out that most of the code of the ModelFacade class was moved to NSUMLModelFacade.

4.3 Supporting Maintenance with the Radar

As a second application of the Evolution Radar, we integrated it in an IDE to support maintenance activities. In this case, the radar visualization is part of the code browser, allowing a developer to go directly from the visualization to the code and back. The IDE we enriched with the Evolution Radar is the System Browser [RBJ97] of the Cincom Smalltalk Visualworks distribution.5 In Cincom Smalltalk the code is organized in bundles: A bundle can contain packages and bundles, and packages contain classes. To render the Evolution Radar, we consider the system decomposition in packages, “flattening” the bundles hierarchy. We consider packages instead of

bundles because the latter cannot contain classes, but only packages. To compute the change coupling between two classes versioned with Store (the versioning system for Cincom Smalltalk), we consider all the versions of the classes within the considered time interval, and we count how many times they changed together, i.e., how often they were committed in the same transaction. This number of co-changes is the change coupling between the two classes in the considered time interval.

5Available at http://www.cincomsmalltalk.com
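The co-change counting just described can be sketched as follows. The tool itself operates on Store in Smalltalk; this Python sketch, with invented class names and a simplified transaction representation, only illustrates the counting idea:

```python
from itertools import combinations
from collections import Counter

def change_coupling(transactions):
    """Count, for every pair of classes, how often they were
    committed in the same transaction (their change coupling)."""
    coupling = Counter()
    for classes in transactions:
        # Every unordered pair of classes in a transaction co-changed once.
        for a, b in combinations(sorted(set(classes)), 2):
            coupling[(a, b)] += 1
    return coupling

# Each transaction lists the classes committed together (invented names).
transactions = [
    {"ExplorerTree", "FigAssociation"},
    {"ExplorerTree", "FigAssociation", "GeneratorJava"},
    {"GeneratorJava"},
]
print(change_coupling(transactions)[("ExplorerTree", "FigAssociation")])  # 2
```

Only co-membership in a commit transaction is counted; the sketch deliberately ignores the size or kind of each change, matching the definition above.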

4.3.1 Integration in the IDE

Figure 4.13 depicts the enriched System Browser IDE. In the top part there are four panels used to browse the code. They present, respectively, packages, classes, protocols6 and methods. In the bottom part there are multiple tabbed views that present details or allow the editing of the element that is selected on top. They include a code editor, a code analysis tool, a comment editor, etc. We extended the IDE with a new tabbed view for the Evolution Radar.


Figure 4.13. The Evolution Radar integrated in the System Browser IDE. It shows the change coupling between the CodeCityGlyphs module and the rest of CodeCity’s modules.

The Evolution Radar view is composed of three panels (cf. Figure 4.13): The list of available packages in the current project, the main radar visualization and the secondary visualization in which a second radar can be spawned from a selection in the main one. The Evolution Radar panels include the information fields that show information about the entity under the mouse pointer and the time slider which supports the visual navigation through time.

6Protocols are a Smalltalk-specific way of grouping methods in a class.

The integrated version of the Evolution Radar adds the following interaction modes:

• When clicking on a figure, the browser in the top part displays the corresponding class or package, allowing the developer to immediately see and, in case, modify the source code. Moreover, through the context menu (see Figure 4.13), it is possible to directly move, modify and apply refactorings on the corresponding class.

• When selecting a figure or a group of figures, it is possible to: (1) Spawn a radar in the secondary view, (2) spawn a window with a code browser, (3) track the entity over time, and (4) read the commit comments corresponding to all the commits or just to the ones involved in the coupling.

To assess the usefulness of the visualization against the same coupling information presented in a list, we added another tabbed view to the IDE, called “Coupling List View”. In this view, instead of the main and secondary evolution radar visualizations, there is the list of classes coupled with the package selected in the package list (bottom left part of Figure 4.13), sorted according to the change coupling value. This implementation of the Evolution Radar is designed to support Smalltalk developers in three types of activity: System restructuring, re-documentation and change impact estimation.

Restructuring

Having the Evolution Radar integrated into the IDE makes it easy to inspect the classes that are coupled and—if they are in the wrong place—to restructure the system, i.e., move the classes to the appropriate package. This operation can be performed directly from the context menu in the Evolution Radar visualization (see Figure 4.13).

Re-documentation

The developer uses the radar to analyze the coupling of a package with the rest of the system and, by looking at the commit comments and the source code, he can discover the reasons for the coupling. Then, the developer can annotate this information directly on the involved classes and/or packages by writing a comment into the “Comment” tab (see the list of tabs in Figure 4.13). These comments are part of the system code and get versioned as every other entity.

Change impact estimation

When two classes are logically coupled, they are likely to change together in the future [SW08]. This information can be useful to a developer who is about to make a change to a class in the system, because it can support him in estimating the impact of the change. The developer can select the class (or classes) he needs to modify and see which classes are coupled with it. If there is no coupling (no figures close to the center), the developer can go on with the modification. If the class is coupled with few other classes, he can obtain more insight by looking at their source code and reading the comments written in the re-documentation phase. If the class is coupled with many other classes in the system, the developer has to find out whether these couplings are due to large commits or whether the class is affected by design issues. To do so, he can exploit the information gained in the re-documentation phase or he can look at the commit comments accessible directly in the radar.

The developer can access the “change impact” functionality using the radio button in the bottom left corner of the tool (see Figure 4.13). Whenever he selects a package, a class or multiple classes in the code browser (in the upper part), an Evolution Radar having the selection in the center is generated and rendered on the fly.

4.3.2 Experimental Evaluation

To evaluate the IDE version of the Evolution Radar, we asked a developer to use it and report on his experience. The developer is Richard Wettel, a PhD student at the University of Lugano. The system to which he applied the Evolution Radar is CodeCity,7 a software analysis tool that visualizes software systems as interactive 3D cities. At the time of the experiment (August 2008), CodeCity was composed of 478 classes and had about 15,000 lines of code. It had been developed since April 2006 mostly by a single developer, and occasionally by two other developers, in the context of pair-programming sessions. The following, in italics, is a slightly adapted extract of his experience report, organized according to the performed maintenance activities.

Re-documentation phase

We started by analyzing coupled modules (i.e., packages in Smalltalk), looking for coupling to the core packages, in our case the ones dealing with the glyphs, layouts, and mappings. Figure 4.13 shows the coupling to the glyphs module, represented by the circle in the middle of the left visualization panel, which seems to be distributed in five levels (i.e., five concentric layers besides the external one which shows the uncoupled classes). Whenever we needed to see which entities inside the module in the middle were coupled with a particular class, we spawned a radar from the selection (the right visualization panel in Figure 4.13) and observed these in isolation. Selecting a class circle triggers its selection also in the code editor. Integrating the information about change coupling with the code and with versioning logs allowed us to reason about the system’s evolution. Overall, we were able to determine the cause of the unexpected change couplings with the help of the context menu option, which allows looking at the log entries of the system versions which produced the coupling. The causes were either large commits incorporating several unrelated changes to the system or, in one case, a massive restructuring of the system. While we did not find any coupling that needed refactoring, all of them were accurately detected. We finally added all information we found to the comments of each analyzed class.

Change impact estimation phase

The second task we performed was assessing the impact of change of the AbstractLayout class (the root of the layout hierarchy), since we needed to reengineer the way the layouts communicate with the glyphs and mappings. To do so, we enabled the “change impact” functionality through the radio button, and then selected the AbstractLayout class in the code browser, which automatically generated the corresponding radars. In Figure 4.14, we see the evolution of the change coupling to the AbstractLayout class, which provides an accurate representation of the way the entire system evolved. In the first time period (top left) classes are being changed (i.e., large circles) in many packages being developed in parallel and logically coupled with each other. The following period is very unfocused, but with smaller changes (i.e., small fixes). The third period is one of coupling with classes

7CodeCity is available at http://codecity.inf.usi.ch

Figure 4.14. The change coupling evolution of the class AbstractLayout

from almost all the packages, while the last period shows coupling only to a few new classes, all defined in the CodeCityScripting package. Reading the comments written in the re-documentation phase, we knew that these classes of CodeCityScripting—which implement a basic scripting language for 3D visualizations—are coupled with many core classes (including AbstractLayout) and the coupling is justified: The core classes changed together with the scripting classes in order to make the system compatible with the new scripting language. Since the coupling was justified, we knew that we could proceed with reengineering the layouts, while being careful to comply with the new scripting language so as not to break the dependency with the CodeCityScripting package.

Comments

We were asked to try the list view and compare it with the radar view. In our experience, the list view was useful to easily spot the class most coupled with a given package. However, with the list we had difficulties in understanding the coupling at the system level and in seeing how it was distributed among packages and classes. We found it useful to be able to navigate in time using the radar, while keeping track of certain classes. With the list view, we did not succeed in navigating in time, because it was difficult to keep track of and compare the values of the change coupling for a class over different time intervals.

4.3.3 Lessons Learned

Although we do not consider this experiment a full-fledged user study, it indicates that the Evolution Radar can be successfully used to support maintenance activities. The visualization itself, without the interactive features (moving through time, spawning, tracking, inspecting), is not sufficient to perform the tasks, as the developer continuously used them during the maintenance activities. In particular, the possibility to read the commit comments seems crucial. Overall, while the first implementation of the radar as a stand-alone tool is helpful to perform retrospective analysis, only through the integration in a development environment does the radar exploit its full potential, as an IDE enhancement.

4.4 Summary

We presented the Evolution Radar, an approach to integrate and visualize module-level and file-level change coupling information, extracted from our evolutionary meta-model Mevo. The Evolution Radar is useful to answer questions about the evolution of the system, the impact of changes at different levels of abstraction and the need for system restructuring. The main benefits of the technique are:

1. Integration. The Evolution Radar shows change coupling information at different levels of abstraction, i.e., files and modules, in a single visualization. This makes it possible to understand the dependency among modules and—using the spawning feature of our tool—reduce it to a small set of strongly coupled files responsible for the dependency.

2. Control of time. Considering the history of change coupling is helpful to uncover hidden dependencies between software artifacts. However, summarizing the information about the entire history in a single visualization may lead to imprecise results. Two artifacts that were strongly coupled in the past but not recently may appear as coupled. The Evolution Radar solves this problem by dividing the system lifetime in settable time intervals and by rendering one radar per interval. A slider is used to “move through time”. A tracking feature is provided to keep track of the same files in different visualizations.

We illustrated our approach on two large and long-lived open source software systems: ArgoUML and Azureus. We provided example scenarios of how to use the Evolution Radar to understand module dependencies and the impact of changes at both file and module levels. We found design issues such as God classes, misplaced files and module dependencies not mentioned in the documentation. We also reduced these dependencies to coupling between small sets of files. These files should be considered for reengineering to decrease the coupling at the module level. The control of time allowed us to understand the overall evolution of the systems (when modules were introduced/removed) and to identify phases in their histories. We showed how the tight integration of the Evolution Radar with an IDE can support maintenance activities like restructuring, re-documentation and change impact estimation. We described how this support works, by presenting an experience report of a developer using the Evolution Radar inside the Smalltalk System Browser IDE. The focus of this chapter was the evolution of software artifacts. The next chapter focuses on the evolution of software defects.

Chapter 5

Analyzing the Evolution of Software Defects

Bug tracking systems play an important role in software development [MFH02; RdMF02]. They are used by developers, quality assurance people, testers, and end users to provide feedback on the system. This feedback can be reported as an incorrect or anomalous situation or as a request for enhancements. Bugs, often considered as an unwanted “side dish” of the evolution phenomenon, in fact represent a valuable source of information that can lead to interesting insights about a system that would be hard or impossible to obtain relying exclusively on structural information. When presenting our Mevo meta-model (cf. Section 3.1.2), we discussed the importance of modeling bugs as first class entities that can change and evolve over time. We showed, using Mozilla as an example, that bugs indeed live long (more than 50% of Mozilla bugs lived more than one year, cf. Figure 3.3), and thus in Mevo we model their histories, i.e., the sequence of states that they traverse. In this chapter we present a visual approach which exploits such historical information about software defects, as captured in our meta-model. As discussed when surveying approaches for bug analysis (cf. Section 2.5.1) and software evolution visualization (cf. Section 2.3.2), researchers proposed a number of techniques that use bug data, but they focus on the structural evolution of systems and do not consider bugs as evolving entities.

Our approach consists of two interactive visualizations that support the understanding of bugs at two different levels of granularity:

1. System Radiography. This view renders bug information at the system level and provides indications about which parts of the system are affected by what kind of bugs at which point in time. It is a high-level indicator of the system health and serves as a basis for reverse engineering activities.

2. Bug Watch. This visualization provides information about a specific bug and is helpful to understand the various phases that it traversed. The view supports the characterization of bugs and the identification of the most critical ones.

The visualizations are complementary to established structural visualizations and other reverse engineering techniques.


Structure of the chapter. We present our technique for visualizing software defects in Section 5.1, discussing also the constraints of dealing with a large bug database. In Section 5.2 we apply our approach on Mozilla, showing how to perform bug analysis in the large and in the small with our tool. In Section 5.3 we discuss the benefits and limitations of our approach, and we conclude the chapter in Section 5.4.

5.1 Visualizing a Bug Database

Table 5.1 reports some statistics about the bug databases of three open-source software projects: ArgoUML, Eclipse and Mozilla.

Table 5.1. Statistics about the bug databases of ArgoUML, Eclipse and Mozilla

                     ArgoUML             Eclipse             Mozilla             All
                     Feb 2000–Apr 2009   Oct 2001–Jul 2009   Sep 1998–Apr 2003
Number of bugs       4,280               120,531             255,302             380,025
Activities per bug   7.1                 7.8                 10.6                9.7
Bug lifetime
  < 1 day            995     23.7%       30,974   25.7%      28,236    11.1%     60,205    15.8%
  1 day – 1 week     673     16.1%       25,787   21.4%      15,420     6.0%     41,880    11.0%
  1 week – 1 month   641     15.2%       21,008   17.4%      19,250     7.5%     40,899    10.8%
  1 – 6 months       734     17.5%       20,747   17.2%      46,644    18.3%     68,125    17.9%
  6 months – 1 year  376      9.0%       7,004     5.8%      26,909    10.5%     34,289     9.0%
  1 – 2 years        372      8.9%       6,517     5.4%      37,453    14.7%     44,342    11.7%
  > 2 years          401      9.6%       8,494     7.1%      81,390    31.9%     90,285    23.8%
  > 1 month          1,883   44.9%       42,762   35.5%      192,396   75.4%     237,041   62.4%
  > 6 months         1,149   27.4%       22,015   18.3%      145,752   57.1%     168,916   44.5%

The table shows that the size of bug databases can be large, leading to a number of con- straints for visualizing them. In particular, three properties of bug databases have to be consid- ered:

1. Bug number: A bug database can contain more than 100,000 bug reports (Eclipse and Mozilla), therefore the visualizations have to scale.

2. Bug liveliness: In all the considered software projects more than 18% of bugs last more than 6 months (time between the reporting and the last registered activity), with an average number of activities ranging from 7.1 (ArgoUML) to 10.6 (Mozilla). Thus, the visualizations should not only display individual bugs, but also complementary information such as their activities and status histories.

3. Bug importance: Expressed with severity and priority, bugs have different impact and im- portance, and the visualization must help to convey this distinction.

5.1.1 The System Radiography View

The goal of the System Radiography view is to support the analysis of a bug database as a whole. We want to study how the open bugs (not fixed yet) are distributed in the system and over time.


Figure 5.1. The principles of the System Radiography visualization

Visualization Principles

Figure 5.1 shows the principles of the System Radiography view. The evolving system is displayed using a matrix-based representation. Each row of the matrix represents a system component, and each group of rows, i.e., components, represents a system product. We obtain this hierarchical decomposition of the system in Product::Component from the bug database itself, since each bug affects a particular component belonging to a particular product. The columns represent time from left to right. Each column corresponds to a parametrizable interval of time. The y position of each cell represents a Product::Component pair, while the x position represents an interval of time. The color of a cell maps the number of bugs affecting the y component during the x time interval, where x and y are the position of the cell. We use a gray scale: The darker the color, the larger the number of bugs. A bug, at any point in time (of its life), is characterized by a status. Therefore, it does not make sense to count the number of bugs during a time interval without considering their statuses. For this reason, the color of the cells represents the number of bugs with a given status (or set of statuses) during the considered time interval. This allows us to see the distribution of the bugs in the system and over time with respect to their status: For example “open” bugs (bugs with new, assigned or reopened status), “solved” bugs (resolved, verified or closed), or only new or reopened bugs, etc. In our tool implementation other filters can be used: For example, we can use the severity filter to count the blocker and critical bugs only. Once the matrix is created, we apply a sorting algorithm to its rows before displaying it. The goal is to sort, for each product, its components according to the similarity of their histories, i.e., the number of bugs for every time interval.
Given a product p and two components c1, c2, corresponding to the matrix rows r1, r2 of size n (the number of columns), the similarity between the components is defined as the Euclidean distance between the points r1, r2 in an n-dimensional space, as defined by Equation 5.1.

d(r_1, r_2) = \sqrt{\sum_{i=1}^{n} \left( r_1(i) - r_2(i) \right)^2} \qquad (5.1)

The value of rj(i) is the number of bugs with a given status (and after the filtering) of the j-th component within the i-th time interval. After sorting, we obtain a matrix in which the products are alphabetically sorted and the components within each product are sorted according to the defined similarity. This sorting allows us to detect groups of components that were affected by a large number of bugs in the same time window.
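The text does not detail the sorting algorithm itself, so the following is one plausible sketch: a greedy nearest-neighbor ordering of a product's component rows by the Euclidean distance of Equation 5.1. The component names and bug counts are invented for illustration:

```python
import math

def distance(r1, r2):
    """Euclidean distance between two component histories (Equation 5.1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))

def sort_components(rows):
    """Greedily order component rows so that each row is followed by
    the remaining row with the most similar bug-count history."""
    order = [min(rows)]  # start from an arbitrary (here: first) component
    while len(order) < len(rows):
        last = rows[order[-1]]
        rest = [c for c in rows if c not in order]
        order.append(min(rest, key=lambda c: distance(rows[c], last)))
    return order

# Bug counts per time interval for three hypothetical components.
rows = {"ui": [5, 9, 2], "core": [1, 1, 0], "util": [1, 2, 0]}
print(sort_components(rows))  # ['core', 'util', 'ui']
```

Components with similar histories (here core and util) end up adjacent, which is what makes co-affected groups visible as dark horizontal bands in the matrix.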


Figure 5.2. A System Radiography of Eclipse from October 2001 to July 2009. Only bugs with new, assigned or reopened statuses are considered.

Figure 5.2 shows our tool visualizing a System Radiography of Eclipse from October 2001 to July 2009, where only open bugs are considered. The time interval used is five days, meaning that each column of the matrix represents five days of time. The main window of the tool is divided into a visualization part on the left (marked as “1”), containing the actual System Radiography, and an information part on the right. This part displays the information concerning the matrix cell under the pointer: The time interval (3), the product::component pair (4), the list of bugs affecting that component during that time interval1 (5) and the most frequent terms

extracted from all bug activities and long descriptions (6). The tool also provides a toolbar (marked as “2” in Figure 5.2) to perform a number of actions such as importing the data from Churrasco, creating views, zooming and configuring visualization parameters. The same bug can be counted in different cells of the same row. For example, if bug 344 was in the New status for nine days, and the time interval is three days, then the bug is counted in three cells, as shown in Figure 5.3. On the other hand, since the product and component fields of a bug can have just one value, the same bug cannot be counted in different cells of the same column.

1Only the bugs with the considered statuses and severities are listed.


Figure 5.3. Counting open bugs over time in a row of the System Radiography view
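The counting rule illustrated in Figure 5.3 can be sketched as follows. Representing time as day offsets from the matrix origin and treating the end day as exclusive are assumptions of this sketch, not details given in the text:

```python
def cells_covered(status_start, status_end, interval_length):
    """Indices of the time-interval cells (matrix columns) in which a
    bug status should be counted. Timestamps are day offsets from the
    matrix origin; status_end is exclusive."""
    first = status_start // interval_length
    last = (status_end - 1) // interval_length
    return list(range(first, last + 1))

# A bug in the New status for nine days, with a three-day interval,
# is counted in three consecutive cells of its component's row.
print(cells_covered(0, 9, 3))  # [0, 1, 2]
```

Because a bug belongs to exactly one Product::Component, this indexing only ever spreads a bug across columns of a single row, never across rows of a column.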

5.1.2 The Bug Watch View

With the System Radiography view we obtain a big picture of the system from the bug perspective and we can detect the critical components. The purpose of the Bug Watch view is to ease the analysis of single bugs. Our goals are to characterize the bugs affecting a given set of components during a given time interval and to detect the most critical bugs. Our underlying assumption is that the criticality of a bug does not depend only on its severity and priority, but also on its life cycle. For example, a bug reopened several times indicates a problem which is hard to solve, as several attempts to fix the bug failed. For this type of analysis we need a visualization which fulfills the following requirements:

• Considering time. The visualization has to be time-based in order to show the life cycle history of a bug.

• Considering severity and priority. These bug properties are important to detect critical bugs.

Visualization Principles

Figure 5.4 shows a Bug Watch visualization of a Mozilla bug, rendered in our Bug’s Life tool. The visualization technique uses a watch metaphor to represent time: The initial timestamp is mapped to 00:00 on the watch, the final timestamp is mapped to 11:59. The figure is composed of three layers:


Figure 5.4. A Bug Watch Figure visualizing Bug 5119 of Mozilla between Oct 19, 1999 and Oct 16, 2001

1. The Status layer represents the bug life cycle. We visualize each status the bug passed through as a sector, using the following color schema (also used in Figure 3.6): Green colors represent statuses in which the bug is considered fixed (i.e., resolved, verified or closed), while red colors represent statuses in which the bug has to be fixed (i.e., new, assigned or reopened). The position and the size of each sector map respectively the initial timestamp and the duration of the corresponding status.

2. The Activity layer visualizes modifications of any bug property. We represent each activity as a black bar and we position it according to when the modification happened. Since an activity is an event, its visual representation has a fixed size. Wider bars denote several activities in the same time interval.

3. The Severity/Priority layer depicts information about the severity and the priority of a bug. Dark colors denote high priority and blocker or critical severity, while bright colors indicate low priority and minor, trivial or enhancement severity.
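The geometry of the Status layer can be sketched as follows. This is our own illustration of the watch metaphor, not code from Bug’s Life; timestamps are plain numbers and the function name is hypothetical:

```python
# Illustrative sketch of the watch metaphor: a status period is mapped to a
# sector whose start angle and angular extent are proportional to its
# position and duration within the reference time window.
FIXED_STATUSES = {"resolved", "verified", "closed"}  # drawn in green tones

def status_sector(status, start, end, window_start, window_end):
    """Return (start_angle, extent, color_family) in degrees for one status."""
    total = float(window_end - window_start)
    angle = 360.0 * (start - window_start) / total
    extent = 360.0 * (end - start) / total
    family = "green" if status in FIXED_STATUSES else "red"
    return angle, extent, family
```

For example, a new status covering the first half of the window yields a red sector spanning the first 180 degrees of the watch.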

We can map different bug metrics on the radius of a Bug Watch. We usually choose the number of statuses in the considered time interval, which also corresponds to the number of sectors. Besides easing the detection of bugs with an intense life cycle, this also has the benefit of making the statuses more readable, as bugs with many statuses are represented with larger figures. Figure 5.5 shows an example of this: Statuses are always readable, unless they last for a very short time.


Figure 5.5. A Bug Watch visualization of the bugs affecting the JDT product of Eclipse. The reference time interval is 2/20/2003 – 5/25/2004. We represent each bug as a Bug Watch Figure.

One problem of the Bug Watch visualization is that it is not possible to distinguish between different activities, since they are all visualized in the same way. We opted against the use of different colors for different activities because this would make the figure too complex to interpret. A second problem arises when there are statuses which last for short periods of time. They are represented as narrow sectors, which are difficult to distinguish, especially in zoomed out views, as for example in Figure 5.6. To overcome these issues, we provided our tool with four panels showing complementary information about the bug in focus (cf. Figure 5.4): (1) The id, priority and severity of the bug, (2) the description of the problem, (3) the life cycle, i.e., the sequence of statuses the bug traversed and (4) the activity history, i.e., the list of all activities performed on the bug. Considering the high number of bugs, the question is which bugs to visualize using the Bug Watch view. The starting point is usually the previously presented System Radiography visualization: We select an area in the System Radiography that our tool converts into a set of bugs and a time window. To create the set of bugs, we consider all the matrix cells covered by the rectangular selection, and we add the corresponding bugs. The selection—depending on its height—can cover one or several system components. The time window corresponds to the first (beginning) and last (end) matrix columns covered by the selection. We then visualize the selected bugs by means of Bug Watch figures, all with the same reference time window.


Figure 5.6. Clustering bugs according to the similarity of their life cycles. The clustering eases comparing bugs and spotting “exceptional” bugs.

Figure 5.5 shows a Bug Watch view applied to the rectangle annotated with “7” in Figure 5.2. The area covers the Networking component from November 2002 to April 2003. However, given the large amount of visualized bugs, it is difficult to identify trends in bug histories and to detect peculiar bugs. To address this issue, we devised a clustering technique that works as follows: Bugs having similar life cycles (in the considered time interval) are grouped and placed in the same box; Bugs with a diverse life cycle form single-element boxes. The algorithm then places the boxes in a way that similar boxes, with respect to the similarity of the contained bugs, stay close to each other (cf. Figure 5.6). This clustering technique facilitates the characterization of bugs, according to their life cycles, and the identification of “exceptional” bugs. 97 5.2 Experiments
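The grouping step can be sketched as follows. Note one simplifying assumption: here “similar life cycles” is reduced to strict equality of the traversed status sequences, whereas the actual technique uses a similarity measure:

```python
from collections import defaultdict

# Simplified sketch of the clustering step: bugs whose life cycles -- here
# reduced to the exact sequence of statuses traversed in the time window --
# are identical end up in the same box; all other bugs form single-element
# boxes. Sorting by the sequence keeps similar boxes adjacent.
def cluster_by_life_cycle(bugs):
    """bugs: dict bug_id -> list of statuses. Returns a list of boxes."""
    boxes = defaultdict(list)
    for bug_id, life_cycle in bugs.items():
        boxes[tuple(life_cycle)].append(bug_id)
    return [sorted(ids) for _, ids in sorted(boxes.items())]
```

With strict equality, bugs 1 and 2 below share a box while bug 3 is a singleton.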

5.2 Experiments

The Bug’s Life tool can visualize all the models available in the Churrasco framework, as long as they have bug data, i.e., the modeled software project uses a bug tracking system. To assess the usefulness of the presented visualizations, we apply them to the bug database of Mozilla from June 1999 to April 2003, a large dataset consisting of more than 250,000 bugs and 2,700,000 bug activities. We selected this time window for the following reasons: In June 1999—the beginning—Bugzilla was introduced, and in April 2003—the end—the Mozilla Organization planned to focus the development effort on the new standalone applications (e.g., Firefox and Thunderbird) rather than the Mozilla framework, adding new features and enhancements to the standalone applications only.2 Therefore, from June 1999 to April 2003, we have a large dataset focusing on a single application.

Table 5.2. The properties of the five areas highlighted in Figure 5.7

Label  Product      Components                                        Time interval       Avg. bugs per 3 days  #bugs
1      Browser      Bookmarks, Layout:Form-Controls, Layout:Tables,   May 2001–Jan 2003   215                   5,570
                    Plugins, XP-Apps:GUI, Event Handling
2      Browser      Browser-general, Layout, XP Toolkit/Widget,       Jun 1999–Apr 2003   291                   24,407
                    Editor Core, Networking, XP Apps, OJI
3      MailNews     Networking IMAP, Account Manager                  May 2001–Mar 2003   145                   874
4      MailNews     Address Book, Composition, Mail Back End,         Aug 1999–Apr 2003   408                   9,421
                    Mail Window Front End
5      Tech Evang.  Europe West, US General                           Oct 2000–Apr 2003   250                   1,871

5.2.1 Analysis in the Large

Figure 5.7 shows Bug’s Life visualizing a System Radiography of Mozilla, where we consider open bugs only. The time interval used is three days, meaning that each column of the matrix represents three days of time. Our goal with this first visualization is to understand where and when the open bugs are concentrated. In Figure 5.7 we see that Browser is the largest product—in terms of number of components—and the most affected by open bugs. It has a large number of dark rows, i.e., components affected by many bugs. We identified and annotated five system areas with the highest density of open bugs, described in Table 5.2. These areas contain system components which were affected by a large number of open bugs for a long period of time. The average number of bugs per basic time interval (three days) varies from 145 to 408, while the total amount of different bugs varies from 874 to 24,407. The shortest time window is 22 months, the longest 46 months. Such amounts and densities of bugs, with such a persistence over time, are bad symptoms, which indicate that the components should be reengineered to decrease the number of introduced bugs. The second goal of our analysis in the large is to understand where and when the open and most severe bugs are located. To do so, we generate a second System Radiography by selecting the bugs with blocker or critical severity only. Figure 5.8 shows an extract of the obtained

2 See the Mozilla roadmap at http://www-archive.mozilla.org/roadmap/roadmap-02-Apr-2003.html

Figure 5.7. A System Radiography of Mozilla from June 1999 to April 2003. We consider bugs with new, assigned or reopened statuses only.


Figure 5.8. An extract of a System Radiography of Mozilla from June 1999 to April 2003, zoomed on the y axis. We consider open bugs with critical or blocker severity only.

visualization (zoomed on the y axis), where we highlight the two areas having the highest density of open bugs with a critical or blocker severity. The selection marked as “1” includes the components Layout and Networking (belonging to the Browser product), from June 2001 to April 2003. The area marked as “2” covers the Mail Window Front End component of MailNews from December 2000 to April 2003. The average numbers of bugs (per three days) are 158.9 (area 1) and 71.5 (area 2), while the total numbers of different bugs are 2,338 (area 1) and 2,096 (area 2). The three mentioned components are also part of selections “2” and “4” in Figure 5.7 (as listed in Table 5.2). We conclude that they are the most critical components, in terms of being affected by severe bugs.
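The bug densities reported above boil down to a simple binning computation. The sketch below is our assumption of how a radiography matrix cell is filled (function name and input encoding are hypothetical; days are plain offsets from the start of the observed period):

```python
from collections import Counter

# Sketch of filling a System Radiography matrix: each cell counts the bugs
# of one component reported within one three-day column of the time window.
def radiography_cells(bug_reports, interval_days=3):
    """bug_reports: list of (component, day_offset) pairs.
    Returns a Counter keyed by (component, column_index)."""
    cells = Counter()
    for component, day in bug_reports:
        cells[(component, day // interval_days)] += 1
    return cells
```

Darker cells in the view then correspond to the larger counts in this mapping.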

Summing Up

With limited time and resources, it is important to prioritize the reengineering effort on the most troublesome and critical components of a software system. Using the System Radiography view and the interactive features of Bug’s Life, we detected three components that should have a high reengineering priority, as they were consistently affected by a large number of severe bugs. We also selected a number of components (listed in Table 5.2) which—with a lower priority—should be considered as starting points for reengineering. They have a lower priority because, although they were consistently affected by a large number of bugs, these bugs are not severe.

5.2.2 Analysis in the Small

With the System Radiography we detected a number of critical components in Mozilla. Now, by means of the Bug Watch visualization, we focus our analysis on one of these components, namely Networking, belonging to the Browser product. Since the time window selected in the analysis in the large (June 1999 – April 2003) is long, for the analysis in the small we reduce it to the last six months. Figure 5.9 shows a Bug Watch view of the open bugs affecting the Browser::Networking component of Mozilla from November 2002 (mapped to 00:00 on the watch) to April 2003 (mapped to 11:59). We observe five interesting facts that we annotate with numbers in the figure:

1. This is a crucial bug for the component, since it has blocker severity and maximum priority (dark severity/priority layer), and it was reopened four times. The activity layer shows that the history of the bug is rich in activities. The details of these activities (given in the information panels) tell us that the developer in charge of fixing the bug changed six times and that the bug is popular, since many e-mail addresses were added to the CC. The problem is related to SSL channels and proxies, as stated in the bug description.

2. All these bugs are hard to fix, since they were reopened at least once (and at most three times). They have various levels of severity and priority, but only one of them (marked also as “4”) has a critical severity.

3. These bugs have an unusual life cycle: They passed from a resolved or verified status to an unconfirmed or new status, without being reopened. By reading the activity details, we discovered that this behavior is associated with a change of person in charge (assignedTo) and often with a change of quality assurance person (qa).

4. All these bugs have the maximum level of priority and severity (critical or blocker).

5. This bug has a life cycle composed of one status only, but its history is full of activities. All these activities are additions of CC, meaning that the bug is very popular.

All the other bugs have a “normal” life cycle for bugs that are not yet fixed, following for example the transitions unconfirmed → new or new → assigned. None of these bugs has both maximum priority and blocker or critical severity.

Summing Up

Bugs are not all the same. When dealing with software systems as big as Mozilla, with thousands of bugs, knowing which bugs to fix first is a valuable piece of information. By using the System Radiography view one can decide where to start the reengineering of a large system. Subsequently, the Bug Watch visualization offers a visual means to determine which bugs to fix first, for example because they are severe or popular. The visualization is also useful to characterize bugs based on their life cycles, and to assess the effort required for fixing them: For example, bugs reopened several times, or bugs for which it is difficult to find the knowledgeable developer (i.e., the person in charge of fixing the bug changed many times), are hard to fix.

Figure 5.9. A Bug Watch visualization of the bugs affecting the Browser::Networking component of Mozilla. The reference time interval is November 2002 – April 2003.

5.3 Discussion

The proposed visualizations support the analysis of a bug database at two different but complementary levels of granularity. This has two main benefits: (1) The technique scales up to the size of Mozilla, i.e., 250,000 bugs and 2,700,000 activities, and (2) it is possible to visualize and inspect any individual bug together with its history. The interaction capabilities the tool offers allow the user to “jump” from the large scale System Radiography to a detailed Bug Watch view, which visualizes a selected part of the system or even a single bug. Bug’s Life also allows the user to customize the views by applying filters to bug properties such as priority and severity. Another advantage of the approach is that both visualizations include the time dimension. In particular, the Bug Watch view has the time embedded in each figure. This permits any type of layout and grouping, without losing or modifying time-based properties of bugs (e.g., life cycle and activities). The last benefit comes from the sorting and grouping techniques used in both views, which ease the detection of critical system areas and the characterization of bugs. Concerning limitations, the presented approach deals with bugs only, and not with source code: When a bug has a valid link to a source code artifact (as modeled in Mevo), the Bug’s Life tool does not let the user “jump” to the source code. However, while such a feature would be useful, the focus of our technique is bugs and bug histories. The System Radiography view is simple and easy to interpret; the Bug Watch visualization, on the other hand, needs a trained eye to exploit its potential. Finally, we do not consider the experiments performed on the Mozilla case study as a full-fledged validation of our approach, but as anecdotal evidence of its usefulness.

5.4 Summary

We presented an approach to support the analysis of a bug database, by means of two interactive visualizations: The System Radiography and the Bug Watch views. The System Radiography is aimed at studying the bug database in the large: The visualization is helpful to understand how the open bugs are distributed across the system components and over time. It also highlights the critical parts of the system, i.e., the components affected by the most severe bugs. The Bug Watch view supports the analysis of the bugs affecting a limited part of the system, e.g., one or a few components. The visualization eases the characterization of bugs, the identification of the critical ones and the assessment of the required fixing effort. We applied our approach on the bug database of Mozilla, consisting of more than a quarter million bugs with more than two million activities. Through this large case study, we showed how one can use our visual technique to answer questions concerning the resource optimization problem, such as: From which components should I start the reengineering of the system? Given a certain component, which bugs should I fix first? How difficult will it be to fix this bug? Analyzing the evolution of software defects is complementary to studying source code evolution. In the last two chapters we discussed two techniques to support these two activities in isolation. The next chapter focuses on their combination, i.e., on the co-evolution of source code and software defects.

Chapter 6

Code and Bug Co-Evolution Analysis

In Chapters 4 and 5 we studied the evolution of source code and the evolution of software defects. In this chapter our goal is to analyze their co-evolution. As discussed when presenting our Mevo meta-model (cf. Section 3.1), one of its key features is that it integrates evolutionary source code information with bug data. We now propose a visual approach that exploits this integration and supports the analysis of code and bug co-evolution. We introduce the Discrete Time Figure, a visual approach that embeds information concerning development activities and bug data into one simple figure. The visualization also provides structural information (such as a directory hierarchy structure or the number of files a module contains), as it helps understand the relationships among software artifacts. Our main goal is to make sense of the integrated information, and in particular we aim at:

• Showing the evolution of software entities at different granularity levels in the same way, i.e., using the same visualization.

• Showing structural (such as software metrics), commit- and bug-related information.

• Merging all these kinds of information to obtain a clearer picture of the evolution of a system’s entities. This knowledge is key for a reengineering activity, since it allows an analyst to detect the system’s hotspots, i.e., the starting points for a reengineering process.

There are technical challenges in achieving the above goals, the main one being how to provide a large amount of information in a condensed, yet useful way. This also relates to scalability, i.e., the visualization must work at all granularity levels of large and long-lived systems. We deal with the scalability challenge by abstracting the evolution of code and bugs in “phases” that describe trends of activity. However, phases focus on the evolution of code or bugs, but do not consider their interrelationship. Therefore, based on visual properties of Discrete Time Figures, we define a catalog of co-evolutionary patterns that describe such an interrelationship. The patterns allow the characterization and the comparison of a system’s components, and they define a vocabulary that is a useful communication means.

Structure of the chapter. In Section 6.1 we discuss the motivation and principles of our ap- proach based on Discrete Time Figures. In Section 6.2 we present a catalog of co-evolutionary patterns that characterize the co-evolution of software artifacts. We then apply our approach to

three large systems (i.e., Apache, gcc and Mozilla) in Section 6.3. Before concluding the chapter in Section 6.5, we discuss the limitations of our approach in Section 6.4.

6.1 Visualizing Evolving Software

Figure 6.1 depicts a possible way to visualize the co-evolution of code and bugs: It shows a Timeline view (time is on the horizontal axis from left to right) where the displayed rectangles and crosses represent commits and bugs, respectively.


Figure 6.1. One way of visualizing the co-evolution of code and bugs

The figure shows the co-evolution of two specific files, where we determine the horizontal position of rectangles based on the commit timestamps, while for the position of the crosses we consider when bugs were reported. The color maps author information (we use different colors to represent different authors) for the commits and severity information (red denotes critical or blocker severity and green stands for all the other severity levels) for the bugs. Using such a figure we can detect the files that had an intense development or that generated many bugs. However, the visualization is limited in the following ways:

1. It provides neither a qualitative nor a quantitative impression of the number of bugs and commits over time.

2. It is applicable to files only, and not to directories or modules. Thus, one can use it to study co-evolution at one level of abstraction only.

3. It does not scale: When the number of files, commits or bugs is high, the figures representing them are not intelligible any more.

To overcome these limitations we devised the Discrete Time Figure, a visualization that encapsulates commit and bug related information of a software entity. The Discrete Time Figure gives an immediate view of the history of a software entity, with respect to its development intensity (the number of commits) and its problems (the number of bugs). If the software entity is a directory or a module, the number of commits (bugs) is the sum of the number of commits (bugs) of all the files the directory or module contains. The history can be enriched with a software metric and structural information given by the view layout (e.g., the directory hierarchy).


Figure 6.2. The structure of a Discrete Time Figure

Figure 6.2 shows the structure of a Discrete Time Figure. It is composed of five subfigures, marked with numbers in Figure 6.2. The subfigures marked as “1” and “2” are composed of a sequence of rectangles that represents a discretization of time. Each rectangle is associated with a precise and parametrizable interval of time, and two vertically aligned rectangles represent the same time period. We color the rectangles using a heat map, i.e., hot colors (in the red hue range) represent time periods with many commits (or many reported bugs), whereas cold colors (in the blue hue range) represent few commits (or few reported bugs). The white rectangles represent time periods without development activity or without reported bugs. The black rectangles represent time intervals after the removal of the entity from the system (the entity is “dead”). Figure 6.3 provides examples of the color mapping used in the Discrete Time Figure.


Figure 6.3. The color mapping used in the Discrete Time Figure
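The discretization behind subfigures 1 and 2 can be sketched as a simple binning step (our illustration, with numeric timestamps; the number of rectangles corresponds to the parametrizable interval width mentioned above):

```python
# Minimal sketch of the time discretization: the observed period is split
# into equal intervals, and each rectangle records how many commits (or
# reported bugs) fall into its interval.
def discretize(timestamps, t_start, t_end, n_rectangles):
    """timestamps: numeric event times. Returns per-rectangle counts."""
    width = (t_end - t_start) / n_rectangles
    counts = [0] * n_rectangles
    for t in timestamps:
        if t_start <= t < t_end:
            counts[int((t - t_start) // width)] += 1
    return counts
```

The resulting counts are then mapped to heat-map colors, with zero counts drawn white.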

The choice of the color is based on a set of thresholds, which can be either manually chosen or automatically computed. For the latter we distinguish two different scenarios:

1. Local threshold. We compute the threshold values “locally” for each figure, taking into account only the bugs and commits concerning the target entity. We use the local thresholds to characterize software entities, as discussed in Section 6.2.

2. Global threshold. We compute the threshold values “globally”, taking into account all the bugs and commits relating to all the visualized entities. One can use these thresholds for comparisons, since all the visualized figures refer to the same thresholds.
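The two scenarios can be sketched as follows. Note an assumption: we use the mean of the non-zero counts as the low/high cut-off, which is only one plausible choice of automatically computed threshold:

```python
# Hedged sketch of the two threshold scenarios: a "local" cut-off derives
# from one entity's own per-interval counts, a "global" cut-off from the
# counts of all visualized entities, making colors comparable across figures.
def threshold(counts):
    """Assumed cut-off: the mean of the non-zero per-interval counts."""
    nonzero = [c for c in counts if c > 0]
    return sum(nonzero) / len(nonzero) if nonzero else 0

def color_category(count, cut):
    if count == 0:
        return "white"  # no commits / no reported bugs
    return "red" if count > cut else "blue"

def global_threshold(entities):
    """entities: dict name -> per-interval counts; one shared cut-off."""
    return threshold([c for counts in entities.values() for c in counts])
```

With a local threshold each figure is self-calibrated; with the global one, a red rectangle means the same level of activity in every figure.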

In Figure 6.2 the subfigures marked as “3” and “4” depict the average development intensity and the average number of reported bugs, both over time. We color them according to these average values: Hot colors represent an intense development (or many reported bugs) consistently over time, as opposed to cold ones, which are related to a limited development activity (or few reported bugs). For the commit box, dark colors close to black mean that the entity was removed from the system at the beginning; on the contrary, bright colors close to white imply that the entity has only recently been created in the system. The two average boxes provide an immediate overview of an artifact’s history, allowing us to easily compare it with other artifacts. In the last subfigure (marked as “5” in Figure 6.2) we map a metric measurement on its color, using grayscale values: the darker the color, the higher the metric value. We do not always make use of it, but it proved to be useful for high-level entities like directories or modules, where we can use this box to represent—for example—the number of contained files.

Scalability issues

The Discrete Time Figure provides a lot of evolutionary information, but it still does not scale well, as the color of the rectangles becomes undecipherable in zoomed out visualizations, when the figure is small. To solve this problem we introduce the concept of phases, by aggregating a sequence of rectangles into a bigger one. In particular, we define the following phases, depicted in Figure 6.4:

• Stable. We define the Stable phase as a sequence of (at least) four adjacent time intervals during which the number of commits (or bugs) is constantly low. Color wise, a Stable phase is a sequence of at least four blue rectangles.

• High Stable. It is similar to the Stable phase, with the only difference that the number of commits (or bugs) is constantly high instead of low. Color wise, a High Stable phase is a sequence of at least four red rectangles.

• Spike. We define the Spike phase as: (1) An initial sequence of (at least) two time intervals during which the number of commits (or bugs) remains low (i.e., two blue rectangles), (2) one time interval during which this number is high (i.e., one red rectangle) and (3) a final sequence similar (with the same characteristics) to the initial one.


Figure 6.4. Introducing phases in a Discrete Time Figure

In a Discrete Time Figure we highlight the phases by coloring their boundaries with different colors: green for the Stable, orange for the High Stable and pink for the Spike phase. We also use the boundary color to display the fact that entities are added to and removed from a system.
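The phase definitions above translate into a small scan over the rectangle colors. The sketch below is our illustration (rectangles are already classified as low or high by the thresholds; phase names follow the definitions):

```python
# Sketch of phase detection over a sequence of rectangle classifications
# ('low' = blue, 'high' = red), following the definitions above.
def detect_phases(colors):
    """Returns a list of (phase_name, start_index, end_index) triples."""
    phases = []
    n = len(colors)
    i = 0
    while i < n:
        run = 1
        while i + run < n and colors[i + run] == colors[i]:
            run += 1
        if run >= 4:  # Stable / High Stable need four equal rectangles
            name = "stable" if colors[i] == "low" else "high stable"
            phases.append((name, i, i + run - 1))
        i += run
    # Spike: two lows, one high, two lows
    for i in range(n - 4):
        if colors[i:i + 5] == ["low", "low", "high", "low", "low"]:
            phases.append(("spike", i, i + 4))
    return phases
```

In the tool, each detected phase is then rendered by coloring the boundaries of its rectangles.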


Figure 6.5. Discrete Time Figures applied to a directory tree. The internal color of the rectangles is not visible, while the color of the boundaries is. Thus, phases and addition/removal events are still intelligible.

Figure 6.3 shows that rectangles representing periods of time preceding the first commit (the addition of an entity) have gray boundaries, while rectangles following the removal of an entity have red boundaries. Gray boundaries also represent zero commits or zero reported bugs. In large scale visualizations phases are still intelligible, as the inner color of the rectangles is not visible while the color of the boundaries is. Figure 6.5 provides an example of such a visualization: It shows a directory tree where we represent directories as vertically aligned Discrete Time Figures (or as white rectangles in case they do not contain files). In these figures only the color of the boundary is visible, i.e., only the phases are intelligible. Figure 6.5 also shows our tool BugCrawler, composed of three main parts (marked with numbers in the figure): (1) The toolbar, used to perform actions such as creating visualizations, applying layouts, zooming, configuring the views, etc., (2) the information panel, providing information about the selected entity and (3) the visualization panel, where the main view is rendered. BugCrawler provides a number of interactive features such as: spawning a new visualization on a selection at different granularity levels (e.g., when selecting directories it is possible to spawn visualizations of files), inspecting an entity represented by a figure, inspecting bug reports, jumping to the code, etc.

Coloring the rectangle boundary is useful in large scale views but, at the same time, it might make the figure slightly confusing in small scale views, especially for inexperienced users. To address this problem, in BugCrawler the user can dynamically set the boundary coloring according to the visualization scale. In our Discrete Time Figure visualizations we use many different colors with different meanings: We summarize all of them in Table 6.1.

Table 6.1. The colors used in the Discrete Time Figure and their meanings

Color    Inner area   Border   Meaning
Red      X                     Many commits or many reported bugs
Blue     X                     Few commits or few reported bugs
White    X                     Zero commits or zero reported bugs
Black    X                     The entity was removed
Gray                  X        Zero commits or zero reported bugs
Red                   X        The entity was removed
Orange                X        High Stable phase
Green                 X        Stable phase
Pink                  X        Spike phase

6.2 Co-Evolutionary Patterns

Given a software entity, its Discrete Time Figure representation shows the development activity history and the evolution of the problems affecting the entity. By using and combining these pieces of information—and especially their visual representation—we define a catalog of co-evolutionary patterns that characterize the evolution of software entities. In the following, we describe each pattern in detail, using a fixed template: First, we provide the formal definition of the pattern; then, we discuss how to interpret it; finally, we describe variations of the pattern (when applicable) and show an example.

The catalog includes two types of patterns:

1. Patterns characterizing the entire history of a software entity: persistent, day-fly, intro- duced for fixing, and stabilization. A software entity can have one instance only of these patterns.

2. Patterns characterizing part of the history of a software entity: addition of features, bug fixing and refactoring / code cleaning. An entity can have multiple instances of these patterns.

6.2.1 Persistent

Definition: We define an entity as persistent if the following two conditions hold: (1) it is still “alive” (in the current version of the system) and (2) its lifetime is greater than 80% of the system’s lifetime, i.e., the commit coverage is at least 80%.


Figure 6.6. An example of a persistent pattern. The entity is also bug persistent.


Figure 6.7. An intensive and persistent pattern. The entity is also bug persistent.

Discussion: Persistent entities are likely to play important roles in the system because they survived most of the system’s changes.

Variations: We distinguish two types of persistent patterns:

1. Bug persistent. We define an entity as bug persistent if it is persistent and the bug coverage is at least 80% of the commit coverage.1 The interpretation of this pattern is twofold: On the one hand, its presence can be a good symptom, because it indicates that most of the development was performed together with testing. On the other hand, the pattern might reveal that the entity was affected by problems for most of its lifetime.

2. Intensive and persistent. A software entity is intensive and persistent if (1) it is persistent and (2) it had an intense development (a high number of commits, corresponding to red rectangles in the Discrete Time Figure) for at least 80% of its lifetime. Intensive and persistent entities are likely to hold a key role in the system, as changes are concentrated on them. They represent a good starting point for reverse engineering activities.

Example: Figure 6.6 and Figure 6.7 show examples of all persistent patterns: Persistent, bug persistent, and intensive and persistent.
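The persistent definitions reduce to simple predicates. A sketch, where coverage values are fractions of the system’s lifetime (e.g., 0.85 means 85%); the function names are ours:

```python
# Sketch of the persistent-pattern predicates.
def is_persistent(alive, commit_coverage):
    """Alive in the current version, with commit coverage of at least 80%."""
    return alive and commit_coverage >= 0.8

def is_bug_persistent(alive, commit_coverage, bug_coverage):
    """Persistent, and bug coverage reaching at least 80% of commit coverage."""
    return is_persistent(alive, commit_coverage) and \
        bug_coverage >= 0.8 * commit_coverage
```

For instance, an alive entity with 85% commit coverage and 70% bug coverage is bug persistent, since 0.70 ≥ 0.8 × 0.85 = 0.68.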

1 Since the commit coverage for a persistent entity is at least 80% of the system’s lifetime, the bug coverage is at least 80% × 80% = 64% of the system’s lifetime.

6.2.2 Day-Fly

Definition: We define a software entity as a day-fly if its commit coverage is at most 20% of the system’s lifetime.


Figure 6.8. A day-fly pattern


Figure 6.9. A dead day-fly pattern

Discussion: The day-fly pattern represents a specific event in a software repository: A developer added an entity (file, directory or module) to the repository, the entity experienced some changes for a short period of time, and then nobody modified it anymore. This event is generic and thus we can associate it with a number of different interpretations, among which: (1) A spike solution, (2) a component no longer under development, (3) a component ported from a different system and slightly adjusted to make it work, (4) an entity that—although being already part of the system—was added late to the repository.

Variations: Dead day-fly. It is an artifact removed from the system which lasted at most 20% of the system's lifetime. As for the day-fly, we can interpret this pattern in different ways, for example: (1) a renaming,2 (2) fast prototyping, (3) a new implementation quickly removed.

Example: Figure 6.8 and Figure 6.9 show examples of a day-fly and a dead day-fly pattern, respectively.

2 In CVS and SVN a renaming appears as the removal of an entity (the old name) and the addition of another entity (the new name).
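The coverage-based patterns seen so far (persistent, bug persistent, intensive and persistent, day-fly, dead day-fly) reduce to threshold checks on per-entity measurements. The following sketch illustrates this; it is not the dissertation's actual implementation, and the `Entity` fields are hypothetical names for the measurements defined above.

```python
from dataclasses import dataclass

@dataclass
class Entity:
    commit_coverage: float     # fraction of the system's lifetime with commits
    bug_coverage: float        # fraction of the system's lifetime with bugs
    intensive_fraction: float  # fraction of lifetime with a high number of commits
    removed: bool              # True if the entity was deleted from the repository

def patterns(e: Entity) -> set:
    """Classify an entity against the coverage-based co-evolutionary patterns."""
    found = set()
    if e.commit_coverage >= 0.8:
        found.add("persistent")
        # bug coverage must reach at least 80% of the commit coverage
        if e.bug_coverage >= 0.8 * e.commit_coverage:
            found.add("bug persistent")
        if e.intensive_fraction >= 0.8:
            found.add("intensive and persistent")
    if e.commit_coverage <= 0.2:
        found.add("dead day-fly" if e.removed else "day-fly")
    return found
```

An entity with commit coverage between 20% and 80% of the system's lifetime matches none of these coverage-based patterns, which is consistent with the definitions above.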

6.2.3 Introduced for Fixing

Definition: We define an entity as introduced for fixing if its first commit occurred after the first bug affecting it was reported.3 Visually, this means that the first colored rectangle in the bug subfigure appears before the first one in the commit subfigure (cf. Figure 6.10).

Figure 6.10. An example of the introduced for fixing pattern (tc: first commit; tb: first reported bug)

Discussion: We interpret this pattern in the following way: a bug b, affecting a certain system component c, was reported at t1 and, after that, an entity e was introduced in the system at t2. To fix b, a developer had to change some entities belonging to c as well as the entity e, implying that e was also involved in the bug b. One possible explanation for this situation is that the entity e was introduced in the system to fix the bug b affecting the component c. Since the bug b was reported when the software entity e did not exist yet, b could not have been caused by e.

Variations: None.

Example: Figure 6.10.

3 This is possible for two reasons: first, in Mevo a bug can be linked with many software artifacts; second, we establish the link between a bug and a software artifact when the bug is fixed, not when it is reported.
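Given the Mevo links, the check behind this pattern is a date comparison per linked bug report. A minimal sketch, with a hypothetical data layout (the entity's first commit date plus the report dates of its linked bugs):

```python
from datetime import date

def introduced_for_fixing(first_commit: date, linked_bug_reports: list) -> bool:
    """True if any bug linked to the entity was reported before the entity's
    first commit. This can happen because links are established when a bug
    is fixed, not when it is reported."""
    return any(reported < first_commit for reported in linked_bug_reports)
```

For instance, an entity first committed in March whose linked bug was reported in January matches the pattern, while an entity whose linked bugs were all reported after its first commit does not.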

6.2.4 Stabilization

Definition: We define an entity as a stabilization if the following conditions hold: (1) the entity is still living (it was not removed), (2) it had an intensive development and was affected by many bugs,4 and (3) the last parts of the commit and bug evolutions are stable phases.

Figure 6.11. A stabilization pattern (tc1: intensive development; tc2: stabilization; tb1: many reported bugs; tb2: stabilization)

Discussion: The stabilization pattern characterizes software entities that experienced an intensive development and caused many bugs to be introduced in the system. These entities, after a certain point, became stable, i.e., their development continued in a regular way, with fewer changes and fewer introduced bugs. Unlike entities with the day-fly pattern, entities with the stabilization pattern are still under development, as the number of commits and reported bugs is small but not zero. The stabilization pattern suggests that no further analysis is required.

Variations: None.

Example: Figure 6.11.

4 We consider as intensive development (or as being affected by many bugs) a high stable phase, a spike phase, or (at least) five time intervals with a high number of commits or bugs.

6.2.5 Addition of Features

Definition: An addition of features pattern consists of two contextual periods of time: one with intensive development and one with many reported bugs. Formally, we define this pattern as the co-presence of two phases, one for the commits and one for the bugs, subject to time constraints. We name tcbegin–tcend the time period of the commit phase and tbbegin–tbend the time period of the bug phase. We define a software entity to have an addition of features pattern in the following cases:

• Spike phase (commits) and spike phase (bugs), or spike phase (commits) and high stable phase (bugs), with the following time constraint:

tcbegin ≤ tbbegin ≤ tcend + 2 time intervals

• High stable phase (commits) and high stable phase (bugs), or high stable phase (commits) and spike phase (bugs), with the time constraint:

tcbegin ≤ tbbegin ≤ tcend − 2 time intervals

Figure 6.12. An example of the addition of features pattern with the high stable (commits) – high stable (bugs) pair of phases

Discussion: The idea behind this pattern is that a development effort, revealed by a high number of commits over a considerable period of time, caused many bugs to be introduced in the system. We associate the development effort and the introduction of bugs with spike or high stable phases. We name the pattern addition of features because adding features is a bug-prone development activity. However, this is only one possible interpretation: the committed changes might be related to different bug-prone development activities, such as bug fixing performed by inexperienced developers.5

Variations: None.

Example: Figure 6.12.

5 Purushothaman and Perry showed that 40% of bugs are introduced while fixing other bugs [PP05].
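The two cases of the definition can be encoded as a predicate over phase triples. The sketch below is an illustration mirroring the constraints above; the (kind, begin, end) representation of a phase, with times counted in sampling intervals, is an assumption of this sketch, not the tool's actual data model.

```python
# A phase is a (kind, begin, end) tuple; times are in sampling intervals.
PLUS_PAIRS = {("spike", "spike"), ("spike", "high stable")}
MINUS_PAIRS = {("high stable", "high stable"), ("high stable", "spike")}

def addition_of_features(commit_phase, bug_phase):
    """Check the phase-pair and time constraints of the pattern."""
    c_kind, c_begin, c_end = commit_phase
    b_kind, b_begin, _ = bug_phase
    if (c_kind, b_kind) in PLUS_PAIRS:
        return c_begin <= b_begin <= c_end + 2
    if (c_kind, b_kind) in MINUS_PAIRS:
        return c_begin <= b_begin <= c_end - 2
    return False
```

For example, a spike of commits over intervals 10–12 followed by a spike of bugs starting at interval 13 matches the pattern (13 ≤ 12 + 2), while a high stable bug phase starting only at interval 19 against a high stable commit phase over 10–20 does not (19 > 20 − 2).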

6.2.6 Bug Fixing

Definition: The bug fixing pattern is symmetrical to the addition of features one. It also consists of two contextual periods of time: one with intensive development and one with few reported bugs. Using the terminology introduced for the previous pattern, we define the bug fixing pattern as the co-presence of the following phases:

• Spike phase (commits) and stable phase (bugs), with the following constraint:

tcbegin ≤ tbbegin ≤ tcend + 2 time intervals

• High stable phase (commits) and stable phase (bugs), with the time constraint:

tcbegin ≤ tbbegin ≤ tcend − 2 time intervals

In addition, to have a bug fixing pattern, the following condition must hold: the number of bugs reported during the time interval before tbbegin is greater than zero. This condition implies that the number of reported bugs decreased at tbbegin.

Figure 6.13. An example of the bug fixing pattern with the high stable (commits) – stable (bugs) pair of phases

Discussion: The idea of the bug fixing pattern is that an intensive development activity, revealed by a high number of commits over a considerable period of time, caused the number of reported bugs to decrease. One possible interpretation for this scenario is that the development effort was spent to fix problems. Nevertheless, other explanations are also reasonable, for example: new features were added without introducing bugs, or bugs were actually introduced but not detected or reported.

Variations: None.

Example: Figure 6.13.

6.2.7 Refactoring / Code Cleaning

Definition: We characterize a refactoring / code cleaning pattern with the following conditions: (1) the number of reported bugs remained low for a period of time tb and (2) within tb the number of commits increased and then remained high for a period tc included in tb. Formally, a software entity has a refactoring / code cleaning pattern if:

• It has one of the following pairs of phases: (1) high stable (commits) and stable (bugs), (2) spike (commits) and stable (bugs).

• The pair of phases fulfills the time constraint:

tcbegin > tbbegin ∧ tcend < tbend

Figure 6.14. An example of the refactoring / code cleaning pattern with the high stable (commits) – stable (bugs) pair of phases

Discussion: This pattern indicates that an intense development activity did not impact the number of reported bugs. Since refactoring and code cleaning activities, if correctly performed, are behavior preserving, they should not introduce bugs. For this reason we name this pattern refactoring / code cleaning. However, we can also interpret this pattern in different ways: developers added new features without introducing bugs, or developers added features and fixed bugs, or bugs were introduced but not reported.

Variations: None.

Example: Figure 6.14.

6.2.8 Summing Up

Based on visual properties of Discrete Time Figures, we defined a catalog of patterns that characterize the co-evolution of code and bugs. Since the patterns are formally defined, we can detect them automatically. In our tool implementation we provide a query engine that detects and counts patterns in a visualization. We recapitulate the patterns, the conditions to detect them, and their interpretations in Table 6.2.

Table 6.2. The catalog of co-evolutionary patterns

Name                         Condition                                                   Interpretation
Persistent                   Lifetime ≥ 80% of the system's lifetime                     Survived most of the system's changes.
Bug persistent               Persistent and bug coverage ≥ 80% of commit coverage        Development and testing together, or problematic entity.
Intensive and persistent     Persistent and intensive development for ≥ 80% of lifetime  Key entity, as changes are concentrated on it.
Day-fly                      Commit coverage ≤ 20% of the system's lifetime              Spike solution, late addition in the repository, etc.
Dead day-fly                 Day-fly and removed from the system                         Renaming, fast prototyping, etc.
Introduced for fixing        First commit after first reported bug                       The entity was introduced to fix one or more bugs.
Stabilization                Intensive development and many bugs, then stable phases     The entity got stable.
Addition of features         Intensive development contextual to an increase in bugs     New features were added, and adding them generated bugs.
Bug fixing                   Intensive development contextual to a decrease in bugs      The development effort was spent to fix problems.
Refactoring / code cleaning  Intensive development with a constantly low number of bugs  The development effort was spent for refactoring or code cleaning.

The catalog of co-evolutionary patterns allows an analyst to (1) reason about a module or an entire system in terms of the patterns it includes, (2) compare software entities according to the pattern characterization and (3) detect components that need to be analyzed further.
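Since every pattern is a predicate over an entity, the counting performed by the query engine amounts to tallying predicate matches over all entities in a view. The sketch below illustrates this idea with a hypothetical dictionary-based entity representation; it is not the actual tool code.

```python
from collections import Counter

def count_patterns(entities, detectors):
    """Tally, for each pattern name, how many entities match its predicate."""
    counts = Counter()
    for entity in entities:
        for name, matches in detectors.items():
            if matches(entity):
                counts[name] += 1
    return counts

# Hypothetical detectors over a minimal entity representation:
detectors = {
    "day-fly": lambda e: e["commit_coverage"] <= 0.2,
    "persistent": lambda e: e["commit_coverage"] >= 0.8,
}
entities = [{"commit_coverage": 0.1},
            {"commit_coverage": 0.9},
            {"commit_coverage": 0.15}]
counts = count_patterns(entities, detectors)  # day-fly: 2, persistent: 1
```

Tables such as 6.4 and 6.5 below are, in essence, the output of this tallying applied to all directories of a system.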

6.3 Experiments

To validate our approach we apply the following methodology on three software systems:

1. We build a view containing all the directories of the target system. Each directory is represented as a Discrete Time Figure if it includes at least one file, or as a white rectangle otherwise.

2. We apply the query engine (provided in our tool) on the view to detect the co-evolutionary patterns. This allows us to characterize the system in terms of the number of patterns it contains.

6.3.1 Case Studies

To perform our experiments we selected the following open source software projects:6

• Apache HTTP Server. A widely adopted web server for multiple operating systems, including OS X and Windows.

• gcc. The GNU compiler collection including front ends for C, C++, Objective-C, Fortran, Java, Ada, etc.

• The four largest modules of Mozilla (in terms of files): SeaMonkeyCore, RaptorDist, Rap- torLayout and CalendarClient.

Table 6.3 shows the dimensions of the systems in terms of size (number of source code files), code evolution (number of commits) and bug evolution (number of bugs and bug references7).

Table 6.3. The dimensions of the software systems used for the experiments

System            # Source code files   # Commits   # Bugs   # Bug references
Apache                          393        15,524      377         717
gcc                          18,150       254,785    3,347      12,936
Mozilla modules:
  SeaMonkeyCore               4,656        69,391    4,694      16,889
  RaptorDist                  3,446        50,033    2,753      10,028
  RaptorLayout                2,925        99,899    5,797      22,865
  CalendarClient              1,860        32,468    2,550       7,001

6.3.2 Characterizing the Evolution of the Systems

Table 6.4 summarizes the number of co-evolutionary patterns that we detect in Apache and gcc. In the following we discuss, for each system, a number of insights that we infer from the occurrences and frequencies of the patterns.

6 The three software projects are available at: http://httpd.apache.org, http://gcc.gnu.org and http://www.mozilla.org
7 The number of bugs counts the number of distinct bug reports, while the number of bug references counts the links with source code artifacts. For example, if a bug is linked with five source code artifacts, then the number of bugs is one, while the number of bug references is five.

Table 6.4. Characterizing Apache and gcc in terms of patterns detected

Pattern                             Apache     gcc
Not empty directory                     92   1,145
Bug persistent                           0       0
High persistent                          0       5
Persistent                               1      54
Dead day-fly                             9     101
Day-fly                                 12     465
Total day-fly                           21     566
Introduced for fixing                    2     163
Stabilization                            2       1
Spike - spike phases                     1      13
High stable - high stable phases         0      10
Spike - high stable phases               0       0
High stable - spike phases               1      23
Total addition of features               2      46
Spike - stable phases                    2      16
High stable - stable phases              9      52
Total bug fixing                        11      68
High stable - stable phases              0       8
Spike - stable phases                    0       7
Total refactoring / code cleaning        0      15
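The percentages quoted in the following discussion are pattern counts normalized by the number of non-empty directories of each system. As a quick check against Table 6.4 (a throwaway computation, not part of the tool):

```python
# Counts taken from Table 6.4.
apache_not_empty, apache_day_fly, apache_bug_fixing = 92, 21, 11
gcc_not_empty, gcc_day_fly = 1145, 566

def pct(count, total):
    """Pattern frequency as a percentage of non-empty directories."""
    return round(100.0 * count / total, 1)

apache_day_fly_pct = pct(apache_day_fly, apache_not_empty)     # ~22.8%
apache_bug_fix_pct = pct(apache_bug_fixing, apache_not_empty)  # ~12.0%
gcc_day_fly_pct = pct(gcc_day_fly, gcc_not_empty)              # ~49.4%
```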

Apache

The evolution of the Apache web server is characterized by:

• A relatively high number of day-fly (21, ∼23%) and bug fixing patterns (11, ∼12%).

• A small number (between zero and two) of all the other patterns.

• Only one persistent pattern, i.e., only one directory "survived" until the current version of Apache and for more than 80% of the system's lifetime.

From these facts, we infer that the Apache web server is a very active and lively project (many day-flies), without a stable development core, i.e., a set of directories with an intensive development for most of the system's lifetime (only one persistent and two stabilization patterns).

gcc

The most interesting facts concerning gcc’s evolution are:

• The day-fly is the most frequent pattern, with 566 occurrences (∼49%).

• The introduced for fixing is the second most frequent pattern, with 163 instances (∼14%).

• There is only one stabilization pattern (∼0.09%).

• All the other patterns (persistent, addition of features, bug fixing and refactoring / code cleaning) are much less frequent, with percentages ranging from ∼1% to ∼6%.

Figure 6.15. A visualization of the libjava module of gcc. The area marked as "B" is a visualization of the files belonging to the java/lang directory.

We conclude that the evolution of gcc was scattered among many different components, with condensed development activities: half of the directories experienced an intensive development, but only for a short period of time. This is because gcc includes front ends for many different languages (more than 10) and supports a number of architectures (more than 50). One of the components that underwent a long-lasting and constantly intensive development is the test suite.

Figure 6.15 shows an example of how different patterns are distributed in the gcc system. The figure depicts a visualization of the libjava module of gcc, representing directories as Discrete Time Figures. We see that the entire classpath directory tree is characterized by a day-fly pattern, detailed in the area marked as "A": all the directories were recently introduced in the system, without generating bugs (yet). The other two big hierarchies in the module are gnu and java: the directories belonging to these hierarchies experienced a long-lasting, although seldom intensive, development and generated many bugs, especially those belonging to the java hierarchy.

We study more closely the directory java/lang, characterized by persistent and bug persistent patterns. Using the interactive features of our tool, we spawn a new visualization of the files included in the java/lang directory, depicted in the area marked as "B" in Figure 6.15. In the view we represent files as horizontally aligned Discrete Time Figures. We see that for some files the development was concentrated at the beginning of the java/lang history (natFloat.cc, natEcosProcess.cc and natMath.cc), while for others it was in the middle (natStringBuffer.cc) or in recent times (natStringBuilder.cc, natVMSecurityManager.cc and natVMThrowable.cc). All these files did not generate bugs. Finally, we observe that all the other files experienced a long-lasting development with periods of intensive activity.
As opposed to the first group of files, they all generated bugs.

Mozilla

Table 6.5 shows the number of patterns detected in four modules of Mozilla. We observe that:

1. The SeaMonkeyCore module—although not the biggest in terms of not empty directories— has the maximum number of occurrences of most of the patterns (all but persistent and stabilization).

2. RaptorDist—the biggest module—underwent a continuous and stable development, as it has the maximum number of persistent and stabilization patterns.

3. The percentage of persistent and stabilization patterns is roughly constant across modules, ranging from 11% to 12% (persistent) and from 3% to 5% (stabilization). The percentage of the day-fly pattern is more variable: it ranges from 23% (RaptorDist) to 31% (CalendarClient). Since in all the modules the day-fly pattern is more frequent than the persistent and stabilization ones, we conclude that the software artifacts that experienced a long-lasting development, surviving for most of the system's lifetime, are a minority.

4. Mozilla went through substantial changes during its evolution, since the instances of the addition of features pattern are much more numerous than the bug fixing and refactoring / code cleaning ones.

To provide an anecdotal example of our approach, we present a visualization of one of Mozilla's modules, namely CalendarClient (see Figure 6.16). We inspect the view to understand how and where the various patterns are distributed within the module. In particular we spot the following sets of directories:

Table 6.5. Characterizing the four biggest modules of Mozilla in terms of patterns detected

Patterns                            SeaMonkeyCore   RaptorDist   RaptorLayout   CalendarClient
Not empty directory                           476          520            382              269
Bug persistent                                 47           35             36               25
High persistent                                15            0             10                3
Persistent                                     59           60             46               29
Dead day-fly                                   68           98             68               37
Day-fly                                        62           26             39               49
Total day-fly                                 130          124            107               86
Introduced for fixing                          77           50             57               50
Stabilization                                  16           25             14               12
Spike - spike phases                           10            8              6                2
High stable - high stable phases               56           15             41               12
Spike - high stable phases                      3            0              2                2
High stable - spike phases                     46           11             30               13
Total addition of features                    115           34             79               29
Spike - stable phases                           7           10              7                5
High stable - stable phases                    20           13             20               11
Total bug fixing                               27           23             27               16
High stable - stable phases                    32           11             20               11
Spike - stable phases                          21           13             10                5
Total refactoring / code cleaning              53           24             30               16

• Removed. A large number of directories were removed from the module: they are concentrated in the areas marked as "1", "3", "5", "6" and "9" in Figure 6.16. Moreover, most of the directories in "1" and several of them in "5" and "6" were removed at the same time (approximately halfway through the module's history), suggesting that the module experienced a considerable restructuring at that point in time.

• Day-fly. A considerable number of directories have a day-fly pattern (areas "2" and "7"). Development activities for most of them were concentrated in recent times. Only eight directories generated bugs, while seven were introduced for fixing bugs.

• Persistent. The most interesting part of the module is delimited by the yellow areas (marked as "4" and "8"): it underwent a long-lasting and intensive development and it generated the highest number of bugs. In this part of the module we detect a number of different patterns: addition of features, bug fixing, refactoring / code cleaning, stabilization and introduced for fixing. The areas "4" and "8" are hotspots, where one can start the reverse engineering of the CalendarClient module.


Figure 6.16. A visualization of the CalendarClient module of Mozilla

6.4 Limitations

Large commits. The approach that we presented in this chapter, and especially the interpretation of the co-evolutionary patterns, relies on the data gathered from software repositories and processed with our Churrasco framework. As a consequence, the quality of this data unavoidably impacts the quality of the results produced with our approach.


Figure 6.17. An example of a wrong classification: Bar.java is wrongly defined as an introduced for fixing. The cause of the error is that the developer committed a bug fix together with other changes.

For example, we consider the scenario depicted in Figure 6.17: a developer changed the file Foo.java to fix the bug 897 and she also changed the file Bar.java to add a new feature, committing both files together in a single transaction at time t3. Since the bug 897 was reported at time t1, the first commit of Bar.java occurred at time t2, and t1 < t2, we classify Bar.java as introduced for fixing (a bug was reported before the entity was introduced in the system). Nonetheless, the classification is wrong: the error arises because the developer committed a bug fix together with other changes, thus impacting the quality of the links between software artifacts and bugs. We already discussed this problem (called the "large commits" problem) extensively when we explained how we retrieve and process data from software repositories (cf. Section 3.2.3). Specifically, we showed that the impact of large commits is limited, as they are very rare.
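The misclassification can be reproduced in a few lines. The sketch below uses illustrative file names and dates; it links every file of a fix transaction to the fixed bug, as a naive linking strategy would, and then applies the introduced-for-fixing test:

```python
from datetime import date

# Illustrative data: the bug is reported at t1; Foo.java (the fix) and
# Bar.java (a new feature) are committed together in one transaction.
bug_reported = date(2005, 1, 10)                 # t1
first_commit = {"Foo.java": date(2004, 6, 1),    # long before the bug
                "Bar.java": date(2005, 2, 1)}    # t2, after the report
fix_transaction = ["Foo.java", "Bar.java"]       # both files get linked to the bug

# Naive introduced-for-fixing test applied to every linked file:
flagged = [f for f in fix_transaction if bug_reported < first_commit[f]]
# Bar.java is flagged, although it was committed to add a feature.
```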

Names of the patterns. The names that we assigned to the co-evolutionary patterns are only names, and do not imply that a developer performed the activity suggested by the pattern name. For instance, the fact that we detect an addition of features pattern does not imply that a developer actually added some features: the co-evolutionary patterns describe the relationship between the evolution of code and bugs, and their names are one possible interpretation of such a relationship. Defining a pattern vocabulary is useful to characterize a system and, in general, as a communication means: mentioning a bug fixing pattern is equivalent to reporting that the development activity was intensive while the number of bugs decreased.

Experiments. We applied our technique on three software systems: the Apache web server, the GNU compiler collection (gcc) and the four largest modules of Mozilla. The experiments that we reported are not a complete and in-depth analysis of the systems, but they provide anecdotal evidence of our approach.

6.5 Summary

In this chapter we presented a visual approach to study the relationship between the evolution of software artifacts and the way they are affected by problems. The approach is based on the application of Discrete Time Figures at any level of granularity. To tackle the scalability issue, we abstract the commit and bug trends by means of phases. The Discrete Time Figure reveals patterns of particular interest in the study of a software system's evolution: persistent, day-fly, introduced for fixing, stabilization, addition of features, bug fixing, and refactoring / code cleaning. These patterns capture some of the possible relationships between the histories of development activities and bugs, giving them a precise and useful meaning. In our tool implementation we provide a query engine that allows the user to automatically detect the patterns presented in this chapter.

An analyst or a project manager can use the Discrete Time Figure visualization and the catalog of co-evolutionary patterns to support the following tasks: (1) characterize a system (or a system component) in terms of the patterns detected, (2) identify areas of interest within a component (for example, an analyst may be interested in the entities that exhibit an addition of features pattern) and (3) compare different modules of a system in terms of patterns.

In this part of the thesis we studied the evolution of software artifacts (cf. Chapter 4), the evolution of software defects (cf. Chapter 5) and their co-evolution (cf. Chapter 6) to support the retrospective analysis of a software system and infer the causes of problems. In the next part, we exploit historical information about source code and bugs to predict future characteristics of a software system.

Part III

Predicting the Future


In the previous part of this dissertation, we introduced three retrospective analysis techniques. These techniques, based on interactive visualizations, support various maintenance activities aimed at detecting reengineering candidates and inferring the causes of problems in a software system. In this part of our dissertation, we focus on predicting the future of software systems. To this aim, we present a set of analysis techniques created, like the ones described in the previous part, on top of our Churrasco framework. In Chapter 7, we propose two novel defect prediction approaches based on the evolution of various source code metrics over multiple versions of a system. Further, we introduce a publicly available benchmark for comparing prediction techniques. In Chapter 8, we enrich existing defect prediction models with information extracted from development mailing lists. To do so, we extend the Mevo meta-model with e-mail information. Thereafter, we investigate the relationships between certain characteristics of software entities, known to have a negative impact on quality attributes, and software defects, a tangible effect of low software quality. Such relationships, once understood, can support defect prediction, as they point to defect-prone software artifacts. In Chapter 9, we study the correlation between change coupling and defects. As the correlation holds, we augment defect prediction models with change coupling information. In Chapter 10, we inspect the impact of several design flaws on the presence of software defects. We also analyze the evolution of the flaws over a system's history, to investigate whether adding design flaws is likely to introduce defects.

Chapter 7

Predicting Defects with the Evolution of Source Code Metrics

Defect prediction has generated widespread interest for a considerable period of time, leading to more than a hundred publications in the last ten years [MK10]. The driving scenario is resource allocation: time and manpower being finite resources, it makes sense to assign personnel and resources to software components that are likely to generate bugs. Researchers proposed a variety of approaches to tackle the problem, relying on diverse information, such as code metrics [BBM96; OA96; BDW99; EMM01; SK03; GFS05; NB05a; NBZ06; OW02; OWB04; OWB07] (e.g., lines of code, complexity), process metrics [NB05b; Has09; MPS08; BEP07] (e.g., number of changes, recent activity) or previous defects [KZWZ07; OWB05; HH05].

The jury is still out on the relative performance of these approaches. Most of them were evaluated in isolation, or were compared to only a few other approaches. As a matter of fact, comparing defect prediction techniques is onerous: different approaches require different types of data from different repositories, e.g., software metrics from the source code, process metrics from the versioning system repository and defect information from the bug database. Moreover, a significant portion of the evaluations cannot be reproduced, since the data they used came from commercial systems and is not available for public consumption. As a consequence, researchers reached opposite conclusions: for example, in the case of size metrics, Gyimothy et al. reported good results [GFS05], as opposed to the findings observed by Fenton and Ohlsson [FO00].

What is missing is a baseline against which approaches can be compared. Our integrated Mevo meta-model describes various aspects of a software system's evolution, containing the information required to evaluate several approaches across the bug prediction spectrum. Therefore, exploiting the Mevo meta-model, we provide a baseline to compare defect prediction techniques in the form of a publicly available benchmark.
We do so by gathering an extensive dataset composed of several open-source systems. In particular, we provide, for five open-source software systems and over a five-year period, the following data: (1) process metrics on all the files of each system, (2) system metrics on bi-weekly versions of each system, (3) defect information related to each system file, and (4) bi-weekly source code models of each system version, in case new metrics need to be computed.

We use our benchmark to evaluate a representative selection of defect prediction approaches from the literature. Moreover, we devise two novel defect prediction techniques based on information extracted from bi-weekly samples of the source code, available in Mevo as well as in our benchmark. These techniques extend previous defect prediction approaches: the first one measures code churn as deltas of source code metrics instead of line-based code churn; the second one extends Hassan's concept of entropy of changes [Has09] to source code metrics. We evaluate the two novel techniques on the benchmark, obtaining the best and most stable prediction results in our comparison.
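For intuition on the starting point of the second technique: Hassan's entropy of changes measures how scattered a period's modifications are across files. The sketch below shows only the underlying Shannon entropy; it is our illustration, and Hassan's actual model adds normalization and decay variants on top of it.

```python
from math import log2

def change_entropy(changes_per_file):
    """Shannon entropy of the distribution of changes across files in a
    period: 0 if a single file absorbs all changes, log2(n) if the
    changes are spread evenly over n files."""
    total = sum(changes_per_file)
    probabilities = [c / total for c in changes_per_file if c > 0]
    return -sum(p * log2(p) for p in probabilities)
```

A period in which four files receive five changes each yields entropy 2.0, while a period in which one file absorbs all twenty changes yields 0.0; higher entropy means more scattered, and hence harder to track, change activity.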

Structure of the chapter. In Section 7.1 we motivate the need for a baseline to compare defect prediction techniques. We describe our experiments and evaluation procedure in Section 7.2. In Section 7.3, we detail the approaches that we reproduce and the ones that we introduce. We detail the benchmark dataset in Section 7.4 and report on the performance of the compared approaches in Section 7.5. In Section 7.6, we discuss possible threats to the validity of our findings, and we conclude in Section 7.7.

7.1 Motivations

In Section 2.5.2 we surveyed approaches to defect prediction, the kind of data they require and the various datasets on which they were validated. Here we recall the main bug prediction families and observe why techniques are difficult to compare. Defect prediction approaches can be classified in three main families:

1. SCM approaches use information extracted from versioning systems, assuming that recently or frequently changed files are the most probable source of future bugs. These techniques exploit measures such as code churn, entropy of changes (measuring the complexity of code changes), number of authors, file size, file age, etc. Other approaches combine ver- sioning system data with information extracted from defect archives, where the hypothesis is that software artifacts presenting defects in the past will suffer them also in the future.

2. Single-version approaches assume that the current design and behavior of the program influences the presence of future defects. These approaches do not require the history of the system, but analyze its current state in more detail, using a variety of metrics. One standard set of metrics used is the Chidamber and Kemerer (CK) metrics suite [CK94]. Other used metrics include the McCabe’s cyclomatic complexity, Briand’s coupling metrics [BDW99], object-oriented metrics (number of classes, methods, attributes, fan in, fan out and others), etc.

3. Other approaches exploit different types of data such as network metrics computed on developer-artifact networks or graphs of binary dependencies, cohesion measurements based on information retrieval techniques, call structure metrics, etc.

Surveying defect prediction approaches, we observe that they are difficult to compare for three main reasons:

1. The case study varies. Varying case studies make a comparative evaluation of the results difficult. Validations performed on industrial systems are not reproducible, because it is not possible to obtain the data that was used. There is also some variation among open-source case studies, as some approaches have more restrictive requirements than others.

2. The granularity of approaches varies. Some techniques predict defects at the class level, others consider files, while others consider modules or directories (subsystems), or even binaries.

3. The prediction task varies. While some approaches predict the presence or absence of bugs for each component, others predict the number of bugs affecting each component in the future, producing a ranked list of components. Approaches in the first group are usually evaluated using precision, recall, f-measure and accuracy; techniques in the second group are evaluated with Pearson's or Spearman's correlation coefficients.

The above reasons explain the lack of comparison between approaches and the occasionally diverging results when comparisons are performed.

7.2 Experiments

In this section, we give an overview of the prediction task we perform and justify our choices; we then detail the evaluation strategy for our task, before presenting the tool that we implemented to support bug prediction.

7.2.1 Prediction Task

We compare different bug prediction approaches in the following way: Given a release x of a software system s, released at date d, the task is to predict, for each class of x, the number of post-release defects, i.e., the number of defects reported from d to six months later. We chose the last release of each system in the covered time period, and we perform class-level defect prediction, and not package- or subsystem-level defect prediction, for the following reasons:

• Predictions at the package level are less helpful, since packages are significantly larger: Reviewing a defect-prone package requires more work than reviewing a class.

• Classes are the building blocks of object-oriented systems, and are self-contained elements from the point of view of design and implementation.

• Package-level information can be derived from class-level information, while the opposite is not true.

We predict the number of bugs in each class—not the presence/absence of bugs—as this better fits the resource allocation scenario, where we want an ordered list of classes. We use post-release defects for validation (i.e., not all defects in the history) to emulate a real-life scenario. Like Zimmermann et al., we use a six-month time interval for post-release defects [ZPZ07].

7.2.2 Evaluating the Approaches

To compare bug prediction approaches we apply them to the same software systems and, for each system, to the same dataset. We consider the last major releases of the software systems and compute the predictors up to the release dates. We base our predictions on generalized linear regression models [DB08] built from the metrics we computed. The independent variables (used for the prediction) are the set of metrics under study for each class, while the dependent variable (the predicted one) is the number of post-release defects. Following the methodology proposed by Nagappan et al. [NBZ06] (and also used by Zimmermann et al. [ZN08]), we perform principal component analysis, build regression models, and evaluate explanative and predictive power.

Principal Component Analysis (PCA). PCA [Jac03] is a standard statistical technique to avoid the problem of multicollinearity among the independent variables. This problem comes from intercorrelations amongst these variables and can lead to an inflated variance in the estimation of the dependent variable. We do not build the regression models using the actual variables (e.g., metrics) as independent variables, but instead we use sets of principal components (PCs). PCs are independent and do not suffer from multicollinearity, while at the same time they account for as much sample variance as possible. We select sets of PCs that account for a cumulative sample variance of at least 95%.
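To make the selection concrete, the following sketch shows how principal components covering at least 95% of the sample variance can be selected (a minimal Python/numpy illustration, not the statistical tooling used in our experiments; all names are ours):

```python
import numpy as np

def principal_components(X, variance_threshold=0.95):
    """Project the metric matrix X (rows = classes, columns = metrics)
    onto the smallest set of principal components accounting for at
    least `variance_threshold` of the sample variance."""
    Xc = X - X.mean(axis=0)                    # center each metric
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)      # variance ratio per PC
    k = int(np.searchsorted(np.cumsum(explained), variance_threshold)) + 1
    return Xc @ Vt[:k].T                       # scores of the selected PCs

# Two strongly correlated metrics (multicollinearity) collapse onto a
# single principal component.
rng = np.random.default_rng(0)
m1 = rng.normal(size=200)
X = np.column_stack([m1, 2 * m1 + rng.normal(scale=0.01, size=200)])
pcs = principal_components(X)
```

The returned component scores, which are mutually uncorrelated, then replace the raw metrics as independent variables of the regression models.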

Building Regression Models. To evaluate the predictive power of the regression models we do cross-validation: We use 90% of the dataset, i.e., 90% of the classes (training set), to build the prediction model, and the remaining 10% of the dataset (validation set) to evaluate the efficacy of the built model. For each model we perform 50 “folds”, i.e., we create 50 random 90%-10% splits of the data.

Evaluating Explanative Power. To evaluate the explanative power of the regression models we use the adjusted R2 coefficient. The (non-adjusted) R2 is the ratio of the regression sum of squares to the total sum of squares. R2 ranges from 0 to 1, and the higher the value, the more variability is explained by the model, i.e., the better the explanative power of the model. The adjusted R2 takes into account the degrees of freedom of the independent variables and the sample population. As a consequence, it is consistently lower than R2. When reporting results, we only mention the adjusted R2. We test the statistical significance of the regression models using the F-test. All our regression models are significant at the 99% level (p < 0.01).

Evaluating Predictive Power. To evaluate the predictive power of the regression models, we compute Spearman's correlation (rspm) between the predicted number of post-release defects and the actual number. Spearman's correlation is computed on two lists and is an indicator of the similarity of their order. It ranges from 1 (perfect correlation) to -1 (perfect inverse correlation), where 0 indicates no correlation. We decided to measure the correlation with Spearman's coefficient (instead of, for example, Pearson's coefficient), as it is recommended with skewed data. Our lists with the number of actual and predicted bugs per class are skewed, because most of the classes have no bugs. This evaluation approach has been broadly used to assess the predictive power of a number of predictors [OW02; OWB04; NBZ06; ZPZ07; OWB07]. In the cross-validation, for each random split, we use the training set (90% of the dataset) to build the regression model, and then we apply the obtained model on the validation set (10% of the dataset), producing for each class the predicted number of post-release defects. Then, to evaluate the performance of the prediction, we compute Spearman's correlation, on the validation set, between the lists of classes ranked according to the predicted and actual number of post-release defects. Since we perform 50-fold cross-validation, the final values of Spearman's correlation and the adjusted R2 are averages over the 50 folds.
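The evaluation procedure can be summarized as follows (an illustrative Python sketch; our experiments use Matlab scripts, and ordinary least squares stands in here for the generalized linear models):

```python
import numpy as np

def rank_avg(v):
    """Ranks with ties averaged, as required by Spearman's correlation."""
    v = np.asarray(v)
    order = np.argsort(v, kind="stable")
    ranks = np.empty(len(v), dtype=float)
    i = 0
    while i < len(v):
        j = i
        while j + 1 < len(v) and v[order[j + 1]] == v[order[i]]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2 + 1   # 1-based average rank
        i = j + 1
    return ranks

def spearman(a, b):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    return float(np.corrcoef(rank_avg(a), rank_avg(b))[0, 1])

def cross_validate(X, y, folds=50, seed=42):
    """For each fold: fit on a random 90% of the classes, predict the
    number of post-release defects on the remaining 10%, and measure
    Spearman's correlation between predicted and actual values."""
    rng = np.random.default_rng(seed)
    corrs = []
    for _ in range(folds):
        idx = rng.permutation(len(y))
        cut = int(0.9 * len(y))
        tr, va = idx[:cut], idx[cut:]
        A = np.column_stack([np.ones(len(tr)), X[tr]])
        coef, *_ = np.linalg.lstsq(A, y[tr], rcond=None)
        pred = np.column_stack([np.ones(len(va)), X[va]]) @ coef
        corrs.append(spearman(pred, y[va]))
    return float(np.mean(corrs))
```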

7.2.3 Tool Support

To conduct the experiments presented in the third part of our dissertation, we implemented a tool that we named "Pendolino". Pendolino remotely accesses the data residing in the Churrasco framework, i.e., Mevo models, and computes a number of attributes about a given model. For example, Pendolino calculates all the metrics used as predictors in the experiments discussed in this chapter. The tool also computes several other metrics that we present in the following chapters. Pendolino is written in Smalltalk. It does not feature a user interface, but instead it provides a scriptable console to compute metrics and to export them as csv files. This way, we can import and use the metrics in statistical tools such as Matlab, R or SPSS.1 We carry out the bug prediction task detailed above by means of Matlab scripts. These scripts perform principal component analysis, extract generalized regression models, perform cross-validation, and compute explanative and predictive power for the extracted models.

7.3 Bug Prediction Approaches

Table 7.1 summarizes the bug prediction approaches that we compare. In the following we detail each approach.

Table 7.1. Categories of bug prediction approaches

Type                           Rationale                                      Used by
Change metrics                 Bugs are caused by changes.                    Moser [MPS08]
Previous defects               Past defects predict future defects.           Kim [KZWZ07]
Source code metrics            Complex components are harder to change,       Basili [BBM96]
                               and hence error-prone.
Entropy of changes             Complex changes are more error-prone           Hassan [Has09]
                               than simpler ones.
Churn (source code metrics)    Source code metrics are a better               Novel
                               approximation of code churn.
Entropy (source code metrics)  Source code metrics better describe            Novel
                               the entropy of changes.

7.3.1 Change Metrics

We selected the approach of Moser et al. as a representative, and describe three additional variants.

MOSER. We use the catalog of file-level change metrics introduced by Moser et al. [MPS08] listed in Table 7.2. The metric NFIX represents the number of bug fixes as extracted from the versioning system log files, not the defect archive. It uses a heuristic based on pattern matching

1See http://www.mathworks.com, http://www.r-project.org and http://www.spss.com

Table 7.2. Change metrics used by Moser et al.

Change Metrics
NR      Number of revisions (number of versions of an artifact)
NREF    Number of times a file has been refactored
NFIX    Number of times a file was involved in bug-fixing
NAUTH   Number of authors who committed the file
LINES   Lines added and removed (sum, maximum, average)
CHURN   Code churn (sum, maximum and average)
CHGSET  Change set (transaction) size (maximum and average)
AGE     Age and weighted age

on the comments of every commit. To be recognized as a bug fix, the comment must match the string “%fix%” and not match the strings “%prefix%” and “%postfix%”. The bug repository is not required, because all the metrics are extracted from the CVS/SVN logs, thus simplifying data extraction.
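The heuristic can be sketched as follows (a minimal Python illustration; treating the SQL-style % wildcards as substring matching and lower-casing the comment are our assumptions):

```python
def is_bug_fix(comment):
    """NFIX heuristic: a commit comment counts as a bug fix if it
    matches %fix% but neither %prefix% nor %postfix% (the SQL-style
    % wildcards reduce to substring tests)."""
    c = comment.lower()   # case-insensitive matching is an assumption
    return "fix" in c and "prefix" not in c and "postfix" not in c

def nfix(comments):
    """NFIX metric of a file: number of bug-fixing commits touching it."""
    return sum(is_bug_fix(c) for c in comments)
```

For example, nfix(["fix typo", "refactor parser", "bugfix for crash"]) counts two bug-fixing commits.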

NFIX. Zimmermann et al. showed that the number of past defects has the highest correlation with number of future defects [ZPZ07]. We inspect whether the set of change metrics can be reduced to this variable only—an approximation of the actual defect count—and how much precision is lost in the process.

NR. In the same fashion, since Graves et al. showed that the best generalized linear models for defect prediction are based on number of changes [GKMS00], we isolate the number of revisions as a predictive variable.

NFIX+NR. We combine the previous two approaches.

7.3.2 Previous Defects

This approach relies on a single metric to perform its prediction. We also describe a more fine-grained variant exploiting the categories present in defect archives.

BUGFIXES. The bug prediction approach based on previous defects, proposed by Zimmermann et al. [ZPZ07], states that the number of past bug fixes extracted from the repository is correlated with the number of future fixes. They then use this metric in the set of metrics with which they predict future defects. This measure is different from the metric used in NFIX and NFIX+NR: For NFIX, we perform pattern matching on the commit comments. For BUGFIXES, we also perform the pattern matching, which in this case produces a list of potential defects. Using the defect id, we check whether the bug exists in the bug database; if so, we retrieve it and verify the consistency of its timestamps (i.e., whether the bug was reported before being fixed).

Variant: BUG-CATEGORIES. We also use a variant in which, as predictors, we use the number of bugs belonging to five categories, according to severity and priority. The categories are: All bugs, non-trivial bugs (severity>trivial), major bugs (severity>major), critical bugs (critical or blocker severity) and high priority bugs (priority>default).

7.3.3 Source Code Metrics

Many approaches in the literature use the CK metrics. We compare them with additional object-oriented metrics and LOC. Table 7.3 lists all source code metrics we use. The metrics are computed on the latest version of the system for each eligible class.

Table 7.3. Class level source code metrics

CK metrics
WMC    Weighted Method Count
DIT    Depth of Inheritance Tree
RFC    Response For Class
NOC    Number Of Children
CBO    Coupling Between Objects
LCOM   Lack of Cohesion in Methods

Other object-oriented metrics
FanIn  Number of other classes that reference the class
FanOut Number of other classes referenced by the class
NOA    Number of attributes
NOPA   Number of public attributes
NOPRA  Number of private attributes
NOAI   Number of attributes inherited
LOC    Number of lines of code
NOM    Number of methods
NOPM   Number of public methods
NOPRM  Number of private methods
NOMI   Number of methods inherited

CK. Many bug prediction approaches are based on metrics, in particular the Chidamber & Kemerer suite [CK94].

OO. An additional set of object-oriented metrics.

CK+OO. The combination of the two sets of metrics.

LOC. Gyimothy et al. and Ostrand et al. showed that lines of code (LOC) is one of the best metrics for fault prediction [GFS05; OW02; OWB04; OWB07]. We treat it as a separate predictor.

7.3.4 Entropy of Changes

Hassan predicts defects using the entropy (or complexity) of code changes [Has09]. The idea is to measure, over a time interval, how distributed the changes in a system are: The more spread out the changes, the higher the complexity. The intuition is that a change affecting one file only is simpler than one affecting many different files, as the developer who performs the change has to keep track of all of them. Hassan proposed to use the Shannon entropy, defined as:

H_n(P) = -\sum_{k=1}^{n} p_k \cdot \log_2 p_k \qquad (7.1)

where p_k is the probability that file k changes during the considered time interval. Figure 7.1 shows an example with three files and three time intervals.

[Figure: three files (A, B and C) on a timeline divided into three two-week intervals t1, t2 and t3, with the changes to each file marked in each interval.]

Figure 7.1. An example of entropy of code changes

In the first time interval t1, we have a total of four changes, and the change frequencies of the files (i.e., their probabilities of change) are p_A = 2/4, p_B = 1/4, p_C = 1/4. The entropy in t1 is therefore:

H = -(2/4) \log_2 (2/4) - (1/4) \log_2 (1/4) - (1/4) \log_2 (1/4) = 1.5

In t2, the entropy is lower:

H = -(2/7) \log_2 (2/7) - (1/7) \log_2 (1/7) - (4/7) \log_2 (4/7) = 1.378

As in Hassan's approach [Has09], to compute the probability that a file changes, instead of simply using the number of changes, we take into account the amount of change by measuring the number of modified lines (lines added plus deleted) during the time interval. Hassan defined the Adaptive Sizing Entropy as:

H' = -\sum_{k=1}^{n} p_k \cdot \log_{\bar{n}} p_k \qquad (7.2)

where n is the number of files in the system and n̄ is the number of recently modified files. To compute the set of recently modified files we use previous periods (e.g., the files modified in the last six time intervals). To use the entropy of code changes as a bug predictor, Hassan defined the History of Complexity Metric (HCM) of a file j as:

HCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} HCPF_i(j) \qquad (7.3)

where {a,..,b} is a set of evolution periods and HCPF is defined as:

HCPF_i(j) = \begin{cases} c_{ij} \cdot H'_i, & j \in F_i \\ 0, & \text{otherwise} \end{cases} \qquad (7.4)

where i is a period with entropy H'_i, F_i is the set of files modified in the period i and j is a file belonging to F_i. According to the definition of c_{ij}, we test the following metrics:

• HCM: c_{ij} = 1; every file modified in the considered period i gets the entropy of the system in that time interval.

• WHCM: c_{ij} = p_j; each modified file gets the entropy of the system weighted with the probability of the file being modified.

• c_{ij} = 1/|F_i|; the entropy is evenly distributed among all the files modified in the period i. We do not use this definition, since Hassan showed that it performs worse than the others.

Concerning the periods used for computing the History of Complexity Metric, we employ two-week time intervals.
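The computation of the basic HCM can be sketched as follows (an illustrative Python version that, for brevity, uses raw change counts rather than the modified-line counts described above):

```python
import math

def shannon_entropy(changes):
    """Entropy of one period (Eq. 7.1); `changes` holds the number of
    changes per file in that period."""
    total = sum(changes)
    ps = [c / total for c in changes if c > 0]
    return -sum(p * math.log2(p) for p in ps)

def hcm(history):
    """History of Complexity Metric with c_ij = 1: every file modified
    in a period accumulates that period's entropy. `history` is a list
    of {file: number_of_changes} dictionaries, one per period."""
    scores = {}
    for period in history:
        h = shannon_entropy(list(period.values()))
        for f, n in period.items():
            if n > 0:
                scores[f] = scores.get(f, 0.0) + h
    return scores

# The worked example above: period t1 has changes (2, 1, 1) and
# period t2 has changes (2, 1, 4).
h_t1 = shannon_entropy([2, 1, 1])   # 1.5
h_t2 = shannon_entropy([2, 1, 4])   # ~1.378
```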

Variants. We define three further variants based on HCM, with an additional weight for periods in the past. In EDHCM (Exponentially Decayed HCM, introduced by Hassan), the entropies of earlier periods of time, i.e., earlier modifications, have their contribution exponentially reduced over time, modeling an exponential decay. Similarly, LDHCM (Linearly Decayed HCM) and LGDHCM (LoGarithmically Decayed HCM) have the contributions reduced over time in a linear and logarithmic fashion, respectively. Both are novel. The definitions of the variants follow (φ1, φ2 and φ3 are the decay factors):

EDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{e^{\phi_1 \cdot (|\{a,..,b\}| - i)}} \qquad (7.5)

LDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{\phi_2 \cdot (|\{a,..,b\}| + 1 - i)} \qquad (7.6)

LGDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{\phi_3 \cdot \ln(|\{a,..,b\}| + 1.01 - i)} \qquad (7.7)

7.3.5 Churn of Source Code Metrics

Using the churn of source code metrics to predict post-release defects is novel. The intuition is that higher-level metrics may better model code churn than simple metrics like the addition and deletion of lines of code. We sample the history of the source code every two weeks and compute the deltas of the source code metrics for each consecutive pair of samples. For each source code metric, we create a matrix M where the rows are the classes, the columns are the sampled versions (the FAMIX models in Mevo), and each cell is the value of the metric for the given class at the given version. If a class does not exist in a version, we indicate that by using a default value of -1. We only consider the classes that exist at release x for the prediction. Starting from the matrix M, we generate a matrix of deltas D, where each cell is the absolute value of the difference between the values of a metric in two subsequent versions (two consecutive cells of the same row of M). If the class does not exist in one or both of the versions (at least one value is -1), then the delta is also -1. Figure 7.2 shows an example of a deltas matrix for three classes. The numbers in the squares are metric values, while the numbers in the circles are deltas. After computing the deltas matrix for each source code metric, we compute the churn as:

CHU(i) = \sum_{j=1}^{C} \begin{cases} 0, & D(i,j) = -1 \\ PCHU(i,j), & \text{otherwise} \end{cases} \qquad (7.8)

PCHU(i,j) = D(i,j) \qquad (7.9)

where i is the index of a row in the deltas matrix D (corresponding to a class), C is the number of columns of the matrix (corresponding to the number of samples considered), D(i,j) is the value of the matrix at position (i,j) and PCHU stands for partial churn. In other words, for each class, we sum all the cells over the columns, excluding the ones with the default value of -1. In this fashion we obtain a set of churns of source code metrics at the class level, which we use as predictors of post-release defects.

[Figure: the deltas matrix for three classes over four bi-weekly versions (1.1.2005, 15.1.2005, 29.1.2005 and release X). Metric values, with deltas in between: class Foo: 10, (40), 50, (0), 50, (20), 70; class Bar: 42, (10), 32, (10), 22, (0), 22; class Boo: -1, (-1), 10, (5), 15, (33), 48.]

Figure 7.2. Computing metric deltas from sampled versions (FAMIX models) of a system
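The churn computation over the deltas matrix can be sketched as follows (a minimal Python illustration with an illustrative deltas matrix):

```python
def churn(deltas):
    """CHU (Eqs. 7.8 and 7.9): per-class sum of the metric deltas over
    all sampled versions, skipping the -1 cells that mark versions in
    which the class does not exist."""
    return [sum(d for d in row if d != -1) for row in deltas]

# Illustrative deltas matrix: one row per class, one column per pair of
# consecutive bi-weekly versions.
deltas = [[40, 0, 20],    # class that always exists
          [10, 10, 0],
          [-1, 5, 33]]    # class created after the first sample
chu = churn(deltas)       # [60, 20, 38]
```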

Variants. We define several variants of the partial churn of source code metrics (PCHU): The first one weights the frequency of change (i.e., the fact that the delta is greater than 0) more than the amount of change (the delta value). We call it WCHU (weighted churn), using the following partial churn:

WPCHU(i,j) = 1 + \alpha \cdot D(i,j) \qquad (7.10)

where α is the weight factor, set to 0.01 in our experiments. This avoids that a delta of 10 in a metric has the same impact on the churn as ten deltas of 1. In fact, we consider many small changes more relevant than a few big changes. Other variants are based on the weighted churn (WCHU) and take into account the decay of deltas over time, in an exponential (EDCHU), linear (LDCHU) and logarithmic (LGDCHU) manner respectively, with the following partial churns

(φ1, φ2 and φ3 are the decay factors):

EDPCHU(i,j) = \frac{1 + \alpha \cdot D(i,j)}{e^{\phi_1 \cdot (C - j)}} \qquad (7.11)

LDPCHU(i,j) = \frac{1 + \alpha \cdot D(i,j)}{\phi_2 \cdot (C + 1 - j)} \qquad (7.12)

LGDPCHU(i,j) = \frac{1 + \alpha \cdot D(i,j)}{\phi_3 \cdot \ln(C + 1.01 - j)} \qquad (7.13)
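A sketch of the decayed variants (illustrative Python; the `kind` parameter and the default values are ours):

```python
import math

def decayed_churn(deltas, kind="exponential", alpha=0.01, phi=1.0):
    """Decayed weighted churn of one class (Eqs. 7.11-7.13): deltas of
    recent samples (large j) contribute more than older ones; -1 marks
    a version in which the class does not exist."""
    C = len(deltas)
    total = 0.0
    for j, d in enumerate(deltas, start=1):
        if d == -1:
            continue
        num = 1 + alpha * d               # weighted partial churn
        if kind == "exponential":         # EDCHU
            total += num / math.exp(phi * (C - j))
        elif kind == "linear":            # LDCHU
            total += num / (phi * (C + 1 - j))
        else:                             # LGDCHU (logarithmic)
            total += num / (phi * math.log(C + 1.01 - j))
    return total
```

With two samples of delta 100 and φ1 = 1, the older sample contributes (1 + 0.01·100)/e = 2/e while the most recent one contributes 2, so recent churn dominates.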

7.3.6 Entropy of Source Code Metrics

In this bug prediction approach we extend the concept of code change entropy [Has09] to the source code metrics listed in Table 7.3. The idea is to measure the complexity of the variation of a metric over subsequent sampled versions: The more distributed over multiple classes the variation of the metric is, the higher the complexity. For example, if the WMC of the system changed by 100 and only one class is involved, the entropy is minimal, whereas if 10 classes are involved with a local change of 10 WMC each, then the entropy is higher. To compute the entropy of source code metrics we start from the deltas matrix, computed as for the churn metrics. We define the entropy (for instance for WMC) for the column j of the deltas matrix, i.e., the entropy between two subsequent sampled versions of the system, as:

H'_{WMC}(j) = \sum_{i=1}^{R} \begin{cases} 0, & D(i,j) = -1 \\ -p(i,j) \cdot \log_{\bar{R}_j} p(i,j), & \text{otherwise} \end{cases} \qquad (7.14)

where R is the number of rows of the matrix, R̄_j is the number of cells of column j greater than 0 and p(i,j) is a measure of the frequency of change (viewing frequency as a measure of probability, similarly to Hassan) of the class i, for the given source code metric. We define it as:

p(i,j) = \frac{D(i,j)}{\sum_{k=1}^{R} \begin{cases} 0, & D(k,j) = -1 \\ D(k,j), & \text{otherwise} \end{cases}} \qquad (7.15)

Equation 7.14 defines an adaptive sizing entropy, since we use R̄_j as the base of the logarithm instead of R (the number of cells greater than 0 instead of the total number of cells). In the example in Figure 7.2, the entropies for the first two columns are:

H'(1) = -(40/50) \log_2 (40/50) - (10/50) \log_2 (10/50) = 0.722

H'(2) = -(10/15) \log_2 (10/15) - (5/15) \log_2 (5/15) = 0.918

Given a metric, for example WMC, and a class corresponding to a row i in the deltas matrix, we define the history of entropy as:

HH_{WMC}(i) = \sum_{j=1}^{C} \begin{cases} 0, & D(i,j) = -1 \\ PHH_{WMC}(i,j), & \text{otherwise} \end{cases} \qquad (7.16)

PHH_{WMC}(i,j) = H'_{WMC}(j) \qquad (7.17)

where PHH stands for partial historical entropy. Compared to the entropy of changes, the entropy of source code metrics has the advantage that it is defined for every considered source code metric. If we consider lines of code (LOC), the two metrics are very similar: HCM has the benefit that it is not sampled, i.e., it captures all changes recorded in the versioning system, whereas HH_LOC, being sampled, might lose precision. For instance, if in the considered time interval one class first has an addition of 10 LOC and then a removal of 10 LOC, for HH_LOC the class does not change at all, while with HCM we measure a change of 20 lines. However, using a sample rate of two weeks we do not lose too many changes. On the other hand, HH_LOC is more precise, as it measures the real number of lines of code (by parsing the source code), while HCM measures it from the change log, which includes comments and whitespace.
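The entropy of a deltas-matrix column and the resulting history of entropy can be sketched as follows (illustrative Python; returning 0 when fewer than two cells changed, where the adaptive logarithm base is undefined, is our assumption):

```python
import math

def column_entropy(column):
    """Adaptive sizing entropy H'(j) of one deltas-matrix column
    (Eqs. 7.14 and 7.15): -1 cells are skipped and the logarithm base
    is the number of cells with a delta greater than zero."""
    valid = [d for d in column if d != -1]
    total = sum(valid)
    base = sum(1 for d in valid if d > 0)
    if total == 0 or base < 2:
        return 0.0            # degenerate column: assumption, see above
    return -sum((d / total) * math.log(d / total, base)
                for d in valid if d > 0)

def history_of_entropy(deltas):
    """HH (Eqs. 7.16 and 7.17): each class accumulates the entropy of
    every sampled interval in which it exists."""
    ents = [column_entropy(list(col)) for col in zip(*deltas)]
    return [sum(h for d, h in zip(row, ents) if d != -1)
            for row in deltas]

# The two columns of the worked example, deltas (40, 10, -1) and
# (0, 10, 5), yield entropies of about 0.722 and 0.918.
```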

Variants. In Equation 7.17 each class that changes between two versions (delta greater than 0) gets the entire system entropy. To also take into account how much the class changed, we define the history of weighted entropy HWH, by redefining PHH as:

HWH(i,j) = p(i,j) \cdot H'(j) \qquad (7.18)

We also define three other variants by considering the decay of the entropy over time, as for the churn metrics, in an exponential (EDHH), linear (LDHH), and logarithmic (LGDHH) fashion.

We define their partial historical entropies as (φ1, φ2 and φ3 are the decay factors):

EDHH(i,j) = \frac{H'(j)}{e^{\phi_1 \cdot (C - j)}} \qquad (7.19)

LDHH(i,j) = \frac{H'(j)}{\phi_2 \cdot (C + 1 - j)} \qquad (7.20)

LGDHH(i,j) = \frac{H'(j)}{\phi_3 \cdot \ln(C + 1.01 - j)} \qquad (7.21)

Based on the previous equations, we define several prediction models using several object-oriented metrics: HH, HWH, EDHH, LDHH and LGDHH.

7.4 Benchmark Dataset

Our dataset is composed of the change, bug and version information of the five systems detailed in Table 7.4. All systems are written in Java to ensure that all the code metrics are defined identically for each system. By using the same parser, we can avoid issues due to behavior differences in parsing, a known problem for reverse engineering tools [KSS02]. We filter out test classes from our dataset, since they are not relevant for the defect prediction task.

Table 7.4. Systems in the benchmark

System             URL                          Prediction  Time period           #Classes  #Versions  #Transactions  #Post-rel.
                                                release                                                               defects
Eclipse JDT Core   www.eclipse.org/jdt/core/    3.4         1.01.2005-6.17.2008   997       91         9,135          463
Eclipse PDE UI     www.eclipse.org/pde/pde-ui/  3.4.1       1.01.2005-9.11.2008   1,562     97         5,026          401
Equinox framework  www.eclipse.org/equinox/     3.4         1.01.2005-6.25.2008   439       91         1,616          279
Mylyn              www.eclipse.org/mylyn/       3.1         1.17.2005-3.17.2009   2,196     98         9,189          677
Apache Lucene      lucene.apache.org            2.4.0       1.01.2005-10.08.2008  691       99         1,715          103

Figure 7.3 shows the types of information needed by the compared bug prediction approaches: (1) versioning system data to extract process metrics, (2) source code snapshots to compute source code metrics and (3) defect information linked to classes for both the prediction and validation. All this data can be extracted from a model conforming to the Mevo meta-model

[Figure: four panels, one per family of approaches (Change Metrics & Entropy of Changes; Previous Defects; Source Code Metrics; Entropy & Churn of Source Code Metrics). Each panel shows, on a timeline ending at the date of release X, which data is used for the prediction (versioning system data, bug data, the last version or the bi-weekly code snapshots) and which for the validation (post-release defects from the bug data oracle).]

Figure 7.3. The types of data used by different bug prediction approaches.

specification, and thus it is accessible in our Churrasco framework. However, since computing the metrics needed for the prediction is onerous, to provide a ready-to-use benchmark and to ease the reproducibility of our experiments, we distribute the dataset as a self-contained bundle, composed of csv and mse2 files. In particular, the bundled dataset contains:

• Bi-weekly snapshots of the systems as FAMIX models.

• Bi-weekly values of 17 source code metrics (CK and OO) for each class.

• Categorized—with severity and priority—pre- and post-release defect counts for each class.

• Values of 15 change metrics for each class.

• Churn of source code metrics (CK and OO) over bi-weekly versions of the systems, plus weighted, linear, exponential and logarithmic variants (for each class).

• Entropy of source code metrics (CK and OO) over bi-weekly versions of the systems, plus weighted, linear, exponential and logarithmic variants (for each class).

• Values of the entropy of code change metric, plus weighted, linear, exponential and loga- rithmic variants (for each class).

All the metrics included in the dataset are computed by the Pendolino tool, which also generates the csv files. The benchmark is publicly available at http://bug.inf.usi.ch.

2mse is a file format used to exchange FAMIX models: http://scg.unibe.ch/wiki/projects/fame/mse

Table 7.5. Explanative power for all the bug prediction approaches

Adjusted R2 - Explanative power

Predictor                 Eclipse   Mylyn   Equinox   PDE     Lucene   Score

Change metrics (Section 7.3.1)
MOSER                     0.454     0.206   0.596     0.517   0.57     9
NFIX                      0.143     0.043   0.421     0.138   0.398    -3
NR                        0.38      0.128   0.52      0.365   0.487    2
NFIX+NR                   0.383     0.129   0.521     0.365   0.459    2

Previous defects (Section 7.3.2)
BF (short for BUGFIXES)   0.487     0.161   0.503     0.539   0.559    5
BUG-CAT                   0.455     0.131   0.469     0.539   0.559    5

Source code metrics (Section 7.3.3)
CK+OO                     0.419     0.195   0.673     0.634   0.379    8
CK                        0.382     0.115   0.557     0.058   0.368    0
OO                        0.406     0.17    0.619     0.618   0.209    6
LOC                       0.348     0.039   0.408     0.04    0.077    -3

Entropy of changes (Section 7.3.4)
HCM                       0.366     0.024   0.495     0.13    0.308    -2
WHCM                      0.373     0.038   0.34      0.165   0.49     -1
EDHCM                     0.209     0.026   0.345     0.253   0.22     -4
LDHCM                     0.161     0.011   0.463     0.267   0.216    -4
LGDHCM                    0.054     0       0.508     0.209   0.141    -3

Churn of source code metrics (Section 7.3.5)
CHU                       0.445     0.169   0.645     0.628   0.456    8
WCHU                      0.512     0.191   0.645     0.608   0.478    11
LDCHU                     0.557     0.214   0.581     0.616   0.458    11
EDCHU                     0.509     0.227   0.525     0.598   0.467    11
LGDCHU                    0.473     0.095   0.642     0.486   0.493    5

Entropy of source code metrics (Section 7.3.6)
HH                        0.484     0.199   0.667     0.514   0.433    7
HWH                       0.473     0.146   0.621     0.641   0.484    8
LDHH                      0.531     0.209   0.596     0.522   0.343    8
EDHH                      0.485     0.226   0.469     0.515   0.359    5
LGDHH                     0.479     0.13    0.66      0.447   0.419    4

Combined approaches
BF+CK+OO                  0.492     0.213   0.707     0.649   0.586    13
BF+WCHU                   0.536     0.193   0.645     0.627   0.594    13
BF+LDHH                   0.561     0.217   0.615     0.601   0.592    15
BF+CK+OO+WCHU             0.559     0.25    0.734     0.661   0.61     15
BF+CK+OO+LDHH             0.587     0.262   0.73      0.68    0.618    15
BF+CK+OO+WCHU+LDHH        0.62      0.277   0.754     0.691   0.65     15

7.5 Results

In Tables 7.5 and 7.6, we report the results of each approach on each case study, in terms of explanative power (adjusted R2) and predictive power (Spearman's correlation rspm).

Table 7.6. Predictive power for all the bug prediction approaches

Spearman's correlation rspm - Predictive power

Predictor                 Eclipse   Mylyn    Equinox   PDE     Lucene   Score

Change metrics (Section 7.3.1)
MOSER                     0.323     0.284    0.534     0.165   0.238    6
NFIX                      0.288     0.148    0.429     0.113   0.284    -1
NR                        0.364     0.099    0.548     0.245   0.296    5
NFIX+NR                   0.381     0.091    0.567     0.255   0.277    4

Previous defects (Section 7.3.2)
BF (short for BUGFIXES)   0.41      0.159    0.492     0.279   0.377    10
BUG-CAT                   0.434     0.131    0.513     0.284   0.353    9

Source code metrics (Section 7.3.3)
CK+OO                     0.39      0.299    0.453     0.284   0.214    8
CK                        0.377     0.226    0.484     0.256   0.216    4
OO                        0.395     0.297    0.49      0.263   0.214    6
LOC                       0.38      0.222    0.475     0.25    0.172    2

Entropy of changes (Section 7.3.4)
HCM                       0.416     -0.001   0.526     0.244   0.308    5
WHCM                      0.401     0.076    0.533     0.273   0.288    7
EDHCM                     0.371     0.07     0.495     0.258   0.306    3
LDHCM                     0.377     0.064    0.581     0.28    0.275    6
LGDHCM                    0.364     0.03     0.562     0.263   0.33     5

Churn of source code metrics (Section 7.3.5)
CHU                       0.371     0.226    0.51      0.251   0.292    5
WCHU                      0.419     0.279    0.56      0.278   0.285    13
LDCHU                     0.395     0.275    0.563     0.307   0.293    11
EDCHU                     0.362     0.259    0.464     0.294   0.28     6
LGDCHU                    0.442     0.188    0.566     0.189   0.29     7

Entropy of source code metrics (Section 7.3.6)
HH                        0.405     0.277    0.484     0.266   0.318    9
HWH                       0.425     0.212    0.48      0.266   0.263    5
LDHH                      0.408     0.272    0.53      0.296   0.333    13
EDHH                      0.366     0.273    0.586     0.304   0.337    11
LGDHH                     0.421     0.185    0.492     0.236   0.347    8

Combined approaches
BF+CK+OO                  0.439     0.277    0.547     0.282   0.362    15
BF+WCHU                   0.448     0.265    0.533     0.282   0.31     11
BF+LDHH                   0.422     0.221    0.533     0.305   0.352    12
BF+CK+OO+WCHU             0.425     0.306    0.524     0.31    0.298    11
BF+CK+OO+LDHH             0.44      0.291    0.571     0.312   0.377    15
BF+CK+OO+WCHU+LDHH        0.408     0.326    0.592     0.289   0.341    15

We also compute an overall score in the following way: For each case study, we add three to the score if R2 or rspm is within 90% of the best value, one if it is between 75% and 90%, and we subtract one when it is less than 50%. We use this score, rather than an average of the values, to promote consistency: An approach performing very well on one case study but badly on the others will be penalized. We use the same criteria to highlight the results in Table 7.5 and Table 7.6: Values of R2 and rspm within 90% of the best value are bolded, those within 75% have a dark gray background, while values less than 50% of the best have a light gray background. Scores of 10 or more denote good overall performance; they are underlined. A general observation is the discrepancy between the R2 score and the rspm score for the entropy approaches (HCM-LGDHCM): HCM and its variations are based on a single metric, the number of changes, hence they explain a comparatively smaller portion of the variance, despite predicting well. Based on the results in the tables, we answer several questions.
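The scoring rule can be sketched as follows (illustrative Python; the handling of values exactly on a boundary is our assumption):

```python
def score(values, best):
    """Consistency score of an approach: per case study, +3 if within
    90% of the best value, +1 if between 75% and 90%, -1 if below 50%
    of the best; values in between contribute nothing."""
    s = 0
    for v, b in zip(values, best):
        ratio = v / b
        if ratio >= 0.9:
            s += 3
        elif ratio >= 0.75:
            s += 1
        elif ratio < 0.5:
            s -= 1
    return s

# Hypothetical example: near-best on the first system, middling on the
# second, poor on the third.
example = score([0.58, 0.12, 0.3], [0.6, 0.2, 1.0])   # 3 + 0 - 1 = 2
```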

What is the overall best performing approach? If we do not consider the amount of data needed to compute the metrics and instead compare absolute predictive power, we can infer the following: The best classes of metrics across all the datasets are the churn and the entropy of source code metrics, with WCHU and LDHH in particular scoring most of the time in the top 90% in prediction, and with WCHU also having a good and stable explanative power. The previous defects approaches, BUGFIXES and BUG-CAT, follow. Next comes the single-version code metrics approach CK+OO, followed by the entropy of changes (WHCM) and the change metrics (MOSER).

Approaches based on churn and entropy of source code metrics have good and stable explanative and predictive power, better than all the other applied approaches.

What is the best approach, data-wise? If we take into account the amount of data and computational power needed, one might argue that downloading and parsing several versions of the source code is a costly process: It took several days to download, parse, and extract the metrics for about ninety versions of each software system. Two more lightweight approaches, which work well in most of the cases, are based on previous defects (BUGFIXES) and on source code metrics extracted from a single version (CK+OO). However, approaches based on bug data or on multiple versions have limited usability, as the history of the system is needed, which might be inaccessible or, for newly developed systems, not even existent. This problem does not hold for the source code metrics CK+OO, as only the last version of the system is necessary to extract them.

Using the source code metrics (CK+OO) to predict bugs has several advantages: They are lightweight to compute, have good explanative and predictive power and do not require historical information.

What are the best source code metrics? The CK and OO metrics fare comparably in predictive power (with the exception of Mylyn), whereas the OO metrics have the edge in explanative power. However, the combination of the two metric sets, CK+OO, is a considerable improvement over either of them separately, as its performance is more homogeneous across all case studies. In comparison, using lines of code (LOC) alone, while simple, yields a poor predictor, as its behavior is unstable across systems: Its performance is reasonable on Eclipse, Equinox and PDE, but subpar on Mylyn and Lucene.

Using the CK and the OO metric sets together is preferable to using them in isolation, as the performance is more stable across case studies.

Is there an approach based on a single metric with good and stable performances? We have just seen that LOC is a predictor of variable accuracy. All approaches based on a single metric, i.e., NR, BUGFIXES, NFIX and HCM (and variants) have the same issues: The results are not stable for all the case studies. However, among them BUGFIXES is the best one.

Bug prediction approaches based on a single metric are not stable over the case studies.

What is the best weighting for past metrics? In multi-version approaches and entropy of changes, weighting has an impact on explanative and predictive power. Our results show that the best weighting is linear: Models with linear decay have better predictive power, and better or comparable explanative power, than models with exponential or logarithmic decay (for entropy of changes, churn, and entropy of source code metrics).

The best weighting for past metrics is the linear one.
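The idea of decayed weighting of past periods can be illustrated with a small sketch. The parameterizations below are hypothetical, chosen only to show the qualitative difference between the three schemes; the dissertation's exact decay formulas (Section 7.3) may differ.

```python
import math

def decay_weights(n, scheme):
    """Weights for n past periods, period n being the most recent.

    A hypothetical parameterization: older periods contribute
    less, with linear, exponential, or logarithmic decay.
    """
    ages = range(1, n + 1)  # 1 = oldest period, n = most recent
    if scheme == "linear":
        return [i / n for i in ages]
    if scheme == "exponential":
        return [2.0 ** (i - n) for i in ages]
    if scheme == "logarithmic":
        return [math.log(i + 1) / math.log(n + 1) for i in ages]
    raise ValueError("unknown scheme: %s" % scheme)

def weighted_history(values, scheme):
    """Combine per-period metric values into one weighted sum."""
    weights = decay_weights(len(values), scheme)
    return sum(v * w for v, w in zip(values, weights))
```

Exponential decay discards old history almost entirely, logarithmic decay keeps it almost flat, and linear decay sits in between, which is consistent with the observation that linear decay performed best.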

Are bug fixes extracted from the versioning system a good approximation of actual bugs? If we compare the performance of NFIX with respect to BUGFIXES and BUG-CAT, we see that the heuristic of detecting bugs from commit comments alone is a poor approximation of actual past defects. On the other hand, categorizing bugs brings no improvement.

Using string matching on versioning system comments, without validating it on the bug database, decreases the accuracy of bug prediction.

Can we go further? One can argue that bug information is in any case needed to train the model. We investigated whether adding this information to our best performing approaches would yield improvements at a moderate cost. We tried various combinations of BUGFIXES, CK+OO, WCHU and LDHH. We display the results in the lower part of Table 7.5 and Table 7.6, and see that this yields an improvement, as the BUGFIXES+CK+OO approach scores a 15 (instead of a 10 or an 8), despite being lightweight. The combinations involving WCHU exhibit a gain in explanative but not in predictive power: The Spearman's correlation score is worse for the combinations (11) than for WCHU alone (13). One combination involving LDHH, BF+CK+OO+LDHH, yields a gain in both explanative and predictive power (15 for both). The same holds for the combination of all the approaches (BF+CK+OO+WCHU+LDHH).

Combining bug and OO metrics improves predictive power. Adding this data to WCHU improves explanation, but degrades prediction, while adding it to LDHH improves both explanation and prediction.

7.6 Threats to Validity

Threats to Construct Validity regard the relationship between theory and observation, i.e., the measured variables may not actually measure the conceptual variable. These threats concern the way we link bugs with software artifacts [BBA+09] and the noise affecting Bugzilla repositories [AADP+08]. We already considered them in Section 3.2.3, when discussing the limitations of populating Mevo models.

Threats to Statistical Conclusion Validity concern the relationship between the treatment and the outcome. In our experiments we used the Spearman's correlation coefficient to evaluate the performance of the predictors. All the correlations are significant at the 0.01 level.

Threats to External Validity concern the generalization of the findings. We applied the prediction techniques to open-source software systems only. There are certainly differences between open-source and industrial development, in particular because some industrial settings enforce code quality standards. We minimized this threat by including parts of Eclipse in our benchmark, a system that, while being open-source, has a strong industrial background. A second threat concerns the language: All considered software systems are written in Java. Adding non-Java systems to the benchmark would increase its value, but would introduce problems, since the systems would need to be processed by different parsers, producing variable results. To decrease the impact of a specific technology or tool, our dataset includes systems developed using different versioning systems (CVS and SVN) and different bug tracking systems (Bugzilla and Jira). Moreover, the software systems in our benchmark are developed by independent development teams and emerged from the context of two unrelated communities (Eclipse and Apache).

7.7 Summary

Bug prediction addresses the resource allocation problem: Having an accurate estimate of the distribution of bugs across components helps project managers to optimize the available resources by focusing on the problematic system parts. Different approaches have been proposed to predict future defects in software systems, which vary in the data sources they use and in the systems they were validated on, i.e., no baseline to compare such approaches exists. We introduced a benchmark to allow for a common comparison, which provides all the data needed to apply several prediction techniques proposed in the literature. Our dataset, publicly available at http://bug.inf.usi.ch, allows the reproduction of the experiments reported in this chapter and their comparison with novel defect prediction approaches. For example, Mende used the dataset and replicated our entire set of experiments [Men10], reporting results consistent with ours. We evaluated a selection of representative approaches from the literature, some novel approaches we introduced, and a number of variants. Our results showed that the best performing techniques are WCHU (Weighted Churn of source code metrics) and LDHH (Linearly Decayed Entropy of source code metrics), two novel approaches we proposed. They gave consistently good results, often in the top 90% of the approaches, across all five systems. As WCHU and LDHH require a large amount of data and computation, past defects and source code metrics are lightweight alternatives with overall good performance. Our results provided evidence that prediction techniques based on a single metric do not work consistently well across all systems. In this chapter we exploited all the information included in Mevo: Versioning system data to extract process metrics, multiple FAMIX models over time to compute the churn and entropy of source code metrics, and bug information to count pre- and post-release defects.
In the following chapter, we go one step further: We extend Mevo to include e-mail archive data and investigate whether such data can be used for defect prediction.

Chapter 8

Improving Defect Prediction with Information Extracted from E-Mails

In the previous chapter, we discussed a number of approaches proposed by researchers to predict software defects, exploiting a variety of sources of information, such as source code metrics, code churn, process metrics extracted from versioning system repositories, and past defects. In this chapter, we investigate whether an unexplored source of information, i.e., development mailing lists, can be used for defect prediction. To this aim, we extend the Mevo meta-model with e-mail data.

Due to the increasing extent and complexity of software systems, it is common to see large teams, or even communities, of developers working on the same project in a collaborative fashion. In such cases, e-mails are the favorite medium of coordination among the participants. Mailing lists, which are preferred over person-to-person e-mails, store the history of developer-to-developer, user-to-user, and developer-to-user discussions: Issues range from low-level decisions (e.g., bug fixing, implementation issues) up to high-level considerations (e.g., design rationales, future planning).

Development mailing lists of open source projects are easily accessible, and they contain information that can be exploited to support a number of activities. For example, researchers can improve the understanding of software systems by adding the sparse explanations enclosed in e-mails [ACC+02]; the rationale behind the system design can be extracted from the discussions that took place before the actual implementation [LFGT09]; one can assess the impact of source code changes by analyzing their effect on the mailing list [PBD08]; hidden coupling of entities that are not related at the code level can be discovered if the entities are often mentioned together in discussions. One challenge when dealing with mailing lists as a source of information is correctly linking each e-mail to any source code artifact it discusses.
For this task, we use a lightweight grep-based technique that reaches an acceptable level of precision in the linking [BDLR09]. Such a technique allows us to link e-mails with FAMIX classes. At this point, the question is: Is the information contained in mailing lists relevant for defect prediction?

The source code of software systems is written only by developers, who must follow a rigid and terse syntax to define the abstractions they want to include. At the other end of the spectrum, mailing lists, even those specifically devoted to development, archive e-mails written by both programmers and users. Thus, the entities discussed are not only the most relevant from a development point of view, but also the most exercised during the use of a software system.


In addition, the content of e-mails is expressed in natural language, which does not force the writer to carefully explain every abstraction with the same level of detail, but makes it easy to generalize some concepts and focus on others. For this reason, we expect information extracted from mailing lists to be independent of the information provided by the source code. Thus, e-mails can add valuable information to established defect prediction approaches.

We present different "popularity" metrics that express the importance of each source code entity in the discussions taking place in development mailing lists. Our hypothesis is that such metrics are an indicator of possible flaws in software components, and are thus correlated with the number of defects. We aim at answering the following research questions:

• Q1: Does the popularity of software components in discussions correlate with software defects?

• Q2: Is a regression model based on popularity metrics a good predictor for software defects?

• Q3: Does the addition of popularity metrics improve the prediction performance of existing defect prediction techniques?

We provide the answers to these questions by validating our approach on four different open source software systems.

Structure of the chapter. In Section 8.1 we articulate the methodology that we follow to conduct our experiments: How we collect, process, and analyze the data in order to construct and test popularity metrics. In Section 8.2 we describe the dataset of our case study, how we evaluate the popularity metrics, and the results achieved. We discuss our findings in Section 8.3. In Section 8.4 we list the possible threats to the validity of our experiments and how we strove to reduce them. We review related work in Section 8.5 and conclude in Section 8.6.

8.1 Methodology

Our goal is first to inspect whether popularity metrics correlate with software defects, and then to study whether existing bug prediction approaches can be improved using such metrics. To do so, we follow the methodology depicted in Figure 8.1:

• We extract e-mail data, link it with source code entities (i.e., FAMIX classes) and compute popularity metrics (marked as “1” in Figure 8.1). We extract and evaluate source code and change metrics from Mevo models using our Pendolino tool. The metrics are the same used in Chapter 7, listed in Table 7.2 and Table 7.3.

• We quantify the correlation of popularity metrics with software defects extracted from Mevo (marked as “2”), using as baseline the correlation between source code metrics and software defects (marked as “3”).

• We build regression models with popularity metrics as independent variables and the number of post-release defects as the dependent variable (marked as "4"). We evaluate the performance of the models using the Spearman's correlation between the predicted and the reported bugs (marked as "5"). We create regression models based on source code metrics and change metrics alone, and later enrich these sets of metrics with popularity metrics, to measure the improvement they provide.

[Figure: the e-mail archive is linked with the source code (FAMIX) part of the Mevo model to compute popularity metrics (1); popularity and source code metrics are correlated with reported bugs (2, 3); regression models based on pre-release data (4) are evaluated by comparing predicted and reported bugs via Spearman's correlation (5).]

Figure 8.1. Overall schema of our approach

In the experiments, we focus on object-oriented Java software systems, using classes as target entities. We decided to focus on classes, and not for example on packages, for the same reasons discussed in Section 7.2.1. We do not filter out test classes, because they are mentioned in e-mail discussions. To measure the correlation between metrics and defects we consider all the defects, while for bug prediction we consider only post-release defects, i.e., the ones reported within a six-month time interval after the considered release of the software system (as in the experiments reported in Chapter 7). The extraction of popularity metrics, given a software system and its mailing lists, is done in two steps: First it is necessary to extend Mevo to model e-mail data, i.e., to link each FAMIX class with all the e-mails discussing it; then the metrics are computed using the links obtained.

8.1.1 Extending Mevo to Model E-Mail Data

Figure 8.2 shows the technique to link e-mails to classes. First, we parse the target e-mail archive to build an e-mail model including body, headers, and additional data about inter-message relationships, i.e., thread details. Then, we link each FAMIX class extracted from Mevo with any e-mail referring to it, using lightweight linking techniques based on regular expressions, which have proven to be effective [BDLR09]. We obtain a Mevo model enriched with all the connections and information about classes stored in the e-mail archive. Through this model we can extract the catalog of popularity metrics presented next.

Implementation-wise, augmenting Mevo to include e-mail data consists in extending the meta-model description: The Meta-base component, part of our Churrasco framework, then automatically generates the importer and exporter to store and retrieve the new data to/from the Churrasco backend database.

[Figure: the e-mail archive is parsed into an e-mail model, whose messages are linked to the FAMIX classes of the Mevo model, producing a Mevo model enriched with e-mail links.]

Figure 8.2. Linking e-mails and classes

8.1.2 Popularity Metrics

Table 8.1 lists the popularity metrics that we devised to answer our research questions. For each popularity metric that we propose, we also provide the rationale behind its creation and a high-level description of its implementation, using the enriched Mevo meta-model.

Table 8.1. Class-level popularity metrics

POP-NOM     Number of e-mails
POP-NOCM    Number of characters in e-mails
POP-NOT     Number of threads
POP-NOMT    Number of e-mails in threads
POP-NOA     Number of authors

POP-NOM: To associate the popularity of a class with discussions in mailing lists, we count the number of e-mails that mention it. Since we consider development mailing lists, we presume that classes are mainly mentioned in discussions about failure reporting, bug fixing and feature enhancements, thus they can be related to defects. Thanks to the enriched Mevo model we generate, it is simple to compute this metric. Once the mapping from classes to e-mails is completed, and the model contains the links, we count the number of links of each class.

POP-NOCM: Development mailing lists can also contain topics other than technical discussions. For example, while manually inspecting part of our dataset, we noticed that voting about whether and when to release a new version occurs quite frequently in the Lucene, Maven and Jackrabbit mailing lists. Equally, announcements take place with a certain frequency. Usually this kind of message is characterized by short content (e.g., "yes" or "no" for voting, "congratulations" for announcements). The intuition is that e-mails discussing flaws in the source code are likely to contain more text than e-mails about other topics. We consider the length of messages in terms of the number of characters in the text of e-mails: We evaluate the POP-NOCM metric by summing the number of characters over all the e-mails related to a given class.

POP-NOT: It is a long tradition in mailing lists to divide discussions into threads. Our hypothesis is that all the messages forming a thread discuss the same topic: If an author wants to start talking about a different subject, she creates a new thread. We suppose that if developers are talking about one defect in a class, they will continue talking about it in the same thread; if they want to discuss an unrelated or new defect (even in the same class), they open a new thread. The number of threads, then, could be a popularity metric whose value is related to the number of defects. After extracting e-mails from mailing lists, our e-mail model also contains the information about threads. Once the related e-mails are available in the Mevo model, we retrieve this thread information from the messages related to each class and count the number of different threads. If two or more e-mails related to the same class are part of the same thread, they are counted as one.

POP-NOMT: Inspecting sample e-mails from our dataset, we noticed that short threads are often characteristic of (1) "announcement" e-mails, (2) simple e-mails about technical issues experienced by new users of the systems, or (3) updates about the status of development. We hypothesize that longer threads could be symptoms of discussions that raise the interest of developers, such as those about defects or changes in the code. For each FAMIX class, we consider the threads of all the referring e-mails, and we count the total number of e-mails in each thread. If a thread is composed of more than one e-mail, but only one refers to the class, we still count all the e-mails inside the thread, since it is possible that subsequent e-mails reference the same class implicitly.

POP-NOA: A high number of authors talking about the same class suggests that it is subject to broad discussions. For example, a class frequently mentioned by different users can hide design flaws or stability problems. Also, a class discussed by many developers might not be well defined, comprehensible, or correct, and thus more defect prone. For each class, we count the number of different authors who wrote referring e-mails (i.e., if the same author wrote two or more e-mails, we count only one).
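Given the class-to-e-mail links, the five metrics reduce to simple aggregations over the e-mails linked to a class. A sketch follows, assuming each linked e-mail exposes its author, body, thread identifier, and the total size of its thread; the field names are hypothetical, not the Mevo model's actual schema.

```python
from dataclasses import dataclass

@dataclass
class Email:
    author: str
    body: str
    thread_id: int
    thread_size: int  # total number of e-mails in the thread

def popularity_metrics(linked_emails):
    """Compute the five class-level popularity metrics from the
    e-mails linked to one class (field names are assumptions)."""
    threads = {e.thread_id for e in linked_emails}
    return {
        "POP-NOM": len(linked_emails),
        "POP-NOCM": sum(len(e.body) for e in linked_emails),
        "POP-NOT": len(threads),
        # count each thread's full length once, even if only one
        # of its e-mails mentions the class
        "POP-NOMT": sum(
            next(e.thread_size for e in linked_emails if e.thread_id == t)
            for t in threads
        ),
        "POP-NOA": len({e.author for e in linked_emails}),
    }
```

Note how POP-NOT counts a thread once regardless of how many linked e-mails it contains, while POP-NOMT counts the whole thread's length, matching the definitions above.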

8.2 Experiments

We conducted our experiments on the software systems listed in Table 8.2. We considered systems that deal with different domains and have distinct characteristics (e.g., popularity, number of classes, e-mails, and defects) to mitigate some of the threats to external validity. These systems are stable projects, under active development, and have a history with several major releases. All are written in Java, ensuring that all the code metrics are defined identically for each system. Public development mailing lists used to discuss technical issues are available for all the systems, and are separate from lists specifically intended for user issues. We consider e-mails from the creation of each mailing list until September 2009. Messages automatically generated by bug tracking and revision control systems are filtered out, and we report the resulting number of e-mails and the number of those referring to classes according to our linking

Table 8.2. Dataset

System      Description                                   #Classes  Mailing list  #E-mails          Bug data
                                                                    creation      Total    Linked   Time period        #Bugs   URL

Equinox     Plugin system for the Eclipse project         439       Feb '03       5,575    2,383    Feb '03 - Jun '08  1,554   eclipse.org/equinox
Jackrabbit  Implementation of the Content Repository      1,913     Sep '04       11,901   3,358    Sep '04 - Aug '09  975     jackrabbit.apache.org
            for Java Technology API
Lucene      Text search engine library                    1,279     Sep '01       17,537   8,800    Oct '01 - Sep '09  1,274   lucene.apache.org
Maven       Tool for build management of Java projects    301       Nov '02       65,601   4,616    Apr '04 - Aug '09  616     maven.apache.org

techniques. All systems have public bug tracking systems that were usually created along with the mailing lists.

8.2.1 Correlations Analysis

To answer the question Q1, "Does the popularity of software components correlate with software defects?", we compute the correlation between class-level popularity metrics and the number of defects per class. We compute the correlation in terms of both the Pearson's and the Spearman's correlation coefficient (r_prs and r_spm, respectively). Unlike Pearson's correlation, Spearman's is less sensitive to bias due to outliers and requires neither metrically scaled data nor normality assumptions [Tri06]. Including the Pearson's correlation coefficient augments the understanding of the results: If r_spm is higher than r_prs, we might conclude that the variables are consistently correlated, but not in a linear fashion. If the two coefficients are very similar and different from zero, there is indication of a linear relationship. Finally, if the r_prs value is significantly higher than r_spm, we can deduce that there are outliers in the dataset. This information first helps us to discover threats to construct validity, then highlights single elements that are heavily related. For example, a high r_prs can indicate that, among the classes with the highest number of bugs, we also find the classes with the highest number of related e-mails. We also compute the correlation between class-level source code metrics and the number of defects per class, to compare against a broadly used baseline. We only show the correlation for the source code metric LOC, as previous research showed that it is one of the best metrics for defect prediction [GFS05; OW02; OWB04; OWB07]. Table 8.3 shows the correlation coefficients between the different popularity metrics and the number of bugs of each system.

We put in bold the highest values achieved for both r_spm and r_prs, by system. The results provide evidence that the two metrics are rank correlated; correlations over 0.4 are considered to be strong in fault prediction studies [ZN08]. The r_spm values in our study exceed this value for three systems, i.e., Equinox, Lucene, and Maven. In the case of Jackrabbit, the maximum coefficient

Table 8.3. Spearman’s and Pearson’s correlation coefficients between number of defects per class and class level popularity metrics (and LOC)

            POP-NOM        POP-NOCM       POP-NOT        POP-NOTM       POP-NOA        LOC
System      r_spm  r_prs   r_spm  r_prs   r_spm  r_prs   r_spm  r_prs   r_spm  r_prs   r_spm  r_prs
Equinox     .52    .51     .52    .42     .53    .54     .52    .48     .53    .50     .73    .80
Jackrabbit  .23    .35     .22    .36     .24    .36     .23    -.02    .23    .34     .27    .54
Lucene      .41    .63     .38    .57     .41    .57     .42    .68     .41    .54     .17    .38
Maven       .44    .81     .39    .78     .46    .78     .44    .81     .45    .78     .55    .78

is 0.24, which is similar to the value reached using LOC. The best performing popularity metric depends on the software system: For example in Lucene, POP-NOTM, which counts the length of threads containing e-mails about the classes, is the best choice, while POP-NOT, number of threads containing at least one e-mail about the classes, is the best performing for other systems.
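The diagnostic reading of the two coefficients described above can be sketched as follows. The coefficient implementations are standard; the 0.1 gap used to compare them is an assumed threshold chosen for illustration, not a value from the dissertation.

```python
def pearson(x, y):
    """Pearson's product-moment correlation (r_prs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def ranks(v):
    """Ranks of the values, averaging over ties."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rank correlation (r_spm): Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))

def diagnose(metric, defects, gap=0.1):
    """Heuristic reading of the two coefficients (assumed gap)."""
    r_prs, r_spm = pearson(metric, defects), spearman(metric, defects)
    if r_spm > r_prs + gap:
        return "consistent but non-linear relationship"
    if r_prs > r_spm + gap:
        return "outliers likely dominate"
    return "roughly linear relationship"
```

On a monotonic but strongly non-linear relationship, r_spm reaches 1 while r_prs stays visibly lower, which is the pattern the text uses to rule out linearity.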

8.2.2 Defect Prediction

To answer the research question Q2, "Is a regression model based on the popularity metrics a good predictor for software defects?", we create and evaluate regression models in which the independent variables are the class-level popularity metrics, while the dependent variable is the number of post-release defects per class. We also create regression models based on source code metrics and change metrics alone, as well as models in which these metrics are enriched with popularity metrics, where the dependent variable is always the number of post-release defects per class. We then compare the prediction performances of such models to answer research question Q3: "Does the addition of popularity metrics improve the prediction performance of existing defect prediction techniques?" We follow the same methodology used in Chapter 7 and detailed in Section 7.2.2, consisting of: Principal component analysis, building regression models, performing 50-fold cross-validation, and evaluating explanative (adjusted R²) and predictive power (r_spm).
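The evaluation pipeline can be sketched in NumPy under simplifying assumptions: plain least-squares regression and repeated random 90%/10% splits stand in for the exact setup detailed in Section 7.2.2.

```python
import numpy as np

def pca_reduce(X, var_threshold=0.95):
    """Project the metrics onto the principal components that
    account for var_threshold of the variance, to avoid
    multicollinearity among predictors."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = np.cumsum(S ** 2) / np.sum(S ** 2)
    k = int(np.searchsorted(explained, var_threshold)) + 1
    return Xc @ Vt[:k].T

def ols_fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept term."""
    A = np.c_[np.ones(len(X_train)), X_train]
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.c_[np.ones(len(X_test)), X_test] @ coef

def adjusted_r2(y, y_hat, n_predictors):
    """Explanative power: R^2 penalized by model size."""
    ss_res = float(np.sum((y - y_hat) ** 2))
    ss_tot = float(np.sum((y - np.mean(y)) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    n = len(y)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

def cross_validate(X, y, folds=50, test_fraction=0.1, seed=0):
    """Repeated random-split validation: train on 90% of the
    classes, predict post-release defects on the held-out 10%."""
    rng = np.random.default_rng(seed)
    preds, actuals = [], []
    for _ in range(folds):
        idx = rng.permutation(len(y))
        n_test = max(1, int(test_fraction * len(y)))
        test, train = idx[:n_test], idx[n_test:]
        preds.append(ols_fit_predict(X[train], y[train], X[test]))
        actuals.append(y[test])
    return np.concatenate(preds), np.concatenate(actuals)
```

Predictive power is then the Spearman correlation between the concatenated predicted and actual defect counts, while explanative power is the adjusted R² of a model fit on the full data.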

Results

Tables 8.4 and 8.5 display the results we obtained for defect prediction, in terms of adjusted R² values and Spearman's correlation coefficients, respectively. The first row shows the results achieved using all the popularity metrics defined in Section 8.1. In the following four blocks, we report the prediction results obtained through the source code and change metrics, first alone, then incorporating each single popularity metric, and finally incorporating all the popularity metrics. For each system and block of metrics, when popularity metrics augment the results of the other metrics, we put in bold the highest value reached.

Analyzing the results of the popularity metrics alone, we notice that, in terms of correlation, Equinox and Maven still present a strong correlation, i.e., higher than .40, while Lucene is less correlated. The popularity metrics alone are not sufficient for performing predictions in the Jackrabbit system. Looking at the results obtained using the other metrics, we first note that Jackrabbit's results are much lower than those reached in the other systems, especially for the R², and partly for the r_spm. Only the r_spm values reached with change metrics reach good results in this system.

Table 8.4. Defect prediction results: Adjusted R2

                              Adjusted R²
Metrics                       Equinox  Jackrabbit  Lucene  Maven   Avg
All popularity metrics        .23      .00         .31     .55     .27
All change metrics (MOSER)    .55      .06         .43     .71     .44
MOSER + POP-NOM               .56      .06         .43     .71     .44
MOSER + POP-NOCM              .58      .06         .43     .70     .44
MOSER + POP-NOT               .56      .06         .43     .71     .44
MOSER + POP-NOMT              .56      .06         .43     .70     .44
MOSER + POP-NOA               .56      .06         .43     .70     .44
MOSER + All POP               .61      .06         .45     .71     .46
Improvement                   +11%     0%          +5%     0%      +4%
OO metrics                    .61      .03         .27     .42     .33
OO + POP-NOM                  .62      .03         .33     .59     .39
OO + POP-NOCM                 .62      .04         .32     .56     .38
OO + POP-NOT                  .61      .03         .31     .57     .38
OO + POP-NOMT                 .62      .03         .35     .60     .40
OO + POP-NOA                  .61      .04         .30     .56     .38
OO + All POP                  .62      .03         .37     .61     .41
Improvement                   +2%      +25%        +37%    +45%    +27%
CK metrics                    .54      .01         .39     .28     .31
CK + POP-NOM                  .56      .02         .40     .54     .38
CK + POP-NOCM                 .57      .02         .40     .50     .37
CK + POP-NOT                  .56      .01         .40     .51     .37
CK + POP-NOMT                 .57      .01         .40     .56     .39
CK + POP-NOA                  .56      .02         .40     .51     .37
CK + All POP                  .57      .02         .42     .58     .40
Improvement                   +6%      +50%        +8%     +107%   +43%
CK + OO metrics               .66      .04         .44     .45     .40
CK + OO + POP-NOM             .67      .04         .45     .60     .44
CK + OO + POP-NOCM            .66      .04         .45     .56     .43
CK + OO + POP-NOT             .66      .04         .44     .57     .43
CK + OO + POP-NOMT            .67      .04         .44     .62     .44
CK + OO + POP-NOA             .66      .04         .44     .57     .43
CK + OO + All POP             .67      .04         .46     .63     .45
Improvement                   +2%      0%          +5%     +40%    +12%

Going back to the other systems, the adjusted R² values always increase, and the best results are achieved when using all popularity metrics together. The increase with respect to the other metrics varies from 2%, when the other metrics already reach high values, up to 107%. Spearman's coefficients also increase with the information given by popularity metrics: Their values augment, on average, by more than fifteen percent. However, there is no single popularity metric that outperforms the others, and their union does not give the best results.

Table 8.5. Defect prediction results: Spearman’s correlation coefficient

                              Spearman's correlation r_spm
Metrics                       Equinox  Jackrabbit  Lucene  Maven   Avg
All popularity metrics        .43      .04         .27     .52     .32
All change metrics (MOSER)    .54      .30         .36     .62     .45
MOSER + POP-NOM               .53      .32         .38     .69     .48
MOSER + POP-NOCM              .57      .31         .43     .60     .48
MOSER + POP-NOT               .54      .31         .39     .59     .46
MOSER + POP-NOMT              .53      .29         .41     .60     .46
MOSER + POP-NOA               .58      .29         .37     .43     .42
MOSER + All POP               .52      .30         .38     .43     .41
Improvement                   +7%      +7%         +19%    +11%    +11%
OO metrics                    .51      .17         .31     .52     .38
OO + POP-NOM                  .53      .14         .35     .52     .38
OO + POP-NOCM                 .51      .15         .36     .60     .41
OO + POP-NOT                  .49      .15         .38     .52     .38
OO + POP-NOMT                 .55      .14         .33     .43     .36
OO + POP-NOA                  .53      .12         .38     .70     .43
OO + All POP                  .58      .14         .32     .52     .39
Improvement                   +14%     -12%        +23%    +35%    +15%
CK metrics                    .51      .13         .36     .60     .40
CK + POP-NOM                  .48      .13         .35     .69     .41
CK + POP-NOCM                 .50      .17         .33     .42     .35
CK + POP-NOT                  .53      .13         .34     .52     .38
CK + POP-NOMT                 .52      .14         .25     .49     .35
CK + POP-NOA                  .52      .14         .41     .53     .40
CK + All POP                  .51      .16         .30     .52     .37
Improvement                   +4%      +31%        +14%    +15%    +16%
CK + OO metrics               .48      .15         .35     .36     .33
CK + OO + POP-NOM             .59      .15         .34     .62     .43
CK + OO + POP-NOCM            .51      .16         .30     .31     .32
CK + OO + POP-NOT             .50      .14         .35     .52     .38
CK + OO + POP-NOMT            .53      .14         .35     .34     .34
CK + OO + POP-NOA             .51      .15         .34     .43     .36
CK + OO + All POP             .51      .16         .33     .52     .38
Improvement                   +23%     +7%         +0%     +72%    +26%

8.3 Discussion

Based on the results presented above, we answer our three research questions.

Q1: Does the popularity of software components in discussions correlate with software defects? Three software systems out of four show a strong rank correlation, i.e., coefficients ranging from .42 to .53, between the defects of software components and their popularity in e-mail discussions. Only Jackrabbit is less rank correlated, with a coefficient of .23.

The popularity of software components does correlate with software defects.

Q2: Is a regression model based on popularity metrics a good predictor for software defects? In the second part of our results, consistently with the correlation analysis, the quality of predictions on Jackrabbit using popularity metrics is extremely low, both for the adjusted R² values and for the Spearman's correlation coefficients. On the contrary, our popularity metrics applied to the other three systems lead to different results: Popularity metrics are able to predict defects. However, if used alone, they do not compete with the results obtained through other metrics. The best average results are shown by the change metrics, corroborating previous research on the quality of such predictors [MPS08; BEP07].

Popularity can predict software defects, but without major improvements over previously established techniques.

Q3: Does the addition of popularity metrics improve the prediction performance of existing defect prediction techniques? We obtained the best results by integrating the popularity information into other techniques. This creates more reliable and complete predictors that significantly improve the overall results: The correlation coefficients are, on average, more than fifteen percent higher than those obtained without popularity metrics, with peaks over 30% and a top value of 72%. This corroborates our initial assumption that popularity metrics measure an aspect of the development process that is different from those captured by other techniques. The results also put in evidence that, given the considerable difference in prediction performance across software projects, bug prediction techniques that exploit popularity metrics should not be applied in a "black box" way. As suggested by Nagappan et al. [NBZ06], the prediction approach should first be validated on the history of a software project, to see which metrics work best for prediction.
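As an illustration of how such a comparison can be set up, the sketch below fits an ordinary least-squares model once on a base metric set and once on the base set extended with popularity metrics, and compares the explanative power (R2). All data and variable names are synthetic stand-ins, not the thesis dataset or its actual MOSER/OO/CK metric suites.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200  # synthetic "classes" (illustrative only)

# Hypothetical metric families: three change metrics, two popularity metrics.
change_metrics = rng.poisson(5, size=(n, 3)).astype(float)
popularity = rng.poisson(2, size=(n, 2)).astype(float)
# Defects driven partly by both families, plus noise.
defects = (0.4 * change_metrics.sum(axis=1)
           + 0.6 * popularity.sum(axis=1)
           + rng.normal(0, 2, n))

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    return 1 - residuals.var() / y.var()

r2_base = r_squared(change_metrics, defects)
r2_combined = r_squared(np.column_stack([change_metrics, popularity]), defects)
# The combined model never explains less variance on the same data;
# whether the gain is meaningful is what the analysis above assesses.
```

Since the base model is nested in the combined one, the combined R2 can only stay equal or grow on the same data; the interesting question, addressed above, is whether the increase is substantial.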

Popularity metrics do improve prediction performances of existing defect prediction techniques.

8.4 Threats to validity

Threats to construct validity. We already considered some of these threats in Section 3.2.3, when discussing the limitations of populating Mevo models. Another threat concerns the procedure for linking e-mails to the discussed classes. We use linking techniques whose effectiveness was measured [BDLR09], and which are known not to produce a perfect linking: The enriched Mevo model can contain wrongly reported links or miss existing connections. We alleviated this problem by manually inspecting all the classes that showed an exceptional number of links (i.e., outliers) and, whenever necessary, adjusting the regular expressions composing the linking techniques to correctly handle such unexpected situations. We also removed from our dataset any e-mail automatically generated by the bug tracking system or the revision control system, because such e-mails could bias the results.
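A minimal sketch of the idea behind regex-based linking; the class names, e-mail body, and pattern are hypothetical simplifications of the actual techniques evaluated in [BDLR09]:

```python
import re

# Hypothetical class names and e-mail body.
class_names = ["ChangeCoupling", "BugReport", "Parser"]

email_body = """The ChangeCoupling computation fails when Parser
returns an empty transaction list. See ChangeCoupling.java."""

def link_email_to_classes(body, names):
    """Return the classes whose name occurs in the body as a whole word
    (this also matches occurrences followed by a file extension)."""
    linked = set()
    for name in names:
        if re.search(r"\b" + re.escape(name) + r"\b", body):
            linked.add(name)
    return linked
```

Classes with an exceptional number of links (e.g., names that are also common English words) would then be inspected manually and the pattern refined, as described above.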

Threats to statistical conclusion validity. In our experiments all the Spearman’s correlation coefficients and all the regression models were significant at the 99% level.

Threats to external validity. We analyzed only open-source software projects; development in an industrial environment may differ and lead to different behaviors among developers, and thus to different results. Another external validity threat concerns the language:

All the analyzed software systems are developed in Java. Although this alleviates parsing bias [KSS02], communities using other languages could have different developer cultures, and the style of e-mails can vary.

8.5 Related work

We already surveyed related work in the area of defect prediction in Section 2.5.2. Here we present previous approaches that mine data from e-mail archives.

Li et al. first introduced the idea of using the information stored in mailing lists as an additional predictor for finding defects in software systems [LHS05]. They conducted a case study on a single system, used a number of previously known predictors, and defined new mailing list predictors. Such predictors mainly counted the number of messages to different mailing lists during the development of software releases. One predictor, called TechMailing and based on the number of messages to the technical mailing list during development, was found to be the most highly rank correlated with the number of defects among all the predictors evaluated. Our work differs in the type and granularity of the defects we predict: We focus on defects in small source code units that can be easily reviewed, analyzed, and improved. Moreover, unlike us, Li et al. did not remove the noise from the mailing lists by focusing only on source-code-related messages.

Pattison et al. were the first to introduce the idea of studying software entity (function, class, etc.) names in e-mails [PBD08]. They used a linking technique based on simple name matching, and found a high correlation between the amount of discussion about entities and the number of changes in the source code. However, Pattison et al. did not validate the quality of their links between e-mails and source code. Our work was the first to measure the effectiveness of linking techniques for e-mails and source code [BDLR09]. To our knowledge, this research is the first work that uses information from development mailing lists at class granularity to predict, and to find correlations with, source code defects.
Other works also analyzed development mailing lists, but extracted different kinds of information: social structures [BGD+06], developer participation [MFH02], inter-project migration [BGD+07], and emotional content [RH07].

8.6 Summary

We extended our Mevo meta-model to describe e-mail information. Based on the extended meta-model, we devised a novel approach to correlate the popularity of source code artifacts within e-mail archives to software defects. We also investigated whether such metrics could be used to predict post-release defects. We showed that, while there is a significant correlation, popularity metrics by themselves do not outperform source code and change metrics in terms of prediction power. However, we demonstrated that, in conjunction with source code and change metrics, popularity metrics increase both the explanative and predictive power of existing defect prediction techniques.

The focus of this chapter was popularity metrics extracted from e-mail discussions: We investigated whether they correlate with software defects and whether they can improve previous defect prediction techniques. In the following chapter, we conduct a similar analysis, but concentrating on a different type of information: We study the relationship between change coupling and software defects.

Chapter 9

On the Relationship Between Change Coupling and Software Defects

In Chapter 4 we presented the Evolution Radar, a visual approach to analyze change coupling information. Change coupling is the implicit and evolutionary dependency between two, or more, software artifacts that, although potentially not structurally related, evolve together and are therefore linked to each other from an evolutionary point of view. Through two case studies, we showed that change coupling information can be used to pinpoint architectural design issues in a software system. Other researchers have also observed that change coupling is a bad symptom in a software system: At a fine-grained level, because a developer who changes an entity might forget to change related entities [ZWDZ05; BZ06], or—at the system level—because high change coupling among modules points to design issues such as architecture decay [GHJ98; GJK03; PGFL05]. All the mentioned approaches assume that change coupling is, indeed, a cause of issues in a software system. However, the relationship between change coupling and a tangible effect of software issues has not yet been studied. To perform such an investigation, one needs an objective quantification of the issues that affect a software system and its components. A defect repository, which records all the known issues about a software system, provides such a quantification. Eaddy et al. performed a similar study linking cross-cutting concerns with defects [EZS+08], but the specific case of change coupling remains unaddressed. We define several measures of change coupling and analyze their correlations with software defects, using as a baseline the correlation between source code metrics and software defects. We provide empirical evidence, through three case studies, that the defined change coupling metrics correlate with defects, and we investigate the relationships of such metrics with defects based on the severity of the reported bugs.
Finally, we study whether the performance of defect prediction models, based on source code metrics, can be improved with change coupling information.

Structure of the chapter. We define change coupling measures in Section 9.1 and describe our dataset in Section 9.2. We analyze the correlation of the defined coupling measures with software defects in Section 9.3. In Section 9.4 we discuss how to enrich defect prediction models with change coupling information and which improvements they provide. We address the threats to validity in Section 9.5 and conclude in Section 9.6.


9.1 Measuring Change Coupling

The Mevo meta-model includes all the pieces of information needed to perform our study: Versioning system data to extract change coupling dependencies, source code information to compute code metrics, and bug data to count the number of defects per software entity. However, to quantify the correlation of change coupling with software defects, we first need to measure change coupling, i.e., the implicit dependency between software artifacts that have been observed to frequently change together during the evolution of a software system. The more often they changed together, the stronger the change coupling dependency. Still, there is no consensus on the formal definition of change coupling, and several alternative measures exist. We formally define four measures of change coupling that emphasize different aspects. We need change coupling measures that are defined for each class in the system: We carry out our analysis at the class level because classes are a cornerstone of the object-oriented paradigm, and we want to be able to compare change coupling with object-oriented metrics. The measures we define concern the coupling of a class with the entire system. An alternative is a measure of change coupling for each pair of entities in the system; however, since bugs are often mapped to one entity only, we opted for a coupling measure involving a single entity, which can be obtained by aggregating the pairwise coupling measures. In the following definitions we use the concept of n-coupled classes: We define two classes as n-coupled if there are at least n transactions that include both classes. As a consequence, all our change coupling measures are functions of n. Given two classes c1 and c2, we consider them n-coupled if the following condition holds:

|{t ∈ T | c1 ∈ t ∧ c2 ∈ t}| ≥ n    (9.1)

where T is the set of all the transactions. Given a class c, we define the set of coupled classes (SCC) as:

SCC(c, n) = {ci | ci ≠ c ∧ c is n-coupled with ci}    (9.2)

Figure 9.1 shows an example scenario with five classes and six transactions. In this case, SCC(c2, 3) = {c1, c5}, SCC(c2, 4) = {c1, c5} and SCC(c2, 5) = {c1}.

Number of Coupled Classes (NOCC)

The first per-class measure of change coupling is the number of classes n-coupled with a given class c. This measure emphasizes the raw number of classes with which a given class is coupled. We define NOCC as:

NOCC(c, n) = |SCC(c, n)|    (9.3)

The NOCC measure is the cardinality of the set of coupled classes. In the example in Figure 9.1, NOCC(c2, 3) = 2.

Sum of Coupling (SOC)

The sum of coupling is the sum of the shared transactions between a given class c and all the classes n-coupled with c. We define SOC as:

SOC(c, n) = Σ_{ci ∈ SCC(c,n)} |{t ∈ T | ci ∈ t ∧ c ∈ t}|    (9.4)

The SOC measure is the sum of the cardinalities of the sets of transactions that include the class c and the classes n-coupled with c. Compared to NOCC, SOC also takes into account the strength of the couplings. In Figure 9.1, SOC(c2, 3) = 4 + 5 = 9.


Figure 9.1. Sample scenario of classes and transactions

Exponentially Weighted Sum of Coupling (EWSOC)

EWSOC is a variation of SOC in which the shared transactions are exponentially weighted according to their distance in time: Recent changes are emphasized over past changes, following an exponential decay model. We define EWSOC as:

EWSOC(c, n) = Σ_{ci ∈ SCC(c,n)} EWC(ci, c), where    (9.5)

EWC(ci, c) = Σ_{tk ∈ T(c)} w_k, with w_k = 1 / 2^(|T(c)| − k) if ci ∈ tk, and w_k = 0 if ci ∉ tk    (9.6)

T(c) is the set of all the transactions, sorted by time, that include the class c. Figure 9.2 shows an example of EWSOC computation for the class c2 with n = 3. In this case, T(c2) = {t1, t3, t4, t5, t6} and |T(c2)| = 5 (t2 is not included in the computation since c2 is absent from it), and therefore:

EWSOC(c2, 3) = EWC(c2, c1) = 1/2^(5−5) + 1/2^(5−4) + 1/2^(5−3) + 0 + 1/2^(5−1)

Linearly Weighted Sum of Coupling (LWSOC)

The last per-class measure of change coupling is another variation of SOC, in which the shared transactions are linearly weighted according to their distance in time. Like EWSOC, LWSOC emphasizes recent changes, but following a linear decay model, and thus penalizing past changes less than EWSOC. We define LWSOC as:

LWSOC(c, n) = Σ_{ci ∈ SCC(c,n)} LWC(ci, c), where    (9.7)

LWC(ci, c) = Σ_{tk ∈ T(c)} w_k, with w_k = 1 / (|T(c)| + 1 − k) if ci ∈ tk, and w_k = 0 if ci ∉ tk    (9.8)

In Figure 9.2, LWSOC(c2, 3) is equal to LWC(c2, c1):

LWSOC(c2, 3) = LWC(c2, c1) = 1/(6−5) + 1/(6−4) + 1/(6−3) + 0 + 1/(6−1)

[Figure 9.2 depicts the transactions t1–t6 on a timeline up to the current release, with the index k = 1 … 5 over the transactions of c2, the exponential weights 1/2^4, 1/2^2, 1/2^1, 1/2^0, and the linear weights 1/5, 1/3, 1/2, 1/1 of the transactions shared with c1.]

Figure 9.2. Example EWSOC and LWSOC computations

Common Behaviors and Differences

All the measures are defined on a class-by-class level, and aggregated to obtain a measure of the coupling of one class with the entire system. All the defined metrics decrease as n increases, since the set of coupled classes shrinks for larger values of n. We compute the coupling measures by means of our Pendolino tool. Beyond these commonalities, the four measures emphasize different aspects of change coupling: NOCC measures only the number of co-change relationships of a class with all the other classes that exceed the threshold n. SOC, on the other hand, also takes into account the magnitude of each coupling relationship beyond the threshold, so that a pair of classes changing extremely often together is weighted differently. EWSOC and LWSOC function similarly, but consider the recency of the co-change relationships. The rationale is that two classes may have been co-changed heavily in the past, but then have been refactored to no longer depend on each other: Their past behavior should not affect their current coupling value. EWSOC discounts the past more quickly than LWSOC does.
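The four measures can be sketched as follows. The transaction list is a hypothetical one, chosen to be consistent with the values reported for the Figure 9.1 scenario (c1 shares 5 transactions with c2, c5 shares 4, c3 and c4 fewer than 3):

```python
# Hypothetical transactions (chronological), consistent with Figure 9.1.
transactions = [
    {"c1", "c2", "c5"},  # t1
    {"c3", "c4"},        # t2 (c2 absent)
    {"c1", "c2", "c5"},  # t3
    {"c1", "c2", "c5"},  # t4
    {"c1", "c2", "c5"},  # t5
    {"c1", "c2"},        # t6
]
classes = {"c1", "c2", "c3", "c4", "c5"}

def shared(c, ci):
    """Number of transactions including both classes."""
    return sum(1 for t in transactions if c in t and ci in t)

def scc(c, n):
    """Set of classes n-coupled with c (Eq. 9.2)."""
    return {ci for ci in classes if ci != c and shared(c, ci) >= n}

def nocc(c, n):  # Eq. 9.3
    return len(scc(c, n))

def soc(c, n):   # Eq. 9.4
    return sum(shared(c, ci) for ci in scc(c, n))

def ewsoc(c, n):  # Eqs. 9.5-9.6: exponential decay of past couplings
    t_c = [t for t in transactions if c in t]  # T(c), sorted by time
    return sum(1 / 2 ** (len(t_c) - k)
               for ci in scc(c, n)
               for k, t in enumerate(t_c, start=1) if ci in t)

def lwsoc(c, n):  # Eqs. 9.7-9.8: linear decay of past couplings
    t_c = [t for t in transactions if c in t]
    return sum(1 / (len(t_c) + 1 - k)
               for ci in scc(c, n)
               for k, t in enumerate(t_c, start=1) if ci in t)

# As in the text: NOCC(c2, 3) = 2 and SOC(c2, 3) = 4 + 5 = 9.
```

On this data the decayed variants yield EWSOC(c2, 3) = 2.875 and LWSOC(c2, 3) ≈ 3.57, illustrating how EWSOC discounts older shared transactions more aggressively than LWSOC.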

9.2 Case Studies

To study the relationship between change coupling and software defects we analyzed three large Java software systems: ArgoUML, Eclipse JDT Core, and Mylyn. Table 9.1 shows their size in terms of classes and transactions.

Table 9.1. Measures of the studied software systems

System            Description                            #Classes  #Transactions  #Transactions     #Shared transactions  URL
                                                                                  per class (avg)   per class (avg)
ArgoUML           UML modeling tool                         2,197         15,257            14.30                  0.37  http://argouml.tigris.org
Eclipse JDT Core  Java infrastructure of the Java IDE       1,193         13,186            68.00                  5.30  www.eclipse.org/jdt/core/
Mylyn             Task management framework for Eclipse     3,050          9,373            11.70                  0.39  www.eclipse.org/mylyn/

In computing the change coupling measures, one problem that we encountered concerns large commits, i.e., transactions involving a large number of artifacts, typically license updates. These transactions involve totally unrelated artifacts, and thus might distort change coupling measures. We already discussed this problem in Section 3.2.3, when describing how we populate Mevo models, concluding that the solution is to filter out large commits. In our experiments we filtered out all transactions involving more than 100 classes: 86 for ArgoUML (0.6%), 59 for Eclipse JDT Core (0.4%), and 102 for Mylyn (1.1%). We manually inspected the commit comments of these transactions; the vast majority concerned license changes, Javadoc and documentation updates.
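The filtering step can be sketched as follows; the threshold of 100 classes is the one used in this chapter, while the transaction data is made up:

```python
def filter_large_commits(transactions, threshold=100):
    """Drop transactions touching more than `threshold` classes, which
    typically reflect license or documentation sweeps rather than
    logical coupling."""
    kept = [t for t in transactions if len(t) <= threshold]
    return kept, len(transactions) - len(kept)

# Made-up example: three small commits and one sweep over 250 classes.
sample = [set(f"Class{i}" for i in range(k)) for k in (3, 5, 250, 7)]
kept, removed = filter_large_commits(sample)
# → removed == 1 (only the 250-class sweep is discarded)
```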

9.3 Correlation Analysis

The goal of the correlation analysis is to answer the following questions:

1. Does change coupling correlate with software defects? If so, which change coupling mea- sure correlates best?

2. Does change coupling correlate more with severe defects than with minor ones?

We compute the values of the correlation between the number of defects per class (or the number of defects with a given severity) and the various measures of change coupling. We quantify the correlation in terms of the Spearman's correlation coefficient, which is less sensitive to bias due to outliers and is recommended for skewed data. In computing the correlation we consider all the defects in the history of the systems. To compare against a broadly used baseline, we also compute the correlation between the number of defects per class and the following metrics: the Chidamber & Kemerer object-oriented metrics suite (listed in Table 7.3), a selection of other object-oriented metrics (NOA: Number Of Attributes, NOM: Number Of Methods, Fan in, Fan out, LOC: Lines Of Code), and the number of changes to a class (Changes).
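The correlation computation itself amounts to Spearman's rank correlation between two per-class vectors. A self-contained version is sketched below (in practice a statistics library routine such as scipy.stats.spearmanr would be used; the per-class numbers are made up):

```python
def ranks(values):
    """1-based average ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's coefficient = Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Made-up per-class data: number of defects and a coupling measure.
defects = [0, 1, 0, 3, 5, 2, 8, 1, 0, 4]
coupling = [1, 2, 0, 4, 9, 3, 7, 2, 1, 6]
rho = spearman(defects, coupling)  # close to 1: strong rank correlation
```

Because it operates on ranks, the coefficient is robust to the heavily skewed defect distributions typical of the systems analyzed here.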

Figure 9.3. Correlations between number of defects per class and class level change coupling measures, number of changes, and the best object-oriented metric (i.e., Fan out for Eclipse and ArgoUML, and CBO for Mylyn). The correlations are measured with the Spearman’s coefficient.

9.3.1 Results

Figure 9.3 and Figure 9.4 show the Spearman's correlation of the number of defects with the metrics we tested across the three case studies. Figure 9.3 displays the correlation considering all bugs, while Figure 9.4 shows it for selected categories of bugs, according to their labels in the bug tracking system (major bugs and high priority bugs). All the graphs follow the same format: The Spearman's correlation is indicated on the y axis, while the x axis indicates the threshold used for the computation of the change coupling metrics (i.e., the value of n used as a basis to compute n-coupled classes). For example, all the change coupling measures at the x position of 3 are computed using the set of 3-coupled classes. Metrics that do not depend on this threshold (such as Changes, Fan out, or CBO) are hence flat lines. The metrics on each graph are the four coupling measures (NOCC, SOC, EWSOC, LWSOC), the number of changes metric, and the best performing of the object-oriented metrics for each project and each bug category.

(a) Major bugs

(b) Bugs with high priority

Figure 9.4. Spearman’s correlations between number of major/high priority bugs and change coupling measures

Correlation with all types of bugs. Figure 9.3 shows the correlation of the metrics with all types of bugs, for all systems. The best performing object-oriented metrics are Fan out for Eclipse and ArgoUML, and CBO for Mylyn. In all the software systems change coupling indeed correlates with the number of defects, since the Spearman's coefficient reaches values above 0.5, especially for Eclipse, where the maximum is above 0.8. The SOC measure is the best for ArgoUML and Mylyn, and the second best for Eclipse. All the coupling measures decrease after a certain value of n: 3 for ArgoUML and Mylyn, 10 for Eclipse. EWSOC and LWSOC do not correlate for low values of n, while they are comparable with NOCC and SOC for n ≥ 3 in ArgoUML and Mylyn, and n ≥ 10 for Eclipse.

Correlation with major bugs. Figure 9.4(a) shows the Spearman's correlation between the number of major bugs and the change coupling measures. We consider a bug as major if its severity is major, critical or blocker. We also show the correlations with the number of changes and the best object-oriented metric: Fan out for Eclipse and LOC for Mylyn. For Eclipse, with 3 ≤ n ≤ 20, NOCC and SOC are very close to the number of changes (about 0.7). EWSOC and LWSOC perform badly with n < 10, while starting from 10 they are above 0.6. In the case of Mylyn the correlations are lower, with a maximum of about 0.4. For n = 5 all the change coupling measures are at their maximum and above the number of changes, and for n > 8 they rapidly decrease. We do not show the result for ArgoUML because the number of major bugs is not large enough to obtain significant correlations.

Correlation with high priority bugs. Figure 9.4(b) shows the Spearman's correlation for the number of high priority bugs, i.e., bugs having priorities above the default value (P3). This time, the best object-oriented metrics are Fan out for Eclipse and CBO for ArgoUML. For this particular type of bugs, the correlations are weaker, with a maximum around 0.55 for Eclipse and 0.45 for ArgoUML. The change coupling measures are often better than the number of changes. In the case of Eclipse, NOCC is always better, SOC is better for n ≥ 5 and LWSOC for n ≥ 15, while EWSOC is always worse. For ArgoUML all the change coupling measures have a maximum for n = 8, which is greater than the correlation of the number of changes. After that, for n > 8, the correlations rapidly decrease. We do not show the result for Mylyn because the number of high priority bugs is not large enough to obtain significant correlations.

9.3.2 Discussion

Based on the data presented in Figure 9.3 and Figure 9.4, we derive the following insights.

Change coupling works better than source code metrics. From Figure 9.3 we see that, for every system, there is a range of values of n in which the change coupling measures indeed correlate with the number of defects. They correlate more than the CK and the other object-oriented metrics, but less than the number of changes. The fact that the number of changes correlates with the number of defects was already assessed by previous research [NB05b; MPS08]. One possible reason why the number of changes correlates more is that this information is defined for every class in the system, while only some classes have change coupling measures greater than 0. This also explains why the change coupling measures peak at a given threshold and then decrease in accuracy, as very few classes have a change history large enough to exceed moderately high thresholds of co-change. Further, since not all bugs are related to a change coupling relationship, overall the number of changes has a higher correlation with defects. Similarly, Gyimóthy et al. found that LOC is among the best metrics to predict defects [GFS05], since it is defined for all entities in the system. In conclusion, we can answer the first question:

Change coupling correlates with defects, more than source code metrics but less than number of changes.

Change proneness plays a role. Another observable fact in Figure 9.3 is that for ArgoUML and Mylyn the correlation of the change coupling measures rapidly decreases with n ≥ 5, while for Eclipse this happens with n ≥ 20. The reason is that Eclipse classes have, on average, many more changes and more shared transactions than classes in the other two systems. In

Eclipse the average number of changes per class is 68, while in ArgoUML it is 14.3 and in Mylyn 11.7. The average number of shared transactions per class is 5.3 for Eclipse, 0.37 for ArgoUML and 0.39 for Mylyn. Since we consider only three systems, we cannot derive a general formula, but we limit ourselves to noting that the correlation depends on the change proneness of the system. In short, the insight is the following:

The correlation between change coupling measures and defects varies with n. The trend and the maximum correlation values depend on the software system and, in particular, on its change proneness.

Change coupling is harmful. The situation in Figure 9.4 is different from the one in Figure 9.3. The average value of the Spearman's correlation is lower when considering only major or high priority bugs than when considering all bugs. This is not surprising, since there is a smaller amount of data and therefore the correlation is less precise. The interesting fact here is the delta between the number of changes and the change coupling measures: It is lower for major and high priority bugs than for all bugs, and it is often negative, i.e., the change coupling measures correlate with the number of major/high priority bugs more than the number of changes does. One possible explanation is that change coupling can be detected only in the evolution of a system. As such, this type of dependency is often hidden and might be related to bugs with a high priority or a high severity. The answer to the second question is then:

On average the correlation between change coupling measures and number of major/high priority bugs is lower than with all the bugs. For these particular bugs change coupling measures are always better than code metrics and, in many cases, than number of changes.

Sometimes it is better not to forget the past. One last observation from both Figure 9.3 and Figure 9.4 is that the correlation for EWSOC is always below the one for LWSOC, and the latter is always below NOCC and SOC. From this we infer that "penalizing" couplings in the past does not help in correlating with the number of defects, i.e., couplings in the past also correlate with defects. EWSOC, which penalizes the past more than LWSOC, correlates less with defects. The second part of the answer to the first question is:

“Penalizing” change coupling in the past decreases the correlation with number of defects. The best change coupling metrics are then NOCC and SOC.

9.4 Regression Analysis

The goal of the regression analysis is to answer the following questions:

1. Does the use of change coupling information improve explanative and predictive powers of bug prediction models based on software metrics?

2. Is the improvement greater for severe bugs?

To answer these questions, we create and evaluate different regression models in which the independent variables (the predictors) are respectively code metrics, change coupling measures, number of changes, and their combinations, while the dependent variable (the predicted one) is the number of bugs, the number of major bugs, or the number of high priority bugs.

Once again, we follow the methodology proposed by Nagappan et al. [NBZ06] that we already used in Chapters 7 and 8. The methodology, detailed in Section 7.2.2, consists of the following steps: Principal component analysis, building regression models, evaluating explanative power, and evaluating predictive power. However, in the previous two chapters we predicted the number of post-release defects, emulating a real-life scenario. In the following experiments, we consider all the bugs, building regression models from 90% of the classes (training set) and evaluating them on the remaining 10% (validation set). Even though these settings do not emulate a real-life scenario, we opt for them for two reasons: First, our goal is to compare regression models to inspect whether change coupling information can improve established defect prediction approaches. Applying the models on the same dataset, although not emulating real-life settings, achieves this goal. Second, since we want to study the impact of bug severity on the prediction, we have to perform experiments considering major or high priority bugs only. Unfortunately, in our dataset, these types of bugs in a post-release period (six months) are not numerous enough to produce significant correlations. Therefore, we consider the entire history.
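The evaluation pipeline can be sketched as follows: principal components of the standardized metrics are computed on the training 90%, a least-squares model is fit on them, and the predictive power is measured as the Spearman correlation between predicted and actual bug counts on the held-out 10%. All data here is synthetic, and the 95% explained-variance cutoff for component selection is an assumption of this sketch, not a value taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(7)
n_classes, n_metrics = 300, 8

# Synthetic per-class metrics and bug counts (illustrative only).
X = rng.normal(size=(n_classes, n_metrics))
bugs = X[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=n_classes)

# 90% training set, 10% validation set.
idx = rng.permutation(n_classes)
split = int(0.9 * n_classes)
train, valid = idx[:split], idx[split:]

# PCA via SVD on the standardized training metrics; keep the components
# explaining ~95% of the variance (assumed threshold).
mu, sigma = X[train].mean(axis=0), X[train].std(axis=0)
_, s, Vt = np.linalg.svd((X[train] - mu) / sigma, full_matrices=False)
explained = s ** 2 / (s ** 2).sum()
k = int(np.searchsorted(np.cumsum(explained), 0.95)) + 1
P = Vt[:k].T  # projection onto the first k principal components

def project(M):
    return (M - mu) / sigma @ P

# Least-squares regression on the principal components of the training set.
A = np.column_stack([np.ones(len(train)), project(X[train])])
coef, *_ = np.linalg.lstsq(A, bugs[train], rcond=None)

# Predict on the validation set; predictive power = Spearman correlation.
pred = np.column_stack([np.ones(len(valid)), project(X[valid])]) @ coef

def rank(v):
    r = np.empty(len(v))
    r[np.argsort(v)] = np.arange(1, len(v) + 1)
    return r

rho = np.corrcoef(rank(pred), rank(bugs[valid]))[0, 1]
```

Working on principal components rather than the raw metrics avoids the multicollinearity that plagues metric suites whose members are strongly intercorrelated.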

9.4.1 Results

Figure 9.5 shows the results of our experiments for Eclipse in terms of explanative power (R2) and predictive power (Spearman's correlation). We show the results for the regression models built using the following sets of variables: (1) source code metrics, (2) code metrics and number of changes, (3) code metrics and NOCC, (4) code metrics and SOC, (5) code metrics and EWSOC, (6) code metrics and LWSOC, (7) all NOCC, i.e., the NOCC metrics for each value of n, and (8) all CC measures, i.e., all the measures of change coupling for each value of n. We show the results only for Eclipse and only for major bugs, since the results for the high priority bugs and for the other two systems (ArgoUML and Mylyn) are in line with the ones presented in Figure 9.5. We do not show the adjusted R2 values, since they tend to remain comparable to R2.

9.4.2 Discussion

Regression models based on source code metrics and change coupling information have a greater explanative and predictive power than models based only on code metrics. However, the model based on code metrics and number of changes has a slightly better prediction power than “all CC measures” and “NOCC all”, and a slightly worse explanative power than “all CC measures”. This answers our first question:

Using change coupling information improves explanative and predictive powers of bug prediction models based on source code metrics. However, it does not yield a significant improvement over models based on code metrics and number of changes.

When considering only major bugs, the overall performance is lower, but the models based on change coupling are better than the one based on the number of changes. The model based on "all CC measures" is the best in terms of explanative power, but it also suffers from overfitting, since its prediction performance is much lower. On the other hand, the model based on "NOCC all" is the best in terms of prediction (slightly better than the number of changes), but not in terms of R2. We can answer our second question:

[Four panels: R2 (top) and Spearman's correlation (bottom) versus the change coupling threshold n, for the number of bugs (left) and the number of major bugs (right) in Eclipse. Legend: Metrics + NOCC, Metrics + SOC, Metrics + EWSOC, Metrics + LWSOC, Metrics + #Changes, Metrics, NOCC All, All CC measures.]

Figure 9.5. Results of the regression analysis for Eclipse

Predicting severe bugs is hard, as the performance of prediction models is, on average, lower than for all bugs. Models based on change coupling measures yield better results than models based on code metrics, and slightly better results than the one based on code metrics and number of changes.

The conclusions drawn for the correlation analysis are still valid for the regression. First, it is better not to forget the past, i.e., change coupling measures that penalize past coupling relationships (EWSOC and LWSOC) have poor explanative and predictive power. Second, change proneness plays a role: The change coupling measures have different trends for different systems. This is because different systems have different average numbers of transactions and shared transactions per class. We do not show the regression results for ArgoUML and Mylyn, but the trends of NOCC, SOC, EWSOC and LWSOC are similar to the ones presented in Figure 9.3 for the correlation analysis.

9.5 Threats to Validity

Threats to construct validity. We already presented some of these threats in Section 3.2.3, when describing the limitations of populating Mevo models. Another threat concerns large commits: We discussed in Section 9.2 how to mitigate it.

Threats to statistical conclusion validity. In our experiments all the Spearman’s correlation coefficients and all the regression models were significant at the 99% level.

Threats to external validity. In our experiments there are three threats belonging to this cat- egory: First, we analyzed only three software systems; second, they are all open-source; third, they are all developed in Java.

9.6 Summary

Change coupling has long been considered a significant issue. However, no empirical study of its correlation with actual software defects had been done until now. By exploiting the data residing in Mevo, we performed such a study on three large software systems and found that there is indeed a correlation between change coupling and defects: The correlation is higher than the one observed with source code metrics. Further, defects with a high severity seem to exhibit a correlation with change coupling that, in some instances, is higher than the one with the change rate of the components. We also enriched bug prediction models based on code metrics with change coupling information, and the results—in terms of explanative and predictive power—corroborate our previous findings. In this chapter we empirically investigated the (negative) impact of change coupling on software quality, measured as number of defects. In the next chapter, we perform an analogous study on design flaws to determine their impact on software defects.

Chapter 10

On the Relationship Between Design Flaws and Software Defects

Over the last years, researchers proposed a variety of approaches to detect source code fragments that are hard to understand, change or maintain. Source code entities that have design flaws are good candidates, since flaws are known to have a negative impact on quality attributes [GHJV95]. As a last application of our approach, we want to empirically assess this negative impact. To this aim, we need to measure software quality and to identify design flaws in the source code. For the former, we apply the same technique employed in Chapter 9, i.e., we quantify (negative) software quality with the number of defects. To identify design flaws, simple source code metrics are insufficient, because they must be considered and analyzed in the context in which they appear. For this reason, meaningful metric combinations were devised as so-called detection strategies [Mar04] and put in the context of design (dis)harmony [LM06]. Researchers thoroughly analyzed design disharmonies: To find good metrics and thresholds for their classification [Mar04; LM06; SLT06], to propose correction strategies and refactorings [TSG04; LM06], to visualize them [WL08b], and to put them in relation to code evolvability [ML06] or change-proneness [KPG09]. Still, the relationship between design flaws and software defects was not investigated. By analyzing the data residing in our Churrasco framework, we study this relationship, conducting an extensive experiment on six open-source software systems. First, we examine the frequency of design flaws in the systems. Then, we analyze the correlation of flaws with post-release defects. The fact that Mevo models multiple source code versions also allows us to study the evolution of flaws over time: In particular, we evaluate whether adding flaws to a software entity will induce bugs in the future.
We conducted our experiments not only by analyzing each flaw per se, but also by extracting and comparing differences between flaws in all the mentioned situations.

Structure of the chapter. In Section 10.1 we introduce detection strategies, the technique we employ to identify design flaws in software systems. In Section 10.2 we describe our dataset and experimental setup. We present our set of experiments and discuss their results in Section 10.3. Before concluding in Section 10.5, we outline the threats to the validity of this study in Section 10.4.


10.1 Design Flaws and Detection Strategies

As opposed to object-oriented metrics [CK94], which are simple measures of size (e.g., lines of code, number of methods) or complexity (e.g., McCabe cyclomatic complexity) of software, detection strategies [Mar04] provide a formal method to identify design flaws in a given source code fragment, also referred to in the literature as “code smells” [FBB+99]. To recognize a number of design flaws, we transform informal design rules, guidelines, and heuristics [GHJV95; Rie96; FBB+99] into detection strategies, i.e., metrics-based logical conditions that detect violations against design guidelines. We use the catalog of design flaws described by Lanza and Marinescu [LM06]. We illustrate how we translate informal design rules into a detection strategy to identify the design flaw called Brain Method, and refer the reader to [LM06] for details about the other detection strategies.

Example

The Brain Method design flaw refers to a method that tends to centralize the functionality of a class, in the same way a God Class [Rie96] centralizes the functionality of an entire (sub)system. It can be informally described by the following rules: (1) It is excessively large, (2) it has many conditional branches, computed using McCabe’s cyclomatic complexity, (3) it has a deep nesting level, and (4) it uses many1 variables. These rules can be transformed into the detection strategy depicted in Figure 10.1.

Method is excessively large:          LOC > HIGH (Class) / 2
  AND
Method has many conditional branches: CYCLO ≥ HIGH
  AND
Method has deep nesting:              MAXNESTING ≥ SEVERAL
  AND
Method uses many variables:           NOAV > MANY
                                      ⇒ Brain Method

Figure 10.1. The Brain Method detection strategy

Filtering conditions are expressed in terms of metrics (the left part of the expressions) and related to thresholds (the right part of the expressions). Lanza and Marinescu computed the values of the thresholds by measuring 45 Java systems of different sizes and from different domains [LM06].
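As a sketch, the conjunction of filtering conditions in Figure 10.1 can be expressed as a predicate over the four metrics. The threshold values below are placeholders chosen for illustration, not the statistically assessed ones from [LM06]:

```python
# Placeholder thresholds -- the actual values are derived statistically
# in [LM06]; the numbers here are assumptions for illustration only.
HIGH_CYCLO = 7   # "HIGH" threshold for cyclomatic complexity
SEVERAL = 5      # "SEVERAL" nesting levels
MANY = 7         # "MANY" accessed variables (short-term memory limit)

def is_brain_method(loc, cyclo, max_nesting, noav, high_loc_class):
    """All four Brain Method conditions must hold simultaneously."""
    return (loc > high_loc_class / 2        # method is excessively large
            and cyclo >= HIGH_CYCLO         # many conditional branches
            and max_nesting >= SEVERAL      # deep nesting
            and noav > MANY)                # uses many variables
```

For instance, a 130-line method with CYCLO 10, maximum nesting 6 and 12 accessed variables, in a class whose HIGH (Class) LOC threshold is 200, would be flagged as a Brain Method candidate.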

1 “Many” refers to a number higher than a human can keep in short-term memory [Pin97], i.e., 6 - 9.

CYCLO—also known as McCabe’s Cyclomatic Complexity—is the number of linearly-independent paths through an operation. MAXNESTING represents the maximum nesting level of control structures within an operation. NOAV is the total number of variables directly accessed from the measured operation. Variables include parameters, local variables, instance variables and global variables. “HIGH” refers to a threshold for methods, while “HIGH (Class)” refers to a threshold for classes. This detection strategy, functioning on any Java software system, produces a set of candidate methods exhibiting symptoms of the Brain Method design flaw. For example, in one version of Lucene we were able to detect 120 brain methods. In addition to Brain Method, we consider the following method level design flaws:

1. Feature Envy: Methods more interested in the data of other classes than in that of their own class [FBB+99], accessing it directly or via accessors.

2. Intensive Coupling: Methods intensively coupled to other methods located in few other classes. The communication between the client method and (at least one of) its provider classes is excessively verbose.

3. Dispersed Coupling is complementary to the previous design flaw, and it refers to a method excessively tied to many other methods in the system dispersed among many classes. A single method communicates with an excessive number of classes, whereby the communi- cation with each of the classes is not intense.

4. Shotgun Surgery denotes that a change in a single method implies many changes to many different methods and classes [FBB+99]. This design flaw deals with strong afferent (in- coming) coupling, thus concerning the coupling strength and dispersion.

10.2 Experimental Setup

We perform our experiments on the six Java systems listed in Table 10.1. For each system, we create a Mevo model including versioning system information, bug data and bi-weekly source code snapshots as FAMIX models. Once all the FAMIX models are created—one every two weeks—we apply detection strategies to them to identify design flaws. The result is a list of the method-level design flaws that each class contains. We conduct our study at the class level since classes are the cornerstone of the object-oriented paradigm, and developers perform maintenance and refactoring tasks mostly at this level.

Table 10.1. Software systems used for the experiments

System           Description
Lucene           High-performance, full-featured text search engine library.
Maven            Tool for build automation and management of Java projects.
Mina             Network application framework.
Eclipse CDT      C/C++ Integrated Development Environment (IDE) for Eclipse.
Eclipse PDE UI   Models, builders and editors to facilitate plug-in development in Eclipse.
Equinox          Plugin system for the Eclipse project.

For each system, Table 10.2 shows the period of time considered,2 the number of versions included in the model, the size of the systems in terms of average number of classes, average number of design flaws and total number of bug references reported in the considered time period. Since the number of classes and the number of design flaws vary across system versions, we show average numbers over all the considered versions. Bug references indicate all the links to classes that a bug can have: When a single bug is linked with multiple classes, the number of bugs is one, while the number of bug references is equal to the number of linked classes.

Table 10.2. The data set used for the experiments

Lucene, Maven and Mina are Apache projects; CDT, PDE and Equinox are Eclipse projects.

System                     Lucene        Maven         Mina          Eclipse CDT   Eclipse PDE   Equinox
Time period (from)         Jan 1, 2005   Jan 1, 2005   Jan 14, 2006  Jun 24, 2006  Jan 1, 2005   Jan 1, 2005
Time period (to)           Oct 8, 2008   Feb 18, 2009  Dec 10, 2008  Feb 25, 2009  Sep 11, 2008  Jun 25, 2008
Last release               2.4.0         2.0.10        2.0.0-M4      5.0.2         3.4.1         3
Versions                   99            108           76            70            97            91
Bug references (tot)       982           1,500         629           923           4,953         2,043
Classes (avg)              513.5         156.2         108.6         217.8         1,170.5       242.6
Brain method (avg)         106.9         24.8          1.6           6.9           29.1          35.5
Dispersed coupling (avg)   14.2          2.9           1.6           0.4           63.0          31.8
Feature envy (avg)         943.7         74.7          70.1          90.6          1,006.8       450.8
Intensive coupling (avg)   0.9           6.6           1.3           3.4           1.5           4.3
Shotgun surgery (avg)      123.7         20.7          19.6          31.5          117.5         36.7

Once we extracted the design flaw data for all the versions of a system (all the FAMIX models), for each flaw we build a design flaw matrix F, similar to the matrix created to compute the churn and the entropy of source code metrics (cf. Section 7.3.5). The design flaw matrix F, exemplified in Figure 10.2, has the following properties:

• Each column represents a (bi-weekly) version of the system.

• Each row represents a class.

2The end of the time period always corresponds to a major release of the software.

[Figure: an example design flaw matrix. Columns V1, V2, …, Vn are bi-weekly system versions (14 days apart); rows are classes. For example, class Bar has the values 4, nil, …, nil (class Bar does not exist at version Vn), and class Foo has the values 2, 1, …, 0 (class Foo at version V2 has 1 brain method).]

Figure 10.2. An example of design flaw matrix for brain methods

• The value of a cell at row r and column c is equal to the number of instances of the design flaw in the class represented by r at the version represented by c.

• Since some classes exist in some versions but not in others, the set of rows is composed of all the classes existing in at least one version. If the class at row r does not exist at the version in column c, we set the value of the cell (r, c) to nil.
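Building such a matrix from per-version flaw counts can be sketched as follows; the class names and counts are hypothetical, and None plays the role of nil:

```python
# Hypothetical input: one dict per bi-weekly version, mapping the
# classes existing in that version to their number of flaw instances.
versions = [
    {"Bar": 4, "Boo": 0, "Foo": 2},
    {"Boo": 0, "Foo": 1},            # class Bar no longer exists here
]

def build_flaw_matrix(versions):
    """Rows = classes seen in at least one version, columns = versions.
    Cells hold flaw counts; None marks a class absent from a version."""
    classes = sorted({c for v in versions for c in v})
    # dict.get(c) returns None when class c does not exist at a version
    return {c: [v.get(c) for v in versions] for c in classes}

F = build_flaw_matrix(versions)
print(F["Bar"])   # [4, None]
```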

10.3 Experiments

Before studying the relationship between design flaws and software defects, we want to examine the frequencies of different flaws in different systems. In particular, we aim at answering the following question: Are there design flaws that are more frequent in all the systems, or is each system different with respect to design flaw frequencies? Table 10.2 provides a first indication that there are patterns of design flaw frequencies in the analyzed systems. However, we cannot compare the numbers of flaws in Table 10.2, as the systems have different numbers of classes. To better investigate our question, we compare the average number of flaws per class (which is also an average over all the versions). Figure 10.3 shows the average number of design flaws per class, grouped by flaw: Feature envy is the most frequent in all the systems, followed by shotgun surgery, which is stable across all the systems. Intensive coupling is relatively low for all systems, while dispersed coupling and brain methods vary. To analyze the relationship between design flaws and software defects we perform two sets of experiments:

1. Correlation analysis: We study the correlation between number of post-release defects per class and class-level design flaws.

2. Delta analysis: We investigate whether increments of flaws correlate with defects, within a given time window.

[Figure: bar chart of the average number of design flaws per class (0 to 2), grouped by flaw (Brain Methods, Dispersed Coupling, Feature Envy, Intensive Coupling, Shotgun Surgery), with one bar per system (Lucene, Maven, Mina, CDT, PDE UI, Equinox).]

Figure 10.3. Average number of design flaws per class, grouped by flaw

10.3.1 Correlation Analysis

We found that some design flaws are more frequent than others. Now we want to study the correlation between the number of design flaws and the number of post-release defects at the class level. Our goal is to answer the following questions: Do design flaws correlate with post-release defects? Does any flaw consistently correlate more than others in all systems?

In our analysis we consider post-release defects, i.e., defects reported after the considered version (release) of the system. For example, if we consider class Foo at version x, post-release defects are defects linked to Foo reported after x and within a certain time window. As Zimmermann et al., we consider a period of six months for post-release defects [ZPZ07]. We consider post-release defects because we want to investigate whether the presence of design flaws generates bugs in the future.

To compute the correlation we need two lists: The first one contains the number of design flaws for each class, and the second one the number of post-release defects. The first list corresponds to a column of the matrix F, while the second is mapped to a column in an analogous matrix B. In B, rows represent classes, columns represent versions, and the value of a cell at row r and column c represents the number of post-release defects relative to the version c of the class r. To compute the number of post-release defects, we filter the extracted bug data according to the bug reporting dates.

To measure the correlation we could use either the Pearson’s linear correlation coefficient, which is suited to linear relationships, or the Spearman’s rank correlation coefficient, suitable for general associations [Tri06]. To choose which measure is more appropriate we studied the distribution of the data, which turned out to be highly skewed with respect to a normal distribution. In fact, most of the classes have very few or zero design problems and post-release defects.
For this reason we opted for the Spearman’s coefficient, as it is recommended with data that is skewed or that contains outliers [Tri06].
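Counting the post-release defects that fill one cell of B can be sketched as follows; the six-month window is approximated with 182 days (an assumption), and the dates are invented for illustration:

```python
from datetime import date, timedelta

POST_RELEASE_WINDOW = timedelta(days=182)   # roughly six months

def post_release_defects(version_date, bug_dates):
    """Count the bugs linked to a class that were reported within
    six months after the given version date (cf. [ZPZ07])."""
    end = version_date + POST_RELEASE_WINDOW
    return sum(1 for d in bug_dates if version_date < d <= end)

# Invented reporting dates of the bugs linked to one class
bugs = [date(2006, 2, 1), date(2006, 5, 20), date(2007, 1, 10)]
print(post_release_defects(date(2006, 1, 1), bugs))   # 2
```

The third bug falls outside the window and is therefore not counted for this version, although it would be counted for a later one.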

[Figure: the design flaw matrix F and the post-release defects matrix B side by side; for each version (column) i, the Spearman’s coefficient ρi is computed between the corresponding columns of F and B, yielding the correlation vector ρ1, ρ2, ρ3, …, ρn.]

Figure 10.4. Computing Spearman’s correlations over multiple versions of a system
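The column-wise computation of Figure 10.4 can be sketched as follows. The helper implements Spearman's coefficient as the Pearson correlation of the ranks; the significance test used to filter out non-significant coefficients is omitted for brevity:

```python
import numpy as np

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks
    (tied values receive the average of their ranks)."""
    def ranks(v):
        order = np.argsort(v, kind="stable")
        r = np.empty(len(v), dtype=float)
        r[order] = np.arange(1, len(v) + 1)
        for val in np.unique(v):        # average the ranks of ties
            mask = v == val
            r[mask] = r[mask].mean()
        return r
    rx = ranks(np.asarray(x, dtype=float))
    ry = ranks(np.asarray(y, dtype=float))
    return float(np.corrcoef(rx, ry)[0, 1])

def correlation_vector(F_columns, B_columns):
    """One coefficient per (bi-weekly) version, as in Figure 10.4;
    nil cells are assumed to have been filtered out beforehand."""
    return [spearman(f, b) for f, b in zip(F_columns, B_columns)]

vec = correlation_vector([[1, 2, 3, 4]], [[1, 3, 2, 4]])
```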

Figure 10.4 shows how we compute the correlation over all the versions of a given system, producing a correlation vector that represents the correlation trend over time. We create the vector by computing the Spearman’s coefficient on pairs of columns from the design flaw and bug matrices (F and B). Figure 10.5 shows, for each software system in our dataset, plots of the correlation vectors for all the considered design flaws. Interruptions in the lines mean that in the corresponding version of the system the Spearman’s correlation coefficient was not significant. We see that every system is different and there is no flaw that is consistently more correlated with post-release defects across the systems. Moreover, not even within a single system are design flaws consistently more or less correlated with respect to each other: They oscillate over different versions.

To obtain a better grasp of how much flaws are correlated with defects, we only consider strong correlations, i.e., the ones having a Spearman’s coefficient above 0.4, which is the threshold for considering a correlation to be strong in fault prediction studies [ZN08]. Figure 10.6 shows the percentage of versions with a strong correlation, by design flaw and by software system. The percentages are computed from the correlation vectors as the number of versions with a strong correlation (greater than 0.4) divided by the total number of versions. We see that there are two types of systems: (1) Systems where no design flaw is strongly correlated with defects (PDE and Lucene) and (2) systems in which one design flaw is frequently strongly correlated with defects (Equinox, CDT, and Maven). Mina does not belong to either group, as some design flaws rarely have a strong correlation with post-release defects, but none of them is much more frequent than the others (as in Equinox, CDT, and Maven). From the correlation analysis we draw the following conclusions:

• Design flaws correlate with post-release defects, but not strongly in all the analyzed sys- tems.

• There is no design flaw that consistently correlates more than others in all the systems.

• In some systems there is no design flaw that strongly correlates with defects.

• Some systems are characterized by a particular design flaw, i.e., one flaw is strongly correlated with defects much more frequently than all the others.

[Figure: six line plots (Lucene, Maven, Mina, CDT, PDE, Equinox), each showing the Spearman’s correlation (y axis, from -0.1 to 0.6) against the system versions (x axis, from 0 to 100) for Brain Methods, Shotgun Surgery, Feature Envy, Dispersed Coupling and Intensive Coupling.]

Figure 10.5. Spearman’s correlations between number of design flaws and number of post-release defects over multiple versions of software systems

[Figure: bar chart of the percentage of versions (0% to 60%) with a strong correlation, per system (Lucene, Maven, Mina, CDT, PDE UI, Equinox) and per design flaw (Brain methods, Dispersed coupling, Feature envy, Intensive coupling, Shotgun surgery).]

Figure 10.6. Percentages of systems’ versions with a strong correlation (Spearman’s > 0.4) with post-release defects
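Given a correlation vector, the percentage of versions with a strong correlation can be computed as follows; None marks versions where the coefficient was not significant, and the example vector is invented:

```python
def strong_percentage(corr_vector, threshold=0.4):
    """Percentage of versions whose Spearman coefficient exceeds the
    strong-correlation threshold of 0.4 [ZN08]; the denominator is
    the total number of versions, including non-significant ones."""
    strong = sum(1 for rho in corr_vector
                 if rho is not None and rho > threshold)
    return 100.0 * strong / len(corr_vector)

print(strong_percentage([0.5, 0.3, None, 0.45]))   # 50.0
```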

10.3.2 Delta Analysis

In the second part of our experiments we want to investigate whether an addition of design flaws in a class generates bugs. We aim at answering the following questions: Do design flaw additions correlate with software defects? Is there a design flaw for which additions consistently correlate more than additions of any other design flaw? What is the relationship between the “flaws-defects” correlation and the “flaw additions-defects” correlation?

To study design flaw additions, we need to detect, extract and measure these addition events. We do so by analyzing each row in the design flaw matrix F, as depicted in Figure 10.7. Given a row, which includes the number of design flaws in all the considered versions of a class, we first compute the deltas between each consecutive pair of versions: A positive delta represents an addition of design flaws in the given class. Then, given the deltas row, we group all the sequences of positive values, where a sequence indicates a “longer” addition (see Figure 10.7). To analyze the relationship between the detected sequences and software defects, we count the number of bugs (linked to the considered class) reported from the beginning to the end of the sequence plus a time window of 90 days.3 In case the sequence is composed of a single delta, the beginning of the sequence coincides with its end. We finally build two lists that we use to compute the correlation, where each element represents a design flaw addition (a sequence): The first list measures the total addition value, while the second one counts the number of defects reported during the addition period (the sequence) plus a time window of 90 days. Figure 10.8 shows, for each system and for each flaw, the Spearman’s correlations between the two lists (the total addition values and the numbers of reported defects).
For some flaws in some systems, not enough addition events were detected to obtain a significant correlation: In these cases we do not show the correlation. Looking at Figure 10.8 we conclude that:

• Additions of design flaws correlate with software defects, i.e., introducing a design flaw in a class is likely to generate bugs that affect the class. However, this does not hold for all design flaws in all software systems.

3To be conservative, we use a time window which is half of the post-release defect time window.

[Figure: a row of the design flaw matrix over the bi-weekly versions from 01.01.2005 to 07.05.2005, with values 0, 2, 2, 2, 3, 6, 9, 8, 9, 9; the corresponding deltas row 2, 0, 0, 1, 3, 3, -1, 1, 0; and the extracted addition events with their defects time windows: total addition 2 (from 15.1.2005 to 15.1.2005 + 90 days), 7 (from 26.2.2005 to 26.3.2005 + 90 days), and 1 (from 23.4.2005 to 23.4.2005 + 90 days).]

Figure 10.7. Extracting and measuring design flaw addition events
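The extraction of addition events can be sketched as follows; the flaw counts and (abbreviated) dates reproduce the example of Figure 10.7, and the bug counting over the 90-day window is left out:

```python
def addition_events(flaw_row, dates):
    """Group consecutive positive deltas into addition events: each
    event yields (total addition, begin date, end date); bugs are then
    counted from the begin date to the end date plus 90 days."""
    deltas = [b - a for a, b in zip(flaw_row, flaw_row[1:])]
    events, i = [], 0
    while i < len(deltas):
        if deltas[i] > 0:
            j = i                         # extend the positive sequence
            while j + 1 < len(deltas) and deltas[j + 1] > 0:
                j += 1
            total = sum(deltas[i:j + 1])
            events.append((total, dates[i + 1], dates[j + 1]))
            i = j + 1
        else:
            i += 1
    return events

row = [0, 2, 2, 2, 3, 6, 9, 8, 9, 9]
dates = ["01.01", "15.01", "29.01", "12.02", "26.02",
         "12.03", "26.03", "09.04", "23.04", "07.05"]
print(addition_events(row, dates))
# → [(2, '15.01', '15.01'), (7, '26.02', '26.03'), (1, '23.04', '23.04')]
```

A single positive delta yields an event whose beginning and end coincide, matching the first and third events of the figure.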

[Figure: bar chart of the Spearman’s correlations (from -0.3 to 0.7) between design flaw additions and generated defects, per system (Lucene, Maven, Mina, CDT, PDE UI, Equinox) and per design flaw (Brain methods, Dispersed coupling, Feature envy, Intensive coupling, Shotgun surgery).]

Figure 10.8. Spearman’s correlations between additions of design flaws and number of gener- ated defects

• There is no design flaw addition which consistently correlates more than others in all the systems.

Comparing the correlations of defects with absolute numbers of design flaws and with design flaw additions, we notice the following facts: In PDE and Lucene no design flaw is strongly correlated with defects (see Figure 10.6), but at the same time, adding any design flaw in these systems is likely to introduce bugs, as the correlations between flaw additions and defects are relevant for all the design flaws (with the exception of feature envy in PDE). On the other hand, for systems characterized by a particular flaw, an addition of that flaw is not likely to introduce defects or, at least, not as much as other flaws. For example, in Maven brain method is far more frequently correlated with defects than other flaws; however, an addition of brain methods in Maven is not as likely to introduce defects as an addition of shotgun surgery. The same holds for Equinox with shotgun surgery and CDT with intensive coupling.

10.3.3 Wrapping Up

The goal of our experiments was to answer a number of questions concerning the frequency of design flaws in the analyzed systems, their correlations with software defects, and whether their introduction in software entities is likely to generate bugs. The experiments showed that feature envy is the most frequent design flaw, but it is not the one most correlated with software defects. Our correlation analysis revealed that none of the analyzed design flaws is more correlated with defects than the others consistently across systems. Similarly, adding design flaws is likely to introduce defects in many (but not all) software systems, but no design flaw addition correlates with defects more than the others consistently across systems. Moreover, we found that some software systems are characterized by a specific design flaw “f” (different from system to system), in the sense that in these systems the flaw f is strongly correlated with defects much more frequently (across versions) than all the others. Interestingly, by performing the delta analysis, we discovered that an addition of f in these systems is not likely to introduce defects or, at least, not as much as other flaws that are less correlated with defects than f. A possible explanation of this finding is that, since in these systems the flaw f is very frequent, developers know how to deal with it without generating bugs. We also found systems in which no design flaw is strongly correlated with defects. Adding any flaw in such systems is likely to introduce defects.

10.4 Threats to Validity

Threats to construct validity. We already discussed some of these threats in Section 3.2.3. Another construct validity threat, concerning design flaws, is that we defined them using the detection strategies proposed by Marinescu [Mar04] to detect violations against design guidelines. These guidelines are based on thresholds statistically assessed on 45 Java systems. Although such a comprehensive number of software systems—from various projects and domains—was considered, they do not cover all the possible cases, and this can affect the effectiveness of the thresholds in identifying design flaws.

Threats to statistical conclusion validity. In our experiments we use the Spearman’s correlation coefficient to evaluate the relationship between design flaws and software defects, and all the correlations are significant at the 0.01 level.

Threats to external validity. In our experiments, there are two threats belonging to this category: First, we considered open-source software systems only. Differences between open-source and industrial development could change our results. However, as also reported by Lanza and

Marinescu [LM06], design flaws also appear in industrial software systems and can be effectively detected using the employed detection strategies. Second, we only considered software systems developed in the Java programming language. This affects the generalization of our findings. However, using the same parser for all the case studies ensures that all the code metrics are defined identically for each system.

10.5 Summary

Design flaws are known to have a negative impact on quality attributes of software systems, for example on their flexibility or maintainability. In this chapter, we performed an extended analysis of the relationship between design flaws and software defects: We studied design flaws and software defects in six different software systems developed by independent development teams and emerging from the context of two unrelated communities (Apache and Eclipse).

Looking at the frequencies of design flaws in these projects, we discovered that feature envy is consistently the most recurring design flaw. The feature envy disharmony refers to methods that access the data of other classes more than the data of their own class. It might be a sign that the method was misplaced, and that it should be moved to another class. The most significant aspect of feature envy is that it is a sign of an improper distribution of a system’s intelligence. Afterward, our analysis—spanning a minimum of two years in the history of the chosen systems—showed that design flaws do correlate with software defects. Also, no flaw consistently correlates more than others across all the different systems. Finally, we found that an increase in the number of design flaws is likely to generate bugs, and still, there is no design flaw addition that consistently correlates more than others across the totality of the systems.

To make our experiments extensible and reproducible by other researchers, our datasets, and in particular the design flaw and post-release defect matrices, are publicly available at: http://www.inf.usi.ch/phd/dambros/flaws-defects/.

Part IV

Epilogue

183

Chapter 11

Conclusion

In the light of the ubiquity and complexity of today’s software systems, it is no wonder that software maintenance and evolution take up most of the cost of a software system. Understanding the evolution of large systems is a challenging problem, due to the sheer size of the data to be analyzed. In this context, researchers devised approaches that mine software repositories to analyze software evolution, with two main goals: inferring the causes of problems in a system and predicting its future.

To be able to analyze a software system, one first needs to create a model of it, according to the goal to be achieved with the analysis, e.g., predicting the location of future defects or detecting critical software components. By reviewing the state of the art in software evolution analysis, we learned that most approaches address a specific maintenance problem, and model software evolution considering only one evolutionary aspect. These approaches, and the infrastructures that implement them, cannot be adapted to tackle different maintenance problems.

In this dissertation, we introduced an approach that takes an integrated view of software evolution, combining evolutionary information about source code and software defects. Our thesis is that this integrated approach supports an extensible set of software maintenance activities. To this aim, we devised an integrated meta-model of evolving software systems, and we implemented it in an extensible framework.

To validate our thesis, we created and evaluated, on top of our approach, seven software evolution analysis techniques. We presented all the techniques, showing that they support various maintenance tasks, targeted at inferring the causes of problems and predicting the future of software systems.


11.1 Contributions

During the course of this dissertation, we made a series of contributions to the state of the art in modeling and analyzing software evolution. We summarize the major ones in the following.

11.1.1 Modeling and Supporting Software Evolution

Limitations of existing software evolution analysis approaches come from development tools and practices that, although unsatisfying for today’s development, still represent de-facto standards. Therefore, to better understand these limitations, in Chapter 2 we looked at the history of software evolution, providing a historical perspective on our work. Subsequently, we surveyed software evolution analysis techniques and, based on their limitations, we identified four requirements for an integrated approach able to support various software maintenance tasks:

R1. Integration of different evolutionary aspects.

R2. Flexibility with respect to creating new techniques on top of the approach and extending the meta-model.

R3. Modeling of software defects as first class entities.

R4. Replicability of the analyses performed and availability of the data.

In Chapter 3, we introduced Mevo, an integrated meta-model of evolving software, and Churrasco, an extensible framework that implements Mevo and serves as a basis to create analysis techniques and tools. Mevo integrates versioning system, source code and defect information (R1), and models defects as first class entities, taking their histories into account (R3). Churrasco provides flexible meta-model support and enables the creation of analysis tools on top of it (R2). The framework also offers a web portal from which all the models used in our analyses and experiments are accessible, thus facilitating replicability (R4).

11.1.2 Analyzing Software Evolution to Infer Causes of Problems

In the second part of the dissertation, we presented three visualization techniques—built on top of Churrasco—aimed at detecting causes of problems in a software system. The techniques focus respectively on the evolution of source code, the evolution of software defects, and their co-evolution.

Analyzing change coupling information. In Chapter 4, we focused on the evolution of source code, and especially on a particular aspect of it, namely change coupling. Change coupling is the implicit dependency among software artifacts that frequently change together. We presented the Evolution Radar, a visual approach to study change coupling information—extracted from our Mevo meta-model—at different levels of abstraction. We applied the visualization to two large open-source software systems, and we conducted a user experiment with a developer of a Smalltalk system. Through this series of experiments, we illustrated how the Evolution Radar supports a number of maintenance activities, such as restructuring, re-documentation, change impact estimation, spotting design issues, identifying reengineering candidates, and understanding module dependencies.

Analyzing the evolution of software defects. The focus of Chapter 5 was the evolution of software defects. We introduced the System Radiography and the Bug Watch views, two interactive visualizations to analyze bug repositories at different levels of granularity. The System Radiography supports the analysis of a bug database as a whole, showing how open bugs are distributed in a system and highlighting critical components, i.e., the ones affected by the most severe bugs. The Bug Watch view targets the inspection of one or a few system components, facilitating the characterization of bugs—based on their life cycle—and the identification of critical ones. By applying the visualizations to the bug repository of Mozilla—consisting of more than a quarter million bugs—we provided anecdotal evidence of the usefulness of the approach.

Analyzing the co-evolution of source code and bugs. After studying the evolution of code and defects in isolation, in Chapter 6 we looked at their co-evolution. We proposed the Discrete Time Figure, a visualization technique to analyze such co-evolution at any level of granularity. Based on the visualization, we formally defined a catalog of co-evolutionary patterns that can be automatically detected in a system, such as day-fly, addition of features, bug fixing, etc. We employed Discrete Time Figures on three open-source software systems, detecting co-evolutionary patterns and characterizing the components of the systems.

11.1.3 Analyzing the Evolution of a Software System to Predict Its Future

In the third part of the dissertation, we presented four studies on software defects: Two of them dealt with how to improve defect prediction techniques, and the other two concerned the relationships of software defects with change coupling and design flaws. The presented studies either directly concerned predicting the future of a system (e.g., defect prediction), or investigated relationships that—once understood—can support such predictions.

Predicting defects with the evolution of source code metrics. Defect prediction is a very active research field, experiencing a growing interest by the software engineering community. Defect prediction addresses a resource optimization problem: Knowing where future defects will appear would allow a project manager to optimize the maintenance resources. When surveying defect prediction approaches in Chapter 2, we observed that a baseline to compare these approaches does not exist. In Chapter 7, we introduced such a baseline in the form of a publicly available benchmark, composed of hundreds of versions of five software systems. Moreover, we proposed two novel approaches, based on the evolution of source code metrics extracted from our Mevo meta-model. We then evaluated these novel techniques on our benchmark, together with a selection of representative approaches from the literature. The results showed that our techniques are the best performing ones, as they gave consistently good results across all five systems.
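To give a flavor of what using the evolution of metrics (rather than a single snapshot) can mean, the sketch below derives simple evolutionary features from per-version metric samples. It is a hypothetical reconstruction, not the exact feature set of Chapter 7; the function name and input format are assumptions.

```python
# Hypothetical sketch: derive "evolution of metrics" features from
# per-version metric snapshots, in the spirit of (but not identical to)
# the Chapter 7 predictors.

def evolution_features(snapshots):
    """snapshots: list of {class_name: metric_value} dicts, one per
    sampled version, oldest first. Returns per-class features."""
    classes = set().union(*snapshots)
    features = {}
    for cls in classes:
        series = [s.get(cls, 0) for s in snapshots]
        deltas = [b - a for a, b in zip(series, series[1:])]
        features[cls] = {
            "last": series[-1],                    # current metric value
            "trend": series[-1] - series[0],       # overall growth
            "churn": sum(abs(d) for d in deltas),  # total variation
        }
    return features

# Lines of code per class, sampled at three versions
v1 = {"Parser": 100, "Lexer": 80}
v2 = {"Parser": 150, "Lexer": 78}
v3 = {"Parser": 140, "Lexer": 90}
feats = evolution_features([v1, v2, v3])
print(feats["Parser"])  # {'last': 140, 'trend': 40, 'churn': 60}
```

Features of this kind can then feed any standard regression or classification model that predicts post-release defects per class.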

Improving defect prediction with information extracted from e-mail archives. In Chapter 3, we argued that the Mevo meta-model is extensible, and that the Churrasco framework supports its extensibility. As a proof of concept, in Chapter 8 we extended Mevo to model e-mail information. Based on the extended meta-model, we devised several metrics that measure the popularity of source code artifacts within discussions taking place in e-mail archives. Then, we investigated whether these popularity metrics correlate with defects, and whether they could be used to improve existing bug prediction approaches. Our experiments on four open-source software systems provided positive answers for both investigations.

Analyzing the relationship between change coupling and software defects. Already when surveying approaches for change coupling analysis (cf. Chapter 2) and when presenting the Evolution Radar (cf. Chapter 4), we observed that change coupling has long been considered a significant issue. However, no empirical study of its correlation with a tangible effect of degraded software quality—such as software defects—had been conducted. In Chapter 9, we conducted such a study, providing empirical evidence—on three open-source software systems—that such a correlation exists. Further, we showed that change coupling data can be used to improve existing defect prediction models based on change and source code metrics.

Analyzing the relationship between design flaws and software defects. Design flaws are known to have a negative impact on software quality attributes, such as flexibility or maintainability. However, it is not clear what the actual impact of design flaws on measurable effects of low quality—such as software defects—is, and whether some flaws are more defect prone than others. In Chapter 10, we analyzed the relationship between a catalog of design flaws and software defects on six open-source software systems. We also studied the evolution of the flaws over multiple versions of the systems, to investigate whether adding design flaws is likely to generate defects. Our experiments showed that the presence and the addition of design flaws significantly correlate with software defects. However, there was no empirical evidence of one flaw being more defect prone than the others.

11.1.4 Tools

Software evolution analysis is unavoidably tied to tools, since they are necessary to cope with the sheer amount of data that characterizes large and long-lived software systems. As a technical contribution, we implemented a number of tools to support the research presented in this dissertation.

Churrasco is an extensible framework that implements the Mevo meta-model and, through a dedicated data interface, serves as a basis for the other analysis tools. Churrasco features a web interface to create models of software systems, hiding the data retrieval and processing tasks from the users. The tool also supports collaborative software evolution visualization and analysis (discussed in Appendix A).

The Evolution Radar visualizes change coupling information at different levels of abstraction. We implemented two versions of the Evolution Radar: as a stand-alone tool, and as an IDE enhancement.

Bug’s Life is a visualization tool that supports the analysis of software bugs in the large, visualizing an entire bug repository, and in the small, rendering individual bugs.

BugCrawler visualizes the co-evolution of source code and bugs at various granularity levels. The tool also features a query engine that detects a catalog of co-evolutionary patterns in the visualized software entities.

Pendolino is a scriptable data analysis tool that computes and exports a variety of metrics and properties about Mevo models (e.g., code metrics, design flaws, change coupling measures, etc.) in a format compatible with Matlab and other statistical tools.

11.2 Limitations and Future Work

During the work on this dissertation, we identified promising future research directions. Some of them are ideas on how to overcome limitations of our approach. In the following, we outline possible future work, discussing also—when appropriate—the shortcomings that motivate it.

Extending the meta-model. The Mevo meta-model combines versioning system, source code and defect information. In our dissertation we extended it to also model e-mail data extracted from mailing list archives. However, there are also other kinds of artifacts that contain information concerning the evolution of software systems, such as design documents, developers’ discussions taking place in chats, debugging information recorded by IDEs [KM08], fine-grained versioning information [Rob08], build system data [Ada09], and test case reports. We envision enriching our meta-model with some of these pieces of data. The challenge in this context will be how to reliably link them to source code entities.

Improving the quality of the data. In Section 3.2.3 we discussed the limitations of our approach to populate models of evolving software systems. These limitations include: (1) Not all links between software defects and source code artifacts are detected; (2) we do not handle renaming events in the history of software artifacts, i.e., a renaming is seen as an addition and a deletion; (3) the data residing in Bugzilla repositories contains some noise [AADP+08]; (4) we treat all SCM transactions in the same way, but some of them commit one conceptual change, while others group several changes. We plan to devise new heuristics and algorithms to mitigate the impact of these shortcomings, as for example the approach proposed by Bachmann and Bernstein to improve the linking between SCM transactions and bug reports [BB09].
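To make limitation (1) concrete, a common linking heuristic in the literature matches bug identifiers mentioned in commit messages against the bug database. The sketch below shows such a heuristic; the regular expression and function names are illustrative, not the dissertation’s actual implementation.

```python
import re

# Illustrative heuristic: link an SCM commit to bug reports by searching
# the commit message for references such as "bug 1234", "#1234" or
# "fixes 1234". The pattern below is an assumption, not Mevo's algorithm.
BUG_REF = re.compile(r"(?:bugs?\s*#?|#|fix(?:es|ed)?\s+#?)(\d{3,})",
                     re.IGNORECASE)

def linked_bug_ids(commit_message, known_bug_ids):
    """Return bug ids referenced in the message, keeping only ids that
    actually exist in the bug database to reduce false positives."""
    candidates = {int(m) for m in BUG_REF.findall(commit_message)}
    return sorted(candidates & known_bug_ids)

known = {1234, 5678}
print(linked_bug_ids("Fixed bug 1234 and refactored parser (#5678)", known))
# [1234, 5678]
```

Even such a simple matcher illustrates why links are incomplete: commits that fix a bug without mentioning its identifier are silently missed.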

Replicating the experiments on a nearly-ideal dataset. As mentioned above, in our approach not all links between bugs and source code artifacts are detected, i.e., the detected links are only a fraction of all existing links. Bird et al. argued that this subset is not a fair representation of the full population, thus inducing a bias that might threaten the results obtained using the subset [BBA+09]. Later, Nguyen et al. investigated whether the bias also exists in a nearly-ideal dataset (the IBM Jazz software project), where the links between bugs and code are enforced by strict development guidelines, and not inferred by subsequent analysis. Based on their experiments, the authors conjectured that the bias is likely to be a symptom of the underlying development process rather than an artifact of the linking technique [NAH10]. We want to replicate the experiments presented in the third part of the dissertation on the IBM Jazz software project, to inspect whether the bias exists in our dataset and, if so, assess its impact on our results.

Generalizing our approach. We applied our approach to open-source software projects only. However, development practices in industrial settings may differ and lead to different developer behaviors, and thus to different results. Moreover, all the experiments presented in the third part of the dissertation (predicting the future) were performed on Java systems. The language might have an impact on the results, as communities using different languages can have different developer cultures. On the other hand, these choices also have some benefits. Using open-source systems allowed us to share our datasets, thus facilitating the replication of our experiments. Using the same language enabled the usage of the same parser for all the case studies, avoiding threats due to behavior differences in parsing, a known issue for reverse engineering tools [KSS02]. Nevertheless, to corroborate our findings, we plan to apply our analysis techniques on industrial systems as well as systems written in other object-oriented languages.

Conducting user studies. In most of the cases, we were the only users of our tools. Since these tools are designed to support analysts, project managers and developers in performing software maintenance tasks, we should involve such practitioners in assessing the tools’ usefulness and usability. We want to conduct a series of user studies to carry out such an assessment.

Analyzing other evolutionary relationships. In Chapters 4 and 9 we studied change coupling, the implicit and evolutionary dependency of software artifacts that frequently change together. We define another implicit and evolutionary relationship, the one between two or more software artifacts that in their histories were affected, either at the same time or in different time periods, by the same bug. We name this kind of relationship “bug sharing”: We plan to analyze it and compare it with change coupling.
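A minimal sketch of how such a bug-sharing relationship could be computed, assuming we already have a mapping from each bug to the files its fixes touched (all names and the input format are hypothetical):

```python
from collections import defaultdict
from itertools import combinations

# Illustrative sketch of the proposed "bug sharing" relationship: two
# artifacts are bug-sharing if at least one bug affected both of them
# at some point in their histories.

def bug_sharing(bug_to_files):
    """bug_to_files: {bug_id: set of affected files}.
    Returns {(file_a, file_b): number of shared bugs}, a weight that
    can then be compared with a change coupling measure."""
    shared = defaultdict(int)
    for files in bug_to_files.values():
        for a, b in combinations(sorted(files), 2):
            shared[(a, b)] += 1
    return dict(shared)

bugs = {
    101: {"Parser.java", "Lexer.java"},
    102: {"Parser.java", "Lexer.java", "Token.java"},
    103: {"Ast.java"},
}
print(bug_sharing(bugs)[("Lexer.java", "Parser.java")])  # 2
```

Ranking artifact pairs by this weight, as one ranks pairs by co-change frequency, would allow a direct comparison of bug sharing against change coupling.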

Predicting defects with co-evolutionary patterns. In Chapter 6 we formally defined a catalog of co-evolutionary patterns that characterize the co-evolution of source code and bugs. Since the patterns can be automatically detected, we want to investigate whether they can be used to improve existing defect prediction techniques.

Exploiting author information. The Mevo meta-model includes author information, such as who performs a commit, creates a bug report, is in charge of fixing a bug, is in charge of verifying a fix for a bug, etc. We plan to explore how such information can be exploited, since with our approach we did so only to a limited extent.

Integrating the visualization tools. Our visualization tools—the Evolution Radar (cf. Chapter 4), Bug’s Life (cf. Chapter 5) and BugCrawler (cf. Chapter 6)—although implemented on top of the same framework (Churrasco), are independent. We want to integrate them to allow users to navigate among the various views the tools offer, for example “jumping” from a Bug Watch view of a bug to a Discrete Time Figure of the component that the bug affects.

Visualizing fine-grained change coupling data. The Evolution Radar—presented in Chapter 4—visualizes change coupling information at the file and module granularity. We want to extend the tool to visualize finer-grained change coupling [ZWDZ05; YMNCC04; BZ06], i.e., at the method level. The challenge in this context will be how to render huge amounts of data, while keeping the visualization intelligible and useful.

11.3 Closing Words

In this dissertation, we showed that integrating evolutionary information about source code and software defects leads to novel techniques both for analyzing software evolution and for supporting maintenance tasks. The evolution of software, however, is not only about defects revolving around the changing code: It is a holistic process with a variety of facets, which leaves traces in distinct repositories. New repositories open different perspectives on the evolution but, as we did with e-mail data, they must always be integrated with the source code—the “place” where software is changed. Our thesis, while highlighting the central role of such integration, is only a first step towards capturing software evolution as a holistic phenomenon.

Part V

Appendices


Appendix A

Supporting Collaborative Software Evolution Analysis

In our dissertation, we presented a software evolution analysis framework called Churrasco. We discussed how Churrasco supports a number of analysis techniques, aimed at inferring causes of problems in software systems and at predicting their future. In this appendix, we present another feature of Churrasco: its collaboration support. The framework provides a set of collaborative visual analyses and supports collaboration by allowing users to annotate the analyzed data.

The need for collaboration in software development is receiving more and more attention. Tools that support collaboration, such as Jazz for Eclipse [Fro07], were only recently introduced, but hint at a larger current trend. Just as software development teams are geographically distributed, consultants and analysts are too. Specialists in different domains of expertise should be able to collaborate without being physically co-located. For these reasons, we argue that software evolution analysis should be a collaborative activity. As a consequence, software evolution analysis tools should support collaboration, by allowing different users, with different expertise, from different locations, to collaboratively analyze a system.

Moreover, we argue that software evolution analysis tools should possess the following characteristics, related to collaboration:

• Accessibility. Researchers developed a plethora of evolution analysis tools and environments. One commonality among many prototypes is their limited usability, i.e., often only the developers themselves know how to use them, thus hindering the development and/or cross-fertilization of novel analysis techniques. There are some notable exceptions, such as Moose [DGN05], which were used by a large number of researchers over the years. Researchers also investigated ways to exchange information about software systems [KZK+06; TDD00], approaches which however are seldom followed up because of lack of time or manpower. We argue that software evolution tools should be easily accessible: In a collaborative setting, where the participants are likely to have different hardware configurations running different operating systems, the analysis tool should be usable with no strings attached.

The Churrasco framework provides an easily accessible web interface both to import soft- ware systems (create models) and to analyze them by means of web-based visualizations.


• Incremental storage of results. Results of analyses and findings on software systems produced by tools are often written into files and/or manually crafted reports, and are therefore of limited use. A different approach would be to incrementally and consistently store analysis results back into the analyzed models: This would allow researchers to develop novel analyses that exploit the results of previous analyses (cross-fertilization of ideas/results). It would also permit asynchronous collaboration, in which the participants do not have to perform the analysis at the same time. Finally, it would serve as a basis to create a benchmark for analyses targeting the same problem, or to combine techniques targeting different problems. Churrasco stores the findings into a central database to create an incrementally enriched body of knowledge about a system, which can be exploited by subsequent users.

Structure of the appendix. In Section A.1 we discuss how Churrasco supports collaboration, detailing the web-based visualizations offered by the tool. We then provide an example of a collaborative session and describe two collaboration experiments performed with Churrasco (cf. Section A.2). In Section A.3 we discuss pros and cons of Churrasco, and examine tool building issues. We survey related work in Section A.4, and conclude in Section A.5.

A.1 Churrasco’s Collaboration Support

In Section 3.4.1 we presented Churrasco and its architecture, focusing on the components to import software projects, process the data and instantiate Mevo models. Here we discuss in detail Churrasco’s components that support collaboration, which were only briefly introduced in Section 3.4.1. Such components are:

• The Visualization module supports software evolution analysis by creating and exporting interactive web-based visualizations.

• The Annotation module supports collaborative analysis by enriching any entity in the system with annotations. It communicates with the web visualizations to depict the annotations within the visualizations.

A.1.1 The Visualization Module

The visualization module offers the following interactive visualizations that support software evolution analysis: the Evolution Radar, the System Complexity view and the Correlation view.

The Evolution Radar supports software evolution analysis by depicting change coupling information. We comprehensively presented this visualization technique in Chapter 4.

The System Complexity View supports the understanding of object-oriented systems, by enriching a simple two-dimensional depiction of classes and inheritance relationships with software metrics (see left part of Figure A.1). By default, the size of the nodes is proportional to the number of attributes (width) and methods (height), while the color represents the number of lines of code. This mapping can be changed from Churrasco’s web interface, by assigning any software metric, from a rich catalog, to the width, height and color of the nodes. The goal of the visualization technique is to provide clues on the complexity and structure of a system.

Figure A.1. System Complexity and Correlation View principles

The Correlation View shows all the classes of a software system in a two-dimensional space, using a scatterplot layout and mapping up to five software metrics on them: on the vertical and horizontal positions, on the size and on the color (see right part of Figure A.1). The default mapping is the following: The nodes’ coordinates represent the number of attributes (x) and methods (y), the color represents the number of lines of code, while the size of the nodes is fixed. As for the System Complexity view, the mapping can be changed at any time using Churrasco’s web interface. The Correlation view is useful to understand the correlation between different metrics in a software system and to detect outliers, i.e., entities having metric values completely different from the majority of the entities in the system.

In the collaboration experiments discussed later, we provide examples of Evolution Radar and System Complexity visualizations, but not of the Correlation view. We give an example of this view here, depicted in Figure A.2. Nodes represent classes of the ArgoUML software system, where the x position is proportional to the number of lines of code, the y position is proportional to the number of post-release bugs, and the color maps the number of methods. Such a choice of metrics mapping can be useful to understand whether larger classes (with a higher number of lines of code) generate more bugs. This correlation does not hold in the case of ArgoUML (see Figure A.2). Moreover, we spot some outliers in the view: the one marked as “A”, which has an outstanding number of bugs, and the ones marked as “B”, with an outstanding number of lines of code.
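As a sketch of the metric-to-visual-attribute mapping described above (Churrasco itself is implemented in Smalltalk; this Python fragment and its metric names are purely illustrative), each metric can be normalized to its system-wide range before being assigned to a position or color:

```python
# Illustrative sketch, not Churrasco's actual code: map class metrics
# onto the visual attributes of a Correlation-view node, normalizing
# each metric to its system-wide range.

def normalize(value, lo, hi):
    return 0.0 if hi == lo else (value - lo) / (hi - lo)

def correlation_view_nodes(classes, x="NOA", y="NOM", color="LOC"):
    """classes: {name: {metric: value}}. Returns per-class visual
    attributes in [0, 1], ready to be rendered as a scatterplot."""
    ranges = {}
    for m in (x, y, color):
        vals = [c[m] for c in classes.values()]
        ranges[m] = (min(vals), max(vals))
    return {
        name: {
            "x": normalize(c[x], *ranges[x]),
            "y": normalize(c[y], *ranges[y]),
            "color": normalize(c[color], *ranges[color]),
        }
        for name, c in classes.items()
    }

classes = {
    "Facade": {"NOA": 3, "NOM": 350, "LOC": 3400},
    "Token":  {"NOA": 10, "NOM": 20, "LOC": 200},
}
nodes = correlation_view_nodes(classes)
print(nodes["Facade"])  # {'x': 0.0, 'y': 1.0, 'color': 1.0}
```

Because each axis is normalized independently, swapping which metric drives which attribute is a one-line change, mirroring the remapping offered by the web interface.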

The Correlation view and the System Complexity visualization are created using the Mondrian framework [MGL06] (residing in Moose) and the Episode framework [Pri07] (residing in Churrasco’s visualization module). To make the visualizations interactive within the web portal, Episode attaches Ajax callbacks to the figures. Figure A.3 shows an example of a System Complexity visualization rendered in the Churrasco web portal. The main panel is the view where all the figures are rendered as SVG graphics. The figures are interactive: Clicking on one of them will highlight the figure (red boundary),

Figure A.2. A Correlation view applied to the ArgoUML software system. Nodes represent classes, nodes’ position represents number of lines of code (x) and number of post-release bugs (y), and nodes’ color maps the number of methods.

generate a context menu and show the figure details (the name, type and metric values) in the figure information panel on the left. Under the information panel Churrasco provides three other panels useful to configure and interact with the visualization:

1. The metrics mapping configurator, which allows the user to customize the view by changing the metrics mapping.

2. The package selector, which allows the user to select, and then visualize, multiple packages or the entire system.

3. The regular expression matcher, with which the user can select entities in the visualization according to a regular expression.

A.1.2 The Annotation Module

The idea behind Churrasco’s annotation module is that each model entity can be enriched with annotations to (1) store findings and results incrementally into the model and to (2) let different users collaborate in the analysis of a system in parallel. Annotations can be attached to any visualized model entity, and each entity can have several annotations. An annotation is composed of the author who wrote it, the creation timestamp and the text. To support persistent annotations, we extended the Mevo meta-model: Adding annotations results in enriching Mevo models. Persistence is provided by the Meta-base component (see Appendix B).

When the user selects the menu action “Show annotations”, an additional panel is rendered at the top left corner of the web page (above the recent annotations panel). The panel shows all the annotations for the selected entity and allows the user to delete (only) his/her annotations. Selecting the “Add annotation” menu item displays another panel (again in the top left corner) that allows the user to write and add new annotations to the selected entity. Since the annotations are stored in a centralized database (enriching Mevo models), any new annotation is immediately visible to all the people using Churrasco, thus allowing different users to collaborate in the analysis. Churrasco features three other panels aimed at supporting collaboration:

Figure A.3. A screenshot of the Churrasco web portal showing a System Complexity visualization of ArgoUML

1. The “Recent annotations” panel displays the most recent annotations inserted, together with the name of the annotated entity, and—by clicking on it—the user can highlight the corresponding figure in the visualization.

2. The “Participants” panel lists all the people who annotated the visualizations, i.e., people collaborating in the analysis. When one of these names is selected, all the figures annotated by the corresponding person are highlighted in the view, to see which part of the system that person is working on.

3. The “Create pdf report” panel generates a pdf document containing the visualization and all the annotations referring to the visualized entities. Figure A.4 shows a modified excerpt (modified to fit in the page) of such a report: In the visualization part, the entities having at least one annotation are highlighted in red, and the corresponding annotations are listed, together with the author and date information.

Figure A.4. An excerpt of a report generated by Churrasco. The entities having one or more annotations are highlighted in red, and the corresponding annotations are provided.

A.2 Collaboration in Action

We show how Churrasco supports collaborative software evolution analysis through one simple example scenario, presented next, and two collaboration experiments with 8 and 4 participants, respectively.

A.2.1 Analyzing ArgoUML

We use the following simple scenario to exemplify Churrasco’s usage for collaborative analysis: Marco and Michele, working on different machines in different locations, study the evolution of ArgoUML, a UML modeling tool composed of about 2000 Java classes. The users first create a Mevo model by indicating the URL of the ArgoUML SVN and Bugzilla repositories in the importer page of Churrasco. Once the model is created and stored in Churrasco’s database, they start the analysis with a System Complexity view of the system. Each user renders the view in his web browser, and attaches annotations to interesting figures in the visualizations. The annotations are immediately visible to the other user on the left side of the browser window (in the annotations panel).

While Michele is analyzing the entire system, Marco focuses on the Model package, which contains several classes characterized by a large number of methods and many lines of code. The entities annotated by Marco in the fine-grained view are then visible to Michele in the coarse-grained System Complexity. Marco has the advantage of a more focused view, while Michele sees the entire context. Figure A.5 shows Marco’s view on the left, while Michele’s is depicted on the right. Marco selected the FacadeMDRImpl class (highlighted in red in Marco’s view), and is reading Michele’s comments about that class (highlighted in blue in Michele’s view). The following are two examples of collaboration:

1. Marco, focusing on the Model package, annotates that the class FacadeMDRImpl shows symptoms of bad design: It has 350 methods, 3400 lines of code, only 3 attributes, and it is the only implementor of the Facade interface. Michele adds a second annotation that Marco’s observation holds also with respect to the entire system, and that FacadeMDRImpl is the class with the highest number of methods in the entire system.

2. Marco sees that several classes in the Factory hierarchy implement the Factory interface, and also inherit from classes belonging to the AbstractModelFactory hierarchy. This is not visible in Michele’s view (where Factory and AbstractModelFactory are highlighted in blue); Michele discovers this fact by highlighting the entities annotated by Marco and by reading his annotations.

Figure A.5. The web portal of Churrasco visualizing System Complexities of the Model package of ArgoUML on the left, and the entire ArgoUML system on the right

Both the participants now want to find out whether these design problems have always been present in the system. They analyze the system’s history in terms of its change coupling using the Evolution Radar. This visualization is time dependent, i.e., different radar views are used to represent different time intervals. Figure A.6 shows, on the left, an Evolution Radar visualization corresponding to the time interval October 2004 – October 2005 and, on the right, the radar corresponding to October 2005 – October 2006. They both represent the dependencies of the Diagram module (displayed as a cyan circle in the center) with all the other modules of ArgoUML, by rendering individual classes.

Marco is looking at the time interval 2004 – 2005 (left part of Figure A.6). He selects the class UMLFactoryMDRImpl (marked in red), belonging to the Model module, because it is the closest to the center (i.e., it has the highest coupling with the Diagram module in the center) and because it is large (the size maps the number of changes in the corresponding time interval). Marco attaches to the class the annotation that it is potentially harmful, given its high coupling with a module (Diagram) other than the one it belongs to (Model).

In the meantime, Michele is looking at the time interval 2005 – 2006 (right part of Figure A.6). He highlights the classes annotated by Marco and sees the UMLFactoryMDRImpl class. In Michele’s radar, the class is not coupled at all with the Diagram module, i.e., it is at the boundary of the view (marked in red). Therefore, Michele adds an annotation to the class, saying that it is probably not harmful, since the coupling decreased over time. After reading this comment, Marco goes back to the System Complexity view, to see the structural properties of the class in the system. The UMLFactoryMDRImpl class (marked in the left part of Figure A.5) has 22 methods, 9 attributes and 600 lines of code.
It implements the interfaces AbstractUmlModelFactoryMDR and UMLFactory. After seeing the class in the System Complexity view, Marco adds another annotation saying that the class is not harmful after all. These pieces of information can then be used by other users in the future. For example,


Figure A.6. Evolution Radars of ArgoUML: 2004–2005 (Marco’s view, left) and 2005–2006 (Michele’s view, right). The class UMLFactoryMDRImpl is marked in red in both radars.

suppose that Romain wants to join the analysis with Marco and Michele, or to start from their results. He can first see which entities the previous users worked on, by highlighting them, and then read the corresponding annotations to obtain the previously acquired knowledge about the system. This simple scenario shows how:

1. The knowledge about a system, gained in software evolution analysis activities, can be incrementally built.

2. Different users, from different locations, can collaborate.

3. Different visualization techniques can be combined to improve the analysis.

A.2.2 First Collaboration Experiment: Analyzing JMol

The previous example showed that Churrasco supports collaborative analysis. However, the example is hardly a collaborative experiment, because (1) there were only two participants, (2) they were the developers of the tool, and (3) they possessed prior knowledge about the analyzed software system. Therefore, we decided to perform a collaboration experiment—in a more realistic setting—with the following goals: (1) Evaluate whether Churrasco is a good means to support collaboration in software evolution analysis, (2) test the usability of the tool, and (3) test the scalability of the tool with respect to the number of participants. We performed the experiment in the context of a university course on software design and evolution. The experiment lasted three hours: During the first 30 minutes we explained the concept of the tool and how to use it, in the following two hours (with a 15-minute break in the middle) the students performed the actual experiment, and in the last 15 minutes they filled in a questionnaire about the experiment and the tool. The participants were five master students, two doctoral students working in the software evolution domain, and one professor. The master students were lectured on reverse engineering topics before the experiment.

The task consisted of using the System Complexity and Correlation views and looking at the source code to (1) discover classes on which one would focus reengineering efforts (explaining why), and (2) discover classes with a big change impact and explain why. The target system chosen for the experiment was JMol,1 a 3D viewer for chemical structures, consisting of ca. 900 Java classes. Among the participants, only one possessed some knowledge about the system.


Figure A.7. A System Complexity of JMol. The color denotes the amount of annotations made by the users. The highlighted classes (green boundaries) are annotated classes.

Figure A.7 shows a System Complexity of JMol in which a node’s size maps the number of attributes (width) and methods (height), and a node’s color represents the amount of annotations it received, i.e., the number of annotations weighted with their length. We see that the most annotated class is Viewer, the one with the highest number of methods (465). However, we can also see that not only the big classes (with respect to methods and/or attributes) were commented, but also very small classes. Table A.1 lists a subset of the annotations made by the users during the experiment. In the assigned time, the participants annotated 15 different classes for a total of 31 annotations, distributed among the different participants, i.e., everybody actively participated in the collaboration. The average number of annotations per author was 3.87, with a minimum of 2 and a maximum of 13. The annotations were also used to discuss certain properties of the analyzed classes. In most of the cases, the discussion consisted of combining different pieces of knowledge about the class (local properties such as number of methods, properties of the hierarchy, dependencies, etc.). At the end of the experiment, all participants but one filled in a survey about the tool and the collaboration experience. Table A.2 reports the results of the survey. In the cases where the sum of the answers is not seven, a participant did not indicate an answer. Although not a full-fledged experiment, it provided us with information about our initial goals: The survey shows that the participants found the tool easy to use, collaboration important in reverse engineering and Churrasco a good means to support collaboration (for all the

1 JMol is available at: http://jmol.sourceforge.net

Table A.1. A subset of the annotations made on JMol. NOA stands for number of attributes and NOM for number of methods.

Class             NOA  NOM  Annotations
JmolSimpleViewer    0    8  “This is a strange hierarchy. There is only one subclass per superclass (all with many method and few attributes).”
JMolViewer          0  135  “Strange: 134 abstract methods, only 1 concrete, only 1 subclass.”
Viewer             54  465  “This class seems to be the “thing” in the system, at least in terms of functionality”, “Strong dependency with Eval.”, “High fan out (25) and many LOC (>1k).”, “High number of access to foreign data.”
Eval               34  198  “This class should probably be broken down.”, “Very strong dependency with Viewer.”
JMol               60   25  “This class has the largest fan out (78). Probably part of the core of the system.”, “High coupling, low cohesion”, “13 protected methods and no child!”
PngEncoder         23   26  “17 protected attributes, completely useless since there’s no child!”
BondIterator        5    5  “There are ca. 6 classes with Iterator logic. The implementation is strange. I would expect them to be in some hierarchy.”
Graphics3D        101  166  “This can be probably broken down. It’s an implementation of a 3d engine.”

Table A.2. A subset of the results from the questionnaire, using a Likert Scale [Lik32] (SA=strongly agree, A=agree, N=Neutral, D=disagree, SD=strongly disagree)

Statement SD D N A SA Churrasco is easy to use 1 3 2 System Complexity view is useful 2 5 Correlation view is useful 1 1 5 Churrasco is a good means to collaborate 7 Collaboration is important in reverse engineering 1 5 1

participants the experiment was the first reverse engineering collaboration experience). Another result is that they found the provided visualizations useful to achieve the given tasks. Churrasco scaled well with eight people accessing the same model on the web portal at the same time, without any performance issues. A final comment given by the users during an informal conversation after the experiment is that they had fun in the collaborative session: They especially liked to wait for annotations from other people on the entities they had already commented, or to see what was going on in the system and which classes were annotated, to then also personally look at them.

A.2.3 Second Collaboration Experiment: Collaborative Restructuring

The purpose of the first experiment was to perform a preliminary evaluation of Churrasco’s usability and whether users would find Churrasco a good means to collaborate. In our second experiment we wanted to simulate a more structured form of collaboration, where users do not all have equal roles. We performed the experiment in the context of a university course on software engineering, with a setup very similar to the one of the previous experiment. This time, the participants were four bachelor students with little knowledge about reverse engineering. The experiment took place during the last week of a project in which all students had developed a web application in Smalltalk during six weeks. During the last week of the project, the students could not add new features to the system, but could only restructure/refactor it to improve its design and code quality. The task that the students had in the experiment was to identify which parts of the system should be refactored, using the System Complexity and Correlation views in Churrasco. The students had different roles in the collaboration: One acted as a leader, responsible for analyzing the system by selecting classes that he thought were candidates for refactoring; the other students would check in detail whether the classes in question needed to be refactored or not. Using the annotations, the leader could also ask questions that the followers then answered. Typical questions were: “What is the responsibility of this class?”, “Can we remove this class?”, “These hierarchies seem to be duplicated, can we merge them?”, “Why is this class in this hierarchy? Should it not be a subclass of that class?” etc. The target software system was composed of 166 classes and 983 methods, for a total of ca. 5,000 lines of Smalltalk code.

Table A.3. A subset of the annotations made on the Smalltalk web application

Class ElementModel
“What is the difference between Element Model and Element? Are both hierarchies replicated?”
“One is the model that manages the functionalities of the element, the other one manages the displaying of the element (it is a proxy pattern).”
“One is for the layout behavior while the other one is for the widget behavior.”

Class WBLBorderLayoutModel
“This layout seems to have more behavior than the others, even though it has the same number of attributes. Maybe it is doing too much. Should it be a composite layout?”
“It has a lot of complex operations that being detached can raise the complexity much more. As you say, it has functionalities that can be distributed in more than one class.”
“The layout is complex. Dividing it into several classes will require too much time and effort.”

Table A.3 shows a subset of the annotations made by the users during the experiment, namely the ones written for a couple of classes. These two groups of annotations exemplify how the collaborative session was performed: The leader was asking questions about the design of classes and hierarchies, and the other students were answering these questions. During the experiment, the participants annotated 11 different classes for a total of 27 annotations, 9 written by the leader and 18 by the other students. Since the participants knew the system, they were faster in writing annotations than the participants of the first collaboration experiment. As in the previous experiment, at the end of the collaborative session the participants filled in a survey about the tool and the collaboration experience. Table A.4 presents a subset of the results. The survey shows that the participants found that collaboration helped them in understanding the system, and that the used methodology (with the leader) was useful to structure the collaborative effort. Moreover, the use of annotations eased the task of selecting potential candidates for refactoring. Another result is that the students found that the collaborative support provided by Churrasco has an added value.

Table A.4. A subset of the results from the questionnaire, using a Likert Scale (SA=strongly agree, A=agree, N=Neutral, D=disagree, SD=strongly disagree)

Statement SD D N A SA
Churrasco is a good means to collaborate 1 3
Collaboration helped me in understanding the system 3 1
The proposed methodology (leaders) helps in structuring the collaborative effort 3 1
Reading other users’ annotations eases the given tasks 2 2
Quantify (1-10) the added value of the collaborative support provided by the tool (scale: 1-2, 3-4, 5-6, 7-8, 9-10) 2 2

A.3 Discussion

The main benefits of Churrasco are its accessibility and flexibility. The features of the framework can be accessed through a web browser: (1) the importers to create and populate Mevo models, (2) the System Complexity and Correlation views, to support the understanding of a system’s structure, and (3) the Evolution Radar visualization, to study the evolution of the system’s modules in terms of change coupling. The visualizations are interactive, and they allow users to inspect the entities represented by the figures, to apply new visualizations on-the-fly from the context menus, and to navigate back and forth among different views. The framework can be extended, both with respect to the Mevo meta-model (as done for example in Chapter 8) and with respect to the visualizations.

A.3.1 Tool Building Issues

We decided to develop Churrasco as a web application, because it eased implementing it as a collaborative platform. However, developing a web-based tool that supports scalable and interactive visualizations raised a number of issues, related to interacting, updating, and debugging.

Interacting. Supporting interaction through a web browser is still a non-trivial task, and even supposedly simple features, such as context menus, must be implemented from scratch. In our Churrasco tool, we implemented the context menus as SVG composite figures, with callbacks attached, which are rendered on top of the SVG visualization. Moreover, it is hard to guarantee a responsive user interface, since every web application introduces a latency due to the transport of information.

Updating. The standard way of rendering a web visualization is that every time something changes in the page, the whole page is refreshed to show the updated version. In the context menu example, whenever the user selects a figure, the page changes because a new figure appears, and therefore the page needs to be refreshed to show the menu. When it comes to rendering very large SVG files, this introduces latencies that make the web application unusable. For this reason, we implemented many actions that do not require a complete re-rendering of a page using Ajax requests. Examples of such actions are: rendering of context menus, highlighting figures, displaying figure information, displaying and adding annotations.
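As an illustration, such a partial update can be sketched with a Seaside-style Ajax updater. The fragment below is hypothetical—the component method, `renderAnnotationsOn:` and the `annotations` element id are invented for the example, and Churrasco’s actual implementation may differ: clicking the anchor triggers an asynchronous request whose response replaces only the annotation panel, without re-rendering the SVG visualization.

```smalltalk
renderAnnotationLinkOn: html
	"replace only the contents of the DIV with id 'annotations'"
	html anchor
		onClick: ((html updater)
			id: 'annotations';
			callback: [:r | self renderAnnotationsOn: r]);
		with: 'Show annotations'
```

The SVG figure itself stays untouched in the DOM; only the small annotation panel is exchanged, which avoids the latency of re-transmitting the whole visualization.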

Debugging. A barrier to developing web applications is the lack of support for debugging. Although there are applications such as Firebug,2 providing HTML inspection, JavaScript debugging and DOM exploration, the debugging support is not comparable with the one given in mainstream integrated development environments such as Eclipse.

A.3.2 Wrapping Up

All in all, while building Churrasco, we learned that creating a web application that supports interactive visualizations implies a number of technological challenges. On the other hand, web applications introduce a number of novel ways to interact with systems—such as collaborative software analysis—that will open up new research directions.

A.4 Related Work

We are not aware of software visualization tools which support collaboration. However, a number of approaches support web-based software evolution visualization and analysis. Sarma et al. introduced Tesseract [SMWH09], a Flash-based tool that provides interactive visualizations of relationships between files, developers, bugs, and e-mails. Tesseract features four cross-linked displays that show: (1) the project activity over time with respect to the number of commits and number of communications; (2) a graph of change coupling dependencies among files; (3) a graph of communication dependencies among developers; and (4) the defects affecting the considered files. The main difference between Churrasco and Tesseract resides in the problem they address: While Churrasco is aimed at understanding a system’s evolution, the goal of Tesseract is to support the analysis of the socio-technical relations between code, developers, and issues. Another web-based visualization tool is the Java applet version of Shrimp3 [SM95]: It displays architectural diagrams using nested graphs where graph nodes embed source code fragments. The tool is fully interactive, providing animated panning, zooming, and fisheye-view actions. While Shrimp supports the exploration of software architecture, Churrasco focuses on software evolution analysis. Beyer and Hassan proposed the Evolution Storyboards [BH06], a visualization technique that offers dynamic views. The storyboards, rendered as SVG files (thus visible in a web browser), depict the history of a project using a sequence of panels, each representing a particular time period in the life of a software project. This visualization is only partially interactive, i.e., it only shows the names of the entities represented by the SVG figures. In contrast, the views offered in the Churrasco web portal are fully interactive, providing context menus, spawning and navigation capabilities. Lungu et al.
presented a web-based approach to visualize entire software repositories [LLGH07]. Their technique, validated on Smalltalk repositories, focuses on understanding the structure of the organization behind the repositories, by studying the interaction among the developers. They also provide views to see the evolution of the repositories over time. Both approaches are fully interactive and web-based, but while Lungu’s approach focuses on the evolution of entire repositories with coarse-grained views, Churrasco targets single projects with fine-grained visualizations.

2 Firebug is available at http://getfirebug.com
3 Available at http://www.thechiselgroup.com/shrimp

Mancoridis et al. introduced REportal, a web portal for the reverse engineering of software systems [MSC+01]. REportal allows users to upload their code (Java or C++) and then to browse, analyze and query it. These services are implemented by reverse engineering tools developed by the authors over the years. REportal supports software analysis through browsing and querying, whereas Churrasco supports the analysis by means of interactive visualizations. Nentwich et al. presented BOX, a portable, distributed and interoperable approach to browse UML models [NEFZ00]. BOX translates a UML model represented in XMI into VML (Vector Markup Language), which can be directly displayed in a web browser. BOX enables software engineers to access and review UML models without the need to purchase licenses of the tools that produced them. While BOX is focused on design documents, such as UML diagrams, in Churrasco we focus on the history and structure of software systems. Finnigan et al. developed the Software Bookshelf, a web-based paradigm for the presentation and navigation of information representing large software systems [FHK+97]. The Software Bookshelf integrates various parsing and analysis tools in the backend, providing a means to capture, organize, and manage information about software systems. Differently from Churrasco, the goal of the Software Bookshelf is to support the re-documentation and migration of legacy systems. A major difference between all the mentioned approaches and Churrasco is that these techniques support single-user software evolution analysis, while Churrasco supports collaborative analysis.

A.5 Summary

The need for collaboration is receiving increasing attention in software development. We argue that collaboration also has an important role in software evolution analysis. In this appendix, we presented how our Churrasco framework supports collaborative software analysis, by means of interactive visualizations and persistent annotations. Two major features of Churrasco, related to collaboration, are:

• Accessibility. The tool is fully web-based, i.e., the entire analysis of a software system— from the initial model creation to the final study—can be performed from a web browser, without having to install or configure any tool.

• Modeling of results. Churrasco relies on a centralized database and supports annotations. Thus, the knowledge of the system, gained during the analysis, can be incrementally stored on the model of the system itself.

We showed, through two collaboration experiments with respectively eight and four participants, that Churrasco is a good means to support collaborative software evolution analysis.

Appendix B

The Meta-Base

The Meta-base [DLP07b] is the core module of the Churrasco framework [DL08b], which provides flexibility and persistence to any meta-model in general, and to our Mevo meta-model in particular. Like Churrasco, the Meta-base is developed in Smalltalk. The tool takes as input a meta-model described in EMOF and outputs a descriptor, which defines the mapping between the object instances of the meta-model, i.e., the model, and tables in the database. EMOF (Essential Meta Object Facility) is a subset of MOF,1 a meta-meta-model used to describe meta-models. The Meta-base uses a Smalltalk implementation of EMOF called Meta, and it ensures persistence with the object-relational module GLORP [Kni00] (Generic Lightweight Object-Relational Persistence). The Meta-base provides flexibility by dynamically and automatically adapting to any provided meta-model: To do so, it generates descriptors of the mapping between the database and the meta-model. This allows us to dynamically both modify and extend our meta-model. The tool can be used to exchange models (or only parts of them) through a database, thus supporting interoperability.

B.1 Object Persistence

The Meta-base relies on GLORP2 for object persistence. GLORP is a powerful object-relational mapping layer for Smalltalk. It allows us to define the mapping between Smalltalk objects and tables/rows in a relational database (DB from now on). Once this mapping is defined, objects can be read from and written to the DB in a completely transparent way, without having to write any SQL statement.

Example

We want to define the mapping for simplified versions of FAMIXClass and FAMIXMethod. FAMIXClass has a name, belongs to a package and has a collection of methods, while FAMIXMethod has just a name.

1 MOF and EMOF are standards defined by the OMG (Object Management Group) for Model Driven Engineering. For more details about MOF and EMOF consult the specification at: http://www.omg.org/mof/
2 Available at http://www.glorp.org


To define the mapping, we need to create a new class, FamixDescriptorSystem, inheriting from Glorp.DescriptorSystem. In this class we add methods (1) to define the structure of the database tables corresponding to FAMIXClass and FAMIXMethod (see Listing B.1) and (2) to define the mapping between the tables and the classes (see Listing B.2).

1 tableForFAMIXClass: aTable
2     aTable createFieldNamed: 'Id' type: platform serial.
3     aTable createFieldNamed: 'Name' type: (platform varChar: 50).
4     aTable createFieldNamed: 'PackagedIn' type: (platform integer).
5
6 tableForFAMIXMethod: aTable
7     aTable createFieldNamed: 'Id' type: platform serial.
8     aTable createFieldNamed: 'Name' type: (platform varChar: 50).
9     aTable createFieldNamed: 'BelongsTo' type: (platform integer).

Listing B.1. The code snippet to specify the structure of the database tables corresponding to FAMIXClass and FAMIXMethod

1 descriptorForFAMIXClass: aDescriptor
2     | t tMethod tAttribute tPackage |
3     t := self tableNamed: 'Class'.
4     tMethod := self tableNamed: 'Method'.
5     tAttribute := self tableNamed: 'Attribute'.
6     tPackage := self tableNamed: 'Package'.
7     aDescriptor table: t.
8     "direct mappings"
9     aDescriptor addMapping: (DirectMapping from: #dbId to: (t fieldNamed: 'Id')).
10    aDescriptor addMapping: (DirectMapping from: #name to: (t fieldNamed: 'Name')).
11    "one-to-one mapping"
12    (aDescriptor newMapping: OneToOneMapping)
13        attributeName: #packagedIn;
14        referenceClass: FAMIXPackage;
15        mappingCriteria:
16            (Join from: (t fieldNamed: 'PackagedIn') to: (tPackage fieldNamed: 'Id')).
17    "one-to-many mapping"
18    (aDescriptor newMapping: OneToManyMapping)
19        attributeName: #methods;
20        referenceClass: FAMIXMethod;
21        join: (Join from: (t fieldNamed: 'Id') to: (tMethod fieldNamed: 'BelongsTo')).
22    ^aDescriptor
23
24 descriptorForFAMIXMethod: aDescriptor
25    | t |
26    t := self tableNamed: 'Method'.
27    aDescriptor table: t.
28    "direct mappings"
29    aDescriptor addMapping: (DirectMapping from: #dbId to: (t fieldNamed: 'Id')).
30    aDescriptor addMapping: (DirectMapping from: #name to: (t fieldNamed: 'Name')).
31    ^aDescriptor

Listing B.2. The code snippet to define the mappings between the classes FAMIXClass and FAMIXMethod and the database tables/rows

Listing B.2 includes three kinds of mapping: direct, one-to-one and one-to-many.

Direct Mapping (Listing B.2, rows 9, 10, 29 and 30). It expresses simple relationships between instance variables and table columns. It is used when the “type”3 of the instance variable is directly supported by the DB, for example Integer, Text, Date, Timestamp, etc.

One-to-one Mapping (Listing B.2, rows 12–16). It describes the relationship between FAMIXClass and FAMIXPackage. This mapping, declared in the FAMIXClass descriptor, defines the following properties:

• The attribute name, i.e., the name of the instance variable getter.
• The reference class: It specifies the class of the objects, and therefore the corresponding table in the DB. In the considered case the class is FAMIXPackage, which corresponds to the Package table (not shown for brevity).
• The join expression: It defines which columns of the two tables are linked. GLORP uses this information to create the appropriate SQL join query to fetch the data from the DB and create the objects.

One-to-many Mapping (Listing B.2, rows 18–21). It expresses that a FAMIXClass can have several FAMIXMethods. The structure of the mapping is similar to the one-to-one mapping, with two differences. First, the attribute name refers to a collection of objects instead of a single object. All these objects have to be instances of the class “referenceClass” (FAMIXMethod). Second, the data will be written in the table corresponding to the reference class (FAMIXMethod), instead of the current class (FAMIXClass). This is because each row in the FAMIXMethod table refers to a single row in the FAMIXClass table (the container class), whereas each row in the FAMIXClass table can refer to multiple rows in the FAMIXMethod table.

A last type of mapping, not used in the code snippet, is many-to-many. It expresses the most generic relationship by means of a link table: If two classes have this kind of relationship, the relationship itself is stored in a separate link table in the DB.
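For completeness, a many-to-many declaration can be sketched in the same style as Listing B.2. The fragment below is a hypothetical example based on the Person–Exam relationship of the example in Section B.3 (the table and field names mirror Figure B.4, and the exact GLORP selectors for configuring the link table may differ between GLORP versions):

```smalltalk
descriptorForPerson: aDescriptor
	| t tLink |
	t := self tableNamed: 'Person'.
	tLink := self tableNamed: 'PersonExamLink'.   "the link table"
	aDescriptor table: t.
	"many-to-many mapping: the relationship rows live in PersonExamLink,
	 which holds one foreign key per side (personid and examid)"
	(aDescriptor newMapping: ManyToManyMapping)
		attributeName: #enrolExams;
		referenceClass: Exam.
	^aDescriptor
```

Each row of the link table pairs one Person row with one Exam row, so neither class table needs a foreign-key column for the relationship.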

What about Inheritance?

We want to add to our simplified FAMIX meta-model a superclass of FAMIXClass, namely FAMIXAbstractNamedEntity, which has a name as instance variable. When adding this superclass, we also remove the name instance variable from the FAMIXClass class, since it is inherited from FAMIXAbstractNamedEntity. GLORP provides two techniques to manage inheritance: filtered—exploited in the Meta-base—and horizontal. In filtered inheritance, all the classes are represented in a single table, with a discriminator field indicating which subclass each row belongs to. The table has the union of all possible fields for all classes. In horizontal inheritance, each concrete class is represented in its own table. Each table duplicates the fields that are in common between the concrete classes. Figure B.1 shows the two approaches for our example. With horizontal inheritance (Figure B.1(a)) the name column is duplicated in both tables; with filtered inheritance (Figure B.1(b)) the PackagedIn value is nil for the AbstractNamedEntity EntityA and there is the “Class” discriminator column.

3 To use GLORP we have to assume that an instance variable is always of the same class, called type.

AbstractNamedEntity          Class
Id  Name                     Id  Name    PackagedIn
1   EntityA                  1   ClassA  PackageA

(a) Horizontal Inheritance

AbstractNamedEntityAndClass
Id  Name     PackagedIn  Class
1   EntityA  -           FAMIXAbstractNamedEntity
2   ClassA   PackageA    FAMIXClass

(b) Filtered Inheritance

Figure B.1. Types of inheritance in GLORP

Reading & Writing

Once we have defined the mapping between the tables and the objects, i.e., completed the descriptor class, reading and writing objects is straightforward. The code snippet depicted in Listing B.3 reads all the FAMIXClass objects from a DB, modifies them and stores them back in the DB.

1 famixClasses := session readManyOf: FAMIXClass.
2 "the FAMIXClass objects are modified"
3 session registerAll: famixClasses.

Listing B.3. Reading and writing objects from/to the DB

The variable “session” is an object storing the connection with the DB. It is also possible to retrieve only the objects satisfying a given condition, as shown below:

1 famixClasses := session readManyOf: FAMIXClass
2     where: [:each | each isAbstract].

When we read a FAMIXClass from the DB, we retrieve—on demand—all the objects which have a relationship with it (in our example FAMIXMethod and FAMIXPackage). This means that the message “readManyOf:” sent to the session object retrieves FAMIXClasses only, not FAMIXMethods and FAMIXPackages. If we send the getter message “methods” or “packagedIn” to a FAMIXClass object, the collection of FAMIXMethod objects or the FAMIXPackage object is dynamically read from the DB.
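As an illustration of this lazy behavior, the sketch below (assuming an open GLORP session, as in Listing B.3) annotates which statements actually hit the database:

```smalltalk
famixClasses := session readManyOf: FAMIXClass.  "one query on the Class table only"
aClass := famixClasses first.
aClass methods.     "first access: the join query on the Method table is issued now"
aClass packagedIn.  "likewise, the owning FAMIXPackage is fetched on demand"
```

This keeps the initial read cheap: related objects are materialized only when the corresponding getters are actually sent.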

B.2 Generating Descriptors with the Meta-base

The Meta-base takes as input a meta-model described in Meta and outputs a GLORP class descriptor, which defines the mapping between the object instances of the meta-model, i.e., the model, and the database. Figure B.2 shows how the Meta-base works.


Figure B.2. Generating and using GLORP descriptors with the Meta-base

Using the Meta-base is straightforward. Suppose we have a meta-model described in Meta (a Smalltalk implementation of EMOF): To create the GLORP class descriptor with the Meta-base, we can use the following code snippet:

1 classes := OrderedCollection with: ClassA with: ClassB...
2 ^ClassDBDescriptorGenerator uniqueInstance
3     createClassDescriptorForClasses: classes
4     named: 'Descriptor' in: aPackage.

This code generates the descriptor class named “Descriptor,” located in the “aPackage” package. The Meta-base uses the generated descriptor to define the mapping between the meta-model and the database. To get a connection with the database—which adheres to the mapping—we use the code below:

1 db := MetaDBBridge uniqueInstance.
2 db descriptorClass: Descriptor.
3 db login: ((Login new)
4     username: 'user'; password: 'pass';
5     connectString: 'databaseServerLocation';
6     database: PostgreSQLPlatform new; yourself).

Once we have created the database connection, we can generate the tables on the database (if the database is empty) with:

1 db createTables

and read/write objects of the model with:

1 someClasses := db session readManyOf: ClassA.
2 someClasses addAll: (db session readManyOf: ClassB).
3 "someClasses are modified"
4 db session registerAll: someClasses.

B.3 Example

We present a simple example which shows how the Meta-base supports inheritance and all types of relationships (direct, one-to-one, one-to-many and many-to-many).


Figure B.3. The UML class diagram of our example

Figure B.3 shows the UML class diagram of an example meta-model, while the code snippet below shows part of its Smalltalk EMOF description (the classes Person and Passport).

1 Person>>metamodelAge
2     ^(EMOF.Property name: #age type: Number)
3 Person>>metamodelName
4     ^(EMOF.Property name: #name type: String)
5 Person>>metamodelPassport
6     ^(EMOF.Property name: #passport
7         opposite: #owner type: Passport)
8 Person>>metamodelSubscription
9     ^(EMOF.Property name: #subscription opposite: #owner
10        type: Subscription multiplicity: #many)
11 Passport>>metamodelExpireDate
12    ^(EMOF.Property name: #expireDate type: Date)
13 Passport>>metamodelNumber
14    ^(EMOF.Property name: #number type: Number)
15 Passport>>metamodelOwner
16    ^(EMOF.Property name: #owner
17        opposite: #passport type: Person)

Once the meta-description is defined, as shown in the code snippet, we can generate the GLORP descriptor and read and write object instances of the meta-model from and to the database, as previously described. Figure B.4 shows a screenshot of some automatically generated and populated database tables.

Person
dbid  school       studentid  passport  age  name               class
1     UZH          2          4         22   Anna Cazzulani     Student
2     USI          3          5         20   Giorgio Tiepoli    Student
3     NULL         NULL       1         27   Peppe Castiglia    Person
4     NULL         NULL       6         28   Michele Vaaanzo    Person
5     NULL         NULL       2         31   Jhonny Bravo       Person
6     Politecnico  1          3         18   Carmelo Varicella  Student

[The figure also shows the generated PersonExamLink, StudentId, and Exam tables.]

Figure B.4. Examples of generated and populated database tables

B.4 Summary

The Meta-base provides flexibility and persistence to any meta-model described according to the EMOF specification. It supports interoperability, as models (or just model entities) can be exchanged through a database. The Meta-base takes as input a meta-model description and automatically generates the object persistence descriptor, i.e., the mapping between the objects (instances of the meta-model) and the generated database tables. The Meta-base manages direct, one-to-one, one-to-many, and many-to-many relationships among meta-model entities. It also supports inheritance between meta-model classes by means of filtered inheritance.
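With filtered inheritance, a subclass shares the table of its superclass, and a discriminator column (the class column in Figure B.4) filters the rows belonging to each class. A sketch of the resulting query behavior, assuming the example meta-model of Section B.3:

```smalltalk
"Person and Student rows live in the same Person table;
the 'class' column discriminates them (cf. Figure B.4)."
allPeople := db session readManyOf: Person.
students := db session readManyOf: Student.
"students contains only the rows whose class column is 'Student'"
```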

[AADP+08] Giuliano Antoniol, Kamel Ayari, Massimiliano Di Penta, Foutse Khomh, and Yann- Gaël Guéhéneuc. Is it a bug or an enhancement?: A text-based approach to clas- sify change requests. In Proceedings of the 2008 Conference of the Center for Ad- vanced Studies on Collaborative Research (CASCON 2008), pages 304–318. ACM, 2008.

[ABGM99] David L. Atkins, Thomas Ball, Todd L. Graves, and Audris Mockus. Using version control data to evaluate the impact of software tools. In Proceedings of the 21st International Conference on Software Engineering (ICSE 1999), pages 324–333. ACM, 1999.

[ACC+02] Giuliano Antoniol, Gerardo Canfora, Gerardo Casazza, Andrea De Lucia, and Et- tore Merlo. Recovering traceability links between code and documentation. IEEE Transactions on Software Engineering, 28(10):970–983, 2002.

[Ada09] Bram Adams. Co-evolution of source code and the build system. In PhD Sym- posium at the 25th IEEE International Conference on Software Maintenance (ICSM 2009), pages 461–464. IEEE CS, 2009.

[AHM06] John Anvik, Lyndon Hiew, and Gail C. Murphy. Who should fix this bug? In Proceedings of the 28th International Conference on Software Engineering (ICSE 2006), pages 361–370. ACM Press, 2006.

[APGP05] Giuliano Antoniol, Massimiliano Di Penta, Harald Gall, and Martin Pinzger. To- wards the integration of versioning systems, bug reports and source code meta- models. Electronic Notes in Theoretical Computer Science, 127(3):87–99, 2005.

[APM04] Giuliano Antoniol, Massimiliano Di Penta, and Ettore Merlo. An automatic ap- proach to identify class evolution discontinuities. In Proceedings of the 7th Inter- national Workshop on Principles of Software Evolution (IWPSE 2004), pages 31–40. IEEE CS, 2004.

[Art88] Lowell Jay Arthur. Software Evolution: The Software Maintenance Challenge. John Wiley and Sons, 1988.

[AV09] Jorge Aranda and Gina Venolia. The secret life of bugs: Going past the errors and omissions in software repositories. In Proceedings of the 31st International Conference on Software Engineering (ICSE 2009), pages 298–308. IEEE CS Press, 2009.

217 218 Bibliography

[BAB+93] David M. Butler, James C. Almond, R. Daniel Bergeron, Ken W. Brodlie, and Robert B. Haber. Visualization reference models. In Proceedings of the 4th Confer- ence on Visualization (VIS 1993), pages 337–342. IEEE Computer Society, 1993.

[BAHS97] Thomas Ball, Jung-Min Kim Adam, A. Porter Harvey, and P. Siy. If your version control system could talk. In Proceedings of the ICSE Workshop on Process Modeling and Empirical Studies of Software Engineering, 1997.

[BB09] Adrian Bachmann and Abraham Bernstein. Data retrieval, processing and linking for software process data analysis. Technical report, University of Zurich, Depart- ment of Informatics, 2009.

[BBA+09] Christian Bird, Adrian Bachmann, Eirik Aune, John Duffy, Abraham Bernstein, Vladimir Filkov, and Premkumar Devanbu. Fair and balanced?: Bias in bug-fix datasets. In Proceedings of the 7th Joint Meeting of the European Software Engineer- ing Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2009), pages 121–130. ACM, 2009.

[BBM96] Victor R. Basili, Lionel C. Briand, and Walcélio L. Melo. A validation of object- oriented design metrics as quality indicators. IEEE Transactions on Software Engi- neering, 22(10):751–761, 1996.

[BDL10] Alberto Bacchelli, Marco D’Ambros, and Michele Lanza. Are popular classes more defect prone? In Proceedings of the 13th International Conference on Fundamental Approaches to Software Engineering (FASE 2010), pages 59–73. ARCoSS LNCS Springer, 2010.

[BDLR09] Alberto Bacchelli, Marco D’Ambros, Michele Lanza, and Romain Robbes. Bench- marking lightweight techniques to link e-mails and source code. In Proceedings of the 16th IEEE Working Conference on Reverse Engineering (WCRE 2009), pages 205–214. IEEE CS Press, 2009.

[BDS+00] Mike Beedle, Martine Devos, Yonat Sharon, Ken Schwaber, and Jeff Sutherland. Scrum: An extension pattern language for hyperproductive software develop- ment, 2000.

[BDW99] Lionel C. Briand, John W. Daly, and Jürgen Wüst. A unified framework for cou- pling measurement in object-oriented systems. IEEE Transactions on Software En- gineering, 25(1):91–121, 1999.

[BE96] Timothy Ball and Stephen Eick. Software visualization in the large. IEEE Com- puter, 29(4):33–43, 1996.

[Bec99] Kent Beck. Extreme Programming Explained: Embrace Change. Addison-Wesley Professional, October 1999.

[BEJWKG05] Jennifer Bevan, Jr. E. James Whitehead, Sunghun Kim, and Michael Godfrey. Fa- cilitating software evolution research with kenyon. In Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC/FSE-13), pages 177–186. ACM, 2005. 219 Bibliography

[BEP07] Abraham Bernstein, Jayalath Ekanayake, and Martin Pinzger. Improving defect prediction using temporal features and non linear models. In Proceedings of the International Workshop on Principles of Software Evolution (IWPSE 2007), pages 11–18. IEEE CS Press, September 2007.

[BGA06] Salah Bouktif, Yann-Gael Gueheneuc, and Giuliano Antoniol. Extracting change- patterns from CVS repositories. In Proceedings of the 13th Working Conference on Reverse Engineering (WCRE 2006), pages 221–230. IEEE Computer Society, 2006.

[BGD+06] Christian Bird, Alex Gourley, Prem Devanbu, Michael Gertz, and Anand Swami- nathan. Mining email social networks. In Proceedings of the 3rd International Workshop on Mining Software Repositories (MSR 2006), pages 137–143. ACM, 2006.

[BGD+07] Christian Bird, Alex Gourley, Prem Devanbu, Anand Swaminathan, and Greta Hsu. Open borders? immigration in open source projects. In Proceedings of the 4th International Workshop on Mining Software Repositories (MSR 2007), pages 6–13. IEEE Computer Society, 2007.

[BH06] Dirk Beyer and Ahmed E. Hassan. Animated visualization of software history using evolution storyboards. In Proceedings of the 13th Working Conference on Reverse Engineering (WCRE 2006), pages 199–210. IEEE Computer Society, 2006.

[BJS+08] Nicolas Bettenburg, Sascha Just, Adrian Schröter, Cathrin Weiss, Rahul Premraj, and Thomas Zimmermann. What makes a good bug report? In Proceedings of the 16th International Symposium on Foundations of Software Engineering (FSE 2008), pages 308–318. ACM, November 2008.

[Boe88] B. W. Boehm. A spiral model of software development and enhancement. IEEE Computer, 21(5):61–72, 1988.

[BPSZ10] Silvia Breu, Rahul Premraj, Jonathan Sillito, and Thomas Zimmermann. Informa- tion needs in bug reports: Improving cooperation between developers and users. In Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work (CSCW 2010), pages 301–310. ACM, 2010.

[BPZK08] Nicolas Bettenburg, Rahul Premraj, Thomas Zimmermann, and Sunghun Kim. Duplicate bug reports considered harmful... really? In Proceedings of the 24th IEEE International Conference on Software Maintenance (ICSM 2008), pages 337– 345. IEEE CS, September 2008.

[BR00] Keith H. Bennett and Vaclav T. Rajlich. The Future of Software Engineering, chapter Software Maintenance and Evolution: A Roadmap, pages 75–87. ACM Press, 2000.

[BS98] Aaron B. Binkley and Stephen R. Schach. Validation of the coupling dependency metric as a predictor of run-time failures and maintenance measures. In Pro- ceedings of the 20th International Conference on Software Engineering (ICSE 1998), pages 452–455. IEEE Computer Society, 1998. 220 Bibliography

[BZ06] Silvia Breu and Thomas Zimmermann. Mining aspects from version history. In Proceedings of the 21st IEEE International Conference on Automated Software Engi- neering (ASE 2006), pages 221–230. IEEE Computer Society, 2006.

[CAT07] Fanny Chevalier, David Auber, and Alexandru Telea. Structural analysis and vi- sualization of c++ code evolution using syntax trees. In Proceedings of the Ninth International Workshop on Principles of Software Evolution (IWPSE 2007), pages 90–97. ACM, 2007.

[CK94] Shyam R. Chidamber and Chris F. Kemerer. A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 20(6):476–493, June 1994.

[CKN+03] Christian Collberg, Stephen Kobourov, Jasvir Nagra, Jacob Pitts, and Kevin Wampler. A system for graph-based visualization of the evolution of software. In Proceedings of the 2003 ACM Symposium on Software Visualization (SoftVis 2003), pages 77–86. ACM, 2003.

[CLL99] Peter Coad, Jeff de Luca, and Eric Lefebvre. Java Modeling Color with Uml: Enter- prise Components and Process with Cdrom. Prentice Hall PTR, 1999.

[Coc01] Alistair Cockburn. Agile Software Development. Addison-Wesley, 2001.

[Cor89] T. A. Corbi. Program understanding: challenge for the 1990’s. IBM Systems Jour- nal, 28(2):294–306, 1989.

[CZH+08] Bas Cornelissen, Andy Zaidman, Danny Holten, Leon Moonen, Arie van Deursen, and Jarke J. van Wijk. Execution trace analysis through massive sequence and circular bundle views. Journal of Systems and Software, 81(12):2252–2268, 2008.

[CZVRvD09] Bas Cornelissen, Andy Zaidman, Bart Van Rompaey, and Arie van Deursen. Trace visualization for program comprehension: A controlled experiment. In Proceed- ings of the 17th International Conference on Program Comprehension (ICPC 2009), pages 100–109. IEEE Computer Society, 2009.

[Dav95] Alan Mark Davis. 201 Principles of Software Development. McGraw-Hill, 1995.

[DB08] Annette J. Dobson and Adrian Barnett. An Introduction to Generalized Linear Mod- els, Third Edition. Chapman & Hall/CRC Texts in Statistical Science, 2008.

[DBL10] Marco D’Ambros, Alberto Bacchelli, and Michele Lanza. On the impact of design flaws on software defects. In Proceedings of the 10th International Conference on Quality Software (QSIC 2010), pages 23–31. IEEE CS Press, 2010.

[DGK06] Stéphane Ducasse, Tudor Gîrba, and Adrian Kuhn. Distribution map. In Pro- ceedings of the 22nd IEEE International Conference on Software Maintenance (ICSM 2006), pages 203–212. IEEE Computer Society, 2006.

[DGLP08] Marco D’Ambros, Harald Gall, Michele Lanza, and Martin Pinzger. Analyzing software repositories to understand software evolution. In Software Evolution, chapter 3, pages 37–67. Springer, 2008. 221 Bibliography

[DGN05] Stéphane Ducasse, Tudor Gîrba, and Oscar Nierstrasz. Moose: An agile reengi- neering environment. In Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foun- dations of Software Engineering (ESEC/FSE-13), pages 99–102. ACM, 2005. Tool demo.

[DL05] Stéphane Ducasse and Michele Lanza. The class blueprint: Visually supporting the understanding of classes. IEEE Transactions on Software Engineering, 31(1):75–90, 2005.

[DL06a] Marco D’Ambros and Michele Lanza. Applying the evolution radar to postgresql. In Proceedings of the 3rd International Workshop on Mining Software Repositories (MSR 2006), pages 177–178. ACM, 2006.

[DL06b] Marco D’Ambros and Michele Lanza. Reverse engineering with logical coupling. In Proceedings of the 13th Working Conference on Reverse Engineering (WCRE 2006), pages 189–198. IEEE CS Press, 2006.

[DL06c] Marco D’Ambros and Michele Lanza. Software bugs and evolution: A visual ap- proach to uncover their relationship. In Proceedings of the 10th IEEE European Conference on Software Maintenance and Reengineering (CSMR 2006), pages 227– 236. IEEE CS Press, 2006.

[DL07] Marco D’Ambros and Michele Lanza. Bugcrawler: Visualizing evolving software systems. In Proceedings of the 11th IEEE European Conference on Software Mainte- nance and Reengineering (CSMR 2007), pages 333–334. IEEE CS Press, 2007. Tool demo.

[DL08a] Marco D’Ambros and Michele Lanza. Churrasco: Supporting collaborative soft- ware evolution analysis. In Proceedings of the 1st International Workshop on Ad- vanced Software Development Tools and Techniques (WASDeTT 2008), 2008.

[DL08b] Marco D’Ambros and Michele Lanza. A flexible framework to support collabora- tive software evolution analysis. In Proceedings of the 12th IEEE European Confer- ence on Software Maintenance and Reengineering (CSMR 2008), pages 3–12. IEEE CS Press, 2008.

[DL09] Marco D’Ambros and Michele Lanza. Visual software evolution reconstruction. Journal of Software Maintenance and Evolution: Research and Practice, 21(3):217– 232, May 2009.

[DL10] Marco D’Ambros and Michele Lanza. Distributed and collaborative software evo- lution analysis with Churrasco. Journal of Science of Computer Programming, 75(4):276–287, 2010.

[DLL06] Marco D’Ambros, Michele Lanza, and Mircea Lungu. The Evolution Radar: Visu- alizing integrated logical coupling information. In Proceedings of the 3rd Interna- tional Workshop on Mining Software Repositories (MSR 2006), pages 26–32. ACM, 2006. 222 Bibliography

[DLL09] Marco D’Ambros, Michele Lanza, and Mircea Lungu. Visualizing co-change in- formation with the Evolution Radar. IEEE Transactions on Software Engineering, 35(5):720 – 735, 2009.

[DLLR09] Marco D’Ambros, Mircea Lungu, Michele Lanza, and Romain Robbes. Promises and perils of porting software visualization tools to the web. In Proceedings of the 11th IEEE International Symposium on Web Systems Evolution (WSE 2009), pages 109–118. IEEE CS Press, 2009.

[DLLR10] Marco D’Ambros, Michele Lanza, Mircea Lungu, and Romain Robbes. On porting software visualization tools to the web. International Journal on Software Tools for Technology Transfer, 2010. To appear.

[DLP07a] Marco D’Ambros, Michele Lanza, and Martin Pinzger. “A Bug’s Life” — Visualizing a bug database. In Proceedings of the 4th IEEE International Workshop on Visual- izing Software For Understanding and Analysis (VISSOFT 2007), pages 113–120. IEEE CS Press, 2007.

[DLP07b] Marco D’Ambros, Michele Lanza, and Martin Pinzger. The Metabase: Generating object persistency using meta descriptions. In Proceedings of the 1st Workshop on FAMIX and Moose in Reengineering (FAMOOSR 2007), 2007.

[DLR05] Stéphane Ducasse, Michele Lanza, and Romain Robbes. Multi-level method un- derstanding using microprints. In Proceedings of the 3rd IEEE International Work- shop on Visualizing Software for Understanding and Analysis (VISSOFT 2005), pages 33–38. IEEE Computer Society, 2005.

[DLR07] Stéphane Ducasse, Adrian Lienhard, and Lukas Renggli. Seaside: A flexible en- vironment for building dynamic web applications. IEEE Software, 24(5):56–63, 2007.

[DLR09] Marco D’Ambros, Michele Lanza, and Romain Robbes. On the relationship be- tween change coupling and software defects. In Proceedings of the 16th IEEE Working Conference on Reverse Engineering (WCRE 2009), pages 135–144. IEEE CS Press, 2009.

[DLR10] Marco D’Ambros, Michele Lanza, and Romain Robbes. An extensive comparison of bug prediction approaches. In Proceedings of the 7th IEEE Working Conference on Mining Software Repositories (MSR 2010), pages 31–40. IEEE CS Press, 2010.

[DPS+07] S. Ducasse, D. Pollet, M. Suen, H. Abdeen, and I. Alloui. Package surface blueprints: Visually supporting the understanding of package relationships. In Proceedings of the 23th IEEE International Conference on Software Maintenance (ICSM 2007), pages 94–103. IEEE CS Press, 2007.

[DTD01] Serge Demeyer, Sander Tichelaar, and Stéphane Ducasse. FAMIX 2.1 — The FAMOOS Information Exchange Model. Technical report, University of Bern, 2001.

[ELH+05] Jacky Estublier, David Leblang, André van der Hoek, Reidar Conradi, Geoffrey Clemm, Walter Tichy, and Darcy Wiborg-Weber. Impact of software engineering 223 Bibliography

research on the practice of software configuration management. ACM Transactions on Software Engineering and Methodology, 14(4):383–430, 2005.

[EMM01] Khaled El Emam, Walcelio Melo, and Javam C. Machado. The prediction of faulty classes using object-oriented design metrics. Journal of Systems and Software, 56(1):63–75, 2001.

[Erl00] Len Erlikh. Leveraging legacy system dollars for e-business. IT Professional, 2(3):17–23, 2000.

[ESS92] Stephen G. Eick, Joseph L. Steffen, and Eric E. Jr. Sumner. SeeSoft – A tool for visualizing line oriented software statistics. IEEE Transactions on Software Engineering, 18(11):957–968, 1992.

[EZS+08] Marc Eaddy, Thomas Zimmermann, Kaitin D. Sherwood, Vibhav Garg, Gail C. Murphy, Nachiappan Nagappan, and Alfred V. Aho. Do crosscutting concerns cause defects? IEEE Transaction on Software Engineering, 34(4):497–515, 2008.

[FBB+99] Martin Fowler, Kent Beck, John Brant, William Opdyke, and Don Roberts. Refac- toring: Improving the Design of Existing Code. Addison Wesley, 1999.

[FD04] Jon Froehlich and Paul Dourish. Unifying artifacts and activities in a visual tool for distributed software development teams. In Proceedings of the 26th International Conference on Software Engineering (ICSE 2004), pages 387–396. IEEE Computer Society, 2004.

[FG04] Michael Fischer and Harald Gall. Visualizing feature evolution of large-scale soft- ware based on problem and modification report data. Journal of Software Main- tenance and Evolution: Research and Practice, 16(6):385–403, 2004.

[FG06] Michael Fischer and Harald C. Gall. Evograph: A lightweight approach to evo- lutionary and structural analysis of large software systems. In Proceedings of the 13th Working Conference on Reverse Engineering (WCRE 2006), pages 179–188. IEEE Computer Society, 2006.

[FH83] R. K. Fjeldstad and W.T. Hamlen. Application program maintenance study: Report to our respondents. In Proceedings GUIDE 48, 1983.

[FHK+97] P. Finnigan, R. Holt, I. Kalas, S. Kerr, K. Kontogiannis, H. Müller, J. Mylopoulos, S. Perelgut, M. Stanley, and K. Wong. The software bookshelf. IBM Systems Jour- nal, 36(4):564–593, 1997.

[FO00] Norman E. Fenton and Niclas Ohlsson. Quantitative analysis of faults and fail- ures in a complex software system. IEEE Transactions on Software Engineering, 26(8):797–814, 2000.

[FPG03] Michael Fischer, Martin Pinzger, and Harald Gall. Populating a release history database from version control and bug tracking systems. In Proceedings of the 19th IEEE International Conference on Software Maintenance (ICSM 2003), pages 23–32. IEEE Computer Society Press, 2003. 224 Bibliography

[Fro07] Randall Frost. Jazz and the eclipse way of collaboration. IEEE Software, 24(6):114–117, 2007.

[G05ˆ ] Tudor Gîrba. Modeling history to understand software evolution. PhD thesis, Uni- versity of Berne, Berne, 2005.

[GD06] Tudor Gîrba and Stéphane Ducasse. Modeling history to analyze software evo- lution. Journal of Software Maintenance and Evolution: Research and Practice, 18(3):207–236, 2006.

[GDL04] Tudor Gîrba, Stéphane Ducasse, and Michele Lanza. Yesterday’s Weather: Guid- ing early reverse engineering efforts by summarizing the evolution of changes. In Proceedings of the 20th IEEE International Conference on Software Maintenance (ICSM 2004), pages 40–49. IEEE Computer Society, 2004.

[GFS05] Tibor Gyimóthy, Rudolf Ferenc, and István Siket. Empirical validation of object- oriented metrics on open source software for fault prediction. IEEE Transactions on Software Engineering, 31(10):897–910, 2005.

[GHJ98] Harald Gall, Karin Hajek, and Mehdi Jazayeri. Detection of logical coupling based on product release history. In Proceedings of the 14th IEEE International Conference on Software Maintenance (ICSM 1998), pages 190–198. IEEE Computer Society Press, 1998.

[GHJ04] Daniel M. German, Abram Hindle, and Norman Jordan. Visualizing the evolution of software using softchange. In Proceedings of the 16th International Conference on Software Engineering & Knowledge Engineering (SEKE 2004), pages 336–341. ACM Press, 2004.

[GHJV95] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison Wesley, 1995.

[Gil77] Tom Gilb. Software Metrics. Winthrop, 1977.

[Gil81] Tom Gilb. Evolutionary development. SIGSOFT Software Engineering Notes, 6(2):17–17, 1981.

[GJK03] Harald Gall, Mehdi Jazayeri, and Jacek Krajewski. CVS release history data for detecting logical couplings. In Proceedings of the 6th International Workshop on Principles of Software Evolution (IWPSE 2003), pages 13–23. IEEE Computer So- ciety Press, 2003.

[GJR99] Harald Gall, Mehdi Jazayeri, and Claudio Riva. Visualizing software release his- tories: The use of color and third dimension. In Proceedings of the 15th IEEE International Conference on Software Maintenance (ICSM 1999), pages 99–108. IEEE CS Press, 1999.

[GKMS00] T. L. Graves, A. F. Karr, J. S. Marron, and H. Siy. Predicting fault incidence using software change history. IEEE Transactions on Software Engineering, 26(2):653– 661, 2000. 225 Bibliography

[GKSD05] Tudor Gîrba, Adrian Kuhn, Mauricio Seeberger, and Stéphane Ducasse. How de- velopers drive software evolution. In Proceedings of the 8th International Workshop on Principles of Software Evolution (IWPSE 2005), pages 113–122. IEEE CS Press, 2005.

[GLD05] Tudor Gîrba, Michele Lanza, and Stéphane Ducasse. Characterizing the evolution of class hierarchies. In Proceedings of the 9th IEEE European Conference on Software Maintenance and Reengineering (CSMR 2005), pages 2–11. IEEE CS Press, 2005.

[GLW06] Orla Greevy, Michele Lanza, and Christoph Wysseier. Visualizing live software systems in 3D. In Proceedings of the 3rd ACM Symposium on Software Visualization (SoftVis 2006), pages 47–56. ACM, 2006.

[GM98] Todd L. Graves and Audris Mockus. Inferring change effort from configuration management databases. In Proceedings of the 5th International Symposium on Software Metrics (METRICS 1998), pages 267–273. IEEE Computer Society, 1998.

[GW05] Carsten Gorg and Peter Weisgerber. Detecting and visualizing refactorings from software archives. In Proceedings of the 13th International Workshop on Program Comprehension (IWPC 2005), pages 205–214. IEEE Computer Society, 2005.

[Has09] Ahmed E. Hassan. Predicting faults using the complexity of code changes. In Pro- ceedings of the 31st International Conference on Software Engineering (ICSE 2009), pages 78–88. IEEE Computer Society, 2009.

[HB08] Reid Holmes and Andrew Begel. Deep Intellisense: A tool for rehydrating evap- orated information. In Proceedings of the 5th Working Conference on Mining Soft- ware Repositories, pages 23–26. ACM, 2008.

[HEDK06] Christine A. Halverson, Jason B. Ellis, Catalina Danis, and Wendy A. Kellogg. Designing task visualizations to support the coordination of work in software de- velopment. In Proceedings of the 2006 20th Anniversary Conference on Computer Supported Cooperative Work (CSCW 2006), pages 39–48. ACM Press, 2006.

[HH05] Ahmed E. Hassan and Richard C. Holt. The top ten list: Dynamic fault prediction. In Proceedings of the 21st IEEE International Conference on Software Maintenance (ICSM 2005), pages 263–272. IEEE Computer Society, 2005.

[HHM04] Ahmed E. Hassan, Richard C. Holt, and Audris Mockus. MSR 2004: International workshop on mining software repositories. In Proceedings of the 26th International Conference on Software Engineering (ICSE 2004), pages 770–771. IEEE Computer Society, 2004.

[HJK+07] Abram Hindle, Zhen Ming Jiang, Walid Koleilat, Michael W. Godfrey, and Richard C. Holt. YARN: Animating software evolution. In Proceedings of the 4th IEEE International Workshop on Visualizing Software For Understanding and Analysis (VISSOFT 2007), pages 129–136. IEEE CS Press, 2007.

[HLG08] Ahmed E. Hassan, Michele Lanza, and Michael W. Godfrey, editors. Proceedings of the 2008 International Working Conference on Mining Software Repositories, MSR 2008 (Co-located with ICSE), 2008, Proceedings. ACM, 2008. 226 Bibliography

[HMHJ05] Ahmed E. Hassan, Audris Mockus, Richard C. Holt, and Philip M. Johnson. Guest editor’s introduction: Special issue on mining software repositories. IEEE Trans- actions on Software Engineering, 31:426–428, 2005.

[Hol06] Danny Holten. Hierarchical edge bundles: Visualization of adjacency relations in hierarchical data. IEEE Transactions on Visualization and Computer Graphics, 12(5):741–748, 2006.

[HW07] Pieter Hooimeijer and Westley Weimer. Modeling bug report quality. In ASE ’07: Proceedings of the twenty-second IEEE/ACM international conference on Automated software engineering, pages 34–43, New York, NY, USA, 2007. ACM.

[Jac03] E. J. Jackson. A Users Guide to Principal Components. John Wiley & Sons Inc., 2003.

[Jaz02] Mehdi Jazayeri. On architectural stability and evolution. In Reliable Software Technologies-Ada-Europe 2002, pages 13–23. Springer Verlag, 2002.

[JHS02] James A. Jones, Mary Jean Harrold, and John Stasko. Visualization of test infor- mation to assist fault localization. In Proceedings of the 24th International Confer- ence on Software Engineering (ICSE 2002), pages 467–477. ACM, 2002.

[JKZ09] Gaeul Jeong, Sunghun Kim, and Thomas Zimmermann. Improving bug triage with bug tossing graphs. In Proceedings of the the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foun- dations of Software Engineering (ESEC/FSE 2009), pages 111–120. ACM, 2009.

[JPZ08] Sascha Just, Rahul Premraj, and Thomas Zimmermann. Towards the next gener- ation of bug tracking systems. In Proceedings of the 2008 IEEE Symposium on Vi- sual Languages and Human-Centric Computing (VLHCC 2008), pages 82–85. IEEE Computer Society, 2008.

[KAG+96] T. M. Khoshgoftaar, E. B. Allen, N. Goel, A. Nandi, and J. McMullan. Detection of software modules with high debug code churn in a very large legacy system. In Proceedings of the 7th International Symposium on Software Reliability Engineering (ISSRE 1996), pages 364–371. IEEE Computer Society, 1996.

[KBT07] Christoph Kiefer, Abraham Bernstein, and Jonas Tappolet. Mining software repos- itories with iSPARQL and a software evolution ontology. In Proceedings of the 4th International Workshop on Mining Software Repositories (MSR 2007), pages 10–17. IEEE Computer Society, 2007.

[KC98] R. Kazman and S. J. Carrière. View extraction and view fusion in architectural understanding. In Proceedings of the 5th International Conference on Software Reuse (ICSR 1998), pages 290–299. IEEE Computer Society, 1998.

[KLN08] Adrian Kuhn, Peter Loretan, and Oscar Nierstrasz. Consistent layout for thematic software maps. In Proceedings of the 15th Working Conference on Reverse Engineer- ing (WCRE 2008), pages 209–218. IEEE Computer Society, 2008. 227 Bibliography

[KM05] Mik Kersten and Gail C. Murphy. Mylar: A degree-of-interest model for IDEs. In Proceedings of the 4th International Conference on Aspect-Oriented Software Devel- opment (AOSD 2005), pages 159–168. ACM, 2005.

[KM06] Mik Kersten and Gail C. Murphy. Using task context to improve programmer productivity. In Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (SIGSOFT 2006/FSE-14), pages 1–11. ACM, 2006.

[KM08] Andrew J. Ko and Brad A. Myers. Debugging reinvented: Asking and answering why and why not questions about program behavior. In Proceedings of the 30th International Conference on Software Engineering (ICSE 2008), pages 301–310. ACM, 2008.

[KM10] Holger M. Kienle and Hausi A. Müller. Rigi-an environment for software reverse engineering, exploration, visualization, and redocumentation. Science of Com- puter Programming, 75:247–263, 2010.

[KMN08] Jens Knodel, Dirk Muthig, and Matthias Naab. An experiment on the role of graphical elements in architecture visualization. Empirical Software Engineering, 13(6):693–726, 2008.

[KMNL06] Jens Knodel, Dirk Muthig, Matthias Naab, and Mikael Lindvall. Static evaluation of software architectures. In Proceedings of the 10th IEEE European Conference on Software Maintenance and Reengineering (CSMR 2006), pages 279–294. IEEE Computer Society, 2006.

[KN06] Miryung Kim and David Notkin. Program element matching for multi-version program analyses. In Proceedings of the 3rd International Workshop on Mining Software Repositories (MSR 2006), pages 58–64. ACM, 2006.

[Kni00] Alan Knight. GLORP: Generic lightweight object-relational persistence. In Ad- dendum to the Proceedings of the 15th International Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA 2000), pages 173– 174. ACM Press, 2000.

[KPG09] Foutse Khomh, Massimiliano Di Penta, and Yann-Gael Gueheneuc. An exploratory study of the impact of code smells on software change-proneness. Proceedings of the 16th Working Conference on Reverse Engineering (WCRE 2009), pages 75–84, 2009.

[KSS02] Ralf Kollmann, Petri Selonen, and Eleni Stroulia. A study on the current state of the art in tool-supported UML-based static reverse engineering. In Proceedings of the 9th Working Conference on Reverse Engineering (WCRE 2002), pages 22–32. IEEE Computer Society, 2002.

[KZK+06] Sunghun Kim, Thomas Zimmermann, Miryung Kim, Ahmed Hassan, Audris Mockus, Tudor Gîrba, Martin Pinzger, James Whitehead, and Andreas Zeller. TA- RE: An exchange language for mining software repositories. In Proceedings of the 3rd International Workshop on Mining Software Repositories (MSR 2006), pages 22–25. ACM, 2006. 228 Bibliography

[KZWZ07] Sunghun Kim, Thomas Zimmermann, James Whitehead, and Andreas Zeller. Pre- dicting faults from cached history. In Proceedings of the 29th International Confer- ence on Software Engineering (ICSE 2007), pages 489–498. IEEE CS, 2007.

[Lan01] Michele Lanza. The evolution matrix: Recovering software evolution using soft- ware visualization techniques. In Proceedings of the 4th International Workshop on Principles of Software Evolution (IWPSE 2001), pages 37–42. ACM Press, 2001.

[Lan03] Michele Lanza. Object-Oriented reverse engineering — Coarse-grained, fine-grained, and evolutionary software visualization. PhD thesis, University of Berne, 2003.

[LB85] Manny Lehman and Les Belady. Program Evolution: Processes of Software Change. London Academic Press, 1985.

[LD03] Michele Lanza and Stéphane Ducasse. Polymetric views — A lightweight vi- sual approach to reverse engineering. IEEE Transactions on Software Engineering, 29(9):782–795, 2003.

[LDGP05] Michele Lanza, Stéphane Ducasse, Harald Gall, and Martin Pinzger. Codecrawler — An information visualization tool for program comprehension. In Proceedings of the 27th International Conference on Software Engineering (ICSE 2005), pages 672–673. ACM Press, 2005. Tool demo.

[Leh80a] Meir M. Lehman. On understanding laws, evolution and conservation in the large program life cycle. Journal of Systems and Software, 1(3):213–221, 1980.

[Leh80b] Meir M. Lehman. Programs, life cycles, and laws of software evolution. Proceedings of the IEEE, 68(9):1060–1076, 1980.

[LFGT09] Andrea De Lucia, Fausto Fasano, Claudia Grieco, and Genny Tortora. Recovering design rationale from email repositories. In Proceedings of the 25th IEEE International Conference on Software Maintenance (ICSM 2009), pages 543–546. IEEE CS Press, 2009.

[LHS05] Paul Luo Li, Jim Herbsleb, and Mary Shaw. Finding predictors of field defects for open source software systems in commonly available data sources: A case study of OpenBSD. In Proceedings of the 11th IEEE International Software Metrics Symposium (METRICS 2005), pages 32–41. IEEE Computer Society, 2005.

[Lik32] R. Likert. A technique for the measurement of attitudes. Archives of Psychology, 22(140):1–55, 1932.

[LL06] Mircea Lungu and Michele Lanza. Softwarenaut: Exploring hierarchical system decompositions. In Proceedings of the 10th IEEE European Conference on Software Maintenance and Reengineering (CSMR 2006), pages 349–350. IEEE CS Press, 2006. Tool demo.

[LL07] Mircea Lungu and Michele Lanza. Exploring inter-module relationships in evolving software systems. In Proceedings of the 11th IEEE European Conference on Software Maintenance and Reengineering (CSMR 2007), pages 91–100. IEEE CS Press, 2007.

[LLG06] Mircea Lungu, Michele Lanza, and Tudor Gîrba. Package patterns for visual architecture recovery. In Proceedings of the 10th IEEE European Conference on Software Maintenance and Reengineering (CSMR 2006), pages 183–192. IEEE CS Press, 2006.

[LLGH07] Mircea Lungu, Michele Lanza, Tudor Gîrba, and Reinout Heeck. Reverse engineering super-repositories. In Proceedings of the 14th IEEE Working Conference on Reverse Engineering (WCRE 2007), pages 120–129. IEEE CS Press, 2007.

[LM06] Michele Lanza and Radu Marinescu. Object-Oriented Metrics in Practice. Springer-Verlag, 2006.

[LMSW03] Rob Lintern, Jeff Michaud, Margaret-Anne Storey, and Xiaomin Wu. Plugging-in visualization: Experiences integrating a visualization tool with Eclipse. In Proceedings of the 2003 ACM Symposium on Software Visualization (SoftVis 2003), pages 47–56. ACM, 2003.

[Män09] Mika V. Mäntylä. Software evolvability – Empirically discovered evolvability issues and human evaluations. PhD thesis, Helsinki University of Technology, Helsinki, 2009.

[Män10] Mika V. Mäntylä. Empirical software evolvability – Code smells and human evaluations. In Proceedings of the 26th IEEE International Conference on Software Maintenance (ICSM 2010). IEEE CS Press, 2010.

[Mar00] Robert C. Martin. Design principles and design patterns, 2000. www.objectmentor.com.

[Mar02] Robert C. Martin. Agile Software Development: Principles, Patterns, and Practices. Prentice Hall, 2002.

[Mar04] Radu Marinescu. Detection strategies: Metrics-based rules for detecting design flaws. In Proceedings of the 20th IEEE International Conference on Software Maintenance (ICSM 2004), pages 350–359. IEEE CS Press, 2004.

[McC95] Jim McCarthy. Dynamics of software development. Microsoft Press, 1995.

[Men10] Thilo Mende. Replication of defect prediction studies: Problems, pitfalls and recommendations. In Proceedings of the 6th International Conference on Predictive Models in Software Engineering (PROMISE 2010), pages 1–10. ACM, 2010.

[MFH02] Audris Mockus, Roy T. Fielding, and James D. Herbsleb. Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11(3):309–346, 2002.

[MFM03] Andrian Marcus, Louis Feng, and Jonathan I. Maletic. 3D representations for software visualization. In Proceedings of the 2003 ACM Symposium on Software Visualization (SoftVis 2003), pages 27–36. ACM, 2003.

[MGL06] Michael Meyer, Tudor Gîrba, and Mircea Lungu. Mondrian: An agile visualization framework. In Proceedings of the 3rd International ACM Symposium on Software Visualization (Softvis 2006), pages 135–144. ACM Press, 2006.

[Mil76] H.D. Mills. Software development. IEEE Transactions on Software Engineering, 2:265–273, 1976.

[MJ82] Daniel D. McCracken and Michael A. Jackson. Life cycle concept considered harmful. ACM SIGSOFT Software Engineering Notes, 7(2):29–32, 1982.

[MK88] Hausi A. Müller and K. Klashinsky. Rigi — A system for programming-in-the-large. In Proceedings of the 10th International Conference on Software Engineering (ICSE 1988), pages 80–86. IEEE Computer Society Press, 1988.

[MK10] Thilo Mende and Rainer Koschke. Effort-aware defect prediction models. In Proceedings of the 14th European Conference on Software Maintenance and Reengineering (CSMR 2010), pages 109–118. IEEE Computer Society, 2010.

[ML06] Mika V. Mäntylä and Casper Lassenius. Subjective evaluation of software evolvability using code smells: An empirical study. Empirical Software Engineering, 11(3):395–431, 2006.

[MPF08] Andrian Marcus, Denys Poshyvanyk, and Rudolf Ferenc. Using the conceptual cohesion of classes for fault prediction in object-oriented systems. IEEE Transactions on Software Engineering, 34(2):287–300, 2008.

[MPS08] Raimund Moser, Witold Pedrycz, and Giancarlo Succi. A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. In Proceedings of the 30th International Conference on Software Engineering (ICSE 2008), pages 181–190. ACM, 2008.

[MSC+01] Spiros Mancoridis, Timothy S. Souder, Yih-Farn Chen, Emden R. Gansner, and Jeffrey L. Korn. REportal: A web-based portal site for reverse engineering. In Proceedings of the 8th Working Conference on Reverse Engineering (WCRE 2001), pages 221–230. IEEE Computer Society, 2001.

[NAH10] Thanh H. D. Nguyen, Bram Adams, and Ahmed E. Hassan. A case study of bias in bug-fix datasets. In Proceedings of the 17th Working Conference on Reverse Engineering (WCRE 2010), 2010. To appear.

[NB05a] Nachiappan Nagappan and Thomas Ball. Static analysis tools as early indicators of pre-release defect density. In Proceedings of the 27th International Conference on Software Engineering (ICSE 2005), pages 580–586. ACM, 2005.

[NB05b] Nachiappan Nagappan and Thomas Ball. Use of relative code churn measures to predict system defect density. In Proceedings of the 27th International Conference on Software Engineering (ICSE 2005), pages 284–292. ACM, 2005.

[NBZ06] Nachiappan Nagappan, Thomas Ball, and Andreas Zeller. Mining metrics to predict component failures. In Proceedings of the 28th International Conference on Software Engineering (ICSE 2006), pages 452–461. ACM, May 2006.

[NEFZ00] Christian Nentwich, Wolfgang Emmerich, Anthony Finkelstein, and Andrea Zisman. BOX: Browsing objects in XML. Software Practice and Experience, 30(15):1661–1676, 2000.

[Nie04] Oscar Nierstrasz. Putting change at the center of the software process. In Proceedings of the 7th International Symposium on Component-Based Software Engineering, volume 3054 of Lecture Notes in Computer Science, pages 1–4. Springer, 2004.

[NR69] P. Naur and B. Randell. Software Engineering. NATO, Scientific Affairs Division, Brussels, 1969.

[NZHZ07] Stephan Neuhaus, Thomas Zimmermann, Christian Holler, and Andreas Zeller. Predicting vulnerable software components. In Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS 2007), pages 529–540. ACM, 2007.

[OA96] Niclas Ohlsson and Hans Alberg. Predicting fault-prone software modules in telephone switches. IEEE Transactions on Software Engineering, 22(12):886–894, 1996.

[OL90] Paul W. Oman and Ted G. Lewis. Milestones in Software Evolution. IEEE Computer Society Press, 1990.

[OS01] Liam O’Brien and Christoph Stoermer. Architecture reconstruction case study. Technical report, CMU/SEI-2001-TR-026, 2001.

[OW02] Thomas J. Ostrand and Elaine J. Weyuker. The distribution of faults in a large industrial software system. SIGSOFT Software Engineering Notes, 27(4):55–64, 2002.

[OWB04] Thomas J. Ostrand, Elaine J. Weyuker, and Robert M. Bell. Where the bugs are. In Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2004), pages 86–96. ACM, 2004.

[OWB05] Thomas J. Ostrand, Elaine J. Weyuker, and Robert M. Bell. Predicting the location and number of faults in large software systems. IEEE Transactions on Software Engineering, 31(4):340–355, 2005.

[OWB07] Thomas J. Ostrand, Elaine J. Weyuker, and Robert M. Bell. Automating algorithms for the identification of fault-prone files. In Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2007), pages 219–227. ACM, 2007.

[Par94] David Lorge Parnas. Software aging. In Proceedings of the 16th International Conference on Software Engineering (ICSE 1994), pages 279–287. IEEE Computer Society Press, 1994.

[PBD08] David Pattison, Christian Bird, and Premkumar Devanbu. Talk and work: A preliminary report. In Proceedings of the 5th International Working Conference on Mining Software Repositories (MSR 2008), pages 113–116. ACM, 2008.

[PBS93] Blaine A. Price, Ronald M. Baecker, and Ian S. Small. A principled taxonomy of software visualization. Journal of Visual Languages and Computing, 4(3):211–266, 1993.

[PC86] David Lorge Parnas and P.C. Clements. A rational design process: How and why to fake it. IEEE Transactions on Software Engineering, 12(2):251–257, 1986.

[PGF05] Martin Pinzger, Harald Gall, and Michael Fischer. Towards an integrated view on architecture and its evolution. Electronic Notes in Theoretical Computer Science, 127(3):183–196, 2005.

[PGFL05] Martin Pinzger, Harald Gall, Michael Fischer, and Michele Lanza. Visualizing multiple evolution metrics. In Proceedings of the 2nd ACM Symposium on Software Visualization (SoftVis 2005), pages 67–75. ACM, 2005.

[Pin97] Steven Pinker. How the Mind Works. W. W. Norton, 1997.

[PNM08] Martin Pinzger, Nachiappan Nagappan, and Brendan Murphy. Can developer-module networks predict failures? In Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering (SIGSOFT/FSE-16), pages 2–12. ACM, 2008.

[PP05] Ranjith Purushothaman and Dewayne E. Perry. Toward understanding the rhetoric of small source code changes. IEEE Transactions on Software Engineering, 31(6):511–526, 2005.

[Pri07] Marco Primi. The Episode framework — Exporting visualization tools to the web. Bachelor's thesis, University of Lugano, 2007.

[RBJ97] Don Roberts, John Brant, and Ralph E. Johnson. A refactoring tool for Smalltalk. Theory and Practice of Object Systems (TAPOS), 3(4):253–263, 1997.

[RDGM04] Daniel Raţiu, Stéphane Ducasse, Tudor Gîrba, and Radu Marinescu. Using history information to improve design flaws detection. In Proceedings of the 8th European Conference on Software Maintenance and Reengineering (CSMR 2004), pages 223–232. IEEE Computer Society, 2004.

[RdMF02] Christian Robottom Reis and Renata Pontin de Mattos Fortes. An overview of the software engineering process and tools in the Mozilla project. In Proceedings of the Open Source Software Development Workshop, pages 155–175, 2002.

[RFG05] Jacek Ratzinger, Michael Fischer, and Harald Gall. Improving evolvability through refactoring. In Proceedings of the 2nd International Workshop on Mining Software Repositories (MSR 2005), pages 1–5. ACM Press, 2005.

[RH07] Peter C. Rigby and Ahmed E. Hassan. What can OSS mailing lists tell us? A preliminary psychometric text analysis of the Apache developer mailing list. In Proceedings of the 4th International Workshop on Mining Software Repositories (MSR 2007), pages 23–30. IEEE Computer Society, 2007.

[Rie96] Arthur Riel. Object-Oriented Design Heuristics. Addison Wesley, Boston MA, 1996.

[RL05] Romain Robbes and Michele Lanza. Versioning systems for evolution research. In Proceedings of the 8th International Workshop on Principles of Software Evolution (IWPSE 2005), pages 155–164. IEEE CS Press, 2005.

[Rob08] Romain Robbes. Of change and software. PhD thesis, University of Lugano, Switzerland, 2008.

[Rob10] Gregorio Robles. Replicating MSR: A study of the potential replicability of papers published in the Mining Software Repositories proceedings. In Proceedings of the 7th International Working Conference on Mining Software Repositories (MSR 2010), pages 171–180. IEEE CS Press, 2010.

[Roc75] Marc J. Rochkind. The source code control system. IEEE Transactions on Software Engineering, 1(4):364–370, 1975.

[Roy70] Walker W. Royce. Managing the development of large software systems: Concepts and techniques. Proceedings of the IEEE WESTCON, pages 1–9, 1970. Reprinted in Proceedings of the 9th International Conference on Software Engineering (ICSE 1987), pages 328–338. IEEE CS Press, 1987.

[SDBE98] John Stasko, John Domingue, Marc Brown, and Blaine Price (Eds.). Software Visualization — Programming as a Multimedia Experience. The MIT Press, 1998.

[SK03] Ramanath Subramanyam and M. S. Krishnan. Empirical analysis of CK metrics for object-oriented design complexity: Implications for software defects. IEEE Transactions on Software Engineering, 29(4):297–310, 2003.

[SLT06] Mazeiar Salehie, Shimin Li, and Ladan Tahvildari. A metric-based heuristic framework to detect object-oriented design flaws. In Proceedings of the 14th IEEE International Conference on Program Comprehension (ICPC 2006), pages 159–168. IEEE Computer Society, 2006.

[SM95] M.-A. D. Storey and Hausi A. Müller. Manipulating and documenting software structures using SHriMP views. In Proceedings of the 11th IEEE International Conference on Software Maintenance (ICSM 1995), pages 275–284. IEEE Computer Society, 1995.

[SMWH09] Anita Sarma, Larry Maccherone, Patrick Wagstrom, and James Herbsleb. Tesseract: Interactive visual exploration of socio-technical relationships in software development. In Proceedings of the 31st International Conference on Software Engineering (ICSE 2009), pages 23–33. IEEE Computer Society, 2009.

[Som96] Ian Sommerville. Software Engineering. Addison Wesley, fifth edition, 1996.

[SPL03] Robert C. Seacord, Daniel Plakosh, and Grace A. Lewis. Modernizing Legacy Systems: Software Technologies, Engineering Process and Business Practices. Addison-Wesley Longman Publishing Co., Inc., 2003.

[Sta84] Thomas A. Standish. An essay on software reuse. IEEE Transactions on Software Engineering, 10(5):494–497, 1984.

[Sta99] Jennifer Stapleton. DSDM: Dynamic systems development method. International Conference on Technology of Object-Oriented Languages, 0:406, 1999.

[Sto00] Team development with VisualWorks. Cincom Technical Whitepaper, 2000.

[SW08] Mark Sherriff and Laurie Williams. Empirical software change impact analysis using singular value decomposition. In Proceedings of the 2008 International Conference on Software Testing, Verification, and Validation (ICST 2008), pages 268–277. IEEE Computer Society, 2008.

[SZ00] John Stasko and Eugene Zhang. Focus+context display and navigation techniques for enhancing radial, space-filling hierarchy visualizations. IEEE Symposium on Information Visualization, 0:57–65, 2000.

[SZZ06] Adrian Schröter, Thomas Zimmermann, and Andreas Zeller. Predicting component failures at design time. In Proceedings of the 2006 ACM/IEEE International Symposium on Empirical Software Engineering (ISESE 2006), pages 18–27. ACM, 2006.

[TA08] Alexandru Telea and David Auber. Code Flows: Visualizing structural evolution of source code. Computer Graphics Forum, 27(3):831–838, 2008.

[TDD00] Sander Tichelaar, Stéphane Ducasse, and Serge Demeyer. FAMIX: Exchange experiences with CDIF and XMI. In Proceedings of the ICSE 2000 Workshop on Standard Exchange Format (WoSEF 2000), 2000.

[TG02] Qiang Tu and Michael W. Godfrey. An integrated approach for studying architectural evolution. In Proceedings of the 10th International Workshop on Program Comprehension (IWPC 2002), pages 127–136. IEEE Computer Society, 2002.

[Tic82] Walter F. Tichy. Design, implementation, and evaluation of a Revision Control System. In Proceedings of the 6th International Conference on Software Engineering (ICSE 1982), pages 58–67. IEEE Computer Society Press, 1982.

[TLTC05] M. Termeer, C. F. J. Lange, A. Telea, and M. R. V. Chaudron. Visual exploration of combined architectural and metric information. In Proceedings of the 3rd IEEE International Workshop on Visualizing Software for Understanding and Analysis (VISSOFT 2005), pages 1–6. IEEE Computer Society, 2005.

[TM02] Christopher Taylor and Malcolm Munro. Revision towers. In Proceedings of the 1st International Workshop on Visualizing Software for Understanding and Analysis (VISSOFT 2002), pages 43–50. IEEE Computer Society, 2002.

[Tri06] Mario Triola. Elementary Statistics. Addison-Wesley, 2006.

[TSG04] Adrian Trifu, Olaf Seng, and Thomas Genssler. Automated design flaw correction in object-oriented systems. In Proceedings of the 8th IEEE European Conference on Software Maintenance and Reengineering (CSMR 2004), pages 174–183. IEEE Computer Society, 2004.

[ČM04] Davor Čubranić and Gail C. Murphy. Automatic bug triage using text categorization. In Proceedings of the Sixteenth International Conference on Software Engineering & Knowledge Engineering (SEKE 2004), pages 92–97, 2004.

[ČMSB05] Davor Čubranić, Gail C. Murphy, Janice Singer, and Kellogg S. Booth. Hipikat: A project memory for software development. IEEE Transactions on Software Engineering, 31(6):446–465, 2005.

[VRD04] Filip Van Rysselberghe and Serge Demeyer. Studying software evolution information by visualizing the change history. In Proceedings of the 20th IEEE International Conference on Software Maintenance (ICSM 2004), pages 328–337. IEEE Computer Society Press, 2004.

[VRRD06] Filip Van Rysselberghe, Matthias Rieger, and Serge Demeyer. Detecting move operations in versioning information. In Proceedings of the 10th IEEE European Conference on Software Maintenance and Reengineering (CSMR 2006), pages 271–278. IEEE Computer Society, 2006.

[VT06a] Lucian Voinea and Alexandru Telea. How do changes in buggy Mozilla files propagate? In Proceedings of the 2006 ACM Symposium on Software Visualization (SoftVis 2006), pages 147–148. ACM, 2006.

[VT06b] Lucian Voinea and Alexandru Telea. Multiscale and multivariate visualizations of software evolution. In Proceedings of the 2006 ACM Symposium on Software Visualization (SoftVis 2006), pages 115–124. ACM, 2006.

[VT06c] Lucian Voinea and Alexandru Telea. An open framework for CVS repository querying, analysis and visualization. In Proceedings of the 3rd International Workshop on Mining Software Repositories (MSR 2006), pages 33–39. ACM, 2006.

[VT07] Lucian Voinea and Alexandru Telea. Visual data mining and analysis of software repositories. Computers & Graphics, 31(3):410–428, 2007.

[VTvW05] Lucian Voinea, Alex Telea, and Jarke J. van Wijk. CVSscan: Visualization of code evolution. In Proceedings of the 2005 ACM Symposium on Software Visualization (SoftVis 2005), pages 47–56. ACM, 2005.

[WHH04] Jingwei Wu, Richard Holt, and Ahmed Hassan. Exploring software evolution using spectrographs. In Proceedings of the 11th Working Conference on Reverse Engineering (WCRE 2004), pages 80–89. IEEE CS Press, 2004.

[WL07a] Richard Wettel and Michele Lanza. Program comprehension through software habitability. In Proceedings of the 15th IEEE International Conference on Program Comprehension (ICPC 2007), pages 231–240. IEEE CS Press, 2007.

[WL07b] Richard Wettel and Michele Lanza. Visualizing software systems as cities. In Proceedings of the 4th IEEE International Workshop on Visualizing Software For Understanding and Analysis (VISSOFT 2007), pages 92–99. IEEE CS Press, 2007.

[WL08a] Richard Wettel and Michele Lanza. CodeCity: 3D visualization of large-scale software. In Companion of the 30th ACM/IEEE International Conference on Software Engineering (ICSE Companion 2008), pages 921–922. ACM, 2008. Tool demo.

[WL08b] Richard Wettel and Michele Lanza. Visually localizing design problems with disharmony maps. In Proceedings of the 4th ACM International Symposium on Software Visualization (Softvis 2008), pages 155–164. ACM Press, 2008.

[WPZZ07] Cathrin Weiss, Rahul Premraj, Thomas Zimmermann, and Andreas Zeller. How long will it take to fix this bug? In Proceedings of the 4th International Workshop on Mining Software Repositories (MSR 2007), pages 1–8. IEEE Computer Society, 2007.

[WSDN09] Timo Wolf, Adrian Schröter, Daniela Damian, and Thanh Nguyen. Predicting build failures using social network analysis on developer communication. In Proceedings of the 31st International Conference on Software Engineering (ICSE 2009), pages 1–11. IEEE Computer Society, 2009.

[WZX+08] Xiaoyin Wang, Lu Zhang, Tao Xie, John Anvik, and Jiasu Sun. An approach to detecting duplicate bug reports using natural language and execution information. In Proceedings of the 30th International Conference on Software Engineering (ICSE 2008), pages 461–470. ACM, 2008.

[YCM78] S. S. Yau, J. S. Colofello, and T. MacGregor. Ripple effect analysis of software maintenance. In Proceedings of the 2nd IEEE International Conference on Computer Software and Applications (COMPSAC 1978), pages 60–65. IEEE Computer Society Press, 1978.

[YMNCC04] Annie Ying, Gail Murphy, Raymond Ng, and Mark Chu-Carroll. Predicting source code changes by mining change history. IEEE Transactions on Software Engineer- ing, 30(9):573–586, 2004.

[ZN08] Thomas Zimmermann and Nachiappan Nagappan. Predicting defects using network analysis on dependency graphs. In Proceedings of the 30th International Conference on Software Engineering (ICSE 2008), pages 531–540, 2008.

[ZNG+09] Thomas Zimmermann, Nachiappan Nagappan, Harald Gall, Emanuel Giger, and Brendan Murphy. Cross-project defect prediction: A large scale experiment on data vs. domain vs. process. In Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2009), pages 91–100, New York, NY, USA, 2009. ACM.

[ZPSB09] Thomas Zimmermann, Rahul Premraj, Jonathan Sillito, and Silvia Breu. Improving bug tracking systems. In Companion to the 31st International Conference on Software Engineering (ICSE Companion 2009), pages 247–250. IEEE Computer Society, 2009.

[ZPZ07] Thomas Zimmermann, Rahul Premraj, and Andreas Zeller. Predicting defects for Eclipse. In Proceedings of the 3rd International Workshop on Predictor Models in Software Engineering (PROMISE 2007), pages 9–15. IEEE Computer Society, 2007.

[ZW04] Thomas Zimmermann and Peter Weißgerber. Preprocessing CVS data for fine-grained analysis. In Proceedings of the 1st International Workshop on Mining Software Repositories (MSR 2004), pages 2–6, Los Alamitos CA, 2004. IEEE Computer Society Press.

[ZWDZ05] Thomas Zimmermann, Peter Weißgerber, Stephan Diehl, and Andreas Zeller. Mining version histories to guide software changes. IEEE Transactions on Software Engineering, 31(6):429–445, 2005.