A Systematic Approach to Benchmark and Improve Automated Static Detection of Java-API Misuses

Vom Fachbereich Informatik der Technischen Universität Darmstadt genehmigte Dissertation zur Erlangung des akademischen Grades eines Doktor-Ingenieurs (Dr.-Ing.) vorgelegt von Sven Amann, M.Sc., geboren in Darmstadt (Hessen).

Referent: Prof. Dr.-Ing. Mira Mezini
Korreferent: Prof. Dr.-Ing. Andreas Zeller
Datum der Einreichung: 20. März 2018
Datum der mündlichen Prüfung: 07. Mai 2018

Erscheinungsjahr 2018
Darmstädter Dissertationen D17

Amann, Sven: A Systematic Approach to Benchmark and Improve Automated Static Detection of Java-API Misuses
Darmstadt, Technische Universität Darmstadt

Jahr der Veröffentlichung der Dissertation auf TUprints: 2018
URN: urn:nbn:de:tuda-tuprints-74222
URL: http://tuprints.ulb.tu-darmstadt.de/id/eprint/7422

Tag der mündlichen Prüfung: 07.05.2018

Veröffentlicht unter CC BY-SA 4.0 International
https://creativecommons.org/licenses/

Preface

I love solving challenging problems. Maybe this is why I became interested in programming. The very idea of solving problems by writing down executable solution routines fascinates me. And the additional challenge of developing and maintaining high-quality solutions keeps me hooked.

During my studies, I kept coming back to practices and tools that support software quality, such as testing and code-analysis tools. Finally, in my Master's thesis, I developed a code recommender system based on implicit user feedback, to assist developers in writing high-quality code. This thesis was supervised by Marcel Bruch and Andreas Sewe from Prof. Mira Mezini's Software Technology Group (STG) at the Technische Universität Darmstadt. Marcel and Andreas introduced me to academic software-engineering research. They also introduced me to Sebastian Proksch, a PhD student at the STG, who asked me to join the research project KaVE, where we researched and developed recommender systems for software engineering. What followed were five years of continuous learning, hard work, and many, many ups and downs. These years ultimately led to the thesis you hold before you. And since I did not walk this path alone, what follows is an attempt to thank all the people who accompanied me.

First, I would like to thank my supervisor, Prof. Mira Mezini, for the opportunity to do my PhD and for the liberty to pursue my own projects and ideas during this time. I highly appreciate that you trusted me to find my way, that you supported me in following this way, and that you acknowledged where it has led me. I also thank Prof. Andreas Zeller for being the second examiner of my thesis. I am grateful for the time you spent on carefully reviewing my thesis and for your honest feedback.

Next, I want to thank Sebastian Proksch, with whom I collaborated very closely during the first half of my PhD. We shared many ups and downs during this time, and I am happy that the work we did back then ultimately contributed to your PhD thesis. I learned a great deal working with you and I am very glad to have had the opportunity.

Another person without whom I would not be where I am today is Sarah Nadi, who joined the STG as a PostDoc during my second year as a PhD student. Your guidance and example strongly influenced how I work, as well as my thoughts and beliefs about research and academia as a whole. I am deeply grateful for your advice and your reliability, even after you had long left to become a professor at the University of Alberta.

Soon after I started the work presented in this thesis, I had the privilege to get in contact with Hoan Anh Nguyen and Prof. Tien N. Nguyen, two experts in the field of recommender systems for software engineering. I am grateful that you two took the chance to work with a complete stranger, whom you would only ever meet on Skype for almost a year to come. I highly appreciate all the guidance and assistance you put into our work and that you continued to believe in me, despite all the bad luck we had, in addition to Reviewer 2.

One of the many things I can look back on is my research project Eko, which was funded by the German Ministry of Education and Research (BMBF). It was a great honor to receive funding at this early career stage. It enabled me to lead a research team and back my work up with working prototypes, which we released along with the publications. The motivated people involved in this project were Dr. Sarah Nadi, the PhD student Leonid Glanz, and the undergraduate students Mattis Manfred Kämmerer, Jonas Schlitzer, Simon Weiler, Manuel Benz, and Govind Singh. We grew as a team over the course of the project and still collaborate on new topics. In this very project, I had the chance to meet Willi Weiers and Joachim Heldmann from DHL IT Services, who mentored me over the course of the project. I am very grateful for your time and advice. Our meetings provided me with ideas, confidence, and insights to tackle all obstacles on the path to a successful project, as well as this thesis.

My research would have been impossible without the people who went before me and who openly shared their work, data, and tools with me, no questions asked. These are Prof. Martin Monperrus, Prof. Michael Pradel, Andrzej Wasylkowski, and Prof. Andreas Zeller. May your example inspire many generations of researchers to come.

Over the years, I was happy to work with a number of excellent student assistants, namely Mattis Manfred Kämmerer, Jonas Schlitzer, David Albrecht, Uli Fahrer, and Andreas Bauer. Your hard work and dedication to high-quality software engineering enabled both development and maintenance of the many research prototypes that we published and contributed to over the years. You did a great service to the research community and to me, for which I am truly grateful. I would also like to include the students I had the pleasure to supervise over the last years. These are Michael Kutschke, David Dahlen, Oliver Abt, Waldemar Graf, Markus Zimmermann, Carina Oberle, Manuel Benz, Simon Weiler, Mattis Manfred Kämmerer, Govind Singh, Rossana Bermúdez De La Hoz, and Vidyashree Nanjunde Gowda.

I would also like to thank the people who have provided their help in proofreading this thesis. First and foremost there is Andreas Sewe, whose amazingly detailed and constructive feedback has—over and over again—brought my thinking to higher levels of clarity. Second, there is Ben Hermann, who, I am convinced, was an extraordinary salesperson in a prior life. Furthermore, I would like to thank the anonymous reviewers of all my submitted publications (including Reviewer 2). Though I sometimes disagreed with your opinions, I always came to value your criticism and did my best to consider it in my work. Like probably all PhD students, I came to loathe the peer-review system at times, when a rejection deprived me of so much of what little time I had. I sincerely hope that the community will find ways to continue providing insightful and constructive reviews in the face of the growing number of submissions.

The past five years would not have been half as fun without my brilliant colleagues Andi Bejleri, Oliver Bracevac, Ervina Cergani, Joscha Drechsler, Michael Eichberg, Matthis Eichholz, Sebastian Erdweg, Leonid Glanz, Sylvia Grewe, Dominik Helm, Ben Hermann, Sven Keidel, Mirko Köhler, Florian Kübler, Edlira Kuci, Johannes Lerch, Ingo Maier, Ralf Mitschke, Ragnar Mogk, Patrick Müller, Sarah Nadi, Sebastian Proksch, Michael Reif, Guido Salvaneschi, Jan Sinschek, Jurgen van Ham, Manuel Weiel, Pascal Weisenburger, and Anna-Katharina Wickert.

Ultimately, my PhD would have come to a grinding halt many times without the invaluable support of Gudrun Harris. You certainly are the single most important person along the journey towards a doctoral degree at the STG. Your rigorous work ensures that we are well funded, fulfill regulations, and do not lose ourselves in paperwork. Your open ears and your humor ensure that we stay on our paths, with both feet on the ground, and free of illusions regarding our baking skills. I cannot thank you enough.

And of course, I would not have made it without the support of my girlfriend, my friends, and my family. I am deeply grateful to every single one of you, for accepting me for who I am and for having shared so much with me—the good, the bad, and the ugly.

Editorial notice: Throughout this thesis I use the terms "we" and "us" to describe my work. This is meant to underline that research is always a cooperative effort and that I would have much less (if anything at all) to present here, if other people had not taken time away from their own work to review, discuss, and contribute to mine. I am deeply grateful for their effort.


Abstract

Today's software industry relies heavily on the reuse of existing software libraries. Such libraries provide the building blocks for modern software products. Reusing them allows developers to focus on innovation, while standing on the shoulders of giants. To use libraries effectively, developers need to know the Application Programming Interfaces (APIs) through which they communicate with the libraries. This includes both the APIs' semantics and the (implicit) usage constraints that come with them. In the face of the rapidly growing and evolving supply of software libraries, this has become a challenging task. As a result, incorrect usages of APIs, or API misuses, are a prevalent cause of software bugs, crashes, and vulnerabilities.

In reaction to this problem, researchers have proposed a multitude of developer-assistance tools. One particular class of such tools automates the detection of API misuses in software code. We call these tools API-misuse detectors. Existing misuse detectors address different aspects of API misuse. However, no attempt has been made to systematically define the problem space of API misuse and to assess the prevalence of API misuses compared to other types of bugs. This makes it impossible to judge the relevance of research on API-misuse detection. Moreover, previous empirical evaluations of misuse detectors commonly measure the detectors' precision. However, since the studies use different datasets, it is unclear to what extent the results are comparable. It is also unclear where the detectors make trade-offs between their precision and their recall.

In this thesis, we first present a systematic analysis of the problem of API misuse. We find that API misuse causes 9.1% of all software bugs in real-world projects, including many critical issues, such as program crashes, data loss, and security vulnerabilities. Then, we survey the literature to consolidate over a decade of research on API-misuse detection and build MuBench, a public automated benchmark for API-misuse detectors. This enables us to conduct the first-ever qualitative and quantitative comparison of existing misuse detectors. We find that these detectors have the potential to discover many API misuses, but suffer from extremely low precision and recall in practice. Finally, we systematically design MuDetect, a new API-misuse detector that addresses many of the problems of existing detectors. Using MuBench, we demonstrate that MuDetect clearly outperforms existing detectors with respect to both precision and recall. Our results provide strong evidence that, following our systematic approach, we can develop API-misuse detectors that are fit for practical application.


Zusammenfassung

Die Wiederverwendung bestehender Softwarebibliotheken ist ein Grundpfeiler der heutigen Softwareindustrie. Solche Bibliotheken stellen Bausteine für moderne Softwareprodukte zur Verfügung. Ihre Verwendung erlaubt Softwareentwicklern sich auf innovative Aspekte der Software zu fokussieren, anstatt andauernd das Rad neu zu erfinden. Um Softwarebibliotheken effektiv einzusetzen, müssen Entwickler die Semantik und die (teilweise impliziten) Nutzungsbedingungen der Programmierschnittstellen (APIs) kennen, über die sie mit den Bibliotheken interagieren. Angesichts der hohen Geschwindigkeit, mit der neue Softwarebibliotheken entstehen und bestehende Bibliotheken weiterentwickelt werden, ist dies zu einer immensen Herausforderung geworden. Aus diesem Grund sind inkorrekte Verwendungen von APIs, sogenannte API Misuses, heute weit verbreitet und verursachen Probleme wie Programmabstürze und Sicherheitslücken.

In Reaktion auf derartige Probleme haben Forscher eine Vielzahl von Assistenzwerkzeugen für Softwareentwickler vorgeschlagen. Eine Kategorie solcher Werkzeuge automatisiert die Identifikation von API Misuses in Software-Quelltext. Diese Werkzeuge werden API Misuse-Detektoren genannt. Bestehende Misuse-Detektoren adressieren unterschiedliche Aspekte von API Misuse. Es wurde jedoch bisher kein Versuch unternommen, das Problem von API Misuse systematisch zu definieren und die relative Häufigkeit von API Misuses in der Menge aller Softwarefehler zu erfassen. Daher ist es unmöglich, die Relevanz von Forschungsarbeit bzgl. API Misuse abzuschätzen. Bisherige empirische Untersuchungen von API Misuse-Detektoren haben deren Precision gemessen. Da diesen Untersuchungen jedoch stets unterschiedliche Datensätze zugrunde lagen, ist es unklar, inwieweit die entsprechenden Ergebnisse vergleichbar sind. Ebenso ist unklar, inwieweit die verschiedenen Detektoren niedrigeren Recall zugunsten von höherer Precision in Kauf nehmen.

In der vorliegenden Arbeit präsentieren wir zunächst die Ergebnisse einer systematischen Analyse von API Misuse in Softwareprojekten. Wir belegen, dass API Misuse für 9.1% aller Softwarefehler verantwortlich ist. Viele dieser Fehler haben kritische Auswirkungen, wie Programmabstürze, Datenverlust oder Sicherheitslücken. Anschließend präsentieren wir eine Literaturübersicht über mehr als 10 Jahre Forschungsarbeit an API Misuse-Detektoren und entwickeln MuBench, einen öffentlichen, automatisierten Benchmark für API Misuse-Detektoren. Diese Schritte ermöglichen uns den ersten qualitativen und quantitativen Vergleich von API Misuse-Detektoren. Wir zeigen, dass die Detektoren potentiell viele API Misuses identifizieren können, in der praktischen Anwendung aber sowohl im Hinblick auf Precision als auch im Hinblick auf Recall sehr schlecht abschneiden. Zuletzt stellen wir unseren neuen API Misuse-Detektor MuDetect vor, der viele Probleme von anderen Detektoren gezielt vermeidet.

Mithilfe von MuBench zeigen wir, dass MuDetect im Vergleich zu anderen Detektoren sowohl eine deutlich höhere Precision als auch einen deutlich höheren Recall erreicht. Unsere Ergebnisse weisen darauf hin, dass wir mit unserem systematischen Ansatz Detektoren entwickeln können, die für den praktischen Einsatz in der Softwareentwicklung geeignet sind.

Contents

Preface 3

1. Introduction 17
1.1. Problem Statement ...... 18
1.2. Contributions of this Thesis ...... 19
1.3. Publications ...... 20
1.4. Structure of this Thesis ...... 21

I. API-Misuse Detection 23

A History of API Misuse and API-Misuse Detection 25

2. API Usage and Misuse 27
2.1. API Usage Directives ...... 27
2.2. Misuses in Bug Reports ...... 29
2.3. Misuses at Development Time ...... 30
2.4. Misuses Causing Vulnerabilities ...... 31
2.5. Summary ...... 33
2.6. Limitations ...... 33
2.7. Related Work ...... 34

3. A Taxonomy of API Misuses 35
3.1. Motivation ...... 35
3.2. The Classification ...... 36
3.2.1. Method Calls ...... 37
3.2.2. Conditions ...... 37
3.2.3. Iteration ...... 38
3.2.4. Exception Handling ...... 39
3.3. Limitations ...... 39

4. A Survey of API-Misuse Detectors 41
4.1. Methodology ...... 41
4.2. API-Misuse Detectors ...... 42
4.2.1. PR-Miner ...... 43
4.2.2. Chronicler ...... 44
4.2.3. Colibri/ML ...... 45
4.2.4. Jadet ...... 45


4.2.5. RGJ07 ...... 46
4.2.6. LKL08 ...... 46
4.2.7. Alattin ...... 47
4.2.8. AX09 ...... 47
4.2.9. CAR-Miner ...... 48
4.2.10. GROUMiner ...... 48
4.2.11. OCD ...... 49
4.2.12. DMMC ...... 50
4.2.13. SpecCheck ...... 50
4.2.14. RRFinder ...... 51
4.2.15. Tikanga ...... 51
4.2.16. PJAG12 ...... 52
4.2.17. PG12 ...... 53
4.2.18. DroidAssist ...... 53
4.2.19. Salento ...... 53
4.3. Discussion ...... 54
4.3.1. Static Misuse Detectors ...... 55
4.3.2. Dynamic Misuse Detectors ...... 56
4.3.3. Coverage of the Problem Space ...... 57
4.3.4. Evaluation ...... 57

II. MuBench 61

A Systematic Evaluation of Static API-Misuse Detectors 63

5. Evaluation Setup 67
5.1. Subject Detectors ...... 67
5.2. API-Misuse Dataset ...... 68
5.3. Experiment P ...... 70
5.4. Experiment RUB ...... 73
5.5. Experiment R ...... 75
5.6. Limitations ...... 75

6. Experiment Pipeline 77
6.1. Representation ...... 77
6.2. Benchmark Automation ...... 78
6.3. Reproducibility and Traceability ...... 81
6.4. Limitations ...... 82

7. An Empirical Study of API-Misuse Detectors 83
7.1. Experiment P: Precision of the Detectors ...... 83
7.2. Experiment RUB: Recall Upper Bound of the Detectors ...... 88
7.3. Experiment R: Recall of the Detectors ...... 93


7.4. Vulnerability Detection ...... 95
7.5. User Experience ...... 97
7.6. Call to Action ...... 98
7.7. Threats to Validity ...... 99

8. Extension and Further Use 101
8.1. Dataset Extensions ...... 101
8.1.1. Misuses from a Runtime-Verification Study ...... 102
8.1.2. Misuses from Previous Evaluations of Misuse Detectors ...... 103
8.1.3. Java 8 Misuses ...... 103
8.1.4. Misuse from the Original Misuse Collection ...... 104
8.2. Comparison to Further Detectors ...... 104
8.2.1. A Study of Using FindBugs for Misuse Detection ...... 105
8.2.2. Integration of Salento ...... 107

9. Related Work 109

III. MuDetect 113

The Next Step in Static API-Misuse Detection 115

10. Motivation 117

11. A New Detector 121
11.1. API-Usage Graphs ...... 121
11.1.1. Usage Actions ...... 121
11.1.2. Data Entities ...... 123
11.1.3. Control Flow and Data Flow ...... 123
11.1.4. Building API-Usage Graphs ...... 124
11.2. Pattern Mining ...... 126
11.2.1. Apriori-based Mining ...... 126
11.2.2. Code Semantics ...... 128
11.2.3. Greedy Exploration ...... 128
11.3. Violation Detection ...... 129
11.3.1. Graph Matching ...... 129
11.3.2. Alternative-Pattern Instances ...... 131
11.3.3. Ranking Violations ...... 131
11.3.4. Alternative Violations ...... 131
11.4. Ranking ...... 132
11.5. Per-project and Cross-project Settings ...... 133

12. Evaluation Setup 135
12.1. Detectors and Dataset ...... 135


12.2. Experimental Setup ...... 136
12.2.1. Experiment P ...... 137
12.2.2. Experiment RUB ...... 137
12.2.3. Experiment R ...... 137
12.2.4. Experiment RNK ...... 138
12.2.5. Experiment XP ...... 138

13. Evaluation Results 139
13.1. Experiment RNK: Ranking Violations ...... 139
13.2. Experiment P: Precision ...... 139
13.3. Experiment RUB: Recall Upper Bound of the Detectors ...... 142
13.4. Experiment R: Recall of the Detectors ...... 143
13.5. Experiment XP: Cross-project Misuse Detection ...... 144
13.6. Discussion ...... 145
13.7. Reviewer Agreement ...... 147
13.7.1. Agreement in Experiment P ...... 147
13.7.2. Agreement in Experiment RUB ...... 148
13.7.3. Agreement in Experiment R ...... 149

14. Threats to Validity 151

15. Related Work 153

IV. Conclusion and Outlook 155

16. Conclusion 157
16.1. Summary of Results ...... 157
16.2. Closing Discussion ...... 159

17. Future Work 161
17.1. Benchmarking API-Misuse Detectors ...... 161
17.2. Advancing API-Misuse Detection ...... 163

Contributed Implementations and Data 167

Bibliography 169

Appendix 181

A. Six Basic Rules for Safe Usage of the Java Cryptographic Architecture APIs 183


B. BOA API Usage 185

C. BOA Cipher Usages 187


1. Introduction

Over the last few decades, software has become ubiquitous in our everyday life. Every day, software unlocks new insights from the world around us and brings to life the devices and services that enrich our lives. Software.org1 estimates that, in the US alone, the software industry contributed more than $1.14 trillion to the total value-added Gross Domestic Product (GDP) and accounted for over 10.5 million jobs in 2016 (including indirect and induced impacts); an increase of about 6.5 percent on both scales over the last two years.2

This enormous industry is based on an ever-growing number of software products. Its growth requires software companies to continuously speed up their development processes and reduce the time-to-market with every new product. The tremendous speed of the software economy shows, for example, in the domain of mobile apps: the Google Play Store grew by more than one app per minute, on average, between February and December 2016.3 Also, more and more of the big players, such as Facebook or Amazon, adopt Continuous Deployment techniques that allow them to release updates to their products 10, 100, or sometimes even 1000 times per day [SDG+16].

At the same time, today's software systems become increasingly complex. A vital ingredient to keep up the development of such systems at market speed is the ability to reuse existing software components [Boe99, HDG+11, SE13], commonly referred to as software libraries. Such libraries provide the building blocks for modern software products. Reusing them allows developers to stand on the shoulders of giants while focusing on innovation, rather than reinventing the wheel. More specifically, the components provide Application Programming Interfaces (APIs) for developers to interact with. Metaphorically speaking, such APIs provide the vocabulary for developers to express their solutions to a task at hand.

As with any language, to use APIs effectively, developers need to know the available vocabulary, its semantics, and the usage constraints that come with it. In the face of the rapidly growing and evolving supply of software libraries, this has become a challenging task. A recent study shows, for example, that developers perceive understanding how to use cryptographic APIs as the biggest obstacle to using these APIs [NKMB16]. As a result, incorrect usages of APIs, or API misuses, are a prevalent cause of software bugs, crashes, and vulnerabilities [FHM+12, GIJ+12, MM13, EBFK13, LCWZ14, SHA15, NKMB16, LHX+16].

1 An independent and non-partisan international research organization, dedicated to helping policymakers and the broader public better understand the impact that software has on our lives, our economy, and our society. https://software.org/ (checked on Oct 10, 2017)
2 https://software.org/wp-content/uploads/2017 Software Economic Impact Report.pdf (checked on Dec 05, 2017)
3 https://www.statista.com/statistics/266210/ (checked on Mar 20, 2018)


Documenting the functionality and usage constraints of APIs is a natural way to help developers prevent API misuse, by providing a centralized source of information per API that they can easily check and refer to. Unfortunately, creating and maintaining high-quality documentation is a challenging task in itself; for many APIs, documentation is unavailable, incomplete, ambiguous, or incorrect [UR15, ZGC+17]. This sometimes even keeps developers from reusing a particular software component [UR15].

In reaction to these problems, researchers have proposed methods to automatically generate documentation, e.g., from source code [BW08, WPVS17] or developer communications [PADP+12]; and to automatically discover mistakes in documentation, such as mismatches between code and documentation [ZGC+17] or inconsistencies within documentation [ZS13]. However, even if all APIs had complete and correct documentation, this would not solve the problem entirely, since developers are often unaware even of documented usage constraints [DH09] and prefer informal references, such as StackOverflow,4 over official API documentation, even though the former contain more mistakes [ABF+16].

A more proactive approach to help developers work with APIs more efficiently and to prevent API misuse is a family of developer-assistance tools, often referred to as Recommender Systems for Software Engineering (RSSEs) [RWZ10, RMWZ14]. The key idea behind RSSEs is to automatically obtain information items estimated to be valuable for a software engineering task in a given context and to provide them to developers, often directly in their Integrated Development Environments (IDEs). One category of RSSEs assists developers in using APIs correctly, for example, by automatically retrieving relevant code examples [HM05, HWM05], documentation fragments [DH09, TR16], or StackOverflow discussions [PBDP+14]; or by promoting likely proposals of the IDE's code completion [BMM09, PLM15]—all based on the current editor content. Another category of RSSEs helps developers identify existing API misuses [WZL07, NNP+09b, MBM10, WZ11]. Such automated API-misuse detectors are the focus of this thesis.

1.1. Problem Statement

Over the last decade, the automated detection of API misuses has received much attention [LZ05, RGJ07b, RGJ07a, WZL07, AX09, NNP+09b, TX09b, TX09a, MBM10, GWZ10, WZ11, NPVN15, MCJ17]. Existing approaches address different aspects of API misuse, such as missing method calls [MM13], missing preconditions of method calls [TX09b], or wrong method-call order [WZL07]. However, to the best of our knowledge, no attempt has been made to systematically define the problem space of API misuse and to assess the prevalence of API misuses compared to other types of bugs. This makes it impossible to judge the impact of this type of bug and, consequently, the relevance of research on API-misuse detection in general.

4 https://stackoverflow.com/ (checked on Mar 20, 2018)


The performance of many existing misuse detectors has been demonstrated in empirical studies, which commonly measure the precision of detectors, i.e., which fraction of their findings are actual misuses. Precision is important, since developers often reject analysis tools that produce many false positives [FLL+02, BBC+10, JS13]. The studies show that the detectors' precision varies greatly—between 5% and 100%. However, since the studies use different datasets, it is unclear to what extent the results are comparable. Furthermore, it is unclear to what extent there is a trade-off between the kinds of misuse that a detector identifies and the precision that it achieves. Therefore—and because developers need to know which problems a particular detector finds and how reliably it does so—we argue that it is important to also assess the recall of misuse detectors, i.e., which fraction of all misuses they identify.

We, hence, need a systematic analysis of the problem of API misuse and a systematic assessment of the state of the art in API-misuse detection, its strengths, and its weaknesses, in order to advance misuse detectors and to make them ready for practical application.

1.2. Contributions of this Thesis

This work contributes to the area of API-misuse detection. Our systematic survey of existing misuse-detection literature and our qualitative and quantitative analysis of the respective approaches provide the first-ever holistic view of the state of the art in the field. Our automated benchmark allows researchers to systematically advance API-misuse detection and to immediately compare new approaches to existing ones with little additional effort. Our pattern-mining, violation-detection, and ranking approaches advance the state of the art in misuse detection, bringing it one step closer to practical applicability.

The Problem Space of API Misuse In our first contribution, we systematically analyze the problem space of API misuse. First, we present our definition of API misuse and contrast it with other types of bugs. Then, we estimate how big a threat API misuse is and analyze how prevalent it is among the issues reported on Open Source software products. Furthermore, we demonstrate that API misuse causes many critical issues, such as program crashes, data loss, and security vulnerabilities, and provide evidence that API misuse is a serious obstacle to developers in their day-to-day work. Based on these insights, we create the API Misuse Classification (MuC) as both a taxonomy of API misuses and a framework to assess the conceptual capabilities of API-misuse detectors.

The State-of-the-art in API-Misuse Detection Our second contribution is a systematic literature review of existing work on API-misuse detection. We summarize existing approaches and compare them with regard to the underlying techniques and the types of API misuses they may identify. We find that both static and dynamic analysis techniques have been used to identify misuses and that all techniques focus on only a few types of API misuses. Furthermore, we provide an overview of the evaluation strategies that have been used to assess the performance of API-misuse detectors empirically, uncovering that evaluations have been focusing on the precision of detectors, neglecting their recall.

A Systematic Evaluation of Static API-Misuse Detectors In our third contribution, we create MuBench, the first-ever automated benchmark that enables the empirical evaluation and comparison of API-misuse detectors with regard to both their precision and recall. We carefully design MuBench to allow reproducible and comparable results across detectors. Then we use MuBench to evaluate the four state-of-the-art misuse detectors Jadet, GROUMiner, DMMC, and Tikanga, uncovering their poor performance with regard to both precision and recall. Through a systematic analysis of the root causes of the detectors' false positives and false negatives, we identify a set of common problems that limit their performance. We published MuBench and our experiment results to allow other researchers to reproduce our results, to evaluate further detectors on our benchmark, and to compare the respective results.

The Next Step in API-Misuse Detection In our fourth contribution, we present MuDetect, a new API-misuse detector that addresses many of the problems that we identified with state-of-the-art detectors. We introduce a new graph-based representation of API usages designed to cover all prevalent types of misuses in MuC. We present respective pattern-mining and violation-detection algorithms that efficiently and effectively identify misuses. We use MuBench to demonstrate that our new detector outperforms the state of the art, bringing misuse detection one step closer to practical use. We also demonstrate that both the precision and the recall of misuse detectors greatly improve when we provide them with cross-project usage examples for pattern mining, pointing to a further way to advance API-misuse detection.

1.3. Publications

Many of the contributions presented in this thesis have previously been published at software-engineering conferences or in journals. This section gives an overview of these publications and the respective parts of this thesis. The thesis parts may contain verbatim content from the publications.

MuBench: A Benchmark for API-Misuse Detectors [ANN+16]. The MSR'16 data paper presents an early version of the collection of Java API misuses that we describe in Chapter 2. This collection contained 89 Java API misuses collected by reviewing issues from general bug datasets (Section 2.2), conducting a developer survey (Section 2.3), and mining misuses from the version-control histories of Open Source projects (Section 2.4). The paper also presents an early version of the data format that we developed for processing the misuse examples in our automated benchmark MuBench (Section 6.1).


A Systematic Evaluation of API-Misuse Detectors [ANN+18b]. The TSE'18 journal article presents an extension of the API-misuse dataset with misuse examples from studies on API-usage directives (Section 2.1). It introduces the API-Misuse Classification MuC (Chapter 3) and presents the 2016 version of our survey of API-misuse detectors (Chapter 4). The article also presents the automated misuse-detector benchmark MuBench (Chapter 5 and Chapter 6) and the empirical evaluation and comparison of the four state-of-the-art misuse detectors Jadet [WZL07], GROUMiner [NNP+09b], DMMC [MBM10], and Tikanga [WZ11] (Chapter 7).

MuDetect: The Next Step in API-Misuse Detection [ANN+18a] (under review). The paper presents the new API-misuse detector MuDetect and the empirical evaluation and comparison of MuDetect, Jadet [WZL07], GROUMiner [NNP+09b], DMMC [MBM10], and Tikanga [WZ11] (Part III).

1.4. Structure of this Thesis

This thesis is organized as follows. In Part I, we introduce our terminology with regard to API misuse and API-misuse detection. We estimate the potential for API misuses based on usage directives in API documentation and investigate the prevalence of API misuse in bug reports on released software products and during development time (Chapter 2). Based on the examples of API misuse we identify in this process, we propose the API-Misuse Classification (MuC), both as a taxonomy of API misuses and a framework to assess the conceptual capabilities of API-misuse detectors (Chapter 3). Then, we conduct a systematic literature survey to identify existing work on API-misuse detection and assess and compare the conceptual capabilities of the respective approaches with respect to MuC (Chapter 4).

In Part II, we present MuBench, our automated benchmark for API-misuse detectors. We discuss how we create a ground-truth dataset of API misuses and how we design three experiments to assess the performance of such misuse detectors with respect to their precision and recall (Chapter 5). Through automating most of the evaluation process with MuBench, we enable reproducible and comparable results and keep the effort of manual reviews of detector findings low (Chapter 6). We use MuBench to assess and compare the performance of four state-of-the-art misuse detectors, analyze their findings in detail, and uncover several root causes for their false positives and false negatives, which we use to formulate a roadmap for the next steps in API-misuse detection and to call researchers to action (Chapter 7). To demonstrate the potential for future use of MuBench, we present examples of how we extended the benchmark dataset and of further studies that use the benchmark (Chapter 8). To conclude, we give an overview of related benchmarking datasets and approaches, as well as empirical studies comparing misuse detectors (Chapter 9).

In Part III, we present MuDetect, our new API-misuse detector that advances the state of the art by solving many of the challenges identified in the previous part of this thesis. We first give a high-level overview of the problems MuDetect addresses and how it does so (Chapter 10). Then we present our detector's pattern-mining and violation-detection algorithms and its ranking strategy in detail (Chapter 11). Subsequently, we describe the evaluation setup we use to assess MuDetect's performance using MuBench (Chapter 12), evaluate the detector, and compare it to the other detectors evaluated in the previous part of this thesis (Chapter 13). To conclude, we discuss the threats to the validity of our findings (Chapter 14) and give an overview of related work (Chapter 15).

In Part IV, we summarize this thesis and present an outlook on future work on API-misuse detection.

Part I.

API-Misuse Detection


A History of API Misuse and API-Misuse Detection

Incorrect usages of an Application Programming Interface (API), or API misuses, are violations of (implicit) usage constraints of the API. An example of a usage constraint is having to check that hasNext() returns true before calling next() on an Iterator, in order to avoid a NoSuchElementException at runtime. API misuse causes software bugs, crashes, and vulnerabilities [FHM+12, GIJ+12, MM13, EBFK13, SHA15, NKMB16, LHX+16].

While high-quality documentation of an API's usage constraints could help to mitigate API misuse, it is often not enough to solve the problem, because developers often remain unaware even of documented constraints [DH09]. Moreover, a recent empirical study shows that Android developers prefer informal references, such as StackOverflow, over official API documentation, even though the former contain more mistakes [ABF+16]. Ideally, development environments should assist developers in implementing correct usages and in finding and fixing existing misuses. Prevention of API misuse has been approached in various ways, e.g., through actively notifying developers about relevant API usage constraints [DH09] or recommending correct usage [HM05, BMM09, PLM15]. In this thesis, we focus on tools that automatically identify misuses in a given codebase, referred to as API-misuse detectors.

Despite the vast amount of work on API-misuse detection, API misuses still exist in practice, as recent studies show [LHX+16, ABF+16]. In order to advance the state of the art in API-misuse detection, we need to understand how existing approaches compare to each other, and what their current capabilities and limitations are. We need a general definition of API misuse to assess each detector's capabilities and to systematically address the problem space. This would allow researchers to improve API-misuse detection tools by enhancing the strengths and overcoming the weaknesses of current detectors.

To address these needs, in this part of the thesis, we present a conceptual analysis of the state of the art in API-misuse detection. In Chapter 2, we present our definition of API misuse as violations of API usage constraints. Using this definition, we estimate the potential for API misuse on the basis of existing empirical studies on API usage directives [METM12, ZGC+17]. We find that up to two thirds of all API elements come with usage constraints, and that the documentation of roughly every second API usage constraint is incorrect, even in the mature Java Class Library. We then conduct a review of over 1,200 bug reports from four general bug datasets and a developer survey to identify API misuses as they appear in software releases and during development time.

While we find that API misuses make up only about a tenth of all software bugs reported in production, these bugs cause program crashes in more than four out of five cases. Furthermore, we find indicators that developers spend considerable time struggling to solve API misuses, even for APIs with thoroughly documented usage constraints; and that developers rarely fix misuses that do not cause crashes, but have less-obvious effects, such as vulnerabilities.

In Chapter 3, we propose the API-Misuse Classification (MuC) as both a taxonomy for API misuses and a framework to assess the capabilities of misuse detectors. We derive MuC from the 164 examples of API misuse that we previously identified.

In Chapter 4, we present the results of a systematic literature review to identify existing API-misuse detectors. We identify both static and dynamic detection approaches. Using MuC, we qualitatively compare the descriptions of 18 existing detectors and identify their conceptual capabilities and shortcomings. For example, we find that few detectors detect misuses related to conditions or exception handling. We confirm this assessment with the detectors' authors.

The terminology and the techniques presented in this part are helpful to understand the contributions of this thesis. The insights about the prevalence and impact of API misuses and the conceptual analysis of the state of the art in API-misuse detection motivate our work presented in the following parts of the thesis.

2. API Usage and Misuse

An API usage (usage, for short) is a piece of code that uses a given API to accomplish some task. The API, in an object-oriented language such as Java, consists of the interfaces of one or more types. The usage is a combination of basic program elements, such as method calls (both on instances of the respective types and to their static methods), control structures, or arithmetic operations. The combination of such elements in an API usage is subject to constraints, which depend on the nature of the API. We call such constraints usage constraints. For example, two methods may need to be called in a specific order, division may not be used with a divisor of zero, and a file resource needs to be released along all execution paths. When a usage violates one or more of these constraints, we call it a misuse, otherwise a correct usage.

We subsequently present four studies that demonstrate why API misuses occur, how prevalent they are, and why it is important to address this particular type of bug.
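To make this distinction concrete, the following minimal Java sketch (our own illustration, with hypothetical method names) contrasts a usage that violates the Iterator constraint mentioned in the introduction to this part with a correct usage that satisfies it:

    import java.util.Iterator;
    import java.util.List;

    class IteratorUsageExample {

        // Misuse: calling next() without first checking hasNext() violates the
        // Iterator's usage constraint and throws NoSuchElementException for an
        // empty list.
        static String firstElementMisuse(List<String> list) {
            Iterator<String> it = list.iterator();
            return it.next();
        }

        // Correct usage: the hasNext() check satisfies the constraint.
        static String firstElementCorrect(List<String> list) {
            Iterator<String> it = list.iterator();
            if (it.hasNext()) {
                return it.next();
            }
            return null; // no element available
        }
    }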

2.1. API Usage Directives

API misuses violate API usage constraints. We distinguish constraints that are enforced by the compiler, such as correct typing, from constraints that are not, i.e., whose violation may lead to problems and errors at runtime. Such constraints are often declared as API usage directives in the API documentation. Monperrus et al. [METM12] empirically investigated such directives and their prevalence in a study of three Java libraries, namely, the Java Class Library, JFace, and the Apache Commons Collections. From their taxonomy of API usage directives, we conclude that API misuses violate usage directives from one of the following categories (in order of prevalence):

Parameter Directives, which mandate restrictions on parameters that are not enforced by the type system, such as non-null requirements, string format requirements, number range requirements, type requirements, or parameter-correlation requirements. In total, 22.2% of all analyzed API elements have such a directive.

Restrictions, which mandate actions to happen in a certain context, e.g., when a method must only be called from the UI thread. In total, 6.6% of all analyzed API elements have such a directive.

Protocol Directives, which mandate (a specific number of) calls to certain methods or a specific method-call order. In total, 6.2% of all analyzed API elements have such a directive.


Table 2.1.: The Prevalence of API Misuse by Source of Software Defects. (Percentages: Reviewed relative to Candidates, Misuse relative to Reviewed, Crash relative to Misuse.)

Source              Candidates   Reviewed        Misuse          Crash
BugClassify              2,914   294 (10.1%)     24 (8.2%)       14 (58.3%)
Defects4J                  357   357 (100.0%)    17 (4.8%)       15 (88.2%)
iBugs                      390   390 (100.0%)    40 (10.3%)      35 (87.5%)
QACrashFix                  24   24 (100.0%)     16 (66.7%)      16 (100.0%)
Developer Survey            18   18 (100.0%)     18 (100.0%)     12 (66.7%)
JCA Misuses             15,752   38 (2.4%)       31 (81.6%)      0 (0.0%)
JCA Fixes (SF)             130   130 (100.0%)    11 (8.5%)       2 (18.2%)
JCA Fixes (GH)           2,660   78 (2.9%)       3 (3.9%)        2 (66.7%)
Usage Directives            10   10 (100.0%)     10 (100.0%)     5 (50.0%)
Total                   22,255   1,339 (6.0%)    170 (12.7%)     96 (56.5%)

Return-Value Directives, which define properties of return values that are not enforced by the type system, e.g., that the value may be null or have a certain state. In total, 5.1% of all analyzed API elements have such a directive.

Exception Directives, which explain the exceptional behavior of a method that usages need to deal with, such as a possible connection timeout when connecting to a server. In total, 4.1% of all analyzed API elements have such a directive. Note that this category includes exceptional behavior in case of invalid parameter values, which usually co-occurs with parameter directives.

Synchronization Directives, which mandate the use of synchronization in usages. In total, 3.9% of all analyzed API elements have such a directive. Note that this category includes only directives that formulate requirements for thread-safe use and excludes declarations of thread-safety.

Limitations, which explain what a method does not do, but that usages are expected to take care of, e.g., that adding elements to a UI panel does not make them visible, until the usage calls a dedicated update method. In total, 0.9% of all analyzed API elements have such a directive.

Monperrus et al. report that, overall, 66.5% of all analyzed API elements have at least one directive. Note, however, that this also includes further directives that do not formulate usage constraints, such as Null-Allowed Directives, which state that a parameter may be null, or Alternative Directives, which inform about alternative ways to achieve the same goal (with potentially different trade-offs). These other directives appear on at least 24.8% of the analyzed API elements, which means that the fraction of API elements with directives that formulate usage constraints is somewhere between 41.7% and 66.5%. In any case, the study shows that a significant fraction of API elements comes with usage constraints and may, thus, potentially be misused.
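To illustrate how such directives constrain usages without being enforced by the type system, consider the return-value directive on File.listFiles() in the Java Class Library, which documents that the method may return null. The following sketch is our own illustrative example, not one of the usages analyzed in the study:

    import java.io.File;

    class DirectiveExample {

        // Return-value directive: listFiles() is documented to return null if the
        // pathname does not denote a directory or if an I/O error occurs. Ignoring
        // this directive makes the usage a misuse that can crash with a
        // NullPointerException.
        static int countEntriesMisuse(File dir) {
            return dir.listFiles().length;
        }

        // Correct usage: the documented null case is handled explicitly.
        static int countEntriesCorrect(File dir) {
            File[] entries = dir.listFiles();
            return (entries == null) ? 0 : entries.length;
        }
    }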


Dekel et al. [DH09] find that developers are often unaware of documented directives and that they therefore fail to consider them, unless they are actively pointed at the directives. Zhou et al. [ZGC+17] demonstrate that, from a randomized sample of 1,975 usage constraints mined in the java.awt and javax.swing libraries, 1,413 (71.5%) are incorrectly documented, and that, from a second randomized sample of 2,057 usage constraints mined in the java.lang, java.util, java.security, java(x).sql, javax.management, and javax.xml libraries, 772 (37.5%) are incorrectly documented. Overall, this means that more than half (51.2%) of all directives mined in the Java Class Library are incorrect. Hence, even if developers were aware of all documented directives, they might still often inadvertently misuse APIs.

2.2. Misuses in Bug Reports

Previous work shows that API misuses causing crashes or data loss are prevalent in today's software [MM13, LHX+16, MCJ17]. However, to the best of our knowledge, no work empirically compares the prevalence of API misuses to that of other types of bugs. This makes it hard to judge the impact of research that addresses API misuse. To address this problem, we manually review the bugs in four existing bug datasets. These datasets were originally constructed to evaluate approaches to problems such as bug triaging [HJZ13], fault localization [DZ07, JJE14], and automated program repair [GZW+15, JJE14]. They encompass bugs reported in various projects' issue trackers or fixed in their version-control history. Identifying the API misuses in these general bug datasets allows us to assess the prevalence of API misuses among all bugs reported from production environments. The first four rows of Table 2.1 summarize the results, which we discuss subsequently.

BugClassify This dataset by Herzig et al. [HJZ13] consists of 7,401 tickets from the issue trackers of five Open Source projects. They manually classified 2,914 of these tickets as reporting bugs. We randomly selected 10% of those tickets for each of the five projects, a total of 294 tickets, from which we identified 24 API misuses (8.2%). We found that most tickets report logic errors, such as wrong calculations or missing handling of certain cases. Other categories are mistakes in configuration files and multi-threading issues, such as race conditions.

Defects4J This defect dataset by Just et al. [JJE14] consists of 357 source-code bugs from six Open Source projects. Each defect is reported in an issue tracker, fixed in a single commit, and had at least one accompanying test case that failed before and passed after the fix. From all of these cases, we identified 17 API misuses (4.8%).

iBugs This dataset by Dallmeier and Zimmermann [DZ07] consists of 390 fixing commits from three Open Source projects. They selected the commits through heuristics on the commit messages. From all of these cases, we identified 40 API misuses (10.3%). Many of the other issues were unrelated to API usage and often even appeared in non-code files, such as configuration or documentation.

QACrashFix This dataset by Gao et al. [GZW+15] consists of 24 source-code bugs from 16 Open Source projects. They selected the bugs by, first, mining the issue trackers of the projects for crash reports related to the Android API and, second, reviewing the resulting candidates manually. From all of these cases, we identified 16 API misuses (66.7%). Interestingly, a very large part of these crash bugs are API misuses.

Overall, we find that only 9.1% of the bugs reported on the respective projects are due to API misuses. However, many of those misuses cause crashes (82.5%), which stresses the importance of mitigating this kind of bug. Moreover, we found that many of the fixing commits resolve multiple misuses, because often the same mistake also occurs in other usages of the same API. Such consistency in the API misuse suggests that the original authors were generally unaware of the respective usage constraints, rather than momentarily inattentive, and that they would make the same mistakes again in the future.
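For reference, these two aggregate figures can be reproduced from the first four rows of Table 2.1, relating misuses to reviewed bugs and crashes to misuses:

    \[
    \frac{24 + 17 + 40 + 16}{294 + 357 + 390 + 24} = \frac{97}{1065} \approx 9.1\%,
    \qquad
    \frac{14 + 15 + 35 + 16}{24 + 17 + 40 + 16} = \frac{80}{97} \approx 82.5\%.
    \]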

2.3. Misuses at Development Time

Previous work suggests that developers struggle with the correct usage of APIs during development [SHA15, NKMB16]. Since many API misuses lead to obviously spurious behavior, such as program crashes (see Section 2.2), we hypothesize that developers might introduce and fix many misuses in their day-to-day work. We may never encounter such misuses in code repositories or released software products, which is why we call them transient misuses.

To identify examples of transient misuses, we conducted a survey, which we promoted via colleagues, friends, and Twitter, to reach developers. The middle segment of Table 2.1 summarizes the results. Within nine days, we collected 18 responses naming as many distinct API misuses. We provide the questionnaire and all responses on an artifact page.1

While the size of this survey is limited, the responses still allow for some interesting observations. First, we find examples of only three of these 18 reported misuses in the bug reports reviewed in Section 2.2. This suggests that transient misuses are indeed different from the misuses we encounter in code repositories or released software products. Second, many respondents point out multiple possible fixes for a single misuse. For example, to ensure a resource gets closed, one may explicitly invoke close() on the resource, or pass the resource to IOUtils.closeQuietly() from Apache Commons, or use the try-with-resources statement to have Java automatically close the resource. This suggests that there are multiple correct alternatives for using an API to implement a specific task.

1 http://www.st.informatik.tu-darmstadt.de/artifacts/api-misuse-survey/
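The following minimal sketch (our own illustration, assuming Apache Commons IO is on the classpath) shows the three alternatives named by the respondent:

    import java.io.FileInputStream;
    import java.io.IOException;
    import org.apache.commons.io.IOUtils;

    class ResourceClosingAlternatives {

        // Alternative 1: explicitly invoke close() on the resource.
        static void readWithExplicitClose(String path) throws IOException {
            FileInputStream in = new FileInputStream(path);
            try {
                in.read();
            } finally {
                in.close();
            }
        }

        // Alternative 2: delegate closing to IOUtils.closeQuietly() from Apache Commons.
        static void readWithCloseQuietly(String path) throws IOException {
            FileInputStream in = new FileInputStream(path);
            try {
                in.read();
            } finally {
                IOUtils.closeQuietly(in);
            }
        }

        // Alternative 3: let the try-with-resources statement close the resource.
        static void readWithTryWithResources(String path) throws IOException {
            try (FileInputStream in = new FileInputStream(path)) {
                in.read();
            }
        }
    }

All three variants release the file handle along all execution paths and, thus, satisfy the usage constraint; they differ only in how explicit the closing is.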


While the survey results suggest that many API misuses are resolved early on in the development process, fixing them may still distract developers from their original tasks and slow down development. To get an impression of the potential impact, we conduct a small experiment: one of the survey respondents describes the problem of getting a ConcurrentModificationException when using an Iterator after modifying the underlying collection. To assess how frequently developers face this problem, we search StackOverflow for the keyword ConcurrentModificationException. This search returns 2,854 threads,2 of which we manually review the top 151 threads, according to StackOverflow's relevance ranking.3 We find that 88 of these threads (58.3%) ask how to fix some code that contains exactly the misuse in question and 5 more threads (3.3%) ask about best practices for iterating collections while modifying them. Another 39 threads (25.8%) ask about related issues in multi-threaded environments and six threads (4.0%) about respective exceptions being thrown by third-party libraries. For the remaining 13 threads (8.6%) we cannot determine the ultimate cause of the problem from the discussions. These results suggest that, even for a widely known and well-documented API, developers frequently have to go as far as asking for help online—a rather slow problem-resolution strategy. Therefore, we argue that it is important to help developers find and fix such misuses or to avoid them in the first place.
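The misuse behind most of these threads, and the fix through the Iterator itself, can be illustrated with the following minimal sketch (our own example):

    import java.util.Iterator;
    import java.util.List;

    class ConcurrentModificationExample {

        // Misuse: modifying the collection while iterating over it typically makes
        // the next call to next() throw a ConcurrentModificationException.
        static void removeEmptyStringsMisuse(List<String> items) {
            for (String item : items) {
                if (item.isEmpty()) {
                    items.remove(item); // modifies the underlying collection
                }
            }
        }

        // Correct usage: remove elements through the iterator that traverses them.
        static void removeEmptyStringsCorrect(List<String> items) {
            for (Iterator<String> it = items.iterator(); it.hasNext(); ) {
                if (it.next().isEmpty()) {
                    it.remove();
                }
            }
        }
    }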

2.4. Misuses Causing Vulnerabilities

A study of 269 Common Vulnerabilities & Exposures (CVEs) by Lazar et al. [LCWZ14] reports that 83% of these vulnerabilities result from individual applications misusing cryptographic libraries. Such misuses lead to plaintext disclosure [EBFK13, LCWZ14] and insecure network communication [FHM+12, GIJ+12, LCWZ14], or facilitate brute-force attacks [LCWZ14]. Egele et al. [EBFK13] report that 88% (10,327 of 11,748) of the analyzed Android apps from the Google Play marketplace violate at least one of six basic rules for safe usage of the Java Cryptographic Architecture (JCA) APIs (see Appendix A). A survey by Nadi et al. [NKMB16] reveals that developers indeed lack the necessary knowledge to use the JCA APIs correctly and that they would welcome assistance in doing so. This again underlines the criticality of software bugs related to API misuse. To find examples of such misuse, we mine JCA usages as follows:

1. We query the repository-mining infrastructure BOA [DNRN13] (GitHub September 2015 full snapshot) for projects that use the JCA encryption API Cipher, i.e., whose latest source code contains imports of either javax.crypto.Cipher directly or the entire javax.crypto package. The respective query script is provided as Appendix B.

2. We generate GitHub links to the source files containing the respective usages.

2 As of February 1, 2016.
3 Artifact Page: StackOverflow Study on ConcurrentModificationException (http://www.st.informatik.tu-darmstadt.de/artifacts/stackoverflow-cme/)


3. We manually verify the correctness of the respective candidate usages with respect to the specification provided by Krüger et al. [KSA+18].

The third-last row in Table 2.1 summarizes the results of this process. BOA identifies 15,752 Cipher usages in 1,573 GitHub projects. This corresponds to 0.02% of the 7,830,023 projects in the entire BOA dataset, which suggests that only a comparably small number of applications uses the JCA at all. Nevertheless, reviewing all candidate projects is still practically infeasible. Therefore, we randomly sample projects and review the respective usages, until we identify at least 30 misuses. In total, we reviewed 38 usages in 26 projects and identified 31 misuses (81.6%) in 19 (73.1%) of these projects. We note that the misuse ratio of 81.6% is close to the ratio of 88% that Egele et al. [EBFK13] report for Android apps. The most common misuse (fourteen misuses) is to specify an insecure algorithm for encryption; the second-most common (ten misuses) is to rely on a potentially insecure default configuration of the crypto provider. This suggests that projects that do use the Cipher API indeed commonly misuse it.

As opposed to many of the misuses we identified in Section 2.2, the JCA misuses we identified do not cause obviously spurious behavior, such as program crashes. In fact, specifying an insecure algorithm for encryption works perfectly fine. To get an idea as to whether and how such misuses get fixed, we mine bug-fixing changes in JCA usages as follows:

1. We query the repository-mining infrastructure BOA [DNRN13] (GitHub September 2015 full snapshot) for projects that use the JCA APIs, i.e., whose latest source code contains imports from the javax.crypto package.

2. We identify potentially bug-fixing changes by matching commit messages against keywords that indicate bug fixes, using the approach from Zimmermann et al. [ZPZ07].

3. For each fixing change, we extract the actual source-code change [NNW+10] and analyze the changes to the abstract syntax tree [NNP+12] to find changes to the usages of a JCA type.

4. We manually verify whether the respective candidate usages violate the specification provided by Krüger et al. [KSA+18] and whether the change resolves this.

The last two rows in Table 2.1 summarize the results of this process. We extracted 130 candidates from SourceForge, from which we identified 11 misuses (8.5%), and 2,660 candidates from GitHub, from which we reviewed all 78 candidates from a random sample of 15 projects and identified 3 misuses (3.9%). Interestingly, only one change fixes an unsafe algorithm configuration. All other changes fix misuses that lead to spurious behavior, the most common (10 cases) being crashes due to invalid input, such as an incompatible encoding for the data to encrypt or a cryptographic key that does not match the encryption algorithm. This suggests that developers are less likely to discover and fix inconspicuous misuses, which underlines the need for tools that identify such problems.
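To illustrate the two most common misuses we found, the following sketch (our own example rather than one of the reviewed usages) contrasts an insecure, only partially specified Cipher configuration with a fully specified one:

    import javax.crypto.Cipher;
    import javax.crypto.SecretKey;

    class CipherConfigurationExample {

        // Misuse: "DES" is an insecure algorithm, and naming only the algorithm
        // leaves mode and padding to the crypto provider's potentially insecure
        // defaults (typically ECB). The call works without any visible error.
        static byte[] encryptMisuse(SecretKey desKey, byte[] plaintext) throws Exception {
            Cipher cipher = Cipher.getInstance("DES");
            cipher.init(Cipher.ENCRYPT_MODE, desKey);
            return cipher.doFinal(plaintext);
        }

        // Correct usage: a fully specified, authenticated transformation
        // (assuming an AES key; the generated IV must be stored for decryption).
        static byte[] encryptCorrect(SecretKey aesKey, byte[] plaintext) throws Exception {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, aesKey);
            return cipher.doFinal(plaintext);
        }
    }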


2.5. Summary

We reviewed 1,339 bugs and bug candidates from nine sources. We identify API misuses in 23 out of 30 real-world projects (76.7%) from the bug datasets, which demonstrates that the problem is indeed prevalent. While only a relatively small fraction of all bugs that appear in production are misuses (9.1%), many of these cause application crashes (82.5%). Furthermore, developers seem to struggle with API misuses that cause obvious misbehavior in their daily work, which may considerably slow down development, even though the respective misuses never appear in production. And finally, we identify vulnerable JCA usages in 19 out of 26 projects (73.1%), which confirms previous findings regarding the prevalence of API misuses causing vulnerabilities [FHM+12, GIJ+12, EBFK13, LCWZ14]. Yet, we find fixes of such vulnerabilities in only 8 out of 65 projects (12.3%). This suggests that vulnerabilities are less likely to get fixed than other misuses, possibly because they cause no obvious misbehavior.

In the studies on API-usage directives [DH09, METM12], we identify 10 additional examples of API misuse, of which we did not find concrete instances via any of the other sources. This suggests that the problem of potential API misuse extends to a large number of APIs. Overall, we identified 170 API misuses in 50 projects, the survey responses, and two studies on API-usage directives. This suggests that providing developers with assistance to prevent, identify, and resolve API misuses is very important to increase the overall quality, stability, and security of software products.

2.6. Limitations

In our work, we focus on misuse of Java APIs. Our findings may not generalize to API misuse in other programming languages. The sample of API misuses we identified may not be representative of Java API misuses in the wild. We manually reviewed more than 1,200 bugs from established general bug datasets to identify examples of API misuses (Section 2.2). This leads us to believe that our sample covers a broad variety of misuses as they manifest in software projects. Our developer survey provides evidence that there are other misuses that may never appear in code repositories, but that developers nonetheless struggle with. Future work should investigate whether there are distinctive properties of these two categories of misuses and how this impacts research on API misuse.

The review of bug datasets (Section 2.2) and survey responses (Section 2.3) was initially done by the author of this thesis alone; the review of the JCA usages was done by one of his colleagues, who previously did research on the usability of cryptographic APIs [NKMB16]. In both cases, only a single person reviewed the candidates. It is possible that we missed examples of misuse in this process and that the actual prevalence of API misuses is higher than we reported. It is also possible that we identified examples that are not actually API misuses. To mitigate this threat, we published all examples in May 2016, to allow others to review and use them for their work.

Since the original publication, the author of this thesis created a dataset from the examples and used it to benchmark API-misuse detectors (see Part II of this thesis). In this process, each misuse example was revisited and discussed repeatedly between him and at least two of his colleagues. A single false positive was discovered along the way and excluded from the dataset. The data presented in this part of the thesis was updated to reflect this decision.

The size of the developer survey we conducted (Section 2.3) is small. The responses are unlikely to present a representative picture of the API misuses that developers face in their day-to-day work. We conducted this preliminary survey to get a first impression of possible differences between API misuses that occur during development time and misuses that cause bug reports. Future work should conduct more comprehensive studies to investigate the impact and nature of API misuse at development time.

2.7. Related Work

Lazar et al. [LCWZ14] study 269 Common Vulnerabilities & Exposures (CVE) and find that 83% of all bugs are misuses of cryptographic libraries by individual applications. Their work motivated us to study misuses of the JCA APIs as an example of API misuses with severe consequences other than application crashes. We find that usages of the respective APIs are relatively rare, at least on GitHub. However, from the usages we reviewed, we indeed classified 81.6% as misuses.

Zhong et al. [ZS15] studied over 9,000 bug fixes from six open-source Java projects to empirically validate assumptions underlying program-repair techniques. They report that developers make API repair actions in half of the source files involved in fixes, i.e., they add, modify, or delete a statement that contains an API element, e.g., add a check on the result of an API call. They define API as any code element declared outside the target project itself. However, they do not distinguish whether the respective bugs are violations of API-usage constraints or different kinds of problems. We find that API misuses indeed only amount to about 10% of all bugs. Zhong et al. also find that fixes more often add statements than remove them, suggesting that missing-element violations are more prevalent than redundant-element violations. Our findings confirm this (see Table 3.1).

Dekel et al. [DH09], Sunshine et al. [SHA15], and Nadi et al. [NKMB16] provide evidence that developers struggle with API misuse already during development time. Motivated by their observations, we conducted our developer survey to identify concrete examples of APIs and usage constraints that developers struggle with. We find that these problems are mostly distinct from those we identified in code repositories and that developers apparently spend much time resolving respective issues, resorting to asking for help online even for misuses of well-documented APIs.

3. A Taxonomy of API Misuses

Past research on API misuses shows that there are different kinds of misuses: For example, Monperrus et al. [MM13] report that issues related to missing method calls are prevalent in bug trackers, forums, newsgroups, commit messages, and source-code comments. Thummalapenta et al. [TX09b] specifically target the detection of missing preconditions of method calls, and Wasylkowski et al. [WZL07] investigate problems where methods are called in the wrong order. However, to the best of our knowledge, no work has systematically defined the problem space of API misuse. This prevents us from assessing which aspects of API misuse have been addressed or may have been neglected by existing approaches, and from comparing these approaches to one another.

To improve on this situation, in this chapter, we introduce the API-Misuse Classification (MuC), a taxonomy of API misuses and a framework for the evaluation and comparison of the capabilities of API-misuse detectors. In Chapter 4, we use MuC to qualitatively compare the capabilities of existing API-misuse detectors. In Chapter 7, we use MuC to define our expectations on the detectors' performance in an empirical evaluation.

3.1. Motivation

The IEEE has a standard for classifying defects [IEE10], which served as the basis for IBM's Orthogonal Defect Classification (ODC) [CBC+92]. The ODC uses the defect type as one of the aspects from which to classify a defect. The defect type is composed of a conceptual element of the program, such as a function, check, assignment, documentation, or algorithm, and a violation type, i.e., either missing or incorrect. El Emam et al. [EW98] presented an adaptation of the ODC to a particular project setting through the removal of some defect types and the addition of others. More recently, Beller et al. [BBMZ16] presented the General Defect Classification (GDC), a remote ODC descendant, tailored to compare the capabilities of automated static-analysis tools. Each of these classifications covers the entire domain of software defects. To compare the capabilities of API-misuse detectors, we need a more fine-grained differentiation of a subset of the categories they encompass.

Past work presented empirical studies and taxonomies of API-usage directives [DH09, METM12]. Many of these directives can be thought of as usage constraints in our terminology and their violations, consequently, as misuses. Other directives, however, do not formulate constraints. Examples are directives that explicitly allow null to be passed as a parameter and directives that inform about alternative ways to achieve a behavior (possibly with different trade-offs). Therefore, we cannot directly convert a taxonomy of usage directives into a taxonomy of misuses. Instead, we create a taxonomy of API misuses based on the examples of API misuses we previously collected (see Chapter 2). In this process, we consider those usage directives that can be viewed as usage constraints, through the hand-crafted examples of misuses that we derived from the examples presented in the studies.

3.2. The Classification

We developed MuC using a variation of Grounded Theory [GS67]: Following our notion of API misuses as API usages with one or more violations of usage constraints, the first author of this work went through all the misuse examples identified in Chapter 2 and came up with labels for the characteristics of the respective violations, until each misuse was tagged with at least one label. Subsequently, the author and three of his colleagues iteratively revisited the labeled misuses to unify semantically equivalent labels and group related labels, until we had a consistent taxonomy. In the end, we had two dimensions whose intersection describes all violations in the examples: the type of the involved API-usage element and the type of the violation.

Violation A violation is a pair of a violation type and an API-usage element.

API-Usage Element An API-usage element is a program element that appears in API usages. The following elements are involved in the misuse examples we identified in Chapter 2: method calls, conditions, iterations, and exception handling. Note that we consider primitive operators, such as arithmetic operators, as methods. For conditions, we further distinguish null checks, synchronization conditions, context conditions, and other value or state conditions, because of their distinct properties.

Violation Type The violation type describes how a usage violates a given usage con- straint with respect to a given usage element. Among the misuse examples we identified in Chapter 2, we find two violation types: missing and redundant. Violations of the missing type come from constraints that mandate the presence of a usage element. They generally cause program errors. An example of such a violation is a “missing method call.” Violations of the redundant type come from constraints that mandate the absence of a usage element or declare the presence of a usage element unnecessary. Note that in either case the repetition of an element may have undesired effects, such as errors or decreased performance. An example of such a redundant violation is a “redundant method call.”

Table 3.1 shows a summary of MuC. The numbers in the cells denote how many misuses with a respective violation are among the misuse examples we identified in Chapter 2. Note that, since a single misuse may have multiple violations, the individual cells in the table sum up to more than the 170 misuses we collected.

Table 3.1.: The Misuse Classification (MuC), with the Number of Misuses with a Particular Violation among the Examples Identified in Chapter 2, Ordered by Prevalence.

                                  Violation Type
API-Usage Element            Missing   Redundant
Condition                    109       7
  Value or State             56        1
  null Check                 51        4
  Synchronization            1         1
  Context                    1         1
Method Call                  45        17
Exception Handling           16        1
Iteration                    1         2

The table shows that missing value or state conditions and missing null checks are the most prevalent violations, followed by missing method calls. Redundant calls and missing exception handling are less frequent, but still prevalent, while we have only a few examples of the other violations. We now discuss the different violation categories shown in Table 3.1, grouped by the API-usage element involved.

3.2.1. Method Calls

Method calls are the most prominent elements of API usages, as they are the primary means of communication between client code and the API. One violation category is missing method calls, which occur if a usage does not call a certain method that is mandated by the API usage constraints. This is the case, for example, if a usage does not call validate() on a JFrame after adding elements to it, which is required for the change to become visible. The other case is redundant method calls, which occur if a usage calls a certain method that is restricted by the API usage constraints. For example, a usage might call remove() on a list that is currently being iterated over, which causes an exception in the subsequent iteration, or call finalize() on an Object, which should never be done from any user-defined code.
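
The following Java sketch illustrates both violation categories with hypothetical client code based on the examples above; the class and method names are illustrative.

    import java.util.List;
    import javax.swing.JButton;
    import javax.swing.JFrame;

    public class MethodCallViolations {
        // Missing method call: after adding a component to a visible JFrame,
        // validate() must be called for the change to become visible.
        static void missingCall(JFrame frame) {
            frame.add(new JButton("OK"));
            // frame.validate();  // <-- the call mandated by the usage constraint is missing
        }

        // Redundant method call: List.remove() while iterating the same list
        // typically triggers a ConcurrentModificationException on a later iteration.
        static void redundantCall(List<String> names) {
            for (String name : names) {
                if (name.isEmpty()) {
                    names.remove(name);  // <-- restricted call; Iterator.remove() would be required instead
                }
            }
        }
    }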

3.2.2. Conditions

Client code often needs to ensure conditions for valid communication to an API, in order to adhere to the API's usage constraints. There are often alternative ways to ensure such conditions. For example, to ensure that a collection is not empty one may check isEmpty(), check its size(), or add an element to it. Note that checks, in particular, are also a means for the client code to vary usages depending on program inputs.


One violation category is missing conditions, which occur if a usage does not ensure certain conditions that are mandated by the API usage constraints. One case is missing null checks, e.g., if a usage fails to ensure that a receiver or a parameter of a call is not null. Another case is missing value or state conditions, e.g., if a usage fails to ensure that a Map contains a certain key before using the key to access the Map. In multi-threaded environments, missing synchronization conditions may occur, e.g., if a usage does not obtain a lock before updating a HashMap that is accessed from multiple threads [METM12]. Finally, missing context conditions may also occur, e.g., if a usage fails to ensure that GUI components in Swing are updated on the Event Dispatching Thread (EDT) [DH09].

The other case is redundant conditions, where a condition prevents a necessary part of a usage, e.g., a method call, from being executed along certain execution paths or is simply redundant. One case is redundant null checks, e.g., if the usage checks nullness only after a method has been invoked on the respective object. Another case is redundant value or state conditions, e.g., if the usage checks isEmpty() on a collection that is guaranteed to contain an element. In multi-threaded environments, redundant synchronization conditions may occur, e.g., if the usage requests a lock that it already holds, which may cause a deadlock. Finally, redundant context conditions may also occur, e.g., if a JUnit assertion is executed on another thread, where its failing cannot be captured by the JUnit framework.
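
The following Java sketch gives minimal, illustrative instances of a missing value/state condition and a redundant null check; the method names and constants are hypothetical.

    import java.util.Map;

    public class ConditionViolations {
        // Missing value/state condition: the key may be absent, so get() can return
        // null and the subsequent parsing fails.
        static int missingStateCondition(Map<String, String> settings) {
            // if (!settings.containsKey("timeout")) return -1;  // <-- the mandated check is missing
            return Integer.parseInt(settings.get("timeout"));
        }

        // Redundant null check: the receiver is dereferenced before the check, so the
        // check can never prevent the failure it is supposed to guard against.
        static int redundantNullCheck(String value) {
            int length = value.length();
            if (value != null) {  // <-- redundant: checked only after the dereference
                return length;
            }
            return 0;
        }
    }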

3.2.3. Iteration

Iteration is another means of interacting with APIs, used, in particular, with collections and IO streams. It takes the form of loops and recursive methods. Note that respective usage constraints are about (not) repeating (part of) a usage, rather than about the condition that controls the execution. One violation category is missing iterations, which occur if a usage does not repeatedly check a condition that the API usage constraints mandate must be checked again after executing part of the usage. For example, the Java documentation states that calls to Object.wait() must always happen in a loop. A usage calls wait() to pause the current thread until it receives an interrupt. It does this to wait until a certain condition holds, e.g., until a resource becomes available. Since interrupts may occur any time, the resumed usage must check the condition and, if it does not hold, call wait() again. Note that this may happen an arbitrary number of times before the condition becomes true. Thus, a usage that checks the condition with an if still violated the usage constraint. The other case is redundant iterations, which occur if part of a usage is reiterated that the API usage constraints mandate may be executed not more than once or that is imply redundant. For example, a Cipher instance might be reused in a loop to encrypt a collection of values, but its initialization through calling init() must happen exactly once, i.e., before the loop. Note that, if init() is called inside the loop, there is exactly one call to the method—as required by the usage constraints—, but its inclusion in an iteration causes a violation.


3.2.4. Exception Handling

Exceptions are a way for APIs to communicate errors to client code. The handling of different errors often depends on the specific API. One violation category is missing exception handling, which occurs if a usage does not take actions to recover from a possible error, as mandated by the API usage constraints. For example, when initializing a Cipher with an externally provided cryptographic key, one should handle InvalidKeyException. Another example is resources that need to be closed after use, also in case of an exception. Such guarantees are often implemented with a finally block, but also using the try-with-resources construct or respective handling in multiple catch blocks.

The other case is redundant exception handling, which occurs if a usage intercepts exceptions that should not be caught or handled explicitly. For example, catching Throwable when executing a command in an application might suppress a CancellationException, preventing the user from cancelling the command.
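
The following Java sketch illustrates how client code may satisfy the two missing-handling constraints named above; it is a simplified illustration, and the way the key error is reported is an assumption.

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.security.InvalidKeyException;
    import java.security.Key;
    import javax.crypto.Cipher;

    public class ExceptionHandlingExamples {
        // Handling InvalidKeyException when initializing a Cipher with an external key;
        // simply swallowing or ignoring the exception would be a missing-handling violation.
        static Cipher initCipher(Key externalKey) throws Exception {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            try {
                cipher.init(Cipher.ENCRYPT_MODE, externalKey);
            } catch (InvalidKeyException e) {
                throw new IllegalArgumentException("unusable key provided", e);
            }
            return cipher;
        }

        // Resources must be released even on the exceptional path; try-with-resources
        // guarantees that the stream is closed.
        static int readFirstByte(String path) throws IOException {
            try (FileInputStream in = new FileInputStream(path)) {
                return in.read();
            }
        }
    }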

3.3. Limitations

We derive MuC from the misuse examples identified in Chapter 2, i.e., from 147 API misuses identified in 50 real-world projects and 17 further misuses identified through a developer survey. These examples may not be representative of API misuses in general, hence, MuC may miss some violation categories. However, the examples were identified through a review of over 1,200 reports from state-of-the-art bug datasets as well as developer input, and we find that a relatively small number of criteria suffices to characterize all these misuses. We see this as an indicator that we covered a large fraction of the different kinds of API misuses. Meanwhile, we also classified the capabilities of 18 existing misuse detectors according to MuC (see Chapter 4) and found that all these detectors' capabilities are covered by MuC. This makes it unlikely that we miss a prevalent violation category. The classification enables us, for the first time, to gather empirical data about API misuses.


4. A Survey of API-Misuse Detectors

Figure 4.1 depicts the solution space for the detection of API misuses. The detection may be approached through static analyses of source code or binaries and through dynamic analyses, i.e., runtime monitoring or analysis of runtime data, such as traces or logs. In either case, the detection requires either specifications of correct API usage to find violations of or specifications of misuses to find instances of. Such specifications may be crafted manually by experts or inferred automatically by algorithms. Automatic specification inference (or mining) may, again, be approached both statically, e.g., based on code samples or documentation, and dynamically, e.g., based on traces or logs. Statically inferred specifications are often referred to as patterns.

Since manually crafting and maintaining specifications is costly, in this work, we focus on automated detectors. We call such tools API-misuse detectors. In the literature, we find static misuse detectors that statically mine specifications and detect misuses through static analysis, e.g., [WZL07, NNP+09b, MM13]; dynamic misuse detectors that dynamically mine specifications and detect misuses through dynamic analysis, e.g., [PG12, LZL+14]; and hybrid misuse detectors that, for example, combine dynamic specification mining with static detection [PJAG12].

To advance the state of the art of API-misuse detection, we need to understand the capabilities and shortcomings of existing misuse detectors. Therefore, we conduct a systematic literature review to identify existing API-misuse detectors and assess their underlying approach, their conceptual capabilities with respect to MuC, and the evaluation setting they were tested in (Section 4.2). Then we compare the different detectors and the respective evaluations to establish the big picture of the state of the art (Section 4.3).

4.1. Methodology

We performed a systematic literature survey in two rounds: The first round was conducted in early 2016 and started from a survey of automated API-property inference techniques by Robillard et al. [RBK+13] and the 2013 to 2015 proceedings of the ICSE, FSE, and ASE conferences (and their respective colocated events). The second round was conducted in late 2017 and started from the 2016 and 2017 proceedings of the same conferences. In both rounds, the author of this thesis proceeded as follows:

1. He filtered the proceedings for publications whose title or abstract contain one of the keywords API, usage, error, mine, mining, specification, verification, or bug.


Figure 4.1.: Solution Space for the Detection of API Misuses (dimensions: Specifications, mined vs. hand-crafted, and Detection, static vs. dynamic; labeled regions: Static API-Misuse Detectors and Specification Verifiers).

2. He manually reviewed the title and abstract of the filtered publications to identify those about either API-usage specifications, specification mining, program verification, or bug detection.

3. He checked any such publication for approaches to API-misuse detection. The first round included only approaches based on static specification inference and verification. The second round also included approaches based on dynamic analyses or manual specifications or both.

4. If a publication presents such an approach and that approach targets Java APIs, he added all references to other publications that supposedly present a misuse detector to the list of publications to check.

We present the misuse detectors identified in this survey process chronologically by their publication date. We mark approaches identified in the second round of the survey with ∗, because subsequent parts of this thesis are based on the results of the first round only.

We use MuC for a qualitative comparison of detectors, i.e., we assess the conceptual capabilities of each detector with respect to MuC to obtain a conceptual classification of the existing detectors. We use the published description and evaluation results of each detector to identify which MuC categories they can, conceptually, detect. To reduce subjectivity, we confirmed our capability assessment and the detector descriptions with the respective authors, except for PR-Miner and Colibri/ML, whose authors did not respond (note that we did this only for the detectors identified in the first round of our survey in 2016). We also describe the strategies used to evaluate each detector and summarize those in Table 4.2.

4.2. API-Misuse Detectors

We subsequently present the results of our survey. Table 4.1 summarizes the capabilities of each detector with respect to MuC. Table 4.2 summarizes the empirical evaluation methodology and results for each of the detectors in our survey. The first column names the detector and the respective publication presenting the evaluation. The second column shows the number of projects the detector was applied to in the evaluation. The third column shows which sample of the detector's findings the authors reviewed to determine the detector's precision. The last column shows the detector's precision. If the precision was reported per project, we report a range. We discuss the individual results for each of the detectors below.

Table 4.1.: Capabilities of API-Misuse Detectors, per detector and its target language, across the MuC categories: missing and redundant method calls; missing null checks, missing value/state conditions, missing synchronization conditions, missing context conditions, and redundant conditions; missing and redundant exception handling; and missing and redundant iteration. A filled circle denotes the capability to detect a violation, a half-filled circle the capability to detect it under special conditions, and an empty circle the inability to detect it.

Table 4.2.: Evaluation Summary of Surveyed API-Misuse Detectors with the Number of Target Projects (#TP) and the Number of Reviewed Findings (#RF).

Detector               Eval. Setting   #TP   #RF              Precision (Range)
PR-Miner [LZ05]        per-project     3     Top 60           18.1% (10-27%)
Chronicler [RGJ07a]    per-project     5     example-based
Colibri/ML [Lin07]     per-project     5     example-based
Jadet [WZL07]          per-project     5     Top 10/project   6.5% (0-13%)
Jadet [GWZ10]          multi-project   20    Top 25% (50)     8.0% (0-100%)
RGJ07 [RGJ07b]         per-project     1     example-based
LKL08 [LKL08]          per-project     1     example-based
Alattin [TX09a]        cross-project   6     Top 10/project   29.5% (13-100%)
AX09 [AX09]            per-project     3     All (292)        90.4% (50-94%)
CAR-Miner [TX09b]      cross-project   5     Top 10/project   60.1% (41-82%)
GROUMiner [NNP+09b]    per-project     9     Top 10/project   5.4% (0-8%)
OCD [GS10]             per-project     10    4 of 7           25.0%
OCD [GS10]             per-project     2     All (15)         55.0% (40-70%)
DMMC [MBM10]           per-project     1     All (19)         73.7%
DMMC [MM13]            per-project     3     Top 30           56.7%
SpecCheck [NK11]       per-project     7     All (24)         54.2% (0-100%)
RRFinder [WLW+11]      cross-project   n/s   example-based
Tikanga [WZ11]         per-project     6     Top 25% (121)    9.9% (0-33%)
PJAG12 [PJAG12]        per-project     12    All (81)         32.1% (0-100%)
PG12 [PG12]            per-project     10    All (54)         100% (100%)
DroidAssist [NPVN15]   not evaluated
Salento [MCJ17]        cross-project   250   Top 8% (48)      75%

4.2.1. PR-Miner

PR-Miner [LZ05] is a misuse detector for C. It encodes usages as the set of all function names called within the same function and then employs frequent-itemset mining to find patterns with a minimum support of 15 usages. Violations here are strict subsets of a pattern that occur at least ten times less frequently than the pattern. To prune false positives, PR-Miner applies inter-procedural analysis, i.e., for each occurrence of a violation, it checks whether the missing calls occur within a transitively called method. This analysis follows the call path for at most three levels. The reported violations are ranked by the respective pattern's support.

PR-Miner detects missing method calls.

The evaluation by the authors of PR-Miner applied the detector to three target projects individually, thereby finding violations of project-specific patterns. The authors reviewed the top-60 violations reported across all projects and found 18.1% true positives (26.7%, 10.0%, and 14.3% on the individual projects).
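
As an illustration of this subset-based violation detection, here is a minimal Java sketch under the assumption that usages are represented as sets of called function names and that the pattern and violation supports are already known; it is not PR-Miner's actual implementation, which analyzes C code.

    import java.util.HashSet;
    import java.util.Set;

    public class SubsetViolationCheck {
        // A usage (the set of calls within one function) violates a mined pattern if it
        // contains some, but not all, of the pattern's calls and such incomplete usages
        // occur at least ten times less frequently than the pattern itself.
        static boolean violates(Set<String> callsInFunction, Set<String> pattern,
                                int patternSupport, int violationSupport) {
            Set<String> presentPart = new HashSet<>(pattern);
            presentPart.retainAll(callsInFunction);  // calls of the pattern that the usage contains
            boolean strictSubset = !presentPart.isEmpty() && presentPart.size() < pattern.size();
            boolean rareEnough = violationSupport * 10 <= patternSupport;
            return strictSubset && rareEnough;
        }
    }

For instance, given a hypothetical pattern {lock, unlock} with support 150 and only four functions that call lock without unlock, such a function would be flagged.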

4.2.2. Chronicler

Chronicler [RGJ07a] is a misuse detector for C. It mines frequent call-precedence relations from an inter-procedural control-flow graph. A relation is considered frequent if it holds on at least 80% of all execution paths. Paths where such relations do not hold are reported as violations.

Chronicler detects missing method calls. Since loops are unrolled exactly once, it cannot detect missing iteration.


The evaluation by the authors of Chronicler applied the detector to five projects individually, thereby finding violations of project-specific patterns. The authors compare the identified protocols with the documented protocols for one API and discuss a few examples of actual bugs found by their tool, but report no statistics on the quality of the detector’s findings.

4.2.3. Colibri/ML

Colibri/ML [Lin07] is another misuse detector for C. It reimplements PR-Miner using Formal Concept Analysis [GW97] to strengthen the theoretical foundation of the approach. Consequently, its capabilities are the same as PR-Miner's.

The evaluation by the authors of Colibri/ML applied the detector to five target projects individually, thereby finding violations of project-specific patterns. The authors present some detected violations in the paper, but report no statistics on the quality of the detector's findings.

4.2.4. Jadet

Jadet [WZL07] is a misuse detector for Java. It uses Colibri/ML [Lin07], but instead of only method names, it encodes method-call order and call receivers in usages. Therefore, it first builds a directed graph whose nodes represent method calls on a given object and whose edges represent control flows. From this graph, it then derives a pair of calls for each call-order relationship, e.g., m() ≺ n(). Each usage is represented by the set of these pairs. These sets of call pairs form the input to the mining, which identifies patterns, i.e., sets of pairs, with a minimum support of 20. A violation may miss at most two properties of the violated pattern and needs to occur at least ten times less frequently than the pattern. Detected violations are ranked by u × s/v, where s is the violated pattern's support, v is the number of violations of the pattern, and u is a uniqueness factor of the pattern. To compute u, Jadet counts for every API in the pattern the number of violations involving that API and takes the inverse of the largest such number. Intuitively, if an API is involved in more violations, any particular violation involving it is less likely to be problematic.

The encoding of call-order relations allows Jadet to detect missing calls, even if the respective call appears at a different place in the usage. It may detect missing loops as a missing call-order relation from a method call in the loop header to itself. However, it cannot detect violations of patterns that consist of only two calls, since such a pattern would be encoded as a set of a single pair of method calls. The only strict subset of such a pattern is the empty set, which is by definition not a violation.

The evaluation by the authors of Jadet applied the detector to five target projects, thereby finding violations of project-specific patterns. The authors reviewed the top-10 violations reported per project and found 6.5% true positives (0%, 0%, 7.7%, 10.5%, and 13.3% on the individual projects). Other findings were classified as code smells (6.5%) or hints2 (35.0%).

2 Hints point at code that could be improved with respect to readability or maintainability [WZL07].
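
The following Java sketch illustrates the ranking formula u × s/v described above; the method signature and data structures are illustrative assumptions, not Jadet's actual interfaces.

    import java.util.List;
    import java.util.Map;

    public class JadetStyleRanking {
        // rank = u * s / v, where s is the violated pattern's support, v the number of
        // violations of that pattern, and u the uniqueness factor, i.e., the inverse of
        // the largest number of violations any API in the pattern is involved in.
        static double rank(int patternSupport,
                           int violationsOfPattern,
                           List<String> apisInPattern,
                           Map<String, Integer> violationsPerApi) {
            int maxViolations = 1;
            for (String api : apisInPattern) {
                maxViolations = Math.max(maxViolations, violationsPerApi.getOrDefault(api, 0));
            }
            double uniqueness = 1.0 / maxViolations;
            return uniqueness * patternSupport / (double) violationsOfPattern;
        }
    }

A violation of a well-supported pattern whose APIs are otherwise rarely involved in violations thus ranks higher than one of a pattern whose APIs are flagged frequently.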


A later study [GWZ10] applied Jadet to 6,097 projects at once (multi-project setting), using a minimum pattern support of 200. The authors considered the findings on a random sample of 20 of these target projects; a total of 136 findings. The authors reviewed the top-25% findings per project, a total of 50 findings, and found 8% true positives. Other findings were classified as code smells (14.0%).

4.2.5. RGJ07

RGJ07 [RGJ07b] is a misuse detector for C. It encodes usages as sets of properties for each variable v. Properties are comparisons to literals, e.g., (≠, null) if v was checked to be not null, argument positions in function calls, e.g., (arg(2), f) if v was passed as the second argument to a function f, and assignments, e.g., (:=, res(f)) if v was assigned the result of a call to f. For each call, RGJ07 creates a group of the property sets of the call's arguments. To all groups for a particular function, it applies sequence mining to learn common sequences of control-flow properties and frequent-itemset mining to identify all common sets of all other property types. In both cases, the mining uses a confidence threshold of 70%.

While mining patterns, RGJ07 also identifies violations of the common property sequences and sets (patterns), i.e., usages that cause the confidence for a particular pattern to be less than 100%.

RGJ07 is designed to detect missing conditions. From the properties it encodes, it can detect missing null checks and missing value/state conditions. Since patterns contain preceding calls on arguments, it may also detect missing calls, if the respective call shares an argument with another call in the pattern.

The evaluation by the authors of RGJ07 applied the detector to a single project, thereby finding violations of project-specific patterns. The authors discuss several examples of actual bugs their approach detects, but report no statistics on the quality of the detector's findings.

4.2.6. LKL08∗

LKL08 [LKL08] is a misuse detector for Java. It mines specifications of the form consequences ↪ premises from method-call traces, where both the premises and consequences are sets of method calls. It requires that the respective traces contain only calls to methods of the relevant API(s) and searches for specifications with a minimum support of 15 and a confidence of 90% across all traces. The authors then manually transform the mined specifications into LTL formulae and use the PRISM model checker [HKNP06] to find respective violations.

LKL08 detects missing method calls and wrong call order.

The evaluation by the authors of LKL08 applied the detector to a single project containing four known bugs. An existing test harness was used to generate traces. The authors reviewed the violations reported by LKL08 and found that the detector identifies three of the four known bugs.

∗ We identified this approach in the second round of our survey in 2017.


4.2.7. Alattin

Alattin [TX09a] is a misuse detector for Java, specialized in alternative patterns for condition checks. For each target method m, it queries the code-search engine Google Code Search3 to find example usages. From each example, it extracts a set of rules about pre- and post-condition checks on the receiver, the arguments, and the return value of m, e.g., "boolean check on return of Iterator.hasNext before Iterator.next" or "const check on return of ArrayList.size before Iterator.next." It then applies frequent-itemset mining on the sets of these rules to obtain patterns with a minimum support of 40%. For each such pattern, it extracts the rule sets that do not adhere to the pattern and repeats mining on these, to obtain infrequent patterns with a minimum support of 20%. Finally, it combines all frequent and infrequent patterns for m by disjunction. An analyzed method has a violation if the set of rules that hold in it is not a superset of any of the alternative patterns. Violations are ranked by the support of the respective pattern.

Alattin detects missing null checks and missing value/state conditions that are ensured by checks and that do not involve literals. It may also detect missing method calls that occur in checks.

The evaluation by the authors of Alattin applied the detector to six projects. Since it queries a code-search engine for usage examples, it detects violations of cross-project patterns. The authors manually reviewed all violations of the top-10 patterns per project, a total of 532 findings, and confirmed that 29.5% identify missing condition checks (12.5%, 26.2%, 28.1%, 32.7%, 52.6%, and 100% for the individual projects). Considering frequent alternative patterns reduced false positives by 15.2% on average, which increases precision to 33.3%. Considering both frequent and infrequent alternatives even reduced false positives by 28.1% on average, leading to a precision of 37.8%, but introduced 1.5% additional false negatives, because misuses that occur multiple times are mistaken for infrequent patterns.

3 https://en.wikipedia.org/wiki/Google_Code_Search (checked on Dec 18, 2017)

4.2.8. AX09

AX09 [AX09] is a misuse detector for C, specialized in detecting wrong error handling, realized through returning (and checking for) error codes. To this end, it distinguishes normal paths, i.e., execution paths from the beginning of the main function to its end, from error paths, i.e., paths from the beginning of the main function to an exit or return statement in an error-handling block. It uses push-down model checking to generate such paths as sequences of method calls and applies frequent-subsequence mining to find patterns with a minimum support of 80% (but at least 5 usages). AX09 uses push-down model checking also to verify adherence to patterns and to identify respective violations. It then filters false positives by tracking variable values and excluding error cases that cannot occur, for example, if a usage does not close a file along the error path that is taken when the file could not be opened, i.e., when the respective file handle is null.

AX09 detects missing error handling as well as missing method calls among error-handling functions. Since it identifies error-handling blocks through a predefined set of checks, it also detects missing null checks and missing value/state conditions in the case of missing error-handling blocks.

The evaluation by the authors of AX09 applied the detector to three projects individually, thereby finding violations of project-specific patterns. The authors manually reviewed all 292 findings and confirmed 90.4% true positives (50.0%, 90.3%, and 93.5% on the individual projects).

4.2.9. CAR-Miner

CAR-Miner [TX09b] is a misuse detector for C++ and Java, specialized in detecting wrong error handling. For each analyzed method m in a given code corpus, it queries the code-search engine Google Code Search4 to find example usages. From the examples, it builds an Exception Flow Graph (EFG), i.e., a control-flow graph with additional edges for exceptional flow to and within catch and finally blocks. From the EFG, it generates normal call sequences that lead to the currently analyzed call and exception call sequences that lead from the call along exceptional edges. Subsequently, it mines association rules between normal sequences and exception sequences, with a minimum support of 40%. It ranks association rules by their support.

To detect violations, CAR-Miner extracts the normal call sequence and the exception call sequence for each target method call. It then uses the learned association rules to determine the expected exception handling and reports a violation if the actual sequence does not include it.

CAR-Miner detects missing exception handling as well as missing method calls among error-handling functions.

The evaluation by the authors of CAR-Miner applied the detector to five projects. Since it queries a code-search engine for usage examples, it detects violations of cross-project patterns. The authors manually reviewed all violations of the top-10 association rules for each project, a total of 264 violations, and confirmed that 60.1% identify wrong error handling (41.1%, 54.5%, 68.2%, 68.4%, and 82.3% on the individual projects). Other findings were classified as hints (3.0%).

4 https://en.wikipedia.org/wiki/Google_Code_Search (checked on Dec 18, 2017)

4.2.10. GROUMiner

GROUMiner [NNP+09b] is a misuse detector for Java. It creates a graph-based object-usage representation (GROUM) for each target method. A GROUM is a directed acyclic graph whose nodes represent method calls, branchings, and loops and whose edges encode control and data flows. GROUMiner uses an a-priori-based algorithm to detect frequent subgraphs [RP15] on sets of such GROUMs, to detect recurring usage patterns with a minimum support of six. A-priori-based algorithms start from frequent single-node subgraphs and recursively extend known, frequent subgraphs by frequently adjacent neighbor nodes.

When at least 90% of all occurrences of a sub-pattern can be extended to a larger pattern, but some cannot, those rare inextensible occurrences are considered violations. Note that such violations always have exactly one node less than a pattern. The detection of patterns and violations happens at the same time. Violations are ranked by their rareness, i.e., the support of the pattern over the support of the violation.

GROUMiner detects missing method calls. It also detects missing conditions and loops at the granularity of a missing branching or loop node. However, it cannot consider the actual condition.

The evaluation by the authors of GROUMiner applied the detector to nine projects individually, thereby finding violations of project-specific patterns. The authors reviewed the top-10 violations per project, a total of 184 findings, and found 5.4% true positives (three times 0%, five times 6.7%, and once 7.8% on the individual projects). Other findings were classified as code smells (7.6%) or hints (6.0%).

4.2.11. OCD∗

OCD [GS10] is a misuse detector for Java. To mine and check temporal patterns, OCD observes a window of 25 events from the method-call traces and identifies pairs of subsequent calls to the same receiver. If no second call occurs within the window, it considers the first call as isolated. Both types of occurrences serve as evidence (or counter-evidence) for temporal patterns, based on a predefined set of pattern templates for sequential calls (ab), loop begin and end (ab+ and a+b), pre- and post-conditions (ab? and a?b), and association rules ((ab|ba)).

OCD uses multiple thresholds to decide, based on the collected evidence and counter-evidence, whether a pattern should be enforced, i.e., the respective violations be reported, or not. It self-tunes these thresholds such that it reports only around ten violations. A ranking strategy is mentioned, but not discussed in the publication.

OCD detects missing method calls and wrong call order. Based on the pattern templates, we assume that it should be able to identify missing iteration in some cases.

The evaluation by the authors of OCD is twofold: First, it applied the detector to ten projects from the DaCapo benchmark suite [BGH+06] individually, thereby finding violations of project-specific specifications. The original test harnesses of the projects were used for execution. The authors reviewed the three violations of Java Class Library APIs reported by OCD and found one (33.3%) true positive. They also reviewed one of the four violations of other APIs reported by OCD, but found it to be a (non-obvious) false positive. Second, the evaluation applied OCD to the usages of Java Class Library APIs in two further projects individually. Both projects are applications that the authors manually interacted with, to generate inputs. The authors reviewed all violations reported by OCD and found 55% true positives (7 of 10 and 2 of 5 on the individual projects).

∗ We identified this approach in the second round of our survey in 2017.


4.2.12. DMMC

DMMC [MBM10] is a misuse detector for Java, specialized in missing method calls. It does not mine patterns, but rather computes a likelihood for every usage to be a misuse. This calculation is based on type usages, i.e., sets of methods called on a given receiver type in a given method, and the usage context, i.e., the signature of the method that the usage occurs in. Two usages are considered exactly similar if their respective sets match and almost similar if one of them contains exactly one additional method. DMMC assumes that violations should have only few exactly-similar usages, but many almost-similar ones. The likelihood of a usage x being a violation is expressed in the strangeness score = 1 − |E(x)|/(|E(x)| + |A(x)|), where E(x) is the set of usages that are exactly similar to x and A(x) the set of those that are almost similar to x. Violations are ranked by their strangeness score.

DMMC detects misuses with exactly one missing method call.

The evaluation by the authors of DMMC applied the detector to a single project, thereby finding project-specific violations for the Standard Widget Toolkit (SWT) of Eclipse. The authors manually reviewed all findings with a strangeness score above 97%, a total of 19 findings, and confirmed 73.7% as true positives, for which they submitted respective patches to the Eclipse project. Eleven of these patches were accepted, one was rejected, and the remaining two remained unanswered.

A later study [MM13] applied DMMC to three projects individually, thereby finding project-specific violations for a predefined set of APIs. The authors manually reviewed approximately 30 findings,5 and confirmed 17 (≈ 56.7%) as true positives, for which they submitted respective patches to the projects. Eleven of these patches were accepted, two were rejected, two were ignored, because the respective code was no longer maintained, and two remained unanswered.

5 This is the quantity as reported in the publication.
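
As an illustration of the strangeness score defined above, here is a minimal Java sketch; the counting of exactly- and almost-similar usages is assumed to have happened elsewhere.

    public class StrangenessScore {
        // Strangeness of a usage x: 1 - |E(x)| / (|E(x)| + |A(x)|), where exactlySimilar
        // is |E(x)| and almostSimilar is |A(x)|.
        static double strangeness(int exactlySimilar, int almostSimilar) {
            int total = exactlySimilar + almostSimilar;
            if (total == 0) {
                return 0.0;  // assumption: without any similar usages there is no evidence for a violation
            }
            return 1.0 - (double) exactlySimilar / total;
        }
    }

A usage with 1 exactly similar and 39 almost similar usages, for example, would score 1 − 1/40 = 0.975 and thus fall into the range the authors reviewed.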

4.2.13. SpecCheck∗

SpecCheck [NK11] is a misuse detector for Java. It uses the LM miner [DLMK10] to obtain specifications of the form consequences ↪ premises from method-call traces, where both the premises and consequences are sets of method calls. In this step, SpecCheck uses different values for support and confidence, depending on the size of the trace (higher values for larger traces, in discrete steps).

To determine the subset of significant specifications, SpecCheck removes the premise method calls from the instances of the specification in the training codebase and executes the mutated program. If this causes an exception at a consequence method call of a specification, for at least one instance of the specification, SpecCheck considers this specification as significant. Otherwise, it drops the specification. For specifications with multiple premise method calls, SpecCheck repeats this check with a leave-one-premise-out strategy and drops premises whose omission does not indicate significance of the specification. Finally, SpecCheck combines multiple specifications with the same consequences by conjunction. To find misuses, SpecCheck uses JFTA [DKM+10], a static typestate verifier.

SpecCheck detects missing method calls and wrong call order.

The evaluation by the authors of SpecCheck applied the detector to seven projects individually, thereby finding violations of project-specific patterns. All projects were taken from the DaCapo benchmark suite [BGH+06] and the respective test harnesses were used for execution. The authors reviewed all violations reported by SpecCheck and found 54.2% true positives (0%, 33.3%, 33.3%, 50%, 62.5%, 66.7%, and 100% on the individual projects). Including the insignificant specifications produces no additional true positives, but 613 (55.7 times) more false positives.

∗ We identified this approach in the second round of our survey in 2017.

4.2.14. RRFinder∗

RRFinder [WLW+11] is a misuse detector for Java, specialized in detecting resource leaks. It first uses a classifier to identify resource-releasing (RR) methods in the implementation code of APIs that manipulate resources. This classifier uses several features of methods, such as keywords appearing in the method name or documentation, whether the class overrides a known RR method, the percentage of statements in the method's body known to perform RR actions, and the percentage of other methods in the class that fail after the method was invoked. Since some of these features depend on the classification of other methods, RRFinder implements an iterative classification approach.

For each identified RR method m(), RRFinder then heuristically determines respective resource-acquiring (RA) methods, by searching for public methods that either assign a field that is set to null in m() or invoke an RA method that corresponds to an RR method invoked in m(). If no such method exists, RRFinder assumes that the constructor of the class declaring m() is the corresponding RA method. Each pair of an RR method and the set of corresponding RA methods forms an RR specification. The verification mechanism of RRFinder is not detailed in the publication.

RRFinder detects misuses with missing RR-method calls.

The evaluation trained the RR-method classifier on a manually curated dataset of 19,080 methods from the Java Class Library. It then applied RRFinder to an unspecified set of open source projects, using resource-release specifications mined for the APIs from eight popular libraries. The authors present one example of an actual bug that their approach detects, but report no statistics on the quality of the detector's findings.

∗ We identified this approach in the second round of our survey in 2017.

4.2.15. Tikanga

Tikanga [WZ11] is a misuse detector for Java that builds on the same algorithm as Jadet. It replaces Jadet's simple call-order properties by general Computation Tree Logic formulae on object usages. Specifically, it uses formulae that require a certain call to occur in a usage, formulae that require two calls in a certain order, and formulae that require a certain call to happen after another. It uses model checking to determine the subset of all those formulae with a minimum support of 20 in the codebase. It then applies Formal Concept Analysis [GW97] to obtain patterns and violations at the same time. Violations are ranked by the conviction measure [BMUT97] of the association between the set of present formulae and the set of missing formulae in the violating usage.

Tikanga's capabilities are the same as Jadet's, but it also detects violations of patterns with only two calls.

The evaluation by the authors of Tikanga applied the detector to six projects individually, finding violations of project-specific patterns. The authors manually reviewed the top-25% of findings per project, a total of 121 findings, and confirmed 9.9% as true positives (twice 0%, 8.3%, 20.0%, 21.4%, and 33.3% on the individual projects). Other findings were classified as code smells (29.8%).

4.2.16. PJAG12∗

PJAG12 [PJAG12] is a misuse detector for Java, specialized in multi-object method-call protocols. It uses a dynamic specification miner [PG09, PBG10] to obtain multi-object specifications from method-call traces. The resulting specifications are finite-state automata (FSA), where transitions are method calls and states represent the respective objects' state.

PJAG12 transforms the mined FSA specifications into Fusion specifications [JA09], i.e., into triples of a set of relationships between the involved objects, a set of call preconditions, and a set of call effects on relationships and object states. Fusion performs a static inter-procedural analysis to verify the specifications on a given target codebase. To avoid false positives, PJAG12 prunes specifications using a support threshold and prunes violations using thresholds on the number of illegal method calls in a violation divided by the number of all method calls in the respective violated specification and on the number of calls to a method that are found to be illegal divided by the overall number of calls to this method.

PJAG12 detects missing and redundant method calls. It may also implicitly identify missing state conditions, if it finds a certain call to be illegal in the receiver's current state, according to the respective FSA specification.

The evaluation by the authors of PJAG12 applied the detector to twelve projects from the DaCapo benchmark suite [BGH+06] individually, thereby finding violations of project-specific patterns. The training data for the specification miner was taken from prior work of the same authors [PBG10]. The authors reviewed all violations reported by PJAG12 and found 32.1% true positives (0%, 0%, 0%, 0%, 13.3%, 13.3%, 23.1%, 28.6%, 61.5%, 69.2%, 100%, and 100% on the individual projects) and an additional 18.5% code smells.

∗ We identified this approach in the second round of our survey in 2017.


4.2.17. PG12∗

PG12 [PG12] is a misuse detector for Java, specialized in method-call protocols. Given a target program and an API, PG12 first uses Randoop [PLEB07], a feedback-directed random test generator, to automatically execute the target code, prioritizing parts that use the API in question. Second, PG12 uses a dynamic specification miner [PG09, PBG10] (the same as in PJAG12) to obtain specifications from the method-call traces of succeeding generated test runs, i.e., runs that did not terminate with an exception. Last, PG12 filters the traces of failing generated test runs to those that violate one of the mined specifications and where the violating code does not explicitly declare the respective exceptional case, and reports them as violations. Each violation is accompanied by a test case that provokes a respective crash.

PG12 detects missing and redundant method calls, if the respective violation causes an exception. Like PJAG12, it may also implicitly identify missing state conditions, if it finds a certain call to be illegal in the receiver's current state (and this call causes an exception).

The evaluation by the authors of PG12 applied the detector to ten projects from the DaCapo benchmark suite [BGH+06] (the same projects that were used in the evaluation of OCD) individually, thereby finding violations of project-specific specifications. The authors reviewed all 54 violations reported by PG12 and found only true positives.

4.2.18. DroidAssist

DroidAssist [NPVN15] is a detector for Dalvik Bytecode (Android Java). It generates method-call sequences from source code and learns a Hidden Markov Model from them, using a modified version of the Baum-Welch algorithm, to compute the likelihood of a particular call sequence. If the likelihood is too small, the sequence is considered a violation. DroidAssist then explores different modifications of the sequence (adding, replacing, and removing calls) to find a slightly modified, more likely sequence. This allows it to detect missing and redundant method calls and even to suggest solutions for them.

The authors of DroidAssist present no evaluation of this mechanism in the respective paper.

4.2.19. Salento∗

Salento [MCJ17] is a detector for Dalvik Bytecode (Android Java). To detect misuses, Salento takes a dataset with training code, a dataset with target code, and a set of APIs to analyze. It uses symbolic execution to identify objects of the APIs' types and encodes each respective usage as a bag of all method calls on such an object. For each method call, Salento also encodes boolean predicates that capture constraints on call parameters, whether the call throws an exception, or other properties. Details on these predicates are not provided in the paper.

∗ We identified this approach in the second round of our survey in 2017.


Salento assumes that each usage U conforms to a specification Z, which is unknown. It further assumes that the method calls X_U appearing in U inform about Z. The uncertainty about Z is formalized as P(Z | X = X_U), where Z is a random variable over specifications and X is a random variable over method calls in usages. Moreover, Salento expresses the uncertainty regarding the behaviors Y of U as P_U(Y), where Y is a random variable over behaviors, and allows for a distribution P(Y | Z) over the behaviors of usages that implement a given specification Z. In this framework, Salento learns a joint distribution P(X, Y, Z) from the usage examples in the training code. To detect misuses, Salento computes for each usage U in the target code the anomaly score as the statistical distance between P(Y | X = X_U) and P_U(Y). Salento reports U as a violation if the usage's anomaly score exceeds a threshold. The reported violations are ranked according to this score.

Salento detects missing and redundant method calls. It may be able to detect other kinds of misuses, depending on the properties it captures in its boolean predicates. However, the paper presents no evidence for Salento detecting other kinds of misuses.

The evaluation trained Salento on a dataset of 500 apps and applied it to detect misuses in a disjoint dataset of 250 apps. The first author reviewed the top-ranked violations of each target project that had a violation in the top-10% of violations reported across the entire target dataset. The authors chose this sampling to avoid counting multiple findings that identify the same problem. Salento reports all its true positives in the top-8% of these findings (48 violations), which leads to a precision of 75% in the top-8% findings.
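
To make the anomaly computation above concrete, the following Java sketch compares two discrete behavior distributions. KL divergence is used here as one possible choice of statistical distance, and the map-based representation of the distributions is an illustrative assumption.

    import java.util.Map;

    public class AnomalyScore {
        // Anomaly score of a usage as a distance between the behavior distribution
        // predicted from its calls, p ~ P(Y | X = X_U), and the usage's own behavior
        // distribution, q ~ P_U(Y). Both maps are assumed to be proper distributions
        // over the same set of behavior keys.
        static double klDivergence(Map<String, Double> p, Map<String, Double> q) {
            double sum = 0.0;
            for (Map.Entry<String, Double> entry : p.entrySet()) {
                double pi = entry.getValue();
                double qi = q.getOrDefault(entry.getKey(), 1e-12);  // smoothing to avoid division by zero
                if (pi > 0) {
                    sum += pi * Math.log(pi / qi);
                }
            }
            return sum;
        }
    }

A usage would then be reported as a violation if this score exceeds the chosen threshold.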

4.3. Discussion

Overall, we identified 19 API-misuse detectors in the literature. These detectors target either C, C++, or Java. Only two detectors have been applied across multiple languages: (1) CAR-Miner, which analyzes C++ and Java, and (2) Colibri/ML, which itself analyzes only C, but is also the foundation of Jadet and Tikanga, which analyze Java. Five detectors use dynamic analyses, while the other 14 exclusively use static analyses. All detectors approach the detection of misuses through the detection of deviant code [ECH+01] (or deviant execution traces, in case of dynamic detection). The key idea is that mistakes violate constraints that usages should adhere to and that, given sufficiently many usage examples, such violations appear as anomalies. This rests on two assumptions:

1. It assumes that usages that occur frequently correspond to correct usage or, in other words, that the majority of usages is correct. This seems intuitive, considering that misuses often cause spurious behavior (see Chapter 2), which would likely be noticed if it were the rule rather than the exception.

2. It assumes that anomalies with respect to frequent usages are misuses. This seems less intuitive, since such anomalies are, first of all, simply rare usages, which does not imply anything regarding their correctness per se. Therefore, misuse detectors usually refine this assumption using some measure of distance between usages, i.e., they consider only anomalies that are very similar to a frequent usage as mistakes.

Figure 4.2.: A Correct Iterator Usage in GROUMiner's Graph Representation (graph not reproduced; it encodes the snippet "Iterator i = ...; while (i.hasNext()) { E e = i.next(); }" with action nodes for Iterator.hasNext() and Iterator.next(), a WHILE control node, and control-dependency edges)

Previous evaluations show that respective approaches can successfully detect mistakes in the usage of popular libraries [ECH+01, LZ05, WZL07, NNP+09b, WZ11, MM13]. However, the low precision reported for many detectors also suggests that many anomalies with respect to correct usages are themselves correct usages.

4.3.1. Static Misuse Detectors

The static detectors all use code (snippets) as training and verification input. Some require the code in a compiled format, such as Java Bytecode, while others directly work on source code. They typically represent usages as sets, sequences, or graphs and mine patterns through frequent-itemset/subsequence/subgraph mining, according to their usage representation. Many static detectors perform pattern mining and anomaly detection at the same time, since violations are simply in-extensible parts of patterns that are themselves observed infrequently. Since many detectors produce a large number of findings (including many false positives), they propose a variety of ranking strategies to ensure true positives are reported first.

Two exceptions to the typical design of static detectors are DMMC and DroidAssist. DMMC computes the probability of each usage missing a method call based on all observed usages of the same type, i.e., it does not mine patterns. DroidAssist learns a Hidden Markov Model [RJ86] that allows it to compute the likelihood of method-call sequences to be correct. Consequently, it performs mining first and detection later.

The static detectors use both absolute and relative minimum-support thresholds to identify patterns. Relative thresholds are defined with respect to the absolute number of usage examples provided as input. The exceptions are, again, DMMC and DroidAssist, which use fixed thresholds on the respective probabilities. For ranking violations, most static detectors rely mainly on pattern support, but some use different concepts, such as confidence, rareness, strangeness, or conviction. A comparison of different ranking strategies is unavailable in the literature.
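As a rough illustration of the set-based variant of this mining-and-detection scheme, the following sketch treats each usage as a set of method calls, mines call sets that reach an absolute minimum support as patterns, and reports a usage as a violation if it matches only an in-extensible part of a pattern, i.e., misses some of the pattern's calls. Real detectors use richer representations (sequences, graphs), proper frequent-itemset/subgraph mining, and ranking; all names and the threshold here are illustrative only.

import java.util.*;

// Deliberately simplified sketch of set-based pattern mining and violation
// detection: a "pattern" is any exact call set observed at least MIN_SUPPORT
// times; a usage is reported as a violation if it is a strict subset of a
// pattern, i.e., it misses some of the pattern's calls.
public class SetBasedDetector {

    static final int MIN_SUPPORT = 2; // illustrative absolute support threshold

    public static void main(String[] args) {
        List<Set<String>> usages = List.of(
            Set.of("Iterator.hasNext", "Iterator.next"),
            Set.of("Iterator.hasNext", "Iterator.next"),
            Set.of("Iterator.next")); // anomaly: misses the hasNext() call

        // Mining: count how often each exact call set occurs.
        Map<Set<String>, Integer> support = new HashMap<>();
        for (Set<String> usage : usages)
            support.merge(usage, 1, Integer::sum);

        Set<Set<String>> patterns = new HashSet<>();
        support.forEach((calls, count) -> {
            if (count >= MIN_SUPPORT) patterns.add(calls);
        });

        // Detection: a usage that is a strict subset of a pattern misses calls.
        for (Set<String> usage : usages) {
            for (Set<String> pattern : patterns) {
                if (pattern.containsAll(usage) && !usage.containsAll(pattern)) {
                    Set<String> missing = new HashSet<>(pattern);
                    missing.removeAll(usage);
                    System.out.println("Violation: " + usage + " misses " + missing);
                }
            }
        }
    }
}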


A strength of static detectors is that their representations quite naturally encompass different usage elements and their relations, since they encode usages as abstractions of how they appear in code. GROUMiner, for example, represents a correct Iterator usage as depicted in Figure 4.2, encoding the two respective method calls, the iterative check on the result of hasNext(), and that it must precede the call to next(). Generally, graph representations seem promising for simultaneously encoding different usage elements, order, and data-flow relations. Another strength of static approaches is that they can train on code examples from various sources, such as documentation, code-search engines, or Q&A sites, even if these are not compilable or executable. This makes it easier to obtain sufficiently many usage examples for different APIs and enables cross-comparison of examples from different sources, which might help to mitigate biases. None of the approaches from the literature makes use of more than one source of usage examples, however.

A major limiting factor for the usefulness of static detectors is the availability of exhaustive examples of correct usage for pattern mining. Additionally, these examples need to appear sufficiently often, in order to be considered frequent, i.e., patterns. Insufficient training data is likely one of the main reasons for the large number of false positives that many static detectors report. It appears that the detectors that focus on specific violations, such as error handling (AX09 and CAR-Miner) or missing method calls (DMMC), have higher precision, possibly because they abstract more from the usage code and, therefore, need fewer examples to mine good patterns.

An inherent limitation of many current static detectors is that they define violations as in-extensible parts of patterns. As a consequence, they cannot detect redundant elements, as such an element is never part of any pattern. DroidAssist shows an alternative approach, using a probabilistic model, which might identify redundant calls as being unlikely.

4.3.2. Dynamic Misuse Detectors

The dynamic detectors all use method-call traces as training input. They learn either association rules between (sets of) method calls or finite-state automata representing object states. With the exception of OCD, they all conduct specification learning and validation in two separate phases, which allows them to prune both specifications and violations separately. Validation of mined specifications happens either again on execution traces or via static verification. All dynamic detectors report relatively few findings, compared to the static detectors. Therefore, evaluations simply consider all their findings and none of the approaches presents a ranking strategy.

A strength of dynamic detectors is that they can verify whether a misuse is problematic in the sense that it may cause an exception. This allows a significant reduction of false positives (98.2% in the case of SpecCheck) or even their complete exclusion (in the case of PG12). Since reducing false positives is important for developers to adopt misuse detectors [FLL+02, BBC+10, JS13] and many misuses cause exceptions (see Section 2.2), this is an important achievement. However, there are many misuses that do not cause exceptions, but nonetheless have severe consequences, such as vulnerabilities


(see Section 2.4), and should not be neglected.

A major limiting factor for the usefulness of dynamic detectors is the availability of exhaustive execution traces [PBG10]. Similar to static detectors, specifications need to manifest in sufficiently many traces in order to be mined. In addition, if a misuse is not triggered in any execution, it cannot be detected through verification of execution traces. Most dynamic detectors rely on existing test harnesses, which is generally unreliable, or on manually curated training datasets, which is not scalable. The exception is PG12, which uses random test generation to maximize the trace coverage. However, to make this scale, PG12 focuses on code parts that call methods from a pre-specified set of APIs.

An inherent limitation of all current dynamic approaches is that they do not consider usage elements other than method calls. Method-call traces do not show whether checks happened on the result of a call or, more generally, which subsequences of a trace are control-dependent on which preceding calls. Therefore, the detectors cannot analyze control dependencies and exception handling, which make up a significant part of all API misuses (see Chapter 3).

Another limitation is that current dynamic approaches can only search for misuses of a predefined set of APIs. This set may be defined explicitly [PG12] or implicitly, e.g., as all APIs from a certain package [PJAG12] or library [LKL08, GS10, NK11]. They need this information in order to decide whether a method call m() should itself become an event in the call trace or whether the execution of m() should be tracked to potentially add transitive calls to the trace. Therefore, dynamic detectors usually focus on detecting misuses of widely used APIs, e.g., from the Java Class Library, and neglect less-commonly-used APIs and project-specific APIs.

4.3.3. Coverage of the Problem Space

Table 4.1 shows that existing detectors cover only a small subset of all API-misuse categories. While all detectors can, to some degree, identify missing method calls, only four detectors can identify missing conditions, only four can identify missing iterations in some cases, and only two can identify missing exception handling. None of the detectors targets all of these categories. Only three of the detectors can identify any redundant usage elements (method calls, in all cases). All other detectors cannot identify such elements, by design of their usage representation and violation-detection strategy. Table 3.1 shows that the redundant violation type is much less common than the missing violation type, which might, in part, explain this design choice. Overall, these findings suggest that future work should develop detectors that target the neglected misuse categories, cover more misuse categories at once, or both. Part III of this thesis presents such a detector.

4.3.4. Evaluation

Table 4.2 summarizes the empirical results of the surveyed detectors, as reported in their original papers. Detectors are evaluated on one to twenty projects (average 5.7; median five).


The concrete project samples are all distinct and mostly even disjoint, except in the evaluations of dynamic detectors that chose target projects from the DaCapo benchmark suite [BGH+06]. Most evaluations apply detectors to projects individually. In this setting, the detectors learn project-specific patterns and identify respective violations, assuming that correct usages occur frequently within projects. If some API is, for example, only used once in a project, the detectors cannot determine whether this usage is correct or not, even if that particular usage is obviously incorrect. The exceptions to this are (1) Jadet, which was also evaluated in a multi-project setting where it was trained on usages from 6,097 projects to detect misuses in 20 of these projects, (2) CAR-Miner and Alattin, which mine patterns from examples retrieved via a code search engine, (3) RRFinder, which was trained on a manually curated dataset of correct usage examples, and (4) PJAG12, which was trained on a public dataset of execution traces. While Jadet shows a slightly better performance in the multi-project setting, the evaluations indicate no general superiority of either approach.

To assess the detection performance, most authors review the top-X findings of their detectors in their experiments, where X is either a fixed number or a percentage (sometimes 100%, i.e., all findings). They then either present anecdotal evidence of true positives or measure the precision of detectors. Many evaluations also present additional categories of findings, such as code smells, to distinguish false positives from other non-misuse findings that may still be valuable to developers. The definitions of when a finding belongs to which category, if provided at all, differ between publications, even if they use the same label, e.g., "bug" or "code smell."

We argue that reviewing a fixed number of findings more realistically mirrors the end-user scenario, where a developer is presented with a list of findings for review. Developers are unlikely to review an arbitrarily large number of findings, especially if many of them are false positives [FLL+02, BBC+10, JS13]. It is more likely that they focus on the findings that are presented to them first. As a fixed number, authors reviewed 10-60 findings (average 24; median 10). Considering that many detectors present a significantly larger number of findings, it becomes important that detectors effectively rank more severe findings before others, e.g., crash bugs before code smells.

Precision is an important metric for API-misuse detectors, since large numbers of false positives are one of the main reasons why developers reject code-analysis tools [FLL+02, BBC+10, JS13]. However, we are also interested in the detectors' recall, since our conceptual assessment in Table 4.1 suggests that detectors miss many misuses. We argue that it is important for developers to know which kinds of mistakes a detector misses, in order to decide about additional quality measures. Past studies widely ignored the recall of API-misuse detectors. One exception is LKL08, whose recall was measured with respect to four known bugs in a single target project. Further exceptions are PJAG12 and Salento, whose recall was measured with respect to randomly mutated usages. No study in our survey measured recall with respect to a larger quantity of real-world usages.
Overall, the selection of different target-project samples and review sample sizes and the varying definitions of finding categories make a direct comparison of the detectors, solely based on their reported empirical results, unreliable. The only concrete comparison

between detectors is presented by Pradel et al. [PJAG12], who compare the findings of PJAG12 to those of GROUMiner and Tikanga and who discuss which misuses are identified across detectors and which only by one of them. This calls for a more standardized evaluation of misuse detectors and for putting more effort into the cross-comparison of detectors. Part II of this thesis presents such an evaluation and comparison.


Part II.

MuBench


A Systematic Evaluation of Static API-Misuse Detectors

To mitigate API misuse, researchers have proposed several API-misuse detectors that are able to find misuses through static analysis. These detectors commonly analyze API usages, i.e., code snippets that use a given API. They mine usage patterns, i.e., equivalent API usages that occur frequently, and then report deviations from these patterns as potential misuses. The detectors' underlying techniques differ, especially with respect to how they encode API usages and frequency, as well as in how they identify patterns and violations thereof. Chapter 4 of this thesis presents a survey and qualitative comparison of these detectors. In this part, we compare them quantitatively. Our survey in Chapter 4 shows that previous empirical studies generally evaluated detectors on different sets of target projects, such that the respective results are hardly comparable. In many cases the exact versions of the projects are not reported or became unavailable, which makes it impossible to reproduce results and to evaluate other detectors on the same project versions. Moreover, we find that all studies focus on the precision of detectors. Precision is an important metric, since large numbers of false positives are one of the main reasons why developers do not use code-analysis tools [FLL+02, BBC+10, JS13]. However, we are also interested in the detectors' recall, since our conceptual assessment in Chapter 4 suggests that detectors miss many misuses. We argue that it is important for developers to know which kinds of mistakes a detector misses, in order to decide about additional quality measures. Comparing different misuse detectors with respect to both their precision and recall is a challenging task, due to the different underlying mechanisms and representations used by each detector. Moreover, it is very difficult to design a unified evaluation setup that fairly compares both static and dynamic techniques, without resorting to comparing apples and oranges, since both generally require different input (code examples vs. execution traces). Therefore, in this part of the thesis, we focus on static API-misuse detectors. To achieve a reproducible empirical comparison of static misuse detectors, we build MuBench, the first automated benchmark for Java-API-misuse detectors (Chapter 5). We integrate the four state-of-the-art misuse detectors Jadet, GROUMiner, DMMC, and Tikanga into the benchmark (Section 5.1). We exclude the other eight static detectors that we identified in our survey in Chapter 4, because two rely on the discontinued Google Code Search, five target C/C++ code, and one targets Dalvik Bytecode, while our benchmark contains Java misuses.6 We use the misuse examples from Chapter 2 to create a ground-truth dataset with 64 misuses from 13 real-world projects (Section 5.2).

6 Note that we selected these detectors in 2016, based on the results of the first round of our survey. Therefore, the detectors that were published later were not initially considered for the benchmark.

Then we design three experiments, to measure both the detectors' precision and recall:

Experiment P (Section 5.3) measures the precision of the detectors in a per-project setting, where they mine patterns and detect violations in individual projects from MuBench. This is the same setting we find in most existing empirical evaluations (see Table 4.2).

Experiment RUB (Section 5.4) determines upper bounds to the recall of the detectors with respect to the known misuses in MuBench. We take the possibility of insufficient training data out of the equation, by providing the detectors with crafted examples of correct usages for them to mine required patterns.

Experiment R (Section 5.5) measures the recall of the detectors against both the misuses from the MuBench dataset and the detectors' own confirmed findings from Experiment P in a per-project setting.

In Chapter 6, we present the MuBenchPipe, a pipeline that automates all parts of the evaluation process, except for a manual review of the subset of a detector's findings that may potentially identify a misuse. MuBench pre-filters such candidates based on misuse locations, to reduce the manual effort of evaluations and ease reproduction of benchmark experiments. Moreover, MuBenchPipe enables full traceability of experiments and review decisions through a ready-to-use artifact website.

In Chapter 7, we use MuBench to empirically evaluate and compare the four state-of-the-art misuse detectors that we integrated. Our quantitative results show that these detectors are practically capable of detecting misuses, when provided with correct usages to mine patterns from, i.e., when they successfully mine the respective patterns. However, they suffer from extremely low precision and recall in a realistic setting. We identify four root causes for false negatives and seven root causes for false positives. Most importantly, to improve precision, detectors need to go beyond the naive assumption that a deviation from the most-frequent usage corresponds to a misuse, for example, by building probabilistic models to reason about the likelihood of usages in their respective context. To improve recall, before all else, detectors need to learn better models of API usage, for example, by obtaining more correct usage examples from different sources, such as code-search engines, and considering program semantics, such as type hierarchies and implicit dependencies between API usages. These novel insights are made possible by MuBench. Our empirical results present a wake-up call, unveiling serious practical limitations of tools and evaluation strategies from the field, especially with respect to detectors' recall, which is typically not evaluated, and the application of detectors to individual projects, which do not seem to give them sufficient data to learn good models of correct API usage.

In Chapter 8, we discuss the extensibility and reusability of MuBench. The benchmark eases the integration of new misuse examples and of further misuse detectors, to increase generalizability of experiment results and encourage cross-detector comparison. We demonstrate extensions to the dataset, e.g., from the findings of a study of

runtime-verification techniques [LHX+16] and integrate the static bug finder FindBugs7 to compare its capabilities to those of API-misuse detectors. In Chapter 9, to conclude this part of the thesis, we provide an overview of related work on both benchmarking bug-detection tools and evaluating API-misuse detectors.

7http://findbugs.sourceforge.net/ (checked on Mar 20, 2018)


5. Evaluation Setup

In this chapter, we describe the evaluation setup we use to empirically compare the capabilities of API-misuse detectors. We design three experiments, to measure both the detectors’ precision and recall. As the basis of these experiments, we assemble a ground-truth dataset from the misuse examples we identified in Chapter 2. This enables us to compare all detectors on the same target projects and with respect to the same known misuses.

5.1. Subject Detectors

To evaluate detectors, ideally, we use the exact same implementations as were used for the original evaluations presented in the respective publications. Compared to reimplementations, this avoids bias that might come from misunderstandings of the approaches or diverging decisions with respect to implementation details that are not described in the publications. Moreover, it reduces the effort to obtain implementations of multiple detectors. To obtain the original implementations, we contact the authors of existing API-misuse detectors and ask them to provide the prototypes they used in the original empirical evaluations. We focus on misuse detectors for Java APIs, in order to evaluate them using the examples of Java-API misuse we presented in Chapter 2. Our survey in Chapter 4 identifies seven such detectors.1 We contacted the respective authors and got responses from all of them:

• The authors of Jadet [WZL07], GROUMiner [NNP+09b], DMMC [MBM10], and Tikanga [WZ11] provided their implementations and assisted us in setting them up for execution on our benchmark.

• The authors of DroidAssist [NPVN15] also provided their implementation, but we found that it supports only Dalvik Bytecode,2 while the examples in our dataset originate from general Java projects, which compile to Java Bytecode. Therefore, we exclude DroidAssist from our experiments.

• The authors of Alattin [TX09a] and CAR-Miner [TX09b] informed us that we can no longer run their respective implementations, because they both depend on Google Code Search to retrieve usage examples, a service that is no longer available.3

1 Note that this is based on the results of the first round of our survey in 2016. Meanwhile, new detectors have been published.
2 A bytecode format developed by Google, which is optimized for the characteristics of mobile operating systems (especially for the Android platform).

This leaves us with the four detectors Jadet, GROUMiner, DMMC, and Tikanga.

5.2. API-Misuse Dataset

Our goal is to empirically evaluate and compare API-misuse detectors. To this end, we need a dataset with targets to run all these detectors on. We use the misuse examples that we identified in Chapter 2 as a starting point to create such a ground-truth dataset, turning each example into a candidate for inclusion as follows.

For the misuses we identified in real-world projects (see Section 2.2 and Section 2.4), we locate the actual occurrences from the respective codebases. For each misuse, we manually identify the respective project's version-control system and extract the revision Id of a project version that contains the misuse, as well as the path of the file and signature of the method that the misuse occurs in, and the name of the API that is misused. In addition, if the misuse was fixed in the project history, we extract the revision Id of the version immediately after the fix. Finally, we assemble a description from the respective issue report, the fix, and the API documentation. In total, we obtain such data about 103 misuses from 49 projects. Note that we had to exclude the misuse examples from the AspectJ project, as identified by the iBugs dataset, at this point, because we could not locate the version-control revisions designated in the dataset in the project's version-control system, which has been migrated since the release of iBugs.

For the misuses we identified through the developer survey (see Section 2.3) and the misuses we identified in studies on API-usage directives (see Section 2.1), we manually create respective code examples and add them to our dataset. We aim to create minimal-yet-realistic examples, as they might appear in a real codebase, if the respective usage has been factored out into a single method. We do not create any call sites for the methods containing the misuses, since any additional code beyond the API usage in question would be arbitrary and, thus, might introduce bias. As for the misuses from real-world projects, we create a description of each of the misuse examples from the survey. This gives us an additional 28 misuse examples.

Following this process gives us in total 131 misuse examples as candidates for inclusion in our ground-truth dataset. For each candidate, we now possess a respective occurrence in source code. However, while GROUMiner works on source code, Jadet, Tikanga, and DMMC require Java Bytecode as input. Thus, we can only compare these detectors on misuse examples for which we have both source code and Bytecode. Since Bytecode is not readily available for most of our misuse examples, we resort to compiling them ourselves. For the hand-crafted examples this is relatively easy, since the examples are small and we can easily create a minimal Gradle build configuration to compile them. For the examples from real-world projects, we need to compile the

3 https://google-opensource.blogspot.com/2015/03/farewell-to-google-code.html (checked on Mar 09, 2018)

respective project versions that contain the misuses, by adding necessary build files and fixing any dependency issues. Unfortunately, this task cannot be entirely automated. To make it at least reproducible, we manually determine a sequence of commands that compile a project version starting from a clean checkout. To bound the manual effort in determining these commands, we define a procedure that we follow for each project version:

1. If the project version uses a build configuration, such as Ant, Gradle, or Maven, we add a respective compilation command and run it.

2. If running the build fails for local reasons, such as a build directory that needs to be created, we add respective preparation commands to resolve the problems.

3. If running the build fails because the configuration is outdated or erroneous, we try to fix it and add respective transformation commands. If we cannot fix the configuration within one day, we continue with the next step.

4. If there is no build configuration or we cannot get it to run, we try to manually construct a minimal Gradle configuration to compile the project version. If we cannot construct a configuration within one day, we exclude the project version from our dataset.

5. If running the build fails because required dependencies are not included in the project version and cannot automatically be resolved, we manually search for the respective JAR files and add them to a dedicated Maven repository for use in the benchmark compilation. If we cannot manually resolve some dependencies within one hour, we manually create stubs for the types required by the project version, if this requires stubbing at most ten types. Otherwise, we exclude the project version from our dataset.

6. If the project version contains compilation errors with an obvious fix, we add respective transformation commands to fix them. For example, in one project version, an instance of java.security.CodeSource is created with null as the second constructor argument. With Java 1.4 this was valid, because CodeSource had only a single constructor. However, Java 1.5 introduced a second constructor, which makes the call ambiguous. To fix this problem, we insert a cast of null to java.security.cert.Certificate[], the type it was implicitly cast to before (illustrated below). If we cannot fix all compilation problems after one day or we encounter an error without an obvious fix, we exclude the project version from our dataset.
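The following sketch illustrates the kind of transformation this step applies; the surrounding class and the jar URL are invented for the example, and only the inserted cast reflects the fix described above.

import java.net.URL;
import java.security.CodeSource;

public class CodeSourceFix {
    public static void main(String[] args) throws Exception {
        URL location = new URL("file:///tmp/example.jar"); // illustrative location

        // Ambiguous since Java 1.5, because both CodeSource(URL, Certificate[])
        // and CodeSource(URL, CodeSigner[]) match a plain null second argument:
        //   new CodeSource(location, null);

        // The explicit cast selects the constructor that the call resolved to
        // before Java 1.5:
        CodeSource source =
                new CodeSource(location, (java.security.cert.Certificate[]) null);
        System.out.println(source);
    }
}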

Following this process we obtain 29 compilable versions of 13 projects. We have to exclude 29 versions of 19 projects, because 22 versions contain compile errors that we cannot fix and seven versions miss dependencies that we cannot resolve. Furthermore, we have to exclude 17 versions from as many Android apps, because Android has its own bytecode format, Dalvik,4 which we cannot feed to misuse detectors that read Java Bytecode.

4 https://en.wikipedia.org/wiki/Dalvik_(software) (checked on Nov 16, 2017)


Table 5.1.: Datasets Used Throughout Part II of This Thesis, with the Number of Hand-crafted Misuses (#HM), the Number of Real-world Projects (#P), the Number of Project Versions (#PV), the Number of Misuses in These Project Versions (#PVM), and the Total Number of Misuses (#M). "n/a" Denotes that the Number Is Irrelevant for the Use of the Dataset.

    Dataset           #HM   #P  #PV  #PVM   #M
  1 Experiment P      n/a    5    5   n/a  n/a
  2 Experiment RUB     25   13   29    39   64
  3 Experiment R        0   13   29    53   53

We proceed likewise for three misuse examples from the developer survey that are specific to the Android standard library. In the end, we have 64 misuses in total for our experiments: 39 misuses from the 29 versions of 13 real-world projects and 25 hand-crafted examples. Note that some project versions contain multiple misuses. Table 5.1 describes the subsets of this dataset that we use in the individual experiments. We publish the dataset5 for others to use in future studies.

5.3. Experiment P

We design Experiment P to assess the precision of detectors.

Motivation Past studies show that developers rarely use analysis tools that produce many false positives [FLL+02, BBC+10, JS13]. Therefore, for a detector to be adopted in practice, it needs high precision.

Setup To measure precision, we follow the most-common experimental setting we found in the literature (cf. Table 4.2), i.e., the per-project setting. In this setting, detectors mine patterns and detect violations on a per-project basis. First, we run detectors on individual project versions. Second, we manually validate the top-20 findings per detector on each version, as determined by the respective detector's ranking strategies. We limit the number of findings, because it seems likely that developers would only consider a fixed number of findings, rather than all of a potentially very large number of findings. Hence, the precision in a detector's top findings is likely crucial for tool adoption. Also, we need to limit the effort of reviewing findings of multiple detectors on each project version.

Dataset Since manually reviewing findings of all detectors on all project versions is infeasible, we sample five project versions. To ensure a fair selection of projects, we first run all detectors on all project versions. For practical reasons, we time out each detector on an individual project version after two hours. The run statistics are summarized in Table 5.2.

5https://github.com/stg-tud/MUBench/ (checked on Mar 20, 2018)


Jadet and Tikanga fail on one project version and DMMC fails on four project versions, since the Bytecode contains constructs that the detectors' respective Bytecode toolkits do not support. GROUMiner times out on eight project versions and produces an error on one other version. We exclude any project version where a detector fails. For the remaining 15 versions, we observe that the total number of findings correlates across detectors. Table 5.3 shows that the pairwise correlation (Pearson's r) is strong (≥ 0.75) or medium (≥ 0.5) for all pairs of detectors, except for Jadet and GROUMiner (r = 0.49). This means that either all detectors report a relatively large or a relatively small number of findings on any given project version. We hypothesize that the total number of findings might be related to the detectors' ability to precisely identify misuses in a given project version. Therefore, we sample project versions according to the average normalized number of findings across all detectors. We normalize the number of findings per detector on all project versions by the maximum number of findings of that detector on any project version. We sample the two projects with the highest average normalized number of findings across all detectors (Closure6 v319 and iText7 v5091) and the two projects with the lowest average normalized number of findings across all detectors (JMRTD8 v51 and Joda-Time9 v1231). Additionally, we randomly select one more project version (Apache Lucene10 v1918) from the remaining projects, to cover the middle ground. Note that we select at most one version from each distinct project, because different versions of the same project may share a lot of code, such that detectors are likely to perform similarly on them. This dataset for Experiment P is summarized in Row 1 of Table 5.1.
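Written out, the sampling score of a project version v is the average of the per-detector normalized finding counts, where f_d(v) denotes the number of findings of detector d on version v and D is the set of detectors (a restatement of the normalization just described):

\[
  \hat{f}_d(v) \;=\; \frac{f_d(v)}{\max_{v'} f_d(v')},
  \qquad
  \mathit{avg}(v) \;=\; \frac{1}{|D|} \sum_{d \in D} \hat{f}_d(v).
\]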

Metrics We calculate the precision of each detector, i.e., the ratio of the number of true positives to the number of reviewed findings.
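In formula form (a restatement of the definition above, where the true positives are the reviewed findings we confirm as misuses):

\[
  \mathit{precision} \;=\; \frac{|\{\text{reviewed findings confirmed as misuses}\}|}{|\{\text{reviewed findings}\}|}.
\]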

Review Process Two authors independently review each of the top-20 findings of the sampled project versions and mark it as a misuse or not. To determine this, they consider the logic and the documentation in the source code, the API’s documentation, and its implementation if publicly available. After the review, any disagreements between the reviewers are discussed until a consensus is reached. We report Cohen’s Kappa score as a measure of the reviewers’ agreement. Note that we follow a lenient reviewing process. For example, assume a usage misses a check if (Iterator.hasNext()) before calling Iterator.next(). If the detector finds that hasNext() is missing, we mark the finding as a hit, even though this does not explicitly state that the call to next() should be guarded by a check on the return value of hasNext(). This follows our intuition that such findings may still provide a developer with a valuable hint about the problem.

6https://developers.google.com/closure/compiler/ (checked on Feb 24, 2017) 7https://sourceforge.net/projects/itext/ (checked on Feb 24, 2017) 8http://jmrtd.org/ (checked on Feb 24, 2017) 9http://www.joda.org/joda-time/ (checked on Feb 24, 2017) 10https://lucene.apache.org/core/ (checked on Feb 24, 2017)


Table 5.2.: Number of Findings per Detector on All Compilable Project Versions in MuBench. Detectors time out after two hours. Experiment P includes the two projects with the highest number of findings (t), the two projects with the lowest number of findings (u), and one randomly selected project (◦).

                                        Number of Findings
  Project               Version    Jadet  GROUMiner  Tikanga   DMMC   Norm. Avg.  Sample Criterion
  Apache Commons Lang       587        0         28        0    157         0.06
  Apache Commons Math       998       17      error       17    686         0.20
  ADempiere                1312        0         27        0    116         0.05
  Alibaba Druid          e10f28       17    timeout        5    520         0.13
  Closure                   114      113        101       24   1233         0.49
  Closure                   319      176        126       45   1945         0.74   t
  Closure                   884       71        167       33   1966         0.63
  Apache HttpClient         302        0         12        0    114         0.03
  Apache HttpClient         444        0         15        0    110         0.03
  Apache HttpClient         452        0         12        0    113         0.03
  iText                    5091       17        198       55   1138         0.55   t
  Apache Jackrabbit        1601       12        186       22  error         0.41
  Apache Jackrabbit        1678        0         15        0  error         0.03
  Apache Jackrabbit        1694       13        186       22  error         0.41
  Apache Jackrabbit        1750       10    timeout        8    434         0.12
  JFreeChart                103      167    timeout       88    673         0.69
  JFreeChart                164      168    timeout       90    664         0.69
  JFreeChart                881      194    timeout       93    745         0.76
  JFreeChart               1025      194    timeout       93    747         0.76
  JFreeChart               2183      190    timeout      100    906         0.81
  JFreeChart               2266      195    timeout      102    913         0.82
  JMRTD                      51        0         11        0     29         0.02   u
  JMRTD                      67        0         10        0     35         0.02
  Joda-Time                1231        0          0        0      1         0.00   u
  Apache Lucene             207        0        140        0    182         0.20
  Apache Lucene             754        0         54        0    265         0.10
  Apache Lucene            1251        2         62        0  error         0.11
  Apache Lucene            1918        2         88        4    583         0.20   ◦
  Mozilla Rhino          286251    error         55    error    257         0.20


Table 5.3.: Correlation of the Number of Findings per Project Version for All Pairs of Detectors (Pearson's r). Strong correlation (r ≥ 0.75) in bold. Medium correlation (r ≥ 0.5) in italic.

                 Jadet  GROUMiner  DMMC  Tikanga
  Jadet           1.00
  GROUMiner       0.49       1.00
  DMMC            0.85       0.78  1.00
  Tikanga         0.70       0.82  0.88     1.00

5.4. Experiment RUB

We design Experiment RUB to assess the detection capabilities of our subject detectors, i.e., to measure an upper bound to their recall under the assumption that they always mine the required pattern.

Motivation We argue that it is important for developers to know which misuses a particular tool may or may not find, in order to decide whether the tool is adequate for their use case and whether they must take additional measures. Moreover, it is important for researchers to know which types of misuses existing detectors may identify, in order to direct future work. Therefore, we measure detectors' recall while providing sufficiently many correct usages that would allow them to mine the required pattern.

Dataset For this experiment, we use all compilable project versions from the MuBench dataset with the respective known misuses, as well as the hand-crafted misuse examples. This dataset for Experiment RUB is summarized in Row 2 of Table 5.1.

Setup Recall that all our subject detectors mine patterns, i.e., frequently reoccurring API usages, and assume that these correspond to correct usages. They use these patterns to identify misuses. Recall further that each detector has a distinct representation of usages and patterns and its own mining and detection strategies. If a detector fails to identify a particular misuse, this may be due to (1) an inherent limitation of the detector, e.g., because it cannot represent some usage element such as conditions, or (2) a lack of examples of respective correct usage for pattern mining, i.e., a limitation of the training data. With Experiment RUB, we focus on (1), i.e., we take (2) out of the equation and assess the detectors' general ability to identify misuses. To this end, we provide the detectors with sufficiently many examples of correct usage corresponding to the misuses in question. This guarantees that they could mine a respective pattern. If the detector is unable to identify a misuse in this setting, we know the problem lies with the detector itself.

We manually create a correct usage for each misuse in the dataset. For the misuse examples from the real-world projects, we derive the correct usages from the fix that we find in the respective project's version-control history, if one exists. While gathering candidates for inclusion in our benchmark dataset in Section 5.2, we identified these fixes at the granularity of a commit.


However, we find that these commits often contain many unrelated changes, such as refactorings or reformatting, in addition to the actual fix. To make comparison of the usage before and after the fix easier, we manually reduce the changeset to only those changes required to turn the misuse into a correct usage and apply only this reduced changeset to the code before the fix. Then we remove from the fixed code anything that does not impact the usage in question, such as other usages with no data or control dependencies to the usage in question. For the hand-crafted misuse examples derived from API usage directives (see Section 2.1) and the developer survey (see Section 2.3), we derive the correct usages by following the directives that we knowingly violated and from the problem descriptions of the survey respondents, respectively. As for the misuse examples themselves, we aim to create minimal-yet-realistic examples of correct usage. To this end, we take the misuse examples and apply only the necessary modifications to turn them into correct usages, just as a developer fixing the misuse might proceed. We store the code of this crafted correct usage in our dataset.

In the experiment, we run each detector once for each individual known misuse in the dataset. In each run, we provide the detector with the file that contains the known misuse and with 50 copies of the respective crafted correct usage. We ensure that the detector considers each copy as a distinct usage. We configure the detectors to mine patterns with a minimum support of 50, thereby ensuring that they mine patterns only from the code in the crafted correct usage. We chose 50 as a threshold, since it is high enough to ensure that no detector mines patterns from the code in the file with the misuse.

Metrics We calculate two numbers for each detector. The first is its conceptual recall upper bound, which is the fraction of the known misuses in the dataset that match its capabilities from Table 4.1. Note that the conceptual recall upper bound is calculated offline, without running any experiments. The second is the detector’s empirical recall upper bound, which is the fraction of misuses a detector actually finds from all the known misuses in the dataset. An ideal detector should have an empirical recall upper bound equal to its conceptual recall upper bound. Otherwise, its practical capabilities do not match its conceptual capabilities. In such cases, we investigate the root causes for such mismatches. Note that we use the term “upper bound,” because neither metric reflects the detectors’ recall in a setting without guarantees on the number of correct usages for mining.
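Stated as formulas over the set M of known misuses in the dataset (a restatement of the two definitions above):

\[
  \mathit{recall}^{\mathrm{UB}}_{\mathrm{conceptual}} \;=\; \frac{|\{\, m \in M \mid m \text{ matches the detector's capabilities (Table 4.1)} \,\}|}{|M|},
\]
\[
  \mathit{recall}^{\mathrm{UB}}_{\mathrm{empirical}} \;=\; \frac{|\{\, m \in M \mid \text{the detector actually finds } m \,\}|}{|M|}.
\]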

Review Process To evaluate the results, we review all potential hits, i.e., findings from each detector that identify violations in the same files and methods as known misuses. Two authors independently review each such potential hit to determine whether it actually identifies one of the known misuses. If at least one potential hit identifies a misuse, we count it as a hit. After the review, any disagreements between the reviewers are discussed until a consensus is reached. We report Cohen's Kappa score as a measure of the reviewers' agreement. We follow the same lenient review process as for Experiment P.


5.5. Experiment R

We design Experiment R to assess the recall of detectors.

Motivation While Experiment RUB gives us an upper bound to the recall of misuse detectors, we also want to assess their actual recall where we do not provide them with correct usages ourselves. Due to the lack of a ground-truth dataset, such an experiment has not been attempted before in any of the misuse-detection studies we surveyed in Chapter 4.

Dataset As the ground truth for this experiment, we use all known misuses from real-world projects in MuBench plus the true positives identified by any of the detectors in Experiment P. This means that Experiment R not only evaluates recall against the misuses of MuBench, but also practically cross-validates the detector capabilities against each other. We exclude the hand-crafted misuse examples from this experiment, since there is no corresponding code for the detectors to mine patterns from. The dataset we use for Experiment R is summarized in Row 3 of Table 5.1.

Setup We run all detectors on all project versions individually, i.e., we use the same per-project setup as for Experiment P.

Metrics We calculate the recall of the detectors, i.e., the number of actual hits over the number of known misuses in the dataset.

Review Process We review all potential hits in the same process as for Experiment RUB. This gives us the detectors' recall with respect to a large number of known misuses from MuBench.

5.6. Limitations

Ideally, our experiments would include thousands of misuses from a large number of projects and in each individual project version, to give us greater confidence in the generalizability of our benchmark. However, currently, there is no such dataset. We invested several months of effort to collect and prepare the dataset in its current state, to make a first step towards such a dataset. Now that we have the infrastructure in place, it is straightforward to extend the dataset with misuse examples from different sources. Our benchmark dataset is subject to the limitations of how we identified examples of API misuse in the wild (see Section 2.6). The benchmark dataset may not be representative of API misuses in the wild, especially because we could only compile 29 (50%) of the project versions with misuses and, therefore, had to exclude 38 (37.2%) of the misuses, which appear in the other versions, from our experiments (see Section 5.2). Compiling arbitrary versions of a project from its source-control history is a challenging task. The author of this thesis invested two full weeks, plus three additional months of work by a student, to include as many project versions as possible. Still, losing the examples for which we could not compile the respective project versions may bias the benchmark.

For example, it may be that the projects we could compile generally adhere to higher quality standards and that, therefore, the misuses they contain are more complex. Conversely, it is possible that they adhere to lower quality standards, since they contain relatively more misuses. Future work may mitigate this limitation by adding further project versions and misuse examples to the dataset. To make this as easy as possible, we build our benchmarking pipeline agnostic to the concrete dataset, such that the benchmarking experiments automatically consider new additions to the dataset.

The concrete misuse instances we include in our dataset may not be representative of all instances of the same misuse. For example, a certain misuse might often be distributed over multiple methods, while our dataset contains only occurrences in a single method or vice versa. This threat will likely become smaller as the dataset increases in size.

We did not design an experiment to measure an upper bound to the precision of detectors. To measure the upper bound of recall, we provide detectors with the misuses and the corresponding fixed usages and check whether they can detect the difference. To measure the upper bound of precision, however, we would have to provide detectors with a minimal version of the fixed usage, because every additional element may cause false positives. It is unclear how to construct such minimal examples.

6. Experiment Pipeline

Following the idea of automated bug-detection benchmarks for C programs, such as BugBench [LLQ+05] and BegBunch [CHK+09], we facilitate the task of benchmarking multiple detectors on our misuse dataset with an automated experiment pipeline. We call this pipeline MuBenchPipe. The goal of MuBenchPipe is (1) to automate as much as possible of the experimental setup presented in Chapter 5, to reduce the effort of evaluating API-misuse detectors, and (2) to make benchmarking experiments reproducible and extensible. The pipeline also enables adding new detectors to the comparison, as well as benchmarking with different or extended datasets, in the future. We publish MuBenchPipe1 for future studies.

6.1. Representation

To enable automated processing of our dataset, we store it in the data schema depicted by Figure 6.1. Each misuse example that we collected appears in a particular project. More specifically, we identified each misuse in one particular version of a project, i.e., in one particular version-control revision. Usually, the same misuse also exists in other versions of the project, i.e., in all revisions from its introduction up until the revision immediately before it got fixed. To be able to count distinct misuses both per project version and per project, we represent all three as separate entities in our dataset, where a project has one or more versions and contains one or more misuses, and each of the project's versions contains one or more of the project's misuses. For the hand-crafted misuse examples, we create an artificial project named synthetic and distinct project versions that contain the source code of one crafted example each. For each project, the dataset records the project name, the project website, and the project repository. We use the repository to uniquely identify a project, because this allows us to uniquely identify project versions using their respective revision Id. For each project version, the dataset records the revision Id, the relative path to the source files and class files (after build), as well as the list of build commands obtained in Section 5.2. This information suffices to check out the project versions, compile them, and provide the source code or Bytecode or both to a misuse detector. For each misuse, we store the source that we identified it from, the description of the misuse, its location in the project version's source folder, and a reference to its fix. Listing 6.1 shows an example of such metadata, using the YAML format.2

1https://github.com/stg-tud/MUBench/ (checked on Mar 20, 2018) 2 http://yaml.org/ (checked on Nov 24, 2017)


Figure 6.1.: Structure of the API-Misuse Dataset (diagram not reproduced; it shows the entity Project, with the attributes Name, Repository, and Website; the entity Version, with the attributes Revision, Sources Path, Classes Path, and Build Commands; and the entity Misuse, with the attributes Source, Description, APIs, Location, Crash, Internal, and Fix, connected by the 1..N and 1..M "has"/"contains" relations described in the text)

The source designates where we found the misuse example. The name either identifies one of the existing datasets we analyzed (Section 2.2), the JCA usages or JCA fixes study (Section 2.4), our developer survey (Section 2.3), or the publication we took the misuse example from. We provide a url to the original dataset or publication, if applicable. The report designates a ticket, usually in an issue tracker, reporting the misuse. The metadata contains a textual description of the misuse and its fix. Furthermore, we document whether the report indicates that the misuse may cause a crash, list all types that are involved in either the misuse or the respective correct usage (api), and document whether any of these types is declared in the project containing the misuse or whether they are declared by some dependency (internal). The location points to the source-code location of the misuse in the project. We report the Java source file and the signature of the enclosing method. We use this information to uniquely identify a misuse. Additionally, we provide the revision Id for the respective fix and a URL to the fixing commit, if available from the report.

6.2. Benchmark Automation

MuBenchPipe automates many of our evaluation steps, including retrieving and compiling the project versions' source code, running detectors, collecting their findings, and performing the manual reviews of potential hits. It provides a command-line interface to control these tasks. We subsequently describe the pipeline steps we implemented to facilitate our evaluation and to enable easy replication of our experiments.


source:
  name: BugClassify
  url: https://www.st.cs.uni-saarland.de/softevo/bugclassify/
report: https://bugzilla.mozilla.org/show_bug.cgi?id=286251
description: >
  IRFactory.initFunction() is called twice along one possible execution path,
  which causes an infinite loop. This is fixed by removing the second call.
crash: yes
api:
  - org.mozilla.javascript.IRFactory
internal: yes
location:
  file: org/mozilla/javascript/Parser.java
  method: function(int)
fix:
  commit: https://github.com/mozilla/rhino/commit/ed00a2e83de1e768918604a65def097895b71dd4
  revision: ed00a2e83de1e768918604a65def097895b71dd4

Listing 6.1.: Metadata of an API Misuse from the Rhino Project.

Checkout MuBenchPipe uses the recorded repository URL and commit Id from the dataset to obtain the source code of the respective project version. It supports SVN and Git repositories, source archives (zip), as well as special handling for the hand-crafted examples in the dataset.

Compile For every project version, MuBenchPipe first copies the entire project source code, the individual files containing known misuses, and the respective crafted correct usages for Experiment RUB each into a separate folder. Then, it uses the respective build configuration from the dataset to compile all Java sources to Bytecode. During compilation, MuBenchPipe captures all build dependencies from any Maven, Gradle, and Ant execution and copies them into a dedicated folder. After compilation, it copies the entire project Bytecode, the Bytecode of the individual files containing known misuses, and the Bytecode of the respective crafted correct usages each into a separate folder. From the Bytecode of the entire project it additionally creates a Java archive (jar) file. This way, the pipeline can provide the detectors with the source code and Bytecode of each of these project parts individually, as well as with class paths that contain all respective dependencies.

Detect For each detector, we build a runner to have a unified command-line interface for all detectors. These runners invoke the detectors with the best configuration reported in the respective publication. To run a detector on a project version, MuBenchPipe invokes the detector's runner with the paths to the respective source code, Bytecode, and dependencies. The runner uses these paths to provide the right input to the detector and to output its findings. The pipeline can be run in Experiment RUB, Experiment P, and Experiment R mode. From the detector's point of view, running Experiment R is the same as running Experiment P.


The difference comes in the reviewing process, where only findings that match the known misuses are reviewed in Experiment R, while a sample of all findings is reviewed in Experiment P. After running the actual detector, the runner converts the detector's findings into a unified format to facilitate the following validation step. For each finding, this format specifies the name of the file and the name of the method that the finding is in. In addition, the runners may add tool-specific data that helps with validation (e.g., the detector's confidence value, a description of the violated pattern, or a description of the violation). Apart from adding some accessor methods that allow us to obtain the detectors' output, we leave the detector implementations unchanged. We sent the runner code to the original authors of the detector implementations and they confirmed that our invocations of their code are correct.

Figure 6.2.: A Finding of GROUMiner Presented on the Review Site (screenshot not reproduced)

Validation To help with the manual review of findings, MuBenchPipe automatically publishes experiment results to a review website that shows, for every detector finding, the source code it is found in along with any metadata the detector provides, such as the violated pattern, the properties of the violation, and the detector's confidence. Figure 6.2 shows an example of this for a finding of the GROUMiner detector.

For Experiment RUB and Experiment R, MuBenchPipe automatically filters potential hits, by matching findings to known misuses by file and method name. On the review website, a reviewer sees the description of the known misuse as well as its fix, along with the set of potential hits that need to be reviewed. For Experiment P, MuBenchPipe shows all findings of the detector on the review site, each with the source code it is located in. The review website allows reviewers to save an assessment and comment for each finding. It also ensures at least two reviews for each finding, before automatically computing the experiment statistics, such as the precision, recall, and Kappa scores.

6.3. Reproducibility and Traceability

We publish MuBench3 and encourage others to use and contribute to the dataset and the automated pipeline to conduct and repeat experiments and to extend the benchmark itself. MuBench comes with a Docker4 container, which allows running reproducible experiments across platforms, without the need to ensure a proper environment setup. For legal reasons, we do not include source code or binaries of target projects in the benchmark itself, but instead provide links to the respective version-control systems and tooling that automates the retrieval of respective checkouts and their compilation. This minimizes the effort to collect the dataset and ensures access to the exact same versions of the projects. The results of checkout and compilation are stored locally, such that the respective data remains available for subsequent use. We provide binaries of the misuse detectors integrated into MuBench via our artifact page. Like the target code, MuBench retrieves them automatically upon first use and stores them locally for subsequent use. Multiple versions of a detector may be registered to MuBench, each with a version id and (optionally) a tag. This provides access to the specific detector versions used in a concrete experiment, even if the respective detectors have meanwhile been updated. MuBench's review website (based on PHP and MySQL, such that it can be hosted on any off-the-shelf webspace) facilitates independent reviews, even when researchers work from different locations, while ensuring review integrity using PHP Basic Auth. The website may directly be used as an artifact to publish review results and experiment statistics. While reviews may not be completely reproducible, due to the subjectivity of human reviewers, publishing the review details, including review comments justifying the decisions, makes them at least traceable. All detector findings, reviews, and statistical data may be exported from the review site in CSV format, for processing with other tooling.

3 https://github.com/stg-tud/MUBench/ (checked on Mar 20, 2018)
4 https://www.docker.com/ (checked on Dec 14, 2017)


6.4. Limitations

A crucial step of any evaluation of an API-misuse detector is verifying whether a particular finding correctly identifies an actual API misuse. While MuBench eases reviews by providing a uniform representation of findings and by pre-filtering potential hits based on their location, it cannot fully automate this step. Full automation is indeed impossible for Experiment P, because any finding might identify a previously unknown misuse, such that automated verification would require an approach that can determine for any given usage whether it is a misuse or not. If such an approach existed, it would itself be the perfect misuse detector.

For Experiment RUB and Experiment R, we provide MuBench with specifications of the known misuses and use the location information from these specifications to pre-filter potential hits. It may be possible to improve this filtering using additional data. However, since all detectors use different representations of API usages, patterns, and violations, there is no general way to decide for any misuse detector whether one of its findings identifies a particular misuse, especially if both the misuse and the detector's output format are unknown in advance. Therefore, we can only reduce the manual review effort, but not entirely eliminate it.

Since benchmarking experiments require the validation of detector findings by human reviewers, these assessments might be subjective. To mitigate this problem, we propose that at least two reviewers independently review each finding and discuss any disagreements. Furthermore, with the MuBench review site, we provide a tool to easily publish review results, including reviewer comments justifying decisions, which makes assessments at least traceable, if not strictly reproducible.

7. An Empirical Study of API-Misuse Detectors

With MuBench, our benchmark for static API-misuse detectors, in place, we want to empirically evaluate and compare the misuse detectors Jadet, GROUMiner, Tikanga, and DMMC in the experiments described in Chapter 5. We run all experiments on a MacBook Pro with an Intel Xeon @ 3.00GHz and 16GB of RAM. The full results are available on our artifact page.1

7.1. Experiment P: Precision of the Detectors

Table 7.1 shows our precision results, based on reviewing the top-20 findings per detector on each of our five sample projects. The second column shows the total number of findings in the detectors' top-20, 230 in total across all detectors. Note that all detectors report fewer than 20 findings for some projects. The third column shows the confirmed misuses after resolving disagreements, and the fourth column shows the precision with respect to the reviewed findings. The fifth column shows the Kappa score for the manual reviews, and the remaining columns show the frequencies of root causes for false positives.

We find that the precision of all detectors is extremely low. Tikanga shows the best precision of only 11.4%. Jadet and DMMC follow immediately behind, with a precision of 10.3% and 9.9%, respectively. GROUMiner reports only false positives in its top-20 findings.

Obs.1: All detectors have extremely low precision (below 12%). On average, they report less than 1.5 actual misuses in their top-20 findings.

The Kappa scores indicate high reviewer agreement, which shows that all detectors produced mostly clear false positives. The score is a little lower for Tikanga, because it reported one confirmed misuse twice, which one of the reviewers first accepted as an actual hit while the other did not. The score is also lower for DMMC, because we initially disagreed on several violations it identifies in Iterator usages that check the underlying collection's size instead of hasNext().

1 Artifact Page: A Systematic Evaluation of Static API-Misuse Detection (http://www.st.informatik.tu-darmstadt.de/artifacts/mustudy/)


Table 7.1.: Experiment P: Precision of the Detectors on the Top-20 Findings on Five Projects and Root Causes for False Positives.

Detector     Findings in Top-20   Confirmed Misuses   Precision   Kappa Score
Jadet                39                    4            10.3%        0.97
GROUMiner            66                    0             0.0%        0.97
Tikanga              44                    5            11.4%        0.93
DMMC                 81                    8             9.9%        0.91
Total               230                   17                         0.94

Frequencies of Root Causes for False Positives:
Detector     Uncommon   Analysis   Alternative   Cross-method   Dependent   Bug   Multiplicity
Jadet           21          3           8              0             1       0         2
GROUMiner       25         22           8              7             2       1         1
Tikanga         18          7           7              0             7       0         0
DMMC             9         19          18             19             4       4         0
Total           73         51          41             26            14       5         3

True Positives

Out of the 230 reported findings we reviewed, we confirm 17 true misuses. DMMC reports 8 misuses of the Iterator API where hasNext() is not checked. Jadet reports 4 misuses that access a collection without previously checking its size. Also for collections, Tikanga reports 4 misuses with a missing hasNext() check and 1 misuse with a missing size check. One misuse is reported by both Tikanga and Jadet and another by both Tikanga and DMMC. Additionally, Jadet reports one misuse twice. This leaves a total of 14 unique misuses in the detectors' top-20 findings, all different from the known misuses in MuBench. Interestingly, all these misuses are missing value/state conditions, for which the detectors report only missing calls to methods that would be used in the respective missing checks (i.e., misuses we accept as hits in our lenient review process).

Obs.2: All 14 confirmed misuses in Experiment P are missing value/state-condition checks before accessing the elements of a collection, either directly or through an iterator.
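For illustration, a minimal hypothetical sketch (not code from the sample projects) of the kind of misuse summarized in Obs.2, next to a corrected variant; a detector would report only the missing hasNext() call, not the missing condition itself:

import java.util.Iterator;
import java.util.List;

class FirstElementPrinter {
    // Misuse: accesses the first element without checking that one exists,
    // which may throw a NoSuchElementException for an empty list.
    static void printFirst(List<String> values) {
        Iterator<String> it = values.iterator();
        System.out.println(it.next());
    }

    // Correct usage: checks the iterator's state before calling next().
    static void printFirstChecked(List<String> values) {
        Iterator<String> it = values.iterator();
        if (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}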

False Positives

To identify opportunities to improve the precision of misuse detectors, we systematically investigate the root causes of the false positives they report. In the following, we discuss these root causes summarized across all detectors, in order of their absolute frequency. We label these root causes FP1 to FP7.

Root Cause FP1: Uncommon-but-correct Usage. Particular usages may violate the patterns that detectors learn from frequent usages, without violating actual API usage constraints. Detectors cannot differentiate such infrequent usage from invalid usage. For example, DMMC and Jadet learn that the methods getKey() and getValue() of Map.Entry usually appear together in code.

They both report violations if a call to either of these methods is missing or, in the case of Jadet, if the calls appear in a different order. However, there is no requirement by the API to always call both getter methods, let alone in a specific order. Across the reported violations we analyzed, the detectors falsely report 42 missing method calls in cases where one out of a number of getter methods is missing or invoked in a different order. Another example is that Jadet and Tikanga learn that methods such as List.add() and Map.put() are usually invoked in loops and report five missing loops for respective invocations outside a loop, which are perfectly fine according to the API. Approaches such as multi-level patterns [SBAS15] or Alattin's alternative patterns [TX09a] may help to mitigate this problem. Also note that the four detectors in our experiments all use absolute frequency thresholds, while some of the detectors from our survey in Chapter 4 also used relative thresholds. Future work should investigate how these two alternatives compare.
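As a hypothetical sketch of such an uncommon-but-correct usage (not taken from the reviewed projects): only getValue() is needed here, yet a detector that mined the frequent getKey()-plus-getValue() pattern reports the absent getKey() call as a violation.

import java.util.Map;

class ValuePrinter {
    // Correct but uncommon: iterates the entries and only reads the values.
    // The API does not require a getKey() call, but detectors trained on the
    // frequent pattern {getKey(), getValue()} report one as missing.
    static void printValues(Map<String, Integer> map) {
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            System.out.println(entry.getValue());
        }
    }
}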

Obs.3: Particular usages may be uncommon without violating API constraints. Neglecting this causes 73 (34.3%) of the detectors' false positives in their top-20 findings. This calls for research on detecting patterns without setting a hard threshold on occurrence frequencies. Meanwhile, relaxing requirements on the co-occurrence of getter methods might reduce false positives significantly.

Root Cause FP2: Imprecise Analysis. The detectors use static analysis to determine the facts that belong to a particular usage. Imprecisions of these analyses lead to false positives. For example, the detectors mistakenly report five missing elements in code that uses multiple aliases for the same object and another 17 in code with nested control statements, where they fail to capture all calls belonging to a usage. GROUMiner reports two missing method calls, because it cannot resolve the receiver type for chained method calls, such as for m() in o.getX().m(), which is not generally possible from source code alone. Therefore, GROUMiner fails to match a call between the pattern and the usage. Another example is that the detectors report eight missing method calls due to chained calls on a fluent API, such as StringBuilder, where their intra-procedural analyses cannot determine that all calls actually happen on the same object. Jadet, GROUMiner, and DMMC together report nine missing calls that happen transitively in a helper method of the same class or through a wrapper object, such as a BufferedStream. DMMC reports a missing call that is located in the enclosing method of an anonymous class instance and a missing close() call on a parameter that is, by contract, closed by the callers. Moreover, GROUMiner reports four missing conditions that are checked by assertion helper methods. An inter-procedural detection strategy, as proposed by PR-Miner [LZ05], could mitigate this problem.
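The fluent-API case can be sketched as follows (hypothetical code): every call operates on the same StringBuilder, because append() returns its receiver, but an intra-procedural analysis that does not track these return values may treat the calls as belonging to separate usages and report missing calls.

class Greeting {
    // All append() calls and the final toString() happen on one StringBuilder.
    // An analysis that does not follow the chained return values sees several
    // seemingly incomplete usages, although the code is correct.
    static String greet(String name) {
        return new StringBuilder()
                .append("Hello, ")
                .append(name)
                .append("!")
                .toString();
    }
}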

Obs.4: Imprecisions of the detectors’ static analyses cause 51 (23.9%) of the false positives in their top-20 findings. An inter-procedural detection strategy might be able to eliminate 14 (6.6%) of these false positives.


Root Cause FP3: Alternative Patterns. The detectors often learn a pattern and then report instances of alternative usages as violations. We define an alternative usage as a different correct way to use an API, either to achieve the same or a different functionality. When multiple alternatives occur frequently enough to induce patterns, the detectors learn alternative patterns. For example, Jadet, Tikanga, and DMMC learn that before a call to next() there should always be a call to hasNext() on an Iterator. Consequently, they report 16 violations in usages that only pull the first element and check either isEmpty() or size() on the underlying collection to ensure this element exists. DMMC reports another violation, because isEmpty() is used instead of size() before accessing a List. Another example is that Jadet, Tikanga, and DMMC learn that collections are filled one element at a time, e.g., by calling add(), and report 10 missing methods in usages that populate a collection differently, e.g., through the constructor or using addAll(). GROUMiner reports four usages where an alternative control statement is used, e.g., a for instead of a while.

A special case of this root cause is alternatives to obtain an instance of a type. For example, GROUMiner mistakenly reports two missing constructor calls where the instance is not created through a constructor call as in the pattern, but returned from a method call. Jadet and DMMC each report one missing constructor call where an instance is not created, but received as a parameter. While handling alternative patterns is an open problem, some tools such as Alattin already propose possible solutions [TX09a].

Obs.5: A violation of one pattern might be an instance of another, alternative pattern. Not considering this causes 41 (19.2%) of the detectors' false positives in their top-20 findings.
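A hypothetical sketch of the alternative collection-population usage discussed above (not from the reviewed projects): the list is filled via the copy constructor and addAll() rather than element-wise add() calls, which is correct but deviates from an add()-per-element pattern a detector may have mined.

import java.util.ArrayList;
import java.util.List;

class ListFactory {
    // Correct alternative way to fill a collection: copy constructor plus
    // addAll() instead of repeated add() calls. Detectors that learned the
    // add()-in-a-loop pattern report missing add() calls here.
    static List<String> copyAndExtend(List<String> base, List<String> extra) {
        List<String> result = new ArrayList<>(base);
        result.addAll(extra);
        return result;
    }
}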

Root Cause FP4: Cross-method Usage. Objects that are stored in fields are often used across multiple methods of the field's declaring class. The respective API usages inside the individual methods might then deviate from usage patterns without being actual misuses. Listing 7.1 shows an example of such a case, where two fields of type Iterator, in and out, are used to implement the class NeighborIterator. When in yields no more elements (Line 12), the call to next() in Line 14 happens on out without a prior check whether it has more elements. While this appears to be a misuse of the Iterator API inside the enclosing method, it is a correct usage inside the enclosing class, since NeighborIterator itself implements Iterator and, thereby, inherits its usage constraints. Correct usages of NeighborIterator need to check its hasNext() method (Line 6) before calling its next() method (Line 11), which ensures that out has more elements when next() is called on it. DMMC and GROUMiner report sixteen violations for such usages of fields of a class. A special case of this root cause is when a class uses part of its own API in its implementation, for example, when a Collection calls its own add() method in the implementation of its addAll() method. DMMC and GROUMiner report four such violations. This is particularly interesting, because these are actually self usages of the API, while the detectors target client usages.


 1 class NeighborIterator implements Iterator {
 2     private final Iterator in = ...;
 3     private final Iterator out = ...;
 4
 5     @Override
 6     public boolean hasNext() {
 7         return in.hasNext() || out.hasNext();
 8     }
 9
10     @Override
11     public GraphNode next() {
12         boolean isOut = !in.hasNext();
13         Iterator curIterator = isOut ? out : in;
14         DiGraphEdge s = curIterator.next();
15         return isOut ? s.getDestination() : s.getSource();
16     }
17
18     ...
19 }

Listing 7.1.: Correct Usages of Iterator Instances in the Closure Project that Violate Usage Patterns.

Since any codebase likely contains such self usages, detectors should consider this.

Obs.6: The implementation code of a class may contain partial usages of the class' own API or fields. Such usages cause 26 (12.2%) of the detectors' false positives in their top-20 findings.

Root Cause FP5: Dependent Object States. When two objects' states depend upon each other, usages sometimes check the state of one and implicitly draw conclusions about the state of the other. The detectors do not consider such inter-dependencies. For example, when two collections are maintained in parallel, i.e., always have the same size, it is sufficient to check the size of one of them before accessing either. The detectors falsely report 14 missing size checks in such usages. In 10 of these cases, the equal size is ensured by construction of the collections in the same method. In the remaining four cases, it is ensured elsewhere in the same class. We consider this a dangerous practice, because should the dependency between the collections ever change, it is easy to miss some of the code that relies on it. Thus, warning developers might be justified. Nevertheless, we count these cases as false positives, since the current usages are correct.

Obs.7: Semantic dependencies between objects' states may implicitly ensure conditions. Not considering such inter-dependencies causes 14 (6.6%) of the detectors' false positives in their top-20 findings.
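A hypothetical sketch of such dependent object states: both lists always grow together, so a single bounds check on one of them also guards the access to the other, which a detector unaware of this dependency reports as a missing check.

import java.util.ArrayList;
import java.util.List;

class ParallelLists {
    private final List<String> names = new ArrayList<>();
    private final List<Integer> ages = new ArrayList<>();

    // The two lists are kept at the same size by construction.
    void add(String name, int age) {
        names.add(name);
        ages.add(age);
    }

    // The size check on 'names' implicitly also covers 'ages'; detectors that
    // do not consider this dependency report a missing check on 'ages'.
    String describe(int index) {
        if (index >= names.size()) {
            return null;
        }
        return names.get(index) + " is " + ages.get(index) + " years old";
    }
}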

Root Cause FP6: Call Multiplicity. The detectors cannot handle methods that may be called arbitrarily often. GROUMiner and Jadet both learn a pattern where the append() method of StringBuilder is called twice and falsely report three missing method calls where it is called only once.


Obs.8: Detectors should distinguish methods that require a specific number of calls from methods that require one or more calls and methods that may be called arbitrarily often. Not considering this causes 3 (1.4%) of the detectors' false positives in their top-20 findings.

Table 7.2.: Experiment RUB: Recall of the Isolated Detection Strategies and Root Causes for Divergences.

Detector     Potential Hits   Actual Hits   Conceptual Recall Upper Bound   Empirical Recall Upper Bound   Kappa Score
Jadet              19              15                   29.7%                          23.4%                  0.76
GROUMiner          46              31                   75.0%                          48.4%                  0.84
Tikanga            23              13                   29.7%                          20.3%                  0.84
DMMC               40              15                   26.6%                          23.4%                  0.85
Total             128              74                                                                         0.83

Frequencies of Root Causes for Divergences:
Detector     Representation   Matching   Analysis   Bug   Lenient   Exception Handling
Jadet               4              4          1       0       3              2
GROUMiner           9              4          6       0       8              0
Tikanga             4              7          2       0       5              2
DMMC                5              0          0       2       5              0
Total              22             15          9       2      21              4

Root Cause FP7: Bug. A few findings are likely caused by mistakes in the detector implementations. DMMC reports four violations with an empty set of missing methods. These empty sets are produced when none of the potentially missing methods match DMMC’s prevalence criteria. DMMC should probably filter such empty-set findings before reporting. GROUMiner reports one missing if that actually appears in all respective usages, because its graph mapping does not match the respective if node from one of the usages with the corresponding nodes of all the other usages.

7.2. Experiment RUB: Recall Upper Bound of the Detectors

We run all detectors to see which of the 64 known misuses from MuBench they can detect when given the respective crafted correct usages for pattern mining. Table 7.2 shows the results per detector. The second and third columns show the number of potential hits and the number of actual hits, after resolving disagreements. The fourth and fifth columns show the detector's conceptual and empirical upper bounds of recall, respectively. The sixth column shows the Kappa score for the manual reviews. The remaining columns show the frequencies of root causes for divergences between a detector's conceptual capabilities from Table 4.1 and its actual findings in this experiment.

Note that two detectors sometimes have the same root cause for their respective divergence on the same misuse.

We find that GROUMiner has by far the best conceptual recall upper bound and also shows the best empirical recall upper bound in Experiment RUB. This suggests that its graph representation is a good choice to capture the differences between correct usages and patterns. However, the gap between GROUMiner's conceptual and empirical upper bounds of recall is quite noticeable. Actually, Table 7.2 shows that all four detectors fall considerably short of their conceptual recall upper bound in practice.

Generally, we observe two kinds of divergences between the actual findings and the conceptual capabilities: unexpected false negatives, i.e., misuses that a detector should be able to detect, but does not, and unexpected hits, i.e., misuses that a detector supposedly cannot detect, but does. We investigate the root causes of each divergence to identify actionable ways to improve misuse detectors.

Obs.9: All detectors' empirical upper bound of recall is much lower than their conceptual upper bound of recall, and their findings frequently diverge from their conceptual capabilities.

The Kappa scores indicate good reviewer agreement, albeit a little lower than in Experiment P. Since we only reviewed potential hits, i.e., findings in the same method as a known misuse, many potential hits were related to the known misuses. Consequently, we had several disagreements on whether a particular potential hit actually identifies a particular misuse. In total, we had 18 such disagreements (Jadet: 4; GROUMiner: 6; DMMC: 5; Tikanga: 3), which led us to formulate the lenient review process described in Section 5.3. We decided in favor of the detectors in eight of these cases. We observe that the Kappa score is a little lower for Jadet, compared to the other detectors. Since the absolute number of disagreements is comparable and Jadet had relatively few potential hits, i.e., a small number of decisions as a basis for the Kappa score, we attribute the lower score to chance.

Unexpected False Negatives

To identify opportunities to improve the recall of misuse detectors, we systematically investigate the root causes of the false negatives. In the following, we discuss these root causes summarized across all detectors, in order of their absolute frequency. We label these root causes FN1 to FN4.

Root Cause FN1: Imprecise Representation. Current usage representations are not expressive enough to capture all details that are necessary to differentiate between misuses and correct usages. For example, DMMC and GROUMiner encode methods by their name only and, therefore, cannot detect a missing method call when the usage calls an overloaded version of the respective method. For example, assume that a pattern requires a call to getBytes(String), but the target usage calls getBytes() instead. An ideal misuse detector would still report a violation, since the expected method, with the correct parameters, is not called. However, since only the method name is used for comparison in both these detectors, such a violation is not detected.
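A hypothetical sketch of this overloading case: pattern and target differ only in the parameter list of getBytes(), a difference that a name-only method encoding cannot express.

import java.io.UnsupportedEncodingException;

class Encoder {
    // Pattern-conforming usage: encodes with an explicit charset.
    static byte[] encode(String s) throws UnsupportedEncodingException {
        return s.getBytes("UTF-8");
    }

    // Misuse relative to that pattern: falls back to the platform default
    // charset. Detectors that encode calls by name only see "getBytes" in
    // both methods and therefore cannot detect the difference.
    static byte[] encodeWithDefaultCharset(String s) {
        return s.getBytes();
    }
}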

Another example is that, to use a Cipher instance for decryption, it must be in decrypt mode. This state condition is ensured by passing the constant Cipher.DECRYPT_MODE to the Cipher's init() method. None of the detectors captures this way of ensuring that the condition holds, because they do not encode method-call arguments in their representations.

Obs.10: Inability to capture details necessary to differentiate misuses from correct usages in the usage representation is responsible for 22 (45.8%) of the unexpected false negatives.

writer.write(value);

try {
    writer.write(value);
} finally {
    if (writer != null)
        writer.close();
}

Listing 7.2.: Not Closing Writer vs. Correctly Closing Writer.

Root Cause FN2: Imprecise Pattern Matching. The detectors fail to relate a pattern and a usage. Typically, detectors relate patterns and usages by their common facts. If there are no or only few common facts, detectors report no violation. For example, Jadet's facts are pairs of method calls. In a scenario where JFrame's setPreferredSize() method is accidentally called after its pack() method, Jadet represents the usage with the pair (pack, setPreferredSize) and the pattern with the reverse pair. Since it compares facts by equality, Jadet finds no relation between the pattern and the usage. Without common facts between a usage and a pattern, the detector assumes that these are two completely unrelated pieces of code and does not report a violation. Another example is when the pattern's facts relate to a type, e.g., List in List.size(), while the usage's facts relate to a super- or sub-type, such as ArrayList.size() or Collection.size(). The detectors cannot relate these facts, since they are unaware of the type hierarchy. Also, Tikanga misses four misuses, because the target misses more than two formulae of the pattern (Tikanga's maximum distance for matching). For example, Listing 7.2 shows a misuse that does not close a Writer and the corresponding correct usage. In Tikanga's representation, the difference between the misuse and the correct usage consists of three formulae: (1) that close() follows write() in case of normal execution, (2) that close() follows write() if the latter throws an exception, and (3) that close() is preceded by a null check.

Obs.11: When matching patterns and misuses, detectors should consider the semantics of their representation, e.g., call order and the number of usage facts generated by adding specific usage constructs, as well as code semantics, e.g., subtype relations. Neglecting this is responsible for 15 (31.3%) of the unexpected false negatives.
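A hypothetical sketch of the call-order case discussed above: both variants contain the same two calls, only in different order, so representations that compare call pairs by equality cannot relate the usage to the pattern.

import java.awt.Dimension;
import javax.swing.JFrame;

class WindowSetup {
    // Pattern: the preferred size is set before pack() computes the layout.
    static JFrame createCorrectly() {
        JFrame frame = new JFrame("Example");
        frame.setPreferredSize(new Dimension(800, 600));
        frame.pack();
        return frame;
    }

    // Misuse: setting the preferred size after pack() does not affect the
    // already computed window size. Jadet encodes this usage as the fact
    // (pack, setPreferredSize) and the pattern as the reversed pair, finds
    // no common fact, and, hence, reports no violation.
    static JFrame createIncorrectly() {
        JFrame frame = new JFrame("Example");
        frame.pack();
        frame.setPreferredSize(new Dimension(800, 600));
        return frame;
    }
}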


ArrayList markers;
if (layer == Layer.FOREGROUND) {
    markers = (ArrayList) this.fgMarkers.get(index);
} else {
    markers = (ArrayList) this.bgMarkers.get(index);
}
// if (markers != null) { // <-- missing in misuse
boolean removed = markers.remove(marker);
// }

Listing 7.3.: Example of an Analysis Problem of GROUMiner.

Root Cause FN3: Imprecise Analysis. The detectors rely on static analysis to extract their usage representations. Imprecisions in these analyses may obscure relations between patterns and usages. For example, GROUMiner fails to detect one missing null check, because it cannot determine the receiver type for chained calls, such as for m() in o.getX().m(), which is not generally possible from source code alone. Also, it fails to detect another four missing null checks, because it overlooks dataflow dependencies. Listing 7.3 shows such a case. In addition to the null check, GROUMiner also misses the dataflow from the get() calls to the remove() call in the misuse, which makes the pattern and the usage differ by multiple facts. GROUMiner, however, only reports a violation if the difference is a single fact. Tikanga misses a call that occurs in the correct usage in one case and fails to capture the call order between two calls from the correct usage in another case. We assume that the cause is a limitation of its analysis, but could not ultimately verify this, because the tool's developer is not available to confirm the implementation details.

Obs.12: Imprecision of the analysis, which obscures the relation between patterns and misuses, causes nine (18.8%) of the unexpected false negatives.

Root Cause FN4: Bug. DMMC skips the comparison of a usage and a pattern if the pattern contains fewer calls than the usage, presumably to improve performance. The pattern for AuthState from Apache’s HTTPClient, for instance, requires three calls, of which the misuse scenario misses one. However, if this misuse has an additional, optional call that is not in the pattern, DMMC skips the comparison since now both the pattern and the target each contain three calls. This causes two unexpected false negatives in our experiment.

Unexpected Hits

We find that the detectors identify some misuses that we did not expect them to find, according to their conceptual capabilities (see Chapter 4). We systematically investigate the root causes for these unexpected hits. In the following, we discuss these root causes. We label them UH1 and UH2.


public static void main(String[] args) {
    SwingUtilities.invokeLater(new Runnable() {
        public void run() {
            JFrame f = new JFrame("Main Window");
            // add components...
            f.setVisible(true);
        }
    });
}

Figure 7.1.: Code and GROUM of a Usage That Instantiates a Swing Component (on the Event Dispatching Thread). The GROUM itself (with nodes Runnable.new, SwingUtilities.invokeLater(), JFrame.new, and JFrame.setVisible()) is not reproduced here.

writer.write(value);
writer.close();

try {
    writer.write(value);
} finally {
    writer.close();
}

Listing 7.4.: Closing Writer Without and With Exception Handling.

Root Cause UH1: Lenient Review Process. In all but two cases, the reason for unexpected hits is our lenient review process described in Section 5.3. In most cases, the detectors report a missing call that indicates a missing condition check. The only other case is that GROUMiner detects a missing context condition, in a scenario where some Swing code is required to run on the Event Dispatching Thread (EDT). The delegation to the EDT is implemented by wrapping the code in an anonymous instance of Runnable, as shown in Figure 7.1. GROUMiner considers the code in run() as part of the code of the enclosing method. Consequently, it suggests the misuse by reporting a missing instantiation of Runnable before the instantiation of the JFrame.

Obs.13: Missing method calls may indicate missing condition checks. Detectors that report these missing calls, despite not reporting the exact condition, find violations outside of their conceptual capabilities.

Root Cause UH2: Capturing Exception Handling. In the remaining two cases, Jadet and Tikanga correctly report missing exception handling. For example, the first snippet in Listing 7.4 shows a misuse where close() is not called when write() throws an exception; the second snippet shows a corresponding correct usage. Tikanga and Jadet both represent the correct usage with two facts {(write, close), (write:EXC, close)}, effectively encoding that close() is called after write() in normal execution and in case of an exception. In the misuse, they find the second fact missing. This capability of the implementations is not mentioned in the respective publications.


Table 7.3.: Experiment R: Recall of the Detectors on MuBench and the New Misuses from Experiment P.

Detector     Potential Hits   Actual Hits   Recall   Kappa Score
Jadet              4                3         5.7%       1.00
GROUMiner          4                0         0.0%       1.00
Tikanga            9                7        13.2%       1.00
DMMC              25               11        20.8%       0.95
Total             42               21                    0.97

Figure 7.2.: Recall of the Detectors in Experiment R (Venn diagram of the misuses identified by DMMC (11), Jadet (3), and Tikanga (7) relative to the MuBench dataset; not reproduced here).

7.3. Experiment R: Recall of the Detectors

In Experiment R, we run all detectors to assess their recall when using their own pattern mining. To MuBench's 64 misuses we add the 14 new misuses from Experiment P and exclude the 25 hand-crafted examples for which there is no project code to mine patterns from. This leaves us with 53 misuses for Experiment R. In total, the detectors report 18,384 findings on all projects. Due to the automated filtering for potential hits based on the finding location, we have to review only 62 (0.3%) of these findings to determine the detectors' recall in Experiment R. This shows that MuBench drastically reduces the review effort required to measure recall, which makes our experiment practically feasible in the first place.

Table 7.3 shows the results of Experiment R and Figure 7.2 visualizes the recall. Jadet finds only the three misuses it already identified in Experiment P. GROUMiner does not find any of the misuses. Tikanga finds the five misuses it already identified in Experiment P, one of the misuses that DMMC identified in Experiment P, and one of the misuses that Jadet identified in Experiment P. DMMC finds two misuses from MuBench (both missing method calls), the eight misuses it reported in Experiment P, and one misuse both Jadet and Tikanga reported in Experiment P.


DMMC shows by far the best recall in Experiment R. This suggests that its relatively simple detection strategy works well when focusing on missing method calls. However, the recall of all detectors in the real setting offered by Experiment R is low. Analyzing the root causes for their bad performance, we identify two general problems with the design of the detectors and their evaluation setting.

Problem 1: Poor Ranking. While Experiment R shows that the detectors identify more misuses beyond their top-20 findings, they, unfortunately, rank those very low. For example, the two MuBench misuses DMMC finds are ranked 309 and 613. This is far beyond the number of findings that we can reasonably expect a user to assess. The four detectors in our experiments all use different ranking strategies, but none of the detectors from our survey in Chapter 4 compared different strategies on the same detector.

Obs.14: Detectors need better ranking strategies to report true positives within their top findings. Furthermore, researchers should compare alternative ranking strategies for single detectors.

Problem 2: Lack of Usage Examples. The huge difference in the detectors' performance between Experiment RUB and Experiment R suggests that the cause may be a shortage of correct usage examples in the target projects. One possibility is that the number of such examples is smaller than the detectors' minimal support for pattern mining, in which case we could simply lower these thresholds. However, this would likely also increase the number of false positives, as the mined patterns generally become less reliable, which underlines the need to effectively filter false positives (Obs.1) and improve ranking (Obs.14). Another possibility is that no, or only very few, such examples exist in the projects. This would be a general problem with the evaluation setting of misuse detectors. To solve it, we need additional sources of usage examples to mine patterns from. Gruska et al. [GWZ10] demonstrated one possible approach by applying Jadet in a multi-project setting with 6,000 projects, but did not measure recall. Other recommender systems for software engineering, such as code-completion engines [PLM15], work cross-project, i.e., they learn from a large number of projects to provide recommendations for other projects. The misuse detectors CAR-Miner [TX09b] and Alattin [TX09a] implement an alternative approach, by specifically searching for usage examples of the APIs used in the target project via a code-search engine. Related to this, other lines of research proposed code-search engines to find usage examples in open-source projects [GFX+10, MGP+11] or on StackOverflow [PBDP+14].

Obs.15: All detectors have low recall, likely due to a lack of correct usage examples in target projects. Adoption of existing code-search techniques and cross-project mining could mitigate this problem.

The Kappa scores indicate mostly perfect reviewer agreement in Experiment R. This is because the detectors found almost exclusively the misuses that one of them also identified in Experiment P, i.e., the misuses we already agreed on before. The exception is DMMC, where we initially disagreed on one of its 14 potential hits for misuses from the original MuBench dataset.


Table 7.4.: JCA Misuses Identified by the Detectors in Experiment RUB.

                      Identified by
             Total    Jadet       GROUMiner   Tikanga     DMMC        Any
JCA Misuses    22     2 (9.1%)    2 (9.1%)    1 (4.5%)    1 (4.5%)    3 (13.6%)

7.4. Vulnerability Detection

We are particularly interested in whether the misuse detectors identify the misuses of the Java Cryptography Architecture (JCA) APIs, because our study in Section 2.4 suggests that these are prevalent and rarely fixed, possibly because they do not cause spurious behavior. We have 22 JCA misuses in compilable project versions of our ground-truth dataset. Among these we find 15 security vulnerabilities: eight cases specify an insecure algorithm,2 five cases rely on a potentially insecure default configuration, and two cases hard-code the cryptographic key. The remaining seven cases have some functional defect that may cause an exception or data loss.

Table 7.4 shows how many of the JCA misuses each of the detectors identifies in Experiment RUB. Each individual detector identifies at most two of the misuses and even together they identify only three of them: a missing reinitialization call, a missing exception handling in case of an invalid cryptographic key being supplied, and a resource leak (missing call to close()) that may cause loss of encrypted data. All three misuses cause spurious behavior (exceptions or data loss), rather than security vulnerabilities. This indicates that current misuse detectors are not particularly good at identifying JCA vulnerabilities.

Looking at the concrete misuses, we find that this is actually unsurprising, since all vulnerabilities manifest in illegal parameter values, either in the string parameter passed to Cipher.getInstance() (insecure algorithm or algorithm configuration) or in the algorithm's parameters (e.g., a hard-coded cryptographic key). While we might generally design misuse detectors that consider parameter values, we find that the design of the Cipher API poses four particular challenges:

First, the algorithm configuration is passed to getInstance() in a single string parameter, which encodes the algorithm name, the algorithm mode, and the algorithm padding to use for encryption, separated by slashes, e.g., "AES/CBC/PKCS5Padding". The second and third part may be omitted, in which case a default configuration is used.

2In five cases (62.5%) it is the DES algorithm. We hypothesize that a major factor for the prevalence of this particular mistake might be that the official Android developer documentation contains an example of using Cipher with DES: https://developer.android.com/reference/javax/crypto/Cipher.html (checked on Jul 18, 2017)

This makes it difficult for a misuse detector to learn about safe and unsafe configurations, except through memorizing all respective values. For example, without special knowledge about how to parse configuration strings, detectors cannot abstract that both "DES" and "DES/ECB" are insecure for the same reason, i.e., the insecure DES algorithm.

Second, static API-misuse detectors generally assume that Liskov's Substitution Principle [Lis87] holds, i.e., that all instances of the same static type behave the same. Consequently, they assume that a valid usage of one instance of a type is also a valid usage for any other instance of the same type. This assumption does not hold for the Cipher API, as the following points (and the sketch after them) illustrate:

1. The JCA uses the provider pattern, i.e., calls to Cipher.getInstance() are delegated to some provider entity, which is responsible for instantiating a respective Cipher instance. The concrete provider may change depending on the system or application configuration. While the JCA requires all providers to supply certain algorithms, they may supply additional algorithms. Hence, whether a particular algorithm is available may depend on the concrete provider, which may be unknown statically.

2. Valid algorithm configurations depend on the provider as well. For example, with the BouncyCastle provider,3 "RSA/None/..." is a valid configuration. Since the RSA algorithm does not use an algorithm mode, the provider expects usages to specify None as the mode. The Java default provider, however, expects respective usages to specify ECB as the mode. Either provider throws an exception, if the respective other value is passed.

3. How an instance of Cipher behaves depends on the algorithm it was created with. For example, while the RSA algorithm expects a PublicKey instance for initialization, the AES algorithm expects a SecretKey instance and a random initialization vector. This means that part of the string parameter to getInstance() effectively determines the behavioral type of the respective Cipher instance. Interestingly, both the BouncyCastle and the Java default provider implement the different algorithms as individual subtypes of Cipher. Thus, the runtime type of a Cipher instance uniquely identifies its behavioral type.
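A hypothetical sketch of these points (illustrative client code only; the generated dummy keys and the fixed IV are chosen for brevity and are themselves not secure usage): the getInstance() string determines the algorithm, mode, and padding as well as the kind of key and parameters that init() expects.

import java.security.KeyPairGenerator;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.spec.IvParameterSpec;

class CipherConfigurations {
    static void illustrate() throws Exception {
        // Equivalent insecure configurations: "DES" defaults to ECB mode and
        // PKCS5 padding, so both strings select the insecure DES algorithm.
        Cipher des = Cipher.getInstance("DES");
        Cipher desExplicit = Cipher.getInstance("DES/ECB/PKCS5Padding");

        // An AES cipher expects a SecretKey and an initialization vector...
        Cipher aes = Cipher.getInstance("AES/CBC/PKCS5Padding");
        aes.init(Cipher.ENCRYPT_MODE,
                KeyGenerator.getInstance("AES").generateKey(),
                new IvParameterSpec(new byte[16])); // fixed IV only to keep the sketch short

        // ...while an RSA cipher expects a PublicKey and no IV. Which init()
        // call sequence is correct depends on the string passed to
        // getInstance(), i.e., on the instance's behavioral type.
        Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        rsa.init(Cipher.ENCRYPT_MODE,
                KeyPairGenerator.getInstance("RSA").generateKeyPair().getPublic());
    }
}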

Third, what can be considered secure usage of the JCA changes over time, since what was considered secure encryption 20 years ago might not be considered secure anymore. This is quite different from functional aspects of APIs, where a correct usage remains correct, unless one migrates to another version of the API.

Fourth, it seems likely that we cannot learn about the correct usage of JCA APIs from client code. Egele et al. [EBFK13] report that 88% of all Android apps misuse cryptographic APIs. We find that a comparably high fraction of GitHub projects (73.1%) does the same (see Section 2.4). This suggests that, for the JCA API, the general assumption that frequently recurring usages correspond to correct usages might not hold.

3http://www.bouncycastle.org (checked on Nov 02, 2017)


Considering these particular challenges, most of which are specific to the JCA APIs, we deem it unlikely that general API-misuse detectors can be adapted to reliably identify misuses of the JCA APIs that cause security vulnerabilities.

7.5. User Experience

We now report on our experiences as users of our subject misuse detectors. Our observations are based on the experience we gained while reviewing the detectors' findings in our experiments.

DMMC simply reports present and missing method calls, along with the source line number of the first present call. We find this output generally easy to interpret. The line number helps, especially, to locate usages in large methods. GROUMiner reports pattern and usage graphs, which are more difficult to understand. However, we find that the structural properties of the source code that the graph representation captures help with the interpretation. Jadet and Tikanga report the present and missing facts of their respective representations. We find that it is often difficult to relate the facts to each other, especially in the presence of multiple usages of the same API. This might be, in part, due to the textual representation we look at. While none of the detector implementations was intended to present its findings to end users, we still find it interesting to note that the challenge of explaining findings seems to correlate with the distance between the source code and the usage representation.

We also find that bytecode-based detectors may report findings in code that the compiler introduces. For example, the compiler translates foreach loops into Iterator usages. Tikanga reports a missing call in such a usage, i.e., it reports a missing call on Iterator in a method where Iterator does not appear in the source code. This finding confused us at first. While additional steps could be taken to assist the user in mapping such findings back to the source code, source-based detectors do not face this problem.

Our lenient review process shows that missing method calls frequently indicate missing conditions (Obs.2 and Obs.13). While such findings do not report the entire problem, we found it relatively easy to deduce their meaning. Contrarily, GROUMiner reports only a missing if node when it captures a missing condition. While these findings more explicitly indicate the problem of a missing check, we feel that they are actually harder to act upon, because they give no information about what should be checked. This indicates a gap between a detector's capability to identify a violation and its ability to explain this violation to users.

Above all, we believe that the detectors' precision is likely to be the biggest threat to their applicability in practice. As previous studies show, large numbers of false positives are a major barrier to the adoption of code-analysis tools [FLL+02, BBC+10, JS13]. This problem is made worse by the low recall of the detectors, which implies that even if developers took the time to review all reported warnings, they would still likely miss the vast majority of misuses.


7.6. Call to Action

We find that misuse detectors are practically capable of detecting a considerable part of the misuses in MuBench, when provided with the correct usages to compare to (Experiment RUB). However, even though the detectors are also capable of finding some misuses in a realistic setting (Experiment P and Experiment R), they suffer from extremely low precision (Obs.1) and recall (Obs.15). We identify four root causes for false negatives, seven root causes for false positives, and two general problems with the design of detectors and how they are typically evaluated. This leads us to several observations on how to advance the state of the art in API-misuse detection. Therefore, we call researchers to action:

• We first need a precise definition of API usages, considering usage properties, such as the usage location (Obs.6) and call multiplicities (Obs.8).

• We need a representation of such usages that captures all code details necessary to distinguish correct usages from misuses (Obs.10) and more precise analyses to identify usages in code (Obs.4 and Obs.12).

• We need detectors that retrieve sufficiently many usage examples using project-external sources, such as large project sets or code-search engines (Obs.15).

• We need detectors that go beyond the naive assumption that a deviation from the most-frequent usage corresponds to a misuse (Obs.3), but consider program semantics, such as type hierarchies (Obs.11) and implicit dependencies between objects (Obs.7). We hypothesize that probabilistic models might be a way to tackle this problem.

• We need strategies to properly match patterns and usages in the presence of violations (Obs.9 and Obs.11).

• We need strategies to properly handle alternative patterns for the same API (Obs.5).

• Finally, we need good ranking strategies, to reduce the cost of reviewing findings (Obs.14).

In order to achieve all this, we need repeatable and replicable studies that enable systematic evaluation and analysis of alternative approaches and strategies. We publish MuBench,4 as a foundation for such work, and call researchers to use and contribute to our automated benchmark, to advance the state of the art in API-misuse detection.

4https://github.com/stg-tud/MUBench/ (checked on Mar 20, 2018)


7.7. Threats to Validity

Construct Validity Our study focuses on static misuse detectors. Approaches based on dynamic analyses may perform differently and have unique strengths and weaknesses. To enable dynamic analyses of the project versions in MuBench, we would have to ensure that the respective code is executable (which requires a sufficient runtime environment, in addition to compile-time dependencies) and to provide example inputs for the execution. It is unclear how to do this such that it results in a fair comparison of both static and dynamic techniques, without resorting to comparing apples to oranges. Therefore, we chose to focus only on static approaches.

Our experiments focus on detectors that detect misuses in Java code. Therefore, the results may not generalize to detectors for other languages. We decided to focus on this subset of detectors, because the majority of approaches we identified in our survey targets Java. To include detectors that target other languages, we would have to either migrate them to Java or build up additional datasets for the respective languages, either of which is outside the scope of this thesis.

We could not obtain working implementations of all detectors that target Java, since the original prototypes of Alattin [TX09a] and CAR-Miner [TX09b] are no longer operational. Therefore, we excluded these detectors from our experiments. Since they are the only ones to obtain evidence about correct usage from a code-search engine, as opposed to from the target project, our results might not generalize to them. To allow future work to include these detectors (and others) in the comparison, we publish MuBench5 and all review data from this study.6

Any detector's performance depends on its configuration. It is possible that the detectors we compared perform better on our benchmark, given a different configuration. Due to the high effort of reviewing the findings of multiple detectors, we could not try different configurations for each detector. To give each detector a fair chance, we used the optimal configurations reported in the respective publications.

Internal Validity Reviewing the detectors' findings was done by the author of this thesis and two colleagues and was not blind (i.e., we knew which detector we were reviewing findings for). We could not blind the reviews, because each approach has a distinct representation of usages and violations that cannot be anonymized. Moreover, one of the colleagues is among the original authors of GROUMiner. We did our best to review objectively. To avoid bias, at least two reviewers independently looked at every finding. For the findings of GROUMiner, at least one of these reviewers was not involved in the original work. Judging from the comparably bad performance of GROUMiner in Experiment P and Experiment R, we are convinced that we did not favor this detector over the others.

While we did ask the original authors to confirm our assessment of the conceptual capabilities of their tools (see Chapter 4), we did not ask them to confirm the empirical results of our experiments.

5 https://github.com/stg-tud/MUBench/ (checked on Mar 20, 2018)
6 Artifact Page: A Systematic Evaluation of Static API-Misuse Detection (http://www.st.informatik.tu-darmstadt.de/artifacts/mustudy/)

We estimate that, including discussions to resolve disagreements, it required each reviewer on average 2 minutes to verify whether a detector identified one of the known misuses in Experiment RUB and Experiment R, and 5 minutes to verify whether a detector's finding identifies an actual misuse in Experiment P, where we needed to understand the respective code, check documentation, and sometimes also look into transitively called methods. This amounts to 24.8 hours of review effort per reviewer: 4 hours for Jadet, 7.2 hours for GROUMiner, 4.7 hours for Tikanga, and 8.9 hours for DMMC. We decided it would be unreasonable to expect the original authors to invest this amount of time in verifying our assessments. We do, however, publish all our review data7 to allow them and others to revisit our decisions.

External Validity The study is subject to the limitations of MuBench, as discussed in Section 5.6 and Section 6.4.

We publish MuBench8 and encourage others to extend the benchmark dataset and repeat our experiments, also with other detectors and detector configurations.

7See footnote 6. 8See footnote 5.

8. Extension and Further Use

We design MuBench as an extensible automated benchmark, to facilitate not only our own study of API-misuse detectors presented in Chapter 7, but also future work. As the most important extension points we consider (1) extensions to the benchmarking dataset, to increase our confidence in the generalizability of the benchmarking results, and (2) the integration of additional misuse detectors, to move further towards a comprehensive comparison of existing approaches. In this chapter, we discuss examples of respective work by others and ourselves.

8.1. Dataset Extensions

The benchmark dataset forms the basis for all benchmarking experiments. The representativeness of the software projects it contains, as well as of the known misuse examples it refers to, is crucial to the generalizability of benchmarking results. Though ideally a benchmark dataset is a minimal representative sample, to the best of our knowledge, it is unclear how such a sample may be determined. Therefore, the best way to approximate representativeness is to extend the dataset by additional projects and further examples of API misuses as they occur in real-world projects and as developers face them in their day-to-day work. In this section, we discuss examples of how we extended the benchmarking dataset. Table 8.1 summarizes these extensions. For further technical details on how to extend MuBench, we refer to the project website.1

1https://github.com/stg-tud/MUBench/ (checked on Mar 20, 2018)

Table 8.1.: MuBench Dataset Extensions by Source, with the Number of Projects (#P), the Number of Project Versions (#PV), and the Number of Misuses (#M).

Source                                         #P   #PV    #M
Runtime-Verification Study (Section 8.1.1)     13    18    77
Misuse-Detector Evaluations (Section 8.1.2)     2     2     5
Java 8 Misuse Repository (Section 8.1.3)        0     0     5
Original Misuse Collection (Section 8.1.4)      9     9    20
Total                                          24    29   107


8.1.1. Misuses from a Runtime-Verification Study

Runtime verification of API specifications identifies misuses by monitoring executions against formal specifications. The quality of the verification naturally depends on the quality of the specifications. Legunsen et al. [LHX+16] studied the precision of runtime verification for misuse detection, using a set of 182 manually written specifications [LJMR12] and 17 automatically mined specifications [PG09] for APIs from the Java Class Library. Using these specifications, they applied the runtime verifier JavaMOP [JMLR12] to the 200 most-popular GitHub projects that use Maven for build automation, have at least one test, and have all tests passing with and without JavaMOP monitoring. Subsequently, they reviewed 852 (13.3%) of the reported violations and submitted bug reports with fixes, i.e., pull requests, for the 114 violations that they found to be potential bugs.

Integrating these true-positive findings into MuBench is interesting in two ways: First, it gives us additional examples of known misuses to evaluate the recall of misuse detectors. Second, it enables an empirical assessment of the fraction of misuses identifiable through runtime verification that can also be detected by static API-misuse detectors, allowing for a limited comparison of the recall of both static and dynamic detectors.

Legunsen et al.'s artifact page2 lists 114 potential misuses with corresponding pull requests. To reduce the manual effort of adding the misuse examples to MuBench, we partly automated the process. The artifact page provides a list of the misuses in CSV format. For each misuse, the list names the specification that the misuse violates and the respective pull-request URL. From the specification, we infer the API that is misused and whether the misuse causes a crash. From the pull request, we programmatically retrieve additional misuse metadata via the GitHub API3. We capture the respective project's name and repository URL, the commit id of the project version immediately before the fix was accepted, the problem and fix description, the fix changeset, and the file location of the misuse. We also check whether the pull request has been accepted by the project's developers and filter misuses for which this is not the case. This leaves us with 86 misuses with metadata. We store this data according to the schema presented in Section 6.1.

Afterwards, we manually reviewed each misuse, to validate the data and to add the method location, which we could not automatically determine. We also hand-craft respective examples of correct usage, following the same process as in Section 5.2. This manual review revealed that some of the misuses are duplicates. JavaMOP analyzes both the target project's own code and the code of its dependencies. If it identifies a misuse in a dependency and that dependency is shared by multiple target projects, the misuse is reported once for each target project. This happens, for example, for misuses identified in the TestNG project. We exclude all 15 respective duplicates. At the same time, we found that in two cases the pull request fixes additional instances of a misuse, which JavaMOP did not identify. We include all 6 respective misuses as separate misuses.

2 http://fsl.cs.illinois.edu/spec-eval/ (checked on Dec 15, 2017) 3 https://developer.github.com/v3/ (checked on Mar 17, 2018)

This leaves us with a total of 77 confirmed misuse instances for inclusion in MuBench.

Finally, to include the misuses in MuBench, we need compile instructions for the respective project versions. Since all projects use Maven for their build configuration, we assumed the respective default paths for source and class files and used the command mvn compile for compilation. This was sufficient to compile all but two project versions; one compilation failed due to an underspecified dependency version and one due to a file-encoding issue. Both problems were easily fixed, which gave us compilable project versions for all confirmed misuses.

Overall, this process resulted in a dataset extension of 77 misuses from 18 versions of 13 projects. One of these projects, namely JodaTime, was already part of the benchmark dataset. It took the author of this thesis and a student assistant less than one week to build this dataset extension, which more than doubled the dataset in terms of the number of known misuses. We are grateful to Legunsen et al. for publishing their dataset, which enabled this extension of MuBench.
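A hypothetical sketch of the partly automated pull-request lookup described earlier in this section, using Java's built-in HttpClient against GitHub's REST endpoint for pull requests; the owner, repository, and pull-request number are placeholders, and the returned JSON (which includes, e.g., the title, description, and merge state) would still have to be parsed and mapped onto MuBench's metadata schema.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class PullRequestFetcher {
    // Fetches the raw JSON description of a pull request, e.g., to check
    // whether it was merged and to extract the problem and fix description.
    static String fetch(String owner, String repo, int number) throws Exception {
        URI uri = URI.create(String.format(
                "https://api.github.com/repos/%s/%s/pulls/%d", owner, repo, number));
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Accept", "application/vnd.github+json")
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}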

8.1.2. Misuses from Previous Evaluations of Misuse Detectors

For the cross-comparison of misuse detectors, it is interesting to extend MuBench by misuses that detectors identified in previous evaluations, just as we added the confirmed misuses from Experiment P for Experiment R (see Section 7.3). Unfortunately, the publications we identified in our survey in Chapter 4 do not report lists of the respective true-positive findings. However, we noticed that Pradel et al. [PJAG12] verified whether their detector PJAG12 identifies the true-positive findings of the misuse detectors GROUMiner [NNP+09b] and Tikanga [WZ11]. Therefore, we contacted Michael Pradel about the respective misuse data, which he promptly provided.

Overall, GROUMiner and Tikanga identified 19 true positives in eight projects. We excluded three projects, because the source code of the respective project versions is no longer available. We exclude two more projects, because we can neither compile them following the process described in Section 5.2 nor obtain binaries for the respective project versions. We exclude another project, because we cannot locate the single misuse it contains. This leaves us with two projects containing five true-positive findings for inclusion in MuBench. We added the respective data manually.

We observe that this attempt to extend MuBench had little success, mainly due to the unavailability of original evaluation data. In our eyes, this stresses the importance of building and maintaining public benchmarks like MuBench.

8.1.3. Java 8 Misuses

The evolution of programming languages certainly impacts the problem of API misuse. For example, the introduction of the foreach loop with Java 5 gave developers a language construct that ensures correct iteration over collections, i.e., correct usage of the Iterator API. Similarly, the introduction of the try-with-resources statement with Java 7 gave developers a language construct that ensures correct closing of IO streams.
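For illustration, a minimal sketch of the two language constructs mentioned above (class and method names are ours):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.List;

public class LanguageSupportExamples {
    // The foreach loop desugars to the correct Iterator protocol,
    // i.e., hasNext() is always checked before next().
    static void printAll(List<String> lines) {
        for (String line : lines) {
            System.out.println(line);
        }
    }

    // try-with-resources guarantees that close() is called on the reader,
    // even if readLine() throws an exception.
    static String firstLine(String path) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            return reader.readLine();
        }
    }
}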


Conversely, new language versions often introduce new APIs to the Java Class Library, which may be misused. For example, Java 8 introduced the Optional API as an alternative to using null for the representation of absent values. While this potentially reduces the need for null checks, developers might, in turn, misuse the Optional API, e.g., by omitting checks on Optional.isPresent(). Java 8 also introduced the Stream API to extend the language by elements of the functional-programming paradigm. Many of the methods of this API, such as filter() or map(), have lazy-evaluation semantics, i.e., invoking them does not actually perform the respective operation. Instead, the operation is deferred until a subsequent terminal operation, e.g., until iterating over the elements in the stream. Developers might misuse the Stream API, e.g., by omitting a terminal operation. Alimenkou et al. maintain a repository of such Java 8 misuses,4 which they identified during code reviews. In this repository, we find compilable, hand-crafted examples of five Java-8-API misuses with explanations and corresponding examples of correct usage. We add all of them to MuBench.
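The following hand-crafted sketch illustrates the two kinds of Java 8 misuses described above; it is our own example, not one of the five misuses from the repository:

import java.util.List;
import java.util.Optional;

public class Java8Misuses {
    // Optional misuse: get() without a preceding isPresent() check may throw
    // NoSuchElementException for an empty Optional.
    static String nameUnchecked(Optional<String> name) {
        return name.get(); // misuse: missing isPresent() check
    }

    static String nameChecked(Optional<String> name) {
        return name.isPresent() ? name.get() : "unknown";
    }

    // Stream misuse: filter() and map() are lazy; without a terminal operation
    // such as forEach() or collect(), nothing is ever executed.
    static void printShortNames(List<String> names) {
        names.stream().filter(n -> n.length() < 5).map(String::toUpperCase); // misuse: no terminal operation
    }

    static void printShortNamesCorrectly(List<String> names) {
        names.stream().filter(n -> n.length() < 5).map(String::toUpperCase).forEach(System.out::println);
    }
}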

8.1.4. Misuses from the Original Misuse Collection

When we created the original MuBench dataset from the misuse examples we collected in Chapter 2, we had to exclude 50% of the project versions due to compile errors or missing dependencies that we could not resolve within the time limit we allotted per project version. Excluding such a large fraction of the projects might bias the dataset. Therefore, we invested additional time in compiling project versions. We succeeded for nine project versions, which adds 20 further misuses to the set of misuses available for benchmarking experiments.

8.2. Comparison to Further Detectors

To achieve a broad empirical comparison of state-of-the-art misuse detectors, it is crucial that we integrate further detectors into the benchmark. To make this easy, MuBenchPipe provides a small Java framework, which simplifies the implementation of MuBench runners, i.e., adapters for the integration of existing detectors into MuBench (see Section 6.2). The framework handles the entire communication with MuBenchPipe, such that runners need only pass the experiment input data to the detector and return a list of its findings. For convenience, we provide the framework as a Maven dependency. For further technical details on how to integrate detectors into MuBench, we refer to the project website.5 In this section, we discuss examples of integrating detectors and present respective benchmarking results.
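The following sketch illustrates the adapter idea only; the interfaces shown are hypothetical and do not reproduce the actual runner framework's API, which is documented on the project website:

import java.nio.file.Path;
import java.util.List;

// Hypothetical types for illustration, NOT the actual MuBench runner interfaces.
interface Detector {
    List<String> detect(Path trainingSources, Path targetSources);
}

public class MyDetectorRunner {
    private final Detector detector;

    public MyDetectorRunner(Detector detector) {
        this.detector = detector;
    }

    // A runner only forwards the experiment's input paths to the wrapped detector
    // and hands the resulting list of findings back to the benchmark pipeline.
    public List<String> run(Path trainingSources, Path targetSources) {
        return detector.detect(trainingSources, targetSources);
    }
}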

4https://github.com/xpinjection/java8-misuses (checked on Mar 20, 2018) 5https://github.com/stg-tud/MUBench/ (checked on Mar 20, 2018)


Table 8.2.: Precision of FindBugs Compared to API-Misuse Detectors.

Detector      Findings in Top-20   Confirmed Misuses   Precision   Kappa Score
Jadet                 39                    4             10.3%        0.97
GROUMiner             66                    0              0.0%        0.97
Tikanga               44                    5             11.4%        0.93
DMMC                  81                    8              9.9%        0.91
FindBugs              82                   43             52.4%        0.83

8.2.1. A Study of Using FindBugs for Misuse Detection

FindBugs is a popular6 open-source static code analyzer created by Hovemeyer and Pugh [HP04]. The tool detects possible bugs in Java Bytecode, using a set of predefined bug patterns, which are matched to the target code by dedicated static analyses. The patterns are classified into four categories of severity, namely scariest, scary, troubling, and of concern, and the tool's findings are ranked according to this discrete scale.

While FindBugs does not implement general techniques to identify particular types of misuses, say, missing method calls, it contains bug patterns that identify misuses for specific APIs. For example, one pattern identifies misuses that leave IO streams open (missing call to close()) and another pattern identifies misuses that invoke a Swing UI method outside the UI thread (missing context condition). To determine which of FindBugs’ bug patterns identify API misuses, the author of this thesis and one student independently reviewed all 424 FindBugs bug patterns.7 Subsequently, they discussed each pattern that they disagreed on, until an agreement was reached. The process led to a list of 164 bug patterns that identify API misuses.
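As an illustration (our own example, not a finding from the benchmark), the following code contains the Swing misuse mentioned above next to its correct counterpart:

import javax.swing.JFrame;
import javax.swing.SwingUtilities;

public class SwingContextCondition {
    public static void main(String[] args) {
        // Misuse: Swing components must be accessed on the Event Dispatch Thread,
        // but main() runs on the main thread (missing context condition).
        JFrame badFrame = new JFrame("Main Window");
        badFrame.setVisible(true);

        // Correct usage: defer the UI calls to the Event Dispatch Thread.
        SwingUtilities.invokeLater(() -> {
            JFrame goodFrame = new JFrame("Main Window");
            goodFrame.setVisible(true);
        });
    }
}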

Since FindBugs apparently searches for many API misuses, we want to compare its capabilities to state-of-the-art API-misuse detectors. Therefore, a student implemented a MuBench runner to integrate FindBugs into the benchmark.8 Before developing the runner, the student was completely unfamiliar with both MuBench’s runner framework and FindBugs. It took him approximately four days to complete the task. This enables us to compare FindBugs’ detection capabilities with state-of-the-art misuse detectors, using the setup described in Chapter 5.


Precision of FindBugs We first apply FindBugs to the five target projects for Experiment P, to determine its precision. We assess the top-20 findings of the tool, following the review process described in Section 5.3. Table 8.2 shows that FindBugs reaches a precision of 52.4% in this experiment, which is significantly better than the other detectors.

In total, FindBugs identifies 43 true positives among its top-20 findings. The largest category of findings (eleven findings) identifies missing or redundant null checks. FindBugs identifies these based on null checks present within a method that do not guard all usages of the checked variable. The second-largest category (ten findings) identifies usages that ignore a signal return value. For example, when creating directories through File.mkdirs(), one should check the return value, which indicates whether all directories were successfully created. Further categories of findings identify usages that perform string concatenation in a loop using the + operator instead of a StringBuilder (five findings), usages that invoke the constructors of primitive-type object wrapper classes instead of the more-efficient valueOf() methods (four findings), usages that compare strings using the == operator instead of the equals() method (three findings), and usages that suppress all exceptions (three findings).

Conversely, FindBugs reports 39 false positives among its top-20 findings. Most false positives (33 findings) are caused by limitations of FindBugs' static code analysis, e.g., because the tool does not analyze transitively called methods or because it over-approximates statement reachability. Another two false positives are caused by FindBugs' bug patterns being too restrictive. For example, FindBugs reports all invocations of System.exit() that occur outside a program entrypoint (main() method). However, one usage invokes exit() in a different method that is explicitly documented as the program entrypoint. The last four false positives are code smells, such as redundant checks, which are not API misuses per se.

These results show that a more targeted analysis using specific bug patterns may increase precision. Nevertheless, FindBugs' precision is still far from perfect. Interestingly, most of its false positives are caused by imprecisions of the underlying static analyses, a root cause that we also identified for 23.9% of the false positives reported by the other detectors (see Section 7.1).
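To illustrate the second-largest category of true positives (our own example, not one of the reviewed findings):

import java.io.File;

public class MkdirsUsage {
    // Misuse: mkdirs() signals failure via its return value, not via an exception,
    // so ignoring the result silently swallows the error.
    static void prepareIgnoringResult(File dir) {
        dir.mkdirs();
    }

    // Correct usage: check the signal return value (mkdirs() also returns false
    // if the directory already exists, hence the additional isDirectory() check).
    static void prepare(File dir) {
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IllegalStateException("could not create directory: " + dir);
        }
    }
}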

Recall of FindBugs Since FindBugs does not learn from usage examples, the recall upper bound we measure in Experiment RUB equals its actual recall. The left part of Table 8.3 shows the recall of FindBugs and the recall upper bound of the API-misuse detectors. We find that FindBugs misses 84.3% of the misuses in MuBench, more than any of the misuse detectors. The single reason for all false negatives is that FindBugs does not have a bug pattern that identifies the respective misuses.

6The FindBugs website reports over a million downloads: http://findbugs.sourceforge.net/index.html (checked on Oct 18, 2017) 7 http://findbugs.sourceforge.net:80/bugDescriptions.html (checked on Jun 08, 2017) 8Repository: https://github.com/MUBench/FindBugs (checked on Jun 27, 2018)


Table 8.3.: Recall of FindBugs Compared to API-Misuse Detectors.

                        Experiment RUB                        Experiment R
Detector      Hits   Recall Upper Bound   Kappa Score   Hits   Actual Recall   Kappa Score
Jadet          15          23.4%              0.76         3        2.5%           1.00
GROUMiner      31          48.4%              0.84         0        0.0%           1.00
Tikanga        13          20.3%              0.84         7        5.8%           1.00
DMMC           15          23.4%              0.85        11        9.1%           0.95
FindBugs       10          15.7%              0.92        53       43.8%           1.00

Therefore, the misuse detectors, which can learn patterns for any API from usage examples, have the potential to outperform FindBugs in terms of recall.

For Experiment R, we add FindBugs' 43 true positives from Experiment P to the dataset. Since FindBugs can detect misuses even if there is no example usage to learn from, we also include the 25 hand-crafted misuse examples, which we previously only considered in Experiment RUB. This gives us a total of 121 misuses to measure recall. The right part of Table 8.3 shows the results. FindBugs correctly identifies its own 43 true positives and the ten misuses it also identified in Experiment RUB. This leads to a recall of 43.8%. While this recall is much higher than that of the misuse detectors, we note that FindBugs' own findings accumulate to more than a third of the dataset, which likely biases the results in its favor. Jadet, GROUMiner, Tikanga, and DMMC identify only the same misuses they already identified in our previous experiment (cf. Section 7.3) and miss all of FindBugs' true positives. Conversely, FindBugs misses all true-positive findings of the misuse detectors. This suggests that misuse detectors and FindBugs should not be considered alternatives for the identification of API misuses, but rather complementary tools with distinct strengths.

8.2.2. Integration of Salento

Murali et al. [MCJ17] recently presented the API-misuse detector Salento. We introduce this detector in our survey in Chapter 4. Their promising evaluation results motivated us to try integrating Salento into MuBench. When we contacted the authors in December 2018, they were enthusiastic about the idea. Unfortunately, their original detector implementation supports only Dalvik Bytecode, while MuBench provides only Java Bytecode. For this reason, among others, Murali et al. started migrating Salento to a different analysis framework. Afterwards, they plan to integrate Salento into MuBench for a respective evaluation. At the time of this writing, the migration is still in progress.


9. Related Work

To the best of our knowledge, MuBench is the only existing automated benchmark for Java API misuses. There are, however, several existing reference datasets with general software-defect data and code-analysis benchmark datasets, as well as automated pipelines for the evaluation of defect detectors. There are also other studies that presented some empirical comparison of API-misuse detectors. This chapter presents an overview of the respective work.

Benchmark Datasets As our survey of misuse-detection literature (see Section 4.3) shows, most prior work used custom collections of target projects for the evaluation of the respective detectors. The only dataset that was used for the evaluation of multiple dynamic detectors [GS10, NK11, PJAG12, PG12] is the DaCapo benchmark suite [BGH+06]. Unfortunately, this suite does not provide the source code for the contained projects, which we need as the input for some static misuse detectors, such as GROUMiner [NNP+09b]. While we could obtain corresponding source code through decompilation, this might introduce bias in the evaluation, since decompiled sources likely differ from human-written code. Furthermore, the only known misuses in the DaCapo projects are those identified by said dynamic misuse detectors. Using them as the ground truth for comparing the recall of misuse detectors likely biases the benchmark towards the capabilities of these detectors.

The Qualitas Corpus [TAD+10], another benchmark suite for static code analyses, provides both source code and Bytecode for each contained project version. Reif et al. [REHM17] present an approach to assemble minimal representative datasets of projects for the evaluation of static analyses, which might be adapted to define a representative dataset for benchmarking misuse detectors. However, in both cases, obtaining a ground truth about existing misuses in the respective dataset remains an open challenge.

Researchers presented various datasets with general software-defect data from software projects, identified in projects' issue trackers or version-control systems. While these datasets are valuable for many research directions, such as triaging of bug reports, defect localization, or defect prediction, they identify all kinds of bugs, which is too general for evaluations of misuse detectors. We use four such datasets, namely BugClassify [HJZ13], Defects4J [JJE14], iBugs [DZ07], and QACrashFix [GZW+15], to study the prevalence of API misuses compared to other types of bugs (see Section 2.2) and derive many of the misuse examples in our benchmark dataset from the results of this study (see Section 5.2). Moreover, we find that developers struggle with API misuses that may never be committed to version-control systems and, consequently, may never be mentioned in issue trackers (see Section 2.3). Considering only misuse examples observable in these sources might miss important cases and bias the evaluation of


API-misuse detectors. Therefore, we add examples from a developer survey and a study of documented API constraints to our benchmark dataset as well (see Section 5.2).

Automated Benchmarks To the best of our knowledge, there is no existing automated benchmark for any type of Java software defects. There are, however, two such projects for other programming languages: BugBench [LLQ+05] is a defect-detection benchmark of C and C++ programs. Its dataset names 13 memory-related defects, four concurrency bugs, and two logic bugs from 17 programs. It comes with a pipeline that automates an experiment to measure detector accuracy, which is used to measure and compare the accuracy of 13 detectors. BegBunch [CHK+09] is a defect-detection benchmark of C programs. Its dataset identifies several defects from the categories of buffer overflows, memory and pointer bugs, integer overflows, and faulty format strings. It comes with a pipeline that automates an experiment to measure detector accuracy and another experiment to measure detectors' runtime performance and scalability. While neither of these benchmarks focuses on API misuses, the underlying idea of providing a pipeline that automates as much as possible of the evaluation process, to make it repeatable and comparable, motivated the development of MuBenchPipe.

Comparison of Misuse Detectors To the best of our knowledge, the only prior work that directly compared the results of multiple API-misuse detectors is a study by Pradel et al. [PJAG12]. They evaluate their detector PJAG12 on the same target projects as were used in the evaluations of GROUMiner and Tikanga. They show that PJAG12 successfully identifies six misuses that GROUMiner misses, but misses four misuses that GROUMiner identifies, while there is not a single misuse identified by both detectors. In comparison to Tikanga, PJAG12 identifies nine additional misuses, but misses two, while three misuses are identified by both detectors. As the main reason for their additional findings they identify that PJAG12 is able to capture calls that are illegal in a particular state of an object, i.e., calls that are missing a precondition, which the other two detectors cannot capture, by design. Furthermore, they find that all three detectors miss some misuses, because they did not mine the respective specifications. To address this issue, they suggest to mine larger sets of programs or to generate additional mining input, e.g., through automated test generation to obtain more execution traces for dynamic detectors. We confirm these findings on the basis of a larger dataset and two further detectors, namely Jadet and DMMC, and provide additional insights on the specific problems of static misuse detectors. Additionally, with MuBench, we provide an automated pipeline to conduct similar comparisons with less manual effort, including additional detectors and using a larger dataset.

Legunsen et al. [LHX+16] present a study of the effectiveness of both manually written and mined specifications for misuse detection. While this work does not provide an explicit comparison of different detectors, it provides general insights on the practical capabilities of misuse detectors based on runtime verification. They find that, while runtime verification identifies actual misuses, it also produces a large number of false positives,

namely 82.8% for the manually written specifications and 97.9% for the mined specifications. They report that only a small number of all specifications effectively identify true positives. These results suggest that the problems of dynamic API-misuse detectors might be quite similar to those of static detectors. We integrated the true positives from Legunsen et al.'s study into MuBench (see Section 8.1.1), to enlarge our benchmark dataset and to enable a more direct comparison between static and dynamic misuse detectors in future work.


Part III.

MuDetect


The Next Step in Static API-Misuse Detection

Despite the amount of previous work, API misuse remains a problem in practice [LHX+16, ABF+16]. Therefore, in Chapter 7 of this thesis, we empirically evaluated and compared existing API-misuse detectors using our automated benchmark MuBench (Chapter 5). In this study, we identified several major problems that affect precision and recall:

• Detectors fail to distinguish correct usages from misuses, due to not capturing sufficient code details in their representations.

• Detectors ignore alternative usage patterns and semantically correct deviations, e.g., using a subtype versus a supertype, naively assuming that all deviations from usage patterns are misuses.

• Detectors often fail to relate misuses with patterns, because the difference between the two exceeds an assumed threshold.

• Detectors have poor ranking strategies that rank many false-positive findings higher than true-positive ones.

Furthermore, the study observed that evaluations mostly apply detectors in a per-project setting, where they mine usage patterns solely from the project in which they detect misuses. A presented hypothesis is that individual projects contain too few usage examples to mine good patterns, which limits the detectors' recall. All these problems need to be properly addressed for API-misuse detectors to be practically useful. In Chapter 10, we discuss these problems in more detail and sketch how we address them with the new API-misuse detector that we introduce in this part of the thesis. In Chapter 11, we present MuDetect, a new API-misuse detector that we designed based on the strengths and deficiencies of existing detectors. MuDetect encodes API usages as API Usage Graphs (AUGs), a new, comprehensive usage representation designed to capture the differences between correct usages and misuses. MuDetect employs a greedy, frequent-subgraph-mining algorithm to mine patterns and a specialized graph-matching strategy to identify (violating) occurrences of patterns. Both components consider code semantics of API usages to improve the overall detection capabilities. On top, MuDetect uses an empirically optimized ranking strategy to effectively report true positives among its top-ranked findings. We investigated and designed our solution by learning from the findings of the detectors that we evaluated against MuBench in Chapter 7. We then extended MuBench by 107 misuses identified in a recent study on runtime verification [LHX+16], which more

than doubles its size. We assess and compare the performance of MuDetect and four other state-of-the-art detectors on this extended benchmark. In Chapter 12, we present the evaluation setup we use to this end. In Chapter 13, we present the respective results, which show that MuDetect significantly outperforms the other detectors in terms of both precision and recall. In a setting with perfect training data, MuDetect achieves a recall of 72.5%, which is 20.3% higher than the second-best detector and over 50% higher than the other three detectors. In the typical per-project setting, MuDetect achieves a recall of 20.9%, which is 10.2% better than the second-best detector, and a precision of 21.9%, which is 13.1% better than the second-best detector. In a cross-project setting, where we use a large quantity of usage examples from 3rd-party projects for pattern mining, MuDetect achieves a recall of 42.2% and a precision of 34.1%, again significantly exceeding its own performance from the per-project setting. These results show that our systematic design approach, based on the strengths and deficiencies of existing detectors, successfully led us to a significantly better detector, while balancing recall and precision. The most important decision was to separate pattern mining and violation detection, which allowed us to train MuDetect on cross-project usage examples. Future work should focus more on providing detectors with high-quality training examples. In Chapter 14, we discuss the threats to the validity of our results and, in Chapter 15, to conclude this part, we provide an overview of related work on API-misuse detection.

10. Motivation

Our goal is to design a new misuse detector that addresses problems of the detectors GROUMiner, Jadet, Tikanga, and DMMC, as identified in Chapter 7. Section 4.2 contains a detailed description of these detectors. We briefly reintroduce them here.

GROUMiner [NNP+09b] represents usages as directed acyclic graphs that encode method calls, field accesses, and control structures as nodes and control-flow and data-flow dependencies among them as unlabeled edges. GROUMiner uses sub-graph mining to find patterns and then detects violations of these patterns as missing nodes. It detects missing method calls, as well as missing conditions on the granularity of a missing branching or loop node.

Jadet [WZL07] encodes the transitive closure of the call-order relation in each usage as pairs of the form m() ≺ n(). It uses Formal Concept Analysis [GW97] to identify violations, i.e., rarely missing call-order pairs. Tikanga [WZ11] builds on the same algorithm as Jadet, but encodes usages using temporal properties (CTL). Both detectors detect missing calls. However, Jadet cannot detect violations of patterns with only two calls, because it works on call pairs: such a violation would be a usage with a single call, which cannot be encoded as a call pair. Tikanga can detect such violations.

DMMC [MM13] encodes usages as sets of methods called on the same receiver type. It identifies violations by computing, for every usage, the ratio between the number of equal usages and the number of usages with exactly one additional call. Intuitively, a violation should have only few exactly-similar usages, but many almost-similar usages. DMMC detects misuses with exactly one missing method call.

The empirical evaluation and comparison of these four detectors revealed several problems that result in low recall and precision (Chapter 7). We now give a brief description of these problems and how we mitigate them.
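To make these encodings concrete before we turn to the problems, consider the following Iterator usage (our own example); the comments sketch, roughly, how Jadet and DMMC would represent it:

import java.util.Iterator;
import java.util.List;

public class IteratorUsage {
    static void printAll(List<String> list) {
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
    // Roughly: Jadet records call-order pairs on each object, e.g.,
    // hasNext() ≺ next() for the Iterator, whereas DMMC records the call set
    // {hasNext(), next()} on the Iterator receiver.
}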

Problem P1: Imprecise Representation On average, 45.8% of the false negatives were due to the inability of the detectors' underlying representations to capture details necessary for differentiating misuses from correct usages (see Section 7.2). For example, DMMC and GROUMiner encode methods by their names only and, hence, cannot detect a missing method call when an overloaded version of the method is called (e.g., the pattern requires String.getBytes(String), while the misuse calls String.getBytes()). MuDetect's representation tracks method-call arguments. Additionally, for the first time, we provide a representation that combines tracking of control flow, exceptional flow, order of method calls, synchronization, and data flow. Such features have typically been captured in isolation by previous detectors, but never all combined in one representation.
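The following sketch illustrates this limitation with the String.getBytes() example from above (the method names are ours):

import java.io.UnsupportedEncodingException;

public class GetBytesExample {
    // A name-only representation records a call to "getBytes" in both methods and
    // therefore cannot see that the misuse omits the charset argument.
    static byte[] misuse(String s) {
        return s.getBytes(); // uses the platform-default charset
    }

    static byte[] correctUsage(String s) throws UnsupportedEncodingException {
        return s.getBytes("UTF-8"); // the overload required by the pattern
    }
}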


Problem P2: Imprecise Pattern Matching On average, 31.3% of false negatives were due to detectors not matching patterns to misuses (see Section 7.2). For example, to identify that two methods are called in the wrong order, say b(); a() instead of a(); b(), a detector needs to both capture the call order in its representation and match the pattern and misuse despite the different order. Similarly, a detector needs to consider sub-typing information to match a Collection.size() call found in a pattern to an ArrayList.size() call found in a usage. Another issue is that some detectors use a distance threshold to filter their findings, which may filter out true positives, e.g., if a misuse contains additional, optional method calls. MuDetect matches calls even if their order differs, considers type-hierarchy information, and does not employ a distance threshold, but rather enforces a ranking strategy.

Problem P3: Uncommon-but-correct Usage On average, 34.3% of the false positives were uncommon-but-correct usages (see Section 7.1). It is generally difficult, if not impossible, to automatically and precisely distinguish uncommon usages from misuses. However, we observed that many of the false positives for uncommon usages involved methods without side effects (pure methods), such as getters. These violations typically report that a pure method call is missing or that there is a wrong call order of several pure methods. For example, MapEntry.getKey() and MapEntry.getValue() often occur together, but there is neither a requirement that both are called in the same usage nor that these calls occur in a particular order. Since invocations of pure methods cannot be required, unless their return value is actually needed, MuDetect removes calls to pure methods from patterns, unless their return value is used in the pattern.
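A hand-crafted illustration of such pure-method calls (our own example):

import java.util.Map;

public class PureCalls {
    // getKey() and getValue() have no side effects; a usage that calls only one of
    // them, or calls them in a different order, is still correct.
    static void printValues(Map<String, Integer> map) {
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            System.out.println(entry.getValue()); // no getKey() required here
        }
    }
}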

Problem P4: Alternative Patterns On average, 19% of false positives were usages that deviated from some individual pattern, but conformed to another pattern (see Section 7.1). Other previous work reports that alternative-pattern instances may even accumulate to 28% of the false positives [TX09a]. Therefore, MuDetect filters alternative-pattern instances. In addition, MuDetect abstracts over some usage variants to reduce the number of alternative patterns.

Problem P5: Self- and Cross-method Usages On average, 12.2% of false positives were due to self- and cross-method usages (see Section 7.1). In a self usage, a class uses part of its own API in its implementation, e.g., Collection.addAll() calls Collection.add(). Constraints that client usages must adhere to, e.g., guarding calls by checks, may not apply to self usages. In a cross-method usage, an object stored in a field is used across methods, e.g., initialized in the constructor and used in other methods. From the perspective of an individual method, we have only a partial view on the entire usage scattered across methods. For both types of usages, an intra-procedural analysis potentially detects partial usages, i.e., violations, that are unlikely to be actual misuses. Therefore, MuDetect ignores usages on the API's own implementation.

Problem P6: Call Multiplicity On average, 1.4% of the false positives were due to the inability of detectors to distinguish methods that must be called exactly n times from methods that must be called at least once and methods that may be called arbitrarily often (see Section 7.1). In its usage representation, MuDetect distinguishes whether a method is not called, called once, or called multiple times consecutively.

Problem P7: Poor Ranking Often, the detectors correctly identified misuses, but ranked them extremely low (see Section 7.3). Since developers are unlikely to go through hundreds of findings, an effective ranking mechanism that pushes true positives to the top is essential. We empirically investigate several ranking factors from the literature through MuDetect and compose a ranking strategy that effectively prefers true positives.

Problem P8: Lack of Usage Examples A possible cause of the detectors’ low recall is a lack of correct usages in the target projects that they train on (see Section 7.3). To validate this hypothesis, we evaluate MuDetect in both a per-project setting, which is the norm in the literature, and a cross-project setting that provides more usage examples for training.


11. A New Detector

We design a new API-misuse detector, MuDetect, to address the problems summarized in Chapter 10 as follows:

1. We design API-Usage Graphs (AUGs), a new representation of API usages that simultaneously captures many properties of API usages that can distinguish misuses from correct usages.

2. We design a new pattern-mining algorithm based on the idea of frequent-subgraph mining that exploits domain knowledge about API usages to efficiently identify usage patterns.

3. We design a new violation-detection algorithm, which uses domain knowledge to efficiently identify API-usage violations.

4. We design a ranking strategy based on factors from the literature that effectively ranks true positives before false positives.

We subsequently introduce MuDetect’s four components, one at a time.

11.1. API-Usage Graphs

Graph-based representations are well-suited to simultaneously encode different aspects of API usages; however, existing graph representations do not capture all details necessary to identify API misuses (see Section 7.2). To address the problem of insufficient details in the usage representation (P1 (Imprecise Representation)), we propose API-Usage Graphs (AUGs), a new representation of API usages. An AUG is a directed, acyclic, connected multigraph with labeled nodes and edges. Nodes represent data entities, such as variables, and actions, such as method calls; edges represent control and data flow between the entities and actions represented by nodes. In the following, we discuss all AUG elements in detail, using the AUG in Figure 11.1 for illustration.

11.1.1. Usage Actions

We use action nodes to represent method calls, operators, and instructions in API usages (boxed shapes in Figure 11.1). We encode method calls, because they are the primary components of APIs. We treat field accesses like method calls, because we regard them as calls to respective getter or setter methods. For method calls, we use labels of the form T.M(), where M is the method's name and T is the simple name of its declaring

type. For constructor calls, we use labels of the form T.<init>. Using the declaring type abstracts over different static receiver types (P2 (Imprecise Pattern Matching)), since, for example, all calls to size() on a List, LinkedList, or ArrayList are labeled Collection.size(). We encode equality and relational operators to capture conditions such as var != null or list.size() > 0. To abstract over alternative ways to express the same condition, e.g., l.size() != 0 and !(l.size() == 0), we use a single dedicated label for all equality and relational operators and drop the unary negation operator. To further abstract over alternative ways to compose conditions, e.g., a && b and !(!a || !b), we drop the conditional operators && and ||. With this abstraction level, we focus on detecting the absence or presence of conditions in API usages, rather than subtle logical mistakes in the conditions. We explicitly capture null checks, e.g., the null check on file in Figure 11.1, by action nodes with a dedicated label, to distinguish this special kind of condition from comparisons with values or other literals. We drop arithmetic and bitwise operators, because their usage is mostly related to program logic rather than API usage constraints. We encode unconditional control instructions, such as return, throw, and catch, by action nodes with dedicated labels, e.g., the <return> and <catch> nodes in Figure 11.1.

if (file != null) {
    try {
        FileInputStream fis = new FileInputStream(file);
        return fis.read();
    } catch (FileNotFoundException e) {
        handle(e);
    }
}

Figure 11.1.: A Code Fragment and its API-Usage Graph.

11.1.2. Data Entities

We use data nodes to represent objects, values, and literals that appear in API usages (circular shapes in Figure 11.1). We encode data entities as nodes to make data dependencies between actions, such as multiple method calls on the same object, explicit. This ensures that we have a connected subgraph for each usage of an individual object, regardless of the order of method calls, which improves our ability to match usages (P2 (Imprecise Pattern Matching)). It also enables us to distinguish overloaded versions of methods by their parameter entities (P1 (Imprecise Representation)). We uniformly create data nodes for local variables and fields, as well as for values and objects that are not explicitly assigned but directly used, e.g., in a method-call chain. For example, Figure 11.1 shows a data node for the FileInputStream assigned to the local variable fis, for the FileNotFoundException captured as e, as well as for the int value that is returned from read() and directly passed on as the snippet's return value. Since certain types, such as List, ArrayList, and LinkedList, appear almost interchangeably in API usages, we decided to label all data nodes with a single generic label. This allows us to abstract over different static types (P2 (Imprecise Pattern Matching)), while checking the data and control flow that the data entities take part in. Note that Figure 11.1 shows the simple type names for better readability. We use this generic label also for all data nodes created from literals, to unify over equivalent alternatives. For example, both l.size() > 0 and l.size() >= 1 represent the same condition using different literals, and both b.setEnabled(true) and b.setEnabled(b) are instances of the same usage, where one uses a literal and the other a value.

11.1.3. Control Flow and Data Flow

We use edges to represent control flow and data flow. We distinguish nine types of edges and label them with their type. Figure 11.1 shows seven of these edge types, labeled with acronyms for brevity.

• A receiver edge connects a data node to a method call that is invoked on the respective object.

• A parameter edge connects a data node to an action that takes the respective object or value as a parameter.

• A definition edge connects an action that creates or returns a value or object to the respective data node.

• An order edge connects, in the order of their execution, two action nodes operating on the same data entity (receiver or parameter). Since we want MuDetect to discover wrong method-call order, we over-approximate temporal relations between


actions by building the transitive closure over order edges. To keep AUGs acyclic, we exclude backwards edges from loops.

• A condition edge connects an action a whose result is considered for a branching to all actions that depend on that branching.

• A synchronize edge connects a data node that the program obtains a lock on to all actions executed while holding the lock.

• A throw edge connects an action that may throw an exception to a data node representing that exception object. We use the throws information, if it is resolvable, to determine which exception may be thrown by an action. We connect exception data nodes to the respective <catch> nodes with parameter edges.

• A handle edge connects a <catch> node to all actions in the respective exception-handling block.

• A contain edge connects a data node representing an anonymous-class instance to the data nodes representing each of its declared methods and a data node representing such a declared method to all actions in its body.

This detailed dependency information helps us to distinguish misuses from correct usages (P1 (Imprecise Representation)), to relate usages despite notational differences (P2 (Imprecise Pattern Matching)), and to consider code semantics in both pattern mining and violation detection.

11.1.4. Building API-Usage Graphs

We implemented a transformation from Java source code to AUGs. This transformation takes a single source file or a source-file directory as input. It parses each source file into an abstract syntax tree (AST) and traverses the AST to generate one AUG for each method body in the file, using an intra-procedural analysis. The transformation optionally takes a classpath of the source code's dependencies, such as would be available in an IDE, for type resolution. If type information is unavailable, it falls back to the partial information available in the source code. For each method in the source code, the transformation proceeds in the following steps.

1. We create action nodes for all the method calls, operators, and instructions. To reduce false positives due to P5 (Self- and Cross-method Usages), we heuristically exclude self- and cross-method usages from AUGs: We skip method calls on this and super, as well as on fields accessed through these qualifiers, when creating action nodes.

2. We create data nodes for all the objects and values and connect actions that produce data to the respective data nodes through definition edges and data nodes that are consumed by actions to the respective action nodes through receiver and parameter edges. For example, in Figure 11.1 the FileInputStream receives a


read() method call, which produces an int as a result, which is then passed as a parameter to the <return> node.

3. We create order edges between each pair of actions that share a receiver or pa- rameter data node, in the order of their execution. Since our intra-procedural analysis is likely to miss data dependencies and, thereby, order relations, we build the transitive closure over order edges. This conservatively over-approximates the order relation.

4. We add condition edges from each action whose result is checked in an if-condition or loop header to all actions within the controlled code block(s). This encodes that the execution of these actions depends on the check, as, for example, the creation and use of the FileInputStream happen only if the file variable is not null in Figure 11.1. For consistency, we translate foreach loops into the corresponding Iterator usages, such that they, too, have a condition that we may capture with a condition edge. This unification abstracts over two alternative usage patterns (P4 (Alternative Patterns)). If an action's result is used in a condition that guards a return or throw instruction (or a continue or break within a loop), we also add condition edges to all actions after the condition (within the respective loop), because, depending on the result, these may or may not get executed.

5. We use the throws information, if it is resolvable, to determine which exceptions may be thrown and create throw edges from the call nodes to the respective exception data nodes. If there is a respective catch block capturing this type of exception, we add a parameter edge from the exception data node to the respective <catch> node. Furthermore, we add handle edges from each <catch> node to all actions within the respective code block. This happens, for example, for the FileNotFoundException in Figure 11.1. We translate the try-with-resources statement into a try/finally statement to abstract over manual and automatic resource closing. This unification abstracts over two alternative usage patterns (P4 (Alternative Patterns)).

6. If the code synchronizes on some object, we create synchronize edges from the respective data node to all actions within the synchronize block.

7. If some code appears within a method of an anonymous-class instance, we create a data node to represent that method. We add contain edges from the data node representing the instance to this method node and from the method node to all action nodes within the method. This is motivated by how GROUMiner detects missing context conditions through its encoding of anonymous inner classes (see the unexpected hits discussed in Section 7.2), but makes the representation more explicit. Figure 11.2 shows an example.


public static void main(String[] args) {
    SwingUtilities.invokeLater(new Runnable() {
        public void run() {
            JFrame f = new JFrame("Main Window");
            // add components...
            f.setVisible(true);
        }
    });
}

Figure 11.2.: Representation of Anonymous-Class Instances in AUGs.

11.2. Pattern Mining

Algorithm 11.1 shows our pattern-mining algorithm, which takes a set A of AUGs, a frequency measure f, and a frequency threshold σ and produces a set of frequent patterns. A pattern is an AUG that reoccurs as a subgraph in A, and a pattern instance is one such occurrence. We consider a pattern p as frequent if its frequency f(p) is greater than or equal to σ. The algorithm is (1) an instance of a-priori-based frequent-subgraph mining that (2) considers action semantics while (3) greedily exploring pattern alternatives. We subsequently discuss these three key ideas in detail.

11.2.1. Apriori-based Mining

The algorithm follows the general idea of an a-priori-based algorithm for frequent-subgraph mining [RP15], i.e., it mines patterns by starting from all single-method-call patterns (Line 2) and recursively extending them to larger patterns (Line 6). The key idea here is that if a graph occurs frequently, all of its subgraphs also occur frequently. To extend a pattern p of size k, the algorithm generates all suitable extensions of size k + 1 for all pattern instances of p (Line 12), by exploring each adjacent node (Line 31). An adjacent node to a pattern instance i in an AUG a is a node from a that is not in i and has an edge from or to a node in i. When i is extended by an adjacent node, all edges from a that connect i and the node are added as well. Extending by nodes, as opposed to by edges, enables scalable mining of AUGs, which usually have a large number of edges.


1  def mine(A: Set[AUG], f: Pattern → int, σ: int)
2      P0 = {p | p ∈ single_call_node_patterns(A) and f(p) ≥ σ}
3      P = ∅
4
5      for p in P0:
6          extend(p, P, f, σ)
7
8      return P
9
10
11 def extend(p: Pattern, P: Set[Pattern], f: Pattern → int, σ: int)
12     E = {e | i ∈ p and e ∈ generate_extensions(i)}
13     PC = {c | c ∈ isomorphic_clusters(E) and f(c) ≥ σ}
14     UC = PC \ P
15
16     if UC ≠ ∅:
17         c = most_frequent(UC)
18         extend(c, P, f, σ)
19     else:
20         P = P ∪ {p}
21
22     ip = {i | i ∈ p and ∀c ∈ PC. generate_extensions(i) ∩ c = ∅}
23
24     if f(ip) ≥ σ:
25         P = P ∪ {ip}
26
27
28 def generate_extensions(i: Instance)
29     extensions = ∅
30
31     for n in adjacent_nodes(i):
32         if has_non-order_connection(n, i)
33            and (not is_consecutive_call(n, i))
34            and ((not is_pure_method_call(n))
35                 or (is_pure_method_call(n) and has_out_connection(n, i))
36                 or (is_operation_node(n) and has_in_and_out_connection(n, i))
37                 or (is_data_node(n) and has_out_connection(n, i))):
38             extensions = extensions ∪ {i ⊕ n}
39
40     return extensions

Algorithm 11.1.: MuDetect’s Pattern-Mining Algorithm.


11.2.2. Code Semantics

When extending a pattern instance i, the algorithm distinguishes different types of adjacent nodes. Specifically, the algorithm decides whether an adjacent node is suitable for extending i as follows:

• A node that is only connected by an order edge is unsuitable (Line 32). This prevents MuDetect from mining patterns with data-independent parts, since we found that the relations between such parts never correspond to API usage constraints.

• A method call m() is also unsuitable, if the extension of i by that call contains two consecutive calls to the same method m() (Line 33). This prevents MuDetect from reporting violations where the pattern would have, for example, four calls to m() while the target code has only three calls (P6 (Call Multiplicity)).

• Otherwise, a non-pure method call is always suitable (Line 34). Recall that a method is non-pure when it has a side effect, i.e., when its presence in the usage implies some observable action.

• A pure method call is suitable only if it has an outgoing edge to a node in i (Line 35), i.e., if it defines a data node or controls an action node in i. Since pure methods have no side effects they can impact a usage only through their return value. To avoid the complexity of inter-procedural analysis, the algorithm identifies pure methods heuristically: It considers any method whose name starts with get as pure, since getter methods are mostly pure and very prevalent.

• An operator is suitable only if it has at least one incoming and one outgoing edge to i (Line 36), because operators are like pure methods whose result is based solely on their parameters, as opposed to parameters and state.

• A data node is suitable only if it has an outgoing edge to i (Line 37), because a data node does not impact a usage unless it is used in it.

These decisions based on code semantics contribute to obtaining meaningful patterns, thereby mitigating the problem of flagging uncommon usages as misuses (P3 (Uncommon-but-correct Usage)).

11.2.3. Greedy Exploration

To identify (k+1)-patterns in the set of all extensions of the instances of p, the algorithm clusters equivalent extensions, i.e., isomorphic graphs, into pattern candidates, of which it keeps the frequent ones (Line 13). To reduce the complexity of graph-isomorphism detection, the algorithm uses a heuristic that combines graph vectorization and hashing [NNP+09a]. More specifically, a graph is represented as a vector of features, each of which is extracted from the labels of a sequence of nodes and edges along a path in the

graph. Two graphs are considered isomorphic if their corresponding feature vectors have the same hash value. The algorithm then filters out all candidates that it found before (Line 14). If there are no further frequent extensions of p, i.e., p is inextensible, the pattern is added to the set of final patterns P (Line 20). If any unexplored candidate remains (Line 16), the algorithm selects the most-frequent one (Line 17) and recursively searches for larger patterns (Line 18). This greedy strategy avoids the combinatorial explosion problem of exhaustive search with backtracking and makes our mining scale to a large number of large graphs, unlike GROUMiner, which times out on many target projects (see Section 7.1). In addition to the possible extensions, the algorithm also keeps track of those instances that do not have any frequent extension (Line 22). If these inextensible pattern instances themselves form a frequent pattern, it adds this pattern to the set of final patterns P (Line 25). This follows an observation by Lindig et al. [Lin07], who found that an API might have a core pattern and additional alternative patterns that contain this core pattern. Considering both ensures that all these alternative patterns are found (P4 (Alternative Patterns)).

11.3. Violation Detection

Algorithm 11.2 shows our detection algorithm, which takes a set T of target AUGs, a set P of patterns, and a ranking function r and produces a list of violations. A violation is a strict subgraph of a pattern, i.e., an AUG with at least one missing edge. Note that since AUGs are connected graphs, a missing node always implies at least one missing edge. The detection algorithm (1) identifies full occurrences (instances) and partial occurrences (violations) of patterns in a target codebase, (2) eliminates violations that are instances of an alternative pattern, (3) ranks the remaining violations, and (4) filters alternative violations of the same usage. We subsequently discuss these four steps in detail.

11.3.1. Graph Matching

To find violations of the patterns in the targets, the detection algorithm checks each pair of a target and a pattern for common subgraphs (Line 8). To identify the subgraphs, the algorithm follows the general idea of the pattern-growth approach for frequent-subgraph mining [RP15], i.e., it discovers the largest common subgraphs of each pair of a pattern and a target (Line 8), by starting from all common method-call nodes (Line 20) and recursively extending the common subgraph (Line 21), one adjacent edge at a time. This allows us to find even single missing edges, e.g., in case of a wrong order of two method calls. When searching for possible mappings of a pattern AUG onto a target AUG, the detection algorithm follows a greedy extension strategy. It continuously selects the next-best pattern edge, while exploring all alternative mappings to the target. This avoids the combinatorial explosion problem of an exhaustive search with backtracking. The


1  def find_violations(T: Set[AUG], P: Set[Pattern],
2          r: (Set[Violation], Set[Instance], Set[Pattern]) → List[Violation]):
3      V = ∅
4      I = ∅
5
6      for target in T:
7          for pattern in P:
8              for overlap in common_subgraphs(target, pattern):
9                  if overlap = pattern:
10                     I = I ∪ {Instance(target, pattern)}
11                 elif overlap ⊂ pattern:
12                     V = V ∪ {Violation(target, pattern, overlap)}
13
14     VA = filter_alternatives(V, I)
15     VR = r(VA, I, P)
16     return filter_alternative_violations(VR)
17
18
19 def common_subgraphs(t: AUG, p: Pattern):
20     S = single_call_node_overlaps(t, p)
21     return {lcs | lcs ∈ extend_subgraph(s, t, p) and s ∈ S}
22
23
24 def extend_subgraph(o: Overlap, t: AUG, p: Pattern)
25     e = next_extension_edge(o, t, p)
26
27     if e ≠ none:
28         return extend_subgraph(o ⊕ e, t, p)
29     else:
30         return o
31
32
33 def next_extension_edge(o: Overlap, t: AUG, p: Pattern):
34     ebest = none
35     wmin = inf
36
37     for e in adjacent_edges(o, t):
38         w = count_equivalent_edges(e, t) * count_equivalent_edges(e, p)
39         if (0 < w and w < wmin):
40             ebest = e
41             wmin = w
42
43     return ebest

Algorithm 11.2.: MuDetect’s Detection Algorithm.

algorithm explores all alternatives in the target, as opposed to in the pattern, because targets are usually larger and, therefore, likely contain more alternatives. This results in higher precision. Nevertheless, the algorithm might discover a non-optimal mapping between a pattern and a target, producing a false positive. This may happen whenever there are multiple equivalent candidate edges for extension. Two edges are equivalent if they have the same type, both their source and target nodes have the same label, respectively, and mapping them onto each other is consistent with the current mapping between target and pattern nodes. The node mapping is consistent if every node from the target is mapped to at most one node from the pattern and vice versa. Intuitively, the larger the number of equivalent edges, the more alternative mappings there are and the more likely it is for the algorithm to select a non-optimal mapping. To decrease the likelihood of this happening, the algorithm counts the number of equivalent edges in the target and the pattern (Line 38) and gives priority to edges with fewer equivalent alternatives. Mapping these first potentially eliminates equivalent alternatives, since additional nodes get mapped and some alternatives become inconsistent with the node mapping.

11.3.2. Alternative-Pattern Instances

There may be multiple alternative ways to use an API. For example, before fetching an item from a Set, we may either check that it is not empty or that its size is larger than zero. If we have patterns for both cases, these overlap, since fetching an item requires the same method calls in both cases. Consequently, an instance of one of the patterns necessarily violates the other pattern (P4 (Alternative Patterns)), because the instance shares elements with both patterns and either misses the check for emptiness or the check of the size. Following this insight, our detection algorithm separates all common subgraphs of a target and a pattern into (1) pattern instances, i.e., target subgraphs equal to the pattern (Line 9), and (2) violations, i.e., strict subgraphs of the pattern in the target (Line 11). When all targets and patterns have been processed, it uses the set of instances to filter violations that are subgraphs of instances of another pattern (Line 14).
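For illustration, the two alternative Set usages mentioned above as hand-crafted examples:

import java.util.Set;

public class AlternativeChecks {
    // Two equally correct ways to guard fetching an element; an instance of one
    // pattern necessarily misses the check required by the other.
    static <T> T firstOrNull(Set<T> set) {
        if (!set.isEmpty()) {
            return set.iterator().next();
        }
        return null;
    }

    static <T> T firstOrNullBySize(Set<T> set) {
        if (set.size() > 0) {
            return set.iterator().next();
        }
        return null;
    }
}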

11.3.3. Ranking Violations

After identifying all violations in the target code base, the detection algorithm ranks the findings (Line 15). The concrete ranking strategy is a parameter to our detection algorithm. This allows us to compare how different ranking strategies impact MuDetect's performance. A detailed discussion of possible ranking strategies follows in Section 11.4.

11.3.4. Alternative Violations

If a usage violates all alternative patterns, the previous steps produce multiple violations for the same usage, one for each alternative pattern. Note that filtering instances of alternative patterns, as described in Section 11.3.2, leaves all these violations in place.


To avoid reporting such duplicate violations, we filter out any violation that involves a method call that is already part of another violation at a higher rank (Line 16). In this step, the algorithm might remove true positives, if a false positive involving the same usage elements is ranked higher.

11.4. Ranking

Ranking the detected violations is crucial for MuDetect’s precision, since it controls how many true positives are ranked among the top findings (P7 (Poor Ranking)). The ranking may also impact MuDetect’s recall, since we filter alternative violations based on the ranking order of the findings (see Section 11.3.4), which may eliminate true positives. To design MuDetect’s ranking strategy, we first survey existing ranking strategies and discuss their individual factors. Then, we compose new ranking strategies from these factors.

Distance Some detectors use a maximal distance between a pattern and a usage to classify it as a violation of the pattern [WZL07, NNP+09b, WZ11]. Here, the distance is the number of facts from the pattern that the usage misses. Facts might be method calls, order relations between pairs of calls, or nodes and edges, depending on the usage representation. Intuitively, usages that are very different from a pattern P are more likely to be occurrences of an alternative pattern than violations of P . We observe that a strict classification by the number of missing facts may lead to false negatives (see Section 7.2), because some kinds of API misuses lead to multiple missing facts. For example, if a method call is missing, then so are all order relations between this call and others. Therefore, we propose to use the distance between a pattern and a violation as a ranking factor, such that violations that are farther away from the pattern simply get ranked lower. We compute the distance between a pattern and a usage AUG via the number of nodes and edges nm that the usage misses from the pattern. We normalize nm by the total number of elements np of the pattern. Since a missing node always implies that all edges connecting to it are also missing, we take the number of missing edges from/to missing nodes ne out of the equation. This leads to our violation-overlap measure vo = (nm − ne)/(np − ne).

Pattern Support Some detectors rank their findings by the support of the violated patterns (ps) [LZ05, TX09b, TX09a], i.e., by the number of instances of the pattern discovered during pattern mining. Intuitively, ps expresses the miner’s confidence in the correctness of the pattern and, consequently, in its violations being problematic. Note that this leaves the relative order of multiple violations of the same pattern unspecified.

Confidence Monperrus et al. [MM13] rank their findings by the confidence, which combines the pattern support (ps) and the number of violations of the pattern (pv) into ps/(ps + pv). Intuitively, patterns with more violations are more likely to contain usage properties that are not strictly required and, hence, their violations are more likely to be false positives. Similar to the pattern support, the confidence depends solely on the pattern and, therefore, leaves the relative order of multiple violations of the same pattern unspecified.

Rareness Some detectors [LZ05, NNP+09b] rank their findings by their rareness, which combines the pattern support (ps) and the number of times the violation reoccurs, i.e., the violation support (vs), into (ps − vs)/ps. Intuitively, a violation that occurs more often is less likely to be problematic.

Defect Indicator Wasylkowski et al. [WZL07] rank findings by a defect indicator, which combines the pattern support (ps), the violation support (vs), and a pattern uniqueness factor (pu), into pu × ps/vs. To compute pu, they count, for every API in the pattern, the number of violations involving that API and take the inverse of the largest such number. Intuitively, if an API is involved in more violations, any particular violation involving it is less likely to be problematic.

MuDetect’s Ranking Strategy We observe that all ranking strategies from the literature are composed from a small number of ranking factors, namely the pattern support (ps), the pattern uniqueness (pu), the number of pattern violations (pv), the violation support (vs), and the violation overlap (vo). Each of these factors captures a characteristic of a violation (or the respective pattern) that makes it more or less likely to be an actual problem. As candidates for our ranking strategy, we consider the strategies from the literature and all combinations of the individual ranking factors by multiplication. For the latter, we use the factors ps, pu, and vo as is, but invert pv and vs, such that, for every factor, a smaller value indicates a lower probability that the violation is an actual problem. We multiply the factors, such that, if any one of them is low, the overall ranking weight is low. Since it is unclear which candidate strategy is most useful, we evaluate all of them empirically. We explain the respective experiment in Section 12.2.4 and its results in Section 13.1.
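The following sketch illustrates how such multiplicative candidate strategies can be composed from the individual factors. The method names and the double-valued factors are ours, not MuDetect’s actual implementation; the concrete candidate shown at the end (ps/vs × vo) is one of the strategies evaluated in Section 13.1.

    // Illustrative composition of ranking weights from the factors named above
    // (a higher weight means the violation is ranked higher).
    class RankingStrategies {

        // Generic multiplicative combination: ps, pu, and vo enter as is,
        // pv and vs enter inverted, so that every factor grows with the
        // likelihood of the violation being an actual problem.
        static double combined(double ps, double pu, double pv, double vs, double vo) {
            return ps * pu * (1.0 / pv) * (1.0 / vs) * vo;
        }

        // One concrete candidate: ps / vs * vo.
        static double psOverVsTimesVo(double ps, double vs, double vo) {
            return ps / vs * vo;
        }
    }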

11.5. Per-project and Cross-project Settings

As Sections 11.2 and 11.3 show, we designed MuDetect with separate pattern-mining and violation-detection phases. This allows us to run MuDetect in two different settings.

Per-project Setting In the per-project setting, we configure MuDetect to use the AUGs from its target project as the input for both pattern mining and violation detection. This enables a fair comparison to existing detectors that combine mining and detection in a single phase, such as the detectors we evaluated in Chapter 7, and that, thus, always mine and detect on the same input. In this setting, we follow existing work [LZ05, NNP+09b, WZL07, WZ11] and define the frequency measure f(p) of a pattern p as the number of distinct instances of the pattern.


Cross-project Setting In a cross-project setting, we configure MuDetect to use the AUGs from its target project as the input for violation detection and AUGs from other projects as input for pattern mining. This allows us to provide additional usage examples for mining (P8 (Lack of Usage Examples)). For easier differentiation in the text, we call this configuration MuDetectXP. In this setting, we define the frequency measure f(p) of a pattern p as the number of projects from which at least one instance of the pattern originates. The intuition is that a pattern that occurs in more projects is used by more developers (a generally reusable pattern) and, therefore, more likely to be correct than a pattern that occurs only in a single project (a project-specific pattern), even if it occurs frequently within that project.
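A minimal sketch of the two frequency measures, assuming a hypothetical PatternInstance type of our own that records the project each instance was mined from:

    import java.util.List;

    class PatternFrequency {

        record PatternInstance(String projectId) { }

        // Per-project setting: f(p) = number of distinct instances of the pattern.
        static long perProjectFrequency(List<PatternInstance> distinctInstances) {
            return distinctInstances.size();
        }

        // Cross-project setting: f(p) = number of projects from which at least
        // one instance of the pattern originates.
        static long crossProjectFrequency(List<PatternInstance> distinctInstances) {
            return distinctInstances.stream()
                    .map(PatternInstance::projectId)
                    .distinct()
                    .count();
        }
    }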

12. Evaluation Setup

In this chapter, we present the setup that we use to assess MuDetect’s ability to detect API misuses. Our main goal is to evaluate MuDetect’s precision and recall, especially in comparison to existing detectors, to understand whether our mitigation strategies are effective in practice. Additionally, we want to compare the various ranking strategies discussed in Section 11.4 in a practical setting.

12.1. Detectors and Dataset

In Chapter 7, we empirically evaluated and compared the four misuse detectors Jadet, GROUMiner, Tikanga, and DMMC. In this part, we compare MuDetect against these four detectors. As the ground truth for the previous experiments, we used MuBench, a dataset of 84 API misuses that we collected from general-purpose bug datasets, a developer survey, and confirmed findings of existing misuse detectors (Table 12.1, Row 1). For 64 of these misuses, MuBench also contains examples of correct usages, which are derived from the fix of the misuse, when available. Since we designed MuDetect using insights from the study presented in Chapter 7, an evaluation only on MuBench may suffer from overfitting. Therefore, we extend the benchmark dataset with misuses from four sources:

1. We add misuses identified in a recent study by Legunsen et al. [LHX+16]. They applied runtime verification of API specifications to 200 open-source projects and submitted 114 pull requests that fix API misuses identified in this process. From this set, we take the 77 misuses for which the pull request was accepted as of August 8, 2017.

2. We add 5 misuses identified in two previous misuse-detector evaluations [NNP+09b, WZ11].

Table 12.1.: MuBench: Number of Misuses (#MU) and Number of Misuses with Corresponding Correct Usages (#CU).

      Dataset              #MU   #CU
    1 Original MuBench      84    64
    2 Dataset Extension    107   107
    3 Extended MuBench     191   171


3. We add 5 misuses from a public collection of common misuses of APIs from the Java 8 Standard Library.1

4. We invest more time trying to compile project versions from our original collection of API misuses that we could not compile when we first created the MuBench dataset (see Section 5.2). We manage to compile nine additional project versions with 20 misuses.

Section 8.1 discussed all these extensions in detail. In total, we obtain 107 new misuse examples from 30 projects for our experiments (Table 12.1, Row 2).2 Following the structure of MuBench, we hand-craft examples of correct usage for all newly added misuses. We publish the extended dataset with MuBench. Overall, this gives us a benchmark dataset with 191 API misuses from real-world projects (Table 12.1, Row 3). We use this dataset in the subsequently described experiments. For simplicity, we refer to this extended dataset as MuBench throughout this part of the thesis.

12.2. Experimental Setup

To evaluate the precision and recall of MuDetect, we conduct three experiments, namely Experiment P to measure precision, Experiment RUB to determine a recall upper bound, and Experiment R to measure the actual recall. This is the same per-project experimental setup proposed in Chapter 5. In addition, we design two further experiments: Experiment RNK, to compare the various ranking strategies discussed in Section 11.4, and Experiment XP, to evaluate MuDetectXP in a cross-project setting, since one reason for the bad performance of detectors may be a lack of usage examples in the per-project setting (see Chapter 7). For the experiments, we set the frequency threshold σ = 10 for MuDetect and σ = 5 for MuDetectXP. For the other detectors, we use the best configurations as reported in the respective publications. We execute the experiments using MuBenchPipe, the public automated benchmarking pipeline that we built on top of MuBench (see Chapter 6). MuBenchPipe facilitates preparing the target projects from MuBench, executing the detectors on them, and collecting result statistics about the detectors’ performance. We manually reviewed the detectors’ findings: in all our experiments, two authors first independently reviewed each detector finding and then discussed any disagreements until a consensus was reached about whether the finding correctly identifies a misuse. We report Cohen’s Kappa score as a measure of the reviewers’ initial agreement. We now introduce these five experiments in detail.

1 https://github.com/xpinjection/java8-misuses (checked on Mar 20, 2018)
2 Jadet and Tikanga initially crashed on most of these new projects, because both detectors use the outdated Bytecode toolkit ASM 3.1. To fix this, we migrated them to the most recent version, ASM 6.0. To ensure that this change did not hamper their capabilities, we repeated the experiments from Chapter 7 and successfully verified that the detectors still produce the exact same results.


12.2.1. Experiment P

The goal of Experiment P is to measure the detectors’ precision. We run the detectors on all projects from MuBench, letting them mine patterns and detect violations on a per-project basis. Since some detectors report several hundreds of findings, reviewing all findings of all detectors on all projects is practically infeasible. Therefore, we sample ten projects and review the top-20 findings per detector on each of them, as determined by the detectors’ individual ranking strategies. In this sample, we include the five projects that we used in the precision experiment in Chapter 5. In addition, we choose another five of the new projects we added to MuBench. To this end, we compute the average normalized number of findings (ANNF) across detectors for each project. The NNF of a given detector on a given project is the number of findings the detector reports on that project divided by the maximum number of findings the detector reports on any project. Then, we select the two projects with the highest ANNF, the two projects with the lowest ANNF, and one random project from the mid range. For fairness, we exclude projects where one of the detectors failed or where two or more detectors did not report any findings. We cannot exclude all projects where one of the detectors did not report any findings, because this would leave us with fewer than five projects to choose from.
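A minimal sketch of the selection metric, under our own naming (the map layout is an assumption, not MuBenchPipe’s data model):

    import java.util.HashMap;
    import java.util.Map;

    class ProjectSelection {

        // findingsPerDetector: detector name -> (project name -> number of findings)
        static Map<String, Double> annf(Map<String, Map<String, Integer>> findingsPerDetector) {
            Map<String, Double> sums = new HashMap<>();
            for (Map<String, Integer> perProject : findingsPerDetector.values()) {
                // NNF = findings on this project / maximum findings on any project
                int max = Math.max(1, perProject.values().stream().mapToInt(Integer::intValue).max().orElse(1));
                perProject.forEach((project, findings) ->
                        sums.merge(project, (double) findings / max, Double::sum));
            }
            int numberOfDetectors = findingsPerDetector.size();
            sums.replaceAll((project, sum) -> sum / numberOfDetectors);  // average over all detectors
            return sums;
        }
    }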

12.2.2. Experiment RUB

The goal of Experiment RUB is to assess the detectors’ recall upper bound, given perfect training data. This helps us separate conceptual limitations from the effect of insufficient training data. Since we need to provide a correct usage as input training data, we limit this experiment to the 171 misuses in MuBench that have corresponding correct usages (see Table 12.1). We run the detectors once for each of these misuses, providing them with enough copies of the corresponding correct usage for pattern mining. This ensures that detectors always find sufficient evidence to mine the pattern required to identify the misuse. We review all potential hits, i.e., all detector findings in the same method as the known misuse.

12.2.3. Experiment R

The goal of Experiment R is to measure the detectors’ recall. We run the detectors on all projects of MuBench, letting them mine patterns and detect violations on a per-project basis. Then, we review all potential hits, i.e., all findings in the same method as a known misuse. As the ground truth, we use all 191 known misuses from MuBench, plus any previously unknown true positives identified by the detectors in Experiment P. This gives us the recall of the detectors with respect to a large number of misuses and, at the same time, crosschecks which of the detectors’ findings are also identified by other detectors.

137 12. Evaluation Setup

12.2.4. Experiment RNK

The goal of Experiment RNK is to find the best ranking strategy for MuDetect among the candidate strategies discussed in Section 11.4. Ideally, we would repeat both Experiment P and Experiment R for all 34 candidate ranking strategies to determine the best strategy. However, repeating Experiment P would require us to review up to 20 findings for each of the ten target projects per candidate strategy, a total of 6,800 findings, which is practically infeasible. Therefore, we only repeat Experiment R for each of the candidate ranking strategies. This gives us both the recall of the detector and the ranks of all confirmed hits, i.e., of findings that identify a known misuse from MuBench, while requiring us to review only a relatively small number of findings. We use the number of hits, the average rank of all hits, and the number of hits in the top-20 findings as quality measures for the ranking strategies.
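For illustration, the three quality measures can be computed from the ranks of the confirmed hits alone; the following sketch assumes 1-based ranks and is our own, not part of MuBenchPipe.

    import java.util.List;

    class RankingQuality {

        static int numberOfHits(List<Integer> hitRanks) {
            return hitRanks.size();
        }

        static double averageHitRank(List<Integer> hitRanks) {
            return hitRanks.stream().mapToInt(Integer::intValue).average().orElse(Double.NaN);
        }

        static long hitsInTop20(List<Integer> hitRanks) {
            return hitRanks.stream().filter(rank -> rank <= 20).count();
        }
    }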

12.2.5. Experiment XP

The goal of Experiment XP is to measure MuDetectXP’s precision and recall. To measure precision, we run MuDetectXP on the ten sample projects from Experiment P and review its top-20 findings. To measure recall, we run MuDetectXP on all projects in MuBench and review all its potential hits for all known misuses, as in Experiment R. For each target project, we provide the detector with training projects for all APIs with known misuses in the target project. To ensure that the training projects contain examples of the APIs with known misuses in MuBench, we collect client projects of the respective APIs using the code-repository mining platform BOA [DNRN13] (full 2015 GitHub dataset). For each API, we query BOA for projects that either declare a field, variable, or parameter of the respective API type or call one of its static methods. We publish the query template and the result lists.3 From each list, we take the first 50 projects4 and randomly sample up to 20 usage examples of the respective API from each project. This gives us a diverse cross-project sample of up to 1,000 usage examples per API.
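The sampling step can be sketched as follows; the project lists come from the published BOA results, which we do not model here, and all names are ours.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Random;

    class CrossProjectSampling {

        static <U> List<U> sampleUsages(List<List<U>> usagesPerProject, long seed) {
            Random random = new Random(seed);
            List<U> sample = new ArrayList<>();
            // Take the first 50 projects from the result list for the API ...
            for (List<U> usages : usagesPerProject.subList(0, Math.min(50, usagesPerProject.size()))) {
                // ... and up to 20 randomly chosen usage examples from each of them.
                List<U> shuffled = new ArrayList<>(usages);
                Collections.shuffle(shuffled, random);
                sample.addAll(shuffled.subList(0, Math.min(20, shuffled.size())));
            }
            return sample;  // up to 1,000 usage examples per API
        }
    }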

3 Artifact Page: MuDetect: The Next Step in Static API-Misuse Detection (http://www.st.informatik.tu-darmstadt.de/artifacts/mudetect/)
4 Some projects that the latest BOA dataset (2015) refers to have meanwhile been deleted or renamed. We exclude projects that are unavailable as of February 2018.

13. Evaluation Results

In this chapter, we present the results of our experiments and compare MuDetect’s performance with that of the detectors Jadet, GROUMiner, Tikanga, and DMMC. All experiments ran on a MacBook Pro with an Intel Xeon @ 3.00GHz and 32GB of RAM. The full results are available on our artifact page.1

13.1. Experiment RNK: Ranking Violations

We first run Experiment RNK to determine the best ranking strategy for MuDetect. Table 13.1 shows the number of hits in the top-20 findings (@20), the number of hits (#H), and the average hit rank (AHR) for the best and worst ranking strategies we evaluated. We order the ranking strategies by these three metrics, in the mentioned order. The full list is available on our artifact page.2 The results show that the ranking has a huge impact on how many misuses MuDetect finds and how well it ranks them in the top findings. We observe that the pattern support (ps) appears in all of the top-16 and in none of the bottom-10 ranking strategies. Contrarily, the pattern uniqueness (pu) appears in 11 of the bottom-15 strategies and in none of the top-10. The violation-overlap measure (vo), the violation support (vs), and the pattern violations (pv) appear in different combinations in ranking strategies throughout the field. While this clearly shows that the pattern support is the most important ranking factor, the strategy consisting of this factor alone is only the 9th-best strategy. This suggests that detectors should consider other factors as well. The best strategy combines the pattern support (ps), the support of the violations (vs), and the violation-overlap measure (vo) into ps/vs × vo. We use this strategy for both MuDetect and MuDetectXP in all remaining experiments.

13.2. Experiment P: Precision

We measure the detectors’ precision in their top-20 findings in Experiment P. Table 13.2 summarizes the results. MuDetect reports 146 violations in the top-20 findings in the ten projects. Among these findings, we find 32 true positives, 19 of which were previously unknown. This results in a precision of 21.9%, which exceeds the precision of the other detectors more than two-fold.

1 Artifact Page: MuDetect: The Next Step in Static API-Misuse Detection (http://www.st.informatik.tu-darmstadt.de/artifacts/mudetect/)
2 Artifact Page: MuDetect: The Next Step in Static API-Misuse Detection (http://www.st.informatik.tu-darmstadt.de/artifacts/mudetect/)


Table 13.1.: Results of Experiment RNK: Alternative Ranking Strategies Ordered by the Number of Hits in the Top-20 (@20), the Total Number of Hits (#H), and the Average Hit Rank (AHR).

     #   Strategy                   @20   #H     AHR
     1.  ps/vs × vo                  19   34    91.6
     2.  ps/vs                       17   34    91.8
     3.  ps/vs × vo × pv             16   34    90.1
     4.  Rareness ((ps − vs)/ps)     16   33    94.3
     ...
     9.  ps                          14   34   305.5
     ...
    33.  pu/vs                        2   18    53.2
    34.  vo                           1   26  1187.4

Table 13.2.: Results of Experiment P: Precision of the Detectors in Their Top-20 Findings.

    Detector     Confirmed Misuses   Precision   Kappa Score
    Jadet                        8        8.8%          0.64
    GROUMiner                    4        2.6%          0.49
    Tikanga                      7        8.2%          0.52
    DMMC                        12        7.5%          0.72
    MuDetect                    32       21.9%          0.90

While MuDetect clearly outperforms the other detectors, its precision, considered on its own, is still low. Hence, we analyze the root cause of each false positive it reports. We label these root causes FPN. A detailed discussion of possible mitigation strategies follows in Section 13.6.

Root Cause FP1: Uncommon-but-correct Usages. 84 (73.7%) of the false positives are uncommon-but-correct usages. MuDetect does not mine patterns for these usages, since they occur only infrequently in the target projects. It reports them as violations, because they deviate from an alternative pattern that MuDetect mined. In eleven cases, for example, the usage checks for the presence of sufficiently many elements in an Iterator through either size() or isEmpty() on the underlying collection. Both are valid alternatives, but each occurs too infrequently for MuDetect to learn a pattern. Since it does, however, learn a pattern that checks hasNext() to ensure sufficiently many elements, it reports a missing hasNext() check in all these cases. In another seven cases, the usage calls size() on a collection for a purpose other than controlling access to the collection’s elements, e.g., to write the size into a log message. In these cases, MuDetect reports that the call to size() should control an access to the collection. This shows two problems:

1. The heuristic for identifying pure methods by looking for methods with the prefix get is insufficient, e.g., it fails to tag size() as pure.

2. The detection algorithm does not recognize that retrieving the size of a collection is perfectly fine for purposes other than guarding access to that collection, i.e., it does not consider that an action requires that its preconditions are ensured, but not vice versa.

These problems also cause nine cases where a loop calls Iterator.hasNext() again after calling next(), to check whether there will be a subsequent iteration. MuDetect reports a missing call to next() after this second call to hasNext(). Here, the heuristic for pure methods fails to recognize that hasNext() is pure and the detection algorithm does not recognize that hasNext() does not need to guard a call to next().
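The following sketch, constructed by us to illustrate this case, shows such a correct loop: the second hasNext() call only decides whether to append a separator and therefore does not need to guard another next() call.

    import java.util.Iterator;
    import java.util.List;

    class SeparatorJoin {

        static String join(List<String> items, String separator) {
            StringBuilder result = new StringBuilder();
            Iterator<String> it = items.iterator();
            while (it.hasNext()) {
                result.append(it.next());
                if (it.hasNext()) {        // correct, but reported as a missing next() call
                    result.append(separator);
                }
            }
            return result.toString();
        }
    }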

Root Cause FP2: Imprecise Analysis. 18 (15.8%) of the false positives are due to limitations of MuDetect’s static analysis. In seven cases, the usage elements that MuDetect reports as missing occur in a transitively called method, which our intra-procedural analysis cannot consider. In another four cases, the usage ensures a precondition through an assert statement, which our analysis does not recognize as controlling the execution of subsequent actions.

Root Cause FP3: Dependent Object States. Five (4.4%) of the false positives are due to implicit dependencies. In two cases, for example, the code ensures that two collections have the same size and then iterates over one, while checking the size of the other. In another case, the usage iterates over the key set of a Map, while checking the Map’s size. In the two remaining cases, the usages loop over a List in a for loop with an index i, while accessing the list with an index j that is guaranteed to be smaller than i. All these usages are safe, but MuDetect cannot capture the implicit dependencies between object states and values.

Root Cause FP4: Non-optimal Mapping. Another five (4.4%) of the false positives are due to MuDetect’s detection algorithm choosing a non-optimal mapping between a pattern and a target. In these cases, the target is actually an instance of the pattern, but MuDetect fails to recognize this. This shows that the greedy extension strategy in the detection, which we chose to keep the detection scalable while still detecting even single missing edges, comes at the cost of precision.

Root Cause FP5: Cross-method Usages. One false positive is a cross-method usage, which our filtering misses, because the respective object is initialized as a local variable and only later assigned to a field.


Table 13.3.: Results of Experiment RUB: Recall Upper Bound of the Detectors with Respect to the Misuses with Corresponding Correct Usages from MuBench.

    Detector     Hits   Recall Upper Bound   Kappa Score
    Jadet          29                16.9%          0.79
    GROUMiner      88                51.2%          0.85
    Tikanga        15                 8.8%          0.73
    DMMC           28                16.3%          0.88
    MuDetect      124                72.5%          0.89

Root Cause FP6: Alternative Patterns. The last false positive is a case where a violation is an instance of a combination of two alternative patterns, which our alternative-pattern filtering cannot detect.

13.3. Experiment RUB: Recall Upper Bound of the Detectors

Table 13.3 summarizes the results of measuring the upper bound on the detectors’ recall. MuDetect identifies 124 of the 171 misuses used in this experiment (72.5%). This upper bound of recall clearly outranks that of the other detectors, by 20.3% for GROUMiner and by over 54% for each of the other three detectors. This shows that (1) AUGs better capture the difference between correct usages and misuses, and (2) our detection algorithm succeeds in identifying these differences. MuDetect identifies 39 misuses that all other detectors miss. In turn, MuDetect misses ten misuses that at least one of the other detectors finds. We miss eight of these cases due to one of the heuristics we introduced to improve precision, such as the filtering of cross-method usages. There are 38 more misuses that all detectors miss. Next, we analyze the root causes for each of these 47 false negatives of MuDetect. We label these root causes FNN. A detailed discussion of possible mitigation strategies follows in Section 13.6.

Root Cause FN1: Imprecise Representation. In 15 cases, an illegal value (a constant or literal) is passed to a method as a parameter. MuDetect cannot detect this, because AUGs do not capture concrete values. In one other case, a condition is checked with an if, but should be checked repeatedly using a loop.

Root Cause FN2: Filtering of Self Usages. Eight cases are due to our removal of self usages (five cases) and cross-method usages (three cases) to counteract their potential to cause false positives.

Root Cause FN3: Imprecise Pattern Matching. In seven cases, MuDetect cannot match the respective pattern and the target usage, because they contain only a single, distinct call each. Our detection algorithm, by design, does not match AUGs that have nothing in common.


Table 13.4.: Results of Experiment R: Recall of the Detectors with Respect to the Known Misuses in MuBench and the Detectors’ Own True-Positive Findings from Experiment P.

    Detector     Hits   Recall   Kappa Score
    Jadet          15     6.7%          0.64
    GROUMiner       7     3.1%          1.00
    Tikanga        17     7.6%          0.69
    DMMC           24    10.7%          0.91
    MuDetect       47    20.9%          1.00

Root Cause FN4: Inability to Identify Redundant Usage Elements. Seven cases are misuses where the usage has a redundant element that should be removed. Since all detectors in our experiments are designed to detect missing elements, none can detect these misuses.

Root Cause FN5: Filtering of Pure-Method Calls. In six cases, our mining excludes elements from the pattern when heuristically determining semantically irrelevant nodes (calls to getter methods in all cases). The remaining pattern can no longer be matched to the target usage, because they have nothing in common.

Root Cause FN6: Imprecise Analysis. In one case, our analysis misses data-flow relations, because it assumes static single assignment. In the correct usage if (v == null) { v = ... } v.m(), which ensures the initialization of v before its use, it regards v before and after the assignment as two different objects. Therefore, our pattern mining sees no data flow between the null check and the subsequent usage and excludes the check from the pattern. Consequently, MuDetect cannot identify the missing check in the respective misuse.
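Written out, the correct usage from this example has the following shape (V is a hypothetical type of our own); under a static-single-assignment view, the v before and the v after the assignment are treated as two different objects, so the null check and the call appear unrelated.

    class LazyInitUsage {

        static class V {
            void m() { /* ... */ }
        }

        static V use(V v) {
            if (v == null) {
                v = new V();   // initialize on demand
            }
            v.m();             // safe: v is guaranteed to be non-null here
            return v;
        }
    }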

Root Cause FN7: Operator Abstraction. In one case, the misuse is an accidentally inverted condition, which we miss because we abstract from concrete operators in representing conditions.

13.4. Experiment R: Recall of the Detectors

Overall, the detectors identified 34 unique previously unknown misuses in Experiment P. We add these misuses to the 191 misuses from MuBench for Experiment R. Table 13.4 summarizes the results of measuring the detectors’ recall. MuDetect identifies 47 of the 225 misuses. This results in a recall of 20.9%, which exceeds the recall of the other detectors almost two-fold. MuDetect correctly identifies 13 misuses that none of the other detectors identifies, eleven of which it already identified in Experiment P. The other six previously unknown misuses from Experiment P are also correctly identified by at least one other detector.

In turn, MuDetect misses 13 misuses that one of the other detectors finds. Six of these are misuses that only DMMC identifies, because the projects contain too few usage examples for any of the other detectors to mine a respective pattern. DMMC’s probabilistic approach is able to identify misuses with very little evidence. In the extreme case (three of the six cases), DMMC finds exactly two usage examples for the respective API: a correct usage and the misuse. Consequently, ps = 1 and pv = 1 and, therefore, confidence = 0.5, which is exactly DMMC’s lower bound on confidence for reporting a misuse. Since MuDetect requires a pattern support of at least 10, it cannot find these misuses. Another five of these misuses are identified by Jadet or Tikanga or both. In all these cases, the target method contains multiple equal misuses of the same API. Jadet and Tikanga report a single finding identifying the misuse, but since we cannot determine which particular misuse instance it refers to, we conservatively count it as a hit for all instances. MuDetect, on the other hand, reports its findings at line level, which is why we count a hit only when the reported violation line matches the known misuse line, resulting in only one counted hit. For the last two of these misuses, MuDetect misses the pattern due to the greedy extension strategy that we chose to keep the mining scalable.

While MuDetect has higher recall than the other detectors, its recall is still low in absolute terms. We find that MuDetect has on average 227.6 usage examples (median = 105) for APIs whose misuses it identifies, but only 38.6 examples (median = 11) for those it misses. There is a moderate correlation (Pearson’s r = 0.52) between the number of examples and detecting a misuse. This supports the hypothesis that the target projects contain too few usage examples for some APIs. Experiment XP investigates this further.

13.5. Experiment XP: Cross-project Misuse Detection

The 225 misuses that we consider for Experiment R are usages of 59 APIs. For five of these, we find no projects with respective usages on GitHub, and for another thirteen, we find fewer than 50 projects (1, 1, 2, 4, 8, 12, 14, 20, 21, 26, 39, 42, and 44). For the remaining 41 APIs, we find 50 or more projects. Our cross-project sampling strategy (see Section 12.2.5) collects on average 239.3 usage examples per API (median = 172), compared to an average of 78.5 examples (median = 25) in the per-project setting.

Table 13.5 shows the precision and recall of MuDetectXP compared to the performance of the other detectors in Experiment P and Experiment R. MuDetectXP reports 91 violations in the top-20 findings on the 10 projects from Experiment P. Among these findings, we find 31 true positives, three of which were previously unknown. This results in a precision of 34.1%, which outranks MuDetect by 12.2%. Moreover, MuDetectXP identifies 95 of the 225 misuses. This results in a recall of 42.2%, which improves on MuDetect by more than two-fold. This clearly shows that additional usage examples are crucial to improving the performance of API-misuse detectors.


Table 13.5.: Results of Experiment XP: Precision and Recall of MuDetectXP in the Cross-project Setting in Comparison to the Precision and Recall of the Detectors in the Per-project Setting.

                                 Precision                        Recall
    Detector      Confirmed Misuses  Precision  Kappa Score   Hits  Recall  Kappa Score
    Jadet                         8       8.8%         0.64     15    6.7%         0.64
    GROUMiner                     4       2.6%         0.49      7    3.1%         1.00
    Tikanga                       7       8.2%         0.52     17    7.6%         0.69
    DMMC                         12       7.5%         0.72     24   10.7%         0.91
    MuDetect                     32      21.9%         0.90     47   20.9%         1.00
    MuDetectXP                   31      34.1%         0.88     95   42.2%         0.93

MuDetectXP identifies 65 misuses that MuDetect misses. For these misuses, MuDetect has on average only 94.6 usage examples (median = 16), while MuDetectXP has on average 258.4 examples (median = 216). This suggests that detectors should search for additional usage examples, if the target project itself contains too few. MuDetect, in turn, identifies 17 misuses that MuDetectXP misses. Ten of these are usages of APIs declared in the respective target project. Interestingly, the problem is not a lack of examples, as MuDetectXP has on average 239.3 examples (median = 172). A possible explanation is that APIs are used differently in the declaring project than in client projects. This suggests that detectors should consider both sources of usage examples, but distinguish between them.

Overall, we observe only a weak correlation (Pearson’s r = 0.17) between the number of examples that MuDetectXP has for pattern mining and the detection of a respective misuse. This suggests that, while the number of available examples is an important factor for the performance of a misuse detector, there are other factors at play as well. We hypothesize that the quality of the examples or their representativeness of the possible usages of an API might be such factors. Future work should investigate this.

13.6. Discussion

Our results show that, with MuDetect, we successfully took a next step in static API-misuse detection. With a precision of 34% and a recall of 42%, we are closer than ever to the practical applicability of a detector. Moreover, our findings suggest that there is still room for improvement. We subsequently discuss the effects of the mitigation strategies we implemented for the problems that motivated our work (see Chapter 10) and the remaining problems that we identified in our experiments, and sketch ideas for future work.

The results of Experiment RNK and Experiment P show that our ranking strategy successfully orders true positives among the top findings (P7 (Poor Ranking)). The results of Experiment XP show that mining patterns from other projects significantly improves both precision and recall (P8 (Lack of Usage Examples)). Future work should investigate techniques for the retrieval of high-quality usage examples to train detectors.

As Experiment RUB shows, we successfully mitigated P1 (Imprecise Representation) with the design of AUGs. To further reduce the remaining false negatives caused by FN1 (Imprecise Representation), we plan to investigate the effect of encoding constant names as data-node labels. We abstract such names to avoid mismatches when variables and literals are used interchangeably. However, constants are less likely to be used interchangeably with variables or literals. Thus, we might be able to capture such cases without introducing many false positives.

While MuDetect identifies significantly more misuses than the other detectors, this comes at a price. The greedy extension strategy that we chose to keep both our mining and detection scalable causes false positives (FP4 (Non-optimal Mapping)) as well as false negatives (the patterns missed in Experiment R). Future work should investigate alternative mining approaches to avoid these problems.

We do not believe that misuse detectors can efficiently mitigate the cause of another false negative, FN7 (Operator Abstraction). While AUGs could be adapted to capture concrete operators, this would most likely cause many false positives. Given this trade-off, we believe our operator abstraction is the better design decision.

MuDetect identifies wrong call order and abstracts over static types to address P2 (Imprecise Pattern Matching). However, there are still instances of FN3 (Imprecise Pattern Matching). These stem from patterns containing only a single method call. A possible solution could be to identify usages based on the presence of respective data nodes, as opposed to the presence of at least one call on the API. However, then we would compare target usages to patterns with which they do not share even a single call, which would always result in violations. This would produce many false positives, since the target is likely to implement an alternative usage pattern that the detector is unaware of. Given this trade-off, we believe that not matching these cases is the better design choice.

We failed to mitigate P3 (Uncommon-but-correct Usage), as FP1 (Uncommon-but-correct Usages) remains the most prevalent cause of false positives. While our pure-methods heuristic filters some false positives, it is obviously insufficient to address the problem. To reduce the impact further, we might choose a lower frequency threshold to also learn patterns for the alternative usages that are mistaken for violations. However, this would merely shift the problem to other APIs where one alternative usage occurs slightly more frequently and another slightly less frequently than the threshold. A solution might be a probabilistic model of API usage that considers the likelihood of different usages and reports no violation if one usage is only slightly more likely than another or if an API’s usages generally vary a lot.

The misuses behind FN4 (Inability to Identify Redundant Usage Elements) cannot be detected by misuse detectors that search for missing elements. DroidAssist [NPVN16] uses a probabilistic approach that might find superfluous method calls, but this ability has never been evaluated.

By excluding self usages and usages on fields from both mining and detection, we reduced the impact of P5 (Self- and Cross-method Usages). However, this strategy and the removal of pure method calls as a means to reduce the impact of P3 (Uncommon-but-correct Usage) immediately cause false negatives, as identified by the two root causes FN2 (Filtering of Self Usages) and FN5 (Filtering of Pure-Method Calls). While Experiment P suggests that these measures are effective, we apparently trade recall for precision. By capturing inter-procedural usages, we might make filtering them unnecessary and enable the detection of misuses in them. The Chronicler [RGJ07b] detector mines usages from an inter-procedural call graph, which might mitigate the problem. However, it is unclear how to adapt this approach from considering only method calls to all the usage elements we encode in AUGs. Furthermore, such an approach duplicates evidence if methods are called multiple times, which might bias the mining. Future work should investigate the possibilities and effects of such an approach.

On a related note, some false negatives and false positives are caused by FN6 (Imprecise Analysis) and FP2 (Imprecise Analysis), i.e., by imprecisions of our static analysis. We observe that a large fraction of these cases are inter-procedural usages, where API objects are passed to or returned from methods. The misuse detector PR-Miner [LZ05] analyzes transitive calls to check whether missing method calls appear there and filters the respective violations. Such a strategy would reduce the impact of the problem, but we would also need to consider callers to generally capture inter-procedural usages. Although such cases also appeared in the detector findings we analyzed in Chapter 7, we did not specifically address them with MuDetect. A more sophisticated, possibly inter-procedural, static analysis might help reduce the impact of these problems.

The root cause FP3 (Dependent Object States) also appeared in Chapter 7, as the least prevalent cause of false positives. Since it is very difficult, if not impossible, to capture inter-dependencies of object states using static analysis, we did not address this problem with MuDetect. However, again, a more sophisticated static analysis might help reduce its impact.

13.7. Reviewer Agreement

To mitigate subjectivity in the reviews of detectors’ findings in our experiments, the author of this thesis and one of his colleagues separately reviewed all findings. We report Cohen’s Kappa score as a metric for the initial reviewer agreement. For our experiments, we observe noticeable differences in the Kappa scores between detectors. We investigate possible causes.

13.7.1. Agreement in Experiment P

Table 13.6 shows the detailed agreement statistics for our reviews in Experiment P, including MuDetectXP. We observe that the agreement for our detector is considerably higher than for the other detectors. The agreement is lowest for GROUMiner and Tikanga and on a middle ground for Jadet and DMMC.


Table 13.6.: Reviewer Agreement in Experiment P, Including MuDetectXP. Reviewers classified each finding as a true positive (TP) or false positive (FP).

                      Agreements          Disagreements
    Detector       TP    FP  Total    TP/FP  FP/TP  Total     p0   pTP   pFP    pe   Kappa Score
    Jadet           5    82     87        3      2      5   0.95  0.01  0.84  0.85          0.64
    GROUMiner       2   150    152        2      2      4   0.97  0.00  0.95  0.95          0.49
    Tikanga         6    70     76        0      9      9   0.89  0.01  0.77  0.78          0.52
    DMMC           10   144    154        2      5      7   0.96  0.01  0.84  0.85          0.72
    MuDetect       29   112    141        2      3      5   0.97  0.05  0.62  0.66          0.90
    MuDetectXP     30    56     86        0      5      5   0.95  0.13  0.41  0.54          0.88

The relative observed agreement between reviewers (p0) is comparable across all detectors (≈ 0.96), except for Tikanga, where it is a bit lower (0.89). Among the nine disagreements for Tikanga, we find two duplicated findings, one of which Tikanga reported twice and the other even three times. The reviewers consistently disagreed on these duplicates. There were no disagreements on duplicated findings for any of the other detectors. This likely explains the difference in the relative observed agreement.

The expected probability of both reviewers classifying a finding as a true positive (pTP) is a bit higher for MuDetect than for the other detectors, and higher still for MuDetectXP. The expected probability of both reviewers classifying a finding as a false positive (pFP), on the other hand, is considerably lower for our detector, but also varies between the other detectors. We observe that pFP decreases with the precision of the detectors, except for Tikanga, where it is again lower due to our consistent disagreements. Consequently, the hypothetical probability of chance agreement (pe = pTP + pFP) also decreases with the precision of the detector, again with the exception of Tikanga. This phenomenon is to be expected: For a detector with a very low precision, most decisions are for false positives and, thus, the probability of chance agreement is high. The closer the precision gets to 50%, the more balanced the decisions become and the lower the probability of chance agreement. Since the relative observed agreement between reviewers (p0) is high for all detectors, the difference in the hypothetical probability of chance agreement (pe) has a huge impact on the Kappa score. This explains the high variance in the scores between detectors in Experiment P.
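The columns of Tables 13.6 to 13.8 follow the standard two-category Cohen’s Kappa computation. The following sketch is our own code, shown for the TP/FP case; it reproduces, for example, the Jadet row of Table 13.6.

    class CohensKappa {

        // a = both reviewers say TP, d = both say FP,
        // b and c = the two kinds of disagreement.
        static double kappa(int a, int b, int c, int d) {
            double n = a + b + c + d;
            double p0 = (a + d) / n;                     // relative observed agreement
            double pTP = ((a + b) / n) * ((a + c) / n);  // chance agreement on TP
            double pFP = ((c + d) / n) * ((b + d) / n);  // chance agreement on FP
            double pe = pTP + pFP;                       // hypothetical chance agreement
            return (p0 - pe) / (1 - pe);
        }

        public static void main(String[] args) {
            // Jadet in Table 13.6: 5 TP and 82 FP agreements, 3 + 2 disagreements.
            System.out.printf("kappa = %.2f%n", kappa(5, 3, 2, 82));  // prints approx. 0.64
        }
    }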

13.7.2. Agreement in Experiment RUB

Table 13.7 shows the detailed agreement statistics for our reviews in Experiment RUB. We observe that the agreement is highest for MuDetect and DMMC, and decreases a bit for each of the other detectors in the order GROUMiner, Jadet, and Tikanga.

The relative observed agreement between reviewers (p0) decreases slightly between detectors in the same order. We note that the absolute number of decisions is considerably smaller for Jadet and Tikanga than for the other detectors, yet the absolute number of disagreements is not. Many of the disagreements on these detectors’ findings are due to different interpretations of their representation. Conversely, we note that the absolute number of disagreements is twice as high for GROUMiner as for MuDetect and DMMC. We analyzed the disagreements for GROUMiner, but could not identify any systematic reason for them. We speculate that the simplicity of DMMC’s usage representation and the reviewers’ familiarity with MuDetect’s representation led to a smaller number of disagreements. We previously discussed the problem of interpreting detector findings in Section 7.5.


Table 13.7.: Reviewer Agreement in Experiment RUB. Reviewers decided for each known misuse whether a detector identifies it (Hit) or not (Miss).

                      Agreements            Disagreements
    Detector      Hit  Miss  Total   Hit/Miss  Miss/Hit  Total     p0  pHit  pMiss    pe   Kappa Score
    Jadet          26    12     38          3         1      4   0.90  0.44   0.11  0.55          0.79
    GROUMiner      84    41    125          3         6      9   0.93  0.44   0.12  0.55          0.85
    Tikanga        14    19     33          1         4      5   0.87  0.19   0.32  0.51          0.73
    DMMC           28    69     97          4         1      5   0.95  0.09   0.49  0.58          0.88
    MuDetect      118    26    144          1         4      5   0.97  0.65   0.04  0.69          0.89

Table 13.8.: Reviewer Agreement in Experiment R, Including MuDetectXP. Reviewers decided for each known misuse whether a detector identifies it (Hit) or not (Miss).

                      Agreements            Disagreements
    Detector      Hit  Miss  Total   Hit/Miss  Miss/Hit  Total     p0  pHit  pMiss    pe   Kappa Score
    Jadet          14     1     15          1         0      1   0.94  0.82   0.01  0.83          0.64
    GROUMiner       7    11     18          0         0      0   1.00  0.15   0.37  0.52          1.00
    Tikanga        15     3     18          2         0      2   0.90  0.64   0.04  0.68          0.69
    DMMC           24    40     64          0         3      3   0.96  0.14   0.38  0.53          0.91
    MuDetect       44    26     70          0         0      0   1.00  0.40   0.14  0.53          1.00
    MuDetectXP     92    55    147          2         3      5   0.97  0.39   0.14  0.53          0.93

13.7.3. Agreement in Experiment R

Table 13.8 shows the detailed agreement statistics for our reviews in Experiment R, including MuDetectXP. We observe that the agreement is high for GROUMiner, DMMC, MuDetect, and MuDetectXP, but considerably lower for Jadet and Tikanga. For the latter two detectors, the absolute number of disagreements is low, as is the absolute number of decisions the reviewers had to make. Due to this small number of decisions, the hypothetical probability of chance agreement (pe) is very sensitive to single disagreements, as the comparison to the scores for GROUMiner shows. Otherwise, the relative observed agreement between reviewers (p0) is high across detectors. The hypothetical probability of chance agreement (pe) for GROUMiner, DMMC, MuDetect, and MuDetectXP is almost equal. This is reflected in the high Kappa scores for these detectors.

14. Threats to Validity

Overfitting We designed and fine-tuned MuDetect based on observations from running detectors on MuBench (see Chapter 7). We evaluated MuDetect on the same benchmark, which bears the danger of overfitting. To mitigate this threat, we extend the benchmark to more than twice its original size and evaluate all detectors on this extended benchmark. We publish this extension and our implementation of MuDetect.1

Internal Validity We did not fine-tune the other detectors, but used the best configurations reported in the respective publications. Therefore, it is possible that they do not show their optimal performance in our experiments. We reviewed the detectors’ findings ourselves. The detector producing a finding was always known to the reviewers, because each detector has a distinct representation of API usages and violations that we could not blind. We evaluated only our own detector, MuDetectXP, in the cross-project setting. We did not try other detectors in this setting, because they cannot use separate datasets for pattern mining and violation detection. Hence, evaluating them in the same cross-project setting as MuDetectXP would have required us to modify their implementations, which might hamper their capabilities. We published the list of example projects we used to train MuDetect in the cross-project experiment2 and encourage others to assess the performance of their approaches in this setting. Providing MuDetectXP with only example usages for the APIs with known misuses in MuBench potentially biases the results with respect to precision, because it reduces the overall number of patterns and, consequently, might reduce the number of reported violations.

External Validity The study is subject to the limitations of MuBench, as discussed in Section 5.6 (limitations of the benchmarking dataset) and Section 6.4 (limitations of the benchmarking pipeline). We extended the MuBench dataset to more than twice its original size, to reduce the impact of these limitations.

1 Artifact Page: MuDetect: The Next Step in Static API-Misuse Detection (http://www.st.informatik.tu-darmstadt.de/artifacts/mudetect/)
2 Artifact Page: MuDetect: The Next Step in Static API-Misuse Detection (http://www.st.informatik.tu-darmstadt.de/artifacts/mudetect/)


15. Related Work

API-Misuse Detectors Helping developers identify API misuses has received much attention. Chapter 4 presents a detailed survey of existing approaches in this field. In this part of the thesis, we presented a comparison of our detector, MuDetect (and MuDetectXP), to the other detectors that target Java. We exclude the misuse detectors that target other programming languages [LZ05, RGJ07b, Lin07, RGJ07a, AX09] here. The closest work to ours is Nguyen et al.’s [NNP+09b], which proposed a graph-based representation of API usages (GROUMs) for their misuse detector GROUMiner. The good conceptual coverage of detectors using graph-based usage representations in Chapter 4 and the relatively high recall of GROUMiner in our experiments from Chapter 7 led us to design our graph representation, the AUG. AUGs, in contrast to GROUMs, are directed acyclic multigraphs that capture method calls, field accesses, null checks, and data entities as nodes and control/data dependence among them as labeled edges. Control structures are encoded by control edges between the nodes representing the conditions and the nodes representing the controlled actions. This allows MuDetect to give more precise information about misuses. For example, while GROUMiner can detect a missing if, MuDetect can also tell what should be checked in the if condition. Additionally, AUGs encode exceptional, synchronized, and iterative control flow, and distinguish receivers from parameters, to better differentiate between correct usages and misuses. Encoding more information in edges also makes our pattern mining more scalable.

MuDetect is the only approach that handles all API-usage elements targeted by all of the detectors we identified in our survey in Chapter 4 combined and, thereby, more usage elements than any one of them individually. It conceptually covers all violations covered by any of the detectors, with the exception of redundant method calls, which PJAG12 [PJAG12], PG12 [PG12], and DroidAssist [NPVN15] may find. In all three cases, this ability comes from patterns modeling object states, using method calls to signal state transitions. It is unclear whether and how these models can be extended to cover further usage elements, such as conditions, exception handling, and iteration. Future work should investigate this, to further increase the coverage of misuse detectors.

We directly motivate the work in this part of the thesis from the problems of the four detectors Jadet [WZL07], GROUMiner [NNP+09b], DMMC [MBM10], and Tikanga [WZ11] as identified in our empirical study in Chapter 7. Our direct empirical comparison to these detectors in Chapter 13 shows that MuDetect successfully mitigates many of the root causes of the respective problems. We cannot empirically compare MuDetect to the static misuse detectors Alattin [TX09b] and CAR-Miner [TX09a], because their implementations are unavailable (see Section 5.1), and DroidAssist [NPVN15], because it is implemented for Dalvik Bytecode, while MuBench contains general Java projects. We also cannot empirically compare MuDetect to the dynamic detectors, because MuBench does not support their execution. However, in our experiments we included 77 misuses identified by a dynamic specification verifier (see Section 12.1). Our results show that MuDetect can detect many of these misuses statically.

Mining Patterns from Project-external Usage Examples Gruska et al. [GWZ10] evaluated Jadet in a multi-project setting, where it simultaneously mined patterns and detected violations in a combined set of all usages from 6,000 projects. This setting is different from the cross-project setting of Experiment XP, where MuDetectXP mines patterns in examples from a large number of projects to detect misuses in another project. While MuDetectXP mines patterns with cross-project support, i.e., patterns that reoccur in a certain number of projects, Jadet mines any pattern with high support, even within a single project. Furthermore, while Gruska et al. [GWZ10] measure the precision of Jadet in the multi-project setting, we measure both the precision and the recall of MuDetectXP in the cross-project setting. In direct comparison to the per-project setting we use in Experiment P and Experiment R, this allows us to show that learning from cross-project data significantly improves both precision and recall of our detector. To the best of our knowledge, such a comparison has not been presented before. CAR-Miner [TX09b] and Alattin [TX09a] mine target-specific patterns from examples retrieved via a code-search engine. This presents an alternative to the cross-project mining we do for MuDetectXP. Future work should compare the performance of both approaches.

Part IV.

Conclusion and Outlook


16. Conclusion

In this chapter, we present a brief overview of the findings from this thesis. We start with a review of the results and contributions of this thesis and follow with a closing discussion.

16.1. Summary of Results

This thesis contributes to the area of API-misuse detection. We provide a holistic view on the problem space of API misuse and the state of the art in misuse detection. We provide a conceptual framework, MuC, which we use to qualitatively compare existing misuse detectors, and an automated benchmark, MuBench, which we use to empirically compare them. Based on insights from this empirical comparison, we systematically design a new detector, MuDetect, which advances the state of the art. Finally, we present evidence for possible directions towards further improvement, as a stepping stone for future work.

Prevalence of API Misuse In-the-wild From the findings of multiple studies on API-usage directives, we learn (1) that up to two thirds of all API-usage elements come with usage directives that, if violated, lead to API misuse, (2) that developers are often unaware of these directives, and (3) that more than half of these directives are incorrectly documented. This shows the immense potential for API misuse in today’s software products. Through analyzing bug reports, we show that indeed almost 10% of all bugs are misuses and that misuses often have severe consequences, such as application crashes or data loss. Through a developer survey and a study on StackOverflow, we gain initial evidence that misuses may considerably slow down the development process. And through an analysis of API usages in Open Source projects, we confirm the prevalence of API misuses that cause security vulnerabilities.

Classification of API Misuses From the examples of API misuse we identified, we create the API Misuse Classification (MuC), as both a taxonomy of API misuses and a framework to assess the conceptual capabilities of API-misuse detectors. As a first result, MuC shows that certain types of API misuses—namely missing value or state conditions, missing null checks, and missing method calls—are much more prevalent than others.

Survey of State-of-the-art Misuse Detection We present a systematic literature review of existing work on API-misuse detection and a qualitative comparison of the respective detectors according to MuC. Both static and dynamic detectors focus on only a small subset of all API misuses, mostly neglecting usage elements other than method calls. Empirical evaluations focus on the precision of detectors. Since evaluations generally use different datasets and review different samples of detector findings, a direct comparison of evaluation results is very unreliable. It seems that detectors focusing on specific types of API misuses are more precise, but this has never been investigated. Studies that directly compare multiple detectors are practically non-existent.

Automated Benchmark for API-Misuse Detectors Using the examples of API misuse we identified, we build MuBench, the first-ever automated empirical benchmark for API-misuse detectors. MuBench enables systematic, comparable, and reproducible experiments, measuring both a detector’s precision and recall. It automates large parts of the evaluation process, including the retrieval of project source code and Bytecode, the execution of detectors, and the preparation of their findings for manual review. And it drastically reduces the remaining effort of the necessary manual reviews through pre-filtering of relevant detector findings. Furthermore, MuBench is easily extensible with new misuse examples and additional detectors for further experiments.

Performance of State-of-the-Art Misuse Detectors We use MuBench for a systematic evaluation and comparison of four state-of-the-art misuse detectors. We find that all these detectors may successfully identify many misuses (up to 48%), but suffer from extremely low precision (below 12%) and recall (below 21%) in a practical setting. Our root-cause analysis of the detectors’ false positives and false negatives reveals 13 reasons for their bad performance. From these, we derive possible directions towards more powerful misuse detectors.

Improved Detection of API Misuses Based on the results of the previous empirical study, we develop MuDetect, a new API-misuse detector that mitigates many of the problems of existing detectors. MuDetect uses API Usage Graphs (AUGs), a new graph-based representation of API usages. We specifically design AUGs to simultaneously capture many properties of API usages that can distinguish misuses from correct usages. MuDetect employs a pattern-mining and a violation-detection algorithm that efficiently and effectively identify usage patterns and misuses based on AUGs. Our empirical evaluation shows that MuDetect outranks existing detectors by a factor of 2.5 in terms of precision and by a factor of 2 in terms of recall in the typical per-project setting. In a cross-project setting, MuDetect achieves a precision of 34.1% and a recall of 42.2%, thereby outranking the other detectors by a factor of 3.9 with respect to both metrics.

Possible Directions Towards Further Improvement An analysis of MuDetect’s findings shows that we need to consider usage examples from both within target projects and across different projects, to mine the necessary patterns for effective misuse detection. Furthermore, we find that future work should explore the use of inter-procedural program analyses and the possibilities of probabilistic usage models. A detailed discussion of these research opportunities follows below.

16.2. Closing Discussion

Reuse of existing software components is an integral ingredient of efficient software development. Software developers use such components through their APIs. Thereby, they must consider the usage constraints that come with these APIs. The work presented in this thesis shows that failing to do so, i.e., misusing the API, is a prevalent cause of severe software defects and that struggling with misuses may considerably slow down the development process.

Researchers have dedicated much work to the automated detection of API misuses. This thesis consolidates over a decade of research for a qualitative and quantitative assessment of the state of the art. We find that existing detectors conceptually cover only a small subset of all types of API misuses. Moreover, we find that while detectors may potentially detect many misuses, their actual precision and recall are extremely low (below 10%). We analyze their false positives and false negatives to determine the respective root causes. Based on our findings, we systematically design a new static API-misuse detector, MuDetect, that improves on the state of the art. MuDetect achieves a precision of 34.1% at a recall of 42.2%.

Nevertheless, it is clear that misuse detection is not yet mature enough for practical applicability. Existing detectors either focus on very specific types of misuses (see Chapter 4) or have false-positive rates far above the 20% that practitioners report as the highest acceptable rate [BBC+10]. However, the work presented in this thesis shows that, through a systematic analysis of the problems of state-of-the-art misuse detectors and a strategic design of new approaches, we can improve on both precision and recall at the same time. Our results also show that there is further potential for improvement, which is why we are convinced that misuse detection can reach a practical level. We present several challenges that future work should address in order to reach this goal.

First, current detectors employ rather simple code analyses, such as intra-procedural analysis assuming static single assignment or analysis of simple method-call traces. Imprecisions of these analyses cause many of the detectors’ false positives. More advanced analysis techniques, such as inter-procedural analyses, points-to analyses, or abstract interpretation, might deliver much more precise information, which may help to improve overall detection quality.

Second, many current detectors employ frequency-based approaches to mine API specifications, i.e., they assume that frequent usages are correct and infrequent usages, consequently, incorrect. Our empirical results show that this assumption is the main reason for false positives across all detectors. We should search for alternative metrics of usage correctness to replace pure frequency and develop techniques to prune specifications, e.g., by considering the implementation code of APIs, to avoid reporting violations of meaningless specifications. We should also advance probabilistic API-usage models [NPVN15, MCJ17], as a means to avoid binary decisions about correctness entirely.

Third, our results show that the recall of current detectors is severely limited by the number of available usage examples to learn specifications from. To mitigate this, we should investigate techniques that scale to larger training datasets and approaches for the targeted retrieval of usage examples, e.g., through code-search engines [TX09a, TX09b] or from answers on StackOverflow [GZW+15]. We should also investigate approaches to ensure the quality of the training examples we use. Moreover, we should combine examples from different sources, to maximize the available training data.

17. Future Work

In this chapter, we present our ideas for future work, with respect to benchmarking API-misuse detectors, as well as advancing the state of the art in API-misuse detection. For some of these ideas we present preliminary results. Others we identify as interesting challenges.

17.1. Benchmarking API-Misuse Detectors

Mining Misuse Examples from Project History Despite the huge manual effort that we invested to create our benchmarking dataset, the number of misuse examples in the current dataset is still relatively small, especially because we had to exclude examples where we could not compile the respective project version that contains them. This bears the danger that the dataset is not representative of API misuses as they appear in the wild. To mitigate this threat, future work should explore ways to obtain large numbers of misuse examples with less manual effort. For example, Dallmeier et al. [DZ07] present an approach to automatically classify bugs by the changes applied in their fixes. It may be possible to leverage such an approach to automatically identify examples of API misuses in the version-control history of software projects, which would enable us to obtain misuse examples for our benchmarking dataset on a much larger scale.
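A minimal sketch of a first step in this direction is shown below: it uses JGit to walk a project's commit history and flags commits whose messages suggest bug fixes. The keyword heuristic and the repository handling are illustrative assumptions; an actual approach would, following Dallmeier et al. [DZ07], additionally classify the changes applied in each candidate fix.

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.revwalk.RevCommit;

public class FixCommitScanner {
    // Keyword heuristic for fix commits; an illustrative assumption, not a validated classifier.
    private static final String[] FIX_KEYWORDS = { "fix", "bug", "npe", "exception" };

    public static List<RevCommit> findCandidateFixCommits(File repositoryDir) throws Exception {
        List<RevCommit> candidates = new ArrayList<>();
        try (Git git = Git.open(repositoryDir)) {
            for (RevCommit commit : git.log().call()) {
                String message = commit.getFullMessage().toLowerCase();
                for (String keyword : FIX_KEYWORDS) {
                    if (message.contains(keyword)) {
                        // candidate misuse fix; its change set would be analyzed in a next step
                        candidates.add(commit);
                        break;
                    }
                }
            }
        }
        return candidates;
    }
}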

Defining a Minimal Benchmark The MuBench dataset contains all examples of API misuses that we identified in compilable project versions throughout the work presented in this thesis. Thereby, we included multiple instances of some misuses, i.e., multiple usages that violate the same API-usage constraint. While this reflects the distribution of violations as they appear in the wild, we argue that it would be interesting to also define a minimal benchmark dataset with only unique misuses. In order to achieve this, future work should develop a notion of misuse equivalence. Note that misuse equivalence cannot simply be defined via the equality of usages, since two instances of the same misuse may be embedded in usages that contain different additional elements that do not contribute to the misuse. Furthermore, our experiments for measuring the recall upper bound of detectors have shown that the context in which a misuse appears may impact detectors. The code surrounding a usage may introduce noise, which leads to a detector identifying one instance of a particular misuse while missing another. We need to consider such factors in the creation of a minimal benchmark that enables a fair comparison of misuse detectors.
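As a starting point, a misuse-equivalence key could abstract from the concrete usage and identify a misuse by the violated API constraint alone. The following sketch is a deliberately coarse approximation under this assumption; the field names and example values are hypothetical.

import java.util.Objects;

// Two misuses are treated as equivalent if they violate the same kind of constraint of the
// same API element. As argued above, a practical notion would also need to account for the
// code context surrounding the usage.
final class MisuseKey {
    final String misusedApi;    // e.g., "java.util.Iterator.next()"
    final String violationKind; // e.g., "missing-call/hasNext"

    MisuseKey(String misusedApi, String violationKind) {
        this.misusedApi = misusedApi;
        this.violationKind = violationKind;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MisuseKey
            && misusedApi.equals(((MisuseKey) o).misusedApi)
            && violationKind.equals(((MisuseKey) o).violationKind);
    }

    @Override
    public int hashCode() {
        return Objects.hash(misusedApi, violationKind);
    }
}

Deduplicating the dataset by such a key would retain one instance per pair of API and violation kind; the open question raised above is which additional context information the key must include, so that instances are neither merged too aggressively nor kept apart artificially.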

Studying the Impact of API Misuse Throughout the Development Process We found initial evidence that developers struggle with API misuses already at development time and that these misuses might be different from the misuse examples we identify in issue trackers and version-control systems (see Section 2.3). However, we know little about the actual impact of API misuse at different stages of the development process and the potentially distinctive properties of misuses at any particular stage. To find out more, future work could mine Q&A sites or conduct surveys and field studies to learn which problems developers face at different stages. This would enable a systematic comparison to reveal whether there are indeed differences and how these impact research on API-misuse detection.

Reducing Review Effort for Measuring Recall MuBench significantly reduces the effort of measuring the recall of misuse detectors. For example, due to pre-filtering potential hits by their code location, we had to review only 0.3% (62) of all findings of the four detectors in Experiment R (see Section 5.5). However, the overall effort of the necessary manual reviews across all experiments is still high.

It is likely impossible to reduce the number of reviews in Experiment P, where we review the top-20 findings per detector and project, because we lack a-priori knowledge about the findings. Instead, future work could develop techniques that speed up individual reviews. One possibility might be to automatically group similar findings, to allow bulk reviews. Another, orthogonal approach might be to automatically provide reviewers with additional information for the reviews, such as documentation for the API(s) in question.

On the other hand, we believe that it is possible to further reduce the number of necessary reviews in the experiments that measure recall. Currently, MuBenchPipe pre-filters the detectors' findings in these experiments based on their file and method location. Since we are checking for known misuses, we could, for example, additionally consider the name of the misused API to further reduce the number of potential hits to review. Such filtering, however, needs to be carefully designed to guarantee that all true positives are retained.
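A minimal sketch of such a stricter pre-filter is shown below. The Finding and KnownMisuse types and their fields are illustrative assumptions about the metadata available to the pipeline; the point is merely that adding the misused API as a filter criterion is a small, local change, which would still have to be validated against all known true positives.

import java.util.Set;

// Illustrative data holders; MuBenchPipe's actual representation may differ.
class Finding {
    String file;
    String method;
    Set<String> mentionedApis; // APIs named in the detector's report
}

class KnownMisuse {
    String file;
    String method;
    String misusedApi; // e.g., "java.util.Iterator"
}

class PotentialHitFilter {
    // Current filtering: same file and same method as a known misuse.
    // Additional criterion: the finding must mention the misused API.
    static boolean isPotentialHit(Finding finding, KnownMisuse misuse) {
        return finding.file.equals(misuse.file)
            && finding.method.equals(misuse.method)
            && finding.mentionedApis.contains(misuse.misusedApi);
    }
}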

Benchmarking Further API-Misuse Detectors Our empirical assessment of the state of the art in API-misuse detection is limited to four static misuse detectors that we could obtain implementations of (see Chapter 7). Our survey in Chapter 4 identifies many more detectors, some of which realize quite distinctive ideas, such as obtaining usage examples for specification mining via a code-search engine [TX09a, TX09b] or learning correct usage in probabilistic models [NPVN15, MCJ17]. We believe it is important to incorporate such approaches into the benchmark, in order to get a more complete picture of the state of the art and to identify the strengths and weaknesses of the different approaches. As the work presented in this thesis shows, such systematic assessment can reveal opportunities for significant improvements.

For some detectors, whose previous implementations are unavailable, non-functional, or target different programming languages, we need to reimplement the respective concepts. For other detectors, we need to advance MuBench. For example, to integrate dynamic detectors, we must ensure that the target project versions are executable, in addition to being compilable, and that respective test harnesses are available to execute them. For yet other detectors, we might approach integration from either direction. For example, to integrate detectors that work on Dalvik Bytecode, we might either migrate them to read general Java Bytecode or provide the project versions in MuBench also as Dalvik Bytecode, which is generally feasible, because Java Bytecode can be compiled to Dalvik Bytecode.

17.2. Advancing API-Misuse Detection

Leveraging Advanced Static-Analysis Techniques Static misuse detectors analyze program code to extract their usage representations. Our experiments show that imprecisions and limitations in these static analyses can introduce noise that causes false positives and false negatives (see Section 7.1, Section 7.2, Section 13.2, and Section 13.3). Nevertheless, current static detectors employ only simple intra-procedural analyses. Future work should investigate whether advanced analysis techniques, such as inter-procedural analyses or abstract interpretation, can deliver more precise information that helps to improve overall detection quality. We obtained some preliminary insights by using the static-analysis framework OPAL [EH14] to extract AUGs from Java Bytecode: For example, a basic inter-procedural data-flow analysis allows us to detect, for fluent APIs such as StringBuilder, that methods like append() return the receiver object. This allows us to represent the receiver and the return value by the same data node, such that we generate the same AUG regardless of whether a usage is written as sb.append().toString() or as sb.append(); sb.toString() (see the sketch at the end of this discussion). Such unification of semantically equivalent usages effectively eliminates false positives.

Another promising idea is to inline code from private helper methods into their callers, replacing the calls to the private methods themselves. This follows two observations:

1. Private methods are themselves not part of an API. Calls to them should, therefore, not be considered when learning API-usage patterns or detecting misuses. Since our algorithms do not distinguish calls based on method visibility, removing calls to private methods from AUGs effectively removes noise.

2. Private methods often encapsulate parts of usages on their parameters. Consequently, these usages are distributed across the private methods and their callers. Inlining the private methods into the callers joins the partial usages together, increasing the chance that we capture the complete usage, which may eliminate false positives.

A particularly interesting phenomenon that inlining causes is the duplication of evidence. When we inline a private method into multiple callers, the AUG that we previously generated for the private method itself now appears as a subgraph in each of the AUGs we generate for the callers. We note that this is similar to how call traces analyzed by dynamic misuse detectors contain the subtrace generated by the execution of a private method once for every call to that method. We observed several partially opposing effects of such duplication:


• From the perspective of our pattern mining algorithm, the support of the inlined subgraph increases, which may lead to additional or larger patterns. This may reduce false positives, if we mine more alternative patterns. It may also cause false negatives, if the private method contains a violation, which now becomes frequent enough to be mined into a pattern.

• From the perspective of our detection algorithm, there are now fewer, but larger, target AUGs to check. This may reduce false positives, if what previously appeared as two usages with missing elements in two AUGs now appears as one pattern instance in a single AUG. It may also duplicate both true positives and false positives, if a usage with missing elements from the private method now appears in all AUGs we generate for its callers.

These examples show that leveraging advanced static-analysis techniques may benefit misuse detection. However, future work needs to systematically investigate the effects and counter-effects of individual techniques, to see whether we can balance them out in favor of the overall detector performance.
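To make the receiver-return unification and the helper-method inlining discussed above concrete, consider the following minimal Java example; class and method names are illustrative and not taken from the benchmark.

// The first two methods use StringBuilder equivalently: once the inter-procedural insight
// that append() returns its receiver lets us unify receiver and return value, both yield the
// same AUG. The private helper in the third example encapsulates part of a usage that only
// inlining would join back into its caller.
class AugUnificationExamples {

    String fluent(String name) {
        StringBuilder sb = new StringBuilder();
        return sb.append("Hello, ").append(name).toString(); // calls chained on the return value
    }

    String explicit(String name) {
        StringBuilder sb = new StringBuilder();
        sb.append("Hello, ");
        sb.append(name);        // calls written on the receiver variable
        return sb.toString();   // same usage as above, once receiver and return value unify
    }

    String caller(String name) {
        StringBuilder sb = new StringBuilder();
        appendGreeting(sb, name); // part of the StringBuilder usage hides in the helper ...
        return sb.toString();
    }

    private void appendGreeting(StringBuilder sb, String name) {
        sb.append("Hello, ").append(name); // ... and only inlining reunites the complete usage
    }
}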

Extending Usage Models of Dynamic Detectors Current dynamic detectors work solely on method-call traces. While this allows a precise analysis of which methods are invoked on an API and in which order, it neglects data flow between calls and control dependencies among calls. Consequently, respective detectors cannot detect violations involving usage elements such as conditions or exception handling. Future work should investigate whether and how the concept of dynamic detection can be extended in this regard.
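As a minimal illustration, consider the following usage of InputStream.read(): its call trace is just a sequence of read() calls, while the actual usage constraint, namely that the returned value must be compared against -1 before it is used, involves a data flow into a condition that a pure call-trace model cannot represent.

import java.io.IOException;
import java.io.InputStream;

public class ReadLoop {
    // Correct usage: the return value of read() is checked against -1 (end of stream) before
    // it is used. Neither this condition nor the data flow from read() to the check appears
    // in a method-call trace.
    static void copy(InputStream in, StringBuilder out) throws IOException {
        int b;
        while ((b = in.read()) != -1) {
            out.append((char) b);
        }
    }
}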

Calculating the Support of Specifications The most common way to determine the support of a specification is to count the number of usages that adhere to the specification, either in the code (statically) or in the execution traces (dynamically). We call this the occurrence support of a specification. It follows the intuition that a specification that holds more frequently is more likely to generalize.

A possible alternative is the method support, which counts the number of methods that contain at least one usage following the specification. If we interpret methods as code units implementing a particular task, the method support can be interpreted as the number of tasks using a certain API according to the specification. The method support is smaller than or equal to the occurrence support, as it ignores additional usages following the same specification within the implementation of the same task. We hypothesize that this might reduce evidence for violations, as multiple usages within the same task are likely written under the same (mis)conception of the API's constraints.

Another alternative is the project support, which counts the number of projects that contain at least one usage following the specification. We used this definition for MuDetectXP in Experiment XP (see Section 12.2.5). If we interpret projects as the work of different development teams, the project support can be interpreted as a measure of how many teams believe this specification to be correct. We hypothesize that this might be a better indicator for correctness than occurrence or method support, as different teams are less likely to share the same misconceptions.

Future work should investigate and compare the impact of these (and possibly other) ways to calculate the support of specifications on the quality of misuse detectors. Moreover, future work may investigate ways to combine different support metrics.
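The three definitions can be summarized by the following sketch. The Usage type and its provenance fields are illustrative assumptions; in practice, usages would carry this information from the AUG extraction.

import java.util.List;

// A usage that follows a given specification, together with where it was found.
class Usage {
    final String projectId;
    final String methodId; // fully qualified name of the method containing the usage

    Usage(String projectId, String methodId) {
        this.projectId = projectId;
        this.methodId = methodId;
    }
}

class SupportMetrics {
    // Occurrence support: every usage that follows the specification counts.
    static long occurrenceSupport(List<Usage> usagesFollowingSpec) {
        return usagesFollowingSpec.size();
    }

    // Method support: each containing method counts at most once.
    static long methodSupport(List<Usage> usagesFollowingSpec) {
        return usagesFollowingSpec.stream().map(u -> u.methodId).distinct().count();
    }

    // Project support: each project counts at most once.
    static long projectSupport(List<Usage> usagesFollowingSpec) {
        return usagesFollowingSpec.stream().map(u -> u.projectId).distinct().count();
    }
}

Assuming method identifiers are unique across projects, the project support is, by construction, at most the method support, which in turn is at most the occurrence support.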

Determining Thresholds on the Support of Specifications The detectors in our experiments all use absolute minimum-support thresholds to mine specifications. This follows the intuition that a specification that holds at least a certain number of times is likely to be correct. However, with an increasing number of training examples, e.g., when mining from larger projects, it becomes more likely that misuses also appear more frequently than a given absolute threshold.

A possible alternative is to use relative minimum-support thresholds. For example, we may determine a global threshold relative to the total number of training examples. This follows the intuition that a correct specification should hold more often in a larger training dataset. We may also determine individual thresholds relative to the total number of training examples for the API(s) involved in a specification, as proposed by Ramanathan et al. [RGJ07a, RGJ07b], Thummalapenta et al. [TX09a, TX09b], and Acharya et al. [AX09]. Intuitively, this leads to mining specifications that are more likely to capture general usage constraints of the respective APIs, but is less likely to mine specifications for alternative usage patterns.

Finally, we may consider several different thresholds at once. For example, Thummalapenta et al. [TX09a] first mine specifications with a high absolute threshold. Then they collect the training examples that do not conform to any mined specification and repeat mining on them, using a lower absolute support threshold. They report that considering such lower-support specifications may reduce false positives by up to 28%. Similarly, Saied et al. [SBAS15] first mine specifications with a high absolute threshold and then successively mine extensions to these specifications using ever lower thresholds. This follows the observation that there is often a strict core specification that all usages of an API must adhere to, and several alternative extensions to it.

To the best of our knowledge, no previous work has systematically compared alternative ways to determine support thresholds. Future work should investigate how they impact the quality of API-misuse detection.
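The following sketch contrasts the threshold variants discussed above. The scaling factors are illustrative assumptions and would need to be calibrated empirically.

class SupportThresholds {
    // Absolute threshold: a specification must hold at least minSupport times.
    static boolean passesAbsolute(long support, long minSupport) {
        return support >= minSupport;
    }

    // Global relative threshold: scale with the total number of training examples.
    static boolean passesGlobalRelative(long support, long totalExamples, double fraction) {
        return support >= Math.ceil(fraction * totalExamples);
    }

    // Per-API relative threshold: scale with the number of training examples for the API(s)
    // involved in the specification, in the spirit of [RGJ07a, RGJ07b, TX09a, TX09b, AX09].
    static boolean passesPerApiRelative(long support, long examplesForApi, double fraction) {
        return support >= Math.ceil(fraction * examplesForApi);
    }
}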

Replacing Support as a Metric for Correct Usage Most current detectors mine API specifications based on their support in a given training dataset. Our empirical results show that support is not a good indicator for correctness and that low support is an even worse indicator for incorrectness. Future work could develop techniques to prune specifications, e.g., considering the implementation code of APIs, to avoid reporting violations of meaningless specifications. Future work should also investigate ways to avoid frequency thresholds entirely by using probabilistic API-usage models, e.g., based on the ideas of Nguyen et al. [NPVN15] or Murali et al. [MCJ17], or usage models capturing typestates, e.g., similar to Pradel et al. [PJAG12, PG12].


Mining Usage Examples from Further Sources Our results show that the recall of current detectors is severely limited by the number of available usage examples to learn specifications from (see Section 13.5). To mitigate this, future work should investigate techniques that scale to larger training datasets and approaches for the targeted retrieval of usage examples, e.g., through code-search engines, as proposed by Thummalapenta et al. [TX09a, TX09b], or from answers on StackOverflow, as proposed by Gao et al. [GZW+15]. Future work might also leverage changesets of fixes, to learn about both correct and incorrect usage. Furthermore, future work should develop approaches that combine data from multiple sources, because we find that different sources may provide mutually exclusive information (see Section 13.5) and because it is likely that for new APIs there is insufficient information available through any single source.

Mining High-Quality Usage Examples In our work, we show that simply providing larger quantities of usage examples for the APIs under analysis significantly improves misuse detection. We hypothesize that we might be able to further improve detection by ensuring that the provided examples are of high quality. While measuring code quality in general remains an open question, using simple indicators like project maturity, code churn, or the number of tests might already improve detection results. Future work should investigate respective possibilities.

Presenting Detector Findings to Developers We conducted a preliminary user study [Wei16] in which we showed developers different mockups of how a misuse detector might present its findings in an IDE. This study revealed a major challenge for the practical applicability of misuse detectors, namely the need to explain and justify the detector's findings. For example, when shown a misuse of an API that the participants were unfamiliar with, they mostly replied that they could not conclusively judge the detector's finding, even if it appeared to be sensible, without studying the respective API's documentation. And when shown a misuse and a corresponding fix, participants often argued that the fix might be unnecessary, depending on the context of the usage (which we did not provide in the study). Future work should develop techniques to generate explanations and justifications of misuse detectors' findings, to gain developers' trust in this kind of tool. Furthermore, future work should investigate techniques to present misuse-detector findings in a way that is intuitive to developers.

Contributed Implementations and Data

In the course of the projects presented in this thesis, research prototypes have been implemented and data has been sourced or collected. We provide these implementations and datasets to enable other researchers to validate our work and to build new research upon them. We believe this to be good scientific practice and encourage other researchers to do the same.

MuBench

The dataset of API-misuse examples that we collected throughout this thesis is publicly available as part of our automated benchmark. The dataset can be accessed from:

https://github.com/stg-tud/MUBench/tree/master/data

The dataset is presented in a folder structure that follows the data scheme presented in Section 6.1. The metadata about projects, project versions, and misuses is stored in YAML files, following the YAML 1.2 specification (3rd Edition).1 In the root folder resides a file named dataset.yml, which specifies several sub-datasets for each of the different sources that we obtained misuse examples from and for each of the experiments presented throughout this thesis.

MuBenchPipe

The benchmarking pipeline presented in Part II consists of a command-line application for conducting experiments, written in Python 3.5, and a review website, written in PHP 7. We provide the entire source code here:

https://github.com/stg-tud/MUBench

We provide a Docker container that allows running the pipeline independently of the host platform. We use Composer for the dependency configuration of the review site. The README files of the GitHub repository present detailed instructions on how to set up and use both components of MuBenchPipe.

MuDetect

We provide the implementation of MuDetect presented in Chapter 11 here:

1 http://www.yaml.org/spec/1.2/spec.html (checked on Dec 14, 2017)


https://github.com/stg-tud/MUDetect

The detector is fully integrated into MuBench to allow execution in the context of the respective experiments. The MuDetect project uses Apache Maven for its build configuration, to allow system- and IDE-independent builds. The detector can be built using the mvn package command, which creates standalone bundles called MuDetect.jar and MuDetectXP.jar for the respective variants of our detector in the target folder. The README files of the GitHub repository present further details on the organization of the project and its code.

Study Artifacts

We provide the questionnaire and all responses to our developer survey on API misuse and the full list of StackOverflow threads we reviewed in our study on the prevalence of problems leading to a ConcurrentModificationException (see Section 2.3) on respective artifact pages:

http://www.st.informatik.tu-darmstadt.de/artifacts/api-misuse-survey/

http://www.st.informatik.tu-darmstadt.de/artifacts/stackoverflow-cme/

We provide the review results and additional data artifacts for both the study presented in Chapter 7 and the evaluation presented in Chapter 12 on the respective artifact pages:

http://www.st.informatik.tu-darmstadt.de/artifacts/mustudy/

http://www.st.informatik.tu-darmstadt.de/artifacts/mudetect/

The full experiment data may be downloaded from the review sites in CSV format or viewed online.

Bibliography

[ABF+16] Yasemin Acar, Michael Backes, Sascha Fahl, Doowon Kim, Michelle L. Mazurek, and Christian Stransky. You get where you’re looking for. The impact of information sources on code security. In Proceedings of the 37th IEEE Symposium on Security and Privacy. IEEE Computer Society Press, 2016.

[ANN+16] Sven Amann, Sarah Nadi, Hoan A. Nguyen, Tien N. Nguyen, and Mira Mezini. MUBench: A benchmark for API-misuse detectors. In Proceedings of the 13th Working Conference on Mining Software Repositories, MSR ’16. ACM Press, 2016.

[ANN+18a] Sven Amann, Sarah Nadi, Hoan A. Nguyen, Tien N. Nguyen, and Mira Mezini. MUDetect: The next step in static API-misuse detection. In Under Review, 2018.

[ANN+18b] Sven Amann, Hoan A. Nguyen, Sarah Nadi, Tien N. Nguyen, and Mira Mezini. A systematic evaluation of static API-misuse detectors. IEEE Transactions on Software Engineering, 2018.

[AW05] Martín Abadi and Bogdan Warinschi. Password-based Encryption Analyzed, pages 664–676. Springer-Verlag GmbH, 2005.

[AX09] Mithun Acharya and Tao Xie. Mining API error-handling specifications from source code. In Proceedings of the 12th International Conference on Fundamental Approaches to Software Engineering: Held As Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009, FASE ’09, pages 370–384. Springer-Verlag GmbH, 2009.

[BBC+10] Al Bessey, Ken Block, Ben Chelf, Andy Chou, Bryan Fulton, Seth Hallem, Charles Henri-Gros, Asya Kamsky, Scott McPeak, and Dawson Engler. A few billion lines of code later: Using static analysis to find bugs in the real world. Communications of the ACM, 53(2):66–75, 2010.

[BBMZ16] Moritz Beller, Radjino Bholanath, Shane McIntosh, and Andy Zaidman. Analyzing the state of static analysis: A large-scale evaluation in open source software. In Proceedings of the 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering, volume 1. IEEE Computer Society Press, 2016.


[BGH+06] Stephen M. Blackburn, Robin Garner, Chris Hoffmann, Asjad M. Khang, Kathryn S. McKinley, Rotem Bentzur, Amer Diwan, Daniel Feinberg, Daniel Frampton, Samuel Z. Guyer, Martin Hirzel, Antony Hosking, Maria Jump, Han Lee, J. Eliot B. Moss, Aashish Phansalkar, Darko Stefanović, Thomas VanDrunen, Daniel von Dincklage, and Ben Wiedermann. The DaCapo benchmarks: Java benchmarking development and analysis. ACM SIGPLAN Notices, 41(10):169–190, 2006.

[BMM09] Marcel Bruch, Martin Monperrus, and Mira Mezini. Learning from examples to improve code completion systems. In Proceedings of the 7th ACM Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering, ESEC/FSE '09, pages 213–222. ACM Press, 2009.

[BMUT97] Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, and Shalom Tsur. Dynamic itemset counting and implication rules for market basket data. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD '97, pages 255–264. ACM Press, 1997.

[Boe99] B. Boehm. Managing software productivity and reuse. Computer, 32(9):111–113, 1999.

[BRT12] Mihir Bellare, Thomas Ristenpart, and Stefano Tessaro. Multi-instance security and its application to password-based cryptography. In Lecture Notes in Computer Science, pages 312–329. Springer-Verlag GmbH, 2012.

[BW08] Raymond P L Buse and Westley R Weimer. Automatic documentation inference for exceptions. In Proceedings of the International Symposium on Software Testing and Analysis, ISSTA’08. ACM Press, 2008.

[CBC+92] Ram Chillarege, Inderpal S. Bhandari, Jarir K. Chaar, Michael J. Halliday, Diane S. Moebus, Bonnie K. Ray, and Man-Yuen Wong. Orthogonal defect classification-a concept for in-process measurements. IEEE Transactions on Software Engineering, 18(11):943–956, 1992.

[CHK+09] Cristina Cifuentes, Christian Hoermann, Nathan Keynes, Lian Li, Simon Long, Erica Mealy, Michael Mounteney, and Bernhard Scholz. BegBunch: Benchmarking for C bug detection tools. In Proceedings of the International Workshop on Defects in Large Software Systems, DEFECTS '09, pages 16–20. ACM Press, 2009.

[DH09] Uri Dekel and James D. Herbsleb. Improving API documentation usability with knowledge pushing. In Proceedings of the 31st International Conference on Software Engineering, ICSE ’09, pages 320–330. IEEE Computer Society Press, 2009.


[DKM+10] Valentin Dallmeier, Nikolai Knopp, Christoph Mallon, Sebastian Hack, and Andreas Zeller. Generating test cases for specification mining. In Proceedings of the 19th International Symposium on Software Testing and Analysis, ISSTA '10, pages 85–96. ACM Press, 2010.

[DLMK10] Tuan-Anh Doan, David Lo, Shahar Maoz, and Siau-Cheng Khoo. LM: A miner for scenario-based specifications. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering, volume 2 of ICSE ’10, pages 319–320. ACM Press, 2010.

[DNRN13] Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, and Tien N. Nguyen. Boa: A language and infrastructure for analyzing ultra-large-scale software repositories. In Proceedings of the 35th International Conference on Software Engineering, ICSE ’13, pages 422–431. IEEE Computer Society Press, 2013.

[DZ07] Valentin Dallmeier and Thomas Zimmermann. Extraction of bug localization benchmarks from history. In Proceedings of the 22nd ACM/IEEE International Conference on Automated Software Engineering, ASE '07, pages 433–436. ACM Press, 2007.

[EBFK13] Manuel Egele, David Brumley, Yanick Fratantonio, and Christopher Kruegel. An empirical study of cryptographic misuse in Android applications. In Proceedings of the Conference on Computer & Communications Security, CCS '13, pages 73–84. ACM Press, 2013.

[ECH+01] Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf. Bugs as deviant behavior: A general approach to inferring errors in systems code. In Proceedings of the 18th ACM Symposium on Operating Systems Principles, SOSP ’01, pages 57–72. ACM Press, 2001.

[EH14] Michael Eichberg and Ben Hermann. A software product line for static analyses: The OPAL framework. In Proceedings of the 3rd ACM SIGPLAN International Workshop on the State of the Art in Java Program Analysis, SOAP '14, pages 1–6. ACM Press, 2014.

[EW98] K. El Emam and I. Wieczorek. The repeatability of code defect classifications. In Proceedings 9th International Symposium on Software Reliability Engineering, pages 322–333. IEEE Computer Society Press, 1998.

[FHM+12] Sascha Fahl, Marian Harbach, Thomas Muders, Lars Baumgärtner, Bernd Freisleben, and Matthew Smith. Why Eve and Mallory love Android: An analysis of Android SSL (in)security. In Proceedings of the 19th ACM Conference on Computer and Communications Security, CCS '12, pages 50–61. ACM Press, 2012.

[FLL+02] Cormac Flanagan, K. Rustan M. Leino, Mark Lillibridge, Greg Nelson, James B. Saxe, and Raymie Stata. Extended static checking for Java. ACM SIGPLAN Notices, 37(5):234–245, 2002.

[GFX+10] Mark Grechanik, Chen Fu, Qing Xie, Collin McMillan, Denys Poshyvanyk, and Chad Cumby. A search engine for finding highly relevant applications. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, ICSE '10, pages 475–484. ACM Press, 2010.

[GIJ+12] Martin Georgiev, Subodh Iyengar, Suman Jana, Rishita Anubhai, Dan Boneh, and Vitaly Shmatikov. The most dangerous code in the world: Validating SSL certificates in non-browser software. In Proceedings of the 19th ACM Conference on Computer and Communications Security, CCS ’12, pages 38–49. ACM Press, 2012.

[GS67] Barney G. Glaser and Anselm L. Strauss. The Discovery of Grounded Theory: Strategies for Qualitative Research, volume 46. Aldine de Gruyter, 1967.

[GS10] Mark Gabel and Zhendong Su. Online inference and enforcement of temporal properties. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, ICSE '10, pages 15–24. ACM Press, 2010.

[GW97] Bernhard Ganter and Rudolf Wille. Formal Concept Analysis: Mathematical Foundations. Springer-Verlag New York, Inc., 1st edition, 1997.

[GWZ10] Natalie Gruska, Andrzej Wasylkowski, and Andreas Zeller. Learning from 6,000 projects. In Proceedings of the 19th International Symposium on Software Testing and Analysis, ISSTA '10, pages 119–129. ACM Press, 2010.

[GZW+15] Qing Gao, Hansheng Zhang, Jie Wang, Yingfei Xiong, Lu Zhang, and Hong Mei. Fixing recurring crash bugs via analyzing Q&A sites. In Proceedings of the 30th ACM/IEEE International Conference on Automated Software Engineering, ASE ’15, pages 307–318. IEEE Computer Society, 2015.

[HDG+11] Lars Heinemann, Florian Deissenboeck, Mario Gleirscher, Benjamin Hummel, and Maximilian Irlbeck. Top Productivity through Software Reuse: 12th International Conference on Software Reuse, ICSR 2011, Pohang, South Korea, June 13-17, 2011. Proceedings, chapter On the Extent and Nature of Software Reuse in Open Source Java Projects, pages 207–222. Springer-Verlag GmbH, 2011.

[HJZ13] Kim Herzig, Sascha Just, and Andreas Zeller. It's not a bug, it's a feature: How misclassification impacts bug prediction. In Proceedings of the 35th International Conference on Software Engineering, ICSE '13, pages 392–401. IEEE Computer Society Press, 2013.


[HKNP06] Andrew Hinton, Marta Kwiatkowska, Gethin Norman, and David Parker. PRISM: A Tool for Automatic Verification of Probabilistic Systems, pages 441–444. Springer-Verlag GmbH, 2006.

[HM05] R. Holmes and G. C. Murphy. Using structural context to recommend source code examples. In Proceedings of the 27th International Conference on Software Engineering, pages 117–125. IEEE Computer Society Press, 2005.

[HP04] David Hovemeyer and William Pugh. Finding bugs is easy. ACM SIGPLAN Notices, 39(12):92–106, 2004.

[HWM05] Reid Holmes, Robert J. Walker, and Gail C. Murphy. Strathcona example recommendation tool. In Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ESEC/FSE '13, pages 237–240. ACM Press, 2005.

[IEE10] IEEE standard classification for software anomalies. IEEE Std 1044-2009 (Revision of IEEE Std 1044-1993), pages 1–23, 2010.

[JA09] Ciera Jaspan and Jonathan Aldrich. Checking Framework Interactions with Relationships, pages 27–51. Springer-Verlag GmbH, 2009.

[JJE14] René Just, Darioush Jalali, and Michael D. Ernst. Defects4J: A database of existing faults to enable controlled testing studies for Java programs. In Proceedings of the International Symposium on Software Testing and Analysis, ISSTA '14, pages 437–440. ACM Press, 2014.

[JMLR12] Dongyun Jin, Patrick O'Neil Meredith, Choonghwan Lee, and Grigore Roşu. JavaMOP: Efficient parametric runtime monitoring framework. In Proceedings of the 34th International Conference on Software Engineering, ICSE '12, pages 1427–1430. IEEE Computer Society Press, 2012.

[JS13] Brittany Johnson and Yoonki Song. Why don't software developers use static analysis tools to find bugs? In Proceedings of the 35th International Conference on Software Engineering, ICSE '13. IEEE Computer Society Press, 2013.

[KSA+18] Stefan Krüger, Johannes Späth, Karim Ali, Eric Bodden, and Mira Mezini. CrySL: An extensible approach to validating the correct usage of cryptographic APIs. In Proceedings of the European Conference on Object-Oriented Programming, ECOOP '18, 2018.

[LCWZ14] David Lazar, Haogang Chen, Xi Wang, and Nickolai Zeldovich. Why does cryptographic software fail?: A case study and open problems. In Proceedings of the 5th Asia-Pacific Workshop on Systems, APSys '14, pages 7:1–7:7. ACM Press, 2014.


[LHX+16] Owolabi Legunsen, Wajih Ul Hassan, Xinyue Xu, Grigore Roşu, and Darko Marinov. How good are the specs? A study of the bug-finding effectiveness of existing Java API specifications. In Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, ASE '16, pages 602–613. ACM Press, 2016.

[Lin07] Christian Lindig. Mining patterns and violations using concept analysis. Technical report, Universität des Saarlandes, Saarbrücken, Germany, 2007.

[Lis87] Barbara Liskov. Keynote address - data abstraction and hierarchy. ACM SIGPLAN Notices, 23(5):17–34, 1987.

[LJMR12] Choonghwan Lee, Dongyun Jin, Patrick O'Neil Meredith, and Grigore Roşu. Towards categorizing and formalizing the JDK API. Technical report, University of Illinois at Urbana-Champaign, 2012.

[LKL08] David Lo, Siau-Cheng Khoo, and Chao Liu. Mining temporal rules for software maintenance. Journal of Software Maintenance and Evolution: Research and Practice, 20(4):227–247, 2008.

[LLQ+05] Shan Lu, Zhenmin Li, Feng Qin, Lin Tan, Pin Zhou, and Yuanyuan Zhou. BugBench: Benchmarks for evaluating bug detection tools. In Workshop on the Evaluation of Software Defect Detection Tools, 2005.

[LZ05] Zhenmin Li and Yuanyuan Zhou. PR-Miner: Automatically extracting implicit programming rules and detecting violations in large software code. In Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ESEC/FSE ’13, pages 306–315. ACM Press, 2005.

[LZL+14] Qingzhou Luo, Yi Zhang, Choonghwan Lee, Dongyun Jin, Patrick O'Neil Meredith, Traian Florin Şerbănuţă, and Grigore Rosu. RV-Monitor: Efficient parametric runtime verification with simultaneous properties. In Runtime Verification, pages 285–300. Springer-Verlag GmbH, 2014.

[MBM10] Martin Monperrus, Marcel Bruch, and Mira Mezini. Detecting missing method calls in object-oriented software. In Proceedings of the 24th European Conference on Object-oriented Programming, ECOOP '10, pages 2–25. Springer-Verlag GmbH, 2010.

[MCJ17] Vijayaraghavan Murali, Swarat Chaudhuri, and Chris Jermaine. Bayesian specification learning for finding API usage errors. In Proceedings of the 11th ACM Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering, ESEC/FSE '17, pages 151–162. ACM Press, 2017.


[METM12] Martin Monperrus, Michael Eichberg, Elif Tekes, and Mira Mezini. What should developers be aware of? An empirical study on the directives of API documentation. Empirical Software Engineering, 17(6):703–737, 2012.

[MGP+11] Collin McMillan, Mark Grechanik, Denys Poshyvanyk, Qing Xie, and Chen Fu. Portfolio: Finding relevant functions and their usage. In Proceedings of the 33rd International Conference on Software Engineering, ICSE ’11, pages 111–120. ACM Press, 2011.

[MM13] Martin Monperrus and Mira Mezini. Detecting missing method calls as violations of the majority rule. ACM Transactions on Software Engineering and Methodology, 22(1):1–25, 2013.

[NK11] Anh Cuong Nguyen and Siau-Cheng Khoo. Extracting significant specifications from mining through mutation testing. In Formal Methods and Software Engineering, pages 472–488. Springer-Verlag GmbH, 2011.

[NKMB16] Sarah Nadi, Stefan Krüger, Mira Mezini, and Eric Bodden. "Jumping through hoops": Why do developers struggle with cryptography APIs? In Proceedings of the 38th International Conference on Software Engineering, ICSE '16. ACM Press, 2016.

[NNP+09a] Hoan Anh Nguyen, Tung Thanh Nguyen, Nam H. Pham, Jafar M. Al-Kofahi, and Tien N. Nguyen. Accurate and efficient structural characteristic feature extraction for clone detection. In Proceedings of the 12th International Conference on Fundamental Approaches to Software Engineering, FASE '09, pages 440–455. Springer-Verlag, 2009.

[NNP+09b] Tung Thanh Nguyen, Hoan Anh Nguyen, Nam H. Pham, Jafar M. Al-Kofahi, and Tien N. Nguyen. Graph-based mining of multiple object usage patterns. In Proceedings of the 7th ACM Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering, ESEC/FSE '09, pages 383–392. ACM Press, 2009.

[NNP+12] Hoan Anh Nguyen, Tung Thanh Nguyen, Nam H. Pham, Jafar Al-Kofahi, and Tien N. Nguyen. Clone management for evolving software. IEEE Transactions on Software Engineering, 38(5):1008–1026, 2012.

[NNW+10] Hoan Anh Nguyen, Tung Thanh Nguyen, Gary Wilson, Jr., Anh Tuan Nguyen, Miryung Kim, and Tien N. Nguyen. A graph-based approach to API usage adaptation. In Proceedings of the International Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA’10, pages 302–321. ACM Press, 2010.

[NPVN15] T. T. Nguyen, H. V. Pham, P. M. Vu, and T. T. Nguyen. Recommending API usages for mobile apps with Hidden Markov Model. In Proceedings of the 30th ACM/IEEE International Conference on Automated Software Engineering, ASE '15, pages 795–800. IEEE Computer Society Press, 2015.

[NPVN16] Tam The Nguyen, Hung Viet Pham, Phong Minh Vu, and Tung Thanh Nguyen. Learning API usages from bytecode: A statistical approach. In Proceedings of the 38th International Conference on Software Engineering, ICSE '16. ACM Press, 2016.

[PADP+12] Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, and Gerardo Canfora. Mining source code descriptions from developer communications. In Proceedings of the 20th IEEE International Conference on Program Comprehension, ICPC '12, pages 63–72. IEEE Computer Society Press, 2012.

[PBDP+14] Luca Ponzanelli, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, and Michele Lanza. Mining StackOverflow to turn the IDE into a self-confident programming prompter. In Proceedings of the 11th Working Conference on Mining Software Repositories, MSR '14, pages 102–111. ACM Press, 2014.

[PBG10] Michael Pradel, Philipp Bichsel, and Thomas R. Gross. A framework for the evaluation of specification miners based on finite state machines. In Proceedings of the 28th IEEE International Conference on Software Maintenance, ICSM '10, pages 1–10. IEEE Computer Society Press, 2010.

[PG09] Michael Pradel and Thomas R. Gross. Automatic generation of object usage specifications from large method traces. In Proceedings of the 24th IEEE/ACM International Conference on Automated Software Engineering, ASE '09, pages 371–382. IEEE Computer Society Press, 2009.

[PG12] Michael Pradel and Thomas R. Gross. Leveraging test generation and specification mining for automated bug detection without false positives. In Proceedings of the 34th International Conference on Software Engineering, ICSE '12, pages 288–298. IEEE Computer Society Press, 2012.

[PJAG12] Michael Pradel, Ciera Jaspan, Jonathan Aldrich, and Thomas R. Gross. Statically checking API protocol conformance with mined multi-object specifications. In Proceedings of the 34th International Conference on Software Engineering, ICSE '12, pages 925–935. IEEE Computer Society Press, 2012.

[PLEB07] Carlos Pacheco, Shuvendu K. Lahiri, Michael D. Ernst, and Thomas Ball. Feedback-directed random test generation. In Proceedings of the 29th International Conference on Software Engineering, ICSE '07, pages 75–84. IEEE Computer Society Press, 2007.

[PLM15] Sebastian Proksch, Johannes Lerch, and Mira Mezini. Intelligent code completion with Bayesian networks. ACM Transactions on Software Engineering and Methodology (TOSEM), 25(1):1–31, 2015.


[RBK+13] Martin P. Robillard, Eric Bodden, David Kawrykow, Mira Mezini, and Tristan Ratchford. Automated API property inference techniques. IEEE Transactions on Software Engineering, 39(5):613–637, 2013.

[REHM17] Michael Reif, Michael Eichberg, Ben Hermann, and Mira Mezini. Hermes: Assessment and creation of effective test corpora. In Proceedings of the 6th ACM SIGPLAN International Workshop on State Of the Art in Program Analysis, SOAP 2017, pages 43–48. ACM Press, 2017.

[RGJ07a] Murali Krishna Ramanathan, Ananth Grama, and Suresh Jagannathan. Path-sensitive inference of function precedence protocols. In Proceedings of the 29th International Conference on Software Engineering, ICSE ’07, pages 240–250. IEEE Computer Society Press, 2007.

[RGJ07b] Murali Krishna Ramanathan, Ananth Grama, and Suresh Jagannathan. Static specification inference using predicate mining. In Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’07, pages 123–134. ACM Press, 2007.

[RJ86] L. R. Rabiner and B. H. Juang. An introduction to Hidden Markov Models. IEEE ASSP Magazine, 3(1):4–16, 1986.

[RMWZ14] Martin P. Robillard, Walid Maalej, Robert J. Walker, and Thomas Zimmermann, editors. Recommendation Systems in Software Engineering. Springer-Verlag GmbH, 2014.

[RP15] T Ramraj and R Prabhakar. Frequent subgraph mining algorithms – a survey. Procedia Computer Science, 47:197–204, 2015.

[RWZ10] Martin P. Robillard, Robert J. Walker, and Thomas Zimmermann. Recommendation systems for software engineering. IEEE Software, 27(4):80–86, 2010.

[SBAS15] M. A. Saied, O. Benomar, H. Abdeen, and H. Sahraoui. Mining multi-level API usage patterns. In Proceedings of the 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering, SANER '15, pages 23–32. IEEE Computer Society Press, 2015.

[SDG+16] Tony Savor, Mitchell Douglas, Michael Gentili, Laurie Williams, Kent Beck, and Michael Stumm. Continuous deployment at Facebook and OANDA. In Proceedings of the 38th International Conference on Software Engineering Companion, ICSE ’16, pages 21–30. ACM Press, 2016.

[SE13] Widura Schwittek and Stefan Eicker. A study on third party component reuse in Java enterprise open source software. In Proceedings of the 16th International ACM SIGSOFT Symposium on Component-based Software Engineering, pages 75–80. ACM Press, 2013.


[SHA15] Joshua Sunshine, James D. Herbsleb, and Jonathan Aldrich. Searching the state space: A qualitative study of API protocol usability. In Proceedings of the 23rd IEEE International Conference on Program Comprehension, ICPC '15, pages 82–93. IEEE Computer Society Press, 2015.

[TAD+10] Ewan Tempero, Craig Anslow, Jens Dietrich, Ted Han, Jing Li, Markus Lumpe, Hayden Melton, and James Noble. The Qualitas Corpus: A curated collection of Java code for empirical studies. In Proceedings of the Asia Pacific Software Engineering Conference, APSEC ’10, pages 336–345. IEEE Computer Society Press, 2010.

[TR16] Christoph Treude and Martin P. Robillard. Augmenting API documentation with insights from Stack Overflow. In Proceedings of the 38th International Conference on Software Engineering, ICSE ’16, pages 392–403. ACM Press, 2016.

[TX09a] Suresh Thummalapenta and Tao Xie. Alattin: Mining alternative patterns for detecting neglected conditions. In Proceedings of the 24th IEEE/ACM International Conference on Automated Software Engineering, ASE ’09, pages 283–294. IEEE Computer Society Press, 2009.

[TX09b] Suresh Thummalapenta and Tao Xie. Mining exception-handling rules as sequence association rules. In Proceedings of the 31st International Conference on Software Engineering, ICSE '09, pages 496–506. IEEE Computer Society Press, 2009.

[UR15] Gias Uddin and Martin P Robillard. How API documentation fails. IEEE Software, 32(4):68–75, 2015.

[Wei16] Simon Weiler. Integrating an API-misuse detector into Eclipse. Master's thesis, Technische Universität Darmstadt, 2016.

[WLW+11] Qian Wu, Guangtai Liang, Qianxiang Wang, Tao Xie, and Hong Mei. Iterative mining of resource-releasing specifications. In Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering, ASE '11, pages 233–242. IEEE Computer Society Press, 2011.

[WPVS17] Xiaoran Wang, Lori Pollock, and K Vijay-Shanker. Automatically generating natural language descriptions for object-related statement sequences. In Proceedings of the 24th IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER '17, pages 205–216. IEEE Computer Society Press, 2017.

[WZ11] Andrzej Wasylkowski and Andreas Zeller. Mining temporal specifications from object usage. Automated Software Engineering, 18(3-4):263–292, 2011.


[WZL07] Andrzej Wasylkowski, Andreas Zeller, and Christian Lindig. Detecting object usage anomalies. In Proceedings of the 6th ACM Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering, ESEC/FSE '07, pages 35–44. ACM Press, 2007.

[ZGC+17] Yu Zhou, Ruihang Gu, Taolue Chen, Zhiqiu Huang, Sebastiano Panichella, and Harald Gall. Analyzing APIs documentation and code to detect directive defects. In Proceedings of the 39th IEEE/ACM International Conference on Software Engineering, ICSE '17, pages 27–37. IEEE Computer Society Press, 2017.

[ZPZ07] Thomas Zimmermann, Rahul Premraj, and Andreas Zeller. Predicting defects for Eclipse. In Proceedings of the International Workshop on Predictor Models in Software Engineering, PROMISE '07. IEEE Computer Society Press, 2007.

[ZS13] Hao Zhong and Zhendong Su. Detecting API documentation errors. In Proceedings of the International Conference on Object-oriented Programming, Systems, Languages & Applications, volume 48, pages 803–816. ACM Press, 2013.

[ZS15] Hao Zhong and Zhendong Su. An empirical study on real bug fixes. In Proceedings of the 37th International Conference on Software Engineering - Volume 1, ICSE ’15, pages 913–923. IEEE Computer Society Press, 2015.


Appendix


A. Six Basic Rules for Safe Usage of the Java Cryptographic Architecture APIs

Egele et al. [EBFK13] report that 88% (10,327 of 11,748) of the analyzed Android apps from the Google Play marketplace violate at least one of six basic rules for safe usage of the Java Cryptographic Architecture (JCA) APIs. These six basic rules are listed below; a sketch of a usage that follows all of them is given after the list.

R1 Do not use ECB mode for encryption – the Electronic Codebook (ECB) mode is considered insecure, since it converts identical plaintext blocks into identical ciphertext blocks, thus preserving data patterns.1

R2 Do not use a non-random initialization vector (IV) for CBC encryption – using a non-random IV, i.e., hard-coding the IV in the program, nullifies the random noise that the IV is intended to introduce and allows attackers to remove the remaining constant noise by using the IV extracted from the code.2,3

R3 Do not use constant encryption keys – using a constant, i.e., hard-coded, key allows attackers to extract that key from the code and undo any encryption.

R4 Do not use constant salts for PBE – using constant salts for password-based encryption, i.e., using the same salt for every password, effectively nullifies the random noise that the salt is intended to introduce, making it easier for attackers to guess passwords [AW05, BRT12].

R5 Do not use fewer than 1,000 iterations for PBE – using fewer iterations effectively makes cryptographic keys generated in password-based encryption less random [AW05, BRT12].

R6 Do not use static seeds to seed SecureRandom – using a static seed makes the generated random-number sequences reproducible, which nullifies the effect of the randomness.

1 http://cseweb.ucsd.edu/~mihir/cse207/classnotes.html (checked on Dec 26, 2017)
2 http://www.openssl.org/~bodo/tls-cbc.txt (checked on Feb 16, 2018)
3 See footnote 1.
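For illustration, the following sketch shows an encryption routine built from the standard JCA APIs that respects all six rules: it derives the key from a password using a random salt and at least 1,000 iterations (R3, R4, R5), uses CBC with a random IV instead of ECB (R1, R2), and relies on a self-seeded SecureRandom (R6). It is an illustrative example, not code taken from the studied apps.

import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

class RuleConformingEncryption {

    static byte[] encrypt(char[] password, byte[] plaintext) throws Exception {
        SecureRandom random = new SecureRandom(); // R6: self-seeded, no static seed

        byte[] salt = new byte[16];
        random.nextBytes(salt);                   // R4: random salt for PBE

        // R3/R5: key derived from the password with >= 1,000 iterations, not hard-coded
        PBEKeySpec keySpec = new PBEKeySpec(password, salt, 10000, 128);
        SecretKey derived = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1").generateSecret(keySpec);
        SecretKey key = new SecretKeySpec(derived.getEncoded(), "AES");

        byte[] iv = new byte[16];
        random.nextBytes(iv);                     // R2: random IV for CBC

        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding"); // R1: CBC, not ECB
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher.doFinal(plaintext);
        // Note: salt and IV must be stored alongside the ciphertext for decryption.
    }
}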


B. BOA API Usage

We used the following script to search for GitHub projects using a particular API via the BOA repository-mining infrastructure [DNRN13] in Section 12.2.5. Before sending the query, we replaced TARGET_TYPE by the API's fully qualified type name, e.g., java.util.List, PACKAGE_STAR_TYPE by the respective type's package name with a wildcard import, e.g., java.util.*, and SIMPLE_TYPE_NAME by the type's simple type name, e.g., List.

p: Project = input;
out: output set of string;

files: set of string;
revision: Revision;
file: ChangedFile;
imports_package: bool;

visit(p, visitor {
    before r: Revision -> revision = r;
    before f: ChangedFile -> {
        if (contains(files, f.name) || match("test", lowercase(f.name))) stop;
        file = f;
    }
    after f: ChangedFile -> add(files, f.name);
    before astRoot: ASTRoot -> {
        imports := astRoot.imports;
        imports_package = false;
        foreach (i: int; def(imports[i])) {
            if (imports[i] == "TARGET_TYPE") {
                out << p.name;
                stop;
            } else if (imports[i] == "PACKAGE_STAR_TYPE") {
                imports_package = true;
                break;
            }
        }
    }
    before variable: Variable -> {
        if ((imports_package && (variable.variable_type.name == "SIMPLE_TYPE_NAME")) ||
                (variable.variable_type.name == "TARGET_TYPE")) {
            out << p.name;
            stop;
        }
    }
});


C. BOA Cipher Usages

We used the following script to search for GitHub projects using the Cipher API via the BOA repository-mining infrastructure [DNRN13] in Section 2.4. The script ignores any files that have the keyword test somewhere in their path or filename, since we consider vulnerabilities in test code as non-critical. The script misses Cipher usages that reference the type by its fully qualified name instead of importing it. However, since this is not common practice in Java, considering it is unlikely to significantly increase the number of projects we identify.

p: Project = input;
out: output set[string] of string;

files: set of string;
revision: Revision;
file: ChangedFile;

visit(p, visitor {
    before r: Revision -> revision = r;
    before f: ChangedFile -> {
        if (contains(files, f.name) || match("test", lowercase(f.name))) stop;
        file = f;
    }
    after f: ChangedFile -> add(files, f.name);
    before astRoot: ASTRoot -> {
        imports := astRoot.imports;
        foreach (i: int; def(imports[i])) {
            if (imports[i] == "javax.crypto.Cipher" || imports[i] == "javax.crypto.*") {
                out[p.name] << p.project_url + "/blob/" + revision.id + "/" + file.name;
                stop;
            }
        }
    }
});
