EMPIRICAL ANALYSES OF SOFTWARE CONTRIBUTOR PRODUCTIVITY

by Ayushi Rastogi

Under the supervision of Dr. Nachiappan Nagappan and Prof. Pankaj Jalote

Indraprastha Institute of Information Technology, Delhi
August 2017

EMPIRICAL ANALYSES OF SOFTWARE CONTRIBUTOR PRODUCTIVITY

by Ayushi Rastogi

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy to the

Indraprastha Institute of Information Technology, Delhi
August 2017

Dedicated to my Mom, Dad and Di.

CERTIFICATE

This is to certify that the thesis titled “Empirical Analyses of Software Contributor Productivity” being submitted by Ayushi Rastogi to the Indraprastha Institute of Information Technology, Delhi for the award of the degree of Doctor of Philosophy, is an original research work carried out by her under our supervision. It is to be noted that from July 2011 to August 2014 the research was carried out under the supervision of Dr. Ashish Sureka. In our opinion, the thesis has reached the standards fulfilling the requirements of the regulations relating to the degree. The results contained in this thesis have not been submitted in part or full to any other university or institute for the award of any degree/diploma.

August 2017

Dr. Nachiappan Nagappan Prof. Pankaj Jalote

Indraprastha Institute of Information Technology, Delhi
New Delhi, 110020

ACKNOWLEDGEMENTS

In the end, it’s the journey that matters...

My PhD journey is beautiful because of the people in the forefront and in the background who stood by me. My advisers, collaborators and committee members stood at the forefront guiding me throughout, and my friends and family acted as my invisible strength. As I finish my PhD, I would like to thank all those whose thoughtfulness, care and support helped me reach here.

Many thanks to my advisers Dr. Nachiappan Nagappan and Prof. Pankaj Jalote, whose constant support and guidance made this thesis possible. I thank Dr. Ashish Sureka, who introduced me to research during the initial years of my PhD and laid the foundation for my future work. Thanks to Dr. Thomas Zimmermann, Dr. Suresh Thummalapenta, Jacek Czerwonka, and Dr. Georgios Gousios for their collaboration and fruitful discussions. Also thanks to my PhD monitoring committee, Dr. Rahul Purandare and Dr. Pushpendra Singh, for their constant feedback on the thesis. I thank my thesis review committee, Prof. Premkumar Devanbu, Prof. Michele Lanza and Dr. Anjaneyulu Pasala, for sharing their suggestions to improve the work. I also thank my industry mentor, Ravindra Naik, and my comprehensive examination review committee, Dr. Pamela Bhattacharya and Dr. Gautam Shroff, for their inputs.

I am indebted to my Mom, Dad and Di, who listened to all my crazy talks and helped me get back on track whenever I digressed from normal life. And my wonderful friends: Sangeeth, Monika, Ankita, Sonia, Megha, Surabhi, Dipto, Sonal, Hemanta, Annapurna, and Venkatesh within the institute and many more outside the institute, who made all these years memorable. Thank you all for being there.

I thank the TCS PhD research fellowship programme, which supported me financially throughout my PhD (January 2012 to December 2016). Also thanks to Microsoft, CICS, SIGSOFT, TCS, and IIIT-Delhi for providing travel support. Last but not least, thanks to all the survey respondents from industry and open source projects for sharing their perspectives and experiences.

PUBLICATIONS

[1] Ayushi Rastogi, Nachiappan Nagappan, and Georgios Gousios. “Geographical Bias in GitHub: Perceptions and Reality.” In: Submitting to ICSE 2018.

[2] Ayushi Rastogi and Nachiappan Nagappan. “On the Personality Traits of GitHub Contributors.” In: 27th International Symposium on Software Reliability Engineering (ISSRE), Ottawa, Canada. 2016.

[3] Ayushi Rastogi and Nachiappan Nagappan. “Forking and the Sustainability of the Developer Community Participation–An Empirical Investigation on Outcomes and Reasons.” In: IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER). Vol. 1. IEEE. 2016, pp. 102–111. doi: 10.1109/SANER.2016.27.

[4] Ayushi Rastogi, Suresh Thummalapenta, Thomas Zimmermann, Nachiappan Nagappan, and Jacek Czerwonka. “Ramp-Up Journey of New Hires: Tug of War of Aids and Impediments.” In: ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). 2015, pp. 1–10. doi: 10.1109/ESEM.2015.7321212.

[5] Ayushi Rastogi, Suresh Thummalapenta, Thomas Zimmermann, Nachiappan Nagappan, and Jacek Czerwonka. “Ramp-up Journey of New Hires: Do strategic practices of software companies influence productivity?” In: 10th Innovations in Software Engineering Conference (ISEC), formerly India Software Engineering Conference. 2017, pp. 1–4.

[6] Ayushi Rastogi. “Contributor’s Performance, Participation Intentions, Its Influencers and Project Performance.” In: 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering. Vol. 2. 2015, pp. 919–922. doi: 10.1109/ICSE.2015.292.

[7] Ayushi Rastogi and Ashish Sureka. “What Community Contribution Pattern Says about Stability of Software Project?” In: 21st Asia-Pacific Software Engineering Conference. Vol. 2. 2014, pp. 31–34. doi: 10.1109/APSEC.2014.88.

[8] Ayushi Rastogi and Ashish Sureka. “Open Source Software: Mobile Open Source Technologies: 10th IFIP WG 2.13 International Conference on Open Source Systems, OSS 2014, San José, Costa Rica, May 6-9, 2014. Proceedings.” In: Berlin, Heidelberg: Springer Berlin Heidelberg, 2014. Chap. Does Contributor Characteristics Influence Future Participation? A Case Study on Google Chromium Issue Tracking System, pp. 164–167. isbn: 978-3-642-55128-4. doi: 10.1007/978-3-642-55128-4_22. url: http://dx.doi.org/10.1007/978-3-642-55128-4_22.

[9] Ayushi Rastogi and Ashish Sureka. “SamikshaViz: A Panoramic View to Measure Contribution and Performance of Professionals by Mining Bug Archives.” In: Proceedings of the 7th India Software Engineering Conference. ISEC ’14. Chennai, India: ACM, 2014, 2:1–2:10. isbn: 978-1-4503-2776-3. doi: 10.1145/2590748.2590750. url: http://doi.acm.org/10.1145/2590748.2590750.

[10] Ayushi Rastogi, Arpit Gupta, and Ashish Sureka. “Samiksha: Mining Issue Tracking System for Contribution and Performance Assessment.” In: Proceedings of the 6th India Software Engineering Conference. ISEC ’13. New Delhi, India: ACM, 2013, pp. 13–22. isbn: 978-1-4503-1987-4. doi: 10.1145/2442754.2442757. url: http://doi.acm.org/10.1145/2442754.2442757.

[11] Ayushi Rastogi and Ashish Sureka. “SamikshaUmbra: Contribution and Performance Assessment of Software Maintenance Professionals by Mining Software Repositories.” In: 2013 20th Asia-Pacific Software Engineering Conference (APSEC). Vol. 2. 2013, pp. 170–175.

CONTENTS

1 introduction
1.1 Empirical Software Engineering
1.1.1 Research Methods
1.1.2 Data Collection Techniques
1.1.3 Mining Software Repositories
1.1.4 Impact of Empirical Software Engineering
1.2 Software Contributor Productivity
1.2.1 Measuring Contributor Productivity
1.2.2 Understanding Contributor Productivity
1.3 Thesis Overview
2 aids and impediments in the ramp-up journey of new hires
2.1 Introduction
2.2 Related Work
2.3 Background
2.4 Methodology
2.4.1 Data Collection
2.4.2 Quantitative Analysis
2.4.3 Qualitative Analysis
2.5 Time to First Check-In
2.5.1 Quantitative Analysis
2.5.2 Qualitative Analysis
2.6 Time to Ramp-Up
2.6.1 Quantitative Analysis
2.6.2 Qualitative Analysis
2.7 Other New Hire Activities
2.8 Threats to Validity
2.8.1 Internal Validity
2.8.2 Construct Validity
2.8.3 External Validity
2.9 Summary
3 geographical bias in peer code review
3.1 Introduction
3.2 Background and Related Work
3.2.1 Visible Demographic Attributes and Bias
3.2.2 Pull-based Development
3.2.3 Factors Influencing Pull Request Acceptance Decisions
3.3 Methodology
3.3.1 GitHub Data
3.3.2 Survey
3.4 Results
3.4.1 Analysis of GitHub projects’ data
3.4.2 Analysis of Survey Data
3.4.3 Open-ended Survey Responses
3.5 Threats to Validity
3.5.1 Internal Validity
3.5.2 External Validity
3.6 Implications
3.7 Summary
4 factors influencing contributor participation
4.1 Dynamics of Competing Projects
4.1.1 Introduction
4.1.2 Related Work
4.1.3 Background


4.1.4 Methodology
4.1.5 Results
4.1.6 Threats to Validity
4.1.7 Implications
4.2 Personality Traits and Context of …
4.2.1 Introduction
4.2.2 Background and Related Work
4.2.3 Methodology
4.2.4 Results
4.2.5 Discussions
4.2.6 Threats to Validity
4.3 Role and Past Contributions
4.3.1 Introduction
4.3.2 Empirical Analysis
4.4 Summary
5 measures of software contributor productivity
5.1 Introduction
5.2 Related Work and Research Contributions
5.2.1 Measure Individual Productivity
5.2.2 Measure Team Productivity
5.2.3 Visualize Contributor Productivity
5.3 Methodology
5.4 Individual Key Performance Indicators - Measurement and Visualization
5.4.1 Survey of Software Maintenance Professionals
5.4.2 Bug Owner Metrics
5.4.3 Bug Reporter Metrics
5.4.4 Bug Triager Metrics
5.4.5 Bug Collaborator Metrics
5.4.6 Visualizations
5.4.7 Evaluation
5.5 Measuring Changes in Contributor Community Participation
5.5.1 Attrition Rate
5.5.2 Regeneration Rate
5.5.3 Retention Rate
5.5.4 Applications
5.5.5 Discussion
5.6 Threats to Validity
5.7 Summary
6 conclusions and future work
i appendix
a course work, fellowship and internship
a.1 Course Work
a.2 Fellowship
a.3 Internship
bibliography

LIST OF FIGURES

Figure 1 Research Method - Mixed methods approach comparing and contrasting the observations of quantitative analysis and qualitative analysis
Figure 2 Overview of the research contributions of this thesis
Figure 3 Percentage of new hires in different product groups
Figure 4 Data collected by CodeMine
Figure 5 Product groups and time to first check-in
Figure 6 Experience and time to first check-in
Figure 7 Proximity to core team and first check-in time
Figure 8 Prior engagement and first check-in time
Figure 9 Product groups and time to ramp-up
Figure 10 Experience and time to ramp-up
Figure 11 Proximity to core team and time to ramp-up
Figure 12 Prior engagement and time to ramp-up
Figure 13 Research Method: Mixed-Methods Approach
Figure 14 Fork Tree, and Types of Forks on Pull Request History
Figure 15 Decrease in participation
Figure 16 Increase in participation
Figure 17 No effect in participation
Figure 18 Research Method: [Step 1] Identify projects and contributors for the study. [Step 2] Compute personality traits score from discussions. [Step 3] Group contributors based on contribution and project membership. Group contribution of contributors based on time, type, role and project membership. [Step 4] Conduct statistical analyses to test for significance of the differences in personality traits across groups.
Figure 19 Differences in the personality traits of contributors across projects with High (H), Medium (M) and Low (L) levels of contributions
Figure 20 Differences in the personality traits of contributors across projects based on whether they are a member of the project (M) or not (NM)
Figure 21 Differences in the personality traits of the contributors during the first year (Y1) and the second year (Y2) of their participation
Figure 22 Differences in the personality traits of contributors during the first year (Y1) and the third year (Y3) of their participation
Figure 23 Differences in the personality traits of contributors while discussing issues (I), pull requests (P) and commits (C)
Figure 24 Differences in the personality traits of contributors when they are reporter (Rp) versus when they are reviewer (Rv)
Figure 25 Differences in the personality traits of the contributors based on whether they are a member of the project (M) or not (NM)
Figure 26 Attrition rate of maintainers for four years
Figure 27 Comparison of attrition rates for contribution classes for four years
Figure 28 Research method used to measure software contributor productivity
Figure 29 Trellis Plot
Figure 30 Treemap Plot
Figure 31 Bertin’s Hotel Plot
Figure 32 Dart Chart
Figure 33 Hybrid Plot
Figure 34 Nightingale Rose Plot
Figure 35 Attrition Rate of contributors across four years
Figure 36 Issue count distribution calculated monthly
Figure 37 Regeneration Rate of contributors across four years
Figure 38 Retention Rate of contributors across four years


Figure 39 Comparison of Attrition Rate and Regeneration Rate of contributors across four years
Figure 40 Changes in contributor participation pattern
Figure 41 Model I: Simple Statistical Model to analyze trends in time series data of Attrition Rate
Figure 42 Decomposition of time series data of Attrition Rate metric into trend, seasonal and remainder component using stl function in R
Figure 43 Model II: Linear Regression Model to analyze trends in time series data of Attrition Rate
Figure 44 Model III: Exponential Smoothing Forecasting to analyze trend in time series data of Attrition Rate

LIST OF TABLES

Table 1 Products and start-end date of the releases
Table 2 Statistical significance of survey results
Table 3 Influence of the following factors on the time to first check-in [SI: Strong Increase; MI: Moderate Increase; NE: No Effect; MD: Moderate Decrease; SD: Strong Decrease; DK: Don’t Know]
Table 4 Correlation between time to first check-in and ramp-up time after first check-in
Table 5 Influence of the following factors on the time to ramp-up [SI: Strong Increase; MI: Moderate Increase; NE: No Effect; MD: Moderate Decrease; SD: Strong Decrease; DK: Don’t Know]
Table 6 A comprehensive list of factors influencing pull request acceptance decisions
Table 7 Pull request data generated
Table 8 Logistic regression model of factors influencing pull request acceptance decision [AUC: 0.7]
Table 9 Deviance explained by factors influencing pull request acceptance decision
Table 10 Submitters’ perception on bias - location-wise [AUC: 0.9]
Table 11 Integrators’ perception on pull request quality and geographical location [AUC: 0.6]
Table 12 Integrators’ perception on ease to work with the same location developers [AUC: 0.7]
Table 13 Integrators’ perception on encouraging developers from same geographical location [AUC: 0.6]
Table 14 Classification of Forks [PR: Pull Request; MF: Master Fork; PF: Primary Fork; SF: Secondary Fork]
Table 15 Basic statistics of GitHub dataset
Table 16 Classification of MF-PF pairs (or projects) developed within and outside GitHub as decrease, increase, and no change in sustainability of the developer community after forking. Each column entry is a summary (mean ± standard deviation) of MF-PF pairs having 3, 6, 9, and 12 months of existence before and after forking
Table 17 Influence of project characteristics on emergence of significant PF. Team size segments on contributors’ count: Small < 11; 10 < Medium < 30; 30 < Large
Table 18 Influence of project characteristics on decrease in developer community participation in the original project. Team size segments on contributors’ count: Small < 11; 10 < Medium < 30; 30 < Large
Table 19 (RQ1) Differences in the personality traits of contributors across projects with high (H), medium (M), and low (L) levels of contributions
Table 20 (RQ2) Differences in the personality traits of contributors across projects based on whether they are a project member (PM) or not a project member (NPM)
Table 21 (RQ3,5,6) Differences in the personality traits of contributors with time (Year 1, Year 2 and Year 3), role (Reporter Rep and Reviewer Rev) and project membership (Project Member PM and Non Project Member NPM)
Table 22 (RQ4) Differences in the personality traits of contributors while discussing issues (I), pull requests (P) and commits (C)
Table 23 Role based individual contribution pattern
Table 24 Experimental Dataset 1
Table 25 Experimental Dataset 2
Table 26 Experimental Dataset 3
Table 27 Defining the … of the survey
Table 28 Survey results for the role of bug reporter
Table 29 Survey results for the role of bug owner
Table 30 Survey results for the role of bug collaborator
Table 31 Survey results for the role of bug triager
Table 32 Priority-Time correlation statistics


Table 33 Analysis and comparison of visualizations
Table 34 Survey Results [PoR=Percentage of Relevance; CPoR=Cumulative Percentage of Relevance]
Table 35 Model-I: Simple Statistical Model [SD=Standard Deviations; Conf=Confidence Interval; Obs=Observations for next three time intervals]

1 INTRODUCTION

Software is everywhere and touches all aspects of human lives in one form or the other. The relevance of software in today’s world can be gauged by its current and ever-increasing market size. According to the industry analyst Gartner, the worldwide software market size in 2016 was expected to be US$326 billion¹, a 5.3% growth over the previous year, 2015. The relevance of the industry and its tremendous impact make it important to optimize software productivity.

Software productivity is impacted by three main dimensions - process, people and tools/technology [1]. Much of the earlier work on productivity improvement focused on process and tools/technology. Studies on process focused on individual processes [2] or the overall process of a project or an organization for improvements [3]. Studies on tools/technology evaluated the usefulness of tools/technology for improving software productivity [4]. Relatively less focus was given to people aspects.

People are important assets in software projects, as software productivity depends on the productivity of the contributors. This realization dates back to the 1970s, when Fred Brooks in ‘The Mythical Man-Month’ argued that technical issues are subordinate to human issues when explaining the quality of software development [5]. Further, Gerald Weinberg in ‘The Psychology of Computer Programming’ addressed “programming as human performance and social activity” [6]. However, there is not much information to explain how software contributors perform their tasks in the workplace [7].

Recently, studies in software engineering – an intensely people-oriented activity – have increasingly explained software productivity in terms of contributor productivity. The objectives of the studies on contributor productivity are to fully utilize the capabilities of contributors and to improve contributor productivity. To utilize the capabilities of contributors, studies measure contributor productivity [8][9][10]. These measures of contributor productivity are useful for performance appraisal, identifying expertise, generating team awareness, identifying strengths and areas of improvement for training, etc. On the other hand, to improve contributor productivity, studies try to understand the influence of factors like the initial environment [11], diversity [12], etc. on contributor productivity. This information is useful to promote best practices and to mitigate or eliminate the negative influences of factors.

One approach to contributor productivity studies is based on the experiences of seasoned software professionals. The experience reports, processes, models, etc. generated by these professionals are valuable and provide a deeper understanding of contributor productivity. In modern software organizations, practitioners often rely on prior experiences, intuitions and gut feelings gathered from their stint in software development for decision making [13]. However, this approach fails to provide evidence, deep models, and analytical tools to inform decisions. Also, as software continues to grow (in size, complexity, etc.) and evolve (in terms of the tools, technologies, etc. being used) rapidly, it becomes imperative to explain contributor productivity based on well-formed claims. There is a need for a transparent, data-driven and objective way to substantiate software contributor productivity. These insights can be realized through empirical studies. Empirical studies provide a way to move towards well-founded decisions.
They compare what we believe to what we see and are capable of providing a deeper level of understanding of software development. This thesis uses Empirical Software Engineering approaches based on Mining Software Repositories to help improve contributor productivity through quantitative and qualitative analyses of the data collected and generated during software development. This thesis empirically analyses software contributor productivity to identify best practices and areas for improvement. The empirical evidence gathered in these studies enables decision makers to promote best practices and take corrective or preventive measures to mitigate or eliminate the negative effects of areas for improvement. The informed decisions supported by empirical evidence have applications in improving software contributor productivity and software productivity in general.

This thesis looks at software contributor productivity from several angles. Primarily, this thesis investigates the influence of some factors on software contributors’ productivity, but in a related way, it also proposes measures of software contributor productivity. These measures of software contributor productivity are used to provide a holistic understanding of developer contributions as an individual and in a team to identify outlier contribution patterns arising from factors such as malpractices.

1 http://www.gartner.com/newsroom/id/3186517

This holistic understanding facilitates productivity improvements by generating awareness about the prevailing good or bad developer contribution patterns.

In this chapter, we first give a brief background of Empirical Software Engineering, Mining Software Repositories and Software Contributor Productivity. The background information on Empirical Software Engineering and Mining Software Repositories is gathered from various sources, primarily the books ‘Guide to Advanced Empirical Software Engineering’ [7] and ‘Basics of Software Engineering Experimentation’ [14], and the research papers ‘Future of Empirical Methods in Software Engineering Research’ [15], ‘Selecting Empirical Methods for Software Engineering Research’ [16], ‘Empirical Studies of Software Engineering: A Roadmap’ [17] and ‘The Road Ahead to Mining Software Repositories’ [13]. This is followed by an overview of the contributions of this thesis, the research method used, and details of each contribution.

1.1 empirical software engineering

Empirical studies validate existing beliefs in software engineering by providing evidence [14]. These studies explore, describe, explain and predict phenomena using evidence from observation or experience [15]. Empirical studies start with specifying a research question. Based on the type of research question asked, a study design is proposed. To execute the proposed design, the required data or evidence are gathered and then analyzed for interpretation [15]. Broadly, there are two approaches to conducting empirical studies: quantitative analysis and qualitative analysis.

Quantitative research applies mathematical models to numerical data to get formal results [7]. It collects numerical data and looks for relationships between the variables under examination [7]. For instance, a quantitative study analyzing the impact of two tools on programmer productivity measures changes in programmer productivity.

Qualitative research, on the other hand, is done with words by observing objects in their natural settings [7]. The observations are then combined to compare, contrast, analyze and identify patterns [7]. The intention of conducting qualitative research is to get a holistic overview of the context under study [7]. For instance, a qualitative study analyzing the influence of two tools on programmer productivity examines how closely the logic of each tool matches human reasoning.

Quantitative investigations yield more justifiable and formal results than qualitative inquiries [7]. However, qualitative studies are necessary for comprehensively defining the full body of knowledge of any discipline [7]. Quantitative and qualitative analyses are complementary in nature [7]. Importantly, the concepts of subjectivity and objectivity are not necessarily correlated with either of these types of investigations [7].
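To make the quantitative side concrete, the short sketch below compares programmer productivity under two tools with a non-parametric statistical test. It is only an illustrative sketch: the tool groups, the productivity numbers and the choice of the Mann-Whitney U test are assumptions made for this example, not data or analyses from this thesis.

# Illustrative sketch: quantitative comparison of programmer productivity
# under two hypothetical tools (values are invented for demonstration).
from scipy.stats import mannwhitneyu

productivity_tool_a = [12, 15, 9, 14, 11, 16, 13]  # work items resolved per week
productivity_tool_b = [10, 8, 11, 9, 7, 12, 10]

stat, p_value = mannwhitneyu(productivity_tool_a, productivity_tool_b,
                             alternative="two-sided")
print(f"U = {stat}, p = {p_value:.3f}")
if p_value < 0.05:
    print("Productivity differs significantly between the two tool groups.")
else:
    print("No statistically significant difference detected.")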

1.1.1 Research Methods

The choice of method for a study depends on various factors, like the availability of resources, access to subjects, the opportunity to control variables of interest, etc. [16]. Further, a variety of methods can be applied to any research problem, and often a combination of methods is necessary to fully understand the problem. Below we describe the five classes of research methods that are most relevant to software engineering [16]:

Controlled Experiments (including Quasi-Experiments)
A controlled experiment investigates a testable hypothesis to look for a cause-effect relationship [16]. In such an experiment, independent variables are manipulated to measure the effect on the dependent variable, while variables other than the chosen independent variables are kept constant (control variables) [16]. These experiments use an initial random assignment and are conducted under strictly controlled conditions [16]. A close variant of the controlled experiment is the quasi-experiment, where “subjects are not assigned randomly to treatments” [16]. Quasi-experiments are used where true experiments are not possible [16], for instance, in field studies. In quasi-experiments, the comparison happens between non-equivalent groups [16]. Thus, the task of interpreting quasi-experiments is separating the effects of a treatment from those due to the initial incomparability between the average units in each treatment group [18]. Here, the strength of the evidence is related to the degree of control we have in the studies we perform [16].

A quasi-experiment is less powerful than a true experiment and requires more careful interpretation [16]. The strength and limitation of this technique is that it is theory driven [16]. The good thing about this approach is that we do not search for results [16]. Conversely, since we build on theory, we might ignore variables that hold relevance outside the laboratory setting [16].

Case Studies (both exploratory and confirmatory)
A case study is an “empirical enquiry that investigates a contemporary phenomenon within its real-life context especially when the boundary between the phenomenon and context are not clear” [16]. Case studies offer an in-depth understanding of the mechanism by which a cause-effect relationship occurs [16]. There are two types of case studies: exploratory and confirmatory [16]. Exploratory case studies are used as “initial investigations of some phenomenon to derive new hypotheses and build theories” [16], while confirmatory case studies “test existing theories” [16]. Confirmatory case studies are important for refuting theories and choosing between rival theories [16]. For conducting a case study it is important to have “a clear research question of how and why a certain phenomenon occurs” [16]. A case study is most appropriate where the context is expected to play a role in the phenomenon, and hence controlled experiments are inappropriate [16]. A major weakness of case studies is that the collection and analysis of data are open to various interpretations and researcher bias [16]. Hence, an explicit framework is required for the selection of cases and data collection [16]. Also, while an individual case study often reveals deep insights, the validity of the results depends on subsequent case studies that support or refute the theory [16].

Survey Research
Survey research “identifies the characteristics of a broad population of individuals” using questionnaires, structured interviews or data logging [16]. Two key concerns in survey research are 1) selecting a representative sample from a defined population and 2) the data analysis techniques used for generalization [16]. A precondition for conducting survey research is a clear research question that explores the nature of the target population [16]. To select a representative subset of the population, stratified sampling techniques are used to ensure that no one community is over-represented [16]. The challenge of conducting survey research is to control for sampling bias, which determines the generalizability of the results [16]. Also, low response rates increase the risk of bias [16]. Another challenge is to design questions in a way that yields “useful and valid data” [16]. Phrasing questions that are understandable to a diverse population is difficult [16]. Moreover, there is a possibility that what people do and what they say are different due to their inability to introspect reliably on their work practices [16]. So, it is generally encouraged to compare survey results with other empirical methods [16].
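As a concrete illustration of the sampling step described above, the sketch below draws a proportional stratified sample from a hypothetical population of survey candidates. The strata labels (“integrators”, “submitters”), the population sizes and the sample size are assumptions made purely for illustration.

# Illustrative sketch: proportional stratified sampling of survey respondents.
import random

# Hypothetical sampling frame: each candidate belongs to one stratum.
population = (
    [{"id": i, "community": "integrators"} for i in range(200)]
    + [{"id": i, "community": "submitters"} for i in range(200, 1200)]
)

def stratified_sample(people, key, n, seed=42):
    random.seed(seed)
    strata = {}
    for person in people:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        share = round(n * len(members) / len(people))  # proportional allocation
        sample.extend(random.sample(members, min(share, len(members))))
    return sample

sample = stratified_sample(population, "community", n=120)
print({c: sum(p["community"] == c for p in sample)
       for c in ("integrators", "submitters")})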

Ethnographies
Ethnography focuses on field observations to “study a community of people to understand how the members of the community make sense of their social interactions” [16]. This helps to understand “how technical communities build a culture of practices and communication strategies that enables them to perform technical work collaboratively” [16]. Ethnography does not build on any theory; rather, it creates local theories, using the description of the community to understand its culture [16]. For such studies, researchers have to explicitly consider their preconceptions [16]. A precondition for such studies is a research question focused on understanding the “cultural practices of a community and access to the members of that community” [16]. To identify target members, usually representative members are identified and chain sampling is used to select further members [16]. A special form of ethnography is participant observation, where researchers become members of the community for the duration of the analysis [16]. For this, the researcher should have the technical competency required to become a member of the project, and the study requires a much longer duration [16]. The biggest challenge for ethnographic research is to avoid preconceptions while performing observation, data collection and analysis [16]. Researchers are expected to obtain a high degree of training to conduct such studies [16].

Action Research
In action research, researchers attempt to solve a real problem by intervening in the studied situation to improve it [16]. A precondition for action research is to have a problem owner – most of the time the researchers themselves – who identifies the problem and tries to solve it [16]. The researcher is expected to critically reflect “upon their past, current and planned actions to identify what helped (or not) to solve the problem” [16]. Two key criteria for conducting this kind of study are judging the authenticity of the problem and the knowledge outcomes for the participants [16]. The key characteristic of this research method is to bring real change and solve the problem iteratively [16]. Importantly, no attempt is made to establish a control group; rather, the focus is on identifying helpful lessons [16]. The biggest challenge for action research is its immaturity as an empirical method because of the lack of frameworks for its implementation [16]. Finally, this method may be expensive due to the required commitment from the organization [16].

Mixed-Methods Approach
Mixed-methods research originates from the stance that “all methods have limitations and that the weakness of one method can be compensated for by the strengths of other methods” [19]. This approach uses both quantitative and qualitative data collection and analysis techniques [16]. The mixed-methods approach is powerful; however, the researcher is required to 1) collect extensive data, 2) analyze multiple sources of data and 3) be familiar with both quantitative and qualitative forms of research [16]. The three most commonly used strategies for conducting mixed-methods research are:

1. Sequential explanatory strategy: This strategy is characterized by “collection and analysis of quantitative data followed by the collection and analysis of qualitative data” [16]. The purpose of this strategy is “to use qualitative results to assist in explaining and interpreting the findings of a quantitative study” [16]. It is useful when unexpected results arise [16].

2. Sequential exploratory strategy: This strategy is characterized by “collection and analysis of qualitative data followed by the collection and analysis of quantitative data” [16]. The purpose of this strategy is “to use quantitative data and results to assist in the interpretations of qualitative findings” [16]. It is useful for examining a proposed theory formulated as the outcome of a qualitative study [16].

3. Concurrent triangulation strategy: This strategy is characterized by using “different methods concurrently to confirm, cross-validate or corroborate findings” [16]. It is motivated by the fact that ‘what people say’ and ‘what people do’ could be different [16]. So, in this strategy, researchers collect data from multiple sources, which helps improve validity [16].

1.1.2 Data Collection Techniques

Based on the level of interaction with software engineers, data collection techniques can be classified as direct, indirect and independent [7]. Selecting an appropriate technique is influenced by the question asked and the amount of resources available to conduct the study.

• Direct: Researchers directly interact with the participant population [7]. Direct data collection techniques can be further classified as inquisitive and observational, based on the type of information gathered [7]. Inquisitive techniques help obtain a “general understanding of the software engineering process” [7]. Examples of inquisitive techniques are brainstorming, focus groups, interviews, questionnaires, etc. Observational techniques provide a “real-time portrayal of the studied phenomenon” [7]. This technique is subjective and requires considerable knowledge to interpret correctly [7]. Further, it always runs the risk of the Hawthorne effect [7]. Examples are: participant observation, shadowing and observation, think-aloud sessions, work diaries, etc.

• Indirect: Researchers have access to the participants via direct access to their work environment [7]. There is no need for direct contact between the participant and the researcher [7]. The data is automatically gathered once the data collection is initiated [7]. This technique involves little or no intervention of participants and is ideal for longitudinal studies [7]. Examples are: instrumenting systems, fly-on-the-wall observation, etc.


• Independent: Researchers have access to the work artifacts of software developers, which are used to uncover information about how software engineers work [7]. These artifacts are generated as a by-product of software development and can act as a primary source of information [7]. Examples are: source code, documentation, etc.

Independent data collection techniques

Independent data collection techniques help “uncover information about how software engineers work by looking at their output and by-products” [7]. In this thesis, we mostly use independent data collection techniques. We briefly discuss the various types of independent data collection techniques and the advantages and disadvantages of analyzing them.

Analysis of electronic databases of work performed
Most large organizations manage the work performed by developers using issue trackers, problem reporting, change request, and configuration management systems [7]. The data generated by such systems are a rich source of information for software engineering researchers [7]. Below are the advantages and disadvantages of analyzing these databases [7]:

• Advantage
– A large amount of data is readily available for analysis.
– The data is stable and is not influenced by the presence of researchers.

• Disadvantage
– There is very little control over the quantity and quality of information manually entered about the work performed.
– It is difficult to gather additional information.

Analysis of tool logs
Software systems generate logs in various forms [7]. For instance, automated build tools leave records, as do source code control systems.

• Advantage
– The data is already in electronic form, ready for coding and analysis.
– The logged behavior is part of software engineers’ normal work routine.

• Disadvantage
– Companies use different tools in different ways, so it is difficult to gather data consistently across organizations.

Documentation Analysis
This technique focuses on “the documentation generated in the source code as well as documents describing the software system” [7]. Other sources of documentation are local newsgroups, group e-mail lists, memos, documents defining the development process, etc.

• Advantage
– Documents capture conceptual information about the system. They also act as an introduction to the software and the team, and provide low-level information about algorithms and data.

• Disadvantage
– Reading documentation is time consuming and requires a level of understanding.
– Documentation may be inaccurate.

Static and Dynamic Analysis of the System
This technique analyzes “the code (static analysis) or traces generated by running the code (dynamic analysis) to learn design and understand how software engineers work” [7].

• Advantage
– A large amount of information is readily available and ready to be mined.

• Disadvantage
– Parsers and other tools are needed to extract useful information from source code, and in their current form these are not of high quality.

One field of research that analyzes historical software development activities to conduct empirical studies is Mining Software Repositories. The field of Mining Software Repositories automates various activities associated with empirical studies in software engineering. This automation facilitates replication on a large number of projects, which helps generalize the findings and validate the theories.

1.1.3 Mining Software Repositories

Software development is facilitated by various tools. These tools generate artifacts that are archived for future reference. These artifacts are termed software repositories. Software repositories are static record-keeping repositories that are maintained centrally to manage the progress of software projects [13]. Software repositories contain a wealth of valuable information about projects. The information stored in software repositories provides data-driven and objective measures based on historical and field data, and thus relies less on intuition and experience [13].

Mining Software Repositories (MSR) describes the broad class of investigations into the examination of software repositories. The MSR field “analyzes and cross-links the rich data available in these repositories to uncover interesting and actionable information about software systems” [13]. MSR researchers aim to transform these repositories from static record-keeping ones into active repositories which can guide decision processes in modern software projects [13]. Going by the management adage “What you cannot measure, you cannot control”, Mining Software Repositories provides data-driven evidence in an objective and transparent manner. The focus of mining software repositories is on [13]:

• Techniques to extract information from repositories.
• Discovery and validation of novel techniques and approaches to mine information.

Types of Repositories
There are three types of software repositories: historical, run-time and code repositories [13]. Below we discuss the three types of repositories in detail:

historical repositories Historical repositories contain information about the evolution, progress and current state of a project [13]. For instance, bug repositories (also known as issue tracking systems), source control repositories, peer code review systems, version control systems and archived communications (such as mailing lists, threaded discussions and IRC).
1. Bug Repositories: Also termed defect tracking systems, these keep track of reported software bugs in software development projects. A bug repository is a type of issue tracking system. For instance, the Google Chromium Issue Tracking System, Bugzilla, Jira, etc.
2. Version Control Repositories: This repository records the development history of a project. It tracks all the changes to the source code along with the meta-data about each change. For instance, Git, Mercurial, Subversion, Perforce, etc.
3. Peer Code-review Systems: A system used to examine documentation, code, etc. by authors and one or more colleagues to evaluate its technical content. For example, Rietveld, Gerrit, GitLab, etc.
4. Archived Communications: This repository tracks the communications of a team involved in software development. For example, mailing lists, IRC chats, instant messages, etc.

run-time repositories Run-time repositories contain “information of the execution and usage of an application at one or multiple deployment sites” [13]. The availability of these logs is on the rise due to their use in remote issue resolution and due to legal acts. For instance, deployment logs, build warnings, test results, etc.

code repositories Code repositories contain “source code of various applications developed by several developers” privately or publicly [13]. They are often used by open source software and multi-developer projects to handle versions. These websites often support version control, bug tracking, release management, mailing lists, and wiki-based documentation. For instance, Google Code, SourceForge, PicoForge, GitHub, etc.
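As a small illustration of mining a historical repository, the sketch below counts commits per author from a local Git clone, one of the simplest measures that can be extracted from a version control repository. The repository path and the use of the git command line are assumptions for this example; the analyses in this thesis rely on richer data than raw commit counts.

# Illustrative sketch: count commits per author in a local Git repository.
import subprocess
from collections import Counter

def commits_per_author(repo_path="."):
    # "%an" asks git log to print only the author name of each commit.
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:%an"],
        capture_output=True, text=True, check=True,
    ).stdout
    return Counter(name for name in log.splitlines() if name)

for author, count in commits_per_author().most_common(10):
    print(f"{count:5d}  {author}")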

Strengths and Limitations of Mining Software Repositories
One strength of the MSR field is that the cost of experimenting with MSR techniques is usually low, as the data is readily available. However, there are some limitations to mining repository data [13]:

• Repository data cannot be used to conclude causation; it can only show correlation.

• Findings must be investigated closely within the context of the studied project or system. Project and system context are very important to reveal the true cause of particular findings. For example, some developers may be more likely to perform buggy changes because they are usually assigned more complex changes, not because of their skill level.

Thus, the limitations of the repository data should be closely examined and communicated when presenting results to avoid misinterpretations.

1.1.4 Impact of Empirical Software Engineering

The applications of Empirical Software Engineering span the phases of software development and address a wide range of problems. From requirements analysis to maintenance, empirical software engineering provides solutions to correctly specify requirements [20], enable correct design decisions [21], support efficient and quality software development [22], enable cost-effective maintenance of software [23], etc.

Studies using Empirical Software Engineering provide solutions from various perspectives. From a process perspective, it helps understand differences between actual and observed processes, bottlenecks in development [24], etc. It helps evaluate the usability and effectiveness of tools and techniques in various contexts of software development [25]. It helps understand how people with distinctive characteristics interact with the team and environment to perform [11]. Empirical studies also provide tools that help people in their work [26]. A combination of these diverse perspectives helps towards a better understanding of software development for a quality product, a satisfied workforce, efficient management, etc.

1.2 software contributor productivity

Besides technical improvements, attempts to improve software productivity are directed towards utilizing and improving contributor productivity. These two objectives form the basis of studies on software contributor productivity. Studies on utilizing software contributor productivity estimate the potential of contributors in terms of their development activities and use this information for assigning work, providing necessary training, compensation, etc. Another class of studies looks into factors that facilitate or inhibit participation and uses this information to plan actions to improve software contributor productivity. Thus, approaches to improve software contributor productivity can be broadly classified into two classes: measuring contributor productivity and understanding contributor productivity. Below we discuss the various perspectives with which studies have measured and explained contributor productivity. The list of perspectives presented in this thesis is not exhaustive and is closely related to the research contributions of this thesis.

1.2.1 Measuring Contributor Productivity

Assessment of contributions is a standard practice in organizations that serves the needs of various roles. Managers and other decision-making roles inspect contributions to identify expertise and areas of improvement, for resource management, individual assessment, team assessment, recruitment and promotions, and to study adversarial behavior, turnover, retention plans, training programs, etc. Contributors, in contrast, look to this information for self-reflection, team awareness, etc. Below we describe some of the applications of measuring contributor productivity.

Evaluation
Evaluating contributor productivity is important for self-reflection, performance appraisal, identifying strengths and areas of improvement, etc. Traditionally, estimating software contributor productivity for evaluation was based on the experiences of managers and the self-assessment of individuals. To bring objectivity to these measures, researchers realized the potential of mining software repositories. Existing studies mined a variety of software repositories [27][28][29] and proposed simple measures indicating contributor participation [8][30]. The proposed measures also accounted for the distributed development environment [31] and were used in conjunction with traditional software development activities [32] to evaluate contributors. Fernandes et al. proposed an analytical model and conducted a practical case study in the context of a distributed software development setting [33]. Kanij et al. pointed out the lack of well-established metrics for software testers and conducted a survey of industry professionals to identify important factors for measuring the performance of software testers [34]. Kidane et al. proposed two productivity indices, a creativity index and a performance index, for online communities of developers and users of open source projects such as Eclipse [35]. Gousios et al. presented a list of pre-defined developer actions for measuring developer involvement and activity by mining software repositories such as the source code repository, document archives, mailing lists, discussion forums, the defect tracking system, Wiki and IRC [36]. Nagwani et al. proposed a team-member ranking technique for software defect archives that generates a ranked list of developers based on individual contribution [37]. Gilbert et al. explored group dynamics to compare developers and their contributions in a distributed software community using a social visualization of code activity [9].
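To illustrate the kind of measures discussed above, the sketch below computes two simple role-based indicators (bugs fixed and median fix time per bug owner) from an issue-tracker export. The file name issues.csv and the column names owner, opened and closed are hypothetical; they stand in for whatever schema a real bug repository exposes.

# Illustrative sketch: simple per-owner measures from a hypothetical issue-tracker export.
import csv
from collections import defaultdict
from datetime import datetime
from statistics import median

def owner_metrics(path="issues.csv"):
    per_owner = defaultdict(list)
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):
            if not row["closed"]:
                continue  # skip bugs that are still open
            opened = datetime.fromisoformat(row["opened"])
            closed = datetime.fromisoformat(row["closed"])
            per_owner[row["owner"]].append((closed - opened).days)
    return {owner: {"bugs_fixed": len(days), "median_fix_days": median(days)}
            for owner, days in per_owner.items()}

print(owner_metrics())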

Authorship Analysis
In large collaborative software projects, it is important to identify who did what, to find an expert for a task, generate team awareness, detect plagiarism, etc. Authorship analysis includes author discrimination, author characterization and similarity detection. Author discrimination identifies which of a known set of authors has written a new piece of source code. Author characterization associates contributor characteristics with the source code. Finally, similarity detection analyzes the degree of overlap in code. Existing studies proposed several numeric metrics [38] and visualizations to analyze authorship [9]. Taylor et al. introduced the concept of author entropy to characterize authorship. Author entropy in conjunction with other software metrics has the potential to identify areas of concern within source code [38]. Krein et al. further introduced the concept of language entropy to characterize the distribution of an individual’s development efforts across multiple programming languages [39]. Further, more subjective elements of authorship, based on fuzzy-logic linguistic variables in combination with objective counts, improve authorship analysis [40].

Various visualization tools have been proposed for authorship analysis. The CVSscan tool shows the source code structure and its evolution during software maintenance; it colors source code according to author, age or code construct [25]. StatSVN generates charts, tables and statistics for software development and shows the contributions of authors at the directory level [41]. CodeSaw [9] visualizes author activity in distributed software development by combining code activity and developer communication. This tool reveals group dynamics by presenting individual and team contributions.
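Since author entropy is central to the paragraph above, the sketch below shows one common formulation: for a file, p_i is the fraction of lines contributed by author i and the entropy is H = -Σ p_i log2 p_i. The per-author line counts are invented for illustration; this is a generic Shannon-entropy computation, not necessarily the exact tooling used by Taylor et al.

# Illustrative sketch: author entropy of a file from per-author line counts.
from math import log2

def author_entropy(line_counts):
    total = sum(line_counts.values())
    probabilities = [count / total for count in line_counts.values() if count > 0]
    return -sum(p * log2(p) for p in probabilities)

# A file owned entirely by one author has entropy 0; even ownership maximizes it.
print(author_entropy({"alice": 120}))                         # 0.0
print(author_entropy({"alice": 60, "bob": 60}))               # 1.0
print(author_entropy({"alice": 50, "bob": 30, "carol": 20}))  # ~1.49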

Team Activity Awareness
Various visual presentations have been proposed to generate awareness of the activities of fellow contributors and to get a holistic view of team activities. Existing studies presented various types of visualizations, namely spatial [10], temporal [42], etc. They analyzed tasks and social graphs [42] and devised ways to analyze collaborative development [43]. Other studies semantically aggregated tool feeds and presented the results [44]. Storey et al. proposed a framework to describe, compare and understand human-centric awareness visualization tools in software development [26].

The SeeSoft tool provides a line-based view of the code where rows are colored to represent a particular aspect of the code file [45]. CodeThumbnails provides a miniature view of the spatially arranged code, with the active sections of the files highlighted [46]. TeamTracks keeps a record of past code navigations to enable programmers to build on the actions of others [47]. FastDash provides real-time awareness of the activities of fellow contributors [10]. Several tools convey the status of shared artifacts. The Community Bar tool allows collaborators to see shared resources [48]. SEAPort extends these capabilities to multiple shared and personal devices [49]. CoWord [50] and Network Text Editor [51] facilitate collaborative work within a document.

Project Stability
A large number of developers leave a project in subsequent releases. Packages left orphaned (as a result of developers leaving the project) are either adopted (by currently active developers) or lost. Packages thus lost are not included in the stable release, leaving users unsupported for their unique functional needs [52]. Robles et al. added a time dimension to the onion model (the organizational structure of libre software projects) to study the joining and leaving process in organizations. A quantitative study of 21 large, well-known projects shows that the majority of libre software projects have several generations of developers (over time) and very few projects are led by their founding core groups [53]. Robles et al. also proposed a novel methodology to visually analyze the evolution of the core team to ensure smooth transitions by identifying breakpoints and unevenness [54].

A majority of open source software projects fail due to their inability to garner significant developer participation. Various studies analyzed the causes of turnover and its impact. Hall et al. conducted an empirical study of 89 software practitioners to identify causes of turnover in software projects that hamper project success. The results suggest that an increase in the motivation of developers reduces staff turnover [55]. Sharma et al. proposed a logistic hierarchical linear model approach to analyze turnover in open source software based on individual developer-level factors and project-specific factors. The analysis showed significant variations in turnover rates among projects, with past activity of the developer, developer role, project size and project age as prime indicators [56]. Schilling et al. suggested that team integration (measured as conversations on the mailing list) facilitates contributor retention in open source software projects. They expressed team integration as a function of the spatial, temporal and cultural distances of open source software developers and validated the results using subjective and objective means [57].
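One simple way to make the turnover discussed above measurable is sketched below: the share of contributors active in one period who are no longer active in the next. The period labels and names are invented, and this illustrative formulation is not necessarily identical to the Attrition Rate metric proposed later in this thesis (Chapter 5).

# Illustrative sketch: fraction of previously active contributors who left.
def attrition_rate(active_previous, active_current):
    if not active_previous:
        return 0.0
    left = active_previous - active_current
    return len(left) / len(active_previous)

release_n  = {"alice", "bob", "carol", "dave"}  # active in release N
release_n1 = {"alice", "carol", "erin"}         # active in release N+1
print(f"Attrition: {attrition_rate(release_n, release_n1):.0%}")  # 50%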

1.2.2 Understanding Contributor Productivity

The productivity of software contributors can be explained in terms of their willingness, capacity and opportunity [58]. Willingness to perform is influenced by contributors’ motivation, job involvement, personality, feelings of equity, etc. [58]. Variables of capacity – age, ability, knowledge, level of education, etc. – are seen to influence contribution [58]. Similarly, variables of opportunity – working conditions, actions of coworkers, leader behavior, mentoring, organizational policies, time, etc. – are seen to influence contribution [58]. In this section, we discuss the factors that are seen to influence software contributor productivity, where contributor productivity is measured as contributions and participation intentions.

The work by Zhou et al. showed the influence of willingness, capacity and opportunity on the long-term contribution intentions of newcomers [59]. They found that during the first month, long-term contributors show a more community-oriented attitude. They found that newcomers who start by commenting on or reporting an issue that is subsequently fixed are more likely to become long-term contributors. They also found that a very popular project and less attention from peers reduce the chances of participation, while working with clustered and productive peers increases the chances of future participation.

Open source development draws on a diverse set of motivations, many of which are based on external rewards. Internal factors like intrinsic motivation, altruism, and identification with a community influence the motivation for participation in open source projects [60]. External factors like direct compensation and anticipated returns influence the motivation for participation in open source projects [60].

Figure 1: Research Method - Mixed methods approach comparing and contrasting the observations of quantitative analysis and qualitative analysis

Factors that promise future monetary rewards and a personal need for a software solution were other key factors. Another study showed that a software developer’s ability and individual need for achievement are the two strongest factors determining individual performance [61]. A longitudinal study on Apache projects showed that a sense of obligation to the community and use value also influence participation [62]. This study proposed a theoretical model and evaluated it on Apache projects. Person-job and person-team fit also explain participation behavior [63]. Other studies examined the influence of the demographic attributes gender and tenure on productivity [64] and explored gender bias in evaluating contributors in GitHub [65]. The influence of communication [66], personality [67], spatial, temporal and cultural distance [68][57], diversity [12] and various other project and contributor characteristics [56][59] on contributor productivity has also been studied.

1.3 thesis overview

This thesis derives its results from detailed studies on employee productivity at Microsoft, the participation of contributors in GitHub, and measures of contributions of the Google Chromium software maintenance team. The study of large, popular product groups at Microsoft reveals factors that influence the ramp-up journey of new hires. The studies on GitHub analyze the dynamics of competing projects, personality traits, and perceptions of bias to explain the participation of contributors. The studies on the Google Chromium project propose measures of contributions for the software maintenance team. Each chapter in this thesis captures a different perspective from which we can help improve the productivity of software contributors. The chapter-wise contributions of this thesis are:

1. Chapter 2 presents a comprehensive list of aids and impediments in the ramp-up journey of new hires at Microsoft. The results of this study act as a guide for managers to ensure that their new employees reach the productivity levels of existing employees faster. Further, it helps faculty in academia understand the skills needed in the software industry [69][70].

2. Chapter 3 looks for the potential presence of bias, which influences the sense of equity and hence developer participation. This chapter presents an analysis of the perceptions and reality around the presence of bias in peer code review in GitHub. The results of this study generate awareness among software contributors that a bias exists in the peer code review process, one that is perceived by the contributors but not by the reviewers [71].

3. Chapter 4 explains the influence of various contributor and project characteristics on contributor participation in open source projects. This chapter helps managers evaluate the sustainability of their project based on the dynamics of its competing projects. It also helps managers understand the intricacies of the team by explaining the personality types that contribute more and are best suited for a role. Further, it provides information on the role of contributor characteristics in their participation [72][73][74].

4. Chapter 5 provides robust, data-driven and role-based measures of contributor productivity in the maintenance team of the Google Chromium project. This study provides measures of individual and team contributions and facilitates the analysis of contributions in the context of software development. These measures and visualizations help contributors in self-reflection and managers in evaluating contributors, identifying strengths, areas of improvement, outlier behavior, etc. [75][76][77][78].

This thesis uses a mixed-methods approach [19][16] to take advantage of the unique strengths and mitigate the limitations of quantitative and qualitative analysis. Figure 1 summarizes the approach used in this study. To understand software development activities, we gather the experiences of software professionals and model the software development activities by mining software archives and their associated metadata. We analyze data in various software repositories, such as issue tracking systems and version control systems, to model the software ecosystem and test hypotheses statistically. To get a deeper understanding of the phenomenon, we conduct surveys or interviews and build on the literature.

For a survey or interview, we select respondents with the desired characteristics by analyzing software repositories or through personal relations. We conduct a small-scale interview or a pilot survey and incorporate the feedback received to refine the main survey. The main survey is sent to all prospective respondents, and the results of the survey are statistically analyzed for interpretation. The other approach is to gather insights from the literature. Studies in management science, social science and software engineering explain contributor productivity; we use observations from these domains to formulate our initial hypotheses or cross-validate our observations. Finally, we compare and contrast the observations from quantitative and qualitative analysis to draw inferences. The contributions of this thesis are summarized in Figure 2 and are briefly discussed below:

1. Factors Influencing the Ramp-up Journey of New Hires - Chapter 2

Hiring top talent is essential for any software company's success. After joining the company, new hires often spend weeks or months before making any major contribution and attaining the same productivity level as existing employees. The ramp-up journey is the transition of new hires from novices to experts. Several factors, such as lack of experience or lack of familiarity with processes unique to the new company, can influence the ramp-up journey. To understand such aids and impediments, we analyzed data extracted from the version control systems of eight large and popular product groups at Microsoft with several thousand software developers. In particular, we study two aspects of the ramp-up journey: first, the time taken to make the first check-in into the version control system, an important milestone indicating the first contribution; and second, the time taken to reach the same productivity level as existing employees in terms of check-ins. Further, the results of the quantitative study are augmented with qualitative results derived by surveying 411 professional developers. Our study produces promising results, including factors such as unavailability of resources, having a mentor, prior knowledge of required skill sets, proactively asking questions, and career stage, that could help improve the ramp-up journey of new hires.

Factors ranging from 'unavailability of resources' to 'career stage' influence the ramp-up journey of new hires.

2. Presence of Geographical Bias in Peer Code Review Process - Chapter 3

Open source development has often been considered to be a level playing field for all developers. But there has been little work to investigate whether bias plays any role in getting contributions accepted and in developers' perceptions of the bias.

Figure 2: Overview of the research contributions of this thesis

We present a study, one of the largest of its type, to understand the perceptions and reality of the influence of geographical location on the evaluation of pull requests in GitHub, one of the primary open source development platforms. Using a mixed-methods approach that analyzes 70,000+ pull requests and 2,500+ survey responses, we find a bias blind spot. Data analyses showed that geographical location explains differences in pull request acceptance decisions. Compared to the United States, submitters from the United Kingdom (22%), Canada (25%), Japan (40%), the Netherlands (43%), and Switzerland (58%) have higher chances of getting their pull requests accepted, whereas submitters from Germany (15%), Brazil (17%), China (24%), and Italy (19%) have lower chances. The probability of pull request acceptance increases by 19% when the submitter and the integrator are from the same geographical location. Survey responses from submitters indicate that the perceptions of bias are strong in Brazil and Italy, matching the data analysis. However, integrators do not perceive themselves as being biased.
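As a rough illustration of the kind of model behind such odds statements (this is a hedged sketch, not the exact specification used in Chapter 3; the data file and the columns accepted, submitter_country, and same_location are hypothetical), a logistic regression of acceptance on submitter country and co-location could be fit as follows:

```python
# Hedged sketch: logistic regression of pull request acceptance on submitter
# country (US as the reference level) and a same-location indicator. All file
# and column names are hypothetical; the chapter defines the actual model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

prs = pd.read_csv("pull_requests.csv")  # one row per pull request, accepted in {0, 1}

model = smf.logit(
    "accepted ~ C(submitter_country, Treatment('US')) + same_location",
    data=prs,
).fit()

# exp(coefficient) is an odds ratio: values above 1 mean higher odds of
# acceptance than the US baseline, values below 1 mean lower odds.
print(np.exp(model.params))
```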

A bias blind spot exists in the peer code review process in GitHub.

Project and Contributor Characteristics on Contributor Participation - Chapter 4

Contributor participation is a function of the willingness, capacity, and opportunity of contributors and their interactions. Existing studies analyzed various factors along these three dimensions and studied their influence on contributor participation. For instance, studies explained the influence of willingness on contributor participation in terms of intrinsic motivation, extrinsic motivation, feelings of fairness, etc. The influence of capacity on contributor participation was explained in terms of knowledge, age, personality, etc. Similarly, variables of opportunity, such as the initial environment and demographics, are seen to influence contributor participation. Our studies bridge a few gaps in the literature by analyzing the influence of some project and contributor characteristics on contributor participation. In particular, we explain the rise and fall of developer participation due to competing projects, the effect of personality traits on levels of contribution, and the impact of role reputation and contribution on developer participation.

3. Rise and Fall of Developer Participation due to Competing Projects

A majority of open source software projects fail due to their inability to garner significant and sustained developer community participation. The problem proliferates when competing projects emerge from the source code of an existing project, a phenomenon called forking of the original project, claiming existing and potential developer community participation. Our study empirically analyzes the influence of forking on the sustainability of the developer community participation in the original project. Further, it explains the observed behavior in terms of the characteristics of the project observed at the time of forking. A large-scale study of 2,217 projects hosted on GitHub shows that 1 in every 5 original projects observes a decline in the sustainability of developer community participation after forking. The negative effect is more pronounced in projects ported to GitHub from other platforms (∼20%), compared to projects developed on GitHub (∼9%). Also, the observed behavior can be explained in terms of the characteristics of the competing projects at the time of forking. For instance, in medium-sized projects, an increase in the maturity of the original project by a year decreases the odds of a decline in the sustainability of developer participation by 23%.
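To make this odds statement concrete under the standard logistic-regression reading of odds (an assumption here; the exact model is described in Chapter 4), a 23% decrease corresponds to an odds ratio of roughly 0.77 per additional year of maturity:

\[
\frac{\mathrm{odds}(\text{decline} \mid \text{maturity} = m + 1)}{\mathrm{odds}(\text{decline} \mid \text{maturity} = m)} \approx 1 - 0.23 = 0.77
\]

That is, each additional year of project maturity multiplies the odds of a decline in participation by about 0.77, holding the other project characteristics fixed.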

One in five projects observes a decline in developer participation when competing projects grow. This rise and fall of developer participation can be explained in terms of the characteristics of the competing projects.

4. Effect of Personality Traits on Levels of Contribution

People's personality has the potential to explain their behavior in different contexts. Our study explores the inferential power of personality traits in explaining the behavior of contributors in various contexts of software development in GitHub. Analyses of 243 actively discussed projects showed that contributors with high or low levels of contribution are more neurotic than contributors with medium levels of contribution. Analyses of 423 active contributors showed that contributors become more conscientious, more extroverted and less agreeable over their years of participation. The findings of this study match our intuitions and are promising for further exploration.

Certain personality traits are associated with higher levels of contributions. Personality traits of active contributors evolve with time. Personality traits of contributors are context-specific.

5. Impact of Role Reputation and Contribution on Developer Participation

Understanding and measuring the factors influencing future participation is relevant to organizations: this information is useful for planning and strategic decision-making. Our study measures contributor characteristics and computes attrition to investigate their relationship by mining an issue tracking system. The experiments are conducted on four years of data extracted from the Google Chromium issue tracking system. Experimental results show that the likelihood of future participation increases with the relevance of the contributor's role in the project and the level of participation in the previous time interval.

Role reputation and contribution are directly proportional to the likelihood of developer participation.

Measures of Contributor Productivity - Chapter 5

Measuring contributor productivity is important to identify expertise, generate awareness, and identify strengths and areas of improvement. Our studies propose metrics for individuals and teams to help analyze contributions and team stability, respectively.

6. Measure Individual Contribution

Individual contribution and performance assessment is a standard practice in organizations to measure the value added by various contributors. Accurate measurement of individual contributions based on pre-defined objectives, roles and key performance indicators is a challenging task. Further, appraisers discern the need to scrutinize the underpinning factors involved in performance appraisal. Our study presents 'Samiksha', a framework for contribution and performance assessment of software maintenance professionals by mining software repositories. The framework provides 11 role-based contribution and performance assessment metrics and demonstrates the results on real-world data from the Google Chromium issue tracking system. The requirement gathering of factors influencing performance is based on a survey of experienced software maintenance professionals. Further, we propose a visualization framework, 'SamikshaViz', to present a panoramic view of an individual's contribution and performance in a team. These visualizations reinforce and extend the rationale of the metrics proposed in 'Samiksha' and gather detailed insights for appraisers to justify performance. The results are validated by practitioners in industry.

Performance indicators perceived as important by managers are not measured in practice. Role-based metrics built on key performance indicators are useful.

7. Measure Team Contribution Patterns

Community management is challenging in open source software projects as contributor participation is fluid. Contributor churn (joining or leaving a project) causes the failure of a majority of software projects. We present a framework to characterize the stability of the community in software maintenance projects by mining the issue tracking system. Our study identifies key stability indicators, proposes metrics to measure them, conducts time series analysis on the metrics data to examine the stability of the community, and models community participation patterns to forecast future participation. A case study of the Google Chromium project investigates the inferential ability of the framework.

Metrics that measure temporal developer participation help analyze the stability of a project.

2 AIDS AND IMPEDIMENTS IN THE RAMP-UP JOURNEY OF NEW HIRES

Hiring top talent is essential for any software company's success. After joining the company, new hires often spend weeks or months before making any major contribution and attaining the same productivity level as existing employees. The ramp-up journey refers to this transition of new hires from novices to experts. Several factors, such as lack of experience or lack of familiarity with processes unique to the new company, can influence the ramp-up journey. To understand such aids and impediments, this chapter presents a study that analyzes data extracted from the version control systems of eight large and popular product groups at Microsoft with several thousand software developers. In particular, we studied two aspects of the ramp-up journey. First, we studied the time taken to make the first check-in into the version control system, an important milestone in the ramp-up journey indicating the first contribution. Second, we analyzed the time taken to reach the same productivity level as existing employees in terms of check-ins. We further augmented our quantitative study with qualitative results derived by surveying 411 professional developers. In this chapter, we describe the work and the results obtained. Our study produced promising results, including factors such as having a mentor, prior knowledge of required skill sets and proactively asking questions, that could help improve the ramp-up journey of new hires.

2.1 introduction

Hiring new talent is one of the core competencies of any software company, to meet evolving business requirements and to stay ahead of its competitors. For instance, over the past decade Microsoft hired several thousand software developers each year. These new hires range from fresh college graduates without any prior industry experience to professional developers with several years of experience. Soon after joining the company, new hires undergo rigorous training to become familiar not only with their assigned project, but also with the processes and the overall culture of the company. Therefore, new hires take some time before making any contribution to their project and becoming as productive as existing employees. We use the term ramp-up journey to refer to this time period spent by new hires in transitioning from novices to experts and becoming as productive as existing employees. There exist several reasons why the ramp-up journey of new hires can span several weeks or months. In the case of college graduates, despite the best curricula, there are still gaps between what graduates learn in college and what they need to know to be productive in a typical work environment [79][80]. For instance, graduates are trained in various skills such as programming and development methodologies (such as Agile). However, college graduates often lack communication and teamwork skills, and are also not prepared to deal with various aspects such as complex development processes, legacy code, and tight deadlines [81]. On the other hand, experienced professionals, although familiar with some aspects such as working under tight deadlines, also face challenges in a new company. The primary reason is that different companies use different processes, technologies, tools, etc. For example, IBM primarily uses Rational Team Concert [82] as a version control system, whereas Microsoft primarily uses Source Depot, Team Foundation Server [83], or Git [84]. Therefore, experienced professionals need to master new processes and technologies before becoming as productive as existing employees [85]. To understand the aids and impediments in the ramp-up journey of newly hired software developers, we conducted a large-scale study that uses mixed data analysis. Our analysis combines software engineering data extracted from version control systems and qualitative data from surveys and interviews. We analyzed recent releases of eight large and popular product groups at Microsoft with several thousand software developers. These eight product groups account for the majority of the engineering workforce at Microsoft. We observed that 14-49% of all software developers in the product teams considered in our study are new hires (refer to Figure 3). In Figure 3, the horizontal axis shows the different product groups and the vertical axis shows the percentage of new hires for the duration of the product release under analysis. We use P1 to P8 to identify different product groups and to anonymize the results for confidentiality. A percentage of new hires as high as 49% makes it interesting to understand the factors that influence the time it takes for new hires to become productive.


[Figure 3 bar chart: percentage of new hires per product group — P1: 14%, P2: 17%, P3: 18%, P4: 22%, P5: 24%, P6: 24%, P7: 26%, P8: 49%]

Figure 3: Percentage of new hires in different product groups

To gain a deeper understanding of the factors that influence the ramp-up journey of new hires with different experience levels, we used three levels to represent new hires: entry, middle, and senior. Entry level represents developers such as college graduates without any prior industry experience. Middle level represents developers with 1-3 years of experience or with a higher qualification such as a doctorate degree. Finally, senior level represents developers with more than 3 years of experience. We analyze the ramp-up journey of new hires on two aspects. The first milestone in the ramp-up journey of new hires is achieved when they make their first check-in into the version control system. Specifically, we examine check-ins in the master branch or shippable branch. The choice of the master or shippable branch ensures that all check-ins analyzed in the study mark a significant contribution and are not test check-ins. Thus, the first check-in indicates that new hires have attained a basic understanding of the engineering system used by the project and also some basic knowledge of the project needed to achieve the task. Finally, new hires ramp up when they attain the productivity level of existing employees. In this study, we measure the time to first check-in and the ramp-up time to analyze the productivity of new hires and understand the factors that influence the ramp-up journey. Our motivation is that the results of our study help fix the problems through a two-pronged approach for both industry and academia. From the industry perspective, this information helps managers and business analysts fix some of the bottlenecks in the existing processes and improve useful practices to help new employees ramp up faster, increase morale, improve productivity, etc. From an academic perspective, the study can help faculty understand the skills students need to succeed in an industrial environment. To summarize, in this study, we try to answer two broad sets of research questions:

1. The factors that influence the ramp-up journey of new hires; and

2. The amount of time it takes for new hires to become productive.

To the best of our knowledge, our study is the first that attempts to combine both quantitative and qualitative data to address these questions. There have been prior studies [86][87][88], discussed in related work, that approach this from different perspectives such as interviews and surveys. However, we are the first to quantitatively combine software engineering data with data from developers' opinions. It is noteworthy that some of the findings from this study may already be known in the industry; however, there exists no empirical evidence drawn from a systematic analysis of software engineering data. We believe that product teams can use the results from this study to adopt best practices. Our results indicate that the ramp-up journey is influenced by various factors in the company. We discovered that while having a mentor, prior knowledge of required skill sets, etc. help increase productivity, lack of proper documentation, trying to get access and permissions, etc. reduce the productivity of new hires. We complete the story by presenting a comprehensive list of activities that new hires engage in and their suggestions to improve productivity.

Figure 4: Data collected by CodeMine

2.2 related work

There exists a large body of research on integrating freshly recruited university graduates into an organization. Begel and Simon conducted a qualitative analysis of fresh university graduates. They observed that a large fraction of the problems encountered by university graduates is due to their inexperience with the corporate environment [86]. Dagenais et al. provided an initial theory on project landscapes to help new hires familiarize themselves faster with the team [87]. Their study emphasized the relevance of mentors and good documentation in helping new hires adjust and perform in a team. Sim and Holt interviewed new hires to identify patterns in which they familiarize themselves with the project and the environment, and discussed the implications [89]. Besides these, Ostroff investigated the role of mentoring in the learning process of newcomers [88]. The author observed that newcomers with a mentor have a better understanding of organizational issues and practices compared to others. Another class of studies provides recommendations to bridge the gaps between the understanding of college graduates and the requirements of the industry. Radermacher and Walia conducted a systematic literature review to identify the most common areas of deficiency in university graduates from an academia or industry job perspective [90]. Similarly, Begel and Simon observed difficulties in the transition from college graduate to experienced software engineer and suggested changes in curricula and software engineering courses [80]. Brechner goes on to recommend courses that might help bridge the expectation gaps between academia and industry [79]. The existing studies have largely focused on understanding the problems encountered by fresh university graduates and have approached this question from a qualitative analysis perspective. In relation to existing studies, our study makes the following two novel contributions:

1. We study the ramp-up journey of new hires ranging from fresh university graduates to professionals with prior job experience.

2. We quantitatively analyze the software engineering data and augment the results with qualitative analysis.

2.3 background

CodeMine [91] provides a data collection framework for all major Microsoft development teams. It collects information from source code repositories, irrespective of their format, and stores data on changes, sources, branches, and code integrations in a normalized schema. It does the same for builds, work item repositories, and organizational data. After normalizing into a common schema (see Figure 4), it creates relationships between artifacts. For example, it creates a relationship between work items and source changes to capture the changes made in response to a work item. Similarly, it creates a relationship between builds and source code to capture the changes which appeared for the first time in a particular build. Once the relationships are built, CodeMine exposes all collected and interpreted data as a service. As a result, the tool is ideal for learning about the practices used across teams and for developing a set of metrics that can be used to characterize branch structures in general.

Table 1: Products and start-end dates of the releases

Products | Start Date | End Date
Azure | 2011-01-01 | 2013-12-31
Bing | 2009-12-30 | 2013-11-12
Exchange | 2010-10-01 | 2013-08-31
Office | 2011-09-01 | 2013-08-31
SQL Server | 2009-07-01 | 2012-03-06
Windows | 2009-10-22 | 2013-07-01
Windows Phone | 2010-07-03 | 2014-02-03
Windows Server | 2009-10-22 | 2013-07-01

2.4 methodology

2.4.1 Data Collection

In this section, we describe the data collection methodology for our study. We use a mixed data analysis technique, combining software engineering data extracted from version control systems and qualitative data from surveys and interviews, to answer our research questions. Table 1 summarizes the products from which we analyzed new employees and the start and end dates of the releases under analysis. To be counted among the 'new hires' used in the rest of the study, an employee in a software development role had to satisfy one or more of the following four criteria:

1. The employee is a university recruit or this is the first company of the employee.

2. The employee joined the company as an intern or vendor and converted to a full time position.

3. The employee left the company and joined again after at least a year.

4. The employee worked for other companies in the past.

In point 3, we consider returning employees as new hires because, in addition to the other factors, they have to adapt to the rapidly changing technology and practices in the company. The rest of the software developers, including internal transfers, are termed existing employees for this study. Below, we explain the quantitative and qualitative analysis methodology.

2.4.2 Quantitative Analysis

The core part of our quantitative analysis is formed by the CodeMine data [91]. We use the version control system database and the employee information database as the starting point of our analysis. In the ramp-up journey, for new hires to transition from novices to experts, they have to attain the productivity level of existing employees. To do so, we first define a baseline for the productivity of software developers in the context of our study. Software developers perform various activities like writing code, reviewing code, and debugging code. The objective of these activities is to generate useful features, provide fixes to existing bugs, etc. At Microsoft, all software developers are expected to make code check-ins into the version control system as part of developing new features or making bug fixes. So, one way to measure the contribution of software developers is code check-ins. In this study, we examine various factors related to code check-ins to characterize the productivity of software developers. The first milestone in the ramp-up journey of new hires is when they make their first check-in into the version control system. The first check-in marks the first useful contribution, and the time to first check-in indicates the time it takes to make a useful contribution. Since the first check-in alone does not suggest that the developer is productive, we also measure the time to ramp-up. We quantify the ramp-up time as the time it takes for a new hire to reach the average productivity level of existing employees. To measure the time to ramp-up, we examine the frequency of check-ins as an indicator of the familiarity with the process that comes with initial experience. Thus, the frequency of check-ins measures familiarity with the process. However, it falls short of capturing the effort and the span of knowledge required to implement a change. To capture these aspects, we examine artifacts related to the source code, namely lines changed and files changed. Lines changed indicate the effort of software developers, where the contribution can be in the form of code, comments, or documentation. Similarly, files changed indicate that the developer has acquired an understanding of a set of files and their relationships. These measures, that is, check-in count, lines of code changed, and files changed, have also been used in the literature to measure developers' contributions [92][93]. In this section, we measure the time to first check-in and examine the ramp-up time using the frequency of check-ins, lines changed, and files changed. It is important to note that the three metrics used to compute the ramp-up time are not exhaustive; however, they capture the key contributions of software developers and give a fair picture of the ramp-up journey. In addition, we examine the influence of 1) experience, 2) the product team of the new hire, 3) proximity to the core team and 4) internship within the same company on the time to first check-in and the ramp-up time. In this study, we try to answer the following research questions:

RQ1 Does the product group of new hires influence the time to first check-in?

RQ2 Does prior job experience, within or outside the company, influence the time to first check-in?

RQ3 Does proximity to the core team influence the time to first check-in?

RQ4 Does internship within the company influence the time to first check-in?

RQ5 Does early check-in correlate with early ramp-up?

RQ6 Is the ramp-up journey a function of experience and product?

RQ7 Is the ramp-up journey a function of proximity to the core team and internship within the company?

2.4.3 Qualitative Analysis

To complete the understanding of the factors that influence the ramp-up journey of new hires, we augment the quantitative analysis with qualitative results. Here, qualitative results include a small set of interviews and a broad deployment of a survey directed towards new hires at Microsoft. To get a flavor of the factors that influence the ramp-up journey of new hires and to help with designing the survey, we conducted small-scale interviews with four software developers. Three of the four interviewees were 'Entry Level' software developers and one was a 'Senior Level' software developer. We interviewed the software developers for half an hour each and presented them with two open-ended questions.

1. What factors supported or undermined their attempts to make an early first check-in and reduced the time to ramp-up?

2. What could have been done to reduce the time to first check-in and the ramp-up time?

We use the responses from the four interviews to formulate two sets of multiple-choice questions, in addition to demographic and open-ended questions. The first set of questions tries to understand the influence of specified factors on the time to first check-in. Similarly, the second set of questions tries to understand the effect of specific factors on the time to ramp-up. The two sets of questions are evaluated on a 5-point Likert scale, along with an additional field titled 'I don't know'. The field 'I don't know' is intended to address cases where survey respondents have no understanding of the situation described by a factor influencing the ramp-up journey. Both sets of questions are followed by an open-ended question to capture factors that are not mentioned in the list. Next, we asked software developers about the activities, other than code check-ins, that demand their time and effort. Finally, we requested their suggestions on practices that may help reduce the ramp-up time. To conduct the survey, we identified software developers from the eight product groups based on the following criteria:

• Developers have some minimum experience with the aids and impediments that can be faced during ramp-up.

• The aids and impediments encountered during the ramp-up journey are still fresh in their minds.

• Have a reasonable sample of new hires, as we anticipated a response rate of ≈ 20% based on previously conducted studies.

Table 2: Statistical significance of survey results

Parameter | Increase | No effect | Decrease | p-value
Lack of proper documentation for the project | 267 | 63 | 75 | <0.001***

Based on the above criteria, we identified 1,189 software developers with 6 to 13 months of experience at Microsoft on the date of analysis. The survey was sent to all identified software developers, representing different roles, career stage paths, and nationalities. We received 411 completed responses (34.57% response rate), 1 partially filled response, and no disqualified responses. 99.8% of the responses we received were from individual contributors; the other roles asked about in the survey were lead and manager. Before presenting the qualitative results, we first analyze the usefulness of the survey summaries by computing statistical significance using a chi-square test at the 0.05 significance level. To compute statistical significance, we convert the ordinal scale to its nominal equivalent. We merge 'Strong Increase' with 'Moderate Increase' and collectively present the result as 'Increase'. Similarly, we merge 'Strong Decrease' with 'Moderate Decrease' as 'Decrease'. The remaining two categories, 'I Don't Know' and 'No Effect', are considered as one and are titled 'No effect/I don't know'. For each statistically significant result, we compute the central tendency as the most frequent response, or 'mode', and present the results. Table 2 presents a sample question asked in the survey that tries to understand the impact of 'Lack of proper documentation for the project' on the time to first check-in. A detailed discussion of the complete list of factors is given in the next two sections. In Table 2, 267 survey respondents said that lack of proper documentation for the project increases the time to first check-in, 63 survey respondents either did not encounter this problem or considered that it had no effect, and 75 responses suggested that it decreases the time to first check-in. We conducted a chi-square test and observed p < 0.001, which gives a very strong presumption against the null hypothesis, thereby suggesting that the result is statistically significant. The claim made by the central tendency summary, as identified by the mode, is that 'Lack of proper documentation increases the time to first check-in'. In the next two sections, we present an analysis of the time to first check-in and the ramp-up time, respectively. We quantitatively analyze the time to first check-in and the ramp-up time, followed by the summaries of the multiple-choice questions asked in the survey. Further, to complete the analysis, we present the opinions the survey respondents expressed in the open-ended questions. We present the list of activities that claim the time and effort of new hires and also suggestions from the survey respondents to improve the ramp-up journey. We card-sort the opinions of the new hires in the open-ended questions and present them in non-increasing order of frequency of occurrence.
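As a minimal sketch of this significance check (assuming a goodness-of-fit test against equal expected proportions, which the text does not spell out), the counts from Table 2 can be tested as follows:

```python
# Hedged sketch of the survey significance check described above, using the
# collapsed categories and the observed counts from Table 2. The expected
# distribution (uniform across the three categories) is an assumption.
from scipy.stats import chisquare

counts = {"Increase": 267, "No effect/Don't know": 63, "Decrease": 75}

stat, p_value = chisquare(list(counts.values()))  # expected counts default to uniform
print(f"chi-square = {stat:.1f}, p = {p_value:.2e}")

# Central tendency summary: the modal (most frequent) response.
print("mode:", max(counts, key=counts.get))
```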

2.5 time to first check-in

The time to first check-in marks the first step in the ramp-up journey of new hires. We measure the time to first check-in as the duration from the starting date at Microsoft until the time the new hire makes the first check-in into the master branch of the version control system. For confidentiality reasons, we obscure the unit of time used in the study when reporting the results.
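A minimal sketch of this measure is given below; it assumes a check-in log and a hire-date table with hypothetical file and column names, not the CodeMine schema itself.

```python
# Hedged sketch: time to first check-in per new hire, assuming a check-in log
# (developer, checkin_date, branch) and a hire table (developer, start_date).
# All file and column names are hypothetical.
import pandas as pd

checkins = pd.read_csv("checkins.csv", parse_dates=["checkin_date"])
hires = pd.read_csv("new_hires.csv", parse_dates=["start_date"])

# Keep only check-ins to the master branch, then take each developer's earliest one.
first = (checkins[checkins["branch"] == "master"]
         .groupby("developer")["checkin_date"].min()
         .rename("first_checkin").reset_index())

journey = hires.merge(first, on="developer")
journey["weeks_to_first_checkin"] = (
    (journey["first_checkin"] - journey["start_date"]).dt.days / 7
)

# Quartiles keep the summary robust to outliers, as in the analysis below.
print(journey["weeks_to_first_checkin"].quantile([0.25, 0.5, 0.75]))
```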

2.5.1 Quantitative Analysis

RQ1: Does the product group of new hires influence the time to first check-in? Microsoft is a large software company with multiple product divisions. Each product division at Microsoft is slightly different from the others in terms of the tools, technologies, or processes being used. Here, we are interested to know whether working with a specific product group helps new hires make an early first check-in into the system. By answering this question, we can identify some of the best practices in product groups which can then be transferred to other product groups. To do so, we measure the time to first check-in for all new hires in the eight product teams and compute quartiles. The choice of quartiles ensures that our results are not affected by outliers. Figure 5 shows the boxplot of the time to first check-in for the new hires in the eight product teams. Here, the horizontal axis shows the product groups and the vertical axis shows the time to first check-in measured in weeks. In Figure 5, we observe that the median of the population of new hires across all product divisions takes ≈4-10% of the maximum time to first check-in. The analysis of the eight product groups at Microsoft suggests that working in a specific product group has no significant impact on the time to first check-in.

Figure 5: Product groups and time to first check-in
Figure 6: Experience and time to first check-in

However, we see that the third quartiles of the product teams show marked differences. Figure 5 shows that new hires in products P2 and P8 take longer (≈34% and ≈43% of the maximum time to first check-in across products, respectively) relative to other product groups (a minimum of ≈14% in product P5). Further investigation is required to understand the cause.

RQ2: Does prior job experience, within or outside the company, influence the time to first check-in? Microsoft recruits several thousand new hires every year, ranging from fresh university graduates to professionals with prior job experience. Here, we are interested to know whether prior job experience helps new hires make an early first check-in into the system. To investigate the impact of prior job experience on the time to first check-in, we analyze the influence of the career stage path or job title, an indicator of experience, on the time to first check-in. At Microsoft, professionals are hired with job titles that conform to their prior job experience. For instance, professionals with 1 to 3 years of experience are hired as 'Middle Level' software developers. Thus, in this study, we analyze the role of the job title on the time to first check-in. To start with, we classify job titles at Microsoft into 'Entry Level', 'Middle Level' and 'Senior Level' software developers. 'Entry Level' includes software developers with the job titles 'Software Development Engineer (SDE)' and 'IT SDE'. 'Middle Level' includes the job titles 'SDE 2' and 'IT SDE 2', and 'Senior Level' includes the job titles 'Senior SDE', 'Principal SDE', 'Partner SDE', and 'Distinguished Engineer'. For each career stage path, we compute the time to first check-in for all new hires and plot the results. Figure 6 shows the violin plot where the horizontal axis shows the three career stage paths of software developers and the vertical axis shows the time to first check-in (measured in weeks). The plot in Figure 6 is a combination of a box plot and a smoothed density function, where the density function indicates the developer distribution pattern. Thus, the broader the width, the higher the fraction of new hires that take a specific time to make the first check-in. In Figure 6, we observe that the increase in the median time to first check-in for middle and senior level software developers is ≈20% relative to entry level software developers. The lowest median time to first check-in for entry level software developers implies that developers with no or less than a year of prior job experience make an earlier first check-in than experienced new hires. Further, we observe that middle level and senior level software developers have the same median time to first check-in with different density distributions. The density distribution for senior level software developers peaks at the median, relative to the spread of middle level software developers. This distribution implies that senior level software developers perform consistently and make earlier first check-ins compared to middle level software developers. This observation calls for further investigation to understand the factors that influence the time to first check-in for all levels of software developers.

RQ3: Does proximity to the core team influence the time to first check-in? Companies adopted distributed development as a strategic response to increasing concerns such as skill set unavailability, acquisitions, and government restrictions [94].

Figure 7: Proximity to core team and first check-in time
Figure 8: Prior engagement and first check-in time

However, distributed development has its own challenges, such as restricted and delayed communication and less shared project awareness. Owing to these challenges, it is interesting to understand the influence of distributed development on the ramp-up journey of new hires. In particular, here we are interested in studying how proximity to the core of a team in a distributed-development environment influences the time to first check-in of a new hire. To study the impact of distributed development, we classified all new hires into two categories: employees hired within the United States (US) and employees hired in the rest of the countries (Non-US). At Microsoft, the bulk of developers for all product teams is in the US. Therefore, the preceding classification helps us understand the differences in the ramp-up journey when new hires are in proximity to the core team. We present a box and whisker plot, where the horizontal axis shows the geographical location ('US' and 'Non-US') and the vertical axis shows the time (refer to Figure 7). In Figure 7, the percentage decrease in the median time to first check-in for new hires in 'US' relative to the new hires in 'Non-US' is ≈ 57% when measured in weeks. Thus, new hires in the 'US' category make an earlier first check-in compared to new hires in the 'Non-US' category.

RQ4: Does an internship within the company influence the time to first check-in? Companies make huge investments in internship programs. For example, an analysis of different product groups at Microsoft suggested that between 7% and 26% of new hires in those groups had prior internship experience at Microsoft. Apart from giving industry exposure to students, companies view internship programs as a means to identify prospective employees and assess their strengths in real workplace situations [95]. Given its importance, we want to understand whether new hires with prior internship experience ramp up faster than others. To study the impact of internships, we classified all new hires into two categories: employees who had prior internship experience at Microsoft (Interns) and others (Non-Interns). Figure 8 shows the impact of internship on the time to first check-in of new hires. Here, the horizontal axis shows the prior engagement of new hires as 'Interns' vs. 'Non-Interns', whereas the vertical axis shows the time to first check-in. In Figure 8, the median time to first check-in is ≈ 33% lower for 'Interns' relative to 'Non-Interns' when measured in weeks. Thus, new hires who had prior internship experience at Microsoft tend to make an earlier first check-in compared to other new hires.

2.5.2 Qualitative Analysis

To present a comprehensive list of factors that influence the ramp-up journey, we report the opinions of new hires at Microsoft on the factors that influence the time to first check-in. We asked new hires about the effect of a set of factors on the time to first check-in. These questions are based on the interviews conducted with four new hires with different career stage paths.

Table 3: Influence of the following factors on the time to first check-in [SI: Strong Increase; MI: Moderate Increase; NE: No Effect; MD: Moderate Decrease; SD: Strong Decrease; DK: Don't Know]

How do the following items affect the time to first check-in? | P1 | P2 | P3 | P4 | P5 | P6 | P7 | Significance
Lack of proper documentation for the project | SI | SI | SI | SI | SI | MI | MI | <0.001***
Getting access and permissions | MI | MI | MI | SI | SI | MI | MI | <0.001***
Working on code with dependencies on others' work | SI | MI | MI | MI | - | MI | MI | <0.001***
Working on preparatory tasks (like code review, coding assignments, etc.) | MI | MI | MI | SI | MI | MI | MI | <0.001***
Working on legacy code | MI | MI | NE | MI | MI | MI | MI | <0.001***
Availability of resources (like desktop, task-related equipment) on arrival | NE | NE | NE | SI | SI | - | MI | 0.25
Joining the team near a product release | NE | NE | NE | SI | DK | NE | DK | <0.001***
Changing products, such as moving from Windows | DK | DK | DK | NE | NE | NE | DK | <0.001***
Writing new code rather than fixing issues | NE | NE | NE | MI | MI | NE | DK | <0.001***
Identifying the reviewers for the code | NE | NE | NE | MI | NE | NE | NE | <0.001***
Changes in team composition, such as a change of immediate manager | NE | NE | NE | NE | NE | NE | NE | <0.001***
Making check-ins in branches other than the main branch | NE | NE | NE | NE | NE | NE | MI | <0.001***

Table 3 shows the impact of these factors on the time to first check-in. The opinions of new hires are presented on a scale from 'Strong Increase' to 'Strong Decrease'. We use an additional field titled 'I don't know' to account for cases where survey respondents have no opinion on the impact of the factor asked about in the survey. Table 3 presents the opinions of new hires in 7 (out of 8) product groups, as there was no survey response from one product group analyzed in this study. These observations can be used by the product groups to identify and eliminate bottlenecks in the process and help improve existing practices. To visualize trends across product groups, we color-code the observations in Table 3. Here, we present areas of improvement as shades of red and good practices as shades of green. The factors on which survey respondents have no opinion are presented in yellow, and we leave the factors with no effect colorless. Also, '-' indicates that we received too few responses from the product division on the influence of the factor to support interpretation. Further, the darker the color, the stronger the impact; thus, dark red (on a relative scale) means that the parameter strongly increases the time to first check-in. In Table 3, all results, except for 'Availability of Resources on Arrival', are statistically significant. We present summaries of the statistically significant results and of the open-ended responses. In Table 3, we see that lack of proper documentation increases the time to first check-in: it strongly increases the time to first check-in in 5 (out of 7) product groups, while for the remaining 2 product groups it moderately increases it. Also, getting access and permissions, working on code with dependencies, and working on legacy code moderately increase the time to first check-in for the majority of product teams. 4 out of 7 product teams say that joining the team near a product release has no effect on the time to first check-in, while 2 other teams did not comment on it. It is noteworthy that new hires in product P4 say that joining the team near a product release strongly increases the time to first check-in.
Besides these, changing product teams, writing new code rather than fixing issues, identifying the reviewers for the code, changes in team composition, and making check-ins in branches other than the main branch have no effect for the majority of product teams. One key observation from this analysis is that not all product teams are influenced by all the parameters stated above; also, the degree of influence varies substantially across product teams. In the open-ended question that followed, we asked survey respondents to enumerate factors, other than the ones listed in the survey, which influence the time to first check-in. We card-sorted the responses to the open-ended question and found the following recurring themes, presented in non-increasing order of occurrence.

1. Mentorship: Software developers stressed the importance of having a manager, mentor or lead to talk to during the initial days. They said that mentors can assist new hires in getting unstuck and making early first check-ins. Also, software developers who were not assigned mentors experienced a significant loss of time.

2. Documentation: Software developers feel that the lack of detailed documentation of products and processes strongly increases the time to first check-in. In addition, the documents are stored in different places, in different formats, and some documentation is out of date.

Table 4: Correlation between time to first check-in and ramp-up time after first check-in

 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8
Commit Counts | -0.06 | -0.01 | -0.09 | -0.22 | +0.04 | -0.00 | -0.04 | +0.13
Lines Changed | -0.07 | -0.09 | -0.24 | -0.39 | -0.13 | +0.10 | -0.34 | -0.31
Files Changed | -0.13 | -0.18 | -0.18 | -0.52 | -0.21 | -0.00 | -0.16 | -0.30

3. Process: Software developers feel that the engineering processes need some improvement. They believe that the use of standard components, for which documentation and manuals are widely available, will help reduce the time to first check-in.

4. Access and Permissions: New hires feel that it takes time to figure out the required access and permissions. They suggest that it would be helpful to associate access and permissions with the team rather than with individuals.

5. System setup: Software developers say that they spend a considerable amount of time setting up the environment and configurations, which can be improved.

Besides these, developers find that a lack of confidence to make changes, the large size of the code base to study, a lack of technical skills required for the job, the development environment, meetings with a broader scope (not targeted), and frequent manager changes increase the time to first check-in.

2.6 time to ramp-up

2.6.1 Quantitative Analysis

New hires take time to reach the productivity level of existing employees. The amount of time taken to ramp up influences resource planning, effort estimation, and hence the productivity of the team. Therefore, managers and business analysts might be interested in knowing the time to ramp-up and studying its impact. In this study, we define the ramp-up time of new hires as the time required to reach the median productivity level of existing employees. We measure the time to ramp-up on three parameters, namely the frequency of check-ins, lines changed, and files changed. To establish the baseline, we measure the median check-in counts, lines changed, and files changed for existing employees in each product per unit time. We then measure the number of time units it takes for new hires to reach this median productivity level of existing employees. A minimal sketch of this baseline computation is given below.
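The sketch assumes a per-developer, per-month activity table with hypothetical file and column names; the actual analysis draws on CodeMine data and the reported unit of time is obscured.

```python
# Hedged sketch: ramp-up time as the first month in which a new hire reaches the
# median monthly activity of existing employees. The file monthly_activity.csv
# and its columns (developer, month, is_new_hire, checkins, lines_changed,
# files_changed) are hypothetical.
import pandas as pd

activity = pd.read_csv("monthly_activity.csv")

# Baseline: median monthly activity of existing employees on the three measures.
existing = activity[~activity["is_new_hire"]]
baseline = existing[["checkins", "lines_changed", "files_changed"]].median()

def months_to_ramp_up(history, metric):
    """Months until a new hire first reaches the existing-employee median on `metric`."""
    hist = history.sort_values("month").reset_index(drop=True)
    reached = hist.index[hist[metric] >= baseline[metric]]
    return float(reached[0] + 1) if len(reached) else float("nan")

ramp_up = (activity[activity["is_new_hire"]]
           .groupby("developer")
           .apply(months_to_ramp_up, metric="checkins"))
print(ramp_up.median())
```

With these measures in place, this section addresses the following research questions: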

RQ5: Does early first check-in correlate with early ramp-up? Managers and business analysts might be interested in understanding the best practices that help reduce the ramp-up time. We analyze whether an early first check-in helps new hires ramp up faster, compared to others who take longer to make the first check-in. For all product teams, we compute the correlation between the time to first check-in and the time to ramp-up after the first check-in. We compute the correlation on check-in counts, lines changed, and files changed using Spearman's rank correlation coefficient. For all three parameters, we observe negligible to weak correlations ranging from +0.10 to -0.39 (refer to Table 4). The negligible to weak correlation between the time to first check-in and the ramp-up time suggests that the time to ramp-up is not a function of the time to first check-in. However, the negative sign of most of the correlations implies that new hires who take longer to make the first check-in do not necessarily take longer to ramp up after the first check-in.
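A minimal sketch of this check is shown below, assuming a per-developer table (hypothetical file and column names) that already contains the two duration measures per product group:

```python
# Hedged sketch: per-product Spearman correlation between the time to first
# check-in and the ramp-up time after the first check-in. The file
# ramp_up_journeys.csv and its columns (product, weeks_to_first_checkin,
# months_to_ramp_up) are hypothetical.
import pandas as pd
from scipy.stats import spearmanr

journeys = pd.read_csv("ramp_up_journeys.csv")

for product, group in journeys.groupby("product"):
    rho, p = spearmanr(group["weeks_to_first_checkin"],
                       group["months_to_ramp_up"],
                       nan_policy="omit")
    print(f"{product}: Spearman rho = {rho:+.2f} (p = {p:.3f})")
```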

RQ6: Is the ramp-up journey a function of experience and product? In the previous section, we examined the impact of experience and product group on the time to first check-in. We observed that the median time to first check-in is similar across all product groups analyzed. Also, we saw that middle level software developers take longer than entry level and senior level software developers to make the first check-in. Here, we are interested to know whether, similar to the time to first check-in, experience and product team influence the ramp-up time. For each product group, we compute the median time to ramp-up and present the results (as shown in Figure 9). In Figure 9, the horizontal axis shows the product groups and the vertical axis shows the time to ramp-up in months. The legend refers to the three parameters we used to measure the time to ramp-up. In Figure 9, we observe that new hires ramp up on the three parameters stepwise.

Figure 9: Product groups and time to ramp-up
Figure 10: Experience and time to ramp-up

First, new hires ramp up on check-in counts, followed by lines changed and files changed. This follows the intuition that making changes to multiple files requires a broad understanding of the task, and hence takes longer than ramping up on check-in counts or lines changed. Also, we observe that the median time to ramp-up across the eight product groups is similar for check-in counts (≈32-45% of the maximum time to ramp-up measured in months). However, it varies significantly for lines changed (≈45-81%) and files changed (≈68-100%). A large variance in the ramp-up time on lines changed and files changed indicates differences in the ramp-up time across products. This information can be used by managers and business analysts to understand the productivity of new hires across products, make better effort estimations, conduct resource planning, and take appropriate actions. Figure 10 shows the time to ramp-up based on experience. In Figure 10, the horizontal axis shows experience as identified by the job title and the vertical axis shows the time to ramp-up (measured in months). We analyze the median time to ramp-up on check-in counts, lines changed, and files changed. We see that the percentage increase in the median time to ramp-up on check-in counts across experience levels is ≈5%; thus, experience has practically no impact on the ramp-up time measured by check-in counts. However, for ramp-up on files changed and lines changed, we see that middle and senior level software developers take marginally longer than entry level software developers to ramp up. The percentage increase in the median ramp-up time on lines changed for middle level and senior level software developers is ≈13% and ≈6%, respectively, relative to entry level software developers. Similarly, the percentage increase in the median time to ramp-up on files changed is ≈22% for middle level and senior level software developers relative to entry level software developers.

RQ7: Does proximity to the core team and an internship within the company influence the ramp-up time? In the previous section, we examined the influence of proximity to the core team and internship within the company on the time to first check-in. We observed that proximity to the core team and an internship within the company reduce the time to first check-in. Here, we are interested to know whether, similar to the time to first check-in, proximity to the core team and internship within the company influence the time to ramp-up. Figure 11 shows the ramp-up time of new hires classified by their proximity to the core team. We present a box and whisker plot, where the horizontal axis shows the geographical location ('US' and 'Non-US') and the vertical axis shows the time. In Figure 11, the percentage decrease in the median time to ramp-up for new hires in 'US' relative to new hires in 'Non-US' on check-in counts, lines changed and files changed is ≈8%, ≈6% and ≈4%, respectively, when measured in months. This implies that the three metrics used to measure the ramp-up time, i.e., check-in counts, lines of code changed, and the number of files changed, show only a marginal difference in the median ramp-up time between new hires in 'US' and 'Non-US'. New hires in 'US' take marginally less time than their 'Non-US' counterparts. Our observations suggest that proximity to the core team does help the ramp-up journey of new hires in the beginning; however, the effect decreases over the overall ramp-up journey.

Figure 11: Proximity to core team and time to ramp-up
Figure 12: Prior engagement and time to ramp-up

Figure 12 shows the impact of internship on the ramp-up journey of new hires. Here, the horizontal axis shows the engagement of new hires as ‘Interns’ vs. ‘Non-Interns’ whereas the vertical axis shows the ramp-up time. In Figure 12, the percentage decrease in the time to ramp-up for ‘Interns’ relative to ‘Non-Interns’ on check-in counts, lines changed, and files changed is ≈10%, ≈6%, and ≈5% respectively. Thus, interns ramp up faster compared to non-interns; however, the time difference is marginal compared to the time to first check-in (refer Figures 8 and 12). We also observe similar patterns for other quartiles. Therefore, our observations indicate that familiarity with people, process, and product acquired during the internship indeed influences the time to first check-in. However, the effect attenuates in the long run, as non-interns take only marginally more time to ramp up than their counterparts. This observation implies that, among other factors, internship experience does help new hires ramp up faster.

2.6.2 Qualitative Analysis

We present the opinions of new hires at Microsoft on the factors that influence the time to ramp-up. We asked the new hires about the effect of the factors (listed in Table 5) on the time to ramp-up. These questions are based on interviews of new hires who followed different career stage paths. Table 5 shows the central tendency summaries, as captured by the mode, for 7 (out of 8) product groups analyzed in the study. Similar to the previous section on qualitative results, red indicates areas of improvement, green indicates good practices, yellow implies don’t know, and colorless means no effect. Also, the darker the color, the stronger the impact. In Table 5, all results, except for ‘Communicating Technical Prerequisites’, are statistically significant. In Table 5, we observe that prior knowledge of programming languages, programming environments, and tools helps decrease the ramp-up time. Similarly, proactively asking questions, prior familiarity with the process, and having a mentor decrease the time to ramp-up, although the influence differs for products P4 and P5. Contrary to this, new hires say that maintaining documentation, to-do lists, and working on preparatory tasks increase the ramp-up time. When asked about the impact of prior familiarity with the team on the ramp-up time, new hires expressed mixed opinions across product teams. Also, 5 (out of 7) teams say that active participation in social events has no effect on the time to ramp-up; the remaining 2 teams say that it moderately increases the time to ramp-up. In addition to the parameters stated above, software developers feel that the following practices influence their time to ramp-up:

1. Team Interaction: New hires say that verbal communication within the team and pair programming are the most effective ways to ramp up. They find that spending more time with the manager and

Table 5: Influence of the following factors on the time to ramp-up [SI: Strong Increase; MI: Moderate Increase; NE: No Effect; MD: Moderate Decrease; SD: Strong Decrease; DK: Don’t Know]

How do the following items affect the time to ramp-up? | P1 P2 P3 P4 P5 P6 P7 | Statistical Significance
Prior knowledge about programming languages (such as C#), programming environments (such as Visual Studio), or tools (such as versioning tools) | SD MD SD - MD - SD | <0.001***
Proactively asking questions to your manager, mentor or team | SD MD MD - - SD MD | <0.001***
Prior familiarity with processes (such as how effort estimation is done or review process) | MD MD MD MI MI MD MD | <0.05**
Having a mentor | SD SD SD SI SI SD DK | <0.001***
Maintaining documentation, to do lists, introductory videos for different employee titles | SI NE SD SI MI MI MI | <0.001***
Preparatory tasks (such as code review, building prototypes) or training programs (like boot camp) | MI MI MD SI SI MI SI | <0.001***
Communicating the relevant technical prerequisites (such as tools and languages used) of an employee’s title, prior to joining | MD NE NE MI SI MI MI | 0.66
Prior familiarity with team (such as the cases of moving from an intern to a full time position) | NE DK DK MI MI DK DK | <0.001***
Active participation in social events, such as team lunches etc. | NE NE NE MI MI NE NE | <0.001***

the team during the first 1-2 months is helpful. They add that recently ramped-up employees reduce the ramp-up time of new employees the most.

2. Training: Software developers say that training programs, such as boot camps, are very helpful.

3. Overview of the system: Software developers say that well-chosen starting tasks that give a complete overview of the system help reduce the ramp-up time.

4. Proximity to release: Developers find that joining the team after product release increases the time to ramp-up as there is not much code to write.

Other than these, developers say that familiarity with people, customers, process, product, and code helps improve the ramp-up journey.

2.7 other new hire activities

In the previous two sections, we presented the factors that influence the ramp-up journey and measured the time to ramp-up using various features of check-ins. However, new hires perform various activities other than code check-ins, which makes it important to understand the activities that are not captured in the study. We asked the survey respondents to list the activities they perform and present the themes arranged as different stages in the ramp-up journey of new hires.

New hires who relocated from other countries or states say that it takes time to settle in a new place. Further, they add that on arrival, they have to set up the system, get the required access and permissions, enroll for benefits, get HR/staffing set up, etc. Once the required resources are made available, new hires find themselves trying to understand the existing system and identify their role in the team. They say that they undergo training programs, attend meetings, and study all kinds of documentation to acquire the required technical and functional knowledge. They add that sometimes they are assigned code reviews, prepare prototypes or demos, or even read legacy code to get a better understanding of the system. Further, in the process of knowledge transfer, they spend substantial time proactively asking questions. New hires emphasize that while they are learning and trying to ramp up, they also engage in other activities like testing, bug fixing, debugging, bug triaging, identifying and resolving dependencies, etc. New hires also identify some miscellaneous activities that claim their time and effort. They say that they spend time planning, writing proposals, estimating time for tasks, performing production related duties, adding work items, participating in events like ‘Hackathon’, etc. While this list is not exhaustive, it gives a fair understanding of the various activities that new hires perform other than code check-ins. This list of activities suggests that while code check-in is a good indicator of productivity, it does not present a complete picture.

We asked new hires for suggestions on practices that might help improve the productivity of new hires. We believe that this information can help companies improve the ramp-up journey of new hires. New hires suggest that improvements in engineering systems, such as applying companywide coding standards, an improved code base and documentation, easy-to-use tools, etc., will increase the productivity of new hires. These findings align with Microsoft’s recent initiatives, such as One Engineering System, to have one common system across all projects. They emphasized the usefulness of training tools and sessions, and guided work for a few weeks during the ramp-up time. They also added that centralization of all information and clearly communicated expectations are other factors that accelerate the ramp-up journey.

2.8 threats to validity

2.8.1 Internal Validity

• Data accuracy: The accuracy of the results of this study depends on the accuracy of the data on which it is built, e.g., some data may be missing or incomplete. We believe that this is only a minor threat. For the study, we used the CodeMine tool, which attempts to capture software development activities as accurately and completely as possible. Several production systems at Microsoft are built on top of CodeMine and its accuracy has been extensively verified.

2.8.2 Construct Validity

• Activities in other product groups: We analyze commits in the eight product groups, which constitute a vast majority of the Microsoft workforce. However, if developers engage in activities in product groups other than the ones analyzed here, we are not able to capture their contribution.

• Activities other than code check-ins: We compute the ramp-up journey of new hires in terms of code check-ins. However, new hires engage in a wide variety of activities other than code check-ins (refer Section 2.7). Also, different product groups emphasize different sets of activities during the ramp-up journey of new hires. These two factors may influence the observed time to first check-in and the time to ramp-up. So, while comparing product groups on the time to first check-in and the time to ramp-up, these two factors should be taken into consideration.

2.8.3 External Validity

• Application of results to product divisions within and outside Microsoft: We analyzed eight large, popular product teams, which constitute the majority of Microsoft’s engineering workforce. We, therefore, believe that the results are widely applicable to product divisions in Microsoft. We do not claim that the findings and recommendations presented in this study extend to every organization and product team. For example, we expect that the findings may not generalize to organizations whose bootstrapping mechanisms differ from Microsoft’s or to organizations that hire software developers for testing purposes only. While the findings may not generalize, the research methodology can be applied to other contexts as long as there are sufficient data points to compute a baseline productivity of existing employees.

• Geographic differences: The survey included participants from different countries with different cultures and working hours. In addition, Microsoft relocates new hires from other countries to new locations. We did not collect enough data to analyze and control for this effect. We, therefore, caution the readers that our findings may not apply to arbitrary countries and cultures.

2.9 summary

In this chapter, we described quantitative and qualitative analyses to understand the factors that influence the ramp-up journey of new hires. The results of this study reiterate, by mining software engineering data, some of the knowledge already known to the industrial world. We analyzed eight large product groups at Microsoft and observed that the time to first check-in, a milestone in the ramp-up journey of new hires, is invariant to the product group analyzed. In terms of experience, as indicated by the career stage path levels, entry level software developers make their first check-in faster than middle and senior level software developers. Proximity to the core team and an internship within the company reduce the time to first check-in. To complete the analysis, we asked the opinions of new hires to understand the factors that influence the time to first check-in. We observed that, among other factors, lack of proper documentation, getting access and permissions, etc. increase the time to first check-in.

Further, we computed the ramp-up time of newly hired software developers on check-in counts, lines changed, and files changed. We observed that new hires first ramp up on check-in counts, followed by lines changed, and then files changed. We also found that the time to first check-in is weakly, if at all, correlated with the ramp-up time. The weak negative correlation implies that new hires who take longer to make their first check-in do not necessarily take longer to ramp up thereafter. We see that the ramp-up time is a function of experience and product for lines changed and files changed. Also, proximity to the core team and internship within the company have a marginal effect on the ramp-up time of new hires. In addition to this, survey results suggest that prior knowledge of required technical skills, proactively asking questions, and familiarity with the process help reduce the ramp-up time, along with other factors. We also list activities, other than code check-ins, that claim developers’ time and effort. We conclude the study with new hires’ suggestions for improving their productivity.

GEOGRAPHICAL BIAS IN PEER CODE REVIEW 3

Open source development has often been considered a level playing field for all developers. However, there has been little work investigating whether bias plays a role in getting contributions accepted and how developers perceive such bias. In this chapter, we present a study, one of the largest of its type, to understand the perceptions and reality of the influence of geographical location on the evaluation of pull requests in GitHub, one of the primary open source development platforms. Using a mixed-methods approach that analyzes 70,000+ pull requests and 2,500+ survey responses, we found a bias blind spot. Data analyses showed that geographical location explains differences in pull request acceptance decisions. Compared to the United States, submitters from the United Kingdom (22%), Canada (25%), Japan (40%), the Netherlands (43%), and Switzerland (58%) have higher chances of getting their pull requests accepted. However, submitters from Germany (15%), Brazil (17%), China (24%), and Italy (19%) have lower chances of getting their pull requests accepted. The probability of pull request acceptance increases by 19% when the submitter and the integrator are from the same geographical location. Survey responses from submitters indicate that the perceptions of bias are strong in Brazil and Italy, matching the data analysis. However, integrators do not perceive themselves as being biased.

3.1 introduction

Biases have been found to hurt meritocracy in offline work groups [96][97]. For years, visible characteristics have been used to differentiate people in all spheres of life, ranging from sports [98] to health care [99] to job applications [100]. The perceived differences in values and norms point towards the likelihood of engaging in stereotyping, cliquishness, and conflicts [101]. Recently, biases have been reported in online environments too [102][103]. Open source software (OSS) development started as a merit-based model [104], which gave rise to terms like ‘code is king’ [104][105]. Social factors were found to influence work-related decisions [106], in addition to the technical factors. In recent years, social work environments like GitHub [107], Bitbucket [108], etc. have gathered a large number of developers. These platforms provide transparency and access to developers’ profiles. The increasing level of awareness of the demographic attributes of fellow contributors makes it important to understand the reaction of the community to this diversity. Studies on GitHub have looked into the influence of visible demographic attributes like gender, tenure, etc. on the presence of bias [65] and the productivity of teams [64]. Our goal with this work is to understand whether the geographical location of developers influences the way their contributions are evaluated. We choose to study the influence of geographical location on the evaluation of contributions for the following reasons: 1) its observed impact on work-related decisions in offline groups [96][102], and 2) a reasonable degree of visibility of geographical location in social work environments [109]. Through this study, we intend to generate awareness of the presence of bias and bridge gaps in perceptions, if any. Studies show that individuals can only work to correct for sources of bias that they are aware exist [110].1 Awareness may prompt individuals to pursue corrective measures [111]. In the words of Nathaniel Branden [112], “The first step towards change is awareness...” To examine bias in online, distributed software development, we leverage GitHub, the largest and most popular online collaborative coding platform. We study the pull-based development model, one of the most popular models for collaborative development (45% of collaboratively developed repositories use the pull-based development model), which has all the characteristics of an online and distributed development environment. One of the key challenges in conducting this study is to detect the presence of bias when the developers themselves might not be aware of it. Even when developers are aware of their biases, it is hard to make them acknowledge these biases. For this reason, we use a mixed-methods approach. We

1 http://www.ncsc.org/~/media/Files/PDF/Topics/Gender%20and%20Racial%20Fairness/IB_Strategies_033012.ashx

combine observations from 70,000+ pull requests and 2,500+ survey responses, one of the largest survey response populations for open source projects [113], to analyse the influence of geographical location on pull request acceptance decisions. We quantitatively analyse GitHub projects’ data to measure the influence of 1) the geographical location of submitters and 2) the shared geographical location of submitters and integrators on pull request acceptance decisions. We examined the pull request acceptance rate across geographical locations as a proxy for geographical bias. We support these observations with quantitative and qualitative analyses of the survey responses of submitters and integrators on the perceptions of bias. The two developer roles, submitter and integrator, together represent the main stakeholders. We combine the results to understand the difference (or agreement) between actual and perceived bias on geographical location. We find a bias blind spot, a type of cognitive bias where integrators perceive the absence of bias while submitters experience it, similar to the results from the data analysis. This study informs both integrators and submitters about the actual presence of bias despite their reported perceptions.

3.2 background and related work

3.2.1 Visible Demographic Attributes and Bias

The relationship between visible demographic attributes (like race, gender, etc.) of people and work-related decisions in offline groups has been a subject of study for years [99][100][114]. The influence of these diversities has been felt in online communities too. The reaction of the community to these diversities depends on the extent to which these features are salient [64]. OSS started as a merit-based model [115]; however, with the rise of social work environments like GitHub, developers are somewhat aware of the demographic features (age, gender, ethnicity, etc.) of fellow developers [109]. This awareness has been used to form impressions using the history of activities [116]. It has also been analysed to understand the influence of gender and tenure diversity on team productivity [64] and the presence of gender bias [65] in GitHub.

3.2.2 Pull-based Development

GitHub supports two models of collaboration: the shared repository model and the pull-based development model. The pull-based development model separates development efforts from the decisions to include the submitted code [117]. This separation allows projects to be more democratic and transparent, which has increased participation [118]. Currently, less than half of the collaboratively developed projects exclusively or complementarily use this model [117].

3.2.3 Factors Influencing Pull Request Acceptance Decisions

Factors influencing pull request acceptance decisions can be broadly classified as developer characteristics, project characteristics, and pull request characteristics. For a developer, reputation (technical and social) is seen to positively influence pull request acceptance decisions [106][119][120][121]. Following technical and social norms is seen to increase the chances of contribution acceptance [106][116][120]. For a project, maturity and popularity are related to lower chances of pull request acceptance [106][120]. Also, the nature of the pull request, measured as its size (source code churn), quality (including test cases), and the uncertainty associated with it (amount of discussion), influences the chances of contribution acceptance [106][120][122].

3.3 methodology

We use a mixed-methods approach [19][16] and triangulate our observations by combining GitHub projects’ data with survey response data. First, we carefully select a dataset of GitHub developers and projects and model the influence of geographical location on the pull request acceptance decision while controlling for confounding effects. Further, we conduct two large-scale surveys of submitters and integrators in GitHub and quantify their perceptions and experiences. Conducting surveys of submitters and integrators helped us analyze differences (or agreement) between the actual and perceived bias on geographical location.

Figure 13: Research Method: Mixed-Methods Approach

Below, we present a detailed description of our selection procedure and analysis methods. A diagrammatic presentation of the research method is shown in Figure 13. All the data and procedures used in this section are made publicly available for replication [123]. We strongly encourage researchers to download the data and build on our work.

3.3.1 GitHub Data

3.3.1.1 Feature Selection

Factors influencing pull request acceptance decisions are borrowed from the literature in software engineering [106][64][120] and the social sciences [99][100][114][124]. Table 6 presents a comprehensive list of factors that are seen to influence pull request acceptance decisions. In addition to these factors, for our study, we borrow the concept of bias based on geographical location from the social sciences literature [125] and measure it in terms of software engineering data and its associated metadata. The list of features used in this study is presented in the last column of Table 6.

3.3.1.2 Data Collection

We downloaded the GitHub projects’ data designed to analyse the pull-based development model, made publicly available by Gousios et al. [113]. We enrich the dataset with additional information required for this study from the GHTorrent dataset made available on August 18, 2015 [126]. We also use the GHTorrent dataset for the surveys, as we discuss later. The enriched dataset is a collection of 1,069 projects and 370,411 pull requests developed in Python (357), Java (315), Ruby (359), and Scala (38). These carefully selected projects represent the top 1% of the projects developed using the pull-based development model. The dataset combines GHTorrent data with project repository data and provides a list of features seen to influence pull-based development. For this study, we augment this dataset with pull request life cycle information, participants’ demographic information, and measures of the social norms that influence pull request acceptance decisions. A description of the procedure to extract the above-mentioned factors from the GHTorrent dataset follows:

3.3.1.3 Features

participants’ demographics We measure the geographical location of developers as the location (or country of residence) specified by developers in their GitHub profiles.

Table 6: A comprehensive list of factors influencing pull request acceptance decisions
Characteristic | Tsay et al. [106] | Gousios et al. [120] | Current study | Representation
Project characteristics
Maturity | Project tenure | - | Project tenure | repo_pr_tenure_mnth
Team size | Count of collaborators | Active core team | Active core team | -
Popularity | Watchers’ count | Watchers’ count | Watchers’ count | repo_pr_popularity
Size of code | sloc | sloc | sloc | sloc
Openness | - | % of external contribution | % of external contribution | perc_external_contribs
Test based code quality | - | Test lines per kloc | Test lines per lloc | test_lines_per_lloc
Test based code quality | - | Test cases per kloc | Test cases per lloc | -
Test based code quality | - | Asserts per kloc | Asserts per lloc | -
Developer’s acquired characteristics
Social skills: Status in community | Followers | Followers | Followers | prs_popularity
Social skills: Status in project | Project membership | Project membership | Project membership | prs_main_team_member
Social norms: Follow the integrator a priori | Follow the integrator prior to contribution | - | Follow the integrator prior to contribution | prs_followed_pri
Social norms: Follow the repository a priori | Follow the repository prior to contribution | - | Follow the repository prior to contribution | prs_watched_repo
Technical skills: Experience/Expertise | - | Previous pull requests | Previous pull requests | prev_pullreqs
Technical skills: Experience/Expertise | - | Submitter success rate | Submitter success rate | prs_succ_rate
Technical skills: Experience/Expertise | Submitter tenure | - | Submitter tenure | prs_tenure
Technical norms: Size of change | src churn | src churn | src churn | src_churn
Technical norms: Size of change | Files changed | Files changed | Files changed | files_changed
Technical norms: Test cases | Test inclusion | Test churn | Test inclusion | test_inclusion
Developer’s innate characteristics
Geographical location | - | - | Country of residence | prs_location
Explicit or implicit bias on geographical location | - | - | Likelihood of PR acceptance on geo loc | measured using prs_location
In-group bias | - | - | Likelihood of PR acceptance on same geo loc | measured using prs_pri_same_location
Pull request characteristics
Uncertainty associated with pull requests | Comments count | Comments count | Comments count | num_issue_comments

Two factors motivate our choice of country of residence as a measure of geographical location. First, the perceptions of fellow developers around geographical location are framed in terms of the location specified in the GitHub profile. Second, we believe that work habits and the cultural environment, specific to the location of current residence of a developer, may explain differences in behaviour. One may argue that these differences are caused by a combination of current and attenuating effects of past locations of residence. However, for simplicity, we study the differences in pull request acceptance decisions in terms of the current location only.

In GitHub, mentioning the location is optional and is specified as free-form text. Thus, developers can write ‘US’, ‘United States’, ‘XYZ Apartments, New York’, etc. to refer to the same location. To identify the geographical location of developers irrespective of the format of the location, we use the ‘countryNameManager’ script used by Vasilescu et al. [64]. This script uses the free-form text to identify the geographical location of GitHub users who chose to disclose it. Further, to identify the geographical location of developers who did not mention it explicitly, we proposed a heuristic. The proposed heuristic uses the data points for which the geographical location is identified using ‘countryNameManager’ as training data.
It then uses the domain names of email addresses and the affiliation of developers to identify their geographical location. The underlying principle of the heuristic is that the affiliation and domain name can be used to localise the country of residence. For instance, an affiliation to ‘Peking University’ maps the country of residence to China. However, this approach is prone to false positives. To minimise false positives, we keep only one-to-one mappings between predictors and geographical locations (excluding one-to-many mappings) and set the threshold for inclusion to 20. Thus, if in our training dataset the company name ‘Peking University’ maps to China only and has at least 20 data points to support it, we map ‘Peking University’ to China. This information is used to identify the geographical location of developers for whom the ‘countryNameManager’ script cannot identify a location and who have mentioned a company name or an email address.
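A minimal sketch of this mapping heuristic is shown below; it is an illustration rather than the script used in the study, and the record fields ('country', 'company', 'email_domain') and function names are hypothetical.

from collections import defaultdict

MIN_SUPPORT = 20  # inclusion threshold used in the study

def learn_mapping(known_developers, key):
    # known_developers: iterable of dicts with a 'country' field and the
    # predictor field named by `key` (e.g. 'company' or 'email_domain').
    counts = defaultdict(lambda: defaultdict(int))
    for dev in known_developers:
        value, country = dev.get(key), dev.get("country")
        if value and country:
            counts[value][country] += 1
    mapping = {}
    for value, by_country in counts.items():
        if len(by_country) == 1:  # keep one-to-one mappings only
            country, support = next(iter(by_country.items()))
            if support >= MIN_SUPPORT:  # require enough supporting data points
                mapping[value] = country
    return mapping

def infer_country(dev, company_map, domain_map):
    # Fall back to the company name first, then the e-mail domain.
    return company_map.get(dev.get("company")) or domain_map.get(dev.get("email_domain"))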

Table 7: Pull request data generated
ID | State: Open | State: Merge | State: Close | Status | Same Loc
1 | Dev1 (India) | - | - | Open | -
2 | Dev1 (India) | Dev2 (India) | - | Merged | Yes
3 | Dev1 (India) | Dev3 (US) | Dev3 (US) | Merged | No
4 | Dev1 (India) | - | Dev4 (France) | Not Merged | No
5 | Dev1 (India) | Dev1 (India) | Dev1 (India) | Merged | Self

This way, we identified the geographical location of 149,268 developers (out of 541,685) who participated in pull-based development. Here, participants refer to both submitters and integrators.

life cycle of a pull request A pull request is opened by a submitter and is initially in state ‘open’. Integrators (core team developers) review and evaluate the opened pull request and decide to 1) merge the suggested change into the original code, with a state ‘merge’ followed by state ‘close’, or 2) close it directly without merging, with a state ‘close’. A pull request can thus be ‘open’, ‘merged’, or ‘not merged’. A pull request may get re-opened multiple times; however, for simplicity, we focus on the pull request life cycle starting from the time when the pull request was opened for the first time until it gets closed for the first time. We select pull requests that are ‘merged’ and ‘not merged’, and exclude ‘open’ pull requests. We construct the pull request life cycle and annotate each action with the developer who performed it, the developer’s geographical location, and other attributes. Here, developers who opened pull requests are marked as ‘submitters’. Developers who merged pull requests are termed ‘mergers’ and developers who closed pull requests ‘closers’. Together, ‘mergers’ and ‘closers’ are integrators and form the core team of the project. The final data from this exercise look like Table 7. In Table 7, submitter ‘Dev1’ from India interacts with integrators ‘Dev1’, ‘Dev2’, ‘Dev3’, and ‘Dev4’ from various geographical locations to get her pull requests reviewed. The outcome of the pull request review (status) and the relationship between the geographical locations of the submitter and the integrator (same geographical location) are recorded.
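A minimal sketch of how one such record could be derived per pull request is given below; the field and function names are hypothetical and the sketch mirrors the columns of Table 7 rather than the exact extraction code.

def lifecycle_record(events):
    # events: dict with an 'opened' entry and optional 'merged'/'closed' entries,
    # each a (developer, location) pair taken from the first open/merge/close actions.
    submitter, submitter_loc = events["opened"]
    merger = events.get("merged")
    closer = events.get("closed")
    if merger is None and closer is None:
        status = "Open"
    elif merger is not None:
        status = "Merged"
    else:
        status = "Not Merged"
    integrator = merger or closer
    if integrator is None:
        same_location = "-"
    elif integrator[0] == submitter:
        same_location = "Self"  # pull request handled by the submitter herself
    else:
        same_location = "Yes" if integrator[1] == submitter_loc else "No"
    return {"submitter": submitter, "status": status, "same_location": same_location}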

3.3.1.4 Pull Request Project Sample

The geographical location of submitters follows a highly skewed distribution (kurtosis: γ=98.2). So, to ensure that we have diverse geographical locations and significant pull request counts for each location, we select geographical locations that represent at least 1% of the total pull requests in the GitHub data. Thus, the United States (38%), the United Kingdom (8%), Germany (6%), France (5%), Canada (4%), Japan (3%), Brazil (3%), Australia (2%), Russia (2%), Netherlands (2%), China (2%), Spain (2%), India (2%), Switzerland (1%), Sweden (1%), Italy (1%), and Belgium (1%), each with at least 1% of the total pull requests, are selected for analysis. These 17 geographical locations represent approximately 83% of the developer population for whom we were able to identify the geographical location. We observed that submitters themselves merge a significant fraction of pull requests on GitHub. An analysis of 113,191 pull requests in the enriched dataset showed that other developers integrate 63% of the pull requests and the submitters themselves integrate the remaining 37%. Similar statistics are observed in the GHTorrent dataset (44% merged by self, 56% by others). So, by selecting pull requests from submitters in the selected 17 geographical locations and eliminating pull requests merged by the submitters themselves as well as incomplete data, we are left with 70,740 pull requests for analysis.
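The sampling step can be expressed as a small filter over the enriched pull request table; the sketch below is only illustrative, and the column names ('prs_location', 'state', 'merged_by_submitter') are assumptions rather than names taken from the released dataset.

import pandas as pd

def sample_pull_requests(prs: pd.DataFrame, min_share: float = 0.01) -> pd.DataFrame:
    # keep only closed pull requests (merged or not merged), dropping 'open' ones
    closed = prs[prs["state"].isin(["merged", "not merged"])]
    # keep locations that account for at least 1% of all pull requests
    share = closed["prs_location"].value_counts(normalize=True)
    selected_locations = share[share >= min_share].index
    sample = closed[closed["prs_location"].isin(selected_locations)]
    # drop pull requests merged by the submitters themselves and incomplete rows
    sample = sample[~sample["merged_by_submitter"]]
    return sample.dropna()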

3.3.1.5 Statistical Methods

In this study, a pull request is the base unit of analysis. Each pull request is an independent observation characterized by temporally evolving project, developer, and pull request characteristics. To understand the influence of the geographical location of a submitter, and of its interaction with the geographical location of an integrator, on the pull request acceptance decision, we test two hypotheses.

H10 There are no differences in the pull request acceptance decisions based on the geographical location of submitters.

H1a There are differences in the pull request acceptance decisions based on the geographical location of submitters.

H20 There are no differences in the pull request acceptance decisions based on the same geographical location of submitters and integrators.
H2a There are differences in the pull request acceptance decisions based on the same geographical location of submitters and integrators.

To test the two hypotheses, we use regression modeling. We model the pull request acceptance decision as a binary classification problem. Specifically, we use logistic regression, as implemented in R [127][128]. We measure statistical significance at a p-value <= 0.05, the size of change as log odds, and the impact as the percentage of deviance explained, as used in other studies [4]. Though this measure provides an interpretation similar to the percentage of the total variance explained by least squares regression, the two measures are not the same [129]. At a finer level of detail, we compute pairwise correlations of the continuous variables and note down highly correlated pairs. We consider two variables as highly correlated when their correlation coefficient is greater than 0.7 [130]. Similarly, to measure the relationship between two categorical variables, we use a chi-square test of dependence for significance and measure its effect size using Cramer’s V [131]. For categorical variables, we consider the relationship between two variables to be strong if it exceeds 0.7 [132]. We identify and note strongly correlated categorical variables too. We model the relationship between a set of predictors and the response variable. Next, to stabilise the variance, we log-transform the independent count variables. We verify this by using the AIC and the Vuong test for non-nested models [133] to compare the transformed and non-transformed data. To check that multicollinearity is not an issue, we compute the Variance Inflation Factor (VIF). Any VIF value greater than 5 is considered to indicate multicollinearity, as used in various studies [129]. At this step, we eliminate highly correlated variables that cause multicollinearity. We measure the fitness of the model using the Area Under the Curve (AUC). The value of the AUC should be greater than 0.5 for the model to be acceptable.2 Once the model is built, we read the coefficients of the logistic regression as the expected changes in the log odds of the response for a unit change in a predictor variable while keeping all other predictor variables constant. So, for a continuous predictor variable, a one-unit change in its value multiplies the odds of acceptance by the exponent of its coefficient. The interpretation is slightly different for categorical variables. To analyse the effect size of statistically significant features, we use dummy (treatment) coding to compare a base level with treatments. Specifically, to measure the influence of geographical location, we choose the United States, with the majority of developers, as the base level. Similarly, to test for bias based on the interactions between the geographical locations of submitters and integrators, we choose ‘different geographical location’ as the base level.

2 https://www.kaggle.com/wiki/AreaUnderCurve
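For illustration only, the modelling pipeline described above could be sketched as follows. The study used R; this Python version is not the authors' code, it assumes a data frame with the column names of Table 8 plus a binary 'merged' outcome, and it condenses the variable-selection steps.

import numpy as np
import statsmodels.formula.api as smf
from statsmodels.stats.outliers_influence import variance_inflation_factor
from sklearn.metrics import roc_auc_score

def fit_acceptance_model(prs):
    # log-transform the independent count variables to stabilise the variance
    for col in ["prev_pullreqs", "src_churn", "files_changed", "sloc",
                "prs_popularity", "num_issue_comments"]:
        prs["log_" + col] = np.log(prs[col] + 1)

    formula = ("merged ~ repo_pr_tenure_mnth + repo_pr_popularity + perc_external_contribs"
               " + test_lines_per_lloc + log_sloc + prs_tenure + log_prev_pullreqs"
               " + prs_succ_rate + test_inclusion + log_src_churn + log_files_changed"
               " + prs_main_team_member + log_prs_popularity + log_num_issue_comments"
               " + prs_watched_repo + prs_followed_pri"
               " + C(prs_location, Treatment(reference='united states'))"
               " + prs_pri_same_location")
    model = smf.logit(formula, data=prs).fit()

    # multicollinearity check: a VIF greater than 5 flags a variable for removal
    exog = model.model.exog
    vif = {name: variance_inflation_factor(exog, i)
           for i, name in enumerate(model.model.exog_names)}

    # model fitness: the AUC should exceed 0.5
    auc = roc_auc_score(prs["merged"], model.predict(prs))

    # effect sizes: exponentiated coefficients are odds ratios against the base level
    odds_ratios = np.exp(model.params)
    return model, vif, auc, odds_ratios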

3.3.2 Survey

We designed two surveys: one from the perspective of submitters and the other from the perspective of integrators. Conducting two separate surveys helps us capture the perspectives and experiences of each role.

3.3.2.1 Design

Since the goal of conducting the two surveys is to learn from the developer community rather than a few individuals, we designed online surveys [134][135]. We did not give any monetary incentives. Each survey was designed to take a maximum of 7 minutes and was active for three weeks. There were two rounds for each survey. First, we identified 50 developers each for the role of submitter and integrator and sent them the survey as part of a pilot study. Based on their feedback and subsequent refinement, we sent the main survey to all submitters and integrators identified to

answer our surveys. We explicitly informed our prospective survey respondents that the results of the study would be anonymous to ensure that developers share their true opinions and do not try to present themselves in a good light. The details of each survey are given below:

integrators We asked our survey respondents four types of questions. First, we asked them a few demographic questions to get an understanding of the diversity and representativeness of the survey responses. This was followed by questions to understand their perspectives on the presence of bias. These questions covered:

1. The level of awareness of the demographic features of the submitters they work with.
2. The perceived importance of developers’ characteristics.
3. Explicit perceptions of bias based on the geographical location of submitters and its interaction with the geographical location of integrators.

In total, we asked 12 questions, out of which 8 were multiple-choice questions, 3 were Likert scale questions, and 1 was an open-ended question. The aim of the open-ended question was to discover factors other than the ones mentioned in the survey.

submitters We asked our survey respondents two types of questions. Similar to the survey for integrators, we started with a few demographic questions. This was followed by questions to understand the following:

1. Their understanding of the importance of factors influencing the pull request acceptance decisions of integrators.
2. Their personal experiences with bias.

In total, we asked them 11 questions, out of which 8 were multiple-choice questions, 2 were Likert scale questions, and 1 was an open-ended question. Throughout the two surveys, we used a 5-point Likert scale, with one exception. To understand the perceptions of submitters on “The importance of factors influencing the pull request acceptance decisions of integrators”, we provided an additional choice, ‘I don’t know’. This choice was provided to account for cases when a submitter does not know the perceived importance of a factor.

3.3.2.2 Identify Survey Respondents

From the GHTorrent dataset, we extracted developers’ geographical location information, role, and contribution count to identify a list of prospective survey respondents for this study. We selected all submitters who have submitted at least 10 pull requests that were reviewed by integrators from at least 2 different geographical locations. Here, the count of pull requests submitted does not include pull requests closed by self. Similarly, we identified integrators who reviewed at least 10 pull requests from submitters in at least 2 different geographical locations; here too, the count excludes pull requests merged by self. These selection criteria ensure the following:

1. Candidacy of the developer to answer our survey questions: The requirement of two or more geographical locations ensures that the developer has experience working with developers from diverse geographical locations. Such developers can help us understand the influence of geographical location on pull request acceptance decisions.
2. Reasonable experience and diversity of respondents: The requirement of working on at least 10 pull requests not closed by self ensures that the respondent has reasonable experience with the pull-based development model, while still including a wide range of developers in the study.

Developers can play the roles of submitter and integrator simultaneously. To select unique respondents per survey, we assign each developer to the role in which they were most active. So, a developer who wrote 100 pull requests and reviewed 150 pull requests is considered for the role of integrator. Following this approach, we identified 6,628 integrators and 9,254 submitters. Out of these 15,882 prospective respondents, 15,615 had a valid email address. We sent customised, personally addressed emails to all identified prospective survey respondents. We informed the developers about their contribution in terms of the approximate count of pull requests they worked on and the approximate count of geographical locations with which they collaborated. This customised report generated interest in the survey, which was reflected in the 500+ email responses received and the 2,532 survey responses, one of the largest sets of survey responses, with a 17% response rate (excluding 818 messages that failed to deliver).
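The selection logic above can be sketched as follows; the input layout (one row per pull request interaction, with 'developer', 'role', 'counterpart_location', and 'self_handled' columns) is assumed for illustration and does not reflect the GHTorrent schema.

import pandas as pd

def select_respondents(events: pd.DataFrame,
                       min_prs: int = 10, min_locations: int = 2) -> pd.Series:
    # exclude pull requests merged/closed by the submitter herself
    events = events[~events["self_handled"]]
    stats = (events.groupby(["developer", "role"])
                   .agg(prs=("counterpart_location", "size"),
                        locations=("counterpart_location", "nunique")))
    eligible = stats[(stats["prs"] >= min_prs) & (stats["locations"] >= min_locations)]
    # a developer active in both roles is assigned to the role with more activity
    return (eligible.reset_index()
                    .sort_values("prs", ascending=False)
                    .drop_duplicates("developer")
                    .set_index("developer")["role"])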

3.3.2.3 Data Analysis

We identified all complete survey responses and preprocessed them into a form usable for analysis. To test the research questions, we converted the ordinal data to their nominal equivalents and conducted chi-square tests. We started with a basic understanding of the diversity of responses in terms of respondents’ demographics. This is followed by the perceived importance of the geographical location of the submitter on the pull request acceptance decision across the two roles. We additionally analysed the level of awareness of the integrators regarding submitters’ demographics. Finally, to understand the perceptions of bias based on geographical location, we asked submitters and integrators different questions.

• Question for submitters Did you experience bias based on your geographical location in getting your pull requests accepted?

• Questions for integrators How much do you agree with the following statements?
– It is easy to work with developers from the same geographical location.
– I encourage developers from my geographical location to contribute.
– Developers from some geographical locations are better at writing pull requests relative to others.

To understand the perceptions of developers, we tested the responses against the following hypotheses:

H0 Developers who report experiencing bias and those who do not are in equal proportions.
Ha Developers who report experiencing bias and those who do not are in unequal proportions.

If the null hypothesis is rejected at the 0.05 significance level, we measure the effect size as the percentage difference. Next, we check whether there are significant differences in the perceptions of bias across geographical locations. If the differences are significant, we measure the differences in the experiences of bias using logistic regression. For this, we select geographical locations for which we received at least 10 responses. We coded the open-ended survey responses to understand the justifications of integrators around the presence of bias based on geographical location. The first author started by identifying themes from the first 100 comments. The themes were the top three suggestions that integrators would like to give to submitters to improve the chances of their pull requests getting accepted. These themes, along with any other themes that emerged over time, were coded for all open-ended responses. Another author then verified the coding. These suggestions conveyed the expectations of integrators from submitters and also pointed towards possible explanations of the observed differences in perceptions.
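As an illustration of the proportion test described above (not the exact analysis script; the function and argument names are hypothetical, and the Likert responses are assumed to be collapsed to a yes/no variable beforehand):

from scipy.stats import chisquare

def test_bias_perception(experienced_bias):
    # experienced_bias: list of booleans, True when a submitter reports having
    # experienced bias based on geographical location.
    yes = sum(experienced_bias)
    no = len(experienced_bias) - yes
    # H0: equal proportions of "experienced bias" and "did not experience bias"
    statistic, p_value = chisquare([yes, no])
    # effect size reported as the percentage difference between the two groups
    effect_size = abs(yes - no) / len(experienced_bias)
    return statistic, p_value, (effect_size if p_value <= 0.05 else None)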

3.4 results

3.4.1 Analysis of GitHub projects’ data

To answer our research question, we test Hypothesis 1 and Hypothesis 2 using GitHub projects’ data. We model the influence of geographical location on pull request acceptance decisions using logistic regression while controlling for the effect of confounding factors (refer Table 8).

H1: There are differences in the pull request acceptance decisions based on the geographical location of submitters. In Model 1 of Table 8, we see that increases in maturity, popularity, the size of the code, and openness to external contribution reduce the chances of pull request acceptance, while a well-tested code base, an indicator of code quality, increases them. We see that an increase in the technical skills of

Table 8: Logistic regression model of factors influencing pull request acceptance decision [AUC: 0.7]
| Model 1 | Model 2
(Intercept) | 2.82 (0.14)∗∗∗ | 2.61 (0.14)∗∗∗
Control variables
repo_pr_tenure_mnth | −0.01 (0.00)∗∗∗ | −0.01 (0.00)∗∗∗
repo_pr_popularity | −0.00 (0.00)∗∗∗ | −0.00 (0.00)∗∗∗
perc_external_contribs | −0.01 (0.00)∗∗∗ | −0.01 (0.00)∗∗∗
test_lines_per_lloc | 0.00 (0.00)∗∗∗ | 0.00 (0.00)∗∗∗
log(sloc + 1) | −0.06 (0.01)∗∗∗ | −0.06 (0.01)∗∗∗
prs_tenure | −0.00 (0.00)∗∗∗ | −0.00 (0.00)∗∗∗
log(prev_pullreqs + 1) | 0.17 (0.01)∗∗∗ | 0.17 (0.01)∗∗∗
prs_succ_rate | 0.01 (0.00)∗∗∗ | 0.01 (0.00)∗∗∗
test_inclusion1 | 0.26 (0.03)∗∗∗ | 0.26 (0.03)∗∗∗
log(src_churn + 1) | −0.06 (0.01)∗∗∗ | −0.06 (0.01)∗∗∗
log(files_changed + 1) | 0.01 (0.02) | 0.01 (0.02)
prs_main_team_member1 | 0.06 (0.07) | 0.05 (0.07)
log(prs_popularity + 1) | 0.06 (0.01)∗∗∗ | 0.07 (0.01)∗∗∗
log(num_issue_comments + 1) | −0.25 (0.01)∗∗∗ | −0.24 (0.01)∗∗∗
prs_watched_repo1 | 0.04 (0.03) | 0.05 (0.03)
prs_followed_pri1 | 0.11 (0.03)∗∗ | 0.10 (0.03)∗∗
prs_locationunited kingdom | 0.13 (0.04)∗∗ | 0.20 (0.04)∗∗∗
prs_locationgermany | −0.25 (0.04)∗∗∗ | −0.16 (0.05)∗∗∗
prs_locationfrance | 0.02 (0.06) | 0.11 (0.06)
prs_locationcanada | 0.12 (0.07) | 0.22 (0.07)∗∗
prs_locationjapan | 0.25 (0.08)∗∗∗ | 0.34 (0.08)∗∗∗
prs_locationbrazil | −0.27 (0.06)∗∗∗ | −0.19 (0.07)∗∗
prs_locationaustralia | 0.05 (0.07) | 0.14 (0.07)
prs_locationnetherlands | 0.26 (0.09)∗∗ | 0.36 (0.09)∗∗∗
prs_locationchina | −0.39 (0.09)∗∗∗ | −0.27 (0.10)∗∗
prs_locationrussia | −0.06 (0.07) | 0.04 (0.07)
prs_locationspain | 0.08 (0.10) | 0.15 (0.10)
prs_locationindia | 0.02 (0.07) | 0.12 (0.07)
prs_locationswitzerland | 0.38 (0.11)∗∗∗ | 0.46 (0.11)∗∗∗
prs_locationsweden | −0.21 (0.09)∗ | −0.10 (0.09)
prs_locationbelgium | 0.09 (0.12) | 0.18 (0.12)
prs_locationitaly | −0.31 (0.08)∗∗∗ | −0.21 (0.09)∗
prs_pri_same_location1 | | 0.18 (0.03)∗∗∗
AIC | 49231.59 | 49198.20
BIC | 49534.09 | 49509.87
Log Likelihood | −24582.80 | −24565.10
Deviance | 49165.59 | 49130.20
Num. obs. | 70740 | 70740

∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05

the submitters and adherence to technical norms are seen to increase the chances of pull request acceptance, while an increase in experience decreases them. We also notice that submitters with a high social reputation and those who follow social norms are more likely to get their pull requests accepted. Finally, the uncertainty associated with the pull request influences the chances of pull requests getting accepted. The influence of the above-mentioned factors is already known to the software engineering community [106][113]. Controlling for the effect of these factors in Model 1 of Table 8, we see that the geographical location of submitters explains significant differences in the chances of pull request acceptance. The deviance explained by each factor is shown in Table 9. Our observations support our hypothesis that there are differences in the pull request acceptance decision based on the geographical location of submitters (refer Table 8). Next, we identify the differences in the pull request acceptance decision location-wise. We examine the sign and magnitude of the estimate for a given geographical location relative to the base level, the United States. The coefficients for submitters’ geographical locations can be divided into three categories: statistically insignificant coefficients, positive coefficients, and negative coefficients (refer Table 8). For 7 out of the 17 geographical locations under analysis, the results are statistically insignificant, that is, there are no differences in the chances of contribution acceptance compared to the United States.

Table 9: Deviance explained by factors influencing pull request acceptance decision
| Df | Deviance | Resid. Df | Resid. Dev | Pr(>Chi)
NULL | | | 70739 | 52354.02 |
repo_pr_tenure_mnth | 1 | 509.46 | 70738 | 51844.56 | 0.0000
repo_pr_popularity | 1 | 175.49 | 70737 | 51669.07 | 0.0000
perc_external_contribs | 1 | 321.47 | 70736 | 51347.59 | 0.0000
test_lines_per_lloc | 1 | 47.84 | 70735 | 51299.75 | 0.0000
log(sloc + 1) | 1 | 30.85 | 70734 | 51268.90 | 0.0000
prs_tenure | 1 | 25.46 | 70733 | 51243.45 | 0.0000
log(prev_pullreqs + 1) | 1 | 1014.64 | 70732 | 50228.81 | 0.0000
prs_succ_rate | 1 | 389.62 | 70731 | 49839.19 | 0.0000
test_inclusion | 1 | 22.73 | 70730 | 49816.46 | 0.0000
log(src_churn + 1) | 1 | 164.02 | 70729 | 49652.44 | 0.0000
log(files_changed + 1) | 1 | 0.11 | 70728 | 49652.34 | 0.7444
prs_main_team_member | 1 | 2.62 | 70727 | 49649.72 | 0.1055
log(prs_popularity + 1) | 1 | 57.46 | 70726 | 49592.26 | 0.0000
log(num_issue_comments + 1) | 1 | 273.45 | 70725 | 49318.81 | 0.0000
prs_watched_repo | 1 | 2.82 | 70724 | 49315.99 | 0.0931
prs_followed_pri | 1 | 6.79 | 70723 | 49309.20 | 0.0092
prs_location | 16 | 143.61 | 70707 | 49165.59 | 0.0000
prs_pri_same_location | 1 | 35.39 | 70706 | 49130.20 | 0.0000

For geographical locations where the coefficients are positive, the chances of getting pull requests accepted are higher than for the United States. Similarly, geographical locations for which the coefficients are negative have lower chances of getting their pull requests accepted. Thus, France, Australia, Russia, Spain, India, Sweden, and Belgium observe no differences in getting their pull requests accepted compared to the United States. The United Kingdom (22%), Canada (25%), Japan (40%), the Netherlands (43%), and Switzerland (58%) have higher chances of getting their pull requests accepted. Germany (15%), Brazil (17%), China (24%), and Italy (19%) have lower chances of getting their pull requests accepted. In Table 9, we see that the geographical location of submitters explains a small, yet significant, percentage of the total deviance.
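These percentages appear to follow from exponentiating the location coefficients of Model 2 in Table 8; as a worked reading of how a coefficient maps to a reported percentage change in the odds of acceptance:

Δodds = e^β − 1, e.g., e^0.20 − 1 ≈ 0.22 (United Kingdom, +22%) and e^−0.16 − 1 ≈ −0.15 (Germany, −15%).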

H2: There are differences in the pull request acceptance decisions based on the same geographical location of submitters and integrators. To test this hypothesis, we control for the effect of the above-mentioned factors, including the geographical location of submitters. In Model 2 of Table 8, we see that the same geographical location of submitters and integrators, compared to different geographical locations of submitters and integrators, has a statistically significant influence on pull request acceptance decisions. This observation supports our Hypothesis 2. We see that, controlling for the effects of other factors, when submitters and integrators are from the same geographical location, the chances that the pull requests will get accepted are 19% higher compared to when submitters and integrators are from different geographical locations. Further, in Table 9, we see that the same geographical location of submitters and integrators explains a small, yet significant, percentage of the total deviance.

3.4.2 Analysis of Survey Data

3.4.2.1 Submitters

We received 1,603 complete responses from submitters. These responses present the perspectives of a wide range of submitters from 76 countries with different ages [21-30 years (47%), 31-40 years (42%), 41-50 years (7%) and others], genders [male (98%), female (2%)], experience in OSS [1-2 years (10%), 3-6 years (53%), 7-10 years (19%), more than 10 years (17%) and others], jobs [Industry (62%), Academia (5%), Freelance (7%) and others], and roles [source code contributor (49%), owner (47%) and others]. We analyse the perceptions of these submitters to test our hypothesis H3.

H3: Submitters perceive bias based on their geographical location.

Table 10: Submitters’ perception on bias - location-wise [AUC: 0.9]
| Bias
(Intercept) | 4.84 (0.50)∗∗∗
prs_locationGermany | 0.01 (1.12)
prs_locationUnited Kingdom | 16.73 (2908.74)
prs_locationFrance | 16.73 (3116.19)
prs_locationCanada | −0.74 (1.13)
prs_locationRussia | 16.73 (3906.35)
prs_locationBrazil | −2.60 (0.69)∗∗∗
prs_locationAustralia | 16.73 (4015.38)
prs_locationIndia | −1.20 (1.13)
prs_locationNetherlands | 16.73 (4941.18)
prs_locationItaly | −2.13 (0.89)∗
prs_locationSweden | 16.73 (5250.30)
prs_locationSpain | 16.73 (5524.41)
prs_locationJapan | −2.80 (0.79)∗∗∗
prs_locationNorway | −1.75 (1.14)
prs_locationPoland | −1.75 (1.14)
prs_locationSwitzerland | 16.73 (6379.04)
prs_locationUkraine | −2.07 (1.15)
prs_locationDenmark | 16.73 (7812.70)
prs_locationCzech Republic | 16.73 (8107.62)
prs_locationFinland | 16.73 (8107.62)
prs_locationPortugal | −2.35 (1.16)∗
prs_locationBelgium | 16.73 (8438.68)
prs_locationArgentina | −2.54 (1.16)∗
prs_locationAustria | 16.73 (8813.91)
prs_locationIreland | 16.73 (8813.91)
prs_locationSingapore | 16.73 (9244.11)
AIC | 236.25
BIC | 378.45
Log Likelihood | −91.12
Deviance | 182.25
Num. obs. | 1432

∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05

There are statistically significant differences in the perceptions of submitters on the presence of bias. Using a chi-square test of independence at a significance level of 0.05 (chi-squared = 1448.743, df = 1, p-value < 2.2e-16), we found that 97% more developers feel that they did not experience bias compared to those who feel that they experienced it. Our hypothesis H3, that is, that submitters perceive bias based on their geographical location, is rejected. Further analysis showed that the differences in the perceptions of bias across geographical locations are statistically significant (chi-squared = 72.1569, df = 26, p-value = 3.203e-06), that is, there are differences in the perceptions of bias based on the geographical location of submitters. To tease out the individual effects, we modeled the perceptions of experienced bias location-wise using logistic regression. An analysis of the 27 geographical locations from which we received at least 10 survey responses shows that more submitters from some geographical locations perceive the presence of bias compared to others. We found that the perceptions of the presence of bias are stronger for submitters from Brazil (93%), Italy (87%), Japan (94%), Portugal (90%), and Argentina (93%) compared to other geographical locations (refer Table 10).

3.4.2.2 Integrators

We received 929 complete responses from integrators from 61 different geographical locations. These respondents were diverse in terms of age [21-30 years (38%), 31-40 years (46%), 41-50 years (12%)], gender [male (97%), female (3%)], experience in OSS [1-2 years (6%), 3-6 years (43%), 7-10 years (24%), more than 10 years (26%) and others], job [Industry (66%), Freelance (10%), Academia (6%) and others], and role [Owner (88%) and Source code (11%)]. We started the survey by understanding their level of awareness of the geographical locations of the submitters they work with. 43% of integrators say that they are rarely aware of the geographical location of the submitters they work with. This is followed by 28% of integrators who feel that they sometimes know the geographical location, followed by 16% who never know, 10% often, and only 3% always. This was followed by a direct question to understand their perceptions of the importance of geographical location on the pull request acceptance decision, for which we tested hypothesis H4.

H4: Integrators perceive that the geographical location of submitters is important to their pull request acceptance decisions. 88% more developers perceive that the geographical location of submitters is unimportant than consider it important (chi-squared = 705.8317, df = 1, p-value < 2.2e-16), and this was consistent across all geographical locations (chi-squared = 12.958, df = 16, p-value = 0.6758). Thus, our observation refutes our hypothesis H4. This was followed by more specific questions to understand the influence of geographical location.

H5: Developers from some geographical locations are better at writing pull requests relative to others. 530 integrators responded to this question. An analysis of the responses shows, with statistical significance, that 52% more integrators disagree than agree that developers from some geographical locations are better at writing pull requests than others (chi-squared = 138.1231, df = 1, p-value < 2.2e-16). This refutes our hypothesis H5 that developers from some geographical locations are better at writing pull requests relative to others. On digging deeper, we found that the perceptions are similar across geographical locations (chi-squared = 18.3874, df = 11, p-value = 0.07302), with the exception of India, where integrators felt that developers from some geographical locations are better at writing pull requests. For the rest of the geographical locations, for every 1 developer who feels that there are differences in the abilities of submitters to write pull requests based on geographical location, there are 4 developers who disagree. On the contrary, half of the integrators from India agree and the other half disagree (refer Table 11).

H6: Integrators perceive that it is easy to work with submitters from the same geographical location. 418 integrators responded to this question. Integrators felt that it is easy to work with submitters from their own geographical location, with statistically significant results (chi-squared = 80.597, df = 1, p-value < 2.2e-16). There were 45% more integrators who felt that it is easy to work with submitters from their own geographical location. This supports our hypothesis H6. Further, we examined perceptions across geographical locations and found a different perspective (chi-squared = 24.4768, df = 8, p-value = 0.001906). On one side, countries like the United States (8 out of 10) and the United Kingdom (6 out of 10) agree that it is easy to work with developers from the same geographical location. On the other side, integrators from Germany and India disagree 6 out of 10 and 8 out of 10 times respectively. In short, a majority of the geographical locations analysed feel that it is easy to work with developers from the same geographical location, with few exceptions (refer Table 12).

H7: Integrators encourage submitters from their geographical location to participate. 437 integrators responded to this question. 26% more integrators agree than disagree that they encourage developers from their geographical location to participate, with statistically significant results (chi-squared = 27.8244, df = 1, p-value = 1.328e-07). This supports our hypothesis H7. Further, we see that perceptions on this depend on the geographical location of the integrator (chi-squared = 24.3008, df = 8, p-value = 0.00204). Unlike other geographical locations, integrators in the United Kingdom (62%), Germany (69%) and Sweden (77%) are less likely to encourage developers from their geographical location to participate (refer Table 13).

3.4.3 Open-ended Survey Responses

We received 639 open-ended survey responses from integrators in which they talked about factors that influence their acceptance decisions. From manually coding these open-ended responses, six themes emerged: technical skills (47%), the relevance of the requested feature (12%), communication skills (23%), behaviour (11%), trust (4%), and pro-activeness (2%). We went deeper into the non-technical aspects to uncover the expectations of integrators that may possibly explain the perceived differences in pull request acceptance decisions. One key observation from these responses was the importance of the ability to communicate. Integrators mentioned that they form an impression about the pull request based on the description

Table 11: Integrators’ perception on pull request quality and geographical location [AUC: 0.6]

                               Better loc on PR
(Intercept)                     1.30 (0.16)∗∗∗
pri_locationUnited Kingdom      0.84 (0.55)
pri_locationGermany             0.22 (0.45)
pri_locationFrance             −0.52 (0.52)
pri_locationCanada             −0.04 (0.59)
pri_locationSweden              1.19 (1.05)
pri_locationAustralia          −0.71 (0.58)
pri_locationNetherlands        −0.09 (0.68)
pri_locationIndia              −1.45 (0.58)∗
pri_locationBrazil              0.90 (1.07)
pri_locationSwitzerland         1.27 (1.05)
pri_locationNorway             −0.45 (0.71)
AIC                           440.03
BIC                           488.51
Log Likelihood               −208.01
Deviance                      416.03
Num. obs.                     420

∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05
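The coefficients in Tables 11-13 are from logistic regressions of a binary perception response on the integrator's primary location, with the United States as the base level. A minimal sketch of how such a model could be fit in R is shown below; the survey data frame and its columns (agree, pri_location) are hypothetical stand-ins for the actual survey data, and the pROC call for the reported AUC is optional.

    # Hypothetical survey data: one row per integrator, a binary perception
    # (1 = agree, 0 = disagree) and the integrator's primary location.
    set.seed(42)
    survey <- data.frame(
      agree        = rbinom(420, 1, 0.75),
      pri_location = factor(sample(c("United States", "United Kingdom",
                                     "Germany", "India", "Brazil"),
                                   420, replace = TRUE))
    )
    survey$pri_location <- relevel(survey$pri_location, ref = "United States")

    # Logistic regression with treatment coding, as in Tables 11-13.
    fit <- glm(agree ~ pri_location, data = survey, family = binomial)
    summary(fit)              # coefficients and standard errors

    # AUC of the fitted model (reported in the table captions):
    # library(pROC); auc(survey$agree, fitted(fit))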

Table 12: Integrators’ perception on ease to work with the same location developers [AUC: 0.7]

                               In group ease
(Intercept)                     1.50 (0.19)∗∗∗
pri_locationUnited Kingdom     −1.10 (0.45)∗
pri_locationGermany            −1.76 (0.46)∗∗∗
pri_locationFrance             −0.55 (0.56)
pri_locationCanada              1.14 (1.05)
pri_locationSweden             15.07 (665.51)
pri_locationAustralia           0.11 (0.80)
pri_locationIndia              −2.89 (0.81)∗∗∗
pri_locationBrazil             −1.50 (0.84)
AIC                           323.24
BIC                           356.99
Log Likelihood               −152.62
Deviance                      305.24
Num. obs.                     314

∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05

of the pull request. Integrator [R496] mentioned that “bad or misleading title influence her chances of accepting the pull request”. They also used the ability to communicate to justify their perceptions of the importance of the geographical location of submitters in explaining the pull request acceptance decision. Integrator [R289] stated that “Nationality really only plays a role in my ability to understand the written communication. Being distributed makes communication more important, so if I cannot understand the requester, I’m much less likely to accept the request.”. Further, integrator [R201] added that “There’s a possibility of language bias - if the pull request isn’t well-written (which is often the case when English is not the PR author’s first language) I may be more hesitant, but usually because of a fear of misunderstanding.”. Integrator [R883] even goes on to add that “I often reject pull requests that add in code that has misspelled words, poor grammar, etc., and ask the contributor to fix those before merging.”. In addition, integrators also mentioned the importance of behaviour, trust, and pro-activeness of submitters in their pull request acceptance decision. Integrator [R738] said that the tone of the pull request’s body is important for her: “I don’t want to work together if the person is rude.”. Integrator [R773] even goes on to say that “If they are rude, their pull request is rejected, even if the code quality is great.”. Besides these, integrators place their bet on the submitters they trust. Integrator [R50] stated that “People that have submitted good PRs in the past I almost blindly merge”. Finally, integrators appreciate the willingness of the submitters to quickly make desired changes to improve a contribution. Integrators ruled out the

Table 13: Integrators’ perception on encouraging developers from same geographical location [AUC: 0.6]

                               In group encourage
(Intercept)                     0.81 (0.16)∗∗∗
pri_locationUnited Kingdom     −1.00 (0.40)∗
pri_locationGermany            −1.18 (0.42)∗∗
pri_locationFrance             −0.19 (0.50)
pri_locationCanada              0.20 (0.61)
pri_locationSweden             −1.50 (0.63)∗
pri_locationNetherlands        −0.25 (0.65)
pri_locationIndia               0.17 (0.70)
pri_locationBrazil             15.76 (758.80)
AIC                           399.35
BIC                           433.04
Log Likelihood               −190.67
Deviance                      381.35
Num. obs.                     312

∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05

presence of explicit bias based on the geographical location, pointing instead to the value of the contributions they receive for their projects. In this context, integrator [R661] said that “I see PRs as a favor to me, so I tend to take it seriously to process PRs ASAP and treat requesters well.”.

3.5 threats to validity

3.5.1 Internal Validity

Data accuracy The accuracy of the results of the study depends on the accuracy of the data on which it is built. We have used GHTorrent data, which has been used extensively in several prior studies.

Language bias The surveys deployed for this study were written in English. We justify this choice through a pilot study in which an equal number of surveys were sent in English and French to developers in France; we received similar response rates for both. This is intuitive, as developers who use GitHub must be familiar with the basic English used in the survey. Still, there is a possibility that the choice of English biased the response rates from some geographical locations.

Researcher bias To prevent researcher bias in the articulation of our questions, we had our survey questions validated by a wide range of people, including survey design experts, before making the survey public. These people checked the language of the questions for ambiguity and the presence of bias.

Research reactivity The tendency of respondents to present themselves in a positive light may influence the results. We tried to minimize this by conducting anonymous surveys and not offering any monetary incentive to survey respondents.

Non-response bias It is possible that the developers who did not respond to the survey have different insights. However, we do not see this as a big concern, as we received survey responses from more than 76 geographical locations.

3.5.2 External Validity

Generalisability The quantitative analysis presented in this study is built on small, carefully selected projects. These projects may not be representative of all collaboratively developed projects. We try to address this concern by combining the results of the quantitative analysis with large-scale survey responses. Further, while we conducted this study on GitHub - one of the biggest code hosting sites featuring the pull-based development model - we believe that similar experiments must be conducted on other platforms, like Bitbucket, for generalisability.

3.6 implications

Data analyses suggest the presence of geographical bias. Analyses of 70,000+ pull requests show that the geographical location of submitters significantly influences pull request acceptance decisions. Compared to the United States, submitters from the United Kingdom (22%), Canada (25%), Japan (40%),

Netherlands (43%), and Switzerland (58%) have higher chances of getting their pull requests accepted. However, submitters from Germany (15%), Brazil (17%), China (24%), and Italy (19%) have lower chances of getting their pull requests accepted. Also, having submitters and integrators in the same geographical location increases the chances of pull request acceptance by 19%.

Submitters from some geographical locations perceive that they experience bias. Overall, submitters do not perceive that they experience bias: 97% more submitters feel that they did not experience bias compared to those who felt that they did. However, submitters from some geographical locations perceive that they have experienced bias more than submitters from other geographical locations. We found that the perceptions of the presence of bias are stronger for submitters from Brazil (93%), Italy (87%), Japan (94%), Portugal (90%), and Argentina (93%) when compared to other geographical locations.

Perceptions of submitters on geographical bias are in agreement with the data analysis. Submitters from Brazil and Italy perceive that they experience bias more than other geographical locations. The same is observed in our analysis of GitHub data: developers from Brazil and Italy have lower chances of getting their pull requests accepted compared to other geographical locations.

Integrators perceive that they are not biased in evaluating submitters. 53% more integrators perceive that they encourage developers from their country to participate. Also, 8 out of every 10 integrators feel that it is easy to work with submitters from the same geographical location. However, they do not feel that developers from some geographical locations are better at writing pull requests compared to others, with the exception of India. For every developer who feels that there are differences in the abilities of submitters to write pull requests based on geographical location, there are four developers who disagree. In India, however, half of the integrators agree and the other half disagree.

Integrators think that factors relating to the geographical location, and not necessarily the geographical location itself, may influence their pull request acceptance decisions. In the open-ended survey, integrators suggest that the observed differences may be explained in terms of language barriers and the ability to communicate, and not necessarily bias based on the geographical location of submitters.

There exists a bias blind spot [136] - a cognitive bias - as integrators perceive the absence of bias while submitters perceive that they experience it. Submitters from some geographical locations perceive that they experience bias, and this is supported by the data analysis. This contradicts the opinion of integrators that they are not biased. This may imply the presence of geographical bias despite the disagreement in the perceptions of integrators. Alternatively, it may be due to communication barriers which make submitters experience bias while integrators are just trying to ensure quality submissions. We have various reasons to believe that it is not due to communication barriers: 1) There are geographical locations with higher chances of pull request acceptance where English is, relatively speaking, not widespread in use, such as Japan. 2) Switzerland and Germany are very similar in terms of language use, yet fall at opposite ends of the spectrum: while English is widely spoken and used in Germany, it has lower chances of pull request acceptance. 3) We sent survey requests in different languages like French and received numerous responses stating that all GitHub developers communicate in English and that we should not translate the survey into local languages. 4) The law of large numbers also ensures that our results are not biased by a handful of people with poor communication skills. Together, these suggest the actual presence of geographical bias and a bias blind spot, as developers see the impact of bias on others' judgement while failing to see the impact of bias on their own judgement.

3.7 summary

In this chapter, we combined observations from 70,000+ pull requests and 2,500+ survey responses to analyze the influence of geographical location on pull request acceptance decisions. We found a bias blind spot - a type of cognitive bias - in the peer code review process in GitHub. This study shows that integrators perceive the absence of bias while submitters from some geographical locations perceive that they experience it. The perceptions of submitters match the data analysis - differences in pull request acceptance rates based on geographical location. This study informs integrators and submitters about the presence of bias despite the differences in their reported perceptions.

4 FACTORS INFLUENCING CONTRIBUTOR PARTICIPATION

Managers and decision makers are interested in explanations of contributor productivity. These explanations facilitate planning and informed decisions. In this chapter, we study the influence of various project and contributor characteristics on contributor participation. We start with a study of how the dynamics of competing projects influence contributor participation. Next, we study the relationship between the personality traits of contributors and their levels of contribution. We also study how people perform in different work situations. Finally, we examine how role and past contributions relate to future participation. A detailed description of each study is presented in the following sections.

4.1 dynamics of competing projects

4.1.1 Introduction

Sustained developer community participation is fundamental to the long-term success of open source software (OSS) projects [137]. A significant fraction of OSS projects fail due to their inability to attract a critical mass of developers [138] and the lack of sustained developer community participation [63]. The problem intensifies when projects start competing for developer community participation [139]. In OSS, a new project may start using someone else's project as a starting point, a phenomenon termed forking of the original project. The newly emerged project, termed an independently developed fork (IDF), competes with the original project for developer community participation. The competition is not only for the existing developer community participation, but also for potential developer community participation to ensure sustained development of the project. Some recent examples of popular forks are 1) the start of LibreSSL out of frustration with OpenSSL1, 2) the fork of IO.js from Node.js, which recently got merged back into the original2, and 3) the Docker fork for Windows3. These popular, competing projects gained the attention of the masses, making it relevant to understand how the original and the forked projects fare in terms of developer involvement. Besides these, many not-so-popular and small projects observe forking. Thus, to better understand a project's development trajectory and success, there is a need to understand the response of the developer community to this change. An understanding of the response of the developer community to forking may inform future research to support preemptive measures for project success. Forking has received mixed reactions from the developer community [140]. Existing studies claim that forking is healthy for the software ecosystem in the ‘survival of the fittest’ sense [141][142][143]. However, the split costs the project the loss of developers, as the synergy of working together is lost [141][142][143], and introduces incompatibilities between the two projects [144]. Other studies visually analyzed temporal trends in large, popular projects, like OpenOffice and its fork LibreOffice, and observed that forking has no effect on the original project or the emerged IDF [137][139]. However, a comprehensive understanding of the influence of forking is missing. In this study, we statistically investigate the influence of forking on the sustainability of developer community participation in a wide range of projects. The relevance of this research question (RQ) increases with recent changes in software development, that is, the use of coding sites like GitHub, Bitbucket, etc. GitHub is a project-hosting site built on a distributed version control system. Currently, it is the most popular coding site, surpassing GoogleCode and SourceForge in popularity4. GitHub hosts more than 23.8 million git repositories5 (as observed in June 2015) and has more than 6 years of development history. With its 1) widespread popularity, 2) representation of a wide range of projects in terms of size, language, domain, etc., and 3) availability of data through the GitHub REST API for analysis,

1 http://arstechnica.com/information-technology/2014/04/openssl-code-beyond-repair-claims-creator-of-libressl-fork/ 2 http://news.softpedia.com/news/io-js-Will-Be-Merging-Back-into-Node-js-482552.shtml 3 http://blog.docker.com/2014/10/docker-microsoft-partner-distributed-applications/ 4 http://readwrite.com/2011/06/02/github-has-passed-sourceforge 5 https://github.com/features


GitHub has emerged as a favored test-bed for conducting a wide variety of large-scale studies [145][146]. In this study, we leverage the diverse projects hosted on GitHub to answer the following:

How does forking influence the sustainability of developer community participation in original projects hosted on GitHub?

In GitHub, many active projects are brought in from outside, with histories dating back as far as 1999. These imported projects are mirrors of projects hosted elsewhere or are ported from git or other version control systems like Mercurial after some initial development. Unlike the projects that are created on GitHub, also termed internally developed projects, these projects have a prior development history and presumably have reasonably mature code at the time of project creation on GitHub. In this study, we examine the two classes of projects in parallel and look for possible differences in their response to forking. So, the sub research question we try to address is:

Do projects ported to GitHub behave differently relative to projects created in GitHub?

Understanding the community response to forking alone does not ensure project success. There is a need to understand the factors driving community behavior, which can then be fine-tuned to improve the chances of a project's success. So, here we examine the role of the characteristics of the competing projects at the time of forking in explaining the observed behavior. We ask the following question:

How do project characteristics at the time of forking influence the forkability and the sustainability of the developer community in the original project?

To investigate the influence of forking, we propose a quantitative framework to identify forks, characterize projects, and measure changes in developer community participation in the original project after forking. To do so, we identify types of forks and propose a quantitative approach to single out independently developed forks from the original project – a concept measured only in qualitative terms in the literature. The proposed measure helps analyze a wide range of projects to get a comprehensive view of the influence of forking. Next, we distinguish between internally developed and imported projects on GitHub using quantitative measures. This classification may help future research on project characterization examine the two classes of projects separately. Thus, while designing the framework for experimentation, we also make the following two contributions in this study:

1. Classification of forks and identification of IDFs in quantitative terms.

2. Classification of projects as imported or internally developed in quantitative terms.

We start with an early remark on the relevance of forking in GitHub by analyzing temporal trends. As we discuss later, the term forking is used broadly in GitHub. So, unless stated otherwise, we use the term forking to imply the emergence of IDFs from the original project. To measure the sustainability of developer community participation, we adopt and extend approaches from existing studies and measure the contribution of the developer community in terms of software configuration management artifacts. We measure changes in developer community participation after forking and model the influence as 1) decrease, 2) increase, and 3) no effect on the sustainability of developer community participation in the original project. Further, we model the observed behavior in terms of project characteristics to understand its reasons. A large-scale study of projects hosted on GitHub shows that forking significantly alters the sustainability of developer community participation in the original project. We find that 18%, 29%, and 53% of the original projects observe a decrease, an increase, and no effect respectively on developer community participation after forking. These results indicate that a small, yet significant, fraction of

projects observe a decline in the sustainability of developer community participation after forking. This effect is more pronounced in projects ported to GitHub from external sources than in projects created in GitHub. We find that the maturity of the project, the popularity of the owners of the competing projects, the domain, the community size of the competing projects, and the popularity of the competing projects explain the forkability and the chances of a decline in developer community participation.

4.1.2 Related Work

Forks may result from a wide variety of reasons. Web searches of popular forks [147], interviews, surveys [148], and fact-sheets [149] created by researchers found that forks may result from 1) technical reasons, 2) disregard for the developer community, 3) discontinuation of the original project, 4) commercialization of the fork, 5) legal issues, and 6) differences among the team. Forking is observed in every domain and very few forks are merged into the original [147]. Literature on popular forks suggests that forking has no effect on the original project or the emerged new project [148][149]. The above-mentioned studies examine the reasons for and outcomes of forking through qualitative measures. However, the influence of forking on developer community participation has seen little study. A closely related work by Gamalielsson et al. analyzed developer communities and their evolution in terms of project activity, developer commitment, and retention of committers over time [137]. They supported their analysis with the first-hand experiences of contributors [139]. That study visually analyzes temporal trends in a large, popular project and presents its observations. However, a comprehensive analysis of the influence of forking on a wide variety of projects is missing. Further, what explains the observed behavior is not known. There are studies that examine the role of individual factors (willingness, opportunity, identity construction, etc.) and social factors (sociality, situated learning, etc.) in the long-term participation of developers [150][59][11]. However, the characterization of the observed behavior in terms of project characteristics is, to the best of our knowledge, missing.

4.1.3 Background

4.1.3.1 Forks In the words of Eric Raymond, forking “spawns competing projects that cannot later exchange code, splitting the potential developer community" [151]. The definition of forking has evolved since then. In recent times, GitHub started using the term forking in a broader fashion. In GitHub, a repository is forked to 1) propose changes to someone else's project or 2) use someone else's project as a starting point for a new idea6. Based on the current usage of the term ‘fork’ in GitHub, there are three types of forks:

1. Contributing forks: Forks which propose changes to others’ code.

2. Independently developed forks: Forks which use someone else’s project as a starting point to a new idea.

3. Inactive forks: Forks which neither contribute nor are maintained independently.

All forks in GitHub can be classified into one of the three mutually exclusive categories based on their contributions. In GitHub, there are two ways to collaborate.

1. In the Fork & Pull model changes are reviewed via pull requests before they are included in the main project.

2. In the shared repository model, everyone is granted push access to a single shared repository and topic branches are used to isolate changes.

For this study, we choose the Fork & Pull model – a popular collaborative development model in open source projects. The choice is also driven by the fact that the shared repository model is prevalent in 1) private projects, which are not analyzed in this study, and 2) small teams, which are excluded from analysis, as we discuss later. Thus, we limit the scope of this study to the Fork & Pull collaborative development model. Now, we introduce the basic terminology used for forks, followed by the rules for classification. Finally, we classify forks to identify original projects, independently developed

6 https://help.github.com/articles/fork-a-repo/


Figure 14: Fork Tree, and Types of Forks on Pull Request History

forks, and inactive forks. The proposed framework can be used to identify forking in projects of different sizes quantitatively.

basic terminology Forking in GitHub follows a tree structure [152] with the original project, also termed as the master fork (MF), at the root of the tree. All subsequent forks of the master fork are termed as primary forks (PFs), secondary forks (SFs), tertiary forks (TFs) and so on [152]. In Fig. 14, the original project, MF is at level 0. PF1 and PF2 are forked from MF and are at level 1. Similarly, SFs forked from PFs and TFs forked from SFs are at level 2 and level 3 respectively. Unlike forks which follow a tree structure, pull requests and commits float arbitrarily by appropriately setting remotes7 to relevant repositories. This arbitrary contribution behavior decides the relationship between forks and hence the rules for classification discussed below.

rules for classification For projects built using the Fork & Pull model, the following are the rules for classification:

• Forks that initiate pull requests to predecessor forks are contributing forks.

• Forks that do not send pull requests to predecessor forks, but observe internal commits are independently developed forks. These forks may get pull requests from successor forks.

• Forks that neither send nor receive pull requests nor have internal commits are inactive forks.

Based on the rules defined above, the original project is an independently developed fork with successor IDFs. The proposed definition of original projects and independently developed forks is inclusive of various possible combinations. Thus, in a sub-system PF can act as the original project for SF, if 1) PF does not contribute back to the MF and is developed independently with pull requests from successor forks and internal commits, and 2) SF (derived from PF) does not contribute back to the PF or MF and is developed independently. However, for the ease of analysis and considering the limited data points at higher levels in the tree, we limit the definition of the original project to MF and that of the independently developed fork to PF.

nomenclature and classification With MF as the original project and PF as the only possible independently developed fork, we present a detailed nomenclature for forks. This is followed by the classification of forks based on all possible contribution patterns:

1. Forks which send pull requests to IDFs (PFs) are termed as exogenous forks while forks which send pull requests to original projects (MFs) are termed as endogenous forks [152].

7 https://help.github.com/articles/using-pull-requests/

Table 14: Classification of Forks [PR: Pull Request; MF: Master Fork; PF: Primary Fork; SF: Secondary Fork]

Category                          Symbol    Name                         Description                                    Pull Request   Commit History
Independently Developed Forks     S_Exo     Secondary Exogenous Fork     PF ←PR− SF                                     Yes            Not Required
                                  P_Intra   PF Intra                     PF ←PR− PF                                     Yes            Not Required
                                  P_ID      Internally Developed         PF with commit history                         No             Yes
Contributing to the Master Fork   P_En      Primary Endogenous Fork      MF ←PR− PF                                     Yes            Not Required
                                  S_En      Secondary Endogenous Fork    MF ←PR− SF                                     Yes            Not Required
                                  Merge     Merged Fork                  MF ←PR− PF ←PR− SF                             Yes            Not Required
                                  M_Intra   MF Intra                     MF ←PR− MF                                     Yes            Not Required
Inactive Forks                    P_Inact   Primary Inactive Fork        PF with no pull request(s) or commit history   No             No
                                  S_Inact   Secondary Inactive Fork      SF with no pull request(s) or commit history   No             No

a) Secondary forks (SFs) which send pull requests (PRs) to primary forks (PFs) are termed secondary exogenous forks (see S_Exo = PF2 ←PR− SF3 in Fig. 14 and Table 14). b) Pull requests by PFs and SFs to MFs are termed primary endogenous forks (P_En) and secondary endogenous forks (S_En) respectively (see P_En = MF ←PR− PF1 and S_En = MF ←PR− SF3 in Fig. 14 and Table 14).

2. There is one more variant in this category, where SFs send pull requests to PFs and PFs send pull requests to MFs. These forks are termed merged forks (see Merge = MF ←PR− PF1 ←PR− SF2 in Fig. 14 and Table 14).

3. When the source and destination of a pull request are the same, it is an intra-branch pull request. Such pull requests are termed MF Intra (M_Intra) and PF Intra (P_Intra) when the source and destination are MFs and PFs respectively (see M_Intra = MF ←PR− MF and P_Intra = PF1 ←PR− PF1 in Fig. 14 and Table 14).

4. Forks which do not send pull requests (refer column Pull Request in Table 14) are classified into two categories based on commit history (refer columns Pull Request and Commit History in Table 14). a) Forks with no pull requests but with commit history are internally developed forks (see P_ID in Table 14). b) Forks with no pull requests and no commit history are inactive forks (see P_Inact and S_Inact in Table 14).

The definition descends to every original project-independently developed fork pair. So, if a PF acts as the original project, pull requests from SFs to PFs are SF endogenous forks and pull requests from TFs to independently developed SFs are tertiary exogenous forks. For this study, the secondary exogenous fork (S_Exo), primary fork intra (P_Intra) and internally developed fork (P_ID) are Independently Developed Forks (refer category 1 in Table 14). Similarly, the primary endogenous fork (P_En), secondary endogenous fork (S_En), merged fork (Merge) and MF Intra (M_Intra) are contributing forks, contributing to the MF (refer category 2 in Table 14). Also, the primary inactive fork (P_Inact) and secondary inactive fork (S_Inact) are Inactive Forks (refer category 3 in Table 14).
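To make these rules concrete, the sketch below labels each fork as contributing, independently developed, or inactive from three hypothetical per-fork flags (whether it sends pull requests to a predecessor, receives pull requests from successors, and has internal commits). It illustrates the classification rules stated above, not the exact pipeline used in this thesis.

    # Hypothetical per-fork activity flags derived from pull request and commit data.
    forks <- data.frame(
      fork_id          = 1:4,
      sends_pr         = c(TRUE,  FALSE, FALSE, FALSE),
      receives_pr      = c(FALSE, TRUE,  FALSE, FALSE),
      internal_commits = c(FALSE, TRUE,  TRUE,  FALSE)
    )

    classify_fork <- function(sends_pr, receives_pr, internal_commits) {
      if (sends_pr)         return("contributing")             # PRs to a predecessor fork
      if (internal_commits) return("independently developed")  # own commits, no upstream PRs
      if (!receives_pr)     return("inactive")                 # no PRs either way, no commits
      NA_character_                                            # receives PRs only: not covered by the three rules
    }

    forks$type <- mapply(classify_fork, forks$sends_pr,
                         forks$receives_pr, forks$internal_commits)
    forks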

4.1.3.2 Projects Ported vs Created on GitHub In GitHub, a wide variety of projects are ported from elsewhere. Recently, GoogleCode announced its shutdown, and numerous projects were ported to GitHub8. In this section, we present a quantitative approach to distinguish between projects created in GitHub and projects ported from elsewhere.

8 http://google-opensource.blogspot.in/2015/03/farewell-to-google-code.html

Imported projects include mirrors of popular projects which observe development activities at their respective official sites [153]. These projects, with no development within GitHub, are of no interest to this study. Interestingly, these projects, characterized by no forks or fork tree structure [153], are automatically eliminated when we select projects with a fork tree structure of at least two levels. The next category is projects ported from git or other version control systems. These projects have a fair amount of code before they are ported to GitHub. One characteristic that differentiates imported projects from internally developed projects is their commit history: imported projects have commit history prior to their date of creation on GitHub. Manual analysis of projects hosted on GitHub shows development history dating back to 1999. We use this feature to separate the two classes of projects.

1. Imported projects: Projects with commit history prior to project creation date on GitHub.

2. Internally developed projects: Projects with commit history (or pull request history) after project creation date on GitHub.

It is important to note here that imported projects do not have pull request history, so we compare commit history with project creation date on GitHub.
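A minimal R sketch of this rule, assuming a small data frame with hypothetical columns for each project's GitHub creation date and earliest commit date:

    # Hypothetical project records: creation date on GitHub and earliest commit date.
    projects <- data.frame(
      project_id        = 1:3,
      created_on_github = as.Date(c("2011-05-01", "2010-03-15", "2012-08-20")),
      first_commit      = as.Date(c("1999-11-02", "2010-03-20", "2012-08-20"))
    )

    # Imported projects have commit history that predates their GitHub creation date.
    projects$class <- ifelse(projects$first_commit < projects$created_on_github,
                             "imported", "internally developed")
    projects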

4.1.4 Methodology

We identify projects with independently developed forks and statistically analyze changes in developer community participation in the original project after forking. A step-by-step description of the methodology follows. To help replication, the data are made publicly available9.

4.1.4.1 Data Acquisition and Curation We start with the GitHub MySQL data10 made available by GHTorrent in January 2014 [154]. GHTorrent is an offline mirror of the data offered through the GitHub REST API. It monitors the GitHub public event timeline and tracks all user activities and events in a project, including commits, pull requests, etc. The downloaded GHTorrent data stores the development history of 3,004,002 projects hosted on GitHub from October 2007 to January 2014.

preprocessing

• Project Selection Criteria: We select projects where MFs are created between the year 2008 and 2012. The choice of 2012 as the end year ensures that we have at least a year of development history, even for the projects that are created towards the end of the year 2012.

• Project Elimination Criteria: We eliminate projects with deleted MFs. Public projects where the MF is deleted are assigned a new MF, and all subsequent forks and pull requests go to the new MF11. The split of the developer community contribution between the old and new MF justifies removing such projects from the study.

identify projects with idfs Selection of projects with IDFs requires that we select projects with at least two levels in the fork tree. This step also eliminates mirrored projects that are actively maintained outside GitHub and have no forks in GitHub (Peril IX) [153]. Thus, we select projects with MF at level 0, PF at level 1 and SF at level 2. A two-level fork tree captures all possible combinations defined in the scope of this study. Empirical analysis shows that this also ensures that we eliminate personal projects and select collaboratively developed MFs for the study. Then, based on the fork classification criteria discussed in Section 4.1.3.1, we identify projects with independently developed forks. We select projects where IDFs received significant participation from the community to possibly influence the original project. So, we select projects where IDFs have at least 100 commits, 90 days of activity, 4 contributors and 2 committers (an indicator of significant contribution used in [146]). This gives us MF-PF pairs where PFs have significant developer community participation to influence the original project. It is important to note here that we do not apply the significance criteria to the

9 https://github.com/AyushiRastogi/fsoc 10 http://ghtorrent.org/downloads.html 11 https://help.github.com/articles/what-happens-to-forks-when-a-repository-is-deleted-or-changes-visibility/

original project. This ensures that we do not bias our data set towards original projects that get significant attention from the developer community. We now have MF-PF pairs where one MF can have multiple PFs with significant participation from the developer community. Here, we treat each MF-PF pair uniquely because different PFs may influence their MFs differently. Finally, we aggregate the results and present the cumulative effect for projects.
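The significance thresholds described above (at least 100 commits, 90 days of activity, 4 contributors, and 2 committers in the IDF) amount to a simple filter over per-fork activity summaries. The sketch below applies them to a hypothetical summary table; the column names are placeholders, not the actual schema.

    # Hypothetical per-PF activity summary for candidate MF-PF pairs.
    pf_summary <- data.frame(
      mf_id        = c(1, 1, 2),
      pf_id        = c(10, 11, 20),
      commits      = c(250, 40, 180),
      active_days  = c(200, 30, 95),
      contributors = c(6, 2, 5),
      committers   = c(3, 1, 2)
    )

    # Keep only PFs with significant developer community participation.
    significant_pairs <- subset(pf_summary,
                                commits >= 100 & active_days >= 90 &
                                contributors >= 4 & committers >= 2)
    significant_pairs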

classify projects on development history before creation date on github We classify the selected MF-PF pairs on prior development history. MF-PF pairs where either the MF or the PF has commit history prior to its creation date on GitHub are classified as imported projects. All remaining MF-PF pairs are classified as internally developed projects. We study the two classes of MF-PF pairs in parallel.

4.1.4.2 Developer Community Participation All relevant contributions to a repository are bookmarked in the form of commits. GitHub's pull-based development model allows the core contributors to push their changes directly into the repository while the rest submit their contributions via pull requests. Pull requests are reviewed and, contingent on their suitability, may get merged into the system. Contributors can also report issues, post comments to help get them fixed, and close issues by committing12. Thus, commit counts provide a fair representation of the overall activities of contributors and have been used in various studies [137][146]. So, in this study, we measure commit counts as a proxy for developer community participation and use commit counts per time-interval to measure the sustainability of developer community participation. It is important to note here that we do not measure changes in the developer community but rather changes in developer community participation. The justification for our choice is that developers may participate in both competing projects. Also, developers may adjust their efforts to account for the loss of developers and ensure the sustainability of the project.

4.1.4.3 Influence of Forking For the MFs of the selected projects, we identify a breakpoint at the creation date of the fork and study changes in developer community participation after forking. One of the key challenges in empirically investigating projects with forks is sporadic developer community participation in the original projects. We found that the count of time-intervals in which the developer community participates is significantly less than the count of time-intervals of project existence. So, to ensure that we have 1) significant development history and 2) a fair count of projects to conduct analysis, we experiment with multiple time-intervals of existence and of activity. In this study, we experiment with 3, 6, 9, and 12 month time intervals of existence and activity and present our results. We model changes in developer community participation using the Mann-Whitney U test, a non-parametric test, and decide at the 0.05 significance level. The choice of a non-parametric test is guided by the significant fraction of projects with insufficient developer community participation to approximate a normal distribution. Also, this ensures that contribution spikes around releases do not bias our results. The two-sided hypothesis to study changes in developer community participation is as follows:

H0: Forking has no effect on the sustainability of the developer community participation in the original project.
Ha: Forking changes developer community participation in the original project.

Further, we measure the direction of change (increase or decrease) when forking changes developer community participation in the original project.
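In R, the Mann-Whitney U test is available as wilcox.test. A sketch of the breakpoint analysis on one project, using hypothetical monthly commit counts before and after the forking date:

    # Hypothetical monthly commit counts in the MF before and after forking.
    before <- c(42, 55, 38, 61, 47, 50, 44, 58, 39, 52, 46, 49)
    after  <- c(30, 25, 33, 28, 22, 35, 27, 31, 24, 29, 26, 32)

    # Two-sided Mann-Whitney U test at the 0.05 level (H0: no change).
    test <- wilcox.test(before, after)
    test$p.value

    # If H0 is rejected, read the direction of change from the medians.
    if (test$p.value < 0.05) {
      ifelse(median(after) < median(before), "decrease", "increase")
    }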

4.1.4.4 Project Characteristics and Developer Community Participation Projects are characterized by various static (owner, domain, tools and technology used, etc.) and temporal (maturity, contributors, process followed, tools and technology used, popularity, etc.) attributes. In this study, we discuss the role of some of these characteristics in explaining the forkability and

12 https://help.github.com/articles/closing-issues-via-commit-messages/

sustainability of the developer community participation in the original project. Below, we define these project characteristics in terms of SCM artifacts.

maturity Maturity of the MF at the time of creation of the PF is measured as the duration between the minimum of the MF creation and first commit dates and the minimum of the PF creation and first commit dates. The choice of the first commit date accounts for projects with development activities outside the GitHub development environment. Mathematically, the age of the MF is {min(PFcreated, PFfirstCommit) - min(MFcreated, MFfirstCommit)}.

developer community size Developer community size is the count of contributors whose code was committed into the repository.

owner We measure the influence of the owner in terms of the nature and size of the follower base. By the nature of followers, we mean the influence of the followers of the owner, measured here as the count of the followers' followers. Thus, the influence of the owner is Σ follower[follower[repository]].

domain The domain of projects hosted on GitHub is not stated explicitly. In this study, we infer the domain of a project from its textual description. We use a mix of automated and manual techniques (as suggested by [4]) and classify projects into six domains, namely application, database, compiler, middleware, library, framework, and others. We identify topics using Latent Dirichlet Allocation (LDA), a well-known topic analysis algorithm, and then manually classify the topics into the six domains described above.

popularity We measure the popularity of a repository in terms of the watchers' count for the repository.
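As an illustration, the maturity measure defined above can be computed directly from the four dates; the dates below are hypothetical.

    # Hypothetical creation and first-commit dates for one MF-PF pair.
    mf_created      <- as.Date("2010-01-10")
    mf_first_commit <- as.Date("2009-06-01")   # development began before GitHub
    pf_created      <- as.Date("2012-03-05")
    pf_first_commit <- as.Date("2012-03-07")

    # Maturity of the MF at the time of forking:
    # min(PF created, PF first commit) - min(MF created, MF first commit).
    maturity_days <- as.numeric(min(pf_created, pf_first_commit) -
                                min(mf_created, mf_first_commit))
    maturity_days / 365   # maturity in years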

4.1.4.5 Statistical Method We use logistic regression to model the relationship between a set of predictors (project characteristics) and a binary outcome, that is, significant fork or insignificant fork, and decline in developer community participation or not. We treat each MF-PF pair as a row in our regression model. Our choice to consider each MF-PF pair as an independent observation is motivated by the fact that the dynamics of each PF with the same MF are different. Descriptive statistics of the independent variables show that our data is heteroscedastic. Thus, to stabilize the variance and help improve the model fit, we log-transform the data. We verify this by comparing the AIC of the models fit on transformed and non-transformed data. Further, to help compare 'domain' relative to a base level, we use treatment coding. As we expect the developer community participation to respond differently in projects of different sizes, we classify projects on team size as small, medium, and large projects. Small projects have fewer than 11 contributors in the MF, medium projects have more than 10 and fewer than 30 contributors, and large projects have more than 30 contributors, as suggested by [146]. We build the regression model using generalized linear models, as implemented in the glm2 package in R. Coefficients are considered important if they are statistically significant (p < 0.05). Effect sizes are obtained from ANOVA analyses, and we evaluate models using residual deviance as a measure of goodness of fit. Table 17 and Table 18 show the results of the study. Here the coefficients are read as the expected change in the log odds of the response for a unit change in the predictor variable while keeping all other predictor variables constant. The log odds can also be expressed as a (exp(coeff)-1)*100 percentage change in the odds ratio. The interpretation is the percentage change in the odds (relative to the baseline) by which a unit increase in the predictor variable influences the response variable.
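A minimal sketch of this modelling step is shown below, using the glm2 package mentioned above (its glm2 function has the same interface as base glm). The data frame, predictor values, and team-size split mirror the description but are hypothetical stand-ins for the actual MF-PF dataset; the last line shows how a coefficient translates into a percentage change in odds, e.g. the projectAge coefficient of 0.47 in Table 17 corresponds to (exp(0.47) - 1) * 100, roughly a 60% increase.

    library(glm2)

    # Hypothetical MF-PF pairs: binary outcome (significant fork / decline observed)
    # and log-transformed predictors, mirroring the variables in Tables 17 and 18.
    set.seed(1)
    pairs <- data.frame(
      outcome           = rbinom(1000, 1, 0.2),
      projectAge        = log1p(runif(1000, 0, 5)),
      mf_authorCount    = log1p(rpois(1000, 8)),
      mf_followersCount = log1p(rpois(1000, 20)),
      mf_watchersCount  = log1p(rpois(1000, 50)),
      mf_domain         = factor(sample(c("Application", "Library", "Framework"),
                                        1000, replace = TRUE)),
      contributors      = rpois(1000, 15)
    )
    pairs$mf_domain <- relevel(pairs$mf_domain, ref = "Application")  # treatment coding
    pairs$team <- cut(pairs$contributors, breaks = c(-Inf, 10, 30, Inf),
                      labels = c("Small", "Medium", "Large"))

    # One logistic regression per team-size segment, as in Tables 17 and 18.
    fits <- lapply(split(pairs, pairs$team), function(d)
      glm2(outcome ~ projectAge + mf_authorCount + mf_followersCount +
             mf_watchersCount + mf_domain, data = d, family = binomial))

    # Coefficients expressed as percentage change in odds per unit increase.
    (exp(coef(fits$Small)) - 1) * 100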

4.1.5 Results

Table 15 shows the count of projects selected in each step of the methodology discussed in Section 4.1.4. The project counts are measured annually to help analyze temporal trends. A total of 1,082,501 projects were hosted on GitHub from 2008 to 2012, and the project count triples every year (refer 'Projects Count' in Table 15). Based on the project selection/elimination criteria in preprocessing,

Table 15: Basic statistics of GitHub dataset

Year    Projects     Selection/Elimination   Collaboratively       Projects with Independently     Total Projects = Projects developed within ∪ outside
        Count        Criteria                Developed Projects    Developed Forks (MF-PF pairs)   GitHub with Significant Forks (MF-PF pairs)
2008        8,873        8,541                   4,635               1,211 (28,216)                   277 =  38 ∪   263 (   56 /   569)
2009       26,775       25,632                  15,161               3,633 (67,087)                   647 = 103 ∪   615 (  339 / 1,143)
2010       64,886       61,407                  34,300               8,559 (123,471)                1,277 = 215 ∪ 1,196 (  390 / 2,221)
2011      179,654      165,718                  82,576              19,497 (199,127)                1,815 = 317 ∪ 1,670 (  565 / 3,134)
2012      802,313      616,390                 178,661              32,738 (219,719)                1,408 = 262 ∪ 1,223 (  398 / 1,993)
Total   1,082,501      877,688                 315,333              65,638 (637,620)                5,424 = 935 ∪ 4,967 (1,748 / 9,060)

we identified 877,688 projects, out of which 315,333 projects have a fork structure up to two levels, and 65,638 projects with IDFs, or 637,620 MF-PF pairs (refer Total of 'Selection/Elimination Criteria', 'Collaboratively Developed Projects' and 'Projects with Independently Developed Forks' respectively in Table 15). Approx. one-third of the projects hosted on GitHub are developed collaboratively (refer columns 'Projects Count' and 'Collaboratively Developed Projects' in Table 15). This observation is in agreement with the study of Kalliamvakou et al. [153]. Out of the total collaboratively developed projects, approx. one-fifth (65,638 projects) observe development activities other than contribution to the MF (refer 'Collaboratively Developed Projects' and 'Projects with Independently Developed Forks' in Table 15). An analysis of the temporal trends in column 'Projects with Independently Developed Forks' in Table 15 shows that every year projects with IDFs double in count and represent a consistent fraction of projects over time. Notably, we see a sudden dip in the count of projects that observe forking for the year 2012. This may be explained in terms of insufficient development history for forking to occur in the first place and for significant development activities to happen thereafter to examine the influence. Thus, forking is significant and is going to be more frequent in the future. This observation regarding the increasing relevance of forking for independent development is in agreement with the study by Robles et al. [147] who studied forking at the project level.

4.1.5.1 How does forking influence the sustainability of the developer community participation in the original projects hosted on GitHub? Do projects ported to GitHub behave differently relative to projects created in GitHub? Out of the 5,424 projects with significant IDFs, only 2,217 projects have significant developer community participation before and after forking for analysis (refer Total of 'Projects with IDFs' and 'Total Projects' in Table 15). This project count includes projects created on and ported to GitHub. For the identified projects, we compute changes in developer community participation in MF-PF pairs. Table 16 summarizes changes in developer community participation as decrease, increase, and no effect for the MF-PF pairs created or ported to GitHub annually. Here, each entry summarizes the mean ± standard deviation of selected projects with 3, 6, 9, and 12 months of existence before and after forking. In Table 16, we see that imported projects (1,881) are approximately three times as numerous as the internally developed projects in GitHub (541). Approximately 58% of the internally developed projects in GitHub observe no effect on developer community participation after forking. Of the remaining, ≈33% and ≈9% of projects observe an increase and a decrease in developer community participation respectively. Interestingly, more imported projects observe a decrease in developer community participation after forking (≈19%), while the fractions of projects that observe no effect (≈50%) or an increase (≈30%) in developer community participation reduce proportionally. Fig. 15, 16 and 17 show instances of three projects and the ways in which the sustainability of the developer community is influenced by forking. Here, the horizontal axis shows time-intervals measured monthly and the vertical axis shows developer community participation measured as commit counts per month. The dotted red line indicates the time of occurrence of forking. In Fig. 15, developer community participation decreases after forking. In Fig. 17, developer community participation is not affected, and in Fig. 16, developer community participation increases after forking.

Table 16: Classification of MF-PF pairs (or projects) developed within and outside GitHub as decrease, increase, and no change in sustainability of the developer community after forking. Each column entry is a summary (mean ± standard deviation) of MF-PF pairs having 3, 6, 9, and 12 months of existence before and after forking

Type       Year             Decrease   Increase   No Change   Total
Internal   2008               2±0        11±0       15±1        28±1
Internal   2009              18±0       156±2       84±6       258±8
Internal   2010              20±0       130±2      122±6       272±8
Internal   2011              28±1       148±6      210±20      387±27
Internal   2012              12±4        42±12     108±22      161±38
           Projects Count    50         178        313         541
           Percentage        ≈9%        ≈33%       ≈58%
Imported   2008              41±0       150±1       71±1       216±2
Imported   2009              75±2       233±3      189±7       496±12
Imported   2010             159±6       457±5      441±22     1057±33
Imported   2011             570±16      570±16     791±46     1650±78
Imported   2012             213±10      292±9      504±42     1009±60
           Projects Count   365         569        947        1881
           Percentage       ≈20%        ≈30%       ≈50%
All        Projects Count   400         651       1166        2217
           Percentage       ≈18%        ≈29%       ≈53%

Figure 15: Decrease in participation   Figure 16: Increase in participation   Figure 17: No effect in participation
[Figure axes: Time vs. Commit Counts per month]

Table 17: Influence of project characteristics on emergence of significant PF. Team size segments on contributors' count: Small < 11; 10 < Medium < 30; 30 < Large

                       Small               Medium              Large
(Intercept)           −8.62 (0.10)∗∗∗     −6.00 (0.20)∗∗∗     −5.36 (0.09)∗∗∗
projectAge             0.47 (0.02)∗∗∗      0.10 (0.02)∗∗∗      0.11 (0.02)∗∗∗
mf_authorCount         1.55 (0.04)∗∗∗      0.75 (0.07)∗∗∗      0.38 (0.02)∗∗∗
mf_followersCount     −0.10 (0.01)∗∗∗     −0.10 (0.01)∗∗∗     −0.13 (0.00)∗∗∗
pf_followersCount      0.05 (0.01)∗∗∗      0.09 (0.01)∗∗∗      0.18 (0.00)∗∗∗
mf_domainCompiler     −0.05 (0.13)        −0.18 (0.13)         0.17 (0.10)
mf_domainDatabase      0.19 (0.10)         0.01 (0.08)         0.66 (0.09)∗∗∗
mf_domainFramework    −0.26 (0.10)∗∗       0.02 (0.08)         0.29 (0.07)∗∗∗
mf_domainlibrary      −0.44 (0.11)∗∗∗     −0.42 (0.09)∗∗∗      0.07 (0.08)
mf_domainMiddleWare    0.25 (0.16)         0.45 (0.12)∗∗∗      1.05 (0.10)∗∗∗
mf_domainOthers        0.11 (0.07)         0.14 (0.05)∗        0.74 (0.05)∗∗∗
mf_watchersCount      −0.27 (0.01)∗∗∗     −0.24 (0.01)∗∗∗     −0.27 (0.01)∗∗∗
AIC                   25997.69            30564.25            42813.81
BIC                   26145.54            30695.19            42947.10
Log Likelihood        −12986.85           −15270.12           −21394.91
Deviance              25973.69            30540.25            42789.81
Num. obs.             1656514             404926              492457

∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05

4.1.5.2 How do project characteristics at the time of forking influence the forkability of the original project?

maturity With the rest of the characteristics being the same, in small projects, an increase in the maturity of a project by a year increases the odds of occurrence of a significant fork by 60%. However, for medium and large projects the increase in odds is 10% (refer projectAge in Table 17).

Maturity of the MF is positively related to its forkability, with an attenuating effect as project size increases.

community size In small projects, a one unit increase in the contributors' count increases forkability by 370%, which reduces to 110% and 40% for medium and large projects respectively (refer mf_authorCount in Table 17).

An increase in the developer community size of the MF is positively related to its forkability, with an attenuating effect as project size increases.

owner A one unit increase in the influence of the owner of the MF is associated with a 10-12% decrease in the odds of occurrence of forks for all project sizes (refer mf_followersCount in Table 17). However, a one unit increase in the influence of the owner of the PF is associated with a 5%, 9%, and 19% increase in the odds of occurrence of a fork (refer pf_followersCount in Table 17).

An increase in the influence of the owner of the MF decreases the forkability of the MF. An increase in the influence of the owner of the PF increases the forkability of the MF.

domain We choose domain 'Application', with large project counts and a large developer base, as the base level to compare the impact of domain on significant forking. In Table 17, the domain coefficients can be broadly categorized into three general categories. The first category includes those domains for which the coefficients are statistically insignificant and the modeling procedure could not distinguish the impact from domain 'Application' (the base level). These domains behave like domain 'Application' or have wide variance. The remaining coefficients are either positive or negative. For domains with positive coefficients, the odds of occurrence of a significant fork are higher relative to domain 'Application': for instance, domain 'Middleware' in medium-sized projects and domains 'Database', 'Framework' and 'Middleware' in large-sized projects. For those with negative coefficients, the odds of occurrence of a significant fork are lower relative to domain 'Application': for instance, domains 'Framework' and 'Library' in small-sized projects, and domain 'Library' in medium-sized projects. To sum up, domains influence the occurrence of significant forks. Domain 'Application' is more likely to observe significant forking relative to other domains in small-sized projects. It has mixed effects in medium-sized projects and is less likely to observe significant forking relative to other domains in large-sized projects (refer domain in Table 17).

The domain has significant, mixed effects on the forkability of the MF.

popularity An increase in the popularity of the MF by one unit decreases the odds of occurrence of a significant fork in projects of all sizes by approx. 21-23% (refer mf_watchersCount in Table 17).

The more popular the MF, the less likely it is to be forked.

4.1.5.3 How do project characteristics at the time of forking influence the sustainability of the developer community participation?

maturity In Table 18, the impact of project maturity (projectAge) is statistically significant for medium and large teams. We observe that an increase in the maturity of MFs by a year reduces

Table 18: Influence of project characteristics on decrease in developer community participation in the original project. Team size segments on contributors' count: Small < 11; 10 < Medium < 30; 30 < Large

                       Small               Medium              Large
(Intercept)          −11.60 (0.68)∗∗∗     −7.09 (0.66)∗∗∗     −6.49 (0.31)∗∗∗
projectAge            −0.11 (0.12)        −0.27 (0.07)∗∗∗      0.17 (0.05)∗∗∗
mf_authorCount         1.42 (0.28)∗∗∗      0.12 (0.22)        −0.16 (0.07)∗
pf_authorCount         2.64 (0.14)∗∗∗      2.42 (0.08)∗∗∗      1.70 (0.05)∗∗∗
mf_followersCount     −0.19 (0.04)∗∗∗     −0.09 (0.02)∗∗∗     −0.18 (0.02)∗∗∗
pf_followersCount      0.05 (0.04)         0.01 (0.02)         0.04 (0.02)∗∗
mf_domainCompiler      0.28 (0.66)        −1.73 (0.62)∗∗       0.64 (0.27)∗
mf_domainDatabase     −0.38 (0.61)         0.13 (0.24)         0.81 (0.22)∗∗∗
mf_domainFramework     0.39 (0.48)         0.21 (0.23)         0.22 (0.22)
mf_domainlibrary       0.04 (0.52)         0.00 (0.25)        −0.19 (0.29)
mf_domainMiddleWare    0.31 (0.78)        −1.06 (0.53)∗        1.01 (0.24)∗∗∗
mf_domainOthers       −0.26 (0.36)        −0.60 (0.17)∗∗∗     −0.08 (0.16)
mf_watchersCount      −0.23 (0.07)∗∗∗     −0.23 (0.04)∗∗∗     −0.17 (0.03)∗∗∗
pf_watchersCount       0.20 (0.12)         0.15 (0.08)         0.41 (0.05)∗∗∗
AIC                    983.23             2708.90             4382.38
BIC                   1155.71             2861.66             4537.88
Log Likelihood        −477.61            −1340.45            −2177.19
Deviance               955.23             2680.90             4354.38
Num. obs.             1656514             404926              492457

∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05

the odds of decline in the developer community participation by 23% in medium size projects and increases the odds of decline by 18% in large size projects (refer projectAge in Table 18).

An increase in the maturity of the MF combats decline in medium-sized projects and supports decline in large-sized projects.

community size In small projects, an increase in the community size of the MF by one unit increases the chances of decline in developer community participation by 313%. In large projects, an increase in the community size of the MF by one unit decreases the odds of decline in developer community participation by 15%. An increase in the community size of the PF by one unit increases the odds of decline in developer community participation by 13, 10, and 5 times for small, medium, and large projects respectively.

An increase in the community size of the MF increases the chances of decline in developer community participation in small projects. An increase in the community size of the MF decreases the chances of decline in developer community participation in large projects. An increase in the community size of the PF increases the chances of decline in developer community participation in the original project.

owner In Table 18, an increase in the influence of the owner of the MF (refer mf_followersCount) decreases the odds of decline in developer community participation by 17%, 8%, and 16% for small, medium and large projects respectively. In large projects, an increase in the influence of the owner of the PF increases the odds of decline in developer community participation by 4%.

An increase in the influence of the owner of the MF decreases the chances of decline in developer community participation in the MF. An increase in the influence of the owner of the PF increases the chances of decline in developer community participation in the MF.

domain The influence of domain is statistically insignificant for small-sized projects. For medium-sized projects, the odds of decline of developer community participation decrease by 82% for domain 'Compiler' and 40% for domain 'Middleware' relative to domain 'Application' (refer Table 18). However, for large projects, the odds of decline in developer community participation relative to 'Application' increase by 89%, 124%, and 174% for domains 'Compiler', 'Database', and 'Middleware' respectively. In short, projects in domain 'Application' are less likely to observe a decline as project size increases.

The domain has significant, mixed effects on the decline in developer community participation.

popularity In Table 18, an increase in the popularity of the original project (refer mf_watchersCount) decreases the odds of decline in the developer community participation by 20%, 20% and 15% in small, medium and large projects respectively. In large projects, an increase in the popularity of the PF increases the odds of decline in the developer community participation by 51%.

An increase in the popularity of the MF decreases the chances of decline in developer community participation in the original project after forking. An increase in the popularity of the PF increases the chances of decline in developer community participation in the original project after forking.

4.1.6 Threats to Validity

We try to address concerns related to Git [155] or GitHub [153] by removing projects with too few commits and also one-person projects. However, our attempts to quantitatively measure the influence of forking are subject to limitations.

• Internal Validity – Data Accuracy: GHTorrent data provide a fair estimate of activities and events in a project and have been used in various studies. However, projects whose history is rewritten may lead to an underestimation of the developer community participation and affect the classification of forks. – Classification of Forks: Some forks start as contributing and then grow to become independent, or vice-versa. Since we do not include the temporal aspect here, we may end up classifying both types of projects as contributing. – Collaborative Development Model: We study collaboration through the Fork & Pull model. So, projects which use a mix of the Fork & Pull and shared repository models are not modeled completely.

• External Validity We present a case study of projects hosted on GitHub. While GitHub represents a diverse set of projects, for generalization, similar experiments must be conducted on other code hosting sites such as BitBucket.

4.1.7 Implications

Forking is significant and is going to be more relevant in the future. An analysis of more than 2,000 OSS projects shows that forking alters developer community participation in approximately half of the projects. Interestingly, it increases developer community participation in more projects than it decreases developer community participation. Yet, the decrease observed influences a significant fraction of the projects. Also, the effect is more pronounced in imported projects than in internally developed projects. Further, to understand the cause of the decline in the sustainability of the developer community participation, we study the influence of various project characteristics at the time of forking. We found that an increase in the maturity of the MF, its community size and the influence of the owner of the PF increases the forkability of the original project. An increase in the influence of the owner of MF and the popularity of the MF decreases the forkability. The domain has a significant, mixed effect on the forkability. We see that an increase in the maturity of MF decreases the chances of decline in the developer community participation in medium-sized projects and increases the chances of decline in large-sized projects. Similarly, an increase in the community size of MF increases and then decreases the chances of decline in developer community participation as project size increases. Besides this, an increase in the community size of PF, the influence of its owner, and its popularity increases the chances of decline in the developer community participation in the original project. An increase in the influence of the owner of MF and its popularity decreases the chances of decline in the developer community participation in the original project. Among others, these observations explain why the fork LibreOffice has no effect on the sustainability of the developer community participation in OpenOffice. It is important to note here that some factors have effects both ways. For instance, an increase in the maturity of the MF increases the forkability, while also decreasing the chances of decline in developer community participation of MF.

4.2 personality traits and context of software development

4.2.1 Introduction

Studies in Psychology suggest that behavior is best predicted from a comprehensive understanding of the person, the situation and the interactions between the person and the situation [156]. In this study, we intend to empirically explain the behavior of contributors by investigating their personality profiles - unique to them - and the context of software development. We believe that an understanding of the behavior of contributors will help comprehend the true nature of online distributed software development. This understanding of contributors' behavior can be used to facilitate the software development process, thereby improving project performance. For instance, personality traits can be used to explain productivity [157], withdrawal behavior [158], etc. Studies on commercial software development projects [159][160][161][162] and online platforms like StackOverflow [163] and Twitter [164] have shown the importance of personality traits in explaining behavior. A preliminary study on an OSS project also acknowledged the inferential ability of personality traits in explaining contributors' joining and leaving behavior [158]. In this study, we measure the personality profiles of contributors, as evident from their language use [165], in discussions on software development. Personality profiles of contributors are described through five traits, namely Openness to Experience, Conscientiousness, Extraversion, Agreeableness and Neuroticism [166]. Openness to Experience is related to being insightful and open to new ideas [157]. Conscientiousness is the preference for order and goal-directed work [157]. Extraversion describes individuals' desire to seek company and derive stimulation from the external world [157]. Agreeable people are cooperative, compassionate and sensitive to others [157]. Neuroticism is related to the tendency to express negative emotions and anger [157]. We empirically analyze the relationships of the personality traits of contributors with contributions and the context of software development in GitHub. Specifically, we study differences in the personality traits of sub-communities of contributors with different levels of contributions and project membership status. Also, we study differences in the personality traits of individuals when they contribute in different contexts. Particularly, we look for changes in the personality traits with time, type of contribution, role and project membership. We leverage the software development project data in GitHub to provide answers to the following project-level and contributor-level research questions:

Project-level

RQ1) Do personality types vary with the contribution? Contributors who contribute more score high on Openness to Experience, Conscientiousness, and Extraversion. Agreeableness decreases as contribution increases. Contributors with extreme (high and low) levels of contributions are more Neurotic compared to contributors with medium levels of contributions.

RQ2) Do personality types vary with project membership? There are no differences in Openness to Experience and Neuroticism based on the membership of contributors in the projects. Project members are more Conscientious and Extrovert and less Agreeable compared to non-project members.

Contributor-level

RQ3) Do personality traits of contributors evolve with time? There are no changes in Openness to Experience and Neuroticism. Active contributors evolve as more Conscientious, more Extrovert and less Agreeable with time.

RQ4) Do contributors show different personality traits for different contribution types? There are no differences in Openness to Experience based on the contribution type. Active contributors are most Conscientious, Extrovert and Neurotic while discussing pull requests and most Agreeable while discussing issues.

RQ5) Do contributors show different personality traits for different roles? As reporters, highly active contributors are more Open to Experience, Conscientious and Neurotic compared to when they act as reviewers. Contributors are more extrovert as reviewers. There is no change in Agreeableness.

RQ6) Do contributors show different personality traits based on project membership? Active contributors are more Conscientious, Agreeable and Neurotic when they are not members of the project than when they are. Contributors are more Extrovert when they are members of the project. There is no change in Openness to Experience. It is important to note here that in RQ2, we compare the personality profiles of sub-communities of contributors who are project members against non-project members. In RQ6, by contrast, we compare the personality profiles of contributors in projects where they are project members against their personality profiles in projects where they are not project members. We presented several results that match our expectations, thereby demonstrating the inferential power of the personality traits in explaining the behavior in various contexts of software development in GitHub. We see this study as a first step towards comprehending the intricacies of a team by creating a better understanding of contributors' behavior.

4.2.2 Background and Related Work

4.2.2.1 Personality Theories and Tools

Personality theories use validated tests to capture individuals' personality profiles. The two most popular personality models are the Big Five [167] (with a variation called the Five Factor Model [168]) and the Myers-Briggs Type Indicator (MBTI) [169]. The Big Five personality model is developed on the underlying position that "personality is encoded in natural language, and differences in personality may become apparent through language use" [170]. In contrast, the MBTI model was developed on the stance that "individuals evolve psychologically through experiences and this evolution shapes individuals' behavior along different psychological types" [171]. Studies say that MBTI is suitable for assessing an individual's self-awareness, as against its common use for explaining performance [172]. MBTI has reliability and validity issues too [173]. We use the Big Five model to measure personality traits in this study. The Big Five personality profiles have been correlated with individuals' language use [174] - the phenomenon under consideration in this work. An individual's linguistic style is quite stable [175][176]. Personality traits can be detected in an individual's interactions, even if they occur in textual settings [175]. There are text analytics programs that accurately link language characteristics to personality traits with a modest but reliable effect [175][176]. For instance, the use of pronouns or articles may not reveal which objects or events a person is talking about, but it provides a sense of the person's general approach to the world.

We use the LIWC software tool [177] to measure personality traits in terms of the language used. This tool captures over 86% of the words used during conversations (around 4500 words) [177]. The reliability and the validity of the tool have been extensively evaluated in studies on various platforms such as Facebook [178], Twitter [164][179], and StackOverflow [180][163]. We chose the LIWC tool for analysis because comments on GitHub are short and informal, similar to Facebook and Twitter - for which LIWC has been used. Further, existing studies on software development provide useful discoveries to encourage the systematic application of linguistic analysis techniques to study developers' communications [158][157].

4.2.2.2 Textual Communication, Personality Traits and Software Development

Communication archives in software development projects are shown to contain significant information about the software systems they discuss [181]. These archives are also used to infer personality profiles, which have a significant influence on people's job behavior [182]. In commercial software development projects, the personality of project managers influences the success of projects [162]. The attitude and behavior of a software development team vary with the nature of the task performed [159]. Further, high levels of organizational and interpersonal skills are found useful for those operating in distributed settings, and personality diversity boosts team performance [157]. Preliminary analyses of the personality traits of four developers on a large, popular, open source software project show promise in understanding developers' joining and leaving patterns [158]. Differences in the personality traits of contributors were also observed [158]. Building on the existing studies, here we explore the inferential power of the personality traits in explaining the behavior of contributors in various contexts of software development in GitHub.

4.2.3 Methodology

For RQ1 and RQ2, we classify the contributors in projects into groups based on their levels of contributions and project membership respectively. Similarly, for RQ3 to RQ6, we group the contributions of contributors based on time of participation, contribution type, role and project membership respectively. We compute the personality traits of contributors in each group and study their relationships with the above-mentioned factors. The step-by-step procedure used in this study is summarized in Figure 18. All the data and procedures used here are made publicly available for replication13.

4.2.3.1 Data Preparation

We downloaded the GitHub projects' data made publicly available by GHTorrent14 [154]. The dataset has a development history of more than six years (October 2007 to August 2014). The selection criteria for projects and contributors are described below.

4.2.3.2 Project Selection Criteria

We selected 243 actively discussed projects. These projects are the union of the top 100 projects selected in terms of comment counts on issues, pull requests and commits respectively. The selected projects have 119,638 contributors who participated at least once in discussions. Refer step 1, part 1 (Identify target audience) of Figure 18.

4.2.3.3 Contributor Selection Criteria We selected 423 active contributors in discussions. These contributors wrote at least 100 comments during discussions on issues, pull requests and commits separately. The participation of these selected contributors is not limited to the projects selected in sub-section 4.2.3.2. Refer step 1, part 2 (Identify target audience) of Figure 18.

13 https://sites.google.com/site/ayushirastogi1989india/research/personalitytraits-data
14 http://ghtorrent.org/downloads.html

Figure 18: Research Method [Step 1] Identify projects and contributors for the study. [Step 2] Compute personality traits score from discussions. [Step 3] Group contributors based on contribution and project member- ship. Group contribution of contributors based on time, type, role and project membership. [Step 4] Conduct statistical analyses to test for significance of the differences in personality traits across groups.

4.2.3.4 Measure personality traits

Yarkoni [175] measured personality traits of individuals using the LIWC tool. The LIWC tool uses a dictionary for text analysis. The default LIWC 2007 dictionary is made up of more than 4,500 words and word stems. Each word or word stem is defined in one or more word categories or sub-dictionaries. These categories or LIWC dimensions are arranged hierarchically. For instance, 'psychological process' consists of 'social processes', 'affective processes', 'cognitive processes', etc. Yarkoni measured personality traits as the weighted sum of LIWC dimensions. Here, the weight is the correlation coefficient between the LIWC dimension and the personality trait. Thus, a LIWC dimension which has a high correlation with the personality trait is given more weight. The measure of Neuroticism proposed by Yarkoni [175] is:

(0.12 × FirstPersonSingular) + (0.10 × FirstPerson) − (0.15 × SecondPerson) + (0.11 × Negation) − (0.11 × Article) − (0.08 × Optimism) + (0.16 × NegativeEmotion) + (0.17 × Anxiety/Fear) + (0.13 × Anger) + (0.10 × Sadness) + (0.13 × CognitiveProcess) + (0.11 × Causation) + (0.13 × Discrepancy) + (0.09 × Inhibition) + (0.12 × Tentative) + (0.13 × Certainty) + (0.10 × Feeling) − (0.08 × OtherReferences) − (0.08 × Friends) − (0.09 × Spaces) − (0.10 × Up) + (0.10 × Exclusive) + (0.10 × Sleeping) + (0.11 × SwearWords)

Here, deleted terms were seen to correlate with personality traits when using the LIWC 2001 dictionary, but were removed from LIWC 2007 due to low base rate [176]. Similarly, Yarkoni [175] proposed measures for the other four personality traits. In this study, we use the measures of personality traits proposed by Yarkoni [175]. It is important to note that the measures of personality traits are composite and hence the absolute values are meaningless. However, comparing the relative scores helps determine differences in the personalities [158]. We apply the LIWC tool to each comment and calculate the values of the "Big Five Personality Traits". Refer step 2 (Compute personality scores) of Figure 18. For instance,

Comment 1: "Yep, you are right, the problem is that we don't know if more headers have been added. Another option is to create a 'refresh' method and call it from outside every time a new header is created/removed. Makes sense for me." Comment 2: "Why did these need to change? the fact that it didn't break any tests means we have some untested and unverified cases here as well." Using the formula for Neuroticism given above, comment 1 has a lower Neuroticism score (5) relative to comment 2 (8).

contribution Contributors participate (in issues) by reporting issues or by helping get them fixed. Similarly, contributors participate in pull requests and commits. We measure contribution as the count of issues, pull requests and commits in which the contributor participated in any form and to any degree.
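As an illustration of the weighted-sum scoring, the sketch below computes a trait score from per-comment LIWC category percentages. Only a subset of the Neuroticism weights from Yarkoni's formula is shown, and the dictionary of LIWC outputs is a stand-in for illustration, not the LIWC tool's API.

```python
# Subset of the Neuroticism weights from Yarkoni's formula above.
NEUROTICISM_WEIGHTS = {
    "FirstPersonSingular": 0.12,
    "SecondPerson": -0.15,
    "Negation": 0.11,
    "Article": -0.11,
    "NegativeEmotion": 0.16,
    "Anger": 0.13,
}

def trait_score(liwc_scores, weights):
    """Weighted sum of LIWC dimension values for one comment.
    `liwc_scores` maps LIWC category -> percentage of words in that
    category for the comment (hypothetical values)."""
    return sum(w * liwc_scores.get(dim, 0.0) for dim, w in weights.items())

comment_liwc = {"FirstPersonSingular": 4.2, "Negation": 2.1, "Article": 6.0}
print(trait_score(comment_liwc, NEUROTICISM_WEIGHTS))
```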

4.2.3.5 Classify Contributors in Projects

contribution (rq1) The distribution of contributors based on their contributions is skewed [183]. Therefore, similar to other studies [163], we classify contributors into three classes as follows: the top 1% of contributors as high contribution contributors (H), the next 10% of contributors as medium contribution contributors (M) and the remaining 89% of contributors as low contribution contributors (L). We study contributions of individuals with at least 10 comments to have a reliable estimate of their personality traits. Refer step 3, part 1, RQ1 (Classify contributors in projects) in Figure 18.

project membership (rq2) We classify contributors in each project into two classes based on whether they are a member of the project or not. Contributors with commit access are members of the project. Refer step 3, part 1, RQ2 (Classify contributors in projects) in Figure 18.
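A minimal sketch of the rank-based split used for RQ1, assuming a mapping from contributor id to contribution count; how ties and rounding are handled in the original analysis is not specified, so the cut-offs below are illustrative.

```python
def classify_by_contribution(counts):
    """Split contributors into H (top 1%), M (next 10%), L (rest) by
    contribution count. `counts` maps contributor id -> contribution."""
    ranked = sorted(counts, key=counts.get, reverse=True)
    n = len(ranked)
    n_high = max(1, round(0.01 * n))
    n_medium = round(0.10 * n)
    labels = {}
    for i, contributor in enumerate(ranked):
        if i < n_high:
            labels[contributor] = "H"
        elif i < n_high + n_medium:
            labels[contributor] = "M"
        else:
            labels[contributor] = "L"
    return labels
```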

4.2.3.6 Classify Contribution of Contributors

time (rq3) We classify contributions of contributors based on the year of participation. We examine participation in three consecutive years. Refer step 3, part 2, RQ3 (Classify contributions of contributors) in Figure 18.

contribution type (rq4) We classify contributions of contributors based on participation in issues, pull requests and commits. Refer step 3, part 2, RQ4 (Classify contributions of contributors) in Figure 18.

role (rq5) We classify contributions of contributors based on their role as a reporter or a reviewer. Refer step 3, part 2, RQ5 (Classify contributions of contributors) in Figure 18.

project membership (rq6) We classify contributions of contributors based on whether they are a member of the project or not. Similar to RQ2, contributors with commit access are members of the project. Refer step 3, part 2, RQ6 (Classify contributions of contributors) in Figure 18.

4.2.3.7 Statistical Tests

We conduct statistical tests to examine differences in personality traits in two cases: 1) examine differences in personality traits across the sub-communities of projects (RQ1 and RQ2) and 2) examine differences in personality traits across the roles of contributors (RQ3-RQ6). Refer step 4 in Figure 18. To compare the personality traits of contributors in two contexts (as in RQ3, RQ5, and RQ6), we conducted a paired t-test and reported the effect size using Cohen's d. For RQ2, where the observations are independent, we conducted a t-test followed by Cohen's d. For the five personality traits, we report the mean and standard deviation of group 1 (M1; SD1) and group 2 (M2; SD2). We report t-statistics (t) with the degrees of freedom (df), p-value (p) and Cohen's d (d). We also report differences in mean. The differences are significant for p<0.001. To compare personality traits across more than two groups (as in RQ1 and RQ4), we conducted ANOVA to compare the means of the distributions. If significant differences were observed, we conducted Tukey's HSD to test for significant differences between all pairs. We report effect size using eta squared (η2).

Table 19: (RQ1) Differences in the personality traits of contributors across projects with high (H), medium (M), and low (L) levels of contributions

                  ANOVA                              Tukey's HSD
Trait  F(df, dferror)     p-value  η2       M-H diff  p       L-H diff  p       L-M diff  p
O      F(2,18906)=166.8   0.000    0.017    -0.16     0.09    -0.84     0.000   -0.68     0.000
C      F(2,18906)=421.4   0.000    0.042    -0.65     0.000   -1.21     0.000   -0.55     0.000
E      F(2,18906)=78.48   0.000    0.008    -0.45     0.000   -0.68     0.000   -0.23     0.000
A      F(2,18906)=121.7   0.000    0.012     0.09     0.01     0.34     0.000    0.24     0.000
N      F(2,18906)=13.22   0.000    0.001    -0.20     0.000   -0.07     0.20     0.12     0.000



Figure 19: Differences in the personality traits of contributors across projects with High (H), Medium (M) and Low (L) levels of contributions

For the five personality traits, we report F-statistics along with the degree of freedom and degree of error as F(df, dferror). We report p-values (p) and eta squared (η2). We report the results of Tukey's HSD as the differences in mean (diff) for all pairs and their p-values (p-value). The differences are significant for p<0.001. We show differences in the personality traits across groups using notched boxplots (for instance, refer Figure 19). The vertical axis of the boxplot shows the relative personality trait score and the horizontal axis shows the groups. Each figure presents the results for the five personality traits (Openness to Experience, Conscientiousness, Extraversion, Agreeableness, Neuroticism). The interpretation of the figure is that if the notches of two groups do not overlap, there are statistically significant differences in the personality traits of the groups. All the tests are conducted using the R language [184].
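The thesis runs these tests in R; for illustration only, the following Python sketch mirrors the two procedures (paired t-test with Cohen's d, and one-way ANOVA followed by Tukey's HSD) on placeholder arrays standing in for per-group trait scores.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(0)
# Placeholder Conscientiousness scores for 413 contributors in two years.
year1 = rng.normal(66.7, 2.2, 413)
year2 = rng.normal(67.0, 1.5, 413)

# Paired comparison (RQ3, RQ5, RQ6) with Cohen's d on the differences
# (one common paired-samples definition of d).
t, p = stats.ttest_rel(year1, year2)
d = (year1 - year2).mean() / (year1 - year2).std(ddof=1)

# Group comparison (RQ1, RQ4): one-way ANOVA followed by Tukey's HSD.
high = rng.normal(72.0, 2.0, 50)
medium = rng.normal(71.8, 2.0, 500)
low = rng.normal(71.0, 2.0, 4000)
f, p_anova = stats.f_oneway(high, medium, low)
scores = np.concatenate([high, medium, low])
groups = ["H"] * len(high) + ["M"] * len(medium) + ["L"] * len(low)
print(pairwise_tukeyhsd(scores, groups))
```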

4.2.4 Results

rq1: do personality types vary with contribution? We compared the personality traits of contributors with different (high, medium, and low) levels of contributions. There is a significant effect of the level of contribution on the Openness to Experience of contributors (F(2,18906)=166.8, p=0.000, η2=0.017). There is no difference in the Openness to Experience between high and medium levels of contributors (diff=-0.16, p=0.09). Openness to Experience decreases with the decrease in the level of contribution. There is a significant effect of the level of contribution on the Conscientiousness of contributors (F(2,18906)=421.4, p=0.000, η2=0.042). Conscientiousness decreases with the decrease in the levels of contributions. There is a significant effect of the level of contribution on the Extraversion of contributors (F(2,18906)=78.48, p=0.000, η2=0.008). Extraversion decreases with the decrease in the levels of contributions. There is a significant effect of the level of contribution on the Agreeableness of contributors (F(2,18906)=121.7, p=0.000, η2=0.012). Agreeableness is similar for high and medium contribution contributors (diff=0.09, p=0.01) and decreases for low contribution contributors. There is a significant effect of the level of contribution on the Neuroticism of contributors (F(2,18906)=13.22, p=0.000, η2=0.001). Neuroticism is high for high and low contribution contributors (diff=-0.07, p=0.20) and decreases for medium contribution contributors. Refer Table 19 and Figure 19 for the summary of the observations.

rq2: do personality types vary with project membership? We compared the personality traits of contributors who were members of the selected projects against the personality traits of contributors who were not members of the selected projects.

Table 20: (RQ2) Differences in the personality traits of contributors across projects based on whether they are a project member (PM) or not a project member (NPM)

Groups: PM and NPM
Trait  M1    SD1  M2    SD2  df    t      p      d     Mean of differences
O      71.3  2.2  71.1  2.5  3781   2.66  0.007  0.05  -
C      67.1  1.7  66.2  1.6  3503  24.06  0.000  0.50  0.86 ↑
E      39.6  2.5  38.4  1.9  3136  22.06  0.000  0.55  1.14 ↑
A      49.3  1.1  49.5  1.0  3341  -7.05  0.000  0.15  0.17 ↓
N      24.7  1.6  24.6  1.5  3426   2.59  0.009  0.05  -



Figure 20: Differences in the personality traits of contributors across projects based on whether they are a member of the project (M) or not (NM)

We find that there is no significant difference in the Openness to Experience of project members (M=71.3, SD=2.2) from non-project members (M=71.1, SD=2.5), t(3781)=2.66, p=0.007, d=0.05. There is a significant difference in the Conscientiousness of project members (M=67.1, SD=1.7) from non-project members (M=66.2, SD=1.6), t(3503)=24.06, p=0.000, d=0.50. Project members are higher on Conscientiousness by 0.86 compared to non-project members. There is a significant difference in the Extraversion of project members (M=39.6, SD=2.5) from non-project members (M=38.4, SD=1.9), t(3136)=22.06, p=0.000, d=0.55. Extraversion increases by 1.14 from non-project members to project members. There is a significant difference in the Agreeableness of project members (M=49.3, SD=1.1) from non-project members (M=49.5, SD=1.0), t(3341)=-7.05, p=0.000, d=0.15. Agreeableness decreases by 0.17 from non-project members to project members. There is no significant difference in the Neuroticism of project members (M=24.7, SD=1.6) from non-project members (M=24.6, SD=1.5), t(3426)=2.59, p=0.009, d=0.05. Refer Table 20 and Figure 20 for the summary of the observations.

rq3: do personality traits of contributors evolve with time? We compared the personality traits of contributors during the first year and the second year of their participation. Out of the selected 423 contributors, 413 contributors participated for two consecutive years. We find that there is no significant difference in the Openness to Experience from year 1 (M=70.9, SD=3.3) to year 2 (M=71.1, SD=1.9), paired t(412)=-0.93, p=0.346, d=0.05. There is a significant difference in the Conscientiousness from year 1 (M=66.7, SD=2.2) to year 2 (M=67.0, SD=1.5), paired t(412)=-3.21, p=0.001, d=0.17.



Figure 21: Differences in the personality traits of the contributors during the first year (Y1) and the second year (Y2) of their participation

Table 21: (RQ3,5,6) Differences in the personality traits of contributors with time (Year 1, Year 2 and Year 3), role (Reporter Rep and Reviewer Rev) and project membership (Project Member PM and Non Project Member NPM)

Paired Groups       Trait  M1    SD1  M2    SD2  df   t       p      d     Mean of differences
Year 1 and Year 2   O      70.9  3.3  71.1  1.9  412   -0.93  0.346  0.05  -
                    C      66.7  2.2  67.0  1.5  412   -3.21  0.001  0.17  0.34 ↑
                    E      38.8  2.5  39.3  1.9  412   -3.51  0.000  0.20  0.45 ↑
                    A      49.6  1.3  49.4  0.9  412    3.64  0.000  0.19  0.22 ↓
                    N      25.0  2.5  24.9  1.6  412    0.72  0.467  0.04  -
Year 1 and Year 3   O      70.8  3.4  71.0  1.8  366   -0.98  0.326  0.06  -
                    C      66.5  2.2  67.1  1.4  366   -4.3   0.000  0.27  0.52 ↑
                    E      38.6  2.4  39.6  2.1  366   -6.2   0.000  0.42  0.99 ↑
                    A      49.7  1.3  49.4  0.9  366    5.97  0.000  0.37  0.45 ↓
                    N      24.9  2.6  25.2  1.3  366   -1.9   0.053  0.12  -
Rep and Rev         O      71.3  1.9  71.1  1.6  422    3.49  0.000  0.11  0.21 ↓
                    C      68.3  1.6  67.1  1.2  422   21.12  0.000  0.83  1.23 ↓
                    E      39.4  2.2  40.0  1.9  422   -4.66  0.000  0.27  0.58 ↑
                    A      49.1  0.9  49.1  0.9  422   -1.02  0.304  0.04  -
                    N      26.4  1.5  25.0  1.1  422   21.34  0.000  0.99  1.39 ↓
PM and NPM          O      71.1  1.8  71.2  1.7  396   -0.61  0.535  0.02  -
                    C      67.3  1.5  67.6  1.5  396   -3.95  0.000  0.17  0.27 ↑
                    E      40.0  2.0  39.2  1.6  396    8.29  0.000  0.46  0.84 ↓
                    A      49.1  1.0  49.3  0.8  396   -4.90  0.000  0.20  0.19 ↑
                    N      25.2  1.5  25.6  1.4  396   -5.91  0.000  0.28  0.42 ↑



Figure 22: Differences in the personality traits of contributors during the first year (Y1) and the third year (Y3) of their participation

Conscientiousness increases by 0.34 from year 1 to year 2. There is a significant difference in the Extraversion from year 1 (M=38.8, SD=2.5) to year 2 (M=39.3, SD=1.9), paired t(412)=-3.51, p=0.000, d=0.20. Extraversion increases by 0.45 from year 1 to year 2. There is a significant difference in the Agreeableness from year 1 (M=49.6, SD=1.3) to year 2 (M=49.4, SD=0.9), paired t(412)=3.64, p=0.000, d=0.19. Agreeableness decreases by 0.22 from year 1 to year 2. There is no significant difference in the Neuroticism from year 1 (M=25.0, SD=2.5) to year 2 (M=24.9, SD=1.6), paired t(412)=0.72, p=0.467, d=0.04. Refer Table 21 and Figure 21 for the summary of the observations. We also compared the personality traits of contributors during the first year and the third year of their participation. Out of the selected 423 contributors, 367 contributors participated for three consecutive years. We find that there is no significant difference in the Openness to Experience from year 1 (M=70.8, SD=3.4) to year 3 (M=71.0, SD=1.8), paired t(366)=-0.98, p=0.326, d=0.06. There is a significant difference in the Conscientiousness from year 1 (M=66.5, SD=2.2) to year 3 (M=67.1, SD=1.4), paired t(366)=-4.3, p=0.000, d=0.27. Conscientiousness increases by 0.52 from year 1 to year 3. There is a significant difference in the Extraversion from year 1 (M=38.6, SD=2.4) to year 3 (M=39.6, SD=2.1), paired t(366)=-6.2, p=0.000, d=0.42. Extraversion increases by 0.99 from year 1 to year 3. There is a significant difference in the Agreeableness from year 1 (M=49.7, SD=1.3) to year 3 (M=49.4, SD=0.9), paired t(366)=5.97, p=0.000, d=0.37. Agreeableness decreases by 0.45 from year 1 to year 3. There is no significant difference in the Neuroticism from year 1 (M=24.9, SD=2.6) to year 3 (M=25.2, SD=1.3), paired t(366)=-1.9, p=0.053, d=0.12. Refer Table 21 and Figure 22 for the summary of the observations.

Table 22: (RQ4) Differences in the personality traits of contributors while discussing issues (I), pull requests (P) and commits (C)

                 ANOVA                             Tukey's HSD
Trait  F(df, dferror)    p-value  η2       P-I diff  p       C-I diff  p       C-P diff  p
O      F(2,1266)=3.3     0.036    0.005     0.23     0.18     0.33     0.030    0.10     0.72
C      F(2,1266)=2044    0.000    0.76      4.60     0.000    3.93     0.000   -0.67     0.000
E      F(2,1266)=14392   0.000    0.95      9.10     0.000   -0.24     0.000   -9.34     0.000
A      F(2,1266)=2917    0.000    0.82     -3.59     0.000   -0.39     0.000    3.19     0.000
N      F(2,1266)=2129    0.000    0.77      4.91     0.000    4.45     0.000   -0.45     0.000



Figure 23: Differences in the personality traits of contributors while discussing issues (I), pull requests (P) and commits (C)

rq4: do contributors show different personality traits for different contribution types? We compared the personality traits of contributors when discussing issues, pull requests and commits. There is no significant effect of the contribution type on the Openness to Experience of contributors (F(2,1266)=3.3, p=0.036, η2=0.005). There is a significant effect of the contribution type on the Conscientiousness of contributors (F(2,1266)=2044, p=0.000, η2=0.76). Conscientiousness is highest when working on pull requests and decreases for commits and issues sequentially. There is a significant effect of the contribution type on the Extraversion of contributors (F(2,1266)=14392, p=0.000, η2=0.95). Extraversion is highest when working on pull requests and decreases while working on issues and commits sequentially. There is a significant effect of the contribution type on the Agreeableness of contributors (F(2,1266)=2917, p=0.000, η2=0.82). Agreeableness is highest when working on issues and decreases while working on commits and pull requests sequentially. There is a significant effect of the contribution type on the Neuroticism of contributors (F(2,1266)=2129, p=0.000, η2=0.77). Neuroticism is highest when working on pull requests and decreases while working on commits and issues sequentially. Refer Table 22 and Figure 23 for the summary of the observations.

rq5: do contributors show different personality traits for different roles? We compared the personality traits of contributors for their roles of reporter and reviewer. We find that there is a significant difference in the Openness to Experience from the role reporter (M=71.3, SD=1.9) to the role reviewer (M=71.1, SD=1.6), paired t(422)=3.49, p=0.000, d=0.11. Openness to Experience decreases by 0.21 for the role of reviewer compared to the role of reporter. There is a significant difference in the Conscientiousness from the role reporter (M=68.3, SD=1.6) to the role reviewer (M=67.1, SD=1.2), paired t(422)=21.12, p=0.000, d=0.83. Conscientiousness decreases by 1.23 from the role reporter to the role reviewer. There is a significant difference in the Extraversion from the role reporter (M=39.4, SD=2.2) to the role reviewer (M=40.0, SD=1.9), paired t(422)=-4.66, p=0.000, d=0.58. Extraversion increases by 0.58 from the role reporter to the role reviewer. There is no significant difference in the Agreeableness from the role reporter (M=49.1, SD=0.9) to the role reviewer (M=49.1, SD=0.9), paired t(422)=-1.02, p=0.304, d=0.04. There is a significant difference in the Neuroticism from the role reporter (M=26.4, SD=1.5) to the role reviewer (M=25.0, SD=1.1), paired t(422)=21.34, p=0.000, d=0.99.



Figure 24: Differences in the personality traits of contributors when they are reporter (Rp) versus when they are reviewer (Rv)


Figure 25: Differences in the personality traits of the contributors based on whether they are a member of the project (M) or not (NM)

Neuroticism increases by 1.39 from the role reporter to the role reviewer. Refer Table 21 and Figure 24 for the summary of the observations.

rq6: do contributors show different personality traits based on project membership? We compared the personality traits of contributors when they were project members and when they were not. Out of the 423 contributors, only 397 contributors were project members in some projects and non-project members in other projects. We find that there is no significant difference in the Openness to Experience when contributors are project members (M=71.1, SD=1.8) compared to when they are non-project members (M=71.2, SD=1.7), paired t(396)=-0.61, p=0.535, d=0.02. There is a significant difference in the Conscientiousness when contributors are project members (M=67.3, SD=1.5) compared to when they are non-project members (M=67.6, SD=1.5), paired t(396)=-3.95, p=0.000, d=0.17. Conscientiousness increases by 0.27 for non-project members compared to project members. There is a significant difference in the Extraversion when contributors are project members (M=40.0, SD=2.0) compared to when they are non-project members (M=39.2, SD=1.6), paired t(396)=8.29, p=0.000, d=0.46. Extraversion increases by 0.84 from non-project members to project members. There is a significant difference in the Agreeableness when contributors are project members (M=49.1, SD=1.0) compared to when they are non-project members (M=49.3, SD=0.8), paired t(396)=-4.90, p=0.000, d=0.20. Agreeableness decreases by 0.19 from non-project members to project members. There is a significant difference in the Neuroticism when contributors are project members (M=25.2, SD=1.5) compared to when they are non-project members (M=25.6, SD=1.4), paired t(396)=-5.91, p=0.000, d=0.28. Neuroticism decreases by 0.42 from non-project members to project members. Refer Table 21 and Figure 25 for the summary of the observations.

4.2.5 Discussions

We started the study by proposing that the personality traits of contributors, mined from software repositories, could offer a way to explain their contributions in different contexts of software development. We were able to present several results that support our idea.

We showed that the personality traits of contributors with different levels of contributions are different. We found a steady increase in Openness to Experience, Conscientiousness, and Extraversion as contribution increases. This is expected, as contributors with higher levels of contributions are the ones who take the lead to ensure project success. That these contributors are characterized by being more insightful, more goal-oriented and more social seems intuitive. Further, we found that contributors with higher levels of contributions are lower on Agreeableness. This is again what we would expect from them to ensure the health of the project. We found that project members are more Conscientious, more Extrovert and less Agreeable compared to non-project members. Again, this is what we would expect from project members: that they are rigorous in their work, are social to attract participation, and are less agreeable to maintain quality software development.

We showed that the personality profiles of the most active contributors evolve with time. We found that the most active contributors evolve to be more Conscientious, more Extrovert and less Agreeable with time. These characteristics explain in the first place why and how these people became the most active contributors. We showed that contributors portray different personality traits in different contexts of software development. We found that contributors, when working on pull requests, are most Conscientious, most Extrovert, most Neurotic and least Agreeable. We believe that this behavior indicates gatekeeping of the entry of quality work and that pull requests mark the highest barrier to entry. Among commits and issues, contributors are most Conscientious, most Neurotic and least Agreeable when working on commits. This implies that after pull requests, commits impose the largest barriers to ensure quality software development. It is important to note that contributors are least Extrovert when working on commits. One possible explanation for this is that project members collaborate on commits, unlike issues, which are discussed by prospective project participants. Finally, while working on issues, contributors are least Conscientious, least Neurotic and most Agreeable. This behavior shows how active contributors facilitate the free inflow of suggestions from the masses. We found that contributors are more Extrovert when they are members of projects or are reviewing others' work. This reflects the expectation that project members and reviewers of successful projects maintain cordial terms to attract participation. Other than this, we found that as reporters, contributors are more Open to Experience, more Conscientious, and more Neurotic compared to when they are reviewing others' work. Also, we find that when contributors are not members of a project, they are more Conscientious, more Agreeable and more Neurotic.

To summarize, our results are promising and encouraging for further explorations. Further improvements required are discussed below. The measures of personality traits used in this study were proposed on texts not specific to the software engineering domain. We expect that a software engineering specific lexicon would produce similar effects but with higher effect sizes. However, as rightly pointed out by other studies [185], even results from an SE-specific lexicon may not be perfect. Each project is likely to have its own vocabulary, and words may have different meanings across different projects [185].
Finally, it is important to note that the observed effect sizes are small. We think that this only illustrates the general problems in inferring personality traits from text. Various highly cited studies have reported effect sizes (Cohen's d) ranging from 0.008 to 0.02 [186] as significant, while our effect sizes range from 0.05 to 0.99. While it is difficult to set a threshold for how large an effect size must be, we believe that replications of this study will help establish a baseline on effect sizes. For instance, personality psychologists routinely calculate and report effect sizes, thereby developing a view of what an effect size means and how to interpret it15.

4.2.6 Threats to Validity

4.2.6.1 Internal Validity

We assume a causal relationship between the personality of contributors and their written communications to discuss issues, pull requests and commits on GitHub. This assumption is grounded on the empirical evidence in other domains [175]. Also, this assumption has been widely validated through its applications on various platforms like Facebook [178], Twitter [164][179], StackOverflow [163][180], etc.

15 https://funderstorms.wordpress.com/2013/02/01/does-effect-size-matter/

The genuineness of the personality traits mined is prone to adversarial effects, where people try to present themselves in ways that best suit their purpose. However, we believe that this is not a concern for this study as the contributors are not aware that their communications are being monitored. Further, we collected data for an extended period of time to strengthen the validity of our results. This is an empirical study and we do not claim any causality between the inferred personality traits and the various factors analyzed in this study. However, the fact that the findings match our intuitions strengthens our belief in the results obtained.

4.2.6.2 Construct Validity

Mining personality traits from discussions on issues, pull requests and commits is challenging. We adopted approaches used in existing studies to measure the personality traits from communications. While these approaches have been validated in various studies, the measures of personality traits may be biased against contributors whose first language is not English. We find that a subset of contributors never participated in any discussions. The personalities of these contributors cannot be analyzed using the techniques proposed here and are excluded from the analyses. Any traits specific to these contributors are not captured and hence not presented here.

4.2.6.3 Reliability Validity

This is the first attempt to study the relationship of the personality traits with the contribution and the context of software development. There exists no ground truth with which we can compare our findings. However, the match between our findings and our intuitions strengthens our confidence in the reliability of the results. Further, while the measures of personality traits from written communications were not designed for the software engineering domain, we believe that their application in various domains increases our confidence in the results obtained.

4.2.6.4 External Validity We analyzed the most actively discussed projects and the most active contributors. These projects and contributors are not representative of all the open source projects. So, while the results of our study may not generalize, the framework presented here can be applied to different contexts. Replication of this work on a wider range of open source projects and commercial projects will confirm our findings.

4.3 role and past contributions

4.3.1 Introduction

Contributors leaving a project incur significant direct and indirect costs [187][188][54][57], making it critical to retain existing contributors [189]. A data-driven approach to study future participation helps overcome challenges in existing practices by providing objectivity and transparency. In this work, we examine contributor characteristics, namely the role of participation and the amount of work done, as measures for predicting future participation. We present an approach that uses statistical measures to classify contributors, based on contribution, into three mutually exclusive sets, namely non-core team, loose core team and tight core team. Also, we define attrition as a function of participation in two consecutive time intervals and study the attrition rate for four roles (reporter, owner, commenter and cc'ed-contributor (cc'ed)) and three classes of contributors. We conduct experiments on the Google Chromium Issue Tracking System (GC-ITS) dataset16. The dataset is extracted for four consecutive years and observations are recorded quarterly (3 months). In Issue Tracking Systems contributors play various roles. However, for this work we focus on four roles, namely reporter, owner, commenter and cc'ed. These roles are associated with various stages of the bug fixing lifecycle. The reporter reports the issue. The issue is fixed by the owner in collaboration with commenters participating via a threaded discussion forum. The owner may also request participation by cc'ing contributors. Cc'ed contributors are called in to serve a specific request in the issue.

16 https://code.google.com/p/chromium

Table 23: Role based individual contribution pattern

Role  Mean  Std    Min  Max     .25Q  .5Q  .75Q  .9Q  .95Q  .99Q   Skew
Own   24.8  38.1   1    559     3     10   32    67   94    172.1  3.8
Rep   2.9   8.8    1    476     1     1    2     4    11    40.0   13.5
CC    20.0  51.4   1    1315    1     4    20    59   92    190.2  9.7
Com   14.0  422.9  1    119803  1     1    2     7    30    239.0  129.9

4.3.2 Empirical Analysis

class of contribution Research shows that open source projects follow a Pareto distribution [53], that is, 20% of contributors do 80% of the work. However, our experimental results do not communicate the same. Table 23 shows that in GC-ITS the contribution pattern is highly skewed for all four roles. This observation demands a statistical, data-driven approach to classify the contribution of contributors. In Algorithm 1 we classify each contributor into one of three mutually exclusive classes, namely non-core team, loose core team and tight core team, based on contribution. The Non-Core Team (NCT) includes contributors who join the project to address some specific issue they encountered. The Loose Core Team (LCT) includes dedicated contributors with substantial contribution, and the Tight Core Team (TCT) includes contributors with relatively large contribution (with respect to LCT).

Algorithm 1 Algorithm to Identify Contribution Class
Require: struct{Owner O, Reporter R, CC'ed CC, Commenter Com} Contributor Con[]
Ensure: ConClass[][3]
1: procedure ContributionClass(Con)
2:   for all Time-Interval Tt do
3:     Normalize contribution using Range Normalization [1-100]:

         Y_i = ((100 − 1) × X_i) / (Max(X) − Min(X))                    (1)

4:     Calculate weighted Geometric Mean, where wO > wR > wCC >= wCom and wO + wR + wCC + wCom = 1:

         Score(S) = O^wO × R^wR × CC^wCC × Com^wCom                     (2)

5:     Calculate Z-Score:

         Z = (S − µ) / σ                                                (3)

6:     if Z < 0 then
7:       ConClassTt[1] ← Con[Z < 0]              ▷ Non-Core Team
8:     else if Z > 1 then
9:       ConClassTt[2] ← Con[Z > 1]              ▷ Tight Core Team
10:    else
11:      ConClassTt[3] ← Con[Z >= 0 && Z <= 1]   ▷ Loose Core Team
12:    end if
13:    ConClass[Tt] ← ConClassTt
14:  end for
15:  return ConClass
16: end procedure

The input to Algorithm 1 is the contribution of contributors, where each contributor plays at least one of the four roles. The output is the contribution class of contributors calculated for all time-intervals. We measure the contribution for the roles of owner, reporter and cc'ed as the total number of issues participated in within a time-interval, defined quarterly. Similarly, for commenter we measure contribution in terms of the total number of comments in the time-interval. Further, to ensure homogeneity for cross-comparison, we range-normalize the contribution in each role for all time-intervals on a scale of 1-100 (refer Equation 1). We then append the scores for the four roles of a contributor for a time-interval to create a structure. All missing values are assigned a negligibly small value (0.0001). We generate a cumulative score that measures contribution in terms of the relevance of the role (weight of owner (WO) > weight of reporter (WR) > weight of cc'ed (WCC) >= weight of commenter (WCm)) as WO=0.5, WR=0.25, WCC=0.125 and WCm=0.125. The choice of weight depends on specific requirements and may vary for individuals. Assuming that participation in one role is independent of participation in other roles, we use the weighted Geometric Mean to generate the cumulative score (refer to Equation 2). The score generated ranges from 100 (highest) to approximately 0.0001 (negligible or no contribution).
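To make the procedure concrete, the following is a minimal Python sketch of Algorithm 1 under the weights and placeholder value stated above. The input format (a dict of per-role counts per contributor for one time interval) and the handling of role-wise normalization for contributors who never played a role are assumptions of this sketch, not taken from the thesis.

```python
import math
import statistics

# Role weights as given in the text (wO > wR > wCC >= wCom, summing to 1),
# and the placeholder value the text assigns to missing roles.
WEIGHTS = {"owner": 0.5, "reporter": 0.25, "cc": 0.125, "commenter": 0.125}
EPS = 0.0001


def contribution_classes(contributors):
    """Classify contributors for one time interval into NCT / LCT / TCT.

    `contributors` maps a contributor id to raw per-role counts, e.g.
    {"owner": 12, "reporter": 3, "cc": 5, "commenter": 40}.
    """
    ids = list(contributors)
    norm = {c: {} for c in ids}

    # Equation (1): range-normalize each role's counts across contributors.
    for role in WEIGHTS:
        played = {c: contributors[c][role] for c in ids
                  if contributors[c].get(role, 0) > 0}
        span = (max(played.values()) - min(played.values()) or 1) if played else 1
        for c in ids:
            if c in played:
                norm[c][role] = (100 - 1) * played[c] / span
            else:
                norm[c][role] = EPS  # missing role, as in the text

    # Equation (2): weighted geometric mean of the normalized role scores.
    score = {c: math.prod(norm[c][r] ** w for r, w in WEIGHTS.items())
             for c in ids}

    # Equation (3): z-score, then the thresholds from Algorithm 1.
    mu = statistics.mean(score.values())
    sigma = statistics.pstdev(score.values()) or 1.0
    classes = {}
    for c in ids:
        z = (score[c] - mu) / sigma
        classes[c] = "NCT" if z < 0 else ("TCT" if z > 1 else "LCT")
    return classes
```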


Figure 26: Attrition rate of maintainers for four years
Figure 27: Comparison of attrition rates for contribution classes for four years

Next, we find the relative relevance of contribution, that is, the number of standard deviations a datum lies from the mean. We calculate the Z-Score (refer to Equation 3). If the value of Z is less than 0, it indicates that the contributor is part of NCT. If the value of Z is greater than 1, it defines TCT. Likewise, a value of Z greater than or equal to 0 and less than or equal to 1 implies LCT. We calculate the contribution class for all time-intervals and return the set of contributors for each class for each time-interval.

attrition rate In this study, we consider that a contributor has left the project if the duration of inactivity exceeds one time-interval (in this case measured quarterly). Thus the Attrition Rate (AR) for time-interval Tt measures (in percentage) the fraction of contributors who left the project in time-interval Tt to the total number of contributors who participated in time-interval Tt and its preceding time-interval Tt−1.

graphical analysis of the relationship between contributor characteristics and future participation In Figure 26 the horizontal axis of the plot represents consecutive time-intervals (measured quarterly) and the vertical axis shows the attrition rate. Colored lines (refer to legend) present the attrition rate for contributors and their roles. We observe that the contributor attrition rate (irrespective of roles, shown in black) fluctuates from 27% to 47%. Also, we observe a marked difference in attrition patterns across the four roles. We see the minimum attrition rate for owner (shown in blue) and the maximum for reporter (shown in red). This follows the intuition that not every contributor can own issues. Figure 27 compares the attrition rate of the three classes of contributors, namely non-core team, loose core team and tight core team (refer Algorithm 1), across four years. We see in Figure 27 that the attrition rate for LCT and TCT ranges from 3% to 10%, which is considerably lower than the attrition rate for NCT (between 27% and 43%). It indicates that retention in the project is directly related to the degree of involvement in the project. Also, interestingly, after initial fluctuations the attrition rate of TCT is higher than that of LCT, indicating that TCT members contribute relatively more but sporadically.
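The attrition-rate definition admits more than one reading; the sketch below, in Python, encodes one possible interpretation in which a contributor active in interval Tt−1 is counted as having left in Tt if absent from both Tt and Tt+1 (inactivity exceeding one interval), and the base is the set of contributors active in Tt−1 or Tt. The per-interval activity sets are assumed inputs.

```python
def attrition_rate(active_by_interval, t):
    """Attrition rate (%) for interval t under one reading of the definition:
    a contributor active in t-1 has 'left' in t if absent from both t and t+1;
    the base is everyone active in t-1 or t. `active_by_interval` is a list
    of sets of contributor ids, one set per quarterly interval."""
    prev = active_by_interval[t - 1]
    curr = active_by_interval[t]
    nxt = active_by_interval[t + 1]
    left = {c for c in prev if c not in curr and c not in nxt}
    base = prev | curr
    return 100.0 * len(left) / len(base) if base else 0.0


# Example with three quarterly intervals (hypothetical contributor ids).
quarters = [{"a", "b", "c"}, {"a", "c"}, {"a"}]
print(attrition_rate(quarters, 1))  # 'b' left; 33.3% of {a, b, c}
```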

4.4 summary

In this chapter we studied the influence of competing projects' dynamics on contributor participation, the effect of personality traits on levels of contributions, and the impact of role reputation and contribution on developer participation. First, we studied the influence of diverse project and contributor characteristics on future participation. A large-scale study of 2,217 projects hosted on GitHub shows that 1 in every 5 original projects observes a decline in the sustainability of the developer community participation after forking. We find that the negative effect is more pronounced in projects ported to GitHub from other platforms (≈ 20%), compared to GitHub-developed projects (≈ 9%). We also find that the observed behavior can be explained in terms of the characteristics of the competing projects at the time of forking. For instance, in medium-sized projects an increase in the maturity of the original project by a year decreases the odds of decline in the sustainability of the developer participation by 23%. Second, we analyzed the relationship of personality traits of contributors with their levels of contributions. We found that personality traits of contributors relate to their contributions and that contributors behave differently in different work situations. We analyzed the 243 most actively discussed projects and the 423 most active contributors and showed that:

• Contributors with different levels of contributions have different personality profiles.

• Personality profiles of most active contributors evolve with time.

• Contributors portray different personality traits in different contexts of software development.

The findings of our study are promising. Most of the observations confirm our intuitions, thereby demonstrating the inferential power of personality traits in explaining behavior in various contexts of software development. Finally, we studied the impact of role reputation and contribution on developer participation. We measured contributor characteristics and investigated their relationship with contributor attrition by mining an Issue Tracking System. Experimental results show that the likelihood of future participation increases with the increase in the relevance of the role in the project and the level of participation in the previous time-interval.

MEASURES OF SOFTWARE CONTRIBUTOR PRODUCTIVITY 5

Evaluating software contributor productivity is a standard practice in organizations. It is used to understand the value addition by various contributors and its influence on project success. However, accurate measurement of contributions based on pre-defined objectives, roles and key performance indicators is a challenging task. Existing approaches to measure contributor productivity are subjective and imprecise. There is a need to objectively measure software contributor productivity. In this chapter, we 1) identify gaps between expectations and measurements of contributions in practice, 2) propose metrics to measure individual contribution and team participation, 3) present visualizations to help scrutinize the underpinning factors involved in explaining productivity, 4) evaluate the proposed metrics and 5) discuss their applications. Here, we study individual and team productivity in the software maintenance team of the Google Chromium project.

5.1 introduction

“What you cannot measure you cannot control” sets the broad motivation of the work presented in this study. Assessing the productivity of employees and workers is a standard human resource management practice followed in organizations world-wide [190]. It is an important and routine activity performed within organizations to measure value addition based on pre-defined Key Performance Indicators (KPIs) [191][192]. These measures are required for career advancement decisions, identification of strengths and areas of improvement of the employee, rewards and recognition, resource planning, etc. [191][192]. Assessing the productivity of people is fraught with several challenges and imperfections that negatively influence employees' motivation and an organization's growth [190][193]. Solutions for contribution and performance assessment of software maintenance professionals as well as defect-fixing performance is an area that has recently attracted several researchers' attention [194][36][34][35][37][67][26][33]. This study is motivated by the need to develop a novel framework and metrics to accurately and objectively measure the contribution of software maintenance professionals by mining data archived in Issue Tracking Systems. The aims of this study are the following:

1. To investigate the perceived importance of key performance indicators and their measurements in practice.

2. To investigate role-based metrics to accurately, reliably and objectively measure the productivity of software maintenance professionals involved in the defect fixing process.

3. To validate the proposed metrics with software maintenance professionals in industry and apply them on a real-world dataset demonstrating the effectiveness of the approach.

4. To visualize the contributions of software maintenance professionals in the contexts of software development.

5. To investigate the effectiveness of the proposed visualization solution by studying the complex interplay of variables to 1) analyze trends, outlier behavior, patterns and regularities, 2) extract actionable insights from the perspectives of decision makers and 3) get their validation.

6. To investigate metrics to objectively characterize community stability on key stability indicators by mining the Issue Tracking System.

7. To demonstrate the inferential ability of time series data on key stability indicators by investigating the stability of the community and estimating future participation.

5.2 related work and research contributions

The literature on evaluating contributor productivity can be broadly classified into three heads: 1) measures of individual productivity, 2) measures of team productivity, and 3) approaches to visualize individual and team productivity.


5.2.1 Measure Individual Productivity

The most closely related research to the study presented here is the work by Nagwani et al. However, there are several differences between the work by Nagwani et al. and this study. Nagwani et al. propose a team-member ranking technique for software defect archives [37]. Their technique consists of generating a ranked list of developers based on individual contributions such as the quantity and quality of bugs fixed, comment activity in discussion forums and bug reporting productivity [37]. Gousios et al. present an approach for measuring a developer's involvement and activity by mining software repositories such as the source code repository, document archives, mailing lists, discussion forums, defect tracking system, Wiki and IRC [36]. They present a list of pre-defined developer actions (easily measurable and weighted based on importance) such as reporting a bug, starting a new wiki page, and committing code to the source-code repository, which are used as variables in a contribution factor function [36]. Ahsan et al. present an application of mining developer effort data from a bug-fix activity repository to build effort estimation models for the purpose of planning (such as scheduling a project release date) and cost estimation [194]. Their approach consists of measuring the time elapsed between the assigned and resolved status of a bug report and multiplying the time with weights (incorporating defect severity-level information) to compute a developer contribution measure [194]. Kidane et al. propose two productivity indices (for online communities of developers and users of open source projects such as Eclipse) in their study on correlating temporal communication patterns with performance and creativity: a creativity index and a performance index [35]. They define the creativity index as the ratio of the number of enhancements (change in the quantity of software) integrated in a pre-defined time-period to the number of bugs resolved in the same time-period [35]. The performance index is defined as the ratio of the number of bugs resolved in a pre-defined time-period to the number of bugs reported in the same time-period. Kanij et al. mention that there are no well established and widely adopted performance metrics for software testers and motivate the need for a multi-faceted evaluation method for software testers [34]. They conduct a survey of industry professionals and list five factors which are important for measuring the performance of software testers: number of bugs found, severity of bugs, quality of bug report, ability of bug advocacy, and rigorousness of test planning and execution [34]. Rigby et al. present an approach to assess the personality of developers by mining mailing list archives [67]. They conduct experiments (using a psychometrically-based word count text analysis tool called Linguistic Inquiry and Word Count) on the Apache HTTPD server's mailing list to assess personality traits of Apache developers such as diligence, attitude, agreeableness, openness, neuroticism and extroversion [67]. Fernandes et al. mention that performance evaluation of teams is challenging in software development projects [33]. They propose an analytical model (a Stochastic Automata Networks model) and conduct a practical case-study in the context of a distributed software development setting [33].

5.2.2 Measure Team Productivity

FLOSS projects undergo phases of contributor evolution. In the initial phase, a small core team develops the software. As the project matures, multiple core teams contribute to develop the software [195][52]. Research shows that a majority of FLOSS projects fail due to the lack of sustained developer participation [63][57]. To estimate the stability of FLOSS projects, research identifies three community stability indicators, namely attrition, retention, and regeneration [53][63][57][56][189]. While the three indicators are known to explain stability, there are no metrics to quantify contributor churn. In another line of study, researchers compute the likelihood of future participation by mining software repositories [12][63][11][59]. Studies by Wu et al. and Yu et al. conduct time series analysis of contributor participation to identify trends and use it for predicting future participation [196][197]. However, existing studies do not explain how changes in the community participation pattern may affect the stability of the community.

5.2.3 Visualize Contributor Productivity

Visualization eases software understanding and the analysis of the evolution of software ecosystems [43][198]. It also finds application in analyzing software teams. To facilitate visualization in the management domain, Zhang et al. proposed a theoretical, general visualization model. They identify processes namely domain problem space analysis, domain data and knowledge collection, pattern discovery and data aggregation, and image construction as theoretical foundations to achieve data comprehension and improve problem solving performance [199]. Schonhage et al. built a prototype for visualization in business to aid managers in analyzing past and present data and exploring future situations [200]. Storey et al. proposed a framework to describe, compare and understand human centric awareness visualization tools in software development [26]. Taylor et al. introduced the concept of author entropy to characterize authorship. Author entropy in conjunction with other software metrics has the potential to identify areas of concern within source code [38]. Gilbert et al. explored group dynamics to compare developers and their contribution in a distributed software community through social visualization of code [9]. Robles et al. propose a novel methodology to visually analyze the evolution of the core team to ensure smooth transitions by identifying breakpoints and unevenness [54]. However, to the best of our knowledge, the existing literature does not explore the expressive power of visualization to analyze productivity by mining software repositories. In the context of this literature, the work presented here makes the following novel contributions:

1. A framework for assessing the productivity of software maintenance professionals. We propose 11 metrics for four different roles. The novel contribution of this work is to assess the developer in terms of the various roles played by the developer. The metrics (data-driven and evidence-based) are computed by mining data in an Issue Tracking System. While there are some studies on the topic of contribution assessment for Software Maintenance Professionals, we believe (based on our literature survey) that the area is relatively unexplored, and in this study we present a fresh perspective on the problem.

2. A survey conducted with experienced software maintenance professionals on the topic of assessing contributor productivity for the bug reporter, bug owner, bug triager and collaborator. We believe that there is a dearth of academic studies surveying the current practices and metrics used in the industry for contribution assessment of software maintenance professionals. To the best of our knowledge, the work presented here is the first study to present the framework, a survey of practitioners, and an application of the proposed framework on a real-world publicly available dataset from a popular open-source project.

3. We propose 6 visualization techniques for the four roles of software maintenance professionals, viz. bug reporter, bug triager, bug owner and bug collaborator. The unique contribution here is a visualization framework (panoramic view) to justifiably distinguish performance by deriving actionable insights. We see applications of visualization in the software industry to analyze processes and generate awareness. However, its application to measure the productivity of software maintenance professionals by mining software repositories is, to the best of our knowledge, unexplored.

4. Implementation of the proposed visualization techniques on real-world data from GC-ITS, inference of actionable insights, and validation of their usability based on survey responses from professionals.

5. A framework to quantify key stability indicators on community participation patterns, investigate trends, and predict the stability of software maintenance projects as identified by mining the Issue Tracking System.

5.3 methodology

We believe that inputs from practitioners are needed to inform our research. So we conduct a survey of two experienced industry professionals (project manager level with more than a decade of experience in software development and maintenance) and present the results of the survey as well as our insights (refer to Section 5.4.1). We present 11 performance and contribution metrics for four roles: bug reporter, bug triager, bug owner and collaborator. We define each metric and describe the rationale and the formula (Section 5.5). The metrics are discussed at an abstract level and are not specific

Figure 28: Research method used to measure software contributor productivity

to a project or Issue Tracker. Next, we use the general visualization model proposed by Zhang et al. [199] to propose a framework of visualizations. We start by studying variegated sources for unconventional visualizations. We identify approximately 30 unique visualizations. For each visualization, we analyze the rationale behind its proposition and the nature of information that can be inferred from the visualization. We map these visualizations to the metrics to capture multiple information needs of decision makers. Each visualization technique maps to multiple metrics. However, due to limited space availability and for the ease of comprehension, we present one metric for each visualization (refer to Section 5.4.6). All visualizations presented in this work are written in the R language1 (GNU project for statistical computing and graphics). Post-implementation, we conduct a survey of practitioners. We received 10 valid responses from practitioners in a large global IT company. 9 survey respondents had more than 5 years of experience and 1 had more than 1 year of experience. 8 survey respondents played the role of Project Manager, 1 was head of R & D initiatives, and 2 were Bug Owner/Bug Fixer, with overlapping roles. 8 out of 10 survey respondents had been appraisers in the past or were appraisers at the time of the survey. We also propose metrics to analyze temporal community contribution. We identify three KSIs, namely attrition, regeneration, and retention, from the literature. We propose metrics to quantify the three KSIs by mining the ITS. A contributor can play multiple roles by participating in multiple activities. For instance, a contributor may report an issue (reporter), and also own it (owner). A contributor may comment on an issue (commenter) or may collaborate because the contributor is cc'ed on the issue (collaborator-cc). In this study, we investigate participation in four roles, and present the results. The list of four roles is representative and not exhaustive. Also, the roles are overlapping and by no means represent disjoint sets. We generate time series data of the metrics and investigate the inferential ability of the temporal trends. We present representative research questions to investigate the inferential ability of the metrics. We model the community participation pattern to estimate future participation by using statistical prediction models, with justifications for the choice of models (refer to Section 5.5.4). We assess the goodness of fit of the models, compute the accuracy of prediction, compare models, and present our inferences. In short, we gather requirements for the study by talking to practitioners or from the literature. We propose metrics that meet the requirements identified in stage one. Then we evaluate them based on inputs from practitioners or by examining the characteristics of the project. Figure 28 summarizes the research method used to measure individual and team productivity.

Experimental dataset

The dataset for analyzing the results is built from GC-ITS. The choice of dataset is motivated by the following reasons: 1) it is publicly available, 2) the dataset is replicable (it can be downloaded using the Issue Tracking System API), and 3) wide usage and popularity. We implement the proposed metrics and visualizations on the Google Chromium dataset. Unless specified otherwise, all analysis is conducted on ‘Closed’ bug reports with status ‘Fixed’ or ‘Verified’. To evaluate the productivity of software maintenance professionals, we analyze one year of data. This duration is in agreement with the performance appraisal cycle at most organizations. The details of the dataset are given in Table 24. The one year dataset consists of 24,743 bugs, 6,765 reporters, 500 triagers, 391 owners, and 5,043 collaborators.

1 http://www.r-project.org

Table 24: Experimental Dataset 1
Duration: 1 year
Start date of the first bug report in the dataset: 01/01/2009
End date of the last bug report in the dataset: 31/12/2009
Bug ID of the first bug report: 5,954
Bug ID of the last bug report: 31,403
Total number of Reported bugs: 25,383
Total number of Forbidden bugs: 640
Total number of Available bugs: (25,383 − 640) = 24,743

Table 25: Experimental Dataset 2
Duration: 1 year each
Start date of year-I: 01/01/2009    End date of year-I: 31/12/2009
Start date of year-II: 01/01/2010   End date of year-II: 31/12/2010

                                     Year 2009-10    Year 2010-11
Available bugs                       24,629          36,176
CLOSED bugs                          21,984          32,289
Statistics for CLOSED bug reports (count of):
Bug Reporters                        6,742           13,581
Bug Triagers                         325             475
Bug Owners                           307             480
Bug Collaborators                    12,324          15,307

Out of the 24,743 reported bugs, 8,628 bug reports were Closed with status Fixed or Verified. Next, to help visualize the trend, we analyze contributions in two consecutive years (2009-11) with 34,056 unique contributors. However, the dataset used to create the visualization for a role encompasses fewer developers than the total (refer to Table 25). This happens due to 1) exclusion of data points with missing or inconsistent information, 2) considering developers who worked for two years, and 3) setting thresholding criteria on the selection of developers for analysis. Finally, to analyze temporal contributor community participation, we conduct experiments on four years of data of the Google Chromium Issue Tracking System (GC-ITS), extracted from January 1, 2009 to December 31, 2013, and measure it quarterly. In the Google Chromium project, we observe that contributor participation is intermittent in time. A contributor may resume participation months after discontinuing participation in the project. We assume that a contributor has left the project when the contributor does not participate for at least three months. This may generate false positives, as some contributors resume participation quarters after their last contribution. Moreover, this behavior is exhibited by the majority of participants (70% of contributors in FLOSS projects are one-time contributors or contribute sporadically in time [54]). Thus, treating a lag in participation as contributor churn, we portray the community behavior.

5.4 individual key performance indicators - measurement and visualization

In this section, we describe 11 performance and contribution assessment metrics for four different roles. These 11 metrics are representative metrics valued by the organization and are not exhaustive.

Table 26: Experimental Dataset 3
Start Date     End Date       Available Bug Reports   Contributor Count
2009-01-01     2009-12-31     25,841                  13,006
2010-01-01     2010-12-31     44,041                  26,654
2011-01-01     2011-12-31     51,683                  28,170
2012-01-01     2012-12-31     57,970                  30,510

Table 27: Defining the metadata of the survey
Attribute: PIK — Perceived Importance of Key Performance Indicators
Scale: 5: Highly influence (important) ... 1: Does not influence (least important)
Attribute: CopKC — Current organizational practices to capture KPIs
Scale: 1: Captured accurately (formula based, i.e. objective measure); 2: Captured accurately (based on subjective information, self-reporting and perception); 3: Unable to capture accurately; 4: Not relevant / Not considered

Following are a few introductory and relevant concepts which are common to several metric definitions; the Google Chromium project describes these in its bug-lifecycle and reporting guidelines2. A closed bug can have status values such as: Fixed, Verified, Duplicate, Won’tFix, ExternalDependency, FixUnreleased, Invalid and Others3. An issue can be assigned a priority value from 0 to 3 (0 is most urgent; 3 is least urgent). Google Chromium issue reports have a field called Area4. Area represents the product area to which an issue belongs. An issue can belong to multiple areas such as BuildTools, ChromeFrame, Compat-System, Compat-Web, Internals, Feature and WebKit. The product area BuildTools maps to Gyp, gclient, gcl, buildbots, trybots, etc., and the product area WebKit maps to HTML, CSS, Javascript, and HTML5 features.

5.4.1 Survey of Software Maintenance Professionals

We conducted a survey to understand the relative importance of various Key Performance Indicators (KPIs) for the four roles: bug reporter, bug owner, bug triager and collaborator. We prepared a questionnaire consisting of four parts (refer to Tables 28, 29, 30, 31). Each part corresponds to one of the four roles. The questionnaire for each part consists of a question and two response fields. The question mentions a specific activity for a particular role. For example, for the role of bug owner the activity is: number of high priority bugs fixed, and for the role of bug triager the activity is: number of times the triager is able to correctly assign the bug report to the expert (the bug fixer who is the best person to resolve the given bug report). One of the response fields, Perceived Importance of Key Performance Indicators (PIK), denotes the importance of the activity (for the given role) on a scale of 1 to 5 (5 being highest and 1 being lowest). The other response field, Current Organizational Practices to capture KPIs (CopKC), denotes the extent to which these performance indicators are captured in practice. We broadly define the notion of the ability to capture KPIs into two heads: non-relevant and relevant. The KPI is either considered non-relevant (for the evaluation of developers) and hence not measured, or it is considered important. The relevant KPIs are either not captured (inability of the organizations to measure them) or the organizations are able to capture them. The KPIs captured can again be classified, based on the approach to measure them, as objective or subjective. Objective measures are captured using statistics (based on formulas) while subjective measures are evaluated based on the perceived notion of contribution. This hierarchical classification completely defines the notion of measurement in organizations (refer to Table 27 specifying the metadata for the survey). We surveyed two experienced software maintenance professionals for this study. The two professionals had more than 10 years of experience in the software industry and were at the project manager level. One of them worked in a large and global IT (Information Technology) services company (more than 100 thousand employees) and the other worked in a small (about 100 employees) offshore product development services company. Tables 28, 29, 30 and 31 show the results of the survey. Some of the cells in the tables have the values ‘−’ or ‘NULL’. ‘−’ means that at the time of the survey, the survey respondent was not asked this question and ‘NULL’ implies that the survey respondent did not answer this question (left the field blank). We find gaps between the perceived importance of certain performance indicators and the extent to which such indicators are objectively and accurately measured in practice. We believe that the performance indicators considered important are not measured rigorously due to the lack of tool support.

2 http://www.chromium.org/for-testers/bug-reporting-guidelines
3 user defined Status as permitted by Google Chromium Bug Reporting Guidelines
4 http://www.chromium.org/for-testers/bug-reporting-guidelines/chromium-bug-labels

Table 28: Survey results for the role of bug reporter
Survey questions | Company A PIK | Company A CopKC | Company B PIK | Company B CopKC
Number of bugs reported | 4 | 1 | 3 | 1
Number of Duplicate bugs reported | 5 | - | 4 | -
Number of non-Duplicate bugs reported | 3 | - | 5 | -
Number of Invalid bugs reported | 3 | - | 5 | -
Number of high Priority bugs reported | 5 | NULL | 4 | 1
Number of high Severity bugs reported | 5 | NULL | 5 | 1
Number of bugs reported that are later reopened | NULL | NULL | 4 | 1
Number of hours worked | - | 3 | - | -
Quality of reported bugs | 5 | 3 | 5 | 2
Followed bug reporting guidelines | 5 | 2 | 5 | 2
Correctly assigned bug area (like WebKit, BrowserUI etc.) | 5 | NULL | 4 | 3
Correctly assigned bug type (like Regression, Performance etc.) | 4 | NULL | 5 | 3
Reported bugs across multiple components (Diversity of Experience) | 4 | 3 | 2 | 3
Reported bugs belonging to a specific component (Specialization) | NULL | 3 | 4 | 3
Participation level delivered (responded to queries and clarifications) after reporting bug | - | 3 | - | 3

Table 29: Survey results for the role of bug owner
Survey questions | Company A PIK | Company A CopKC | Company B PIK | Company B CopKC
Number of bugs assigned or owned | 5 | - | 4 | -
Number of bugs successfully resolved (from the set of bugs owned) | 5 | NULL | 5 | 1
Number of high Priority bugs owned | 5 | 1 | 4 | 3
Number of high Severity bugs owned | 5 | 3 | 4 | 1
Number of resolved bugs that get reopened | 5 | 1 | 4 | 1
Number of hours worked | - | 1 | - | 3
Participated in (facilitated) discussion through comments on Issue Tracker | 4 | - | 3 | -
Owned bugs across multiple components (Diversity of Experience) | 5 | 3 | 3 | 3
Owned bugs belonging to a specific component (Specialization) | 5 | 3 | 3 | 3
Average time taken to resolve bug | 5 | NULL | 4 | 2
Response time to a directly addressed comment | - | 3 | - | NULL

Table 30: Survey results for the role of bug collaborator
Survey questions | Company A PIK | Company A CopKC | Company B PIK | Company B CopKC
Participation level delivered (responded to queries and clarifications) through online threaded discussions | 3 | NULL | 4 | 1
Response time to a directly addressed comment on Issue Tracker | 4 | NULL | 5 | 3
Collaborated in bugs across multiple components (Diversity of Experience) | 2 | 4 | 2 | 3
Collaborated in bugs belonging to a specific component (Specialization) | 3 | 2 | 2 | 3
Average time taken to resolve bug | 5 | 3 | 1 | 2
Number of times collaborated on high Priority bugs | 5 | NULL | 3 | 2
Number of times collaborated on high Severity bugs | 5 | NULL | 3 | 2
Number of hours worked | - | 3 | - | 3

Table 31: Survey results for the role of bug triager
Survey questions | Company A PIK | Company A CopKC | Company B PIK | Company B CopKC
Number of bugs Triaged | 5 | NULL | 5 | 2
Number of hours worked | - | 2 | - | 3
Identified Duplicate bug reports | 5 | NULL | 4 | 2
Identified Invalid bug reports | 5 | NULL | 4 | 2
Identified exact bug area (like WebKit, BrowserUI etc.) | 5 | 3 | 5 | 3
Assigned best developer considering skills and workload | 5 | - | 5 | -
Correctly assigned owner (fixer) | - | 3 | - | 3
Correctly assigned bug type (like Regression, Performance etc.) | 4 | 2 | 5 | 2
Correctly assigned Priority/Severity | 5 | 3 | 5 | 3
Participated in (facilitated) discussion through comments on Issue Tracker | 4 | - | 4 | -
Response time to a directly addressed comment on Issue tracker | - | 3 | - | 3

Results of the survey (refer to Tables 28, 29, 30 and 31) are evidence supporting the need for developing a contribution and performance assessment framework for Software Maintenance Professionals. It is interesting to note that the two experienced survey respondents from different sized organizations have given similar values to some KPIs, while for the others, the results varied considerably. Based on the survey results, we infer that the perceived value of an attribute varies across organizations. Therefore, the proposed metrics should account for the perceived value of the attribute for the organization while measuring the contribution of a developer for a given role.

5.4.2 Bug Owner Metrics

priority weighted fixed issues (pwfi) We propose a contribution metric for the bug owner which incorporates the number of bugs fixed as well as the priority of each bug fixed. For each bug owner in the dataset, we count only Fixed or Verified issues and not Duplicate, Won’tFix, ExternalDependency, FixUnreleased and Invalid. We exclude Invalid and Won’tFix bugs because these bugs are not a contribution by the bug owner to the project.

PWFI(d) = \sum_{i=0}^{|P|} W_{P_i} \times \frac{N^{d}_{P_i}}{N_{P_i}} \qquad (4)

where W_{P_i} is the weight (a tuning parameter which is an indicator of importance, higher for urgent bugs and lowest for the least urgent bugs) or multiplication factor incorporating issue priority information. Due to the weight W_{P_i} (the weights sum to one), a bug owner's contribution is not just a function of the absolute number of bugs fixed but also depends on the type of bugs fixed. N_{P_i} is the total number of Fixed or Verified bugs in the dataset with priority P_i, and N^d_{P_i} is the number of priority P_i bugs Fixed or Verified by developer d in the dataset. The value of PWFI(d) lies between 0 and 1 (the higher the value, the more the contribution).

specialization and breadth index (sbi) A developer can specialize and be an expert in a specific product area (component) or can have knowledge across several product areas. Let n be the total number of product areas and let d denote a developer. We represent B_n(d) as an index for breadth of expertise and 1 − B_n(d) as an index for specialization. p_i is the probability value (probability of developer d working on component i) derived from the historical dataset. A lower value of B_n(d) means specialization and a higher value means breadth of expertise of a developer.

B_n(d) = -\sum_{i=1}^{n} p_i \cdot \log_n(p_i) \qquad (5)

Table 32: Priority-Time correlation statistics
Priority (of bugs) | Count | Time Duration (in days): Maximum | 3/4 Quartile | Median | 1/4 Quartile | Minimum
P0 | 258    | 495.37  | 10.69  | 2.71  | 0.44 | 0.0002
P1 | 2,839  | 1120.37 | 44.10  | 11.04 | 2.03 | 0.0000
P2 | 17,923 | 1273.15 | 129.63 | 21.99 | 1.93 | 0.0000
P3 | 2,265  | 1180.56 | 206.02 | 55.90 | 5.03 | 0.0000

The B_n(d) value can vary from 0 (minimal) to 1 (maximal). If the probability of all areas is the same (uniform distribution), then B_n(d) is maximal. This is because the p_i value is the same for all n areas. At the other extreme, if there is only one area associated with a particular developer d, then B_n(d) becomes minimal (value of 0). The interpretation is that when B_n(d) is low for a specific developer, a small set of product areas is associated with that developer d.
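As a minimal sketch in R of how PWFI and the breadth index B_n(d) could be computed, assuming per-priority count vectors and per-area count vectors as inputs (the names `pwfi`, `breadth_index`, `fixed_by_dev`, `fixed_total`, `w` and `area_counts` are illustrative, not part of the thesis tooling):

```r
# PWFI (Equation 4): weighted share of Fixed/Verified bugs per priority.
# `fixed_by_dev` and `fixed_total` are counts per priority (same order);
# `w` are priority weights summing to one.
pwfi <- function(fixed_by_dev, fixed_total, w) {
  sum(w * fixed_by_dev / fixed_total)              # value in [0, 1]
}

# Breadth index B_n(d) (Equation 5): normalized entropy over product areas.
# `area_counts` holds the developer's bug counts per product area.
breadth_index <- function(area_counts) {
  n <- length(area_counts)                         # total number of product areas
  p <- area_counts / sum(area_counts)
  p <- p[p > 0]                                    # treat 0 * log(0) as 0
  -sum(p * log(p, base = n))                       # 0 = specialist, 1 = generalist
}
```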

deviation from median time to repair (dmttr) Bugs reported in an Issue Tracking System can be categorized into four classes as adaptive, perfective, corrective and preventive5 (as part of Software Maintenance Activities). Depending on their class, these bugs have different urgency with which they must be resolved. For instance, a loophole reported in the security of a software system must be fixed prior to improving its Graphical User Interface (GUI). According to the Google Chromium bug label guidelines, the priority of a bug is proportional to the urgency of the task. The more urgent a task, the higher its priority. Also, urgency is a measure of the time within which the bug must be fixed. Therefore, high priority bugs should take less time to repair compared to low priority bugs (more urgent tasks take less time to fix). To support this assertion, we conducted an experiment on the Google Chromium Issue Tracking System dataset to calculate the median time required to repair bugs of the same priority. The results of the experiment support the assertion (as shown in Table 32). One of the key responsibilities for the role of bug owner is to fix bugs in time, where the time required may be influenced by various external factors. Bettenburg et al. pointed out that a poor quality bug report increases the effort and hence the time required to fix it [201][202]. Irrespective of these factors, ensuring timely completion of the reported bugs is an indicator of positive contribution for the role of bug owner and vice-versa. The DMTTR metric is defined for Closed bugs with status Fixed or Verified. It calculates the deviation of the time required to fix a bug from the median time required to repair bugs of the same priority.

DMTTR(o, p) = \frac{1}{T_{total}} \times \sum_{i=0}^{|P|} w_i \times \sum_{\forall b_i} (T_{b_i} - MTTR_{P_i}) \qquad (6)

where (T_{b_i} - MTTR_{P_i}) = \begin{cases} |T_{b_i} - MTTR_{P_i}| & \text{if } (T_{b_i} - MTTR_{P_i}) < 0 \\ 0 & \text{otherwise} \end{cases}

DMTTR(o, n) = \frac{1}{T_{total}} \times \sum_{i=0}^{|P|} w_i \times \sum_{\forall b_i} (T_{b_i} - MTTR_{P_i}) \qquad (7)

where (T_{b_i} - MTTR_{P_i}) = \begin{cases} |T_{b_i} - MTTR_{P_i}| & \text{if } (T_{b_i} - MTTR_{P_i}) > 0 \\ 0 & \text{otherwise} \end{cases}

The time required to fix a bug may be less than, greater than or equal to the median time required to fix bugs of the same priority. Less or equal time required to fix a bug is a measure of positive contribution and vice-versa. We propose two variants of DMTTR(o), i.e. DMTTR(o, p) and DMTTR(o, n). For an owner o, DMTTR(o, p) measures the positive contribution p (relatively less time to repair) and DMTTR(o, n) measures the negative contribution n (relatively more time to repair). w_i is a normalized tuning parameter which weighs the deviation proportional to priority, i.e. less time required to fix a high priority bug adds more value (to the contribution of a bug owner) than less time required to fix a low priority bug. Here b_i is a bug report with priority P_i. T_{b_i} is the total time required to fix bug b_i and is measured as the difference between the timestamp at which the bug is reported as Fixed and the first reported comment with status Started or Assigned in the Mailing List. MTTR_{P_i} is the median of the time required to fix bugs with priority P_i (the median is robust to outliers6). T_{total} is a normalizing parameter and is calculated as the summation of the time taken by each participating bug.

5 http://en.wikipedia.org/wiki/Software_maintenance
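A minimal sketch in R of both DMTTR variants, assuming a data frame `bugs` for one owner with columns `priority` and `time_to_fix` (in days), a named weight vector `w` and a named vector `mttr` of project-wide median repair times per priority; all names are illustrative:

```r
# DMTTR(o, p) with positive = TRUE (Equation 6); DMTTR(o, n) with
# positive = FALSE (Equation 7). Deviations are measured against the
# per-priority median repair time and normalized by the owner's T_total.
dmttr <- function(bugs, mttr, w, positive = TRUE) {
  p    <- as.character(bugs$priority)
  dev  <- bugs$time_to_fix - mttr[p]               # T_bi - MTTR_Pi
  keep <- if (positive) dev < 0 else dev > 0       # faster / slower than median
  sum(w[p][keep] * abs(dev[keep])) / sum(bugs$time_to_fix)
}
```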

5.4.3 Bug Reporter Metrics

In an Issue Tracking System, bugs are reported by various users (project members and non-project members). These users may report bugs of different quality. This fact was highlighted by Schugerl et al. in their work. They said that the quality of reported bugs varies significantly [203], and a good quality bug report optimizes the time required to fix it and vice-versa [201]. Chromium in its bug reporting guidelines specified the factors that ensure the quality of the reported bugs. However, evaluating bug reports based on these guidelines is subjective [203], and so is assessing the contribution of the bug reporter (result of the online survey mentioned in Section 5.4.1).

status of reported bug index (sri) A Closed bug may have any one of the status values defined in Equation 9. The fraction of Closed bugs reported by a developer is a measure of contribution for the role of bug reporter. However, the status with which these bugs were Closed is a measure of the quality of contribution. For instance, a bug Closed with status Fixed adds more value to the contribution of the bug reporter than a bug closed as Invalid.

SRI(r) = \frac{N_r}{N} \times \sum_{s=1}^{|s|} w_s \times \frac{C^r_s}{C_s} \qquad (8)

Let N be the total number of bugs reported and N_r be the total number of bugs reported by reporter r. w_s is a normalized weight (tuning parameter) defined for each status s; C_s is the count of bugs reported with status s, and C^r_s is the count of bugs (reported by reporter r) with status s. s is defined as follows:

s \in \{ \text{Fixed}, \text{Verified}, \text{FixUnreleased}, \text{Invalid}, \text{WontFix}, \text{Duplicate}, \text{ExternalDependence}, \text{Other}^{7} \} \qquad (9)

The relative performance is a fraction calculated with respect to the baseline S_Mode, where S_Mode is the most frequently delivered performance.
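A minimal sketch in R of SRI(r), assuming `reported` is a data frame of Closed bug reports with columns `reporter` and `status`, and `w` is a named weight vector covering the statuses in Equation 9; the names are illustrative:

```r
# SRI (Equation 8): reporter's share of reports, weighted by the status
# each of their reports closed with.
sri <- function(reported, r, w) {
  c_s   <- table(reported$status)                            # C_s
  c_s_r <- table(reported$status[reported$reporter == r])    # C_s^r
  s     <- names(c_s_r)
  (sum(reported$reporter == r) / nrow(reported)) *           # N_r / N
    sum(w[s] * as.numeric(c_s_r) / as.numeric(c_s[s]))
}
```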

degree of customer eccentricity (dce) The Google Chromium Issue Tracking System defines the priority of a bug report in terms of the urgency of the task. A task is urgent if it affects business8. Among other factors, business is affected (measured by priority) if bugs in the software influence a large fraction of the user base. Hence, reporting a large fraction of high priority bugs is an indication of the degree of customer eccentricity.

DCE(r) = \sum_{i=0}^{|P|} W_{P_i} \times \frac{N^{r}_{P_i}}{N_{P_i}} \qquad (10)

where W_{P_i} is a normalized tuning parameter with value proportional to the priority, N_{P_i} is the total number of priority P_i bugs reported, and N^r_{P_i} is the total number of priority P_i bugs reported by bug reporter r. This metric is defined for Closed bugs with status Fixed or Verified.
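DCE mirrors the structure of PWFI, but counts reported rather than fixed bugs; a minimal sketch in R, with `reported_by_r` and `reported_total` as per-priority counts (same order) and `w` as normalized priority weights — illustrative names only:

```r
# DCE (Equation 10): priority-weighted share of bugs reported by reporter r.
dce <- function(reported_by_r, reported_total, w) {
  sum(w * reported_by_r / reported_total)
}
```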

5.4.4 Bug Triager Metrics

In the Triage Best Practices9 written for the Google Chromium project it is stated that identifying the correct project owner is one of the roles of a bug triager. Guo et al. stated that determining ownership is one of the five primary reasons for bug report reassignment [204].

6 http://en.wikipedia.org/wiki/Outlier
8 http://sqa.fyicenter.com/art/Bug_Triage_Meeting.html
9 http://www.chromium.org/for-testers/bug-reporting-guidelines/triage-best-practices

Jeong et al. in their study on Mozilla and Eclipse reported that 37%-44% of the reported bugs are reassigned, which increases the time-to-correction [205]. A bug triager is supposed to conduct a quality check on bug reports (to make sure that a report is not spam and contains the required information such as steps to reproduce, etc.) and assign the bug report to a developer who is an (available) expert at resolving similar bugs. A bug triager should have good knowledge about the expertise, workload and availability of bug developers (fixers). Incorrectly assigned bugs (errors by the bug triager) cause multiple reassignments, which in turn increase the repair time (productivity loss and delays). We believe that computing the number of correct and incorrect reassignments is thus an important key performance indicator for the role of a bug triager. Google Chromium in its Issue Tracking System does not give any account of the Triager who initially assigned the project owner, but subsequent reassignments (through labels defined in comments in the Mailing List) are available.

reassignment effort index (rei) Reassignments are not always harmful and can be the result of five primary reasons: identifying the root cause, incorrect project owner, poor quality of the reported bug, difficulty in identifying a proper fix, and workload balancing [204]. In an attempt to identify the correct project owner, triagers make a large number of reassignments. This large number of reassignments to correctly identify the project owner is a measure of the effort incurred by a bug triager.

REI(t) = \frac{N_t}{N} \times \sum_{\forall b} \frac{CR^t_b}{CR_b} \qquad (11)

Let N be the total number of bug reports which were at least once reassigned and N_t be the total number of bug reports which were at least once reassigned by triager t. This metric is defined for Closed bugs with status Fixed or Verified. For a given bug report b, CR_b is the count of the total number of project owner reassignments and CR^t_b is the count of project owner reassignments by triager t.
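A minimal sketch in R of REI(t), assuming `reassign` is a data frame with one row per project owner reassignment and columns `bug_id` and `triager`; the names are illustrative:

```r
# REI (Equation 11): the triager's share of reassigned bugs, weighted by
# the triager's share of reassignments within each bug.
rei <- function(reassign, t) {
  cr_b   <- table(reassign$bug_id)                            # CR_b
  cr_b_t <- table(reassign$bug_id[reassign$triager == t])     # CR_b^t
  (length(cr_b_t) / length(cr_b)) *                           # N_t / N
    sum(as.numeric(cr_b_t) / as.numeric(cr_b[names(cr_b_t)]))
}
```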

accuracy of reassignment index (ari) Correctly identifying the project owner is a challenging task. It involves a large number of reassignments by multiple triagers. Each of these triagers makes their contribution through reassignments. However, the accuracy index (for the role of bug triager) is a measure of the number of times a triager correctly identifies the project owner in a bug with a large number of reassignments.

ARI(t) = \frac{1}{n \times CR_{max}} \sum_{\forall b} CR_b \qquad (12)

where n is the total number of bug reports which were accurately reassigned (last reassigned) by triager t. For all such bug reports b, CR_b measures the count of the total number of reassignments before the bug was accurately (finally) reassigned by triager t. CR_{max} measures the maximum number of reassignments before the final one. The value of ARI(t) lies between 0 and 1 (the higher the value, the more the contribution).
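A minimal sketch in R of ARI(t), assuming `pre_final` is a numeric vector holding, for every bug whose final (accurate) reassignment was made by triager t, the number of reassignments before that final one, and taking CR_max over the same set of bugs as a simplification; both the name and that simplification are illustrative assumptions:

```r
# ARI (Equation 12): average number of prior reassignments on bugs that the
# triager finally reassigned correctly, normalized by the maximum observed.
ari <- function(pre_final) {
  n <- length(pre_final)
  sum(pre_final) / (n * max(pre_final))            # value between 0 and 1
}
```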

5.4.5 Bug Collaborator Metrics

A group of people who collectively (as a team) contribute to fix a bug are termed bug collaborators. According to the user guide for the Project Hosting Issue Tracker10, collaborators provide additional information (in the form of comments in the Mailing List) to fix bugs. These comments contribute to the timely resolution of the reported bug.

cumulative comment score (ccs) Introduction to Issue Tracker - a user guide for the Project Hosting Issue Tracker - states that developers make multiple comments in the mailing list. The number of comments entered in the mailing list is a contribution of the developer and is used to measure performance.

CCS(c) = \frac{1}{n} \sum_{\forall b} \frac{N^c_b}{N_b} \ast K_b \qquad (13)

10 http://code.google.com/p/support/wiki/IssueTracker

Here, N_b is the total number of comments in bug report b and N^c_b is the total number of comments entered by collaborator c in bug report b. n is the count of the total number of bug reports. K_b is a tuning parameter to account for the variable number of comments in bug reports. K_b is defined as the ratio of N_b and N_Mode, where N_Mode is the count of the number of comments entered most frequently by collaborators (except collaborator c) on bug b. This metric is defined for Closed bugs with status Fixed or Verified.
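A minimal sketch in R of CCS(c), assuming `comments` is a data frame with one row per comment and character columns `bug_id` and `author`, restricted to the bugs on which collaborator `c` commented, and using a simple statistical mode for N_Mode; the names are illustrative:

```r
# Statistical mode of a numeric vector (most frequent value).
stat_mode <- function(x) as.numeric(names(which.max(table(x))))

# CCS (Equation 13): per-bug comment share times K_b, averaged over bugs.
ccs <- function(comments, c) {
  bugs <- unique(comments$bug_id)
  mean(sapply(bugs, function(b) {
    on_b   <- comments[comments$bug_id == b, ]
    n_b    <- nrow(on_b)                                     # N_b
    n_b_c  <- sum(on_b$author == c)                          # N_b^c
    others <- table(as.character(on_b$author[on_b$author != c]))
    n_mode <- if (length(others) > 0) stat_mode(as.numeric(others)) else n_b
    (n_b_c / n_b) * (n_b / n_mode)                           # (N_b^c / N_b) * K_b
  }))
}
```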

potential contributor index (pci) A developer can specialize and be an expert in one or multiple areas (as calculated in SBI(d)). Multiple developers (potential contributors) are added to the CC-list based on prior knowledge (historical data or past experience) of their domain of expertise.

PCI(c) = \sum_{\forall a} \frac{N^{cc}_a}{N_a} \times V_a \qquad (14)

Given a bug report b, the PCI metric identifies the list of contributors who are valued to fix it by assigning them values in the range from 0 to 1. It is defined for Closed bugs with status Fixed or Verified and with a CC-list defined. Here N_a is the total number of bugs reported in area a, and N^cc_a is the number of times prospective collaborator c was mentioned in the CC-list of bugs reported in area a. V_a is the importance of the area of the reported bug (as calculated in PWFI for area a).

potential contributor index-1 (pci-1) PCI-1 extends the idea presented in PCI. In the bug fixing lifecycle, contributors are incrementally appended to the CC-list to meet additional support or expertise requirements.

PCI\text{-}1(c) = \sum_{\forall a} \frac{N^{appCC}_a}{N_a} \times V_a \qquad (15)

The rest of the parameters being the same (as explained in PCI), N^appCC_a is the count of the number of times collaborator c was incrementally appended to the CC-list of bugs with Area a.

contribution index (ci) When a bug is reported, developers are added to the CC-list as potential or prospective contributors. However, only a few people mentioned in the CC-list contribute to fixing the bug through comments in the Mailing List.

CI(c, a) = \sum_{\forall a} \frac{N^{col}_a}{N^{cc}_a} \times V_a \qquad (16)

CI(c, a) measures expected versus actual contribution for the role of bug collaborator. Here N^cc_a is the number of times collaborator c was added in the CC-list of bugs reported in area a, and N^col_a is the subset of N^cc_a in which developers mentioned in the CC-list actually contributed to fixing the bug. V_a is the importance of the area of the reported bug (as calculated in PWFI for area a). It values contribution proportional to the importance of the area and normalizes it.
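A minimal sketch in R of PCI, PCI-1 and CI for a single collaborator, assuming per-area count vectors in the same area order: `n_area` (bugs reported per area), `cc` (times the collaborator appeared in the CC-list), `cc_app` (times appended to the CC-list later), `col` (times the collaborator actually commented after being CC'ed) and `v_area` (area importance); all names are illustrative:

```r
# PCI (Equation 14) and PCI-1 (Equation 15): CC-list mentions weighted by
# area importance; CI (Equation 16): actual vs. expected contribution.
pci   <- function(cc, n_area, v_area)     sum(v_area * cc / n_area)
pci_1 <- function(cc_app, n_area, v_area) sum(v_area * cc_app / n_area)
ci    <- function(col, cc, v_area) {
  ratio <- ifelse(cc > 0, col / cc, 0)             # avoid division by zero
  sum(v_area * ratio)
}
```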

5.4.6 Visualizations

We use the general visualization model proposed by Zhang et al. [199]. We start by studying variegated sources for unconventional visualizations. We identify approximately 30 unique visualizations. For each visualization, we analyze the rationale behind its proposition and the nature of information that can be inferred from the visualization. We map these visualizations to the metrics to capture multiple information needs of decision makers. Each visualization technique maps to multiple metrics. However, due to limited space availability and for the ease of comprehension, we present one metric for each visualization (refer to Section 5.4.6). All visualizations presented in this work are written in the R language11 (GNU project for statistical computing and graphics). Unless specified otherwise, all analysis is conducted on ‘Closed’ bug reports with status ‘Fixed’ or ‘Verified’. We assign each developer a unique identity (chronologically). The unique identity is of the

11 http://www.r-project.org

form ‘dxxxx’ where ‘x’ is any numeric value. This unique identifier ensures anonymity for ethical reasons and uniformity. In this section we discuss 6 visualization techniques. Table 33 presents an analysis and comparison of the 6 visualization techniques.

trellis plot The Trellis plot is a visualization technique to uncover relationships of variables in a multivariate dataset. Systematic application of the Trellis plot shows how the response depends on explanatory variables [206][207]. We use the Trellis plot visualization to analyze and interpret the performance of bug reporters using the Status of Reported bug Index (SRI) metric. Figure 29 is an arrangement of rows, columns and panels. Panels in Figure 29 represent statuses of the bugs reported with state ‘Closed’. They are arranged in the non-increasing order of importance of each status towards contribution and performance. Rows in each panel are the bug reporters arranged in non-increasing order of the number of bugs reported annually. Based on performance, bug reporters can be broadly categorized into five heads as ‘Excellent’, ‘Very Good’, ‘Good’, ‘Satisfactory’ or ‘Non-Satisfactory’. The criteria for classification are specified in the caption of Figure 29. The classification criteria for bug reporters can be fine-tuned to meet the specific needs of an organization. In Figure 29, the horizontal axis is the fraction of bugs reported by a bug reporter (shown on the vertical axis in blue) for a given status (each panel) for two consecutive years, 2009-10 and 2010-11 (represented by symbol variables: red for the year 2009-10; blue for the year 2010-11). Bug reporters are arranged (top to bottom) in the non-increasing order of the number of bugs reported, followed by their score on the Status of Reported bug Index (SRI) metric (in pink). 20 bug reporters are categorized into performance heads as ‘Excellent’, ‘Very Good’, ‘Good’, ‘Satisfactory’ and ‘Non-Satisfactory’ (based on the total number of bugs reported) [‘Excellent’ >250 bug reports, ‘Very Good’ >200 and <250 bug reports, ‘Good’ >100 and <200 bug reports, ‘Satisfactory’ >50 and <100 bug reports, and ‘Non-Satisfactory’ <50 bug reports]. Statuses are arranged (top to bottom) in non-increasing order of relevance (in terms of contribution) for the role of bug reporter. Following are interpretations for the performance appraiser:

1. Which bug reporters have high efficiency? Efficiency is the magnitude of time, effort or cost utilized to achieve a target12. In Figure 29 we observe that bug reporter ‘d28779’ reports more bug reports than bug reporter ‘d0001’ (bug reporters are arranged top to bottom in non-increasing order of the total number of bugs reported). However, bug reporter ‘d0001’ has a higher SRI score than bug reporter ‘d28779’ (as shown in pink on the right). We conclude that bug reporter ‘d0001’ has higher efficiency than bug reporter ‘d28779’. The explanation is that bug reporter ‘d28779’ puts in a lot of effort; however, the usefulness of the contribution is substantially less.

2. How does a bug reporter's performance vary with time? In Figure 29 we see that for the most part blue dots (fraction of bugs reported in the year 2010-11) are ahead of red dots (2009-10). However, in the panel ‘Fixed/Verified’ we see the two colored dots swapped and separated by a large distance (w.r.t. other developers) for bug reporter ‘d22575’. We infer that the performance of bug reporter ‘d22575’ declined considerably in the year 2010-11 (w.r.t. the year 2009-10).

Other interesting questions:

1. Which bug reporters add overhead (in terms of resources) to the organization through their contribution?

2. What is the relative performance of a bug reporter?

3. Which bug reporter’s issues receive less attention by team and hence get ‘Closed’ as ‘IceBox’?

4. Which bug reporters report large fractions of ‘Invalid’ or ‘Duplicate’ issues?

treemap plot The Treemap plot maps hierarchical information in a space-efficient and space-filling manner. It partitions the display space into rectangles. Each rectangle (sub-partition) is a child node with two distinguishing features: size and color. The size of a child node is proportional to the weight of the sub-partition (w.r.t. the whole) and the color represents a node-specific property. In a nutshell, the Treemap encodes information as partitions (sub-partitions), size and color [208][209]. The Treemap plot partitions the rectangle into components (areas), namely BrowserUI, Misc, Internal, Webkit, and UI. The size of each component is proportional to the total number of bugs reported. Each component is sub-partitioned on the bug collaborators, where the size of a sub-partition is proportional to the Potential Contribution Index of the developer for the component. The color of each sub-partition is an indicator of the number of times the bug collaborator actually contributed (on the color scale shown in Figure 30).

12 http://en.wikipedia.org/wiki/Efficiency

Figure 29: Trellis Plot (panels by Closed status: Fixed/Verified, WontFix, IceBox, Duplicate, Invalid; horizontal axis: percentage of bugs reported; legend: 2009, 2010)

Figure 30: Treemap Plot

In Figure 30, the Potential Contribution Index of a bug collaborator is shown as the size of the sub-partition of a component. Also, the Contribution Index in Figure 30 is denoted by the color of the sub-partition for each bug collaborator in the Treemap. The color of a sub-partition indicates the initiative and contribution of the bug collaborator. A detailed description is in the caption of Figure 30. We present the contribution of bug collaborators in 5 prominent components (areas in which the maximum number of bugs were reported in the year 2010-11, except the area ‘Undefined’). To ease comprehension in the limited space of the study, we include for analysis bug collaborators who contributed on more than 50 bug reports in an area. In Figure 30, bug collaborators are arranged bottom-to-top and left-to-right in the non-increasing order of Potential Contribution Index. The color-to-size ratio indicates a bug collaborator's performance. A high color-to-size ratio suggests performance more than expected (an indicator of initiative and contribution) and vice-versa. Following are the inferences:

1. Which bug collaborators are high performers? In Figure 30, bug collaborator ‘d17756’ is distinguished from its neighborhood by contrasting color. Bug collaborator ‘d17756’ works significantly in three out of the five reported areas in Treemap and is an asset to the organization.

2. Which bug collaborator is a critical resource to the organization? In area ‘Feature’ of Treemap plot we see that only bug collaborator ‘d17465’ worked on more than 50 bug reports. Thus, bug collaborator ‘d17465’ is a critical resource to the organization.

3. Which bug collaborators contribute more than expected? In area ‘UI’ of the Treemap plot we see that the size of the sub-partition for bug collaborator ‘d29449’ is comparable to the size of the sub-partitions for neighboring bug collaborators like ‘d06974’, ‘d14528’, etc. Figure 30 shows a similar Potential Contribution Index for the above-mentioned bug collaborators. However, the high color-to-size ratio for bug collaborator ‘d29449’ (compared to the neighbors) signifies more contribution than expected.

bertin's hotel plot Bertin's Hotel plot is a homogeneous structure that uses rearrangement of rows and columns to reveal information of interest. Input to the plot can be simple face values or complex derived (transformed) versions of the input. Global scaling is applied to the data to support its homogeneous nature [210]. Here, this visualization is applied on the Specialization and Breadth Index metric for the role of bug owner. Specialization and breadth of knowledge are important for the role of bug owner and are related inversely. Issues with dependencies across components can be very well fixed by bug owners with some understanding of all linked components. Similarly, issues which are deeply embedded in a component require bug owners with specialized knowledge in the area. Bertin's Hotel plot is a matrix of areas and bug owners (detailed description in the caption of Figure 31). Figure 31 shows the gross performance trends of bug owners for the year 2010-2011. The contribution of bug owners in these areas is range normalized (refer to Equation 17) to ensure homogeneity for comparison across areas.

RangeScore = \frac{x - \min}{\max - \min} \qquad (17)

In Figure 31, the horizontal axis shows the ten most popular components (areas) arranged in the non-increasing order of the number of bugs reported in the area. The vertical axis is a list of 20 bug owners arranged in the non-increasing order of Specialization and Breadth Index (SBI) score for the year 2010-11. The contribution of a bug owner in an area is measured relatively (w.r.t. other bug owners working in the same area). The score scale is color coded: 100% implies distinguished contribution in an area and vice-versa. A shaded upper triangular matrix represents bug owners with breadth of knowledge and vice-versa. In Figure 31 we see that the upper half triangle is a shade of gray. It implies that as we go from top to bottom, breadth of knowledge decreases and specialization increases. Following are the inferences relevant to appraisers:

1. Which bug owners have specialized knowledge and vice-versa? In Figure 31 we see that bug owner ‘d09728’ worked across multiple areas (except area ‘BrowserBackend’). Also, the collaboration pattern is almost uniform across areas (as shown by the slightly varying shades of gray). Thus, bug owner ‘d09728’ has breadth of knowledge. Similarly, we see in Figure 31 that bug owner ‘d02209’ owns bug reports from two areas, namely ‘Internal’ and ‘BrowserBackend’. A striking contrast in the shades of gray in Figure 31 indicates that bug owner ‘d02209’ has specialized knowledge in the area ‘BrowserBackend’.

2. Which bug owners are critical to my project? For area ‘ChromeFrame’ in the Bertin's Hotel plot, we see that bug owner ‘d01682’ owns the maximum number of bug reports (dark shade of gray in column ‘ChromeFrame’ of Figure 31). Apart from bug owner ‘d01682’, no other bug owner contributed to the area ‘ChromeFrame’. Thus bug owner ‘d01682’, with expert knowledge in ‘ChromeFrame’, is a critical resource to the organization.

3. Which bug owner meets my project specific requirements? Bug reports with high dependency across areas need bug owners with breadth of knowledge and vice-versa. For instance, projects in which bug reports have dependencies in ‘UI’ and ‘Feature’ will be best solved by bug owner ‘d28504’. We see in Figure 31 that bug owner ‘d28504’ has good knowledge of areas ‘UI’ and ‘Feature’ (as shown by the dark shades of gray for the two columns) and hence is a best fit for the project.

dart chart The Dart chart is a radial equivalent of the bullet chart. The name dart chart comes from its structural and conceptual resemblance to a dart board. The bullet chart establishes a qualitative equivalent of performance from a quantitative measure of contribution score. The choice of classification scale is a function of the performance w.r.t. other developers or a specific organization's expectations. Another aspect attached to this chart is performance relative to self [211]. Figure 32 is a visualization of the Priority Weighted Fixed Issues (PWFI) score for the role of the bug owner. Understanding the relationships between scores and their interpretation is a crucial task for organizations. However, such demarcations are not intuitive and are subject to an individual's perception. The aim of this plot is to justify performance ranking (classification) and ensure fairness and uniformity. Based on the PWFI score, performance is broadly classified as ‘Excellent’, ‘Very Good’, ‘Good’, ‘Satisfactory’, and ‘Not-Satisfactory’ using min-max normalization of the contribution of all bug owners for four halves in two consecutive years, 2009-10 and 2010-11. Other normalization techniques may be applied to meet the specific needs of an organization.

[Figure 31: Bertin’s Hotel Plot. Rows list 20 bug owners (d09728, d01682, d26629, d28504, d16083, d00184, d09531, d09828, d30297, d13741, d24512, d20903, d29449, d21372, d24254, d02711, d14778, d10787, d19720, d02209) in non-increasing order of SBI score; columns list the areas UI, Internal, Webkit, Feature, Build, Compat, Misc, ChromeFrame, BrowserBackend, and CEEE; cell shading encodes the range-normalized score, color coded from 0 to 100%.]

In Figure 32, the circle is divided into 50 sectors. Each sector represents a bug owner; sectors are arranged counter-clockwise in non-decreasing order of the Priority Weighted Fixed Issues (PWFI) score in the second half of 2010-11. Each concentric circle (outside to inside) is a categorical classification of performance as ‘Excellent’, ‘Very Good’, ‘Good’, ‘Satisfactory’, or ‘Not-Satisfactory’. This classification is based on the quantitative PWFI score [‘Excellent’: Score >0.5, ‘Very Good’: Score >0.25 and <0.5, ‘Good’: Score >0.125 and <0.25, ‘Satisfactory’: Score >0.0625 and <0.125, ‘Not-Satisfactory’: Score <0.0625]. The color of each concentric circle indicates the total number of bug owners (as shown in the legend ’count’) that fall in a given performance category. ‘Legend’ shows performance in four halves over two consecutive years (2009-11).
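The score-to-category mapping described above amounts to a simple classification step. The sketch below is a minimal illustration in R, assuming a numeric vector pwfi of min-max normalized PWFI scores (the example values are hypothetical) and using the thresholds listed above:

# Minimal sketch: map normalized PWFI scores to the five performance categories.
pwfi <- c(0.02, 0.08, 0.19, 0.31, 0.72)                        # hypothetical scores
labels <- c("Not-Satisfactory", "Satisfactory", "Good", "Very Good", "Excellent")
breaks <- c(-Inf, 0.0625, 0.125, 0.25, 0.5, Inf)
performance <- cut(pwfi, breaks = breaks, labels = labels)
table(performance)   # counts per category, as encoded by the concentric circles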

[Figure 32: Dart Chart. Fifty sectors, one per bug owner (e.g. d00184, d14019, d19551, d29449); concentric circles mark the performance categories Excellent, Very Good, Good, Satisfactory, and Not-Satisfactory, shaded by the count of bug owners in each category (0 to 60); the legend shows performance dots for the four halves 2009-10:01, 2009-10:02, 2010-11:01, and 2010-11:02.]

Following are a few interpretations to help performance appraisers:

1. Which bug owners are high performers relative to the team? In Figure 32 we see color coded concentric circles. The lightest shade of the innermost circle shows that a majority of bug owners deliver ‘Not-Satisfactory’ performance. Only two bug owners, ‘d00184’ and ‘d14019’, achieve ‘Excellent’ performance for their outstanding contribution in the second half of 2010-11 (shown as green colored dots on the outermost circle).

2. How consistent is the performance of a bug owner? In Figure 32 we observe three overlapping dots for bug owner ‘d00184’ in the outermost circle. These three dots are for three halves: green for the second half of 2010-11, red for the first half of 2010-11 and blue for the second half of 2009-10. The yellow dot for bug owner ‘d00184’ lies in the innermost circle (i.e. ‘Not-Satisfactory’ performance for the first half of 2009-10). Thus bug owner ‘d00184’ starts with ‘Not-Satisfactory’ performance and shows marked improvement by consistently performing ‘Excellent’ for the next three halves. The positions of the dots for bug owner ‘d19551’ in Figure 32 show periodic fluctuations in performance. Bug owner ‘d19551’ performs well in the first halves of the two consecutive years, while in the second halves performance falls steeply. Similarly, bug owners with marked improvement, slight improvement, consistently average performance, steep decline in performance, etc. can be observed.

hybrid plot Hybrid plot is a combination of simple plots to display rich and otherwise complex information. Adjacent plots share an axis to link different pieces of information and present them as a whole [212]. In this plot, we study the Deviation from Median Time To Repair metric for the role of bug owner.

[Figure 33: Hybrid Plot. Four panels (Priority_0, Priority_1, Priority_2, Priority_3), each listing bug owners (e.g. d05167, d13435, d16056) with a back-to-back bar chart of bug-report counts and bar charts of the time exceeded (‘Negative value’) or preceded (‘Positive’) relative to the expected time to repair, in hours.]

Figure 33 consolidates twelve plots of two types: back-to-back bar charts and simple bar charts. The graphs are arranged such that adjacent plots share an axis and present related information. The vertical axis of the plot has two levels of hierarchy. At level one, information is presented for all priorities, i.e. priority 0, 1, 2 and 3. For each priority, we analyze the contribution of bug owners arranged in non-increasing order of the total number of bugs owned. In Figure 33 we study ‘Security’ bug reports ‘Closed’ by bug owners during the first half of 2010-11. The time to fix is the minimum time between the bug’s ‘Start’ or ‘Assigned’ status and its ‘Fixed’ or ‘Verified’ status. Cases in which multiple bug owner reassignments occur are not considered. The choice of Expected Time To Repair for a bug of a given priority can be organization specific. Based on the priority of bug reports, the vertical section is divided into four segments P0, P1, P2 and P3 (P0 is the highest priority; P3 the lowest). Each segment in the vertical section is a list of bug owners arranged in non-increasing order of the total number of security bug reports owned. The horizontal axis measures three different scores. ‘Value’ is the count of bug reports owned by a bug owner for a given priority. The red and blue colored ‘Value’ bars count the bug reports for which the bug owner exceeded or preceded the specified time, respectively. Similarly, ‘Negative’ and ‘Positive’ show the time exceeded or preceded (in hours) in fixing the bug reports (w.r.t. the expected time), respectively. Following are the possible inferences for decision makers:

1. Which bug owners have expertise in solving a ‘Security’ bug report of a given priority? In Figure 33 we see that for each priority, bug owners are arranged (top to bottom) in non-increasing order of the value of the back-to-back bar chart. The back-to-back bar chart shows the total number of bug reports owned with a given priority. The red bar in the back-to-back bar chart measures the number of bug reports with negative contribution and vice-versa. In the Priority P0 panel of the Hybrid plot, we see that bug owner ‘d05167’ owns the maximum number of bug reports and thus has expertise in solving priority P0 bug reports. Similarly, we can derive inferences for other priorities.

2. Which bug owner delivers good management skills? In the Priority P0 panel, we observe that the ‘Positive’ bar chart has its maximum value for bug owner ‘d31174’. However, this maximum value comes from a single bug report (shown as the blue bar of the back-to-back bar chart), which does not establish the desired confidence in the result. Combining these two factors, we infer that bug owner ’d05167’ has good management skills (worked on four bug reports and ‘Fixed’ them 289 hours before the expected time). Similarly, we can infer poor management skills.

nightingale rose plot Nightingale Rose plot (also known as coxcomb) is a radial plot. It analyzes the gap between the actual value and the corresponding reference (expected) value. The Nightingale Rose plot displays contrast by analyzing data for two consecutive years. However, to ensure comparisons and to analyze sophisticated interactions between causative factors, the period, i.e. the time interval between two consecutive observation points, must be uniform [213][214]. For this visualization, we use the Priority Weighted Fixed Issues metric to gain detailed insights into performance trends. In Figure 34, each sector of the circle is a month of the year. Two radial plots are drawn presenting data for two consecutive years (Regime: Year 1 for 2009 and Regime: Year 2 for 2010). Bug owners are represented by different colors as shown in the ‘Legend’. The area of a wedge shows the performance of a bug owner (for a given month of the year), where the radius of the wedge is the square root of the Priority Weighted Fixed Issues (PWFI) score. Bug owners are arranged (inside to outside) in non-decreasing order of their PWFI score. Figure 34 is an activity track of 10 bug owners for two consecutive years (2009-10 and 2010-11). The selection of these bug owners is based on their PWFI score (detailed description in the caption of Figure 34). The inferential power of the Nightingale Rose plot can be harnessed by capturing environmental factors. In the absence of information related to the environmental factors, the inferences stated below are exemplary and must be put into the right perspective for analysis by the organizations.

1. Which bug owners are susceptible to environmental factors? or Which bug owners can act as a savior for the team? In Figure 34 we see a steep decline in the quality contribution for the month of December (w.r.t. the neighbouring months November and January) for the years 2009-10 and 2010-11. One possible justification is reduced activity during the Christmas vacations. However, we observe that bug owner ‘d05167’ contributes significantly in times of general inactivity.

[Figure 34: Nightingale Rose Plot. Two radial panels (Regime: Year 1 for 2009, Regime: Year 2 for 2010) with one sector per month; wedge radius encodes the square root of the PWFI score (scale 0.0 to 0.3); colors identify the 10 bug owners d05167, d00184, d02072, d14956, d10088, d13435, d14019, d13741, d26882, and d07873.]

2. How does performance vary within and across years? In Figure 34 we see that the contribution of bug owner ‘d07873’ was overshadowed (the wedge disappears) by the significantly larger contributions of other bug owners. However, for the year 2010 we observe that the wedge corresponding to bug owner ‘d07873’ appears for all months of the year, showing that quality contribution was delivered consistently.

3. Which bug owner’s contribution dominates in a given time period? Assume the organization has a product release scheduled in April 2009. In this time of peak workload we observe that the contributions of bug owners ‘d13741’ and ‘d00184’ overshadow the contributions of other bug owners. These bug owners deliver quality contribution to ensure a timely release and are an asset to the organization.

Table 33 summarizes and compares the six visualization techniques.

Table 33: Analysis and comparison of visualizations

Trellis Plot (Metrics: SRI; Data Dim.: 5; Approx. Team Size: 15-20 members). Application: study trends and outlier behavior by systematic arrangement of variables; temporal analysis of performance.
Treemap Plot (Metrics: PCI, PCI-1, CI; Data Dim.: 3; Approx. Team Size: 50-75 members). Application: map data hierarchically; panoramic view of an individual's contribution in a team; compare and contrast performance relatively.
Bertin's Hotel Plot (Metrics: SBI; Data Dim.: 3; Approx. Team Size: 25-50 members). Application: pattern analysis from rearrangement of rows and columns.
Dart Chart (Metrics: PWFI; Data Dim.: 4; Approx. Team Size: 50-100 members). Application: establish a qualitative equivalent of a quantitative score based on criteria.
Hybrid Plot (Metrics: DMTTR; Data Dim.: 5; Approx. Team Size: 15-20 members). Application: study positive and negative contribution; analyze temporal variations in performance; combine statistical data for each class of observations.
Nightingale Rose Plot (Metrics: PWFI; Data Dim.: 3; Approx. Team Size: 15-20 members). Application: impact of environmental factors on performance; compare actual vs expected; measure performance relatively.

Table 34: Survey Results [PoR = Percentage of Relevance; CPoR = Cumulative Percentage of Relevance; response counts per rating: Not at all Effective (1), Slightly Effective (2), Moderately Effective (3), Very Effective (4), Extremely Effective (5)]

Metrics            RQ    (1)  (2)  (3)  (4)  (5)  Total  Mode  PoR     CPoR
Dart Chart         RQ1    0    0    3    5    0     8     4    100     100
                   RQ2    0    0    3    2    2     7    3,4   100
Bertin's Hotel     RQ1    0    2    4    4    0    10    3,4    80     83.33
                   RQ2    0    1    3    6    0    10     4     90
                   RQ3    0    2    5    3    0    10     3     80
Treemap Plot       RQ1    0    2    3    5    0    10     4     80     73.33
                   RQ2    1    2    4    2    1    10     3     70
                   RQ3    1    2    3    4    0    10     4     70
Trellis Plot       RQ1    2    1    2    2    1     8    3,4    62.5   75
                   RQ2    1    0    4    3    0     8     3     87.5
                   RQ3    1    1    4    1    1     8     3     75
                   RQ4    1    1    3    3    0     8     3     75
Hybrid Plot        RQ1    0    1    1    4    1     7     4     85.71  92.85
                   RQ2    0    0    2    3    2     7     4    100
Nightingale Rose   RQ1    0    2    4    1    0     7     3     71.43  66.96
                   RQ2    1    2    1    4    0     8     4     62.5

5.4.7 Evaluation

Post-implementation, we conducted a survey of practitioners. We received 10 valid responses from practitioners in a large global IT organization. Nine survey respondents had more than 5 years of experience and one had more than 1 year of experience. Eight survey respondents were Project Managers, one was head of R&D initiatives and two were Bug Owners/Bug Fixers; the roles were overlapping in nature. Eight out of 10 survey respondents had been appraisers in the past or present. Table 34 shows the usefulness of the visualization techniques when compared with the existing practices in the organization to assess contribution and performance. The research questions (RQ) asked for each visualization technique are arranged in the order of reference in the study. Percentage of Relevance (PoR) is the percentage of responses that consider the RQ effectively answered by the visualization technique, where relevant responses are those rated ‘Moderately Effective’ or higher. Similarly, Cumulative PoR (CPoR) is the mean PoR over all RQs for a visualization technique. The survey results show that the usefulness of the visualization techniques varies considerably. However, the general consensus, as suggested by the mode of the survey results, validates the usefulness of our work.
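For clarity, the PoR and CPoR computations can be sketched in R as follows; the response counts below are the Bertin's Hotel Plot rows of Table 34, with ratings ordered from 'Not at all Effective' (1) to 'Extremely Effective' (5):

# Minimal sketch of PoR and CPoR for one visualization technique.
responses <- list(RQ1 = c(0, 2, 4, 4, 0),
                  RQ2 = c(0, 1, 3, 6, 0),
                  RQ3 = c(0, 2, 5, 3, 0))
por <- sapply(responses, function(r) 100 * sum(r[3:5]) / sum(r))  # rated 3 or higher
cpor <- mean(por)                                                 # cumulative PoR
round(por, 2); round(cpor, 2)   # 80, 90, 80 and 83.33, matching Table 34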

5.5 measuring changes in contributor community participation

We investigate the stability of the community in software maintenance projects by mining the Issue Tracking System (ITS). We probe temporal community contribution patterns to analyze trends and estimate future participation to support planning and decision making. We study the literature in community management to understand stability indicators. We identify three key stability indicators (KSIs), namely attrition, regeneration, and retention. We propose metrics to measure the KSIs and compute their time series data. We explore the inferential ability of the time series data through research questions that may help project managers and decision makers in informed decision making.

5.5.1 Attrition Rate

During the maintenance phase of the software development lifecycle, projects lose critical human resources, a phenomenon termed attrition. These contributors may be regenerated (substituted) or lost completely [52]. Attrition Rate (AR) for time interval Tt is the percentage of contributors who left the project in time interval Tt relative to the total number of contributors who participated in the preceding time interval Tt−1. If contributors do not participate for one time interval we assume that they have

left the project. In set notations, the Attrition Rate is the cardinality of the set difference13 of contributors (Con) in two consecutive time intervals Tt−1 and Tt divided by the cardinality of all participant contributors in the time interval Tt−1, multiplied by 100 (refer Equation 18). The unit of Attrition Rate is quarter^-1.

AR_{T_t} = \frac{|Con_{T_{t-1}} \setminus Con_{T_t}|}{|Con_{T_{t-1}}|} \times 100 \quad \text{where } t > 1 \qquad (18)

5.5.2 Regeneration Rate

FLOSS projects evolve temporally with contributors entering or leaving the project at will. Research shows that new entrants drive a large fraction of open source projects as original contributors leave the project [195]. Regeneration replenishes the resources lost due to contributor attrition and helps meet additional resource requirements in the project. However, a very high regeneration rate may indicate problems with the team composition. For instance, a high regeneration rate indicates an imbalanced team composition (junior to senior ratio) with a smaller fraction of experienced contributors. The imbalanced team composition may adversely influence the stability of the project. Regeneration Rate (RgR) for time interval Tt measures the rate (in percentage) at which new contributors start participating in the ITS in the two consecutive time intervals under analysis. In set notations, we define Regeneration Rate for time interval Tt as the fraction of the cardinality of the set difference of contributors (Con) in Tt from contributors (Con) in time interval Tt−1 to the cardinality of contributors (Con) in time interval Tt, multiplied by 100 (refer Equation 19). The unit of measurement is quarter^-1.

RgR_{T_t} = \frac{|Con_{T_t} \setminus Con_{T_{t-1}}|}{|Con_{T_t}|} \times 100 \quad \text{where } t > 1 \qquad (19)

5.5.3 Retention Rate

Knowledge acquired with experience and the formal and informal understanding of the project cannot be transferred [63]. When a contributor leaves the project, the knowledge acquired goes with the contributor, causing knowledge loss. A key parameter to understand the stability of a software maintenance project is its ability to retain knowledge, or its contributors. Retention Rate measures the contributors retained by the project. Retention Rate (RtR) for time interval Tt measures the percentage of contributors retained out of all participants in time interval Tt−1. In set notations, Retention Rate for time interval Tt is the ratio of the cardinality of the intersection of contributors (Con) in time interval Tt and the preceding time interval Tt−1 to the cardinality of contributors (Con) in time interval Tt−1, multiplied by 100 (refer Equation 20). The unit of Retention Rate is quarter^-1.

RtR_{T_t} = \frac{|Con_{T_t} \cap Con_{T_{t-1}}|}{|Con_{T_{t-1}}|} \times 100 \quad \text{where } t > 1 \qquad (20)

The three metrics proposed in the study cannot be compared directly. To enable cross comparison, we normalize the metrics on the union of the contributor counts in the two consecutive time intervals Tt and Tt−1. The normalization facilitates comparison at the cost of affecting the scores. In this study we analyze the modified metrics to ease comparison. The three metrics can be restated as:

AR_{T_t} = \frac{|Con_{T_{t-1}} \setminus Con_{T_t}|}{|Con_{T_{t-1}} \cup Con_{T_t}|} \times 100 \quad \text{where } t > 1 \qquad (21)

RgR_{T_t} = \frac{|Con_{T_t} \setminus Con_{T_{t-1}}|}{|Con_{T_{t-1}} \cup Con_{T_t}|} \times 100 \quad \text{where } t > 1 \qquad (22)

13 Set difference is the set of all contributors who worked in one time interval (Tt−1) but did not continue participation in the next time interval (Tt).

[Figure 35: Attrition Rate of contributors across four years. Attrition Rate (%) per quarterly time interval for Contributor, Reporter, Owner, CC'ed, and Commenter.]

[Figure 36: Issue count distribution calculated monthly. Monthly issue counts from 2009-02 to 2012-12 with a fitted line (loess function).]

RtR_{T_t} = \frac{|Con_{T_t} \cap Con_{T_{t-1}}|}{|Con_{T_{t-1}} \cup Con_{T_t}|} \times 100 \quad \text{where } t > 1 \qquad (23)

Together the three metrics sum to 100 as shown below:

AR + RgR + RtR = 100 (24)
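As a minimal sketch in R (assuming two character vectors of contributor identifiers, one per consecutive quarter), the normalized metrics of Equations 21 to 23 might be computed as follows:

# Minimal sketch of the normalized KSI metrics (Equations 21-23).
# `prev` and `curr` are assumed vectors of contributor ids in intervals Tt-1 and Tt.
ksi_rates <- function(prev, curr) {
  pool <- length(union(prev, curr))                  # |Con(Tt-1) U Con(Tt)|
  ar  <- 100 * length(setdiff(prev, curr)) / pool    # Attrition Rate
  rgr <- 100 * length(setdiff(curr, prev)) / pool    # Regeneration Rate
  rtr <- 100 * length(intersect(prev, curr)) / pool  # Retention Rate
  c(AR = ar, RgR = rgr, RtR = rtr)                   # the three rates sum to 100 (Equation 24)
}

# Hypothetical example: four contributors in Tt-1, three in Tt, two of whom continue.
ksi_rates(prev = c("a", "b", "c", "d"), curr = c("c", "d", "e"))   # AR 40, RgR 20, RtR 40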

Figure 35, Figure 37, and Figure 38 show Attrition Rate, Regeneration Rate, and Retention Rate respectively for 15 time intervals starting from ‘2009-04:06’. The horizontal axis of each plot shows consecutive time intervals measured quarterly and the vertical axis shows the metric score in percentage. Colored lines (refer to the legend) show metric scores for all contributors and for the four roles of contributors (reporter, owner, cc’ed, and commenter).

5.5.4 Applications

RQ1: Do we observe high Attrition Rate in Google Chromium Project? In Figure 35, the contributor Attrition Rate in the Google Chromium Project (shown in black) ranges from 27% to 47% with a mean of 34.7% and a standard deviation of 4.07%. This implies that every three months one-third of the contributors discontinue participation in the ITS. The high Attrition Rate in GC-ITS raises concerns regarding the stability of the project. However, it fails to give a complete picture of the status of the project. The high Attrition Rate makes it relevant to conduct further analysis to understand its cause, and hence RQ2.

RQ2: Do we observe comparable Attrition Rates for all roles? If no, how does it vary with role relevance? In Figure 35, we observe a marked difference in the Attrition Rates for the four roles, where role relevance follows the order: owner > cc’ed > commenter > reporter. We observe that Attrition Rate increases with decreasing relevance of the role. We see the minimum Attrition Rate for owner (shown in blue) and the maximum for reporter (shown in red). This observation follows intuition and the results of previous research. Attrition Rate for reporter varies from 28.9% to 44.2% with mean 37.7% and standard deviation 4.3%. The issue count reported in GC-ITS increases over time (refer Figure 36). Thus a high Attrition Rate for the role of reporter presents an open question: is the project able to compensate for the resources lost and generate resources to meet the increase in requirements? Subsequent RQs address these concerns.

RQ3: Is high Attrition Rate accompanied by high Regeneration Rate?

[Figure 37: Regeneration Rate of contributors across four years. Regeneration Rate (%) per quarterly time interval for Contributor, Reporter, Owner, CC'ed, and Commenter.]

[Figure 38: Retention Rate of contributors across four years. Retention Rate (%) per quarterly time interval for Contributor, Reporter, Owner, CC'ed, and Commenter.]

In Figure 37, we see that in GC-ITS the contributor Regeneration Rate fluctuates from 30.0% to 45.4% with mean 38.7% and 4.2% standard deviation. This indicates that the resources lost are regenerated. However, as in RQ2, do we observe a marked difference in Regeneration Rates for the four roles, and how may it impact stability? Our next RQ attends to anomalous behavior, if any, in the Attrition Rate and Regeneration Rate patterns for the four roles.

RQ4: Do we observe anomaly, if any, in the temporal behavior of Attrition Rate and Regeneration Rate for the four roles of contributors? In Figure 39, we present contribution patterns for the four roles of contributors. Dotted lines represent the actual Attrition Rate and Regeneration Rate while dark lines indicate smoothed data (a moving average over the past 3 observations) to generate an approximate function that captures important patterns. The horizontal axis shows the time intervals while the vertical axis shows the scale in percentage to measure Attrition Rate and Regeneration Rate. Colors in the legend specify the values measured by the dotted or dark lines. In Figure 39 we observe that for the owner the Regeneration Rate is always higher than the Attrition Rate, while the same does not hold true for the other three roles. For reporter, commenter, and cc’ed we observe crossovers and fluctuations. Crossovers and fluctuations may point to concerns. Thus, in the next RQ we analyze which temporal trends signal changes in the stability of GC-ITS.
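The smoothing referenced above is a plain trailing moving average; a minimal sketch in R (assuming a numeric vector x of quarterly rates) is:

# Minimal sketch: trailing moving average over the past 3 observations,
# as used for the dark lines in Figure 39 (the input vector is assumed).
ma3 <- function(x) as.numeric(stats::filter(x, rep(1/3, 3), sides = 1))
ma3(c(40, 35, 38, 42, 37))   # the first two values are NA until 3 observations exist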

RQ5: Do we observe changes in lag between Attrition Rate and Regeneration Rate? In Figure 39, we see that for owners the Attrition Rate remains low and consistent for the most part while a constant dip in Regeneration Rate is observed. Similarly, for the other roles we observe a decrease in the lag between Attrition Rate and Regeneration Rate accompanied by crossovers and fluctuations. Thus, in the last few time intervals the contributors lost to high Attrition Rate are not replenished by a correspondingly high Regeneration Rate. Also, though we do not observe this trend for owner, their participation pattern is heading along similar lines.

RQ6: Do we observe increasing or decreasing trend in Retention Rate over time? Figure 38 shows that the contributor Retention Rate ranges from 23% to 32% with wide variations across the four roles. Retention Rate is higher for owner and cc’ed than for reporter and commenter. One observation from this graph is that commenters stay in projects longer than the contributors who report bugs.

RQ7: Do we observe changes in the relationship between Attrition Rate, Regeneration Rate, and Retention Rate? What implications can we draw from such observations? To answer this RQ, we approximate the trend by fitting a linear model to the temporal data of Attrition Rate, Regeneration Rate, and Retention Rate for the 15 time intervals. The simple linear regression model approximates the trend with correlation coefficients r of 0.73, -0.71, and -0.70 respectively. The fitted models are statistically significant with p-values 0.005, 0.008, and 0.004 respectively. The fitted model provides an approximation of the trend. Moreover, the residuals of the model have mean close to zero and finite variance, indicating that the residuals do not follow any pattern.
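A minimal sketch of this trend approximation, assuming a numeric vector rate holding the 15 quarterly values of one KSI metric, is:

# Minimal sketch: linear trend fit, correlation, and residual diagnostics for one metric.
t <- seq_along(rate)                           # time index 1..15 (vector `rate` is assumed)
cor(t, rate)                                   # correlation coefficient r
fit <- lm(rate ~ t)                            # simple linear regression on time
summary(fit)$coefficients                      # slope estimate and its p-value
c(mean(residuals(fit)), var(residuals(fit)))   # residual mean (near zero) and variance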

[Figure 39: Comparison of Attrition Rate and Regeneration Rate of contributors across four years. Four panels (Reporter, Owner, CC'ed, Commenter) show Attrition Rate and Regeneration Rate (in percentage) per quarterly time interval, as dotted lines for the actual values and dark lines for 3-point moving averages.]

[Figure 40: Changes in contributor participation pattern. Attrition Rate, Regeneration Rate, and Retention Rate (in percentage) per quarterly time interval.]

In Figure 40, we observe that the Regeneration Rate is decreasing, the Attrition Rate is increasing, and the Retention Rate is decreasing. Thus, over time, contributors leaving the project are not regenerated and the retention of contributors is decreasing. In a nutshell, the contributors are slowly moving away from the project. This observation may influence the stability of the project depending on whether the project is in its initial phase or has attained maturity. Decreasing contributor participation during the initial phase may indicate a decrease in interest and hence a decrease in the popularity of the project. If we observe the same trend once the project is mature, it may instead indicate that the project receives few issues that need to be incorporated. This observation leads us to the last RQ, which examines the influence of issue counts on participation.

RQ8: Do we observe an increase in contribution overhead on each contributor with time? In Table 26 we observe an increase in the contributor count with the increase in issue count. However, the increase is not proportional. Thus, over time, there is an increase in the workload on contributors. The increased workload may be welcome or may further decrease contributor participation.

estimating future participation Planning is the natural response to forecasting. Estimating future participation helps project managers plan ahead and facilitates informed decision making. Research in FLOSS projects shows that contributor participation is a volunteering activity where contributors join or leave the project at will. Individual participation or contribution at the microscopic level cannot be estimated. However, participation at the macroscopic level follows a trend [215]. In this section, we use historical data to make short-term predictions of contributor participation patterns. The aim is to explore the time series data of the KSI metrics for trends and investigate the confidence in predicting future participation. We investigate the correlation between the community participation pattern and the stability of the community. Since we model historical participation data, the study does not capture changes in participation pattern due to changes in external factors, such as a similar project gaining popularity, though such changes will get reflected in subsequent time intervals. We propose statistical forecasting models that, along with the judgmental forecasting of decision makers (to capture environmental factors), will present a complete understanding of the factors that influence the stability of the project. We conduct experiments on the GC-ITS dataset for 15 time intervals. We enumerate three statistical prediction models, justify the choices, implement them, compute accuracy, compare the models for goodness of fit and forecast accuracy, and present visual analysis. We present results for the three metrics and the visualization of one metric. Given the small data size we choose 12 (out of 15) data points for training and the remaining 3 for testing. The small dataset influences the selection of the modeling technique as discussed below. However, it has limited influence on the accuracy of the prediction as old time series data is less useful for studying changes in the project. All experiments presented in this section are implemented using toolkits available in the R language14.
The baseline model (Model-I) for prediction assumes (and we verified experimentally) that the contributor participation pattern follows a normal distribution. We compute the mean and standard deviation of the time series data with 99% and 95% confidence intervals to estimate participation patterns in the following three time intervals (refer Table 35).

14 http://www.r-project.org

[Figure 41: Model I: Simple Statistical Model to analyze trends in time series data of Attrition Rate. The plot shows the Attrition Rate per quarterly time interval with its mean and the 95% and 99% confidence intervals.]
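A minimal sketch of Model-I, assuming a numeric vector ar of the quarterly Attrition Rate values, is shown below; it approximately reproduces the Attrition Rate intervals reported in Table 35:

# Minimal sketch of Model-I: normal-distribution baseline with 95% and 99% intervals.
m <- mean(ar)                              # vector `ar` is assumed
s <- sd(ar)
ci95 <- m + c(-1, 1) * qnorm(0.975) * s    # 95% confidence interval
ci99 <- m + c(-1, 1) * qnorm(0.995) * s    # 99% confidence interval
c(mean = m, sd = s); ci95; ci99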

Table 35: Model-I: Simple Statistical Model [SD = Standard Deviation; Conf = Confidence Interval; Obs = Observations for next three time intervals]

Metrics  Mean   SD    99% Conf      95% Conf      Obs
AR       34.02  3.68  24.56-43.47   26.80-41.23   36.54, 32.65, 43.04
RgR      39.38  3.73  29.78-48.97   32.06-46.70   36.54, 32.65, 43.04
RtR      28.27  2.40  22.10-34.44   23.57-32.97   26.64, 28.29, 23.84

In Table 35 we see that the observations for the next three time intervals fall within the 99% confidence interval of the simple statistical model. Figure 41 shows the simple statistical model and the confidence intervals for the Attrition Rate, Regeneration Rate, and Retention Rate metrics respectively. The baseline model sets the benchmark for the two more involved statistical models.
To select the models, we conduct exploratory analysis to understand the composition of the time-series data. In Figure 42, we decompose the time-series data into trend, seasonal components, and remainder using the stl function in R. The stl function uses loess smoothing to identify the seasonal, trend, and irregular components. We use the default settings with seasonality set to periodic. A preliminary analysis of the time series data of the Attrition Rate metric (also performed for the Regeneration Rate and Retention Rate metrics) suggests that the seasonality (if any) does not grow with the trend. Thus we use an additive model to represent the time series data. Figure 42 shows the additive components of the time series data along with the actual observed data. In Figure 42, we see that the Attrition Rate observes an increasing trend together with an increase in the irregular component or remainder. The increase in the irregular component with time indicates that the time series does not observe seasonality, though we may observe cycles with the trend. In Figure 42 we also observe that the time series data of the Attrition Rate metric increases linearly in time. Thus, the model closest to the observed trend is a linear regression model. We model the time series data of the KSI metrics using the tslm function. We estimate the goodness of fit of the model and the prediction accuracy and confidence intervals of the observations. The code snippet below from R shows the goodness of fit of the model.

Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 29.1656 1.6195 18.009 5.97e-09 x 0.7464 0.2201 3.392 0.00686

Residual standard error: 2.631 on 10 degrees of freedom Multiple R-squared: 0.535, Adjusted R-squared: 0.4885 F-statistic: 11.5 on 1 and 10 DF, p-value: 0.006864

From statistical significance testing, a p-value less than 0.01 indicates that the linear fit provides a good estimation of the trend. We then use the fitted model to measure its predictive ability with 80% and 95% confidence intervals. Figure 43 shows the observed trend, the fitted model, and the predicted observations along with the confidence intervals.

[Figure 42: Decomposition of time series data of Attrition Rate metric into trend, seasonal and remainder component using the stl function in R. Panels (top to bottom): data, seasonal, trend, and remainder.]
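A minimal sketch of this decomposition step, again assuming the quarterly Attrition Rate vector ar, might look like:

# Minimal sketch of the decomposition shown in Figure 42 (input vector `ar` is assumed).
ar_ts <- ts(ar, start = c(2009, 2), frequency = 4)   # quarterly series starting 2009-04:06
fit_stl <- stl(ar_ts, s.window = "periodic")         # loess-based trend/seasonal/remainder
plot(fit_stl)                                        # panels: data, seasonal, trend, remainder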

[Figure 43: Model II: Linear Regression Model to analyze trends in time series data of Attrition Rate. The plot shows the actual Attrition Rate, the fitted model, and the predictions with 80% and 95% confidence intervals.]

[Figure 44: Model III: Exponential Smoothing Forecasting to analyze trend in time series data of Attrition Rate. The plot shows the actual and fitted data per quarterly time interval, with predictions at 80% and 95% confidence intervals.]

We see in Figure 43 that the fitted model predicts the Attrition Rate in the next three time intervals with 95% confidence. We observe that the participation varies substantially from the fitted model. In the next model we try to achieve a better fit to the data (without overfitting) to address this concern.
For statistical time series modeling, exponential smoothing and ARIMA models are two popular techniques. The small dataset available for prediction makes estimating parameters for ARIMA modeling difficult without overfitting, ruling out ARIMA modeling for this study. Exponential smoothing, on the other hand, gives greater weight to recent observations for prediction. Therefore, for the third model we choose exponential smoothing to predict contributor participation patterns. To implement Model III, we use the HoltWinters function in R to compute additive exponential smoothing with trend and level and without seasonal components. We forecast participation in the three upcoming time intervals with 80% and 95% confidence. Figure 44 shows the observed and expected contribution patterns.
A manual inspection of Model-II and Model-III shows a slight improvement in the prediction accuracy for the first time interval (refer Figure 43 and Figure 44) of the Attrition Rate metric. However, not much difference is observed for the second and third time intervals. We observe similar trends for the Regeneration Rate and Retention Rate metrics. In the code snippet that follows, we compare the prediction accuracies of Model-II and Model-III for different forecast horizons (1, 2 and 3). We conduct the Diebold-Mariano test to compare the predictive accuracies of the models (implemented in the R language). We hypothesize that the predictive accuracy of the linear regression model is less than the predictive accuracy of the exponential smoothing model. The output extracted from R shows that for forecast horizons 2 and 3 (for the Attrition Rate time series data) the two models are comparable in predictive accuracy, whereas for forecast horizon 1 the exponential smoothing model outperforms the linear regression model. We compute the same statistics for the Regeneration Rate and Retention Rate metrics and observe the same results for the two models.

Diebold-Mariano Test DM = -1.7906, Forecast horizon = 3, Loss function power = 2, p-value = 0.05044 alternative hypothesis: less %%%%%%%%%%% DM = -1.9919, Forecast horizon = 2, Loss function power = 2, p-value = 0.0359 alternative hypothesis: less %%%%%%%%%%% DM = -2.8443, Forecast horizon = 1, Loss function power = 2, p-value = 0.00798 alternative hypothesis: less
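A minimal sketch of how Models II and III and this comparison might be set up in R is shown below; the train/test split, the forecast package calls, and the argument order in dm.test are assumptions consistent with the description above rather than the exact calls used in the study:

# Minimal sketch of Model-II (tslm), Model-III (HoltWinters) and the Diebold-Mariano test.
# `ar` is an assumed vector of the 15 quarterly Attrition Rate values.
library(forecast)

ar_ts <- ts(ar, start = c(2009, 2), frequency = 4)
train <- window(ar_ts, end = c(2012, 1))      # first 12 points for training (assumed split)
test  <- window(ar_ts, start = c(2012, 2))    # last 3 points for testing

# Model II: linear regression on the trend
fit_lm <- tslm(train ~ trend)
fc_lm  <- forecast(fit_lm, h = 3, level = c(80, 95))

# Model III: additive exponential smoothing with level and trend, no seasonal component
fit_hw <- HoltWinters(train, gamma = FALSE)
fc_hw  <- forecast(fit_hw, h = 3, level = c(80, 95))

# Diebold-Mariano test; with this argument order, alternative = "less" tests whether
# Model II (second argument) is less accurate than Model III.
e_lm <- test - fc_lm$mean
e_hw <- test - fc_hw$mean
dm.test(e_hw, e_lm, alternative = "less", h = 1, power = 2)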

5.5.5 Discussion

In FLOSS projects contributor participation follows a Pareto distribution, that is, 20% of contributors do 80% of the work. Also, not all roles are equally relevant. Thus one may argue that it is the contribution pattern of core participants, and not of all contributors, that matters for the project. However, the success of FLOSS projects is driven by the masses and not by individuals, and each contributor has a unique role to serve. For instance, approximately 70% of contributors in FLOSS projects are one-time contributors. However, their presence ensures popularity and interest in the project, and is appreciated. So if contributors cease to file issues, it is an indicator of decreasing popularity and influences the lifetime of the project. In this study the absence of activities for three months implies that contributors left the project. This assumption is localized. For instance, an owner who stops participating for three months has likely left the project, while a reporter may resume participation even after a year. However, this assumption indicates the trend of participation. The data available in the ITS may not answer all RQs that are relevant to decision makers. However, it gives a justifiable understanding of the stability of the project in a data-driven and objective manner.

5.6 threats to validity

1. Survey sample size: We surveyed two very experienced software maintenance professionals in industry. The two professionals worked in very different organizations and presented a range of experience. However, the small sample size is a potential source of bias. Future studies may try to gain perspectives from more such professionals.

2. Adversary Behavior: At the time of analysis, the data analyzed in this study was not affected by adversarial behavior. However, in the future the choice of metrics may influence contributor behavior, and hence the metrics must be chosen appropriately.

3. Choice of metrics: We present 11 metrics for the four roles of software maintenance professionals. This list is not exhaustive, yet covers various aspects of performance. We introduce a framework which can be extended by adding more metrics based on the needs of industry.

4. Limit on visual points: In Table 33 we see that the six visualization techniques analyze a limited set of developers (the count of developers is approximate). Beyond this limit, the visualization appears cluttered and fails to present the intuition of the plot.

5. Fixed data dimensionality: The number of data dimensions that can be analyzed with a visualization technique is fixed. It cannot analyze complex relationships in a higher dimensional dataset (beyond the limit specified for each visualization in Table 33).

6. Missing data: We have no way to measure activities that are not recorded in the ITS, or activities performed between two reported activities in the ITS. Also, since we do not consider the multiple software repositories that maintain a project, we may miss some participation activities in the analysis. So the framework presented in this study may not measure participation completely. However, assuming that the missing records are uniformly distributed throughout the project, this study provides a fair estimation of trends that would otherwise go unnoticed.

7. Generalizability: We conducted experiments on one project. The applicability and usability of the stated approach must be evaluated on various projects.

5.7 summary

In this chapter, we proposed measures of the productivity of software contributors by mining software repositories. We found gaps between the perceived relevance of key performance indicators and their measurement in practice. We proposed a framework of metrics for the various roles of software maintenance professionals. Next, we proposed visualizations to present a panoramic view of a developer's contribution in a team and analyzed performance from the perspective of managers. These results are validated by survey responses from practitioners in industry. We also presented a generalized framework to characterize the stability of software maintenance projects based on community participation patterns. We proposed metrics for three key stability indicators and investigated temporal trends to analyze project stability. We modeled the participation pattern to predict future participation.

6 CONCLUSIONS AND FUTURE WORK

When you can measure what you are speaking about, and express it in numbers, you know something about it. — Lord Kelvin

Software development is a people-intensive activity where contributors create, modify and maintain the software. Much of the earlier work to improve software productivity focused on improvements in process and tools/technology, the technical aspects. Relatively less focus was given to people aspects. Recently, research on people aspects has gained momentum. This thesis studies contributor productivity to help utilize the capabilities of contributors and improve contributor productivity. It uses Empirical Software Engineering approaches based on Mining Software Repositories to present a quantitative and qualitative analysis of software development activities. This thesis, through a series of empirical studies, provides frameworks to measure individual and team contributions and explains the influence of various factors on contributor productivity. These studies are conducted on diverse and representative commercial and open source software projects: popular products at Microsoft, a wide range of software development projects on GitHub, and the Google Chromium project, to establish the usefulness of the proposed frameworks. This thesis helps formulate a true picture of software development and facilitates informed decision making. The results of this thesis cross-examine prevalent intuitions and back them with data. It is to be noted that the results presented in this thesis may not generalize to arbitrary projects. While the results are not generalizable, the frameworks proposed in this thesis can be applied in diverse contexts. This chapter revisits our studies on contributor productivity and discusses our main contributions. Then we sketch some broad and detailed directions for future research. Additional details are available in the chapter that covers each study.

Factors Influencing the Ramp-up Journey of New Hires

To understand the aids and impediments in the ramp-up journey of newly hired software developers, we conducted a large-scale study. Our analysis combined software engineering data extracted from version control systems and qualitative data from surveys and interviews. We analyzed eight large and popular product groups in Microsoft with several thousand software developers, the majority of the engineering workforce at Microsoft. We observed that 14-49% of all software developers in the product teams considered in our study are new hires. This study makes two major contributions. First, it provides a framework to measure the time taken by new hires to reach the productivity level of existing employees. Second, it presents a comprehensive list of factors influencing the ramp-up journey of new hires. This study empirically validates some of the known findings in industry from a systematic analysis of software engineering data. We measure the ramp-up on two aspects: time to first check-in and time to ramp-up. Time to first check-in marks the time required to make the first useful contribution. We measure time to ramp-up as the time required by new hires to reach the median productivity level of the existing employees in the product. Particularly, we measure the time required by new hires to match the frequency of check-ins, lines changed, and files changed by existing employees as ramp-up time. These three measures indicate the familiarity with the process, the effort, and the span of knowledge acquired by new hires, respectively. The results of this study showed that the ramp-up journey is influenced by various factors in the company. We discovered that having a mentor, prior knowledge of required skill sets, etc. helps increase the productivity of new hires. Lack of proper documentation, trying to get access and permissions, etc. reduces the productivity of new hires. The product group of new hires has no influence on the time to ramp-up, while the career stage of new hires influences it. Various strategic practices like distributed development and prior internship experience within the company are also seen to influence the time to ramp-up. Surprisingly, an early first check-in does not imply a faster ramp-up, contrary to what managers expected.


We found that new hires engage in a wide range of activities other than code check-ins that are not measured in the existing framework. For instance, new hires participate in code reviews, preparing prototypes, reading legacy code, etc. Since these activities claim the time and effort of new hires and are not measured in the existing framework, care must be taken when evaluating new hires based on these metrics. New hires also gave suggestions to improve the ramp-up journey based on their experiences. New hires advocated the application of company-wide coding standards, an improved code base, improved documentation, easier tools, etc.

Presence of Geographical Bias in Peer Code Review Process

This study analyzes the perceptions and reality of the influence of geographical location on the evaluation of pull requests in GitHub. This study intends to generate awareness and bridge gaps in perceptions, if any. We used a mixed-methods approach to detect the presence of bias when developers are not aware of or not willing to accept their biases. We combined observations from 70,000+ pull requests and 2,500+ survey responses, one of the largest response populations of open source projects, to analyze the influence of geographical location on pull request acceptance decisions. We controlled for the various factors that influence pull request acceptance decisions and quantitatively analyzed the GitHub projects' data to measure the influence of 1) the geographical location of submitters on pull request acceptance decisions and 2) the same geographical location of submitters and integrators on pull request acceptance decisions. We examined pull request acceptance rate across geographical locations as a proxy for geographical bias. We supported these observations with the quantitative and qualitative analysis of the survey responses of submitters and integrators on the perceptions of bias. The data analysis suggests the presence of geographical bias as there were significant differences in pull request acceptance rates based on geographical locations. This observation was in agreement with the experiences of pull request submitters. We found that the perceptions of submitters classified by geographical locations match the data analysis. However, integrators did not perceive being biased in evaluating submitters. Integrators felt that factors relating to the geographical location of submitters, and not necessarily the geographical location itself, may influence their pull request acceptance decisions. Integrators perceived that the observed differences can be explained in terms of language barriers and the inability to communicate. We cross-compared the observations of the survey and data analysis to explain the differences in pull request acceptance rate as geographical bias or as factors related to geographical location. We found the following: First, there are geographical locations in the list which have higher chances of pull request acceptance and where use of English is not widespread, like Japan. Second, we saw that Switzerland and Germany, which are pretty similar in terms of language usage, are different in terms of pull request acceptance rate. Third, we sent the survey in different languages like French and received numerous responses stating that GitHub contributors are familiar with basic English and there is no need to translate the survey into local languages. Finally, the large number of survey responses received ensures that our results are not biased by a handful of people who have poor communication skills. Together these observations suggest that the language barrier and inability to communicate cannot explain the observed behavior, implying the presence of geographical bias. It also indicates a bias blind spot, as developers see the impact of bias on the judgement of others while failing to see the impact of bias on their own judgement. This study informs both integrators and submitters about the actual presence of bias despite their reported perceptions.

Rise and Fall of Developer Participation due to Competing Projects

This study analyzes the rise and fall of developer participation in a project when competing projects emerge from its source code (termed forking), claiming existing and potential developer participation. To investigate the influence of forking on the sustainability of the original project, we proposed a quantitative framework to identify projects with forks, characterize projects based on developer participation, and measure changes in developer community participation in the original project after forking. We measured changes in the developer community participation after forking and modeled the influence as 1) decrease, 2) increase, and 3) no effect on the sustainability of the developer community participation in the original project. We modeled the observed behavior in terms of project characteristics to understand its reasons. A large-scale study of projects hosted on GitHub showed that forking is significant and is going to be more relevant in the future. An analysis of more than 2,000 open source software projects suggests that forking significantly alters the sustainability of the developer community participation in the original project. We found that 18%, 29%, and 53% of the original projects observe decrease, increase, and no effect respectively in the developer community participation after forking. Thus a small, yet significant fraction of projects observed a decline in the sustainability of the developer community participation after forking. This effect is more pronounced in projects ported to GitHub from external sources than in projects created on GitHub. We found that the observed behavior can be explained in terms of the characteristics of the competing projects at the time of forking. We analyzed projects of various sizes (small, medium and large) and observed the following. In small projects, an increase in the maturity of a project by a year increases the chances of a new project emerging from it by 60%. In medium-size projects, an increase in the contributors' count by a unit reduces the chances of a new project emerging from it by 110%. An increase in the influence of the owner of the original project decreases the chances of starting a new project by 10% in large projects, and an increase in the influence of the contributor intending to start the new project increases the chances of starting a new project by 19% in large projects. We also found that the domain of the project influences the chances of observing a new project and that the more popular the original project, the less likely it is that a new project will start from it. We also found that the maturity of the original project, the domain, the size of the community of the original and new project, the influence of the owner of the original and new project, and the popularity of the original and new project influence the rise and fall of developer participation in the original project. For instance, an increase in the influence of the owner of the original project and in its popularity decreases the chances of a decline in the developer community participation in the original project. This observation explains why the fork LibreOffice had no effect on the sustainability of the developer community participation in OpenOffice. Interestingly, some factors have dual effects on the project. For instance, in large projects an increase in the maturity of the original project increases the chances of observing a new project while also not influencing the sustainability of the original project.

Effect of Personality Traits on Levels of Contribution

This study explores the inferential power of personality traits in explaining the behavior of contributors in various contexts of software development in GitHub. This understanding of the behavior of contributors is important to comprehend the intricacies of a team. To do so, we measured the personality profiles of contributors, as evident from their language use in discussions on software development. We used the Big Five model from psychology that measures personality profiles in terms of Openness to Experience, Conscientiousness, Extraversion, Agreeableness and Neuroticism. We empirically analyzed the relationships of the personality traits of contributors with contributions and the context of software development in GitHub. Specifically, we studied differences in the personality traits of sub-communities of contributors with different levels of contributions and project membership status. Also, we studied differences in the personality traits of individuals when they contributed in different contexts. Particularly, we looked for changes in the personality traits with time, type of contribution, role and project membership. We started the study by proposing that the personality traits of contributors mined from software repositories could offer a way to explain their contributions in different contexts of software development. We were able to present several results that supported our ideas. We showed that the personality traits of contributors with different levels of contributions are different. We found a steady increase in Openness to Experience, Conscientiousness, and Extraversion as contribution increases. This was expected, as contributors with higher levels of contributions are the ones who take the lead to ensure project success. That these contributors are characterized as more insightful, more goal-oriented and more social seems intuitive. Further, we found that contributors with higher levels of contributions are lower on Agreeableness, which is again expected of them to ensure the health of the project. We found that project members are more Conscientious, more Extrovert and less Agreeable compared to non-project members. This is expected, as project members are rigorous in their work, social to attract participation, and less agreeable to maintain quality software development.

We showed that the personality profiles of the most active contributors evolve with time. We found that the most active contributors become more Conscientious, more Extrovert and less Agreeable with time. These characteristics explain in the first place why and how these people became the most active contributors. We showed that contributors portray different personality traits in different contexts of software development. We found that contributors, when working on pull requests, are most Conscientious, most Extrovert, most Neurotic and least Agreeable. We believe that this behavior indicates gatekeeping for the entry of quality work and that pull requests mark the highest barrier to entry. Among commits and issues, contributors are most Conscientious, most Neurotic and least Agreeable when working on commits. This implies that after pull requests, commits impose the largest barrier to entry to ensure quality software development. It is important to note that contributors are least Extrovert when working on commits. One possible explanation for this is that project members collaborate on commits, unlike issues, which are largely discussed by prospective project participants or end users. Finally, while working on issues, contributors are least Conscientious, least Neurotic and most Agreeable. This behavior shows how active contributors facilitate the free inflow of suggestions from the masses. We found that contributors are more Extrovert when they are members of projects or are reviewing others' work. This explains the expectation that the project members and reviewers of successful projects maintain cordial terms to attract participation. Other than this, we found that as reporters, contributors are more Open to Experience, more Conscientious, and more Neurotic compared to when they are reviewing others' work. Also, we found that when contributors are not members of the project, they are more Conscientious, more Agreeable and more Neurotic.

Impact of Role Reputation and Contribution on Developer Participation

This study examines contributor characteristics, namely the role of participation and the amount of work done, as predictors of developer participation in the Google Chromium project. We presented an approach that classifies contributors, based on their contribution, into three mutually exclusive sets: non-core team, loose core team and tight core team. We defined attrition as a function of participation in two consecutive time intervals, and studied the attrition rate for reporters, owners, commenters and cc'ed contributors, as well as for the three classes of contributors. We observed that the attrition rate of all contributors fluctuates between 27% and 47%. We observed marked differences in the attrition rates for the four roles, with the minimum attrition rate for owners and the maximum for reporters. This follows intuition, because the majority of reporters are end users and not necessarily contributors to the project. We found that the attrition rate for the loose core team and the tight core team ranges from 3% to 10%, which is less than the attrition rate for the non-core team (between 27% and 43%). This indicates that retention in a project is directly related to the degree of involvement in the project. Interestingly, after initial fluctuations, the attrition rate of the tight core team is higher than that of the loose core team, indicating that the tight core team contributes relatively large amounts of work, though sporadically.
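As a rough illustration of the contribution-based classification described above, the sketch below bins the contributors of one time interval into tight core, loose core and non-core teams by their cumulative share of work. The thresholds, record format and function name are assumptions made for this example; the actual classification criteria used in the study may differ.

# Sketch: classify contributors into tight core, loose core and non-core
# teams by cumulative share of contributions. The thresholds below are
# illustrative assumptions, not the criteria used in the study.
from collections import Counter

def classify_contributors(contributions, tight_share=0.5, loose_share=0.8):
    """`contributions` maps contributor -> amount of work in an interval.
    Contributors jointly producing the first `tight_share` of the work form
    the tight core team, those up to `loose_share` the loose core team,
    and the rest the non-core team."""
    total = sum(contributions.values())
    if total == 0:
        return {}
    classes = {}
    cumulative = 0
    for contributor, amount in sorted(contributions.items(),
                                      key=lambda kv: kv[1], reverse=True):
        cumulative += amount
        if cumulative / total <= tight_share:
            classes[contributor] = "tight core"
        elif cumulative / total <= loose_share:
            classes[contributor] = "loose core"
        else:
            classes[contributor] = "non-core"
    return classes

if __name__ == "__main__":
    work = Counter({"alice": 50, "bob": 25, "carol": 15, "dave": 10})
    print(classify_contributors(work))
    # {'alice': 'tight core', 'bob': 'loose core', 'carol': 'non-core', 'dave': 'non-core'}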

Measuring Individual Contribution

This study proposes measures of individual contribution based on pre-defined objectives, roles, and key performance indicators. We started with a survey of experienced industry professionals to understand the relevance of performance indicators and their measurements in practice. We found that there are differences in the perceived importance of key performance indicators and their measurements. We proposed 11 metrics for bug reporters, bug triagers, bug owners and collaborators based on key performance indicators. Next, we proposed visualizations to present a holistic view of individual contribution in a team. We validated the proposed framework with 10 professionals in industry. Survey responses suggested that the usefulness of the six visualizations varied considerably; however, the general consensus validated their applicability.
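To give a concrete flavour of role-based, key-performance-indicator style metrics, the sketch below computes two illustrative measures from a bug-tracker export: the share of a reporter's bugs that were eventually fixed, and the median resolution time of an owner's bugs. These two measures and the field names are examples invented for illustration; they are not necessarily among the 11 metrics proposed in the study.

# Illustrative role-based contribution metrics from bug-tracker records.
# The field names (reporter, owner, status, opened, closed) are assumptions
# made for this sketch, not the schema used in the study.
from statistics import median

def reporter_fix_ratio(bugs, reporter):
    """Share of bugs filed by `reporter` that ended up fixed."""
    filed = [b for b in bugs if b["reporter"] == reporter]
    if not filed:
        return None
    fixed = [b for b in filed if b["status"] == "fixed"]
    return len(fixed) / len(filed)

def owner_median_resolution_days(bugs, owner):
    """Median days between opening and closing for bugs resolved by `owner`."""
    durations = [
        (b["closed"] - b["opened"]).days
        for b in bugs
        if b["owner"] == owner and b.get("closed") is not None
    ]
    return median(durations) if durations else None

if __name__ == "__main__":
    from datetime import date
    bugs = [
        {"reporter": "eve", "owner": "alice", "status": "fixed",
         "opened": date(2016, 1, 1), "closed": date(2016, 1, 5)},
        {"reporter": "eve", "owner": "alice", "status": "wontfix",
         "opened": date(2016, 2, 1), "closed": date(2016, 2, 2)},
    ]
    print(reporter_fix_ratio(bugs, "eve"))              # 0.5
    print(owner_median_resolution_days(bugs, "alice"))  # 2.5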

Measuring Team Contribution Patterns

This study investigates the stability of the community in software maintenance projects by analyzing team contribution patterns. We identified three key stability indicators, namely attrition, regeneration, and retention, and proposed metrics to measure them. We probed temporal community contribution patterns to answer representative research questions aimed at investigating the inferential ability of the metrics in understanding the stability of a project. We also modelled the community participation pattern to estimate future participation.
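The following sketch shows one plausible set-based formulation of the three stability indicators, computed from the contributors active in two consecutive time intervals. These formulas are assumptions made for illustration; the metrics proposed in the study may be defined differently.

# Sketch: set-based stability indicators between two consecutive intervals.
# These formulations are illustrative assumptions, not the exact metric
# definitions proposed in the study.
def stability_indicators(previous, current):
    """Attrition, regeneration and retention given the contributor sets
    of the previous and the current interval."""
    previous, current = set(previous), set(current)
    attrition = len(previous - current) / len(previous) if previous else None
    regeneration = len(current - previous) / len(current) if current else None
    retention = len(previous & current) / len(previous) if previous else None
    return {"attrition": attrition, "regeneration": regeneration, "retention": retention}

if __name__ == "__main__":
    interval_1 = {"alice", "bob", "carol"}
    interval_2 = {"carol", "dave"}
    print(stability_indicators(interval_1, interval_2))
    # attrition = 2/3, regeneration = 1/2, retention = 1/3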

There are many possible ways in which the work described in this thesis can be extended. Some of the broad research directions and specific problems for the research studies are:

Broad Directions

1. Empirical Analysis Various studies in management science, social science and psychology study people-related aspects. Recent years have witnessed a rise in studies that empirically analyze contributor productivity. More such studies are required to fully explain contributor productivity.

2. Holistic View A majority of studies analyze the influence of the factor of interest on contributor productivity. However, a holistic understanding of the context and characteristics that completely explains the behavior is missing.

3. Real-time or Near Real-time Solutions The solutions proposed in this thesis and in various other existing studies are built on historical data and explain observed behavior. There is a need to apply the knowledge acquired from such studies in real time or near real time. For instance, we need a model that comments on project stability by analyzing the contemporary characteristics of the project and learnings from similar projects in the past.

4. Replication This thesis analyzes representative commercial and open source projects. Further studies are required that replicate the frameworks and help generalize the findings.

Specific Problems

1. Ramp-up Journey It would be interesting to study the influence of cultural and work-hour differences across nations on the ramp-up journey of new hires. Also, a study of the ramp-up journey of internal transfers could help in understanding the usefulness of this practice in large companies.

2. Personality Traits We characterized the personality traits of the most active projects and contributors. Other studies may explore what differentiates the most active projects and contributors from less active ones. Further studies should be conducted on a wider range of open source and commercial projects to establish a baseline.

3. Measuring Contributor Productivity Future studies may build interactive tools, combine plots to study complex concepts, compare teams and projects for planning, and study the evolution of projects and organizations.

APPENDIX A: COURSEWORK, FELLOWSHIP AND INTERNSHIP

A.1 Course Work

CGPA: 8.00
Total course credits: 36
Total PhD thesis credits: 84

Course                                   Grade   Credits
Data Mining                              B       4
Machine Learning                         B-      4
Pattern Recognition                      B-      4
Randomized Algorithm                     B       4
Information Retrieval                    B       4
Software Debugging                       S       2
Social Network Analysis                  S       2
Data Analysis                            S       2
                                         S       2
Independent Study (IS) at TRDDC, Pune    A-      4
Empirical Software Engineering IS        A-      2
Mining Software Repositories IS          A-      2

A.2 Fellowship

I am a recipient of the prestigious TCS PhD Research Fellowship from January 2012 to December 2016.

A.3 Internship

I interned at Microsoft Research, Redmond from June to September 2014. During my internship, I studied the aids and impediments in the ramp-up journey of new hires. I worked with Suresh Thummalapenta, Thomas Zimmermann, Nachiappan Nagappan, and Jacek Czerwonka.
