WhoDo: Automating Reviewer Suggestions at Scale
Sumit Asthana (Microsoft Research India), B. Ashok (Microsoft Research India), Chetan Bansal (Microsoft Research India), Ranjita Bhagwan (Microsoft Research India), Christian Bird (Microsoft Research Redmond), Rahul Kumar (Microsoft Research India), Chandra Maddila (Microsoft Research India), Sonu Mehta (Microsoft Research India)

ABSTRACT

Today's software development is distributed and involves continuous changes for new features, and yet the development cycle has to be fast and agile. An important component of enabling this agility is selecting the right reviewers for every code change - the smallest unit of the development cycle. Modern tool-based code review has proven to be an effective way to achieve appropriate review of software changes. However, the selection of reviewers in these code review systems is at best manual. As software and teams scale, this poses the challenge of selecting the right reviewers, which in turn determines software quality over time. While previous work has suggested automatic approaches to code reviewer recommendation, it has been limited to retrospective analysis. We not only deploy a reviewer suggestion algorithm - WhoDo - and evaluate its effect, but also incorporate load balancing into it to address one of its major shortcomings: recommending experienced developers too frequently. We evaluate the effect of this hybrid recommendation + load balancing system on five repositories within Microsoft. Our results examine various aspects of a commit and how code review affects them. We attempt to answer, quantitatively from our data, questions that play a vital role in effective code review, and we substantiate the findings through qualitative feedback from partner repositories.

CCS CONCEPTS

• Human-centered computing → Empirical studies in collaborative and social computing; • Software and its engineering → Software configuration management and version control systems; Software maintenance tools; Programming teams.

KEYWORDS

software-engineering, recommendation, code-review

ACM Reference Format:
Sumit Asthana, B. Ashok, Chetan Bansal, Ranjita Bhagwan, Christian Bird, Rahul Kumar, Chandra Maddila, and Sonu Mehta. 2019. WhoDo: Automating reviewer suggestions at scale. In Proceedings of The 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019). ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

1 INTRODUCTION

Large software projects have continuously evolving code-bases and an ever-changing set of developers. Making the development process smooth and fast while maintaining code quality is vital to any software development effort. As one approach for meeting this challenge, code review [1, 2] is widely accepted as an effective tool for subjecting code to scrutiny by peers and maintaining quality. Modern code review [3], characterized by lightweight tool-based reviews of source code changes, is in broad use across both commercial and open source software projects [17]. This form of code review provides developers with an effective workflow to review code changes and improve code, and the process has been studied in depth in the research community [5, 6, 11, 13, 17–20, 23].

One topic that has received much attention over the past five years is the challenge of recommending the most appropriate reviewers for a software change. Bacchelli and Bird [3] found that when the reviewing developer had a deep understanding of the code being reviewed, the feedback was "more likely to find subtle defects ... more conceptual (better ideas, approaches) instead of superficial (naming, mechanical style, etc.)". Kononenko et al. [12] found that selecting the right reviewers impacts quality. Thus, many have proposed and evaluated approaches for identifying the best reviewer for a code review [4, 8, 14, 15, 22, 24, 26] (see Section 2 for a more in-depth description of related work). At Microsoft, many development teams have voiced a desire for help in identifying those developers that have the understanding and expertise needed to review a given software change.

The large and growing size of the software repositories at many software companies (including Microsoft, the company involved in the evaluation of our approach) has created the need for an automated way to suggest reviewers [3]. One common approach that several projects have used is a manually defined set of groups that collectively identify experts in an area of code. These groups are used in conjunction with rules which trigger the addition of groups whenever files in a pre-defined part of the system change (e.g., add participants in the Network Protocol Review group whenever a file is changed in /src/networking/protocols/TCPIP/*/). This ensures that the appropriate group of experts is informed of the file change and can review it. However, such solutions are hard to scale, suffer from becoming stale quickly, and may miss the right reviewers even when rules and groups are manually kept up to date.
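To make the limitations of this rule-based scheme concrete, the sketch below shows roughly how such path-to-group rules are typically encoded and matched. The rule table, group names, and helper function are hypothetical illustrations (with glob patterns adapted to fnmatch semantics), not an actual configuration used at Microsoft.

```python
from fnmatch import fnmatch

# Hypothetical, manually maintained table mapping path patterns to reviewer groups.
PATH_RULES = {
    "/src/networking/protocols/TCPIP/*": "Network Protocol Review",
    "/src/storage/*": "Storage Review",
}

def groups_for_change(changed_files):
    """Return the reviewer groups whose rules match any changed file path."""
    groups = set()
    for path in changed_files:
        for pattern, group in PATH_RULES.items():
            if fnmatch(path, pattern):
                groups.add(group)
    return groups

# A change touching a TCP/IP source file triggers the protocol review group.
print(groups_for_change(["/src/networking/protocols/TCPIP/v4/checksum.c"]))
```

Because the table is static, files added outside the listed patterns silently receive no expert group, which is one way such rules go stale as the codebase evolves.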
Motivated by the need for help in identifying the most appropriate reviewers and the difficulty of manually tracking expertise, we have developed and deployed an automatic reviewer recommendation system at Microsoft called WhoDo. In this paper, we report our experience and initial evaluation of this deployment on five software repositories. We build on the success of previous work such as Zanjani et al. [26], which demonstrated that considering the past history of the code touched by the current change, in terms of both authorship and reviewership, is an effective way to recommend peer reviewers for a code change. Based on the positive metrics of our evaluation and favorable user feedback, we have proceeded to deploy WhoDo onto many additional repositories across Microsoft. Currently, it runs on 123 repositories, and this number continues to grow rapidly.
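As a rough illustration of this history-based idea, combined with the load balancing mentioned in the abstract, the sketch below scores candidate reviewers by how often they have authored or reviewed the files touched by a change and then discounts candidates who already carry many open reviews. The data shapes, weights, and penalty are assumptions made for the example; this is not the actual WhoDo scoring algorithm.

```python
from collections import defaultdict

def recommend_reviewers(changed_files, history, open_reviews, top_k=3,
                        author_weight=1.0, reviewer_weight=0.5, load_penalty=0.1):
    """Rank candidate reviewers for a change.

    history: iterable of (file_path, developer, role) tuples, role in {"author", "reviewer"}
    open_reviews: dict mapping developer -> number of currently open reviews
    """
    changed = set(changed_files)
    scores = defaultdict(float)
    # Credit developers for past authorship/reviewership of the changed files.
    for path, developer, role in history:
        if path in changed:
            scores[developer] += author_weight if role == "author" else reviewer_weight
    # Load balancing: damp the score of developers who already have many open reviews.
    for developer in scores:
        scores[developer] /= 1.0 + load_penalty * open_reviews.get(developer, 0)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

The weights and penalty here are arbitrary constants; a deployed system would tune them and would also need to handle renamed files, developers with no history, and other cases the sketch ignores.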
As discussed in Section 2, reviewer recommendation has been the subject of much research. However, it is not clear which of these systems have been deployed in practice, and the evaluation of such recommendation systems has consisted primarily of historical comparison: determining how well the recommendations match what actually happened and who participated in the review. While such an offline evaluation [9] is useful, it may not provide an accurate picture of the impact of the recommendation system when used in practice. For example, there is an (often implicit) assumption that those who participated in the review were in fact the best people to review the change and that those who were not invited were not appropriate reviewers.

In this paper, we:

• Provide an evaluation of the recommendation system in live production.
• Present the results of a user study of software developers that used the recommendation system.

2 RELATED WORK

Tool-based code review is the adopted standard in both OSS and proprietary software systems [17], and many tools exist that enable developers to look at software changes effectively and review them. Reviewboard, Gerrit, and Phabricator, the popular open source code review tools, and Microsoft's internal code review interface share some common characteristics:

• Each code change has a review associated with it, and almost all code changes (depending on team policies) have to go through the review process.
• Each review shows the associated changes in a standard diff format.
• Each change can be reviewed by any number of developers. Currently, reviewers are added by the author of the change or through manually defined rules.
• Each reviewer can leave comments on the change at any particular location of the diff, pinpointing errors or asking for clarifications.
• The code author addresses these comments in subsequent iterations/revisions, and this continues until all comments are resolved or one of the reviewers signs off.

One potential bottleneck of the above workflow is the addition of code reviewers by authors. This is a manual activity in which several social factors [7], such as developer relations and code knowledge, play a role. This system, while effective, is also biased towards interpersonal relations, which can lead to incorrect assignment in cases