Empirical Software Engineering at Microsoft Research
http://research.microsoft.com/en-us/projects/esm/

Christian Bird, Brendan Murphy, Nachiappan Nagappan, Thomas Zimmermann
Microsoft Research, Redmond, USA and Cambridge, UK
(Authors are in alphabetical order.)

ABSTRACT
We describe the activities of the Empirical Software Engineering (ESE) group at Microsoft Research. We highlight our research themes and activities using examples from our research on socio technical congruence, bug reporting and triaging, and data-driven software engineering to illustrate our relationship to the CSCW community. We highlight our unique ability to leverage industrial data and developers and the ability to make near-term impact on Microsoft via the results of our studies. We also present the collaborations our group has with academic researchers.

Author Keywords
Software engineering, socio technical congruence, bug tracking and triaging, data-driven software engineering

ACM Classification Keywords
D.2.5 [Software Engineering]: Testing and Debugging; D.2.7 [Software Engineering]: Distribution, Maintenance, and Enhancement

ACM General Terms
Human Factors, Management, Measurement

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CSCW 2011, March 19–23, 2011, Hangzhou, China.
Copyright 2011 ACM 978-1-4503-0556-3/11/3...$10.00.

INTRODUCTION
The Empirical Software Engineering (ESE) group at Microsoft Research focuses on working in the intersection of the Software Engineering and CSCW communities.

“Over the last decade, it has become clear that empirical studies are a fundamental component of software engineering research and practice: Software development practices and technologies must be investigated by empirical means in order to be understood, evaluated, and deployed in proper contexts. This stems from the observation that higher software quality and productivity have more chances to be achieved if well-understood, tested practices and technologies are introduced in software development. Empirical studies usually involve the collection and analysis of data and experience that can be used to characterize, evaluate and reveal relationships between software development deliverables, practices, and technologies.”
(Empirical Software Engineering journal, http://www.springer.com/computer/swe/journal/10664)

At a high level, the goals of the ESE group follow two guiding principles:

• Empower software development teams
• Gain insight from product, process, people, and customers

by employing a qualitative and quantitative approach to the software development process.

In this paper we discuss three broad themes of the ESE group:

• Socio technical congruence;
• Bug reporting and triaging; and
• Data-driven software engineering.

In each of these sections, our studies leverage techniques and methods from both the Software Engineering and CSCW communities, adapting case studies in practice from the empirical domain to CSCW aspects, since all software systems built by teams inherently have a significant collaborative aspect. We also present our collaborations and discuss the uniqueness of our fit in the middle of these two communities.

SOCIO TECHNICAL CONGRUENCE
“Design and programming are human activities; forget that and all is lost”
– Bjarne Stroustrup

As software projects grow in size and complexity, so do the teams of engineers that develop and maintain them. Brooks, in his seminal work “The Mythical Man Month” [1], discussed coordination as one of the key problems of running a software project with many developers. The coordination effort required to help each member of a team stay in sync and keep a project on schedule is enormous.

Socio Technical Congruence is a term that has emerged recently in the software engineering literature to describe the relationship between the “social” side of development (the developers, their relationships to each other, how they communicate, how they work together on software, etc.) and the “technical” side, which encapsulates features of the software itself such as dependencies between components, component complexity, and software quality. The idea behind the term has its origins in Conway’s Law, originally presented in Conway’s paper “How Do Committees Invent?” [2]. This is of importance to the CSCW community because of the emphasis placed on understanding how, and under what circumstances, developers should work together on projects.

To aid development at Microsoft and understand the effect of human factors, we have gathered data and investigated the relationship of software quality with developer attributes such as collaboration behavior, geographic location, position within the organization, and work assignment. Below, we describe some of our key results.

Contribution Behavior and Quality
In a study of collaboration behavior, Nachi, in joint work with Martin Pinzger, developed a developer-module network, which characterized the contributions of developers to modules within a system [3]. Figure 1 shows an example developer-module network. Gray circles represent developers and boxes represent modules within a system. Edges connect developers to the modules that they have contributed to, with edge weights representing the number of source code repository commits. Note that the developer-module network for Windows Vista is quite large, with thousands of developers and thousands of binaries: executables (.exe), shared libraries (.dll), and drivers (.sys).

Figure 1. A developer-module network characterizes the contributions of developers within a system.

We found that topological properties of this network were highly related to post-release faults. For instance, modules that were more central, as defined by traditional social network analysis centrality measures, tended to have more faults than other modules. We also found that less complex measures, such as the number of distinct authors and the number of distinct commits, were both significant predictors for the probability of post-release failures. By using a host of social network measures (we refer the reader to the original paper for details and descriptions) in conjunction with principal component analysis, we were able to train a logistic regression model for predicting failure-prone modules that achieved an average precision of 83% and recall of 89%. We summarize our important results:

• Network centrality measures can predict failure-prone binaries in Windows Vista.
• Network centrality measures can predict the number of post-release failures.
• Advanced centrality measures can improve the prediction of the number of post-release failures.

In summary, we found that a strong relationship exists between the developers’ commit behavior and the software quality of modules within the system.

Adding Technical Relationships
In later work, Christian and Nachi built upon this result by adding module dependencies to the developer-module network [4]. In previous work, Tom and Nachi found that dependencies can predict failures in both modules [5] and subsystems [6]. A network that incorporates both developer contributions and dependencies is a socio-technical network because edges may represent contributions from people to modules or dependencies between modules within a system. Figure 2 depicts a portion of a socio-technical network. Circles represent modules and boxes are developers. Solid directed edges are dependencies, indicating that one module may use functions or types defined in another, may make RPC calls to another, etc. Dashed lines indicate that a developer contributed to a module (we used weights in our analysis, but do not depict them in the figure).

Figure 2. A socio-technical network between modules (circles) and developers (boxes).

By combining both types of relationships into one graph, we were able to increase the power of fault-predicting regression models. Using principal component analysis, we found that models based on this network had higher recall than networks with only contribution edges (developer-module networks) or only dependency edges to a statistically significant degree in Vista (recall was similar to previous models).

To see if such models are specific to Microsoft or if they are more generally applicable, we also applied the same techniques to six major releases of the Eclipse Java IDE (2.0 through 3.3) and achieved precision and recall rates for failure-prone plug-ins ranging from 75% to 86%. Further, we were able to train a regression model on one release of Eclipse and achieve recall and precision values ranging from 75% to 93% on the next release, showing that cross-release prediction works well for network-based regression models.

The key contributions of this work were:

• We found that using technical and contribution relationships together has more power than either in isolation for predicting bugs.

these metrics. Our results indicate that all eight measures are important because a step-wise regression retained every measure. We also created a predictor based on principal component analysis (due to high correlation between some