Microsoft Research. Technical Report. MSR-TR-2012-93. Representativeness in Software Engineering Research September 5, 2012 Technical Report MSR-TR-2012-93 Microsoft Research Microsoft Corporation One Microsoft Way Redmond, WA 98052 © 2012 Microsoft Corporation. All rights reserved. Page 1 Microsoft Research. Technical Report. MSR-TR-2012-93. Representativeness in Software Engineering Research Meiyappan Nagappan Thomas Zimmermann Christian Bird Software Analysis and Intelligence Lab (SAIL) Microsoft Research Microsoft Research Queen’s University, Kingston, Canada Redmond, WA, USA Redmond, WA, USA
[email protected] [email protected] [email protected] Abstract—One of the goals of software engineering research is distinct projects, they all fall within a very narrow range of to achieve generality: Are the phenomena found in a few projects functionality (and are likely similar in terms of structure, size, reflective of what goes on in others? Will a technique benefit running time, etc.). Although such an evaluation would include more than just the projects it is evaluated on? The discipline of many projects, generality would not be achieved because the our community has gained rigor over the past twenty years and is now attempting to achieve generality through evaluation and projects studied are not representative of a larger and wider study of an increasing number of software projects (sometime population of software projects. We would learn about JSON hundreds!). However, quantity is not the only important compo- parsers, but little about other types of software. This extreme nent. Selecting projects that are representative of a larger body illustration is contrived and no serious researcher would select of software of interest is just as critical.