Software Analytics, We Ing Styles Could Be Seen As More Bug Can Correct Those Misconceptions

FOCUS: SOFTWARE ENGINEERING’S 50TH ANNIVERSARY much reuse of too many old, and now Software outdated, ideas. A recent study of 564 software developers found that a) programmers do indeed have Analytics: very strong beliefs on certain top- ics b) their beliefs are primarily formed based on personal experi- What’s Next? ence, rather than on fndings in empirical research and c) beliefs can vary with each project, but do Tim Menzies, North Carolina State University not necessarily correspond with actual evidence in that project.3 Thomas Zimmermann, Microsoft Research The good news is that, using software analytics, we can correct those // Developers sometimes develop their misconceptions (for examples, see own ideas about “good” and “bad” the sidebar “Why We Need Soft- ware Analytics”). For example, after software, on the basis of just a few a project is examined, certain cod- projects. Using software analytics, we ing styles could be seen as more bug can correct those misconceptions. // prone (and to be avoided). There are many ways to distill that data, including quantitative methods (which often use data- mining algorithms) and qualitative FPO methods (which use extensive human feedback to guide the data explora- tion). Previously, we have offered extensive surveys of those methods (see Figure 2).4–7 The goal of this article SOFTWARE DEVELOPMENT IS a someone claims that “this or that” is to update those surveys and offer complex process. Human develop- is important for a successful soft- some notes on current and future di- ers may not always understand all ware project, analytics lets us treat rections in software analytics. the factors that infuence their proj- that claim as something to be veri- ects. Software analytics is an excel- fed (rather than a sacred law that Some History lent choice for discovering, verifying, cannot be questioned). Also, once As soon as people started program- and monitoring the factors that the claim is verifed, analytics can ming, it became apparent that pro- affect software development. act as a monitor to continually check gramming was an inherently buggy Software analytics distills large whether “this or that” is now over- process. Maurice Wilkes, speaking of amounts of low-value data into small come by subsequent developments. his programming experiences in the chunks of very high-value informa- Anthropologists studying software early 1950s, recalled the following: tion.1 Those chunks can reveal what projects warn that developers usually factors matter the most for software develop their personal ideas about It was on one of my journeys projects. For example, Figure 1 lists good and bad software on the basis between the EDSAC room and some of the more prominent recent of just a few past projects.2 All too the punching equipment that insights learned in this way. often, these ideas are assumed to ap- “hesitating at the angles of stairs” Software analytics lets us “trust, ply to all projects, rather than just the the realization came over me with but verify” human intuitions. If few seen lately. This can lead to too full force that a good part of the 2 IEEE SOFTWARE | PUBLISHED BY THE IEEE COMPUTER SOCIETY 0740-7459/18/$33.00 © 2018 IEEE FIGURE 1. Surprising software analytics fndings in the press, from Linux Insider, Nature, Forbes, InfoWorld, The Register, Live Science, and Men’s Fitness. remainder of my life was going to WHY WE NEED be spent in fnding errors in my own programs.8 SOFTWARE ANALYTICS It took several decades to gather Prior to the 21st century, researchers often had access to data from only one or the experience required to quantify two projects. This meant theories of software development were built from limited any kind of size–defect relationship. data. But in the data-rich 21st century, researchers have access to all the data they In 1971, Fumio Akiyama described need to test the truisms of the past. And what they’ve found is most surprising: the frst known “size” law, saying that the number of defects D was a • In stark contrast to the assumptions of much prior research, pre- and post- function of the number of LOC.9 In release failures are not connected.25 1976, Thomas McCabe argued that • Static code analyzers perform no better than simple statistical predictors.26 the number of LOC was less im- • The language construct GOTO, as used in contemporary practice, is rarely portant than the complexity of that considered harmful.27 code.10 He argued that code is more • Strongly typed languages are not associated with successful projects.28 likely to be defective when his cyclo- • Developer beliefs are rarely backed by any empirical evidence.3 matic complexity measure is over 10. • Test-driven development is not any better than “test last.”29 Not only is programming an in- • Delayed issues are not exponentially more expensive to fix.30 herently buggy process, it’s also • Most “bad smells” should not be fixed.31,32 inherently diffcult. On the basis of data from dozens of projects, Barry Boehm proposed in 1981 an SEPTEMBER/OCTOBER 2018 | IEEE SOFTWARE 3 FOCUS: SOFTWARE ENGINEERING’S 50TH ANNIVERSARY from software projects contains useful information that can be found by data-mining algorithms. We now know that with modest amounts of data collection, it is possible to • build powerful recommender systems for software navigation or bug triage, or • make reasonably accurate predictions about software development effort and defects. Those predictions can rival those made by much more complex methods (such as static-code-analysis tools, which can be hard to maintain). Various studies have shown the commercial merits of this approach. For example, for one set of projects, software analytics could predict 87 percent of code defects, decrease inspection effort by 72 percent, and hence reduce post-release defects by 44 percent.12 Better yet, when multiple models are combined using en- semble learning, very good estimates can be generated (with very low vari- ance in their predictions), and these predictors can be incrementally up- dated to handle changing condi- tions.13 Also, when there is too much data for a manual analysis, it is pos- FIGURE 2. Researchers have collaborated extensively to record the state of the art in sible to automatically analyze data software analytics. from hundreds to thousands of projects, using software analytics.14 Recent research explains why soft- estimator for development effort (that modifcation to work on other proj- ware analytics has been so successful: was exponential with program size) ects. One useful feature of the cur- artifacts from software projects have using a set of effort multipliers Mi rent generation of data-mining an inherent simplicity. As Abram inferred from the current project:11 algorithms is that it is now relatively Hindle and his colleagues explained, fast and simple to learn, for exam- b effort ϭ a * KLOC * ⌸Mi, ple, defect or effort predictors for Programming languages, in other projects, using whatever attri- theory, are complex, fexible and where 2.4 Յ a Յ 3 and 1.05 Յ b Յ butes they have available. powerful, but the programs that 1.2. While the results of Akiyama, real people actually write are McCabe, and Boehm were useful Current Status mostly simple and rather repeti- in their home domains, it turns out Thousands of recent research papers tive, and thus they have usefully that those results required extensive all make the same conclusion: data predictable statistical properties 4 IEEE SOFTWARE | WWW.COMPUTER.ORG/SOFTWARE | @IEEESOFTWARE that can be captured in statistical different groups. By adopting the software engineering tools can im- language models and leveraged perspectives of different social plement other useful tasks such as for software engineering tasks.15 groupings, developers can build bet- automatically tuning the control ter software.19 Models built from parameters of a data-mining algo- For example, here are just some social factors (such as how often rithm22 (which, for most developers, of the many applications in this someone updates part of the code) is a black art). growing area of research: can be more effective for predicting Other important trends show the code quality than code factors (such maturity of software analytics. For • Patterns in the tokens of soft- as function size or the number of ar- decades now, there have been exten- ware can recognize code that is guments). For example, when you’re sive discussions about the reproduc- unexpected and bug prone. studying software built in multiple ibility of software research—much • Sentiment analysis tools can countries, a good predictor for bugs of it having the form, “wouldn’t it gauge the mood of the develop- is the complexity of the organiza- be a good idea.” We can report here ers, just by reading their issue tional chart. (The fewest bugs are that, at least for quantitative soft- comments.16 introduced when people working ware analytics based on data-mining • Clustering tools can explore on the same functions report to the algorithms, such reproducibility is complex spaces such as Stack same manager, even if they are in common practice. Many conferences Overfow to automatically detect different countries.20) in software engineering reward re- related questions.17 Software analytics also studies searchers with “badges” if they place the interactions of developers, us- their materials online (see tiny.cc ing biometric sensors. Just as we /acmbadges). Accordingly, research- But What’s Next? mine software (and the social pro- ers place their scripts and data in When we refect over the past de- cesses that develop them), so too GitHub. Some even register those re- cade, several new trends stand out. can we mine data, collected at the positories with tools such as Zenodo, For example, consider the rise of the millisecond level, from computer in order to obtain unique and eternal data scientist in industry.

Load more