Using Analytics to Guide Improvement During an Agile–DevOps Transformation


Barry Snyder, Fannie Mae
Bill Curtis, CAST

// Fannie Mae IT has transformed from a waterfall organization to a lean culture enabled by agile methods and DevOps. The article discusses how analytics were used, along with challenges in selecting measures and implementing analytics in an agile–DevOps transformation. //

Software analytics1 have proven increasingly useful in industrial case studies for guiding adoption and evaluating benefits of advanced software techniques.2,3 In 2015, Fannie Mae, a provider of liquidity for mortgages in the US housing market, undertook a major transformation from its traditional waterfall approach to a culture built on agile methods and DevOps practices.4 The transformation was motivated by Fannie Mae's need to adjust its IT capabilities to meet the requirements and pace of a rapidly evolving mortgage market. In 2012, Fannie Mae had only 10 teams using agile-like techniques. Releases took nine to 18 months, provisioning of environments took two to four months, and quality was not consistently measured.

Fannie Mae's IT department was a complex, technical-polyglot ecosystem composed of 461 applications, several hundred utilities, and almost 18,000 open source components. As part of reducing this complexity, Fannie Mae initiated two projects to implement agile methods and DevOps as enterprise capabilities. Executive management required periodic empirical evaluation of progress in quality, productivity, and delivery speed to monitor the effectiveness of the agile–DevOps deployment.

Implementing the Analytics Platform

Fannie Mae started by leveraging what already existed and filled the gaps with new solutions. While this got the transformation moving forward, it revealed that a highly customized solution can become an impediment. For example, the production release platform was built out of three point-solution products with significant custom scripts to unify them. While it conformed to Fannie Mae's tightly governed environment, it impeded transformation: a major update to the platform would take more than nine months, while teams needed changes within weeks.

In comparison, the CAST Application Intelligence Platform (AIP)—adopted for application analysis and measurement—had no customization and was implemented out of the box. The platform was deployed in 2014 as an enterprise platform providing technology-independent structural-quality metrics that enabled consistent analysis and comparison across Fannie Mae's application portfolio. Structural quality represents how well the system is architected and the code is engineered. Since manual structural-quality analysis of entire applications was infeasible, the automated platform provided a common analytics solution across the enterprise. The only customization automated the AIP onboarding process with an in-house-built framework—a necessity for providing a self-service capability integrated into the DevOps toolchain.

AIP analyzes an entire multilayer, multilanguage application system from the user interface to the database structure.5 To evaluate architecture, structural-quality analytics are developed during integration rather than during unit testing, and replace some information previously provided by formal inspections. An abstract structural representation of the system is generated from this analysis and evaluated against more than 1,200 rules of good architectural and coding practice. Violations of these rules are weighted and aggregated into measures that highlight the operational and cost risks of the application system in five areas: robustness, security, performance efficiency, changeability, and transferability (or comprehensibility). These five measures are aggregated into a sixth measure—the total quality index (TQI)—that provides a summary structural-quality score.
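The article does not spell out AIP's weighting and aggregation model, which is proprietary. Purely to make the rollup concrete, here is a minimal sketch in which invented rules carry severity weights, weighted violations accumulate into per-factor penalties normalized by code size, and the five factor scores average into a TQI-style summary; every rule name, weight, and scale below is an assumption, not CAST's algorithm.

```python
# Toy rollup of rule violations into health-factor scores and a summary
# index, loosely mirroring the TQI idea. Rule names, severity weights,
# the per-KLOC normalization, and the 0-100 scale are all invented.
from collections import defaultdict

RULES = {  # rule id -> (health factor, severity weight), hypothetical
    "avoid-sql-injection":         ("security", 9),
    "close-resources-in-finally":  ("robustness", 7),
    "avoid-n-plus-one-queries":    ("performance efficiency", 6),
    "limit-cyclomatic-complexity": ("changeability", 4),
    "comment-public-interfaces":   ("transferability", 2),
}

def health_scores(violations, loc):
    """violations: rule ids found in one scan; loc: lines of code scanned."""
    penalty = defaultdict(float)
    for rule in violations:
        factor, weight = RULES[rule]
        penalty[factor] += weight
    # Normalize by size so a large portfolio app is not penalized per se.
    scores = {f: max(0.0, 100.0 - 25.0 * p / (loc / 1000))
              for f, p in penalty.items()}
    for factor, _ in RULES.values():           # factors with no findings
        scores.setdefault(factor, 100.0)
    return scores

def total_quality_index(scores):
    return sum(scores.values()) / len(scores)  # stand-in for the TQI rollup

scan = ["avoid-sql-injection", "close-resources-in-finally",
        "close-resources-in-finally"]
s = health_scores(scan, loc=50_000)
print({k: round(v, 1) for k, v in s.items()}, round(total_quality_index(s), 1))
```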
Originally, analytics were delivered by a central team, so development teams were constrained in using measures by how quickly they were received. This bottleneck led to developing a self-service interface enabling teams to choose the frequency of scanning their applications and consume analytics at their own pace. This interface increased platform usage by 481%, from 340 scans during the year prior to the self-service interface to 1,636 scans during the year after deployment.

Analyzing Productivity

Measuring productivity improvement was a critical challenge in monitoring the transformation. Several measures (story points, dollar spend, LOC, code review defects, code coverage, etc.) were evaluated and eliminated because they did not provide equivalent comparisons that were independent of technology and comparable across applications. For analyzing productivity, Fannie Mae chose to measure functional size rather than LOC.

The Consortium for IT Software Quality (CISQ) recently produced a standard, through the Object Management Group, for Automated Function Points (AFPs)6 that was based on counting guidelines published by the International Function Point Users Group (IFPUG).7 Since AFPs were generated automatically by AIP when analyzing the structural attributes of a code base, they gained rapid acceptance among early-adopter teams.

AFPs provided an outcome-based evaluation of productivity: the amount of functionality delivered per unit of effort in sprints. After aggregating results across all projects, the enterprise could rapidly adjust activities to address progress shortcomings and support transformation accelerators. In turn, teams could rapidly pinpoint which process and technology improvements drove the greatest impact.

For example, one of three squads working on elements of the same product line had adopted behavior-driven-development techniques using Selenium for testing. Analyzing data on effort, size, and development context combined from different sources indicated that the squad's productivity rose by 4.9% over three sprints due to shortening the customer feedback cycle. This led the remaining squads to adopt Selenium.
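To make the ratio arithmetic concrete, a sketch with made-up sprint figures follows; none of these numbers are Fannie Mae's, and the record layout is hypothetical.

```python
# Hypothetical sprint records: AFPs added/modified/deleted, effort, spend,
# and critical defects, illustrating the AFP-based ratios from the article.
sprints = [
    {"afp_delta": 42, "effort_days": 80, "spend": 96_000, "critical_defects": 3},
    {"afp_delta": 51, "effort_days": 78, "spend": 94_000, "critical_defects": 2},
]

for i, s in enumerate(sprints, 1):
    productivity = s["afp_delta"] / s["effort_days"]        # AFPs per person-day
    spend_per_afp = s["spend"] / s["afp_delta"]             # dollars per AFP
    defect_ratio = s["critical_defects"] / s["afp_delta"]   # defects per AFP
    print(f"sprint {i}: {productivity:.2f} AFP/day, "
          f"${spend_per_afp:,.0f}/AFP, {defect_ratio:.3f} critical defects/AFP")

# Relative improvement between sprints: the kind of trailing indicator
# used to credit practices such as the squad's Selenium adoption.
p1 = sprints[0]["afp_delta"] / sprints[0]["effort_days"]
p2 = sprints[1]["afp_delta"] / sprints[1]["effort_days"]
print(f"productivity change: {100 * (p2 - p1) / p1:+.1f}%")
```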
Analyzing Structural Quality

Structural-quality measures were joined with defect ratio, dollar spend, release cycle time, build count, and other measures to analyze the impact of agile–DevOps transformation practices. These analyses provided equivalent comparisons regardless of language, analyzed the complexities of multilayer transactions, and could be combined with other measures to calculate ratios such as spend per AFP, defect ratio per AFP, etc.

Application teams scanned builds one or more times during a sprint to detect structural-quality flaws and produce the analytic measures. Scan outputs provided early detection of structural flaws, enabling teams to address the most critical issues before release. Prior to analyzing structural quality, teams would occasionally discover these flaws in testing, but most often only after an application was released into production. Most teams began scanning at a minimum of once per sprint (every two weeks). Those who were scanning several times a week or even daily could fix critical defects within a day or two rather than waiting until the next sprint. These measures, along with detailed information about the location and severity of defects, were provided to application teams via engineering dashboards.

Application teams used these measures to iteratively improve quality in the five structural-quality dimensions while monitoring overall quality via the TQI score. Structural-quality analytics enabled teams to look for code-quality issues presenting short-term risk while simultaneously addressing long-term architectural limitations and costly maintenance early in construction. Fannie Mae intended to improve the structural quality of all applications to a “green” state—roughly a 70th-percentile score on the TQI. At the start of the journey in 2015, most apps were “yellow” and “red” (scores …

However, the structural-quality output did not align with other enterprise metrics. Aligning the metrics at the enterprise level and across the DevOps tools was a challenge, as the tools were in silos. As the transformation evolved over the first two years, each platform built one-off solutions to collate its data with external enterprise metrics—this resulted in metric silos. To eliminate the silos, Fannie Mae constructed an enterprise data mart to align measures and analytics from all enterprise tools covering financials, enterprise program management, software development (DevOps, test automation, test data management, agile methods, code quality, etc.), and operations. The collation of data … productivity and quality efficiency as trailing indicators of transformation success to demonstrate return on investment.

Analyzing Transformation Improvement

Figure 1 displays how Fannie Mae measured and monitored productivity in functionality delivered (AFPs added, modified, or deleted) and quality (critical defects per AFP) as the transformation was deployed across applications. Critical defects were those rated from 7 to 9 on a 9-point scale. During the first quarter of 2015, data aggregated across applications indicated that the transformation began delivering substantially more functionality without …
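Figure 1 itself is not reproduced in this excerpt, but the collation behind it is straightforward to picture: per-tool extracts joined on a shared application-and-quarter key, from which trailing indicators such as critical defects per AFP can be trended. The table layouts and numbers in this sketch are invented.

```python
# Toy stand-in for the enterprise data mart: join per-quarter extracts
# from separate tools (code-quality scans, finance) on a shared key,
# then derive the transformation indicators. All data is invented.
scans = {  # (app, quarter) -> structural-quality extract
    ("loan-svc", "2015Q1"): {"afp_delta": 120, "critical_defects": 18},
    ("loan-svc", "2015Q2"): {"afp_delta": 155, "critical_defects": 14},
}
finance = {  # (app, quarter) -> spend extract from the financial tool
    ("loan-svc", "2015Q1"): {"spend": 300_000},
    ("loan-svc", "2015Q2"): {"spend": 310_000},
}

for key in sorted(scans):
    row = {**scans[key], **finance[key]}   # the "data mart" join
    app, quarter = key
    print(f"{app} {quarter}: "
          f"{row['critical_defects'] / row['afp_delta']:.3f} defects/AFP, "
          f"${row['spend'] / row['afp_delta']:,.0f}/AFP")
```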
Recommended publications
  • Information Needs for Software Development Analytics
    Information Needs for Software Development Analytics
    Raymond P. L. Buse (The University of Virginia, USA, [email protected]) and Thomas Zimmermann (Microsoft Research, [email protected])

    Abstract—Software development is a data-rich activity with many sophisticated metrics. Yet engineers often lack the tools and techniques necessary to leverage these potentially powerful information resources toward decision making. In this paper, we present the data and analysis needs of professional software engineers, which we identified among 110 developers and managers in a survey. We asked about their decision-making process, their needs for artifacts and indicators, and scenarios in which they would use analytics. The survey responses lead us to propose several guidelines for analytics tools in software development, including: engineers do not necessarily have much expertise in data analysis, thus tools should be easy to use, fast, and produce concise output; engineers have diverse analysis needs and consider most indicators to be important, thus tools should at the same time support many different types of artifacts and many indicators.

    … decision making will likely only become more difficult. Analytics describes the application of analysis, data, and systematic reasoning to make decisions. Analytics is especially useful for helping users move from only answering questions of information like “What happened?” to also answering questions of insight like “How did it happen and why?” Instead of just considering data or metrics directly, one can gather more complete insights by layering different kinds of analyses that allow for summarizing, filtering, modeling, and experimenting, typically with the help of automated tools. The key to applying analytics to a new domain is understanding the link between available data and the information needed to make good decisions, as well as the analyses …
  • Software Analytics for Incident Management of Online Services: An Experience Report
    Software Analytics for Incident Management of Online Services: An Experience Report
    Jian-Guang Lou, Qingwei Lin, Rui Ding, Qiang Fu, and Dongmei Zhang (Microsoft Research Asia, Beijing, P. R. China, {jlou, qlin, juding, qifu, dongmeiz}@microsoft.com); Tao Xie (University of Illinois at Urbana-Champaign, Urbana, IL, USA, [email protected])

    Abstract—As online services become more and more popular, incident management has become a critical task that aims to minimize service downtime and to ensure high quality of the provided services. In practice, incident management is conducted through analyzing a huge amount of monitoring data collected at the runtime of a service. Such data-driven incident management faces several significant challenges, such as the large data scale, complex problem space, and incomplete knowledge. To address these challenges, we carried out two-year software-analytics research in which we designed a set of novel data-driven techniques and developed an industrial system called the Service Analysis Studio (SAS), targeting real scenarios in a large-scale online service of Microsoft.

    … incident management needs to be efficient and effective in order to ensure high availability and reliability of the services. A typical procedure of incident management in practice (e.g., at Microsoft and other service-provider companies) goes as follows. When the service monitoring system detects a service violation, the system automatically sends out an alert and makes a phone call to a set of On-Call Engineers (OCEs) to trigger the investigation of the incident in order to restore the service as soon as possible. Given an incident, OCEs need to understand what the problem is and how to resolve it. In ideal cases, OCEs can identify the root cause of the incident and fix …
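The alerting loop the authors describe is simple to sketch. The thresholds, names, and data shapes below are illustrative assumptions; they are not SAS's or Microsoft's internals.

```python
# Minimal sketch of the incident-management loop the paper describes:
# a monitor detects a service violation, alerts on-call engineers (OCEs),
# and an incident is opened for investigation. Names/thresholds invented.
from dataclasses import dataclass, field

@dataclass
class Incident:
    service: str
    metric: str
    value: float
    notified: list = field(default_factory=list)

ON_CALL = ["oce-alice", "oce-bob"]
SLO = {"availability": 0.999}  # hypothetical service-level objective

def check(service, metric, value):
    if value < SLO[metric]:           # service violation detected
        incident = Incident(service, metric, value)
        for oce in ON_CALL:           # alert and page each on-call engineer
            incident.notified.append(oce)
        return incident               # investigation to restore service begins
    return None

inc = check("web-frontend", "availability", 0.995)
print(inc)
```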
  • Patterns for Implementing Software Analytics in Development Teams
    Patterns for Implementing Software Analytics in Development Teams
    Joelma Choma (National Institute for Space Research – INPE), Eduardo Martins Guerra (National Institute for Space Research – INPE), Tiago Silva da Silva (Federal University of São Paulo – UNIFESP)

    Software development activities typically produce a large amount of data. Using a data-driven approach to decision making – such as software analytics – practitioners can achieve higher development-process productivity and improve many aspects of software quality based on insightful and actionable information. This paper presents a set of patterns describing steps to encourage the integration of analytics activities by development teams, supporting them in making better decisions and promoting the continuous improvement of the software and its development process. Before any procedure to extract data for software analytics, the team first needs to define its questions about what will need to be measured, assessed, and monitored throughout the development process. Once the key issues to be tracked are defined, the team may select the most appropriate means for extracting data from software artifacts that will be useful in decision making. The tasks to set up the development environment for software analytics should be added to the project planning along with the regular tasks. The software analytics activities should be distributed throughout the project in order to add information about the system in small portions. By defining reachable goals from the software analytics findings, the team turns insights into actions that incrementally improve the software's characteristics and/or its development process.

    Categories and Subject Descriptors: D.2.8 [Software and its engineering]: Software creation and management—Metrics. General Terms: Software Analytics. Additional Key Words and Phrases: Software Analytics, Decision Making, Agile Software Development, Patterns, Software Measurement, Development Teams.
  • Evaluating and Improving Software Quality Using Text Analysis Techniques - A Mapping Study
    Evaluating and Improving Software Quality Using Text Analysis Techniques - A Mapping Study
    Faiz Shah, Dietmar Pfahl (Institute of Computer Science, University of Tartu, J. Liivi 2, Tartu 50490, Estonia, {shah, dietmar.pfahl}@ut.ee)

    Abstract: Improvement and evaluation of software quality is a recurring and crucial activity in the software development life cycle. During software development, software artifacts such as requirement documents, comments in source code, design documents, and change requests are created containing natural-language text. For analyzing natural text, specialized text analysis techniques are available. However, a consolidated body of knowledge about research using text analysis techniques to improve and evaluate software quality still needs to be established.

    To contribute to the establishment of such a body of knowledge, we aimed at extracting relevant information from the scientific literature about data sources, research contributions, and the usage of text analysis techniques for the improvement and evaluation of software quality. We conducted a mapping study by performing the following steps: define research questions, prepare a search string and find relevant literature, apply a screening process, classify, and extract data.

    Bug classification and bug-severity assignment activities within defect management are frequently improved using the text analysis techniques of classification and concept extraction. The non-functional-requirements elicitation activity of requirements engineering is mostly improved using the technique of concept extraction. The quality characteristic most frequently evaluated for the product quality model is operability. The most commonly used data sources are bug reports, requirement documents, and software reviews. The dominant types of research contributions are solution proposals and validation research.
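One of the most frequently mapped techniques, classifying bug reports from their natural-language text, can be illustrated in a few lines. The use of scikit-learn and the toy labeled corpus are assumptions for illustration, not tools or data from the study.

```python
# Toy text classification of bug reports by severity, the kind of
# technique the mapping study surveys. scikit-learn usage and the
# tiny labeled corpus are illustrative assumptions only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

reports = [
    "crash on startup with null pointer",
    "data loss after saving large file",
    "typo in settings dialog label",
    "button slightly misaligned on resize",
]
severity = ["high", "high", "low", "low"]

# Vectorize the report text and fit a simple Naive Bayes classifier.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(reports, severity)
print(model.predict(["application crashes when file is corrupt"]))
```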
  • Software Analytics in Practice
    Software Analytics in Practice
    Dongmei Zhang, Shi Han, Yingnong Dang, Jian-Guang Lou, and Haidong Zhang, Microsoft Research Asia; Tao Xie, University of Illinois at Urbana-Champaign

    // The StackMine project produced a software analytics system for Microsoft product teams. The project provided several lessons on applying software analytics technologies to make an impact on software development practice. //

    The software development process produces various types of data such as source code, bug reports, check-in histories, and test cases. Over the past … software domain, we formed Microsoft Research Asia's Software Analytics Group (http://research.microsoft.com/groups/sa) in 2009. …

    • software practitioners could conduct software analytics by themselves,
    • researchers from academia or industry could collaborate with software practitioners from software companies or open source communities and transfer their analytics technologies to real-world use, or
    • practitioners could organically adopt analytics technologies developed by researchers.

    However, no matter how this is pursued, it remains a huge challenge. We've worked on a number of software analytics projects that have had a high impact on software development practice.7–10 To illustrate the lessons we've learned from them, we describe our experiences developing StackMine, a scalable system for postmortem performance debugging.7

    The Promise and Challenges of Software Analytics
    As we mentioned before, software analytics aims to obtain insightful and actionable information. Insightful information conveys knowledge that's meaningful and useful for practitioners performing a specific task. Actionable information is information with which practitioners can devise concrete ways …
  • Software Analytics to Software Practice: a Systematic Literature Review
    Software Analytics to Software Practice: A Systematic Literature Review
    Tamer Mohamed Abdellatif, Luiz Fernando Capretz, and Danny Ho (Department of Electrical & Computer Engineering, Western University, London, Ontario, Canada, [email protected], [email protected], [email protected])
    2015 IEEE/ACM 1st International Workshop on Big Data Software Engineering

    Abstract—Software Analytics (SA) is a new branch of big data analytics that has recently emerged (2011). What distinguishes SA from direct software analysis is that it links data mined from many different software artifacts to obtain valuable insights. These insights are useful for the decision-making process throughout the different phases of the software lifecycle. Since SA is currently a hot and promising topic, we have conducted a systematic literature review, presented in this paper, to identify gaps in knowledge and open research areas in SA. Because many researchers are still confused about the true potential of SA, we had to filter out available research papers to obtain the most SA-relevant work for our review. This filtration yielded 19 studies out of 135.

    … suitable and supportive insights and facts to software-industry stakeholders to make their decision making easier, faster, more precise, and more confident. The main difference between SA and direct software analysis is that, rather than just providing straightforward insight extraction, SA performs additional advanced steps. As clarified by Hassan [1], SA must provide visualization and useful interpretation of insights in order to facilitate decision making. This paper is organized as follows: In Section II, we illustrate our review methodology. Our review results are illustrated in Section III. Section IV presents the limitations of …
  • How to Catch ’Em All: WatchDog, a Family of IDE Plug-Ins to Assess Testing
    How to Catch ’Em All: WatchDog, a Family of IDE Plug-Ins to Assess Testing
    Moritz Beller,* Igor Levaja,* Annibale Panichella,* Georgios Gousios,# Andy Zaidman*
    *Delft University of Technology, The Netherlands; #Radboud University Nijmegen, The Netherlands
    {m.m.beller, a.panichella, a.e.zaidman}@tudelft.nl, [email protected], [email protected]

    ABSTRACT: As software engineering researchers, we are also zealous tool smiths. Building a research prototype is often a daunting task, let alone building an industry-grade family of tools supporting multiple platforms to ensure the generalizability of our results. In this paper, we give advice to academic and industrial tool smiths on how to design and build an easy-to-maintain architecture capable of supporting multiple integrated development environments (IDEs). Our experiences stem from WatchDog, a multi-IDE infrastructure that assesses developer testing activities in vivo and that over 2,000 registered developers use. To these software engineering practitioners, WatchDog provides real-time and aggregated feedback in the form …

    … users of WatchDog benefit from immediate testing and development analytics, a feature that neither Eclipse nor IntelliJ supports out of the box. After the introduction of the testing reports, even accomplished software engineers were surprised by their own recorded testing behavior, as a reaction from a software quality consultant and long-term WatchDog user exemplifies: “Estimated time working on tests: 20%, actual time: 6%. Cool statistics!” The key contributions of this paper are: 1) the introduction of WatchDog for IntelliJ, an instantiation of WatchDog's new multi-IDE framework, and 2) the description of a set of experiences and suggestions for crafting a multi-IDE plug-in infrastructure.
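The architecture the authors advocate generally reduces to one IDE-agnostic core behind thin per-IDE adapters. The sketch below shows that split schematically; the class and method names are invented, and WatchDog itself ships as Java-based IDE plug-ins rather than Python.

```python
# Schematic of the plug-in split the paper recommends: an IDE-agnostic
# core records testing activity, and each IDE contributes only a thin
# adapter. Class and method names are invented for illustration.
from abc import ABC, abstractmethod

class AnalyticsCore:
    """IDE-independent part: event storage and testing statistics."""
    def __init__(self):
        self.events = []
    def record(self, kind, duration_s):
        self.events.append((kind, duration_s))
    def share_of(self, kind):
        total = sum(d for _, d in self.events) or 1
        return sum(d for k, d in self.events if k == kind) / total

class IdeAdapter(ABC):
    """Thin per-IDE layer: only hooks into IDE-specific listener APIs."""
    def __init__(self, core):
        self.core = core
    @abstractmethod
    def install_listeners(self): ...

class EclipseAdapter(IdeAdapter):
    def install_listeners(self):
        self.core.record("test", 6)   # stand-in for a real test-run listener
        self.core.record("code", 94)  # stand-in for a real editing listener

core = AnalyticsCore()
EclipseAdapter(core).install_listeners()
print(f"time spent testing: {core.share_of('test'):.0%}")
```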
  • Software Runtime Analytics for Developers: Extending Developers’ Mental Models by Runtime Dimensions
    Software Runtime Analytics for Developers: Extending Developers' Mental Models by Runtime Dimensions
    Jürgen Cito. Dissertation (published version), University of Zurich, Faculty of Economics, Department of Informatics, 2018. Posted at the Zurich Open Repository and Archive (ZORA), University of Zurich, https://doi.org/10.5167/uzh-152966.

    Approved in February 2018 at the request of Prof. Dr. Harald C. Gall (University of Zurich, Switzerland), Dr. Philipp Leitner (Chalmers University of Technology, Sweden), and Prof. Dr. Nenad Medvidovic (University of Southern California, USA).

    Abstract: Software systems have become instrumental in almost every aspect of modern society. The reliability and performance of these systems play a crucial role in the everyday lives of their users. The performance of a software system is governed by its program code, executed in an environment.
  • Practical Experiences and Value of Applying Software Analytics to Manage Quality
    Practical Experiences and Value of Applying Software Analytics to Manage Quality
    Anna Maria Vollmer (Fraunhofer IESE, Kaiserslautern, Germany), Silverio Martínez-Fernández (Fraunhofer IESE, Kaiserslautern, Germany), Alessandra Bagnato (Softeam, Paris, France), Jari Partanen (Bittium, Oulu, Finland), Lidia López (UPC-BarcelonaTech, Barcelona, Spain), Pilar Rodríguez (University of Oulu, Oulu, Finland, pilar.rodriguez@oulu.fi)

    Abstract—Background: Despite the growth in the use of software analytics platforms in industry, little empirical evidence is available about the challenges that practitioners face and the value that these platforms provide. Aim: The goal of this research is to explore the benefits of using a software analytics platform for practitioners managing quality. Method: In a technology transfer project, a software analytics platform was incrementally developed between academic and industrial partners to address their software quality problems. This paper focuses on exploring the value provided by this software analytics platform in two pilot projects. Results: Practitioners emphasized major benefits, including the improvement of product quality and process performance and an increased awareness of product readiness.

    “… more actors via one or more media so that the technology recipient sustainably adopts the object in the recipient's context in order to evidently achieve a specific purpose” [8]. We report the industrial scenario and describe our experiences on the industry–academia journey in two pilot projects that used a software analytics platform to solve their quality management problems. Our previous work focused on incrementally developing a software analytics platform in a previous formative stage [9]. This study reports our experiences with the software analytics platform Q-Rapids (hereafter referred to as …
  • Software Analytics: Opportunities and Challenges
    Software Analytics: Opportunities and Challenges
    Olga Baysal (School of Computer Science, Carleton University, [email protected], olgabaysal.com, @olgabaysal) and Latifa Guerrouj (Département de GL et des TI, École de Technologie Supérieure, [email protected], latifaguerrouj.ca, @Latifa_Guerrouj)

    Slide deck. Topics covered: mining software repositories to detect hidden patterns and trends in development data (source code, tests, build/version control, configuration, documentation, issues, mailing lists, crash repositories, field logs, usage and runtime data); the problem that development artifacts and data are not by themselves actionable; how decisions drive development for developers, release engineers, managers, and QA (fixing defects, release readiness, test-suite effectiveness, risk, cost, operation, and planning), and how such decisions are often based on intuition or experience; software analytics as a solution, in industry and in research, enabling data-driven decision making and fact-based views of projects; benefits of software analytics; and the software artifacts it draws on (source code, execution traces, development history, bug reports, code reviews, developer activities, software forums, and software microblogs).
  • Mastering Software Testing with JUnit 5: Comprehensive Guide to Develop High Quality Java Applications
    Mastering Software Testing with JUnit 5: Comprehensive Guide to Develop High Quality Java Applications
    Boni García. Packt Publishing, Birmingham–Mumbai. First published October 2017 (production reference 1231017). ISBN 978-1-78728-573-6. www.packtpub.com

    Copyright © 2017 Packt Publishing. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews. Every effort has been made in the preparation of this book to ensure the accuracy of the information presented; however, the information contained in this book is sold without warranty, either express or implied.

    About the Author: Boni García has held a PhD in information and communications technology from the Technical University of Madrid (UPM), Spain, since 2011.
  • Developer Testing in the IDE: Patterns, Beliefs, and Behavior
    Developer Testing in the IDE: Patterns, Beliefs, and Behavior
    Moritz Beller, Georgios Gousios, Annibale Panichella, Sebastian Proksch, Sven Amann, and Andy Zaidman, Members, IEEE. IEEE Transactions on Software Engineering.

    Abstract—Software testing is one of the key activities to achieve software quality in practice. Despite its importance, however, we have a remarkable lack of knowledge on how developers test in real-world projects. In this paper, we report on a large-scale field study with 2,443 software engineers whose development activities we closely monitored over 2.5 years in four integrated development environments (IDEs). Our findings, which largely generalized across the studied IDEs and the programming languages Java and C#, question several commonly shared assumptions and beliefs about developer testing: half of the developers in our study do not test; developers rarely run their tests in the IDE; most programming sessions end without any test execution; only once they start testing do developers do it extensively; a quarter of test cases is responsible for three quarters of all test failures; 12% of tests show flaky behavior; Test-Driven Development (TDD) is not widely practiced; and software developers spend only a quarter of their time engineering tests, whereas they think they test half of their time. We compile these practices of loosely guiding one's development efforts with the help of testing into an initial summary on Test-Guided Development (TGD), a behavior we argue to be closer to the development reality of most developers than TDD.

    Index Terms—Developer Testing, Unit Tests, Testing Effort, Field Study, Test-Driven Development (TDD), JUnit, TestRoots WatchDog, KaVE FeedBag++.
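Findings such as the failure concentration and the 12% flakiness rate fall out of simple aggregations over per-test execution logs. A toy version, with invented run data and a deliberately simplified flakiness criterion:

```python
# Toy aggregation over per-test execution logs, of the kind behind the
# paper's failure-concentration and flakiness numbers. Data is invented.
from collections import Counter

# (test name, passed?) tuples collected from many IDE test runs
runs = [("t1", False), ("t1", False), ("t1", True),
        ("t2", True), ("t2", True), ("t3", False), ("t3", True),
        ("t4", True), ("t4", True), ("t4", True)]

failures = Counter(t for t, ok in runs if not ok)
total_failures = sum(failures.values())
name, count = failures.most_common(1)[0]
print(f"top failing test ({name}) causes {count / total_failures:.0%} of failures")

# A test is "flaky" here if it both passed and failed across runs;
# real studies additionally control for code changes between runs.
outcomes = {}
for t, ok in runs:
    outcomes.setdefault(t, set()).add(ok)
flaky = [t for t, o in outcomes.items() if o == {True, False}]
print(f"flaky tests: {len(flaky)}/{len(outcomes)} = {len(flaky) / len(outcomes):.0%}")
```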