FOCUS: ACTIONABLE ANALYTICS

Using Analytics to Guide Improvement during an Agile–DevOps Transformation

Barry Snyder, Fannie Mae
Bill Curtis, CAST

// Fannie Mae IT has transformed from a waterfall organization to a lean culture enabled by agile methods and DevOps. The article discusses how analytics were used, along with challenges in selecting measures and implementing analytics in an agile–DevOps transformation. //

Analytics1 have proven increasingly useful in industrial case studies for guiding adoption and evaluating the benefits of advanced software techniques.2,3 In 2015, Fannie Mae, a provider of liquidity for mortgages in the US housing market, undertook a major transformation from its traditional waterfall approach to a culture built on agile methods and DevOps practices.4 The transformation was motivated by Fannie Mae's need to adjust its IT capabilities to meet the requirements and pace of a rapidly evolving mortgage market. In 2012, Fannie Mae had only 10 teams using agile-like techniques. Releases took nine to 18 months, provisioning of environments took two to four months, and quality was not consistently measured.

Fannie Mae's IT department was a complex, technical-polyglot ecosystem composed of 461 applications, several hundred utilities, and almost 18,000 open source components. As part of reducing this complexity, Fannie Mae initiated two projects to implement agile methods and DevOps as enterprise capabilities. Executive management required periodic empirical evaluation of progress in quality, productivity, and delivery speed to monitor the effectiveness of the agile–DevOps deployment.

Implementing the Analytics Platform

Fannie Mae started by leveraging what already existed and filled the gaps with new solutions. While this got the transformation moving forward, it revealed that a highly customized solution can become an impediment. For example, the production release platform was built out of three point-solution products with significant custom scripts to unify them. While it conformed to Fannie Mae's tightly governed environment, it impeded transformation because a major update to the platform would take more than nine months when team needs were measured in weeks.

In comparison, the CAST Application Intelligence Platform (AIP)—adopted for application analysis and measurement—had no customization and was implemented out of the box. This platform was deployed in 2014 as an enterprise platform providing technology-independent structural-quality metrics that enabled consistent analysis and comparison across Fannie Mae's application portfolio. Structural quality represents how well the system is architected and the code is engineered. Since manual structural-quality analysis of entire applications was infeasible, an automated platform provided a common analytics solution across the enterprise. The only customization automated the AIP onboarding process with a framework built in house—a necessity for providing a self-service capability integrated into the DevOps toolchain.

AIP analyzes an entire multilayer, multilanguage application system, from the user interface to the database structure.5 To evaluate architecture, structural-quality analytics are developed during integration rather than during unit testing, and they replace some information previously provided by formal inspections. An abstract structural representation of the system is generated from this analysis and evaluated against more than 1,200 rules of good architectural and coding practice. Violations of these rules are weighted and aggregated into measures that highlight the operational and cost risks of the application system in five areas: robustness, security, performance efficiency, changeability, and transferability (or comprehensibility). These five measures are aggregated into a sixth measure—the total quality index (TQI)—that provides a summary structural-quality score.

Originally, analytics were delivered by a central team, so development teams were constrained in using measures by how quickly they were received. This bottleneck led to developing a self-service interface enabling teams to choose the frequency of scanning their applications and consume analytics at their own pace. This interface increased platform usage by 481%, from 340 scans during the year before the self-service interface to 1,636 scans during the year after deployment.
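As a rough illustration of this aggregation (not CAST's actual, proprietary scoring model), the rollup from weighted rule violations to factor scores and a summary index might look like the following Python sketch. The factor names match the article; the weights, normalization, and equal-weighted TQI are assumptions.

```python
# Sketch: roll weighted rule violations up into health-factor scores and a
# summary index. Illustrative only; the normalization and equal-weighted
# TQI are assumptions, not CAST AIP's actual model.
FACTORS = ("robustness", "security", "performance_efficiency",
           "changeability", "transferability")

def factor_scores(violations, weights, opportunities):
    """violations: {(factor, rule): count}
    weights: {rule: severity weight}
    opportunities: {factor: count of code locations where rules could fire}"""
    scores = {}
    for factor in FACTORS:
        penalty = sum(count * weights[rule]
                      for (f, rule), count in violations.items()
                      if f == factor)
        # Normalize by opportunity count; higher score means healthier code.
        scores[factor] = max(0.0, 100.0 * (1.0 - penalty / opportunities[factor]))
    return scores

def total_quality_index(scores):
    """Summary score across the five factors (equal weighting assumed)."""
    return sum(scores.values()) / len(scores)

scores = factor_scores(
    violations={("security", "sql-injection"): 3,
                ("robustness", "empty-catch-block"): 40},
    weights={"sql-injection": 9.0, "empty-catch-block": 2.0},
    opportunities={f: 1_000 for f in FACTORS},
)
print(f"TQI = {total_quality_index(scores):.1f}")
```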
Analyzing Productivity

Measuring productivity improvement was a critical challenge in monitoring the transformation. Several measures (story points, dollar spend, LOC, code review defects, code coverage, etc.) were evaluated and eliminated because they did not provide equivalent comparisons that were independent of technology and comparable across applications. For analyzing productivity, Fannie Mae chose to measure functional size rather than LOC. The Consortium for IT Software Quality (CISQ) recently produced a standard, through the Object Management Group, for Automated Function Points (AFPs)6 based on counting guidelines published by the International Function Point Users Group (IFPUG).7 Since AFPs were generated automatically by AIP when it analyzed the structural attributes of a code base, AFPs experienced rapid acceptance by early adopter teams.

AFPs provided an outcome-based evaluation of productivity regarding the amount of functionality delivered per unit of effort in sprints. After aggregating results across all projects, the enterprise could rapidly adjust activities to address progress shortcomings and support transformation accelerators. In turn, teams could rapidly pinpoint which process and technology improvements drove the greatest impact.

For example, one of three squads working on elements of the same product line had adopted behavior-driven-development techniques using Selenium for testing. Analyzing data on effort, size, and development context combined from different sources indicated that the squad's productivity rose by 4.9% over three sprints due to a shortened customer feedback cycle. This led the remaining squads to adopt Selenium.

Analyzing Structural Quality

Structural-quality measures were joined with defect ratio, dollar spend, release cycle time, build count, and other measures to analyze the impact of agile–DevOps transformation practices. These analyses provided equivalent comparisons regardless of language, analyzed the complexities of multilayer transactions, and could be combined with other measures to calculate ratios such as spend per AFP, defect ratio per AFP, etc.

Application teams scanned builds one or more times during a sprint to detect structural-quality flaws and produce the analytic measures. Scan outputs provided early detection of structural flaws, enabling teams to address the most critical issues before release. Before analyzing structural quality, teams would occasionally discover these flaws in testing, but most often after an application was released into production. Most teams began scanning at a minimum of once per sprint (every two weeks). Those scanning several times a week or even daily could fix critical defects within a day or two rather than waiting until the next sprint.
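Once per-release AFP counts are available, ratios such as spend per AFP and critical defects per AFP are straightforward to compute. A minimal sketch follows; the field names and figures are hypothetical, not Fannie Mae's schema:

```python
from dataclasses import dataclass

@dataclass
class ReleaseMetrics:
    afps_delivered: int    # AFPs added, modified, or deleted in the release
    spend_usd: float       # dollar spend attributed to the release
    critical_defects: int  # structural flaws rated 7-9 on the 9-point scale

    def spend_per_afp(self) -> float:
        return self.spend_usd / self.afps_delivered

    def defects_per_afp(self) -> float:
        return self.critical_defects / self.afps_delivered

# Hypothetical release: 420 AFPs delivered for $180,000 with 9 critical flaws.
r = ReleaseMetrics(afps_delivered=420, spend_usd=180_000.0, critical_defects=9)
print(f"spend/AFP = ${r.spend_per_afp():,.0f}; defects/AFP = {r.defects_per_afp():.3f}")
```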


These measures, along with detailed information about the location and severity of defects, were provided to application teams via engineering dashboards. Application teams used these measures to iteratively improve quality in the five structural-quality dimensions while monitoring overall quality via the TQI score. Structural-quality analytics enabled teams to look for code quality issues presenting short-term risk while simultaneously addressing long-term architectural limitations and costly maintenance early in construction. Fannie Mae intended to improve the structural quality of all applications to a "green" state—roughly a 70th-percentile score on the TQI. At the start of the journey in 2015, most applications were "yellow" or "red" (scores averaging the 46th percentile on the TQI). Currently, nearly all applications have achieved a green state.

Aligning Analytics across the Organization

Structural-quality analytics provided a multidimensional, enterprise view of the code base via

• a visual heat map—the red-to-green map of applications for both quality and size;
• enterprise trending of scores—a 2D line chart of overall code quality over time for all applications, presented by portfolio, subportfolio, or application;
• aggregated structural flaws across the enterprise—the top 10 quality issues across all applications; and
• filters based on portfolio and maturity of transformation—indicating which applications were agile versus waterfall or DevOps versus non-DevOps.

However, the structural-quality output did not align with other enterprise metrics. Aligning the metrics at the enterprise level and across the DevOps tools was a challenge, as the tools were in silos. As the transformation evolved over the first two years, each platform built one-off solutions to collate its data with external enterprise metrics—this resulted in metric silos.

To eliminate the silos, Fannie Mae constructed an enterprise data mart to align measures and analytics from all enterprise tools covering financials, enterprise program management, software development (DevOps, test automation, test data management, agile methods, code quality, etc.), and operations. The collation of data across platforms enabled holistic insight into the transformation's progress. Additionally, relationships between data elements could be investigated rapidly.

For example, one application had a production outage traced to potentially defective code. By relating the incident ticket to data from the scan prior to the production release, the team discovered it had opted to mark a reported defect as a false positive. This granular analysis enabled the team to adjust its code development practices regarding findings flagged as false positives.
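The outage investigation just described amounts to joining two of the data mart's feeds, operational incidents and scan findings, on application, build, and component. A sketch of that join with pandas; the tables, column names, and rule identifiers are invented for illustration:

```python
import pandas as pd

# Invented extracts from the enterprise data mart (not Fannie Mae's schema).
incidents = pd.DataFrame({
    "app_id": ["app-17"],
    "incident_id": ["INC-4411"],
    "released_build": ["b305"],
    "suspected_component": ["PaymentDAO"],
})
scan_findings = pd.DataFrame({
    "app_id": ["app-17", "app-17"],
    "build": ["b305", "b305"],
    "component": ["PaymentDAO", "RateEngine"],
    "rule": ["close-jdbc-connections", "avoid-catching-throwable"],
    "triage": ["false_positive", "open"],
})

# Relate each production incident to findings from the scan of the build
# that was released, as in the outage example above.
traced = incidents.merge(
    scan_findings,
    left_on=["app_id", "released_build", "suspected_component"],
    right_on=["app_id", "build", "component"],
)

# Findings the team had dismissed as false positives warrant a second look.
print(traced.loc[traced["triage"] == "false_positive",
                 ["incident_id", "component", "rule"]])
```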
This specific example reflects how teams leveraged the insight provided by an enterprise data mart to address specific needs. From an enterprise perspective, quality metrics could now be aligned with operational tickets, finances, and program management. This alignment enabled measuring productivity from a value (AFPs per dollar) and quality (defects per AFP) perspective. In turn, the enterprise could measure productivity and quality efficiency as trailing indicators of transformation success to demonstrate return on investment.

Analyzing Transformation Improvement

Figure 1 displays how Fannie Mae measured and monitored productivity in functionality delivered (AFPs added, modified, or deleted) and quality (critical defects per AFP) as the transformation was deployed across applications. Critical defects were those rated from 7 to 9 on a 9-point scale. During the first quarter of 2015, data aggregated across applications indicated that the transformation had begun delivering substantially more functionality without a decrease in quality. Applications achieved between a 30% and 48% improvement in quality when measured by critical defects per AFP compared to the 2014 baseline. An average productivity gain of 28% was achieved by teams that implemented repeatable agile practices, automated structural-quality analysis, test automation, and consistent use of the DevOps platform for continuous integration and delivery.

The agile–DevOps transformation allowed Fannie Mae to create an internal software supply chain with automated handoffs between functions to reduce delays. In comparison to 2012, Fannie Mae achieved 21 times more builds per month (more than 19,000) with half the previous staffing, while improving quality.

Using Analytic Baselines for Teams

The structural-quality analytics aggregated across applications provided a baseline for evaluating the effectiveness of each application team's adoption of technologies and practices.

[Figure 1 is a line chart plotting the number of AFPs added, modified, or deleted per release (left axis, 0 to 900) and the number of critical findings per AFP (right axis, 0 to 25) for each quarter from Q1 2015 through Q1 2016. Annotations mark the adoption of CAST AIP by agile projects and the establishment of the enterprise agile CoE.]

FIGURE 1. Improvements in productivity and quality during the agile–DevOps transformation. AFP = Automated Function Points, CAST AIP = CAST Application Intelligence Platform, and CoE = Center of Excellence.

team’s adoption of technologies and only infrequent scans averaged a comparison, projects using both So- practices. Aligning the code qual- 9.8% improvement in TQI. narQube and AIP as complemen- ity platform with the enterprise data • Teams performing frequent tary prebuild and postbuild tools mart enabled a correlation between scans coupled with the con- averaged 24.7% structural-quality the quality increases and the agile tinuous integration–deployment improvements measured by TQI. adoption of practices such as team platform averaged a 24% im- Consequently, the project returned product ownership, MVP-driven provement in TQI. to using SonarQube and AIP in sprints (an MVP is a minimally vi- • Teams performing frequent scans tandem. able production-ready product that coupled with the full spectrum provides value to customers), test of agile–DevOps technologies An Application Team automation, and daily continuous and practices averaged a 30% to Example integration. Since these practices 48% improvement in TQI, with Figure 2 displays the structural-quality were adopted simultaneously, it was outliers as high as 70%. results for an application team that not possible to correlate individ- underwent the agile–DevOps trans- ual changes with the measured im- Teams began using data to evalu- formation between 2015 and 2016. provements. However, Fannie Mae ate the effectiveness of their agile– Prior to adopting agile–DevOps, the established an enterprise-level rela- DevOps practices.8 For example, application release cycle was lengthy. tionship between overall structural one project had abandoned the use The team achieved only a 10% im- quality measured by the TQI and the of the prebuild static-analysis tool provement in application quality, with completeness of the practices being SonarQube9 in favor of using AIP. no means to analyze the drivers of adopted: The project’s TQI score dropped improvement. from 18% to 11% because more When the team began using • Teams not practicing agile meth- component-level defects escaped structural-quality analytics, it could ods or DevOps and conducting detection and entered builds. In measure and address defect injection

[Figure 2 charts the number of AFPs added, modified, or deleted (0 to 140) and the number of critical findings per AFP (0 to 8) across releases r1 (waterfall, 18 months), r1.1 (4 sprints, 2 months), and r2, r2.1, and r3 (2 sprints, 1 month each). Critical findings per AFP fall from 7.6 in r1 to 4.5 in r3. Annotations note that the project first used CAST AIP on a limited basis and realized a 10% quality gain through standard project practices; after the project adopted agile and DevOps, initial releases took longer as the team adapted to the new practices, and then significant quality gains came with incremental rapid releases.]

FIGURE 2. A case study of the transition from waterfall to agile practices with DevOps automation.

The team realized a 41% reduction in defects per AFP by release r3 at the end of 2016. Coupled with this quality improvement, the team delivered more functionality (as measured by AFPs) in shorter time periods. That is, the aggregate functionality delivered in the one- to two-month releases during the first six months of agile–DevOps adoption (r1.1 to r3) was greater than the functionality delivered in the previous 18-month release (r1), resulting in a 30% improvement in productivity.
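Because the article defines productivity as functionality delivered per unit of effort, the 30% figure can be reproduced with a simple rate comparison. A worked sketch, reusing the invented release totals from the earlier sketch; the effort figures are hypothetical, as the article publishes only the resulting percentage:

```python
# Productivity = AFPs delivered per person-month of effort, comparing the
# 18-month waterfall release (r1) with the first six months of agile-DevOps
# releases (r1.1-r3). Totals are hypothetical; only the ~30% gain is reported.
waterfall_afps, waterfall_effort = 130, 40.0  # release r1, person-months
agile_afps, agile_effort = 330, 78.0          # releases r1.1-r3 combined

waterfall_rate = waterfall_afps / waterfall_effort  # 3.25 AFPs/person-month
agile_rate = agile_afps / agile_effort              # ~4.23 AFPs/person-month
gain = agile_rate / waterfall_rate - 1.0
print(f"Productivity gain: {gain:.0%}")             # -> 30%
```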

Several key lessons were learned through this agile–DevOps transformation:

• Early gains can be achieved by leveraging existing platforms, but only if those platforms can be easily adapted to future evolving needs.
• When transforming at scale, measures need to be consistent across all applications to maximize comparability.
• Measures used early in the transformation will be replaced with metrics better aligned across the enterprise as practices evolve.

Analytics developed from automated measures of size and structural quality contributed to productivity and cycle-time gains during Fannie Mae's agile–DevOps transformation. A significant portion of the initial gain resulted from earlier detection of structural flaws and analysis of injection patterns, which reduced corrective-maintenance effort. Aggregating these analytics allowed executive management to maintain visibility into the transformation's progress and empirically justify the investment in agile–DevOps practices.

Acknowledgments
Special thanks to Alan DeFelice for the visuals used in this article.

References
1. T. Menzies and T. Zimmermann, "Software Analytics: So What?," IEEE Software, vol. 30, no. 4, 2013, pp. 31–37.
2. D. Zhang et al., "Software Analytics in Practice," IEEE Software, vol. 30, no. 5, 2013, pp. 30–37.
3. N. Mellegård et al., "Impact of Introducing Domain-Specific Modelling in Software Maintenance: An Industrial Case Study," IEEE Trans. Software Eng., vol. 42, no. 3, 2016, pp. 248–263.
4. G. Kim et al., The DevOps Handbook, IT Revolution Press, 2016.
5. B. Curtis, J. Sappidi, and A. Szynkarski, "Estimating the Principal of an Application's Technical Debt," IEEE Software, vol. 29, no. 6, 2012, pp. 34–42.
6. Automated Function Points (AFP), Version 1, Object Management Group, 2014; www.omg.org/spec/AFP.
7. Function Points Counting Practices Manual, Release 4.3, Int'l Function Point Users Group, 2009.
8. A. Elbanna and S. Sarkar, "The Risks of Agile Software Development: Learning from Developers," IEEE Software, vol. 33, no. 5, 2016, pp. 72–79.
9. G.A. Campbell and P.P. Papapetrou, SonarQube in Action, Manning, 2014.

About the Authors

BARRY SNYDER is a product manager for the DevOps release pipeline and code quality in Development Services at Fannie Mae and a key contributor to the DevOps transformation. His research interests are DevOps, enterprise transformation, and machine learning. Snyder received a master's in public administration in IT systems from Pennsylvania State University. Contact him at barry_t_snyder@fanniemae.com.

BILL CURTIS is a senior vice president and chief scientist at CAST and the executive director of the Consortium for IT Software Quality (CISQ). He leads research on software measurement, quality, and pathology, as well as maturity models and process improvement. He leads the development of international standards through CISQ. Curtis received a PhD in organizational psychology and statistics from Texas Christian University. He's an IEEE Fellow. Contact him at [email protected].
