Forrester Research on DevOps Quality Metrics that Matter

75 COMMON METRICS — RANKED BY INDUSTRY EXPERTS

The way that we develop and deliver has changed dramatically in the past 5 years—but the metrics we use to measure quality remain largely the same. Despite seismic shifts in business expectations, development methodologies, system architectures, and team structures, most organizations still rely on quality metrics that were designed for a much different era.

Every other aspect of application delivery has been scrutinized and optimized as we transform our processes for DevOps. Why not put quality metrics under the microscope as well?

Are metrics like number of automated tests, test case coverage, and pass/fail rate important in the context of DevOps, where the goal is immediate insight into whether a given release candidate has an acceptable level of risk? What other metrics can help us ensure that the steady stream of updates doesn't undermine the very user experience that we're working so hard to enhance?

To provide the DevOps community an objective perspective on what quality metrics are most critical for DevOps success, Tricentis commissioned Forrester to research the topic. The goal was to analyze how DevOps leaders measured and valued 75 quality metrics (selected by Forrester), then identify which metrics matter most for DevOps success.

THE PROCESS

1. Survey 603 global enterprise leaders responsible for their firms' DevOps strategies.

2. From that sample, identify the firms with mature and successful DevOps adoptions (157 met Forrester's criteria for this distinction).

3. Learn what quality metrics those experts actually measure, and how valuable they rate each metric that they regularly measure.

4. Use those findings to rate and rank each metric's usage (how often experts use the metric) and value (how highly experts value the metric).

5. Compare the DevOps experts' quality metric usage vs. that of DevOps laggards. If there was a significant discrepancy, the metric is considered a DevOps differentiator.

We plotted all the metrics in each category in a quadrant with 4 sections:

VALUE ADDED: Metrics that are used frequently by DevOps experts and consistently rated as valuable by the organizations who measure them.

HIDDEN GEM: Metrics that are not used frequently by DevOps experts, but are consistently rated as valuable by the organizations who measure them.

OVERRATED: Metrics that are used frequently by DevOps experts, but not rated as valuable by the organizations who measure them.

DISTRACTION: Metrics that are not used frequently by DevOps experts, and not rated as valuable by the organizations who measure them.

THE 3 KEY TAKEAWAYS

1. Understanding of business risk is a critical factor in DevOps success.

2. Experts focus more on contextual metrics (e.g., requirements coverage) while others focus on "counting" metrics (e.g., number of tests).

3. Experts are more likely to measure the user experience across an end-to-end transaction while others rely on application-specific or team-specific metrics.

THE 20 MOST IMPORTANT METRICS

Globally, the following 20 metrics were ranked as the most valuable by the DevOps experts who actually measure them.

BUILD
1. Number of automated tests prioritized by risk
2. Successful code builds
3. Unit test pass/fail rate
4. Total # of defects
5. Code coverage

FUNCTIONAL VALIDATION
1. Requirements covered by tests
2. Total critical functional defects
3. Pass/fail rate
4. Defect density
5. Risk coverage

INTEGRATION TESTING
1. Requirements covered by tests
2. New defects
3. Defect density
4. Pass/fail rate
5. Code coverage / risk coverage

END-TO-END REGRESSION TESTING
1. Percent of automated end-to-end test cases
2. Requirements covered by tests
3. Total # of defects
4. Number of test cases executed
5. Test case coverage

QUALITY METRICS DEEP DIVE

The individual metrics fall into 4 major categories:

BUILD
FUNCTIONAL VALIDATION
INTEGRATION TESTING
END-TO-END REGRESSION TESTING

Within each category, metrics are listed according to their value ranking (highest value rankings listed first).

BUILD

BUILD RANKING HEAT MAP (Usage Rank / Value Rank)

Automated tests prioritized by risk: 13 / 1
Successful code builds: 1 / 2
Unit test pass/fail rate: 9 / 3
Total number of defects: 5 / 4
Code coverage: 10 / 5
Test cases executed: 12 / 6
Build failure rate: 6 / 7
Defect status by severity: 1 / 8
Static analysis results: 10 / 9
New defects: 7 / 10
Test case coverage: 7 / 10
Unit test coverage: 14 / 12
Requirements covered by tests: 4 / 13
Defect status by priority: 15 / 14
Code churn: 17 / 15
Tests did not run: 18 / 16
Number of automated tests: 3 / 17
Test hygiene metrics: 16 / 18

Forrester commentary

When measuring builds, done well matters. Counting unit tests is a waste of time, but understanding change impact matters. Tracking unit tests prioritized by risk is the key. As the code base evolves, developers and testers need immediate feedback about change impact. This feedback is significantly more actionable if prioritized by level of risk.

Sixty-three percent of these firms consider the number of unit tests prioritized by risk as one of their top desired metrics. But far fewer can actually do so — while 34% of advanced DevOps firms track the number of unit tests run, only 27% prioritize by risk. And less advanced DevOps firms use it even less — just 15% can track the metric today. Other important metrics tracked in builds focus on ensuring code quality — like the number of successful code builds (61%), unit test pass/fail rate (60%), and total number of defects identified (59%).

BUILD QUADRANT MAPPING

Here's a quick look at how Build metrics are positioned vs. one another—based on the raw data collected from DevOps experts. DevOps Differentiators are highlighted in green. (Quadrant chart: axes MEASURED and VALUED; quadrants HIDDEN GEM, VALUE ADDED, DISTRACTION, OVERRATED.)

Automated Tests Prioritized by Risk
Usage rank #13 (27%), Value rank #1 (63%)
This metric goes beyond simply counting the number of automated tests and introduces the concept of risk. Teams using this metric have business risks clearly defined and tests correlated to those risks. With this insight, they can focus their test automation efforts on tests that matter most to the business.

Successful Code Builds
Usage rank #1 (36%), Value rank #2 (61%)
This metric measures the number of builds that are free of any errors or warnings. Teams might choose to "raise the bar" for successful code builds by configuring builds to fail if additional quality gates (e.g., static analysis) are not satisfied.

Unit Test Pass/Fail Rate
Usage rank #9 (30%), Value rank #3 (60%)
This metric measures the percentage of unit tests that pass or fail. It is notoriously easy to manipulate. For example, the team can increase the pass rate (or reduce the fail rate) by simply adding redundant passing tests that do not add value.
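To make the idea concrete, here is a minimal sketch of how a team might risk-prioritize its automated tests. The risk IDs, weights, and function names are illustrative assumptions, not something the report defines:

```python
from dataclasses import dataclass, field

@dataclass
class AutomatedTest:
    name: str
    risk_ids: list = field(default_factory=list)  # business risks this test covers

# Hypothetical risk weights defined by the business (higher = more damage if broken)
RISK_WEIGHTS = {"checkout": 10, "login": 8, "profile-edit": 3}

def prioritize_by_risk(tests):
    """Order tests so the ones covering the highest-weight business risks run first."""
    def score(test):
        return sum(RISK_WEIGHTS.get(r, 0) for r in test.risk_ids)
    return sorted(tests, key=score, reverse=True)

def percent_prioritized(tests):
    """Share of automated tests linked to at least one defined business risk."""
    linked = sum(1 for t in tests if any(r in RISK_WEIGHTS for r in t.risk_ids))
    return 100.0 * linked / len(tests) if tests else 0.0

suite = [
    AutomatedTest("test_checkout_total", ["checkout"]),
    AutomatedTest("test_login_lockout", ["login"]),
    AutomatedTest("test_profile_avatar", ["profile-edit"]),
    AutomatedTest("test_helper_util", []),  # not linked to any business risk
]

for t in prioritize_by_risk(suite):
    print(t.name)
print(f"{percent_prioritized(suite):.0f}% of automated tests are risk-prioritized")
```

In practice the risk weights would come from the team's own risk assessment rather than a hard-coded dictionary; the point is simply that each test carries an explicit link to the business risk it protects.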


Total Number of Defects
Usage rank #5 (32%), Value rank #4 (59%)
This metric measures the number of defects identified during each build and highlights how the total number of defects either diminishes or increases as teams get closer to their final production build. This can include defects exposed by unit tests, static analysis, or other testing approaches.

Code Coverage
Usage rank #10 (28%), Value rank #5 (58%)
This metric measures what percentage of the source code is exercised by tests. Code coverage can be measured in a number of different ways; for example, statement (line) coverage, branch coverage, MC/DC coverage, etc. Code coverage is primarily used with unit tests, but can expand into functional validation, integration testing, and end-to-end testing.

Test Cases Executed
Usage rank #12 (28%), Value rank #6 (57%)
This metric measures the total number of tests executed. However, it does not account for test redundancy, test effectiveness, or test flakiness. Having more tests does not necessarily add more value, nor does it automatically increase a team's chances of releasing faster.


Build Failure Rate
Usage rank #6 (32%), Value rank #7 (56%)
This metric measures the percentage of builds that fail over a given period of time. An increase in build failure rates likely indicates that the application's overall "health" is deteriorating.

Defect Status by Severity
Usage rank #1 (36%), Value rank #8 (56%)
This metric tracks how the number of reported defects falls into each severity category. Defect severity categories can be created and defined according to a team's or organization's custom scale. The severity indicates how much damage the defect is expected to cause.

Static Analysis Results
Usage rank #10 (28%), Value rank #9 (53%)
This metric refers to the output of a static analysis tool that scans the source code for weaknesses. Static analysis tools are notorious for reporting "noise" (issues you do not care about) and "false positives." If not controlled, this will limit the value of measuring this metric.
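For teams that want to track build failure rate from their CI history, here is a small sketch. The build records and field names are illustrative; real data would come from your CI server's API, not from a hard-coded list:

```python
from datetime import datetime, timedelta

# Illustrative build records; in practice these would come from your CI server.
builds = [
    {"finished": datetime(2024, 5, 1, 9, 0), "status": "passed"},
    {"finished": datetime(2024, 5, 1, 13, 0), "status": "failed"},
    {"finished": datetime(2024, 5, 2, 10, 0), "status": "passed"},
    {"finished": datetime(2024, 5, 3, 16, 0), "status": "failed"},
]

def build_failure_rate(builds, since):
    """Percentage of builds finished since `since` that failed."""
    window = [b for b in builds if b["finished"] >= since]
    if not window:
        return 0.0
    failed = sum(1 for b in window if b["status"] == "failed")
    return 100.0 * failed / len(window)

one_week_ago = datetime(2024, 5, 4) - timedelta(days=7)
print(f"Build failure rate (last 7 days): {build_failure_rate(builds, one_week_ago):.1f}%")
```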


New Defects
Usage rank #7 (31%), Value rank #10 (51%)
This metric tracks the number of new defects reported in a given build. It is typically used to assess if changes are negatively impacting quality. Together with "Open Defects" and "Resolved Defects," it helps identify how the build quality is trending.

Test Case Coverage
Usage rank #7 (31%), Value rank #10 (51%)
This metric measures the effectiveness of tests by looking at how well test cases cover the application's functional requirements. It can be measured at different granularities, depending on what's most important for your organization.

Unit Test Coverage
Usage rank #14 (24%), Value rank #12 (50%)
This metric measures how well unit test cases cover code. Like test coverage, it can be measured in a number of different ways; for example, statement (line) coverage, branch coverage, MC/DC coverage, etc. Unit test coverage is a type of code coverage.


Requirements Covered by Tests
Usage rank #4 (34%), Value rank #13 (49%)
This metric looks at the percentage of requirements that are correlated to test cases. It can be measured at different granularities, depending on what's most important for your organization.

Defect Status by Priority
Usage rank #15 (23%), Value rank #14 (49%)
This metric tracks the status (open, closed, etc.) of the defects in each defect priority category. A defect's priority indicates how urgently it should be fixed. It is commonly used in concert with severity (e.g., a high-severity defect would likely be assigned a high priority).

Code Churn
Usage rank #17 (19%), Value rank #15 (48%)
This metric measures the degree to which the code base changes over a given period. It can be measured in terms of code churn size, type, breadth, and depth.
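One common way to approximate churn size is to total the added and deleted lines recorded in version control over a time window. The sketch below shells out to git; the 30-day window and the added-plus-deleted definition of churn are assumptions for illustration, not the report's definition:

```python
import subprocess

def code_churn(since="30 days ago"):
    """Sum of lines added and deleted across all commits in the window
    (one simple notion of churn size). Must run inside a git working copy."""
    out = subprocess.run(
        ["git", "log", f"--since={since}", "--numstat", "--format="],
        capture_output=True, text=True, check=True,
    ).stdout
    added = deleted = 0
    for line in out.splitlines():
        parts = line.split("\t")
        # numstat lines look like "<added>\t<deleted>\t<path>"; binary files show "-"
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            added += int(parts[0])
            deleted += int(parts[1])
    return added + deleted

if __name__ == "__main__":
    print(f"Code churn over the last 30 days: {code_churn()} lines")
```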


Tests Did Not Run
Usage rank #18 (17%), Value rank #16 (48%)
This metric tracks the number of tests that were slated for execution, but did not complete (for either internal or external reasons). Tests might not run due to CI <> tool integration issues, test environment issues, test data issues, a problem with how the test was written, limited testing time, etc.

Number of Automated Tests
Usage rank #3 (34%), Value rank #17 (44%)
This metric counts the number of automated tests performed for the specified build. However, it does not account for test redundancy, test effectiveness, or test flakiness. Having more tests does not necessarily add more value, nor does it automatically increase a team's chances of releasing faster.

Test Hygiene
Usage rank #16 (21%), Value rank #18 (34%)
This metric measures test tidiness, usefulness, and effectiveness. For example, a test with a high hygiene score is most likely deemed useful, effective, and easily reproduced/understood/extended across the team.

FUNCTIONAL VALIDATION

FUNCTIONAL VALIDATION RANKING HEAT MAP (Usage Rank / Value Rank)

Requirements covered by tests: 2 / 1
Critical defects: 13 / 2
Pass/fail rate: 1 / 3
Defect density: 6 / 4
Risk coverage: 4 / 5
Blocked test cases: 3 / 6
Code coverage: 7 / 7
Open defects: 4 / 8
New defects: 7 / 9
Planned test case coverage: 13 / 10
Total defects: 13 / 10
New critical defects: 10 / 12
Test effectiveness: 9 / 13
Release readiness: 11 / 14
Test cases executed: 19 / 15
Build verification tests: 18 / 16
Test efficiency: 17 / 17
Test case coverage: 11 / 18
Test execution time: 16 / 19

Forrester commentary

When functional testing kicks in, user story/requirements coverage gets the focus. The top goal for testers in this stage is to minimize risk by ensuring that the functionality expressed in user stories works as expected. Extending this coverage concept to business risk coverage is an additional metric that leading DevOps firms execute to advance release automation.

Running functional tests and checking against covered requirements (69%), the density of bugs (62%) or number of functional defects found (66%), and the ratio of tests passed vs. failed (64%) are metrics that successful teams identify as important to manage risk and track quality during functional testing.

FUNCTIONAL VALIDATION QUADRANT MAPPING

Here's a quick look at how Functional Validation metrics are positioned vs. one another—based on the raw data collected from DevOps experts. DevOps Differentiators are highlighted in green. (Quadrant chart: axes MEASURED and VALUED; quadrants HIDDEN GEM, VALUE ADDED, DISTRACTION, OVERRATED.)

Requirements Covered by Tests
Usage rank #2 (40%), Value rank #1 (69%)
This metric looks at the percentage of requirements that are correlated to functional validation tests. It can be measured at different granularities, depending on what's most important for your organization.

Critical Defects
Usage rank #13 (21%), Value rank #2 (66%)
This metric tracks how the total amount of critical functional defects changes over time. Critical functional defects are estimated to cause severe damage to the user experience and/or the organization delivering the application.

Pass/Fail Rate
Usage rank #1 (46%), Value rank #3 (64%)
This metric measures the percentage of functional tests that pass or fail. It is notoriously easy to manipulate. For example, the team can increase the pass rate (or reduce the fail rate) by simply adding redundant passing tests that do not add value.
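The calculation behind requirements coverage is a simple ratio over a traceability mapping. Here is a minimal sketch; the requirement and test IDs are illustrative, and most teams would pull this mapping from their test management or ALM tool rather than a dictionary:

```python
# Illustrative traceability data: requirement ID -> test case IDs linked to it.
requirement_links = {
    "REQ-101": ["TC-1", "TC-2"],
    "REQ-102": ["TC-3"],
    "REQ-103": [],          # requirement with no linked test
    "REQ-104": ["TC-4"],
}

def requirements_covered_by_tests(links):
    """Percentage of requirements that have at least one correlated test case."""
    covered = sum(1 for tests in links.values() if tests)
    return 100.0 * covered / len(links) if links else 0.0

print(f"Requirements covered by tests: {requirements_covered_by_tests(requirement_links):.0f}%")
```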


Defect Density
Usage rank #6 (33%), Value rank #4 (62%)
This metric measures the quality of the application by looking at the total number of reported defects divided by application size. It is typically reported in terms of defects per lines of code (LOC).

Risk Coverage
Usage rank #4 (34%), Value rank #5 (62%)
This metric tracks how well tests cover the application's business risks. As a prerequisite, risks must be defined and associated with the application's functional components. For example, if a high risk application area is not tested, you might have high test coverage but low risk coverage.

Blocked Test Cases
Usage rank #3 (38%), Value rank #6 (60%)
This metric measures functional test cases that could not be executed due to an external reason. For example, a test might be blocked because an environment is not ready, or because another failure prevents the test suite from reaching and executing it.
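Since defect density is simply reported defects divided by application size, the computation fits in a few lines. The defects-per-KLOC normalization below is a common convention, not something the report prescribes:

```python
def defect_density(defect_count, lines_of_code):
    """Defects per 1,000 lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defect_count / (lines_of_code / 1000.0)

# Example: 42 reported defects in a 120,000-line application -> 0.35 defects per KLOC.
print(f"{defect_density(42, 120_000):.2f} defects per KLOC")
```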


Code Coverage
Usage rank #7 (31%), Value rank #7 (60%)
This metric measures how well functional test cases cover code. It can be measured in a number of different ways; for example, statement (line) coverage, branch coverage, MC/DC coverage, etc.

Open Defects
Usage rank #4 (34%), Value rank #8 (56%)
This metric tracks the number of reported but unresolved defects found during functional validation. A high number of open defects amounts to a large "technical debt."

New Defects
Usage rank #7 (31%), Value rank #9 (55%)
This metric tracks the number of new defects reported during functional testing. It is typically used to assess if changes are negatively impacting the quality of the release. Together with "Open Defects" and "Resolved Defects," it helps identify how quality is trending.


Planned Test Case Coverage
Usage rank #13 (21%), Value rank #10 (50%)
This metric tracks the target level of test coverage. The actual test coverage can be compared to this number to determine if sufficient test coverage is achieved.

Total Defects
Usage rank #13 (21%), Value rank #10 (50%)
This metric measures the sum of all defects identified during functional validation. This can include defects exposed by automated tests as well as other testing approaches (e.g., exploratory testing).

New Critical Defects
Usage rank #10 (24%), Value rank #12 (47%)
This metric measures the number of new defects with a "critical" severity reported during functional validation. It is typically used to assess if changes are negatively impacting the quality of the release. Together with "Open Defects" and "Resolved Defects," it helps identify how quality is trending.


Test Effectiveness
Usage rank #9 (28%), Value rank #13 (41%)
This metric measures test effectiveness according to a custom set of criteria (defined by the team or organization). The measurement method varies significantly across teams, projects, divisions, and organizations. Test effectiveness tends to focus on achieving the desired result, while test efficiency aims to achieve the desired result with the fewest resources.

Release Readiness
Usage rank #11 (22%), Value rank #14 (39%)
This metric measures release readiness according to a custom set of criteria (defined by the team or organization). The measurement method varies significantly across teams, projects, divisions, and organizations. A high release readiness score indicates a high confidence that the application can be released with an acceptable level of business risk.

Test Cases Executed
Usage rank #19 (18%), Value rank #15 (33%)
This metric measures the total number of tests executed. However, it does not account for test redundancy, test effectiveness, or test flakiness. Having more tests does not necessarily add more value, nor does it automatically increase a team's chances of releasing faster.


Build Verification Tests
Usage rank #18 (19%), Value rank #16 (32%)
This metric measures the number of tests run to verify that the build is testable (before the actual functional validation begins). These tests are commonly a small subset from the functional validation test suite.

Test Efficiency
Usage rank #17 (20%), Value rank #17 (30%)
This metric measures functional test efficiency according to a custom set of criteria (defined by the team or organization). The measurement method varies significantly across teams, projects, divisions, and organizations. Test effectiveness tends to focus on achieving the desired result, while test efficiency aims to achieve the desired result with the fewest resources.

Test Case Coverage
Usage rank #11 (22%), Value rank #18 (24%)
This metric measures the effectiveness of tests by looking at how well test cases cover the application's functional requirements. It can be measured at different granularities, depending on what's most important for your organization.


Test Execution Time
Usage rank #16 (21%), Value rank #19 (19%)

This metric measures the total time required to execute the test suite. When measuring test execution time, it’s important to consider variations in the number of tests executed as well as blocked test cases.

INTEGRATION TESTING

INTEGRATION TESTING RANKING HEAT MAP (Usage Rank / Value Rank)

Requirements covered by tests: 3 / 1
New defects: 8 / 2
Defect density: 2 / 3
Pass/fail rate: 1 / 4
Code coverage: 6 / 5
Risk coverage: 6 / 5
Open defects: 5 / 7
Blocked test cases: 4 / 8
Total critical defects: 13 / 9
New critical defects: 18 / 10
Planned test case coverage: 12 / 11
Test efficiency: 10 / 12
Test execution time: 9 / 13
Test case coverage: 15 / 14
Test effectiveness: 14 / 15
Build verification tests: 16 / 16
Test cases executed: 17 / 17
Total defects: 19 / 18
Release readiness: 10 / 19

Forrester commentary

To deal with modern distributed architectures, measuring integration testing and API tests wins all around. With application architectures becoming more decoupled, decomposed into services and microservices, API testing increases in relevance. So much is going on beyond the user interface that it's impossible to keep quality high and diminish risk without addressing the API layer. Advanced DevOps firms prioritize many of the same metrics for integration testing as they do for the functional testing phase; this time, the focus is on APIs specifically.

These metrics include tests run against functional requirements (75%), total number of new API defects found (64%) and API bug density (63%), API test pass vs. fail rate (62%), and API code coverage (62%). Monitoring API risk coverage is an important metric as well, with 62% of advanced DevOps firms prioritizing it as a top metric for the category.

INTEGRATION TESTING QUADRANT MAPPING

Here's a quick look at how Integration Testing metrics are positioned vs. one another—based on the raw data collected from DevOps experts. DevOps Differentiators are highlighted in green. (Quadrant chart: axes MEASURED and VALUED; quadrants HIDDEN GEM, VALUE ADDED, DISTRACTION, OVERRATED.)

Requirements Covered by Tests
Usage rank #3 (34%), Value rank #1 (75%)
This metric looks at the percentage of requirements that are correlated to API tests. It can be measured at different granularities, depending on what's most important for your organization.

New Defects
Usage rank #8 (29%), Value rank #2 (64%)
This metric measures the sum of new defects identified during API testing. This can include defects exposed by automated tests as well as other testing approaches (e.g., exploratory testing).

Defect Density
Usage rank #2 (39%), Value rank #3 (63%)
This metric measures the quality of the APIs by looking at the total number of reported defects divided by API size. It is typically reported in terms of defects per lines of code (LOC).


Pass/Fail Rate
Usage rank #1 (44%), Value rank #4 (62%)
This metric measures the percentage of API tests that pass or fail. It is notoriously easy to manipulate. For example, the team can increase the pass rate (or reduce the fail rate) by simply adding redundant passing tests that do not add value.

Code Coverage
Usage rank #6 (31%), Value rank #5 (62%)
This metric measures how well API test cases cover code. It is typically measured in terms of how well the API's methods/operations are covered.

Risk Coverage
Usage rank #6 (31%), Value rank #5 (62%)
This metric tracks how well tests cover the API's business risks. As a prerequisite, risks must be defined and associated with the API's requirements. For example, if a high risk requirement is not tested, you might have high test coverage but low risk coverage.


Open Defects
Usage rank #5 (32%), Value rank #7 (57%)
This metric tracks the number of reported but unresolved API defects. A high number of open defects amounts to a large "technical debt."

Blocked Test Cases
Usage rank #4 (34%), Value rank #8 (53%)
This metric measures API test cases that could not be executed due to an external reason. For example, a test might be blocked because an environment is not ready, or because another failure prevents the test suite from reaching and executing it.

Total Critical Defects
Usage rank #13 (23%), Value rank #9 (49%)
This metric measures the number of defects with a "critical" severity reported during API testing. This can include defects exposed by automated tests as well as other testing approaches (e.g., exploratory testing).


New Critical Defects
Usage rank #18 (18%), Value rank #10 (48%)
This metric measures the number of new defects with a "critical" severity reported during API testing. It is typically used to assess if changes are negatively impacting the quality of the release. Together with "Open Defects" and "Resolved Defects," it helps identify how the quality is trending.

Planned Test Case Coverage
Usage rank #12 (24%), Value rank #11 (44%)
This metric tracks the target level of test coverage. The actual test coverage can be compared to this number to determine if sufficient test coverage is achieved.

Test Efficiency
Usage rank #10 (25%), Value rank #12 (43%)
This metric measures API test efficiency according to a custom set of criteria (defined by the team or organization). The measurement method varies significantly across teams, projects, divisions, and organizations. Test effectiveness tends to focus on achieving the desired result, while test efficiency aims to achieve the desired result with the fewest resources.


Test Execution Time
Usage rank #9 (26%), Value rank #13 (41%)
This metric measures the total time required to execute the test suite. When measuring test execution time, it's important to consider variations in the number of tests executed as well as blocked test cases.

Test Case Coverage
Usage rank #15 (21%), Value rank #14 (39%)
This metric measures the effectiveness of tests by looking at how well test cases cover the API's functional requirements. It can be measured at different granularities, depending on what's most important for your organization.

Test Effectiveness
Usage rank #14 (23%), Value rank #15 (35%)
This metric measures test effectiveness according to a custom set of criteria (defined by the team or organization). The measurement method varies significantly across teams, projects, divisions, and organizations. Test effectiveness tends to focus on achieving the desired result, while test efficiency aims to achieve the desired result with the fewest resources.


Build Verification Tests
Usage rank #16 (19%), Value rank #16 (34%)
This metric measures the number of tests run to verify that the API build is testable (before the actual API testing begins). These tests are commonly a small subset from the API test suite.

Test Cases Executed
Usage rank #17 (19%), Value rank #17 (32%)
This metric measures the total number of tests executed. However, it does not account for test redundancy, test effectiveness, or test flakiness. Having more tests does not necessarily add more value, nor does it automatically increase a team's chances of releasing faster.

Total Defects
Usage rank #19 (17%), Value rank #18 (32%)
This metric measures the sum of all defects identified during API testing. This can include defects exposed by automated tests as well as other testing approaches (e.g., exploratory testing).


Release Readiness
Usage rank #10 (25%), Value rank #19 (30%)
This metric measures release readiness according to a custom set of criteria (defined by the team or organization). The measurement method varies significantly across teams, projects, divisions, and organizations. A high release readiness score indicates a high confidence that the application can be released with an acceptable level of business risk.

END-TO-END REGRESSION TESTING

END-TO-END REGRESSION TESTING RANKING HEAT MAP (Usage Rank / Value Rank)

Percent of automated test cases: 4 / 1
Requirements covered by tests: 3 / 2
Total defects: 7 / 3
Test cases executed: 2 / 4
Test case coverage: 1 / 5
Percent of test cases passed: 5 / 6
Risk coverage: 5 / 6
Release readiness: 10 / 8
Variance from baseline of percent of test cases passed: 9 / 9
New requirements added: 13 / 10
Defect status by priority: 8 / 11
Test effectiveness: 10 / 12
Percent of requirements tested: 12 / 13
Percent of passed tests for new requirements: 18 / 14
New requirements tested: 15 / 15
Time spent preparing test data: 19 / 16
Time spent preparing test environments: 15 / 17
Total test execution time: 14 / 18
Blocked test cases: 15 / 19

Forrester commentary

End-to-end regression testing also gets a first-class citizen role. In this category, leading organizations automate end-to-end tests at the process or transaction level. Automating these types of tests is not easy, and advanced tools are required because speed matters. The choice of the testing technology matters because achieving and, more importantly, maintaining high levels of automation is crucial. Therefore, the more tests that are automated, the better, and so 70% of leading DevOps teams prioritize the percent of automated end-to-end test cases as a top metric.

Many of the other metrics that leading firms rank as important in this category are quantitative and measure coverage of functionality vs. requirements (70%), number of test cases executed (65%), and total number of defects identified during testing (66%).

END-TO-END REGRESSION TESTING QUADRANT MAPPING

Here's a quick look at how End-to-End Regression Testing metrics are positioned vs. one another—based on the raw data collected from DevOps experts. DevOps Differentiators are highlighted in green. (Quadrant chart: axes MEASURED and VALUED; quadrants HIDDEN GEM, VALUE ADDED, DISTRACTION, OVERRATED.)

Percent of Automated Test Cases
Usage rank #4 (36%), Value rank #1 (70%)
This metric measures how much of the total test suite is automated. It is calculated by dividing the number of automated test cases by the total number of test cases.

Requirements Covered by Tests
Usage rank #3 (40%), Value rank #2 (70%)
This metric looks at the percentage of requirements that are correlated to test cases. It can be measured at different granularities, depending on what's most important for your organization.

Total Defects
Usage rank #7 (33%), Value rank #3 (66%)
This metric measures the sum of all defects identified during E2E testing. This can include defects exposed by automated tests as well as other testing approaches (e.g., exploratory testing).
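The calculation itself is a simple ratio; a short sketch with illustrative names and numbers:

```python
def percent_automated(automated_test_cases, total_test_cases):
    """Share of the overall test suite that is automated."""
    if total_test_cases == 0:
        return 0.0
    return 100.0 * automated_test_cases / total_test_cases

# Example: 420 automated test cases out of a 600-case suite -> 70% automated.
print(f"{percent_automated(420, 600):.0f}% of test cases are automated")
```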


Test Cases Executed
Usage rank #2 (41%), Value rank #4 (65%)
This metric measures the total number of tests executed. However, it does not account for test redundancy, test effectiveness, or test flakiness. Having more tests does not necessarily add more value, nor does it automatically increase a team's chances of releasing faster.

Test Case Coverage
Usage rank #1 (48%), Value rank #5 (61%)
This metric measures the effectiveness of tests by looking at how well test cases cover the application's functional requirements. It can be measured at different granularities, depending on what's most important for your organization.

Percent of Test Cases Passed
Usage rank #5 (34%), Value rank #6 (59%)
This metric measures the percentage of E2E tests that pass. It is notoriously easy to manipulate. For example, the team can increase the pass rate by simply adding redundant passing tests that do not add value.


Risk Coverage
Usage rank #5 (34%), Value rank #6 (59%)
This metric tracks how well tests cover the application's business risks. As a prerequisite, risks must be defined and associated with the application's functional components. For example, if a high risk application area is not tested, you might have high test coverage but low risk coverage.

Release Readiness
Usage rank #10 (26%), Value rank #8 (53%)
This metric measures release readiness according to a custom set of criteria (defined by the team or organization). The measurement method varies significantly across teams, projects, divisions, and organizations. A high release readiness score indicates a high confidence that the application can be released with an acceptable level of business risk.

Variance from Baseline of Percent of Test Cases Passed
Usage rank #9 (28%), Value rank #9 (51%)
This metric measures how the percentage of passing E2E tests changes from the historical baseline. This can be a valuable way to put results in perspective. For example, a 75% pass rate might be a great improvement for a team who historically had 50% of their tests passing—but a significant drop for a team that was consistently achieving 95% pass rates.
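One way to compute the variance-from-baseline metric is to compare the current run's pass rate against a rolling baseline of recent runs. The sketch below uses a simple mean of the last few runs as the baseline; the window size and sample numbers are arbitrary choices for illustration:

```python
def pass_rate(passed, executed):
    """Pass rate of a single run, as a percentage."""
    return 100.0 * passed / executed if executed else 0.0

def variance_from_baseline(history, current, window=5):
    """Difference (in percentage points) between the current pass rate and the
    mean pass rate of the last `window` runs."""
    recent = history[-window:]
    baseline = sum(recent) / len(recent) if recent else 0.0
    return current - baseline

# Example: a team historically passing ~95% of E2E tests drops to 75% in the latest run.
previous_runs = [96.0, 94.5, 95.0, 95.5, 94.0]
current_run = pass_rate(150, 200)  # 75%
delta = variance_from_baseline(previous_runs, current_run)
print(f"Variance from baseline: {delta:+.1f} percentage points")
```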


New Requirements Added
Usage rank #13 (20%), Value rank #10 (50%)
This metric measures how many new requirements were added over time. From a quality perspective, it is not interesting on its own. However, it can be used to add context to other metrics—for example, requirements covered by tests might drop if many new requirements were added, but the corresponding functionality has not been thoroughly tested yet.

Defect Status by Priority
Usage rank #8 (31%), Value rank #11 (47%)
This metric tracks the status (open, closed, etc.) of the defects in each defect priority category. A defect's priority indicates how urgently it should be fixed. It is commonly used in concert with severity (e.g., a high-severity defect would likely be assigned a high priority).

Test Effectiveness
Usage rank #10 (26%), Value rank #12 (41%)
This metric measures test effectiveness according to a custom set of criteria (defined by the team or organization). The measurement method varies significantly across teams, projects, divisions, and organizations. Test effectiveness tends to focus on achieving the desired result, while test efficiency aims to achieve the desired result with the fewest resources.


Percent of Requirements Tested
Usage rank #12 (25%), Value rank #13 (37%)
This metric looks at the percentage of requirements that are actually tested through test execution. It can be measured at different granularities, depending on what's most important for your organization.

Percent of Passed Tests for New Requirements
Usage rank #18 (15%), Value rank #14 (35%)
This metric measures the percentage of E2E tests for new requirements that pass. This can provide a focused look at the quality of new features and enhancements. However, it does not necessarily indicate if these changes negatively impact other areas of the application (e.g., previously-working functionality).

New Requirements Tested
Usage rank #15 (17%), Value rank #15 (32%)
This metric tells you the percentage of new requirements that are tested by E2E tests. If this number is low, the percentage of passed tests for new requirements is less meaningful.


Time Spent Preparing Test Data
Usage rank #19 (13%), Value rank #16 (30%)
This metric tracks the time that team members spend preparing test data. This can include requesting test data to be provisioned, extracting data, masking data for compliance purposes, injecting data into the tests, refreshing test data after a test completes, etc.

Time Spent Preparing Test Environments
Usage rank #15 (17%), Value rank #17 (28%)
This metric tracks the time that team members spend preparing test environments. This can include requesting virtual machines or service virtualization access, configuring them to test various configurations and conditions, refreshing the environments after a test completes, etc.

Test Execution Time
Usage rank #14 (19%), Value rank #18 (28%)
This metric measures the total time required to execute the test suite. When measuring test execution time, it's important to consider variations in the number of tests executed as well as blocked test cases.


Blocked Test Cases
Usage rank #15 (17%), Value rank #19 (20%)
This metric measures E2E test cases that could not be executed due to an external reason. For example, a test might be blocked because an environment is not ready, or because another failure prevents the test suite from reaching and executing it. Blocked test cases tend to be more common in E2E testing than at other levels due to the complexity of the test suite, test data, and test environments involved.

ANALYSIS

DevOps Changes the Game—For Quality Metrics and the Ultimate Mission of Testing
By Wayne Ariola, Tricentis

Scaled agile and DevOps change the game for testing. It's not just a matter of accelerating testing—it's also about fundamentally altering the way that we measure quality. For Agile, we need to test faster and earlier. But DevOps demands a more deep-seated shift. The test outcomes required to drive a fully-automated release pipeline are dramatically different than the ones that most teams measure today. Even if you're very good at testing in a siloed manner isolated to an Agile team, this might not help you in your DevOps process—where an overarching assessment of business risk is imperative for a release decision.

In the past, when software testing was a timeboxed activity at the end of the cycle, we focused on answering the question Are we done testing? When this was the primary question, counting metrics associated with the number of tests run, incomplete tests, passed tests, failed tests, etc. drove the process and influenced the release decision. As you can imagine, these metrics are highly ineffective in understanding the actual quality of a release. In today's world, we have to ask a different question: Does the release have an acceptable level of risk?

As you've seen in this report, companies with the most mature DevOps practices moved on from considering "counting" metrics — for instance, whether you've run tests an adequate number of times — as key indicators of success. Instead, they're prioritizing "contextual" metrics — whether the software meets all of the requirements of the user experience.


DevOps has shifted the mission of testing to determining whether the release candidate is truly fit for release. If you can answer this question with 5 tests, that's actually better than answering it with 5000 tests that aren't closely aligned with business risk. Count doesn't matter—what's important is 1) the ability to assess risk and 2) the ability to make actionable decisions based on test results.

That's why Continuous Testing is so critical. Continuous Testing is the process of executing automated tests as part of the software delivery pipeline in order to obtain feedback on the business risks associated with a software release as rapidly as possible.

Continuous Testing really boils down to providing the right feedback to the right stakeholder at the right time. For decades, testing was traditionally deferred until the end of the cycle. At that point, testers would provide all sorts of important feedback…but nobody really wanted to hear it then. It was too late, and there was little the team could feasibly do, except delay the release. With Continuous Testing, the focus is on providing actionable feedback to people who really care about it…when they are truly prepared to act on it. This can occur at any point in the application delivery lifecycle—including both "shift left" and "shift right."

Learn more about Continuous Testing — get the Continuous Testing Reference Guide.

OVERRATED QUALITY METRICS

The following metrics are commonly used, but are rarely ranked as high-value metrics by DevOps experts:

Number of automated tests (BUILD)
Defect status by priority (E2E)
Requirements covered by tests (BUILD)

HIDDEN GEMS

The following metrics are not commonly used (even among DevOps experts), but are ranked as extremely valuable by the select teams who actually measure them:

New defects (IT)
Critical defects (FV)
Automated tests prioritized by risk (BUILD)
Code coverage (BUILD)
Test cases executed (BUILD)
Static analysis results (BUILD)
Variance from baseline of percent of test cases passed (E2E)
Release readiness (E2E)

TOP DEVOPS DIFFERENTIATORS

DevOps experts/leaders measure the following metrics significantly more than DevOps laggards measure them:

Automated tests prioritized by risk (BUILD)
Percent of automated E2E test cases (E2E)
Risk coverage (IT)
Release readiness (FV / IT / E2E)
Test efficiency (FV / IT)
Requirements covered by tests (BUILD / FV / IT / E2E)
Test case coverage (BUILD / E2E)
Static analysis results (BUILD)
Variance from baseline of percent of test cases passed (E2E)
Test effectiveness (FV / IT / E2E)

MOST USED BY DEVOPS EXPERTS

The following metrics are the most frequently used (overall) by DevOps experts/leaders:

Test case coverage (E2E)
Pass/fail rate (FV)
API pass/fail rate (IT)
Number of tests executed (E2E)
API bug density (IT)
Requirements covered by tests (FV)
Requirements covered by tests (E2E)
Blocked test cases (FV)
Percent of automated E2E test cases (E2E)
Successful code builds (BUILD)

MOST VALUED BY DEVOPS EXPERTS

The following metrics are the most highly-valued (overall) by DevOps experts/leaders:

Requirements covered by API tests (IT)
Percent of automated E2E tests (E2E)
Requirements covered by tests (E2E)
Requirements covered by tests (FV)
Count of critical functional defects (FV)
Total number of defects discovered in test (E2E)
Number of test cases executed (E2E)
Pass/fail rate (FV)
New API defects found (IT)
Automated tests prioritized by risk (BUILD)

REPORT METHODOLOGY

In this study, Forrester conducted an online survey of 603 enterprise organizations in North America, Europe, and Asia Pacific to evaluate current software testing practices and metrics tracked during survey development. Survey participants included decision makers and individual contributors responsible for their organizations' Agile and/or DevOps development strategies. Questions provided to the participants asked about their firms' attitudes toward software development automation, risk management, and testing practices, as well as the metrics they track and value in the software development life cycle.

Respondents by country: US 38%, SG 14%, FR 11%, UK 9%, DE 9%, IN 8%, AU 6%, CA 3%, NZ 2%.

Company size: 1,000 to 4,999 employees, 5,000 to 19,999 employees, or 20,000 or more employees.

"Which of the following continuous development practices does your organization use today?"
Agile + DevOps: 51%
Agile development: 25%
DevOps: 25%

Industry (top ten shown):
Technology/software: 27%
Financial services and insurance: 15%
Manufacturing and materials: 12%
Retail: 6%
Healthcare: 5%
Education and nonprofits: 4%
Government: 4%
Telecommunication services: 4%
Transportation and logistics: 3%
Consumer product manufacturing: 3%

Title:
Vice president: 12%
Director: 38%
Manager: 35%
Project Manager: 8%
Full-time practitioner: 7%

"What is your level of responsibility when it comes to Agile and/or DevOps development strategy at your organization?"
I am the final decision-maker: 36%
I am part of a team making decisions: 41%
I influence decisions: 11%
I am involved: 11%
