Measurement and Estimation Symposium

SEPG Conference Amsterdam 12-15 June 2006

Experience in developing metrics for agile projects compatible with CMMI best practice

Graham Collins, UCL

Abstract

This report covers the experience and resolution of problems with the reporting of agile projects with a background of senior managers familiar with Earned Value (EV) reporting.

It was found to be important to retain a variety of measures and incorporate new measures based on progress (velocity) over short time periods, analogous to EV performance indicators (cpi and spi) as well as report on percentage of completed software.

The shift of emphasis from earned value progress reporting to frequent tracking and measuring the percentage of software completed (using acceptance test data) initially caused concerns, as the progress achieved in working software is typically slow at first.

An outline of an improvement initiative is given using actual data, to try to achieve a more accurate understanding of project status and to show how the initial concerns of senior management were overcome. Assessments of business value, weekly progress and the remaining work, which can be correlated to the EV measurement, are compatible with Capability Maturity Model Integration (CMMI) best practice and agile development. This also had the effect of the development team being more in control, knowing they had ‘buy-in’ from senior staff and being able to track the team’s progress continually, as this was visible to everyone involved in development.

This invariably has an impact on other issues such as how the life-cycle is reported to senior management and how metrics should be tailored for agile projects where emphasis is placed on prioritisation of business goals (benefits) and fixed iterations.

Background

Agile practices at first sight do not seem compatible with EV, especially as the goals and requirements are refined during agile projects and EV is based on having clear goals and estimates from inception. Yet increasingly developers wish to use agile approaches, due to the advantages of teamwork, feedback from the client and seeing working software at an earlier stage. In addition organisations are increasingly requiring greater compliance and justification of their projects not just in financial terms but in tracking the value to the business. An approach which is being increasingly adopted is that of earned value management (EVM) which is now mandated on all appropriate MoD [APM 06a] projects in the UK.

Added to this is the Software Engineering Institute’s (SEI) CMMI framework, which is increasingly used as best practice. Chrissis outlines that CMMI has its foundations [Chrissis 03] in the ‘principles of statistical quality control’ [Shewhart 31], although other authors [Lipke 00] refer to this work as later refined by W. Edwards Deming [Deming 86], using the term Statistical Process Control (SPC). Chrissis also cites the influence of Humphrey’s Managing the Software Process, but it is worth noting that Humphrey advocates both SPC and EV approaches and more recently includes detailed examples of EVM [Humphrey 05].

Anderson, whose MSF project has achieved level 3, puts forward a strong argument that the CMMI framework is firmly rooted in the work of Deming and SPC [Anderson 05] and that because of this it must be possible to achieve CMMI level 5. In addition Chrissis makes clear, quoting the SEI, that ‘the quality of a system or product is highly influenced by the quality of the process used to develop and maintain it’, in effect laying the foundation for continual process improvement as the basis for CMMI.

Depending on the reporting required and the constraints and governance of the project, both approaches, EVM and SPC, may be required. Their effectiveness and the possibility of using them in conjunction with agile projects at different levels of abstraction will be discussed, based on experience from 19 projects (2003-2006).

The CMMI takes into account the increasingly heterogeneous environments that software is developed for and at level 3 requires effective project management practices. Assessment allows different approaches, provided the main goals are achieved.

Pair reporting

Working with agile teams and having a role on several sites has necessitated a different style of management. There is a high level of trust in a typical agile team: the developers commit to an amount of work (tasks) on a daily, weekly and iteration basis. The development work is done in pairs, as this produces more robust code faster. To reduce the onus on the project manager to track individuals’ progress, the tasks are outlined on the whiteboard and any problems and issues are flagged up. The rate of work agreed each day in comparison to work achieved (velocity) is marked by the pair. This is, in effect, pair reporting, which raises as early as possible any concerns and potential problems that may need further assistance from the group. It has been found to be a useful addition to the agile processes adopted and fits naturally with the agile philosophy of reporting team progress rather than individual progress.

Iterations typically use 1 month time frames, similar to the time period in the agile method Scrum (30 day ‘sprints’); however, the main approach is Extreme Programming (XP) [Beck 00]. Information is kept where appropriate on whiteboards and recorded via digital cameras and wiki pages. Approaches have, where possible, been integrated into existing programme, project and risk management frameworks.

Earned Value and Control Charts

The work of [Lipke 00] in using SPC and control charts for air logistics software development projects seemed to point the way to a continuous improvement programme compatible with CMMI level 5. In 1996 these became the first software activities in federal service to achieve CMM level 4 distinction. The work points towards the compatibility of using SPC control charts on earned value data and creating an application to report the risk of falling outside acceptable boundaries. This was achieved by using the schedule performance index (spi) and cost performance index (cpi) and applying control limits to these indicators.

Software development in the nuclear contracting industry [Alleman 03] seemed to show the compatibility of using earned value with agile programming to gain CMMI level 3. Although the work used XP and achieved a high degree of compatibility, the estimates based on testing gave earned value figures not dissimilar to planned, suggesting the goals remained fairly constant and there were limited scope changes.

Experience proved encouraging, but as is usual in agile projects where the goals (requirements) are being uncovered, EV was typically lower than planned, similar to Figure 1. As soon as the velocity for the first iteration was applied, results were within a margin of error of 20%. This provided the basis to see if EV could be applied to agile projects. First it was necessary to establish the velocity and ascertain whether the agile development process was under statistical control. The problem was initially the reverse of Lipke’s work: it was necessary to know whether the process was under control before any meaningful cpi and spi values could be derived.

The key issue remaining was that a more accurate metric was needed and that for agile projects the earned value would need to be recalculated at each iteration for improved accuracy. The agile use of acceptance tests to verify that user stories are complete seemed the ideal solution. User stories were measured in ‘story points’, the metric used to indicate story size (and, at a finer granularity, task size). As the concept of agile practices was to address the high-risk and more valuable stories first, this mapped to their relative business value as well.

Earned Value Definition and Summary

‘The value of the useful work done at any given point in a project. The value of completed work expressed in terms of the budget assigned to that work. A measure of project progress. Note: The budget may be expressed in cost or labour hours’ [APM 06b].

Figure 1. Earned Value chart

EV measurements should be based on working software, Figure 2. Problems may arise when artefacts such as documentation and UML design are included, giving a view of progress on work done (activities) rather than work directly attributed to business value.

Figure 2. Progress as a function of working software, adapted from [Bittner 05]

Earned Value Analysis (EVA)

The problem with tracking agile projects with EV is that one of the key features of agile processes is to refine goals through iterative development work so the full extent of what is required evolves through iterations. This has the advantage that the right solution is developed although it is difficult to estimate duration from the outset.

As the scope has not been finalised the estimates are poor. A resolution to this problem is to consider the business value of the development.

Consider an agile project, Table 1, developing a web application with a project team of 6. Stories were measured in ‘story points’ to indicate the relative amount of development work (tasks) required. Task times were then logged during the week to give the actual (i.e. developer) cost. The table shows how complete each story is by assigning ‘points earned’. Using the approach that the earned value of each task is determined by the percentage complete of the planned activity [Lester 00], the percentage of story points ‘earned’ is multiplied by the planned developer hours. As an example, story number 1 is 100% complete (it has earned all its associated story points), therefore its EV is equivalent to the planned developer hours. From these values the efficiency, in terms of cpi (220/260 = 0.85) and spi (220/240 = 0.92), can be calculated to give progress estimates. Various other approaches could also be taken, incorporating the business value, to give different interpretations of progress in terms of business value and points progression.

Story number / ‘Business Value’ / Story points / ‘Points earned’ / Planned (developer hours) / Actual (logged hours) / EV (earned value)
1 / 3 / 10 / 10 / 100 / 120 / 100
2 / 2 / 8 / 8 / 60 / 60 / 60
3 / 2 / 4 / 4 / 60 / 80 / 60
4 / 1 / 2 / 0 / 20 / 0 / 0
total / 8 / 24 / 22 / 240 / 260 / 220

Table 1. Recorded values for one iteration. The EV figures were automatically calculated from the figures of ‘earned points’. Business values record the relative importance of stories.
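
As a minimal sketch of the arithmetic behind Table 1, the following Python reproduces the EV, cpi and spi figures from the recorded values. The function and variable names are illustrative assumptions and are not taken from the original project tooling.

```python
# Rows of Table 1: (story, business value, story points, points earned,
#                   planned developer hours, actual logged hours)
TABLE_1 = [
    (1, 3, 10, 10, 100, 120),
    (2, 2, 8, 8, 60, 60),
    (3, 2, 4, 4, 60, 80),
    (4, 1, 2, 0, 20, 0),
]


def earned_value(points, earned, planned_hours):
    """EV of one story: fraction of story points earned times planned hours."""
    return planned_hours * earned / points


def iteration_indices(rows):
    """Return (total EV, cpi, spi) for one iteration."""
    planned = sum(row[4] for row in rows)
    actual = sum(row[5] for row in rows)
    ev = sum(earned_value(row[2], row[3], row[4]) for row in rows)
    return ev, ev / actual, ev / planned


ev, cpi, spi = iteration_indices(TABLE_1)
print(f"EV={ev:.0f} cpi={cpi:.2f} spi={spi:.2f}")  # EV=220 cpi=0.85 spi=0.92
```

A cpi or spi below 1 indicates that less value was earned than was spent or planned for the iteration.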

What also needed to be established was the rate of work for the environment, the team experience and, to a lesser extent, the tools used.

Figure 3. Initial planning figures based on spi_c and cpi figures that showed planned progress was not being maintained.

Estimates for time and cost to complete were then derived from the performance indicators spi_c and cpi. This was achieved on a daily and iteration basis. The estimated cost at completion equals (the original) planned/cpi and the estimated time equals planned/spi_c. Note: spi_c was used in preference to the time-based spi_t, which can be easily approximated by inspection of the iteration schedule. However, far more useful is to see the efficiency of the process, which in the project in Figure 3 was below planned on both cost and time, raising possible causes such as poor estimation or other issues. Using this data for the next project and estimating after the first iteration improved estimation figures by more than 50%.

Using spi and cpi ratios is an integral part of the PSM method [McGarry 02], which is used by BAE Systems to achieve CMMI level 5. However, even if historic figures are available they may not be applicable. What was required was to understand the rate of development, i.e. velocity.
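
The completion estimates described above (planned divided by the relevant index) can be sketched as follows. The cpi and spi values are the iteration totals from Table 1; the whole-project plan of 1000 hours and 10 weeks is a hypothetical figure chosen only for illustration, as are the function names.

```python
def estimate_at_completion(planned_cost, cpi):
    """Estimated cost at completion: original planned cost divided by cpi."""
    if cpi <= 0:
        raise ValueError("cpi must be positive")
    return planned_cost / cpi


def estimated_duration(planned_duration, spi):
    """Estimated time to complete: original planned duration divided by spi."""
    if spi <= 0:
        raise ValueError("spi must be positive")
    return planned_duration / spi


# Performance indices from the Table 1 totals.
cpi = 220 / 260
spi = 220 / 240

# Hypothetical whole-project plan (assumption, not from the paper).
planned_hours = 1000
planned_weeks = 10

eac_hours = estimate_at_completion(planned_hours, cpi)
eta_weeks = estimated_duration(planned_weeks, spi)
print(f"{eac_hours:.0f} hours, {eta_weeks:.1f} weeks")  # 1182 hours, 10.9 weeks
```

Because both indices are below 1, both estimates exceed the original plan.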

Measuring Velocity

The rate of work or velocity, Figure 4, using story points achieved each week by the team can be established via acceptance tests. This measure helps determine the development level for the next iteration and can be estimated within ranges [Cohn 06].

Figure 4. Velocity or rate of work in story points achieved per week.

Estimation of Project Duration using Story Points

Cohn considers it appropriate to set expectations using a range estimate, which is achieved ideally by running one or more iterations to give some data on the progress of the team and then applying a weighting factor as shown in Table 2 to give an upper and lower value.

Iterations Completed / Low Multiplier / High Multiplier
1 / 0.6 / 1.60
2 / 0.8 / 1.25
3 / 0.85 / 1.15
4 or more / 0.90 / 1.10

Table 2. Multipliers for estimating velocity based on number of iterations completed, from [Cohn 06].
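
One way to apply the multipliers in Table 2 is sketched below: an observed velocity is scaled to a low and high value, and the remaining story points are divided by each to give a duration range. The observed velocity of 35 points per week and the 200 remaining points are hypothetical inputs, and the function names are assumptions made for this illustration.

```python
# Multipliers from Table 2 [Cohn 06], keyed by iterations completed.
MULTIPLIERS = {1: (0.60, 1.60), 2: (0.80, 1.25), 3: (0.85, 1.15)}
FOUR_OR_MORE = (0.90, 1.10)


def velocity_range(observed_velocity, iterations_completed):
    """Return (low, high) velocity estimates for the observed velocity."""
    if iterations_completed < 1:
        raise ValueError("at least one completed iteration is required")
    low, high = MULTIPLIERS.get(iterations_completed, FOUR_OR_MORE)
    return observed_velocity * low, observed_velocity * high


def duration_range(remaining_points, observed_velocity, iterations_completed):
    """Return (shortest, longest) number of periods to finish the remaining work."""
    low_v, high_v = velocity_range(observed_velocity, iterations_completed)
    return remaining_points / high_v, remaining_points / low_v


# Hypothetical inputs: 35 points per week observed over 2 iterations,
# with 200 story points remaining.
low_v, high_v = velocity_range(35, 2)
shortest, longest = duration_range(200, 35, 2)
print(f"velocity {low_v:.1f} to {high_v:.1f}, duration {shortest:.1f} to {longest:.1f} weeks")
```

The range narrows as more iterations are completed, which matches the intent of setting expectations with a range rather than a single figure.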

It has sometimes been found useful to weight stories, which include a short value statement, in a range of 1-3 to give an estimate of business value, Table 1, and apply a multiplier to provide upper and lower range estimates. The next step was to develop process control measures to alert the team and project manager to any issues.

Control Limits

Control techniques indicate whether the development process is under control, using criteria set within various σ ranges [Florec 99]. When a sequence of values needs to be evaluated, a time-sequenced plot of individual values may be appropriate, using individuals and moving range (XmR) charts such as Figure 5 and the formulae in Figure 6. Control limits based on 3σ can be set to indicate when the process has values that need to be examined.

Figure 5. Use of velocity and mR (the UCL_R control limit = 20.07).

Figure 6. Equations for calculating control limits for XmR charts.

Xi / mR
35 / -
32 / 3
45 / 7
30 / 15
35 / 5
39 / 4
32 / 7
34 / 2
35.25 / 6.14 (averages)

Table 3. Xi and mR data from the project in Figure 5.

As an illustration, using the XmR equations and the values from Table 3, where the average velocity per week is 35.25 and the average two-point moving range is 6.14, UNPL_x = 51.6 and LNPL_x = 18.9, which indicates that the observed velocities are within the natural process limits.
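
The limits quoted above can be reproduced with the commonly used XmR constants for a two-point moving range (2.660 for the individuals limits and 3.268 for the upper range limit). This is a sketch under the assumption that these are the formulae of Figure 6, which is not reproduced here; the names are illustrative. It takes the averages reported in Table 3 as inputs rather than recomputing them from the rows, since the moving range listed against the third observation (7) differs from |45 - 32| = 13.

```python
# Standard XmR chart constants for a two-point moving range.
E2 = 2.660  # scales the average moving range to 3-sigma limits for individuals
D4 = 3.268  # scales the average moving range to the upper limit for ranges


def xmr_limits(x_bar, mr_bar):
    """Return centre lines and control limits for an XmR chart."""
    return {
        "CL_x": x_bar,
        "UNPL_x": x_bar + E2 * mr_bar,
        "LNPL_x": x_bar - E2 * mr_bar,
        "CL_R": mr_bar,
        "UCL_R": D4 * mr_bar,
    }


def within_limits(values, lower, upper):
    """True when every value lies inside the closed interval [lower, upper]."""
    return all(lower <= value <= upper for value in values)


velocities = [35, 32, 45, 30, 35, 39, 32, 34]  # Xi column of Table 3
limits = xmr_limits(x_bar=35.25, mr_bar=6.14)   # averages as reported in Table 3

print(f"UNPL_x={limits['UNPL_x']:.1f} LNPL_x={limits['LNPL_x']:.1f} "
      f"UCL_R={limits['UCL_R']:.2f}")
print(within_limits(velocities, limits["LNPL_x"], limits["UNPL_x"]))
```

With the reported averages this gives UNPL_x of about 51.6, LNPL_x of about 18.9 and UCL_R of about 20.07, in agreement with the figures in the text and in the Figure 5 caption.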

Next it is useful to make an estimate of future progress. One approach is to use weighted averages which have helped achieve improvements for initial estimation of planned figures.

Figure 7. X-bar average in comparison to weighted average. Iterations were of 1 week duration.

A typical occurrence at the start of projects was a time lag before acceptance tests were complete, Figure 7. Taking the weighted average is an alternative to the average CL_x and has proved to be a more effective measure for estimating velocity at the start of a project. Values used to calculate the weighted average are shown in Table 4.

Although not shown, the values for Figure 7 give the upper control limit for the moving range, UCL_R, as 22.8, and all moving range values are within this limit.

Table 4. Weighted Average
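
Since the Table 4 values are not reproduced here, the following is only a sketch of how a weighted average can compensate for a slow start. It assumes linearly increasing weights (1, 2, 3, ...) so that later weeks count more heavily; both the weighting scheme and the sample velocities are hypothetical and are not the values used in the projects.

```python
def simple_average(values):
    """Plain arithmetic mean of the observed velocities."""
    return sum(values) / len(values)


def weighted_average(values, weights=None):
    """Weighted mean; by default week n receives weight n (assumed scheme)."""
    if weights is None:
        weights = range(1, len(values) + 1)
    weights = list(weights)
    if len(weights) != len(values):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical weekly velocities showing a lag before acceptance tests pass.
early_velocities = [5, 15, 30, 35]

mean_velocity = simple_average(early_velocities)        # 21.25
weighted_velocity = weighted_average(early_velocities)  # 26.5
print(mean_velocity, weighted_velocity)
```

Under these assumptions the weighted figure sits closer to the most recent weeks, which is the behaviour wanted when early iterations understate the team's eventual rate of work.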

Acceptance Test Charts

Using acceptance tests solved some of the problems of earned value reporting. Information was available to assess progress in stories complete and how much work was involved (story points). The team can only deliver at a certain rate or velocity each iteration, and if additional stories are requested then others may need to be dropped. Provided managers understood that the team would be able to deliver at a velocity similar to earlier iterations, changes to work load were reduced and planned values remained valid for longer. Even with changes in scope, this data was of value to managers, who were familiar with re-planning and making adjustments to the inventory of work. Cumulative flow of acceptance tests provided further information to manage projects appropriately, Figure 8.

Figure 8. Cumulative flow chart.

The team can easily assess the amount of work remaining using burn down charts, Figure 9, which have a motivating effect on the team, especially towards the end of an iteration or project. It can also be seen when additional work has been attributed to the project. Sometimes it may be useful to show the changing inventory line. Cockburn discusses the use of burn-up charts (i.e. cumulative progress with time) and their correlation to earned value charts [Cockburn 03] and considers this a ‘natural mapping’. Cohn outlines how the impact of velocity can be separated from that of scope changes by delineating them at the x-axis.

Figure 9. Burn down chart.
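
The data behind a burn down chart can be derived directly from weekly velocity, as in the sketch below. It uses the Xi values from Table 3 as the weekly story points completed; the initial scope of 300 story points and the 20 points of added work are hypothetical, and the function name is an assumption made for this illustration.

```python
def burn_down(initial_scope, completed, added=None):
    """Return remaining story points at the start and after each period.

    completed: story points accepted in each period.
    added: story points added to the scope in each period (defaults to none).
    """
    if added is None:
        added = [0] * len(completed)
    if len(added) != len(completed):
        raise ValueError("completed and added must have the same length")
    remaining = [initial_scope]
    for done, extra in zip(completed, added):
        remaining.append(remaining[-1] - done + extra)
    return remaining


weekly_velocity = [35, 32, 45, 30, 35, 39, 32, 34]  # Xi values from Table 3

# Hypothetical initial scope of 300 story points with no scope changes.
chart = burn_down(300, weekly_velocity)
print(chart)  # [300, 265, 233, 188, 158, 123, 84, 52, 18]

# Same data with 20 points of additional work attributed in week 3.
chart_with_change = burn_down(300, weekly_velocity, [0, 0, 20, 0, 0, 0, 0, 0])
print(chart_with_change[-1])  # 38
```

Keeping completed and added work as separate inputs makes it possible to show when additional work was attributed to the project, rather than letting it appear as a drop in velocity.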

Discussion

Ideally acceptance data should be within control limits if it is to be used as a basis for EV reporting. However, even if data is not within limits, grouping data within iterations can give meaningful values. Variability in this metric reflects the size of user stories and changes in scope.

Measurement should be a natural by-product of the development process and should cause minimal overheads. The interpretation of tasks and acceptance data outside of control limits should help the team understand the process better and attribute reasons to these variations. Acceptance test data is one approach. Issue reporting is also essential to understand the stability of the code, depending on the size and domain of the project.

Team tracking, using pair reporting, and keeping the data visible allows the team more involvement in this process. Presentation of results should be as simple as possible, with the team and managers able to view velocity to give rates of progress and burn-down charts showing remaining work. Charts should be in a format to gain as much leverage for reporting of progress as possible. Percentages can also be easily derived from stories and story points achieved.

With better appreciation of the rate of work, it has been found that scope changes in terms of story points have been reduced. The managers know what the team is capable of delivering and the team realises that it cannot achieve more than a certain number of story points (tasks). The increased stability creates more effective and accurate EV reporting. Combining understanding of acceptance data (using weighted averages at the start of the project) and the rate (velocity) with a better understanding of the process has created more viable estimation figures. Where managers are briefed as to the nature of the agile process, which ensures that the requirements with the greatest business value are delivered first, they are more likely to accept the idea of improved estimates during iterations.