Team AZP
TCN Spell Checker
Software Engineering Department
Rochester Institute of Technology
134 Lomb Memorial Drive
Rochester, NY 14623

Process Metrics Summary for Product Iterations 1 & 2

A summary of the process engineering objectives and results from the initial Spell Checker software framework development effort

Iteration 1: November 29, 2004 -- February 1, 2005
Iteration 2: February 2, 2005 -- February 18, 2005

Section 1 Summary

Process Metrics Objectives Iterations 1 & 2

Why process metrics?

The collection of process metrics is most useful for managing the activities of team members to maximize the success of the project effort. When tracked over a number of projects (or deliverables), metrics provide a standard of measurement for the team's efficacy and a basis for time and labor estimates on future projects. Because Team AZP has no previous project metrics against which to evaluate our progress, we take advantage of metrics with immediate value: establishing a baseline for future performance expectations, and detecting problems in the current iteration for immediate correction.

Our metrics for this iteration fall into two major categories:

1) Defect Analysis

2) Estimation Accuracy

The first group is used to identify and classify the general causes of problems thus far in our development effort; the second detects the degree of over- or under-estimation inherent in our projected time and labor budget for this and future iterations.

Section 2 Defect Analysis

[Figure: Defects by Source (number of defects per source category)]

Defect-Derived Metrics:

Defect Source Tracking

The above chart describes the results of defect source analysis for all of the defects identified during Iterations 1 and 2 of the Spell Checker project, categorizing the defects submitted by our team by cause.

Implementation and Coding contributed the most defects by more than a factor of two, which matches our expectation that implementing the software would account for the majority of the effort spent developing the first version of the application. Our development process consisted of several sessions of requirements collection through dialog with TCN employees, followed by several weeks of high-level design. The majority of the first quarter (weeks 4-9) was spent building the framework on which the application runs, as well as several prototype modules that allow the system to emulate its final search-processing state.

Our problems stemmed mostly from complications discovered during implementation, and were largely errors injected by programmers rather than defects originating in requirements or design activities.
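As an illustration, tallies like those in the source chart can be reproduced from a simple defect log. A minimal sketch in Python, with category names following the report's source taxonomy but hypothetical counts:

    from collections import Counter

    # Hypothetical defect log entries: one source category per recorded
    # defect. Category names follow the report's taxonomy; the counts
    # are illustrative, not the project's actual data.
    defect_sources = [
        "Implementation/Coding", "Implementation/Coding", "Requirements",
        "Implementation/Coding", "Design", "Implementation/Coding",
    ]

    # Tally defects by cause, most frequent first (as in the chart).
    for source, count in Counter(defect_sources).most_common():
        print(f"{source}: {count}")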

[Figure: Defects by Severity Index (1-5); number of defects at each severity level, Trivial through Critical]

Average Bug Severity

Each defect was labeled with a severity index from 1 (Trivial) to 5 (Critical) as it was recorded. The above chart describes the frequency of defects at the differing levels of severity. We estimated that an item of Critical severity would require the attention of the entire team with maximal urgency – the defect abruptly halts system development, and must be resolved before development can safely continue. Moderate defects have an average level of importance – they conflict with the project's requirements, and cannot survive until the next iteration without causing problems. Trivial severity indicates a defect that takes the form of a "suggestion": if time allows and team members are not bogged down with more pressing work or forward development, the change can be considered and implemented.

Our analysis shows that the average defect severity is 3.73, somewhat above "moderate" severity. We interpret this to mean that the problems we find are almost always pressing issues, which may indicate that Team AZP makes very few trivial mistakes, that trivial problems generally go unreported or are fixed ad hoc by developers, or that we have been slightly overzealous in assigning severity to incoming defects.

Our goal for future iterations is to bring the average severity of reported bugs closer to 3, which will add value to our development process by ensuring that minor errors receive the same attention as more severe ones (so that they are not compounded by time and neglect).
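For illustration, the severity average is a simple mean over the recorded indices. A minimal sketch, using hypothetical values chosen to reproduce the reported 3.73 figure:

    # Hypothetical severity indices, one per recorded defect, on the
    # report's 1 (Trivial) to 5 (Critical) scale. These sample values
    # are not the project's data, but they reproduce the 3.73 mean.
    severities = [4, 4, 3, 5, 4, 3, 4, 4, 2, 5, 3]

    average_severity = sum(severities) / len(severities)
    print(f"Average severity: {average_severity:.2f}")  # -> 3.73

The complexity average discussed below (2.73) is computed identically over the recorded complexity indices.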

Average Bug Complexity

[Figure: Defects by Complexity Index (1-5); number of defects at each complexity level, Trivial through Critical]

Each defect was labeled with a complexity index from 1 (Trivial) to 5 (Critical) as it was recorded, in much the same way as the severity index described above. The provided "Defects by Complexity" chart describes the frequency of defects at the differing levels of complexity. We estimated that an item of Critical complexity would involve major changes to the structure of code, design, or known requirements, up to and including scrapping some existing components or modules and starting afresh. A Moderately complex defect will probably occupy a single team member for a day or two, and may involve some elaboration of the project design or a reinterpretation of the technologies used to program the application. Trivial complexity indicates something along the lines of a single-line code error or a similar quick fix.

Our analysis shows that the average defect complexity is 2.73, on the low end of "moderate" complexity. We like these measurements, and believe lower complexity is indicative of adequate requirements and design analysis and of generally high-quality development. Our system is highly adaptable, and the problems we encounter (on average) do not require more than a superficial addition or replacement to existing code and process.

Our goal for future iterations is to reduce the overall complexity of our defects. Low complexity defects are easy and quick to resolve, and have little or no impact on other aspects of the project. For our purposes, a low complexity defect is a good defect.

Average Age of Defects

[Figure: Age of Defects (Open, Closed, and Deferred); number of defects by age in days, 0 to 11]

The above graph displays the recorded age of our defects (open, closed, and deferred) as of 2/18/2005. This scatter plot shows the relative spacing and frequency of defects, and displays our trend of fixing defects within 5 days on average (calculated as 4.87 days). Our aim for future iterations is to bring this trend further inward, shortening the average age of defects to less than 4.5 days. By improving this average, we will reduce the impact of defects and improve our forward progress, spending less time fixing the average defect and more time building the system as planned.
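A sketch of the age calculation, under the assumption that a closed defect's age runs from detection to close while an open defect keeps aging until the 2/18/2005 snapshot (the report does not state its exact convention, and the dates below are hypothetical):

    from datetime import date

    # Snapshot date used in the report's age analysis.
    SNAPSHOT = date(2005, 2, 18)

    # Hypothetical (detected, closed) pairs; closed is None while a
    # defect remains open.
    defects = [
        (date(2005, 2, 7), date(2005, 2, 12)),   # closed after 5 days
        (date(2005, 2, 10), date(2005, 2, 14)),  # closed after 4 days
        (date(2005, 2, 13), None),               # still open at snapshot
    ]

    def age_in_days(detected, closed):
        # Open defects age until the snapshot; closed ones stop aging.
        end = closed if closed is not None else SNAPSHOT
        return (end - detected).days

    ages = [age_in_days(d, c) for d, c in defects]
    print(f"Average age: {sum(ages) / len(ages):.2f} days")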

Each defect is recorded on a specialized template documenting its source, severity, complexity, and detection date. By analyzing the content of each template and summing or averaging the results, we derive the trends in defect attributes shown above.
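In code form, one filled-in template might be modeled as a small record type; the field names below are our own, since the actual template is a document form rather than code:

    from dataclasses import dataclass
    from datetime import date
    from typing import Optional

    @dataclass
    class DefectRecord:
        """One filled-in defect template; field names are assumptions."""
        source: str             # e.g. "Implementation/Coding"
        severity: int           # 1 (Trivial) .. 5 (Critical)
        complexity: int         # 1 (Trivial) .. 5 (Critical)
        detected: date
        closed: Optional[date] = None  # None while the defect is open

    # Summing/averaging across records yields the trends reported above.
    records = [
        DefectRecord("Implementation/Coding", 4, 2, date(2005, 2, 10)),
        DefectRecord("Design", 3, 3, date(2005, 2, 15)),
    ]
    print(sum(r.severity for r in records) / len(records))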

Section 3 Estimation Analysis

Estimation-Derived Metrics:

Time Estimated vs. Time Spent per Deadline

We are deferring analysis of this metric pending the acquisition of additional effort expenditure data (Iteration 3).

Schedule Slippage

As of the end of the Winter quarter, we are happy to report no evident schedule slippage to date. Iteration 1 was completed ahead of schedule, and we are confident we will complete Iteration 2 before its deadline of February 18th, 2005. We are currently on schedule (0% schedule slippage).

The time difference at each deadline and scheduled project milestone gives us real-time feedback on the accuracy of our current and anticipated deliverable dates. This information will be useful when rescheduling project goals after a major schedule slip.
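A sketch of the slippage calculation, assuming slippage is measured as the signed difference between actual and planned completion dates and expressed against the iteration's planned duration (only the Iteration 2 dates come from this report):

    from datetime import date

    # Iteration 2 dates from the report; "actual" is the on-time outcome.
    start = date(2005, 2, 2)
    planned_end = date(2005, 2, 18)
    actual_end = date(2005, 2, 18)

    slippage_days = (actual_end - planned_end).days        # 0 days
    planned_duration = (planned_end - start).days          # 16 days
    slippage_pct = 100 * slippage_days / planned_duration  # 0.0 %
    print(f"Schedule slippage: {slippage_days} days ({slippage_pct:.0f}%)")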

Section 4 Plans for the Future

Where to from here?

Beginning with Iteration 3 (starting March 7th, 2005), Team AZP intends to track closely the hours of effort spent by each team member in several categories of development, such as implementation/coding, administrative tasks, metrics collection and management, testing and debugging, and so on. We believe that, as the Spell Checker project enters its second half, it is critical that we track the amount of work performed and hours spent by our team and its constituent members against the quality of the results of our efforts. In particular, we wish to manage the contributions of different team members more precisely, to account for scheduling difficulties, role assignments, and skill areas. The results of this study will be published in future metrics summary documents.
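A minimal sketch of how such an effort log might be aggregated; the member names and hour values are invented, while the categories follow the list above:

    from collections import defaultdict

    # Hypothetical Iteration 3 effort entries: (member, category, hours).
    effort_log = [
        ("member_a", "implementation/coding", 6.0),
        ("member_b", "testing and debugging", 3.5),
        ("member_a", "metrics collection and management", 1.0),
        ("member_c", "administrative tasks", 2.0),
        ("member_b", "implementation/coding", 4.0),
    ]

    hours_by_category = defaultdict(float)
    hours_by_member = defaultdict(float)
    for member, category, hours in effort_log:
        hours_by_category[category] += hours
        hours_by_member[member] += hours

    # Totals per development category, for comparison against results.
    for category, total in sorted(hours_by_category.items()):
        print(f"{category}: {total:.1f} hours")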

We plan to continue managing defect attributes as in previous iterations (described above).
