WLMP Metrics Plan

Extremely Expensive Program (XXP)

Metrics Methodology

June 17, 2015

Dr. Jennifer Booker

XYZ Corporation, Inc., PLC, GmBH

TABLE OF CONTENTS

1 Introduction
  1.1 Background
  1.2 Scope and Assumptions
  1.3 Objectives
  1.4 Constraints
2 References
  2.1 CMM Traceability
    2.1.1 Software CMM
    2.1.2 Systems Engineering CMM
  2.2 Terminology
3 Measurement Basics
  3.1 Why Measure?
  3.2 How do we Measure?
  3.3 What can be Measured?
    3.3.1 Fundamental Measurement Types
    3.3.2 Complex Measurements
    3.3.3 Before and After Measurements
4 How do we Choose What to Measure?
  4.1 Identify Your Business Goals
  4.2 Identify What You Want to Know or Learn
  4.3 Identify Your Subgoals
  4.4 Identify the Entities and Attributes
  4.5 Formalize Your Measurement Goals
  4.6 Identify Quantifiable Questions and Indicators
  4.7 Define Your Measures
  4.8 Identify the Data Elements
  4.9 Identify the Actions Needed to Implement Your Measures
  4.10 Prepare a Plan
5 How can we Present Measures?
  5.1 Graphing Conventions
    5.1.1 Graph Terminology
    5.1.2 Units
    5.1.3 How Much Data Fits?
  5.2 Pie Chart
  5.3 Check Sheet
  5.4 Pareto Diagram
  5.5 Histogram (Bar Chart)
  5.6 Run Chart (Line Chart)
  5.7 Scatter Diagram
  5.8 Control Chart
  5.9 Cause and Effect Diagram (Fishbone)

XXP Metrics Methodology

1 Introduction

This Metrics Methodology describes a recommended methodology for identifying and describing measures, and deciding how they can be presented. [This was adapted from a real document, which was cleaned up to alleviate any proprietary issues.]

1.1 Background

This document was written for a particular contract, though the methodology described is widely applicable.

1.2 Scope and Assumptions

All projects within XXP are responsible for using this methodology to identify relevant metrics, then collect, analyze, and report them on a regular basis (e.g. at least monthly). Guidance for performing these activities is provided in sections 3 through 5 of this document.

1.3 Objectives

This methodology needs to support basic measurement objectives for the XXP contract:
• Help identify measures which demonstrably support the program’s needs
• Clearly define how those measures will be calculated and graphically presented

More generally, this methodology helps the XXP measurement program meet the following objectives:
• To meet the measurement and reporting requirements of the XXP contract
• To meet project management requirements
• To support meeting Software CMM Level 3 process maturity requirements, and establish a sound quantitative framework for higher maturity levels (e.g. Continuous Process Improvement, Defect Prevention, etc.)

1.4 Constraints

The measures must be consistent with the following constraints on how they may be defined, collected, and/or used.

Process Constraints
• Processes must be consistent with the organization’s Standard Engineering Processes.
• The SEI Capability Maturity Model (CMM) for Software, version 1.1, dated 1993 (CMU/SEI-93-TR-024).

Tool Constraints
• The existing methodology for program management
• The major COTS products used in the modernized system


Contractual Constraint
• The XXP contract for development and implementation of the modernized system (see References, section 2). The contract includes the Statement of Work (SOW), Performance Requirements, and Performance Bonus Plan.

2 References

This methodology supports industry best practices for software development and maintenance, such as those practices identified by IEEE standards, SEI, STSC, SPMN, PMI, and NIST. Specific industry references include:
• Booker, Jennifer, “Statistics for Software Process Improvement,” http://cci.drexel.edu/faculty/gbooker/INFO630/statistics.pdf, 2015.
• Brassard, Michael, and Ritter, Diane, “The Memory Jogger II,” Goal/QPC, First Ed., 1994 (http://www.goalpqc.com).
• Capability Maturity Model (CMM) for Software, version 1.1, dated 1993 (CMU/SEI-93-TR-024).
• Fagley, Kara, “Basic Seven Tools of Quality,” http://www.freequality.org/beta freequal/fq web site/training/Basic7ToolsofQuality[1].ppt
• IEEE-1045, “IEEE Standard for Software Productivity Measures,” 1992.
• IEEE-1061, “IEEE Standard for a Software Quality Measures Methodology,” 1988.
• IEEE-982.1, “IEEE Standard Dictionary of Measures to Produce Reliable Software,” 1988.
• Ishikawa, Kaoru, “Guide to Quality Control,” ISBN 92-833-1036-5, Productivity Press, 1986. Source for Ishikawa’s seven tools.
• Kan, Stephen H., “Measures and Models in Software Quality Engineering,” Addison-Wesley Pub Co; ISBN: 0201633396, 1995. This also describes Ishikawa’s seven basic tools for presenting measures, which are summarized here in section 5.
• Park, Robert E., “Goal-Driven Software Measurement — A Guidebook,” CMU/SEI-96-HB-002, 1996. This describes the GQ(I)M approach summarized here in section 4.
• Systems Engineering Capability Maturity Model (SE-CMM), CMU/SEI-95-MM-003.

2.1 CMM Traceability

2.1.1 Software CMM

This Plan supports the common feature of "Measurement and Analysis" for all Key Process Areas (KPAs) of the Software CMM, and will eventually support the Quantitative Process Management KPA.

2.1.2 Systems Engineering CMM

This Plan also supports the Systems Engineering CMM's generic practice "Track with Measurement," and will eventually support the common feature "Establishing Measurable Quality Goals."

2.2 Terminology

General XXP terminology is described in the XXP Quality Management Terminology document. Some specific measurement concepts of interest include the following.

Term          Definition
Data Element  Some quantity needed to calculate the value of a measure.
Data Source   The repository from which a data element is measured.
Goal          A high level business objective.
Indicator     A means of graphically presenting one or more measures.
Information   The result of comparing a measure to an internal or external measurement standard.
KLOC          Thousand lines of source code; a commonly used measure for software size.
Measure       A specific characteristic used to quantify an attribute of something.
Metric        A means of measuring something.
Subgoal       An objective which supports meeting a goal; here, subgoals are defined by each project organization within the program.

3 Measurement Basics

This section describes the reasons for measuring a project, how measurements are made, and the types of characteristics which can be measured.

3.1 Why Measure?

There are four major reasons for measuring software development processes, products, and resources: to Characterize existing processes, to Evaluate program status, to Predict the future status of this and other programs, and to Improve the processes used. These reasons are described in more detail in Table 1.

Table 1. Reasons for Collecting and Using Measures

Reason: Characterize
Description: Allows XXP to gain an understanding of the processes, products, resources, and environments, and to establish baselines for comparisons with future assessments.
• What types of defects are common?
• What is our current defect rate?

Reason: Evaluate
Description: Allows XXP to determine a program’s status with respect to plans, assess achievement of quality goals, and assess the impacts of technology and process improvements on products and processes.
• What is the reliability of the product after delivery?
• Does the product provide the needed functionality and is it easy to use?
• Does functional testing minimize certain types of faults?

Reason: Predict
Description: Allows XXP to plan. Measuring for prediction involves gaining an understanding of the relationships among processes and products, and building models of these relationships. The models allow one to focus on the values observed for the characteristics of processes, products, resources, constraints, etc., which provide a mechanism for predicting:
• How can we refine our Basis of Estimates (BOE)?
• How can we estimate our budget?

Reason: Improve
Description: Allows XXP to improve our processes and procedures, by collecting a repository of data for evaluating our weaknesses and process bottlenecks. This will help answer:
• How can we do it better?
• What new techniques may improve our capabilities?

Hence the Characterize step is when we make measurements to describe how we’re currently performing. The Evaluate step is comparing those measurements to our goals. Based on the trends observed over time, we can then Predict future performance. Finally, we strive for ways to Improve our organization so that we can do even better than earlier predictions would have allowed.

3.2 How do we Measure?

Measures are used to understand our activities, and to eliminate bias in making decisions. As shown in Figure 1, measures are derived from data elements, and are used to produce useful information for our project. This information often forms the basis for making decisions, such as deciding when to proceed to the next life cycle phase, or when to accept new software into the system. To implement this concept, we often start by identifying the information we want to understand, and then work backwards to determine what measures and data elements are needed to get that information.

[Figure: Data Source -> Data Elements -> Measures -> Information]

Figure 1. Calculation and Analysis of Measures

The Information is derived from comparing a measure to an internal (e.g. project-defined) standard, and/or to an external (such as industry-based) standard. For example, if we want to know whether the quality of our custom-developed software is “good,” that’s a piece of information based on comparing our defect rate (a measure) to our project’s own history of defect rate (an internal standard), and/or to an industry standard for “good quality” software (an external standard).

Each Measure is calculated from one or more data elements. For example, the defect rate is often defined as the number of defects divided by the number of thousand lines of code (KLOC).

Defect Rate = (# of defects) / (# of KLOC)

Hence in order to calculate this measure, we need two data elements: the number of defects, and the number of thousand lines of code.

We obtain each Data Element from a Data Source. This usually means taking particular pieces of data from the data source which are needed to calculate our measure. Notice that different data elements may come from completely different data sources.

The Data Source is where each Data Element comes from. A data source can be electronic (such as a database or spreadsheet) or manual (a pile of paper sales receipts). In this example, we might get the number of defects from a problem reporting database, and the number of KLOC might come from a tool to analyze our source code.

It is critical to define very precisely how measures are calculated, and how data elements are extracted from their data sources, so that measures from several years ago can be meaningfully compared to measures made yesterday.

It’s possible for some data elements also to be used as measures in another context. The number of thousand lines of source code (# of KLOC) is a data element in the above example, because it’s used to calculate the defect rate. However it’s also possible to track the # of KLOC by itself over time to monitor the growth of the product; in that context, # of KLOC is a measure because it’s the quantity being plotted and analyzed.
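The chain from data elements to a measure to information can be sketched in a few lines of Python. This is only an illustration: the function names, the input values, and the standard of 2.0 defects/KLOC are assumptions made up for the example, not values from the XXP program.

```python
def defect_rate(num_defects: int, lines_of_code: int) -> float:
    """Measure: defects per thousand lines of code (KLOC).

    Both arguments are data elements; in practice they would come from
    different data sources (e.g. a problem reporting database and a
    source code analysis tool).
    """
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    kloc = lines_of_code / 1000
    return num_defects / kloc


def compare_to_standard(measure: float, standard: float) -> str:
    """Information: the result of comparing a measure to a standard.

    Assumes that a lower value is better, as it is for a defect rate.
    """
    if measure <= standard:
        return "at or better than standard"
    return "worse than standard"


# Hypothetical data elements and a hypothetical internal standard.
rate = defect_rate(num_defects=45, lines_of_code=30_000)
print(rate)                                     # 1.5
print(compare_to_standard(rate, standard=2.0))  # at or better than standard
```

The same comparison function could be called once with an internal standard and once with an external one, since the text allows either or both.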

3.3 What can be Measured?

This section describes the most fundamental types of measurements, then combines them to produce more complex measures, and describes how they can be measured before and after some significant event.

3.3.1 Fundamental Measurement Types

Measurements may be collected for the work products, and for the resources, tools, and processes needed to generate those work products. These are the most fundamental measures. As shown in Figure 2, the objective of a project is to create some sort of work Products. Products could be software, documentation, or plans; in short, anything which is needed to ultimately fulfill the project’s needs. In order to do so, we need Resources, such as people, to be able to create the product. But Resources can’t work in a vacuum; they need Tools to do their work, such as a software development environment. And finally, to produce a consistently high quality product, we need to follow a defined Process.

[Figure: Resources, Tools, and Processes each feed into Products]

Figure 2. Need Resources, Tools, and Processes to produce Products

Hence fundamental measurements can be defined for any or all of these aspects of our business environment, as shown in Table 2. Deciding precisely whether a particular measure relates to process, tool, resource, or product isn’t critical; the classification is merely a method for understanding where the focus of our measurement attention needs to be placed. For example, if you have low product quality, and know that the tools aren’t a problem for your staff, then you can focus on resource or process measures to help identify why the product quality is low.

Table 2. Types of Fundamental Measures

Type: Products
Definition: Any final or interim work product.
Examples: Product measures are often the easiest to collect, and can even be done long after the product has been created. Basic product measures can include aspects such as size (LOC, pages, etc.), complexity, and number of components (e.g. modules, scripts, classes, etc.). The number of defects in a product is also a key measurement.

Type: Resources
Definition: The people who create the products, including their training and skills. Labor cost to use those resources also belongs in this type.
Examples: Resource measures are often the most important project management tool. How much staff effort was required to perform task X? How much does it cost to fix a defect? More advanced resource measures may include the amount of training needed to maintain or improve staff qualifications.

Type: Tools
Definition: The equipment used by the resources to produce the products. May include hardware, software, and third party components. All non-labor costs generally belong to this type.
Examples: Tool measures are relatively rare, but can include measures such as utilization (what percent of time are tools being used?), cost, and capacity (what percent of available bandwidth is being used? Or of available storage?).

Type: Processes
Definition: Generally refers to a defined set of processes and procedures, which describe the tasks and activities performed in order to create the products.
Examples: Process measures often focus on duration (how long does it take to perform process X?) or frequency (how often is Process X performed?). Notice that process measures focus on calendar time to perform tasks, while Resource measures focus on staff effort.

3.3.2 Complex Measurements

The fundamental measures can be combined to yield more complex measurements, as shown in Table 3.

Table 3. Typical Complex Measurements

Defect Rate: Based on two product measurements (defects and size), the defect rate is typically given in defects per thousand lines of code.

Earned Value: Earned value measurements compare the cost, effort, and amount of product created to define dozens of measures. See http://users.snip.net/~gbooker/Reference/measures/evalue.htm for examples and further references.

Productivity: Productivity measures are based on the amount of product produced, divided by the time needed to produce it. Typical software productivity is defined by the number of LOC per staff month needed to design, write, and validate them.

Thousands of complex measures can be defined to meet specific needs; see the book by Kan for more examples.

3.3.3 Before and After Measurements

A common technique is to measure the same quantity at two times to allow comparisons. This is the foundation of being able to Predict or Improve any aspect of our organization (see section 3.1).

Predict - The most common technique is to compare the planned and actual values for a measure, such as the planned and actual effort required to perform a task. This type of before-and-after measurement is primarily used for project management, to allow prediction of the cost and schedule needed to produce some product.

Improve - The other variation on this is to compare measures before and after some change in the environment. For example, the process duration before a deliberate change to the tools may be compared to the duration after the tools are in use, to determine if there was a significant improvement. Hence this type of before-and-after measurement is primarily used for process improvement, to quantify improvements in processes.
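Both kinds of before-and-after comparison reduce to simple arithmetic on two values of the same measure. The sketch below is illustrative only: the function names and the sample figures (staff hours and days) are assumptions, and it reports a plain percentage rather than testing whether a change is statistically significant.

```python
def percent_variance(planned: float, actual: float) -> float:
    """Predict: percent by which the actual value differs from the plan.

    A positive result means the actual value exceeded the planned value.
    """
    if planned == 0:
        raise ValueError("planned must be nonzero")
    return 100.0 * (actual - planned) / planned


def percent_improvement(before: float, after: float) -> float:
    """Improve: percent reduction in a measure after a change.

    A positive result means the value dropped, which is an improvement
    for measures such as process duration.
    """
    if before == 0:
        raise ValueError("before must be nonzero")
    return 100.0 * (before - after) / before


# Hypothetical planned vs. actual effort for one task, in staff hours.
print(percent_variance(planned=400, actual=460))   # 15.0

# Hypothetical process duration before and after a tool change, in days.
print(percent_improvement(before=10, after=8))     # 20.0
```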

4 How do we Choose What to Measure?

This section tells how to identify which measurements are best for your environment, using the GQ(I)M method. There are thousands of possible measures which can be defined from even fairly simple project data. To help ensure we’re finding the right measures to meet our goals, we need a method to help choose the best measures for our project.

The GQ(I)M method (per Park, et al.) is used to support the definition of measures, in order to ensure traceability from goals to individual data elements. The method is called "GQ(I)M" for Goal, Question, Indicator, and Measurement. It is based on the GQM method by Victor Basili, first developed circa 1988-89. GQ(I)M is a ten-step top-down approach to definition of measures which focuses on tracing measures from high level business goals (objectives) down to their exact definition, and collection of the data elements needed to calculate them. The reasons for using this method include:
• To ensure that measures are clearly defined.
• To avoid overcollection of data.
• To ensure that there is a defined reason (management purpose) for collecting each piece of data.
• To understand why the data are being collected. The “why” is important: it affects how the data will be interpreted, and provides a basis for reusing measurement plans and procedures for other projects.

There is nothing magical about using a top-down approach for defining measures. The key is to fill out all of the GQ(I)M parameters for every measure, regardless of whether you started near the top (goal), bottom (data elements), or somewhere in between. One could start with the existing data sources and determine what measures and goals could be addressed with them; or pick measurements which seem relevant, then work up to goals and down to data elements to clarify their purpose.

4.1 Identify Your Business Goals

The top level business goals for XXP are defined in the sites’ Metrics Plans. Typical high level goals are to ensure compliance with contractual obligations, and meet corporate standards for process maturity and process improvement.

4.2 Identify What You Want to Know or Learn

Each XXP Project Manager needs to identify what information they would like to know in order to understand, assess, predict, or improve their activities to help meet our collective business goals. Consider your organization, and ask:
• How do we receive inputs?
• What temporary or internal products do we create?
• What work products do we generate?
• What resources and tools do we use?
• What processes do we use?

Ask what basic information you know quantitatively about your activities:
• Cost: How much does it cost to perform each type of activity?
• Schedule: How long does it take?
• Effort: How much effort does it require?
• Scope: How much product is created?

If you’re already comfortable answering those questions, try these:
• Is the process stable?
• How is it performing now?
• What limits our capability?
• What determines quality?
• What determines success?
• What things can we control?
• What do our customers want?
• What limits our performance?
• What could go wrong? How can we tell if it does?
• What might signal early warnings?
• How big is our backlog?
• Where is backlog occurring?

The net result from this step should be an idea of what aspects of your activities are most in need of better understanding.

4.3 Identify Your Subgoals

Translate the top-level goals from step 1 and the questions from step 2 into subgoals that specifically relate to each Project Manager’s area (e.g. Deployment, Sustainment, Data, etc.). Each subgoal should support the overall project goals from section 4.1, but more specifically address issues relevant to each project area. Subgoals might include issues like:
• Improve emergency fix responsiveness
• Improve customer satisfaction
• Understand scope of external interfaces
• Reduce dependency on subcontractor expertise
• Or whatever might be appropriate for each project area

4.4 Identify the Entities and Attributes

For each subgoal, identify the products and activities which relate to that subgoal; these are called the entities of interest. Then for each entity, determine which attributes (characteristics) of that entity are most important to fulfilling the subgoal. This is to identify what attributes or combination of attributes can be measured to see if the subgoal has been reached.

4.5 Formalize Your Measurement Goals

Based on the analysis in the first four steps, we now want to formulate specific measurement goals to address each subgoal. There may be multiple measurement goals for one subgoal, depending on the complexity of the subgoal. Each measurement goal is a sentence of the form:

Verb measure frequency

The verb is an action verb chosen based on whether the measurement goal is “active” or “passive.”
• Passive (strategic) measurement goals are meant to enable basic learning or understanding of the measure (e.g. identify capabilities and trends; identify root causes; etc.). Passive goals are better suited to lower levels of process maturity, when the baseline performance hasn’t yet been fully defined. Verbs used with passive goals are focused on developing an understanding of the current situation, such as “track,” “determine,” “assess,” “observe,” “identify,” or “monitor.”
• Active (tactical) measurement goals are used when you already have a good understanding of the entity you’re trying to measure. Active goals are better suited to high levels of process maturity, when process improvement is the norm. Verbs used with active goals are focused on causing change from the existing state, such as “improve,” “increase,” “decrease,” or “reduce.”

The measure is a written description of the specific measure(s) which will be used to address this particular subgoal. Be as specific as possible about exactly what is being measured; this will help make defining the measure easier. Measures can be calculated, so it’s okay to cite the number of things, a difference between two things, or the average value of something. Examples of “measures” could include:
• “the number of open Problem Reports”
• “the cost and schedule variance”
• “the number of known major defects”
• “the number of approved system requirements”
• “the average number of people charging to the project”

The frequency is used to define how often the measure is collected. Frequencies may be on a fixed schedule (e.g. daily, weekly) or event-driven (per release). Examples of frequency phrases include:
• at the end of each day
• at the end of each week
• at the end of each month
• for each release
• for each life cycle phase
• after each contract modification

Table 4 puts all of this into a menu format. To generate a measurement goal, pick one entry from the Verb column, one from the Measure column, and one from the Frequency column. Notice that both passive and active verbs are shown.

Table 4. Menu for Defining Measurement Goals

These lists are not comprehensive - they are examples only!

Verb         Measure                                                  Frequency
• Track      • the number of open Problem Reports                     • at the end of each day.
• Determine  • the cost and schedule variance                         • at the end of each week.
• Assess     • the number of known major defects                      • at the end of each month.
• Observe    • the number of approved system requirements             • for each release.
• Identify   • the average number of people charging to the project   • for each life cycle phase.
• Monitor                                                             • after each contract modification.
• Improve
• Increase
• Decrease
• Reduce
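The menu approach can be sketched as picking one entry from each of three lists and joining them into a sentence. The code below is a minimal illustration: the function names are assumptions, only a few menu entries are included, and not every combination it generates is necessarily a sensible goal for a given project.

```python
from itertools import product

# A small subset of the example menu entries.
VERBS = ["Track", "Determine", "Reduce"]
MEASURES = [
    "the number of open Problem Reports",
    "the cost and schedule variance",
]
FREQUENCIES = ["at the end of each week", "for each release"]

# Verb classification taken from the passive and active verb examples.
PASSIVE_VERBS = {"Track", "Determine", "Assess", "Observe", "Identify", "Monitor"}
ACTIVE_VERBS = {"Improve", "Increase", "Decrease", "Reduce"}


def measurement_goal(verb: str, measure: str, frequency: str) -> str:
    """Build a goal sentence of the form: Verb measure frequency."""
    return f"{verb} {measure} {frequency}."


def goal_kind(verb: str) -> str:
    """Classify a goal as passive or active based on its verb."""
    if verb in PASSIVE_VERBS:
        return "passive"
    if verb in ACTIVE_VERBS:
        return "active"
    raise ValueError(f"unknown verb: {verb}")


# Every combination of the three menus; a project would keep only
# the combinations that address one of its subgoals.
all_goals = [measurement_goal(v, m, f)
             for v, m, f in product(VERBS, MEASURES, FREQUENCIES)]

print(all_goals[0])
# Track the number of open Problem Reports at the end of each week.
print(len(all_goals))       # 12
print(goal_kind("Reduce"))  # active
```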

4.6 Identify Quantifiable Questions and Indicators

Another way to understand measurement goals is to formulate them in terms of a question which will address the subgoal’s needs. Sometimes this is a starting point for defining measures, because a question may arise which prompts defining some measure to answer it. Common measurement questions may include:
• Have our requirements stabilized?
• What is the quality of our software?
• Has testing been thorough?
• Is the customer happy with our product and services?
• When do we need to upgrade our servers?

Indicators are needed to display measures in a meaningful manner. In this context, indicators refers to the way in which measures are graphically displayed or captured, such as:
• Pie Chart
• Check Sheet
• Pareto Diagram
• Histogram (Bar Chart)
• Run Chart (Line Chart)
• Scatter Diagram
• Control Chart
• Cause and Effect Diagram (Fishbone)

These indicators are discussed in detail in section 5.

4.7 Define Your Measures

Note: In this document, steps 7 and 8 are switched from the original GQ(I)M approach.

Based on the measurement goals, define each measure to be used. Definitions of measures must be consistently repeatable. Give each measure a brief name, show an equation for calculating the measure, and give a written description of the measure. The written description should communicate what is being measured, how it is measured, and what has been included or excluded from its definition. For example, the Defect Rate might be defined as:

Defect Rate = (number of defects) / (number of KLOC)

The Defect Rate is used to assess the quality of software by measuring the number of known defects per thousand lines of source code (KLOC). Defect severity and status are not considerations; major and minor defects count equally.

This sets the stage for definition of the data elements in the next step.

4.8 Identify the Data Elements

Based on the definitions of each measure, collect a single list of all data elements needed to calculate all measures. (A particular data element may appear in several measures; hence this avoids repetition.) Describe how each data element is defined and obtained, so that it may be consistently collected. For example:

“Number of defects” is the number of defects, of any severity or status, which apply to the current developmental build, which are in the Problem Reporting Database. They are collected by running the script “find-all-defects.sql”.

Each data element should generally be collected at one time, so that one measure isn’t, for example, using the number of defects as of last Friday, while another similar measure is using the number of defects as of the end of last month. This could lead to confusion in interpretation.

4.9 Identify the Actions Needed to Implement Your Measures

In order to implement these measures, we need to identify where, when, and by whom each measure will be collected. A common (shared) location needs to be identified for collection and dissemination of measures and data elements. This location generally needs to be public for the entire project team. The main exception to this is sensitive data, such as financial data, which may need more restricted access.

The following information needs to be identified for every measure:
• Goal (a general statement of the subject of interest)
• Subgoal (a.k.a. objective)
• Question (what is the specific question answered by this measure, if any)
• Indicator (how to display measure; e.g. line chart, histogram, etc.)
• Measure (the actual measure, including how it is calculated and its definition)
• Assignee (who is responsible for collecting and analyzing each measure, and when)

And the following information needs to be defined for each data element:
• Data Element name (used to calculate measure)
• Source (of each data element - what report, script, survey, etc. will be used to gather it?)
• Assignee (who is responsible for collecting each data element)
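One way to keep this information together is a small record per measure and per data element. The sketch below is an assumption about how such records might look, not a format prescribed by this methodology; the class names, field names, assignee roles, and the second measure are hypothetical. It also shows how the single list of data elements from step 8 can be derived from the measure records.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DataElement:
    name: str      # Data Element name
    source: str    # report, script, survey, etc. used to gather it
    assignee: str  # who collects it


@dataclass
class MeasureDefinition:
    goal: str
    subgoal: str
    question: str
    indicator: str
    measure: str   # name, equation, and written definition
    assignee: str  # who collects and analyzes the measure, and when
    data_elements: List[DataElement] = field(default_factory=list)


def unique_data_elements(measures: List[MeasureDefinition]) -> List[DataElement]:
    """Collect one list of data elements, listing shared elements once."""
    seen = {}
    for m in measures:
        for element in m.data_elements:
            seen.setdefault(element.name, element)
    return list(seen.values())


defects = DataElement("number of defects",
                      "Problem Reporting Database (find-all-defects.sql)",
                      "QA lead (hypothetical)")
kloc = DataElement("number of KLOC",
                   "source code analysis tool",
                   "build engineer (hypothetical)")

defect_rate = MeasureDefinition(
    goal="Meet contractual quality obligations",
    subgoal="Understand current software quality",
    question="What is the quality of our software?",
    indicator="Run Chart (Line Chart)",
    measure="Defect Rate = (number of defects) / (number of KLOC)",
    assignee="QA lead, monthly (hypothetical)",
    data_elements=[defects, kloc],
)
product_size = MeasureDefinition(
    goal="Meet contractual quality obligations",
    subgoal="Understand growth of the product",
    question="How fast is the product growing?",
    indicator="Run Chart (Line Chart)",
    measure="Product Size = number of KLOC",
    assignee="build engineer, monthly (hypothetical)",
    data_elements=[kloc],
)

for element in unique_data_elements([defect_rate, product_size]):
    print(element.name, "<-", element.source)
```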

4.10 Prepare a Plan

Each project needs to prepare a Plan to define and implement new measures. This activity has five major elements:
1. Identify existing measures, and document them using the format described in section 4.9.
2. Define desired measures and data elements needed to meet their subgoals, using sections 4.2 to 4.9.
3. Determine the difference between the existing measures, and the desired measures. Determine the priority of each new measure (e.g. High, Medium, Low).
4. Plan implementation of the new measures. Most projects will want to phase in new measures over time, such as one set of measures implemented immediately, another set implemented in six months, and so on. Phased implementation of measures supports the human need for gradual cultural change, and supports Continuous Process Improvement.
5. Implement the new measures according to the Plan, and report on measures periodically (e.g. weekly or monthly).
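Element 3 of the Plan, the difference between existing and desired measures, is essentially a set difference followed by a sort on priority. The sketch below illustrates that idea under assumed names; the measure names and their priorities are hypothetical examples.

```python
from typing import Dict, List, Set, Tuple

# Sort order for the example priority levels.
PRIORITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def new_measures(existing: Set[str],
                 desired: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (measure, priority) pairs for desired measures that are
    not yet implemented, highest priority first, then alphabetically."""
    gap = [(name, priority) for name, priority in desired.items()
           if name not in existing]
    return sorted(gap, key=lambda item: (PRIORITY_ORDER[item[1]], item[0]))


# Hypothetical existing measures and desired measures with priorities.
existing = {"Defect Rate"}
desired = {
    "Defect Rate": "High",
    "Schedule Variance": "High",
    "Server Utilization": "Low",
    "Productivity": "Medium",
}

for name, priority in new_measures(existing, desired):
    print(priority, name)
# High Schedule Variance
# Medium Productivity
# Low Server Utilization
```

The sorted list is a natural starting point for phasing: the High entries could form the first implementation set, with the rest scheduled later.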

5 How can we Present Measures?

This section describes how measurements can be presented graphically (called Indicators in the GQ(I)M method). The first two subsections (5.1 and 5.2) describe common terminology for presenting measures, and discuss a common tool for presenting data, the pie chart. The next seven subsections (5.3 through 5.9) discuss what are also known as “Ishikawa’s seven tools.” They are, respectively:
• Check Sheet
• Pareto Diagram
• Histogram (Bar Chart)
• Run Chart (Line Chart)
• Scatter Diagram
• Control Chart
• Cause and Effect Diagram (Fishbone)

Note: Some sources cite the Flow Chart as one of Ishikawa’s tools, instead of the Check Sheet. However his book Guide to Quality Control discusses Check Sheets.

These collectively cover all of the most commonly used methods for presenting measures graphically.

5.1 Graphing Conventions

Once a measure has been selected, and its collection frequency determined, it’s possible to define how it will be presented graphically. Various graphs can show from two data points up to a million on one chart. The challenge is to determine which type of chart is best for a particular measure or set of measures.

5.1.1 Graph Terminology

Most graphs are presented in a two-dimensional format, with a horizontal axis (usually called ‘X’ in math classes) and a vertical axis (‘Y’), as shown in Figure 3. The X axis shows the independent variable. Here it usually represents when data was collected for one set of measurements. It is independent because you could choose any value of X, and look up what corresponding value of Y was measured. The Y axis shows the dependent variable, so called because the value of Y depends solely on the X value. You can’t pick a value of Y and be guaranteed that there will be a corresponding data point.

[Figure: Sample (Line) Graph, with horizontal axis X (0 to 5) and vertical axis Y (0 to 9)]

Figure 3. Definition of X and Y Axes

5.1.2 Units

It is important to remember that most measures have units. These units should be shown with each graph to help ensure the reader understands the data being presented. The units for each X or Y axis of a graph should be included in the axis label. Typical units may include:
• Effort is in staff hours
• Defect Rate is in defects/KLOC
• A single date has no other units (May 2002), but differences between dates need to specify units (hours, days, etc.)
• Many Earned Value measures are in money ($) or percent (%).

5.1.3 How Much Data Fits?

As various methods for presenting measures are discussed, keep in mind two key factors which will help determine the best method for presenting the data.
1. Amount of Data per Interval: Each graph may show one or more measures for each data collection opportunity.
2. Number of Intervals Shown: The number of data collection opportunities, or intervals, shown on each graph may also range from one to several. Since many measures are repeated over time, that is the most frequent kind of interval of interest. Hence the analyst typically needs to decide how many time intervals need to be presented on the graph.

A very rough guide to the amount of data which may be shown on each type of graph is given in Table 5. The actual amount of data which fits depends on the size of the graph, font size, length of data labels, and other factors.

Table 5. Allowable Data Amounts by Graph Type

Graph Type                                      Amount of Data per Interval    Number of Intervals Shown
                                                (Number of Y values per X)     (Number of X values)
Control Chart                                   1-6                            2-100
Histogram (Bar Chart) – simple                  1                              2-15
Histogram (Bar Chart) – stacked or clustered    2-6                            2-10
Pareto Diagram                                  1                              2-15
Pie Chart                                       1-10                           1
Run Chart (Line Chart)                          1-6                            2-100
Scatter Diagram                                 1-6                            2-100
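As an illustration only, the rough limits in Table 5 could be captured in a small lookup so that an analyst can check which graph types are likely to fit a given data set. This is a minimal sketch in Python; the names TABLE_5_LIMITS and graph_types_that_fit are hypothetical and are not part of the methodology.

```python
# Rough limits transcribed from Table 5.
# Each entry: ((min, max) Y values per X, (min, max) X values).
# The dictionary and function names are illustrative assumptions.
TABLE_5_LIMITS = {
    "Control Chart": ((1, 6), (2, 100)),
    "Histogram (Bar Chart), simple": ((1, 1), (2, 15)),
    "Histogram (Bar Chart), stacked or clustered": ((2, 6), (2, 10)),
    "Pareto Diagram": ((1, 1), (2, 15)),
    "Pie Chart": ((1, 10), (1, 1)),
    "Run Chart (Line Chart)": ((1, 6), (2, 100)),
    "Scatter Diagram": ((1, 6), (2, 100)),
}


def graph_types_that_fit(values_per_interval, intervals):
    """Return the graph types whose Table 5 ranges cover the given data."""
    fits = []
    for graph_type, ((y_min, y_max), (x_min, x_max)) in TABLE_5_LIMITS.items():
        if y_min <= values_per_interval <= y_max and x_min <= intervals <= x_max:
            fits.append(graph_type)
    return fits


# Example: three measures per month, reported for twelve months.
print(graph_types_that_fit(3, 12))
```

Because Table 5 is only a very rough guide, a result from a lookup like this is a starting point; graph size, font size, and label length still decide what is readable.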

5.2 Pie Chart The pie chart is the simplest type of graph (see Figure 4). It is used to show the distribution (or relative amounts) of up to about ten data points at one moment in time, generally now. It is good for showing percentages, since the pie slices visually add up to 100% for the complete “pie.”

[Figure: pie chart titled “Sales ($M),” with one slice for each region: West, Northeast, South, Midwest, and East.]

Figure 4. Pie Chart

5.3 Check Sheet The Check Sheet is not a graphic display device; it is used to gather data easily, consistently, and in a standard format. A check sheet used to help control the quality of a process or product is a “checklist.” The main purpose of a check sheet is to help define the key parts of a process, so that nothing gets left out. Examples include a code inspection checklist and detailed test procedures.

5.4 Pareto Diagram A Pareto Diagram is a special type of bar chart (see Figure 5), used to identify problem areas: what do we need to fix first? What are the biggest fires to put out? It is useful in a software environment because defects tend to cluster in portions of code, so if we can identify where a lot of defects have already been found, we can probably find more defects in those same areas. Here we would typically find the defect rate for all major software components, and then plot the defect rate (Y axis) for each component (X axis). To be a Pareto diagram, we must list the components in descending order of defect rate, which automatically puts the most defect-ridden components on the left of the graph. There is an optional line on a Pareto diagram, which shows the cumulative percent of all cases for each component, going from left to right. Hence the line never decreases going from left to right, and always reaches 100% at the rightmost X value. In the context of defect rate analysis, this line would not have much meaning since defect rates cannot be added. A sample Pareto diagram, without the cumulative percent line, is shown in Figure 5.

[Figure: bar chart titled “Pareto Diagram of Component Defect Rate.” Y axis: Defect Rate (defects/KLOC), 0 to 25. X axis: Component (Interface, Query, Core, Import, Export, Reports). Data labels, in descending order: 22.3, 15.6, 13.3, 11.1, 8.9, 5.6.]

Figure 5. Pareto Diagram
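The ordering rule and the optional cumulative percent line can be sketched in a few lines of Python. The example below uses hypothetical defect counts rather than the defect rates in Figure 5, because the cumulative line is only meaningful for data that can be added; the function names are illustrative assumptions.

```python
def pareto_order(values_by_category):
    """Sort categories by value, largest first, as a Pareto diagram requires."""
    return sorted(values_by_category.items(), key=lambda item: item[1], reverse=True)


def cumulative_percent(ordered):
    """Running percent of the total, going left to right.

    Only meaningful for additive data such as counts, not for rates.
    """
    total = sum(value for _, value in ordered)
    running = 0
    result = []
    for category, value in ordered:
        running += value
        result.append((category, round(100.0 * running / total, 1)))
    return result


# Hypothetical defect counts per component (not taken from Figure 5).
defect_counts = {"Core": 30, "Interface": 50, "Reports": 5, "Query": 15}

ordered = pareto_order(defect_counts)
print(ordered)                      # bars, left to right
print(cumulative_percent(ordered))  # optional cumulative percent line
```

With counts, the last cumulative value is always 100%, which matches the description of the line above.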

5.5 Histogram (Bar Chart) There are many variations possible on the histogram.
• The bars can be oriented vertically (a.k.a. a column chart), or horizontally. Usually vertical bars are preferred.
• A histogram can contain one measure (a simple bar chart) or more than one. The latter is a stacked bar chart if the bars are placed on top of each other, or a clustered bar chart if the bars are sitting next to each other.
• The bars can represent the actual value of each measure (e.g. the number of problem reports in various statuses), or they can show the percent each measure contributes to the total (e.g. the percent of problem reports which are in each status). A third possibility, for real-valued data (as opposed to integer data), is that each bar can represent “the number of data points within a specific range of values.” So a histogram could show the distribution of problem report closure times, where one bar might be “the number of problem reports which closed in 2 or 3 days,” and another bar might be “the number of problem reports which closed in 1 day or less,” and so on.
This gives at least eighteen possible types of histogram (vertical vs. horizontal, simple vs. stacked vs. clustered, and actual values vs. percentages vs. data ranges; 2 × 3 × 3 = 18). Figures 6, 7, and 8 show examples of the simple, stacked, and clustered bar charts. Notice that the data in Figure 6 is the first cluster (leftmost) in Figure 8.
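The “data ranges” variation can be illustrated with the closure-time example above. The following Python sketch groups real-valued closure times into ranges and counts the data points in each one. The bin edges, labels, and sample data are assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical bins: (inclusive upper bound in days, label for the bar).
BINS = [
    (1, "1 day or less"),
    (3, "over 1, up to 3 days"),
    (7, "over 3, up to 7 days"),
]
OVERFLOW_LABEL = "more than 7 days"


def bin_label(closure_days):
    """Return the label of the first range that contains the closure time."""
    for upper_bound, label in BINS:
        if closure_days <= upper_bound:
            return label
    return OVERFLOW_LABEL


def histogram_counts(closure_times):
    """Count data points per range, keeping the ranges in axis order."""
    counts = Counter(bin_label(days) for days in closure_times)
    labels = [label for _, label in BINS] + [OVERFLOW_LABEL]
    return [(label, counts[label]) for label in labels]


# Hypothetical closure times, in days, for seven problem reports.
closure_times = [0.5, 1.0, 2.5, 3.0, 3.5, 6.0, 9.0]
for label, count in histogram_counts(closure_times):
    print(f"{label:22} {'#' * count} ({count})")
```

Each printed row corresponds to one bar of the histogram; empty ranges are kept so that gaps in the distribution remain visible.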

[Figure: simple bar chart titled “No. of Problem Reports by Status.” Y axis: No. of Problem Reports (0 to 250). X axis: Status (New, Closed, Pending, Withdrawn, Analysis, Testing).]

Figure 6. Simple Bar Chart

[Figure: stacked bar chart titled “Problem Reports in each Status over Time.” Y axis: Number of Problem Reports (0 to 500). X axis: Time (Jan-02 through May-02). Each bar stacks the statuses New, Closed, Pending, Withdrawn, Analysis, and Testing.]

Figure 7. Stacked Bar Chart

[Figure: clustered bar chart titled “Problem Reports in each Status over Time.” Y axis: Number of Problem Reports (0 to 300). X axis: Time (Jan-02 through May-02). Each cluster contains one bar per status: New, Closed, Pending, Withdrawn, Analysis, and Testing.]

Figure 8. Clustered Bar Chart (same data as Figure 7)

5.6 Run Chart (Line Chart) A run chart, by definition, graphs one or more measures over time. Hence the X axis for a run chart is always some form of date measure. By using straight lines to connect data points for each measure, the line chart allows more data to be presented than could easily fit on a histogram. Note: In Microsoft Excel, a “Line Chart” uses the same data format as a bar chart; only the representation of each data point is different. In order to plot real-valued time data (e.g. if you have specific data points at May 1, May 5, and May 6, 2002 instead of just a generic category May 02), you have to use a scatter plot instead and use straight lines to connect the data points. A sample run chart is shown in Figure 9.

[Figure: run chart titled “Number of Defects Found by Severity.” Y axis: Number of Defects Found (0 to 45). X axis: Time (Jan-02 through May-02). Lines: Minor, Major, and Total.]

Figure 9. Run Chart

5.7 Scatter Diagram Scatter diagrams are used to graph two measures against each other. They are generally used to determine if there is a correlation between the two measures, e.g. “does testing time decrease when more time is spent doing review?” In a scatter diagram the Y axis is the measure we wish to predict or understand more fully, and the X axis is the measure which affects the value of the Y axis. So in the example, the testing time would be the Y axis, and review time would be the X axis. A scatter diagram can be used to plot data points initially. Then, based on the type of relationship which appears to exist between the measures, a regression analysis is done to determine if that relationship is statistically meaningful. See “Statistics for Software Process Improvement” for more information on the statistical aspects of analyzing a scatter diagram’s data. A sample scatter diagram is shown in Figure 10. Notice that there does appear to be a correlation between an employee’s salary and educational level. This graph was generated from a fictitious school employee database, using the program SPSS.

[Figure: scatter diagram titled “Current Salary vs. Educational Level.” Y axis: Current Salary (0 to 60000). X axis: Educational Level (6 to 22).]

Figure 10. Scatter Diagram
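As a first step toward the regression analysis mentioned in section 5.7, a least-squares line and the correlation coefficient r can be computed directly from the plotted points. The Python sketch below uses hypothetical review and testing hours; it does not test statistical significance, which still requires the methods described in the statistics reference.

```python
import math


def linear_fit(xs, ys):
    """Return (slope, intercept, r) for a least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Hypothetical data: review hours (X) and testing hours (Y) for six work products.
review_hours = [2, 4, 6, 8, 10, 12]
testing_hours = [40, 37, 33, 30, 26, 24]

slope, intercept, r = linear_fit(review_hours, testing_hours)
print(f"testing = {slope:.2f} * review + {intercept:.2f}, r = {r:.3f}")
```

A negative slope with r close to -1 would suggest that testing time falls as review time rises, which is the kind of relationship the example question asks about.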

5.8 Control Chart The control chart is a special kind of line graph used to monitor performance of a measure. The main purpose of using a control chart is to provide a statistically sound basis for determining if a measure is under control. The main feature of a control chart is that there are three lines in addition to the measure itself. These three are the mean, the Upper Control Limit (UCL), and the Lower Control Limit (LCL). There is a set of rules for determining if your measure is “out of control” based on its behavior relative to the mean, UCL, and LCL. (See Memory Jogger II or Ishikawa’s book for details.) There are many kinds of control charts to meet a wide range of measure types.
• Variable Data control charts apply when the measure is a continuous (real-valued) quantity, such as temperature, pressure, cost, etc. The control charts for variable data are based on the mean value of the measure and some measure of its standard error.
• Attribute Data control charts apply when the measure is a discrete event, such as a defective part, an absent employee, or some other measure based on integer variables.
Of those, software tends to focus on defect-oriented measures, so Attribute Data control charts apply most often here. Within Attribute Data control charts, there are different methods based on whether each defect is counted separately (called Defect data), or whether entire units are passed or failed (Defective data). Then the type of control chart is selected based on whether the samples are constant size (rare in software), or variable size. Assuming the latter applies, the choice of control chart becomes this:
• If each defect is counted separately (Defect data), and the sample size may vary, use the ‘u’ type of control chart. This is the ‘number of defects per unit’ type of control chart.
• If each unit is passed or failed as a whole (Defective data), and the sample size may vary, use the ‘p’ type of control chart. This is the ‘fraction defective’ type of control chart.
After selecting the type of control chart, there are formulas for calculating the UCL and LCL. When the sample size varies, the control limits are recalculated for every data point of the measure. (If the sample sizes vary by less than +/- 20%, then the control limits can be fixed, which makes the chart easier to read.) A sample control chart is shown in Figure 11.

[Figure: control chart titled “Control Chart: Percent Passed.” Y axis: Percent Passed (77.000 to 93.000). X axis: samples 1 through 12. UCL = 92.3343, Average = 85.0833, LCL = 77.8324. Sigma level: 3.]

Figure 11. Control Chart (‘p’ type)
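This document does not list the control limit formulas, so the following Python sketch uses the commonly published three-sigma formulas for ‘p’ and ‘u’ charts with variable sample sizes. It is offered under that assumption; the function names and sample data are hypothetical, and the cited references remain the authority for the exact rules.

```python
import math


def p_chart_limits(defective_counts, sample_sizes, sigma_level=3):
    """Center line and per-sample (LCL, UCL) for a 'p' (fraction defective) chart."""
    p_bar = sum(defective_counts) / sum(sample_sizes)
    limits = []
    for n in sample_sizes:
        spread = sigma_level * math.sqrt(p_bar * (1 - p_bar) / n)
        # A fraction cannot fall below 0 or rise above 1.
        limits.append((max(0.0, p_bar - spread), min(1.0, p_bar + spread)))
    return p_bar, limits


def u_chart_limits(defect_counts, sample_sizes, sigma_level=3):
    """Center line and per-sample (LCL, UCL) for a 'u' (defects per unit) chart."""
    u_bar = sum(defect_counts) / sum(sample_sizes)
    limits = []
    for n in sample_sizes:
        spread = sigma_level * math.sqrt(u_bar / n)
        limits.append((max(0.0, u_bar - spread), u_bar + spread))
    return u_bar, limits


# Hypothetical 'p' data: units failed out of units inspected, two samples.
p_bar, p_limits = p_chart_limits([10, 40], [100, 400])
print(p_bar, p_limits)

# Hypothetical 'u' data: defects found per sample, sample size in KLOC.
u_bar, u_limits = u_chart_limits([4, 16], [1, 4])
print(u_bar, u_limits)
```

Larger samples give narrower limits, which is why the limits change from point to point when the sample size varies.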

5.9 Cause and Effect Diagram (Fishbone) A cause and effect diagram isn’t a method for presenting measures graphically. It is a tool for analysis of the causes of some undesired effect (problem). It’s called a fishbone diagram because it resembles the skeleton of a fish. A sample cause and effect diagram is shown in Figure 12. Cause and effect diagrams are generally generated in a meeting setting, by brainstorming about some problem. The diagram is a way of capturing everyone’s thoughts. The ‘head’ of the diagram is a box containing the effect you wish to cure. A horizontal line from the head forms the backbone of the ‘fish.’ The lines at 45 degrees to the backbone are the major types of causes of the effect. Some common choices for major causes are:
• Machines (Plant, or equipment)
• Methods (or Processes)
• Materials (raw materials for the product)
• People (staffing)
• Policies (high level rules)
• Procedures (steps in a task)
• Environment (facilities)
• Measurement (data collection)
The choices for major causes need to be customized for each problem. Generally at least four major causes are selected. Specific possible causes of the effect are shown by horizontal lines which point to each major cause. Each possible cause can be elaborated upon by adding more lines to show the causes of the possible cause. This process can continue to any desired level of detail. The main questions being asked to generate the cause and effect diagram are:
• Why does that happen?
• What causes that?
• What influences that?
The fishbone diagram can be used to support causal (root cause) analysis.

Figure 12. Cause and Effect Diagram (from Kara Fagley’s overview of Ishikawa’s tools)
