Cost Estimation Methods for Software Engineering

By

Andre Ladeira

Dissertation submitted in partial fulfillment of the requirements for the degree

Magister Ingeneriae in Engineering Management in the Faculty of Engineering at the Rand Afrikaans University

Supervisor: Prof L Pretorius

January 2002

Executive Summary

This dissertation summarizes several classes of software cost estimation models and techniques. Experience to date indicates that expertise-based techniques are less mature than the other classes of techniques (algorithmic models), but that all classes of techniques are challenged by the rapid pace of change in software technology. The primary conclusion is that no single technique is best for all situations, and that a careful comparison of the results of several approaches is most likely to produce realistic estimates.

As the pressure for accurate cost estimation increases, research attention is now directed at gaining a better understanding of the software-engineering process as well as constructing and evaluating software cost estimation tools. This dissertation evaluated four of the most popular algorithmic models used to estimate software cost (SLIM, COCOMO II, Function Points and SLOC).

This dissertation also provides an overview of the baseline cost estimation model tailored to these new forms of software engineering. The major new modeling capabilities are an adaptable family of software sizing models, involving Function Points and Source Lines of Code. These models are serving as a framework for an extensive current data collection and analysis effort to further refine and calibrate the model's estimation capabilities.

Index

Chapter 1
Introduction
1.1. Background to the problem
1.2. Background literature
1.3. Problem Statement
1.4. Research objective
1.5. Conclusion

Chapter 2
Estimation Processes
2.1. Software Cost
2.2. Software Cost Estimation Process
2.3. Estimation and the software process
2.4. Inputs and Outputs to the Estimation Process
2.5. The Estimation Process
2.6. Timing of the estimates
2.7. Estimation Constraints
2.8. Data gathering
2.9. Problems with the Cost Estimation Process
2.10. Problems with Requirements
2.11. Conclusion

Chapter 3
Size Estimation
3.1 Lines of code
3.2 Function Point Analysis
3.3 Conclusion

Chapter 4
Estimation Methods
4.1. Software Life-cycle Management (SLIM) method
4.2. Constructive Cost Model (COCOMO II)
4.3. Expertise-Based Technique
4.4. Cost Estimation method
4.5. Conclusion

Chapter 5
Case Study
5.1 Project description
5.2 Project Size estimation
5.3 Effort estimation
5.4 Conclusion

Chapter 6
Conclusions and Recommendations
6.1 Conclusions
6.2 Recommendations
6.3 Further Investigation

References

Glossary

Appendix A
Scaling Drivers

Appendix B
Architecture / Risk Resolution

Appendix C
Team Cohesion

Appendix D
Process Maturity

Appendix E
Product Complexity

Appendix F
Effort multipliers

Appendix G
Nu Metro Server Technical Specification

List of figures

Figure 1.1: Influencing factors to be evaluated to produce an accurate estimate
Figure 1.2: Information to be used to predict scenarios on future projects
Figure 1.3: Estimation principle
Figure 2.1: Classical view of software estimation process
Figure 2.2: Actual cost estimation process
Figure 3.1: Definition checklist for source statements counts
Figure 4.1: Rayleigh curve

List of tables

Table 1.1: Project levels of complexity
Table 3.1: Function point complexity matrix
Table 3.2: Function point complexity-weight matrix
Table 4.1: Rating scheme for the COCOMO II scale factors
Table 4.2: Effort multipliers cost driving rating for the post-architecture model
Table 4.3: Early design and post-architecture cost driver

Chapter 1

Introduction

1.1. Background to the problem

"If there is one management danger zone to mark

above all others, it is software cost estimation."

Robert Glass - Building

The reason for the strong emphasis on software engineering cost estimation is that it provides the vital link between the general concepts and techniques of economic analysis and the particular world of software engineering. There is no good way to perform a software cost-benefit analysis, breakeven analysis, or make-or-buy analysis without some reasonably accurate method of estimating software engineering costs, and their sensitivity to various product, project, and environmental factors. Software engineering cost estimation techniques also provide an essential part of the foundation for good engineering management.

Cost in a project is also due to the requirements for software, hardware and human resources. The bulk of the cost of software development is due to the human resources needed, and most cost estimation procedures focus on this aspect. Most cost estimates are determined in terms of person-months (PM).

1.2. Background literature

As the cost of the project depends on the nature and characteristics of the project, at any point, the accuracy of the estimate will depend on the amount of reliable information that is available about the final product [4][27]. When the project is being initiated or during the feasibility study, the analysts have only some idea of the data the system will get and produce and the major functionality of the system. There is a great deal of uncertainty about the actual specifications of the system. As the user specifies the system more fully and accurately, the uncertainties are reduced and more accurate cost estimates can be made. Despite the limitations, cost estimation models have matured considerably and generally give fairly accurate estimates.

By far, the project sizing technique that delivers the greatest accuracy and flexibility is function point analysis [24]. Based upon logical, user-defined requirements, function points permit the early sizing of the software problem domain. In addition, the function point methodology presents the opportunity to size a user requirement regardless of the level of detail available. An accurate function point size can be determined from the detailed information included in a thorough user requirements document, or an adequate function point size can be derived from the limited information provided in an early proposal.

An alternative sizing method is counting lines of code [20]. It is dependent upon information that is not available until later in the development life cycle. Function points, by contrast, accurately size the stated requirement. If the problem domain is not clearly or fully defined, the project will not be properly sized. When there are missing, brief, or vague requirements, a simple process using basic diagramming techniques with the requesting user can be executed to more fully define the requirements.

In addition to the project size, project complexity must be properly evaluated [Matson]. To some extent, complexity levels are evaluated by 14 general system characteristics:
• Data communication
• Distributed data processing
• Performance
• Heavily used configuration
• Transaction rate
• Online data entry
• End-user efficiency
• Online update
• Complex processing
• Reusability
• Installation ease
• Operational ease
• Multiple sites
• Facilitate change

The assessment of a project's complexity should also take into consideration complex interfaces, database structures, and contained algorithms. The assessment of complexity can be based upon five varying levels of complexity, as shown in Table 1.1:

Level 1:
• Simple addition/subtraction
• Simple logical algorithms
• Simple data relationships

Level 2:
• Many calculations, including multiplication/division in series
• More complex nested algorithms
• Multidimensional data relationships

Level 3:
• Significant number of calculations typically contained in payroll/actuarial/rating/scheduling applications
• Complex nested algorithms
• Multidimensional and relational data relationships with a significant number of attributive and associative relationships

Level 4:
• Differential equations typical
• Fuzzy logic
• Extremely complex logical and mathematical algorithms typically seen in military/telecommunications/real-time/automated process control/navigation systems
• Extremely complex data

Level 5:
• Online, continuously available, critically timed
• Event-driven outputs that occur simultaneously with inputs
• Buffer area or queue to determine processing priorities
• Memory, timing, and communication constraints

Table 1.1: Project levels of complexity [19]


The capability to deliver software depends on a variety of risk factors that influence a development organization's ability to deliver in a timely and economical fashion. Risk factors include such things as the software processes that will be used, the skill levels of the staff (including user personnel) who will be involved, the automation that will be utilized, and the influences of the physical environment (development conditions) and the business environment (competition and regulatory requirements). In fact, numerous factors influence the ability to deliver high-quality software on time. Figure 1.1 categorizes some examples of influencing factors that must be evaluated to produce an accurate estimate.

MANAGEMENT • Team Dynamics • High Morale • Project Tracking • Project Planning • Automation • Management Skills

DEFINITION • Clearly Stated Requirements • Formal Process • Customer Involvement • Experience Level • Business Impact

DESIGN • Formal Process • Rigorous Reviews • Design Reuse • Customer Involvement • Experienced Development Staff • Automation

BUILD • Code Review • Source Code Tracking • Project Tracking • Project Planning • Automation • Management Skills

TEST • Formal Testing Methods • Test Plans • Development Staff Experience • Effective Test Tools • Customer Involvement

ENVIRONMENT • New Technology • Automated Process • Adequate Training • Organizational Dynamics • Certification

Figure 1.1: Influencing factors to be evaluated to produce an accurate estimate [19].

This information can be used to predict and explore "what-if" scenarios on future projects (see Figure 1.2).


[Diagram labels: Estimate Project Completion; Assess: Size, Complexity, Influence Factors; Rate of Delivery; Baseline of Performance; Create Profile; Select a Baseline Profile: Rate of Delivery, Time to Market, Defects]

Figure 1.2: Information to be used to predict scenarios on future projects

An organization should develop profiles that reflect the rate of delivery for a project of a given size, complexity, and risk factors [12].

1.3. Problem Statement

At the core of the estimating challenge are two issues [14]: the need to understand and express (as early as possible) the engineering problem domain, and the need to understand the capability to deliver the required software solution within a specified environment. Only then is it possible to accurately predict the effort required to deliver a project.

The current engineering problem domain can be defined simply as the scope of the required software. The problem domain must be accurately assessed for its size and complexity. To complicate the situation, experience shows that at the point in time when an initial estimate is required (early in the system's life cycle), it cannot be presumed that all the necessary information is available. Therefore, there must be a rigorous process that permits a further clarification of the problem domain.

An effective estimating model considers three elements: size, complexity, and risk factors. When factored together, they result in a more accurate cost estimate (see Figure 1.3).


[Diagram: Project Size × Complexity (Definition) × Risk Factors (Capability) = Project Estimates (Schedule, Effort, Costs)]

Figure 1.3: Estimation principle

1.4. Research objective

The objectives of this dissertation are as follows:
• Determining the software engineering cost estimation principle and process.
• Investigating different size estimation methods and determining the difference between the different methods.
• Investigating different effort estimation methods or techniques and determining the difference between the methods or techniques.

1.5. Conclusion

The structure of the research dissertation will be as follows:
• Chapter two will cover a comprehensive literature review of the subject matter and related fields.
• Chapter three will cover an investigation into two different size estimation methods (function points calculation and source line of code count).
• Chapter four will cover an investigation into three different effort and cost estimation methods used in the software engineering industry.
• Chapter five will cover a case study in which the estimation methods were applied.
• Chapter six contains the conclusions and recommendations based on the findings made.


Chapter 2

Estimation Processes

2.1. Software Cost

Despite the terminology, software engineering cost does not refer directly to a monetary value associated with software development. Such a value is almost impossible to arrive at and not always useful. The questions are "What's the effort involved?" and "How long will it take?" The answers to these two questions can then be translated to the monetary value. This leads to the following definition of software cost.

Software cost consists of three elements [9]:
• Manpower loading is the number of engineering and management personnel allocated to the project as a function of time.
• Effort is defined as the engineering and management effort required to complete a project, usually measured in units such as person-months. The types and the levels of skills for the resources influence the cost of the project.
• Duration is the amount of time (usually measured in months) required to complete the project.
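The three elements are related: if manpower loading is recorded as a head count per month, effort in person-months is the sum of those head counts and duration is the number of months covered. The sketch below illustrates this relationship only; the function name and the six-month loading profile are hypothetical and are not taken from the sources cited above.

```python
def effort_and_duration(loading):
    """Return (effort in person-months, duration in months).

    `loading` lists the head count allocated in each month of the project
    (an assumed, simplified representation of manpower loading).
    """
    if any(staff < 0 for staff in loading):
        raise ValueError("head count cannot be negative")
    effort = sum(loading)    # one person working one month = 1 person-month
    duration = len(loading)  # one list entry per calendar month
    return effort, duration


# Hypothetical six-month project that ramps up and then winds down.
loading_profile = [2, 4, 6, 6, 4, 2]
effort, duration = effort_and_duration(loading_profile)
print(f"effort = {effort} person-months, duration = {duration} months")
```

The same effort can be spread over different loading profiles, which is why the three elements are estimated together rather than in isolation.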

Arriving at a cost estimate involves using a number of different factors to try to determine the overall cost of a system. Deciding which factors to include and combining them to arrive at the estimate make up the engineering cost estimation process that is defined as follows [14]:

Direct costs include items such as analysis, design, coding, testing and integration. Depending on who is doing the engineering and why, software cost may also include a number of other items such as training, customer support, installation, level of documentation, configuration management, and quality assurance.

2.2. Software Cost Estimation Process

A software cost estimation process is the set of techniques and procedures that an organization uses to arrive at a software cost estimate. Generally there is a set of inputs to the process (e.g., system requirements) and an output of effort, manpower loading, and/or duration.

It is very difficult to examine the software cost estimation process without the overall context of the software development process in use within a given organization.

The set of procedures, techniques, and standards that an organization uses for organizing, managing, and controlling software development projects is called the software process.

Organizations have different software processes, depending on the type of software they are developing. For many organizations, the development process is very informal; in other cases it is well documented and stringently monitored.

2.3. Estimation and the Software Process

The cost of a project can be estimated for a number of reasons. Why it is done is an important factor in determining when and how it is done. The reasons why a cost estimation process is undertaken include the following [14]:
• Project approval. For every project there must be a decision by the organization to undertake the project. Such a decision requires an estimate of the money and resources required to complete it.
• Project management. Project managers are responsible for planning and control of projects. Both activities require an estimate of the activities required to complete a project and the resources required for each activity.
• Project team understanding. For members of a project team to work together more efficiently on a project, it is necessary that each one understands his/her role in the project and the overall activities of the project. A project task definition, which can be used for this purpose, is generated by a cost estimate.

The "why" of the cost estimation process can be any of the above reasons and is one of the factors determining when the estimate is done. Project approval requires estimates to be performed very early in the project life cycle, often before requirements have been clearly specified. The project approval process typically has a number of points where a "go/no go" decision must be made. At each of these points, an estimate may be required to permit management to make the decision. Early in the project life cycle, these may be approximate order of magnitude estimates sufficient to allow the organization to determine whether they should continue to look at a project. Late in the project, management can get much more detailed estimates of cost to completion in order to decide whether to cancel an ongoing project.

For managing and understanding a project, an estimate can be done early in the development of the project to arrive at an initial estimate, and then repeated on a regular basis during development to keep the estimate current [1][2][3]. For these estimates the prime concern is not necessarily the absolute "cost," but the estimated set of tasks required to complete the project, the results of each of these tasks, how these tasks fit together, and the resources required to complete each task.


Re-estimates are required throughout the development cycle regardless of why the estimate is done. As a project progresses, more information is available on the product and the process being used to develop it. This information can be used to increase the accuracy and detail of the estimate.

2.4. Inputs and Outputs to the Estimation Process

The software cost estimation process computes a set of outputs as a function of a set of inputs. The inputs to the estimation process depend on when the estimate is being performed. Very early estimates are necessarily based on sparse and incomplete data regarding the project and the development process.

Preliminary estimates are needed before requirements are known or architecture has been defined [22]. Such estimates will necessarily be based on sketchy data and will not have a high degree of accuracy. Estimates performed late in the development cycle are based on a much wider set of information. Computing cost to completion late in the development cycle allows a great deal of project and process information to be used. Given that more information is available, more detailed estimates can be made, which have a much greater degree of accuracy than the initial estimates.

Most models of cost estimation view the estimation process as being a function computed from a set of cost drivers. These drivers are assumed to be the characteristics of a system that determine the final cost of production. In most of the advocated cost estimation techniques, the primary cost driver is assumed to be the software requirements [2][3][10]. In this model of software cost estimation (illustrated in Figure 2.1), the requirements are the primary input to the process and form the basis for the estimate. The estimate is then adjusted according to a number of other cost drivers (such as experience of personnel and complexity of system) to arrive at the final estimate.


In this classical view, the effort, duration, and loading are computed as fixed numbers (perhaps with tolerances), or a set of relationships between the values is given, allowing managers to trade off costs in order to minimize any of the three values.

[Diagram: requirements and other cost drivers are the inputs to the software cost estimation process, which outputs effort, duration, and loading.]

Figure 2.1: Classical view of software estimation process [22]

In fact, the cost estimation process can be much more complex than that portrayed in Figure 2.1. There is interdependency between many items of information, all of which are relevant to the cost estimation process (Figure 2.2).

Many of the data items that are inputs to the cost estimation process are modified and output by the process. Thus, rather than viewing the cost estimation process as a function of the requirements, it is often more accurate to view this process as trying to satisfy a set of constraints. The inputs to the system are a set of constraints on the requirements, software architecture, financial resources, etc., while the outputs are a cost estimate and a set of assumptions that satisfy all the constraints.

This view allows the constraints to be imposed on any of the factors that affect the cost. These factors range far beyond requirements to include issues such as delivery date, finances and software process.

Requirements are viewed as constraints that must be satisfied. In a few cases, these requirements are fixed, complete, and correct. In most cases, however, during estimation the estimator detects inconsistencies and ambiguities in the requirements. As part of the estimation process, the estimator will resolve some of these ambiguities by imposing new constraints on the requirements. In other cases, the problems with the requirements remain, with a corresponding effect on the accuracy of the estimate.

Financial, calendar, manpower, architectural, and software process constraints are also significant to the cost estimation process. Financial, calendar, and manpower constraints limit the amount of resources that can be allocated to a project. Financial constraints limit the amount of money that can be budgeted for the project; calendar constraints specify a delivery date that must be met; and manpower constraints limit the number of people that can be allocated to the project. For example, if a fixed amount of money is available for a project, then the estimated cost should satisfy this financial constraint, perhaps by varying the functionality.
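The example of meeting a fixed budget by varying the functionality can be sketched as a simple selection problem. The code below is an illustration under stated assumptions, not a method described in the sources: the function name, the feature list, the priority scale and the cost figures are all hypothetical, and a greedy pass in priority order is only one of several ways such a trade-off could be made.

```python
def fit_to_budget(features, budget):
    """Select features in priority order, keeping each one that still fits.

    `features` is a list of (name, priority, cost) tuples; a lower priority
    number means a more important feature. Returns (kept_names, total_cost).
    """
    kept, total = [], 0.0
    for name, _priority, cost in sorted(features, key=lambda f: f[1]):
        if total + cost <= budget:  # financial constraint still satisfied
            kept.append(name)
            total += cost
    return kept, total


# Hypothetical features with estimated costs in arbitrary currency units.
candidate_features = [
    ("reporting", 2, 30.0),
    ("login", 1, 20.0),
    ("export", 3, 40.0),
    ("audit log", 4, 10.0),
]
kept, total = fit_to_budget(candidate_features, budget=65.0)
print(kept, total)
```

Here the output of the estimate is both a cost and a modified set of requirements, which matches the constraint-satisfaction view of the process.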

The software architecture defines the different components used to construct the system and the interrelationships between these components. The stage in the development life cycle determines whether the software architecture is a factor for the estimation process. For example, maintenance organizations that are working with an existing system are constrained to use the existing architecture and can base their estimates on this architecture.

The cost estimation process for new development may make no assumptions about the software architecture and base the estimate entirely on system functionality. For many larger contracts, the software process becomes one of the constraints that must be satisfied by the estimating process. Many organizations have within their software process a standard Work Breakdown Structure (WBS), which defines the tasks to be performed to complete a project. Frequently, the estimating process will be working under the constraint that the standard WBS must be used for a project. The estimating process will then tailor the WBS to the specific project, adding sufficient detail.

For example, one situation where constraints on the software process affect the estimation process is the requirement to develop according to the ISO 9000 standard. Significant cost is incurred by adhering to this standard; for small changes, ISO 9000 can actually be the dominant cost factor. When estimating a system developed to this standard, estimators must be aware of the cost incurred by use of the standard.

[Diagram labels: Vague requirements; Less vague (and modified) requirements; Cost drivers; Other cost drivers; Software cost estimation process; Loading; Constraints; Contingency; Tentative WBS; Less fuzzy architecture; Other inputs]

Figure 2.2: Actual cost estimation process [22]

Aside from the various constraints, other factors that must be included as part of the estimation process are the risks associated with the project. These risks could include, for example, dependency on outside contractors, lack of experience in the application domain, etc. These risk factors should be identified as early as possible in order to include them in the decision-making and project management processes.

2.5. The Estimation Process

An estimate is arrived at by taking the identified constraints, applying the estimation process, and generating results that satisfy all the constraints. A variety of techniques are used by different organizations to arrive at these estimates. The processes used can be classified as either model based or analogy based.

Model-based estimation builds a costing model of system development based on the characteristics of the system being built, the process being used to build it, and the development environment.

A model can be a formal mathematical model or a set of informal guidelines used by an estimator. Informal models are used by experienced developers who have gained sufficient knowledge about system development by working on previous projects. The informal model used by such an estimator is expressed as a set of "rules of thumb" or, at an even more primitive level, as a "gut feel" [30]. When questioned as to how they developed their model and how they apply it, estimators are usually unable to say exactly what it is they do. It appears to be an issue of gaining the required experience in order to arrive at accurate estimates.

Formal models attempt to quantify all inputs to the cost estimation process, and then apply a set of equations that describes the relationships between the inputs and the outputs of the cost estimation process. The equations are developed through analysis of historical data and must be calibrated to each individual development environment. The best known formal models are Boehm's COCOMO II [2][4], function points, and Putnam's application of Rayleigh curves to the development process [27].


The usual method of applying the formal model is to transform the requirements into a measure of the "size" of the system. This size measure, which can be either SLOC (Source Lines of Code) or FPs (Function Points), is used as the basis for creating the cost estimates. The estimator can also quantify a set of other cost drivers, examples of which include:
• Product attributes, e.g., required reliability, product complexity, etc.
• Computer attributes, e.g., memory constraints.
• Personnel attributes, e.g., applications experience, programming language experience.

These cost drivers become multipliers that can be used to increase or decrease the initial estimate. The bulk of the current literature and research on cost estimation is devoted to formal models, particularly as it relates to new system development [2][4][27].
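The way cost drivers act as multipliers on an initial estimate can be sketched as follows. This is a minimal illustration, assuming the multipliers are simply multiplied together; the driver names echo the examples above, but the numeric values are hypothetical and are not taken from any published calibration table.

```python
import math


def adjust_estimate(nominal_effort, multipliers):
    """Scale a nominal effort estimate by the product of cost-driver multipliers.

    A multiplier above 1.0 increases the estimate; one below 1.0 decreases it.
    """
    return nominal_effort * math.prod(multipliers.values())


# Hypothetical ratings chosen for illustration only.
drivers = {
    "required_reliability": 1.10,     # product attribute
    "memory_constraint": 1.05,        # computer attribute
    "applications_experience": 0.90,  # personnel attribute
}
adjusted = adjust_estimate(100.0, drivers)  # nominal estimate: 100 person-months
print(f"adjusted effort: {adjusted:.2f} person-months")
```

Because the multipliers compound, a few unfavourable ratings can move the estimate well away from the nominal value, which is one reason calibration to the local environment matters.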

Analogy-based estimating processes estimate costs by comparing the current development project with previous development projects undertaken by the organization. An analogy-based technique requires maintenance of a history of past projects; this information can be used as a reference point. Past projects with properties similar to the current project are identified and their costs used as a basis for estimating the current project.

At the most informal level of analogy-based techniques, the history of past projects is maintained in the estimator's memory. Finding past projects with properties similar to the current project involves the estimator recalling similar projects and the costs involved in them. Such an approach is highly dependent on the memory of the individual estimators and on very low employee turnover.

The analogy-based approach can be made more rigorous in a number of ways. The history of past projects can be maintained as a computerized database, with detailed metrics and descriptions of characteristics recorded for each project. Using a historical database, an estimator can query the database searching for projects with similar characteristics and then base the estimate on actual costs and process of the previous projects. Such an approach avoids the fallibility of human memory and provides a much more detailed historic record of what occurred in the course of a project [9].
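Querying a historical database for similar projects can be sketched as a nearest-neighbour lookup. The code below is an illustration under stated assumptions: the record fields, the similarity measure and the project data are all hypothetical, and averaging the effort of the closest matches is only one simple way of turning analogies into an estimate.

```python
from dataclasses import dataclass


@dataclass
class PastProject:
    name: str
    size_fp: float           # size in function points (assumed positive)
    team_experience: int     # 1 (low) to 5 (high), hypothetical scale
    actual_effort_pm: float  # recorded effort in person-months


def estimate_by_analogy(history, size_fp, team_experience, k=2):
    """Average the actual effort of the k most similar past projects."""
    if not history:
        raise ValueError("history must contain at least one project")

    def distance(project):
        # Relative size gap plus experience gap normalised to the 1-5 scale.
        size_gap = abs(project.size_fp - size_fp) / max(project.size_fp, size_fp)
        experience_gap = abs(project.team_experience - team_experience) / 4
        return size_gap + experience_gap

    nearest = sorted(history, key=distance)[:k]
    return sum(p.actual_effort_pm for p in nearest) / len(nearest)


# Hypothetical project history.
history = [
    PastProject("A", 100, 3, 10.0),
    PastProject("B", 200, 3, 22.0),
    PastProject("C", 500, 2, 60.0),
    PastProject("D", 220, 4, 20.0),
]
estimate = estimate_by_analogy(history, size_fp=210, team_experience=3)
print(f"estimated effort: {estimate} person-months")
```

In practice the choice of characteristics and how they are weighted would need to reflect what the organization has found to drive cost in its own projects.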

2.6. Timing of the estimates

Estimation is not a task done once, at the beginning of a project. Rather, estimates and re-estimates are undertaken throughout the life of a project [7][8][10]. The success of an estimator is not necessarily the accuracy of the initial estimates, but rather the rate at which the estimates converge to the actual costs. The timing of estimates depends on the type of organization involved and why the estimate is being performed.
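The idea that an estimator is judged by how quickly successive estimates converge on the actual cost can be illustrated by tracking the relative error of each re-estimate. The sketch below assumes relative error as the measure, which the sources do not prescribe, and uses hypothetical figures.

```python
def relative_errors(estimates, actual):
    """Relative error of each successive estimate against the actual cost."""
    if actual <= 0:
        raise ValueError("actual cost must be positive")
    return [abs(estimate - actual) / actual for estimate in estimates]


def is_converging(errors):
    """True if no re-estimate is further from the actual cost than the one before."""
    return all(later <= earlier for earlier, later in zip(errors, errors[1:]))


# Hypothetical estimates (person-months) made at four points in a project.
estimates = [60.0, 80.0, 95.0, 100.0]
errors = relative_errors(estimates, actual=100.0)
print(errors, is_converging(errors))
```

The actual cost is only known once the project is complete, so a check of this kind is a retrospective measure that can be recorded in the project history.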

Contractors usually perform two estimates early in the development life cycle. The first is done to prepare a bid for the contract, usually in a relatively quick fashion, with the objective of arriving at a winning bid. The timing of this bid is very much dependent on the procuring agency that issues the Request for Proposal (RFP). The contractor is required to generate an estimate at this point, basing it on information within the RFP and obtained informally from the contracting agency.

Upon winning a bid, most contracting organizations immediately undertake a second, more detailed, estimation process. The objective of this estimate is to develop a more accurate and detailed cost estimate and project plan which are based on the previous estimate and WBS. Frequently, much discussion between the contractor and the agency is necessary to deal with previously undetected issues and problems in the requirements.


2.7. Estimation Constraints

An estimation process involves arriving at an estimate that satisfies the constraints. These constraints vary depending on the timing of the estimate and the organization performing the estimate, but can include:
• System requirements.
• Delivery date.
• Financial.
• Manpower resources.
• System architecture.
• Software process.

When preparing a bid to develop new software, a contracting organization is usually faced with constraints on system requirements, delivery date, manpower resources, and software process. Depending on the system under construction, constraints may be placed upon the architecture. The constraints on the requirements of the system vary considerably among projects. Some projects have requirements which are well understood and well documented within the RFP. In these cases, the constraints on the requirements are well understood by all parties involved. However, in many cases, requirements are not clearly understood up front, or are flexible in terms of the actual functionality to be delivered as part of the end product.

Delivery date and financial resources are constraints that are very firm and have a large impact upon a contractor's preparation of a bid for estimation purposes. There are two reasons that these constraints are imposed upon contractors. First, the procuring agency has a budget and timetable, which they are under pressure to meet and which they are not willing to exceed. Second, there will be competing bids submitted.


Once the bid has been won, the contractor performs another more detailed estimate [8][10]. This estimate is in many ways more realistic because there is less pressure to satisfy financial constraints; it is usually done by the project manager to determine how much the system is really going to cost. Although financial constraints affect the process, the manager usually defines in much more detail the functionality of the system and the process used to develop the system. This results in a more accurate estimate and can determine whether the system may be built for the contracted price.

Re-estimates done by contractors during development involve modifying the duration, effort, and functionality. As understanding of the tasks increases, more accurate estimates can be made regarding effort and duration. As the requirements of the system are better understood, they can be re-estimated and appropriate modifications made to the effort and duration estimates.

From a procuring agency's perspective, estimates are performed under a different set of constraints. Project Directors try to balance the following constraints while getting approval for the project:

• Financial. How much money is the organization willing to put into this project?
• Calendar. When do I have to show results to keep management satisfied?
• Requirements. What is the functionality required of the system?

Each of these constraints has a different level of priority, depending on the particular project. Once project development begins, control of the project passes from the Project Director to the Project Manager (PM). At this point budgetary approval has been received and all previous estimates are considered to be cast in stone. Thus, there is great pressure on the PM not to change any of the previous estimates.


The PM must decide in what order to sacrifice the financial, calendar, and requirements constraints. Different PMs have different approaches; generally they try to maintain the functionality of the system, but let either the calendar or financial constraints slip. In reality, however, it appeared that if the original estimates were incorrect, all of the constraints were affected.

2.8. Data gathering

It seems obvious that without knowledge of the past, it is impossible to predict what may happen on future projects. (Even with knowledge of the past, there is still no guarantee that the future can be predicted.) A corollary is that if an organization wants to improve its cost estimation process, it should gather relevant data on previous projects.

The simplest way to gather data is to have a stable work force so that project and process data are maintained in the memory of the individuals of the organization. The individuals can then use this information to estimate costs of other projects. However, relying on individuals' imperfect memories is barely sufficient for small projects; for large projects it is completely inadequate.

Even if this information is gathered, it is often done for financial purposes and is not used by software managers to estimate the cost of future projects. There are a number of reasons why this data may not be useful [27][30]:

• The data is not accurate. If the primary perceived purpose of time sheets is to monitor the staff, the accuracy of the figures in the time sheets must be questioned.
• The data is not accessible. Often time sheets are gathered for the benefit of the financial department rather than to assist estimators. Thus, they are kept on systems not easily accessible to estimators, or worse, are simply stored as masses of paper files.
• The data is not broken down in a useful way. The overall cost of a project has limited usefulness. What is usually of more interest to an estimator is how the project was broken down into activities and the cost of each of these individual activities.

2.9. Problems with the Cost Estimation Process

What factors make software cost estimation difficult? There are situations where a high level of accuracy in cost estimation can be found; many of these situations were identified by the following characteristics [3]:

• The users are experienced in the system, know what they want, and can express what they want.
• The requirements are clear, precise, correct, and complete.
• The project duration is short.
• The manpower loading is small.
• The people doing the estimation are experienced in the application domain and have developed similar systems.
• The development environment and development process are familiar to all people involved.
• Staff turnover is low both among the developers and the users.
• No unfamiliar software or hardware from outside suppliers is to be integrated with the final product.

A project satisfying the above characteristics frequently resulted in accurate cost estimates. However, most of the projects did not satisfy the above conditions and therefore the estimates produced were not accurate. The characteristics needed for accurate estimates can be reversed in order to enumerate problems leading to inaccurate estimates:

• Problems with the requirements.
• Issues in maintenance.
• Procurement process.
• System size.
• Software process and process maturity.
• Monitoring progress of the project.
• Lack of historical data.
• Lack of application domain expertise.
• Embedded software.

2.10. Problems with Requirements

Almost without exception, organizations blame problems with the requirements as a major reason why cost estimates were inaccurate. The problems are numerous: requirements may be incomplete, ambiguous, inconsistent, incorrect, or incomprehensible.

The problem of users not understanding the requirements existed for all types of systems and all types of developments. For new development projects, users would request systems (and quotes) before there was a complete understanding of the problem or the solution.

Cost estimates can be made without a clear understanding of the requirements of the system being built; it must be accepted that these estimates have a very high likelihood of error.

Requirements creep. As projects progress and the knowledge of the problem increases, it seems inevitable that users (and developers) request more and more features and changes to be included in the product. Thus, over the development of the project, new features work their way into the requirements, leading to "requirements creep" (or, as Boehm described it, "requirements gallop"). New feature requests come from many sources and for many reasons, but the problem seems to be universal.

Correct and complete requirements for complex systems are impossible to achieve. A fact that must be accepted is that a complete statement of the requirements cannot be defined before development begins [14]. This has nothing to do with the competence of the users or the developers but rather is inherent in the nature of complex computer system applications. Unless the system being developed is almost identical to a previously developed system, the requirements will invariably be wrong and/or incomplete. As a project evolves, users and developers gain a better understanding of the problem and of the solutions, and as that understanding improves the requirements evolve.

One frequent assumption is that the requirements will be firm before development begins. Anyone working under this assumption will meet serious problems when trying to estimate software costs accurately.

Since the requirements are probably wrong or incomplete, it is unlikely that the estimates based on those requirements will be accurate. If the requirements are included as part of the RFP put out by a procurement agency and a contractor is expected to submit a firm bid based on those requirements, a frequent result later in the development stages is confrontation between the contractor and the agency as they argue over the meaning of each requirement and the cost associated with the changing requirements.

Long development time, leading to requirements that are obsolete before the system is delivered [8][10]. The rate of change in technology is so fast that any attempts to predict what the technology will be in a few years are doomed to failure. As the technology changes, so does the range of solutions to problems, and so do the users' expectations of the solutions. Projects with a long time between initiation and expected delivery suffer in that the solution is usually obsolete by the time it is delivered. The customer is dissatisfied because the product does not satisfy the new requirements.

Large staff turnover for end users, resulting in changing requirements as new staff arrive. Developing software systems requires a consistent user base throughout the development cycle. If the user base changes too frequently, requirements continually change, and it is difficult for developers to obtain consistent answers and comments from the end users.


2.11. Conclusion

All private businesses have two goals in common:

• Ensure that a profit is made
• Ensure their survival

To ensure that this happens, every project taken on must leave the business no worse off than it was before the project started. This is only possible when the initial cost estimate is complete and accurate.

To determine the cost of a software project, whether it is low-level software integration or high-level web page development, the process is no different. The estimation process has many unknown factors that must be determined before estimation can start. The following factors must be considered:

• The software process. Most software engineering firms have their own management methodology for developing software. These differences can influence the cost estimation process: there may be more documentation or formal processes that must be completed before development can move to the next step.
• There are more inputs to be considered than in previous years of software cost estimation. Previously the only considerations taken into account were the system requirements and cost drivers. Today there are more factors to consider, among them the company's software process, financial constraints, risk factors and the specific software architecture.
• System requirements. In some cases the software to be engineered is a new system based on newly released technology. No data or experienced manpower is available, and a steep learning curve must be taken into consideration.

Another obstacle in the cost estimation process is the specific requirements set by the client. In many cases these requirements are vague, incomplete and ambiguous. The system analyst or project manager must set up a task team to determine the complete and correct requirements. This process can be time consuming and sometimes expensive.

The time allocated for responding to a request for proposal (RFP) is often inadequate. The project manager, or the team member assigned to the RFP, must create a cost estimate from the vague information supplied. This in turn may make the estimate inaccurate.

These obstacles can be addressed by first estimating the size of the project from the available requirements. Different size estimation methods are available; the most popular are counting source lines of code and counting function points.

These methods are discussed in more detail in the next chapter. The goal of that chapter is to determine which method is best suited to one of the biggest obstacles: producing accurate estimates from limited requirements.


Chapter 3

Size Estimation

3.1 Lines of code

The traditional size metric for estimating software development effort and for measuring productivity has been lines of code (LOC). A large number of cost estimation models have been produced, most of which are functions of lines of code, or of thousands of lines of code (KLOC). The definition of KLOC is important when comparing these models. Some models include comment lines, and others do not. Similarly, the definition of what effort (E) is being estimated is equally important. Effort may represent only coding at one extreme, or the total analysis, design, coding and testing effort at the other extreme. As a result, it is difficult to compare these models.

The abbreviation NCLOC is used to represent a non-commented source line of code. NCLOC is also sometimes referred to as effective lines of code (ELOC). NCLOC is therefore a measure of the uncommented length.

The commented length is also a valid measure, depending on whether or not line documentation is considered to be a part of programming effort. The abbreviation CLOC is used to represent a commented source line of code [11].

By measuring NCLOC and CLOC separately the total length can be defined:

Total length (LOC) = NCLOC + CLOC Equation 3.1

KLOC is used to denote thousands of lines of code.
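As an illustration of Equation 3.1, the following minimal Python sketch counts NCLOC, CLOC and total LOC for a piece of source text. It is not the counting procedure of any particular model: it assumes a simple physical-line definition in which a comment line is a non-blank line that contains only a comment, and blank lines are not counted at all. The function name `count_lines` and the `comment_prefix` parameter are hypothetical and introduced only for this example.

```python
def count_lines(source: str, comment_prefix: str = "#") -> dict:
    """Count NCLOC, CLOC and total LOC (Equation 3.1) for a source text.

    Assumption: a comment line is a non-blank line whose first
    non-whitespace characters are `comment_prefix`. Blank lines are
    ignored. A line holding code plus a trailing comment counts as code.
    """
    ncloc = 0
    cloc = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines count toward neither measure
        if stripped.startswith(comment_prefix):
            cloc += 1
        else:
            ncloc += 1
    return {"NCLOC": ncloc, "CLOC": cloc, "LOC": ncloc + cloc}


# Small hypothetical sample: two comment lines, three code lines, one blank.
sample = """# compute a total
total = 0

for x in [1, 2, 3]:
    total += x  # trailing comment, still a code line
# done
"""
counts = count_lines(sample)
kloc = counts["LOC"] / 1000  # size expressed in thousands of lines
```

Changing the assumptions (for example, counting blank lines or treating trailing comments as commented lines) changes the totals, which is precisely why the definition of a line of code matters when models are compared.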

A logical source statement has been chosen as the standard line of code. Defining a line of code is difficult due to conceptual differences involved in accounting for executable statements and data declarations in different software languages. The goal is to measure the amount of intellectual work put into program development, but difficulties arise when trying to define consistent measures across different languages.

To minimize these problems, the Software Engineering Institute (SEI) definition checklist for a logical source statement is used in defining the line of code measure. The SEI developed this checklist as part of a system of definition checklists, report forms and supplemental forms to support measurement definitions [12][20].

Figure 3.1 shows a portion of the definition checklist as it is being applied to support the development of the COCOMO II model. Each checkmark in the "Includes" column identifies a particular statement type or attribute included in the definition; each checkmark in the "Excludes" column identifies one that is excluded. Other sections in the definition clarify statement attributes for usage, delivery, functionality, replications and development status.

There are also clarifications for language specific statements for ADA, C, C++, CMS-2, COBOL, FORTRAN, JOVIAL and Pascal.


[Figure: Definition Checklist for Source Statements Counts, with sections for Statement type, How produced, and Origin]

Figure 3.1: Definition checklist for source statements counts [4]

Some changes were made to the line-of-code definitions that depart from the default definition provided in [20]. These changes eliminate categories of software which are generally small sources of project effort. Not included in the definition are commercial-off-the-shelf software (COTS), government furnished software (GFS), other products, language support libraries and operating systems, or other commercial libraries. Code generated with source code generators is not included, though measurements will be taken with and without generated code to support analysis.

There are a number of problems with using LOC as the unit of measure for software size. The primary problem is the lack of a universally accepted definition of exactly what a line of code is.

Another difficulty with lines of code as a measure of system size is its language dependence. It is not possible to directly compare projects developed in different languages.

Still another problem with the lines of code measure is that it is difficult to estimate the number of lines of code that will be needed to develop a system from the information available at the requirements or design phase of development [7][8].

If cost models based on size are to be useful, it is necessary to be able to predict the size of the final product as early and accurately as possible. Unfortunately, estimating software size using the lines of code metric depends so much on previous experience with similar projects that experts can make radically different estimates.

Finally, the lines of code measure places undue emphasis on coding, which is only one part of the implementation phase of a software development project. It is stated that coding accounts only for 10% to 15% of the total effort on a large engineering system. It is also questioned whether the total effort is really linearly dependent on the amount of code [28].


3.2 Function Point Analysis

The function point cost estimation approach is based on the amount of functionality in a software project and a set of individual project factors [3][17][15]. Function points are useful estimators since they are based on information that is available early in the project life cycle.

Software engineers have been searching for a metric that is applicable for a broad range of software engineering environments. The metric should be technology independent and support the need for estimating, project management, measuring quality and gathering requirements. Function Point Analysis is the measure that accomplishes all these requirements.

There have been many misconceptions regarding the appropriateness of Function Point Analysis in evaluating emerging environments such as real time embedded code and Object Oriented programming. Since function points express the resulting work-product in terms of functionality as seen from the user's perspective, the measure is independent of the tools and technologies used to deliver that functionality.

Introduction to Function Point Analysis

One of the initial design criteria for function points was to provide a mechanism that both software engineers and users could utilize to define functional requirements. It was determined that the best way to gain an understanding of the users' needs was to approach their problem from the perspective of how they view the results an automated system produces. Therefore, one of the primary goals of Function Point Analysis is to evaluate a system's capabilities from a user's point of view. To achieve this goal, the analysis is based upon the various ways users interact with computerized systems. From a user's perspective a system assists them in doing their job by providing five (5) basic functions. Two of these address the data requirements of an end user and are referred to as Data Functions. The remaining three address the user's need to access data and are referred to as Transactional Functions.

Function point calculations

Function points (FP) measure size in terms of the amount of functionality in a system. Function points are computed by first calculating an unadjusted function point count (UFC). Counts are made for the following categories [Fenton]:

Internal Logical Files - The first data function allows users to utilize data they are responsible for maintaining. For example, a pilot may enter navigational data through a display in the cockpit prior to departure. The data is stored in a file for use and can be modified during the mission. Therefore the pilot is responsible for maintaining the file that contains the navigational information. Logical groupings of data in a system, maintained by an end user, are referred to as Internal Logical Files (ILF).

External Interface Files - The second Data Function a system provides an end user is also related to logical groupings of data. In this case the user is not responsible for maintaining the data. The data resides in another system and is maintained by another user or system. The user of the system being counted requires this data for reference purposes only. For example, it may be necessary for a pilot to reference position data from a satellite or ground-based facility during flight. The pilot does not have the responsibility for updating data at these sites but must reference it during the flight. Groupings of data from another system that are used only for reference purposes are defined as External Interface Files (EIF).

The remaining functions address the user's capability to access the data contained in ILFs and EIFs. This capability includes maintaining, inquiring and outputting of data. These are referred to as Transactional Functions.


External Input - The first Transactional Function allows a user to maintain Internal Logical Files (ILFs) through the ability to add, change and delete the data. For example, a pilot can add, change and delete navigational information prior to and during the mission. In this case the pilot is utilizing a transaction referred to as an External Input (EI). An External Input gives the user the capability to maintain the data in ILFs by adding, changing and deleting their contents.

External Output - The next Transactional Function gives the user the ability to produce outputs. For example a pilot has the ability to separately display ground speed, true air speed and calibrated air speed. The results displayed are derived using data that is maintained and data that is referenced. In function point terminology the resulting display is called an External Output (EO).

External Inquiries - The final capability provided to users through a computerized system addresses the requirement to select and display specific data from files. To accomplish this a user inputs selection information that is used to retrieve data that meets the specific criteria. In this situation there is no manipulation of the data. It is a direct retrieval of information contained on the files. For example if a pilot displays terrain clearance data that was previously set, the resulting output is the direct retrieval of stored information. These transactions are referred to as External Inquiries (EQ).

In addition to the five functional components described above there are two adjustment factors that need to be considered in Function Point Analysis.

Functional Complexity - The first adjustment factor considers the Functional Complexity for each unique function. Functional Complexity is determined based on the combination of data groupings and data elements of a particular function. The number of data elements and unique groupings are counted and compared to a complexity matrix that will rate the function as low, average or high complexity. Each of the five functional components (ILF, EIF, EI, EO and EQ) has its own unique complexity matrix.

Table 3.1 shows the complexity rating matrix for each of the function categories.

For ILF and EIF

Record Elements    Data Elements
                   1-19     20-50    51+
1                  Low      Low      Avg
2-5                Low      Avg      High
6+                 Avg      High     High

For EO and EQ

File Types         Data Elements
                   1-5      6-19     20+
0 or 1             Low      Low      Avg
2-3                Low      Avg      High
4+                 Avg      High     High

For EI

File Types         Data Elements
                   1-4      5-15     16+
0 or 1             Low      Low      Avg
2                  Low      Avg      High
3+                 Avg      High     High

Table 3.1: Function point complexity matrix [11]
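The lookup described by Table 3.1 can be sketched in Python as follows. This is only an illustrative sketch: the names `BANDS`, `RATINGS` and `rate_complexity` are hypothetical, and the band limits are transcribed from the table. For External Inputs the middle row is assumed to cover exactly 2 file types, since the top row starts at 3.

```python
from bisect import bisect_left

# Upper limits of the first two bands on each axis, transcribed from Table 3.1.
# "rows": record elements (ILF/EIF) or file types (EI/EO/EQ).
# "cols": data elements.
BANDS = {
    "ILF": {"rows": (1, 5), "cols": (19, 50)},
    "EIF": {"rows": (1, 5), "cols": (19, 50)},
    "EO": {"rows": (1, 3), "cols": (5, 19)},
    "EQ": {"rows": (1, 3), "cols": (5, 19)},
    "EI": {"rows": (1, 2), "cols": (4, 15)},  # assumption: middle row is exactly 2
}

# The same 3x3 pattern of ratings applies to every function type.
RATINGS = [
    ["Low", "Low", "Average"],
    ["Low", "Average", "High"],
    ["Average", "High", "High"],
]


def rate_complexity(function_type: str, row_count: int, data_elements: int) -> str:
    """Return "Low", "Average" or "High" for one counted function."""
    bands = BANDS[function_type]
    # bisect_left maps a count to band 0, 1 or 2 using the upper limits.
    row_band = bisect_left(bands["rows"], row_count)
    col_band = bisect_left(bands["cols"], data_elements)
    return RATINGS[row_band][col_band]


# Hypothetical example: an ILF with 2 record elements and 20 data elements.
example_rating = rate_complexity("ILF", 2, 20)
```

Keeping the band limits in a data table rather than in nested conditionals makes it straightforward to check the code against the matrix.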

Table 3.2 shows the complexity weight matrix that must be applied after the functions have been categorized and their complexities determined.

Function Type               Complexity Weight
                            Low      Average  High
Internal Logical Files      7        10       15
External Interface Files    5        7        10
External Inputs             3        4        6
External Outputs            4        5        7
External Inquiries          3        4        6

Table 3.2: Function point complexity-weight matrix [11]

All of the functional components are analyzed in this way and added together to derive an Unadjusted Function Point count (UFP).


$UFP = \sum_{i} X_i W_i$    Equation 3.2

Where $X_i$ is the number of functions counted for a specific function type and complexity rating, and $W_i$ is the corresponding complexity weight value listed in Table 3.2.
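A minimal Python sketch of Equation 3.2 follows, using the weights of Table 3.2. The dictionary layout, the function name `unadjusted_function_points` and the example counts are assumptions made purely for illustration; they do not come from any counted project.

```python
# Complexity weights transcribed from Table 3.2.
WEIGHTS = {
    "ILF": {"Low": 7, "Average": 10, "High": 15},
    "EIF": {"Low": 5, "Average": 7, "High": 10},
    "EI": {"Low": 3, "Average": 4, "High": 6},
    "EO": {"Low": 4, "Average": 5, "High": 7},
    "EQ": {"Low": 3, "Average": 4, "High": 6},
}


def unadjusted_function_points(counts: dict) -> int:
    """Sum X_i * W_i over every (function type, complexity) pair counted.

    `counts` maps (function_type, rating) to the number of functions
    of that type and rating.
    """
    return sum(
        number * WEIGHTS[function_type][rating]
        for (function_type, rating), number in counts.items()
    )


# Hypothetical counts for a small application.
example_counts = {
    ("ILF", "Average"): 2,  # 2 * 10 = 20
    ("EIF", "Low"): 1,      # 1 * 5  = 5
    ("EI", "Low"): 5,       # 5 * 3  = 15
    ("EO", "Average"): 3,   # 3 * 5  = 15
    ("EQ", "High"): 2,      # 2 * 6  = 12
}
ufp = unadjusted_function_points(example_counts)  # 20 + 5 + 15 + 15 + 12 = 67
```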

Value Adjustment Factor

The second adjustment factor is the Value Adjustment Factor, also called the technical complexity factor (TCF), by which the Unadjusted Function Point count is multiplied. This factor considers the system's technical and operational characteristics and is calculated by answering 14 questions [1][29]. The factors are:

• Data Communications. The data and control information used in the application are sent or received over communication facilities.
• Distributed Data Processing. Distributed data or processing functions are a characteristic of the application within the application boundary.
• Performance. Application performance objectives, stated or approved by the user, in either response or throughput, influence (or will influence) the design, development, installation and support of the application.
• Heavily Used Configuration. A heavily used operational configuration, requiring special design considerations, is a characteristic of the application.
• Transaction Rate. The transaction rate is high and influences the design, development, installation and support.
• On-line Data Entry. On-line data entry and control information functions are provided in the application.
• End-User Efficiency. The on-line functions provided emphasize a design for end-user efficiency.
• On-line Update. The application provides on-line update for the internal logical files.
• Complex Processing. Complex processing is a characteristic of the application.
• Reusability. The application and the code in the application have been specifically designed, developed and supported to be usable in other applications.
• Installation Ease. Conversion and installation ease are characteristics of the application. A conversion and installation plan and/or conversion tools were provided and tested during the system test phase.
• Operational Ease. Operational ease is a characteristic of the application. Effective start-up, backup and recovery procedures were provided and tested during the system test phase.
• Multiple Sites. The application has been specifically designed, developed and supported to be installed at multiple sites for multiple organizations.
• Facilitate Change. The application has been specifically designed, developed and supported to facilitate change.

Each component is rated from 0 to 5, where 0 means the component has no influence on the system and 5 means the component is essential [26]. The technical complexity factor (TCF) can then be calculated as [19]:

$TCF = 0.65 + 0.01 \sum_{i=1}^{14} F_i$    Equation 3.3

Where $F_i$ is the rating (0 to 5) given to the i-th of the 14 factors listed above. The TCF can range from 0.65 to 1.35: a figure of 0.65 results if all the factors are rated 0 (no influence), and a figure of 1.35 results if all the factors are rated 5 (essential).

Each of these factors is scored based on its influence on the system being counted. The resulting factor can increase or decrease the Unadjusted Function Point count by up to 35%. This calculation provides the Adjusted Function Point count. The final function point figure can then be calculated [Matson]:


$FP = UFP \times TCF$    Equation 3.4
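Equations 3.3 and 3.4 can be illustrated with the short Python sketch below. The function names and the example ratings are hypothetical; the only values taken from the text are the constants 0.65 and 0.01, the 14 factors and the 0 to 5 rating scale.

```python
def technical_complexity_factor(ratings: list) -> float:
    """Equation 3.3: TCF = 0.65 + 0.01 * (sum of the 14 factor ratings)."""
    if len(ratings) != 14:
        raise ValueError("exactly 14 factor ratings are required")
    if any(not 0 <= rating <= 5 for rating in ratings):
        raise ValueError("each rating must lie between 0 and 5")
    return 0.65 + 0.01 * sum(ratings)


def adjusted_function_points(ufp: float, ratings: list) -> float:
    """Equation 3.4: FP = UFP * TCF."""
    return ufp * technical_complexity_factor(ratings)


# Hypothetical example: every factor rated 3 (the middle of the scale).
example_ratings = [3] * 14
tcf = technical_complexity_factor(example_ratings)   # 0.65 + 0.01 * 42
fp = adjusted_function_points(100, example_ratings)  # 100 unadjusted points

# The two extremes reproduce the range quoted in the text.
tcf_min = technical_complexity_factor([0] * 14)  # lower bound, 0.65
tcf_max = technical_complexity_factor([5] * 14)  # upper bound, about 1.35
```

With these hypothetical ratings the adjustment is roughly +7%, so 100 unadjusted function points become about 107 adjusted function points.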

Function Points as a Sizing Metric

Function points are a synthetic measure, much the same as square feet or meters, that permits the calculation of a relative size for individual software projects, applications, or subsystems even in their early requirements stages. Function point counting is typically performed when a developer wants to size and estimate development time and effort for an application or a project. In addition to functional size, other risk and complexity factors must be considered when estimating effort. These factors include, but are not limited to [19]:

• Development and/or maintenance tasks to be performed
• Application complexities; e.g., logical complexity, mathematical complexity, security requirements, etc.
• Performance considerations
• Source code languages used
• Extent of reusable components from previously developed documents and code
• Skill sets of both development and user personnel in all phases
• The process and technology to be applied in development and maintenance
• The environment in which development and/or maintenance will take place

When the impact of selected risk and complexity factors is considered, the effort required for development or maintenance of a certain range of function points can be estimated accurately.


An Approach to Counting Function Points

Function point counting can be accomplished with minimal documentation. However, the accuracy and efficiency of the counting improves with appropriate documentation. Examples of appropriate documentation are:

• Design specifications
• Display designs
• Data requirements (Internal and External)
• Description of user interfaces

Function point counts are calculated during the workshop and documented with both a diagram that depicts the application and worksheets that contain the details of each function discussed.

Benefits of Function Point Analysis

Organizations that adopt Function Point Analysis as a software metric realize many benefits including: improved project estimating; understanding project and maintenance productivity; managing changing project requirements; and gathering user requirements. Each of these is discussed below.

Estimating software projects is as much an art as a science. While there are several environmental factors that need to be considered in estimating projects, two key data points are essential. The first is the size of the deliverable. The second addresses how much of the deliverable can be produced within a defined period of time. Size can be derived from Function Points, as described above. The second requirement for estimating is determining how long it takes to produce a function point. This delivery rate can be calculated based on past project performance or by using industry benchmarks. The delivery rate is expressed in function points per hour (FP/Hr) and can be applied to similar proposed projects to estimate effort (i.e. Project Hours = estimated project function points ÷ delivery rate in FP/Hr).
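The relationship between size, delivery rate and effort can be sketched in Python as follows. The function names and the historical figures are hypothetical and serve only to show the arithmetic; pooling the totals of past projects is one simple way to derive a delivery rate, and an organization may prefer another.

```python
def delivery_rate(past_projects: list) -> float:
    """Function points delivered per hour, pooled over past projects.

    `past_projects` is a list of (function_points, hours) pairs.
    """
    total_fp = sum(fp for fp, _ in past_projects)
    total_hours = sum(hours for _, hours in past_projects)
    return total_fp / total_hours


def estimate_project_hours(estimated_fp: float, fp_per_hour: float) -> float:
    """Project Hours = estimated project function points / (FP/Hr)."""
    return estimated_fp / fp_per_hour


# Hypothetical history: two completed projects on the same platform.
history = [(200, 2000), (300, 3000)]
rate = delivery_rate(history)               # 500 FP over 5000 hours
hours = estimate_project_hours(250, rate)   # effort for a 250 FP proposal
```

With this hypothetical history the delivery rate is 0.1 FP/Hr, so a proposed project of 250 function points would be estimated at roughly 2,500 hours.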


Productivity measurement is a natural output of Function Points Analysis [19]. Since function points are technology independent they can be used as a vehicle to compare productivity across dissimilar tools and platforms. More importantly, they can be used to establish a productivity rate (i.e. FP/Hr) for a specific tool set and platform. Once productivity rates are established they can be used for project estimating as described above and tracked over time to determine the impact continuous process improvement initiatives have on productivity.

3.3 Conclusion

The basis of the LOC measure is that program length can be used as a predictor of program characteristics such as effort and ease of maintenance. The advantage of SLOC is that it is simple to measure. The disadvantages of SLOC include:

• It cannot measure the size of a specification.
• It characterises only one specific view of size, namely length; it takes no account of functionality or complexity.
• Inadequate software design may cause excessive lines of code.
• It is language dependent.
• Users cannot easily understand it.

Function points, on the other hand, can be used either as an estimation variable to determine the size of each element of the software, or as baseline metrics collected from past projects and used in conjunction with estimation variables to develop cost and effort projections.

The advantages of function points include:

• It is not restricted to code
• Language independent
• The necessary data is available early in a project
• Layout independent

The disadvantages of function points include:

• Subjective counting¹
• Hard to automate and difficult to compute
• Ignores quality of output
• Oriented to traditional data processing applications

The choice of size estimation method depends on the preference and experience of the firm. The best result is obtained by using both methods and comparing the results, but this increases the cost of the estimation process.

Once the size is estimated, the effort and cost must be determined. These values may be presented to prospective customers or to management for new projects, or may form part of a motivation for new developments. The methods for determining them are discussed in Chapter 4.

¹ [JEFFERY] concluded that there was a 30% variation between analysts counting function points.


Chapter 4

Estimation Methods

4.1. Software Life-cycle Management (SLIM) Method

Putnam developed a constraint model called SLIM to be applied to projects exceeding 70,000 lines of code. Putnam's model [27] assumes that effort for software projects is distributed similarly to a collection of Rayleigh curves. Putnam suggests that staffing rises smoothly during the project and then drops sharply during acceptance testing. The SLIM model is expressed as two equations describing the relation between the development effort and the schedule. The first equation, called the software equation, states that development effort is proportional to the cube of the size and inversely proportional to the fourth power of the development time [Fenton]. The second equation, the manpower-buildup equation, states that the effort is proportional to the cube of the development time.

The Rayleigh curve represents manpower as a function of time [25]. SLIM uses separate Rayleigh curves for design and code, test and validation, maintenance, and management. A typical Rayleigh curve is shown in Figure 4.1.

[Figure 4.1: A typical Rayleigh curve (vertical axis: percent of total effort; horizontal axis: time)]
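The shape of such a curve can be sketched numerically. The sketch below assumes the commonly used Norden/Rayleigh parameterization, in which the staffing rate is m(t) = 2·K·a·t·exp(−a·t²) with a = 1/(2·t_peak²); the function names and the sample values (100 person-years, peak at year 2) are illustrative assumptions rather than SLIM's internal implementation.

```python
import math


def rayleigh_staffing(t: float, total_effort: float, peak_time: float) -> float:
    """Staffing rate (effort per unit time) at time t on a Rayleigh curve."""
    a = 1.0 / (2.0 * peak_time ** 2)  # shape parameter; the curve peaks at t = peak_time
    return 2.0 * total_effort * a * t * math.exp(-a * t * t)


def rayleigh_cumulative_effort(t: float, total_effort: float, peak_time: float) -> float:
    """Effort expended from time 0 up to time t (integral of the staffing rate)."""
    a = 1.0 / (2.0 * peak_time ** 2)
    return total_effort * (1.0 - math.exp(-a * t * t))


# Hypothetical project: 100 person-years in total, staffing peaks after 2 years.
for year in (0.5, 1.0, 2.0, 3.0, 4.0):
    rate = rayleigh_staffing(year, 100.0, 2.0)
    spent = rayleigh_cumulative_effort(year, 100.0, 2.0)
    print(f"t={year:>3}: staffing={rate:5.1f} persons, cumulative={spent:5.1f} person-years")
```

Under this parameterization roughly 39 percent of the total effort has been spent by the time staffing peaks.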


Development effort is assumed to represent only 40 percent of the total life cycle cost. Requirements specification is not included in the model. Estimation using SLIM is not expected to take place until design and coding.

The Software Equation

Putnam used some empirical observations about productivity levels to derive the software equation from the basic Rayleigh curve equation [11]. The software equation is expressed as:

$$\text{Size} = C \times E^{1/3} \times \text{Time}^{4/3} \qquad \text{(Equation 4.1)}$$

Where

• C is a technology factor. The technology constant, C, combines the effect of using tools, languages, methodology, quality assurance procedures, standards, etc. It is determined on the basis of historical data (past projects), from project size, area under the effort curve, and project duration.
• Size is the quantity of function created, in source lines of code written, function points, objects, or other measures of function.
• E is the total project effort in person-years. It includes all categories of labor used on the project.
• Time is the elapsed calendar development time from the start of detailed design until the product is ready to enter into operational service (frequently this is a 95% reliability level).

SLIM is applicable to all types and sizes of software projects. It computes schedule, effort, cost, staffing for all software development phases and reliability for the main development phase. It works with software languages, and function points as well as other sizing metrics. It is specifically designed to address the concerns of senior management, such as:


• What options are available if the schedule is accelerated by four months to meet a tight market window?
• How many people must be added to get two months of schedule compression, and how much will it cost?
• When will the defect rate be low enough that a reliable product can be marketed and customers will be satisfied?
• If the requirements grow or substantially change, what will be the impact on schedule, cost, and reliability?
• How can the value of the process improvement program be quantified?

SLIM can record and analyze data from previously completed projects which are then used to calibrate the model; or if data are not available then a set of questions can be answered to get values of FP from the existing database.

The Rayleigh-Putnam Curve uses a negative exponential curve as an indicator of cumulative staff-power distribution over time during a project. The technology factor is a composite cost driver involving the following primary components:

• Overall process maturity and management practices
• The extent to which good software engineering practices are used
• The level of programming languages used
• The state of the software environment
• The skills and experience of the software team
• The complexity of the application

The software equation includes a fourth power and therefore has strong implications for resource allocation on large projects. Relatively small extensions in delivery date can result in substantial reductions in effort [26].
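The effect of the fourth power can be made concrete with a small sketch. It assumes the software equation has the form Size = C × E^(1/3) × Time^(4/3), which matches the cube and fourth-power relationships described earlier in this section; the size, technology factor and schedules used are hypothetical.

```python
def slim_effort(size: float, technology_factor: float, dev_time_years: float) -> float:
    """Solve Size = C * E**(1/3) * Time**(4/3) for E (person-years)."""
    return (size / (technology_factor * dev_time_years ** (4 / 3))) ** 3


# Hypothetical 100,000-line project with an assumed technology factor of 10,000.
baseline = slim_effort(100_000, 10_000, 2.0)   # 24-month schedule
extended = slim_effort(100_000, 10_000, 2.2)   # schedule extended by 10 percent

print(f"effort at 2.0 years: {baseline:.1f} person-years")
print(f"effort at 2.2 years: {extended:.1f} person-years")
print(f"reduction: {100 * (1 - extended / baseline):.0f}%")  # about 32%
```

With size and technology factor held constant, effort varies with the inverse fourth power of the schedule, so a 10 percent extension reduces the computed effort by roughly a third.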


The Manpower-Buildup Equation

To allow effort estimation, Putnam introduced the manpower-buildup equation [11]:

$$D = \frac{E}{t^{3}} \qquad \text{(Equation 4.2)}$$

where D is a constant called manpower acceleration, E is the total project effort in person-years, and t is the elapsed time to delivery in years.

The manpower acceleration is 12.3 for new software with many interfaces and interactions with other systems, 15 for standalone systems, and 27 for re-implementations of existing systems [Putman].

Using the software and manpower-buildup equations, the effort [11] can be solved:

$$E = \left(\frac{\text{Size}}{C}\right)^{9/7} \times D^{4/7} \qquad \text{(Equation 4.3)}$$

This equation is interesting because it shows that effort is proportional to size to the power 9/7, or approximately 1.286, which is similar to Boehm's factor [4], which ranges from 1.05 to 1.20.
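The combination of the two equations can be checked with a short sketch. It assumes the software equation Size = C × E^(1/3) × t^(4/3) and the manpower-buildup equation D = E / t³; eliminating t gives E = (Size/C)^(9/7) × D^(4/7). The function names and the input values are hypothetical.

```python
def slim_effort_from_size(size: float, technology_factor: float, manpower_acceleration: float) -> float:
    """Effort E (person-years) from size, technology factor C and manpower acceleration D."""
    return (size / technology_factor) ** (9 / 7) * manpower_acceleration ** (4 / 7)


def slim_delivery_time(effort: float, manpower_acceleration: float) -> float:
    """Delivery time t (years) from the manpower-buildup equation D = E / t**3."""
    return (effort / manpower_acceleration) ** (1 / 3)


# Hypothetical standalone system (D = 15), 100,000 lines, assumed C = 10,000.
effort = slim_effort_from_size(100_000, 10_000, 15.0)
time_years = slim_delivery_time(effort, 15.0)
print(f"effort = {effort:.1f} person-years, delivery time = {time_years:.2f} years")

# Consistency check: substituting E and t back into the software equation recovers the size.
recovered_size = 10_000 * effort ** (1 / 3) * time_years ** (4 / 3)
print(f"recovered size = {recovered_size:.0f}")
```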

Inputs

The primary input for SLIM is SLOC, function points or any valid measure of function to be created [27]. The model uses size ranges for input: minimum, most likely, and maximum. Other important inputs include:

• Language: Multiple choices and mixes.
• System Type: One of nine (business, scientific, command & control, real time, etc.).
• Environmental Information: Tools, methods, practices, database usage; standards in place and adherence to and usage of those standards.
• Experience: Personnel skill and qualifications.
• Process Productivity Parameter: A macroscopic factor determined by calibration from historical data. It is a tuning factor that reflects application complexity and the efficiency of the organization in building software. This is a sensitive parameter that is capable of measuring real productivity and process improvement. SLIM contains an expert system to determine the Process Productivity Parameter when the user has no historical data. This (non-linear) parameter is dealt with in terms of a linear scale ranging from 0 to 40.
• Management Constraints: Maximum allowable schedule, minimum cost, maximum and minimum staff size, required reliability at the time the software goes into service, as well as the desired probabilities for each of these constraints.
• Accounting: Labor rates, inflation rates, and other economic factors.
• Flexibility: Extensive tailoring for milestones, phase definitions, and fraction of time and effort applied to each phase based on the organization's own history.

Processing

There are three primary modes of operation: building and using an historical database, performing estimating and analysis, and creating presentations and reports [27].

For estimation, SLIM uses the software equation in conjunction with management constraints for schedule, cost, staffing and required reliability to determine an optimal solution with the highest probability of successful completion. Through Monte Carlo simulation techniques, the size range estimates are mapped through the software equation to provide estimates of the uncertainty in schedule, cost, staffing and reliability. The solution obtained can be compared with the user's historical data to test its reasonableness. This discloses impossible or highly improbable solutions so that expensive mistakes are avoided.
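The idea of mapping a size range through the software equation can be sketched as follows. This is a simplified illustration and not SLIM's actual algorithm: it assumes a triangular distribution over the minimum, most likely and maximum size, a fixed schedule and technology factor, and the software equation in the form Size = C × E^(1/3) × Time^(4/3). All names and values are hypothetical.

```python
import random
import statistics


def simulate_effort(size_min, size_likely, size_max,
                    technology_factor, dev_time_years,
                    trials=10_000, seed=1):
    """Return a list of sampled effort values in person-years."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    efforts = []
    for _ in range(trials):
        # random.triangular takes its arguments in the order (low, high, mode).
        size = rng.triangular(size_min, size_max, size_likely)
        effort = (size / (technology_factor * dev_time_years ** (4 / 3))) ** 3
        efforts.append(effort)
    return efforts


samples = simulate_effort(80_000, 100_000, 130_000,
                          technology_factor=10_000, dev_time_years=2.0)
samples.sort()
print("mean effort (person-years):", round(statistics.mean(samples), 1))
print("10th and 90th percentiles:",
      round(samples[len(samples) // 10], 1),
      round(samples[9 * len(samples) // 10], 1))
```

Because effort grows with the cube of size in this form of the equation, a moderately wide size range produces a much wider and right-skewed effort range.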


Outputs

The primary output of SLIM is the optimal solution, which provides development time, cost, effort and reliability expected at delivery [27]. It also provides comprehensive sensitivity and risk profiles for all key input and output variables, and a consistency check with similar projects. SLIM's graphical interactive user interface makes it easy to explore extensive tradeoff and "what if" scenarios quickly, including design to cost, schedule, effort and risk. It has 181 different output tables and graphs from which the user can choose. These outputs constitute a comprehensive set of development plans to measure and control the project while it is underway.

Calibration

The process productivity parameter for SLIM can (and should) be obtained by calibration using historical data. All that is required is project size, development time and effort. These numbers are entered into the software equation, which is then solved for the process productivity. The historical data can also be compared with any current solution to check it for reasonableness.
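Calibration amounts to rearranging the software equation for the technology factor. The sketch below assumes the form Size = C × E^(1/3) × Time^(4/3) and averages the factor over a few completed projects; the averaging step and the project records are illustrative assumptions, not a description of SLIM's own calibration procedure.

```python
def calibrate_technology_factor(size: float, effort_person_years: float, dev_time_years: float) -> float:
    """Solve Size = C * E**(1/3) * Time**(4/3) for C using one completed project."""
    return size / (effort_person_years ** (1 / 3) * dev_time_years ** (4 / 3))


# Hypothetical historical projects: (size in SLOC, effort in person-years, duration in years).
history = [
    (100_000, 62.5, 2.0),
    (60_000, 30.0, 1.6),
    (150_000, 120.0, 2.4),
]

factors = [calibrate_technology_factor(*project) for project in history]
average_factor = sum(factors) / len(factors)
print([round(f) for f in factors], round(average_factor))
```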

4.2. Constructive Cost Model (COCOMO II)

The COCOMO (Constructive Cost Model) cost and schedule estimation model was originally published by Boehm [3]. It became one of the most popular parametric cost estimation models of the 1980s. However, COCOMO '81 experienced difficulties in estimating the costs of software developed to new life-cycle processes and capabilities. The COCOMO II research effort was started in 1994 at the University of Southern California to address the issues of non-sequential and rapid development process models, reengineering, reuse-driven approaches and object-oriented approaches.


COCOMO II was initially published in the Annals of Software Engineering in 1995 [5]. The model has three sub models, Applications Composition, Early Design and Post-Architecture, which can be combined in various ways to deal with the current and likely future software practices marketplace.

The Application Composition model is used to estimate effort and schedule on projects that use Integrated Computer Aided Software Engineering tools for rapid application development. These projects are too diversified to be handled by prepackaged solutions, but sufficiently simple to be rapidly composed from interoperable components. Typical components are GUI builders, database or object managers, middleware for distributed processing or transaction processing, and domain-specific components such as financial, medical or industrial process control packages.

The Early Design model involves the exploration of alternative system architectures and concepts of operation. Typically, not enough is known to make a detailed fine-grain estimate. This model is based on function points (or lines of code when available) and a set of five scale factors and seven effort multipliers.

The Post-Architecture model is used when top-level design is complete and detailed information about the project is available; as the name suggests, the software architecture is well defined and established. It estimates for the entire development life-cycle and is a detailed extension of the Early Design model. It uses Source Lines of Code and/or Function Points for the sizing parameter, adjusted for reuse and breakage; a set of 17 effort multipliers; and a set of five scale factors that determine the economies or diseconomies of scale of the software under development.

Cost factors are also evaluated and weighted within COCOMO II for application complexity and software reliability; execution, memory, and environmental constraints; development personnel skill levels; tools and technologies; and a variety of other considerations.


COCOMO avoids estimating labor costs in monetary value because of the large variations between organizations in what is included in labor costs, and because person-months are a more stable quantity than monetary value, given current inflation rates and international money fluctuations. To convert COCOMO person-month estimates into rand estimates, the best compromise between simplicity and accuracy is to apply a different average rand per person-month figure for each major phase, to account for inflation and the differences in salary level of the people required for each phase.

COCOMO II Model Rationale and Elaboration

The rationale for providing this mix of models (application composition, early design and post-architecture models) rests on three primary premises.

First, current and future software projects will be tailoring their processes to their particular process drivers. These process drivers include reusable software availability; degree of understanding of architectures and requirements; market window or other schedule constraints; size; and required reliability (see [5] for an example of such tailoring guidelines).

Second, the granularity of the software cost estimation model used needs to be consistent with the granularity of the information available to support software cost estimation. In the early stages of a software project, very little may be known about the size of the product to be developed, the nature of the target platform, the nature of the personnel to be involved in the project, or the detailed specifics of the process to be used.

Third, given the situation in premises 1 and 2, COCOMO II enables projects to furnish coarse-grained cost driver information in the early project stages, and increasingly fine-grained information in later stages. Consequently, COCOMO II does not produce point estimates of software cost and effort, but rather range estimates tied to the degree of definition of the estimation inputs.


Modeling Software Economies and Diseconomies of Scale

Software cost estimation models often have an exponential factor to account for the relative economies or diseconomies of scale encountered as a software project increases its size. This factor is generally represented as the exponent B in the COCOMO effort equation [5]:

$$PM_{nominal} = A \times (\text{Size})^{B} \qquad \text{(Equation 4.4)}$$

Where

PM_nominal (or E) is the estimated effort in person-months.

A is a coefficient that is provisionally set to a default value of 2.5, but should be set to reflect a specific organization's cost and culture.

B is an exponential factor to account for the relative economies or diseconomies of scale encountered in different size software projects.

If the value of B is smaller than 1.0, the project exhibits economies of scale. This means that if the product's size is doubled, the project effort is less than doubled. The project's productivity increases as the product size is increased.

Some project economies of scale can be achieved via project-specific tools (e.g., simulations), but in general these are difficult to achieve. For small projects, fixed startup costs such as tool tailoring and setup of standards and administrative reports are often a source of economies of scale.

If B = 1.0, the economies and diseconomies of scale are in balance. This linear model is often used for cost estimation of small projects. It is used for the COCOMO II Applications Composition model.


If B > 1.0, the project exhibits diseconomies of scale. This is generally due to two main factors: growth of interpersonal communications overhead and growth of large-system integration overhead. Larger projects will have more personnel, and thus more interpersonal communications paths consuming overhead. Integrating a small product as part of a larger product requires not only the effort to develop the small product, but also the additional overhead effort to design, maintain, integrate, and test its interfaces with the remainder of the product.

A multiplicative constant, A, is used to calibrate the model locally for a better fit, and it captures the linear effects on effort in projects of increasing size. The coefficient A in the equation is provisionally set at 3.0. Initial calibration of COCOMO II to the original COCOMO project database [5] indicates that this is a reasonable starting point. This value must be adjusted as the size of the project varies.

Scaling Approach

The COCOMO II scaling value is integrated into a single rating-driven model. Table 4.1 lists a summary of the scale drivers and the rating criteria. A project's numerical ratings Wi are summed across all of the factors, and used to determine a scale exponent B via the following equation [6]:

$$B = 1.01 + 0.01 \times \sum_{i} W_i \qquad \text{(Equation 4.5)}$$

Thus, a 100 KSLOC project with Extra High (0) ratings for all factors will have ΣWi = 0, B = 1.01, and a relative effort E = 100^1.01 = 105 PM.

A project with Very Low (5) ratings for all factors will have ΣWi = 25, B = 1.26, and a relative effort E = 100^1.26 = 331 PM. This represents a large variation, but the increase involved in a one-unit change in one of the factors is only about 4.7%. Thus, this approach avoids the 40% swings involved in choosing a development mode for a 100 KSLOC product in the original COCOMO model.
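The worked figures above can be reproduced with a short sketch. It assumes the scale exponent B = 1.01 + 0.01 × ΣWi and the relative effort Size^B (that is, with the coefficient A left at 1, as in the worked figures); the function names are illustrative.

```python
def scale_exponent(ratings):
    """B = 1.01 + 0.01 * sum of the scale factor ratings Wi."""
    return 1.01 + 0.01 * sum(ratings)


def relative_effort(size_ksloc, ratings, coefficient=1.0):
    """Effort = A * Size**B, with A defaulting to 1 to give the relative effort."""
    return coefficient * size_ksloc ** scale_exponent(ratings)


all_extra_high = [0, 0, 0, 0, 0]   # PREC, FLEX, RESL, TEAM, PMAT all rated Extra High
all_very_low = [5, 5, 5, 5, 5]     # all five factors rated Very Low

print(round(relative_effort(100, all_extra_high)))  # about 105 PM
print(round(relative_effort(100, all_very_low)))    # about 331 PM

# Effect of a one-unit change in a single factor for a 100 KSLOC product.
one_step = relative_effort(100, [1, 0, 0, 0, 0]) / relative_effort(100, all_extra_high)
print(f"{100 * (one_step - 1):.1f}%")  # about 4.7%
```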


Scale Factors (Wi) | Very Low | Low | Nominal | High | Very High | Extra High
PREC | thoroughly unprecedented | largely unprecedented | somewhat unprecedented | generally familiar | largely familiar | thoroughly familiar
FLEX | rigorous | occasional relaxation | some relaxation | general conformity | some conformity | general goals
RESL* | little (20%) | some (40%) | often (60%) | generally (75%) | mostly (90%) | full (100%)
TEAM | very difficult interactions | some difficult interactions | basically cooperative interactions | largely cooperative | highly cooperative | seamless interactions
PMAT† | Weighted average of "Yes" answers to CMM Maturity Questionnaire

Table 4.1: Rating scheme for the COCOMO II scale factors² [6]

Appendix A lists a full description of the meaning of each scaling driver.

Cost Factors: Effort-Multiplier Cost Drivers

COCOMO II uses a set of effort multipliers to adjust the nominal person-month estimate obtained from the project's size and exponent drivers [6]:

$$PM_{adjusted} = PM_{nominal} \times \left( \prod_{i=1}^{17} EM_i \right) \qquad \text{(Equation 4.6)}$$

Table 4.2 summarizes the COCOMO II effort-multiplier cost drivers by the four categories of Product, Platform, Personnel, and Project Factors. The superscripts following the cost driver names indicate the differences between the COCOMO II cost drivers and their counterparts in the original COCOMO model.
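The adjustment of the nominal estimate by the effort multipliers can be sketched as a simple product. The multiplier values below are placeholders chosen for illustration; they are not the calibrated COCOMO II values, which must be taken from the model's published rating tables.

```python
import math


def adjusted_effort(pm_nominal: float, effort_multipliers) -> float:
    """PM_adjusted = PM_nominal * product of all effort multipliers EM_i."""
    return pm_nominal * math.prod(effort_multipliers)


# Hypothetical ratings: two drivers off-nominal, the other 15 nominal (multiplier 1.0).
multipliers = [1.15, 0.88] + [1.0] * 15
print(round(adjusted_effort(100.0, multipliers), 1))  # 101.2
```

Because the multipliers combine by multiplication, a nominal rating (1.0) leaves the estimate unchanged, and several off-nominal ratings can compound quickly.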

² * % significant module interfaces specified, % significant risks eliminated. † The form of the Process Maturity scale is being resolved in coordination with the SEI. The intent is to produce a process maturity rating as a weighted average of the project's percentage compliance levels to the 18 Key Process Areas in Version 1.1 of the Capability Maturity Model [Paulk 1993] rather than to use the previous 1-to-5 maturity levels. The weights to be applied to the Key Process Areas are still being determined.


Driver | Very Low | Low | Nominal | High | Very High | Extra High
RELY | slight inconvenience | low, easily recoverable losses | moderate, easily recoverable losses | high financial loss | risk to human life |
DATA | | DB bytes/Pgm SLOC < 10 | 10 ≤ D/P < 100 | 100 ≤ D/P < 1000 | D/P ≥ 1000 |
CPLX | see Appendix C
RUSE | | none | across project | across program | across product line | across multiple product lines
DOCU | many life-cycle needs uncovered | some life-cycle needs uncovered | right-sized to life-cycle needs | excessive for life-cycle needs | very excessive for life-cycle needs |
TIME | | | 50% use of available execution time | 70% | 85% | 95%
STOR | | | 50% use of available storage | 70% | 85% | 95%
PVOL | | major change every 12 mo.; minor change every 1 mo. | major: 6 mo.; minor: 2 wk. | major: 2 mo.; minor: 1 wk. | major: 2 wk.; minor: 2 days |
ACAP | 15th percentile | 35th percentile | 55th percentile | 75th percentile | 90th percentile |
PCAP | 15th percentile | 35th percentile | 55th percentile | 75th percentile | 90th percentile |
PCON | 48% / year | 24% / year | 12% / year | 6% / year | 3% / year |
AEXP | < 2 months | 6 months | 1 year | 3 years | 6 years |
PEXP | < 2 months | 6 months | 1 year | 3 years | 6 years |
LTEX | < 2 months | 6 months | 1 year | 3 years | 6 years |
TOOL | edit, code, debug | simple, front-end, backend CASE, little integration | basic lifecycle tools, moderately integrated | strong, mature lifecycle tools, moderately integrated | strong, mature, proactive lifecycle tools, well integrated with processes, methods, reuse |
SITE: Collocation | international | multi-city and multi-company | multi-city or multi-company | same city or metro. area | same building or complex | fully collocated
SITE: Communications | some phone, mail | individual phone, FAX | narrowband email | wideband electronic communication | wideband elect. comm, occasional video conf. | interactive multimedia
SCED | 75% of nominal | 85% | 100% | 130% | 160% |

Table 4.2: Effort-multiplier cost driver ratings for the Post-Architecture model [6].


Table 4.2 provides the COCOMO II effort multiplier rating scales. Appendix F lists a full description of the meaning of each effort-multiplier cost driver.

Development Schedule Estimates

The initial baseline schedule equation for all three COCOMO II models is³ [6]:

$$TDEV = \left[ 3.67 \times (PM)^{\left(0.28 + 0.2 \times (B - 1.01)\right)} \right] \times \frac{SCED\%}{100} \qquad \text{(Equation 4.7)}$$

where TDEV is the calendar time in months from the determination of its requirements baseline to the completion of an acceptance activity certifying that the product satisfies its requirements. PM is the estimated person-months excluding the SCED effort multiplier, and SCED% is the schedule compression/expansion percentage in the SCED cost driver rating.
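The schedule equation can be evaluated with a short sketch. It assumes the form TDEV = [3.67 × PM^(0.28 + 0.2 × (B − 1.01))] × SCED%/100 with the constants as given in the text; the function name and the sample inputs are illustrative.

```python
def development_schedule(person_months: float, scale_exponent: float, sced_percent: float = 100.0) -> float:
    """Calendar time TDEV in months from effort (PM), scale exponent B and SCED%."""
    exponent = 0.28 + 0.2 * (scale_exponent - 1.01)
    return 3.67 * person_months ** exponent * sced_percent / 100.0


# Hypothetical project: 105 person-months with B = 1.01.
nominal = development_schedule(105.0, 1.01)
compressed = development_schedule(105.0, 1.01, sced_percent=85.0)
print(f"nominal schedule:    {nominal:.1f} months")
print(f"compressed (85%):    {compressed:.1f} months")
print(f"average staff level: {105.0 / nominal:.1f} people")
```

Dividing the effort by TDEV gives the average staffing level implied by the estimate.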

Early Design effort multiplier cost drivers

In Early Design, however, a reduced set of effort multiplier cost drivers is used. These are obtained by combining the Post-Architecture cost drivers as shown in Table 4.3.

The resulting seven cost drivers are easier to estimate in early stages of software development than the 17 Post-Architecture cost drivers. However, their larger productivity ranges (up to 5.45 for PERS and 5.21 for RCPX) stimulate more variability in their resulting estimates. This situation is addressed by assigning a higher standard deviation to Early Design (versus Post-Architecture) Estimates.

³ PM is the estimated person-months excluding the SCED effort multiplier. SCED% is the compression/expansion percentage in the SCED effort multiplier in table


Early Design Cost Driver | Counterpart Combined Post-Arch. Cost Drivers
RCPX | RELY, DATA, CPLX, DOCU
RUSE | RUSE
PDIF | TIME, STOR, PVOL
PERS | ACAP, PCAP, PCON
PREX | AEXP, PEXP, LTEX
FCIL | TOOL, SITE
SCED | SCED

Table 4.3: Early Design and Post-Architecture cost drivers [5].

4.3. Expertise-Based Technique

Expertise-based techniques are useful in the absence of quantified, empirical data. They capture the knowledge and experience of practitioners seasoned within a domain of interest, providing estimates based upon a synthesis of the known outcomes of all the past projects to which the expert is privy or in which he or she participated. The obvious drawback to this method is that an estimate is only as good as the expert's opinion, and there is usually no way to test that opinion until it is too late to correct the damage if that opinion proves wrong. Years of experience do not necessarily translate into high levels of competency.

Delphi Technique

The Delphi technique [13] was developed at The Rand Corporation in the late 1940s originally as a way of making predictions about future events. More recently, the technique has been used as a means of guiding a group of informed individuals to a consensus of opinion on some issue.

Participants are asked to make some assessment regarding an issue, individually in a preliminary round, without consulting the other participants in the exercise. The first round results are then collected, tabulated, and returned to each participant for a second round, during which the participants are again asked to make an assessment regarding the same issue, but this time with knowledge of what the other participants did in the first round. The second round usually results in a narrowing of the range in assessments by the group, pointing to some reasonable middle ground regarding the issue of concern. The original Delphi technique avoided group discussion; the Wideband Delphi technique [5] accommodated group discussion between assessment rounds.

This is a useful technique for coming to some conclusion regarding an issue when the only information available is based more on "expert opinion" than hard empirical data.

It is clear that a number of parameters need to be determined on the basis of an expert's (or designer's) estimates. The accuracy of these is crucial to the performance of the model, which has to be calibrated to the needs of the specific software organization. One may also expect that a group of experts (designers) can do a better job than a single individual. The Delphi method helps coordinate a process of gaining information and generating reliable estimates. The group estimating procedure governed by the Delphi method comprises the following steps:

• Coordinator presents each expert with a specification of the proposed project and other relevant information.
• Coordinator calls a group meeting where experts discuss the estimates.
• Experts fill out estimation forms indicating their personal estimates of total project effort and total development effort. The estimates are given in an interval format: the expert provides the most likely value along with an upper and lower bound.
• Coordinator prepares and circulates the summary report indicating the group estimates and the individual estimates.
• Coordinator calls a meeting during which experts discuss current estimates.

This process is repeated until a consensus is reached. The group estimate is taken as an average of the weighted individual estimates, computed as [24]

$$\text{Estimate} = \frac{\text{Lower bound of estimate} + 4 \times \text{Most likely estimate} + \text{Upper bound of estimate}}{6} \qquad \text{(Equation 4.8)}$$

The variance of the individual estimate is defined as [24]

$$\text{Variance} = \frac{\text{Upper bound} - \text{Lower bound}}{6} \qquad \text{(Equation 4.9)}$$

The group variance is the average of the variances of the individual estimates.
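One round of this aggregation can be sketched as follows. The sketch follows the definitions in the text: each expert's estimate is (lower + 4 × most likely + upper) / 6, the individual variance measure is (upper − lower) / 6, and the group values are plain averages of the individual values. The function names and the three expert estimates are hypothetical.

```python
def weighted_estimate(lower: float, most_likely: float, upper: float) -> float:
    """Individual estimate weighted 1 : 4 : 1 toward the most likely value."""
    return (lower + 4 * most_likely + upper) / 6


def estimate_variance(lower: float, upper: float) -> float:
    """Variance measure of an individual estimate, as defined in the text."""
    return (upper - lower) / 6


def group_summary(expert_estimates):
    """Average the individual estimates and variances over all experts."""
    estimates = [weighted_estimate(lo, ml, hi) for lo, ml, hi in expert_estimates]
    variances = [estimate_variance(lo, hi) for lo, _, hi in expert_estimates]
    return sum(estimates) / len(estimates), sum(variances) / len(variances)


# Hypothetical round: (lower, most likely, upper) effort in person-months per expert.
round_one = [(8, 10, 18), (10, 12, 20), (9, 14, 19)]
group_estimate, group_variance = group_summary(round_one)
print(round(group_estimate, 2), round(group_variance, 2))
```

Repeating the calculation after each discussion round shows whether the group variance is narrowing toward a consensus.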

4.4. Cost Estimation Method

Once the required effort has been determined, resources must be allocated to the project. The number of resources required depends on the person-months of effort and the time available to complete the project. Equation 4.10 indicates how the number of resources can be determined [24]:

$$\text{Number of resources required} = \frac{\text{Effort estimated (person-months)}}{\text{Calendar months estimated}} \qquad \text{(Equation 4.10)}$$

For an estimated 12-month project with an estimated effort of 120 person-months, the required number of resources is 120/12, which is ten resources (full-time development engineers). The assumption is made that all development engineers are allocated full time to the project and are part of the project in all of its phases (calendar months estimated).


In the case that only a limited number of resources are available, equation 4.7 is not valid. The person months must then be divided by the number of available resources to get the estimated calendar months.

The effort cost consists of the sum of all the salaries of the development engineers for the specific period. The effort cost can be calculated as follows (expressed in rand value) [24]:

$$\text{Effort cost} = TDEV \times \sum (\text{Cost to company}) \qquad \text{(Equation 4.11)}$$

Where

• TDEV is the calendar time in months from the determination of its requirements baseline to the completion of an acceptance activity certifying that the product satisfies its requirements.
• Cost to company is the direct cost of each engineer allocated to the project.

Other resources like analysts and project managers are not incorporated in the effort formula. These values must be determined separately and then added to the effort cost.

Direct costs such as operating systems, development tools and licensing must also be added. Indirect costs such as travel expenses, training and stationery must then also be added. The sum of all these values is the cost of the project (Equation 4.12).

$$\text{Total Cost} = \text{Direct Cost} + \text{Indirect Cost} + \text{Effort Cost} \qquad \text{(Equation 4.12)}$$
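The cost calculation of this section can be sketched end to end. The sketch assumes that "cost to company" is a monthly figure per engineer, so that multiplying by the schedule in months gives a rand amount; the function names and all monetary values are hypothetical.

```python
def resources_required(effort_person_months: float, calendar_months: float) -> float:
    """Number of full-time resources = estimated effort / estimated calendar months."""
    return effort_person_months / calendar_months


def calendar_months_for_team(effort_person_months: float, available_resources: int) -> float:
    """Schedule when the team size is fixed: effort / available resources."""
    return effort_person_months / available_resources


def effort_cost(schedule_months: float, monthly_costs_to_company) -> float:
    """Effort cost = schedule in months * sum of the engineers' monthly cost to company."""
    return schedule_months * sum(monthly_costs_to_company)


def total_cost(direct_cost: float, indirect_cost: float, effort_cost_value: float) -> float:
    """Total cost = direct cost + indirect cost + effort cost."""
    return direct_cost + indirect_cost + effort_cost_value


# Example from the text: 120 person-months over 12 months needs ten engineers.
team_size = resources_required(120, 12)
print(team_size)  # 10.0

# Hypothetical rand figures: ten engineers at R30,000 per month each.
salaries = effort_cost(12, [30_000] * 10)
print(total_cost(direct_cost=250_000, indirect_cost=80_000, effort_cost_value=salaries))

# If only six engineers are available, the schedule stretches instead.
print(calendar_months_for_team(120, 6))  # 20.0
```

Costs for analysts, project managers and other roles would be added separately, as noted in the text.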


4.5. Conclusion

The cost estimation process is an interesting mix of formal models and experience. In this sense, the overall modeling process is not straightforward and requires a significant level of skill. To produce a meaningful and reliable estimate, the cost estimation process needs to be thoroughly planned and carefully followed.

From this section it appears that each estimation method has specific strong points for specific types of project. When estimating cost, it is therefore best to use more than one estimation method to obtain a broader view of the possible cost.

The advantages and disadvantages of the methods and techniques discussed are summarized in Table 4.4.

Method: SLIM

Advantages:
• Uses linear programming to consider development constraints on both cost and effort.

Disadvantages:
• A study carried out by [PENGELLY] indicated that SLIM did not perform accurately on small projects. However, [LONDIEX] reported that SLIM is suitable for software developments that meet the following criteria: 1) software size is greater than 5000 lines; 2) effort is greater than 1.5 man-years; 3) development time is over 6 months.
• SLIM estimates are extremely sensitive to the technology factor.
• The process is not transparent.

Method: COCOMO II

Advantages:
• COCOMO is transparent and it can be seen how it works.
• Drivers are particularly helpful to the estimator in understanding the impact of different factors that affect project cost.

Disadvantages:
• Extremely vulnerable to misclassification of the development mode.
• Success depends largely on tuning the model to the needs of the organization, using historical data which is not always available.
• COCOMO estimates assume that the project will enjoy good management by both the developer and the customer.
• COCOMO assumes that the requirements specification is not substantially changed after the plans and requirements phase, although some refinements and reinterpretations are inevitable. Any significant modifications or added capabilities should be covered by a revised cost estimate.

Method: Expert-Based technique

Advantages:
• A group of experts is involved in the estimation process and not one individual.

Disadvantages:
• The process is extremely sensitive to the technology factor.
• Large amount of human resources required.
• Large amount of time required.

Table 4.4: Advantages and disadvantages of cost estimation methods.


Although the COCOMO II method is the most popular method used for estimation, this does not mean that the other methods are less accurate. Ideally all three methods would be used and their results compared, but this adds a cost component to the estimation process.

Firms should keep a database of project history, which provides a useful reference for comparing the result of the current estimation process with historical projects.


Chapter 5

Case Study

5.1 Project description

To demonstrate how cost estimation is used in practice, a case study is presented. The project under discussion is a server application developed to interface with a terminal. The full technical specification of the project is listed in Appendix G, but a summary of the project is as follows:
• Clients will be given a member card to be used at any Nu Metro cinema.
• The card will be swiped and the system must determine the following:
o Is the user a valid user?
o Has the user used the card the same day?
o How many tickets are allowed?
• The terminal will receive a valid list of users with all the required detail from the server.
• The terminal will initialize a connection to the server at the predefined time each day and receive the required information.
• The terminal will send client usage information to the server once all the relevant detail has been received.

The first step of estimating the cost of a project would be to analyze the requirements of the system (Appendix G) and determine the size of the project.

5.2 Project Size estimation

In the case of this project both the function point count and the source lines of code (SLOC) count were used. The SLOC size estimation method is discussed first.

Once all requirements were available a workgroup was set up with the following staff members:
• Project manager
• Head software developer
• Developer assigned for the project
• Specialist developer working with the terminal

The project was assessed and broken into modules. Each member had to give her/his view on the difficulty of the project and the number of non-commented lines of code to be generated.

Estimates ranging from 800 to 1500 lines of code were given, with the specialist giving the lowest value and the developer assigned for the project giving the highest value. A value of 1000 SLOC was agreed on by all parties.

Once this was completed the same team started with the function point count. The final results were as follows:
• External inputs. The inputs were the parameter files to be received from the terminal and the request for connection file.
• External outputs. The external output was the parameter file sent to the terminal.
• External inquiries. None.
• Internal logical files. The processing file.
• External interface files. The audit files and database queries.

All the function types were marked as highly complex. The value adjustment factors were all rated as having no influence, so the technical complexity factor takes its minimum value, TCF = 0.65.

From equation 3.2 the UFP is

UFP = 2*6 + 1*7 + 1*15 + 2*10 =54

Using equation 3.4 the FP is


FP = 0.65 * 54 = 35
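As an illustration, the function point calculation above can be reproduced in a few lines of Python. This is a minimal sketch rather than part of the original study: the names `COUNTS`, `HIGH_WEIGHTS`, `unadjusted_function_points` and `adjusted_function_points` are hypothetical, the mapping of each term in the UFP sum to a function type is an interpretation of the list of results above, and the weight for external inquiries (unused here, since none were counted) is the commonly published value for high complexity rather than a figure from this text.

```python
# Counts per function type, read off the UFP sum in the case study
# (assumed mapping: 2 external inputs, 1 external output, no inquiries,
# 1 internal logical file, 2 external interface files).
COUNTS = {"EI": 2, "EO": 1, "EQ": 0, "ILF": 1, "EIF": 2}

# Weights for highly complex function types as they appear in the UFP sum.
# The EQ weight is an assumption; it has no effect because the count is 0.
HIGH_WEIGHTS = {"EI": 6, "EO": 7, "EQ": 6, "ILF": 15, "EIF": 10}


def unadjusted_function_points(counts, weights):
    """Sum of count * weight over all function types."""
    return sum(counts[kind] * weights[kind] for kind in counts)


def adjusted_function_points(ufp, tcf):
    """Scale the unadjusted count by the technical complexity factor."""
    return tcf * ufp


ufp = unadjusted_function_points(COUNTS, HIGH_WEIGHTS)
fp = adjusted_function_points(ufp, tcf=0.65)
print(f"UFP = {ufp}, FP = {fp:.1f}")
```

The unrounded result is 35.1, which the text reports as 35.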

To bring the two values into perspective, [5] provides a conversion table between SLOC and FP for each development language. The project in question was developed in Visual Basic 6, for which the conversion value from FP to SLOC is 36. The number of SLOC derived from the function point count was 1200, differing by 200 from the SLOC estimate of 1000.

On completion, the number of source lines of code developed was 1253. Table 5.1 shows the difference.

| | SLOC | FP | Actual |
|---|---|---|---|
| Estimated | 1000 | 1200 | 1253 |
| Variance | 253 | 53 | |

Table 5.1: Estimation method variations
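The variances in Table 5.1 are simple differences between the delivered size and each estimate. The short sketch below recomputes them and adds the error as a percentage of the actual size, which is not reported in the table. The names `ACTUAL_SLOC`, `ESTIMATES` and `estimation_error` are hypothetical and chosen only for this illustration.

```python
ACTUAL_SLOC = 1253  # size of the delivered code, from the case study

# Size estimates from the two methods, in SLOC.
ESTIMATES = {"SLOC": 1000, "FP": 1200}


def estimation_error(estimate, actual):
    """Return (absolute error, error as a percentage of the actual size)."""
    diff = actual - estimate
    return diff, 100.0 * diff / actual


errors = {name: estimation_error(est, ACTUAL_SLOC) for name, est in ESTIMATES.items()}
for name, (diff, pct) in errors.items():
    print(f"{name}: off by {diff} SLOC ({pct:.1f}% of actual)")
```

On these figures the SLOC estimate falls short by roughly a fifth of the delivered size, while the function point estimate is within about 4%.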

5.3 Effort estimation

For this project only the COCOMO II method and the expert-based technique were used. The individual results are listed below.

COCOMO II

To determine the effort, equation 4.4 will be used. The inputs needed are A, B and the Size. A default value for A is available, but the value can differ from organization to organization. The Post-Architecture model is used because the requirements are already defined.

The Size has already been determined as 1.2 KLOC or 1200 LOC. The only value that still needs to be determined is the scale exponent B. To determine B, equation 4.5 will be used. The summary of the values for the effort estimation is shown in Table 5.2.

| Scale Factor | Rating | Value |
|---|---|---|
| PREC | Considerable | 3.5 |
| Development flexibility | Considerable | 2.3 |
| Architecture / Risk Resolution | Some | 3.4 |
| Team Cohesion | Medium | 2.5 |
| Process maturity | Medium | 1.63 |

Table 5.2: Summary of effort estimation values

After all the scale driver values have been determined, B can be determined with

$$B = 1.01 + 0.01 \sum_i W_i = 1.01 + 0.01(3.5 + 2.3 + 3.4 + 2.5 + 1.63) = 1.143$$
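The exponent calculation is a plain sum and can be checked with the sketch below. The function name `scale_exponent` and the dictionary `SCALE_FACTORS` are hypothetical; the abbreviations used as keys are those of Appendix A, and the values are the ones listed in Table 5.2.

```python
# Scale factor values from Table 5.2, keyed by the Appendix A abbreviations.
SCALE_FACTORS = {"PREC": 3.5, "FLEX": 2.3, "RESL": 3.4, "TEAM": 2.5, "PMAT": 1.63}


def scale_exponent(scale_factors, base=1.01, step=0.01):
    """B = base + step * (sum of the scale factor values)."""
    return base + step * sum(scale_factors.values())


B = scale_exponent(SCALE_FACTORS)
print(f"B = {B:.4f}")
```

The unrounded value is 1.1433, which the text rounds to 1.143.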

Now the nominal person-month value can be determined with

$$PM_{nominal} = A \times (Size)^{B} = 1.25 \times (1.2)^{1.143} \approx 1.2$$

The A coefficient is taken as 1.25. Previous project data showed that the default of 2.5 gave an overestimate, and that calibrating the coefficient to 1.25 gave a more accurate value.
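The nominal effort formula can be evaluated directly, as in the sketch below. The function name `nominal_effort` is hypothetical and the inputs are the values quoted above (A = 1.25, Size = 1.2 thousand lines, B = 1.143). Evaluated this way the expression comes to roughly 1.54 person-months, which differs from the 1.2 quoted in the text. The sketch only evaluates the formula as written and does not attempt to explain the difference, so the figures that follow in this chapter should be read as the author's reported values.

```python
def nominal_effort(a, size_kloc, b):
    """Nominal effort in person-months: A * Size^B, with Size in thousands of lines."""
    return a * size_kloc ** b


# Calibrated coefficient used in the case study.
pm_nominal = nominal_effort(a=1.25, size_kloc=1.2, b=1.143)

# Same inputs with the default coefficient of 2.5 mentioned in the text.
pm_default = nominal_effort(a=2.5, size_kloc=1.2, b=1.143)

print(f"A = 1.25: {pm_nominal:.2f} person-months")
print(f"A = 2.5:  {pm_default:.2f} person-months")
```

Because A enters the formula as a plain multiplier, halving it from 2.5 to 1.25 halves the nominal effort for any size and exponent.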

Now the nominal person-month value has to be adjusted. To adjust it, the cost drivers have to be determined. Table 5.3 provides a summary of the 17 cost driver values. The description of each value is listed in Appendix A.


| Driver | Value |
|---|---|
| Required Software Reliability | 4 |
| Data Base Size | 5 |
| Product Complexity | 4 |
| Required Reusability | 2 |
| Documentation to life cycle needs | 3 |
| Execution time constraints | 3 |
| Main storage constraint | 3 |
| Platform Volatility | 3 |
| Analyst capability | 3 |
| Engineering capability | 3 |
| Application experience | 4 |
| Platform experience | 3 |
| Language and tool experience | 3 |
| Personnel continuity | 2 |
| Use of software tools | 3 |
| Multi-site development | 3 |
| Required development schedule | 2 (75%) |

Table 5.3: Summary of cost driver values

The Adjusted Person-Months is determined with equation 4.6:

$$PM_{adjusted} = PM_{nominal} \times \prod_i EM_i = 1.2 \times 1.1705 = 1.4$$

PM_adjusted without the SCED driver is 1.2 × 1.7558 = 2.1

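To determine the schedule, equation 4.7 is used:

$$TDEV = \left[ 3.67 \times (\overline{PM})^{(0.28 + 0.2(B - 1.01))} \right] \times \frac{SCED\%}{100} = 3.67 \times (2.1)^{0.3066} \times 0.75 \approx 3 \text{ months}$$

where $\overline{PM}$ is the adjusted effort without the SCED driver.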
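The schedule calculation can be checked with the following sketch, which takes the effort without the SCED driver (2.1 person-months), the exponent B = 1.143 and the 75% schedule compression from the text as inputs. The function name `development_time` is hypothetical.

```python
def development_time(pm_without_sced, b, sced_percent):
    """Schedule in months: 3.67 * PM^(0.28 + 0.2 * (B - 1.01)) * SCED% / 100."""
    exponent = 0.28 + 0.2 * (b - 1.01)
    return 3.67 * pm_without_sced ** exponent * sced_percent / 100.0


tdev = development_time(pm_without_sced=2.1, b=1.143, sced_percent=75)
print(f"TDEV = {tdev:.2f} months")
```

The unrounded result is close to 3.5 months, which the text reports as approximately 3 months.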


The cost of developing the code alone would be the cost to company of one developer. Other expenses would include hardware costs, development software and overheads.

Expert-based technique

The expert-based technique starts with a work session. The following members of the project were involved in the estimation process:
• Project manager
• Head software developer
• Developer assigned for the project
• Specialist developer working with the terminal

Each member had to give their estimate of how long they expected the development to last. The results are listed in Table 5.4.

| Member | Months |
|---|---|
| Project manager | 2 |
| Head software developer | 2.5 |
| Developer assigned for the project | 3 |
| Specialist developer working with the terminal | 2.5 |

Table 5.4: Summary of team estimation values

From equation 4.8 the estimated time = (3 + 4*2.5 + 2)/6 = 2.5 months.

The project took 2.5 months to complete, which is exactly in line with the prediction of the expert-based technique.
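The weighted average used for the expert estimate can be sketched as follows. The function name `three_point_estimate` is hypothetical. Taking the lowest and highest individual estimates as the two extremes follows the numbers in the calculation above, while using the median of the four estimates as the most likely value is an assumption made for this illustration; it happens to coincide with the 2.5 months used in the text.

```python
from statistics import median

# Individual estimates in months, from Table 5.4.
TEAM_ESTIMATES = {
    "Project manager": 2.0,
    "Head software developer": 2.5,
    "Developer assigned for the project": 3.0,
    "Specialist developer working with the terminal": 2.5,
}


def three_point_estimate(values):
    """(highest + 4 * most likely + lowest) / 6, with the median as most likely."""
    values = list(values)
    lowest, highest = min(values), max(values)
    most_likely = median(values)  # assumption: median stands in for "most likely"
    return (highest + 4 * most_likely + lowest) / 6


expert_months = three_point_estimate(TEAM_ESTIMATES.values())
print(f"Expert-based estimate: {expert_months} months")
```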

| COCOMO II | Expert | Actual |
|---|---|---|
| 3 months | 2.5 months | 2.5 months |

Table 5.5: Comparison between COCOMO II and Expert-Based Technique


5.4 Conclusion

From the results of the size estimation it can be seen that the function point count was more accurate than the source lines of code estimate. This can be attributed to the fact that the project at hand is small and that the requirements were available.

Function points are very accurate for small projects and become less accurate as size and complexity increase. The same goes for SLOC estimation.

Function point counting is currently more accurate than SLOC estimation because it depends less on accurate early requirements and takes complex development functions into account.

On the effort side it can be seen that the expert-based technique is more accurate than the COCOMO II method. This can also be attributed to the fact that the project is small. Most of the cost drivers are intended for medium to large projects with medium to large teams.

The conclusion is that the expert method is well suited to small projects, where small projects can be seen as projects where one or two developers are involved and the project life span is less than months.


Chapter 6

Conclusions and Recommendations

6.1 Conclusions

This dissertation has presented an overview of a variety of software estimation techniques and of several popular estimation models currently available. Literature to date [4] indicates that expertise-based techniques are less mature than the other classes of techniques, but that all classes of techniques are challenged by the rapid pace of change in software technology.

The baseline COCOMO II family of software cost estimation models presented here provides an adaptable cost estimation capability well matched to the major current and likely future software process trends. It is currently serving as the framework for an extensive data collection and analysis effort to further refine and calibrate its estimation capabilities.

Thus, it can be seen that the COCOMO II rating scales and effort multipliers provide a rich quantitative framework for exploring software project and organizational tradeoff and sensitivity analysis. The framework would enable the project manager to explore alternative staffing options involving various mixes of application, platform, and language and tool experience. An organization-level manager could also explore various options for transitioning a portfolio of applications from their current application/platform/language configuration to a desired new configuration (e.g., by using pilot projects to build up experience levels).

Software cost estimation is an important part of the software development process. Models can be used to represent the relationship between effort and a primary cost factor such as size. Cost drivers are used to adjust the preliminary estimate provided by the primary cost factor. Although models are widely used to predict software cost, many suffer from some common problems. The structure of most models is based on empirical results rather than theory. Models are often complex and rely heavily on size estimation. Despite these problems, models are still important to the software development process. Models can be used most effectively to supplement and corroborate other methods of estimation.

The following points were observed by the author:

• Experience and informal analogy are the primary cost estimation methods. The majority of organizations relied on individuals' expertise and experience to arrive at cost estimates. Managers received little or no training in estimation. Estimators were expected to arrive at accurate estimates by relying on their knowledge of the software process used within the organization and recollections of their previous projects.

• Few organizations have sufficient historical data to be used for cost estimation. With a few notable exceptions, organizations did not have information regarding past projects recorded in a manner that was useful and accessible to estimators. However, a number of organizations had recently implemented programs to gather and store this data, but it will be a few years before the impact of the data gathering on estimation accuracy can be determined.

• Estimation cannot be improved without a well-defined and well controlled software process. Organizations without a defined and controlled software process cannot achieve consistency in their software development. Without consistency in software development, consistently accurate estimates are not possible.

• Requirements creep is a major reason for cost overruns. It can be minimized, but cannot be eliminated. Two conclusions are drawn. First, if cost estimates are to be accurate, the initial software requirements must be as complete and correct as possible. Second, for complex systems, it is impossible to generate requirements that are 100% complete and correct. Thus, one must accept the fact that complete accuracy for estimates of complex systems is not possible.

6.2 Recommendations

The solution to improving estimation accuracy is not a high technology issue. No existing tools, models, or methodologies can be brought to bear on the problem that by themselves will have a significant impact. Rather, the problem is one of applying simple technologies, an effective software development process, and proper management and control to achieve a consistency in development, which allows more accurate cost estimation. Solutions to the cost estimation problem must address the issues in all of these areas or they will not be effective.

The cost estimation problem varies considerably among organizations that do their estimation under very different constraints. The recommendations are general in nature and must be tailored to the individual organization's needs, depending on whether it is a maintenance group, procurement organization, commercial developer, etc.

The following recommendations are based on two assumptions:
• There is significant room for improvement in the accuracy of cost estimates for software intensive systems.
• Although there is room to improve the level of accuracy of software cost estimates, there will continue to be a large margin of error; organizations must adapt to accept this fact.


Software Process Improvements

Improving software cost estimation accuracy must begin with a solid and effective software development process. An effective software process can be used to increase accuracy in cost estimation in a number of ways:
• Formalizing when and how cost estimates and re-estimates are performed. A critical aspect of estimation accuracy is to have a well-defined process that defines when and how cost estimates are performed.
• Defining the process used to perform the estimates, including who performs the estimate and who has sign-off authority on the estimate.
• Permitting effective monitoring and control of software costs. No cost estimate will be accurate without effective monitoring and control of software costs. If there is no effective technique for monitoring and controlling the project, there is an increased risk of the costs of the project escalating without management being able to recognize or identify the problem at a time when action can be taken to minimize the effect.
• Providing an objective measure of completeness. Each Work Breakdown Structure (WBS) work item should have clearly identified output items and an objective means of determining the completeness of these items.
• Analyzing problems reported during the development process. Every reported problem with either a product or a process should be traced back to its cause. This requires determining which Work Breakdown Structure activity, and which work item of that activity, was the cause of the problem. This is a prerequisite to determining whether a particular activity within the organization is a cause of the problem.
• Recognizing that cost estimates based on the initial requirements are wrong because the requirements are wrong. This means there must be a provision within the software process to re-estimate costs as requirements are changed. The re-estimation depends on the constraints under which the system is being developed.


Maintaining a Historical Database

Organizations should maintain a database that can be used as a basis for estimating the costs of future projects. The database should include both project metrics, which describe the features of the system built, and process metrics, which describe features of the process used to build the system. It is impossible to identify specific metrics that should be recorded and used by every organization; each organization and each situation is unique. However, metrics recorded for the purpose of improving cost estimation should cover the following:
• Actual cost of the system development. Unless the actual cost of the system development is known, it is impossible to determine the accuracy of the estimates.
• All estimates and re-estimates. To determine the accuracy of estimates, and the rate of convergence of the estimates to the actual cost, a complete record of all estimates must be maintained.
• The characteristics of the completed product. This includes the size measured in some suitable units (e.g., Source Lines of Code, Function Points), a description of the functionality of the system, classification of type of software, and any other information that characterizes the system. This information is required if any rigorous estimation by analogy is to be performed or if any costing models are to be developed.
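One possible shape for such a record is sketched below. It is an illustration only, not a prescribed schema: the class and field names (`Estimate`, `ProjectRecord`, `relative_errors` and so on) are hypothetical, and the example is filled in with the case study figures from Chapter 5, using the schedule in months as a stand-in for cost because no monetary figures are given.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Estimate:
    """One estimate or re-estimate, in the same unit as ProjectRecord.actual."""
    method: str
    value: float


@dataclass
class ProjectRecord:
    """Hypothetical history record covering the three items listed above."""
    name: str
    software_type: str   # classification of the type of software
    description: str     # short description of the functionality
    size_sloc: int       # delivered size in source lines of code
    size_fp: float       # size in function points
    actual: float        # actual outcome (here: months, as a stand-in for cost)
    estimates: List[Estimate] = field(default_factory=list)

    def relative_errors(self) -> Dict[str, float]:
        """Signed error of each recorded estimate relative to the actual value."""
        return {e.method: (e.value - self.actual) / self.actual for e in self.estimates}


record = ProjectRecord(
    name="Cinema member card server",
    software_type="server application",
    description="Validates member cards and exchanges data with a terminal",
    size_sloc=1253,
    size_fp=35,
    actual=2.5,
    estimates=[Estimate("COCOMO II", 3.0), Estimate("Expert-based", 2.5)],
)
errors = record.relative_errors()
print(errors)
```

Keeping every estimate alongside the actual value is what makes it possible to compare methods across projects later on.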

These processes will simplify the cost estimation process and will in turn increase management capabilities to ensure that cost effective and on budget software is engineered.


6.3 Further Investigation

One of the statements made in this dissertation is that cost estimation methods should stay up to date with software trends. The current cost estimation methods express effort in man-months. This gives a good indication of the amount of human resources that will be required, but not of the skills involved.

It can happen that a project requires a senior engineer for the architectural design, that junior engineers can then complete the project, and that a database administrator is required halfway through the project.

Future cost estimation research should be directed towards indicating which skills will be needed, and for what period, to complete a specific project.


Glossary

Algorithmic Models (also known as parametric models): produce a cost estimate using one or more mathematical algorithms using a number of variables considered to be the major cost drivers. These models estimate effort or cost based primarily on the hardware/software size, and other productivity factors known as cost driver attributes.

Analogy (or Comparative) Models: Models that use a method of estimating that compares a proposed project with one or more similar and completed projects where costs and schedules are known. Then, extrapolating from the actual costs of completed projects, the model(s) estimates the cost of a proposed project.

Constructive Cost Model (COCOMO): A software cost estimation model developed by Barry Boehm and described in his book, Software Engineering Economics.

Cost Analysis: The review and evaluation of the separate cost elements and proposed profit of (a) an offeror's or contractor's cost or pricing data and (b) the judgmental factors applied in projecting from the data to the estimated costs in order to form an opinion on the degree to which the proposed costs represent what the cost of the contract should be, assuming reasonable economy and efficiency.


Cost Driver Attributes: Productivity factors in the software product development process that include software product attributes, computer attributes, personnel attributes, and project attributes.

Cost Drivers: The controllable system design or planning characteristics that have a predominant effect on the system's costs. Those few items, using Pareto's law, that have the most significant cost impact.

Cost Model: An estimating tool consisting of one or more cost estimating relationships, estimating methodologies, or estimating techniques used to predict the cost of a system or one of its lower level elements.

Delphi Technique: A group forecasting technique, generally used for future events such as technological developments, that uses estimates from experts and feedback summaries of these estimates for additional estimates by these experts until a reasonable consensus occurs. It has been used in various software cost-estimating activities, including estimation of factors influencing software costs.

Domain: A specific phase or area of the software life cycle in which a developer works. Domains define developers' and users' areas of responsibility and the scope of possible relationships between products. The work can be organized by domains such as Software Engineering Environments, Documentation, Project Management etc.

Expert Judgment Models: use a method of software estimation that is based on consultation with one or more experts that have experience with similar projects. An expert-consensus mechanism such as the Delphi technique may be used to produce the estimate.

Function Points: Function Points are those pieces of code that perform some specific activity related to inputs, inquiries, outputs, master files, and external system interfaces.


Life Cycle: The stages and process through which hardware or software passes during its development and operational use. The useful life of a system. Its length depends on the nature and volatility of the business, as well as the software development tools used to generate the databases and applications.

Metric: Quantitative analysis values calculated according to a precise definition and used to establish comparative aspects of development progress, quality assessment or choice of options.

New Line of Code: A source line of code that will be developed completely, i.e., designed, coded and tested.

PM: Person-Months. A person-month is the amount of time one person spends working on the software development project for one month.

Price Analysis: The process of examining and evaluating a proposed price without evaluating its separate cost elements and proposed profit.

Process: The sequence of activities (in software development) described in terms of the user roles, user tasks, rules, events, work products, resource use, and the relationships between them. It may include the specific design methodology, language, documentation standards etc.

Rayleigh Distribution: A curve that yields a good approximation to the actual labor curves on software projects.

Real-Time: 1) Immediate response. The term may refer to fast transaction processing systems in business; however, it is normally used to refer to process control applications. For example, in avionics and space flight, real-time computers must respond instantly to signals sent to them. 2) Any electronic operation that is performed in the same time frame as its real-world counterpart. For example, it takes a fast computer to simulate complex, solid models moving on screen at the same rate they move in the real world. Real-time video transmission produces a live broadcast.

Security: The protection from accidental or malicious access, use, modification, destruction, or disclosure. There are two aspects to security, confidentiality and integrity.

Software Development Life Cycle: The stages and process through which software passes during its development. This includes requirements definition, analysis, design, coding, testing, and maintenance.

Software Engineering Institute (SEI): SEI is a federally funded research and development center established in 1984 by the DoD with a broad charter to address the transition of software engineering technology. The SEI is an integral component of Carnegie Mellon University and is sponsored by the Office of the Under Secretary of Defense for Acquisition and Technology. SEI developed the Software Acquisition Capability Maturity Model (CMM) and the Checklist and Criteria for Evaluating the Cost and Schedule Estimating Capabilities of Software Organizations.

Software Method (or Software Methodology): Focuses on how to navigate through each phase of the software process model (determining data, control, or uses hierarchies; partitioning functions; and allocating requirements) and how to represent phase products (structure charts; stimulus-response threads; and state transition diagrams).

Source Lines of Code (SLOC): All executable source code statements including deliverable Job Control Language (JCL) statements, data declarations, data typing statements, equivalence statements, and input/output format statements. SLOC does not include any statement whose removal would still allow the program to compile, e.g., comments, blank lines, and non-delivered programmer debug statements.


Validation: In terms of a cost model, a process used to determine whether the model selected for a particular estimate is a reliable predictor of costs for the type of system being estimated.

Work Breakdown Structure: A work breakdown structure is a product-oriented family tree, composed of hardware, software, services, data and facilities which results from system engineering efforts during the development and production of a defense material item, and which completely defines the program. A work breakdown structure displays and defines the product(s) to be developed or produced and relates the elements of work to be accomplished to each other.


Appendix A

Scaling Drivers

Development Flexibility (FLEX)

To determine the flexibility of the development process the following features have to be taken into account:
• Need for software conformance with pre-established requirements
• Need for software conformance with external interface specifications
• Premium on early completion

| Feature | Very low | Nominal | High | Extra High |
|---|---|---|---|---|
| Need for software conformance with pre-established requirements | Full | Considerable | Considerable | Basic |
| Need for software conformance with external interface specifications | Full | Considerable | Considerable | Basic |
| Premium on early completion | High | Medium | Medium | Low |

Table A1: Development Flexibility scaling drivers

Precedentedness (PREC)

Precedentedness includes the following features:

| Feature | Very low | Nominal | High | Extra High |
|---|---|---|---|---|
| Organizational understanding of product objective | General | Considerable | Considerable | Thorough |
| Experience in working with related software systems | Moderate | Considerable | Considerable | Extensive |
| Concurrent development of associated new hardware and operational procedures | Extensive | Moderate | Moderate | Some |
| Need for innovative data processing architecture and algorithms | Considerable | Some | Some | Minimal |

Table A2: Precedentedness scaling drivers

The PREC rating is the subjective weighted average of the listed characteristics.

Architecture / Risk Resolution (RESL)

The RESL rating is the subjective weighted average of the listed characteristics (see Appendix B).

Team Cohesion (TEAM)

The Team Cohesion scale factor accounts for the sources of project turbulence and entropy due to difficulties in synchronizing the project's stakeholders: users, customers, developers, maintainers, interfaces and others. These difficulties may arise from differences in stakeholders' objectives and cultures, difficulties in reconciling objectives, or stakeholders' lack of experience and familiarity in operating as a team. Appendix C provides a detailed definition of the overall TEAM rating levels. The final rating is the subjective weighted average of the listed characteristics (see Appendix C).

Process Maturity (PMAT)

To determine the process maturity, the following Key Process Areas (KPA) questionnaire must be completed and the weighted average determined (see Appendix D).

• Check Almost Always when the goals are consistently achieved and are well established in standard operating procedures.
• Check Frequently when the goals are achieved relatively often, but sometimes are omitted under difficult circumstances.
• Check About Half when the goals are achieved about half of the time.
• Check Occasionally when the goals are sometimes achieved, but less often.
• Check Rarely If Ever when the goals are rarely if ever achieved.
• Check Does Not Apply when the engineers have the required knowledge about the project or organization and the KPA, but the KPA does not apply to the circumstances.
• Check Don't Know when uncertain about how to respond to the KPA.

After the KPA questionnaire is completed, each compliance level is weighted and a PMAT factor is calculated, as in equation A1:


$$PMAT = 5 - \left[ \sum_{i=1}^{18} \left( \frac{KPA\%_i}{100} \times \frac{5}{18} \right) \right] \qquad \text{(Equation A1)}$$
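Equation A1 can be sketched in code as follows. The function name `pmat_factor` is hypothetical, and the sketch takes the 18 compliance percentages as given inputs; how each questionnaire answer is translated into a percentage is not specified in this appendix and is left to the caller.

```python
def pmat_factor(kpa_percentages):
    """PMAT = 5 - sum over the 18 KPAs of (KPA% / 100) * (5 / 18)."""
    if len(kpa_percentages) != 18:
        raise ValueError("expected one compliance percentage per KPA (18 in total)")
    return 5 - sum(p / 100 * 5 / 18 for p in kpa_percentages)


# Boundary cases: full compliance on every KPA, no compliance at all,
# and 50% compliance throughout.
pmat_full = pmat_factor([100.0] * 18)
pmat_none = pmat_factor([0.0] * 18)
pmat_half = pmat_factor([50.0] * 18)
print(round(pmat_full, 6), pmat_none, round(pmat_half, 6))
```

The boundary cases show the direction of the scale: higher compliance gives a lower PMAT value, from 5 for no compliance down to 0 for full compliance.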

Appendix B

Architecture / Risk Resolution (RESL)

| Characteristics | Very Low | Low | Nominal | High | Very High | Extra High |
|---|---|---|---|---|---|---|
| Risk Management Plan identifies all critical risk items, establishes milestones for resolving them by Product Design Review | None | Little | Some | Generally | Mostly | Fully |
| Schedule, budget and internal milestones through Product Design Review compatible with Risk Management Plan | None | Little | Some | Generally | Mostly | Fully |
| Percentage of development schedule devoted to establishing architecture, given general product objectives | 5 | 10 | 17 | 25 | 33 | 40 |
| Percent of required top software architects available to the project | 20 | 40 | 60 | 80 | 100 | 120 |
| Tool support available for resolving risk items, developing and verifying architectural specs | None | Little | Some | Good | Strong | Full |
| Level of uncertainty in key architecture drivers: mission, user interface, hardware, technology and performance | Extreme | Significant | Considerable | Some | Little | Very little |
| Number and criticality of risk items | > 10 Critical | 5-10 Critical | 2-4 Critical | 1 Critical | > 5 Non-Critical | < 5 Non-Critical |

Table B1: Architecture/Risk resolution scaling table


Appendix C

Team Cohesion (TEAM)

| Characteristics | Very low | Low | Nominal | High | Very High | Extra High |
|---|---|---|---|---|---|---|
| Consistency of stakeholder objectives and cultures | Little | Some | Basic | Considerable | Strong | Full |
| Ability, willingness of stakeholders to accommodate other stakeholders' objectives | Little | Some | Basic | Considerable | Strong | Full |
| Experience of stakeholders in operating team | None | Little | Little | Basic | Considerable | Extensive |
| Stakeholder teambuilding to achieve shared vision and commitments | None | Little | Little | Basic | Considerable | Extensive |

Table C1: Team cohesion scaling table

Appendix D

Process Maturity (PMAT)

| Key Process Area | Almost Always (>90%) | Often (60-90%) | About Half (40-60%) | Occasionally (10-40%) | Rarely if Ever (<10%) | Does Not Apply | Don't Know |
|---|---|---|---|---|---|---|---|
| Requirements Management | | | | | | | |
| Software Project Planning | | | | | | | |
| Software Project Tracking | | | | | | | |
| Software Subcontract Management | | | | | | | |
| Software Quality Assurance | | | | | | | |
| Software Configuration Management | | | | | | | |
| Organization Process Focus | | | | | | | |
| Organization Process Definition | | | | | | | |
| Training Program | | | | | | | |
| Integrated Software Management | | | | | | | |
| Software Product Engineering | | | | | | | |
| Intergroup Coordination | | | | | | | |
| Peer Review | | | | | | | |
| Quantitative Process Management | | | | | | | |
| Software Quality Management | | | | | | | |
| Defect Prevention | | | | | | | |
| Technology Change Management | | | | | | | |
| Process Change Management | | | | | | | |

Table D1: Process maturity scaling table

Appendix E

Product Complexity (CPLX)

Control Operations
Very Low: Straight-line code with a few non-nested structured programming operators.
Low: Straightforward nesting of programming operators.
Nominal: Mostly simple nesting. Decision tables and simple callbacks or message passing.
High: Highly nested programming operators with many compound predicates. Queue and stack control.
Very High: Reentrant and recursive coding. Fixed-priority interrupt handling and complex callbacks.
Extra High: Multiple resource scheduling with dynamically changing priorities and microcode-level control.

Computational Operations
Very Low: Evaluation of simple expressions.
Low: Evaluation of moderate-level expressions.
Nominal: Use of standard math and statistical routines. Basic matrix/vector operations.
High: Basic numerical analysis: multivariate interpolation, ordinary differential equations.
Very High: Difficult but structured numerical analysis: near-singular matrix equations, partial differential equations.
Extra High: Difficult and unstructured numerical analysis: highly accurate analysis of noisy, stochastic data. Complex parallelization.

Device-dependent Operations
Very Low: Simple read, write statements with simple formats.
Low: No cognizance needed of particular processor or I/O device characteristics. I/O processing done at GET/PUT level.
Nominal: I/O processing includes device selection, status checking and error processing.
High: Operation at physical I/O level. Optimized I/O overlaps.
Very High: Routines for interrupt diagnosis, servicing, masking. Communication line handling.
Extra High: Device timing-dependent coding, micro-programmed operations.

Data Management Operations
Very Low: Simple arrays in main memory. Simple COTS-DB queries, updates.
Low: Single file subsetting with no data structure changes, no edits, no intermediate files.
Nominal: Multi-file input and single file output. Simple structural changes, simple edits.
High: Simple triggers activated by data stream contents. Complex data restructuring.
Very High: Distributed database coordination. Complex triggers. Search optimization.
Extra High: Highly coupled, dynamic relational and object structures. Natural language data management.

User Interface Management Operations
Very Low: Simple input forms, report generators.
Low: Use of simple graphic user interface (GUI) builders.
Nominal: Simple use of widget set.
High: Widget set development and extension. Simple voice I/O, multimedia.
Very High: Moderately complex 2D/3D, dynamic graphics, multimedia.
Extra High: Complex multimedia, virtual reality.

Table E1: Product complexity scaling table

Appendix F

Effort multipliers

Required Software Reliability (RELY)

This is the measure of the extent to which the software must perform its intended function over a period of time. If the effect of a software failure is only slight inconvenience then RELY is very low. If a failure would risk human life then RELY is very high.


Data Base Size (DATA)

This measure attempts to capture the effect large data requirements have on product development. The rating is determined by calculating D/P. The size of the database is important to consider because of the effort required to generate the test data that will be used to exercise the program.

$$\frac{D}{P} = \frac{\text{DataBaseSize (Bytes)}}{\text{ProgramSize (SLOC)}} \qquad \text{(Equation F1)}$$

DATA is rated low if D/P is less than 10, nominal if it is between 10 and 100, high if it is between 100 and 1000, and very high if it is 1000 or more.
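As an illustration of Equation F1, the following minimal Python sketch computes D/P and maps it to a rating. The function name `data_rating` is hypothetical, and the band boundaries used (nominal from 10, high from 100, very high from 1000) are an assumption taken from the published COCOMO II rating scale rather than from this appendix.

```python
def data_rating(db_bytes: int, program_sloc: int) -> tuple[float, str]:
    """Return the D/P ratio of Equation F1 and an assumed DATA rating band."""
    if program_sloc <= 0:
        raise ValueError("program size must be a positive number of SLOC")
    ratio = db_bytes / program_sloc
    # Band boundaries are an assumption (published COCOMO II scale).
    if ratio < 10:
        band = "Low"
    elif ratio < 100:
        band = "Nominal"
    elif ratio < 1000:
        band = "High"
    else:
        band = "Very High"
    return ratio, band


# A 10 000 SLOC program exercised with a 2 000 000 byte test database.
ratio, band = data_rating(2_000_000, 10_000)
```

Here D/P is 200, so under the assumed bands the project would be rated High for DATA.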

Product Complexity (CPLX)

Complexity is divided into five areas: control operations, computational operations, device-dependent operations, data management operations, and user interface management operations. Select the area or combination of areas that characterizes the product or a sub-system of the product. The complexity rating is the subjective weighted average of these areas. For example, if the Control Operations are rated Very Low (1) and the Data Management Operations are rated High (4), then the complexity is the average of 1 and 4, which is 2.5. Always round off to the value closest to the Nominal value, which is 3 (see Appendix E).
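The averaging rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the ratings are mapped to the numbers 1 (Very Low) to 6 (Extra High) so that Nominal is 3, the weights default to equal, and "closest to Nominal" is read as a tie-breaking rule for averages that fall exactly halfway between two ratings. The function name `cplx_rating` is hypothetical.

```python
import math
from fractions import Fraction

# Assumed numeric scale: Very Low = 1 ... Extra High = 6, so Nominal = 3.
RATINGS = ["Very Low", "Low", "Nominal", "High", "Very High", "Extra High"]
NOMINAL = 3


def cplx_rating(area_ratings, weights=None):
    """Return (weighted average, rounded rating name) for the rated areas."""
    scores = [RATINGS.index(name) + 1 for name in area_ratings]
    weights = [Fraction(w) for w in (weights or [1] * len(scores))]
    mean = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    lower, upper = math.floor(mean), math.ceil(mean)
    if mean - lower == upper - mean:
        # Exact tie (or a whole number): take the neighbour nearer Nominal.
        value = min((lower, upper), key=lambda v: abs(v - NOMINAL))
    else:
        value = lower if mean - lower < upper - mean else upper
    return mean, RATINGS[value - 1]


# Control Operations rated Very Low, Data Management Operations rated High.
average, rating = cplx_rating(["Very Low", "High"])
```

For this input the average is 2.5, which is rounded towards Nominal. Unequal subjective weights can be supplied through the optional `weights` argument.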

Required Reusability (RUSE)


This cost driver accounts for the additional effort needed to construct components intended for reuse on current or future projects. This effort is consumed in creating a more generic software design, more elaborate documentation and more extensive testing to ensure components are ready for use in other applications.

Documentation Match to Life-Cycle Needs (DOCU)

Several software cost models have a cost driver for the level of required documentation. In COCOMO II, the rating scale for the DOCU cost driver is evaluated in terms of the suitability of the project's documentation to its life-cycle needs. The rating scale goes from Very Low (many life-cycle needs uncovered) to Very High (very excessive for life-cycle needs).

Execution Time Constraint (TIME)

This is a measure of the execution time constraint imposed upon a software system. The rating is expressed in terms of the percentage of available execution time expected to be used by the system or subsystem consuming the execution time resource. The rating ranges from nominal, where less than 50% of the execution time resource is used, to extra high, where 95% of the execution time resource is consumed.

Main Storage Constraint (STOR)

This rating represents the degree of main storage constraint imposed on a software system or subsystem. Given the remarkable increase in available processor execution time and main storage, one can question whether these constraint variables are still relevant. However, many applications continue to expand to consume whatever resources are available, making these cost drivers still relevant.


Platform Volatility (PVOL)

"Platform" is used here to mean the complex of hardware and software (OS, DBMS) the software product calls on to perform its tasks. If the software to be developed is an operating system then the platform is the computer hardware. If a database management system is to be developed then the platform is the hardware and the operating system. The platform includes any compilers or assemblers supporting the development of the software system. This rating ranges from low, where there is a major change every 12 months, to very high, where there is a major change every two weeks.

Analyst Capability (ACAP)

Analysts are personnel who work on requirements, high-level design and detailed design. The major attributes that should be considered in the rating are analysis and design ability, efficiency and thoroughness, and the ability to communicate and cooperate. The rating should not consider the level of experience of the analyst. Analysts that fall in the 15th percentile are rated very low and those that fall in the 95th percentile are rated very high.

Programmer Capability (PCAP)

Current trends continue to emphasize the importance of highly capable analysts. However, the increasing role of complex software packages, and the significant productivity leverage associated with programmers' ability to deal with these software packages, indicates a trend towards higher importance of programmer capability as well.


Evaluation should be based on the capability of the programmers as a team rather than individuals. Major factors, which should be considered in the rating, are ability, efficiency and thoroughness and the ability to communicate and cooperate. The experience of the programmer should not be considered. Programmers that fall in the 15th percentile are rated very low and those that fall in the 95th percentile are rated as very high.

Applications experience (AEXP)

This rating is dependent on the level of applications experience of the project team developing the software system or subsystem. The ratings are defined in terms of the project team's equivalent level of experience with this type of application. A very low rating is for application experience of less than two months. A very high rating is for experience of six years or more.

Platform Experience (PEXP)

The Post-Architecture model broadens the productivity influence of PEXP, recognizing the importance of understanding the use of more powerful platforms, including more graphic user interface, database, networking, and distributed middleware capabilities.

Language and Tool Experience (LTEX)

This is a measure of the level of programming language and software tool experience of the project team developing the software system or subsystem. Software development includes the use of tools that perform requirements and design representation and analysis, configuration management, document extraction, library management, program style and formatting, consistency checking, etc. In addition to experience in programming with a specific language, the supporting tool set also affects development time. A very low rating is given for experience of less than two months. A very high rating is given for experience of six or more years.

Personnel Continuity (PCON)

Staff turnover has an important impact on a project. The rating scale for PCON is in terms of the project's annual personnel turnover: from 3% (very high) to 48% (very low).

Use of Software Tools (TOOL)

Software tools have improved significantly since the 1970s projects used to calibrate COCOMO. The tool rating ranges from simple edit and code (very low) to integrated lifecycle management tools (very high).

Multisite Development (SITE)

Given the increasing frequency of multisite developments, and indications that multisite development effects are significant, the SITE cost driver has been added in COCOMO II. Determining its cost driver rating involves the assessment and averaging of two factors: site collocation (from fully collocated to international distribution) and communication support (from surface mail and some phone access to full interactive multimedia).


Required Development Schedule (SCED)

This rating measures the schedule constraint imposed on the project team developing the software. The ratings are defined in terms of the percentage of schedule stretch-out or acceleration with respect to a nominal schedule for a project requiring a given amount of effort. Accelerated schedules tend to produce more effort in the later phases of development because more issues are left to be determined due to lack of time to resolve them earlier. A schedule compression to 74% of the nominal schedule is rated very low. A stretch-out of a schedule produces more effort in the earlier phases of development, where there is more time for thorough planning, specification and validation. A stretch-out to 160% is rated very high.


Appendix G

Nu Metro Server Technical Specification

Electric Liberty NU Metro Server
Product: Nu Metro Server Technical Specification
Project: Brasilia

Technical Specification

Electric Liberty

NU Metro Server FreeStyle

Revision 0.01

Document Status: Draft

Author: Andre Ladeira
Print Date: 2003/03/04
Document Source: BSL-Template-TDD Nu-metro.doc
Revision 0.01

Contents

DOCUMENT CONTROL
INTRODUCTION
  Overview
  Overall Architecture
TECHNICAL FLOW
DATABASE DESIGN
  Data related issues
TECHNICAL COMPONENTS
  Communication Protocol


Document Control

Configuration Control
Project: Brasilia
Title: NU Metro Server
Category: Technical Specification
VSS Reference: VSS\Brasilia\Specifications\Technical
Template Used: VSS\Brasilia\Templates\BSL-Template-TDD
Created By:
Creation Date:

Document History
Date | Version | Status | Who | VSS Version
 | 0.01 | Draft | - |

Revision History
Date | Version | Changes
 | 0.01 | New document created

Review History
Date | Version | Status | Management Minute Reference
 | 1.00 | |


References

Description | Source

Related Documents

Related Specifications


Introduction

Overview

This document describes the functionality of the NuMetro Server and the interface components involved. The document must be read in conjunction with the NuMetro Functional Specification document.

Overall Architecture

The architecture diagram below is a logical interpretation of the production environment implementation of the NuMetro server.

[Figure: Overall architecture. Client side: Ingenico terminal. Server side: protocol converter and Nu Metro Server.]

• The terminal connects to the protocol converter via an X.25 radio pad.
• The protocol converter converts data to a readable ASCII format.
• The Nu Metro Server is connected to the protocol converter via TCP/IP.
• The Nu Metro Server determines which message is being sent and sends the relevant data back to the protocol converter to be sent to the terminal.
• Valid data is retrieved.


Technical Flow

The Nu Metro server uses only one external COM+ component for data access, namely the modFreestyle class. An external application (the protocol converter) is used to translate messages received from the terminals to ASCII strings and vice versa. To determine when a message arrives, the Microsoft Winsock TCP/IP component is used. The following methods are used:

• tcpClient_DataArrival: Once the data has arrived at the protocol converter, this event is triggered and the process starts. A string check is also put in place to ensure that the full message has arrived. If the full message has not arrived, the message is kept in memory until the full message has been retrieved. All the data is checked to determine that no corrupted data has arrived (check sum and string length count, see message protocol).
• SendData: Once the message has been compiled, the data is sent to the protocol converter.

Note: most of the application is string manipulation to ensure that the agreed data protocol is met.
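The buffering behaviour described for the data arrival event can be sketched as follows. This is a Python illustration rather than the Winsock-based implementation of the server; the class name `FrameBuffer` is hypothetical, and the frame layout (one STX byte, a four-character command, a four-digit ASCII length, the data, one ETX byte and two checksum characters) is assumed from the communication protocol section of this specification. Checksum validation is deliberately left out of this sketch.

```python
STX, ETX = 0x02, 0x03
HEADER_LEN = 9   # STX + 4-character CMD + 4-digit LEN
TRAILER_LEN = 3  # ETX + 2 checksum characters


class FrameBuffer:
    """Collects chunks from the socket and returns only complete frames."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list[bytes]:
        self._buf.extend(chunk)
        frames = []
        while True:
            start = self._buf.find(bytes([STX]))
            if start < 0:
                self._buf.clear()        # no frame start: discard noise
                break
            del self._buf[:start]        # drop anything before STX
            if len(self._buf) < HEADER_LEN:
                break                    # header not complete yet
            length_field = bytes(self._buf[5:9])
            if not length_field.isdigit():
                del self._buf[:1]        # not a real header, resynchronise
                continue
            total = HEADER_LEN + int(length_field) + TRAILER_LEN
            if len(self._buf) < total:
                break                    # keep partial message in memory
            frame = bytes(self._buf[:total])
            del self._buf[:total]
            if frame[total - TRAILER_LEN] == ETX:
                frames.append(frame)     # frames without ETX are dropped
        return frames
```

Each call to `feed` corresponds to one data arrival event: it returns an empty list while a message is still incomplete and returns the full frame once the remaining bytes have arrived.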

Data flow process
• Login. The terminal ID is checked to determine whether the terminal may be given access. If the terminal ID is not valid, the terminal is locked out.
• Once the terminal has successfully logged in, all the parameters are sent (see the database detail).
• After successful completion of the parameter file upload, the good card list (GCL) is sent.
• After successful completion of the GCL, all vouchers issued are sent from the current terminal to the server, to be stored in the database.
• After successful completion of the voucher transfer, the terminal logs out.

Note: The server cannot initiate communication with the terminal; the terminal initiates communication and sends commands to the server. The server only responds to valid commands. The terminal will connect at the time specified in the parameter file sent.
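The ordering of the data flow process can be expressed as a small state machine. The sketch below is illustrative only: the class name `TerminalSession` is hypothetical, the command names are taken from the communication protocol section, and it is an assumption that the good list and voucher commands may repeat (one message per package) while no step may be skipped.

```python
# Commands in the order of the data flow process (names from the protocol section).
STEPS = ["ILIN", "IPAF", "IGLU", "IVBU", "ILOT"]
# Assumption: multi-package transfers repeat the same command.
REPEATABLE = {"IGLU", "IVBU"}


class TerminalSession:
    """Tracks one terminal connection; the server only answers valid commands."""

    def __init__(self, valid_terminal_ids):
        self._valid = set(valid_terminal_ids)
        self._done = -1            # index in STEPS of the last accepted step
        self.locked_out = False

    def accept(self, cmd: str, terminal_id: str = "") -> bool:
        """Return True if the server should respond to this command."""
        if self.locked_out or cmd not in STEPS:
            return False
        index = STEPS.index(cmd)
        is_next = index == self._done + 1
        is_repeat = index == self._done and cmd in REPEATABLE
        if not (is_next or is_repeat):
            return False
        if cmd == "ILIN" and terminal_id not in self._valid:
            self.locked_out = True  # invalid terminal ID: lock the terminal out
            return False
        self._done = index
        return True
```

A session for a known terminal accepts ILIN, IPAF, one or more IGLU, one or more IVBU and finally ILOT; any command received out of order is ignored.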


Database Design

Database tables used

[Figure: Database table diagram. The tables, keys and columns shown in the diagram are listed below.]

NuMetroStatusHist: pkiNMStatusHistId (PK), fkiMemberCustId, qwfldinfo_NMstat, dtEffectiveDate, dtTerminationDate, dtTimeStamp
NUMetroLoginHistory: pkiNUMetroLoginHistoryId (PK), fkiBUOrgId, dtLoginRequested, bLoginSuccessfull, bDataTransferCompleted, dtLogout
NUMetroParameter: pkiNUMetroParameterId (PK), fkiBUOrgId, sMerchantNo, sSiteVenue, sAccountSiteNo, sNextCallTime, sNextCallMethod, sNextCallNo, sNextSeqNo, IVelocity, IMaxTicketsPerVoucher, sManager_1_PinNo, sManager_1_CardNo, sManager_2_PinNo, sManager_2_CardNo, dtEffectiveDate, dtTerminationDate, dtTimeStamp, NUA_No_1, NUA_No_2, NUA_No_3
NUMetroTerminal: pkiNUMetroTerminalId (PK), fkiBUOrgId, sTerminalId, dtEffectiveDate, dtTerminationDate, dtTimeStamp
NUMetroMemberDetailUpload: pkiNUMetroMemberDetailUploadId (PK), fkiNUMetroLoginHistoryId (FK1), sSeqNo, INoOfCards, dtTimeStamp, bSent, dtSent
NUMetroMemberDetail: pkiNUMetroMemberDetailId (PK), sTransactionType, sCardNo, INoOfDependants, dtTimeStamp
NUMetroMemberDetailSeqNo: pkiNUMetroMemberDetailSeqNoId (PK), fkiNUMetroMemberDetailUploadId (FK2), fkiNUMetroMemberDetailId (FK1), dtTimeStamp
NUMetroVoucherDetail: pkiNUMetroVoucherDetailId (PK), fkiNUMetroLoginHistoryId, sControlData, sVoucherNo, sCardNo, dtVoucherIssued, INoOfTickets, sManagerPinNo, dtTimeStamp
NUMetroMemberDetail_Processed: pkiNUMetroMemberDetailId, sTransactionType, sCardNo, INoOfDependants, dtTimeStamp


Data related issues

Table: FS BAckoffice.NuMetroParameter
Description: This contains all the relevant data to be loaded onto the terminal (see communication protocol for detail)
Column | Value | Description
pkiNUMetroParameterId | Int | Primary key
fkiBUOrgId | Int | Unique organization id
sMerchantNo | Char | Cinema merchant number
sSiteVenue | Varchar | Cinema name
sAccountSiteNo | Varchar | Freestyle account number
sNextCallTime | Char | Login time every day
sNextCallMethod | Char | Call method (X25)
sNextCallNo | Char | Other call number
sNextSeqNo | Varchar | Next voucher sequence number
IVelocity | Tinyint | Velocity
IMaxTicketsPerVoucher | Tinyint | Max number of tickets to be issued
sManager_1_PinNo | Char | Manager one's pin number for supervisor's card
sManager_1_CardNo | Char | Manager one's card number for supervisor's card
sManager_2_PinNo | Char | Manager two's pin number for supervisor's card
sManager_2_CardNo | Char | Manager two's card number for supervisor's card
dtEffectiveDate | Datetime | From when active
dtTerminationDate | Datetime | To when active
dtTimeStamp | Datetime | Date created
NUA_No_1 | Varchar | NUA number to access

NuMetroLoginHistory

Table: FS BAckoffice.NuMetroLoginHistory
Description: Inserts all the detail of the terminal that is requesting connection
Column | Value | Description
pkiNUMetroLoginHistoryId | Int | Primary key
fkiBUOrgId | Int | Organization Id
dtLoginRequested | Datetime | Date of login
bLoginSuccessfull | Bit | Was login successful
bDataTransferCompleted | Bit | Was data transfer completed
dtLogout | Datetime | Date of logout


NuMetroStatusHist

Table: FS BAckoffice.NuMetroStatusHist
Description: Store status detail of Nu Metro members
Column | Value | Description
pkiNMStatusHistId | Int | Pkey
fkiMemberCustId | Int | Customer ID
qwfldinfo_NMstat | Int |
dtEffectiveDate | Datetime | Effective date
dtTerminationDate | Datetime | Termination date
dtTimeStamp | Datetime |

NuMetroTerminal

Table: FS BAckoffice.NuMetroTerminal
Description: Store all the terminal detail
Column | Value | Description
pkiNUMetroTerminalId | Int | Pkey
fkiBUOrgId | Int | Organization ID
sTerminalId | Varchar | Terminal ID
dtEffectiveDate | Datetime | Effective date
dtTerminationDate | Datetime | Termination date
dtTimeStamp | Datetime |

NuMetroMemberDetailUpload

Table: FS BAckoffice.NuMetroMemberDetailUpload
Description: Store the detail of all the member detail uploaded to each terminal
Column | Value | Description
pkiNUMetroTerminalId | Int | Pkey
fkiBUOrgId | Int | Organization ID
sTerminalId | Varchar | Terminal ID
dtEffectiveDate | Datetime | Effective date
dtTerminationDate | Datetime | Termination date
dtTimeStamp | Datetime |


NuMetroMemberDetail

Table: FS BAckoffice.NuMetroMemberDetail
Description: All members that must be uploaded, edited or are currently in the terminal memory
Column | Value | Description
pkiNUMetroMemberDetailId | Int | Pkey
sTransactionType | Char | Transaction type (add or delete)
sCardNo | Char | Member card number
INoOfDependants | Tinyint | Number of dependants
dtTimeStamp | Datetime |

NuMetroMemberDetailSeqNo

Table: FS BAckoffice.NuMetroMemberDetailSeqNo
Description: Member detail upload sequence number linked table
Column | Value | Description
pkiNUMetroMemberDetailSeqNoId | Int | Pkey
fkiNUMetroMemberDetailUploadId | Int | Link to NuMetroMemberDetailUpload
fkiNUMetroMemberDetailId | Int | Link to NuMetroMemberDetail
dtTimeStamp | Datetime |

NuMetroVoucherDetail

Table: FS BAckoffice.NuMetroVoucherDetail
Description: Store the detail of all the member detail uploaded to each terminal
Column | Value | Description
pkiNUMetroVoucherDetailId | Int | Pkey
fkiNUMetroLoginHistoryId | Int | Link to NuMetroLoginHistory table
sControlData | Varchar |
sVoucherNo | Varchar | Voucher number
sCardNo | Char | Card number
dtVoucherIssued | Datetime | Date voucher issued
INoOfTickets | Tinyint | Number of tickets for member
sManagerPinNo | Char | Was the ticket overridden
dtTimeStamp | Datetime |

NuMetroMemberDetail_Processed

Table: FS BAckoffice.NuMetroMemberDetail_Processed
Description: Store the detail of all the member detail uploaded to each terminal
Column | Value | Description
pkiNUMetroMemberDetailId | Int | Pkey
sTransactionType | Char | Transaction type
sCardNo | Char | Member card number
INoOfDependants | Tinyint | Number of dependants
dtTimeStamp | Datetime |

Stored Procedures Used

NuMetroMemberFilePopulate

Selects the member details to be uploaded to the terminal. Only 13 members can be uploaded per message, so this stored procedure retrieves 13 members at a time.
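The effect of the 13-member limit can be illustrated with a short Python sketch. It is not the stored procedure itself; the generator name `member_batches` is hypothetical and the member list is assumed to be available as any iterable.

```python
from itertools import islice

MAX_MEMBERS_PER_MESSAGE = 13  # limit stated for the member upload message


def member_batches(members, size=MAX_MEMBERS_PER_MESSAGE):
    """Yield successive lists of at most `size` members, one per message."""
    iterator = iter(members)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# 30 members are split over three messages of 13, 13 and 4 members.
sizes = [len(batch) for batch in member_batches(range(30))]
```

The number of batches produced is also the number of packages (messages) that have to be sent to the terminal.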

NuMetroRetrieveMemberDetail

Retrieves all the member detail.

NuMetroUpdateMemberDetailSend

Updates table NuMetroMemberDetailUpload to indicate that data has been sent.

NuMetroTerminalPasswordValidate

Validates terminal password

NuMetroLoginHistoryLog

Inserts a record into the NUMetroLoginHistory table.

NuMetroVoucherDetailInsert

Insert all voucher detail uploaded from terminal

NuMetroParameterSelect

Retrieves parameter detail for a specific terminal.

MemberValidationdetail

Selects all valid Nu Metro clients. Used to determine the number of packages (messages) to be sent.

NuMetroCinemasNotEntered

Selects all cinemas that have not entered for a specific day.


Technical Components

Communication Protocol

Operational and Protocol specification

HOST (Server) <=> Ingenico communication (Client)

Communication Protocol

Direction | STX | CMD | LEN | DATA | ETX | CHK
HOST => TERM | [STX] | F[XXX] | 4 Bytes | As per specification | [ETX] | [CHK] (2 Bytes)
TERM => HOST | [STX] | I[XXX] | 4 Bytes | As per specification | [ETX] | [CHK] (2 Bytes)

[XXX] = Name of process

Description

Field | Description
STX | 0x02
CMD | Example 'FLIN'
LEN | Length of data to follow, in ASCII. '0012' indicates a length of 12
DATA | Depends on CMD
ETX | 0x03
CHK | XOR of all data, excluding STX. Result in ASCII. 'FD' indicates a CHK of 0xFD

All numeric data in the DATA tag from Host and terminal will be compressed numeric.
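The message layout and check sum described above can be sketched in Python as follows. The function names are hypothetical, and two points are assumptions rather than statements of the specification: the check sum is taken over CMD, LEN, DATA and ETX (everything after STX), and the data is treated as plain bytes without the compressed numeric encoding.

```python
STX, ETX = b"\x02", b"\x03"


def checksum(body: bytes) -> bytes:
    """XOR all bytes of the body and return the result as two ASCII hex characters."""
    value = 0
    for byte in body:
        value ^= byte
    return f"{value:02X}".encode("ascii")


def build_frame(cmd: str, data: bytes = b"") -> bytes:
    """Assemble STX, CMD, LEN, DATA, ETX and CHK into one message."""
    if len(cmd) != 4 or len(data) > 9999:
        raise ValueError("CMD must be 4 characters and DATA at most 9999 bytes")
    body = cmd.encode("ascii") + f"{len(data):04d}".encode("ascii") + data + ETX
    return STX + body + checksum(body)


def parse_frame(frame: bytes) -> tuple[str, bytes]:
    """Validate length, ETX and check sum; return (CMD, DATA)."""
    if frame[:1] != STX:
        raise ValueError("missing STX")
    length = int(frame[5:9])
    end = 9 + length                      # position of ETX
    if frame[end:end + 1] != ETX:
        raise ValueError("missing ETX or wrong length")
    if frame[end + 1:end + 3] != checksum(frame[1:end + 1]):
        raise ValueError("check sum mismatch")
    return frame[1:5].decode("ascii"), frame[9:end]


cmd, data = parse_frame(build_frame("ILIN", b"12345678"))
```

A message that is truncated or altered in transit fails one of the three checks in `parse_frame` and raises `ValueError`, which corresponds to the corrupted-data check described in the technical flow.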

Login and Logout

Login - Terminal to Host Request

Direction | STX | CMD | LEN | DATA | ETX | CHK
TERM => HOST | [STX] | ILIN | 0008 | [Y] | [ETX] | [CHK]

[Y] = Terminal ID (8 Bytes, Numeric)

Login - Host to Terminal Reply

Direction | STX | CMD | LEN | DATA | ETX | CHK
HOST => TERM | [STX] | FLIN | 0013 | [X][Z] | [ETX] | [CHK]

[X] = Valid login (1 = accepted, 0 = rejected; 1 Byte, Numeric)
[Z] = Host date and time (CCYYMMDDHHMM, 12 Bytes, Numeric)
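As a worked example of the login exchange, the data fields can be composed as shown below. The function names are hypothetical. The sketch assumes that numeric fields are sent as plain ASCII digits and that the 12-byte host timestamp is read as century, year, month, day, hour and minute.

```python
from datetime import datetime


def login_request_data(terminal_id: int) -> str:
    """[Y]: the terminal ID as 8 numeric characters."""
    return f"{terminal_id:08d}"


def login_reply_data(accepted: bool, host_time: datetime) -> str:
    """[X][Z]: one flag character followed by the 12-character host date and time."""
    return ("1" if accepted else "0") + host_time.strftime("%Y%m%d%H%M")


def parse_login_reply(data: str) -> tuple[bool, datetime]:
    """Split a 13-character reply into the accepted flag and the host time."""
    if len(data) != 13:
        raise ValueError("login reply data must be 13 characters")
    return data[0] == "1", datetime.strptime(data[1:], "%Y%m%d%H%M")


reply = login_reply_data(True, datetime(2003, 3, 4, 9, 30))
```

The reply data is 13 characters long, which agrees with the LEN value of the login reply message.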

Logout - Terminal to Host Request

Direction | STX | CMD | LEN | DATA | ETX | CHK
TERM => HOST | [STX] | ILOT | 0008 | [Y] | [ETX] | [CHK]

[Y] = Terminal ID (8 Bytes, Numeric)

Logout - Host to Terminal Reply

Direction | STX | CMD | LEN | DATA | ETX | CHK
HOST => TERM | [STX] | FLOT | 0000 | | [ETX] | [CHK]

Voucher batch upload

Batch detail

Request

Direction | STX | CMD | LEN | DATA | ETX | CHK
TERM => HOST | [STX] | IVBU | 4 Bytes | [A][B][X][C1][C2][C3][C4][C5] | [ETX] | [CHK]

Notes

[A] = Packet number to be retrieved (2 bytes, padded with nulls if needed, Numeric)
[B] = Total number of packets to be sent (2 bytes, padded with nulls if needed, Numeric)
[X] = Number of records sent (2 bytes, Numeric)
[C1] = Voucher number (8 Bytes, Numeric)
[C2] = Card Number (14 Bytes, Numeric)
[C3] = Current Date Time (YYYYMMDDHHMM, 12 Bytes, Numeric)
[C4] = Number of tickets (2 Bytes, Numeric)
[C5] = Manager Number (1 Byte, Numeric)

Fields [C1] to [C5] are repeated [X] times, where [X] may not be greater than 5.

LEN = Variable

Description
• Number of records. The number of records in the voucher file for the specific package sent.
• Voucher number. The number that was printed on the voucher.
• Card Number. The number of the card that was used.
• Current Date Time. The date and time the ticket was issued.
• Number of tickets. The number of tickets requested.
• Manager Number. The manager number as per the parameter file. If the manager number = 1 then it is 'Manager One Card Number' as per the parameter file. The same applies if the manager number = 2. If the manager number = 0 then there was no override.

Reply

Batch detail Response

Direction | STX | CMD | LEN | DATA | ETX | CHK
HOST => TERM | [STX] | FVBU | 0002 | [X] | [ETX] | [CHK]

Notes

[X] = Number of records received (2 bytes, Numeric)

LEN = 2
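The voucher batch request and its reply can be decoded along the following lines. This Python sketch is illustrative only: the names `Voucher`, `parse_voucher_batch` and `voucher_batch_reply` are hypothetical, the fields are assumed to arrive as plain ASCII characters (the compressed numeric encoding is not modelled), and null padding is simply stripped before conversion.

```python
from dataclasses import dataclass

RECORD_WIDTHS = (8, 14, 12, 2, 1)   # [C1] to [C5]
RECORD_LEN = sum(RECORD_WIDTHS)     # 37 characters per voucher record
MAX_RECORDS = 5


@dataclass
class Voucher:
    voucher_no: str
    card_no: str
    issued: str        # YYYYMMDDHHMM
    tickets: int
    manager_no: int    # 0 = no override, 1 or 2 = manager card used


def _number(field: str) -> int:
    """Convert a numeric field, ignoring null padding."""
    return int(field.strip("\x00") or "0")


def parse_voucher_batch(data: str):
    """Return (packet number, total packets, list of Voucher records)."""
    packet_no, total, count = (_number(data[i:i + 2]) for i in (0, 2, 4))
    if count > MAX_RECORDS or len(data) != 6 + count * RECORD_LEN:
        raise ValueError("record count does not match the data length")
    vouchers = []
    for index in range(count):
        pos = 6 + index * RECORD_LEN
        fields = []
        for width in RECORD_WIDTHS:
            fields.append(data[pos:pos + width])
            pos += width
        vouchers.append(Voucher(fields[0], fields[1], fields[2],
                                _number(fields[3]), _number(fields[4])))
    return packet_no, total, vouchers


def voucher_batch_reply(vouchers) -> str:
    """[X] of the reply: the number of records received, as 2 characters."""
    return f"{len(vouchers):02d}"
```

A packet carrying two records therefore has 6 + 2 x 37 = 80 data characters, and the host acknowledges it with the reply data "02".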

Parameter File - Host to terminal

Request

Direction | STX | CMD | LEN | DATA | ETX | CHK
TERM => HOST | [STX] | IPAF | 0000 | | [ETX] | [CHK]

Notes

This will be a request for the parameter file

Reply

Direction | STX | CMD | LEN | DATA | ETX | CHK
HOST => TERM | [STX] | FPAF | 0153 | [A][B][C][D][E][F][H][I][J][K][L][M][N][O][P][Q] | [ETX] | [CHK]

Notes

[A] = Merchant Number (14 Bytes, Numeric)
[B] = Site Venue (24 Bytes, Alpha Numeric)
[C] = Account Site Number (10 Bytes, Numeric)
[D] = Next Call Time (HHMM, 4 Bytes, Numeric)
[E] = Next Call Method (3 Bytes, Alpha Numeric)
[F] = Next Call Number (10 Bytes, Numeric)
[H] = Next Sequence Number (8 Bytes, Numeric)
[I] = Velocity Values (2 Bytes, Numeric)
[J] = Tickets Per Voucher (2 Bytes, Numeric)
[K] = Manager One Pin Number (6 Bytes, Numeric)
[L] = Manager One Card Number (14 Bytes, Numeric)
[M] = Manager Two Pin Number (6 Bytes, Numeric)
[N] = Manager Two Card Number (14 Bytes, Numeric)
[O] = NUA1 (12 Bytes, Numeric)
[P] = NUA2 (12 Bytes, Numeric)
[Q] = NUA3 (12 Bytes, Numeric)

Description

• Merchant Number. Unique number allocated by Freestyle to each individual card reader. This number will be used to uniquely identify the site.
• Site Venue. Indicates the site and the venue of the card reader.
• Account Site Number. Number for accounting purposes.
• Next Call Time. Next time for upload.
• Next Call Method. Will not be used for launch, but may be used in the future if a different medium is available to transfer data. The default value will be "X25".
• Next Call Number. Number needed to use a different medium. The value will be populated with 10 nulls.
• Next Sequence Number. Specifies the next number to be printed on the next voucher issued. If this value is not provided, the next sequential number in the card reader must be used. The voucher number will be as follows: MMDDXXXX, where XXXX is the sequential number.
• Velocity Values. Specifies the time required to elapse before a card can be approved as a valid re-swipe. The default will be 24 hours.
• Tickets Per Voucher. Specifies the maximum number of tickets that can be issued on a voucher. A default value of 000 will mean according to the maximum number of participating members on the card.
• Manager Pin Number. Stores the manager's pin number. If this pin number is not provided, the old number must be retained.
• Manager Card Number. Card number of the manager, to be used to override the system.
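The fixed-width layout of the parameter file reply lends itself to a table-driven sketch. In the Python example below the field names are hypothetical labels for [A] to [Q], the widths are those listed in the notes, and padding short or missing values with nulls on the right is an assumption (the specification only states this for the Next Call Number).

```python
# (hypothetical field name, width in bytes) in message order [A] .. [Q]
PARAMETER_FIELDS = [
    ("merchant_no", 14), ("site_venue", 24), ("account_site_no", 10),
    ("next_call_time", 4), ("next_call_method", 3), ("next_call_no", 10),
    ("next_seq_no", 8), ("velocity", 2), ("tickets_per_voucher", 2),
    ("manager_1_pin", 6), ("manager_1_card", 14),
    ("manager_2_pin", 6), ("manager_2_card", 14),
    ("nua_1", 12), ("nua_2", 12), ("nua_3", 12),
]
PARAMETER_LEN = sum(width for _, width in PARAMETER_FIELDS)  # 153


def pack_parameters(values: dict) -> str:
    """Concatenate the fields, null-padding each one to its fixed width."""
    parts = []
    for name, width in PARAMETER_FIELDS:
        text = str(values.get(name, ""))
        if len(text) > width:
            raise ValueError(f"{name} is longer than {width} characters")
        parts.append(text.ljust(width, "\x00"))
    return "".join(parts)


def unpack_parameters(data: str) -> dict:
    """Split a 153-character DATA field back into named values."""
    if len(data) != PARAMETER_LEN:
        raise ValueError("parameter data must be 153 characters")
    values, pos = {}, 0
    for name, width in PARAMETER_FIELDS:
        values[name] = data[pos:pos + width].rstrip("\x00")
        pos += width
    return values


packed = pack_parameters({"merchant_no": "12345678901234",
                          "next_call_time": "0200",
                          "next_call_method": "X25"})
```

The widths add up to 153, which agrees with the LEN value 0153 of the reply message and gives a simple consistency check on the field list.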

Good List Upload - Member Data

Data upload

Request

Direction | STX | CMD | LEN | DATA | ETX | CHK
TERM => HOST | [STX] | IGLU | 0012 | [A] | [ETX] | [CHK]

Notes

[A] = Sequence Number [CCYYMMDDXXXX] (12 Bytes, Numeric), where XXXX indicates the package to be sent

Reply

Direction | STX | CMD | LEN | DATA | ETX | CHK
HOST => TERM | [STX] | FGLU | 4 Bytes | [A][B][C][D][E] | [ETX] | [CHK]

Notes

[A] = Next sequence number [CCYYMMDDXXXX] (12 Bytes, Numeric)
[B] = Number of cards [X] to follow (up to a maximum of 13) (2 Bytes, Numeric)
[C] = Add "A" or Delete "D" (1 Byte, Alpha Numeric)
[D] = Card Number (14 Bytes, Numeric)
[E] = Number of dependants (2 Bytes, Numeric)

Fields [C][D][E] are repeated [X] times, where [X] may not be greater than 13.

LEN = Variable

If the host returns the same sequence value, this indicates that there are no more cards to download. In this case [X] will also be zero.
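The good list reply can be composed and decoded as sketched below. The function names are hypothetical, and the sketch assumes plain ASCII digits for numeric fields, with card numbers zero-padded on the left to 14 characters.

```python
MAX_CARDS = 13
CARD_RECORD_LEN = 1 + 14 + 2   # [C] action, [D] card number, [E] dependants


def build_good_list_reply(next_seq: str, cards) -> str:
    """cards is a list of (action, card number, dependants) tuples."""
    if len(next_seq) != 12 or len(cards) > MAX_CARDS:
        raise ValueError("sequence must be 12 characters, at most 13 cards")
    parts = [next_seq, f"{len(cards):02d}"]
    for action, card_no, dependants in cards:
        if action not in ("A", "D") or len(card_no) > 14:
            raise ValueError("invalid card record")
        parts.append(f"{action}{card_no:0>14}{dependants:02d}")
    return "".join(parts)


def parse_good_list_reply(data: str):
    """Return (next sequence number, list of (action, card number, dependants))."""
    next_seq, count = data[:12], int(data[12:14])
    if len(data) != 14 + count * CARD_RECORD_LEN:
        raise ValueError("card count does not match the data length")
    cards = []
    for index in range(count):
        pos = 14 + index * CARD_RECORD_LEN
        record = data[pos:pos + CARD_RECORD_LEN]
        cards.append((record[0], record[1:15], int(record[15:17])))
    return next_seq, cards


reply = build_good_list_reply("200303040001",
                              [("A", "55551234567890", 2),
                               ("D", "55559876543210", 0)])
```

A reply with no cards consists of the sequence number followed by "00", which is the end-of-download condition described above; a full message with 13 cards carries 14 + 13 x 17 = 235 data characters.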

Supervisors Card

There will be two cards issued per site. The cards will have the following detail:
• Track One - "NU" + card number (5555XXXXXXXXC), where XXXXXXXX is the unique number and C the check sum.
• Track Two - Merchant Number

Each manager will be allocated a 6-digit password. Once the supervisor's card is swiped, the manager will be prompted to enter the password.

The password will be validated against the data that has been downloaded in the parameter file.
