2014 Joint Conference of the International Workshop on Software Measurement and the International Conference on Software
Process and Product Measurement
Measuring the Software Size of Sliced V-model
Projects
Andreas Deuter
PHOENIX CONTACT Electronics GmbH
Dringenauer Str. 30
31812 Bad Pyrmont, Germany
Email: [email protected]

Gregor Engels
University of Paderborn
Zukunftsmeile 1
33102 Paderborn, Germany
Email: [email protected]
Abstract—Companies expect higher productivity from their software teams when introducing new software development methods. Productivity is commonly understood as the ratio of output created to resources consumed. Whereas the measurement of the resources consumed is rather straightforward, there are several definitions for counting the output of software development. Source code-based metrics create a set of valuable figures directly from the heart of the software, the code. However, depending on the chosen process model, software developers and testers also produce a fair amount of documentation. Up to now this output has remained uncounted, leading to an incomplete view of the development output. This article addresses this open point by proposing a novel automated way of measuring software quantity. It extends source code-based metrics, namely the size of code changes, which is called churn, by counting the work items that represent the design and test documentation belonging to that churn. We demonstrate the validity of this approach on the sliced V-model process, which is an agile extension of the traditional V-model.

Keywords—Software productivity, software quantity, work items, version control, sliced V-model

I. INTRODUCTION

The importance of software in the industrial sector is growing enormously. The reason is the increasing share of software in the added value of industrial products. More and more devices obtain their innovations through software. Examples of industrial products with a high degree of embedded software are the Programmable Logic Controllers (PLC) of Phoenix Contact. Phoenix Contact is a leading supplier of electrical and electronic components for industrial applications with a wide range of products. The overwhelming majority of them are mechanical and hardware components. That means the core competences of the company are hardware development and production, not software. However, the importance of software is growing rapidly. The software in a PLC consists of several hundred thousand lines of code. New versions of existing PLC types are generated almost exclusively by software updates.

The mechanical and electronic assembly of the PLCs is highly automated. A full range of figures is used to monitor the production process, ensuring that manufacturing productivity is maintained and increased. A typical set of figures contains the number of pieces per unit of time, the cost of material, or quality losses. This means that there is a very good understanding of productivity in the assembly lines.

However, the “manufacturing” of software within the standard production process is just a very short step in which the binaries of the software are brought to the device. This step is hard to optimize and does not reflect the “production of software” at all. The software is actually created in the development teams by applying typical software engineering methods. To keep up with the high demand for new functionality (e.g., for PLCs), the software development process within these companies must improve. Therefore, they start to analyze their software processes in detail and to identify productivity drivers. They enhance the way they specify, implement and test their software products. Improvements can address some of the following elements:

• Choose, implement or change the software process model, suitable to their specific environment
• Implement an appropriate tool landscape for software life-cycle management
• Implement test automation
• Train and qualify the teams

As it is possible to enhance the team’s performance in even more areas, the main challenge for the companies is now to find out whether their expectations of improvement are really met. The first challenge is to formulate objectives and requirements for these expectations before starting the improvement activities. As with all good-quality requirements, it must be clearly possible to validate them, and precise target figures should be announced at an early stage. An example of a precise target figure is: “Reduce the development time by 30 percent for the same amount of software”. After establishing one or more of the mentioned improvement elements and using them in a software development project, the requirements have to be validated. This validation should then be repeated continuously in the subsequent projects.

A typical approach to validating the objectives is to ask the stakeholders (e.g., from the management) and the teams for their evaluation of the situation before and after installing the new procedures. This can be done by formal questionnaires or by informal talks in the team meetings. This gives a good estimate of how the changes are accepted by the team staff involved. However, it does not give an objective assessment of the situation.
978-1-4799-4174-2/14 $31.00 © 2014 IEEE DOI 10.1109/IWSM.Mensura.2014.22

A technical way of measuring improvements is to automatically generate output figures for the software created, for example by using a reporting tool. Setting these figures in relation to the consumed resources yields the kind of productivity information known from the assembly lines, see Equation (1).

$$\text{productivity} = \frac{\text{output}}{\text{resources}} \tag{1}$$

Having access to such figures enables companies to measure their software productivity and to address the following questions on productivity formulated by Albrecht [1]:

• Are we doing as well as we can?
• Are we competitive?
• Are we improving?

Whereas the answers to the first two questions are of some interest, the companies focus on finding satisfying answers to the third one by comparing a first set of measurement figures with successive ones. Once these sets are integrated into a trend analysis, the companies are in a position to illustrate increasing (or decreasing) performance due to any change activity.

Having identified the need for measuring software productivity, the challenge arises of how to identify the output created and the resources consumed. Whereas the latter is mainly represented by the hours spent by the team staff on a certain software project, it is very difficult to create an understanding of the output. There are many different, and not at all harmonized, views on how to define “software output”. Very popular methods create figures based on the heart of the software, the code. Other methods quantify the functionality provided by the software with points. For a company like Phoenix Contact, that is not enough. Its software development teams spend a lot of effort on writing requirements specifications, system design specifications and test reports. This documentation is needed for managing the products over a long life-cycle of up to 20 years. Furthermore, thorough documentation of the software development process has to be provided to customers or even to approval authorities, for example when an IEC 61850 [2] certification is required for safety products.

The software development lifecycle demanded by safety standards must follow the V-model. The V-model gives good guidance for documentation and instructions to prove the correctness of all verification and validation (V&V) activities. To have only one process model in place, companies also apply the V-model to standard software products.

Having a high level of expertise in monitoring assembly line productivity, the companies seek methods that allow them to measure software output in the same way as production output. That means they would like to get the data directly from the production databases in an automated way without manual rework. The figures shall include every output on which effort has been spent. In summary, the method for measuring the output of software development shall meet the requirements formulated in Table I.

TABLE I. REQUIREMENTS FOR SOFTWARE OUTPUT MEASUREMENT
ID    Requirement
R1    Figures on software output are concrete numbers
R2    Figures are objective (i.e., they are not based on questionnaires)
R3    Figures include the size of software and its development documentation
R4    The measurement of the figures is automated
R5    The figures are useable for trend analysis

This article gives an overview of known definitions of software quantity. It explains why none of them fulfills the complete set of our requirements. It proposes a novel approach that extends source code-based methods by counting the size of the networks of artifacts used for documentation.

This document is structured as follows: Section II presents related work. Section III describes the proposed concept in detail. Section IV explains its practical implementation at Phoenix Contact. Section V outlines the conclusions and future work.

II. RELATED WORK

This section gives an overview of the known work on measuring software quantity. Before doing so, we explain the sliced V-model, as it is the process basis of the proposed approach.

A. Sliced V-Model

As mentioned before, there are good reasons for companies in the industrial sector to use the V-model. The V-model was originally defined by Boehm [3]. It is nowadays considered inflexible and rigid in its timing. In a recent work, the first author refined it into the so-called sliced V-model [4]. The sliced V-model maintains the V&V strength of the traditional V-model, but reduces the team’s Work in Progress (WiP), which is a major objective of agile methods. Furthermore, it reduces the effort for managing the documentation.

The documents in the sliced V-model are containers of work items of the document’s type. The work items in the different documents are connected via so-called “links” to form a “V” shape (Fig. 1). This shape is called a “V”-slice. Each link means that a work item is verified or validated by the linked work item. As the work items consist of a title, a meaningful description and other attributes, they represent the development and test documentation of a software project (an example is shown in Table II). The work item of the “module” type is directly linked to the source code revisions created to implement the requirement.

TABLE II. EXAMPLE: WORK ITEM OF THE “TEST” TYPE
ID                   ID45
Teststage            Acceptance
Assignee             Steve Tester
Title                Test: Profinet Optical Diagnostic
Description          Tests if optical diagnostic information is displayed correctly in PLC display
Precondition         Profinet network configured with two devices
Teststeps            1. Download PLC program “x.mwt”
                     2. Start-up PLC
                     3. Reduce optical power between devices dev1, dev2
Expected Reaction    Warning is displayed with location “dev2”
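To make the structure of work items and links more concrete, the test work item of Table II could be modeled as a small data structure. The following Python sketch is only an illustration: the class name `WorkItem`, its field names, and the requirement item `REQ-1` with its title are hypothetical and are not taken from the repository system used in the sliced V-model.

```python
from dataclasses import dataclass, field


@dataclass
class WorkItem:
    """Hypothetical model of one work item in a sliced V-model document."""
    item_id: str
    item_type: str                      # e.g. "requirement", "design", "module", "test"
    title: str
    description: str = ""
    attributes: dict = field(default_factory=dict)
    # IDs of the work items that verify or validate this one (the "links")
    links: list = field(default_factory=list)


# The test work item from Table II
test_item = WorkItem(
    item_id="ID45",
    item_type="test",
    title="Test: Profinet Optical Diagnostic",
    description="Tests if optical diagnostic information is displayed "
                "correctly in PLC display",
    attributes={"teststage": "Acceptance", "assignee": "Steve Tester"},
)

# Hypothetical requirement that is validated by the test above
requirement = WorkItem(
    item_id="REQ-1",
    item_type="requirement",
    title="Hypothetical requirement validated by ID45",
    links=[test_item.item_id],
)
```

A link is stored on the work item that is being verified or validated, which follows the reading that "a work item is verified or validated by the linked work item".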
Fig. 1. Structure of the sliced V-model

“V”-slices are independent of each other and are processed at an individual speed within a given release timeframe (Fig. 2). In this way they represent the team’s WiP. As in agile methods, the team decides when to start them. This means the team switches to a “pull” mode instead of the “push” mode known from the traditional V-model. Furthermore, the sliced V-model overcomes the known drawback of first having to create all documentation required for one phase before proceeding to the next one. In order to use the sliced V-model efficiently, a repository system is needed. The requirements for such a repository are stated in [4].

Fig. 2. Managing Work in Progress (WiP) with “V”-slices

B. Software Productivity

There are numerous articles, books and studies about productivity in software development (or, in short, software productivity). Petersen gives a comprehensive overview of studies on measuring and predicting software productivity [5]. Process improvement models such as the Capability Maturity Model Integration (CMMI) focus on continuous improvement of quality and efficiency [6]. Therefore, they require the measurement of software productivity at levels 4 and 5, but they leave the method open.

Cheiki et al. analyzed to what extent software productivity is already part of international standards [7]. They found that the standards ISO 9126 [8] and IEEE 1045 [9] contain information on software productivity. However, ISO/IEC 25010 [10] has replaced ISO 9126, and in it even the term “productivity” was substituted by “efficiency”. The IEEE 1045 standard has been withdrawn. Hence, the term software productivity is not defined in any standard today.

We follow the view of Sneed given in his so-called “devil’s square” [11]. As driving factors of software productivity he names quality and quantity as well as cost and duration (Fig. 3). According to Equation (1), quality and quantity are output parameters, whereas cost and duration are resource parameters. Sneed assumes that the overall productivity of a team is constant. Hence, the change of a single parameter will affect one or more of the other parameters. For example, if quality is increased, cost and/or duration will increase, too. Sneed’s assumption leads to the interpretation of the devil’s square that the overall software productivity can only be increased if all parameters are considered simultaneously.

C. Software Quantity

As shown by Sneed’s devil’s square, quantity is one element in defining software productivity. However, the quantity of software is difficult to determine. There are methods that evaluate quantity from either a functional view or a constructive view.

1) Functional Quantity Measurement: Functional quantity measurements base their counting on the function delivered (the functional size) by a software product. There are several methods. The most popular are the IFPUG FPA (Function Point Analysis) and the COSMIC Function Points (CFP), which are standardized in ISO/IEC 20926 [12] and ISO/IEC 19761 [13], respectively. Function point measurement was originally developed by Albrecht [1]. According to Albrecht, function points result from the number of inputs, inquiries, outputs, and master files delivered, which are weighted, summed up and adjusted for complexity. The counting of function points is typically done by trained function point experts.

The main advantage of the function point method is seen in its independence from the programming languages or technologies used. However, there are also some criticisms. In particular, the effort to introduce the method in an organization is very high, as the mentioned trained experts are needed. The calculated size is subject to the expertise of these experts [14], and the counting is difficult to mechanize [15]. The latter point has recently been addressed for COSMIC by Lind and Heldal [16] and Oriu et al. [17]. Both approaches require software models (e.g., in UML) as input for automatic size calculation. However, in practice these models often do not exist. This fact alone and the missing opportunity to include the documentation size make function points unsuitable for use in our environment.

2) Constructive Quantity Measurement: Constructive quantity measurements base their counting on source code analysis. The best-known method is to count the lines of code (LOC). This method is very pragmatic and is often used for generating productivity data in an automated way. For example, Ramasubbu et al. used the ratio KLOC/Person Hour for a multi-company analysis of project productivity [18]. Problems related to LOC have been discussed intensively in the past. For example, lines and statements should be carefully distinguished [14].
Fig. 3. Devil’s square by Sneed [11]

Park analyzed many other aspects of the source code by creating a framework for counting source statements [19]. This extremely comprehensive work focuses on code metrics such as physical, logical and data source statements. In a recent work, Rahman et al. found that so-called file-based process metrics deliver much better information than code metrics, at least for the prediction of defect-dense files [20]. Process metrics are, for example, the number of commits made to a file in a source code repository.

Mockus uses the size of changes in the source code for the prediction of software projects when the software-related tasks are organized by work items [21]. According to his work, the analysis of software changes has a number of advantages, such as obtaining very detailed information, since the delta in the source code is linked to a single work item. However, Mockus does not use the size of software changes to create quantity figures.

The information on the size of software code changes is known as “churn”. The use of this term for software was initiated by Munson and Elbaum [22]. Sjoberg defines it as follows: “Churn is defined as the sum of the number of lines added, deleted, and modified in the source code.” [23]. Sjoberg even used churn as a means to compare productivity when teams switch their software development process models. Churn has also been used in relation to software dependencies to predict post-release failures [24]. A possible format for quantifying churn is the so-called unified diff patch. It consists of sequences of diff hunks that follow a defined rule set to indicate changes [25]. New lines are marked with “+”, deleted lines with “-”, and modified lines are indicated as deleted and added. This allows precise figures on added, deleted or modified lines to be extracted. Fig. 4 shows an example of a unified diff patch. As the determination of churn can be automated from the source code repositories, it fulfills major requirements stated in Table I.

Fig. 4. Example of a unified diff patch

However, none of the mentioned constructive methods considers the size of the development documentation.

3) Software documentation size: The IEEE 610.12 standard states that software does not only consist of computer programs and procedures, but also of the associated documentation [26]. Muranko and Drechsler emphasize its importance for embedded systems and for the reuse of components [27]. Therefore, documentation is part of the development workflow and must be included in any software size calculation. However, we are not aware of any method that considers documentation in software size measurements.

D. Summary of related work

Measuring the quantity of software is one of the key tasks in establishing software productivity analysis. Today there are different methods in place to determine software quantity, functional and constructive ones. Constructive methods already fulfill important requirements of our environment, whereas functional methods are not suitable. Neither kind of method meets the requirement of counting the size of the documentation created during software development. Table III shows a summary of the fulfillment of our requirements.

TABLE III. FULFILLMENT OF REQUIREMENTS BY EXISTING METHODS
Requirement    Functional Methods    Constructive Methods
R1 R2 R3 R4 R5
+o-
++-o+
++

III. PROPOSED APPROACH

As explained, our aim is to measure the software quantity including the documentation size. We will use the known churn method for counting the size of source code changes. Our approach is based on the sliced V-model process described in Section II-A, and we will show that this process is very well suited for counting the size of software development documentation.

A. Conception

The sliced V-model establishes clear traceability between a “V”-slice and the source code changes belonging to that slice. These code changes are determined by measuring the churn size. It is calculated by analyzing the revisions in the source code repository that are linked to a work item of the “module” type and is called UPM. The term “revision” is known from the source code management system Subversion [28]. A revision contains the list of files that have been changed compared to the previous revision. Each file in a revision is compared with the same file in the previous revision by determining the unified diff patch between these two entities. As the unified diff patch has the data type string, its string length in kilobytes [kB] is used as the churn size. Details on the number of modified lines are omitted.
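The churn size of a single file, i.e., the length of the unified diff patch between two versions of that file, can be illustrated with a short Python sketch. It uses the standard library module `difflib` as a stand-in for the diff produced by the version control system, so the patch headers (and therefore the exact sizes) will differ from those of a Subversion diff. The function names, the UTF-8 encoding, and the convention 1 kB = 1024 bytes are assumptions made for this illustration.

```python
import difflib


def unified_patch(old_text: str, new_text: str, filename: str = "file") -> str:
    """Return the unified diff patch between two versions of one file."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
    )
    return "".join(diff)


def churn_size_kb(old_text: str, new_text: str, filename: str = "file") -> float:
    """Churn size of one file: length of the patch string in kB.

    Assumption: the patch is measured as UTF-8 bytes and 1 kB = 1024 bytes.
    """
    patch = unified_patch(old_text, new_text, filename)
    return len(patch.encode("utf-8")) / 1024


old = "a\nb\n"
new = "a\nc\n"
patch = unified_patch(old, new, "example.c")
size = churn_size_kb(old, new, "example.c")
```

Measuring the patch length rather than counting "+" and "-" lines matches the statement that details on the number of modified lines are omitted; an unchanged file produces an empty patch and therefore a churn size of zero.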
The size of the source code changes made to implement a single requirement, called UPR, is the sum of all source code changes between the files in the revisions linked to all corresponding work items of the “module” type within a “V”-slice. Equations (2) and (3) show how it is calculated.

$$\mathrm{UPM} = \sum_{i=1}^{k} \mathrm{UPFile}_i \tag{2}$$

$$\mathrm{UPR} = \sum_{i=1}^{l} \mathrm{UPM}_i \tag{3}$$

with:
UPR       Churn size of one “requirement” work item
UPM       Churn size of one “module” work item
UPFile    Churn size of a single file
k         Number of files in the revisions linked to one “module” work item
l         Number of all “module” work items in one “V”-slice
Fig. 5. Calculation UPSize for one “module” work item
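The two-level summation of Equations (2) and (3) can be written down in a few lines of Python. This is a minimal sketch: the function names `up_m` and `up_r` and the sample churn values are hypothetical and serve only to show how the file sizes are aggregated per “module” work item and then per requirement.

```python
def up_m(file_churn_kb):
    """Equation (2): churn size of one "module" work item.

    file_churn_kb holds the churn size (kB) of each of the k files in the
    revisions linked to that work item.
    """
    return sum(file_churn_kb)


def up_r(module_items):
    """Equation (3): churn size of one "requirement" work item.

    module_items holds, for each of the l "module" work items in the
    "V"-slice, the list of its per-file churn sizes.
    """
    return sum(up_m(files) for files in module_items)


# Hypothetical "V"-slice with two "module" work items
slice_modules = [
    [1.5, 0.5],   # module work item 1: two changed files
    [2.0],        # module work item 2: one changed file
]
requirement_churn = up_r(slice_modules)   # 4.0 kB for these sample values
```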
A special effect can arise if not only source code files are changed when working on the software, but also other files such as images (ico, jpg, etc.) or configuration files (ini, xml, etc.). If these files are part of the revisions, they are included in the number of modified files, but not in the calculation of the unified diff patch. Analogous to UPSize, the number of revisions created and the number of files changed in order to implement one requirement are available and are used as quantity figures. They are calculated as shown in Equations (4) and (5).

changes do not only include new or modified code, but also deleted ones.
Furthermore, in a sliced V-model there is clear traceability between the work items that belong together. Each work item of the “requirement” type is connected to at least one work item of the “design” type and one work item of the “test” type. Each work item of the “design” type is connected to at least one work item of the “module” type and one of the “test” type. Each work item of the “module” type is connected to at least one work item of the “test” type. This allows counting the number of work items needed to specify the implementation and the test of one single requirement. This number shall be called the documentation size RSize. It is calculated as shown in Equations (6) and (7).
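The idea of counting the work items that belong to one requirement can be sketched as a traversal of the links in a “V”-slice. The following Python example is an assumption-laden illustration and not the definition of RSize: the function name, the dictionary representation of the links, the work item IDs, and the choice to exclude the requirement itself from the count are all hypothetical. The binding definition is given by Equations (6) and (7).

```python
from collections import deque


def count_linked_work_items(links, requirement_id):
    """Count all work items reachable from one requirement via links.

    links maps a work item ID to the IDs of the work items linked to it.
    Assumption: the requirement itself is not counted.
    """
    seen = {requirement_id}
    queue = deque([requirement_id])
    while queue:
        current = queue.popleft()
        for linked in links.get(current, []):
            if linked not in seen:      # count every work item only once
                seen.add(linked)
                queue.append(linked)
    return len(seen) - 1


# Hypothetical minimal "V"-slice following the traceability rules above
links = {
    "REQ-1": ["DES-1", "TEST-1"],   # requirement -> design, test
    "DES-1": ["MOD-1", "TEST-2"],   # design -> module, test
    "MOD-1": ["TEST-3"],            # module -> test
}
linked_items = count_linked_work_items(links, "REQ-1")   # 5 for this slice
```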
$$\mathrm{RevR} = \sum_{i=1}^{l} \mathrm{RevM}_i \tag{4}$$

$$\mathrm{FileR} = \sum^{l} \tag{5}$$