Journal of Advances in Computational Intelligence Theory Volume 1 Issue 3

Development of an Enhanced Automated Software Complexity Measurement System

Sanusi B.A.1, Olabiyisi S.O.2, Afolabi A.O.3, Olowoye A.O.4
1,2,4Department of Computer Science, 3Department of Cyber Security, Ladoke Akintola University of Technology, Ogbomoso, Nigeria.
Corresponding Author E-mail Id:- [email protected]
[email protected], [email protected], [email protected]

ABSTRACT
Code complexity measures can simply be used to predict critical information about the reliability, testability, and maintainability of software systems from the automatic measurement of the source code. The existing automated code complexity measurement is performed using a commercially available code analysis tool called QA-C, which evaluates the code complexity of C programs, runs on Solaris and does not measure the defect rate of the source code. Therefore, this paper aimed at developing an enhanced automated system that evaluates the code complexity of C-family programming languages and computes the defect rate. The existing code-based complexity metrics, the Source Lines of Code metric, the McCabe Cyclomatic Complexity metric and the Halstead Complexity metrics, were studied and implemented so as to extend the existing schemes. The system was built following the procedure of the waterfall model, which involves gathering requirements, system design, development coding, testing, and maintenance. It was developed in the Visual Studio Integrated Development Environment (2019) using the C-Sharp (C#) programming language, the .NET framework and MYSQL Server for database design. The performance of the system was tested using a software testing technique known as Black-box testing to examine the functionality and quality of the system. The results of the evaluation showed that the system produced functionality of 100, 100, 75, 75, and 100%, and quality of 100, 100, 75, 75, and 100%, for source code written in the C++, C, Python, C# and JavaScript programming languages respectively. Hence, the tool helps software developers to view the quality of their code in terms of code metrics. Also, all data concerning the measured source code were documented and stored for maintenance and functionality in view of possible future development.

Keywords: Software Metrics, Code-based Complexity, Source Lines of Code Metric, McCabe Cyclomatic Complexity, Halstead Complexity.

INTRODUCTION
Computational complexity theory focuses on classifying computational problems according to their inherent difficulty and relating these classes to one another [15]. These problems are tasks solved by a system using problem-solving operations, mainly by a computer [15]. One of the functions of the theory is establishing the feasible limits of what a computer can execute.

In the software development lifecycle, one of the significant concerns is software complexity.


The source code complexity metrics are used to measure the quality of software properties. Software complexity, in other words, predicts analytical information about the reliability, testability, and maintainability of program source code from the automatic measurement of the code [12]. A complex source code will therefore be difficult for the software developer to develop and maintain. Recently, it has become important for software engineering to correctly predict the complexity of a program source code in order to save the millions of people maintaining it time and effort [16]. A software metric can simply be defined as a measure of the degree to which a software system or operation possesses some attribute. The terms metrics and measurements are often used interchangeably; however, metrics are said to be the basis, while measurements are the figures obtained by the application of metrics. According to the literature, quantitative measurements are vital in all sciences, and this leads to a ceaseless effort by computer science practitioners and theoreticians to bring a similar perspective to software development [10]. In this paper, the code-based complexity metrics Source Lines of Code (SLOC), McCabe Cyclomatic Complexity and Halstead Software Science were used to obtain objective, reproducible measurements that can be useful for quality assurance, performance, debugging, management, and defect rate. Software metrics have been accepted under the idea that before any source code can be measured or quantified, it needs to be translated into numbers. There are several areas where software complexity metrics are found to be useful. These areas include phases from software planning to the steps that are meant to improve the quality of certain software. However, the software system cannot perform on its own without human interaction. Therefore, the software complexity metric is also a measure of a person's relation to the software that he or she is handling [14].

RELATED WORK
In [8], Jetter conducted research on how to apply the Bansiya model to evaluate the development of 19 Azureus versions by making comparisons using object-oriented metrics rather than the metrics proposed by Bansiya and his colleagues [1]. The researcher demonstrated the ability of this model to track the evolution of design quality in several versions by providing access to important information on the internal life of the software. This information can support design decisions at higher levels of abstraction. His suggested model may require additional inputs to cover the highest levels of abstraction in order to assess all aspects of the quality model at this level [8].

Zhang and Baddoo [18] studied the performance of three complexity metrics. They selected McCabe's Cyclomatic Complexity, Halstead's Complexity, and Douce's Spatial Complexity. Their experiment is established on four hypotheses using a dataset from Eclipse JDT, which is an open-source application. Following the results, they concluded that the three complexity metrics show different performance results during their hypotheses testing, and finally they recommended combining the Cyclomatic Complexity metric and the Halstead metrics for better decisions on software complexity.

Another study by Panovski [13] established a new assessment of software product quality, which focused on assessing the quality of the external features of the software product, that is, evaluating the behavior of the software product when implemented. In addition, the study focused on the development of the quality model (ISO/IEC 9126) at the level of software metrics.


The study is based on seven samples of the software product and assesses them using the ISO/IEC 9126-2 quality model. In the research, Panovski [13] concluded that external product quality attributes are an area or category that can be used, and that the metrics provided by ISO/IEC 9126-2 can be regarded as a starting point for the definition of standards, but are not ready to use in their present form. The software metrics of the software product need to be more adapted to give better information [13].

Borchert [3] examined the technique of code profiling by using a static analysis. The study was done on 19 industrial samples and 37 samples of students' programs. He analyzed the software samples through software metrics. The results of this study recommended that the code pattern could be a useful method for rapid program comparisons and quality observation in the field of industrial application and education.

Jay et al. [7] compared two metrics to calculate the complexity of code. They used the McCabe Cyclomatic Complexity metric and the Lines of Code metric (LOC) to prove the stable linear relationship between these two techniques. They used five NASA datasets with different programming languages, such as C, C++, and Java, as the dataset for their research.

Bhatti [2] explored the research area occupied by software complexity metrics. He made use of a publicly available QA-C tool to automatically measure software metrics on code written in the C programming language, expressing the association between software metrics and the complexity of the source code. He also attempted to evaluate the values of these metrics graphically only, without considering the quality features and threshold limits relationship [2].

Another work in [17] is on the impact of code complexity and usability, either in monitoring software complexity during development, or in evaluating the complexity of a legacy software system. The researchers of this study, Widheden and Goran [17], suggested a new coupling metric (Ecoup), and introduced the Java met tool, which works by a static analysis of programs written in the Java programming language with respect to coupling, flow control, complexity and coherence. In the same year, Chandra et al. suggested the use of the Object Oriented metrics introduced by Chidamber and Kemerer [5] to assess program quality at the class level. The suggested tool can be used to verify that the class design conforms to the design specifications of Object Oriented programming, through using the threshold for each metric [4].

Silva et al. [16] worked on the relevance of three software complexity metrics: McCabe's Cyclomatic complexity, Halstead's complexity, and Shao and Wang's cognitive functional size. In the research work, 10 different programs of the same Java programming language were used to evaluate which one of the three complexity metrics is most suitable for software manufacturing. In order to manually calculate the ranking of the ten programs' complexity based on the three metrics, a quota sampling method was applied by selecting five different companies and randomly asking six programmers from each company to rank the complexity of the ten programs.

Hence, this research studied and investigated the code-based complexity metrics, Source Lines of Code (SLOC), Halstead, and McCabe Cyclomatic Complexity metrics, and designed a system that automatically measures the software complexity of C-family programming languages as well as computes the defect rate.


METHODOLOGY
The waterfall model of software engineering was adopted in this paper. It is a linear sequential procedure, an organized process that flows downwards and makes sure that every stage is followed carefully before moving on to the following step. The steps taken are hereby highlighted:
1. Requirement Engineering: This phase involved the gathering of required information online from different websites and eBooks, and the reviewing of existing software metrics applications.
2. System Design: The requirements gathered during phase one were the primary directing tool used in designing the system, which measures the complexity as well as computes the defect rate of a code fragment.
3. System Coding: This phase deals with the programming or development of the designed tool and the creation of the database, using the C# programming language, the .NET framework, and MYSQL Server for database design.
4. System Testing: The system was tested by making sure that no invalid data had been entered in forms, in order not to disorganize the system; all pages were duly tested and the system was tested against exceptions and how they are handled. Black-box testing was adopted in this research since it focuses on the functionality of the system.
5. Maintenance: The system maintenance was done at different phases during the development stage of the system, while the code was maintained whenever new information was gathered with respect to the objects used in the system.

SELECTION OF METRICS
In this paper, the goal is to examine and implement a way in which automated measurement of source code complexity can be performed. One of the objectives of this paper is to find a suitable metric or a group of metrics that can indicate the complexity of a system. Therefore, some software code metrics have been studied.

Source Line of Code (SLOC)
This metric is used to measure the quantitative characteristics of program source code. It is based on counting the lines of the source code. The limitations of SLOC may give the impression of a less useful metric, but wise use of this metric can still be valuable. Specific to the requirements of this paper, SLOC can simply be used to monitor change, in terms of the lines of code metric, between different build versions of the software.

This software metric can be calculated for individual functions of a program. For instance, if a function is too large in comparison with the average length per function, it may indicate that the function is hard to maintain and hence complex in terms of maintainability. The unit of this metric is a simple number that indicates the number of lines in a source code file. This number can simply be used with other code metrics to derive a new complexity indicator that is more comprehensive. LOC is usually represented as:
1. kLOC: thousand lines of code
2. mLOC: million lines of code

Halstead Software Science
A suite of metrics was introduced by Halstead in 1977 [6]. This suite of metrics is known as Halstead software science or as the Halstead metrics. Most product metrics typically apply to only one particular aspect of a software product. In contrast, the Halstead set of metrics applies to several aspects of a program, as well as to overall production effort.


The Halstead metrics are established on the following indices:
1. n1 – distinct number of operators in a program
2. n2 – distinct number of operands in a program
3. N1 – total number of operators in a program
4. N2 – total number of operands in a program

Halstead Formulas
On the basis of the above-mentioned indices (n1, n2, N1, N2), Halstead derived several formulas relating to the properties of program code. These formulas measure Program Vocabulary (n), Program Length (N), Program Volume (V), Program Difficulty (D), Program Level (L), Total Effort (E), Development Time (T) and Number of Delivered Bugs (B). Halstead named his formulas "Halstead's Software Science Metrics".

Program vocabulary: n = n1 + n2 (1)
Program length: N = N1 + N2 (2)
Program volume: V = N × log2(n) (3)
Program difficulty: D = (n1/2) × (N2/n2) (4)
Program level: L = 1/D (5)
Total effort: E = D × V (6)
Development time: T = E/S (7)
Delivered bugs: B = E^(2/3)/3000 (8)

The value of S is usually taken as 18 for these calculations. Halstead's delivered bugs (B) is an estimate of the number of errors in the implementation.

McCabe Cyclomatic Complexity
McCabe [11] established a metric based on the control flow structure of a program. This metric is known as the McCabe cyclomatic complexity metric and it has been a famous code complexity metric since it was first introduced. The McCabe cyclomatic complexity metric is based on measuring the linearly independent paths through a program and gives the cyclomatic complexity of the program, which is represented by a single number.

McCabe noted that a program consists of code chunks that execute according to decision and control statements, e.g. if-else and loop statements. The McCabe metric ignores the size of individual code chunks when calculating the code complexity but counts the number of decision and control statements. The McCabe cyclomatic complexity method maps a program to a directed, connected graph. The nodes of the graph represent decision or control statements. The edges indicate control paths that define the program flow. Cyclomatic complexity is calculated as:

M = E − N + 2P (9)

where
M = the McCabe metric,
E = the number of edges of the graph of a program,
N = the number of nodes of the graph, and
P = the number of connected components, which can also be considered as the number of exits from the program logic.

The recommended ranges are shown in Table 1.

Table 1:- McCabe Cyclomatic Complexity Ranges
Cyclomatic Complexity | Code Complexity
1 – 10 | A simple program, without much risk
11 – 20 | More complex, moderate risk
21 – 50 | Complex, high risk
50+ | Untestable, very high risk
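To make formulas (1)–(9) and the ranges in Table 1 concrete, the following minimal C# sketch (illustrative only; the class and method names are not taken from the developed tool) computes the Halstead measures from the four basic counts, the cyclomatic complexity from the control-flow graph counts, and the risk category of Table 1.

```csharp
using System;

// Illustrative implementation of formulas (1)-(9); not the developed tool itself.
public static class ComplexityFormulas
{
    // Halstead measures from n1, n2 (distinct operators/operands)
    // and N1, N2 (total operators/operands).
    public static void Halstead(int n1, int n2, int N1, int N2)
    {
        double n = n1 + n2;                        // (1) program vocabulary
        double N = N1 + N2;                        // (2) program length
        double V = N * Math.Log(n, 2);             // (3) program volume
        double D = (n1 / 2.0) * ((double)N2 / n2); // (4) program difficulty
        double L = 1.0 / D;                        // (5) program level
        double E = D * V;                          // (6) total effort
        double T = E / 18.0;                       // (7) development time in seconds (S = 18)
        double B = Math.Pow(E, 2.0 / 3.0) / 3000;  // (8) delivered bugs estimate

        Console.WriteLine($"n={n}, N={N}, V={V:F3}, D={D:F3}, L={L:F3}, E={E:F3}, T={T:F3}s, B={B:F3}");
    }

    // (9) McCabe cyclomatic complexity: M = E - N + 2P.
    public static int Cyclomatic(int edges, int nodes, int components) =>
        edges - nodes + 2 * components;

    // Risk classification following the ranges in Table 1.
    public static string Risk(int m) =>
        m <= 10 ? "A simple program, without much risk"
        : m <= 20 ? "More complex, moderate risk"
        : m <= 50 ? "Complex, high risk"
        : "Untestable, very high risk";
}
```

For example, Cyclomatic(9, 7, 1) evaluates to 4, which Risk classifies as a simple program without much risk.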

Requirement Analysis
The system was developed with the goal of automated measurement of software complexity and of increasing software reliability by measuring its defect rate. A defect rate is the percentage of output that fails to meet a quality target, and it is calculated by testing output for non-compliance to a quality target [9].


The following formula can be used to calculate the defect rate:

Defect rate = (number of defects / total units of output) × 100% (10)

Code defects are commonly measured as defects per one thousand lines of code, and can be calculated with the following formula:

Defects per kLOC = (number of defects / total lines of code) × 1000 (11)
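Both calculations are straightforward; the sketch below is a minimal C# illustration under the assumption that formulas (10) and (11) take the standard forms described above (the method names are illustrative and not part of the developed tool).

```csharp
// Illustrative sketch of formulas (10) and (11); method names are not from the tool.
public static class DefectMetrics
{
    // (10) Defect rate as the percentage of output failing the quality target.
    public static double DefectRate(double defects, double unitsOfOutput) =>
        (defects / unitsOfOutput) * 100.0;

    // (11) Defects per one thousand lines of code (kLOC).
    public static double DefectsPerKloc(double defects, double linesOfCode) =>
        (defects / linesOfCode) * 1000.0;
}
```

For instance, 2 defects found in 4,000 lines of code give DefectsPerKloc(2, 4000) = 0.5 defects per kLOC.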

The requirements needed for every analysis carried out by the software being developed take as their main input a source project or source code in C# or another C-family programming language, on which the software application runs its metrics and calculations with respect to the source code supplied. Figure 1 represents the block diagram for the developed system to work effectively and carry out a particular task. In this paper, a net satisfaction index ranging from 0 to 100% was discussed and agreed with the software developers in the underlying test. The net satisfaction index rating scale maps ratings between completely dissatisfied and completely satisfied to numbers between 0 and 100%, as seen in Table 2.

[Block diagram components: Start, Software Source Code / Project, Metric Software, User Interface, Results and Analysis, Stop]

Fig.1:-The Block diagram of software metric application

Table 2:-The Net Satisfaction Index Ranges
Net Satisfaction Index Rating | Remark
100% | Completely Satisfied
75% | Satisfied
50% | Neutral
25% | Dissatisfied
0% | Completely Dissatisfied

RESULTS AND DISCUSSION
The selection of the measured source code was made after consulting with some software developers. Each software developer has a different view about their software package before the complexity measurement. As stated, the code-based metrics Source Lines of Code (SLOC), Halstead Software Science and McCabe Cyclomatic Complexity are used during the development of this tool. The implementation of each complexity metric is discussed in the sub-sections below. Table 3 presents the complexity values of the measured source code.

Source Line of Code Metric (SLOC)
In evaluating the Source Line of Code metric (SLOC), this tool automatically counts and calculates the number of Blank Lines of Code (BLOC), Comment Lines of Code (CLOC) and Source Lines of Code (SLOC), where SLOC includes the actual code (logic and computation) and declarations in the source code, while BLOC and CLOC improve readability and help in understanding the code during maintenance. The SLOC metric is used to calculate the quantitative attributes of the program source code and gives the software size estimations.
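A simplified sketch of this line classification is shown below (an illustration only, assuming C-style // and /* */ comments; the developed tool handles the full C-family syntax).

```csharp
// Simplified BLOC/CLOC/SLOC counter for C-style sources (illustration only).
public static class LineCounter
{
    public static (int Bloc, int Cloc, int Sloc) Count(string source)
    {
        int bloc = 0, cloc = 0, sloc = 0;
        bool inBlockComment = false;

        foreach (var raw in source.Split('\n'))
        {
            var line = raw.Trim();
            if (inBlockComment)
            {
                cloc++;                                   // still inside a /* ... */ block
                if (line.Contains("*/")) inBlockComment = false;
            }
            else if (line.Length == 0)
            {
                bloc++;                                   // blank line
            }
            else if (line.StartsWith("//"))
            {
                cloc++;                                   // single-line comment
            }
            else if (line.StartsWith("/*"))
            {
                cloc++;                                   // block comment start
                inBlockComment = !line.Contains("*/");
            }
            else
            {
                sloc++;                                   // actual code or declaration
            }
        }
        return (bloc, cloc, sloc);
    }
}
```

The total LOC reported in Table 3 is then the sum of the three counters.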


However, in increasing the software package reliability, the defect rate was measured with the calculated number of delivered bugs and the lines of code.

Halstead Software Science Metric
In evaluating the Halstead set of metrics, this tool counts the distinct number of operators, the distinct number of operands, the total number of operands, and the total number of operators in a program source code. Considering a C program, examples of unique operators are main, ( ), { }, int, scanf, &, =, +, /, printf and so on. Also, examples of operands are a, b, c, 3, avg, %d among others. The tool searches for all the distinct operands and operators in the software package, counts how many times each distinct operator or operand appears in the source code, then calculates the totals and gives the output as the total number of operands and operators. Once these results have been computed, the values are used to calculate the Halstead Complexity metrics as stated in each formula discussed in the previous section. The accuracy of the tool is determined by evaluating every possible place in the software package that could have a predefined operator.

However, the SLOC metric is typically applied to only one particular aspect of the source code. In contrast, the Halstead set of metrics applies to several aspects of a program source code, as it is good to have more than one metric for the quantitative measure of a program and not to fully rely on one. The Halstead set of metrics is, however, difficult to compute manually, as it is not easy to count all the operands and operators if a program uses a large number of them. Hence, in this research, automatic measurement of source code is achieved by developing a tool that solves the limitations of the existing tool.

McCabe Cyclomatic Complexity
McCabe is a measure of the complexity of a module's decision structure. In evaluating the McCabe Cyclomatic Complexity metric, this developed tool looks for IF, FOR, WHILE and CASE statements in the source code and counts them to calculate the McCabe complexity value. It counts the number of decision or control structures and maps a program to a directed, connected graph. The nodes of the graph are regarded as decision or control statements and the edges indicate the control paths that define the program flow. Nevertheless, because no organization or software company will want to release their source code, and because of licence time restrictions, two senior software programmers were consulted. The source code selected for performing the code analysis is Python and C#.

The software developer's opinion about Python was that the package is well structured and designed and should therefore contain less complex code. In other words, the C# software package was supposed to be more complex, as said by the software developer. Hence, the McCabe Cyclomatic metric used in this tool allows a fast assessment, and from the calculated results the Python software package has less complex code while the C# software package has more complex code, which agrees with the software developer's opinion. Figure 2 represents the calculated result of the Python software package and Figure 3 shows the calculated result of the C# software package.

Evaluation of the Developed Tool
This developed tool has made it possible for software testers to indicate which part of the code needs refactoring, and it stores the generated report in the database so as to monitor the code changes between releases. Also, this tool has the functionality to check up to 500,000 lines of code.
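Returning to the McCabe evaluation, the decision-statement counting described above can be illustrated with the short sketch below (a simplified approximation, not the tool's implementation: it counts if, for, while and case keywords with a regular expression and treats the complexity of a code fragment as the number of decision points plus one).

```csharp
using System.Text.RegularExpressions;

// Simplified decision-statement counter (an illustration of the approach, not the tool itself).
public static class DecisionCounter
{
    // Match if / for / while / case as whole words in C-family source text.
    private static readonly Regex Decisions =
        new Regex(@"\b(if|for|while|case)\b", RegexOptions.Compiled);

    // Approximate cyclomatic complexity: decision points plus the single straight-line path.
    public static int ApproximateComplexity(string sourceCode) =>
        Decisions.Matches(sourceCode).Count + 1;
}
```

For example, a fragment containing a single if/else statement yields an approximate complexity of 2, one decision point plus the straight-line path.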


However, the black-box testing method was used to test the tool, based on requirements and functionality, by the software developers, because it is a software testing method that evaluates the functionality of an application without looking into its internal structure or workings; that is, in this testing method the structure and design of the source code are not disclosed to the software tester, and software developers and end-users perform this test on the software. The software programmers were chosen in the software testing phase and the test was conducted on the system. Hence, this black-box testing method attempts to find errors in the functionality and quality of the developed system, as represented in Figure 4. The functionality test is performed by the end-users providing the input and checking that the output equals the desired results, while the quality is the level to which the system meets specified requirements. The overall average for functionality and quality of the developed system comes to 90% and 90%, respectively.

Table 3:-Complexity Values of the Measured Source Code
Metric | Attributes | C++ | C | Python | C# | JavaScript
Source Lines of Code Metric | BLOC | 4 | 314 | 291 | 245 | 316
Source Lines of Code Metric | CLOC | 0 | 315 | 210 | 156 | 293
Source Lines of Code Metric | SLOC | 82 | 2005 | 1807 | 1162 | 1828
Source Lines of Code Metric | Total LOC | 86 | 2634 | 2308 | 1563 | 2437
Halstead Metric | Program Length (N) | 135 | 348 | 15881 | 11714 | 290
Halstead Metric | Vocabulary (n) | 59 | 69 | 1411 | 706 | 49
Halstead Metric | Program Volume (V) | 794.157 | 2125.767 | 166154.999 | 110855.725 | 1628.266
Halstead Metric | Difficulty (D) | 28.517 | 56.649 | 256.429 | 501.995 | 63.542
Halstead Metric | Program Level (L) | 0.035 | 0.018 | 0.004 | 0.002 | 0.016
Halstead Metric | Total Effort (E) | 22646.975 | 120422.575 | 42606960.240 | 55645915.710 | 103463.278
Halstead Metric | Program Time (T) | 1258.165 Sec. | 6690.143 Sec. | 2367053.347 Sec. | 3091439.762 Sec. | 5747.960 Sec.
Halstead Metric | Delivered Bugs (B) | 0.268 | 0.816 | 40.902 | 48.875 | 0.738
Cyclomatic Complexity | No. of Methods | 3 | 2 | 98 | 62 | 2
Cyclomatic Complexity | Defect rate | 0.327 | 1.636 | 73.910 | 56.793 | 1.349
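As a brief worked check of the Halstead formulas against Table 3, the C++ column is internally consistent with formulas (3), (6), (7) and (8): V = 135 × log2(59) ≈ 794.16, E = D × V = 28.517 × 794.157 ≈ 22646.98, T = E/18 ≈ 1258.17 seconds, and B = E^(2/3)/3000 ≈ 0.27 delivered bugs, which agree with the tabulated values.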

Fig.2:-Desktop Interface Showing Python Software Package Result


Fig.3:-Desktop Interface Showing C# Software Package Result

[Bar chart: Functionality and Quality percentages (0–120% axis) for source code in C++, C, Python, C# and JavaScript]

Fig.4:-Summary of Black-box Testing for each Software Developer

CONCLUSION
As the demand for software grows at an exponential rate, the complexity of software is also increasing. The source code complexity measure seems to be the most capable measure for both the quantitative and control-flow properties of a software project. In other words, understanding and measuring the quality and functionality of software is significant in order to relate it to measurable quantities. Furthermore, establishing a reason for gathering metrics is essential to ensure that relevant metrics are gathered, because too many metrics become confusing if they are not required.


In this paper, the code-based metrics Source Lines of Code (SLOC), McCabe Cyclomatic Complexity and Halstead Complexity Metrics were studied and used to achieve the goal of developing an automatic software complexity measurement tool. The metrics provide visibility and control for the complex software development procedure and, therefore, they are a valuable tool for providing guidance on making the software development process better as well as meeting organizational goals to enhance software productivity and quality.

This tool was successfully developed and tested to run on the Windows operating system. Software testing is a significant stage of the software development life-cycle; in other words, it is necessary to optimize test cases and generate them automatically so as to reduce testing time, effort and cost. Hence, the black-box testing method was used in this research because it focuses on evaluating the functionality of the system.

REFERENCES
1. Bansiya, J. (2000). A Hierarchical Model for Object-Oriented Design Quality Assessment. IEEE Transactions on Software Engineering. 2000. 28(1):4-17p.
2. Bhatti, H. R. (2010). An Automatic Measurement of Source Code Complexity. Master's Thesis, Department of Computer Science, Electrical and Space Engineering, Lulea University of Technology. 2010:1-14p.
3. Borchert, T. (2008). Code Profiling: Static Code Analysis. Master's Thesis, Department of Computer Science, Karlstad University, Sweden. 2008:12-16p.
4. Chandra, E., Linda, P. and Edith, A. (2010). Class Break Point Determination Using CK Metrics Thresholds. Global Journal of Computer Science and Technology. 2010. 10(14):73-77p.
5. Chidamber, S. R. and Kemerer, C. F. (1994). A metrics suite for the object-oriented design. IEEE Transactions on Software Engineering. 1994. 20(6):476-498p.
6. Halstead, M. (1977). The Elements of Software Science. Operating and Programming Systems Series, Elsevier Computer Science Library, North Holland, N.Y. Elsevier North-Holland, Inc. ISBN 0-444-00205-7. 1977:1–6p.
7. Jay, G., Hale, J. E., Smith, R. K., Hale, D., Kraft, N. A. and Ward, C. (2009). Cyclomatic complexity metric and lines of code: Empirical evidence of a stable linear relationship. J. Software Engineering & Applications. 2009:7p.
8. Jetter, A. (2006). Assessing Software Quality Attributes with Source Code Metrics. Diploma Thesis, Department of Informatics, University of Zurich. 2006:45-48p.
9. John Spacey (2017). How Defect Rate is Calculated. https://simplicable.com/new/defect-rate.
10. Lincke, R., Lundberg, J. and Löwe, W. (2008). Comparing software metrics tools. In Proceedings of the 2008 International Symposium on Software Testing and Analysis, ISSTA'08. 131–142p. New York, NY: ACM.
11. McCabe, T. J. (1976). A Software Complexity Measure. IEEE Transactions on Software Engineering. 1976. SE-2(4):308-320p.
12. Olabiyisi, S. O., Omidiora, E. O. and Sotonwa, K. A. (2013). The Comparative Analysis of Software Complexity of Searching Algorithms Using Code Based Metrics. International Journal of Scientific and Engineering Research. ISSN: 2229-5518. 2013. 4(6):1-2p.
13. Panovski, G. (2008). Product Software Quality. Master's Thesis, Department of Mathematics and Computing Science, Technische Universiteit Eindhoven. 2008:1-24p.


14. Sam, M. (2008). Areas of Use for Software Metrics. [Online: accessed on 2010-06-16 from http://www.articlesbase.com/management-articles/areas-of-use-for-software-metrics306127.html]
15. Sanjeev, A. and Barak, B. (2009). Computational Complexity: A Modern Approach. Cambridge. ISBN 978-0-521-42426-4. Zbl 1193.68112. 2009:1-8p.
16. Silva, D., Koadagoda, N. and Perera, H. (2012). Applicability of three complexity metrics. In Proc. International Conference on Advances in ICT for Emerging Regions. 2012:12-16p.
17. Widheden, K. and Göran, J. (2010). Software Complexity: Measures and Measuring for Dependable Computer Systems. Master's Thesis, Department of Computer Science and Engineering, University of Gothenburg, Goteborg, Sweden. 2010:1-14p.
18. Zhang, M. and Baddoo, N. (2007). Performance comparison of software complexity metrics in an open source project. In Proc. 14th European Conference Software Process Improvement, Potsdam, Germany. 2007:1–16p.
