
COMPUTING PRACTICES

Using Metrics to Evaluate Software System Maintainability

Don Coleman and Dan Ash, Hewlett-Packard
Bruce Lowther, Micron Semiconductor
Paul Oman, University of Idaho

In this month's Computing Practices we offer a sneak preview of Computer's September issue on software metrics. Software metrics have been much criticized in the last few years, sometimes justly but more often unjustly, because critics misunderstand the intent behind the technology. Software complexity metrics, for example, rarely measure the "inherent complexity" embedded in software systems, but they do a very good job of comparing the relative complexity of one portion of a system with another. In essence, they are good modeling tools. Whether they are also good measuring tools depends on how consistently and appropriately they are applied. The two articles showcased here suggest ways of applying such metrics.

Our first article, by Don Coleman et al., sets forth maintainability metrics for gauging the effect of maintenance changes in software systems, rank ordering subsystem complexity, and comparing the "quality" of two different systems.

The second article, by Norman Schneidewind, describes an approach to validating metrics for large-scale projects such as the space shuttle flight software. The proposed metrics isolate specific quality factors that let us predict and control software quality.

Please feel free to contact me directly about articles you liked, didn't like, or would like to see in this section ([email protected]).
-Paul Oman

With the maturation of software development practices, software maintainability has become one of the most important concerns of the software industry. In his classic book on software engineering, Fred Brooks[1] claimed, "The total cost of maintaining a widely used program is typically 40 percent or more of the cost of developing it." Parikh[2] had a more pessimistic view, claiming that 45 to 60 percent is spent on maintenance. More recently, two recognized experts, Corbi[3] and Yourdon,[4] claimed that software maintainability is one of the major challenges for the 1990s.

These statements were validated recently by Dean Morton, executive vice president and chief operating officer of Hewlett-Packard, who gave the keynote address at the 1992 Hewlett-Packard Software Engineering Productivity Conference. Morton stated that Hewlett-Packard (HP) currently has between 40 and 50 million lines of code under maintenance and that 60 to 80 percent of research and development personnel are involved in maintenance activities. He went on to say that 40 to 60 percent of the cost of production is now maintenance expense.

The intent of this article is to demonstrate how automated software maintainability analysis can be used to guide software-related decision making. We have applied metrics-based software maintainability models to 11 industrial software systems and used the results for fact-finding and process-selection decisions. The results indicate that automated maintainability assessment can be used to support buy-versus-build decisions, pre- and post-reengineering analysis, subcomponent quality analysis, test resource allocation, and the prediction and targeting of defect-prone subcomponents.

Further, the analyses can be conducted at various levels of granularity. At the component level, we can use these models to monitor changes to the system as they occur and to predict fault-prone components. At the file level, we can use them to identify subsystems that are not well organized and should be targeted for perfective maintenance. The results can also be used to determine when a system should be reengineered. Finally, we can use these models to compare whole systems. Comparing a known-quality system to a third-party system can provide a basis for deciding whether to purchase the third-party system or develop a similar system internally.


Recent studies in metrics for software maintainability and quality assessment have demonstrated that the software's characteristics, history, and associated environment(s) are all useful in measuring the quality and maintainability of that software.[5-7] Hence, measurement of these characteristics can be incorporated into software maintainability assessment models, which can then be applied to evaluate industrial software systems. Successful models should identify and measure what most practitioners view as important components of software maintainability.

A comparison of five models

We recently analyzed five methods for quantifying software maintainability from software metrics. The definition, derivation, and validation of these five methods have been documented elsewhere.[7] Only a synopsis of the five methods is presented here:

• Hierarchical multidimensional assessment models view software maintainability as a hierarchical structure of the source code's attributes.[6]
• Polynomial regression models use regression analysis as a tool to explore the relationship between software maintainability and software metrics.[8]
• An aggregate complexity measure gauges software maintainability as a function of entropy.[5]
• Principal components analysis is a statistical technique to reduce collinearity between commonly used complexity metrics in order to identify and reduce the number of components used to construct regression models.[5]
• Factor analysis is another statistical technique wherein metrics are orthogonalized into unobservable underlying factors, which are then used to model system maintainability.[5]

Tests of the models indicate that all five compute reasonably accurate maintainability scores from calculations based on simple (existing) metrics. All five models and the validation data were presented to HP Corporate Engineering managers in the spring and summer of 1993. At that time it was decided that the hierarchical multidimensional assessment and the polynomial regression models would be pursued as simple mechanisms for maintainability assessment that could be used by maintenance engineers in a variety of locations. HP wanted quick, easy-to-calculate indices that "line" engineers could use at their desks. The following subsections explain how these methods were applied to industrial systems.

HPMAS: A hierarchical multidimensional assessment model. HPMAS is HP's software maintainability assessment system based on a hierarchical organization of a set of software metrics. For this particular type of maintainability problem, Oman and Hagemeister[6] have suggested a hierarchical model dividing maintainability into three underlying dimensions or attributes:

(1) The control structure, which includes characteristics pertaining to the way the program or system is decomposed into algorithms.
(2) The information structure, which includes characteristics pertaining to the choice and use of data structure and dataflow techniques.
(3) Typography, naming, and commenting, which includes characteristics pertaining to the typographic layout and the naming and commenting of code.

We can easily define or identify separate metrics that can measure each dimension's characteristics. Once the metrics have been defined and/or identified, an "index of maintainability" for each dimension can be defined as a function of those metrics. Finally, the three dimension scores can be combined into a total maintainability index for the system. For our work, we used existing metrics to calculate a deviation from acceptable ranges and then used the inverse of that deviation as an index of quality.

Most metrics have an optimum range of values within which the software is more easily maintained. A method called weight and trigger-point-range analysis is used to quantify maintainability by calculating a "degree of fit" from a table of acceptable metric ranges. When a metric value falls outside the optimum range, it indicates that maintainability is lower; hence, there is a deviation (or penalty) on the component's contribution to maintainability. The optimum range value, called the trigger point range, reflects the "goodness" of the program style. For example, if the acceptable range for average lines of code (aveLOC) is between 5 and 75, values falling below 5 and above 75 serve as the trigger points for what would be classified as poor style. If the measured average lines of code value lies within the acceptable range, there is no penalty. If the metric value falls outside the trigger point range but is close to the bounds (trigger points), we apply a proportional deviation, which can run up to 100 percent (the maximum penalty). The weighted deviation is computed by multiplying the calculated deviation by a weight value between zero and one, inclusive.

The metric attributes are combined based on the assumption that dimensional maintainability starts at 100 percent (highly maintainable); it is then reduced by the weighted deviation percentage of each metric in that dimension. The overall maintainability index is the product of the three dimension scores. Multiplying the three dimensions' maintainability gives a lower overall maintainability than averaging does, which underscores the fact that deviation in one aspect of maintainability will hinder other aspects of the maintenance effort, thus reducing maintainability of the entire system.

HPMAS was calibrated against HP engineers' subjective evaluation of 16 software systems, as measured by an abridged version of the AFOTEC (Air Force Operational Test and Evaluation Center) software quality assessment instrument.[9] HPMAS maintainability indices range from 0 to 100, with 100 representing excellent maintainability.
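To make the mechanics concrete, here is a minimal sketch (ours, not part of any HP tool) of a weight and trigger-point-range calculation. The article states only that the deviation is proportional near the trigger points, capped at 100 percent, and scaled by a weight; the ramp width and the averaging of penalties within a dimension are our assumptions, and all names are hypothetical.

```python
def weighted_deviation(value, low, high, weight, ramp=0.5):
    """Weighted penalty (0..1) for one metric against its trigger point range.

    low, high -- the acceptable range; its endpoints are the trigger points
    weight    -- importance of this metric, between zero and one inclusive
    ramp      -- fraction of the range width over which the penalty grows
                 to 100 percent past a trigger point (our assumption)
    """
    if low <= value <= high:
        return 0.0                      # inside the optimum range: no penalty
    overshoot = (low - value) if value < low else (value - high)
    deviation = min(overshoot / ((high - low) * ramp), 1.0)  # capped at 100%
    return weight * deviation

def dimension_index(weighted_deviations):
    """Dimension maintainability: start at 100 percent, subtract the
    (here, averaged) weighted deviations of the dimension's metrics."""
    return max(0.0, 100.0 * (1.0 - sum(weighted_deviations) / len(weighted_deviations)))

def overall_index(dimension_scores):
    """Overall HPMAS index: the product of the three dimension scores."""
    product = 1.0
    for score in dimension_scores:
        product *= score / 100.0
    return 100.0 * product

# Hypothetical single-metric example using the aveLOC range from the text:
# an average of 90 lines of code overshoots the 5..75 range.
penalty = weighted_deviation(90, 5, 75, weight=1.0)   # about 0.43 with ramp=0.5
print(overall_index([dimension_index([penalty]), 100.0, 100.0]))
```

Note how the multiplicative combination at the end penalizes a weak dimension more severely than averaging would, matching the rationale above.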

Polynomial assessment tools. Regression analysis is a statistical method for predicting values of one or more response (dependent) variables from a collection of predictor (independent) variables. For purposes of software maintainability assessment, we need to create a polynomial equation by which a system's maintainability is expressed as a function of the associated metric attributes. We have used this technique to develop a set of polynomial maintainability assessment models.[8] These models were developed as simple software maintainability assessment methods that could be calculated from existing metrics. Since these models were intended for use by maintenance practitioners "in the trenches," the models were again calibrated to HP engineers' subjective evaluation of the software as measured by the abridged version of the AFOTEC software quality assessment instrument.[9] That is, the independent variables used in our models were a host of complexity metrics, and the dependent variable was the (numeric) result of the abridged AFOTEC survey.

Approximately 50 regression models were constructed in an attempt to identify simple models that could be calculated from existing tools and still be generic enough to apply to a wide range of software systems. In spite of the current research trend away from the use of Halstead metrics, all tests clearly indicated that Halstead's volume and effort metrics were the best predictors of maintainability for the HP test data. The regression model that seemed most applicable was a four-metric polynomial based on Halstead's effort metric and on metrics measuring extended cyclomatic complexity, lines of code, and number of comments:

Maintainability = 171 − 3.42 × ln(aveE) − 0.23 × aveV(g') − 16.2 × ln(aveLOC) + aveCM

where aveE, aveV(g'), aveLOC, and aveCM are the average Halstead effort, average extended cyclomatic complexity V(g'), average lines of code, and average number of comments per submodule (function or procedure) in the software system.

Preliminary results indicated that this model was too sensitive to large numbers of comments. That is, large comment blocks, especially in small modules, unduly inflated the resulting maintainability indices. To rectify this, we replaced the aveCM component with percent comments (perCM), and a ceiling function was placed on the factor to limit its contribution to a maximum value of 50.[10] Also, because there has been much discussion of the nonmonotonicity of Halstead's effort metric (it is not a nondecreasing function under the concatenation operation), we reconstructed the model using Halstead's volume metric instead. Thus, the final four-metric polynomial now used in our work is

Maintainability = 171 − 5.2 × ln(aveVol) − 0.23 × aveV(g') − 16.2 × ln(aveLOC) + 50 × sin(√(2.46 × perCM))

This polynomial has been compared to the original model using the same validation data. The average residual between the effort-based model and the volume-based model is less than 1.4.
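The final polynomial is straightforward to compute once a metrics tool has produced the four averages. The sketch below implements it directly; the function name is ours. One caveat: the text gives perCM as percent comments but does not spell out whether it enters the formula as a percentage or a fraction, so the sketch assumes a fraction (0..1), which keeps the sine term on its rising arc; either way the term is bounded and contributes at most 50.

```python
import math

def maintainability_index(ave_vol, ave_vg, ave_loc, per_cm):
    """Final four-metric polynomial from the text.

    ave_vol -- average Halstead volume per submodule
    ave_vg  -- average extended cyclomatic complexity, aveV(g')
    ave_loc -- average lines of code per submodule
    per_cm  -- percent comments, passed here as a fraction (assumption)
    """
    return (171
            - 5.2 * math.log(ave_vol)    # natural log, not log10
            - 0.23 * ave_vg
            - 16.2 * math.log(ave_loc)
            + 50.0 * math.sin(math.sqrt(2.46 * per_cm)))

# Illustrative numbers only: average volume 1,000, V(g') of 8,
# 40 lines of code, and 20 percent comments.
print(round(maintainability_index(1000.0, 8.0, 40.0, 0.20), 1))  # about 105.8
```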
Di- lated from existing tools and still be mension maintainability is calculated as generic enough to apply to a wide range of software systems. In spite of the current research trend away from the use of Hal- Applying stead metrics, all tests clearly indicated the models to that Halstead‘s volume and effort metrics were the best predictors of maintainabil- industrial software ity for the HP test data. The regression A software maintainability model is The overall maintainability index is the model that seemed most applicable was a only useful if it can provide developers product of the three dimensions. Multi- four-metric polynomial based on Hal- and maintainers in an industrial setting plying the three dimensions’maintainabil- stead’s effort metric and on metrics mea- with more information about the system. ity gives a lower overall maintainability suring extended cyclomatic complexity, Hence, the data used to test and validate than averaging does, which underscores lines of code. and number of comments: our models consisted entirely of genuine the fact that deviation in one aspect of industrial systems provided by Hewlett- maintainability will hinder other aspects Maintainability = 171 Packard and Defense Department con- of the maintenance effort, thus reducing - 3.42 x In(aveE) tractors. The examples are presented maintainability of the entire system. - 0.23 x aveV(g’) here to show how these models can aid HPMAS was calibrated against HP en- - 16.2 x ln(aveL0C) + aveCM software maintainers in their decision gineers’ subjective evaluation of 16 soft- making. The data presented in the fol- ware systems, as measured by an where aveE, aveV(g’), aveLOC, and lowing subsections is real and unaltered, abridged version of the AFOTEC (Air aveCM are the average effort, extended except that proprietary information has Force Operational Test and Evaluation V(G), average lines of code, and number been removed. Center) software quality assessment in- of comments per submodule (function or ~trument.~HPMAS maintainability in- procedure) in the software system. Using HPMAS in a prelpostanalysis of dices range from 0 to 100, with 100 rep- Preliminary results indicated that this maintenance changes. Over several years resenting excellent maintainability. model was too sensitive to large numbers of , systems tend to


Table 1 contains an overall analysis of the changes made to the subsystem. The HPMAS maintainability index in Table 1 shows that the maintainability of the subsystem was essentially unchanged (a 0.4 percent increase) even though the perfective maintenance changes had actually increased the complexity of the system. Specifically, 149 lines of code, two modules, and 29 branches were added to the system. Although the maintenance engineer denied that functionality increased, a visual inspection of the source code revealed that increased error checking had, in fact, been added to the code. For example, the original version of module Function-F, shown in section 2 of Table 2, contained 12 error-screening checks, while the modified version contained 16 error checks. (Throughout this discussion, function names have been changed to protect Hewlett-Packard proprietary information.)

Table 1. Comparing pre- and post-test results shows how much maintenance modification changes a subsystem.

                               Pretest     Post-test   Percent change
Lines of code                  1,086.00    1,235.00    13.4
Number of modules                 13.00       15.00    15.4
Total V(g')                      226.00      255.00    12.8
HPMAS maintainability index       88.17       88.61     0.4

Table 2 contains a module-by-module comparison of the pre- and post-test maintainability indices for the subsystem. The table is divided into four sections to demonstrate the distribution of maintenance changes. The first section of the table contains the modules that were not modified during the maintenance task. The second section contains modules that were slightly modified but which retained their original module names. The third section contains modules that have been modified and renamed. (The modules in this section were matched by visually inspecting the post-test system to identify any reused comments, variables, or control flow from the pretest system.) The last section contains modules in the pretest system that could not be matched to any module in the post-test system. (Visual inspection of the code revealed that the post-test components contained reused code from the pretest system, but it could not be matched to any one post-test component.) Thus, the last section represents an area of the program where the subsystem was repartitioned, resulting in a new subsystem organization.

Table 2. Module-by-module comparison of pre- and postanalysis results.

Section   Pretest analysis         Post-test analysis       Percent change
          Name         Metric      Name          Metric
1         Function-A    93.83      Function-A     93.83        0.0
          Function-B    93.82      Function-B     93.82        0.0
          Function-C    92.96      Function-C     92.96        0.0
          Function-D    84.41      Function-D     84.41        0.0
2         Function-E    86.24      Function-E     89.00        3.2
          Function-F    65.58      Function-F     67.27        2.6
          Function-G    88.06      Function-G     85.83       -2.5
3         Function-H    78.41      Function-H'    83.05        5.9
          Function-I    72.85      Function-I'    63.15      -13.3
          Function-J    67.75      Function-J'    66.43       -1.9
          Function-K    68.83      Function-K'    66.67       -3.1
4         Function-L    80.68      --             --           --
          Function-M    78.78      --             --           --
          Function-N    85.08      --             --           --
          Function-O    80.75      --             --           --
          Function-P    79.68      --             --           --
          Function-Q    69.68      --             --           --

This type of postmaintenance analysis can provide the maintenance staff with a wealth of information about the target system. For example, section 1 of Table 2 consists of unchanged components with relatively high HPMAS maintainability scores. If these components remain unchanged over several maintenance modifications, they might be considered for a library. Components in the second section address the system goal but have not yet reached the refinement of those in the first section. Their HPMAS metrics are generally lower than those in the first section, and they have changed less than ±5 percent from the pre- to postanalysis.
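Mechanically, a pre/postanalysis like Table 2 reduces to comparing two maps of module indices. A minimal sketch, with module names and a few index values taken from Table 2; the function name is ours:

```python
def pre_post_report(pre, post):
    """Module-by-module comparison of pre- and post-test indices.

    pre, post -- dicts mapping module name to its maintainability index.
    Returns matched percent changes plus the unmatched names on either
    side (candidates for the renamed/repartitioned sections of Table 2).
    """
    changes = {}
    for name, before in pre.items():
        if name in post:
            changes[name] = round(100.0 * (post[name] - before) / before, 1)
    removed = sorted(set(pre) - set(post))   # pretest-only modules
    added = sorted(set(post) - set(pre))     # post-test-only modules
    return changes, removed, added

pre  = {"Function-A": 93.83, "Function-E": 86.24, "Function-L": 80.68}
post = {"Function-A": 93.83, "Function-E": 89.00, "Function-H'": 83.05}
print(pre_post_report(pre, post))
# -> ({'Function-A': 0.0, 'Function-E': 3.2}, ['Function-L'], ["Function-H'"])
```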

Authorized licensed use limited to: CALIF STATE UNIV NORTHRIDGE. Downloaded on April 26, 2009 at 01:38 from IEEE Xplore. Restrictions apply. 2oo f Highest MI 250 System A - - 150- - System 0 - 8 150 g 100- - 3 50- c g 0- Eo li! -50- -1 00 Eli:: 2 -100-150-50 2Lowest MI -+

The last two sections contain the most extensive changes to the subsystem. Components in these two sections represent a large burden to the maintainer, essentially representing a repartition of the problem. This is evidenced by the renaming of components, lower HPMAS metric values, and unmatchable pre- and post-test components. The maintenance engineer renamed all of the components in section 3 (presumably because he thought the original names did not adequately describe them) and substantively changed their functionality. Section 4 contains old components that could not be matched to components in the new system. They represent the largest burden to the maintenance effort because (1) the new components are untested, (2) the structure of the system has changed, requiring all documentation and diagrams for this system to be updated, and (3) all maintainers who were familiar with the pretest system are unfamiliar with the post-test system.

Using polynomials to rank-order module maintainability. To detect differences in subsystem maintainability, the four-metric polynomial was applied to a large third-party software application sold to HP. The system consists of 236,000 lines of C source code written for a Unix platform. The software complexity metrics were calculated on a file-by-file basis, and a maintainability index was calculated for each file.

The file-by-file analysis of the 714 files constituting the software system is shown in Figure 1. This histogram shows the maintainability (polynomial) index for each file, ordered from highest to lowest. The index for each file is represented by the top of each vertical bar; for negative indices, the value is represented by the bottom of the bar. The maintainability analysis for this system showed that the file maintainability scores (or indices) range from a high of 183 to a low of -91.

All components above the 85 maintainability index are highly maintainable, components between 85 and 65 are moderately maintainable, and components below 65 are "difficult to maintain." The dotted line indicates the quality cutoff established by Hewlett-Packard at index level 65.[10] Although these three quality categories are used by HP, they represent only a good "rule of thumb." The figure shows that 364 files, or roughly 50 percent of the system, fall below the quality-cutoff index, strongly suggesting that this system is difficult to modify and maintain. Prior to our analysis, the HP maintenance engineers had stated that the system was very difficult to maintain and modify. Further analysis proved that change-prone and defect-prone subsystem components (files) could be targeted using the ranked order of the maintainability indices.

In a subsequent study, a similar analysis was conducted on another third-party subsystem and compared against a maintainability index profile for a proprietary HP system (an example is shown in the next subsection). Based on that comparison, HP decided to purchase the third-party software.
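Given per-file indices, the rank ordering and the three quality bands reduce to a sort and two comparisons. A sketch using the 65/85 thresholds from the text (the function name and return shape are ours):

```python
def rank_and_classify(file_indices, cutoff=65.0, high=85.0):
    """Rank files by maintainability index and bucket them into the
    three HP quality categories described in the text.

    file_indices -- dict mapping file name to its polynomial index.
    """
    ranked = sorted(file_indices.items(), key=lambda kv: kv[1], reverse=True)
    bands = {"high": [], "moderate": [], "difficult": []}
    for name, mi in ranked:
        if mi >= high:
            bands["high"].append(name)
        elif mi >= cutoff:
            bands["moderate"].append(name)
        else:
            bands["difficult"].append(name)   # below the HP quality cutoff
    return ranked, bands

# The tail of `ranked` (everything in bands["difficult"]) is where
# change-prone and defect-prone files are most likely to be found.
```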
Using polynomials to compare software systems. The polynomial models can also be used to compare whole software systems. We analyzed two software systems that were similar in size, number of modules, platform, and language (see Table 3). The first system, A, is a third-party acquisition that had been difficult to maintain. (Again, the names of the two systems have been changed to protect proprietary information.) The second system, B, had been cited in internal Hewlett-Packard documentation as an excellent example of state-of-the-art software development. The four-metric polynomial model was used to compare the two systems to see the differences in their maintainability profiles. HP maintenance engineers, already experienced with the systems, were asked to comment on the maintainability of each system.

Table 3. A polynomial comparison of two systems corroborated an informal evaluation by engineers.

                                   A          B
HP evaluation                      Low        High
Platform                           Unix       Unix
Language                           C          C
Total LOC                          236,275    243,273
Number of modules                  3,176      3,097
Overall maintainability index      89         123

The results of the polynomial model shown in Table 3 corroborate the engineers' informal evaluation of the two software systems. The A system yielded a maintainability index of 89; while clearly above our acceptability criteria, it is considerably lower than the 123 maintainability index calculated for system B. This corresponds to the mediocre evaluation A received from the Hewlett-Packard engineers and the high praise B received from the engineers working on that system.


Authorized licensed use limited to: CALIF STATE UNIV NORTHRIDGE. Downloaded on April 26, 2009 at 01:38 from IEEE Xplore. Restrictions apply. Med/um MI (34.4%) o date we have conducted an au- 4. E.Yourdon, The Rise and Fall of the Amer- System A tomated software maintainabil- ican Programmer, Yourdon Press Com- T ity analysis on 11 software sys- puting Series, Trenton, N.J., 1992. Low MI tems. In each case, the results from our 5. J. Munson and T. Khoshgoftaar, “The De- analysis corresponded to the mainte- tection of Fault-Prone Programs,” ZEEE nance engineers’ “intuition” about the Trans. Software Eng., Vol. 18, No. 5, May maintainability of the (sub)system com- 1992, pp. 423-433. ponents. But in every case, the auto- 6. P. Oman and J. Hagemeister, “Metrics for mated analysis provided additional data Assessing Software System Maintainabil- that was useful in supporting or providing ity,” Proc. Conf Software Maintenance, CS System 13 (96.6%) credence for the experts’ opinions. IEEE Press, Los Alamitos, Calif., Or- Our analyses have assisted in buy-ver- der No. 2980-02T, 1992, pp. 337-344. Low MI: x e 65 sus-build decisions, targeting subcompo- Med MI: 65 5 xe 85 7. F. Zhuo et al., “Constructing and Testing High MI: 85 5 x nents for perfective maintenance, con- Software Maintainability Assessment trolling software quality and entropy over Models,” Proc. First Int’l Software Metrics several versions of the same software, Symp., IEEE CS Press, Los Alamitos, Figure 3. Comparison of two systems, identifying change-prone subcompo- Calif., Order No. 3740-02T, 1993, pp. 61-70. with high-, medium-, and low-mainte- nents, and assessing the effects of reengi- nance lines of code expressed in per- neering efforts. 8. P. Oman and J. Hagemeister, “Construc- centages. Software maintainability is going to be tion and Testing of Polynomials Predict- a considerable challenge for many years ing Software Maintainability,”J. of Sys- tems and Software, Vol. 24, No. 3, Mar. to come. The systems being maintained 1994, pp. 251-266. are becoming increasingly complex, and a granular analysis by calculating the poly- growing proportion of software develop- 9. Software Maintainability - Evaluation nomial on a module-by-module basis. ment staff is participating in the mainte- Guide, AFOTEC Pamphlet 800-2 (up- Figure 2 shows a plot of the ordered re- dated), HQ Air Force Operational Test nance of industrial software systems. Our and Evaluation Center, Kirkland Air sults. The B system (the thick line) con- results indicate that automated maintain- Force Base, N.M.,Vol. 3,1989. sistently scored higher than the A sys- ability analysis can be conducted at the tem for all but one module. The component level, the subsystem level, and 10. D. Coleman, “Assessing Maintainability,” significant gap between the two plots ac- the whole system level to evaluate and Proc. 1992 Software Eng. Productivity Conf:,Hewlett-Packard, Palo Alto, Calif., centuates the fact that the A system is compare software. By examining indus- 1992, pp. 525-532. less maintainable. trial systems at different levels, a wealth of Figure 3 contains two pie charts show- information about a system’s maintain- ing the distribution of lines of code in the ability can be obtained. Although these three maintainability classifications models are not perfect, they demonstrate Don Coleman is a project manager with (high, medium, and low). The upper pie the utility of such models. The point is Hewlett-Packard Corporate Engineering. 
He chart, representing the A system, illus- that a good model can help maintainers works in the area of software maintainability trates the nearly equal distribution of guide their efforts and provide them with assessment and defect analysis. code into the three classifications. The much needed feedback. Before develop- Dan Ash is a firmware engineer for Hewlett- lower pie chart, representing the B sys- ers can claim that they are building main- Packard Boise Printer Division. He special- tem, shows that a significant portion of tainable systems, there must be some way izes in font technology and embedded systems. this system falls in the high maintainabil- to measure maintainability. W ity classification. The B system contains Bruce Lowther is a software engineer at only 15 components, representing 2.8 Micron Semiconductor. He works in object- oriented development, focusing on software percent of the lines of code, that fall be- reusability and software quality. He is a mem- low the quality cutoff. The A system, on References ber of the IEEE Computer Society. the other hand, contains 228 components, representing 33.4 percent of the lines of 1. F.P. Brooks, The Mythical Man-Month: Paul Oman is an associate professor of com- Essays on Software Engineering, Addi- puter science at the University of Idaho where code, that fall below the quality cutoff. son-Wesley, Reading, Mass., 1982. he directs the Software Engineering Test Lab. Hence, using lines of code to compare the He is a member of the IEEE Computer Society. two systems reveals that although their 2. G. Parikh and N. Zvegintzov, Tutorial on overall maintainability index is adequate, Software Maintenance, IEEE CS Press, the B system is likely to be much easier to Los Alamitos, Calif., Order No. 453,1983. system. This result Readers can contact the authors through maintain than the A 3. T. Corbi, “Program Understanding: Chal- Paul Oman, Software Engineering Test Lab, corresponds to the Hewlett-Packard lenge for the 1990s,”IBM Systems J., Vol. University of Idaho, Moscow, Idaho 83843, evaluations. 28, NO.2,1989, pp. 294-306. e-mail [email protected].

