Measuring Software Product Quality: a Survey of ISO/IEC 9126

Ho-Won Jung and Seung-Gweon Kim, Korea University
Chang-Shin Chung, Telecommunications Technology Association

IEEE Software, September/October 2004. Published by the IEEE Computer Society. 0740-7459/04/$20.00 © 2004 IEEE

The international standard ISO/IEC 9126 defines a quality model for software products. Based on a user survey, this study of the standard helps clarify quality attributes and provides guidance for revising the standard.

Rapid IT growth and the proliferation of PCs have resulted in numerous software products. For 2003, the Gartner Group estimates that worldwide user spending on software exceeded $730 billion, and IDC estimates that spending in the packaged software market topped $176 billion.[1,2] Although the market is rapidly growing, users are often dissatisfied with software quality. In fact, widespread dissatisfaction in the US drove recent legislative activity governing software quality and suppliers' responsibilities.[3] Similarly, user satisfaction is often considered a critical outcome of quality management, and studies show it as having a positive impact on organizational cost, profit, and sales growth.[4]

To address the issues of software product quality, the Joint Technical Committee 1 of the International Organization for Standardization and International Electrotechnical Commission published a set of software product quality standards known as ISO/IEC 9126. (See the sidebar for related standards.) These standards specify software product quality's characteristics and subcharacteristics and their metrics. However, some in the software engineering community have expressed concerns about a lack of evidence to support such standards. According to Shari Lawrence Pfleeger and her colleagues, "Standards have codified approaches whose effectiveness has not been rigorously and scientifically demonstrated. Rather, we have too often relied on anecdote, 'gut feeling,' the opinions of experts, or even flawed research rather than on careful, rigorous software engineering experimentation."[5]

Stakeholders use evaluations of software product quality as a basis for many important decisions, including improving product quality, making large-scale acquisitions, and monitoring contracts. Given this crucial business role and the community's doubts about standards, we found it essential to empirically investigate whether the ISO/IEC 9126 categorization is correct and reliable in evaluating user satisfaction with the judgment of a packaged software product's quality.

Sidebar: Related Standards

Working Group 6 of Subcommittee 7 (Software and Systems Engineering Standardization) under Joint Technical Committee 1 of the International Organization for Standardization and the International Electrotechnical Commission is developing standards and technical reports for software product evaluation and metrics. WG 6 has developed ISO/IEC 9126-1 (quality model), 9126-2 (external metrics), 9126-3 (internal metrics), and 9126-4 (quality in use metrics). WG 6 has also developed the ISO/IEC 14598 series of technical reports, which provide guidance for evaluating software product quality on the basis of ISO/IEC 9126.

JTC1 publishes six types of documents. Each type follows all or part of the following development stages: preliminary, proposal, preparatory, committee, approval, and publication.(1) ISO/IEC 9126-1 is the international standard and 9126-2, -3, and -4 are technical reports.

Sidebar reference
(1) ISO/IEC JTC1 Directives, Procedures for the Technical Work of ISO/IEC JTC1, ISO/IEC JTC1, ISO, 1999; www.jtc1.org/directives/toc.htm.

ISO/IEC 9126

Quality, according to ISO 8402, is "the totality of characteristics of an entity that bear on its ability to satisfy stated and implied needs." ISO/IEC 9126-1 defines a quality model that comprises six characteristics and 27 subcharacteristics of software product quality (see Table 1). Because the quality model is generic, you can apply it to any software product by tailoring to a specific purpose.

ISO/IEC 9126 also defines one or more metrics to measure each of its subcharacteristics. For example, the quality level of a software product's functionality can be represented by measured values of its five subcharacteristics. Together, those subcharacteristics include the properties that the standard posits to constitute functionality. Following an appropriate aggregation method, you can combine the five subcharacteristics' measured (or survey) values into a single value on a composite index of functionality.

However, if any of functionality's subcharacteristics measures doesn't correlate to functionality, the aggregated value would fail to fully represent functionality's quality as ISO/IEC 9126 defines it. In such a case, the unrelated subcharacteristic should be moved into a more appropriate characteristic, so that each characteristic includes homogeneous subcharacteristics.[6]

In ISO/IEC 9126, "satisfaction" implies "the capability of the software product to satisfy users in a specified context of use." Satisfaction in that sense refers to the user's response to interaction with the product. It includes judgments about product use rather than about properties of the software itself. Of course, in addition to ISO/IEC 9126's definition, user satisfaction also can include facets such as the quality of service, cost, or developer's reputation. However, we limited our survey questions' wording to cover only the subcharacteristics as defined by ISO/IEC 9126.

Table 1. Characteristics and subcharacteristics in ISO/IEC 9126

Characteristic: Subcharacteristics
Functionality: Suitability, accuracy, interoperability, security, functionality compliance*
Reliability: Maturity,* fault tolerance,* recoverability,* reliability compliance*
Usability: Understandability, learnability, operability, attractiveness, usability compliance*
Efficiency: Time behavior, resource utilization, efficiency compliance*
Maintainability: Analyzability, changeability, stability, testability, maintainability compliance*
Portability: Adaptability, installability, replaceability, coexistence, portability compliance*

*Denotes the omitted subcharacteristics in our survey

Background and method

Because you can call each of ISO/IEC 9126's six characteristics a dimension, or factor,[7,8] in statistical terms, the standard's structure is a multidimensional concept of software product quality. (A similar multidimensional concept would be job satisfaction. For example, you can be more satisfied or less satisfied with various combinations of your job, supervisor, pay, and/or workplace.) Our study aims to evaluate empirically ISO/IEC 9126's dimensionality, or the classification of characteristics, and internal-consistency reliability. For this purpose, we conducted a survey in which users evaluated their satisfaction concerning the quality of a packaged software product according to the criteria of ISO/IEC 9126's subcharacteristics.

The survey

A marketing department identified 200 users of its company's packaged software product, a query and reporting tool for business databases. Of the 200 people we contacted through email or telephone, 75 (38 percent) responded to the questionnaire. The respondents included 48 end users, 25 developers who had modified the tool to embed it in their applications, and two others, all of whom we refer to as users. We were unable to find any significant differences in end-user and developer responses.

Rating scale

If you use too few rating-scale categories, the response won't capture the questions' full discriminatory power. On the other hand, using too many categories might be beyond the respondents' limited discriminatory powers. A Monte Carlo study of the number of scale points' effects on reliability showed that reliability estimates increased as the number of scale points increased from two to five, but the estimates decreased as more categories were added.[9] So, we used a five-category rating scale, which ranged from very satisfied to very dissatisfied, and included a "Don't know" option. Using this scale, we measured the respondents' judgments …

… we revised the questionnaire and then tested it using five industry end users again. The second pretest results led to minor revisions.

Because many pretest respondents had difficulty relating the ISO/IEC 9126 reliability subcharacteristics to their experience with the product, we removed the reliability questions. In addition, most respondents had the same trouble with the compliance subcharacteristics, defined for each characteristic as "the software product adheres to standards, conventions, or regulations in laws and similar prescriptions relating to each characteristic." So, we removed the compliance subcharacteristic for each of the remaining five characteristics. In the end, our questionnaire covered 18 subcharacteristics of the five characteristics. During the survey, we encouraged feedback from respondents regarding the questionnaire's contents.

Dimensionality's role

Principal component analysis (PCA) is a statistical method used to investigate an underlying concept's dimensionality.[8] Dimensionality is a concept related to the correlation between each of the measured subcharacteristics and derived dimensions. A dimension con…

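The aggregation step described in the ISO/IEC 9126 section, combining subcharacteristic values into a single composite index, can be sketched as follows. This is a minimal example under stated assumptions: the article does not name an aggregation method, so an unweighted mean is used here purely for illustration; the 1-to-5 numeric coding of the five-category scale, the use of None for "Don't know," and the function name composite_index are likewise hypothetical.

```python
from statistics import mean


def composite_index(ratings):
    """Average the answered ratings for one characteristic.

    ratings maps subcharacteristic name -> score on an assumed 1..5
    scale, or None for a "Don't know" answer. Returns None when no
    subcharacteristic was rated.
    """
    answered = [score for score in ratings.values() if score is not None]
    return mean(answered) if answered else None


# Hypothetical answers from one respondent for functionality
# (compliance was omitted from the survey).
functionality = {
    "suitability": 4,
    "accuracy": 5,
    "interoperability": 3,
    "security": None,  # "Don't know"
}

index = composite_index(functionality)
print(index)
```

A weighted mean or another aggregation rule could be substituted without changing the structure. Either way, the index is meaningful only if the subcharacteristics being combined actually correlate with the characteristic.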