
A Software Science Model of Compile Time

WADE H. SHAW, JR., SENIOR MEMBER, IEEE, JAMES W. HOWATT, ROBERT S. MANESS, AND DENNIS M. MILLER

Abstract-Halstead's theory of software science is used to describe the compilation process and generate a performance index. A nonlinear model of compile time is estimated for four Ada compilers. A fundamental relation between compile time and program modularity is proposed. Issues considered include data collection procedures, the development of a counting strategy, the analysis of the complexity measures used, and the investigation of significant relationships between program characteristics and compile time. The results suggest that the model has high predictive power and provides interesting insights into compiler performance phenomena. The research suggests that the discrimination rate of a compiler is a valuable performance index and is preferred to average compile time statistics.

Index Terms-Ada compilers, compile time, performance indexes, software science.

Manuscript received February 11, 1987; revised August 3, 1987.
W. H. Shaw, Jr. and J. W. Howatt are with the Department of Electrical and Computer Engineering, Air Force Institute of Technology, Wright-Patterson AFB, OH 45433.
R. S. Maness was with the Department of Electrical and Computer Engineering, Air Force Institute of Technology, Wright-Patterson AFB, OH 45433. He is now with the Air Force Satellite Control Facility, Peterson AFB, CO.
D. M. Miller was with the Department of Electrical and Computer Engineering, Air Force Institute of Technology, Wright-Patterson AFB, OH 45433. He is now with the Air Force Operational and Test Center, Kirtland AFB, NM.
IEEE Log Number 8926735.

I. INTRODUCTION

TECHNOLOGICAL advances in computer software are changing the way we understand the underlying processes governing software design. Computer systems are becoming more numerous, more complex, and deeply embedded in our society. Inherent in this explosion of technology are questions concerning fundamental relationships between the processes of problem definition, algorithm selection and coding, and translation into an executable image. We can no longer simply write programs, but must "engineer" software for our systems to offset the rising cost of software development [1].

The Department of Defense (DoD) recognized this challenge in the 1970's and realized that a new standard language could be created to encourage the use of modern software engineering principles [2]. With the introduction of the Ada programming language for DoD, software engineering tools are needed to evaluate the performance and reliability of this language. Ada was developed under sponsorship of the DoD to support development of software for embedded computer systems. "By definition, an embedded computer system is one that forms a part of a larger system whose purpose is not primarily computational, such as a weapons system or a process controller. Physically, an embedded system may range from a single microcomputer to a network of large computers" [2]. For example, one area of use will be in the field of avionics.

In the development of avionics software, efficient compilers are needed. As more Ada compilers become available, tools are needed to validate and evaluate these compilers to determine which, if any, could best meet DoD requirements. One measure of interest in compiler comparisons is the computer time required to translate source code into object code. Currently, benchmark test suites are used; however, they have a poor reputation because the performance figures are sometimes cited out of context and overgeneralized into overall ratings [3]. What is needed is an approach that provides insight into the effect of intrinsic software characteristics on compile time. Researchers such as Maurice Halstead [4], [5] have raised questions about the existence of fundamental principles that govern the design and implementation of software. Halstead's goal was to develop objective measures of programming time and effort to make sound judgments about software quality and complexity.

The motivation for this research is threefold. First, the application of software science to the compiling problem is a straightforward extension of the theory. The degree to which concepts proposed in software science can be used to explain compile time represents reinforcement of the basic tenets offered by Halstead. Second, software science may, in fact, produce insight into the physical process of compilation. Clearly, compile time is a relatively minor aspect of compiler performance. Nevertheless, variation in characteristics such as operator/operand frequency is manifested in varying compile times, which represents a phenomenon that is not fully understood. Any relationships uncovered by mapping characteristics of source code to a model of compile time offer some evidence of a natural process. Finally, the use of a compile time model allows direct comparison of alternative compiler implementations as well as comparison of target architectures. Use of simple averages does not yield as sensitive a metric as models specifically designed to reduce the error component always present in performance measurement. A model yields structure to the problem of compiler/machine comparisons so that statistical tests can be used with a known precision.


We propose to apply the fundamental concepts of Halstead's software science theory to determine if an extension of his theory can be used to explain compile time and evaluate Ada compilers [6], [7]. Specifically, the research hypotheses are as follows.

1) There is no variability in compile time for Ada programs which can be explained by the relationships proposed in software science.

2) There is no variability in the performance of the software science model of compile time attributable to characteristics of the program. (We develop these characteristics later.)

3) There is no variability in the performance of the software science model explained by alternative compiler/computer systems.

These hypotheses allow for the development and testing of the applicability of software science in three ways. First, how well does the model predict compile time and what fundamental relationships exist between compile time and the software metrics proposed by Halstead? Second, is there a difference in the model's performance across various categories of Ada code (such as high versus low percentage of control flow code)? Third, can performance differences between compilers and machines be detected using software science measures and, if so, can a performance measure be developed?

The next section presents the theory applicable to this investigation to provide a background for the compile time model. The research methodology and experimental design used to evaluate the compiler model are presented, as well as results of the experiment for four compilers. Finally, we summarize the results and conclude the paper with comments on the applicability of the compile time model.

II. HALSTEAD'S SOFTWARE SCIENCE THEORY

In his classic work on software science [5], Halstead attempted to define and measure the complexity of software by analyzing program source code. Halstead defined four basic metrics computable from the code:

n1 = the number of unique operators,
n2 = the number of unique operands,
N1 = the total number of occurrences of operators,
N2 = the total number of occurrences of operands.

Using these basic metrics, Halstead defined the vocabulary n of a program to be the total number of unique tokens:

    n = n1 + n2                                  (1)

and the length of a program to be the total number of operator and operand occurrences:

    N = N1 + N2.                                 (2)

The size or volume of a program may vary when translating from one language to another. Halstead surmised that the volume of a program is a function of its vocabulary and is given by

    V = N log2(n)                                (3)

where V has a unit of measurement in bits. That is, log2(n) bits are needed to distinguish each of the n tokens in a program.

An algorithm may be implemented by many different, but functionally equivalent, programs. When an algorithm is implemented in its most succinct form, its potential volume V* is

    V* = (2 + n2*) log2(2 + n2*)                 (4)

where n2* is the number of input/output parameters and "2" is the number of required operators: the procedure/function name and the parameter list grouping operator. This represents the size of the program if it existed as a built-in function or procedure call. Halstead then argued that the amount of time required to implement an algorithm is directly proportional to the square of the program volume (V) divided by the potential volume (V*) and a constant (S):

    T = V^2 / (S V*).                            (5)

The constant "S" represents the speed of the programmer, or the number of mental discriminations per unit of time. Halstead used a value of 18 because, in his experiments, 18 gave the best results when comparing actual versus predicted programming time. We use the parameter "S" (denoted K) as a measure of compilation rate; it therefore represents a performance index.

III. RESEARCH METHOD

Halstead's programming time equation serves as the basic theoretical model. The equation is specified as a set of independent variables related by a set of parameters to be estimated. The dependent variable is the actual CPU time required for the compilation process. The volume (V) and the potential volume (V*) are the independent variables. Placing (5) in parameter form yields

    T = K V^a (V*)^b.                            (6)

This equation has the exact form of Halstead's time equation, where "K" is the discrimination rate, "a" is 2, and "b" is -1. "K" is assumed to have the same meaning in the compilation process as the constant "S" in Halstead's equation for predicting programmer time. "K" represents how fast the compiler does its job (the processing rate) and will depend on the computer architecture and the efficiency of the compiler itself. Clearly, "K" can be interpreted as a performance index given that "a" and "b" are known. Alternatively, "K" can be estimated across compilers and used as a performance index to statistically distinguish compilers.

Data collection represented a major portion of this research effort. Observations of compile times were required where the metrics of the source code could be accurately measured. To estimate an equation, numerous observations were required with sufficient variability in the software metrics. The observations of software metrics necessary for estimation of the proposed mathematical model were extracted from algorithms taken from a DoD-sponsored test suite. This required a set of rules for the identification and enumeration of each operator, operand, and I/O variable in each program. For our investigation, we expanded the counting strategy defined by Halstead. He considered tokens in only executable code. But a compiler must process all the tokens in a program, and can expend substantial resources translating data types, packages, tasks, and the like. Thus, to obtain accurate estimates, we include all program source code (except comments) in computing the measures for our analysis. This implies that tokens found in declarations as well as in executable code be counted. Consequently, in constructing these rules, consistency with the syntax diagrams in Appendix E of the Military Standard Ada Programming Language (MIL-STD-1815A) [8] was enforced. Not only did this approach offer a rigid basis upon which to build the counting strategy, it also offered the benefit of using the same syntax charts that authors of a compiler must use in designing an Ada compiler. We develop and implement a counting strategy specifically for Ada [9].
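To make the quantities in (1)-(6) concrete, the short sketch below computes them for a single module from hypothetical operator/operand counts. The token classification itself (the counting strategy of [9]) is not reproduced here, and the counts and parameter values supplied to the compile-time form are purely illustrative.

```python
from math import log2

# Hypothetical counts for one Ada module; in the study these come from the
# counting strategy of [9], which classifies every non-comment token.
n1, n2 = 18, 25          # unique operators, unique operands
N1, N2 = 120, 96         # total operator and operand occurrences
n2_star = 3              # number of input/output parameters, eq. (4)

n = n1 + n2                                 # vocabulary, eq. (1)
N = N1 + N2                                 # length, eq. (2)
V = N * log2(n)                             # volume in bits, eq. (3)
V_star = (2 + n2_star) * log2(2 + n2_star)  # potential volume, eq. (4)

def predicted_compile_time(K, a, b):
    """Compile-time form of eq. (6); K, a, and b must be estimated from data."""
    return K * (V ** a) * (V_star ** b)

print(f"n = {n}, N = {N}, V = {V:.1f} bits, V* = {V_star:.2f} bits")
print(f"T(K=0.37, a=0.48, b=0.07) = {predicted_compile_time(0.37, 0.48, 0.07):.2f} s")
```

With these hypothetical counts the module has a volume of roughly 1170 bits, the scale at which the models estimated later in the paper operate.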

The experimental design required that the algorithms be selected with a wide range of software metrics. Therefore, the Prototype Ada Compiler Evaluation Capability (ACEC), a collection of approximately 300 Ada modules, was selected. DoD sponsored the creation of this benchmark test suite to validate and evaluate Ada compilers. The programs provide information about language features that must be present in a full implementation of a MIL-STD-1815A compiler [10].

Operators and operands for each program were identified and counted using our counting rules, and then the needed software metrics were calculated from these counts. Of the 300 test programs, 171 were selected for this investigation. Programs were eliminated if they included pragmas (the impact of pragmas was unknown and not of interest in this study) or if they were similar to other modules, i.e., the vocabulary and length were the same. The algorithms were compiled on four different computers having validated Ada compilers and the compilation times were recorded. Table I summarizes the experimental environment.

TABLE I
THE EXPERIMENTAL ENVIRONMENT

Computer        Operating System   Compiler
VAX 11/785      UNIX 4.2           Verdix 5.1
DG MV-8000-II   AOS/VS 5.6         ADE 2.3
VAX 11/780      VMS 4.4            DEC 1.2
VAX 11/785      VMS 4.3            DEC 1.2

Conditions: Compilers have been validated. All programs used compile correctly. Compilation times are the results of no optimization. Three replications of each experimental unit are averaged to reduce measurement error.

CPU compile time was measured as accurately as possible. On a multiuser computer system, compilation time cannot be measured simply by a stopwatch because of the contention with other users. Therefore, total CPU time used in the compilation process was used. This time was obtained from the list or history file generated by the compiler for the AOS/VS and VMS systems. The UNIX system did not provide this information; therefore, the system command "time" was used. The Ada library was reinitialized after each program was compiled to ensure that librarian functions did not affect timing between successive compilations.

The compiler model was then estimated using the analysis of variance method and the linear regression tool in the SAS [11] software package for data analysis. To use linear regression analysis, the compiler model had to be linearized. This was done by taking the natural logarithms of both sides of the compile time equation. As a result, the equation for the compiler model becomes

    ln(T) = ln(K) + a ln(V) + b ln(V*) + error.      (7)

The analysis was divided into three parts. The first part evaluated the compiler model to determine its ability to explain compile time across alternative Ada compilers. The second part evaluated different module characteristics to determine which, if any, of them had a measurable impact on compile time on the UNIX system. Finally, we investigated the development of a performance index to compare the speed of various Ada compilers.
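Since the estimation in this study was carried out in SAS, the following is only an equivalent least-squares sketch of (7), using a handful of made-up (V, V*, T) observations; it shows how ln K, a, and b are recovered once both sides of the model are logged.

```python
import numpy as np

# Made-up (V, V*, CPU seconds) observations; the study used 171 ACEC modules.
V      = np.array([310.0, 842.0, 1560.0, 2904.0, 6120.0])
V_star = np.array([11.6,  15.8,  19.0,   30.0,   46.5])
T      = np.array([2.1,   4.0,   6.2,    9.5,    15.8])

# ln(T) = ln(K) + a*ln(V) + b*ln(V*) + error    -- eq. (7)
X = np.column_stack([np.ones_like(V), np.log(V), np.log(V_star)])
ln_K, a, b = np.linalg.lstsq(X, np.log(T), rcond=None)[0]

print(f"K = {np.exp(ln_K):.3f}, a = {a:.3f}, b = {b:.3f}")
```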


Thirteen program characteristics were considered in this investigation. Table II summarizes the program attributes used to divide the test suite. Each attribute average was computed for the entire test suite and the mean was used to distinguish between two levels of the attribute. Regression models were then computed for each partition.

TABLE II
PROGRAM CHARACTERISTICS TO BE CONTROLLED

Test   Description
1      Non-tasking versus tasking modules.
2      Low versus high input/output parameter percentage.
3      Short versus long modules.
4      Low versus high total operator concentration.
5      Low versus high total operand concentration.
6      Low versus high unique operand percentage as compared to module vocabulary.
7      Low versus high unique operator percentage as compared to module length.
8      Low versus high unique operand percentage as compared to module length.
9      Low versus high number of unique function/procedure calls as compared to the number of unique operators.
10     Low versus high number of total function/procedure calls as compared to number of total operators.
11     Low versus high number of total function/procedure calls as compared to module length.
12     Low versus high number of control flow operators as compared to number of total operators.
13     Low versus high number of control flow operators as compared to module length.

IV. RESULTS

Table III shows the percentage of error reduction relative to the average compilation time if the compiler model is used to predict compile time. By error reduction, we mean the average difference between predicted and actual compile times compared to use of the simple test suite average. That is, the software science model reduced prediction error 83.8 percent compared to the arithmetic average of 27.1 CPU seconds for the AOS/VS system. It is interesting to note that the slowest compiler, the AOS/VS system, provided the best model.

TABLE III
ERROR REDUCTION IN PREDICTING COMPILE TIME

Computer/Model   Mean Compile Time (CPU Secs)   % Error Reduction Relative to Average
UNIX             13.36                          55.56
AOS/VS           27.10                          83.81
VMS-780          12.36                          74.09
VMS-785           7.31                          73.72

Based on the statistical analysis of the compiler model, the estimated exponents for V and V*, "a" and "b", respectively, are shown in Table IV. The models' R2 is the coefficient of determination and measures the degree to which the data fit the model. An R2 of 1 would indicate perfect correspondence between the observed compile times and the software science model; a low R2 indicates poor performance of the model. Since R2 is influenced by the number of observations and the number of terms in a linear model, the adjusted R2 is used to avoid any upward bias. The adjusted R2 is therefore a conservative estimate of the degree of model fit.

TABLE IV
COMPILE TIME MODEL PARAMETER ESTIMATES

UNIX: Adjusted R2 = 0.5556, F = 107.278 (0.0001)
Parameter   Est       Std Err   Prob > T
K           -0.6386   0.2075    0.0024
a            0.4124   0.0315    0.0001

AOS/VS:
Parameter   Est       Std Err   Prob > T
K           -1.5067   0.1415    0.0001
a            0.5830   0.0215    0.0001
b            0.0431   0.0200    0.0319

VMS-780: Adjusted R2 = 0.7409, F = 244.079 (0.0001)
Parameter   Est       Std Err   Prob > T
K           -1.3314   0.1655    0.0001
a            0.4730   0.0251    0.0001
b            0.1047   0.0233    0.0001

VMS-785: Adjusted R2 = 0.7372, F = 239.44 (0.0001)
Parameter   Est       Std Err   Prob > T
K           -1.7833   0.1642    0.0001
a            0.4670   0.0249    0.0001
b            0.0991   0.0231    0.0001

The F statistic is a test statistic which indicates the likelihood that some parameter in the model is nonzero. That is, higher values of the F statistic indicate more confidence in the model. This confidence is expressed by the number in parentheses, which indicates the probability that the data observed could produce the estimated model and still be incorrect. For example, a value of 0.0001 means there is a 0.01 percent chance that the model has incorrectly established a linear fit between the dependent and independent variables.

Likewise, the column labeled "Prob > T" is the likelihood of error in estimating a parameter. Usually, likelihoods of less than 1-5 percent are deemed significant and indicate the parameter should be retained in the model. The evidence is overwhelming that the models are statistically valid and that the majority of variability in compile time can be explained by the software science model.

In Halstead's original work, he set the exponents of V and V* in the programmer time equation to 2 and -1, respectively. In Table IV, the estimated exponents are shown to be approximately 0.5 and 0.1. V* is not significant in two of the estimated models. Instead of dividing V and V*, the compiler model multiplied these two parameters, where V* was very small and perhaps insignificant. On the other hand, taking the square root of V is interesting because of its effect on modularization. That is, the exponent for V being 0.5 rather than 2, as suggested by Halstead, is a contrast between the effects of program modularization on programming time and on compile time.

Assume a program can be broken into n modules such that the volume of the program is equal to the sum of the volumes of the n modules. This can be expressed as

    V = Σ Vi   (summed over the n modules).

If Halstead's time equation is a function of a power of V and that power is greater than 1, then

    V^2 >> Σ (Vi)^2.

As a result, modularization reduces programming time. That is, the program will take longer to write as a single unit than it would take to write an equivalent modularized program. However, the compiler model seems to indicate the opposite effect, since the estimated exponent was fractional.
If compile time is a function of a power of the volume and that power is less than 1, then the sum of the modules' volumes, each raised to that power, exceeds the corresponding power of the volume of the single module containing the entire program:

    V^a << Σ (Vi)^a,   where a < 1.

Consequently, compile time increases if modularization is used. The time saved by a programmer when modularizing software causes the compiler to suffer in performance. Intuitively, this makes sense because the compiler must expend more resources checking the library and symbol tables.
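A quick numeric check (with an arbitrary volume, not data from the test suite) makes the contrast between the two exponents concrete: splitting a fixed volume into equal modules shrinks the sum of squared volumes but grows the sum of square roots.

```python
# Illustrative only: split a program of arbitrary volume V into equal-volume
# modules and compare the sums that drive the two time models.
V_total = 8000.0
print(f"monolith:   V^2      = {V_total ** 2:12.0f}   V^0.5      = {V_total ** 0.5:7.1f}")
for n_modules in (2, 4, 8, 16):
    v_i = V_total / n_modules
    square_sum = n_modules * v_i ** 2     # sum of Vi^2  (programming-time form)
    root_sum   = n_modules * v_i ** 0.5   # sum of Vi^0.5 (compile-time form)
    print(f"{n_modules:2d} modules: sum Vi^2 = {square_sum:12.0f}   "
          f"sum Vi^0.5 = {root_sum:7.1f}")
```

The squared sum falls as modules are added while the square-root sum rises, which is the direction of the compile-time penalty described above.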

Based on Table IV, the estimated model for compilation time for each compiler and the correlation between the actual and predicted compilation times are shown in Table V. If a parameter was not significant (Prob > T greater than 0.05), then it was not included in the model; that is why V* is not in the UNIX or AOS/VS model. As shown in Table V, the correlations between predicted and observed compilation times for each compiler are all quite high. Consequently, the model fits well. Note also that the correlations between the actual and predicted times on the VAX computers are identical. This makes sense because the same compiler is used on each computer. However, the discrimination rate is different, and therefore represents a relative measure of computer speed.

TABLE V
ADA COMPILER MODELS

System    Equation                         Correlation Coef.
UNIX      T = 0.53 (V^0.41)                .74
AOS/VS    T = 0.22 (V^0.58)                .75
VMS-780   T = 0.26 (V^0.47)((V*)^0.10)     .88
VMS-785   T = 0.17 (V^0.47)((V*)^0.10)     .88

The predicted times compare relatively well to the actual times, as shown in Fig. 1 for the UNIX compiler. In this figure, all observations were sorted in ascending order based on the UNIX compile times. The other compiler models generated similar curves with close correspondence between actual and predicted compile times. Note that as the compile time increases, the difference between actual and predicted times increases. This seems to suggest that the error term in the estimation process may enter the model as a product term and not a sum. However, below 30 s of CPU time, the model does very well. The figure suggests the existence of a structural break which may influence the model parameters as compile time increases. In general, the actual and predicted compile times correspond quite well.

Fig. 1. Comparison of UNIX system actual versus predicted compile time; algorithms sorted by UNIX compile time.

Hypothesis 2 was introduced to determine if the predictive power of the compile time model remains consistent between levels of program characteristics. The first test divided the test suite into one group of modules that did not have a tasking function and another group having a tasking function. Another test divided the suite based on module length, with the dividing point being the average module length for the test suite. Each of the other 11 tests divided the test suite into two groups based on low versus high percentage of the characteristic being tested. Modules that fell below the average percentage of a characteristic were put in the low category and all other modules were put in the high category.
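The two-level split used for these tests can be sketched as follows; the characteristic values and compile times are synthetic, and only the grouping step is shown, not the per-group regression whose adjusted R2 appears in Table VI.

```python
import numpy as np

# Synthetic data standing in for one characteristic (e.g., control-flow
# operator share) and the measured compile times of the 171 modules.
rng   = np.random.default_rng(0)
char  = rng.uniform(0.05, 0.60, size=171)
t_cpu = rng.uniform(1.0, 60.0, size=171)

low = char < char.mean()          # modules below the suite average
for label, mask in (("LOW", low), ("HIGH", ~low)):
    print(f"{label:4s}: count = {mask.sum():3d}, "
          f"mean compile time = {t_cpu[mask].mean():5.2f} s")
# Eq. (7) would then be refitted separately within each group.
```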
Table VI presents the results of the investigation for each of the 13 characteristics presented earlier. Clearly, differences in model performance exist which provide useful insight into the compilation process. For example, programs with a high percentage of I/O statements yield similar average compile times, but the higher the I/O concentration, the better software science was able to explain compile time.

TABLE VI
TEST PARAMETERS BY CATEGORY

Test No.   Level   Count   Mean    Adj R2
1          NO      151     13.60   .5456
           YES      20     11.60   .9552
2          LOW     101     13.49   .4240
           HIGH     70     13.19   .7468
3          SHORT   141      9.39   .2271
           LONG     30     32.05   .3407
4          LOW      60     16.18   .4929
           HIGH    111     11.84   .5976
5          LOW     111     11.84   .5976
           HIGH     60     16.18   .4929
6          LOW      52     22.43   .6099
           HIGH    119      9.40   .2358
7          LOW      78     18.62   .6469
           HIGH     93      8.96   .1774
8          LOW      93      8.96   .1774
           HIGH     78     18.62   .6469
9          LOW     111     13.08   .4773
           HIGH     60     13.89   .7833
10         LOW      77     16.77   .5285
           HIGH     94     10.57   .7350
11         LOW      77     16.75   .5266
           HIGH     94     10.59   .7350
12         LOW     106     13.47   .5649
           HIGH     65     13.18   .5654
13         LOW     106     13.46   .5563
           HIGH     65     13.20   .5714

LEGEND: LEVEL - level of the category being tested. COUNT - number of modules in the test category. MEAN - mean compile time of all modules in the category. Adj R2 - adjusted R-square for the category.

The next major area of investigation was the development of a performance index. For this analysis, dummy variables were used to observe the effect of each compiler and to analyze each compiler separately while maintaining the same exponents for volume and potential volume. The revised model was then estimated using aggregate data from the entire test suite, and the dummy variables' values were used to determine each compiler's relative rate of compilation. The revised model is given as

    ln T = ln K + a ln V + b ln V* + e + f + error.      (8)

Here, e and f represent coded variables (0 or 1) depending on the compiler being estimated. For example, the UNIX compiler was designated e = 0 and f = 0, so that the discrimination rate for the UNIX machine was simply ln K. The AOS/VS compiler was designated e = 0 and f = 1, so that the least squares estimate of f was added to ln K to estimate the discrimination rate of the AOS/VS compiler by itself. Likewise, the VMS-780 was coded e = 1, f = 0, and the VMS-785 was coded e = 1, f = 1. In this way, the V and V* coefficients were constant across the data and the impact of the alternative compilers was directly available in e and f.
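A least-squares sketch of the pooled model (8) with this coding might look as follows; NumPy stands in for the SAS procedures actually used, and the module volumes are synthetic values generated around the Table VII estimates.

```python
import numpy as np

# Dummy coding per the paper:
#   UNIX: e=0,f=0   AOS/VS: e=0,f=1   VMS-780: e=1,f=0   VMS-785: e=1,f=1
rng = np.random.default_rng(1)
ln_v      = rng.uniform(5.0, 9.0, size=40)    # ln V for 40 synthetic modules
ln_v_star = rng.uniform(1.5, 4.0, size=40)    # ln V*
rows, y = [], []
for e, f in ((0, 0), (0, 1), (1, 0), (1, 1)):  # one pass per compiler
    rows.append(np.column_stack([np.ones(40), ln_v, ln_v_star,
                                 np.full(40, e), np.full(40, f)]))
    # Generate ln T from paper-like parameter values plus noise.
    y.append(-0.98 + 0.48 * ln_v + 0.07 * ln_v_star
             - 0.56 * e - 0.11 * f + rng.normal(0, 0.05, 40))
X, y = np.vstack(rows), np.concatenate(y)
ln_k, a, b, e_hat, f_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# Recover each compiler's discrimination rate K from the dummy estimates.
for name, e, f in (("UNIX", 0, 0), ("AOS/VS", 0, 1),
                   ("VMS-780", 1, 0), ("VMS-785", 1, 1)):
    print(f"{name:8s} K = {np.exp(ln_k + e * e_hat + f * f_hat):.2f}")
```

With the seeded data above, the recovered rates fall near the per-compiler values derived from Table VII in the discussion that follows.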
Using binary dummy variables results in more efficient estimation, because the use of a separate variable for each compiler would introduce four variables instead of the two shown. The use of fewer variables raises the degrees of freedom associated with the error term, and thus more realistically estimates the impact of each compiler. This model ensures that each compiler effect is estimated independently of the other compilers, while the software science parameters (a and b) are estimated across all the data.

Table VII shows the results of this effort.

TABLE VII
PARAMETER ESTIMATES FOR POOLED DATA

Parameter   Est       Std Err   Prob > T
ln K        -0.9818   0.1018    0.0001
a            0.4839   0.0151    0.0001
b            0.0745   0.0140    0.0001
e           -0.5601   0.0319    0.0001
f           -0.1063   0.0319    0.0009

Adjusted R2 = 0.7066, F = 412.312 (0.0001)

Using dummy variables does not change the compiler model from Table V except for the discrimination rate K. Note that the exponents for V and V* are approximately the same, i.e., 0.5 and 0.1, respectively. However, "K," the translation rate, changes slightly. The discrimination rate for each compiler is significantly different from the base, UNIX, due to the nonzero values of e and f. The equations for each compiler now become

    UNIX:      T = 0.37 (V^0.48)((V*)^0.07),
    AOS/VS:    T = 0.34 (V^0.48)((V*)^0.07),
    VMS-780:   T = 0.21 (V^0.48)((V*)^0.07),
    VMS-785:   T = 0.19 (V^0.48)((V*)^0.07).

As the equations suggest, the compilers would be ranked as shown in Table VIII.

TABLE VIII
COMPILER RANKINGS

Rank by Compile Avg.   Rank by Compiler Model   % Faster than UNIX
AOS/VS                 UNIX                     -
UNIX                   AOS/VS                   10.1
VMS-780                VMS-780                  42.9
VMS-785                VMS-785                  48.6

Note that the compilers on the VMS machines were the same. Therefore, it can be concluded from the above that the VAX 11/785 computer is faster than the VAX 11/780 computer by 10.1 percent (K being 0.21 versus 0.19). Clearly, the compiler/machine implementations are significantly different in performance. This difference is statistically valid and represents an index which we feel is more useful than simple averages. In fact, simple averages generated a rank order different from the software science approach!

The use of a software science model of compile time takes into account much more information about the algorithm being compiled, and is therefore able to predict the compile time much better than use of a simple average. The use of the discrimination rate is a sensitive estimate of the compiler and machine speed and generates a rank order which more closely represents the true processing power of the alternatives being compared.

V. CONCLUSIONS

A number of conclusions can be drawn from this analysis. First, the attempt to develop a measure which would provide a suitable approximation of the amount of time expended during the compilation process has been validated. The results suggest that the software science compiler model is a good tool for predicting compilation times and provides some fundamental insights concerning the compilation process. Hypothesis 1 is therefore rejected.

The signs and the magnitudes of the estimated parameters were not within the proximity of the theorized values. In particular, the value of the exponent for the volume (0.5 instead of 2.0) was unexpected. However, this value seems reasonable in that a compiler must expend more resources compiling several modules separately than compiling a single program containing all the modules. In addition, the exponent for the potential volume V* was not negative and was not as significant in this application as compared to programming time.

The 13 tests that analyzed the predictive power of the model based on module characteristics showed that the compiler model was statistically significant at the 0.05 level for both categories of all 13 tests. Therefore, the compile time model is deemed useful regardless of program characteristics. Hypothesis 2 is therefore rejected, and the complex tradeoffs between program characteristics are noted. It is useful to note the overall high explanatory power of the model regardless of the modules' characteristics.

The explanatory power of the compile time model was consistently encouraging across four compiler/machine combinations. The predictive ability of software science in explaining differences in compilation rate across alternative Ada compilers was found significant and useful. The use of the discrimination rate proved to be a sensitive and appealing way to evaluate a compiler's performance and is not as biased by aberrations in measurement or outliers as the simple arithmetic averages. Therefore, Hypothesis 3 is rejected.

The results of this research are twofold. First, a model of compile time has been developed and tested. The model suggests that modularization of code is costly in terms of compile time and appears valid across levels of program characteristics. Second, the development of a compiler performance index has been proposed. Clearly, the value of "K" represents the processing rate of a compiler. The results indicate that ranking a compiler's performance solely on the average compile time is suspect. Based on the average compile time, the AOS/VS system was slower than the UNIX system. In contrast, the software science model ranked the UNIX system slower than the AOS/VS system. It was also shown that "K" can compare computers when the operating environment is the same. In this case, as expected, a VAX 11/785 was faster than a VAX 11/780.

In summary, a model of Ada compile time has been developed and validated. This study represents a preliminary exploration of the applicability of software science metrics to compilers. The results have indicated that there is enough evidence to continue investigating this area. Future research testing this model on other compilers and languages on a broader spectrum of data may further illuminate the compilation phenomena of compilers. A compiler index has been proposed, developed, and tested. With further research, the software science compiler model may become a valuable tool in evaluating compiler performance.

ACKNOWLEDGMENT

The authors wish to thank the IEEE TRANSACTIONS ON SOFTWARE ENGINEERING Editor as well as three anonymous reviewers for their helpful suggestions and comments.

REFERENCES

[1] R. E. Fairley, Software Engineering Concepts. New York: McGraw-Hill, 1985.
[2] G. Booch, Software Engineering with Ada. Menlo Park, CA: Benjamin/Cummings, 1983.
[3] R. Relph, S. Hahn, and S. Viles, "Benchmarking C compilers," Dr. Dobb's J. Software Tools, vol. 11, pp. 30-50, Aug. 1986.
[4] A. Fitzsimmons and T. Love, "A review and evaluation of software science," ACM Comput. Surveys, vol. 10, pp. 3-17, Mar. 1978.
[5] M. Halstead, Elements of Software Science. New York: Elsevier North-Holland, 1977.
[6] R. Maness, "Validation of a structural model of computer compile time," M.S. thesis, GCS/ENG/86D-1, DTIC AD-A177655, School of Eng., Air Force Inst. Technol. (AU), Wright-Patterson AFB, OH, Dec. 1986.
[7] D. Miller, "Application of Halstead's timing model to predict the compilation time of Ada compilers," M.S. thesis, GE/ENG/86D-7, DTIC AD-A177652, School of Eng., Air Force Inst. Technol. (AU), Wright-Patterson AFB, OH, Dec. 1986.
[8] U.S. Dep. of Defense, "Military Standard Ada programming language," ANSI/MIL-STD-1815A, Jan. 22, 1983.
[9] D. M. Miller, R. S. Maness, J. W. Howatt, and W. H. Shaw, "A software science counting strategy for the full Ada language," SIGPLAN Notices, vol. 22, May 1987.
[10] A. Hook, A. Audrey et al., User's Manual for the Prototype Ada Compiler Evaluation Capability (ACEC) Version 1, IDA Paper P-1879, Oct. 1985.
[11] SAS Institute Inc., SAS User's Guide: Statistics, Version 5 Edition. Cary, NC: SAS Institute Inc., 1985.

Wade H. Shaw, Jr. (M'87-SM'88) received the B.S. degree in electrical engineering, the M.S. degree in systems engineering, and the Ph.D. degree in engineering management from Clemson University, Clemson, SC. He is an Assistant Professor of Electrical Engineering at the Air Force Institute of Technology, Dayton, OH. His teaching, research, and consulting interests include simulation, modeling and analysis, decision support systems, and software engineering. Dr. Shaw is a Captain in the U.S. Army, a member of AIIE, DSI, TIMS, ORSA, and SCS, and is a Registered Professional Engineer in Ohio and South Carolina.

James W. Howatt received the B.S. degree in computer science from Wright State University, Fairborn, OH, in 1977, the M.S. degree in systems analysis from the University of West Florida, Pensacola, in 1980, and the Ph.D. degree in computer science from Iowa State University, Ames, in 1985. Currently on active duty with the United States Air Force, he is an Assistant Professor of Computer Engineering at the Air Force Institute of Technology. His research interests include software measures and metrics, formal specification languages and techniques, and static program analysis tools and techniques.

Robert S. Maness received the B.A. degree in natural sciences from the University of Texas, Austin, in 1981 and the M.S. degree in information systems from the Air Force Institute of Technology, OH, in 1986. He is a Captain in the United States Air Force and is currently assigned to Operating Location AB, Air Force Satellite Control Facility, Peterson AFB, CO.

Dennis M. Miller received the B.S. degree in electrical and electronic engineering from North Dakota State University, Fargo, in 1981 and the M.S. degree in electrical engineering from the Air Force Institute of Technology, OH, in 1986. Upon graduation from NDSU, he received a commission in the USAF through the ROTC program and entered the Air Force on active duty in July 1981. His first assignment was to the 1000th Satellite Operations Group at Offutt AFB, NE, where he served as a System Integration Engineer with responsibility for testing, integrating, and evaluating system upgrades for the command and control and telemetry processing systems in the Defense Meteorological Program, including telemetry analysis, communication, database, and retrieval systems. He is currently assigned to the Air Force Operational and Test Center, Kirtland AFB, NM.