<<

Software Reuse

CHARLES W. KRUEGER

School of , G’arnegie Mellon University, Pittsburgh, Pennsylvania 15213

Software reuse is the process ofcreating software systems from existing software rather than building software systems from scratch. ‘l’his simple yet powerful vision was introduced in 1968. Software reuse has, however, failed to become a standard software engineering practice. In an attempt to understand why, researchers have renewed their interest in software reuse and in the obstacles to implementing it. This paper surveys the different approaches to software reuse found in the research literature. It uses a taxonomy to describe and compare the different approaches and make generalizations about the field of software reuse. The taxonomy characterizes each reuse approach interms of its reusable artifacts and the way these artifacts are abstracted, selected, speciahzed, and integrated. Abstraction plays a central role in software reuse. Concise and expressive abstractions are essential if software artifacts are to be effectively reused. The effectiveness of a reuse technique can be evaluatedin terms of cognztzue dwtance-an intuitive gauge of the intellectual effort required to use the technique. Cognitive distance isreduced in two ways: (l) Higher level abstractions ina reuse technique reduce the effort required to go from the initial concept of a software system to representations inthe reuse technique, and(2) automation reduces the effort required togo from abstractions inareuse technique to an executable implementation. This survey will help answer the following questions: What is software reuse? Why reuse software? What are the different approaches to reusing software? How effective are the different approaches? What is required to implement a software reuse technology? Why is software reuse difficult? What are the open areas for research in software reuse?

Categories and Subject Descriptors: D.1.O [Programming Techniques]: General; D.2. 1 [Software Engineering]: Requirements\ Specifications-la rzguages, methodologies, tools; D.2.2 [Software Engineering]: Tools and Techniques—modules and interfaces, programmer workbench, software libraries; D .2.m [Software Engineering]: Miscellaneous-–reusable software; D.3.2 [Programming Languages]: Language Classifications-specialized application languages, very high-level languages; D.3.4 [Programming Languages]: Processors; F.3. 1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs; H.3. 1 [Information Storage and Retrieval]: Content Analysis and Indexing—abstracting methods, indexing methods; H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval—query formulation, retrieual models, search process, select~on process; 1.2.2 [Artificial Intelligence]: Automatic Programming--, .

This work was supported in part by the Hillman Fellowship for Software Engineering, in part by ZTI-SOF of Siemens Corporation, Munich, Germany, and in part by the Avionics Lab, Wright Research and Development Center, Aeronautical Systems Division (AFSC), U.S. Air Force, Wright-Patterson AFB, OH 45433-6543 under contract F33615-90-C-1465, ARPA Order 7597.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. CQACM 0360-0300/92/0600-0131 $01.50 132 “ Charles W. Krueger

General Terms: Design, Economics, Languages Additional Key Words and Phrases: AbstractIon, cognitive distance, software reuse

CONTENTS 9. TRANSFORMATIONAL SYSTEMS 9.1 AbstractIon m Transformational Systems 9.2 SelectIon 9.3 Speciahzatlon INTRODUCTION 9.4 Integratmn 1. ABSTRACTION 9.5 Appraisal of Reuse Techmques m Transfor- 1 1 Abstraction ]n mational Systems 12 Abstraction m Software Reuse 10 SOFTWARE ARCHITECTURES 13 Cognitive D]stance 10.1 AbstractIon m Software Architectures 2, ORGANIZATION OF THE SURVEY 10.2 Selection 3. HIGH-LEVEL LANGUAGES 10.3 Speclahzatlon 3.1 AbstractIon m High-Level Languages 10.4 Integration 3.2 SelectIon 10.5 Appraisal of Reuse Techniques in Software 3.3 Speclahzatlon Architectures 3.4 Integration 11 SUMMARY 35 ADcraisal. . of Reuse Techmques m Hlgh- 11,1 Categories and Taxonomy Level Languages 11.2 Cogmtlve Distance 4. DESIGN AND CODE SCAVENGING 113 General Conclusions 41 Abstraction m Design and Code ACKNOWLEDGMENTS Scavenging REFERENCES 4.2 SelectIon 4.3 Speclalizatlon 4.4 Integration 4.5 Appramal of Reuse Techmques m Desgn INTRODUCTION and Code Scavenging 5. COMPONENTS The 1968 NATO Software Engineering 51 Abstraction ]n Source Code Components Conference is generally considered the 52 SelectIon birthplace of the software engineering 5.3 Specialization 5.4 Integration field [Naur and Randell 1968]. The con- 5.5 Object-Oriented Van ants to Component ference focused on the software Reuse crisis—the problem of building large, re- 5.6 Appraisal of Reuse Techniques m Source liable software systems in a controled, Code Components

6 SOFTWARE SCHEMAS cost-effective way (it was at this confer- 61 Abstraction in Software Schemas ence that the term soft ware crisis origi- 62 Selection nated). From the beginning, software 63 Specialization reuse has been touted as a means for 6.4 Integration overcoming the software crisis. The semi- 6.5 Appramal of Reuse Techniques in Software Schemas nal paper on software reuse was an in- 7. APPLICATION GENERATORS vited paper at the conference: Mass Pro- 71 Abstraction m Apphcatlon Generators duced Software Components by Mclh-oy 7,)- &lectiOn [ 1968]. McIlroy proposed a library of 7.3 Specialization 7.4 Integration reusable components and automated 7.5 Software Life Cycle with Application techniques for customizing components Generators to different degrees of precision and ro- 76 Appraisal of Reuse Techniques in bustness. McIlroy felt that component li- Apphcation Generators 8. VERY HIGH-LEVEL LANGUAGES braries could be effectively used for nu- 8.1 Abstraction m Very High-Level merical computation, 1/0 conversion, Languages text processing, and dynamic storage al- 8.2 Selectlon location. 8.3 Speclalizatlon Twenty-three years later, many com- 8.4 Integration 8.5 Appraisal of Reuse Techmques in Very puter scientists still see software reuse High-level Languages as potentially a powerful means of im-

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 133 proving the practice of software engi- developers would be forced to sift neering [Boehm 1987; Brooks 1987; through a collection of reusable arti- Standish 1984]. The advantage of amor- facts trying to figure out what each tizing software development efforts artifact did, when it could be reused, through reuse continues to be widely ac- and lhow to reuse it, knowledged, even though the tools, Selection. Most reuse approaches help methods, languages, and overall under- software developers locate, compare, standing of software engineering have and select reusable software artifacts. changed significantly since 1968 [Bi- For example, classification and cata- ggerstaff and Richter 1989]. loging schemes can be used to organize In spite of its promise, software reuse a library of reusable artifacts and to has failed to become standard practice guide software developers as they for software construction. In light of this search for artifacts in the library failure, the computer science community [Horowitz and Munson 1989]. has renewed its interest in understand- Specialization. With many reuse tech- ing how and where reuse can be effective nologies, similar artifacts are merged and why it has proven so difficult to into a single generalized (or generic) bring the seemingly simple idea of soft- artifact. After selecting a generalized ware reuse to the forefront of software artifact for reuse, the software devel- development technologies [Biggerstaff oper specializes it through parame- and Perlis 1989a, 1989b; Freeman 1987b; ters. transformations. constraints. or Tracz 1988]. some other form of refinement. For Simply stated, software reuse is using example, a reusable stack implemen- existing software artifacts during the tation might be parameterized for the construction of a new software system. maximum stack depth. A programmer The types of artifacts that can be reused using this generalized stack would are not limited to source code fragments specialize it by providing a value for but rather may include design struc- this parameter. tures, module-level implementation Integration. Reuse technologies typi- structures, specifications, documenta- cally have an integration framework. tion, transformations, and so on [Free- A software develo~er uses this frame- man 1983]. work. to combine a collection of se- There is great diversity in the software lected and specialized artifacts into a engineering technologies that involve complete software system. A module some form of software reuse. However, interconnection language is an exam- there is a commonality among the tech- ple of an integration framework [De- niques used. For example, software com- Remer and Krcm 1976; Prieto-Diaz and ponent libraries, application generators, Neighbors 1986]. With a module inter- source code , and generic soft- connection language, functions are ex- ware templates all involve abstracting, ported from modules that implement selecting, specializing, and integrating them and imported into modules that software artifacts [Biggerstaff and use them. Modules are assembled into Richter 1989]. In this survey, software a system by interconnecting modules engineering technologies are analyzed with the appropriate exports and im- and contrasted in terms of their id- ports. iomatic reuse techniques, particularly along the four mentioned dimensions: From the software engineering per- spective, software reuse pertains solely Abstraction. All approachm to ~oftware to the proceim of constructing ~oftware reuse use some form of abstraction for systems. Using existing sine routine software artifacts. Abstraction is the source code during the construction of a essential feature in any reuse tech- program is considered an example of nique. Without abstractions, software software reuse, but repeatedly invoking

ACM Computing Surveys, Vol. 24, No. 2, June 1992 134 “ Charles W. Krueger

that sine routine during the execution of 1.1 Abstraction in Software Development the program is not. Also excluded are Computer scientists often use abstrac- repeated program executions and verba- tion to help manage the intellectual tim duplication of programs for the pur- complexity of software [ Shaw 1984]. pose of distribution. An abstraction for a software artifact is The goal of this paper is to introduce a succinct description that suppresses research efforts in software reuse; it is the details that are unimportant to a primarily a survey of the software reuse software developer and emphasizes the literature. To provide a coherent perspec- information that is important. For exam- tive of the diverse literature, a uniform ple, the abstraction provided by a high- taxonomy is developed. Different reuse level programming language allows a approaches are then characterized using programmer to construct algorithms this taxonomy. Some simple comparative without having to worry about the de- analysis identifies the relative merits and tails of hardware register allocation. drawbacks of the different reuse tech- Software typically consists of several niques and forms the basis for general- layers of abstraction built on top of raw izations about the field of software reuse. hardware. The lowest-level software ab- As a survey of research literature, this straction is object code, or machine code. paper is not intended to be a guide for Assembly language is a layer of abstrac- selecting or implementing a reuse tech- tion above object code. A programming nology in practice. Many of the systems language (e.g., Adal ) is a layer of ab- described are research prototypes that straction above the assembly language. have not been scaled up or validated in In modular languages such as Ada, the practical use. module specification can serve as a layer of abstraction above the imdementation details in the module body. ‘ 1. ABSTRACTION These examples demonstrate that ev- ery software abstraction has two levels. Because abstraction is such an important The higher of the two levels is referred to part of software reuse, it is used as the as the abstraction specification. The unifying theme in this survey. This choice lower, more detailed level is called the also reflects the view that successful ap- plication of a reuse technique to a soft- abstraction realization. 2 When abstrac- tions are layered, the abstraction specifi- ware engineering technology is inex- cation at one layer is the abstraction re- orably tied to raising the level of abstrac- alization at the next higher layer. Figure tion for that technology. Since raising 1 shows a hierarchy with two abstrac- abstraction levels for software engineer- tions, L and M. Rep 1, Rep 2, and Rep 3 ing technologies has proven to be quite are three representations of the same difficult, the relation between abstrac- software artifact, where Rep 1 is the most tion and reuse provides us with the first detailed (lowest level) representation. For clue to why there are so few successful abstraction L, Rep 2 is the abstraction reuse systems. specification, and Rep 1 is the abstrac- Others have noted the relationship be- tion realization. From the point of view tween software reuse and abstraction [Booth 1987; Neighbors 1984; Parnas et al. 1989; Standish 1984; Wegner 1983]. According to Wegner [1983, p. 30], for 1Ada is a registered trademark of the U.S. Govern- example, “abstraction and reusability are ment, Ada Joint Program Office. two sides of the same coin.” He states z Implementation is also common terminology for that every abstraction describes a related the lowest level of detail in an abstraction. How- collection of reusable entities and that ever, implementation is often associated with exe- cutable source code, which is not necessarily the every related collection of reusable enti- case for all abstractions. For this reason, realiza- ties determines an abstraction. tion is used in this survey.

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 135

Abstraction Specification /0Rep ~ / Specification M I

Realization M

L

Realization L ,tjRep 1 / Figure 1. Two-level abstraction hierarchy. Figure 2. Mappimz. . from a variable abstraction specification. of abstraction M, Rep 3 is the abstrac- tion specification, and Rep 2 is the ab- straction realization. rather an arbitrary decision made by the An abstraction has a hidden part, a creator of the abstraction. The creator variable part, and a fixed part. The hid- decides what information will be useful den part consists of the details in the to users of the abstraction and puts it in abstraction realization that are not visi- the abstraction specification. The creator ble in the abstraction specification. The also decides which properties of the ab- variable and fixed parts are visible in straction the user might want to vary the specification. The variable part rep- and places them i.n the variable part of resents the variant characteristics in the the abstraction specification. Continuing abstraction realization, whereas the fixed with the stack example, the value for the part represents invariant characteristics maximum stack depth can be placed in in the abstraction realization. Therefore, either the variable, fixed, or hidden part as illustrated in Figure 2, an abstraction of the stack abstraction. If it is placed in specification with a variable part corre- the variable part, the user has the ability sponds to a collection of alternate real- to choose the maximum stack depth (e.g., izations. The variable part of an abstrac- 10, 1000, unbounded). If the maximum tion specification maps into the collection stack depth is placed in the fixed part, of possible realizations. the user knows the predefine value of For example, in an abstraction for maximum stack depth but cannot change stacks, the fixed part of the abstraction it. If placed in the hidden part, the stack expresses the invariant characteristics depth is totally removed from the con- for all stack realizations, such as the cerns of the user. last-in-first-out (LIFO) semantics. The Abstraction specifications and realiza- invariant stack behavior does not, depend tions can take on many forms. They can on the type of elements stored in the be formal or informal, explicit or implicit. stack, so the element type can be in the Consider the stack example written as a variable part of the abstraction. Then generic Ada package. The abstraction re- each different element type corresponds alization corresponds to an instantiation to a different stack realization. of the generic package with a particular The partitioning of an abstraction into stack element type. The abstraction spec- variable, fixed, and hidden parts is not ification, on the other hand, must be an innate property of the abstraction but a combination of different descriptions

ACM Computing Surveys, Vol. 24, No. 2, June 1992 136 “ Charles W. Krueger because of Ada’s limited expressiveness. Integration. To integrate a reusable ar- The generic package can provide the syn- tiGct into a software system effec- tactic specification for operations of the tively, the user must clearly under- stack abstraction, but the semantic spec- stand the artifact’s interface (i.e., those ification must be expressed outside of the properties of the artifact that interact Ada language. One possibility is to use a with other artifacts or the integration formal notation such as Hoare axioms. framework). An artifact interface is an Another is to use an informal description abstraction in which the internal de- such as English text. tails of the artifact are suppressed. Semantic specifications are rarely de- rived from first principles. Abstraction 1.3 Cognitive Distance creators implicitly assume that a certain amount of the abstraction specification is The effectiveness of abstractions in a common knowledge among the users. As software reuse technique can be evalu- an extreme example, it might be suffi- ated in terms of the intellectual effort cient simply to give “stack” as a semantic required to use them. Better abstractions specification and assume that all users mean that less effort is required from the understand the stack abstraction with- user. To aid in evaluation, cognitive dis- out an explicit formal description. tance is introduced as an intuitive gauge In summary, an abstraction expresses for comparing abstractions. a high-level, succinct, natural, and useful Cognitive distance is defined as the specification that corresponds to a less amount of intellectual effort that must be perspicuous realization level of represen- expended by software developers in order tation. The abstraction specification typi- to take a software system from one stage cally describes “what” the abstraction of development to another. From this def- does, whereas the abstraction realization inition, it should be clear that cognitive describes “how” it is done. For an ab- distance is not a formal measurement straction to be effective, its specification that can be expressed with numbers and must express all of the information that units. Rather, it is an informal notion is needed by the person who uses it [Shaw that relies on intuition about the relative 1984]. This may include space/ time effort required to accomplish various characteristics, precision statistics, scala- software development tasks. bility limits, and other information not For the creator of a software reuse normally associated with specification technique, the goal is to minimize cogni- techniques [McIlroy 1968]. tive distance by (1) using fixed and vari- able abstractions that are both succinct 1.2 Abstraction in Software Reuse and expressive, (2) maximizing the hid- den part of the abstractions, and (3) Abstraction plays a central and often- using automated mappings from abstrac- times limiting role in each of the other tion specification to abstraction realiza- facets of software reuse: tion (e.g., compilers). This can be sum- Selection. Reusable artifacts must have marized in an important truism about concise abstractions so users can effl- software reuse: ciently locate, understand, compare, and select the appropriate artifacts For a software reuse technique to be effectwe, It from a collection. must reduce the cogmtwe distance between the mitlal concept of a system and Its final executable Specialization. A generalized reusable Implementation, artifact is in fact an abstraction with a variable part. Specialization of a gen- This truism, along with others that arise eralized artifact corresponds to choos- later in the survey, are obvious and ing an abstraction realization from the seemingly simple requirements on soft- variable part of an abstraction specifi- ware reuse techniques that have proven cation. difficult to satisfy in practice.

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 137

2. ORGANIZATION OF THE SURVEY These imprecision, however, should not detract from our goal, which is to explore We partition the different approaches to the interesting issues in software reuse. software reuse into eight categories: high-level languages, design and code 3. HIGH-LEVEL LANGUAGES scavenging, source code components, software schemas, application genera- High-level languages such as C, Ada, tors, very high-level languages, transfor- Lisp, Smalltalk, and ML have not been mational systems, and software architec- treated in the literature as examples of tures. They are presented in Sections 3 software reuse. From the perspective of through 10, respectively, beginning with software developers prior to the existence reuse techniques that rely on low-level of these languages, however, the goals abstractions such as assembly language and achievements for high-level lan- patterns and progressing through higher guages have strong parallels to the cur- level abstractions. rent-day aspirations of software reuse re- Each of the eight software reuse cate- searchers. Given their marked level of gories are discussed according to the fol- acceptance and success (e.g., a factor of 5 lowing taxonomy: speedup in writing code [Brooks 197.51), Abstraction. What type of software arti- it is interesting to examine high-level language technology in terms of software facts are reused and what abstractions reuse. are used to describe the artifacts? Selection. How are reusable artifacts 3.1 Abstraction in High-Level Languages selected for reuse? Specialization. How are generalized ar- Before high-level languages were devel- tifacts specialized for reuse? oped, software development was done Integration. How are reusable artifacts mainly with assembly languages. As de- integrated to create a complete soft- velopers discovered the inherent com- ware system? plexity of building large systems in this model, they began to search for more Each section ends with a summary of the effective ways to express computation. key taxonomic features for the category The common implementation patterns and a statement about its pros and cons. used in assembly language programming These summaries provide a convenient became the primitive constructs in the point of reference for comparing the dif- next generation, high-level languages. ferent approaches to software reuse. Examples include iteration, branching, Although the categories and the taxon- arithmetic expressions, relational expres- omy illustrate different approaches and sions, data declarations, and assignment. issues in software reuse, they are not The reusable artifacts in a high-level precise. The real software engineering language are assembly language pat- technologies we will examine often have terns. A high-level language is at a level features that fit into several categories. of abstraction above the assembly lan- For example, different features of the Ada guage artifacts. That is, high-level lan- language have relevance in four cate- guage constructs are abstraction speci- gories: High-Level Languages, Design fications, whereas the corresponding and Code Scavenging, Source Code Com- assembly language artifacts are abstrac- ponents, and Program Schemas. Like- tion realizations. wise, the relative importance of the four For example, consider a conditional facets in the taxonomy differs among the statement in a high-level language: different reuse categories. For example, If ((expression)) men application generators emphasize the (statements, ) specialization of a single highly ab- else stracted artifact but typically do not re- {statements,) quire selecting or integrating artifacts. endif

ACM Computmg Surveys, Vol. 24, No. 2, June 1992 138 ● Charles W. Krueger

This reusable template is an abstraction described in the previous paragraph. The specification that directly maps to an ab- larger, encapsulating constructs such as straction realization—the reusable as- procedures, packages, or modules are in- sembly language pattern. Referring back tegrated with a module interconnection to Figure 2, the variable parts in the language or with scope rules. At both the abstraction specification are the ( expres- statement and module levels, the inte- sion ) and two ( statements) slots. The gration framework of a high-level lan- fixed part of the abstraction specification guage defines how individual constructs corresponds to the semantic description are composed to form a complete soft- of the conditional statement, given in the ware system. That is, every language language reference manual: construct in a fully integrated program has a well-defined operational semantics The expression IS evaluated. If true, execute statements,, otherwise statements. that describes its effect on the run-time data and the run-time control flow The hidden part of the abstraction in- [Loeckx and Sieber 1984]. (Module inter- cludes all of the assembly language de- connection languages are addressed in tails such as temporary data in the eval- more detail in Section 5.4.) uation of the expression. Programmers use only the fixed and variable parts of the abstraction specifi- 3.5 Appraisal of Reuse Techniques in cation. They never see the actual assem- High-Level Languages bly language artifacts that are reused From the perspective of today’s software because the mapping from specification development technology, an analysis of to realization is fully automated by the high-level languages is almost trite. . These languages are often the lowest level of abstraction used by software 3.2 Selection developers. The accomplishments, Since there is a relatively small number strengths, and weaknesses of high-level of reusable artifacts (i.e., language con- languages are well known. However, it is structs) in a high-level language, it is not widely recognized that high-level lan- relatively simple for a programmer to guages are examples of software reuse. select among them. The language refer- Nor is it recognized that, in many ways, ence manual and tutorial examples help high-level language technology is a a novice make the appropriate selections paragon of software reuse that re- in a particular language. A programmer searchers currently can only hope to em- can typically master a high-level lan- ulate. For example, discovery of a new guage in a matter of days or weeks. reuse technology that routinely offered a factor of 5 speedup in software develop- 3.3 Specialization ment would be among the most signifi- cant software engineering achievements Most high-level 1anguage constructs are of the decade. generalized constructs with parametri- High-level language technology pro- zed dots. For example, conditional state- vides a good example of how abstraction ments typically have parameterized slots impacts the effectiveness of software for Boolean expressions and statements. reuse techniques. Compared to using as- Programmers specialize the generalized sembly languages, high-level languages language constructs by recursively filling are considerably more succinct and have the parameterized slots with other lan- a more natural form for expressing and guage constructs of the appropriate type. reasoning about programs. In addition, programmers are largely unaware of the 3.4 Integration reuse that takes place. They use only the Programmers integrate statement-level fixed and variable parts of the abstrac- constructs by recursive specialization as tion specification, while the mapping

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 139

Table 1. Reuse in High-Level Languages

Abstraction The reusable artifacts in a high-level language are assembly language patterns. High-level language constructs serve as abstraction specifications for low-level assembly language patterns. Selection The semantics for each of a small number of constructs is described in a language manual. Experienced programmers easily commit the collection of constructs to memory, which facilitates selection. Specialization Language constructs are typically generalized with parameterized slots. Con- structs are specialized by recursively filling the parameterized slots with constructs of the appropriate type. Integration Language constructs are integrated by recursive specialization, module intercon- nection rules, or scope rules. The operational semantics of the language defines the effect of integrating constructs on the run-time data and control flow. Pros Reusable assembly language patterns provide a factor of 5 speedup in develop- ment time compared to programming direct] y in assembly language because (1) high-level language constructs are succinct and natural, (2) compilers fully automate the mapping from abstraction specification to abstraction realization, and (3) programmers are completely isolated from compiler and assembly language details. Cons Due to the need for s,ystem design prior to coding, there is still a large cognitive distance between the informal requirements for a software system and its implementation. High-level languages are at a relatively 1ow level of abstraction.

from specification to realization is fully The goal of design and code scavenging automated by the compiler. Program- is to reduce the cognitive effort, the num- mers do not have to examine or under- ber of keystrokes, and therefore the stand the internals of the compiler or the amount of time required to design, imple- assembly language output, which com- ment, and debug a new software system. pletely isolates them from the details of Scavengers copy as much as possible from the hidden and realization parts of the analogous systems that have already reuse abstraction. been designed, implemented, and de- The primary limitation of high-level bugged. languages as a reuse technology is the large amount of system design effort re- 4.1 Abstraction in Design and Code quired prior to coding—high-level lan- Scavenging guage programming comes late in the software development life cycle. Thus, In design and code scavenging, the ab- there is a large cognitive distance be- stractions used by a software developer tween the informal requirements for a are mostly informal, and they often exist software system and its implementation only in the mind of the developer. The in a high-level language. Table 1 summa- abstractions are concepts about design rizes the high-level language approach to and programming that a software devel- software reuse. oper has learned from experience. For- mal education, where students are taught practical design concepts, provides an- 4. DESIGN AND CODE SCAVENGING other source of abstraction for scav- Many programmers adopt an ad hoc, al- engers. though effective, approach to reusing The reusable artifacts in scavenging software system designs and source code. are source code fragments. In code scav- They scavenge fragments from existing enging, a contiguous of source code software systems and use them as part of is copied from an existing system. In de- new software development. Experienced sign scavenging, a large block of code is software developers are known to gain copied, but many of the internal details great leverage from reusing previous de- are deleted while the global template of signs [Biggerstaff and Richter 1989]. the design is retained. There is, of course,

ACM Computing Surveys, Vol. 24, No. 2, June 1992 140 “ Charles W. Krueger a continuum between code scavenging 4.3 Specialization with dense code blocks and design scav- A programmer specializes a scavenged enging with sparse design templates. code fragment by manually editing it. A software developer creates an ab- The fragment is edited to resolve differ- straction for an existing design or code ences between the original context from fragment by remembering an abbrevi- where it was scavenged and the new con- ated description. For example, the devel- text where it will be reused. For exam- oper might remember the semantics of a ple, a scavenged stack implementation stack and the location of some software might push and pop integer elements, that implements a stack. Then, if the but a new implementation might need developer needs a stack in a new soft- string elements. The programmer must ware system, the existing implementa- thoroughly understand the lowest-level tion can be scavenged. In this case, the details in the code fragment to adapt it to abstraction realization is the existing its new context correctly. stack source code. The abstraction speci- fication is the abbreviated description in the memory of the software developer.3 4.4 Integration In code scavenging, the software devel- Integrating a scavenged code fragment oper is ultimately involved with all parts into a new context means that the pro- of the abstraction—the abstraction speci- grammer has to modify the fragment, the fication, the abstraction realization, and context, or both. For example, variable the mapping from specification to real- names in a scavenged code fragment may ization. Therefore, there is no hidden part collide with existing variable names in of the abstraction. Initially the developer the new system. This conflict can be re- recalls an existing artifact from a mental solved by either modifying the context or abstraction specification. The mapping the fragment. Modifying the fragment from specification to realization corre- corresponds to specialization, as de- sponds to the developer manually locat- scribed in the previous paragraph. The ing the existing artifact and modifying it issues for modif~ng the context are es- for reuse. Modification and integration sentially the same. are performed at the realization level of the abstraction, which forces the devel- 4.5 Appraisal of Reuse Techniques in Design oper to become intimately involved with and Code Scavenging details in the realization. In ideal cases of scavenging, the software developer is able to find large fragments 4.2 Selection of high-quality source code quickly that The abstractions a software developer has can be reused without significant modifi- in his or her head can be thought of as a cation. In these cases, the payoff is high. The developer goes directly from an in- library of reusable designs and source formal abstraction of a design to a fully code. While designing a new software system, the developer may notice simi- implemented source code fragment. That larities between the new system and ab- is, the cognitive distance between the ini- stractions in the library. The developer tial concept of a design and its final exe- can directly reuse these abstractions in cutable implementation is small. the new design, or the abstractions can In practice, the overall effectiveness of be used as an index to locate existing code scavenging is severely restricted by its informality. A programmer can only source code for scavenging. scavenge those code fragments he or she remembers or knows how to find. There is no systematic way to share fragments

3 I am being informal and taking great liberties among many different programmers. with the notion of abstraction in this section. Specialization and integration are ineffi-

ACM Computmg Surveys, Vol. 24, No, 2, June 1992 Software Reuse w 141

Table2. Reuse in Design and Code Scavenging

Abstraction The reusable artifacts in scavenging are source code fragments. The abstractions for these artifacts areinformal concepts that asoftware developer has learned from design and programming experience. Selection When apro~ammer recognizes that part ofanew application is similar to one previously written, a search for existing code may lead to code fragments that can be scavenged. Specialization A programmer specializes a scavenged code fragment by manually editing it. The programmer must thoroughly understand the lowest level details of the code fragment in order to adapt it correctly to its new context. Integration Integrating a scavenged code fragment into a new context means that the programmer has to modify the fragment, the context, or both. Pros In ideal cases of scavenging, a software developer is able to adapt large fragments of source code without significant modification. In these cases, cognitive distance is small. Cons In the worst cases, a software developer spends more time locating, understand- ing, modifying, and debugging a scavenged code fragment than the time required to develop the equivalent software from scratch.

cient because they require a thorough software reuse by proposing an industry understanding and direct editing at the of off-the-shelf source code components. realization level (source code). The pro- These components were to serve as build- grammer may introduce errors while ing blocks in the construction of larger modifying the fragment, so the valida- systems. Given a large enough collection tion, testing, and debugging must be re- of these components, software developers peated each time a code fragment is scav- could ask the question “What mechanism enged for a new context. shall we use?” rather than “What mecha- These limitations lead to the second nism shall we build?” truism of software reuse: The nature of a reusable component technology strongly depends on the lan- For a software reuse technique to be effectwe, It must be easier to reuse the arhfacts than It is to guage in which it is implemented. The develop the software from scratch. components proposed by McIlroy in 1968 were assembly language or sub- Efforts to extend the scope of code scav- routines and therefore emphasized enging, especially across multiple pro- reusable functions. Examples of collec- grammers, often violate this simple con- tions of reusable functions include statis- straint. For example, trying to scavenge tics libraries such as SPSS and numeri- unfamiliar code from a friend may result cal analysis libraries such as IMSL. in spending more time locating, under- Modern languages have a richer collec- standing, modifying, and debugging a tion of program units, such as modules, code fragment than the time required to packages, subsystems, and classes. The develop the equivalent software from type of reusable components that can be scratch. Scavenging is inherently limited written in these languages is not limited by the fact that developers are not iso- to functions but can also include data- lated from the details in the abstraction centered artifacts such as abstract data realization level. They frequently work at types [Deutsch 1989; EVB Software 1985; the same level of abstraction while scav- Ichbiah 1983; Parnas et al. 1989]. That enging as they do when building soft- is, these languages make a clear distinc- ware from scratch. Table 2 summarizes tion between control abstraction and data the design- and code-scavenging ap- abstraction [Goguen 1986, 1989]. As an proach to software reuse. example of reusable data-centered com- ponents, Booth [ 198’7] defines reusable 5. SOURCE CODE COMPONENTS Ada packages for 11 abstract data types: McIlroy’s [1968] Mass Produced Software stacks, lists, strings, queues, deques, Components introduced the notion of rings, maps, sets, bags, trees, and graphs.

ACM Computing Surveys, Vol. 24, No. 2, June 1992 142 ● Charles W. Krueger

He also gives reusable function-centered tions to describe components. These ab- components for control abstractions such stractions are universally understood by as sorting and searching. all programmers in the application do- The goal of off-the-shelf components is main. For example, in numerical analy- to reduce the cognitive effort, the number sis libraries, reusable components for sine of keystrokes, and therefore the amount and matrix multiply do not require de- of time required to design, implement, tailed abstraction specifications. The and debug a new software system. The names alone serve as complete abstrac- one-time cost of creating a reusable com- tions for the users based on their count- ponent is further amortized each time it less hours studying high school and col- is reused. Reusable com~onents. have lege mathematics. Another example is .proven fruitful in a few narrow areas Booth’s abstract data types (stacks, lists, such as numerical analysis, but there has strings, queues, deques, rings, maps, been less success with collections of gen- sets, bags, trees, and graphs), where the eral-purpose components. names serve as well-defined abstractions for most computer scientists. In both of these examples, the source code compo- 5.1 Abstraction in Source Code Components nents are often accompanied by informal Component reuse is analogous to code natural language descriptions, which scavenging in that programmers copy ex- provide more detailed abstraction specifi- isting source code artifacts into new soft- cations for the components. ware systems. However, reusable com- As with code scavenging, the software ponent technologies typically use more developer reusing source code compo- systematic techniques such as catalogs nents is often involved with all parts of and libraries of components. In addition, abstraction. Initially the developer uses reusable components are created and the abstraction specification to locate an stored specifically for the purpose of appropriate component. The mapping reuse, whereas with scavenging the from specification to realization corre- reused artifacts are extracted from soft- sponds to the developer locating the com- ware that was not written with reuse in ponent in the library and (if necessary) mind. modifying it for reuse. Modification, if A major challenge for researchers try- needed, and integration are performed at ing to implement large libraries of the realization level of the abstraction, reusable components is to find concise which might force the developer to be- abstractions for the components. Without come involved with details in the compo- abstractions, the user of a component li- nent implementation. brary must examine the source code to determine what each component does. 5.2 Selection Although the source code is appropriate for the abstraction realization level, it is The third truism of software reuse em- unreasonable to expect a user to search phasizes a characteristic problem with through a large library of source code to reusable components: find an appropriate component. The li- To select an artifact for reuse. you must know what brary implementor must provide abstrac- It does. tion specifications that succinctly de- scribe component behavior. The library When faced with a library of source implementor must also provide classifi- code components, the user needs a level cation and retrieval schemes so users can of abstraction that emphasizes what the efficiently search for specific components. components do as opposed to the source It is interesting to note that the best- code that describes how it is done. Anna known successes with component reuse is an example of an Ada language exten- have been in application domains with sion for annotating components (i.e., Ada application-specific, “one-worp abstrac- packages) with semantic descriptions

ACM Computmg Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 143

[Luckham and von Henke 1984]. Al- ware developers with techniques to lo- though Ada maintains a clear separation cate components efficiently. Without such between the package declaration and the techniques, a developer must search package implementation, the package through a large, monolithic collection of declaration provides only syntactic infor- reusable components. This is analogous mation, which is not sufficient for de- to searching for a book on a particular scribing its behavior and intended use. subject “in a public library that has no With Anna, it is possible to “understand card catallog but has all books arranged how to use a package without inspecting alphabetically” [Embley and Woodfield its implementation in the private part 1987, p. 360]. This leads to the fourth and body” [Luckham and von Henke and final truism of software reuse (which 1984, p. 123]. is a special case of the second): Anna, which stands for Annotated Ada, To reuse a software arhfact effectwely, you must be is a language for formal annotations. An- able to find it faster than you could build It notations are predicates (i.e., Boolean- valued expressions) that express con- To address this problem, implementors straints on Ada language constructs such of a component library can organize com- as data objects, types, subtypes, subpro- ponents with a classification scheme and grams, and exceptions. For example, the then provide manual or automated re- following is an Ada subtype declaration trieval techniques. ‘l’he scheme for classi- for the even integers. The first line is the fication and r~trieval is another level of Ada declaration, and the second is an abstraction over all of the components in Anna annotation; it is impossible to ex- the library. For example, the IMSL [ 1987] press this subtype semantics in pure Ada math library has a three-volume manual [Luckham and von Henke 1984]: with components hierarchically classified by abstract computational or analytical subtype EVEN is INTEGER; capabilities. In addition, three different --lwhere X : EVEN = ) X mod 2 = O; indexes serve as inde~endent abstract in- The next example is a simple Anna terfaces to the libra~y: a keyword index specification for the behavior of a counter (KWIC), an ACM math software classifi- package (also from [Luckham and von cation index, and an alphabetical index. Henke 1984]). The two Anna annota- The keyword and ACM classification in- tions, which begin with --1, formally state dexes are both mappings from abstract that the value of a counter just after descriptions (i.e., abstraction specifica- initialization will be O and that the value tions) to reusable com~onents in the li- of the counter just after an increment brary (i.e., abstraction realizations ). About 1200 pages are used to describe will always be 1 more than its value just before the increment: and classify approximately 900 routines. Automated tools for component selec- package COUNTERS is tion may be necessary for component col- type COUNTER is limited private lections significantly larger than IMSL. function VALUE (C : COUNTER) return NAT- An interesting small-scale example is the URAL; netlib system for automatically distribut- procedure INITIALIZE (C: in out COUNTER); ing mathematical software components --l where out (vALLIE (C)= 0); via electronic mail [lDongarra and Grosse procedure INCREMENT (C : in out 1987]. Users’ queries and requests are COUNTER); --l where out (VALUE(C) = VALUE(in C) + 1); drafted according to a simple syntax and private sent to an Internet address. A server at that address parses incoming messages end COUNTERS; and returns responses to queries or re- quested software components. In addition to component descriptions, The net,lib components are all public a component library must provide soft- domain, and the netlib service is pro-

ACM Compd,mg Surveys, Vol. 24, No. 2, June 1992 144 “ Charles W. Krueger vialed free of charge. In December, 1988, sorting in which parameterization deter- the netlib source code collection consisted mines the type of elements that are of51 different software packages (includ- sorted, the indexing scheme into a collec- ing Linpack, published ACM algorithms, tion of elements, the partial ordering Minpack, fft packages, and eigenvalue among elements, and so on. packages) and occupied about 75MB of storage. Although this system is limited 5.4 Integration to mathematical software and is rela- Module interconnection languages pro- tively unsophisticated in terms of soft- vide a framework for integrating reusable ware classification and retrieval, it elicits images of an international distribution components [DeRemer and Kron 1976]. system for a wide range of reusable soft- Integrating reusable components into a software system is essentially the same ware artifacts. as integrating components built from scratch. Module interconnection lan- 5.3 Specialization guages are an integral part of many pro- As with code scavenging, programmers gramming languages such as Modula-3, can specialize reusable components by Standard ML, and Ada, which makes directly editing the source code. This ap- these languages particularly good candi- proach, however, has the same draw- dates for implementing reusable compo- backs that were noted with code scaveng- nents. Readers not familiar with module ing: interconnection languages might be in- terested in a survey by Prieto-Diaz and ● Editing source code forces the software Neighbors [ 1986]. developer to work at a low level of Naming and binding conflicts can pre- abstraction. The effort required to un- sent problems for the software developer derstand and modify the low-level de- integrating a reusable component into the tails of a component offsets a signifi- context of a new system, Names im- cant amount of the effort saved in ported into and exported from the compo- reusing the component. nent may clash or be incorrectly bound in ☛ Editing source code may invalidate the the new system. correctness of the original component. Ada provides mechanisms to overcome This eliminates the ability to amortize some of these naming problems [Ichbiah validation and verification costs over 1983]. The dot notation can be used to the life of a reusable component. disambiguate name clashes. For exam- Implementors of reusable components ple, if package P and Q both define func- can offer a more efficient approach to tions named F, they can be uniquely specialization with generalized compo- identified as P.F. and Q. F. Overloading nents and construction-time parameters. allows name clashes to be automatically Software developers specialize these disambiguated by the compiler on the components by setting parameters rather basis of differences in the type signa- than directly editing source code. Well- tures. And synonyms can be used in known examples of this approach include situations in which dot notation is too parameterized expansions and Ada cumbersome or overloading becomes con- generics [Ichbiah 1983]. Parameterized fusing. components are a form of abstraction. The UNIX4 pipe is another example of For example, the Ada generic package an integration framework for compo- specification represents a set of possible nents [Kernighan 1984]. The reusable package realizations, or instantiations. components in this paradigm are com- The generic parameters correspond to the plete programs. Programs are integrated variable part of the abstraction specifica- tion. Mendal [1986] gives an in-depth example of a generic Ada package for 4 UNIX is a trademark of AT & T.

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse “ 145 by sequentially “piping” the output from data representation and fetch operation one program into the input of another defined in indexed collection are reused. program. For example, the UNIX com- Subclassing, therefore, corresponds to mand who is a program that outputs the specialization of a superclass. That is, list of persons logged onto a UNIX sys- the superclass is a template that can be tem, one per line. The UNIX command lC extended by a programmer to create a counts the number of linefeeds in an in- subclass. Everything in the superclass put string. Piping the output of who into template is reused in the subclass lC produces a new program that counts [Lieberherr and Riel 1988]. The effort the number of persons logged into a UNIX required to produce a subclass is propor- system: tional to the dissimilarity between the subclass and the superclass [Deutsch who I IC 1983]. where the I designates a pipe, Liskov [1987] discusses different forms of inheritance and the effect that each form has on the encapsulation of classes. 5.5 Object-Oriented Variants to In cases in which encapsulation is com- Component Reuse promised for a class, reusability is Most of the abstraction, selection, spe- likewise compromised. When the imple- cialization, and integration issues out- mentation of a class depends on imple- lined earlier in this section also apply to mentation details in a superclass, the object-oriented languages. The notion of external dependency makes it difficult to inheritance from the object-oriented understand and reuse the class. paradigm, however, offers a unique con- Organizing classes into a subtype hier- tribution to component reuse [Meyer archy he] ps the software developer locate 1989]. and select reusable classes. Similar classes in a hierarchy are grouped close together. The path from the root of a 5.5.1 Inheritance and Subclasses class hierarchy down to a particular class Class definitions in an object-oriented is a path from a very general abstraction language such as Smalltalk are primar- for the class down to more specific ab- ily a data abstraction mechanism. Inheri- stractions for the class. A developer look- tance and subclassing enhance the data ing for a reusable class navigates through abstraction mechanism by establishing a the hierarchy until the closest match is hierarchical relationship among classes located [Liskov 1987]. and by allowing reuse to occur within the class hierarchy [Deutsch 1989; Liskov 5.5.2 Multiple Inheritance 1987]. For example, an indexed collection of items might be defined as a class (ab- Some languages offer multiple inheri- stract data type) with a hidden storage tance as an extension to the inheritance representation and visible operations to mechanism just described. Multiple in- store and fetch the zth item in the collec- heritance allows a class to have two su- tion and to get the size of the collection. perclasses, inheriting the data represen- Subclasses of the index collection can tations and operations from both [Stefik reuse the storage representation and the and Bobrow 1986]. Multiple inheritance fetch operation, while extending the type is therefore a mechanism for merging with new data and operations. For exam- multiple abstract data types into a single ple, a programmer can define a sequence type. as a subclass of the indexed collection There are several possible forms of superclass, with an additional operator multiple inheritance. The simplest is a for concatenation. The programmer only disjoint union of two or more super- has to write the concatenation operator classes, where the data representations for the sequence class definition since the and the operations from the superclasses

ACM Computing Surveys, Vol. 24, No. 2, June 1992 146 ● Charles W, Krueger do not interact in any way. In this case ware developer to go directly from an the new subclass reuses the properties of informal requirement to a fully imple- all of its parents without modifying their mented and tested source code compo- semantics. For example, the Smalltalk nent. Thus, the cognitive distance be- read-write-stream subclass is composed tween the informal concept and its final of multi~le. inheritance from the read- executable implementation is very small. stream superclass and the write-stream For components that do not have sim- superclass. ple abstractions, more general specifica- Another form of multiple inheritance is tion techniques are required. Unfortu- an intersecting union. In this case some nately most programming languages do of the data or operations from super- not come with good abstraction mecha- classes interact. For example, a stack _set nisms for describing component behavior class can be defined bv multi~le inheri- and for doing classification and retrieval. tance from a set supe~class aid a stack And although some progress has been superclass. The stack _set maintains a made using programming language ex- stack and a set as (conceptually) a single tensions such as Anna, these formal data type. The semantics are defined so specification languages are not without that the set always contains all objects in problems. The primary drawback is that the stack, and as usual the set is an nontrivial specifications are difficult to unordered collection containing no durdi- create and understand—specifications cates. Adding and removing it~ms on ~he can often be as opaque as source code stack _set is allowed only through the [Horowitz and Munson 19891. This off- push and pop operations. The insert and ~ets the benefits of using them as ab- delete operations for set are disabled. The straction specifications for reusable com- push and pop operations must be modi- ponents; that is, it makes the cognitive fied to update the set portion of the distance relatively high. stack _set in addition to the original op- Reusable components can be special- erations on the stack [Habermann et al. ized either by editing the source code 19881. directly or with mechanisms such as the In ‘this example, modifying the push Ada generic. Generics provide a level of and pop operations in the stack _set de- abstraction that isolates the software de- tracts from the reusability of those oper- veloper from many implementation de- ations since the software develo~er. must tails. Generics can also reduce the risk of manuallv . modifv them, There has been inadvertently introducing errors when some research &ing declarative specifi- the component is reused. Using a formal cations that allow the software develo~er specification language such as Anna, an to merge operations from multiple super- Ada generic package can be verified to be classes almost as easily as merging the correct for all legal instantiations of that data [Kaiser and Garlan 1987, 1989]. package. Thus t~e user does not have to reverify the reused component. Ada generics can only be parametri- 5.6 Appraisal of Reuse Techniques in Source zed with language constructs such as Code Components data types, data objects, and functions. Compared to code scavenging, reusable In Section 6 we will see how m-eater component libraries can be considerably degrees of specialization are poss~ble by more effective since components are writ- parameterizing components with higher ten, collected, and organized specifically level abstractions, such as space/time for the purpose of reuse. The most suc- performance tradeoffs. cessful reusable component systems, such Creating a relatively complete and as the IMSL math library, rely on concise practical library of reusable components abstractions from a particular applica- is a formidable challenge. Library imple- tion domain. One-word abstraction speci- mentors must have the theory, foresight, fications such as sine often allow a soft- and means to produce a collection of com-

ACM Comput]ng Surveys, Vol. 24, No. 2,, June 1992 Software Reuse w 147 ponents from which software developers these l-line components is equivalent to can select, specialize, and integrate to and as equally cost-effective as program- satisfy all possible software development ming directly in the source language. A requirements. This is currently possible library of components of length 6 might to a limited degree for specific applica- be six times more cost effective to use, tion domains that have a rich and thor- but there are 1,000,000 (lOG ) possible ough theoretical foundation, such as components of length 6. In general there statistical analysis. General-purpose li- are 10N candidate components for popu- braries, however, remain elusive for at lating a library of iV-line components in least two reasons: this model. It is therefore necessary for a library implementor to identify a small e The implementation characteristics subset of large granularity components and tradeoffs for data structures and that can serve as building blocks for soft- computations are widely variable. ware systems. Table 3 summarizes the 0 Library size grows rapidly with respect source code component approach to soft- to general-purpose component srize.- ware reuse.

Computer science provides general- 6. SOFTWARE SCI-IEMAS purpose building blocks such as stacks and lists, but these do not always repre- Software schemas are a formal extension sent the dominant portion of the compu- to reusable software components. tation needed in large software systems Reusable components often rely on ad [Booth 1987]. Yet even to construct an hoc extensions to programming lan- admittedly modest library of variations guages to implement reuse techniques on 11 abstract data types (stacks, lists, such as specification, parameterization, strings, queues, deques, rings, maps, classification, and verification. With soft- sets, bags, trees, and graphs) Booth ware schemas, however, these mecha- [1987] had to create 501 parameterized nisms are an integral part of the technol- components. In order to capture to a rea- ogy. sonable degree of tradeoffs in precision, Although the intent of software granularity, range, and robustness, McIl- schemas is similar to that of reusable roy [1968] envisioned a library with 300 components, the emphasis is on reusing different variations of the “lowly sine abstract algorithms and data structures routine.” Projecting from these examples, rather than reusing source code. To the it appears that a library providing a ver- software developer using reusable satile foundation for building software schemas, the source code representation systems must be populated with a very will be much less prominent than the large number of components. abstract notation for describing what the With general-purpose components, big- schema is intended to do, under what ger is better. A software developer who conditions it can be reused, and how to constructs a software system by select- go about reusing it. ing, specializing, and integrating 5 The PARIS system is a representative 1000-line components is typically going schema technology [Katz et al. 1989]. The be more effective than a developer using reusable schema, or program templates, 1000 5-line components. Larger compo- in the PARIS libralcy are instantiated to nents are, however, more specific, and produce source code. Each schema has a therefore, more of them are needed specification that includes the following: to populate a general-purpose library ● [Biggerstaff and Richter 19891. A formal semantics description of the For example, consider a programming schema (e.g., what the schema does), language with 10 different types of state- ● Assertions for correctly instantiating ments. Ten different l-line components the schema (e.g., constraints on the can be built in this language, but using variable parts of the schema), and

ACM Computing Surveys, Vol. 24, No. 2, June 1992 148 ● Charles W. Krueger

Table 3. Reuse m Source Code Components

Abstraction Currently, the best abstractions for reusable components are domain-specific concepts, such as those found in the numerical analysis and statistical analysis packages. Specification languages such as Anna are general-purpose mecha- nisms for writing abstraction specifications for components. Selection Selection is easier in domain-specific libraries because components can be classified, organized, and retrieved using well defined properties of the domain. In a general-purpose component library, ease of component selection is depen- dent on the quality of the abstraction, classification, and retrieval schemes. Describing component behavior with a specification language can help software developers select among a small number of alternatives, but this approach is not suitable for a manual search in a large library. Specialization Generalized components with construction-time parameters represent the most effective approach to component specialization, Directly editing component source code is a less-controlled and less-efficient means of specialization. In some object-oriented languages, inheritance offers another means of reuse with superclass specialization. Integration Module interconnection languages typically provide the framework for integrat- ing components. Naming and name binding are Important module interconnec- tion issues in component reuse since reusable components are constructed independently of any particular context. Pros Components are written, tested, and stored specifically for the purpose of reuse. Some application domains, such as numerical analysls, are particularly well suited to reusable component techniques since they have well-defined component abstractions. In these cases, cognitive distance is small. Cons Components without well-defined abstractions must be described with natural language or specification language descriptions. These descriptions can often be as difficult to understand as source code, thereby increasing the cognitive distance. Also, general-purpose component libraries will have to be very large, making them difficult to build and use.

. Assertions for the valid use of instanti- facts. The abstraction specification for a ated schema (e.g., preconditions and schema is a formal exposition of the algo- postconditions for the source code). rithm or data structure, whereas the ab- straction realization corresponds to the A software developer begins an inter- source code produced when the schema is action with PARIS by giving a problem instantiated. The fixed part of the ab- statement—a formal set of computational straction specification formally describes requirements. PARIS then assists in the invariant computation or data struc- finding a schema that can be instanti- ture of the schema, for example, sorting ated to satisfy the problem statement with respect to a particular partial order- provably. ing or a queue of an unspecified element type. The variable part of the abstraction specification describes the range of op- 6.1 Abstraction in Software Schemas tions over which a schema can be instan- Knuth’s [1973] book, The Art of Com- tiated, such as the overflow length for a puter Programming, provides a library of queue. abstract descriptions for many basic com- With schemas, the variable parts of puter science algorithms and data struc- the abstractions can be more general, or tures. Programmers can informally use higher level, than is possible with pa- these abstractions when writing source rametrized components such as the Ada code. The goal of schema researchers is generic. Parameters in generics are lim- to capture abstractions formally at this ited to a subset of programming lan- level for their schema libraries [Standish guage constructs, whereas in schemas 1984]. they can range over other software prop- In the software schema approach, the erties such as space/time efficiency algorithms and data structures captured tradeoffs. An even more complex type of by the schemas are the reusable arti- parameterization is possible with nested

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 149 schemas, where parameters in a schema represent the variable part of the schema can be instantiated with other schemas. abstraction. A software developer instan- With software schemas, software de- tiates such a schema by filling in the velopers select, specialize, and integrate empty boxes with nested schemas. schemas at the abstraction specification PARIS uses temporal logic assertions level. In addition, the mapping from the and quantified logic predicates in schema specification to the source code realiza- preconditions, postconditions, and invari- tion (i.e., schema instantiation) can be ant [Katz et al. 1989]. The following automated, in which case developers are example is a PARIS description for an isolated from the details in the abstrac- insertion module that inserts an item into tion realization level. This is analogous an ordered list. The APPLICABILITY CON- to high-level languages and in contrast to DITIONS, or preconditions, state that the code scavenging and reusable compo- list is ordered before the insert. The RE- nents. SULT ASSERTIONS, or postconditions, As with reusable components, a major state that the list is ordered after the challenge to implementors of software insertion and that the new item exists in schemas is to find suitable abstractions the list. and abstraction representations for the schemas. Rich and Waters [1989, p. 3131 suggest that the limited success in com- APPLICABllLiw corwmor~s (ww) ponent and schema reuse “stems not from Ieq: D1 x DI + boolean --less than or any resistance to the idea, nor from any --equal to lack of trying, but rather from the diffi- forall n I n element_of node : culty of choosing an appropriate formal- n+neXt!= NULL=) ism for representing components.” Differ- Ieq(n + data, n + next + data) ent algorithms and data structures have RESULT ASSERTIONS (POST) diverse properties, and this diversity Ieq: D1 x 1]1 + boolean --less than or must be expressible in schema represen- --equal to tations. For example, matrix add is best equal: D1 x D1 + boolean defined by a step-by-step operational se- forall n I n element_of node : mantics description of how the computa- n+ next!= NULL=) tion is accomplished, whereas a dead- Ieq(n + data, n + next + data) lock-free scheduler is best described by a declarative set of constraints on what exists n I n element node : the computation must accomplish [Rich equal(n + data, newdata) and Waters 1989]. In an attempt to provide a diverse base for describing reusable artifacts, the Plan In Goguen’s [1986] Library Intercon- Calculus5 in the Programmer’s Appren- nection L,anguage (LIL), axiomatic theo- tice combines ideas from flowcharts, data ries with varying degrees of formality are abstraction, logical formalisms, and pro- attached to schema parameters, and gram transformations [Waters 1985]. A views describe how schema instantiation flowchart representation is used to de- satisfies the theories. The following ex- scribe algorithmic aspects of a schema ample is a theory for partially ordered such as control flow and data flow. sets (POSET): Flowcharts can be annotated with logical preconditions and postconditions to cap- theory POSET is ture the declarative aspects of the types ELT --elements m the set functions Ieq --less ihan or equal to schemas. Empty boxes within a flowchart vars El E2 E3 : ELT axioms (El Ieq El) 5 Using the terminology of Rich and Waters [1989], (El leq E3 if El Ieq E2 and E2 leq E3) plan corresponds to the notion of software schema (El = E2 if El Ieq E2 and E2 Ieq El) in this paper. end POSET

ACM Computing Surveys, Vol. 24, No. 2, June 1992 150 “ Charles W. Krueger

A schema implementor can attach this Techniques for describing components in- theory to any schema that is a partially clude natural language descriptions, in- ordered set. The implementor uses a view put/output tables to describe the device to bind the type ELT and the function function, logic-level schematics (gates, LEQ in the theory to the corresponding inverters, flip-flops, etc.), transistor-level constructs in the schema. schematics, timing diagrams to describe Other examples of schema formalisms temporal relationships, plots of linear include Volpano and Kieburtz’s [1985, characteristics such as delay time versus 1989] Software Templates with an ML- device temperature, and the range of like functional notation and Embley and normal operating conditions. The more Woodfield’s [1987] abstract data type detailed abstractions are typically rele- schemas with a regular expression nota- vant as the selection process proceeds to tion. deeper levels in the hierarchy. Analogous to The TTL Data Book, 6.2 Selection practical techniques for selecting reus- able software components or schemas can Reusable hardware components, such as use hierarchical classification and multi- TTL integrated circuits, are widely used ple forms of exposition. For example, the in hardware design. When lamenting IMSL mathematical library is a small- about the limited successes of software scale software library that has success- reuse, software engineering often draw fully used these techniques. It is interest- the analogy between software artifacts ing to note that the IMSL library and and hardware components. The analogy The TTL Data Book both contain approx- is particularly relevant when comparing imately 1000 components. the techniques for selecting TTL compo- General-purpose software libraries nents to the techniques for selecting soft- might contain several orders of magni- ware components and schemas. tude more candidates than the special- In Texas Instruments [1985] The TTL purpose IMSL library. In addition, soft- Data Book, approximately 1000 MSI/ ware libraries will probably evolve LSI TTL components are classified in a rapidly, and therefore it may not be prac- four-level hierarchy. The selection tical to maintain them with books or branching at each level in the hierarchy manuals [Biggerstaff and Richter 1989]. varies between 2 and 12 (the average Clearly, reuse technologies with a large being 5.6). Following is an example clas- number of reusable artifacts must pro- sification path for a frequency divider vide users with powerful techniques for component (S N74LS57), with the hierar- selecting artifacts. These selection tech- chy level preceding the selection made at niques must address three important that level; moving to deeper levels in the subproblems (analogous to The TTL Data hierarchy reduces the number of candi- Book): date components until only one remains:

● Classification of the artifacts in the (1) Counters library (2) Frequency dividers/multipIiers . Retrieval of artifacts from the library (3) 60-to-l frequency divider * E.~position of information necessary to (4) Low-power Schottky technology understand, compare, choose, and use If the user finds the simple abstrac- the artifacts tions presented at the branches insuffi- The following sections address these is- cient for making a choice, more detailed sues in detail. abstractions are available for the current candidates. For every component in the 6.2.1 Classification data book, there is a detailed exposition presenting different properties of the de- Prieto-Diaz describes a classification vice at different levels of abstraction. scheme for software artifacts [Prieto-Diaz

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse 9 151

1989; Prieto-Diaz and Freeman 1987]. In insert operation for a set and not encap- this scheme, artifacts are classified by sulated in the queue data type. This clas- two top-level descriptors and three sification scheme does, however, demon- lower-level facets within each descriptor: strate how classification techniques can be applied to software components, or Functionality. Describes the function schemas, analogous to hardware compo- the component is intended to perform. nents. The three lower-level facets of func- tionality, ordered by decreasing rele- 6.2.2 Retrieval vance to reusability, are (1) Function. Operation performed. PARIS is notable for its sophisticated as- (2) Object. Type of data object on sistance in locating software schemas which the operation is performed [Katz et al. 1989]. When presented with a problem statement, PARIS attempts to (3) Medium. Larger data structure in find a schema that satisfies the problem which the data object is located. statement. So that PARIS can match Environment. Describes the context for schemas to problem statements, each which the component was designed. schema in the library includes the follow- The three lower-level facets of envi- ing information: ronment, ordered by decreasing rele- vance to reusability, are Parameterization for the schema which (1) System type. Type of subsystem is a list of abstract entities that must for which the component was de- be supplied in order to use the schema. signed. A schema specification consisting of (2) Functional area. Application de- Assertions about the substitutions for pendent activities. the abstract entities, (3) Setting. Application domain. Preconditions for execution of an in- stantiation of the schema, The following is an example of a classi- Invariants for execution of an instan- fication for a file compression routine that tiation of the schema, was designed for a database manage- Postconditions for execution of an in- ment system: stantiation of the schema, Function: compress Temporal relations for the execution Object: flies of a: instantiation of the schema. Medium: disk System Type: file handler Proof of correcimess for the generic Functional Area: DB management schema specification. Setting: catalog sales This information is a detailed abstrac- The classification scheme allows only tion specification for a schema. Com- predefine values in the facets. The pre- pared to the syntactic specification of an define values for a particular facet are Ada generic, the semantic assertions that related by a conceptual closeness graph. accompany a PARIS schema provide a An automated evaluation system can more precise definition of how to reuse show the user a collection of conceptually an artifact correctly. That is, a PARIS similar artifacts. schema has a more narrow interface. There are practical limitations to this A problem statement is a set of compu- particular classification scheme. For ex- tational requirements. Its form is similar ample, it uses function as the primary to a schema specification in that there classification facet, which emphasizes the are preconditions and postcon ditions, but reuse of functional components and ig- it contains no parameterized sections as nores reuse of abstract data types. With do schemas. When presented with a this approach, the insert operation for a problem statement, PARIS first con- queue would be classified close to the structs a list of preliminary candidate

ACM Computing Surveys, Vol. 24, No. 2, June 1992 152 ● Charles W. Krueger

schemas by finding all schemas in the opment of any reuse system” [Biggerstaff library such that and Richter 1989, p. 6]. These abstrac- tions must be useful to both the user and * The preconditions of the problem state- the tools for automated retrieval. ment are a superset of the schema pre- Following is a list of exposition tech- conditions niques from The TTL Data Book (in bold- “ The postconditions of the problem face) and some suggestions for analogous statement are a subset of the schema software counterparts: postconditions Natural language descriptions. This The user then selects schemas from the is certainly useful for informal schema candidates and attempts to find instanti- descriptions and search techniques ation that will satisfy the schema asser- such as keyword lookup. But machine tions. Optionally, PARIS can attempt to understanding of natural language is instantiate a candidate schema automat- beyond current capabilities, which lim- ically. A theorem prover is used to verify its automated searching techniques that a particular instantiation satisfies [Rich and Waters 1988]. the problem statement and the asser- Input / output tables to describe the tions given for the schema. device function. Logic assertions can With PARIS, users perform schema be used to express the semantic func- searches at the abstraction specification tion of a schema (e.g., preconditions level and are therefore isolated from the and postconditions). They are not par- implementation details at the abstrac- ticularly useful for large, complex tion realization level. PARIS does not, schemas due to the complexity of the however, take advantage of the higher corresponding logic expressions. level abstractions understood by software Logic-level schematics. Data flow and developers. For example, a user cannot control flow diagrams such as those simply specify that a queue implementa- used in Plan Calculus resemble the tion is needed but rather must give an graphical abstractions found in hard- elaborate logic specification of the queue ware logic-level schematics. These dia- semantics. This monolithic, low-level ab- grams have isomorphisms to logic straction representation is necessary for formalisms that are appropriate for the theorem prover but is not ideal for machine manipulation. human users. Transistor-level schematics. This cor- Also, the state of the art for theorem responds to the source code level of a provers is not sufficiently advanced for software schema. Although this is at practical use. Finding a schema that sat- too low a level for most of the reuse isfies a problem statement requires proof process, it is occasionally useful in the that the schema is logically consistent final stages of selection, specialization, with the problem statement, which in and integration. For example, with general is well beyond the scope of cur- TTL chips, transistor-level schematics rent-day theorem-proving technology can be consulted to determine whether [Rich and Waters 1988]. or not unused pins require external connections. Knuth’s WEB, a system 6.2.3 Exposition for making source code more compre- In The TTL Data Book, a diverse collec- hensible, is one possible approach for tion of device abstractions are presented presenting the schema implementa- to help users understand and select can- tion level [Bentley 1986]. didate components. Biggerstaff believes Timing diagrams to describe tempo- that finding an analogous, diverse collec- ral relationships. Temporal logic is tion of expositions for reusable software suitable for expressing some of the artifacts is the “fundamental operational time-dependent characteristics of soft- problem that must be solved in the devel- ware.

ACM Computmg Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 153

Plots of characteristics such as delay single, complete instantiation. In both time versus device temperature. cases, the software developer may use The characteristics of computations a formal specification to make the con- within a schema can be expressed in vergent step. In the case in which the terms of the upper and lower bounds software developer specializes by substi- on time/space requirements as a func- tuting a .nested schema for a schema pa- tion of input size. rameter, then specialization is exactly the same as selection since the nested schema Biggerstaff argues that hypertext sys- must be selected from the library before tems can be used to manage a large col- it can be inserted. lection of interrelated component or PARIS uses both parameter substitu- schema abstractions [Biggerstaff 1987; tion and enumerated choices for special- Biggerstaff and Richter 1989]. So-called izing the variable parts in the schema “webs of information” allow the user to abstractions. The variable part of the ab- move easily between different abstract straction is separated into two sections: views of reusable artifacts. Seer is an example of the hypertext ● Nonprogram entities, such as abstract approach [Latour and Johnson 1988]. It data types, subtypes, ranges, func- uses a graphical representation of the tions, and variables. Booth [1987] taxonomy for reusable com- ● Program sections, which can be hand- ponents. Each component in the library wri~ten source code fragments or has a graphical information web that nested schema instantiations. provides access to all of the available information on the component. This in- The values substituted for abstract formation includes formal, informal, syn- nonprogram entities must come from the tactic, semantic, graphical, and textual concrete nonprogram entities listed in the descriptions in three different categories: user’s problem statement. The possible Specification information. What the values may be constrained by the appli- component does. cability conditions of the schema. There- fore, specialization of nonprogram enti- Usage information. How to use the ties in a schema corresponds to matching component correctly. actual entities in the problem statement Implementation information. How the to abstract nonprogram entity parame- component is implemented. ters in the schema. It is truly a continua- All three types of information are useful tion of the selection process. in selecting schema. In addition, usage The instantiation of the program sec- information applies to schema specializa- tions in a PARIS schema is best de- tion and integration, and implementa- scribed as a substitution. Arbitrarily tion information is occasionally useful for complex source code fragments are sub- detailed integration decisions. stituted into the program sections, re- stricted only by the assertions in the sec- tion conditions for the schema. If the 6.3 Specialization source code fragment is derived by nested Schemas are typically specialized either selection of another schema, however, it by substituting language constructs, code is a continuation of the selection process. fragments, specifications, or nested In LIL [Goguen 1986] and in Embley schemas into parameterized parts of a and Woodfield’s [1987] reusable abstract schema or by choosing from a predefine data types, a schema can have multiple enumeration of options. implementations. A schema implementor It can be argued that in either case, can create radically different implemen- specialization is a continuation of the se- tations that satisfy the same semantic lection process. Both selection and spe- interface but that exhibit different per- cialization are convergent steps toward a formance characteristics. The perfor-

ACM Computing Surveys, Vol. 24, No. 2, June 1992 154 “ Charles W. Krueger

mance profile for the schema then be- In some cases, schema implementa- comes a parameter in the variable part of tions may perform significantly worse the schema abstraction. A software de- than very specialized code. If necessary, veloper specializes such a schema by performance “hot spots” can be identified selecting from the enumerated perfor- with conventional monitoring techniques mance options. during the execution of the final system, For example, consider a schema for an and then the hot spots can be tuned, indexed sequential list. This schema replaced, or hand coded [Booth 1987]. maintains an ordered list of elements, so that (1) all elements of the list can be 6.4 Integration accessed in sequential order, and (2) ele- ments can also be accessed individually The simplest approach to schema inte- with an index key. Sequential access is gration is to use the module interconnec- implemented with a linked list. The tion language of the schema implementa- schema has two different implementa- tion language. For example, if schema tions of the indexed access so a software instantiation produces Ada package code, developer reusing the schema can choose an instantiated schema can be treated as between two different space/time perfor- a conventional Ada package. Integration mance profiles. The first implementation would be based on the syntactic integra- provides faster access at the expense of tion rules of Ada. storage space by maintaining an explicit More sophisticated schema integration index structure with pointers into the techniques rely on semantic specifica- linked list. The second implementation tions. In addition to the syntactic con- exhibits slower access time and requires straints on composition provided by less space by omitting the index struc- conventional module interconnection lan- ture and doing indexed access with a guages, formal or informal semantic linear search through the list. With this specifications serve to narrow the compo- implementation, insertion and deletion nent interfaces to include only correct are faster since there is no index to up- syntactic and semantic composition. For date when items are added to or removed schemas with complex interfaces, it is from the list. particularly important to have a clear This example illustrates how schemas description of both the syntactic and the with multiple implementations address semantic interface [Latour and Johnson an important problem in software reuse: 1988]. the tradeoff between generality and per- LIL (introduced in Section 6.1) is an formance. The implementor of a reusable example of a schema technology that uses artifact typically tries to generalize the semantic specifications to compose artifact, to make it suitable for a broad schemas [Goguen 1986]. Parameters in range of applications. Increased general- Ada generics are extended with semantic ity often translates into increased func- constraints that guarantee an instantia- tionality, and increased functionality tion to be semantically correct. For exam- often translates into performance com- ple, consider a generic Sort package, pa- promises in the implementation. A rametrized by the type of elements it schema can, however, offer multiple sorts. In LIL, the implementor of this implementations with a range of func- package could put a semantic constraint tionality and performance options. A on the parameter so the element type software developer can choose the imple- satisfies the Partial Ordered Set theory. mentation that most closely matches the This would gaarantee that an instanti- required functionality and performance. ated Sort package would always be able The schema is conceptually one large, to make a semantically meaningful generalized artifact, but in reality there “greater than or equal to” comparison on can be many specialized implementa- the elements. tions that do not pay a performance In LIL, a distinction is made between penalty for unnecessary functionality. horizontal and vertical schema composi-

ACM Computmg Surveys, Vol. 24, No 2, June 1992 software Reuse “ 155

tion. In vertical composition, higher-level to select, specialize, and integrate power- abstractions are created. For example, a ful, high functionality artifacts with rela- package implementing a low-level List tive ease, abstraction might be used to implement Unfortunately, with software systems a higher-level abstraction such as a Sort we do not have many universal abstrac- algorithm. In horizontal composition, tions above the stack, list, tree, and so on schemas are instantiated by specializa- level. Therefore, the semantics of higher tion. For example, a generic Sort schema level schema abstractions are often ex- instantiated with a partially ordered set pressed with logic formalisms and speci- of Labels results in a package that is fication languages,, In a sense these for- still at the Sort level of abstraction. Thus, malisms become the programming lan- horizontal composition corresponds to guage for software developers using the schema composition using nested schemas. Thus, the challenge for imple- schemas. mentors of a schema technology is to find abstraction formalisms that are natural, succinct, and expressive. Members of the 6.5 Appraisal of Reuse Techniques in PARIS project describe this challenge: Software Schemas Our experience in building the PARIS prototype Compared to reusable source code com- indicates that transforming abstract logic expres- ponents, reusable schemas place a sions into precise implementation definitions is greater emphasis on the abstract specifi- quite difficult. It is a nontrivial task to construct cation of algorithms and data structures a system that is able to provide a friendly inter- and place less emphasis on the source face to the user and at the same time produce code implementation. This shift in em- correct and efficient input to the theorem-proving phasis helps reduce the cognitive dis- mechanism [Katz et al. 1989]. tance between the informal requirements Table 4 summarizes the software of a system and its executable implemen- schema approach to software reuse. tation by isolating the software developer from the source-code-level detail. In cases 7. APPLICATION GENERATORS in which the mapping from the schema abstraction specification to the source Application generi~tors operate like pro- code realization is fully automated, the gramming language compilers—input cognitive distance is reduced by one level specifications are automatically trans- in the software development life lated into executable programs [Clevel- cycle—up from the level that describes and 1988]. Application generators differ how an algorithm or data structure is from traditional compilers in that the implemented to the level that specifies input specifications are typically very the semantics of what is implemented. high-level, special-purpose abstractions With TTL components, low abstrac- from a very narrow application domain. tion-level workhorses such as the AND By focusing on a narrow domain, the gates and INVERTERS offer leverage to code expansion (ratio of input size to out- hardware developers. By analogy, put size) in application generators can be reusable software schemas for stacks, one or more orders of magnitude greater lists, trees, and so on offer leverage to than code expansion in programming software developers. It is the higher level language compilers. Levy [1986] reports TTL device abstractions, such as coun- that application generators have been ters, multiplexer, and even CPUS, how- used to create production quality com- ever, that give hardware designers the mercial software at the equivalent rate of greatest leverage. These devices typically 2000 lines of source code per day. have a large ratio of internal gate counts Whereas software schemas reuse algo- to external pin count (i.e., large function- rithm and data structure designs, appli- to-interface ratio). Similarly, software cation generators go one step further and schemas with high-level abstractions and reuse complete software system designs. narrow interfaces allow system designers In application generators, algorithms and

ACM Computing Surveys, Vol. 24, No. 2, June 1992 156 * Charles W. Krueger

Table 4. Reuse in Software Schemas

Abstraction The schema approach emphasizes the reuse of algorithms, abstract data types, and higher level abstractions. Formal semantic specifications are typically used to express schema abstractions. Examples include data flow, control flow, pre, post, and invariant logic expressions, regular expressions, temporal logic assertions, and axiomatic theories. Selection Classification and search schemes and multiple abstraction exposition forms, analogous to The TZ’L Data Book., offer promise as practical selection techniques for reusable software schemas. The issue of scale will make large schema libraries more difficult to use than the hardware analog. Automated assistance such as hypertext databases and theorem provers can help for schema selection. Specialization Schemas are typically specialized by filling in parameterized slots. Examples include insertion ofarbitrary language constructs, choosing from an enumerated set of abstractions, or instantiating parameters with nested schemas. Schema specialization may involve characteristics not typically associated with parame- trized software, such as explicit choices among run time performance alterna- tives. Integration Schema composition often involves both the semantic and syntactic schema interface. This results in a narrower interface when compared to syntactically typed languages such as Ada, somoresemantic errors can bedetected at design time, Pros When schemas are based on formal specifications, automated tools can be used for selection, specialization, and integration. This, in combination with the fact that software developers work at a higher level of abstraction, can reduce the cognitive distance between the requmements and the implementations of algorithms and data structures. Cons Formal specifications for schema abstractions can be large and complex. Even with automated tools it can be difficult for software developers to locate, understand, and use schemas. This complexity serves to increase the cognitive distance, thereby offsetting some of the advantages of using higher level abstractions.

data structures are automatically se- tion generators generalize and embody lected so the software developer can con- the commonalities, so they are imple- centrate on what the system should do mented once when the application gener- rather than how it is done. That is, ap- ator is built and then reused each time a plication generators clearly separate the software system is built using the gener- system specification from its implemen- ator. tation [Cleveland 1988]. At this level of abstraction, it is possible for even non- 7.1 Abstraction in Application Generators programmers familiar with concepts in The abstractions presented to the user of an application domain to create software an application generator typically come systems [Horowitz et al. 1985]. Application generators are appropriate directly from the corresponding applica- tion domain. The way these abstractions in application domains where are presented is dependent on the appli- - Many similar software systems are cation domain [Cleveland 1988]. Differ- written, ent applications need different models of

● One software system is modified or data and computation in the generated rewritten many times during its life- programs, and these different models are time, or amenable to different exposition tech- Q Many prototypes of a system are neces- niques [Levy 1986]. Examples include sary to converge on a usable product. ● Textual specification languages (appli- In all of these cases, significant duplica- cation-oriented languages, fourth- tion and overlap results if the software generation languages, etc.) systems are built from scratch. Applica- * Templates

ACM Computmg Surveys, Vol 24, No. 2, June 1992 Software Reuse “ 157

● Graphical diagrams plication generators, the product is a o Interactive menu-driven dialogs complete, executable system; with soft- * Structure-oriented interfaces ware schemas a smaller unit, such as a module or a subsystem, is produced. The The reusable artifacts in an applica- primary distinction is that schemas are tion generator are the global system ar- abstractions for software implementation chitecture, major subsystems within this artifacts such as algorithms and data global architecture, and very specific data structures, whereas application genera- structures and algorithms. The abstrac- tors are abstractions for the semantics of tion specification for an application gen- a particular application domain. In com- erator describes all parts of the applica- paring the effectiveness of the two ap- tion domain that can be modeled in a proaches, an important question to ask is generated system. The abstraction real- whether or not it is easier for the user to ization level corresponds to the exe- deal with application domain abstrac- cutable systems produced by the genera- tions rather than computational abstrac- tor. The fixed part of the abstraction tions. Successful reuse applications, such specification corresponds to those parts as the IMSL mathematical library, pro- of the application the user cannot modify vide some evidence that users are com- during the generation process; the uari- fortable dealing with domain abstrac- able part corresponds to those parts of tions. the application the user can customize. For example, with a parser generator 7.2 Selection the user specifies (in the variable part of the abstraction specification) the gram- Selecting an application generator to mar of the language to be parsed. The solve a particular problem means choos- generator might, however, have LR(l) as ing an application generator whose do- a fixed part of the abstraction specifica- main couerage includes the requirements tion. in which case the user cannot of the .tmoblem statement. For situations change the algorithm. The com- in which one or more application genera- bination of the fixed part and the range tors are known to exist, the selection of the variable part for an application process may be easy. Currently, however, generator determines the range of sys- there are only a small number of applica- tems that can be generated; it is referred tion generators in a few limited domains, to as the domain width or domain cover- and their availability is not always well age of the generator [Cleveland 1988]. known [Cleveland 1988]. Also, applica- Abstraction is used effectively in ap- tion generators are often highly special- plication generators. The software de- ized, limiting their domain coverage [Bi- veloper works exclusively at a high- ggerstaff and Richter 1989]. As a result, abstraction level. Typically the hidden application generators currently provide Dart of the abstraction covers all of the a very small coverage for software devel- Implementation details in the generator, opment in general. and the user does not view. understand. As more and mm-e application genera- or modify the generator output. This is tors are built, we might expect to see analogous to conventional high-level lan- libraries of application generators. Such guage compilers where the user treats a librarv . would be similar to. but at a the inmt s~ecification as the lowest level much higher levell of abstraction than, of abs~ract~on for software development. the libraries of reusable components or The workings of the compiler and the schemas described in Section 6.2. Ideallv. object code produced are black boxes. an application generator library wou~d It is interesting to compare application cover a broad enough range of applica- generators to software schemas. Both ap- tions to be considered a source of general proaches use diversely and highly pa- computation for common applications. rametrized reusable artifacts. With ap- Classification, search, and exposition in

ACM Computing Surveys, Vol. 24, No. 2, June 1992 158 “ Charles W. Krueger this library could be done in terms of and restructuring of the database con- domain characteristics rather than algo- tents. Specifications can include con- rithmic or data requirements. Initial straints on particular data items, con- steps in the direction of an application straints between data items, access generator library are described in Sec- control constraints, and the way of tion 10. viewing the contents of the database.

The following example is a simple in- 7.3 Specialization put specification to a report generator Specialization is the primary task of a and a sample output from the generated software developer using an application system [Horowitz et al. 1985]: generator. A software developer special- report izes an application generator by provid- file SALES ing an input specification. Implementors titl:StUNITS SOLD PER CUSTOMER’ of application generators have used a wide variety of techniques for specializa- by CUSTOMER tion. To demonstrate this diversity, the across YEAR following sections describe several appli- sum (UNITS) rowtotal cation generators and the abstractions col umntotal used for specialization. where YEAR in 1980...1982 end report 7.3.1 “Conventional” Application Generators

Application generators are sometimes as- UNITS SOLD sociated with only a narrow collection of PER CUSTOMER business applications. For example, 1980 1981 1982 Horowitz et al. [1985] in “A Survey of CUSTOMER UNITS UNITS UNITS TOTAL Application Generators” discuss only generators for business-oriented, data- A 2344 3455 1234 7033 intensive applications. Application gen- B 234 1111 3221 4566 c 412 555 1212 2179 erators from this domain often cover high-level report generation, data pro- TOTAL 2990 5121 5667 13778 cessing, and data display techniques. The abstractions presented in these systems include 7.3.2 Expert System Generators Database management based on a par- Expert systems are software systems that ticular data model, such as hierarchi- embody and apply expert knowledge to cal or relational. Specialization con- solve problems in a particular domain. sists of specifying the database schema Whereas conventional software system (types, fields, relations, etc.) for an ap- development codifies algorithmic prob- plication. lem-solving-knowledge, expert system Textual report generation, which is development codifies the subjective prob- often described by a nonprocedural lem-solving knowledge of experts (declarative) language and is based on [Hayes-Roth 1983]. By analogy, if appli- associative data retrieval. cation generators for conventional soft- Graphical report generation, which ware systems reuse common algorithms uses a similar input specification as and data structures, then application textual report generation but produces generators for expert systems can reuse graphs, histograms, bar charts, scatter common expert problem-solving methods plots, pie charts, and so on. [Krueger 1987]. This idea of generating Database manipulation for batch or expert systems is advocated by expert interactive modification, processing, system development tools such as MOLE,

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 159

SALT, and SEAR [Marcus 1988; McDer- (3) Collect the dati~ identified in step 2 mott 1986]. and differentiate the hypotheses. Experts in different areas may use (4) If step 3 uncovers new symptoms go similar problem-solving techniques even to step 1. though their actual knowledge is quite different. For example, an auto mechanic The input specification given to MOLE and a medical doctor may use a similar by a diagnostic expert is expressed in diagnostic method. Expert system gener- terms of domain-specific symptoms, hy- ators reuse implementations of these potheses, and data. Each piece of knowl- generic problem-solving methods. The edge in the input specification fits into fixed part of an expert system generator’s one of the first two steps listed above. abstraction has a particular problem- From the input specification, MOLE pro- solving method, whereas the variable duces an executable expert system to di- part (by which it is specialized) consists agnose problems in the expert’s domain. of the expert knowledge from a particu- SEAR provides a more general frame- lar area of expertise. The input specifica- work for creating and reusing problem- tion is typically derived through an solving methods [McDermott 1988]. The interactive dialog between the expert eventual goal is to have a library of com- system generator and a domain expert. monly used problem-solving methods that This process is referred to as knowledge can be selected and integrated to produce acquisition. a wide range of different expert system For example, MOLE is an expert sys- generators. tem generator that can be used for a wide variety of diagnostic tasks, includ- 7.3.3 Parser and Compiler Generators ing medical diagnosis, car repair, and production line troubleshooting [Eshel- Parser generators and compiler–com- man 1988], The fixed problem-solving pilers are probably the best-known exam- method reused in MOLE consists of the ples of application generators. These following steps: systems have grown out of the well- developed compiler theory and the ubiq- (1) Determine all hypotheses that ex- uitous need for compilers in computer plain the symptoms. science. (2) If there is more than one hypothesis The primary abstractions used in that explains a symptom, then iden- parser generators are regular expres- tify what data are necessary to dif- sions for generating lexical analyzers and ferentiate the hypotheses. Data for context-fi-ee grammars for generating differentiation can serve the follow- parsers. For example, Lex [Lesk and ing roles: Schmidt 1979] and Yacc [Johnson 1979] ● Rule out causality between a hy- are a pair of generators for producing pothesis and a symptom (i.e., hy- parsers. Lex deals with lexical analysis; pothesis implies the symptom in Yacc addresses parsing. some cases, but not this case). Parser generators and compiler–com- . Rule out validity of the hypothesis pilers reuse the knowledge gained (i.e., hypothesis implies the symp- through years of research and design ex- tom, but hypothesis is not true in perience in the compiler domain, includ- this case). ing the general theories behind lexical * Rate the relative likelihood of the analysis, parsing, syntactic and semantic hypotheses causing the symptom error analysis, and code generation. This (i.e., certain hypotheses are more knowledge is embedded in the design and likely to cause symptoms). implementation of a generic compiler . Rate the relative likelihood of the framework that is reused each time a hypotheses (i.e., certain hypotheses compiler is generated. The framework in- are more likely to be true). cludes such things as the generic parsing

ACM Computing Surveys, Vol. 24, No. 2, June 1992 160 ● Charles W. Krueger

algorithm and the parse tree data struc- agement and the user interface. The tures. primary differences between different The specification language used as in- structure-oriented editors are the syntax put to Lex is a regular expression lan- and semantics of languages. Structure guage. An input specification describes editor generators capitalize on this by the legal tokens (identifiers, keywords, capturing the commonalities in the fixed constants, etc.) in the language to be part of an abstraction specification while compiled. For example, the following reg- allowing the language syntax and se- ular expression describes all tokens that mantics to be expressed in a variable have a letter for the first character, fol- part of the abstraction specification. lowed by zero or more letters, digits, or Gandalf [Staudt et al. 1986], Synthesizer underscores: Generator [Reps and Teitelbaum 1984], and Mentor [Donzeau-Gouge et al. 1984] [a–zA– Z][a–zA– Z’C)–g_]k are examples of structure editor genera- The specification language used as in- tors. put to Yacc is a context-free grammar. Typically, four things are specified to The input specification to Yacc defines generate an editor: the legal syntactic structure of the lan- guage to be compiled. Actions can be (1) Tokens in the language. Regular associated with productions in the gram- expressions are typically used to de- mar. Actions can do static semantic fine tokens in the language, analo- checking, code generation, and so on. gous to those shown for compiler– Whenever a construct in the grammar is compilers. recognized by the Yacc-generated parser, (2) Abstract syntax of the language. the associated actions are executed. Ac- Context-free grammars are typically tions are expressed in C and are incorpo- used to describe the constructs in a rated into the generated parser. language. The grammars specify con- The following is an example Yacc straints on the abstract syntax trees. grammar rule for a conditional state- The following is an example of two ment in an imperative language: rules in an abstract syntax specifica- tion: stint : IF ‘(’ cond ‘)’ stint I IF ‘(’ cond ‘)’ stint ELSE stint IF expression statements statements

statements :: STATEMENT BLOCK I WHILE I 7.3.4 Structure-Oriented Editor Generators CASE IIFI .=

Structure-oriented editors are a cross be- (3) Concrete syntax of the language. tween text editors and compilers. A The concrete syntax of a language structure-oriented editor has knowledge describes how to map an abstract of the syntax and semantics of a pro- syntax tree into a textual representa- gramming language and interactively as- tion for the user interface. This is sists users to construct programs in that referred to as prettyprinting or un- language. Typically, structure- parsing. oriented editors store and manipulate (4) Semantics of the language. The programs in the form of abstract syntax techniques for specifying language trees (which are similar to parse trees) semantics vary among the different rather than text strings. This allows the editor generator systems. The Cor- editor to detect syntactic and semantics nell Synthesizer Generator uses at- errors incrementally while a user is edit- tribute grammars to declare con- ing a program. straints among different constructs in Structure-oriented editors for different the abstract syntax tree [Reps and languages have many similarities, in- Teitelbaum 1984]. Gandalf uses dae- cluding the abstract syntax tree man- mons, written in an imperative

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 161

database manipulation language. emitted lby the Lex lexer and consumed Daemons are associated with items by the Yacc parser. in the abstract syntax and are trig- gered by different editing activities 7.5 Software Life Cycle with Application [Staudt et al. 1986]. ACL is a seman- Generators tics specification language that com- The software development paradigm with bines the strengths of attribute application generators is significantly grammars and daemons [Kaiser different from the conventional software 1989]. development life cycle. This section examines the software life cycle for ap- 7.3.5 Others plication generators and its impact on Cleveland reports on application gener- cognitive distance. ators for user interfaces, switching soft- ware in telephone systems, test drivers, 7.5.1 Using Applicaticm Generators hardware interprocess communication, The leverage offered by application gen- CAD design process descriptions, and erators can be shown by comparing printed circuit board layouts [Cleveland conventional software development to 1988]. Levy describes application genera- software development with application tors for designing telephone company generators [Horowitz et al. 1985]. A sim- “loop plants” and for allocating and man- ple version of the conventional software aging construction and engineering work development life cycle might consist of [Levy 1986]. Horowitz notes that spread- the following steps: sheet generators like Visicalc are suc- cessful because they model a common System (or requirements) specifica- type of application with a high-level, do- tion. The desired system functionality main-specific abstraction [Horowitz and is expressed in terms of concepts (ob- Munson 1989]. jects and operations) in the application domain. This phase is focused on what 7.4 Integration the system should do and avoids the issues of how the functionality is im- In cases in which an application genera- plemented. tor can produce a complete executable Architectural design. The high-level system from one specification, integra- system organization is expressed in tion is not an issue. Some application terms of the implementation technol- generators, however, produce subsystems ogy. This includes subsystem decom- that in turn require composition. For ex- position into modules, module inter- ample, Lex and Yacc are used in combi- connection structure (such as control nation to produce parsers. and data flow), and correspondence to Software developers using an applica- the system specification. This phase tion generator typically think in terms of identifies what the modules should do, the abstractions at the input specifica- not how they are implemented. tion level not in terms of the software Detailed design. Details are specified produced as output. Therefore, it may be inconvenient or impractical for the soft- for the implementation of modules and their interconnection based on the ar- ware developer to use a conventional module interconnection language to inte- chitectural design. This phase begins to address how modules are imple- grate the output from an application gen- mented. erator. It is better if composition is expressed in terms of the higher-level Coding. Source code is written according abstractions visible at the input specifi- to the detailed design. This phase ex- cation level. For example, with Lex and pands in full detail how each module Yacc the integration done is through the operates. abstract concept of tokens, which are Testing. The system is tested to verify

ACM Computing Surveys, Vol. 24, No. 2, June 1992 162 “ Charles W. Krueger

that the “right system is built” and ators reduce the cognitive distance for that the “system is built right” [Zave the software developer by moving much 1984]. That is, testing serves two pur- of the development effort into the hidden poses: (1’) validate the correspondence and automated parts of the generator. between the system specifications and the intent of the design, and (2) verify 7.5.2 Building Application Generators that the code satisfies the system spec- Even in situations in which an applica- ifications. tion generator is not available, it may be advantageous to build an application Modifications made in any phase are generator to produce one software sys- propagated down through the other tem [Cleveland 1988; Levy 1986]. It is phases. This is true both of changes made not intuitively obvious how building an during the initial system development application generator to generate one and of changes made during the evolu- software system can be economically ad- tion (i.e., maintenance) of the system. vantageous. As described earlier, how- With application generators, much of ever, constructing an application genera- the conventional life cycle is automated. tor is appropriate when many similar Software developers construct or modify software systems are written, when one only the system specifications. The archi- software system is modified or rewritten tectural design and detailed design are many times during its lifetime, or when embodied as reusable artifacts in the ap- many prototypes of a system are neces- plication generator. Coding is auto- sary to converge on a usable product. The mated. An application generator takes last two cases suggest situations in which the system specifications as input and application generators could be created produces executable code as output. Soft- as part of the development of a single ware developers using an application software product. generator do not have to test that the The primary economic justifications for generated code satisfies the system speci- the “up-front” expense of building an ap- fication, because (theoretically) the im- plication generator are (1) the relative plementors of the application generator ease of using domain-specific specifica- verify that the generator always pro- tions and (2) ability to do rapid iterative duces code that is consistent with the prototyping. Taking these two factors into input specifications. The software devel- account, the cost of building an applica- opers need only validate the match be- tion generator is rapidly amortized dur- tween the input specifications and the ing its life. intentions of their design. With applica- Levy [1986] presents economic models tion generators, evolution of a system is for determining the optimal distribution carried out by modifying the system spec- of effort in building a software system ifications. The application generator au- with an application generator. He shows tomatically propagates new specifica- that it is often most effective to spend tions through the conventional life cycle most of the effort up-front constructing a phases in exactly the same way the orig-i- powerful, highly productive application nal system is generated. generator, thus requiring less time proto- Thus, the powerful leverage afforded typing with the generator. by application generators comes from Levy [1986] refers to the process of specifying systems directly in terms of building an application generator as domain concepts, thereby avoiding the . Metaprogramming low-level details of implementing source requires expertise in both the application code. The cognitive distance is very small domain and in software system construc- between the informal requirements of a tion. Cleveland [ 1988] describes the pro- system and the specification of that sys- cess as follows: tem in terms of application-specific ab- stractions. Compared to conventional Recognizing an appropriate domain. software development, application gener- Recognizing when the domain of inter-

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 163

est is suitable for application genera- Stage [Cleveland 1988], SSAGS [Pay- tor technology. Domains that are ton et a-l. 1982], and Draco [Neighbors amenable to generator technology will 1989] are examples of systems that assist have implementations with recogniz- in building application generators. They able patterns at the source code level, rely on generator technology to build have natural control and data ab- generators and are therefore application stractions, contain automatable generator generators. Details of this ap- transformations, and have dispersed proach are presented in Section 10, which information that can be consoli- deals with the reuse of system-level ar- dated. Domains with a well-understood chitectures. theory, such as compilers, are easiest for application generator construction. 7.6 Appraisal of Reuse Techniques in Application Generators Defining domain boundaries. Decid- ing what aspects of the domain should Application generators are practical and be included in the generator (i.e., set- attractive when high-level abstractions ting the domain coverage). Increasing from an application domain can be auto- the domain coverage allows the gener- matically mapped into executable soft- ator to handle more problems but typi- ware systems. This capitalizes on the cally makes the generator less effi- users’ understanding of semantics from cient and harder to use. the application domain rather than re- Defining underlying abstraction. quiring users to de-rive domain semantics Identifying the abstraction presented repeatedly from first principles in con- to the user of the application genera- ventional software development technol- tor. Comrhon abstractions include sets, ogy. By eliminating many of the design directed graphs, trees, formal logic and implementation steps found in the systems, and computational models conventional software life cycle, applica- such as finite-state machines and tion generators significantly reduce cog- spreadsheets. The 80/20 rule, where nitive distance. not everything in a domain falls into a Standards can be captured in the fixed clean computational model, sometimes part of the abstraction for an application applies [Levy 1986]. In this case, es- generator. For example, users may prefer capes must be provided so that the a standard user interface for all gener- 20% can be programmed directly into ated systems rather than having to learn the implementation technology. a new interface for each system built. Also, in application domains where there Defining variant and invariant parts. are established standards, it is easier to Deciding what aspects of the abstrac- implement the standard in a single ap- tion are under control of the user and plication generator than to have to inter- which parts are fixed. pret and implement the standard repeat- Defining specification input. Defining edly [Cleveland 1988]. the W=Y ‘in which the &er specifies A difficult challenge for implementors each instance of a generated program. of application generators is defining the As mentioned earlier, options include optimal domain coverage and the optimal textual specification languages (appli- distribution of domain concepts into the cation-oriented languages, fourth- fixed and variable parts of the abstrac- generation languages, etc.), templates, tion. Implementors must attempt to pre- graphical diagrams, interactive dict in advance the functionality and menu-driven dialog, and structure- variability that software developers us- oriented editing. ing the application generator will want. Defining products. Implementing the A limited number of application gener- generator (“the easy part” [Cleveland ators are available, many of which have ~988]). Creating the- set of programs narrow domain coverage. As a result, ap- that will generate the final code from plication generators do not exist for most a concise input specification. software development problems.

ACM Computing Surveys, Vol. 24, No. 2, June 1992 164 * Charles W. Krueger

Table 5. Reuse m Appllcatron Generators

Abstraction Abstractions come directly from the application domain. These high-level abstractions are mapped directly into executable source code by the generator. Selection Application generator libraries have not received much attention in the litera- ture. The parallel between software schemas and application generators sug- gests, however, that library techniques could be used to select among a collection of application generators. Specialization Application generators are specialized by writing an input specification for the generator. Due to the diversity in application domain abstractions, the tech- niques used for specialization are also widely varied. Examples include gram- mars, regular expressions, finite-state machines, graphical languages, tem- plates, interactive dialog, problem-solving methods, and constraints. Integration Application generators do not require integration techniques when a single executable system is generated. In cases in which a collection of generators produce a collection of subsystems, composition is best done in terms of domain abstractions. Pros Since high-level abstractions from an application domain are automatically mapped mto executable software systems, most of the conventional software development life cycle is automated. This significantly reduces cognitive dis- tance. Cons Because of the limited availabdity of application generators, many of which have narrow domain coverage, it is often difficult or impossible to find an application generator for a particular software development problem. It is difficult to build an application generator with appropriate functionality and performance for a broad range of applications.

Application generators offer both ad- and these specifications are compiled into vantages and disadvantages with respect assembly language realizations. Simi- to the efficient execution of generated larly, very high-level languages allow systems. One advantage is that complex software developers to create executable optimizations can be designed and built systems using constructs that are consid- once in the generator and incorporated ered high-level specifications relative to into all generated systems. For major ef- HLL languages. Because of this, VHLLS ficiency problems in an application do- are also known as executable specifica- main, a significant optimization effort in tion languages. the application generator can be amor- Developing software with VHLLS is tized over all generated programs. Also, very much like developing software with application generators can offer a range HLLs. Both VHLLS and HLLs provide a of functionality and performance options, syntax and a semantics for expressing similar to schemas with multiple imple- general-purpose computation. Both ap- mentations. On the negative side, in- proaches use compilers that accept source creasing the generality of an application code programs as input, make tests for generator often requires compromises in syntactic and static semantic correct- time or space efficiency. Table 5 summa- ness, and create executable programs if a rizes the application generator approach certain threshold of correctness is met. to software reuse. HLL source code may be an order of magnitude more succinct than the corre- sponding assembly language implemen- 8. VERY HIGH-LEVEL LANGUAGES tation, and likewise VHLL source code As the name suggests, very high-level may be an order of magnitude more suc- languages (VHLLS) are an attempt at cinct than corresponding HLL implemen- improving on the successes of conven- tations. tional high-level languages (HLLs). Very high-level languages resemble High-level languages have constructs application generators in that high-level such as iteration and relational expres- abstraction specifications are automati- sions for their abstraction specifications, cally translated into executable systems.

ACM Computmg Surveys, Vol. 24, No. 2, ,June 1992 Software Reuse ● 165

Application generators, however, use ap- produce an efficiently executable system. plication-specific abstractions, whereas This approach can be seen as an interim VHLLS use application-independent, measure until it is possible to automate general-purpose abstractions. With ap- the transformation fully (i.e., compila- plication generators, generality is sacri- tion) into efficiently executable code. ficed in order to capture much higher level abstractions in the specification 8,1 Abstraction in Very High-Level language. Therefore, VHLLS have the ad- Languages vantage of being generally applicable to software development, whereas applica- VHLLS diverge from a trend we have tion generators have the advantage of seen thus far in the survey. The reuse being more succinct and powerful in the approaches described in earlier sections relatively small number of cases in which al~i used abstractions that lie in the con- they are applicable. ventional software development life cy- This distinction between VHLLS and cle. Higher level abstractions allow lower application generators exemplifies the level artifacts to be reused. For example, tradeoff between generality and leverage HLL constructs are abstractions for as- in software reuse technologies [Bigger- sembly language code fragments; code staff and Richter 1989]. Typically, the scavenging and reusable components use more general a reuse technology is, the abstractions for HLL system designs and more effort is required to implement sys- code fragments; schemas are abstrac- tems with it. For example, the following tions for algorithms and data structures, technologies are listed by increasing and application generators use abstrac- leverage and decreasing generality: as- tions from the application domain being sembly language, HLLs, software modeled. Contrary to this trend, VHLLS schemas, VHLLS, and application gener- typically have mathematical abstractions ators. The goal of VHLL research is to that are not widely used in the conven- maximize the leverage offered by higher tional software development life cycle. levels of specification without sacrificing Implementors of a VHLL try to find a computational generality. mathematical model that is both exe- The primary concern in VHLLS is not cutable and effective for doing software efficiency in program execution but development. rather efficiency in implementing and The reusable artifacts in VHLLS are modifying programs. The SETL language analogous to those in HLLs. In HLLs, designers use the slogan “slow is beauti- the reusable artifacts are embodied in ful” to emphasize this point [Kruchten et the compiler mappings from language al. 1984]. Similarly, in the early days of constructs to assembly language pat- their development, HLLs were often con- terns. Similarly, VHLLS capture reusable sidered too slow to be of practical value HLL implementation patterns in their [Naur and Randell 19681. But the in- compilers [Cheng et al. 1984]. The VHLL creased software development productiv- constructs serve as abstraction specifica- ityy offered by HLLs, in addition to com- tions for reusable artifacts, whereas the piler optimizations and faster hardware, output from the compiler corresponds to quickly made HLLs the norm for soft- the abstraction realization level. ware development. It is possible that VHLL constructs have fixed and vari- VHLLS, which are considered too ineffi- able parts, similar to HLL constructs. cient by today’s standards, will follow a For example, in the functional expression similar course of evolution. F(X), both F and X are variable parts of Looking ahead to Section 9, trcmsfor- the function abstraction specification. mational systems use VHLLS as the ba- Different functions can be substituted for sis for rapid prototyping and validation F, and different expressions can be sub- of software systems, then a series of hu- stituted for X. The fixed part of the func- man-guided transformations is applied to tion abstraction is given by the language

ACM Computing Surveys, Vol. 24, No. 2, June 1992 166 “ Charles W. Krueger semantics for functional evaluation: Ap- satisfying the condition (cond). For ex- ply function F to expression X. Although ample, the programmer must understand the se- {2*x: xIO

{x in S I C(x)} 8.1.1 SETL Maps in SETL correspond to functions SETL is a VHLL based on set theory in set theory. A map is a set of length-2 [Doberkat et al. 1983; Dubinsky et al. tuples, where the first element in each 1989; Kruchten et al. 19841. Therefore, tuple is in the domain of the function and sets are the most important abstraction, the second element is in the range. There defining the character and semantics of are several high-level retrieval opera- the language. With a small number of tions that can be applied to maps. For primitives, all mathematical operations example, if F is a map, then F(X) yields can be succinctly expressed in set-theo- the second component of the unique pair retic terms. The data types in the lan- in F whose first component is X. Like- guage are either atomic types (such as wise, if F contains more than one tuple integers, reals, and Boolean values) or whose first component is X (this is legal), composite types from set theory. The two then F(X) is undefined, but F{X} yields composite data types are sets and tuples: the set of all second components whose Sets. An unordered finite collection of first component is X. The notation F[S], distinct SETL objects (atoms, tuples, where S is a set, returns the set of all and sets). elements in the combined ranges of every Tuples. An ordered finite sequence of element in S. SETL objects. Many common abstract data types can be implemented trivially in SETL. For Sets and tuples can be nested heteroge- example, SETL operations for insertion neously and to any depth. and deletion from the ends of tuples al- low direct implementations of stacks and Sets and tuples are formed either by queues. The concatenation operation al- enumeration of their members, as in lows tuples to be used as lists. {1,2,3}, 8.1.2 PAISLey or with set-formers, which have the gen- eral form In PAISLey, abstractions from set-based and communi- {(exp) : xl,... ,xn I (cond)}. cating asynchronous processes are com- The (exp) designates the values in the bined in a single VHLL [Zave and Schell set, as x ~ through x. vary over all values 1986]. The set-based aspects of the lan-

ACM Computing Surveys, Vol 24. No. 2, June 1992 Software Reuse ● 167 “

gaage are similar to SETL and will not Lock 1989]. Data flow and control flow be discussed here. are automatically determined by the With PAISLey, software developers can compiler, not explicitly specified as in define a collection of asynchronous paral- imperative HLLs. lel processes. The processes communicate A MODEL program contains data dec- via exchange functions. An exchange larations and consistency equations. The function consists of a pair of function MODEL compiler does data flow analysis calls, one in each of two different pro- to determine if the consistency equations cesses. During execution, when the two can be uniquely satisfied, and to find the processes both reach the evaluation of order of computations that will assign their half of the exchange function, the consistent values to all of the data ob- argument of one function is passed as the jects. return value to the other and vise versa For example, the first of the following (i.e., they exchange their arguments). equations specifies that the final value of This simple extension to the functional an account balance must be equal to the semantics allows parallelism to be di- initial value of balance minus the debit rectly modeled in the language. total plus the credit total for that ac- Real-time constraints are another count. The equation for credits specifies high-level abstraction in PAISLey. Tim- equality with the old value for A, and ing constraints allow the static and dy- debits is equal to the old value of B mi- namic execution of a system to be evalu- nus 10’% of credits: ated. Software developers can attach tim- ing constraints to functions, expressed as new. bal = old. bal – debits + credits; lower bounds, upper bounds, distribu- credits = old.A tions, or random values. Static inconsis- debits = old.B – (.1 “ci’edits) tencies and dynamic failures are detected and reported. The compiler detects that values for the debit total and the credit total must be A powerful feature of the PAISLey exe- cution environment allows incomplete computedl before the new balance can be programs to be executed. If an unde- calculated and that credits must be com- puted before debits. The appropriate code clared function is called during execu- is generated. tion, the user is prompted for a value to use in place of the evaluation. Declared The MODEL compiler checks for com- pleteness and consistency. Completeness but undefined functions can be handled in several ways: requires that all variables used are also defined. Consistency requires that no cir-

● In default mode, a distinguished value cularities exist in the constraints and will always be returned, depending on that there are no type mismatches. the type of the return value declared. For example, O is returned by default for integers. 8.2 Selection

● In random mode, a pseudorandom With VHILLS, selection can take place on value is returned. two different levels: “ In interactive mode, the user is prompted for the return value. e A software developer selects a VHLL that is appropriate for a particular ap- plication. This is analogous to selecting 8.1.3 MODEL an application generator. MODEL is a declarative constraint lan- ● A software developer selects among the gaage, whose abstract model of compu- VHLL constructs during application tation is based on the simultaneous programming. This is analogous to se- solution to a collection of constraint lecting constructs while programming equations [Cheng et al. 1984; Prywes and with a high-level language.

ACM Computing Surveys, Vol. 24, No. 2, June 1992 168 “ Charles W. Krueger

The former is guided by the latter. That two scans over set S, The following low- is, a software developer selecting a WILL Ievel SETL code is more difficult to read, for a particular application makes the but it requires only a single scan over set decision by comparing how easy it is to S during execution: program the application with the differ- POS = nl; ent VHLLS, The language will have a NEG ,= nl; very strong influence on how a problem (forall X element S) is solved. For example, SETL and PAIS- If X > 0 then Ley encourage software developers to POS with X, think about problems in terms of sets of elseif X <0 then objects and functions on those sets, NEG with X; whereas MODEL encourages thinking in end forall X; terms of data objects with static interob- SETL also allows software developers ject constraints. The best language for a to specialize the way language constructs particular application provides the small- are implemented at run time (for the est cognitive distance from initial concept sake of efficiency). For example, the de- to executable implementation. fault run-time implementation of SETL sets is a hash table. For sets that are 8.3 Specialization only subject to union and intersection operations, however, a bit string imple- VHLLS have parameterized language mentation is more efficient. Ideally, the constructs, or templates, that are special- compiler or interpreter would automati- ized by recursively substituting other cally make the optimal implementation language constructs. For example, the choice, but in general this is beyond the functional template F((x)) can be special- scope of current technology. These types ized by substituting another functional of specializations must be guided by soft- expression for (x): ware developers. Human-guided trans- F(G(2)) formations will be discussed in more de- tail in Section 9. A less conventional type of specializa- tion is found in a class of VHLLS known 8.4 Integration as wide-spectrum languages. Wide- spectrum languages contain both VHLL PAISLey and other side-effect-free func- constructs and HLL constructs. The low- tional languages simplify function inte- level constructs are present to accommo- gration because of encapsulated compu- date efficiency concerns. Since low-level tation: Computation within a function is specifications are more difficult to read only influenced by its input parameters and modify, software developers should and return values from the functions it program at this level only as a refine- calls, and the only influence a function ment, or specialization, of stable high- has is through its return value. Control level designs [Kruchten et al. 1984; Liu flow and data flow are unified, which and Paige 1979]. makes it easier for a software developer SETL is an example of a wide- to reason about function integration. spectrum language. The following high- MODEL and other declarative lan- level SETL code can be specialized into a guages simplify integration even more by more efficient low-level form. This exam- completely eliminating the. issue of con- ple creates the subset of positive integers trol and data flow at the programming and the subset of negative integers from level. In MODEL, the order in which the a set S [Liu and Paige 1979]: constraint equations are written is irrele- vant since the compiler performs data POS := {X element S I X > O}, flow analysis to determine a correct (and NEG ,= {X element S I X < O}, close to optimal) sequencing of opera- Executing this code, however, requires tions [Tseng et al. 1986].

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 169

Just as source code components writ- for production use. Also, the SETL Ada ten with HLLs can be reused, software compiler was built by the SETL language developers can also reuse VHLL compo- design group, which does not answer the nents. Declarative languages like question of whether or not the set theo- MODEL simplify the integration of retic foundation of SETL is a suitable reusable VHLL components. Two or more abstraction for the general population of components can be merged without con- software developers, building a wide cern for the run-time ordering of compu- range of applications. tations since this is done automatically Zave notes that our notions of specifi- by the compiler. To merge components, a cation and implementation are floating software developer only needs to specify rather than fixed [Zave and Schell 1986]. the relationships (constraints) between Although these two levels of program ab- data objects in the different components straction will always exist (with the spec- [Kaiser and Garlan 1987; Tseng et al. ification level always at a higher level of 1986]. abstraction than the implementation level), both levels have been gradually and simultaneously moving higher. For 8.5 Appraisal of Reuse Techniques in Very example, the implementation level of to- High-Level Languages day resembles the specification level of VHLLS use high-level mathematical ab- 25 years ago. The goal of VHLL research stractions suitable for general-purpose is to continue this trend by making exe- software development. The goal of VHLL cutable specification languages a viable implementors is to find abstractions that alternative for implementing software are more natural and expressive than systems. Table 6 summarizes the very the abstractions in HLLs. As a result, high-level language approach to software VHLL programs can be an order of mag- reuse. nitude more succinct than corresponding HLL programs. VHLLS are not, however, 9. TRANSFORMATIONAL SYSTEMS as powerful as application generators since application generators use domain- With transformational systems, software specific abstractions, which can be at a is developed in two phases: much higher level of abstraction. (1) Software developers describe the se- Like HLLs and application generators, manti c behavior of a software system VHLLS have compilers to map abstrac- using a high-level specification lan- tion specifications directly into exe- guage. cutable realizations. This isolates the (2) Software developers then apply software developer from the hidden and transformations to the high-level realization parts of the abstraction, so specifications. The transformations the software developer is largely un- are meant to enhance the efficiency aware of the reuse that takes place. The of execution without changing the se- lowest level of detail that software devel- mantic behavior. opers work with are the high-level ab- stractions in the VHLL, so the cognitive The two phases make a clear distinction distance between the initial concept of a between specifying what a software sys- system and the executable implementa- tem does and the implementation issues tion is relatively small. of how it will be done [Zave 1984]. The first validated Ada compiler was The first phase is equivalent to using a built using the VHLL SETL [Kruchten VHLL. Software developers create an ex- et al. 1984]. The experiences of that pro- ecutable system in a language that has a ject demonstrate the strengths and limi- relatively small cognitive distance from tations of current-day VHLL technology. the developer’s informal requirements for Although it was possible to prototype the the system [Balzer 1989]. The second compiler rapidly, it was far too inefficient phase, however, is not present in the

ACM Computing Surveys, Vol. 24, No. 2, June 1992 170 “ Charles W. Krueger

Table 6. Reuse jr! Very High-Level Languages

Abstraction The abstractions used in WILLS can be viewed as specifications when compared to HLLs. For this reason WILLS are sometimes referred to as executable specification languages. VHLLS typically have mathematical abstractions, such as set theory or constraint equations. Selection Selection can take place at two levels: (1) selecting a VHLL that is most appropriate for a particular application and (2) selecting the language constructs that best represent the application. Specialization VHLLS have parameterized language constructs that are specialized by recur- sively substituting other language constructs. Another form of specialization is found in wide-spectrum VHLLS, where the run-time representation of language constructs can be speciahzed to provide better performance, Integration Function integration in languages such as PAISLey is sirnphfied by slde-effect- free encapsulated computation. Control flow and data flow are unified and localized to function composition, function arguments, and return values. Declarative languages such as MODEL simphfy integration further with order-independent specifications and compiler-generated control flow and data flow. Pros High-level mathematical abstractions are automatically mapped into executable systems. VHLLS are suitable for general-purpose software development. The VHLL abstractions can be more natural and expressive than HLL abstractions, and VHLL programs can be an order of magnitude more succinct than HLL programs. Thus the cognitive distance for application development can be significantly less using WILLS rather than HLLs. Cons The run-time performance of systems written with VHLLS is typically poor. In wide-spectrum languages, performance can be Improved by using low-level constructs, but this increases cognitive distance. It is also not clear whether high-level mathematical abstractions are suitable for software development by the average programmer.

pure VHLL paradigm. The goal in this can be expressed as: phase is to produce an executable system PO XTO+P1 that satisfies the high-level specification P1x T1+P2 and that also exhibits performance com- parable to an implementation in a con- x TN + P~+l ventional HLL. The transformation phase PN or can be thought of as an interactive, hu- ((... (pOXTO) XTXTJ+p~+l,~+l, man-guided compilation. Human inter- vention is necessary because issues such where each PI+, is a more efficient but as automatic algorithm and data struc- semantically equivalent implementation ture selection are beyond current com- of Pi. piler technology. A transformation is a mapping from 9.1 Abstraction in Transformational Systems m-omams to m-om-ams IPartsch and Ste~nbruggen i98~]. Application of a Transformational system programming transformation T to a program P can be languages often have higher level ab- expressed as stractions than VHLLS. The execution ef- ficiency of these higher level abstractions PxT+ P’., is often too poor to be directly executed in where P‘ is the result of the transforma- VHLLS. Transformational systems can, tion and is semantically equivalent to however, afford to use them because soft- but more efficient than P [Cheatham ware developers apply transformations 1989]. A transformational development that enhance their efficiency. history is the sequence of transforma- Since the first phase of programming tions applied during the creation of a with transformational systems is similar software system. A development history to programming with VHLLS, this sec-

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 171 tion concentrates primarily on the second tion are small, significant portions of the phase—the transformations. The follow- previous development history can often ing three sections describe three differ- be reused. PADDLE is a transforma- ent kinds of reusable artifacts in the tional system that stores the develop- transformation phase: prototypes, devel- ment history as a sequence of applied opment histories, and transformations. transformations [Balzer 1989; Fickas 1985]. All. or part of a previous develop- ment history can be replayed after a 9.1.1 Prototypes specification is modified. Prototyping and its associated benefits A problem with PADDLE is that it are not unique to transformational sys- does not encode the rationale behind the tems. Transformational systems are, sequence of transformations. A PADDLE however, unique in the way prototypes development history is essentially a pro- are reused. In conventional software de- gram that defines a sequence of transfor- velopment, a prototype serves to validate mations, but the design decisions that the requirements of a system and to went into writing the program are not identify the critical implementation is- saved. As a result, PADDLE develop- sues, but typically the prototype is a ment histories are difficult to understand throwaway system. In this case software and modify. developers reuse the informal experience Glitter is a transformational system gained in designing and implementing that addresses this problem. The design the prototype. With transformational decisions that a software developer systems, however, rather than throwing makes while applying a sequence of the prototype away it can be reused for transformations are explicitly encoded in two other purposes: (1) a formal specifi- rules [Fickas 1985]. These rules provide cation of the system and (2) as the basis a level of abstraction for the development for applying a sequence of transforma- history. The rules are the abstraction tions that transform the formal specifica- specification, and the sequence of tion directly into an efficiently exe- transfor-mations correspond to the cutable implementation [Cheatham abstractionrealization. Rules make devel- 1989]. opment histories easier to read, under- stand, reuse, and modify. Glitter can au- tomatically map design decisions into an 9.1.2 Development Histories appropriate sequence of transformations. One of the key benefits of the transfor- Thus, a software developer can specify mational approach is that software de- what needs to be accomplished by the velopers perform maintenance and evolu- transformations, and Glitter will figure tion on the specification of the system, out how to do it. Therefore, Glitter rules not on the implementation. This is isolate the software developer from the advantageous since information in a hidden and realization parts of the de- specification is often localized and inde- sign history abstraction. Glitter rules are pendent, whereas information at the im- described in more detail later. plementation level is often widely dis- persed and tightly interdependent for the 9.1.3 Transformations sake of efficiency. It is much easier to understand and modify the information A transformation is a mapping from syn- before it has been dispersed throughout tactic patterns of code into functionally an implementation [Balzer 1989]. equivalent, more efficient patterns of When a software developer makes code. A transformation has a match pat- modifications at the specification level, tern, applicability conditions, and a sub- transformations have to be applied again stitution pattern. The match pattern is a to produce a new efficient implementa- template of code that defines what to tion. When modifications to the specifica- search for in the initial program, and the

ACM Computing Surveys, Vol. 24, No. 2, June 1992 172 “ Charles W. Krueger substitution pattern defines a pattern of may involve nested subgoals or the code that replaces the constructs in the application of transformations. transformed program. The applicability Selection rationale, Guidelines on how conditions define semantic constraints on to select among the different strate- when the transformation should be ap- gies. plied. For example, the following tem- plate provides an implementation for in- For example, one goal in transforming serting an element in a queue [ Cheatham a specification into an efficient imple- 1989]; the first line is the match pattern, mentation is to replace all references to and the remainder of the template is the past execution states with references to substitution pattern (there are no appli- information in the current state. The cability conditions in this example): strategies for doing this include

Insert $element Into $queue == ) = Caching historical information until it begin is needed (eager evaluation) $queue.count := $queue.count + 1; Q Computing the historical information $queue.llst[$queue count] := $element; from information in the current state end (lazy evaluation)

$element and $queue are pattern uari- The selection rationale for choosing a bles that represent arbitrary syntactic among these alternatives include code fragments. At transformation time, these code fragments from the initial ● Space/time tradeoffs for caching ver- program are inserted verbatim in the sus derivation corresponding pattern variables in the ● Whether or not it is possible to re- substitution pattern. derive historical references from the Many transformations are reusable. current context For example, the following are common Glitter automates as much of the identity transformations on predicate transformation process as possible. In specifications: situations in which strategies are incom- NOT NOT $predicate = = ) $predicate plete or selection rationale fails, how- $predlcate AND $predlcate == )$predlcate ever, the system must ask the software developer for guidance. Fickas [1985] re- 9.2 Selection ports that Glitter can automate any- where from 6570 to 90 Yo of the transfor- Glitter uses expert system technology to mation selections and applications. Note help software developers select from a that if Glitter could automate 100%, it collection of reusable transformations. As would be a compiler for the specification described earlier, Glitter rules are ab- language. stractions that explicitly describe what a transformation or sequence of transfor- 9.3 Specialization mations will accomplish. More specifi- cally, rules embody expert knowledge After being selected, transformations are about how to accomplish goals during typically applied without the software program transformation. Goals can be developer having to modify or specialize accomplished by either applying other them. Pattern variables in a transforma- rules or directly applying transforma- tion are a generalization, but the pattern tions. variables are automatically specialized Glitter rules have three parts: with existing code fragments in the pro- gram when a transformation is applied. Goal. The transformation issue that is Therefore, from the software developer’s addressed by the rule. point of view, specialization is in the hid- Strategies. A list of alternative ways to den part of the transformation abstrac- accomplish the goal. Each strategy tion.

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 173

9.4 Integration oper in the compilation process, transfor- mational systems increase the cognitive The integration of transformations is distance in order to achieve better run- analogous to functional composition. For time performance. example, the application of two transfor- Expert system technology has proven mations to a program can be expressed effective for reusing transformations. The as either the sequence of two mappings rules in Glitter represent expert knowl- or the application of a composed map- edge about applying transformations to ping; that is, achieve particular goals during the POXTO+P1 transformation phase. Glitter can be PI x T, - Pflnal viewed as an interface to a library of reusable transformations. Software de- or velopers present high-level goals to Glit- PO x (Tl O TO) + pf,n,l ter, and Glitter helps locate the reusable transformations that satisfy those goals. The only time software developers need The primary challenge for the transfor- to be concerned about the composition of mational approach is to identify the types transformations is when the the order of of optimizations that software developers application is significant. For example, use to implement efficient systems and to when PA is not identical to P~ in codify these optimizations into transfor- F’. X (Tl O TO) - pp, mations and rules. T,able 7 summarizes PO x (TO O T,) + pB. the transformational system approach to PA and P~ will be semantically equiva- software reuse. lent since transformations always pre- serve semantics, but they may differ in 10. SOFTWARE ARCHITECTURES terms of implementation characteristics. Reusable software architectures are large-grain software frameworks and 9.5 Appraisal of Reuse Techniques in subsystems that capture the global struc- Transformational Systems ture of a software system design. This The VHLL used in the first phase of a large-scale global structure represents a transformational system can have higher significant design and implementation level abstractions than pure VHLLS such effort that is reused as a whole. Reuse at as SETL, PAISLey, and MODEL. The this scale offers significant leverage in run-time performance of these higher the development of software. level abstractions is very poor with the Examples of reusable software archi- current VHLL compiler technology, but tectures include database subsystems the human-guided transformations of the that are specialized and reused in differ- transformational approach produce more ent applications; the framework for a efficient implementations. Since trans- compiler into which different lexical ana- formational systems can use higher level lyzers, parsers, and code generators can abstractions than pure VHLLS, they ex- be inserted; rule-based and blackboard hibit a smaller cognitive distance be- architectures for expert systems; adapt- tween informal requirements of a soft- able user interface architectures; and ware system and its written specification even design. frameworks for oscilloscopes [Balzer 1989; Feather 1989]. [Garlan 1990; Kruegelr 1992; Lane 1990; The second phase in the transforma- Shaw 1991]. tional approach is essentially a human- Software architectures are analogous guided compilation. Compared to VHLLS to very large-scale software schemas. that use fully automated compilation, Software architectures, however, focus on software developers expend additional ef- subsystems and their interaction rather fort to produce an efficiently executable than data structures and algorithms system. By involving the software devel- [Shaw 1989]. Software architectures are

ACM Computing Surveys, Vol. 24, No. 2, June 1992 174 * Charles W. Krueger

Table 7. Reuse a- Transformahonal Systems

Abstraction Transformations are reusable artifacts that describe how to map source code fragments into semantically equivalent, but more efficient, source code frag- ments. Rules are abstractions for expert knowledge about how to apply transformations to achieve higher level goals. Selection Reusable transformations can be stored in a library, Selection of transformations from this library can be enhanced by rule-based analysis that identifies sequences of transformations that satisfy high-level transformation goals. Specialization Transformations are typically apphed without modification. Therefore, special- ization is typically not an issue for transformation reuse, Integration The integration of transformations M Implicit m the order in which they are applied. The issues are similar to functional composition. Pros Transformational systems use general-purpose, high-level programming abstrac- tions. These abstractions can be at a higher level than with VHLLS because the human-gmded transformations produce more efficient implementations than m possible with VHLL compilers. The higher level abstractions reduce cognitive distance. Cons Applying transformations requires time and effort from the software developer, thereby increasing the cognitive distance. As transformation systems improve, however, more of this effort is automated.

also analogous to application generators be used as building blocks for higher level in that large-scale system designs are architecture generators. Therefore, Draco reused. Application generators, however, is an architecture generator generator. are typically standalone systems with In Draco, each software architecture implicit architectures, whereas software has a domain language and a set of com - architectures can often be explicitly spe- ponents that implement the domain lan- cialized and integrated with other archi- guage. The domain language corresponds tectures to create many different compos- to the abstraction specification for an ar- ite architectures. chitecture. It captures the relevant ab- It is interesting to note that some of stractions (i.e., objects and operations) the more powerful reuse techniques dis- for a software architecture in a particu- cussed earlier implicitly capitalize on lar domain. The design of domain lan- reuse at the software architecture level. guages is identical to the design of the For example, experienced software engi- abstraction specifications for conven- neers scavenge previous designs with tional application generators, as de- great leverage. Application generators scribed in Section 7.5.2: reuse significant portions of system de- Recognizing an appropriate domain signs and implementations within each Defining domain boundaries (domain generated system. Transformational sys- tems reuse the design of efficient imple- coverage) mentations for high-level specifications. Defining the underlying abstractions Prototyping and experience provide soft- Defining the variant and invariant ware developers with positive and nega- parts of the abstraction tive forms of design reuse (i.e., designs Defining the form of the language that work versus those that do not). The art of domain analysis and do- main language design has become an in- 10.1 Abstraction in Software Architectures dependent area of research [Prieto-Diaz Draco is an example software architec- and Arango 1991]. Example domains ture technology [Freeman 1987a; Neigh- range from relatively small areas, such bors 1984, 1989]. With Draco, each soft- as sets and stacks, to arbitrarily large ware architecture is encapsulated in its application domains, such as navigation, own application generator. The output process control, compilation, and data from these “architecture generators” can processing.

ACM Computing Surveys, Vol. 24, No 2, June 1992 Softwai-e Reuse ● 175

Draco components are written to im- tion, the new software architecture be- plement each object and operation in a comes available for reuse in subsequent domain language. Components therefore application and architecture designs. correspond to the abstraction realization level for a software architecture. These 10.3 Specialization components define the execution seman- tics of a domain language, making it exe- Once a system is built using a Draco cutable. software architecture, it can be special- Draco provides software developers ized for more efficient execution in two with a recursive model of reuse. The ways: source-to-source transformations components that implement a new do- and component refinement. main language are written using other Source-to-source transformations, like domain languages (possibly including the those described in Section 9, are opti- base language Lisp). For example, the mization applied to programs written in implementation of a parser architecture a domain language. These transforma- might use an existing domain language tions produce semantically equivalent but for tree-structured data. A reusable do- more efficient specifications in the same main language built on several recursive domain language. For example, in an al- layers of domain languages represents a gebraic domain, the following simple significant amount of reuse. transformation converts exponentiation The software developer only deals with to multiplication for the special case of the top layer domain language in such a squares [Neighbors 1984]: recursive reuse structure. The nested do- $X2 == )$X*$X main languages used to implement the top layer language are in the hidden part Component refinements are alternative of the top layer abstraction. implementations of the same domain language construct that have different 10.2 Selection performance characteristics. This is ex- actly like the multiple schema implemen- Once the domain language for a Draco tations in Section 6.3. For example, in an software architecture is written and the algebraic domain, the exponential opera- components implemented, it is available tor in the domain language could be im- for use (reuse) in building applications or plemented. by both the binary shift new software architectures. As with li- method and the Taylor expansion braries of other types of software arti- method. The software developer using facts, a large collection of software archi- this domain language would choose the tectures must be accompanied by tools component refinement with the best exe- and techniques for selection (classifica- cution profile for a particular application. tion, searching, and exposition). In par- ticular, a library system must help 10.4 Integration software developers find software archi- tectures with the smallest cognitive dis- The primary difference between Draco tance between the architecture and the and conventional application generators application requirements. is that Draco allows an arbitrary collec- Analogous to application generators, tion of software architectures to be inte- when an appropriate software architec- grated into a single application, whereas ture does not exist for an application, it conventional application generators are may still be advantageous to create a typically standalone. This flexibility adds new reusable architecture. This up-front considerable power to the Draco model. A cost is justified in cases in which applica- reusable architecture can be used as a tion will be modified many times in its subsystem in many larger architectures life or when many prototypes are needed rather than being limited to a single ap- to converge on a usable system. In addi- plication generator.

ACM Computing Surveys, Vol. 24, No. 2, June 1992 176 ● Charles W. Krueger

Table 8. Reuse m Software Architectures

Abstraction Architectural abstractions come dmectly from application domains. High-level domain abstractions are captured in the “domain language” for an architecture and are automatically mapped mto executable implementations Selection The analogy between software schemas and software architectures suggests that hbrary techniques for doing classification, search, and exposition could he used to select among a collection of reusable software architectures. Specialization Software developers can specialize software architectures to produce implemen- tations that match the performance reqmrernents of an application. Specializa- tion may be horuontut with source-to-source transformations or zertical with component refinements. Integration Integrating different domain languages into a single apphcation requires a special-purpose module interconnection language. For example, the Draco module interconnection language checks for type representation consistency and for domain-specific semantic consistency in the architecture interfaces. Pros Application developers use high-level abstractions from an application domain to instantiate and compose software architectures. Therefore, like application generators, the cogmtive distance is small. In addition, reusable architectures can be used either stand-alone to create end-user applications or as building blocks to create higher level software architectures. Creating reusable software architectures is difficult, and many such architec- tures wdl be needed to populate a general-purpose library of architectures.

Special care must be taken in the Draco the software developer can apply trans- module interconnection language since formations and choose among different multiple software architectures with a component refinements to increase effi- variety of domain languages may be inte- ciency). This automation isolates the grated into a single application. For ex- software developer from the hidden and ample, the interconnection language realization parts of the abstraction. must accommodate different domain lan- Software architectures can be used as guages and even different component re- either standalone application generators finements with different representations to create end-user applications or as of the same data object. When objects building blocks for creating higher level are passed in and out of functions, type architectures. Therefore, in cases in representation consistency must be which a reusable software architecture maintained. In addition to syntactic con- does not exist for a particular applica- sistency, semantic consistency can be tion, lower level software architectures maintained by attaching preconditions can be composed to create the new archi- and postconditions to the interface of a tecture. component refinement. Similar to other reuse techniques, it will be difficult to build a large, general- 10.5 Appraisal of Reuse Techniques in purpose library of high-quality reusable Software Architectures software architectures [Neighbors 1989]. This is due to both the challenge of iden- As with application generators, the lever- tifying and building appropriate software age offered by software architectures architectures and the difficulty of identi- comes from the small cognitive distance fying and building effective library tools. between informal concepts in an applica- Table 8 summarizes the software archi- tion domain and executable implementa- tecture approach to software reuse. tions. The software developer using a software architecture typically works at 11. SUMMARY the abstraction specification level. The mapping from abstraction specification to In this survey, a broad range of software abstraction realization is mostly auto- reuse techniques from the research liter- mated (although in systems like Draco ature was described and compared. The

ACM Computing Surveys, Vol. 24, No 2, June 1992 Software Reuse ● 177

different approaches to software reuse (2) Reduce the amount of intellectual ef- were partitioned into categories then fort required to produce an exe- compared using a uniform taxonomy. cutable system once the software de- veloper has produced a specification 11.1 Categories and Taxonomy in terms of abstractions of the reuse technique. For this survey, the field of software reuse was partitioned into eight cate- To reduce cognitive distance with the gories, each of which illustrates a charac- first method, implementors of a reuse teristic set of issues and techniques in technique must move the abstraction software reuse: specifications in the technique closer to

● High-level languages the abstractions used to reason about ap- plications informally. For example, appli- ● Design and code scavenging cation generators use domain-specific ab- ● Source code components stractions for their programming model. ● Software schemas Following is an approximate ranking o Application generators of the eight reuse categories, rated on * Very high-level languages how well they minimize cognitive dis- e Transformational systems tance by the first method (where one is best): ● Software architectures

The following taxonomy for the cate- (1)Application generators gories allowed us to compare and con- (2) Software architectures trast different reuse techniques as well (3) Transformational systems as illustrate some of the fundamental issues in software reuse: (4) Very high-level languages (5) Software schemas = Abstraction (6) Source code components ● Selection (7) Code scavenging ~ Classification (8) High-level languages = Retrieval 9 Exposition To reduce cognitive distance with the 0 Specialization second method, imp] ementors of a reuse . Integration technique must partially or fully auto- mate the mapping from abstraction spec- 11.2 Cognitive Distance ifications to executable abstraction real- izations. Cognitive distance is reduced or The notion of cognitive distance was in- eliminated by minimizing the need for troduced as an intuitive gauge to com- software developers to see and under- pare the effectiveness of reuse tech- stand the hidden and realization parts of niques. Cognitive distance is informally the abstractions. For example, very defined as the amount of intellectual ef- high-level languages use compilers or in- fort that must be expended by software terpreters to map abstract specifications developers in order to take a software in the language directly into an exe- system from one stage of development to cutable form. another. Following is an approximate ranking Software reuse techniques can reduce of the eight reuse categories, rated on cognitive distance in two ways: how well they minimize cognitive dis- (1) Reduce the amount of intellectual ef- tance by the second method (where one is fort required to go from the initial best): conceptualization of a system to a specification of the system in abstrac- (1) High-level languages tions of the reuse technique. (2) Very high-level languages

ACM Comnutin~ Survevs. Vol. 24, No. 2, June 1992 178 ● Charles W. Krueger

(3) Application generators required to build software systems, The (4) Software architectures quality of software systems is en- hanced by reusing quality software ar- (5) Transformational systems tifacts, which also reduces the time and (6) Software schemas effort required to maintain software (7) Source code components systems. (8) Code scavenging What is required to implement a soft- ware reuse technology? Creating a An ideal software reuse technology complex executable system with fewer would use both approaches to reduce cog- key strokes and less cognitive burden nitive distance. That is. a software devel- on software developers clearly implies oper using this technology would quickly a higher level of abstraction. For a soft- be able to select, specialize, and integrate ware developer to select, specialize, and abstraction specifications. that satisfied a integrate reusable artifacts success- particular set of informal requirements, fully, the reuse technology must pro- and the abstraction specifications would vide natural, succinct, high-level ab- be automatically translated into an exe- stractions that describe artifacts in cutable svstem. In addition. this ideal terms of “what” they do rather than technolo& could be used in ‘all applica- “how” they do it. The ability of a soft- tion domains. ware developer to practice software Of the eight reuse technologies exam- reuse is limited primarily by the ability ined in this report, the reusable software to reason in terms of these abstrac- architectures probably come closest to tions. In other words, there must be a this ideal description.. A “~erfected. form small cognitive distance between infor- of this technology would have a complete mal reasoning and the abstract con- library of useful software architectures, cepts defined by the reuse technology. ranging from small-grained domains such In areas of research that strive to as sets and stacks to large-grained appli- raise the level of abstraction in soft- cation domains such as airline naviga- ware development, much of the lever- tion control systems. The library system age has come from software reuse. In would allow software developers to locate fact, much of the software reuse litera- the most appropriate reusable artifacts ture is not about new techniques devel- for their application easily. The software oped from first principles of reuse but architectures would be specialized and rather about ongoing research projects integrated in terms of concise high-level that have discovered that reuse is re- abstraction specifications from the appli- sponsible for the advantages offered by cation domain. Executable svstems would high-level abstractions in their sys- be automatically produce~ from these tems. high-level specifications. Why is software reuse difficult? Useful abstractions for large, complex, 11.3 General Conclusions reusable software artifacts will typi- cally be complex. In order to use these * What is software reuse? Software reuse artifacts, software developers must ei- is the process of using existing soft- ther be familiar with the abstractions a ware artifacts rather than building priori or must take time to study and them from scratch. Typically, reuse in- understand the abstractions. The latter volves the selection, specialization, and case can defeat some or all of the gains integration of artifacts, although differ- in reusing an artifact. The former case ent reuse techniques may emphasize or is where we have seen significant suc- reemphasize certain of these. cesses in reuse. Examples ● Why reuse software artifacts? The pri- include (1) math libraries used by de- mary motivation to reuse software arti- velopers who are familiar with the facts is to reduce the time and effort mathematical concepts, (2) abstract

ACM Computmg Surveys, Vol. 24, No 2, June 1992 Software Reuse . 179

data type schemas such as stacks and BALZER, R. 1989. A 15 year perspective on auto- queues used by developers who are fa- matic programming. In Frontier Series: Soft- ware ReusabdLty: Volume 11—Applications and miliar with the semantics, and (3) Experience. Biggerstaff, T. J., and Perlis, A. J., domain-specific application generators Eds. ACM Press, New York, pp. 289-311, Chap. and domain languages used by devel- 14. Originally Balzer [1985]. opers who are familiar with concepts in BENTLEY, J. 1986. Programming pearls. Comm un. a particular application domain. ACM 29, 5 (May), 364-369. BIGGERSTMW, T. 1987. Hypermedia as a tool to aid The following truisms were included ear- large scale reuse. In F’roceedmgs of the Work- lier in the survey. They are simple state- shop on Software Reuse (Oct.). Rocky Mountain ments on reuse that have been difficult Institute of Software Engineering, Boulder, CO1O. to satisfy in practice. BIGGERSTAFF, T. J,, AND PERLIS, A. J., EDS. 1989a. For a software reuse technique to be effectwe, it Frontier Series: So filoare Reusability: Volume I must reduce the cogntive distance between the —Concepts and Models, ACM press, New York. mtial concept of a system and Its final executable BIGGERSTAFF, T. J., AND PERLIS, A. J., EDS. 1989b. Implementation. Frontier Series: Software Reusability: Volume II—Applications and Experience. ACM Press, For a software reuse technique to be effective, it New York. must be easier to reuse the artifacts than it IS to BIGGERSTAFF, T., AND RICHTER, C. 1987. Reusabil- develop the software from scratch. ity framework, assessment, and directions. To select an artifact for reuse, you must know what IEEE Softw. 4, 2 (Mar.), 41–49. Also in Tracz It does. [19881 and Biggerstaff and Richter [1989]. To reuse a software artifact effectively, you must be BIGGERSTAFF, T., AND RICHTER, C. 1989. Reusabil- able to “find it” faster than you could “build it, ” ity framework, assessment, and directions. In Frontier Series: Software Reusability: Volume What are the open areas of research in software I-Concepts and Models. Biggerstaff, T. J., and reuse? There IS clearly much work that remams to Perlis A. J., Eds. ACM Press, New York, pp. be done. In general, the search for high-level ab- 1–17, Chap. 1. Originally Biggerstaff and stractions for software artifacts is probably the most Richter [ 1987]. crucial, but thw type of research IS not new or BOEHM, B. W. 1987. Improving software produc- unique to software reuse. tivity. IEEE Comput. 20, 9 (Sept.), 43-57. Library issues are also critical to many Booc~, G. 1987. Software Components with Ada: Structures, Tools, and Subsystems. Benjamin/ reuse technologies. These issues, how- Cummings Publishing Company, Inc., Menlo ever, lead right back to abstraction since Park, Calif. classification, search, and exposition are BROOKS, F. P. JR. 1975. The Mythical Man-Month. heavily dependent on high-level abstrac- Addison-Wesley, Reading, Mass. tions for software artifacts in a library. BROOKS, F. P. 1987. No silver bullet: Essence and accidents of software engineering. IEEE Com- ACKNOWLEDGMENTS put. 20, 4 (Apr.), 10-19. CHEATHAM, T. E. JR. 1983. Reusability through Nico Habermann contributed greatly to this survey program transformations. In Workshop on through his patient reading, insightful comments, Reusability in Programming (Newport, RI, and thoughtful discussion. John McDermott, Bar- Sept.). ITT Programming, Stratford, Corm., pp. 122-128. Also in Cheatham [ 1984, 1989]. bara Staudt Lerner, Peter Feiler, John Ockerbloom, and the anonymous reviewers all provided valuable CHEATHAM, T. E. JR. 1984. Reusability through feedback and detailed comments. Ellen Douglas program transformat~ons. IEEE Trans. Softw. Eng. SE-10, 5 (Sept.), 589–594. Originally gave many suggestions on improving the technical Cheathalm [1983]. writing style (although I admittedly have not done CHEATHAM, T. E. JR. 1989. Reusability through justice to her efforts). My initial interest in the program transformations. In Frontier Series: subject of software reuse was sparked in a discus- Software Reusability: Volume I—Concepts and sion group headed by Mary Shaw at the Software Models. Biggerstaff, T. J., and Perlis, A. J., Engineering Institute. Eds., ACM Press, New York, pp. 321-335, Chap. 13. Originally Cheatham [ 1983]. REFERENCES CHENG, T. T., LOCK, E. D., AND PRYWES, N. S. 1984, Use of very high level languages and program BALZER, R. 1985. A 15 year perspective on auto- generation by management professionals. matic programming. IEEE Trans. Softw. Eng. IEEE Trans. Softw. Eng. SE-10, 5 (Sept.), SE-II, 11 (Nov.), 1257-1268. 552-563.

ACM Computing Surveys, Vol. 24, No. 2, June 1992 180 ● Charles W. Krueger

CLEAVELAND, J. C. 1988. Building application gen- FEATHER, M, S, 1989. Reuse in the context of a erators. IEEE Softw. 5, 4 (July), 25–33. Also in transformation based methodology. In Frontier Prieto-Diaz and Arango [ 1991]. Series: Software Reusability: Volume I—Con- DEREMER, F., AND RRON, H. H, 1976. Program- cepts and Models. Biggerstaff, T. J., and Perlis, ming-in-the-large versus programming-in-the- A, J., Eds. ACM Press, New York, pp. 337-359, small. IEEE Trans. Softw. Erg. SE-2, 2 (June), Chap. 14. Originally Feather [1983]. 80-86. FICKAS, S. F 1985. Automating the transforma- DEUTSCH, P. L. 1983. Reusability in the Small- tional development of software. IEEE Trans. talk-80 programming system. In Workshop on Softw. Eng. SE-11, 11 (Nov.), 1268-1277. Reusabdlty m Programming (Newport, R. I. FREE MAN, P. 1983. Reusable software engineer- Sept. ). ITT Programming, Stratford, Corm., pp. ing: Concepts and research directions. In Work- 72-76. Also in Freeman [ 1987 b]. shop on Reusabdtty zn Programmmg (Newport, DEUTSCH, L, P. 1989. Design reuse and frame- R. I., Sept.). ITT Programming, Stratford, Corm., pp. 2-16. Also in Freeman [ 1987b] works in the Smalltalk-80 system. In Frontier Sertes: Software Reusability: Volume II—Ap - FREEW, P. 1987a. A conceptual analysis of the plwations and E:

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse ● 181

ICHBIAH, J. D. 1983. On the design of Ada. In management architectures. Ph.D. dissertation, Information Processing 83. Mason, R. E.A., Ed. Carnegie-Mellon Univ. IFIP, Elsevier Science Pub., New York, pp. LANE, T. G. 1990. User interface software struc- 1-1o. tures. Ph.D. dissertation, Carnegie-Mellon IMSL 1987. IMSL Math/Library User’s Manual. Univ. 1.0 Edition, Houston, Tex. LATOUR, L., AND JOHNSON, E. 1988. Seer: A graphi- JOHNSON, S. C. 1979. Yacc: Yet Another Compder- cal retrieval system for reusable Ada software Compiler in the UNIX Programmer’s Manual modules. In The 3rd International IEEE Con- —Supplementary Documents. 7th ed. AT&T ference on Ada Applications and Environments Bell Laboratories, Indianapolis, Ind. (Manchester, N. H., May). IEEE Computer So- KAISER, G. E. 1985. Semantics for structure edit- ciety Press, Los Alamitos, Calif., pp. 105–113. ing environments. Ph.D. dissertation, LESK, M. E., AND SCHMIDT, E. 1979. Lex: A Lexical Carnegie-Mellon Univ. Analyzer Generator in the UNIX Programmer’s Manual—Supplementary Documents, 7th ed. KAISER, G. E. 1989. Incremental dynamic seman- AT& T Bell Laboratories, Indianapolis, Ind. tics for language-based programming environ- ments. ACM Trans. Program. Lang. Syst. 11, LEVY, L. S. 1986. A metaprogramming method 2 (Apr.), 169-193. See also Kaiser [ 1985]. and its economic justification. IEEE Trans. Softw. Eng. SE-12, 2 (Feb.), 272-277. KAISER, G. E., AND GARLAN, D. 1987. Melding soft- ware systems from reusable building blocks. LIEBERHERR, K. J., AND lRI~L, A. J. 1988. Demeter: IEEE Softw. 4, 4 (July), 17–24. Also in Tracz A CASE study of software growth through pa- [1988]. rametrized classes. In The 10th International Conference on Software Engineering (Singa- KAISER, G. E., AND GARLAN, D. 1989. Synthesizing pore, Apr.). IEEE Compu+er Society Press, Los programming environments from reusable fea- Alamitos, Calif., pp. 254-264. tures. In Frontier Series: Software Reusability: Volume 11—Apphcations and Experience. Elig- LISKOV, B. 1987. Data abstraction and hierarchy. gerstaff, T. J., and Perlis, A. J., Eds. ACM In 00PSLA ’87, Addendum to the Proceedings Press, New York, pp. 35-55, Chap. 2. (Orlando, Fla., Oct.). ACM Sigplan, New York, pp. 17--34. KATZ, S., RICHTER, C. A., AND THE, K. 1987. PARIS: A system for reusing partially interpreted LnJ, S., AND PAIGE, R. 1979. Data structure schemas. In 9th International Conference on choice/formal differentiation. Tech. Rep. NSO- Software Engineering (Monterey, Calif., Mar.). 15, Courant Institute of Mathematical Sci- IEEE Computer Society Press, Los Alamitos, ences, New York. Calif., pp. 377-385. Also in Tracz [1988] and LOECKX, J., AND SIEBER, K. 1984. Wiley and Teub- Katz et al. [1989]. ner Series in Computer Science: The Founda- KATZ, S., RICHTER, C. A., AND THE, K. 1989. PARIS: tions of Program Verzfzcation. John Wiley, New York. A system for reusing partially interpreted schemas. In Frontier Series: Software LUCKHAM, D. C., AND VON HENKE, F. W. 1984. An Reusability: Volume I—Concepts and Models. overview of Anna, a specification language for Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Ada. In 1984 Conference on Ada Applications Press, New York, pp. 257-273, Chap. 10. Origi- and Envu-onments (St. Paul, Minn., Oct.). IEEE nally Katz et al. [1987]. Computer Society Press, Los Alamitos, Calif., pp. 116-127. KERNIGHAN, B. W. 1983. The Unix system and software reusability. In Workshop on Reusabil- MARCUS, S., ED. 1988. Kluwer International Series ity m Programming (Newport, R. I., Sept.). ITT in Engineering and Computer Science: Auto- Programming, Stratford, Corm., pp. 235-239. matic Knowledge Acquisition for Expert Sys- Also in Kernighan [1984] and Freeman [ 1987b]. tems. Kluwer Academic Publishers, Boston. Mass. KERNIGHAN, B. W. 1984. The Unix system and software reusability. IEEE Trans. Softw. Eng. MCDERMOTT, J. 1986. Making expert systems ex- SE-10, 5 (Sept.), 513–518. Originally Kernig- plicit. In Information Processing 86. IFIP, El- han [1983]. sevier Science Pub., New York, pp. 539–544. KNOTH, D., E. 1973. The Art of Computer Pro- MCDERMO’I”r, J. 1988. Preliminary steps toward a gramming. Addison-Wesley, Reading, Mass. taxonomy of problem-solving methods. In Kluwer International Series in Engineering and KRUCHTEN, P., SCHONB~RG, E., AND SCHWARTZ, J. Computer Science. Automatic Knowledge Ac- 1984. Software prototyping using, the SETL quisition for Expert Systems. Marcus, S., Ed., programming language. IEEE Softw. 1, 4 Kluwer Academic Publishers, Boston, Mass., (Oct.), 66-75. pp. 225–256, Chap. 8. KRUEGER, C. W. 1987. Expert system engineering MCILROY, M. D., 1968. Mass produced software and its relation to conventional software engi- components. In Sofiware Engineering; Report neering. Ph.D. area qualifier paper. Carnegie- on a conference by the NATO Science Commit- Mellon Univ., Pittsburgh, Penn. Available from tee (Garmisch, Germany, Oct.). Naur, P., and author on request. Randell, B., Eds. NATO Scientific Affairs Divi- KRUEGER, C. W. 1992. Application-specific object sion, Brussels, Belgium, pp. 138–150.

ACM Computing Surveys, Vol. 24, No. 2, June 1992 182 * Charles W. Krueger

MENDAL, G. O., 1986. Designing for Ada reuse: A PRIETO-DIAZ, R., AND ARANGO, G., EDS. 1991. Do- case study. In 2nd Znternatzonal Conference on main Analysis and Software System Modeling. Ada Applications and Environments (Miami IEEE Computer Society Press, Los Alamitos, Beach, Fla., Apr.). IEEE Computer Society Calif. Press, Los Alamitos, Calif., pp. 33-42. PRIETO-DIAZ, R., AND FREEMAN, P. 1987. Classify- MEYER, B, 1987. Reusability: The case for object- ing software for reusability IEEE Soff w. 4, 1 oriented design. IEEE Softw. 4, 2 (Mar.), (Jan.), 6-16. Also in Freeman [1987b]. 50-63. Also m Tracz [1988] and Meyer [1989]. PRIETO-DIAZ, R., AND NEIGHBORS, J. M. 1986. Mod- MEYER, B. 1989. Reusability: The case for obJect- ule interconnection languages. J. Syst. SoftZL1, oriented design. In Front~er Series: Softu,are 6, 4 (Nov.), 307-334. Also in Freeman [1987b]. Reusability: Volume 11—Applmat~ons and E.Y- PRYWES, N. S., AND LOCK, E. D. 1989. Use of the per~ence. Biggerstaff, T. J., and Perlis. A. J,, model equational language and program gener- Eds. ACM Press, New York, pp. 1-33, Chap. 1. ator by management professionals. In Frontier Originally Meyer [1987]. Series: Software Reusabdify: Volume II—Ap - NAUER, P., AND RANDELL, B., EDS. 1968. Software plicatzons and Experience. Blggerstaff, T. J., Engineering; Report on a Conference by the and Perlis, A, J., Eds. ACM Press, New York, NATO Science Committee. NATO Scientific Af- pp. 103–129, Chap. 5, fairs Division, Brussels, Belgium. REPS, T., AND TEITELBAUM, T, 1984. The synthe- NEIGHBORS, J. M. 1983. The Draco approach to sizer generator. In Proceedings of the ACM constructing software from reusable compo- Slgsoft/Slgplan Software Engzneenng Sympo- nents. In Workshop on Reusability in Program- sum on Pructlcal Software De L~elopmen t Envi- ming (Newport, R. I., Sept.). ITT Programming, ronments (Pittsburgh, Penn., Apr.). ACM SIG- Stratford, Corm., pp. 167-178. Also in Neigh- SOFT/SIGPLAN, New York, pp. 42-48, bors [ 1984] and Freeman [ 1987b]. RICH, C., AND WATERS, R. 1983. Formalizing NEIGHBORS, J. M. 1984. The Draco approach to reusable software components. In Workshop on constructing software from reusable compo- Reusability m Programmmg (Newport, R. I,, nents. IEEE Trans. Softw. Eng. SE-10, 5 Sept.). ITT Programming, ITT, Stratford, (Sept.), 564-574. Originally Neighbors [1983]. Corm., pp. 152-159. NIZIGHBORS, J. M. 1989. Draco: A method for en~- RICH, C., AND WATERS, R. C. 1988. Automatic pro- neering reusable software systems. In Frontier gramming: Myths and prospects. IEEE Com - Series: Software Reusability: Volume I—Con- put 21, 8 (Aug.), 40-51, cepfs and Models. Biggerstaff, T. J., and Perlis, RICH, C., AND WATERS, R, C. 1989. Formalizing A. J., Eds. ACM Press, New York, pp. 295-319, reusable software components in the program- Chap. 12, Also in Prieto-Diaz and Arango mer’s apprentice. In Frontier Series: Software [1991]. Reusabdify: Volume 11—Applications and Ex- PARNAS, D. L., CLEMENTS, P. C., AND WEISS, D. M. perience. Biggerstaff, T. J., and Perlis, A. J., 1983. Enhancing reusability with information Eds. ACM Press, New York, pp. 313-343, Chap, hiding. In Workshop on Reusability in Pro- 15. Expanded version of Rich and Waters gramming (Newport, R. I., Sept.). ITT Program- [1983]. ming, Stratford, Corm., pp. 240–247, Also in SHAW, M. 1984. Abstraction techniques in modern Freeman [ 1987b] and Parnas et al. [ 1989], programming languages. IEEE Softw. 1, 4 PARNAS, D, L., CLEMENTS, P. C,, MVD WEISS, D. M. (Oct.), 10-26. 1989. Enhancing reusability with information SHAW, M, 1989. Larger scale systems require hiding. In Frontier Ser~es: Software Reusabll- higher-level abstractions. In Proceeding.s of’ the lty: Volume I—Concepts and Models. 131gger- 5th Internat~onal Workshop on Software Speci- staff, T. J., and Perlis, A, J., Eds. ACM Press, fication and Deszgn (May). IEEE Computer So- New York, pp. 141-157, Chap. 6. Originally ciety Press, Los Alamitos, Cahf,, pp. 143–146. Parnas and Clements [1983]. SHAW, M. 1991, Heterogeneous design idioms for PARTSCH, H., AND STEINBRUGGEN, R. 1983. Pro- gram transformation systems. ACM C“omput. software architectures. In Proceedl ngs of the 6th Internat~onal Workshop on Software Specl- Surv. 15, 3 (Sept.), 199-236. fzcafion and Design (Como, Italy, Oct.), IEEE PAYTON, T,, KELLER, S., PERKINS, J., ROWLAN, S., Computer Society Press, Los Alamitos, Calif., AND MARDINLY, S. 1982. SSAGS: A syntax and pp. 1–8. semantics analysis and generation system. In 6th International Computer Software and Ap- STANDISH, T. A. 1984, An essay on software reuse. plications Conference (COMPSAC82 ) (Chicago, IEEE Trans. Softw. Eng. SE-10, 5 (Sept.), Ill., Nov.). IEEE Computer Society Press, Los 494-497. Alamitos, Cahf., pp. 424-432. STAUDT, B. J., KRUEGER, C. W., HABERMANN, A, N., PRIETO-DLAZ, R. 1989. Classification of reusable AND AMBRIOLA, V. 1986. The GANDALF sys- modules. In Frontier Series: Soffware Reus- tem reference manuals. Tech, Rep. CMU-CS- ability: Volume I—Concepts and Models. Big- 86130, Carnegie-Mellon Univ., Pittsburgh, gerstaff, T. J., and Perlis, A. J., Eds. ACM Penn. Press, New York, pp. 99–123, Chap. 4. STEFIK, M., AND BOBROW, D. G. 1986. Object-ori-

ACM Computing Surveys, Vol. 24, No. 2, June 1992 Software Reuse * 183

ented programming: Themes and variations. Perlis, A. J., Eds. ACM Press, New York, pp. The AI Msg. 16, 4, 40-62. 247-255, Chap. 9. TEXAS INSTRUMENTS 1985. The TTL Dattz Book. WATERS, R. C. 1985. ‘The programmer’s appren- Vol. 2. Texas Instruments, Dallas, Tex. tice: A session with KBEmacs. IEEE Trans. TRACZ, W., ED. 1988. Tutorial: Software Reuse: Softzu. Eng. SE-11, 11(Nov.), 1296-1320. Emerging Technology. IEEE Computer Society Press, Los Alamitos, Calif. WEGNER, P. 1983. Varieties of reusability. In TSENG, J. S., SZYMANSN, B., SHI, Y., AND PRYWES, N. Workshop on Reusability in Programming S. 1986. Real-time software life cycle with the (Newport, R. I., Sept.). ITT Programming, model system. IEEE Trans. Softw, En,g. SE-12, Stratford, Corm., pp. 30–44, Also in Freeman 2 (Feb.), 358-373. [1987b]. VOLPANO, D. M., AND KIEBURTZ, R. B. 1985. Soft- ware templates. In 8th International Confer- ZAVE, P. 1984. The operational versus tbe conven- en ce on Software Engineering (London, UK, tional approach to software development. Corn- Aug. ). IEEE Computer Society Press, Los rnun. ACM 27, 2 (Feb.), 104–118. Alamitos, Calif., pp. 55-60. VALPANO, D. M., AND KIEBURTZ, R. B. 1989. The ZAVE, P., AND SCHELL, W, 1986. Salient features of templates approach to software reuse. In Fron- an executable specification language and its tier Series: Software Reusability: Volume environment. IEEE Tra~ts. Softw. Eng. SE-12, I—Concepts and Models. Biggerstaff, T. J., and 2 (Feb.), 312-325.

Recewed February 1990; final revmlon accepted October 1991.

ACM Computing Surveys, Vol. 24, No. 2, June 1992