
AI Magazine Volume 13 Number 3 (1992) (© AAAI)

Software Engineering in the Twenty-First Century

Michael Lowry

There is substantial evidence that AI technology can meet the requirements of the large potential market that will exist for knowledge-based software engineering (KBSE) at the turn of the century. In this article, which forms the conclusion to the AAAI Press book Automating Software Design, edited by Michael Lowry and Robert McCartney, Michael Lowry discusses the future of software engineering and how progress in KBSE will lead to system development environments. Specifically, Lowry examines how KBSE techniques promote additive programming methods and how they can be developed and introduced in an evolutionary way.

By the year 2000, there will be a large potential market and a fertile environment for knowledge-based software engineering (KBSE). In the coming decade, hardware improvements will stimulate demand for large and sophisticated application software, and standardization of software interfaces and operating systems will intensify competition among software developers. In this environment, developers who can rapidly create robust and error-free software will flourish. Developers who deliver buggy software years behind schedule, as is typical today, will perish. To meet this challenge, software developers will seek tools and methods to automate software design.

Computer-aided software engineering (CASE) is undergoing tremendous commercial growth. However, the current generation of CASE tools is limited by shallow representations and shallow reasoning methods. CASE tools will either evolve into, or be replaced by, tools with deeper representations and more sophisticated reasoning methods. The enabling technology will come from AI, formal methods, programming language theory, and other areas of computer science. This technology will enable much of the knowledge now lost in the software development process to be captured in machine-encoded form and automated. KBSE will revolutionize software design, just as computer-aided design has revolutionized hardware design, and desktop publishing has revolutionized publication design.

This article draws on the chapters in Automating Software Design and other sources to present one vision of the future evolution of KBSE. After an executive summary and a brief history of software engineering, the role of AI technology is examined in mainstream software engineering today. Currently, AI programming environments facilitate rapid prototyping but do not produce efficient, production-quality code. KBSE technology combines the advantages of rapid prototyping and efficient code in a new programming paradigm: transformational programming, described in the subsequent part of the conclusion. In transformational programming, prototyping, validation, and modifications are done at the specification level; automatic program synthesis then translates specifications into efficient code. The following part compares the trade-offs in various approaches to program synthesis. Then, several near-term commercial applications of KBSE technology are predicted for the next decade. To scale up from these near-term applications to revolutionizing the entire software life

0738-4602/92/$4.00 ©1992 AAAI FALL 1992 71 Articles

cycle in the next century, the computational requirements of KBSE technology need to be addressed. An examination of current benchmarks reveals that hardware performance by the year 2000 is not likely to be a limiting factor but that fundamental issues such as search control require further research. Finally, the future of KBSE in the next century is presented from the viewpoint of different people in the software life cycle, from end users to the knowledge engineers who encode domain knowledge and design knowledge in software architectures.

Executive Summary

Currently, KBSE is at the same stage of development as early compilers or expert systems. Commercially, a few dozen handcrafted systems are in real use as industrial pilot projects. The first and third sections of this book describe pilot systems for software maintenance and special-purpose program synthesis. In research laboratories, many prototype KBSE systems have been developed that have advanced the science of formalizing and automating software design knowledge. Program synthesis research has matured over the last two decades to the point that sophisticated algorithms can be synthesized with only limited human guidance. Research in intelligent assistance for requirement and specification engineering is less mature but already shows considerable promise.

The Next Decade

Within the next decade, significant commercial use of KBSE technology could occur in software maintenance and special-purpose program synthesis. The influence will be evolutionary and compatible with current software development methods. Some research systems for intelligent assistance in requirement and specification engineering, such as ARIES and ROSE-2, incorporate many of the current CASE representations. Thus, as they mature, they could be integrated into the next generation of CASE tools. On the research front, I expect continued progress in representing and reasoning with domain and design knowledge. This research will pay off in more sophisticated tools for requirement and specification engineering as well as better program synthesis systems. Within the decade, the break-even point will be reached in general-purpose program synthesis systems, where given a domain theory, it will be faster to interactively develop a program with one of these systems than by hand. A key to this breakthrough will be continued improvements in search control. Substantial research programs are now under way to scale KBSE technology up from programming in the small to programming in the large.

The Next Century

Software engineering will evolve into a radically changed discipline. Software will become adaptive and self-configuring, enabling end users to specify, modify, and maintain their own software within restricted contexts. Software engineers will deliver knowledge-based application generators rather than unmodifiable application programs. These generators will enable an end user to interactively specify requirements in domain-oriented terms, as is now done by



telephone engineers with WATSON, and then automatically generate efficient code that implements these requirements. In essence, software engineers will deliver the knowledge for generating software rather than the software itself.

Although end users will communicate with these software generators in domain-oriented terms, the foundation for the technology will be formal representations. Formal representations can be viewed as the extension of current CASE representations, which only capture structural and syntactic information about a software design, into complete semantic representations that capture the full spectrum of software design knowledge. Formal languages will become the lingua franca, enabling knowledge-based components to be composed into larger systems. Formal specifications will be the interface between interactive problem-acquisition components and automatic program synthesis components.

Software engineering will evolve from an art into a true engineering discipline. Software systems will no longer be developed by handcrafting large bodies of code. Rather, as in other engineering disciplines, components will be combined and specialized through a chain of value-added enhancements. The final specializations will be done by the end user. KBSE will not replace the human software engineer; rather, it will provide the means for leveraging human expertise and knowledge through automated reuse. New subdisciplines, such as domain analysis and design analysis, will emerge to formalize knowledge for use in KBSE components.

Capsule History of Software Engineering

Since the introduction of the electronic digital computer at the end of World War II, hardware performance has increased by an order of magnitude every decade. This rate of improvement has accelerated with the recent introduction of reduced instruction set computer (RISC) architectures, which simplify hardware by pushing complex instructions into software to optimize performance and shorten design times. As hardware has become less expensive, more resources have been devoted to making computers easier to use and program. User interfaces have evolved from punched cards for batch processing to teletypes and cathode ray tubes for time sharing to graphic user interfaces for networked workstations. In the nineties, new modes of interaction such as handwriting and voice will become common.

Likewise, computers are becoming easier to program. In the fifties and sixties, the first “automatic programming” tools were introduced: assemblers and compilers. Research in programming languages and compiler technology has been a major success for computer science, raising the level of programming from the machine level toward the specification level. In the seventies, interactive programming environments such as Interlisp were created that enabled programs to be developed in small increments and made semantic information about a software system readily available to the programmer (Barstow, Shrobe, and Sandewall 1984). In the eighties, languages designed to facilitate the reuse of software components were introduced, such as Ada and object-oriented extensions of existing languages. Object-oriented programming methodologies encourage a top-down approach in which general object classes are incrementally specialized.

As hardware performance increased, the scope of software projects soon exceeded the capabilities of small teams of programmers. Coordination and communication became dominant management concerns, both horizontally across different teams of programmers and vertically across different phases of software development. Structured methods were introduced to manage and guide software development. In the late sixties and early seventies, structured programming was introduced for incrementally developing correct code from specifications (Dahl, Dijkstra, and Hoare 1972). In structured programming, a specification is first developed that states the intended function of a program, and then an implementation is developed through iterative refinement. Structured programming also prescribes methods for making maintenance


easier, such as block-structuring programs to simplify control flow.

Although structured programming helped to correct the coding errors of the sixties, it did not address requirement analysis errors and system design errors. These errors are more difficult to detect during testing than coding errors and can be much more expensive to fix. To address this problem, in the late seventies, structured methods were extended to structured analysis and structured design (Bergland and Gordon 1981; Freeman and Wasserman 1983). These methods prescribe step-by-step processes for analyzing requirements and designing a software system. The manual overhead in creating diagrams and documentation was one limitation of structured methods. A more fundamental limitation is that they provide only limited means for validating analysis and designs and, thus, work best for developing familiar types of software, as is common in commercial data processing. For new types of software, exploratory programming techniques developed in the AI community provide better means for incrementally developing and validating analysis and designs.

CASE was introduced in the mid-eighties to provide computer support for structured methods of software development (Chikofsky 1989; Gane 1990). CASE tools include interactive graphic editors for creating annotated structure chart, data flow, and control-flow diagrams. As with CAD, the development of graphically oriented personal computers and workstations has made CASE economically feasible. The information in these diagrams is stored in a database called a repository, which helps to coordinate a software project over the whole development life cycle. In integrated CASE environments, project management tools use the repository to help managers decompose a software project into subtasks, allocate a budget, and monitor the progress of software development.

CASE technology includes limited forms of automatic code generation. Module and data declarations are generated directly from information in a repository. In addition, special-purpose interpreters or compilers called application generators have been developed for stereotyped software such as payroll programs and screen generators for video display terminals. Application generators are essentially a user-friendly front end with a back-end interpreter or sets of macros for code generation. In contrast to knowledge-based program synthesis systems, current application generators have narrow coverage, are difficult to modify, are not composable, and provide limited semantic processing. Nonetheless, they have been successful; it is estimated that over half of current COBOL code is developed by application generators.

The CASE market for just commercial data processing tools totaled several billion dollars in 1988 and is doubling every three to four years (Schindler 1990). The major advantages of using current CASE technology are that it enforces a disciplined, top-down methodology for software design and provides a central repository for information such as module interfaces and data declarations. However, current CASE technology can only represent a small portion of the design decisions necessary to build, maintain, and modify a software system. CASE design tools mainly represent the structural organization of a software design, not the function of the components of a software system. Current CASE tools only provide limited forms of analysis, such as checking the integrity of control and data flow and the consistency of data declarations across modules. CASE technology does not currently provide a means for validating the functional behavior of a software design to ensure that the design satisfies the needs of the customer.

AI and Software Engineering Today

AI has already made a significant impact on software engineering. First, mainstream software engineering has adopted AI programming techniques and environments, including expert system shells, as prototyping tools. A software prototype is a system constructed for evaluation purposes that has only limited function and performance. Second, AI components are being integrated into larger systems, particularly where flexibility and ease of modification are needed for rapidly changing requirements. Third, AI inference technology is providing the foundation for more powerful and user-friendly information systems (Barr 1990). Fourth, AI programming paradigms are being adopted in more conventional environments, including graphic user interfaces, object-oriented programming, constraint-based programming, and rule-based programming. In all these applications, the major factor has been the ease of developing, modifying, and maintaining programs written in AI programming environments.

AI technology provides particularly effective exploratory programming tools for poorly understood domains and requirements (Shiel 1986). Exploratory programming converges on well-defined requirements

and system specifications by developing a prototype system, testing the prototype with the end user to decide whether it satisfies the customer's needs, and then iteratively modifying the prototype until the end user is satisfied. Exploratory programming identifies errors in requirements and specifications early in the design process, when they are cheap to fix, rather than after the system has been delivered to the customer, when they can cost a hundred to a thousand times as much to fix. This advantage of early feedback with a prototype system is leading to the replacement of linear methods of software development with methods that incorporate one or more passes of prototype development. AI technology is suitable for exploratory programming because its programming constructs, such as objects, rules, and constraints, are much closer to the conceptual level than conventional programming constructs. This approach enables prototype systems to be rapidly constructed and modified.

Prototype systems written in very high-level languages and environments can directly be evolved into working systems when efficient performance is not necessary. For example, many small businesses build their own information and accounting systems on top of standard database and spreadsheet programs. Historically, spreadsheet programs descended from VISICALC, a simple constraint-based reasoning system within a standard accounting paradigm. For scientific applications, Wolfram's MATHEMATICA environment enables scientists and engineers to rapidly develop mathematical models, execute the models, and then graph the results. MATHEMATICA uses a programmable rule-based method for manipulating symbolic mathematics that can also be used for transformational program derivations.

However, when efficient performance is necessary, the current generation of high-level development environments does not provide adequate capabilities because the environments interpret or provide default translations of high-level programs. Although in the future, faster hardware can compensate for constant factor overheads, the most difficult inefficiencies to eliminate are exponential factors that result from applying generic interpretation algorithms to declarative specifications. For example, in the mid-eighties, the New Jersey motor vehicle department developed a new information system using a fourth-generation language, which is essentially an environment for producing database applications. Although the fourth-generation language enabled the system to be developed comparatively fast, the inefficiency of the resulting code caused the system to grind to a halt when it came online and created havoc for several years for the Garden State. As another example, although mathematical models of three-dimensional physical systems can be specified with MATHEMATICA, the execution of large or complex models can be unworkably slow.

When efficient performance is necessary, a prototype system currently can only be used as a specification for a production system; the production system has to be coded manually for efficiency. For example, after a three-dimensional physical system is specified in MATHEMATICA, it still needs to be coded into efficient code to run simulations. Although both MATHEMATICA and MACSYMA can produce default Fortran code from mathematical specifications, the code is not efficient enough for simulating large three-dimensional systems.

The necessity of recoding a prototype for efficiency incurs additional costs during development but has even more pernicious effects during the maintenance phase of the software life cycle. As the production system is enhanced and modified, the original prototype and documentation are seldom maintained and, therefore, become outdated. Also, because modifications are done at the code level, the system loses its coherency and becomes brittle. Eventually, the system becomes unmaintainable, as described in chapter 1 of Automating Software Design.

Transformational Programming

KBSE seeks to combine the development advantages of high-level environments supporting rapid prototyping with the efficiency advantages of manually coded software. The objective is to create a new paradigm for software development, transformational programming, in which software is developed, modified, and maintained at the specification level and then automatically transformed into production-quality software (Green et al. 1986). For example, the SINAPSE system automates the production of efficient Fortran code from high-level specifications of three-dimensional mathematical models. As another example, the ELF system automates the production of efficient VLSI wire routing software from specifications. Where this automatic translation is achieved, software development, maintenance, and modification can be carried out at the specification level with the aid of knowledge-based tools. Eventually, software engineering will evolve to a higher level



and become the discipline of capturing and automating currently undocumented domain and design knowledge.

To understand the impact of automating the translation between the specification level and the implementation level, consider the impact of desktop publishing during the past decade. Before the introduction of computer-based word processing, manually typing or typesetting a report consumed a small fraction of the time and money needed to generate the report. Similarly, manually coding a software system from a detailed specification consumes a small fraction of the resources in the software life cycle. However, in both cases, this manual process is an error-prone bottleneck that prevents modification and reuse. Once a report is typed, seemingly minor modifications, such as inserting a paragraph, can cause a ripple effect requiring a cascade of cutting and repasting. Modifying a software system by patching the code is similar to modifying a report by erasing and typing in changes. After a few rounds of modification, the software system resembles an inner tube that is patched so often that another patch causes it to blow up. When a typewritten report gets to this stage, it is simply retyped. However, when a software system gets to this stage, the original design information is usually lost. Typically, the original programming team has long since departed, and the documentation has not been maintained, making it inadequate and outdated. The only recourse is an expensive reengineering effort that includes recovering the design of the existing system.

Because maintenance and modification are currently done at the code level, they consume over half the resources in the software life cycle, even though the original coding consumes a small fraction of the life-cycle resources. Most maintenance effort is devoted to understanding the design of the current system and understanding the impact of proposed modifications. Furthermore, because modification is difficult, so is reuse. It is easy to reuse portions of previous reports when they can easily be modified to fit in a new context. Similarly, it is easy to reuse portions of previous software systems when abstract components can easily be adapted to the context of a new software system. For these reasons, word processing and desktop publishing have had an impact disproportionate to the resources consumed by the manual process they automate. For similar reasons, automating the development of production-quality software from high-level specifications and prototypes will have a revolutionary impact on the software life cycle.

By automating printing, desktop publishing has created a market for computer-based tools to help authors create publications from the initial conceptual stages through the final layout stage. Outlining programs help authors organize and reorganize their ideas. Spelling checkers and grammar checkers help ensure consistency with standards. Page layout programs optimize the visual presentation of the final publication and often integrate several source files into a final product. These tools would be useful even in the absence of automated printing. However, they reach their full potential in environments supporting the complete spectrum of activities for the incremental and iterative development of publications.

Current CASE tools are similar to outliners, grammar checkers, and tools that integrate several source files. CASE tools have not yet reached their full potential because coding is still mostly a labor-intensive manual process. As coding is increasingly automated, more sophisticated tools that support the initial stages of software design will become more useful. Rapidly prototyped systems developed with these tools will be converted with minimal human guidance into production-quality code by program synthesis systems, just as word processing output today is converted almost automatically into publication-quality presentations through page layout programs.

CASE representations will move from partial representations of the structure and organization of a software design toward formal specifications. Tools such as WATSON that interactively elicit requirements from end users and convert them into formal specifications will be one source of formal specifications. Tools such as ARIES will enable software developers to incrementally develop and modify formal specifications. Tools such

as ROSE-2 will match specifications to existing designs for reuse.

Comparison of Program Synthesis Techniques

The ultimate objective of transformational programming is to enable end users to describe, modify, and maintain a software system in terms natural to their application domain and, at the same time, obtain the efficiency of carefully coded machine-level programs. Great progress has been made toward this objective in the progression from assemblers to compilers to application generators and fourth-generation languages. However, much remains to be done.

Given current technological capabilities, there are trade-offs between several dimensions of automatic translation. The first dimension is the distance spanned between the specification or programming level and the implementation level. The second dimension is the breadth of the domain covered at the specification level. The third dimension is the efficiency of the implemented code. The fourth dimension is the efficiency and degree of automation of the translation process. The fifth dimension is the correctness of the implemented code. The Automating Software Design chapters in the sections on domain-specific program synthesis, knowledge compilation, and formal derivation systems show how current KBSE research is expanding our capabilities along each of these dimensions. The following paragraphs compare different approaches to automatic translation.

Compilers for conventional programming languages such as Fortran and COBOL have a wide breadth of coverage and efficient translation and produce fairly efficient code. These accomplishments are achieved with a comparatively short distance between the programming level and the implementation level. Optimizing compilers sometimes have user-supplied pragmas for directing the translation process, thereby getting greater efficiency in the implemented code at the expense of greater user guidance in the translation process. Verifying that a compiler produces correct code is still a major research issue.

Very high-level languages such as logic programming and knowledge interpretation languages also achieve a wide breadth of coverage through a short distance between the programming level and the implementation level. However, the programming level is much closer to the specification level than with conventional programming languages. Programs written in these languages are either interpreted or compiled to remove interpretive overhead such as pattern matching. These languages can be used to write declarative specifications that are generally implemented as generate-and-test algorithms, with correspondingly poor performance. It is also possible to develop efficient logic programs by tuning them to the operation of the interpreter, in effect, declaratively embedding control structure. The code implemented for these programs approaches, within a constant factor, the efficiency of code written for conventional languages, except for the lack of efficient data structures. Efficient logic programs can either be developed manually or can be the target code of a program synthesis system such as PAL or XPRTS.

Knowledge interpreters such as expert system shells are seldom based on formal semantics, and thus, formally verifying that they produce correct code is impossible. However, some research-oriented languages such as pure Prolog are based on formal semantics. Formal proofs that abstract compilers for these languages produce correct code for abstract virtual machines have appeared in the research literature (Warren 1977; Despeyroux 1986). These proofs are based on the close relationship between logic programming and theorem proving, for which the soundness and completeness of inference procedures have mathematically been worked out. Proving that real compilers produce correct code for real machines is a much more difficult and detailed undertaking. Success has recently been reported in Hunt (1989), Moore (1989), and Young (1989).

Application generators span a significant distance between the specification level and the implementation level by choosing a narrow domain of coverage. Application generators typically use simple techniques to interpret or produce code in conventional languages, such as filling in parameters of code templates. Thus, the performance of implemented code usually is only fair. The translation process is efficient and automatic. Because application generators are mainly used in commercial data processing, mathematically verifying that an application generator produces correct code has not been a major issue.

Domain-specific program synthesis systems will likely become the next generation of application generators. They also span a significant distance between the specification level and the implementation level by choosing a narrow domain of coverage. They are easier to incrementally develop and modify than conventional application generators


because they are based on transformation rules. It is also easier to incorporate more sophisticated problem-solving capabilities. Therefore, compared to conventional application generators, they tend to produce much better code and provide higher-level specifications. In theory, the transformation rules could be derived by composing transformation rules that were rigorously based on logic and, therefore, could verifiably be correct. However, in practice, the transformation rules are derived through knowledge engineering. Compared with other program synthesis approaches, the narrow domain enables domain knowledge and search knowledge to be hard wired into the transformation rules; therefore, the translation process is comparatively efficient.

Knowledge compilation research seeks to combine the high-level specification advantages of knowledge interpreters with the efficiency advantages of conventional programs. The goals are broad coverage, a large distance between the specification level and the implementation level, and efficient code. To achieve these goals, the major research issue is search control during program synthesis. Like approaches to domain-specific synthesis, the transformation rules are derived through knowledge engineering and, hence, are not verifiably correct. However, the domain knowledge is usually encoded separately and explicitly; therefore, knowledge compilers can work in different domains given different domain knowledge.

Formal derivation research seeks the same goals as knowledge compilation research but within a framework that is rigorously based on logic and, therefore, is verifiably correct. The formal basis for the extraction of programs from constructive proofs and for basic transformation rules was worked out 20 years ago. As in knowledge compilation, the major research issue is search control during program synthesis. The major approach to search control is to develop metalevel programs called tactics and strategies that encapsulate search control and design knowledge.

The Next Decade

KBSE will revolutionize the software life cycle, but the path will be evolutionary because its development, as well as its incorporation into software-engineering practice, will be incremental. KBSE will adapt itself to current practices before it changes these practices.

In the coming decade, there are several leverage points where KBSE technology could commercially be applied. In each of these potential uses, technology already developed in the research laboratory could be combined with conventional software-engineering tools to provide much better tools. For some of these applications, industrial pilot projects are already under way. First, I review several potential near-term uses, then examine applications in software maintenance and special-purpose program synthesis. The computational requirements of KBSE technology are also discussed.

First, KBSE tools for software understanding will likely be adopted for software maintenance (as described earlier). Current AI technology can significantly enhance standard software understanding tools such as data-flow and control-flow analyzers. KBSE aims at eventually elevating maintenance from the program level to the specification level. However, because of the enormous investment in the existing stock of software, program maintenance will be the dominant cost in software engineering for decades to come.

Second, AI technology could be used to produce the next generation of application generators (see The Next Century). Transformational programming provides a flexible and modular basis for a deeper level of semantic processing than is feasible with current application generators. Its use leads to better user interaction and higher-level specifications as input and more optimal code as output.

Third, AI environments for rapid prototyping and exploratory programming could be enhanced and integrated with CASE design tools. Rapid prototyping has recently become a major topic in software engineering. CASE design tools support decomposing a system into a hierarchy of modules. The result is usually a hierarchical graph representation annotated with text commenting on the intended function of the modules. AI programming environments can be used to rapidly prototype the function of these modules and, thus, create an executable prototype of the whole system.

Fourth, the currently small but growing use of formal methods (Wing 1990) for software development could considerably be enhanced through software-engineering technology. A method is formal if it has a sound mathematical basis and, therefore, provides a systematic rather than ad hoc framework for developing software. In Europe, formal languages such as VDM (Jones 1986) and Z (Spivey 1989) have been used in pilot projects to manually develop formal specifications of real software systems. Tools that incorporate automated reasoning could provide substantial assistance in developing formal specifications, as shown earlier. Furthermore, within this decade, formal derivation systems will become sufficiently advanced to provide interactive environments for refining formal specifications to code (see Transformational Programming).

Finally, a major problem with programming in the large is ensuring that a large software system is consistent: The implementation must be consistent with the system design that, in turn, must be consistent with the requirements; the components of the software system developed by separate teams must be consistent with each other. Current CASE tools use shallow representations and, thus, provide only limited consistency checking. For example, CASE design tools can check that a hierarchy of modules has no dead ends in the flow of control. More complete consistency checking will require deeper representations and more inference capabilities. The research on ROSE-2 shows that there is an evolutionary path from current CASE consistency checking to consistency checking with deeper representations.

Software Maintenance: The Next Decade

Industry and government have invested hundreds of billions of dollars in existing software systems. The process of testing, maintaining, modifying, and renovating these systems consumes over half of the software-engineering resources and will continue to do so for decades to come. Often, older systems are no longer maintainable because no one understands how they work, preventing these systems from being upgraded or moved to higher-performance hardware platforms. This problem is acute for software written in archaic languages that programmers are no longer trained in.

Intelligent software maintenance assistants are based on AI techniques for software understanding and ramification reasoning. Software understanding is a prerequisite to other maintenance tasks and currently accounts for over half the time spent by maintenance programmers. In the long term, software systems will include built-in self-explanation facilities such as those in LASSIE and the explainable expert system (Neches, Swartout, and Moore 1985). Today, software understanding is done by reverse engineering, analyzing existing code and documentation to derive an abstract description of a software system and recover design information.

Reverse engineering is the first step in reengineering, renovating existing systems, including porting them to newer languages and hardware platforms. Because maintenance is increasingly difficult, reengineering is especially important for code written long ago in older languages and for older hardware platforms. Many businesses are legally required to be able to access records in databases dating back decades. These databases were written as flat files, making it impossible to integrate them with newer relational databases and requiring the business to keep backward compatibility with old hardware platforms. Reengineering is also needed to port engineering and scientific code written in the sixties and seventies to newer hardware platforms with parallel architectures. Reengineering these older systems occupies many programmers; AI technology can provide significant assistance to this task.

Standard AI techniques such as pattern matching and transformation rules enhance conventional tools for software understanding by enabling higher-level analysis and abstraction (Biggerstaff, Hoskins, and Webster 1989; Hartman 1989; Letovsky 1988; Wills 1989). With these techniques, much of the information needed by current CASE design tools can be recovered semiautomatically, even from archaic code. One approach is cliche recognition (Rich and Waters 1990), matching stereotyped programming patterns to a program’s data and control flow, thereby abstracting the program into a hierarchy of these cliches.

AI techniques for software understanding can also be applied to the testing and integration phases of software development. For example, a major aerospace company used a KBSE system to test two million lines of Fortran code produced by subcontractors for compliance with its coding standards. Hundreds of violations were found, including unstructured do loops, dead code, identifier inconsistency, and incorrectly formatted code. This technique has already saved four person-years of hand checking, yet the system took less than half a person-year to develop. The system was built in REFINE, which is a very high-level programming environment that integrates technology, an object-oriented knowledge base, and AI rule-based technology (Burson, Kotik, and Markosian 1990). Most of the development time was devoted to writing grammar definitions for the three dialects of Fortran used by subcontractors; REFINE automatically generates parsers from the grammar definitions. The core of the system was written as a set of rules and took only a few weeks of development time. Because it was written as a set of rules, it is easily modifiable as coding standards change.
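
The point that a checker written as a set of rules is easy to modify can be illustrated with a small Python sketch. This is not the REFINE system described above: the rule names, the regular expression, and the sample Fortran fragment are hypothetical, and a real checker would apply its rules to parse trees rather than to raw source lines. The sketch only shows that each coding standard is an independent rule, so a change in the standards means adding or removing one entry.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Rule:
    """One coding standard: a name plus a predicate over a source line."""
    name: str
    applies: Callable[[str], bool]


# Hypothetical standards, chosen only for illustration.
RULES: List[Rule] = [
    Rule("line-too-long", lambda line: len(line) > 72),
    Rule("goto-used",
         lambda line: re.search(r"\bGO\s*TO\b", line, re.IGNORECASE) is not None),
    Rule("tab-character", lambda line: "\t" in line),
]


def check(source: str, rules: List[Rule] = RULES) -> List[Tuple[int, str]]:
    """Return (line number, rule name) for every violation found."""
    violations = []
    for number, line in enumerate(source.splitlines(), start=1):
        for rule in rules:
            if rule.applies(line):
                violations.append((number, rule.name))
    return violations


# Hypothetical Fortran fragment used as input.
SAMPLE = """      DO 10 I = 1, N
      IF (X(I) .LT. 0) GO TO 20
   10 CONTINUE
   20 STOP
"""

if __name__ == "__main__":
    for number, name in check(SAMPLE):
        print(f"line {number}: {name}")
```

Adding a standard is a one-line change, for example passing `RULES + [Rule("stop-used", lambda line: "STOP" in line)]` to `check`; the traversal code is untouched.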


Understanding the existing code is the first step in software modification. The next step is ramification reasoning to determine the impact of proposed modifications to a software system. The objective is to ensure that a modification achieves its goal without causing undesired changes in other software behavior. This is essentially a task for constraint-based reasoning. Ramification reasoning can be done at many levels of abstraction. At the lowest level, it involves tracing through data-flow and control-flow graphs to decide which programs and subroutines could be affected by a proposed modification. Automated ramification reasoning is even more effective when given higher-level design information. For example, it can determine whether proposed modifications violate data-integrity constraints and other invariants. As CASE representations evolve into more complete and formal representations, AI techniques for ramification reasoning will be incorporated into CASE tools.

Special-Purpose Program Synthesis: The Next Decade

Today, application generators are one of the most effective tools for raising programmer productivity. Because they are based on code templates, they provide a more flexible and higher level of software reuse than the reuse of code. Transformational technology provides the means for higher-performance application generators, which are also potentially easier to develop and modify.

Application generators generally produce code in three phases. The first phase, syntactic analysis, converts a specification written in an application-oriented language or obtained interactively through menus and forms into a syntactic parse tree. A semantic analysis phase then computes semantic attributes to obtain an augmented semantic tree. The final generation phase traverses the semantic tree and instantiates code templates. The generation phase is similar to macroexpansion in conventional programming languages.

Application generator generators (Cleaveland and Kintala 1988) are tools for building application generators. Parser generators such as YACC take the definition of an application-oriented language as input and produce a parser for the syntactic analysis phase as output. Tools for building the semantic analysis phase are usually based on attribute-grammar manipulation routines. Tools for building the generation phase are similar to macro definition languages.

AI provides technology for much richer semantic processing. In addition, the generation phase can be augmented with program transformations to produce better code. Program transformations enable code templates to be more abstract and, therefore, have wider coverage. For example, code templates can be written with abstract data types, such as sets, that program transformations then refine into concrete data structures. Transformation rules provide a flexible and modular basis for developing and modifying program transformations and semantic analysis routines. Eventually, end users will be able to interactively develop their own transformation rules in restricted contexts. In the near term, transformational technology will enable software engineers to build higher-performance application generators. KBSE environments that include parser-printer generators and support for program-transformation rules, such as REFINE, provide tools for developing knowledge-based application generators.

Intelligent application generators are cost-effective options not only in standard application areas such as business information systems but also in the enhancement of existing software. Automatic code enhancement makes software development and maintenance more efficient and results in fewer bugs. Examples of enhancements include better error handling and better user interfaces.

Making software fault tolerant is particularly important in real-time systems and in transaction systems where the integrity of a database could be affected. Consequently, a significant amount of code in these systems is devoted to error detection and recovery. An informal sampling of large telecommunications programs found that 40 percent to 80 percent of branch points were devoted to error detection.

User interface code often accounts for 30 percent of an interactive system. Screen generators have long existed for business transaction systems, and graphic user interface generators for workstations have recently been marketed. However, current tools only make it faster to develop precanned displays. Intelligent interfaces are much more flexible. They can adapt to both the semantics of the data they present and the preferences of the user. For example, instead of precanned text, intelligent interfaces use natural language generation techniques that are sensitive to a model of the user’s knowledge. Graphics can be optimized to emphasize information important to the user.

However, intelligent interfaces are more difficult to design and program than standard
user interfaces, making it costly to incorporate them in each new application. Roth, Mattis, and Mesnard (1990) describe SAGE, a prototype system for application-independent intelligent data presentation. SAGE incorporates design expertise for selecting and synthesizing graphic components in coordination with the generation of natural language descriptions. SAGE combines a declarative representation of data with knowledge representations of users’ informational goals to generate graphics and text. An application developer can generate an intelligent presentation system for his(her) application using SAGE.

Transformational technology can be applied directly to the synthesis of visual presentations. Westfold and Green (1991) describe a transformation system for deriving visual presentations of data relationships. The underlying idea is that designing a visual presentation of data relationships can be viewed as designing a data structure. Starting with a relational description of data, transformations are applied to reformulate the description into equivalent descriptions that have different implementations, that is, different representations on the screen. Many different representations of the same data can be generated, each emphasizing different aspects of the information. For example, an n-ary relation can be reformulated so that all the tuples with the same first argument are grouped. Successive applications of this rule gradually transform the n-ary relation into a tree. Other rules transform data relationships into display-oriented relationships, which are then rendered graphically.

Computational Requirements

KBSE can be expensive computationally, both in terms of memory and processor cycles. The research systems described in this book often push the performance envelope of current workstations. Hardware capabilities have previously been a limiting factor in the commercial adoption of CAD, CASE, and AI technology. The most limiting factor was the computational requirements of providing a graphic user interface with the hardware available in the late seventies and early eighties. Hardware performance in the nineties will probably not be a major limiting factor in scaling up KBSE technology to industrial applications.

Benchmarks from several industrial pilot projects show that current computer workstations provide sufficient computational power to support KBSE on real problems when appropriate trade-offs are made. A benchmark from domain-specific program synthesis is the SINAPSE system, which synthesizes a 5,000-line Fortran three-dimensional modeling program from a 60-line specification in 10 minutes on a SUN-4 (a SUN-4 is a workstation running about 12 MIPS with memory from 16 to 64 megabytes). This type of program would take weeks to generate by hand. SINAPSE is specialized to synthesize only finite-difference algorithms, so it can make large-grained decisions without sacrificing automation or the efficiency of the resulting code. SINAPSE also uses knowledge-based techniques to constrain the search space. Another benchmark comes from the testing of Fortran code for adherence to coding standards, which was described earlier. On a SUN-4, 20,000 lines of code were checked each hour.

The computational demands of KBSE technology result from the scope of the knowledge representation, the granularity of automated decision making, and the size and complexity of the software systems produced. Greater breadth or depth of representation requires increased memory, and increased granularity of automated decision making requires increased processor cycles. The critical issue is how computational requirements scale as a function of these variables.

The computational requirements of software-engineering technology can be factored into two groups: those that increase either linearly or within a small polynomial in terms of these variables and those that increase exponentially. The former group includes factors such as the amount of memory needed to represent the history of a derivation, including all the goals and subgoals. As discussed in chapter 23 of Automating Software Design, this amount is probably proportional to N * log N, where N is the size of the derivation. The exponential factors include combinatorially explosive search spaces for program derivations. These exponential factors cannot be addressed by any foreseeable increase in hardware performance. Instead, better KBSE technology is required, such as tactics for search control.

Although software engineers will always want more hardware power, increases in hardware performance within the next decade will likely address the computational requirements of KBSE that do not grow exponentially. The benchmarks from industrial pilot projects show that some KBSE technology can already be applied to industrial-scale software systems within the performance limitations of current computer workstations.

Hardware performance within the fiercely competitive workstation market will continue to grow rapidly. Based on technology already
in advanced development, an order of magnitude increase is expected by the middle of the decade in both processor speed and memory size. Simpler architectures such as RISC have decreased development time and will facilitate the introduction of faster device technologies such as gallium arsenide and Josephson junctions.

Lucrative, computationally intensive applications, such as real-time digital video processing, will drive hardware performance even higher in the latter part of the decade, with yet another order-of-magnitude increase. Compared with the computational requirements of these applications, scaling up KBSE technology to industrial applications will not be hardware limited if exponential factors can be avoided. The next part of this article describes how combinatorially explosive search might be avoided through the reuse of domain and design knowledge.

The Next Century

Within the KBSE community, there is a broad consensus that knowledge-based methods will lead to fundamentally new roles in the software-engineering life cycle and a revised view of software as human knowledge that is encapsulated and represented in machine-manipulable form. At the beginning of the computer age, this knowledge was represented as the strings of 1’s and 0’s of machine language. By applying software engineering to itself by developing compilers, the representation level has been raised to data structures and control structures that are closer to the conceptual level. Although this improvement is considerable, most of the domain and design knowledge that is used in developing a modern software system is lost or, at best, is encoded as text in documentation.

The goal of KBSE research is to capture this lost knowledge by developing knowledge representations and automated reasoning tools. As this task is accomplished, software engineering will be elevated to a higher plane that emphasizes formalizing and encoding domain and design knowledge and then automatically replaying and compiling this knowledge to develop working software systems. Below, I envision how future KBSE environments might support different classes of people in the software life cycle, from end users to the knowledge engineers who encode domain knowledge and design knowledge in software architectures.

Maintenance Programmers

In the future, software systems will include built-in, knowledge-based software information systems such as LASSIE. Maintenance programmers will no longer study reams of source code and outdated documentation to understand a software system. Instead, they will query the system itself. Eventually, the information will automatically be updated, so that as a system is modified, the information is kept current. The information in these systems will expand to include the full range of design decisions made in developing the software system. Eventually, these software information systems will be produced as a by-product of software development with KBSE tools.

These software information systems will evolve to become active assistants in software maintenance. They will be able to trace the ramifications of proposed changes at the code, system design, and requirement levels. They will help to ensure that as a software system evolves to meet changing requirements, it remains consistent and bug free. Eventually, maintenance will no longer be done by modifying source code. Instead, desired changes will be specified directly by the end user at the requirement level, and the system will carry them out automatically. Software systems will become self-documenting and self-modifying. The typically junior programmers who are now burdened with maintaining code designed by other programmers will spend their time in more interesting pursuits.

End Users

In the future, end users will interactively customize generic software systems to create applications tailored to their own particular needs. Progress in this direction has already been made. For example, several word processing and spreadsheet programs allow users to customize menus, change keyboard bindings, and create macros. Sometimes, macros are compiled for greater efficiency. KBSE technology will provide much more flexibility in customization, more optimized code, and intelligent assistance in helping an end user develop requirements.

A scenario for future intelligent assistance for end users in scientific computing is presented by Abelson et al. (1989). Below, I examine a simpler domain by considering how a businessperson will develop a custom payroll system in the future. Current payroll program generators automate code generation but do not provide substantial help in eliciting and analyzing requirements. An intelligent payroll application generator will include a knowledge base about the payroll
domain that defines concepts such as hourly versus salaried employees, various types of bonuses and deductions, tax codes, and different kinds of pay periods. Using this knowledge, a requirement acquisition component will interactively elicit the relevant structure of the businessperson’s company and the types of output and information desired. During this interaction, the requirement acquisition component will check the consistency of evolving requirements with constraints, such as tax codes. It will also guide the businessperson in resolving ambiguities and incompleteness. The result of this requirement elicitation process will be a formal specification of the desired payroll system in the form of declarations and constraints formulated in terms of the concepts and operations of the payroll domain.

To validate the specification, the payroll application generator will then transform these declarations and constraints into a prototype spreadsheet program and try out a few test examples with the businessperson. This step will lead to further refinements of the requirements until the businessperson is satisfied. The payroll program generator will then compile the specification into machine code and generate the proper interfaces for other accounting systems.

The businessperson does not have to be an expert in accounting, tax codes, or business programming or even be able to develop . The domain knowledge of payrolls would enable the requirement acquisition component to elicit the appropriate information in terms the businessperson can understand.

Many of the capabilities described in this sketch are within the range of current KBSE research systems. For example, eliciting payroll requirements involves temporal reasoning that is much less complex than that performed by WATSON in eliciting requirements for new telephone features. Other recent work on knowledge-based requirement acquisition includes Rubenstein and Waters (1989) and Anderson and Fickas (1989). Similarly, compiling the arithmetic constraints of a payroll specification into efficient code is far less difficult than compiling partial differential equations into efficient finite-difference programs, as done by SINAPSE. What is missing are good tools for acquiring the knowledge of the payroll domain, so that this knowledge can be used by both requirement elicitation components and program synthesis components.

System Developers

The needs of system developers differ from those of end users. End users prefer to interact at the requirement level and be shielded from the complexity of specifications and system design. In contrast, system developers prefer help managing the complexity of specifications and system designs but want to be shielded from the implementation details. Two interrelated ideas currently being developed could meet some of the needs of future system developers: megaprogramming and software architectures.

Megaprogramming is programming at the component level (e.g., user interfaces, databases, device controllers) rather than the code level. To program at the component level, it is necessary to either reuse components or generate components from specifications. Components can more readily be reused or generated if they are defined in the context of a software architecture that identifies major types of components and provides the glue for composing them together. A software architecture is a high-level description of a generic type of software system, such as the class of transaction-processing systems.

Software architectures will subsume current CASE design representations. To support software developers, software architectures will include the functional roles of major software components and their interrelationships stated in an application-oriented language; a domain theory that provides precise semantics for the application-oriented language for use in automated reasoning; libraries of prototype components with executable specifications; program synthesis capability to produce optimized code for components after a prototype system has been validated; a constraint system for reasoning about the consistency of a developing software system; and design records that link requirements to design decisions, as in the ROSE-2 system.

In the future, routine system development will be done by small teams of system analysts and domain experts working with KBSE environments that support software architectures. In contrast to the guidance provided for an end user, the KBSE environment will play an assistant role in requirement analysis. The team will start with a set of informal requirements and elaborate them into precise statements in the application-oriented language. As the requirements are made precise, the software-engineering environment will propagate these decisions through the design records to check the feasibility of meeting these requirements. This propagation will also set up default design decisions and record design alternatives to be investigated
by the team.

After a subset of the requirements has been made into precise specifications, the KBSE environment will generate executable prototypes to help validate and refine the requirements. Analytic assessments of an evolving design will be provided by having the constraint system determine the ramifications of design decisions. The constraint system will ensure that design decisions are internally consistent and consistent with the software architecture. The team will iteratively refine its requirements, the precise specifications, and the system design using an opportunistic methodology similar to current rapid prototyping methodologies. After the system design is complete, the implementation of the production system will mostly be automatic. The KBSE environment will ask for parameters relevant to performance, such as the size of expected input, and will occasionally ask for guidance on implementation decisions.

System development by a close-knit team with automated support will diminish many communication, coordination, and project management difficulties inherent in managing large numbers of programmers. The feasibility of small teams developing prototype systems has already been shown using an AI programming environment suitable for rapid prototyping, reusable prototype components, and a software architecture (Brown et al. 1988). However, implementing the final production system required a large number of programmers to code efficient versions of the components. In the future, this final production phase will largely be automated.

There is an inherent tension between reusability and performance in component implementations, which is why megaprogramming is currently easier to apply to system prototyping than production system implementation. To be maximally reusable, components should make as few assumptions about other components as possible. Reusable components should use abstract data types and late-binding mechanisms common in AI environments. To be maximally efficient, components should take advantage of as much of the context of use as possible, including the implementations of data types in other components and early binding mechanisms. Transformational programming will make it possible to use megaprogramming during system prototyping and then automatically generate efficient production-quality code.

Extensible software and additive programming are the foundation for enabling small teams to build large software systems, end users to customize their own applications, and application software to be self-maintaining. A software architecture represents a partial set of decisions about requirements and system design that are generic to a class of software systems. To be reusable, this partial information must be represented so it is extensible. System developers and end users will incrementally add to these decisions to generate a software system. Systems will be modified for changing requirements by retracting decisions and adding new decisions.

Software Architects, Domain Analysts, and Design Analysts

Software architectures will be created through domain analysis and design analysis. Domain analysis (Arango 1988) is a form of knowledge acquisition in which the concepts and goals of an application domain are analyzed and then formalized in an application-oriented language suitable for expressing software specifications. Design analysis is the formalization of transformational implementations for a class of software artifacts. A domain designer is a design analyst who derives implementations for the objects and operations in an application-oriented language developed through domain analysis.

System developers will reuse domain analysis by stating requirements, specifications, and system designs in the application-oriented language. System developers will also reuse design analysis when they develop an application system through a software architecture. Thus, the cost of domain analysis and design analysis will be spread over many different systems.

Domain knowledge is necessary for intelligent requirement analysis, specification acquisition, and program synthesis. Domain knowledge can implicitly be embedded in special-purpose rules or can be formal and explicit. Formalizing domain knowledge is a difficult task, currently requiring extended collaboration between domain experts and knowledge engineers (Curtis, Krasner, and Iscoe 1988). One advantage of formalizing domain knowledge is that it can then be used in all phases of software development, with the assurance of correctness between implementations and specifications. In other words, if a user validates a specification with a specification assistant, and a program synthesis system derives an implementation for this specification, the program correctly implements the user’s intentions. Formalized domain knowledge can also be used with generic automated reasoning tools, thus
enabling the reuse of KBSE components.

Domain knowledge is a prerequisite to requirement engineering. Requirements are necessarily described in terms of the application domain in which a software system will be used. Current requirement languages are restricted to expressing system-oriented concepts. Knowledge representation languages will provide the basis for expressing domain knowledge within which domain-oriented requirements can formally be expressed (Borgida, Greenspan, and Mylopoulos 1986). This approach will enable automatic theorem provers to verify whether a set of requirements is consistent with domain knowledge and to fill incompleteness in a partial set of requirements, as is done by WATSON for the telephone domain. Program synthesis systems use domain knowledge to decompose and implement domain objects and operations. For example, distributive laws in a domain theory are used to decompose and optimize domain operations.

A future design analyst will first interactively develop transformational derivations for a set of generic programs in an application domain using a general-purpose program synthesis system. These transformational derivations will be recorded as derivation histories. Derivation histories can be viewed as the execution trace of an implicit metalevel program that controls the transformational derivation. The objective of design analysis is to make at least part of the information in this metalevel program explicit so that it can be stored in a software architecture and reused by a system developer. A derivation history needs to be generalized so that it can be applied to similar but not necessarily identical specifications. Because of the difficulty of automatically generalizing execution traces to programs, I anticipate that generalizing derivation histories will be a manual process, with some computer assistance, for the foreseeable future. In the long-term future, this generalization process might be automated through explanation-based generalization, given a general theory of the teleological structure of transformational derivations. Applications of explanation-based generalization to program synthesis are described in chapters 14 and 22 of Automating Software Design and also in Shavlik (1990). Fickas (1985) describes GLITTER, a research system that uses teleological structure to partially automate transformational derivations.

The first level of generalizing a derivation history will be done by annotating the derivation history with the dependencies between the individual transformation steps and the overall goal structure of the derivation. Some dependency information will automatically be generated. The annotations will provide information to intelligent replay systems to modify a derivation history for use in similar specifications (Mostow 1989). The second level of generalization will be done by creating metalevel tactics. A third level of generalization will be strategies that encapsulate heuristic knowledge for choosing tactics. These strategies will include methods for decomposing specifications into components to be synthesized by specialized tactics and methods for using performance requirements to guide the transformational derivation.

A software architect will integrate the results of domain analysis and design analysis with a constraint system for reasoning about the consistency of a developing software system. The results of domain analysis will be an application-oriented language for a software architecture and a domain theory that defines the semantics of this language. The results of design analysis, that is, annotated derivation histories, tactics, and strategies, will essentially form a set of special-purpose program synthesizers for the components of a software architecture. The software architect will also provide an initial set of design records linking requirements to design decisions for choosing one component over another. These design records will be elaborated by teams of system developers.

In essence, software engineering will become knowledge acquisition followed by redesign. Although creating a software architecture will require more effort than designing any particular software system, it will be paid back over the creation of many software systems. Software architectures will provide the right tools for what is now partially done through copy and edit. Instead of designing a system from scratch, software system developers will extend and redesign the defaults in a software architecture. Redesign is the method used for iteratively modifying a rapidly prototyped system. In the future, redesign will be supported with a spectrum of tools, including specification evolution transformations, consistency maintenance systems, and intelligent replay of derivation histories.

KBSE environments that support domain analysis, design analysis, and software architectures are being explored in research laboratories. Domain analysis and design analysis are knowledge-acquisition tasks. A prototype domain analysis environment is DRACO, which includes generators for application-oriented languages. DRACO also provides support for transformation rules within a domain and

between domains. KIDS and WATSON provide some support for creating and maintaining formal domain theories. In the future, knowledge-acquisition tools will provide sophisticated environments for domain analysis (Marcus 1989).

Design analysis is the acquisition of control knowledge. Representing and acquiring this knowledge is critically dependent on having a good understanding of the structure of transformational derivations. As shown by the chapters in Automating Software Design, great progress has been made since the early days of program synthesis. For the most part, the program derivations presented here are at a higher level of abstraction than the details of the logical calculi that underlie the individual transformations. Several of the interactive program transformation systems, especially KIDS, provide a natural high-level interface for those schooled in transformational derivations. However, we are just beginning to understand how to encode the rationale for choosing particular paths through the design space. This semantic information is what we need for generalizing derivation histories and developing tactics and strategies. I expect significant progress over the coming decade. For a comparative study of different derivations in the program synthesis literature, see Steier and Anderson (1989), especially the concluding chapter on design space.

Summary

To date, the main use of AI in software engineering has been for rapid prototyping. Rapid prototyping enables requirements and system designs to be iteratively refined with customers before production-quality software is manually coded. However, just as manual typing makes it difficult to modify and reuse publications, manual coding of production-quality software makes it difficult to modify, maintain, and reuse software.

In the next century, transformational programming will create a paradigm shift in software development like that created by desktop publishing. Software development and maintenance will be elevated to the specification level by automating the derivation of efficient code from specifications. Knowledge-based tools for requirement elicitation, specification evolution, and program synthesis will enable end users to specify and modify their own software. Software architectures will enable small teams of system developers to create large software systems. Advances in knowledge representation, knowledge acquisition, and automated reasoning will enable domain experts and design experts to encode their knowledge in software architectures so that it can be reused by system developers and end users. Software engineering will be elevated to the engineering discipline of capturing and automating currently undocumented domain and design knowledge.

The path to this new paradigm will be incremental. In the coming decade, KBSE will begin to supplant CASE with more powerful knowledge-based tools. This book documents industrial pilot projects in software maintenance and special-purpose program synthesis. Current AI technology can greatly enhance conventional maintenance tools to help maintenance programmers understand a software system and determine the ramifications of changes. Current AI technology can also help reengineer existing systems. Special-purpose program synthesis systems will likely become the next generation of application generators. Compared to the technology used in existing application generators, transformational technology can produce more optimal code and provide a higher-level user interface. In short, KBSE will revolutionize the practice of software engineering by adapting to and improving current practice.

Acknowledgments

The author thanks the following people for helpful reviews: Lee Blaine, Allen Goldberg, Cordell Green, Laura Jones, Richard Jullig, David Lowry, Larry Markosian, and Stephen Westfold. The ideas in this conclusion were stimulated by the chapters in Automating Software Design as well as sources in the literature, numerous conversations with other members of the KBSE community, and various research initiatives sponsored by the U.S. government. The views expressed in this conclusion are the sole responsibility of the author.

References

Abelson, H.; Eisenberg, M.; Halfant, M.; Katzenelson, J.; Sacks, E.; Sussman, G. J.; Wisdom, J.; and Yip, . 1989. Intelligence in Scientific Computing. Communications of the ACM 32(5): 546–562.
Anderson, J. S., and Fickas, S. 1989. A Proposed Perspective Shift: Viewing Specification Design as a Planning Problem. Presented at the Fifth International Workshop on Software Specification and Design, May, Pittsburgh.
Arango, G. 1988. Domain Engineering for Software Reuse. Ph.D. thesis, Dept. of Information and Computer Science, Univ. of California at Irvine.
Barr, A. 1990. The Evolution of Expert Systems. Heuristics 3(2): 54–59.
Barstow, D. R.; Shrobe, H. E.; and Sandewall, E.,

eds. 1984. Interactive Programming Environments. New York: McGraw-Hill.
Bergland, G. D., and Gordon, R. D., eds. 1981. Tutorial: Software Design Strategies. Washington, D.C.: IEEE Computer Society.
Biggerstaff, T. J.; Hoskins, J.; and Webster, D. 1989. Design Recovery for Reuse and Maintenance, Technical Report STP-378-88, MCC Corp., Austin, Texas.
Borgida, A.; Greenspan, S.; and Mylopoulos, J. 1986. Knowledge Representation as the Basis for Requirements Specifications. In Readings in Artificial Intelligence and Software Engineering, eds. C. Rich and R. C. Waters, 561–569. San Mateo, Calif.: Morgan Kaufmann.
Brown, D. W.; Carson, C. D.; Montgomery, W. A.; and Zislis, P. M. 1988. Software Specification and Prototyping Technologies. AT&T Technical Journal 67(4): 46–58.
Burson, S.; Kotik, G. B.; and Markosian, L. Z. 1990. A Program Transformation Approach to Automating Software Reengineering. In Proceedings of the Fourteenth International Computer Software and Applications Conference, 314–322. Washington, D.C.: IEEE Computer Society.
Cleaveland, J. C., and Kintala, C. M. R. 1988. Tools for Building Application Generators. AT&T Technical Journal 67(4): 46–58.
Chikofsky, E. J. 1989. Computer-Aided Software Engineering (CASE). Washington, D.C.: IEEE Computer Society.
Curtis, B.; Krasner, H.; and Iscoe, N. 1988. A Field Study of the Software Design Process for Large Systems. Communications of the ACM 31:1268–1287.
Dahl, O. J.; Dijkstra, E. W.; and Hoare, C. A. R. 1972. Structured Programming. In A.P.I.C. Studies in Data Processing, no. 8, eds. F. Duncan and M. J. R. Shave. London: Academic.
Despeyroux, J. 1986. Proof of Translation of Natural Semantics. In Proceedings of the Symposium on Logic in Computer Science. Washington, D.C.: IEEE Computer Society.
Fickas, S. 1985. Automating the Transformational Development of Software. IEEE Transactions on Software Engineering SE-11(11): 1268–1277.
Freeman, P., and Wasserman, A. I., eds. 1983. Tutorial on Software Design Techniques, 4th ed. Washington, D.C.: IEEE Computer Society.
Gane, C. 1990. Computer-Aided Software Engineering: The Methodologies, the Products, and the Future. Englewood Cliffs, N.J.: Prentice Hall.
Green, C.; Luckham, D.; Balzer, R.; Cheatham, T.; and Rich, C. 1986. Report on a Knowledge-Based Software Assistant. In Readings in Artificial Intelligence and Software Engineering, eds. C. Rich and R. C. Waters, 377–428. San Mateo, Calif.: Morgan Kaufmann.
Hartman, J. 1989. Automatic Control Understanding for Natural Programs. Ph.D. thesis, Dept. of Computer Sciences, Univ. of Texas at Austin.
Hunt, W. A. 1989. Microprocessor Design Verification. Journal of Automated Reasoning 5(4): 429–460.
Jones, C. B. 1986. Systematic Software Development Using VDM. Englewood Cliffs, N.J.: Prentice Hall.
Letovsky, S. 1988. Plan Analysis of Programs. Ph.D. thesis, Computer Science Dept., Yale Univ.
Marcus, S., ed. 1989. Machine Learning (Special Issue on Knowledge Acquisition) 4(3–4).
Moore, J. S. 1989. A Mechanically Verified Language Implementation. Journal of Automated Reasoning 5(4): 461–492.
Mostow, J. 1989. Design by Derivational Analogy: Issues in the Automated Replay of Design Plans. Artificial Intelligence 40(1–3): 119–184.
Neches, R.; Swartout, W. R.; and Moore, J. D. 1985. Enhanced Maintenance and Explanation of Expert Systems through Explicit Models of Their Development. IEEE Transactions on Software Engineering SE-11(11): 1337–1351.
Rich, C., and Waters, R. 1990. The Programmer’s Apprentice. New York: Association of Computing Machinery.
Roth, S. F.; Mattis, J. A.; and Mesnard, X. A. 1990. Graphics and Natural Language as Components of Automatic Explanation. In Architectures for Intelligent Interfaces: Elements and Prototypes, eds. J. Sullivan and S. Tyler. Reading, Mass.: Addison-Wesley.
Reubenstein, H. B., and Waters, R. C. 1989. The Requirements Apprentice: An Initial Scenario. Presented at the Fifth International Workshop on Software Specification and Design, May, Pittsburgh.
Schindler, M. 1990. Computer-Aided Software Design. New York: Wiley.
Shavlik, J. 1990. Acquiring Recursive and Iterative Concepts with Explanation-Based Learning. Machine Learning 5(1): 39–70.
Shiel, B. 1986. Power Tools for Programmers. In Readings in Artificial Intelligence and Software Engineering, eds. C. Rich and R. C. Waters, 573–580. San Mateo, Calif.: Morgan Kaufmann.
Spivey, J. M. 1989. The Z Notation: A Reference Manual. New York: Prentice Hall.
Steier, D. M., and Anderson, A. P. 1989. Synthesis: A Comparative Study. New York: Springer-Verlag.
Warren, D. H. D. 1977. Implementing Prolog - Compiling Predicate Logic Programs, volumes 1 and 2, D.A.I. Research Reports, 39 and 40, University of Edinburgh.
Westfold, S., and Green, C. 1991. A Theory of Automated Design of Visual Information Presentations. Technical Report KES.U.91.1, Kestrel Institute, Palo Alto, California.
Wills, L. M. 1989. Determining the Limits of Automated Program Recognition, Working Paper, 321, AI Lab., Massachusetts Institute of Technology.
Wing, J. M. 1990. A Specifier’s Introduction to Formal Methods. IEEE Computer 23(9): 8–26.
Young, W. D. 1989. A Mechanically Verified Code Generator. Journal of Automated Reasoning 5(4): 493–518.

Michael Lowry is affiliated with the AI Research Branch, NASA Ames Research Center in Moffett Field, California.