
DEDALUS—The DEDuctive ALgorithm Ur-Synthesizer*

by ZOHAR MANNA, Stanford University, Stanford, California, and RICHARD WALDINGER, SRI International, Menlo Park, California

* This research was supported in part by the National Science Foundation under Grants DCR72-03737 A01 and MCS76-83655, by the Office of Naval Research under Contracts N00014-76-C-0687 and N00014-75-C-0816, by the Advanced Research Projects Agency of the Department of Defense under Contract MDA903-76-C-0206, and by a grant from the United States-Israel Binational Science Foundation.

National Computer Conference, 1978

INTRODUCTION

Program synthesis is the automatic construction of programs to meet given specifications. These specifications constitute a high-level description of the desired program which expresses the purpose of the program, without indicating the method by which that purpose is to be achieved.

The specifications are expressed in terms of many constructs which are endemic to the particular subject domain of the desired program (e.g., numbers, sets, lists). Because these constructs are only intended to describe the purpose of the program and need not be computed, they can be of a much higher level than the constructs of any programming language (e.g., they can include logical quantifiers, constructors, and other noncomputable operations). The specification language can correspond closely with the concepts a programmer actually uses in thinking about the problem.

The techniques we are developing are independent of the choice of a target programming language. The particular language we use in our examples and in our experimental system is a simple LISP-like language containing only basic numerical and list-processing operations, conditional expressions, and recursion. In considering the formation of programs with side effects, we extend the language to include assignments to variables, array elements, and other data-structure components.

Our basic approach is to transform the specifications repeatedly according to certain rules; each rule replaces one segment of a program description by another, equivalent, segment. The process continues until a description is obtained that is entirely in terms of the primitive constructs of the target language; this description is the desired program. The entire sequence of descriptions leading from the specifications to the final program is called a program derivation.

The transformation rules are guided by certain strategic controls, which ensure that they are applied only at the appropriate time. Many of the transformation rules represent knowledge about the program's subject domain; some explicate the meaning of the constructs of the specification and target languages; a few rules correspond to basic programming principles, which are independent of the particular subject domain or programming language.

The programs constructed by these techniques are guaranteed to be correct and to terminate, and require no separate verification phase. Up to now, we have not been concerned with the efficiency of the target program.

The techniques we develop are tested in the implementation of an experimental system called DEDALUS (DEDuctive ALgorithm Ur-Synthesizer). This system is implemented in the QLISP programming language, an extension of INTERLISP that includes pattern-matching and backtracking facilities.

The identification of the basic programming principles, their codification as transformation rules, and their implementation in our experimental system have constituted a major component of our effort. Some of the principles we have identified so far are:

• Conditional formation: This principle causes a case analysis to be introduced into the derivation, yielding a conditional expression (or test) in the ultimate program.
• Recursion formation: This principle introduces a recursive call into the ultimate program by observing when a subgoal to be achieved is actually an instance of the desired top-level goal.
• Well-founded ordering: The termination of the recursive programs formed by the above technique is ensured by constructing a well-founded ordering with the property that the arguments of the program's recursive calls are all strictly less than the program's inputs.

• Procedure formation: A subsidiary procedure is formed when a subgoal is found to be an instance, not of the top-level goal, but of a previously generated subgoal.
• Generalization: A generalized procedure is formed when two subgoals are found to be instances of a third expression, which is somewhat more general than both.
• Simultaneous goals: In constructing a program to achieve two or more goals simultaneously, we first construct a program to achieve one goal, then modify that program to achieve the others as well, while protecting the condition that was already achieved.

Some of these principles are fairly well understood and have been incorporated into the DEDALUS system; others require further study. In the next sections we examine in somewhat more detail our basic program-synthesis techniques, indicating which areas have already been implemented. Our treatment of these techniques will be extremely brief; further discussion of the same topics, at a more leisurely pace, appears in the authors' Stanford University-SRI International report, "Synthesis: Dreams => Programs" (November 1977).

GENERAL FRAMEWORK

Specifications

In designing the specification language, we have incorporated many constructs (e.g., the set constructor and the logical quantifiers) that facilitate the description of a program but that may not be included in the target programming language. We present below examples of specifications for simple programs using some of these high-level constructs.

A program lessall(x l), to test if a number x is less than every element of a list l of numbers, is specified as follows:

lessall(x l) <= compute x < all(l) [...]

[...] x ≠ 0 or y ≠ 0.

The set constructor {u: P(u)} denotes the set of all elements u satisfying the property P.

The construct P(all(l)) and the set constructor {u: P(u)} are specification constructs that are nonprimitive (i.e., they are not in the target programming language). The synthesis task is to transform a description of the desired program, such as the specifications presented above, into an equivalent description that employs only primitive constructs of the target language.

The specification language is not fixed: as we consider new subject domains, we introduce new specification constructs accordingly.

Transformation rules

We use the notation

t => t' if P

to denote that a subexpression of form t may be replaced by the corresponding expression t', provided that the condition P is true.

For example, the rule

Q and true => Q

denotes the basic logical principle that an expression of form "Q and true" may be replaced by Q. This rule has no conditions; it can always be applied.

The rule

P(all(l)) => P(head(l)) and P(all(tail(l))) if not empty(l)

expresses the fact that a property P holds for every element of a nonempty list l if it holds for the first element head(l) and for every element of the list tail(l) of the other elements. This rule imposes the condition that the list l be nonempty.

In the DEDALUS system, transformation rules are represented as programs in the QLISP language. The full expressive power of the programming language may be brought to bear in representing each rule.

Derivation trees

In developing a program whose specifications are

f(x) <= compute P(x) where Q(x),

[...] u|v and u|w−v [...] we obtain the subgoal

Goal 2: compute max{z: z|x and z|y−x}

If a transformation rule imposes a condition P, which must be true if the rule is to be applied, a subgoal of the form

Goal: prove P

must be achieved before the rule can be applied. For example, in developing the program lessall(x l) to test if a number x is less than every element of a list l of numbers, we have the top-level goal

compute x < all(l) [...]

[...] conditional expressions by case analysis have been implemented by Buchanan and Luckham¹ and Warren².
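As an illustration of the rule notation t => t' if P, the following Python sketch encodes expressions as nested tuples and each rule as a function that returns the rewritten expression, or None when the pattern or the condition P fails. The tuple encoding, the names rule_and_true and rule_all, and the final lessall function (including its treatment of the empty list as true) are assumptions made for this illustration; DEDALUS itself represents rules as QLISP programs.

```python
# Expressions are nested tuples, e.g. ("and", Q, True) for "Q and true".
# This encoding is an assumption of the sketch, not the DEDALUS representation.

def rule_and_true(expr):
    """Q and true => Q (no condition)."""
    if (isinstance(expr, tuple) and len(expr) == 3
            and expr[0] == "and" and expr[2] is True):
        return expr[1]
    return None  # the pattern does not match


def rule_all(expr):
    """P(all(l)) => P(head(l)) and P(all(tail(l)))  if not empty(l).

    P is fixed here to "x < ...", so expr has the form ("less_all", x, l).
    """
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "less_all":
        _, x, l = expr
        if len(l) == 0:  # the condition "not empty(l)" fails
            return None
        return ("and", ("less", x, l[0]), ("less_all", x, l[1:]))
    return None


def lessall(x, l):
    """A recursive program of the shape suggested by rule_all: a case
    analysis on empty(l), then a recursive call for the subgoal on tail(l).
    Returning True for the empty list is an assumption of this sketch."""
    if not l:
        return True
    return x < l[0] and lessall(x, l[1:])
```

For instance, rule_all applied to ("less_all", 1, (2, 3)) yields a conjunction whose second part is again a less_all goal on the tail of the list, which is the kind of subgoal that recursion formation turns into a recursive call.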


The recursion-formation principle is well understood and is included in the DEDALUS implementation. The principle is the same as the "folding rule" of the Burstall and Darlington³ program transformation system.

Procedure formation

Suppose in developing a program whose specifications are of the form

f(x) <= compute P(x) where Q(x)

we encounter a subgoal

Goal B: compute R(t),

which is an instance, not of the output specification compute P(x), but of some previously generated subgoal

Goal A: compute R(x).

The procedure-formation principle proposes that we introduce a new procedure g(x) whose output specification is

g(x) <= compute R(x).

In this way, we can achieve both Goals A and B by calls g(x) and g(t) to a single procedure. In the case that Goal B has been derived from Goal A, the call to g(t) will be a recursive call; otherwise, both calls will be simple procedure calls.

For example, in constructing a program cart(s t) to compute the Cartesian product of two sets s and t, we are given the specification

cart(s t) <= compute {(x y): x∈s and y∈t}
where s and t are finite sets.

In deriving the program, we obtain a subgoal

Goal A: compute {(x y): x=head(s) and y∈t}

in the case that s is nonempty. Developing Goal A further, we derive the subgoal

Goal B: compute {(x y): x=head(s) and y∈tail(t)}

in the case that t is nonempty. Goal B is an instance of Goal A; therefore, the procedure-formation rule proposes introducing a new procedure carthead(s t) whose output specification is

carthead(s t) <= compute {(x y): x=head(s) and y∈t}

so that we can achieve Goal A with a procedure call carthead(s t) and Goal B with a (recursive) call carthead(s tail(t)). [...]

where

carthead(s t) <= if empty(t)
                 then { }
                 else {(head(s) head(t))} ∪ carthead(s tail(t)).

The basic procedure-formation principle has been implemented, and the DEDALUS system can carry out this and other such derivations. However, our method for proving the termination of ordinary recursive calls does not always extend to the multiple-procedure case.

Generalization

Suppose in deriving a program we obtain two subgoals

Goal A: compute R(a(x))

and

Goal B: compute R(b(x)),

neither of which is an instance of the other but both of which are instances of the more general expression

compute R(y).

Then the extended procedure-formation rule proposes that we introduce a new procedure, whose output specification is

g(y) <= compute R(y),

so that we will be able to satisfy Goal A by a procedure call g(a(x)) and Goal B by a procedure call g(b(x)).

For example, in constructing a program reverse(l) to reverse a list l, we derive two subgoals

Goal A: compute append(reverse(tail(l))
                       cons(head(l) nil))

and

Goal B: compute append(reverse(tail(tail(l)))
                       cons(head(tail(l)) cons(head(l) nil))).

Each of these goals is an instance of the more general expression

compute append(reverse(tail(l))
               cons(head(l) m));

therefore, the extended procedure-formation rule proposes [...]
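The carthead procedure from the procedure-formation example above can be transcribed almost literally into Python. In this sketch finite sets are passed as Python lists without duplicates, so that head and tail become s[0] and s[1:], and results are returned as Python sets of pairs. The carthead body follows the program given in the text; the top-level cart function is an assumption of this sketch (a case analysis on empty(s) with a recursive call on tail(s)), since that part of the program does not survive in the text.

```python
def carthead(s, t):
    """carthead(s t) <= if empty(t) then { }
                        else {(head(s) head(t))} U carthead(s tail(t))."""
    if not t:
        return set()
    return {(s[0], t[0])} | carthead(s, t[1:])


def cart(s, t):
    """Assumed top-level program: pairs for head(s), then a recursive
    call for tail(s). This shape is a guess, not quoted from the paper."""
    if not s:
        return set()
    return carthead(s, t) | cart(s[1:], t)


if __name__ == "__main__":
    print(sorted(cart([1, 2], ["a", "b"])))
```

Note that carthead calls itself on tail(t), which corresponds to Goal B being achieved by a recursive call, while cart calls carthead as a simple procedure call for Goal A.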


[...] it turns out that the reversegen procedure is actually easier to construct. The final system of programs we obtain is

reverse(l) <= if empty(l)
              then nil
              else reversegen(l nil)

where

reversegen(l m) <= if empty(tail(l))
                   then cons(head(l) m)
                   else reversegen(tail(l)
                                   cons(head(l) m)).

The generalization mechanism and the extended procedure-formation principle are just beginning to be formulated; the elaboration of these concepts is an important part of our projected effort. Generalization was proposed as a program-synthesis technique by Siklossy,⁴ and is routinely performed by theorem-proving systems for proofs by induction (e.g., see Boyer and Moore⁵).

[...] To circumvent difficulties of this sort, we have introduced the following simultaneous-goal principle: To satisfy a goal of form

achieve P1 and P2

first construct a program F to achieve P1, then modify F to achieve P2 while protecting P1 at the end of F. The program-modification technique we employ is based on the "weakest-precondition operator" (Dijkstra⁶). A special "protection mechanism" (cf. Sussman⁷) ensures that no modification is permitted that destroys the truth of the protected condition P1 at the end of the program.

To apply this principle to the goal

achieve x≤y and y≤z

in the sorting problem, we first construct the program segment sort2(x y) that achieves the first condition. We then modify this program to achieve the second condition y≤z. We cannot achieve this condition by inserting the instruction sort2(y z) at the end of the program, because (as we have seen) this modification violates the condition x [...]
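The reverse and reversegen programs given above carry over directly to Python if lists stand in for LISP lists, with l[0] for head(l), l[1:] for tail(l), and [a] + m for cons(a m). This is only a transcription sketch under those assumptions. As in the original program, reversegen expects a nonempty first argument.

```python
def reversegen(l, m):
    """reversegen(l m) <= if empty(tail(l)) then cons(head(l) m)
                          else reversegen(tail(l) cons(head(l) m)).

    For nonempty l this computes the reverse of l followed by m."""
    if not l[1:]:
        return [l[0]] + m
    return reversegen(l[1:], [l[0]] + m)


def reverse(l):
    """reverse(l) <= if empty(l) then nil else reversegen(l nil)."""
    if not l:
        return []
    return reversegen(l, [])


if __name__ == "__main__":
    print(reverse([1, 2, 3]))
```

The extra argument m is what makes reversegen more general than either Goal A or Goal B: Goal A corresponds to m = nil, and Goal B to the list tail(l) with m = cons(head(l) nil).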

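The difficulty with the sorting goal described above can be made concrete with a small Python sketch. Here sort2 returns its two arguments in nondecreasing order rather than assigning to variables, which is a simplification assumed for this illustration. The function naive simply appends sort2(y z) after sort2(x y); the function protected adds one further sort2(x y) to re-establish the first condition. That repair is only one simple possibility chosen for the sketch; it is not the weakest-precondition modification technique the paper refers to.

```python
def sort2(a, b):
    """Return the pair (a, b) in nondecreasing order."""
    return (a, b) if a <= b else (b, a)


def naive(x, y, z):
    x, y = sort2(x, y)  # achieves x <= y
    y, z = sort2(y, z)  # achieves y <= z, but may change y
    return x, y, z


def protected(x, y, z):
    x, y = sort2(x, y)
    y, z = sort2(y, z)  # z is now the maximum of the three values
    x, y = sort2(x, y)  # re-establishes x <= y without touching z
    return x, y, z


if __name__ == "__main__":
    print(naive(2, 3, 1))      # the first condition x <= y is violated
    print(protected(2, 3, 1))
```

With inputs 2, 3, 1 the naive version ends with x = 2 and y = 1, so the condition achieved first no longer holds, which is the kind of interference the protection mechanism is meant to rule out.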

[...] structured programming (cf. Dijkstra⁸) presents principles for deriving a program systematically from given specifications. However, the principles of structured programming are intended to guide a human programmer, whereas the principles of program synthesis are meant to direct a computer system. Nevertheless, we [...]

The specifications are expressed in a LISP-like notation. Thus, the output specification for the lessall program, which we wrote as

x < all(l) [...]


When a new goal is generated, the QLISP system retrieves those rules whose patterns match the form of the goal. This retrieval is facilitated by arranging the rules in a classification tree according to their patterns; thus the two rules above would be classified on the same branch of the tree. This mechanism allows us to avoid matching every rule in the system against each newly-generated goal.

If no rule matches the entire expression of a goal, its subexpressions are established as subgoals. If no rule matches any subexpression of a given goal, a failure occurs, and backtracking is invoked; the system attempts to find an alternate transformation that applies to a previous subgoal.

The QLISP pattern-matcher has special provisions for matching commutative functions. Thus, because the and operation is commutative, the rule

Q and true => Q,

represented as the QLISP program

(QLAMBDA (AND ←Q TRUE) $Q),

can be applied to goals of form "true and Q" as well as "Q and true". For this reason, commutativity rules such as

P and Q => Q and P

are not necessary in the DEDALUS system.

This kind of matching also occurs in the recursion-formation rule, in determining whether a new goal is an instance of some earlier goal. For example, in the actual synthesis of the gcd program, the top-level goal

compute max{z: z|x and z|y}

is regarded as an instance of itself with the roles of x and y reversed, because the and is commutative. The recursion-formation rule, therefore, is able to propose the recursive call gcd(y x).

The final gcd program we obtain is [...]

[...]

List Programs:

• finding the maximum element of a list
• testing if a list is sorted
• testing if a number is less than every element of a list of numbers (lessall)
• testing if every element of one list of numbers is less than every element of another

Set Programs:

• computing the union or intersection of two sets
• testing if an element belongs to a set
• testing if one set is a subset of another
• computing the Cartesian product of two sets (cart).

COMPARISONS WITH AUTOMATIC PROGRAMMING

It has been claimed (e.g., see Balzer¹³) that, for a complex programming task, it is unrealistic to expect the user to formulate complete, correct specifications for the desired program. In specifying an airline-reservation system, an operating system, or a spacecraft-guidance system, for example, we are unlikely to anticipate the desired behavior of the system in every possible situation. In some systems, the specifications for the program are formulated gradually through an extended dialogue between the user and the system. (See, e.g., Green¹⁴ and Balzer et al.,¹⁵ or the survey of Heidorn¹⁶.) The dialogue is continued during the program-construction process, so that the user can aid in the design of the algorithm and resolve any ambiguities or inconsistencies the system might discover. Typically, these systems attempt to play the role of an expert programmer-consultant, and they tend to rely more on built-in or acquired knowledge of a particular subject domain than on deductive processes.

By concentrating on basic programming principles, we have focused on general techniques that apply to any subject domain. The ultimate automatic programming system, of [...]
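The commutative matching described in the implementation discussion above can be sketched as a small Python pattern matcher. Expressions are nested tuples such as ("and", "p", True); the class Var plays the role of the QLISP pattern variable in (AND ←Q TRUE). The names Var, match, COMMUTATIVE, and and_true are hypothetical and belong to this sketch only; the actual QLISP matcher is not reproduced here.

```python
COMMUTATIVE = {"and", "or"}  # operators whose two arguments may be swapped


class Var:
    """A pattern variable (the role played by <-Q in the QLISP pattern)."""

    def __init__(self, name):
        self.name = name


def match(pattern, expr, bindings=None):
    """Return a dict of variable bindings if pattern matches expr, else None."""
    bindings = dict(bindings or {})
    if isinstance(pattern, Var):
        if pattern.name in bindings and bindings[pattern.name] != expr:
            return None
        bindings[pattern.name] = expr
        return bindings
    if isinstance(pattern, tuple) and pattern:
        if (not isinstance(expr, tuple) or len(pattern) != len(expr)
                or pattern[0] != expr[0]):
            return None
        orders = [expr[1:]]
        if pattern[0] in COMMUTATIVE and len(expr) == 3:
            orders.append((expr[2], expr[1]))  # also try the swapped order
        for args in orders:
            result = bindings
            for p, e in zip(pattern[1:], args):
                result = match(p, e, result)
                if result is None:
                    break
            else:
                return result  # every argument matched in this order
        return None
    return bindings if pattern == expr else None


def and_true(expr):
    """Q and true => Q, usable on "true and Q" as well as "Q and true"."""
    found = match(("and", Var("Q"), True), expr)
    return found["Q"] if found is not None else None
```

Because the matcher itself tries both argument orders for commutative operators, a single rule covers both forms of the goal and no separate commutativity rule is needed, which is the point made in the text.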

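The final gcd program is not legible in the text above, so the following Python sketch should be read as an assumption rather than as the program DEDALUS produced. It combines the two recursive calls the paper does mention, one arising from the subgoal compute max{z: z|x and z|y−x} and one of the form gcd(y x), with a base case for x = 0 that is this sketch's own choice. The function gcd_spec is a direct, inefficient reading of the specification max{z: z|x and z|y}, included only to check the recursive version against it.

```python
def gcd_spec(x, y):
    """Direct reading of the specification max{z : z|x and z|y}.
    Requires nonnegative integers with x != 0 or y != 0."""
    return max(z for z in range(1, max(x, y) + 1)
               if x % z == 0 and y % z == 0)


def gcd(x, y):
    """Assumed recursive program (not quoted from the paper)."""
    if x == 0:
        return y               # assumed base case: every z divides 0
    if y >= x:
        return gcd(x, y - x)   # from the subgoal max{z : z|x and z|y-x}
    return gcd(y, x)           # roles of x and y reversed


if __name__ == "__main__":
    print(gcd(12, 18), gcd_spec(12, 18))
```

The specification only states what is to be computed, while the recursive version uses nothing beyond tests, subtraction, and recursion. Termination here rests on the second argument decreasing in the subtraction case, in the spirit of the well-founded ordering principle.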

REFERENCES

7. Sussman, G. J., A Computer Model of Skill Acquisition, American Elsevier, New York, NY, 1975.
8. Dijkstra, E. W., A Discipline of Programming, Prentice-Hall, Englewood Cliffs, NJ, 1976.
9. Guttag, J. V., E. Horowitz and D. R. Musser, Abstract Data Types and Software Validation, Technical Report, Information Sciences Institute, Marina del Rey, CA, August 1976.
10. Wilber, B. M., A QLISP Reference Manual, Technical Report, SRI International, Menlo Park, CA, March 1976.
11. Teitelman, W., INTERLISP Reference Manual, Xerox Research Center, Palo Alto, CA, 1974.
12. Waldinger, R. J., "Achieving Several Goals Simultaneously," in Machine Intelligence 8: Machine Representations of Knowledge, (E. W. Elcock and D. Michie, eds.), Ellis Horwood Ltd., Chichester, England, 1977, pp. 94-136.
13. Balzer, R. M., Automatic Programming, Technical Report, Information Sciences Institute, University of Southern California, Marina del Rey, CA, September 1972.
14. Green, C. C., "The Design of the PSI Program Synthesis System," Proceedings of the Second International Conference on Software Engineering, San Francisco, CA, October 1976, pp. 4-18.
15. Balzer, R. M., N. Goldman, and D. Wile, "Informality in Program Specifications," Proceedings of the Fifth International Joint Conference on Artificial Intelligence, Cambridge, MA, August 1977, pp. 389-397.
16. Heidorn, G. E., "Automatic Programming Through Natural Language Dialogue: A Survey," IBM Journal of Research and Development, Vol. 20, No. 4, July 1976, pp. 302-313.
17. Warren, D. H. D., WARPLAN: A System for Generating Plans, Technical Report, Department of Computational Logic, University of Edinburgh, Edinburgh, Scotland, June 1974.