We Shall Refer to Previous Work in Heuristics by Its Acronym And/Or The

CONCEPTS AND METHODS FOR HEURISTIC SEARCH

Erik J Sandewall Computer Sciences Department Uppsala University- Uppsala, Sweden

0. Summary The advantages and disadvantages of each approach are obvious: generality has to be paid The transformation or derivation problem for by a decrease in program efficiency. An treated by most "problem-solving" programs is advantage with the generality approach is that expressed in a formal notation, and various methods one single heuristic method can quickly be put to for "problem-solving" are reviewed. The conventio use in a variety of problem domains. nal search tree is generalized into a search lattice which can accomodate multiple-input opera It would seem, however, that methods which tors, e.g. resolution. The paper argues that have been developed in one "expertise" program descriptions of heuristic methods can be signi can be carried over to another problem environment ficantly compacted if a higher degree of formaliza and another program. The only problem is to pull tion is used. This point is illustrated with two out the abstract heuristic methods from the program practical examples. descriptions, which are often quite technical and detailed. 1. Introduction One example of this will suffice. The SIN This paper is an attempt at a survey and program contains an important heuristic, which synthesis of past work on heuristic search methods. Moses describes as follows: "The Edge heuristic Following Feigenbaum and Feldman (1963), we define is based on the Liouville theory of integration. a heuristic method as a device which drastically In this theory it is shown that if a function is limits search for solutions in large problem integrable in closed form, then the form of the spaces. As can be seen from the list of references, integral can be deduced up to certain coefficients. much work on heuristic methods has been performed A program which employs the Edge heuristic, called during the last few years. Edge, uses a simple analysis to guess at the form of the integral and then it attempts to obtain Two approaches have been competing in this coefficients." (page 8). The Edge heuristic is work. In his report on SIN , Moses characterizes further described on seventeen pages in chapter 5- them as emphasizing generality and expertise, respectively. In the generality approach, one Unfortunately, the author fails to formulate tries to write a general program which can solve this important heuristic method in abstract terms. all kinds of problems, provided only that Such an abstract formulation could e.g. run as (adequately phrased) information about the follows: The purpose of the integration program particular "problem environment" of each problem is to start from a given, initial object, and to is provided. The "General Problem Solver" (sic!), apply the right operators (from a given set of DEDUCOM, and the Graph Traverser are examples of operators) in the right order, until the given this approach. object has been transformed into a given target set (i.e. the set of all expressions where the In the approach that stresses expertise, one integral sign(s) have been eliminated). The Edge concentrates instead on writing a good program for heuristic relies on information which is local to solving problems in one given problem environment. this particular problem environment, and which SIN itself is a typical example of that approach, makes it possible to say, during the search of as are game-playing programs (Samuel' s, Greenblatfs), the solution tree, where in the target set we will and some programs which, according to rumour, are eventually land. The Edge program utilizes this being used for industrial purposes. information to get a better estimate of the remaining "distance" to the target set from each node. - With such a description, it becomes clear This research was supported in part by the that the same heuristic may well be applicable to Swedish Natural Science Research Council (contracts other problem environments, in other expertise- Dnr 2711-5, 2711-6) and by the Swedish Research oriented programs. Institute of National Defense (bestalln. 714257, 715411). Abstract method descriptions, as outlined ** .... here, can of course not serve as substitutes for We shall refer to previous work in heuristics conventional ones. A concrete description, like by its acronym and/or the author's name. For exact the one Moses has given for SIN, will always be references, use tables at the end of the paper. -199- needed by the user of the program, or the set of objects, rather than a single object. In researche who attempts to improve on previous the transformation process, each output of the work. By contrast, the abstract description is operator must then be transformed into the target useful for the man who wants to carry over set. methods to other problem environments, and (of course) for the theoretician who, some time in Example: In analytic integration, the target set M the future, will attempt to build a mathematical consists of the set of all formulae where the theory of heuristics. integration sign does not occur. The rule

The morale is, therefore, that we need an abstract frame of reference, a set of concepts for describing and analysing heuristic methods. can be used as an operator q defined by Such concepts would help in the dissemination of know-how; they would also make it possible to compare the efficiency of various methods and programs, expertise-oriented as well as generality- In other words, q tells us to integrate A + B oriented. by integrating A and B separately. (The final task of joining together the solutions to those In this report, we shall attempt to set up two integration problems with a + sign is a such a "frame of reference". In section 2, we trivial matter). formulate a general "transformation problem", and discuss some of its cases. In sections 3-4, Operators with several input. various commonly used heuristic techniques are The problem specification is changed as follows. formulated and discussed. Since we argued, in Initially, each member of R is considered / section 2, that one-input and multiple-input available. At each cycle of.the solution process, operators must be carefully distinguished, we use one selects one operator which requires i section 5 to extend the conventional search tree arguments, and i available objects into a search lattice. Our stock of concepts is tested in sections 6-7, where abstract descriptions of some well-known programs and heuristic methods are given. is defined, it is included among the available objects. Problem: find some available object which 2. Heuristic search: rules of the game is also a member of M .

The problem environments for heuristic search Exemple: This variation frequently occurs in methods always include a set P of objects and a "forward" logical inference, e.g. in the resulu- set Q of operators on these objects. The following tion logic environment. It has been common practice problem has often been studied, (see e.g. {Newell in heuristic research to consider the cases of 1960c} and {Doran 1967a}), and has sometimes been several inputs or several outputs as trivial referred to as the problem-solving problem: extensions of the one-input/one-output case. For example, the General Problem Solver is formulated Basic transformation problem. in terms of one input operators, and then Given an initial set immediately applied to a problem environment where determine r in R and q, ,q ,... q in a two-input operator (Modus Ponens in forward proof) 2 r Q such that is essential. Similarly, Slagle s group have attempted to use their MULTIPLE program (which is designed for one-input, multiple-output operators) to the resolution logic environment, where the most exists and is a member of the target set M. We important operator has two inputs and one output. call this a transformation problem from R to M. The fact that an operator requires several A method for solving basic transformation inputs can be "hidden" in various ways. In the case problems is called a heuristic search method if of Modus Ponens, which takes A and as it searches the tree(s) of all possible operator inputs, one can say that the operator "essentially" applications, and the order in which the nodes of takes as input, so that the merit of an this tree are inspected, is governed in some ways formula determines whether the operator - by properties of the nodes which have already been shall be applied or not. If the system decides to created. Heuristic methods require, therefor, that apply Modus Ponens to a formula it checks the objects in P are known as symbolic expressions whether the formula A is available. If it is not, or otherwise have a non-trivial information content. the output is "failure", - Another, and more general They cannot simply be non-informative tokens of the way of hiding multiple inputs is to consider the set form "p.". The following variations to the basic of all available objects as a "higher level" object. transformation problem occur frequently: Similarly, the operators are redefined to accept one higher level object as input, and to emit an Operators with several outputs. incremented object as output. The disadvantage of all such tricks is that important information gets The problem specification is changed as lost to the system. For example, with the follows. Application of an operator can return a

200- introduction of "higher level" objects and literals, and the operator "annihilates" (in a operators, one will have certain sense) two literals, one from each input. The operator has one output for each combination of literals in the two inputs, and is therefore ambiguous. ( except when is essential for the application of q, or q(p) - p is essential for (c) because the operator requires a parameter, the application of q' ). It is hard to make which may or may not be in the set of objects. traditional tree-search routines "aware" of such For example, in order to prove B in conventional commutativity. In our opinion, one should instead predicate calculus, it is sufficient to prove A face the fact that some operators take multiple and , where A is arbitrary. inputs, and study then separately. We shall refer to all operators which yield or- oonnected outputs, as ambiguous. Thus (a) Thus the failure to recognize multiple-input exemplifies an ambigous perporator, (b) an operators has led to inefficient programs. It has ambiguous conporator, and (c) an ambiguous also led to a regrettable lack of communication: diporator. techniques which have been designed for handling multiple-input operators (e.g. the various Still another complication is "strategies" for the resolution method) have not been recognized as heuristic methods. People seem Operators with restricted domain, i.e. a domain to think that they are technical details for which is a proper subset of the set P. Some handling resolution, whereas in fact they are possible ways of dealing with this complication examples of quite general heuristic principles. are discussed in section 3. One can make a parallell with the "Edge" heuristic discussed in section 1: general principles have Example: In integration, the partial integration gone unnoticed for lack of abstract concepts to operator is not always applicable. phrase them in. A final complication is As a first step to remedy this situation, let us introduce separate names for the various kinds No back-up. of operators. The following terms are believed to be illustrative: In typical problem-solving, application of an operator is never irrevocable: we are always number of inputs number of outputs name permitted to back up in the solution tree and try some other operator on a previously used object. one one perporator In some situations (e.g. the Edinburgh studies of one multiple diporator heuristic automata), one encounters similar multiple one conporator problems where back-up is not permitted. The multiple multiple fociporator transformation problem the* boils down to the problem of selecting the best operator in each Our second step is to introduce a formalism and a step. vocabulary which enables us to deal with these different kinds of operators. The formalism is based on lattice theory, and requires a section Sometimes, e.g. in planning, a back-up problem (section 5) of its own. can be transformed to a no-back-up problem, or vice versa. We therefore consider both kinds as variants Our third step will be to illustrate these of the same basic problem. general concepts and principles by re-interpreting some current heuristic methods (including the unit Summing up, transformation problems can be characterized by a couple of features, i.e. preference strategy in resolution). This is done in sections 6 and 7. (1) what kinds of operators? (per-, con-, di-, foci-porators) Some other complications which may occur in the basic transformation problem,are: (2) are operators ambiguous? (3) are there restrictions to the domain of Operators with or-connected outputs. operators? One often encounters operators which, like (4) is back-up permitted? diporators, yield a set of objects of outputs, but which merely require that one of the outputs is to 3. Approaches to heuristic search be transformed to the target set. Such or- connections may occur In this section, we shall attempt to classify (a) intrinsically,"in order to prove and name some methods of heuristic search. Our classification will be put to use in the next few prove a, or prove sections, where some previously published methods (b) because the operator is ambigous, e.g. in for heuristic search are reviewed. resolution logic, where the resolution operator takes two clauses as input and gives one clause as In example (b), we assume forward proof, and output. Each of the two clauses is a set of in (a) and (c) backward proof.

-201- In each cycle of the heuristic search process, a few (but not all) operators are selected the program should select one operator to use, and and applied to the object(s). one object (viz. set of objects) to use it on. Object selection seems to be performed in most (Bk) Object and operator together method: Consider cases by either of the following two methods: all possible object-operator combinations and select one of them, using a priority function. (Al) Labyrinthic methods proceed doan the search (This is in other words a best-bud method, tree, and have an explicit mechanism for where each object-operator combination is deciding direction in the tree.* This considered as a "bud".) mechanism tells the program "this is a good branch, go on the same direction", or "this The MULTIPLE program is an example of (B2), is a bad branch, back up — steps and select GPS and SAINT are examples of (B3), whereas unit another branch". preference and PPS are examples of (B4). The version of the Graph Traverser described in { Doran (A2) Best bud methods use an evaluation function 1966a} is an example of (B2), whereas the later which assigns a priority or merit to each version described in {Michie 1967a} is of type (Bl). growth direction (bud) in the tree. At each cycle, the program takes a global look at In methods (B2) and (B3), object selection in all the buds, selects the best one, sprouts one cycle is effectively a choice of operator in it, and iterates the cycle. In the new cycle, the previous cycle. Therefore, they can be the best bud from last cycle is no longer a considered as special cases of (Bl), with a very candidate, but it has yielded several new careful and timeconsuming method for operator buds. All other buds from last cycle are selection. candidates anew. Back-up occurs automatically if the new buds are unable to compete with The four cases above are clearly not the stand-by buds from last cycle. exhaustive, as it is in principle quite possible to run an operator first, object afterward method. Methods (Al) and (A2) have been formulated Also, labyrinthic instead of best-bud selection of for perporators. It is easy to extend them to operators is possible (one would keep using the diporators. For conporators, it is sometimes a same operator until a "back-up" or "change operator" good idea to select one input to the operator criterion is satisfied). However, these possibilities according to a labyrinthic or best-bud method, are probably useless for practical problems. and then to select "best companions" to the selected first input. We consider this the If the number of operators is very large, or generalization of (Al) and (A2) to multiple-input if some operators are ambiguous with a large number operators. A third method cathegory for them would of alternatives, then it is not possible to search be through all possible cases. This excludes (B2) and (B4) methods. One must first select the proper (A3) Best bud bundle methods, which use an object(s), and then use a function which selects evaluation function which assigns a priority one or a few operators (and ways of applying them, to each combination ("bundle") of "buds", if ambiguous). Usually, this function recognizes and selects the best one in each step. features in the given object, features which determine what operators may be suitable. GPS and SIN use labyrinthic methods, whereas SAINT, the Graph traverser, MULTIPLE, and PPS use In many practical problem environments, one best-bud methods. The unit preference heuristics encounters operators which are only defined on a (strategy) in resolution is an example of a best subset of the set P of objects. This restriction bud bundle method. has been dealt with in at least two ways, which provides us with a classification in still another Another (and at least in principle, dimension: independent) basis of classification is how the program selects the operator in; each cycle. The (Cl) Consider as failure. If we have heuristically following methods have often been used in practice: selected an object and an operator, and it turns out that the object is not in the domain (Bl) Object(s) first, one operator afterwards of the operator, then give up this branch and method; First select the most promising try something else. object(s) to work upon, according to a labyrinthic or best-bud method. After that, (C2) Solve sub-problem. Let Mr be the domain of the find a good operator to apply to it (them). operator. Solve the transformation problem from the given object to M1, and apply the (B2) Exhaustive method: Select object(s) like in given operator to the result. Formally, we (Bl) and apply all operators to it. extend the definition of our operators, so that q(p) ■ Q.(Pi) > vhere p1 is the (B3) Object(s) first, a few operators afterwards (possibly ambiguous I) solution to the trans- method: A compromise between (Bl) and (B2): formation problem from p to the domain of 0. • As we shall see later, we sometimes have a lattice rather than a simple tree.

-202- SAINT uses a type (Cl) method, whereas GPS and At first sight, the idea of using a merit PPS use type (C2) methods. ordering has much appeal. On closer scrutiny, it turns out to be less than obvious. It all depends In conclusion, we have pointed out three on what kind of economy we desire. features in heuristic methods. These features can be used to classify and characterize the methods. Suppose we are solving a transformation They are: problem for perporators, and that we have already searched part of the tree. Then which of the (A) Mode of object selection following quantities do we want to minimize in our (B) Mode of operator selection next step: (C) Way of handling restricted domains for (Dl) The number of steps (i.e. operator applications) operators. in the "solution path" from the initial set R to the target set M ? k. Some frequent techniques in heuristics. (D2) The remaining number of steps in the "solution In this section, we shall discuss the use of path" from the selected bud to a member of the "merit orderings", plans, and feature vectors target set M ? ("images") in heuristic methods. (D3) The (remaining) number of steps, including Use of merit orderings. steps that are performed in blind alleys (i.e. Definitionwise, best-bud methods require that the total number of arcs in the solution tree there exists a way of selecting the "best" one the way it looks when we have reached M)? from a set of buds. In all best-bud-type methods known to the author, this selection is based on an (DM The quantity mentioned in (D3), except that (explicit or implicit) partial ordering > on the if a path is trodden, abandoned through back- set P of objects. Some maximal bud according to > up, and then resumed, the steps which are (i.e. some bud b such that no other bud trodden several times shall be counted aB satisfies is then selected as "best bud", multiple steps? and is sprouted. If the path to the solution of the transforma In some, but not in all cases, the merit tion problem is to be used as a plan for a more ordering > is implemented as an explicit merit expensive activity in another environment, then assignment function e , i.e. a mapping from P (Dl) is of course the correct criterion. On the to the set of real numbers. > is then defined in other hand, if we are interested in a member of M, an obvious manner through rather than in the path to this member (e.g. if we are searching for a solution to an integration problem), then (D3) or (D4) would be the correct quantity to minimize. (D3) should be used if the entire search tree is stored in memory, and (D4) The problem of finding a suitable merit should be used if the search tree is stored ordering for a given problem environment is of implicitly on the push-down-list, so that abandoned course crucial. Often, it is thought about as an paths are garbage-collected and all work there has estimate of distance. One attempts to define a to be re-performed. (D2) is sound in no-back-up function d , where is a rough estimate situations, like Doran's heuristic automaton. of the work (the number of operator applications) required to transform into. Similarly, If criterion (Dl), (D3), or (D4) is to be used, one attempts to compute then the "merit" of a bud is not simply a function of that bud and the target set, but instead a function of the srhole "stump" of the solution tree that has been searched up to now. For example, if for reasonable sets B . The merit function e is the criterion (D3) is used, then the remaining then defined e.g. as work from a bud is affected if there exists some other bud which has almost as much merit, and which The use of merit orderings is not restricted in the future may attract the problem-solver's to best-bud methods. In labyrinthic methods, the attention for blind-alley work. It follows that the criterion for abandoning a path and trying another idea of a merit ordering is sound only if we want may be that by some merit ordering. to use criterion (D2). The GPS utilizes exactly this heuristics. Although theoretically shaky, the use of merit orderings seems to be the only available technique The name "General Problem Solver" has some today. If criteria (D3) or {D4) are relevant (which times been criticized as being too uninformative. is usually the case), then the use of a distance It is natural to call a heuristic method goal- estimate as a merit function is even more question directed if its merit function is defined through able. We shall treat this question in a later paper. D. The variant of GPS described in {Newell 196la} But again, the distance estimate seems to be the can then be characterized as a Goal-directed only technique we have. Perporator Search method.

-203- Use of plans. between features); Let P, 1 Q, R, and M define a transformation (2) as objects in an auxiliary problem space used problem for which a solution is known, and let P', for planning; Q' = Q, R' , and M' define a transformation problem which is to be solved. Assume also that there (3) in methods of type (B3), for the selection of exists some mapping h which maps P1 onto P, R' operators that should be applied to a given object. onto R, etc. in such a way that if p and q(p) are steps in the known solution, and if p = h(p'), then q(p) = h(q(p')). In other words, the function Examples: (l) game-playing programs and (with h maps solutions in P' onto solutions in P. Then certain modifications) Doran's heuristic automaton; we can clearly find a solution in P' by just re (2) planning GPS, Planner, PPS; (3) GPS. tracing the solution in P *. The solution in P will be referred to as a plan for the solution in In this section, we have described and P'. classified general heuristic techniques, and given references from each technique to actual programs which utilizes it. In sections 6-7, we This ideal situation probably never exists, shall build an inverse system of references. Each except when h is the identity function. However, section will review one heuristic program in terms it may be the case that the requirement q(p') = of the classification and concepts above. h(q(p)) often (though not always) holds. Then it can still be a good strategy to try to follow the plan. If it does not work, we have to take resort 5. Lattice instead of trees. in another plan, or in the object-operator selection methods mentioned above, (in other words, Heuristic search is often referred to as tree use of plans may be considered as yet another search. However, the tree model is only applicable method, (B5), of operator selection). to cases where all operators are perporators or (with some extra conventions) diporators. With conporators, the need for a more general structure Plans can be generated in several ways, e.g. arises. In this section, we shall suggest one by memorization of previous, successful solutions possible way of performing the generalization. (Doran's heuristic automaton), by human advice, or by "look-ahead": solution of an analogous problem in an auxiliary problem space (e.g. in Instead of a solution tree, we shall introduce the Planner system and the PPS). a solution lattice. For perporators, but not for diporators, the solution lattice degenerates into a tree as usually drawn. - For a good introduction When the problem environment is predicate to lattice theory, see {Rutherford 1965a}. calculus, the "abstraction function" h can e.g. be selected so as to throw away everything except First some general notation. Let q be an the variables in the formulas (planning GPS) or unambiguous operator which is defined with one to throw away everyting except the boolean set P1 C P as inputs, and which yields P"C P connectives (Planner). as outputs. We then write P" c q(P') . For the A third technique is moment, we forget about ambiguous operators.

Use of images. The ordered k-tuple whose elements are a , a2, ... a, will be written . By an image, we mean an item which expresses a and < a> are considered as distinct items*. some, but not all the information of an object in If b is a k-tuple, the last element of b is the set P. The image may be for example, a vector written a)(b) .If B is a set of tuples, the set of features in the object, or (in the case of a of last elements of members of B is written ft(B) . LISP-type formula), the top-levei structure of the object, with lowerlevel sub-expressions being We now define the set S (the solution lattice) replaced by asterisks. Although they rarely talk as follows: about it in abstract terms, many creators of heuristic programs do in fact use such images. (1) if p is an object, then

is a member of S; Images are used for several purposes, including: (2) if s and t are members of S , then sv t and s A t are also members of S; (l) as a basis for merit functions (a numerical value is assigned to each feature, and merit (3) see below. is computed as a weighted average of the feature values) or distance functions Following Rutherford, we define xcy to computed as a wighted average of distance" mean x ■ x A y . Also, we assume commutative, associative, and absorptive laws for VJ and o . To insure that we have a solution, we must Distributive and idempotent ( x X ■ x etc.) assume that only members of M' are mapped into M, laws for ^ and r\ follow easily. Also, we find i.e. that the C relation is transitive, and that n(pf) (Mop' £ M' x C y A y C x x = y Moreover, it is essential that R1 is mapped onto (rather than into) R, and that M' is mapped onto Instead, we shall frequently write a when M. we mean {a} • -204- -205- inferred from the others, and is crucial. for the integral of a sum. Some of the perporators (e.g. the substitution operator) are ambiguous and Let us finally turn to the case of ambiguous governed by a parameter. Most operators have a operators. Suppose, in example 2 (figure 5) that restricted domain. q is an ambiguous perporator, and that it is sufficient to transform either Discussion of heuristic method. set M. We then simply redefine It is natural to sort up the operators in Q with unchanged notation otherwise. See figure 8,^ and compare figure 5. into the following disjoint cathegories: a. Standard forms (26 operators). These are If desired, the notation can of course be perporators whose output is always in the further extended to arbitrarily complex and/or target set M (if the input contains only one structures: occurrence of the integral operator). An example of such a perporator is Example 4. After applying operator q to object p , we find that it is sufficient to transform

Remark: the possibility to single out those operators which land in the target set is to the target set M. With s. as before, we then particular for this problem environment, and simply define does not occur in e.g. logical inference.

b. Algorithm-like transformations (8 operators). These are operators which, if applicable, are usually appropriate. The diporator is one of Using the obvious distributive etc. laws, them. this expression can be reduced to the canonical form of an ambiguous diporator. c. Heuristic transformations (10 operators). These are operators which may or may not be appropriate. In summary of this section, we have suggested Substitution is one of them. a formal and pictorial representation of the search "trees" for arbitrary operators. Our search lattice Let us call these sets andrespectively, S is the set of all possible nodes in search space. and define: In the search for a solution, we gradually extend the searched poset, which is a subset of S, until it has been proved that PI the set of all objects in P which are in the domain of some operator in Ql;

P2 the set of all objects in P-Pl which are in the domain of some operator in 6. Heuristics in the SAINT program

In sections 1-5, some aspects of heuristic Objects in PI have a solution just around the programs have been discussed. As an exercise in corner, and should of course be given top priority. the use of these concepts, we shall now give a description of Slagle's program SAINT. We wish For objects in P2, we know which operator should to demonstrate that, with the concepts that have be applied (it turns out that there is never more been introduced, the description can be more than one), so such objects are given higher priority abstract and involve less programming details than objects in P3. For objects in P3, several than before. operators may be applicable, so a heuristic search has to be performed. Problem environment. Each object p stands for an expression built The set P of objects consists of all formulas with functions. The "maximum depth" of this built from real numbers, variables, various expression is significant for the following reasons: arithmetic functions , and one functional: the (l) the members of PI (usually) have small maximum integration operator. The target set M consists depth; (2) operators often perform only a small of all objects which do not use the integration change (one or a few units) in the maximum depth operator. The initial set R consists of one single of their input. Under such conditions, it is object, which is given to the program on each reasonable to use the depth of an expression as a occasion of use. gross measure of its "distance" to the target set, and (therefore 7) to use it as a merit function. The set Q consists of 44 operators. All are perporators, except for one diporator, the formula With this background, the heuristic method used by SAINT can be outlined. addition, subtraction, multiplication, power function, logarithmic, trigonometric, and inverse trigonometric functions.

-206- Images. heuristic search need only span objects in P3 and The SAINT program uses images = feature operators in Q3. vectors with eleven components. Maximum depth of expression is one of them. Images are used for Like most heuristic programs, SAINT maintains three purposes: a bud list, i.e. a list of objects to which no operator has yet been applied. This list contains (a) selection of best bud (only maximum depth members of P3 ordered according to (the merit component used); order) > . (b) selection of appropriate operators for a given object in P3; Somewhat idealized, the cycle in the SAINT program runs as follows: (c) selection of parameters for ambiguous operators. (1) Take the first object on the bud list. (This is a maximum bud in P3). Handling of restricted domain. If a selected operator is not applicable to (2) Select suitable operators for this object. a selected object, SAINT just gives up. It does Apply them. The set or results is called P". not try to solve a sub-problem. (3) For each member of P", check if some member Object and operator selection. of Ql or Q2 is applicable. If so, apply it. If it was a member of Ql, terminate. Other Abstractly speaking, the SAINT program uses wise, include the result in P" and reperform an "object first, a few operators afterwards" (3) on it. selection system, where objects are selected with a best bud method based on a merit ordering. (k) Let P+ be the modified P" after all Ql or Q2 However, there are certain complications to this operators have been applied. By hypothesis, simple scheme. + + P is a subset of P3. Merge P into the bud list according to > . The following merit ordering is used: The cycling starts in step 3 with the bud iff p is a member of PI and is not, list empty, and with P" = the given, initial object ("the given integration problem"). or p is a member ofand is a member of P3, The occurrence of a diporator in the problem environment is a complication. To handle this, the or both p and p' are members of P3, but program maintains a "goal tree", which is p has less maximum depth than p' has. equivalent to the search poset of last section, but utilizes a slightly different notation. On In each step, SAINT selects some maximal bud discovery of a member of Ql (step (3) in the in the search tree according to this partial order, routine) SAINT does not actually terminate, but and applies suitable operators to it. The operators utilizes instead the "goal tree" to remove from are selected according to the following table: the bud list those buds that need no longer be transformed to the target set. In lattice terms, if SAINT has proved for a node t is the search poset that f|M_t, then it removes from the bud

list all nodes t1 such that t! b . Also, and for the same reason, such members of P" (P ) are thrown away. SAINT then continues the above cycle (starting in step (l)) as long as there is anything left on the bud list. In P2, only one operator is usually applicable; Remarks in P3, the object's image determines which operators shall be selected. Notice in particular that if This terminates our description of the SAINT object is in P2, then an operator from Q3 is never program. It is based on a rather short summary of selected, even if the object is in its domain. The the work on SAINT, { Slagle 1963a}, rather than the reason is that an object in P2 can be transformed full thesis. There may therefore be mistakes in one or more steps by operators in Q2, and then the details of our description. However, let us repeat desired operator in Q3 can be applied to the result. that the intention with this section was to This is sufficient (and is in fact a good pruning demonstrate how exactly the same material may be technique), since operators in Q2 only effect described in completely different terms when it is trivial modifications on the objects. to be used for another purpose.

Programming To facilitate comparison, let us finally give a short dictionary that translates between Slagle's Since only one operator is applied to objects terminology and ours: in PI and P2, these objects and operators can be given a separate and "algorithmic" treatment. The

-207- images, it is reasonable to take the image of an object as a crude estimate of its merit in the search towards the target, with small images having a higher merit. Therefore, operations which decrease the image can be expected to bring us closer to a solution. This gives us a preference for factoring, and for resolution when one input has image 1.

A trivial strategy would be to reduce the image to zero through successive factorings. However, we run into problems with the restricted T> The unit preference heuristics in resolution domains of the operators: factoring when the image of the input is 1 (i.e. the last step) is never The purpose of this section is the same as possible, and in all reasonable problems we would that of section 6, i.e. to demonstrate the fail far before that. usefulness of abstract heuristic concepts. In addition, we shall try to show that the so-called Resolution when the partner's image is 1 strategies used in resolution are in fact ("unit resolution") seems to be a better strategy, heuristic methods, and amenable to the same adn is what our heuristics prefers as first treatment as other such methods . Therefore, we choice. When it cannot be had, we perform other have selected to make a description of the unit resolutions or factoring a couple of steps, in the preference strategy for resolution. hope of achieving unit resolution later.

Problem environment Handling of restricted domain. Each object in the set P is a set of literals, If a desired operator is not applicable, the a literal being a symbolic expression unit preference method just gives up. (NOT (Ri )) or (% ) . The target set M has one member: the null set (i.e. Object and operator selection. the set of no literals). The initial set R consists of a relatively small number of objects Unit preference utilizes a best bud bundle and is given to the program on each occasion of method, where a suitable operator and its input(s) its use. are selected together. The system makes implicit use of a merit ordering > defined as follows on Notice that in this case, R is given as input I ^ 12 : to the program, and M is fixed. In the case of SAINT, ve had the opposite situation. numerical relationship merit ordering ( < means "less than") ( > means "better than") The set Q consists of a two-input conporator ("resolution") and a perporator ("factoring"). Both have a restricted domain, and both are ambiguous. The ambiguities are moderate: the number of alternatives is finite and so small that all can be tried.

Images Unit preference uses images for object- operator selection. The image of an object is an integer, viz. the number of literals in the object. Operators can be extended to images in the following manner: if the inputs to the resolution operator have images and k. , then the output (if it exists) has image . Similarly, if the input to the factoring operator has image then the output, if it exists, has j-1 as image**.

Discussion of heuristic method. Since the target object has image zero, and the operators effect a relatively small change on

Feigenbaum, in his IFIP 68 paper { Feigenbaum 1968a}, argues a similar standpoint. ** It may accidentally happen that the image of The relation > is extended to P w P in the the output is less than (but never greater than) obvious way. j+k-2 viz. . Such accidents are rare and do not affect the heuristics.

-208- In each cycle, unit preference uses > to determine applicability. This "semi-deterministic" select one maximal object or object-pair and property has otherwise been characteristic of applies the correct operator (factoring in the planning methods, notably planning GPS, and PLANNER. case of an object, resolution in the case of an As a result of some present work, we believe that object-pair) to it. In case of ambiguity, all semi-deterministic images have interesting theoretic alternatives are treated with the same priority. properties. If operator application in some alternative is successful, and the output has higher priority Pruning criteria. than the input (and therefore, higher priority than the other alternatives processed together The unit preference heuristics should only be with this one), then the higher priority is used in combination with various pruning criteria, honored immediately. such as:

ne Programming. (1) Restriction_on_.-Search Pth' T depth of an object is the number of resolutions that was Although our reference says little about the required to construct it. Objects of depth actual program that performs the unit preference >_ K( (where kQ is a fixed parameter) are heuristics, the following are some suggestions rejected; for such a program. (2) Set _of sjijyoort strategy.. A subset T of R is The program utilizes lists L , LQ, ... L., singled out as "essential initial objects", ... , where L. contains all generated objects and nodes p in the search poset which satisfy with image j, together with the following information for each object: p «= U(I~¥) (1) has factoring been attempted on this object? are given zero merit;

(2) if factoring is ambiguous, for which cases (3) Rejection:1 by Pattern.. Objects p which conform has it been attempted? to certain patterns (e.g. contain two literals of the form A viz. (NOT A) ) are rejected. (3) with what other objects has resolution been attempted? We have then made a distinction between heuristics (i.e. rules which govern the order in (U) if resolution is ambiguous, for which cases which the solution lattice is searched) and has it been attempted? pruning criteria (which are extreme cases of heuristic rules since they cut off some "branches" The answers to these questions can be altogether). In the resolution literature, both represented as follows: heuristics and pruning criteria are called strategies. (1) for each list L- where j ^ 2, a pointer indicate how far down the list factoring has Pruning criteria can formally be treated as 4 proceeded; further restriction on the domains of operators. The first two pruning criteria above can (2) for the pointed-at element of each list L-, (alternatively) be implemented by using images the attempted alternatives are listed. <.j,d,s> , defined as follows: (For all other alternatives of Lj, either none or all alternatives have been attempted); If p is a member of the initial set R , p's image is ^j,d,s> , where and similarly, for each object p. on each list j is the number of literals in p ; d is zero; (3) for each list L where k _< j , a pointer indicates how far down Lk resolution with s is the truth-value of p £ T p. has been attempted; If p was derived through resolution, and {h) for the pointed-at element of each list Lk, the images of the inputs were < Ji d1s1 and the attempted alternatives are listed. < J2 d2 » then the' image of p is , where With these conventions, programming is straight- forward. j = j + j - 2 (the number of literals in p)

d = max(d.,dp) + 1 Remark. The images used by the unit preference method have a noteworthy property: the image of the output of an operator is a function of the image(s) of the Finally, if p was derived through factoring, input(s), if the operator is applicable; but the and the image of the input was , then the image does not contain enough information to

-209- image of the output is Graph traverser {Doran 1966a}, {Doran 1967a}, {Michie 1967a} When the partial order > is extended to triples and pairs of such triples, the Heuristic automaton following items are considered as zeroes (i.e. {Doran 1968a} < all other items) and therefore rejected: Logic theory machine {Stefferud 1963a}, {Millstein ?a}

MULTIPLE {Slagle 1968a}

Planner {Hewitt 1967a} With these exceptions, the order > treats Planning GPS {Newell 1960c}, {Newell 1964a}

Notice that if we ignore the accidents PPS (Planning Problem Solver) mentioned in the footnote on page 28, both {Sandewall 1968b} operators are semi-deterministic on these extended images. SAINT {Slagle 1963a} Modification: the fewest-component preference heuristics. SIN {Moses 1967a) Slagle has proposed to streamline the unit preference heuristics into a fewest-component Unit preference heuristics preference method. The idea is to change the {Wos 1964a} definition of the merit order so that the special preference for pairs (l1j) is dropped. The details are: in the above definition of > , References drop rules (l) through (3), and use rules (4) through (6) even if some of i,j,k or m equals Comm Processes 1963 one. For the redefined > , we have e.g. Communication Processes Proceedings of a symposium held in Washington, 1963 Pergamon Press 196U

Doran 1966a 8. Conclusion J E Doran, D Michie Experiments with the graph traverser program We have defined a number of concepts which Proc R Soc. (A) p. are useful for the compact and abstract definition or heuristic methods. For illustration, these Doran 1967a concepts have been applied to two well-known J E Doran methods. Examples of their compactness can be An approach to automatic problem-solving found on pages 9 (Slagle's AND/OR tree pruning), in {Mach Int 1}, 11 (set of support strategy) and 12 (fewest- component preference heuristics). We have argued Doran 1968a that abstract descriptions of similar kind will J E Doran be useful as complements to conventional A form of goal selection and achievement in a descriptions of heuristic programs and methods. heuristic automaton Univ of Edinburgh, Dept. of Machine Intelligence and Perception, memo MIP-R-34 Index of heuristic methods and programs. Feigenbaum 1963a Arrow method E F Feigenbaum, J Feldman {Hart 1967a}, {Nilsson 1968a} Computers and thought McGrat-Hill, 1963 DEDUCOM {Slagle 1965a} Hart 1967a P E Hart, N J Nilsson, B Raphael Fewest-component preference heuristic A formal basis for the heuristic determination {Slagle 1965b} of minimum cost paths Stanford Research Institute, Menlo Park, Cal., GPS (General Problem Solver) June 1967 {Newell 1960c}, {Newell 196la}

-210- Hewitt 1967a Samuel 1967a C Hewitt A L Samuel Planner, a language for proving theorems Some studies in machine learning using the game MIT/MAC A I Group, memo 137, July 1967 of checkers II - Recent progress Stanford A.I. Project, memo 52 1F1P 1959 Proceedings of an international conference on Sandewall 1968b information processing E J Sandewall UNESCO House, Paris 1959 A planning problem solver based on look-ahead in stochastic game-trees Mach Int 1 Uppsala univ. Comp. Sci. Dept., Report 13, N L Collins, D Michie, eds. April 1968 Machine intelligence 1 Oliver & Boyd, London 1967 Slagle 1963a J R Slagle Mach Int 2 A heuristic program that solves symbolic E Dale, D Michie, eds. integration problems in freshman calculus Machine intelligence 2 in Comm ACM _b, Oct. 1963 _or {Feigenbaum 1963a} Oliver & Boyd, London 1968 Slagle 1965a Michie 1967a J R Slagle D Michie Experiments with a deductive question-answering Strategy-building with the graph traverser program in {Mach Int 1} i\n Coram ACM 8_, Pec. 1965, p. 792

Millstein QA Slagle 1965b Millstein R J R Slagle The logic theorist in LISP A proposed preference strategy using sufficiency- Lawrence Rad Labs, memo UCRL-70037 resolution for answering questions Lawrence Rad Labs, memo UCRL-14361e 1968a Moses 1967a J Moses Symbolic integration J R Slagle, P H Bursky MIT Math Dept, Ph D Thesis, September 1967 Experiments with a multi-purpose, theorem- proving heuristic program Newell 1960c in Journ ACM I_5, Jan. 1968 A Newell et til Report on a general problem solving program Stefferud 1963a for a computer information processing system E Stefferud _in {IFir 3 959), p. 256 The logic theory machine - a model heuristic program Newell 1961a Rand, memo RM-3731-CC, June 1963 A Newell, H Simon GPS, a program that simulates human thought Wos 1964a in IFeigenbaum 19b3a) L Wos, I) Carson, G Robinson The unit preference strategy in theorem proving Newell 1961+a in Proc. FJCC, 1964 p. 6l5 A Newell The possibility of planning languages in man- computer communication in tComin Processes 1963)

Nilsson 1968a N J Nilsson Searching problem-solving and game-playing trees for minimal cost solutions in. {IFIP 1968}

Rutherford 1965a D E Rutherford Introduction to lattice theory Oliver & Boyd, 1965

Samuel 1959a A L Samuel Some studies in machine learning using the game of checkers in {Feigenbaum 1963a}

-216-