Unification: A Multidisciplinary Survey

KEVIN KNIGHT Department, Carnegie-Mellon University, Pittsburgh, Pennsylvania 15213-3890

The unification problem and several variants are presented. Various and data structures are discussed. Research on unification arising in several areas of computer science is surveyed, these areas include proving, programming, and natural language processing. Sections of the paper include examples that highlight particular uses of unification and the special problems encountered. Other topics covered are , higher order logic, the occur check, infinite terms, feature structures, equational theories, inheritance, parallel algorithms, generalization, lattices, and other applications of unification. The paper is intended for readers with a general computer science background-no specific knowledge of any of the above topics is assumed.

Categories and Subject Descriptors: E.l [Data Structures]: Graphs; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems- Computations on discrete structures, Pattern matching; Ll.3 [Algebraic Manipulation]: Languages and Systems; 1.2.3 [Artificial Intelligence]: Deduction and Theorem Proving; 1.2.7 [Artificial Intelligence]: Natural Language Processing General Terms: Algorithms Additional Key Words and Phrases: Artificial intelligence, computational complexity, equational theories, feature structures, generalization, higher order logic, inheritance, lattices, logic programming, natural language processing, occur check, parallel algorithms, Prolog, resolution, theorem proving, type inference, unification

INTRODUCTION lem, Unification and Computational Com- plexity, and Unification: Data Structures The unification problem has been studied and Algorithms introduce unification. Def- in a number of fields of computer science, initions are made, basic research is re- including theorem proving, logic program- viewed, and two unification algorithms, ming, natural language processing, compu- along with the data structures they require, tational complexity, and computability are presented. Next, there are four sections theory. Often, researchers in a particular on applications of unification. These appli- have been unaware of work outside cations are theorem proving, logic pro- their specialty. As a result, the problem has gramming, higher order logic, and natural been formulated and studied in a variety of language processing. The section Unifica- ways, and there is a need for a general tion and Feature Structures is placed before presentation. In this paper, I will elucidate the section on natural language processing the relationships among the various con- and is required for understanding that sec- ceptions of unification. tion. Following are four sections covering The sections are organized as follows. various topics of interest. Unification and The three sections The Unification Prob- Equational Theories presents abstract

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. 0 1989 ACM 0360-0300/89/0300-0093 $1.50

ACM Computing Surveys, Vol. 21, No. 1, March 1989 94 l Kevin Knight

CONTENTS conclusion gives an abstract of some gen- eral properties of unification and projects research trends into the future. Dividing this paper into sections was a INTRODUCTION difficult task. Some research involving 1. THE UNIFICATION PROBLEM unification straddles two fields. Take, for 2. UNIFICATION AND COMPUTATIONAL example, higher order logic and theorem COMPLEXITY 3. UNIFICATION: DATA STRUCTURES AND proving-should this work be classified as ALGORITHMS Unification and Higher Order Logic or 4. UNIFICATION AND THEOREM PROVING Unification and Theorem Proving? Or 4.1 The Resolution Rule consider ideas from logic programming put 4.2 Research to work in natural language processing-is 5. UNIFICATION AND LOGIC PROGRAMMING it Unification and Logic Programming or 5.1 Example of Unification in Prolog Unification and Natural Language Pro- 5.2 Research cessing? Likewise, some researchers work 6. UNIFICATION AND HIGHER ORDER in multiple fields, making their work hard LOGIC 6.1 Example of Second-Order Unification to classify. Finally, it is sometimes difficult 6.2 Research to define the boundary between two disci- 7. UNIFICATION AND FEATURE plines; for example, logic programming STRUCTURES grew out of classical theorem proving, and 8. UNIFICATION AND NATURAL LANGUAGE the origin of a particular contribution to PROCESSING 8.1 Parsing with a Unification-Based Grammar unification is not always clear. I have pro- 8.2 Research vided a balanced classification, giving cross 9. UNIFICATION AND EQUATIONAL references when useful. THEORIES Finally, the literature on unification is 9.1 Unification as Solving enormous. I present all of the major results, 9.2 Research 10. PARALLEL ALGORITHMS FOR if not the methods used to reach them. UNIFICATION 11. UNIFICATION, GENERALIZATION, AND LATTICES 1. THE UNIFICATION PROBLEM 12. OTHER APPLICATIONS OF UNIFICATION Abstractly, the unification problem is the 12.1 Type Inference following: Given two descriptions x and y, 12.2 Programming Languages can we find an object z that fits both 12.3 Machine Learning descriptions? 13. CONCLUSION 13.1 Some Properties of Unification The unification problem is most often 13.2 Trends in Unification Research stated in the following context: Given two ACKNOWLEDGMENTS terms of logic built up from function sym- REFERENCES bols, variables, and constants, is there a of terms for variables that will make the two terms identical? As an ex- ample consider the two terms f(x, y) and work on unification as equation solving. f(g(y, a), h(a)). They are said to be unifi- Parallel Algorithms for Unification reviews able, since replacing x by g(h(a), a) and y recent literature on unification and paral- by h(u) will make both terms look like lelism. Unification, Generalization, and f(g(h(a), a), h(u)). The nature of such Lattices discusses unification and its dual unifying substitutions, as well as the means operation, generalization, in another ab- of computing them, makes up the study of stract setting. Finally, in Other Applica- unification. tions of Unification, research that connects In order to ensure the , con- unification to research in abstract data ciseness, and independence of the various types, programming languages, and ma- sections of this paper, I introduce a few chine learning is briefly surveyed. The formal definitions:

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey l 95

Definition 1.1 s and t, and u(s) is called a unification of s A symbol is one of (x, y, z, . . .). and t. A constant symbol is one of (a, b, c, . . .). A function symbol is one of ( f, g, h, . . .I. Definition 1.5 A unifier u of terms s and t is called a most Definition 1.2 general unifier (MGU) of s and t if for any other unifier 0, there is a substitution 7 A is either a variable symbol, a con- such that 7u = 8. Consider, for example, the stant symbol, or a function symbol followed two terms f(x) and fig(y)). The MGU is by a series of terms, separated by commas, enclosed in parentheses. Sample terms are (X t g(y)], but there are many non-MGU unifiers such as (x t g(a), y t a), Intui- a, x7fix, Y), fi&, Y), hiz)). Terms are tively, the MGU of two terms is the sim- denoted by the symbols (s, t, u, . . .).’ plest of all their unifiers. Definition 1.3 Definition 1.6 A substitution is first a function from vari- Two terms s and tare said to match if there ables into terms. Substitutions are denoted is a substitution u such that u(s) = t. by the symbols {a, 7, 0, . . .). A substitution Matching is an important variant of u in which a(x) is g(h(a), a) and a(y) is unification. h(a) can be written as a set of bindings enclosed in curly braces; that is, (X t g(h(a), a), y t h(a)]. This set of bindings Definition 1.7 is usually finite. A substitution is also a Two terms s and t are infinitely unifiable if function from terms to terms via its appli- there is a substitution, possibly containing cation. The application of substitution u infinitely long terms, that is a unifier of s to term t is also written u(t) and denotes and t. For example, x and f(x) are not the term in which each variable xi in t is unifiable, but they are infinitely unifi- replaced by a(s). able under the substitution u = (X t Substitutions can be composed: a0 (t) de- fififi . . . )))I, since 4x1 and dfix)) are notes the term t after the application of the both equal to f (f (f ( . . . ))). An infinitely substitutions of 19followed by the applica- long term is essentially one whose string tion of the substitutions of u. The substi- representation (to be discussed in the next tution a0 is a new substitution built from section) requires an infinite number of old substitutions u and (3 by (1) modifying symbols-for a formal discussion, see 0 by applying u to its terms, then (2) adding Courcelle [ 19831. variable-term pairs in u not found in 8. These general definitions will be used Composition of substitutions is associative, often throughout the paper. Concepts spe- that is, (&3)7(s) = a(&)(s) but is not in cific to particular areas of study will be general commutative, that is, ad(s) # &is). defined in their respective sections.

Definition 1.4 2. UNIFICATION AND COMPUTATIONAL Two terms s and t are unifiable if there COMPLEXITY exists a substitution u such that u(s) = In his 1930 thesis, Herbrand [1971] pre- u(t). In such a case, u is called a unifier of sented a nondeterministic to compute a unifier of two terms. This work was motivated by Herbrand’s interest in ’ I will often use the word “term” to describe the equation solving. The modern utility and mathematical object just defined. Do not confuse these terms with terms of predicate logic, which include notation of unification, however, originated things like V(n)V(y):(f(x) + f(y)) c, (if(y) - with Robinson. In his pioneering paper Y(x)). Robinson [1965] introduced a method of

ACM Computing Surveys, Vol. 21, No. 1, March 1989 96 l Kevin Knight theorem proving based on resolution, a algorithm in his thesis [Baxter 19761, in powerful inference rule. Central to the meth- which he also discussed restricted and od was unification of first-order terms. higher order versions of unification. Robinson proved that two first-order terms, In 1976, Paterson and Wegman [1976, if unifiable, have a unique most general 19781 gave a truly linear algorithm for uni- unifier. He gave an algorithm for comput- fication. Their method depends on a careful ing the MGU and proved it correct. propagation of the equivalence-class rela- Guard [ 19641 independently studied the tion of Huet’s algorithm. The papers of unification problem under the name of Paterson and Wegman are rather brief; matching. Five years later, Reynolds [ 19701 de Champeaux [1986] helps to clarify the discussed first-order terms using lattice issues involved. theory and showed that there is also a Martelli and Montanari [1976] inde- unique “most specific generalization” of pendently discovered another linear al- any two terms. See Section 11 for more gorithm for unification. They further details. improved efficiency [Martelli and Monta- Robinson’s original algorithm was inef- nari 19771 by updating Boyer and Moore’s ficient, requiring exponential time and structure-sharing approach. In 1982, they space. A great deal of effort has gone into gave a thorough description of an efficient improving the efficiency of unification. The unification algorithm [Martelli and Mon- remainder of this section will review that tanari 19821. This last algorithm is no effort. The next section will discuss basic longer truly linear but runs in time algorithmic and representational issues in O(n + m log m), where m is the number of detail. distinct variables in the terms. The paper Robinson himself began the research on includes a practical comparison of this al- more efficient unification. He wrote a note gorithm with those of Huet and Paterson [Robinson 19711 about unification in which and Wegman. Martelli and Montanari also he argued that a more concise representa- cite a study by Trum and Winterstein tion for terms was needed. His formulation [ 19781, who implemented several unifica- greatly improved the space efficiency of tion algorithms in Pascal in order to com- unification. Boyer and Moore [1972] gave pare actual running times. a unification algorithm that shares struc- Kapur et al. [ 19821 reported a new linear ture; it was also space efficient but was still algorithm for unification. They also related exponential in time complexity. the unification problem to the connected In 1975, Venturini-Zilli [ 19751 intro- components problem (graph theory) and duced a marking scheme that reduced the the online/offline equivalence problems complexity of Robinson’s algorithm to (automata theory). Unfortunately, their quadratic time. proof contained an error, and the running Huet’s [1976] work on higher order uni- time is actually nonlinear (P. Narendran, fication (see also Section 6) led to an personal communication, 1988). improved time bound. His algorithm is Corbin and Bidoit [ 19831 “rehabilitated” based on maintaining equivalence classes Robinson’s original unification algorithm of subterms and runs in O(na(n)) time, by using new data structures. They reduced where a(n) is an extremely slow-growing the exponential time complexity of the al- function. We call this an almost linear gorithm to O(n’) and claimed that the al- algorithm. Robinson also discovered this gorithm is simpler than Martelli and algorithm, but it was not published in Montanari’s and superior in practice. accessible form until Vitter and Simons’ The problem of unifying with infinite [1984, 19861 study of parallel unification. terms has received some attention. (Recall The algorithm is a variant of the algorithm the definition of infinitely unifiable.) In his used to test the equivalence of two finite thesis, Huet [1976] showed that in the case automata. Ait-Kaci [1984] came up with a of infinite unification, there still exists a similar, more general algorithm in his dis- single MGU-he gave an almost-linear sertation. Baxter [ 19731 also discovered an algorithm for computing it. Colmerauer almost-linear algorithm. He presented the [ 1982b] gave two unification algorithms for

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey l 97

function UNIFY(t1, t2) + (unifiable: Boolean, Q: substitution) begin if tl or t2 is a variable then begin let x be the variable, and let t be the other term if n = t, then (unifiable, a) c (true, 0) else if occur(x, t) then unifiable + false else(unifiable, o) +-- (true, {x + t)) end else begin assume tl = f (xl, . . , x”) and t2 = g(y,, . . . , ym) if f # g or m # n then unifiable t false else begin kc0 unifiable c true o c nil while k < m and unifiable do begin ktk+l (unifiable, 7) - UNIFY(o(xk), u(yk)) if unifiable then CTt compose(7, u) end end end return (unifiable, a) end

Figure 1. A version of Robinson’s original unification algorithm.

infinite terms, one “theoretical” and one and Bidoit [1983].) The function UNIFY “practical”: The practical version is faster takes two terms (tl and t2) as arguments but may not terminate. Mukai [1983] gave and returns two items: (1) the Boolean- a new variation of Colmerauer’s practical valued unifiable, which is true if and only algorithm and supplied a termination if the two terms unify and (2) u, the unify- proof. Jaffar [ 19841 presented another ing substitution. The algorithm proceeds algorithm, designed along the lines of from left to right, making substitutions Martelli and Montanari. Jaffar claimed when necessary. The routines “occur” and that his algorithm is simple and practical, “compose” need some explanation. The even for finite terms; he discussed the rel- “occur” function reports true if the first ative merits of several unification algo- argument (a variable) occurs anywhere in rithms, including Colmerauer [ 1982b], the second argument (a term). This func- Corbin and Bidoit [ 19831, Martelli and tion call is known as the occur check and is Montanari [1977], Mukai [1983], and Pa- discussed in more detail in Section 5. The terson and Wegman [1976]. None of these “compose” operator composes two substi- algorithms is truly linear: The theoretical tutions according to the definition given in question of detecting infinite unifiability in the Introduction of this paper. linear time was left open by Paterson and Robinson’s original algorithm was expo- Wegman and has yet to be solved. nential in time and space complexity. The major problem is one of representation. Efficient algorithms often turn on special 3. UNIFICATION: DATA STRUCTURES AND data structures; such is the case with uni- ALGORITHMS fication. This paper will now discuss the As mentioned in the previous section, various data structures that have been Robinson [ 19651 published the first modern proposed. unification algorithm. A version of that First-order terms have one obvious rep- algorithm appears in Figure 1. (This ver- resentation, namely, the of sym- sion is borrowed in large part from Corbin bols we have been using to write them

ACM Computing Surveys, Vol. 21. No. 1, March 1989 98 l Kevin Knight

k/l g h

; : 1 1 n x

Figure 2. Tree representation of the term f(g(f(x)), h(f(x))). down. (Recall the inductive definition given The subterm f(x) is shared; the work to at the beginning of this paper.) In other unify it with another subterm need be done words, a term can be represented as a linear only once. If f(x) were a much larger struc- array whose elements are taken from func- ture, the duplication of effort would be tion symbols, variables, constants, commas, more serious of course. In fact, if subterms and parentheses. We call this the string are not shared, it may be necessary to gen- representation of a term. (As an aside, it is erate exponentially large structures during a linguistic fact that commas and parenthe- unification (see Corbin and Bidoit [1983], ses may be eliminated from a term without de Champeaux [1986], and Huet [1976]). introducing any ambiguity.) Robinson’s The algorithms of Huet [1976], Baxter [ 19651 unification algorithm (Figure 1) uses [1973, 19761, and Jaffar [ 19841 all use the this representation. graph representation. Paterson and Weg- The string representation is equivalent man’s linear algorithm [1976, 19781 uses a to the tree representation, in which a func- graph representation modified to include tion symbol stands for the root of a subtree parent pointers so that leaves contain in- whose children are represented by that formation about their parents, grandpar- function’s arguments. Variables and con- ents, and so on. The Paterson-Wegman stants end up as the leaves of such a tree. algorithm is still linear for terms repre- Figure 2 shows the tree representation of sented as strings, since terms can be con- the term fMf(x)), W(x))). verted from string representation to graph The string and tree representations are representation (and vice versa) in linear acceptable methods for writing down terms, time [de Champeaux 19861. Corbin and and they find applications in areas in which Bidoit [1983] also use a graph representa- the terms are not very complicated, for tion that includes parent pointers. The al- example, Tomita and Knight [ 19881. They gorithms of Martelli and Montanari [ 19861 have drawbacks, however, when used in and of Kapur et al. [ 19821 both deal directly general unification algorithms. The prob- with sets of produced by a unifi- lem concerns structure sharing. cation problem. Their representations also Consider the terms f(g(f(x)), h(f(x))) share structure. andf(g(f(a)), h(f(a))). Any unification al- What I have called the graph represen- gorithm will ensure that the function sym- tation is known in combinatorics as a bols match and ensure that corresponding directed acyclic graph, or dag, in which all arguments to the functions are unifiable. vertices of the graph are labeled. If we allow In our case, after processing the subterms our graphs to contain cycles, we can model f(x) and f(a), the substitution {x t al will infinitely long terms, and unification may be made. There is, however, no need to product substitutions involving such terms. process the second occurrences of f(x) and This matter is discussed in connection with f(a), since we will just be doing the same the occur check in Section 5. work over again. What is needed is a Before closing this section, I will present more concise representation for the terms: a unification algorithm more efficient than a graph representation. Figure 3 shows the one in Figure 1. I have chosen a variant graphs for this pair of terms. of Huet’s algorithm, since it is rather simple

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey l 99

f f 1(1 1(1 g hg h Ir( 1J f i 1 x a

Figure 3. Term graphs for f(g(f (x)), h(f (x))) and f(g(f (a)), h(f (a))). function UNIFY(t1, t2) =+ (unifiable: Boolean, (T: substitution) begin pairs-to-unify + {(tl, t2)) for each node z in tl and t2, z.class + * while pairs-to-unify # 0 do begin (31,y) -pop (pairs-to-unify) u + FIND(r) u + FIND(y) ifu#uthen if u and u are not variables, and u.symbol f v.symbol or numberof(u.subnodes) # numberof(u.subnodes) then return (false, nil) else begin w c UNION(u, u) if w = u and u is a variable then u.class + v if neither u nor u is a variable then begin let(u,,..., un) be u.subnodes let(u,,..., vn) be vsubnodes foricltondo push ((IA:, ui), pairs-to-unify) end end end Form a new graph composed of the root nodes of the equivalence classes. This graph is the result of the unification. If the graph has a cycle, return (false, nil), but the terms are infinitely unifiable. If the graph is acyclic, return (true, CT),where c is a substitution in which any variable x is mapped on to the root of its equivalence class, that is, FIND(r). end

Figure 4. A version of Huet’s unification algorithm. and uses the graph representation of terms. Huet’s algorithm maintains a set of The algorithm is shown in Figure 4. equivalence classes of nodes using the Terms are represented as graphs whose FIND and UNION operations for merging nodes have the following structure: disjoint sets (for an explanation of these operations, see Aho et al. [1974]). Each type = node node starts off in its own class, but as the symbol: a function, variable, or constant symbol unification proceeds, classes are merged subnodes: a list of nodes that are children of together. If nodes with different function this node symbols are merged, failure occurs. At the class: a node that represents this node’s end, the equivalence classes over the nodes equivalence class in the two-term graphs form a new graph, end namely, the result of the unification. If this

ACM Computing Surveys, Vol. 21, No. 1, March 1989 100 l Kevin Knight graph contains a cycle, the terms are infi- The first job is to put these three state- nitely unifiable; if the graph is acyclic, the ments into logical form: terms are unifiable. Omitting the acyclicity test corresponds to omitting the occur (la) Vz : student(z) + [?akesadvice(z, check discussed in Section 5. advisor(z)) + angry(advisor(z))] The running time of Huet’s algorithm is (2a) Vx, y: angry(x) + ltakesadvice(x, y) O(na(n)), where n is the number of nodes (3a) VW: student(w) + [angry(w) -+ and al(n) is an extremely slow-growing angry(advisor(w))] function. a(n) never exceeds 5 in practice, so the algorithm is said to be almost linear. Next, we drop the universal quantifiers Note also that it is nonrecursive, which may and remove the implication symbols (z + increase efficiency. For more details, see y is equivalent to lx V y): Huet [1976] and Vitter and Simons [1984, 19861. A version of this algorithm, which (lb) lstudent(z) V takesadvice(z, includes inheritance-based information advisor(z)) V angry(advisor(z)) (see Section 7), can be found in Ait-Kaci (2b) langry(x) V ltakesaduice(x, y) [ 19841 and Ait-Kaci and Nasr [ 19861. (3b) lstudent(w) V langry(w) V angry(advisor(w))

4. UNIFICATION AND THEOREM PROVING (lb), (2b), and (3b) are said to be in clausal form. Resolution is a rule of inference that Robinson’s [ 19651 work on resolution theo- will allow us to conclude the last clause rem proving served to introduce the uni- from the first two. Here it is in its simplest fication problem into computer science. form: Automatic theorem proving (a.k.a. com- putational logic or automated deduction) became an area of concentrated research, The Resolution Rule and unification suddenly became a topic of interest. In this section, I will only discuss If clause A contains some term s and clause first-order theorem proving, although there B contains the negation of some term t and are theorem-proving systems for higher or- if s and t are unifiable by a substitution c, der logic. Higher order unification will be then a resolvent of A and B is generated by discussed in a later section. (1) combining the clauses from A and B, (2) removing terms s and t, and (3) applying the substitution u to the remaining terms. 4.1 The Resolution Rule If clauses A and B have a resolvent C, then C may be inferred from A and B. I will briefly demonstrate, using an exam- ple, how unification is used in resolution In our example, let s be takesadvice(z, theorem proving. In simplest terms, reso- advisor(z)) and let t be takesadvice(x, y). lution is a rule of inference that allows one (lb) contains s, and (2b) contains the to conclude from “A or B” and “not-A or negation of t. Terms s and t are unifi- C” that “B or C.” In real theorem proving, able under the substitution {x t z, y t resolution is more complex. For example, advisor(z)]. Removing s from (lb) and the from the two facts negation of t from (2b) and applying the substitution to the rest of the terms in (lb) (1) Advisors get angry when students don’t and (2b), we get the resolvent: take their advice. (2) If someone is angry, then he doesn’t take (4) lstudent(z) V angry(advisor(z)) V advice. ~angry(z) We want to infer the third fact: (4) is the same as (3b), subject to disjunct reordering and renaming of the (3) If a student is angry, then so is his unbound variable, and therefore resolution advisor. has made the inference we intended.

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey l 101

Resolution is a powerful inference rule- query. (One example is taken from Clocksin so powerful that it is the only rule needed and Mellish [ 19811.) for a sound and complete system of logic. The simple example here is intended only likes(mary, food). likes(mary, wine). to illustrate the use of unification in the likes(john, wine). resolution method; for more details about likes(john, mary). resolution, see Robinson’s paper [ 19651 and Chapter 4 of Nilsson’s book [1980]. ?-likes(mary,X),likes(john,X). The query at the end asks, Does Mary 4.2 Research like something that John likes? Pro- log takes the first term of the query Much of the theorem-proving community likes (mary, X) and tries to unify it with has recently moved into the field of logic some assertion in the database. It succeeds programming, while a few others have in unifying the terms likes(mary, X) and moved into higher order theorem proving. likes (mar-y, food) by generating the sub- I will present the work of these two groups stitution (X t food ). Prolog applies this in different sections: Unification and substitution to every term in the query. Higher Order Logic and Unification and Prolog then moves on to the second term, Logic Programming. which is now likes(john, food). This term fails to unify with any other term in the 5. UNIFICATION AND LOGIC database. PROGRAMMING Upon failure, Prolog backtracks. That is, it “undoes” a previous unification, in this The idea of “programming in logic” came case, the unification of likes (mar-y, X) with directly out of Robinson’s work: The origi- likes (mar-y, food). It attempts to unify nal (and still most popular) logic program- the first query term with another term in ming language, Prolog, was at first a tightly the database: the only other choice is constrained resolution theorem prover. likes(mary, wine). The terms unify under Colmerauer and Roussel turned it into a the substitution (X t wine], which is useful language [Battani and Meloni 1973; also applied to the second query term Colmerauer et al. 1973; Roussel 19751, and likes( john, X) to give likes( john, wine). van Emden and Kowaiski [1976] provided This term can now unify with a term in an elegant theoretical model based on Horn the database, namely, the third assertion, clauses. Warren et al. [1977] presented an likes(john, wine). Done with all query accessible early description of Prolog. terms, Prolog outputs the substitutions it Through its use of resolution, Prolog in- has found; in this case, “X = wine.” herited unification as a central operation. I used such a simple example only to be A great deal of research in logic program- clear. The example shows how Prolog ming focuses on efficient implementation, makes extensive use of unification as a and unification has therefore received some pattern-matching facility to retrieve rele- attention. I will first give an example of vant facts from a database. Unification it- how unification is used in Prolog; then I self becomes nontrivial when the terms are will review the literature on unification and more complicated. Prolog, finishing with a discussion of the logic programming languages Concurrent Prolog, LOGIN, and CIL. 5.2 Research Colmerauer’s original implementation of unification differed from Robinson’s in one 5.1 Example of Unification in Prolog important respect: Colmerauer deliberately I turn now to unification as it is used in left out the “occur check,” allowing Prolog Prolog. Consider a set of four Prolog asser- to attempt unification of a variable with tions (stored in a “database”) followed by a a term already containing that variable.

ACM Computing Surveys, Vol. 21, NO. 1, March 1989 102 l Kevin Knight foo(X, X). ‘?- fdf(Y), Y). y = f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f( f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f( . . .

Figure 5. Infinite unification in Prolog.

Version 1. less(X, s(X)). /* any number is less than its successor */ ?- less(s( Y), Y). /* is there a Y whose successor is less than Y? */ Y=s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s( s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s~s~s~s~s~s~s~s~s~s~~~s~s~s~s~s~s~s~s~s~s~s~s~s~s~s~s~s~s~s~s~s~s~ s(s(s(s(s(s . . . Version 2. less(X, s(X)). foe :- less(s( Y), Y). /* we don’t care about the value of Y */ ?- foe. /* is there a number whose successor is less than it? */ yes. /* there is?! ‘/

Figure 6. Prolog draws the wrong conclusion because it lacks an occur check.

For example, Prolog will unify the that lack the occur check. He examined terms xand f(x), using the substitution the termination problem and produced a {x t f(f(f(. . .)))I; Robinson’s occur unification algorithm for infinite terms. check, on .the other hand, would instantly Plaisted [ 19841 presented a preprocessor report nonunifiability. for Prolog programs that detects poten- Discarding the occur check corresponds tial problems due to Prolog’s lack of an to moving from unification to infinite uni- occur check. See the end of Section 2 for fication. Figure 5 shows Prolog’s infinite a discussion of algorithms for infinite unification in action. In unifying f(Y) unification. and Y, Prolog comes up with an infinite As noted above, Prolog is a search engine, substitution. and when a particular search path fails, Omitting the occur check results in effi- backtracking is necessary. As a result, some ciency gains that outweigh the risk in- of the unifications performed along the way curred. For example, the concatentation of must be “undone.” Studies of efficient two lists, a linear-time operation without backtracking can be found in Bruynooghe the occur check, becomes an O(n2) time and Pereira [ 19841, Chen et al. [ 19861, Cox operation with the occur check [Colmer- [ 19841, Matwin and Pietrzykowski [ 19821, auer 1982b]. For this reason, most current and Pietrzykowski and Matwin [ 19821. Prolog interpreters omit the occur check. Mannila and Ukkonen [1986] viewed But since its unification is really infinite Prolog execution as a sequence of unifi- unification, in this respect, Prolog can be cation and deunifications, relating it to said to be “unsound.” Figure 6 (adapted the UNION/FIND (disjoint set union) from Plaisted [ 19841) shows a case in which problem. They showed how standard (fast) Prolog draws an unexpected conclusion, unification algorithms can have poor one that would be blocked by an occur asymptotic performance when we consider check. deunifications; algorithms with better per- Colmerauer [1982b, 19831 studied infi- formance are given. nite terms in an effort to provide a theoret- Concurrent Prolog [Shapiro 19831 ex- ical model for real Prolog implementations tends Prolog to concurrent programming

ACM Computing Surveys, Vol. 21. No. 1, March 1989 Unification: A Multidisciplinary Survey l 103 and parallel execution. In this model, uni- ematical vein, here is a statement of the fication may have one of three possible induction property of natural numbers: outcomes: succeed, fail, or suspend. Unifi- cation suspends when a special “read-only V(f) : [(f (0) A V(x)[f (x) ---, f(x + Ul) variable” is encountered-execution of the + V(x)f (x)1 unification call is restarted when that vari- able takes on some value. It is not possible That is, for any predicate f, if f holds for 0 to rely on backtracking in the Concurrent and if f (x) implies f (x + l), then f holds for Prolog model because all computations run all natural numbers. independently and in parallel. Thus, unifi- For a theorem prover to deal with such cations cannot simply be “undone” as in statements, it is natural to work with ax- Prolog, and local copies of structures must ioms and inference rules of a higher order be maintained at each unification. Strate- logic. In such a case, unification of higher gies for minimizing these copies are dis- order terms has great utility. cussed in Levy [ 19831. Unification in higher order logic requires Alt-Kaci and Nasr [1986] present a some special notation for writing down new logic programming language called higher order terms. The typed X-calculus LOGIN, which integrates inheritance- [Church 1940; Henkin 19501 is one com- based reasoning directly into the uni- monly used method. I will not introduce fication process. The language is based on any formal definitions, but here is an ex- ideas from A’it-Kaci’s dissertation [ 19841, ample: “A (u, v)(u)” stands for a function of described in more detail in Section 7. two arguments, u and v, whose value is Mukai [1985a, 1985b] and Mukai and always equal to the first argument, u. Uni- Yasukawa [1985] introduce a variant of fication of higher order terms involves h- Prolog called CIL aimed at natural lan- conversion in conjunction with application guage applications, particularly those in- of substitutions. volving situation semantics [Barwise and Finally, we must distinguish between Perry 19831. CIL is based on an extended functibn constants (denoted A, B, C, . . . ) unification that uses conditional state- and function variables (denoted f, g, h, . . . ), ments, roles, and types and contains many which range over those constants. Now we of the ideas also present in LOGIN. Hasida can look at unification in the higher order [1986] presents another variant of unifica- realm. tion that handles conditions on patterns to be unified. 6.1 Example of Second-Order Unification Consider the two terms f (x, b) and A(y). 6. UNIFICATION AND HIGHER ORDER Looking for a unifying substitution for vari- LOGIC ables f, x, and y, we find First-order logic is sometimes too limited f + X(u, v)(u) or too unwieldy for a given problem. Second-order logic, which allows variables x+A(y) to range over functions (and predicates) as well as constants, can be more useful. Con- Y+Y sider the statement “Cats have the same This substitution produces the unifica- annoying properties as dogs.” We can ex- tion A(y). But if we keep looking, we find press this in second-order logic as another unifier: V(x)~(y)V(f) : f +- Nu, v)A(g(u, v)) . [cat(x) A dog(y) A annoying(f)] X+X --, [f(x) * f(Y)1 Y - dx, b) Note that the variable f ranges over pred- In this case, the unification is A (g(x, b)). icates, not constant values. In a more math- Notice that neither of the two unifiers is

ACM Computing Surveys, Vol. 21, No. 1, March 1989 104 l Kevin Knight more general than the other! A pair of Huet [ 19721 in his dissertation presented higher order terms, then, may have more a new refutational system for higher order than one “most” general unifier. Clearly, logic based on a simple unification algo- the situation is more complex than that of rithm for higher order terms. A year later, first-order unification. part of that dissertation was published as Huet [1973b]. It showed that the unifica- 6.2 Research tion problem for third-order logic was un- decidable; that is, there exists no effective Gould [1966a, 1966b] was the first to inves- procedure to determine whether two third- tigate higher order unification in his disser- (or higher) order terms are unifiable. tation. He showed that some pairs of Lucchesi [ 19721 independently made this higher order terms possess many general same discovery. unifiers, as opposed to the first-order case, Baxter [1978] extended Huet’s and where terms always (if unifiable) have a Lucchesi’s undecidability results to third- unique MGU. Gould gave an algorithm to order dyadic unification. Three years later, compute general unifiers, but his algorithm Goldfarb [1981] demonstrated the unde- did not return a complete set of such uni- cidability of the unification problem for fiers. He conjectured that the unifiability second-order terms by a reduction of problem for higher order logic was solvable. Hilbert’s Tenth Problem to it. Robinson [1968], urged that research in Huet’s refutational system is also de- automatic theorem proving be moved into scribed in Huet [1973a], and his higher the higher order realm. Robinson was non- order unification algorithm can be found in plussed by Godel’s incompleteness results Huet [1975]. Since w-order unification is in and maintained that Henkin’s [ 19501 inter- general undecidable, Huet’s algorithm may pretation of logical validity would provide not halt if the terms are not unifiable. The the completeness needed for a mechaniza- algorithm first searches for the existence of tion of higher order logic. Robinson [1969, a unifier, and if one is shown to exist, a 19701 followed up these ideas. His theorem very general preunifier is returned. provers, however, were not completely Huet’s system was based on the typed mechanized; they worked in interactive X-calculus introduced by Church [1940]. mode, requiring a human to guide the The X-calculus was the basis for Henkin’s search for a proof. [1950] work on higher order completeness Andrews [ 19711 took Robinson’s original mentioned above in connection with resolution idea and applied it rigorously to Robinson [1968]. Various types of mathe- higher order logic. But, like Robinson, he matical can be expressed natu- only achieved partial mechanization: Sub- rally and compactly in the typed X-calculus, stitution for predicate variables could not and for this reason almost all work on be done automatically. Andrews describes higher order unification uses this formal- an interactive higher order theorem prover ism. based on generalized resolution in Andrews Pietrzykowski and Jensen also studied and Cohen [1977] and Andrews et al. higher order theorem proving. Pietrzy- [1984]. kowski [1973] gave a complete mechaniza- Darlington [ 1968, 19711 chose a different tion of second-order logic, which he and route. He used a subset of second-order Jensen extended to w-order logic in Jensen logic that allowed him only a little more and Pietrzykowski [1976] and Pietrzy- expressive power than first-order logic. He kowski and Jensen [ 1972,1973]. Like Huet, was able to mechanize this logic completely they based their mechanization on higher using a new unification algorithm (first order unification. Unlike Huet, however, called f-matching). To repeat, he was not they presented an algorithm that computes using full second-order logic; his algorithm a complete set of general unifiers3 The could only unify a predicate variable with algorithm is more complicated than Huet’s, a predicate constant of greater arity. 3 Of course, the algorithm still may not terminate if * The above example is taken from that dissertation. the terms are nonunifiable.

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey l 105 using five basic types of substitutions as opposed to Huet’s two, and is generally redundant. Winterstein [ 19761 reported that even for the simple case of monadic(one argument per function) second-order terms unifica- Figure 7. A feature structure. tion could require exponential time under the algorithms proposed by Huet, Pietrzy- kowski, and Jensen. Winterstein and Huet independently arrived at better algorithms l The distinction between function and for monadic second-order unification. argument is removed. Darlington [1977] attempted to improve l Variables and coreference are treated the efficiency of higher order unification by separately. sacrificing completeness. If two terms I will now discuss each of these differ- unify, Darlington’s algorithm may not re- port this fact, but he claims that in the ences. usual case unification runs more swiftly. Substructure labeling. Features are As mentioned above, Goldfarb [1981] labeled symbolically, not inferred by argu- showed that higher order unification is in ment position within a term. For example, general undecidable. There are, however, we might want the term “person (john, 23, variants that are decidable. Farmer [1987] 70, spouse(mary, 25))” to mean “There is a shows that second-order monadic unifica- person named John, who is 23 years old, 70 tion (discussed two paragraphs ago) is ac- inches tall, and who has a 25-year-old tually decidable. Siekmann [1984] suggests spouse named Mary.” If so, we must re- that higher order unification under some member the convention that the third ar- theory T might be solvable (see Section 9), gument of such a term represents a person’s but there has been no proof to this effect. height. An equivalent feature structure, using explicit labels for substructure, is shown in Figure 7. (I have used a standard, 7. UNIFICATION AND FEATURE but not unique, representation for writing STRUCTURES down feature structures.) Fixed arity. Feature structures do not First-order terms turn out to be too limited have the fixed arity of terms. Traditionally, for many applications. In this section, I will feature structures have been used to rep- present structures of a somewhat more gen- resent partial information-unification eral nature. The popularity of these struc- combines partial structures into larger tures has resulted in a line of research that structures, assuming no conflict occurs. In has split away from mainstream work in other words, unification can result in new theorem proving and logic programming. structures that are wider as well as deeper An understanding of these structures is than the old structures. Figure 8 shows necessary for an understanding of much an example (the square cup stands for research on unification. unification). Kay [1979] introduced the idea of using Notice how all of the information con- unification for manipulating the syntac- tained in the first two structures ends up tic structures of natural languages (e.g., in the third structure. If the type features English and Japanese). He formalized the of the two structures had held different linguistic notion of “features” in what are values, however, unification would have now known as “feature structures.” Feature structures resemble first-order failed. terms but have several restrictions lifted: The demoted functor. In a term, the functor (the function symbol that begins l Substructures are labeled symbolically, the term) has a special place. In feature not inferred by argument position. structures, all information has equal status. l Fixed arity is not required. Sometimes one feature is “primary” for a

ACM Computing Surveys, Vol. 21, No. 1, March 1989 106 . Kevin Knight

type: person age: 23

Figure 8. Unification of two feature structures.

type : person name: john age: Iheight: 1; Figure 9. A feature structure with variables and internal coreference constraints. particular use, but this is not built into the coreference label share the same value (in syntax. the sense of LISP’s EQ, not EQUAL). The Variables and coreference. Variables value itself can be placed after any one of in a term serve two purposes: (1) They are the coreference labels, the choice of which place holders for future instantiation, and is arbitrary. The symbol [ ] indicates a vari- (2) they enforce equality constraints among able. A variable [ ] can unify with any other different parts of the term. For example, feature structure s. the term person (john, Y, 2, spouse ( W, Y)) Like terms, feature structures can also expresses the idea “John can marry anyone be represented as directed graphs (see Sec- exactly as old as he is” (because the variable tion 3). This representation is often more Y appears in two different places). Note, useful for implementation or even presen- however, that equality constraints are re- tation. Whereas term graphs are labeled on stricted to the leaves of a term.4 Feature uertices, feature structure graphs have la- structures allow within-term coreferences bels on arcs and leaves. Figure 10 shows the to be expressed by separating the two roles graph version of the feature structure of of variables. Figure 7. The graph of Figure 10, of course, If we wished, for example, to express the is just a tree. In order to express something constraint that John must marry his best such as “John must marry a woman exactly friend, we might start with a term such as as old as he is,” we need coreference, indi- cated by the graph in Figure 11. person(john, Y, 2, spouse( W, X), This ends the discussion of feature struc- bestfriend( W, X)) tures and terms. At this point, a few other This is awkward and unclear, however; it structures are worth mentioning: frames, #- is tedious to introduce new variables for terms, and LISP functions. Frames [Min- each possible leaf, and moreover, the term sky 19751, a popular representation in ar- seems to imply that John can marry anyone tifical intelligence, can be modeled with as long as she has the same name and age feature structures. We view a feature as a as his best friend. What we really want is a slot: Feature labels become slot names and “variable” to equate the fourth and fifth feature values become slot values. As in the arguments without concern for their inter- frame conception, a value may be either nal structures. Figure 9 shows the feature atomic or complex, perhaps having sub- structure formalization. structure of its own. The boxed number indicates coreference. The $-terms of Ait-Kaci [1984, 19861 Features with values marked by the same and Ait-Kaci and Nasr [1986] are very similar to feature structures. Subterms are 4 This is construed in the sense of the term’s graph labeled symbolically, and fixed arity is representation. not required, as in feature structures, but

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey 107

mary 25

Figure 10. Graph representation of the feature kructure shown in Figure 7.


Figure 11. Graph representation of a feature structure with variables and coreference. the functor is retained. And like feature and a fish-eater. Then unifying the follow- structures, variables and coreference are ing two q-terms, handled separately. Although feature struc- tures predate q-terms, Ait-Kaci’s discus- fish-eater (likes + trout) sion of arity, variables, and coreference is bird (color + brown; likes * fish) superior to those found in the feature struc- ture literature. yields the new $-term The novel aspect of G-terms is the use of pelican (color =$ brown; likes j trout) type inheritance information. Ait-Kaci views a term’s functions and variables as Unification does not fail when it sees filters under unification, since two struc- conflicts between “fish-eater” and “bird” tures with different functors can never or between “trout” and “fish.” Instead, unify, whereas variables can unify with it resolves the conflict by finding the anything. He questions and relaxes this greatest lower bound on the two items in “open/closed” behavior by allowing type the taxonomic hierarchy, in this case information to be attached to functions “pelican” and “trout,” respectively. In this and variables. Unification uses information way, Ait-Kaci’s system naturally ex- from a taxonomic hierarchy to achieve a tends the information-merging nature of more gradual filtering. unification. As an example, suppose we have the fol- As a third structure, I turn to the function lowing inheritance information: Birds and in the programming language LISP. Func- fish are animals, a fish-eater is an animal, tions are not directly related to the feature a trout is a fish, and a pelican is both a bird structures, but some interesting parallels

ACM Computing Surveys, Vol. 21, No. 1, March 1989 108 l Kevin Knight can be drawn. Older dialects of LISP re- grammar rules define ways in which words quired that parameters to functions be may be combined with one another to form passed in an arbitrary, specified order. larger units of language called constituents. Common LISP [Steele 19841, however, in- Sample constituent types are the noun cludes the notion of keyword parameter. phrase (e.g., “the man”) and the verbphrase This allows parameters to be passed along (e.g., “kills bugs”). Grammar rules in nat- with symbolic names; for example, ural language systems share a common pur- pose with grammar rules found in formal (make-node: state/( 1 2 3 ) : language theory and compiler theory: The score 12 : successors nil) description of how smaller constituents can be put together into larger ones. which is equivalent to Figure 12 shows a rule (called an aug- mented context-free grammar rule) taken (make-node : score 12 : state’( 1 2 3 ) : from a sample unification-based grammar. successors nil) It is called an augmented context-free grammar rule because at its core is the Keyword parameters correspond to the simple context-free rule S * NP VP, explicit labeling of substructure discussed meaning that we can build a sentence (S) above. Note that Common LISP functions out of a noun phrase (NP) and a verb are not required to use the keyword param- phrase (VP). The equations (the augmen- eter format. Common LISP also allows for tation) serve two purposes: (1) to block optional parameters to a function, which applications of the rule in unfavorable cir- seems to correspond to the lack of fixed cumstances and (2) to specify structures arity in feature structures. Common LISP, that should be created when the rule is however, is not quite as flexible, since applied. all possible optional parameters must be The idea is this: The rule builds a feature specified in advance. structure called X0 using (already existent) Finally, feature structures may contain feature structures Xl and X2. Each feature variables. LISP is a functional language, structure, in this case, has two substruc- and that means a function’s parameters tures, one labeled category, the other la- must be fully instantiated before the function beled head. Category tells what general type can begin its computation. In other words, of syntactic object a structure is, and head variable binding occurs in only one direc- stores more information. tion, from parameters to their values. Con- Let us walk through the equations. The trast this behavior with that of Prolog, in first equation states that the feature struc- which parameter values may be temporarily ture X0 will have the category of S (sen- unspecified. tence). The next two equations state that the rule only applies if the categories of X 1 8. UNIFICATION AND NATURAL LANGUAGE and X2 are NP and VP, respectively. PROCESSING The fourth equation states that the num- ber agreement feature of the head of Xl This section makes extensive use of the must be the same as the number agreement ideas covered in Section 7. Only with this feature of the head of X2. Thus, this rule background am I able to present examples will not apply to create a sentence like and literature on unification in natural “John swim.” The fifth equation states that language processing. the subject feature of the head of X0 must be equal to Xl; that is, after the rule has applied, the head of X0 should be a struc- 8.1 Parsing with a Unification-Based ture with at least one feature, labeled sub- Grammar ject, and the value of that feature should be Unification-based parsing systems typi- the head of the structure Xl. cally contain grammar rules and lexical The sixth equation states that the head entries. Lexical entries define words, and of X0 should contain all of the features of

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey l 109

x0=%.x1x2 (X0 category) = S (Xl category) = NP (X2 category) = VP (Xl head agreement) = (X2 head agreement ) (X0 head subject) = (Xl head) (X0 head) = (X2 head) (X0 head mood) = declarative

Figure 12. Sample rule from a unification-based grammar. category: NP 1 det: the head: root: man I [agreement: singular 11 category: VP

root: kill tense: present held agreement: singular object: bug agreement: plural 111 Figure 13. Feature structures representing analyses of “the man” and “kills bugs.”

category: NP det: the root : man agreement: singulnr 11 category: VP root: kill tense: present 22: agreement: singular head: I r

Figure 14. Two feature structures combined with “dummy” features xl and x2. the head of X2 (i.e., the sentence inherits “the man” and “kills bugs.” In parsing, the features of the verb phrase). The last we want to combine these structures equation states that the head of X0 should into a larger structure of category S. have a mood feature whose (atomic) value (2) Temporarily combine the constituent is declarative. structures. NP and VP are combined Now where does unification come in? into a single structure by way of Unification is the operation that simulta- “dummy” features 3tl and x2 (Figure neously performs the two tasks of building 14). up structure and blocking rule applications. (3) Represent the grammar rule itself as a Rule application works as follows: feature structure. The feature structure (1) Gather constituent structures. Suppose for the sample rule given above is we have already built up the two struc- shown in Figure 15. The boxed corefer- tures shown in Figure 13. These struc- ence labels enforce the equalities ex- tures represent analyses of the phrases pressed in the rule’s augmentation.

ACM Computing Surveys, Vol. 21, No. 1, March 1989 110 l Kevin Knight

declarative II- Figure 15. Feature structure representation of the grammar rule in Figure 12.

category: S x0: [ head: q I r category: NP det : Xl: the head: pJ root: man L agreement: q singular II r category: VP root: kill tense: present mood: declarative x2: subject: q head: q agreement: q category: Np object: root : bw agreement: plural 11 Figure 16. Result of unifying constituent and rule structures (Figures 14 and 15).

(4) Unify the constituent structure with for the word “man.” All higher level feature the rule structure. We then get the structures are built up from these lexical structure shown in Figure 16. In this structures (and the grammar rules). manner, unification builds larger syn- Second, how are grammar rules chosen? tactic constituents out of smaller ones. It seems that only a few rules can possibly (5) Retrieve the substructure labeled X0. apply successfully to a given set of struc- In practice, this is the only information tures. Here is where the category feature in which we are interested. (The final comes into play. Any parsing method, such structure is shown in Figure 17.) as Earley’s algorithm [Earley 1968, 19701 or generalized LR parsing [Tomita 1985a, Now suppose the agreement feature of 1985b,1987], can direct the choice of gram- the original VP had been plural. Unifica- mar rules to apply, based on the categories tion would have failed, since the NP has of the constituents generated. the same feature with a different (atomic) Third, what are the structures used for? value-and the rule would fail to apply, Feature structures can represent syntactic, blocking the parse of “The man kills bugs.” semantic, or even discourse-based infor- A few issues bear discussion. First, where mation. Unification provides a kind of did the structures Xl and X2 come from? constraint-checking mechanism for merg- In unification-based grammar, the lexicon, ing information from various sources. The or dictionary, contains basic feature struc- feature structure built up from the last rule tures for individual words. Figure 18, for application is typically the output of the example, shows a possible lexical entry parser.

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey l 111 category: s root: kill tense: present mood: declarative det: the subject: root: man head : agreement: q singular I agreement: q

object: bug agreement: plural 11 Figure 17. Feature structure analysis of “The man kills bugs.”

category: N presents motivations and examples far be- root: man yond the above exposition. The reader in- head: [ agreement: singular 11 terested in this area is urged to find this source. Now I will discuss research on fea- Figure 18. Sample lexical entry for the word “man.” ture structures, unification algorithms, and unification-based grammar formalisms. Finally, why do the rules use equations Karttunen [ 19841 discusses features and rather than explicit tests and structure- values at a basic level and provides linguis- building operations? Recall that unifica- tic motivation for both unification and generalization (see also Section 11). tion both blocks rule applications if certain Structure-sharing approaches to speeding tests are not satisfied and builds up new up unification are discussed by Karttunen structures. Another widely used grammar and Kay [ 19851 and by Pereira [ 19851, who formalism, the Augmented Transition Net- imports Boyer and Moore’s [1972] ap- work (ATN) of Woods [1970], separates proach for terms. Wroblewski [1987] gives these two activities. An ATN grammar is a a simple nondestructive unification algo- network of states, and parsing consists of rithm that minimizes copying. moving from state to state in that network. Much unpublished research has dealt Along the way, the parser maintains a set with extensions to unification. For exam- of registers, which are simply variables that ple, it is often useful to check for the contain arbitrary values. When the parser presence or absence of a feature during moves from one state to another, it may unification rather than simply merging test the current values of any registers and features. Other extensions include (1) may, if the tests succeed, change the values multiple-valued features-a feature may of any registers. Thus, testing and building take on more than one value, for example, are separate activities. “big, red, round “; (2) negated features:-a An advantage of using unification, how- ever, is that it is, by its nature, bidirec- feature may be specified by the values it cannot have rather than one it can have; tional. Since the equations are stated and (3) disjunctive features:-a feature may declaratively, a rule could be used to cre- be specified by a set of values, at least one ate smaller constituents from larger ones. of which must be its true value. One interesting aspect of unification- Disjunctive feature structures are of based grammars is the possibility using the great utility in computational , same grammar to parse and generate nat- but their manipulation is difficult. Fig- ural language. ure 19 shows an example of unification of disjunctive feature structures. 8.2 Research The curly brackets indicate disjunction; An excellent, modern introduction to the a feature’s value may be any one of the use of unification in natural language pro- structures inside the brackets. When per- cessing is Shieber’s [ 19861 book. This book forming unification, one must make sure

ACM Computing Surveys, Vol. 21, No. 1, March 1989 112 l Kevin Knight

c: 2 d: 2 [ d: 1 1 b: [e: 41 c: 2

Figure lg. Unification of disjunctive feature structures.

that all possibilities are maintained. This complete bibliography. The formalisms can usually involves cross multiplying the val- be separated roughly into families: ues of the two disjunctive features. In the In the “North American Family,” we above example, there are four possible ways have formalisms arising from work in com- to combine the values of the “b” feature; putational linguistics in North America in three are successful, and one fails (because the late 1970s. Functional Unification of the conflicting assignment to the “d” Grammar (FUG; previously FG and UG) feature). is described by Kay [1979, 1984, 1985aJ. The mathematical basis of feature struc- Lexical Functional Grammar (LFG) is tures has been the subject of intensive introduced in Bresnan’s [ 19821 book. study. Pereira and Shieber [1984] devel- The “Categorial Family” comes out of oped a denotational semantics for unifica- theoretical linguistic research on categorial tion grammar formalisms. Kasper and grammar. Representatives are Unification Rounds [1986] and Rounds and Kasper Categorial Grammar (UCG), Categorial [ 19861 presented a logical model to describe Unification Grammar (CUG), Combina- feature structures. Using their formal se- tory Categorial Grammar (CCG), and mantics, they were able to prove that the Meta-Categorial Grammar (MCG). UCG problem of unifying disjunctive structures was described by Zeevat [1987], CUG by is flp-complete. Disjunctive unification Uzkoreit [1986], and CCG by Wittenburg has great utility, however, and several re- [ 19861. Karttunen [ 1986131 also presented searchers, including Kasper [ 19871 and work in this area. Eisele and Dorre [ 19881, have come up with The “GPSG Family” begins with Gener- algorithms that perform well in the average alized Phrase Structure Grammar (GPSG), case. Rounds and Manaster-Ramer [ 19871 a modern version of which appears in discussed Kay’s Functional Grammar in Gazdar and Pullum [1985]. Work at terms of his logical model. Moshier and ICOT in Japan resulted in a formalism Rounds [1987] extended the logic to deal called JPSG [Yokoi et al. 19861. Pollard with negation and disjunction by means of and colleagues have been working recently intuitionistic logic. Finally, in his disserta- on Head-Driven Phrase Structure Gram- tion, Ait-Kaci [1984] gave a semantic ac- mar (HPSG); references include Pollard count of $-terms, logical constructs very [1985] and Sag and Pollard [1987]. close to feature structures-these con- Ristad [1987] introduces Revised GPSG structs were discussed in Section 7. (RGPSG). The past few years have seen a number The “Logic Family” consists of grammar of new grammar formalisms for natural formalisms arising from work in logic pro- languages. A vast majority of these formal- gramming. Interestingly, Prolog was devel- isms use unification as the central opera- oped with natural language applications in tion for building larger constituents out of mind: Colmerauer’s [1978, 1982a] Meta- smaller ones. Gazdar [ 19871 enumerated morphosis Grammars were the first gram- them in a brief survey, and Sells [1985] mars in this family. Pereira and Warren’s described the theories underlying two of the [1980] Definite Clause Grammar (DCG) more popular unification grammar formal- has become extremely popular. Other logic isms. Gazdar et al. [1987a] present a more grammars include Extraposition Grammar

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey l 113 (XC) [Pereira, 19811, Gapping Grammars, h(a)), and the most general unifier u was [Dahl1984; Dahl and Abramson 19841 Slot Grammar [McCord 19801, and Modular x +--Ah(a), a) Logic Grammar (MLG) [McCord 19851. y + h(a) Prolog has been used in a variety of ways for analyzing natural languages. Definite From an algebraic viewpoint, we can think Clause Grammar (DCG) is a basic exten- of this unification as solving the equation sion of Prolog, for example. Bottom-up par- s = t by determining appropriate values for sers for Prolog are described in Matsumoto the variables n and y. and Sugimura [1987], Matsumoto et al. Now, for a moment, assume that f [ 19831, and Stabler [ 19831. Several articles denotes the function add, g denotes the on logic programming and natural language function multiply, h denotes the function processing are collected in Dahl and Saint- successor, and a denotes the constant Dizier [1985]. Implementations of LFG in zero. Also assume that all the axioms of Prolog are described in Eisele and Dorre number theory hold. In this case, the equa- [ 19861 and Reyle and Frey [ 19831. tion s = t is interpreted as add(x, y) = Pereira [ 19871 presented an interesting add(mult(y, O), succ(0)). It turns out that investigation of unification grammars from there are many solutions to the equation, a logic programming perspective. In a sim- including the substitution r: ilar vein, Kay [1985b] explained the need to go beyond the capabilities of unification x t h(a) present in current logic programming Y+-a systems. Finally, in the “miscellaneous” category, Notice that 7(s) is f(h(a), a) and that we have PATR-II, an (implemented) gram- 7(t) is f (g(a, a), h(a)). These resulting two mar formalism developed by Karttunen terms are not textually identical, but under [1986b], Shieber [1984, 19851, and others. the of f, g, h, and a given PATR is actually an environment in which above, they are certainly equivalent. The grammars can be developed for a wide va- former is succ(0) + 0, or succ(0); the latter riety of unification-based formalisms. In is (0 . 0) + succ(O), or succ(0). Therefore, fact, many of the formalisms listed above we say that r unifies s and t under the are so similar that they can be translated axioms of number theory. It is clear that into each other automatically; for example, determining whether two terms are unifia- see Reyle and Frey [1983]. Hirsh [1986] ble under the axioms of number theory is described a compiler for PATR grammars. the same problem as solving equations in number theory. Number theory has complex axioms and 9. UNIFICATION AND EQUATIONAL inference rules. Of special interest to THEORIES unification are simpler equational axioms Unification under equational theories is a such as very active area of research. The research f (f (x, y), 2) = f (x, f (y, 2)) associativity has been so intense that there are several commutativity papers devoted entirely to classifying fb,Y) =f(y,x) previous research results [e.g., Siekmann f(&X)=x idempotence 1984, 1986; Walther 19861. Many open A theory is a finite collection of axioms problems remain, and the area promises to such as the ones above. The problem of provide new problems for the next decade. unifying two terms under theory T is writ- ten (s = t)T. 9.1 Unification as Equation Solving Consider the problem of unification under commutativity: (s = t)c. To unify At the very beginning of this paper, I pre- f(x, y) with f(a, b) we have two substi- sented an example of unification. The tutions available to us: 1~ t a; y t b} and terms were s = f (x, y) and t = f (g(y, a), (x t b; y c a}. With commutativity, we no

ACM Computing Surveys, Vol. 21, No. 1, March 1989 114 l Kevin Knight

longer get the unique most general unifier arising theories. The study of Universal of Robinson’s null-theory unification- Unification seeks to discover properties of compare this situation to Gould’s findings unification that hold across a wide variety on higher order unification. of equational theories, just as Universal There are many applications for unifi- Algebra abstracts general properties from cation algorithms specialized for certain individual algebras. equational theories. A theorem prover, for To give a flavor for results in Universal example, may be asked to prove theorems Unification, I reproduce a recent theorem about the multiplicative properties of inte- due to Book and Siekmann [1986]: gers, in which case it is critical to know that the multiplication function is associa- Theorem 9.1 tive. One could add an “associativity axiom” to the theorem prover, but it might If T is a suitable first-order equation theory waste precious inferences simply bracket- that is not unitary, then T is not bounded. ing and rebracketing multiplication for- mulas. A better approach would be to “build This means the following: Suppose that in” the notion of associativity at the core unification under theory T produces a finite of the theorem prover: the unification al- number of general unifiers but no MGU (as gorithm. Plotkin [ 19721 pioneered this with commutativity). Then, there is no par- area, and it has since been the subject of ticular n such that for every pair s much research. and t, the number of general unifiers is less than n (T is not bounded). Term- systems, systems for 9.2 Research translating sets of equations into sets of Siekmann’s [ 19841 survey of unification rewrite rules, are basic components of uni- under equational theories is comprehen- fication algorithms for equational theories. sive. The reader is strongly urged to seek Such systems were the object of substantial out this source. I will mention here only a investigation in Knuth and Bendix [1970]. few important results. A survey of work in this area is given in We saw above that unification under Huet and Oppen [ 19801. commutativity can produce more than one Some recent work [Goguen and Mese- general unifier; however, there are always guer 1987; Meseguer and Goguen 1987; a finite number of general unifiers [Livesey Smolka and Ait-Kaci 1987; Walther 19861 et al. 19791. Under associativity, on the investigates the equational theories of or- other hand, there may be an infinite num- der-sorted and many-sorted algebras. ber of general unifiers [Plotkin 19721. Smolka and Ait-Kaci [1987] show how Baader [ 19861 and Schmidt-Schauss [ 19861 unification of Ait-Kaci’s $-terms (see Sec- independently showed that unification un- tion 7) is an instance of unification in der associativity and idempotence has the order-sorted Horn logic. In other words, “unpleasant” feature that no complete set unification with inheritance can be seen as of general unifiers even exists. (That is, unification with respect to a certain set of there are two unifiable terms s and t, but equational axioms. for every unifier u, there is a more general Unification algorithms have been con- unifier 7.) structed for a variety of theories. Algo- Results of this kind place any equational rithms for particular theories, however, theory T into a “hierarchy” of theories, have usually been handcrafted and based according to the properties of unification on completely different techniques. There under T. Theories may be unitary (one is interest in building more general (but MGU), finitary (finite number of general perhaps inefficient) “universal” algorithms unifiers), infinitary (infinite number of gen- [e.g., Fages and Huet 1986; Fay 1979; eral unifiers), or nullury (no complete set Gallier and Snyder 1988; Kirchner 19861. of general unifiers). Siekmann [1984] sur- There is also interest in examining the veys the classifications of many naturally computational complexity of unification

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey 115 under various theories [e.g., Kapur and results, investigated a form of logic pro- Narendran 19861. gramming based on matching rather than unification. 10. PARALLEL ALGORITHMS FOR Vitter and Simons [1984, 19861 argued UNIFICATION that unification can be helped by multiple processors in a practical setting. They de- Lewis and Statman [1982] wrote a note veloped complexity classes to formalize this showing that nonunifiability could be com- notion of “practical” speedup and gave a puted in nondeterministic log space (and parallel unification algorithm that runs in thus, by Savitch’s theorem, in deterministic time O(E/P + V log P), where P is the log2 space). By way of the parallel compu- number of processors, E the number of tation thesis [Goldschlager 19781, which re- edges in the term graph, and V the number lates sequential space to parallel time, this of vertices. result could have led to a log2 n parallel Harland and Jaffar [ 19871 introduced an- time algorithm for unification. other measure for parallel efficiency and In the process of trying to find such an compared several parallel algorithms in- algorithm, Dwork et al. [1984] found an cluding Yasuura’s algorithm [ 19841, Vitter error in the above-mentioned note. They and Simon’s algorithm [1984, 19861, and went on to prove that unifiability is actually a parallel version of Jaffar’s algorithm log space complete for 9, which means that [1984]. it is difficult to compute unification using a small amount of space. This means that 11. UNIFICATION, GENERALIZATION, AND it is highly unlikely that unification can be LATTICES computed in O(logk n) parallel time using a polynomial number of processors (in com- Generalization is the dual of unification, plexity terminology, unification is in the and it finds applications in many areas in class &%? only if 9 = Jy^%?). which unification is used. Abstractly, the In other words, even massive parallelism generalization problem is the following: will not significantly improve the speed of Given two objects x and y, can we find a unification-it is an inherently sequential third object z of which both x and y are process. This lower bound result has a wide instances? Formally it is as follows: range of consequences; for example, using parallel hardware to speed up Prolog exe- Definition 11.1 cution is unlikely to bring significant gains An antisubstitution (denoted y, 17,5; . . .), is (but see Robinson [1985] for a hardware a mapping from terms into variables. implementation of unification). Yasuura [ 19841 independently derived Definition 11.2 a complexity result similar to that of Dwork et al. [1984] and gave a parallel Two terms s and t are general&able if there unification algorithm that runs in time exists an antisubstitution y such that O(log’ n + m log m), where m is the total y(s) = y(t). In such a case, y is called a number of variables in the two terms. generalizer of s and t, and y(s) is called Dwork et al. [1984] mentioned the exist- a generalization of s and t. ence of a polylog parallel algorithm for term Generalizability is a rather vacuous con- matching, the variant of unification in cept, since any two terms are generalizable which substitution is allowed into only one under the antisubstitution that maps all of the two terms. They gave an O(log2 n) terms onto the variable X. Of more interest time bound but required about O(n5) is the following: processors. Using randomization tech- niques, Dwork et al. [1986] reduced the Definition processor bound to O(n3). Maluszynski and Komorowski [ 19851, A generalizer y of terms s and t is called motivated by Dwork et al.‘s [ 19841 negative the most specific generalizer (MSG) of s and

ACM Computing Surveys, Vol. 21, No. 1, March 1989 116 l Kevin Knight

fk b)

Figure 20. A portion of the lattice of first-order terms.

fk b) fb, Y)

Figure 21. Unification of terms f(x, b) and f (a, y), t if for any other generalizer 7, there is an came up with a generalization algorithm. antisubstitution {such that {y(s) = v(s). (Plotkin [ 19701 independently discovered this algorithm.) Reynolds used the natural Intuitively, the most specific generaliza- lattice structure of first-order terms, a par- tion of two terms retains information that tial ordering based on “subsumption” of is common to both terms, introducing new terms. General terms subsume more spe- variables when information conflicts. For cific terms; for example, f (x, a) subsumes example, the most specific generaliza- f(b, a). Many pairs of terms, of course, do tion of f(a, g(b, c)) and f(b, g(2c, c)) is not stand in any subsumption relation, and f(z, g(x, c)). Since the second argument to this is where unification and generalization f can be a or b, generalization “makes an come in. Figure 20 shows a portion of abstraction” by introducing the variable Z. the lattice of first-order terms augmented Unification, on the other hand, would fail with two special terms called top (T) and because of this conflict. bottom (I). As mentioned in a previous section, Unification corresponds to finding the Reynolds [1970] proved the existence of greatest lower bound (or meet) of two terms a unique MSG for first-order terms and in the lattice. Figure 21 illustrates the

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey l 117


fh b) fh Y)

Figure 22. Generalization of terms f(x, b) and f(a, c). unification of f(x, b) and f(a, y). The bot- 12.1 Type Inference tom of the lattice (I), to which all pairs of Milner [1978] presented work on compile- terms can unify, represents inconsistency. time type checking for programming lan- If the greatest lower bound of two terms guages, with the goal of equating syntactic is I, then they are not unifiable. correctness with well typedness. His algo- Generalization corresponds to finding rithm for well typing makes use of unifica- the least upper bound (or join) of two terms tion, an idea originally due to Hindley in the lattice. Figure 22 illustrates the gen- [1969]. More recent research in this area eralization of f(x, b) and f (a, c). The top of includes the work of Cardelli [1984], the lattice (T), to which all pairs of terms MacQueen et al. [1984], and Moshier and can generalize, is called the universal term. Rounds [ 19871. Everything is an instance of this term. Feature structures, previously discussed in Section 7, make up a more complex 12.2 Programming Languages lattice structure. Pattern matching is a very useful feature Inheritance hierarchies, common struc- of programming languages. Logic program- tures in artificial intelligence, can be seen ming languages make extensive use of as lattices that admit unification and gen- pattern matching, as discussed above, eralization. For example, the generalization but some functional languages also pro- of concepts “bird” and “fish-eater” in a vide matching facilities (e.g., PLANNER, certain knowledge base might be “animal,” QA4/QLISP). In these cases, functions whereas the unification might be “pelican.” may be invoked in a pattern-directed Ait-Kaci extends this kind of unification manner, making use of unification as a and generalization to typed record struc- pattern-matcher/variable-binder. Stickel tures (#-terms), as discussed in Section 7. [ 19781 discusses pattern-matching lan- guages and specialized unification algo- rithms in his dissertation. 12. OTHER APPLICATIONS OF UNIFICATION 12.3 Machine Learning I have already discussed three application Learning concepts from training instances areas for unification: automatic theorem is a task that requires the ability to gener- proving, logic programming, and natural alize. Generalization, the dual of unifica- language processing. Other areas are listed tion, is an operation widely used in machine here. learning.

ACM Computing Surveys, Vol. 21, No. 1, March 1989 118 . Kevin Knight Plotkin [ 19701, who opened up work in unidirectional variant of unification and unification under equational theories, also appears in the “function call” of imperative produced work on inductive generalization, programming languages. that is, abstracting general properties from Unification deals in partially defined a set of specific instances. Mitchell’s [ 19791 structures. Unification, unlike many work on version spaces investigated the other operations, accepts inputs that con- same problem using both positive and tain uninstantiated variables. Its output negative training instances. Positive ex- may also contain uninstantiated variables. amples force the learner to generate more general hypotheses, whereas negative ex- 13.2 Trends in Unification Research amples force more specific hypotheses. The general-to-specific lattice (see Sec- One trend in unification research seeks to tion 11) is especially useful, and Mitchell’s discover faster algorithms for first-order algorithms make use of operations on this unification.5 Over the past two decades, lattice. many quite different algorithms have been presented, and although the worst-case 13. CONCLUSION complexity analyses are very interesting, it is still unclear which algorithms work best This section contains a list of properties in practice. There seems to be no better that unification possesses along with a way to compare algorithms than to imple- summary of the trends in research on uni- ment and test them on practical problems. fication. As it stands, which algorithm is fastest depends on the kinds of structures that are 13.1 Some Properties of Unification typically unified. Another trend, begun by Robinson The following properties hold for first- [1968], tackles the problems of unification order unification: in higher order logic. Such problems are in Unification is monotonic. Unification general unsolvable, and the behaviors of adds information, but never subtracts. It is unification algorithms are less quantifiable. impossible to remove information from one Systems that deal with higher order logic, structure by unifying it with another. This for example theorem provers and other AI is not the case with generalization. systems [Miller and Nadathur 19871, stand Unification is commutative and asso- to gain a great deal from research in higher ciative. The order of unifications is irrel- order unification. evant to the final result. Unification and A third trend runs toward incorporating generalization are not, however, distribu- more features into the unification opera- tive with respect to each other [Shieber tion. Automatic theorem provers once in- 19861. cluded separate axioms for commutativity, Unification is a constraint-merging associativity, and so on, but research begun process. If structures are viewed as en- by Plotkin [I9721 showed that building coding constraint information, then uni- these notions directly into the unification fication should be viewed as merging routine yields greater efficiency. Logic pro- constraints. Unification also detects when gramming languages are often called on to combinations of certain constraint sets are reason about hierarchically organized data, inconsistent. but they do so in a step-by-step fashion. Ait-Kaci and Nasr [1986] explain how to Unification is a pattern-matching modify unification to use inheritance infor- process. Unification determines whether mation, again improving efficiency. Many two structures match, using the mechanism unification routines for natural language of variable binding. parsing perform unification over negated, Unification is bidirectional since vari- able binding may occur in both of the ’ This trend toward efficiency also explores topics such structures to be unified. Matching is the as structure sharing and space efficiency.

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey l 119 multiple-valued, and disjunctive structures. BAXTER, L. 1978. The undecidability of the third New applications for unification will no order dyadic unification problem. Znf. Control 38, pp. 170-178. doubt drive further research into new, more BOOK, R. V., AND SIEKMANN, J. 1986. On unifica- powerful unification algorithms. tion: Equational theories are not bounded. J. Symbolic Comput. 2, pp. 317-324. BOYER, R. S., AND MOORE, J. S. 1972. The sharing ACKNOWLEDGMENTS of structure in theorem-proving programs. Mach. Intell. 7. I would like to thank Yolanda Gil, Ed Clarke, and BRESNAN, J., AND KAPLAN, R. 1982. Lexical- Frank Pfenning for their useful comments on earlier functional grammar: A for gram- drafts of this paper. My thanks also go to the reviewers matical representation. In The Mental Represen- for making many suggestions relating to both accuracy tation of Grammatical Relations, J. Bresnan, Ed. and clarity. Finally, I would like to thank Masaru MIT Press, Cambridge, Mass. Tomita for introducing me to this topic and for helping BRUYNOOGHE, M., AND PEREIRA, L. M. 1984. to arrange this survey project. Deduction revision by intelligent backtracking. In Prolog Implementation, J. A. Campbell, Ed. Elsevier, North-Holland, New York. CARDELLI, L. 1984. A semantics of multiple inher- REFERENCES itance. In Semantics of Data Types, Lecture Notes in Computer Science, No. 173, G. Kahn, D. B. AHO, V. A., HOPCROFT, J. E., AND ULLMAN, J. D. MacQueen, and G. Plotkin, Eds., Springer- 1974. The Design and Analysis of Computer Al- Verlag, New York. gorithms. Addison-Wesley, Reading, Mass. CHEN, T. Y., LASSEZ, J., AND PORT, G. S. 1986. ANDREWS, P. B. 1971. Resolution in . Maximal unifiable subsets and minimal non- J. Symbolic Logic 36, pp. 414-432. unifiable subsets. New Generation Comput. 4, pp. ANDREWS, P. B., AND COHEN, E. L. 1977. Theorem 133-152. proving in type theory. In Proceedings of Inter- CHURCH, A. 1940. A formulation of the simple theory national Joint Conference on Artificial Intelli- of types. J. Symbolic Logic 5, pp. 56-68. gence. CLOCKSIN, W. F., AND MELLISH, C. S. 1981. ANDREWS, P., MILLER, D., COHEN, E., AND PFEN- Programming in Prolog. Springer-Verlag, New NING, F. 1984. Automating higher-order logic. York. Vol. 29. In Automated Theorem Prouing: After 25 Years, W. Bledsoe and D. Loveland, Eds. Amer- COLMERAUER, A. 1978. Metamorphosis grammars. ican Mathematical Society, Providence, R.I., In Natural Language Communication with Com- Contemporary Mathematics Series. puters, L. Bolt, Ed. Springer-Verlag, New York. AiT-KACI, H. 1984. A lattice theoretic approach to COLMERAUER, A. 1982a. An interesting subset of nat- computation based on a calculus of partially or- ural language. In Logic Programming, K. L. Clark dered type structures. Ph.D. dissertation. Univ. and S. Tarnlund, Eds. Academic Press, London. of Pennsylvania, Philadelphia, Pa. COLMERAUER, A. 1982b. Prolog and infinite trees. AiT-KACI, H. 1986. An algebraic semantics approach In Logic Programming, K. L. Clark and S. to the effective resolution of type equations. Timlund, Eds. Academic Press, Orlando, Fla. Theor. Comput. Sci. 45. COLMERAUER, A. 1983. Prolog in ten figures. In Pro- AiT-KACI, H., AND NASR, R. 1986. LOGIN: A logic ceedings of the International Joint Conference on programming language with built-in inheritance. Artificial Intelligence. J. Logic Program. 3. (Also MCC Tech. Rep. AI- COLMERAUER, A., KANOUI, H., AND VAN CANEGHEM, 068-85, 1985.) M. 1973. Un systeme de communication BAADER, F. 1986. The theory of idempotent semi- homme-machine en Francais. Research Rep. groups is of unification type zero. J. Autom. Rea- Groupe Intelligence Artificielle, Universitb Aix- soning 2, pp. 283-286. Marseille II. BARWISE, J., AND PERRY, J. 1983. Situations and CORBIN, J., AND BIDOIT, M. 1983. A rehabilita- Attitudes. MIT Press, Cambridge, Mass. tion of Robinson’s unification algorithm. Inf. BATTANI, G., AND MELONI, H. 1973. Interpreteur du Process. 83, pp. 73-79. langage de programmation PROLOG. Groupe In- COURCELLE, B. 1983. Fundamental properties of in- telligence Artificielle, UniversitG Aix-Marseille II. finite trees. Theor. Comput. Sci. 25, pp. 95-169. BAXTER, L. 1973. An efficient unification algorithm. COX, P. T. 1984. Finding backtrack points for intel- Tech. Rea. No. CS-73-23. Univ. of Waterloo. liaent backtracking. In Prolog ImDlementation. Waterloo,- Ontario, Canada: Jy A. Campbell, I?d. Elsevie;, Nbrth-Holland; BAXTER, L. 1976. The complexity of unification. New York. Ph.D. dissertation. Univ. of Waterloo, Waterloo, DAHL, V. 1984. More on gapping grammars. In Pro- Ontario, Canada. ceedings of the International Conference on Fifth

ACM Computing Surveys, Vol. 21, No. 1, March 1989 120 l Kevin Knight

Generation Computer Systems. Sponsored by In- GAZDAR, G., AND PULLUM, G. K. 1985. stitute for New Generation Computer Technol- Computationally relevant properties of natural ogy (ICOT). languages and their grammars. New Generation DAHL, V., AND ABRAMSON. 1984. On gapping gram- Comput. 3, pp. 273-306. mars. In Proceedings of the 2nd International GAZDAR, G., FRANZ, A., OSBORNE, K., AND EVANS, Conference on Logic Programming. Sponsored by R. 1987. Natural Language Processing in the Uppsala University, Uppsala, Sweden. 1980S-A Bibliography. CSLI Lecture Notes DAHL, V., AND SAINT-DIZIER, P., EDS. 1985. Natural Series, Center for the Study of Language and Language Understanding and Logic Program- Information, Stanford, California. ming. Elsevier North-Holland, New York. GOGUEN, J. A., AND MESEGUER, J. 1987. Order- DARLINGTON, J. 1968. Automatic theorem proving sorted algebra solves the constructor-selector with equality substitutions and mathematical in- multiple representation and coercion problems. duction. Mach. Zntell. 3. Elsevier, New York. Tech. Rep. CSLI-87-92, Center for the Study of DARLINGTON, J. 1971. A partial mechanization of Language and Information. second-order logic. Mach. Intell. 6. GOLDFARB, W. D. 1981. The undecidability of the DARLINGTON, J. 1977. Improving the efficiency of second order unification problem. J. Theoret. higher order unification. In Proceedings of the Comput. Sci. 13, pp. 225-230. International Joint Conference on Artifical Intel- GOLDSCHLAGER, L. 1978. A unified approach to ligence. models of synchronous parallel machines. In Pro- DE CHAMPEAUX, D. 1986. About the Paterson- ceedings of the Symposium on the Theory of Com- Wegman linear unification algorithm. J. Comput. puting. Sponsored by ACM Special- Interest Syst. Sci. 32, pp. 79-90. Group for Automata and Computabilitv Theorv DWORK, C., KANELLAKIS, P. C., AND MITCHELL, (SIGACT). J. C. 1984. On the sequential nature of unifica- GOULD, W. E. 1966a. A matching procedure for tion. J. Logic Program. 1, pp. 35-50. w-order logic. Ph.D. dissertation. Princeton DWORK, C., KANELLAKIS, P. C., AND STOCKMEYER, Univ., Princeton, N.J. L. 1986. Parallel algorithms for term matching. GOULD, W. E. 196613. A matching procedure for In Proceedings of the 8th International Conference w-order logic. Scientific Rep. 4, AFCRL 66-781. on Automated Deduction. GUARD, J. R. 1964. Automated logic for semi-auto- EARLEY, J. 1968. An efficient context-free parsing mated mathematics. Scientific Rep. 1, AFCRL algorithm. Ph.D. dissertation, Computer Science 64-411. Dept., Carnegie-Mellon Univ., Pittsburgh, Pa. HARLAND, J., AND JAFFAR, J. 1987. On Parallel Uni- EARLEY, J. 1970. An efficient context-free parsing fication for Prolog. New Generation Comput. 5. algorithm. Commun. ACM 6, pp. 94-102. HASIDA, K. 1986. Conditioned unification for natu- EISELE, A., AND DORRE, J. 1986. A lexical functional ral language processing. In Proceedings of the grammar system in Prolog. In Proceedings of the International Conference on Computational International Conference on Computational Lin- Linguistics. North-Holland, Amsterdam, New guistics. North-Holland, Amsterdam, New York. York. EISELE, A., AND D~RRE, J. 1988. Unification of dis- HENKIN, L. 1950. Completeness in the theory of junctive feature descriptions. In Proceedings of types. J. Symbolic Logic 15, pp. 81-91. the 26th Annual Meeting of the Association for HERBRAND, J. 1971. Recherches sur la theorie de la Computational Linguistics. demonstration. Ph.D. dissertation. In Logical FAGES, F., AND HUET, G. 1986. Complete sets of Writings, W. Goldfarb, Ed. Harvard University unifiers and matchers in equational theories. Press, Cambridge, Massachusetts. Theor. Comput. Sci. 43, pp. 189-200. HINDLEY, R. 1969. The principal type-scheme of an FARMER, W. M. 1987. A unification algorithm for object in combinatory logic. Trans. Am. Math. second-order monadic terms. Tech. Rep. (forth- SOC. 146. coming). The MITRE Corporation, document HIR~H, S. B. 1986. P-PATR: A compiler for unifi- MTP-253. cation-based grammars. Center for the Study of FAY, M. 1979. First order unification in an equa- Language and Information. tional theory. In Proceedings of the 4th Workshop HUET, G. 1972. Contrained resolution: A complete on Automated Deduction, Austin, Texas. method for higher order logic. Ph.D. dissertation. GALLIER, J. H., AND SNYDER, W. 1988. Complete Case Western Reserve Univ., Cleveland, OH. sets of transformations for general E-unification. Tech. Rep. MS-CIS-88-72-LINCLAB-130. Dent. HUET, G. 1973a. A mechanization of type theory. In of Computer and Information Science, Univ.-of Proceedings of the International Joint Conference Pennsylvania, Philadelphia, Pa. on Artificial Intelligence. GAZDAR, G. 1987. The new grammar formalisms- HUET, G. 1973b. The undecidability of unification A tutorial survey (abstract). In Proceedings of the in third order logic. Inf. Control 22, pp. 257-267. International Joint Conference on Artificial Zntel- HUET, G. 1975. A unification algorithm for typed ligence. X-calculus. Theoret. Comput. Sci. I, pp. 27-57.

ACM Computing Surveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey l 121

HUET, G. 1976. Resolution d’equations dans les KIRCHNER, C. 1986. Computing unification algo- langages d’ordre 1, 2, . , o. Ph.D. dissertation. rithms. In 1st Symposium on Logic in Computer Univ. de Paris VII, France. Science. Co-sponsored by IEEE Computer Soci- HUET, G., AND OPPEN, D. 1980. Equations and ety, Technical Committee on Mathematical rewrite rules: A survey. In Formal Language Foundations of Computing; ACM Special Interest Theory, R. V. Book, Ed. Academic Press, Group for Automata and Computability Theory Orlando, Fla. (SIGACT); Association for Symbolic Logic; European Association for Theoretical Computer JAFFAR, J. 1984. Efficient unification over infinite Science. terms. New Generation Comput. 2, pp. 207-219. KNUTH, D., AND BENDIX, P. B. 1970. Simple word JENSEN, D. C., AND PIETRZYKOWSKI, T. 1976. problems in . In Computational Mechanizing w-order type theory through unifi- Problems in Abstract Algebra, J. Leech, Ed. cation. Theoret. Comput. Sci. 3, pp. 123-171. Pergamon Press, Oxford. KAPUR, D., AND NARENDRAN, P. 1986. NP-com- LEVY, J. 1983. A unification algorithm for concur- pleteness of the set unification and matching rent Prolog. In Proceedings of the 2nd Logic Pro- problems. In Proceedings of the 8th International gramming Conference. Sponsored by Uppsala Conference on Automated Deduction. Springer- University, Uppsala, Sweden. Verlag, New York. LEWIS, H. R., AND STATMAN, R. 1982. Unifiability KAPUR, D., KRISHNAMOORTHY, M. S., AND NAREN- is complete for co-NLogSpace. Info. Process. L&t. DRAN, P. 1982. A new linear algorithm for 15, pp. 220-222. unification. Tech. Rep. 82CRD-100, General Electric. LIVESEY, M., SIEKMANN, J., SZABO, P., AND UNVERICHT, E. 1979. Unification problems for KARTTUNEN, L. 1984. Features and values. In Pro- combinations of associativity, commutatitivity, ceedings of the International Conference on Com- distributivity and idempotence axioms. In Pro- putational Linguistics. ceedings of the 4th Workshop on Automated De- KARTTUNEN, L. 1986a. D-PATR: A development en- duction, Austin, Texas. vironment for unification-based grammars. In LUCCHESI, C. L. 1972. The undecidability of the Proceedings of the International Conference on unification problem for third order languages. Computational Linguistics. North-Holland, Tech. Rep. CSRR 2059, Dept. of Applied Analysis Amsterdam, New York. and Computer Science, University of Waterloo, KARTTUNEN, L. 1986b. Radical lexicalism. Tech. Waterloo, Ontario, Canada. Rep. CSLI-86-68, Center for the Study of Lan- MACQUEEN, D., PLOTKIN, G., AND SETHI, R. guage and Information. 1984. An ideal model for recursive polymorphic KARTTUNEN, L., AND KAY, M. 1985. Structure shar- types. In Proceedings of the ACM Symposium on ing with binary trees. In Proceedings of the 23rd Principles of Programming Languages. ACM, Annual Meeting of the of the Association for Com- New York. putational Linguistics. MALUSZYNSKI, J., AND KOMOROWSKI, H. J. 1985. KASPER, R. 1987. A unification method for disjunc- Unification-free execution of Horn-clause pro- tive feature descriptions. In Proceedings of the grams. In Proceedings of the 2nd Logic Progrum- 25th Annual Meeting of the Association for Com- ming Symposium. IEEE, New York. (Also, putational Linguistics. Harvard Univ., Computer Science Dept., Tech. KASPER, R., AND ROUNDS, W. C. 1986. A logical Rep. TR-10-85, 1985.) semantics for feature structures. In Proceedings MANNILA, H., AND UKKONEN, E. 1986. On the com- of the 24th Annual Meeting of the Association for plexity of unification . In Proceedings Computational Linguistics. of the 3rd International Logic Programming Con- KAY, M. 1979. Functional grammar. In 5th Annual ference. Springer-Verlag, New York. Meeting of the Berkeley Linguistic Society. Spon- MARTELLI, A., AND MONTANARI, U. 1976. sored by Berkeley Linguistics Society, Berkeley, Unification in linear time and space: A structured California. presentation. Internal Rep. No. B76-16, 1st. di KAY, M. 1984. Functional unification grammar: A Elaborazione delle lnformazione, Consiglio Na- formalism for machine translation. In Proceed- zionale delle Ricerche, Pisa, Italy. ings of the International Conference on Compu- MARTELLI, A., AND MONTANARI, U. 1977. Theorem tational Linguistics. North-Holland, Amsterdam, proving with structure sharing and efficient uni- New York. fication. In Proceedings of International Joint KAY, M. 1985a. Parsing in functional grammar. In Conference on Artificial Intelligence. (Also lnter- Natural Language Parsing, D. Dowty, L. Karttu- nal Rep. No. S-77-7, lstituto di Scienze nen, and A. Zwicky, Eds. Cambridge Univ. Press, dell’lnformazione, Univ. of Pisa, Italy). Cambridge. MARTELLI, A., AND MONTANARI, U. 1982. An effi- KAY, M. 1985b. Unification in grammar. In Natural cient unification algorithm. ACM Trans. Prog. Language Understanding and Logic Program- Lang. Syst. 4. ming, V. Dahl and P. Saint-Dizier, Eds. Elsevier, MATSUMOTO, Y., AND SUGIMURA, R. 1987. A pars- North-Holland, New York. ing system based on logic programming. In Pro-

ACM Computing Surveys, Vol. 21, No. 1, March 1989 122 ’ Kevin Knight

ceedings of the International Joint Conference on PEREIRA, F. C. N. 1981. Extraposition grammars. Artificial Intelligence. Comput. Linguist. 7. MATSUMOTO, Y., TANAKA, H., HIRAKAWA, H., PEREIRA, F. C. N. 1985. A structure-sharing repre- MIYOSHI, H., AND YASUKAWA, H. 1983. BUP: sentation for unification-based grammar formal- A bottom-up parser embedded in Prolog. Near isms. In Proceedings of the 23rd Annual Meeting Generation Comput. I, pp. 145-158. of the Association for Computational Linguistics. MATWIN, S., AND PIETRZYKOWSKI, T. 1982. PEREIRA, F. C. N. 1987. Grammars and of Exponential improvement of exhuastive back- partial information. In Proceedings of the 4th tracking: Data structure and implementation. In International Conference on Logic Programming. Proceedings of the 6th International Conference Springer-Verlag, New York. on Automated Deduction. Springer-Verlag, New PEREIRA, F. C. N., AND SHIEBER, S. 1984. The se- York. mantics of grammar formalisms seen as computer MCCORD, M. C. 1980. Slot grammars. Comput. languages. In Proceedings of the International Linguist. 6, pp. 31-43. Conference on Computational Linguistics. MCCORD, M. C. 1985. Modular logic grammars. In PEREIRA, F. C. N., AND WARREN, D. H. D. 1980. Proceedings of the 23rd Annual Meeting of the Definite clause grammars for language analysis- Association for Computational Linguistics. a survey of the formalism and a comparison with MESEGUER, J., GOGUEN, J. A., AND SMOLKA, G. augmented transition networks. Artif. Zntell. 13, 1987. Order-sorted unification. Tech. Rep. pp. 231-278. CSLI-87-86, Center for the Study of Language PIETRZYKOWSKI,T. 1973. A complete mechaniza- and Information. tion of second-order type theory. J. ACM, 20, pp. MILLER, D. A., AND NADATHUR, G. 1987. Some use 333-365. (Also Univ. of Waterloo, Research Rep., of higher-order logic in computational linguistics. CSRR 2038,197l.) In Proceedings of the 25rd Annual Meeting of the PIETRZYKOWSKI,T., AND JENSEN, D. 1972. A com- Association for Computational Linguistics. plete mechanization of w-order logic. Research MILNER, R. 1978. A theory of type polymorphism Rep. CSRR 2060, Univ. of Waterloo, Waterloo, in programming. J. Comput. Syst. Sci. 17, pp. Ontario, Canada. 348-375. PIETRZYKOWSKI, T., AND JENSEN, D. 1973. MINSKY, M. 1975. A framework for representing Mechanizing w-order type theory through unifi- knowledge. In The Psychology of Computer cation. Tech. Rep. CS-73-16, Univ. of Waterloo, Vision, P. Winston, Ed. McGraw-Hill, New York. Waterloo, Ontario, Canada. MITCHELL, T. 1979. Version spaces: An approach to PIETRZYKOWSKI, T., AND MATWIN, S. 1982. concept learning, Ph.D. dissertation, Stanford Exponential improvement of efficient backtrack- Univ.: Stanford,Calif. ing: A strategy for plan based deduction. In Pro- MOSHIER, D., AND ROUNDS,W. C. 1987. A logic for ceedings of the 6th Znternatianal Conference on partially specified data structures. In ACM Sym- Automated Deduction. Springer-Verlag, New posium on Principles of Programming Languages. York. ACM, New York. PLAISTED, D. 1984. The occur-check problem in MUKAI, K. 1983. A unification algorithm for infinite Prolog. New Generation Comput. 2, pp. 309-322. trees. In Proceedings of the International Joint PLOTKIN, G. 1970. A note on inductive generaliza- Conference on Artificial Intelligence. tion. Mach. Intell. 5. MUKAI, K. 1985a. Horn clause logic with parameter- PLOTKIN. G. 1972. Building-in equational theories. ized types for situation semantics programming. Mach. Zntell. 7. Tech. Rep. TR-101, ICOT. POLLARD, C. 1985. Phrase structure grammar with- MUKAI, K. 198513. Unification over complex indeter- out metarules. In Proceedings of the 4th West minates in Prolog. Tech. Rep. TR-113, ICOT. Coast Conference on Formal Linguistics. MUKAI, K., AND YASUKAWA, H. 1985. Complex in- REYLE, U., AND FREY, W. 1983. A PROLOG imple- determinates in Prolog and its application to mentation of lexical functional grammar. In Pro- discourse models. New Generation Comput. 3, pp. ceedings of the International Joint Conference on 441-466. Artificial Intelligence. NILSSON, N. J. 1980. Principles of Artificial Zntelli- REYNOLDS, J. C. 1970. Transformational systems gence. Tioga, Palo Alto. and the of atomic formulas. PATERSON, M. S., AND WEGMAN, M. N. 1976. Mach. Zntell. 5. Linear unification. In Proceedings of the Sympo- RISTAD, E. S. 1987. Revised generalized phrase sium on the Theory of Computing. ACM Special structure grammar. In Proceedings of the 25th Interest Group for Automata and Computability Annual Meeting of the Association for Computa- Theory (SIGACT). ticmal Linguistics. PATERSON, M. S., AND WECMAN, M. N. 1978. ROBINSON, J. A. 1965. A machine-oriented logic Linear unification. J. Comput. Syst. Sci. 16, pp. based on the resolution principle. J. ACM 12, pp. 158-167. 23-41.

ACM ComputingSurveys, Vol. 21, No. 1, March 1989 Unification: A Multidisciplinary Survey l 123

ROBINSON, J. A. 1968. New directions in mechanical fication, C. Kirchner, Ed., 1988.) Springer-Verlag, theorem proving. In Proceedings of the Znterna- New York. tional Federation of Information Processing SIEKMANN, J. 1986. Unification theory. In Proceed- Congress. ings of the 7th European Conference on Artificial ROBINSON, J. A. 1969. Mechanizing higher-order Intelligence. Sponsored by the European Coordi- logic. Mach. Zntell. 4. nating Committee for Artificial Intelligence. ROBINSON, J. A. 1970. A note on mechanizing higher SMOLKA, G., AND AiT-KACI, H. 1987. Inheritance order logic. Mach. Zntell. 5. hierarchies: Semantics and unification. Tech. ROBINSON, J. A. 1971. Computational logic: The Rep. AI-057-87, Microelectronics and Computer unification computation. Mach. Zntell. 6. Technology Corporation (MCC), Austin, Texas. ROBINSON, P. 1985. The SUM: An AI coprocessor. (To appear in the J. Symbolic Comput. special Byte 10. issue on unification, C. Kirchner, Ed., 1988.) ROUNDS, W. C., AND KASPER, R. 1986. A complete STABLER, E. P. 1983. Deterministic and bottom-up logical calculus for record structures representing parsing in Prolog. In Proceedings of the Confer- linguistic information. In Proceedings of the 1st ence of the American Association for Artificial Symposium on Logic in Computer Science. Co- Intelligence. sponsored by the IEEE Computer Society, Tech- STEELE, G. 1984. Common LISP: The Language. nical Committee on Mathematical Foundations Digital Press, Bedford, Mass. of Computing; ACM Special Interest Group for STICKEL, M. 1978. Mechanical theorem proving and Automata and Computability Theory (SIGACT); artificial intelligence languages, Ph.D. disserta- Association for Symbolic Logic; European Asso- tion, Computer Science Dept., Carnegie-Mellon ciation for Theoretical Computer Science. Univ., Pittsburgh, Pa. ROUNDS, W. C., AND MANASTER-RAMER, A. 1987. A TOMITA, M. 1987. An efficient augmented-context- logical version of functional grammar. In Pro- free parsing algorithm. Comput. Linguist. 23, pp. ceedings of the 25th Annual Meeting of the Asso- 31-46. ciation for Computational Linguistics. TOMITA, M. 1985a. An efficient context-free parsing ROUSSEL, P. 1975. PROLOG, Manuel de reference algorithm for natural languages. In Proceedings et d’utilisation. Groupe Intelligence Artificielle, of the International Joint Conference on Artificial Universite Aix-Marseille II. Intelligence. SAG, I. A., AND POLLARD, C. 1987. Head-driven TOMITA, M. 1985b. An efficient context-free parsing phrase structure grammar: An informal synopsis. algorithm for natural languages and its applica- Tech. Rep. CSLI-87-79, Center for the Study of tions, Ph.D. dissertation, Computer Science Language and Information. Dept., Carnegie-Mellon Univ., Pittsburgh, Pa. SCHMIDT-SCHAUSS, M. 1986. Unification under asso- TOMITA, M., AND KNIGHT, K. 1988. Full unification ciativity and idempotence is of type nullary. and pseudo unification. Tech. Rep. CMU-CMT- J. Autom. Reasoning 2, pp. 277-281. 87-MEMO. Center for Machine Translation, SELLS, P. 1985. Lectures on Contemporary Syntactic Carnegie-Mellon Univ., Pittsburgh, Pa. Theories. Univ. of Chicago, Chicago, Ill. Also, TRUM, P., AND WINTERSTEIN, G. 1978. Description, CSLI Lecture Notes Series. implementation, and practical comparison of uni- SHAPIRO, E. 1983. A subset of concurrent Prolog and fication algorithms. Internal Rep. No. 6/78, Fach- its interpreter. Tech. Rep. TR-003, ICOT. bereich Informatik, Universitat Kaiserlautern, SHIEBER, S. 1984. The design of a computer language W. Germany. for linguistic information. In Proceedings of USZKOREIT, H. 1986. Categorial unification gram- the International Conference on Computationk mars. In Proceedings of the International Confer- Linguistics. North-Holland, Amsterdam, New ence on Computational Linguistics. (Also CLSI York. Tech. Rep. CSLI-86-66.) North-Holland, Amster- SHIEBER, S. 1985. Using restriction to extend pars- dam, New York. ing algorithms for feature-based formalisms. In VAN EMDEN, M. H., AND KOWALSKI, R. A. 1976. The Proceedings of the 23rd Annual Meeting of the semantics of predicate logic as a programming Association for Computational Linguistics. Spon- language. J. ACM 23, pp. 733-742. sored by International Joint Conference on Ar- VENTURINI-ZILLI, M. 1975. Complexity of the uni- tificial Intelligence. fication algorithm for first-order expressions. SHIEBER, S. 1986. An Zntroduction to Unification- Res. Ren. Consialio Nazionale Delle Ricerche Based Approaches to Grammar. CSLI Lecture Istituto -per le applicazioni de1 calcolo, Rome, Notes Series, Center for the study of Language Italy. and Information, Stanford, California. VITTER, J. S., AND SIMONS, R. A. 1984. Parallel SIEKMANN, J. 1984. Universal unification. In Pro- algorithms for unification and other complete ceedings of the 7th International Conference on problems in P. In ACM ‘84 Conference Proceed- Automated Deduction. (Newer version to appear ings (San Francisco, Calif., Oct. S-10). ACM, New in the J. Symbolic Comput. special issue on uni- York.

ACM Computing Surveys, Vol. 21, No. 1, March 1989 124 l Kevin Knight

VITTER, J. S., AND SIMONS, R. A. 1986. New classes WOODS, W. 1970. Transition network grammars for for parallel complexity: A study of unification natural language analysis. Commun. ACM 13, pp. and other complete problems for P. IEEE Trans. 591-606. Comput. C-35. WROBLEWSKI, D. 1987. Nondestructive graph uni- WALTHER, C. 1986. A classification of many-sorted fication. In Proceedings of the Conference on the unification problems. In Proceedings of the 8th American Association for Artificial Intelligence. International Conference on Automated Deduc- YASUURA, H. 1984. On the parallel computational tion. Springer-Verlag, New York. complexity of unification. In Fifth Generation WARREN, D. H. D., PEREIRA, F. C. N., AND PEREIRA, Computer Systems. Sponsored by. the Institute L. M. 1977. Prolog-The language and its im- for New Generation Computer Technology plementation compared with LISP. In Sympo- (ICOT). (Also Yajima Labs, Tech. Rep. ER-83- sium on Artificial Intelligence and Programming 01, 1983.) Systems. Co-sponsored by ACM Special Interest YOKOI, T., MUKAI, K., MIYOSHI, H., TANAKA, Y., Group on Artificial Intelligence (SIGACI), ACM AND SUGIMURA, R. 1986. Research activities on Special Interest Group on Programming Lan- natural language processing of the FGCS project. guages (SIGPL). ZCOT J. 14, pp. 1-8. WINTERSTEIN, G. 1976. Unification in second order ZEEVAT, H., KLEIN, E., AND CALDER, J. 1987. logic. Res. Rep. Fachbereich Informatik, Univer- Unification categorial grammar. In Categorial sitat Kaiserslautern, W. Germany. Grammar, Unification Grammar, and Parsing, N. WITTENBURG, K. 1986. Natural language parsing Haddock. E. Klein. and G. Merrill. Eds. Centre with combinatory categorial grammar in a graph- for Cognitive Science, University df Edinburgh, unification-based formalism. Ph.D. dissertation. Edinburgh, Scotland. Univ. of Texas, Austin, Texas.

Received January 1988; final revision accepted July 1988.

ACM Computing Surveys, Vol. 21, No. 1, March 1989