MODULAR LOGIC GRAMMARS

Michael C. McCord IBM Thomas J. Watson Research Center P. O. Box 218 Yorktown Heights, NY 10598

ABSTRACT

This report describes a logic grammar formalism, Modular Logic Grammars, exhibiting a high degree of modularity between syntax and semantics. There is a syntax rule compiler (compiling into Prolog) which takes care of the building of analysis structures and the interface to a clearly separated semantic interpretation component dealing with scoping and the construction of logical forms. The whole system can work in either a one-pass mode or a two-pass mode. In the one-pass mode, logical forms are built directly during parsing through interleaved calls to semantics, added automatically by the rule compiler. In the two-pass mode, syntactic analysis trees are built automatically in the first pass, and then given to the (one-pass) semantic component. The grammar formalism includes two devices which cause the automatically built syntactic structures to differ from derivation trees in two ways: (1) There is a shift operator, for dealing with left-embedding constructions such as English possessive noun phrases while using right-recursive rules (which are appropriate for Prolog parsing). (2) There is a distinction in the syntactic formalism between strong non-terminals and weak non-terminals, which is important for distinguishing major levels of grammar.

I. INTRODUCTION

The term logic grammar will be used here, in the context of natural language processing, to mean a logic programming system (implemented normally in Prolog), which associates semantic representations (normally in some version of predicate logic) with natural language text. Logic grammars may have varying degrees of modularity in their treatments of syntax and semantics. There may or may not be an isolatable syntactic component.

In writing metamorphosis grammars (Colmerauer, 1978), or definite clause grammars, DCG's (a special case of metamorphosis grammars, Pereira and Warren, 1980), it is possible to build logical forms directly in the syntax rules by letting non-terminals have arguments that represent partial logical forms being manipulated. Some of the earliest logic grammars (e.g., Dahl, 1977) used this approach. There is certainly an appeal in being direct, but there are some disadvantages in this lack of modularity. One disadvantage is that it seems difficult to get an adequate treatment of the scoping of quantifiers (and more generally focalizers, McCord, 1981) when the building of logical forms is too closely bonded to syntax. Another disadvantage is just a general result of lack of modularity: it can be harder to develop and understand syntax rules when too much is going on in them.

The logic grammars described in McCord (1982, 1981) were three-pass systems, where one of the main points of the modularity was a good treatment of scoping. The first pass was the syntactic component, written as a definite clause grammar, where syntactic structures were explicitly built up in the arguments of the non-terminals. Word sense selection and slot-filling were done in this first pass, so that the output analysis trees were actually partially semantic. The second pass was a preliminary stage of semantic interpretation in which the syntactic analysis tree was reshaped to reflect proper scoping of modifiers. The third pass took the reshaped tree and produced logical forms in a straightforward way by carrying out modification of nodes by their daughters, using a modular system of rules that manipulate semantic items -- consisting of logical forms together with terms that determine how they can combine.

The CHAT-80 system (Pereira and Warren, 1982, Pereira, 1983) is a three-pass system. The first pass is a purely syntactic component using an extraposition grammar (Pereira, 1981) and producing syntactic analyses in rightmost normal form. The second pass handles word sense selection and slot-filling, and the third pass handles some scoping phenomena and the final semantic interpretation. One gets a great deal of modularity between syntax and semantics in that the first component has no elements of semantic interpretation at all.

In McCord (1984) a one-pass semantic interpretation component, SEM, for the EPISTLE system (Miller, Heidorn and Jensen, 1981) was described. SEM has been interfaced both to the EPISTLE NLP grammar (Heidorn, 1972, Jensen and Heidorn, 1983), as well as to a logic grammar, SYNT, written as a DCG by the author. These grammars are purely syntactic and use the EPISTLE notion (op. cit.) of approximate parse, which is similar to Pereira's notion of rightmost normal form, but was developed independently. Thus SYNT/SEM is a two-pass system with a clear modularity between syntax and semantics.

In DCG's and extraposition grammars, the building of analysis structures (either logical forms or syntactic trees) must be specified explicitly in the syntax rules. A certain amount of modularity is then lost, because the grammar writer must be aware of manipulating these structures, and the possibility of using the grammar in different ways is reduced. In Dahl and McCord (1983), a logic grammar formalism was described, modifier structure grammars (MSG's), in which structure-building (of annotated derivation trees) is implicit in the formalism. MSG's look formally like extraposition grammars, with the additional ingredient that semantic items (of the type used in McCord (1981)) can be indicated on the left-hand sides of rules, and contribute automatically to the construction of a syntactico-semantic tree much like that in McCord (1981). These MSG's were used interpretively in parsing, and then (essentially) the two-pass semantic interpretation system of McCord (1981) was used to get logical forms. So, totally there were three passes in this system.

In this report, I wish to describe a logic grammar system, modular logic grammars (MLG's), with the following features:

There is a syntax rule compiler which takes care of the building of analysis structures and the interface to semantic interpretation.

There is a clearly separated semantic interpretation component dealing with scoping and the construction of logical forms.

The whole system (syntax and semantics) can work optionally in either a one-pass mode or a two-pass mode.

In the one-pass mode, no syntactic structures are built, but logical forms are built directly during parsing through interleaved calls to the semantic interpretation component, added automatically by the rule compiler.

In the two-pass mode, the calls to the semantic interpretation component are not interleaved, but are made in a second pass, operating on syntactic analysis trees produced (automatically) in the first pass.

The syntactic formalism includes a device, called the shift operator, for dealing with left-embedding constructions such as English possessive noun phrases ("my wife's brother's friend's car") and Japanese relative clauses. The shift operator instructs the rule compiler to build the structures appropriate for left-embedding. These structures are not derivation trees, because the syntax rules are right-recursive, because of the top-down parsing associated with Prolog.

There is a distinction in the syntactic formalism between strong non-terminals and weak non-terminals, which is important for distinguishing major levels of grammar and which simplifies the working of semantic interpretation. This distinction also makes the (automatically produced) syntactic analysis trees much more readable and natural linguistically. In the absence of shift constructions, these trees are like derivation trees, but only with nodes corresponding to strong non-terminals.

In an experimental MLG, the semantic component handles all the scoping phenomena handled by that in McCord (1981) and more than the semantic component in McCord (1984). The logical form language is improved over that in the previous systems.

The MLG formalism allows for a great deal of modularity in natural language grammars, because the syntax rules can be written with very little awareness of semantics or the building of analysis structures, and the very same syntactic component can be used in either the one-pass or the two-pass mode described above.

Three other logic grammar systems designed with modularity in mind are Hirschman and Puder (1982), Abramson (1984) and Porto and Filgueiras (1984). These will be compared with MLG's in Section 6.

2. THE MLG SYNTACTIC FORMALISM

The syntactic component for an MLG consists of a declaration of the strong non-terminals, followed by a sequence of MLG syntax rules. The declaration of strong non-terminals is of the form

   strongnonterminals(NT1.NT2. ... .NTn.nil).

where the NTi are the desired strong non-terminals (only their principal functors are indicated). Non-terminals that are not declared strong are called weak. The significance of the strong/weak distinction will be explained below.

MLG syntax rules are of the form

   A ==> B

where A is a non-terminal and B is a rule body. A rule body is any combination of surface terminals, logical terminals, goals, shifted non-terminals, non-terminals, the symbol 'nil', and the cut symbol '/', using the sequencing operator ':' and the 'or' symbol '|'. (We represent left-to-right sequencing with a colon instead of a comma, as is often done in logic grammars.) These rule body elements are Prolog terms (normally with arguments), and they are distinguished formally as follows.

A surface terminal is of the form +A, where A is any Prolog term. Surface terminals correspond to ordinary terminals in DCG's (they match elements of the surface word string), and the notation is often [A] in DCG's.

A logical terminal is of the form Op-LF, where Op is a modification operator and LF is a logical form. Logical terminals are special cases of semantic items, the significance of which will be explained below. Formally, the rule compiler recognizes them as being terms of the form A-B. There can be any number of them in a rule body.

A goal is of the form $A, where A is a term representing a Prolog goal. (This is the usual provision for Prolog procedure calls, which are often indicated by enclosure in braces in DCG's.)

A shifted non-terminal is either of the form %A, or of the form F%A, where A is a weak non-terminal and F is any term. (In practice, F will be a list of features.) As indicated in the introduction, the shift operator '%' is used to handle left-embedding constructions in a right-recursive rule system.

Any rule body element not of the above four forms and not 'nil' or the cut symbol is taken to be a non-terminal.

A terminal is either a surface terminal or a logical terminal. Surface terminals are building blocks for the word string being analyzed, and logical terminals are building blocks for the analysis structures.

A syntax rule is called strong or weak, according as the non-terminal on its left-hand side is strong or weak.

It can be seen that on a purely formal level, the only differences between MLG syntax rules and DCG's are (1) the appearance of logical terminals in rule bodies of MLG's, (2) the use of the shift operator, and (3) the distinction between strong and weak non-terminals. However, for a given linguistic coverage, the syntactic component of an MLG will normally be more compact than the corresponding DCG because structure-building must be explicit in DCG's. In this report, the arrow '-->' (as opposed to '==>') will be used for DCG rules, and the same notation for sequencing, terminals, etc., will be used for DCG's as for MLG's.

What is the significance of the strong/weak distinction for non-terminals and rules? Roughly, a strong rule should be thought of as introducing a new level of grammar, whereas a weak rule defines analysis within a level. Major categories like sentence and noun phrase are expanded by strong rules, but auxiliary rules like the recursive rules that find the postmodifiers of a verb are weak rules. An analogy with ATN's (Woods, 1970) is that strong non-terminals are like the start categories of subnetworks (with structure-building POP arcs for termination), whereas weak non-terminals are like internal nodes.

In the one-pass mode, the MLG rule compiler makes the following distinction for strong and weak rules. In the Horn clause translation of a strong rule, a call to the semantic interpretation component is compiled in at the end of the clause. The non-terminals appearing in rules (both strong and weak) are given extra arguments which manipulate semantic structures used in the call to semantic interpretation. No such call to semantics is compiled in for weak rules. Weak rules only gather information to be used in the call to semantics made by the next higher strong rule. (Also, a shift generates a call to semantics.)

In the two-pass mode, where syntactic analysis trees are built during the first pass, the rule compiler builds in the construction of a tree node corresponding to every strong rule. The node is labeled essentially by the non-terminal appearing on the left-hand side of the strong rule. (A shift also generates the construction of a tree node.) Details of rule compilation will be given in the next section.

As indicated above, logical terminals, and more generally semantic items, are of the form

   Operator-LogicalForm.

The Operator is a term which determines how the semantic item can combine with other semantic items during semantic interpretation. (In this combination, new semantic items are formed which are no longer logical terminals.) Logical terminals are most typically associated with lexical items, although they are also used to produce certain non-lexical ingredients in logical form analysis. An example for the lexical item "each" might be

   Q/P - each(P,Q).

Here the operator Q/P is such that when the "each" item modifies, say, an item having logical form man(X), P gets unified with man(X), and the resulting semantic item is

   @Q - each(man(X),Q)

where @Q is an operator which causes Q to get unified with the logical form of a further modificand. Details of the use of semantic items will be given in Section 4.

Now let us look at the syntactic component of a sample MLG which covers the same ground as a well-known DCG. The following DCG is taken essentially from Pereira and Warren (1980). It is the sort of DCG that builds logical forms directly by manipulating partial logical forms in arguments of the grammar symbols.

   sent(P) --> np(X,P1,P): vp(X,P1).
   np(X,P1,P) --> det(P2,P1,P): noun(X,P3): relclause(X,P3,P2).
   np(X,P,P) --> name(X).
   vp(X,P) --> transverb(X,Y,P1): np(Y,P1,P).
   vp(X,P) --> intransverb(X,P).
   relclause(X,P1,P1&P2) --> +that: vp(X,P2).
   relclause(*,P,P) --> nil.
   det(P1,P2,P) --> +D: $dt(D,P1,P2,P).
   noun(X,P) --> +N: $n(N,X,P).
   name(X) --> +X: $nm(X).
   transverb(X,Y,P) --> +V: $tv(V,X,Y,P).
   intransverb(X,P) --> +V: $iv(V,X,P).

   /* Lexicon */

   n(man,X,man(X)).   n(woman,X,woman(X)).
   nm(john).   nm(mary).
   dt(every,P1,P2,all(P1,P2)).
   dt(a,P1,P2,ex(P1,P2)).
   tv(loves,X,Y,love(X,Y)).
   iv(lives,X,live(X)).

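For comparison with the MLG compilation scheme described in Section 3, it may help to recall how DCG rules of this kind are conventionally translated into Prolog clauses. The sketch below is not part of the paper's system; it is the standard difference-list translation, written in Edinburgh Prolog syntax (':-' and ',' in place of the '<-' and '&' used later in this report, and [Head|Tail] lists), with the ':' sequencing of the rules above rendered as ordinary conjunction.

   :- op(600, xfy, &).    % '&' is used here only to build logical forms like P1&P2

   % sent(P) --> np(X,P1,P): vp(X,P1).
   % Each non-terminal gets two extra arguments forming a difference list
   % (S0,S) for the portion of the word string it analyzes.
   sent(P, S0, S) :-
       np(X, P1, P, S0, S1),
       vp(X, P1, S1, S).

   % relclause(X,P1,P1&P2) --> +that: vp(X,P2).
   % A surface terminal such as +that consumes one word of the string.
   relclause(X, P1, P1&P2, [that|S0], S) :-
       vp(X, P2, S0, S).

   % det(P1,P2,P) --> +D: $dt(D,P1,P2,P).
   % A goal marked with '$' becomes a plain Prolog call.
   det(P1, P2, P, [D|S], S) :-
       dt(D, P1, P2, P).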
The syntactic component of an analogous MLG is as follows. The lexicon is exactly the same as that of the preceding DCG. For reference below, this grammar will be called MLGRAM.

   strongnonterminals(sent.np.relclause.det.nil).

   sent ==> np(X): vp(X).
   np(X) ==> det: noun(X): relclause(X).
   np(X) ==> name(X).
   vp(X) ==> transverb(X,Y): np(Y).
   vp(X) ==> intransverb(X).
   relclause(X) ==> +that: vp(X).
   relclause(*) ==> nil.
   det ==> +D: $dt(D,P1,P2,P): P2/P1-P.
   noun(X) ==> +N: $n(N,X,P): l-P.
   name(X) ==> +X: $nm(X).
   transverb(X,Y) ==> +V: $tv(V,X,Y,P): l-P.
   intransverb(X) ==> +V: $iv(V,X,P): l-P.

This small grammar illustrates all the ingredients of MLG syntax rules except the shift operator. The shift will be illustrated below. Note that 'sent' and 'np' are strong categories but 'vp' is weak. A result is that there will be no call to semantics at the end of the 'vp' rule. Instead, the semantic structures associated with the verb and object are passed up to the 'sent' level, so that the subject and object are "thrown into the same pot" for semantic combination. (However, their surface order is not forgotten.)

There are only two types of modification operators appearing in the semantic items of this MLG: 'l' and P2/P1. The operator 'l' means 'left-conjoin'. Its effect is to left-conjoin its associated logical form to the logical form of the modificand (although its use in this small grammar is almost trivial). The operator P2/P1 is associated with determiners, and its effect has been illustrated above.

The semantic component will be given below in Section 4. A sample semantic analysis for the sentence "Every man that lives loves a woman" is

   all(man(X1)&live(X1), ex(woman(X2),love(X1,X2))).

This is the same as for the above DCG. We will also show a sample parse in the next section.

A fragment of an MLG illustrating the use of the shift in the treatment of possessive noun phrases is as follows:

   np ==> det: npl.
   npl ==> premods: noun: np2.
   np2 ==> postmods.
   np2 ==> poss: %npl.

The idea of this fragment can be described in a rough procedural way, as follows. In parsing an np, one reads an ordinary determiner (det), then goes to npl. In npl, one reads several premodifiers (premods), say adjectives, then a head noun, then goes to np2. In np2, one may either finish by reading postmodifiers (postmods), or one may read an apostrophe-s (poss) and then shift back to npl. Illustration for the noun phrase "the old man's dusty hat":

   the          old      man    's
   np det npl   premods  noun   np2 poss %npl

   dusty        hat      (nil)
   premods      noun     np2 postmods

When the shift is encountered, the syntactic structures (in the two-pass mode) are manipulated (in the compiled rules) so that the initial np ("the old man") becomes a left-embedded sub-structure of the larger np (whose head is "hat"). But if no apostrophe-s is encountered, then the structure for "the old man" remains on the top level.

3. COMPILATION OF MLG SYNTAX RULES

In describing rule compilation, we will first look at the two-pass mode, where syntactic structures are built in the first pass, because the relationship of the analysis structures to the syntax rules is more direct in this case.

The syntactic structures manipulated by the compiled rules are represented as syntactic items, which are terms of the form

   syn(Features,Daughters)

where Features is a feature list (to be defined), and Daughters is a list consisting of syntactic items and terminals. Both types of terminal (surface and logical) are included in Daughters, but the displaying procedures for syntactic structures can optionally filter out one or the other of the two types. A feature list is of the form nt:Arg1, where nt is the principal functor of a strong non-terminal and Arg1 is its first argument. (If nt has no arguments, we take Arg1=nil.) It is convenient, in large grammars, to use this first argument Arg1 to hold a list (based on the operator ':') of grammatical features of the phrase analyzed by the non-terminal (like number and person for noun phrases).

In compiling DCG rules into Prolog clauses, each non-terminal gets two extra arguments treated as a difference list representing the word string analyzed by the non-terminal. In compiling MLG rules, exactly the same thing is done to handle word strings.

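As a small illustration of these definitions (constructed here, not displayed in the paper), the determiner node that MLGRAM builds for the word "every" -- via the compiled 'det' rule shown later in this section together with the lexical entry dt(every,P1,P2,all(P1,P2)) -- is the syntactic item

   syn(det:nil, +every : (P2/P1-all(P1,P2)) : nil)

Its feature list is det:nil, since the strong non-terminal 'det' has no arguments, and its daughter list contains one surface terminal and one logical terminal. In a larger grammar, a noun phrase node might instead carry a feature list such as np:(sg:p3:nil), a hypothetical example of using the first argument to record number and person in the way just suggested.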
For handling syntactic structures, the MLG rule compiler adds additional arguments which manipulate 'syn' structures. The number of additional arguments and the way they are used depend on whether the non-terminal is strong or weak. If the original non-terminal is strong and has the form

   nt(X1, ..., Xn)

then in the compiled version we will have

   nt(X1, ..., Xn, Syn, Str1,Str2).

Here there is a single syntactic structure argument, Syn, representing the syntactic structure of the phrase associated by nt with the word string given by the difference list (Str1, Str2).

On the other hand, when the non-terminal nt is weak, four syntactic structure arguments are added, producing a compiled predication of the form

   nt(X1, ..., Xn, Syn0,Syn, Mods1,Mods2, Str1,Str2).

Here the pair (Mods1, Mods2) holds a difference list for the sequence of structures analyzed by the weak non-terminal nt. These structures could be 'syn' structures or terminals, and they will be daughters (modifiers) for a 'syn' structure associated with the closest higher call to a strong non-terminal -- let us call this higher 'syn' structure the matrix 'syn' structure. The other pair (Syn0, Syn) represents the changing view of what the matrix 'syn' structure actually should be, a view that may change because a shift is encountered while satisfying nt. Syn0 represents the version before satisfying nt, and Syn represents the version after satisfying nt. If no shift is encountered while satisfying nt, then Syn will just equal Syn0. But if a shift is encountered, the old version Syn0 will become a daughter node in the new version Syn.

In compiling a rule with several non-terminals in the rule body, linked by the sequencing operator ':', the argument pairs (Syn0, Syn) and (Mods1, Mods2) for weak non-terminals are linked, respectively, across adjacent non-terminals in a manner similar to the linking of the difference lists for word-string arguments. Calls to strong non-terminals associate 'syn' structure elements with the modifier lists, just as surface terminals are associated with elements of the word-string lists.

Let us look now at the compilation of a set of rules. We will take the noun phrase grammar fragment illustrating the shift and shown above in Section 2, repeated for convenience here, together with declarations of strong non-terminals.

   strongnonterminals(np.det.noun.poss.nil).

   np ==> det: npl.
   npl ==> premods: noun: np2.
   np2 ==> postmods.
   np2 ==> poss: %npl.

The compiled rules are as follows:

   np(Syn, Str1,Str3) <-
      det(Mod, Str1,Str2) &
      npl(syn(np:nil,Mod:Mods),Syn, Mods,nil, Str2,Str3).

   npl(Syn1,Syn3, Mods1,Mods3, Str1,Str4) <-
      premods(Syn1,Syn2, Mods1,Mod:Mods2, Str1,Str2) &
      noun(Mod, Str2,Str3) &
      np2(Syn2,Syn3, Mods2,Mods3, Str3,Str4).

   np2(Syn1,Syn2, Mods1,Mods2, Str1,Str2) <-
      postmods(Syn1,Syn2, Mods1,Mods2, Str1,Str2).

   np2(syn(Feas,Mods0),Syn, Mod:Mods1,Mods1, Str1,Str3) <-
      poss(Mod, Str1,Str2) &
      npl(syn(Feas,syn(Feas,Mods0):Mods2),Syn, Mods2,nil, Str2,Str3).

In the first compiled rule, the structure Syn to be associated with the call to 'np' appears again in the second matrix structure argument of 'npl'. The first matrix structure argument of 'npl' is

   syn(np:nil,Mod:Mods)

and this will turn out to be the value of Syn if no shifts are encountered. Here Mod is the 'syn' structure associated with the determiner 'det', and Mods is the list of modifiers determined further by 'npl'. The feature list np:nil is constructed from the leading non-terminal 'np' of this strong rule. (It would have been np:Arg1 if np had a (first) argument Arg1.)

In the second and third compiled rules, the matrix structure pairs (first two arguments) and the modifier difference list pairs are linked in a straightforward way to reflect sequencing.

The fourth rule shows the effect of the shift. Here syn(Feas,Mods0), the previous "conjecture" for the matrix structure, is now made simply the first modifier in the larger structure

   syn(Feas,syn(Feas,Mods0):Mods2)

which becomes the new "conjecture" by being placed in the first argument of the further call to 'npl'. If the shift operator had been used in its binary form F0%npl, then the new conjecture would be

   syn(NT:F,syn(NT:F0,Mods0):Mods2)

where the old conjecture was syn(NT:F,Mods0). In larger grammars, this allows one to have a completely correct feature list NT:F0 for the left-embedded modifier.

To illustrate the compilation of terminal symbols, let us look at the rule

   det ==> +D: $dt(D,P1,P2,P): P2/P1-P.

from the grammar MLGRAM in Section 2. The compiled rule is

   det(syn(det:nil,+D:P2/P1-P:nil), D.Str,Str) <-
      dt(D,P1,P2,P).

Note that both the surface terminal +D and the logical terminal P2/P1-P are entered as modifiers of the 'det' node. The semantic interpretation component looks only at the logical terminals, but in certain applications it is useful to be able to see the surface terminals in the syntactic structures. As mentioned above, the display procedures for syntactic structures can optionally show only one type of terminal.

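To see how such a compiled clause behaves, here is a small executable sketch (not from the paper) of the 'det' clause above together with two lexical entries, rewritten in Edinburgh Prolog syntax: ':-' replaces '<-', ',' replaces '&', and both the word string (written D.Str above) and the daughter list (written with ':') are rendered as ordinary Prolog lists.

   % Hypothetical rendering of the compiled MLGRAM rule
   %    det(syn(det:nil,+D:P2/P1-P:nil), D.Str,Str) <- dt(D,P1,P2,P).
   det(syn(det:nil, [+D, P2/P1-P]), [D|Str], Str) :-
       dt(D, P1, P2, P).

   dt(every, P1, P2, all(P1,P2)).
   dt(a,     P1, P2, ex(P1,P2)).

   % A sample query then yields the determiner's 'syn' node and the
   % remaining word string:
   %    ?- det(Syn, [every,man,lives], Rest).
   %    Syn  = syn(det:nil, [+every, P2/P1-all(P1,P2)]),
   %    Rest = [man,lives].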
The display of the syntactic structure of the sentence "Every man loves a woman" produced by MLGRAM is as follows.

   sentence:nil
      np:X1
         det:nil
            X2/X3-all(X3,X2)
         l-man(X1)
      l-love(X1,X4)
      np:X4
         det:nil
            X5/X6-ex(X6,X5)
         l-woman(X4)

Note that no 'vp' node is shown in the tree; 'vp' is a weak non-terminal. The logical form produced for this tree by the semantic component given in the next section is

   all(man(X1), ex(woman(X2),love(X1,X2))).

Now let us look at the compilation of syntax rules for the one-pass mode. In this mode, syntactic structures are not built, but semantic structures are built up directly. The rule compiler adds extra arguments to non-terminals for manipulation of semantic structures, and adds calls to the top-level semantic interpretation procedure, 'semant'.

The procedure 'semant' builds complex semantic structures out of simpler ones, where the original building blocks are the logical terminals appearing in the MLG syntax rules. In this process of construction, it would be possible to work with semantic items (and in fact a subsystem of the rules does work directly with semantic items), but it appears to be more efficient to work with slightly more elaborate structures which we call augmented semantic items. These are terms of the form

   sem(Feas,Op,LF)

where Op and LF are such that Op-LF is an ordinary semantic item, and Feas is either a feature list or the list terminal:nil. The latter form is used for the initial augmented semantic items associated with logical terminals.

As in the two-pass mode, the number of analysis structure arguments added to a non-terminal by the compiler depends on whether the non-terminal is strong or weak. If the original non-terminal is strong and has the form

   nt(X1, ..., Xn)

then in the compiled version we will have

   nt(X1, ..., Xn, Sems1,Sems2, Str1,Str2).

Here (Sems1, Sems2) is a difference list of augmented semantic items representing the list of semantic structures for the phrase associated by nt with the word string given by the difference list (Str1, Str2). In the syntactic (two-pass) mode, only one argument (for a 'syn') is needed here, but now we need a list of structures because of a raising phenomenon necessary for proper scoping, which we will discuss in Sections 4 and 5.

When the non-terminal nt is weak, five extra arguments are added, producing a compiled predication of the form

   nt(X1, ..., Xn, Feas, Sems0,Sems, Sems1,Sems2, Str1,Str2).

Here Feas is the feature list for the matrix strong non-terminal. The pair (Sems0, Sems) represents the changing "conjecture" for the complete list of daughter (augmented) semantic items for the matrix node, and is analogous to the first extra argument pair in the two-pass mode. The pair (Sems1, Sems2) holds a difference list for the sequence of semantic items analyzed by the weak non-terminal nt. Sems1 will be a final sublist of Sems0, and Sems2 will of course be a final sublist of Sems1.

For each strong rule, a call to 'semant' is added at the end of the compiled form of the rule. The form of the call is

   semant(Feas, Sems, Sems1,Sems2).

Here Feas is the feature list for the non-terminal on the left-hand side of the rule. Sems is the final version of the list of daughter semantic items (after all adjustments for shifts), and (Sems1, Sems2) is the difference list of semantic items resulting from the semantic interpretation for this level. (Think of Feas and Sems as input to 'semant', and (Sems1, Sems2) as output.) (Sems1, Sems2) will be the structure arguments for the non-terminal on the left-hand side of the strong rule. A call to 'semant' is also generated when a shift is encountered, as we will see below. The actual working of 'semant' is the topic of the next section.

For the shift grammar fragment shown above, the compiled rules are as follows.

   np(Sems,Sems0, Str1,Str3) <-
      det(Sems1,Sems2, Str1,Str2) &
      npl(np:nil, Sems1,Sems3, Sems2,nil, Str2,Str3) &
      semant(np:nil, Sems3, Sems,Sems0).

   npl(Feas, Sems1,Sems3, Sems4,Sems7, Str1,Str4) <-
      premods(Feas, Sems1,Sems2, Sems4,Sems5, Str1,Str2) &
      noun(Sems5,Sems6, Str2,Str3) &
      np2(Feas, Sems2,Sems3, Sems6,Sems7, Str3,Str4).

   np2(Feas, Sems1,Sems2, Sems3,Sems4, Str1,Str2) <-
      postmods(Feas, Sems1,Sems2, Sems3,Sems4, Str1,Str2).

   np2(Feas, Sems1,Sems4, Sems5,Sems6, Str1,Str3) <-
      poss(Sems5,Sems6, Str1,Str2) &
      semant(Feas, Sems1, Sems2,Sems3) &
      npl(Feas, Sems2,Sems4, Sems3,nil, Str2,Str3).

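Before tracing these argument chains in detail, it may help to see what the augmented semantic items flowing through such arguments look like. The following is a constructed illustration, not a trace displayed in the paper. For the phrase "every man" analyzed with MLGRAM in one-pass mode, the item reaching the np level from the strong non-terminal 'det' carries the determiner's operator and logical form under the feature list det:nil, while 'noun', being weak in MLGRAM, passes its logical terminal up unchanged, so the list handed to the 'semant' call at the np level is, in the ':' notation used above, roughly

   sem(det:nil, X2/X3, all(X3,X2)) :      % from the logical terminal P2/P1-P of 'det'
   sem(terminal:nil, l, man(X1)) : nil    % from the logical terminal l-P of 'noun'

That call to 'semant', defined in the next section, combines these by the 'mod' rules into a single item whose operator has the @Q form illustrated earlier for "each", and whose logical form all(man(X1),Q) waits to be completed by the logical form contributed at the sentence level. The argument threading that produces such lists is traced next.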
In the first compiled rule (a strong rule), the pair (Sems, Sems0) is a difference list of the semantic items analyzing the noun phrase. (Typically there will just be one element in this list, but there can be more when modifiers of the noun phrase contain quantifiers that cause the modifiers to get promoted semantically to be sisters of the noun phrase.) This difference list is the output of the call to 'semant' compiled in at the end of the first rule. The input to this call is the list Sems3 (along with the feature list np:nil). We arrive at Sems3 as follows. The list Sems1 is started by the call to 'det'; its first element is the determiner (if there is one), and the list is continued in the list Sems2 of modifiers determined further by the call to 'npl'. In this call to 'npl', the initial list Sems1 is given in the second argument of 'npl' as the "initial version" for the final list of modifiers of the noun phrase. Sems3, being in the next argument of 'npl', is the "final version" of the np modifier list, and this is the list given as input to 'semant'. If the processing of 'npl' encounters no shifts, then Sems3 will just equal Sems1.

In the second compiled rule (for 'npl'), the "versions" of the total list of modifiers are linked in a chain

   (Sems1, Sems2, Sems3)

in the second and third arguments of the weak non-terminals. The actual modifiers produced by this rule are linked in a chain

   (Sems4, Sems5, Sems6, Sems7)

in the fourth and fifth arguments of the weak non-terminals and the first and second arguments of the strong non-terminals. A similar situation holds for the first of the 'np2' rules.

In the second 'np2' rule, a shift is encountered, so a call to 'semant' is generated. This is necessary because of the shift of levels; the modifiers produced so far represent all the modifiers in an np, and these must be combined by 'semant' to get the analysis of this np. As input to this call to 'semant', we take the list Sems1, which is the current version of the modifiers of the matrix np. The output is the difference list (Sems2, Sems3). Sems2 is given to the succeeding call to 'npl' as the new current version of the matrix modifier list. The tail Sems3 of the difference list output by 'semant' is given to 'npl' in its fourth argument to receive further modifiers. Sems4 is the final version of the matrix modifier list, determined by 'npl', and this information is also put in the third argument of 'np2'. The difference list (Sems5, Sems6) contains the single element produced by 'poss', and this list tails off the list Sems1.

When a semantic item Op-LF occurs in a rule body, the rule compiler inserts the augmented semantic item sem(terminal:nil,Op,LF). As an example, the weak rule

   transverb(X,Y) ==> +V: $tv(V,X,Y,P): l-P.

compiles into the clause

   transverb(X,Y, Feas, Sems1,Sems1,
         sem(terminal:nil,l,P):Sems2,Sems2, V.Str,Str) <-
      tv(V,X,Y,P).

The strong rule

   det ==> +D: $dt(D,P1,P2,P): P2/P1-P.

compiles into the clause

   det(Sems1,Sems2, D.Str,Str) <-
      dt(D,P1,P2,P) &
      semant(det:nil, sem(terminal:nil,P2/P1,P):nil, Sems1,Sems2).

4. SEMANTIC INTERPRETATION FOR MLG'S

The semantic interpretation schemes for both the one-pass mode and the two-pass mode share a large core of common procedures; they differ only at the top level. In both schemes, augmented semantic items are combined with one another, forming more and more complex items, until a single item is constructed which represents the structure of the whole sentence. In this final structure, only the logical form component is of interest; the other two components are discarded. We will describe the top levels for both modes, then describe the common core.

The top level for the one-pass mode is simpler, because semantic interpretation works in tandem with the parser, and does not itself have to go through the parse tree. The procedure 'semant', which has interleaved calls in the compiled syntax rules, essentially is the top-level procedure, but there is some minor cleaning up that has to be done. If the top-level non-terminal is 'sentence' (with no arguments), then the top-level analysis procedure for the one-pass mode can be

   analyze(Sent) <-
      sentence(Sems,nil, Sent,nil) &
      semant(top:nil, Sems, sem(*,*,LF):nil, nil) &
      outlogform(LF).

Normally, the first argument, Sems, of 'sentence' will be a list containing a single augmented semantic item, and its logical form component will be the desired logical form. However, for some grammars, the additional call to 'semant' is needed to complete the modification process. The procedure 'outlogform' simplifies the logical form and outputs it.

The definition of 'semant' itself is given in a single clause:

   semant(Feas,Sems,Sems2,Sems3) <-
      reorder(Sems,Sems1) &
      modlist(Sems1, sem(Feas,id,t), Sem, Sems2,Sem:Sems3).

Here, the procedure 'reorder' takes the list Sems of augmented semantic items to be combined and re-

110 orders it (permutes it), to obtain proper (or most likely) scoping. This procedure belongs to the The first clause calls 'synsem' recursively when common core of the two methods of semantic inter- the daughter is another 'syn' structure. The second pretation, and will be discussed further below. clause replaces a logical terminal by an augmented The procedure 'modlist' does the following. A call semantic item whose feature list is terminal:nil. The next clause ignores any other type of daughter modlist(Sems,SemO,Sem,Semsl,Sems2) (this would normally be a surface terminal). takes a list Sems of (augmented) semantic items and Now we can proceed to the common core of the combines them with (lets them modify) the item SemO, two semantic interpretation systems. The procedure producing an item Sem (as the combination), along 'modlist' is defined recursively in a straightfor- with a difference list (Semsl, Sems2) of items which ward way: are promoted to be sisters of gem. The leftmost member of Sems acts as the outermost modifier. modlist(Sem:Sems, Sem0, Sem2, Semsl,Sems3) <- Thus, in the definition of 'semant', the result list modlist(Sems, SemO, Seml, Sems2,Sems3) & Semsl of reordering acts on the trivial item modify(Sem, Seml, Sem2, Semsl,Sems2). sem(Feas,id,t) to form a difference list (gems2, modlist(nil, Sem, gem, Sems,Sems). Sem:Sems3) where the result Sem is right-appended to its sisters. 'modlist' also belongs to the Here 'modify' takes a single item Sem and lets it common core, and will be defined below. operate on Seml, giving Sem2 and a difference list (Semsl, Sems2) of sister items. Its definltion is The top level for the two-pass system can be defined as follows. modify(Sem, Seml, Seml, Sem2:Sems,Sems~ <- raise(Sem,Seml,Sem2) &/. analyze2(Sent) <- modify(sem(*,Op,LF), sentence(gyn,Sent,nil) & sem(Feas,Opl,LFI), synsem(Syn,Sems,nil) & sem(Feas,Op2,LF2), Sems,Sems) <- semant(top:nil,gems,sem(*,e,LF):nit,niI) & mod(Op-LF, OpI-LFI, Op2-LF2). outlogform(LF).

The only difference between this and 'analyze' above Here 'raise' is responsible for raising the is that the call to 'sentence' produces a syntactic item Seml so that it becomes a sister of the item item Syn, and this is given to the procedure Seml; gem2 is a new version of Seml after the 'synsem'. The latter is the main recursive proce- raising, although in most cases, gem2 equals geml. dure of the two-pass system. A call Raising occurs for a noun phrase like "a chicken in every pot", where the quantifier "every" has synsem(Syn,SemsI,Sems2) higher scope than the quantifier "a". The semantic item for "every pot" gets promoted to a left sister takes a syntactic item Syn and produces a difference of that for "a chicken". 'raise' is defined bas- list (Semsl, Sems2) of augmented semantic items ically by a system of unit clauses which look at representing the semantic structure of Syn. (Typ- specific types of phrases. For the small grammar ically, this list will just have one element, but MLGRAM of Section 2, no raising is necessary, and it can have more if modifiers get promoted to sis- the definition of 'raise' can just be omitted. ters of the node.) The procedures 'raise' and 'reorder' are two The definition of 'synsem' is as follows. key ingredients of reshaping (the movement of se- mantic items to handle scoping problems), which was synsem(syn(Feas,Mods),Sems2,Sems3) <- discussed extensively in McCord (1982, 1981). [n synsemlist(Mods,Sems) & those two systems, reshaping was a separate pass reorder(Sems,Semsl) & of semantic interpretation, but },ere, as in McCord modlist(Semsl,sem(Feas,id,t), (198&), reshaping is interleaved with the rest of Sem,Sems2,Sem:Sems3). semantic interpretation. In spite of the new top- level organization for semantic interpretation of Note that this differs from the definition of MLG's, the low-level procedures for raising and 'semant' only in that 'synsem' must first reordering are basically the same as in the previous recursively process the daughters Mode of its input systems, and we refer to the previous reports for syntactic item before calling 'reorder' and further discussion. 'modlist' The procedure 'synsemlist' that proc- esses the daughters is defined as follows. The procedure 'mod', used in the second clause for 'modify', is the heart of semantic interpreta- synsemlist(syn(Feas,Mods0):Mods,Semsl) <- /& tion. synsem(syn(Feas,ModsO),SemsI,Sems2) & synsemlist(Mods,Sems2). mod(Sem, Seml, Sem2) synsemlist((Op-LF):Mods, sem(terminal:nil,Op,LF):Sems) <- /& means that the (non-augmented) semantic item Sem synsemlist(Mods,Sems). modifies (combines with) the item Semi to give the synsemlist(Nod:Mods,Sems) <- item Sem2. 'mod' is defined by a system consisting synsemlist(Mods,Sems). basically of unit clauses which key off the mod- synsemlist(nil,nil). ification operators appearing in the semantic items.

111 In the experimental MLG described in the next sec- plexity of the MLG, including the semantic inter- tion, there are 22 such clauses. For the grammar pretation component we have given in this Section, MLGRAM of Section 2, the following set of clauses is certainly greater than that of the comparable suffices. DCG in Section 2. However, for larger grammars, the modularity is definitely worthwhile -- concep- mod(id -~, Sem, Sem) <- /. tually, and probably in the total size of the sys- mod(Sem, id -~, Sem) <- /. tem. mod(l-P, Op-Q, Op-R) <- and(P,Q,R). mod(P/Q-R, Op-Q, @P-R). mod(@P-Q, Op-P, Op-Q). 5. AN EXPERIMENTAL MLG

The first two clauses say that the operator 'id' This section describes briefly an experimental acts like an identity. The second clause defines MLG, called HODL, which covers the same linguistic 'i' as a left-conjoining operator (its corresponding ground as the grammar (called HOD) in HcCord (198l). logical form gets left-conjoined to that of the The syntactic component of HOD, a DCG, is essen- modificand). The call and(P,Q,R) makes R=P&Q, ex- tially the same as that in HcCord (1982). One cept that it treats 't' ('true') as an identity. feature of these syntactic components is a system- The next clause for 'mod' allows a quantifier se- atic use of slot-filling to treat complements of mantic item like P/Q-each(Q,P) to operate on an item verbs and nouns. This method increases modularity like I-man(X) to give the item @P-each(man(X),P). between syntax and lexicon, and is described in The final clause then allows this item to operate detail in McCord (1982). on I-live(X) to give l-each(man(X),live(X)). One purpose of HOD, which is carried over to The low-level procedure 'mod' is the same (in MODL, is a good treatment of scoping of modifiers purpose) as the procedure 'trans' in HcCord (1981), and a good specification of logical form. The amd has close similarities to 'trans' in McCord logical form language used by >IODL as the target (1982) and 'mod' in McCord (198&), so we refer to of semantic interpretation has been improved some- this previous work for more illustrations of this what over that used for HOD. We describe here some approach to modification. of the characteristics of the new logical form language, called LFL, and give sample LFL analyses For MLGRAH, the only ingredient of semantic obtained by MODL, but we defer a more detailed de- interpretation remaining to be defined is 'reorder'. scription of LFL to a later report. We can define it in a way that is somewhat more general than is necessary for this small grammar, The main predicates of LFL are word-senses for but which employs a technique useful for larger words in the natural language being analyzed, for' grammars. Each augmented semantic item is assigned example, believel(X,Y) in the sense "X believes that a precedence number, and the reordering (sorting) Y holds". Quantifiers, like 'each', are special is done so that wh@n item B has higher precedence cases of word-senses. There are also a small number number than item A, then B is ordered to the left of non-lexJcal predicates in LFL, some of which are of A; otherwise items are kept in their original associated with inflections of words, like 'past' order. The following clauses then define 'reorder' for past tense, or syntactic constructions, like in a way suitable for MLGRAM. 'yesno' for yes-no questions, or have significance at discourse level, dealing for instance with reorder(A:L,H) <- topic/comment. The arguments for predicates of LFL reorder(L,Ll) & insert(A,Li,M). can be constants, variables, or other logical forms reordef(nit,n£1). (expressions of LFL).

insert(A,B:L,S:Ll) <- Expressions of LFL are either predications (in prec(A,PA) & prec(B,PB) & gt(PB,PA) &/& the sense just indicated) or combinations of LFL insert(A,L,Li). expressions using the conjunction '&' and the in- insert(A,L,a:L~. dexing operator ':'. Specifically, if P is a log- ical form and E is a variable, then P:E (read "P prec(sem(term~nal:*,e,~),2) <- /. indexed by E"~ is also a logical form. When an pruc(sem(relc!ause:e,e,e),l) <- /. indexed logical form P:E appears as part of a larger prec(e,3). logical form Q, and the index variable E is used elsewhere in Q. then E can be thought of roughly ~nus terminals are ordered to the end, except not as standing for P together with its "context". after relative clauses. In particular, the subject Contexts include references to time and place which and object of a sentence are ordered before the verb are normally left implicit in natural language. (~ terminal in the sentence), and this allows the When P specifies an event, as in see(john,mary), ssraightforward process of modification in :mod' writing P:E and subsequently using E will guarantee to scope the quantifiers of the subject and object that E refers to the same event. In the logical over the material of the verb. One can alter the form language used in McCord (1981), event variables definition of 'prec' to get finer distinctions in (as arguments of verb and noun senses) were used ~coping, and for this we refer to McCord (1982, for indexing. But the indexing operator is more 1981). powerful because it can index complex logical forms. For some applications, it is sufficient to ignore For a grammar as small as MLGRAM, which has contexts, and in such cases we just think of P:E no treatment of scoping phenomena, the total tom- as verifying P and binding E to an instantiation

112 of P. In fact, for PROLOG execution of logical is being compared to the base. This shows up most forms without contexts, ':' can be defined by the clearly in the superlative forms. Consider the single clause: P:P <- F. adverb "fastest":

A specific purpose of the MOD system in McCord John ran fastest yesterday. (1981) was to point out the importance of a class fastest(run(john):E, yesterday(E)). of predicates called focaiizers, and to offer a method for dealing with them in semantic interpre- John ran fastest yesterday. tation. Focalizers include many determiners, fastest(yesterday(run(X)), X=john). adverbs, and adjectives (or their word-senses), as well as certain non-lexical predicates like 'yesno'. In the first sentence, with focus on "yesterday", Focalizers take two logical form arguments called the meaning is that, among all the events of John's the base and the fOCUS: running (this is the base), John's running yesterday was fastest. The logical form illustrates the in- focalizer(Base,Focus). dexing operator. [n the second sentence, with focus on "John", the meaning is that among all the events The Focus is often associated with sentence stress, of running yesterday (there is an implicit location hence the name. The pair (Base, Focus) is called for these events), John's running was fastest. the SCOpe of the focalizer. As an example of a non-lexical focalizer, we The adverbs 'only' and 'even' are focalizers have yesno(P,q), which presupposes that a case of which most clearly exhibit the connection with P holds, and asks whether P & Q holds. (The pair stress. The predication only(P,Q) reads "the only (P, Q) is like Topic/Comment for yes-no questions.) case where P holds is when Q also holds". We get Example: different analyses depending on focus. Did John see M@ry yesterday? John only buys books at Smith's. yesno(yesterday(see(john,X)), X=mary). only(at(smith,buy(john,X1)), book(X1)).

John only buys books at Smith's. It is possible to give Prolog definitions for only(book(Xl)&at(X2,buy(john,Xl)), X2=smith). most of the focalizers discussed above which are suitable for extensional evaluation and which amount to model-theoretic definitions of them. This will quantificational adverbs like 'always' and be discussed in a later report on LFL. 'seldom', studied by David Lewis (1975), are also focalizers. Lewis made the point that these A point of the grammar HODL is to be able to quantifiers are properly considered unseJKtJve, in produce LFL analyses of sentences using the modular the sense that they quantify over all the free semantic interpretation system outlined in the variables in (what we call) their bases. For ex- preceding section, and to arrive at the right (or ample, in most likely) scopes for focalizers and other modi- fiers. The decision on scoping can depend on John always buys books at Smith's. heuristics involving precedences, on very reliable always(book(Xl)&at(X2,buy(john,Xl)), X2=smith) • cues from the syntactic position, and even on the specification of loci by explicit underlining in the quantification is over both X1 and X2. (A ~he input string (which is most relevant for paraphrase is "Always, if X1 is a book and John buys adverbial focalizers). Although written text does X1 at X2, then X2 is Smith's".) not often use such explici~ specification of adverbial loci, it is important that the system can Quantificational determiners are also get the right logical form after having some spec- focalizers (and are unselective quantifiers); they ification of the adverbial focus, because this correspond closely in meaning to the specification might be obtained from prosody in quantificational adverbs ('all' - 'always', 'many' spoken language, or might come from the use of 'often', 'few' - 'seldom', etc.). We have the discourse information. [t also is an indication paraphrases: of the modularity of the system that it can use the same syntactic rules and parse path no matter where Leopards often attack monkeys in trees. the adverbial focus happens to lie. often(leopard(Xl)&tree(X2)&in(X2,attack(Xl,X3)), monkey(X3)). Most of the specific linguistic information for semantic interpretation is encoded in the Many leopard attacks in trees are (attacks) procedures 'mod', 'reorder', and 'raise', which on monkeys. manipulate semantic items. In MODL there are 22 many(leopard(Xl)&tree(X2)&in(X2,attack(Xi,X3)), clauses for the procedure 'mod', most of which are monkey(X3)). unit clauses. These involve ten different modifi- cation operators, four of which were illustrated in the preceding section. The definition of 'mo

113 especially the indexing operator. The definitions attack(X,Y). The combination of items on the left of 'reorder' and 'raise' are essentially the same of 'only' is as for procedures in HOD. leopard(X)&tree(Z)&in(Z,attack(X,Y)) An illustration of analysis in the two-pass mode in HODL is now given. For the sentence This goes into the base, so the whole logical form "Leopards only attack monkeys in trees", the syn- is tactic analysis tree is as follows. only(leopard(X)&tree(Z)&in(Z,attack(X,Y)), sent monkey(Y)). nounph l-leopard(X) For detailed traces of logical form construction avp by this method, see McCord (1981). (P

Next, these items successively modify (according The syntactic component of MODL was adapted to the rules for 'mod') the matrix item, sent id as closely as possible from that of HOD (a DCG) in t, with the rightmost daughter acting as innermost order to get an idea of the efficiency of HLG's. oodifier. The rules for 'mod' involving the oper- The fact that the MLG rule compiler produces more ator (P

114 was written by a woman. the same syntactic component can he used with one- Leopards only attack monkeys in trees. pass or two-pass interpretation. John saw each boy's brother's teacher. Does anyone wanting to see the teacher know In MODL, there is a small dictionary stored whether there are any hooks left in this room? directly in Prolog, but MODL is also interfaced to a large dictionary/morphology system (Byrd, 1983, Using Waterloo Prolog (an interpreter) on an IBM 1984) which produces syntactic and morphological 3081, the following average times to get the logical information for words based on over 70,000 lemmata. forms for the five sentences were obtained (not There are plans to include enough semantic infor- including ~ime for [/0 and initial word separation): mation in this dictionary to provide semantic con- straints for a large MLG. MODL, one-pass mode - 40 milliseconds. MODL, two-pass mode - 42 milliseconds. Alexa HcCray is working on the syntactic com- MOD - 35 milliseconds. ponent for an MLG with very wide coverage. I wish to thank her for useful conversations about the So there was a loss of speed, but not a significant nature of the system. one. MODL has also been implemented in PSC Prolog (on a 3081). Here the average one-pass analysis time for the five sentences was improved to 30 6. COMPARISON WITH OTHER SYSTEMS milliseconds per sentence. The Restriction Grammars (RG's) of HLrschman On the other hand, the MLG grammar (in source and Puder (1982) are logic grammars that were de- form) ls more compact and easier to understand. signed with modularity [n mind. Restriction Gram- The syntactic components for MOD and MODL were mars derive from the Linguistic String Project compared numerically by a Prolog program that totals {Sager, 1981). An RG consists of conLexE-free up the sizes of all the grammar rules, where the size phrase structure rules to which restrictions are of a compound term is defined to be I plus the sum appended. The rule compiler {written in ProIo K and of the sizes of its arguments, and the size of any compiling into Prolog), sees to it that derivation other term is I. The total for MODL was l&33, and trees are constructed automatically during parsing. for MOD was 1807, for a ratio of 79%. The restrictions appended to the rules are basically Prolog procedures which can walk around, during the So far, nothing has been said in this report parse, in the partially constructed parse tree, and about semantic constraints in HODL. Currently, MODL can look at the words remaining in the input stream. exercises constraints by unification of semantic Thus there is a modularity between the phrase- types. Prolog terms representing type requirements structure parts of the syntax rules and the re- on slot-fillers must be unified with types of actual strictions. The paper contains an interesting fillers. The types used in MODL are t%/pe trees. discussion of Prolog representations of parse trees A type tr~ is either a variable {unspecified type) that make it easy to walk around in them. or a term whose principal functor is an atomic type (like 'human'), and whose arguments are subordinate A disadvantage of RG's is that the automat- type trees. A type tree T1 is subordinate to a type ically constructed analysis tree is just a deriva- tree T2 if either T1 is a variable or the principal tion tree. With MLG's, the shift operator and the functor of T1 is a subtype (ako) of the principal declaration of strong non-terminals produce analy- functor of T2. 
Type trees are a generalization of sis structures which are more appropriate seman- the type lists used by Dahl (1981), which are lists tically and are easier to read for large grammars. of the form TI:T2:T3:..., where T1 is a supertype [n addition, MLG analysis trees contain logical of T2, T2 is a supertype of TS, ..., and the tail terminals as building blocks for a modular semantic of the list may be a variable. The point of the interpretation system. The method of walking about generalization is to allow cross-classification. in the partially constructed parse tree is powerful Multiple daughters of a type node cross-classify and is worth exploring further; but the more common it. The lexicon in MODL includes a preprocessor way of exercising constraints in logic grammars by for lexical entries which allows the original lex- parameter passing and unification seems to be ade- ical entries to specify type constraints in a com- quate linguistically and notationally more compact, pact, non-redundant way. There is a Pro|o K as well as more efficient for the compiled Prolog representation for type-hierarchies, and the [exi- program. cal preprocessor manufactures full type trees from a specification of their leaf nodes. Another type of logic grammar developed with modularity in mind is the Definite Clause Trans- [n the one-pass mode for analysis with MLG's, lation Grammars (DCTG's) of Abramson (1984). These logical forms get built up during parsing, so log- were inspired partially by RG's (Hirschman and ical forms are available for examination by semantic Puder, 1982), by MSG's {Dahl and McCord, 1983), and checking procedures of the sort outlined in McCord by Attribute Grammars (Knuth, 1968). A DCTG rule (198&). If such methods are arguably best, then is like a DCG rule with an appended list of clauses there may be more argument for a one-pass system which compute the semantics of the node resulting (with interleaving of semantics). The general from use of the rule. The non-terminals on the question of the number of passes in a natural lan- right-hand side of the syntactic portion of the rule guage understander is an interesting one. The MLG can be indexed by variables, and these index vari- formalism makes this easier to investigate, because ables can be used in the semantic portion to link to the syntactic portion. For exa~le, the DCG rule

115 DCG's. The authors define a notion of intermediate sent(P) --> np(X,P1,P): vp(X,Pl). semantic representation (ISR) including entities and predications, where the predications can be viewed from the DCG in Section 2 has the DCTG equivalent: as logical forms. In writing DCG rules, one sys- tematically includes at the end of the rule a call sent ::= np@N: vp@V <:> to a semantic procedure (specific to the given rule) logic(P) ::- N@Iogic(X,PI,P) & V@logic(X,Pl). which combines ISR's obtained in arguments of the non-terminals on the right-hand side of the rule. (Our notation is slightly different from Abramson's Two DCG rules in this style (given by the authors) and is designed to fit the Prolog syntax of this are as follows: report.) Here the indexing operator is '@'. The syntactic portion is separated from the semantic sent(S) --> np(N): vp(V): $ssv(N,V,S). portion by the operator '<:>'. The non-terminals vp(S) --> verb(V,trans): np(N): Ssvo(V,N,S). in DCTG's can have arguments, as in DCG's, which could be used to exercise constraints (re- Here 'ssv' and 'svo' are semantic procedures that strictions), but it is possible to do everything are specific to the 'sent' rule and the 'vp' rule, by referring to the indexing variables. The DCTG respectively. The rules that define 'ssv' and 'svo' rule compiler sees to the automatic construction can include some general rules, but also a mass of of a derivation tree, where each node is labeled very specific rules tied to specific words. Two not only by the expanded non-terminal but also by specific rules given by the authors for analyzing the list of clauses in the semantic portion of the "All Viennese composers wrote ~ waltz" are as fol- expanding rule. These clauses can then be used in lows. computing the semantics of the node. When an in- dexed non-terminal NT@X appears on the right-hand svo(wrote,M:X,wrote(X)) <- is_a(M,music). side of a rule, the indexing variable X gets ssv(P:X,wrote(Y),author_of(Y,X)) <- iastantiated to the tree node corresponding to the is_a(P,person). expansion of NT. Note that the verb 'wrote' changes from the surface There is a definite separation of DCTG rules form 'wrote', to the intermediate form wrote(X), into a syntactic portion and a semantic portion, then to the form author of(Y,X). [n most logic with a resulting increase of modularity. Procedures grammar systems (including MOD and MODL), some form involving different sorts of constraints can be of argument filling is done for predicates; infor- separated from one another, because of the device mation is added by binding argument variables, of referring to the indexing variables. However, rather than changing the whole form of the predi- it seems that once the reader (or writer) knows that cation. The authors claim that it is less efficient certain variables in the DCG rule deal with the to do argument filling, because one can make an construction of logical forms, the original DCG rule early choice of a word sense which may lead to is just as easy (if not easier) to read. The DCTG failure and . An intermediate form like rule is definitely longer than the DCG rule. The wrote(X) above may only make a partial decision corresponding MLG rule: about the sense.

The corresponding MLG rule:

   sent :> np(X): vp(X).

is shorter, and does not need to mention logical forms at all. Of course, there are relevant portions of the semantic component that are applied in connection with this rule, but many parts of the semantic component are relevant to several syntax rules, thus reducing the total size of the system.

A claimed advantage for DCTG's is that the semantics for each rule is listed locally with each rule. There is certainly an appeal in that, because with MLG's (as well as the methods in McCord (1982, 1981)), the semantics seems to float off more on its own. Semantic items do have a life of their own, and they can move about in the tree (implicitly, in some versions of the semantic interpreter) because of raising and reordering. This is not as neat theoretically, but it seems more appropriate for capturing actual natural language.

Another disadvantage of DCTG's (as with RG's) is that the analysis trees that are constructed automatically are derivation trees.

The last system to be discussed here, that in Porto and Filgueiras (1984), does not involve a new grammar formalism, but a methodology for writing DCG's. The authors define a notion of intermediate semantic representation (ISR), including entities and predications, where the predications can be viewed as logical forms. In writing DCG rules, one systematically includes at the end of the rule a call to a semantic procedure (specific to the given rule) which combines ISR's obtained in arguments of the non-terminals on the right-hand side of the rule. Two DCG rules in this style (given by the authors) are as follows:

   sent(S) --> np(N): vp(V): $ssv(N,V,S).
   vp(S) --> verb(V,trans): np(N): $svo(V,N,S).

Here 'ssv' and 'svo' are semantic procedures that are specific to the 'sent' rule and the 'vp' rule, respectively. The rules that define 'ssv' and 'svo' can include some general rules, but also a mass of very specific rules tied to specific words. Two specific rules given by the authors for analyzing "All Viennese composers wrote a waltz" are as follows.

   svo(wrote,M:X,wrote(X)) <- is_a(M,music).
   ssv(P:X,wrote(Y),author_of(Y,X)) <- is_a(P,person).

Note that the verb 'wrote' changes from the surface form 'wrote', to the intermediate form wrote(X), then to the form author_of(Y,X). In most logic grammar systems (including MOD and MODL), some form of argument filling is done for predicates; information is added by binding argument variables, rather than changing the whole form of the predication. The authors claim that it is less efficient to do argument filling, because one can make an early choice of a word sense which may lead to failure and backtracking. An intermediate form like wrote(X) above may only make a partial decision about the sense.

The value of the "changing" method over the "adding" method would appear to hinge a lot on the question of parse-time efficiency, because the "changing" method seems more complicated conceptually. It seems simpler to have the notion that there are word senses which are predicates with a certain number of arguments, and to deal only with these, rather than inventing intermediate forms that help in discrimination during the parse. So it is partly an empirical question which would be decided after logic grammars dealing semantically with massive dictionaries are developed.

There is modularity in rules written in the style of Porto and Filgueiras, because all the semantic structure-building is concentrated in the semantic procedures added (by the grammar writer) at the ends of the rules. In MLG's, in the one-pass mode, the same semantic procedure call, to 'semant', is added at the ends of strong rules, automatically by the compiler. The diversity comes in the ancillary procedures for 'semant', especially 'mod'. In fact, 'mod' (or 'trans' in McCord, 1981) has something in common with the Porto-Filgueiras procedures in that it takes two intermediate representations (semantic items) in its first two arguments and produces a new intermediate representation in its third argument. However, the changes that 'mod' makes all involve the modification-operator components of semantic items, rather than the logical-form components. It might be interesting and worthwhile to look at a combination of the two approaches.
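The calling pattern just described can be sketched as follows. This is not the actual MLG 'mod' code; the representation of a semantic item as an Operator-LogicalForm pair, the particular operators, and the predicate name combine are all assumptions made for illustration. What the sketch shares with 'mod' is only the shape: two semantic items in, one combined semantic item out.

   % A sketch of the calling pattern only -- not the MLG 'mod' rules.
   % A semantic item is assumed here to be a pair Operator-LogicalForm.

   % A quantifier operator takes the logical form of the modified item
   % as its scope; the operator of the modified item is passed through.
   combine(quant(Q, X, Restr)-_, Op-Scope, Op-LF) :-
       LF =.. [Q, X, Restr, Scope].

   % An intersective modifier is assumed to conjoin its predication
   % into the restriction of a quantified item.
   combine(pred(P)-true, quant(Q, X, Restr)-LF, quant(Q, X, and(P, Restr))-LF).

   % ?- combine(quant(all, X, composer(X))-true, top-wrote(X, W), Item).
   % Item = top-all(X, composer(X), wrote(X, W)).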

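For concreteness, the two Porto-Filgueiras-style rules quoted earlier can be run directly once the '<-' of the quoted rules is written as Prolog's ':-'. The is_a facts and the entity constants below are invented for the illustration, and ':' is assumed to be available as an infix operator, as it is in most Prologs.

   % The svo/ssv rules quoted above, made runnable.  The is_a facts
   % and the constants w1 and c1 are assumptions for this sketch.

   is_a(w1, music).      % w1: an entity standing for "a waltz"
   is_a(c1, person).     % c1: an entity standing for "composers"

   svo(wrote, M:X, wrote(X))           :- is_a(M, music).
   ssv(P:X, wrote(Y), author_of(Y, X)) :- is_a(P, person).

   % Chaining the two calls reproduces the change of form noted above:
   % ?- svo(wrote, w1:W, V), ssv(c1:C, V, S).
   % V = wrote(W),
   % S = author_of(W, C).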
Both a strength and a weakness of the Porto-Filgueiras semantic procedures (compared with 'mod') is that there are many of them, associated with specific syntactic rules. The strength is that a specific procedure knows that it is looking at the "results" of a specific rule. But a weakness is that generalizations are missed. For example, modification by a quantified noun phrase (after slot-filling or the equivalent) is often the same, no matter where it comes from. The method in MLG's allows semantic items to move about and then act by one 'mod' rule. The reshaping procedures are free to look at specific syntactic information, even specific words when necessary, because they work with augmented semantic items. Of course, another disadvantage of the diversity of the Porto-Filgueiras procedures is that they must be explicitly added by the writer of syntax rules, so that there is not as much modularity as in MLG's.

REFERENCES

Abramson, H. (1984) "Definite clause translation grammars," Proc. 1984 International Symposium on Logic Programming, pp. 233-240, Atlantic City.

Byrd, R. J. (1983) "Word formation in natural language processing systems," Proc. 8th International Joint Conference on Artificial Intelligence, pp. 704-706, Karlsruhe.

Byrd, R. J. (1984) "The Ultimate Dictionary Users' Guide," IBM Research Internal Report.

Colmerauer, A. (1978) "Metamorphosis grammars," in L. Bolc (Ed.), Natural Language Communication with Computers, Springer-Verlag.

Dahl, V. (1977) "Un système déductif d'interrogation de banques de données en espagnol," Groupe d'Intelligence Artificielle, Univ. d'Aix-Marseille.

Dahl, V. (1981) "Translating Spanish into logic through logic," American Journal of Computational Linguistics, vol. 7, pp. 149-164.

Dahl, V. and McCord, M. C. (1983) "Treating coordination in logic grammars," American Journal of Computational Linguistics, vol. 9, pp. 69-91.

Heidorn, G. E. (1972) Natural Language Inputs to a Simulation Programming System, Naval Postgraduate School Technical Report No. NPS-55HD7210IA.

Hirschman, L. and Puder, K. (1982) "Restriction grammar in Prolog," Proc. First International Logic Programming Conference, pp. 85-90, Marseille.

Jensen, K. and Heidorn, G. E. (1983) "The fitted parse: 100% parsing capability in a syntactic grammar of English," IBM Research Report RC 9729.

Knuth, D. E. (1968) "Semantics of context-free languages," Mathematical Systems Theory, vol. 2, pp. 127-145.

Lewis, D. (1975) "Adverbs of quantification," in E. L. Keenan (Ed.), Formal Semantics of Natural Language, pp. 3-15, Cambridge University Press.

McCord, M. C. (1981) "Focalizers, the scoping problem, and semantic interpretation rules in logic grammars," Technical Report, University of Kentucky. To appear in Logic Programming and its Applications, D. Warren and M. van Caneghem, Eds.

McCord, M. C. (1982) "Using slots and modifiers in logic grammars for natural language," Artificial Intelligence, vol. 18, pp. 327-367. (Appeared first as 1980 Technical Report, University of Kentucky.)

McCord, M. C. (1984) "Semantic interpretation for the EPISTLE system," Proc. Second International Logic Programming Conference, pp. 65-76, Uppsala.

Miller, L. A., Heidorn, G. E., and Jensen, K. (1981) "Text-critiquing with the EPISTLE system: an author's aid to better syntax," AFIPS Conference Proceedings, vol. 50, pp. 649-655.

Pereira, F. (1981) "Extraposition grammars," American Journal of Computational Linguistics, vol. 7, pp. 243-256.

Pereira, F. (1983) "Logic for natural language analysis," SRI International, Technical Note 275.

Pereira, F. and Warren, D. (1980) "Definite clause grammars for language analysis - a survey of the formalism and a comparison with transition networks," Artificial Intelligence, vol. 13, pp. 231-278.

Pereira, F. and Warren, D. (1982) "An efficient easily adaptable system for interpreting natural language queries," American Journal of Computational Linguistics, vol. 8, pp. 110-119.

Porto, A. and Filgueiras, M. (1984) "Natural language semantics: A logic programming approach," Proc. 1984 International Symposium on Logic Programming, pp. 228-232, Atlantic City.

Sager, N. (1981) Natural Language Information Processing: A Computer Grammar of English and Its Applications, Addison-Wesley.

Woods, W. A. (1970) "Transition network grammars for natural language analysis," Communications of the ACM, vol. 13, pp. 591-606.
