Compostional Semantics for Unification-Based Linguistics Formalisms

University of Pennsylvania ScholarlyCommons IRCS Technical Reports Series Institute for Research in Cognitive Science January 1999 Compostional Semantics for Unification-based Linguistics Formalisms Shuly Wintner University of Pennsylvania, [email protected] Follow this and additional works at: https://repository.upenn.edu/ircs_reports Wintner, Shuly, "Compostional Semantics for Unification-based Linguistics ormalisms"F (1999). IRCS Technical Reports Series. 44. https://repository.upenn.edu/ircs_reports/44 University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-99-05. This paper is posted at ScholarlyCommons. https://repository.upenn.edu/ircs_reports/44 For more information, please contact [email protected]. Compostional Semantics for Unification-based Linguistics ormalismsF Abstract Contemporary linguistic formalisms have become so rigorous that it is now possible to view them as very high level declarative programming languages. Consequently, grammars for natural languages can be viewed as programs; this view enables the application of various methods and techniques that were proved useful for programming languages to the study of natural languages. This paper adapts the notion of program composition, well developed in the context of logic programming languages, to the domain of linguistic formalisms. We study alternative definitions for the semantics of such formalisms, suggesting a denotational semantics that we show to be compositional and fully-abstract. This facilitates a clear, mathematically sound way for defining grammar modularity. Comments University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-99-05. This technical report is available at ScholarlyCommons: https://repository.upenn.edu/ircs_reports/44 Comp ositional Semantics for Uni cation-based Linguistic Formalisms Shuly Wintner Institute for Research in Cognitive Science University of Pennsylvania 3401 Walnut Street, Suite 400A Philadelphia, PA 19104-6228 [email protected] Abstract Contemp orary linguistic formalisms have b ecome so rigorous that it is now p ossible to view them as very high level declarative programming languages. Consequently, grammars for natural languages can b e viewed as programs; this view enables the application of various metho ds and techniques that were proved useful for programming languages to the study of natural languages. This pap er adapts the notion of program composition, well develop ed in the context of logic programming languages, to the domain of linguistic formalisms. We study alternative de nitions for the semantics of such formalisms, suggesting a denotational semantics that we show to b e comp ositional and fully-abstract. This facilitates a clear, mathematically sound way for de ning grammar mo dularity. 1 Intro duction The tasks of developing large scale grammars for natural languages b ecome more and more complicated: it is not unusual for a single grammar to be develop ed by a team including a number of linguists, computational linguists and computer scientists. The problems that grammar engineers face when they design a broad-coverage grammar for some natural language are very reminiscent to the problems tackled by software engineering see Erbach and Uszkoreit 1990 for a discussion of some of the problems involved in developing large-scale grammars. It is p ossible { and, indeed, desirable | to adapt metho ds and techniques of software engineering to the domain of natural language formalisms. The departure p oint for this work is the b elief that any advances in grammar engineering must be preceded by a more theoretical work, concentrating on the semantics of grammars. This view re ects the situation in logic programming, where developments in alternative de nitions for the semantics of predicate logic led to implementations of various program comp osition op erators. Viewing contemp orary linguistic formalisms as very high level declarative programming languages, a grammar for a natural language can b e viewed as a program. The execution of a grammar on an input sentence yields an output which represents the sentence's structure. This pap er adapts well-known results from logic programming languages semantics to the framework of uni cation-based linguistic formalisms. While most of the results we rep ort on are probably not surprising, we b elieve it is imp ortant to derive them directly for linguistic formalisms for two reasons. First, practitioners of linguistic formalisms usually do not view them as instances of a general logic programming framework, but rather as rst-class programming environments which deserve indep endent study. Second, there are some crucial di erences b etween linguistic formalisms and, say, Prolog: the basic elements|typ ed feature structures | are more general then rst-order terms, the notion of uni cation 1 is di erent, and computations amount to parsing, rather than SLD-resolution. The fact that we can derive similar results in this new domain is encouraging, and should not b e considered trivial. In the next section we review the literature on mo dularity in logic programming. We discuss alternative approaches, b oth op erational and denotational, to the semantics of uni cation-based linguistic formalisms in section 3. In section 4 we show that the standard semantics are \to o crude" and present an alternative semantics, whichwe show to b e comp ositional with resp ect to grammar union, a simple syntactic combination op eration on grammars. However, this de nition is \to o ne": we show that it is not fully-abstract, and in section 5 we present an adequate, comp ositional and fully-abstract semantics for uni cation-based linguistic formalisms. We conclude with suggestions for further research. 2 Mo dularity in logic programming The seminal work in the semantics of logic programming is Van Emden and Kowalski 1976, where three alternative de nitions for the meanings of predicate logic sentences are given and proven equivalent: the mo del theoretic de nition, viewing the denotation of the program as its minimal mo del, the intersection of all its Herbrand mo dels; the op erational de nition, viewing the denotation of a program as the set of consequences derivable from the data structures it manipulates; and the xp oint semantics, by which the meaning of a program is the least xp oint of its immediate consequence op erator. These results are elab orated up on by Apt and Van Emden 1982, further exploiting the application of xp oints by relating them to SLD resolution and accounting also to nite failure. A di erent approach is pursued by Lassez and Maher 1984: viewing a logic program as the set of its rules, distinct from the facts, they de ne an immediate consequence op erator for the rules only. The denotation of a grammar is the xp oint of this op erator, starting from the set of facts. This approach facilitates comp osition of programs, as the meaning of any comp osite sentence can b e expressed in terms of the meanings of its immediate constituents. Gaifman and Shapiro 1989 observe that the classical Herbrand-base semantics of logic programs is inadequate, as it identi es programs that should b e distinguished and vice versa. They de ne the notions of comp ositional semantics and fully-abstract semantics and show that the reason for the inadequacy of the standard mo del-theoretic semantics lies in the fact that it captures only the derivable atoms in its Herbrand base, whereas a query can intro duce new function symb ols that are not part of the program's signature. Rather than resort to a solution that incorp orates a \universal" signature for logic programs, they change the notion of program equivalence by taking as invariant the set of all most general atomic logical consequences of the program. This de nition of semantics is shown to b e b oth comp ositional and fully-abstract. Gaifman and Shapiro 1989 de ne comp ositional semantics as follows: let P b e a class of programs or program parts; let Ob b e a function asso ciating a set of ob jects, the observables, with a program P 2P. Let Com be a class of comp osition functions over P such that for every n-ary function f 2 Com and every set of n programs P ;:::;P 2P, f P ;:::;P 2P. A comp ositional equivalence with resp ect 1 n 1 n to Ob; C om is an equivalence relation ` ' that preserves observables i.e., whenever P Q, ObP = ObQ, and for every f 2 Com, if for all 1 i n, P Q then f P ;:::;P f Q ;:::;Q . i i 1 n 1 n The need for a mo dular extension for logic programming languages has always b een agreed up on, as relations were viewed as providing to o ne-grained abstraction for the design of large programs. Two ma jor tracks were taken: one, referred to as programming in the large and inspired by O'keefe 1985, suggests a meta-linguistic mechanism: mo dules are viewed as sets of Horn clauses and their comp osition is mo deled in terms of op erations on the comp onents such as union, deletion, closure etc. The other approach, known as programming in the smal l and originating with Miller 1986; 1989, enhances logic programming with linguistic abstraction mechanisms that are richer than those o ered by Horn clauses see Bugliesi, Lamma, and Mello 1994 for a review of mo dularity in logic programming. Unlike O'keefe 1985, who de nes the denotation of a program to be the transformation op erator 2 obtained by applying deduction steps arbitrarily many times, Mancarella and Pedreschi 1988 move from an ob ject level to a function level and interpret a logic program as its immediate consequence op erator, mo deling a single deduction step. They de ne an algebra over op erators, suitable for de ning a union comp osition op erator over sub-programs.

Compostional Semantics for Unification-Based Linguistics Formalisms

Lecture Notes in Computer Science

Unification Grammars Nissim Francez and Shuly Wintner Frontmatter More Information

Chp%3A10.1007%2F3-54

Amalia--A Unified Platform for Parsing and Generation

Semantics of Nondeterminism, Concurrency, and Communication*

David S. Brée

Off-Line Parsability and the Well-Foundedness of Subsumption

Proceedings of the 15Th Meeting on the Mathematics of Language (MOL 2017), Held at Queen Mary University of London, on July 13–14, 2017

Notes on Computational Linguistics

Elephant 2000: a Programming Language Based on Speech Acts

Rechnion - Israel Institute Ot Technology Computer Science Dep~Rtment

Categorial Grammar and the Semantics of Contextual Prepositional Phrases