The Linguistic Relevance of Tree Adjoining Grammar


University of Pennsylvania ScholarlyCommons
Technical Reports (CIS)
Department of Computer & Information Science
April 1985

The Linguistic Relevance of Tree Adjoining Grammar
Anthony S. Kroch, University of Pennsylvania
Aravind K. Joshi, University of Pennsylvania, [email protected]
Follow this and additional works at: https://repository.upenn.edu/cis_reports

Recommended Citation
Anthony S. Kroch and Aravind K. Joshi, "The Linguistic Relevance of Tree Adjoining Grammar", April 1985. University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-85-16.

This paper is posted at ScholarlyCommons: https://repository.upenn.edu/cis_reports/671. For more information, please contact [email protected].

Abstract
In this paper we apply a new notation for the writing of natural language grammars to some classical problems in the description of English. The formalism is the Tree Adjoining Grammar (TAG) of Joshi, Levy and Takahashi 1975, which was studied initially only for its mathematical properties but which now turns out to be an interesting candidate for the proper notation of meta-grammar, that is, for the universal grammar of contemporary linguistics. Interest in the application of the TAG formalism to the writing of natural language grammars arises out of recent work on the possibility of writing grammars for natural languages in a metatheory of restricted generative capacity (for example, Gazdar 1982 and Gazdar et al. 1985). There have also been several recent attempts to examine the linguistic metatheory of restricted grammatical formalisms, in particular context-free grammars. The inadequacies of context-free grammars have been discussed both from the point of view of strong generative capacity (Bresnan et al. 1982) and weak generative capacity (Shieber 1984, Postal and Langendoen 1984, Higginbotham 1984, the empirical claims of the last two having been disputed by Pullum (Pullum 1984)). At this point TAG becomes interesting because, while it is more powerful than context-free grammar, it is only "mildly" so. This extra power of TAG is a direct corollary of the way TAG factors recursion and dependencies, and it can provide reasonable structural descriptions for constructions like Dutch verb raising where context-free grammar apparently fails. These properties of TAG and some of its mathematical properties were discussed by Joshi 1983.

Comments
University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-85-16. This technical report is available at ScholarlyCommons: https://repository.upenn.edu/cis_reports/671

THE LINGUISTIC RELEVANCE OF TREE ADJOINING GRAMMAR*
Anthony S. Kroch† and Aravind K. Joshi††
Department of Computer and Information Science
Room 268 Moore School/DIL
University of Pennsylvania
Philadelphia, PA 19104
April 1985
Revised, June 1985

*Acknowledgement: This research was supported in part by NSF grants MCS-8219196-CER and MCS-82-07294, and a grant from the Alfred P. Sloan Foundation for a Program in Cognitive Science, Grant No. 84-4-15.
†Department of Linguistics, Williams H311/CU
††Department of Computer and Information Science, Moore School/DL

Table of Contents
1. Introduction
1.1 Motivation
1.2 Factoring recursion and co-occurrence restrictions
1.3 The plan of the paper
2. An introduction to the Tree Adjoining Grammar formalism
2.1 Tree Adjoining Grammar - TAG
2.2 TAG's with "links"
2.3 TAG's with local constraints on adjoining
2.4 Some formal properties of TAG's
3. Some linguistic examples
4. Raising and equi constructions in a TAG
4.1 The basic issues
4.2 The problem of nominalizations
4.3 Further considerations
5. The Passive in a TAG
5.1 The link between raising and the passive
5.2 A TAG analysis of the passive
5.3 Impersonal and raising passives
6. Wh-movement in a TAG
6.1 Subjacency in a TAG
6.1.1  6.1.2  6.1.3
6.2 The that-trace effect in a TAG
6.2.1  6.2.2  6.2.3
6.3 A further example
7. Conclusions
8. References

1. Introduction

1.1 Motivation

In this paper we apply a new notation for the writing of natural language grammars to some classical problems in the description of English.¹ The formalism is the Tree Adjoining Grammar (TAG) of Joshi, Levy and Takahashi 1975, which was studied initially only for its mathematical properties but which now turns out to be an interesting candidate for the proper notation of meta-grammar, that is, for the universal grammar of contemporary linguistics. Interest in the application of the TAG formalism to the writing of natural language grammars arises out of recent work on the possibility of writing grammars for natural languages in a metatheory of restricted generative capacity (for example, Gazdar 1982 and Gazdar et al. 1985). There have also been several recent attempts to examine the linguistic metatheory of restricted grammatical formalisms, in particular context-free grammars. The inadequacies of context-free grammars have been discussed both from the point of view of strong generative capacity (Bresnan et al. 1982) and weak generative capacity (Shieber 1984, Postal and Langendoen 1984, Higginbotham 1984, the empirical claims of the last two having been disputed by Pullum (Pullum 1984)). At this point TAG becomes interesting because, while it is more powerful than context-free grammar, it is only "mildly" so. This extra power of TAG is a direct corollary of the way TAG factors recursion and dependencies, and it can provide reasonable structural descriptions for constructions like Dutch verb raising where context-free grammar apparently fails. These properties of TAG and some of its mathematical properties were discussed by Joshi 1983.

It is our hope that the presentation below will support the claim, currently controversial, that the exploration of restrictive mathematical formalisms as metalanguages for natural language grammars can produce results of value in empirical linguistics. In spite of the fact that the syntactic theory of natural languages and mathematical linguistics share a common origin, the relevance of the latter to the former is a matter of contention. Linguists agree that an explanatory theory of language requires a restrictive specification of universal grammar, since that notion defines the space of possible human languages. Without a restrictive universal grammar the problem of language acquisition becomes intractable for the child, who has to entertain so many hypotheses as to the correct grammar for his language that the limited primary data of his experience will not allow him to choose among them. Linguists also agree that the formalization of transformational-generative grammar (TG) as it is used in the Aspects and related models is far too permissive. In the search for a more restrictive theory, however, different researchers have taken very different tacks.
Some have claimed that progress can best be made by reanalyzing the syntax of English and other languages that have received extensive treatment in TG grammar within systems of grammar that are provably less powerful in generative capacity than transformational grammars are. Only in this way can researchers be sure that the grammars they construct will not only be learnable but also usable in the real-time linguistic activities of parsing and generation that grammatical knowledge underlies. Under this approach transformational grammars must be excluded by the theory of universal grammar because, since they can generate non-recursive sets, the languages they generate cannot be expected to be parsable within reasonable (i.e., polynomial) time bounds (the Generalized Phrase Structure Grammar (GPSG) of Gazdar takes this approach). Other linguists, most notably Chomsky, have argued that a restrictive theory of universal grammar can and should be developed by the empirically driven discovery of constraints on rules and representations, and that these constraints cannot be expected to restrict the generative capacity of grammars in any interesting way. Chomsky appears to believe that the effect of the constraints that comprise universal grammar on the mathematically defined generative capacity of possible human language grammars has little linguistic relevance (Chomsky 1977, 1980). For him the learnability problem is the only one that should constrain universal grammar. The parsability of languages is not a goal that universal grammar should aim to account for, because it is doubtful that the complete set of grammatical sentences of a human language is necessarily parsable, and it may even be that parsing is not algorithmic. The thrust of GPSG and similar approaches is to elaborate a formal theory that constrains the generative capacity of possible grammars and then to show that, in spite of its mathematical restrictiveness, grammars admitted by this theory give empirically satisfactory analyses of various syntactic phenomena previously analyzed in transformational terms. One may consider alternative linguistic analyses for a certain phenomenon in these frameworks, but the formal power of the theory is always well understood.

¹This paper has benefited enormously from discussions the authors have had with many colleagues and students, and from written comments received from others. We especially want to thank Mark Baltin, Bob Berwick, Dominique Estival, Gerald Gazdar, Richard Lawn, Ellen Prince, Geoff Pullum, Vijay Shankar, Tom Wasow, and David Weir for their helpful comments. Of course, they do not necessarily agree with us in all our claims.
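The point made above, that TAG's extra power comes from the way it factors recursion (supplied by adjoining) and dependencies (localized within elementary trees), and that this is what lets it handle Dutch verb raising, can be made concrete with a small illustration. The sketch below is not taken from the paper and uses none of its notation; the tree representation, the names Node, adjoin, aux, and frontier, and the choice of Dutch words are all illustrative assumptions. It shows how repeatedly adjoining a single auxiliary tree, which pairs one NP with its verb, yields the cross-serial order NP1 NP2 NP3 V1 V2 V3 that a context-free grammar cannot pair up structurally.

```python
# Minimal sketch of TAG adjunction (illustrative only; not the grammar or
# notation of the paper). Each auxiliary tree pairs one NP with its verb;
# repeated adjunction produces the cross-serial order of Dutch verb raising
# while each NP-V dependency stays local to a single elementary tree.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str                           # terminal or nonterminal symbol
    children: List["Node"] = field(default_factory=list)
    foot: bool = False                   # marks the foot node of an auxiliary tree

def frontier(t: Node) -> List[str]:
    """Terminal yield of a tree; the empty string 'e' is dropped."""
    if not t.children:
        return [] if t.label == "e" else [t.label]
    out: List[str] = []
    for c in t.children:
        out.extend(frontier(c))
    return out

def adjoin(target: Node, aux_root: Node) -> None:
    """Adjoin an auxiliary tree at `target`: excise the subtree under `target`,
    graft the auxiliary tree's structure in its place, and re-attach the
    excised material at the auxiliary tree's foot node."""
    excised = Node(target.label, target.children)
    target.children = aux_root.children

    def plant(n: Node) -> bool:
        for i, c in enumerate(n.children):
            if c.foot:
                n.children[i] = excised
                return True
            if plant(c):
                return True
        return False

    plant(target)

def aux(np: str, verb: str) -> Node:
    """Auxiliary tree  S -> np (S -> S* verb), pairing one NP with one verb."""
    inner = Node("S", [Node("S", foot=True), Node(verb)])
    return Node("S", [Node(np), inner])

# Initial tree: S -> e
derived = Node("S", [Node("e")])

# Adjoin one auxiliary tree per clause, each time at the S node that
# dominates the foot of the previously adjoined tree.
site = derived
for np, verb in [("Frank", "zag"), ("Julia", "helpen"), ("Fred", "zwemmen")]:
    adjoin(site, aux(np, verb))
    site = site.children[1]              # the inner S just introduced

print(" ".join(frontier(derived)))       # Frank Julia Fred zag helpen zwemmen
```

Running the sketch prints "Frank Julia Fred zag helpen zwemmen": the NPs and verbs come out in matching cross-serial order, with each NP introduced by the same elementary tree as its verb. This is the sense in which the dependency is stated locally while unbounded recursion is contributed separately by the adjoining operation.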