A Theory of Syntactic Recognition for Natural Language

by

Mitchell Philip Marcus

A.B., Harvard University (1972)

Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy at the Massachusetts Institute of Technology

February, 1978

Signature of Author: [signature redacted], Department of Electrical Engineering and Computer Science, October 13, 1977
Certified by: [signature redacted], Thesis Supervisor
Accepted by: [signature redacted], Chairman, Departmental Committee on Graduate Students

A Theory of Syntactic Recognition for Natural Language

by Mitchell Philip Marcus

Submitted to the Department of Electrical Engineering and Computer Science on October 13, 1977 in partial fulfillment of the requirements of the Degree of Doctor of Philosophy.

Abstract

This dissertation investigates the hypothesis that the syntax of natural language can be parsed by a left-to-right deterministic mechanism without facilities for parallelism or backup. This determinism hypothesis, explored in the context of the grammar of English, is shown to lead to a simple mechanism, a grammar interpreter, having the following properties:

- Simple rules of grammar can be written for this interpreter which capture the generalizations behind passives, yes/no questions, and other linguistic phenomena, despite the intrinsic difficulty of capturing such generalizations in the framework of a processing model for recognition, as opposed to a competence model.

- The structure of the grammar interpreter constrains its operation in such a way that, by and large, grammar rules cannot parse sentences which violate either of two constraints on rules of grammar currently proposed by Chomsky.
This result depends in part upon the computational use of the notion of traces and the related notion of Annotated Surface Structure, which derive from Chomsky's work.

- The grammar interpreter provides a simple explanation for the difficulty caused by "garden path" sentences, such as "The cotton clothing is made of grows in Mississippi". Rules can be written for this interpreter to resolve local structural ambiguities which might seem to require nondeterministic parsing; however, the power of such rules depends upon a parameter of the mechanism. Most structural ambiguities can be resolved, given an appropriate setting of this parameter, but those which typically cause garden paths cannot.

To the extent that these properties, all of which reflect deep properties of natural language, follow from the determinism hypothesis, this paper provides indirect evidence for the truth of this assumption. This paper also demonstrates that the determinism hypothesis necessitates semantic/syntactic interaction to test the comparative semantic goodness of differing structural possibilities for an input.
Thesis Supervisor: Jonathan Allen
Title: Professor of Electrical Engineering

Acknowledgements

I would like to express my gratitude to everyone who contributed to this work, and provided support and encouragement to its author:

- to Jonathan Allen, my advisor, for much good advice, and especially for his constant focus on what is central and what is not;

- to my readers: to Ira Goldstein, for his careful attention during the course of this research, and his careful reading of several iterations of this document, and to Seymour Papert, for helping to narrow the range of the research, and for creating, with Marvin Minsky, an environment of great intellectual intensity and freedom;

- to Bill Martin, who will never let me forget how hard the problem really is;

- to Bob Moore and Chuck Rieger, for long talks spent trying to talk me out of various bad ideas;

- to Mike Genesereth, Gerry Sussman, and Mike Brady, for valuable suggestions at crucial phases of this research;

- to Bill Woods and Susumu Kuno, my undergraduate teachers, who taught me how to look at natural language; and especially

- to Craig Thiersch, who first introduced me to much of the linguistic theory that underlies this paper, for many long discussions and tutorials on various points of that theory;

- to "The Natural Language Group", Candy Bullwinkle, Kurt VanLehn, Dave McDonald, and Beth Levin, for innumerable suggestions, as well as for helping me to get an education; I also thank Kurt VanLehn for recoding part of the parser, and Beth Levin for allowing me to quote extensively from her working paper;

- to Candy, Kurt, Dave, and especially Beth, and to Chuck Rich, Brian Smith, Bill Martin, and Bob Woodham for unflagging support and enthusiasm;

- and to Susie, with love, for accepting additional burdens while I completed this work and, most importantly, simply for being there, and to Joshua, who kept his father sane without even trying.
Stranger: Then since some will blend, some not, they might be said to be in the same case with the letters of the alphabet. Some of these cannot be conjoined; others will fit together.
Theatetus: Of course.
Stranger: And the vowels are specially good at combination - a sort of bond pervading them all, so that without a vowel the others cannot be fitted together.
Theatetus: That is so.
Stranger: And does everyone know which can combine with which, or does one need an art to do it rightly?
Theatetus: It needs art.
Stranger: And that art is?
Theatetus: Grammar.

-Plato, The Sophist
Translated by F.M. Cornford

TABLE OF CONTENTS

1. Introduction
    The Determinism Hypothesis
    A Note on Methodology
    The Notion of "Strictly Deterministic"
    The Structure of the Parser - Motivation
    The Structure of the Parser - an Overview
    Is This Machine Really Strictly Deterministic?
    The Limits of This Research
    The Structure of This Paper

2. Historical Perspective
    Introduction
    The Classical Solution - Hypothesis-Driven Search
    A Critique of Hypothesis-Driven Parsing
    Summary

3. The Grammar Interpreter
    PARSIFAL's Major Data Structures
    The Active Node Stack
    Nodes
    The Buffer
    The Buffer as a Virtual Machine
    A Buffer with Offset
    The Buffer as Utilized by the Parser
    Building Structure is Irrevocable
    The Buffer and the Stack Reflect Different Aspects of the Parser's Operation
    A Snapshot of the Parser
    The Basic Operations of the Parser

4. The Structure of the Grammar
    Introduction
    The Grammar
    Grammar Rules
    Packets of Rules
    The Packeting Mechanism is Redundant
    An Example
    Parsing a Simple Declarative Sentence - A Small Grammar

5. Capturing Linguistic Generalizations
    Introduction
    The General Grammatical Framework
    Winograd's Use of Features
    Traces and Annotated Surface Structure
    The Grammar Elegantly Captures Linguistic Generalizations
    Parsing a Yes-No Question
    Parsing an Imperative
    Parsing Passives - A Simple Example
    Passives and Embedded Complements
    On the Discovery of These Generalizations

6. The Grammar Interpreter and Chomsky's Constraints
    Introduction
    An Outline of Chomsky's Theory
    The Structure Preserving Hypothesis
    Subjacency
    The Specified Subject Constraint
    The Tensed S Constraint
    The Constraints Imposed by the Grammar Interpreter
    The Specified Subject Constraint and the Grammar Interpreter
    The L-to-R Constraint and the Lowering Lemma
    Subjacency and the Grammar Interpreter
    These Arguments as Evidence in Support of the Determinism Hypothesis

7. Parsing Relative Clauses - Ross's Complex NP Constraint
    Introduction
    An Analysis of "Wh-clauses"
    Ross's Complex NP Constraint
    The CNPC and the Determinism Hypothesis
    Chomsky's Analysis - a Comparison
    Woods' Hold-list Mechanism - A Comparison

8. Parsing Noun Phrases - Extending the Grammar Interpreter
    Introduction
    Attention Shifting Rules - Motivation
    Attention Shifting - Implementation
    The ATN PUSH Arc - A Comparison
    An Example of Attention Shifting
    Another Example of Attention Shifting
    Why There Must be a Stack of Buffer Pointers
    AS Rules and the Length of the Buffer
    In Conclusion

9. Differential Diagnosis and Garden Path Sentences
    Introduction
    A Theory of Re-analyzing Garden Path Sentences
    Diagnosing Between Imperatives and Yes/No Questions
    Garden Paths and a Three Constituent Window
    John Lifted A Hundred Pound Bags.
    Most Garden Paths Cause No Difficulty in Spoken Language
    Diagnostic vs. Non-diagnostic Rules

10. On the Necessity of Some Semantic/Syntactic Interactions
    Introduction
    Historical Perspective
    The Problem of Wh-Questions and Indirect Objects
    Some Inadequate Explanations
    The Notion of Syntactic and Semantic Biases
    A Solution to the Problem
    Conclusions

11. CONCLUSIONS
    Chomsky's Constraints: Competence or Performance Phenomena?
    In Summary
    Directions for Future Work

Bibliography

APPENDICES
    A. An Approach to Noun-Noun Modification
    B. The Grammar Language
    C. Grammar Rules Referenced in the Text
    D. The Current Grammar
    E. The Case Frame Interpreter

CHAPTER 1

INTRODUCTION

The Determinism Hypothesis

All current natural language parsers that are adequate to cover a wide range of syntactic constructions operate by simulating nondeterministic machines, either by using backtracking or by pseudo-parallelism. On the face of it, this seems to be necessary, for a cursory examination of natural language reveals many phenomena that