University of Pennsylvania ScholarlyCommons

IRCS Technical Reports Series Institute for Research in Cognitive Science

October 1993

A Computational Model of Syntactic Processing: Ambiguity Resolution from Interpretation

Michael Niv University of Pennsylvania

Follow this and additional works at: https://repository.upenn.edu/ircs_reports

Niv, Michael, "A Computational Model of Syntactic Processing: Ambiguity Resolution from Interpretation" (1993). IRCS Technical Reports Series. 184. https://repository.upenn.edu/ircs_reports/184

University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-93-27.

This paper is posted at ScholarlyCommons. https://repository.upenn.edu/ircs_reports/184 For more information, please contact [email protected].

A Computational Model of Syntactic Processing: Ambiguity Resolution from Interpretation

Abstract

Syntactic ambiguity abounds in natural language, yet humans have no difficulty coping with it. In fact, the process of ambiguity resolution is almost always unconscious. It is not infallible, however, as example 1 demonstrates.

1. The horse raced past the barn fell.

This sentence is perfectly grammatical, as is evident when it appears in the following context:

2. Two horses were being shown off to a prospective buyer. One was raced past a meadow and the other was raced past a barn.

Grammatical yet unprocessable sentences such as 1 are called 'garden-path sentences.' Their existence provides an opportunity to investigate the human sentence processing mechanism by studying how and when it fails. The aim of this thesis is to construct a computational model of language understanding which can predict processing difficulty. The data to be modeled are known examples of garden path and non-garden path sentences, and other results from psycholinguistics.

It is widely believed that there are two distinct loci of computation in sentence processing: syntactic parsing and semantic interpretation. One longstanding controversy is which of these two modules bears responsibility for the immediate resolution of ambiguity. My claim is that it is the latter, and that the syntactic processing module is a very simple device which blindly and faithfully constructs all possible analyses for the sentence up to the current point of processing. The interpretive module serves as a filter, occasionally discarding certain of these analyses which it deems less appropriate for the ongoing discourse than their competitors.

This document is divided into three parts. The first is introductory, and reviews a selection of proposals from the sentence processing literature. The second part explores a body of data which has been adduced in support of a theory of structural preferences - one that is inconsistent with the present claim. I show how the current proposal can be specified to account for the available data, and moreover to predict where structural preference theories will go wrong. The third part is a theoretical investigation of how well the proposed architecture can be realized using current conceptions of linguistic competence. In it, I present a parsing algorithm and a meaning-based ambiguity resolution method.


This thesis or dissertation is available at ScholarlyCommons: https://repository.upenn.edu/ircs_reports/184

The Institute For Research In Cognitive Science

A Computational Model of Syntactic Processing: Ambiguity Resolution from Interpretation (Ph.D. Dissertation)

by Michael Niv

University of Pennsylvania, Philadelphia, PA 19104-6228

October 1993

Site of the NSF Science and Technology Center for Research in Cognitive Science

University of Pennsylvania, founded by Benjamin Franklin in 1740. IRCS Report 93-27.

A Computational Model of Syntactic Processing

Ambiguity Resolution from Interpretation

Michael Niv

A Dissertation

in

Computer and Information Science

Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Mark J. Steedman

Supervisor of Dissertation

Mark J. Steedman

Graduate Group Chairperson

© Copyright

Michael Niv

Acknowledgements

My deepest feelings of gratitude and indebtedness are to my advisor and mentor, Mark Steedman. Mark has taught me not just linguistics and cognition, but also about thinking and behaving like a scientist. He tirelessly and carefully read through each of the many, many drafts of papers I have given him, including this thesis, providing meticulous and insightful comments, and spent countless hours explaining and debating these comments with me. I would guess that every single point I discuss in this thesis reflects his contributions.

This thesis, like me, is a product of a community. The Cognitive Science community at Penn brings forth a beautifully dissonant buzz of intellectual activity. The diversity of outlooks, paradigms, methods, results and opinions about the study of the mind provides an ideal environment in which to go shopping for one's own direction. It is impossible to identify the individual members and visitors to IRCS (Institute for Research in Cognitive Science) that have shaped my perspective, so I'll just thank the two people who are responsible for enabling the tremendous growth of IRCS during my stay at Penn: Aravind Joshi and Lila Gleitman.

I am very grateful to my doctoral committee, Janet Fodor, Aravind Joshi, Mitch Marcus and Ellen Prince, for their helpful comments on this document; in particular to Ellen and Janet for long and fruitful discussions about the first parts of the thesis, and to Mitch for making the Penn Treebank available to me.

My colleagues Breck Baldwin, Barbara Di Eugenio, Bob Frank, Dan Hardt, Jamie Henderson, Young-Suk Lee, Owen Rambow, Phil Resnik, Robert Rubinoff, Lyn Walker, Beth Ann Hockey and many other members of the Computational Linguistics group at Penn provided me with much help, support and friendship throughout my graduate studies. IRCS postdocs have been particularly helpful: I have learned a great deal from Sandeep Prasada, Jeff Siskind and Mark Hepple.

This work has benefitted greatly from suggestions and advice by Ellen Bard, Julie Boland, Lyn Frazier, Susan Garnsey, Mark Johnson, Robert Ladd, Don Mitchell, Dick Oehrle, Stu Shieber, Val Tannen, Henry Thompson, Amy Weinberg, Steve Whittaker and Bill Woods. Rich Pito was very helpful in extending his treebank searching program tgrep to accommodate my needs.

I am grateful to Bonnie Webber for offering me my first opportunity to do research (the TraumAID project) and for her financial support during my first few years at Penn. The research reported in this document was supported by the following grants: DARPA N J, ARO DAAL C, NSF IRI, Ben Franklin SC. I thank the Computational Linguistics faculty members for their sustained financial support. My thanks also go to Jow and Oyi, Ali Baba, Yue Kee and the Kims for sustenance.

I am very grateful to Barbara, Ramesh, Bob, Sandeep, and to the official members of the late night Wawa crew, Patrick, Anuj, Tilman, and especially to Young-Suk, for the beginnings of lifelong friendships.

I've had a few really wonderful teachers: Ruta in kindergarten, Mme O'Connor for French in high school, Joe O'Rourke in college, and Val Tannen in graduate school. They each possess the rare and precious gift of sparking curiosity and intellectual excitement in their students.

My most important teachers have been my parents, Yaffa and Avigdor. I thank them and my two sisters, Adi and Tamar, for encouraging my curiosity, and coming to terms with just how curious I have become.

Abstract

A Computational Model of Syntactic Processing

Ambiguity Resolution from Interpretation

Michael Niv

Mark J. Steedman (Supervisor)

Syntactic ambiguity abounds in natural language, yet humans have no difficulty coping with it. In fact, the process of ambiguity resolution is almost always unconscious. It is not infallible, however, as example 1 demonstrates.

1. The horse raced past the barn fell.

This sentence is perfectly grammatical, as is evident when it appears in the following context:

2. Two horses were being shown off to a prospective buyer. One was raced past a meadow and the other was raced past a barn.

Grammatical yet unprocessable sentences such as 1 are called garden-path sentences. Their existence provides an opportunity to investigate the human sentence processing mechanism by studying how and when it fails. The aim of this thesis is to construct a computational model of language understanding which can predict processing difficulty. The data to be modeled are known examples of garden path and non-garden path sentences, and other results from psycholinguistics.

It is widely believed that there are two distinct loci of computation in sentence processing: syntactic parsing and semantic interpretation. One longstanding controversy is which of these two modules bears responsibility for the immediate resolution of ambiguity. My claim is that it is the latter, and that the syntactic processing module is a very simple device which blindly and faithfully constructs all possible analyses for the sentence up to the current point of processing. The interpretive module serves as a filter, occasionally discarding certain of these analyses which it deems less appropriate for the ongoing discourse than their competitors.

This document is divided into three parts. The first is introductory, and reviews a selection of proposals from the sentence processing literature. The second part explores a body of data which has been adduced in support of a theory of structural preferences, one that is inconsistent with the present claim. I show how the current proposal can be specified to account for the available data, and moreover to predict where structural preference theories will go wrong. The third part is a theoretical investigation of how well the proposed architecture can be realized using current conceptions of linguistic competence. In it, I present a parsing algorithm and a meaning-based ambiguity resolution method.

Contents

Acknowledgements
Abstract
Introduction
Previous Work
Memory Limitation
Representing an Analysis
Representing Competing Analyses
Deliberation before Ambiguity Resolution
Shieber and Pereira
Syntactic Optimality
Lexical Association
Explicit Consideration of Syntactic Choices
The Weakly Interactive Model
Discussion
The Central Claim
Accounting for Recency Phenomena
Right Association
Information Volume
A study
A Potential Practical Application
Information Volume and Sensibleness
Conclusion

Other Constructions
Available Evidence
NP vs S complement
Late Closure Ambiguity
Degree of Disconnectedness
A Representation of Semantics
A Formal Definition
Consequences
Avoid New Subjects
Given and New
Consequences
Summary
Parsing CCG
Goal
Steedman's Proposal
Strict Competence and Asynchronous Computation
Synchronous and Asynchronous Computation
Evaluation
Identifying Ungrammaticality
Shift-Reduce Conflicts
Heavy Shift and Incremental Interpretation
Coping with Equivalent Derivations
Evaluation Criteria for a Parser
Previous Attempts
A Proposal
Using the Recovered Constituent
Summary

A Computer Implementation
Desiderata
Data Structure
Control Structure
Bottom-Up Reduce Algorithm
Buffer Admissibility Condition
Interpretation
Real World Implausibility
Definite
Detecting the End of a Phrase
An Example
A Consistent Theory of Penalties
Desired Behavior
Fitting The Data
A Prediction
Summary
Conclusion
A Data from Avoid New Subject investigation
B A Rewrite System for Derivations
Bibliography

List of Figures

Main-verb and reduced-relative-clause analyses of
An interactive sentence-processing architecture
A circuit for computing z xy y (from Shieber and Johnson)
Recombining a recovered constituent with a rightward looking modifier
System Diagram
Definite Reference Resolution Algorithm
B Schema for one redex in DRS
B Normal form computation of a quasi-right-chain
B When two redexes are not independent
B Why DRS is weakly Church-Rosser
B One application of ctr

Chapter

Introduction

The question I address here is how people deal with the linguistic ambiguity which pervades natural language discourse. I focus on syntactic ambiguity. The task is to construct a detailed theory of the sentence processing mechanism: its components, and the nature and dynamics of their interaction.

The data to be accounted for are measurements of processing difficulty (or lack thereof) in various sentence types, both in and out of context. Current methods of measuring processing difficulty are often crude (e.g. naive understandability judgements for a list of sentences) or at best indirect (e.g. spatially diffuse EEG response patterns, and chronometric measurement of cross-modal lexical, self-paced word-by-word reading, eye-movement). Nevertheless, observations of processing difficulty very often show remarkable and unmistakable regularity. This regularity is the data to be explained.

Many models of human sentence processing have been put forth. Most try to account for processing difficulty by positing some properties of the parsing component of the linguistic cognitive apparatus.

Frazier and Fodor, and Marcus, are well known examples which attempt to derive a wide variety of phenomena from memory limitations in the processor.

Theories have also been proposed in which the parser embodies a preference for certain analyses over certain others. Frazier and her colleagues have advocated preferences for certain structural configurations. Pritchett has argued for a preference for maximizing the degree to which the principles of grammar are satisfied at every step of the parsing process.

Distinct from these parser-based theories of processing difficulty is a theory advocated by Crain, Steedman and Altmann (CSA hereinafter) which ascribes the locus of ambiguity resolution preferences to higher-level interpretive components, as opposed to the lower-level syntactic parsing component. CSA describe this architecture in broad terms and apply it in detail to a fairly narrow class of phenomena, essentially modifier attachment ambiguity. In this dissertation I argue for a conception of the syntactic processor which is a generalization of CSA's proposal.

Footnote: In this work I do not address the strength of processing difficulty effects, nor the issue of how humans cope with processing difficulty, e.g. by rereading the offending passage. The aim is solely to account for those linguistic environments which give rise to processing difficulty.

My claim is that the syntactic processor is the simplest imaginable: all it represents is syntactic analyses of the input. It is not responsible for resolution of ambiguity; that task is performed by the interpreter.

This document is divided into three parts. The first is introductory, and reviews a selection of proposals from the sentence processing literature, much of which implicitly assumes a specialized syntactic processor. It concludes with a detailed statement of the central claim of the dissertation. The second part (chapters 3 and 4) explores a body of data which has been adduced in support of a theory of structural preferences, one that is inconsistent with the present claim. In these chapters, I show how the current proposal can be specified to account for the available data, and moreover to predict where structural preference theories will go wrong. The third part (chapters 5 and 6) is a theoretical investigation of how well the proposed architecture can be realized using current conceptions of linguistic competence. Chapter 5 addresses issues of parsing: it is an attempt to carry out Steedman's program of simplifying the theory of the parser by adopting a competence grammar which defines more incremental analyses than other grammars. Chapter 6 is a synthesis of the parser developed in chapter 5 and the competence-based ambiguity resolution criteria developed in previous chapters. It describes an implemented computer model intended to demonstrate the viability of the central claim. Chapter 7 provides a conclusion and suggests areas of further research.

Chapter

Previous Work

In this chapter I review a selected sample of the sentence processing literature. Of the many issues which any proposed model of human sentence processing must address, I focus on two: the role of memory limitations, and the extent of deliberation which precedes ambiguity resolution. The reader is referred to Gibson for a general review and cogent critique of the literature.

Memory Limitation

Considering that the process of sentence understanding is successfully implemented by the computational mechanism of the human brain, one may ask about the nature of the architectural features of this computational device: what is the relation among the various subcomponents (lexical, syntactic, and interpretive processes), and what sorts of limitations are imposed on computational and memory resources by the finite hardware dedicated to the task? I begin with the latter question and focus on memory limitations.

Representing an Analysis

The most familiar demonstration that the processing system does not find all grammatically possible analyses of a string with equal ease is the classic example from Chomsky and Miller:

The rat that the cat that the dog bit chased died.

Miller and Chomsky accounted for this in automaton-theoretic terms: the processor cannot be interrupted while processing a constituent of type X to process another constituent of type X. More recent work (Gibson; Joshi; Rambow and Joshi) considers a variety of constructions in English and German which give rise to center-embedding-like effects and comes to similar, though not identical, conclusions: as it proceeds incrementally through the input string, the underlying automaton is incapable of maintaining a large number of separate pieces of the input which are not integrated together. I return to this issue later. Difficulty with sentences such as the center-embedded example above arises independently of syntactic ambiguity; such sentences indicate an inherent limitation in the processor in representing the linguistic structure which they require. The conclusion that this difficulty results from memory constraints in the processor is unchallenged in the recent literature, as far as I know.

Representing Comp eting Analyses

The question of whether memory limitations are responsible for another form of processing difficulty, namely so-called garden path sentences, as in the following example, is much more controversial.

The horse raced past the barn fell.

With this sentence, there is no question that the processor is capable of representing the necessary linguistic structure: the grammatically identical sentence below causes no processing difficulty.

The horse ridden past the barn collapsed.

Authors such as Frazier and Fodor, and Marcus (see Mitchell, Corley and Garnham, and Weinberg, for more recent incarnations of the two works, respectively) have argued that when the processor encounters the local (temporary) ambiguity at the word raced in The horse raced past the barn fell, it is incapable of keeping track of both available analyses of the input until the arrival of the disambiguating information. That is, memory limitations force a commitment. Other authors (Crain and Steedman; Altmann and Steedman; McClelland, St. John and Taraban; Gibson; Pritchett; Spivey-Knowlton, Trueswell and Tanenhaus; inter alia) have argued that the processor considers all grammatically available analyses and picks among the alternatives according to certain preferences; these authors differ widely about what the preferences are. I now consider a few of these papers in more detail.

The Sausage Machine

Frazier and Fodor proposed an architecture for the syntactic processor whose central characteristic is a stage of processing whose view of the input is limited. Their proposal is that the sentence processing mechanism is comprised of two modules:

The Preliminary Phrase Packager (PPP) is a 'shortsighted' device, which peers at the incoming sentence through a narrow window which subtends only a few words at a time. It is also insensitive in some respects to the well-formedness rules of the language. The Sentence Structure Supervisor (SSS) can survey the whole phrase marker for the sentence as it is computed, and it can keep track of dependencies between items that are widely separated in the sentence and of long-term structural commitments which are acquired as the analysis proceeds.

Interesting predictions of processing difficulty arise for situations where the PPP imposes the incorrect bracketing (or chunking) on a substring of the input. Frazier and Fodor characterize the PPP as having a memory size of roughly six words, and attempting at any point to group as many items as it can into a single phrasal package. Aside from predicting difficulties with center-embedded sentences (e.g. in the rat example above, the PPP might try to chunk the rat that the cat into one package) their account makes interesting predictions with respect to modifier attachment. Consider:

We went to the lake to swim quickly.

Their account predicts that the PPP will attempt to structure quickly with the material immediately to its left, namely to swim, rather than with went. This prediction is not made when the adverbial consists of more words, e.g.

We went to the lake to swim but the weather was too cold.

In this case, the adverbial clause beginning with but cannot fit into the PPP together with to swim, so the PPP puts the two constituents into separate packages, and the SSS has the opportunity to decide how to attach the three packages:

[We went to the lake] [to swim] [but the weather was too cold]

The time pressure under which the processor is operating (faced with quickly incoming words) leads Frazier and Fodor to make another prediction about attachment ambiguity resolution, namely that syntactically simplest analyses will be found first, thus preferred. This was formalized by Frazier:

Minimal Attachment: Attach incoming material into the phrase-marker being constructed using the fewest nodes consistent with the well-formedness rules of the language.

Minimal Attachment predicts that the main-verb analysis of raced in The horse raced past the barn fell will be initially pursued, as can be seen by the relative syntactic complexity of the main-verb and reduced-relative analyses in the figure below.

Minimal Attachment similarly predicts that the following sentences each give rise to a garden path.

a. The cop shot the spy with the binoculars.

b. The doctor told the patient that he was having trouble with to leave.

In more recent work, Frazier and her colleagues (Rayner, Carlson and Frazier; Frazier) propose a different modularization of the language processing faculty: the syntactic processor constructs a single analysis of the incoming words according to structurally defined criteria, such as Minimal Attachment above. The thematic processor considers the phrasal constituents that the syntactic processor has found and considers in parallel all the possible thematic combinations of these constituents. When it finds a better thematic combination than the one being constructed by the syntactic processor, it interrupts the syntactic processor, telling it to reanalyze the sentence.

[Figure: Main-verb and reduced-relative-clause analyses of the horse raced]

PARSIFAL

Marcus seeks to reconcile the apparent speed and efficiency of the human sentence processing mechanism with traditional parsing techniques (for ambiguous grammars), which are significantly slower. Standard parsing algorithms require time which is either polynomial or exponential in the length of the string, but humans do not require words to arrive more slowly as the input string (the sentence) becomes longer. Marcus concludes that the processor must be able to make all parsing decisions in a bounded amount of time (i.e. using a bounded number of processing steps). He proposes an automaton model which he calls Parsifal. This model is a production system which has a data store and a set of pattern-action rules. To achieve a bound on the amount of time required by the processor to make its move, Marcus bounds the portion of the processor's memory which is visible to the rules. The store has two components: a parse stack and a buffer of three cells, each capable of storing one constituent. The rules may only mention the syntactic category of the content of each of the cells, and (roughly) the top of the parse stack. The processor proceeds deterministically, in the sense that any structure it builds (by attaching constituents from the buffer into the stack) may not be destroyed. When the processor reaches an ambiguity, it may either resolve it, or it may leave one or more constituents uncombined in the buffer, provided there is room. If there is no room in the buffer for new constituents, the processor is forced to make a commitment, which may result in a garden path.
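
The interplay between the bounded buffer and forced commitment can be made concrete with a small Python sketch. It is a toy under loud assumptions: the class `BoundedBufferParser`, its method names, and the policy of always committing the oldest buffered constituent are hypothetical simplifications introduced here; Marcus's actual parser chooses its moves with pattern-action rules that are not modeled.

```python
from collections import deque


class BoundedBufferParser:
    """Toy store: a parse stack plus a fixed-size buffer of constituents."""

    def __init__(self, buffer_size=3):
        self.buffer_size = buffer_size
        self.stack = []            # committed structure; never undone
        self.buffer = deque()      # uncommitted constituents, oldest first
        self.forced_commitments = 0

    def visible_categories(self):
        """What a rule may inspect: stack-top category and buffer categories."""
        top = self.stack[-1][0] if self.stack else None
        return top, [category for category, _words in self.buffer]

    def attach_oldest(self):
        """Commit the oldest buffered constituent (assumed policy)."""
        self.stack.append(self.buffer.popleft())

    def read(self, category, words):
        """Buffer a new constituent, committing first if there is no room."""
        if len(self.buffer) == self.buffer_size:
            self.attach_oldest()
            self.forced_commitments += 1
        self.buffer.append((category, words))


parser = BoundedBufferParser()
for category, words in [("NP", "w1"), ("V", "w2"), ("NP", "w3"), ("PP", "w4")]:
    parser.read(category, words)
```

After the fourth constituent arrives, the buffer is full and the sketch is forced to commit once; in the account above, it is exactly such forced commitments that may turn out to be wrong and yield a garden path.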

An account of garden paths which is based strictly on the three-cell memory limit quickly runs into empirical difficulties. Pritchett provides the following examples, which can be resolved within a three-cell buffer but nevertheless appear to be garden paths (see Gibson for a detailed critique of Marcus's parser).

a. The boat floated quickly sank.

b. Without her money would be hard to find.

c. While Tegan sang a song played on the radio.

Minimal Commitment

Marcus, Hindle and Fleck propose an architecture which maintains the three-cell buffer of Marcus's earlier work, but factors the procedural pattern-action rules into a more elegant collection of structural description rules and an engine which applies them. While the rules of grammar are about direct dominance of nodes in the phrase marker, the processor maintains partial specifications by means of dominance statements and other devices. Preserving the determinism in Marcus's parser, their processor may not retract any assertions about the phrase marker that it is constructing. Weinberg adopts Marcus et al.'s proposal of partial descriptions of phrase structure but jettisons altogether the idea of a bounded buffer. Instead, Weinberg adopts an arguably less stipulative account of garden-path sentences:

Principle of Quick Interpretation: The parser attaches using the smallest number of dominance statements and features necessary to assign grammatically relevant properties.

This account predicts a garden path whenever the commitments necessitated by the Principle of Quick Interpretation turn out to be inconsistent with subsequent material. No garden path is predicted in cases where the commitment (i.e. partial description) constructed by the Principle of Quick Interpretation is consistent with the rest of the string.

For an illustration of Weinberg's parser, consider:

a. I knew Eric.

b. I knew Eric was a great guy.

Weinberg's account entails that neither a nor b is a garden path. This follows from the description that the processor builds after encountering the prefix I knew Eric:

[S [NP I] [VP [V knew] [NP Eric]]]

Here the links express dominance, not direct dominance, so the description is compatible with either the direct dominance interpretation of a, or with the analysis necessary for b, where an S node intervenes between the VP node and the NP node for Eric. This is in contrast with the following example.

Footnote: Weinberg's partial structural descriptions include statements of dominance, direct dominance, linear precedence and partial category specification using features.

After Mary ate the food disappeared from the table.

When the processor encounters the food, the Principle of Quick Interpretation commits it to the fact that the VP headed by ate dominates the NP the food. This commitment is inconsistent with the rest of the string, so a garden path is correctly predicted.

Weinberg's proposal is that the sentence processor's working memory is limited to hold exactly one structural representation. Unlike Frazier and Fodor's and Marcus's proposals, this limitation is confined to the representation of ambiguity: Weinberg's memory limitation makes no predictions about difficulty with center embedding.

The three proposals above all share the fundamental property that the processor pursues only one analysis at a time. This has been called serial processing, as well as determinism. Standing in contrast to serial processing are proposals that the processor constructs representations for the various ambiguous analyses available at any point. Of the many parallel proposals in the literature, I shall review only two: Gibson's proposal of processing load and breakdown, and the parallel weak-interaction model of Crain, Steedman and Altmann (CSA: Crain and Steedman; Altmann and Steedman).

Gibson

Gibson proposes that the human sentence processing mechanism pursues all grammatically available analyses in parallel as it processes the string, discarding those analyses which are too costly; that is, when the cost of one analysis, A, exceeds that of another analysis, B, by more than P Processing Load Units, A is discarded, necessitating conscious effort to reconstruct should it be subsequently necessary. The cost of an analysis is the sum of Processing Loads which it incurs by virtue of having certain memory-consuming properties. Within Gibson's model, a theory of sentence processing consists in a precisely defined collection of memory-consuming properties and a numeric cost associated with each. Considering a variety of data (mostly introspective judgements of the processing difficulty of sentences), Gibson proposes a collection of four memory-consuming properties: three have to do with failures to identify the relations among the various constituents in the string (cf. Chomsky's Principle of Full Interpretation); the fourth property associates a cost with the need to access a constituent which is not the most recent. Gibson concentrates on syntactic properties, which he considers the most tractable to investigate. He acknowledges that a complete theory of sentence processing would likely require augmenting his set of properties with lexical, semantic, pragmatic, and discourse-level properties which are associated with significant processing loads.
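
The discarding rule just described amounts to a threshold comparison against the cheapest surviving analysis. A minimal Python sketch follows; the function name `prune`, the dictionary encoding, and every numeric load are placeholders assumed for illustration, not Gibson's actual properties or values.

```python
def prune(analyses, threshold):
    """Keep analyses whose total cost is within `threshold` of the cheapest.

    `analyses` maps an analysis name to a list of processing loads (in
    Processing Load Units); the cost of an analysis is the sum of its loads.
    """
    costs = {name: sum(loads) for name, loads in analyses.items()}
    cheapest = min(costs.values())
    return {name: cost for name, cost in costs.items()
            if cost - cheapest <= threshold}


# Arbitrary placeholder loads, chosen only to exercise both outcomes.
far_apart = prune({"A": [3, 2], "B": [1]}, threshold=2)
close = prune({"A": [2], "B": [1]}, threshold=2)
```

In the first call the costs differ by more than the threshold, so only the cheaper analysis survives; in the second both are retained and the processor continues in parallel.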

Crain Steedman and Altmann

Crain and Steedman, and Altmann and Steedman, report a collection of experiments which militate against a model in which the syntactic processor operates in a serial (or deterministic) fashion. Consider the local ambiguity in a, illustrated in b and c.

a. The psychologist told the wife that

b. The psychologist told the wife that he was having trouble with her husband.

c. The psychologist told the wife that he was having trouble with to leave her husband.

Footnote: One could argue that Frazier et al.'s model is a mix of serial (syntactic) and parallel (thematic) processing, but what is relevant here is the question of whether the initial syntactic analysis is carried out in serial or parallel.

A model where the syntactic processor operates serially would predict that the ambiguity would be resolved on some structural grounds (e.g. Minimal Attachment), presumably toward the complementizer analysis of that, as in b, not the relativizer analysis in c. This resolution would occur independently of the meaning of the constituents in question. But Crain and Steedman found that, depending on compatibility with the discourse context, the processor can be made to select either analysis. When there were two wives in the discourse context, b was a garden path, reflecting a commitment toward a further restrictor on the set of candidate referents. When there was one wife in the discourse, c was a garden path. This basic finding was replicated using a different ambiguous structure and methodologies by Altmann and Steedman, Sedivy and Spivey-Knowlton, and Altmann, Garnham and Dennis. Given the sensitivity to the meaning of the various alternatives, CSA argue that the processor must be explicitly weighing the sensibleness of the alternatives. It follows that the interpreter receives representations in parallel from the syntactic processor of all available syntactic analyses.
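
The dependence of the preferred analysis on the number of candidate referents can be caricatured in Python. This is a deliberately crude sketch: the function `choose_analysis`, the pair encoding of discourse entities, and the decision to express no preference when nothing matches are all assumptions made here; CSA's notion of weighing sensibleness is far richer than a referent count.

```python
def choose_analysis(noun, context):
    """Pick the reading of 'the <noun> that ...' that fits the discourse.

    `context` is a list of discourse entities given as (name, noun) pairs.
    """
    candidates = [name for name, kind in context if kind == noun]
    if len(candidates) == 1:
        # The simple definite NP already picks out a unique referent.
        return "complement"
    if len(candidates) > 1:
        # A further restrictor is needed to single out one referent.
        return "relative-clause"
    return None  # no candidate referent: this sketch expresses no preference


one_wife = [("w1", "wife"), ("p1", "psychologist")]
two_wives = [("w1", "wife"), ("w2", "wife"), ("p1", "psychologist")]

with_one = choose_analysis("wife", one_wife)
with_two = choose_analysis("wife", two_wives)
```

With one wife in the context the sketch favors the complement reading, so the relative-clause continuation would be the surprising one; with two wives the preference reverses, matching the pattern of garden paths reported above.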

Neither Gibson nor CSA discuss explicit b ounds on the numb er of analyses that are maintained

by the pro cessor at any time This just means that unlike Marcuss prop osal and the Sausage

machine it is only the preference criteria themselves not the memory b ounds that b ear the

y resolution b ehavior It must b e emphasized that neither paral explanatory role for ambiguit

lel mo del ab ove requires that the pro cessor b e able to represent the p otentially exp onentially

proliferating set of ambiguous analyses for a multiply ambiguous string whenever the pro ces

sors preference reaches some threshold it discards the lesspreferred analyses thus keeping the

size of analysisset manageable Indeed most are resolved very quickly making the

pro cessor app ear as if it op erates serially There is additional exp erimental evidence in supp ort

of a parallel mo del of the sentence pro cessor
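The threshold-based pruning just described can be sketched roughly as follows. This is only an illustration under stated assumptions: the numeric preference scores, the ratio-based `threshold`, and the function name `prune` are hypothetical and are not taken from Gibson's or CSA's proposals.

```python
from typing import Dict


def prune(scores: Dict[str, float], threshold: float = 2.0) -> Dict[str, float]:
    """Discard analyses that the best analysis out-scores by more than `threshold` times.

    Assumption of this sketch: preferences are non-negative numbers, higher is better.
    """
    if not scores:
        return {}
    best = max(scores.values())
    return {name: s for name, s in scores.items() if s * threshold >= best}


# A strong preference: the less-preferred analysis is discarded.
strong = {"complement-clause": 0.9, "relative-clause": 0.2}
print(prune(strong))

# A weak preference: both analyses are carried forward in parallel.
weak = {"complement-clause": 0.5, "relative-clause": 0.4}
print(prune(weak))
```

Because pruning happens whenever the preference gap is large, the analysis set rarely holds more than a few members, which is why such a processor can look serial from the outside.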

Gorrell



Gorrell used a lexical decision task to show that both analyses of a temporarily ambiguous sentence are maintained; that is, the ultimately dispreferred analysis exerts an effect of lexical decision facilitation. With sentences such as those below, Gorrell used target words (is, has, must) which are consistent with the dispreferred 'complex' analysis, and found facilitation in both the Ambiguous and Complex conditions, but not for the unambiguous Simplex condition. Presentation of the sentences was interrupted at marked points for presentation of the target word.



Footnote: CSA address their arguments specifically against Minimal Attachment, but they apply to any other structural preference strategies, such as those in the proposals of Weinberg (above), Pritchett, and others.

Footnote: In the lexical decision task, in the middle of reading a sentence, the subject is presented with a word and has to quickly respond with whether it is a word of the English language. It has been argued (Wright and Garrett) that this task is facilitated if the target word fits in at the point in the sentence that the subject is processing.

NP/S Ambiguity
  Simplex:   It's obvious that Holmes saved the son of the banker right away.
  Ambiguous: It's obvious that Holmes suspected the son of the banker right away was guilty.
  Complex:   It's obvious that Holmes realized the son of the banker was guilty.

Main Verb/Participle Ambiguity
  Simplex:   The company was loaned money at low rates to ensure high volume.
  Ambiguous: The company loaned money at low rates to ensure high volume decided to begin expanding.
  Complex:   The company they loaned money at low rates decided to begin expanding.

Hickok, Pickering and Nicol

Additional experiments by Hickok, and by Nicol and Pickering, confirm Gorrell's findings. Working independently, these researchers considered the local ambiguity used by CSA, discussed above. Using the method of antecedent reactivation, they found that the analysis which is strongly dispreferred to the complement clause analysis is still active and causes reactivation of the WH trace at a marked position in the following sentence.

The girl swore to the dentist that a group of angry people called the office about the incident.

Hickok used visual, computer-paced presentation of the sentence, while Nicol and Pickering used cross-modal priming: the sentences were presented auditorily and the target word was presented visually. Results from the two experiments consistently show reactivation of the WH antecedent. This result is quite surprising given the remarkable extent to which subjects are garden-pathed when faced with a string such as the one below. It suggests that dispreferred analyses are not discarded outright; they just fade away.

The girl swore to the dentist that a group of angry people called that she was going to quit.

MacDonald, Just and Carpenter

MacDonald, Just and Carpenter argue that how quickly dispreferred analyses fade away is subject to individual variations in short-term memory. MacDonald et al. rated their subjects on their performance on the Reading Span Task, a task in which the subject reads a list of unrelated sentences, keeping track of the last word in each sentence. At the end of the list, the subject must recall the final words. Subjects vary substantially on the length of the list for which they can perform the task accurately. Score on this task is positively correlated with a variety of language performance scores, including SAT verbal score. The theory that MacDonald et al. propose is that high-span subjects maintain ambiguities for longer periods of time. This theory makes the interesting and counterintuitive prediction that for locally ambiguous sentences which are disambiguated consistently with the preferred analysis, high-span readers would have to work harder than low-span readers, since they would also be maintaining the doomed, non-preferred analysis. This is indeed what they found. They compared the locally ambiguous sentence in (a) to an unambiguous control (b) and to the non-garden-pathing main-verb analysis in (c).

Footnote: In antecedent reactivation, at the position of a WH trace, the lexical decision times for words which are semantically related to the antecedent of the trace are facilitated (Swinney et al.). See Fodor for a review.

a. The experienced soldiers warned about the dangers conducted the midnight raid.
b. The experienced soldiers who were told about the dangers conducted the midnight raid.
c. The experienced soldiers warned about the dangers before the midnight raid.

They found that high-span readers could cope better with the ambiguity in (a). On true/false reading comprehension questions, high-span readers performed better than low-span readers, who were almost at chance. This confirms the relevance of the reading span task to some aspects of reading ability. More interestingly, MacDonald et al. found that for the main-verb sentences, as in (c), high-span readers took significantly more time to read the last word of the sentence. For high-span readers there was a very slight elevation in the reading time of the ambiguous region (warned about the dangers) in the ambiguous sentences (a) and (c), as compared to the locally unambiguous (b). This is clear evidence of the additional burden which maintaining the possibility of a reduced-relative analysis imposes on high-span readers. Slight though this effect is, it does constitute an online measure of the cost of maintaining multiple analyses in parallel.

Summary

The existence of garden path sentences leads to the inescapable conclusion that not all syntactic analyses are maintained indefinitely. The stronger conclusion, that multiple syntactic analyses are never retained from word to word, is inconsistent with three sorts of psycholinguistic evidence:

1. The meanings of the various competing analyses are compared, hence computed, requiring the identification of syntactic relations. Note that proposals such as the thematic processor of Frazier and her colleagues do not specify how the interpretive module can identify which of the many possible relations among constituents are potentially allowed by grammatical analyses of the string which the processor has not chosen. Different languages impose different restrictions on which constituents may be combined, so syntactic analysis must precede interpretation.

2. The discarded reading still manifests certain signs of life on sufficiently sensitive tests such as the lexical decision task.

3. For readers who show signs of coping better with ambiguity, the dispreferred reading exacts a measurable processing cost.

Footnote: This effect (the slight reading-time elevation for high-span readers) reached statistical significance only when data from many experiments with slightly different conditions were pooled together.

Deliberation before Ambiguity Resolution

Aside from memory limitations assumed by a model, another dimension along which the various proposals vary is the nature and amount of computation that precedes ambiguity resolution. The two logically extreme positions have each been advocated: that any processing whatsoever, including arbitrarily complex inference, can precede ambiguity resolution; and that ambiguity is not even identified by the processor online, let alone deliberated on. Some papers advocate intermediate positions. In this section, I present a few papers, arranged in approximately increasing order of amount of pre-resolution deliberation.

Shieber and Pereira



Shieber and Pereira propose a technique for constructing a deterministic automaton given a potentially nondeterministic grammar. The automaton's memory consists of a stack of symbols (grammatical categories) and a register which stores the name of one of a bounded number of states which the automaton is in. It is equipped with a precompiled action table which completely determines what move it should take next (add or remove items from the stack, change the state it is in) based on the current state, the next word in the input string, and the top-of-stack symbol. This action table is constructed from a grammar using a well-known grammar compilation technique, LR parsing (Aho and Johnson). If the grammar is locally ambiguous, the compilation technique results in certain entries in the action table containing sets of actions, each corresponding to a different analysis. Shieber and Pereira show how structural preference strategies such as Minimal Attachment and Lexical Preference (see below) can be used to resolve such indeterminacies in the action table at compile time. The resulting deterministic automaton will therefore follow the path of action consistent with the minimally attached reading, and not even detect the possibility of another analysis.
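The idea of resolving action-table conflicts at compile time can be sketched as below. This is a toy illustration, not Shieber and Pereira's actual construction: the table entries, the state names, and the `prefer_shift` ranking are hypothetical stand-ins for whatever structural preference strategy the grammar compiler applies.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Action = Tuple[str, str]   # e.g. ("shift", "s7") or ("reduce", "VP -> V NP")
Key = Tuple[str, str]      # (automaton state, category of the next word)


def determinize(table: Dict[Key, FrozenSet[Action]],
                rank: Callable[[Action], int]) -> Dict[Key, Action]:
    """Collapse every set-valued entry to its single best-ranked action.

    Sorting first makes the choice reproducible when two actions tie on rank.
    """
    return {key: min(sorted(actions), key=rank) for key, actions in table.items()}


def prefer_shift(action: Action) -> int:
    """Hypothetical preference used only for this sketch: shifts outrank reductions."""
    return 0 if action[0] == "shift" else 1


# One conflicting entry (two candidate actions) and one unambiguous entry.
table = {
    ("s3", "P"): frozenset({("shift", "s7"), ("reduce", "VP -> V NP")}),
    ("s3", "$"): frozenset({("reduce", "VP -> V NP")}),
}

deterministic = determinize(table, prefer_shift)
print(deterministic[("s3", "P")])
print(deterministic[("s3", "$")])
```

After this step the automaton consults a table with exactly one action per entry, so at run time it has no record that an alternative analysis ever existed.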

Syntactic Optimality

A variety of proposals (Frazier and Fodor; Rayner, Carlson and Frazier; Weinberg; Pritchett; inter alia) posit structural preference criteria. None of these proposals concretely specifies the algorithm by which the processor finds the preferred parse. Presumably this involves some sort of search over the space of analyses possible for the input so far. For example, Frazier's Minimal Attachment principle could be made to fall out of a processor which tries to integrate the next word into the current phrase marker by trying all combinations in parallel and stopping as soon as it has found the first grammatical solution. In none of these proposals does any non-syntactic information enter into the process of determining the first-pass analysis.



Footnote: Written at roughly the same time.

Lexical Association

Ford, Bresnan and Kaplan argue that, aside from purely structural ambiguity resolution criteria, the processor is also sensitive to the strength of association between certain words, like verbs and the nouns they take as arguments. They conducted a questionnaire experiment in which they presented participants with an ambiguous sentence such as the following and asked them to identify which reading they got first.

The woman wanted the dress on the rack.

They found that by changing the main verb they could significantly alter the ambiguity resolution preferences observed. For example, the sentence above was resolved more often with the PP modifying dress than with the PP modifying wanted; however, when wanted was replaced with positioned, the preferences reversed. Ford et al. incorporate such preferences into a serial processing algorithm: their processor considers the set of possible rules at any point, applying both lexical preference and general, structurally stated rules to decide which rule to apply next.

Explicit Consideration of Syntactic Choices

The models of Marcus and Gibson explicitly reason about the various syntactic alternatives available at any point. Marcus's system contains rules for differential diagnosis of local structural ambiguity. These rules consider the current collection of constituents and decide how to combine them. Gibson's system explicitly constructs all grammatically available structures and applies preference metrics to adjudicate among them. While both systems adhere to solely syntactic criteria for ambiguity resolution, their authors acknowledge the need for certain meaning-based preferences in more complete or realistic versions of their work (Gibson; Marcus).

The Weakly Interactive Model

CSA argue that the syntactic processor constructs all grammatically available analyses and the interpreter evaluates these analyses according to meaning-based criteria. While the criterion they propose (given below) requires potentially very elaborate inferences to apply, their actual experiments rely on relatively easy-to-compute aspects of meaning.

Principle of Parsimony (Crain and Steedman)
If there is a reading that carries fewer unsatisfied but consistent presuppositions or entailments than any other, then, other criteria of plausibility being equal, that reading will be adopted as most plausible by the hearer, and the presuppositions in question will be incorporated in his or her mental model of the discourse.

In their experiments, Crain and Steedman presented a locally ambiguous sentence, such as the one below, in two different contexts, as exemplified in (a) and (b).

The psychologist told the wife that he was having trouble with to leave her husband.

a. One couple context: A psychologist was counseling a married couple. One member of the pair was fighting with him but the other was nice to him.
b. Two couple context: A psychologist was counseling two married couples. One of the couples was fighting with him but the other was nice to him.

The inferences which their subjects evidently were computing were, first, going from a married couple (or two) to a part of the couple, namely a wife; and second, determining whether the definite expression the wife referred uniquely, presumably by determining whether the cardinality of the set of wives was greater than one. In another experiment, Crain and Steedman found effects of plausibility in how often subjects garden-pathed on examples such as the following.

a. The teachers taught by the Berlitz method passed the test.
b. The children taught by the Berlitz method passed the test.

This is evidence that subjects use, online, the knowledge that teachers typically teach and children typically are taught. Again, one may argue that this sort of knowledge could conceivably be fairly directly represented and is very quick to access (see Resnik). Plausibility effects on the reduced-relative/main-verb ambiguity in these examples have since been found by many researchers (Pearlmutter and MacDonald; Trueswell, Tanenhaus and Garnsey; inter alia). Trueswell and Tanenhaus have found that subjects are sensitive to the temporal coherence of the discourse when parsing reduced relative clauses. For example, The student caught cheating... is more likely to be interpreted as a reduced relative when the discourse is in the future tense than when it is in the past tense.

Marslen-Wilson and Young (cited in Marslen-Wilson and Tyler) conducted an experiment which shows immediate effects of a rather complex inference process. They placed ambiguous phrases such as flying planes and visiting relatives in contexts which inferentially favor one of their two readings.

a. If you want a cheap holiday, visiting relatives...
b. If you have a spare bedroom, visiting relatives...

Subjects listened to an audio tape of these materials, and at the end of the fragment they were presented with a written word. Their task was to read the word out loud (the so-called cross-modal naming task). The words of interest were is and are, consistent with the (a) and (b) meanings, respectively. Marslen-Wilson and Young found significant effects of plausibility on subjects' reaction times, indicating that the relatively complex inference required is brought to bear on the immediately following word. It is not clear just how much inference is brought to bear on a word-by-word basis; this is due in part, no doubt, to our current inability to objectively assess the complexity of inference.

Discussion

There is a substantial and growing body of evidence in support of the claim that the human sentence processing mechanism consults a variety of information sources before it resolves ambiguities:

1. properties of particular lexical items, such as their preferred subcategorization frames (Ford, Bresnan and Kaplan; Garnsey, Lotocky and McConkie; Trueswell, Tanenhaus and Kello; Juliano and Tanenhaus)

2. semantic properties associated (more or less directly) with the words in the sentence (e.g. married couple and wife, teachers teach, from Crain and Steedman; cheap vacation and visiting relatives, from Marslen-Wilson and Young)

3. fit of the linguistic expression into the current discourse (e.g. definite reference (CSA), coherence of tense (Trueswell and Tanenhaus))

There is not, however, a consensus that the language processing architecture is indeed parallel and highly deliberative. Mitchell, Corley and Garnham argue that there are separate syntactic and thematic processors (see above): while the thematic processor does consider the meanings of the various combinations of the words in the string so far, the syntactic processor pursues only one analysis. The thematic processor can come to suspect that the syntactic processor may be pursuing the wrong analysis and alert it very quickly to change course. This quick-alert strategy, which Mitchell et al. refer to as 'stitch in time', can sometimes trick the processing system into a garden path. The consequence of this is that if one trains one's psycholinguistic measurement apparatus on the exact point in the process, one could catch the syntactic processor constructing the minimally attached analysis, only to have this analysis abandoned in favor of the contextually appropriate analysis a few hundred milliseconds later. This issue is currently being debated, with researchers on both sides refining their experimental techniques (see Altmann, Garnham and Dennis).



Footnote: For information such as verb subcategorization frame preferences, it is very hard to tease apart whether the information is associated with the lexical entry for the verb, or with the deeper representation of the concept (e.g. of the verb) and how it is associated to other concepts (e.g. its arguments) to which it is being related by the sentence. Current research on practical applications of natural language technology, in trying to avoid the complexity of knowledge representation, has been quite successful in assuming rich relations among words. Collecting lexical cooccurrence statistics from large text corpora, researchers (e.g. Hindle and Rooth) are able to construct ambiguity resolution algorithms which perform significantly better than ones based on hand-coded domain knowledge. In fact, it is surprising to see just how far cooccurrence-based statistical approaches to approximating natural language can go. Church presents an algorithm for determining the form-class of words in text. This algorithm is trained on hand-tagged text; it performs no syntactic analysis of its input; it only keeps track of the form-class frequency for each word and the frequency of consecutive form-class tags in text. Using this remarkably impoverished approximation of the linguistic phenomena of English, Church's algorithm was able to achieve high accuracy in form-class determination. The success of these algorithms can serve as a demonstration of how easy it is to 'cheat' by attributing complex behavior using association-based strength of representations of surface-observable objects such as words.

The Central Claim

The claim that I argue in this dissertation is that the parsing mechanism is a straightforward device which blindly and faithfully applies the knowledge of language (syntactic competence) directly to its input, allowing the interpretation module to impose its preferences in case of ambiguity. At each point in processing, the parser constructs all available syntactic analyses for the input thus far. The interpreter considers the set of available analyses and what each would mean, and selects a subset to discard. The parser deletes these analyses and extends the remaining ones with the next incoming word, repeating the process until it has either exhausted the input string or is stuck: none of the non-discarded analyses has a grammatical continuation in the next input word.
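The control loop in this claim can be sketched in a few lines. The sketch below is a toy under loud assumptions: analyses are reduced to tuples of category labels, and `LEXICON`, `ALLOWED`, `extend`, `keep_all`, and `prefer_main_verb` are hypothetical illustrations, not the grammar or the interpreter developed in this dissertation.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Analysis = Tuple[str, ...]   # toy analysis: the sequence of categories chosen so far


def process(words: Sequence[str],
            initial: List[Analysis],
            extend: Callable[[Analysis, str], List[Analysis]],
            discard: Callable[[List[Analysis]], Set[int]]) -> List[Analysis]:
    """Parser proposes every grammatical extension; interpreter picks which to drop."""
    analyses = list(initial)
    for word in words:
        analyses = [new for old in analyses for new in extend(old, word)]
        if not analyses:
            raise ValueError(f"stuck at {word!r}: no surviving analysis can continue")
        doomed = discard(analyses)   # indices the interpreter wants removed
        analyses = [a for i, a in enumerate(analyses) if i not in doomed]
    return analyses


# Hypothetical toy grammar: lexical categories and permitted category bigrams.
LEXICON: Dict[str, List[str]] = {"horse": ["N"], "raced": ["Vmain", "Vpart"], "fell": ["Vmain"]}
ALLOWED = {("START", "N"), ("N", "Vmain"), ("N", "Vpart"), ("Vpart", "Vmain")}


def extend(analysis: Analysis, word: str) -> List[Analysis]:
    return [analysis + (cat,) for cat in LEXICON[word] if (analysis[-1], cat) in ALLOWED]


def keep_all(analyses: List[Analysis]) -> Set[int]:
    return set()


def prefer_main_verb(analyses: List[Analysis]) -> Set[int]:
    """Toy interpreter that drops participle readings whenever a main-verb reading exists."""
    if {"Vmain", "Vpart"} <= {a[-1] for a in analyses}:
        return {i for i, a in enumerate(analyses) if a[-1] == "Vpart"}
    return set()


sentence = ["horse", "raced", "fell"]
print(process(sentence, [("START",)], extend, keep_all))
try:
    process(sentence, [("START",)], extend, prefer_main_verb)
except ValueError as err:
    print(err)
```

With `keep_all` the participle reading survives and the string is parsed; with `prefer_main_verb` the interpreter discards it early and the parser gets stuck at the last word, which is the toy analogue of a garden path.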

The following are immediate consequences of this claim:

1. There are no structural preferences (e.g. Minimal Attachment) encoded or implemented by the parser.

2. All ambiguity resolution decisions among grammatically licensed analyses stem directly from the linguistic competence (in the broadest sense of the term):
   a. plausibility of the message carried by the analysis
   b. quality of fit of this message into the current discourse
   c. felicity of the constructions used in the utterance to express the message
   d. the relative frequency of use of a certain construction or lexical item
   That is, when resolving ambiguity, the hearer answers the question 'Which of these grammatically possible analyses is the one that the speaker is most likely trying to communicate to me?'

3. Each of the four criteria in 2 above can be investigated independently of syntactic ambiguity.

4. The parser uses a direct representation of the competence grammar, as opposed to some specially processed encoding intended solely for the task of parsing.

5. Certain parsing effects which have been heretofore explained by memory bounds in the parser have explanations elsewhere.
   a. Parsing does not always proceed serially or deterministically.
   b. Buffer-limitation-based predictions of how long ambiguity can be maintained and when it must be resolved (e.g. Marcus's cell buffer, the Sausage Machine's word window) will predict either too long an ambiguity-maintenance period (in case disambiguating information is available early) or too short a period (in case disambiguating information is not available).
   c. True memory-load effects, which would arise in artificial situations where many locally ambiguous readings are available for the input string but no disambiguating information is applicable, will result from a diffuse shortage in attentional resources needed to keep track of the many analyses in parallel (in analogy with an overloaded multi-user computer, which exhibits gradual performance degradation).

Footnote: On the assumption that the knowledge of language specifies quantitative frequency information, e.g. subcategorization frame preference.

Chapter

Accounting for Recency Phenomena

In the previous chapter I reviewed evidence that the ambiguity resolution process is sensitive to a variety of aspects of 'sensibleness' of the competing analyses: real-world plausibility, felicity of definite reference, and temporal coherence. In this chapter and the next, I consider a collection of ambiguities which seem, at first glance, to be resolved by criteria other than sensibleness. I will argue that when the notion of sensibleness is broadened to encompass the degree of fit to the current discourse situation, these ambiguities receive a straightforward sensibleness-based account.

Right Association

Kimball proposes the parsing strategy of Right Association (RA). RA resolves modifier attachment ambiguities by attaching at the lowest syntactically permissible position along the right frontier of the phrase marker. Many authors (among them Wilks, Schubert, Whittemore et al., and Weischedel et al.) incorporate RA into their parsing systems, yet none rely on it solely, integrating it instead with ambiguity resolution preferences derived from word/constituent/concept cooccurrence-based criteria. On its own, RA performs rather well, given its simplicity, but it is far from adequate. Whittemore et al. evaluate RA's performance on PP attachment using a corpus derived from computer-mediated dialog. They find that RA makes correct predictions only part of the time. Weischedel et al., using a corpus of news stories, report a success rate on the general case of attachment using a strategy, Closest Attachment, which is essentially RA. In the works just cited, RA plays a relatively minor role as compared with cooccurrence-based preferences.
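As a rough illustration of what RA computes, the sketch below walks the right frontier of a phrase marker and returns the lowest node that permits the modifier. The nested-tuple tree encoding, the function names, and the `permits` test (here simply 'is a VP') are assumptions of this sketch, not Kimball's formulation.

```python
from typing import Callable, List, Optional, Tuple, Union

Tree = Union[str, Tuple]   # a word, or (label, child, child, ...)


def right_frontier(tree: Tree) -> List[Tuple]:
    """Nodes on the path from the root down through each rightmost daughter."""
    frontier = []
    node = tree
    while isinstance(node, tuple):
        frontier.append(node)
        node = node[-1]
    return frontier


def ra_site(tree: Tree, permits: Callable[[str], bool]) -> Optional[Tuple]:
    """Right Association: the lowest right-frontier node whose label permits attachment."""
    for node in reversed(right_frontier(tree)):
        if permits(node[0]):
            return node
    return None


# "John said that Bill left", with complementizer omitted for brevity.
john_said = ("S", ("NP", "John"),
             ("VP", ("V", "said"),
              ("S", ("NP", "Bill"),
               ("VP", ("V", "left")))))

print([node[0] for node in right_frontier(john_said)])
print(ra_site(john_said, lambda label: label == "VP"))
```

For an adverbial such as yesterday, the site returned is the lower VP (headed by left), even though the higher VP (headed by said) is also on the right frontier and is also grammatically available.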

The status of RA is very puzzling. Consider:

a. John said that Bill left yesterday.
b. John said that Bill will leave yesterday.

In China, however, there isn't likely to be any silver lining because the economy remains guided primarily by the state.
(from the Penn Treebank corpus of Wall Street Journal articles)

John sold it today. | John sold today it.
John sold the newspapers today. | John sold today the newspapers.
John sold his rusty socket-wrench set today. | John sold today his rusty socket-wrench set.
John sold his collection of RPM Elvis records today. | John sold today his collection of RPM Elvis records.
John sold his collection of old newspapers from before the Civil War today. | John sold today his collection of old newspapers from before the Civil War.

Table: Illustration of heaviness and word order

On the one hand, many naive informants do not see the ambiguity of (a) and are often confused by the semantically unambiguous (b): a strong RA effect. On the other hand, the Wall Street Journal sentence violates RA with impunity. What is it that makes RA operate so strongly in the former but disappear in the latter?

In the rest of this chapter, I argue that the high attachment of the adverbial encodes a commitment about the information structure of the sentence which is infelicitous with the information carried in the John said examples but not with that in the Wall Street Journal sentence. This commitment is about the volume of information encoded in various constituents in the sentence, and the feature which encodes this commitment is word (constituent) order.

Information Volume

Quirk et al. define end weight as the tendency to place material with more information content after material with less information content. This notion is closely related to end focus, which is stated in terms of importance of the contribution of the constituent (not merely the quantity of lexical material). These two principles operate in an additive fashion. Quirk et al. use them to account for a variety of phenomena, among them:

genitive NPs
    the shock of his resignation
    his resignation's shock

it-extraposition
    It bothered me that she left quickly.
    That she left quickly bothered me.

Information volume clearly plays a role in modifier attachment, as shown in the table above. My claim is that what is wrong with sentences such as John said that Bill will leave yesterday is the violation, in the high attachment, of the principle of end weight. While violations of the principle of end weight in unambiguous sentences (e.g. those in the table) cause little grief, as they are easily accommodated by the hearer or reader, the online decision process of ambiguity resolution could well be much more sensitive to small differences in the degree of violation. In particular, it would seem that in (b), the weight-based preference for low attachment has a chance to influence the parser before the temporal-inference-based preference for high attachment.

I am aware of no work which attempts to systematically tease apart the notion of amount of linguistic material, measured in words or morphemes, from the notion of amount of information communicated (in the pragmatic sense). In this document, I use the term information volume to refer to a vague combination of these two notions, on the assumption that they are highly correlated in actual speech and text. To further simplify and operationalize the definition of information volume, I classify single-word constituents and simple NPs as low information volume, and constituents which include a clause as high information volume. In a later section, I argue that a very significant determinant of information volume is the pragmatic information carried by the constituent, not the length of its surface realization.
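The operational definition above is simple enough to state as a small classifier. The sketch below is a hedged illustration: the nested-tuple tree encoding, the clause labels `S` and `SBAR`, and the reading of 'simple NP' as an NP whose daughters are all single words are assumptions of this sketch, and constituents that fall under neither category are deliberately left unclassified.

```python
from typing import Optional, Tuple, Union

Tree = Union[str, Tuple]          # a word, or (label, child, child, ...)
CLAUSE_LABELS = ("S", "SBAR")     # assumed labels for clausal nodes


def contains_clause(tree: Tree) -> bool:
    """True if the constituent is, or dominates, a clausal node."""
    if isinstance(tree, str):
        return False
    label, *children = tree
    return label in CLAUSE_LABELS or any(contains_clause(c) for c in children)


def is_single_word(tree: Tree) -> bool:
    return isinstance(tree, str) or (len(tree) == 2 and isinstance(tree[1], str))


def is_simple_np(tree: Tree) -> bool:
    """Assumed reading of 'simple NP': an NP whose daughters are all single words."""
    return (isinstance(tree, tuple) and tree[0] == "NP"
            and all(is_single_word(child) for child in tree[1:]))


def information_volume(tree: Tree) -> Optional[str]:
    if contains_clause(tree):
        return "high"
    if is_single_word(tree) or is_simple_np(tree):
        return "low"
    return None   # the definition in the text does not classify this constituent


print(information_volume(("ADVP", "yesterday")))
print(information_volume(("NP", ("DT", "the"), ("NNS", "newspapers"))))
print(information_volume(("SBAR", "that", ("S", ("NP", "Bill"), ("VP", "left")))))
```

Returning `None` for the remaining cases keeps the sketch honest about the gap between the two categories the text defines.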

A study

The consequence of my claim is that low information volume adverbials cannot be placed after high volume arguments, while high volume adverbials are not subject to such a constraint. When the speaker wishes to convey the information in the high-attachment reading, there are other word orders available, namely:

a. Yesterday John said that Bill left.
b. John said yesterday that Bill left.

If the claim is correct, then when a single-word adverbial modifies a VP which contains a high information volume argument, the adverbial will tend to appear either before the VP or between the verb and the argument. High volume adverbials should be immune from this pressure.

To verify this prediction, I conducted an investigation of the Penn Treebank corpus of about million words of syntactically annotated text from the Wall Street Journal. Unfortunately, the corpus does not currently distinguish between arguments and adjuncts: they are both annotated as daughters of VP. Since at this time I do not have a dictionary-based method for distinguishing [VP asked [S when ...]] from [VP left [S when ...]], my search cannot include all adverbials, only those which could never (or rarely) serve as arguments. I therefore restricted my search to subgroups of the adverbials:

1. Ss whose complementizers participate overwhelmingly in adjuncts: after, although, as, because, before, besides, but, by, despite, even, lest, meanwhile, once, provided, should, since, so, though, unless, until, upon, whereas, while.

2. single-word adverbials: now, however, then, already, here, too, recently, instead, often, later, once, yet, previously, especially, again, earlier, soon, ever, first, indeed, sharply, largely, usually, together, quickly, closely, directly, alone, sometimes, yesterday.

The particular words were chosen solely on the basis of frequency in the corpus, without 'peeking' at their word-order behavior.

Each adverbial can appear in at least one position before the argument to the verb (sentence-initial, preverb, between verb and argument) and at least one post-verbal-argument position (end of VP, end of S).

For arguments, I only considered NPs and Ss with complementizer that and the zero complementizer.

The results of this investigation appear in the following table:

adverbial:      single word             clausal
arg type        pre-arg    post-arg     pre-arg    post-arg
low volume
high volume
total

Some occurrences of single-word adverbials appear after the argument. If we consider only cases where the verb takes a high volume argument (defined as one which contains an S), only a few of the occurrences appear after the argument. This interaction with the information volume of the argument is statistically significant.

Clausal adverbials tend to be placed after the verbal argument: only a small number of the occurrences of clausal adverbials appear at a position before the argument of the verb. Even when the argument is high in information volume, clausal adverbials mostly appear on the right.

The Wall Street Journal sentence above and the following sentence are two examples of RA-violating sentences which I have found.

According to department policy, prosecutors must make a strong showing that lawyers' fees came from assets tainted by illegal profits before any attempts at seizure are made.

To summarize: low information volume adverbials tend to appear before a high volume argument, and high information volume adverbials tend to appear after it. The prediction is thus confirmed.

RA is at a loss to explain this sensitivity to information volume. Even a revision of RA, such as the one proposed by Schubert, which is sensitive to the size of the modifier and of the modified constituent, would still require additional stipulation to explain the apparent conspiracy between a parsing strategy and tendencies in the generator to produce sentences with the word-order properties observed above. This also applies to Frazier and Fodor's Sausage Machine model, which accounts for RA effects using a narrow window in the parser (see above).

A Potential Practical Application

How can we exploit the findings above in our design of practical parsers? Clearly RA seems to work extremely well for single-word adverbials, but how about clausal adverbials? To investigate this, I conducted another search of the corpus, this time considering only ambiguous attachment sites. I found all structures matching the following two low-attached schemata:

    low VP attached (adv):      vp  s  vp
    low S attached (adv):       vp  s

and the following two high-attached schemata:

    high VP attached (v, adv):  vp  s
    high S attached (adv):      s  vp  s

Footnote: In these schemata, one notation means 'match a sequence of daughters', a second means 'constituent x contains constituent y as a rightmost descendant', and a third means 'constituent x contains constituent y as a descendant'.

The results are summarized in the following table:

adverb type    low-attached    high-att.    high
single word
clausal

As expected, with single-word adverbials RA is almost always right, failing only of the time. However, with clausal adverbials, RA is incorrect almost one out of five times.
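The two containment relations used in the schemata above can be made concrete with a small sketch. This is only an illustration, not the search tool actually used on the corpus: the Node class, the function names contains and contains_rightmost, and the toy tree are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Node:
    """A phrase-structure node: a category label plus ordered daughters."""
    label: str
    children: List["Node"] = field(default_factory=list)


def contains(x: Node, y: Node) -> bool:
    """True if y is a descendant of x at any depth."""
    return any(child is y or contains(child, y) for child in x.children)


def contains_rightmost(x: Node, y: Node) -> bool:
    """True if y lies on the right edge of x: the last daughter of x,
    or the last daughter of that daughter, and so on."""
    node = x
    while node.children:
        node = node.children[-1]
        if node is y:
            return True
    return False


# Toy tree (invented for illustration): [vp v [s np [vp v adv]]]
adv = Node("adv")
np = Node("np")
inner_vp = Node("vp", [Node("v"), adv])
s = Node("s", [np, inner_vp])
outer_vp = Node("vp", [Node("v"), s])

print(contains_rightmost(outer_vp, adv))  # the adverb is on the right edge
print(contains(outer_vp, np), contains_rightmost(outer_vp, np))
```

An adverbial that is a rightmost descendant of an embedded clause is the kind of site where the low and the high attachment compete, which is why the schemata distinguish plain containment from rightmost containment.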

Information Volume and Sensibleness

Let us return to the question of whether the attachment preferences discussed above are indeed consistent with the thesis of sensibleness-based ambiguity resolution. If it turns out that information volume is simply a measure of surface complexity (words, morphemes, phrase-marker tree depth, etc.), then there is no role for interpretation and sensibleness to play; it follows that the competence grammar marks information volume as a feature on certain nodes and assigns a graded penalty of some kind to certain sequences of volume-markings. While the idea of a graded penalty against certain structural configurations is not new (cf. subjacency), the requirement for a High-Volume feature is rather odd. Still, there is nothing in this view which is inconsistent with the main thesis.



There is an interesting putative counterexample to the generalization that only low information volume adverbials give rise to recency effects, shown in (i). (I am grateful to Bill Woods for bringing this example to my attention.)

(i) The Smiths saw the Grand Canyon flying to California.

Here there is a remarkably strong tendency to take the participial phrase flying to California as belonging to an argument to saw. The more plausible reading treats the participle as modifying the matrix subject or the matrix predication. I claim that this effect is not residual RA, but rather that it stems from a subtle pragmatic infelicity in the plausible construal of the participle. My intuition is that when a participial adverbial is felicitously used, the relation between the adverbial and the matrix predication is not merely cotemporaneity; rather, the adverbial must be relevant to the matrix predication. An informal survey of post-head participles that do not appear in construction with their heads (e.g. spent the weekend writing a paper) reveals that they most often appear delimited by a comma and are relevantly related to their heads, serving such rhetorical purposes as evidence, consequence, elaboration, and exception. I did not find examples of mere cotemporaneity or scene-setting. In fact, for scene-setting functions one tends to add the word while to the participle. So the subtle infelicity in (ii) can be remedied as in (iii) or in (iv).

(ii) John collapsed flying to California.
(iii) John collapsed while flying to California.
(iv) John collapsed trying to run his third marathon in as many days.

In (i), the matrix attachment of the participial makes only the infelicitous scene-setting/cotemporaneity relation available, so the system is forced to the ECM analysis which, for all it can determine on-line, could have a felicitous ending or a slow-to-compute metaphorical interpretation.

The other possible domain over which to define information volume is whatever Grice's maxims of quantity and manner are about, that is, informativeness of the contribution and brevity/prolixity, respectively. If this is the case, then constituents are not marked by the syntactic processor with their information volume. All that the syntax determines is constituent order. This constituent order can encode the commitment that constituent X must carry less information than constituent Y. The actual information volume is determined by the interpreter. Such determinations may be inconsistent with the order-based commitments, in which case the analysis is deemed less sensible.



I would like to suggest that the pragmatic sense is a strong, if not an exclusive, determinant of information volume. Here is one example.

The acceptability of verb-particle constructions clearly has to do with information volume:

a. Joe called the friend who had crashed into his new car up.
b. Joe called up the friend who had crashed into his new car.
c. Joe called his friend up.
d. Joe called up his friend.

It has been widely noted that pronouns are very awkward in post-verb-particle positions:

a. This pissed off him.
b. This pissed off Bob.

The reason, I claim, for the relative acceptability of b is the accommodation by the hearer of the possibility of a context which places new information in the NP Bob, e.g.

Mary passed John and Bob in the corridor without even saying hello. Surprisingly, this only pissed off Bob. John didn't seem to mind.

a can be made acceptable if the pronoun him is replaced by a deictic accompanied by physical pointing, that is, by increasing the amount of information associated with the word.

Returning to the central example of this chapter, let us consider the dialog in . When appropriately intoned, this dialog shows that a constituent like that Bill will leave, which is construed as bearing high information volume when it appears out of context in b, can indeed bear low information volume when it expresses a proposition or concept which is already given in the discourse.



Ford, Bresnan and Kaplan point out that RA effects are sensitive to the syntactic category of the more recent attachment site. They contrast (i) with (ii):

(i) Martha notified us that Joe died by express mail.
(ii) Martha notified us of Joe's death by express mail.

It is quite clear that the absurd RA reading is more prominent in (i) than in (ii). This is rather surprising because, on informational terms, I can see no notion of information by which that Joe died bears any more information than of Joe's death.



I am grateful to Ellen Prince for this example

A: John said that Bill will leave next week and that Mary will go on sabbatical in September.
B: Oh really? When did he announce all this?
A: He said that Bill will leave yesterday, and he told us about Mary's sabbatical this morning.

Conclusion

I have argued that the apparent variability in the applicability of Right Association can be explained if we consider the information volume of the constituents involved. I have demonstrated that in at least one written genre, low information volume adverbials are rarely produced after high volume arguments, precisely the configuration which causes the strongest RA-type effects. Considering the significant influence of pragmatic content on the degree of information volume, the interaction between information volume and constituent order provides a sensibleness-based account for the resolution of a class of modifier attachment ambiguities.

Chapter

Other Constructions

Two often-discussed structural ambiguities have not been mentioned so far:

John has heard the joke is offensive.
When the cannibals ate the missionaries drank.

I will refer to the ambiguity in as NP vs. S complement, and to the garden-path effect in as the Late Closure Effect, the term which Frazier and Fodor introduced.

In this chapter I consider the psycholinguistic evidence available about these ambiguities and consider two different ways of accounting for these and other data. The first proposal, which I call disconnectedness theory, is a formalization of many accounts of processing difficulty that appear in the literature. The second, which I call Avoid New Subjects, has not been proposed before in relation to ambiguity resolution. I then consider the evidence available for distinguishing these two accounts, ultimately trying to show that disconnectedness theory makes some incorrect predictions.

Available Evidence

NP vs S complement

Advocates of structural ambiguity resolution strategies have argued that the ambiguity in is initially resolved by Minimal Attachment.

Tom heard the latest gossip about the new neighbors was false.

and are intuitively garden paths. One might argue that, given the strong bias for jokes being heard and cannibals eating missionaries, the structures in and are irrelevant. But it is equally plausible that someone hears some fact and that cannibals engage in an intransitive eating activity, so the question remains of why these strings are resolved as they are.

Frazier and Rayner used eye-tracking to find that for sentences such as , people slowed down when reading the disambiguation region was false. Holmes, Kennedy and Murray used a subject-paced word-by-word cumulative display experiment to show that the slowdown which Frazier and Rayner observed persists even when the ambiguity is removed by the introduction of an overt complementizer. With experimental materials such as

(TR) The maid disclosed the safe's location within the house to the officer.
(TC) The maid disclosed that the safe's location within the house had been changed.
(RC) The maid disclosed the safe's location within the house had been changed.

they found that in the disambiguation region (either to the officer or had been changed) the transitive-verb sentence (TR) was read substantially faster than the other two sentences. The that-complement (TC) sentence was read slightly faster than the reduced-complement (RC) sentence.

In response, Rayner and Frazier ran an eye-movement experiment which contradicted the conclusions of Holmes et al. Using materials which were similar (but not identical) to those of Holmes et al., they found that at the disambiguation region TC was read the fastest, followed by TR, followed by the ambiguous RC, consistent with the theory of Minimal Attachment.

In turn, Kennedy, Murray, Jennings and Reid argued that Rayner and Frazier introduced serious artifacts into their eye-tracking data by presenting their material on multiple lines and not controlling for the resulting right-to-left eye movement. Kennedy et al. also criticized other technical aspects of Rayner and Frazier's experiment. Kennedy et al. ran an eye-tracking study using the materials from Holmes et al. They found that TC and RC sentences were read significantly slower in the disambiguation region than TR sentences. They found no reliable difference between TC and RC. In a further experiment to test the effect of line breaks, they found statistically significant effects whose nature was rather difficult to interpret. They took this as evidence that line breaks do indeed introduce artifacts.

In summary, there is evidence that S-complement sentences (TC and RC above) take longer to comprehend than comparable NP-complement sentences.

Another debate is whether RC sentences take longer to read than TC sentences, and under what conditions. Quite a few researchers have investigated the question of whether RC sentences cause a garden-path effect when the matrix verb prefers an NP rather than a clausal complement. Kennedy et al. partitioned the materials for their first experiment according to the bias of the matrix verb (NP versus clausal). They found no effects of verb bias on first-pass reading time, but found a statistically non-significant effect of verb bias on eye regressions initiated from the disambiguating zone, i.e. indications of backtracking/confusion. For both kinds of verbs there were more regressions in the RC condition than in the TC condition, but the difference was greater for NP-bias verbs. However, Kennedy et al. did not demonstrate that they accurately identified verb biases.

Many groups of researchers report experiments specifically designed to investigate verb-bias effects on the extent of garden-path in RC sentences. I report the work of four: Holmes, Stowe and Cupples; Ferreira and Henderson; Garnsey, Lotocky and McConkie; and Trueswell, Tanenhaus and Kello. I refer to them as HSC, FH, GLM and TTK, respectively. HSC, GLM and TTK ran two-phase experiments. In the first phase, subjects were asked to complete sentences such as He suspected. Statistics from these data were used to assess verbs' biases. Two groups of verbs were selected: NP-preference and S-preference. HSC's criterion for counting a verb as having a bias was a or greater imbalance in subjects' responses. When assessing a verb's argument structure, they lumped TC and RC responses into one category. GLM and TTK kept separate tallies of these two kinds of complements. This difference is significant because the ambiguity in question is really between a TR and an RC analysis. FH did not use a questionnaire; instead, verbs were selected either on the basis of normative data collected by Connine, Ferreira, Jones, Clifton and Frazier, or according to the intuitions of the experimenters. The study of Connine et al. asked subjects to write a sentence for each of a group of verbs. They did not specify that the verb must immediately follow an agent-subject, and this might have certain effects on the way their data can be interpreted for the purpose at hand.
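The difference between lumping and separating TC and RC completions can be illustrated with a minimal sketch. The function classify_bias, its labels, and the counts below are hypothetical. In particular, min_ratio merely stands in for whatever imbalance criterion a study adopts; it is not a value taken from any of the studies discussed here.

```python
from collections import Counter
from typing import Iterable


def classify_bias(completions: Iterable[str], min_ratio: float,
                  lump_clausal: bool) -> str:
    """Classify a verb from its sentence-completion tallies.

    completions: one label per response, each "NP", "TC" or "RC".
    min_ratio:   hypothetical imbalance criterion between the two sides.
    lump_clausal: if True, TC and RC are counted together as clausal;
                  if False, only RC competes with NP.
    """
    counts = Counter(completions)
    np_count = counts["NP"]
    clausal = counts["RC"] + (counts["TC"] if lump_clausal else 0)
    if np_count > 0 and np_count >= min_ratio * clausal:
        return "NP-bias"
    if clausal > 0 and clausal >= min_ratio * np_count:
        return "S-bias"
    return "no clear bias"


# Invented tallies for one verb: 3 NP, 6 TC and 1 RC completions.
responses = ["NP"] * 3 + ["TC"] * 6 + ["RC"] * 1

print(classify_bias(responses, min_ratio=2.0, lump_clausal=True))   # S-bias
print(classify_bias(responses, min_ratio=2.0, lump_clausal=False))  # NP-bias
```

With these invented tallies the same verb counts as clausal-bias when TC and RC are lumped, but as NP-bias when only the reduced complement competes with the NP analysis, which is the comparison relevant to the TR versus RC ambiguity.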

HSC considered the effects of two factors upon the degree of the garden-path effect: verb bias and plausibility of the post-verb NP as a direct object. Their materials were of the form

NP-bias verb
  TC  Plausible:    The reporter saw that her friend was not succeeding.
      Implausible:  The reporter saw that her method was not succeeding.
  RC  Plausible:    The reporter saw her friend was not succeeding.
      Implausible:  The reporter saw her method was not succeeding.

Clausal-bias verb
  TC  Plausible:    The candidate doubted that his sincerity would be appreciated.
      Implausible:  The candidate doubted that his champagne would be appreciated.
  RC  Plausible:    The candidate doubted his sincerity would be appreciated.
      Implausible:  The candidate doubted his champagne would be appreciated.

They tested the efficacy of the plausibility manipulation by asking subjects to rate sentences such as The reporter saw her method. This is an inadequate test of the on-line plausibility of the NP analysis. Just because a subject rejects the string as a sentence, it does not mean that the subject would on-line reject the NP analysis for her method; doing so would commit the subject (depending on one's theory of grammar) to rejecting strings such as The reporter saw her method fail miserably when interviewing athletes, or The doctor found the fever discouraging. This criticism applies to the majority of their implausible experimental materials, though unevenly for the NP- and S-bias verbs. I therefore omit their findings with respect to this factor.

They conducted three self-paced experiments, varying the method of presentation of the materials. Their first experiment used a self-paced word-by-word cumulative display. After each word, the subject had to decide whether the string is grammatical so far. This resulted in remarkably slow reading times (three times slower than in eye-movement experiments). The RC condition was slower at the disambiguation region than the TC condition; this difference was enhanced in the NP-bias condition, consistent with their theory that lexical information is incorporated into the parsing process. Advocates of Minimal Attachment often argue that slow presentation methods may not be sensitive to first-pass processing, and that it is not at all surprising that lexical information is incorporated at the later stage tapped by this sort of experiment. Addressing this criticism, HSC ran another word-by-word self-paced experiment, but this time subjects were required to repeat the sentence when it was finished. This resulted in somewhat faster reading times (still roughly twice as slow as in eye-movement). The findings in the second experiment were comparable to those of the first, although the garden path was detected roughly one word later and the differences in reading time were not quite as large.

One problem with cumulative displays is that subjects may employ a strategy of pressing the self-paced button faster than they are actually reading the words. In a third experiment, HSC used a non-cumulative display, where the letters of each word in the sentence (except the one being read) were replaced with underscores. Instead of manipulating the plausibility of the ambiguous NP, they manipulated its length. Two examples are

The lawyer heard that the story about the accident was not really true.
The reporter saw that the woman who had arrived was not very calm.

For NP-bias verbs, at the first word of the disambiguating region (was in and), the RC condition took ms longer than the TC (ms versus ms per word). The difference between RC and TC was slightly larger for short NPs, but there was no statistically significant interaction between NP length and overtness of complementizer.

For clausal-bias verbs, RC sentences with long ambiguous NPs had a reading time of roughly ms for the disambiguating word, whereas the three other conditions (i.e. the two TC conditions and the short NP RC condition) required roughly ms. For short NPs, the difference between TC and RC was not significant, whereas for long NPs it was; the magnitudes were approximately ms and ms, respectively.

In summary, the third experiment confirmed the first two by showing more processing difficulty for the NP-bias verbs than for the complement-bias verbs. In addition, it showed that when the ambiguous NP is long, readers tend to interpret it as an argument to the matrix verb, even for complement-bias verbs.

Ferreira and Henderson attempted to dispute the claim that lexical bias is incorporated into the processor's first-pass ambiguity-resolution decisions. They conducted three experiments using three different experimental procedures: eye-movement, non-cumulative self-paced reading, and cumulative self-paced reading. They used the same materials for all three experiments. One example is

NP-bias
  TC  Bill wrote that Jill arrived safely today.
  RC  Bill wrote Jill arrived safely today.

Clausal-bias
  TC  Bill hoped that Jill arrived safely today.
  RC  Bill hoped Jill arrived safely today.

In their first two experiments, FH found no influence of verb bias on the garden-path effect (between the RC and TC conditions). They did find a weak influence in the third experiment. These results support their claim that Minimal Attachment is relevant for first-pass processing, and that lexical properties such as argument preferences are considered only in subsequent processing. But there are some serious flaws with their experiment. First, they did not demonstrate that their (at least partially) intuitively-arrived-at verb categories do indeed correspond to frequency of use or any other measure of argument-structure bias. Second, many of their examples give rise to semantic implausibilities in the NP reading, e.g.

Jan warned the fire
Ed asserted eggs
Ed disputed eggs

These implausibility problems affect the NP-bias materials and S-bias materials in different frequencies, thus introducing a serious potential source of difficulty.

Garnsey, Lotocky and McConkie conducted one experiment where they tested whether lexical-bias information is quickly incorporated by the processor. They used an eye-movement experiment with materials such as

NP-bias
  TC  The manager overheard that the employees were planning a surprise party.
  RC  The manager overheard the employees were planning a surprise party.

Clausal-bias
  TC  The manager suspected that the employees were planning a surprise party.
  RC  The manager suspected the employees were planning a surprise party.

They found a statistically significant interaction between complementizer presence and verb bias. This effect appears at the first fixation on the first disambiguating word (were).

Trueswell, Tanenhaus and Kello argued that subcategorization information is incorporated into the analysis at the earliest point possible, and furthermore, that the effect of verb subcategorization is not categorical but graded, reflecting preferences which show up in both production and parsing tasks. They used a cross-modal naming task with materials such as

The old man insisted/accepted that

The visually presented targets were he and him. TTK found that for TR-bias verbs, absence of a complementizer commits the reader to an NP-complement analysis, as can be seen by facilitation of him relative to he. S-complement-bias verbs ranged in the effect of the complementizer: TC-bias verbs tended to require the complementizer in order to activate the S-complement analysis, whereas RC-bias verbs showed activation of the S-complement analysis even in the absence of the complementizer. TTK found converging evidence from two other experiments, which used non-cumulative word-by-word self-paced reading and , respectively, with materials such as

The student forgot/hoped that the solution was in the back of the book.

They found garden-path effects, as measured by the slowdown in the disambiguating region, in RC sentences with TR-bias verbs. For S-complement-bias verbs, the extent of garden path depended on how frequently the verb appears with a that-complement versus a zero-complement. TTK used two forms of statistical analysis. They used ANOVA to argue for an effect of TR-bias versus S-bias verbs, and a regression statistic to argue for a correlation between the strength of complementizer preference and the degree of garden path for the S-complement-bias verbs.

Ignoring the older, potentially problematic self-paced studies, where FH's results conflict with GLM's and TTK's, the latter studies are more believable. FH's failure to find an effect of verb bias could be due to a variety of factors, as discussed at length in Trueswell et al.

From the experiments listed above I conclude the following:

Absence of a complementizer in an RC string can lead to processing difficulty.
The magnitude of this difficulty is often less than that of standard garden-path sentences.
Some of this difficulty might well persist when the complementizer is present.
There is some evidence suggesting that the magnitude of the effect is higher when the ambiguous NP is long.
The magnitude of the difficulty is sensitive to the subcategorization possibility/preference of the matrix verb.

In short, the evidence for the strong influence of lexical factors is clear. But there is some evidence that some processing difficulty is residually associated with sentential complements, independent of ambiguity and lexical factors.

Late Closure Ambiguity

Frazier and Rayner argue that Frazier's structural preference principle of Late Closure is what is responsible for the garden path in a.

Late Closure
When possible, attach incoming lexical items into the clause or phrase currently being processed (i.e. the lowest possible nonterminal node dominating the last item analyzed).

a. Since Jay always jogs a mile seems like a short distance to him.
b. Since Jay always jogs a mile this seems like a short distance to him.
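As a rough illustration of what Late Closure asks of a parser, the sketch below picks the lowest open node that can accept an incoming item. It is a simplification under stated assumptions, not Frazier's formulation or any implemented parser: the function late_closure_site, the node labels, and the can_attach predicate are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def late_closure_site(right_frontier: Sequence[str],
                      can_attach: Callable[[str], bool]) -> Optional[str]:
    """Return the lowest node on the right frontier that accepts the item.

    right_frontier lists the open nodes from the root down to the lowest
    node dominating the last item analyzed.
    """
    for node in reversed(right_frontier):
        if can_attach(node):
            return node
    return None


# Hypothetical frontier after reading "Since Jay always jogs".
frontier = ["S(matrix)", "S(since-clause)", "VP(jogs)"]

# Assume the incoming NP "a mile" could be the object of "jogs" or the
# subject of the matrix clause.
np_sites = {"VP(jogs)", "S(matrix)"}

print(late_closure_site(frontier, lambda n: n in np_sites))  # VP(jogs)
```

Choosing VP(jogs) corresponds to taking a mile as the object of jogs, which is the analysis that must be abandoned in a and is compatible with b.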

Late closure can similarly account for the processing difficulty in and

Without her contributions failed to come in. (From Pritchett)
When they were on the verge of winning the war against Hitler Stalin Churchill and Roosevelt met in Yalta to divide up postwar Europe. (From Ladd)

Frazier and Rayner's garden-path theory distinguishes two components of the human language processor, which I will follow Mitchell in calling the Assembler and the Monitor. The Assembler very quickly hypothesizes a syntactic structure for the words encountered thus far. The Monitor evaluates this hypothesis and sometimes initiates backtracking when it detects a semantic problem. The Assembler uses quickly-computable strategies like Minimal Attachment and Late Closure. Mitchell investigated a prediction of the garden-path theory: that the Assembler only pays attention to major grammatical categories of the incoming lexical items. Finer distinctions, such as verb subcategorization frames, are only considered in the Monitor. It follows that Late Closure effects as in should persist even when the first verb is purely intransitive, as in

After the child had sneezed the doctor prescribed a course of injections.

Mitchell used subject-paced reading-time measurement. Instead of a word-by-word procedure, each keypress would present the next segment of the sentence. Segments were fairly large: each test sentence was divided into only two segments.

As he predicted, Mitchell found garden-path effects when the segment boundary was after the ambiguous NP (the doctor). But this effect could arise as an artifact of the way he segmented his materials, leading the reader to construe each segment as a clause. To address this criticism, Mitchell and his colleagues (Adams, Clifton and Mitchell) conducted an eye-tracking study. Using materials as in , they manipulated the availability of a transitive reading for the verb, and whether or not there was a disambiguating comma after the preposed clause.

After the dog struggled/scratched the vet took off the muzzle.

Their results suggest that when the comma was omitted, subjects attempted to construe the ambiguous NP (the vet) as the object of the preceding verb, even when it was purely intransitive. In other words, the lexical property of intransitivity was not as effective as the comma in avoiding a transitive analysis.

Stowe's experiment provides evidence that directly contradicts Mitchell's claim that verb subcategorization information is ignored by the first phase of sentence processing. Stowe exploited the phenomenon of causative/ergative alternation, exemplified in , to show that readers are immediately sensitive to the subcategorization frames available for the verb.



Just because readers try to make sense of a constituent such as after the child had sneezed the doctor, it does not mean that they are ignoring subcategorization information. It is not clear whether putatively intransitive verbs such as sneeze, burp, sleep are indeed ungrammatical when used transitively, or merely lead to implausibilities. It could be that in a sentence like , whatever it is that is responsible for the late closure effect is driving the interpreter to come up with a transitive-verb interpretation. Transitive uses of many putatively intransitive verbs are not impossible; consider slept his fair share, burped Yankee Doodle, sneezed her way out of the office, sneezed his brains out.

Causative: John moved the pencil.
Ergative: The pencil moved.

Using materials such as those in , Stowe manipulated the plausibility of the subject as causal agent, effectively changing the subcategorization frame of the verb.

Ambiguous
  Animate:    Before the police stopped the driver was already getting nervous.
  Inanimate:  Before the truck stopped the driver was already getting nervous.

Unambiguous
  Animate:    Before the police stopped at the restaurant the driver was already
  Inanimate:  Before the truck stopped at the restaurant the driver was already

Late Closure predicts that in the ambiguous conditions, the disambiguating word (was in ) should not vary when the animacy of the subject is manipulated. Stowe's account of early use of lexically-specified information predicts a garden-path effect at was only in the ambiguous-animate condition. And this is exactly what she found. Using a subject-paced word-by-word cumulative display task where subjects were required to monitor the grammaticality of the string, she found significantly elevated reading times and ungrammatical responses at the first disambiguating word in the ambiguous-animate condition, and nowhere else. The experimental technique used by Stowe has often been criticized as too slow for detecting first-pass processes. But I am aware of no experiments which challenge Stowe's result.

In a follow-up experiment, Stowe investigated the interaction between lexical preferences and plausibility. She used materials such as those in

Animate
  Plausible:    When the police stopped the driver became very frightened.
  Implausible:  When the police stopped the silence became very frightening.

Inanimate
  Plausible:    When the truck stopped the driver became very frightened.
  Implausible:  When the truck stopped the silence became very frightening.

She used the same procedure as in her first experiment: a subject-paced word-by-word cumulative display task where subjects were required to monitor the grammaticality of the string at each word. Aside from the animacy effect observed in her first experiment, Stowe found that "the implausibility of the subject NP [silence in ] to serve as an object of the preceding verb is noted as soon as the word itself appears" (p. ). Stowe also observed: "The most perplexing point about the results of Experiment is that people apparently become aware of the unsuitability of the NP to be an object of the preceding verb even when there is evidence that they expect an intransitive verb structure, i.e. in the Inanimate conditions" (p. ).

In summary: while the issue of whether verb-subcategorization information comes to bear immediately on resolving the late-closure ambiguity is not definitively settled, the available evidence suggests that it does. Nevertheless, there is still evidence for some residual effects (Late Closure and preference for NP complements over S complements) that lexical properties alone cannot account for. Below is additional evidence for this claim.

Consider

John finally realized just how wrong he had been remained to be seen.

The main verb realize is biased toward a sentential complement, yet there is still a perceptible garden path.

Lexical bias alone also fails to account for all of the late closure effect. In

When Mary returned her new car was missing.

the verb return occurs more frequently without an object than with one. Nor can lexical bias account for the garden-path sentences and , repeated here as and

Without her contributions failed to come in.
When they were on the verge of winning the war against Hitler Stalin Churchill and Roosevelt met in Yalta to divide up postwar Europe.

In the rest of this chapter, I investigate two theories to account for these preferences.

Degree of Disconnectedness

Verb data from five sources confirm this:

source                        NP    RC    TC    RC+TC    units
Trueswell et al.                                         in completion task
Garnsey et al.                                           in completion task (Garnsey, p.c.)
Connine et al.                                           frequency in questionnaire
Brown corpus                                             raw frequency
Wall Street Journal corpus                               raw frequency

The verb return occurs in the Brown and Wall Street corpora as follows:

corpus                 transitive    intransitive
Wall Street Journal
Brown

Note that while the verb return has both an intransitive and a transitive subcategorization frame, it is different from a verb like eat, which is transitive but may drop its object. It may be the case that object-drop uses require a process of accommodating an implicit object. While difficulties with this process could potentially account for the garden path in , they cannot account for a garden path in

One idea that has been recently put to use by Pritchett and Gibson, but goes back at least as far as Eady and Fodor and Marslen-Wilson and Tyler, is that the processor has difficulty keeping around many fragments for which it does not yet have semantic connections, and thus prefers better-integrated analyses. In this section I give one formalization of this idea and show how it can account for many different pieces of data, including those just discussed.

The basic notion here is degree of disconnectedness. Intuitively, the disconnectedness measure of an analysis of an initial segment of a sentence is how many semantically unrelated pieces have been introduced so far. In standard syntactic theory, disconnectedness has a straightforward implementation:

Theta Attachment (Pritchett)
The theta criterion attempts to be satisfied at every point during processing, given the maximal theta grid.

The theta criterion is part of the competence theory which assigns every verb (and other open-class complement-taking words such as adjectives and nouns) a collection of thematic slots called theta-roles. For a sentence to be well-formed, every theta-role must be filled by an argument, and every argument must fill a particular slot. It turns out that thematic roles are not rich enough to capture the necessary semantic relations among words in a sentence (especially when their semantic content, e.g. agent, instrument, is ignored). So Pritchett broadens his heuristic to include every principle of syntax, not just the theta-criterion. Gibson operationalizes the heuristic in a slightly different way to make it work with his parsing algorithm and data representation, and notes that any syntactic theory that mentions thematic relations would give rise to a similar parsing heuristic. In this project I cast the notion of disconnectedness minimization in purely semantic, i.e. non-syntactic, terms. I do not distinguish the semantic relation of thematic role from any other semantic relation such as determiner-noun or modal-verb, etc. This notion of disconnectedness will be made more concrete presently. But first, I introduce the semantic representation formalism which I will use in this dissertation.

A Representation of Semantics

For the purposes of the present project, the semantic representation which I choose is borrowed
from the work of Hobbs and his colleagues (Hobbs; Hobbs, Stickel, Appelt and Martin),
which is in turn an elaboration of work by Davidson. Davidson argues that events
can be talked about just like physical objects, so a semantic representation must include event variables
as well as the traditional thing variables. Hobbs argues that predications (e.g. states)
must also be afforded this treatment as first-class members of the ontology. The semantic
representation which he proposes is not the usual term or logical formula, but rather a set of
terms, each comprising a predicate symbol and one or more arguments which are either variables
or constants, but crucially not terms themselves. All variables are implicitly existentially
quantified. For example, the following sentence is shown together with its semantic representation:

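The boy wanted to build a boat quickly.

$\exists e_1, e_2, e_3, x, y \;\; \mathit{Past}(e_1) \wedge \mathit{want}(e_1, x, e_2) \wedge \mathit{quick}(e_2, e_3) \wedge \mathit{build}(e_3, x, y) \wedge \mathit{boy}(x) \wedge \mathit{boat}(y)$

Which means something like:

There is an event/state $e_1$, which occurred in the past, in which the entity $x$ wants
the event/state $e_2$; $e_2$ is an event/state in which the event/state $e_3$ occurs quickly;
$e_3$ is a building event in which $x$ builds $y$, where $x$ is a boy and $y$ is a boat.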



Hobbs motivates this flat representation on the grounds that it is simpler and thus
encodes fewer commitments in the level of the semantics. This is superior to hierarchical,
recursively-built representation, he argues, as semantic representation is difficult enough as it
is without additional requirements that it cleverly account for certain syntactic facts as well
(e.g. count nouns vs. mass nouns). He defends the viability of this approach by showing that it
can cope with traditional semantic challenges such as opaque adverbials, de dicto/de re belief
reports, and identity in belief contexts. This notation is used in the TACITUS project (Hobbs
et al.), a substantial natural language understanding system, demonstrating its
viability as a meaning representation. Haddock exploits the simple structure of
each term to perform efficient search of the representation of a prior discourse in order to resolve
definite NPs.

In the current project, a lexicalized grammar is used, where each word is associated with a
combinatory potential and a list of terms. When words (constituents) are combined, their term
lists are simply appended to determine the term list of the combined constituent. Details and
examples will be given in section ( ). The semantic analysis then develops incrementally,
word by word.

A Formal Definition

To formally define the degree of disconnectedness of a semantic analysis S, I first construct an
undirected graph whose vertices are the variables (both ordinary thing variables and event/state
variables) mentioned in S. Two vertices are adjacent (have an edge connecting them) just in case
they both appear as arguments of a term in S. The disconnectedness measure of S is the number
of components of the graph, minus 1. By number of components I mean the standard graph-theoretic
definition: two vertices are in the same component if and only if there is a path of
edges that connects them.
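The definition above can be made concrete with a short sketch. The following Python function is only an illustration, not part of the original formalization: the encoding of terms as (predicate, argument-tuple) pairs, the function name, and the convention that an empty analysis scores 0 are all assumptions made here. It builds the variable graph implicitly with a union-find structure and returns the number of components minus one. The first test analysis transcribes the gloss given earlier for "The boy wanted to build a boat quickly"; the second is an invented set of fragments.

```python
def disconnectedness(terms):
    """Return (number of connected components of the variable graph) - 1.

    terms: list of (predicate, args) pairs, where args is a tuple of
    variable names. Variables that co-occur as arguments of the same
    term are adjacent in the graph.
    """
    parent = {}  # union-find forest over the variables seen so far

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for _predicate, args in terms:
        for v in args:
            parent.setdefault(v, v)
        # Linking consecutive arguments is enough to place all the
        # arguments of one term in a single component.
        for a, b in zip(args, args[1:]):
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    components = {find(v) for v in parent}
    # Assumption: an analysis that mentions no variables scores 0.
    return max(len(components) - 1, 0)


# Terms for "The boy wanted to build a boat quickly" (a complete sentence).
boy_boat = [
    ("Past", ("e1",)),
    ("want", ("e1", "x", "e2")),
    ("quick", ("e2", "e3")),
    ("build", ("e3", "x", "y")),
    ("boy", ("x",)),
    ("boat", ("y",)),
]

# Invented fragments: {a, b} and {c, d} are not related by any term.
fragments = [("p", ("a", "b")), ("q", ("c",)), ("r", ("d", "c"))]

print(disconnectedness(boy_boat))
print(disconnectedness(fragments))
```

Because terms are flat (arguments are never terms themselves), a single pass over the term list suffices; no recursive traversal of the representation is needed.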

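For example, when the following initial segment is encountered:

When the cannibals ate the missionaries

there are two analyses, corresponding to the transitive and intransitive readings of ate, respectively:

$\mathit{when}(e_1, e_2),\ \mathit{eat}(e_2, e_3, e_4),\ \mathit{definite}(e_3),\ \mathit{cannibals}(e_3),\ \mathit{definite}(e_4),\ \mathit{missionaries}(e_4)$

$\mathit{when}(e_1, e_2),\ \mathit{eat}(e_2, e_3),\ \mathit{definite}(e_3),\ \mathit{cannibals}(e_3),\ \mathit{definite}(e_4),\ \mathit{missionaries}(e_4)$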

Since the intransitive reading carries a higher disconnectedness measure than the transitive
reading (1 vs. 0), the transitive reading is preferred. A Late Closure Effect therefore results
when the next word is drank. I use the capitalized term Disconnectedness to refer to the
theory that the processor prefers to minimize the measure of disconnectedness.
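The preference just described can be sketched as a selection among competing analyses. The snippet below is a hypothetical illustration: the variable indices in the two term lists are my reading of the two analyses of "When the cannibals ate the missionaries" (the variable names are assumptions), and the function names are invented for this example. It simply picks the analysis with the smallest disconnectedness measure.

```python
def disconnectedness(terms):
    """Components of the variable graph minus one (0 for an empty analysis)."""
    components = []  # disjoint sets of variables
    for _predicate, args in terms:
        if not args:
            continue
        merged = set(args)
        kept = []
        for comp in components:
            if comp & merged:
                merged |= comp  # this term bridges an existing component
            else:
                kept.append(comp)
        components = kept + [merged]
    return max(len(components) - 1, 0)


# Assumed variable assignment for the two readings of "ate".
analyses = {
    "transitive": [
        ("when", ("e1", "e2")),
        ("eat", ("e2", "e3", "e4")),
        ("definite", ("e3",)), ("cannibals", ("e3",)),
        ("definite", ("e4",)), ("missionaries", ("e4",)),
    ],
    "intransitive": [
        ("when", ("e1", "e2")),
        ("eat", ("e2", "e3")),
        ("definite", ("e3",)), ("cannibals", ("e3",)),
        ("definite", ("e4",)), ("missionaries", ("e4",)),
    ],
}


def preferred(candidates):
    """Name of the candidate analysis with minimal disconnectedness."""
    return min(candidates, key=lambda name: disconnectedness(candidates[name]))


for name, terms in analyses.items():
    print(name, disconnectedness(terms))
print("preferred:", preferred(analyses))
```

In the intransitive reading the missionaries variable is not yet an argument of any relational term, so it forms its own component; that is the only difference between the two scores.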

Disconnectedness similarly accounts for the garden path effects in ( ). At the word contributions,
there are two analyses, corresponding to the common noun and NP readings, respectively:

$\mathit{without}(e_1, e_2),\ \mathit{feminine}(e_3),\ \mathit{of}(e_2, e_3),\ \mathit{contributions}(e_2)$

$\mathit{without}(e_1, e_2),\ \mathit{feminine}(e_2),\ \mathit{implicit\text{-}quantifier}(e_3),\ \mathit{contributions}(e_3)$

The common noun reading is thus preferred.

Consequences

Disconnectedness predicts difficulty with ( ). In fact, since Disconnectedness is insensitive
to lexical or conceptual preferences, its input to the analysis selection process could conflict
with the input from lexical preferences. This conflict can account for the puzzling findings of
Stowe's second experiment, above.

The findings of Holmes et al. and Kennedy et al. that, in sentences such as ( )
above (repeated here), the clausal conditions TC and RC are slower to read than TR are
also consistent with the additional disconnectedness associated with the subject reading of the
safe's location within the house.

TR: The maid disclosed the safe's location within the house to the officer.

TC: The maid disclosed that the safe's location within the house had been changed.

RC: The maid disclosed the safe's location within the house had been changed.

Additional evidence in support of Disconnectedness comes from experiments with filling WH-gaps.
Boland, Tanenhaus, Carlson and Garnsey investigated whether plausibility affects
gap-filling. They used materials such as those below and a subject-paced, word-by-word cumulative
display method, where subjects were asked to detect when the sentence stopped making
sense. They found that the word them caused difficulty in (a) as compared to its control
(b).

a. Which child did Mark remind them to watch this evening?

b. Which movie did Mark remind them to watch this evening?



This is a placeholder for a semantic theory of bare plurals

Boland et al. conclude from this and other experiments that inferential information, such as the
argument structure of the verb, is used as soon as logically possible. Let us examine closely
what happens with these two sentences. In (b), when the reader comes to the word remind,
she can check whether movies can be reminded. Since that is implausible, the remindee spot
is not filled, and the next word, them, causes no difficulty. In (a), a child is something that
can be reminded, so the gap-filling analysis is pursued. But the non-filling alternative is just as
plausible: a person can remind someone of something having to do with children. Plausibility
alone cannot fully explain why the filling analysis is preferred in this case. Note that the non-filling
analysis has a higher disconnectedness measure: the relation between the WH-element
and the rest of the material in the utterance is not established. Without Disconnectedness,
one needs a partially structurally-based theory, such as first plausible gap, to account for this
gap-filling behavior.

The interpretation of the results of Holmes et al. and Kennedy et al. as
disconnectedness-related processing difficulty in the unambiguous TC condition suggests that there
might be other unambiguous, highly disconnected structures which are hard to process. Indeed,
center embedding, the classical example of an unambiguous structure that is hard to process,
reaches a disconnectedness measure of 2 after the word dog:

The rat that the cat that the dog

$\mathit{definite}(e_1),\ \mathit{rat}(e_1),\ \mathit{definite}(e_2),\ \mathit{cat}(e_2),\ \mathit{definite}(e_3),\ \mathit{dog}(e_3)$

The rat that the cat that the dog bit chased died.

$\mathit{definite}(e_1),\ \mathit{rat}(e_1),\ \mathit{definite}(e_2),\ \mathit{cat}(e_2),\ \mathit{definite}(e_3),\ \mathit{dog}(e_3),$
$\mathit{bite}(e_4, e_3, e_2),\ \mathit{past}(e_4),\ \mathit{chase}(e_5, e_2, e_1),\ \mathit{past}(e_5),\ \mathit{die}(e_6, e_1),\ \mathit{past}(e_6)$
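Since term lists are simply appended as words are combined, the disconnectedness measure can be tracked word by word. The sketch below illustrates this for the center-embedded sentence above. It rests on several assumptions that are not in the text: the assignment of terms to individual words (in particular, that the relativizer that contributes no terms), the variable names, the argument order of the verbs, and all function names are hypothetical.

```python
def disconnectedness(terms):
    """Components of the variable graph minus one (0 for an empty analysis)."""
    components = []  # disjoint sets of variables
    for _predicate, args in terms:
        if not args:
            continue
        merged = set(args)
        kept = []
        for comp in components:
            if comp & merged:
                merged |= comp
            else:
                kept.append(comp)
        components = kept + [merged]
    return max(len(components) - 1, 0)


# Hypothetical per-word term lists for
# "The rat that the cat that the dog bit chased died".
WORD_TERMS = [
    ("The", [("definite", ("e1",))]),
    ("rat", [("rat", ("e1",))]),
    ("that", []),
    ("the", [("definite", ("e2",))]),
    ("cat", [("cat", ("e2",))]),
    ("that", []),
    ("the", [("definite", ("e3",))]),
    ("dog", [("dog", ("e3",))]),
    ("bit", [("bite", ("e4", "e3", "e2")), ("past", ("e4",))]),
    ("chased", [("chase", ("e5", "e2", "e1")), ("past", ("e5",))]),
    ("died", [("die", ("e6", "e1")), ("past", ("e6",))]),
]


def profile(word_terms):
    """Disconnectedness after each word, appending term lists as we go."""
    analysis, scores = [], []
    for word, terms in word_terms:
        analysis.extend(terms)
        scores.append((word, disconnectedness(analysis)))
    return scores


for word, score in profile(WORD_TERMS):
    print(f"{word:8s}{score}")
```

Under these assumptions the measure peaks once all three noun phrases have been introduced and falls back to zero as each verb relates two of the pending entities.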

It seems quite likely that the computations of processing load which Hawkins uses to
derive many word-order universals could be recast in terms of disconnectedness score. Rambow's
account of marginally grammatical scrambled sentences in German, in terms of storage
requirements, is also very likely to be statable in terms of disconnectedness. But this awaits
further research.

Given that disconnectedness-related processing difficulties such as the Late Closure Effect
are mitigated, and often overridden, by inferential preferences, one would expect processing
difficulties with structures such as ( ) to be ameliorated by better semantic coherence. In
fact, this prediction is borne out. Bever hypothesized that ( ) might be easier to process
than ( ).

The dog that the destruction that the wild fox produced was scaring will run away fast.

Fodor, Bever and Garrett, and Frazier and Fodor, mention ( ) and ( ), respectively,
which seem somewhat easier to understand than ( ).

The water the fish the man caught swam in was polluted.

The snow that the match that the girl lit heated melted.

Frank provides ( ), which seems to do away with processing difficulty altogether.

A book that some Italian I've never heard of wrote will be published soon by MIT Press.

Inferential and discourse factors are clearly involved in the degree of difficulty of these sentences.
For example, note that having a deictic as the most deeply embedded subject, as in ( ), seems
to improve things somewhat, and replacing the definite subjects in ( ) with indefinites
in ( ) seems to make a further improvement. The interaction between these interpretive factors
and processing difficulty in the absence of ambiguity remains a matter for further research, as do
the subtle effects of the choice of relativizer (that vs. who/whom/which vs. zero).

The connection between ambiguity resolution preferences for semantically better-integrated
readings and processing difficulties with center embedding has been explored by Gibson.
While Gibson's measure of semantic integration is formulated in terms of the Government-and-Binding
principles (e.g. the Theta Criterion; see Chomsky) and not graph-theoretic notions,
his proposed underlying mechanisms are comparable to the account offered here: analyses
in which semantic relations among entities are established are preferred to analyses in which
they are not. Gibson opts for a different explanation for the relative improvement of ( ) over
( ). He assumes that each of ( ) through ( ), and presumably ( ), overwhelms the parser's
capacity and causes a breakdown of ordinary syntactic processing. In sentences like ( ), the
interpretive module is still able to piece the uncombined fragments together using inferential
processes such as determinations of plausibility. This sort of inference cannot salvage a sentence
such as ( ). Gibson's account predicts that in deeply embedded structure where the parser
breaks down, if there is a choice between syntactic ill-formedness and inferential implausibility,
the former will be opted for by the inferential salvaging process. For example, the string



Whatever manipulations one applies to ( ) to make it as good as ( ) can be applied in reverse, to cause
center-embedding-type processing difficulty for structures which are usually considered unproblematical. Consider
(i), which Gibson, following Cowper, takes to be unproblematical.

(i) The possibility that the man who I hired is incompetent worries me.

Replacing the deictic pronouns with definite NPs renders the resulting sentence, (ii), harder to understand.

(ii) The possibility that the man who the executive hired is incompetent worries the stockholders.

Some Italian that a book I've never heard of wrote will be published soon by MIT Press.

is predicted by his account to be judged as acceptable, or at least significantly better than ( ),
and construed as meaning the same thing as ( ).

There is no necessary connection between center embedding and disconnectedness: ( ) is just
as center-embedded as ( ), but does not encounter disconnectedness at any point.

John asked the woman that gave the boy that kicked the dog some candy why she's spoiling him.

Intuitively, ( ) is slightly easier to read than a variant whose structure directly mirrors
that of ( ):

The woman that the boy that the dog frightened is bothering right now is a friend of John's.

Eady and Fodor report an experiment in which they independently manipulated two
relative clauses, one contained in the other, for center embedding versus right-branching.
They found that (a) and (b) were of comparable reading difficulty, (c) was substantially
harder to read than (b), and (d) was harder yet. The largest difference was between (b)
and (c). That is, when the innermost relative clause is center-embedded, the difficulty is
greatest. Their results argue against an account of processing difficulty which is, in their words,
based on inherent properties of center-embedding. Descriptively, what matters most is whether
or not the filler-gap dependencies overlap. Disconnectedness theory captures this finding: the
maximum disconnectedness scores for (a) through (d) are ( ), ( ), ( ) and ( ), respectively.

a. Jack met the patient_i the nurse sent e_i to the doctor_j the clinic had hired e_j.

b. The patient_i the nurse sent e_i to the doctor_j the clinic had hired e_j met Jack.

c. Jack met the patient_i the nurse_j the clinic had hired e_j sent e_i to the doctor.

d. The patient_i the nurse_j the clinic had hired e_j sent e_i to the doctor met Jack.

(indices mark filler-gap dependencies)



I assume that the string in ( ) is somehow derivable by a combination of scrambling operations which operate
in other languages, but cannot be ruled out for this English sentence because the competence grammar is being
ignored.



To corroborate my intuitions, I conducted a miniature survey of six colleagues. I presented them with sentences
( ) through ( ).

John asked the woman that gave the boy that kicked the dog some candy why she's spoiling him.

John asked the woman that gave the boy that the dog frightened some candy why she's spoiling him.

The woman that the boy that the dog frightened is bothering right now is a friend of John's.

The woman that the boy that kicked the dog is bothering right now is a friend of John's.

Their maximum disconnectedness measures are ( ), ( ), ( ) and ( ), respectively. Everyone I asked initially rated all
sentences as equally bad. After some begging and coaxing on my part, each informant provided some partial
ranking. All responses were consistent with the ranking ( ) from best to worst, and only this ranking.
This is consistent with the predictions of Disconnectedness.

A Disconnectedness-based account of the difficulty of ( ) would predict that ( ) should be
completely free of any center-embedding-type processing difficulty, and that an even more deeply
nested structure, ( ), should be as easy to process as its purely right-branching control in ( ).
This does not seem to be the case; more research is needed.

John asked the woman that offered the boy that gave the dog that chased the cat a big kick some candy why she's spoiling him.

John met the woman that rewarded the boy that kicked the dog that chased the cat.

In summary, the strategy of minimizing the measure of disconnectedness has a variety of evidence
to support it:

residual Late Closure Effects

residual NP preference for NP vs. S ambiguities

gap-filling

processing difficulty in unambiguous, temporarily disconnected sentences

But would adoption of Disconnectedness weaken the overall thesis? After all, Disconnectedness
is stated over the sense-semantics of a string, a level of representation which is on the interface
of syntax and interpretation. It is quite conceivable that one could propose a notational variant
of disconnectedness theory which is stated solely in terms of structure. After all, its theoretical
predecessors, Pritchett's and Gibson's proposals, are based on thematic role assignment
in syntactic structure. Nevertheless, I claim that Disconnectedness is a viable candidate for a
component of the thesis of ambiguity resolution from interpretation. It is stated over the domain
of meaning, not syntactic structure. As is suggested by the susceptibility of disconnectedness to
discourse factors (e.g. ( )), the locus of disconnectedness might not be the sense-semantics as I
defined it, but a level of meaning representation which is deeper, more pragmatic.

Another potential objection is: why should a temporarily high disconnectedness measure matter
to the processor? Given that no complete grammatical sentence has any disconnectedness, the
processor can just patiently wait until the connecting words arrive. There are two responses
to this objection. First, a processor that waits for additional information before making its
decision might require large computational resources when faced with compounding ambiguity,
i.e. waiting might be too expensive. Second, a processor might well be closely attuned to
disconnectedness, since the very task of a sentence-understanding system is to determine the
logical connection among the words in the sentence: the better the connection, the more
preferred the analysis. It would follow that some connection is preferable to no connection.

I now turn to a drastically different account for most of the data in this section.

Avoid New Subjects

An examination of the syntactic structures that disconnectedness accounts for reveals that, with
one exception, they all involve a preference not to analyze an NP as a subject:

a. late closure effects

When the cannibals ate the missionaries drank.

Without her contributions failed to come in.

When they were on the verge of winning the war against Hitler Stalin Churchill and Roosevelt met in Yalta to divide up postwar Europe.

b. NP preference for NP vs. S complement ambiguity

John has heard the joke is offensive.

c. subject relative clause center embedding

The rat that the cat that the dog bit chased died.

d. gap-filling

Which child did Mark remind them to watch this evening?

The one exception is gap-filling. The so-called filled-gap effect (Crain and Fodor), which
readers experience in (d) at the word them, tends to be less severe than the other garden
path effects discussed in section ( ).

In a second set of experiments, Boland et al. present intriguing evidence that the
processing difficulty in (d) is not of the same sort as the other garden-path effects in section ( ).
Using the same subject-paced, word-by-word cumulative display, stop-making-sense task that
they used in their first experiment (described on page ( ) above), they investigated the effect of
the plausibility of the WH-filler on reading time. For materials as in the following:

a. Bob wondered which bachelor Ann granted a maternity leave to this afternoon.

b. Bob wondered which secretary Ann granted a maternity leave to this afternoon.

they found that subjects were able to detect the anomaly in (a) starting with the word leave,
that is, before the preposition to could trigger the construction of the phrase which contains the
gap position. This suggests that certain pragmatic integration processes occur before bottom-up
syntactic evidence is available to tell the processor that a gap is present.

It follows from this finding that encountering the unexpected NP them in (d) is odd not
just syntactically, but also pragmatically. Indeed, Boland (p.c.) reports varying strengths
of filled-gap effects for different lexical realizations of the surprising NP (e.g. pronoun, proper
name, indefinite NP, definite NP), suggesting that inference and accommodation might be
involved. The difference between a filled-gap effect and a garden path effect is then in the
processing component in which they are detected: a garden path is detected when the syntactic
processor discovers that none of the analyses that it is currently maintaining can be extended
with the current word. This condition results because the necessary analysis was discarded
earlier. A filled-gap effect, on the other hand, is initially detected in the interpreter, not the
syntactic processor. When the surprising NP appears, the interpreter has not yet told the syntactic
processor to commit to the filled-gap analysis.

With filled-gap effects now eliminated from the collection of garden-path data that Disconnectedness
is relevant for, Disconnectedness is indistinguishable on the remaining examples from a
preference for avoiding treating an NP as a subject. This is a very strange preference to have
in a processor whose purpose it is to understand sentences, given that every sentence has a
subject. Perhaps all of the subjects in the examples in section ( ) are somehow special, and the
prohibition is not on all subjects, only on this special sort of subject. In this section I argue that
this is indeed the case. All of the sentences were presented out of context, and it is subjects
that are new to the discourse that the processor seeks to avoid. It must be emphasized that
Avoid New Subjects makes no primary distinctions between definite and indefinite NPs. Out of
context, both are new to the discourse. In context, definites tend to be given more frequently,
but definiteness is not a defining characteristic of givenness.

Given and New

Prince proposes a classification of occurrences of NPs in terms of assumed familiarity.
When a speaker refers to an entity which she assumes salient/familiar to the hearer, she tends
to use a brief form such as a definite NP or a pronoun. Otherwise, the speaker is obliged to
provide the hearer with enough information to construct this entity in the hearer's mind. Prince
classifies the forms of NPs and ranks them from given to new:

evoked: An expression used to refer to one of the conversation's participants or an entity which
is already under discussion (usually a definite NP or pronoun).

unused: A proper name which refers to an entity known to the speaker and hearer, but not
already in the present discourse.

inferable: A phrase which introduces an entity not already in the discourse, but which is easily
inferred from another entity currently under discussion (cf. bridging inference of Haviland
and Clark).

containing inferable: An expression that introduces a new entity and contains a reference to
the extant discourse entity from which the inference is to proceed (e.g. One of the people
that work with me bought a Toyota).

brand new: An expression that introduces a new entity which cannot be inferentially related
or predicted from entities already in the discourse.

Prince constructs this scale on the basis of scale-based implicatures that can be drawn if a
speaker uses a form which is either too high or too low: such a speaker would be sounding
uncooperative/cryptic or needlessly verbose, respectively.

By subject I refer solely to canonical, preverbal subjects, and not to the broader class of grammatical subject,
which may include existential there sentences and V2 constructions such as Outside stood a little angel, inter
alia.

Using this classification, Prince analyzed two texts: the first is an informal chat, and the
second formal scholarly prose. Her findings are summarized in the following table:

                                       spoken                     written
                               subject   non-subject      subject   non-subject
Evoked
containing Inferable
New (unused and brand new)

In both genres, there is a clear tendency to make subjects more given. If we construe this
tendency as resulting directly from a principle of the linguistic competence which calls for using
subject position to encode given information, we would indeed expect a reader to prefer to treat
out-of-context NPs as something other than subjects. I refer to this principle as Avoid New
Subjects.

Consequences

The theory of Avoid New Subjects predicts that for ordinary text, spoken or written, the Late
Closure Effect and the residual NP preference for NP vs. S ambiguities should disappear. I now
present corpus-based investigations of these two predictions in turn.

Late Closure and Avoid New Subjects

To test the prediction that Late Closure Effects should disappear when the subject is given, I
conducted a survey of the bracketed Brown and Wall Street Journal corpora for the following
configuration: a VP which ends with a verb and is immediately followed by an NP. Crucially,
no punctuation was allowed between the VP and the NP. I then removed by hand all matches
where there was no ambiguity, e.g. the clause was in the passive, or the verb could not take the
NP as argument for some reason. Here are the remaining matches, preceded by a bit of context
and followed by illustration/discussion of the ambiguity.

(An article about a movie describes how its composer approached one of the singers.) When
you approach a singer and tell her you don't want her to sing you always run the risk of
offending.

You don't want her to sing you a song

From the way she sang in those early sessions, it seemed clear that Michelle Pfeiffer had
been listening not to Ella but to Bob Dylan. "There was a pronunciation and approach that
seemed Dylan-influenced," recalled Ms. Stevens. "Vowels were swallowed, word endings
were given short or no shrift. When we worked it almost became a joke with us that I
was constantly reminding her to say the consonants as well as the vowels."

When we worked it out

After the crash, and as a result of the recommendations of many studies, circuit
breakers were devised to allow market participants to regroup and restore orderly market
conditions. It's doubtful, though, whether circuit breakers do any real good. In the
additional time they provide even more order imbalances might pile up as would-be sellers
finally get their broker on the phone.

Even though this example involves gap-filling, the fact remains that the NP even more
order imbalances could be initially construed as a dative, as in: In the additional time
they provide even the slowest of traders problems could

(The article is about the movie The Fabulous Baker Boys. Preceding paragraphs describe
the actors and movie in generalities.) When the movie opens the Baker brothers are doing
what they've done for years professionally, and twice as long as that for themselves:
They're playing proficient piano, face-to-face, on twin pianos.

The movie opens the Baker brothers to criticism from

Jonathan Lloyd, executive vice president and chief financial officer of Qintex Entertainment,
said Qintex Entertainment was forced to file for protection to avoid going into
default under its agreement with MCA. The million payment was due Oct and the
deadline for default was Oct. Mr. Lloyd said if Qintex had defaulted it could have
been required to repay million in debt under its loan agreements.

Both Webster's and American Heritage Dictionary classify the verb default as both
transitive and intransitive. None of the occurrences of default in a larger corpus of
Wall Street Journal text take an NP complement.

What's more, the U.S. has suspended million in military aid and million in economic
aid to Somalia. But this is not enough. Because the U.S. is still perceived to be tied to
Mr. Barre, when he goes the runway could go too.

There are many transitive uses of go in the corpus: go a long way a step further a full
seven games golfing town watching home nuts hand in hand

Butch McCarty, who sells oil-field equipment for Davis Tool Co., is also busy. A native of
the area, he is back now after riding the oil-field boom to the top, then surviving the bust
running an Oklahoma City convenience store. "First year I came back there wasn't any
work," he says. "I think it's on the way back now."

First year I came back there I nearly

(Story about the winning company in a competition for teenage-run businesses, its president,
Tim Larson, and the organizing entity, Junior Achievement.) For winning Larson
will receive a U.S. Savings Bond from the Junior Achievement national organization.

winning Larson over to their camp

Why did the Belgians grant independence to the Congo, a colony so manifestly unprepared
to accept it? Yet there were other motivations for which history may not find
them guiltless. (paragraph break)
As the time for independence approached there were in the Congo no fewer than
political parties, or approximately eight for each university graduate.

As the time for independence approached there the people

Science has simply left us helpless and powerless in this important sector of our lives
(spirituality).
(paragraph break)
The situation in which we find ourselves is brought out with dramatic force in Arthur
Miller's play The Crucible, which deals with the Salem witch trials. As the play opens the
audience is introduced to the community of Salem in Puritan America at the end of the
eighteenth century.

the play opens the audience up to new

(bodybuilding advice: experimenting with a particular technique) Oh, you'll wobble and
weave quite a bit at first. But don't worry. Before your first training experiment has
ended there will be a big improvement, and almost before you know it you'll be raising and
lowering yourself just like a veteran.

Before your first training experiment has ended there in the room you'll know

The givenness status of the ambiguous NPs, in the order of the matches above, is as follows:

NP                            givenness status
you                           evoked
it                            pleonastic
even more order imbalances    brand new
The Baker brothers            evoked
it                            evoked
the runway                    evoked
there                         pleonastic
Larson                        evoked
there                         pleonastic
the audience                  inferable
there                         pleonastic

Summary
pleonastic    4
evoked        5
inferable     1
brand new     1
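The summary of the givenness table can be recomputed mechanically from its rows. The snippet below is a small illustrative tally, not part of the original study: the list simply transcribes the NP and status columns above in order, and the grouping of unused and brand new NPs under "new" follows the grouping used in the table of Prince's findings.

```python
from collections import Counter

# (NP, givenness status) pairs transcribed from the table of matches.
matches = [
    ("you", "evoked"),
    ("it", "pleonastic"),
    ("even more order imbalances", "brand new"),
    ("The Baker brothers", "evoked"),
    ("it", "evoked"),
    ("the runway", "evoked"),
    ("there", "pleonastic"),
    ("Larson", "evoked"),
    ("there", "pleonastic"),
    ("the audience", "inferable"),
    ("there", "pleonastic"),
]

tally = Counter(status for _np, status in matches)

# Statuses counted as "new" here: unused and brand new.
NEW_STATUSES = {"unused", "brand new"}
new_subjects = [np for np, status in matches if status in NEW_STATUSES]

for status, count in tally.most_common():
    print(f"{status:12s}{count}")
print(f"new subjects: {len(new_subjects)} of {len(matches)}")
```

With so few matches the tally is descriptive only; it makes no claim of statistical significance.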

Prince's givenness scale does not include pleonastic NPs, since they do not refer. For the present
purpose, it suffices to note that Avoid New Subjects does not rule out pleonastics. While the
numbers here are too small for statistical inference, the data suggest that the prediction of
Avoid New Subjects is maintained.

If one had to guess the perceived givenness status of pleonastics, considering their tendency cross-linguistically
to be homophonous with pronouns and deictics, one would guess that they are treated as given.



Given the high frequency of given subjects, optionally transitive verbs, and fronted adverbials, one might
expect more matches in a two-million-word corpus. But examination of the Wall Street Journal corpus reveals
that most fronted adverbials are set off by comma, regardless of potential ambiguity. Of ( ) sentence-initial
adverbials, only ( ) are not delimited by comma. Of these adverbials, ( ) have the category
SBAR, of which only ( ) are not delimited by comma. The great majority of fronted adverbials
have category PP, of which ( ) are not delimited by comma. The high frequency of the comma therefore
has the effect of significantly shrinking the available corpus of relevant examples.



It must be emphasized that these findings are just suggestive. Just because a particular sentence appears in a
newspaper, it does not mean that it did not cause the proofreader to garden-path. This is especially true of
sentence ( ) above, which causes some readers to garden-path. The only way to really test the current hypothesis
is using carefully constructed minimal pairs.

Avoid New Subjects also provides an account for the perplexing results of the second experiment
reported by Stowe, as discussed in the beginning of section ( ). Recall that Stowe
used materials such as:

Animate

Plausible: When the police stopped the driver became very frightened.

Implausible: When the police stopped the silence became very frightening.

Inanimate

Plausible: When the truck stopped the driver became very frightened.

Implausible: When the truck stopped the silence became very frightening.

For the animate condition, one expects an effect of implausibility at the critical NP the silence,
because the reader is using the causative analysis of stopped. Given the evidence from her
first experiment (using sentences like the inanimate plausible in ( )) that inanimate subjects
cause readers to adopt the ergative analysis, one would not expect the reader to consider the
object analysis of the critical NP for inanimate conditions. But this is exactly what Stowe
found: implausibility effects for the inanimate condition which mirrored those for the animate
condition.

To resolve this paradoxical finding, one must make two observations. First, while the inanimate
subject (truck) indeed rules out a causative analysis for the verb (stopped), it does
not necessarily rule out all other transitive analyses. In particular, stopped allows a third
subcategorization frame, the so-called instrumental:

Causative: John moved the pencil.

Ergative: The pencil moved.

Instrumental: The pencil moved the paper.

Unlike the ergative, the subject of an instrumental is not the patient (affected object). The
name is somewhat of a misnomer, because in examples such as the following, the instrumental subject
might not be serving as an instrument of any causal agent.

The sleet stopped the parade short.

The second observation is that in the inanimate plausible condition the givenness status of the

critical NP the truck is inferable In the inanimate implausible condition it is brand new

more new in Princes terms than an inferable

Both of these observation are true of the great ma jority of the exp erimental materials in Stowe

Given the availability of a transitiveverb analysis of the inanimate conditions and given

the tendency to avoid new sub ject a tendency whichislikely to b e sensitive to the degree of

newness it is no longer surprising that readers chose the ob ject analysis for the critical NP in

the inanimate implausible condition The presence of the instrumental analysis did not matter



Theological arguments to the contrary notwithstanding

for the rst exp eriment where the critical NPs were plausible and crucially inferable not

so new as to drive the pro cessor to the ob ject NP analysis

Disconnectedness theory is not conditioned on discourse status and cannot simultaneously account
for the plausibility effects in the inanimate conditions of the second experiment and the lack of
garden path effects in the ambiguous conditions of the first experiment. It would have to be
restated over representations which distinguish unrelated entities from those which can be related
by means of bridging inferences.

In order to decide between Disconnectedness and Avoid New Subjects, we may be able to combine
results from two very different experiments, using the following reasoning. While Disconnectedness
theory makes predictions for gap-filling, Avoid New Subjects does not. To falsify Disconnectedness,
one could show that Disconnectedness acts irreconcilably differently when driving gap-filling than
it does when driving late-closure effects. One way of characterizing a preference is how strong it
is compared to another one, in this case plausibility. Recall the experiment of Boland, Tanenhaus,
Carlson and Garnsey, discussed earlier. Using examples such as the following, Boland et al. argued
that gaps are filled unless implausibility results.

    a. Which child did Mark remind them to watch this evening?
    b. Which movie did Mark remind them to watch this evening?

It follows that for gap-filling decisions, Disconnectedness is not as strong a factor as
Plausibility. Stowe's second experiment, on the other hand, suggests that Disconnectedness (or
Avoid New Subjects) is sufficiently strong so as to override Plausibility. Of course, to be
convincing, Stowe's second experiment must be repeated with materials which completely rule out
transitive/instrumental readings in the inanimate conditions. For example:

    While the cake was baking the oven caught fire.
    As the plot unfolds the reader is ushered into a world

Complement Clauses

In order to be relevant for the ambiguity in example (b), Avoid New Subjects must be applicable not
just to subjects of root clauses, but also to embedded subjects. It is widely believed that
constituents in a sentence tend to be ordered from given to new. The statistical tendency to avoid
new subjects may be arising solely as a consequence of the tendency to place new information toward
the end of a sentence and the grammatically-imposed early placement of subjects. If this were the
case, that is, if Avoid New Subjects were a corollary of Given Before New, then Avoid New Subjects
would make no predictions about subjects of complement clauses, as these are neither at the
beginning nor at the end of the sentence/utterance. In this section I argue that it is the
grammatical function of subjects, not just their linear placement in the sentence, that is involved
with the avoidance of new information.

When a speaker/writer wishes to express a proposition which involves reference to an entity not
already mentioned in the discourse, she must use a new NP. She is quite likely to avoid placing
this NP in subject position. To this end, she may use constructions such as passivization,
there-insertion, and clefts. It is often observed that speakers tend to use structures like (b) in
order to avoid structures like (a):

    a. A friend of mine drives a Mercedes.
    b. I have a friend who drives a Mercedes.

The theory of Avoid New Subjects predicts that this sort of effort on behalf of writers should be
evident in both root clauses and complement clauses. To test this prediction, I conducted another
survey of the Penn Treebank. I compared the informational status of NPs in subject and non-subject
positions in both root and embedded clauses, as follows. I defined subject position as an NP
immediately dominated by S and followed (not necessarily immediately, to allow for auxiliaries,
punctuation, etc.) by a VP. I defined non-subject position as an NP either immediately dominated by
VP, or immediately dominated by S and not followed (not necessarily immediately) by VP. To
determine givenness status, I used a simple heuristic procedure to classify an NP into one of the
following categories: empty-category, pronoun, proper-name, definite, indefinite, not-classified.
The observed frequencies for the bracketed Brown corpus are as follows:

                    root clause            embedded clause
                    subj    non-subj       subj    non-subj
    empty-category
    pronoun
    proper-name
    definite
    indefinite
    not-classified
    total

All pronouns are either pleonastic or evoked; they are thus fairly reliable indicators of given (at
least non-new) NPs. The category indefinite contains largely brand-new or inferable NPs, thus being
a good indicator of new information. Considering pronouns and indefinites, there is a clear effect
of grammatical function for both root clauses and embedded clauses:

                    root clause            embedded clause
                    subj    non-subj       subj    non-subj
    pronoun
    indefinite

                    p                      p

The prediction of Avoid New Subjects is therefore verified.
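The structural definitions of subject and non-subject position used in this survey can be made concrete with a small sketch. The Python fragment below is an illustration only, not the procedure actually run on the Treebank: the nested-tuple tree encoding and the name `np_positions` are assumptions introduced here, and the givenness heuristic is left out because its details are not given in the text.

```python
def np_positions(tree):
    """Yield (position, np_subtree) pairs for a bracketed tree.

    A tree is a tuple (label, child, child, ...); leaves are plain strings.
    This encoding is an assumption of the sketch, not the Treebank format.
    """
    label, *children = tree
    for i, child in enumerate(children):
        if isinstance(child, str):
            continue  # a word, not a constituent
        if child[0] == "NP":
            # "Followed (not necessarily immediately) by a VP" among the siblings.
            followed_by_vp = any(
                not isinstance(sib, str) and sib[0] == "VP"
                for sib in children[i + 1:]
            )
            if label == "S" and followed_by_vp:
                yield ("subj", child)
            elif label == "VP" or (label == "S" and not followed_by_vp):
                yield ("non-subj", child)
            # NPs dominated by other categories fall under neither definition.
        yield from np_positions(child)


# Example: "John has heard the joke"
example = ("S",
           ("NP", "John"),
           ("VP", "has",
            ("VP", "heard",
             ("NP", "the", "joke"))))

positions = list(np_positions(example))
for position, np in positions:
    print(position, " ".join(np[1:]))
```

Each NP found this way would then be passed to whatever givenness classifier is in use, and the counts tabulated by position and clause type.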

Footnote: I am grateful to Robert Frank for helpful suggestions regarding this procedure.

Footnote: For clarity, I only give results from the Brown corpus, but all assertions I make also
hold of the Wall Street Journal corpus. Appendix A contains data for both corpora.

As remarked earlier in this section, when a hearer/reader is faced with an initial segment such as

    John has heard the joke

the ambiguity is not exactly between an NP complement analysis versus an S-complement analysis, but
rather between a TR (transitive verb) analysis and an RC (reduced S-complement) analysis. It is
therefore necessary to verify that Avoid New Subjects is indeed operating in this RC subclass of
sentential complements. A further analysis reveals that this is indeed the case:

                    TC                     RC
                    subj    non-subj       subj    non-subj
    empty-category
    pronoun
    proper-name
    definite
    indefinite
    not-classified
    total

                    TC                     RC
                    subj    non-subj       subj    non-subj
    pronoun
    indefinite

                    p                      p



If anything, Avoid New Subjects has a stronger effect after a zero complementizer.

Footnote: This is in fact demonstrable: when a writer must place a new NP in an embedded subject
position, she tends not to omit the complementizer.

    embedded subject    TC      RC
    pronoun
    indefinite

    p

This observation provides a tantalizing suggestion that the that-trace effect, exemplified below,
may in fact have a functional explanation: the overt complementizer tends to signal new subjects,
and a WH-gap can be thought of as the most given NP possible.

    a. Who did John say Mary likes?
    b. Who did John say that Mary likes?
    c. Who did John say likes Mary?
    d. *Who did John say that likes Mary?

As given here, the mere tendency for new subjects to be associated with an overt complementizer
falls short of completely accounting for the categorical that-trace effect. This issue awaits
further research.

Unambiguous Structures

The consequences of Avoid New Subjects on unambiguous structures such as

    TR: The maid disclosed the safe's location within the house to the officer.
    TC: The maid disclosed that the safe's location within the house had been changed.
    RC: The maid disclosed the safe's location within the house had been changed.

(from Holmes et al.) and center embedding are remarkably similar to those of Disconnectedness
theory. When presented out of context, TC requires the reader to accommodate a subject which is new
to the discourse, which the TR form does not require. The TC form is thus predicted to present some
difficulty.

Avoid New Subjects also predicts a difference between the following two sentences:

    The rat that the cat that the dog bit chased died.
    A book that some Italian I've never heard of wrote will be published soon by MIT press.

The first requires the reader to accommodate three new subjects simultaneously, whereas the second
requires only two, since I is an evoked entity. Substituting a new entity for I is predicted to
render the sentence harder to process:

    A book that some Italian the teacher has never heard of wrote will be published soon by MIT
    press.

Also, as with Disconnectedness, changing the locus of the embedding from subject to complement
predicts an amelioration of center embedding difficulty in the first sentence below (repeated from
earlier), as compared with the mixed subject-object embedding in the second and the doubly
subject-embedded third:

    John asked the woman that gave the boy that kicked the dog some candy why she's spoiling him.
    John asked the woman that gave the boy that the dog frightened some candy why she's spoiling
    him.
    The woman that the boy that the dog frightened is bothering right now is a friend of John's.

Considering Eady and Fodor's results (discussed earlier), Avoid New Subjects predicts difficulty
when there are many simultaneous new subjects. The maximum numbers of simultaneous subjects in (a)
through (d) yield the incorrect prediction that the difference in processing difficulty between (b)
and (c) should be smaller than the other two differences.

Lastly, both Avoid New Subjects and Disconnectedness fail to account for the remaining center
embedding effects in non-subject embedding: the sentence below is embedded one level deeper than
the complement-embedding example above. The additional level of embedding exacts a cost in
processing difficulty, despite its inoffensiveness to both Disconnectedness and Avoid New Subjects.

    John asked the woman that offered the boy that gave the dog that chased the cat a big kick some
    candy why she's spoiling him.

Footnote: Robert Ladd (p.c.) hypothesizes that difficulties with center-embedded constructions stem
from the unavailability of well-formed prosodic structures. Consider the following contrast:

    i.  The shirts that the maid Tom can't stand sent to the laundry came back in tatters.
    ii. The shirts that the maid, whom Tom can't stand, sent to the laundry came back in tatters.

Ladd argues that the vocabulary of major prosodic breaks (i.e. the single item 'major break', or
'comma') is not sufficiently rich to indicate nesting of brackets, or even whether a break denotes
a left or right bracket. In (i) one could get by with one break after the entire matrix subject,
and the sentence sounds fine, except perhaps for an unusually long intonational phrase. In (ii)
three breaks are necessary, one for each comma and one at the end of the matrix subject, so the
nesting relations are not properly encoded/recovered.

The residual difficulty of center-embedding constructions is very likely explained by memory
limitations in the syntactic processor. Bach, Brown and Marslen-Wilson compared center-embedding
and crossed-dependency constructions in German and Dutch, as in the examples below, and found that
the center-embedded examples in German were harder to understand than their crossed-dependency
analogs in Dutch.

German
    Arnim hat Wolfgang der Lehrerin die Murmeln aufräumen helfen lassen.
    Arnim has Wolfgang the teacher the marbles collect-up help let
    'Arnim let Wolfgang help the teacher collect up the marbles.'

Dutch
    Aad heeft Jantje de lerares de knikkers laten helpen opruimen.
    Aad has Jantje the teacher the marbles let help collect-up
    'Aad let Jantje help the teacher collect up the marbles.'

Rambow and Joshi propose a syntactic parsing automaton based on Tree Adjoining Grammar. Using the
storage mechanism of their automaton, they define a processing complexity metric based on how many
storage cells a particular parse needs, and how long they are needed for (in analogy with paying
'rent' for storage space). This complexity metric is consistent with the findings of Bach et al.
This metric also provides an interesting predictor of processing difficulty associated with various
word-order variations of complex sentences in German; Rambow and Joshi show that a range of
acceptability judgements can be accounted for. Applying Joshi and Rambow's automaton account to
pre-verbal and post-verbal center embedding yields no difference (Owen Rambow, p.c.): all that
matters is that the dependencies are nested. This suggests that center embedding difficulties
really do originate from memory limitations in the syntactic processor. We can conclude that the
difficulty with the classic center-embedded sentences, such as the one below, is the aggregate of
difficulties in two loci: memory requirements in the syntactic processor, and interpretive effects
resulting from subject embedding, as discussed above (cf. Eady and Fodor).

The rat that the cat that the dog bit chased died

Summary

I have presented two competing theories which account for human performance patterns on a variety
of syntactic constructions. Disconnectedness theory assigns a penalty for each constituent which
has not been semantically integrated with the rest of the constituents. Avoid New Subjects theory
assigns a penalty for noun phrases which appear in subject position and introduce entities which
are new to the discourse. Avoid New Subjects requires no assumptions about the sentence processing
system beyond what is already necessary for accounting for competence phenomena, namely that people
use subject position to encode given information. Disconnectedness theory requires the assumption
that the processor prefers to avoid disconnected analyses even when the disconnectedness can be
eliminated by immediately forthcoming words in the string.

While these two theories are very different, being stated over different domains, their predictions
coincide for much of the available data. Disconnectedness theory, as defined here, is inconsistent
with the post-hoc analysis I have presented earlier in this chapter for Stowe's second experiment,
i.e. it is insensitive to the degree of newness. Of course, a direct experiment would be necessary
to validate that analysis. Another area of disagreement between the two theories is in gap-filling.
Disconnectedness theory predicts that, other factors being equal, the processor would prefer to
fill a gap if a filler is available. Avoid New Subjects makes no predictions with regard to
gap-filling. I have argued earlier in this chapter that putting together results from different
experiments could provide us with grounds for falsifying Disconnectedness theory. The necessary
experiments remain for future research.

In the next two chapters I present a computational framework for modelling the various aspects of
sentence processing. The aim is to ultimately provide a means of integrating experimental results
and theories regarding a variety of factors into one consistent picture.

Chapter

Parsing CCG

In the preceding chapters I have argued for a view of the sentence processing architecture where
the syntactic processor (the parser) proposes syntactic analyses for the incoming words, and the
interpreter chooses among them based on sensibleness. This is depicted diagrammatically in the
figure below.

Figure: An interactive sentence-processing architecture. (Labels: Input Utterance; Parser;
Competence Grammar; Syntactic Analyses; Semantic-Pragmatic Interpreter; Analysis-suspension
Messages. Key: declarative knowledge, computational process, data flow.)

Having argued for an architecture of this general kind, I now focus on the specifics of each of the
two constituent components in turn. In this chapter I consider the design of the parsing component,
and in the next, I turn to the interpreter and the integrated system.

Goal

In the preceding two chapters I have argued that syntactic ambiguity is resolved according to the
interpretations of the available readings. Given the virtual immediacy with which a word's
contribution to the meaning impinges on ambiguity resolution decisions, it follows that the parser,
whose task it is to identify for the interpreter the syntactic relations among the words in the
sentence, must be performing this task very quickly. That is, at every word, the parser identifies
all of the grammatically available alternative analyses, and determines for each analysis all of
the syntactic relations which are logically entailed by the competence grammar, or at least enough
syntactic relations to draw the distinctions necessary for interpretation. Crucially, these
determinations must not be delayed by the parser until the end of an utterance, or even the end of
a clause.

My aim in this chapter is to adhere to the central claim of the dissertation, that the parser is as
simple as logically possible: all that it encodes is analyses as defined by the competence grammar.

Steedman has proposed a processor which he claims is able to construct sense-semantics in a timely
fashion and, in addition, embodies a very transparent relation to the grammatical competence, the
so-called Strict Competence Hypothesis. In the next section, I present Steedman's architecture. In
the following five sections I consider five different challenges to the simplicity and adequacy of
Steedman's proposal, and advocate certain extensions to Steedman's design which promise to address
shortcomings with the original.

Steedman's Proposal

How simple can the syntactic processor be? Steedman argues as follows. At the very minimum, the
processor needs three components: a transparent representation of the grammar, a method for
constructing constituents by executing steps in the grammar, and a method of resolving ambiguity.
If the competence grammar is in its traditional form, e.g. always dividing a sentence into a
subject and a predicate, then it turns out that this minimal collection of three components is
inadequate to provide the necessary sense-semantics. Consider the following pair:

    a. The doctor sent for the patient arrived.
    b. The flowers sent for the patient arrived.

While (a) is a garden path, (b) is not. This is because the implausibility of the main verb
analysis of the flowers sent is detected. This detection takes place before the sentence is
complete. It follows that the sense-semantics of the subject is combined with the verb before the
entire VP is processed. If the grammar requires a VP node, however, the straightforward
interpretation of the minimal model above, wherein the processor can only combine two constituents
when the syntax allows them to combine, must wait until the VP is finished before the content of
the subject is integrated with the content of the VP. Steedman argues that the obvious ways of
relaxing this strict rule-to-rule parsing hypothesis (which he calls the Strict Competence
Hypothesis), such as adding Earley-style dotted rules (Earley) or a top-down parse-stack,
complicate the design of the parser and shift additional explanatory burden to the theory of
evolution. Steedman argues that if the grammar does not require an explicit VP constituent, i.e. if
it is able to treat The flowers sent (where sent is the main verb) as a constituent, strict
competence can be restored to the processor.

Details of Steedman's grammatical theory, Combinatory Categorial Grammar (CCG), provide an
illustration of his claim. In CCG, every constituent has a grammatical category drawn from a
universe of categories as follows. There is a finite set of basic categories such as s, np, n, etc.
It is given either as a list of symbols, or as a space of finite feature structures. There are two
binary type-forming connectives, / and \, such that if X and Y are categories then X/Y and X\Y are
also categories. The set of categories is the set of basic categories closed under the connectives
/ and \. By convention, slashes associate to the left, so (s\np)/np is usually written s\np/np.
Intuitively, a constituent with category X/Y (or X\Y) is an X which is missing a Y to its right
(left). CCG is a lexicalized grammar formalism, which means that the collection of
constituent-combination rules is rather minimal, and most of the complexity of the grammar resides
in the way individual words are assigned a category (or a set of categories, in case of lexical
ambiguity). From the description of the meaning of the slash connectives, one expects the following
combinatory rules:

    Forward Functional Application:    X/Y  Y    →  X
    Backward Functional Application:   Y    X\Y  →  X

By convention, the arrow is in the direction of parsing, not generation. These are actually rule
schemata, where X and Y are variables which range over categories. A rule combines two adjacent
constituents whose categories match its left-hand side and creates a new constituent with the
category on its right-hand side. A particular CCG can stipulate restrictions over the categories
that the variables may take as values. In addition to the two so-called functional application
rules above, CCGs also include functional composition rules such as X/Y Y/Z → X/Z. In the rest of
this document, I use the following unified notation for application and (generalized) functional
composition:

    Forward Combination                  rule name    Backward Combination                 rule name
    X/Y  Y            →  X               >0           Y            X\Y  →  X               <0
    X/Y  Y|Z1         →  X|Z1            >1           Y|Z1         X\Y  →  X|Z1            <1
    X/Y  Y|Z1|Z2      →  X|Z1|Z2         >2           Y|Z1|Z2      X\Y  →  X|Z1|Z2         <2
    ...
    X/Y  Y|Z1...|Zn   →  X|Z1...|Zn      >n           Y|Z1...|Zn   X\Y  →  X|Z1...|Zn      <n

In the table above, |Z stands for either /Z or \Z. Underlined regions in a rule (the |Z sequences)
must match.
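The unified combination schema lends itself to a compact sketch. The Python below is a minimal illustration under stated assumptions: the class names `Basic` and `Slash`, the functions `forward_combine` and `backward_combine`, and the default bound `max_order=2` are all inventions of this sketch, and no grammar-specific restrictions on the variables are modelled.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Basic:
    name: str

    def __str__(self):
        return self.name


@dataclass(frozen=True)
class Slash:
    result: "Cat"
    direction: str  # "/" looks rightward for its argument, "\\" looks leftward
    arg: "Cat"

    def __str__(self):
        # Slashes associate to the left, so only a complex argument needs brackets.
        arg = f"({self.arg})" if isinstance(self.arg, Slash) else str(self.arg)
        return f"{self.result}{self.direction}{arg}"


Cat = Union[Basic, Slash]


def _combine(x: Cat, y: Cat, other: Cat, max_order: int) -> Optional[Cat]:
    """If `other` is Y|Z1...|Zn for some n <= max_order, return X|Z1...|Zn."""
    peeled = []
    current = other
    for _ in range(max_order + 1):
        if current == y:
            result = x
            for direction, arg in reversed(peeled):
                result = Slash(result, direction, arg)
            return result
        if not isinstance(current, Slash):
            return None
        peeled.append((current.direction, current.arg))
        current = current.result
    return None


def forward_combine(left: Cat, right: Cat, max_order: int = 2) -> Optional[Cat]:
    """X/Y  Y|Z1...|Zn  ->  X|Z1...|Zn  (n = 0 is plain application)."""
    if isinstance(left, Slash) and left.direction == "/":
        return _combine(left.result, left.arg, right, max_order)
    return None


def backward_combine(left: Cat, right: Cat, max_order: int = 2) -> Optional[Cat]:
    """Y|Z1...|Zn  X\\Y  ->  X|Z1...|Zn  (n = 0 is plain application)."""
    if isinstance(right, Slash) and right.direction == "\\":
        return _combine(right.result, right.arg, left, max_order)
    return None


s, np = Basic("s"), Basic("np")
vp = Slash(s, "\\", np)        # s\np
john = Slash(s, "/", vp)       # s/(s\np), a type-raised subject
has = Slash(vp, "/", vp)       # (s\np)/(s\np)
met = Slash(vp, "/", np)       # (s\np)/np

john_has = forward_combine(john, has)          # composition, n = 1
john_has_met = forward_combine(john_has, met)  # composition, n = 1
sentence = forward_combine(john_has_met, np)   # application, n = 0
print(john_has, john_has_met, sentence)
```

Treating application as the n = 0 case of composition keeps the two directions symmetric, which is the point of the unified notation.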

Aside from the combination rules above, CCG systems often include two other kinds of rules: Type
raising and Substitution. Type raising, schematized as

    Forward Type Raising:    X  →  Y/(Y\X)    (>T)
    Backward Type Raising:   X  →  Y\(Y/X)    (<T)

is assumed to apply in the lexicon, and is therefore not included as a rule in the grammar. The
Substitution rule, posited in order to handle parasitic gaps (Steedman; Szabolcsi), is

    Substitution:   Y/Z  (X\Y)/Z  →  X/Z    (<S)

and is also included in the universe of rules.

In addition to the above rules, there is a special rule for coordination, which combines three
sub-constituents:

    X  coord  X  →  X

A derivation is a tree whose leaves are categories and whose internal nodes are valid rule
applications. A string is grammatical just in case there is a derivation whose frontier is a
sequence of categories which are each in the lexical entry of the corresponding word in the string.
Aside from determining the syntactic category of a string, a derivation can also assign it
semantics. One way of achieving this is using combinators (Curry and Feys; Quine; Steedman). A
combinatory semantics consists of augmenting each lexical entry with a semantic object, and each
combinatory rule with a semantic combination recipe. The lexicon then maps a word to a set of pairs
⟨syntactic category, semantic object⟩. The semantic combination recipes are as follows:

    X:a    Y:b          →  Z: B^i a b      (>i)
    Y:b    X:a          →  Z: B^i a b      (<i)
    X:a                 →  Y/(Y\X): T a    (>T)
    X:a                 →  Y\(Y/X): T a    (<T)
    Y/Z:b  (X\Y)/Z:a    →  X/Z: S a b      (<S)

Juxtaposition denotes term application. By convention, terms associate to the left, so (x y) z is
written as x y z.

i

The semantic terms B^i, T and S are special symbols called combinators. They do not carry any
semantic content themselves; rather, they encode combinatorial 'recipes' of their arguments,
according to the following equations:

    B^i x y z_1 ... z_i  =  x (y z_1 ... z_i)
    T x y                =  y x
    S x y z              =  x z (y z)

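By way of an illustration, consider the following unambiguous lexicon:

    John     s/(s\np)         T j      (notice the application of T in the lexicon)
    has      (s\np)/(s\np)    has
    met      (s\np)/np        met
    Susan    np               s

The string John has met Susan is grammatical since it is possible to derive a single constituent
from it, as follows:

    John             has                  met             Susan
    s/(s\np): T j    (s\np)/(s\np): h     (s\np)/np: m    np: s
    -----------------------------------
    s/(s\np): B (T j) h
    ----------------------------------------------------
    s/np: B (B (T j) h) m
    -------------------------------------------------------------
    s: B (B (T j) h) m s
       = B (T j) h (m s)
       = T j (h (m s))
       = h (m s) j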

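Notice, however, that there are other derivations for this string which yield the same semantic
result. For example:

    John             has                  met             Susan
    s/(s\np): T j    (s\np)/(s\np): h     (s\np)/np: m    np: s
                     -----------------------------------
                     (s\np)/np: B h m
    ----------------------------------------------------
    s/np: B (T j) (B h m)
    -------------------------------------------------------------
    s: B (T j) (B h m) s
       = T j (B h m s)
       = T j (h (m s))
       = h (m s) j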
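That the two derivations yield the same result can be checked mechanically. The sketch below, in Python, encodes B (for the case i = 1), T and S as curried functions and uses a hypothetical `Term` class, introduced only for this illustration, to print applications with the left-associative convention used above.

```python
def B(x):
    """B x y z = x (y z)"""
    return lambda y: lambda z: x(y(z))


def T(x):
    """T x y = y x"""
    return lambda y: y(x)


def S(x):
    """S x y z = x z (y z)"""
    return lambda y: lambda z: x(z)(y(z))


class Term:
    """A symbolic constant; applying it builds a printable application term."""

    def __init__(self, text, atomic=True):
        self.text = text
        self.atomic = atomic

    def __call__(self, arg):
        # Application associates to the left, so only complex arguments get brackets.
        arg_text = arg.text if arg.atomic else f"({arg.text})"
        return Term(f"{self.text} {arg_text}", atomic=False)

    def __repr__(self):
        return self.text


j, h, m, s = Term("j"), Term("h"), Term("m"), Term("s")

# First derivation: ((John has) met) Susan
left_branching = B(B(T(j))(h))(m)(s)

# Second derivation: (John (has met)) Susan
mixed = B(T(j))(B(h)(m))(s)

print(left_branching)
print(mixed)
```

Because the combinators are ordinary functions here, Python's own evaluation performs the rewriting steps shown in the derivations.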

These analyses make use of the functional composition rule to construct the non-traditional
constituent John has met. It has been argued (e.g. Dowty; Steedman) that such constituents are
necessary for a proper treatment of the syntax of coordination, WH-dependencies, and sentence-level
prosodic structure. The reader is referred to these papers for details of the theory of competence.

Steedman's point, then, is that a processor for CCG uses the slash mechanism of the competence
grammar (the same mechanism which is responsible for constructing the material between a WH-filler
and its gap, and non-standard constituents for coordination) in order to produce a grammatical
constituent for the flowers sent in the pair repeated below, whereas a processor for a traditional
phrase-structure grammar would have to use grammar-external devices such as dotted rules to achieve
the same effect.

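    a. The doctor sent for the patient arrived.
    b. The flowers sent for the patient arrived.

Figure: A circuit for computing z = xy + (−y), from Shieber and Johnson. (Phase 1: a multiplier and
a negation unit, with inputs x and y and outputs u and v. Phase 2: an adder producing z.)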

Strict Competence and Asynchronous Computation

Shieber and Johnson claim that Steedman's argument that a standard right-branching grammar requires
a more complicated parser rests on an incorrect assumption. They distinguish two sorts of
computational architectures, synchronous and asynchronous, and argue that the assumption of a
synchronous architecture, necessary for Steedman's argument, is no more likely a priori than that
of an asynchronous architecture, and may in fact be less likely. In this section, I present Shieber
and Johnson's argument and assess its force.

Synchronous and Asynchronous Computation

Suppose one had to construct a machine to compute the following function of two numeric arguments x
and y:

    f(x, y) = xy + (−y)

out of components which perform primitive arithmetic operations.

One way to do this is to use a two-phase circuit, as in the figure above. The first phase computes
the intermediate results u = xy and v = −y, and the second phase computes the sum of u and v. The
multiplication unit could come in one of two varieties: synchronous or asynchronous. The
synchronous variety requires that both of its inputs be specified before the output is computed.
The asynchronous variety emits an output as soon as it can: whenever one of its inputs is zero, it
does not wait for data on the other input before emitting the answer, zero, on its output. If the
circuit is indeed built from asynchronous components, then a y input of zero would cause it to emit
a z answer of zero without waiting for the value of x. If one were using asynchronous components
but wanted phase-level synchronization, i.e. all of a phase's inputs must be specified before its
output is emitted, one would have to build additional restraints into the circuit.
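The contrast between the two kinds of multiplier can be simulated in a few lines. The following Python sketch is an illustration only. It assumes the circuit computes u = xy, v = −y and z = u + v, and it models an input that has not yet arrived as `None`; the function names are inventions of this sketch.

```python
from typing import Callable, Optional

Value = Optional[float]  # None models an input that has not arrived yet


def sync_multiply(a: Value, b: Value) -> Value:
    """Synchronous unit: waits until both inputs are specified."""
    if a is None or b is None:
        return None
    return a * b


def async_multiply(a: Value, b: Value) -> Value:
    """Asynchronous unit: answers zero as soon as either input is known to be zero."""
    if a == 0 or b == 0:
        return 0
    if a is None or b is None:
        return None
    return a * b


def negate(a: Value) -> Value:
    return None if a is None else -a


def add(a: Value, b: Value) -> Value:
    return None if a is None or b is None else a + b


def circuit(x: Value, y: Value, multiply: Callable[[Value, Value], Value]) -> Value:
    u = multiply(x, y)   # phase 1
    v = negate(y)        # phase 1
    return add(u, v)     # phase 2


# y has arrived and is zero; x is still unknown.
print(circuit(None, 0, async_multiply))  # an answer is already available
print(circuit(None, 0, sync_multiply))   # no answer yet
```

Enforcing phase-level synchronization on top of `async_multiply` would require an extra check that both x and y are present before phase 2 runs, which is the kind of additional restraint the text describes.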

Shieber and Johnson argue that Steedman's strict competence hypothesis precisely imposes
phase-level synchronization at the level of the module. Steedman requires that the syntactic module
make available to the interpretive module analyses of only complete constituents. That is, the
interpreter may not 'see' the results of combinations of incomplete syntactic constituents. They
argue that this phase-level synchronization is not necessary and could in fact make the design of
the processor more complicated, as is the case with the design of phase-synchronous digital
electronic devices.

To illustrate the viability of asynchronous computation, they propose a grammatical formalism which
pairs partial parse-trees with partial LF-like representations (May). The apparatus uses the same
structure-combining operation for both the construction of grammatical constituents (including the
residue of WH-movement) and the construction of partial constituents such as:

    [S [NP the flowers] [VP sent (unspecified)]]

Such a tree is paired by the formalism with an underspecified logical-form representation similar
to:

    unspecified-op(... unspecified-op(send(⟨the flowers⟩, unspecified-object)))

This representation anticipates zero or more sentence-level operators which may appear
syntactically adjoined to the VP node but move to S at the logical form level. The subject of the
sending is specified, but the object is not. An interpreter may look at this structure and
opportunistically draw whatever conclusions it can from the parts that are specified.

The formalism that Shieber and Johnson use is that of Synchronous Tree Adjoining Grammars (Shieber
and Schabes). I now sketch the idea briefly. The reader is referred to the original papers for
details.

Lexicalized Tree Adjoining Grammar (Joshi; Joshi and Schabes) is a grammatical formalism where a
grammar associates a finite set of trees with each lexical item, and trees can combine using one of
two operations: substitution and adjunction. One tree is substituted into another by simply
replacing one nonterminal symbol at the frontier of the host tree with the substituted tree.
Adjunction is slightly more complex: a tree is adjoined into a host tree by excising a subtree of
the host that is rooted at some nonterminal X, substituting that subtree into the adjoined tree at
some occurrence of X, and then substituting the resulting tree into the original X site in the
host. Synchronous TAG (no relation to synchronous computation) is a grammatical formalism for
transduction: the idea is that two TAGs are synchronized, or coupled, such that operations in one
member are reflected in the other. Given two TAGs, a synchronous TAG can be defined as a set of
ordered pairs of trees from the respective grammars. Within an ordered pair, nodes in one tree can
be paired with corresponding nodes in the other. Whenever an operation (substitution or adjunction)
happens to one node in a tree, a corresponding operation must happen to the node it is linked to.

Evaluation

Steedman claimed that a processor for a right-branching grammar needs to have a grammar-external
operation for partial combination. Given that the interpreter must be able to take advantage of
applications of this operation, it must follow that it, too, is able to see the operation. Shieber
and Johnson have shown that, using an asynchronous computational paradigm, it is not necessary to
augment the parser or interpreter with any operations beyond those allowed by the competence
grammar.

The ultimate question of whose system is simpler, Steedman's CCG or Shieber and Johnson's
asynchronously computed partial-structure paradigm, can only be resolved when they are both
extended to provide wide coverage of the linguistic phenomena, and their precise implementation
details are given.

One important aspect which Shieber and Johnson do not address in their paper is coordination. CCG
provides a uniform mechanism for incremental interpretation and the constituency necessary for
coordination. For example, CCG assigns the string John loves the grammatical category S/NP. This
category can be coordinated with another of the same type, giving rise to constructions such as
Right Node Raising:

John loves and Bill hates London

Such an analysis of Right Node Raising is not readily available in Shieber and Johnson's mechanism,
which assigns John loves the grammatical category S, which the grammar cannot distinguish from the
category for John loves London. While there are various approaches possible for extending Shieber
and Johnson's account (e.g. by adopting a proposal by Joshi, or elaborating on the work of Sag et
al.), more research is needed to determine whether the elegance and simplicity of their account
would remain once it is extended to cover coordination.

One potential source of difficulty for CCG is the need for interpretation of constituents in which
not all combinators have been rewritten. For example, the interpretation for the main-verb analysis
of the flowers sent is

    B (T (the flowers)) send

One might suppose that the interpreter contains special strategies to cope with such expressions,
but such a move would introduce serious complexity.

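Another possibility is to redefine the notion of combinator. Instead of treating it as a primitive
symbol, one could treat it as standing for a λ-term. Given their definitions above, this is
straightforward:

    $B^i \equiv \lambda x\, y\, y_1 \ldots y_i .\; x\,(y\, y_1 \ldots y_i)$
    $T \equiv \lambda x\, y .\; y\, x$
    $S \equiv \lambda x\, y\, z .\; x\, z\,(y\, z)$

Given this interpretation, the expression for the flowers sent rewrites and reduces to

    $\lambda x .\; \mathit{send}\; x\; (\mathit{the\ flowers})$

which is very similar to Shieber and Johnson's representation.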
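As an illustration only, the λ-term reading of the combinators can be sketched in Python with curried functions. The names `send` and `the_flowers`, the tuple encoding of the predicate, and the restriction of B to a single extra argument are assumptions made for this sketch; they are not part of the original account.

```python
# Combinators as curried lambda terms (assumed encoding, illustration only).
B = lambda x: lambda y: lambda y1: x(y(y1))   # composition, one-argument case
T = lambda x: lambda y: y(x)                  # type raising
S = lambda x: lambda y: lambda z: x(z)(y(z))  # substitution

# Hypothetical curried predicate: send(a)(b) builds the tuple ("send", a, b).
send = lambda a: lambda b: ("send", a, b)
the_flowers = "the_flowers"

# Interpretation of the main-verb analysis of "the flowers sent".
partial = B(T(the_flowers))(send)

# Applying the result to an argument x yields send x the_flowers, so the
# expression behaves like the term  lambda x. send x the_flowers.
print(partial("x"))
```

Because the combinators are ordinary functions here, no separate rewriting strategy is needed: function application in the host language plays the role of reduction.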

I conclude that Shieber and Johnson have made a compelling case against Steedman's claim
that CCG clearly gives rise to a simpler processor than any system based on a right-branching
grammar. This debate, therefore, is far from settled. For now, either approach is viable, so I use
one, CCG, for the remainder of this document.

A note is in order about the choice of a formalism for semantic representation. The obvious
formalism is that of an applicative system, such as the combinators or λ-terms described above.
Such formalisms require the application of zero or more reduction rules after each syntactic
combination. It is possible to eliminate the necessity for reduction rules using precompilation.
The idea, described in Pereira and Shieber, is to replace the simple category symbol with
a Prolog term which, in addition to encoding the usual syntactic features such as number and
gender, encodes the predicate-argument structure as well. See Pareschi and Steedman
and Park for discussions of applications of this approach to CCG. Moore argues
that reduction rules cannot be eliminated altogether: the problem is that unification-based
approximations of the λ-calculus do not treat separate bindings of the same variable as distinct.
A clear illustration of this problem arises in subject coordination, as in

John and Bill walk

If the predicate walk is treated as in

X^walk(X)

and the coordinate subject John and Bill is given a generalized-quantifier treatment (i.e. type
raised), as in

SPjohnPbillP

[Footnote: Pereira and Shieber introduce the infix operator ^ as a notation to encode λ-terms.
So a term such as $\lambda x . \lambda y .\, f(x, y)$ would be encoded in Prolog as X^Y^f(X,Y).]

then the two copies of the predicate bound by the variable P will fail to serve as independent
binders of X. The unification step necessary to give the result

johnwalkjohn billwalkbill

will be blocked. Park considered a collection of coordinate structures and argued that
for each it is possible to construct a coordination rule (in his case, a separate CCG lexical entry
for the word and) which provides the correct logical form using only unification. But in some
cases Park's resulting logical form is not the intuitively obvious simplest one, but rather a
more complicated form which is truth-conditionally equivalent. For example, the logical form
assigned to (a) is (b), not (c).

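a. A farmer and every senator talk.
b. $\exists x\,(\mathit{Farmer}(x) \wedge \forall y\,(y = x \rightarrow \mathit{Talk}(y))) \wedge \forall z\,(\mathit{Senator}(z) \rightarrow \forall y\,(y = z \rightarrow \mathit{Talk}(y)))$
c. $\exists x\,(\mathit{Farmer}(x) \wedge \mathit{Talk}(x)) \wedge \forall z\,(\mathit{Senator}(z) \rightarrow \mathit{Talk}(z))$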

Park suggests that a post-semantic process within the interpreter could massage forms like
(b) into forms like (c). But these manipulations of logical form, while they do preserve
the entailments of the meaning, might be too heavy-handed for other, more pragmatic aspects of
meaning. I conclude from Park's results that Moore's observations are quite accurate: attempting
to simulate λ-reduction using term unification results in rather contorted and unnatural
semantic representations. Phenomena such as coordination do indeed require interleaving
applications of the combinatory rules of grammar with applications of semantic reduction rules.

The Davidsonian approach to semantic representation is somewhat similar to a unification-based
approach to semantics in CCG. It too is incapable of an elegant treatment of many coordinate
structures, e.g. (a). But it is quite straightforward to extend it to allow one. The idea is
to move to a representation which separately enumerates each argument to a predicate. Thus
(a) would have (c) as its representation instead of (b).

a. Most students prefer denim.
b. most(X) student(X) tns(E, present) prefer(E, X, Y) denim(Y)
c. most(X) student(X) tns(E, present) subj(E, X) obj(E, Y) denim(Y)

A coordinate structure such as (a) would have the semantic analysis in (b).

a. Most students and some professors prefer denim.
b. most(U) student(U) some(V) professor(V) and(U, V, X) tns(E, present)
   subj(E, X) obj(E, Y) denim(Y)

In the current implementation I use the traditional Davidsonian approach, mostly for ease of
readability. If coordination were to become important to the work, it would be straightforward
to map the system to the newer representation.

Identifying Ungrammaticality

The minimum conceivable sentence processor contains a representation of the grammar, a
nondeterministic algorithm for applying the rules of the grammar, and an ambiguity resolution
mechanism which, it is claimed here, is based solely on the sensibleness of the available analyses.
Consider how such a processor would cope with the following unremarkable sentence:

The insults the new students shouted at the teacher were appalling

The word insults has two categories: a plural noun and a finite transitive verb. The noun
analysis can combine with the determiner to its left. The verb analysis cannot. Should the
processor abandon the verb reading? As external observers, we can examine the grammar of
English and conclude that the verb analysis is doomed: we know that salvation cannot arrive
later in the string in the form of a category such as (s\(np/n))\((s\np)/np). But the processor
cannot know this, since it does not contain precompiled knowledge about the grammar. The
processor cannot automatically prefer a combined analysis to an uncombined analysis, as this
would constitute a structural preference strategy, which would have wide-ranging and bizarre
predictions. For example, in the sentence below, when the word Chris is encountered, the
processor would prefer to coordinate the two NPs Sandy and Chris, because that would yield a
single constituent for the whole string.

Kim likes Sandy and Chris likes Dana

The only available recourse, given the minimal processor, is an account by which the interpreter
is able to discard the verb analysis of insults. But this is rather unlikely. There is no a priori
reason to expect that an uncombined determiner should present a problem for an interpreter,
thus imposing a penalty on the verb analysis. In fact, there are languages where determiners
(e.g. deictics, quantifiers) are routinely kept uncombined for many words until their head nouns
are processed. For example, in Korean the structure of a noun phrase is

Determiner  Relative-Clause  Noun

The noun analysis of insults, on the other hand, does incur penalties when the subsequent
words arrive and require a restrictive relative-clause analysis, which entails a complex process:
accommodating the resulting noun phrase out of context. These same words pose no problems
for the verb analysis: the verb phrase insults the new students is constructed. Given the
preference for avoiding complex accommodation processes, one would expect the interpreter to
discard the noun analysis, leading to a garden path effect at the disambiguating word shouted.
This is clearly wrong. The sentence causes no conscious processing difficulty.

Whatever solution is provided for eliminating inappropriate analyses (e.g. the verb analysis
for insults above), it must operate rather quickly and ruthlessly; otherwise the number of
surviving ungrammatical analyses becomes unmanageable.



[Footnote: Another problem is that preferring a combined to an uncombined analysis incorrectly
predicts difficulties with the string Which house did John paint a picture of? This is discussed
below.]

The problem here is by no means a new one. The minimal design is a classic bottom-up
parser which, online, can determine which analyses are possible but not which are impossible.
Ungrammaticality information is only available at the end of the string, when no analysis contains
one category which spans the entire input. Many parsing techniques have been proposed that
address this problem. LR tables are precompiled guides to a parse stack
which identify viable sequences of stack elements and implicitly encode how these elements will
be ultimately combined. The Earley parser (Earley) constructs this sort of information
online, using annotations on the rules of the grammar to encode which constituents have been
seen and which are expected. Marcus's parser (Marcus) contains rules
which explicitly diagnose which available syntactic analysis should be followed.

To address this problem, I propose to augment the syntactic parsing module and interpretation
module with a third module: an unviable-state filter. There are three issues pertaining to the
design of this filter:

1. Should it operate as a categorical filter, ruling out most or all ungrammatical analyses,
   or should it have graded judgement, rating certain analyses better than others?

2. Should it be conceived of as innate, of biological standing equal to the other two modules,
   or should it be conceived of as a skill which an experienced language user acquires for
   discriminating grammatically viable analyses from ones that are doomed?

3. Should this module be placed before the syntactic processor, mediating lexical access by
   performing a first-cut disambiguation process over the available grammatical categories, or
   should this module operate on the output of the syntactic processor, discarding unviable
   category buffer configurations?

Implementing the filter as a rating among available analyses can be thought of as a way of
importing structurally or lexically based ambiguity resolution preferences. For example, suppose
the processor rates a complementizer analysis of the word that, when it follows a noun, higher
than it rates the relativizer analysis. The expectation then is indistinguishable from that of
Minimal Attachment in examples like

The psychologist told the wife that he was having trouble with

Similarly for ambiguous words like raced in

The horse raced past the barn

As has been argued in the preceding chapters, there is little evidence for structurally based
preferences, so a categorical filter, one that evaluates each analysis on its own without regard to
its competitors, is preferable.

As for the second choice, it is clear that an innate account of this filter is evolutionarily
unparsimonious: if this element is necessary for language communication, then a grammar could not
have evolved without it, nor could the filter have evolved without the grammar. An empiricist
account of the filter as a skill is rather plausible. When a child begins acquiring language, the
filter is totally permissive, allowing all analyses, even those of a determiner followed by a verb.
At this stage, the proliferating candidate analyses usually quickly overwhelm the processor's
ability to keep track of them. Consequently, only short utterances are properly understood. The
child observes that a buffer such as [the]DET [insults]VERB never gives rise to valid utterances
and learns to filter it out. Gradually this filter is refined to its observed sophistication in adult
listeners.

The third choice concerns the placement of the filter in the rest of the system. Placing the filter,
as proposed by Steedman, between the lexicon and the syntactic processor allows one
to exploit much recent work in automatic part-of-speech labeling of words. Church has
shown that it is very easy to train a part-of-speech tagger on a tagged corpus to achieve high
accuracy on unseen text. There has been much recent work on improving the accuracy
of such taggers and/or reducing the volume of training materials necessary (see Brill and
references therein). Such taggers are sensitive to only a small portion of the syntactic context in
which a word appears, usually a window of a few words to either side of it. In many taggers
it is possible to adjust a parameter called the precision-recall tradeoff. When precision is high,
the tagger is likely to find few incorrect categories for a word. When recall is high, the tagger
is likely to miss few correct categories, but it may increase the number of incorrect parts of
speech it guesses for each word. It is quite plausible that an excellent-recall, moderate-precision
part-of-speech tagger mediates lexical access. I am aware of no comparable existing work for
automatically training a filter which discriminates viable bottom-up buffer states. But given
the fact that an unviable buffer state never results in a grammatical analysis for a string (or
at least an analysis which does not require correction on the part of the hearer), whereas every
viable buffer state does eventually give rise to a grammatical sentence, and given the fact that
the space of viable buffer states is quite small and regularly structured, it is plausible that such
a filter can be trained by observation of successful and unsuccessful buffer states.

Either placement of the filter is therefore viable. In the next two sections I consider two further
problems for the filterless minimal architecture. These problems are resolved using a filter placed
between the syntactic processor and the interpreter.

Here is a sketch of an algorithm which could be used to carry out the acquisition of this skill
of identifying viable buffers. After each word, for each parser state, record the sequence of
categories in that state's buffer. At the end of a grammatical string, for each state in the correct
analysis, go back and add a '+' mark to that state's buffer. For each state in each analysis
which did not turn out to be the correct one, add a '-' mark. After some training, the resulting
collection of marks can be used to implement the viable buffer criterion as follows:

1. If a particular buffer configuration contains at least one '+' mark, then it is viable.

2. If the buffer configuration contains no '+' marks and more '-' marks than some threshold,
   then it is unviable.

3. If the buffer configuration contains no '+' marks and fewer '-' marks than the threshold,
   then not enough information is available. In the absence of definitive information, accept
   the buffer, thus trading efficiency for completeness.
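The marking scheme sketched above can be illustrated with a small Python class. This is a minimal sketch under stated assumptions, not the implementation used in this work: the class name `ViableBufferFilter`, its method names, the representation of a buffer as a tuple of category strings, and the default threshold value are all hypothetical. The case where the number of negative marks exactly equals the threshold is not specified in the text; the sketch treats it as "not enough information".

```python
from collections import Counter


class ViableBufferFilter:
    """Toy version of the viable-buffer criterion (all names are assumptions)."""

    def __init__(self, threshold=3):
        self.plus = Counter()    # buffer -> number of positive marks
        self.minus = Counter()   # buffer -> number of negative marks
        self.threshold = threshold

    def train(self, correct_buffers, incorrect_buffers):
        """Record marks after one grammatical string has been fully parsed."""
        for buf in correct_buffers:      # states on the correct analysis
            self.plus[tuple(buf)] += 1
        for buf in incorrect_buffers:    # states on analyses that failed
            self.minus[tuple(buf)] += 1

    def verdict(self, buf):
        key = tuple(buf)
        if self.plus[key] >= 1:
            return "viable"
        if self.minus[key] > self.threshold:
            return "unviable"
        return "unknown"                 # too little evidence either way

    def accept(self, buf):
        """Accept unless definitely unviable: completeness over efficiency."""
        return self.verdict(buf) != "unviable"


f = ViableBufferFilter(threshold=2)
for _ in range(3):
    f.train(correct_buffers=[("np/n", "n")],
            incorrect_buffers=[("np/n", "(s\\np)/np")])
print(f.verdict(("np/n", "n")))             # seen on a correct analysis
print(f.verdict(("np/n", "(s\\np)/np")))    # only ever seen on failed analyses
print(f.verdict(("np", "pp")))              # never seen
```

Each buffer is judged on its own counts only, so the verdict for one buffer never depends on which competing analyses happen to be available.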

Note that this algorithm considers each buffer state individually: the viability of a buffer is
independent of other competing analyses. As given above, the algorithm is inefficient, maybe
even impractical, in that it requires potentially unboundedly long buffer configurations to be stored
and retrieved. But as will be seen in the next three sections, the parser in practice will construct
very short buffers, rarely exceeding three constituents. Furthermore, it may well turn out, I
suspect, that it is sufficient to consider only the rightmost two or three constituents in a long
buffer for the purposes of buffer viability. Finally, there is the issue of how many distinct
categories must be kept track of. Considering current CCG analyses of English, the collection of
relevant categories is likely to turn out to be quite small. For Dutch, which allows verb clusters
to form constituents (see Steedman), some additional bit of cleverness may be necessary
to make the theoretically infinite space of categories manageable. Empirical investigation of
particular induction strategies for the viable buffer criterion awaits a broad-coverage CCG for
English.

Shift-Reduce Conflicts

A bottom-up parser for CCG encounters three kinds of nondeterminism:

categorial ambiguity: A word may have more than one part of speech (e.g. rose is either n
or s\np), or, even for the same part of speech, a word may have more than one combinatory
potential (e.g. raced is either s\np or (n\n)/pp). In LR parsing parlance, this can be
thought of as a shift-shift conflict.

how to combine constituents: Constituents in the buffer may combine in more than one way.
One example is PP attachment (Chris tickled the dog with the feather). This is a
reduce-reduce conflict.

whether to combine constituents: Consider the string

    Which house did you paint a picture of?

After the word paint, the relevant buffer state is

    Which house        did you paint
    q/(s_inv/np)       s_inv/np

Combining the two constituents is a valid move. It yields an analysis wherein it was a
house that was painted, not something else. The combined analysis cannot be continued
grammatically by another NP (a picture). This is obviously not the appropriate move
here. The two constituents must remain uncombined until the end of the string, as in

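    Which house      did you paint    a       picture    of
    q/(s_inv/np)     s_inv/np         np/n    n/pp       pp/np

    did you paint a                      =>  s_inv/n
    did you paint a picture              =>  s_inv/pp
    did you paint a picture of           =>  s_inv/np
    Which house did you paint a picture of   =>  q

This is a shift-reduce conflict.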

The first two ambiguities were the topic of the preceding chapters. Shift-reduce ambiguity is a
subject of serious concern, since it potentially applies to every combination.

One may opt to treat this latter ambiguity as any other: pursue both analyses in parallel and let
the interpreter work it out. While this sort of solution might work for ordinary phrase structure
grammars, it is impractical for CCG because of CCG's associativity of derivation. Recall that
CCG's rule of functional composition can give rise to multiple equivalent analyses. This
derivational ambiguity proliferates very quickly. For example, the following string has many
truth-conditionally equivalent CCG analyses:

    John       was            thinking   that   Bill       had            left
    s/(s\np)   (s\np)/(s\np)  (s\np)/s   s/s    s/(s\np)   (s\np)/(s\np)  s\np

In general, for sequences of functional compositions, the degree of this ambiguity grows as
the Catalan series, that is, roughly exponentially:

    $\mathrm{Catalan}(1) = 1$

    $\mathrm{Catalan}(n) = \sum_{i=1}^{n-1} \mathrm{Catalan}(i)\,\mathrm{Catalan}(n-i)$
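To give a feel for how quickly this series grows, the recurrence can be evaluated directly. The sketch below assumes the base case Catalan(1) = 1 and a sum over 1 ≤ i ≤ n-1, and reads n as the number of adjacent constituents that can all be combined by composition; the function name `catalan` is chosen for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def catalan(n):
    """Catalan series with Catalan(1) = 1 (indexing assumed for this sketch)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return 1
    # Split a sequence of n constituents into a left part of i constituents
    # and a right part of n - i constituents, and bracket each independently.
    return sum(catalan(i) * catalan(n - i) for i in range(1, n))


for n in range(1, 11):
    print(n, catalan(n))
```

Under this indexing, a seven-constituent sequence already has 132 bracketings, and each additional constituent multiplies the count by a factor approaching four.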

Below, I describe a way for the parser to cope with this proliferation of equivalent analyses
by keeping track of one representative from each truth-conditional equivalence class of
analyses. The processor can therefore pursue only the maximally left-branching analysis, ignoring
the possibility that two constituents may remain uncombined. But the local ambiguity in the
picture-noun example above affects truth conditions: it is either a house that was painted or a
picture. So that example requires special treatment. The processor must know to distinguish the
uncombined analysis and pursue it in this case. One may argue that the question of whether to
leave which house and did you paint uncombined can be easily resolved by waiting for the very
next word for disambiguating information. But this is not always possible: sometimes syntactic
disambiguating information is delayed for many words, as in



[Footnote: This has been called spurious ambiguity (Wittenburg), although it has been pointed
out that this ambiguity of CCG's is necessary on linguistic grounds.]

a. Here is the cathedral that John drew and Bill bought three beautiful
   charcoal sketches of.
b. Which of his daughters was Percival planning to donate to the university
   an extravagant portrait of?

In these examples, it is clear that interpretation determines how the local ambiguity is resolved.
The parser, therefore, must present the interpreter with both analyses. On what basis, then,
can the parser know to make the interpreter aware of the uncombined analysis in this case, but
not to bother the interpreter with the many other truth-conditionally irrelevant uncombined
analyses? The viable-buffer filter discussed above offers a solution. Placing this filter
between the parser and the interpreter allows the possibility of distinguishing relevant
non-reductions, which in the case of picture nouns are identifiable by a sequence of
categories of the form X/(s/np) s/np. Placing the filter between the lexicon and the parser
does not immediately propose such a solution.

Note that allowing the WH-filler and the gap-containing constituent not to combine precisely
implements the idea, argued for earlier, that the locus of filled-gap effects is in the
interpreter, not the parser.

Heavy Shift and Incremental Interpretation

Another challenge to the filterless architecture arises from the interaction of heavy NP shift
and referential processes. The Strict Competence Hypothesis, taken together with
the usual assumption of Compositionality (that combinations in the syntactic domain are
mapped to combinations in the semantic domain), predicts that the interpreter may not
become aware of the combination of semantic constituents before the parser performs the
corresponding syntactic combination. Steedman uses this reasoning to argue against a grammar
which requires a VP node. But does CCG provide sufficiently incremental analyses so as to
overcome every instance of this problem? The place to look for an answer is where CCG does
not provide a word-by-word left-branching analysis. One such place is around the canonical
position of heavy-shifted arguments.

(a) exemplifies heavy NP shift. Once one of a verb's arguments is heavy-shifted, it is
ungrammatical to move its other arguments, as shown by the ungrammaticality of WH-movement
in (b) and right-node-raising in (c). Note that multiple right-node-raising is not impossible
in general, as (d) shows (the latter is from Abbott).

a. The bird found in its nest a nice juicy worm.
b. *What did the bird find in a nice juicy worm?
c. *The bird found in, and its mate found near, the nest some nice juicy worms.
d. I promised, but you actually gave, a pink Cadillac to Billy Schwartz.

In order to rule out (b) and (c), CCG must delay the combination of found and in until the
entire PP in its nest is constructed. If we can show that before the PP is fully processed, the
processor is nevertheless aware of the combination of found and in, then we have shown that
even CCG fails to provide sufficiently incremental analyses.

Evidence for the detection of the heavy NP shift before the PP is fully processed is provided by
the lack of garden path in (a) as compared to (b).

a The bird found in its nest died

b The horse raced past the barn fell

In (a), when the processor encounters found, it has two analyses: reduced relative clause
or main verb. Out of context, there is no mutually established background, so the reduced
relative analysis requires the accommodation of a fairly complex set of presuppositions, as
discussed earlier. The main verb analysis has no problems so far. But the next word,
a preposition, does present a problem for the main verb analysis: the verb find is obligatorily
transitive, so heavy NP shift must be assumed. The construction of heavy NP shift is
felicitous when the material which mediates the verb and the shifted argument is backgrounded
(given) information. Out of context, heavy shift is therefore not felicitous, so the main verb
reading also carries a penalty. Faced with two imperfect analyses, the processor has no basis
for preferring one over the other, so it keeps them both, leading to the acceptability of either
continuation (The bird found in its nest a nice juicy worm and The bird found in its nest died).
When the ambiguous verb is potentially intransitive (e.g. raced in (b)), encountering a
preposition does not present any difficulties for the main verb
analysis, so the processor decides to discard the reduced relative clause analysis in favor of the
main verb analysis, leading to the garden path in (b).

Crucially, the processor is able to detect the inevitability of heavy NP shift for the main verb
analysis of (a) before the PP is fully processed. Were the processor to wait until the end of
the PP to resolve the ambiguity, it would surely be able to avoid the garden path in (b).
Placing a viable-buffer filter between the parser and the interpreter can provide the necessary
mechanism for identifying the unavoidable heavy NP shift in (a). In the same way that
such a device would learn the inevitable failure of certain buffer configurations, it could also
learn the inevitability of the heavy shift construction which the parser will find. The current
implementation of this mechanism is presented elsewhere in this document.



[Footnote: One possible attempt to salvage the minimal account is to argue that the processor
does not actually determine that heavy shift is unavoidable in the main-verb analysis of the bird
found in, but rather that the processor merely notices that there are two constituents which it
cannot yet combine. In such cases the processor proceeds cautiously, not discarding competing
analyses, i.e. the reduced relative clause analysis.

To counter this argument, one could make the following observation. While both analyses of found
are maintained when the sentence is presented out of context, there are contexts which can cause
the processor to make a commitment before the end of the PP. The relevant case here is a context
which makes heavy shift felicitous, as in the question (i).

(i) What did the bird find in the nest?

The response in (ii) is a garden path.

(ii) The bird found in the nest died.

A theory in which the processor does not discard infelicitous analyses (e.g. an unnecessary
restrictive relative clause) would fail to predict the garden path in (ii).]

Coping with Equivalent Derivations

In this section I address the problem of proliferating equivalent analyses stemming from CCG's
associativity of combination, as introduced above. I first examine existing proposals for
coping with this sort of ambiguity. Combining ingredients from two of the proposals, Pareschi
and Steedman's idea of lazy parsing with Hepple's normal form construction,
I then introduce a new parsing system which addresses shortcomings in its predecessors.

Evaluation Criteria for a Parser

In light of the discussion earlier in this chapter, any algorithm which is to serve as an adequate
parser must satisfy the following desiderata:

soundness: All parser outputs must be consistent with the grammar and the input string.

completeness: Given a grammar and a string, every grammatical analysis for the string should
be constructible by the parser. That is, the parser is free of structural ambiguity resolution
tendencies.

incrementality: Given an initial segment of a sentence, the parser must be able to identify
all the semantic relations which necessarily hold among all of the constituents seen thus
far. For example, having encountered a subject NP followed by a transitive main verb,
the parser must identify (or merely narrow down, depending on one's theory of thematic
relations) the semantic role which the subject NP plays in the main sentence.

feasibility: The computational resources needed to run the algorithm must plausibly be
provided by the human brain. Given our current understanding of the brain, this criterion
is unavoidably fuzzy. Clearly, algorithms which are exponential in the length of the string
are infeasible, but should we brand infeasible any algorithm which does not bound the
processing time of each word to a constant? The answer is less clear: issues of
implementation of parallelism and the brevity of most utterances complicate matters. In the case
of parsing CCG, the associativity of derivations must not impact the parser's performance
adversely.

transparency: The parser uses the competence grammar directly, not a specially transformed
or compiled form.

Previous Attempts

There has been a variety of proposals for parsing CCG. Wittenburg, and Wall and Wittenburg,
propose that the grammar be compiled into a different one in which each semantically
distinct parse has a unique derivation, or in some cases a few, but far fewer than the Catalan
series. Their proposal addresses only some of the combinatory rules. It does not seem to
generalize obviously to higher-order combinations, especially so-called mixed composition, in
which the slashes are not all of the same direction. This compilation process comes at the cost
of substantially changing the constituency structure defined by the linguist's original source
grammar, hence compromising transparency. Furthermore, the complexity of the operations
required to perform this compilation renders such a scheme a rather unlikely account of humans'
representation of grammar.

Following up on the work of Lambek, who proposed that the process of deriving the
grammaticality of a string of categories be viewed as a proof, there have been quite a few
proposals put forth for computing only normal forms of derivations or proofs (Moortgat;
König; Hepple and Morrill; Hepple). The basic idea with all of these works is
to define normal forms, distinguished members of each equivalence class of derivations, and
to require the parser to search this smaller space of possible derivations. These proposals enjoy
the advantage of transparency. Unfortunately, most of them cannot result in parsing systems
which proceed incrementally through the string. This results either from an intrinsically
non-string-based Gentzen-like proof system (Moortgat; König) or from a right-branching
normal form (Hepple and Morrill). A possible exception to this criticism is the work of
Hepple. Hepple considers Meta-Categorial Grammars, a close relative of CCG proposed
by Morrill. Hepple's normal form derivations are as left-branching as the grammar
allows, just the sort of incrementality necessary for our parser. But Hepple does not provide
a computational implementation for the elegant normal form construction which he presents.
Unfortunately, Hepple's claims that his system can be parsed sufficiently incrementally are not
tenable. The problem is with the timing: moving left-to-right through the input, the parser
cannot know what is ahead before it must commit to a normal form parse for the input so far.
For example, in

    John    loves    Mary
    s/vp    vp/np    np

Of the two possible derivations, the left-branching one (the one which treats John loves as a
constituent of type s/np) is the normal form. However, in

    John    loves    Mary    madly
    s/vp    vp/np    np      vp\vp

there is only one derivation. This derivation treats loves Mary as a constituent of type vp.
So what is a parser to do after having encountered John loves Mary? It is not allowed to
construct the non-normal-form derivation. If it commits to the normal form derivation for these
three words, then it would be stuck if an adverb were to come next. It is also not allowed
to simply wait and not decide, because that would violate incrementality. Stated differently,



For example

anbd sae bc cde

bde

crossing

ae

s

the problem is the inability to extend a left-branching normal form by adding the next word.
When the next word in the input is encountered, the processor computes distinct normal-form
representatives for each distinct analysis; it has no general way of excluding those analyses which
are extensions of analyses which have already been discarded.

Karttunen proposes a very simple solution to the problem of associativity of derivation.
He uses a bottom-up chart parser and simply avoids adding duplicate arcs into the chart. Since
he uses a unification-based system, he checks for subsumption rather than simple equality
or unifiability of terms. It follows that for a string of n applications of the rule of forward
composition, O(n^2) arcs are added to the chart, instead of O(Catalan(n)). Karttunen's
parser is clearly sound, complete, and transparent. But it doesn't construct derivations or
analyses. Instead, it constructs arcs. The difference may appear insignificant: at the end of
the parse, those arcs that span the whole string are exactly the analyses. The difficulty arises
in the interaction with the interpreter. The interpreter cannot simply check each arc against
every other arc: a constituent must be evaluated in the context of its preceding constituents
in the analysis. The syntactic context-independence assumption, which dynamic programming
algorithms such as Karttunen's chart parser rely upon, is not compatible with the context
necessary for interpretation. The process of computing all valid constituent sequences which
span the input so far is quite complex, especially if one wishes to consider only maximally long
constituents and not their subconstituents (i.e. avoid truly spurious ambiguity). The cost of
integrating this chart parser with the rest of the current system thus renders it infeasible.
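The effect of avoiding duplicate arcs can be illustrated with a toy chart parser. This is only a sketch under simplifying assumptions, not Karttunen's parser: categories are plain pairs `(X, Y)` standing for X/Y, the only rule is forward composition, duplicates are detected by simple equality rather than subsumption, and the names `compose` and `chart_parse` are invented for the example. The chart stores one arc per distinct category and span, while a separate table counts how many derivations each arc stands for.

```python
from collections import defaultdict


def compose(left, right):
    """Forward composition X/Y  Y/Z  =>  X/Z on (result, argument) pairs."""
    x, y = left
    y2, z = right
    return (x, z) if y == y2 else None


def chart_parse(cats):
    n = len(cats)
    arcs = defaultdict(set)     # (start, end) -> categories; a set drops duplicates
    derivs = defaultdict(int)   # (start, end, category) -> number of derivations
    for k, cat in enumerate(cats):
        arcs[(k, k + 1)].add(cat)
        derivs[(k, k + 1, cat)] = 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for m in range(i + 1, j):            # every split point of the span
                for left in arcs[(i, m)]:
                    for right in arcs[(m, j)]:
                        cat = compose(left, right)
                        if cat is not None:
                            arcs[(i, j)].add(cat)
                            derivs[(i, j, cat)] += (derivs[(i, m, left)]
                                                    * derivs[(m, j, right)])
    return arcs, derivs


# Seven words with categories a0/a1, a1/a2, ..., a6/a7: all of them compose.
cats = [(k, k + 1) for k in range(7)]
arcs, derivs = chart_parse(cats)
print("arcs in chart:", sum(len(c) for c in arcs.values()))
print("derivations of the spanning arc:", derivs[(0, 7, (0, 7))])
```

The chart stays quadratic in size because each span holds a single arc, yet that arc conflates every bracketing of the span, which is exactly why an interpreter that needs the preceding constituents of each analysis cannot read them off the chart directly.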

Pareschi and Steedman have made a third sort of proposal: construct only maximally
left-branching derivations, but allow a limited form of backtracking when a locally non-maximally
left-branching analysis turns out to have been necessary. For example, when parsing John loves
Mary madly, Pareschi and Steedman's algorithm constructs the left-branching analysis for John
loves Mary. When it encounters madly, it applies the rule of forward application in reverse to
solve for the hidden constituent loves Mary by subtracting the s/vp category (John) from the s
category (John loves Mary).

    John    loves    Mary    madly
    s/vp    vp/np    np      vp\vp

    John loves                     =>  s/np
    John loves Mary                =>  s
    reveal: loves Mary             =>  vp
    loves Mary madly               =>  vp
    John loves Mary madly          =>  s

The idea with this revealing operation is to exploit the fact that the combinatory rules, when
viewed as three-place relations, are functional in all three arguments. That is to say, knowing
any two of {left constituent, right constituent, result} uniquely determines the third. There are
some problems with Pareschi and Steedman's proposal.

The first class of problems is the incompleteness of the parsing algorithm which they give, a chart parser (Hepple). The essence of these problems is that in a chart parser, common subpieces are shared across different analyses. In Pareschi and Steedman's lazy chart parser, the presence in one analysis of a certain arc can lead to the omission in another analysis of a crucial arc. Pareschi and Steedman use a scheme, right-generator marking, wherein if an arc has been combined with another arc to its left, then it is prevented from combining with any arcs on its right. In the John loves Mary madly derivation above, the vp/np arc for loves is such an arc, and is therefore prevented from combining with the np arc Mary to yield a vp arc. In the presence of ambiguity, this could lead to incompleteness. For example, in the following sentence (from Hepple), one category of the word that composes with he. This renders he unable to combine with liked. It follows that the parser cannot find an analysis for the whole string, which is of course grammatical.

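    he told the woman that he liked that it was late

[Derivation: after one category of that composes with he (s/vp), the arc for he can no longer combine with liked (vp/np), and the parse is stuck.]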

This problem of separate analyses contaminating one another through shared chart cells can be eliminated if one replaces the chart-parsing framework with one that does not factor subresults, as I do below.

A second class of problems with Pareschi and Steedman's revealing computation is the unsoundness which results from the assumption that the combinatory rules are invertible. In the derivation below, the category a/b is subtracted from a to reveal the category b as the result of combining b/c and a\(a/c). This is an unsound inference, regardless of the control algorithm in which it is embedded. The consequence is that the parser finds an analysis for a string which is not licensed by the grammar.

    a/b     b/c     a\(a/c)     b\b
    ------------->B
         a/c
    ------------------------<
               a
             reveal
             b
             --------------------------<
                        b
    ------------------------------------>
                    a

This form of unsoundness is not a problem if the grammar happens to be such that whenever a constituent has a type-raised category of the form X\(X/Z) for some categories X and Z, then it also has the category Y\(Y/Z) for any other category Y. While the class of such grammars may be of potential interest (e.g. it would include any reasonable CCG for English), additional arguments on language-universal grounds would be necessary before one accepts this theoretical unsoundness as having no practical import.
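The unsound inference discussed above can be made concrete with a small sketch. The Python code below is only an illustration and is not Pareschi and Steedman's algorithm or the program described in this thesis: the tuple encoding of categories and the function names (`forward_apply`, `backward_apply`, `forward_compose`, `reveal_right`) are assumptions introduced here, and plain equality stands in for unification.

```python
from typing import Optional, Tuple, Union

# A category is an atom such as "a", or a triple (result, slash, argument).
# This encoding is a hypothetical one chosen for the illustration.
Cat = Union[str, Tuple["Cat", str, "Cat"]]


def forward_apply(left: Cat, right: Cat) -> Optional[Cat]:
    """X/Y  Y  =>  X"""
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    return None


def backward_apply(left: Cat, right: Cat) -> Optional[Cat]:
    """Y  X\\Y  =>  X"""
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None


def forward_compose(left: Cat, right: Cat) -> Optional[Cat]:
    """X/Y  Y/Z  =>  X/Z"""
    if (isinstance(left, tuple) and left[1] == "/"
            and isinstance(right, tuple) and right[1] == "/"
            and left[2] == right[0]):
        return (left[0], "/", right[2])
    return None


def reveal_right(left: Cat, result: Cat) -> Optional[Cat]:
    """Subtract the left category X/Y from the result X to 'reveal' Y."""
    if isinstance(left, tuple) and left[1] == "/" and left[0] == result:
        return left[2]
    return None


a_b = ("a", "/", "b")
b_c = ("b", "/", "c")
raised = ("a", "\\", ("a", "/", "c"))      # a\(a/c)

a_c = forward_compose(a_b, b_c)            # a/b  b/c      => a/c
a = backward_apply(a_c, raised)            # a/c  a\(a/c)  => a
revealed = reveal_right(a_b, a)            # subtracting a/b from a gives b

# The grammar itself does not combine b/c with a\(a/c) at all:
licensed = forward_apply(b_c, raised) or backward_apply(b_c, raised)
```

Here `revealed` is the category b, yet `licensed` is `None`: the subtraction posits a b constituent spanning b/c and a\(a/c) which no rule of the grammar could have built.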

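Hepple provides another illustration of the unsoundness of the revealing procedure. For heavy NP shift, CCG allows a rule of backward crossing composition, as in

    loves    madly    the crazy Scottish poet
    vp/np    vp\vp    np
    ---------------<Bx
        vp/np
    ------------------------------------------>
                       vp

Also, the grammar allows coordination of non-traditional constituents, as in

    loves    Mary          madly    and     Susan         passionately
    vp/np    vp\(vp/np)    vp\vp    conj    vp\(vp/np)    vp\vp
             ---------------------<B        --------------------------<B
                  vp\(vp/np)                       vp\(vp/np)
             ----------------------------------------------------------coord
                                  vp\(vp/np)
    -------------------------------------------------------------------<
                                  vp

Using revealing, the processor constructs the following parse for the same string:

    loves    Mary    madly    and     Susan         passionately
    vp/np    np      vp\vp    conj    vp\(vp/np)    vp\vp
    ------------->                    --------------------------<B
         vp                                  vp\(vp/np)
    ---------------------<
              vp
            reveal
             vp\(vp/np)
             -----------------------------------------------coord
                             vp\(vp/np)
    ---------------------------------------------------------<
                             vp

The revealing step above cannot help but also reveal a vp\(vp/np) constituent for the string madly the crazy Scottish poet, thus allowing the processor to admit the following string, which is ruled out by the grammar.

    loves    madly    the crazy Scottish poet    and     Susan         passionately
    vp/np    vp\vp    np                         conj    vp\(vp/np)    vp\vp
    ---------------<Bx                                   --------------------------<B
        vp/np                                                   vp\(vp/np)
    ------------------------------------------>
                       vp
                     reveal
             vp\(vp/np)
             ------------------------------------------------------------------coord
                                     vp\(vp/np)
    ----------------------------------------------------------------------------<
                                        vp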

Again, this unsoundness is not a problem if one assumes, as I do in this project, that the semantic analysis of heavy-shifted constructions such as loves madly bears markings which distinguish them from unshifted constructions. The discrepancy in this marking will prevent the coordination rule from treating the two constituents above as like categories. But unless one has strong cross-linguistic evidence that the unsoundness above will never present a problem for any reasonable grammar, it is best to have a parser which works correctly for every grammar in the formalism.

Aside from introducing unsoundness, the revealing procedure is also incomplete. In the derivation below, the category b\c cannot be revealed after it has participated in two combinations of mixed direction (backward and then forward application).

    a/b    c    b\c    (b\c)\(b\c)
           -------<
              b
    -------------->
          a
                        stuck

A Proposal

Pareschi and Steedman's proposal embodies an appealing idea: construct the maximally left-branching analysis, revising this commitment only when it becomes necessary. The chart parser implementation of this lazy parsing idea is clearly unacceptable. Given the difficulty of incrementally computing partial derivations from intermediate chart-parser states, and given the interactive nature of the current system, the traditional advantages of a chart parser (i.e. its reuse of analyzed substrings across divergent analyses) are eclipsed by its disadvantages. Replacing the chart parser with a shift-reduce parser which simulates nondeterminism using explicit parallelism eliminates the problems associated with the system of right-generator marking. But there are still problems with the subtraction-style operation of revealing. The unsoundness and incompleteness illustrated above still remain. One way out of these problems is to reparse the constituent instead of revealing it. Reparsing the substring need not be performed from scratch: if the parser's data structure maintains links from each constituent to its subconstituents, then this derivation history can be reused when constructing the revealed constituent. For example, to reparse the vp saw three birds, the derivation history tells us to use the composition rule to combine saw and three, and then use the application rule to combine saw three with birds.

    Fred    saw      three    birds    yesterday
    s/vp    vp/np    np/n     n        vp\vp
    -------------->B
         s/np
    ---------------------->B
             s/n
    ------------------------------->
                  s
               reparse
            vp
            -----------------------------------<
                            vp
    -------------------------------------------->
                       s

This proposal is sketched in Hepple.

Unfortunately, the presence of categorial applicability conditions on combinatory rules presents the following problem to this recipe-reparsing approach. Suppose one wanted to rule out the following derivation from the competence grammar:

    Dan     bought    a       and      ate      the     potato
    s/vp    vp/np     np/n    coord    vp/np    np/n    n
            --------------->B          --------------->B
                 vp/n                       vp/n
            ----------------------------------------coord
                              vp/n
    ----------------------------------------------->B
                        s/n
    ----------------------------------------------------------->
                               s

One could stipulate the following restriction on the application rule:

    X/Y   Y   =>   X      unless X/Y = vp/n

Granted, this is not the only way of capturing this fact, nor is it a particularly appealing one. But this is a substantive grammatical question and should not be resolved arbitrarily by the parsing algorithm. Could recipe-reparsing handle the following example?

    Dan     ate      the     potato    quickly
    s/vp    vp/np    np/n    n         vp\vp
    -------------->B
         s/np
    ---------------------->B
             s/n
    ------------------------------>
                  s

Recipe-reparsing done the obvious way, i.e. mirroring the derivation, would first combine ate and the to make vp/n, and then attempt to combine that with potato. This derivation is ruled out. But there does exist a derivation for the above string:

    Dan     ate      the     potato    quickly
    s/vp    vp/np    np/n    n         vp\vp
                     --------------->
                           np
            ------------------------>
                       vp
            ----------------------------------<
                            vp
    ------------------------------------------>
                       s

Recipe-reparsing therefore results in incompleteness. Note that if one were to change recipe-reparsing so as to work the other way around, i.e. build the right-branching structure, then it would be possible to construct a counterexample which required the left-branching analysis.

The move from revealing by subtraction to recipe-reparsing is one which trades efficiency for accuracy. Given that recipe-reparsing is not sufficiently accurate, is it necessary to give up Pareschi and Steedman's intuition of lazy parsing altogether and use full-fledged parsing on the substring to be recovered? I now argue that the answer is No.

The way I propose to exploit the information implicit in the derivation history is by rewriting the derivation into another derivation which preserves all the semantic relations encoded in the original derivation, but makes possible syntactic combinations which the original did not. For example, the derivation

    John         loves        Mary
    s/vp : Tj    vp/np : l    np : m
    ------------------------>B
         s/np : B(Tj)l
    ------------------------------------>
              s : B(Tj)lm

can be rewritten, using one step, to the equivalent right-branching derivation

    John         loves        Mary
    s/vp : Tj    vp/np : l    np : m
                 ------------------------>
                        vp : lm
    ------------------------------------->
              s : Tj(lm)

which has the vp constituent necessary for combining with madly, whose category is vp\vp. I use the technique of term rewrite systems and normal forms (Hepple and Morrill; Hepple). Intuitively, semantics-preserving derivation-rewrite rules, such as the one mapping the first derivation above to the second, can be applied repeatedly to correctly compute the right-branching equivalent of any derivation. This computation can be performed quite efficiently, in time proportional to the size of the derivation. In Appendix B, I provide a formal definition of the rewrite operation, an effective procedure for applying this operation to compute right-branching derivations, and a proof of the correctness and efficiency of this procedure.
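As a rough illustration of the rewriting idea, the following Python sketch applies a single rewrite rule: a forward application whose left child was built by forward composition is turned into its right-branching equivalent. It is a simplified stand-in, not the formal operation of Appendix B; the tuple encoding of derivations, the rule labels ">" and ">B", and the function name `right_branch` are assumptions made here, and semantic terms are omitted.

```python
# A derivation is either a leaf ("lex", category, word)
# or a node (rule, category, left_derivation, right_derivation).
# Categories are atoms ("np") or triples (result, "/", argument).

def right_branch(d):
    """Rewrite  (x:A/B  y:B/C >B)  z:C >   into   x:A/B  (y:B/C  z:C >) >,
    repeating at the root and then recursing into both children."""
    if d[0] == "lex":
        return d
    rule, cat, left, right = d
    while rule == ">" and left[0] == ">B":
        _, _, x, y = left
        y_cat = y[1]                          # B/C
        right = (">", y_cat[0], y, right)     # y applies to the old right child
        left = x
    return (rule, cat, right_branch(left), right_branch(right))


S_VP = ("s", "/", "vp")
VP_NP = ("vp", "/", "np")
NP_N = ("np", "/", "n")

john = ("lex", S_VP, "John")
loves = ("lex", VP_NP, "loves")
mary = ("lex", "np", "Mary")

# Left-branching: (John loves) Mary
left_branching = (">", "s", (">B", ("s", "/", "np"), john, loves), mary)
rewritten = right_branch(left_branching)
# rewritten is: John (loves Mary), with an explicit vp node

fred = ("lex", S_VP, "Fred")
saw = ("lex", VP_NP, "saw")
three = ("lex", NP_N, "three")
birds = ("lex", "n", "birds")

# Left-branching: ((Fred saw) three) birds
deep = (">", "s",
        (">B", ("s", "/", "n"), (">B", ("s", "/", "np"), fred, saw), three),
        birds)
deep_rewritten = right_branch(deep)
```

In both cases the rewritten derivation contains an explicit vp node (loves Mary, saw three birds) with which a vp\vp modifier could combine.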

Using the Recovered Constituent

Given the rightmost subconstituent recovered using the normal form technique above, how should parsing proceed? Obviously, if the leftward-looking category which precipitated the normal form computation is a modifier, i.e. of the form X\X, then it ought to be combined with the recovered constituent in a form analogous to Chomsky-adjunction, as in the figure below. As an illustration, the first derivation below shows a state of the parser when it encounters a backward-looking category. Normal form computation results in the state shown in the second derivation. From here, two states are possible, corresponding to the two ways of Chomsky-adjoining the modifier: low and high attachment, respectively. These are given in the third and fourth derivations.

[Figure: Recombining a recovered constituent with a rightward looking modifier. The recovered constituent x and the modifier x\x are combined under a new x node.]

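All four derivations are over the same string and lexical categories:

    John    said    that    Bill    saw      Mary    yesterday
    s/vp    vp/s    s/s     s/vp    vp/np    np      vp\vp

First derivation (parser state on encountering yesterday; left-branching). Result categories, in order:
    s/vp, s/s, s/vp, s/np, s

Second derivation (after normal form computation; right-branching). Result categories, in order:
    vp (saw Mary), s (Bill saw Mary), s (that Bill saw Mary), vp (said that Bill saw Mary), s

Third derivation (low attachment of yesterday). Result categories, in order:
    vp (saw Mary), vp (saw Mary yesterday), s, s, vp, s

Fourth derivation (high attachment of yesterday). Result categories, in order:
    vp (saw Mary), s, s, vp (said that Bill saw Mary), vp (said that Bill saw Mary yesterday), s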

But what if this category is not of the form X\X? Should the parser compute the reanalysis shown below?

Before reanalysis:

    a/b    b/c    c/d    s\(a/b)\(b/d)
    ----------->B
        a/c
    ------------------>B
           a/d

After reanalysis:

    a/b    b/c    c/d    s\(a/b)\(b/d)
           ----------->B
               b/d
           ----------------------------<
                    s\(a/b)
    -----------------------------------<
                     s

Such a move would constitute a very odd form of cost-free backtracking. Before reanalysis, the derivation encoded the commitment that the /b of the first category is satisfied by the b of the b/c in the second category. This commitment is undone in the reanalysis. This is an undesirable property to have in a computational model of parsing commitment, as it renders certain revisions of commitments easier than others without any empirical justification. Furthermore, given the possibility that the parser change its mind about what serves as argument to what, the interpreter must be able to cope with such non-monotonic updates to what it knows about the derivation so far; this would surely complicate the design of the interpreter.

Summary

This chapter began by reviewing a very bold proposal of Steedman's: the internal representation used by the human syntactic parser consists only of grammatical analyses. The proposal is bold on two counts:

1. This processing model is unusually impoverished.

2. On the basis of the parsimony of the grammar-parser package, Steedman attempted to argue for a certain theory of competence.

The primary thrust of the argument in point 2, that in principle a processor for CCG avoids design complexity which is necessary for other grammatical frameworks, was challenged by Shieber and Johnson's argument that asynchronous computation could capture the same computational simplicity for rather traditional-looking phrase structure grammars. Resolution of this issue awaits refinement and elaboration of each of these theories to allow their evaluation as adequate characterizations of how the brain actually represents and processes grammars.

Returning to point 1 above, I considered whether an impoverished 'pure' bottom-up CCG parser can serve as an adequate parsing module for the language processing system. I considered three problems which would traditionally have received some sort of precompilation of the grammar, or top-down prediction (in the parsing sense of top-down):



I am indebted to Henry Thompson for a discussion of this issue of monotonicity.

1. Timely detection of ungrammaticality, e.g. the ability to quickly detect that an adjacent pair of categories (e.g. determiner, verb) has no chance of ever leading to a grammatical analysis.

2. Shift-reduce conflicts: identifying the rare set of cases where a CCG rule should be allowed not to apply, e.g. picture-noun extractions.

3. Timely detection of crossing composition: detecting the inevitability of certain rule applications before they actually happen, e.g. detecting heavy shift when an obligatorily transitive verb is immediately followed by a preposition.

Not surprisingly, the pure bottom-up processor cannot handle these cases correctly. More interestingly, however, I have argued that one's theory of the innate processor can remain as parsimonious as Steedman's if one makes the rather plausible assumption that while the ability to parse is innate, the ability to parse efficiently is not. The skill which the language learner acquires, by attending to intermediate parser configurations and their eventual outcomes, can serve to perform the predictive functions necessary for the three cases above. The acquisition process is similar in some ways to the training of n-gram models for part-of-speech taggers.

In the last section of this chapter, I discussed a problem which is quite specific to CCG: CCG distinguishes left-branching and right-branching analyses which are often truth-conditionally equivalent. To cope with the additional ambiguity brought about by CCG's associativity of derivation, I proposed that only the maximally left-branching analysis (as allowed by the grammar) be maintained, and whenever this analysis turns out not to be the correct one, the necessary right-branching analysis is computed from the derivation history.

Steedman's proposal of a parser which only represents grammatical analyses has therefore survived the challenges which it had been put to. In the next chapter, I show how the resulting parsing algorithm is used in the broader sentence processing system.



This property has been called spurious ambiguity (Wittenburg). Steedman has argued that this ambiguity is not spurious; rather, different constituencies correspond to different ways of breaking the string into a theme and a rheme, prosodic constituents which are used to encode information status. But CCG provides more ambiguity than what is necessary for prosodic constituency: the theme and the rheme may in turn receive many truth-conditionally equivalent derivations.

Chapter

A Computer Implementation

In this chapter I instantiate the parsing mechanism described in the previous chapter and the meaning-based ambiguity resolution mechanisms presented in the earlier chapters. I do so by presenting a computer program which simulates human sentence processing performance. The aim of this chapter, and the implementation it describes, is to show the consistency of the collection of subtheories developed thus far to account for the limited data that has been collected, and to test whether these ingredients can indeed be combined in a straightforward and non-ad-hoc way.

The program accepts words as input, one at a time, developing a set of partial analyses as it progresses through the sentence. If at any time this set becomes empty, the processor is said to have failed: the analog of a garden path. In this project, I do not address recovery from a garden path. This model is successful just in case two goals are achieved:

1. It correctly predicts garden path effects in the range of examples discussed in the earlier chapters.

2. The implementation is straightforward, that is, it is a simple procedure which applies linguistic competence to the input representation, without having to resort to specialized algorithms.

Desiderata

Let us begin by stating the desiderata for the computational model in detail. The system is divided into the modules shown in the system diagram below. The bottom-up syntactic rule applier, i.e. the parser, constructs in parallel all possible analyses for the initial segment seen so far. The buffer viability filter detects unviable analyses and immediately signals the parser to discard these. The semantic-pragmatic interpreter examines only the sense-semantics which the parser constructs, and not other, more superficial aspects of the syntactic analyses. The parser, in turn, may not look inside the interpreter; the only information flowing from the interpreter to the parser is whether to maintain or discard current analyses. The actual program does not literally separate the different procedural modules into informationally encapsulated modules (e.g. using asynchronous communicating processes), but nevertheless obeys these restrictions on data flow.

[Figure: System Diagram. Labels: Input Utterance, Lexicon, Combinatory Rules, Bottom-up Rule Applier, Viable Buffer Filter, Syntactic Analyses, Semantic-Pragmatic Interpreter, Discourse Model, World Knowledge, Analysis-suspension Messages. Key: declarative knowledge, computational process, data flow.]

To avoid the inferential complexities associated with accommodation, the repositories of knowledge about the world and knowledge of the preceding discourse are not updated by the interpreter, i.e. they are treated as read-only storage.

The following phenomena are covered:

Referential Felicity. Crain and Steedman show context sensitivity in pairs such as the following (see the earlier discussion; Altmann et al.):

    a. The psychologist told the wife that he was having trouble with to leave her husband.
    b. The psychologist told the wife that he was having trouble with her husband.

In a context with just one wife, (a) is a garden path, whereas (b) is not. The opposite is true if the context mentions two wives.

Complexity of Accommodation. Crain and Steedman's Principle of Parsimony (discussed earlier) entails that, out of context, the simplex NP reading of the wife (compatible with (b)) would be preferred to the restrictively modified NP reading of (a).

    Principle of Parsimony (Crain and Steedman)
    If there is a reading that carries fewer unsatisfied but consistent presuppositions or entailments than any other, then, other criteria of plausibility being equal, that reading will be adopted as most plausible by the hearer, and the presuppositions in question will be incorporated in his or her mental model of the discourse.

In the current project, the complex process of determining the number and plausibility of presuppositions carried by an NP will be approximated by a very simple and crude method: accommodating a simple NP incurs no cost, while accommodating an NP which is restrictively modified carries some fixed cost. It must be emphasized that this approximation is not based on the syntactic complexity of complex NPs, but on the presupposition encoded by the use of a restrictive relative clause: there is more than just one entity matching the description of the head noun, so a restrictive modifier is necessary to individuate the referent intended by the speaker.
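The crude cost approximation just described can be pictured with a few lines of Python. This is only a sketch under stated assumptions: the `NounPhrase` class, the function name `accommodation_cost`, and the value of `RESTRICTIVE_MODIFIER_COST` are hypothetical (the text specifies only that the cost is fixed), and the actual program is not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical value: the text says only that the cost is "some fixed cost".
RESTRICTIVE_MODIFIER_COST = 1


@dataclass
class NounPhrase:
    head: str
    restrictively_modified: bool = False


def accommodation_cost(np: NounPhrase) -> int:
    """No cost for a simple NP; a fixed cost for a restrictively modified NP.

    The cost reflects the presupposition carried by the restrictive modifier,
    not the syntactic complexity of the NP.
    """
    return RESTRICTIVE_MODIFIER_COST if np.restrictively_modified else 0


simple_reading = NounPhrase("wife")
modified_reading = NounPhrase("wife", restrictively_modified=True)
```

Out of context, `accommodation_cost(simple_reading)` is lower than `accommodation_cost(modified_reading)`, which mirrors the preference for the simplex NP reading.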

I do not consider nonrestrictive relative clauses in this project.

Plausibility and Garden Paths. Bever noticed that garden path effects, as in

    The horse raced past the barn fell.

can be minimized when the plausibility of the main verb analysis of the first verb is decreased (discussed earlier). Bever's example is

    The light airplane pushed past the barn crashed into the post.

In this project, I use a slight variant:

    a. The poet read in the garden stank.
    b. The poem read in the garden stank.

Heavy Shift and Garden Paths. Pritchett points out that the garden path effect in the horse-raced example is absent in

    The bird found in the store died.

Clearly, the fact that find is an obligatorily transitive verb plays a role here. Given that

    The bird found in the store a corner in which to nest.

is also not a garden path sentence, it follows that both reduced relative and main verb analyses of found are pursued in parallel. It is possible to force one or the other reading using an appropriate context:

    Q: What did the bird find in the store?
    A: The bird found in the store died.
    A: The bird found in the store a corner in which to nest.

    In the pet store, two exotic birds escaped from their cages. One was located in a nearby tree and the other was found hiding inside the store.
    The bird found in the store died.
    The bird found in the store a corner in which to nest.

Adverbial Attachment. In an earlier chapter it was argued that considerations of information volume were responsible for the low attachment preference of the adverbial in

    The poet said that the psychologist fell yesterday.

but that no such considerations apply to the attachment of the adverbial in

    The poet said that the psychologist fell because he disliked us.

Since the inference required for determining correct attachment decisions in the latter example is open-ended and non-linguistic, the current program leaves this ambiguity unresolved and reports both readings.

So-called Late Closure Effects. Out of context, the following examples are garden paths:

    a. When the cannibals ate the missionaries drank.
    b. Without her contributions failed to come in.
    c. When they were on the verge of winning the war against Hitler, Stalin, Churchill and Roosevelt met in Yalta to divide up postwar Europe.

I have implemented both a new-subject detector and a disconnectedness-determining procedure in order to experiment with the two theories presented in an earlier chapter.

Center Embedding Effects. While the following sentence does not give rise to garden path effects, the system does represent the fact that it is harder than other sentences:

    The worm that the bird that the poet watched found died.

This measure of difficulty is lower when some of the subjects are given in the discourse.

Syntax

The competence grammar in this system is an instantiation of Steedman's Combinatory Categorial Grammar, which is capable of constructing left-branching analyses, as discussed earlier. A proper linguistic investigation of grammatical competence being outside the scope of this work, the aim of the grammar here is to provide at least one analysis for each reading relevant for the examples in the Desiderata section. To this end, the following grammar will do.



A basic category is represented as an ordered pair: a Prolog term and a semantic variable, separated by a colon. The Prolog term is a major category symbol with zero or more arguments, its features. The basic categories are:

    Basic categories

    Symbol    Features           Description
    n         NUM                common noun
    np        PERS, NUM          noun phrase
    s         TNS, FIN, COMP     sentence or SBAR
    part      PART               particle
    pp        PREP               prepositional phrase
    eop       (none)             end of phrase marker (zero morpheme)

A feature may be unspecified, or have a value from the following domains:

    Feature    Possible values
    NUM        sg, pl
    PERS       1, 2, 3
    TNS        to, en, ing, plup, ed, s, fut
    FIN        depends on TNS: plup, ed, s, fut are finite; to, en, ing are non-finite
    PART       away, down, up, over
    COMP       that, q (q is special: it means that the s is a WH question)
    PREP       in, to, without

For example, the basic category s with tense ed and complementizer that, paired with the semantic variable X, stands for a sentence in the past tense whose complementizer is that, e.g. that Mary loves John. In the accompanying semantic term list, the variable X represents the main situation in the sentence.

I use Prolog notation throughout: symbols beginning with a lowercase letter are constants; symbols beginning with an uppercase letter are variables; an underscore denotes an anonymous variable, and different occurrences of the underscore denote different anonymous variables. See Pereira and Shieber for an introduction to the Prolog programming language.

There is an exception to this naming scheme, however. Prolog is usually unable to keep track of the names of variables after unification takes place. When it must print a variable, it prints a tedious internally generated name. To make terms easier to read, I use a printing procedure which gives semantic indices names beginning with e, syntactic variables names beginning with s, and category variables names beginning with c.

The lexicon is stored as a collection of words and their associated part-of-speech labels. When the system is started, a process generates lexical entries from these part-of-speech labels. A lexical entry is a triple of a word, a syntactic category, and a semantic term list. Examples of the different part-of-speech labels are as follows:

[Table: examples of part-of-speech labels. Columns: POS label, what the label stands for, example word, syntactic category, semantic term list. The labels cover verb classes (e.g. vi, vpr), cn (common noun), mn (mass noun), pn (proper name), nom, obj and poss pronouns, det (determiner), part (particle), adj (adjective), prep (preposition), post (postmodifier), sconj (subordinating conjunction) and adverbs. Example words include call, tell, try, bird, John, us, our, the, in, juicy and passionately.]

word category term list comment

that sTthatEsTE complementizer

that nNEnnNEeopSs SnnpNE npmo dE sub ject relativizer

that nNEnnNEeopSs SnpNE npmo dE ob ject relativizer

which nNEnnNEeopSs npNE npmo dE ob ject relativizer

which sTqEsTq npNEnNE whE question

did sedqEsedE for sub jaux inversion

to stoEnnpPNXstoEnnpPNX innitiveto

will sfutEnnpPNXsfutEnnpPNX futureE aux will

was sedEnnpPNXsingEnnpPNX past progressive

is ssEnnpPNXsingEnnpPNX pres progressive

had splupEnnpPNSsenEnnpPNS plup erfectE

b een senEnnpPNSsingEnnpPNS past p erf progressive

eopX closedX end of phrase marker

Table Lexical entries for closed class items

In addition to the above lexical entry generators, idiosyncratic (i.e. closed-class) words have the lexical entries listed in the table of lexical entries for closed class items.

The annotation swa(X) is associated with all single-word adverbials. The annotation npmod(X) is associated with all nominal modifiers. These annotations allow the interpreter to approximate the detection of a low information volume adverbial preceded by a high information volume argument, as discussed in an earlier chapter.

The zero morpheme eop:X and the semantic terms the(X), name_of(X,Y) and phrase_closed(X) are part of the reference resolution system. They will be described later. The latter annotation, phrase_closed(X), is abbreviated in the table above as simply closed(X) for reasons of space.

There are lexical entries for each verb in all of its inflected forms, as follows:

form category semantics

walked sedEnnp X walkEXtnsEed

walked senEnnp X walkEXtnsEen

walking singEnnp X walkEXtnsEing

walks ssEnnpsgX walkEXtnsEs

walk sTEnnp X walkEXtnsET untensed

walk ssEnnp plX walkEXtnsEs plural present

walk ssEnnp X walkEXtnsEs st p ers present

walk ssEnnp X walkEXtnsEs nd p ers present

The tense system implemented in the current grammar is rather crude, but it suffices to construct the analyses necessary for the examples.

Subject-Aux inversion is handled as follows:

nd did Mary

sedqesede sXYesXYennpsge sTennpsgenpe

sedqesedennpsge

sedqenpsge

ofemaryndeeetnseed name

Notice that the identity of the tense, ed in this case, is passed from did through Mary (where the tense variable X is unified with ed) to find, whose category includes a variable T which is unified with ed.

The s/np constituent did Mary find (with tense ed and complementizer q) can then combine to form a WH question:

whichbird did Mary nd

sTqesTqenp e sedqenpsge

sedqe

whebirdename ofemaryndeeetnseed

A subject type-raising rule applies to all NPs, with the exception of objective case pronouns:



npPNX semS sTFSsTFnnpPNX semsub jXS

A variant of this rule applies to all determiners:

npPersNumXeopXnNumX semS

sTFSsTFnnpPersNumXeopXnNumX semsub jXS

Words that create non-subject WH-dependencies (relativizers, wh-question words) each have, in addition to the categories listed in the table of closed class items, two additional categories, which reflect one and two applications of a non-direction-preserving version of Geach's division rule (Geach):

XY XZYZ

For example the relativizing pronoun which has in addition to the category

nNEnnNEeopSs npNE

listed in table the following two categories

XnpNE nNEnnNEXeopSs

nNEnnNEXYeopSs XYnpNE



The latter two categories are included in order to allow for non-peripheral extraction, for example:

    Mark reminded the babysitter to watch the movie.
    the babysitter that Mark reminded to watch the movie

Recall from the earlier footnote that symbols like e stand for semantic variables.

The Prolog notation [H|T] stands for a list whose first element is H and the rest of whose elements are T. The subject type-raising rule therefore adds the notation that the NP appears as subject to the semantic term list associated with the NP.

See Steedman for a different way to capture non-peripheral extraction: interaction between crossing composition and object type-raising.

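The combinatory rules are as follows:

    left child    right child    result     rule name
    A/B           B              A          forward application
    A/B           B/C            A/C        forward composition
    A/B           B/C/D          A/C/D      forward composition (two arguments)
    A/B           B/C/D/E        A/C/D/E    forward composition (three arguments)
    B             A\B            A          backward application
    A/B           C\A            C/B        backward crossing composition

The capital letters in the rules are Prolog variables, and these rules operate by unification.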
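The rule table can be sketched in Python as a single function that tries each rule on a pair of adjacent categories. This is an illustration under simplifying assumptions, not the grammar's Prolog implementation: categories are plain tuples without features or semantic variables, exact equality replaces unification, and the name `combine` and the rule-name strings are chosen here for readability.

```python
# A category is an atom ("np") or a triple (result, slash, argument),
# where slash is "/" (seeks its argument to the right) or "\\" (to the left).

def is_fn(cat, slash):
    """True if cat is a functor category with the given slash."""
    return isinstance(cat, tuple) and cat[1] == slash


def combine(left, right):
    """Return every (rule name, result category) licensed for left + right."""
    results = []
    if is_fn(left, "/"):
        a, _, b = left
        if b == right:                                  # A/B  B  =>  A
            results.append(("forward application", a))
        # A/B  B/C[/D[/E]]  =>  A/C[/D[/E]]: peel one to three arguments off right
        peeled, rest = [], right
        for order in (1, 2, 3):
            if not is_fn(rest, "/"):
                break
            rest, _, arg = rest
            peeled.append(arg)
            if rest == b:
                result = a
                for x in reversed(peeled):
                    result = (result, "/", x)
                results.append((f"forward composition (order {order})", result))
        if is_fn(right, "\\") and right[2] == a:        # A/B  C\A  =>  C/B
            results.append(("backward crossing composition", (right[0], "/", b)))
    if is_fn(right, "\\") and right[2] == left:         # B  A\B  =>  A
        results.append(("backward application", right[0]))
    return results


S_VP = ("s", "/", "vp")
VP_NP = ("vp", "/", "np")
VP_MOD = ("vp", "\\", "vp")

print(combine(S_VP, "vp"))
print(combine(S_VP, VP_NP))
print(combine("vp", VP_MOD))
print(combine(VP_NP, VP_MOD))
```

The last call corresponds to loves madly: a vp/np verb and a vp\vp adverb combine by backward crossing composition to give vp/np, the configuration used for heavy shift.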

Along lines suggested by Aone and Wittenburg, there is a rule for positing a zero morpheme adjacent to a category which expects it. The processor blocks excessive applications of this rule. For example, given a determiner and a noun, the forward application rule applies to combine them and yield the category np/eop:

    the         bird
    np/eop/n    n
    ---------------->
         np/eop

The zero morpheme eop (end of phrase) is then posited to the right of the noun, and immediately combined to yield an np:

    the         bird    [eop]
    np/eop/n    n       eop
    ---------------->
         np/eop
    -------------------------->
               np

When a rule is applied to combine two constituents, the semantic term list of the result is simply the concatenation of the term lists of the two constituents, with one exception: the last combinatory rule, so-called backward crossing composition, introduces an additional term, h_shifted(X,Y), which designates that argument X of Y was heavy-shifted. For example, in

a nice car found yesterday john

sesennpe sennpenpe sense npeeope eope

senpe

senpe

seeope

se



Inessential details inside the categories are omitted for clarity

the semantic term lists asso ciated with the marked derivation step are as follows

John found

thee name ofejohn closede ndeee tnseed

yesterday

yesterdaye swae

john found yesterday

thee name ofejohn closede ndeee tnseed yesterdaye swae h shiftedee

Data Structure

The pro cessor maintains one or more analyses in parallel Each analysis has data comp onents

on twolevels SyntaxSemantics and InterpretationEvaluation There are four comp onents

altogether

Buer

SyntaxSemantics

Semantic Term List

Interpreter Annotations

InterpretationEvaluation

Penalties

The Buer is a sequence of constituents Adjacent constituents maybecombined using the

combinatory rules or revealing see chapter I use the term revealing for the pro cess of

recovering the implicit constituent using derivation rewriting A constituent is a tuple

hCategory Rule LeftChild RightChildi

where LeftChild and RightChild are normally constituents and Rule is the name of a combina

tory as listed in section When a constituent is a single word Rule is lex LeftChild holds

the actual word and RightChild holds the placeholder There is a sp ecial rule init whichis

used in the initial state of the parser It is discussed in section The Semantic Term List

holds the list of semantic terms asso ciated with a constituent In case the Buer contains more

than one constituent the Semantic Term List is the concatenation of the term lists of those



constituents

The interpreter may read the Semantic Term List, but not modify it. It records its results (e.g. pronoun resolution) in the Interpreter Annotations component. The interpreter records its assessment of the sensibleness of the analysis in the Penalties component. This component has two parts: the penalty list, which enumerates the particular penalties associated with the state, and a score, which is determined from the penalty list and is used for comparing the current analyses.
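The four components described above can be sketched as plain Python data classes. This is only an illustration under stated assumptions: the program described in this chapter is written in Prolog, and every name here (Constituent, State, shift, the category string "n", the index "e1") is hypothetical. None stands in for the placeholder in a lexical constituent.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass(frozen=True)
class Constituent:
    """A buffer element: <Category, Rule, LeftChild, RightChild>."""
    category: str
    rule: str                                  # a combinatory rule name, or "lex" / "init"
    left: Union["Constituent", str, None]      # for "lex", the actual word
    right: Optional["Constituent"] = None      # None plays the role of the placeholder


@dataclass
class State:
    """One analysis maintained by the processor (hypothetical layout)."""
    buffer: List[Constituent] = field(default_factory=list)   # Syntax/Semantics
    semantics: List[tuple] = field(default_factory=list)      # Syntax/Semantics
    annotations: List[tuple] = field(default_factory=list)    # Interpretation/Evaluation
    penalties: List[tuple] = field(default_factory=list)      # Interpretation/Evaluation
    score: int = 0                                            # derived from the penalty list


def shift(state: State, category: str, word: str, sem: List[tuple]) -> State:
    """Return a copy of `state` extended by one lexical constituent and its terms."""
    return State(
        buffer=state.buffer + [Constituent(category, "lex", word, None)],
        semantics=state.semantics + list(sem),
        annotations=list(state.annotations),
        penalties=list(state.penalties),
        score=state.score,
    )


s0 = State()
s1 = shift(s0, "n", "poet", [("poet", "e1")])
```

Copying rather than mutating the state mirrors the way the processor keeps several analyses side by side: extending one analysis must not disturb its competitors.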



Given this representational system, it is logically possible that there be two terms in the term list which originate from different constituents, thus having no semantic indices in common. Subsequently, when the two constituents are combined, unification could cause two such distinct indices to become identical. Curiously, such a phenomenon does not arise in the grammar and semantics of the current system. That is, whenever two constituents do not combine, it is never the case that they both introduce semantic terms over semantic indices which will subsequently be unified. If this property remains in more comprehensive grammars, it provides opportunities for certain monotonicity-related inferences, whose consequences require further research.

Control Structure

When the system encounters a string, the following top-level control algorithm is executed:

Start with one initial state S_init, where
    S_init's buffer is the single constituent htlsTCXeopXsTCXiniti
    S_init's semantic term list, interpretations and penalties are all empty
For each word W in the input
    For each lexical entry ⟨W, Cat, Sem⟩
        For each current state S
            Make a copy S' of S
            Add the constituent ⟨Cat, lex, W⟩ to the Buffer of S'
            Append Sem to the Semantic Term List of S'
            For each way S'' of nondeterministically applying the rules of grammar to S' (see below)
                If the resulting buffer is an admissible one (see below) then
                    For each way of interpreting S'' (see below)
                        Compute the penalty of the interpretation
                        Save S'', unless subsumed by an extant state
            Remove S
    Perform the discarding procedure on the current set of states (see below)
    Continue with the next word
Of the states whose buffer has the singleton constituent whose category is tls,
display the most sensible state or states, i.e. the ones with the least penalty.

The category in the initial state has as its result the special symbol tls ('top-level sentence'), which is not mentioned elsewhere in the grammar. This symbol is introduced mostly for convenience, and should be thought of as identical to the symbol s. The difference will be ignored in the exposition whenever possible. The category has as its first argument the basic category s, which creates the expectation for a tensed sentence.
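The shape of the top-level control algorithm can be sketched as a small driver whose grammar, admissibility, interpretation and discarding steps are passed in as functions. This is a simplified illustration, not the Prolog implementation: a state is reduced here to a hypothetical triple (buffer, semantics, score), plain equality stands in for the subsumption check, and the final selection of tls states is left out. The toy lexicon and its category names are assumptions.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical, simplified state: (buffer of categories, semantic terms, penalty score).
State = Tuple[Tuple[str, ...], Tuple[str, ...], int]
Lexicon = Dict[str, List[Tuple[str, Tuple[str, ...]]]]


def process(words: Iterable[str],
            lexicon: Lexicon,
            init: State,
            reductions: Callable[[State], Iterable[State]],
            admissible: Callable[[State], bool],
            interpretations: Callable[[State], Iterable[State]],
            discard: Callable[[List[State]], List[State]]) -> List[State]:
    states = [init]
    for word in words:
        successors: List[State] = []
        for category, sem in lexicon.get(word, []):
            for buffer, semantics, score in states:
                # Copy the state, then add the new lexical constituent and its terms.
                shifted = (buffer + (category,), semantics + sem, score)
                for reduced in reductions(shifted):
                    if not admissible(reduced):
                        continue
                    for interpreted in interpretations(reduced):
                        if interpreted not in successors:   # crude stand-in for subsumption
                            successors.append(interpreted)
        states = discard(successors)    # the old states are dropped at this point
    return states


def keep_least_penalized(states: List[State]) -> List[State]:
    if not states:
        return []
    best = min(score for _, _, score in states)
    return [s for s in states if s[2] == best]


toy_lexicon: Lexicon = {
    "the": [("det", ("the(e)",))],
    "poet": [("n", ("poet(e)",))],
}
final = process(["the", "poet"], toy_lexicon, ((), (), 0),
                reductions=lambda s: [s],        # no grammar: leave the buffer as is
                admissible=lambda s: True,
                interpretations=lambda s: [s],   # no interpreter: no penalties added
                discard=keep_least_penalized)
```

Passing the four steps in as parameters keeps the driver independent of any particular grammar, which matches the division of labour between the sections that follow.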

BottomUp Reduce Algorithm

The nondeterministic reduce computation is as follows:

reduce(state S):
    either
        S as is
    or
        if there is a reduce step that can be applied to the buffer of S
        then perform this step and recursively call reduce on the resulting state
    or
        let RC be the rightmost constituent of the buffer of S
        if the category of RC is of the form .../Z
            where Z matches a zero morpheme (e.g. eop(X))
        then
            append the constituent ⟨Z, lex⟩ to the right of the buffer
            append the semantic term list associated with that zero morpheme to S's semantics
            recursively call reduce on the resulting state
        end if
end

There are two ways of performing a single reduce step, as discussed in chapter :

let X, Y be the two rightmost constituents in the buffer of S
let XC and YC be the syntactic categories of X and Y, respectively

method 1:
    if there is a combinatory rule R of the form XC YC → Z
    then
        replace X and Y in the buffer with the constituent ⟨Z, R, X, Y⟩
        if rule R has any semantic terms
        then append these terms to the semantic term list of S

method 2:
    if YC is of the form W\W
    then
        let XNF be the right normal form of X
        if there exists a right subconstituent RS of XNF such that
            the syntactic category of RS is RC and
            there is a combinatory rule R of the form RC YC → Z
        then
            replace RS by ⟨Z, R, RS, Y⟩
            if rule R has a nonempty semantic term list
            then append this list to the semantic term list of S
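The 'either as is, or reduce and recurse' shape of the computation can be sketched as a Python generator. This toy version is an assumption-laden simplification: categories are plain strings with atomic arguments, only forward and backward application are modelled (result on the left of the slash), and neither zero morphemes nor the second method (revealing) is attempted. The names combine and reductions are hypothetical.

```python
from typing import Iterator, Optional, Tuple

Buffer = Tuple[str, ...]


def combine(left: str, right: str) -> Optional[str]:
    """Toy forward/backward application over categories with atomic arguments."""
    if left.endswith("/" + right):        # X/Y  Y   =>  X
        return left[: -len(right) - 1]
    if right.endswith("\\" + left):       # Y  X\Y   =>  X
        return right[: -len(left) - 1]
    return None


def reductions(buffer: Buffer) -> Iterator[Buffer]:
    """Yield every buffer reachable by zero or more reduce steps."""
    yield buffer                          # option 1: leave the buffer as is
    if len(buffer) >= 2:                  # option 2: reduce the two rightmost constituents
        result = combine(buffer[-2], buffer[-1])
        if result is not None:
            yield from reductions(buffer[:-2] + (result,))


subject_verb = list(reductions(("np", "s\\np")))
```

Enumerating all outcomes, rather than committing to one, is what lets the admissibility condition and the interpreter choose among them afterwards.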

Buffer Admissibility Condition

As discussed in chapter (especially sections ), the adult listener/reader has access to a procedure which identifies and discards unviable buffers such as the_DET insults_VERB. For the purposes of this project, I circumvent the step of acquiring this procedure by stipulating the following condition, which is adequate for the grammar I use.

Buffer Admissibility Condition: for every pair of adjacent constituents whose categories are X and Y,
    1. no obligatory combinatory rule exists which can combine X and Y, and
    2. the categories of X and Y are ultimately combinable.

All combinatory rules are obligatory except those forward rules (>n) where the left category is np, i.e. those rules which determine filler-gap relations, as discussed in section .

X and Y are ultimately combinable in case any one of the following three conditions holds:

    1. X is of the form Z_m and Y is of the form Z_n, for some m, n and some category Z;
    2. Y is of the form A\B_n, and there is a combinatory rule which can combine X and A\B;
    3. Y is of the form A\B_n, and the right normal form of X has a right subconstituent RS such that there is a combinatory rule which can combine RS with A\B.

The last two of these conditions anticipate applications of certain backward combinatory rules. I argued earlier that semantic terms which would be introduced by the anticipated rule application (in particular, a term marking crossing composition, a signal for heavy-NP shift) should be detected immediately, and not delayed until the rule is actually applied. This is realized in the implementation.

Interpretation

The interpretive component in the current system performs only two of the many interpretive functions of its human counterpart. It performs a simplistic database-lookup operation for resolving definite noun phrases against the prior discourse, without any so-called bridging inferences (see Haviland and Clark). It also implements a trivial form of plausibility/implausibility inference, relying on a hand-coded database of implausible scenarios. These inferences are of interest, of course, only insofar as they contribute to the evaluation of competing analyses.

Real World Implausibility

Minimal pairs such as

a. The poet read in the garden stank.
b. The poem read in the garden stank.

where (a) is a garden path but (b) is not, demonstrate the reliance of the processor on world-knowledge inferences (see section ). It does not follow, of course, that all ambiguities which can be resolved by inference are indeed thus resolved on-line. One could set up arbitrarily complex puzzles, the solutions of which are crucial for resolving a particular ambiguity. An account of which inferences are sufficiently fast so as to direct on-line ambiguity resolution is far outside the scope of the current work. (See Shastri and Ajjanagadde for one view of fast inference.) For the purposes of the current project, I assume that such an inferential device exists and is able to quickly notice certain obvious semantic incongruities and alert the interpreter. One could think of the N400 signal in electroencephalograms (Garnsey et al.) as a correlate of the human analog of this incongruity alert. I simulate the behavior of such an anomaly detector by anticipating each anomalous situation which will be encountered by the system and encoding that situation by hand. A partial list of these situations is as follows. (S is the semantic variable of the implausible scenario.)

scenario description            explanation
read(S, X), poem(X)             Poems can't read.
read(S, X), poem(X)             Poems can't read anything.
warn(S, X), poem(X)             One can't warn poems.
stop(S, X), poem(X)             Poems can't stop anything.
future(S), yesterday(S)         Anything that happened yesterday is not in the future.
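A hand-coded scenario list of this kind can be checked against a semantic term list with a few lines of pattern matching. The sketch below is illustrative only: terms are encoded as Python tuples, capitalised strings are pattern variables, "_" matches anything, and the three-place read(Event, Agent, Object) encoding is an assumption rather than the thesis's actual representation.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Term = Tuple[str, ...]                      # e.g. ("poem", "e1")


def match(pattern: Term, term: Term, env: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Match one pattern against one ground term, extending the bindings in env."""
    if len(pattern) != len(term) or pattern[0] != term[0]:
        return None
    env = dict(env)
    for p, t in zip(pattern[1:], term[1:]):
        if p == "_":
            continue                        # anonymous argument
        if p[:1].isupper():                 # pattern variable
            if env.setdefault(p, t) != t:
                return None
        elif p != t:                        # constant
            return None
    return env


def satisfies(patterns: Sequence[Term], terms: Sequence[Term],
              env: Optional[Dict[str, str]] = None) -> bool:
    """True if all patterns can be matched in `terms` under one consistent binding."""
    env = {} if env is None else env
    if not patterns:
        return True
    for term in terms:
        new_env = match(patterns[0], term, env)
        if new_env is not None and satisfies(patterns[1:], terms, new_env):
            return True
    return False


# Hypothetical encodings of two of the scenarios listed above.
IMPLAUSIBLE = [
    ([("read", "S", "X", "_"), ("poem", "X")], "Poems can't read anything"),
    ([("future", "S"), ("yesterday", "S")],
     "Anything that happened yesterday is not in the future"),
]


def implausibilities(terms: Sequence[Term]) -> List[str]:
    return [why for patterns, why in IMPLAUSIBLE if satisfies(patterns, terms)]


poem_reads = [("the", "e1"), ("poem", "e1"), ("read", "e2", "e1", "e3"), ("tns", "e2", "ed")]
poet_reads = [("the", "e1"), ("poet", "e1"), ("read", "e2", "e1", "e3"), ("tns", "e2", "ed")]
```

Here implausibilities(poem_reads) flags the main-verb reading of 'the poem read', while the same call on poet_reads reports nothing.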

Definite Reference

In the current system, all definite NPs (pronouns, names, and NPs with definite determiners) have uniform semantic representations. A segment of the semantic term list which begins with the term the(X) ends with the term phrase_closed(X). Between these markers lie semantic terms. Here are some examples:

phrase                      semantic term list
the poem                    the(X), poem(X), phrase_closed(X)
she                         the(Y), third_pers(Y), feminine(Y), singular(Y), phrase_closed(Y)
john                        the(Y), name_of(Y, john), phrase_closed(Y)
his poem                    the(X), third_pers(X), masculine(X), singular(X), phrase_closed(X),
                            the(Y), of(Y, X), poem(Y), phrase_closed(Y)
the poem that john likes    the(Y), poem(Y), npmod(Y), the(Z), name_of(Z, john), phrase_closed(Z),
                            like(W, Z, Y), tns(W, s), phrase_closed(W), phrase_closed(Y)

Terms such as name_of(X, Y) and poem(X) are called restrictive. Others, such as phrase_closed(X) and the(X), are non-restrictive, as they do not serve to narrow down the set of possible referents. It is assumed that all modifiers are restrictive, i.e. non-attributive.

The algorithm for resolving definite reference is given in the figure below. Some illustrations will make this algorithm's operations clear.

Suppose that the following sentence is encountered out of context:

The horse shown to the poet fell.

When the first word, the, is processed, the state's semantics contains only a subj term and a the term. (Recall that subj(X) is introduced by the subject type-raising rule. It is not a restrictive semantic atom.) Since there are no restrictive semantic atoms, the algorithm does nothing. The next word, horse, introduces a syntactic ambiguity: is the phrase the horse closed or not?

Given a database D representing the entities of the prior discourse and relations among them,
and given a state S with
    semantic term list SEM,
    interpreter annotation list IA, and
    penalty list P:

Scan SEM from right to left. (SEM's atoms reflect the order of the input string.)
For each occurrence O of the(X):
    if accom(X, _) or resolved(X, _) is in IA then do nothing        (already processed)
    else
        let SEM' be the final segment of SEM which begins with O
        let Q be the query derived by conjoining all the restrictive atoms of SEM'
        if Q is empty
        then do nothing        (don't look for a referent of a phrase still missing its lexical head)
        else
            let C be the set of values for X for which Q succeeds on D
            if C is empty then
                if the term phrase_closed(X) appears in SEM
                then add accom(X, Q) to IA
                else add accom_complex_description(X) to P
                endif
            else if |C| = 1 then
                add resolved(X, c) to IA, where c is the element of C
                if the term phrase_closed(X) does not appear in SEM
                then add overspecified_ref(X) to P
                endif
            else if |C| > 1 then
                if the term phrase_closed(X) appears in SEM
                then
                    let c be an arbitrary member of C
                    add resolved(X, c) to IA
                    add underspecified_ref(X) to P
                endif
            endif
        endif
    endif
end for

Figure: Definite Reference Resolution Algorithm
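The algorithm in the figure can be approximated in Python as follows. This is a sketch under several assumptions, not the Prolog implementation: terms are tuples, semantic indices are strings of the form e1, e2, ..., the discourse database is a collection of ground tuples, the 'arbitrary' candidate is taken to be the alphabetically first one, and a penalty is added at most once (standing in for the duplicate-removal step mentioned in the text). All identifiers are hypothetical.

```python
import re
from typing import Collection, Dict, Iterator, List, Optional, Tuple

Term = Tuple
NON_RESTRICTIVE = {"the", "phrase_closed", "subj"}
INDEX = re.compile(r"e\d+$")                     # assumed shape of a semantic index


def bind(atom: Term, fact: Term, env: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Match one query atom against one database fact, extending env."""
    if len(atom) != len(fact) or atom[0] != fact[0]:
        return None
    env = dict(env)
    for a, f in zip(atom[1:], fact[1:]):
        if INDEX.match(a):
            if env.setdefault(a, f) != f:
                return None
        elif a != f:
            return None
    return env


def solve(atoms: List[Term], db: Collection[Term], env: Dict[str, str]) -> Iterator[Dict[str, str]]:
    """Yield every binding of indices under which all atoms hold in db."""
    if not atoms:
        yield env
        return
    for fact in db:
        new_env = bind(atoms[0], fact, env)
        if new_env is not None:
            yield from solve(atoms[1:], db, new_env)


def add_once(items: List[Term], item: Term) -> None:
    if item not in items:
        items.append(item)


def resolve_definites(sem: List[Term], db: Collection[Term],
                      ia: List[Term], penalties: List[Term]) -> None:
    for pos in range(len(sem) - 1, -1, -1):      # scan right to left
        if sem[pos][0] != "the":
            continue
        x = sem[pos][1]
        if any(a[0] in ("accom", "resolved") and a[1] == x for a in ia):
            continue                             # already processed
        segment = sem[pos:]
        query = [t for t in segment if t[0] not in NON_RESTRICTIVE]
        if not query:
            continue                             # lexical head still missing
        closed = ("phrase_closed", x) in sem
        candidates = sorted({env[x] for env in solve(query, db, {}) if x in env})
        if not candidates:
            if closed:
                ia.append(("accom", x, tuple(query)))
            else:
                add_once(penalties, ("accom_complex_description", x))
        elif len(candidates) == 1:
            ia.append(("resolved", x, candidates[0]))
            if not closed:
                add_once(penalties, ("overspecified_ref", x))
        elif closed:
            ia.append(("resolved", x, candidates[0]))   # arbitrary choice
            add_once(penalties, ("underspecified_ref", x))


closed_np = [("subj", "e1"), ("the", "e1"), ("horse", "e1"), ("phrase_closed", "e1")]
open_np = closed_np[:-1]
two_horses = {("horse", "h1"), ("horse", "h2")}
```

Running resolve_definites on closed_np and open_np, first with an empty database and then with two_horses, reproduces the four outcomes walked through in the surrounding text: accommodation, the complex-description penalty, an arbitrary resolution with an underspecification penalty, and waiting.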

state i

the horse

sesennpeeopene ne eope

sesennpeeope

sesennpe

sub je thee horsee phrase closede

state ii

the horse

sesennpeeopene ne

sesennpeeope

sub je thee horsee

In state (i), the parser nondeterministically chose to close the NP. The discourse representation is queried to find all things X which match the query horse(X). Since the discourse representation is empty, the result of this query is the empty set. An accom annotation recording this query is therefore added to the state's Interpreter Annotations list. No penalties apply. In state (ii), the parser chose not to close the NP. The penalty accom_complex_description is added to the state's penalty list, since the state's buffer encodes a commitment to restrictive postmodifiers for the NP.

The next word, shown, resolves the closure/non-closure ambiguity, as it triggers a restrictive reduced relative clause. When the reduced relative clause is finished, again there is a closure ambiguity, as follows:

state iii

horse shown to the p o et the

sesennpeeopene ne nenne eope

ne

sesennpeeope

sesennpe

sub je thee horsee showeee tnseen npmo de toee thee p o ete

phrase closede phrase closede

(I number the states solely for ease of reference.)

state iv

the horse shown to the p o et

sesennpeeopene ne nenne

ne

sesennpeeope

sub je thee horsee showeee tnseen npmo de toee thee p o ete

closede phrase

State iii gets the interpreter annotation

accomehorsee showeee tnseen toee p o ete

(ignoring the independent processes of resolving the NP the poet). State (iv) is not yet closed, so it does not get this accommodation annotation. Instead, it gets another accom_complex_description penalty, which is subsequently removed by a duplicate-removal procedure.

The presence of the main verb fell disambiguates the closure question, this time by selecting the closed state (iii).

Suppose the prior discourse contains two horses, introduced, for example, by the following passage:

There were two horses being shown to a prospective buyer. One was raced in the meadow and the other was raced past the barn.

In this context, the interpretation of The horse shown to the poet fell proceeds differently. After encountering the first two words, the horse, the parser constructs states (i) and (ii) above. The query horse(X) now returns two possible candidates, call them horse1 and horse2. State (i), in which the NP is marked as closed, is incapable of acquiring additional information to identify a unique referent for the horse. The interpreter then chooses one of these arbitrarily and adds the corresponding resolved annotation to the interpreter annotations of state (i). Noting this premature choice, it adds the penalty underspecified_ref to the state's penalty list. State (ii) is not closed, so the algorithm decides to wait for additional individuating information.

The next few words which the processor encounters are shown to the poet. When interpreting state (ii), the interpreter decided to wait for information to distinguish horse1 from horse2. But this restrictive clause is infelicitous: it refers to a poet which is not in the current discourse and must be accommodated. When the algorithm applies the query

horse(X), show(Y, Z, X), tns(Y, en), to(Y, P), poet(P)

it finds no matching candidates for the variable X. As above, the interpreter adds an annotation accommodating the definite description. It also applies the penalty accom_complex_description.

Had the restrictive relative clause been appropriate, e.g. had the sentence been

The horse raced past the barn fell.

the set of discourse elements satisfying the query

horse(X), race(Y, Z, X), tns(Y, en), past(Y, P), barn(P)

would have been a singleton set containing just one of the two horses. The algorithm would add the corresponding resolved annotation to the interpreter annotations list and apply no penalties.

Detecting the End of a Phrase

In this section I provide the rationale for the end-of-phrase mechanism used in the current implementation.

The definite reference resolution algorithm relies on the accurate signaling of the end of an NP. Without the ability to identify the boundaries of a noun phrase, the processor would be unable to distinguish, among the various assertions made of a semantic variable, those which are within the scope of the determiner from those which are not.

Since this algorithm fits squarely within the interpretation module, it does not have direct access to the syntactic representation, so identification of the end of the phrase cannot be performed simply by checking that a particular node or constituent is no longer on the right frontier of the emerging analysis. The end of a noun phrase must therefore be identified by the syntactic processor and passed to the interpreter using the only available datapath, namely the sense-semantics. Given the tremendous variety of NP structure in the world's languages, it is natural to place the burden of end-of-phrase detection with the language-particular grammar, not with the processor in general.

How can phrase-boundary detection be implemented in a CCG? In semantics of the usual sort, where a constituent is assigned a meaning term (or a 'logical form'), the mechanism of quantifier scope is available and nothing special is required. However, in the semantic-term-list approach which I have adopted here (see section ), scope is rather difficult to express in the sense-semantics. One way to implement phrase-closure detection in CCG is to disallow recursive postmodification of NPs and simply state in advance, in the lexical entry for a determiner or a noun, exactly what the constituents of the NP are. This is rather awkward, and may well be missing the generalization that post-head adjectivals apply recursively. (It is not clear to me whether restrictive adjectivals really can recurse, but they are commonly assumed to do so.) The other way is to use the narrowly constrained zero-morpheme scheme, as I have presented above. I use the same zero morpheme, eop, for clauses as well. This move is not forced by anything, and is adopted mostly for uniformity. It happens to play a convenient role in avoiding certain shortcomings which would otherwise arise from the way revealing is implemented in Prolog.

An Example

The processor consists of the components discussed above (competence grammar, control structure, parsing algorithm, and interpreter) as well as a state-adjudication algorithm. Before turning to the details of this final component, it would be best to illustrate the operation of the processor so far with an example. In this example, state adjudication should be thought of as working out 'by magic'. Below, I present a decision procedure for it.

Let us begin with the string

The poet read in the garden stank.

encountered out of context.

Before any words are processed, the parser starts with one initial state, whose buffer has one constituent:

htlsTCEeopEsTCElexiniti

The first word, the, is encountered. It has two lexical entries, corresponding to the original determiner category and the subject-type-raised determiner, respectively:

npseeopenNe

ssessennpseeopense

The nondeterministic reduce algorithm results in three states:

first state: the initial category and the non-type-raised category for the determiner

the init

tlsTCEeopEsTCE npseeopenNe

second state: the initial category and the type-raised category for the determiner

init the

tlsTCEeopEsTCE ssessennpseeopense

third state: the initial category and the type-raised determiner, combined

init the

tlsTCEeopEsTCE ssessennpseeopense

tlsseeopessennpseeopense

The first state (with the non-type-raised determiner) is ruled out by the second clause of the Buffer Admissibility Condition, which requires adjacent constituents to be ultimately combinable. The second state (type-raised but uncombined) is ruled out by the first clause, which requires that adjacent constituents not be immediately combinable. The third state is therefore the only one which the parser outputs. Since it does not have its head noun yet, the interpreter does not add any interpretations or penalties to this state. For the rest of this example, I ignore the initial state and pretend that the current state has the category

ssessennpseeopense

I also omit parser states which are ruled out by the Buffer Admissibility Condition.

The next word, poet, is encountered. It gives rise to

state

Buer ssessennpsgeeope

Semantics sub je thee p o ete

The parser also nondeterministically posits a zero morpheme following poet, yielding

state

Buer ssessennpsge

closede Semantics sub je thee p o ete phrase

In the closed state, because the definite phrase is closed, the interpreter accommodates a poet. In the unclosed state, the interpreter anticipates further restrictive modifiers, so it penalizes the state for having to accommodate a complex NP. The results are:

state

Buer ssessennpsgeeope

Semantics sub je thee p o ete

Interpretation

Penalties accom complex descriptione

state

Buer ssessennpsge

closede Semantics sub je theep o ete phrase

Interpretation accomep o ete

Penalties

Despite the penalty in the unclosed state, both states are maintained for now. Also, both states incur a penalty for having a new NP, the poet, in subject position. Because this penalty will apply to every state in the rest of this example, it will turn out to be irrelevant, so I omit it.

The next word, read, is many ways ambiguous. The untensed-verb reading and the present-tense non-3rd-person-singular reading are ruled out because their features conflict with the requirement imposed by the subject NP category. Three readings remain: past-tense intransitive, past-tense transitive, and past participle acting as head of a reduced relative clause. The first two combine with the closed state to yield two main-verb states. The third is added to the unclosed state to yield a reduced-relative state.



state

B sede

S sub je thee p o ete phrase closede readeee tnseed

I accomep o ete

P

state

B sedenpsse

S sub je thee p o ete phrase closede readeee tnseed

I accomep o ete

P

state

B ssessennpsgeeope

nsgennsgesssenssse

S sub je thee p o ete readeee tnseen npmo de

I

complex descriptione P accom



(Recall that the category of the intransitive state is actually tlsedeeope. So, in addition to this state, the processor constructs a further state in which it posits an end-of-phrase morpheme signaling the end of the main clause. That state has no continuation, and it disappears when the next word is encountered.)

Notice that the reduced-relative state satisfies the third of the ultimately-combinable conditions of the Buffer Admissibility Condition. That is, the category n is revealed from the right-hand edge of the first constituent, the poet, and this category may be modified by the second constituent, read, when the latter has received all of its arguments, namely the adverbial in the garden.

The next word, in, is four-way ambiguous, as shown in the table in section . Of these, only one category, sentential postmodifier, is not ruled out by the Buffer Admissibility Condition. The three states above then become the following three states, respectively:

state

B sede sedensedenpsse

closede readeee tnseed S sub je thee p o ete phrase

inee

I accome p o ete

P

state

B sedenpsse sedensedenpsse

closede readeee tnseed S sub je thee p o ete phrase

shiftedee inee h

I accome p o ete

P shifted past non givene

state

B ssessennpsgeeope nsgennsgenpsse

S sub je thee p o ete readeee tnseen npmo de inee

I

P accom complex descriptione

The transitive state incurs a penalty for heavy-NP shift past material which is not given in the discourse (see section ). The transitive and reduced-relative states are discarded because, while they each carry a penalty, the intransitive state does not. By discarding the reduced-relative state, the processor has resolved the main-verb/reduced-relative ambiguity of read, selecting the main-verb analysis. By discarding the transitive state, it has further committed to the intransitive use of this verb. The consequence of the latter commitment will be discussed below.

The word the then extends the surviving state:

state

B sede sedensedeeopense

closede readeee tnseed S sub je thee p o ete phrase

inee thee

I accome p o ete

P

The word garden leads to the familiar closure ambiguity, in the following two states:

state

B sede sedensedeeope

S sub je thee p o ete phrase closede readeee tnseed

inee thee gardene

I accomep o ete

complex descriptione P accom

state

B sede

closede readeee tnseed S sub je thee p o ete phrase

inee thee gardene phrase closede

I accomegardene accomep o ete

P

But neither state is compatible with the next word, stank. The lack of surviving states indicates the garden-path effect of the sentence.

A Consistent Theory of Penalties

I now construct an adjudication algorithm: a procedure to evaluate the set of processor states and, based on the kind and number of penalties which they have, decide which, if any, to discard. I begin by considering the examples listed in section , the penalties that the processor assigns each analysis, and the appropriate action at each moment. I then present one of many possible algorithms that fit these data.

Desired Behavior

For convenience, I refer to penalties by their number, as follows:

1. implausibility
2. underspecified_ref
3. overspecified_ref
4. accom_complex_description
5. new_subject
6. heavy_arg_light_modifier
7. shifted_past_non_given

One wife context
    The psychologist told the wife that he disliked Florida.
        that (relativizer)
        that (complementizer)        ok
    The psychologist told the wife that he disliked that he liked Florida.
        that (relativizer)           gp
        that (complementizer)

Two wife context
    The psychologist told the wife that he disliked that he liked Florida.
        that (relativizer)           ok
        that (complementizer)

Two wife context
    The psychologist told the wife that he disliked that he liked Florida.
        that (relativizer)
        that (complementizer)        gp

Out of context
    The poet read in the garden stank.
        main verb
        reduced relative             gp
    The poem read in the garden stank.
        main verb
        reduced relative             ok
    The bird found in the nest a nice juicy worm.
        reduced relative
        main verb                    ok
    The bird found in the nest died.
        reduced relative             ok
        main verb

Context: What did the bird find in the nest?
    The bird found in the nest a nice juicy worm.
        reduced relative
        main verb                    ok
    The bird found in the nest died.
        reduced relative             gp
        main verb

Context: Fred found a bird in a nest and Bill found one in the garden.
    The bird found in the nest a nice juicy worm.
        reduced relative
        main verb                    gp
    The bird found in the nest died.
        reduced relative             ok
        main verb

Out of context
    Without her contributions dwindled.
        determiner
        object pronoun               gp
    Without her contributions the charity failed.
        determiner                   ok
        object pronoun
    The poet said that the psychologist fell yesterday.
        low attachment               ok
        high attachment
    The poet said that the psychologist will fall yesterday.
        low attachment
        high attachment              gp/awkwardness

Fitting The Data

The simplest conceivable state-discarding algorithm is:

    If at least one state has no penalties,
    then discard every state that has one or more penalties.

There are two problems with this algorithm. The first has to do with the fact that not all penalties are of equal strength: one of the scenarios above shows that one penalty can be stronger than another. Using similar reasoning, one can deduce the following constraints among penalty strengths (s_i stands for the strength of penalty i):

scenario constraint provided

s s

ss

s s



These constraints underdetermine the ranking of the penalties with respect to strength. The following is one of many schemes which are consistent with the constraints. It uses two strength levels, the minimum possible.

penalty        strength (in number of points)

The second problem with the simplest algorithm is that of timing. Two of the scenarios show that sometimes a state which has a penalty is not discarded even when it is competing with one that has none. In these scenarios, information which arrives one or two words after the first detection of a penalty is brought to bear, and prevents discarding. This is in contrast with another scenario where, as soon as a penalty is detected, the offending state is discarded. I let each penalty type carry a grace period, an interval of time. When the penalty is detected, a countdown clock associated with it is started. The penalty is ignored until its clock reaches zero.

The scenarios above provide the following constraints on the assignment of grace periods, where g_i stands for the grace period of penalty i, measured in words:

scenario constraint provided

g

g

g

g g

g g

g g

g

One scenario provides no timing constraint, because the interaction between its two penalties occurs at the end of the sentence.

These constraints again underdetermine the grace periods. Here is one solution, which minimizes the grace period values:

penalty name                   strength    grace period
implausibility
underspecified_ref
overspecified_ref
accom_complex_description
new_subject
heavy_arg_light_modifier
shifted_past_non_given

(In fact, I am making a great simplification by treating all instances of a penalty as having the same strength. For example, implausibility is surely a graded judgement, as is the degree of complexity of accommodation.)

(Using the word as a measure of time is intended to be an approximation. Clearly the time course of processing function words is very different from that of processing long or novel content words. Given the currently available psycholinguistic evidence, only a crude timing analysis is possible at this time. But see Trueswell and Tanenhaus for some preliminary work at trying to understand the time course of the interaction of competing penalties, 'constraints' in their terms.)

The revised algorithm, then, is:

    For each state, let its penalty score be the sum of the strengths of all penalties whose grace periods have passed.
    Find the minimum score.
    Discard each state whose score is above the minimum.

It must be emphasized that the algorithm and parameters serve merely to demonstrate the consistency of the set of penalties listed in the beginning of this section, so the particular numbers, or even exactly what they measure, should not be construed as a proposed theory.
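The revised algorithm is small enough to sketch directly. In the Python illustration below, every strength and grace-period value is a made-up placeholder chosen only to exercise the code, not a value from this chapter, and the names (PARAMS, Penalty, score, discard) are hypothetical. Time is counted in words, as in the text.

```python
from dataclasses import dataclass
from typing import Dict, List

# (strength in points, grace period in words). Placeholder values, NOT the thesis's.
PARAMS = {
    "implausibility": (2, 0),
    "accom_complex_description": (1, 0),
    "shifted_past_non_given": (1, 2),
}


@dataclass
class Penalty:
    name: str
    detected_at: int          # index of the word at which the penalty was detected


def score(penalties: List[Penalty], now: int) -> int:
    """Sum the strengths of all penalties whose grace periods have passed."""
    total = 0
    for p in penalties:
        strength, grace = PARAMS[p.name]
        if now - p.detected_at >= grace:      # the countdown clock has reached zero
            total += strength
    return total


def discard(states: Dict[str, List[Penalty]], now: int) -> Dict[str, List[Penalty]]:
    """Keep only the states whose score equals the minimum score."""
    if not states:
        return {}
    scores = {name: score(ps, now) for name, ps in states.items()}
    best = min(scores.values())
    return {name: ps for name, ps in states.items() if scores[name] == best}


states = {
    "intransitive": [],
    "transitive": [Penalty("shifted_past_non_given", detected_at=3)],
    "reduced relative": [Penalty("accom_complex_description", detected_at=1)],
}
at_word_3 = discard(states, now=3)   # the shift penalty is still within its grace period
at_word_5 = discard(states, now=5)   # the grace period has expired
```

With these placeholder parameters the transitive state survives at word 3 but not at word 5, which is the kind of delayed discarding that grace periods are meant to allow.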

A Prediction

Despite the preliminary nature of the specifics of the state-discarding algorithm, it is nevertheless possible to derive an interesting prediction from the system as developed so far, in particular from the interaction of the choice of the theory of syntax and the state-discarding procedure.

The account presented here assumes penalties for heavy shift that is infelicitous in context, and for accommodating a complex NP. Recall the example above. The verb read has three categories: a reduced relative clause and two main-verb categories, transitive and intransitive. Consider what the account does when faced with each of the following sentences, out of context:

a. The poet read in the garden stank.
b. The poet read in the garden a lengthy article about Canadian earthworms.

In (a), the complex-NP accommodation penalty correctly excludes the reduced-relative analysis, resulting in a garden path. What is the predicted status of (b)? Given that the reduced-relative analysis is discarded, one would expect the main-verb analysis to persist. But CCG has two completely separate main-verb analyses. The transitive analysis requires heavy-NP shift, which is deemed infelicitous out of context. The surviving analysis is of the intransitive verb, and cannot cope with the shifted NP. So the account predicts a garden path in (b).

This prediction arises, of course, because of the lexicalized nature of CCG: every combinatory potential of a word is treated as a separate lexical entry. In other words, CCG does not distinguish small differences between categories (e.g. subcategorization) from major differences (e.g. main verb versus reduced relative clause).

So while CCG predicts a garden path for (b), a more traditional phrase-structure theory of grammar might not, depending on whether it distinguishes analyses on the basis of lexical subcategory. The garden-path status of (b) is an empirical question, and answering it necessitates teasing apart any processing difficulty associated with the infelicity of the heavy-NP shift from truly syntactic-parsing effects indicative of a garden path. It remains for future research.

Summary

Using the meaning-based criteria developed in the preceding chapters (referential felicity, felicity with respect to givenness, plausibility) and the parsing algorithm presented earlier, I have presented a simple model of the process of sentence comprehension. The point of this demonstration is to show that it is possible to construct a simple sentence processor which can account for a significant subset of the data available about when garden paths arise in English. This enterprise is largely successful: the data structures and algorithms needed by the top level of the architecture are obvious and straightforward. Complexity arises from the specific requirements necessitated by the grammar formalism, CCG, and by the scope of the state-discarding criterion. The latter is severely underdetermined by the available data.

The long-term goal of this work is to provide a detailed model of sentence processing, one which makes clear and testable predictions. While this is still a long way off, I have nevertheless shown that it is already possible to make some sort of predictions from the interaction of the various ingredients.

The program as described in this chapter is written in Quintus Prolog and is called arfi (Ambiguity Resolution From Interpretation). It is accompanied by a graphical user interface which provides an easy-to-use facility for inspecting the execution trace of the processor on a particular input string. The inspector program, called the viewer, is written for the X window system and requires Common LISP and the software package CLIM. arfi and viewer are available on the Internet by anonymous FTP. They are on the host ftp.cis.upenn.edu, in the directory pub/niv/thesis.

Chapter

Conclusion

Of the class of computational functions performed by the human language faculty, ranging from phonetics to the social activity of communication, I have considered two:

parsing: the application of the rules of syntax to identify the relations among the words in the input string

interpretation: the updating of the hearer's mental model of the ongoing discourse based on the sense-semantic relations recovered by the parsing process

I have argued for a particular view of the interaction between these two functions. First, I adopted the uncontroversial assumption (almost uncontroversial, see Marslen-Wilson and Tyler) that parsing and interpretation occur in separate modules and that these modules interact through well-specified channels: the parser sends nothing but sense-semantic representations to the interpreter, and the interpreter sends the parser nothing but feedback about the sensibleness of the various analyses.

Second, I adopted the more controversial assumption that the parser computes all of the grammatically licensed analyses of the string so far and sends them all to the interpreter for evaluation in parallel. I claimed that the parser does not provide its own ranking or evaluation of the analyses it constructs by applying structurally stated preference criteria; rather, all observable preferences among ambiguous readings stem from principles of linguistic competence: principles such as Grice's maxims of quantity and manner for evaluating the felicity of definite referring expressions, the competence principle in English to place high-information-volume constituents after low-volume ones, to use subject position for encoding given information, etc. I did not explore in detail the question of whether the parsing component applies some evaluation of the various analyses based on the strengths of various alternatives in the competence grammar. While verb subcategorization preferences (e.g. Ford, Bresnan and Kaplan; Trueswell, Tanenhaus and Kello) can be ascribed either to the lexical entries (part of grammatical competence) or to a deeper representation of the concepts attached to them, there are some preference phenomena which seem to necessitate grammatically specified strength of preference (Kroch). This issue remains for future research.

Third, I investigated the design of the parser. My aim was to identify the simplest design possible. I investigated Steedman's thesis that conceiving of syntactic knowledge of language as a Combinatory Categorial Grammar (CCG) allows one to construct a parser that is significantly simpler than what would be needed for traditional grammars, while still maintaining the input-output behavior necessary to function as the syntactic parsing module in the overall sentence-processing system design. It turned out that designing a parser for CCG runs into its own collection of complexities. Certain of these complexities (detection of inevitable ungrammaticality, detection of inevitable word-order non-canonicality in the form of crossing composition, identifying optional combinations, e.g. picture-noun extraction) can be elegantly handled by assuming that the ability to parse in adults is composed of an innate ability to put grammatical constituents together and an acquired skill of quickly anticipating the consequences of certain combinations. One final complexity, the interaction of CCG's derivational equivalence with the incremental analysis necessary for timely interpretation, necessitated two moves: assuming that the history of the derivation is properly a component of a grammatical analysis, and augmenting the repertory of the parser with an operation which explicates the interchangeability of derivationally equivalent analyses by manipulating the history of the derivation.

Fourth, I have constructed a computational simulation of the sentence comprehension process which allows one to evaluate the viability of the central claim of the dissertation: that the syntactic processor blindly and transparently computes all grammatical analyses, and ambiguity resolution is based on interpretation. This simulation serves as a computational platform for experimenting with various analysis-pruning strategies in the interpreter. It has shown that, at the moment, the available psycholinguistic data greatly underdetermine the precise strategy, but some empirical predictions do emerge.
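The division of labor argued for above can be illustrated with a toy word-by-word loop. This is only a sketch under loud assumptions: the function names (`comprehend`, `toy_extend`, `null_context`, `two_horse_context`), the tuple encoding of analyses, and the all-or-nothing pruning rule are hypothetical stand-ins, not the arfi implementation or its actual discarding criterion.

```python
def comprehend(words, extend, is_sensible):
    """Parser proposes every licensed analysis; interpreter only filters."""
    analyses = [()]  # a single empty analysis before any input
    for word in words:
        # Parser: blindly build every grammatical continuation.
        candidates = [new for old in analyses for new in extend(old, word)]
        # Interpreter: return nothing but a sensibleness verdict per analysis.
        analyses = [a for a in candidates if is_sensible(a)]
        if not analyses:
            return None  # all analyses discarded: a garden path
    return analyses


def toy_extend(analysis, word):
    """Hypothetical two-way ambiguity for 'raced' (toy grammar)."""
    if word == "raced":
        return [analysis + ("raced/main-verb",),
                analysis + ("raced/reduced-relative",)]
    if word == "fell":
        # 'fell' can only be the main verb if 'raced' was not.
        if "raced/main-verb" in analysis:
            return []
        return [analysis + ("fell/main-verb",)]
    return [analysis + (word,)]


def null_context(analysis):
    # Assumed pruning rule: with a single horse in the discourse,
    # the restrictive (reduced relative) reading is infelicitous.
    return "raced/reduced-relative" not in analysis


def two_horse_context(analysis):
    # Assumed: with two horses in the discourse, nothing is pruned.
    return True


sentence = "the horse raced past the barn fell".split()
print(comprehend(sentence, toy_extend, null_context))
print(comprehend(sentence, toy_extend, two_horse_context))
```

In this toy setting the same parser yields a garden path or a successful analysis depending only on what the interpreter is willing to keep, which is the shape of the central claim.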

The dissertation gives rise to three specific empirical questions:

1. Given the example dialog discussed earlier, which shows that discourse status can affect perceived information volume, just how much of the information volume (as operationalizable by observing attachment preferences) can be accounted for by discourse factors, and how much of it is irreducible to the form of the constituent: the amount of linguistic material and other syntactic attributes such as grammatical category?

2. Does Disconnectedness theory play any role at all in ambiguity resolution? To what extent is Avoid New Subjects really responsible for the data cited earlier? As discussed in detail earlier, if one were to rerun the second experiment reported by Stowe, ruling out the instrumental reading, and still get implausibility effects in the inanimate condition, one would have an empirical basis to rule in Avoid New Subjects and rule out Disconnectedness theory.

3. Is CCG correct in its equal treatment of major category and subcategory distinctions? That is, can the processor be made to commit to an intransitive analysis for a verb, thus garden-pathing on its direct object, as predicted earlier?

Note that it is logically possible to define a more extreme condition on a parsimonious parser. This condition would disallow operations and representations which are not strictly defined by the well-formedness rules of the grammar. Since CCG does not, strictly speaking, define well-formed analyses, only well-formed constituents, and since it does not explicitly relate equivalent derivations, this view of grammar is not compatible with the derivation-rewrite algorithm I have presented.

Appendix A

Data from Avoid New Subject investigation

Brown Corpus

                    Subjects                   Non-Subjects
givenness status    tc   rc   tcrc   matrix    tc   rc   tcrc   matrix
empty-category
pronoun
proper-name
definite
indefinite
not-classified
total

Wall Street Journal Corpus

                    Subjects                   Non-Subjects
givenness status    tc   rc   tcrc   matrix    tc   rc   tcrc   matrix
empty-category
pronoun
proper-name
definite
indefinite
not-classified
total

Nonzero cells for empty categories in non-post-ZERO-complementizer subjects are due to annotation errors in the corpus.

Appendix B

A Rewrite System for Derivations

In this appendix I define a formal system, DRS, a rewrite system for CCG derivations, as sketched earlier. I then prove that DRS preserves the semantics of a derivation, and show that it can form the basis of a correct and efficient algorithm for computing the right-frontier of a derivation.

Definition: Two derivations are equivalent just in case the categories of their respective roots are equal.

I now give the definition of DRS. DRS allows one to describe equivalence classes of derivations, and provides the means of picking out one representative from each equivalence class.

Given the set $D$ of valid derivations, define the relation $\rightarrow\ \subseteq D \times D$ to hold between a pair of derivations $(d_1, d_2)$ just in case exactly one application of one of the two derivation rewrite rules (the forward rule and the backward rule, given next) to some node in $d_1$ yields $d_2$.

Any subtree of a derivation which matches the left-hand side of either rule is called a redex. The result of replacing a redex by the corresponding right-hand side of a rule is called the contractum. A derivation is in normal form (NF) if it contains no redex.

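Forward rewrite rule:

\[
\dfrac{\dfrac{W/X : a \qquad X|Y_1\cdots|Y_{m-1}/Y_m : b}{W|Y_1\cdots|Y_{m-1}/Y_m : \mathbf{B}^m a\,b}\;{>}\mathbf{B}^m \qquad Y_m|Z_1\cdots|Z_n : c}{W|Y_1\cdots|Y_{m-1}|Z_1\cdots|Z_n : \mathbf{B}^n(\mathbf{B}^m a\,b)\,c}\;{>}\mathbf{B}^n
\]
\[ \longrightarrow \]
\[
\dfrac{W/X : a \qquad \dfrac{X|Y_1\cdots|Y_{m-1}/Y_m : b \qquad Y_m|Z_1\cdots|Z_n : c}{X|Y_1\cdots|Y_{m-1}|Z_1\cdots|Z_n : \mathbf{B}^n b\,c}\;{>}\mathbf{B}^n}{W|Y_1\cdots|Y_{m-1}|Z_1\cdots|Z_n : \mathbf{B}^{m+n-1} a\,(\mathbf{B}^n b\,c)}\;{>}\mathbf{B}^{m+n-1}
\]

Backward rewrite rule:

\[
\dfrac{\dfrac{Y_m|Z_1\cdots|Z_n : c \qquad X|Y_1\cdots|Y_{m-1}\backslash Y_m : b}{X|Y_1\cdots|Y_{m-1}|Z_1\cdots|Z_n : \mathbf{B}^n b\,c}\;{<}\mathbf{B}^n \qquad W\backslash X : a}{W|Y_1\cdots|Y_{m-1}|Z_1\cdots|Z_n : \mathbf{B}^{m+n-1} a\,(\mathbf{B}^n b\,c)}\;{<}\mathbf{B}^{m+n-1}
\]
\[ \longrightarrow \]
\[
\dfrac{Y_m|Z_1\cdots|Z_n : c \qquad \dfrac{X|Y_1\cdots|Y_{m-1}\backslash Y_m : b \qquad W\backslash X : a}{W|Y_1\cdots|Y_{m-1}\backslash Y_m : \mathbf{B}^m a\,b}\;{<}\mathbf{B}^m}{W|Y_1\cdots|Y_{m-1}|Z_1\cdots|Z_n : \mathbf{B}^n(\mathbf{B}^m a\,b)\,c}\;{<}\mathbf{B}^n
\]

For an overview of rewrite systems, the reader is referred to Le Chenadec.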
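Setting categories and semantic terms aside, both rewrite rules have the same shape on the derivation tree: a left-nested pair of compositions is re-bracketed as a right-nested pair. The following is a minimal Python sketch of that shape only. The encoding (nested pairs with strings as leaves) and the names `is_redex`, `contract`, `successors` and `in_normal_form` are illustrative assumptions, and the sketch assumes every internal node is a composition of the required directionality, which the grammar does not guarantee in general.

```python
def is_redex(t):
    """A subtree ((a, b), c) whose left child is itself a combination."""
    return isinstance(t, tuple) and isinstance(t[0], tuple)


def contract(t):
    """Rewrite the redex ((a, b), c) into its contractum (a, (b, c))."""
    (a, b), c = t
    return (a, (b, c))


def successors(t):
    """All derivations reachable from t by one rewrite at some node."""
    if not isinstance(t, tuple):
        return []
    left, right = t
    out = [contract(t)] if is_redex(t) else []
    out += [(new_left, right) for new_left in successors(left)]
    out += [(left, new_right) for new_right in successors(right)]
    return out


def in_normal_form(t):
    """A derivation is in normal form if it contains no redex."""
    return not successors(t)


print(successors((("a", "b"), "c")))       # a single redex, at the root
print(in_normal_form(("a", ("b", "c"))))   # right-branching tree
```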

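Lemma: $\rightarrow$ preserves equivalence of derivations.

proof: When the semantic terms from the roots of the left-hand derivation and the right-hand derivation are compared by applying each of them to sufficiently many arguments so that all $\mathbf{B}$ reductions take place, the results are identical:

\[
\begin{aligned}
\mathbf{B}^n(\mathbf{B}^m a\,b)\,c\,d_1\cdots d_{n+m-1}
 &= \mathbf{B}^m a\,b\,(c\,d_1\cdots d_n)\,d_{n+1}\cdots d_{n+m-1} \\
 &= a\,(b\,(c\,d_1\cdots d_n)\,d_{n+1}\cdots d_{n+m-1}) \\[4pt]
\mathbf{B}^{m+n-1} a\,(\mathbf{B}^n b\,c)\,d_1\cdots d_{n+m-1}
 &= a\,(\mathbf{B}^n b\,c\,d_1\cdots d_{n+m-1}) \\
 &= a\,(b\,(c\,d_1\cdots d_n)\,d_{n+1}\cdots d_{n+m-1})
\end{aligned}
\]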

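Let $\leftarrow$ be the converse of $\rightarrow$. Let $\leftrightarrow$ be $\rightarrow \cup \leftarrow$. Let $\rightarrow^{*}$ be the reflexive transitive closure of $\rightarrow$, and similarly $\leftarrow^{*}$ the reflexive transitive closure of $\leftarrow$, and $\leftrightarrow^{*}$ the reflexive transitive closure of $\leftrightarrow$.

Note that $\leftrightarrow^{*}$ is an equivalence relation, and that $d_1 \leftrightarrow^{*} d_2$ implies that $d_1$ and $d_2$ are equivalent, but the converse does not hold, because two categories could be accidentally equivalent (an odd property for a linguistic analysis to have, but a possible one nonetheless). The reader may verify that no other combinatory rules may be substituted for the composition rules in the forward rewrite rule and in the backward rewrite rule to yield a semantics-preserving derivation rewrite rule. In particular, the two combinatory rules must be of the same directionality.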

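Theorem: For a derivation with $n$ internal nodes, every sequence of applications of $\rightarrow$ is finite and is of length at most $n(n-1)/2$.

proof: Every derivation with $n$ internal nodes is assigned a non-negative integer score which is bounded by $n(n-1)/2$. An application of $\rightarrow$ is guaranteed to yield a derivation with a lower score. This is done by defining the functions weight and score for each node of the derivation as follows:

\[
\mathrm{weight}(x) =
\begin{cases}
0 & \text{if $x$ is a leaf node} \\
1 + \mathrm{weight}(\text{left-child}(x)) + \mathrm{weight}(\text{right-child}(x)) & \text{otherwise}
\end{cases}
\]

\[
\mathrm{score}(x) =
\begin{cases}
0 & \text{if $x$ is a leaf node} \\
\mathrm{score}(\text{left-child}(x)) + \mathrm{score}(\text{right-child}(x)) + \mathrm{weight}(\text{left-child}(x)) & \text{otherwise}
\end{cases}
\]

[Figure B.1: Schema for one redex in DRS. The redex $e$ has left child $d$ (with children $a$ and $b$) and right child $c$; the contractum $e'$ has left child $a$ and right child $f$ (with children $b$ and $c$).]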

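Each application of $\rightarrow$ decreases the score of the derivation. This follows from the monotonic dependency of the score of the root of the derivation upon the scores of each subderivation, and from the fact that locally, the score of a redex decreases when $\rightarrow$ is applied. In figure B.1 a derivation is schematically depicted with a redex whose subconstituents are named $a$, $b$, and $c$. Applying $\rightarrow$ reduces $\mathrm{score}(e)$, hence the score of the whole derivation.

In the redex:

\[
\begin{aligned}
\mathrm{weight}(d) &= \mathrm{weight}(a) + \mathrm{weight}(b) + 1 \\
\mathrm{score}(d) &= \mathrm{score}(a) + \mathrm{score}(b) + \mathrm{weight}(a) \\
\mathrm{score}(e) &= \mathrm{score}(d) + \mathrm{score}(c) + \mathrm{weight}(d) \\
 &= \mathrm{score}(a) + \mathrm{score}(b) + \mathrm{weight}(a) + \mathrm{score}(c) + \mathrm{weight}(a) + \mathrm{weight}(b) + 1 \\
 &= \mathrm{score}(a) + \mathrm{score}(b) + \mathrm{score}(c) + \mathrm{weight}(b) + 2\,\mathrm{weight}(a) + 1
\end{aligned}
\]

In the contractum:

\[
\begin{aligned}
\mathrm{score}(f) &= \mathrm{score}(b) + \mathrm{score}(c) + \mathrm{weight}(b) \\
\mathrm{score}(e') &= \mathrm{score}(a) + \mathrm{score}(f) + \mathrm{weight}(a) \\
 &= \mathrm{score}(a) + \mathrm{score}(b) + \mathrm{score}(c) + \mathrm{weight}(b) + \mathrm{weight}(a) \\
 &< \mathrm{score}(a) + \mathrm{score}(b) + \mathrm{score}(c) + \mathrm{weight}(b) + 2\,\mathrm{weight}(a) + 1
\end{aligned}
\]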
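The termination argument can be traced on concrete trees. The sketch below assumes the reading of the definitions in which a leaf has weight 0 and score 0 and each internal node adds 1 to the weight, and it encodes derivations as nested pairs with strings as leaves; the names `weight`, `score`, `contract` and `left_chain` are illustrative.

```python
def weight(x):
    """Number of internal nodes in x (a leaf weighs 0)."""
    if not isinstance(x, tuple):
        return 0
    return 1 + weight(x[0]) + weight(x[1])


def score(x):
    """Sum, over internal nodes, of the weight of the node's left child."""
    if not isinstance(x, tuple):
        return 0
    return score(x[0]) + score(x[1]) + weight(x[0])


def contract(e):
    """Rewrite the redex e = ((a, b), c) into e' = (a, (b, c))."""
    (a, b), c = e
    return (a, (b, c))


def left_chain(n):
    """Left-branching derivation with n internal nodes."""
    t = "w0"
    for i in range(1, n + 1):
        t = (t, f"w{i}")
    return t


a, b, c = ("p", "q"), "r", ("s", "t")
e = ((a, b), c)
# The local decrease is weight(a) + 1, as in the derivation above.
print(score(e) - score(contract(e)), weight(a) + 1)
# A left chain attains the maximal score n(n-1)/2.
print([score(left_chain(n)) for n in range(1, 6)])
```

On this reading the drop in score at a redex is exactly $\mathrm{weight}(a) + 1$, which is why rewriting where $a$ is large reaches normal form in fewer steps.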

I now show that $n(n-1)/2$ is also a lower bound on the length of the longest sequence of applications of $\rightarrow$.

A left-chain is either a leaf node or a derivation whose left-child is a left-chain and whose right-child is a leaf node. A right-chain is either a leaf node or a derivation whose right-child is a right-chain and whose left-child is a leaf node. A quasi-right-chain is a derivation whose right-child is a leaf and whose left-child is a right-chain.

[Figure B.2: Normal form computation of a quasi-right-chain.]

Lemma: A quasi-right-chain of $n$ internal nodes can be rewritten to a right-chain using $n-1$ applications of $\rightarrow$.

proof: At every point in the rewriting operation there is only one redex. This is depicted in figure B.2.

Lemma: A left-chain $C$ of $n$ internal nodes can be rewritten to a right-chain using a sequence of exactly $n(n-1)/2$ applications of $\rightarrow$.

proof: By induction on $n$.

$n = 1$: $C$ is already a right-chain; 0 steps are required.

$n = 2$: $C$ is a redex. One application of $\rightarrow$ rewrites it into a right-chain.

$n > 2$: Suppose this is true for all $m < n$. Apply $\rightarrow$ as follows. Rewrite the left-child of $C$ to a right-chain in $(n-1)(n-2)/2$ steps. The result of this is a quasi-right-chain of $n$ internal nodes, which can be rewritten to a right-chain using $n-1$ applications of $\rightarrow$. The total number of applications of $\rightarrow$ is

\[
\frac{(n-1)(n-2)}{2} + (n-1) = \frac{(n-1)(n-2) + 2(n-1)}{2} = \frac{n(n-1)}{2}
\]
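For small $n$ the bound can be confirmed by exhaustive search over all rewrite sequences. The sketch below treats derivations purely as binary trees (nested pairs with string leaves) and assumes every left-nested node is a redex; `successors`, `longest` and `left_chain` are illustrative names, not part of DRS.

```python
from functools import lru_cache


def successors(t):
    """All trees reachable from t by rewriting one redex ((a, b), c) -> (a, (b, c))."""
    if not isinstance(t, tuple):
        return []
    left, right = t
    out = []
    if isinstance(left, tuple):
        a, b = left
        out.append((a, (b, right)))
    out += [(new_left, right) for new_left in successors(left)]
    out += [(left, new_right) for new_right in successors(right)]
    return out


@lru_cache(maxsize=None)
def longest(t):
    """Length of the longest rewrite sequence starting at t."""
    nxt = successors(t)
    return 0 if not nxt else 1 + max(longest(s) for s in nxt)


def left_chain(n):
    """Left-branching derivation with n internal nodes."""
    t = "w0"
    for i in range(1, n + 1):
        t = (t, f"w{i}")
    return t


for n in range(1, 7):
    print(n, longest(left_chain(n)), n * (n - 1) // 2)
```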

A rewrite system is strongly normalizing (SN) iff every sequence of applications of $\rightarrow$ is finite.

Corollary: DRS is SN.

proof: Immediate corollary of the theorem above.

So far I have shown that nondeterministic computation of the right-branching NF of a derivation is quite tractable: quadratic in the size of the derivation. I will now show that this is so even on a deterministic machine.

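A rewrite system is Church-Rosser (CR) just in case

\[
\forall x, y.\ (x \leftrightarrow^{*} y) \supset \exists z.\ (x \rightarrow^{*} z \wedge y \rightarrow^{*} z)
\]

A rewrite system is Weakly Church-Rosser (WCR) just in case

\[
\forall x, y, w.\ (w \rightarrow x \wedge w \rightarrow y) \supset \exists z.\ (x \rightarrow^{*} z \wedge y \rightarrow^{*} z)
\]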

Lemma: DRS is WCR.

proof: Let $w$ be a derivation with two distinct redexes $x$ and $y$, yielding the two distinct derivations $w_1$ and $w_2$ respectively. There are a few possibilities:

case 1: $x$ and $y$ have no nodes in common. There are three subcases: $x$ could dominate $y$ (include $y$ as a subconstituent), $x$ could be dominated by $y$, or $x$ and $y$ could be incomparable with respect to dominance. Either way, it is clear that the order of application of $\rightarrow$ makes no difference.

case 2: $x$ and $y$ share nodes. Assuming that $x$ and $y$ are distinct, and without loss of generality that $y$ does not dominate $x$, we have the situation depicted in figure B.3. Note that all three internal nodes in figure B.3 are of the same combinatory rule, either ${>}\mathbf{B}$ or ${<}\mathbf{B}$. In this case there does exist a derivation $z$ such that $w_1 \rightarrow^{*} z$ and $w_2 \rightarrow^{*} z$. This is depicted in figure B.4.

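Lemma (Newman): $\mathrm{WCR} \wedge \mathrm{SN} \supset \mathrm{CR}$.

Theorem: DRS is CR.

proof: Follows from the two preceding lemmas (DRS is WCR, and Newman's lemma) and the corollary that DRS is SN.

Corollary: $\mathrm{CR} \supset \forall x, y.\ (x \leftrightarrow^{*} y \wedge x, y \text{ are NFs}) \supset x = y$.

[Figure B.3: When two redexes are not independent.]

[Figure B.4: Why DRS is weakly Church-Rosser. The arrows are annotated by the sub-structure to which they are applied.]

proof: By contradiction. Suppose CR holds, $x \leftrightarrow^{*} y$, $x$ and $y$ are NFs, and $x \neq y$. Then $\exists z.\ x \rightarrow^{*} z \wedge y \rightarrow^{*} z$. Given that $x$ is a NF, $x \rightarrow^{*} z$ entails that $x = z$. Similarly $y = z$. This contradicts the assumption that $x \neq y$.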

Therefore every DRS equivalence class contains exactly one NF derivation. It follows that any deterministic computational path of applying $\rightarrow$ will lead to the normal form.
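The uniqueness of normal forms can be observed directly on small trees. The sketch below enumerates every bracketing of five leaves and collects all normal forms reachable by any order of rewriting. It works on tree shape only (nested pairs with string leaves, every left-nested node assumed to be a redex), and the helper names `successors`, `normal_forms`, `all_trees` and `right_chain` are illustrative.

```python
from functools import lru_cache


def successors(t):
    """All trees reachable from t by rewriting one redex ((a, b), c) -> (a, (b, c))."""
    if not isinstance(t, tuple):
        return []
    left, right = t
    out = []
    if isinstance(left, tuple):
        a, b = left
        out.append((a, (b, right)))
    out += [(new_left, right) for new_left in successors(left)]
    out += [(left, new_right) for new_right in successors(right)]
    return out


@lru_cache(maxsize=None)
def normal_forms(t):
    """Every normal form reachable from t, whatever the order of rewriting."""
    nxt = successors(t)
    if not nxt:
        return frozenset([t])
    return frozenset(nf for s in nxt for nf in normal_forms(s))


def all_trees(leaves):
    """Every binary bracketing of the given leaf sequence."""
    if len(leaves) == 1:
        yield leaves[0]
        return
    for i in range(1, len(leaves)):
        for left in all_trees(leaves[:i]):
            for right in all_trees(leaves[i:]):
                yield (left, right)


def right_chain(leaves):
    """The right-branching bracketing of the leaf sequence."""
    t = leaves[-1]
    for leaf in reversed(leaves[:-1]):
        t = (leaf, t)
    return t


leaves = ("a", "b", "c", "d", "e")
print(all(normal_forms(t) == frozenset([right_chain(leaves)])
          for t in all_trees(leaves)))
```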

As for the efficiency of computing the right-branching NF for a derivation of $n$ internal nodes, the theorem above shows that every sequence of applications of $\rightarrow$ is at most $n(n-1)/2$ steps long. This is the worst case, which arises from applying $\rightarrow$ as far away from the root as possible. An inspection of the definition of score suggests that applying $\rightarrow$ as close to the root as possible yields the largest decrease in score, since $\mathrm{weight}(a)$ is maximized. In fact, in case it is always grammatically possible to apply $\rightarrow$ to the redex closest to the root, every derivation has a CTR (closest to root) rewrite sequence of length $O(n)$. The proof requires defining a function which measures the number of CTR rewrite steps that a derivation requires to reach NF. Let us first define the function $\rightarrow_{ctr}$, which applies $\rightarrow$ once to the redex closest to the root of its argument:

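\[
\rightarrow_{ctr}(x) =
\begin{cases}
x & \text{if $x$ is a leaf node} \\
\mathrm{combine}(\text{left-child}(x),\ \rightarrow_{ctr}(\text{right-child}(x))) & \text{if left-child}(x) \text{ is a leaf} \\
\mathrm{combine}(\text{left-child}(\text{left-child}(x)),\ \mathrm{combine}(\text{right-child}(\text{left-child}(x)),\ \text{right-child}(x))) & \text{otherwise}
\end{cases}
\]

Let $\mathrm{cost}(x)$ be the number of times that $\rightarrow_{ctr}$ must be iterated on $x$ so as to yield an NF. The definition of $\rightarrow_{ctr}$ defines $\mathrm{cost}(x)$ by the following recurrence equations:

\[
\mathrm{cost}(x) =
\begin{cases}
0 & \text{if $x$ is a leaf node} \\
\mathrm{cost}(\text{right-child}(x)) & \text{if left-child}(x) \text{ is a leaf} \\
1 + \mathrm{cost}(\mathrm{combine}(\text{left-child}(\text{left-child}(x)),\ \mathrm{combine}(\text{right-child}(\text{left-child}(x)),\ \text{right-child}(x)))) & \text{otherwise}
\end{cases}
\]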

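Observe that in the third case, subsequent applications of $\rightarrow_{ctr}$ will process all of left-child(left-child($x$)), then proceed to process right-child(left-child($x$)), and finally process right-child($x$). This is illustrated in figure B.5.

[Figure B.5: One application of $\rightarrow_{ctr}$.]

The cost of doing these three steps can be accounted for separately:

\[
\mathrm{cost}(x) = 1 + \mathrm{cost}(\mathrm{combine}(\text{left-child}(\text{left-child}(x)), l)) + \mathrm{cost}(\mathrm{combine}(\text{right-child}(\text{left-child}(x)), l)) + \mathrm{cost}(\text{right-child}(x))
\]

where $l$ is a dummy leaf node.

It is now possible to prove by induction on the derivation tree that

\[
\mathrm{cost}(x) = \#x - (\text{the depth of the rightmost leaf in } x)
\]

where $\#D$ is the number of internal nodes in the derivation $D$.

Base cases:

$x$ is a leaf: $\mathrm{cost}(x) = 0 = 0 - 0$.

$x$ has one internal node: $\mathrm{cost}(x) = 0 = 1 - 1$.

Induction:

Suppose the claim is true for all trees of fewer internal nodes than $x$. Let $n = \#x$ and let $d$ be the depth of the rightmost leaf of $x$.

Case 1: left-child($x$) is a leaf. $\mathrm{cost}(x) = \mathrm{cost}(\text{right-child}(x)) = (n-1) - (d-1) = n - d$.

Case 2: left-child($x$) is not a leaf. Let $a$, $b$, $c$ be left-child(left-child($x$)), right-child(left-child($x$)), and right-child($x$), respectively. Note that $\#x = \#a + \#b + \#c + 2$.

\[
\begin{aligned}
\mathrm{cost}(x) &= 1 + \mathrm{cost}(\mathrm{combine}(a, l)) + \mathrm{cost}(\mathrm{combine}(b, l)) + \mathrm{cost}(c) \\
 &= 1 + \#a + \#b + \#c - (\text{depth of rightmost leaf in } c) \\
 &= 1 + (\#x - 2) - (d - 1) \\
 &= \#x - d
\end{aligned}
\]

So while the worst-case sequence of applications of $\rightarrow$ is quadratic in the size of the derivation, the best case possible, as long as the grammar allows it, is linear.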
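The closest-to-root strategy and its linear cost can be sketched as follows. As before, derivations are reduced to their tree shape (nested pairs with string leaves), every left-nested node is assumed to be a legal redex, and the names `ctr`, `cost`, `internal_nodes`, `rightmost_depth` and `all_trees` are illustrative. The final check compares the iterated step count against the closed form (internal nodes minus depth of the rightmost leaf) for every bracketing of six leaves.

```python
def ctr(x):
    """Apply the rewrite once, at the redex closest to the root (if any)."""
    if not isinstance(x, tuple):
        return x
    left, right = x
    if not isinstance(left, tuple):
        return (left, ctr(right))      # root is not a redex: descend rightwards
    a, b = left
    return (a, (b, right))             # root is a redex: re-bracket it


def cost(x):
    """Number of ctr steps needed to reach normal form."""
    steps = 0
    while True:
        y = ctr(x)
        if y == x:
            return steps
        x, steps = y, steps + 1


def internal_nodes(x):
    if not isinstance(x, tuple):
        return 0
    return 1 + internal_nodes(x[0]) + internal_nodes(x[1])


def rightmost_depth(x):
    """Depth of the rightmost leaf, with the root at depth 0."""
    depth = 0
    while isinstance(x, tuple):
        x, depth = x[1], depth + 1
    return depth


def all_trees(leaves):
    """Every binary bracketing of the given leaf sequence."""
    if len(leaves) == 1:
        yield leaves[0]
        return
    for i in range(1, len(leaves)):
        for left in all_trees(leaves[:i]):
            for right in all_trees(leaves[i:]):
                yield (left, right)


leaves = ("a", "b", "c", "d", "e", "f")
print(all(cost(t) == internal_nodes(t) - rightmost_depth(t)
          for t in all_trees(leaves)))
```

Each step of this strategy pushes the rightmost leaf one level deeper, so the number of steps can never exceed the number of internal nodes.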

Bibliography

Abbott, Barbara. Right node raising as a test for constituenthood. Linguistic Inquiry.

Adams, Beverly C., Charles E. Clifton, Jr., and Don C. Mitchell. Lexical guidance in sentence parsing.

Aho, Alfred and S. C. Johnson. LR parsing. ACM Computing Surveys.

Altmann, Gerry T., Alan Garnham, and Yvette Dennis. Avoiding the garden path: Eye movements in context. Journal of Memory and Language.

Altmann, Gerry T. and Mark J. Steedman. Interaction with context during human sentence processing. Cognition.

Aone, Chinatsu and Kent Wittenburg. Zero-morphemes in unification-based combinatory categorial grammar. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Bach, Emmon, Colin Brown, and William D. Marslen-Wilson. Crossed and nested dependencies in Dutch and German: A psycholinguistic study. Language and Cognitive Processes.

Bever, Tom G. The cognitive basis for linguistic structures. In John R. Hayes (Ed.), Cognition and the Development of Language. New York: Wiley.

Boland, Julie, Michael Tanenhaus, Greg Carlson, and Susan Garnsey. Lexical projection and the interaction of syntax and semantics in parsing. Journal of Psycholinguistic Research.

Brill, Eric. A simple rule-based part of speech tagger. In Proceedings of the Third Conference on Applied Computational Linguistics, Trento, Italy.

Chomsky, Noam. Lectures on Government and Binding. Dordrecht: Foris.

Chomsky, Noam. Knowledge of Language: Its Nature, Origin and Use. New York: Praeger.

Chomsky, Noam and George Miller. Introduction to the formal analysis of natural languages. In R. Duncan Luce, Robert R. Bush, and Eugene Galanter (Eds.), Handbook of Mathematical Psychology, Vol. II. New York: Wiley.

Church, Kenneth W. A stochastic parts program and noun phrase parser for unrestricted texts. In Proceedings of the Second Conference on Applied Natural Language Processing.

Connine, Cynthia, Fernanda Ferreira, Charlie Jones, Charles Clifton, and Lyn Frazier. Verb frame preferences: Descriptive norms. Journal of Psycholinguistic Research.

Cowper, Elizabeth A. Constraints on Sentence Complexity: A Model for Syntactic Processing. PhD thesis, Brown University, Providence, Rhode Island.

Crain, Stephen and Janet D. Fodor. How can grammars help parsers? In David Dowty, Lauri Karttunen, and Arnold Zwicky (Eds.), Natural Language Parsing: Psychological, Computational and Theoretical Perspectives. Cambridge University Press.

Crain, Stephen and Mark J. Steedman. On not being led up the garden path: the use of context by the psychological syntax processor. In David Dowty, Lauri Karttunen, and Arnold Zwicky (Eds.), Natural Language Parsing: Psychological, Computational and Theoretical Perspectives. Cambridge University Press.

Curry, Haskell B. and Robert Feys. Combinatory Logic, Vol. I. Amsterdam: North Holland.

Davidson, Donald. The logical form of action sentences. In Nicholas Rescher (Ed.), The Logic of Decision and Action. University of Pittsburgh Press.

Dowty, David. Type raising, functional composition, and non-constituent conjunction. In Richard T. Oehrle, Emmon Bach, and Deirdre Wheeler (Eds.), Categorial Grammars and Natural Language Structures. Reidel.

Eady, J. and Janet D. Fodor. Is center-embedding a source of processing difficulty? Presented at the Linguistic Society of America Annual Meeting.

Earley, Jay. An efficient context-free parsing algorithm. Communications of the Association for Computing Machinery.

Ferreira, Fernanda and John M. Henderson. Use of verb information in syntactic parsing: Evidence from eye movements and word-by-word self-paced reading. Journal of Experimental Psychology: Learning, Memory and Cognition.

Fodor, Janet D. Empty categories in sentence processing. Language and Cognitive Processes, SI.

Fodor, Jerry, Thomas G. Bever, and Merrill Garrett. The Psychology of Language: An Introduction to Psycholinguistics and Generative Grammar. New York: McGraw-Hill.

Ford, Marilyn, Joan Bresnan, and Ronald Kaplan. A competence-based theory of syntactic closure. In Joan Bresnan (Ed.), The Mental Representation of Grammatical Relations. MIT Press.

Frank, Robert E. Syntactic Locality and Tree-Adjoining Grammar: Grammatical, Acquisition and Processing Perspectives. PhD thesis, University of Pennsylvania.

Frazier, Lyn. On Comprehending Sentences: Syntactic Parsing Strategies. PhD thesis, University of Massachusetts.

Frazier, Lyn. Exploring the architecture of the language-processing system. In Gerry T. Altmann (Ed.), Cognitive Models of Speech Processing. MIT Press.

Frazier, Lyn and Janet D. Fodor. The sausage machine: A new two-stage parsing model. Cognition.

Frazier, Lyn and Keith Rayner. Making and correcting errors during sentence comprehension: Eye movements in the analysis of structurally ambiguous sentences. Cognitive Psychology.

Garnsey, Susan M., Melanie A. Lotocky, and George W. McConkie. Verb-usage knowledge in sentence comprehension. November.

Garnsey, Susan M., Michael K. Tanenhaus, and Robert M. Chapman. Evoked potentials and the study of sentence comprehension. Journal of Psycholinguistic Research.

Geach, Peter. A Program for Syntax. Synthese. Reidel.

Gibson, Edward A. F. A Computational Theory of Human Linguistic Processing: Memory Limitations and Processing Breakdown. PhD thesis, Carnegie Mellon University, May.

Gorrell, Paul. Establishing the loci of serial and parallel effects in syntactic processing. Journal of Psycholinguistic Research.

Grice, H. Paul. Logic and conversation. In Peter Cole and Jerry Morgan (Eds.), Syntax and Semantics III: Speech Acts. New York: Academic Press.

Haddock, Nick. Incremental interpretation and combinatory categorial grammar. In Proceedings of the International Joint Conference on Artificial Intelligence.

Haddock, Nick. Incremental Semantics and Interactive Syntactic Processing. PhD thesis, University of Edinburgh.

Haviland, Susan E. and Herbert H. Clark. What's new? Acquiring new information as a process in comprehension. Journal of Verbal Learning and Verbal Behavior.

Hawkins, John. A parsing theory of word order universals. Linguistic Inquiry.

Hepple, Mark. Methods for parsing combinatory grammars and the spurious ambiguity problem. Master's thesis, University of Edinburgh.

Hepple, Mark. Efficient incremental processing with categorial grammar. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Hepple, Mark and Glyn Morrill. Parsing and derivational equivalence. In Proceedings of the Annual Meeting of the European Chapter of the Association for Computational Linguistics.

Hickok, Greg. Parallel parsing: Evidence from reactivation in garden-path sentences. Journal of Psycholinguistic Research, in press.

Hindle, Don and Mats Rooth. Structural ambiguity and lexical relations. Computational Linguistics.

Hobbs, Jerry, Mark Stickel, Douglas Appelt, and Paul Martin. Interpretation as abduction. Artificial Intelligence Journal.

Hobbs, Jerry R. Ontological promiscuity. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Holmes, Virginia M., Alan Kennedy, and Wayne S. Murray. Syntactic structure and the garden path. Quarterly Journal of Experimental Psychology, A.

Holmes, Virginia M., Laurie Stowe, and Linda Cupples. Lexical expectations in parsing complement-verb sentences. Journal of Memory and Language.

Joshi, Aravind K. Tree adjoining grammars: How much context-sensitivity is required to provide reasonable structural description? In David Dowty, Lauri Karttunen, and Arnold Zwicky (Eds.), Natural Language Parsing: Psychological, Computational and Theoretical Perspectives. New York: Cambridge University Press.

Joshi, Aravind K. Processing crossed and nested dependencies: an automaton perspective on the psycholinguistic results. Language and Cognitive Processes.

Joshi, Aravind K. TAGs in categorial clothing. Presented at the Meeting on Mathematics of Language (MOL), November.

Joshi, Aravind K. and Yves Schabes. Tree-adjoining grammars and lexicalized grammars. In Maurice Nivat and Andreas Podelski (Eds.), Definability and Recognizability of Sets of Trees. Elsevier.

Juliano, Cornell and Michael K. Tanenhaus. Contingent frequency effects in syntactic ambiguity resolution. In Proceedings of the Annual Conference of the Cognitive Science Society. Lawrence Erlbaum Associates, June.

Karttunen, Lauri. Radical lexicalism. In Mark Baltin and Anthony S. Kroch (Eds.), Alternative Conceptions of Phrase Structure. Chicago: University of Chicago Press.

Kennedy, Alan, Wayne S. Murray, Francis Jennings, and Claire Reed. Parsing complements: Comments on the principle of minimal attachment. Language and Cognitive Processes, SI.

Kimball, John. Seven principles of surface structure parsing in natural language. Cognition.

König, Esther. Parsing as natural deduction. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, June.

Kroch, Anthony S. Reflexes of grammar in patterns of language change. Language Variation and Change.

Ladd, D. Robert. Compound prosodic domains. Technical report, University of Edinburgh, February.

Lambek, J. The mathematics of sentence structure. American Mathematical Monthly.

Le Chenadec, Philippe. Canonical Forms in Finitely Presented Algebras. Wiley.

MacDonald, Maryellen, Adam Just, and Patricia Carpenter. Working memory constraints on the processing of syntactic ambiguity. Cognitive Psychology.

Marcus, Mitchell. A Theory of Syntactic Recognition for Natural Language. Cambridge, Mass.: MIT Press.

Marcus, Mitchell, Donald Hindle, and Margaret Fleck. D-theory: Talking about talking about trees. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Cambridge, Mass.

Marslen-Wilson, William and Lorraine K. Tyler. The temporal structure of spoken language understanding. Cognition.

Marslen-Wilson, William and Lorraine K. Tyler. Against modularity. In Jay Garfield (Ed.), Modularity in Knowledge Representation and Natural Language Understanding. MIT Press.

May, Robert. Logical Form: Its Structure and Derivation. Cambridge, Mass.: MIT Press.

McClelland, Jay L., Mark St. John, and Roman Taraban. Sentence comprehension: A parallel distributed processing approach. Language and Cognitive Processes, SI.

Mitchell, Don C. Lexical guidance in human parsing: Locus and processing characteristics. In Max Coltheart (Ed.), Attention and Performance XII: The Psychology of Reading. Hillsdale, NJ: Lawrence Erlbaum Associates.

Mitchell, Don C., Martin M. B. Corley, and Alan Garnham. Effects of context in human sentence parsing: Evidence against a discourse-based proposal mechanism. Journal of Experimental Psychology: Learning, Memory and Cognition.

Moore, Robert C. Unification-based semantic interpretation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, June.

Moortgat, Michael. Categorial Investigations. Dordrecht: Foris.

Morrill, Glyn V. Extraction and Coordination in Phrase Structure Grammar and Categorial Grammar. PhD thesis, University of Edinburgh.

Newman, M. H. A. On theories with a combinatorial definition of "equivalence". Annals of Mathematics. Princeton University Press.

Nicol, Janet L. and Martin J. Pickering. Processing syntactically ambiguous sentences: Evidence from semantic priming. Journal of Psycholinguistic Research.

Pareschi, Remo and Mark J. Steedman. A lazy way to chart parse with combinatory grammars. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Park, Jong C. A unification-based semantic interpretation for coordinate constructs. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Pearlmutter, Neal J. and Maryellen C. MacDonald. Plausibility and syntactic ambiguity resolution. In Proceedings of the Annual Conference of the Cognitive Science Society. Lawrence Erlbaum Associates.

Pereira, Fernando C. N. A new characterization of attachment preferences. In David Dowty, Lauri Karttunen, and Arnold Zwicky (Eds.), Natural Language Parsing: Psychological, Computational and Theoretical Perspectives. Cambridge University Press.

Pereira, Fernando C. N. and Stuart M. Shieber. Prolog and Natural-Language Analysis. Stanford: CSLI.

Prince, Ellen F. Toward a taxonomy of given-new information. In P. Cole (Ed.), Radical Pragmatics. New York: Academic Press.

Pritchett, Bradley L. Garden Path Phenomena and the Grammatical Basis of Language Processing. PhD thesis, Harvard University.

Pritchett, Bradley L. Garden path phenomena and the grammatical basis of language processing. Language.

Pritchett, Bradley L. Grammatical Competence and Parsing Performance. Chicago: University of Chicago Press.

Quine, Willard V. O. Variables explained away. In Selected Logic Papers. New York: Random House.

Rambow, Owen. (a) A linguistic and computational analysis of the German Third Construction. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Rambow, Owen. (b) Natural language syntax and formal systems. Dissertation Proposal, University of Pennsylvania.

Rambow, Owen and Aravind Joshi. A processing model for free word order languages. Presented at the CUNY Sentence Processing Conference.

Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik. A Comprehensive Grammar of the English Language. London: Longman.

Rayner, Keith, Marcia Carlson, and Lyn Frazier. The interaction of syntax and semantics during sentence processing: Eye movements in the analysis of semantically biased sentences. Journal of Verbal Learning and Verbal Behavior.

Rayner, Keith and Lyn Frazier. Parsing temporarily ambiguous complements. Quarterly Journal of Experimental Psychology, A.

Resnik, Philip R. Selection and Information. PhD thesis, University of Pennsylvania.

Sag, Ivan, Gerald Gazdar, Thomas Wasow, and Steven Weisler. Coordination and how to distinguish categories. Natural Language and Linguistic Theory.

Schubert, Lenhart. Are there preference tradeoffs in attachment decisions? In Proceedings of the Fifth National Conference on Artificial Intelligence.

Sedivy, Julie and Michael J. Spivey-Knowlton. The effect of NP definiteness on parsing attachment ambiguity. In Proceedings of the Annual Meeting of the North East Linguistics Society.

Shastri, Lokendra and Venkat Ajjanagadde. From simple associations to systematic reasoning: A connectionist representation of rules, variables and dynamic bindings using temporal synchrony. Behavioral and Brain Sciences.

Shieber, Stuart M. Sentence disambiguation by a shift-reduce parsing technique. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Shieber, Stuart M. and Mark Johnson. Variations on incremental interpretation. Journal of Psycholinguistic Research, to appear.

Shieber, Stuart M. and Yves Schabes. Synchronous tree adjoining grammars. In Proceedings of the International Conference on Computational Linguistics.

Spivey-Knowlton, Michael J., John C. Trueswell, and Michael K. Tanenhaus. Context effects in syntactic ambiguity resolution: Discourse and semantic influences in parsing reduced relative clauses. Canadian Journal of Psychology.

Steedman, Mark. Dependency and coordination in the grammar of Dutch and English. Language.

Steedman, Mark J. Combinatory grammars and parasitic gaps. Natural Language and Linguistic Theory.

Steedman, Mark J. Gapping as constituent coordination. Linguistics and Philosophy.

Steedman, Mark J. Structure and intonation. Language.

Steedman, Mark J. Surface structure. Technical Report MS-CIS, University of Pennsylvania.

Steedman, Mark J. Grammars and processors. In Hans Kamp and Christian Rohrer (Eds.), Aspects of Computational Linguistics. Springer Verlag, to appear.

Stowe, Laurie A. Thematic structures and sentence comprehension. In Greg N. Carlson and Michael K. Tanenhaus (Eds.), Linguistic Structure in Language Processing. Kluwer Academic Press.

Swinney, David A., Marilyn Ford, Uli Frauenfelder, and Joan Bresnan. On the temporal course of gap-filling and antecedent assignment during sentence comprehension. In Barbara Grosz, Ronald Kaplan, M. Macken, and Ivan Sag (Eds.), Language Structure and Processing. Stanford: CSLI.

Szabolcsi, Anna. ECP in categorial grammar. Max Planck Institute, Nijmegen.

Trueswell, John C. and Michael K. Tanenhaus. Tense, temporal context and syntactic ambiguity resolution. Language and Cognitive Processes.

Trueswell, John C. and Michael K. Tanenhaus. Consulting temporal context during sentence comprehension: Evidence from the monitoring of eye movements in reading. In Proceedings of the Annual Conference of the Cognitive Science Society.

Trueswell, John C., Michael K. Tanenhaus, and Susan M. Garnsey. Semantic influences on parsing: Use of thematic role information in syntactic ambiguity resolution. Manuscript, University of Rochester.

Trueswell, John C., Michael K. Tanenhaus, and Christopher Kello. Verb-specific constraints in sentence processing: Separating effects of lexical preferences from garden-paths. Journal of Experimental Psychology: Learning, Memory and Cognition, in press.

Wall, Robert and Kent Wittenburg. Predictive normal forms for function composition in categorial grammars. In Proceedings of the International Workshop on Parsing Technologies.

Weinberg, Amy. A parsing theory for the nineties: Minimal commitment. Manuscript, University of Maryland.

Weischedel, Ralph, Damaris Ayuso, Robert Bobrow, Sean Boisen, Robert Ingria, and Jeff Palmucci. Partial parsing: A report on work in progress. In Proceedings of the DARPA Speech and Natural Language Workshop.

Whittemore, Greg, Kathleen Ferrara, and Hans Brunner. Empirical study of predictive powers of simple attachment schemes for post-modifier prepositional phrases. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Wilks, Yorick. Right-attachment and preference semantics. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Wittenburg, Kent. Natural Language Parsing with Combinatory Categorial Grammar in a Graph-Unification-Based Formalism. PhD thesis, University of Texas.

Wright, B. and Merrill Garrett. Lexical decision in sentences: Effects of syntactic structure. Memory and Cognition.