
Croatian Journal of Philosophy Vol. XI, No. 31, 2011

Psychological and Computational Models of Comprehension: In Defense of the Psychological Reality of Grammars

DAVID PEREPLYOTCHIK
Department of Philosophy, Baruch College, CUNY

In this paper, I argue for a modified version of what Devitt (2006) calls the Representational Thesis (RT). According to RT, syntactic rules or principles are psychologically real, in the sense that they are represented in the mind/brain of every linguistically competent speaker/hearer. I present a range of behavioral and neurophysiological evidence for the claim that the human processing mechanism constructs mental representations of the syntactic properties of linguistic stimuli. I then survey a range of psychologically plausible computational models of comprehension and show that they are all committed to RT. I go on to sketch a framework for thinking about the nature of the representations involved in sentence processing. My claim is that these are best characterized not as propositional attitudes but, rather, as subpersonal states. Moreover, the representational properties of these states are determined by their functional role, not solely by their causal or nomological relations to mind-independent objects and properties. Finally, I distinguish between explicit and implicit representations and argue, contra Devitt (2006), that the latter can be drawn on “as data” by the algorithms that constitute our sentence processing routines. I conclude that Devitt’s skepticism concerning the psychological reality of grammars cannot be sustained.

Key words: psychological reality of language, mental representation, sentence processing, computational parsing models, personal and subpersonal states

§1. Introduction

Michael Devitt’s book, Ignorance of Language, marked a new chapter in the debate concerning the psychological reality of language. Like language itself, the debate is multifaceted, bringing into the fold core issues in epistemology, metaphysics, and psychology. Devitt’s discussion is comprehensive; he covers much ground and makes bold claims about the epistemic status of linguistic intuitions, the ontology of linguistic theory, and the proper characterization of innateness, inter alia. It’s hardly surprising, then, that Devitt’s critics have challenged his views from a wide variety of angles. For instance, Culbertson and Gross (2009) mount a critique of Devitt’s views on intuitions, and Rey (2006, 2008) challenges Devitt’s quasi-nominalism regarding linguistic entities like words and sentences. Similarly, Pietroski (2008) argues that Devitt’s position is at odds with well-founded claims about language acquisition, and Cain (2010) sheds doubt on Devitt’s characterization of public language conventions.

Joining the fray, I tackle what I take to be the main thesis of Ignorance of Language, viz., that the rules or principles of grammar are not mentally represented. In contrast to the commentators mentioned above, my approach will center primarily on the results emerging from psycholinguistic research into the character of human parsing and comprehension. I therefore leave to one side debates about intuitions, ontology, and innateness. My claim is that, however those debates turn out, Devitt’s main thesis cannot be sustained. I’ll argue that our best psycholinguistic theories of comprehension are committed to the claim that the rules or principles of some grammar are psychologically real, in the sense of being represented in the minds/brains of linguistically competent individuals.

Devitt (2006) formulates what he calls the Representational Thesis (RT) as follows:

(RT) A speaker of a language stands in an unconscious or tacit propositional attitude to the rules or principles of the language, which are represented in her language faculty (p. 4).
It’s difficult to say whether any psycholinguistic theory is committed to RT. As stated, the thesis contains several technical terms, the import of which is presently in dispute. In particular, it is a matter of live debate what exactly a propositional attitude is,1 what it takes for a propositional attitude to be held tacitly,2 and what it is for a propositional attitude to be nonconscious.3 Further, there is widespread disagreement regarding the boundaries of the “language faculty,” and, indeed, regarding the very existence of such a faculty, and of mental “faculties” more generally.4 Finally, while a number of theories of representation, intentional content, reference, aboutness, and meaning are currently on offer,5 it’s not at all clear which, if any, is applicable to the case here in question—i.e., the relation between a speaker/hearer and the grammar of his or her language. I will address many of these issues in §4, where I conclude that a suitably modified version of RT enjoys an abundance of empirical support and is likely to remain a core commitment of any viable research program in psycholinguistics well into the future.

I develop my case for this conclusion in two steps. The first step is to show that, in the course of comprehension, the human sentence processing mechanism (henceforth, the HSPM) constructs what I will call mental phrase markers—i.e., causally active mental representations of the syntactic structures of linguistic stimuli. I secure this premise in §2 by appeal to evidence from priming studies, EEG experiments, and various types of garden-path effect. The second step is to show that the parsing routines that deliver these representations draw on mental representations of grammatical rules or principles. To support this premise, in §3 I survey a range of psychologically plausible computational parsing models, and examine their commitments regarding what data structures are necessary for an effective parsing procedure.

1 For an account of the structure of propositional attitudes that departs substantially from the influential view developed by J. A. Fodor (1987), see Cummins (1996).
2 An early discussion of tacit knowledge appears in J. A. Fodor (1968). A much more helpful approach to issues concerning this difficult notion can be found in Davies (1987, 1989, 1995). I draw heavily on this work in §4.
3 Philosophers working in the field of consciousness studies tend to focus exclusively on qualitative states. For a lucid discussion of consciousness as it pertains to propositional attitudes, see Rosenthal (2005).

§2. The Psychological Reality of Mental Phrase Markers

The claim that the HSPM constructs mental phrase markers during comprehension underlies virtually all of the work in contemporary psycholinguistics. Here are two typical statements of it by leading researchers in the field:

[L]et us suppose (as we surely should, until or unless the facts dictate against it) that the human sentence processing routines compute for a sentence the very structure that is assigned to it by the mental “competence” grammar. (Fodor, J. D., 1989: p. 157)

Most models of human language comprehension assume that the processor incorporates words into a grammatical analysis as soon as they are encountered. … We assume that sentence processing involves the computation of dependencies between the words and phrases that are encountered. For example, in the sentence The troops found the enemy spy, the relations include information that the troops is the subject of found. Often, words are incorporated directly into the representation without breaking existing dependencies. For example, when the main verb found is encountered, the processor forms the dependency between the troops and found, and does not need to break any other dependencies (e.g., that between the and troops). (Sturt, Pickering, and Crocker, 2001: p. 283)

In this section, I present several arguments for the psychological reality of mental phrase markers. I begin by taking a brief look at what is currently known about the neural processes underlying language comprehension, with a particular focus on EEG studies that employ the so-called “violation paradigm.” Turning to behavioral studies, I discuss the results of a number of experiments designed to examine a phenomenon known as structural priming. The data from both the EEG work and the structural priming studies provide strong support for the claim that the HSPM constructs representations of distinctly syntactic properties of the input. I then turn to the psycholinguistic experiments that reveal “garden-path” phenomena in sentence processing. These are cases in which the HSPM encounters a locally ambiguous input and resolves the ambiguity in a way that turns out to be incorrect relative to the completion of the sentence. I discuss three principles of ambiguity resolution, which form the foundation of most psychologically plausible parsing models. Finally, I discuss evidence for the presence of empty categories—specifically, wh-traces—in the mental phrase markers that the HSPM constructs. The data currently available suggest that wh-traces are psychologically real and that the HSPM employs rather sophisticated strategies in searching for them, making use of the cues provided by their antecedents as well as considerable knowledge of grammatical constraints.

4 J. A. Fodor (1983) is a classic discussion of faculty psychology and its discontents. More recent work on the topic can be found in Pinker (1999), Coltheart (1999), J. A. Fodor (2002), Carruthers (2006), and Prinz (2006).
5 Nearly every publication on the topic of intentional content begins with a series of objections to each of the other available theories. Cummins (1991) provides a useful, if somewhat outdated, summary; a more up-to-date catalogue of objections can be found in Neander (2006). For a version of the popular “use theory of meaning,” see Horwich (1998). Inferential role semantics has been forcefully defended by Sellars (1963a, b) and extended by Brandom (1998, 2008). The “asymmetric dependence” approach is an alternative developed by J. A. Fodor (1987, 1990). Fodor’s theory builds on the “informational semantics” introduced by Dretske (1981). Teleosemantic theories owe much of their popularity to Millikan (1984). An interestingly different version of teleosemantics is advanced by Cummins (1996). Finally, there are “two-factor” views, e.g., Block (1986), which seek to incorporate the virtues of both causal/informational theories and use theories of meaning.

2.1 The Argument from Neurophysiological Data

Let us begin by surveying some of the results emerging from the field of neurolinguistics that bear on the psychological reality of mental phrase markers. In arguing for the claim that “real-time processes assemble syntactic representations that are the same as those motivated by grammatical analysis” (emphasis mine), Phillips and Lewis (forthcoming) say the following:

[S]tudies that use highly time-sensitive measures such as event-related brain potentials (ERPs) have made it possible to track how quickly comprehenders are able to detect different types of anomaly in the linguistic input. This work has shown that speakers detect just about any linguistic anomaly within a few hundred milliseconds of the anomaly appearing in the input. Different types of grammatical anomalies elicit one or more from among a family of different ERP components, including an (early) left anterior negativity (‘(e)LAN’; Neville et al., 1991; Friederici, Pfeifer, & Hahne, 1993) … [F]or current purposes the most relevant outcome from this research is that more or less any grammatical anomaly elicits an ERP response within a few hundred milliseconds. If the online analyzer is able to immediately detect any grammatical anomaly that it encounters, then it is reasonable to assume that it is constructing representations that include sufficient grammatical detail to detect those anomalies.

Below, I describe two of the experiments mentioned in this passage. EEG devices measure what are known as event-related potentials (ERPs). These are small voltage differences between electrical activities in the brain, recorded by electrodes placed on the scalp. In studies of linguistic comprehension, a typical piece of EEG data will have the format exemplified in Figure 1. There, we see a comparison between the ERPs evoked by two distinct stimuli—the critical regions of two German sentences, one grammatical, the other not.

Figure 1: A typical display of EEG data, showing the latency, degree, polarity, location, and distal cause of the neuronal signal. At the top left, the gross location of the activity is specified (in this case, by the symbol ‘F7’). The values on the y-axis represent the degree of the signal and its polarity (positive or negative). The values on the x-axis represent the signal’s latency—i.e., the time at which it occurs, relative to the onset of the stimulus. In this case, the signal is an ELAN—an early negativity in the left anterior region of the brain. As the graph shows, the ELAN occurs roughly 125 milliseconds after the onset of the critical stimulus. Like the studies discussed in the main text, the experiment from which this data was derived used two German sentences as stimuli. The first, indicated by the unbroken line, is a grammatical sentence. The second, indicated by the broken line, is ungrammatical—i.e., it exhibits a basic phrase structure violation. The critical stimulus is the word ‘ironed’. The ERP associated with the grammatical sentence is significantly different from the one associated with the ungrammatical sentence. The graph shows a distinct negativity (conventionally plotted upward on the y-axis). Source: Bornkessel-Schlesewsky and Schlesewsky (2009: p. 110)

The most widely used experimental paradigm in neurolinguistic ERP studies is known as the violation paradigm. In this paradigm, participants are shown a variety of sentences, some of which contain violations with respect to one or another linguistic property. For instance, in an early ERP study, Neville et al. (1991) used the following materials:

(1) *The man admired Don’s of sketch the landscape. (syntactic violation)
(2) *The man admired Don’s headache of the landscape. (semantic/pragmatic violation)6
(3) The man admired Don’s sketch of the landscape. (control sentence—no violation)

As subjects read such sentences, an EEG device monitors their brain activity, particularly at the crucial regions, underlined in (1) and (2). The ERPs evoked by the anomalous stimuli differ from those evoked by the well-formed stimuli. This yields information about where and when in the brain specific kinds of violation are detected. For instance, Neville et al. found that sentence (1) evoked a negative polarity response in the left anterior region of the brain approximately 125 milliseconds after the onset of the word ‘of’. This fast negative polarity response has come to be known as the ELAN—early left anterior negativity. (See Fig. 1 above.) By contrast, the non-syntactic violation in sentence (2) evoked a negative polarity response approximately 400 milliseconds after the onset of the word ‘headache’. This has been dubbed the N400. Using the materials in (4)–(6), Friederici, Pfeifer, and Hahne (1993) found the same pattern—a replication that is especially striking given the fact that, unlike Neville et al., Friederici et al. used German rather than English sentences and presented them auditorily rather than visually.

(4) *Der Freund wurde im besucht. (syntactic violation)
    the friend was in the visited
(5) *Die Wolke wurde begraben. (semantic/pragmatic violation)
    the cloud was buried
(6) Der Finder wurde belohnt. (control sentence—no violation)
    the finder was rewarded

Indeed, virtually the same pattern has been observed in dozens of subsequent studies. The natural interpretation is that incoming words are incrementally incorporated into a mental phrase marker, with syntactic information being accessed quite early—125 milliseconds after stimulus onset—while other properties of the stimulus are recovered significantly later.

Consider now sentences like (7), in which a semantic violation is combined with a syntactic violation.

(7) *Das Gewitter wurde im gebügelt. (combined syntactic and semantic violation)
    the thunderstorm was in the ironed

Hahne and Friederici (2002) found that sentence (7) evokes an ELAN, which is characteristic of syntactic violations, but not an N400, which seems to be correlated with semantic violations. It appears, then, that the (presumably syntactic) ELAN is capable of blocking the (presumably semantic) N400. Crucially, it has also been discovered that the reverse is not true; a semantic N400 evoked prior to a syntactic ELAN cannot “block” the ELAN. Researchers have thus concluded that “existing ERP findings provide strong converging support for the assumption that constituent structure information hierarchically dominates other information types such as semantics/plausibility” (Bornkessel-Schlesewsky and Schlesewsky, 2009: p. 113).

6 In view of the notoriously shaky status of the semantics/pragmatics distinction, I simply slur over it in what follows, labeling various properties of sentences ‘semantic’ regardless of whether they would be classified as semantic or pragmatic by a theorist who insists on drawing the distinction.

2.2 The Argument from Structural Priming

Fodor, Bever, and Garrett (1974) reported a range of studies that they interpreted as demonstrating the psychological reality of mental phrase markers. Among these were the famous “click” experiments, in which subjects monitoring a speech stream misheard short clicks as if they occurred at constituent boundaries. However, as Jurafsky and Martin (2008: pp. 424–5) point out, many of these studies failed to control for semantic biases that correlate with syntactic structure. After all, effects that can be explained by the hypothesis that the HSPM groups words into a syntactic perceptual unit can oftentimes be equally well explained by the hypothesis that the grouping is a semantic one. Convincing arguments for the psychological reality of syntactic constituency must, therefore, be based on data that can be shown to be independent of semantic effects. Recent evidence from priming studies fits the bill.

The logic of priming phenomena is this: A mental representation activated at time t—say, in response to a stimulus—continues to be active for some time after t, influencing cognitive processing as long as it persists. In the following passage from Pickering and Ferreira (2008), the authors discuss the importance of priming to recent studies of sentence processing.

In the past couple of decades, research in the language sciences has revealed a new and striking form of repetition that we here call structural priming. When people talk or write, they tend to repeat the underlying basic structures that they recently produced or experienced others produce. This phenomenon has been the subject of heavy empirical scrutiny. Some of this scrutiny has been because, as in other domains in cognitive psychology (e.g., priming in the word recognition literature; e.g., McNamara, 2005), the tendency to be affected by the repetition of aspects of knowledge can be used to diagnose the nature of that knowledge. [T]he tendency to repeat aspects of sentence structure helps researchers identify some of the representations that people construct when producing or comprehending language. As we shall see, much structural priming is unusually abstract, evidently reflecting the repetition of representations that are independent of meaning and sound. This is therefore informative about how people represent and use abstract structure that is not directly grounded in perceptual or conceptual knowledge. One possibility is that the representations that it identifies can be equated with the representations assumed in formal linguistics. (p. 427)
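The activation-based logic of priming described above can be put schematically. The following is a minimal sketch of my own, not a model from the priming literature: the decay rate, boost values, and the rule of choosing the momentarily most active of two semantically equivalent frames are all illustrative stipulations.

```python
import math

class Construction:
    """A syntactic frame with a resting activation level plus a residual
    boost that decays exponentially after the frame is used."""
    def __init__(self, name, base=1.0):
        self.name = name
        self.base = base          # resting activation
        self.boost = 0.0          # residual activation from recent use
        self.last_used = None     # time of most recent use

    def activation(self, now, decay=0.5):
        if self.last_used is None:
            return self.base
        return self.base + self.boost * math.exp(-decay * (now - self.last_used))

    def use(self, now, amount=1.0):
        self.boost = amount
        self.last_used = now

def choose(constructions, now):
    """A speaker describing a picture picks whichever of two semantically
    equivalent frames is currently most active."""
    return max(constructions, key=lambda c: c.activation(now))

double_object = Construction("V-NP-NP")   # 'gave Oliver a toy'
prep_dative   = Construction("V-NP-PP")   # 'gave a toy to Oliver'

# Reading a prepositional-dative prime aloud at t=0 boosts that frame,
# so shortly afterwards the speaker is more likely to reuse it.
prep_dative.use(now=0)
print(choose([double_object, prep_dative], now=1).name)   # V-NP-PP

# Long after the prime, the boost has decayed and both frames sit at rest.
print(round(prep_dative.activation(now=50), 3))           # 1.0
```

The point of the sketch is only that a representation's recent activation, and nothing about its meaning or its words, can bias a later choice between equivalent structures, which is exactly the dissociation the experiments below exploit.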

An early and influential study in this vein is reported in Bock and Loebell (1990). The researchers were careful in eliminating the semantic and lexical confounds mentioned above, by constructing their stimulus materials in such a way as to vary syntactic structure independently of lexical and semantic structure, and vice versa. This was made possible by the fact that some verbs in English are ditransitive—capable of being used in semantically identical but syntactically distinct expressions. Examples of ditransitive verbs include ‘give’, ‘sell’, and ‘send’. Sentence (8) illustrates a double-object dative construction, i.e., a V–NP–NP structure, while (9) is an example of a prepositional dative construction, i.e., a V–NP–PP structure.

(8) Quentin [V gave/sold/sent [NP Oliver] [NP a toy]].

(9) Quentin [V gave/sold/sent [NP a toy] [PP to Oliver]].

Bock and Loebell’s experiment made use of the picture-description paradigm. Participants were first asked to read some sentences out loud. Unbeknownst to them, these served as primes and were selected by the experimenters for having a preposition after the verb (i.e., a V–NP–PP structure), but differing from (9) in their semantics and lexical constituency. For instance, although a sentence like (10) has the same syntactic structure as (9), it has almost none of the same words as (9) and, crucially, has a different semantic interpretation—e.g., the preposition carries a locative meaning, as against the dative meaning of the preposition in (9).

(10) IBM [V moved [NP a bigger computer] [PP to the Sears store]].

Having read out loud sentences like (10), participants were shown pictures and asked to describe them. The pictures depicted events that involve an agent giving something to someone. Such events can be described equally well by sentences that employ the double-object dative construction (V–NP–NP) and ones that employ the prepositional dative construction (V–NP–PP).

Bock and Loebell found a strong priming effect. Participants who initially read out loud sentences that employ the double-object dative construction were more likely to employ that same construction in describing the events depicted in the pictures. Similarly, those who had recently read out loud sentences that employ the prepositional dative construction were more likely to employ that construction in describing the events depicted in the very same pictures. This strongly suggests that the participants in the experiment mentally represented the syntactic properties of the sentences that they were initially asked to read, and then used those representations in repeating those sentences out loud. Having been activated twice—once in the comprehension of the written sentences and once in their spoken production—the representation then remained active in their sentence processing mechanisms, hence more likely to be re-used later in production. This priming effect accounts for the participants’ choice of construction.

This experiment was one of the first in what has become a long line. Indeed, the area of structural priming research is currently thriving, largely because the results are so robust and the data so telling.7 Bock and Loebell’s initial conclusions have been refined and extended in a number of ways. Pickering and Ferreira (2008) discuss studies which show that structural priming is not restricted to the constructions mentioned above—e.g., it occurs also with active–passive pairs. Moreover, it is not due to the presence of common closed-class words in the stimulus materials—e.g., the preposition ‘to’ in sentences (9) and (10). For instance, Bock (1989) finds priming across sentences with different prepositions—e.g., ‘to’ and ‘for’. Nor is structural priming restricted to a single language; the phenomenon has been observed in German, and bilingual English–German speakers even exhibit cross-linguistic priming effects—i.e., primed production across their two languages, with respect to suitable constructions. Children and aphasics also exhibit structural priming effects, ruling out the possibility that the phenomenon is restricted to some special set of language users. Further studies rule out the possibility that subjects produce forms similar to the prime because they want to stay in the same rhetorical register, e.g., formal speech. Similarly, structural priming is independent of both prosody and argument structure (i.e., θ-assignment), and can be elicited cross-modally from spoken to written language and vice versa.8 Finally, the same findings have been replicated using experimental paradigms other than the picture-description paradigm. These include sentence recall, written sentence completion, and spoken sentence completion.
Having ruled out a broad range of possible confounds, Pickering and Ferreira write:

In conclusion, taken together, these results provide compelling evidence for autonomous syntax: The production of a sentence critically depends upon an abstract syntactic form that is defined in terms of part-of-speech forms (e.g., nouns, verbs, prepositions) and phrasal constituents organized from those (noun phrases, verb phrases, prepositional phrases), and this abstract syntactic form has a large influence upon structural priming. (p. 431)

We may conclude, in a similar vein, that studies of structural priming effects shore up decisive evidence in favor of the psychological reality of mental phrase markers and provide a glimpse into their role in language comprehension and use.

7 Pickering and Ferreira (2008) even entertain the “intriguing possibility that all levels of processing that occur during production show priming and therefore that the absence of priming suggests the absence of a corresponding level of representation” (p. 429, emphasis added).
8 This contradicts a contention of Devitt (2006) to the effect that there may well be no modality-neutral language faculty. Pickering and Ferreira discuss what they take to be “strong evidence that at least those aspects of structural knowledge that underlie structural priming are modality-independent—they are used in the same way both when speaking and when writing” (p. 439).

2.3 The Argument from Garden-Path Effects

A classic form of argument for the psychological reality of mental phrase markers begins with the observation that competent language users have problems reading and understanding sentences like the following.

(11) Daniel tells students he intrigues to stay.
(12) Jake knows the boy hurried out the door slipped.
(13) The soldier persuaded the radical student that he was fighting in the war for to enlist.
(14) Aron gave the man who was eating the fudge.

From the point of view of formal syntax, all of these sentences contain a local ambiguity that is resolved at or before the end of the sentence. What could explain the fact that even proficient readers encounter measurable processing difficulties with regard to such sentences? The standard explanation appeals to the online construction of mental phrase markers.

A parsing routine that computes phrase markers incrementally will update its representation of a sentence in accordance with the words or phrases that it encounters, up to the point at which the sentence becomes syntactically ambiguous. At that point, the parser has to make a choice among the possible ways of continuing the phrase marker that it has thus far constructed.9 Any ambiguity resolution strategy will sometimes lead a parser to make incorrect choices—i.e., choices that give rise to expectations that the remainder of the sentence will serve to disconfirm. Herein lies the explanation of the aforementioned processing difficulties. In sentences like (11)–(14), the human sentence processor’s preferred structural assignment turns out to be incorrect by the time the sentence is complete. Additional computational load is then incurred in revising the mental phrase marker—a process known as reanalysis.10 This additional processing burden shows up in behavioral and neurophysiological indicators of processing difficulty, such as error rates and delayed reaction times. The success of this explanation of the observed processing difficulties constitutes evidence in favor of the claim that human sentence processing routines construct mental phrase markers.
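To make the serial version of this story concrete, here is a toy sketch of my own (it reproduces no model from the literature): the parser commits to the least-effort reading of a locally ambiguous verb form and pays a backtracking cost, proportional to the material rebuilt, when a later word disconfirms the commitment. The single hard-coded ambiguity and the cost metric are illustrative stipulations.

```python
def parse(words):
    """Serially parse a toy sentence: commit to one reading of the locally
    ambiguous verb 'raced', and count the extra steps ("reanalysis")
    incurred if the rest of the sentence disconfirms that commitment."""
    steps = 0
    reading = None
    for i, word in enumerate(words):
        steps += 1                         # one step per incorporated word
        if word == "raced" and reading is None:
            reading = "MAIN_VERB"          # preferred, least-effort analysis
        if word == "fell" and reading == "MAIN_VERB":
            # A second finite verb disconfirms the main-verb commitment:
            # backtrack to 'raced' and rebuild it as a reduced relative.
            reading = "PARTICIPLE"
            steps += i                     # reanalysis cost ~ material rebuilt
    return reading, steps

easy = "the horse raced past the barn".split()       # no conflict arises
hard = "the horse raced past the barn fell".split()  # classic garden path

print(parse(easy))   # ('MAIN_VERB', 6)
print(parse(hard))   # ('PARTICIPLE', 13)
```

A parallel variant would instead carry both readings along and re-rank them at 'fell'; either way, the disconfirming word is where the extra work, and hence the measurable difficulty, shows up.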
Countless instances of this explanatory strategy can be found in the sentence processing literature. Consider, for instance, the following sentences.

9 Here, I assume a serial architecture. Parallel models will bifurcate their processing at this point, building multiple grammatically licensed structures and ranking them.
10 When a serial model has made a mistake, it will incur extra computational load by being forced to backtrack. Parallel models will likewise incur extra computational load, on account of their having to re-rank the parses that they have constructed and stored in working memory, if only by deleting the disconfirmed parse. For further discussion of this issue, see Crocker, Pickering and Clifton (2000), ch. 1.

(16) The spy saw the cop with a revolver, but the cop didn’t see him.
(17) The spy saw the cop with the binoculars, but the cop didn’t see him.

Rayner, Carlson, and Frazier (1983) examined the eye movements involved in reading these sentences and others like them.11 The data show that readers fixate immediately after the word ‘revolver’ in sentence (16) for a significantly longer time than they do after the word ‘binoculars’ in sentence (17). (See Fig. 2, below.) This is indicative of a mild hiccup in processing. Rayner et al. argue that the increased fixation time is a result of the fact that the HSPM prefers to construct a representation of sentence (16) in which the prepositional phrase ‘with a revolver’ attaches to the verb ‘saw’, not the noun ‘the cop’. (See Fig. 3.) Fractions of a second later, this initial attachment preference comes into conflict with the reader’s encyclopedic information—specifically, a belief to the effect that one is much less likely to use a revolver to see something than to see a person who is in possession of a revolver. This gives rise to a time- and resource-consuming reanalysis of the sentence, in the course of which the prepositional phrase is attached to the noun ‘the cop’. By contrast, in the case of sentence (17), the initial attachment preference is consistent with the semantic interpretation of the sentence, so processing is not delayed.

Figure 2: Eye-tracking data from Rayner et al. (1983). The graph depicts a significant difference between the fixation times at the critical regions of sentences (16) and (17).

The Rayner et al. study and numerous others like it serve to illustrate two points. First, there is the now-familiar point that phrase structure is recovered from the linguistic input in the course of processing. That is, a mental phrase marker is constructed, in a manner that is sensitive to the syntactic properties of the stimulus. The second point is that the HSPM seems to construct mental phrase markers in accordance with a quite general ambiguity resolution strategy, known in the literature as Minimal Attachment. Minimal Attachment is a least-effort principle, according to which the parser will attach incoming material into the existing mental phrase marker in such a way as to minimize the number of nonterminal nodes in the resulting structure. (See Figure 3, below.) The parser’s adherence to this strategy also accounts for the garden-path effect associated with the famous sentence (18) and its cohorts, (19)–(21).

(18) The horse raced past the barn fell.
(19) The ship floated down the river sank.
(20) The dealer sold the forgeries complained.
(21) The man sent the letter cried.

The verb forms ‘floated’, ‘raced’, ‘sold’, and ‘sent’ are ambiguous. They can serve as either past tense verbs that are part of a main clause or as past participles that serve to introduce a relative clause. (In these cases, the optional complementizer ‘that’ has been omitted.) The locally ambiguous structures associated with these sentences are illustrated in Figure 4.

Minimal Attachment is one of three principles that, taken together, explain a wide range of the HSPM’s ambiguity resolution preferences. The second such principle is Late Closure, which dictates that the parser will incorporate newly encountered material into the most recent phrase or clause of the mental phrase marker that it has already constructed. Late Closure is invoked to explain the HSPM’s preference in cases like the one illustrated in Figure 5.12

11 For an extensive discussion of the eye-tracking paradigm, see Rayner (1998).
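Stated procedurally, the two principles amount to an ordered tie-break over candidate attachment sites. The sketch below is my own illustration; the node counts and recency scores are stipulated by hand rather than computed from real trees.

```python
def choose_attachment(candidates):
    """Pick among candidate attachment sites: fewest new nonterminal nodes
    first (Minimal Attachment); ties go to the most recently opened phrase
    (Late Closure). Higher 'recency' means a more recently opened phrase."""
    return min(candidates, key=lambda c: (c["new_nodes"], -c["recency"]))

# PP-attachment ambiguity, as in 'saw the cop with ...': attaching the PP
# to the NP 'the cop' requires an extra NP node, so VP attachment wins.
vp_attach = {"name": "attach PP to VP", "new_nodes": 0, "recency": 1}
np_attach = {"name": "attach PP to NP", "new_nodes": 1, "recency": 2}
print(choose_attachment([vp_attach, np_attach])["name"])  # attach PP to VP

# 'She said he saw her yesterday': both attachments are equally cheap,
# so Late Closure favors the most recent clause ('saw her').
early = {"name": "attach adverb to 'said'", "new_nodes": 0, "recency": 1}
late  = {"name": "attach adverb to 'saw'",  "new_nodes": 0, "recency": 2}
print(choose_attachment([early, late])["name"])  # attach adverb to 'saw'
```

The ordering of the tuple key encodes the priority of the principles: node economy is consulted first, and recency only breaks ties.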

12 The tree structures in the diagrams presented below are far simpler than would be posited in contemporary syntactic theories, but they are sufficient to illustrate the structural distinctions relevant to the present discussion.

Figure 3: In accordance with the principle of Minimal Attachment, when sentence (16) is the input, the HSPM will initially fail to build the plausible and grammatical structure in (a), preferring instead the semantically implausible structure in (b). The reason is that, in (a), the verb phrase (VP) dominates more nodes than it does in the structure depicted in (b). The additional nonterminal node in (a) is an NP, which is needed to accommodate the attachment of the prepositional phrase ‘with the binoculars’ to the NP ‘the cop’. Minimal Attachment rules against constructing a mental phrase marker with this additional internal structure. When sentence (17) is the input, Minimal Attachment likewise dictates avoiding the structure depicted in (c) and building instead the structure shown in (d). In this case, the attachment preference turns out to be semantically plausible.

Figure 4: In accordance with the principle of Minimal Attachment, the HSPM assumes that the noun ‘the dealer’ is the subject of the verb ‘sold’. It thus attaches the verb to the existing structure in the manner depicted in the left panel. Subsequent input reveals that the HSPM’s assumption was incorrect, thus necessitating reanalysis. The difficulty of reanalysis in this case is a function of the sheer amount of additional structure needed to accommodate a passive participle reading.

Figure 5: Having built a structure for the input ‘She said he saw her’, the HSPM receives the adverb ‘yesterday’. The grammar licenses two possible attachments, represented by the left and right panels. In accordance with the principle of Late Closure, the HSPM resolves this ambiguity in favor of the structure depicted in the left panel. That is, it attaches the adverb to the most recent phrase of the structure it had already built—in this case, the verb phrase ‘saw her’.

The third ambiguity resolution principle, known as the Minimal Chain Principle, pertains to the processing of so-called “filler-gap” constructions, to which we now turn.

2.4 The Argument from Filler-Gap Processing

An important feature of the dominant framework in generative linguistics—often referred to as “Principles and Parameters” theory (P&P, henceforth)—is that it posits so-called “empty categories.” These are linguistic entities that are not overtly present in writing and speech, but are assumed to occupy a position in the underlying syntactic structures of many types of sentence. Empty categories play a role in explaining why passive sentences, certain kinds of ellipsis, and wh-questions (to name just a few constructions) have the semantic interpretations that they do, despite the fact that the relations between the nouns and the verbs in these constructions are not the canonical ones that syntacticians presume to be encoded in the lexicon. For example, the lexical entry for the verb ‘consult’ specifies that this verb requires a subject to its left and an object to its right. In sentence (22), however, the object of the verb does not occupy its canonical position. Nevertheless, the word ‘whom’ bears a structural relation to the object position—a relation that is crucial for correctly interpreting the question. To capture this relation, P&P grammars posit an empty category known as a “wh-trace” in the object position, and co-index this entity with ‘whom’, formally representing this relation with the subscript ‘i’.13

(22) [Whom]i did your parents consult [wh-trace]i before buying the guitar?

The word ‘whom’ serves as the antecedent of this wh-trace. In the jargon of psycholinguistics, ‘whom’ is said to be the “filler” and the wh-trace is said to be the “gap.”

While the notion of an empty category helps the syntactician capture significant structural relationships, it poses special challenges for the psycholinguist’s theory of comprehension.
A model of the HSPM must accommodate the processing of input that is not overtly present in the sound stream or on the printed page. Hence, one might wonder whether empty categories are psychologically real, i.e., whether the HSPM bothers to include empty categories in the mental phrase markers that it builds. This question gains urgency in light of the fact that alternative descriptive grammars, rivals of the P&P approach such as Lexical-Functional Grammar and Generalized Phrase-Structure Grammar, encode the relevant structural relations without positing empty categories.14 Thus, a demonstration of the psychological reality

13 See Haegeman (1994) for detailed coverage of the rich study of empty categories.

14 This is an oversimplification. For details regarding the way in which LFG and HPSG treat wh-constructions and relative clauses, see Bresnan (2001) and Pollard and Sag (1994), respectively. For treatments of this issue that highlight its relevance to psycholinguistics, see Fodor, J. D. (1989: 177–186) and Featherston (2001).

of empty categories can be used to argue that grammars emerging from the P&P framework more closely resemble the grammar employed by the HSPM than do rival grammatical formalisms.

Fodor (1989, 1995) summarizes a number of experiments designed to address this issue. To illustrate, consider what psycholinguists call "the filled-gap effect." Sentences (23) and (24) demonstrate that a wh-trace can appear in a variety of locations in the input.

(23) [Who]i could the little child have forced [wh-trace]i to sing those stupid songs for Jennifer last year?

(24) [Who]i could the little child have forced us to sing those stupid French songs for [wh-trace]i last year?

Eye-tracking studies reveal a minor hitch in processing at the word 'us' in (24), as compared with the analogous position in (23). The explanation for this, which again appeals to the construction of mental phrase markers, runs as follows: The HSPM is sensitive to the fact that the word 'Who' is an antecedent to which a wh-trace will have to be bound, i.e., a filler awaiting a gap. Thus, it actively searches for legitimate positions at which to posit the wh-trace. The HSPM predicts that the gap will occur after 'forced', in both (23) and (24). In the case of (23), the prediction is correct, so comprehension proceeds smoothly. But in the case of (24), the prediction leads the parser astray, giving rise to measurable processing difficulties at just the point in the sentence where the word 'us' occupies the predicted position of the wh-trace.

Fodor (1989) notes that the success of this explanation rests on our having independent reason to believe that the HSPM actively hunts for positions at which to posit a gap. There is, after all, no obvious reason why the HSPM should predict a gap where there is none, instead of simply waiting to see whether some overt material occupies that position. Nevertheless, as she goes on to argue, the HSPM does make active predictions, and these are by no means random or blind. On the contrary, the HSPM appears to be well informed about where a gap would be licensed, which in turn strongly suggests that its parsing routines draw on a grammar to guide its predictions. Phillips and Lewis (forthcoming) echo this conclusion in the following passage:

[One] body of online studies has examined whether online structure building respects various grammatical constraints, i.e., whether the parser ever creates grammatically illicit structures or interpretations.
Many studies have found evidence of immediate online effects of grammatical constraints, such as locality constraints on wh-movement (Stowe, 1986; Traxler & Pickering, 1996; Wagers & Phillips, 2009), and structural constraints on forwards and backwards anaphora (Kazanina et al., 2007; Nicol & Swinney, 1989; Sturt, 2003; Xiang, Dillon, & Phillips, 2009). Findings such as these imply that the structures created online include sufficient structural detail to allow the constraints to be applied during parsing.

Let us briefly examine one of the studies that Phillips and Lewis mention. Nicol and Swinney (1989) made use of an experimental paradigm known as cross-modal priming. To see how this works, consider sentence (25), which contains a relative clause with a wh-trace in the object position of the verb 'accused'.

(25) The policeman saw [the boy]i [that_i the crowd at the party accused [wh-trace]i of the crime].

In (25), there is only one position at which the wh-trace is grammatically licensed and only one noun phrase that can legitimately serve as the antecedent of that wh-trace, viz., 'the boy'. However, the sentence contains a number of other noun phrases ('the policeman', 'the crowd', and 'the party'), any of which a linguistically "ignorant" processor might take to be the antecedent. Let's call these distracters. To test whether the HSPM is temporarily fooled into taking any of the distracters as the antecedent of the wh-trace, Nicol and Swinney had participants listen to sentences like (25) while looking at a computer screen. When the word 'accused' was spoken, participants saw a word appear on the screen. In some trials the word was semantically related to 'boy', which is the antecedent of the wh-trace that appears after 'accused'. For example, some participants saw the word 'girl'. In other trials, participants saw words that were comparable in length to 'girl', but semantically related to one of the distracters. Participants were asked to read this word out loud and their reaction times were measured.
Nicol and Swinney made the following assumption: If the HSPM posits a wh-trace after 'accused', then it will activate the meaning of the antecedent 'boy' at that point, which would in turn prime the recognition of semantically related words, like 'girl', thus speeding up the participants' reaction times in reading those words.15 By contrast, the recognition of words that are semantically related to one of the distracters would not be primed. And this is precisely what they found. Participants were significantly faster at reading the words that bear a strong semantic relation to 'boy' than they were at reading words that have closer semantic affinities with the distracters. The results were quite robust and have been replicated a number of times.16 Nicol and Swinney concluded that wh-traces are psychologically real and that the HSPM uses sophisticated, grammatically informed strategies in actively predicting their occurrence and determining their relation to other items within the syntactic structure of the incoming stimuli.17

15 Priming studies had, by this time, demonstrated quite clearly that the recognition of a word primes the recognition of semantically related words.

16 It is worth pointing out that other aspects of the data from Nicol and Swinney's experiment provide evidence for a "modular" or "syntax-first" processing architecture. In reporting their findings, they write: "When structural information cannot serve to constrain antecedent selection, then pragmatic information may play a role, but only at a later point in processing" (p. 5, emphasis added).

17 Subsequent research raised an important question about whether these results, and others like them, should be seen instead as semantic rather than syntactic effects. Although a decisive resolution of the ensuing debate is currently out of reach, it is worth noting that recent experiments reported in Featherston (2001) provide strong grounds in favor of a view that locates filler-gap effects at a syntactic level of representation. These experiments also provide grounds for attributing psychological reality to other empty categories, in particular NP-trace and PRO.

On the basis of the findings described above, we can add a processing principle to the list that already contains Minimal Attachment and Late Closure. Frazier and Clifton (1989) referred to the HSPM's strategy for dealing with filler-gap constructions as the "Active Filler Strategy", a label that reflects the fundamental point that, upon discovering a filler, the HSPM makes active and informed predictions about the occurrence of the corresponding gap in the linguistic input. Soon after, de Vincenzi (1989) proposed a more general principle that subsumes the Active Filler Strategy by making use of the notion of a syntactic chain from Government and Binding theory. According to what she called the "Minimal Chain Principle," the HSPM will "avoid postulating unnecessary chain members at S-structure, but [will] not delay required chain members" (199). As de Vincenzi pointed out, the second conjunct of this principle is equivalent to the Active Filler Strategy. It states, in essence, that when the HSPM recognizes some aspect of the input as an antecedent, it will posit a gap in the very first position at which a gap is licensed by the grammar.

For our purposes, it is noteworthy that the Minimal Chain Principle makes ineliminable reference to abstract grammatical notions, e.g., 'position at which a gap is licensed'.18 Note, moreover, that the principle entails that the HSPM will predict a gap in positions where the information provided by the antecedent, combined with an internal representation of a grammar, provides sufficient grounds for doing so. Without an internal representation of a grammar, the mere presence of an antecedent would not be sufficient for the HSPM to venture any guesses about where in the input a gap might be found, nor what relations that gap bears to other items in the syntactic structure that the HSPM has already constructed.

2.5 Summary and further reflections

The findings reviewed above, garnered from ERP studies, structural priming experiments, and research concerning garden-path and filler-gap processing, all point to the same conclusion: In the course of comprehension, the HSPM constructs explicit representations of the syntactic structure of linguistic input. It goes without saying that these studies are all subject to further scrutiny. Nevertheless, at present, they underwrite a number of powerful arguments for the psychological reality of mental phrase markers.

This conclusion is further supported by morals drawn from the AI literature, specifically the failure of the so-called "mostly-semantics" models developed by Roger Schank and his colleagues in the 1980s.19 These models eschewed the computation of syntactic dependencies and attempted to analyze linguistic input by first identifying the thematic structure of predicates in the input and then using the linear order of the words in the input string to determine which noun phrases play which of the required thematic roles. In his critique of such models, Marcus (1984) demonstrated that this strategy fails to handle a wide variety of multi-clause sentences, complex passives, and sentences that contain what we've been referring to as "gaps" (which includes most wh-questions). Plainly, the inability of mostly-semantics models to cope with such ubiquitous phenomena disqualifies them as plausible candidates for a model of human sentence processing. Moreover, it is arguable that the "brute-causal" model tentatively advanced by Devitt (2006), according to which phonetic representations are mapped directly into thoughts, faces the same insuperable difficulties.20

18 If recent versions of Minimalist syntax are correct, then the relevant licensing condition is the Empty Category Principle. Haegeman (1994) and Chomsky (1995) present several formulations of this principle.

19 For an overview, see Schank and Birnbaum (1984).
On the whole, we can be reasonably sure that there is no hope for models that fail to compute mental phrase markers in the course of comprehension.

Our discussion also points to a second and more profound conclusion: The HSPM is not a naïve mechanism. The explanatory success of principles like Minimal Attachment, Late Closure, and the Minimal Chain Principle suggests that the HSPM builds mental phrase markers in a way that is linguistically informed. The aforementioned principles all make ineliminable reference to the proprietary notions of an independently motivated syntactic theory, e.g., the notions number of nonterminal nodes and position at which an empty category is licensed. Thus, in addition to explaining a broad range of experimental data, these principles make it possible to see, if only in dim outline, how one might incorporate a formal syntactic theory into a model of sentence processing, an idea on which Fodor, Bever, and Garrett (1974) cast doubt, for reasons that have come to be recognized as spurious.21 This has important implications for our understanding of the relation between the HSPM and grammar. As Fodor (1989) observes,

long-distance binding of traces should provide many pitfalls for a rough-and-ready processor which relies on informal strategies rather than consulting the information provided by the grammar (except perhaps as an emergency backup). The hypothesis that the [HSPM] is such a device (Fodor, Bever, and Garrett, 1974; Bever et al., in press) becomes quite implausible in the face of the speed and accuracy with which the [HSPM] interprets traces. We can conclude, instead, that the [HSPM] is very closely attuned to the grammar of the language. If that is so, then differences in how the processor responds to different (putative) empty categories can be taken seriously as evidence of how they are treated by the grammar. (Fodor, 1989: p. 205)

The final remark in this passage is particularly significant for the linguist who seeks to formulate a grammar that is not only descriptively adequate but also psychologically real.22

20 See Pereplyotchik (forthcoming) for further discussion.

21 See Berwick and Weinberg (1984: ch. 2), Phillips (1994), and Phillips and Lewis (forthcoming).

22 Needless to say, descriptive adequacy is a worthy goal, in and of itself, to pursue in constructing a grammar. And, as Devitt (2006) argues at length, it seems to be the only goal that many syntacticians pursue. Nevertheless, it's plain that finding a descriptively adequate and psychologically real grammar is a much more exciting prospect. Note also that I am passing over Chomsky's important notion of explanatory adequacy. The distinction between descriptive and explanatory adequacy raises issues that are well beyond the scope of the present discussion.

It is important to be clear about the status of the processing principles discussed above. To my knowledge, it has never been suggested that Minimal Attachment, Late Closure, or the Minimal Chain Principle are represented in the HSPM, or that the HSPM has knowledge of them, however tacit. Rather, these are intended to be descriptive principles; they are true of the HSPM in much the same way that the principles of celestial mechanics are true of our solar system. Still, given that the principles make ineliminable use of the proprietary notions of syntactic theory, it follows that the HSPM works in accordance with a particular theory of syntax. There is, of course, a difficult question regarding how to properly cash out this notion of "working in accordance with." An opponent of the Representational Thesis might, at this point, insist that the HSPM doesn't need to represent a grammar, implicitly or explicitly, in order to act "in accordance" with it. If asked how the HSPM manages to build just the right structures and make just the right predictions in the course of comprehension, the opponent might reply: "It just does!" To fully appreciate the inadequacy of this reply, one must delve into the details of the existing computational models of parsing and comprehension, both classical and connectionist, and understand why all psychologically plausible models require the grammar to be represented, either implicitly or explicitly. As noted at the outset, the notion of representation is, itself, "up for grabs," so to speak. Accordingly, I will conclude in §4 by clarifying the notions of implicit and explicit representation.

§3. A Survey of Parsing Algorithms

In this section, we examine what Devitt (2006) calls the "processing rules" of a comprehension system by surveying computational parsing models that have some claim to psychological plausibility. In these models, internal representations of the rules or principles of a grammar are consulted from the very outset of the parsing process. I will argue that there is simply no way of building mental phrase markers without consulting an internal representation of a grammar. There is no such thing as a parser without an internally represented grammar.

3.1 Context-free grammars, the Earley algorithm, and the CKY algorithm

The starting point of many contemporary syntactic theories consists of a list of context-free phrase structure rules, examples of which appear below.23

23 The context-free grammar displayed here is adapted from Jurafsky and Martin (2008: p. 394). The symbol '|' expresses exclusive disjunction. Context-free rules comprised the "base" of nearly every transformational grammar prior to the Minimalist program. Even X-bar theory is trivially expressible in a context-free format.

Lexical rules:

Noun → flight | trip | morning
Verb → is | prefer | like | need
Adjective → cheapest | non-stop
Pronoun → me | I | you | it
Proper-Noun → Chicago | United
Determiner → the | a | an | this
Preposition → from | to | on | near
Conjunction → and | or | but

Grammatical rules:

S → NP VP
S → VP
NP → Pronoun
NP → Proper-Noun
NP → Det Nominal
Nominal → Nominal Noun
Nominal → Noun
VP → Verb NP

A context-free grammar (henceforth, CFG) can be used to demarcate a class of well-formed sentences in some language and to describe their hierarchical structure. For instance, the grammar presented above describes sentence (26) as having the structure shown in (27).24

(26) I prefer a morning flight.

(27) [S [NP [Pronoun I]] [VP [Verb prefer] [NP [Det a] [Nominal [Nominal [Noun morning]] [Noun flight]]]]]

CFGs form the foundation of a wide range of parsing models. These can be divided into two broad classes: top-down and bottom-up parsing.25 The distinction between these two approaches reflects a more basic distinction, with which philosophers are well acquainted.

Regardless of the search algorithm we choose, there are two kinds of constraints that should help guide the search. One set of constraints comes from the data, that is, the input sentence itself. … The second kind of constraint comes from the grammar. … These two constraints … give rise to the two search strategies underlying most parsers: top-down or goal-directed search, and bottom-up or data-directed search. These constraints are more than just search strategies. They reflect two important insights in the western philosophical tradition: the rationalist tradition, which emphasizes the use of prior knowledge, and the empiricist tradition, which emphasizes the data in front of us. (Jurafsky and Martin, 2008: p. 433)
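One way to make vivid the idea of a grammar as a consultable data structure is to encode the toy grammar above directly. The following is a minimal illustrative sketch in Python; the dictionary encoding and the names `GRAMMAR`, `LEXICON`, and `categories` are my own assumptions, not part of Jurafsky and Martin's presentation.

```python
# A toy encoding of the context-free grammar above.
# Each key is a non-terminal; each value lists its possible expansions.
GRAMMAR = {
    "S":       [["NP", "VP"], ["VP"]],
    "NP":      [["Pronoun"], ["Proper-Noun"], ["Det", "Nominal"]],
    "Nominal": [["Nominal", "Noun"], ["Noun"]],
    "VP":      [["Verb", "NP"]],
}

LEXICON = {
    "Noun": {"flight", "trip", "morning"},
    "Verb": {"is", "prefer", "like", "need"},
    "Pronoun": {"me", "I", "you", "it"},
    "Proper-Noun": {"Chicago", "United"},
    "Det": {"the", "a", "an", "this"},
}

def categories(word):
    """Return every lexical category the grammar assigns to a word."""
    return {cat for cat, words in LEXICON.items() if word in words}

print(categories("prefer"))   # {'Verb'}
```

Nothing about parsing follows yet from this encoding alone; it merely shows what it is for the rules to be available "as data" to a processing algorithm.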

24 This structure can be represented in bracket notation (as in (27)), tree notation, or in ordinary English. While the brackets render any such claim shorter, trees render it easier to read. Whatever the notation, such representations serve only to make claims about the structure of a sentence. Without further argument, nothing at all follows about the psychology of a language user, nor about the internal operations of a computational system designed to parse sentences of a language. This observation leads Devitt (2006) to draw an important distinction between structure rules and processing rules. Jurafsky and Martin (2008) make plain their recognition of this distinction when they write, "Syntactic parsing … is the task of recognizing a sentence and assigning a syntactic structure to it. This chapter focuses on the kind of structures assigned by context-free grammars... [S]ince they are based on a purely declarative formalism, context-free grammars don't specify how the parse tree for a given sentence should be computed. We'll therefore need to specify algorithms that employ these grammars to produce trees" (431).

25 Fodor, Bever, and Garrett (1974) refer to top-down and bottom-up techniques as analysis-by-analysis and analysis-by-synthesis, respectively. Computer scientists sometimes use the terms recursive-descent and shift-reduce. Mixed strategies, such as left-corner parsing, have also been explored and found to be psychologically plausible in a number of important respects. See, e.g., Abney and Johnson (1991).

A bottom-up parser begins by immediately looking at the input and searching the lexicon in order to determine all of the possible grammatical categories to which each word in the input can belong. Once these are ascertained, the parser begins to build up all of the syntactic structures compatible with those categories and the grammar of the language. The parsing process is completed when the partial structures that were built out of the items in the input are integrated into a full sentence, a structure with an S node at its root. At each step of the process, the parser "looks for places in the parse-in-progress where the right-hand side of some rule might fit" (Jurafsky and Martin, 2008: p. 434). This, of course, entails that the parser has an explicit representation of the rules and uses them as a template to determine which parses are grammatically licensed. Indeed, the pseudocode of the bottom-up "CKY" algorithm reveals explicit reference to a data structure in which the rules of a grammar are explicitly represented.26

function CKY-PARSE(words, grammar) returns table
    for j ← from 1 to LENGTH(words) do
        table[j−1, j] ← { A | A → words[j] ∈ grammar }                  ⇐
        for i ← from j−2 downto 0 do
            for k ← i+1 to j−1 do
                table[i, j] ← table[i, j] ∪ { A | A → B C ∈ grammar,    ⇐
                                              B ∈ table[i, k], C ∈ table[k, j] }
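Read as a sketch, the pseudocode corresponds to something like the following runnable recognizer. The grammar is the toy CFG from above, hand-converted into the binary-rule format that CKY requires; folding the unit productions (NP → Pronoun, Nominal → Noun) into the lexicon is my own illustrative shortcut, not part of the textbook presentation.

```python
from itertools import product

# Toy CFG in CKY-friendly form (hand-binarized; unit productions folded
# into the lexicon for simplicity -- an illustrative shortcut).
BINARY_RULES = {            # (B, C) -> A, i.e., the rule A -> B C
    ("NP", "VP"): "S",
    ("Det", "Nominal"): "NP",
    ("Nominal", "Noun"): "Nominal",
    ("Verb", "NP"): "VP",
}
LEXICON = {
    "I": {"NP"}, "prefer": {"Verb"}, "a": {"Det"},
    "morning": {"Nominal", "Noun"}, "flight": {"Nominal", "Noun"},
}

def cky_recognize(words):
    """Return True iff the grammar derives the sentence (CKY recognition)."""
    n = len(words)
    table = {}
    for j in range(1, n + 1):
        # Lexical step: consult the (explicitly represented) lexicon.
        table[(j - 1, j)] = set(LEXICON.get(words[j - 1], set()))
        # Binary step: combine adjacent spans, consulting the grammar.
        for i in range(j - 2, -1, -1):
            cell = set()
            for k in range(i + 1, j):
                for B, C in product(table[(i, k)], table[(k, j)]):
                    if (B, C) in BINARY_RULES:
                        cell.add(BINARY_RULES[(B, C)])
            table[(i, j)] = cell
    return "S" in table[(0, n)]

print(cky_recognize("I prefer a morning flight".split()))   # True
print(cky_recognize("prefer I flight".split()))             # False
```

Note that every grammatical category the recognizer ever posits comes from a lookup in `BINARY_RULES` or `LEXICON`; nothing in the procedure makes sense without those represented rules.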

In contrast to a bottom-up parser, a top-down parser uses its internally represented grammar to issue predictions about the input, prior to examining it. For instance, if the internally represented CFG is the one displayed above, then the parser will predict that the sentence consists either of an NP and a VP, or solely of a VP. (These are the only two expansions of the S node that the grammar allows.) In the next phase, the parser "unpacks" these predictions further by predicting, e.g., that the NP consists of either a pronoun, a proper noun, or a determiner and a nominal. Eventually, the parser looks at the input and weeds out all of the predictions that are incompatible with what it finds. The process is complete when all of the input has been accounted for and at least one of the analyses has not been falsified.

In practice, many parsing algorithms adopt a mixed strategy, issuing top-down predictions and then using a bottom-up, data-driven
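As a sketch of the top-down idea, a naive recursive-descent recognizer might look like this. It is purely illustrative and deliberately simplistic: the left-recursive rule Nominal → Nominal Noun has been rewritten right-recursively (Nominal → Noun Nominal) so that the naive strategy terminates, which is itself one reason practical parsers refine the pure top-down strategy.

```python
# Top-down (recursive-descent) recognition: the parser consults the
# grammar to predict structure *before* inspecting the input, then
# checks its predictions against the words.  Toy grammar from above,
# with the left-recursive Nominal rule rewritten right-recursively
# (an illustrative simplification).
GRAMMAR = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Pronoun"], ["Det", "Nominal"]],
    "Nominal": [["Noun", "Nominal"], ["Noun"]],
    "VP": [["Verb", "NP"]],
}
LEXICON = {
    "Pronoun": {"I", "you", "it", "me"},
    "Verb": {"prefer", "like", "need", "is"},
    "Det": {"a", "an", "the", "this"},
    "Noun": {"morning", "flight", "trip"},
}

def expand(symbol, words, pos):
    """Yield every position at which `symbol`, starting at `pos`, could end."""
    if symbol in LEXICON:                      # lexical prediction
        if pos < len(words) and words[pos] in LEXICON[symbol]:
            yield pos + 1
        return
    for rhs in GRAMMAR[symbol]:                # top-down prediction
        ends = {pos}
        for child in rhs:
            ends = {e2 for e in ends for e2 in expand(child, words, e)}
        yield from ends

def recognize(sentence):
    words = sentence.split()
    return len(words) in expand("S", words, 0)

print(recognize("I prefer a morning flight"))   # True
print(recognize("morning prefer flight"))       # False
```

Here the flow of information is the reverse of CKY's: grammar-driven predictions come first, and the input serves only to falsify them.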

approach in mid-sentence to weed out false predictions on the fly. Moreover, to avoid generating the same partially successful predictions again and again, or backtracking through prior choice points and reduplicating work already done, many parsers employ the "dynamic programming" technique of storing successful partial parse trees in a data structure known as a "chart" or a "well-formed substring table." The successful partial parses are then called on when needed, instead of having to be reconstructed anew. A well-known algorithm developed by Earley (1970) proceeds along these lines. The details are fascinating, but for our purposes the important point is that each of the three main functions of the Earley algorithm (Predict, Scan, and Complete) draws on an internally represented grammar. This is made clear in the pseudocode, which explicitly refers to a data structure that the authors have aptly labeled GRAMMAR-RULES.27

26 This elegant algorithm was discovered in the late 1960s by three separate researchers: John Cocke, Tadao Kasami, and Daniel Younger. The algorithm is frequently labeled 'CKY', in honor of its discoverers. The pseudocode displayed above is taken from Jurafsky and Martin (2008), ch. 13. I have used bold arrows to indicate the explicit reference to the rules of the grammar. For instance, the string '{ A | A → B C ∈ grammar }' can be read as "all nonterminal nodes A, such that the grammar contains a rule that expands A into B and C." It is notable that the CKY algorithm has been implemented in a connectionist architecture; see Hale (1999) for details.

procedure PREDICTOR((A → α • B β, [i, j]))
    for each (B → γ) in GRAMMAR-RULES-FOR(B, grammar) do                ⇐
        ENQUEUE((B → • γ, [j, j]), chart[j])

procedure SCANNER((A → α • B β, [i, j]))
    if B ⊂ PARTS-OF-SPEECH(word[j]) then                                ⇐
        ENQUEUE((B → word[j], [j, j+1]), chart[j+1])

procedure COMPLETER((B → γ •, [j, k]))
    for each (A → α • B β, [i, j]) in chart[j] do
        ENQUEUE((A → α B • β, [i, k]), chart[k])
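A compact runnable rendering of the Predictor/Scanner/Completer loop, sketched against the toy grammar from above, might look as follows. The encoding (dotted rules as named tuples, a dummy start state, sets for chart entries) reflects my own simplifying choices, not the textbook's exact data structures.

```python
from collections import namedtuple

# An Earley-style recognizer sketched after the pseudocode above.
GRAMMAR = {
    "S": [("NP", "VP"), ("VP",)],
    "NP": [("Pronoun",), ("Det", "Nominal")],
    "Nominal": [("Nominal", "Noun"), ("Noun",)],
    "VP": [("Verb", "NP")],
}
LEXICON = {
    "I": {"Pronoun"}, "prefer": {"Verb"}, "a": {"Det"},
    "morning": {"Noun"}, "flight": {"Noun"},
}

# A state is a dotted rule plus the position where its match began.
State = namedtuple("State", "lhs rhs dot start")

def next_symbol(state):
    return state.rhs[state.dot] if state.dot < len(state.rhs) else None

def earley_recognize(words):
    chart = [set() for _ in range(len(words) + 1)]
    chart[0].add(State("GAMMA", ("S",), 0, 0))        # dummy start state
    for j in range(len(words) + 1):
        agenda = list(chart[j])
        while agenda:
            st = agenda.pop()
            sym = next_symbol(st)
            if sym in GRAMMAR:                        # PREDICT: consult grammar
                for rhs in GRAMMAR[sym]:
                    new = State(sym, rhs, 0, j)
                    if new not in chart[j]:
                        chart[j].add(new); agenda.append(new)
            elif sym is not None:                     # SCAN: consult lexicon
                if j < len(words) and sym in LEXICON.get(words[j], set()):
                    chart[j + 1].add(State(st.lhs, st.rhs, st.dot + 1, st.start))
            else:                                     # COMPLETE
                for prev in list(chart[st.start]):
                    if next_symbol(prev) == st.lhs:
                        new = State(prev.lhs, prev.rhs, prev.dot + 1, prev.start)
                        if new not in chart[j]:
                            chart[j].add(new); agenda.append(new)
    return State("GAMMA", ("S",), 1, 0) in chart[len(words)]

print(earley_recognize("I prefer a morning flight".split()))   # True
print(earley_recognize("morning prefer flight".split()))       # False
```

The chart stores partial analyses so that no substructure is ever rebuilt, and the Predict step is where the internally represented grammar is consulted on every pass.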

The Earley parser constructs all of the possible parses in parallel, which typically requires a great deal of memory capacity when a broad-coverage grammar is applied. Despite this problem, the Earley algorithm is still widely used and has been applied to probabilistic extensions of context-free grammars (Hale, 2001; 2003), as well as to Minimalist and other mildly context-sensitive grammars (Harkema, 2001). Indeed, Hale (2001, 2003) presents evidence for the psychological plausibility of an Earley parser that draws on an internally represented probabilistic CFG.28 Hale shows that such a model predicts two kinds of processing difficulties that the HSPM is known to exhibit, having to do with the main-clause/relative-clause ambiguity (cf. sentences (18)–(21) above),

and the asymmetry between subjects and objects in unreduced relative clauses. Examples of the latter, taken from Gibson (1998), appear below.29

(28) [S The reporter [S' who [S the senator attacked]] admitted the error].

(29) [S The reporter [S' who [S attacked the senator]] admitted the error].

27 The pseudocode displayed here is taken from Jurafsky and Martin (2008: ch. 13). See also Pereira and Warren (1983), Shieber, Schabes, and Pereira (1995), Hale (2001), and Harkema (2001).

28 Hale (2001) makes clear his commitment to what he calls the strong competence hypothesis: "What is the relation between a person's knowledge of grammar and that same person's application of that knowledge in perceiving syntactic structure? … The relation between the parser and grammar is one of strong competence. Strong competence holds that the human sentence processing mechanism directly uses rules of grammar in its operation, and that a bare minimum of extragrammatical machinery is necessary. This hypothesis, originally proposed by Chomsky (Chomsky, 1965, p. 9) has been pursued by many researchers (Bresnan, 1982) (Stabler, 1991) (Steedman, 1992) (Shieber and Johnson, 1993), and stands in contrast with an approach directed towards the discovery of autonomous principles unique to the processing mechanism" (p. 1, emphasis in the original).

3.2 Parsing as Deduction

An exciting development in computational linguistics, known as the Parsing as Deduction approach (PD, henceforth), construes the parsing process as a species of natural deduction in either first-order logic or related formalisms. On this approach, parsing routines run through an explicit proof procedure that takes the rules of a grammar as axioms and derives theorems concerning the syntactic structure of input strings. PD constitutes the most concrete implementation of the Representational Thesis, treating rules as truth-evaluable claims, which the parser then uses as premises in the course of its inferential procedures.

Pereira and Warren (1983) show how the Earley algorithm discussed above can be reinterpreted in the Parsing as Deduction framework. The basic functions of the parser (Scan, Predict, and Complete) are interpreted as inference rules, on a par with modus ponens and existential instantiation. Shieber, Schabes, and Pereira (1995) extend this treatment to the CKY algorithm mentioned above and a variety of other algorithms.

[D]eduction can provide a metaphor for parsing that encompasses a wide range of parsing algorithms for an assortment of grammatical formalisms. We flesh out this metaphor by presenting a series of parsing algorithms literally as inference rules, and by providing a uniform deduction engine, parameterized by such rules, that can be used to parse according to any of the associated algorithms. … As we will show, this method directly yields dynamic-programming versions of standard top-down, bottom-up, and mixed-direction (Earley) parsing procedures. (p. 4)

They also apply the PD approach to syntactic formalisms other than the context-free grammars that we've been considering thus far. Likewise, Harkema (2001) and Hale (2003) apply the approach to Minimalist grammars. These formalisms are widely taken to be capable of providing a more descriptively adequate treatment of natural language than context-free grammars. We shall see shortly that principle-based

formalisms, such as the grammar of Government and Binding theory, likewise receive a natural interpretation in the PD approach.30

From the point of view of a computational linguist, the PD approach has an immediate payoff in that it makes possible the application of well-known programming techniques in LISP and Prolog to the task of parsing natural language. For philosophical purposes, the payoff is quite different, though no less intriguing. It is difficult to find up-and-running examples of psychological processes convincingly modeled as deductive operations defined over truth-evaluable statements. The PD approach shows that such an example is available in the case of at least one aspect of natural language comprehension. If the CKY or Earley algorithms play a role in a plausible model of the HSPM, then there is a perfectly workable account according to which the neural mechanisms that underpin comprehension can be said to be carrying out deductive procedures.

29 Gibson cites the results of a number of psycholinguistic experiments that establish the reality of this processing difficulty: "The object extraction is more complex by a number of measures including phoneme monitoring, online lexical decision, reading times, and response accuracy to probe questions... In addition, the volume of blood flow in the brain is greater in language areas for object extractions than for subject extractions…, and aphasic stroke patients cannot reliably answer comprehension questions about object-extracted [relative clauses], although they perform well on subject-extracted [relative clauses]…" (p. 2).
The availability of such an account can be brought to bear on the recent debate in the philosophy of linguistics concerning whether knowledge of language, i.e., grammatical competence, is strictly speaking propositional.31 The fact that a rich array of successful parsing models can be represented in a familiar propositional format shows, at the very least, that it is not incoherent to suppose that knowledge of language should be propositional. Nevertheless, I will argue in §4 that it is better to regard the HSPM as a set of subpersonal processes, hence not consisting of propositional attitudes, in the full-blooded sense of the term.
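The "uniform deduction engine" idea can be given a compact illustrative rendering: items are facts of the form (category, start, end), lexical facts serve as axioms, and a single grammar-licensed inference rule closes the item set under deduction. This is my own simplified sketch in Python, not Shieber, Schabes, and Pereira's implementation; a bottom-up (CKY-style) logic is used, but the same engine could be parameterized with Earley-style rules.

```python
# Parsing as deduction, in miniature: derive items (A, i, j), meaning
# "category A spans words i..j", by closing a set of facts under an
# inference rule licensed by the grammar.
BINARY_RULES = {("NP", "VP"): "S", ("Det", "Nominal"): "NP",
                ("Nominal", "Noun"): "Nominal", ("Verb", "NP"): "VP"}
LEXICON = {"I": {"NP"}, "prefer": {"Verb"}, "a": {"Det"},
           "morning": {"Nominal", "Noun"}, "flight": {"Nominal", "Noun"}}

def deduce(words):
    """Return the deductive closure of the lexical axioms over the input."""
    # Axioms: lexical items, read directly off the input string.
    items = {(cat, i, i + 1) for i, w in enumerate(words)
             for cat in LEXICON.get(w, set())}
    # Inference rule: from (B, i, k) and (C, k, j) and a grammar axiom
    # A -> B C, conclude (A, i, j).  Iterate to a fixed point.
    changed = True
    while changed:
        changed = False
        for (B, i, k) in list(items):
            for (C, k2, j) in list(items):
                if k2 == k and (B, C) in BINARY_RULES:
                    new = (BINARY_RULES[(B, C)], i, j)
                    if new not in items:
                        items.add(new)
                        changed = True
    return items

words = "I prefer a morning flight".split()
goal = ("S", 0, len(words))          # the theorem to be proved
print(goal in deduce(words))         # True
```

On this rendering, the grammar rules function literally as axioms, and a successful parse is the derivation of the theorem that category S spans the whole input.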

3.3 Principles and Parameters in Syntax and Parsing

The Standard Theory of transformational grammar (Chomsky 1957, 1965) posited transformational rules over and above context-free phrase structure rules. Each linguistic construction (passive, question, etc.) was associated with a distinct transformational rule. In addition, many transformations were language-specific; their inputs and outputs would differ from one language to the next. This proliferation of transformational rules came to be seen as a serious problem. Subsequent work, eventuating in the Principles and Parameters theory (P&P, discussed above), heralded a striking revision of transformational grammar in the 1980s. The new theory did away with most transformational rules. Only a single rule remained: Move-α. This rule says that any constituent appearing at D(eep)-structure can be moved anywhere in the phrase structure tree on the way to S(urface)-structure. Left unconstrained, Move-α would generate a great many S-structure phrase markers that fail to correspond to anything one finds in natural language. To eliminate these unwanted results, P&P grammars invoke a set of syntactic principles, e.g., the Projection Principle, the Theta Criterion, the Case Filter, the three Binding Principles, and the Empty Category Principle, which act as a sequence of filters.

30 See Berwick (1991a, b) and Johnson (1989).

31 See Knowles (2000) and Rattan (2002).

With the rise of the P&P framework in formal syntax, it was only natural that computational linguists would embark on the project of implementing these novel ideas in parsing models. From the start, principle-based parsers made heavy use of the PD approach.

deductive inference is still perhaps the clearest way to think about how to 'use' knowledge of language. In a certain sense, it even seems straightforward. The terms in the definitions like [that of the Case filter] have a suggestively logical ring to them, and even include informal quantifiers like every; terms like lexical NP can be predicates, and so forth. In this way, one is led to first-order logic or Horn clause logic implementation (Prolog) as a natural first choice of implementation, and there have been several such principle-based parsers written… Parsing amounts to using a theorem prover to search through the space of possible satisfying representations to find a parse… (Berwick, 1991a)

One important feature of principle-based parsers is their flexibility with regard to different languages. Yang and Berwick (1996) point out that "[t]raditional parsing technologies utilize language-particular, rule-based formalisms, which usually result in large and inflexible systems." By contrast, a principle-based parser can be used to assign structure to sentences from a variety of typologically distinct languages, simply by swapping out one lexicon for another and setting the parameters to the values characteristic of the target language.

It is believed that languages are constrained by a small number of universal principles, with linguistic variations largely specified by parametric settings. The merit of principle-based parsing is twofold. As a tool for linguists, it is directly rooted in grammatical theories. Therefore, linguistic problems, particularly those that involve complex interactions among linguistic principles, can be cast in a computational framework and extensively studied by drawing directly on an already substantiated linguistic platform. It is designed from the start to accommodate a wide range of languages, not just 'Eurocentric' Romance or Germanic languages.
Japanese, Korean, Hindi and Bangla have all been relatively easily modeled in PAPPI (Berwick and Fong 1991, …). Differences among languages reduce to distinct dictionaries, required in any case, plus parametric variation in the principles. … Because the PAPPI system implements its model linguistic theory faithfully, adapting new languages is expected to be quite minimal, as our implementation shows. (Yang and Berwick, 1996)

Besides their flexibility with respect to distinct languages, principle-based parsers exhibit tolerance with respect to ungrammatical input. On its way from X-bar analysis to the ultimate assignment of structure, linguistic input passes through a series of filters. Rather than grinding to a halt when presented with ungrammatical input, a principle-based parser simply takes note of which principles of the grammar the input fails to satisfy. This gives rise to variable-strength judgments of ungrammaticality; the more principles are violated, the more ungrammatical the input is judged to be. Nevertheless, as it runs this gauntlet, almost all linguistic input is assigned some interpretation, in a way that mirrors what we observe in human performance.
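The filtering architecture just described can be illustrated schematically: candidate analyses run a gauntlet of principle-checks, and the parser records violations rather than halting. The following is a toy sketch under invented names; real systems such as PAPPI implement actual GB principles, not these stand-ins.

```python
# Toy illustration of the "principles as filters" architecture.  Each
# candidate analysis is a simple dict; each "principle" is a predicate
# that a well-formed analysis must satisfy.  Both principles below are
# invented stand-ins, for illustration only.
PRINCIPLES = [
    ("ClauseHasSubject", lambda a: "subj" in a),
    ("TransitiveVerbHasObject",
     lambda a: (not a.get("transitive")) or "obj" in a),
]

def violations(analysis):
    """Run the analysis through the gauntlet of filters, noting failures."""
    return [name for name, holds in PRINCIPLES if not holds(analysis)]

def best_analysis(candidates):
    """Never grind to a halt: return the least-violating candidate."""
    return min(candidates, key=lambda a: len(violations(a)))

good = {"subj": "the boy", "verb": "slept", "transitive": False}
bad = {"verb": "devoured", "transitive": True}   # no subject, no object
print(violations(good))   # []
print(violations(bad))    # ['ClauseHasSubject', 'TransitiveVerbHasObject']
```

Because `violations` returns a count rather than a verdict, degraded input still receives some analysis, mirroring the variable-strength ungrammaticality judgments described above.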

In recent years, the P&P framework has moved away from the formalism of Government and Binding theory. Seeking a more elegant and compact account of syntactic structure, Chomsky (1995) and others have developed a variety of Minimalist grammars, in which the basic operations of Merge and Move forge hierarchical relations between lexical items, conceived as sets of features. Unsurprisingly, computational linguists have built parsing models that incorporate these grammars as well. For example, the algorithms developed by Harkema (2001) and Hale (2003) employ the PD approach in constructing Earley and CKY algorithms for parsing with Minimalist grammars. In addition, Berwick (1997) and Weinberg (1999) both attempt to account for a number of psycholinguistic results by seeing parsing processes as "the incremental satisfaction of grammatical constraints" imposed by Minimalist grammars.

3.4 Efficiency

It is well known that, in their purest and simplest forms, both top-down and bottom-up parsing algorithms are so inefficient as to be practically unusable for realistic natural language processing. Devitt (2006) puts the point vividly in the following passage:

How can the represented rules be used as data in language use? Consider language comprehension. Suppose that, somehow or other, the processing rules come up with a preliminary hypothesis about the structure of the input string. In principle, the represented rules might then play a role by determining whether this hypothesis could be correct (assuming that the input is indeed a sentence of the language). The problem in practice is that to play this role the input would have to be tested against the structural descriptions generated by the rules and there are just too many descriptions. The "search space" is just too vast for it to be plausible that this testing is really going on in language use. This led Fodor, Bever, and Garrett to explore the idea that heuristic rules, not representations of linguistic rules, govern language use. (p. 209)

Unlike the authors of standard texts in computational linguistics, Devitt takes this to be a powerful argument against models that make use of internally represented grammars. But this is a mistake. The stark inefficiency of simplistic models does not constitute grounds for rejecting every model that is committed to internal representations of a grammar.32

Computational linguists have devised an array of strategies for minimizing the inefficiency that Devitt points to. Some of these are best characterized as implementations of Minimal Attachment, Late Closure, and the Minimal Chain Principle (§§2.3, 2.4). Others, like the Earley and CKY algorithms, employ clever computational tricks, such as the storage and reuse of partial solutions (§3.1). Still others have a different flavor, making heavy use of statistical methods to reduce error and avoid ambiguity. For instance, a common approach involves enriching the context-free phrase structure rules discussed above with information about the probability of their application in a given context. Probabilistic grammars encode the frequency with which lexical items, syntactic categories, and even phrase structures appear in a corpus. It has been hypothesized that such frequencies have an explanatory relation (as yet not fully understood) to the ambiguity resolution principles employed by the HSPM.33 For present purposes, the important point is that in order to implement such frequency-based principles, a model must still draw on internal representations of a grammar.34 The probabilities have to be attached to something, and that "something" must be represented, either explicitly or implicitly. Thus, statistical methods are not alternatives to the parsing strategies that make use of an internally represented grammar. Rather, they are extensions of those strategies.

Similarly, in the early stages of their development, principle-based parsers faced a number of challenges, centering mostly on issues to do with computational inefficiency. But here, too, impressive gains in efficiency were made possible by the judicious application of clever programming techniques, such as the "coroutining," "interleaving," and formal compilation of Government and Binding principles.35

32 Incidentally, the approach that Devitt mentions—i.e., the heuristic strategies proposed by Fodor, Bever, and Garrett (1974)—is known to be untenable. As Pritchett (1992: pp. 22–26) points out, their "canonical sentoid strategy" and associated heuristics are simply not adequate for explaining a wide range of processing data. Psycholinguists have abandoned Fodor, Bever, and Garrett's approach, pursuing more promising avenues of research, from which a variety of processing principles have emerged, including those discussed in §2. These cut down the parser's search space, but in a way that presupposes the parser's internal representation of a grammar.
We are able to parse sentences with the range of structures including Wh-movement, the Binding Theory, Quantifier Scoping, the BA-construction to complex NP (clausal, possessive, and numeral/classifier). All testing sentences are correctly analyzed: LF logical form representations are computed for the grammatical sentences and the ungrammatical ones are ruled out, with linguistic principle violation(s) shown. Each parse takes no more than 2 seconds on a Sparc 10 workstation. (Yang and Berwick, 1996: p. 370. Note that today's processors would require only a fraction of the time. –D.P.)

In short, the models currently being explored in both psycholinguistics and computational linguistics are premised on the idea that structure rules are represented and used "as data" for the purpose of parsing and comprehension. Contrary to some of the remarks in Devitt (2006), this idea has not been abandoned. Rather, it has been taken as a starting point, to which various modifications are made, in an effort to increase both efficiency and psychological plausibility.
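The point that probabilities must attach to something can be put concretely. In the toy Python sketch below (the rule set and the probabilities are invented for illustration), a probabilistic grammar just is an explicitly represented rule set with numbers attached; computing a derivation's probability requires looking those very rules up.

```python
# Minimal sketch: a probabilistic CFG still contains an explicit
# representation of its rules; the probabilities are attached to them.

PCFG = {
    ("S",  ("NP", "VP")): 1.0,
    ("NP", ("Det", "N")): 0.6,
    ("NP", ("N",)):       0.4,
    ("VP", ("V", "NP")):  1.0,
}

def parse_probability(derivation):
    """Multiply the probabilities of the rules used in a derivation."""
    p = 1.0
    for rule in derivation:
        p *= PCFG[rule]   # the rule itself must be represented to look it up
    return p

d1 = [("S", ("NP", "VP")), ("NP", ("Det", "N")),
      ("VP", ("V", "NP")), ("NP", ("N",))]
assert abs(parse_probability(d1) - 0.24) < 1e-9
```

Nothing in the statistical machinery replaces the grammar; the frequencies ride on top of it, which is why such methods extend, rather than supplant, parsing strategies that use a represented grammar.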

33 See Jurafsky (2003) for discussion.
34 Jurafsky and Martin (2008: ch. 14) and Hale (1999, 2001) provide working examples of such models.
35 See Berwick (1991a,b) for in-depth discussion of these techniques.

3.5 Connectionist Models of Sentence Processing

In recent years, a number of connectionist models of sentence processing have been developed.36 The pioneering work of Elman (1992) led to a number of follow-up studies, which yielded interesting results concerning the ability of simple recurrent networks to perform as if they were explicitly representing grammatical dependencies, without actually doing so. There are, of course, familiar worries about whether such models can be scaled up to achieve broad coverage, or to perform tasks that are more realistic (from the point of view of psychology) than predicting the lexical category of a word on the basis of prior input. In light of our conclusions in §2, it is reasonable to suppose that no connectionist model can attain psychological plausibility without computing phrase markers for a wide range of natural language inputs.

Fortunately, such models are, in fact, available. The PRISM model presented in Hale (1999) is a connectionist implementation of the CKY algorithm discussed earlier (§3.1). Hale demonstrates how a context-free grammar, suitably represented, can be used by a connectionist network to compute syntactic structures. Adopting a hybrid classical-connectionist approach, Stevenson (1994) presents a model in which lexical items compete for attachment in a syntactic structure—a competition governed by connectionist principles, but in accordance with an internally represented phrase structure grammar. Stevenson's model provided the basis for the more recent work reported in Stevenson and Smolensky (2006), where the authors argue that "an [Optimality-Theoretic] grammar that is well motivated from the perspective of theoretical syntax can explain online parsing preferences of comprehenders, as evidenced by empirical data on the processing of sentences which, at intermediate positions, have various structural ambiguities" (p. 829). Finally, Gerth and beim Graben (2009) mobilize a representation of a Minimalist grammar in a connectionist parsing model that achieves some measure of psychological plausibility, again by replicating the processing difficulties that we find in human comprehension.
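For readers unfamiliar with the CKY algorithm that PRISM implements, here is a minimal classical version in Python. The grammar, lexicon, and sentences are invented toy examples; the point is only that the chart-based routine consults an explicitly represented grammar and stores and reuses partial solutions.

```python
# Bare-bones CKY recognizer over a toy grammar in Chomsky Normal Form.
# The chart stores partial analyses so they are computed only once.

GRAMMAR = {                       # (left child, right child) -> parent
    ("NP", "VP"): "S",
    ("V", "NP"): "VP",
}
LEXICON = {"dogs": {"NP"}, "chase": {"V"}, "cats": {"NP"}}

def cky_recognize(words):
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):          # seed the chart from the lexicon
        chart[i][i + 1] = set(LEXICON[w])
    for span in range(2, n + 1):           # build longer spans from shorter ones
        for i in range(n - span + 1):
            for k in range(i + 1, i + span):
                for a in chart[i][k]:
                    for b in chart[k][i + span]:
                        if (a, b) in GRAMMAR:
                            chart[i][i + span].add(GRAMMAR[(a, b)])
    return "S" in chart[0][n]              # did any analysis span the input?

assert cky_recognize(["dogs", "chase", "cats"])
assert not cky_recognize(["chase", "dogs", "chase"])
```

Whatever the implementing medium, classical or connectionist, the routine only works because the grammar is available to it, suitably represented, at every step.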
Connectionist research has, moreover, enriched our stock of tools for assigning interpretations to the internal states of neural networks. Statistical techniques such as Principal Components Analysis reveal how well-defined regions of a network's vector space can track abstract linguistic properties, e.g., information about a verb's subcategorization frame. Moreover, in a landmark study, Tabor and Tanenhaus (1999) demonstrated how concepts from dynamical systems theory can increase our understanding of what interpretations can be reasonably assigned to the activation vectors of the hidden nodes in a connectionist network.37 All in all, the interpretive limitations that we currently face seem unlikely to constitute a principled problem; such limitations will almost certainly be overcome as research progresses. And the more we learn about how to interpret the inner workings of successful connectionist models, the more opportunity we will have to see that their success is due to the fact that they represent grammatical principles, either implicitly or explicitly.38 Progress toward this goal is aided by productive efforts on the part of philosophers to spell out the conditions on implicit representation. Drawing on the particularly helpful taxonomy developed by Davies (1995), I will argue in the next section that it is incorrect to say that connectionist networks must, by their very nature, fail to represent a grammar. Hence, even if connectionist models ultimately capture the empirical findings about human parsing performance better than competing classical models, it would still not be correct to say, as Devitt (2006) does, that there is "no sign of the structure rules of the language governing the process of comprehension" (240).

36 See Rohde (2002), ch. 2, for a historical overview.
37 To be fair, it should be noted that Tabor and Tanenhaus would strongly resist the claim that their models contain internal representations of grammatical principles; they believe that such models are tracking only complex statistical properties of the items in the corpus of training data. I am not persuaded that their experiments reveal this. Putting aside serious worries about their methodology—particularly the construction of their training corpus—it remains the case that statistical frequencies in any natural language corpus are confounded with a wide range of syntactic, semantic, and pragmatic effects. Further experimental work will be needed to decide this issue.
38 See, e.g., Lawrence, Giles, and Fong (2000) for an attempt to extract rules, in the form of deterministic finite state automata, from a variety of recurrent neural networks that have been trained to distinguish grammatical from ungrammatical sentences.

§4. Representation, Functionalism, and Subpersonal Psychology

Recall that Devitt (2006) argues against the representational thesis (RT), repeated here:

(RT) A speaker of a language stands in an unconscious or tacit propositional attitude to the rules or principles of the language, which are represented in her language faculty (p. 4).

The thesis I wish to defend is a modified version of RT that eschews Devitt's relational conception of representation, as well as his appeal to the notion of a propositional attitude. The relational conception emerges most clearly in the following passage, which is the only characterization of representation that one finds in Devitt (2006).

[T]alk of representing rules raises a question: What sense of 'represent' do I have in mind in RT? The sense is a very familiar one illustrated in the following claims: a portrait of Winston Churchill represents him; a sound, /the President of the United States/, represents George W. Bush; an inscription, 'rabbit', represents rabbits; a certain road sign represents that the speed limit is 30 mph; the map on my desk represents the New York subway system; the number 11 is represented by '11' in the Arabic system, by '1011' in the binary system, and by 'XI' in the Roman system; and, most aptly, a (general-purpose) computer that has been loaded up with a program represents the rules of that program. Something that represents in this sense has a semantic content, a meaning. When all goes well, there will exist something that a representation refers to. But a representation can fail to refer; thus, nothing exists that 'James Bond' or 'phlogiston' refer to. Finally, representation in this sense is what various theories of reference—description, historical-causal, indicator, and teleological—are attempting to partly explain. (p. 5)

In this section, I explain why propositional attitudes are not the kind of state involved in the early stages of language comprehension and I sketch a framework for thinking about the kinds of representations that do play a role in sentence processing. Furthermore, I argue that what Devitt (2006) calls "merely embodied" rules should be seen as represented rules.

4.1 Representation and the Personal/Subpersonal Distinction

I take as my starting point a functional role theory of representation, according to which the representational properties of an event, state, or process are exhaustively determined by three factors: i) the environmental conditions under which it is typically elicited, ii) the causal relations it typically bears to other events, states, or processes, and iii) its typical behavioral consequences. Contrary to the standard versions of behaviorist, covariationist, and information-theoretic accounts of representation, I believe that none of the three factors is individually sufficient; only a combination of all three can be both necessary and sufficient for something to have the representational properties that it does.39

Following Sellars (1963a), I hold that the functional role theory of intentional content applies, in the first instance, to speech acts. But the need to account for speech and rational action gives rise to the theoretical posit of internal states—the propositional attitudes. These are personal-level states, whose role in a creature's cognitive economy is captured by a suitable formulation of folk psychology.40 The folk-psychological posit of propositional attitudes suffices only for a relatively coarse-grained way of describing a creature. Digging deeper, one wants to know how a creature can so much as have such states. Here, the strategy of attributing subpersonal mechanisms becomes useful. Focusing specifically on the case of language comprehension, we can say the following: Whereas folk psychology lets us talk about the sensation of a sound giving rise to an act of linguistic comprehension—judging that a speaker said that p—cognitive psychology tells us what happens between the sensation and the judgment.

Dennett (1987) draws a useful distinction between taking "the intentional stance"—i.e., using folk psychology to predict, explain, and describe a creature's behavior—and adopting "the design stance," which involves thinking of a creature as an aggregate of purposeful mechanisms, each of which has the function of performing a specialized task. It's from the design stance that we attribute subpersonal states to the HSPM. These states have some of the features that we take to be characteristic of personal-level states. In particular, they bear systematic relations to the environment, to behavior, and to one another. This is what makes it both reasonable and useful to think of them as representations—i.e., to interpret them in something like the way that we interpret the speech acts we find in a syntax textbook. Doing so allows us to abstract away from the largely unknown neural mechanisms that underpin subpersonal states and to see the causal relations between these states as resembling the inferential relations that hold between propositional attitudes. This, in turn, allows us to rationalize subpersonal mechanisms—i.e., to understand them as being engaged in purposeful activities and as taking reasonable steps toward accomplishing their goals. These similarities between personal and subpersonal states make it tempting to think of the latter as a kind of propositional attitude. But that would be a mistake, for there are salient and well-known differences between the two kinds of state.

First, subpersonal states are not "inferentially integrated" with personal-level states. We cannot draw personal-level inferences whose premises are the rules or principles of the grammar that we subpersonally represent (or, to use Chomsky's neologism, "cognize").41 Second, subpersonal states like those involved in the early stages of language comprehension are not expressible in speech. Third, subpersonal states are always nonconscious, whereas personal-level states are sometimes conscious and sometimes not. (For this reason, the conscious/nonconscious distinction should not be confused with the personal/subpersonal distinction.) Fourth, subpersonal states are susceptible to a computational description, whereas there are well-known and potentially insurmountable problems—loosely captured under the label "the frame problem"—for the enterprise of giving a computational description of personal-level states.42 Fifth, there is no reason to believe that subpersonal states are composed of concepts, in any sense of that vexed term, whereas personal-level states are paradigmatically conceptual.

Finally, subpersonal states are best characterized by their natural functions—the purpose for which they were selected—whereas the possession of a great many personal-level states (e.g., thoughts about quarks) cannot be explained by a straightforward appeal to natural selection. As Dennett makes clear, adopting the design stance involves assuming only that a system has a purpose, and that it is not currently malfunctioning, whereas taking the intentional stance involves making more weighty assumptions—e.g., that the system is a rational agent whose terms mostly refer, whose judgments are mostly true, and whose inferences are mostly good.

Functionalism in the philosophy of mind is arguably the reigning orthodoxy. But there are, broadly speaking, two kinds of functionalism, and it matters for present purposes which of the two we adopt. One kind stems from the work of Sellars and Lewis, who emphasize personal-level states and folk-psychological descriptions, pitched from the intentional stance. The other, which stems from the early work of Hilary Putnam and Jerry Fodor, deals with computational states, of the sort the cognitive psychologist posits. The latter brand of functionalism has led philosophers astray, by blurring the distinction between personal and subpersonal states and making it seem as though the states involved in language processing are propositional attitudes.43 Chomsky (2000), Matthews (2007) and Collins (2008: ch. 5) expose the errors of this view.

One such error is embodied in a widespread commitment to a relational view of representation, according to which representation is a special kind of causal or nomological link between a symbol and something in the world. The project of naturalizing this "relation" has given rise to no shortage of theories, all of which are known to face grave difficulties. (See §1, fn. 5.) I believe that the classic work of Sellars (1974), as well as the more recent views of Brandom (1998), Chomsky (2000), and Matthews (2007), provides compelling grounds for a non-relational view of representation. On this view, to say of some state S that it represents X is not to claim that S bears some special relation to X.

39 For compelling arguments to this effect, see Brandom (1998).
40 Lewis (1972).
41 See Stich (1978) for further elucidation of this point.
42 See Fodor (1983: p. 107). See also Putnam (1988) and Brandom (2008), ch. 3.
Rather, to say that S represents X is to mark that state as belonging to a particular type—i.e., as playing a distinctive role in the creature's cognitive and behavioral economy. There are, of course, many interesting relations between a creature's psychological states and its environment. And such relations may even turn out to be key ingredients in our account of those states' representational properties. (Indeed, on the functional role view that I favor, these relations may well constitute two-thirds of such an account.) Nevertheless, there is no single, unified "representation relation." No mind-world relation deserves that title, regardless of its causal, nomological, or naturalistic credentials.

I extend this view to the notion of reference. Referring is best seen as something that people (and perhaps other creatures) do—a kind of communicative act, not a relation between linguistic expressions and objects, properties, or events.44 I urge that we see representation as a more inclusive notion than reference. Whereas talk of reference has its home in personal-level descriptions of speech acts and propositional attitudes, extending the notion to other cases—e.g., tree rings, computer programs, maps, pictures, and subpersonal psychological states—yields awkward consequences. ("The tree ring refers to the age of the tree." "The portrait refers to Plato.") By contrast, the notion of representation extends comfortably to all such cases.

43 A particularly striking instance of this conflation can be found in Fodor, Bever, and Garrett (1974), ch. 1.
44 To be clear, no "idealism" or "anti-naturalism" is in play here. Naturalism should not be held hostage to the view that any of the numerous relations that we bear to the extramental world in perception, thought, and behavior is usefully labeled "the reference relation." Nor need there be such a relation for us to sustain the eminently reasonable doctrine of scientific realism. See Devitt (1997) and Chomsky (2000) for two defenses of these claims.

4.2 Computational Psychology and Embodied Representation

Taking on board the main claims of the foregoing discussion, let us restrict our focus solely to the subpersonal level of description, and employ a view of representation as "functional classification" (to use Sellars' handy term). We are now in a position to be more precise about the kinds of commitments that a psychologically plausible computational model might make about the representations involved in language processing.

Stabler (1983) defines a variety of claims that can be made within the framework of computational psychology. On his view, a computational system is one that goes into a physical state that represents the output of some function whenever the system is in a state that represents the corresponding input to that function. We say that a system computes a function when this causal pattern is "regular and predictable" enough for it to be "convenient" for us to so describe it; the description is used because it is "clear and useful." (The quoted phrases are from Stabler, fn. 1.) In Stabler's terms, saying that a system computes a function is giving a first-level theory. This doesn't tell us how the system performs the computation. If we want a more substantial description of the system, we might specify the program that the system uses in computing the function. The program consists of the instructions that the machine carries out at each step between the input and the output. Each instruction is associated with a more basic function, and the sequential computation of basic functions produces the output of the target function. Specifying such a program is giving a second-level theory. Some systems—e.g., the modern-day personal computer—compute many functions by representing, in their memory banks, the very programs that they are computing. Such systems have "control states" that encode the instructions of a program and are causally involved in the inner workings of the machine. Saying which programs are encoded in the system (rather than merely computed by it) is giving a third-level theory.45 Importantly, not all computers are "stored program" systems of this kind. Some are hardwired circuits, for which we can give a second-level but not a third-level theory. Such circuits can be said to "embody," rather than to encode, a program.
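Stabler's contrast between stored-program and hardwired computation can be illustrated with a deliberately trivial Python example (the "program" here is invented). The two devices below compute the same function, and so receive the same first-level theory; but only the first contains an encoded program for a third-level theory to describe.

```python
# Sketch of Stabler's levels: the same function computed by a
# stored-program system that ENCODES its instructions as data, and by a
# "hardwired" system that merely EMBODIES them.

PROGRAM = [("add", 3), ("mul", 2)]   # explicitly encoded instructions

def stored_program_machine(x):
    # The control unit reads the encoded program as data, step by step.
    for op, arg in PROGRAM:
        if op == "add":
            x = x + arg
        elif op == "mul":
            x = x * arg
    return x

def hardwired_machine(x):
    # Same input-output function, but no instruction is stored anywhere:
    # the "program" is embodied in the fixed structure of the device.
    return (x + 3) * 2

assert all(stored_program_machine(n) == hardwired_machine(n) for n in range(10))
```

A first-level theory says only that both compute f(x) = 2(x + 3); a second-level theory describes the steps each takes; only the first device supports a third-level theory, since only it encodes its program.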

45 This is what Davies (1995) means by "explicit representation." Davies also draws a distinction between two notions of implicit representation, both of which have their home in what Stabler would call a second-level theory.

If we say that a grammar is psychologically real in the sense that (i) a speaker computes a function from linguistic stimuli to judgments about what was said, and (ii) the grammar is true of those stimuli, then we are offering a first-level theory—or what Devitt (2006) calls "position (M)" ('M' for 'minimal'). This theory would be true, as far as it goes, but it wouldn't go very far; first-level theories are almost totally uninformative, telling us only what rules a device conforms to, extensionally. Following Chomsky, we should aim for a more interesting thesis to the effect that grammatical rules and principles are mentally represented, that language processing is governed by the rules of the grammar (rather than merely conforming to them), and that the mentally represented grammar is used to generate mental phrase markers, in the causal sense of the term. Stabler's three-level framework allows us to distinguish two forms that such a thesis might take. According to what I will call the "strong thesis," we can give a third-level theory of the HSPM, on which it contains an explicit representation of a grammar and draws on this representation "as data" in the course of comprehension. By contrast, the "weak thesis" has it that the HSPM is susceptible only to a second-level theory, on which the grammar is embodied but not encoded in the hardwired circuitry of the brain.46 Now, a number of leading figures in psycholinguistics have expressed what appears to be a commitment to the strong thesis. Consider, for instance, the following passage from Frazier and Fodor (1978), in which the authors draw out the consequences of their parsing model.
when making its subsequent decisions, the executive unit of the parser refers to the geometric arrangement of nodes in the partial phrase marker that it has already constructed. It then seems unavoidable that the well-formedness conditions on phrase markers are stored independently of the executive unit, and are accessed by it as needed. That is, the range of syntactically legitimate attachments at each point in a sentence must be determined by a survey of the syntactic rules for the language, rather than being incorporated into a fixed ranking of the moves the parser should make at that particular point... (322n, emphases added).

But, as Stabler (1983) points out, it's provable that any computation performed by a stored-program computer can also be performed by an assembly of hardwired circuits that don't encode any instructions—a system for which we can give only a second-level theory. Hence, it may well be that the HSPM is such a device, in which case only the weak thesis is warranted. Moreover, Stabler goes on to argue that we do not, at present, have any behavioral or neurophysiological data to support the stronger of the two theses. On the basis of Stabler's conclusions, Devitt (2006) concludes that we have no grounds for supposing that a grammar is mentally represented. It is this last inference that I shall challenge in the remainder of the present discussion.

Suppose that Stabler were correct in saying that we have no persuasive evidence for the strong thesis that a grammar is explicitly encoded in the brain. We would then retreat to the weaker thesis that a grammar is merely embodied. But would it follow from this weaker thesis that the rules of the grammar are not mentally represented, as Devitt concludes? I think it would not. The inference from 'embodied' or 'not encoded' to 'not represented' is, in general, unwarranted. There are important distinctions to be made between kinds of representation, even at the subpersonal level of description. Encoding is, of course, a species of representation—what we might call "explicit representation"—but it is not the only one. There is, in addition, a notion of implicit representation of rules, which Davies (1995) elaborates as follows: A device that effects transitions between a set of inputs and their respective outputs can be said to have an implicit representation of a rule just in case the device contains a state or a mechanism that serves as a common causal factor in all of those transitions. In a series of clever examples, Davies demonstrates that there are ways of satisfying this definition which are logically weaker than explicit encoding but, at the same time, stronger than mere "conformity" with a rule.

46 The strong and the weak theses bear a close resemblance to what Stabler (1983) calls "(Hd)" and "(H2)" respectively and also to what Devitt (2006) calls "position (ii)" and "position (iii)" respectively. There are, however, subtle differences between these claims. First, it's possible that Stabler would not take (Hd) to carry any commitment to a third-level theory, though his reasons for this are obscure, as many authors in the BBS peer commentary point out. Second, Devitt sees position (iii) as entailing the claim that a grammar is not mentally represented, whereas it's not clear that Stabler would agree. Indeed, I argue below that he should not agree.
Armed with Davies' account of implicit representation, we may conclude that the rules of a grammar may be represented in a system, even if that system is susceptible only to what Stabler calls a second-level theory. As long as it has the right kind of structure—i.e., one that admits of explanations that appeal to common causal factors—even a hardwired circuit that computes a function from linguistic stimuli to judgments about what speakers express can be both usefully and reasonably interpreted as representing a grammar, albeit implicitly. This conclusion, significant in its own right, has a bearing on how we regard the connectionist models of sentence processing discussed above (§3.5). For, in many cases, it is useful to think of the trained and frozen connectionist network as an instance of a hardwired circuit, susceptible only to a second-level theory.

I end by addressing one final point in Devitt's argument against RT. Devitt (2006) claims that the structure rules of a language—i.e., grammatical rules and principles—may be the "wrong sort of rules to govern [the process of comprehension]" (53). Plainly, quite a bit hangs on how we construe his use of the term 'govern'. On one reading, Devitt's claim is certainly plausible. For, as we have seen (§3.1, fn. 24), the structure rules of a language are not to be confused with the processing rules that constitute a parsing algorithm. Rather, the structure rules are treated as data by such algorithms. This, in turn, yields a different reading of 'govern'—which Devitt also recognizes—according to which grammatical principles govern sentence processing by being used as premises in the "deductions" that constitute a parsing routine.48

Devitt seems to assume that grammatical principles can only be drawn on as data if they are explicitly represented. But this assumption is groundless.

47 As noted above (fn. 45), Davies makes further distinctions within the category of implicit representation. Moreover, his account has close affinities with the one developed in Peacocke (1989). I will pass over these subtleties here. Note also that Devitt (2006) cites Davies' work with approval (p. 52, fn. 10).
To see this, consider a device, D, that computes mental phrase markers by implementing the Earley or the CKY algorithm (§3.1), which draws on explicitly represented grammatical rules as data. Suppose that D is susceptible to a third-level theory, such that its parsing algorithm is likewise explicitly represented. As mentioned above, it is provable that one can construct a hardwired circuit, D*, that performs the same computations as D, without explicitly representing the algorithm. For present purposes, the important point is that constructing D*—i.e., "hardwiring" the original algorithm—involves also "hardwiring" all of the data structures on which that algorithm drew. Hence, the grammatical rules that were used as data in D would still be used as data in D*, though they would be implicitly represented—i.e., embodied, rather than encoded.
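The D/D* contrast can be made vivid with a toy Python sketch (the rule table is invented, and real hardwiring would of course involve circuitry, not code). D consults an explicitly represented rule table; D* computes the very same transitions with no table anywhere, yet the same rules remain a common causal factor in its behavior: embodied rather than encoded.

```python
# Toy sketch of the D / D* contrast (rule table invented for illustration).

RULES = {("Det", "N"): "NP", ("V", "NP"): "VP"}   # explicit data in D

def device_D(left, right):
    # The rules are drawn on "as data": an explicit lookup.
    return RULES.get((left, right))

def device_D_star(left, right):
    # The same rules "hardwired" into the control structure: nothing here
    # is read off as data, but the rules are still a common causal factor
    # in every transition the device makes.
    if (left, right) == ("Det", "N"):
        return "NP"
    if (left, right) == ("V", "NP"):
        return "VP"
    return None

for pair in [("Det", "N"), ("V", "NP"), ("N", "V")]:
    assert device_D(*pair) == device_D_star(*pair)
```

Only a third-level theory is true of D, since only D encodes its rules; but on Davies' criterion both devices represent them, D explicitly and D* implicitly.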

Conclusion

We saw in §2 that a range of behavioral and neurophysiological data supports the claim that the HSPM constructs mental phrase markers in the course of comprehension. In §3, we surveyed a number of psychologically plausible computational models of sentence processing. I argued that all such models—both classical and connectionist—draw on representations of the rules or principles of a grammar. Finally, in §4 I sketched a framework for thinking about these representations, arguing that they are best seen as subpersonal states, whose representational properties are determined by their functional role in a computational system. I distinguished between explicit and implicit representation and argued that Devitt (2006) is wrong to think that implicit representations cannot serve as data for an algorithm. I conclude that Devitt's skepticism concerning the psychological reality of grammars cannot be sustained.49*

48 Devitt calls this "position (ii)" on the psychological reality issue.
49* I am indebted to a great many people for very helpful conversations on the topics addressed here, as well as for comments on previous drafts of this paper. Many thanks to Jake Berger, Jennifer Corns, Michael Devitt, Janet Dean Fodor, Jon Golin, Bob Matthews, Mike Nair-Collins, Georges Rey, David Rosenthal, Ed Stabler and all of the participants in the 2010 "Mental Phenomena" conference at the IUC in Dubrovnik, Croatia. Thanks also to Dunja Jutronić, both for organizing the conference and for her help and encouragement throughout the publication process.

References

Abney, S. P. and Johnson, M. (1991). "Memory Requirements and Local Ambiguities of Parsing Strategies," Journal of Psycholinguistic Research, 20 (3), 233–250.
Berwick, R. C. and Weinberg, A. S. (1984). The Grammatical Basis of Linguistic Performance, Cambridge: MIT Press.
Berwick, R. C. (1991a). "Principles of Principle-based Parsing," in Principle-Based Parsing: Computation and Psycholinguistics, Berwick, R. C., Abney, S. P., and Tenny, C., eds., Dordrecht: Kluwer Academic Publishers, 1–37.
Berwick, R. C. (1991b). "Principle-Based Parsing," in Foundational Issues in Natural Language Processing, P. Sells, S. M. Shieber, T. Wasow, eds., Dordrecht: Kluwer Academic Publishers, 115–226.
Berwick, R. C. (1997). "Syntax Facit Saltum: Computation and the Genotype and Phenotype of Language," Journal of Neurolinguistics, Vol. 10, No. 2–3, 231–249.
Berwick, R. C., Abney, S. P., and Tenny, C., eds. (1991). Principle-Based Parsing: Computation and Psycholinguistics. Dordrecht: Kluwer Academic Publishers.
Bever, T. G., Carroll, J. M., and Miller, L. A., eds. (1984). Talking Minds: The Study of Language in Cognitive Science. Cambridge: MIT Press.
Block, N. (1986). "Advertisement for a Semantics for Psychology," in French, P. A., et al., eds., Midwest Studies in Philosophy, Vol. X. Minneapolis: Univ. of Minnesota Press, 615–78.
Bock, K. and Loebell, H. (1990). "Framing sentences," Cognition, 35, 1–39.
Bornkessel-Schlesewsky, I. and Schlesewsky, M. (2009). Processing Syntax and Morphology: A Neurocognitive Perspective, Oxford University Press.
Brandom, R. (1998). Making It Explicit. Harvard University Press.
Brandom, R. (2008). Between Saying and Doing. Oxford University Press.
Bresnan, J. (2001). Lexical-functional Syntax, Oxford: Blackwell.
Carruthers, P. (2006). "The Case for Massively Modular Models of the Mind," in Contemporary Debates in Cognitive Science, R. Stainton (ed.), Malden: Blackwell, 3–21.
Cain, M. J. (2010). "Linguistics, Psychology and the Scientific Study of Language," dialectica, Vol. 64, No. 3, 385–404.
Chomsky, N. (1957/2002). Syntactic Structures, 2nd edition. De Gruyter Mouton.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge: MIT Press.
Chomsky, N. (1980). Rules and Representations. New York: Columbia University Press.
Chomsky, N. (1995). The Minimalist Program. Cambridge: MIT Press.
Chomsky, N. (2000). New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.
Collins, J. (2008). Chomsky: A Guide for the Perplexed. London: Continuum Press.
Coltheart, M. (1999). “Modularity and cognition,” Trends in Cognitive Sciences, 3, 115–120.

Crocker, M. W., Pickering, M., and Clifton, C., eds. (2000). Architectures and Mechanisms for Language Processing. Cambridge: Cambridge University Press.
Culbertson, J. and Gross, S. (2009). “Are Linguists Better Subjects?” British Journal for the Philosophy of Science, 60, 721–736.
Cummins, R. (1991). Meaning and Mental Representation. Cambridge: MIT Press.
Cummins, R. (1996). Representations, Targets, and Attitudes. Cambridge: MIT Press.
Davies, M. (1987). “Tacit Knowledge and Semantic Theory: Can a Five Percent Difference Matter?” Mind, 96, 441–62.
Davies, M. (1989). “Tacit Knowledge and Subdoxastic States,” in Epistemology of Language, George, A., ed., Oxford University Press, 131–52.
Davies, M. (1995). “Two Notions of Implicit Rules,” Philosophical Perspectives, 9: AI, Connectionism, and Philosophical Psychology, Tomberlin, J. E., ed., Cambridge: Blackwell.
Dennett, D. C. (1987). The Intentional Stance. Cambridge: MIT Press.
De Vincenzi, M. (1991). “Filler-Gap Dependencies in a Null Subject Language: Referential and Non-Referential WHs,” Journal of Psycholinguistic Research, Vol. 20, No. 3, 197–213.
Devitt, M. (1997). Realism and Truth, 2nd edition. Princeton: Princeton University Press.
Devitt, M. (2006). Ignorance of Language. Oxford: Clarendon Press.
Dretske, F. (1981). Knowledge and the Flow of Information. Cambridge: MIT Press.
Elman, J. L. (1992). “Grammatical Structures and Distributed Representations,” in Connectionism: Theory and Practice, Vol. 3, Davis, S., ed., Oxford University Press.
Featherston, S. (2001). Empty Categories in Sentence Processing. Amsterdam: John Benjamins Publishing Company.
Fodor, J. A. (1968). “The Appeal to Tacit Knowledge in Psychological Explanation,” The Journal of Philosophy, 65 (20), 627–640.
Fodor, J. A. (1983). The Modularity of Mind. Cambridge: MIT Press.
Fodor, J. A. (1987). Psychosemantics. Cambridge: MIT Press.
Fodor, J. A. (1990). A Theory of Content, and Other Essays. Cambridge: MIT Press.
Fodor, J. A. (2002). The Mind Doesn’t Work That Way. Cambridge: MIT Press.
Fodor, J. A., Bever, T., and Garrett, M. (1974). The Psychology of Language. New York: McGraw-Hill.
Fodor, J. D. (1989). “Empty Categories in Sentence Processing,” Language and Cognitive Processes, 4, 155–209.
Fodor, J. D. (1995). “Comprehending Sentence Structure,” in An Invitation to Cognitive Science: Volume 1, Language, Gleitman and Liberman, eds., Cambridge: MIT Press, 209–246.
Frazier, L. (1979). On Comprehending Sentences: Syntactic Parsing Strategies, PhD Dissertation, available at: http://digitalcommons.uconn.edu/dissertations/AAI7914150/
Frazier, L. and Clifton, C. (1989). “Successive Cyclicity in the Grammar and the Parser,” Language and Cognitive Processes, 4 (2), 93–126.

Frazier, L. and Fodor, J. D. (1978). “The Sausage Machine: A New Two-Stage Parsing Model,” Cognition, Vol. 6, 291–325.
Friederici, A. D., Pfeifer, E., and Hahne, A. (1993). “Event-related brain potentials during natural speech processing: Effects of semantic, morphological, and syntactic violations,” Cognitive Brain Research, 1, 183–92.
Friederici, A. D., Hahne, A., and Saddy, D. (2002). “Distinct Neurophysiological Patterns Reflecting Aspects of Syntactic Complexity and Syntactic Repair,” Journal of Psycholinguistic Research, Vol. 31, No. 1.
Gerth, S. and beim Graben, P. (2009). “Unifying syntactic theory and sentence processing difficulty through a connectionist minimalist parser,” Cognitive Neurodynamics, 3, 297–316.
Gibson, E. (1998). “Linguistic Complexity: Locality of Syntactic Dependencies,” Cognition, 68, 1–76.
Haegeman, L. (1994). Introduction to Government and Binding Theory, 2nd ed. Oxford: Blackwell.
Hale, J. (1999). “Dynamical Parsing and Harmonic Grammar,” unpublished report, available at: courses.cit.cornell.edu/jth99/prism.ps
Hale, J. (2001). “A Probabilistic Parser as a Psycholinguistic Model,” in Proceedings of the NAACL. Available at: http://www.aclweb.org/anthology/N/N01/N011021.pdf
Hale, J. (2003). Grammar, Uncertainty and Sentence Processing, PhD Dissertation, Johns Hopkins University. Available at: http://web.jhu.edu/bin/s/q/Hale_dissertation.pdf
Harkema, H. (2001). Parsing Minimalist Languages, PhD Dissertation, UCLA. Available at: http://www.linguistics.ucla.edu/people/stabler/paris08/Harkema01.pdf
Horwich, P. (1998). Meaning. Oxford: Clarendon Press.
Johnson, M. (1989). “Parsing as Deduction: The Use of Knowledge of Language,” Journal of Psycholinguistic Research, Vol. 18, No. 1, 105–128.
Jurafsky, D. (2003). “Probabilistic Modeling in Psycholinguistics: Linguistic Comprehension and Production,” in Probabilistic Linguistics, Bod, Hay, and Jannedy, eds., Cambridge: MIT Press.
Jurafsky, D. and Martin, J. H. (2008). Speech and Language Processing: An Introduction to Speech Recognition, Computational Linguistics, and Natural Language Processing, 2nd edition. Prentice Hall.
Knowles, J. (2000). “Knowledge of Grammar as a Propositional Attitude,” Philosophical Psychology, 13 (3), 325–53.
Lawrence, S., Giles, L., and Fong, S. (2000). “Natural Language Grammatical Inference with Recurrent Neural Networks,” IEEE Transactions on Knowledge and Data Engineering, Vol. 12, No. 1, 126–140.
Lewis, D. K. (1972). “Psychophysical and Theoretical Identifications,” Australasian Journal of Philosophy, 50, 249–258.
Marcus, M. P. (1984). “Some Inadequate Theories of Human Language Processing,” in Talking Minds: The Study of Language in Cognitive Science, Bever, T. G., Carroll, J. M., and Miller, L. A., eds., Cambridge: MIT Press, 253–278.
Matthews, R. J. (2007). The Measure of Mind. Oxford University Press.
Millikan, R. G. (1984). Language, Thought and Other Biological Categories: New Foundations for Realism. Cambridge: MIT Press.

Neander, K. (2006). “Naturalistic Theories of Reference,” in The Blackwell Guide to the Philosophy of Language, Devitt, M. and Hanley, R., eds., Blackwell Publishing, 374–391.
Neville, H. J., Nicol, J., Barss, A., Forster, K., and Garrett, M. F. (1991). “Syntactically based sentence processing classes: Evidence from event-related potentials,” Journal of Cognitive Neuroscience, 6, 233–55.
Nicol, J. and Swinney, D. (1989). “The role of structure in coreference assignment during sentence comprehension,” Journal of Psycholinguistic Research, 18, 5–19.
Peacocke, C. (1989). “When is Grammar Psychologically Real?” in Reflections on Chomsky, George, A., ed., Oxford: Basil Blackwell, 111–130.
Pereira, F. C. N. and Warren, D. H. D. (1983). “Parsing as Deduction,” Proceedings of the 22nd Annual Meeting of the Association for Computational Linguistics, 137–144.
Pereplyotchik, D. (forthcoming). The Psychological Import of Syntactic Theory, PhD Dissertation, City University of New York (CUNY) Graduate Center.
Phillips, C. (1994). Order and Structure, unpublished PhD thesis, MIT.
Phillips, C. and Lewis, S. (forthcoming). “Derivational Order in Syntax: Evidence and Architectural Consequences,” in Directions in Derivations, C. Chesi, ed., Elsevier.
Pickering, M. J. and Ferreira, V. S. (2008). “Structural Priming: A Critical Review,” Psychological Bulletin, Vol. 134, No. 3, 427–459.
Pietroski, P. (2008). “Think of the Children: Comments on Ignorance of Language, by Michael Devitt,” Australasian Journal of Philosophy, 86, 657–69.
Pinker, S. (1999). How the Mind Works. New York: W. W. Norton and Company.
Pollard, C. and Sag, I. A. (1994). Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Prinz, J. (2006). “Is the Mind Really Modular?” in Contemporary Debates in Cognitive Science, R. Stainton, ed., Malden, MA: Blackwell, 22–36.
Pritchett, B. (1992). Grammatical Competence and Parsing Performance. University of Chicago Press.
Putnam, H. (1988). Representation and Reality. Cambridge: MIT Press.
Rattan, G. (2002). “Tacit knowledge of grammar: a reply to Knowles,” Philosophical Psychology, 15 (2), 135–54.
Rayner, K. (1998). “Eye Movements in Reading and Information Processing: 20 Years of Research,” Psychological Bulletin, Vol. 124, No. 3, 372–422.
Rayner, K., Carlson, M., and Frazier, L. (1983). “The Interaction of Syntax and Semantics During Sentence Processing: Eye Movements in the Analysis of Semantically Biased Sentences,” Journal of Verbal Learning and Verbal Behavior, 22, 358–374.
Rey, G. (2006). “Conventions, Intuitions and Linguistic Inexistents: A Reply to Devitt,” Croatian Journal of Philosophy, 6, 549–69.
Rey, G. (2008). “In Defense of Folieism,” Croatian Journal of Philosophy, 8 (23), 177–202.
Rohde, D. L. T. (2002). A Connectionist Model of Sentence Comprehension and Production, PhD Dissertation, Department of Computer Science, Carnegie Mellon University.

Rosenthal, D. M. (2005). Consciousness and Mind. Oxford: Clarendon Press.
Schank, R. and Birnbaum, L. (1984). “Memory, Meaning, and Syntax,” in Talking Minds: The Study of Language in Cognitive Science, Bever, T. G., Carroll, J. M., and Miller, L. A., eds., Cambridge: MIT Press, 209–251.
Sellars, W. (1963a). “Empiricism and the Philosophy of Mind,” in Science, Perception and Reality. Atascadero, CA: Ridgeview Publishing, 127–196.
Sellars, W. (1963b). “Some Reflections on Language Games,” in Science, Perception, and Reality. Atascadero, CA: Ridgeview Publishing, 321–358.
Sellars, W. (1974). “Meaning as Functional Classification,” Synthese, 27, 417–37.
Shieber, S. M., Schabes, Y., and Pereira, F. C. N. (1993). “Principles and Implementation of Deductive Parsing,” Technical Report CRCT TR–11–94, Computer Science Department, Harvard University, Cambridge, Massachusetts. Available at: http://arXiv.org/
Smolensky, P. and Legendre, G., eds. (2006). The Harmonic Mind. Cambridge: MIT Press.
Stabler, E. P., Jr. (1983). “How Are Grammars Represented?” Behavioral and Brain Sciences, 6, 391–402.
Stevenson, S. (1999). “Bridging the Symbolic-Connectionist Gap in Language Comprehension,” in Lepore, E., ed., What is Cognitive Science?, Oxford: Blackwell Publishers, 336–355.
Stevenson, S. and Smolensky, P. (2006). “Optimality in Sentence Processing,” in The Harmonic Mind, Smolensky, P. and Legendre, G., eds., Cambridge: MIT Press, 307–337.
Stich, S. P. (1978). “Beliefs and Subdoxastic States,” Philosophy of Science, 45, 499–518.
Sturt, P., Pickering, M. J., Scheepers, C., and Crocker, M. W. (2001). “The Preservation of Structure in Language Comprehension: Is Reanalysis the Last Resort?” Journal of Memory and Language, 45, 283–307.
Tabor, W. and Tanenhaus, M. K. (1999). “Dynamical Models of Sentence Processing,” Cognitive Science, 23 (4), 491–515.
Weinberg, A. (1999). “A Minimalist Theory of Human Sentence Processing,” in Epstein and Hornstein, eds., Working Minimalism, Cambridge: MIT Press, 283–315.
Yang, C. and Berwick, R. C. (1996). “Principle-Based Parsing for Chinese,” in Language, Information and Computation (PACLIC 11), 363–371.