
XNL-Soar, Incremental Parsing, and the Minimalist Program
Deryle Lonsdale, LaReina Hingson, Jamison Cooper-Leavitt, Warren Casbeer, Rebecca Madsen
BYU Department of Linguistics

Abstract

• The object: a new incremental parser for cognitive modeling
  – Syntactic representation based on the Minimalist Program (Chomsky 1995)
  – Updates the prior GB (Principles & Parameters) representation
• The framework: Soar
  – Symbolic, rule-based, goal-directed cognitive modeling approach (Newell 1990)
  – Machine learning architecture (Laird et al. 1987)
  – Models on-line human language processing
• The goal: explore MP correlations with prior psycholinguistic findings in human language processing (Lewis 1993)
• The approach: process incoming lexical items incrementally via operators
  – Lexical access operator for each incoming word
  – Build MP syntactic structure via projection, merge, and movement operators
  – Constrain operators using lexical and thematic information
  – Allow strategies for limited, local structural reanalysis for unproblematic ambiguities
• Related issues:
  – Semantic interpretation incrementally mapped from syntactic structure
  – Machine learning: chunking up prior decisions and reusing them later
  – Performance issues: memory usage across time, parsing difficulties

Incremental parsing

• Human language processing is incremental
• Processes are largely lexically driven
• Word-by-word processing: words ...
  – enter the system's perceptual buffer
  – are attended to via a lexical access operator
  – disappear if unattended to after a specified duration
• Structure (syntactic and semantic) is constructed piecemeal
• Open question: is parsing within the MP incrementally feasible in a cognitively plausible way?

Syntax in the Minimalist Program (MP)

• Minimalist Principles (Chomsky 1995)
  – Build structure based on uninterpretable categorial features
  – First: head + complement
  – Second: spec + head (X-bar)
  – Move: generate grammaticality in the surface representation
    · Head–head movement
    · Phrasal movement
• Hierarchy of Projections (Adger 2003)
  – Inherent knowledge for specifying structural projections
  – Nominal: D > (Poss) > n > N
  – Clausal: C > T > (Neg) > (Perf) > (Prog) > (Pass) > v > V
• Features play a central role
  – Strong (movement) and weak (merge)
  – Feature checking
  – NP, VP symmetry, including shells

Operator trace (partial)

Example sentence: Their investigation should have exonerated the defendant.

 1: lexaccess(their)
 2: lexaccess(investigation)
 3: project(NP)
 4: hop(n)
 5: merge1(n')
 6: merge2(nP)
 7: movehead(N)
 8: hop(D)
 9: merge1(D')
10: merge2(DP)
11: lexaccess(should)
12: lexaccess(have)
13: lexaccess(exonerated)
14: lexaccess(the)
15: lexaccess(defendant)
16: project(NP)
17: hop(n)
18: merge1(n')
19: merge2(nP)
20: movehead(N)
21: hop(D)
22: merge1(D')
23: merge2(DP)
24: merge1(V')
25: merge2(VP)
26: hop(v)
27: merge1(v')
28: merge2(vP)
29: movehead(V)
30: hop(Perf)
31: merge1(Perf')
32: merge2(PerfP)
33: hop(T)
34: merge1(T')
35: merge2(TP)

(Tree snapshots of the structure after steps 1, 2–3, 5, 6, 7, 9–10, 11–15, 16–24, 25–29, 30–34, and 35 are omitted here.)

References:
  Chomsky, N. 1995. The Minimalist Program. MIT Press.
  Laird, J., A. Newell, and P. Rosenbloom. 1987. Soar: An architecture for general intelligence. Artificial Intelligence 33:1-64.
  Lewis, R. 1993. An architecturally-based theory of human sentence comprehension. PhD thesis, Carnegie Mellon University, School of Computer Science.
  Newell, A. 1990. Unified Theories of Cognition. Harvard University Press, Cambridge, MA.
  Pritchett, B. 1992. Grammatical Competence and Parsing Performance. University of Chicago Press, Chicago, IL.

Background: Soar

• General theory of human problem solving
  – Cognition: language, action, performance (in all their varieties)
  – Operator: basic unit of cognitive processing
  – Stepwise progress toward a specified goal
  – Observable mechanisms, time course of behaviors, deliberation
  – Knowledge levels and their use
• Cognitive modeling architecture instantiated as an intelligent agent
• Instantiate the model as a computational system
  – Symbolic rule-based architecture
  – Subgoal-directed problem specification
  – Operator-based problem solving
• Machine learning
• Applications: robotics, video games and simulation, tutorial dialogue, etc.
  – NL-Soar: natural language processing engine built on Soar

Background: NL-Soar

• Soar extension for modeling language use
  – Unified theory of cognition + Soar cognitive modeling system + NL components
  – Unified cognitive architecture for overall cognition, including NL
• Used specifically to model language tasks: acquisition, language use, language/task integration, etc.
• Different modalities supported
  – Parsing/comprehension: derive semantics of incoming utterances
  – Generation/production: output utterances expressing semantic content
  – Mapping: convert between semantic representations
  – Discourse/dialogue: learn and execute dialogue plans

Operators and operator types

• Various types and functions in XNL-Soar:
  – Lexical access: retrieve and store lexically-related information
  – Merge: construct structure via MP-specified merge operations
  – Movehead: perform head-to-head movement (via adjunction)
  – HoP: consult the hierarchy of projections, return the next possible target level
  – Project: create a bare-structure maximal projection from an accessed lexical item

External knowledge sources

• WordNet 2.0 (wordnet.princeton.edu)
  – Lexical information: part-of-speech, word senses, subcategorization
  – Inflectional and derivational morphology
• English LCS lexicon (www.umiacs.umd.edu/~bonnie/verbs-English.lcs)
  – Thematic information: θ-grids, θ-roles
  – Used to derive uninterpretable features
  – Triggers syntactic construction
  – Aligned with WordNet information

WordNet data

Overview of exonerate:

  The verb exonerate has 1 sense (first 1 from tagged texts)

  (1) acquit, assoil, clear, discharge, exonerate, exculpate -- (pronounce not guilty of criminal charges; "The suspect was cleared of the murder charges")

  Semantic class: verb.communication
  Verb frames:
    Somebody ---s somebody.
    Somebody ---s somebody of something.

English LCS data

  10.6.a#1#_ag_th,mod-poss(of)#exonerate#exonerate#exonerate#exonerate+ed#(2.0,00874318_exonerate%2:32:00::)

• "10.6.a" = "Verbs of Possessional Deprivation: Cheat Verbs/-of"
• WORDS (absolve acquit balk bereave bilk bleed burgle cheat cleanse con cull cure defraud denude deplete depopulate deprive despoil disabuse disarm disencumber dispossess divest drain ease exonerate fleece free gull milk mulct pardon plunder purge purify ransack relieve render rid rifle rob sap strip swindle unburden void wean)
• θ-grid options: ((1 "_ag_th,mod-poss()") (1 "_ag_th,mod-poss(from)") (1 "_ag_th,mod-poss(of)"))
• Example frames: "He !!+ed the people (of their rights); He !!+ed him of his sins"
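To make the LCS record format concrete, here is a minimal Python sketch that splits a '#'-delimited English LCS entry like the exonerate record above and expands its θ-grid codes into role labels. The field interpretations and the ROLE_NAMES mapping are assumptions for illustration only; the actual XNL-Soar knowledge-source interface is written in Tcl/Perl.

```python
# Illustrative sketch, not XNL-Soar's actual Tcl/Perl interface code.
# Field positions and the role-code mapping are assumed from the example entry.
ROLE_NAMES = {"_ag": "agent", "_th": "theme"}  # hypothetical code -> label map

def parse_lcs_entry(entry: str) -> dict:
    """Split a '#'-delimited LCS record into named parts."""
    fields = entry.split("#")
    theta_grid = fields[2]                       # e.g. "_ag_th,mod-poss(of)"
    obligatory, _, modifier = theta_grid.partition(",")
    roles = [ROLE_NAMES.get("_" + code, "_" + code)
             for code in obligatory.split("_") if code]
    return {
        "verb_class": fields[0],                 # e.g. "10.6.a"
        "theta_roles": roles,                    # e.g. ["agent", "theme"]
        "modifier": modifier or None,            # e.g. "mod-poss(of)"
        "headword": fields[3],                   # e.g. "exonerate"
    }

entry = ("10.6.a#1#_ag_th,mod-poss(of)#exonerate#exonerate#exonerate"
         "#exonerate+ed#(2.0,00874318_exonerate%2:32:00::)")
print(parse_lcs_entry(entry))
```

The θ-grid is what drives the parser's uninterpretable features, so splitting it out cleanly is the main point of the sketch.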

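The operator trace above can be sketched in a few lines of Python. This toy reproduces the operator sequence for a determiner–noun nominal (steps 14–23 of the trace) by walking the nominal Hierarchy of Projections; the control logic is a deliberate simplification standing in for XNL-Soar's Soar productions, not a transcription of them.

```python
# Toy sketch of the operator cycle for a nominal, following the poster's
# Hierarchy of Projections (Adger 2003). Operator names mirror the trace.
NOMINAL_HOP = ["N", "n", "D"]   # bottom-up slice of D > (Poss) > n > N

def parse_nominal(det: str, noun: str) -> list:
    """Emit the operator sequence the trace shows for a 'det noun' input."""
    trace = [f"lexaccess({det})", f"lexaccess({noun})", "project(NP)"]
    for target in NOMINAL_HOP[1:]:           # climb the hierarchy above N
        trace.append(f"hop({target})")        # consult HoP for the next level
        trace.append(f"merge1({target}')")    # first merge: head + complement
        trace.append(f"merge2({target}P)")    # second merge: specifier level
        if target == "n":
            trace.append("movehead(N)")       # N-to-n head movement
    return trace

print(parse_nominal("the", "defendant"))
```

Running this on "the defendant" yields exactly the hop/merge1/merge2/movehead pattern of steps 14–23, which is the regularity the HoP operator encodes.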
Rules for XNL-Soar

• XNL-Soar updates the syntactic component of NL-Soar to use the MP
• IF→THEN rules (productions)
  – If certain conditions are met, then the agent performs some action
  – Represent a priori knowledge
  – Several rule/production firings can be bundled together as operators
• Current XNL-Soar system: about 60 productions (cf. the NL-Soar system: 3,500 productions)
• External knowledge sources: interfaced via 1000+ lines of Tcl/Perl
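A minimal sketch of how IF→THEN productions fire and bundle into an operator, assuming a toy working memory of (attribute, value) pairs. Real XNL-Soar rules are written in Soar's own production syntax; the rule names and conditions below are hypothetical.

```python
# Match-fire cycle in miniature: a production fires when all of its
# condition pairs are present in working memory, then adds its action pairs.
PRODUCTIONS = [
    # hypothetical rule: a bare noun in the buffer proposes an NP projection
    {"name": "propose*project-NP",
     "if": {("buffer", "noun")},
     "then": {("goal", "project-NP")}},
    # hypothetical rule: apply the proposed projection
    {"name": "apply*project-NP",
     "if": {("goal", "project-NP")},
     "then": {("structure", "NP")}},
]

def run(working_memory: set) -> list:
    """Fire matching productions until quiescence; return the firing order."""
    fired = []
    changed = True
    while changed:
        changed = False
        for p in PRODUCTIONS:
            if p["name"] not in fired and p["if"] <= working_memory:
                working_memory |= p["then"]   # actions add new memory elements
                fired.append(p["name"])
                changed = True
    return fired

print(run({("buffer", "noun")}))
```

The propose/apply pair firing back-to-back illustrates how several production firings bundle into a single operator, as the rules panel describes.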

Our goals

• Integrate the MP into a cognitive modeling engine
• Explore language/task integrations using the MP
• Test cross-linguistic implementation possibilities with the MP
• Ultimately, determine whether the MP supports incremental, operator-based processing

Our approach

• Map the syntactic parsing task onto an operator-based framework
  – Specify goals, subgoals, etc. for parsing
  – Develop operator types for the various MP syntactic operations
  – Implement constraints, preconditions, precedence hierarchies
  – Integrate necessary and relevant external knowledge sources
• Strengths:
  – We have already done this for a prior syntactic model.
  – The MP has an operator-like feel to it.
  – The Soar operator-based framework is versatile and flexible.
• Weaknesses:
  – The MP literature does not address incremental parsing in depth.
  – The external knowledge sources are somewhat incommensurable.
  – Our knowledge of human performance data is sketchy.
• Issue: find a balance between generation and parsing
  – Most MP descriptions are generative, not recognitional, in focus.
  – Is it advisable and well motivated to "undo" or "reverse" movements?
  – If not, is generate-and-test the right mechanism for parsing input?
  – What are the implications for learning and bootstrapping language capabilities (e.g. parsing in the service of generation)?

Current status

• Proof of concept for fundamental syntactic structures
• Basic transitive sentences work; ditransitives soon
• Unergatives and unaccusatives work
• All functional and lexical projections in syntactic structure
• Feature percolation and feature checking mostly work
• Some constraints derived from thematic information

Future functionality

• XP adjunction
• More/deeper semantics
  – Quantifier raising
  – Scopal relationships
  – C-command and other interpretive mechanisms
  – More detailed LCS structures
• Web-based interactive Minimalist Parser grapher
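A hedged sketch of the feature-checking idea reported as mostly working in the current status: an uninterpretable feature on a probe is deleted when it finds a matching interpretable feature on a goal, with strong features triggering movement and weak features checked under merge. The data layout and feature names here are illustrative assumptions, not XNL-Soar's internal representation.

```python
# Feature checking in miniature: uninterpretable features map to a strength
# flag (True = strong -> move, False = weak -> merge); matched features are
# deleted, per the checking mechanism the poster describes.
def check_features(probe: dict, goal: dict) -> list:
    """Delete matched uninterpretable features; report triggered operations."""
    ops = []
    for feat, strong in list(probe["uninterpretable"].items()):
        if feat in goal["interpretable"]:
            del probe["uninterpretable"][feat]   # feature checked and deleted
            ops.append("move" if strong else "merge")
    return ops

# little v probing a D-featured object: a weak [uD] is checked by merge
v = {"uninterpretable": {"D": False}}
dp = {"interpretable": {"D"}}
print(check_features(v, dp))   # v's [uD] is deleted in the process
```

A strong feature (e.g. a hypothetical {"wh": True}) would instead report "move", which is how strong features drive the movement operators.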

Similar Work

• Incremental parsing in general (Phillips 2003)
• Other linguistic theories for incremental parsing
  – GB (Kolb 1991)
  – Dependency (Milward 1994, Ait-Mokhtar et al. 2002)
  – Categorial Grammar (Izuo 2004)
• Finite-state methods (Ait-Mokhtar & Chanod 1997)
• Minimalist parsing in other frameworks (Stabler 1997, Harkema 2001)
• Thematic information and parsing (Schlesewsky & Bornkessel 2004)
• Crosslinguistic considerations in incremental parsing (Schneider 2000)
• Human studies on ambiguity, reanalysis
  – Eye tracking (Kamide, Altmann, & Haywood 2003)
  – ERP (Bornkessel, Schlesewsky, & Friederici 2003)

Future applications

• Integrate syntax/semantics into a discourse/conversation component
• Develop human-agent and agent-agent communication
• Parameterize XNL-Soar for processing of languages other than English
• Model cognition in reading
• Model real-time language/task integrations
• Note: all of the above have been implemented in NL-Soar, the predecessor

For more information ...

• on Soar: http://sitemaker.umich.edu/soar
• on NL-Soar: http://linguistics.byu.edu/nlsoar