Modeling Linguistic Theory on a Computer: from GB to Minimalism

Modeling Linguistic Theory on a Computer: from GB to Minimalism

Modeling Linguistic Theory on a Computer: From GB to Minimalism Sandiway Fong Dept. of Linguistics Dept. of Computer Science 1 MIT IAP Computational Linguistics Fest, 1/14/2005 Outline • Mature system: PAPPI • Current work – parser in the principles-and- – introduce a left-to-right parser parameters framework based on the probe-goal model – principles are formalized and from the Minimalist Program declaratively stated in Prolog (MP) (logic) – take a look at modeling some – principles are mapped onto data from SOV languages general computational mechanisms • relativization in Turkish and Japanese – recovers all possible parses • psycholinguistics (parsing – (free software, recently ported preferences) to MacOS X and Linux) – (software yet to be released...) – (see http://dingo.sbs.arizona.edu/~sandi way/) 2 MIT IAP Computational Linguistics Fest, 1/14/2005 3 PAPPI: Overview sentence • user’s viewpoint syntactic represent ations parser operations corresponding to linguistic principles (= theory) 3 MIT IAP Computational Linguistics Fest, 1/14/2005 PAPPI: Overview • parser operations can be – turned on or off – metered • syntactic representations can be – displayed – examined • in the context of a parser operation – dissected • features displayed 4 MIT IAP Computational Linguistics Fest, 1/14/2005 PAPPI: Coverage • supplied with a basic set of principles – X’-based phrase structure, Case, Binding, ECP, Theta, head movement, phrasal movement, LF movement, QR, operator-variable, WCO – handles a couple hundred English examples from Lasnik and Uriagereka’s (1988) A Course in GB Syntax • more modules and principles can be added or borrowed – VP-internal subjects, NPIs, double objects Zero Syntax (Pesetsky, 1995) – Japanese (some Korean): head-final, pro-drop, scrambling – Dutch (some German): V2, verb raising – French (some Spanish): verb movement, pronominal clitics – Turkish, Hungarian: complex morphology – Arabic: VSO, SVO word orders 5 MIT IAP Computational Linguistics Fest, 1/14/2005 PAPPI: Architecture • software GUI layers parser prolog os 6 MIT IAP Computational Linguistics Fest, 1/14/2005 2 PAPPI: Architecture Word Order pro-drop Wh-in-Syntax Scrambling • software GUI Lexicon Parameters layers parser Periphery prolog PS Rules Principles os Programming Language Compilation Stage LR(1) Type Chain Tree Inf. – competing parses can be run in parallel across multiple machines 7 MIT IAP Computational Linguistics Fest, 1/14/2005 PAPPI: Machinery • morphology – simple morpheme concatenation – morphemes may project or be rendered as features • (example from the Hungarian implementation) EXAMPLE: a szerzô-k megnéz-et------het-----!--------né-----nek----! két cikk---et the author-Agr3Pl look_at---Caus-Possib-tns(prs)-Cond-Agr3Pl-Obj(indef) two article-Acc 8 MIT IAP Cao m p munkatárs-utationaal----------- Lingiku---------------------kalistics Fest, 1/14/2005 the colleague----Poss3Sg-Agr3Pl+Poss3Pl-LengdFC+Com 2 PAPPI: LR Machinery • specification • phrase – rule XP -> [XB|spec(XB)] ordered specFinal st max(XP), proj(XB,XP). structure – rule XB -> [X|compl(X)] ordered headInitial(X) st bar(XB), proj(X,XB), head(X). – parameterized – rule v(V) moves_to i provided agr(strong), finite(V). X’-rules – rule v(V) moves_to i provided agr(weak), V has_feature aux. – head movement rules State 3 • implementation NP -> D N . State 1 State 2 – rules are not used – bottom-up, shift-reduce parser NP -> D . N NP -> N . directly during – push-down automaton (PDA) parsing for – stack-based merge State 4 computational efficiency • shift S -> . NP VP NP -> . D N S -> NP . VP – mapped at compile- • reduce NP -> . N NP -> NP . PP NP -> . NP PP VP -> . V NP time onto LR VP -> . V machinery – canonical LR(1) VP -> . VP PP PP -> . P NP • disambiguate through oSntaete 0word lookahead 9 MIT IAP Computational Linguistics Fest, 1/14/2005 1 PAPPI: Machine Parameters • selected parser operations may be integrated with phrase structure • specification recovery or – coindexSubjAndINFL in_all_configurations CF where specIP(CF,Subject) then coindexSI(Subject,CF). chain formation – subjacency in_all_configurations CF where isTrace(CF), – machine upPath(CF,Path) then lessThan2BoundingNodes(Path) parameter • implementation – however, not always efficient – use type inferencing defined over category labels to do so • figure out which LR reduce actions should place an outcall to a parser operation – subjacency can be called during chain aggregation 10 MIT IAP Computational Linguistics Fest, 1/14/2005 3 PAPPI: Chain Formation • specification • recovery of – assignment of a chain feature to constituents chains – compute all possible combinations • each empty • combinatorics category optionally – exponential growth participates in a chain • each overt constituent optionally heads a chain 11 MIT IAP Computational Linguistics Fest, 1/14/2005 3 PAPPI: Chain Formation • specification • recovery of – assignment of a chain feature to constituents chains – compute all possible combinations • each empty • cimomplebmineantotaritciosn category optionally – exponentialpossible chains growth compositionally defined participates in – incrementally computed a chain – bottom-up • each overt constituent optionally heads a chain – allows parser operation merge 12 MIT IAP Computational Linguistics Fest, 1/14/2005 3 PAPPI: Chain Formation • specification • recovery of – assignment of a chain feature to constituents chains – compute all possible combinations • each empty • cimmoemprlgebemin ceaontontarsittcirosanints on chain paths category optionally – exponentialpossible chains growth compositionally defined participates in – incrementally computed a chain – bottom-up • each overt – loweringFilter in_all_configurations CF where isTrace(CF), constituent downPath(CF,Path) then Path=[]. optionally – subjacency in_all_configurations CF where isTrace(CF), heads a chain – allowsupPath( CpFarser,Path) operationthen lessTh amergen2BoundingNodes(Path) 13 MIT IAP Computational Linguistics Fest, 1/14/2005 2 PAPPI: Domain Computation • specification • minimal – gc(X) smallest_configuration CF st cat(CF,C), domain member(C,[np,i2]) – with_components – incremental – X, – bottom-up – G given_by governs(G,X,CF), – S given_by accSubj(S,X,CF). • implementing – Governing Category (GC): – GC(α) is the smallest NP or IP containing: – (A) α, and – (B) a governor of α, and – (C) an accessible SUBJECT for α. 14 MIT IAP Computational Linguistics Fest, 1/14/2005 2 PAPPI: Domain Computation • specification • minimal – gc(X) smallest_configuration CF st cat(CF,C), domain member(C,[np,i2]) – with_components – incremental – X, – bottom-up – G given_by governs(G,X,CF), – S given_by accSubj(S,X,CF). • iumspedle minenting – BGionvdeirnngin gC Conatdeigtioorny (AGC): – GC• (αA) nis athnea spmhaolrle mst uNsPt bore I PA -cboonutanindin ign: its GC – (A) α, and – conditionA in_(aBl)l _ac goonvfeigrnuorar toiof nαs, aCnFd where – anap(hCo)r (aCnF a) ctcheesns igbcle(C SFU,GBJCE),C aTB foour nαd.(CF,GC). – anaphor(NP) :- NP has_feature apos, NP has_feature a(+). 15 MIT IAP Computational Linguistics Fest, 1/14/2005 Probe-Goal Parser: Overview • strictly incremental – left-to-right – uses elementary tree (eT) composition • guided by selection • open positions filled from input – epp – no bottom-up merge/move • probe-goal agreement – uninterpretable interpretable feature system 16 MIT IAP Computational Linguistics Fest, 1/14/2005 3 Probe-Goal Parser: Selection • recipe 1 start(c) Spec 3 • select drives pick eT headed by c C Comp derivation from input (or M) 2 Move M – left-to-right fill Spec, run agree(P,M) fill Head, update P Probe P fill Comp (c select c’, recurse) • memory elements • example – MoveBox (M) • emptied in accordance with theta theory • filled from input – ProbeBox (P) • current probe 17 MIT IAP Computational Linguistics Fest, 1/14/2005 3 Probe-Goal Parser: Selection • recipe 1 start(c) Spec 3 • select drives pick eT headed by c C Comp derivation from input (or M) 2 Move M – left-to-right fill Spec, run agree(P,M) fill Head, update P Probe P fill Comp (c select c’, recurse) • memory elements • example – MoveBox (M) • emptied in accordance with theta theory • filled from input – ProbeBox (P) • current probe agree φ-features → probe • note case → goal – extends derivation to the right 18 MIT IAP Computational Linguistics Fest, 1/1•4/2s0i0m5 ilar to Phillips (1995) 3 Probe-Goal Parser: Selection • recipe 1 start(c) Spec 3 • select drives pick eT headed by c C Comp derivation from input (or M) 2 Move M – left-to-right fill Spec, run agree(P,M) fill Head, update P Probe P fill Comp (c select c’, recurse) • memory elements • example – MoveBox (M) • emptied in accordance with theta theory • filled from input – ProbeBox (P) • current probe agree φ-features → probe • note case → goal – enxot emnedrsg ed/emriovvaetion to the right 19 MIT IAP Computational Linguistics Fest, 1/1•4/2sc0if0m.5 Milainr itmoa Plihstil lGiprsa m(1m99a5r.) Stabler (1997) Probe-Goal Parser: Lexicon lexical item properties uninterpretable interpretable features features v* (transitive) select(V) per(P) (epp) spec(select(N)) num(N) value(case(acc)) gen(G) v (unaccusative) select(V) v# (unergative) select(V) spec(select(N)) PRT. (participle) select(V) num(N) case(C) gen(G) V (trans/unacc) select(N) V (unergative) V (raising/ecm) select(T(def)) 20 MIT IAP Computational Linguistics Fest, 1/14/2005 Probe-Goal Parser: Lexicon lexical item properties uninterpretable interpretable features features T select(v) per(P) epp value(case(nom)) num(N) gen(G) T(def) (ϕ-incomplete)

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    32 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us