Notes on computational linguistics
E. Stabler
UCLA, Winter 2003 (under revision)
Contents
1 Setting the stage: logics, prolog, theories
  1.1 Summary
  1.2 Propositional prolog
  1.3 Using prolog
  1.4 Some distinctions of human languages
  1.5 Predicate Prolog
  1.6 The logic of sequences
2 Recognition: first idea
  2.1 A provability predicate
  2.2 A recognition predicate
  2.3 Finite state recognizers
3 Extensions of the top-down recognizer
  3.1 Unification grammars
  3.2 More unification grammars: case features
  3.3 Recognizers: time and space
  3.4 Trees, and parsing: first idea
  3.5 The top-down parser
  3.6 Some basic relations on trees
  3.7 Tree grammars
4 Brief digression: simple patterns of dependency
  4.1 Human-like linguistic patterns
  4.2 Semilinearity and some inhuman linguistic patterns
5 Trees, and tree manipulation: second idea
  5.1 Nodes and leaves in tree structures
  5.2 Categories and features
  5.3 Movement relations
6 Context free parsing: stack-based strategies
  6.1 LL parsing
  6.2 LR parsing
  6.3 LC parsing
  6.4 All the GLC parsing methods (the "stack based" methods)
  6.5 Oracles
  6.6 Assessment of the GLC ("stack based") parsers
7 Context free parsing: dynamic programming methods
  7.1 CKY recognition for CFGs
  7.2 Tree collection
  7.3 Earley recognition for CFGs
8 Stochastic influences on simple language models
  8.1 Motivations and background
  8.2 Probabilistic context free grammars and parsing
  8.3 Multiple knowledge sources
  8.4 Next steps
9 Beyond context free: a first small step
  9.1 "Minimalist" grammars
  9.2 CKY recognition for MGs
10 Towards standard transformational grammar
  10.1 Review: phrasal movement
  10.2 Head movement
  10.3 Verb classes and other basics
  10.4 Modifiers as adjuncts
  10.5 Summary and implementation
  10.6 Some remaining issues
11 Semantics, discourse, inference
12 Review: first semantic categories
  12.1 Things
  12.2 Properties of things
  12.3 Unary quantifiers, properties of properties of things
  12.4 Binary relations among things
  12.5 Binary relations among properties of things
13 Correction: quantifiers as functionals
14 A first inference relation
  14.1 Monotonicity inferences for subject-predicate
  14.2 More Boolean inferences
15 Exercises
  15.1 Monotonicity inferences for transitive sentences
  15.2 Monotonicity inference: A more general and concise formulation
16 Harder problems
  16.1 Semantic categories
  16.2 Contextual influences
  16.3 Meaning postulates
  16.4 Scope inversion
  16.5 Inference
17 Morphology, phonology, orthography
  17.1 Morphology subsumed
  17.2 A simple phonology, orthography
  17.3 Better models of the interface
18 Some open (mainly) formal questions about language
Linguistics 185a/209a: Computational linguistics I
Lecture: 12-2 TR in Bunche 3170
Prof. Ed Stabler, Office: Campbell 3103F, x50634, [email protected]
Office Hours: 2-3 T, by appt, or stop by
TA: Ying Lin
Discussion: TBA
Prerequisites: Linguistics 180/208, Linguistics 120b, 165b
Contents: What kind of computational device could use a system like a human language? This class will explore the computational properties of devices that could compute morphological and syntactic analyses, and recognize semantic entailment relations among sentences. Among other things, we will explore
(1) how to define a range of grammatical analyses in grammars G that are expressive enough for human languages
(2) how to calculate whether a sequence of gestures, sounds, or characters s ∈ L(G) (various ways!)
(3) how to calculate and represent the structures d of expressions s ∈ L(G) (various ways!) (importantly, we see that size(d) < size(s), for natural size measures)
(4) how to calculate morpheme sequences from standard written (or spoken) text
(5) how to calculate entailment relations among structures
(6) how phonological/orthographic, syntactic, semantic analyses can be integrated
(7) depending on time and interest, maybe some special topics:
    • how to distribute probability measures over (the possibly infinitely many) structures of L(G), and how to calculate the most probable structure d of ambiguous s ∈ L(G)
    • how to handle a language that is "open-ended": new words, new constructions all the time
    • how to handle various kinds of context-dependence in the inference system
    • how to handle temporal relations in the language and in inference
    • how to calculate certain "discourse" relations
    • tools for studying large collections of texts
Readings: course notes distributed during the quarter from the class web page, supplemented occasionally with selected readings from other sources.
Requirements and grades: Grades will be based entirely on problem sets given on a regular basis (roughly weekly) throughout the quarter. Some of these problem sets will be Prolog programming exercises; some will be exercises in formal grammar. Some will be challenging, others will be easy. Graduate students are expected to do the problem sets and an additional squib on a short term project or study.
Computing Resources: We will use SWI Prolog, which is small and available for free for MSWindows, Linux/Unix, and MacOSX from http://www.swi-prolog.org/ Tree display software will be based on tcl/tk, which is available for free from http://www.scriptics.com/
The best models of human language processing are based on the programmatic hypothesis that human language processes are (at least, in large part) computational. That is, the hypothesis is that understanding or producing a coherent utterance typically involves changes of neural state that can be regarded as a calculation, as the steps in some kind of derivation. We could try to understand what is going on by attempting to map out the neural responses to linguistic stimulation, as has been done for example in the visual system of the frog (Lettvin et al., 1959, e.g.). Unfortunately, the careful in vivo single cell recording that is required for this kind of investigation of human neural activity is impractical and unethical (except perhaps in some unusual cases where surgery is happening anyway, as in the studies of Ojemann?). Another way to study language use is to consider how human language processing problems could possibly be solved by any sort of system. Designing and even building computational systems with properties similar to the human language user not only avoids the ethical issues (the devices we build appear to be much too simple for any kind of "robot rights" to kick in), but also, it allows us to begin with systems that are simplified in various respects. That this is an appropriate initial focus can be seen from the fact that many problems are quite clear and difficult well before we get to any of the subtle nuances of human language use. So these lecture notes briefly review some of the basic work on how human language processing problems could possibly be solved by any sort of system, rather than trying to model in detail the resources that humans have available for language processing. Roughly, the problems we would like to understand include these:
perception: given an utterance, compute its meaning(s), in context. This involves recognition of syntactic properties (subject, verb, object), semantic properties (e.g. entailment relations, in context), and pragmatic properties (assertion, question, ...).
production: given some (perhaps only vaguely) intended syntactic, semantic, and pragmatic properties, create an utterance that has them.
acquisition: given some experience in a community of language users, compute a representation of the language that is similar enough to others that perception/production is reliably consistent across speakers.
Note that the main focus of this text is “computational linguistics” in this rather scientific sense, as opposed to “natural language processing” in the sense of building commercially viable tools for language analysis or information retrieval, or “corpus linguistics” in the sense of studying the properties of collections of texts with available tools. Computational linguistics overlaps to some extent with these other interests, but the goals here are really quite different. The notes are very significantly changed from earlier versions, and so the contributions of the class participants were enormously valuable. Thanks especially to Dan Albro, Leston Buell, Heidi Fleischhacker, Alexander Kaiser, Greg Kobele, Alex MacBride, and Jason Riggle. Ed Keenan provided many helpful suggestions and inspiration during this work.
No doubt, many typographical errors and infelicities of other sorts remain. I hope to continue revising and improving these notes, so comments are welcome! [email protected]
1 Setting the stage: logics, prolog, theories
1.1 Summary
(1) We will use the programming language prolog to describe our language processing methods. Prolog is a logic.
(2) We propose, following Montague and many others: Each human language is a logic.
(3) We also propose:
    a. Standard sentence recognizers can be naturally represented as logics. (A "recognizer" is something that tells whether an input is a sentence or not. Abstracting away from the "control" aspect of the problem, we see a recognizer as taking the input as an axiom, and deducing the category of the input (if any). We can implement the deduction relation in this logic in the logic of prolog. Then prolog acts as a "metalanguage" for calculating proofs in the "object language" of the grammar.)
    b. Standard sentence parsers can be naturally represented as logics. (A "parser" is something that outputs a structural representation for each input that is a sentence, and otherwise it tells us the input is not a sentence. As a logic, we see a parser as taking the input as an axiom from which it deduces the structure of the input (if any).)
All of the standard language processing methods can be properly understood from this very simple perspective.1
(4) What is a logic? A logic has three parts:
    i. a language (a set of expressions) that has
    ii. a "derives" relation ⊢ defined for it (a syntactic relation on expressions), and
    iii. a semantics: expressions of the language have meanings.
    a. The meaning of an expression is usually specified with a "model" that contains a semantic valuation function that is often written with double brackets. So instead of writing semantic_value(socrates_is_mortal)=true we write [[socrates_is_mortal]] = 1.
    b. Once the meanings are given, we can usually define an "entails" relation ⊨, so that for any set of expressions Γ and any expression A, Γ ⊨ A means that every model that makes all sentences in Γ true also makes A true. And we expect the derives relation ⊢ should correspond to the ⊨ relation in some way: for example, the logic might be sound and complete in the sense that, given any set of axioms Γ we might be able to derive all and only the expressions that are entailed by Γ.
So a logic has three parts: it is (i) a language, with (ii) a derives relation ⊢, and with (iii) meanings.
1 Cf. Shieber, Schabes, and Pereira (1993), Sikkel and Nijholt (1997). Recent work develops this perspective in the light of resource logical treatments of language (Moortgat, 1996, for example), and seems to be leading towards a deeper and more principled understanding of parsing and other linguistic processes. More on these ideas later.
(5) Notation: Sequences are written in various ways:
abc        a, b, c        ⟨a, b, c⟩        [a, b, c]
The programming language prolog requires the last format; otherwise, I try to choose the notation to minimize confusion. Similarly, the empty sequence is sometimes represented by a special symbol such as ε, but the prolog notation is []. A stack is a sequence too, but with limitations on how we can access its elements: elements can only be read or written on the "top" of the sequence. We adopt the convention that the top of a stack is on the left; it is the "front" of the sequence.
(6) Notation: Context free grammars are commonly written in the familiar rewrite notation, which we will use extensively in these notes:
S → NP VP
NP → D N
NP → N
VP → V NP
VP → V
N → students
N → songs
V → sang
V → knew
D → some
D → all
These grammars are sometimes written in the more succinct Backus-Naur Form (BNF) notation:
S ::= NP VP
NP ::= D N | N
VP ::= V NP | V
N ::= students | songs
V ::= sang | knew
D ::= some | all
The categories on the left side of the ::= are expanded as indicated on the right, where the vertical bar separates alternative expansions. (Sometimes in BNF, angle brackets or italics are used to distinguish category from terminal symbols, rather than the capitalization that we have used here.) This kind of BNF notation is often used by logicians, and we will use it in the following chapter.
1.2 Propositional prolog
The programming language prolog is based on a theorem prover for a subset of first order logic. A pure prolog “program” is a theory, that is, a finite set of sentences. An execution of the program is an attempt to prove some theorem from the theory. (Sometimes we introduce “impurities” to do things like produce outputs.) I prefer to introduce prolog from this pure perspective, and introduce the respects in which it acts like a programming language later.
(Notation)
Let {a−z} = {a, b, c, ..., z}.
Let {a−zA−Z0−9_} = {a, b, c, ..., z, A, B, ..., Z, 0, 1, ..., 9, _}.
For any set S, let S* be the set of all strings of elements of S.
For any set S, let S+ be the set of all non-empty strings of elements of S.
For any sets S, T, let ST be the set of all strings st for s ∈ S, t ∈ T.
language:
    atomic formulas p = {a−z}{a−zA−Z0−9_}* | {a−zA−Z0−9_@#$%*()}*
    conjunctions C ::= . | p, C
    goals G ::= ?-C
    definite clauses D ::= p:-C
(Notation) Definite clauses p:-q1,...,qn,. are written p:-q1,...,qn. And definite clauses p:-. are written p. The consequent p of a definite clause is the head, the antecedent is the body.
(Notation) The empty goal ?-. represents the contradiction.
(Notation) Parentheses can be added: ((p:-q)). is just the same as p:-q.
inference: G, Γ ⊢ G [axiom] for any set of definite clauses Γ and any goal G

    G, Γ ⊢ (?-p, C)
    ─────────────────────────── if (p:-q1,...,qn) ∈ Γ
    G, Γ ⊢ (?-q1,...,qn, C)

semantics: a model M = ⟨2, [[·]]⟩ where 2 = {0, 1} and [[·]] is a valuation of atomic formulas that extends compositionally to the whole language:
    [[p]] ∈ 2, for atomic formulas p
    [[A, B]] = min{[[A]], [[B]]}
    [[B:-A]] = 1 if [[A]] ≤ [[B]], and 0 otherwise
    [[?-A]] = 1 − [[A]]
    [[.]] = 1 (the empty conjunction is true)
metatheory: For any goals G, A and any definite clause theory Γ,
    Soundness: G, Γ ⊢ A only if G, Γ ⊨ A
    Completeness: G, Γ ⊢ A if G, Γ ⊨ A
So we can establish whether C follows from Γ with a "reductio" argument by deciding: (?-C), Γ ⊢ (?-.)
(Terminology) A problem is decidable iff there is an algorithm which computes the answer.
decidable:
    For an arbitrary string s and CFG G: s ∈ L(G)?
    For formulas F, G of propositional logic: F ⊢ G?
    For a conjunction C and definite clauses Γ of propositional prolog: (?-C), Γ ⊢ (?-.)?
undecidable:
    For an arbitrary program P: does P halt?
    For formulas F, G of first order predicate logic: F ⊢ G?
    For a conjunction C and definite clauses Γ of predicate prolog: (?-C), Γ ⊢ (?-.)?
Warning: many problems we want to decide are undecidable, and many of the decidable ones are intractable.
This is one of the things that makes computational linguistics important: it is often not at all clear how to characterize what people are doing in a way that makes sense of the fact that they actually succeed in doing it!
(7) A Prolog theory is a sequence of definite clauses, clauses of the form p or p:-q1,...,qn, for n ≥ 0. A definite clause says something definite and positive. No definite clause lets you say something like
a. Either Socrates is mortal or Socrates is not mortal.
Nor can a definite clause say anything like
b. X is even or X is odd if X is an integer.
Disjunctions of the sort we see here are not definite, and there is no way to express them with definite clauses. There is another kind of disjunction that can be expressed though. We can express the proposition
c. Socrates is human if either Socrates is a man or Socrates is a woman.
This last proposition can be expressed in definite clauses because it says exactly the same thing as the two definite propositions:
d. Socrates is human if Socrates is a man
e. Socrates is human if Socrates is a woman.
So the set of these two definite propositions expresses the same thing as c. Notice that no set of definite claims expresses the same thing as a, or the same thing as b.
Prolog proof method: depth-first, backtracking. In applications of the inference rule, it can happen that more than one axiom can be used. When this happens, prolog chooses the first one first, and then tries to complete the proof with the result. If the proof fails, prolog will back up to the most recent choice, and try the next option, and so on.
Given a sequence of definite clauses Γ and a goal G, let RHS = G:
    if (RHS = ?-p, C)
        if (there is any untried clause (p:-q1,...,qn) ∈ Γ)
            choose the first and set RHS = ?-q1,...,qn, C
        else if (any choice was made earlier)
            go back to the most recent choice
        else fail
    else succeed
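To see these choice points concretely, here is a small illustration of our own (the predicates meal/1 and have/1 are hypothetical, not from the notes). Prolog tries the clauses for a goal in the order they are listed, and when a body fails it backs up to the most recent choice and tries the next clause:

meal(pasta) :- have(pasta).   % tried first; fails, since have(pasta) is not provable
meal(soup)  :- have(broth).   % tried after backtracking; succeeds
have(broth).

% ?- meal(X).
% X = soup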
(8) Pitfall 1: Prolog's proof method is not an algorithm, hence not a decision method. This is the case because the search for a proof can fail to terminate. There are cases where G, Γ ⊢ A and G, Γ ⊨ A, but prolog will not find the proof because it gets caught in "infinite recursion." Infinite recursion can occur when, in the course of proving some goal, we reach a point where we are attempting to establish that same goal. Consider how prolog would try to prove p given the axioms:
p :- p.
p.
Prolog will use the first axiom first, each time it tries to prove p, and this procedure will never terminate. We have the same problem with
p :- p,q.
p.
q.
And also with
p :- q,p.
p.
q.
This problem is sometimes called the "left recursion" problem, but these examples show that the problem results whenever a proof of some goal involves proving that same goal. We will consider this problem more carefully when it arises in parsing. Prolog was designed this way for these simple practical reasons: (i) it is fairly easy to choose problems for which Prolog does terminate, and (ii) the method described above allows very fast execution!
(9) Example: Let Γ be the following sequence of definite clauses:
socrates_is_a_man.
socrates_is_dangerous.
socrates_is_mortal :- socrates_is_a_man.
socrates_is_a_dangerous_man :- socrates_is_a_man, socrates_is_dangerous.

Clearly, we can prove socrates_is_mortal and socrates_is_a_dangerous_man. The proof can be depicted with a tree like this:

socrates_is_a_dangerous_man
├── socrates_is_a_man
└── socrates_is_dangerous
(10) Example: A context free grammar G = ⟨Σ, N, →, S⟩, where
1. Σ, N are finite, nonempty sets,
2. S is some symbol in N,
3. the binary relation (→) ⊆ N × (Σ ∪ N)* is also finite (i.e. it has finitely many pairs).
For example,
ip → dp i1        i1 → i0 vp        i0 → will
dp → d1           d1 → d0 np        d0 → the
np → n1           n1 → n0           n0 → idea
n1 → n0 cp
vp → v1           v1 → v0           v0 → suffice
cp → c1           c1 → c0 ip        c0 → that
Intuitively, if ip is to be read as "there is an ip," and similarly for the other categories, then the rewrite arrow cannot be interpreted as implies, since there are alternative derivations. That is, the rules (n1 → n0) and (n1 → n0 cp) signify that a given constituent can be expanded either one way or the other. In fact, we get an appropriate logical reading of the grammar if we treat the rewrite arrow as meaning "if." With that reading, we can also express the grammar as a prolog theory.
/*
 * file: th2.pl
 */
ip :- dp, i1.        i1 :- i0, vp.        i0 :- will.        will.
dp :- d1.            d1 :- d0, np.        d0 :- the.         the.
np :- n1.            n1 :- n0.            n0 :- idea.        idea.
n1 :- n0, cp.
vp :- v1.            v1 :- v0.            v0 :- suffice.     suffice.
cp :- c1.            c1 :- c0, ip.        c0 :- that.        that.
In this theory, the proposition idea can be read as saying that this word is in the language, and ip :- dp, i1 says that ip is in the language if dp and i1 are. The proposition ip follows from this theory. After loading this set of axioms, we can prove ?- ip. Finding a proof corresponds exactly to finding a derivation from the grammar.
In fact, there are infinitely many proofs of ?- ip. The first proof that prolog finds can be depicted with a tree like this:

ip
├── dp
│   └── d1
│       ├── d0
│       │   └── the
│       └── np
│           └── n1
│               └── n0
│                   └── idea
└── i1
    ├── i0
    │   └── will
    └── vp
        └── v1
            └── v0
                └── suffice
1.3 Using prolog
Here is a session log, where I put the things I typed in bold:
1% pl
Welcome to SWI-Prolog (Version 4.0.11)
Copyright (c) 1990-2000 University of Amsterdam. Copy policy: GPL-2 (see www.gnu.org)
For help, use ?- help(Topic). or ?- apropos(Word).

?- write('hello world').
hello world
Yes
?- halt.
2% emacs test.pl
3% cat test.pl
% OK, a test
p :- q,r.
r :- s.
q :- t.
s.
t.
4% pl
Welcome to SWI-Prolog (Version 4.0.11)
Copyright (c) 1990-2000 University of Amsterdam. Copy policy: GPL-2 (see www.gnu.org)
For help, use ?- help(Topic). or ?- apropos(Word).

?- [test].
% test compiled 0.00 sec, 1,180 bytes
Yes
?- listing.
% Foreign: rl_add_history/1
p :- q, r.
% Foreign: rl_read_init_file/1
q :- t.
r :- s.
s.
t.
Yes
?- p.
Yes
?- q.
Yes
?- z.
ERROR: Undefined procedure: z/0
?- halt.
5%
1.4 Some distinctions of human languages
Even if we regard human languages as logics, it is easy to see that they differ in some fundamental respects from logics like the propositional calculus. Let’s quickly review some of the most basic properties of human languages, many of which we will discuss later:
1. To a first approximation, the physical structure of an utterance can be regarded as a sequence of perceivable gestures in time. We will call the basic elements of these sequences perceptual atoms.
2. Utterances have syntactic and semantic structure whose atoms are often not perceptual atoms, but perceptual complexes.
3. Properties of atoms are remembered; properties of complexes may be calculated or remembered. At any time, the number of remembered atomic properties (perceptual, syntactic, semantic) is finite.
4. A sequence of perceptual atoms that is a semantic or syntactic atom in one context may not be one in another context. For example, in its idiomatic use, keep tabs on is semantically atomic, but it has literal uses as well in which it is semantically complex.
5. In every human language, the sets of perceptual, syntactic, and semantic atoms may overlap, but they are not identical.
6. Every human language is open-ended: ordinary language use involves learning new expressions all the time.
7. In every human language, the interpretation of many utterances is context dependent. For example, it is here is only true or false relative to an interpretation of the relevant context of utterance.
8. Every language has expressions denoting properties, relations, relations among properties and relations, quantifiers and Boolean operations. Some of the relations involve "events" and their participants. "Agents" of an event tend to be mentioned first.
9. In every human language, utterances can be informative. Humans can understand (and learn from) sentences about individuals and properties that we knew nothing of before. So, for one thing, declarative sentences do not mean simply true or false.
10. Call a truth-valuable expression containing at most one relation-denoting semantic atom a "simple predication." In no human language are the simple predications logically independent, in the sense that the truth values of one are independent of the others. For example, since it is part of the meaning of the predicate red that red objects are also in the extension of colored, the truth value of a is red is not independent of the truth value of a is colored. In propositional prolog, the interpretation of each atomic formula is independent of the others. The importance of this property has been discussed by Wittgenstein and many other philosophers (Wittgenstein, 1922; Pears, 1981; Demopoulos and Bell, 1993).
11. Language users can recognize (some) entailment relations among expressions. Every language includes hyponyms and hypernyms. Call an expression analytic if it is true simply in virtue of its meaning. Every language includes analytic expressions. Perfect synonymy is rare; and perfect (non-trivial) definitions of lexical items are rare.
12. In all languages: Frequent words tend to be short. The most frequent words are grammatical formatives. The most frequent words tend to denote in high types. Related facts: affixes and intonation features tend to denote in high types.
13. Quantifiers that are semantic atoms are monotonic.
14. Relation-denoting expressions that are semantic atoms may require argument-denoting expressions to occur with them, but they never require more than 3.
15. Roughly speaking, specifying for each relation-denoting expression the arguments that are required is simpler than specifying for each argument-denoting expression the relations it can be associated with. So we say: verbs "select" their arguments.
16. At least some human languages seem to have truth predicates that apply to expressions in the same language. But the semantic paradoxes considered by Russell, Tarski and others show that they cannot apply to all and only the true expressions. – See for example the papers in (Blackburn and Simmons, 1999).
17. Human languages have patterns that are not (appropriately) described by finite state grammars (FSGs), or by context free grammars (CFGs). But the patterns can all be described by multiple context free grammars (MCFGs) and other similar formalisms (a certain simple kind of "minimalist grammars," MGs, and multi-component tree-adjoining grammars, MC-TAGs). This last idea has been related to Chomsky's "subjacency" and the "shortest move constraint." As we will see, where these problems can be defined reasonably well, only certain kinds of devices can solve them. And in general, of course, the mechanisms of memory access determine what kinds of patterns can distinguish the elements of a language, what kinds of problems can be solved.
Propositional prolog lacks most of these properties. We move to a slightly more human-like logic with respect to properties 7 and 8 in the next section. Many of the other properties mentioned here will be discussed later.
1.5 Predicate Prolog
Predicate prolog allows predicates that take one or more arguments, and it also gives a first glimpse of expressions that depend on "context" for their interpretation. For example, we could have a 1-place predicate mathematician which may be true of various individuals that have different names, as in the following axioms:
/*
 * file: people.pl
 */
mathematician(frege).
mathematician(hilbert).
mathematician(turing).
mathematician(montague).
linguist(frege).
linguist(montague).
linguist(chomsky).
linguist(bresnan).
president(bush).
president(clinton).
sax_player(clinton).
piano_player(montague).

And using an initial uppercase letter or underscore to distinguish variables, we have expressions like human(X) that have a truth value only relative to a "context" – an assignment of an individual to the variable. In prolog, the variables of each definite clause are implicitly bound by universal quantifiers:
human(X) :- mathematician(X).
human(X) :- linguist(X).
human(X) :- sax_player(X).
human(X) :- piano_player(X).
sum(0,X,X).
sum(s(X),Y,s(Z)) :- sum(X,Y,Z).
self_identical(X).
socrates_is_mortal.
In this theory, we see 9 different predicates. Like 0-place predicates (=propositions), these predicates all begin with lower case letters. Besides predicates, a theory may contain terms, where a term is a variable, a name, or a function expression that combines with some appropriate number of terms. Variables are distinguished by beginning with an uppercase letter or an underscore. The theory above contains only the one variable X. Each axiom has all of its variables "universally bound." So for example, the axiom self_identical(X) says: for all X, X is identical to itself. And the axiom before that one says: for all X, X is human if X is a piano player. In this theory we see nine different names, which are either numerals or else begin with lower case letters: frege, hilbert, turing, montague, chomsky, bresnan, bush, clinton, 0. A name can be regarded as a 0-place function symbol. That is, it takes 0 arguments to yield an individual as its value. In this theory we have one function symbol that takes arguments: the function symbol s appears in the two axioms for sum. These are Peano's famous axioms for the sum relation. The first of these axioms says that, for all X, the sum of 0 and X is X. The second says that, for all X, Y and Z, the sum of the successor of X and Y is the successor of Z where Z is the sum of X and Y. So the symbol s stands for the successor function. This is the function which just adds one to its argument. So the successor of 0 is 1, s(0)=1, s(s(0))=2, .... In this way, the successor function symbol takes 1 argument to yield an individual as its value. With this interpretation of the symbol s, the two axioms for sum are correct. Remarkably, they are also the only axioms we need to
compute sums on the natural numbers, as we will see. From these axioms, prolog can refute ?- human(montague). This cannot be refuted using the proof rule shown above, though, since no axiom has human(montague) as its head. The essence of prolog is what it does here: it unifies the goal ?- human(montague) with the head of the axiom, human(X) :- mathematician(X). We need to see exactly how this works. Two expressions unify with each other just in case there is a substitution of terms for variables that makes them identical. To unify human(montague) and human(X) we substitute the term montague for the variable X. We will represent this substitution by the expression {X ↦ montague}. Letting θ = {X ↦ montague}, and writing the substitution in "postfix" notation – after the expression it applies to, we have

human(X)θ = human(montague)θ = human(montague).

Notice that the substitution θ has no effect on the term human(montague) since this term has no occurrences of the variable X. We can replace more than one variable at once. For example, we can replace X by s(Y) and replace Y by Z. Letting θ = {X ↦ s(Y), Y ↦ Z}, we have:

sum(X,Y,Y)θ = sum(s(Y),Z,Z).

Notice that the Y in the first term has not been replaced by Z. This is because all the elements of the substitution are always applied simultaneously, not one after another. After a little practice, it is not hard to get the knack of finding substitutions that make two expressions identical, if there is one. These substitutions are called (most general) unifiers, and the step of finding and applying them is called (term) unification.2 To describe unification we need two preliminaries. In the first place, we need to be able to recognize subexpressions. Consider the formula: whats_it(s(Y,r(Z,g(Var))),func(g(Argument),X),W). The subexpression beginning with whats_it is the whole expression. The subexpression beginning with s is s(Y, r(Z, g(Var))). The subexpression beginning with Argument is Argument. No subexpression begins with a parenthesis or comma. The second preliminary we need is called composition of substitutions – we need to be able to build up substitutions by, in effect, applying one to another. Remember that substitutions are specified by expressions of the form
{V1 ↦ t1, ..., Vn ↦ tn}
where the Vi are distinct variables and the ti are terms (ti ≠ Vi) which are substituted for those variables.
Definition 1 The composition of substitutions η, θ is defined as follows: Suppose

η = {X1 ↦ t1, ..., Xn ↦ tn}
θ = {Y1 ↦ s1, ..., Ym ↦ sm}.

The composition of η and θ, ηθ, is

ηθ = {X1 ↦ (t1θ), ..., Xn ↦ (tnθ), Y1 ↦ s1, ..., Ym ↦ sm} − ({Yi ↦ si | Yi ∈ {X1,...,Xn}} ∪ {Xi ↦ tiθ | Xi = tiθ}).
2So-called “unification grammars” involve a related, but slightly more elaborate notion of unification (Pollard and Sag, 1987; Shieber, 1992; Pollard and Sag, 1994). Earlier predicate logic theorem proving methods like the one in Davis and Putnam (1960) were significantly improved by the discovery in Robinson (1965) that term unification provided the needed matching method for exploiting the insight from the doctoral thesis of Herbrand (1930) that proofs can be sought in a syntactic domain defined by the language of the theory.
That is, to compute ηθ, first apply θ to the terms of η and then add θ itself, and finally remove any of the θ variables that are also η variables and remove any substitutions of variables for themselves. Clearly, with this definition, every composition of substitutions will itself satisfy the conditions for being a substitution. Furthermore, since the composition ηθ just applies θ to η, A(ηθ) = (Aη)θ for any expression A.
Example 1. Let
η = {X1 ↦ Y1, X2 ↦ Y2}
θ = {Y1 ↦ a1, Y2 ↦ a2}.
Then ηθ = {X1 ↦ a1, X2 ↦ a2, Y1 ↦ a1, Y2 ↦ a2}. And, on the other hand, θη = {Y1 ↦ a1, Y2 ↦ a2, X1 ↦ Y1, X2 ↦ Y2}. Since ηθ ≠ θη, we see that composition is thus not commutative, although it is associative.
Example 2. Let
η = {X1 ↦ Y1}
θ = {Y1 ↦ X1}.
Then although neither η nor θ is empty, ηθ = θ and θη = η.
Example 3. Let
η = {}
θ = {Y1 ↦ X1}.
Then ηθ = θη = θ. This empty substitution η = {} is called an "identity element" for the composition operation.3
Now we are ready to present a procedure for unifying two expressions E and F to produce a (most general) unifier mgu(E, F).
Unification algorithm:
1. Put k = 0 and σ0 = {}.
2. If Eσk = Fσk, stop: σk is a mgu of E and F. Otherwise, compare Eσk and Fσk from left to right to find the first symbol at which they differ. Select the subexpression E′ of Eσk that begins with that symbol, and the subexpression F′ of Fσk that begins with that symbol.
3. If one of E′, F′ is a variable V and the other is a term t, and if V does not occur as a (strict) subconstituent of t, put σk+1 = σk{V ↦ t}, increment k to k + 1, and return to step 2. Otherwise stop: E and F are not unifiable.
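Implemented prolog makes this unification operation directly available through the built-in =/2, so the algorithm's behavior is easy to experiment with; a quick illustration (the terms here are ours, chosen only for illustration):

?- f(X, g(Y)) = f(a, Z).
X = a
Z = g(Y)
Yes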
The algorithm produces a most general unifier, unique up to a renaming of variables, whenever the expressions are unifiable; otherwise it terminates and returns the judgment that the expressions are not unifiable.4 Now we are ready to define predicate prolog. All clauses and goals are universally closed, so the language, inference method, and semantics are fairly simple.
3 In algebra, when we have a binary associative operation on a set with an identity element, we say we have a monoid. So the set of substitutions, the operation of composition, and the empty substitution form a monoid. Another monoid we will discuss below is given by the set of sequences of words, the operation of appending these sequences, and the empty sequence.
4 Our presentation of the unification algorithm is based on Lloyd (1987), and this result about the algorithm is established as Lloyd's Theorem 4.3. The result also appears in the classic source, Robinson (1965).
(11) Predicate prolog
language:
    atoms a = {a−z}{a−zA−Z0−9_}* | {a−zA−Z0−9_@#$%*()}*
    variables v = {A−Z_}{a−zA−Z0−9_}*
    terms T ::= v | a | a(S) | {0−9}+
    sequences of terms S ::= T | T,S (n.b. one or more terms in the sequence)
    predications p ::= a | a(S)
    conjunctions C ::= . | p, C
    goals G ::= ?-C
    definite clauses D ::= p:-C
(Notation) We write f^n for an atom f that forms a term with a sequence of n terms as arguments. A term f^0 is an individual constant. We write p^n for an atom p that forms a predication with a sequence of n terms as arguments. A 0-place predication p^0 is a proposition.
inference: G, Γ ⊢ G [axiom] for any set of definite clauses Γ and any goal G

    G, Γ ⊢ (?-p, C)
    ─────────────────────────── if (q:-q1,...,qn) ∈ Γ and mgu(p,q) = θ
    G, Γ ⊢ (?-q1,...,qn, C)θ
N.B. For maximal generality, we avoid confusing any variables in the goal with variables in an axiom with the following policy: every time a definite clause is selected from Γ , rename all of its variables with variables never used before in the proof. (Standard implemented prolog assigns these new variables numerical names, like _1295.)
semantics: a model M = ⟨E, 2, [[·]]⟩, where
    [[f^n]] : E^n → E. When n = 0, [[f^0]] ∈ E.
    [[p^n]] : E^n → 2. When n = 0, [[p^0]] ∈ 2.
    [[v]] : [V → E] → E, such that for s : V → E, [[v]](s) = s(v).
    [[f^n(t1,...,tn)]] : [V → E] → E, where for s : V → E, [[f^n(t1,...,tn)]](s) = [[f^n]]([[t1]](s), ..., [[tn]](s))
    [[p^n(t1,...,tn)]] : [V → E] → 2, where for s : V → E, [[p^n(t1,...,tn)]](s) = [[p^n]]([[t1]](s), ..., [[tn]](s))
    [[A, B]] : [V → E] → 2, where for s : V → E, [[A, B]](s) = min{[[A]](s), [[B]](s)}.
    [[.]] : [V → E] → 2, where for s : V → E, [[.]](s) = 1
    [[B:-A]] = 0 if ∃s ∈ [V → E] such that [[A]](s) = 1 and [[B]](s) = 0, and 1 otherwise
    [[?-A]] = 1 − min{[[A]](s) | s : V → E}
metatheory: G, Γ ⊢ A iff G, Γ ⊨ A, for any goals G, A and any definite clause theory Γ.
Loading the axioms of people.pl, displayed above, prolog will use these rules to establish consequences of the theory. We can ask prolog to prove that Frege is a mathematician by typing mathematician(frege). at the prolog prompt. Prolog will respond with yes. We can also use a variable to ask prolog what things X are mathematicians (or computers). If the loaded axioms do not entail that any X is a mathematician (or computer), prolog will say: no. If something can be proven to be a mathematician (or computer), prolog will show what it is. And after receiving one answer, we can ask for other answers by typing a semi-colon:
| ?- mathematician(X).
X = frege ? ;
X = hilbert ? ;
X = turing ? ;
X = montague ? ;
no
| ?- mathematician(fido).
no

Prolog establishes these claims just by finding substitutions of terms for the variable X which make the goal identical to an axiom. So, in effect, a variable in a goal is existentially quantified. The goal ?-p(X) is, in effect, a request to prove that there is some X such that p(X).5
| ?- mathematician(X),linguist(X),piano_player(X).
X = montague ? ;
no
| ?-

We can display a proof as follows, this time showing the clause and bindings used at each step:
goal            theory                              workspace
?-human(X)      Γ                                   ?-human(X)
?-human(X)      human(X):-mathematician(X)          ?-mathematician(X)    {X ↦ X}
?-human(X)      mathematician(frege)                                      {X ↦ frege}
5 More precisely, the prolog proof is a refutation of (∀X)¬p(X), and refuting this is equivalent to establishing an existential claim: ¬(∀X)¬p(X) ≡ (∃X)p(X).
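Before leaving this section, notice that the sum axioms in people.pl can already be used to compute: the following sample query (our addition; output format follows SWI) computes 2+3 in successor notation.

?- sum(s(s(0)), s(s(s(0))), Z).
Z = s(s(s(s(s(0)))))
Yes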
1.6 The logic of sequences
Time imposes a sequential structure on the words and syntactic constituents of a spoken sentence. Sequences are sometimes treated as functions from initial segments of the set of natural numbers. Such functions have the right mathematical properties, but this is a clunky approach that works just because the natural numbers are linearly ordered. What are the essential properties of sequences? We can postpone this question, since what we need for present purposes is not a deep understanding of time and sequence, but a way of calculating certain basic relations among sequences and their elements – a sort of arithmetic.6 We will represent a string of words like
the cat is on the mat as a sequence or list of words:
[the,cat,is,on,the,mat]
We would like to be able to prove that some sequences of words are good sentences, others are not. To begin with a simple idea, though, suppose that we want to prove that the sequence shown above contains the word cat. We can define the membership relation between sequences and their elements in the following way:7
member(X,[X|L]).
member(X,[_|L]) :- member(X,L).
The vertical bar notation is used in these axioms to separate the first element of a sequence from the remainder. So the first axiom says that X is a member of any sequence which has X followed by any remainder L. The second axiom says that X is a member of any sequence which has some first element followed by the remainder L if X is a member of L. With these axioms, we can prove:
| ?- member(cat,[the,cat,is,on,the,mat]).
yes
There are exactly 6 proofs that something is a member of this sequence, with 6 different bindings of X:
| ?- member(X,[the,cat,is,on,the,mat]).
X = the ? ;
X = cat ? ;
X = is ? ;
X = on ? ;
X = the ? ;
X = mat ? ;
no
| ?-
The member predicate is so important, it is "built in" to SWI prolog – the two axioms shown above are already there. Another basic "built in" predicate for sequences is length, and this one uses the "built in" predicate is for arithmetic expressions. First, notice that you can do some simple arithmetic using is in the following way:
6A formal equivalence between the logic of sequences and concatenation, on the one hand, and arithmetic on the other, has been observed by Hermes (1938), Quine (1946), Corcoran, Frank, and Maloney (1974). See the footnote 9, below. The stage was later set for understanding the connection between arithmetic and theories of trees (and tree-like structures) in language by Rabin (1969); see Cornell and Rogers (1999) for an overview. 7The underscore _ by itself is a special variable, called the “anonymous variable” because no two occurrences of this symbol represent the same variable. It is good form to use this symbol for any variable that occurs just once in a clause; if you don’t, Prolog will warn you about your “singleton variables.”
?- X is 2+3.
X = 5
Yes
?- X is 2^3.
X = 8
Yes
Using this predicate, length is defined this way:
length([],0).
length([_|L],N) :- length(L,N0), N is N0+1.
The first axiom says that the empty sequence has length 0. The second axiom says that any list has length N if the result of removing the first element of the list has length N0 and N is N0+1. Since these axioms are already "built in" we can use them immediately with goals like this:
?- length([a,b,1,2],N).
N = 4
Yes
?- length([a,b,[1,2]],N).
N = 3
Yes
We can do a lot with these basic ingredients, but first we should understand what's going on. This standard approach to sequences or lists may seem odd at first. The empty list is named by [], and non-empty lists are represented as the denotations of the period (often pronounced "cons" for "constructor"), which is a binary function symbol. A list with one element is denoted by cons of that element and the empty list. For example, .(a,[]) denotes the sequence which has just the one element a. And .(b,.(a,[])) denotes the sequence with first element b and second element a. For convenience, we use [b,a] as an alternative notation for the clumsier .(b,.(a,[])), and we use [A|B] as an alternative notation for .(A,B).
Examples:
If we apply {X ↦ frege, Y ↦ []} to the list [X|Y], we get the list [frege].
If we apply {X ↦ frege, Y ↦ [hilbert]} to the list [X|Y], we get the list [frege,hilbert].
[X|Y] and [frege,hilbert] match after the substitution {X ↦ frege, Y ↦ [hilbert]}.
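This kind of matching can be seen directly in a prolog session with the built-in =/2 (a sample query of our own, not from the notes):

?- [X|Y] = [frege,hilbert].
X = frege
Y = [hilbert]
Yes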
Using this notation, we presented an axiomatization of the member relation. Another basic thing we need to be able to do is to put two sequences together, “concatenating” or “appending” them. We can accordingly define a 3-place relation we call append with the following two simple axioms:8
append([],L,L).
append([E|L0],L1,[E|L2]) :- append(L0,L1,L2).
The first axiom says that, for all L, the result of appending the empty list and L is L. The second axiom says that, for all E, L0, L1, and L2, the result of appending [E|L0] with L1 is [E|L2] if the result of appending
8These axioms are “built in” to SWI-prolog – they are already there, and the system will not let you redefine this relation. However, you could define the same relation with a different name like myappend.
L0 and L1 is L2. These two axioms entail that there is a list L which is the result of appending [the,cat,is] with [on,the,mat].9 Prolog can prove this fact:
| ?- append([the,cat,is],[on,the,mat],L).
L = [the,cat,is,on,the,mat] ? ;
no
The proof of the goal, append([the,cat,is],[on,the,mat],[the,cat,is,on,the,mat]) can be depicted by the following proof tree:
append([the,cat,is],[on,the,mat],[the,cat,is,on,the,mat])
append([cat,is],[on,the,mat],[cat,is,on,the,mat])
append([is],[on,the,mat],[is,on,the,mat])
append([],[on,the,mat],[on,the,mat])
This axiomatization of append behaves nicely on a wide range of problems. It correctly rejects
| ?- append([the,cat,is],[on,the,mat],[]).
no
| ?- append([the,cat,is],[on,the,mat],[the,cat,is,on,the]).
no
We can also use it to split a list:
| ?- append(L0,L1,[the,cat,is,on,the,mat]).
L0 = []
L1 = [the,cat,is,on,the,mat] ? ;
L0 = [the]
L1 = [cat,is,on,the,mat] ? ;
L0 = [the,cat]
L1 = [is,on,the,mat] ? ;
L0 = [the,cat,is]
L1 = [on,the,mat] ? ;
L0 = [the,cat,is,on]
L1 = [the,mat] ? ;
L0 = [the,cat,is,on,the]
L1 = [mat] ? ;
L0 = [the,cat,is,on,the,mat]
L1 = [] ? ;
no
9Using the successor function s and 0 to represent the numbers, so that s(0)=1, s(s(0))=2,…, notice how similar the definition of append is to the following formulation of Peano’s axioms for the sum relation:
sum(0,N,N). sum(s(N0),N1,s(N2)) :- sum(N0,N1,N2).
Each of these solutions represents a different proof, a proof that could be diagrammed like the one discussed above.10
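Because append is a pure relation, it can also be used to define other relations on sequences. Here is a small sketch of our own (the predicate names prefix_of/2 and suffix_of/2 are not from the notes):

% P is an initial segment of L; S is a final segment of L.
prefix_of(P, L) :- append(P, _, L).
suffix_of(S, L) :- append(_, S, L).

% | ?- prefix_of([the,cat], [the,cat,is,on,the,mat]).
% yes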
Pitfall 2: Infinite terms
Step 3 of the unification algorithm involves checking to see if a variable occurs in a term. This is called the "occurs check." It is easy to find examples to show that this check is necessary. Consider the prolog axiom which says that for any number N, the successor of N is greater than N:
greaterthan(s(N),N).
Now suppose we try to prove that some number is greater than itself. In prolog, this would be an attempt to prove the goal:
| ?- greaterthan(N,N).
To establish this goal, prolog will select the axiom greaterthan(s(N),N) and rename the variables to get something like E = greaterthan(s(A1), A1). We will then try to unify this with the goal F = greaterthan(N, N), but these are not unifiable. Let's see why this is so. Comparing these expressions, we find that they differ at the expressions s(A1) and N, so σ1 = {N ↦ s(A1)}. Now we compare Eσ1 = greaterthan(s(A1), s(A1)) with Fσ1 = greaterthan(s(A1), A1). These differ at the expressions s(A1) and A1. But A1 occurs in s(A1) and so the algorithm stops with failure. We do not attempt to apply the substitution {A1 ↦ s(A1)}.
For the sake of efficiency, though, standard implementations of prolog do not apply the occurs check. This only causes trouble in certain cases, like the one described above. If you give prolog the axiom above and the goal ?- greaterthan(N,N)., then prolog will start printing out something horrible like:
N = s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s
(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s
(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s
(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s
...
If you ever get something like this, try to stop it by typing control-C once or twice. It should be clear by now why prolog prints this out. Like the problem with left recursion, the designers of prolog could have eliminated this problem, but instead they chose to let the users of prolog avoid it so that (when the users are appropriately careful to avoid the pitfalls) the proofs can be computed very quickly.
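If the occurs check is ever wanted, ISO prolog (including SWI) provides unify_with_occurs_check/2, which behaves like the unification algorithm of section 1.5 and simply fails rather than building an infinite term. A brief sketch (our example, not from the notes):

?- unify_with_occurs_check(N, s(N)).
no

?- unify_with_occurs_check(greaterthan(X,Y), greaterthan(s(A),A)).
X = s(A)
Y = A
Yes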
10Try diagramming one or two of these proofs on a dull evening.
Exercises
(1) A propositional representation of a grammar is presented in (10), and the first proof of ?-ip that prolog will find is shown in tree form above. Draw the tree that depicts the second proof prolog would find.
(2) For each of the following pairs of literals, present a most general unifier (if there is one):
a. human(X)    human(bob)
b. loves(X,mary)    loves(bob,Y)
c. [cat,mouse]    [Z|L]
d. [cat,mouse,fish]    [Z|L]
e. [cat,mouse,fish]    [dog|L]
f. member(X,[cat,mouse,fish])    member(Z,[Z|L])
(3) You may remember from (3) in the course description the hypothesis that perception aims to compute small representations.11 I am interested in the size of things. We have seen how to calculate the length of a list, but the elements of a list can have various sizes too. For example, we might want to say that the following two structures have different sizes, even though they are sequences of the same length:
[a,b] [akjjdfkpodsaijfospdafpodsa,aposdifjodsahfpodsaihfpoad]
SWI prolog has a built-in predicate that lets you take apart an atom into its individual characters:
?- atom_chars(a,L).
L = [a]
Yes
?- atom_chars(akjjdfkpodsaijfospdafpods,L).
L = [a, k, j, j, d, f, k, p, o|...]
Yes
Notice that SWI prolog puts in an ellipsis when it is showing you long lists, but the whole list is there, as we can see by checking the lengths of each one:
?- atom_chars(a,L),length(L,N).
L = [a]
N = 1
Yes
?- atom_chars(akjjdfkpodsaijfospdafpods,L),length(L,N).
L = [a, k, j, j, d, f, k, p, o|...]
N = 25
Yes
So we can define a predicate that relates an atom to its length as follows:
atom_length(Atom,Length) :- atom_chars(Atom,Chars),length(Chars,Length).
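For instance, with this clause loaded (recent SWI versions also provide atom_length/2 as a built-in with the same meaning), a query might look like this:

?- atom_length(songs, N).
N = 5
Yes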
Define a predicate sum_lengths that relates a sequence of atoms to the sum of the lengths of each atom in the sequence, so that, for example,
11It is sometimes proposed that learning in general is the discovery of small representations (Chater and Vitányi, 2002). This may also be related to some of the various general tendencies towards economy (or “optimality”) of expression in language.
?- sum_lengths([],N). N=0 Yes ?- sum_lengths([a,akjjdfkpodsaijfospdafpods],N). N=26 Yes Extra credit: The number of characters in a string is not a very good measure of its size, since it matters whether the elements of the string are taken from a 26 symbol alphabet like {a-z} or a 10 symbol alphabet like {0-9} or a two symbol alphabet like {0,1}. The most common size measures are given in terms of two symbol alphabets: we consider how many symbols are needed for a binary encoding, how many “bits” are needed. Now suppose that we want to represent a sequence of letters or numbers. Let’s consider sequences of the digits 0-9 first. A naive idea is this: to code up a number like 52 in binary notation, simply represent each digit in binary notation. Since 5 is 101 and 2 is 10, we would write 10110 for 52. This is obviously not a good strategy, since there is no indication of the boundaries between the 5 and the 2. The same sequence would be the code for 26. Instead, we could just express 52 in base 2, which happens to be 110100. While this is possible, it is a rather inefficient code, because there are actually infinitely many binary representations of 52:
110100, 0110100, 00110100, 000110100, ...
Adding any number of preceding zeroes has no effect! A better code would not be so wasteful. Here is a better idea. We will represent the numbers with binary sequences as follows:
decimal number:      0    1    2    3    4    5    6    ...
binary sequence:          0    1    00   01   10   11   ...
Now here is the prolog exercise:
i. Write a prolog predicate e(N,L) that relates each decimal number N to its binary sequence representation L.
ii. Write a prolog predicate elength(N,Length) that relates each decimal number N to the length of its binary sequence representation.
iii. We saw above that the length of the (smallest – no preceding zeroes) binary representation of 52 is 6. Use the definition you just wrote to have prolog compute the length of the binary sequence encoding for 52.
3 more exercises with sequences – easy, medium, challenging!
(4) Define a predicate countOccurrences(E,L,Count) that will take a list L, an element E, and return the Count of the number of times E occurs in L, in decimal notation. Test your predicate by making sure you get the right response to these tests:
?- countOccurrences(s,[m,i,s,s,i,s,s,i,p,p,i],N).
N = 4 ;
No
?- countOccurrences(x,[m,i,s,s,i,s,s,i,p,p,i],N).
N = 0 ;
No
(5) Define a predicate reduplicated(L) that will be provable just in case list L can be divided in half – i.e. into two lists of the same length – where the first and second halves are the same. Test your predicate by making sure you get the right response to these tests:
?- reduplicated([w,u,l,o]).
No
?- reduplicated([w,u,l,o,w,u,l,o]).
Yes
This might remind you of "reduplication" in human languages. For example, in Bambara, an African language spoken by about 3 million people in Mali and nearby countries, we find an especially simple kind of "reduplication" structure, which we see in complex words like this:
wulu                         'dog'
wulo o wulo                  'whichever dog'
malo                         'rice'
malo o malo                  'whichever rice'
*malo o wulu                 NEVER!
malonyinina                  'someone who looks for rice'
malonyinina o malonyinina    'whoever looks for rice'
(6) Define a predicate palindrome(L) that will be provable just in case when you look at the characters in the atoms of list L, L is equal to its reverse. Test your predicate by making sure you get the right response to these tests:
?- palindrome([wuloowulo]).
No
?- palindrome([hannah]).
Yes
?- palindrome([mary]).
No
?- palindrome([a,man,a,plan,a,canal,panama]).
Yes
Two harder extra credit problems for the go-getters
(7) More extra credit, part 1. The previous extra credit problem can be solved in lots of ways. Here is a simple way to do the first part of that problem: to encode N, we count up to N with our binary sequences. But since the front of the list is the easiest to access, we count with the least-significant digit at the front of the list, and then reverse the result. Here is a prolog program that does this:
e(N,L) :- countupReverse(N,R), reverse(R,L).
countupReverse(0,[]).
countupReverse(N,L) :- N>0, N1 is N-1, countupReverse(N1,L1), addone(L1,L).

addone([],[0]).
addone([0|R],[1|R]).
addone([1|R0],[0|R]) :- addone(R0,R).

This is good, but suppose that we want to communicate two numbers in sequence. For this purpose, our binary representations are still no good, because you cannot tell where one number ends and the next one begins. One way to solve this problem is to decide, in advance, that every number will be represented by a certain number of bits – say 7. This is what is done in standard ascii codes for example. But blocks of n bits limit you in advance to encoding no more than 2^n elements, and they are inefficient if some symbols are more common than others. For many purposes, a better strategy is to use a coding scheme where no symbol (represented by a sequence of bits) is the prefix of any other one. That means, we would never get confused about where one symbol ends and the next one begins. One extremely simple way to encode numbers in this way is this. To represent a number like 5, we put in front of [1,0] an (unambiguous) representation of the length n of [1,0] – namely, we use n 1's followed by a 0. So then, to represent 5, we use [1,1,0,1,0]. The first 3 bits indicate that the number we have encoded is two bits long. So in this notation, we can unambiguously determine what sequence of numbers is represented by [1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1]. This is a binary code for the number sequence [1, 2, 6]. Define a predicate e1(NumberSequence,BinaryCode) that transforms any sequence of numbers into this binary code. (We will improve on this code later.)
(8) More extra credit, part 2 (hard!). While the definition of e(N, L) given above works, it involves counting from 0=[] all the way up to the number you want. Can you find a simpler way?
Hint: The empty sequence represents 0, and any other sequence of binary digits [an, an−1, ..., a0] represents

    Σ_{i=0}^{n} (a_i + 1) 2^i.

So for example, [1,0] represents (0 + 1)2^0 + (1 + 1)2^1 = 1 + 4 = 5. Equivalently, [an, an−1, ..., a0] represents

    2^{n+1} − 1 + Σ_{i=0}^{n} a_i 2^i.

So for example, [1,0] represents 2^{1+1} − 1 + (0 · 2^0) + (1 · 2^1) = 4 − 1 + 0 + 2 = 5. (Believe it or not, some students in the class already almost figured this out, instead of using a simple counting strategy like the one I used in the definition of e above.)
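As a quick check of the e/2 program from (7), here are two sample queries (our addition; they assume SWI's built-in reverse/2):

?- e(5, L).
L = [1, 0]
Yes

?- e(0, L).
L = []
Yes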
2 Recognition: first idea
(1) We noticed in example (10) above that the way prolog finds a proof corresponds exactly to a simple way of finding a derivation from a context free grammar. In fact, the two are the same if we represent the grammar with definite clauses:
/*
 * file: th2.pl
 */
ip :- dp, i1.        i1 :- i0, vp.        i0 :- will.        will.
dp :- d1.            d1 :- d0, np.        d0 :- the.         the.
np :- n1.            n1 :- n0.            n0 :- idea.        idea.
n1 :- n0, cp.
vp :- v1.            v1 :- v0.            v0 :- suffice.     suffice.
cp :- c1.            c1 :- c0, ip.        c0 :- that.        that.
Often we want to know more than simply whether there is some derivation of a category though. For example, rather than asking simply, "Can an ip be derived" using the goal ?-ip, we might want to know whether the sequence the idea will suffice is an ip.
(2) In effect, what we want is to ask whether there is a certain kind of proof of ?-ip, namely a proof where the "lexical axioms" the, idea, will, suffice are used exactly once each, in order.
(3) Resource logics allow control over the number of times an axiom can be used, and in what order, and so they are well suited for reasoning about language (Moortgat, 1996; Roorda, 1991; Girard, Lafont, and Taylor, 1989). Intuitively, these logics allow us to think of our formulas as resources that can get used up in the course of a proof. In a sense, these logics are simpler than standard logics, since they simply lack the "structural rules" that we see in standard logics, rules that allow the axioms to be arbitrarily reordered or repeated.
(4) So we will first define a prolog provability predicate, and then we will modify that definition so that it is "resource sensitive" with respect to the lexical axioms.
26 Stabler - Lx 185/209 2003
2.1 A provability predicate
(5) Given a theory Γ (the “object theory”), can we define a theory Γ′ (a “metatheory”) that contains a predicate provable(A) such that:
    (?-C), Γ ⊢ (?-A)   iff   (?-provable(C)), Γ′ ⊢ (?-provable(A))
Notice that the expressions C, A are used in the statement on the left side of this biconditional, while they are mentioned in the goal on the right side of the biconditional. (This distinction is sometimes marked in logic books with corner quotes, but in prolog we rely on context to signal the distinction.12)
(6) Recall that the propositional prolog logic is given as follows, with just one inference rule:
    G, Γ ⊢ G   [axiom]   for any set of definite clauses Γ and any goal G
    G, Γ ⊢ (?-p, C)
    -----------------------------   if (p :- q1,...,qn) ∈ Γ
    G, Γ ⊢ (?-q1,...,qn, C)
(7) In SWI-Prolog, let’s represent an object theory using definite clauses of the form:
p :˜ [q,r]. q:˜[]. r:˜[].
So then, given a prolog theory Γ, we will change it to a theory Γ′ with a provability predicate for Γ, just by changing :- to :˜ and by defining the prefix ?˜ provability predicate:
/* * provable.pl */
:- op(1200,xfx,:˜).   % this is our object language "if"
:- op(400,fx,?˜).     % metalanguage provability predicate
(?˜ []). (?˜ Goals0) :- infer(Goals0,Goals), (?˜ Goals).
infer([A|C], DC) :- (A :˜ D), append(D,C,DC). % ll
p :˜ [q,r]. q:˜[]. r:˜[].
12 On corner quotes, and why they allow us to say things that simple quotes do not, see Quine (1951a, §6). There are other tricky things that come up with provability predicates, especially in theories Γ that define provability in Γ. These are explored in the work on provability and “reflection principles” initiated by Löb, Gödel and others. Good introductions to this work can be found in (Boolos and Jeffrey, 1980; Boolos, 1979).
27 Stabler - Lx 185/209 2003
(8) Then we can have a session like this:
1 ?- [provable]. provable compiled, 0.00 sec, 1,432 bytes.
Yes 2 ?- (?˜ [p]).
Yes 3 ?- trace,(?˜ [p]). Call: ( 7) ?˜[p] ? Call: ( 8) infer([p], _L154) ? Call: ( 9) p:˜_L168 ? Exit: ( 9) p:˜[q, r] ? Call: ( 9) append([q, r], [], _L154) ? Call: ( 10) append([r], [], _G296) ? Call: ( 11) append([], [], _G299) ? Exit: ( 11) append([], [], []) ? Exit: ( 10) append([r], [], [r]) ? Exit: ( 9) append([q, r], [], [q, r]) ? Exit: ( 8) infer([p], [q, r]) ? Call: ( 8) ?˜[q, r] ? Call: ( 9) infer([q, r], _L165) ? Call: ( 10) q:˜_L179 ? Exit: ( 10) q:˜[] ? Call: ( 10) append([], [r], _L165) ? Exit: ( 10) append([], [r], [r]) ? Exit: ( 9) infer([q, r], [r]) ? Call: ( 9) ?˜[r] ? Call: ( 10) infer([r], _L176) ? Call: ( 11) r:˜_L190 ? Exit: ( 11) r:˜[] ? Call: ( 11) append([], [], _L176) ? Exit: ( 11) append([], [], []) ? Exit: ( 10) infer([r], []) ? Call: ( 10) ?˜[] ? Exit: ( 10) ?˜[] ? Exit: ( 9) ?˜[r] ? Exit: ( 8) ?˜[q, r] ? Exit: ( 7) ?˜[p] ?
Yes
2.2 A recognition predicate
(9) Now, we want to model recognizing that a string can be derived from ip in a grammar as finding a proof of ip that uses the lexical axioms in that string exactly once each, in order. To do this, we will separate the lexical rules Σ from the rest of our theory Γ that includes the grammar rules. Σ is just the vocabulary of the grammar.
(10) The following proof system does what we want:
    G, Γ, S ⊢ G   [axiom]   for definite clauses Γ, goal G, S ∈ Σ∗
    G, Γ, S ⊢ (?-p, C)
    -----------------------------   if (p :- q1,...,qn) ∈ Γ
    G, Γ, S ⊢ (?-q1,...,qn, C)

    G, Γ, wS ⊢ (?-w, C)
    -----------------------------   [scan]
    G, Γ, S ⊢ (?-C)
28 Stabler - Lx 185/209 2003
(11) We can implement this in SWI-prolog as follows:
/* * file: recognize.pl */
:- op(1200,xfx,:˜).    % this is our object language "if"
:- op(1100,xfx,?˜).    % metalanguage provability predicate
[] ?˜ []. (S0 ?˜ Goals0) :- infer(S0,Goals0,S,Goals), (S ?˜ Goals).
infer(S,[A|C], S,DC) :- (A :˜ D), append(D,C,DC).   % ll
infer([A|S],[A|C], S,C).                            % scan
ip :˜ [dp, i1]. i1 :˜ [i0, vp]. i0 :˜ [will]. dp :˜ [d1]. d1 :˜ [d0, np]. d0 :˜ [the]. np :˜ [n1]. n1 :˜ [n0]. n0 :˜ [idea]. n1 :˜ [n0, cp]. vp :˜ [v1]. v1 :˜ [v0]. v0 :˜ [suffice]. cp :˜ [c1]. c1 :˜ [c0, ip]. c0 :˜ [that].
(12) Then we can have a session like this:
2% swiprolog Welcome to SWI-Prolog (Version 3.2.9) Copyright (c) 1993-1999 University of Amsterdam. All rights reserved.
For help, use ?- help(Topic). or ?- apropos(Word).
1 ?- [recognize]. recognize compiled, 0.01 sec, 3,128 bytes.
Yes 2 ?- [the,idea,will,suffice] ?˜ [ip].
Yes 3 ?- [idea,the,will] ?˜ [ip].
No 4 ?- [idea] ?˜ [Cat].
Cat=np;
Cat=n1;
Cat=n0;
Cat = idea ;
No 5 ?- [will,suffice] ?˜ [Cat].
Cat=i1;
No 6 ?- [will,suffice,will,suffice] ?˜ [C,D].
C=i1 D=i1;
No 7 ?- S ?˜ [ip].
S = [the, idea, will, suffice] ;
S = [the, idea, will, v0] ;
S = [the, idea, will, v1]
Yes
29 Stabler - Lx 185/209 2003
(13) The execution of the recognition device defined by provable can be depicted like this, where Γ is the grammar:
goal theory resources workspace
?-ip , Γ , [the,idea,will,suffice] ?-ip
?-ip , Γ , [the,idea,will,suffice] ?-dp,i1
?-ip , Γ , [the,idea,will,suffice] ?-d0,np,i1
?-ip , Γ , [the,idea,will,suffice] ?-the,np,i1
?-ip , Γ , [idea,will,suffice] ?-np,i1
?-ip , Γ , [idea,will,suffice] ?-n1,i1
?-ip , Γ , [idea,will,suffice] ?-n0,i1
?-ip , Γ , [idea,will,suffice] ?-idea,i1
?-ip , Γ , [will,suffice] ?-i1
?-ip , Γ , [will,suffice] ?-i0,vp
?-ip , Γ , [will,suffice] ?-will,vp
?-ip , Γ , [suffice] ?-vp
?-ip , Γ , [suffice] ?-v1
?-ip , Γ , [suffice] ?-v0
?-ip , Γ , [suffice] ?-suffice
?-ip , Γ , []
(14) This is a standard top-down, left-to-right, backtracking context free recognizer.
30 Stabler - Lx 185/209 2003
2.3 Finite state recognizers
(15) A subset of the context free grammars have rules that are only of the following forms, where word is a lexical item and p,r are categories:
p :˜ [word,r]. p:˜[].
These grammars “branch only on the right” – they are “right linear.” (16) Right linear grammars are finite state in the following sense:
for any right linear grammar there is a finite bound k such that every sentence generated by the grammar can be recognized or rejected with a sequence (“stack”) in the “workspace” of length no greater than k.
(17) Right linear grammars can be regarded as presentations of finite state transition graphs, where the empty productions indicate the final states. For example, the following grammar generates {0, 1}∗:
s :˜ [0,s]. s :˜ [1,s]. s:˜[].
[Figure: a transition graph with a single state s, which is both the start and the final state, and two looping arcs labeled 0 and 1.]
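To see why the workspace stays bounded here, the successful TD deduction for the string 0 1 0 can be written in the format of the table in (13) of §2.2; the workspace never holds more than two conjuncts. (This worked example is added for illustration; it is not in the original notes.)

goal theory resources workspace
?-s , Γ , [0,1,0] ?-s
?-s , Γ , [0,1,0] ?-0,s
?-s , Γ , [1,0] ?-s
?-s , Γ , [1,0] ?-1,s
?-s , Γ , [0] ?-s
?-s , Γ , [0] ?-0,s
?-s , Γ , [] ?-s
?-s , Γ , []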
Another example
s :˜ [the,t]. t :˜ [cat,u]. u :˜ [is,v]. v :˜ [on,w]. w :˜ [the,x]. x :˜ [mat,y]. y:˜[].
[Figure: a transition graph with states s, t, u, v, w, x, y in sequence, arcs labeled the, cat, is, on, the, mat, and final state y.]
(18) These properties will be carefully established in standard texts on formal language theory and the theory of computation (Moll, Arbib, and Kfoury, 1988; Lewis and Papadimitriou, 1981; Hopcroft and Ullman, 1979; Salomaa, 1973), but the basic ideas here are simple. Finite state grammars like this are sometimes used to represent the set of lexical sequences that most closely fit with an acoustic input. These grammars are also used to model parts of OT phonology (Ellison, 1994; Eisner, 1997; Frank and Satta, 1998).
31 Stabler - Lx 185/209 2003
Exercises
(1) Type recognize.pl from page 29 into your own editor, except call it newrecognize.pl, and replace the grammar with a right-branching finite state grammar with start category input that accepts all possible word sequences that might plausibly be confused for an acoustic presentation of:
mares eat oats
For example, “mares” see dotes,13 mayors seat dotes, mairs seed oat’s14,…(If this task is not possible, explain why not, and write a grammar that comes as close as you can to this goal.) Design this grammar so that it has the start category input, and so that it does not use any of the categories that I used in the grammar in (11) of §2.2 above. (2) Check your grammar by making sure that prolog can use it to accept some of the possibilities, and to reject impossibilities: For example, you should get something like this:
1 ?- [newrecognize]. newrecognize compiled, 0.00 sec, 2,780 bytes.
Yes 2 ?- [mayors,seat,dotes] ?˜ [input].
Yes 3 ?- [do,it] ?˜ [input].
No
(3) Extra credit. (This one is not too hard – you should try it.) a. Modify the example grammar given in (11) of §2.2 above so that it accepts mares eat oats as an ip. (Leave the syntax as unchanged as possible in this step.) b. Now suppose that when we hear an utterance of “mares eat oats”, the resources available to be recognized are not [mares,eat,oats], but rather any one of the strings given by your finite state machine. Provide a new definition of provable which, instead of using resources from a particular string S, uses any string that is accepted by the finite state grammar designed above. (Hint: Instead of using a list of resources, use a state of the finite state machine as a representation of the resources available.)
13 The first word of this sentence is “the.” But the first and last words of the previous sentence are not the same! The first is a determiner, while the last is a proper noun, a quotation name of a word. A similar point applies to the first word in the example: it is also a proper noun. 14 The OED says mair is a “northern form of more, and nightmare.”
32 Stabler - Lx 185/209 2003
(4) There are quite a few language resources available online. One of them is Roger Mitton’s (1992) phonetic dictionary in the Oxford Text Archive. It has 70645 words of various kinds with phonetic transcriptions of British English. The beginning of the listing looks like this:
’neath niT T-$ 1 ’shun SVn W-$ 1 ’twas tw0z Gf$ 1 ’tween twin Pu$,T-$ 1 ’tween-decks ’twin-deks Pu$ 2 ’twere tw3R Gf$ 1 ’twill twIl Gf$ 1 ’twixt twIkst T-$ 1 ’twould twUd Gf$ 1 ’un @n Qx$ 1 A eI Ki$ 1 A’s eIz Kj$ 1 A-bombs ’eI-b0mz Kj$ 2 A-level ’eI-levl K6% 3 A-levels ’eI-levlz Kj% 3 AA ,eI’eI Y>% 2 ABC ,eI,bi’si Y>% 3
The second column is a phonetic transcription of the word spelled in the first column. (Columns 3 and 4 contain syntactic category and number of syllables.) The phonetic transcription has notations for 43 sounds. My guesses on the translation (Mitton symbol, IPA, example word):
i         i    bead
I         ɪ    bid
e         e    bed
&         æ    bad
A         ɑ    bard
0 (zero)  ɒ    cod
O         ɔ    cord
U         ʊ    good
u         u    food
V         ʌ
3         ɜ    bird
@         ə    about
eI        eɪ   day
@U        əʊ   go
aI        aɪ   eye
aU        aʊ   cow
oI        ɔɪ   boy
I@        ɪə   beer
e@        eə   bare
U@        ʊə   tour
N         ŋ    sing
T         θ    thin
D         ð    then
S         ʃ    shed
Z         ʒ    beige
R              far
p, t, k, b, d, g, m, n, f, v, s, z, r, l, w, h, j are written with the same symbols as in IPA.
The phonetic entries also mark primary stress with an apostrophe, and secondary stress with a comma. Word boundaries in compound forms are indicated with a +, unless they are spelled with a hyphen or space, in which case the phonetic entries do the same: bookclub, above board, air-raid. a. Mitton’s dictionary is organized by spelling, rather than by phonetic transcription, but it would be easy to reverse. Write a program that maps phonetic sequences like this [D, @, k, &, t, I, z, O, n, D, @, m, &, t] to word sequences like this: [the, cat, is, on, the, mat]. b. As in the previous problem (3), connect this translator to the recognizer, so that we can recognize certain phonetic sequences as sentences.
33 Stabler - Lx 185/209 2003
Problem (4), Solution 1:
%File: badprog.pl % not so bad, really!
% first, we load the first 2 columns of the Mitton lexicon
:- [col12].
test :- translate([’D’,’@’,k,’&’,t,’I’,z,’O’,n,’D’,’@’,m,’&’,t],Words), write(Words).
%translate(Phones,Words) translate([],[]). translate(Phones,[Word|Words]) :- append(FirstPhones,RestPhones,Phones), lex(Word,FirstPhones), translate(RestPhones,Words).
We can test this program like this:
1 ?- [badprog]. % col12 compiled 2.13 sec, 52 bytes % badprog compiled 2.13 sec, 196 bytes
Yes 2 ?- translate([’D’,’@’,k,’&’,t,’I’,z,’0’, n,’D’,’@’,m,’&’,t],Words).
Words = [the, cat, is, on, the, ’Matt’] ;
Words = [the, cat, is, on, the, mat] ;
Words = [the, cat, is, on, the, matt] ;
No 3?-
Part b of the problem asks us to integrate this kind of translation into the syntactic recognizer. Since we only want to do a dictionary lookup when we have a syntactic lexical item of the syntax, let’s represent the lexical items in the syntax with lists, like this: ip :˜ [dp, i1]. i1 :˜ [i0, vp]. i0 :˜ []. dp :˜ [d1]. d1 :˜ [d0, np]. d0 :˜ [[t,h,e]]. np :˜ [n1]. n1 :˜ [n0]. n0 :˜ [[c,a,t]]. n0 :˜ [[m,a,t]]. vp :˜ [v1]. v1 :˜ [v0,pp]. v0 :˜ [[i,s]]. pp :˜ [p1]. p1 :˜ [p0,dp]. p0 :˜ [[o,n]].
Now the syntactic atoms have a phonetic structure, as a list of characters. We test this grammar in the following session – notice each word is spelled out as a sequence of characters.
2 ?- [badprog]. % badprog compiled 0.00 sec, 2,760 bytes
Yes 3 ?- ([[t,h,e],[c,a,t],[i,s],[o,n],[t,h,e],[m,a,t]] ?˜ [ip]).
34 Stabler - Lx 185/209 2003
Yes
Now to integrate the syntax and the phonetic grammar, let’s modify the inference rules of our recognizer simply by adding a new “scan” rule: when we are trying to find a syntactic atom – now represented by a list of characters – we try to parse it as a sequence of phones using our transducer. Before we do the actual dictionary lookup, we put the characters back together with the built-in command atom_chars, since this is what our lexicon uses (we will change this in our next solution to the problem).
1 ?- atom_chars(cat,Chars). % just to see how this built-in predicate works
Chars = [c, a, t] ;
No 2 ?- atom_chars(Word,[c, a, t]).
Word = cat ;
No OK, so we extend our inference system with the one extra scan rule that parses the syntactic atoms phonetically, like this: [] ?˜ []. (S0 ?˜ Goals0) :- infer(S0,Goals0,S,Goals), (S ?˜ Goals). infer(S,[A|C],S,DC) :- (A :˜ D), append(D,C,DC). % ll infer([W|S],[W|C],S,C). % scan infer(Phones,[[Char|Chars]|C],Rest,C) :- % parse words atom_chars(Word,[Char|Chars]), append([Phon|Phons],Rest,Phones), lex(Word,[Phon|Phons]). Now we can test the result. 1 ?- [badprog]. % col12 compiled 2.95 sec, 13,338,448 bytes % badprog compiled 2.96 sec, 13,342,396 bytes
Yes 2 ?- ([’D’,’@’,k,’&’,t,’I’,z,’0’, n,’D’,’@’,m,’&’,t] ?˜ [ip]).
Yes 3 ?- ([’D’,’@’,k,’&’,t,’I’,z,’0’, n,’D’,’@’] ?˜ [ip]).
No 4 ?- ([’D’,’@’,k,’&’,t] ?˜ [dp]).
Yes 5?- It works!
Problem (4), Solution 2: We can generate a more efficient representation of the dictionary this way:
35 Stabler - Lx 185/209 2003
1 ?- [col12]. % col12 compiled 2.13 sec, 52 bytes
Yes 2 ?- tell(’col12r.pl’),lex(Word,[B|C]),atom_chars(Word,Chars), portray_clause(lex(B,C,Chars)),fail;told.
No You do not need to know about the special prolog facilities that are used here, but in case you are interested, here is a quick explanation. The built-in command tell causes all output to be written to the specified file; then each clause of lex is written in the new format using the built-in command portray_clause – which is just like write except that its argument is followed by a period, etc. so that it can be read as a clause; then the fail causes the prolog to backtrack an find the next clause, and the next and so on, until all of them are found. When all ways of proving the first conjuncts fail, then the built-in command told closes the file that was opened by tell and succeeds. After executing this command, we have a new representation of the lexicon that has clauses like this in it: ... lex(&, [d, m, @, r, @, l, t, ’I’], [a, d, m, i, r, a, l, t, y]). lex(&, [d, m, @, r, eI, ’S’, n], [a, d, m, i, r, a, t, i, o, n]). lex(@, [d, m, aI, @, ’R’], [a, d, m, i, r, e]). lex(@, [d, m, aI, @, d], [a, d, m, i, r, e, d]). lex(@, [d, m, aI, r, @, ’R’], [a, d, m, i, r, e, r]). lex(@, [d, m, aI, r, @, z], [a, d, m, i, r, e, r, s]). lex(@, [d, m, aI, @, z], [a, d, m, i, r, e, s]). lex(@, [d, m, aI, @, r, ’I’, ’N’], [a, d, m, i, r, i, n, g]). lex(@, [d, m, aI, @, r, ’I’, ’N’, l, ’I’], [a, d, m, i, r, i, n, g, l, y]). lex(@, [d, m, ’I’, s, @, b, ’I’, l, ’I’, t, ’I’], [a, d, m, i, s, s, i, b, i, l, i, t, y]). lex(@, [d, m, ’I’, s, @, b, l], [a, d, m, i, s, s, i, b, l, e]). lex(@, [d, m, ’I’, ’S’, n], [a, d, m, i, s, s, i, o, n]). lex(@, [d, m, ’I’, ’S’, n, z], [a, d, m, i, s, s, i, o, n, s]). lex(@, [d, m, ’I’, t], [a, d, m, i, t]). lex(@, [d, m, ’I’, t, s], [a, d, m, i, t, s]). lex(@, [d, m, ’I’, t, n, s], [a, d, m, i, t, t, a, n, c, e]). ... This lexicon is more efficiently accessed because the first symbol of the phonetic transcription is exposed as the first argument. We just need to modify slightly the translate program to use this new representation of the dictionary: %File: medprog.pl
% first, we load the first 2 columns of the Mitton lexicon in the new format
:- [col12r].
%translate(Phones,Words) translate([],[]). translate(Phones,[Word|Words]) :- append([First|MorePhones],RestPhones,Phones), %% minor change here lex(First,MorePhones,Chars), %% minor change here atom_chars(Word,Chars), translate(RestPhones,Words).
36 Stabler - Lx 185/209 2003
We get a session that looks like this: 1 ?- [medprog]. % col12r compiled 3.68 sec, 17,544,304 bytes % medprog compiled 3.68 sec, 17,545,464 bytes
Yes 2 ?- translate([’D’,’@’,k,’&’,t,’I’,z,’0’, n,’D’,’@’,m,’&’,t],Words).
Words = [the, cat, is, on, the, ’Matt’] ;
Words = [the, cat, is, on, the, mat] ;
Words = [the, cat, is, on, the, matt] ;
No
3 ?-
Part b of the problem asks us to integrate this kind of translation into the syntax. Using the same syntax from the previous solution, we just need a slightly different scan rule:
[] ?˜ [].
(S0 ?˜ Goals0) :- infer(S0,Goals0,S,Goals), (S ?˜ Goals).
infer(S,[A|C],S,DC) :- (A :˜ D), append(D,C,DC).   % ll
infer([A|S],[A|C],S,C).                            % scan
infer(Phones,[[Char|Chars]|C],Rest,C) :-
    append([Phon|Phons],Rest,Phones),
    lex(Phon,Phons,[Char|Chars]).                  % minor changes here
Now we can test the result.
1 ?- [medprog].
% col12r compiled 3.73 sec, 17,544,304 bytes
% medprog compiled 3.73 sec, 17,548,376 bytes
Yes 2 ?- ([’D’,’@’,k,’&’,t,’I’,z,’0’, n,’D’,’@’,m,’&’,t] ?˜ [ip]).
Yes 3 ?- ([’D’,’@’,k,’&’,t,’I’,z,’0’, n,’D’,’@’] ?˜ [ip]).
No 4 ?- ([’D’,’@’,k,’&’,t] ?˜ [dp]).
Yes 5?- It works!
Problem (4), Solution 3: To get more efficient lookup, we can represent our dictionary as a tree. Prolog is not designed to take advantage of this kind of structure, but it is still valuable to get the idea of how it could be done in principle. We will only do it for a tiny fragment of the dictionary for illustration. Consider the following entries from Mitton:
37 Stabler - Lx 185/209 2003
lex(’the’,[’D’,’@’]). lex(’cat’,[’k’,’&’,’t’]). lex(’cat-nap’,[’k’,’&’,’t’,’y’,’n’,’&’,’p’]). lex(’is’,[’I’,’z’]). lex(’island’,[’aI’,’l’,’@’,’n’,’d’]). lex(’on’,[’0’,’n’]). lex(’mat’,[’m’,’&’,’t’]). lex(’matt’,[’m’,’&’,’t’]). lex(’Matt’,[’m’,’&’,’t’]).
We can represent this dictionary with the following prefix transducer that maps phones to spelling as follows:
[Figure: the prefix transducer drawn as a transition graph. From the start state lex, arcs labeled D:th, k:c, I:i, 0:o and m:[] lead to states [D], [k], [I], [0] and [m]; further arcs (@:e, &:a, t:t, z:s, ...) lead to the accepting states [D@], [k&t], [Iz], [0n] and [m&t], which spell out the, cat, is, on, and mat/matt/Matt respectively.]
As discussed in class, in order to represent a finite state transducer, which is, in effect, a grammar with “output,” we will label all the categories of the morphological component with terms of the form:
category(output)
So then the machine drawn above corresponds to the following grammar: lex([t,h|Rest]) :˜ [’D’,’[D]’(Rest)]. lex([c|Rest]) :˜ [k,’[k]’(Rest)]. lex([i|Rest]) :˜ [’I’,’[I]’(Rest)]. lex([o|Rest]) :˜ [’0’,’[0]’(Rest)]. % in Mitton notation, that’s a zero lex(Rest) :˜ [m,’[m]’(Rest)]. ’[D]’([’e’|Rest]) :˜ [’@’,’[D@]’(Rest)]. ’[D@]’([]) :˜ []. ’[k]’([a|Rest]) :˜ [’&’,’[k&]’(Rest)]. ’[k&]’([t|Rest]) :˜ [t,’[k&t]’(Rest)]. ’[k&t]’([]) :˜ []. ’[I]’([s|Rest]) :˜ [z,’[Iz]’(Rest)]. ’[Iz]’([]) :˜ [].
38 Stabler - Lx 185/209 2003
’[0]’([n|Rest]) :˜ [n,’[0n]’(Rest)]. ’[0n]’([]) :˜ []. ’[m]’(Rest) :˜ [’&’,’[m&]’(Rest)]. ’[m&]’(Rest) :˜ [t,’[m&t]’(Rest)]. ’[m&t]’([m,a,t]) :˜ []. ’[m&t]’([m,a,t,t]) :˜ []. ’[m&t]’([’M’,a,t,t]) :˜ [].
We can test this grammar this way:
2 ?- [goodprog]. % goodprog compiled 0.00 sec, 2,760 bytes
Yes 3 ?- ([’D’,’@’] ?˜ [lex(W)]).
W = [t, h, e] ;
No 4 ?- ([m,’&’,t] ?˜ [lex(W)]).
W = [m, a, t] ;
W = [m, a, t, t] ;
W = [’M’, a, t, t] ;
No 5 ?- ([m,’&’,t,’D’] ?˜ [lex(W)]).
No 6?-
(It’s always a good idea to test your axioms with both positive and negative cases like this!) Now let’s extend this to part b of the problem. We can use the same syntax again, and simply modify the “scan” rule so that, when we are trying to find a syntactic atom – now represented by a list of characters – we try to parse it as a sequence of phones using our transducer.
[] ?˜ [].
(S0 ?˜ Goals0) :- infer(S0,Goals0,S,Goals), (S ?˜ Goals).
infer(S,[A|C],S,DC) :- (A :˜ D), append(D,C,DC).   % ll
infer([A|S],[A|C],S,C).                            % scan
infer(Phones,[[Char|Chars]|C],Rest,C) :-
    append([Phon|Phons],Rest,Phones),
    ([Phon|Phons] ?˜ [lex([Char|Chars])]).         % minor change here
Now we can test the result.
1 ?- [goodprog]. % goodprog compiled 0.00 sec, 6,328 bytes
Yes
39 Stabler - Lx 185/209 2003
2 ?- ([’D’,’@’,k,’&’,t,’I’,z,’0’, n,’D’,’@’,m,’&’,t] ?˜ [ip]).
Yes 3 ?- ([’D’,’@’,k,’&’,t,’I’,z,’0’, n,’D’,’@’] ?˜ [ip]).
No 4 ?- ([’D’,’@’,k,’&’,t] ?˜ [dp]).
Yes 5?-
It works!
40 Stabler - Lx 185/209 2003
3 Extensions of the top-down recognizer
3.1 Unification grammars
(1) How should agreement relations be captured in a grammar? We actually already have a powerful mechanism available for this: instead of “propositional grammars” we can use “predicate grammars” where the arguments to the predicates can define subcategorizing features of each category. We explore this idea here, since it is quite widely used, before considering the idea from transformational grammar that agreement markers are heads of their own categories (Pollock 1994, Sportiche 1998, many others).
(2) Consider the following grammar:
% g2.pl
:- op(1200,xfx,:˜).
ip :˜ [dp(Per,Num), vp(Per,Num)]. dp(1,s) :˜ [’I’]. dp(2,s) :˜ [you]. dp(3,s) :˜ [it]. dp(3,s) :˜ [she]. dp(3,s) :˜ [he]. dp(3,Num) :˜ [d1(Num)]. d1(Num) :˜ [d0(Num), np(Num)]. d0(_Num) :˜ [the]. d0(p) :˜ [most]. d0(s) :˜ [every]. d0(p) :˜ [few]. np(Num) :˜ [n1(Num)]. n1(Num) :˜ [n0(Num)]. n0(s) :˜ [penguin]. n0(p) :˜ [penguins]. vp(Per,Num) :˜ [v1(Per,Num)]. v1(Per,Num) :˜ [v0(Per,Num)]. v0(1,s) :˜ [sing]. v0(2,s) :˜ [sing]. v0(3,s) :˜ [sings]. v0(3,p) :˜ [sing]. With this grammar g2.pl I produced the following session:
1 ?- [td],[g2]. td compiled, 0.00 sec, 1,116 bytes. g2 compiled, 0.01 sec, 2,860 bytes.
Yes 2 ?- [every,penguin,sings] ?˜ [ip].
Yes 3 ?- [every,penguins,sing] ?˜ [ip].
No 4 ?- [it,sing] ?˜ [ip].
No 5 ?- [it,sings] ?˜ [ip].
Yes 6 ?- [the,penguin,sings] ?˜ [ip].
Yes 7 ?- [the,penguin,sing] ?˜ [ip].
No 8 ?- [the,penguins,sings] ?˜ [ip].
No
41 Stabler - Lx 185/209 2003
(3) Dalrymple and Kaplan (2000), Bayer and Johnson (1995), Ingria (1990), and others have pointed out that agreement seems not always to have the “two way” character of unification. That is, while in English an ambiguous word can be resolved in only one way, this is not true in all languages:
a. The English fish is ambiguous between singular and plural, and cannot be both:
The fish who eats the food gets/*get fat
The fish who eat the food *gets/get fat
(This is what we expect if fish has a number feature that gets unified with one particular value.)
b. The Polish wh-pronoun kogo is ambiguous between accusative and genitive case, and can be both:
Kogo Janek lubi a Jerzy nienawidzi?
who Janek likes and Jerzy hates
(lubi requires an acc object and nienawidzi requires a gen object.)
c. The German was is ambiguous between accusative and nominative case, and can be both:
Ich habe gegessen was übrig war.
I have eaten what left was
(The German gegessen requires an acc object and übrig war needs a nom subject.)
Dalrymple and Kaplan (2000) propose that what examples like the last two show is that feature values should not be atoms like sg, pl or nom, acc, gen but (at least in some cases) sets of atoms.
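To make the set-valued proposal concrete, here is a minimal sketch (not from these notes; the predicate names wh_case, requires and ok_shared are invented for illustration) in which a case value is a set of atoms and each verb only checks membership in that set:

% kogo carries the set [acc,gen]; an atomic value could satisfy only one verb
wh_case(kogo, [acc,gen]).
requires(lubi, acc).          % 'likes' takes an accusative object
requires(nienawidzi, gen).    % 'hates' takes a genitive object

% the shared wh-phrase is acceptable if every verb finds its required case in the set
ok_shared(Wh, Verbs) :-
    wh_case(Wh, Cases),
    forall(member(V, Verbs), (requires(V, Case), member(Case, Cases))).

% ?- ok_shared(kogo, [lubi, nienawidzi]).   % succeeds
% If the case value were a single atom, unifying it with both acc and gen would fail.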
3.2 More unification grammars: case features
(4) We can easily extend the grammar g2.pl to require subjects to have nominative case and objects, accusative case, just by adding a case argument to dp:
% g3.pl
:- op(1200,xfx,:˜).
ip :˜ [dp(P,N,nom), i1(P,N)]. i1(P,N) :˜ [i0, vp(P,N)]. i0 :˜ [].
dp(1,s,nom) :˜ [’I’]. dp(2,_,_) :˜ [you]. dp(3,s,nom) :˜ [she]. dp(3,s,nom) :˜ [he]. dp(3,s,_) :˜ [it]. dp(3,p,nom) :˜ [they].
dp(1,s,acc) :˜ [’me’]. dp(3,s,acc) :˜ [her]. dp(3,s,acc) :˜ [him]. dp(3,p,acc) :˜ [them].
dp(3,s,_) :˜ [titus]. dp(3,s,_) :˜ [tamora]. dp(3,s,_) :˜ [lavinia].
dp(3,N,_) :˜ [d1(N)]. d1(N) :˜ [d0(N), np(N)]. d0(_) :˜ [the]. d0(p) :˜ [most]. d0(s) :˜ [every]. d0(s) :˜ [some]. d0(p) :˜ [few]. np(N) :˜ [n1(N)]. n1(N) :˜ [n0(N)]. n0(s) :˜ [penguin]. n0(p) :˜ [penguins]. n0(s) :˜ [song]. n0(p) :˜ [songs]. vp(P,N) :˜ [v1(P,N)]. v1(P,N) :˜ [v0(intrans,P,N)]. v1(P,N) :˜ [v0(trans,P,N),dp(_,_,acc)].
v0(_,1,_) :˜ [sing]. v0(_,2,_) :˜ [sing]. v0(_,3,s) :˜ [sings]. v0(_,3,p) :˜ [sing].
v0(trans,1,_) :˜ [praise]. v0(trans,2,_) :˜ [praise]. v0(trans,3,s) :˜ [praises]. v0(trans,3,p) :˜ [praise].
v0(intrans,1,_) :˜ [laugh]. v0(intrans,2,_) :˜ [laugh]. v0(intrans,3,s) :˜ [laughs]. v0(intrans,3,p) :˜ [laugh].
(5) The coordinate structure Tamora and Lavinia is plural. We cannot get this kind of construction with rules like the following because they are left recursive, and so problematic for TD:
dp(_,p,K) :˜ [dp(_,_,K), coord(dp(_,_,K))]. % nb: left recursion vp(P,N) :˜ [vp(P,N), coord(vp(P,N))]. % nb: left recursion coord(Cat) :˜ [and,Cat].
42 Stabler - Lx 185/209 2003
We will want to move to a recognizer that allows these, but notice that TD does allow the following restricted case of coordination:15
dp(_,p,K) :˜ [dp(_,s,K), coord(dp(_,_,K))]. coord(Cat) :˜ [and,Cat].
(6) With the simple grammar above (including the non-left-recursive coord rules), we have the following session:
| ?- [td,g3]. td compiled, 0.00 sec, 1,116 bytes. g3 compiled, 0.01 sec, 5,636 bytes.
| ?- [they,sing] ?˜ [ip].
yes | ?- [them,sing] ?˜ [ip].
no | ?- [they,praise,titus] ?˜ [ip].
yes | ?- [they,sing,titus] ?˜ [ip].
yes | ?- [he,sing,titus] ?˜ [ip].
no | ?- [he,sings,titus] ?˜ [ip].
yes | ?- [he,praises,titus] ?˜ [ip].
yes | ?- [he,praises] ?˜ [ip].
no | ?- [he,laughs] ?˜ [ip].
yes | ?- [he,laughs,titus] ?˜ [ip].
no | ?- [few,penguins,sing] ?˜ [ip].
yes | ?- [few,penguins,sings] ?˜ [ip].
no | ?- [some,penguin,sings] ?˜ [ip].
yes | ?- [you,and,’I’,sing] ?˜ [ip].
yes | ?- [titus,and,tamora,and,lavinia,sing] ?˜ [ip].
yes
15We are here ignoring the fact that, for most speakers, the coordinate structure Tamora or Lavinia is singular. We are also ignoring the complex interactions between determiner and noun agreement that we see in examples like this: a. Every cat and dog is/*are fat b. All cat and dog *is/*are fat c. The cat and dog *is/are fat
43 Stabler - Lx 185/209 2003
3.3 Recognizers: time and space
Given a recognizer, a (propositional) grammar Γ, and a string s ∈ Σ∗,
(7) a proof that s has category c ∈ N has space complexity k iff the goals on the right side of the deduction (“the workspace”) never have more than k conjuncts.
(8) For any string s ∈ Σ∗, we will say that s has space complexity k iff for every category A, every proof that s has category A has space complexity k.
(9) Where S is a set of strings, we will say S has space complexity k iff every s ∈ S has space complexity k.
(10) Set S has (finitely) bounded memory requirements iff there is some finite k such that S has space complexity k.
(11) The proof that s has category c has time complexity k iff the number of proof steps that can be taken from c is no more than k.
(12) For any string s ∈ Σ∗, we will say that s has time complexity k iff for every category A, every proof that s has category A has time complexity k.
3.3.1 Basic properties of the top-down recognizer
(13) The recognition method introduced last time has these derivation rules:
    G, Γ, S ⊢ G   [axiom]   for definite clauses Γ, goal G, S ∈ Σ∗

    G, Γ, S ⊢ (?-p, C)
    -----------------------------   if (p :- q1,...,qn) ∈ Γ
    G, Γ, S ⊢ (?-q1,...,qn, C)

    G, Γ, pS ⊢ (?-p, C)
    -----------------------------   [scan]
    G, Γ, S ⊢ (?-C)
To prove that a string s ∈ Σ∗ has category a, given a grammar Γ, we attempt to find a deduction of the following form, where [] is the empty string:
goal theory resources workspace
?-a , Γ ,s ?-a … ?-a , Γ , []
Since this defines a top-down recognizer, let’s call this logic TD.
(14) There is exactly one TD deduction for each derivation tree. That is: s ∈ yield(G,A) has n leftmost derivations from A iff there are n TD proofs that s has category A.
(15) Every right branching RS ⊆ yield(G,A) has bounded memory requirements in TD.
(16) No infinite left branching LS ⊆ yield(G,A) has bounded memory requirements in TD.
(17) If there is any left recursive derivation of s from A, then the problem of showing that s has category A has infinite space requirements in TD, and prolog may not terminate.
44 Stabler - Lx 185/209 2003
(18) Throughout our study, we will keep an eye on these basic properties of syntactic analysis algorithms which are mentioned in these facts:
1. First, we would like our deductive system to be sound (if s can be derived from A, then the empty goal can be deduced from axioms s and goal ?-A) and complete (if the empty goal can be deduced from axioms s and goal ?-A, then s can be derived from A), and we also prefer to avoid spurious ambiguity. That is, we would like there to be n proofs just in case there are n corresponding derivations from the grammar.
2. Furthermore, we would prefer for there to be a substantial subset of the language that can be recognized with finite memory.
3. Finally, we would like the search space for any particular input to be finite.
(19) Let’s call the grammar considered earlier, G1, implemented in g1.pl as follows:
:- op(1200,xfx,:˜).
ip :˜ [dp, i1].    i1 :˜ [i0, vp].    i0 :˜ [will].
dp :˜ [d1].        d1 :˜ [d0, np].    d0 :˜ [the].
np :˜ [n1].        n1 :˜ [n0].        n0 :˜ [idea].      n1 :˜ [n0, cp].
vp :˜ [v1].        v1 :˜ [v0].        v0 :˜ [suffice].
cp :˜ [c1].        c1 :˜ [c0, ip].    c0 :˜ [that].
(20) Let’s call this top-down, backtracking recognizer, considered last time, td.pl:
/* * file: td.pl = ll.pl * */
:- op(1200,xfx,:˜ ).   % this is our object language "if"
:- op(1100,xfx,?˜ ).   % metalanguage provability predicate
[] ?˜ []. (S0 ?˜ Goals0) :- infer(S0,Goals0,S,Goals), (S ?˜ Goals).
infer(S,[A|C], S,DC) :- (A :˜ D), append(D,C,DC).   % ll
infer([A|S],[A|C], S,C).                            % scan
append([],L,L). append([E|L],M,[E|N]) :- append(L,M,N). (21) We can store the grammar in separate files, g1.pl and td.pl, and load them both: 1?-[td,g1]. td compiled, 0.00 sec, 1,116 bytes. g1 compiled, 0.00 sec, 1,804 bytes. Yes 2 ?- [the,idea,will,suffice] ?˜[ip]. Yes 3 ?- [the,idea,that,the,idea,will,suffice,will,suffice] ?˜[ip]. Yes 4 ?- [will,the,idea,suffice] ?˜[ip]. No 5?-halt.
45 Stabler - Lx 185/209 2003
(22) Suppose that we want to extend our grammar to get sentences like: a. The elusive idea will suffice b. The idea about the idea will suffice c. The idea will suffice on Tuesday We could add the rules:
n1 :˜ [adjp, n1]. n1 :˜ [n1, pp]. i1 :˜ [i1,pp]. adjp :˜ [adj1]. adj1 :˜ [adj0]. adj0 :˜ [elusive]. pp :˜ [p1]. p1 :˜ [p0,dp]. p0 :˜ [about].
The top left production here is right recursive, the top middle and top right productions are left recursive. If we add these left recursive rules to our grammar, the search space for every input axiom is infinite, and consequently our prolog implementation may fail to terminate.
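To see the non-termination concretely, here is a minimal sketch (not part of the notes; the file name and words are invented): a tiny fragment with one left recursive rule which, loaded together with td.pl, loops.

% leftrec.pl -- for illustration only
:- op(1200,xfx,:˜).
n1 :˜ [n1, pp].    % left recursive: n1 is expanded again before any word is scanned
n1 :˜ [n0].
n0 :˜ [idea].
pp :˜ [about, it].

% ?- [idea] ?˜ [n1].
% TD expands ?-n1 to ?-n1,pp to ?-n1,pp,pp ... without ever consuming a word,
% so this query does not terminate.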
3.4 Trees, and parsing: first idea
The goal of syntactic analysis is not just to compute whether a string of words is an expression of some category, but rather to compute a structural description for every grammatical string. Linguists typically represent their structural descriptions with trees or bracketings. Since we are already doing recognition by computing derivations, it will be a simple matter to compute the corresponding derivation trees. First, though, we need a notation for trees.
(23) To represent trees, we will use the ‘/’ to represent a kind of immediate domination, but we will let this domination relation hold between a node and a sequence of subtrees. Prolog allows the binary function symbol ‘/’ to be written in infix notation (since prolog already uses it in some other contexts to represent division). So for example, the term a/[] represents a tree with a single node, labelled a, not dominating anything. The term a/[b/[],c/[]] represents the tree that we would draw this way:
a
b c
And the term (using quotes so that categories can be capitalized without being variables),
’IP’/[ ’DP’/[ ’D’’’/[ ’D’/[the/[]], ’NP’/[ ’N’’’/[ ’N’/[penguin/[]]]]]], ’I’’’/[ ’I’/[], ’VP’/[ ’V’’’/[ ’V’’’/[ ’V’/[swims/[]]], ’AdvP’/[ ’Adv’’’/[ ’Adv’/[beautifully/[]]]]]]]]
represents the tree:
46 Stabler - Lx 185/209 2003
IP
DP I’
D’ I VP
D NP V’
the N’ V’ AdvP
N V Adv’
penguin swims Adv
beautifully
3.5 The top-down parser
(24) Any TD proof can be represented as a tree, so let’s modify the TD provability predicate ?˜ so that it not only finds proofs, but also builds tree representations of the proofs that it finds. (25) Recall that the TD ?˜ predicate is defined this way:
/* * file: td.pl = ll.pl * */
:- op(1200,xfx,:˜).   % this is our object language "if"
:- op(1100,xfx,?˜).   % provability predicate
[] ?˜ []. (S0 ?˜ Goals0) :- infer(S0,Goals0,S,Goals), (S ?˜ Goals).
infer(S,[A|C], S,DC) :- (A :˜ D), append(D,C,DC).   % ll
infer([A|S],[A|C], S,C).                            % scan
append([],L,L). append([E|L],M,[E|N]) :- append(L,M,N).
The predicate ?˜ takes 2 arguments: the list of lexical axioms (the “input string”) and the list of goals to be proven, respectively. In the second rule, when subgoal A expands to subgoals D, we want to build a tree that shows a node labeled A dominating these subgoals. (26) The parser is trickier; going through it carefully will be left as an optional exercise. We add a third argument in which to hold the proof trees.
/* * file: tdp.pl = llp.pl */
:- op(1200,xfx,:˜).   % this is our object language "if"
:- op(1100,xfx,?˜).   % provability predicate
:- op(500,yfx,@).     % metalanguage functor to separate goals from trees
[] ?˜ []@[]. (S0 ?˜ Goals0@T0) :- infer(S0,Goals0@T0,S,Goals@T), (S ?˜ Goals@T).
infer(S,[A|C]@[A/DTs|CTs],S,DC@DCTs) :- (A :˜ D), new_goals(D,C,CTs,DC,DCTs,DTs).   % ll
infer([A|S],[A|C]@[A/[]|CTs],S,C@CTs).                                              % scan
%new_goals(NewGoals,OldGoals,OldTrees,AllGoals,AllTrees,NewTrees) new_goals([],Gs,Ts,Gs,Ts,[]). new_goals([G|Gs0],Gs1,Ts1,[G|Gs2],[T|Ts2],[T|Ts]) :- new_goals(Gs0,Gs1,Ts1,Gs2,Ts2,Ts).
In this code new_goals really does three related things at once. In the second clause of ?˜, for example, the call to new_goals i. appends goals D and C to obtain the new goal sequence DC; ii. for each element of D, it adds a tree T to the list CTs of trees, yielding DCTs; and iii. each added tree T is also put into the list of trees DTs corresponding to D.
47 Stabler - Lx 185/209 2003
(27) With this definition, if we also load the following theory,
p :˜ [q,r]. q:˜[]. r:˜[].
then we get the following session:
| ?- [] ?˜ [p]@[T].
T = p/[q/[], r/[]] ;
No | ?- [] ?˜ [p,q]@[T0,T].
T0 = p/[q/[], r/[]] T = q/[] ;
No
What we are more interested in is proofs from grammars, so here is a session showing the use of our simple grammar g1.pl from page 45:
| ?- [tdp,g1].
Yes | ?- [the,idea,will,suffice] ?˜ [ip]@[T].
T = ip/[dp/[d1/[d0/[the/[]], np/[n1/[n0/[idea/[]]]]]], i1/[i0/[will/[]], vp/[v1/[v0/[suffice/[]]]]]]
3.6 Some basic relations on trees
3.6.1 “Pretty printing” trees
(28) Those big trees are not so easy to read! It is common to use a “pretty printer” to produce a more readable text display. Here is the simple pretty printer:
/* * file: pp_tree.pl */
pp_tree(T) :- pp_tree(T, 0).
pp_tree(Cat/Ts, Column) :- !, tab(Column), write(Cat), write(’ /[’), pp_trees(Ts, Column). pp_tree(X, Column) :- tab(Column), write(X).
pp_trees([], _) :- write(’]’). pp_trees([T|Ts], Column) :- NextColumn is Column+4, nl, pp_tree(T, NextColumn), pp_rest_trees(Ts, NextColumn).
pp_rest_trees([], _) :- write(’]’). pp_rest_trees([T|Ts], Column) :- write(’,’), nl, pp_tree(T, Column), pp_rest_trees(Ts, Column).
The only reason to study the implementation of this pretty printer is as an optional exercise in prolog. What is important is that we be able to use it for the work we do that is more directly linguistic. (29) Here is how to use the pretty printer:
| ?- [tdp,g1,pp_tree].
Yes | ?- ([the,idea,will,suffice] ?˜ [ip]@[T]),pp_tree(T). ip /[ dp /[ d1 /[ d0 /[ the /[]], np /[ n1 /[ n0 /[ idea /[]]]]]], i1 /[ i0 /[ will /[]], vp /[ v1 /[ v0 /[ suffice /[]]]]]] T = ip/[dp/[d1/[d0/[the/[]],np/[n1/[...]]]],i1/[i0/[will/[]],vp/[v1/[v0/[...]]]]] ?
Yes
48 Stabler - Lx 185/209 2003
This pretty printer is better than nothing, but really, we can do better! This is the kind of stuff people had to look at when computers wrote their output to electric typewriters. We can do much better now. (30) There are various graphical tools that can present your tree in a much more readable format. I will describe using the Tcl/Tk interface which sicstus provides, but I also have tools for drawing trees from prolog through xfig, dot, TeX, and some others.
| ?- [tdp,g3,wish_tree].
Yes | ?- ([titus,and,lavinia,and,the,penguins,praise,most,songs] ?˜ [ip]@[T]),wish_tree(T).
T = ip/[dp(1,p,nom)/[dp(3,s,nom)/[titus/[]],coord(dp(_A,p,nom))/[and/[],dp(_A,p,nom)/[dp(...)/[...],coord(...)/[...|...]]]],i1(1,p)/[i0/[],vp(1,p)/[v1(1,p)/[v0(...)/[..
Yes
But what appears on your screen will be something like this:
[Screenshot: the parse tree drawn in a Tcl/Tk (wish) window.]
Not the Mona Lisa, but this is only week 4. Notice that with this tool, carefully inspecting the trees becomes much less tedious!
49 Stabler - Lx 185/209 2003
(31) tcl/tk tree display on win32 systems (Windows 95, 98, NT, 2000) a. I went to http://dev.scriptics.com/software/tcltk/download82.html and downloaded the install file tcl823.exe. Clicking on this, I let it unpack in the default directory, which was c:\Program Files\Tcl In c:\Program Files\Tcl\bin there is a program called: wish82.exe. I added c:\Program Files\Tcl\bin to my PATH. This is the program that I use to display trees from SWI Prolog. NB: If you install one of the more recent versions of tcl/tk, they should still work. But to use them with our wish_tree predicate, you will have to (i) find out the name of your wish executable (the equivalent of our wish82.exe, and then (ii) replace occurrences of wish82.exe in wish_tree with that name. b. I installed swiprolog, and put the icon for c:\Program Files\pl\bin\plwin.exe on my desktop. c. Clicking on this icon, I set the properties of this program so that it would run in my prolog directory, which is c:\pl d. Then I downloaded all the win32 SWI-Prolog files from the webpage into my prolog directory, c:\pl e. Then, starting pl from the icon on the desktop, I can execute ?- [wish_tree.pl].
?- wish_tree(a/[b/[],c/[]]). This draws a nice tree in a wish window. f. TODO: Really, we should provide a proper tk interface for SWI Prolog, or else an implementation of the tree display in XPCE. If anyone wants to do this, and succeeds, please share the fruits of your labor!
50 Stabler - Lx 185/209 2003
3.6.2 Structural relations
(32) Many of the structural properties that linguists look for are expressed as relations among the nodes in a tree. Here, we make a first pass at defining some of these relations. The following definitions all identify a node just by its label. So for example, with the definition just below, we will be able to prove that ip is the root of a tree even if that tree also contains ip constituents other than the root. We postpone the problem of identifying nodes uniquely, even when their labels are not unique. (33) The relation between a tree and its root has a trivial definition:
root(A,A/L).
(34) Now consider the parent relation in trees. Using our notation, it can also be defined very simply, as follows:
parent(A,B,A/L) :- member(B/_,L). parent(A,B,_/L) :- member(Tree,L), parent(A,B,Tree).
(35) Domination is the transitive closure of the parent relation. Notice how the following definition avoids left recursion. And notice that, since no node is a parent of itself, no node dominates itself. Consequently, we also define dominates_or_eq, which is the reflexive, transitive closure of the parent relation. Every node, in every tree, stands in the dominates_or_eq relation to itself:
dominates(A,B,Tree) :- parent(A,B,Tree). dominates(A,B,Tree) :- parent(A,C,Tree), dominates(C,B,Tree).
dominates_or_eq(A,A,_). dominates_or_eq(A,B,Tree) :- dominates(A,B,Tree).
(36) We now define the relation between subtrees and the tree that contains them:
subtree(T/Subtrees,T/Subtrees). subtree(Subtree,_/Subtrees) :- member(Tree,Subtrees),subtree(Subtree,Tree).
(37) A is a sister of B iff A and B are not the same node, and A and B have the same parent. Notice that, with this definition, no node is a sister of itself. To implement this idea, we use the important relation select, which is sort of like member, except that it removes a member of a list and returns the remainder in its third argument. For example, with the following definition, we could prove select(b,[a,b,c],[a,c]).
sisters(A,B,Tree) :- subtree(_/Subtrees,Tree), select(A/_,Subtrees,Remainder), member(B/_,Remainder).
select(A,[A|Remainder],Remainder). select(A,[B|L],[B|Remainder]) :- select(A,L,Remainder).
(38) Various “command” relations play a very important role in recent syntax. Let’s say that A c-commands B iff A is not equal to B, neither dominates the other, and every node that dominates A dominates B.16 This is equivalent to the following more useful definition:
16This definition is similar to the one in Koopman and Sportiche (1991), and to the IDC-command in Barker and Pullum (1990). But notice that our definition is irreflexive – for us, no node c-commands itself.
51 Stabler - Lx 185/209 2003
A c-commands B iff B is a sister of A, or B is dominated by a sister of A. This one is easily implemented:
c_commands(A,B,Tree) :- sisters(A,AncB,Tree), dominates_or_eq(AncB,B,Tree).
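For example (an illustrative query, assuming the definitions above are loaded), in the small tree ip/[dp/[d0/[]], i1/[i0/[], vp/[]]] the dp c-commands the vp, but the vp does not c-command the dp, since the only sister of vp is i0:

| ?- c_commands(dp, vp, ip/[dp/[d0/[]], i1/[i0/[], vp/[]]]).

Yes
| ?- c_commands(vp, dp, ip/[dp/[d0/[]], i1/[i0/[], vp/[]]]).

No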
(39) The relation between a tree and the string of its leaves, sometimes called the yield relation, is a little bit more tricky to define. I will present a definition here, but not discuss it in any detail. (Maybe in discussion sections…)
yield(Tree,L) :- yield(Tree,[],L).
yield(W/[], L, [W|L]). yield(_/[T|Ts], L0, L) :- yields([T|Ts],L0,L).
yields([], L, L). yields([T|Ts], L0, L) :- yields(Ts,L0,L1), yield(T,L1,L).
NB: Notice that this does not distinguish empty categories with no yield from terminal vocabulary with no yield.
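For example (an illustrative query, not in the notes), with the small tree from (23):

| ?- yield(a/[b/[],c/[]], L).

L = [b, c]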
3.7 Tree grammars
(40) The rules we have been considering so far rewrite strings. But it is not hard to formulate rules that rewrite trees. Suppose for example that we have a tree:
s
a b c
(41) Suppose that we want to expand the category s in this tree with a rule that could be schematically expressed as follows, where X,Y,Z are variables standing in for any subtrees:
[Figure: the rule rewrites the tree s/[X, Y, Z] as the tree s/[x/[a/[], X], y/[b/[], Y], z/[c/[], Z]], where X, Y, Z stand for arbitrary subtrees.]
(42) If we apply this rule to the particular tree we began with, we get the result:
s
x y z
a a b b c c
We could apply the rule again to this tree, and so on.
52 Stabler - Lx 185/209 2003
(43) We can define the rule that expands any node s with its 3 subtrees like this. In our notation, our initial tree is: s/[a/[],b/[],c/[]] Let this be the only “start tree” (there could be more than one), and consider the set of trees that includes this tree and all the other trees that can be obtained from this tree by applications of the rule. In Prolog, we can define this set easily. Here is one way to do it:
% oktree.pl
ok_tree(s/[a/[],b/[],c/[]]).
ok_tree(s/[x/[a/[],X],y/[b/[],Y],z/[c/[],Z]]) :- ok_tree(s/[X,Y,Z]).
(44) The first axiom says that the start tree is allowed as a tree in the tree language, an ok_tree (there is only one start tree in this case). The second axiom says that the result of applying our rule to any ok_tree is also allowed as an ok_tree. Loading just this 2 line definition we can prove:
| ?- [ok_tree].
consulted /home/es/tex/185-00/ok_tree.pl in module user, 10 msec 896 bytes
yes | ?- ok_tree(A).
A = s/[a/[],b/[],c/[]] ? ;
A = s/[x/[a/[],a/[]],y/[b/[],b/[]],z/[c/[],c/[]]] ? ;
A = s/[x/[a/[],x/[a/[],a/[]]],y/[b/[],y/[b/[],b/[]]],z/[c/[],z/[c/[],c/[]]]] ?
yes
Combining this definition of ok_tree with our previous definition of the yield relation, we can prove: | ?- ok_tree(A), yield(A,L).
A = s/[a/[],b/[],c/[]], L = [a,b,c] ? ;
A = s/[x/[a/[],a/[]],y/[b/[],b/[]],z/[c/[],c/[]]], L = [a,a,b,b,c,c] ? ;
A = s/[x/[a/[],x/[a/[],a/[]]],y/[b/[],y/[b/[],b/[]]],z/[c/[],z/[c/[],c/[]]]], L = [a,a,a,b,b,b,c,c,c] ? ;
... (45) So we see that we have defined a set of trees whose yields are the language a^n b^n c^n, a language (of strings) that cannot be generated by a simple (“context free”) phrase structure grammar of the familiar kind.
53 Stabler - Lx 185/209 2003
(46) Does any construction similar to a^n b^n c^n occur in any natural language? There are arguments that English is not context free, but the best known arguments consider parts of English which are similar to a^i b^j a^i b^j or the language
{xx | x any nonempty string of terminal symbols}
These languages are not context-free. Purported examples of this kind of thing occur in phonological/morphological reduplication, in simplistic treatments of the “respectively” construction in English, and in some constructions in a Swiss-German dialect. Some classic discussions of these issues are reprinted in Savitch et al. (1987). (47) Tree grammars and automata that accept trees have been studied extensively (Gecseg and Steinby, 1984), particularly because they allow elegant logical (in fact, model-theoretic) characterizations (Cornell and Rogers, 1999). Tree automata have also been used in the analysis of non-CFLs (Mönnich, 1997; Michaelis, Mönnich, and Morawietz, 2000; Rogers, 2000).
54 Stabler - Lx 185/209 2003
Problem Set:
1. Grammar G1, implemented in g1.pl on page 45 and used by td.pl, is neither right-branching nor left-branching, so our propositions (15) and (16) do not apply. Does LG1 have finite space complexity? If so, what is the finite complexity bound? If not, why is there no bound?
2. Download g1.pl and td.pl to your own machine. Then extend the grammar in g1.pl in a natural way, with an empty I and an inflected verb, to accept the sentences: a. The idea suffices b. The idea that the idea suffices suffices Turn in a listing of the grammar and a session log showing this grammar being used with td.pl, with a brief commentary on what you have done and how it works in the implementation.
3. Stowell (1981) points out that the treatment of the embedded clauses in Grammar 1 (which is implemented in g1.pl) is probably a mistake. Observe in the first place that when a derived nominal takes a DP object, of is required, c. John’s claim of athletic superiority is warranted. d. * John’s claim athletic superiority is warranted. But when a that-clause appears, we have the reverse pattern: e. * John’s claim of that he is a superior athlete is warranted. f. John’s claim that he is a superior athlete is warranted. Stowell suggests that the that-clauses in these nominals are not complements but appositives, denoting propositions. This fits with the fact that identity claims with that-clauses are perfect, and with the fact that whereas events can be witnessed, propositions cannot be: g. John’s claim was that he is a superior athlete. h. I witnessed John’s claiming that he is a superior athlete. i. * I witnessed that he is a superior athlete. Many other linguists have come to similar conclusions. Modify the grammar in g1.pl so that The idea that the idea will suffice will suffice does not have a cp in complement position. Turn in a listing of the grammar and a session log showing this grammar being used with td.pl, with a brief commentary on what you have done and how it works in the implementation.
4. Linguists have pointed out that the common treatment of pp adjunct modifiers proposed in the phrase structure rules in (22) is probably a mistake. Those rules allow any number of pp modifiers in an NP, which seems appropriate, but the rules also have the effect of placing later pp’s higher in the NP. On some views, this conflicts with the binding relations we see in sentences like
j. The picture of Bill1 near his1 house will suffice.
k. The story about [my mother]1 with her1 anecdotes will amuse you. One might think that the pronouns in these sentences should be c-commanded by their antecedents. Modify the proposed phrase structure rules to address this problem. Turn in a listing of the resulting grammar and a session log showing this grammar being used with td.pl, with a brief commentary on what you have done and how it works in the implementation. 5. Linguists have also pointed out that the common treatment of adjp adjunct modifiers proposed in (22) is probably a mistake. The proposed rule allows any number of adjective phrases to occur in an np, which seems appropriate, but this rule does not explain why prenominal adjective phrases cannot have complements or adjuncts: l. * The elusive to my friends idea will suffice.
55 Stabler - Lx 185/209 2003
m. The idea which is elusive to my friends will suffice. n. * The frightened of my mother friends will not visit. o. The friends who are frightened of my mother will not visit. The given rules also do not fit well with the semantic idea that modifiers take the modifiees as argument (Keenan, 1979; Keenan and Faltz, 1985) – which is usually indicated in the syntax by a head with its argu- ments in complement position. Abney (1987) proposes just this alternative idea about prenominal adjective modifiers: they are functional categories inside DP that obligatorily select nominal NP or AP complements. Since NP or another AP is the complement in these constructions, they cannot take another complement as well, as we see in examples l-o. Modify the proposed phrase structure rules along these lines. Turn in a listing of the resulting grammar and a session log showing this grammar being used with td.pl,witha brief commentary on what you have done and how it works in the implementation. 6. There are various reasons that our measure of space complexity is not really a very good measure of the computational resources required for top-down recognition. Describe some of them. 7. a. Write a propositional context free grammar for tdp.pl that generates the following tree from the input axioms [john,praises,john], and test it with either pp_tree or wish_tree (or both) before submit- ting the grammar. (Get exactly this tree.)
IP
DP VP
john V’
V DP
praises john
b. Notice that in the previous tree, while the arcs descending from IP, VP and V’ are connecting these constituents to their parts, the arcs descending from the “pre-lexical” categories DP and V are connecting these constituents to their phonetic/orthographic forms. It is perhaps confusing to use arcs for these two very different things, and so it is sometimes proposed that something like the following would be better: IP
DP:john VP
V’
V:praises DP:john
Is there a simple modification of your grammar of the previous question that would produce this tree from the input axioms [john,praises,john]? If so, provide it. If not, briefly explain.
56 Stabler - Lx 185/209 2003
8. Define a binary predicate number_of_nodes(T,N) which will count the number N of nodes in any tree T. So for example,
| ?- number_of_nodes(a/[b/[],c/[]],N).
N=3
Yes
Test your definition to make sure it works, and then submit it.
9. Write a grammar that generates infinitely many sentences with pronouns and reflexive pronouns in subject and object positions, like he praises titus and himself, and himself praises titus. a. Define a predicate cc_testa that is true of all and only parse trees in which every reflexive pronoun is c-commanded by another DP that is not a reflexive pronoun, so that titus praises himself is OK but himself praises titus is not OK. Test the definition with your grammar and tdp. b. Define a predicate cc_testb that is true of all and only parse trees in which no pronoun is c-commanded by another DP, so that he praises titus is OK but titus praises him is not OK.17 Test the definition with your grammar and tdp.
{xx | x any nonempty string of terminal symbols}
where the terminal symbols are only a and b. This is the language:
{aa, bb, abab, baba, aaaa, bbbb,...}.
Implement your tree grammar in Prolog, and test it by computing some examples and their yields, as we did in the previous section for a^n b^n c^n, before submitting the grammar.
17Of course, what we should really do is to just block binding in the latter cases, but for the exercise, we just take the first step of identifying the configurations where binding is impossible.
57 Stabler - Lx 185/209 2003
4 Brief digression: simple patterns of dependency
4.1 Human-like linguistic patterns
Human languages apparently show lots of nested dependencies. We get this when we put a relative clause after the subject of a sentence, where the relative clause itself has a subject and predication which can be similarly modified:
the people see other people
the people [people see] see other people
the people [people [people see] see] see other people
…
Placing an arc between each subject and the verb it corresponds to, we get this “nested pattern:”
the people people people see see see other people
This kind of pattern is defined by context-free grammars for the language {a^n b^n | n ≥ 0}, like the following one:
% anbncn.pl
’S’ :˜ [a,’S’,b]. ’S’ :˜ [].
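Loaded together with the top-down recognizer td.pl, this grammar can be used in a session like the following (an illustrative sketch; it is not in the original notes):

| ?- [a,a,a,b,b,b] ?˜ [’S’].

Yes
| ?- [a,a,b] ?˜ [’S’].

No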
We also find crossing dependencies in human languages, for example, when a string is “reduplicated” – something which happens at the word level in many languages – or where the objects of verbs appear in the order O1 O2 O3 V1 V2 V3. Dutch has crossing dependencies of this sort which are semantically clear, though not syntactically marked (Huybregts, 1976; Bresnan et al., 1982). Perhaps the most uncontroversial case of syntactically marked crossing dependencies is found in the Swiss German collected by Shieber (1985):
Jan säit das mer d’chind em Hans es huus lönd hälfed aastriiche
John said that we the children Hans the house let help paint
‘John said that we let the children help Hans paint the house’
The dependencies in this construction are crossing, as we can see in the following figure with an arc from each verb to its object:
[Figure: the Swiss German example repeated with its word-by-word gloss, and crossing arcs linking each verb (lönd, hälfed, aastriiche) to its object (d’chind, em Hans, es huus).]
In Swiss German, the dependency is overtly marked by the case requirements of the verbs: hälfe requires dative case, and lönd and aastriiche require accusative. CFLs are closed under intersection with regular languages. But if we assume that there is no bound on the depth of embedding in Swiss German constructions like those shown here, then the intersection of Swiss German with the regular language,
58 Stabler - Lx 185/209 2003
Jan säit das mer (d’chind)∗ (em Hans)∗ hænd wele (laa)∗ (hälfe)∗ aastriiche
Jan says that we the children Hans have wanted let help paint
is the following language:
Jan säit das mer (d’chind)^i (em Hans)^j hænd wele (laa)^i (hälfe)^j aastriiche.
Some dialects of English have constructions that strongly favor perfect copying (Manaster-Ramer, 1986), which also involves crossing dependencies.
Big vicious dog or no big vicious dog, I’ll deliver the mail.
The formal language {xx | x ∈ {a, b}∗} is a particularly simple formal example of crossing dependencies like this. It is easily defined with a unification grammar like this one:
% xx.pl
’S’ :˜ [’A’(X),’A’(X)].
’A’([a|X]) :˜ [a,’A’(X)]. ’A’([b|X]) :˜ [b,’A’(X)]. ’A’([]) :˜ [].
We can use this grammar in sessions like this:
˜/tex/185 1% pl
Welcome to SWI-Prolog (Version 5.0.8)
Copyright (c) 1990-2002 University of Amsterdam.
1 ?- [tdp,xx].
% tdp compiled 0.00 sec, 1,968 bytes
% xx compiled 0.00 sec, 820 bytes
Yes 2 ?- ([a,b,a,b]?˜[’S’]@[T]).
T = ’S’/[’A’([a, b])/[a/[], ’A’([b])/[b/[], ’A’(...)/[]]], ’A’([a, b])/[a/[], ’A’([b])/[b/[], ...
No 3 ?- ([a,b,a]?˜[’S’]@[T]).
No 5?-
Developing an observation of Kenesei, Koopman and Szabolcsi observe the following pattern in negated or focused sentences of Hungarian, schematized on the right where “M” is used to represent the special category of “verbal modifiers” like haza-:
(48) Nem fogok akarni kezdeni haza-menni          V1 V2 V3 M V4
     not will-1s want-inf begin-inf home-go-inf
(49) Nem fogok akarni haza-menni kezdeni          V1 V2 M V4 V3
     not will-1s want-inf home-go-inf begin-inf
59 Stabler - Lx 185/209 2003
(50) Nem fogok haza-menni kezdeni akarni          V1 M V4 V3 V2
     not will-1s begin-inf want-inf home-go-inf
One analysis of verbal clusters in Hungarian (Koopman and Szabolcsi, 2000a) suggests that they “roll up” from the end of the string as shown below:
[Figure: the “rolling up” derivation. Nem fogok akarni kezdeni haza-menni (V1 V2 V3 M V4, not will-1s want-inf begin-inf home-go-inf) becomes Nem fogok akarni haza-menni kezdeni (V1 V2 M V4 V3), which in turn becomes Nem fogok haza-menni kezdeni akarni (V1 M V4 V3 V2).]
[M] moves around V4, then [M V4] rolls up around V3, then [M V4 V3] rolls up around V2, … It turns out that this kind of derivation can derive complex patterns of dependencies which can yield formal languages like {a^n b^n c^n | n ≥ 0}, or even {a^n b^n c^n d^n e^n | n ≥ 0} – any number of counting dependencies. We can define these languages without any kind of “rolling up” constituents if we help ourselves to (unboundedly many) feature values and unification:
% anbncn.pl
’S’ :˜ [’A’(X),’B’(X),’C’(X)].
’A’(s(X)) :˜ [a,’A’(X)]. ’B’(s(X)) :˜ [b,’B’(X)]. ’C’(s(X)) :˜ [c,’C’(X)]. ’A’(0) :˜ []. ’B’(0) :˜ []. ’C’(0) :˜ [].
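The same device extends directly to any number of counting dependencies. For instance, here is a sketch of my own (not one of the notes’ grammar files), in the style of anbncn.pl, for {a^n b^n c^n d^n e^n | n ≥ 0}:
% anbncndnen.pl (a sketch): five counting dependencies sharing one feature value
'S' :˜ ['A'(X),'B'(X),'C'(X),'D'(X),'E'(X)].
'A'(s(X)) :˜ [a,'A'(X)]. 'B'(s(X)) :˜ [b,'B'(X)]. 'C'(s(X)) :˜ [c,'C'(X)]. 'D'(s(X)) :˜ [d,'D'(X)]. 'E'(s(X)) :˜ [e,'E'(X)].
'A'(0) :˜ []. 'B'(0) :˜ []. 'C'(0) :˜ []. 'D'(0) :˜ []. 'E'(0) :˜ [].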
4.2 Semilinearity and some inhuman linguistic patterns
In the previous section we saw grammars for {a^n b^n | n ≥ 0}, {xx | x ∈ {a, b}*}, and {a^n b^n c^n | n ≥ 0}. These languages all have a basic property in common, which can be seen by counting the number of symbols in each string of each language. For example, for {a^n b^n | n ≥ 0} = {ε, ab, aabb, aaabbb, ...}, we can use (x, y) to represent x a’s and y b’s, so we see that the strings in this language have the following counts: {(0, 0), (1, 1), (2, 2), ...} = {(x, y) | x = y}. For {xx | x ∈ {a, b}*}, we have all pairs N × N. For {a^n b^n c^n | n ≥ 0} we have the set of triples {(x, y, z) | x = y = z}. If we look at just the number of a’s in each language, considering the set of values of first coordinates of the tuples, then we can list those sets by value, obtaining in all three cases the sequence:
0, 1, 2, 3,...
The patterns of dependencies we looked at above will not always give us the sequence 0, 1, 2, 3,..., though. For example, the language
{(ab)^n (ba)^n | n ≥ 0} = {ε, abba, ababbaba, ...}
also has nested dependencies just like {a^n b^n | n ≥ 0}, but this time the number of a’s in words of the language is 0, 2, 4, 6, ... Plotting position in the sequence against value, these sets are both linear. Let’s write the scalar product of an integer k and a pair (x, y) this way:
k(x, y) = (kx, ky)
and add pairs in the usual way: (x, y) + (z, w) = (x + z, y + w). Then a set S of pairs (or tuples of higher arity) is said to be linear iff there are finitely many pairs (tuples) v0, v1, ..., vk such that
S = {v0 + n1·v1 + · · · + nk·vk | n1, ..., nk ∈ N}.
A set is semilinear iff it is the union of finitely many linear sets.
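For concreteness, here is a worked instance of these definitions (my own illustration, not in the original notes): the count set of {a^n b^n | n ≥ 0} is {(n, n) | n ∈ N} = {(0,0) + n(1,1) | n ∈ N}, a linear set with v0 = (0,0), k = 1 and v1 = (1,1); similarly the count set of {(ab)^n (ba)^n | n ≥ 0} is {(0,0) + n(2,2) | n ∈ N}. Each is linear, and so trivially semilinear.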
Theorem: Finite state and context free languages are semilinear.
Semilinearity Hypothesis: Human languages are semilinear (Joshi, 1985).
Theorem: Many unification grammar languages are not semilinear!
Here is a unification grammar that accepts {a^(2^n) | n > 0}.
% apowtwon.pl
’S’(0) :˜ [a,a]. ’S’(s(X)) :˜ [’S’(X),’S’(X)].
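To see why this takes us outside the semilinear languages, note that 'S'(0) yields two a’s and each additional s(·) doubles the count, so the string lengths are 2, 4, 8, 16, ... The following small predicate (a sketch of my own, not part of apowtwon.pl) computes that count:
% count_as(X, N): N is the number of a's yielded by 'S'(X) in apowtwon.pl
count_as(0, 2).
count_as(s(X), N) :- count_as(X, M), N is 2*M.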
Michaelis and Kracht (1997) argue against Joshi’s semilinearity hypothesis on the basis of the case markings in Old Georgian,18 which we see in examples like these (cf. also Boeder 1995, Bhatt & Joshi 2003):
(51) saidumlo-j igi sasupevel-isa m-is γmrt-isa-jsa-j
mystery-nom the-nom kingdom-gen the-gen God-gen-gen-nom
‘the mystery of the kingdom of God’
(52) govel-i igi sisxl-i saxl-isa-j m-is Saul-is-isa-j
all-nom the-nom blood-nom house-gen-nom the-nom Saul-gen-gen-nom
‘all the blood of the house of Saul’
Michaelis and Kracht infer from examples like these that in this kind of possessive, Old Georgian requires the embedded nouns to repeat the case markers on all the heads that dominate them, yielding the following pattern (writing K for each case marker):
[N1-K1 [N2-K2-K1 [N3-K3-K2-K1 ... [Nn-Kn-...-K1]]]]
It is easy to calculate that in this pattern, when there are n nouns, there are n(n+1)/2 case markers: the i-th noun from the top carries i markers, so the total is 1 + 2 + · · · + n = n(n+1)/2. Such a language is not semilinear.
18A Kartvelian language with translations of the Gospel from the 5th century. Modern Georgian does not show the phenomenon noted here.
5 Trees, and tree manipulation: second idea
5.1 Nodes and leaves in tree structures
(1) The previous section introduced a standard way of representing non-empty ordered trees, using a two-argument term Label/Subtrees.19 The argument Label is the label of the tree’s root node, and Subtrees is the sequence of that node’s subtrees. A tree consisting of a single node (necessarily a leaf node) has an empty sequence of subtrees. For example, the 3-node tree with root labeled a and leaves labeled b and c is represented by the term a/[b/[], c/[]]:
a
b c
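As a small illustration of computing with this notation (a sketch of my own, not part of the notes), here is a predicate that counts the nodes of a Label/Subtrees tree:
% count_nodes(Tree, N): N is the number of nodes in Tree
count_nodes(_/Subtrees, N) :- count_nodes_list(Subtrees, M), N is M+1.
count_nodes_list([], 0).
count_nodes_list([T|Ts], N) :- count_nodes(T, N1), count_nodes_list(Ts, N2), N is N1+N2.
% ?- count_nodes(a/[b/[], c/[]], N). binds N = 3, as expected for the tree above.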
(2) While this representation is sufficient to represent arbitrary trees, it is useful to extend it by treating phonological forms not as separate terminal nodes, but as a kind of annotation or feature of their parent nodes. This distinguishes “empty nodes” from leaf nodes with phonological content; only the latter possess (non-null) phonological forms. Thus in the tree fragment depicted just below, the phonological forms Mary and walks are to be interpreted not as separate nodes, but rather as components of their parent DP and V nodes.
VP
DP V’
Mary V VP
walks V’
V
There are a number of ways this treatment of phonological forms could be worked out. For example, the phonological annotations could be regarded as features and handled with the feature machinery, perhaps along the lines described in Pollard and Sag (1989, 1993). While this is arguably the formalization most faithful to linguists’ conceptions, we have chosen to represent trees consisting of a single
19This notation is discussed more carefully in Stabler (1992, p65).
node with a phonological form with terms of the form Label/ -Phon, where Label is the node label and Phon is that node’s phonological form. The tree fragment depicted above is represented by the term VP/[DP/ -Mary, V’/[V/ -walks, VP/[V’/[V/[]]]]]. From a computational perspective, the primary advantage of this representation is that it provides a simple, structural distinction between (trees consisting of) a node with no phonological content and a node with phonological content.
(3) As noted by Gorn (1969), a node can be identified by a sequence of integers representing the choices made on the path from the root to the node. We can see how this method works in the following tree, while identifying nodes by their labels does not:
a[]
b[1] b[2]
c [1,1] c [2,1]
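A sketch of my own (not in the notes) showing how such path addresses can be computed over the Label/Subtrees representation, assuming the nth/3 helper used elsewhere in this section:
% address(Tree, Path, Label): Label labels the node reached from the root of
% Tree by following the child positions in Path (its Gorn address)
address(Label/_, [], Label).
address(_/Subtrees, [I|Path], Label) :- nth(I, Subtrees, Subtree), address(Subtree, Path, Label).
% ?- address(a/[b/[c/[]], b/[c/[]]], [1,1], L). binds L = c, the leftmost leaf above.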
(4) Gorn’s path representations of nodes are difficult to reason about if the tree is constructed in a non-top-down fashion, so we consider another representation, proposed by Hirschman and Dowding (1990) and modified slightly by Johnson (1991), Johnson and Stabler (1993). A node is represented by a term n(Pedigree,Tree,Parent) where Pedigree is the integer position of the node with respect to its sisters, or else root if the node is root; Tree is the subtree rooted at this node, and Parent is the representation of the parent node, or else none if the node is root.
(5) Even though the printed representation of a node can be quadratically larger than the printed representation of the tree that contains it, it turns out that in most implementations structure-sharing will occur between subtrees so that the size of a node is linear in the size of the tree.
(6) With this representation scheme, the leftmost leaf of the tree above is represented by the term n(1,c/[],n(1,b/[c/[]],n(root,a/[b/[c/[]],b/[c/[]]],none)))
(7) With this notation, we can define standard relations on nodes, where the nodes are unambiguously denoted by our terms:
% child(I, Parent, Child) is true if Child is the Ith child of Parent.
child(I, Parent, n(I,Tree,Parent)) :- Parent = n(_, _/ Trees, _), nth(I, Trees, Tree).
% ancestor(Ancestor, Descendant) is true iff Ancestor dominates Descendant. % There are two versions of ancestor, one that works from the % Descendant up to the Ancestor, and the other that works from the % Ancestor down to the descendant.
ancestor_up(Ancestor, Descendant) :- child(_I, Ancestor, Descendant). ancestor_up(Ancestor, Descendant) :- child(_I, Parent, Descendant), ancestor_up(Ancestor, Parent).
ancestor_down(Ancestor, Descendant) :- child(_I, Ancestor, Descendant). ancestor_down(Ancestor, Descendant) :- child(_I, Ancestor, Child), ancestor_down(Child, Descendant).
% root(Node) is true iff Node has no parent root(n(root,_,none)).
% subtree(Node, Tree) iff Tree is the subtree rooted at Node. subtree(n(_,Tree,_), Tree).
% label(Node, Label) is true iff Label is the label on Node. label(n(_,Label/_,_), Label).
% contents(Node, Contents) is true iff Contents is either % the phonetic content of node or the subtrees of node contents(n(_,_/Contents,_),Contents).
% children(Parent, Children) is true if the list of Parent’s % children is Children.
children(Parent, Children) :- subtree(Parent, _/Trees), children(Trees, 1, Parent, Children).
children([], _, _, []). children([Tree|Trees], I, Parent, [n(I,Tree,Parent)|Children]) :- I =< 3, I1 is I+1, children(Trees, I1, Parent, Children).
% siblings(Node, Siblings) is true iff Siblings is a list of siblings % of Node. The version presented here only works with unary and % binary branching nodes.
siblings(Node, []) :- root(Node). % Node has no siblings if Node is root siblings(Node, []) :- children(_, [Node]). % Node has no siblings if it’s an only child siblings(Node, [Sibling]) :- children(_, [Node, Sibling]). siblings(Node, [Sibling]) :- children(_, [Sibling, Node]).
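A usage sketch of my own (hypothetical, not in the notes), showing how these relations exploit the fact that a node term carries its whole ancestry:
% the leftmost leaf of the tree a/[b/[c/[]], b/[c/[]]], written as a node term
example_leaf(n(1, c/[], n(1, b/[c/[]], n(root, a/[b/[c/[]], b/[c/[]]], none)))).
% ?- example_leaf(L), ancestor_up(A, L), label(A, Lab).
% enumerates the leaf's ancestors, binding Lab first to b and then to a.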
5.2 Categories and features
(8) With these notions, we can set the stage for computing standard manipulations of trees, labeled with X-bar style categories: x(Category,Barlevel,Features,Segment) where Category is one of {n,v,a,p,…}, Barlevel is one of {0,1,2}, Features is a list of feature values, each of which has the form Attribute:Value, and Segment is - if the constituent is not a proper segment of an adjunction structure, and is + otherwise. So with these conventions, we will write x(d,2,[],-)/ -hamlet instead of dp/[hamlet/[]]. (9) The following definitions are trivial:
category(Node, Category) :- label(Node, x(Category,_,_,_)). barlevel(Node, BarLevel) :- label(Node, x(_,BarLevel,_,_)). features(Node, Features) :- label(Node, x(_,_,Features,_)). extended(Node, EP) :- label(Node, x(_,_,_,EP)). no_features(Node) :- features(Node, []).
(10) There are at least two ways of conceptualizing features and feature assignment processes. First, we can treat features as marks on nodes, and distinguish a node without a certain feature from nodes with this feature (even if the feature’s value is unspecified). This is the approach we take in this chapter. As we pointed out above, under this approach it is not always clear what it means for two nodes to “share” a feature, especially in circumstances where the feature is not yet specified on either of the nodes. For example, a node moved by Move-α and its trace may share the case feature (so that the trace “inherits” the case assigned to its antecedent), but if the node does not have a specified case feature before movement it is not clear what should be shared.
Another approach, more similar to standard treatments of features in computational linguistics, is to associate a single set of features with all corresponding nodes at all levels of representation. For example, a DP will have a case feature at D-structure, even if it is only “assigned” case at S-structure or LF. Under this approach it is straightforward to formalize feature-sharing, but because feature values are not assigned but only tested, it can be difficult to formalize requirements that a feature be “assigned” exactly once. We can require that a feature value is assigned at most once by associating each assigner with a unique identifier and recording the assigner with each feature value.20 Surprisingly, it is an open problem in this approach how best to formalize the linguist’s intuition that a certain feature value must be set somewhere in the derivation. If feature values are freely assigned and can be checked more than once, it is not even clear in a unification grammar what it means to require that a feature is “assigned” at least once.21 The intuitive idea of feature-checking is more naturally treated in a resource-logical or formal language framework, as discussed in §9.
(11) To test a feature value, we define a 3-place predicate:
20The problem of requiring uniqueness of feature assignment has been discussed in various different places in the literature. Kaplan and Bresnan (1982) discuss the use of unique identifiers to ensure that no grammatical function is filled more than once. Stowell (1981) achieves a similar effect by requiring unique indices in co-indexation. Not all linguists assume that case is uniquely assigned; for example Chomsky and Lasnik (1993) and many more recent studies assume that a chain can receive case more than once. 21In work on feature structures, this problem is called the ANY value problem, and as far as I know it has no completely satisfactory solution. See, e.g. Johnson (1988) for discussion.
feature(Attribute, Node, Value) :- features(Node, Features), member(Attribute:Value, Features).
(12) Let’s say that a terminal node is a leaf with phonetic content, and an empty node is a leaf with no phonetic content:
terminal(n(_,_/ -Word,_), Word).
empty(n(_,_/[],_)).
nonempty(Node) :- terminal(Node). nonempty(Node) :- children(Node,[_|_]).
(13) To extend a set of features at a node, it is convenient to have:
add_feature(Node0, Attribute, Value, Node) :- category(Node0, Category), category(Node, Category), barlevel(Node0, Barlevel), barlevel(Node, Barlevel), extended(Node0, EP), extended(Node, EP), features(Node0, Features0), features(Node, Features), ( member(Attribute:Value0, Features0) -> Value = Value0, Features = Features0 ; Features = [Attribute:Value|Features0] ).
And to copy the values of a list of attributes:
copy_features([], OldFeatures, []). copy_features([Att|Atts], OldFeatures, Features0) :- ( member(Att:Val, OldFeatures) -> Features0 = [Att:Val|Features] ; Features0 = Features ), copy_features(Atts, OldFeatures, Features).
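A small usage sketch of my own (hypothetical feature values, not from the notes), showing what copy_features computes:
% ?- copy_features([case,index], [case:nom, th:agent(i), index:j], Fs).
% Fs = [case:nom, index:j]
% Attributes with no value in the old list are simply skipped, so copying
% never invents a feature value.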
5.3 Movement relations
We can now define node replacement and movement relations. For example, beginning with a structure like 14a, we might derive a pronounced (“spelled out”) structure like 14b and an LF structure like 14c:
(14) a. [tree diagram: the underlying IP, with an empty DP in the specifier of IP, I filled by will (+tns), the subject DP “the tourist” inside vP, and the verb visit (th:(patient(i),agent(i)), select:D) taking the object DP “every city” (th:patient(i))]
b. [tree diagram: the spelled-out structure, derived by XP substitution of the subject DP (index:j, case:nom, th:agent(i)) into the specifier of IP and by head adjunction of the verb (index:k), each movement leaving a coindexed empty category; the object DP carries index:m, case:acc, th:patient(i)]
c. [tree diagram: the LF structure, in which the object DP (index:m, case:acc, th:patient(i)) has additionally been XP-adjoined to IP, leaving a coindexed trace]
Let’s assume that there are two types of movement: substitutions and adjunctions. These involve only phrases (XPs) or heads (X0s); that is, only these levels of structure are “visible” to movement operations. And we will assume that both types of movements must be “structure preserving” in a sense to be defined.
5.3.1 Substitution
(15) A substitution moves a constituent to an empty constituent, a “landing site,” elsewhere in the tree, leaving a co-indexed empty category behind:
[ … A … B … ]   ⇒   [ … B_i … t_i … ]
(the constituent B moves into the empty landing site A, leaving the coindexed trace t_i)
A substitution is often said to be “structure preserving” iff the moved constituent and the landing site have the same category (though this requirement is sometimes relaxed slightly, e.g. to allow V to substitute into an empty I position). (16) First, we define a relation that holds between two sequences of nodes with the same pedigrees and the same subtrees:
iso_subtrees([], []). iso_subtrees([A|As], [B|Bs]) :- iso_subtree(A, B), iso_subtrees(As, Bs).
iso_subtree(NodeA, NodeB) :- subtree(NodeA, Tree), subtree(NodeB, Tree), same_pedigrees(NodeA,NodeB).
same_pedigrees(A,B) :- child(I,_,A), child(I,_,B). same_pedigrees(A,B) :- root(A), root(B).
(17) Since every node representation specifies the whole tree of which it is a part, we can define move-α directly on nodes, with the advantage the node representation offers of making the whole tree environment accessible at every point. In effect, the movement relations are defined by traversing an “input” tree from root to frontier, checking its correspondence with the “output” tree as fully as possible at every point.
a. d/[e/[], c/[b/[], d/[b/[a/[]]]]]
b. d/[e/[], c/[b/[a/[]], d/[b/[]]]]
Consider, for example, how we could substitute the non-empty b in 17a into the position of the empty b, obtaining the tree in 17b. This involves two basic steps. First, we must replace the nonempty subtree b/[a/[]] by an empty subtree b/[], and then we must replace the other empty subtree b/[] by b/[a/[]]. Both steps involve modifying a tree just by replacing one of its subtrees by something else. We formalize this basic step first, as an operation on our special notation for nodes, with the predicate replace_node.
replace_node(A, A, B, B). % A replaced by B replace_node(A, DescendantA, B, DescendantB) :- % DescA repl by DescB label(A, Label), label(B, Label), children(A, ChildrenA), children(B, ChildrenB), replace_nodes(ChildrenA, DescendantA, ChildrenB, DescendantB).
The first clause, in effect, just replaces the current node A by B, while the second clause uses the relation replace_nodes to do the replacement in exactly one of the children of the current node. (19) We extend the previous relation to node sequences:
replace_nodes([A|As], DA, [B|Bs], DB) :- replace_node(A, DA, B, DB), iso_subtrees(As, Bs). replace_nodes([A|As], DA, [B|Bs], DB) :- iso_subtree(A, B), replace_nodes(As, DA, Bs, DB).
With these axioms, we can establish some basic relations among trees. For example, with two basic replacement steps, we can transform the tree in Figure 17a into the tree in Figure 17b. The first step replaces the subtree b/[a/[]] by b/[], and the second step replaces the original b/[] by b/[a/[]]. Consider the first step, and since we are working with our special node representations, let’s focus our attention just on the subtrees dominated by c, where the action is. Taking just this subtree, we establish a relation between the root nodes:
A=n(root,c/[b/[],d/[b/[a/[]]]],none), B=n(root,c/[b/[],d/[b/[]]],none).
What we do to obtain B is to replace DescA in A by DescB,where
DescA=n(1,b/[a/[]],n(2,d/[b/[a/[]]], n(root,c/[b/[],d/[b/[a/[]]]],none))), DescB=n(1,b/[],n(2,d/[b/[]], n(root,c/[b/[],d/[b/[]]],none))).
We can deduce that these elements stand in the relation
replace_node(A,DescA,B,DescB).
(20) We now define substitution. As observed earlier, this kind of movement involves two basic node replacement steps. For this reason, it is convenient to define a relation which holds between root nodes after two such steps. We define twice_replace_node(A, DA1, DA2, B, DB1, DB2) to hold iff node B is formed by changing two distinct descendants in distinct subtrees of A as follows: (i) replacing DA1 in one subtree of A by the empty category DB1, and (ii) replacing DA2 by DB2 in another subtree of A. This is easily done.
twice_replace_node(A, DA1, DA2, B, DB1, DB2) :- label(A, Label), label(B, Label), children(A, ChildrenA), children(B, ChildrenB), twice_replace_nodes(ChildrenA, DA1, DA2, ChildrenB, DB1, DB2).
twice_replace_nodes([A|As], DA1, DA2, [B|Bs], DB1, DB2) :- replace_node(A, DA1, B, DB1), replace_nodes(As, DA2, Bs, DB2). twice_replace_nodes([A|As], DA1, DA2, [B|Bs], DB1, DB2) :- replace_node(A, DA2, B, DB2), replace_nodes(As, DA1, Bs, DB1). twice_replace_nodes([A|As], DA1, DA2, [B|Bs], DB1, DB2) :- twice_replace_node(A, DA1, DA2, B, DB1, DB2), iso_subtrees(As, Bs). twice_replace_nodes([A|As], DA1, DA2, [B|Bs], DB1, DB2) :- iso_subtree(A, B), twice_replace_nodes(As, DA1, DA2, Bs, DB1, DB2).
Now we define the special linguistic requirements of the substitution operation, using a basic relation substitution and several auxiliary definitions which define the relationships among the nodes that are involved: the moved node, the landing site, and the trace.22
substitution(OldRoot, NewRoot, MovedNode, Trace) :- root(OldRoot), root(NewRoot), subst_landing(OldNode, EmptyNode), subst_moving(OldNode, MovedNode), trace(OldNode, Trace), twice_replace_node(OldRoot, OldNode, EmptyNode, NewRoot, Trace, MovedNode), copy_phi_features(OldNode, Trace0), add_feature(Trace0, index, I, Trace), copy_psi_features(OldNode, MovedNode0), add_feature(MovedNode0, index, I, MovedNode).
% subst_moving(OldNode, MovedNode) iff OldNode and MovedNode have same % Cat,Bar,Level,EP features
subst_moving(OldNode, MovedNode) :- category(OldNode, Cat), category(MovedNode, Cat), barlevel(OldNode, Bar), barlevel(MovedNode, Bar), extended(OldNode, EP), extended(MovedNode, EP), contents(OldNode, Contents), contents(MovedNode, Contents).
% subst_landing(OldNode, EmptyNode) iff OldNode and EmptyNode have same % Cat,Bar features, and EmptyNode is a visible nonterminal with % no children and no features
subst_landing(OldNode, EmptyNode) :- category(OldNode, Cat), category(EmptyNode, Cat), barlevel(OldNode, Bar), barlevel(EmptyNode, Bar), children(EmptyNode, []), features(EmptyNode, []), visible(EmptyNode).
% trace(OldNode, Trace) iff OldNode and Trace have same Cat,Bar,EP features, % and Trace is a nonterminal with no children.
trace(OldNode, Trace) :- category(OldNode, Category), category(Trace, Category), barlevel(OldNode, Barlevel), barlevel(Trace, Barlevel), extended(OldNode, EP), extended(Trace, EP),
22The requirement that the empty node which is the landing site of the substitution have no features may be overly stringent. (This requirement is here imposed by the predicate subst_landing.) We could just require that the landing site have no index feature – prohibiting a sort of “trace erasure” (Freidin, 1978). If we remove the restriction on the landing site features altogether, the character of the system changes rather dramatically though, since it becomes possible to have “cycling” derivations of arbitrary length as discussed in Stabler (1992, §14.3). In the system described here, neither a trace nor a moved node can be a landing site.
children(Trace, []).
% visible(Node) iff Node is maximal or minimal, and not a proper segment
visible(Node) :- extended(Node, -), barlevel(Node, 2). visible(Node) :- extended(Node, -), barlevel(Node, 0).
The predicate copy_phi_features, and the similar copy_psi_features are easily defined using our earlier predicate copy_features:
phi_features([person, number, case, wh, index, th, finite]). psi_features([person, number, case, wh, index, th, finite, pronominal, anaphoric]).
copy_phi_features(Node0, Node) :- features(Node0, Features0), features(Node, Features), phi_features(Phi), copy_features(Phi, Features0, Features).
copy_psi_features(Node0, Node) :- features(Node0, Features0), features(Node, Features), psi_features(Psi), copy_features(Psi, Features0, Features).
(21) With these definitions, substitution cannot apply to the tree:
x(i,2,[],-)
x(d,2,[],-)
juliet
(22) The following tree, on the other hand, allows exactly one substitution:
x(i,2,[],-)
x(d,2,[],-) x(d,2,[],-)
hamlet
To avoid typing in the term that denotes this tree all the time, let’s add the axiom:
tree(1, x(i,2,[],-)/[ x(d,2,[],-)/[], x(d,2,[],-)/ -hamlet ]).
Then we can compute the substitution with a session like this:
| ?- tree(1,T),subtree(N,T),substitution(N,NewN,Moved,Trace),subtree(NewN,NewT),tk_tree(NewT).
N = n(root,x(i,2,[],-)/[x(d,2,[],-)/[],x(d,2,[],-)/ -(hamlet)],none), T = x(i,2,[],-)/[x(d,2,[],-)/[],x(d,2,[],-)/ -(hamlet)], NewN = n(root,x(i,2,[],-)/[x(d,2,[index:_A],-)/ -(hamlet),x(d,2,[index:_A],-)/[]],none), NewT = x(i,2,[],-)/[x(d,2,[index:_A],-)/ -(hamlet),x(d,2,[index:_A],-)/[]], Moved = n(1,x(d,2,[index:_A],-)/ -(hamlet),n(root,x(i,2,[],-)/[x(d,2,[index:_A],-)/ -(hamlet),x(d,2,[index:_A],-)/[]],none)), Trace = n(2,x(d,2,[index:_A],-)/[],n(root,x(i,2,[],-)/[x(d,2,[index:_A],-)/ -(hamlet),x(d,2,[index:_A],-)/[]],none)) ?
yes
And the tree NewT gets displayed:
x(i,2,[],-)
x(d,2,[index:A],-) x(d,2,[index:A],-)
hamlet
5.3.2 Adjunction
(23) Like substitution, adjunction basically involves two replacements, but adjunction is, unfortunately, quite a bit more complex. The main reason is that we can have adjunctions like the ones shown in 14c, where a node is extracted and adjoined to an ancestor. That means that one replacement is done inside one of the constituents affected by the other replacement. A second factor that slightly increases the complexity of the relation is that a new adjunction structure is built. It is no surprise, then, that the specifically linguistic restrictions on this operation are also slightly different from those on substitution.
(24) We define the relation adjunction in terms of the replacement relation adjoin_node. The latter relation is similar to twice_replace_node, but builds appropriate adjunction structures. These adjunction structures are defined slightly differently for the two basic situations: the more complex case in which one of the changes is inside a moved constituent, and the simpler case in which the two affected nodes are distinct. The other relations just define the relevant requirements on the nodes involved.
(25) So, to begin with, we define:
adjunction(OldRoot, NewRoot, Adjunct, Trace) :- root(OldRoot), root(NewRoot), adjunct(OldNode, Adjunct), trace(Adjunct, Trace), adjunction_structure(AdjnctSite, Adjunct, _Segment, Adjunction), adjoin_node(OldRoot, OldNode, AdjnctSite, NewRoot, Trace, Adjunction), nonargument(AdjnctSite), copy_phi_features(OldNode, Trace0), add_feature(Trace0, index, I, Trace), copy_psi_features(OldNode, Adjunct0), add_feature(Adjunct0, index, I, Adjunct).
(26) The Adjunct part of the adjunction structure will be similar to the original node to be moved, OldNode, as follows:
adjunct(OldNode, Adjunct) :- category(OldNode, Category), category(Adjunct, Category), barlevel(OldNode, Bar), barlevel(Adjunct, Bar), extended(OldNode, EP), extended(Adjunct, EP), contents(OldNode, Contents), contents(Adjunct,Contents).
(27) Now we turn to the basic replacement operations. For substitution, these were trivial, but adjunction requires a more careful treatment. In the following definition, the first clause is essentially identical to the definition of twice_replace_node, but here we must add the second clause to cover the case where A is replaced by DB2 after replacing DA1 by DB1 in a segment of DB2:
adjoin_node(A, DA1, DA2, B, DB1, DB2) :- label(A, Label), label(B, Label), children(A, ChildrenA), children(B, ChildrenB), adjoin_nodes(ChildrenA, DA1, DA2, ChildrenB, DB1, DB2).
adjoin_node(A, DA1, A, B, DB1, B) :- adjunction_structure(A, _Adjunct, Segment, B), lower_segment(A,LowerA), replace_node(LowerA, DA1, Segment, DB1).
lower_segment(A,LowerA) :- category(A,Cat), category(LowerA,Cat), barlevel(A,Bar), barlevel(LowerA,Bar), features(A,F), features(LowerA,F), contents(A,Contents), contents(LowerA,Contents), same_pedigrees(A,LowerA).
Notice that the extended feature of A is not copied to LowerA: this just allows LowerA to match the lower segment of the adjunction structure. (28) Adjunction of one node to another on a distinct branch of the tree is slightly less awkward to handle. Notice how similar the following definition is to the definition of twice_replace_nodes:
adjoin_nodes([A|As], DA1, DA2, [B|Bs], DB1, DB2) :- replace_node(A, DA1, B, DB1), replace_nodes(As, DA2, Bs, DB2), adjunction_structure(DA2, _Adjunct, Segment, DB2), features(DA2, Features), features(Segment, Features), contents(DA2, Contents), contents(Segment, Contents).
adjoin_nodes([A|As], DA1, DA2, [B|Bs], DB1, DB2) :- replace_node(A, DA2, B, DB2), replace_nodes(As, DA1, Bs, DB1), adjunction_structure(DA2, _Adjunct, Segment, DB2), features(DA2, Features), features(Segment, Features), contents(DA2, Contents), contents(Segment, Contents).
adjoin_nodes([A|As], DA1, DA2, [B|Bs], DB1, DB2) :- adjoin_node(A, DA1, DA2, B, DB1, DB2), iso_subtrees(As, Bs).
adjoin_nodes([A|As], DA1, DA2, [B|Bs], DB1, DB2) :- iso_subtree(A, B), adjoin_nodes(As, DA1, DA2, Bs, DB1, DB2).
(29) Finally, the whole adjunction structure is defined by its relations to the AdjunctionSite and the Adjunct, as follows:
adjunction_structure(AdjunctionSite, Adjunct, Segment, Adjunction) :- category(Adjunction, Cat), category(AdjunctionSite, Cat), category(Segment, Cat), barlevel(Adjunction, Bar), barlevel(AdjunctionSite, Bar), barlevel(Segment, Bar), extended(Adjunction, EP), extended(AdjunctionSite, EP), extended(Segment, +), features(Adjunction, Fea), features(Segment, Fea), right_or_left(Adjunct,Segment,Adjunction), visible(AdjunctionSite).
right_or_left(Adjunct,LowerSegment,AdjunctionStructure) :- % left children(AdjunctionStructure, [Adjunct,LowerSegment]). right_or_left(Adjunct,LowerSegment,AdjunctionStructure) :- % right children(AdjunctionStructure, [LowerSegment,Adjunct]).
Notice that the contents and features of the lower Segment are not specified by adjunction_structure. They may not correspond exactly to the contents and features of the original AdjunctionSite because they may be changed by the replacement of OldNode by Trace. (30) In some theories, like the one in Sportiche (1998a), adjunction is only possible to “non-argument” or A’ categories, namely V, I, and A, so we could define:
nonargument(Node) :- category(Node,v). nonargument(Node) :- category(Node,i). nonargument(Node) :- category(Node,a).
(31) We observed above that no substitution is possible in the tree of (21). However, adjunction can apply to that structure. In fact, exactly four adjunctions are allowed by the definitions provided: we can left adjoin the IP to itself; we can right adjoin the IP to itself; we can left adjoin the DP to the IP; or we can right adjoin the DP to the IP. These four results are shown here, in the order mentioned:
[four result trees: the IP left-adjoined to itself, the IP right-adjoined to itself, the DP left-adjoined to the IP, and the DP right-adjoined to the IP; in each, the adjoined node and its trace share the index A, and the lower segment of the adjunction structure carries the + extended feature]
No adjunction of the DP to itself is possible, because DP is an argument. But clearly, adjunction as formulated here can apply in very many ways, so any theory using it will have to restrict its application carefully.
5.3.3 Move-α
(32) Since a movement can be either a substitution or adjunction, let’s say:
moveA(OldRoot, NewRoot) :- substitution(OldRoot, NewRoot, MovedNode, Trace), ccl(MovedNode,Trace). moveA(OldRoot, NewRoot) :- adjunction(OldRoot, NewRoot, MovedNode, Trace), ccl(MovedNode,Trace).
The ccl predicate, axiomatized below, will hold if the movement satisfies CCL, the condition on chain links. Finally, the reflexive, transitive closure of this relation can be defined in the familiar way:
moveAn(Root, Root). moveAn(DeepRoot, Root) :- moveA(DeepRoot, MidRoot), moveAn(MidRoot, Root).
The predicate moveAn corresponds closely to the usual notion of move-α.
5.3.4 Tree relations for adjunction structures
(33) The definition of siblings given above in 7 is simple, but it is purely geometric and does not pay attention to adjunction structures with segments. (34) We need to extend the geometric notions of parent, ancestor, siblings to the related specialized notions: imm_dominates, dominates, sister. To do this, we need to be able to find the minimal and maximal segment of a node. Assuming binary branching, this can be done as follows:
maximal_segment(Node,Node) :- extended(Node,-). maximal_segment(Node,MaxSegment) :- extended(Node,+), child(_,Parent,Node), maximal_segment(Parent,MaxSegment).
minimal_segment(Node,Node) :- children(Node,[]). minimal_segment(Node,Node) :- children(Node,[Child]), extended(Child,-). minimal_segment(Node,Node) :- children(Node,[ChildA,ChildB]), extended(ChildA,-), extended(ChildB,-). minimal_segment(Node,MinSegment) :- child(_I,Node,Segment), extended(Segment,+), minimal_segment(Segment,MinSegment).
Notice that a node is a minimal segment iff it is not a parent of any proper segment (i.e. any node with a + extended feature). (35) With these notions, the intended dominates and excludes relations are easily defined:
dominates(Node,Child) :- minimal_segment(Node,MinSegment), ancestor_down(MinSegment,Child). excludes(NodeA,NodeB) :- maximal_segment(NodeA,MaximalSegment), \+ ancestor_down(MaximalSegment,NodeB).
The predicate ancestor was defined earlier and can use these new definitions of domination. (36) The sister and imm_dominates relations can be defined as follows, (assuming binary branching):
sister(Node,Sister) :- maximal_segment(Node,MaxSegment), siblings(MaxSegment,[Sister]), extended(Sister,-). sister(Node,Sister) :- maximal_segment(Node,MaxSegment), siblings(MaxSegment,[Segment]), extended(Segment,+), imm_dominates(Segment,Sister).
imm_dominates(Node,Child) :- child(_I,Node,Child), extended(Child,-). imm_dominates(Node,Child) :- child(_I,Node,Segment), extended(Segment,+), imm_dominates(Segment,Child).
(37) With these foundations, it is easy to formalize i_command – sometimes called c-command: α i-commands β iff α is immediately dominated by an ancestor of β, and α ≠ β. This is equivalent to the earlier formulation, since if the immediately dominating parent of Commander dominates Node, then every node dominating Commander dominates Node. In our formal notation:
i_commands(Commander,Node) :- dominates(Ancestor,Node), imm_dominates(Ancestor,Commander), \+ Commander=Node.
(38) Consider the top left tree in 31. In this tree, the root IP has adjoined to itself; consequently, the moved constituent has no sister. In fact, the node labeled x(i,2,[index:A],-) has no sister. The first node that dominates it is the adjunction structure, and that adjunction structure does not immediately dominate any other node. The trace is itself part of an extended adjunction structure, and has no sister, and no i-commander.
5.3.5 Conclusion and prospects
(40) The tree manipulations and relations defined in this section are not trivial, but they are fully explicit and implemented for computation.23 (41) In the minimalist program, there are simpler approaches to movement that will be discussed in §9.1-§??, below.
23The formalizations of movement relations in Rogers (1999) and in Kracht (1998) are mathematically more elegant, and it would be interesting to consider whether an implementation of these formalizations could be nicer than the ones given here.
6 Context free parsing: stack-based strategies
6.1 LL parsing
(1) Recall the definition of TD, which uses an “expansion” rule that we will now call “LL,” because this method consumes the input string from Left to right, and it constructs a Leftmost parse:
G, Γ, S ⊢ G   [axiom]   for definite clauses Γ, goal G, S ⊆ Σ*

G, Γ, S ⊢ (?- p, C)
------------------------------   [ll]   if (p :- q1, ..., qn) ∈ Γ
G, Γ, S ⊢ (?- q1, ..., qn, C)

G, Γ, wS ⊢ (?- w, C)
------------------------------   [scan]
G, Γ, S ⊢ (?- C)
(2) As discussed in §1 on page 6, the rule ll is sound. To review that basic idea from a different perspective, notice, for example, that [ll] licenses inference steps like the following:
G, Γ, S ⊢ (?- p, q)
------------------------------   [ll]   if (p :- r, s) ∈ Γ
G, Γ, S ⊢ (?- r, s, q)
In standard logic, this reasoning might be represented this way:
¬(p ∧ q) ∧ ((r ∧ s) → p)
------------------------------
¬(r ∧ s ∧ q)
Is this inference sound in the propositional calculus? Yes. This could be shown with truth tables, or we could, for example, use simple propositional reasoning to deduce the conclusion from the premise using tautologies and modus ponens.
¬(p ∧ q) ∧ ((r ∧ s) → p)
(¬p ∨ ¬q) ∧ ((r ∧ s) → p)      by ¬(A∧B) ↔ (¬A∨¬B)
(p → ¬q) ∧ ((r ∧ s) → p)       by (¬A∨B) ↔ (A→B)
((r ∧ s) → p) ∧ (p → ¬q)       by (A∧B) ↔ (B∧A)
(r ∧ s) → ¬q                   by ((A→B)∧(B→C)) → (A→C)
¬(r ∧ s) ∨ ¬q                  by (A→B) ↔ (¬A∨B)
(¬r ∨ ¬s ∨ ¬q)                 by ¬(A∧B) ↔ (¬A∨¬B)
¬(r ∧ s ∧ q)                   by (¬A∨¬B∨¬C) ↔ ¬(A∧B∧C)
Mates’ 100 important tautologies: A formula is a tautology iff it is true under all interpretations. The following examples from (Mates, 1972) are tautologies, for all formulas A, B, C, D:
1. (A → B) → ((B → C) → (A → C)) (Principle of the Syllogism) 2. (B → C) → ((A → B) → (A → C)) 3. A → ((A → B) → B) 4. (A → (B → C)) → ((A → B) → (A → C)) 5. (A → (B → C)) → (B → (A → C)) 6. A → A (Law of Identity) 7. B → (A → B) 8. ¬A → (A → B) (Law of Duns Scotus) 9. A → (¬A → B) 10. ¬¬A → A 11. A →¬¬A 12. (¬A →¬B) → (B → A) 13. (A →¬B) → (B →¬A) 14. (¬A → B) → (¬B → A) 15. (A → B) → (¬B →¬A) (Principle of Transposition, or contraposition) 16. (¬A → A) → A (Law of Clavius) 17. (A →¬A) →¬A 18. ¬(A → B) → A 19. ¬(A → B) →¬B 20. A → (B → (A ∧ B)) 21. (A → B) → ((B → A) → (A ↔ B)) 22. (A ↔ B) → (A → B) 23. (A ↔ B) → (B → A) 24. (A ∨ B) ↔ (B ∨ A) (Commutative Law for Disjunction) 25. A → (A ∨ B) 26. B → (A ∨ B) 27. (A ∨ A) ↔ A (Principle of Tautology for Disjunction) 28. A ↔ A 29. ¬¬A ↔ A (Principle of Double Negation) 30. (A ↔ B) ↔ (B ↔ A) 31. (A ↔ B) ↔ (¬A ↔¬B) 32. (A ↔ B) → ((A ∧ C) ↔ (B ∧ C)) 33. (A ↔ B) → ((C ∧ A) ↔ (C ∧ B)) 34. (A ↔ B) → ((A ∨ C) ↔ (B ∨ C)) 35. (A ↔ B) → ((C ∨ A) ↔ (C ∨ B)) 36. (A ↔ B) → ((A → C) ↔ (B → C)) 37. (A ↔ B) → ((C → A) ↔ (C → B)) 38. (A ↔ B) → ((A ↔ C) ↔ (B ↔ C))
39. (A ↔ B) → ((C ↔ A) ↔ (C ↔ B)) 40. (A ∨ (B ∨ C)) ↔ (B ∨ (A ∨ C)) 41. (A ∨ (B ∨ C)) → ((A ∨ B) ∨ C) (Associative Law for Disjunction) 42. ¬(A ∧ B) ↔ (¬A ∨¬B) (De Morgan’s Law) 43. ¬(A ∨ B) ↔ (¬A ∧¬B) (De Morgan’s Law) 44. (A ∧ B) ↔¬(¬A ∨¬B) (De Morgan’s Law) 45. (A ∨ B) ↔¬(¬A ∧¬B) (De Morgan’s Law) 46. (A ∧ B) ↔ (B ∧ A) (Commutative Law for Conjunction) 47. (A ∧ B) → A (Law of Simplification) 48. (A ∧ B) → B (Law of Simplification) 49. (A ∧ A) → A (Law of Tautology for Conjunction) 50. (A ∧ (B ∧ C)) ↔ ((A ∧ B) ∧ C) (Associative Law for Conjunction) 51. (A → (B → C)) ↔ ((A ∧ B) → C) (Export-Import Law) 52. (A → B) ↔¬(A ∧¬B) 53. (A → B) ↔ (¬A ∨ B) ES says: know this one! 54. (A ∨ (B ∧ C)) ↔ ((A ∨ B) ∧ (A ∨ C)) (Distributive Law) 55. (A ∧ (B ∨ C)) ↔ ((A ∧ B) ∨ (A ∧ C)) (Distributive Law) 56. ((A ∧ B) ∨ (C ∧ D)) ↔ (((A ∨ C) ∧ (A ∨ D)) ∧ (B ∨ C) ∧ (B ∨ D)) 57. A → ((A ∧ B) ↔ B) 58. A → ((B ∧ A) ↔ B) 59. A → ((A → B) ↔ B) 60. A → ((A ↔ B) ↔ B) 61. A → ((B ↔ A) ↔ B) 62. ¬A → ((A ∨ B) ↔ B) 63. ¬A → ((B ∨ A) ↔ B) 64. ¬A → (¬(A ↔ B) ↔ B) 65. ¬A → (¬(B ↔ A) ↔ B) 66. A ∨¬A (Law of Excluded Middle) 67. ¬(A ∧¬A) (Law of Contradiction) 68. (A ↔ B) ↔ ((A ∧ B) ∨ (¬A ∧¬B)) 69. ¬(A ↔ B) ↔ (A ↔¬B) 70. ((A ↔ B) ∧ (B ↔ C)) → (A ↔ C) 71. ((A ↔ B) ↔ A) ↔ B 72. (A ↔ (B ↔ C)) ↔ ((A ↔ B) ↔ C) 73. (A → B) ↔ (A → (A ∧ B)) 74. (A → B) ↔ (A ↔ (A ∧ B)) 75. (A → B) ↔ ((A ∨ B) → B) 76. (A → B) ↔ ((A ∨ B) ↔ B) 77. (A → B) ↔ (A → (A → B)) 78. (A → (B ∧ C)) ↔ ((A → B) ∧ (A → C)) 79. ((A ∨ B) → C) ↔ ((A → C) ∧ (B → C)) 80. (A → (B ∨ C)) ↔ ((A → B) ∨ (A → C))
81. ((A ∧ B) → C) ↔ ((A → C) ∨ (B → C)) 82. (A → (B ↔ C)) ↔ ((A ∧ B) ↔ (A ∧ C)) 83. ((A ∧¬B) → C) ↔ (A → (B ∨ C)) 84. (A ∨ B) ↔ ((A → B) → B) 85. (A ∧ B) ↔ ((B → A) ∧ B) 86. (A → B) ∨ (B → C) 87. (A → B) ∨ (¬A → B) 88. (A → B) ∨ (A →¬B) 89. ((A ∧ B) → C) ↔ ((A ∧¬C) →¬B) 90. (A → B) → ((C → (B → D)) → (C → (A → D))) 91. (((A → B) ∧ (B → C)) ∨ (C → A)) 92. ((A → B) ∧ (C → D)) → ((A ∨ C) → (B ∨ D)) 93. ((A → B) ∨ (C → D)) ↔ ((A → D) ∨ (C → B)) 94. ((A ∨ B) → C) ↔ ((A → C) ∧ ((¬A ∧ B) → C)) 95. ((A → B) → (B → C)) ↔ (B → C) 96. ((A → B) → (B → C)) → ((A → B) → (A → C)) 97. ((A → B) → C) → ((C → A) → A) 98. ((A → B) → C) → ((A → C) → C) 99. (¬A → C) → ((B → C) → ((A → B) → C)) 100. (((A → B) → C) → D) → ((B → C) → (A → D))
(3) The top-down recognizer was implemented this way:
/* * file: ll.pl */ :- op(1200,xfx,:˜). % this is our object language "if" :- op(1100,xfx,?˜). % metalanguage provability predicate
[] ?˜ []. (S0 ?˜ Goals0) :- infer(S0,Goals0,S,Goals), (S ?˜ Goals).
infer(S,[A|C], S,DC) :- (A :˜ D), append(D,C,DC). % ll infer([A|S],[A|C], S,C). % scan
append([],L,L). append([E|L],M,[E|N]) :- append(L,M,N).
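To see these clauses at work on a tiny case, suppose the object grammar consists of the single rule below (a hypothetical sketch of my own, not one of the grammar files used in these notes):
s :˜ [a, b].
% Then the recognizer reduces the query step by step:
% ([a,b] ?˜ [s])     the ll clause replaces s by its expansion, giving
% ([a,b] ?˜ [a,b])   the scan clause consumes a, giving
% ([b] ?˜ [b])       scan consumes b, giving
% ([] ?˜ [])         which matches the base clause, so the query succeeds.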
(4) This top-down (TD) parsing method is sometimes called LL, because it uses the input string from Left to right, and it constructs a Leftmost parse (i.e. a derivation that expands the leftmost nonterminal at each point). (5) The parser was implemented this way:
/* * file: llp.pl = tdp.pl */ :- op(1200,xfx,:˜). % this is our object language "if" :- op(1100,xfx,?˜). % metalanguage provability predicate :- op(500,yfx,@). % metalanguage functor to separate goals from trees
[] ?˜ []@[]. (S0 ?˜ Goals0@T0) :- infer(S0,Goals0@T0,S,Goals@T), (S ?˜ Goals@T).
infer(S,[A|C]@[A/DTs|CTs],S,DC@DCTs) :- (A :˜ D), new_goals(D,C,CTs,DC,DCTs,DTs). % ll infer([A|S],[A|C]@[A/[]|CTs],S,C@CTs). % scan
%new_goals(NewGoals,OldGoals,OldTrees,AllGoals,AllTrees,NewTrees) new_goals([],Gs,Ts,Gs,Ts,[]). new_goals([G|Gs0],Gs1,Ts1,[G|Gs2],[T|Ts2],[T|Ts]) :- new_goals(Gs0,Gs1,Ts1,Gs2,Ts2,Ts).
(6) Example. Let’s use g1.pl again:
/* * file: g1.pl */ :- op(1200,xfx,:˜).
ip :˜ [dp, i1]. i1 :˜ [i0, vp]. i0 :˜ [will]. dp :˜ [d1]. d1 :˜ [d0, np]. d0 :˜ [the]. np :˜ [n1]. n1 :˜ [n0]. n0 :˜ [idea]. n1 :˜ [n0, cp]. vp :˜ [v1]. v1 :˜ [v0]. v0 :˜ [suffice]. cp :˜ [c1]. c1 :˜ [c0, ip]. c0 :˜ [that].
With this grammar and llp.pl we get the following session:
| ?- [llp,g1,pp_tree]. {consulting /home/es/tex/185-00/llp.pl...} {consulted /home/es/tex/185-00/llp.pl in module user, 20 msec 2000 bytes} {consulting /home/es/tex/185-00/g1.pl...} {consulted /home/es/tex/185-00/g1.pl in module user, 20 msec 2384 bytes} {consulting /home/es/tex/185-00/pp_tree.pl...} {consulted /home/es/tex/185-00/pp_tree.pl in module user, 10 msec 1344 bytes}
yes | ?- [the,idea,will,suffice] ?˜ [ip]@[T].
T = ip/[dp/[d1/[d0/[the/[]],np/[n1/[...]]]],i1/[i0/[will/[]],vp/[v1/[v0/[...]]]]] ?
yes | ?- ([the,idea,will,suffice] ?˜ [ip]@[T]), pp_tree(T). ip /[ dp /[ d1 /[ d0 /[ the /[]], np /[ n1 /[ n0 /[ idea /[]]]]]], i1 /[ i0 /[ will /[]], vp /[ v1 /[ v0 /[ suffice /[]]]]]] T = ip/[dp/[d1/[d0/[the/[]],np/[n1/[...]]]],i1/[i0/[will/[]],vp/[v1/[v0/[...]]]]] ?
yes |?-
(7) Assessment of the LL strategy: a. Unbounded memory requirements on simple left branching. b. Stupid about left branches – recursion in left branches produces infinite search spaces.
6.2 LR parsing
(8) Bottom-up parsing is sometimes called “LR,” because it uses the input string from Left to right, and it constructs a Rightmost parse in reverse. LR parsers adopt a strategy of “listening first” – that is, rules are never used until all of their conditions (their right sides, their antecedents) have been established. (9) LR recognition is defined this way:
G, Γ, S ⊢ G   [axiom]   for definite clauses Γ, goal G, S ⊆ Σ*

G, Γ, S ⊢ (?- ¬qn, ..., ¬q1, C)
------------------------------   [lr]   if (p :- q1, ..., qn) ∈ Γ
G, Γ, S ⊢ (?- ¬p, C)

G, Γ, S ⊢ (?- ¬qn, ..., ¬q1, p, C)
------------------------------   [lr-complete]   if (p :- q1, ..., qn) ∈ Γ
G, Γ, S ⊢ (?- C)
G, Γ, wS ⊢ (?- C)
------------------------------   [shift]
G, Γ, S ⊢ (?- ¬w, C)

The lr-complete rule (often called “reduce-complete”) is needed because the initial goal is, in effect, a prediction of what we should end up with. (More precisely, the goal is the denial of what we want to prove, since these are proofs by contradiction.)
(10) Exercise: (optional!) We saw that the top-down reasoning shown in (2) on page 75 can be interpreted as sound reasoning in the propositional calculus. Is there a way to interpret bottom-up reasoning as sound too? How could we interpret the expressions in the following step so that the reasoning is sound? (tricky!)
G, Γ, S ⊢ (?- ¬s, ¬r, q)
------------------------------   [lr]   if (p :- r, s) ∈ Γ
G, Γ, S ⊢ (?- ¬p, q)

A solution: To understand this as a sound step, we need an appropriate understanding of the negated elements in the goal. One strategy is to treat the negated elements at the front of the goal as disjoined, and so the shift rule actually disjoins an element from S with the negated elements on the right side, if any. So then we interpret the rule just above like this:
¬((¬s ∨ ¬r) ∧ q) ∧ ((r ∧ s) → p)
--------------------------------
¬(¬p ∧ q)
The following propositional reasoning shows that this step is sound:
¬((¬s ∨ ¬r) ∧ q) ∧ ((r ∧ s) → p)
¬(q ∧ (¬s ∨ ¬r)) ∧ ((r ∧ s) → p)      by (A∧B) ↔ (B∧A)
¬(q ∧ (¬r ∨ ¬s)) ∧ ((r ∧ s) → p)      by (A∨B) ↔ (B∨A)
¬(q ∧ ¬(r ∧ s)) ∧ ((r ∧ s) → p)       by (A∨B) ↔ ¬(¬A∧¬B)
(¬q ∨ (r ∧ s)) ∧ ((r ∧ s) → p)        by ¬(A∧B) ↔ (¬A∨¬B)
(q → (r ∧ s)) ∧ ((r ∧ s) → p)         by (¬A∨B) ↔ (A→B)
(q → p)                               by ((A→B)∧(B→C)) → (A→C)
(¬q ∨ p)                              by (A→B) ↔ (¬A∨B)
(p ∨ ¬q)                              by (A∨B) ↔ (B∨A)
¬(¬p ∧ q)                             by (A∨B) ↔ ¬(¬A∧¬B)
(11) Setting the stage: implementation of reverse and negreverse
reverse([],L,L). reverse([E|L],M,N) :- reverse(L,[E|M],N).
negreverse([],L,L). negreverse([E|L],M,N) :- negreverse(L,[-E|M],N).
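For example (my own illustration of the accumulator style):
% ?- negreverse([a,b,c], [], N).
% N = [-c,-b,-a]
negreverse reverses the list into the accumulator while negating each element.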
(12) The naive implementation of the LR recognizer:
/* * file: lr0.pl - first version */ :- op(1200,xfx,:˜). % this is our object language "if" :- op(1100,xfx,?˜). % metalanguage provability predicate
[] ?˜ []. (S0 ?˜ Goals0) :- infer(S0,Goals0,S,Goals), (S ?˜ Goals).
infer(S,RDC,S,C) :- (A :˜ D), negreverse(D,[A],RD), append(RD,C,RDC). % reduce-complete infer(S,RDC,S,[-A|C]) :- (A :˜ D), negreverse(D,[],RD), append(RD,C,RDC). % reduce infer([W|S],C,S,[-W|C]). % shift
negreverse([],L,L). negreverse([E|L],M,N) :- negreverse(L,[-E|M],N).
append([],L,L). append([F|L],M,[F|N]) :- append(L,M,N).
Here, RDC is the sequence which is the Reverse of D, followed by C (13) The slightly improved implementation of the LR recognizer:
/* * file: lr.pl */ :- op(1200,xfx,:˜). % this is our object language "if" :- op(1100,xfx,?˜). % metalanguage provability predicate
[] ?˜ []. (S0 ?˜ Goals0) :- infer(S0,Goals0,S,Goals), (S ?˜ Goals).
infer(S,RDC,S,C) :- (A :˜ D), preverse(D,RDC,[A|C]). % reduce-complete infer(S,RDC,S,[-A|C]) :- (A :˜ D), preverse(D,RDC,C). % reduce infer([W|S],C,S,[-W|C]). % shift
%preverse(ExpansionD,TempStack,ReversedExpansionD,RestConstituents) preverse( [], C,C). preverse([E|L],RD,C) :- preverse(L,RD,[-E|C]).
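On the same hypothetical one-rule grammar s :˜ [a, b] used to illustrate ll.pl above (my own sketch, not a grammar file from the notes), the LR recognizer proceeds bottom-up:
% ([a,b] ?˜ [s])            shift a, giving
% ([b] ?˜ [-(a),s])         shift b, giving
% ([] ?˜ [-(b),-(a),s])     reduce-complete with s :˜ [a,b]: preverse matches the
%                           negated right side against the stack and removes s, giving
% ([] ?˜ [])                which matches the base clause, so the query succeeds.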
(14) The implementation of the LR parser:
/* * file: lrp.pl */ :- op(1200,xfx,:˜). % this is our object language "if" :- op(1100,xfx,?˜). % metalanguage provability predicate :- op(500,yfx,@). % metalanguage functor to separate goals from trees
[] ?˜ []@[]. (S0 ?˜ Goals0@T0) :- infer(S0,Goals0@T0,S,Goals@T), (S ?˜ Goals@T).
infer(S,RDC@RDCTs,S,C@CTs) :- (A :˜ D), preverse(D,RDC,[A|C],DTs,RDCTs,[A/DTs|CTs]). % reduce-complete infer(S,RDC@RDCTs,S,[-A|C]@[A/DTs|CTs]) :- (A :˜ D), preverse(D,RDC,C,DTs,RDCTs,CTs). % reduce infer([W|S],C@CTs,S,[-W|C]@[W/[]|CTs]). % shift
%preverse(ExpansionD,ReversedExpansionD,RestCatsC,ExpansionDTs,ReversedExpansionDTs,RestCatsCTs) preverse([],C,C,[],CTs,CTs). preverse([E|L],RD,C,[ETs|LTs],RDTs,CTs) :- preverse(L,RD,[-E|C],LTs,RDTs,[ETs|CTs]).
This implementation is conceptually more straightforward than tdp, because here, all the trees in our stack are complete, so we just do with the trees exactly the same thing that we are doing with the stack. This is accomplished by taking the 3-argument preverse from the lr recognizer and making it a 6-argument predicate in the parser, where the tree stacks are manipulated in just the same way that the recognizer stacks are. (15) Assessment of the LR strategy: a. Unbounded memory requirements on simple right branching. b. Stupid about empty categories – they produce infinite search spaces.
6.3 LC parsing
(16) Left corner parsing is intermediate between top-down and bottom-up. Like LR, LC parsers adopt a strategy of “listening first,” but after listening to a “left corner,” the rest of the expansion is predicted. In a constituent formed by applying a rewrite rule A → BCD, the “left corner” is just the first constituent on the right side – B in the production A → BCD. (17) LC recognition is defined this way:24
G, Γ, S ⊢ G   [axiom]   for definite clauses Γ, goal G, S ⊆ Σ*

G, Γ, S ⊢ (?- ¬q1, C)
------------------------------   [lc]   if (p :- q1, ..., qn) ∈ Γ
G, Γ, S ⊢ (?- q2, ..., qn, ¬p, C)

G, Γ, S ⊢ (?- ¬q1, p, C)
------------------------------   [lc-complete]   if (p :- q1, ..., qn) ∈ Γ
G, Γ, S ⊢ (?- q2, ..., qn, C)

G, Γ, wS ⊢ (?- C)
------------------------------   [shift]
G, Γ, S ⊢ (?- ¬w, C)

G, Γ, wS ⊢ (?- w, C)
------------------------------   [shift-complete] = scan
G, Γ, S ⊢ (?- C)

We want to allow the recognizer to handle empty productions, that is, productions (p :- q1, ..., qn) ∈ Γ where n = 0. We do this by saying that in such productions, the “left corner” is the empty string. With this policy, the n = 0 instances of the lc rules can be written this way:

G, Γ, S ⊢ (?- C)
------------------------------   [lc-e]   if (p :- []) ∈ Γ
G, Γ, S ⊢ (?- ¬p, C)

G, Γ, S ⊢ (?- p, C)
------------------------------   [lc-e-complete]   if (p :- []) ∈ Γ
G, Γ, S ⊢ (?- C)
(18) Exercise: Use simple propositional reasoning of the sort shown in (2) on page 75 and in (10) on page 80 to show that the following inference step is sound. (tricky!)
G, Γ, S ⊢ (?- ¬r, q)
------------------------------   [lc]   if (p :- r, s) ∈ Γ
G, Γ, S ⊢ (?- s, ¬p, q)
(19) The implementation of the LC recognizer: /* * file: lc.pl */ :- op(1200,xfx,:˜). % this is our object language "if" :- op(1100,xfx,?˜). % metalanguage provability predicate
[] ?˜[]. (S0 ?˜Goals0) :- infer(S0,Goals0,S,Goals), (S ?˜Goals).
infer(S,[-D,A|C],S,DC) :- (A :˜[D|Ds]), append(Ds,C,DC). % lc-complete infer(S,[-D|C],S,DAC) :- (A :˜[D|Ds]), append(Ds,[-A|C],DAC). % lc
infer([W|S],[W|C],S,C). % shift-complete=scan infer([W|S],C,S,[-W|C]). % shift
infer(S,[A|C],S,C) :- (A :˜[]). % lc-e-complete infer(S,C,S,[-A|C]) :- (A :˜[]). % lc-e
append([],L,L). append([E|L],M,[E|N]) :- append(L,M,N).
(20) Like LR, this parser is stupid about empty categories. For example, using g3.pl which has an empty i0, we cannot prove [titus,laughs] ?˜[ip], unless we control the search by hand: | ?- [g3,lc]. yes | ?- trace,([titus,laughs] ?˜[ip]). The debugger will first creep -- showing everything (trace) 1 1 Call: [titus,laughs]?˜[ip] ? 2 2 Call: infer([titus,laughs],[ip],_953,_954) ? ? 2 2 Exit: infer([titus,laughs],[ip],[laughs],[-(titus),ip]) ? 3 2 Call: [laughs]?˜[-(titus),ip] ? 4 3 Call: infer([laughs],[-(titus),ip],_2051,_2052) ? 5 4 Call: ip:˜[titus|_2424] ? f 5 4 Fail: ip:˜[titus|_2424] ? 6 4 Call: _2430:˜[titus|_2428] ? ? 6 4 Exit: dp(3,s,_2815):˜[titus] ? 7 4 Call: append([],[-(dp(3,s,_2815)),ip],_2052) ? s 7 4 Exit: append([],[-(dp(3,s,_2815)),ip],[-(dp(3,s,_2815)),ip]) ? ? 4 3 Exit: infer([laughs],[-(titus),ip],[laughs],[-(dp(3,s,_2815)),ip]) ? 8 3 Call: [laughs]?˜[-(dp(3,s,_2815)),ip] ? 9 4 Call: infer([laughs],[-(dp(3,s,_2815)),ip],_4649,_4650) ? 10 5 Call: ip:˜[dp(3,s,_2815)|_5028] ? 10 5 Exit: ip:˜[dp(3,s,nom),i1(3,s)] ? 11 5 Call: append([i1(3,s)],[],_4650) ? s 11 5 Exit: append([i1(3,s)],[],[i1(3,s)]) ? ? 9 4 Exit: infer([laughs],[-(dp(3,s,nom)),ip],[laughs],[i1(3,s)]) ? 12 4 Call: [laughs]?˜[i1(3,s)] ? 13 5 Call: infer([laughs],[i1(3,s)],_7267,_7268) ? ? 13 5 Exit: infer([laughs],[i1(3,s)],[],[-(laughs),i1(3,s)]) ? 14 5 Call: []?˜[-(laughs),i1(3,s)] ? f 14 5 Fail: []?˜[-(laughs),i1(3,s)] ? 13 5 Redo: infer([laughs],[i1(3,s)],[],[-(laughs),i1(3,s)]) ? 15 6 Call: i1(3,s):˜[] ? 15 6 Fail: i1(3,s):˜[] ? 16 6 Call: _7639:˜[] ? ? 16 6 Exit: i0:˜[] ? ? 13 5 Exit: infer([laughs],[i1(3,s)],[laughs],[-(i0),i1(3,s)]) ? 17 5 Call: [laughs]?˜[-(i0),i1(3,s)] ? 18 6 Call: infer([laughs],[-(i0),i1(3,s)],_9100,_9101) ? 19 7 Call: i1(3,s):˜[i0|_9478] ? 19 7 Exit: i1(3,s):˜[i0,vp(3,s)] ? 20 7 Call: append([vp(3,s)],[],_9101) ? s 20 7 Exit: append([vp(3,s)],[],[vp(3,s)]) ? ? 18 6 Exit: infer([laughs],[-(i0),i1(3,s)],[laughs],[vp(3,s)]) ? 21 6 Call: [laughs]?˜[vp(3,s)] ? 22 7 Call: infer([laughs],[vp(3,s)],_11713,_11714) ? ? 22 7 Exit: infer([laughs],[vp(3,s)],[],[-(laughs),vp(3,s)]) ? 23 7 Call: []?˜[-(laughs),vp(3,s)] ? 24 8 Call: infer([],[-(laughs),vp(3,s)],_12827,_12828) ? 25 9 Call: vp(3,s):˜[laughs|_13204] ? 25 9 Fail: vp(3,s):˜[laughs|_13204] ? 26 9 Call: _13210:˜[laughs|_13208] ? ? 26 9 Exit: v0(intrans,3,s):˜[laughs] ? 27 9 Call: append([],[-(v0(intrans,3,s)),vp(3,s)],_12828) ? s 27 9 Exit: append([],[-(v0(intrans,3,s)),vp(3,s)],[-(v0(intrans,3,s)),vp(3,s)]) ? ? 24 8 Exit: infer([],[-(laughs),vp(3,s)],[],[-(v0(intrans,3,s)),vp(3,s)]) ? 28 8 Call: []?˜[-(v0(intrans,3,s)),vp(3,s)] ? 29 9 Call: infer([],[-(v0(intrans,3,s)),vp(3,s)],_15456,_15457) ? 30 10 Call: vp(3,s):˜[v0(intrans,3,s)|_15839] ? 30 10 Fail: vp(3,s):˜[v0(intrans,3,s)|_15839] ? 31 10 Call: _15845:˜[v0(intrans,3,s)|_15843] ? ? 31 10 Exit: v1(3,s):˜[v0(intrans,3,s)] ? 32 10 Call: append([],[-(v1(3,s)),vp(3,s)],_15457) ? s 32 10 Exit: append([],[-(v1(3,s)),vp(3,s)],[-(v1(3,s)),vp(3,s)]) ? ? 29 9 Exit: infer([],[-(v0(intrans,3,s)),vp(3,s)],[],[-(v1(3,s)),vp(3,s)]) ? 33 9 Call: []?˜[-(v1(3,s)),vp(3,s)] ? 34 10 Call: infer([],[-(v1(3,s)),vp(3,s)],_18106,_18107) ? 35 11 Call: vp(3,s):˜[v1(3,s)|_18488] ? 35 11 Exit: vp(3,s):˜[v1(3,s)] ? 36 11 Call: append([],[],_18107) ? s 36 11 Exit: append([],[],[]) ? ? 34 10 Exit: infer([],[-(v1(3,s)),vp(3,s)],[],[]) ? 37 10 Call: []?˜[] ? ? 37 10 Exit: []?˜[] ? ? 33 9 Exit: []?˜[-(v1(3,s)),vp(3,s)] ? ? 28 8 Exit: []?˜[-(v0(intrans,3,s)),vp(3,s)] ? ? 23 7 Exit: []?˜[-(laughs),vp(3,s)] ? ? 
21 6 Exit: [laughs]?˜[vp(3,s)] ? ? 17 5 Exit: [laughs]?˜[-(i0),i1(3,s)] ? ? 12 4 Exit: [laughs]?˜[i1(3,s)] ? ? 8 3 Exit: [laughs]?˜[-(dp(3,s,nom)),ip] ? ? 3 2 Exit: [laughs]?˜[-(titus),ip] ? ? 1 1 Exit: [titus,laughs]?˜[ip] ?
yes
(21) The implementation of the LC parser: /* * file: lcp.pl */ :- op(1200,xfx,:˜). % this is our object language "if" :- op(1100,xfx,?˜). % metalanguage provability predicate :- op(500,yfx,@). % metalanguage functor to separate goals from trees
[] ?˜[]@[]. (S0 ?˜Goals0@T0) :- infer(S0,Goals0@T0,S,Goals@T), (S ?˜Goals@T). infer([A|S],[A|C]@[A/[]|CTs],S,C@CTs). % scan infer([W|S], C@CTs,S,[-W|C]@[W/[]|CTs]). % shift
infer(S,[-D,A|C]@[DT,A/[DT|DTs]|CTs],S,DC@DCTs) :- (A :˜[D|Ds]), new_goals(Ds,C,CTs,DC,DCTs,DTs). % lc-complete infer(S,[-D|C]@[DT|CTs],S,DC@DCTs) :- (A :˜[D|Ds]), new_goals(Ds,[-A|C],[A/[DT|DTs]|CTs],DC,DCTs,DTs). % lc
infer(S,[A|C]@[A/[]|CTs],S,DC@DCTs) :- (A :˜[]), new_goals([],C,CTs,DC,DCTs,[]). % lc-e-complete infer(S,C@CTs,S,DC@DCTs) :- (A :˜[]), new_goals([],[-A|C],[A/[]|CTs],DC,DCTs,[]). % lc-e %new_goals(NewGoals,OldGoals,OldTrees,AllGoals,AllTrees,NewTrees) new_goals([],Gs,Ts,Gs,Ts,[]). new_goals([G|Gs0],Gs1,Ts1,[G|Gs2],[T|Ts2],[T|Ts]) :- new_goals(Gs0,Gs1,Ts1,Gs2,Ts2,Ts). (22) Assessment of the LC parser: a. Bounded memory requirements on simple right and left branching! b. Unbounded on recursive center embedding (of course) c. Stupid about empty categories – they can still produce infinite search spaces.
6.4 All the GLC parsing methods (the “stack based” methods)
(23) LC parsing uses a rule after establishing its leftmost element. We can represent how much of the right side is established before the rule is used in the following way:
s :- np ][ vp
LL parsing uses a rule predictively, without establishing any of the right side:
s :- ][ np, vp
LR parsing uses a rule conservatively, after establishing all of the right side:
s :- np, vp ][
Let’s call the sequence on the right that triggers the use of the rule, the trigger. In a rule like this with 2 constituents on the right side, these 3 options are the only ones. This observation is made in Brosgol (1974), and in Demers (1977). (24) In general, it is clear that with a rule that has n elements on its right side, there are n + 1 options for the parser. Furthermore, the parser need not treat all rules the same way, so in a grammar like the following, the number of parsing options is the product of the number of ways to parse each rule. (25) As Demers (1977) points out, the collection of trigger functions F for any grammar can be naturally partially ordered by top-downness:
F1 ≤ F2 if and only if for every production p, the trigger F1(p) is at least as long as F2(p).
In other words, a setting of triggers F1 is as bottom-up as F2 if and only if for every production p, the triggering point defined by F1 is at least as far to the right as the triggering point defined by F2. It is easy to see that ⟨F, ≤⟩ is a lattice, as Demers claims, since for any collection F of trigger functions for any grammar, the least upper bound of F is just the function which maps each rule to the trigger which is the shortest of the triggers assigned by any function in F, and the greatest lower bound of F is the function which maps each rule to the trigger which is the longest assigned by any function in F. Furthermore, the lattice is finite.25 We call this lattice of recognition strategies the GLC lattice. The simple lattice structure for a 3-rule grammar can be depicted like this:
[GLC lattice for the three rules np :- n1, n1 :- ap n1, n1 :- n1 pp: the 2 · 3 · 3 = 18 possible trigger assignments, arranged in six levels (1, 3, 5, 5, 3, 1 nodes) ordered by how far to the right the triggers are placed. The top element is the pure top-down assignment (np :- ][ n1, n1 :- ][ ap n1, n1 :- ][ n1 pp); the bottom element is the pure bottom-up assignment (np :- n1 ][, n1 :- ap n1 ][, n1 :- n1 pp ][); the intermediate nodes mix trigger points rule by rule.]
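The node count is just arithmetic: each rule with n symbols on its right side contributes a factor of n + 1, so the three rules above (right sides of lengths 1, 2, 2) give 2 · 3 · 3 = 18 strategies. The following small Prolog sketch, which is not part of the original notes, computes this product for any grammar stated with plain A :˜ RHS clauses; it assumes such a grammar has already been consulted.

% Sketch: count the GLC trigger assignments available for a consulted grammar.
:- op(1200,xfx,:˜).

count_glc_strategies(N) :-
    findall(Len, ((_ :˜ RHS), length(RHS,Len)), Lens),
    product_plus_one(Lens,1,N).

% multiply together Len+1 for each right-hand-side length
product_plus_one([],N,N).
product_plus_one([Len|Lens],Acc0,N) :-
    Acc is Acc0 * (Len + 1),
    product_plus_one(Lens,Acc,N).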
(26) GLC recognition is defined this way:
G, Γ, S ⊢ G                                  [axiom]   for definite clauses Γ, goal G, S ∈ Σ∗

G, Γ, S ⊢ (?- ¬qi,...,¬q1,C)
---------------------------------            [glc]             if (p :- q1,...,qi ][ qi+1,...,qn) ∈ Γ
G, Γ, S ⊢ (?- qi+1,...,qn,¬p,C)

G, Γ, S ⊢ (?- ¬qi,...,¬q1,p,C)
---------------------------------            [glc-complete]    if (p :- q1,...,qi ][ qi+1,...,qn) ∈ Γ
G, Γ, S ⊢ (?- qi+1,...,qn,C)

G, Γ, wS ⊢ (?- C)
---------------------                        [shift]
G, Γ, S ⊢ (?- ¬w,C)

G, Γ, wS ⊢ (?- w,C)
---------------------                        [shift-complete] = scan
G, Γ, S ⊢ (?- C)

As in the LC parser, we allow the possibilities i = 0 and n = 0.
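For concreteness, here is a small worked derivation that is not in the original notes. Assume the toy rules dp :- titus ][, vp :- laughs ][, and ip :- dp, vp ][ (pure LR triggers), goal (?- ip), and input titus laughs:

G, Γ, titus laughs ⊢ (?- ip)                  [axiom]
G, Γ, laughs       ⊢ (?- ¬titus, ip)          [shift]
G, Γ, laughs       ⊢ (?- ¬dp, ip)             [glc], dp :- titus ][
G, Γ, ε            ⊢ (?- ¬laughs, ¬dp, ip)    [shift]
G, Γ, ε            ⊢ (?- ¬vp, ¬dp, ip)        [glc], vp :- laughs ][
G, Γ, ε            ⊢ (?- )                    [glc-complete], ip :- dp, vp ][

The empty goal with empty remaining input signals acceptance, just as the clause [] ?˜ [] does in the implementation below.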
(27) The implementation of the GLC recognizer:

/* * file: glc.pl */
:- op(1200,xfx,:˜).  % this is our object language "if"
:- op(1100,xfx,?˜).  % metalanguage provability predicate

[] ?˜ [].
(S0 ?˜ Goals0) :- infer(S0,Goals0,S,Goals), (S ?˜ Goals).

infer([W|S],[W|C],S,C).      % shift-complete=scan
infer([W|S],C,S,[-W|C]).     % shift
infer(S,RPC,S,DC) :-
   (A :˜ P+D), preverse(P,RPC,[A|C]), append(D,C,DC).      % reduce-complete
infer(S,RPC,S,DAC) :-
   (A :˜ P+D), preverse(P,RPC,C), append(D,[-A|C],DAC).    % reduce

% preverse(ExpansionD,ReversedExpansionD,RestConstituents)
preverse([E|L],RD,C) :- preverse(L,RD,[-E|C]).
preverse([],C,C).

append([],L,L).
append([E|L],M,[E|N]) :- append(L,M,N).

(28) We postpone an assessment of the GLC parsers until we have introduced some methods for controlling them, to make infinite searches less of a problem.
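In glc.pl a rule's trigger is whatever precedes the + and the remainder follows it, so the same context free rule can be registered with any of its trigger choices. A small illustration (these particular clauses are assumptions, not from the notes):

ip :˜ []      + [dp,i1].   % LL-style: use the rule predictively, before anything is built
ip :˜ [dp]    + [i1].      % LC-style: use the rule once its left corner dp is complete
ip :˜ [dp,i1] + [].        % LR-style: use the rule only after the whole right side is complete

Exactly one such clause would be chosen per rule; the grammar g5-mix.pl in (32) below mixes these choices across its rules.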
6.5 Oracles
The way to see the future is to understand what is possible, and what must follow from the current situation. Knowing the future, one can act accordingly.
6.5.1 Top-down oracles: stack consistency
(29) Some stacks cannot possibly be reduced to empty, no matter what input string is provided. In particular: There is no point in shifting a word if it cannot be part of the trigger of the most recently predicted category. And there is no point in building a constituent (i.e. using a rule) if the parent category cannot be part of the trigger of the most recently predicted category.
(30) These conditions can be enforced by calculating, for each category C that could possibly be predicted, all of the stack sequences which could possibly be part of a trigger for C. In top-down parsing, the triggers are always empty. In left-corner parsing, the possible trigger sequences are always exactly one completed category. In bottom-up and some other parsing methods, the sequences can sometimes be arbitrarily long, but in some cases they are finitely bounded and can easily be calculated in advance. A test that can check these precalculated possibilities is called an oracle.
(31) Given a context free grammar G = ⟨Σ, N, →, S⟩, we can generate all instances of the "is a beginning of" relation R with the following logic:
q1,...,qi R p                       [axiom]      if p :- q1,...,qi ][ qi+1,...,qn

q1,...,qi R p
--------------------                [unscan]     if qi ∈ Σ
q1,...,qi−1 R p

q1,...,qi R p
----------------------------        [unreduce]   if qi :- r1,...,rj ][ rj+1,...,rn
q1,...,qi−1,r1,...,rj R p
(32) Example: Consider the following grammar, which shows one way to separate the trigger from the rest of the right side of a rule:

/* * file: g5-mix.pl */
:- op(1200,xfx,:˜).

ip :˜ [dp,i1]+[].
i1 :˜ []+[i0,vp].
i0 :˜ []+[will].
dp :˜ [d1]+[].
d1 :˜ [d0]+[np].
d0 :˜ [the]+[].
np :˜ [n1]+[].
n1 :˜ [n0]+[].
n0 :˜ [idea]+[].
n1 :˜ [n1]+[pp].
vp :˜ [v1]+[].
v1 :˜ [v0]+[].
v0 :˜ [suffice]+[].
pp :˜ [p1]+[].
p1 :˜ [p0]+[dp].
p0 :˜ [about]+[].

For this grammar, the following proof shows that the beginnings of ip include [dp,i1], [dp], [d1], [d0], [the], []:

[dp,i1] R ip     [axiom]
[dp] R ip        [unreduce]
[d1] R ip        [unreduce]
[d0] R ip        [unreduce]
[the] R ip       [unreduce]
[] R ip          [unscan]
Notice that the beginnings of ip do not include [the,idea], [the,i1], [d0,i0], [i0], [i1].
(33) GLC parsing with an oracle is defined so that whenever a completed category is placed on the stack, the resulting sequence of completed categories on the stack must be a beginning of the most recently predicted category. Let’s say that a sequence C is reducible iff C is the concatenation of two sequences C = C1C2 where
a. C1 is a sequence ¬pi,...,¬p1 of i ≥ 0 completed (i.e. negated) elements,
b. C2 begins with a predicted (i.e. non-negated) element p, and
c. p1,...,pi is a beginning of p.
(34) GLC parsing with an oracle:
G, Γ, S ⊢ (?- ¬qi,...,¬q1,C)
---------------------------------   [glc]             if (p :- q1,...,qi ][ qi+1,...,qn) ∈ Γ
G, Γ, S ⊢ (?- qi+1,...,qn,¬p,C)                        and ¬p,C is reducible

G, Γ, S ⊢ (?- ¬qi,...,¬q1,p,C)
---------------------------------   [glc-complete]    if (p :- q1,...,qi ][ qi+1,...,qn) ∈ Γ
G, Γ, S ⊢ (?- qi+1,...,qn,C)

G, Γ, wS ⊢ (?- C)
---------------------               [shift]           if ¬w,C is reducible
G, Γ, S ⊢ (?- ¬w,C)

G, Γ, wS ⊢ (?- w,C)
---------------------               [shift-complete] = scan
G, Γ, S ⊢ (?- C)
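As an illustration (not in the original notes), take the grammar g5-mix.pl of (32) with ip as the most recently predicted category:

¬the, ip     is reducible, since [the] is a beginning of ip, so the word 'the' may be shifted;
¬will, ip    is not reducible, since [will] is not a beginning of ip, so shifting 'will' is blocked.

This is why the session in (41) below can reject [will,the,idea,suffice] immediately, before any structure is built.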
(35) This scheme subsumes almost everything covered up to this point: Prolog is an instance of this scheme in which every trigger is empty and the sequence of available “resources” is empty; LL, LR and LC are obtained by setting the triggers at the left edge, right edge, and one symbol in, on the right side of each rule. (36) To implement GLC parsing with this oracle, we precalculate the beginnings of every category. In effect, we want to find every theorem of the logic given above. Notice that this kind of logic can allow infinitely many derivations.
(37) Example. Consider again g5-mix.pl given in (32) above. There are infinitely many derivations of this trivial result:
[n1] R n1
[n1] R n1     [unreduce]
[n1] R n1     [unreduce]
   ...        [unreduce]
Nevertheless, the set of theorems, the set of pairs in the “is a beginning of” relation for the grammar g5-mix.pl with the trigger relations indicated there, is finite. We can compute the whole set by taking the closure of the axioms under the inference relation. (38) Another wrinkle for the implementation: we store our beginnings in reversed, negated form, to make it maximally easy to apply them in GLC reasoning.
¬qi,...,¬q1 nrR p                        [axiom-r]      if p :- q1,...,qi ][ qi+1,...,qn

¬qi,...,¬q1 nrR p
-------------------------                [unscan-r]     if qi ∈ Σ
¬qi−1,...,¬q1 nrR p

¬qi,...,¬q1 nrR p
----------------------------------       [unreduce-r]   if qi :- r1,...,rj ][ rj+1,...,rn
¬rj,...,¬r1,¬qi−1,...,¬q1 nrR p
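For example, with the grammar of (32), the beginning dp, i1 of ip is stored in this reversed, negated form, with an open tail so that it can be matched directly against the top of the goal stack:

nrR([-(i1),-(dp),ip|_]).

This is exactly the form that appears in the listing in (41) below.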
(39) We will use code from Shieber, Schabes, and Pereira (1993) to compute the closure of these axioms under the inference rules.26 The following two files do what we want. (We have one version specialized for sicstus prolog, and one for swiprolog.)

/* oracle-sics.pl
 * E Stabler, Feb 2000
 */
:- op(1200,xfx,:˜).            % this is our object language "if"
:- ['closure-sics'].           % defines closure/2, uses inference/4
:- use_module(library(lists),[append/3,member/2]).
%verbose.                      % comment to reduce verbosity of chart construction

computeOracle :-
   abolish(nrR),
   setof(Axiom,axiomr(Axiom),Axioms),
   closure(Axioms, Chart),
   asserts(Chart).

axiomr(nrR(NRBA)) :- (A :˜ B+_), negreverse(B,[A|_],NRBA).

negreverse([],M,M).
negreverse([E|L],M,N) :- negreverse(L,[-E|M],N).

inference(unreduce-r/2,
   [ nrR([-Qi|Qs]) ],
   nrR(NewSeq),
   [ (Qi :˜ Prefix+_),
     negreverse(Prefix,[],NRPrefix),
     append(NRPrefix,Qs,NewSeq) ]).

inference(unscan-r/2,
   [ nrR([-Qi|Qs]) ],
   nrR(Qs),
   [ \+ (Qi :˜ _) ]).

asserts([]).
asserts([nrR([-C|Cs])|L]) :- !, assert(nrR([-C|Cs])), asserts(L).
asserts([_|L]) :- asserts(L).
The second file provides the interface to code from Shieber, Schabes, and Pereira (1993):

/* closure-sics.pl
 * E. Stabler, 16 Oct 99
 * interface to the chart mechanism of Shieber, Schabes & Pereira (1993)
 */
:- ['shieberetal93-sics/chart.pl'].
:- ['shieberetal93-sics/agenda.pl'].
:- ['shieberetal93-sics/items.pl'].
:- ['shieberetal93-sics/monitor.pl'].
:- ['shieberetal93-sics/driver'].
:- ['shieberetal93-sics/utilities'].

closure(InitialSet,Closure) :-
   init_chart,
   empty_agenda(Empty),
   add_items_to_agenda(InitialSet, Empty, Agenda),
   exhaust(Agenda),
   setof(Member, Index^stored(Index,Member), Closure).

% item_to_key/2 should be specialized for the relations being computed
item_to_key(F, Hash) :- hash_term(F, Hash).

(40) The GLC parser that checks the oracle is then:

/* file: glco.pl
 * E Stabler, Feb 2000
 */
:- op(1200,xfx,:˜).  % this is our object language "if"
:- op(1100,xfx,?˜).  % metalanguage provability predicate

[] ?˜ [].
(S0 ?˜ Goals0) :- infer(S0,Goals0,S,Goals), (S ?˜ Goals).

infer([W|S],[W|C],S,C).                      % shift-complete=scan
infer([W|S],C,S,[-W|C]) :- nrR([-W|C]).      % shift
infer(S,RPC,S,DC) :-
   (A :˜ P+D), preverse(P,RPC,[A|C]), append(D,C,DC).                 % reduce-complete
infer(S,RPC,S,DAC) :-
   (A :˜ P+D), preverse(P,RPC,C), nrR([-A|C]), append(D,[-A|C],DAC).  % reduce

% preverse(ExpansionD,ReversedExpansionD,RestConstituents)
preverse([E|L],RD,C) :- preverse(L,RD,[-E|C]).
preverse([],C,C).

append([],L,L).
append([E|L],M,[E|N]) :- append(L,M,N).
(41) With these files, we get the following session:

| ?- ['g5-mix','oracle-sics',glco].
yes
| ?- computeOracle.
Warning: abolish(user:nrR) - no matching predicate
...... :::.:.:.:.:.:.:..:..:.:.:.:.:.:.::.:.:.::.:.:.:.:::.:.::.:.::.::.::.::.::::.:
yes
| ?- listing(nrR).
nrR([-(about),p0|_]).
nrR([-(about),p1|_]).
nrR([-(about),pp|_]).
nrR([-(d0),d1|_]).
nrR([-(d0),dp|_]).
nrR([-(d0),ip|_]).
nrR([-(d1),dp|_]).
nrR([-(d1),ip|_]).
nrR([-(dp),ip|_]).
nrR([-(i1),-(dp),ip|_]).
nrR([-(idea),n0|_]).
nrR([-(idea),n1|_]).
nrR([-(idea),np|_]).
nrR([-(n0),n1|_]).
nrR([-(n0),np|_]).
nrR([-(n1),n1|_]).
nrR([-(n1),np|_]).
nrR([-(p0),p1|_]).
nrR([-(p0),pp|_]).
nrR([-(p1),pp|_]).
nrR([-(suffice),v0|_]).
nrR([-(suffice),v1|_]).
nrR([-(suffice),vp|_]).
nrR([-(the),d0|_]).
nrR([-(the),d1|_]).
nrR([-(the),dp|_]).
nrR([-(the),ip|_]).
nrR([-(v0),v1|_]).
nrR([-(v0),vp|_]).
nrR([-(v1),vp|_]).
yes
| ?- [the,idea,will,suffice] ?˜ [ip].
yes
| ?- [the,idea,about,the,idea,will,suffice] ?˜ [ip].
yes
| ?- [will,the,idea,suffice] ?˜ [ip].
no
| ?- [the,idea] ?˜ [C].
C = d1 ? ;
C = dp ? ;
no

(42) With this oracle, we can repair g3.pl to allow the left recursive rule for coordination, and an empty i0. The resulting file g3-lc.pl lets us recognize, for example, the ip: some penguins and most songs praise titus. As observed earlier, we were never able to do this before.27
(43) The calculation of the oracle can still fail to terminate. To ensure termination, we need to restrict both the length of the trigger sequences, and also the complexity of the arguments (if any) associated with the categories in the sequence.
6.5.2 Bottom-up oracles: lookahead
(44) In top-down parsing, there is no point in using an expansion p :- q, r if the next symbol to be parsed could not possibly be the beginning of a q. To guide top-down steps, it would be useful to know what symbol (or what k symbols) are next, waiting to be parsed. This "bottom-up" information can be provided with a "lookahead oracle." Obviously, the "lookahead" oracle does not look into the future to hear what has not been spoken yet. Rather, structure building waits for a word (or in general, k words) to be heard. Again, we will precompute, for each category p, what the first k symbols of the string could be when we are recognizing that category in a successful derivation of any sentence.
(45) In calculating lookahead, we ignore the triggers. One kind of situation that we must allow for is this. If p :- q1,...,qn and q1,...,qi ⇒∗ ε, then every next symbol for qi+1 is a next symbol for p.
(46) For any S ∈ Σ∗, let firstk(S) be the first k symbols of S if |S| ≥ k, and otherwise firstk(S) = S. We can use the following reasoning to calculate all of the next k words that can be waiting to be parsed as each category symbol is expanded. For some k > 0:
w LA w     [axiom]     if w ∈ Σ
if p:-q1,...,qn xLAp [axiom] k n and either x = q1,...,qk ∈ Σ for k ≤ n, or x = q1,...,qn ∈ Σ for n x1LAq1 ... xnLAqn [la] if p:-q ,...,q , and x = first (x ...x ) xLAp 1 n k 1 n And we let firstk(x1 ...xn)LA(q1,...,qn) if xiLAqi for 1 ≤ i ≤ n. (47) GLC parsing with two oracles: Γ ¬ ¬ G, ,S (?- qi,..., q1,C) if (p:-q ,...,q ][ q + ,...q ) ∈ Γ [glc] 1 i i 1 n ¬ + G, Γ ,S (?-qi+1,...,qn, ¬p, C) p, C is reducible, and firstk(S)LA(qi 1,...,qn) Γ ¬ ¬ G, ,S (?- qi,..., q1,p,C) if (p:-q ,...,q ][ q + ,...q ) ∈ Γ [glc-complete] 1 i i 1 n + G, Γ ,S (?-qi+1,...,qn,C) and firstk(S)LA(qi 1,...,qn) G, Γ ,wS (?-C) [shift] if ¬w,C is reducible G, Γ ,S (?-¬w,C) G, Γ ,wS (?-w,C) [shift-complete] =scan G, Γ ,S (?-C) 94 Stabler - Lx 185/209 2003 (48) We can compute the bottom-up k = 1 words of lookahead offline: | ?- [’la-sics’]. yes | ?- [’g1-lc’]. yes | ?- computela. Warning: abolish(user:la) - no matching predicate ...... :.:.:.:.::.::.::.:.:.:.:.:.::.:.:.:. yes | ?- listing(la). la([idea], idea). la([idea], n0). la([idea], n1). la([idea], np). la([suffice], suffice). la([suffice], v0). la([suffice], v1). la([suffice], vp). la([that], c0). la([that], c1). la([that], cp). la([that], that). la([the], d0). la([the], d1). la([the], dp). la([the], ip). la([the], the). la([will], i0). la([will], i1). la([will], will). yes |?- (49) Adding bottom-up k-symbol lookahead to the glco recognizers with a bottom-up oracle, we have glcola(k) recognizers. For any of the GLC parsers, a language is said to be glcola(k) iff there is at most one step that can be takenateverypointineveryproof. Obviously, when a language has genuine structural ambiguity – more than one successful parse for some strings in the language – the language cannot be glcola(k) for any k (e.g. LL(k), LC(k), LR(k),…) In the case of an ambiguous language, though, we can consider whether the recognition of unambiguous strings is deterministic, or whether the indeterminism that is encountered is all due to global ambigui- ties. We return to these questions below. 95 Stabler - Lx 185/209 2003 6.6 Assessment of the GLC (“stack based”) parsers 6.6.1 Termination (50) We have not found any recognition method that is guaranteed to terminate (i.e. has a finite search space) on any input, even when the grammar has left recursion and empty categories. In fact, it is obvious that we do not want to do this, since a context free grammar can have infinitely ambiguous strings. 6.6.2 Coverage (51) The GLC recognition methods are designed for CFGs. Human languages have structures that are only very inelegantly handled by CFGs, and structures that seem beyond the power of CFGs, as we mentioned earlier (Savitch et al., 1987). 6.6.3 Ambiguity (local and global) vs. glcola(k) parsing (52) Ambiguity is good. If you know which Clinton I am talking about, then I do not need to say “William Jefferson Clinton.” Doing so violates normal conventions about being brief and to-the-point in conversation (Grice, 1975), and consequently calls for some special explanation (e.g. pomposity, or a desire to signal formality). A full name is needed when one Clinton needs to be distinguished from another. For most of us non-Clintons, in most contexts, using just “Clinton” is enough, even though the name is semantically ambiguous. (53) For the same reason, it is no surprise that standard Prolog uses the list constructor “.” both as a function to build lists and as a predicate whose “proof” triggers loading a file. 
Some dialects of Prolog also use “/” in some contexts to separate a predicate name from its arity, and in other contexts for division. This kind of multiple use of an expression is harmless in context, and allows us to use shorter expressions. (54) There is a price to pay in parsing, since structural ambiguities must be resolved. Some of these ambigu- ities are resolved definitively by the structure of the sentence; other ambiguities persist throughout a whole sentence and are resolved by discourse context. It is natural to assume that these various types of ambiguity are resolved by similar mechanisms in human language understanding, but of course this is an empirical question. (55) Global ambiguity (unresolved by local structure) How much structural ambiguity do sentences of human languages really have?28 We can get a first impression of how serious the structural ambiguity problem is by looking at simple artificial grammars for these constructions. a. PP attachment in [VP V D N PP1 PP2 ...] Consider the grammar: VP → VNPPP∗ NP → DNPP∗ A grammar like this cannot be directly expressed in standard context free form. It defines a context free language, but it is equivalent to the following infinite grammar: np → dn vp→ vnp np → dnpp vp→ vnppp np → d n pp pp vp → vnppppp np → d n pp pp pp vp → vnppppppp np → d n pp pp pp pp vp → vnppppppppp np → d n pp pp pp pp pp vp → vnppppppppppp …… 28Classic discussions of this point appear in, Church and Patil (1982) and Langendoen, McDaniel, and Langsam (1989). 96 Stabler - Lx 185/209 2003 The number of structures defined by this grammar is an exponential function of the number of words. N= 123456789 10 #trees = 2 5 14 132 469 1430 4862 16796 1053686 … b. N compounds [N NN] n → nn This series same as for PPs, except PP1 is like a N3 compound: n 12345 6 7 8 9 10 #trees 112514421324291420 4862 c. Coordination [X XandX] NP → NP (and NP)∗ This is equivalent to the grammar: np → np and np np → np and np and np np → np and np and np and np np → np and np and np and np and np np → np and np and np and np and np and np np → np and np and np and np and np and np and np np → np and np and np and np and np and np and np and np np → np and np and np and np and np and np and np and np and np … n 1234 5 6 7 8 9 10 #trees 11311451979034279 20793 103049 97 Stabler - Lx 185/209 2003 (56) Structurally resolved (local) ambiguity a. Agreement: In simple English clauses, the subject and verb agree, even though the subject and verb can be arbitrarily far apart: a. The deer {are, is} in the field b. The deer, the guide claims, {are, is} in the field b. Binding: The number of the embedded subject is unspecified in the following sentence: a. I expect the deer to admire the pond. But in similar sentences it can be specified by binding relations: b. I expect the deer to admire {itself,themselves} in the reflecions of the pond. c. Head movement: a. i. Have the students take the exam! ii. Have the students taken the exam? b. i. Is the block sitting in the box? ii. Is the block sitting in the box red? d. A movement: a. i. The chairi is ti too wobbly for the old lady to sit on it ii. The chairi is ti too wobbly for the old lady to sit on ti e. A’ movement: a. i. Whoi did you help ti to prove that John was unaware that he had been leaking secrets ii. Whoi did you help ti to prove that John was unaware ti that he had been leaking secrets to ti (57) English is not glcola(k) for any triggers and any k. 
Marcus (1980) pointed out that the ambiguity in, for example, (56c-a), is beyond the ability of an LR(k) parser, as is easy to see. Since LR is the bottom-up extreme of the GLC lattice, delaying structural decisions the longest, the fact that English and other human languages are not LR(k) means that they are not deterministic with k symbols of lookahead for any of the glcola(k) parsers. 98 Stabler - Lx 185/209 2003 (58) Marcus (1980) made two very interesting proposals about the structural ambiguity of sentence pre- fixes.29 a. First: (at least some) garden paths indicate failed local ambiguity resolution. Marcus proposes that humans have difficulty with certain local ambiguities (or fail completely), resulting in the fa- miliar “garden path” effects:30 The following sentences exhibit extreme difficulty, but other less extreme variations in difficulty may also evidence the greater or less backtracking involved: a. The horse raced past the barn fell b. Horses raced past barns fall c. The man who hunts ducks out on weekends d. Fat people eat accumulates e. The boat floated down the river sank f. The dealer sold the forgery complained g. Without her contributions would be impossible This initially very plausible idea has not been easy to defend. One kind of problem is that some constructions which should involve backtracking are relatively easy: see for example the discussions in Pritchett (1992) and Frazier and Clifton (1996). b. Second: to reduce backtracking to human level, delay decisions until next constituent is built. Suppose we agree that some garden paths should be taken as evidence of backtracking: we would like to explain why sentences like the ones we were considering earlier (repeated here) are not as difficult as the garden paths just mentioned: a. i. Have the students take the exam! ii. Have the students taken the exam? b. i. Is the block sitting in the box? ii. Is the block sitting in the box red? The reason that k symbols of lookahead will not resolve these ambiguities is that the disambiguating words are on the other side of a noun phrase, and noun phrases can be arbitrarily long. So Marcus proposes that when confronted with such situations, the human parser delays the decision until after the next phrase is constructed. In effect, this allows the parser to look some finite number of constituents ahead, instead of just a finite number of words ahead.31 This is an appealing idea which may deserve further consideration in the context of more recent proposals about human languages. 29These proposals were developed in various ways by Berwick and Weinberg (1984), Nozohoor-Farshi (1986), and van de Koot (1990). These basic proposals are critiqued quite carefully by Fodor (1985) and by Pritchett (1992). 30There are many studies of garden path effects in human language understanding. Some of the prominent early studies are the following: Bever (1970), Frazier (1978), Frazier and Rayner (1982), Ford, Bresnan, and Kaplan (1982), Crain and Steedman (1985), Pritchett (1992). 31Parsing strategies of this kind are sometimes called “non-canonical.” They were noticed by Knuth (1965), and developed further by Szymanski and Williams (1976). They are briefly discussed in Aho and Ullman (1972, §6.2). A formal study of Marcus’s linguistic proposals is carefully done by Nozohoor-Farshi (1986). 99 Stabler - Lx 185/209 2003 6.6.4 A dilemma (59) A dilemma for models of human language understanding: a. 
The fact that ordinary comprehension of spoken language is incremental suggests that the parsing method should be quite top-down (Steedman, 1989; Stabler, 1991; Shieber and Johnson, 1994). b. The fact that people can readily allow the relative clause in strings like the following to modify either noun, suggests that structure building cannot be top-down: The daughter of the colonel who was standing on the balcony At least, the choice about whether to build the high attachment point for the relative clause cannot be made before of is scanned. This problem is discussed by Frazier and Clifton (1996) and many others. (60) These difficulties merit giving some attention to alternatives that are outside of the range of glc parsers. We consider some quite different parsing strategies in the next section. 100 Stabler - Lx 185/209 2003 6.6.5 Exercise Consider this modified lr parser which counts the number of steps in the search for the first parse: /* * file: lrp-cnt.pl lr parser, with count of steps in search */ :− op(1200,xfx,:˜). %thisisourobjectlanguage“if” 5 :− op(1100,xfx,?˜). % metalanguage provability predicate :− op(500,yfx,@). % metalanguage functor to separate goals from trees % PARSER with countStep [] ?˜ []@[]. 10 (S0 ?˜ Goals0@T0):− infer(S0,Goals0@T0,S,Goals@T), countStep,(S ?˜ Goals@T). infer(S,RDC@RDCTs,S,C@CTs):− (A :˜ D), preverse(D,RDC,[A|C],DTs,RDCTs,[A/DTs|CTs]). % reduce-complete infer(S,RDC@RDCTs,S,[−A|C]@[A/DTs|CTs]) :− (A :˜ D), preverse(D,RDC,C,DTs,RDCTs,CTs). %reduce infer([W|S],C@CTs,S,[−W|C]@[W/[ ]|CTs]). %shift 15 preverse([ ],C,C,[ ],CTs,CTs). preverse([E|L],RD,C,[ETs|LTs],RDTs,CTs):− preverse(L,RD,[−E|C],LTs,RDTs,[ETs|CTs]). % COUNTER: count steps to find the first parse, with start category ’S’ 20 parseOnce(S0,T):− retractall(cnt), % remove any previous counter (S0 ?˜ [’S’]@[T]),!, % parse the string just once retract(cnt(X)), print(’Number of steps in search’:X), nl. 25 :− dynamic cnt/1. % declare a predicate whose definition we change countStep :− retract(cnt(X)), !, X1 is X+1, assert(cnt(X1)). countStep :− assert(cnt(1)). 30 %% ES GRAMMAR, with simple object topicalization ’S’ :˜ [’DP’,’VP’]. ’S/DP’ :˜ [’DP’,’VP/DP’]. ’S’ :˜ [’DP’,’S/DP’]. ’VP/DP’ :˜ [’Vt’]. 35 ’VP’ :˜ [’V’]. ’DP’ :˜ [’D’,’NP’]. ’NP’ :˜ [’N’]. ’VP’ :˜ [’Vt’,’DP’]. ’D’ :˜ [the]. ’NP’ :˜ [’AP’,’NP’]. ’V’ :˜ [laugh]. ’D’ :˜ [some]. ’NP’ :˜ [’NP’,’PP’]. ’V’ :˜ [cry]. ’D’ :˜ [no]. ’N’ :˜ [students]. 40 ’Vt’ :˜ [love]. ’N’ :˜ [homeworks]. ’Vt’ :˜ [hate]. ’N’ :˜ [trees]. ’N’ :˜ [lists]. ’AP’ :˜ [’A’]. ’DegP’ :˜ [’Deg’]. ’PP’ :˜ [’P’,’DP’]. 45 ’AP’ :˜ [’DegP’,’A’]. ’Deg’ :˜ [really]. ’P’ :˜ [about]. ’A’ :˜ [good]. ’Deg’ :˜ [kind,of]. ’P’ :˜ [with]. ’A’ :˜ [bad]. % EXAMPLES for ES GRAMMAR: 50 % parseOnce([the,students,laugh],T),wish tree(T). % parseOnce([the,students,hate,the,homeworks],T). % parseOnce([the,lists,the,students,hate],T). % parseOnce([the,bad,homeworks,the,students,love],T). % parseOnce([the,kind,of,good,students,hate,the,homeworks],T). 55 % WORST 6-WORD SENTENCE IS: ??? % EXAMPLES for YOUR GRAMMAR: % WORST 6-WORD SENTENCE IS: ??? 101 Stabler - Lx 185/209 2003 1. Download the file lrp-cnt.pl and modify the definition of parseOnce so that in addition to printing out the number of steps in the search, it also prints out the number of steps in the derivation (= the number of nodes in the tree). 2. Which 6 word sentence requires the longest search with this ES grammar? Put your example in at the end of the file. 3. 
Add 1 or 2 rules to the grammar (don’t change the parser) in order to produce an even longer search on a 6 word sentence – as long as you can make it (but not infinite = no empty categories). Put your example at the end of the file, and turn in the results of all these steps. 6.6.6 Additional Exercise (for those who read the shaded blocks) (61) Gibson (1998) proposes For initial purposes, a syntactic theory with a minimal number of syntactic categories, such as Head-driven Phrase Structure Grammar (Pollard and Sag, 1994) or Lexical Functional Grammar (Bresnan, 1982), will be assumed. [Note: The SPLT is also compatible with grammars assuming a range of functional categories such as Infl, Agr, Tense, etc. (e.g. Chomsky 1995) under the assumption that memory cost indexes predicted chains rather than predicted categories, where a chain is a set of categories coindexed through movement (Chomsky 1981).] Under these theories, the minimal number of syntactic head categories in a sentence is two: a head noun for the subject and a head verb for the predicate. If words are encountered that necessitate other syntactic heads to form a grammatical sentence, then these categories are also predicted, and an additional memory load is incurred. For example, at the point of processing the second occurrence of the word “the” in the object-extracted RC example, 1a. The reporter who the senator attacked admitted his error, S NP VP NP S’ V NP Det Noun Comp S admitted Det Noun the reporter thati NP VP the error Det Noun Verb NP the senator attacked ei there are four obligatory syntactic predictions: 1) a verb for the matrix clause, 2) a verb for the embedded clause, 3) a subject noun for the embedded clause, and an empty category NP for the wh-pronoun “who.” Is the proposed model a glc parser? If not, is the proposal a cogent one, one that conforms to the behavior of a parsing model that could possibly work? (62) References. Basic parsing methods are often introduced in texts about building compilers for program- ming languages, like Aho, Sethi, and Ullman (1985). More comprehensive treatments can be found in Aho and Ullman (1972), and in Sikkel (1997). 102 Stabler - Lx 185/209 2003 7 Context free parsing: dynamic programming methods In this section, we consider recognition methods that work for all context free grammars. These are “dynamic programming” methods in the general sense that they involve computing and recording the solutions to basic problems, from which the full solution can be computed. It is crucial that the number of basic problems be kept finite. As we will see, sometimes it is difficult or impossible to assure this, and then these dynamic methods will not work. Because records of subproblem solutions are kept and looked up as necessary, these methods are commonly called “tabular” or “chart-based.” We saw in §6.6.3 that the degree of ambiguity in English sentences can be very large. (We showed this with a theoretical argument, but we will see that, in practice, the situation is truly problematic). The parsing methods introduced here can avoid this problem, since instead of producing explicit representations of each parse, we will produce a chart representation of a grammar that generates exactly the input. If you have studied formal language theory, you might remember that the result of intersecting any context free language with a finite state language is still context free. 
The standard proof for this is constructive: it shows how, given any context free grammar and finite state grammar (or automaton), you can construct a context free grammar that generates exactly the language that is the intersection you want. This is the idea we use here, and it can be used not only for recognition but also for parsing, since the grammar represented by a chart will indicate the derivations from the original grammar (even when there are a large number, or infinitely many of them). 7.1 CKY recognition for CFGs (1) For simplicity, we first consider context free grammars G =Σ,N,→, S in Chomsky normal form: Chomsky normal form grammars have rules of only the following forms, for some A, B, C ∈ N, w ∈ Σ, A→BC A→w If is in the language, then the following form is also allowed, subject to the requirement that S does not occur on the right side of any rule: S→ (2) A Chomsky normal form grammar has no unit or empty productions, and hence no “cycles” A ⇒+ A, and no infinitely ambiguous strings. (3) These grammars allow an especially simple CKY-like tabular parsing method. To parse a string w1,...,wn, for n>0 we use the following logic: (i − 1,i): wi [axioms] (i, j) : w [reduce1] if A → w (i, j) : A (i, j) : B(j,k): C [reduce2] if A → BC (i, k) : A Recognition is successful iff (0,n): S is in the closure of the axioms under these inference rules. This recognition method is based on proposals of Cocke, Kasami and Younger (Aho and Ullman, 1972; Shieber, Schabes, and Pereira, 1993; Sikkel and Nijholt, 1997). As will become clear when we collect trees from the closure, the closure, in effect, represents all deriva- tions, but the representation is reasonably sized even when the number of derivations is infinite, because the number of possible items is finite. (4) The soundness and completeness of this method are shown in Shieber, Schabes, and Pereira (1993). Aho and Ullman (1972, §4.2.1) show in addition that for a sentence of length n, the maximum number of steps needed to compute the closure is proportional to n3. Aho and Ullman (1972) say of this recognition method: 103 Stabler - Lx 185/209 2003 It is essentially a “dynamic programming” method and it is included here because of its sim- plicity. It is doubtful, however, that it will find practical use, for three reasons: 1. n3 time is too much to allow for parsing. 2. The method uses an amount of space proportional to the square of the input length. 3. The method of the next section (Earley’s algorithm) does at least as well in all respects as this one, and for many grammars does better. 7.1.1 CKY example 1 ip dp i1 lambs i0 vp ip → dp i1 will v0 dp dp → lambs i1 → i0 vp i0 → will eat oats vp → v0 dp v0 → eat dp → oats The axioms can be regarded as specifying a finite state machine representation of the input: lambs will eat oats 0 1 2 3 4 Given an n state finite state machine representation of the input, computing the CKY closure can be regarded as filling in the “upper triangle” of an n × n matrix, from the (empty) diagonal up:32 0 1 2 3 4 0 (0,1):dp (0,1):lambs (0,4):ip 1 (1,2):i0 (1,2):will (1,4):i1 2 (2,3):v0 (2,3):eat (2,4):vp 3 (3,4):dp (3,4):oats 4 32CKY tables and other similar structures of intermediate results are frequently constructed by matrix operations. This idea has been important in complexity analysis and in attempts to find the fastest possible parsing methods (Valiant, 1975; Lee, 1997). Extensions of matrix methods to more expressive grammars are considered by Satta (1994) and others. 
104 Stabler - Lx 185/209 2003 7.1.2 CKY extended (5) We can relax the requirement that the grammar be in Chomsky normal form. For example, to allow arbitrary empty productions, and rules with right sides of length 3,4,5,6, we could add the following rules: [reduce0] if A → (i, i) : A (i, j) : B(j,k): C(k,l): D [reduce3] if A → BCD (i, l) : A (i, j) : B(j,k): C(k,l): D(l,m): E [reduce4] if A → BCDE (i, m) : A (i, j) : B(j,k): C(k,l): D(l,m): E(m,n): F [reduce5] if A → BCDEF (i, n) : A (i, j) : B(j,k): C(k,l): D(l,m): E(m,n): F(n,o): G [reduce6] if A → BCDEFG (i, o) : A (6) While this augmented parsing method is correct, it pays a price in efficiency. The Earley method of the next section can do better. (7) Using the strategy of computing closures (introduced in (39) on page 91), we can implement the CKY method with these extensions easily: /* ckySWI.pl * E Stabler, Feb 2000 * CKY parser, augmented with rules for 3,4,5,6-tuples */ :- op(1200,xfx,:˜). % this is our object language "if" :- [’closure-swi’]. % Shieber et al’s definition of closure/2, uses inference/4 %verbose. % comment to reduce verbosity of chart construction computeClosure(Input) :- computeClosure(Input,Chart), nl,portray_clauses(Chart). /* ES tricky way to get the axioms from reduce0: */ /* add them all right at the start */ /* (there are more straightforward ways to do it but they are slower) */ computeClosure(Input,Chart) :- lexAxioms(0,Input,Axioms0), findall((Pos,Pos):A,(A :˜ []),Empties), /* reduce0 is here! */ append(Axioms0,Empties,Axioms), closure(Axioms, Chart). lexAxioms(_Pos,[],[]). lexAxioms(Pos,[W|Ws],[(Pos,Pos1):W|Axioms]) :- Pos1 is Pos+1, lexAxioms(Pos1,Ws,Axioms). inference(reduce1, [ (Pos,Pos1):W ], (Pos,Pos1):A, [(A :˜ [W])] ). inference(reduce2, [ (Pos,Pos1):B, (Pos1,Pos2):C], (Pos,Pos2):A, [(A :˜ [B,C])] ). /* for efficiency, comment out the rules you never use */ inference(reduce3, [ (Pos,Pos1):B, (Pos1,Pos2):C, (Pos2,Pos3):D], (Pos,Pos3):A, [(A :˜ [B,C,D])] ). inference(reduce4, [ (Pos,Pos1):B, (Pos1,Pos2):C, (Pos2,Pos3):D, (Pos3,Pos4):E], (Pos,A,Pos4), [(A :˜ [B,C,D,E])] ). inference(reduce5, [ (Pos,Pos1):B, (Pos1,Pos2):C, (Pos2,Pos3):D, (Pos3,Pos4):E, (Pos4,Pos5):F], (Pos,Pos5):A, [(A :˜ [B,C,D,E,F])] ). 105 Stabler - Lx 185/209 2003 inference(reduce6, [ (Pos,Pos1):B, (Pos1,Pos2):C, (Pos2,Pos3):D, (Pos3,Pos4):E, (Pos4,Pos5):F, (Pos5,Pos6):G], (Pos,Pos6):A, [(A :˜ [B,C,D,E,F,G])] ). portray_clauses([]). portray_clauses([C|Cs]) :- portray_clause(C), portray_clauses(Cs). (8) With this code, we get sessions like the following. Notice that the first but not the second example produces the goal item (0,4):ip, which is correct. 1 ?- [ckySWI]. % chart compiled 0.00 sec, 1,672 bytes % agenda compiled 0.00 sec, 3,056 bytes % items compiled 0.00 sec, 904 bytes % monitor compiled 0.00 sec, 2,280 bytes % driver compiled 0.00 sec, 3,408 bytes % utilities compiled 0.00 sec, 1,052 bytes % closure-swi compiled 0.00 sec, 13,936 bytes % ckySWI compiled 0.00 sec, 18,476 bytes Yes 2 ?- [g1]. % g1 compiled 0.00 sec, 2,388 bytes Yes 3 ?- computeClosure([the,idea,will,suffice]). ’.’’.’’.’’.’:’.’:’.’:’.’:’.’::’.’::’.’:’.’:’.’:’.’:’.’:’.’::’.’: (0, 1):d0. (0, 1):the. (0, 2):d1. (0, 2):dp. (0, 4):ip. (1, 2):idea. (1, 2):n0. (1, 2):n1. (1, 2):np. (2, 3):i0. (2, 3):will. (2, 4):i1. (3, 4):suffice. (3, 4):v0. (3, 4):v1. (3, 4):vp. Yes 4 ?- computeClosure([will,the,idea,suffice]). ’.’’.’’.’’.’:’.’:’.’:’.’:’.’:::’.’:’.’:’.’:’.’:’.’::’.’: (0, 1):i0. (0, 1):will. (1, 2):d0. (1, 2):the. (1, 3):d1. (1, 3):dp. (2, 3):idea. (2, 3):n0. 
(2, 3):n1. (2, 3):np. (3, 4):suffice. (3, 4):v0. (3, 4):v1. (3, 4):vp. Yes 5?- Notice that unlike the first example the latter example is not accepted by the grammar, as we can see from the fact that there is no ip (or any other category) that spans from the beginning of the string to the end, from 0 to 4. 106 Stabler - Lx 185/209 2003 7.1.3 CKY example 2 Since we can now recognize the language of any context free grammar, we can take grammars written by anyone else and try them out. For example, we can take the grammar defined by the Penn Treebank and try to parse with it. For example, in the file wsj_0005.mrg we find the following 3 trees: S NP-SBJ-10 VP . NP , NP , VBD VP . NNP NNP , NP PP , was VBN S J.P. Bolduc NN NN IN NP elected NP-SBJ NP-PRD vice chairman of NP , SBAR -NONE- DT NN NNP NNP CC NNP , WHNP-10 S *-10 a director W.R. Grace & Co. WDT NP-SBJ VP which -NONE- VBZ NP *T*-10 holds NP PP-LOC DT ADJP NN IN NP a CD NN interest in DT JJ NN 83.4 % this energy-services company S NP-SBJ VP . PRP VBZ NP . He succeeds NP , NP , SBAR NNP NNP NNP , ADVP DT NNP NNP NN NN , WHNP-11 S Terrence D. Daniels RB a W.R. Grace vice chairman WP NP-SBJ VP formerly who -NONE- VBD *T*-11 resigned S NP-SBJ VP . NNP NNP VBZ NP . W.R. Grace holds NP PP CD IN NP three of NP CD NN NNS NNP NNP POS seven board seats Grace Energy ’s Notice that these trees indicate movement relations, with co-indexed traces. If we ignore the movement rela- tions and just treat the traces as empty, though, we have a CFG – one that will accept all the strings that are parsed in the treebank plus some others as well. We will study how to parse movements later, but for the moment, let’s collect the (overgenerating) context free rules from these trees. Dividing the lexical rules from the others, and showing how many times each rule is used, we have first: 1 (’SBAR’:˜[’WHNP-11’,’S’]). 107 Stabler - Lx 185/209 2003 1 (’SBAR’:˜[’WHNP-10’,’S’]). 2 (’S’:˜[’NP-SBJ’,’VP’]). 2 (’S’:˜[’NP-SBJ’,’VP’,’.’]). 1 (’S’:˜[’NP-SBJ-10’,’VP’,’.’]). 1 (’S’:˜[’NP-SBJ’,’NP-PRD’]). 3 (’VP’:˜[’VBZ’,’NP’]). 1 (’VP’:˜[’VBN’,’S’]). 1 (’VP’:˜[’VBD’]). 1 (’VP’:˜[’VBD’,’VP’]). 3 (’NP-SBJ’:˜[’-NONE-’]). 2 (’PP’:˜[’IN’,’NP’]). 2 (’NP’:˜[’NP’,’PP’]). 1 (’PP-LOC’:˜[’IN’,’NP’]). 1 (’NP-SBJ-10’:˜[’NP’,’,’,’NP’,’,’]). 1 (’NP-SBJ’:˜[’PRP’]). 1 (’NP-SBJ’:˜[’NNP’,’NNP’]). 1 (’NP-PRD’:˜[’DT’,’NN’]). 1 (’NP’:˜[’NP’,’PP-LOC’]). 1 (’NP’:˜[’NP’,’CD’,’NN’,’NNS’]). 1 (’NP’:˜[’NP’,’,’,’SBAR’]). 1 (’NP’:˜[’NP’,’,’,’NP’,’,’,’SBAR’]). 1 (’NP’:˜[’NNP’,’NNP’]). 1 (’NP’:˜[’NNP’,’NNP’,’POS’]). 1 (’NP’:˜[’NNP’,’NNP’,’NNP’]). 1 (’NP’:˜[’NNP’,’NNP’,’CC’,’NNP’]). 1 (’NP’:˜[’NN’,’NN’]). 1 (’NP’:˜[’DT’,’JJ’,’NN’]). 1 (’NP’:˜[’DT’,’ADJP’,’NN’]). 1 (’NP’:˜[’CD’]). 1 (’NP’:˜[’ADVP’,’DT’,’NNP’,’NNP’,’NN’,’NN’]). 1 (’ADVP’:˜[’RB’]). 1 (’ADJP’:˜[’CD’,’NN’]). 1 (’WHNP-11’:˜[’WP’]). 1 (’WHNP-10’:˜[’WDT’]). And then the lexical rules: 5 (’,’:˜[’,’]). 4 (’NNP’:˜[’Grace’]). 3 (’NNP’:˜[’W.R.’]). 3 (’DT’:˜[a]). 3 (’.’:˜[’.’]). 2 (’VBZ’:˜[holds]). 2 (’NN’:˜[vice]). 2 (’NN’:˜[chairman]). 2 (’IN’:˜[of]). 1 (’WP’:˜[who]). 1 (’WDT’:˜[which]). 1 (’VBZ’:˜[succeeds]). 1 (’VBN’:˜[elected]). 1 (’VBD’:˜[was]). 1 (’VBD’:˜[resigned]). 1 (’RB’:˜[formerly]). 1 (’PRP’:˜[’He’]). 1 (’POS’:˜[’\’s’]). 1 (’NNS’:˜[seats]). 1 (’NNP’:˜[’Terrence’]). 1 (’NNP’:˜[’J.P.’]). 1 (’NNP’:˜[’Energy’]). 1 (’NNP’:˜[’Daniels’]). 1 (’NNP’:˜[’D.’]). 1 (’NNP’:˜[’Co.’]). 1 (’NNP’:˜[’Bolduc’]). 1 (’NN’:˜[interest]). 1 (’NN’:˜[director]). 1 (’NN’:˜[company]). 1 (’NN’:˜[board]). 1 (’NN’:˜[’%’]). 1 (’JJ’:˜[’energy-services’]). 1 (’IN’:˜[in]). 
1 (’DT’:˜[this]). 1 (’CD’:˜[three]). 1 (’CD’:˜[seven]). 1 (’CD’:˜[83.4]). 1 (’CC’:˜[&]). 1 (’-NONE-’:˜[’*T*-11’]). 1 (’-NONE-’:˜[’*T*-10’]). 1 (’-NONE-’:˜[’*-10’]). 108 Stabler - Lx 185/209 2003 Exercises: To begin, download our implementation of the CKY recognizer from the web page. (This implementation has several files, so download them all into the same directory, and run your prolog session in that directory.) 1. a. Modify g1.pl so that it generates exactly the following tree: cp cp cp c1 emph c1 c0 ip so c0 ip dp i1 dp i1 deg d1 i0 vp Mog i0 vp almost d0 np will v1 v1 every np pp v0 v0 pp dp n1 p1 v0 coord v0 whips p1 dp np n0 p0 dp glitter and sparkle p0 her ap np person at d1 out a1 n1 d0 np a0 n0 the n1 silvered sunglasses n0 premiere (Notice that this grammar has left recursion, right recursion, and empty productions.) b. Use your modified grammar with the ckySWI recognizer to recognize the string as a cp: so every silvered premiere will sparkle Turn in (a) the modified grammar and (b) a session log showing the successful run of the ckySWI parser with that sentence. Extra Credit: That last exercise was a little bit tedious, but we can automate it! a. Download the file wsj_0009.pl, which has some parse trees for sentences from the Wall Street Journal. b. Write a program that will go through the derivation trees in this file and write out every rule that would be used in those derivations, in our prolog grammar rule format. c. Take the grammar generated by the previous step, and use ckySWI to check that you accept the strings in the file, and show at least one other sentence this grammar accepts. 109 Stabler - Lx 185/209 2003 7.2 Tree collection 7.2.1 Collecting trees: first idea (9) The most obvious way to collect trees is this: /* ckyCollect.pl * E Stabler, Feb 2000 * collect a tree from a CKY parser chart */ :- op(1200,xfx,:˜). % this is our object language "if" :- use_module(library(lists),[member/2]). ckyCollect(Chart,N,S,S/STs) :- collectTree(Chart,0,S,N,S/STs). collectTree(Chart,I,A,J,A/Subtrees) :- member((I,A,J),Chart), (A :˜ L), collectTrees(L,Chart,I,J,Subtrees). collectTree(Chart,I,A,J,A/[]) :- member((I,A,J),Chart), $\backslash$+((A :˜ _)). collectTrees([],_,I,I,[]). collectTrees([A|As],Chart,I,J,[A/ATs|Ts]) :- collectTree(Chart,I,A,K,A/ATs), collectTrees(As,Chart,K,J,Ts). (10) With this file, we can have sessions like this: Process prolog finished SICStus 3.8.1 (x86-linux-glibc2.1): Sun Feb 20 14:49:19 PST 2000 Licensed to humnet.ucla.edu | ?- [ckySics]. {consulting /home/es/tex/185-00/ckySics.pl...} {consulted /home/es/tex/185-00/ckySics.pl in module user, 260 msec 46700 bytes} yes | ?- [ckyCollect]. {consulting /home/es/tex/185-00/ckyCollect.pl...} {consulted /home/es/tex/185-00/ckyCollect.pl in module user, 20 msec 24 bytes} yes | ?- [g1]. {consulting /home/es/tex/185-00/g1.pl...} {consulted /home/es/tex/185-00/g1.pl in module user, 20 msec 2816 bytes} yes | ?- [pp_tree]. {consulting /home/es/tex/185-00/pp_tree.pl...} {consulted /home/es/tex/185-00/pp_tree.pl in module user, 20 msec 1344 bytes} yes | ?- computeClosure([the,idea,will,suffice],Chart),nl,ckyCollect(Chart,4,ip,T),pp_tree(T). ....:.:.:.:.::.::.:.:.:.:.:.::.: ip /[ dp /[ d1 /[ d0 /[ the /[]], np /[ n1 /[ n0 /[ idea /[]]]]]], i1 /[ i0 /[ will /[]], vp /[ v1 /[ v0 /[ suffice /[]]]]]] T = ip/[dp/[d1/[d0/[the/[]],np/[n1/[...]]]],i1/[i0/[will/[]],vp/[v1/[v0/[...]]]]], Chart = [(0,d0,1),(0,d1,2),(0,dp,2),(0,ip,4),(0,the,1),(1,idea,2),(1,n0,2),(1,n1,2),(1,...,...),(...,...)|...] ? 
; no |?- (11) This works, but it makes tree collection almost as hard as parsing! When we are collecting the constituents, this collection strategy will sometimes follow blind alleys. 110 Stabler - Lx 185/209 2003 7.2.2 Collecting trees: a better perspective (12) We can make tree collection easier by putting additional information into the chart, so that the chart can be regarded as a “packed forest” of subtrees from which any successful derivations can easily be extracted. We call the forest “packed” because a single item in the chart can participate in many (even infinitely many) trees. One version of this idea was proposed by Tomita (1985), and Billot and Lang (1989) noticed the basic idea mentioned in the introduction: that what we want is a certain way of computing the intersection between a regular language (represented by a finite state machine) and a context free language (represented by acontextfreegrammar). (13) We can implement this idea as follows: in each item, we indicate which rule was used to create it, and we also indicate the “internal” positions: /* ckypSWI.pl * E Stabler, Feb 2000 * CKY parser, augmented with rules for 0,3,4,5,6-tuples */ :- op(1200,xfx,:˜). % this is our object language "if" :- [’closure-swi’]. % defines closure/2, uses inference/4 %verbose. % comment to reduce verbosity of chart construction computeClosure(Input) :- lexAxioms(0,Input,Axioms), closure(Axioms, Chart), nl, portray_clauses(Chart). computeClosure(Input,Chart) :- lexAxioms(0,Input,Axioms), closure(Axioms, Chart). lexAxioms(_Pos,[],L) :- bagof0(((X,X):(A:˜[])),(A :˜ []),L). lexAxioms(Pos,[W|Ws],[((Pos,Pos1):(W:˜[]))|Axioms]) :- Pos1 is Pos+1, lexAxioms(Pos1,Ws,Axioms). inference(reduce1, [ (Pos,Pos1):(W:˜_) ], ((Pos,Pos1):(A:˜[W])), [(A :˜ [W])] ). inference(reduce2, [ ((Pos,Pos1):(B:˜_)), ((Pos1,Pos2):(C:˜_))], ((Pos,Pos2):(A:˜[B,Pos1,C])), [(A :˜ [B,C])] ). inference(reduce3, [ ((Pos,Pos1):(B:˜_)), ((Pos1,Pos2):(C:˜_)), ((Pos2,Pos3):(D:˜_))], (Pos,(A:˜[B,Pos1,C,Pos2,D]),Pos3), [(A :˜ [B,C,D])] ). inference(reduce4, [ ((Pos,Pos1):(B:˜_)), ((Pos1,Pos2):(C:˜_)), ((Pos2,Pos3):(D:˜_)), ((Pos3,Pos4):(E:˜_))], ((Pos,Pos4):(A:˜[B,Pos1,C,Pos2,D,Pos3,E])), [(A :˜ [B,C,D,E])] ). inference(reduce5, [ ((Pos,Pos1):(B:˜_)), ((Pos1,Pos2):(C:˜_)), ((Pos2,Pos3):(D:˜_)), ((Pos3,Pos4):(E:˜_)), ((Pos4,Pos5):(F:˜_))], ((Pos,Pos5):(A:˜[B,Pos1,C,Pos2,D,Pos3,E,Pos4,F])), [(A :˜ [B,C,D,E,F])] ). inference(reduce6, [ ((Pos,Pos1):(B:˜_)), ((Pos1,Pos2):(C:˜_)), ((Pos2,Pos3):(D:˜_)), ((Pos3,Pos4):(E:˜_)), ((Pos4,Pos5):(F:˜_)), ((Pos5,Pos6):(F:˜_))], ((Pos,Pos6):(A:˜[B,Pos1,C,Pos2,D,Pos3,E,Pos4,F,Pos5,G])), [(A :˜ [B,C,D,E,F,G])] ). portray_clauses([]). portray_clauses([C|Cs]) :- portray_clause(C), portray_clauses(Cs). bagof0(A,B,C) :- bagof(A,B,C), !. bagof0(_,_,[]). 111 Stabler - Lx 185/209 2003 (14) With this parsing strategy, we can avoid all blind alleys in collecting a tree. /* ckypCollect.pl * E Stabler, Feb 2000 * collect a tree from a CKY parser chart */ :- op(1200,xfx,:˜). % this is our object language "if" ckypCollect(Chart,N,S,S/STs) :- collectTree(Chart,0,S,N,S/STs). collectTree(Chart,I,A,J,A/ATs) :- member(((I,J):(A:˜L)),Chart), collectTrees(L,Chart,I,J,ATs). collectTree(Chart,I,W,J,W/[]) :- member(((I,J):(W:˜[])),Chart). collectTrees([],_,I,I,[]). collectTrees([A],Chart,I,J,[A/ATs]) :- collectTree(Chart,I,A,J,A/ATs). collectTrees([A,K|As],Chart,I,J,[A/ATs|Ts]) :- collectTree(Chart,I,A,K,A/ATs), collectTrees(As,Chart,K,J,Ts). (15) We have sessions like this: 7 ?- [g1,ckypSWI,pp_tree]. 
% g1 compiled 0.00 sec, 672 bytes % chart compiled 0.00 sec, 0 bytes % agenda compiled 0.00 sec, 0 bytes % items compiled 0.00 sec, 0 bytes % monitor compiled 0.01 sec, 0 bytes % driver compiled 0.00 sec, 0 bytes % utilities compiled 0.00 sec, 0 bytes % closure-swi compiled 0.01 sec, 0 bytes % ckypSWI compiled 0.01 sec, 0 bytes % pp_tree compiled 0.00 sec, 1,692 bytes Yes 8 ?- computeClosure([the,idea,will,suffice]). ’.’’.’’.’’.’:’.’:’.’:’.’:’.’::’.’::’.’:’.’:’.’:’.’:’.’:’.’::’.’: (0, 1): (d0:˜[the]). (0, 1): (the:˜[]). (0, 2): (d1:˜[d0, 1, np]). (0, 2): (dp:˜[d1]). (0, 4): (ip:˜[dp, 2, i1]). (1, 2): (idea:˜[]). (1, 2): (n0:˜[idea]). (1, 2): (n1:˜[n0]). (1, 2): (np:˜[n1]). (2, 3): (i0:˜[will]). (2, 3): (will:˜[]). (2, 4): (i1:˜[i0, 3, vp]). (3, 4): (suffice:˜[]). (3, 4): (v0:˜[suffice]). (3, 4): (v1:˜[v0]). (3, 4): (vp:˜[v1]). Yes 9 ?- [ckypCollect]. % ckypCollect compiled 0.00 sec, 1,764 bytes Yes 10 ?- computeClosure([the,idea,will,suffice],Chart),nl,ckypCollect(Chart,4,ip,T),pp_tree(T). ’.’’.’’.’’.’:’.’:’.’:’.’:’.’::’.’::’.’:’.’:’.’:’.’:’.’:’.’::’.’: ip /[ dp /[ d1 /[ d0 /[ the /[]], np /[ n1 /[ n0 /[ idea /[]]]]]], i1 /[ i0 /[ will /[]], vp /[ v1 /[ v0 /[ suffice /[]]]]]] Chart = [ (0, 1): (d0:˜[the]), (0, 1): (the:˜[]), (0, 2): (d1:˜[d0, 1, np]), (0, 2): (dp:˜[d1]), (0, 4): (ip:˜[dp, 2|...]), (1, 2): (idea:˜[]), (1, 2): (n0:˜[...]), (... T = ip/[dp/[d1/[d0/[the/[]], np/[... /...]]], i1/[i0/[will/[]], vp/[v1/[...]]]] Yes 11 ?- Exercise: What grammar is the intersection of g1.pl and the finite state language with just one string, {the idea will suffice}. 112 Stabler - Lx 185/209 2003 7.3 Earley recognition for CFGs (16) Earley (1968) showed, in effect, how to build an oracle into a chart construction algorithm for any grammar G =Σ,N,→, s. With this strategy, the algorithm has the “prefix property,” which means that, processing a string from left to right, an ungrammatical prefix (i.e. a sequence of words that is not a prefix of any grammatical string) will be recognized at the the earliest possible point. ∗ n For A, B, C ∈ N and some designated s ∈ N,forS,T,U,V ∈ (N ∪ Σ) , and for input w1 ...wn ∈ Σ , (0, 0) : s → [] • s [axiom] (i, j) : A → S • wj+1T [scan] (i, j + 1) : A → Swj+1 • T (i, j) : A → S • BT [predict] if B:-U and (U = ∨ U = CV ∨ (U = w + V) (j, j) : B →•U j 1 (i, k) : A → S • BT (k,j) : B → U• [complete] (i, j) : A → SB • T The input is recognized iff (0,n): S → S• is in the closure of the axioms (in this case, the set of axioms has just one element) under these inference rules. Also note that in order to apply the scan rule, we need to be able to tell which word is in the j + 1’th position. 113 Stabler - Lx 185/209 2003 /* earley.pl * E Stabler, Feb 2000 * Earley parser, adapted from Shieber et al. * NB: the grammar must specify: startCategory(XXX). */ :- op(1200,xfx,:˜). % this is our object language "if" :- [’closure-sics’]. % Shieber et al’s definition of closure/2, uses inference/4 %verbose. % comment to reduce verbosity of chart construction computeClosure(Input) :- retractall(word(_,_)), % get rid of words from previous parse lexAxioms(0,Input,Axioms), closure(Axioms, Chart), nl, portray_clauses(Chart). computeClosure(Input,Chart) :- retractall(word(_,_)), % get rid of words from previous parse lexAxioms(0,Input,Axioms), closure(Axioms, Chart). % for Earley, lexAxioms *asserts* word(i,WORDi) for each input word, % and then returns the single input axiom: item(start,[],[s],0,0) lexAxioms(_Pos,[],[item(start,[],[S],0,0)]) :- startCategory(S). 
lexAxioms(Pos,[W|Ws],Axioms) :- Pos1 is Pos+1, assert(word(Pos1,W)), lexAxioms(Pos1,Ws,Axioms). inference( scan, [ item(A, Alpha, [W|Beta], I, J) ], % ------item(A, [W|Alpha], Beta, I, J1), % where [J1isJ+1, word(J1, W)] ). inference( predict, [ item(_A, _Alpha, [B|_Beta], _I,J) ], % ------item(B, [], Gamma, J,J), % where [(B :˜Gamma), eligible(Gamma,J)] ). inference( complete, [ item(A, Alpha, [B|Beta], I,J), item(B, _Gamma, [], J,K) ], % ------item(A, [B|Alpha], Beta, I,K), % where [] ). eligible([],_). \ \ eligible([A|_],_) :- +(+(A:˜_)), !. % the double negation leaves A unbound eligible([A|_],J) :- J1 is J+1, word(J1,A). portray_clauses([]). portray_clauses([C|Cs]) :- portray_clause(C), portray_clauses(Cs). 114 Stabler - Lx 185/209 2003 (17) With this code we get sessions like this: SICStus 3.8.1 (x86-linux-glibc2.1): Sun Feb 20 14:49:19 PST 2000 Licensed to humnet.ucla.edu | ?- [earley,grdin]. yes | ?- computeClosure([the,idea,will,suffice,’.’]). .:.:.:.:.:.:.:.:.:..:.:.:.:..:.:.:.:.:.::.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.: item(c1, [], [c0,ip], 2, 2). item(cp, [], [c1], 2, 2). item(d0, [], [the], 0, 0). item(d0, [the], [], 0, 1). item(d1, [], [d0,np], 0, 0). item(d1, [d0], [np], 0, 1). item(d1, [np,d0], [], 0, 2). item(dp, [], [d1], 0, 0). item(dp, [d1], [], 0, 2). item(i0, [], [will], 2, 2). item(i0, [will], [], 2, 3). item(i1, [], [i0,vp], 2, 2). item(i1, [i0], [vp], 2, 3). item(i1, [vp,i0], [], 2, 4). item(ip, [], [dp,i1], 0, 0). item(ip, [dp], [i1], 0, 2). item(ip, [i1,dp], [], 0, 4). item(n0, [], [idea], 1, 1). item(n0, [idea], [], 1, 2). item(n1, [], [n0], 1, 1). item(n1, [], [n0,cp], 1, 1). item(n1, [n0], [], 1, 2). item(n1, [n0], [cp], 1, 2). item(np, [], [n1], 1, 1). item(np, [n1], [], 1, 2). item(s, [], [ip,terminator], 0, 0). item(s, [ip], [terminator], 0, 4). item(s, [terminator,ip], [], 0, 5). item(start, [], [s], 0, 0). item(start, [s], [], 0, 5). item(terminator, [], [’.’], 4, 4). item(terminator, [’.’], [], 4, 5). item(v0, [], [suffice], 3, 3). item(v0, [suffice], [], 3, 4). item(v1, [], [v0], 3, 3). item(v1, [v0], [], 3, 4). item(vp, [], [v1], 3, 3). item(vp, [v1], [], 3, 4). yes |?- (18) Collecting trees from the Earley chart is straightforward. /* earleyCollect.pl * E Stabler, Feb 2000 * collect a tree from an Earley parser chart, * adapted from Aho&Ullman’s (1972) Algorithm 4.6 */ :- op(1200,xfx,:˜). % this is our object language "if" :- use_module(library(lists),[member/2]). earleyCollect(Chart,N,StartCategory,Tree) :- member(item(start,[StartCategory],[],0,N),Chart), collectNewSubtrees([StartCategory],[],0,N,[Tree],Chart). collectNewSubtrees(SubCats,[],I,J,Trees,Chart) :- length(SubCats,K), collectSubtrees(SubCats,I,J,K,[],Trees,Chart). collectSubtrees([],_,_,_,Trees,Trees,_). collectSubtrees([Xk|Xs],I,J,K,Trees0,Trees,Chart) :- word(_,Xk),!, J1 is J-1, K1 is K-1, collectSubtrees(Xs,I,J1,K1,[Xk/[]|Trees0],Trees,Chart). collectSubtrees([Xk|Xs],I,J,K,Trees0,Trees,Chart) :- member(item(Xk,Gamma,[],R,J),Chart), memberck(item(_A,Xks,[Xk|_R],I,R),Chart), length([_|Xks],K), collectNewSubtrees(Gamma,[],R,J,Subtrees,Chart), K1 is K-1, collectSubtrees(Xs,I,R,K1,[Xk/Subtrees|Trees0],Trees,Chart). memberck(A,L) :- member(A,L), !. % just check to make sure such an item exists (19) With this tree collector, we can find all the trees in the chart (when there are finitely many). 115 Stabler - Lx 185/209 2003 8 Stochastic influences on simple language models 8.1 Motivations and background (1) Our example parsers have tiny dictionaries. 
If you just add in a big dictionary, we get many structural ambiguities. Just to illustrate how bad the problem is, the following simple examples from Abney (1996a) have ambiguitiesthatmostpeoplewouldnotnotice,butourparsingmethodswill: a. I know the cows are grazing in the meadow b. I know John saw Mary The word are is a noun in a hectare is a hundred ares,andsaw can be a noun, so the non-obvious readings of those two sentences are the ones analogous to the natural readings of these: a. I know the sales force (which is) meeting in the office b. I know Gatling gun Joe There are many other readings too, ones which would be spelled differently (if we were careful about quotes, which most people are not) but pronounced the same: a. I know “The Cows are Grazing in the Meadow” b. I know “The Cows are Grazing” in the meadow c. I know “The Cows are” grazing in the meadow … … I know ““The Cows are Grazing in the Meadow”” … This kind of thing is a problem for mimicking, let alone modeling, human recognition capabilities. Abney concludes: The problem of how to identify the correct structure from among the in-principle possible structures provides one of the central motivations for the use of weighted grammars in com- putational linguistics. (2) Martin Gardner gives us the following amusing puzzle. Insert the minimum number of quotation marks to make the best sense of the following sentence: Wouldn’t the sentence I want to put a hyphen between the words fish and and and and and chips in my fish and chips sign have looked cleaner if quotation marks had been placed before fish and between fish and and and and and and and and and and and and and and and and and and and and and chips as well as after chips? In effect, we solve a problem like this every time we interpret a spoken sentence. (3) Another demonstration of the ambiguity problem comes from studies like Charniak, Goldwater, and Johnson (1998). Applying the grammar of the Penn Treebank II to sentences in that Treebank shorter than 40 words from the Wall Street Journal, they found that their charts had, on average, 1.2 million items per sentence – obviously, very few of these are actually used in the desired derivation, and the rest come from local and global ambiguities. They say: Numbers like this suggest that any approach that offers the possibility of reducing the work load is well worth pursuing… 116 Stabler - Lx 185/209 2003 To deal with this problem, Charniak, Goldwater, and Johnson (1998) explore the prospects for using a probabilistic chart parsing method that builds only the n best analyses (of each category for each span of the input) for some n. (4) Is it reasonable to think that a probabilistic language models can handle these disambiguation problems? It is not clear that this question has any sense, since the term ‘probabilistic language model’ apparently covers almost everything, including, as a limiting case, the simple, discrete models that we have been studying previously. However, it is important to recognize that the disambiguation problem is a hard one, and clearly involves background factors that cannot be regarded as linguistic. It has been generally recognized since the early studies of language in the tradition of analytic philoso- phy, and since the earliest developments in modern formal semantics, that the problem of determining the intended reading of a sentence, like the problem of determining the intended reference of a name or noun phrase is, at least, well beyond the analytical tools available now. 
See, e.g., Partee (1975, p80), Kamp (1984, p39), Fodor (1983), Putnam (1986, p222), and many others. Putnam argues, for example, that …deciding – at least in hard cases – whether two terms have the same meaning or whether they have the same reference or whether they could have the same reference may involve deciding what is and what is not a good scientific explanation. From this perspective, the extent to which simple statistical models account for human language use is surprising! As we will see, while we say surprising and new things quite a lot, it is easy to discern creatures of habit behind language use as well. We briefly survey some of the most basic concepts of probability theory and information. Reading quickly over at least §8.1.1-§8.1.3 is recommended, but the main thread of development can be followed by skipping directly to §8.2.1 on page 159. 117 Stabler - Lx 185/209 2003 8.1.1 Corpus studies and first results We first show how standard (freely distributed) gnu text utilities can be used to edit, sort and count things. (These utilities are standardly provided in linux and unix implementations. In ms-windows systems, you can get them with cygwin. In mac-osX systems, you can get them with fink.) (5) Jane Austen’s Persuasion: 1%dir persu11.txt 460 -rw-r--r-- 1 es users 467137 Apr 30 18:00 persu11.txt 2%wc -l persu11.txt 8466 persu11.txt 3%wc -w persu11.txt 83309 persu11.txt 4%more persu11.txt Persuasion by Jane Austen Chapter 1 Sir Walter Elliot, of Kellynch Hall, in Somersetshire, was a man who, for his own amusement, never took up any book but the Baronetage; there he found occupation for an idle hour, and consolation in a distressed one; there his faculties were roused into admiration and respect, by contemplating the limited remnant of the earliest patents; there any unwelcome sensations, arising from domestic affairs changed naturally into pity and contempt as he turned over the almost endless creations of the last century; and there, if every other leaf were powerless, he could read his own history with an interest which never failed. This was the page at which the favorite volume always opened: "ELLIOT OF KELLYNCH HALL. "Walter Elliot, born March 1, 1760, married, July 15, 1784, Elizabeth, daughter of James Stevenson, Esq. of South Park, in the county of Gloucester, by which lady (who died 1800) he has issue Elizabeth, q In 1%, we get the number of bytes in the file. In 2%, we get the number of lines in the file. In 3%, we get the number of “words” – character sequences surrounded by spaces or newlines. (6) HereweusetheGnuversionoftr. Check your man pages if your tr does not work the same way. 4%tr ’ ’ ’\012’ < persu11.txt > persu11.wds 6%more persu11.wds Persuasion by Jane Austen Chapter 1 Sir Walter Elliot, of Kellynch Hall, in Somersetshire, was a man who, for (7) 7%tr -sc ’A-Za-z’ ’\012’ < persu11.txt > persu11.wds 8%more persu11.wds Persuasion by Jane Austen Chapter Sir Walter Elliot of Kellynch Hall in Somersetshire was a man 118 Stabler - Lx 185/209 2003 who for his own amusement never The command 4% in (6) changes the space characters to newlines in persu11.wds The command 7% in (7) changes everything in the complement of A-Za-z to newlines and then squeezes repeated occurrences of newlines down to a single occurrence. 
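For readers who want to do the same thing outside the shell, here is a rough Python analogue of command 7% (a sketch only, not part of the original session: the function name words_of and the use of Python's re module are my own; the regular expression plays the role of tr -sc 'A-Za-z' '\012'):

import re

def words_of(filename):
    # like 7%: every maximal run of letters in the file becomes one token
    with open(filename) as f:
        text = f.read()
    return re.findall(r"[A-Za-z]+", text)

tokens = words_of("persu11.txt")
print(len(tokens))   # compare the counts in (5) and (10)
print(tokens[:6])    # ['Persuasion', 'by', 'Jane', 'Austen', 'Chapter', 'Sir']

Lowercasing, as in 12% below, would just be a call to text.lower() before the findall.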
(8) 9%sort -d persu11.wds > persu11.srt 10%more persu11.srt A A A A A A A A A A A A A A A A A A A A A A A A About Abydos Accordingly (9) 11%tr -sc ’A-Z’ ’a-z’ ’\012’ < persu11.txt > persu11.wds tr: too many arguments Try ‘tr --help’ for more information. 12%tr ’A-Z’ ’a-z’ < persu11.txt > persu11.low 13%more persu11.wds persuasion by jane austen chapter 1 sir walter elliot, of kellynch hall, in somersetshire, was a man who, for his own amusement, never took up any book but the baronetage; there he found occupation for an idle hour, and consolation in a distressed one; there his faculties were roused into admiration and respect, by contemplating the limited remnant of the earliest patents; there any unwelcome sensations, arising from domestic affairs changed naturally into pity and contempt as he turned over the almost endless creations of the last century; and there, if every other leaf were powerless, he could read his own history with an interest which never failed. this was the page at which the favorite volume always opened: "elliot of kellynch hall. "walter elliot, born march 1, 1760, married, july 15, 1784, elizabeth, daughter of james stevenson, esq. of south park, in the county of gloucester, by which lady (who died 1800) he has issue elizabeth, (10) 17%tr -sc ’A-Za-z’ ’\012’ < persu11.low > persu11.wds 18%wc -l persu11.wds 84125 persu11.wds 19%more persu11.wds persuasion by jane austen chapter sir walter elliot of kellynch hall in somersetshire was a man who for his own amusement never Why has the number of words increased from what we had in 3% of (5)? (Think about what happened 119 Stabler - Lx 185/209 2003 to punctuation, the dashes, apostrophes, etc.) (11) 23%sort -d persu11.wds > persu11.srt 24%more persu11.srt a a a a a a a a a a a a a (12) 25%uniq -c persu11.srt > persu11.cnt 26%more persu11.cnt 1595 a 1 abbreviation 1 abdication 1 abide 3 abilities 30 able 1 abode 1 abominable 1 abominate 1 abominates 97 about 6 above 5 abroad 9 absence 4 absent 1 absenting 3 absolute 5 absolutely 1 abstraction 6 absurd 1 absurdity 1 abundance The command uniq -c inserts a count of consecutive identical lines before 1 copy of that line. (13) At this point we have a count of word “tokens.” That is, we know that “a” occurs 1595 times in this text. Notice that by “tokens” here, we do not mean particular inscriptions, particular patterns of ink on paper or of states in computer memory as is usual. Rather, we mean the occurrences in the novel, where the novel is an abstract thing that has many different realizations in the physical world. When we execute wc -l persu11.cnt we get 5742 – the number of word types. (14) 27%sort -d persu11.cnt > persu11.sct 28%more persu11.sct 1 abbreviation 1 abdication 1 abide 1 abode 1 abominable 1 abominate 1 abominates 1 absenting 1 abstraction 1 absurdity 1 abundance 1 abuse 1 abydos 1 accent 1 acceptance 1 accession 1 accessions 1 accidental 1 accommodating 1 accompanied 1 accompany 1 accompanying (15) 31%sort -nr persu11.cnt > persu11.sct 32%more persu11.sct 3329 the 2808 to 2801 and 2570 of 1595 a 1389 in 1337 was 1204 her 1186 had 1146 she 1124 i 1038 it 962 he 950 be 934 not 882 that 120 Stabler - Lx 185/209 2003 809 as 707 for 664 but 659 his 654 with 628 you 589 have 533 at 530 all 497 anne 496 been 485 s 467 him 451 could 434 very 433 they 426 were 418 by 416 which 398 is 396 on 359 so 356 no .... 
1 tranquility 1 trained 1 trafalgar 1 tradespeople 1 toys 1 toward 1 tossed 1 torn 1 topic 1 tolerated 1 tolerate The command sort -nr sorts the file in numeric reverse order, looking for a number at the beginning of each line. Notice that almost all of these most frequent words are one syllable. (16) Zipf (1949) observed that longer words tend to be infrequent, and that frequencies are distributed in a regular, non-normal way. This is discussed below. (17) 33%more persu11.wds persuasion by jane austen chapter sir walter elliot of kellynch hall in somersetshire was a man who for his own amusement never (18) 34%tail +2 persu11.wds > persu11.w2 35%more persu11.w2 by jane austen chapter sir walter elliot of kellynch hall in somersetshire was a man who for his own amusement never took (19) 36%paste persu11.wds persu11.w2 > bigrams 37%more bigrams persuasion by by jane 121 Stabler - Lx 185/209 2003 jane austen austen chapter chapter sir sir walter walter elliot elliot of of kellynch kellynch hall hall in in somersetshire somersetshire was was a a man man who who for for his his own own amusement amusement never never took (20) 38%sort -d bigrams > bigrams.srt 39%more bigrams.srt a bad a bad a bad a bad a bad a ball a baronet a baronet a baronet a baronet a baronet a baronetcy a baronetcy a baronight a barouche a beautiful a bed (21) 40%uniq -c bigrams.srt > bigrams.cnt 41%more bigrams.cnt 5 a bad 1 a ball 5 a baronet 2 a baronetcy 1 a baronight 1 a barouche 1 a beautiful 1 a bed 1 a beloved 1 a bend Here, wc -l bigrams.cnt gives us 42,728 – the number of bigram types – sequences of two word types that occur in the text. Notice that 42,728 < 57422=32,970,564, – the number of bigrams is very much less than the number of possible word pairs, which is the square of the number of word-types. (22) 42%sort -nr bigrams.cnt > bigrams.sct 43%more bigrams.sct 429 of the 378 to be 323 in the 255 had been 227 she had 220 it was 196 captain wentworth 191 he had 176 to the 174 mr elliot 164 she was 148 could not 147 lady russell 134 he was 131 sir walter 129 of her 127 all the 125 i have 123 i am 117 have been 114 she could 112 of his 112 and the 111 for the 109 they were 96 to her 96 that he 96 did not 95 on the 94 to have 91 a very 90 of a 89 would be 88 it is 84 that she 122 Stabler - Lx 185/209 2003 83 in a 82 was a 81 not be 81 at the 80 the same We could use these as an estimate of word-transition possibilities in a “Markov chain” or “pure Markov source” – these notions are introduced below.33 It is very important to note the following point: (23) Notice how high in the ranking captain wentworth is! Obviously, this reflects not on the structure of the grammar, but on extra-linguistic factors. This raises the important question: what do these numbers mean? They confound linguistic and extralinguistic factors. Apparently, extralinguistic factors could easily rerank these bigrams substantially without changing the language in any significant way! We will return to this important point. (24) We can take a peek at the least common bigrams as well. Some of them have unusual words like baronight, but others are perfectly ordinary. 44%tail -10 bigrams.sct 1 a bias 1 a bewitching 1 a benevolent 1 a bend 1 a beloved 1 a bed 1 a beautiful 1 a barouche 1 a baronight 1 a ball (25) 45%grep baronight persu11.wds baronight 46%grep baronight persu11.txt his master was a very rich gentleman, and would be a baronight some day." 
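The sorting and counting steps above can also be sketched in a few lines of Python (again my own illustration, not the original pipeline; the exact counts can differ slightly from (13) and (21) depending on how tokens are defined):

import re
from collections import Counter

with open("persu11.txt") as f:
    words = re.findall(r"[A-Za-z]+", f.read().lower())   # cf. 12% and 17%

unigrams = Counter(words)                  # cf. sort persu11.wds | uniq -c
bigrams = Counter(zip(words, words[1:]))   # cf. paste persu11.wds persu11.w2 | sort | uniq -c

print(len(words), len(unigrams), len(bigrams))   # word tokens, word types, bigram types
print(unigrams.most_common(4))   # the, to, and, of, ... as in (15)
print(bigrams.most_common(4))    # (of, the), (to, be), (in, the), ... as in (22)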
(26) Facts like these have led to the view that studying the range of possible linguistic structures is quite a different thing from studying what is common. At the conclusion of a statistical study, Mandelbrot (1961, p213) says, for example,
…because statistical and grammatical structures seem uncorrelated, in the first approximation, one might expect to encounter laws which are independent of the grammar of the language under consideration. Hence from the viewpoint of significance (and also of the mathematical method) there would be an enormous difference between, on the one hand, the collection of data that are unlikely to exhibit any regularity other than the approximate stability of relative frequencies, and on the other hand, the study of laws that are valid for natural discourse but not for other organized systems of signs.
8.1.2 Vocabulary growth
(27) Vocabulary growth varies with texts: some authors introduce new words at a much greater rate than others (and this is a common test used in author identification). And of course, as we read a corpus of diverse texts, vocabulary growth is "bursty," as you would expect. In previous studies, it has been found that the number of word types V grows with the number of words N in the corpus roughly according to
V = kN^β
where usually 10 ≤ k ≤ 20 and 0.5 ≤ β ≤ 0.6.
[Figure: predicted vocabulary growth, word types against word occurrences in the corpus, for 10·N^0.5, mid = 15·N^0.55, and 20·N^0.6.]
There is some work on predicting vocabulary size: Good (1953), Salton (1988), Zobel et al. (1995), Samuelsson (1996).
(Footnote 33, to the mention of Markov chains in (22): Using them to set the parameters of a Markov model, where the states do not correspond 1 for 1 to the words, is a much more delicate matter which can be handled in any number of ways. We will mention this perspective, where the states are not visible, again below, since it is the most prominent one in the "Hidden Markov Model" literature.)
Exercise:
a. Get CharlesDarwin-VoyageOfTheBeagle.txt from the class machine.
b. What percentage of word-types in this text occur only once?
c. Build trigrams for the text.
d. What percentage of trigrams occur only once?
e. Extra credit 1: Generate 100 words of text strictly according to trigram probabilities. Submit the 100 word text and also the program that generates it.
f. Extra credit 2: Plot the rate of vocabulary growth in this text. Does it have roughly the shape of the function V = kN^β? Extra extra: For what k, β does the function V = kN^β best fit this curve?
g. Delete all the files you created!
8.1.3 Zipf's law
In early studies of texts, Zipf (1949) noticed that the distribution of word frequencies is not normal, and that there is a relation between word frequency and word length. In particular, in most natural texts, when words are ranked by frequency, from most frequent to least frequent, the product of rank r and frequency µ is constant. That is, in natural texts with vocabulary Σ,
∃k ∀x ∈ Σ, r(x)µ(x) = k.
In other words, in natural texts the function f from ranks r to frequencies is
f(r) = k/r.
[Figure: Zipf's law, y = 0.1/x, plotted on linear, linear(x)-log(y), log(x)-linear(y), and log-log scales.]
Zipf proposed that this regularity comes from a "principle of least effort:" frequently used vocabulary tends to be shortened. This idea seems intuitively right, but the evidence for it here is very very weak! Miller and Chomsky (1963, pp456-463) discuss Mandelbrot's (1961) point that this happens even in a random distribution, as long as the word termination character is among the randomly distributed elements. Consequently, there is no reason to assume a process of abbreviation unless the distribution of words of various sizes departs significantly from what might be expected anyway. No one has been able to establish that this is so. Cf. Li (1992), Perline (1996), Teahan (1998), Goldman (1998).
Teahan (1998) presents a number of useful results that we survey here.
[Figure: James Joyce's Ulysses fits Zipf's hypothesis fairly well, with some falling off at the highest and lowest ranks; frequency against rank, log-log.] Most word occurrences are occurrences of those few types that occur very frequently.
We get almost the same curve for other texts and collections of texts:
[Figure: percentage frequency against rank for the Brown Corpus, LOB Corpus, Wall Street Journal, Bible, Shakespeare, and Austen.]
NB: Zipf's relationship holds when texts are combined into a corpus – what explains this?
Also, the same kind of curve, with a steeper fall, for n-grams (Brown corpus):
[Figure: frequency against rank for word types, bigrams, and trigrams.]
Similar curves for character frequencies, and tags (i.e. part of speech labels) too:
[Figure: character frequencies, from space, e, t, a, o, i, n, s, r, h, l, d, c, u, m, f, p, g, w, y, b, v, k, x, j, q, z; and frequency against rank for tags, bi-tags, and tri-tags.]
Word length in naturally occurring text has a similar relation to frequency – with dictionaries (Chambers) as unsurprisingly exceptional:
[Figure: proportion of the number of tokens against word length for Brown, LOB, Bible, WSJ, Jefferson, Austen, Shakespeare, and Chambers.]
Most word types are rare; bigrams and trigrams more so:
[Figure: cumulative percentage of the number of types, bigrams, or trigrams against frequency.]
And of course, the language grows continually:
[Figures: number of unique word bigrams, character 6-grams, tag trigrams, and uniquely tagged words against number of characters (bytes); and number of word types against number of word tokens for Brown, LOB, WSJ, Shakespeare, Austen, and the Bible.]
8.1.4 Probability measures and Bayes' Theorem
(28) A sample space Ω is a set of outcomes. An event A ⊆ Ω. Letting 2^Ω be the set of subsets of Ω, the power set of Ω, an event A ∈ 2^Ω.
(29) Given events A, B, the usual notions A ∩ B,A ∪ B apply. For the complement of A let’s write A = Ω − A. Sometimes set subtraction A − B = A ∩ B is written A\B, but since we are not using the dash for complements, we can use it for subtraction without excessive ambiguity. (30) (2Ω, ⊆) is a Boolean (set) algebra, that is, it is a collection of subsets of Ω such that a. Ω ∈ 2Ω Ω Ω b. A0,A1 ∈ 2 implies A0 ∪ A1 ∈ 2 c. A ∈ 2Ω implies A ∈ 2Ω The English mathematician George Boole (1815-1864) is best known for his work in propositional logic and algebra. In 1854 he published An Investigation of the Laws of Thought, on Which Are Founded the Mathematical Theories of Logic and Probabilities. (31) When A ∩ B =∅, A and B are (mutually) disjoint. A0,A1,...,An is a sequence of (mutually) disjoint events if for all 0 ≤ i, j ≤ n where i = j,thepairAi and Aj is disjoint. (32) When Ω is countable (finite or countably infinite), it is discrete. Otherwise Ω is continuous. (33) [0, 1] is the closed interval {x ∈ R| 0 ≤ x ≤ 1} (0, 1) is the open interval {x ∈ R| 0 In some settings we will assume countable additivity, for any sequence of disjoint events A0,A1,...∈ 2Ω, ∞ ∞ P( Ai) = P(Ai) i=0 i=0 (Notice that axiom i follows from the indicated range of P.) The Russian mathematician Andrey Nikolayevich Kolmogorov (1903-1987) developed the axiomatic approach to probability theory based on set theory. (35) When P satisfies i-iii, (Ω, 2Ω,P) is a (finitely, or countably additive) probability space 132 Stabler - Lx 185/209 2003 (36) Theorem: P(A) = 1 − P(A) Proof: Obviously AandA are disjoint, so by axiom iii, P(A∪ A) = P(A) + P(A) Since A ∪ A = Ω, axiom ii tells us that P(A) + P(A) = 1 (37) Theorem: P(∅) = 0 (38) Theorem: P(A∪ B) = P(A)+ P(B)− P(A∩ B) Proof: Since A is the union of disjoint events (A ∩ B) and (B ∩ A), P(A) = P(A∩ B) + (B ∩ A). Since B is the union of disjoint events (A ∩ B) and (A ∩ B), P(B) = P(A∩ B) + (A ∩ B). And finally, since (A ∪ B) is the union of disjoint events (B ∩ A), (A ∩ B) and (A ∩ B), P(A ∪ B) = P(B ∩ A) + P(A∩ B) + P(A ∩ B). Now we can calculate P(A) + P(B) = P(A ∩ B) + (B ∩ A) + P(A ∩ B) + (A ∩ B),andsoP(A ∪ B) = P(A)+ P(B)− P(A∩ B). ∞ ≤ ∞ (39) Theorem (Boole’s inequality): P( 0 Ai) 0 P(Ai) (40) Exercises a. Prove that if A ⊆ B then P(A) ≤ P(B) b. In (38), we see what P(A∪ B) is. What is P(A∪ B ∪ C)? c. Prove Boole’s inequality. | = P(A∩B) (41) The conditional probability of A given B, P(A B) df P(B) | = P(A)P(B|A) (42) Bayes’ theorem: P(A B) P(B) Proof: From the definition of conditional probability just stated in (41), (i) P(A∩ B) = P(B)P(A|B). The | = P(A∩B) ∩ = | definition of conditional probability (41) also tells us P(B A) P(A) , and so (ii) P(A B) P(A)P(B A). Given (i) and (ii), we know P(A)P(B|A) = P(B)P(A|B), from which the theorem follows immediately. The English mathematician Thomas Bayes (1702-1761) was a Presbyterian minister. He distributed some papers, and published one anonymously, but his influential work on probability, containing a version of the theorem above, was not published until after his death. Bayes is also associated with the idea that probabilities may be regarded as degrees of belief, and this has inspired recent work in models of scientific reasoning. See, e.g. Horwich (1982), Earman (1992), Pearl (1988). 
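A tiny numerical check of (41) and (42), with made-up values (nothing here comes from the text; it just confirms that the two ways of getting P(A|B) agree):

p_a, p_b, p_b_given_a = 0.125, 0.25, 0.5   # hypothetical probabilities

p_a_and_b = p_a * p_b_given_a              # P(A ∩ B) = P(A)P(B|A)
p_a_given_b = p_a_and_b / p_b              # the definition in (41)
bayes = p_a * p_b_given_a / p_b            # Bayes' theorem, (42)

print(p_a_given_b, bayes)                  # 0.25 0.25 -- the same, as (42) requires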
In fact, in a Microsoft advertisement we are told that their Lumiere Project uses “a Bayesian perspective on in- tegrating information from user background, user actions, and program state, along with a Bayesian analysis of the words in a user’s query…this Bayesian information-retrieval component of Lumiere was shipped in all of the Office ’95 products as the Office Answer Wizard…As a user works, a probability distribution is generated over areas that the user may need assistance with. A probability that the user would not mind being bothered with assistance is also computed.” See, e.g. http://www.research.microsoft.com/research/dtg/horvitz/lum.htm. For entertainment, and more evidence of the Bayesian cult that is sweeping certain subcultures, see, e.g. http://www.afit.af.m For some more serious remarks on Microsoft’s “Bayesian” spelling correction, and a new proposal inspired by trigram and Bayesian methods, see e.g. Golding and Schabes (1996). For some serious proposals about Bayesian methods in perception: Knill and Richards (1996); and in language acquisition: Brent and Cartwright (1996), de Marcken (1996). (43) A and B are independent iff P(A∩ B) = P(A)P(B). 133 Stabler - Lx 185/209 2003 8.1.5 Random variables (44) A random (or stochastic) variable on probability space (Ω, 2Ω,P) is a function X : Ω → R. (45) Any set of numbers A ∈ 2R determines (or “generates”) an event, a set of outcomes, namely X−1(A) = {e| X(e) ∈ A}. (46) So then, for example, P(X−1(A)) = P({e| X(e) ∈ A}) is the probability of an event, as usual. (47) Many texts use the notation X ∈ A for an event, namely, {e| X(e) ∈ A}). So P(X ∈ A) is just P(X−1(A)),whichisjustP({e| X(e) ∈ A}). Sometimes you also see P{X ∈ A}, with the same meaning. (48) Similarly, for some a ∈ R,itiscommontoseeP(X = s),whereX = s is the event {e| X(e) = s}). (49) The range of X is sometimes called the sample space of the stochastic variable X, ΩX . X is discrete if ΩX is finite or countably infinite. Otherwise it is continuous. • Why do things this way? What is the purpose of these functions X? The answer is: the functions X just formalize the classification of events, the sets of outcomes that we are interested in, as explained in (45) and (48). This is a standard way to name events, and once you are practiced with the notation, it is convenient. The events are classified numerically here, that is, they are named by real numbers, but when the set of events ΩX is finite or countable, obviously we could name these events with any finite or countable set of names. 8.1.6 Stochastic processes and n-gram models of language users (50) A stochastic process is a function X from times (or “indices”) to random variables. If the time is continuous, then X : R → [Ω → R],where[Ω → R] is the set of random variables. If the time is discrete, then X : N → [Ω → R] (51) For stochastic processes X, instead of the usual argument notation X(t), we use subscripts Xt ,toavoid confusion with the arguments of the random variables. So Xt is the value of the stochastic process X at time t, a random variable. When time is discrete, for t = 0, 1, 2,...we have the sequence of random variables X0,X1,X2,... (52) We will consider primarily discrete time stochastic processes, that is, sequences X0,X1,X2,...of random variables. So Xi is a random variable, namely the one that is the value of the stochastic process X at time i. (53) Xi = q is interpreted as before as the event (now understood as occurring at time i)whichisthesetof outcomes {e| Xi(e) = q}. 
So, for example, P(Xi = q) is just a notation for the probability, at time i, of an outcome that is named q by Xi,thatis,P(Xi = q) is short for P({e| Xi(e) = q}). (54) Notice that it would make perfect sense for all the variables in the sequence to be identical, X0 = X1 = X2 = .... In that case, we still think of the process as one that occurs in time, with the same classification of outcomes available at each time. Let’s call a stochastic process time-invariant (or stationary) iff all of its random variables are the same function. That is, for all q, q ∈ N,Xq = Xq . (55) A finite stochastic process X is one where sample space of all the stochastic variables, Ω = ∞ Ω X i=0 Xi is finite. The elements of ΩX name events, as explained in (45) and (48), but in this context the elements of ΩX are often called states. Markov chains 134 Stabler - Lx 185/209 2003 ≤ = Ω (56) A stochastic process X0,X1,...is first order iff for each 0 i, x∈Xi P(x) 1 and the events in Xi are all independent of one another. (Some authors number from 0, calling this one 0-order). (57) A stochastic process X0,X1,... has the Markov property (that is, it is second order) iff the probability of the next event may depend only on the current event, not on any other part of the history. That is, for all t ∈ R and all q0,q1,...∈ ΩX , P(Xt+1 = qt+1|X0 = q0,...,Xt = qt) = P(Xt+1 = qt+1|Xt = qt ) The Russian mathematician Andrei Andreyevich Markov (1856-1922) was a student of Pafnuty Chebyshev. He used what we now call Markov chains in his study of consonant-vowel sequences in poetry. (58) A Markov chain or Markov process is a stochastic process with the Markov property. (59) A finite Markov chain, as expected, is a Markov chain where the sample space of the stochastic variables, ΩX is finite. (60) It is sometimes said that an n-state Markov chain can be specified with an n × n matrix that specifies, for each pair of events si,sj ∈ ΩX the probability of next event sj given current event si. Is this true? Do we know that for some Xj where i = j that P(Xi+1 = q |Xi = q) = P(Xj+1 = q |Xj = 34 q)?No. For example, we can perfectly well allow that P({e| Xi+1(e) = q }) = P({e| Xj+1(e) = q }), simply by letting {e| Xi+1(e) = q }) ={e| Xj+1(e) = q }. This can happen quite naturally when the functions Xi+1,Xj+1 are different, a quite natural assumption when these functions have in their domainsoutcomesthathappenatdifferenttimes. The condition that disallows this is: time-invariance, defined in (54) above. (61) Given a time-invariant, finite Markov process X, the probabilities of events in ΩX can be specified by Ω ={ } i. an “initial probability vector” which defines a probability distribution over x q0,q1,...,qn−1 . × n−1 = This can be given as a vector, a 1 n matrix, [P0(q0) ... P0(qn−1)],where i=0 P(qi) 1 ii. a |ΩX |×|ΩX | matrix of conditional probabilities, here called transition or digram probabilities, specifying for each si,sj ∈ ΩX the probability of next event/state qi given current event/state qj. We introduce the notation P(qi|qj) for P(Xt+1 = qi|Xt = qj ). (62) Given an initial probability distribution I on ΩX and the transition matrix M, the probability of state sequence q0q1q2 ...qn is determined. 
Writing P0(qi) for the initial probability of qi, P(q0q1q2 ...qn) = P0(q0)P(q1|q0)P(q2|q1)...P(qn|qn−1) Writing P(qi|qj ) for the probability of the transition from state qj to state qi, = | | P(q1 ...qn) P0(q1)P(q 2 q1)...P(qn qn−1) = | P0(q1) 1≤i≤n−1 P(qi+1 qi) (63) Given an initial probability distribution I on ΩX and the transition matrix M, the probability distribution for the events of a finite time-invariant Markov process at time t is given by the matrix product IMt. That is, at time 0 P = I,attime1P = IM,attime2P = IMM,andsoon. (64) To review your matrix arithmetic, there are many good books. The topic is commonly covered in calculus books like Apostol (1969), but many books focus on this topic: Bretscher (1997), Jacob (1995), Golub and Van Loan (1996), Horn and Johnson (1985) and many others. Here is a very brief summary of some basics. 34The better written texts are careful about this, as in Papoulis (1991, p637) and Drake (1967, p165), for example. 135 Stabler - Lx 185/209 2003 Matrix arithmetic review: An m × n matrix A is an array with m rows and n columns. Let A(i, j) be the element in row i and column j. 1. We can add the m × n matrices A, B to get the m × n matrix A + B = C in which C(i,j) = A(i, j) + B(i,j). 2. Matrix addition is associative and commutative: A + (B + C) = (A + B) + C A + B = B + A 3. For any n × m matrix M there is an n × m matrix M such that M + M = M + M = M,namelythen × m matrix M such that every M(i,j) = 0. 4. We can multiply an m × n matrix A and a n ×p matrix to get an m × p matrix C in which = n C(i,j) k=1 A(i, k)B(k, j). This definition is sometimes called the “row by column” rule. To find the value of C(i,j),youaddthe products of all the elements in row i of A and column j of B. For example, · + · · + · · + · 14 890= (1 8) (4 2)(1 9) (4 6)(1 0) (4 0) 25 260 (2 · 8) + (5 · 2)(2 · 9) + (5 · 6)(2 · 2) + (5 · 0) Here we see that to find C(1, 1) we sum the products of all the elements row 1 of A times the elements in column 1 of B. – The number of elements in the rows of A must match the number of elements in the columns of B or else AB is not defined. 5. Matrix multiplication is associative, but not commutative: A(BC) = (AB)C 2 2 For example, 35 = 35 4 4 It is interesting to notice that Lambek’s (1958) composition operator is also associative but not commutative: (X • Y)• Z) ⇒ X • (Y • Z) X/Y • Y ⇒ X Y • X/Y ⇒ X The connection between the Lambek calculus and matrix algebra is actually a deep one (Parker, 1995). 6. For any m × m matrix M there is an m × m matrix Im such that TIm = ImT = T ,namelythem × m matrix Im such that every Im(i, i) = 1andforeveryi = j,Im(i, j) = 0. 7. Exercise: a. Explain why the claims in 2 are obviously true. b. Do the calculation to prove that my counterexample to commutativity in 5 is true. c. Explain why 6 is true. d. Make sure you can use octave or some other system to do the calculations once you know how to do them by hand: 1% octave Octave, version 2.0.12 (i686-pc-linux-gnulibc1). Copyright (C) 1996, 1997, 1998 John W. Eaton. 
136 Stabler - Lx 185/209 2003 octave:1> x=[1,2] x= 12 octave:2> y=[3;4] y= 3 4 octave:3> z=[5,6;7,8] z= 56 78 octave:4> x+x ans = 24 octave:5> x+y error: operator +: nonconformant arguments (op1 is 1x2, op2 is 2x1) error: evaluating assignment expression near line 5, column 2 octave:5> 2*x ans = 24 octave:6> x*y ans = 11 octave:7> x*z ans = 19 22 octave:8> y*x ans = 36 48 octave:9> z*x error: operator *: nonconformant arguments (op1 is 2x2, op2 is 1x2) error: evaluating assignment expression near line 9, column 2 137 Stabler - Lx 185/209 2003 (65) To apply the idea in (63), we will always be multiplying a 1 × n matrix times a square n × n matrix, to get the new 1 × n probability distribution for the events of the n state Markov process. (66) For example, suppose we have a coffee machine that (upon inserting money and pressing a button) will do one of 3 things: (q1) produce a cup of coffee, (q2) return the money with no coffee, (q3) keep the money and do nothing. Furthermore, after an occurrence of (q2), following occurrences of (q2) or (q3) aremuchmorelikelythan they were before. We could capture something like this situation with the following initial distribution for q1,q2 and q3 respectively, I = [0.70.20.1] and if the transition matrix is: 0.70.20.1 T=0.10.70.2 001 a. What is the probability of state sequence q1q2q1? P(q1q2q1) = P(q1)P(q2|q1)p(q1|q2) = 0.7 · 0.2 · 0.1 = 0.014 b. What is the probability of the states ΩX at a particular time t? At time 0 (maybe, right after servicing) the probabilities of the events in ΩX are given by I. At time 1, the probabilities of the events in Ω are given by X 0 7 · 0 7 + 0 2 · 0 1 + 0 1 · 007 · 0 2 + 0 2 · 0 7 + 0 1 · 007 · 0 1 + 0 2 · 0 2 + 0 1 · 1 IT= ...... 0.49 + 0.02 0.14 + 0.14 0.07 + 0.04 + .1 = = 0.51 0.28 .21 2 At time 2, the probabilities of the events in ΩX are given by IT . t At time t, the probabilities of the events in ΩX are given by IT . 
138 Stabler - Lx 185/209 2003 octave:10> i=[0.7,0.2,0.1] i= 0.70000 0.20000 0.10000 octave:11> t=[0.7,0.2,0.1;0.1,0.7,0.2;0,0,1] t= 0.70000 0.20000 0.10000 0.10000 0.70000 0.20000 0.00000 0.00000 1.00000 octave:12> i*t ans = 0.51000 0.28000 0.21000 octave:13> i*t*t ans = 0.38500 0.29800 0.31700 octave:14> i*t*t*t ans = 0.29930 0.28560 0.41510 octave:15> i*t*t*t*t ans = 0.23807 0.25978 0.50215 octave:16> i*(t**1) ans = 0.51000 0.28000 0.21000 octave:17> i*(t**2) ans = 0.38500 0.29800 0.31700 139 Stabler - Lx 185/209 2003 octave:18> result=zeros(10,4) result = 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 octave:19> for x=1:10 > result(x,:)= [x,(i*(t**x))] > endfor octave:20> result result = 1.000000 0.510000 0.280000 0.210000 2.000000 0.385000 0.298000 0.317000 3.000000 0.299300 0.285600 0.415100 4.000000 0.238070 0.259780 0.502150 5.000000 0.192627 0.229460 0.577913 6.000000 0.157785 0.199147 0.643068 7.000000 0.130364 0.170960 0.698676 8.000000 0.108351 0.145745 0.745904 9.000000 0.090420 0.123692 0.785888 10.000000 0.075663 0.104668 0.819669 140 Stabler - Lx 185/209 2003 octave:21> gplot [1:10] result title "x",\ > result using 1:3 title "y", result using 1:4 title "z" path of a Markov chain 0.9 x 0.8 y z 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 1 2 3 4 5 6 7 8 9 10 (67) Notice that the initial distribution and transition matrix can be represented by a finite state machine with no vocabulary and no final states: 0.7 1 0.2 0.2 0.7ss 0.2 0.1 1s2 3 0.1 0.7 0.1 (68) Notice that no Markov chain can be such that after a sequence of states acccc there is a probability of 0.8 that the next symbol will be an a,thatis, P(a|acccc) = 0.8 when it is also the case that P(b|bcccc) = 0.8 This follows from the requirement mentioned in (61) that in each row i, the sum of the transition | = | = | = probabilities from that state qj ∈ΩX P(qj qi) 1, and so we cannot have both P(b c) 0.8andP(a c) 0.8. (69) Chomsky (1963, p337) observes that the Markovian property that we see in state sequences does not always hold in regular languages. For example, the following finite state machine, to which we have 141 Stabler - Lx 185/209 2003 added probabilities, is such that the probability of generating (or accepting) an a next, after generating acccc is P(a|acccc) = 0.8, and the probability of generating (or accepting) a b next, after generating bcccc is P(b|bcccc) = 0.8. That is, the strings show a kind of history dependence. 0.2 stop a 0.4 s 1s2 a 0.8 b 0.4 b 0.8 c 0.2 s3 c 0.2 However the corresponding state sequences of this same machine are Markovian in some sense: they are not history dependent in the way the strings seem to be. That is, we can have both P(q1|q1q2q2q2q2) = P(q1|q2) = 0.8andP(q1|q1q3q3q3q3) = P(q1|q3) = 0.8 since these involve transitions from different states. We will make this idea clearer in the next section. 8.1.7 Markov models (70) A Markov chain can be specified by an initial distribution and state-state transition probabilities can be augmented with stochastic outputs, so that we have in addition an initial output distribution and state-output emission probabilities. One way to do this is to is to define a Markov model as a pair X,O where X is a Markov chain X : N → [Ω → R] and O : N → [Ω → Σ] where the latter function provides a way to classify outcomes by the symbols a ∈ Σ that they are associated with. In a Markov chain, each number n ∈ R names an event under each Xi,namely{e| Xi(e) = n}. In a Markov model, each output symbol a ∈ Σ names an event in each Oi,namely{e| 0i(e) = a}. 
(71) In problems concerning Markov models where the state sequence is not given, the model is often said to be “hidden,” a hidden Markov model (HMM). See, e.g., Rabiner (1989) for an introductory survey on HMMs. Some interesting recent ideas and applications ap- pear in, e.g., Jelinek and Mercer (1980), Jelinek (1985), De Mori, Galler, and Brugnara (1995), Deng and Rathinavelu (1995), Ristad and Thomas (1997b), Ristad and Thomas (1997a). (72) Let’s say that a Markov model (X, O) is a Markov source iff the functions X and O are “aligned” in the following sense:35 ∀e ∈ Ω, ∀i ∈ N, ∀n ∈ R, ∃a ∈ Σ,Xi(e) = n implies Oi(e) = a ∈ Ω ∈ Σ = | = = Then for every Xi,foralln Xi , there is a particular output a such that P(Oi a Xi n) 1. 35This is nonstandard. I think “Markov source” is usually just another name for a Markov model. 142 Stabler - Lx 185/209 2003 (73) Intuitively, in a Markov source, the symbol emitted at time i depends only on the state n of the process at that time. Let’s formalize this idea as follows. Observe that, given our definition of Markov source, when Oi extended pointwise to subsets of Ω,the set of outputs associated with outcomes named n has a single element, Oi({e| Xi(e) = n}) ={a}. Ω → Σ ∈ Ω = { | = } ={ } So define Outi : Xi such that for any n Xi , Outi(n) a where Oi( e Xi(e) n ) a . Ω 36 (74) Let a pure Markov source be a Markov model in which Outi is the identity function on Xi . Then the outputs of the model are exactly the event sequences. (75) Clearly, no Markov source can have outputs like those mentioned in (69) above, with P(a|acccc) = 0.8 and P(b|bcccc) = 0.8. (76) Following Harris (1955), a Markov source in which the functions Outi are not 1-1 is a grouped (or projected) Markov source. (77) The output sequences of a grouped Markov source may lack the Markov property. For example, it can easily happen that P(a|acccc) = 0.8andP(b|bcccc) = 0.8. This happens, for example, the 2-state Markov model given by the following initial state matrix I,tran- sition matrix T and output matrix O: I = [ 0.50 .5] 10 T= 01 0.800.2 O= 00.80.2 The entry in row i column j of the output matrix represents the probability of emitting the j’th element of a, b, c given that the system is in state i. Thenwecanseethatwehavedescribedthedesired situation, since the system can only emit a a if it is in state 1, and the transition table says that once the system is in state 1, it will stay in state 1. Furthermore, the output table shows that in state 1, the probability of emitting another a is 0.8. On the other hand, the system can only emit a b if it is in state 2, and the transition table says that once the system is in state 2, it will stay in state 2, with a probability of 0.8 of emitting another b. (78) Miller and Chomsky (1963, p427) say that any finite state automaton over which an appropriate prob- ability measure is defined “can serve as” a Markov source, by letting the transitions of the finite state automaton correspond to states of a Markov source. (Chomsky (1963, p337) observes a related result by Schützenberger (1961) which says that every regular language is the homomorphic image of a 1-limited finite state automaton.) We return to formalize this claim properly in (106) on page 151, below. N-gram models (79) Following Shannon (1948), a Markov model is said to be n+1’th order iff the next state is depends only on the previous n symbols emitted. A pure Markov source is always 2nd order. A grouped Markov source can have infinite order, as we saw in (77), following Harris (1955). 
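The claim in (77) is easy to check by simulation. The following is a minimal sketch (my own code, not from the notes) that hard-codes the two-state model of (77) and estimates P(a|acccc) and P(b|bcccc) from sampled output strings; both come out near 0.8, even though the underlying state process is Markovian:

import random

random.seed(0)
INIT = [0.5, 0.5]                     # initial state distribution I of (77)
TRANS = [[1.0, 0.0], [0.0, 1.0]]      # T of (77): once in a state, stay there
EMIT = [{'a': 0.8, 'c': 0.2},         # the first state emits a or c
        {'b': 0.8, 'c': 0.2}]         # the second state emits b or c

def sample_output(n):
    # sample an output string of length n from the grouped Markov source
    state = random.choices([0, 1], weights=INIT)[0]
    out = []
    for _ in range(n):
        symbols = list(EMIT[state])
        weights = [EMIT[state][s] for s in symbols]
        out.append(random.choices(symbols, weights=weights)[0])
        state = random.choices([0, 1], weights=TRANS[state])[0]
    return ''.join(out)

samples = [sample_output(6) for _ in range(200000)]
for prefix, sym in [('acccc', 'a'), ('bcccc', 'b')]:
    matching = [s for s in samples if s.startswith(prefix)]
    share = sum(s[5] == sym for s in matching) / len(matching)
    print(prefix, sym, round(share, 2))   # both estimates should be near 0.8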
8.1.8 Computing output sequence probabilities: naive (80) As noted in (62), given any Markov model and any sequence of states q1 ...qn, 36 With this definition, pure Markov sources are a special case of the general situation in which the functions Outi are 1-1. 143 Stabler - Lx 185/209 2003 P(q1 ...qn) = P0(q1) P(qi+1|qi) (i) 1≤i≤n−1 Given q1 ...qn, the probability of output sequence a1 ...an is n P(at|qt ). (ii) t=1 The probability of q1 ...qn occurring with outputs a1 ...an is the product of the two probabilities (i) and (ii), that is, n P(q1 ...qn,a1 ...an) = P0(q1) P(qi+1|qi) P(at |qt ). (iii) 1≤i≤n−1 t=1 (81) Given any Markov model, the probability of output sequence a1 ...an is the sum of the probabilities of this output for all the possible sequences of n states. P(q1 ...qn,a1 ...an) (iv) qi∈ΩX n (82) Directly calculating this is infeasible, since there are |ΩX | state sequences of length n. 8.1.9 Computing output sequence probabilities: forward Here is a feasible way to compute the probability of an output sequence a1 ...an. (83) a. Calculate, for each possible initial state qi ∈ ΩX , P(qi,a1) = P0(qi)P(a1|qi). b. Recursive step: Given P(qi,a1 ...at) for all qi ∈ ΩX ,calculateP(qj ,a1 ...at+1) for all qj ∈ ΩX as follows P(qj,a1 ...at+1) = P(qi,a1 ...at)P(qj|qi) P(at+1|qj ) i∈ΩX c. Finally, given P(qi,a1 ...an) for all qi ∈ ΩX , P(a1 ...an) = P(qi,a1 ...an) qi∈ΩX (84) Let’s develop the coffee machine example from (66), adding outputs so that we have a Markov model instead of just a Markov chain. Suppose that there are 3 output messages: (s1) thank you (s2) no change (s3) x@b*/! Assume that these outputs occur with the probabilities given in the following matrix where row i column j represents the probability of emitting symbol sj when in state i: 0.80.10.1 O=0.10.80.1 0.20.20.6 Exercise: what is the probability of the output sequence s1s3s3 Solution sketch: (do it yourself first! note the trellis-like construction) 144 Stabler - Lx 185/209 2003 a. probability of the first symbol s1 from one of the initial states | = | = · · · p(qi s1) p(qi)p(s1 qi) 0.7 0.80.2 0.10 .1 0.2 = 0.56 0.02 0.02 b. probabilities of the following symbols from each state (transposed to column matrix) ((p(q ,s ) · p(q |q )) + (p(q ,s ) · p(q |q )) + (p(q ,s ) · p(q |q ))) · p(s |q ) 1 1 1 1 2 1 1 2 3 1 1 3 3 1 · | + · | + · | · | p(qi|s1s3) = ((p(q1,s1) p(q2 q1)) (p(q2,s1) p(q2 q2)) (p(q3,s1) p(q2 q3))) p(s3 q2) · | + · | + · | · | ((p(q1,s1) p(q3 q1)) (p(q2,s1) p(q3 q2)) (p(q3,s1) p(q3 q3))) p(s3 q3) ((0.56 · 0.7) + (0.02 · 0.2) + (0.02 · 0)) · 0.1 = ((0.56 · 0.2) + (0.02 · 0.7) + (0.02 · 0)) · 0.1 ((0.56 · 0.1) + (0.02 · 0.1) +(0.02 · 1)) · 0.6 (0.392 + 0.04) · 0.1 = (0.112 + 0.014) · 0.1 (0.056+ 0.002 + 0.02) · 0.6 0.0432 = 0.0126 0.0456 ((p(q ,s s ) · p(q |q )) + (p(q ,s s ) · p(q |q )) + (p(q ,s s ) · p(q |q ))) · p(s |q ) 1 1 3 1 1 2 1 3 1 2 3 1 3 1 3 3 1 · | + · | + · | · | p(qi|s1s3s3) = ((p(q1,s1s3) p(q2 q1)) (p(q2,s1s3) p(q2 q2)) (p(q3,s1s3) p(q2 q3))) p(s3 q2) · | + · | + · | · | ((p(q1,s1s2) p(q3 q1)) (p(q2,s1s3) p(q3 q2)) (p(q3,s1s3) p(q3 q3))) p(s3 q3) ((0.0432 · 0.7) + (0.0126 · 0.2) + (0.0456 · 0)) · 0.1 = ((0.0432 · 0.2) + (0.0126 · 0.7) + (0.0456 · 0)) · 0.1 ((0.0432 · 0.1) + (0.0126 · 0.1) + (0.0456 · 1)) · 0.6 (0.03024 + 0.00252) · 0.1 = (0.00864 + 0.00882) · 0.1 (0.00432+ 0.00126 + 0.0456) · 0.6 0.003276 = 0.001746 0.030708 c. 
Finally, we calculate p(s1s3s3) as the sum of the elements of the last matrix: p(s1s3s3) = 0.03285 8.1.10 Computing output sequence probabilities: backward Another feasible way to compute the probability of an output sequence a1 ...an. (85) a. Let P(qi ⇒ a1 ...an) be the probability of emitting a1 ...an beginning from state qi. And for each possible final state qi ∈ ΩX ,let P(qi ⇒ ) = 1 (With this base case, the first use of the recursive step calculates P(qj ⇒ an) for each qi ∈ ΩX .) b. Recursive step: Given P(qi ⇒ at ...an) for all qi ∈ ΩX ,calculateP(qj ⇒ at−1 ...an) for all qj ∈ ΩX as follows: P(qj ⇒ at−1 ...an) = P(qi ⇒ at ...an)P(qj |qi) P(at−1|qj ) j∈ΩX 145 Stabler - Lx 185/209 2003 c. Finally, given P(qi ⇒ a1 ...an) for all qi ∈ ΩX , P(a1 ...an) = P0(qi)P(qi ⇒ a1 ...an) qi∈ΩX (86) Exercise: Use the coffee machine as elaborated in (84) and the backward method to compute the prob- ability of the output sequence s1s3s3. 8.1.11 Computing most probable parses: Viterbi’s algorithm (87) Given a string a1 ...an output by a Markov model, what is the most likely sequence of states that could have yielded this string? This is analogous to finding a most probable parse of a string. Notice that we could solve this problem by calculating the probabilities of the output sequence for each n of the |ΩX | state sequences, but this is not feasible! (88) The Viterbi algorithm allows efficient calculation of the most probable sequence of states producing a given output (Viterbi 1967; Forney 1973), using an idea that is similar to the forward calculation of output sequence probabilities in §8.1.9 above. Intuitively, once we know the best way to get to any state in ΩX at a time t, the best path to the next state is an extension of one of those. (89) a. Calculate, for each possible initial state qi ∈ ΩX , P(qi,a1) = P0(qi)P(a1|qi). and record: qi : P(qi,a1)@ . That is, for each state qi, we record the probability of the state sequence ending in qi. b. Recursive step: Given qi : P(qq i,a1 ...at)@q for each qi ∈ ΩX , for each qj ∈ ΩX find a qi that maximizes P(qq iqj,a1 ...atat+1) = P(qq i,a1 ...at )P(qj|qi)P(at+1|qj ) 37 and record: qj : P(qq iqj,a1 ...atat+1)@qq i. c. After these values have been computed up to the final state tn, we choose a qi : P(qq i,a1 ...an)@q with a maximum probability P(qq i,a1 ...an). (90) Exercise: Use the coffee machine as elaborated in (84) to compute the most likely state sequence un- derlying the output sequence s1s3s3. (91) The Viterbi algorithm is not incremental: at every time step |ΩX | different parses are being considered. As stated, the algorithm stores arbitrarily long state paths at each step, but notice that each step only needs the results of the previous step: |ΩX | different probabilities (an unbounded memory requirement, unless precision can be bounded) 37 In case more than one qi ties for the maximum P(qq iqj,a1 ...atat+1), we can either make a choice, or else carry all the winning options forward. 146 Stabler - Lx 185/209 2003 8.1.12 Markov models in human syntactic analysis? (92) Shannon (1948, pp42-43) says: We can also approximate to a natural language by means of a series of simple artificial lan- guage…To give a visual idea of how this series approaches a language, typical sequences in the approximations to English have been constructed and are given below… 5. First order word approximation…Here words are chosen independently but with their ap- propriate frequencies. 
REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME CAN DIFFERENT NATURAL HERE HE THE IN CAME THE TO OF TO EXPERT GRAY COME TO FURNISHES THE LINE MESSAGE HAD BE THESE 6. Second order word approximation. The word transition probabilities are correct but no further structure is included. THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARAC- TER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED The resemblance to ordinary English text increases quite noticeably at each of the above steps…It appears then that a sufficiently complex stochastic source process will give a satis- factory representation of a discrete source. (93) Damerau (1971) confirms this trend in an experiment that involved generating 5th order approxima- tions. All these results are hard to interpret though, since (i) sparse data in generation will tend to yield near copies of portions of the source texts (on the sparse data problem, remember the results from Jelinek mentioned in 95, above), and (ii) human linguistic capabilities are not well reflected in typical texts. (94) Miller and Chomsky objection 1: The number of parameters to set is enormous. Notice that for a vocabulary of 100, 000 words, where each different word is emitted by a different event, we would need at least 100,000 states. The full transition matrix then has 100, 0002 = 1010 entries. Notice that the last column of the transition matrix is redundant, and so a 109 matrix will do. Miller and Chomsky (1963, p430) say: We cannot seriously propose that a child learns the value of 109 parameters in a childhood lasting only 108 seconds. Why not? This is very far from obvious, unless the parameters are independent, and there is no reason to assume they are. (95) Miller and Chomsky (1963, p430) objection 2: The amount of input required to set the parameters of a reasonable model is enormous. Jelinek (1985) reports that after collecting the trigrams from a 1,500,000 word corpus, he found that, in the next 300,000 words, 25% of the trigrams were new. No surprise! Some generalization across lexical combinations is required. In this context, the “general- ization” is sometimes achieved with various “smoothing” functions, which will be discussed later. With generalization, setting large numbers of parameters becomes quite conceivable. Without a better understanding of the issues, I find objection 2 completely unpersuasive. (96) Miller and Chomsky (1963, p425) objection 3: Since human messages have dependencies extending over long strings of symbols, we know that any pure Markov source must be too simple… 147 Stabler - Lx 185/209 2003 This is persuasive! Almost everyone agrees with this. The “n-gram” elaborations of the Markov models are not the right ones, since dependencies in human languages do not respect any principled bound (in terms of the number of words n that separate the dependent items). (97) Abney (1996a) says: Shannon himself was careful to call attention to precisely this point: that for any n, there will be some dependencies affecting the well-formedness of a sentence that an n-th order model does not capture. Is that true? Reading Shannon (1948), far from finding him careful on this point, I can find no mention at all of this now commonplace fact, that no matter how large n gets, we will miss some of the dependencies in natural languages. 
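Before turning to the controversies, here is a compact sketch of the forward calculation of §8.1.9 and the Viterbi calculation of §8.1.11, applied to the coffee machine of (66) with the outputs of (84). This is illustrative code, not code from the notes: it reads T(i,j) as the probability of moving from state i to state j, and O(i,k) as the probability of emitting the k'th output symbol in state i, and the values it prints are the ones determined by that reading.

INIT = [0.7, 0.2, 0.1]          # I of (66)
TRANS = [[0.7, 0.2, 0.1],       # T of (66): TRANS[i][j] = P(next state j | current state i)
         [0.1, 0.7, 0.2],
         [0.0, 0.0, 1.0]]
EMIT = [[0.8, 0.1, 0.1],        # O of (84): EMIT[i][k] = P(k'th output symbol | state i)
        [0.1, 0.8, 0.1],
        [0.2, 0.2, 0.6]]

def forward(outputs):
    # total probability of the output sequence, summing over all state sequences (8.1.9)
    alpha = [INIT[i] * EMIT[i][outputs[0]] for i in range(3)]
    for o in outputs[1:]:
        alpha = [sum(alpha[i] * TRANS[i][j] for i in range(3)) * EMIT[j][o]
                 for j in range(3)]
    return sum(alpha)

def viterbi(outputs):
    # most probable state sequence for the outputs, with its probability (8.1.11)
    best = [(INIT[j] * EMIT[j][outputs[0]], [j]) for j in range(3)]   # best path per end state
    for o in outputs[1:]:
        best = [max((best[i][0] * TRANS[i][j] * EMIT[j][o], best[i][1] + [j])
                    for i in range(3))
                for j in range(3)]
    return max(best)

s1, s2, s3 = 0, 1, 2
print(forward([s1, s3, s3]))    # probability of the output sequence s1 s3 s3, as in (84)
print(viterbi([s1, s3, s3]))    # most likely state sequence for s1 s3 s3, cf. exercise (90)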
148 Stabler - Lx 185/209 2003 8.1.13 Controversies (98) Consider the following quote from Charniak (1993, p32): After adding the probabilities, we could call this a “probabilistic finite-state automaton,” but such models have different names in the statistical literature. In particular, that in figure 2.4 is called a Markov chain. .5 here .5 the .5 dog .5 ate .5 a .5 cat .5 slept .5 there Figure 2.4 A trivial model of a fragment of English Like finite-state automata, Markov chains can be thought of as acceptors or generators. How- ever, associated with each arc is the probability of taking that arc given that one is in the state at the tail of the arc. Thus the numbers associated with all of the arcs leaving a state must sum to one. The probability then of generating a given string in such a model is just the product of the probabilities of the arcs traversed in generating the string. Equivalently, as an acceptor the Markov chain assigns a probability to the string it is given. (This only works if all states are accepting states, something standardly assumed for Markov processes.) This paragraph is misleading with respect to one of the fundamental points in this set of notes: The figure shows a finite automaton, not a Markov chain or Markov model. While Markov models are similar to probabilistic finite automata in the sense mentioned in (78), Markov chains are not like finite automata, as we saw in (77). In particular, Markov models can define output sequences with (a finite number of) unbounded depen- dencies, but Markov chains define only state sequences with the Markovian requirement that blocks non-adjacent dependencies. (99) Charniak (1993, p39) says: One of the least sophisticated but most durable of the statistical models of English is the n- gram model. This model makes the drastic assumption that only the previous n-1 words have any effect on the probabilities for the next word. While this is clearly false, as a simplifying assumption it often does a serviceable job. Serviceable? What is the job?? For the development of the science and of the field, the question is: how can we move towards a model that is not “clearly false.” 149 Stabler - Lx 185/209 2003 (100) Abney (1996a, p21) says: In fact, probabilities make Markov models more adequate then their non-probabilistic coun- terparts, not less adequate. Markov models are surprisingly effective, given their finite-state substrate. For example, they are the workhorse of speech recognition technology. Stochastic grammars can also be easier to learn than their non-stochastic counterparts… We might agree about the interest of (non-finite state) stochastic grammars. Certainly, developing stochastic grammars, one of the main questions is: which grammars, which structural relations do we find in human languages? This is the traditional focus of theoretical linguistics. As for the stochastic influences, it is not yet clear what they are, or how revealing they will be. As for the first sentence in this quoted passage, and the general idea that we can develop good stochastic models without attention to the expressive capabilities of the “substrate,” you decide. (101) It is quite possible that “lexical activation” is sensitive to word co-occurrence frequencies, and this might be modeled with a probabilistic finite automaton (e.g. a state-labeled Markov model or a standard, transition-labeled probabilistic fsa). 
The problem of detecting stochastic influences in the grammar itself depends on knowing what parts of the grammar depend on the lexical item. In CFGs, for example, we get only a simple category for each word, but in lexicalized TAGs, and in recent transformational grammars, the lexical item can provide a rich specification of its role in derivations. 150 Stabler - Lx 185/209 2003 8.1.14 Probabilistic finite automata and regular grammars Finite (or “finite state”) automata (FSAs) are usually defined by associating “emitted” vocabulary elements with transitions between non-emitting states. These automata can be made probabilistic by distributing probabilities across the various transitions from each state (counting termination as, e.g., a special transition to a “stop” state). Finite, time-invariant Markov models (FMMs) are defined by (probabilistic) transitions between states that themselves (probabilistically) emit vocabulary elements. We can specify a “translation” between FSAs and FMMs. (102) Recall that a finite state automaton canbedefinedwitha5-tupleA =Q, Σ,δ,I,F where Q is a finite set of states ( =∅); Σ is a finite set of symbols ( =∅); δ : Q × Σ → 2Q, I ⊆ Q, the initial states; F ⊆ Q, the final states. We allow Σ to contain the empty string . (103) Identifying derivations by the sequence of productions used in a leftmost derivation, and assuming that all derivations begin with a particular “start” category, we can distribute probabilities over the set of possible rules that rewrite each category. This is a probabilistic finite state automaton. (We can generalize this to the case where there is more than one initial state by allowing an initial vector that determines a probability distibution across the initial states.) (104) As observed in (16) on page 31, a language is accepted by a FSA iff it is generated by some right linear grammar. (105) Exercise 1. Define an ambiguous, probabilistic right linear grammar such that, with the prolog top-down GLC parser, no ordering of the clauses will be such that parses will always be returned in order of most probable first. 2. Implement the probabilistic grammar defined in the previous step, annotating the categories with features that specify the probability of the parse, and run a few examples to illustrate that the more probable parses are not always being returned first. (106) SFSA→FMM correspondence 1. Insert emitting states Given finite automaton A =Q, Σ,δ,I,F,firstdefineA =Q, Σ,δ,I,F as follows: We define new states corresponding to each transition of the original automaton: Q = Q ∪{q1.a.q2| q2 ∈ δ(a, q1)} We then interrupt each a-transition from q1toq2 with an empty transition to q1.a.q2, so that all transitions become empty: The probability P of the transition q1,a P q2inA is associated with the new transition q1 P q1.a.q2, and the new transition q1.a.q2 P =1 q2 has probability 1. 2. Eliminate non-emitting states Change qi.a.qj P =1 qk P =p qk.b.ql to qi.a.qj P =p qk.b.ql 151 Stabler - Lx 185/209 2003 3. If desired, add a final absorbing state XXX This should be filled out, and results established (107) Sketch of a FMM→SFSA correspondence 1. Insert non-emitting states For each a emitted with non-zero probability Pa by qi,andforeachqj which has a non-zero proba- bility Pij of following qi, introduce new state qi.qj with the new FSA arcs: (qi,a) Pa qi.qj (qi.qj, ) Pij qj 2. 
Special treatment for initial and final states
XXX This should be filled out, and results established.
8.1.15 Information and entropy
(108) Suppose |ΩX| = 10, where these events are equally likely and partition Ω. If we find out that X = a, how much information have we gotten? 9 possibilities are ruled out. The possibilities are reduced by a factor of 10. But Shannon (1948, p32) suggests that a more natural measure of the amount of information is the number of "bits." (A name from J.W. Tukey? Is it an acronym for BInary digiT?) How many binary decisions would it take to pick one element out of the 10? We can pick 1 out of 8 with 3 bits; 1 out of 16 with 4 bits; so 1 out of 10 with 4 (and a little redundancy). More precisely, the number of bits we need is log2(10) ≈ 3.32.
Exponentiation and logarithms review
k^m · k^n = k^(m+n)
k^0 = 1
k^(−n) = 1/k^n
a^m / a^n = a^(m−n)
y = log_k x iff k^y = x
log_k(k^x) = x, since k^x = k^x
and so: log_k k = 1
and: log_k 1 = 0
log_k(M/N) = log_k M − log_k N
log_k(MN) = log_k M + log_k N
so, in general: log_k(M^p) = p · log_k M
and we will use: log_k(1/x) = log_k x^(−1) = −1 · log_k x = −log_k x
E.g. 512 = 2^9 and so log2 512 = 9. And log10 3000 = log10(3 · 10^3) ≈ 3.48. And 5^(−2) = 1/25, so log5(1/25) = −2.
We'll stick to log2 and "bits," but another common choice is log_e, where
e = lim_{x→0} (1 + x)^(1/x) = Σ_{n=0}^{∞} 1/n! ≈ 2.7182818284590452
Or, more commonly, e is defined as the x such that a unit area is found under the curve 1/u from u = 1 to u = x, that is, it is the positive root x of ∫_1^x (1/u) du = 1. This number is irrational, as shown by the Swiss mathematician Leonhard Euler (1707-1783), in whose honor we call it e. In general, e^x = Σ_{k=0}^{∞} x^k/k!. And furthermore, as Euler discovered, e^(π√−1) + 1 = 0. This is sometimes called the most famous of all formulas (Maor, 1994). It's not, but it's amazing.
Using log2 gives us "bits," log_e gives us "nats," and log10 gives us "hartleys."
It will be useful to have images of some of the functions that will be defined in the next couple of pages.
[Figure: log2 x and −log2 x on (0,1); surprisal as a function of p(A): −log p(A).]
octave makes drawing these graphs a trivial matter. The graph above is drawn with the command:
>x=(0.01:0.01:0.99)';data = [x,log2(x)];gplot [0:1] data
[Figure: entropy of p(A) as a function of p(A): −p(A) log p(A).]
>x=(0.01:0.01:0.99)';data = [x,(-x .* log2(x))];gplot [0:1] data
[Figure: entropy of 1 − p(A) as a function of p(A): −(1 − p(A)) log(1 − p(A)).]
>x=(0.01:0.01:0.99)';data = [x,(-(1-x) .* log2(1-x))];gplot [0:1] data
[Figure: sum of the previous two: −p(A) log p(A) − (1 − p(A)) log(1 − p(A)).]
>x=(0.01:0.01:0.99)';data = [x,(-x .* log2(x))-(1-x).*log2(1-x)];gplot [0:1] data
(109) If the outcomes of the binary decisions are not equally likely, though, we want to say something else. The amount of information (or "self-information" or the "surprisal") of an event A is
i(A) = log(1/P(A)) = −log P(A)
So if we have 10 possible events with equal probabilities of occurrence, so P(A) = 0.1, then
i(A) = log(1/0.1) = −log 0.1 ≈ 3.32
(110) The simple cases still work out properly.
(110) The simple cases still work out properly. In the easiest case where probability is distributed uniformly across 8 possibilities in Ω_X, we would have exactly 3 bits of information given by the occurrence of a particular event A:
i(A) = log(1/0.125) = −log 0.125 = 3
The information given by the occurrence of ∪Ω_X, where P(∪Ω_X) = 1, is zero:
i(∪Ω_X) = log(1/1) = −log 1 = 0
And obviously, if events A, B ∈ Ω_X are independent, that is, P(AB) = P(A)P(B), then
i(AB) = log(1/P(AB)) = log(1/(P(A)P(B))) = log(1/P(A)) + log(1/P(B)) = i(A) + i(B)

(111) However, in the case where Ω_X = {A, B} where P(A) = 0.1 and P(B) = 0.9, we will still have
i(A) = log(1/0.1) = −log 0.1 ≈ 3.32
That is, this event conveys more than 3 bits of information even though there is only one other option. The information conveyed by the other event is
i(B) = log(1/0.9) ≈ 0.15

Entropy

(112) Often we are interested not in the information conveyed by a particular event, but in the information conveyed by an information source:
…from the point of view of engineering, a communication system must face the problem of handling any message that the source can produce. If it is not possible or practicable to design a system which can handle everything perfectly, then the system should handle well the jobs it is most likely to be asked to do, and should resign itself to be less efficient for the rare task. This sort of consideration leads at once to the necessity of characterizing the statistical nature of the whole ensemble of messages which a given kind of source can and will produce. And information, as used in communication theory, does just this. (Weaver, 1949, p14)

(113) For a source X, the average information of an arbitrary outcome in Ω_X is
H = Σ_{A∈Ω_X} P(A) i(A) = −Σ_{A∈Ω_X} P(A) log P(A)
This is sometimes called the "entropy" of the random variable – the average number of bits per event (Charniak, 1993, p29). So called because each P(A) gives us the "proportion" of times that A occurs.

(114) For a source X producing an infinite sequence of events X1, X2, ..., the entropy or average information of the source is usually given as the limiting per-event average:
H(X) = lim_{n→∞} G_n / n
where
G_n = −Σ_{A1∈Ω_X} Σ_{A2∈Ω_X} ... Σ_{An∈Ω_X} P(X1=A1, X2=A2, ..., Xn=An) log P(X1=A1, X2=A2, ..., Xn=An)

(115) When the space Ω_X consists of independent time-invariant events whose union has probability 1, then
G_n = −n Σ_{A∈Ω_X} P(A) log P(A),
and so the entropy or average information of the source can be given in the following way:
H(X) = Σ_{A∈Ω_X} P(A) i(A) = −Σ_{A∈Ω_X} P(A) log P(A)
Charniak (1993, p29) calls this the per word entropy of the process.

(116) If we use some measure other than bits, a measure that allows r-ary decisions rather than just binary ones, then we can define H_r(X) similarly except that we use log_r rather than log2.

(117) Shannon shows that this measure of information has the following intuitive properties (as discussed also in the review of this result in Miller and Chomsky (1963, pp432ff)):
a. Adding any number of impossible events to Ω_X does not change H(X).
b. H(X) is a maximum when all the events in Ω_X are equiprobable. (See the last graph on page 154.)
c. H(X) is additive, in the sense that H(Xi ∪ Xj) = H(Xi) + H(Xj) when Xi and Xj are independent.
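As a quick illustration of (113) (and of properties (117a) and (117b)), here is a small prolog sketch of the entropy of a finite distribution given as a list of probabilities; entropy/2 is an illustrative helper, not part of the course code.

% entropy(Probs, H): H is the entropy, in bits, of the distribution Probs.
% Zero-probability events contribute nothing, as in (117a).
entropy([], 0).
entropy([P|Ps], H) :-
    entropy(Ps, H1),
    ( P =:= 0 -> H = H1 ; H is H1 - P * (log(P) / log(2)) ).

% ?- entropy([0.5, 0.5], H).         gives H = 1.0  (maximal for two events)
% ?- entropy([0.1, 0.9], H).         gives H ≈ 0.47
% ?- entropy([0.25, 0.25, 0.5], H).  gives H = 1.5  (the source considered later in (126))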
(118) We can, of course, apply this notion of average information, or entropy, to a Markov chain X. In the simplest case, where the events are independent and identically distributed,
H(X) = Σ_{qi∈Ω_X} P(qi) H(qi)

Cross-entropy, mutual information, and related things

(119) How can we tell when a model of a language user is a good one? One idea is that the better models are those that maximize the probability of the evidence, that is, that minimize the entropy of the evidence. Let's consider how this idea could be formalized. One prominent proposal uses the measure "per word cross entropy."

(120) First, let's reflect on the proposal for a moment. Charniak (1993, p27) makes the slightly odd suggestion that one of the two great virtues of probabilistic models is that they have an answer to question (119). (The first claim he makes for statistical models is, I think, that they are "grounded in real text" and "usable" - p.xviii.)
The second claim we made for statistical models is that they have a ready-made figure of merit that can be used to compare models, the per word cross entropy assigned to the sample text.
Consider the analogous criterion for non-stochastic models, in which sentences are not more or less probable, but rather are either in the defined language or not. We could say that the better models are those that define languages that include more of the sentences in a textual corpus. But we do not really want to say this if the corpus contains strange things that are there for non-linguistic reasons: typos, interrupted utterances, etc. And on the other hand, we could say that the discrete model should also be counted as better if most of the expressions that are not in the defined language do not occur in the evidence. But we do not want to say this either. First, many sentences that are in the language will not occur for non-linguistic reasons (e.g. they describe events which never occur and which have no entertainment value for us). In fact, there are so many sentences of this sort that it is common here to note a second point: if the set of sentences allowed by the language is infinite, then there will always be infinitely many sentences in the language that never appear in any finite body of evidence.
Now the interesting thing to note is that moving to probabilistic models does not remove the worries about the corresponding probabilistic criterion! Taking the last worry first, since it is the most serious: some sentences will occur never or seldom for purely non-linguistic and highly contingent reasons (i.e. reasons that can in principle vary wildly from one context to another). It does not seem like a good idea to try to incorporate some average probability of occurrence into our language models. And the former worry also still applies: it does not seem like a good idea to assume that infrequent expressions are infrequent because of properties of the language.
The point is: we cannot just assume that having these things in the model is a good idea. On the contrary, it does not seem like a good idea, and if it turns out to give a better account of the language user, that will be a significant discovery. In my view, empirical study of this question has not yet settled the matter.
It has been suggested that frequently co-occurring words become associated in the mind of the language user, so that they activate each other in the lexicon, and may as a result tend to co-occur in the language user's speech and writing. This proposal is well supported by the evidence. It is quite another thing to propose that our representation of our language models the relative frequencies of sentences in general. In effect, the representation of the language would then contain a kind of model of the world, a model according to which our knowledge of the language tells us such things as the high likelihood of "President Clinton," "Camel cigarettes," "I like ice cream" and "of the," and the relative unlikelihood of "President Stabler," "Porpoise cigarettes," "I like cancer" and "therefore the." If that is true, beyond the extent predicted by simple lexical associations, that will be interesting.
One indirect argument for stochastic models of this kind could come from the presentation of a theory of human language acquisition based on stochastic grammars.
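For concreteness, here is a minimal prolog sketch of the "per word cross entropy" figure of merit mentioned in (119)-(120): the average number of bits the model assigns to each word of a sample text. It assumes, purely for illustration, a hypothetical predicate model_prob/2 giving the model's context-independent probability for each word; a real evaluation would condition each word on its context.

% cross_entropy(Words, H): H is the average number of bits per word that the
% model (via the hypothetical model_prob/2) assigns to the sample Words.
cross_entropy(Words, H) :-
    total_bits(Words, Bits),
    length(Words, N), N > 0,
    H is Bits / N.

total_bits([], 0).
total_bits([W|Ws], Bits) :-
    model_prob(W, P), P > 0,
    total_bits(Ws, Rest),
    Bits is Rest - log(P) / log(2).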
8.1.16 Codes

(121) Shannon considers the information in a discrete, noiseless message. Here, the space of possible events Ω_X is given by an alphabet (or "vocabulary") Σ. A fundamental result is Shannon's result that the entropy of the source sets a lower bound on the size of the messages. We present this result in (129) below, after setting the stage with the basic ideas we need.

(122) Sayood (1996, p26) illustrates some basic points about codes with some examples. Consider:

message      code 1   code 2   code 3   code 4
a            0        0        0        0
b            0        1        10       01
c            1        00       110      011
d            10       11       111      0111
avg length   1.125    1.125    1.75     1.875

Notice that baa in code 2 is 100. But 100 is also the encoding of bc. We might like to avoid this. Codes 3 and 4 have the nice property of unique decodability. That is, the map from message sequences to code sequences is 1-1.

(123) Consider encoding the sequence
9 11 11 11 14 13 15 17 16 17 20 21
a. To transmit these numbers in binary code, we would need 5 bits per element.
b. To transmit the 9 different values that occur – 9, 11, 13, 14, 15, 16, 17, 20, 21 – we could hope for a somewhat better code! 4 bits would be more than enough.
c. An even better idea: notice that the sequence is close to the function f(n) = n + 8 for n ∈ {1, 2, ...}. The perturbation or residual X_n − f(n) = 0, 1, 0, −1, 1, −1, 0, 1, −1, −1, 1, 1, so it suffices to transmit the perturbation, which only requires two bits per element.

(124) Consider encoding the sequence
27 28 29 28 26 27 29 28 30 32 34 36 38
This sequence does not look quite so regular as the previous case. However, each value is near the previous one, so one strategy is to let your receiver know the starting point and then send just the changes:
(27) 1 1 -1 -2 1 2 -1 2 2 2 2 2

(125) Consider the following sequence of 41 elements, generated by a probabilistic source:
axbarayaranxarrayxranxfarxfaarxfaaarxaway
There are 8 symbols here, so we could use 3 bits per symbol. On the other hand, we could use the following variable length code:
a 1
x 001
b 01100
f 0100
n 0111
r 000
w 01101
y 0101
With this code we need only about 2.58 bits per symbol.

(126) Consider
12123333123333123312
Here we have P(1) = P(2) = 1/4 and P(3) = 1/2, so the entropy is 1.5 bits per symbol. The sequence has length 20, so we should be able to encode it with 30 bits.
However, consider blocks of 2: P(12) = 1/2, P(33) = 1/2, and the entropy is 1 bit per block. For the sequence of 10 blocks of 2, we need only 10 bits. So it is often worth looking for structure in larger and larger blocks.

8.1.17 Kraft's inequality and Shannon's theorem

(127) MacMillan: If uniquely decodable code C has K codewords of lengths l1, ..., lK, then
Σ_{i=1}^{K} 2^{−li} ≤ 1.

(128) Kraft (1949): If a sequence l1, ..., lK satisfies the previous inequality, then there is a uniquely decodable code C that has K codewords of lengths l1, ..., lK.
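As an illustration of (127) and (128), not from the text, here is a tiny prolog check of the Kraft/MacMillan condition for a list of codeword lengths; kraft_sum/2 and kraft_ok/1 are illustrative helpers.

% kraft_sum(Lengths, S): S is the sum of 2^(-L) over the codeword lengths.
kraft_sum([], 0).
kraft_sum([L|Ls], S) :- kraft_sum(Ls, S1), S is S1 + 2.0 ** (-L).

kraft_ok(Lengths) :- kraft_sum(Lengths, S), S =< 1.

% For the codes in (122):
% ?- kraft_ok([1,2,3,3]).   succeeds (code 3: sum = 1)
% ?- kraft_ok([1,2,3,4]).   succeeds (code 4: sum = 0.9375)
% ?- kraft_ok([1,1,2,2]).   fails (code 2: sum = 1.5), so by (127) code 2 cannot be uniquely decodable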
(129) Shannon's theorem. Using the definition of H_r in (116), Shannon (1948) proves the following famous theorem, which specifies the information-theoretic limits of data compression: Suppose that X is a first order source with outcomes (or outputs) Ω_X. Encoding the characters of Ω_X in a code with characters Γ, where |Γ| = r > 1, requires an average of at least H_r(X) characters of Γ per character of Ω_X. Furthermore, for any real number ε > 0, there is a code that uses an average of at most H_r(X) + ε characters of Γ per character of Ω_X.

8.1.18 String edits and other varieties of sequence comparison

Overviews of string edit distance methods are provided in Hall and Dowling (1980) and Kukich (1992). Masek and Paterson (1980) present a fast algorithm for computing string edit distances. Ristad (1997) and Ristad and Yianilos (1996) consider the problem of learning string edit distances.

8.2 Probabilistic context free grammars and parsing

8.2.1 PCFGs

(130) A probabilistic context free grammar (PCFG) G = ⟨Σ, N, (→), S, P⟩, where
1. Σ, N are finite, nonempty sets,
2. S is some symbol in N,
3. the binary relation (→) ⊆ N × (Σ ∪ N)* is also finite (i.e. it has finitely many pairs),
4. the function P : (→) → [0, 1] maps productions to real numbers in the closed interval between 0 and 1 in such a way that, for each category c ∈ N,
Σ_{β: (c→β)∈(→)} P(c → β) = 1.
We will often write the probabilities assigned to each production on the arrow: c →p β.

(131) The probability of a parse is the product of the probabilities of the productions in the parse.

(132) The probability of a string is the sum of the probabilities of its parses.

8.2.2 Stochastic CKY parser

(133) We have extended the CKY parsing strategy so that it can handle any CFG, and augmented the chart entries so that they indicate the rule used to generate each item and the positions of internal boundaries. We still have a problem getting the parses out of the chart, since there can be too many of them: we do not want to take them out one at a time! One thing we can do is to extract just the most probable parse. An equivalent idea is to make all the relevant comparisons before adding an item to the chart.

(134) For any input string, the CKY parser chart represents a grammar that generates only the input string. We can find the most probable parse using the trellis-like construction familiar from Viterbi's algorithm.

(135) For any positions 0 ≤ i, j ≤ |input|, we can find, for each category A, the item (i, A, j, p) with maximal probability p:

[axiom]    (i − 1, a_i, i), for a_i the i-th symbol of the input

[reduce1]  from (i, a, j), infer (i, A, j, p),
           if A →p a and ¬∃ A →p′ a such that p′ > p

[reduce2]  from (i, B, j, p1) and (j, C, k, p2), infer (i, A, k, p1·p2·p3),
           if A →p3 B C and ¬∃ (i, B′, j, p1′), (j, C′, k, p2′), A →p3′ B′ C′
           such that p1′·p2′·p3′ > p1·p2·p3

(136) This algorithm does (approximately) as many comparisons of items as the non-probabilistic version, since the reduce rules require identifying the most probable items of each category over each span of the input. To reduce the chart size, we need to restrict the rules so that we do not get all the items in there – and then there is a risk of missing some analyses.
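Before turning to assessment, here is a brute-force prolog sketch, added for illustration (it is not the chart-based method of (135)), that computes the probability of the best parse of a string under a PCFG in Chomsky normal form, directly following (131): the probability of each parse is the product of its rule probabilities, and we take the maximum over parses. The rule/3 representation and the example rules are illustrative assumptions.

% rule(Parent, Daughters, Prob), e.g. rule(vp, [v0, dp], 0.7). rule(v0, [saw], 0.7).
parse_prob(A, [W], P) :- rule(A, [W], P).
parse_prob(A, String, P) :-
    append(Left, Right, String), Left \= [], Right \= [],
    rule(A, [B, C], P0),
    parse_prob(B, Left, P1),
    parse_prob(C, Right, P2),
    P is P0 * P1 * P2.

% best_parse_prob(A, String, Best): Best is the probability of the most probable parse.
best_parse_prob(A, String, Best) :-
    findall(P, parse_prob(A, String, P), Ps), Ps \= [], max_list(Ps, Best).

This enumerates every parse, so it is exponential in the worst case; the chart-based rules in (135) avoid that by keeping only the best item per category and span.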
8.2.3 Assessment

(137) Consistent/proper probabilities over languages: So far we have been thinking of the derivation steps defined by a grammar or automaton as events which occur with a given probability. But in linguistics, the grammar is posited by the linguist as a model of the language. That is, the language is the space of possible outcomes that we are interested in. Keeping this in mind, we see that we have been very sloppy so far! Our probabilistic grammars will often fail to assign probability 1 to the whole language, as for example in this trivial case (labeling arcs with output-symbol/probability): a two-state automaton in which state 0 loops to itself on a/1 and reaches the final state 1 on a/0.
Intuitively, in this case, the whole probability mass is lost to infinite derivations. Clearly, moving away from this extreme case, there is still a risk of losing some of the probability mass to infinite derivations, meaning that, if the outcome space is really the language (with finite derivations), we are systematically underestimating the probabilities there. This raises the question: when does a probabilistic automaton or grammar provide a "consistent", or "proper" probability measure over the language generated?

(138) PCFGs cannot reliably decide attachment ambiguities like the following:
a. I bought the lock with the two keys
b. I bought the lock with the two dollars
Obviously, the problem is that the structure of PCFGs is based on the (false!) assumption that (on the intended and readily recognized readings) the probability of the higher and lower attachment expansions is independent of which noun is chosen at the end of the sentence. It is interesting to consider how much of the ambiguity of Abney's examples could be resolved by PCFGs.

(139) The previous example also shows that essentially arbitrary background knowledge can be relevant to determining the intended structure for a sentence.

8.3 Multiple knowledge sources

(140) When a language user recognizes what has been said, it is clear that various sorts of information are used. Given an ambiguous acoustic signal (which is certainly very common!), various hypotheses about the words of the utterance will fit the acoustic data more or less well. These hypotheses can also vary with respect to whether they are syntactically well-formed, semantically coherent, and pragmatically plausible. To study how recognition could work in this kind of setting, let's return to simplified problems like the one that was considered in exercise (3) on page 32.

(141) Suppose, for example, that we have heard "I know John saw Mary" clearly pronounced in a setting with no significant background noise. Leaving aside quote names and neologisms, the acoustic evidence might suggest a set of possibilities that can be pictured as a word lattice over states 0-7: competing arcs labeled I-DP and eye-N from 0 to 1, know-V and no-Neg from 1 to 2, John-DP and john-N from 2 to 3, and then saw-V, sought-V, some-D, Mary-DP, merry-A, mare-N, air-N, airy-A, summery-A and E-N on the paths from 3 through states 4, 5 and 6 to the final state 7.
The acoustic evidence will support these various hypotheses differentially, so suppose that our speech processor ranks the hypotheses by weighting the arcs: I-DP/0.5, eye-N/0.5, know-V/0.5, no-Neg/0.5, John-DP/0.5, john-N/0.5, saw-V/0.5, sought-V/0.1, some-D/0.15, Mary-DP/0.4, merry-A/0.4, mare-N/0.2, air-N/0.5, airy-A/0.5, summery-A/0.1, E-N/1, with final state 7 having stopping probability 1.
What does this model above suggest as the value of P(I-DP know-V John-DP saw-V Mary-DP | acoustic signal)?
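The weighted lattice invites an obvious calculation, sketched here in prolog as an illustration (not part of the course code): the probability of one hypothesis is the product of the weights along its path. The arc/4 facts below encode only a fragment of the lattice above.

% arc(FromState, ToState, Word-Category, Prob): a fragment of the lattice in (141).
arc(0, 1, 'I'-dp, 0.5).
arc(1, 2, know-v, 0.5).
arc(2, 3, 'John'-dp, 0.5).

path_prob(Q, Q, [], 1.0).
path_prob(Q0, Qn, [W|Ws], P) :-
    arc(Q0, Q1, W, P1),
    path_prob(Q1, Qn, Ws, P2),
    P is P1 * P2.

% ?- path_prob(0, 3, ['I'-dp, know-v, 'John'-dp], P).   gives P = 0.125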
(142) We could have a stochastic grammar that recognizes some of these possible sentences too.

/*
 * file: g6rdin.pl
 */
:- op(1200,xfx,:˜).

s :˜ [ip,terminator]/1.
terminator :˜ ['.']/1.
ip :˜ [dp, i1]/1.
i1 :˜ [i0, vp]/1.0.
i0 :˜ [will]/0.4.
i0 :˜ []/0.6.
dp :˜ ['I']/0.1.
dp :˜ ['Mary']/0.1.
dp :˜ ['John']/0.1.
dp :˜ [dp,ap]/0.1.
dp :˜ [d1]/0.6.
d1 :˜ [d0, np]/1.0.
d0 :˜ [the]/0.55.
d0 :˜ [a]/0.44.
d0 :˜ [some]/0.01.
np :˜ [n1]/1.
n1 :˜ [n0]/1.
n0 :˜ [saw]/0.2.
n0 :˜ [eye]/0.2.
n0 :˜ [air]/0.2.
n0 :˜ [mare]/0.2.
n0 :˜ ['E']/0.2.
vp :˜ [v1]/1.
v1 :˜ [v0,dp]/0.7.
v1 :˜ [v0,ip]/0.3.
v0 :˜ [know]/0.3.
v0 :˜ [saw]/0.7.
ap :˜ [a1]/1.
a1 :˜ [a0]/1.
a0 :˜ [summery].
a0 :˜ [airy].
a0 :˜ [merry].

startCategory(s).

What does the PCFG suggest as the value of P(I-DP know-V John-DP saw-V Mary-DP | Grammar)? How should these estimates of the two models be combined?

(143) What is P_combined(I-DP know-V John-DP saw-V Mary-DP)?
a. backoff: prefer the model which is regarded as more reliable (e.g. the most specific one, the one based on the largest n-gram, …). But this just ignores all but one model.
b. interpolate: use a weighted average of the different models. But the respective relevance of the various models can vary over the domain. E.g. the acoustic evidence may be reliable when it has just one candidate, but not when various candidates are closely ranked.
c. maximum entropy (roughly sketched):
i. rather than requiring P(I-DP know-V John-DP saw-V Mary-DP | acoustic signal) = k1, use a corresponding "constraint" which specifies the expected value of P_combined(I-DP know-V John-DP saw-V Mary-DP).38 So we could require, for example,
E(P_combined(I-DP know-V John-DP saw-V Mary-DP | acoustic signal)) = k1.
We can express all requirements in this way, as constraint functions.
ii. if these constraint functions do not contradict each other, they will typically be consistent with infinitely many probability distributions, so which distribution should be used? Jaynes' idea is:39 let p_combined be the probability distribution with the maximum entropy. Remarkably, there is exactly one probability distribution with maximum entropy, so our use of the definite article here is justified! It is the probability distribution that is as uniform as possible given the constraints.
Maximum entropy models appear to be more successful in practice than any other known model combination methods. See, e.g., Berger, Della Pietra, and Della Pietra (1996), Rosenfeld (1996), Jelinek (1999, §§13,14), Ratnaparkhi (1998), all based on basic ideas from Jaynes (1957), Kullback (1959).

38 The "expectation" E(X) of random variable X is Σ_{x∈rng(X)} x·p(x). For example, given a fair 6-sided die with outcomes {1, 2, 3, 4, 5, 6},
E(X) = Σ_{i=1}^{6} i · (1/6) = (1 · 1/6) + (2 · 1/6) + ... + (6 · 1/6) = 7/2.
Notice that the expectation is not a possible outcome. It is a kind of weighted average. The dependence of expectation on the probability distribution is often indicated with a subscript E_p(X) when the intended distribution is not obvious from context.
39 Jelinek (1999, p220) points out that, if there is some reason to let the default assumption be some distribution other than the uniform one, this framework extends straightforwardly to minimize the difference from an arbitrary distribution.

8.4 Next steps
a. Important properties of PCFGs and of CFGs with other distributions are established in Chi (1999).
b. Train a stochastic context free grammar with a "treebank," etc, and then "smooth" to handle the sparse data problem: Chen and Goodman (1998).
c. Transform the grammar to carry lexical particulars up into categories: Johnson (1999), Eisner and Satta (1999).
d. Instead of finding the very best parse, use an "n-best" strategy: Charniak, Goldwater, and Johnson (1998) and many others.
e. Probabilistic Earley parsing and other strategies: Stolcke (1995), Magerman and Weir (1992).
f. Stochastic unification grammars: Abney (1996b), Johnson et al. (1999).
g. Use multiple information sources: Ratnaparkhi (1998).
h. Parse mildly context sensitive grammars: slightly more powerful than context free, but much less powerful than unification grammars and unrestricted rewrite grammars. Then, stochastic versions of these grammars can be considered.
We will pursue the last of these topics first, returning to some of the other issues later.

9 Beyond context free: a first small step

(1) Many aspects of language structure seem to be slightly too complex to be captured with context free grammars. Many different, more powerful grammar formalisms have been proposed, but almost all of them lie in the class that Joshi and others have called "mildly context sensitive" (Joshi, Vijay-Shanker, and Weir, 1991; Vijay-Shanker, Weir, and Joshi, 1987). This class includes certain Tree Adjoining Grammars (TAGs), certain Combinatory Categorial Grammars (CCGs), and also a certain kind of grammar with movements (MGs) that has been developed since 1996 by Cornell, Michaelis, Stabler, Harkema, and others. It also includes a number of other approaches that have not gotten so much attention from linguists: Linear Context Free Rewrite Systems (Weir, 1988), Multiple Context Free Grammars (Seki et al., 1991), Simple Range Concatenation Grammars (Boullier, 1998). As pointed out by Michaelis, Mönnich, and Morawietz (2000), these grammars are closely related to literal movement grammars (Groenink, 1997), local scattered context languages (Greibach and Hopcroft, 1969; Abramson and Dahl, 1989), string generating hyperedge replacement grammars (Engelfriet, 1997), deterministic tree-walking tree-to-string transducers (Kolb, Mönnich, and Morawietz, 1999), yields of images of regular tree languages under finite-copying top-down tree transductions, and more! The convergence of so many formal approaches on this neighborhood might make one optimistic about what could be found here.
These methods can all be regarded as descending from Pollard's (1984) insight that the expressive power of context-free-like grammars can be enhanced by marking one or more positions in a string where further category expansions can take place. The tree structures in TAGs and in transformational grammars play this role. This chapter will define MGs, which were inspired by the early "minimalist" work of Chomsky (1995), Collins (1997), and many others in the transformational tradition. There are various other formal approaches to parsing transformational grammar, but this one is the simplest.40

40 See, for example, Marcus (1980), Berwick and Weinberg (1984), Merlo (1995), Crocker (1997), Fong (1999), Yang (1999). Rogers and Kracht have developed elegant formal representations of parts of transformational grammar, but no processing model for these representations has been presented (Rogers, 1995; Rogers, 1999; Kracht, 1993; Kracht, 1995; Kracht, 1998).

9.1 "Minimalist" grammars

(2) Phrase structure: eliminating some redundancy. Verb phrases always have verbs in them; noun phrases have nouns in them. Nothing in the context free formalism enforces this:
1. cfg: (tree diagrams with nodes labeled clause, S, VP, V, O)
2. X-bar theory says that VPs have V "heads," but then the category of each lexical item gets repeated three times (V, V', VP; D, D', DP; etc.): (tree diagram: VP dominating S and V', with V' dominating V and O)
3. Bare grammar eliminates this redundancy in the labelling of the tree structures by labelling internal nodes with only > or <, which "point" to the daughter that has the head, as in the tree [> S [< V O]].
The development of restrictions on hierarchical (dominance) relations in transformational grammar is fairly well known. Very roughly, we could use cfg derivation trees (as base structures), but these can define many sorts of structures that we will never need, like verb phrases with no verbs in them. To restrict that class, x-bar theory requires that every phrase (or at least, every complex phrase) of category X is the projection of a head of category X. Bare grammar makes explicit the fact that lexical insertion must respect the category of the head. In other words, all features are lexical. Shown here is my notation, where the order symbols just "point" to the projecting subtree.

(3) "minimalist" grammars MG
• vocabulary Σ: every, some, student, … (non-syntactic, phonetic features)
• two types T: :: (lexical items), : (derived expressions)
• features F:
c, t, d, n, v, p, … (selected categories)
=c, =t, =d, =n, =v, =p, … (selector features)
+wh, +case, +focus, … (licensors)
-wh, -case, -focus, … (licensees)
• expressions E: trees with non-root nodes ordered by < or >
• lexicon: Σ* × {::} × F*, a finite set
• two structure building rules (partial functions on expressions):
• merge : (E × E) → E
For example, merging making::=d =d v with the tree [< the:d tortillas] checks the first =d against d and yields [< making:=d v [the tortillas]]; merging that result with Maria::d then yields [> Maria [< making:v [the tortillas]]]. (The original presents these two steps as tree diagrams; a simplified prolog sketch of the operation is given below.)
• move : E → E
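To make the merge examples concrete, here is a much-simplified prolog sketch, offered only as an illustration under simplifying assumptions and not as the course implementation: it ignores the :: / : type distinction, licensors/licensees and move itself, and it encodes the selector feature =f as sel(f) and the category feature f as cat(f).

% Lexical items: li(Phon, Features); derived trees: lt(A,B) for [< A B], gt(A,B) for [> A B].

feats(li(_, Fs), Fs).                 % the features of an expression are those of its head
feats(lt(L, _), Fs) :- feats(L, Fs).  % < points to the left daughter
feats(gt(_, R), Fs) :- feats(R, Fs).  % > points to the right daughter

check(li(P, [_|Fs]), li(P, Fs)).      % delete the first feature of the head
check(lt(L, R), lt(L1, R)) :- check(L, L1).
check(gt(L, R), gt(L, R1)) :- check(R, R1).

merge(A, B, lt(A1, B1)) :-            % a lexical selector takes its complement to the right
    A = li(_, [sel(F)|_]),
    feats(B, [cat(F)|_]),
    check(A, A1), check(B, B1).
merge(A, B, gt(B1, A1)) :-            % a derived selector takes its specifier to the left
    A \= li(_, _),
    feats(A, [sel(F)|_]),
    feats(B, [cat(F)|_]),
    check(A, A1), check(B, B1).

% ?- merge(li([making], [sel(d),sel(d),cat(v)]), li([tortillas], [cat(d)]), T1),
%    merge(T1, li(['Maria'], [cat(d)]), T2).
% T2 = gt(li(['Maria'],[]), lt(li([making],[cat(v)]), li([tortillas],[])))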