Major Parts of Speech
Recommended publications
-
Why Is Language Typology Possible?
Why is language typology possible? Martin Haspelmath
1 Languages are incomparable
Each language has its own system. Each language has its own categories. Each language is a world of its own.
2 Or are all languages like Latin?
nominative: the book; genitive: of the book; dative: to the book; accusative: the book; ablative: from the book
3 Or are all languages like English?
4 How could languages be compared?
If languages are so different: what could be possible tertia comparationis (= entities that are identical across comparanda and thus permit comparison)?
5 Three approaches
• Indeed, language typology is impossible (non-aprioristic structuralism)
• Typology is possible based on cross-linguistic categories (aprioristic generativism)
• Typology is possible without cross-linguistic categories (non-aprioristic typology)
6 Non-aprioristic structuralism: Franz Boas (1858-1942)
The categories chosen for description in the Handbook "depend entirely on the inner form of each language..." (Boas, Franz. 1911. Introduction to The Handbook of American Indian Languages.)
7 Non-aprioristic structuralism: Ferdinand de Saussure (1857-1913)
"dans la langue il n'y a que des différences..." (in a language there are only differences), i.e. all categories are determined by the ways in which they differ from other categories, and each language has different ways of cutting up the sound space and the meaning space. (de Saussure, Ferdinand. 1915. Cours de linguistique générale.)
8-10 Example: Datives across languages
cf. Haspelmath, Martin. 2003. The geometry of grammatical meaning: semantic maps and cross-linguistic comparison.
11 Non-aprioristic structuralism: Peter H. Matthews (University of Cambridge)
Matthews 1997: 199: "To ask whether a language 'has' some category is...to ask a fairly sophisticated question."
-
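The semantic-map reasoning behind the Haspelmath slides (a language-specific category must cover a connected region of a map of functions) can be sketched as a small graph check. The function inventory and the toy examples below are a simplified illustration, not Haspelmath's actual dative map:

```python
from collections import deque

# Toy semantic map for dative-like functions: nodes are functions,
# edges encode adjacency (simplified for illustration).
SEMANTIC_MAP = {
    "direction": {"recipient"},
    "recipient": {"direction", "beneficiary", "experiencer"},
    "beneficiary": {"recipient"},
    "experiencer": {"recipient"},
}

def is_connected(functions):
    """Connectivity hypothesis: the set of functions expressed by one
    category must form a connected subgraph of the semantic map."""
    functions = set(functions)
    if not functions:
        return True
    start = next(iter(functions))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in SEMANTIC_MAP[node] & functions:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == functions

# A marker covering direction and recipient occupies a connected region.
print(is_connected({"direction", "recipient"}))    # True
# A hypothetical marker covering direction and beneficiary but not
# recipient would violate the connectivity hypothesis.
print(is_connected({"direction", "beneficiary"}))  # False
```

The check is the typologist's comparison step made explicit: languages differ in which regions their categories cover, but the map itself is the shared tertium comparationis.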
Modeling Language Variation and Universals: a Survey on Typological Linguistics for Natural Language Processing
Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing
Edoardo Maria Ponti, Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Thierry Poibeau, Ekaterina Shutova, Anna Korhonen. 2018. Preprint hal-01856176, submitted on 9 Aug 2018. https://hal.archives-ouvertes.fr/hal-01856176
Affiliations: Edoardo Maria Ponti, Helen O'Horan, Ivan Vulić and Anna Korhonen (LTL, University of Cambridge); Yevgeni Berzak (Department of Brain and Cognitive Sciences, MIT); Roi Reichart (Faculty of Industrial Engineering and Management, Technion - IIT); Thierry Poibeau (LATTICE Lab, CNRS and ENS/PSL and Univ. Sorbonne nouvelle/USPC); Ekaterina Shutova (ILLC, University of Amsterdam).
Understanding cross-lingual variation is essential for the development of effective multilingual natural language processing (NLP) applications.
-
Urdu Treebank
Sci.Int.(Lahore), 28(4), 3581-3585, 2016. ISSN 1013-5316; CODEN: SINTE 8
URDU TREEBANK
Muddassira Arshad, Aasim Ali
Punjab University College of Information Technology (PUCIT), University of the Punjab, Lahore
[email protected], [email protected]
(Presented at the 5th International Multidisciplinary Conference, 29-31 Oct., at ICBS, Lahore)
ABSTRACT: A treebank is a parsed corpus of text annotated with syntactic information in the form of tags that yield the phrasal information within the corpus. The first outcome of this study is the design of a phrasal and functional tag set for Urdu, obtained by minimal changes to the tag set of the Penn Treebank, which has become a de facto standard. Our tag set comprises 22 phrasal tags and 20 functional tags. The second outcome of this work is the development of an initial treebank for Urdu, which remains publicly available to the research and development community. It contains 500 sentences of an Urdu translation of religious text, with an average length of 12 words. A context-free grammar containing 109 rules and 313 lexical entries has been extracted from this treebank. An online parser was used to verify the coverage of the grammar on 50 test sentences, with 80% recall; however, multiple parse trees are generated for each test sentence.
1 INTRODUCTION
The term "Treebank" was introduced in 2003. It is defined as a "linguistically annotated corpus that includes some grammatical analysis beyond the part of speech level" [1]. In the literature, various techniques have been applied for grammars for parsing [5], where statistical models are used for broad-coverage parsing [6].
-
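The abstract's observation that the extracted grammar produces multiple parse trees per test sentence is the classic ambiguity problem of context-free parsing. A minimal pure-Python CKY sketch over a toy English grammar (the actual 109-rule Urdu grammar is not reproduced here) shows how a single sentence gets two trees:

```python
from itertools import product

# Toy grammar in Chomsky normal form; a stand-in for illustration only.
RULES = [
    ("S", ("NP", "VP")),
    ("VP", ("V", "NP")),
    ("VP", ("VP", "PP")),   # PP attaches to the verb phrase
    ("NP", ("NP", "PP")),   # PP attaches to the object noun phrase
    ("PP", ("P", "NP")),
]
LEXICON = {"saw": "V", "girl": "NP", "boy": "NP", "telescope": "NP", "with": "P"}

def cky(words):
    """CKY parser returning all S-rooted trees; ambiguous sentences
    yield several trees, mirroring the behaviour noted in the paper."""
    n = len(words)
    # chart[i][j] holds (label, tree) pairs spanning words[i:j]
    chart = [[[] for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1].append((LEXICON[w], w))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (ll, lt), (rl, rt) in product(chart[i][k], chart[k][j]):
                    for head, rhs in RULES:
                        if rhs == (ll, rl):
                            chart[i][j].append((head, (head, lt, rt)))
    return [tree for label, tree in chart[0][n] if label == "S"]

trees = cky("girl saw boy with telescope".split())
print(len(trees))  # 2: the PP attaches either to the VP or to the object NP
```

Disambiguation between such trees is exactly where the statistical models mentioned in the introduction come in.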
Relational Nouns, Pronouns, and Resumption
Relational nouns, such as neighbour, mother, and rumour, present a challenge to synt…
Linguistics and Philosophy (2005) 28: 375–446. © Springer 2005. DOI 10.1007/s10988-005-2656-7
ASH ASUDEH
RELATIONAL NOUNS, PRONOUNS, AND RESUMPTION
ABSTRACT. This paper presents a variable-free analysis of relational nouns in Glue Semantics, within a Lexical Functional Grammar (LFG) architecture. Relational nouns and resumptive pronouns are bound using the usual binding mechanisms of LFG. Special attention is paid to the bound readings of relational nouns, how these interact with genitives and obliques, and their behaviour with respect to scope, crossover and reconstruction. I consider a puzzle that arises regarding relational nouns and resumptive pronouns, given that relational nouns can have bound readings and resumptive pronouns are just a specific instance of bound pronouns. The puzzle is: why is it impossible for bound implicit arguments of relational nouns to be resumptive? The puzzle is highlighted by a well-known variety of variable-free semantics, where pronouns and relational noun phrases are identical both in category and (base) type. I show that the puzzle also arises for an established variable-based theory. I present an analysis of resumptive pronouns that crucially treats resumptives in terms of the resource logic linear logic that underlies Glue Semantics: a resumptive pronoun is a perfectly ordinary pronoun that constitutes a surplus resource; this surplus resource requires the presence of a resumptive-licensing resource consumer, a manager resource. Manager resources properly distinguish between resumptive pronouns and bound relational nouns based on differences between them at the level of semantic structure. The resumptive puzzle is thus solved. The paper closes by considering the solution in light of the hypothesis of direct compositionality.
-
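The contrast the abstract builds on, a relational noun's implicit argument read as bound or as free, can be mimicked in a toy model: the implicit possessor either stays fixed or co-varies with a binder, like a bound pronoun. This is purely illustrative and is not Asudeh's Glue Semantics analysis:

```python
# Toy model: "neighbour" denotes a relation, so a bare "the neighbour"
# carries an implicit possessor argument. Individuals are invented.
people = ["ann", "bob"]
neighbour_of = {"ann": "bob", "bob": "ann"}

# Free reading: the implicit argument is fixed (here, to "ann"),
# like a referential pronoun.
free = [neighbour_of["ann"] for _ in people]

# Bound reading: "every x ... x's neighbour" -- the implicit argument
# co-varies with the binder, like a bound pronoun.
bound = [neighbour_of[x] for x in people]

print(free)   # ['bob', 'bob']
print(bound)  # ['bob', 'ann']
```

The puzzle in the abstract is then why only the pronoun-like member of this pair can ever be a resumptive.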
Chapter 1 Basic Categorial Syntax
Hardegree, Compositional Semantics
Chapter 1: Basic Categorial Syntax
1. The Task of Grammar
2. Artificial versus Natural Languages
3. Recursion
4. Category-Governed Grammars
5. Example Grammar – A Tiny Fragment of English
6. Type-Governed (Categorial) Grammars
7. Recursive Definition of Types
8. Examples of Types
9. First Rule of Composition
10. Examples of Type-Categorial Analysis
11. Quantifiers and Quantifier-Phrases
12. Compound Nouns
-
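The outline's core machinery, a recursive definition of types plus a first rule of composition, amounts to functional application over types. A minimal sketch under assumed toy types and lexicon (not Hardegree's own fragment):

```python
# Types are either atomic ("S", "NP") or functor pairs (arg, result),
# read "takes an expression of type arg, yields one of type result".
LEXICON = {
    "Jay": "NP",
    "Kay": "NP",
    "respects": ("NP", ("NP", "S")),  # takes object NP, then subject NP
}

def apply_type(functor, argument):
    """First rule of composition: a functor of type (a, b) applied to
    an argument of type a yields a result of type b; otherwise fail."""
    if isinstance(functor, tuple) and functor[0] == argument:
        return functor[1]
    return None

vp = apply_type(LEXICON["respects"], LEXICON["Kay"])  # ('NP', 'S')
sentence = apply_type(vp, LEXICON["Jay"])
print(sentence)  # S
```

Because types are defined recursively, the single `apply_type` rule suffices to drive derivations of arbitrary depth.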
Linguistic Profiles: A Quantitative Approach to Theoretical Questions
Laura A. Janda
USA – Norway, The University of Tromsø – The Arctic University of Norway
Linguistic profiles: A quantitative approach to theoretical questions
Key words: cognitive linguistics, linguistic profiles, statistical analysis, theoretical linguistics
Ключевые слова: когнитивная лингвистика, лингвистические профили, статистический анализ, теоретическая лингвистика
Abstract
A major challenge in linguistics today is to take advantage of the data and sophisticated analytical tools available to us in the service of questions that are both theoretically interesting and practically useful. I offer linguistic profiles as one way to join theoretical insight to empirical research. Linguistic profiles are a more narrowly targeted type of behavioral profiling, focusing on a specific factor and how it is distributed across language forms. I present case studies using Russian data and illustrating three types of linguistic profiling analyses: grammatical profiles, semantic profiles, and constructional profiles. In connection with each case study I identify theoretical issues and show how they can be operationalized by means of profiles. The findings represent real gains in our understanding of Russian grammar that can also be utilized in both pedagogical and computational applications.
1. The quantitative turn in linguistics and the theoretical challenge
I recently conducted a study of articles published since the inauguration of the journal Cognitive Linguistics in 1990. At the year 2008, Cognitive Linguistics took a quantitative turn; since that point over 50% of the scholarly articles that journal publishes have involved quantitative analysis of linguistic data [Janda 2013: 4–6]. Though my sample was limited to only one journal, this trend is endemic in linguistics, and it is motivated by a confluence of historic factors.
-
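A grammatical profile in Janda's sense is the relative frequency distribution of a lexeme's grammatical forms in a corpus. A minimal sketch with invented toy counts (the lexeme and form labels here are illustrative, not from the article's case studies):

```python
from collections import Counter

# Hypothetical corpus observations as (lexeme, grammatical form) pairs.
observations = [
    ("pisat'", "past"), ("pisat'", "past"), ("pisat'", "infinitive"),
    ("pisat'", "present"), ("pisat'", "past"),
]

def grammatical_profile(lexeme, data):
    """Relative frequency of each grammatical form for one lexeme --
    the distribution that profile-based comparisons operate on."""
    counts = Counter(form for lex, form in data if lex == lexeme)
    total = sum(counts.values())
    return {form: n / total for form, n in counts.items()}

profile = grammatical_profile("pisat'", observations)
print(profile)  # {'past': 0.6, 'infinitive': 0.2, 'present': 0.2}
```

Semantic and constructional profiles follow the same recipe, with the counted factor swapped from inflectional form to sense or construction.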
Morphological Processing in the Brain: the Good (Inflection), the Bad (Derivation) and the Ugly (Compounding)
Leminen, A., Smolka, E., Duñabeitia, J. A. and Pliatsikas, C. (2019) Morphological processing in the brain: the good (inflection), the bad (derivation) and the ugly (compounding). Cortex, 116, pp. 4-44. ISSN 0010-9452. doi: https://doi.org/10.1016/j.cortex.2018.08.016
Open Access (Creative Commons: Attribution 4.0, CC-BY). Available at http://centaur.reading.ac.uk/78769/ (CentAUR, University of Reading).
Special issue: Review
Alina Leminen a,b,*,1, Eva Smolka c,1, Jon A. Duñabeitia d,e and Christos Pliatsikas f
a Cognitive Science, Department of Digital Humanities, Faculty of Arts, University of Helsinki, Finland
b Cognitive Brain Research -
September 15 & 17 Class Summary: 24.902
September 15 & 17 class summary: 24.902 could not be replaced by depend, because depend bears the distinct feature [+ __ NP PP]. The week actually began with some unfinished business from last week, which is OK, painful detail. What is interesting to us is a constraint proposed by Chomsky on the contained in the previous summary. Then we moved on... ability of a subcategorization rule to control the environment in which it applies. 1. Preliminaries Chomsky claimed that the environment of the subcategorization rule for X makes A word often cares about its syntactic environment. A verb, for example, may reference to all and only the sisters of X. More simply still: the subcategorization property require two, one or zero complements, and (seemingly -- see below!) may specify the of X cares only about the sisters of X. syntactic category of its complement. Terminological note: As mentioned in class, a slightly different way of speaking has arisen in the field of syntax. We say that a verb like put "subcategorizes for" an NP and a • Examples: put requires NP PP, devour requiresNP, depend requires PP, eat takes an optional NP. PP. Likewise a verb like eat "subcategorizes for" an optional NP. • Chomsky's constraint has interesting implications for language acquisition. Granted (1) a. Sue put the book under the table. that a child must maintain some record of the syntactic environment in which lexical b. *Sue put the book. items occur (or else subcategorization information would not be acquired), Chomsky's c. *Sue put under the table. constraint suggests that this information is quite limited. -
When and Why Can 1st and 2nd Person Pronouns Be Bound Variables?
The problem with 1st and 2nd person pronouns. In some contexts (call them context A), English 1st and 2nd person pronouns can be used as bound variables (Heim 1991, Kratzer 1998), as in (1).
(1) Context A: 1st person (data based on Rullmann 2004)
a. Only I got a question that I understood.
= (i) λx [x got a question that x understood]
= (ii) λx [x got a question that I understood]
b. Every guy that I've ever dated has wanted us to get married.
= (i) ∀x, guy(x) & dated(I, x) [x wants x & I to get married]
= (ii) ∀x, guy(x) & dated(I, x) [x wants us to get married]
Rullmann (2004) argues that such data are problematic for Déchaine & Wiltschko's (2002) analysis (henceforth DW), where it is claimed that English 1st/2nd person pronouns are DPs. According to the DW analysis, 1st/2nd person pronouns have the status of definite DPs (i.e., definite descriptions) and consequently are predicted not to be construable as bound variables. DW support their claim with the observation that in some contexts of VP-ellipsis (call them context B), 1st/2nd person pronouns cannot be used as bound variables, as in (2).
(2) Context B: 1st person (data based on Déchaine & Wiltschko 2002)
I know that John saw me, and Mary does too.
≠ (i) 'I know that John saw me, and Mary knows that John saw her.'
λx [x knows that John saw x] & λy [y knows that John saw y]
= (ii) 'I know that John saw me, and Mary knows that John saw me.'
λx [x knows that John saw me] & λy [y knows that John saw me]
As we will show, the (im)possibility of a bound variable reading with 1st and 2nd person is conditioned by a number of factors.
-
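The sloppy versus strict contrast in (2) can be mimicked with closures: on the sloppy (bound-variable) reading the pronoun co-varies with each subject, while on the strict reading it stays anchored to the speaker. An illustrative sketch only, not the abstract's formal notation:

```python
# Model "X knows that John saw PRONOUN" as a function of the subject.
def knows_john_saw(knower, seen):
    return f"{knower} knows that John saw {seen}"

speaker = "me"

# Sloppy reading: the pronoun is a bound variable, re-bound per subject.
sloppy = lambda subject: knows_john_saw(subject, subject)
# Strict reading: the pronoun stays fixed to the speaker.
strict = lambda subject: knows_john_saw(subject, speaker)

# "Mary does too" under each reading; context B allows only the strict one.
print(sloppy("Mary"))  # Mary knows that John saw Mary
print(strict("Mary"))  # Mary knows that John saw me
```

The empirical claim is then which of these two functions the ellipsis site may denote in a given context.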
Parsing the SYNTAGRUS Treebank of Russian
Parsing the SYNTAGRUS Treebank of Russian
Joakim Nivre (Växjö University and Uppsala University, [email protected])
Igor M. Boguslavsky (Universidad Politécnica de Madrid, Departamento de Inteligencia Artificial, [email protected])
Leonid L. Iomdin (Russian Academy of Sciences, Institute for Information Transmission Problems, [email protected])
Abstract
We present the first results on parsing the SYNTAGRUS treebank of Russian with a data-driven dependency parser, achieving a labeled attachment score of over 82% and an unlabeled attachment score of 89%. A feature analysis shows that high parsing accuracy is crucially dependent on the use of both lexical and morphological features. We conjecture that the latter result can be generalized to richly inflected languages in general, provided that sufficient amounts of training data are available.
1 Introduction
Dependency-based syntactic parsing has become increasingly popular in computational linguistics in recent years. For example, Nivre et al. (2007a) observe that the languages included in the 2007 CoNLL shared task can be divided into three distinct groups with respect to top accuracy scores, with relatively low accuracy for richly inflected languages like Arabic and Basque, medium accuracy for agglutinating languages like Hungarian and Turkish, and high accuracy for more configurational languages like English and Chinese. A complicating factor in this kind of comparison is the fact that the syntactic annotation in treebanks varies across languages, in such a way that it is very difficult to tease apart the impact on parsing accuracy of linguistic structure, on the one hand, and linguistic annotation, on the other. It is also worth noting that the majority of the data sets used in the CoNLL shared tasks are not derived from treebanks with genuine dependency annotation, but have been obtained through …
-
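The feature analysis the abstract reports (lexical plus morphological features) can be sketched as the feature-extraction step of a transition-based dependency parser: at each parsing step, attributes of the stack top and the next input token are combined into one feature map. The feature names and tokens below are illustrative, not the exact feature model used for SYNTAGRUS:

```python
# Sketch of per-step feature extraction for a transition-based parser.
def extract_features(stack, buffer):
    """Combine lexical (word form) and morphological (case) attributes
    of the stack top (s0) and the next input token (b0)."""
    feats = {}
    if stack:
        top = stack[-1]
        feats["s0.form"] = top["form"]
        feats["s0.case"] = top.get("case", "_")
    if buffer:
        nxt = buffer[0]
        feats["b0.form"] = nxt["form"]
        feats["b0.case"] = nxt.get("case", "_")
    return feats

stack = [{"form": "читает", "case": "_"}]
buffer = [{"form": "книгу", "case": "Acc"}]
print(extract_features(stack, buffer))
# {'s0.form': 'читает', 's0.case': '_', 'b0.form': 'книгу', 'b0.case': 'Acc'}
```

For a richly inflected language like Russian, dropping either the `form` or the `case` features from such a map is exactly the kind of ablation the paper's feature analysis performs.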
Syntactic Categories in the Speech of Young Children
Developmental Psychology, 1986, Vol. 22, No. 4, 562-579. Copyright 1986 by the American Psychological Association, Inc. 0012-1649/86/$00.75
Syntactic Categories in the Speech of Young Children
Virginia Valian, Columbia University
This article demonstrates that very young children have knowledge of a range of syntactic categories. Speech samples from six children aged 2 years to 2 years, 5 months, with Mean Lengths of Utterance (MLUs) ranging from 2.93 to 4.14, were examined for evidence of six syntactic categories: Determiner, Adjective, Noun, Noun Phrase, Preposition, and Prepositional Phrase. Performance was evaluated by the conformance of the children's speech to criteria developed for each category. Known syntactic diagnostics served as models for the criteria developed; the criteria exploited distributional regularities. All children showed evidence of all categories, except for the lowest MLU child, whose performance was borderline on Adjectives and Prepositional Phrases. The results suggest that children are sensitive very early in life to abstract, formal properties of the speech they hear and must be credited with syntactic knowledge at an earlier point than heretofore generally thought. The results argue against various semantic hypotheses about the origin of syntactic knowledge. Finally, the methods and results may be applicable to future investigations of why children's early utterances are short, of the nature of children's semantic categories, and of the nature of the deviance in the speech of language-deviant children and adults.
The grammars of mature speakers consist of rules and principles that capture generalizations about the language. [...] Levin, 1985). Unless children can select the proper phrasal cate…
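The distributional criteria Valian describes can be sketched as frame overlap: two words are candidates for the same category when they occur between the same neighbours. The toy utterances and the frame definition below are invented for illustration, not the article's actual criteria:

```python
from collections import defaultdict

# Toy child-speech corpus; each frame is the (left word, right word)
# pair surrounding a token.
utterances = ["the dog runs", "the cat runs", "a dog sleeps", "the cat sleeps"]

frames = defaultdict(set)
for utt in utterances:
    words = utt.split()
    for i in range(1, len(words) - 1):
        frames[words[i]].add((words[i - 1], words[i + 1]))

def shared_frames(w1, w2):
    """Distributional evidence that w1 and w2 belong to one category:
    the frames they have in common."""
    return frames[w1] & frames[w2]

print(shared_frames("dog", "cat"))  # {('the', 'runs')}
```

Scaling the same idea to determiners, adjectives, and prepositions gives category criteria of the kind the article evaluates children's speech against.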