Principles of Iconicity and Linguistic Categories
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Why Is Language Typology Possible?
Why is language typology possible? Martin Haspelmath 1 Languages are incomparable Each language has its own system. Each language has its own categories. Each language is a world of its own. 2 Or are all languages like Latin? nominative the book genitive of the book dative to the book accusative the book ablative from the book 3 Or are all languages like English? 4 How could languages be compared? If languages are so different: What could be possible tertia comparationis (= entities that are identical across comparanda and thus permit comparison)? 5 Three approaches • Indeed, language typology is impossible (non- aprioristic structuralism) • Typology is possible based on cross-linguistic categories (aprioristic generativism) • Typology is possible without cross-linguistic categories (non-aprioristic typology) 6 Non-aprioristic structuralism: Franz Boas (1858-1942) The categories chosen for description in the Handbook “depend entirely on the inner form of each language...” Boas, Franz. 1911. Introduction to The Handbook of American Indian Languages. 7 Non-aprioristic structuralism: Ferdinand de Saussure (1857-1913) “dans la langue il n’y a que des différences...” (In a language there are only differences) i.e. all categories are determined by the ways in which they differ from other categories, and each language has different ways of cutting up the sound space and the meaning space de Saussure, Ferdinand. 1915. Cours de linguistique générale. 8 Example: Datives across languages cf. Haspelmath, Martin. 2003. The geometry of grammatical meaning: semantic maps and cross-linguistic comparison 9 Example: Datives across languages 10 Example: Datives across languages 11 Non-aprioristic structuralism: Peter H. Matthews (University of Cambridge) Matthews 1997:199: "To ask whether a language 'has' some category is...to ask a fairly sophisticated question.. -
Modeling Language Variation and Universals: a Survey on Typological Linguistics for Natural Language Processing
Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing Edoardo Ponti, Helen O ’Horan, Yevgeni Berzak, Ivan Vulic, Roi Reichart, Thierry Poibeau, Ekaterina Shutova, Anna Korhonen To cite this version: Edoardo Ponti, Helen O ’Horan, Yevgeni Berzak, Ivan Vulic, Roi Reichart, et al.. Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing. 2018. hal-01856176 HAL Id: hal-01856176 https://hal.archives-ouvertes.fr/hal-01856176 Preprint submitted on 9 Aug 2018 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing Edoardo Maria Ponti∗ Helen O’Horan∗∗ LTL, University of Cambridge LTL, University of Cambridge Yevgeni Berzaky Ivan Vuli´cz Department of Brain and Cognitive LTL, University of Cambridge Sciences, MIT Roi Reichart§ Thierry Poibeau# Faculty of Industrial Engineering and LATTICE Lab, CNRS and ENS/PSL and Management, Technion - IIT Univ. Sorbonne nouvelle/USPC Ekaterina Shutova** Anna Korhonenyy ILLC, University of Amsterdam LTL, University of Cambridge Understanding cross-lingual variation is essential for the development of effective multilingual natural language processing (NLP) applications. -
Urdu Treebank
Sci.Int.(Lahore),28(4),3581-3585, 2016 ISSN 1013-5316;CODEN: SINTE 8 3581 URDU TREEBANK Muddassira Arshad, Aasim Ali Punjab University College of Information Technology (PUCIT), University of the Punjab, Lahore [email protected], [email protected] (Presented at the 5th International. Multidisciplinary Conference, 29-31 Oct., at, ICBS, Lahore) ABSTRACT: Treebank is a parsed corpus of text annotated with the syntactic information in the form of tags that yield the phrasal information within the corpus. The first outcome of this study is to design a phrasal and functional tag set for Urdu by minimal changes in the tag set of Penn Treebank, that has become a de-facto standard. Our tag set comprises of 22 phrasal tags and 20 functional tags. Second outcome of this work is the development of initial Treebank for Urdu, which remains publically available to the research and development community. There are 500 sentences of Urdu translation of religious text with an average length of 12 words. Context free grammar containing 109 rules and 313 lexical entries, for Urdu has been extracted from this Treebank. An online parser is used to verify the coverage of grammar on 50 test sentences with 80% recall. However, multiple parse trees are generated against each test sentence. 1 INTRODUCTION grammars for parsing[5], where statistical models are used The term "Treebank" was introduced in 2003. It is defined as for broad-coverage parsing [6]. "linguistically annotated corpus that includes some In literature, various techniques have been applied for grammatical analysis beyond the part of speech level" [1]. -
Guggenheim Museum Archives Reel-To-Reel Collection on the Future of Art: “Art and the Structuralist Perspective” by Annette Michelson, 1969
Guggenheim Museum Archives Reel-to-Reel collection On the Future of Art: “Art and the Structuralist Perspective” by Annette Michelson, 1969 MALE 1 Good evening, ladies and gentlemen, and welcome to the fourth in the series of lectures presented by the Guggenheim Museum on the general subject of the future of art. This evening’s speaker is Ms. Annette Michelson, who was born in New York City, studied the history of art at Columbia University, and studied philosophy at the Sorbonne. During a period of residence in Paris in the late ’50s and early ’60s, she was art editor and critic of the Paris Herald Tribune, as well as Paris editor and correspondent for Arts Magazine and Art International. At present, she is contributing [00:01:00] editor of Artforum, and teaches the aesthetics of cinema at the Graduate School of the Arts of New York University. She is currently working on a study of the aesthetics and ideology of chance, as well as a further study on the aesthetics of cinema. She will speak to us tonight on art and the structuralist perspective. Ms. Michelson. (applause) ANNETTE MICHELSON [00:02:00] Years ago, when I was a student, I happened to see an entry in a bookseller’s catalog for an edition of Kant’s Critique of Pure Reason and Critique of Aesthetic Judgment, which was described in that entry as beautiful and illustrative. That entry caught my fancy, teased the imagination, and it intrigued me, really, to the point that I eventually made the trip down to Fourth Avenue to have a look at that beautiful and illustrative edition. -
Linguistic Profiles: a Quantitative Approach to Theoretical Questions
Laura A. Janda USA – Norway, Th e University of Tromsø – Th e Arctic University of Norway Linguistic profi les: A quantitative approach to theoretical questions Key words: cognitive linguistics, linguistic profi les, statistical analysis, theoretical linguistics Ключевые слова: когнитивная лингвистика, лингвистические профили, статисти- ческий анализ, теоретическая лингвистика Abstract A major challenge in linguistics today is to take advantage of the data and sophisticated analytical tools available to us in the service of questions that are both theoretically inter- esting and practically useful. I offer linguistic profi les as one way to join theoretical insight to empirical research. Linguistic profi les are a more narrowly targeted type of behavioral profi ling, focusing on a specifi c factor and how it is distributed across language forms. I present case studies using Russian data and illustrating three types of linguistic profi ling analyses: grammatical profi les, semantic profi les, and constructional profi les. In connection with each case study I identify theoretical issues and show how they can be operationalized by means of profi les. The fi ndings represent real gains in our understanding of Russian grammar that can also be utilized in both pedagogical and computational applications. 1. Th e quantitative turn in linguistics and the theoretical challenge I recently conducted a study of articles published since the inauguration of the journal Cognitive Linguistics in 1990. At the year 2008, Cognitive Linguistics took a quanti- tative turn; since that point over 50% of the scholarly articles that journal publishes have involved quantitative analysis of linguistic data [Janda 2013: 4–6]. Though my sample was limited to only one journal, this trend is endemic in linguistics, and it is motivated by a confl uence of historic factors. -
Morphological Processing in the Brain: the Good (Inflection), the Bad (Derivation) and the Ugly (Compounding)
Morphological processing in the brain: the good (inflection), the bad (derivation) and the ugly (compounding) Article Published Version Creative Commons: Attribution 4.0 (CC-BY) Open Access Leminen, A., Smolka, E., Duñabeitia, J. A. and Pliatsikas, C. (2019) Morphological processing in the brain: the good (inflection), the bad (derivation) and the ugly (compounding). Cortex, 116. pp. 4-44. ISSN 0010-9452 doi: https://doi.org/10.1016/j.cortex.2018.08.016 Available at http://centaur.reading.ac.uk/78769/ It is advisable to refer to the publisher’s version if you intend to cite from the work. See Guidance on citing . To link to this article DOI: http://dx.doi.org/10.1016/j.cortex.2018.08.016 Publisher: Elsevier All outputs in CentAUR are protected by Intellectual Property Rights law, including copyright law. Copyright and IPR is retained by the creators or other copyright holders. Terms and conditions for use of this material are defined in the End User Agreement . www.reading.ac.uk/centaur CentAUR Central Archive at the University of Reading Reading’s research outputs online cortex 116 (2019) 4e44 Available online at www.sciencedirect.com ScienceDirect Journal homepage: www.elsevier.com/locate/cortex Special issue: Review Morphological processing in the brain: The good (inflection), the bad (derivation) and the ugly (compounding) Alina Leminen a,b,*,1, Eva Smolka c,1, Jon A. Dunabeitia~ d,e and Christos Pliatsikas f a Cognitive Science, Department of Digital Humanities, Faculty of Arts, University of Helsinki, Finland b Cognitive Brain Research -
Inflection), the Bad (Derivation) and the Ugly (Compounding)
View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Central Archive at the University of Reading Morphological processing in the brain: the good (inflection), the bad (derivation) and the ugly (compounding) Article Published Version Creative Commons: Attribution 4.0 (CC-BY) Open Access Leminen, A., Smolka, E., Duñabeitia, J. A. and Pliatsikas, C. (2019) Morphological processing in the brain: the good (inflection), the bad (derivation) and the ugly (compounding). Cortex, 116. pp. 4-44. ISSN 0010-9452 doi: https://doi.org/10.1016/j.cortex.2018.08.016 Available at http://centaur.reading.ac.uk/78769/ It is advisable to refer to the publisher's version if you intend to cite from the work. See Guidance on citing . To link to this article DOI: http://dx.doi.org/10.1016/j.cortex.2018.08.016 Publisher: Elsevier All outputs in CentAUR are protected by Intellectual Property Rights law, including copyright law. Copyright and IPR is retained by the creators or other copyright holders. Terms and conditions for use of this material are defined in the End User Agreement . www.reading.ac.uk/centaur CentAUR Central Archive at the University of Reading Reading's research outputs online cortex 116 (2019) 4e44 Available online at www.sciencedirect.com ScienceDirect Journal homepage: www.elsevier.com/locate/cortex Special issue: Review Morphological processing in the brain: The good (inflection), the bad (derivation) and the ugly (compounding) Alina Leminen a,b,*,1, Eva Smolka c,1, Jon A. Dunabeitia~ d,e and Christos Pliatsikas -
Iconicity As Recognizability Pascal Vaillant, Marie-Françoise Castaing
Iconicity as Recognizability Pascal Vaillant, Marie-Françoise Castaing To cite this version: Pascal Vaillant, Marie-Françoise Castaing. Iconicity as Recognizability. 2003. hal-00329352 HAL Id: hal-00329352 https://hal.archives-ouvertes.fr/hal-00329352 Preprint submitted on 10 Oct 2008 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. ICONICITY AS RECOGNIZABILITY Pascal Vaillant Marie-Françoise Castaing GEREC-F/GIL – Faculté des Lettres CRITT – CCST Université des Antilles et de la Guyane Les Algorithmes, Bâtiment Euclide B.P. 7207 Saint-Aubin 97275 SCHOELCHER CEDEX 91194 GIF-SUR-YVETTE CEDEX [email protected] [email protected] INTRODUCTION Interest in iconic communication has been constantly increasing over these past decades. Not only as a point of theoretical interest in the field of semiotics, but as an efficient and simple way of communicating general information in everyday life. In public places like airports, icons indicate directions or services. In printed form, like on product packages or instruction manuals, they provide detailed instructions in a synthetic way. On the internet “web sites”, they guide users of many different countries, who sometimes do not even share a writing system. -
Language Is More Abstract Than You Think, Or, Why Aren't Languages
Downloaded from http://rstb.royalsocietypublishing.org/ on June 19, 2018 Language is more abstract than you think, or, why aren’t languages more iconic? rstb.royalsocietypublishing.org Gary Lupyan1 and Bodo Winter2 1Department of Psychology, University of Wisconsin, Madison, WI 53706, USA 2Department of English Language and Applied Linguistics, University of Birmingham, Birmingham, UK Opinion piece GL, 0000-0001-8441-7433 Cite this article: Lupyan G, Winter B. 2018 How abstract is language? We show that abstractness pervades every corner Language is more abstract than you think, or, of language, going far beyond the usual examples of freedom and justice. In the light of the ubiquity of abstract words, the need to understand why aren’t languages more iconic? Phil. where abstract meanings come from becomes ever more acute. We argue Trans. R. Soc. B 373: 20170137. that the best source of knowledge about abstract meanings may be language http://dx.doi.org/10.1098/rstb.2017.0137 itself. We then consider a seemingly unrelated question: Why isn’t language more iconic? Iconicity—a resemblance between the form of words and their Accepted: 9 March 2018 meanings—can be immensely useful in language learning and communi- cation. Languages could be much more iconic than they currently are. So why aren’t they? We suggest that one reason is that iconicity is inimical One contribution of 23 to a theme issue to abstraction because iconic forms are too connected to specific contexts ‘Varieties of abstract concepts: development, and sensory depictions. Form–meaning arbitrariness may allow language use and representation in the brain’. -
Language Is Far Less Arbitrary Than One Thinks: Iconicity and Indexicality in Real-World Learning and Processing
1 Nonarbitrariness in Language Language is far less arbitrary than one thinks: Iconicity and indexicality in real-world learning and processing Margherita Murgiano, Yasamin Motamedi & Gabriella Vigliocco* Experimental Psychology, University College London *Corresponding author: [email protected] 2 Nonarbitrariness in Language Abstract In the last decade, a growing body of work has convincingly demonstrated that languages embed a certain degree of non-arbitrariness (mostly in the form of iconicity, namely the presence of imagistic links between linguistic form and meaning). Most of this previous work has been limited to assessing the degree (and role) of non-arbitrariness in the speech (for spoken languages) or manual components of signs (for sign languages). When approached in this way, non-arbitrariness is acknowledged but still considered to have little presence and purpose, showing a diachronic movement towards more arbitrary forms. However, this perspective is limited as it does not take into account the situated nature of language use in face-to-face interactions, where language comprises categorical components of speech and signs, but also multimodal cues such as prosody, gestures, eye gaze etc. We review work concerning the role of context-dependent iconic and indexical cues in language acquisition and processing to demonstrate the pervasiveness of non-arbitrary multimodal cues in language use and we discuss their function. We then move to argue that the online omnipresence of multimodal non-arbitrary cues supports children and adults in dynamically developing situational models. Keywords: iconicity, multimodal communication, language acquisition, language processing 3 Nonarbitrariness in Language 1. Introduction Moving away from a more traditional view of language (e.g. -
People Can Create Iconic Vocalizations to Communicate
www.nature.com/scientificreports OPEN People Can Create Iconic Vocalizations to Communicate Various Meanings to Naïve Received: 21 June 2017 Accepted: 23 January 2018 Listeners Published: xx xx xxxx Marcus Perlman 1 & Gary Lupyan 2 The innovation of iconic gestures is essential to establishing the vocabularies of signed languages, but might iconicity also play a role in the origin of spoken words? Can people create novel vocalizations that are comprehensible to naïve listeners without prior convention? We launched a contest in which participants submitted non-linguistic vocalizations for 30 meanings spanning actions, humans, animals, inanimate objects, properties, quantifers and demonstratives. The winner was determined by the ability of naïve listeners to infer the meanings of the vocalizations. We report a series of experiments and analyses that evaluated the vocalizations for: (1) comprehensibility to naïve listeners; (2) the degree to which they were iconic; (3) agreement between producers and listeners in iconicity; and (4) whether iconicity helps listeners learn the vocalizations as category labels. The results show contestants were able to create successful iconic vocalizations for most of the meanings, which were largely comprehensible to naïve listeners, and easier to learn as category labels. These fndings demonstrate how iconic vocalizations can enable interlocutors to establish understanding in the absence of conventions. They suggest that, prior to the advent of full-blown spoken languages, people could have used iconic vocalizations to ground a spoken vocabulary with considerable semantic breadth. In the parlor game charades, players are challenged to shed their spoken language and communicate using gestures. To succeed in creating a signal that communicates the intended meaning to their teammates, players typically make use of iconicity – a resemblance between the form of a signal and its meaning. -
Parsing the SYNTAGRUS Treebank of Russian
Parsing the SYNTAGRUS Treebank of Russian Joakim Nivre Igor M. Boguslavsky Leonid L. Iomdin Vaxj¨ o¨ University and Universidad Politecnica´ Russian Academy Uppsala University de Madrid of Sciences [email protected] Departamento de Institute for Information Inteligencia Artificial Transmission Problems [email protected] [email protected] Abstract example, Nivre et al. (2007a) observe that the lan- guages included in the 2007 CoNLL shared task We present the first results on parsing the can be divided into three distinct groups with re- SYNTAGRUS treebank of Russian with a spect to top accuracy scores, with relatively low data-driven dependency parser, achieving accuracy for richly inflected languages like Arabic a labeled attachment score of over 82% and Basque, medium accuracy for agglutinating and an unlabeled attachment score of 89%. languages like Hungarian and Turkish, and high A feature analysis shows that high parsing accuracy for more configurational languages like accuracy is crucially dependent on the use English and Chinese. A complicating factor in this of both lexical and morphological features. kind of comparison is the fact that the syntactic an- We conjecture that the latter result can be notation in treebanks varies across languages, in generalized to richly inflected languages in such a way that it is very difficult to tease apart the general, provided that sufficient amounts impact on parsing accuracy of linguistic structure, of training data are available. on the one hand, and linguistic annotation, on the other. It is also worth noting that the majority of 1 Introduction the data sets used in the CoNLL shared tasks are Dependency-based syntactic parsing has become not derived from treebanks with genuine depen- increasingly popular in computational linguistics dency annotation, but have been obtained through in recent years.