Modeling Language Variation and Universals: a Survey on Typological Linguistics for Natural Language Processing
Recommended publications
-
Why Is Language Typology Possible?
Why is language typology possible? Martin Haspelmath 1 Languages are incomparable Each language has its own system. Each language has its own categories. Each language is a world of its own. 2 Or are all languages like Latin? nominative the book genitive of the book dative to the book accusative the book ablative from the book 3 Or are all languages like English? 4 How could languages be compared? If languages are so different: What could be possible tertia comparationis (= entities that are identical across comparanda and thus permit comparison)? 5 Three approaches • Indeed, language typology is impossible (non-aprioristic structuralism) • Typology is possible based on cross-linguistic categories (aprioristic generativism) • Typology is possible without cross-linguistic categories (non-aprioristic typology) 6 Non-aprioristic structuralism: Franz Boas (1858-1942) The categories chosen for description in the Handbook “depend entirely on the inner form of each language...” Boas, Franz. 1911. Introduction to The Handbook of American Indian Languages. 7 Non-aprioristic structuralism: Ferdinand de Saussure (1857-1913) “dans la langue il n’y a que des différences...” (In a language there are only differences) i.e. all categories are determined by the ways in which they differ from other categories, and each language has different ways of cutting up the sound space and the meaning space de Saussure, Ferdinand. 1915. Cours de linguistique générale. 8 Example: Datives across languages cf. Haspelmath, Martin. 2003. The geometry of grammatical meaning: semantic maps and cross-linguistic comparison 9 Example: Datives across languages 10 Example: Datives across languages 11 Non-aprioristic structuralism: Peter H. Matthews (University of Cambridge) Matthews 1997:199: "To ask whether a language 'has' some category is...to ask a fairly sophisticated question.. -
Manual for Language Test Development and Examining
Manual for Language Test Development and Examining For use with the CEFR Produced by ALTE on behalf of the Language Policy Division, Council of Europe © Council of Europe, April 2011 The opinions expressed in this work are those of the authors and do not necessarily reflect the official policy of the Council of Europe. All correspondence concerning this publication or the reproduction or translation of all or part of the document should be addressed to the Director of Education and Languages of the Council of Europe (Language Policy Division) (F-67075 Strasbourg Cedex or [email protected]). The reproduction of extracts is authorised, except for commercial purposes, on condition that the source is quoted. Manual for Language Test Development and Examining For use with the CEFR Produced by ALTE on behalf of the Language Policy Division, Council of Europe Language Policy Division Council of Europe (Strasbourg) www.coe.int/lang Contents Foreword 5 3.4.2 Piloting, pretesting and trialling 30 Introduction 6 3.4.3 Review of items 31 1 Fundamental considerations 10 3.5 Constructing tests 32 1.1 How to define language proficiency 10 3.6 Key questions 32 1.1.1 Models of language use and competence 10 3.7 Further reading 33 1.1.2 The CEFR model of language use 10 4 Delivering tests 34 1.1.3 Operationalising the model 12 4.1 Aims of delivering tests 34 1.1.4 The Common Reference Levels of the CEFR 12 4.2 The process of delivering tests 34 1.2 Validity 14 4.2.1 Arranging venues 34 1.2.1 What is validity? 14 4.2.2 Registering test takers 35 1.2.2 Validity -
Policy Studies in Language and Cross-Cultural Education in the College of Education
Policy Studies in Language and Cross-Cultural Education In the College of Education OFFICE: Education and Business Administration 248 PLC 553. Language Assessment and Evaluation in Multicultural TELEPHONE: 619-594-5155 / FAX: 619-594-1183 Settings (3) Theories and methods of assessment and evaluation of diverse http://edweb.sdsu.edu/PLC/ student populations including authentic and traditional models. Procedures for identification, placement, and monitoring of linguistically diverse students. Theories, models, and methods for program evaluation, achievement, and decision making. Alberto J. Rodriguez, Ph.D., Professor of Policy Studies in Language and Cross-Cultural Education, Interim Chair of Department PLC 596. Special Topics in Bilingual and Multicultural Karen Cadiero-Kaplan, Ph.D., Professor of Policy Studies in Language Education (1-3) and Cross-Cultural Education (Graduate Adviser) Prerequisite: Consent of instructor. Alberto M. Ochoa, Ph.D., Professor of Policy Studies in Language and Selected topics in bilingual, cross-cultural education and policy Cross-Cultural Education, Emeritus studies. May be repeated with new content. See Class Schedule for Cristina Alfaro, Ph.D., Associate Professor of Policy Studies in specific content. Credit for 596 and 696 applicable to a master's Language and Cross-Cultural Education degree with approval of the graduate adviser. Cristian Aquino-Sterling, Ph.D., Assistant Professor of Policy Studies in GRADUATE COURSES Language and Cross-Cultural Education Elsa S. Billings, Ph.D., Assistant Professor of Policy Studies in PLC 600A. Foundations of Democratic Schooling (3) Language and Cross-Cultural Education Prerequisite: Consent of instructor. Analysis of relationships among ideology, culture, and power in educational context; key concepts in critical pedagogy applied to Courses Acceptable on Master’s Degree programs, curricula, and school restructuring. -
1 Meeting of the Committee of Editors of Linguistics Journals January 10
Meeting of the Committee of Editors of Linguistics Journals January 10, 2016 Washington, DC Present: Eric Baković, Greg Carlson, Abby Cohn, Elizabeth Cowper, Kai von Fintel, Brian Joseph, Tom Purnell, Johan Rooryck (via Skype) 1. Unified Stylesheet v2.0 Kai von Fintel discussed his involvement in a working group aiming to “update, revise, amend, precisify” the existing Unified Stylesheet for Linguistics Journals. An email from von Fintel on this topic sent to the editors’ mailing list shortly after our meeting is copied at the end of these minutes. Abby Cohn noted that Laboratory Phonology will continue to use APA style given its close contact with relevant fields that also use this style. It was also noted and agreed that authors should be encouraged to ensure the stability of online works for citation purposes. 2. LingOA Johan Rooryck reported on the very recent transition of subscription Lingua (Elsevier) to open access Glossa (Ubiquity Press), and addressed questions about a document he sent to the editors’ mailing list in November (also appended at the end of these minutes). The document invites the editorial teams of other subscription journals in linguistics and related fields to make the move to fair open access, as defined by LingOA (http://lingoa.eu), to join Glossa as well as Laboratory Phonology and Journal of Portuguese Linguistics. On January 9, David Barner (Psychology & Linguistics, UC San Diego) and Jesse Snedeker (Psychology, Harvard) called for fair open access at Cognition, another Elsevier journal. (See http://meaningseeds.com/2016/01/09/fair-open-access-at-cognition/.) The transition of Lingua to Glossa has apparently gone even more smoothly than expected. -
Language Development Language Development
Language Development From their very first cries, human beings communicate with the world around them. Infants communicate through sounds (crying and cooing) and through body language (pointing and other gestures). However, sometime between 8 and 18 months of age, a major developmental milestone occurs when infants begin to use words to speak. Words are symbolic representations; that is, when a child says “table,” we understand that the word represents the object. Language can be defined as a system of symbols that is used to communicate. Although language is used to communicate with others, we may also talk to ourselves and use words in our thinking. The words we use can influence the way we think about and understand our experiences. After defining some basic aspects of language that we use throughout the chapter, we describe some of the theories that are used to explain the amazing process by which we understand and produce language. We then look at the brain’s role in processing and producing language. After a description of the stages of language development—from a baby’s first cries through the slang used by teenagers—we look at the topic of bilingualism. We examine how learning to speak more than one language affects a child’s language development and how our educational system is trying to accommodate the increasing number of bilingual children in the classroom. Finally, we end the chapter with information about disorders that can interfere with children’s language development. Language: A system of symbols that is used to communicate with others or in our thinking. -
Urdu Treebank
Sci.Int.(Lahore),28(4),3581-3585, 2016 ISSN 1013-5316;CODEN: SINTE 8 3581 URDU TREEBANK Muddassira Arshad, Aasim Ali Punjab University College of Information Technology (PUCIT), University of the Punjab, Lahore [email protected], [email protected] (Presented at the 5th International Multidisciplinary Conference, 29-31 Oct., at ICBS, Lahore) ABSTRACT: Treebank is a parsed corpus of text annotated with the syntactic information in the form of tags that yield the phrasal information within the corpus. The first outcome of this study is to design a phrasal and functional tag set for Urdu by minimal changes in the tag set of Penn Treebank, that has become a de-facto standard. Our tag set comprises 22 phrasal tags and 20 functional tags. Second outcome of this work is the development of initial Treebank for Urdu, which remains publicly available to the research and development community. There are 500 sentences of Urdu translation of religious text with an average length of 12 words. A context-free grammar for Urdu containing 109 rules and 313 lexical entries has been extracted from this Treebank. An online parser is used to verify the coverage of grammar on 50 test sentences with 80% recall. However, multiple parse trees are generated against each test sentence. 1 INTRODUCTION The term "Treebank" was introduced in 2003. It is defined as "linguistically annotated corpus that includes some grammatical analysis beyond the part of speech level" [1]. grammars for parsing [5], where statistical models are used for broad-coverage parsing [6]. In literature, various techniques have been applied for -
Multi-Disciplinary Lexicography
Multi-disciplinary Lexicography Multi-disciplinary Lexicography: Traditions and Challenges of the XXIst Century Edited by Olga M. Karpova and Faina I. Kartashkova Multi-disciplinary Lexicography: Traditions and Challenges of the XXIst Century, Edited by Olga M. Karpova and Faina I. Kartashkova This book first published 2013 Cambridge Scholars Publishing 12 Back Chapman Street, Newcastle upon Tyne, NE6 2XX, UK British Library Cataloguing in Publication Data A catalogue record for this book is available from the British Library Copyright © 2013 by Olga M. Karpova and Faina I. Kartashkova and contributors All rights for this book reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner. ISBN (10): 1-4438-4256-7, ISBN (13): 978-1-4438-4256-3 TABLE OF CONTENTS List of Illustrations ..................................................................................... ix List of Tables............................................................................................... x Editors’ Preface .......................................................................................... xi Olga M. Karpova and Faina I. Kartashkova Ivanovo Lexicographic School................................................................ xvii Ekaterina A. Shilova Part I: Dictionary as a Cross-road of Language and Culture Chapter One................................................................................................ -
Universals in Phonology This Article A
UC Berkeley Phonology Lab Annual Report (2007) (Commissioned for special issue of The Linguistic Review, 2008, Harry van der Hulst, ed.) Universals in Phonology ABSTRACT This article asks what is universal about phonological systems. Beginning with universals of segment inventories, a distinction is drawn between descriptive universals (where the effect of different theoretical frameworks is minimized) vs. analytic universals (which are specific-theory-dependent). Since there are few absolute universals such as “all languages have stops” and “all languages have at least two degrees of vowel height”, theory-driven or “architectural” universals concerning distinctive features and syllable structure are also considered. Although several near-universals are also mentioned, the existence of conflicting “universal tendencies” and contradictory resolutions naturally leads into questions concerning the status of markedness and synchronic explanation in phonology. While diachrony is best at accounting for typologically unusual and language-specific phonological properties, the absolute universals discussed in this study are clearly grounded in synchrony. 1. Introduction My colleague John Ohala likes to tell the following mythical story about a lecture that the legendary Roman Jakobson gives upon arrival at Harvard University some time in the 1940s. The topic is child language and phonological universals, a subject which Prof. Jakobson addresses in his Kindersprache, Aphasie und allgemeine Lautgesetze (1941). In his also legendary strong Russian accent, Jakobson makes the pronouncement, “In all languages, first utterance of child, [pa]!” 1 He goes on to explain that it is a matter of maximal opposition: “[p] is the consonant most consonant, and [a] is the vowel most vowel.” As the joke continues, a very concerned person in the audience raises his hand and is called on: “But, professor, my child’s first utterance was [tSik].” Prof. -
The Linguistic Review 2019; 36(1): 117–150
The Linguistic Review 2019; 36(1): 117–150 Sun-Ah Jun* and Xiannu Jiang Differences in prosodic phrasing in marking syntax vs. focus: Data from Yanbian Korean https://doi.org/10.1515/tlr-2018-2009 Abstract: In studying the effect of syntax and focus on prosodic phrasing, the main issue of investigation has been to explain and predict the location of a prosodic boundary, and not much attention has been given to the nature of prosodic phrasing. In this paper, we offer evidence from intonation patterns of utterances that prosodic phrasing can be formed differently phonologically and phonetically due to its function of marking syntactic structure vs. focus (prominence) in Yanbian Korean, a lexical pitch accent dialect of Korean spoken in the northeastern part of China, just above North Korea. We show that the location of a H tone in syntax-marking Accentual Phrase (AP) is determined by the type of syntactic head, noun or verb (a VP is marked by an AP-initial H while an NP is marked by an AP-final H), while prominence-marking accentual phrasing is cued by AP-initial H. The difference in prosodic phrasing due to its dual function in Yanbian Korean is compared with that of Seoul Korean, and a prediction is made on the possibility of finding such difference in other languages based on the prosodic typology proposed in (Jun, Sun-Ah. 2014b. Prosodic typology: by prominence type, word prosody, and macro-rhythm. In Sun-Ah Jun (ed.), Prosodic Typology II: The Phonology of Intonation and Phrasing. 520–539. Oxford: Oxford University Press). -
Essentials of Language Typology
Lívia Körtvélyessy Essentials of Language Typology KOŠICE 2017 © Lívia Körtvélyessy, Department of British and American Studies, Faculty of Arts, UPJŠ in Košice Reviewers: Doc. PhDr. Edita Kominarecová, PhD. Doc. Slávka Tomaščíková, PhD. Electronic university textbook for the Faculty of Arts, UPJŠ in Košice. All rights reserved. Neither this work nor any part of it may be reproduced, stored in information systems, or otherwise distributed without the consent of the rights holders. The author is responsible for the scholarly and linguistic content of this publication. The manuscript has undergone editorial and language editing. Language editing: Steve Pepper Publisher: Pavol Jozef Šafárik University in Košice Location: http://unibook.upjs.sk Available from: February 2017 ISBN: 978-80-8152-480-6 Table of Contents Table of Contents i List of Figures iv List of Tables v List of Abbreviations vi Preface vii CHAPTER 1 What is language typology? 1 Tasks 10 Summary 13 CHAPTER 2 The forerunners of language typology 14 Rasmus Rask (1787 - 1832) 14 Franz Bopp (1791 – 1867) 15 Jacob Grimm (1785 - 1863) 15 A.W. Schlegel (1767 - 1845) and F. W. Schlegel (1772 - 1829) 17 Wilhelm von Humboldt (1767 – 1835) 17 August Schleicher 18 Neogrammarians (Junggrammatiker) 19 The name for a new linguistic field 20 Tasks 21 Summary 22 CHAPTER 3 Genealogical classification of languages 23 Tasks 28 Summary 32 CHAPTER 4 Phonological typology 33 Consonants and vowels 34 Syllables 36 Prosodic features 36 Tasks 38 Summary 40 CHAPTER 5 Morphological typology 41 Morphological classification of languages (holistic -
Margaret Thomas Department of Slavic & Eastern Languages
CURRICULUM VITAE Margaret Thomas Department of Slavic & Eastern Languages & Literatures Boston College 24 Hemlock Road Chestnut Hill, MA 02467 Newton, MA 02464 (617) 552-3697 (617) 244-3105 [email protected] Education Ph.D. 1991 Harvard University Linguistics A.M. 1985 Harvard University Linguistics M.Ed. 1983 Boston University Teaching English to Speakers of Other Languages B.A. 1974 Yale University Japanese Language and Literature Employment 2006– Professor, Slavic and Eastern Languages and Literatures, Boston College 1996–06 Associate Professor, Slavic and Eastern Languages Department, Boston College 1992–96 Assistant Professor, Slavic and Eastern Languages Department, Boston College 1991–92 Visiting Assistant Professor, Slavic and Eastern Languages, Boston College 1986–90 Teaching Fellow, Linguistics Department, Harvard University Publications 2013 ‘Air writing’ and second language learners’ knowledge of Japanese kanji. Japanese Language and Literature, 47: 59–92. Otto Jespersen and ‘The Woman’, then and now. Historiographia Linguistica, 40: 377–208. The doctorate in second language acquisition: An institutional history. Linguistic Approaches to Bilingualism, 3: 509–531. History of the study of second language acquisition. In Martha Young-Scholten and Julia Herschensohn (Eds.), The Cambridge Handbook of Second Language Acquisition, 26–45. Cambridge University Press.. 2011 Fifty Key Thinkers on Language and Linguistics, pp. xvii + 306. Routledge Press. Gender and the language scholarship of the Summer Institute of Linguistics in the context of mid twentieth-century American linguistics. In Gerda Hassler (Ed.), History of Linguistics 2008: Selected Papers from the XI International Conference on the History of the Language Margaret Thomas 2 Sciences (ICHOLS XI), Potsdam, 28 August–2 September 2008, 389–397. -
Linguistic Profiles: a Quantitative Approach to Theoretical Questions
Laura A. Janda USA – Norway, The University of Tromsø – The Arctic University of Norway Linguistic profiles: A quantitative approach to theoretical questions Key words: cognitive linguistics, linguistic profiles, statistical analysis, theoretical linguistics Ключевые слова: когнитивная лингвистика, лингвистические профили, статистический анализ, теоретическая лингвистика Abstract A major challenge in linguistics today is to take advantage of the data and sophisticated analytical tools available to us in the service of questions that are both theoretically interesting and practically useful. I offer linguistic profiles as one way to join theoretical insight to empirical research. Linguistic profiles are a more narrowly targeted type of behavioral profiling, focusing on a specific factor and how it is distributed across language forms. I present case studies using Russian data and illustrating three types of linguistic profiling analyses: grammatical profiles, semantic profiles, and constructional profiles. In connection with each case study I identify theoretical issues and show how they can be operationalized by means of profiles. The findings represent real gains in our understanding of Russian grammar that can also be utilized in both pedagogical and computational applications. 1. The quantitative turn in linguistics and the theoretical challenge I recently conducted a study of articles published since the inauguration of the journal Cognitive Linguistics in 1990. At the year 2008, Cognitive Linguistics took a quantitative turn; since that point over 50% of the scholarly articles that journal publishes have involved quantitative analysis of linguistic data [Janda 2013: 4–6]. Though my sample was limited to only one journal, this trend is endemic in linguistics, and it is motivated by a confluence of historic factors.