Building a Corpus for Palestinian Arabic: a Preliminary Study
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Issues in the Syntax of Arabic
Cambridge University Press 978-0-521-65017-5 - The Syntax of Arabic Joseph E. Aoun, Elabbas Benmamoun and Lina Choueiri Excerpt More information 1 Issues in the syntax of Arabic 1.1 The Arabic language(s) Arabic belongs to the Semitic branch of the Afro-Asiatic (Hamito- Semitic) family of languages, which includes languages like Aramaic, Ethiopian, South Arabian, Syriac, and Hebrew. A number of the languages in this group are spoken in the Middle East, the Arabian Peninsula, and Africa. It has been docu- mented that Arabic spread with the Islamic conquests from the Arabian Peninsula and within a few decades, it spread over a wide territory across North Africa and the Middle East. Arabic is now spoken by more than 200 million speakers excluding bilingual speakers (Gordon 2005). Although there is a debate about the history of Arabic (including that of the Standard variety and the spoken dialects) Arabic displays some of the typical characteristics of Semitic languages: root-pattern morphology, broken plurals in nouns, emphatic and glottalized consonants, and a verbal system with prefix and suffix conjugation. 1.1.1 The development of Arabic Classical Arabic evolved from the standardization of the language of the Qur’an and poetry. This standardization became necessary at the time when Arabic became the language of an empire, with the Islamic expansion starting in the seventh century. In addition to Classical Arabic, there were regional spo- ken Arabic varieties. It is a matter of intense debate what the nature of the historical relation between Classical Arabic and the spoken dialects is (Owens 2007). -
Arabic Sociolinguistics: Topics in Diglossia, Gender, Identity, And
Arabic Sociolinguistics Arabic Sociolinguistics Reem Bassiouney Edinburgh University Press © Reem Bassiouney, 2009 Edinburgh University Press Ltd 22 George Square, Edinburgh Typeset in ll/13pt Ehrhardt by Servis Filmsetting Ltd, Stockport, Cheshire, and printed and bound in Great Britain by CPI Antony Rowe, Chippenham and East bourne A CIP record for this book is available from the British Library ISBN 978 0 7486 2373 0 (hardback) ISBN 978 0 7486 2374 7 (paperback) The right ofReem Bassiouney to be identified as author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988. Contents Acknowledgements viii List of charts, maps and tables x List of abbreviations xii Conventions used in this book xiv Introduction 1 1. Diglossia and dialect groups in the Arab world 9 1.1 Diglossia 10 1.1.1 Anoverviewofthestudyofdiglossia 10 1.1.2 Theories that explain diglossia in terms oflevels 14 1.1.3 The idea ofEducated Spoken Arabic 16 1.2 Dialects/varieties in the Arab world 18 1.2. 1 The concept ofprestige as different from that ofstandard 18 1.2.2 Groups ofdialects in the Arab world 19 1.3 Conclusion 26 2. Code-switching 28 2.1 Introduction 29 2.2 Problem of terminology: code-switching and code-mixing 30 2.3 Code-switching and diglossia 31 2.4 The study of constraints on code-switching in relation to the Arab world 31 2.4. 1 Structural constraints on classic code-switching 31 2.4.2 Structural constraints on diglossic switching 42 2.5 Motivations for code-switching 59 2. -
Arabic and Contact-Induced Change Christopher Lucas, Stefano Manfredi
Arabic and Contact-Induced Change Christopher Lucas, Stefano Manfredi To cite this version: Christopher Lucas, Stefano Manfredi. Arabic and Contact-Induced Change. 2020. halshs-03094950 HAL Id: halshs-03094950 https://halshs.archives-ouvertes.fr/halshs-03094950 Submitted on 15 Jan 2021 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Arabic and contact-induced change Edited by Christopher Lucas Stefano Manfredi language Contact and Multilingualism 1 science press Contact and Multilingualism Editors: Isabelle Léglise (CNRS SeDyL), Stefano Manfredi (CNRS SeDyL) In this series: 1. Lucas, Christopher & Stefano Manfredi (eds.). Arabic and contact-induced change. Arabic and contact-induced change Edited by Christopher Lucas Stefano Manfredi language science press Lucas, Christopher & Stefano Manfredi (eds.). 2020. Arabic and contact-induced change (Contact and Multilingualism 1). Berlin: Language Science Press. This title can be downloaded at: http://langsci-press.org/catalog/book/235 © 2020, the authors Published under the Creative Commons Attribution -
A Tale of Two Morphologies
A Tale of Two Morphologies Verb structure and argument alternations in Maltese Dissertation zur Erlangung des akademischen Grades eines Doktors der Philosophie vorgelegt von Spagnol, Michael an der Geisteswissenschaftliche Sektion Sprachwissenschaft 1. Referent: Prof. Dr. Frans Plank 2. Referent: Prof. Dr. Christoph Schwarze 3. Referent: Prof. Dr. Albert Borg To my late Nannu Kieli, a great story teller Contents Acknowledgments ............................................................................................................................. iii Notational conventions .................................................................................................................... v Abstract ............................................................................................................................................... viii Ch. 1. Introduction ............................................................................................................................. 1 1.1. A tale to be told ............................................................................................................................................. 2 1.2 Three sides to every tale ........................................................................................................................... 4 Ch. 2. Setting the stage ...................................................................................................................... 9 2.1. No language is an island ....................................................................................................................... -
The Routledge Handbook of Arabic Linguistics
Review Copy - Not for Distribution Youssef A Haddad - University of Florida - 02/01/2018 THE ROUTLEDGE HANDBOOK OF ARABIC LINGUISTICS The Routledge Handbook of Arabic Linguistics introduces readers to the major facets of research on Arabic and of the linguistic situation in the Arabic-speaking world. The edited collection includes chapters from prominent experts on various fields of Arabic linguistics. The contributors provide overviews of the state of the art in their field and specifically focus on ideas and issues. Not simply an overview of the field, this handbook explores subjects in great depth and from multiple perspectives. In addition to the traditional areas of Arabic linguistics, the handbook covers computational approaches to Arabic, Arabic in the diaspora, neurolinguistic approaches to Arabic, and Arabic as a global language. The Routledge Handbook of Arabic Linguistics is a much-needed resource for researchers on Arabic and comparative linguistics, syntax, morphology, computational linguistics, psycholinguistics, sociolinguistics, and applied linguistics, and also for undergraduate and graduate students studying Arabic or linguistics. Elabbas Benmamoun is Professor of Asian and Middle Eastern Studies and Linguistics at Duke University, USA. Reem Bassiouney is Professor in the Applied Linguistics Department at the American University in Cairo, Egypt. Review Copy - Not for Distribution Youssef A Haddad - University of Florida - 02/01/2018 Review Copy - Not for Distribution Youssef A Haddad - University of Florida - 02/01/2018 -
AIDA Bibliographie
George Grigore A Bibliography of AIDA Association Internationale de Dialectologie Arabe (1992-2017) Descrierea CIP a Bibliotecii Naţionale a României GRIGORE, GEORGE A bibliography of AIDA (Association Internationale de Dialectologie Arabe) : (1992-2017) / George Grigore ; pref.: Dominique Caubert, George Grigore, Stephan Procházka. - Iaşi : Ars Longa, 2016 ISBN 978-973-148-245-3 I. Caubert, Dominique (pref.) II. Procházka, Stephan (pref.) 061:811.411.21'28"1992-2017" © George Grigore © ARS LONGA, 2016 str. Elena Doamna, 2 700398 Iaşi, România Tel.: 0724 516 581 Fax: +40-232-215078 e-mail: [email protected]; [email protected] web: www.arslonga.ro All rights reserved. George Grigore A Bibliography of AIDA Association Internationale de Dialectologie Arabe (1992-2017) Foreword: AIDA – A brief history by Dominique Caubet George Grigore Stephan Procházka Ars Longa 2016 AIDA (Association Internationale de Dialectologie Arabe) – A brief history1 – AIDA (fr. Association Internationale de Dialectologie Arabe) – International Association of Arabic Dialectology / is an association of researchers – الرابطة الدولية لدراسة اللهجات العربية in Arabic dialects, Nowadays AIDA is the leading international association in this field of research and it has become a platform that joins scholars from all over the world, interested in various aspects of Arabic dialectology. Therefore, the idea of founding an association gathering the high-rated specialists in Arabic dialects was discussed, for the first time, between Dominique Caubet (the future founder of AIDA) and a number of dialectologists such as Clive Holes (Oriental Studies, Trinity Hall, Cambridge), Bruce Ingham (The School of Oriental and African Studies, London), Otto Jastrow (Heidelberg University), Catherine Miller (Director of IREMAM, Institut de Recherches et d’Etudes sur le Monde 1 Many thanks to our colleagues Catherine Miller and Martine Vanhove for helping us to clarify the beginnings of AIDA. -
Current Research on Linguistic Variation in the Arabic-Speaking World
Language and Linguistics Compass 10/8 (2016): 370–381, 10.1111/lnc3.12202 Current Research on Linguistic Variation in the Arabic-Speaking World Uri Horesh1* and William M. Cotter2 1Department of Language and Linguistics, University of Essex 2School of Anthropology and Department of Linguistics, The University of Arizona Abstract Given its abundance of dialects, varieties, styles, and registers, Arabic lends itself easily to the study of language variation and change. It is spoken by some 300 million people in an area spanning roughly from northwest Africa to the Persian Gulf. Traditional Arabic dialectology has dealt predominantly with geographical variation. However, in recent years, more nuanced studies of inter- and intra-speaker variation have seen the light of day. In some respects, Arabic sociolinguistics is still lagging behind the field compared to variationist studies in English and other Western languages. On the other hand, the insight presented in studies of Arabic can and should be considered in the course of shaping a crosslinguistic sociolinguistic theory. Variationist studies of Arabic-speaking speech communities began almost two de- cades after Labov’s pioneering studies of American English and have f lourished following the turn of the 21st century. These studies have sparked debates between more quantitatively inclined sociolinguists and those who value qualitative analysis. In reality, virtually no sociolinguistic study of Arabic that includes statistical modeling is free of qualitative insights. They are also not f lawless and not always cutting edge methodologically or theoretically, but the field is moving in a positive direction, which will likely lead to the recognition of its significance to sociolinguistics at large. -
Chapter 4 Arabic in Iraq, Syria, and Southern Turkey Stephan Procházka University of Vienna
Chapter 4 Arabic in Iraq, Syria, and southern Turkey Stephan Procházka University of Vienna This chapter covers the Arabic dialects spoken in the region stretching from the Turkish province of Mersin in the west to Iraq in the east, including Lebanon and Syria. The area is characterized by a high degree of linguistic diversity, and for about two and a half millennia Arabic has come into contact with various other Semitic languages, as well as with Indo-European languages and Turkish. Bilin- gualism, particularly with Aramaic, Kurdish, and Turkish, has resulted in numer- ous contact-induced changes in all realms of grammar, including morphology and syntax. 1 Current state and historical development The region discussed in this chapter is linguistically extremely heterogeneous: in it three different Arabic dialect groups, plus several other languages, are spo- ken. The two main Arabic dialect groups are Syrian and Iraqi, the distribution of which does not exactly correspond to the political boundaries of those two countries. Syrian-type dialects are also spoken in Lebanon, in three provinces of southern Turkey (Mersin, Adana,1 Hatay), and in one village on Cyprus (cf. Walter, this volume). In Iraq, Arabic is mainly spoken in Mesopotamia proper, whereas considerable parts of the mountainous parts of the country are Kurdish- speaking. Arabic dialects which are very akin to the Iraqi ones extend into north- eastern Syria and southeastern Anatolia (for the latter see Akkuş, this volume). These two groups are geographically divided by a third dialect group, which ar- rived in the region with an originally (semi-) nomadic population from northern 1The dialects spoken in Mersin and Adana provinces will henceforth referred to as Cilician Arabic. -
Diacritization of Moroccan and Tunisian Arabic Dialects: a CRF Approach
Diacritization of Moroccan and Tunisian Arabic Dialects: A CRF Approach Kareem Darwish∗, Ahmed Abdelali∗, Hamdy Mubarak∗, Younes Samihy, Mohammed Attia? ∗QCRI, yUniversity of Dusseldorf, ?Google Inc. fkdarwish, aabdelali, [email protected], [email protected], [email protected] Abstract Arabic is written as a sequence of consonants and long vowels, with short vowels normally omitted. Diacritization attempts to recover short vowels and is an essential step for Text-to-Speech (TTS) systems. Though Automatic diacritization of Modern Standard Arabic (MSA) has received significant attention, limited research has been conducted on dialectal Arabic (DA) diacritization. Phonemic patterns of DA vary greatly from MSA and even from one another, which accounts for the noted difficulty of mutual intelligibility between dialects. In this paper we present our research and benchmark results on the automatic diacritization of two Maghrebi sub-dialects, namely Tunisian and Moroccan, using Conditional Random Fields (CRF). Aside from using character n-grams as features, we also employ character-level Brown clusters, which are hierarchical clusters of characters based on the contexts in which they appear. We achieved word-level diacritization errors of 2.9% and 3.8% for Moroccan and Tunisian respectively. We also show that effective diacritization can be performed out-of-context for both sub-dialects. Keywords: Arabic, Dialects, Vowelization, Diacritization 1. Introduction • We explore the use of cross dialect and joint dialect train- ing between Moroccan and Tunisian, highlighting the or- Different varieties of Arabic are typically written with- thographic and phonetic similarities and dissimilarities of out diacritics (short vowels). Arabic readers disambiguate both sub-dialects. -
Classifying ASR Transcriptions According to Arabic Dialect
Classifying ASR Transcriptions According to Arabic Dialect Abualsoud Hanani, Aziz Qaroush Stephen Taylor Electrical & Computer Engineering Dept Computer Science Dept Birzeit University Fitchburg State University West Bank, Palestine Fitchburg, MA, USA ahanani,qaroush @birzeit.edu [email protected] { } Abstract We describe several systems for identifying short samples of Arabic dialects, which were pre- pared for the shared task of the 2016 DSL Workshop (Malmasi et al., 2016). Our best system, an SVM using character tri-gram features, achieved an accuracy on the test data for the task of 0.4279, compared to a baseline of 0.20 for chance guesses or 0.2279 if we had always chosen the same most frequent class in the test set. This compares with the results of the team with the best weighted F1 score, which was an accuracy of 0.5117. The team entries seem to fall into cohorts, with the all the teams in a cohort within a standard-deviation of each other, and our three entries are in the third cohort, which is about seven standard deviations from the top. 1 Introduction In 2016 the Distinguishing Similar Languages workshop (Malmasi et al., 2016) added a shared task to classify short segments of text as one of five Arabic dialects. The workshop organizers provided a training file and a schedule. After allowing the participants development time, they distributed a test file, and evaluated the success of participating systems. We built several systems for dialect classification, and submitted runs from three of them. Interestingly, our results on the workshop test data were not consistent with our tests on reserved training data. -
Arabic Diglossia Within Palestinian-Arab Folk Narratives
Portland State University PDXScholar University Honors Theses University Honors College 2017 Arabic Diglossia within Palestinian-Arab Folk Narratives Jordan M. Martinez Portland State University Follow this and additional works at: https://pdxscholar.library.pdx.edu/honorstheses Let us know how access to this document benefits ou.y Recommended Citation Martinez, Jordan M., "Arabic Diglossia within Palestinian-Arab Folk Narratives" (2017). University Honors Theses. Paper 421. https://doi.org/10.15760/honors.417 This Thesis is brought to you for free and open access. It has been accepted for inclusion in University Honors Theses by an authorized administrator of PDXScholar. Please contact us if we can make this document more accessible: [email protected]. ARABIC DIGLOSSIA WITHIN PALESTINIAN-ARAB FOLK NARRATIVES Jordan Martinez Dr. Dirgham H. Sbait Honors 403: Undergraduate Thesis June 15, 2017 1 Acknowledgements I would like to give a special thanks to my thesis advisor, Dr. Sbait, for spending numerous hours working with me this past year. This project would have been impossible without his guidance and profound knowledge of the Arabic language. I wish you luck in your retirement! I would also like to acknowledge the work put forth by Ibrahim Muhawi and Sharif Kanaana. Their work with Speak Bird, Speak Again is unprecedented, and will always be appreciated. 2 Abstract The preservation of Palestinian folktales is a way to restore and preserve Palestinian culture in result to the declining state and displacement of Palestinian populations into other regions of the Arabic World (Muhawi and Kanaana 1989). Some academics have acknowledged that the translation of folklore and other similar tales have contributed to a growing consideration of foreign cultures and the overall understanding by non-native sources (Klaus Roth 1998) (Kanaana 2005). -
Arabic Dialects (General Article)
This is a repository copy of Arabic dialects (general article). White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/76443/ Version: Accepted Version Book Section: Watson, J (2011) Arabic dialects (general article). In: Weninger, S, Khan, G, Streck, M and Watson, JCE, (eds.) The Semitic Languages: An international handbook. Handbücher zur Sprach- und Kommunikationswissenschaft / Handbooks of Linguistics and Communication Science (HSK) . Walter de Gruyter , Berlin , pp. 851-896. ISBN 978-3-11-025158-6 This article is protected by copyright. Reproduced in accordance with De Gruyter self-archiving policy. https://www.degruyter.com/viewbooktoc/product/175227? rskey=gAMxxa&result=1 Reuse See Attached Takedown If you consider content in White Rose Research Online to be in breach of UK law, please notify us by emailing [email protected] including the URL of the record and the reason for the withdrawal request. [email protected] https://eprints.whiterose.ac.uk/ 312 50. Arabic Dialects (general article) 841 278 McLoughlin, L. 279280 1972 Towards a definition of modern standard Arabic. Archivum Linguisticum (new series) 281 3, 57Ϫ73. 282 Monteil, V. 283284 1960 L’arabe moderne (Etudes arabes et islamiques 4). Paris: Klinksieck. 285 Parkinson, D. 286287 1991 Searching for modern fushâ: Real-life formal Arabic. Al-Arabiyya 24, 31Ϫ64. 288 Ryding, K. 289290 2005a A Reference Grammar of Modern Standard Arabic. Cambridge: Cambridge Univer- 291 sity Press. 292 Ryding, K. 293294 2005b Educated Arabic. Encyclopedia of Arabic Language and Linguistics, Vol. 1 (Leiden: 295 Brill) 666Ϫ671. 296 Ryding, K. 297298 2006 Teaching Arabic in the United States.