The MADAR Arabic Dialect Corpus and Lexicon
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Different Dialects of Arabic Language
e-ISSN : 2347 - 9671, p- ISSN : 2349 - 0187 EPRA International Journal of Economic and Business Review Vol - 3, Issue- 9, September 2015 Inno Space (SJIF) Impact Factor : 4.618(Morocco) ISI Impact Factor : 1.259 (Dubai, UAE) DIFFERENT DIALECTS OF ARABIC LANGUAGE ABSTRACT ifferent dialects of Arabic language have been an Dattraction of students of linguistics. Many studies have 1 Ali Akbar.P been done in this regard. Arabic language is one of the fastest growing languages in the world. It is the mother tongue of 420 million in people 1 Research scholar, across the world. And it is the official language of 23 countries spread Department of Arabic, over Asia and Africa. Arabic has gained the status of world languages Farook College, recognized by the UN. The economic significance of the region where Calicut, Kerala, Arabic is being spoken makes the language more acceptable in the India world political and economical arena. The geopolitical significance of the region and its language cannot be ignored by the economic super powers and political stakeholders. KEY WORDS: Arabic, Dialect, Moroccan, Egyptian, Gulf, Kabael, world economy, super powers INTRODUCTION DISCUSSION The importance of Arabic language has been Within the non-Gulf Arabic varieties, the largest multiplied with the emergence of globalization process in difference is between the non-Egyptian North African the nineties of the last century thank to the oil reservoirs dialects and the others. Moroccan Arabic in particular is in the region, because petrol plays an important role in nearly incomprehensible to Arabic speakers east of Algeria. propelling world economy and politics. -
A Note on the Genitive Particle Ħaqq in Yemeni Arabic Free Genitives Mohammed Ali Qarabesh, University of Albayda Mohammed Q
A note on ħaqq in Yemeni Arabic … Qarabesh & Shormani A Note on the Genitive particle ħaqq in Yemeni Arabic Free Genitives Mohammed Ali Qarabesh, University of Albayda Mohammed Q. Shormani, University of Ibb الملخص: تتىاوه هذي اىىرقح ميمح "حق" فٍ اىيهجح اىُمىُح ورتثتها اىىحىَح فٍ تزمُة إضافح اىمينُح اىتحيُيُح، وتقذً ىها تحيُو وحىٌ وصفٍ، حُث َفتزض اىثاحثان أن هىاك وىػُه مه هذي اىنيمح فٍ اىيهجح اىُمىُح: ا( تيل اىتٍ ﻻ تظهز ػيُها ػﻻماخ اىتطاتق، مثو "اىسُاراخ حق ػيٍ"، حُث وزي أن ميمح "اىسُاراخ" ىها اىسماخ )جمغ، مؤوث، غائة( وىنه ميمح "حق" ﻻ تتطاتق مؼها فٍ أٌ مه هذي اىصفاخ، و ب( تيل اىتٍ تظهز ػيُها ػﻻماخ اىتطاتق مثو "اىسُاراخ حقاخ ػيٍ" حُث تتطاتق اىنيمتان "اىسُاراخ" و"حقاخ" فٍ مو اىسماخ. وؼَزض اىثاحثان أن اىىىع اﻷوه َ ستخذً فٍ مىاطق مثو صىؼاء، ػذن، إب... اىخ، واىثاوٍ فٍ شثىج وحضزمىخ ... اىخ. وَخيص اىثاحثان إىً أن هىاك دىُو ػميٍ ىُس فقظ ػيً وجىد اىىحى اىنيٍ فٍ "اىمينح اىيغىَح" تو أَضا ػيً "تَ ْى َس َطح" هذا اىىحى، ىُس فقظ تُه اىيغاخ تو وتُه ىهجاخ اىيغح اىىاحذج. الكلمات المفتاحية: اىيغاخ اىسامُح، اىؼزتُح اىُمىُح، اىؼثزَح، اى م ْينُح، "حق" Abstract This paper provides a descriptive syntactic analysis of ħaqq in Yemeni Arabic (YA). ħaqq is a Semitic Free Genitive (FG) particle, much like the English of. A FG minimally consists of a head N, genitive particle and genitive DP complement. It (in a FG) expresses or conveys the meaning of possessiveness, something like of in English. There are two types of ħaqq in Yemeni Arabic: one not exhibiting agreement with the head N, and another exhibiting it. -
Arabic Sociolinguistics: Topics in Diglossia, Gender, Identity, And
Arabic Sociolinguistics Arabic Sociolinguistics Reem Bassiouney Edinburgh University Press © Reem Bassiouney, 2009 Edinburgh University Press Ltd 22 George Square, Edinburgh Typeset in ll/13pt Ehrhardt by Servis Filmsetting Ltd, Stockport, Cheshire, and printed and bound in Great Britain by CPI Antony Rowe, Chippenham and East bourne A CIP record for this book is available from the British Library ISBN 978 0 7486 2373 0 (hardback) ISBN 978 0 7486 2374 7 (paperback) The right ofReem Bassiouney to be identified as author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988. Contents Acknowledgements viii List of charts, maps and tables x List of abbreviations xii Conventions used in this book xiv Introduction 1 1. Diglossia and dialect groups in the Arab world 9 1.1 Diglossia 10 1.1.1 Anoverviewofthestudyofdiglossia 10 1.1.2 Theories that explain diglossia in terms oflevels 14 1.1.3 The idea ofEducated Spoken Arabic 16 1.2 Dialects/varieties in the Arab world 18 1.2. 1 The concept ofprestige as different from that ofstandard 18 1.2.2 Groups ofdialects in the Arab world 19 1.3 Conclusion 26 2. Code-switching 28 2.1 Introduction 29 2.2 Problem of terminology: code-switching and code-mixing 30 2.3 Code-switching and diglossia 31 2.4 The study of constraints on code-switching in relation to the Arab world 31 2.4. 1 Structural constraints on classic code-switching 31 2.4.2 Structural constraints on diglossic switching 42 2.5 Motivations for code-switching 59 2. -
Arabic and Contact-Induced Change Christopher Lucas, Stefano Manfredi
Arabic and Contact-Induced Change Christopher Lucas, Stefano Manfredi To cite this version: Christopher Lucas, Stefano Manfredi. Arabic and Contact-Induced Change. 2020. halshs-03094950 HAL Id: halshs-03094950 https://halshs.archives-ouvertes.fr/halshs-03094950 Submitted on 15 Jan 2021 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Arabic and contact-induced change Edited by Christopher Lucas Stefano Manfredi language Contact and Multilingualism 1 science press Contact and Multilingualism Editors: Isabelle Léglise (CNRS SeDyL), Stefano Manfredi (CNRS SeDyL) In this series: 1. Lucas, Christopher & Stefano Manfredi (eds.). Arabic and contact-induced change. Arabic and contact-induced change Edited by Christopher Lucas Stefano Manfredi language science press Lucas, Christopher & Stefano Manfredi (eds.). 2020. Arabic and contact-induced change (Contact and Multilingualism 1). Berlin: Language Science Press. This title can be downloaded at: http://langsci-press.org/catalog/book/235 © 2020, the authors Published under the Creative Commons Attribution -
Phonetics and Phonology Paradox in Levantine Arabic: an Analytical Evaluation of Arabic Geminates’ Hypocrisy
ISSN 1799-2591 Theory and Practice in Language Studies, Vol. 9, No. 7, pp. 854-864, July 2019 DOI: http://dx.doi.org/10.17507/tpls.0907.16 Phonetics and Phonology Paradox in Levantine Arabic: An Analytical Evaluation of Arabic Geminates’ Hypocrisy Ahmad Mahmoud Saidat Department of English Language and Literature, Al-Hussein Bin Talal University, Jordan Jamal A. Khlifat Department of Linguistics, University of Colorado, Boulder, USA Abstract—This paper explores the phonetic and phonological paradox between two categories of Levantine- Arabic long consonants—known as geminates by looking closely at the hypocrite Arabic geminates. Hypocrite geminates are phonetically long segments in a sequence that are not contrastive. The paper seeks to demonstrate that Arabic geminates can be classified into two categories—true vs. fake geminates—based on the phonological process of inseparability and the Obligatory Contour Principle (OCP). Thirty Levantine Arabic speakers have taken part in this case study. Fifteen participants were asked to utter a group of stimuli where the two types of geminates interact with the surrounding phonological environment. The other fifteen participants were recorded while reading target word lists that contained geminate consonants and medial singleton preceded by short and long consonants and engaging in naturalistic conversations. Auditory and acoustic analyses of long consonants were made. Results from the word lists indicated that while Arabic true geminates embrace the phonological process of inseparability, Arabic fake geminates do not. The case study also shows that the OCP seems to bridge the contradiction between these two categories of Arabic geminates. Index Terms—Arabic geminates, epenthesis, inseparability, obligatory contour principle, CV phonology I. -
Writing Arabizi: Orthographic Variation in Romanized
WRITING ARABIZI: ORTHOGRAPHIC VARIATION IN ROMANIZED LEBANESE ARABIC ON TWITTER ! ! ! ! Natalie!Sullivan! ! ! ! TC!660H!! Plan!II!Honors!Program! The!University!of!Texas!at!Austin! ! ! ! ! May!4,!2017! ! ! ! ! ! ! ! _______________________________________________________! Barbara!Bullock,!Ph.D.! Department!of!French!&!Italian! Supervising!Professor! ! ! ! ! _______________________________________________________! John!Huehnergard,!Ph.D.! Department!of!Middle!Eastern!Studies! Second!Reader!! ii ABSTRACT Author: Natalie Sullivan Title: Writing Arabizi: Orthographic Variation in Romanized Lebanese Arabic on Twitter Supervising Professors: Dr. Barbara Bullock, Dr. John Huehnergard How does technology influence the script in which a language is written? Over the past few decades, a new form of writing has emerged across the Arab world. Known as Arabizi, it is a type of Romanized Arabic that uses Latin characters instead of Arabic script. It is mainly used by youth in technology-related contexts such as social media and texting, and has made many older Arabic speakers fear that more standard forms of Arabic may be in danger because of its use. Prior work on Arabizi suggests that although it is used frequently on social media, its orthography is not yet standardized (Palfreyman and Khalil, 2003; Abdel-Ghaffar et al., 2011). Therefore, this thesis aimed to examine orthographic variation in Romanized Lebanese Arabic, which has rarely been studied as a Romanized dialect. It was interested in how often Arabizi is used on Twitter in Lebanon and the extent of its orthographic variation. Using Twitter data collected from Beirut, tweets were analyzed to discover the most common orthographic variants in Arabizi for each Arabic letter, as well as the overall rate of Arabizi use. Results show that Arabizi was not used as frequently as hypothesized on Twitter, probably because of its low prestige and increased globalization. -
Cognate Words in Mehri and Hadhrami Arabic
Cognate Words in Mehri and Hadhrami Arabic Hassan Obeid Alfadly* Khaled Awadh Bin Mukhashin** Received: 18/3/2019 Accepted: 2/5/2019 Abstract The lexicon is one important source of information to establish genealogical relations between languages. This paper is an attempt to describe the lexical similarities between Mehri and Hadhrami Arabic and to show the extent of relatedness between them, a very little explored and described topic. The researchers are native speakers of Hadhrami Arabic and they paid many field visits to the area where Mehri is spoken. They used the Swadesh list to elicit their data from more than 20 Mehri informants and from Johnston's (1987) dictionary "The Mehri Lexicon and English- Mehri Word-list". The researchers employed lexicostatistical techniques to analyse their data and they found out that Mehri and Hadhrmi Arabic have so many cognate words. This finding confirms Watson (2011) claims that Arabic may not have replaced all the ancient languages in the South-Western Arabian Peninsula and that dialects of Arabic in this area including Hadhrami Arabic are tinged, to a greater or lesser degree, with substrate features of the Pre- Islamic Ancient and Modern South Arabian languages. Introduction: three branches including Central Semitic, Historically speaking, the Semitic language Ethiopian and Modern south Arabian languages family from which both of Arabic and Mehri (henceforth MSAL). Though Arabic and Mehri descend belong to a larger family of languages belong to the West Semitic, Arabic descends called Afro-Asiatic or Hamito-Semitic that from the Central Semitic and Mehri from includes Semitic, Egyptian, Cushitic, Omotic, (MSAL) which consists of two branches; the Berber and Chadic (Rubin, 2010). -
00. the Realization of Negation in the Syrian Arabic Clause, Phrase, And
The realization of negation in the Syrian Arabic clause, phrase, and word Isa Wayne Murphy M.Phil in Applied Linguistics Trinity College Dublin 2014 Supervisor: Dr. Brian Nolan Declaration I declare that this dissertation has not been submitted as an exercise for a degree at this or any other university and that it is entirely my own work. I agree that the Library may lend or copy this dissertation on request. Signed: Date: 2 Abstract The realization of negation in the Syrian Arabic clause, phrase, and word Isa Wayne Murphy Syrian Arabic realizes negation in broadly the same way as other dialects of Arabic, but it does so utilizing varied and at times unique means. This dissertation provides a Role and Reference Grammar account of the full spectrum of lexical, morphological, and analytical means employed by Syrian Arabic to encode negation on the layered structures of the verb, the clause, the noun, and the noun phrase. The scope negation takes within the LSC and the LSNP is identified and illustrated. The study found that Syrian Arabic employs separate negative particles to encode wide-scope negation on clauses and narrow-scope negation on constituents, and utilizes varied and interesting means to express emphatic negation. It also found that while Syrian Arabic belongs in most respects to the broader Levantine family of Arabic dialects, its negation strategy is more closely aligned with the Arabic dialects of Iraq and the Arab Gulf states. 3 Table of Contents DECLARATION......................................................................................................................... -
Modern Standard Arabic ﺝ
International Journal of Linguistics, Literature and Culture (Linqua- IJLLC) December 2014 edition Vol.1 No.3 /Ʒ/ AND /ʤ/ :ﺝ MODERN STANDARD ARABIC Hisham Monassar, PhD Assistant Professor of Arabic and Linguistics, Department of Arabic and Foreign Languages, Cameron University, Lawton, OK, USA Abstract This paper explores the phonemic inventory of Modern Standard ﺝ Arabic (MSA) with respect to the phoneme represented orthographically as in the Arabic alphabet. This phoneme has two realizations, i.e., variants, /ʤ/, /ӡ /. It seems that there is a regional variation across the Arabic-speaking peoples, a preference for either phoneme. It is observed that in Arabia /ʤ/ is dominant while in the Levant region /ӡ/ is. Each group has one variant to the exclusion of the other. However, there is an overlap regarding the two variants as far as the geographical distribution is concerned, i.e., there is no clear cut geographical or dialectal boundaries. The phone [ʤ] is an affricate, a combination of two phones: a left-face stop, [d], and a right-face fricative, [ӡ]. To produce this sound, the tip of the tongue starts at the alveolar ridge for the left-face stop [d] and retracts to the palate for the right-face fricative [ӡ]. The phone [ӡ] is a voiced palato- alveolar fricative sound produced in the palatal region bordering the alveolar ridge. This paper investigates the dichotomy, or variation, in light of the grammatical (morphological/phonological and syntactic) processes of MSA; phonologies of most Arabic dialects’ for the purpose of synchronic evidence; the history of the phoneme for diachronic evidence and internal sound change; as well as the possibility of external influence. -
Expression of Attributive Possession in Tunisian Arabic: the Role of Language Contact
Expression of Attributive Possession in Tunisian Arabic: The Role of Language Contact Lotfi Sayahi 1 Introduction Possession is a universal concept that describes a particular link between two entities. It can denote a set of relationships that range from strict ownership to more loose connections that are open to interpretations. At the semantic level, a major distinction has been established between alienable possession and inalienable possession. Alienable possession refers to relationships that are temporary or which can be ended freely such as ownership of material objects. In the case of inalienable possession the possessed entity is intrinsi- cally linked to the possessor, as in the case of part-whole relations (e.g., body parts) or kinship relations. Languages, however, vary when it comes to what counts as alienable vs. inalienable possession (Heine 1997; Payne and Barshi 1999; Herslund and Baron 2001). At the structural level, two major types of pos- sessive structures exist. The first type, which is not the object of the current study, is predicative possession which uses verb forms to express the possessive link. The second type is attributive possession, also referred to as adnominal possession, which marks the possession in the noun phrase through different morphosyntactic structures. Despite being such a common category, only a few studies have described change in the expression of possession in cases of language contact. Weinreich (1963) in his seminal work mentions the case of Estonian, Amharic, and Modern Hebrew which developed or increased the use of analytic possession constructions as a result of contact with other languages (Weinreich 1963: 41–42). -
Berber Etymologies in Maltese1
Berber E tymologies in Maltese 1 Lameen Souag LACITO (CNRS – Paris Sorbonne – INALCO) France ﻣﻠﺧص : ﻻ ﯾزال ﻣدى ﺗﺄﺛﯾر اﻷﻣﺎزﯾﻐﯾﺔ ﻓﻲ اﻟﻣﺎﻟطﯾﺔ ﻏﺎﻣﺿﺎ، واﻟﻛﺛﯾر ﻣن اﻻﻗﺗراﺣﺎت اﻟﻣﻧﺷورة ﻓﻲ ھذا اﻟﻣﺟﺎل ﻟﯾﺳت ﻣؤﻛدة . ھذه اﻟﻣﻘﺎﻟﺔ ﺗﻘﯾم ﻛل ﻛﻠﻣﺔ ﻣﺎﻟطﯾﺔ اﻗ ُﺗرح ﻟﮭﺎ أﺻل أﻣﺎزﯾﻐﻲ ﻣن ﻗﺑل، ﻓﺗﺳﺗﺑﻌد ﺧﻣﺳﯾن ﻛﻠﻣﺔ وﺗﻘﺑل ﻋﺷرﯾن . ﻛﻣﺎ أﻧﮭﺎ ﺗﻘﺗرح ﻟﻠﻣرة اﻷوﻟﻰ أﺻوﻻ أﻣﺎزﯾﻐﯾﺔ ﻟﺳﺗﺔ ﻛﻠﻣﺎت ﻣﺎﻟطﯾﺔ ﻏﯾر ھذه اﻟﺳﺑﻌﯾن، وھﻲ " أﻛﻠﺔ ﻟزﺟﺔ " ، " ﺻﻐﯾر " ، " طﯾر اﻟوﻗواق " ، " ﻧﺑﺎت اﻟدﯾس " ، " أﻧﺛﻰ اﻟﺣﺑ ﺎر " ، واﻟﺗﺻﻐﯾر ﺑﻌﻼﻣﺔ اﻟﺗﺄﻧﯾث . ﺑﻧﺎء ﻋﻠﻰ ھذه اﻟﻧﺗﺎﺋﺞ، ﺗﻔﺣص اﻟﻣﻘﺎﻟﺔ اﻟﺗوزﯾﻊ اﻟدﻻﻟﻲ ﻟدى اﻟﻣﻔردات اﻟدﺧﯾﻠﺔ ﻣن اﻷﻣﺎزﯾﻐﯾﺔ ﻟﺗﻠﻘﻲ ﻧظرة ﻋﻠﻰ ﻧوﻋﯾﺔ اﻻﺣﺗﻛﺎك ﺑﯾن اﻟﻌرﺑﯾﺔ واﻷﻣﺎزﯾﻐﯾﺔ ﻓﻲ إﻓرﯾﻘﯾﺔ ﻗﺑل وﺻول ﺑﻧﻲ ھﻼل إﻟﻰ اﻟﻣﻧطﻘﺔ Abstract : The extent of Berber lexical influence on Maltese remains unclear, and many of the published etymological proposals are problematic. This article evaluates the reliability of existing proposed Maltese etymologies involving Berber, excluding 50 but accepting 20, and proposes new Berbe r etymologies for another six Maltese words ( bażina “overcooked, sticky food”, ċkejken “small”, daqquqa “cuckoo”, dis “dis - grass, sparto”, tmilla “female cuttlefish used as bait”), as well as a diminutive formation strategy using - a . Based on the results, it examines the distribution of the loans found, in the hope of casting light on the context of Arabic - Berber contact in Ifrīqiyah before the arrival of the Banū Hilāl. Keywords: Maltese, Berber, Amazigh, etymology, loanword s 1 The author thanks Marijn van Putten, Karim Bensoukas, and two anonymous reviewers for their helpful comments on earlier drafts of this paper, and Abdessalem Saied for providing Nabuel data. © The International Journal of Arabic Linguistics (IJAL) Vol. -
Creating Resources for Dialectal Arabic from a Single Annotation: a Case Study on Egyptian and Levantine
Creating Resources for Dialectal Arabic from a Single Annotation: A Case Study on Egyptian and Levantine Ramy Eskander, Nizar Habash†, Owen Rambow and Arfath Pasha Columbia University, USA †New York University Abu Dhabi, UAE [email protected], [email protected] [email protected], [email protected] Abstract Arabic dialects present a special problem for natural language processing because there are few Arabic dialect resources, they have no standard orthography, and they have not been studied much. However, as more and more written dialectal Arabic is found on social media, natural lan- guage processing for Arabic dialects has become an important goal. We present a methodology for creating a morphological analyzer and a morphological tagger for dialectal Arabic, and we illustrate it on Egyptian and Levantine Arabic. To our knowledge, these are the first analyzer and tagger for Levantine. 1 Introduction The goal of this paper is to show how a particular type of annotated corpus can be used to create a morphological analyzer and a morphological tagger for a dialect of Arabic, without using any additional resources. A morphological analyzer is a tool that returns all possible morphological analyses for a given input word taken out of any context. A morphological tagger is a tool that identifies the single morphological analysis which is correct for a word given its specific context in a text. We illustrate our work using Egyptian Arabic and Levantine Arabic. Egyptian Arabic is in fact relatively resource-rich compared to other Arabic dialects, but we use a subset of available data to simulate a resource-poor dialect.