Morphologically Annotated Corpora for Seven Arabic Dialects: Taizi, Sanaani, Najdi, Jordanian, Syrian, Iraqi and Moroccan

Faisal Alshargi,⋆ Shahd Dibas,‡ Sakhar Alkhereyf,† Reem Faraj,† Basmah Abdulkareem,† Sane Yagi,‡ Ouafaa Kacha,‡ Nizar Habash,∗ Owen Rambow§
⋆Universität Leipzig, Germany; ‡University of Jordan, Jordan; †Columbia University, USA; ∗New York University Abu Dhabi, UAE; §Elemental Cognition, USA
[email protected], [email protected], [email protected], [email protected], [email protected]

Proceedings of the Fourth Arabic Natural Language Processing Workshop, pages 137–147, Florence, Italy, August 1, 2019. © 2019 Association for Computational Linguistics

Abstract

We present a collection of morphologically annotated corpora for seven Arabic dialects: Taizi Yemeni, Sanaani Yemeni, Najdi, Jordanian, Syrian, Iraqi and Moroccan Arabic. The corpora collectively cover over 200,000 words, and are all manually annotated in a common set of standards for orthography, diacritized lemmas, tokenization, morphological units and English glosses. These corpora will be publicly available to serve as benchmarks for training and evaluating systems for Arabic dialect morphological analysis and disambiguation.

1 Introduction

As Arabic dialects (DA) become more widely written in social media, there is increased interest in the Arabic NLP community to have annotated corpora that will allow us to both study the dialects linguistically, and to create systems that can automatically process dialectal text. There have been important efforts to create relatively large corpora for Egyptian (Maamouri et al., 2014), Palestinian (Jarrar et al., 2014), and Emirati Arabic (Khalifa et al., 2018). While these resources are very helpful for single dialects, the problem is that there are many dialects, and in fact it is often unclear what to count as separate dialects (for example, the subdialects of Levantine). Therefore, we present a different approach in this paper: we annotate seven dialects, but with relatively smaller corpora (most around 30,000 words). Some of the dialects are closely related (Jordanian and Syrian), others are more distant (Moroccan). We use the same annotation methodology for all dialects: same guidelines, same processing steps, and same annotation file format. This makes our effort an ideal starting point for experimenting with using multidialectal resources to create and train NLP tools. The dialects we consider are Taizi Yemeni (YE.TZ)¹, Sanaani Yemeni (YE.SN), Saudi Najdi (SA.NJ), Jordanian (JOR), Syrian Damascene (SY.DM), Iraqi Baghdadi (IR.BG), and Moroccan Rabati (MA.RB) Arabic.

The paper is structured as follows. We start with a review of relevant literature (Section 2). We then summarize some linguistic facts about DA in general (Section 3) and subsequently present each of our seven dialects in Section 4, summarizing the corpora used and some interesting facts specific to each dialect. Section 5 then presents our annotation methodology. We then briefly discuss morphological analyzers, and conclude.

2 Related Work

Data Collections. There have been several data collections centered on Arabic dialects, specifically spoken Arabic. A very useful resource is the Semitisches Tonarchiv at the University of Heidelberg in Germany.² We have included two Yemeni transcriptions from this resource in our YE.TZ and YE.SN corpora. Khalifa et al. (2016) is a large collection of over 100M words of a number of Arabic dialects, although the majority is from the Gulf. Bouamor et al. (2018) created a large corpus of parallel text from 25 Arab cities. Further data collections include (Al-Amri, 2000), which has not yet been digitized for use in NLP research.

Annotated Corpora. There are few annotated corpora for dialectal Arabic: the Levantine Arabic Treebank (specifically Jordanian) (Maamouri et al., 2006), the Egyptian Arabic Treebank (Maamouri et al., 2014), Curras, the Palestinian Arabic annotated corpus (Jarrar et al., 2014), the Gulf Arabic Annotated corpus (Khalifa et al., 2018), Syrian and Jordanian dialectal corpora (Bouamor et al., 2014; Harrat et al., 2014), a small effort on Sanaani and Moroccan (AlShargi et al., 2016) (which this paper builds on), and SUAR (Al-Twairesh et al., 2018), a morphologically annotated corpus for Najdi and Hijazi which is semi-automatically annotated using the MADAMIRA tool (Pasha et al., 2014) and subsequently manually checked. Additionally, Voss et al. (2014) present a corpus of Moroccan dialect which has been annotated for language variety (code switching). Several of these efforts have followed the approach of Curras (Jarrar et al., 2014), which consists of around 70,000 words of a balanced genre corpus. The corpus was manually annotated using the DIWAN tool (Alshargi and Rambow, 2015), which we also use. The annotation in Curras is done by first using a morphological tagger for another Arabic dialect, namely MADAMIRA Egyptian (Pasha et al., 2014), to produce a base that was then corrected or accepted by a trained annotator.

Other NLP Resources for Dialectal Arabic. The effort to annotate corpora in context is a central step in developing morphological analyzers and taggers (Eskander et al., 2013; Habash et al., 2013). However, other notable approaches and efforts that do not use annotated corpora have focused on developing specific resources manually or semi-automatically, e.g., the Egyptian Arabic morphological analyzer (Habash et al., 2012b) which is built upon the Egyptian Colloquial Arabic Lexicon (Kilany et al., 2002), the multi-dialectal dictionary Tharwa (Diab et al., 2014), or extending MSA analyzers and resources (Salloum and Habash, 2014; Harrat et al., 2014; Boujelbane et al., 2013).

Linguistic Studies. There are many theoretical and descriptive linguistic studies for the dialects we work on: Yemeni dialects (Watson, 1993, 2002), Najdi (Ingham, 1994), Gulf Arabic dialect (Holes, 1990), Jordanian (Bani-Yasin and Owens, 1987), Moroccan (Harrell, 1962), Syrian (Cowell, 1964), and Iraqi (Erwin, 1963); not to mention comparative studies across dialects and MSA (Holes, 2004; Brustad, 2000). We make extensive use of such studies as part of the design of our annotation guidelines.

3 Dialects: Linguistic Facts

In this section we present some general facts and phenomena shared across different dialects. In subsequent subsections, we present our dialects in more detail and comment on the corpus sources.

Dialects and MSA. Arabic dialects share many commonalities with Classical Arabic and Modern Standard Arabic (MSA). All variants of Arabic are morphologically complex as they include rich inflectional and derivational morphology that is expressed in two ways: namely, via templates and affixes. Furthermore, they contain several classes of attachable clitics. However, the dialects as a class differ in consistent ways from MSA, and they differ amongst each other. In fact, the differences between MSA and Dialectal Arabic (DA) have often been compared to those between Latin and the Romance languages (Chiang et al., 2006). The principal morpho-syntactic difference between DA and MSA is the loss of productive case marking, and nunation (tanween) on nouns, and mood on imperfective verbs.

Dialectal Variations. Differences among the dialects are found on all levels of linguistic description, i.e., phonology, morphology, syntax, and the lexicon. We summarize three phonological and three morphological salient examples in Table 1 for our dialects: the pronunciation of MSA /q/ written ق q,³ MSA /dʒ/ written ج j and MSA /k/ written ك k; and the various forms of the future, progressive and possessive particles.

Table 1: Cross-dialectal and MSA variants in some phonological and morphological phenomena

Phenomenon | MSA | YE.TZ | YE.SN | SA.NJ | JOR | SY.DM | IR.BG | MA.RB
Pronunciation of ق q | /q/ | /q/ | /g/ | /g/ or /dz/ | /g/ or /ʔ/ | /ʔ/ | /g/ | /q/ or /g/
Pronunciation of ج j | /dʒ/ | /g/ | /dʒ/ | /dʒ/ | /ʒ/ | /ʒ/ | /dʒ/ | /dʒ/
Pronunciation of ك k | /k/ | /k/ | /k/ | /k/ or /ts/ | /k/ or /tʃ/ | /k/ | /k/ or /tʃ/ | /k/
Future Particle | s+ | $+ | E+ | b+ | H+ | H+ | H+ | g+
Progressive Particle | φ | b+ | b+ | qAEd | b+ | b+ | d+ | k+
Possessive Particle | φ | tbE | tbE | Hq | tbE | tbE | mAl | d+

Further forms given in the particle rows (column assignment not recoverable here): future swf, A$, Ed, rH, rAH, gAdy, $+, y+; progressive jAls, Em, qAEd, t+; possessive Hq, tAE, dyAl.

From a lexical point of view, there are many words that have different meanings across dialects. For example, the word ماشي mA$y /maːʃi/ is ‘no’ in YE.SN and MA.RB, ‘yes/ok’ in SY.DM and JOR, and ‘walking’ in SA.NJ. Another example is the word صافي SAfy /sˤaːfi/ which means ‘enough’ in MA.RB, but ‘pure’ in the other dialects and MSA. Some cases show subtle differences in meaning, e.g., خدام xdAm /xaddaːm/ means ‘employee’ generically in MA.RB, but it has a more specific and negative connotation in YE.TZ and YE.SN, namely ‘enslaved servant’. While the above cases are all homonyms (homophones and homographs), there are instances of homophones that have different meanings in different dialects. For example the utterance /fagr/ can mean ‘morning’ in YE.TZ (written as فجر fjr), or .

4 Dialect-Specific Corpora

Until recently, Arabic was mostly written in Mod-

¹ The abbreviations we use intend to capture the country name and the city or region name when applicable.
² http://www.semarch.uni-hd.de
³ We represent the Arabic words in Arabic script and in the Buckwalter transliteration (in italics) (Habash et al., 2007). When needed, we present the IPA (in /.../). The English gloss is added in single quotes.
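The cross-dialect lexical differences described in Section 3 can be pictured as a small lookup keyed by Buckwalter transliteration and dialect code. The sketch below is only an illustration under stated assumptions: the names `DIALECTS`, `SENSES` and `gloss` are hypothetical and are not part of the released corpora or of any tool mentioned in the paper, and only glosses that the text states explicitly are included.

```python
from typing import Optional

# Dialect codes as introduced in the Introduction.
DIALECTS = {
    "YE.TZ": "Taizi Yemeni",
    "YE.SN": "Sanaani Yemeni",
    "SA.NJ": "Saudi Najdi",
    "JOR": "Jordanian",
    "SY.DM": "Syrian Damascene",
    "IR.BG": "Iraqi Baghdadi",
    "MA.RB": "Moroccan Rabati",
}

# Hypothetical structure: Buckwalter form -> dialect code -> English gloss.
# Only senses given in Section 3 are listed; missing entries mean "not stated".
SENSES = {
    "mA$y": {
        "YE.SN": "no",
        "MA.RB": "no",
        "SY.DM": "yes/ok",
        "JOR": "yes/ok",
        "SA.NJ": "walking",
    },
    "xdAm": {
        "MA.RB": "employee",
        "YE.TZ": "enslaved servant",
        "YE.SN": "enslaved servant",
    },
}


def gloss(word: str, dialect: str) -> Optional[str]:
    """Return the gloss of `word` in `dialect`, or None if the text gives none."""
    if dialect not in DIALECTS:
        raise KeyError(f"unknown dialect code: {dialect}")
    return SENSES.get(word, {}).get(dialect)
```

For instance, `gloss("mA$y", "JOR")` yields the Jordanian sense, while a dialect for which the text gives no sense returns `None` rather than a guess. Keeping the dialect code as an explicit key mirrors the paper's point that the same written form cannot be interpreted without knowing the dialect.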