19 The Arabic Language
Recommended publications
-
Automatic Identification of Arabic Language Varieties and Dialects in Social Media
Automatic Identification of Arabic Language Varieties and Dialects in Social Media Fatiha Sadat (University of Quebec in Montreal, 201 President Kennedy, Montreal, QC, Canada), Farnazeh Kazemi (NLP Technologies Inc., 52 Le Royer Street W., Montreal, QC, Canada), Atefeh Farzindar (NLP Technologies Inc., 52 Le Royer Street W., Montreal, QC, Canada) Abstract Modern Standard Arabic (MSA) is the formal language in most Arabic countries. Arabic Dialects (AD) or daily language differs from MSA especially in social media communication. However, most Arabic social media texts have mixed forms and many variations especially between MSA and AD. This paper aims to bridge the gap between MSA and AD by providing a framework for AD classification using probabilistic models across social media datasets. We present a set of experiments using the character n-gram Markov language model and Naive Bayes classifiers with detailed examination of what models perform best under different conditions in social media context. Experimental results show that Naive Bayes classifier based on character bi-gram model can identify the 18 different Arabic dialects with a considerable overall accuracy of 98%. 1 Introduction Arabic is a morphologically rich and complex language, which presents significant challenges for natural language processing and its applications. It is the official language in 22 countries spoken by more than 350 million people around the world. Moreover, the Arabic language exists in a state of diglossia where the standard form of the language, Modern Standard Arabic (MSA) and the regional dialects (AD) live side-by-side and are closely related (Elfardy and Diab, 2013). -
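The abstract above names a Naive Bayes classifier over character bi-grams. As a rough illustration only, the following is a minimal sketch of that general technique, not the authors' implementation: the class name `BigramNaiveBayes`, the helper `char_bigrams`, the add-one smoothing, and the toy training strings are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def char_bigrams(text):
    """Return the overlapping character bigrams of text, in order."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


class BigramNaiveBayes:
    """Multinomial Naive Bayes over character bigrams (add-one smoothing).

    Hypothetical illustration class; not taken from the cited paper.
    """

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)  # label -> bigram frequency
        self.doc_counts = Counter(labels)   # label -> number of training texts
        self.vocab = set()                  # all bigrams seen in training
        for text, label in zip(texts, labels):
            grams = char_bigrams(text)
            self.counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            n_grams = sum(self.counts[label].values())
            # log prior + sum of smoothed log likelihoods of each bigram
            score = math.log(n_docs / total_docs)
            for gram in char_bigrams(text):
                score += math.log(
                    (self.counts[label][gram] + 1) / (n_grams + vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up placeholder strings standing in for two dialect corpora.
train_texts = ["aaab aaba abaa", "zzzy zzyz zyzz"]
train_labels = ["dialect_A", "dialect_B"]
model = BigramNaiveBayes().fit(train_texts, train_labels)
```

Calling `model.predict(...)` on a new string returns the label whose bigram statistics give the highest log score. A real experiment would need a labeled dialect corpus and a held-out evaluation set, neither of which is shown here.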
Adapting MARBERT for Improved Arabic Dialect Identification
Adapting MARBERT for Improved Arabic Dialect Identification: Submission to the NADI 2021 Shared Task Badr AlKhamissi∗ (Independent, badr [at] khamissi.com), Mohamed Gabr∗ (Microsoft EGDC, mohamed.gabr [at] microsoft.com), Muhammed ElNokrashy (Microsoft EGDC, muelnokr [at] microsoft.com), Khaled Essam (Mendel.ai, khaled.essam [at] mendel.ai) Abstract In this paper, we tackle the Nuanced Arabic Dialect Identification (NADI) shared task (Abdul-Mageed et al., 2021) and demonstrate state-of-the-art results on all of its four subtasks. Tasks are to identify the geographic origin of short Dialectal (DA) and Modern Standard Arabic (MSA) utterances at the levels of both country and province. Our final model is an ensemble of variants built on top of MARBERT that achieves an F1-score of 34.03% for DA at the country-level development set, an improvement of 7.63% from previous work. 1 Introduction [...] syntax, morphology, vocabulary, and even orthography. Dialects may be heavily influenced by previously dominant local languages. For example, Egyptian variants are influenced by the Coptic language, while Sudanese variants are influenced by the Nubian language. In this paper, we study the classification of such variants and describe our model that achieves state-of-the-art results on all of the four Nuanced Arabic Dialect Identification (NADI) subtasks (Abdul-Mageed et al., 2021). The task focuses on distinguishing both MSA and DA by their geographical origin at both the country and province levels. The data is a collection of tweets covering 100 provinces from 21 Arab countries. -
Linguistics Development Team
Development Team Principal Investigator: Prof. Pramod Pandey, Centre for Linguistics / SLL&CS, Jawaharlal Nehru University, New Delhi. Paper Coordinator: Prof. K. S. Nagaraja, Department of Linguistics, Deccan College Post-Graduate Research Institute, Pune 411006. Content Writer: Prof. K. S. Nagaraja. Content Reviewer: Prof. H. S. Ananthanarayana, Retd Prof, Department of Linguistics, Osmania University, Hyderabad 500007. Description of Module Subject Name: Linguistics. Paper Name: Historical and Comparative Linguistics. Module Title: Indo-Aryan Language Family. Module ID: Lings_P7_M1. Quadrant 1 E-Text. INDO-ARYAN LANGUAGE FAMILY The Indo-Aryan migration theory proposes that the Indo-Aryans migrated from the Central Asian steppes into South Asia during the early part of the 2nd millennium BCE, bringing with them the Indo-Aryan languages. Migration by an Indo-European people was first hypothesized in the late 18th century, following the discovery of the Indo-European language family, when similarities between Western and Indian languages had been noted. Given these similarities, a single source or origin was proposed, which was diffused by migrations from some original homeland. This linguistic argument is supported by archaeological and anthropological research. Genetic research reveals that those migrations form part of a complex genetical puzzle on the origin and spread of the various components of the Indian population. Literary research reveals similarities between various, geographically distinct, Indo-Aryan historical cultures. The Indo-Aryan migrations started in approximately 1800 BCE, after the invention of the war chariot, and also brought Indo-Aryan languages into the Levant and possibly Inner Asia. -
The Concept of Ministry in the Arabic Political Tradition Its Origin, Development, and Linguistic Reflection
The Concept of Ministry in the Arabic Political Tradition Its origin, development, and linguistic reflection IVAN V. SIVKOV Abstract The paper presents the results of an analysis of the term “ministry” (wizāra) as one of the pivotal concepts in the Arabic/Islamic political tradition. The ministry as key political/administrative institution in the Arabic/Islamic traditional state machinery is researched from a historical/institutional perspective. The concept of ministry is treated from the point of its origin and historical development, as well as its changeable role and meaning in the variable Arabic political system. The paper is primarily dedicated to the investigation of the realization of the concept of ministry and its different types and branches in the Arabic language through the etymological and semantic examination of the terms used to denote this institution during the long period of administrative development of the Arabic world from its establishment as such and during the inception of the ʿAbbāsid caliphate to its usage in administrative apparatus of modern Arab states. The paper is based on Arabic narrative sources such as historical chronicles, collections of the official documents of modern Arabic states, and the lists of its chief magistrates (with special reference to government composition and structure). Keywords: term, terminology, concept, semantic, etymology, value, derivation Introduction The term wazīr is traditionally used to denote the position of vizier who was the state secretary, the aide, helper and councilor of the caliph/sultan of the highest rank in the administrative apparatus of ʿAbbāsid Caliphate and its successor states (e.g., Būyids, Fāṭimids, Ayyūbids and Salǧūqs). -
1 United States District Court District of Connecticut Dr. Al Malik
UNITED STATES DISTRICT COURT DISTRICT OF CONNECTICUT DR. AL MALIK OFFICE FOR FINANCIAL AND ECONOMIC CONSULTANCY, Plaintiff, No. 3:19-cv-1417 (JAM) v. HORSENECK CAPITAL ADVISORS, LLC, Defendant. ORDER GRANTING IN PART AND DENYING IN PART MOTION TO DISMISS This case is about a consulting fee dispute. The plaintiff claims that the defendant agreed to pay the plaintiff about $1.5 million for consulting services in 2018 but that the defendant has only paid about half that amount. The plaintiff has filed this lawsuit alleging numerous claims, and the defendant in turn has moved to dismiss the claims for statutory theft and breach of the implied covenant of good faith and fair dealing. I will grant in part and deny in part the motion to dismiss. BACKGROUND The following facts are drawn from the complaint and are assumed to be true solely for the purpose of this motion to dismiss. Doc. #1. Plaintiff Dr. Al Malik Office for Financial and Economic Consultancy (“Malik”) is an eponymous sole proprietorship owned and operated by Dr. Ahmed Al Malik, a citizen and resident of Saudi Arabia. Malik provides consulting services for entities seeking funds from investors in Saudi Arabia. Doc. #1 at 1 (¶ 1). Defendant Horseneck Capital Advisors, LLC (“Horseneck”) is a Connecticut limited liability company that raises investment capital for businesses in the United States. Horseneck is based in Greenwich, Connecticut, and is owned and managed by Christopher Franco. Id. at 1 (¶ 2). Malik has sued Horseneck principally seeking payment for consulting services that Malik rendered in Saudi Arabia for one of Horseneck’s investment fund clients. -
Saudi Dialects: Are They Endangered?
Academic Research Publishing Group English Literature and Language Review ISSN(e): 2412-1703, ISSN(p): 2413-8827 Vol. 2, No. 12, pp: 131-141, 2016 URL: http://arpgweb.com/?ic=journal&journal=9&info=aims Saudi Dialects: Are They Endangered? Salih Alzahrani Taif University, Saudi Arabia Abstract: Krauss, among others, claims that languages will face death in the coming centuries (Krauss, 1992). Austin (2010a) lists 7,000 languages as existing and spoken in the world today. Krauss estimates that this figure could come down to 600. That is, most of the world's languages are endangered. An endangered language, therefore, is one that loses its speakers within a few generations. According to Dorian (1981), there is what is called a "tip" in language endangerment: a language's decline can start slowly but then suddenly accelerate towards extinction. Thus, languages must be protected at a much earlier stage. Arabic dialects such as Zahrani Spoken Arabic (ZSA) and Faifi Spoken Arabic (henceforth, FSA), which are spoken in the southern region of Saudi Arabia, have not yet been studied. Few people speak these dialects, among many other dialects in the same region. However, the problem is that most of these dialects' native speakers are moving to other regions in Saudi Arabia where they use other dialects. Therefore, are these dialects endangered? What other factors may cause their endangerment? Have they been documented before? What shall we do? This paper discusses three main points regarding this issue: language and endangerment, language documentation and description, and the Arabic language and its family, giving a brief history of Saudi dialects and comparing their situation with that of the existing dialects as a whole. -
The Status of the Least Documented Language Families in the World
Vol. 4 (2010), pp. 177-212 http://nflrc.hawaii.edu/ldc/ http://hdl.handle.net/10125/4478 The status of the least documented language families in the world Harald Hammarström Radboud Universiteit, Nijmegen and Max Planck Institute for Evolutionary Anthropology, Leipzig This paper aims to list all known language families that are not yet extinct and all of whose member languages are very poorly documented, i.e., less than a sketch grammar’s worth of data has been collected. It explains what constitutes a valid family, what amount and kinds of documentary data are sufficient, when a language is considered extinct, and more. It is hoped that the survey will be useful in setting priorities for documentation fieldwork, in particular for those documentation efforts whose underlying goal is to understand linguistic diversity. 1. Introduction. There are several legitimate reasons for pursuing language documentation (cf. Krauss 2007 for a fuller discussion). Perhaps the most important reason is for the benefit of the speaker community itself (see Voort 2007 for some clear examples). Another reason is that it contributes to linguistic theory: if we understand the limits and distribution of diversity of the world’s languages, we can formulate and provide evidence for statements about the nature of language (Brenzinger 2007; Hyman 2003; Evans 2009; Harrison 2007). From the latter perspective, it is especially interesting to document languages that are the most divergent from ones that are well-documented, in other words, those that belong to unrelated families. I have conducted a survey of the documentation of the language families of the world, and in this paper, I will list the least-documented ones. -
D:\IGNOU\Tilak\BHIC 104 English\Aaaaa.Xps
Theme IV Societies in Central Islamic Lands Time Line Pre-Islamic Arab World Arabian Peninsula: Sarakenoi/Saraceni Arab Tribes: Quraysh, Aws, Khazraj Pre-Islamic Cities Mecca, Yathrib/Medina, Taif Rise of Islam Prophet’s march from Mecca to Medina (Hijra): 622 Caliph Abu Bakr: 632-634 Caliph Umar: 634-644 Caliph Usman: 644-656 Caliph Ali: 656-661 The Umayyad Caliphate: 661-684 Late Umayyad Caliphate: 684-750 The Abbasid Caliphate: 750-1258 Photograph: Manuscript folio with depiction by Yahya ibn Vaseti found in the Maqama of Hariri located at the Bibliotheque Nationale de France. Image depicts a library with pupils in it, 1237 Courtesy: Zereshk, September 2007 Source: https://upload.wikimedia.org/wikipedia/commons/2/2c/Maqamat_hariri.jpg UNIT 12 PRE-ISLAMIC ARAB WORLD AND ITS CULTURE* Structure 12.0 Objectives 12.1 Introduction 12.2 Tribal Confederations in Arabia 12.2.1 The Dominant Tribes of The Arabian Peninsula 12.2.2 Religious Diversity in The Arabian Peninsula 12.3 Tribal and Religious Practices 12.3.1 Religious and Ritual Practices of The Meccans 12.3.2 Religious and Ritual Practices at Medina 12.4 The Arab Trading Network before the 6th Century 12.5 Political Structure in Pre-Islamic Arabia 12.6 Social Structures in Pre-Islamic Arabia 12.6.1 Tribal Structure and Leadership 12.6.2 Inequality and Slavery 12.6.3 The Elite Camel Nomads 12.6.4 Intra-Tribal Warfare 12.7 Economic Conditions 12.7.1 Camel Nomadism 12.7.2 Agriculture in Arabia 12.7.3 Industry and Mining in Arabia 12.8 Literature of the Pre-Islamic Period 12.9 Summary 12.10 Keywords 12.11 Answers to Check Your Progress Exercises 12.12 Suggested Readings 12.13 Instructional Video Recommendations 12.0 OBJECTIVES The study of pre-Islamic Arabia is important for understanding the history of the region in which Islam developed. -
Classical and Modern Standard Arabic Marijn Van Putten University of Leiden
Chapter 3 Classical and Modern Standard Arabic Marijn van Putten University of Leiden The highly archaic Classical Arabic language and its modern iteration Modern Standard Arabic must to a large extent be seen as highly artificial archaizing registers that are the High variety of a diglossic situation. The contact phenomena found in Classical Arabic and Modern Standard Arabic are therefore often the result of imposition. Cases of borrowing are significantly rarer, and mainly found in the lexical sphere of the language. 1 Current state and historical development Classical Arabic (CA) is the highly archaic variety of Arabic that, after its codification by the Arab Grammarians around the beginning of the ninth century, becomes the most dominant written register of Arabic. While forms of Middle Arabic, a style somewhat intermediate between CA and spoken dialects, gain some traction in the Middle Ages, CA remains the most important written register for official, religious and scientific purposes. From the moment of CA’s rise to dominance as a written language, the whole of the Arabic-speaking world can be thought of as having transitioned into a state of diglossia (Ferguson 1959; 1996), where CA takes up the High register and the spoken dialects the Low register. Representation in writing of these spoken dialects is (almost) completely absent in the written record for much of the Middle Ages. Eventually, CA came to be largely replaced for administrative purposes by Ottoman Turkish, and at the beginning of the nineteenth century, it was functionally limited to religious domains (Glaß 2011: 836). -
Ba Islamic History
Maharaja’s College, Ernakulam (A Government Autonomous College) Affiliated to Mahatma Gandhi University, Kottayam Under Graduate Programme in Islamic History 2020 Admission Onwards Board of Studies in Islamic History (Sl. No., Name of Member, Designation): 1. Sri. I K Jayadev, Associate Professor: Chairman, BoS Islamic History; 2. Dr. A B Aliyar: External Member; 3. Sri. Anil Kumar: External Member; 4. Dr. Muhammad Riyaz V B: External Member [Industry]; 5. Sri. K U Bava: External Member [Alumni]; 6. Sri. Muhammad Ali Jinnah Sahib I: Internal Member; 7. Dr. Shajila Beevi S: Internal Member; 8. Dr. Salooja M S: Internal Member; 9. Sri. Ajmal P A: Internal Member; 10. Smt. Subida M D: Internal Member; 11. Smt. Sheeja O: Internal Member. MAHARAJA'S COLLEGE, ERNAKULAM (A GOVERNMENT AUTONOMOUS COLLEGE) REGULATIONS FOR UNDER GRADUATE PROGRAMMES UNDER CHOICE BASED CREDIT SYSTEM 2020 1. TITLE 1.1. These regulations shall be called “MAHARAJA'S COLLEGE (AUTONOMOUS) REGULATIONS FOR UNDER GRADUATE PROGRAMMES UNDER CHOICE BASED CREDIT SYSTEM 2020” 2. SCOPE 2.1 Applicable to all regular Under Graduate Programmes conducted by the Maharaja's College with effect from 2020 admissions 2.2 Medium of instruction is English except in the case of language courses other than English unless otherwise stated therein. 2.3 The provisions herein supersede all the existing regulations for the undergraduate programmes to the extent herein prescribed. 3. DEFINITIONS 3.1. ‘Academic Week’ is a unit of five working days in which the distribution of work is organized from day one to day five, with five contact hours of one hour duration on each day. -
Arabic and Contact-Induced Change Christopher Lucas, Stefano Manfredi
Arabic and Contact-Induced Change Christopher Lucas, Stefano Manfredi To cite this version: Christopher Lucas, Stefano Manfredi. Arabic and Contact-Induced Change. 2020. halshs-03094950 HAL Id: halshs-03094950 https://halshs.archives-ouvertes.fr/halshs-03094950 Submitted on 15 Jan 2021 HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. Arabic and contact-induced change Edited by Christopher Lucas Stefano Manfredi Contact and Multilingualism 1 language science press Contact and Multilingualism Editors: Isabelle Léglise (CNRS SeDyL), Stefano Manfredi (CNRS SeDyL) In this series: 1. Lucas, Christopher & Stefano Manfredi (eds.). Arabic and contact-induced change. Lucas, Christopher & Stefano Manfredi (eds.). 2020. Arabic and contact-induced change (Contact and Multilingualism 1). Berlin: Language Science Press. This title can be downloaded at: http://langsci-press.org/catalog/book/235 © 2020, the authors Published under the Creative Commons Attribution -
Persian, Farsi, Dari, Tajiki: Language Names and Language Policies
University of Pennsylvania ScholarlyCommons Department of Anthropology Papers Department of Anthropology 2012 Persian, Farsi, Dari, Tajiki: Language Names and Language Policies Brian Spooner University of Pennsylvania Follow this and additional works at: https://repository.upenn.edu/anthro_papers Part of the Anthropological Linguistics and Sociolinguistics Commons, and the Anthropology Commons Recommended Citation Spooner, B. (2012). Persian, Farsi, Dari, Tajiki: Language Names and Language Policies. In H. Schiffman (Ed.), Language Policy and Language Conflict in Afghanistan and Its Neighbors: The Changing Politics of Language Choice (pp. 89-117). Leiden, Boston: Brill. This paper is posted at ScholarlyCommons. https://repository.upenn.edu/anthro_papers/91 Abstract Persian is an important language today in a number of countries of west, south and central Asia. But its status in each is different. In Iran its unique status as the only official or national language continues to be jealously guarded, even though half—probably more—of the population use a different language (mainly Azari/Azeri Turkish) at home, and on the streets, though not in formal public situations, and not in writing. Attempts to broach this exclusive status of Persian in Iran have increased in recent decades, but are still relatively minor. Persian (called tajiki) is also the official language of Tajikistan, but here it shares that status informally with Russian, while in the west of the country Uzbek is also widely used and in the more isolated eastern part of the country other local Iranian languages are now dominant.