Improving the Arabic Pronunciation Dictionary for Phone and Word Recognition with Linguistically-Based Pronunciation Rules
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
-
Alif and Hamza
ﺍ (’alif) is one of the simplest letters of the alphabet. Its isolated form is simply a vertical stroke, written from top to bottom. In its final position it is written as the same vertical stroke, but joined at the base to the preceding letter. Because of this connecting line – and this is very important – it is written from bottom to top instead of top to bottom. Practise these to get the feel of the direction of the stroke. The letter 'alif is one of a number of non-connecting letters. This means that it is never connected to the letter that comes after it. Non-connecting letters therefore have no initial or medial forms. They can appear in only two ways: isolated or final, meaning connected to the preceding letter. Reminder about pronunciation: The letter 'alif represents the long vowel aa. Usually this vowel sounds like a lengthened version of the a in pat. In some positions, however (we will explain this later), it sounds more like the a in father. One of the most important functions of 'alif is not as an independent sound but as the carrier, or ‘bearer’, of another letter: hamza (ﺀ). You can look back at what we said about hamza. Later we will discuss hamza in more detail. Here we will go through one of the most common uses of hamza: its combination with 'alif at the beginning of a word. One of the rules of the Arabic language is that no word can begin with a vowel. Many Arabic words may sound to the beginner as though they start with a vowel, but in fact they begin with a glottal stop: that little catch in the voice that is represented by hamza. -
Modern Standard Arabic ﺝ
International Journal of Linguistics, Literature and Culture (Linqua-IJLLC) December 2014 edition Vol.1 No.3 MODERN STANDARD ARABIC ﺝ: /ʒ/ AND /ʤ/ Hisham Monassar, PhD Assistant Professor of Arabic and Linguistics, Department of Arabic and Foreign Languages, Cameron University, Lawton, OK, USA Abstract This paper explores the phonemic inventory of Modern Standard Arabic (MSA) with respect to the phoneme represented orthographically as ﺝ in the Arabic alphabet. This phoneme has two realizations, i.e., variants, /ʤ/ and /ʒ/. There seems to be regional variation across the Arabic-speaking peoples, with a preference for one phoneme or the other. It is observed that /ʤ/ is dominant in Arabia while /ʒ/ is dominant in the Levant region. Each group has one variant to the exclusion of the other. However, the two variants overlap in their geographical distribution, i.e., there are no clear-cut geographical or dialectal boundaries. The phone [ʤ] is an affricate, a combination of two phones: a left-face stop, [d], and a right-face fricative, [ʒ]. To produce this sound, the tip of the tongue starts at the alveolar ridge for the left-face stop [d] and retracts to the palate for the right-face fricative [ʒ]. The phone [ʒ] is a voiced palato-alveolar fricative sound produced in the palatal region bordering the alveolar ridge. This paper investigates the dichotomy, or variation, in light of the grammatical (morphological/phonological and syntactic) processes of MSA; the phonologies of most Arabic dialects for the purpose of synchronic evidence; the history of the phoneme for diachronic evidence and internal sound change; as well as the possibility of external influence. -
Hebrew Names and Name Authority in Library Catalogs by Daniel D
Hebrew Names and Name Authority in Library Catalogs by Daniel D. Stuhlman BHL, BA, MS LS, MHL In support of the Doctor of Hebrew Literature degree Jewish University of America Skokie, IL 2004 Abstract Because of the differences in alphabets, entering Hebrew names and words in English works has always been a challenge. The Hebrew Bible (Tanakh) is the source for many names in American, Jewish and European society. This work examines given names, starting with theophoric names in the Bible, then continues with other names from the Bible and contemporary sources. The list of theophoric names is comprehensive. The other names are chosen from library catalogs and the personal records of the author. Hebrew names present challenges because of the variety of pronunciations. The same name is transliterated differently for a writer in Yiddish and Hebrew, but Yiddish names are not covered in this document. Family names are included only as they relate to the study of given names. One chapter deals with why Jacob and Joseph start with “J.” Transliteration tables from many sources are included for comparison purposes. Because parents may give any name they desire, there can be no absolute rules for using Hebrew names in English (or Latin character) library catalogs. When the cataloger cannot find the Latin letter version of a name that the author prefers, the cataloger uses the rules for systematic Romanization. Through the use of rules and the understanding of the history of orthography, a library researcher can find the materials needed. -
Inflected Article in Proto-Arabic and Some Other West Semitic Languages*
ASIAN AND AFRICAN STUDIES, 9, 2000, I, 24-35 INFLECTED ARTICLE IN PROTO-ARABIC AND SOME OTHER WEST SEMITIC LANGUAGES* Andrzej Zaborski M. Zebrzydowskiego 1, 34-130 Kalwaria Zebrzydowska, Poland The Arabic, Canaanite and Modern South Arabian definite article has a common origin and goes back to an original demonstrative pronoun which was a compound inflected for gender, number and probably also for case. It can be reconstructed as *han(V)- for masc. sing., *hat(V)- for fem. sing. and *hal(V)- for plural. Assimilations of -n- and -t- to the following consonant (including -n-l- > -ll- and -t-l- > -ll-) neutralized the opposition of gender and number and led to a reinterpretation of either hal/’al- or han/’an- > ’am- synchronically as basic variant. In Aramaic the suffixed definite article was due not to simple suffixation of hā but to a resegmentation of the postposed compound demonstrative hā-zē-[n(ā)] and suffixation of enclitic hā > -ā which has been generalized. The problem of the definite article in the West Semitic languages has been discussed by many scholars, so that there is a rather abundant literature on the subject and opinions differ widely. Older studies were briefly reviewed by Barth (1907), while in the most recent contribution D. Testen (1998) discusses most of the newer studies (e.g. Wensinck 1931, Ullendorf 1965, Lambdin 1971, Rundgren 1989) and he develops a hypothesis going back at least to Stade (1879, § 132a; cf. Croatto 1971) saying that the Arabic, Canaanite and Modern South Arabian definite article goes back to an “assertative” particle *l (sic!) which is continued both by la- i.e. -
Towards Arabic to English Machine Translation
The ITB Journal Volume 9 Issue 1 Article 3 2008 Towards Arabic to English Machine Translation Yasser Salem Arnold Hensman Brian Nolan Follow this and additional works at: https://arrow.tudublin.ie/itbj Part of the Computer Engineering Commons Recommended Citation Salem, Yasser; Hensman, Arnold; and Nolan, Brian (2008) "Towards Arabic to English Machine Translation," The ITB Journal: Vol. 9: Iss. 1, Article 3. doi:10.21427/D7J15D Available at: https://arrow.tudublin.ie/itbj/vol9/iss1/3 This Article is brought to you for free and open access by the Ceased publication at ARROW@TU Dublin. It has been accepted for inclusion in The ITB Journal by an authorized administrator of ARROW@TU Dublin. This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 License ITB Journal Towards Arabic to English Machine Translation Yasser Salem, Arnold Hensman and Brian Nolan School of Informatics and Engineering Institute of Technology Blanchardstown, Dublin, Ireland E-mails: {firstname.surname}@itb.ie Abstract This paper explores how the characteristics of the Arabic language will affect the development of a Machine Translation (MT) tool from Arabic to English. Several distinguishing features of Arabic pertinent to MT will be explored in detail with reference to some potential difficulties that they might present. The paper will conclude with a proposed model incorporating the Role and Reference Grammar (RRG) technique to achieve this end. 1 Introduction Arabic is a Semitic language originating in the area presently known as the Arabian Peninsula. -
Quran for Peace?
Quran for Peace Dr. Ayman Mohamed © 2012 Ayman Mohamed Why Quran for peace? This book deals with the subject of learning about the concept of “islam” from the Quran. Surprisingly, this simple subject is quite revolutionary. A survey of the literature would reveal that it is in fact unusual to learn about the concept of “islam” from the Quran. Almost all of the sources rely on later traditions. Based on the preconceptions and distortions of later traditions, many people may think that they already know all there is to know about the Quran and the concept of islam. Therefore, in this introduction some brief examples will be provided to demonstrate that the vast majority of people, including most so-called Islamic scholars, do not grasp the meaning of basic concepts in the Quran as much as they think. The first example deals with a term widely used in the media. We have all heard about the organization named Al-Qaeda, which has become a notorious symbol of the ideology of “Extreme Islamic Jihad”. This ideological reach shows in the fact that the majority of global terrorist threats, even when not claimed by “Al-Qaeda”, are usually claimed by a group that is “Al-Qaeda affiliated” or “Al-Qaeda inspired”. When experts are asked about the translation of some of the most prominently featured Arabic words in the media such as al-qaeda, one typically gets confused answers. A seemingly right but contextually wrong answer one would occasionally get is that the word al-qaeda means “the base”. This wrong answer is often given even by native Arabic speakers. -
Generic Reference in English, Arabic and Malay: a Cross Linguistic Typology and Comparison
English Language Teaching; Vol. 7, No. 11; 2014 ISSN 1916-4742 E-ISSN 1916-4750 Published by Canadian Center of Science and Education Generic Reference in English, Arabic and Malay: A Cross Linguistic Typology and Comparison Eidhah Abdullah AL-Malki1, Norazman Abdul Majid1 & Noor Abidah Mohd Omar1 1 Language Academy, UTM, Skudai, Johor Bahru, Malaysia Correspondence: Eidhah Abdullah AL-Malki, Language Academy, UTM, Skudai, Johor Bahru, 81310, Malaysia. Tel: 60-017-772-4049. Received: September 5, 2014 Accepted: October 6, 2014 Online Published: October 23, 2014 doi:10.5539/elt.v7n11p15 URL: http://dx.doi.org/10.5539/elt.v7n11p15 Abstract According to the Longman Grammar of Spoken and Written English (1999) by Biber et al. (p. 266), generic article uses are more than twice as common in academic English as in conversation or fiction. This is an area that English for Academic Purposes (EAP) textbooks and teachers would need to target more than general English teaching does. This paper is therefore a contribution towards a better understanding of what linguistic facts about generics teachers and textbooks of EAP might need to cover in order to deal with them satisfactorily, particularly for learners with Arabic or Malay as L1. This paper is also significant as it is the first to compare the expression of generic meanings by noun phrases in three typologically quite different languages: the Germanic language English, the Semitic language Arabic and the Austronesian language Malay. The contrast between the three languages is substantial in that they have different settings according to the nominal mapping parameter (NMP), which captures some widespread generalizations about the occurrence of mass and countable nouns and articles in the languages of the world. -
The Relationship Between Arabic Allāh and Syriac Allāhā
The relationship between Arabic Allāh and Syriac Allāhā David Kiltz Berlin-Brandenburgische Akademie der Wissenschaften, Potsdam Abstract Various etymologies have been proposed for Arabic allah but also for Syriac allaha. It has often been proposed that the Arabic word was borrowed from Syriac. This article takes a comprehensive look at the linguistic evidence at hand. Especially, it takes into consideration more recent epigraphical material which sheds light on the development of the Arabic language. Phonetic and morphological analysis of the data confirms the Arabic origin of the word allah, whereas the problems of the Syriac form allaha are described, namely that the Syriac form differs from that of other Aramaic dialects and begs explanation, discussing also the possibility that the Syriac word is a loan from Arabic. The final part considers qur'anic allah in its cultural and literary context and the role of the Syriac word in that context. The article concludes that both a strictly linguistic and a cultural and literary analysis reveal a multilayered interrelation between the two terms in question. The linguistic analysis shows that Arabic allah must be a genuinely Arabic word, whereas in the case of Syriac allaha, the possibility of both a loan and a specific inner-Aramaic development are laid out. Apart from linguistic considerations, the historical and cultural situation in Northern Mesopotamia, i.e. the early Arab presence in that region, is scrutinized. In turn, a possible later effect of the prominent use of Syriac allaha on the use in the Qur'an is considered. -
Romanization of Arabic
Romanization of Arabic Different approaches and methods for the romanization of Arabic exist. They vary in the way that they address the inherent problems of rendering written and spoken Arabic in the Latin script. Examples of such problems are the symbols for Arabic phonemes that do not exist in English or other European languages; the means of representing the Arabic definite article, which is always spelled the same way in written Arabic but has numerous pronunciations in the spoken language depending on context; and the representation of short vowels (usually i u or e o, accounting for variations such as Muslim / Moslem or Mohammed / Muhammad / Mohamed). Method Romanization is often termed "transliteration", but this is not technically correct. Transliteration is the direct representation of foreign letters using Latin symbols, while most systems for romanizing Arabic are actually transcription systems, which represent the sound of the language. As an example, the rendering munāẓarat al-ḥurūf al-ʻarabīyah of the Arabic: مناظرة الحروف العربية is a transcription, indicating the pronunciation; an example transliteration would be mnaẓrḧ alḥrwf alʻrbyḧ. Romanization standards and systems This list is sorted chronologically. Bold face indicates column headlines as they appear in the table below. • IPA: International Phonetic Alphabet (1886) • Deutsche Morgenländische Gesellschaft (1936): Adopted by the International Convention of Orientalist Scholars in Rome. It is the basis for the very influential Hans Wehr dictionary (ISBN 0-87950-003-4). -
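The distinction between transliteration and transcription in the snippet above can be illustrated with a short Python sketch. It is only an assumption-laden illustration: the table LETTER_MAP and the function transliterate are hypothetical names, the table covers only the letters of the quoted example, and it does not implement any of the published romanization standards. A letter-by-letter mapping of this kind reproduces the transliteration quoted above, but it cannot produce the transcription, because the short vowels are not written in the Arabic source string.

```python
import unicodedata

# Hypothetical one-to-one letter table (an assumption for illustration only);
# it covers just the letters that occur in the example phrase.
LETTER_MAP = {
    "ا": "a", "ب": "b", "ح": "ḥ", "ر": "r", "ظ": "ẓ", "ع": "ʻ",
    "ف": "f", "ل": "l", "م": "m", "ن": "n", "و": "w", "ي": "y",
    "ة": "ḧ",
}


def transliterate(text: str) -> str:
    """Map each Arabic letter to one Latin symbol, leaving other characters as-is."""
    # NFKC folds positional presentation forms (e.g. U+FE8D) onto base letters.
    base = unicodedata.normalize("NFKC", text)
    return "".join(LETTER_MAP.get(ch, ch) for ch in base)


if __name__ == "__main__":
    print(transliterate("مناظرة الحروف العربية"))
```

The NFKC step is a design choice for text copied from PDFs, where Arabic letters often arrive as positional presentation forms that would otherwise miss the table.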
Arabic Alphabet
Arabic alphabet Arabic abjad. Type: Abjad. Languages: Arabic. Time period: 400 to the present. Parent systems: Proto-Sinaitic • Phoenician • Aramaic • Syriac • Nabataean • Arabic abjad. Child systems: N'Ko alphabet. ISO 15924: Arab, 160. Direction: Right-to-left. Unicode alias: Arabic. Unicode range: U+0600 to U+06FF, U+0750 to U+077F, U+08A0 to U+08FF, U+FB50 to U+FDFF, U+FE70 to U+FEFF, U+1EE00 to U+1EEFF. The Arabic alphabet (Arabic: أَبْجَدِيَّة عَرَبِيَّة ’abjadiyyah ‘arabiyyah) or Arabic abjad is the Arabic script as it is codified for writing the Arabic language. It is written from right to left, in a cursive style, and includes 28 letters. Because letters usually stand for consonants, it is classified as an abjad. Consonants The basic Arabic alphabet contains 28 letters. Adaptations of the Arabic script for other languages, such as Persian, Ottoman, Sindhi, Urdu, Malay, Pashto, and Arabi Malayalam, added and removed some letters; the additional letters are shown below. There are no distinct upper and lower case letter forms. Many letters look similar but are distinguished from one another by dots (’i‘jām) above or below their central part, called rasm. These dots are an integral part of a letter, since they distinguish between letters that represent different sounds. -
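The Unicode ranges listed in the snippet above can be turned into a simple membership check. The following Python sketch is an illustration only: ARABIC_RANGES and in_arabic_range are hypothetical names, and the ranges are copied verbatim from the snippet rather than from the Unicode block tables, so they may not match block boundaries in every Unicode version.

```python
# Code point ranges exactly as listed in the snippet above (inclusive bounds).
ARABIC_RANGES = [
    (0x0600, 0x06FF),
    (0x0750, 0x077F),
    (0x08A0, 0x08FF),
    (0xFB50, 0xFDFF),
    (0xFE70, 0xFEFF),
    (0x1EE00, 0x1EEFF),
]


def in_arabic_range(ch: str) -> bool:
    """Return True if the single character ch falls in one of the listed ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ARABIC_RANGES)


def arabic_ratio(text: str) -> float:
    """Fraction of non-space characters in text that fall in the listed ranges."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    return sum(in_arabic_range(ch) for ch in chars) / len(chars)
```

A check like arabic_ratio can serve, for example, as a rough filter for deciding whether a scraped line is mostly Arabic script before further processing.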
Literary and Linguistic Matters in the Book of Proverbs*
LITERARY AND LINGUISTIC MATTERS IN THE BOOK OF PROVERBS* Gary A. Rendsburg The present study surveys an array of literary and linguistic issues relevant to the book of Proverbs. The topics discussed and the examples presented are selected somewhat at random, though they all cohere at the nexus of language and literature within the composition. Much of what is put forth here is not original, but rather is based upon the work of earlier scholars. Nonetheless, the present author hopes that the reader will find this exposition beneficial on a number of levels. In addition, because I wish to include as much data as possible within the confines of a scholarly essay, the presentation herein will be very schematic. The author begs the reader’s forbearance for the outline form of this study. The knowledgeable reader will realize that each of the passages presented could easily deserve several paragraphs if not a short essay for further elucidation. Hopefully, the accumulation of data included herein will compensate for my inability to provide such elucidation on this occasion. * It is my pleasant duty to thank the Oxford Centre for Hebrew and Jewish Studies at Yarnton Manor for hosting me during the period of June–December 2012, during which months this study was written. As those who have enjoyed time at the Centre know well, this singular institution provides the perfect atmosphere in which to conduct one’s academic research. An oral version of this material was presented to the Old Testament Seminar at the University of Oxford on 12 November 2012; I am grateful to members of the seminar for both their warm welcome and their important feedback. -
Nouns and Adjectives of Old English and Modern Standard Arabic – a Comparative Study
International Journal of Humanities and Social Science Vol. 1 No. 18 www.ijhssnet.com Nouns and Adjectives of Old English and Modern Standard Arabic – A Comparative Study Dr. Khalil Hassan Nofal Head/Department of English Director/ Language Centre Philadelphia University- Jordan Abstract This paper is intended to discuss nouns and adjectives in two unrelated languages: Old English and Modern Standard Arabic. Although Arabic is a Semitic language, its grammar has a lot of similarities with the grammar of Old English. In the discussion it has been found that nouns and adjectives of both languages have inflectional modifications to indicate gender, case and number. Additionally, there is agreement between nouns, adjectives, verbs and demonstratives in both languages. Furthermore, Modern Standard Arabic is a highly inflected language. It uses a complex system of pronouns and their respective prefixes and suffixes for verbs, nouns, adjectives and possessive conjunctions. In addition, the system known as ?i؟rāb places vowel suffixes on each verb, noun, and adjective, according to its function within a sentence and its relation to surrounding words. Key words: nouns, adjectives, demonstratives, inflections, declensions, case, gender, number. Phonetic symbols of Arabic consonants (transliteration symbol, phonetic description, Arabic example, Arabic letter): ? glottal stop, ?amal (hope), ء; b voiced bilabial stop, balad (country), ب; t voiceless denti-alveolar stop, tammūz (July), ت; θ voiceless dental fricative, θuluθ (one third), ث; J voiced palato-alveolar