Typesetting the Qur'an and Its Specific Challenges to the TEX Family∗

Total Page:16

File Type:pdf, Size:1020Kb

Typesetting the Qur'an and Its Specific Challenges to the TEX Family∗ ∗ Typesetting the Qur'an and its specific challenges to the TEX family Hossam A. H. Fahmy Electronics and Communications Department, Faculty of Engineering, Cairo University, Egypt [email protected] 1 Peculiarity of Arabic typography isolated. The simplest example of this rule is the ¡ ¥ ¤ ¥ ¥ ¥ ¨ equivalent of `b' in Arabic: § . A reader with The arabic alphabet has been adopted for use by a sensitive eye would notice that the four shapes of many languages in Africa and Asia including: Ara- the same letter differ in their width, height above bic, Dari, Farsi, Jawi, Kashmiri, Pashto, Punjabi, the line, and depth below it. The same structural Sindhi, Urdu, and Uyghur. The arabic script is ¡ ¤ ¨ shape is used for the equivalent of `t' ( § ), also used for a number of other languages either to ¡ ¤ ¨ © © § © and `th' ( © ). The equivalent of `n' and present how the language used to be written histor- `y' share the same shapes as `b' in the initial and ically (as for Turkish) or how some write it in an ¤ £ ¤ £ ¨ ¨ ¦ medial forms ( ¦ and ) but not in the final or unofficial manner (as for Hausa in western Africa). ¦ the isolated forms ( ¦ and ). In tradi- Similar to the latin alphabet, with its adoption tional Arabic writing styles (but with the exception to several languages, the arabic alphabet acquired of thuluth, riq¯a`, and tawq¯ı‘ styles [13]), the letters new symbols to represent the sounds that do not and their siblings with dots or marks do exist in Arabic. In contrast to the latin alphabet not connect to the following letter but only to the that has dots only on the `i' and `j', the arabic al- preceding one. phabet uses dots extensively both above and below The cursive nature of arabic script adds another the letter shapes to distinguish the different charac- characteristic; many letters combine together to pro- ters. This explicit distinction between the different ¥ ¥ duce new shapes as in becoming . In the letters in written text using dots was in itself an ad- latin script this phenomena occurs infrequently and dition to the original script which had no dots. In when it happens, a ligature is used to improve the the original arabic script, the distinction between appearance of the problematic letter combinations ¢ £ ¤ ¥ ¡ ¢ ¦ ¤ ¥ ¡ (house) and (girl) when represented as such as ‘ff' and ‘ffl’. In arabic typography, on the ¢ ¤ ¡ was understood from the context. In general, other hand, the presence of combined letters is abun- the developed symbols for the other languages follow dant but optional in many cases. Haralambous [2] the same idea as Arabic and use more dots (up to gives a long list (yet not exhaustive) of possible `lig- four) and special marks on the original shapes of the atures' in arabic. While speaking about the history arabic letters. of arabic typography, Milo [9] explains that \each Arabic being a semitic language, only the con- letter can have a different appearance in any com- sonants are usually written in a word. The equiva- bination, something that can only be crudely imi- lent of short vowel sounds are written as additional tated with ligatures". According to Milo [9], most marks on top of the letters. Obviously, the differ- modern books present the connected letter groups ent languages have different vowels and need dif- \as `ligatures' and `artistic expressions' without so ferent symbols to mark them. In addition to that, much as a hint at traditional morphographic rules". since the geographical area covered by the arabic Mackay [8] discusses the range of context evaluation script historically is quite vast, different regions of in Arabic and concludes that clusters of four, five, the world developed different symbols. The result six, and sometimes more letters may combine into that we have today is a plethora of additional marks a unique shape. Mackay then proposes the use of developed historically. virtual fonts as an adequate solution. A beginner learns that Arabic is written from Another feature in the Arabic script is its re- right to left and must practice writing each letter liance on subtle changes to the letter shape to aid and its connection rules to other letters. Because the reader in identifying the beginning and end of of the cursive nature, any letter may connect to the each letter within a combination. The letter has previous and following letters. Hence, a beginner three \teeth" (vertical pen strokes) similar to the learns the general four basic forms of a letter: at the start of a word, at the middle, at the end, and ∗ A graduation project under the author's supervision by: Ibrahim R. Mohamed, Ahmed Z. Ameen, Ahmed M. Amin, Akram A. Mojahed, Kareem O. Sharawi, and Hisham Shihab TUGboat, Volume 0 (2060), No. 0 | Proceedings of the 2060 Annual Meeting 1001 Hossam A. H. Fahmy ¤ ¥ ¤ ¤ ¥ teeth in and . When is connected to , the tail 5. There are additional marks that are put on top of the may be elongated to alert the reader to the or below the character. correct grouping of the teeth. Furthermore, the two 6. The horizontal and vertical location of the dots ¢ ¥ ¨ ¡ words and present different heights for the and marks on the characters is not at the same ¥ ¤ teeth of the ¤ and to help the reader as well. This position always but depends on the character difference in width and height is a type of encoding and its context. to prevent a misreading of the word and to aid the 7. There are too many letters that combine (\lig- trained eye in quickly catching the letter combina- atures"). tion. That encoding helps in other cases as well. If 8. Ligatures and variable width forms of the let- due to any reason the dots fade away, a reader faced ters are used to justify the lines. ¢ ¡ with can guess its correct origin. This encoding 9. Several typefaces are needed for special materi- to emphasize the different letters by raising some als in a work of good quality. ¢ ¢ £ ¥ ¡ ¢ ¥ ¤ ¡ ¢ ¥ ¦ ¨ ¨ teeth is essential in words such as and ¡ . With all of these issues, to find a suitable posi- However, the raising of the teeth is only possible by tion for the dots and marks on the letter combina- investigating the group of letters. So the height of tions is sometimes a real challenge even for a human ¡ ¢ £ ¢ ¥ ¤ ¡ ¢ ¥ ¤ ¦ ¨ ¨ ¨ ¦ the tooth for ¦ in and is not a feature of let alone a machine. the individual letter but of the whole combination. The same goes for the height of the dot on top of 2 Automated typesetting of Arabic £ ¥ ¢ ¦ ¦ ¨ ¦ ¦ that same letter as in or . Arabic script does not enjoy the same luxury that In traditional (manual) writing, the \skeleton" latin script has when it comes to automated type- of the letter combination is written first then the setting on computers. Milo correctly asserts [9] that dots and the other marks are provided. So, a writer the use of individual letters as the building block probably progresses from the skeleton to the dotted is not suitable for Arabic. Both Mackay [8] and £ § £ £ ¦¥ ¦ ¦ ¦ ¦ ¤ to the vocalized form as ! ! . Milo [10] argue that a layered approach is a bet- In the arabic script, a good calligrapher justifies ter solution. In such a layered approach, some basic the lines not just by stretching the spaces between elements are provided in the font to represent the the words but mainly by using optional ligatures or skeleton of some letter combinations, an individual © wider forms of some letters as in changing ¨ to letter's skeleton, or even a part of a letter. These ¡ ¡ ¥ ¨ ¥ ¨ ¨ and writing © instead of . The use of elements are combined first to give the correct skele- optional ligatures and wider forms is the preferred ton of the word with all the needed shaping for the method in high quality works. Another method is to teeth or other style requirements. On top of that add an elongation to the tail of some letters by us- skeleton, a second layer for the dots is added. Then, ¡ ¥ ¥ ¨ ing the tat.w¯ıl or kash¯ıdah symbol ` ' such as the vowel marks and any other marks come in sub- ¡ ¢ ¥ ¥ instead of . This second method has been sequent layers. Milo [11] developed a system with a widely abused in newspapers and low quality mate- layered approach for his company, DecoType. It is a rials using mechanical typewriters. proprietary system used by a number of commercial The arabic script has a large number of writing software tools. Due to its proprietary nature, the styles that were developed traditionally to accom- full details and the extent of the capabilities of this modate the different languages and different pur- system are not widely known. poses. Latin scripts use bold, italic, or larger fonts Our goal is to provide a freely available system for section headings and for emphasis. Traditional capable of typesetting the Qur'an, other traditional arabic writings vary the typeface instead. The print- texts, and any publications in the languages using ings of the Qur'an as well as of most books almost the arabic script. The Qur'an is one of the most always use the naskh typeface for the main body. demanding arabic texts from a typographical point The headings, the introductory materials, and the of view. However, there is a long historical record of back materials frequently use other typefaces such excellent quality materials (manuscripts and recent as the thuluth, ta`l¯ıq, and ruq`ah. printings) to guide the work on a system to typeset In summary, the points just mentioned are: it. Such a system, once complete, can easily typeset 1. Arabic is written from right to left.
Recommended publications
  • Arabic Alphabet - Wikipedia, the Free Encyclopedia Arabic Alphabet from Wikipedia, the Free Encyclopedia
    2/14/13 Arabic alphabet - Wikipedia, the free encyclopedia Arabic alphabet From Wikipedia, the free encyclopedia َأﺑْ َﺠ ِﺪﯾﱠﺔ َﻋ َﺮﺑِﯿﱠﺔ :The Arabic alphabet (Arabic ’abjadiyyah ‘arabiyyah) or Arabic abjad is Arabic abjad the Arabic script as it is codified for writing the Arabic language. It is written from right to left, in a cursive style, and includes 28 letters. Because letters usually[1] stand for consonants, it is classified as an abjad. Type Abjad Languages Arabic Time 400 to the present period Parent Proto-Sinaitic systems Phoenician Aramaic Syriac Nabataean Arabic abjad Child N'Ko alphabet systems ISO 15924 Arab, 160 Direction Right-to-left Unicode Arabic alias Unicode U+0600 to U+06FF range (http://www.unicode.org/charts/PDF/U0600.pdf) U+0750 to U+077F (http://www.unicode.org/charts/PDF/U0750.pdf) U+08A0 to U+08FF (http://www.unicode.org/charts/PDF/U08A0.pdf) U+FB50 to U+FDFF (http://www.unicode.org/charts/PDF/UFB50.pdf) U+FE70 to U+FEFF (http://www.unicode.org/charts/PDF/UFE70.pdf) U+1EE00 to U+1EEFF (http://www.unicode.org/charts/PDF/U1EE00.pdf) Note: This page may contain IPA phonetic symbols. Arabic alphabet ا ب ت ث ج ح خ د ذ ر ز س ش ص ض ط ظ ع en.wikipedia.org/wiki/Arabic_alphabet 1/20 2/14/13 Arabic alphabet - Wikipedia, the free encyclopedia غ ف ق ك ل م ن ه و ي History · Transliteration ء Diacritics · Hamza Numerals · Numeration V · T · E (//en.wikipedia.org/w/index.php?title=Template:Arabic_alphabet&action=edit) Contents 1 Consonants 1.1 Alphabetical order 1.2 Letter forms 1.2.1 Table of basic letters 1.2.2 Further notes
    [Show full text]
  • DRAFT Arabtex a System for Typesetting Arabic User Manual Version 4.00
    DRAFT ArabTEX a System for Typesetting Arabic User Manual Version 4.00 12 Klaus Lagally May 25, 1999 1Report Nr. 1998/09, Universit¨at Stuttgart, Fakult¨at Informatik, Breitwiesenstraße 20–22, 70565 Stuttgart, Germany 2This Report supersedes Reports Nr. 1992/06 and 1993/11 Overview ArabTEX is a package extending the capabilities of TEX/LATEX to generate the Perso-Arabic writing from an ASCII transliteration for texts in several languages using the Arabic script. It consists of a TEX macro package and an Arabic font in several sizes, presently only available in the Naskhi style. ArabTEX will run with Plain TEXandalsowithLATEX2e. It is compatible with Babel, CJK, the EDMAC package, and PicTEX (with some restrictions); other additions to TEX have not been tried. ArabTEX is primarily intended for generating the Arabic writing, but the stan- dard scientific transliteration can also be easily produced. For languages other than Arabic that are customarily written in extensions of the Perso-Arabic script some limited support is available. ArabTEX defines its own input notation which is both machine, and human, readable, and suited for electronic transmission and E-Mail communication. However, texts in many of the Arabic standard encodings can also be processed. Starting with Version 3.02, ArabTEX also provides support for fully vowelized Hebrew, both in its private ASCII input notation and in several other popular encodings. ArabTEX is copyrighted, but free use for scientific, experimental and other strictly private, noncommercial purposes is granted. Offprints of scientific publi- cations using ArabTEX are welcome. Using ArabTEX otherwise requires a license agreement. There is no warranty of any kind, either expressed or implied.
    [Show full text]
  • Classifying Arabic Fonts Based on Design Characteristics: PANOSE-A
    Classifying Arabic Fonts Based on Design Characteristics: PANOSE-A Jehan Janbi A Thesis In the Department of Computer Science & Software Engineering Presented in Partial Fulfillment of the Requirements For the Degree of Doctor of Philosophy (Computer Science) at Concordia University Montreal, Quebec, Canada July 2016 © Jehan Janbi, 2016 Abstract In desktop publishing, fonts are essential components in each document design. With the development of font design software and tools, there are thousands of digital fonts. Increasing the number of available fonts makes selecting an appropriate font, which best serves the objective of a design, not an intuitive issue. Designers can search for a font like any other file types by using general information such as name and file format. But for document design purposes, the design features or visual characteristics of fonts are more meaningful for designers than font file information. Therefore, representing fonts’ design features by searchable and comparable data would facilitate searching and selecting a desirable font. One solution is to represent a font’s design features by a code composed of several digits. This solution has been implemented as a computerized system called PANOSE-1 for Latin script fonts. PANOSE-1 is a system for classifying and matching typefaces based on design features. It is composed of 10 digits, where each digit represents a specific design feature. It is used within several font management tools as an option for ordering and searching fonts based on their design features. It is also used in font replacement processes when an application or an operating system detects a missing font in an immigrant document or website.
    [Show full text]
  • The Secret of Letters: Chronograms in Urdu Literary Culture1
    Edebiyˆat, 2003, Vol. 13, No. 2, pp. 147–158 The Secret of Letters: Chronograms in Urdu Literary Culture1 Mehr Afshan Farooqi University of Virginia Letters of the alphabet are more than symbols on a page. They provide an opening into new creative possibilities, new levels of understanding, and new worlds of experience. In mature literary traditions, the “literal meaning” of literal meaning can encompass a variety of arcane uses of letters, both in their mode as a graphemic entity and as a phonemic activity. Letters carry hidden meanings in literary languages at once assigned and intrinsic: the numeric and prophetic, the cryptic and esoteric, and the historic and commemoratory. In most literary traditions there appears to be at least a threefold value system assigned to letters: letters can be seen as phonetic signs, they have a semantic value, and they also have a numerical value. Each of the 28 letters of the Arabic alphabet can be used as a numeral. When used numerically, the letters of the alphabet have a special order, which is called the abjad or abujad. Abjad is an acronym referring to alif, be, j¯ım, d¯al, the first four letters in the numerical order which, in the system most widely used, runs from alif to ghain. The abjad order organizes the 28 characters of the Arabic alphabet into eight groups in a linear series: abjad, havvaz, hutt¯ı, kalaman, sa`fas, qarashat, sakhkha˙˙ z, zazzagh.2 In nearly every area where˙ ¨the¨ Arabic script ˙ was adopted, the abjad¨ ˙ ˙system gained popularity. Within the vast area in which the Arabic script was used, two abjad systems developed.
    [Show full text]
  • The Challenges and Pitfalls of Arabic Romanization and Arabization
    The Challenges and Pitfalls of Arabic Romanization and Arabization Jack Halpern (春遍雀來) The CJK Dictionary Institute, Inc. (日中韓辭典研究所) 34-14, 2-chome, Tohoku, Niiza-shi, Saitama 352-0001, Japan [email protected] their native script directly into Arabic, something Abstract probably never attempted. These systems are part of our ongoing efforts to develop Arabic re- The high level of ambiguity of the Ara- sources for automatic transcription, machine bic script poses special challenges to translation and named entity extraction. developers of NLP tools in areas such as morphological analysis, named entity The following typographic conventions are used extraction and machine translation. in this paper: These difficulties are exacerbated by the lack of comprehensive lexical resources, 1. Phonemic transcriptions are indicated by . (/qaabuus/ < ﻗــــﺎﺑﻮس) such as proper noun databases, and the slashes multiplicity of ambiguous transcription 2. Phonetic transcriptions are indicated by . ([qɑːbuːs] < ﻗــــﺎﺑﻮس ) schemes. This paper focuses on some of square brackets the linguistic issues encountered in two subdisciplines that play an increasingly 3. Graphemic transliterations are indicated by . (\qAbws\ < ﻗــــﺎﺑﻮس ) important role in Arabic information back slashes processing: the romanization of Arabic 4. Popular transcriptions are indicated by italics .(Qaboos < ﻗــــﺎﺑﻮس ) -names and the arabization of non Arabic names. The basic premise is that linguistic knowledge in the form of lin- 2 Motivation and Previous Work guistic rules is essential for achieving Arabic transcription technology is playing an high accuracy. increasingly important role in a variety of practical applications such as named entity 1 Introduction recognition, machine translation, cross-language information retrieval and various security The process of automatically transcribing Arabic applications such as anti-money laundering and to a Roman script representation, called romani- terrorist watch lists.
    [Show full text]
  • Arabic Alphabet Pdf
    Arabic alphabet pdf Continue It is used by many to start any language by teaching it parts of speech; however, it is logically better to start our trip by teaching arabic alphabet (Arabic letters), as this is a reasonable starting point. Consider the lack of alphabets, then, how can we form words and/or sentences?! Pronunciation Of Transliteered Isolated Isolated َ د d'l دَال kh ﺧـ ـﺨـ ـﺦ As Ch in the name of Bach خ khā̛ َﺧﺎء h ﺣـ ـﺤـ ـﺢ hā̛ in pronunciation َﺣﺎء j ﺟـ ـﺠـ ـﺞ Sometimes as G in a Girl or, as J's Jar ج Jim ﺛـ ـﺜـ ـﺚ ِﺟﻴﻢ As Th in the theory of ث thā̛ ﺛَﺎء t ت ﺗـ ـﺘـ ـﺖ ـﺔ tā̛ ﺗَﺎء Like T in Tree ت tā̛ ﺗَﺎء b ﺑـ ـﺒـ ـﺐ bā̛ - As B in Baby ـﺎ ـﺎ ﺑَﺎء leff as in Apple̛ أﻟِﻒ Pronunciation Original Medial Final Transcription ﺿـ ـﻀـ ـﺾ as D in the dead still heavy in pronunciation ض d'd َﺿﺎد with ﺻـ ـﺼـ ـﺺ As S in the garden is still heavy in pronunciation ص garden َﺻﺎد sh ﺷـ ـﺸـ ـﺶ as w in she ِﺷﻴﻦ ش shins ﺳـ ـﺴـ ـﺲ sin - Like S in see ِﺳﻴﻦ z ز ـﺰ ـﺰ As in the zoo ز z'y َزاي r ـﺮ ـﺮ rā̛ - Like R in Rama and َراء z ذ ـﺬ ـﺬ Like The Th's ذ z'l ذَال d د ـﺪ ـﺪ as D's dad َ َ How F ف fā̛ ﻓَﺎء gh ﻏـ ـﻐـ ـﻎ As Gh in Gandhi غ ghain ﻋـ ـﻌـ ـﻊ ̛ع ﻏَﻴﻦ /aliﻋﻠﻲ /ع has no real equivalent sometimes they replace it sound with sound like, for example, Ali's name for ع ainﻋَﻴ ٍﻦ ع ẓ ﻇـ ـﻈـ ـﻆ As in zorro still heavy in pronunciation ظ ẓā̛ ﻇﺎء tons ﻃـ ـﻄـ ـﻂ As T in the table is still heavy in pronunciation ط tā̛ ﻃﺎء d <ﺻـ ـﺼـ ـﺺ <7 w'w'w' , Like the W in response to the وَاو h ﻫـ ـﻬـ ـﻪ Like the H in He ه ﻫـ hā̛ ﻫَﺎء n ﻧـ ـﻨـ ـﻦ nun - Like the
    [Show full text]
  • Arabic Alphabet 1 Arabic Alphabet
    Arabic alphabet 1 Arabic alphabet Arabic abjad Type Abjad Languages Arabic Time period 400 to the present Parent systems Proto-Sinaitic • Phoenician • Aramaic • Syriac • Nabataean • Arabic abjad Child systems N'Ko alphabet ISO 15924 Arab, 160 Direction Right-to-left Unicode alias Arabic Unicode range [1] U+0600 to U+06FF [2] U+0750 to U+077F [3] U+08A0 to U+08FF [4] U+FB50 to U+FDFF [5] U+FE70 to U+FEFF [6] U+1EE00 to U+1EEFF the Arabic alphabet of the Arabic script ﻍ ﻉ ﻅ ﻁ ﺽ ﺹ ﺵ ﺱ ﺯ ﺭ ﺫ ﺩ ﺥ ﺡ ﺝ ﺙ ﺕ ﺏ ﺍ ﻱ ﻭ ﻩ ﻥ ﻡ ﻝ ﻙ ﻕ ﻑ • history • diacritics • hamza • numerals • numeration abjadiyyah ‘arabiyyah) or Arabic abjad is the Arabic script as it is’ ﺃَﺑْﺠَﺪِﻳَّﺔ ﻋَﺮَﺑِﻴَّﺔ :The Arabic alphabet (Arabic codified for writing the Arabic language. It is written from right to left, in a cursive style, and includes 28 letters. Because letters usually[7] stand for consonants, it is classified as an abjad. Arabic alphabet 2 Consonants The basic Arabic alphabet contains 28 letters. Adaptations of the Arabic script for other languages added and removed some letters, such as Persian, Ottoman, Sindhi, Urdu, Malay, Pashto, and Arabi Malayalam have additional letters, shown below. There are no distinct upper and lower case letter forms. Many letters look similar but are distinguished from one another by dots (’i‘jām) above or below their central part, called rasm. These dots are an integral part of a letter, since they distinguish between letters that represent different sounds.
    [Show full text]
  • Arabic Diacritics 1 Arabic Diacritics
    Arabic diacritics 1 Arabic diacritics Arabic alphabet ﻱ ﻭ ﻩ ﻥ ﻡ ﻝ ﻙ ﻕ ﻑ ﻍ ﻉ ﻅ ﻁ ﺽ ﺹ ﺵ ﺱ ﺯ ﺭ ﺫ ﺩ ﺥ ﺡ ﺝ ﺙ ﺕ ﺏ ﺍ Arabic script • History • Transliteration • Diacritics • Hamza • Numerals • Numeration • v • t [1] • e 〈ﺗَﺸْﻜِﻴﻞ〉 i‘jām, consonant pointing), and tashkil) 〈ﺇِﻋْﺠَﺎﻡ〉 The Arabic script has numerous diacritics, including i'jam .(〈ﺣَﺮَﻛَﺔ〉 vowel marks; singular: ḥarakah) 〈ﺣَﺮَﻛَﺎﺕ〉 tashkīl, supplementary diacritics). The latter include the ḥarakāt) The Arabic script is an impure abjad, where short consonants and long vowels are represented by letters but short vowels and consonant length are not generally indicated in writing. Tashkīl is optional to represent missing vowels and consonant length. Modern Arabic is nearly always written with consonant pointing, but occasionally unpointed texts are still seen. Early texts such as the Qur'an were initially written without pointing, and pointing was added later to determine the expected readings and interpretations. Tashkil (marks used as phonetic guides) The literal meaning of tashkīl is 'forming'. As the normal Arabic text does not provide enough information about the correct pronunciation, the main purpose of tashkīl (and ḥarakāt) is to provide a phonetic guide or a phonetic aid; i.e. show the correct pronunciation. It serves the same purpose as furigana (also called "ruby") in Japanese or pinyin or zhuyin in Mandarin Chinese for children who are learning to read or foreign learners. The bulk of Arabic script is written without ḥarakāt (or short vowels). However, they are commonly used in some al-Qur’ān). It is not) 〈ﺍﻟْﻘُﺮْﺁﻥ〉 religious texts that demand strict adherence to pronunciation rules such as Qur'an al-ḥadīth; plural: aḥādīth) as well.
    [Show full text]
  • Downloaded License
    Journal of Islamic Manuscripts 11 (2020) 263–291 brill.com/jim A Rediscovered Almoravid Qurʾān in the Bavarian State Library, Munich (Cod. arab. 4) Umberto Bongianino University of Oxford, Oxford, U.K. [email protected] Abstract This article examines and contextualizes a small Quranic manuscript, copied in al- Andalus in 533/1138–1139, whose importance has so far gone unrecognized. Among its many interesting features are: its early date; its lavish illumination; its colophon and the information contained therein; its system of notation and textual division; its use of different calligraphic styles, including Maghribī thuluth; and a series of didactic notes written at the beginning and end of the codex. Presented in the appendix is an updat- ed list of the extant Qurʾāns in Maghribī scripts dated to before 600/1203–1204, aimed at encouraging the digitization, publication, and comparative study of this still large- ly uncharted material. The advancement of scholarship on the arts of the book, the transmission of the Qurʾān, and the consumption of Quranic manuscripts in the Is- lamic West depends upon the analysis of these and many other surviving codices and fragments, related to Cod. arab. 4 of the Bavarian State Library and its context of pro- duction. Keywords Al-Andalus – Almoravids – Maghribī scripts – Qurʾān – illumination – calligraphy 1 Cod. arab. 41 Among the lesser-known treasures of the Bavarian State Library in Munich is a small Qurʾān, kept under the shelf mark Cod. arab. 4, which once belonged to 1 Submitted on January 6, 2020. Accepted for publication on March 2, 2020.
    [Show full text]
  • Urdu Alphabet
    Urdu alphabet The Urdu alphabet is the right-to-left alphabet used for the Urdu language. It is Urdu alphabet a modification of the Persian alphabet known as Perso-Arabic, which is itself a اردو ﺗ ﺠﯽ derivative of the Arabic alphabet. The Urdu alphabet has up to 58 letters.[1] With 39 basic letters and no distinct letter cases, the Urdu alphabet is typically written in the calligraphic Nastaʿlīq script, whereas Arabic is more commonly in the Naskh style. Usually, bare transliterations of Urdu into Roman letters (called Roman Urdu) omit many phonemic elements that have no equivalent in English or other languages commonly written in the Latin script. The National Language Authority of Pakistan has developed a number of systems with specific notations Example of writing in the Urdu to signify non-English sounds, but these can only be properly read by someone alphabet: Urdu already familiar with the loan letters. Type Abjad Languages Urdu, Balti, Burushaski, others Contents Parent Proto-Sinaitic systems History Phoenician Nastaʿlīq Alphabet Aramaic Differences from Persian alphabet Retroflex letters Nabataean Vowels Arabic Vowel chart Alif Perso-Arabic Wāʾo Ye Urdu The 2 he's alphabet Ayn Nun Ghunnah Unicode U+0600 to U+06FF range Hamza U+0750 to U+077F Diacritics U+FB50 to U+FDFF Iẓāfat U+FE70 to U+FEFF Computers and the Urdu alphabet Encoding Urdu in Unicode Software Romanization standards and systems See also References Sources External links History The Urdu language emerged as a distinct register of Hindustani well before the Partition of India. It is distinguished most by its extensive Persian influences (Persian having been the official language of the Mughal government and the most prominent lingua franca of the Indian subcontinent for several centuries before the solidification of British colonial rule during the 19th century).
    [Show full text]
  • ALI-BABAANDTHE4.0 UNICODECHARACTERS New
    ALI-BABA AND THE 4.0 UNICODE CHARACTERS New input and output concepts under Unicode Thomas Milo The Netherlands [email protected] ABSTRACT New input and output concepts under Unicode: Literary, Classical and Qurʾānic Arabic in Unicode involve more characters than the Arabic keyboard can accommodate. To enter sophisticated Arabic, Basis Technology and DecoType jointly developed an IME (input method editor): ALI (Arabic, Latin Input). When used at full power, ALI overstretches conventional font technology. Only DecoType’s Arabic Calligraphic Engine ACE, dubbed BABA for the occasion, handles elaborate ALI output. Therefore, the ALI-BABA pair points the way toward technology for dealing with Arabic text –from dialect transcriptions to the most complicated Qurʾānic quotation, covering all aspects of both abstract text level and rendering it to the highest possible standards of computer generated typography. RÉSUMÉ Des nouveaux concepts d’entrée et de sortie sous Unicode. L’arabe litérraire, classique et coranique requiert plus de caractères que le clavier de saisie arabe ne peut fournir. Pour saisir de l’arabe «sophistiqué», les socieétés Basis Technology et Decotype ont développé en commun un EME (Éditeur de méthode d’entrée) du nom de ALI (Arabic, Latin Input). Utilisé à tout son potentiel, ALI dépasse la technologie conventionnelle de fonte. Seule la technologie d’arabe calligraphique de DecoType (appelée ici BABA pour l'occasion), est en mesure de gérer efficacement la sortie de ALI. C’est pour cette raison que le couple ALI-BABA montre la voie vers une nouvelle technologie de composition arabe – des transcriptions dialectales jusqu’aux citations coraniques les plus complexes, en couvrant tous les aspects du niveau textuel abstrait et en affichant ce texte aux plus hauts standards de typographie informatique.
    [Show full text]
  • A Computational Model of Modern Standard Arabic Verbal Morphology Based on Generation
    A computational model of Modern Standard Arabic verbal morphology based on generation by Alicia González Martínez under the supervision of Dr Antonio Moreno Sandoval Dissertation submitted in partial fulfillment of the requirements for the degree of PhD Madrid, December 2012 Universidad Autónoma de Madrid Facultad de Filosofía y Letras Departamento de Lingüística Laboratorio de Lingüística Informática LLI-UAM This dissertation was completed thanks to an FPU scholarship from the Universidad Autónoma de Madrid, and was carried out in the LLI-UAM Laboratory of Computational Linguistics, Department of Linguistics, under the supervision of Dr Antonio Moreno Sandoval para Tronspmkins, que me mostró cerdos volar para Su, que me regaló un encinar y me enseñó a hozar "Why are you working so hard? Why don't you come and play?" But the stubborn bricklayer pig just said "no". "I shall finish my house first. It must be solid and sturdy. And then I'll come and play!" Three little pigs Abstract The computational handling of non-concatenative morphologies is still a challenge in the field of natural language processing. Amongst the various areas of research, Arabic morphology stands out due to its highly complex structure. We propose a model for Arabic verbal morphology based on a root-and-pattern approach, which satisfies both computational consistency and an elegant formalization. Our model defines an abstract representation of prosodic templates and a set of intertwined morphemes that operate at different phonological levels, as well as a separate module of rewrite rules to deal with morphophonological and orthographic alterations. Our verbal system model asserts that Arabic exhibits two conjugational classes.
    [Show full text]