Typesetting the Qur'an and Its Specific Challenges to the TEX Family∗
Total Page:16
File Type:pdf, Size:1020Kb
∗ Typesetting the Qur'an and its specific challenges to the TEX family Hossam A. H. Fahmy Electronics and Communications Department, Faculty of Engineering, Cairo University, Egypt [email protected] 1 Peculiarity of Arabic typography isolated. The simplest example of this rule is the ¡ ¥ ¤ ¥ ¥ ¥ ¨ equivalent of `b' in Arabic: § . A reader with The arabic alphabet has been adopted for use by a sensitive eye would notice that the four shapes of many languages in Africa and Asia including: Ara- the same letter differ in their width, height above bic, Dari, Farsi, Jawi, Kashmiri, Pashto, Punjabi, the line, and depth below it. The same structural Sindhi, Urdu, and Uyghur. The arabic script is ¡ ¤ ¨ shape is used for the equivalent of `t' ( § ), also used for a number of other languages either to ¡ ¤ ¨ © © § © and `th' ( © ). The equivalent of `n' and present how the language used to be written histor- `y' share the same shapes as `b' in the initial and ically (as for Turkish) or how some write it in an ¤ £ ¤ £ ¨ ¨ ¦ medial forms ( ¦ and ) but not in the final or unofficial manner (as for Hausa in western Africa). ¦ the isolated forms ( ¦ and ). In tradi- Similar to the latin alphabet, with its adoption tional Arabic writing styles (but with the exception to several languages, the arabic alphabet acquired of thuluth, riq¯a`, and tawq¯ı‘ styles [13]), the letters new symbols to represent the sounds that do not and their siblings with dots or marks do exist in Arabic. In contrast to the latin alphabet not connect to the following letter but only to the that has dots only on the `i' and `j', the arabic al- preceding one. phabet uses dots extensively both above and below The cursive nature of arabic script adds another the letter shapes to distinguish the different charac- characteristic; many letters combine together to pro- ters. This explicit distinction between the different ¥ ¥ duce new shapes as in becoming . In the letters in written text using dots was in itself an ad- latin script this phenomena occurs infrequently and dition to the original script which had no dots. In when it happens, a ligature is used to improve the the original arabic script, the distinction between appearance of the problematic letter combinations ¢ £ ¤ ¥ ¡ ¢ ¦ ¤ ¥ ¡ (house) and (girl) when represented as such as ‘ff' and ‘ffl’. In arabic typography, on the ¢ ¤ ¡ was understood from the context. In general, other hand, the presence of combined letters is abun- the developed symbols for the other languages follow dant but optional in many cases. Haralambous [2] the same idea as Arabic and use more dots (up to gives a long list (yet not exhaustive) of possible `lig- four) and special marks on the original shapes of the atures' in arabic. While speaking about the history arabic letters. of arabic typography, Milo [9] explains that \each Arabic being a semitic language, only the con- letter can have a different appearance in any com- sonants are usually written in a word. The equiva- bination, something that can only be crudely imi- lent of short vowel sounds are written as additional tated with ligatures". According to Milo [9], most marks on top of the letters. Obviously, the differ- modern books present the connected letter groups ent languages have different vowels and need dif- \as `ligatures' and `artistic expressions' without so ferent symbols to mark them. In addition to that, much as a hint at traditional morphographic rules". since the geographical area covered by the arabic Mackay [8] discusses the range of context evaluation script historically is quite vast, different regions of in Arabic and concludes that clusters of four, five, the world developed different symbols. The result six, and sometimes more letters may combine into that we have today is a plethora of additional marks a unique shape. Mackay then proposes the use of developed historically. virtual fonts as an adequate solution. A beginner learns that Arabic is written from Another feature in the Arabic script is its re- right to left and must practice writing each letter liance on subtle changes to the letter shape to aid and its connection rules to other letters. Because the reader in identifying the beginning and end of of the cursive nature, any letter may connect to the each letter within a combination. The letter has previous and following letters. Hence, a beginner three \teeth" (vertical pen strokes) similar to the learns the general four basic forms of a letter: at the start of a word, at the middle, at the end, and ∗ A graduation project under the author's supervision by: Ibrahim R. Mohamed, Ahmed Z. Ameen, Ahmed M. Amin, Akram A. Mojahed, Kareem O. Sharawi, and Hisham Shihab TUGboat, Volume 0 (2060), No. 0 | Proceedings of the 2060 Annual Meeting 1001 Hossam A. H. Fahmy ¤ ¥ ¤ ¤ ¥ teeth in and . When is connected to , the tail 5. There are additional marks that are put on top of the may be elongated to alert the reader to the or below the character. correct grouping of the teeth. Furthermore, the two 6. The horizontal and vertical location of the dots ¢ ¥ ¨ ¡ words and present different heights for the and marks on the characters is not at the same ¥ ¤ teeth of the ¤ and to help the reader as well. This position always but depends on the character difference in width and height is a type of encoding and its context. to prevent a misreading of the word and to aid the 7. There are too many letters that combine (\lig- trained eye in quickly catching the letter combina- atures"). tion. That encoding helps in other cases as well. If 8. Ligatures and variable width forms of the let- due to any reason the dots fade away, a reader faced ters are used to justify the lines. ¢ ¡ with can guess its correct origin. This encoding 9. Several typefaces are needed for special materi- to emphasize the different letters by raising some als in a work of good quality. ¢ ¢ £ ¥ ¡ ¢ ¥ ¤ ¡ ¢ ¥ ¦ ¨ ¨ teeth is essential in words such as and ¡ . With all of these issues, to find a suitable posi- However, the raising of the teeth is only possible by tion for the dots and marks on the letter combina- investigating the group of letters. So the height of tions is sometimes a real challenge even for a human ¡ ¢ £ ¢ ¥ ¤ ¡ ¢ ¥ ¤ ¦ ¨ ¨ ¨ ¦ the tooth for ¦ in and is not a feature of let alone a machine. the individual letter but of the whole combination. The same goes for the height of the dot on top of 2 Automated typesetting of Arabic £ ¥ ¢ ¦ ¦ ¨ ¦ ¦ that same letter as in or . Arabic script does not enjoy the same luxury that In traditional (manual) writing, the \skeleton" latin script has when it comes to automated type- of the letter combination is written first then the setting on computers. Milo correctly asserts [9] that dots and the other marks are provided. So, a writer the use of individual letters as the building block probably progresses from the skeleton to the dotted is not suitable for Arabic. Both Mackay [8] and £ § £ £ ¦¥ ¦ ¦ ¦ ¦ ¤ to the vocalized form as ! ! . Milo [10] argue that a layered approach is a bet- In the arabic script, a good calligrapher justifies ter solution. In such a layered approach, some basic the lines not just by stretching the spaces between elements are provided in the font to represent the the words but mainly by using optional ligatures or skeleton of some letter combinations, an individual © wider forms of some letters as in changing ¨ to letter's skeleton, or even a part of a letter. These ¡ ¡ ¥ ¨ ¥ ¨ ¨ and writing © instead of . The use of elements are combined first to give the correct skele- optional ligatures and wider forms is the preferred ton of the word with all the needed shaping for the method in high quality works. Another method is to teeth or other style requirements. On top of that add an elongation to the tail of some letters by us- skeleton, a second layer for the dots is added. Then, ¡ ¥ ¥ ¨ ing the tat.w¯ıl or kash¯ıdah symbol ` ' such as the vowel marks and any other marks come in sub- ¡ ¢ ¥ ¥ instead of . This second method has been sequent layers. Milo [11] developed a system with a widely abused in newspapers and low quality mate- layered approach for his company, DecoType. It is a rials using mechanical typewriters. proprietary system used by a number of commercial The arabic script has a large number of writing software tools. Due to its proprietary nature, the styles that were developed traditionally to accom- full details and the extent of the capabilities of this modate the different languages and different pur- system are not widely known. poses. Latin scripts use bold, italic, or larger fonts Our goal is to provide a freely available system for section headings and for emphasis. Traditional capable of typesetting the Qur'an, other traditional arabic writings vary the typeface instead. The print- texts, and any publications in the languages using ings of the Qur'an as well as of most books almost the arabic script. The Qur'an is one of the most always use the naskh typeface for the main body. demanding arabic texts from a typographical point The headings, the introductory materials, and the of view. However, there is a long historical record of back materials frequently use other typefaces such excellent quality materials (manuscripts and recent as the thuluth, ta`l¯ıq, and ruq`ah. printings) to guide the work on a system to typeset In summary, the points just mentioned are: it. Such a system, once complete, can easily typeset 1. Arabic is written from right to left.