
Arabic text justification

Mohamed Jamal Eddine Benatia, Mohamed Elyaakoubi and Azzeddine Lazrek
Department of Computer Science, Faculty of Science, University Cadi Ayyad
P.O. Box 2390, Marrakesh, Morocco
Phone: +212 24 43 46 49
Fax: +212 24 43 74 09
lazrek (at) ucam dot ac dot ma
http://www.ucam.ac.ma/fssm/rydarab

TUGboat, Volume 27 (2006), No. 2, Proceedings of the 2006 Annual Meeting

Abstract

Justification of Latin script based texts is carried out by handling hyphenation and inserting inter-word glue, which can be stretched or shrunk to some extent. Due to the cursive nature of Arabic writing, text justification in Arabic follows a quite different logic, so the classical algorithms of text justification must be completely revised. Justification in Arabic typography has traditional processes inspired by calligraphy manuals. Words are flexible: on the one hand, a word can be resized by stretching some letters in a curvilinear way; on the other hand, it can be shrunk via ligatures so that the width of a given letter group is decreased.

[Arabic abstract: not recoverable from the extracted text]

1 Justification in Latin typography

A good many methods to make paragraphs on a page visually homogeneous have been developed. The majority of these methods have already been implemented in TeX. The hyphenation and justification algorithm divides a paragraph into lines in an optimal way, as regards time complexity as well as the visual result obtained [6]. The processing spreads out beyond the paragraphs, to reach the level of the page in its totality.

1.1 Typographical hyphenation

Typographical hyphenation is the breaking of words when they come at the end of a line and would overflow into the margin. In general, word breaking happens at syllable boundaries.

In the beginning, the hyphenation of words was done by hand. In 1983, F. M. Liang [8] published a sophisticated method to find nearly all the suitable places to insert hyphens in a Latin script based written word. The method is controlled by an organized tree structure of tries, containing a list of hyphenation patterns. Combinations of letters which allow, or prohibit, word breaking are listed, and priorities are assigned to breakpoints in letter groups. Patterns reflect the hyphenation rules of a given language, so there will be as many pattern tree structures as there are normative languages. This is the algorithm Donald E. Knuth chose to implement in TeX.

1.2 Spacing

To give a line the flexibility it needs, a space or its equivalent can presumably be inserted between each pair of words. Some of these spaces are transformed into line ends. Others are transformed into variable sized spaces called glue. The glue has a normal size that can be stretched or shrunk. When a paragraph intended to have a justified right margin is composed with TeX, the glue widths of each line are adjusted so that lines end almost at the right margin. Generally, the last line of a paragraph is an exception, and does not have to end at the margin.

Given this, it is always an extremely difficult task to obtain a uniform typographical gray. The main reason is the impossibility of ensuring an equal inter-word space in different lines. The composition accidents of rivers and alleys are uncomfortable for the reader, and irregular spaces catch the reader's attention.

A solution for such problems has been implemented by Hàn Thế Thành [11]. Instead of (or as well as) changing inter-word spaces to justify text lines, the widths of characters are slightly modified. So, better inter-word spacing can be obtained and space elasticity can be limited. This width modification is implemented through horizontal scaling of fonts in pdfTeX. If it is employed parsimoniously and wisely, this method can appreciably improve the appearance of the typography produced by TeX.

2 Justification in Arabic typography

Arabic writing is cursive in its printed form as well as in its handwritten form. The letters' morphology changes according to their position in the word, according to the surrounding letters, and in some cases, according to the word's meaning (for example, ALLAH and Mohamed when it indicates the prophet's name in Figure 1). The alternative positions then depend on the typeset words. The end of a given glyph is tied to the beginning of the following glyph, with no possible break.

Figure 1: Special morphology

Remark:
All figures and the table are written from right to left according to Arabic writing direction.

2.1 Kashida

The genuine connections between Arabic letters are curvilinear strokes that can be stretched or shrunk according to the writing context. Such a curve is called kashida, tamdid, madda, maT, taTwil, or iTalah. This variable-sized connection between letters is specific to Arabic alphabet based writing. The kashida is used in various circumstances:

• emphasis: to mark an important piece of a word. The kashida will then mark the sound elongation;
• legibility: to give a better character layout on the baseline, and to lessen the cluttering at the joint point between two successive letters of the same word;
• aesthetics: to embellish the writing of a word;
• justification: to justify a text line.

The example in Figure 2 shows a composition of an Arabic word; the arrows indicate the kashidas, with various degrees of extensibility.

Figure 2: An Arabic word with kashida

There are mandatory elongations, allowed elongations and prohibited elongations. The typographical quality of a text is determined, among other things, by the absence of mandatory elongations or the presence of prohibited elongations.

In terms of Arabic text justification, the kashida is a typographical effect that allows the lengthening of letters at some carefully selected points of the line, with determined parameters, in order to produce the left alignment of a paragraph. The good selection of characters to be stretched is called tansil.

2.2 Current typesetting systems

In terms of text processing tools, the curvilinear kashida is, generally, still beyond what the majority of typesetting systems can afford. The kashida is not a character in itself, but an elongation of some character parts while keeping the character's body rigid. It is not a simple horizontal scaling to widen character width. Instead of performing a kashida, the majority of typesetting systems proceed by inserting a rectilinear segment between letters; the resulting typographical quality is unpleasant. Due to the lack of adequate tools, the solution consists of inserting a glyph, that is, an element of a font. So, rather than computing (say) parameterized Bézier curves in real time, a ready-to-use character is inserted. Moreover, whenever stretching is performed by means of a parameterized glyph coming from an external dynamic font, the current font context is changed.

Curvilinear extensibility of characters can be afforded by certain systems through the a priori generation of curvilinear glyphs for some predefined sizes. Beyond these sizes, the system will choose curvilinear primitives and linear fragments. Of course, this will violate the curvilinear shape of letters and symbols composed at large sizes [3, 9, 10].

A better approach consists of building a dynamic font [4, 7], through parameterizations of the composition procedure of each letter. The introduced parameters indicate the extensibility size or degree. To handle the elongations, a letter is decomposed into two parts: the static body of the letter and the dynamic part, capable of stretching.

2.3 Ligatures

The cursive nature of Arabic writing implies, among other things, a wide use of ligatures [5].

Figure 3: Presence of Noon-Meem ligature

Figure 4: Absence of Noon-Meem ligature

(Source: Holy Quran written by the calligrapher Alhaj Hafez Mohamed Amin Alrochdi, scrutinized and revised by General Directorate of Endowments, Baghdad, p. 649.)

Figure 5: Contextual and aesthetic transformations

[…]sion of implicit ligatures into aesthetic ones brings some flexibility to the word. So, it can be adapted to the available space on the line. The example in Figure 6 shows three ligature levels: mandatory simple substitutions, aesthetic ligatures of second degree, and finally, aesthetic ligatures of third degree. The last two ligature levels provide shrinking possibilities of the same word.

Figure 6: Various levels of ligatures

The use of aesthetic ligatures of second and third degree has to take into consideration the constraints of legibility.

A typesetting system should take into account three levels of recourse to ligatures. In the first level, there are only implicit contextual ligatures and mandatory grammatical ligatures of second degree. This level is recommended for textbook composition, where it is necessary to avoid any collision between characters and/or any reading ambiguity. In a second level, some aesthetic ligatures of second degree can be used. This level is recommended for composition of books for the general public. The third level, where the use of aesthetic ligatures of […]
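
As a minimal sketch of the levels of recourse to ligatures described in this section, the following Python fragment maps a level to the ligature classes it permits and reports whether that level leaves room for shrinking a word. All identifiers (`Ligature`, `ALLOWED`, `allowed_ligatures`, `can_shrink`) are hypothetical and do not come from any existing typesetting system. Treating the second level as the first level plus second-degree aesthetic ligatures is an assumption, and the third level is deliberately left undefined because its description is incomplete here.

```python
from enum import Enum


class Ligature(Enum):
    """Ligature classes named in the text (identifiers are illustrative)."""
    CONTEXTUAL = "implicit contextual"
    GRAMMATICAL_2 = "mandatory grammatical, second degree"
    AESTHETIC_2 = "aesthetic, second degree"
    AESTHETIC_3 = "aesthetic, third degree"


# Level 1: textbook composition. Level 2: books for the general public.
# Assumption: level 2 keeps everything that level 1 allows.
ALLOWED = {
    1: frozenset({Ligature.CONTEXTUAL, Ligature.GRAMMATICAL_2}),
    2: frozenset({Ligature.CONTEXTUAL, Ligature.GRAMMATICAL_2,
                  Ligature.AESTHETIC_2}),
}

# The aesthetic ligatures are the ones said to provide shrinking possibilities.
AESTHETIC = frozenset({Ligature.AESTHETIC_2, Ligature.AESTHETIC_3})


def allowed_ligatures(level):
    """Return the ligature classes permitted at the given level."""
    if level not in ALLOWED:
        # The third level is not modelled: its definition is not available.
        raise ValueError(f"level {level} is not defined in this sketch")
    return ALLOWED[level]


def can_shrink(level):
    """True if the level permits at least one aesthetic ligature."""
    return bool(allowed_ligatures(level) & AESTHETIC)
```

Under these assumptions, a justification routine could consult `can_shrink(level)` before trying to narrow a word with ligatures, and fall back on other means when the level forbids aesthetic ligatures.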