Dynamic Arabic Mathematical Fonts

Mostafa Banouni [email protected]

Mohamed Elyaakoubi [email protected]

Azzeddine Lazrek [email protected] Department of Computer Sciences Faculty of Sciences University Cadi Ayyad P.O. Box 2390 Marrakech, Morocco http://www.ucam.ac.ma/fssm/rydarab

Abstract This contribution describes a font family designed to meet the requirements of typesetting mathematical documents in an Arabic presentation. Thus, not only is the text written in an -based , but specific symbols are used and mathematical expressions also spread out from right to left. Actually, this font family consists of two components: an Arabic mathematical font and a dy- namic font. The construction of this font family is a first step of a project aiming at providing a complete and homogeneous Arabic font family, in the OpenType format, respecting Arabic calligraphy rules. Keywords: Mathematical font, Dynamic font, Variable-sized symbols, Ara- bic mathematical writing, Multilingual documents, , PostScript, Open- Type.

1 Overview ics take their place over or under the characters, The Arabic language is native for roughly three hun- and that is also context dependent. Moreover, the dred million people living in the Middle East and kashida, a small flowing curve placed between Ara- North Africa. Moreover, the is used, bic characters, is to be produced and combined with in various slightly extended versions, to write many characters and symbols. The kashida is also used major languages such as Urdu (Pakistan), Persian for the text justification. The techniques for man- and Farsi (Iran, India), or other languages such as aging the kashida are similar to those that can be Berber (North Africa), Sindhi (India), Uyghur, Kir- used for drawing curvilinear extensible mathemati- giz (Central Asia), Pashtun (Afghanistan), Kurdish, cal symbols, such as sum, product or limit. Jawi, Baluchi, and several African languages. A There are several Arabic font styles. Of course, great many Arabic mathematical documents are still it is not easy to make available all existing styles. written by hand. Millions of learners are concerned The font style Naskh was the first font style adopted in their daily learning by the availability of systems for computerization and standardization of Arabic for typesetting and structuring mathematics. typography. So far, only Naskh, Koufi, Ruqaa, and Creating an Arabic font that follows calligra- to a limited extent Farsi have really been adapted phic rules is a complex artistic and technical task, to the computer environment. Styles like Diwani due in no small part to the necessity of complex or Thuluth, for example, don’t allow enough sim- contextual analysis. Arabic letters vary their form plification, they have a great variation in characters according to their position in the word and accord- shapes, the characters don’t share the same baseline, ing to the neighboring letters. Vowels and diacrit- and so on. Considering all that, we have decided to use the Naskh style for our mathematical font.

48 Preprints for the 2004 Annual Meeting Dynamic Arabic Mathematical Fonts

The RyDArab [10] system was developed for the Now we will describe the way we constructed purpose of typesetting Arabic mathematical expres- the OpenType Arabic mathematical font RamzArab. sions, written from right to left, using specific sym- The construction of the font started by drawing the bols. RyDArab is an extension of the TEX system. It whole family of characters by hand. This task was runs with K. Lagally’s Arabic system ArabTEX [8] performed by a calligrapher. Then the proofs were or with Y. Haralambous and J. Plaice’s multilingual scanned to transform them into vectors. The scan- Ω [6] system. The RyDArab system uses characters ning tools alone don’t produce a satisfying result, so belonging to the ArabTEX font xnsh or to the omsea once the design is finalized, the characters are pro- font of Ω, respectively. Further Arabic alphabetic cessed and analyzed using special software to gener- symbols in different shapes can be brought from the ate the file defining the font. font NasX that has been developed, for this special In Arabic calligraphy, the feather’s head (ka- purpose, using METAFONT. The RyDArab system lam) is a flat rectangle. The writer holds it so that also uses symbols from Knuth’s Computer Modern the largest side makes an angle of approximately 70◦ family, obtained through adaptation to the right-to- with the baseline. Except for some variations, this left direction of Arabic. orientation is kept all along the process of drawing Since different fonts are in use, it is natural that the character. Furthermore, as Arabic writing goes some heterogeneity will appear in mathematical ex- from right to left, some boldness is produced around pressions typeset with RyDArab [9]. Symbol sizes, segments from top left toward the bottom right and shapes, levels of boldness, positions on the base- conversely, segments from top right to the bottom line will not quite be in harmony. So, we undertook left will rather be slim as in Figure 1. building a new font in OpenType format with two The RamzArab font in Figure 4 contains only main design goals: on the one hand, all the symbols symbols specific to Arabic mathematics presentation will be drawn with harmonious dimensions, propor- plus some usual symbols found even in text mode. tions, boldness, etc., and on the other hand, the font It is mainly composed of the following symbols: should contain the majority of the symbols in use in • alphabetic symbols: Arabic letters in various the scientific and technical writing based on an Ara- forms, such as characters in isolated standard bic script. form, isolated double-struck, initial standard, Both Arabic texts and mathematical expres- initial with tail, initial stretched and with loop sions need some additional variable-sized symbols. (e.g., v ¥ ‡ ¹‡ ¶‡ “ respectively); We used the CurExt [11] system to generate such • punctuation marks (e.g., ; : 9 ! > ); symbols. This application was designed to gener- • digits as used in the Maghreb Arab (North Afri- ate automatically curvilinear extensible symbols for ca), and as they are in the Machreq Arab (Mid- T X with the font generator METAFONT. The new E dle East); extension of CurExt does the same with the font gen- • accents to be combined with alphabetic symbols erator PostScript. XTsTtT While METAFONT generates bitmap fonts and (e.g., ); thus remains inside the TEX environment, Open- • ordinary mathematical symbols such as delim- Type [14] gives outline and multi-platform fonts. iters, arithmetic operators, etc. Moreover, since Adobe and Microsoft have devel- • mirror image of some symbols such as sum, in- oped it jointly, OpenType has become a standard tegral, etc. combining the two technologies TrueType and Post- In Arabic mathematics, the order of the alpha- Script. In addition, it offers some additional typo- betic symbols differs from the Arabic alphabetic or- graphic layout possibilities thanks to its multi-table der. Some problems can appear with the alphabetic feature. symbols in their multi-form. Generally, in Arabic mathematical expressions, 2 A Mathematical Font alphabetic@ A B symbols are written without dots (e.g., The design and the implementation of a mathemat- T T T ) or diacritics. This helps to avoid confu- ical font are not easy [5]. It becomes harder when sions with accents. The dots can be added when- it is oriented to Arabic presentation. Nevertheless, ever they are needed, however. Thus, few symbols independent attempts to build an Arabic mathemat- are left. ical font have been undertaken. In fact, F. Alhar- Moreover, some deviation from the general rules gan [1] has sent us proofs of some Arabic mathemat- will be necessary: in a mathematical expression, the ical symbols in TrueType format. isolated form of the letter ALEF can be confused with the Machreq Arab digit ONE. The isolated

Preprints for the 2004 Annual Meeting 49

1 1

1

1 Mostafa Banouni, Mohamed Elyaakoubi and Azzeddine Lazrek

form of the letter HEH can also present confusion with the Machreq Arab digit FIVE. The choice of the glyphs Z and ¸y to denote respectively these two characters will help to avoid such confusions. −→ −→ Even though these glyphs are not in conformity with the homogeneity of the font style and calligraphic Figure 1: Sum symbol with vertical then rules, they are widely used in mathematics. In the horizontal mirror image same way, the isolated form of the letter KAF c , resulting from the combination of two other basic el- ements, will be replaced by the KAF glyph in Ruqaa style, ^. Other symbols with mirror image forms already 1 For the four letters ALEF, DAL, REH and WAW, in use are not added to this font. Of course, Latin the initial and the isolated forms are the same, and and Greek alphabetic symbols can be used in Arabic these letters will be withdrawn from the list of letters mathematical expressions. In this first phase of the in initial form. On the other hand, instead of a project, we aren’t integrating these symbols into the unique cursive element, the stretched form of each of font. They can be brought in from other existing the previous letters will result from the combination fonts. of two elements. It follows that these letters will not 3 A Dynamic Font be present in the list of the letters in the stretched form. The composition of variable-sized letters and curvi- The stretched form of a letter is obtained by the linear symbols is one of the hardest problems in addition of a MADDA-FATHA or ALEF in its final digital typography. In high-quality printed Arabic form ¶ to the initial form of the letter to be stretched works, justification of the line is performed through (e.g., ‡ + ¶ −→ ¶‡). The glyph of LAM-ALEF b has using the kashida, a curvilinear variable lengthen- a particular ligature that will be added to the list. ing of letters along the baseline. The composition of The stretched form of a character is used if there is curvilinear extensible mathematical symbols is an- no confusion with any usual function abbreviation other aspect of dynamic fonts. Here, the distinction between fixed size symbols and those with variable (e.g., ¶ˆ or ¶ˆ@ for the sine function). The form with tail is obtained starting from width, length, or with bidimensional variability, ac- the initial form of the letter followed by an alter- cording to the mathematical expression covered by native of the final form of the letter HEH ¹ (e.g., the symbol, is of great importance. ‡ + ¹ −→ ¹‡ ). These two forms are not integrated Certain systems [11] solve the problem of ver- into the font because they can be obtained through tical or horizontal curvilinear extensibility through a simple composition. the a priori production of the curvilinear glyphs for The form with loop is another form of letters certain sizes. New compositions are therefore neces- with a tail. It is obtained through the combination sary beyond these already available sizes. This op- of the final form with a particular curl that differs tion doesn’t allow a full consideration of the curvilin- from one letter to another (e.g., ¢ ). This earity of letters or composed symbols at large sizes. form will be integrated into the font because it can- A better approach to get curvilinear letters or ex- not be obtained through a simple composition. tensible mathematical symbols consists of parame- The following particular glyphs are also in use: terizing the composition procedure of these symbols. Z[¸y\]^¸~_`ab. The parameters then give the system the required The elements that are used in the composition information about the size or the level of extensi- of the operator sum, product, limit and factorial in bility of the symbol to extend. As an example, we will deal with the particular case of the opening and a conventional presentation (·d g·ˆ ¶·f ·Œ) are added also. These symbols are extensible. They are closing parenthesis as vertically extensible curvilin- stretched according to the covered expression, as we ear symbol and with the kashida as a horizontally will see in the next section. extensible curvilinear symbol. This can be general- Reversed glyphs, with respect to the vertical — ized to any other extensible symbol. and sometimes also to the horizontal — axis, as in The CurExt system was developed to build ex- Figure 1, are taken from the Computer Modern font tensible mathematical symbols in a curvilinear way. family. For example, there are: ¼¾ÀÂÃÄÅÆÇÈÉ. 1 The Bidi Mirrored property of characters used in Uni- code.

50 Preprints for the 2004 Annual Meeting

1 1 1

1

1

1 1 1

1

1 1

1

1

1

1 Dynamic Arabic Mathematical Fonts 1 14 The previous version of this system was able to pro- duce automatically certain dynamic characters, such 2 13 1 14 as parentheses, using METAFONT. In this adapta- 1 14 tion, we propose to use the Adobe PostScript Type 2 13 3 format [13]. 2 13 The PostScript language defines several types of font, 0, 1, 2, 3, 9, 10, 11, 14, 32, 42. Each one of these types has its own conventions to represent and to organize the font information. The most widely 3 12 used PostScript font format is Type 1. However, a 3 12 3 12 dynamic font needs to be of Type 3 [3].

Although the use of Type 3 loses certain advan- 4 11 4 11 4 11 tages of Type 1, such as the possibility of producing 5 10 hints for when the output device is of low resolution, 5 10 and in the case of small glyphs, a purely geometrical 5 10 treatment can’t prevent the heterogeneity of char- acters. Another lost advantage is the possibility of using Adobe Type Manager (ATM) software. These two disadvantages won’t arise in our case, since the 6 9

symbols are generally without descenders or serifs 6 9 7 8 and the font is intended to be used with a composi- tion system such as TEX, not directly in Windows. 6 9 7 8 The PostScript language [7] produces a draw- ing by building a path. Here, a path is a set of segments (lineto) and third degree B´ezier curves 7 8 (curveto). The path can be open or closed on its origin (closepath). A path can contain several con- trol points (moveto). Once a path is defined, it can Figure 2: Parametrization of dynamic parenthesis be drawn as a line (stroke) or filled with a color (fill). From the graphical point of view, a glyph is a procedure defined by the standard operators of • converting the file from dvi to ps format. PostScript. This process should be repeated as many times as To parameterize the procedure, the form of the needed to resolve overlapping of extensible symbols. glyph has to be examined to determine the different The curvilinear parentheses is produced by Cur- parts of the procedure. This analysis allows deter- Ext with the following encoding:

mining exactly what should be parameterized. In ¢¡¤£¦¥¨§ © ¦© ©¦¨ .0/21

the case of an opening or closing parenthesis, all the ¡¥¨ ¢§¦ ! ¡ "#§

3 425 % !& ¡ "#§

parts of the drawing depend on the size: the width, $

687:9 ( !) ¡ "#§

the length, the boldness and the end of the paren- '

; .

• converting the file pl into a metric file tfm with with CurExt:

¢¡¤£¥¡§¦¨¡§©§£¥¡§ ¢

the application pltotf;

! "#   ¤§£¢ ©¨¢  

• compiling the document to generate a dvi file; $&%('*),+

Preprints for the 2004 Annual Meeting 51 Mostafa Banouni, Mohamed Elyaakoubi and Azzeddine Lazrek latex doc.tex - parent.par curext References par2pl [1] http://www.linux.org.sa. ? [2] Jacque Andr´e,Caract`eres num´eriques : intro- latex doc.dvi  parent.pl duction, Cahiers GUTenberg, vol. 26, 1997. curext ⊕ [3] Daniel M. Berry, Stretching Letter and Slanted- pltotf baseline Formatting for Arabic, Hebrew and ? Persian with ditroff/ffortid and Dynamic parent.tfm POSTSCRIPT Fonts, Software–Practice & Expe- dvips -u rience, no. 29:15, 1999, pp. 1417–1457. doc.ps  parent.t3 [4] Charles Bigelow et Kris Holmes, Cr´eation d’une police Unicode, Cahiers GUTenberg, vol. 20, parent.map 1995. [5] Yannis Haralambous, Une police math´ematique Figure 3: Generation of dynamic parentheses pour la Soci´et´eMath´ematique de France : le SMF Baskerville, Cahiers GUTenberg, vol. 32, 1999, pp. 5–19. instead of the straight lengthened one obtained by [6] Yannis Haralambous and John Plaice, Multilin- RyDArab: gual Typesetting with Ω, a Case Study: Arabic,  \amarabmath §§§§§§§§§§§§§§m.× Proceedings of the International Symposium on ${\lsum_{b=T-1}^{s}}$ 1− =H Multilingual Information Processing (Tsukuba), 1997, pp. 137–154. We can stretch Arabic letters in a curvilinear

way through the kashida by CurExt: [7] Adobe Systems Incorporated, POSTSCRIPT ¡¤£ ¢¡¤£¥ §¦¨¡¤£¥ §© Language Reference Manual, Second ed., Addison-Wesley, 1992. 4 Conclusion [8] Klaus Lagally, ArabTEX — Typesetting Ara- The main constraints observed in this work were: bic with Vowels and Ligatures, EuroTEX’92 • a close observation of the Arabic calligraphy (Prague), 1992. rules, in the Naskh style, toward their formal- [9] Azzeddine Lazrek, Aspects de la probl´ematique ization. It will be noticed, though, that we are de la confection d’une fonte pour les still far from meeting all the requirements of math´ematiques arabes, Cahiers GUTen- Arabic calligraphy; berg, vol. 39–40, Le document au XXIe si`ecle, 2001, pp. 51–62. • the heavy use of some digital typography tools, rules and techniques. [10] Azzeddine Lazrek, A package for typesetting arabic mathematical formulas, Die T Xnische RamzArab, the Arabic mathematical font in E Kom¨odie, DANTE e.V., vol. 13. (2/2001), 2001, Naskh style, is currently available as an OpenType pp. 54–66. font. It meets the requirements of: [11] Azzeddine Lazrek, CurExt, Typesetting • homogeneity: symbols are designed with the variable-sized curved symbols, EuroTEX 2003 same nib. Thus, their shapes, sizes, boldness preprints, pp. 47–71 (to appear in TUGboat). and other attributes are homogeneous; [12] Mustapha Eddahibi, Azzeddine Lazrek • completeness: it contains most of the usual spe- and Khalid Sami, Arabic mathematical e- cific Arabic symbols in use. documents, International Conference on TEX, These symbols are about to be submitted for XML and Digital Typography (TUG 2004, inclusion in the Unicode standard. This font is un- Xanthi, Greece), 2004. der test for Arabic mathematical e-documents [12] [13] W lodzimierz Bzyl, The Tao of Fonts, TUGboat, after having been structured for Unicode [2, 4]. vol. 23, 2002, pp. 27–39. The dynamic component of the font also works [14] Thomas W. Phinney, TrueType, POSTSCRIPT in PostScript under CurExt for some symbols such Type 1 & OpenType: What’s the Difference?, as the open and close parenthesis and the kashida. Version 2.00 (2001). That will be easily generalized to other variable- sized symbols. The same adaptation can be per- formed within the OpenType format.

52 Preprints for the 2004 Annual Meeting

1 Dynamic Arabic Mathematical Fonts

0 1 2 3 4 5 6 7 8 9 1x 2x 3x ! " # $ % & '

4x ( ) * + , - . / 0 1

5x 2 3 4 5 6 7 8 9 : ; 6x < = > ? @ A B C D E

7x F G H I J K L M N O

8x P Q R S T U V W X Y

9x Z [ \ ] ^ _ ` a b c

10x d e f g h i j k l m

11x n o p q r s t u v w

12x x y z { | } ~  € 

13x ‚ ƒ „ † ‡ ˆ ‰ Š ‹

14x Œ  Ž   ‘ ’ “ ” •

15x – — ˜ ™ š › œ  ž Ÿ

16x ¡ ¢ £ ¤ ¥ ¦ § ¨ ©

17x ª « ¬ ­ ® ¯ ° ± ² ³

18x ´ µ ¶ · ¸ ¹ º » ¼ ½

19x ¾ ¿ À Á Â Ã Ä Å Æ Ç

20x È É

Figure 4: RamzArab Arabic mathematical font

Preprints for the 2004 Annual Meeting 53