A Non-standard Application of ArabTEX: Generating Sorted Indices∗

Klaus Lagally
Institut für Informatik, Universität Stuttgart
Breitwiesenstraße 20-22, 70565 Stuttgart, Germany
[email protected]
1 January 2011

Abstract

ArabTEX, a macro package extending TEX and LATEX, has primarily been designed to support the typesetting of texts using the Arabic and other right-to-left scripts. However, due to the high flexibility of the TEX macro mechanism, we can also apply ArabTEX to support some more general data processing tasks, provided the input data contain a suitably chosen symbolic markup. We present some techniques we used to generate sorted indices for a multi-lingual dictionary.

1 Introduction

ArabTEX[?, ?], a macro package for TEX[?] and LATEX[?, ?, ?] originally designed to support the typesetting of texts in the Arabic script, has recently been extended in several respects. Originally it offered its own private ASCII encoding, modelled after the standard ZDMG transliteration [?, ?], as the only means of inputting Arabic text; meanwhile several other encoding standards, e.g., ASMO 449 [?, ?] and ISO 8859-6 [?, ?], are also supported, and can even peacefully coexist within the same document. Likewise, other Semitic languages can be handled, e.g., vowelized Hebrew in several common encodings [?]; and the processing of Syriac presently is only hampered by the fact that we know of no Syriac font available in the public domain. One may even, should the need arise, typeset in Ugaritic cuneiform.

This surprising versatility made ArabTEX an obvious candidate to be considered in an ongoing project of building a multi-lingual dictionary, and first results look promising, even though the project is still far from its completion [?, ?]. For a sample page see Figure 1.

Building a dictionary amounts to more than just producing a printed version; in fact the input data can be considered a data base, and may be evaluated along many different lines, even if not all possible applications are known at this stage. It may pay off to choose an encoding that, in addition to supporting the printing job, will allow us to capture all relevant information available now, without interfering with the printing process and without precluding further evaluations. The obvious solution is some kind of symbolic markup, denoting the structure and, where available, also the meaning of segments of the data, without having to decide now on the details of further processing.

This idea has been advocated for quite some time, e.g., within LATEX[?], SGML [?, ?], and more recently, HTML [?]. Though these proposals are related, we observe some differences which, in our opinion, are important for the application at hand. Even though the manuals say otherwise, both SGML and HTML appear primarily oriented towards the visual presentation of text, and require a complete definition of the syntax and the semantics of the markup statements occurring in a given document class. This may be very useful if we want to check the formal consistency and completeness, and also enables us to decouple the collection of data from the rendering process, but it will also make extending the notation for additional information a major task.

On the other hand, due to the TEX macro feature the TEX/LATEX markup is completely open-ended, and, apart from the primitive TEX commands, has no fixed meaning at all. The interpretation of a control sequence needs only to be known at the time of actual processing, and can be bound to quite different actions for different processing tasks according to additional requirements. We shall present some examples in the sequel.

∗Submitted to Cahiers GUTenberg

2 Symbolic Markup

To demonstrate what we mean we present a short excerpt of the encoded input data for the example shown in Figure 1.

\alemma {qAbUs} JA 1886 (1) 460. \see \ar {qwA_tUs} (ib.)

\alemma {qAbIl} \glemma {k'aphloc} \from \syr {qpIl'}
ZDMG 1897 (51) 470. der Kleinh"andler, Speisewirth:
\ar {mi_tl insAn _dAhib fI al-sUq `inda al-qAbIl
ya^sum al-^siwA'| wa-al.tabI_h}
"`Wie ein Mensch wel\-cher auf dem Markte bei [dem] Speisewirth
vorbeigeht und den Duft der gekochten und gebratenen Speisen riecht."'

\alemma {qAtismA} \glemma {k'ajisma} pl. \ar {qAtismAt}
GRAF VERZ. 86 "`Kathisma in der Psalmeneinleitung"'.
\var \ar {qA.tsmA} (pl. \ar {qA.ssmAt}), \ar {kAtsmA}.

Figure 1: Example of multi-lingual text

As we can see, the data are structured as a sequence of primary and secondary entries, each providing information about some Arabic word that is supposed to originate directly or indirectly from Greek. A primary entry refers to the Greek origin directly and also reports some additional information; a secondary entry refers to some primary entry and usually is just a spelling variant.

The markup command \ar indicates that its argument is Arabic text encoded in the ArabTEX standard ASCII encoding; \alemma in addition identifies an Arabic lemma. Likewise, \gr marks Greek text in the GreekTEX encoding [?, ?], and \glemma denotes a Greek lemma. The markup adds some logical structure to the entries and supplies semantic information, but does not yet fix the intended processing and/or rendering; to do this we have to provide some macro definitions for the occurring control sequences before the actual processing run. For producing the printed version this looks as follows:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Arabic: call ArabTeX macros

\let \ar \RL
\def \alemma #1{\item [{\setnashbf \<#1>}]}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Greek: use GreekTeX encoding and fonts

\font \grfont = kdgr10
\font \grbfont = kdbf10

\def \gr #1{{\grfont #1}}
\def \glemma #1{{\grbfont #1}}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Hebrew: ArabTeX transliteration
% use ArabTeX Hebrew mode

\def \heb #1{\sethebrew \RL{#1}\setverb }

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Syriac: ArabTeX transliteration
% temporary solution: mapped to Hebrew

\def \syr #1{{\it (syr. \heb {#1})\/}} % print a single Syriac word

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% miscellaneous markup commands

\def \see {\unskip \ $\Rightarrow$ }
\def \from {\unskip \ $<$ }
\def \like {\unskip \ $\approx$ }
\def \var {{\it var. }}
\def \cod {{\it cod. }}

\def \key #1{\relax }

We have assumed that the ArabTEX commands and a Greek font are available, and we have also assumed that the data are included in a LATEX “description list” environment, formatted in two columns.

Observe that the markup for an Arabic lemma and for an Arabic comment are different; they play different roles and are also rendered differently. The same is true for Greek lemmata and Greek explanations (none in the example). We did not differentiate between the various Latin-script languages occurring but could do so easily.
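The point that one markup can be bound to different actions can be illustrated outside TEX as well. The following Python sketch is only an analogy under stated assumptions: the function `process`, the two handler tables and the bracketed output notation are hypothetical names invented for this illustration and are not part of ArabTEX or of the dictionary project.

```python
import re

# One markup tag followed by a braced argument, e.g. \ar {qAbIl}.
TAG = re.compile(r"\\([A-Za-z]+)\s*\{([^{}]*)\}")


def process(text, handlers):
    """Replace each tagged segment using the handler bound to its tag."""
    def apply(match):
        name, arg = match.group(1), match.group(2)
        handler = handlers.get(name)
        # Tags without a binding are passed through unchanged.
        return handler(arg) if handler else match.group(0)
    return TAG.sub(apply, text)


# Binding 1: a stand-in for rendering (hypothetical output notation).
render = {
    "alemma": lambda a: "[AR-BOLD " + a + "]",
    "ar": lambda a: "[AR " + a + "]",
    "glemma": lambda a: "[GR-BOLD " + a + "]",
}

# Binding 2: keep only the lemmata and drop Arabic comments.
extract = {
    "alemma": lambda a: "A=" + a,
    "glemma": lambda a: "G=" + a,
    "ar": lambda a: "",
}

entry = r"\alemma {qAtismA} \glemma {k'ajisma} pl. \ar {qAtismAt}"
print(process(entry, render))
print(process(entry, extract))
```

The input text is the same in both calls; only the table of bindings changes, which mirrors the way the macro definitions are swapped before each processing run.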

3 Producing a Greek Index

The dictionary in its present form enables the user to look up some Arabic word assumed to have a Greek origin, but gives no assistance at all with the task of finding the Arabic versions of a Greek term. To do this we need an index on the Greek terms leading at least to the relevant main entries.

Now our data as given are not in a format that could be sorted easily on the Greek lemmata, and we probably will not be able to find a sorting routine that can process the input notation supported by GreekTEX that we used. So we will have to do some preprocessing ourselves: we have to take out the Greek lemmata and transform them into a format that can be sorted by the software at hand, without losing the connection to the main entries.

To do this in TEX is relatively easy: we read our input file one entry at a time, and generate an output file containing for every main entry:

• an alphanumeric sorting key,

• the Greek lemma,

• the Arabic lemma.

This file can be sorted by any standard sorting routine provided by our operating system, and when processing it again we only have to make sure that the sorting key will not interfere. To generate the file to be sorted we use a short TEX program whose main parts are given below:

\long \def \alemma #1#2\par {% process next entry
  \def \aword {#1}% save last Arabic lemma
  \getgreek #2\glemma*\relax }

\def \getgreek #1\glemma #2#3\relax {% parse the entry
  \ifx *#2\relax % secondary entry
  \else \def \thekey {}\getkey #2\relax % compute the sorting key
    \putentry {#2}% and output the line
  \fi }

\def \putentry #1{% write the output line
  \immediate\write\outfile {\string\key \string{\thekey \string}
    \string\gr \string{#1\string} \string\ar \string{\aword\string}}}

\def \getkey #1{% build up the sorting key
  \let \next \getkey
  \ifx \relax #1\let \next \relax
  \else \ifcat a\noexpand #1\edef \thekey {\thekey\csname gk#1\endcsname}%
  \fi\fi \next }

% translation table:

\def \gka {a} \def \gkb {b} \def \gkc {r} \def \gkd {d} \def \gke {e}
\def \gkf {u} \def \gkg {c} \def \gkh {g} \def \gki {i} \def \gkj {h}
\def \gkk {j} \def \gkl {k} \def \gkm {l} \def \gkn {m} \def \gko {o}
\def \gkp {p} \def \gkq {v} \def \gkr {q} \def \gks {r} \def \gkt {s}
\def \gku {t} \def \gkv {} \def \gkw {x} \def \gkx {n} \def \gky {w}
\def \gkz {f}

Note that we did not need to redefine every control sequence used in the markup; we only look for the existence of a Greek lemma and project out the relevant information. This is one of the occasions where TEX’s macro pattern matching mechanism comes in very handy.
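For readers who want to check the key construction without running TEX, the same projection can be sketched in Python. This is an illustration only: the dictionary `GK` copies the \gk... translation table given above, while the function names `sort_key` and `index_line` and the two regular expressions are assumptions made for this sketch.

```python
import re

# Re-encoding of the transliteration letters so that plain ASCII order
# follows the Greek alphabet; copied from the \gk... table (v maps to nothing).
GK = {
    "a": "a", "b": "b", "g": "c", "d": "d", "e": "e", "z": "f",
    "h": "g", "j": "h", "i": "i", "k": "j", "l": "k", "m": "l",
    "n": "m", "x": "n", "o": "o", "p": "p", "r": "q", "s": "r",
    "c": "r", "t": "s", "u": "t", "f": "u", "q": "v", "y": "w",
    "w": "x", "v": "",
}

ENTRY = re.compile(r"\\alemma\s*\{([^{}]*)\}(.*)", re.S)
GLEMMA = re.compile(r"\\glemma\s*\{([^{}]*)\}")


def sort_key(greek):
    """Translate letters through GK; accents and other characters are skipped."""
    return "".join(GK[ch] for ch in greek if ch in GK)


def index_line(entry):
    """Return the line to be sorted for a primary entry, None for a secondary one."""
    m = ENTRY.match(entry.strip())
    if m is None:
        return None
    g = GLEMMA.search(m.group(2))
    if g is None:
        return None  # secondary entry: no Greek lemma present
    greek, arabic = g.group(1), m.group(1)
    return "\\key{%s} \\gr{%s} \\ar{%s}" % (sort_key(greek), greek, arabic)


print(index_line(r"\alemma {qAbIl} \glemma {k'aphloc} \from \syr {qpIl'}"))
```

Both sigma forms (`s` and `c`) map to the same key letter, so a word-final sigma does not disturb the collating order.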

The sorting key is just the concatenation of the letters of the Greek lemma, after a suitable re-encoding in order to obey the collating sequence. If we apply the algorithm to the example given and sort the result, we obtain:

\key{jador} \gr{k'adoc} \ar{qAdUs}
\key{jahaiqerir} \gr{kaja'iresic} \ar{qA_tArAsIs}
\key{jahgsxq} \gr{kaj'htwr} \ar{qA_tA.tIr}
\key{jahirla} \gr{k'ajisma} \ar{qAtismA}
\key{jahokijor} \gr{kajolik'oc} \ar{qA_tUlIq}
\key{japgkor} \gr{k'aphloc} \ar{qAbIl}
\key{jaqabor} \gr{k'araboc} \ar{qArAbs}
\key{jasoivgreir} \gr{katoiq'hseic} \ar{qAtksIsAt}
\key{jedqor} \gr{k'edroc} \ar{qAdrs}

which with the new definitions

\def \key #1{\relax}
\def \ar #1{\hfill \<#1>\par }

when typeset in two columns will produce:

[Typeset index in two columns: each Greek lemma in Greek script, followed by the corresponding Arabic lemma set flush right.]

For lemmata in other languages, e.g., Hebrew or Syriac, we proceed analogously; for some complications due to the ArabTEX input notation see below.
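The sorting step itself is left to a standard routine. As a minimal sketch of what that step has to do, the following Python fragment sorts lines of the form shown above and then drops the key. The function names `key_of` and `sorted_index` are hypothetical and serve only this illustration.

```python
import re

# A sorting key at the start of a line, e.g. \key{jador} followed by blanks.
KEY = re.compile(r"\\key\{([^{}]*)\}\s*")


def key_of(line):
    """Return the text of the leading key, or an empty string if there is none."""
    m = KEY.match(line)
    return m.group(1) if m else ""


def sorted_index(lines):
    """Sort by the extracted key and strip the key from every line."""
    return [KEY.sub("", line, count=1) for line in sorted(lines, key=key_of)]


lines = [
    r"\key{jedqor} \gr{k'edroc} \ar{qAdrs}",
    r"\key{jador} \gr{k'adoc} \ar{qAdUs}",
    r"\key{japgkor} \gr{k'aphloc} \ar{qAbIl}",
]
for line in sorted_index(lines):
    print(line)
```

Comparing the extracted keys rather than the whole lines is a deliberate choice in this sketch: in a whole-line comparison the closing brace takes part, and since `}` sorts after all letters in ASCII, a key that is a prefix of another key would be placed after it.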

4 Producing an Arabic Retrograde Index

When editing an ancient manuscript it may happen that the beginning of a word is unreadable or corrupted. In this situation a retrograde index may help: a list of words sorted backwards from the end. To produce a retrograde index from our dictionary data base, we again project out the Arabic lemma and put it into a new file together with a suitable sorting key. However, due to the peculiarities of the Arabic script, we cannot just parse the input notation character by character as we did in the case of Greek. Fortunately ArabTEX can assist in our task as it performs much of the needed processing anyway.

When an Arabic text is typeset using ArabTEX it is processed in units of individual words; their external representations are accumulated from right to left in a line buffer whose contents will be passed on to TEX for further processing. For every word several processing steps are performed in succession:

• An internal token sequence is formed that no longer depends on the actual input encoding.

• From this token sequence a sequence of pseudo-syllables is deduced. Each pseudo-syllable denotes a carrier (a printable character) and the associated diacritic and vowel information. This sequence is available in two forms: in the input order and also reversed as needed for the Arabic script. Indeed there exist Semitic scripts running from left to right, e.g., Ugaritic cuneiform.

• Context analysis is performed on pairs of adjacent carriers: we determine whether the isolated, initial, medial, or final shape of the character in question is required. This step is language dependent.

• The obtained sequence of glyphs with associated diacritical information is modified as sometimes ligatures have to be formed. This step depends on the language and also on the characteristics of the font used.

• Finally the external representation is built up by combining the glyphs with the associated diacritics into a two-dimensional pattern.
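The context analysis step in the list above can be sketched in a few lines of Python. This is not ArabTEX's implementation; it is a simplified model under explicit assumptions: carriers are written in an ad-hoc ASCII notation, the set `RIGHT_JOINING` lists only the basic Arabic letters that do not connect to a following letter, and the function name `shapes` is hypothetical.

```python
# Carriers that join only to the preceding letter (assumed set for plain
# Arabic: alif, dal, dhal, ra, zay, waw), in an ad-hoc ASCII notation.
RIGHT_JOINING = {"A", "d", "_d", "r", "z", "w"}


def shapes(carriers):
    """Decide the contextual shape of every carrier from its two neighbours."""
    result = []
    for i, carrier in enumerate(carriers):
        # A carrier connects backwards if its predecessor connects forwards.
        joins_prev = i > 0 and carriers[i - 1] not in RIGHT_JOINING
        # It connects forwards if a successor exists and it is able to.
        joins_next = i + 1 < len(carriers) and carrier not in RIGHT_JOINING
        if joins_prev and joins_next:
            shape = "medial"
        elif joins_prev:
            shape = "final"
        elif joins_next:
            shape = "initial"
        else:
            shape = "isolated"
        result.append((carrier, shape))
    return result


# Carriers of the lemma written qAbIl in the ArabTEX notation.
for carrier, shape in shapes(["q", "A", "b", "y", "l"]):
    print(carrier, shape)
```

Only pairs of adjacent carriers are inspected, which matches the description of the step. Ligature formation and language-specific exceptions are outside the scope of this sketch.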

Sorting Arabic words depends solely on the carriers, and we may access the sequence of carriers by using a well-defined, though as yet unpublished, interface. For generating a retrograde index we use the carrier sequence in the reversed order; for a direct index we would use the input order.

The processing program looks essentially like the example given above, but we no longer need to look for the existence of a Greek lemma, so all entries are processed alike; in addition we access the internal ArabTEX data and use a different translation table. The basic idea is to use the collating sequence defined by the ASMO 449 encoding. We cannot use this encoding directly since it maps the Arabic letters to ASCII capital and small letters, and most simple sorting routines identify corresponding pairs; but the hexadecimal encoding of the ASCII position numbers will work. The resulting retrograde word list looks as follows:

[Typeset retrograde list of the Arabic lemmata]
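The reason for the hexadecimal detour can be made concrete with a small Python sketch. It rests on assumptions that should be checked against the standard: the table `ASMO` gives code positions for a handful of carriers as we believe ASMO 449 assigns them, the carrier names are an ad-hoc ASCII notation, and `retro_key` is a hypothetical function name.

```python
# Assumed ASMO 449 code positions for a few carriers (illustrative subset;
# verify against the standard before relying on the values).
ASMO = {
    "A": 0x47, "b": 0x48, "s": 0x53, "q": 0x62,
    "l": 0x64, "w": 0x68, "y": 0x6A,
}


def retro_key(carriers):
    """Hex-encode the code positions of the carriers, last carrier first."""
    return "".join("%02X" % ASMO[c] for c in reversed(carriers))


def direct_key(carriers):
    """Same encoding in input order, as needed for a direct index."""
    return "".join("%02X" % ASMO[c] for c in carriers)


word = ["q", "A", "b", "y", "l"]
print(retro_key(word))
print(direct_key(word))

# Why the raw code positions are unsuitable: under these assumed positions
# "b" and "w" are the ASCII letters H and h, which a case-insensitive sort
# treats as equal. Their hex encodings "48" and "68" stay distinct.
print(chr(ASMO["b"]), chr(ASMO["w"]))
```

Two hex digits per carrier keep the keys at a fixed width per position, so a plain lexicographic comparison of the keys follows the order of the code positions.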

5 Re-sorting the Data

While we build up and maintain the data base, new items will be added in random order, and existing items will be modified. Thus from time to time we have to re-sort them, and we have to make sure that the text of the individual items will not be modified, with the exception of the possible addition of a sorting key which will be ignored during further processing. We can proceed in analogy to the building up of a retrograde index, this time accessing the internal list of carriers in the input order. However, we also have to reproduce the complete entries essentially intact in the output, and to achieve this goal some care is necessary.

Formally the markup tags we use are TEX control sequences, possibly with parameters, and control sequences will be expanded by TEX during a writing operation. Furthermore, the body of an item when written out in one step will go into a single, possibly very long, line which might cause the input buffer of our sorting routine, and also that of TEX during further processing, to overflow.

For the first problem there appear to be two solutions. One of them is to define every control sequence that may occur within the body of an item so that it expands to its external representation, taking care of the parameters, e.g.:

\def \glemma #1{\string\glemma \string{#1\string}}

This has to be done individually for every kind of markup statement. The second solution is to deactivate all the control sequences and the parameter delimiters globally:

\def \alemma #1{% do not yet read the body of the item
  \def \aword {#1}% save the lemma
  \makekey {#1}% compute the sorting key
  \begingroup
  \catcode `\{=12 \catcode `\}=12 % disable the braces
  \catcode `\\=12 % disable all control sequences
  \getbody }

\def \getbody #1\par {% now read the rest of the entry
  \endgroup
  \def \thebody {#1}\writeentry }

\def \writeentry {% put things together
  \immediate\write\outfile {\string\key \string{\thekey \string}
    \string\alemma \string{\aword \string} \thebody }
  \immediate\write\outfile {}}% add an empty line

The second problem is much worse, and our macros to solve it are far too complicated to present them here in full. We shall just indicate the main tasks to be solved:

If we want to leave the line structure of the input data essentially intact, we have to make sure it will survive the sorting process, which cannot safely be assumed to be stable. Thus every line also has to get a secondary sorting key, concatenated to the primary key and increasing for every additional line.

Originally our items were separated by an empty line. This will not survive the sorting process and has to be replaced by an explicit \par command which also carries a (composite) sorting key to migrate to the correct position in the sorted file. A line only carrying a sorting key by itself would also arrive at the correct position but would no longer act as an item separator.

To get at the line separators after disabling the control sequences within the body of an item we have to make the TEX end-of-line character active and thus recognizable. This will disable the TEX feature of substituting a \par command for an empty line, and thus we may not read the body of an item as a whole but have to process it line by line, checking both for an empty line and a \par command as end markers. Since all control sequences have been disabled, checking for the \par command means looking for its external representation irrespective of a possibly present sorting key.

Every line will get a new composite sorting key. It may already contain a key from a preceding run, and this key might have been invalidated by editing; thus it has to be deleted, and this means searching for its textual representation together with its argument. Fortunately the TEX macro mechanism is sufficiently powerful to handle this task within very few lines of code, but these are rather intricate and far from trivial.

As a final complication there may be letter variations due to erroneous writing which are not differentiated by the standard collating sequence, but nonetheless have to be kept separate. This even occurs within the sample given in Figure 1, and can be handled by judicious assignment of secondary keys.

The sorting keys are comparatively long and thus increase the size of the data file by an appreciable amount. They also impair the readability and thus complicate the editing. Fortunately they are only required during the sorting process itself; for any further processing they would have to be defined to do nothing anyway, and thus we may safely strip them off after sorting. This can easily be done by defining the \key command to read the rest of the line and write it out verbatim. Thus we get back to the original input format with the possible deviation that empty lines may have been replaced by \par commands; and with little effort we can also redo this substitution.
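The bookkeeping described in this section (composite keys per line, an explicit \par separator that carries a key, removal of stale keys) can be sketched in Python. This is a simplified model and not a transcription of the TEX macros, which are not shown in the text: the key layout `primary.NNNN`, the function names and the sample primary keys are assumptions made for the illustration.

```python
import re

# A key at the start of a line, possibly left over from an earlier run.
KEY = re.compile(r"^\\key\{[^{}]*\}")


def strip_keys(lines):
    """Delete a leading key from every line."""
    return [KEY.sub("", line) for line in lines]


def keyed_lines(items):
    """Tag every line of every item, plus a closing \\par, with a composite key.

    items is a list of (primary_key, lines) pairs. The secondary key is a
    running line number, so the line order survives an unstable sort.
    """
    out = []
    for primary, lines in items:
        body = strip_keys(lines) + ["\\par"]  # drop stale keys, add separator
        for number, line in enumerate(body):
            out.append("\\key{%s.%04d}%s" % (primary, number, line))
    return out


items = [
    ("b", ["\\alemma {B} first line", "second line"]),
    ("a", ["\\key{old.0000}\\alemma {A} only line"]),
]
tagged = keyed_lines(items)
restored = strip_keys(sorted(reversed(tagged)))
for line in restored:
    print(line)
```

The separator `.` sorts before all digits and letters in ASCII, so in this sketch a primary key that is a prefix of another one still comes first even when whole lines are compared. The fixed-width line number keeps the secondary keys in numeric order under a plain string comparison.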

6 General Re-encoding

As we have seen, once an Arabic word has been processed by ArabTEX all the information governing its graphical rendering is available as a sequence of carriers and modifiers, independent of the actual input encoding. This opens up some new possibilities: we may consider the inverse transformation and produce again an external textual representation, possibly in a different encoding standard. There may well be applications outside the realm of typesetting, e.g., importing texts coded in ASMO 449 or ISO 8859-6 into Arabic Windows or vice versa.

The implementation should meanwhile be rather obvious: for any required output encoding we have to provide a translation table for the carriers, and also a routine to transform the modifiers to their external representation (if the encoding caters for them at all). We collect the resulting character string in a buffer which will be written to an output file. Whether we preserve the existing line breaks of the input text or will break the lines at some reasonable place ourselves depends on the application.

If handling an additional encoding convention is required (there are, e.g., numerous private standards used by various Arabic text processing packages), we have to provide and add a new input module and possibly an output module for it. ArabTEX provides a standard mechanism for installing additional reading modules, and the technique of constructing them is meanwhile well understood. Building new output modules is more risky and as yet there is no standard mechanism for installing them.
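The table-driven output step described above can be sketched as follows. The sketch makes several assumptions that go beyond the text: the internal form of a word is modelled as a list of (carrier, modifier) pairs in an ad-hoc ASCII notation, Unicode code points serve as the pivot representation, and Python's standard `iso8859_6` codec produces the final bytes. None of these names belongs to ArabTEX.

```python
# Hypothetical internal form of one word: (carrier, modifier) pairs in
# input order; an empty modifier means "no vowel mark".
WORD = [("q", ""), ("A", ""), ("b", "i"), ("y", ""), ("l", "")]

# Translation table for the carriers (here: Unicode code points).
CARRIERS = {
    "q": "\u0642",  # qaf
    "A": "\u0627",  # alif
    "b": "\u0628",  # ba
    "y": "\u064A",  # ya
    "l": "\u0644",  # lam
}

# External representation of the modifiers (short vowels only).
MODIFIERS = {"": "", "a": "\u064E", "u": "\u064F", "i": "\u0650"}


def reencode(word, codec, with_modifiers=True):
    """Collect the external representation in a buffer and encode it."""
    buffer = []
    for carrier, modifier in word:
        buffer.append(CARRIERS[carrier])
        if with_modifiers:
            # Skipped entirely for encodings that do not cater for modifiers.
            buffer.append(MODIFIERS[modifier])
    return "".join(buffer).encode(codec)


print(reencode(WORD, "iso8859_6").hex())
print(reencode(WORD, "iso8859_6", with_modifiers=False).hex())
print(reencode(WORD, "utf-8").hex())
```

Adding a further output encoding then amounts to supplying another pair of tables or, as here, another codec name, while the internal form of the word stays untouched.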

7 Conclusion

(. . .) All the algorithms presented in this report have been implemented and seem to function correctly as described. In our opinion the principal drawback of the presented solutions is their high demand on storage and running time. At the present state of technology this is no longer such a severe limiting factor as it would have been a decade ago; nonetheless it is inconvenient, and it may pay off to think twice before starting a major processing job. Of course, if the job has to be done and there just is no other solution available, we have no choice.

We have to thank Bernd Raichle for pointing out some improvements to our TEX macros.

References

[ASMO449] Arab Standards and Metrology Organization. 7-bit coded Arabic character set for information interchange. ASMO 449 Tech. Rep., Amman, Jordan, 1982.

[ASMO708] Arab Standards and Metrology Organization. 8-bit Coded Arabic/Latin Character Set for Information Interchange. ASMO DS 708 Tech. Rep., Amman, Jordan, 1985.

[BL+94] Tim Berners-Lee et al. The World-Wide Web. Communications of the ACM, 37(8):76–82, 1994.

[DIN31635] Deutsches Institut für Normung e.V. Umschrift des Arabischen Alphabets. DIN 31 635, 1982.

[Dry94] K. J. Dryllerakis. GreeKTEX. Available electronically via the InterNet from the Comprehensive TEX Archive Network (CTAN), 1994. A set of Greek fonts, associated macros and documentation; based on fonts devised by Silvio Levy and Yannis Haralambous.

[GMS84] Michael Goossens, Frank Mittelbach, and Alexander Samarin. The LATEX Companion. Addison-Wesley, Reading, Mass., etc., 1984.

[ISO8859-6] International Organization for Standardization. Information processing – 8-bit single-byte coded graphics character sets – Part 6: Latin/Arabic. ISO 8859-6, 1987.

[ISO8879] International Organization for Standardization. Information Processing – Text and Office Systems – Standard Generalized Markup Language (SGML). Technical Report ISO 8879, ISO Geneva, 15 October 1986; Amendment 1, 1 July 1988.

[ISO9036] International Organization for Standardization. Information processing – Arabic 7-bit coded character set for information interchange. ISO/TC 97 - ISO 9036, 1987.

[ISO/R233] International Organization for Standardization. International System for the Transliteration of Arabic Characters. ISO/R 233 - 1961.

[Knu84] Donald E. Knuth. The TEXbook, volume A of “Computers & Typesetting”. Addison-Wesley, Reading, Mass., 1984.

[Lag92] Klaus Lagally. ArabTEX — Typesetting Arabic with Vowels and Ligatures. In EuroTEX ’92, Proc. 7th European TEX Conference, pages 153–172, Prague, Czechoslovakia, September 14–18, 1992. See also [?].

[Lag93] Klaus Lagally. ArabTEX, a System for Typesetting Arabic. User Manual Version 3.00. Report 1993/11, Universität Stuttgart, Fakultät Informatik, 1993.

[Lag94a] Klaus Lagally. How to extend ArabTEX to handle Hebrew, 1994. Unpublished internal Notes.

[Lag94b] Klaus Lagally. Using TEX as a Tool in the Production of a Multi-Lingual Dictionary. Report 1994/15, Universität Stuttgart, Fakultät Informatik, 1994.

[Lam86] Leslie Lamport. LATEX, a Document Preparation System. Addison-Wesley, Reading, Mass., 1986.

[Lam94] Leslie Lamport. LATEX, a Document Preparation System. User’s Guide and Reference Manual. Addison-Wesley, Reading, Mass., second edition, 1994.

[Lev88] Silvio Levy. Using Greek Fonts with TEX. TUGboat, 9(1):20–24, 1988.

[Ser93] Nikolaj Serikoff. A Dictionary of Greek Borrowings and Loan Words in Arabic [Tasks, Methods, Preliminary Results]. Graeco-Arabica, 5:267–273, 1993.

[Smi92] Joan M. Smith. SGML and Related Standards. Ellis Horwood Ltd., New York, 1992.
