Tugboat, Volume 19 (1998), No. 2 115 an Overview of Indic Fonts For

TUGboat, Volume 19 (1998), No. 2 115 the Indo-Aryan and Dravidian language families of Fonts India. Such uniformity in phonetics is reflected in orthography, which in turn enables all scripts to be transliterated through a single scheme. This unifor- An Overview of Indic Fonts for TEX mity has subsequently been reflected in the translit- Anshuman Pandey eration schemes of the Indic language/script packages. 1 Introduction Most packages have their own transliteration Many scholars and students in the humanities have scheme, but these schemes are essentially variations on a single scheme, differing merely in the coding preferred TEX over other “word processors” or document preparation systems because of the ease TEX of a few vowel, nasal, and retroflex letters. Most provides them in typesetting non-Roman scripts, the of these packages accept input in one of the two availability of TEX fonts of interest to them, and the primary 7-bit transliteration schemes— ITRANS or ability TEX has in producing well-structured docu- Velthuis—or a derivative of one of them. There ments. is also an 8-bit format called CS/CSX which a few However, this is not the case amongst Indol- of these packages support. CS/CSX is described in ogists. The lack of Indic fonts for TEXandthe further detail in Section 3. perceived difficulty of typesetting them have often 2 The Fonts and Packages turned Indologists away from using TEX. Little do they realize that TEXisthe foremost tool for de- Figure 1 shows examples of the various fonts de- veloping Indic language/script documents. With an scribed in this article. Table 1 lists the sites from increase over the past few years in the development which all of the fonts and packages described in this and availability of Indic language and font pack- article are available. ages, the introduction of other fonts and style pack- 3 CS/CSX ages, the flexibility of the LATEX2ε system, and the creation of TUGIndia (which may revolutionize the CS/CSX (Classical Sanskrit/Classical Sanskrit eX- typesetting of Indic scripts) there is now even more tended) is the closest thing to an accepted stan- reason for Indologists to implement TEX in their dardization of 8-bit transliteration of Indic scripts. work. Adopted in 1990 at the 8th World Sanskrit Con- There are roughly thirteen major Indic scripts ference in Vienna, CS/CSX enables Indologists to (Tibetan is included in this list) which are used exchange electronic data in a variety of platform- throughout South Asia to write the major languages independent media. and dialects of the region. As of this article all of CS/CSX is an encoding convention based on IBM Code Page 437. CS is a basic inventory of dia- these major scripts can be typeset with TEX, the exception being Assamese (see Section 6). critic letters which are traditionally used to translit- Not only is it fascinating that the major scripts erate Sanskrit written in the Devanagari script. CSX of South Asia can be typeset with TEX, but the ease is an extension of this basic inventory to include with which such a task can be accomplished is itself accented and other characters. Contrary to what an amazing feat. Anyone who has ever tried writing the name indicates, the inventory of CS/CSX char- a document with multiple non-Roman scripts and acters is not limited to Sanskrit, and may be used diacritic text in an environment other than TEX un- to transliterate other Indic languages. Introductory information on CS/CSX is found derstands the complexity of such a task. TEXtakes the user beyond such difficulties by facilitating the in an article by Dominik Wujastyk titled Standard- implementation of multiple scripts without the has- ization of Sanskrit for Electronic Data Transfer and sle of worrying about various fonts and their en- Screen Representation [1]. This document, as well codings, manual font switching, and other such hin- as supporting screen fonts and drivers for DOS-based drances to productivity caused by common “word machines, is available from the INDOLOGY site as processors”. well as from CTAN. TEX enables the incorporation of several non- Various fonts and packages have been developed Roman scripts within a single document through which enable TEX to typeset documents encoded in transliterated input of the scripts. Indic scripts are the CS/CSX convention. These are enumerated be- based on the phonetic template of the languages low: they represent, a template which is uniform in both cp437csx The file cp437csx.def is an input encoding definition file for LATEX2ε which enables 116 TUGboat, Volume 19 (1998), No. 2 K g G R kKÅgÅGÅdû k K g G R k ã Devnag Devnac Devnag Pen Ka ga Ga z k @ØZwWC Íû©Ê× Sanskrit ItxBengali ItxGujarati K g G ऄ आ ऎ ऐ ckgGL k Punjabi Gurmukhi Washington Tamil Gौ H Iौ Jौ K ऋ ऌ ऍ ऎ ए H P X C h Telugu Malayalam Sinhala ࣽ ࣾ ࣿ ऀ k K g G f k K g G f Konark Cuttack AAI Kannada इ आ ࣽࣾࣿ H H H H P Q R S T ई GTib etian Naskh Washington Brahmi Figure 1: Example of Indic Fonts CS/CSX encoded documents to be typeset in the INDOLOGY site and is also bundled with LATEX without need for conversion. The file is the ITRANS package. available from CTAN as part of the csx package. csutopia Dominik Wujastyk produced an extension csxtimes John Smith has made available the font and re-encoding of the Adobe Utopia font ac- metrics and virtual fonts of the commonly-used cording to the CS/CSX convention called ‘CS PostScript fonts re-encoded with the CSX en- Utopia’. Users should note that ‘CS Utopia’ coding. The use of these fonts enable CS/CSX supports only the characters of the Classical documents to be typeset directly by TEX with- Sanskrit encoding and does not support the Ex- out the need for any conversions. This is facil- tended encoding. This font is available from the itated through the csx.def file which provides INDOLOGY site and is also bundled with the the CS/CSX input encoding definitions for stan- ITRANS package. dard LATEX and for the standard TEXfonts. wnri Thomas Ridgeway developed a package called cscharter Dominik Wujastyk produced an exten- ‘Washington Roman Indic’ which contained a sion and re-encoding of the Bitstream Charter family of fonts based on Computer Modern and font according to the CS/CSX convention called which were encoded with the CS/CSX and other ‘CS Charter’. Users should note that ‘CS Char- supplementary conventions. I recently revised ter’ supports only the characters of the Classi- the package for use with LATEX2ε and added cal Sanskrit encoding and does not support the a style file and input encoding definition file Extended encoding. This font is available from which does away with the need for the fonts. TUGboat, Volume 19 (1998), No. 2 117 Ridgeway also developed screen fonts and driv- tains the characters required to typeset San- ers for wnri for DOS-based machines. The up- skrit, Hindi, Marathi, and any other languages dated package is available from CTAN. which use the Devanagari script. The font ‘Dev- nag Pen’ developed by Thomas Ridgeway is a 4babel variation on ‘Devnag’ which resembles Devana- Recently Jun Takashima introduced two Indic lan- gari written with an ordinary pen and is bun- guage modules to the babel fold. These packages dled with devnag. enable support for Romanized Sanskrit and for Kan- Dominik Wujastyk, John Smith, myself, and nada in both the original and Roman scripts. Please a few others have recently upgraded devnag for refer to Section 5.5 for a description of Takashima’s A use with LTEX2ε. The package is now NFSS- Kannada package. compliant. skthyph This module provides the hyphenation A patterns for Romanized Sanskrit. As of this sanskrit The Sanskrit for LTEX2ε package devel- article these files are not distributed with the oped by Charles Wikner is an extensive pack- current version of babel but will be included age which enables the typesetting of Devana- in the next release. This module is presently gari text with Vedic accents and other special available only from the developer’s FTP site. characters not supported by the devnag package. Numerous options may be set in regard 5ITRANS to transliteration, alternate characters, inter- The ITRANS package developed by Avinash Chopde character spacing, and other preferences. Only is the primary component of an on-going project to support for the Sanskrit language is available. make the typesetting of all Indic scripts possible by The font ‘Sanskrit’, also developed by Wikner, means of a single tool. As of this article, ITRANS is bundled with the package. It is a rather com- supports the Bengali, Devanagari, Gujarati, Gur- plete font in that it contains many complex lig- mukhi, Kannada, Tamil, and Telugu scripts. It also atures and variants which enable excellent type- supports the ‘CS Utopia’ diacritic Roman font. setting of Devanagari. This package is available In addition to the default TEX output, ITRANS from CTAN. can produce direct HTML and PostScript output itrans Four fonts provide Devanagari support in ITRANS DOS from the input file. versions for both ITRANS: the ‘Devnag’ and ‘Devanagari Pen’ and UNIX systems are available from the developer’s fonts described above and two more called ‘Dev- website. nac’ and ‘Xdvng’. The ‘Devnac’ font is a 5.1 Bengali PostScript Type 3 font developed by Avinash Chopde for the ITRANS package. ‘Devnac’ was arosgaon The AroSgaon package was developed developed to enable users unfamiliar with TEX by Muhammad Masroor Ali as an extension to to still produce texts in Devanagari through HP the ‘SonarGaon’ Laserjet softfont designed the “dumb textual interface” mode of ITRANS.

Tugboat, Volume 19 (1998), No. 2 115 an Overview of Indic Fonts For

Section 14.4, Phags-Pa

Comparison, Selection and Use of Sentence Alignment Algorithms for New Language Pairs

Slides for My Lecture for the Texperience 2010

Note: the Font “Hindiromanized” Must Be Selected

Tugboat, Volume 15 (1994), No. 4 447 Indica, an Indic Preprocessor

Exploring the Linguistic Influence of Tibet in Ladakh(La-Dwags)

Proposal for a Kannada Script Root Zone Label Generation Ruleset (LGR)

Learning to Serve and to Roam You Could Ask Almost Any Tibetan

A Script Independent Approach for Handwritten Bilingual Kannada and Telugu Digits Recognition

16-Sanskrit-In-JAPAN.Pdf

Devan¯Agar¯I for TEX Version 2.17.1

The Unicode Standard, Version 4.0--Online Edition