TUGboat, Volume 40 (2019), No. 3, p. 263

Typesetting the Bangla script in Unicode TeX engines: experiences and insights

Md Qutub Uddin Sajib

Abstract

The typesetting of Bangla (also known as Bengali) script in TeX was first introduced more than 15 years ago through transliteration-based systems. These systems have shortcomings: among others, the source files are harder to read and they require one or two particular Bangla typeface families for typesetting. With the introduction of Unicode-aware TeX engines, such as XeTeX, and the emergence of Unicode-compliant free Bangla fonts, new possibilities have evolved. Today both XeTeX and LuaTeX, as available in TeX Live 2019, support Bangla typesetting, allowing the user to input the text directly with Unicode Bangla fonts in the editor. Although several years have passed since the XeTeX system was first seen to work, it is still in a state where the finest typographic quality is nearly unachievable for this particular script. Several rendering issues were observed while working with Unicode Bangla fonts in the four Unicode-aware TeX engines. Precision typesetting of the Bangla script in Unicode TeX requires attention in terms of fonts, rendering, hyphenation, use of colors, and more.

1 Introduction

The language Bangla, also known as Bengali, is one of the ten most-spoken languages in the world, as reported by Ethnologue in its 2019 edition. Native speakers of this language are mainly from Bangladesh, a small but populous country in south Asia. A large number of native speakers also live in the West Bengal state of India. The Bangla script, the written form of the Bangla language, is one of the thirteen major Indic scripts and has made its way into the Unicode Standard. Publishing in this script has a history of many centuries. As with other Indic scripts, typesetting of the Bangla script in TeX has seen several attempts in the last few years, but typographic quality has yet to reach a peak.

Apart from beautiful rendering of mathematical contents in TeX, another goal of this typesetting system is the finest typographic quality [5]. The same philosophy can be expected in typesetting other scripts, including Bangla. Considering the present-day support of the Bangla script in TeX, this article discusses a few rendering issues, mostly gathered from the author's day-to-day typesetting experiences; it also provides some insights for future development.

2 Scope of this article

Before the Unicode Standard was created to enable the writing of most scripts of the world on computers, the attempts to typeset Bangla script in TeX were confined to ASCII-based transliteration systems. Brief discussions of ASCII- and Unicode-based typesetting of this script are presented in Sections 3 and 4. The TeX packages and fonts available today that support Unicode Bangla typesetting are discussed in Sections 5 and 6.

Most Bangla documents can be expected to contain at least English, math, and possibly other scripts. In this article, however, we have considered typesetting of the Bangla script only, using the four TeX engines that support the Unicode Standard. This article does not cover font selection techniques for scripts other than Bangla. For information on selecting specific fonts for Roman (English) and math along with Bangla, the fontspec package [11] can be consulted.

TeX engines known to support the Unicode Standard are XeTeX, LuaTeX, HarfTeX, and LuaHBTeX. The first two are available in TeX Live 2019; the last two via tlcontrib or Akira Kakuto's w32tex and w64tex distributions (http://w32tex.org/). We used all four engines to typeset some text of Bangla script to observe the rendering with different fonts (Section 7), hyphenation (Section 9), and use of colors (Section 10). Some development ideas for this particular script are discussed in Section 12.

In this paper, using XeTeX means compiling the .tex file with xelatex; using LuaTeX, HarfTeX, and LuaHBTeX means compiling the same file with lualatex, harflatex, and luahblatex, respectively. The TeX-specific examples presented here were produced using the TeX Live 2019 distribution on a computer running the GNU/Linux operating system (Slackware 14.2). The HarfTeX and LuaHBTeX engines were installed via tlcontrib, following the instructions at https://contrib.texlive.info/.

3 ASCII-based transliteration systems

ASCII-based systems to typeset the Bangla script in TeX were first seen to work more than 15 years ago. The two transliteration schemes known to support the Bangla script are ITRANS (Indian languages TRANSliteration) by Avinash Chopde and the Velthuis system by Frans Velthuis. Both of these schemes were primarily developed for the Devanagari script. Later, the schemes were adapted to typeset the Bangla script in TeX.

The typeface families that work with ITRANS include the "SonarGaon" (sgaon) fonts by Anisur Rahman [7] and "AroSgaon" fonts by Muhammad Masroor Ali [1]. The latter was available with its METAFONT sources and was replaced by the Type 1 "ITXBengali" fonts of Shrikrishna Patil in ITRANS. The bwti (Bengali Writer TeX Interface) package by Abhijit Das included METAFONT-generated "Bengali" fonts and worked in TeX through a special interface.

The bengali package [8] by Anshuman Pandey uses the Velthuis transliteration scheme instead of ITRANS. It uses the latest version of Das's "Bengali" fonts for typesetting. The bangtex package [6] by Palash Baran Pal includes class files and METAFONT sources for its "Bangla" fonts. The Type 1 fonts for this package were created by Ananda Kumar Samaddar [6] and are included in the bengali-omega package [10] of Lakshmi K. Raut. The latter uses the Velthuis transliteration scheme. It also supports Unicode-based input but converts the Unicode text into the transliteration scheme for typesetting.

The transliteration-based systems require the user to input Bangla text in a specific scheme using characters from the Roman script. The text is then run through a preprocessor for typesetting in TeX. Although these systems work, the source file is harder to read (Figure 1). They seem to use the "Bengali" fonts (from bwti) or "Bangla" fonts (from bangtex) to typeset the document. The typographic quality of these fonts may not be comparable with fonts we see in modern Bangla publications.

Figure 1: Typesetting of Bangla script in TeX with a transliteration system using the METAFONT-generated "Bengali" fonts (source: [7]). Left: typeset output (not reproduced here). Right: transliterated source:

    {\bn ke la{}ibe mor kaarya, kahe sandhyaa rabi
    "suni.yaa jagaT rahe niruttar chabi |
    maa.tir pradiip chila, se kahila, sbaami
    aamaar ye.tuku saadhya kariba taa aami |
    -- rabindranaath .thaakur}

4 Unicode-aware TeX engines

With the introduction of XeTeX and LuaTeX around 2007, and the fontspec package for selecting TrueType and OpenType fonts, typesetting of Bangla in TeX using Unicode fonts became a reality. Today, a good number of Unicode-compliant Bangla fonts are freely available that work with these engines.

To start, one needs a keyboard layout that supports the input of Unicode Bangla characters in an editor. In most GNU/Linux systems, a keyboard layout […] inscript, bengali-itrans, and bengali-probhat. None of the Unicode Bangla keyboard layouts available today were designed with TeX users in mind; hence one may need to switch the layout frequently in order to type special TeX characters (\, %, &, etc.).

An appropriate font containing the Bangla script has to be set via the fontspec package (details in Section 6). Then, upon processing the .tex file with xelatex, lualatex, harflatex, or luahblatex, one gets the typeset document.

The Unicode-based systems in TeX for this script have many advantages over the older systems. For example, the source file is now easy to read (Figure 2, right versus Figure 1, right). In addition, any font that contains the glyphs for this script can be used for typesetting. However, the current situation is not free from shortcomings. The verbatim text in Figure 2 (right), which should read "\vskip6pt\raggedleft", is unreadable because the font used there contains glyphs only from the Bangla script. Other shortcomings that we have observed are discussed in the following sections.

Figure 2: Typesetting of Bangla script in XeTeX with Unicode fonts (spelling of a few words, as they appeared in Figure 1, were corrected following [14]). Left: typeset output (not reproduced here). Right: Unicode source:

    কে লইবে মোর কার্য, কহে সন্ধ্যারবি।
    শুনিয়া জগৎ রহে নিরুত্তর ছবি।
    মাটির প্রদীপ ছিল, সে কহিল, স্বামী,
    আমার যেটুকু সাধ্য করিব তা আমি।
    \vskip6pt\raggedleft
    – রবীন্দ্রনাথ ঠাকুর

5 LaTeX packages for Unicode Bangla

The polyglossia package by François Charette [2] is designed to provide support for typesetting Bangla script, along with other scripts, using suitable Unicode fonts and TeX engines. It provides a style file (bengalidigits.sty) for this script that translates the Arabic numerals into Bangla numerals. The language definition file (gloss-bengali.ldf) implements the Bangla numerals in LaTeX counters. It also provides Bangla translation for the names of LaTeX sections and counters, and for the Gregorian calendar months.

The latexbangla package by Adib Hasan [3] introduces some control sequences to select Bangla fonts. To our knowledge, there are no Unicode Bangla fonts designed to be used with TeX.
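
To make the workflow of Sections 4 and 5 concrete, the following is a minimal sketch of a Unicode Bangla document that combines fontspec and polyglossia. It is not taken from the article. The font name "Noto Serif Bengali" is an assumption and should be replaced by any installed OpenType or TrueType font that contains Bangla glyphs. The file is assumed to be saved in UTF-8 and compiled with xelatex.

```latex
% Minimal sketch: Unicode Bangla with fontspec and polyglossia.
% Assumptions: UTF-8 source, compiled with xelatex, and a font named
% "Noto Serif Bengali" installed (swap in any Unicode Bangla font).
\documentclass{article}

\usepackage{fontspec}     % selects OpenType/TrueType fonts
\usepackage{polyglossia}  % language support (loads gloss-bengali.ldf)

% Bangla is the main language of the document.
\setmainlanguage{bengali}

% polyglossia uses \bengalifont, when it is defined, for Bangla text.
% Script=Bengali asks the engine to apply the font's Bangla shaping rules.
\newfontfamily\bengalifont{Noto Serif Bengali}[Script=Bengali]

\begin{document}

\section{ভূমিকা}

কে লইবে মোর কার্য, কহে সন্ধ্যারবি।

\end{document}
```

Because the only font set here covers the Bangla script alone, Roman text and TeX code shown verbatim would run into the same missing-glyph problem described for Figure 2; a separate Roman or monospaced font would have to be configured through fontspec for such material.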
