Latin Modern: Enhancing Computer Modern with Accents, Accents, Accents
Total Page:16
File Type:pdf, Size:1020Kb
Latin Modern: Enhancing Computer Modern with accents, accents, accents Bogus law Jackowski BOP s.c., Gda´nsk, Poland [email protected] Janusz M. Nowacki Foto-Alfa, Grudziadz, Poland [email protected] Abstract The number of (free) fonts prepared with METAFONT is surprisingly small com- pared, e.g., to what is available in the commercial market in other formats. Well, perhaps not so surprisingly: METAFONT generates TEX-oriented bitmap fonts which have not become popular outside the TEX world. Accepting this irksome situation as a challenge, we have prepared META- Type1, a package for generating fonts in the PostScript Type 1 format accepted world-wide. The package makes use of METAFONT’s “sister”, namely META- POST, and a few other utilities, such as awk and t1utils. Recently, an opportunity arose to embark METAType1 upon enhancing the Computer Modern family of fonts with diacritical characters, thus following in Lars Engebretsen’s footsteps, who also recognized the importance of the problem and created the AE (Almost EC) collection of virtual fonts. The task turned out to be fairly complex but well-suited for a fully programmable engine like META- Type1. We here report on the outcome of the project, i.e., the Latin Modern family of fonts in the Type 1 format, and share the experiences gathered while accomplishing the task. 1 Introduction not be overestimated from this point of view. But Accented characters play the rˆoleof enfants terri- the typesetting system itself is only one side of the bles in the world of computers. Anybody who has coin. The other is the collection of fonts it uses. to communicate with another computer system in Originally, TEX was equipped with the Com- a language other than English knows that using so- puter Modern family of fonts (CM) which did not called “funny characters” is not fun at all. contain diacritical characters. Those few TEX users who would need accented letters were supposed to 1.1 Those pesky diacritics employ the \accent primitive. The immense popu- A giant step towards putting some order into the larity of TEX in countries that use lots of diacritical chaos was the Unicode standard (ISO/IEC 10646) characters invalidated this presumption. At least published ten years ago. Unicode, obviously, does three reasons can be set forth: (1) accented char- not remove all the problems from the font play- acters do not behave like “normal” ones, i.e., they ground, and even adds a few new ones (e.g., prob- interfere with important TEX algorithms such as hy- lems with the size of font files and with the regis- phenation and insertion of implicit kerns; (2) the CM tration of non-standard characters and languages). fonts do not contain all necessary diacritics, e.g., an Nevertheless, one can believe that the world will be- ogonek accent (used in Polish, Lithuanian, Navajo) come a bit better when Unicode turns from the stan- is missing; (3) such diacritical elements as cedilla dard de nomine to the standard de facto. and ogonek, when treated as “accents”, overlap with a letter, which precludes some applications, e.g., TEX’s 8-bit (i.e., 256 characters per font) par- adigm is becoming more and more obsolescent, and preparing texts for cutting plotters (see figure 1), enhancing it with multi-byte character codes seems even if outline fonts are used. The lesson is obvi- inevitable. Such efforts as the Ω Project [11], devel- ous — the CM family should be extended by a vari- oped by John Plaice and Yannis Haralambous, can- ety of diacritical letters. 64 TUGboat, Volume 24 (2003), No. 1 — Proceedings of the 2003 Annual Meeting Latin Modern: Enhancing Computer Modern with accents, accents, accents In this paper we would like to present our ap- (1) the scaling of bitmap fonts is troublesome — they proach to solving the problem, i.e., the open source look nice only if their resolution matches the reso- family of fonts, Latin Modern (LM), in the Post- lution of the device; (2) in many cases, outline fonts Script Type 1 format [2], prepared using META- turn out to display much better and, paradoxically, Type1, a METAPOST-powered package [8] (see sec- faster on a screen, e.g., when used in pdf. (Happily, tion 2.4). We believe that the LM family is a de- Adobe Acrobat 6 has improved handling of bitmap cent alternative to the other extensions of the CM fonts considerably, and non-Adobe programs with family — we expect it to be a handy collection of decent results are also available.) fonts for typesetting in Latin-based alphabets. The This inspired Lars Engebretsen, who prepared fonts are also equipped with Printer Font Metric a set of TEX virtual fonts containing basic diacrit- files (*.pfm) and therefore can be used as system ical characters [4]. The virtual fonts could refer to fonts in GUI systems. Finally, they can be used the excellent outline version of the CM family which with the CM metrics (e.g., via psfonts.map), so as had appeared in the meantime. It had been cre- to preserve typesetting of existing documents. ated in 1988 by Blue Sky Research for the American Mathematical Society in PostScript Type 3 format, 1.2 A gulp of history converted in 1990 by Y&Y into the hinted Type 1 Needless to say, the lack of diacritical letters in the format, and released in 1997 for public use by the CM family was recognized almost from the very be- AMS. Engebretsen called his collection AE — “Al- ginning by TEX users who had to struggle with the most EC”. His virtual fonts suffer, however, from typesetting of languages other than English. Only the same limitation as TEX does, i.e., the number of in 1990, however, during the TUG meeting in Cork, characters is limited to 256. Moreover, as we have Ireland, did the international TEX community de- mentioned, superimposing a diacritical element on cide that fonts in the so-called Cork Encoding (EC a character reveals undesirable features when the or, in LATEX lingo, T1) should be prepared for Eu- character is stroked rather than filled (see figure 1). ropean TEX users [6]. The work on EC fonts started Only recently, automatically traced fonts in the soon after the Cork meeting. Norbert Schwartz de- PostScript Type 1 format, based on the EC fonts, signed a prototype, the so-called DC fonts. The work have been published: P´eter Szab´o’s Tt2001, Vladimir was then continued by a team led by J¨orgKnappen. Volovich’s CM-super (both in 2001; [14] and [15], re- The final release of EC fonts was announced in 1997. spectively), and a newfangled CM-LGC from Alexey It was an important achievement. Neverthe- Kryukov (March 2003). Note, however, that Szab´o less, the Cork Encoding conformed to TEX’s 8-bit courteously “recommends the wonderful CM-super paradigm and therefore was not able to contain all package instead of his own Tt2001”. Indeed, Volo- characters occurring in European languages, not to vich’s collection contains many more font variations mention other Latin-based alphabets, such as Viet- and covers a broader character set than Szab´o’s. namese and Navajo. Kryukov’s collection is, in a way, a supplement to For a few years, EC fonts were available only in CM-super. The creation of these packages was pos- a TEX-specific bitmap form (pk). Nowadays, with sible thanks to a marvelous tool provided by Martin the advent of electronic publishing, bitmaps are not Weber, namely, autotrace [16]. acceptable. At least two factors can be pointed out: Volovich’s accomplishment seems to bring to an end the long-lasting endeavours to introduce diacrit- ical characters into TEX’s realm. Do we really need yet another collection of fonts? 2 Another viewpoint Autotraced fonts, in spite of their many advantages, have drawbacks. Objectively, the most important one is perhaps the size of a font. Such fonts are usu- ally larger than visually similar fonts having care- fully designed outlines because of a greater num- Figure 1: The letter Eogonek from Times New ber of nodes in the outlines. Compare, for example, Roman for Windows XP (left), from aer10 (middle), Volovich’s fairly tidy CM-super fonts with AMS CM and from lmr10 (right); only the latter form, and LM: the number of bytes per character is 260, with a single outline, is acceptable in professional 200, and 135, respectively. Twice is not too much, applications. but when many magnifications are included (see sec- TUGboat, Volume 24 (2003), No. 1 — Proceedings of the 2003 Annual Meeting 65 Bogus law Jackowski and Janusz M. Nowacki tion 2.1) it makes a difference. Incidentally, the script (7 pt), and second-order script (5 pt, “script- size of the CM-super fonts can be reduced by circa script”). Because of the well-established tradition, 10 percent by using a subroutine compression mod- we cannot refrain from using Knuth’s scheme, but ule PACKSUBR from METAType1 (actually, it is a we would strongly discourage extending it. short awk script). For these reasons, we accept with difficulty the For us, however, more important are arguments enormous number of different sizes and proportions of a rather imponderable nature. We stand firmly by present in both the EC and CM-super font families. the conception underlying the TEX and METAFONT This is apparently the inheritance of Knuth’s and design: every detail, be it a typesetting or a typeface Sauter’s ideas. We would gladly discard most of the design, should be controllable and replicable. fourteen renditions of a single typeface (in sum, font This is not the case with autotraced fonts.