<<

Experience with OpenType Production

Sivan Toledo School of Computer Science, Tel-Aviv University, Tel-Aviv 69978, Israel [email protected] http://www.tau.ac.il/~stoledo

Zvika Rosenberg MasterFont Studio Rosenberg, 159 Yigal Alon Street, Tel-Aviv 67443, Israel http://www.masterfont.co.il

Abstract The article describes the production of about 200 Hebrew OpenType with advanced typo- graphic layout features. The production project was conducted for MasterFont, a large commercial font foundry in Israel, which produces and sells over 1400 Hebrew fonts. The production project had two main objectives: to convert Hebrew fonts with complex support from an older format to OpenType, and to streamline the font production process at the foundry. The article describes font technology background, the objectives and constraints of the project, the tools that are available, and the final conversion and production process. The article also describes our con- clusions from the project, and in particular our conclusions concerning font production processes and tools, and the OpenType format.

Résumé Cet article décrit la production d’environ 200 fontes OpenTypes hébraïques, munies de propriétés typographiques avancées. Ce projet a été mené par MasterFont, une importante fonderie israë- lienne, qui produit et commercialise plus de 1 400 fontes hébraïques. Le projet avec deux objec- tifs principaux : la conversion de fontes hébraïques avec support de diacritiques complexes à partir d’un format de fonte plus d’ancien OpenType, et l’automatisation du processus de production de fontes à la fonderie. L’article décrit les technologies de fontes impliquées, les objectifs et les contraintes du projet, les outils disponibles, et le processus de conversion et de production final. Il décrit également les conclusions que nous avons tirées du projet et en particulier nos conclusions concernant les processus et les outils de production de fonte, ainsi que concernant le format Open- Type.

Introduction in Hebrew (meaning “to add points”). Bibles also use a third kind, cantillation marks, which are not treated in This article describes the production of a large number this article (see [4, 5] for a thorough discussion). He- of Hebrew OpenType fonts. More specifically, we de- brew are quite hard to support in fonts, since scribe the production of OpenType fonts with TrueType there are 15 of them and 27 letters, with up to 3 diacrit- descriptions and with advanced typographic lay- ics per . Even though not all the combinations are out features. The project was conducted by MasterFont grammatically possible, there are too many combinations Studio Rosenberg, a large Hebrew digital font foundry to support them conveniently using precomposed . in Tel-Aviv, with assistance from Sivan Toledo from the In addition, diacritics must be positioned relative to the School of Computer Science in Tel-Aviv University. base letter visually, rather than mathematically. For ex- Hebrew is a right-to-left script that is used in one of ample, the below- diacritics must be set below two ways: with or without diacritics. Most Hebrew texts the visual axis of the letter [15], not below its mathe- are printed almost without any diacritics, but children’s matical center. This means that ligatures, pair , books, poetry, and bibles are printed with diacritics. and simple mathematical centering alone are insufficient Even texts that are printed without diacritics use them for correct placement of diacritics. occasionally to disambiguate pronounciation and mean- In addition to diacritics, Hebrew fonts can also ben- ing. The diacritics in children’s books and poetry are efit from pair kerning, although until recently, pair kern- vowels and consonant modifiers, which are called nikud ing was not particularly common in Hebrew fonts. For

TUGboat, Volume 24 (2003), No. 3 — Proceedings of EuroTEX 2003 557 Sivan Toledo and Zvika Rosenberg further information about the Hebrew script, see [1, 14, vented by Apple in the late 1980’s, as part of a collabora- 15]. tion between Apple and , and became the na- The objective of the project was to convert tive scalable font format for both Apple com- older fonts into the new OpenType format. This was puters and Windows computers. In a TrueType font, motivated by the concurrent development of Adobe In- all the data associated with the font, such as glyph de- Design 2.0 ME (Middle Eastern), the first high-end scriptions, metric information, administrative informa- -layout program to support Hebrew OpenType fea- tion (font name, for example), and possibly bitmaps, are tures. Microsoft’s Office XP also supports these features, placed in a single file, which is organized as a collection so InDesign was not the only motivator. But it quickly of tables. For example, one table contains glyph descrip- became evident that the same tools that would be neces- tions, another character-to-glyph mappings, yet another sary to convert the old fonts to the new format could also table contains kerning pairs, and so on. be used to streamline the entire font production process The kinds of data in a TrueType font are almost at MasterFont. Therefore, the project had two objec- identical to the kinds of data in a Type 1 font, but they tives: to convert the old fonts to the new format, and to are organized differently and sometimes represented dif- streamline the production process. ferently. Most importantly, glyph descriptions and hints The rest of the paper is organized as follows. The for low-resolution rasterization are represented differ- next section provides background on font formats. Sec- ently. In a Type 1 font, glyphs are represented using tion “Diacritic-Positioning” describes how diacritic po- cubic-spline outlines with declarative hints. In a True- sitioning information is represented in MasterFont’s old Type font, glyphs are represented using quadratic-spline fonts. Section “Objectives” describes the objectives of outlines and a low-level programmable hinting language. the project, and “Tools” describes the software tools that Type 1 outlines are easier to design and hint, and most we used to achieve these objectives. The overall produc- designers use cubic splines when designing fonts. Well- tion process is described in “The Production Process”, hinted TrueType fonts are hard to produce, but they and our conclusions from this development and produc- can achieve pixel-perfect rasterization at low resolutions, tion experience are described in “Conclusions”. something that Type 1 fonts cannot achieve. Virtually all font editors can convert one type to the other. The con- version of TrueType outlines to Type 1 outlines is loss- Font Formats less, but the other direction only produces an approxima- This section describes the OpenType font format and tion. However, the approximation of a Type 1 outline by other relevant font formats. Unfortunately, the term a TrueType outline can be made arbitrarily accurate and OpenType is somewhat vague, so this section also estab- font editors produce approximations that are visually in- lishes a more precise nomenclature. distinguishable from the original. Hinting information is usually also converted, but not as well. Type 1, TrueType, and Early Extensions The first widely- Both Adobe and Apple introduced enhancements to accepted device-independent font format was the Post- these font formats during the 1990s, but none became Script Type 1 format [10]. Adobe started selling re- widely used. Adobe introduced the Multiple Master tail Type 1 fonts in 1986, and published the specifica- font technology [11, 12], which allows users to interpo- tion in the late 1980’s, after Bitstream figured out how late between fonts in a family. This enhancement trans- to produce Type 1 fonts (prior to that the specification forms the font family from a discrete set to a continuous was proprietary and only Adobe could produce Type 1 spectrum. For example, there aren’t just medium and fonts). Type 1 fonts consist of two or three main parts: bold fonts, but a continuous weight axis. Apple intro- scalable hinted glyph descriptions, metric information, duced GX fonts [7] (now called AAT fonts), which are which includes glyph dimensions, advance widths, and TrueType fonts with additional tables that allow contin- pair kerning, and possibly bitmaps for screen preview- uous interpolation, like , and which ing. Today most systems can rasterize the scalable glyph enable high-end typographic features. In particular, GX descriptions adequately, so the bitmaps are rarely neces- fonts can support multiple glyphs for a single charac- sary; Adobe distributes the Manager pro- ter, such as , , or old-style figures, they gram without cost; it is a Type 1 rasterizer for older can reorder glyphs, and so on. Neither format became Windows and Macintosh computers without a built-in widely accepted by font vendors, probably because the rasterizer. Type 1 fonts are relatively easy to design and fonts were too hard to produce. No major font foundry produce, and today there are tens of thousands of com- besides Adobe produced Multiple Master fonts, and only mercial Type 1 fonts. a few GX fonts were produced by Bitstream and Lino- The next widely-accepted device-independent font type. format was the TrueType format [7, 2], which was in-

558 TUGboat, Volume 24 (2003), No. 3 — Proceedings of EuroTEX 2003 Experience with OpenType Font Production

OpenType OpenType is a new font format [3] that Mi- such as rasterizers, layout engines, and printer drivers crosoft and Adobe specified collaboratively. The format can simply ignore the extra tables. OpenType fonts with was first used, under the name TrueType Open, in the Type 1 outlines require special support in rasterizers and Arabic version of Windows 95, which was introduced in printer drivers, support beyond that required to support 1996. Apple later joined the specification effort and the Type 1 and TrueType fonts. In principle any TrueType- name was changed to OpenType. capable layout engine can use any OpenType font, since OpenType adds three capabilities to the TrueType the metric and layout information are compatible (an ex- format: advanced layout features, support for Type 1 ception is pair-kerning information, which can appear in glyph descriptions, and digital signatures. The advanced either the ‘kern’ table or the GPOS table or both in a font layout features are supported by five new tables: with TrueType glyphs, but only in the GPOS table in a • GDEF, which defines glyph properties and glyph font with Type 1 glyphs). To emphasize the fact that sets; OpenType fonts with TrueType glyphs can be used by • GSUB, which defines glyph substitution rules; any TrueType-capable software, whereas special support • is required for fonts with Type 1 fonts, files containing GPOS, which specifies positioning information; the first kind use the ttf or ttc , whereas files • JSTF, which provides justification information; and containing the second kind use the otf extension. • BASE, which provides baseline-adjustment informa- So what is an OpenType font? According to the tion. Microsoft/Adobe specification, any font that conforms to These features were designed to support layout of the specification, including both TrueType and Type 1 text in complex scripts such as Arabic, Hebrew, Indic, glyphs, with or without advanced layout tables and dig- and East Asian scripts (Hebrew is by no means the most ital signatures. This is somewhat confusing, especially complex), and to support high-end typographic features since the file icons in Windows do not correspond to the such as swash letters and decorative ligatures, old-syle glyph descriptions ( and XP use the ‘O’ and proportional figures, and small capitals. for otf files and for ttf or ttc files if the fonts The support for Type 1 glyph descriptions is de- have a DSIG table, and the ‘TT’ icon for all other ttf signed to allow lossless conversion of Type 1 fonts to a files). We will use the acronyms OTF for OpenType fonts TrueType-like table-based file format. Such fonts use with Type 1 glyphs, TTF-OTL for fonts with TrueType two new tables, CFF and VORG, to store a losslessly- glyphs and with advanced layout tables, and TTF for compressed Type 1 font. Technically, the format of fonts with TrueType glyphs but no advanced layout ta- the CFF table corresponds to that of PostScript Type 2 bles. fonts [13], also known as the compact font format, but Microsoft’s main interests in the OpenType format the essence of the format is a lossless representation of lies in supporting complex scripts in Windows, with ini- Type 1 glyph descriptions (outlines and hints). The tial on Arabic and later on Indic scripts, and need to support two kinds of glyph descriptions and hints in tighter operating system/font integration using digi- constrained the design of the layout-related tables, in tal signatures. Adobe’s main interests in the format lies the sense that positioning information in the GPOS ta- in exploiting TrueType’s advantages, mainly the support ble could not be hinted using the normal TrueType or for non-Latin character encodings and the convenient hinting language; a different hinting mechanism is used, single-file format, and in providing its fonts and applica- which supports both Type 1 and TrueType glyphs. tions easier access to so-called expert glyphs, such as old- The support for digitally signing fonts, which uses style figures and small capitals [9]. the DSIG table, is meant to allow users and font-using software to verify the authenticity of fonts. For example, Windows and Macintosh Font Files Differences in the way future operating systems might require that certain sys- font files are stored on different platforms cause a few ad- tem fonts are digitally signed by the operating-system’s ditional difficulties during font production. vendor, thereby preventing modified versions of these On Windows and Unix/ systems, a TrueType fonts from being used instead. For independent font ven- or OpenType font is stored in a single file. A Type 1 font dors, there is currently little benefit in digitally signing is represented by up to four files: a pfb or pfa file that fonts, since users currently do not have means to verify stores the glyph descriptions and a character-to-glyph en- these signatures, and since fonts rarely pose security or coding, afm and/or pfm files that store metric informa- stability problems to systems. tion, as well as the encoding, but no glyphs, and an inf OpenType fonts with TrueType outlines can be file that stores administrative infomation about the font. used by any TrueType-capable software, since except for On Windows system, only the pfb and pfm files are used. extra tables (which are allowed by the TrueType spefica- On Macintosh systems, there are multiple ways to tion), the fonts are also valid TrueType fonts. Software store fonts. The Macintosh operating system supports

TUGboat, Volume 24 (2003), No. 3 — Proceedings of EuroTEX 2003 559 Sivan Toledo and Zvika Rosenberg two byte streams per file. One is called the data fork and is required to prevent the below-baseline diacritics from the other the , which was meant to store ap- colliding with the . These glyphs do not have plication resources such as localized strings, icons, and so code points, but they are used in all the Mac’s on. The format of the resource fork allows multiple re- Hebrew fonts, including the fonts bundled by Apple with sources to be stored in a single file. Before Mac OSX (that the operating system. These glyphs appear to have been is, before version 10), fonts were always stored in the added by Apple quite early on. They do fix the most se- resource fork of files whose data fork was empty. Fonts rious placement cases, but they do not fix all the place- and font families are represented by several kinds of re- ment problems. Furthermore, this glyph set appears to sources, such as the FOND resource that describes an en- have been designed by somebody without much exper- tire family (regular, bold, italic, etc.), SFNT for storing tise in Hebrew, since some of the precomposed glyphs TrueType fonts, FONT and NFNT for storing bitmap can never appear in a text (e.g., dalet with a hataf-patah). fonts, and so on. The fonts of a family are usually stored Since the precomposed glyphs are insufficient for in a single file called a suitcase, whose resource fork con- correctly positioning all the combinations, MasterFont tains multiple resources. The Macintosh also associates a developed a few years ago a positioning aid for the Mac- file type (separate from the extension) and a creator code intosh. This software, called Mapik-Ve-Day, is a Mac- with each file. Font files need to have a specific type to be intosh system extension that selects a diacritic glyph ac- recognized as such by the Macintosh’s operating system. cording to the base letter. Fonts designed to work with Storing fonts in multiple resources within a sin- this software have three sets of diacritics: one for nar- gle file, and storing the font data in the resource fork row letters, one for wide letters, and one for medium- rather than the data fork, creates problems when fonts are width letters. The diacritics in all sets usually have the moved between platforms. To address this issue, Apple same shapes, and all have zero advance widths, but each introduced new ways to store fonts in Mac OSX. First, set has different sidebearings. The three qamats glyphs, suitcases can now be placed in the data fork, using a for- for example, all have the same shape and zero width, and mat called a dfont file (after its extension). This for- all print to the right of the insertion (so they appear mat is essentially a traditional Mac font file, but in the below the preceding letter in a right-to-left text run), but data rather than the resource fork. Second, Mac OSX the amount of right shifting is different. The software es- also allows Windows-style packaging of TrueType and sentially divides the letters into three categories, each of OTF files in a single file in which the font data is stored which has its own set of diacritics. in the data fork with no header or trailer. These two mechanisms, the precomposed glyphs and the three sets of diacritics of different widths, essen- tially mimic hot-lead diacritics. They allow fairly good Diacritic-Positioning Information in Older Fonts control over positioning, provided the font is designed MasterFont’s existing fonts used three mechanisms for with this system in mind. In particular, special attention positioning diacritics. These mechanisms were devel- must be given to the left sidebearings of base letters, to oped over a long period of time — each additional mech- ensure that the diacritics from the appropriate category anism was designed to correct deficiencies in previous are well placed under them. ones. To allow more precise control over diacritic posi- The first mechanism uses a number of precomposed tioning, MasterFont began to use explicit positioning in- glyphs. These are used for dagesh, a dot that appears in- formation in the fonts. These are stored in the kerning side letters, and which must be carefully positioned to table of the font, even though they are not real kern pairs, prevent overstriking the letter itself. Such combinations since the insertion point should not move after the po- are also used for shin/sin dots, for diacritics attached to sitioning adjustment. For example, a kerning pair alef- a final letter (only two combinations are used in mod- qamats with value 35 means that the qamats glyph must ern Hebrew), and a few other cases. There are Unicode be shifted relative to the alef glyph by 35 font units. The code points for these glyphs, as well as code points in the insertion point remains exactly after (to the left of) the MacHebrew 8-bit encoding, and they are used by both alef, as if no adjustment took place. This information is Macintosh and Windows systems. not used by the Macintosh operating system, but a some On the Macintosh, precomposed glyphs are also Adobe applications can use it. used to place diacritics on particularly difficult letter: MasterFont currently has 86 fonts with this level of resh, dalet, and qof. The first two are wide letters that diacritic positioing, which is essentially the set of fonts have a single vertical stem at the right, under which designed for book work. In addition, there are many below-baseline diacritics should be placed. Incorrect more display fonts with pair kerning information but placement under these letters looks awful. The qof is the without elaborate diacritic support, a few of which are only non-final letter with a descender, so special attention shown in Figure 6.

560 TUGboat, Volume 24 (2003), No. 3 — Proceedings of EuroTEX 2003 Experience with OpenType Font Production

Objectives hand, on pre-2000 Windows and pre-X Macs, OTF fonts require the utility, which is free Now that the setting has been described, we can enumer- but nonetheless requires downloading and installing by ate the objectives of this project in more detail. the end user. Also, some applications, most importantly First and foremost, we wanted to convert the di- Microsoft’s PowerPoint, cannot use OTF fonts even on acritic positioning information in the existing fonts to systems that do support them (this is true for both Pow- OpenType advanced-layout tables. We wanted to do this erPoint 2000 and PowerPoint 2002). Finally, one of the without specifying any additional positioning informa- tools we used in the production of the fonts, Microsoft’s tion. This restriction was motivated by several reasons: VOLT, provides much better support for TTF-OTL fonts • Efficiency; there was no point in doing extra de- than for OTF fonts; this is described more fully in the sign work on fonts that already produce perfectly next section. Given the clear disadvantages and the only positioned texts. For example, there is no point slight advantage of OTF fonts, we decided to produce in specifying positioning information for a combi- TTF-OTL fonts. nation represented by a carefully designed-precom- posed glyph. Tools • Correctness; by not specifying new positioning in- formation, the chances of introducing new position- The production of the fonts uses a number of software ing bugs are minimized. tools, which we describe in this section. We also describe • Continued support for older applications; we want- alternatives to some of the tools and explain our choice of ed to be able to produce additional new fonts that tools. would be able to work not only in OpenType ap- Assembly of the Advanced Typographic Tables The tool plications, but also in older applications. By build- that we used to construct the advanced typographic ta- ing new fonts according to the old specification and bles in the fonts is Microsoft’s Visual OpenType Layout automatically converting them to OpenType, the Tool (VOLT). Given a font, we automatically construct foundry would be able to offer customers either an input files for VOLT, and use it to construct a derivative OpenType font or an old-syle font for legacy appli- font with appropriate GDEF, GPOS, and GSUB tables. cations. Adobe’s OpenType Font Development Kit (FDK) Second, we wanted to automatically produce fonts for can perform a similar function. However, when we both the Windows and Macintosh platforms (Windows started the project, the FDK had incomplete support for fonts also work on Unix and Linux). More specifically, the GPOS table, which meant that we could not use it we wanted to produce the fonts for the two platforms us- to convert our Hebrew fonts. The FDK is also OTF- ing the same source fonts, rather than generate a Win- oriented, in that it cannot accept TrueType glyphs as in- dows font and convert it, and then generate a Mac font put, where as we wanted a TTF-to-TTF-OTL conver- and convert it. This meant that the glyph sets for Win- sion, in order not to modify the glyphs or their hints. dows and Mac fonts had to be merged, since previously These restrictions appear to still hold. the foundry used different glyph sets for each platform. In addition to VOLT, Microsoft also distributes Furthermore, if possible, we wanted to have identical fin- command-line tools that can construct advanced typo- ished fonts, except for the suitcase wrappers. It turned graphic tables, but they are quite old and we did not ex- out to be possible. periment with them. Two remaining choices had to be made, concern- Versions 4.0 and 4.5 of FontLab can also produce ing the glyph-description format and the Mac-packaging OpenType fonts with advanced typographic tables. But format. After some deliberation, we decided to pro- our understanding has been that the OpenType support duce TTF-OTL fonts rather than OTF fonts. Produc- in FontLab is based on Adobe’s FDK and hence does not ing OTF fonts has one advantage, namely that the origi- support the required GPOS mechanisms, so we did not nal Type 1 outlines and hints remain intact, whereas au- explore this approach. tomatic conversion is used when producing TTF fonts Since we used VOLT, we would like to comment on from the same source files. We felt that given the ac- its advantages and disadvantages. VOLT is a visual tool, curacy of the conversion, this had essentially no designed to allow a font designer to specify visually glyph on high resolution output from printers or imagesetters. substitutions and positionings. Together with the sample We also felt that modern on-screen rasterizers that use fonts, it is a superb tool for understanding how OpenType antialiasing (font smoothing) produce acceptable output layout features work and for prototyping and developing from automatically-hinted and unhinted TrueType out- OpenType layout support. In addition to visual design, lines. Therefore, we felt that OTF’s advantage over VOLT allows the user to export and import the structure TTF-OTL is not particularly significant. On the other of the layout tables or parts of the tables into text files

TUGboat, Volume 24 (2003), No. 3 — Proceedings of EuroTEX 2003 561 Sivan Toledo and Zvika Rosenberg with special . Since we had to convert many fonts, visually, and the ability to accept automatically-generated we opted for automatically generating these files, which input. are called VOLT project files (vtp) rather than visually Minor Modifications of Other TrueType Tables We grad- constructing each font. On the other hand, VOLT does ually discovered that other TrueType tables had to be not have command-line options that can be used to open a modified to ensure that the fonts perform properly. Here font and a VOLT project file and ship an assembled TTF- are the main modifications that we perform on the fonts: OTL font. This means that even if the project files are • constructed automatically en masse, they cannot be pro- We add code-page-support and Unicode-range in- cessed in batch mode — each font must be opened using formation to the fonts. This information is used by a menu command, the corresponding project file must be Windows and Windows applications to determine imported using another menu command, and the output which scripts a font supports. font must be shipped using yet another command. • While we use VOLT to construct the Unicode char- acter-to-glyph mapping table (a subtable of the Data Extraction and Project File Construction Toproduce ‘cmap’ table), we later add an 8-bit Hebrew encod- the input file for VOLT, we had to convert the data in ing table to support older Mac applications. This the initial TTF fonts into VOLT project files. The format encoding subtable must be added even if the origi- of the VOLT project files is complex and undocumented, nal TrueType font has one, since VOLT regenerates but quite simple to figure out by comparing the file to the the ‘cmap’ table, and to the best of our knowledge, font structure in the graphical user interface. cannot emit this particular subtable. We performed the conversion using two custom- • written programs. The first program, written in C, reads We remove the ‘kern’ table, which contains diacritic a TTF font and outputs an afm-like text file, which maps positioning data that are not true kerning pairs, as glyph indices to PostScript glyph names, and which doc- described above. uments all the kerning pairs (glyphs in TrueType fonts • We add a ‘gasp’ table to specify that the fonts should are stored in an array and are refered to by VOLT using to be antialiased at all sizes. their indices). • We make minor modifications to the ‘name’ table, The second program, written in the AWK language, which contains copyright, version, URL, and other transforms this information into a VOLT project file. The textual information for the end user. transformation consists of: Some of the modifications are done with a specially- • Mappingeachglyphindextoaglyphname andUni- written C program. The other modifications are per- code code point(s). formed by converting the appropriate table to XML • Defining glyph groups. format using Apple’s ftxdumperfuser command-line • Defining glyph substitution lookups, for example tool for Mac OSX, modifying the XML file using AWK to map normal diacritics to wide or narrow ones, and SED programs, and then fusing the resulting XML- to map glyph sequences to precomposed glyphs, formatted table to the font file, again using ftxdumper- and in one case to decompose a precomposed se- fuser. quence depending on the preceding glyphs (this We perform some of the modifications using a C supports two meanings and vocalizations of the Uni- program and some using the Apple tool because when we code vav+holam sequence, which can stand for a sin- started the project, the Apple Font Tools for Mac OSX gle long vowel or a consonant followed by a short had not yet been released. We therefore wrote a custom vowel; there is a single Unicode sequence for both C program to perform the modifications we found neces- cases, but they should be printed and vocalized dif- sary. As testing progressed and we discovered more re- ferently). quired or desired modifications, we switched to using the • Defining diacritic positioning information. Apple tool, which was easier than extending the C pro- • Defining kerning pairs. gram. One problem with the XML format that Apple uses to describe font tables is that simple text processing We wrote the AWK program by developing a prototype tools like AWK and SED do not understand XML’s hierar- font visually in VOLT, exporting the project file, and then chical structure. But for relatively simple modifications, writing the program so that given the input file, it gen- the combination of ftxdumperfuser and AWK or SED erates a similar project file. We then loaded the result- scripts works. ing project file into VOLT and inspected the resulting There is another tool that can perform TrueType- OpenType structure. When we discovered problems, we to-XML and XML-to-TrueType conversions, Just van modified the AWK program to correct them and so on. 1 Rossum’s TTX. We experimented with this tool, which This methodology exploits the two strengths of VOLT: the ability to design and inspect OpenType layout tables 1. http://www.letterror.com/code/ttx/

562 TUGboat, Volume 24 (2003), No. 3 — Proceedings of EuroTEX 2003 Experience with OpenType Font Production

F. 1: One of the OpenType fonts in VOLT along with three types of OpenType lookups that we use to position diacritics: anchor positioning, single contextual substitution, and substitution. Our fonts also use pair-kerning adjustments and more complex contextual substitutions.

TUGboat, Volume 24 (2003), No. 3 — Proceedings of EuroTEX 2003 563 Sivan Toledo and Zvika Rosenberg

is written in the Python language, but we could not install A-Reg.TTF it successfully. The trouble was that TTX uses a num- A-Ita.TTF ber of other Python packages, for example, to perform . . . numerical calculations, which did not install properly on B-Reg.TTF

. . . our systems. But assuming that these issues in TTX are easily solvable, it is certainly an alternative to the Apple prevolt script processes tools, and at least in principle, it is a multiplatform tool an entire directory that runs on Windows, MacOS, and Linux.

A-Reg..TTF + A-Reg.VTP General-Purpose Font Editors Although no general-pur- A-Ita.FIXED.TTF + A-Ita.VTP

pose font editor was used for the actual conversion, we . . . did use two font editors to produce the input font files B-Reg.FIXED.TTF + B-Reg.VTP and to inspect fonts that seemed to have problems. We . . . used Fontographer to produce the input files to the con- Microsoft VOLT processes version process, and we used PfaEdit to inspect fonts. each font separately Macintosh Suitcase Generation We used the command- 2 line program ufond from George Williams’ FONDU A-Reg.VOLT.TTF package to package TTF-OTL fonts into suitcases, and we A-Ita.VOLT.TTF used command-line tools from the Mac OSX developer . . . kitto tag the suitcases with the required file type and cre- B-Reg.VOLT.TTF ator code. . . . We also used a script to automatically scan an en- postvolt script processes tire directory with TTF-OTL fonts and classify them into an entire directory families. The classification is done using the family name

field of the font. Once the classification is made, the A-Reg.OT.TTF script invokes ufond to package the fonts of the family A-Ita.OT.TTF into a suitcase. . . .

B-Reg.OT.TTF

The Production Process . . . The final production process consists of four stages. The suitcases script processes an entire directory prevolt stage is performed by a script that processes all the input TTF fonts in a given directory. For each in- put file, the script generates a modified TTF file and a A.OT.SUIT VOLT project file. The modifications to the input files B.OT.SUIT that we perform at this stage involve fixing the ‘post’ . . . table to overcome an apparent Fontographer bug, up- F. 2: An overview of the production process. grading the version of the ‘OS/2’ table and adding to it Unicode-range and code-page information, and re- The figures shows the input and output files for each stage and how that stage operates. The postvolt moving the ‘kern’ table. The VOLT project file is pro- duced by invoking the C program that generates an afm- stage produces the final font files for Windows and like text file with glyph information and kerning and Mac OSX, and the suitcases stage produces the final diacritic-positioning information, and converting that file font files for any Mac operating system, X or older. to a VOLT project file using a custom AWK program. In the second stage, each modified TTF file is opened in VOLT, the corresponding project file is im- a MacHebrew encoding subtable, to fix the ‘name’ table, ported, and the resulting font is shipped out. This re- and to add a ‘gasp’ table. Although some of the process- quires only three menu commands in VOLT and clicking ing can be moved from the prevolt to the postvolt or vice ‘okay’ a couple of times, but it is still the most time con- versa, some processing must be done before VOLT can suming stage of the production process. process the font and some processing must be done after The next stage, the postvolt stage, processes all the VOLT ships the font. Hence, both prevolt and postvolt files shipped from VOLT in a givendirectory. A script in- stages are necessary. vokes ftxdumperfuser a few times on each font, to add The final suitcases stage again uses a script to pro- cess an entire directory. The script first extracts the 2. http://fondu.sourceforge.net family name from each font file. Next, the script finds

564 TUGboat, Volume 24 (2003), No. 3 — Proceedings of EuroTEX 2003 Experience with OpenType Font Production

בְּראשִׁית, בָּרא אֱלהִים, אֵת הַשָּׁמַיִם, וְאֵת שֶׁלָך :Ligature substitution הָאָרץ. וְהָאָרץ, הָיְתָה תֹהוּ וָבֹהוּ, וְחֹשֶׁך, עַל פְּנֵי זְמָן :Anchor positioning תְהוֹם; וְרוּחַ אֱלהִים, מְרחֶפֶת עַל פְּנֵי הַמָּיִם. וְ גָם :Contextual substitution וַיֹּאמֶר אֱלהִים, יְהִי אוֹר; וַיְהִי אוֹר. וַיַּרא אֱלהִים מָצוֹת .vs מִצְוֹת אֶת הָאוֹר, כִּי טוֹב; וַיַּבְדֵּל אֱלהִים, בֵּין הָאוֹר וּבֵין :Insensitivity to mark orderings הַחֹשֶׁך. וַיִּקְרא אֱלהִים לָאוֹר יוֹם, וְלַחֹשֶׁך קָרא גָּל gimel+dagesh+qamats לָיְלָה; וַיְהִי עֶרב וַיְהִי בֹקֶר, יוֹם אֶחָד. גָּל gimel+qamats+dagesh

F. 3: Hebrew diacritic positioning using F. 4: The first few phrases of Genesis, typeset in OpenType features. The first three lines show how Adobe InDesign 2.0 ME using an OpenType version OpenType features utilize the mechanisms of older of Henri Fridlaender’s Hadassah typeface. fonts. These examples correspond exactly to the בְּראשִׁית, בָּרא אֱלהִים, אֵת הַשָּׁמַיִם, וְאֵת הָאָרץ. examples shown in Figure 1. The fourth line shows a new application of contextual substitution that yields וְהָאָרץ, הָיְתָה תֹהוּ וָבֹהוּ, וְחֹשֶׁך, עַל פְּנ ֵי תְהוֹם; correctly-positioned glyphs for two identical Unicode ַוְרוּח אֱלהִים, מְרחֶפֶת עַל פְּנֵי הַמָּיִם. וַיֹּאמֶר sequences with different grammatical meanings and אֱלהִים, יְהִי אוֹר; וַיְהִי אוֹר. וַיַּרא אֱלהִים אֶת different vocalizations. The last two lines show that the OpenType features have been encoded in a way הָאוֹר, כִּי טוֹב; וַיַּבְדֵּל אֱלהִים, בֵּין הָאוֹר וּבֵין that prints all valid diacritic sequences in the same הַחֹשֶׁך. וַיִּקְרא אֱלהִים לָאוֹר יוֹם, וְלַחֹשֶׁך קָרא way, as mandated by Unicode. The sample was ל ָיְלָה; וַיְהִי עֶרב וַיְהִי בֹקֶר, יוֹם אֶחָד. .prepared using Adobe InDesign 2.0 ME all the font files that belong to that family, and invokes F. 5: The first few phrases of Genesis again, ufond on them to create a suitcase. The suitcase is cre- typeset in Adobe InDesign 2.0 ME using an ated by ufond in the data fork of a file. We then move OpenType version of Zvi Narkiss’s Narkiss Classic the font data to the resource fork of the font file, and typeface. The typeface is a new version of Narkiss tag it with appropriate type and creator codes. The au- Linotype, which came out in 1968. The sample tomatic recognition of the font files that belong to each shows four weights: light, regular, medium, and bold. family minimizes the chances for errors that could occur if a family-configuration file was written by hand. Due to the use of ftxdumperfuser, the prevolt Second, fonts, especially those produced by inde- and postvolt stages must be performed on a Mac OSX ma- pendent foundries, are supposed to work with a variety chine. The VOLT stage must be performed on a Win- of operating systems and software applications. Since dif- dows machine. The suitcases stage is also performed un- ferent clients access different parts of font data, a font der Mac OSX, in order to move the font data to the re- that tests perfectly with a given set of clients might fail source fork and in order to assign type and creator tags; on another client, as we have often discovered during the non-Macintosh operating systems do not support these production process. To compound the problem, font-file file-system features. specifications specify the format and intent of data fields, but they specify only loosely the behavior of client data. Conclusions This is often done on purpose, to accommodate the be- havior of older existing clients, but it makes it almost im- This project led us to several conclusions concerning possible to create working fonts without extensive testing. OpenType and font production in general. Let us give an example: are the glyph names in a True- Reflections on Font Production Fonts are atypical objects Type or OpenType font significant, or are they present that present special challenges to their manufacturers. only to allow printer drivers to download the font into The uniqueness of fonts is caused by a combination of fac- a printer in a consistent manner? Presumably, applica- tors that is not present in similar products. tions should use the encoding table in the ‘cmap’ table First, fonts are highly complex data objects. While to find glyphs, not the PostScript names of the glyphs. Type 1 fonts are fairly simple, TrueType, OpenType, But if some application uses glyph names instead, a font and other advanced font formats are complex, with hun- with variant names would not work in that application dreds of individual fields that client software uses. (we note that even Microsoft- and Apple-supplied fonts

TUGboat, Volume 24 (2003), No. 3 — Proceedings of EuroTEX 2003 565 Sivan Toledo and Zvika Rosenberg use names for the Hebrew glyphs that are inconsistent that they use themselves to produce fonts. Still, with the Adobe Glyph List). the effort in publicly releasing and supporting these Third, font data, such as glyph outlines, hints, and tools is significant, and the wide availability of these metric information, has a very long life, but font files have tools does help independent foundries. In some a shorter life span. The outlines and metrics of widely- cases the tools are not ideal for automated produc- used fonts by foundries such as Linotype, Bitstream, or tion. For example, the inability of VOLT to process Agfa-Monotype are sometimes more than 20 years old, a font without invoking the graphical user interface clearly an old age in the digital world. On the other slows down production. hand, font files need to be recreated from this data once • General-purpose font editors now include scripting in a while. Font data that might have been used to create languages. Specifically, FontLab uses Python and pre-PostScript fonts, were used again in the mid 1980s PfaEdit its own scripting language. These tools will to create PostScript Type 3 fonts, then PostScript Type 1 reduce the cost of automation and hence of font pro- fonts, then TrueType fonts. The TrueType font might duction, especially if scripts can be applied to a font have been upgraded once with better hints, then again to from the command line (this is true for PfaEdit, and add the Euro symbol, then again to add non-Latin glyphs perhaps also for FontLab). Obviously, this approach for supporting additional scripts. In between, special ver- requires programming, but at least the program- sions might have been produced for bundling with op- ming environment is the familiar font editor. One erating systems or software packages. In the non-Latin remaining problem in this approach is that these font font market, special versions might have been produced editors read the font data into their own in-memory to work with various layout programs. And today, the data structures and then generate a new font, which same font data is used to produce TTF-OTL and/or OTF can have side effects on tables that are not supposed fonts. Even a relatively young foundry like MasterFont to be modified by a script. FontLab is careful not to has font data going back about 15 years. modify data it does not need to (for example, not to Fourth, font foundries are typically small businesses modify TrueType glyphs unless glyphs are edited), or small business units [8]. This limits their ability to in- but PfaEdit regenerates TrueType glyphs whenever vest in custom programming to automate font production. it outputs a font file. This again is a particular problem in the non-Latin font • Individuals produce powerful free tools that can as- market, where fonts are more specialized. sist in automating font production. These include The longevity of font data from which new font files TTX and PfaEdit. One problem with using these must be occasionally produced, together with the fact tools in commercial font production is that they tend that foundries easily accumulate hundreds or thousands to be less stable. One might expect that they would of fonts, implies that automation should be widely - not be as well supported as commercial tools, but ployed in font production. That is, producing a new font that is not always true: George Williams, for exam- by launching a font editor, loading an older version, mak- ple, the creator of PfaEdit, provides excellent and ing the required changes, and generating a new font, is prompt support. inefficient when a large number of fonts must be pro- Font production requires careful management over time duced. Manual production is also prone to errors and of source files, deliverables, and software. Source files hence requires more testing than an automatic conversion are used to produce fonts over many years, and they that processes all the fonts in exactly the same way. How- need to be updated by adding glyphs, metric infoma- ever, the size of most independent foundries precludes tion, and so on. When the source files are updates or large investments in custom automation (that is, paying when deliverables in new formats are needed, deliver- programmers to automate font production). ables are produced again. It is beneficial to automate new This problem has been addressed recently in three production cycles as much as possible, especially when ways: many fonts are involved. Since foundries often cannot in- • Microsoft, Adobe, and Apple, who need to encour- vest significant funds in custom programming, standard age font production, release free font-production font-production tools must support automation and re- tools (Adobe’s interest in independent font produc- production. The scripting support in FontLab is a step tion stems from its role as a vendor of application, in this direction, and essentially any command-line tool is not from its role as a font foundry). Microsoft has usable in script-driven batch processing. But other tools, released VOLT and a large number of other tools like VOLT, do not support automated production and re- for hinting, adding digital signatures, editing strings production. The importance of automation and the re- in a font, and so on. Adobe has released the FDK, quirements of multiple production cycles may be evident and Apple has released a number of font tools. To a to large font foundries, but are usually not evident to large extent, the three companies release the tools smaller foundries.

566 TUGboat, Volume 24 (2003), No. 3 — Proceedings of EuroTEX 2003 Experience with OpenType Font Production

Reflections on OpenType Will the OpenType format, and is not much harder than using OpenType layout features: in particular the advanced typographic layout, succeed, instead of marking a run of text or a character style with or will it fail like Multiple Master fonts and GX fonts? OpenType features, the document designer or graphic This question is important for font vendors, application designer marks the text or style with an alternate font. developers, and font consumers, who all need to decide This makes it a bit harder to switch font families, but the whether or not to invest in OpenType. There are several difference is not that dramatic. reasons to believe that OpenType layout will become suc- Second, a single OpenType font may behave in dif- cessful and widespead. ferent ways in different applications, which may be con- First, both Adobe and Microsoft are pushing this fusing or disappointing to users. This happens due to technology. In particular, both companies provide font two reasons: (1) The OpenType specification does not producers with free production tools, Adobe has con- specify formally and in detail how a layout engine should verted almost its entire typeface library to OpenType behave, and (2) the desire to allow application develop- (which was not the case with Multiple Masters), and ers to increase the level of OpenType support gradually. Microsoft has provided application developers with both Microsoft published some specifications regarding layout tools to utilize OpenType layout features (through the engines, but these specifications basically specify only the and OTLS libraries) and with assistance with minimal actions that a layout engine must perform to sup- developing layout engines. Both companies already pro- port OpenType fonts in a particular script. Many other duce major applications that exploit OpenType’s layout aspects of layout-engine behavior are not spelled out ex- features, namely Office XP and InDesign. Other compa- plicitly. The desire to allow application developers to nies are also developing OpenType layout engines, such add OpenType support gradually means that some appli- as IBM’s ICU open-source engine. cations can rasterize OpenType fonts but completely ig- Second, the technology is essential for presenting nore the advanced layout tables, other applications sup- text in some scripts, such as Indic scripts, and essential port the advanced layout tables but give users access to for presenting Arabic and Hebrew text with diacritics only a few layout features, and so on. For example, Mi- (Arabic and Hebrew without diacritics can be set with- crosoft Office applications appear to ignore the layout out OpenType layout features). In particular, Microsoft tables in fonts with CFF outlines, and both Adobe and has used OpenType layout in its Arabic fonts and operat- Microsoft application seem to ignore the association be- ing systems for years now. tween layout features and specific scripts and languages. Third, the OpenType format has been designed to make font design easy, which was not the case for Mul- OpenType and TEX The aim of the project described in tiple Masters and GX. The structure of OpenType’s lay- this article was to produce OpenType fonts for existing out support is declarative. For example, the font de- OpenType-capable applications, such as signer can declare that a certain glyph substitution only and Adobe InDesign. Getting these fonts (or any other occurs in a certain glyph context (preceding and follow- OpenType fonts) to work with TEX or Omega was not ing glyphs). The algorithmic question of how to recog- one of the project’s goals. Still, how would one use Open- nize the context is left to the layout engine. In contrast, Type fonts like these in TEX or Omega? GX’s structure is procedural/programmable, and the font The familiarity with OpenType’s advanced layout designer must specify a finite-state machine that the lay- features that we gained from this project leads us to think out engine uses to detect the context. In general, declar- that TEX’s mechanisms are too weak to support many ative font technologies like outline fonts and Type 1 hint- useful OpenType layout features. Several factors restrict ing have been more successful than programmable tech- TEX from exploiting OpenType. First, most OpenType nologies like TrueType hinting, GX state machines, or fonts represent more than 256 characters and have more . (TrueType hinting is unsuccessful in the than 256 glyphs, so the 8-bit character/glyph restric- sense that the cost of hinting causes most fonts to remain tion in TEX is restrictive. Second, many OpenType lay- unhinted or automatically hinted.) The obvious reason is out features, such as mark-to-mark diacritic positioning, that most font designers are graphic designers by training, mark-to-base diacritic positioning, and contextual sub- rather than programmers. stitution simply cannot be represented by TEX’s ligature One must acknowledge, however, a few problems and kerning mechanisms. Third, OpenType’s justifica- related to OpenType layout. First, the main benefit that tion mechanisms appear to be incompatible with TEX’s OpenType brings to the Latin-script market are alternate assumptions about boxes and words. OpenType fonts can glyphs, especially small caps, old-style figures, propor- suggest glyph substitutions and positioning adjustments to tional figures, and swashes. But these glyphs have been be done when text needs to be expanded or condensed available for years, although in separate small-caps, old- to justify a line. The adjustments can include changing style-figures, and expert fonts. Using these separate fonts the sidebearings of glyphs (tracking), replacing glyphs by

TUGboat, Volume 24 (2003), No. 3 — Proceedings of EuroTEX 2003 567 Sivan Toledo and Zvika Rosenberg

References קרוב ל22- מליון חיפושיות נמכרו במהלך 70 שנותיה. חברת פולקסווגן, יצרנית The Unicode Consortium. The Unicode Standard [1] קרוב ל22- מליון חיפושיות נמכרו במהלך 70 שנותיה. חברת פולקסווגן, יצרנית Version 3.0, 2000. Parts of the standard are קרוב ל22- מליון חיפושיות נמכרו במהלך 70 שנותיה. חברת פולקסווגן, .available online at http://www.unicode.org קרוב ל22- מליון חיפושיות נמכרו במהלך 70 שנותיה. חברת פולקסווגן, Microsoft Corporation. TrueType 1.0 Font [2] קרוב ל22- מליון חיפושיות נמכרו במהלך 70 שנותיה. Files, Technical Specification Revision 1.66, 1995. .http://www.microsoft.com/typography קרוב ל22- מליון חיפושיות נמכרו במהלך 70 שנותיה. Microsoft Corporation. OpenType Specifications [3] קרוב ל22- מליון חיפושיות נמכרו במהלך 70 שנותיה. חברת פולקסווגן, יצרנית Version 1.4, 2002. http://www.microsoft. קרוב ל22- מליון חיפושיות נמכרו במהלך 70 שנותיה. חברת פולקסווגן, .com/typography קרוב ל22- מליון חיפושיות נמכרו במהלך 70 [4] Yannis Haralambous. the Holy Bible in Hebrew, with T X. TUGboat,  E F . 6: A few display OpenType fonts produced 15(3):174–191, 1994. Also appeared in the by MasterFont using the method described in Proceedings of EuroTEX 1994, Gdansk.´ the paper. Display fonts do not have extensive [5] Yannis Haralambous. Tiqwah: a typesetting diacritic-positioning support. The first four lines show system for biblical Hebrew, based on TEX. Rosenbert Naot, the next two Taxi, and the 7th line Bible et Informatique, 4:445–470, 1995. shows Hafgana, all by Zvika Rosenberg. The 8th line [6] Yannis Haralambous and John Plaice. Omega, a shows Daphna by Ada Yardeni, and the 9th shows TEX extension including Unicode and featuring Avney-Gad Hakuk by Gad Ullman. lex-like filter processes. In Proceedings of the 1994 EuroTEX Conference, Gdansk,´ 1994. [7] Apple Computer Inc. TrueType Reference wider/narrower variants, forming or breaking ligatures, Manual, 1999. http://fonts.apple.com. and so on. TEX’s -breaking algorithm assumes [8] Emily King. New Faces: in the that words have a fixed width, but this is not necessarily First Decade of Device-Independent Digital true in OpenType fonts: words can have several possible Typesetting. PhD thesis, Kingston University, widths that the layout engine can choose from. 1999. http://www.typotheque.com under At least the first two issues should be addressed in ‘Articles’. order to use OpenType fonts in an effective way in a TEX [9] Thomas W. Phinney. TrueType, PostScript system, at least in complex non-latin scripts. Omega [6] Type 1, & OpenType: What’s the difference? already addresses the first issue, so it seems to be the http://www.font.to/downloads/TT_PS_ ideal candidate for OpenType support. It appears that OT., 2001. the Omega team is already working on the second issue, [10] Adobe Systems. Adobe Type 1 Font Format, and we hope that it will get resolved. We believe that it 1990. http://partners.adobe.com/asn/ would be best to support OpenType in Omega directly, developer/technotes/main.html. with no intermediate metric files (i.e., without ofm files), [11] Adobe Systems. Type 1 Font Format Supplement, but that is something for the Omega team to decide. We 1994. Technical Specification #5015, plan to explore these issues ourselves at Tel-Aviv Univer- http://partners.adobe.com/asn/ sity using NTS, the new implementation of TEX in Java. developer/technotes/main.html. [12] Adobe Systems. Designing Multiple Master Acknowledgements Thanks to an anonymous referee for , 1997. Technical Specification helpful suggestions and corrections. Thanks to Mi- #5091, http://partners.adobe.com/asn/ crosoft’s Paul Nelson for answering numerous OpenType- developer/technotes/main.html. related questions on the VOLT-community’s message board, on the OpenType mailing list, and via private [13] Adobe Systems. The Type 2 Charstring email exchanges. Thanks to WinSoft’s Pascal Rubini Format, 2000. Technical Specification #5177, http://partners.adobe.com/asn/ for answering questions regarding the behavior of In- developer/technotes/main.html Design 2.0’s Hebrew layout engine, and regarding the . Hebrew-diacritic-handling mechanisms of older applica- [14] Sivan Toledo. A simple technique for tions. Thanks to George Williams for making PfaEdit typesetting Hebrew with vowel points. (now named FontForge) available. Some of the historical TUGboat, 20(1):15–19, 1999. information concerning font formats was gathered from [15] Ada Yardeni. The Book of the Hebrew Script: replies to a question that Adam Twardoch posted on the History, Palaeography, Script Styles, OpenType mailing list. and Design. Carta, Jerusalem, 1997.

568 TUGboat, Volume 24 (2003), No. 3 — Proceedings of EuroTEX 2003