
fpTEX: A teTEX-based Distribution for Windows

Fabrice Popineau
Supélec
2 rue E. Belin
F-57070 Metz
France
[email protected]
http://www.ese-metz.fr/~popineau

Abstract

This paper deals with the ins and outs of porting the widely used teTEX distribution to the Windows environment. The choices made and the difficulties encountered are described, a brief outline of this large distribution is given, and future work is sketched out.

Motivation

Context. More and more people need to use some sort of Microsoft environment, perhaps because of an office suite or the like, or because of management decisions. Some of the greatest pieces of software were developed on UNIX (or other operating systems) well before Windows became generally available. Thus, software such as TEX should be available on Microsoft operating systems, natively ported, and compatible with implementations on other operating systems. TEX by itself is largely platform independent, but achieving complete compatibility for an entire distribution is better.

The Web2C TEX distribution is one of the greatest TEX implementations and has support for portability. Moreover, Web2C is the base of the widely used teTEX distribution for UNIX. Now that UNIX and Windows can easily share files across the network, thanks to tools such as Samba, it is most desirable to have not a teTEX-like TEX distribution for Windows, but the actual teTEX distribution for Windows.

Free TEX for Windows. In the fall of 1997, when I began to port Web2C to Win32, one main TEX distribution was available to Windows users: emTEX. This is still a great TEX environment, but it was designed for MS-DOS first and then for OS/2. So, when Windows became 32-bit-aware [1], those MS-DOS applications could not benefit from the new 32-bit flat mode, or at least not optimally. The so-called DOS extenders were not as smart as they are today [2]. Some of the nicest features of emTEX, such as its dvipm previewer, were not available to Windows users. Moreover, emTEX's author, Eberhard Mattes, never released his sources.

[1] In fact, even if Windows 9x can run 32-bit mode applications, only the Windows NT incarnation of Windows is a true 32-bit environment; see the section about the previewer.
[2] For example, support for long filenames was not available at first.

At the same time, MiKTEX began to mature. Christian Schenk, author of MiKTEX, has followed a different path: he has designed a completely new Win32-oriented TEX distribution. Looking at his work, I questioned the usefulness of porting Web2C to Win32, but there were some reasons to do so:

Compatibility. Having a Windows TEX distribution based on exactly the same files as the UNIX one means you can share resources. For example, you can share not only texmf trees across the network but also configuration and format files.

Portability. Most of the ongoing developments around TEX are done with UNIX Web2C (see pdfTEX or ε-TEX). Being able to share source files means less effort to compile a new release. There is another consequence: on several occasions, it has proven useful to compile the source code on something really different from UNIX. Errors that do not show up on one platform may do so on another one.

Usability. Lots of people are familiar with teTEX under UNIX. Having the very same distribution under Windows is a plus.

The plan. The porting tasks can be divided as follows. Note, however, that the job was not formally planned at all, since it had to be done mostly in spare time. So the project has followed a circular path, with some issues being resolved only quite recently. Below is a very short description of the next sections.

290 TUGboat, Volume 20 (1999), No. 3 — Proceedings of the 1999 Annual Meeting fpTEX: A teTEX-based Distribution for Windows

Command-line programs. The first goal was to have a tex.exe running under the Win32 API (Application Programming Interface), the set of functions on top of which the kpathsea library is implemented.

Compilation environment. teTEX uses a powerful tool called autoconf. This tool relies heavily on having a UNIX shell and lots of UNIX utilities such as sed, grep, awk, and so on. Clearly, this is not something easy to find and run under Windows.

Shell issues. Moreover, the source distribution uses some shell scripts at run-time. It is not wise to suppose that the end-user will have a UNIX shell on a Windows machine.

Operating system-specific. Some issues, like finding a replacement for file links or naming files on the network, have been solved recently. The question of using the registry is also mentioned.

Previewer. No TEX distribution would be complete without a DVI previewer. So the port of xdvi was contemplated.

Installation. This is the trickiest part. Binary distributions were not common under UNIX but they are under Windows, and the installation process is very different.

Configuration. The process of configuring Web2C is very simple because it consists mainly of editing text files or setting up environment variables. The teTEX distribution introduces a smart tool to administer the system, and this task can be rendered in a Windows-oriented way as well.

Future work. There are many points that can be enhanced and some will be done in the very near future.

However, while tasks such as editing a text file, setting environment variables or unpacking an archive are usual in the UNIX world, they are no longer usual in the Windows world, where end-users expect automatic or point-and-click things to happen. So the installation and configuration parts are very specific to Windows.

The contents of the distribution

Before discussing the porting issues, here is a brief outline of what is in the distribution:

• Web2C base distribution: TEX, METAFONT, METAPOST, DVIware and fontware tools
• each TEX extension or package that is found in teTEX:
  – ε-TEX, pdfTEX, Ω (Omega)
  – dvipsk and dviljk to print DVI files
  – gsftopk and ps2pk to rasterize Type 1 fonts to PK files
  – mktex* support programs for generating missing font files and fmtutil for building formats
• a DVI file viewer based on xdvi, but adapted to Windows
• packages found on the TEX-Live CD such as:
  – dvipdfm, to convert DVI files to PDF
  – tex4ht and tth, to convert TEX files to HTML
  – extra tools to deal with DVI files, PostScript files or fonts
• extra packages found only on the Win32 section of the TEX-Live CD:
  – ttf2pk and ttfdump, to handle TTF fonts
  – hbf2gf, to handle East-Asian fonts
  – gzip and jpeg2ps, which can be handy
• the teTEX texmf tree, which is not the least important part!

Command-line programs

The process. The Web2C distribution integrates all TEX-related tools around one main library called kpathsea. It was devised by Karl Berry to cope with the growing number of environment variables needed to set up a complete TEX distribution. Instead of setting environment variables, the path values and many other constants are looked up in a configuration file. This guarantees extensibility and is far easier to maintain.

The second point is the process of compiling TEX itself. The Web2C distribution owes its name to the Web → C translator that converts the original Web code to C programs. Basically, the following tools were needed:

• a C compiler targeting the Win32 API and, if possible, supporting the standard C library functions;
• some UNIX tools such as sed, grep and awk;
• the Perl language, which has proven useful for gluing together many parts of the build process, given the lack of a shell with real programming capabilities under Windows.

Choosing a compiler. The availability of GCC, or rather of a native Win32 GCC, under Windows is quite recent. Moreover, GCC in its Cygwin [3] incarnation has some drawbacks under Windows:

[3] Most of the GNU tools have been ported to Win32 by Cygnus Software and their port is under the GNU General Public License. See http://sourceware.cygnus.com/cygwin/.


• Every program is linked to a DLL (Dynamic-Link Library) that emulates UNIX calls; this somewhat slows down programs doing intensive file system calls [4], for example.
• At the time I began the Web2C port, this DLL was not stable at all.
• Benchmarks on the same computer, using GCC on the one hand and the Microsoft compiler under NT on the other, have shown run times up to 20% shorter in favor of the Microsoft compiler [5].

[4] The kpathsea library can do lots of stat calls on a huge texmf tree.
[5] This was GCC 2.7.1 versus VC++ 5.0; the situation may have changed today.

Thus, the Microsoft compiler was chosen. The general philosophy was to stick to the Win32 API as much as possible and avoid any translation layer, which might degrade performance.

Many of the auxiliary tools needed for the build process were available either through the GNU-Win32 Cygnus project or from previous ports to MS-DOS; however, almost none of them were available as native Win32 ports. While I was porting kpathsea and Web2C to Win32, I also adapted the UNIX tools I needed to Win32. This resulted in an archive of UNIX tools, many compiled by myself and the others gathered from the net. This archive is available in the same directory as fpTEX, in the CTAN archives.

Compiling tex.exe. The kpathsea library already had support at the source level for platforms other than UNIX, namely Amiga and VMS. So the path was already laid out. Fortunately, kpathsea already encapsulated almost all system calls needed for TEX. Precisely identifying system dependencies in this way is a great feature of the TEX source code. The remaining differences were the following:

Disks. The Windows environment knows about device names attached to disks whereas the directory tree structure under UNIX hides them.

Paths. The path separator is not the same but, fortunately, the Win32 API supports both '\' and '/' as path separators.

Links. There are no hard or symbolic links under Windows.

Permissions. The permissions on files for UNIX and Windows are handled in completely different ways.

Some of the problems met were specific to Windows 9x, where standard C library calls are available but buggy. For example:

• system() is meant to run external commands but fails to return their exit code: it always returns true.
• popen() is available only for command-line programs but fails for graphical programs (the previewer uses this call!).
• stat() fails to recognize directories if their name has a trailing '/'.

All these problems have workarounds using the Win32 calls instead of the standard C calls.

Compilation environment

In order to make the build process safer and closer to what happens under UNIX, a number of decisions had to be made.

Makefiles. Every UNIX Makefile comes in a generic shape, Makefile.in, that needs to be instantiated by the complex process of autoconf. The UNIX Makefile.in is assembled and processed by the m4 macro processor to generate the actual Makefile that will fit your own configuration. Moreover, those UNIX Makefiles use many UNIX constructs (shell or other tools). So they are not usable as-is under Windows, where the process is somewhat different.

The Windows Makefile is built by hand from the UNIX Makefile.in. All common parts are stored in a special place. An initial configure.pl Perl script allows one to:

• configure the common parts with options like root of the destination directory, root of the source tree, and so on;
• ensure through the configuration that only the files generated by the current build will be referred to; that is, no external kpathsea.dll will interfere, and no external .exe file or texmf tree will be referred to when generating documentation or format files;
• save and restore each of the Makefile files in a safe place.

Source code configuration. The same Perl script also undertakes the translation of every config.in or c-auto.in configuration file into its definitive form. Since there is only one target operating system, there is no need to guess whether features are supported: it is enough to consult a table of features. Doing this ensures better compatibility with the original source code.

Overall build process. The overall build process is done by another Perl script, build.pl. This script delegates to the different Makefiles and can be asked to:

• clean up the source tree at different levels
• rebuild dependencies
• build and/or install the whole release


• use different compilation modes, such as debug or release, statically or dynamically linked executables
• prepare for specific tasks, such as profiling or using advanced debugging and checking tools such as BoundsChecker
• install everything from scratch, including installing the latest teTEX texmf tree

Up to now, the build process is not clean enough, but nonetheless the source tree has been used successfully by people with no previous knowledge of it. Cleaning up the Win32 part of the source tree would allow more people to access it, and contributions from the net could then be expected.

Shell issues

The Web2C distribution may ask for font generation at run-time. This is done through the kpathsea library calling an external command when it fails to find some needed font.

Because of the complex and evolving nature of this process (how many versions of those maketexpk scripts have been devised?), the generation of fonts has traditionally been handled by shell scripts. The shell requirement needed to be removed under Win32 and the Perl alternative was considered, despite some drawbacks:

• Perl provides greater portability across such different operating systems as UNIX and Windows;
• Perl is not widespread enough under Windows and under UNIX, which means you can easily find it but not every single user will have it or want it;
• Perl has quite a large disk space footprint.

Had UNIX users been ready to switch from shell scripts to Perl scripts for this task, things might have been different, but that was not the case. So the only risk-free and simple solution from the user's point of view was to code the shell scripts in C. Given the complexity, the C version of the scripts required several rewrites before becoming reliable enough. But now, they do behave like their original shell counterparts. The C version of these scripts can even be used under UNIX. The several mktex* shell scripts are provided as one DLL, with several stubs [6], following the same philosophy as for the TEX engines (see next section). This means that kpathsea could be linked with this mktex.dll and could avoid creating a new process to generate a new font file.

[6] A stub is a small executable program linked to some DLL and whose only function is to set up some parameters before calling the DLL. This way, the same large DLL can be called in different ways by using only small executable programs.

Operating system-specific

Two main features have been added and one has been avoided.

No file links under Windows. The Web2C distribution uses file links under UNIX for linking programs under different names. The problem is that you can have several format files generated by one engine. For example, latex.fmt and plain.fmt are both run with the tex.exe engine. Under UNIX, the tex engine may be linked under the names latex and plain, and the name under which the engine is invoked determines which format file is loaded.

There are no file links under Windows [7], and all you can do is copy the tex.exe engine to latex.exe and so on, at the expense of disk space. Given the number of engines, some of them being quite large, it is important to overcome the problem of file links.

[7] Shortcuts are not file links but rather redirections, available only through the Windows shell environment, not from the command line.

Fortunately, for executable programs, there is a natural way of doing something similar to file links using the Win32 API. The trick is to build a DLL with all the engine code and to have a small stub linked to the DLL. This way, the DLL is shared and the stub can be copied without using too much disk space. For example:

    03/17/99  08:44a     16,384  pdfinitex.exe
    03/17/99  08:44a     16,384  pdflatex.exe
    03/17/99  08:44a    389,120  pdftex.dll
    03/17/99  08:44a     16,384  pdftex.exe
    03/17/99  08:44a     16,384  pdfvirtex.exe

There are four stubs linked to the same DLL. Should you create a new format file called frpdflatex.fmt, for example, you only need to copy pdftex.exe to frpdflatex.exe and the new format file will be loaded automatically by calling the new command, which has a very small footprint on disk.

There are other potential advantages:

1. upgrading to a new version of pdfTEX could be done by upgrading only pdftex.dll;
2. clever TEX shells could drive TEX engines directly by talking to the DLL instead of using the command line.

Accessing files on the network. Under Windows, you can access files shared on the network by using UNC names. UNC stands for Universal Naming Convention, a syntax introduced by Microsoft to refer to shared resources (files or devices) available through the network. The kpathsea library has thus been made aware of UNC names, which means you can make $TEXMF point to \\TeXServer\TeXmf or ask dvips to print on \\TeXServer\printer.

The registry. Under Windows, programs are expected to use the registry to retrieve their parameters and other required information. The registry is a database that can be shared across the network and that subsumes the environment.

So the question arises: should the port of Web2C to Windows use the registry? The answer is no. The main reason is that end-users are advised not to modify the registry by hand, because there is too much potential danger for their system. So storing the configuration in the registry would prevent users from easily changing the way their tools behave. Temporarily setting environment variables is a quick way to modify kpathsea behavior and a very useful feature which it would have been unwise to remove. Users can fiddle with their configuration parameters in exactly the same way they would under UNIX: by either editing the texmf.cnf file or overriding parameters in the environment. It is always a matter of getting the best of both worlds.

Previewer

Motivation. It was not at first my intention to devise one. DVI is no longer a modern format. Even if it fulfills everybody's needs, that does not mean it will last. The new pdfTEX extension has demonstrated that DVI is not mandatory for a TEX system. Moreover, devising a previewer for Win32 is not a simple task, and spending lots of time on a tool that might quickly become obsolete is not very appealing.

But no TEX distribution would be complete without a DVI previewer: many TEX users stick to the good old (plain- or LATEX-generated) DVI format before any kind of PostScript conversion.

We still might argue that Ghostscript provides an accurate view of what will be printed, but the process of TEX → dvips → Ghostscript is somewhat slow and heavy for many documents.

So I ended up looking at the xdvi source code. I had no previous experience of either Win32 or X Window graphics programming, which was another reason for doing the previewer.

Porting graphics application. If we omit the interface, xdvi uses only a few primitives from the X Window System: it only needs to draw bitmaps for glyphs and rectangles for rules. So the decision was made to adapt to Win32 everything that could be adapted (the page reading and drawing mechanism, for example) and to rewrite the user interface part.

All but two of the C source files have been patched to compile under Win32 and the missing graphic primitives have been added. In addition, a new user interface has been devised following the samples provided by Microsoft with their Win32 Software Development Kit.

Some issues have been raised, and solved, by the redisplay mechanism. The main problem with the redisplay was where it should happen: in memory or directly onto the display surface? The former was easier but had one major drawback: at a scale factor of 1 and 600 dpi, an A4 color page would be huge (about 34 MB). So, on a second try, the redisplay was changed to draw directly onto the screen. In fact, both solutions are still in the source code but only one is activated.

This is also the reason why Windvi does not display PostScript inclusions at a scale factor of 1: Ghostscript is told to allocate the whole page because it can be asked to display raw PostScript code. So this time, Ghostscript would require the 34 MB page. It does work under Windows NT, albeit very slowly, but it is too heavy for Windows 9x.

This was also the opportunity to fully understand where Windows 9x and Windows NT differ. They share the same API but they behave in very different ways. For example, the first data structures I built, which worked under Windows NT, assigned one bitmap handle per glyph used in the DVI file. It was even pretty fast, but the same program running under Windows 9x was slow and eventually crashed. Looking at the resources, all the graphics resources were used up. How to explain such different behavior? In fact, the Windows 9x GDI (the system component that implements graphics services) allocates all the graphic objects in a few 64 KB heaps. The one dedicated to bitmap headers filled up quickly even when the DVI file used only a small number of fonts.

Features. The features of Windvi are almost the same as those of xdvi. I have tried to mimic the behavior of xdvi whenever possible and, at the same time, to add Windows behavior via status bars, tool bars and tooltips. The most important features of Windvi include:

• monochrome or grey-scale bitmaps (anti-aliasing) for fonts
• easy navigation through the DVI file
  – page by page
  – with different increments (by 5 or 10 pages at a time)
  – go to home, end, or any page within the document


Figure 1: Windvi featuring magnifying glass and Hyper-TEX links

• different shrink factors to zoom the page in and out
• magnifying glass to show the page at the pixel level
• compatibility with xdvi keystrokes
• use of .vf fonts
• display of .pk and .gf font files
• automatic generation of missing .pk files, even for Type 1 fonts
• tracking of DVI file changes and automatic reopening
• understanding of Ω extended DVI files
• drag-and-drop of files from the Windows shell explorer
• external commands through \specials
• color support (à la dvips)
• real-time logging of background font generation
• visualization of PostScript inclusions
• support of Hyper-TEX specials
• printing.

The main features not found in xdvi are color support and printing. The latter was again an opportunity to observe different behaviors between Windows 9x and Windows NT, which were quite painful to debug. In fact, printing is not at all easy to do because there are many ways to handle it:

1. Print through the generic Windows printer driver, but PostScript specials will not be printed.
2. Ask dvips to convert the .dvi file to PostScript, and then either send the file directly to the printer, if it can handle it, or else call Ghostscript to do the job.
3. Build a bitmap of the page, PostScript specials included, do banding (because the page would be huge) and send the bitmap to the printer.

Currently the first and third options are implemented, but the third one uses lots of Windows 9x resources.
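The banding mentioned in the third option is easier to picture with numbers. The following Python sketch is not taken from the Windvi sources: it assumes an A4 page (210 × 297 mm) rendered at 600 dpi with one byte per pixel, the 4 MiB budget per band is an arbitrary illustrative value, and the function names are hypothetical.

```python
MM_PER_INCH = 25.4


def page_pixels(width_mm, height_mm, dpi):
    """Page size in device pixels, rounded to the nearest pixel."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


def bitmap_bytes(width_px, height_px, bytes_per_pixel):
    """Memory needed for an uncompressed bitmap with unpadded rows."""
    return width_px * height_px * bytes_per_pixel


def plan_bands(width_px, height_px, bytes_per_pixel, budget_bytes):
    """Split the page into horizontal bands of at most budget_bytes each.

    Returns a list of (top_row, row_count) pairs that cover the page
    from top to bottom without gaps or overlaps.
    """
    row_bytes = width_px * bytes_per_pixel
    rows_per_band = max(1, budget_bytes // row_bytes)
    bands = []
    top = 0
    while top < height_px:
        rows = min(rows_per_band, height_px - top)
        bands.append((top, rows))
        top += rows
    return bands


# Assumed values, not taken from Windvi: A4 paper, 600 dpi,
# one byte per pixel, and a 4 MiB budget per band.
BAND_BUDGET = 4 * 1024 * 1024
width_px, height_px = page_pixels(210, 297, 600)
full_page = bitmap_bytes(width_px, height_px, 1)
bands = plan_bands(width_px, height_px, 1, BAND_BUDGET)

print(f"page: {width_px} x {height_px} pixels, {full_page / 1e6:.1f} million bytes")
print(f"{len(bands)} bands of at most {bands[0][1]} rows each")
```

Under these assumptions the full-page bitmap comes out in the same range as the figure of about 34 MB quoted for the in-memory redisplay, while each band stays within the chosen budget. A real implementation would also have to account for row padding and the actual color depth of the device.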


Last, the fact that Windvi is quite close to a port of xdvi was rewarding when it came to implementing the Hyper-TEX feature, which relies on the libwww library maintained by the W3C. Fortunately, this library is available for UNIX and Windows. The net result was that adding the Hyper-TEX feature to Windvi took only a couple of hours to reach a first workable result that allows navigation inside the document and references to external programs. However, it turned out that, under Windows, this library is not mandatory at all because the shell can be called to open URLs, so it is not needed anymore.

Installation

Packaging TEX. Maintaining a texmf tree is a job that is very well done by Thomas Esser for teTEX and by Sebastian Rahtz for the TEX-Live CD. Given such a tree, I wanted to find a way to automatically group files by packages.

As has been pointed out in electronic discussion lists [8], there is no standard procedure to install TEX packages. So there is no way to get a source texmf tree and build it, logging where every single file has been installed. So I wrote a few Perl scripts to reverse-engineer the build process, keeping the following goals in mind:

[8] See the dedicated mailing list on tug.org.

• Have three levels of completion: basic, recommended and full; given a package, these levels are guessed by heuristics from the lists files, devised by Sebastian Rahtz and to be found on the TEX-Live CD.
• Build a two-level structure targeted at InstallShield use (see next section); this structure is based on components (e.g. latex, ...) and subcomponents (e.g. latex\graphics).
• Group files as much as possible; for example, group font files, style files, source and documentation files for one package. This was done by implementing some kind of rule-based system in Perl, along with some other ad-hoc rules.
• Give a description for each sub-component; this was done using the Web description of TEX packages assembled by Graham Williams.

The result is not perfect, especially for the TEX-Live CD, with its huge number of packages. Notably, the automatic detection of descriptions is flaky (some of them are wrong, but this is harmless) and the recommended installation installs far too many packages, which means that the levels attributed to packages have been underestimated.

The installer. There is a product dedicated to building installers for Windows that is widely known and used in the Windows world, called InstallShield. Given a tree structure of components with associated file groups and a few setup schemes, it will build a nice-looking installer.

But things are not so easy when it comes to installing a huge distribution. While InstallShield handles many common installation cases well, TEX is quite special because of the large number of files. This was an opportunity to run into bugs and limitations in InstallShield. Even though it is being used for the current release of fpTEX and the TEX-Live CD, it will probably be abandoned and replaced by a dedicated installer.

All in all, even if the installer is not perfect or flexible enough, it is useful enough to install from the TEX-Live 4 CD. The latest version of the installer used on fpTEX even makes it possible to add packages on top of an existing installation. And finally, it has been useful to use InstallShield to sort out all the problems related to installation; even if it is not used for future versions, the specifications are still there.

Windows integration. Most environments dedicated to TEX in one way or another will support fpTEX. Amongst them, we can cite WinEdt, 4TEX and XEmacs.

Configuration

Assuming a recommended installation, there is little to configure. But, as pointed out in the introduction, Windows users expect dialog boxes, not text files to be edited. This has led me to devise a dialog box-based tool targeted at fpTEX configuration. The texconfig.exe tool allows the user to access most of the configuration files in a point-and-click way.

Moreover, the standard Windows menus are provided with shortcuts to command-line tools (to rebuild formats or the file database) and to local web pages (to access the documentation).

Future work

Some of the above-mentioned components will be enhanced in the near future.

Previewer. Even if the DVI format is old nowadays, many people are still using it, so I will enhance Windvi in the following ways:

• Type 1 and TTF font support
• support for other graphics file formats
• graphical transformations for glyphs and rules under Windows NT


Figure 2: texconfig.exe tool editing pdfTEX related paths

• two-page spread mode
• forward and inverse search; that is, from the editor to the DVI file or from the DVI file to the editor
• cut-and-paste to other applications

Installation. The dedicated installer is being worked on. It will handle installations of both fpTEX and TEX-Live. The fpTEX files will be distributed as .zip files and, if they are not present, the installer will try to download them from the net.

Configuration. The texconfig.exe program is not yet available. Its interface needs to be discussed because it is difficult to be simple and powerful at the same time. Many users want to tweak the configuration whereas, to make this thing really simple, we should hide most of the configuration parameters.

Availability

The whole package is available from CTAN, in the systems/win32/fptex directory. More information is available from www.ese-metz.fr/~popineau/fptex, the home of fpTEX. The TEX Users Group is kindly hosting a dedicated mailing list, fptex@tug.org, to which you can subscribe by sending a request to [email protected].

Acknowledgements

All this work relies heavily on the work done by Karl Berry, Thomas Esser, Sebastian Rahtz and Olaf Weber on the UNIX distributions Web2C, teTEX and TEX-Live.

Obviously, the numerous authors of all the packages present in fpTEX (programs or TEX style files) are thanked too for having shared their work.

References

Berry, Karl and O. Weber. "The Web2C distribution of TEX". http://tug.org/web2c, 1999.

Esser, Thomas. "The teTEX distribution of TEX". http://tug.org/pub/teTeX, 1999.

Gurari, Eitan M. "A demonstration of TEX4ht". http://www.cis.ohio-state.edu/~gurari/tug97/tug97-h.html, 1997.

Popineau, F. "Rapidité et souplesse avec le moteur Web2c-7". Cahiers GUTenberg 26, 96–108, 1997.

Popineau, F. "Windvi User's Manual". MAPS 20, 146–149, 1998.

van Dobbelsteen, G. "DVIview: A new previewer". MAPS 20, 120–124, 1998.
