Star TEX: the Next Generation Several Ideas on How to Modernize TEX Already Exist
Total Page:16
File Type:pdf, Size:1020Kb
TUGboat, Volume 33 (2012), No. 2 199 Star TEX: The Next Generation Several ideas on how to modernize TEX already exist. Some of them have actually been implemented. Didier Verna In this paper, we present ours. The possible future Abstract that we would like to see happening for TEX is some- what different from the current direction(s) TEX's While T X is unanimously praised for its typesetting E evolution has been following. In our view, modern- capabilities, it is also regularly blamed for its poor izing TEX can start with grounding it in an old yet programmatic offerings. A macro-expansion system very modern programming language: Common Lisp. is indeed far from the best choice in terms of general- Section 2 clarifies what our notion of a \better", or purpose programming. Several solutions have been \more modern" TEX is. Section 3 on the next page proposed to modernize T X on the programming E presents several approaches sharing a similar goal. side. All of them currently involve a heterogeneous Section 4 on the following page justifies the choice approach in which T X is mixed with a full-blown pro- E of Common Lisp. Finally, section 5 on page 205 gramming language. This paper advocates another, outlines the projected final shape of the project. homogeneous approach in which TEX is first rewrit- ten in a modern language, Common Lisp, which 2 A better TEX serves both at the core of the program and at the The T X community has this particularity of being scripting level. All programmatic macros of T X are E E a mix of two populations: people coming from the hence rendered obsolete, as the underlying language world of typography, who love it for its beautiful itself can be used for user-level programming. typesetting, and people coming from the world of Prologue computer science, who appreciate it for its automa- tion and similarity to a programming language. It is TEX true that the way TEX works is much closer to that The [final] frontier. of a compiled language than a WYSIWYG editor. These are the voyages, In both worlds, TEX is unanimously acclaimed Of a software enterprise. for the quality of its typesetting. This shouldn't Its continuing mission: be surprising, as it has always been TEX's primary To explore new tokens, objective. The question of its programmatic capa- To seek out a new life, bilities, however, is much more arguable. People New forms of implementation. unfamiliar with programming in general easily ac- knowledge that TEX's programmatic interface is not 1 Introduction trivial to use. For people coming from the world of In 2010, I asked Donald Knuth why he chose to computer science, it is even more obvious that TEX design and implement TEX as a macro-expansion is no match for a real programming language, partly system rather than as a full-blown, procedure-based, due to its macro-expansion nature. programming language. His answer was twofold: Let us recall that TEX was not originally meant 1. He wanted something relatively simple for his to be a programming language. To quote its author, secretary who was not a computer scientist. Donald Knuth [8]: I'm not going to design a programming lan- 2. The very limited computing resources at that guage; I want to have just a typesetting lan- time practically mandated the use of something guage. [. ] In some sense I put in many of much lighter than a complete programming lan- T X's programming features only after kick- guage. E ing and screaming. The first part of the answer left me with a slight From this perspective, it seems natural to con- feeling of skepticism. It remains to be seen that T X E sider that a \better" T X, today, would essentially is simple to use. Although it probably is for simple E deliver the same typesetting quality from a more things, its programmatic capabilities are notoriously modern programmatic interface. More precisely: tricky. The second part of the answer, on the other hand, was both very convincing and arguably now • access to the typesetting subset of TEX's primi- obsolete as well. Time has passed and the situation tives should be provided with a consistent syntax today is very different from what it was 30 years (in particular, no more need for \relax), ago. The available computing power has grown expo- • the programmatic API (\def, \if, etc.) should nentially, and so have our overall skills in language be dropped in favor of support from a real pro- design and implementation. gramming language, Star TEX: The Next Generation 200 TUGboat, Volume 33 (2012), No. 2 • the system should remain simple to use, at least more integrated than the other alternatives is built for simple things, just as TEX is today, like this: a Lua interpreter is embedded in a more • the system should remain highly extensible and or less regular TEX, written in WEB and C. customizable, just as TEX is today, The idea that we suggest in this paper is that • and finally, preserving backward compatibility, another, fully integrated approach is also possible. although not considered mandatory, should also In this approach, another programming language be considered. would be used to completely rewrite TEX and pro- vide the desired programmatic layer at the same 3 Alternatives time. We are aware of at least one previous attempt 4 The idea of grounding TEX into a real program- at a fully integrated approach. NTS , the \New ming language is not new. Let us mention some Typesetting System" was supposed to be a complete such attempts that we are aware of. eval4tex1 and re-implementation of TEX in Java, but the project sTEXme2 both use Scheme (another dialect of Lisp). was never widely adopted. We don't think that this PerlTEX[13, 15], as its name suggests, chooses Perl. project's demise indicates in any way that the ap- QaTEX/PyTEX[4] use Python, and finally, also as proach is doomed in general. On the contrary, the its name suggests, LuaTEX3 employs Lua. remaining sections explain why we think that Com- These approaches, although motivated more or mon Lisp is a very good candidate for it. less by the same general idea, work in very different 4 Common Lisp: why? ways. Some of them wrap TEX in a programming Lisp is a very old language. It was invented by language by giving the language access to TEX's in- ternals. Others wrap a programming language in John McCarthy in the late 1950s [9]. Contrary to what many people seem to think however, being old TEX by allowing TEX to execute code (LuaTEX be- longs to this category). Some do both. For instance, doesn't imply being obsolete. In this case, it is a synonym for being mature and modern. When Lisp sTEXme ships with a extended Scheme engine that was invented, it was in fact way ahead of its time, to can access TEX's internals, as well as a modified TEX engine that can evaluate Scheme code. the point that even its inventor hadn't realized the extent of his creation. A sign of this is the recent Some of them allow authors to write TEX macros incorporation of features that Lisp already provided in a different language (PerlTEX lets you write TEX 50 years ago into so-called \modern" programming macros in Perl). Others, like QaTEX, aim at com- languages, such as C# with the addition of dynamic pletely getting rid of TEX macros so that all program- matic functionality is written in another language types, or C++ and Java with the addition of lambda (anonymous) functions. (Python, with PyTEX in that case). This particular case is closer to what we have in mind. 4.1 Standardization Finally, some approaches use a synchronous Common Lisp, in particular, was standardized in dual-process scheme in which both TEX and the programming language of choice run in parallel, com- 1994 [1] (it was in fact the first object-oriented lan- municating either via standard input/output redi- guage to get an ANSI standard). Remember how rection, or by file or socket I/O. That is the case of stability was important to Donald Knuth for TEX? This means that even across different implementa- sTEXme and PerlTEX. Others, like eval4tex use a multi-pass scheme instead. In a first pass, the \for- tions (there are half a dozen or so), that the core of eign" code is extracted and sent to the programming the language will never change, and in fact hasn't for language. The programming language in question the last 20 years. Compare this to modern scripting languages such as Python or Ruby, for which the executes its code and sends it back to TEX. In the \standard" is essentially the current implementation final pass, TEX is left with only regular TEX macros to process. of the current version of the language written by the In spite of all these variations, it is worth stress- author of the language. ing that all these approaches have something in com- The Common Lisp standard is fairly comprehen- mon: they are heterogeneous. They involve both a sive: it includes not only the language core but a large programming language engine on one hand, and the library of functions. Of course, because the standard is 20 years old, it lacks several things that are con- original, though possibly modified, TEX engine on sidered important today (such as a multi-threading the other hand. Even LuaTEX which is somehow API). Every Common Lisp implementation provides 1 http://www.ccs.neu.edu/home/dorai/eval4tex/ its own version of non-standard features, but again, 2 http://stexme.sourceforge.net/ 3 http://www.luatex.org 4 http://nts.tug.org Didier Verna TUGboat, Volume 33 (2012), No.