<<

TUGBOAT

Volume 35, Number 1 / 2014

General Delivery 2 Ab epistulis / Steve Peter 3 Editorial comments / Barbara Beeton Updike prize for student type design; Talk by Matthew Carter; R.I.P. Mike Parker (1929–2014); for ; TAOCP volume 1 issued as an ebook; Other items worth a look 5 The TEX tuneup of 2014 / Donald Knuth 9 Making lists: A journey into unknown grammar / James Hunt 16 In memoriam Jean-Pierre Drucbert / Jean-Michel Hufflen Letters 16 Does not suffice to run a finite number of times to get cross-references right / Jaime Gaspar 17 Fetamont: An extended logo typeface / Linus Romer A L TEX 22 LATEX3 news, issue 9 / LATEX Project Team 27 Introduction to presentations with beamer / Thomas Thurnherr 31 The beamer class: Controlling overlays / Joseph Wright 34 Boxes and more boxes / Ivan Pagnossin 36 Glisterings: Glyphs, long labels / Peter Wilson 39 The pkgloader and lt3graph packages: Toward simple and powerful package management for LATEX / Michiel Helvensteijn & Tools 44 An overview of Pandoc / Massimiliano Dominici 51 Numerical methods with LuaLATEX / Juan Montijano, Mario P´erez, Luis R´andez and Juan Luis Varona 61 ModernDvi: A high quality rendering and modern DVI viewer / Antoine Bossard and Takeyuki Nagao 57 Parsing PDF content streams with LuaTEX / Taco Hoekwater 69 Selection in PDF viewers and a LuaTEX bug / Hans Hagen 71 Parsers in TEX and using CWEB for general pretty-printing / Alexander Shibakov Graphics 79 Entry-level MetaPost 4: Artful lines / Mari Voipio 82 drawdot in MetaPost: A bug, a fix / Hans Hagen Electronic Documents 83 HTML to LATEX transformation / Frederik Schlupkothen Survey 99 Macro memories, 1964–2013 / David Walden Publishing 91 Scientific documents written by novice researchers: A personal experience in Latin America / Ludger Suarez–Burgoa Abstracts 111 Eutypon: Contents of issue 30–31 (October 2013) 111 Die TEXnische Kom¨odie: Contents of issue 1/2014 Hints & Tricks 112 The treasure chest / Karl Berry Book Reviews 113 Book reviews: LATEX for Complete Novices and Using LATEX to Write a PhD Thesis, by Nicola Talbot / Boris Veytsman 115 Book review: Dynamic Documents with R and knitr, by Yihui Xie / Boris Veytsman TUG Business 2 TUGboat editorial information 120 TUG financial statements for 2013 / Karl Berry 121 TUG institutional members Advertisements 121 TEX consulting and production services News 123 Calendar 124 TUG 2014 announcement TEX Users Group Board of Directors TUGboat (ISSN 0896-3207) is published by the TEX Donald Knuth, Grand Wizard of TEX-arcana † Users Group. Steve Peter, President ∗ Jim Hefferon∗, Vice President Memberships and Subscriptions Karl Berry∗, Treasurer 2014 dues for individual members are as follows: ∗ Susan DeMeritt , Secretary Regular members: $105. Barbara Beeton Special rate: $75. Kaja Christiansen The special rate is available to students, seniors, and Michael Doob citizens of countries with modest economies, as de- Steve Grathwohl tailed on our site. Also, anyone joining or re- Taco Hoekwater newing before March 31 received a $20 discount. Klaus H¨oppner Membership in the T X Users Group is for the E Ross Moore calendar year, and includes all issues of TUGboat for Cheryl Ponchin the year in which membership begins or is renewed, Arthur Reutenauer as well as software distributions and other benefits. Philip Taylor Individual membership is open only to named indi- Boris Veytsman viduals, and carries with it such rights and responsi- David Walden bilities as voting in TUG elections. 
For membership Raymond Goucher, Founding Executive Director † information, visit the TUG web site. † TUGboat , Wizard of Fonts Also, (non-voting) subscriptions are ∗ available to organizations and others wishing to re- member of executive committee †honorary ceive TUGboat in a name other than that of an individual. The subscription rate is $105 per year, See http://tug.org/board.html for a roster of all past and present board members, and other official positions. including air mail delivery. Addresses Electronic Mail Institutional Membership (Internet) Institutional membership is a means of showing con- TEX Users Group P.O. Box 2311 General correspondence, tinuing interest in and support for both T X and the E Portland, OR 97208-2311 membership, subscriptions: TEX Users Group, as well as providing a discounted U.S.A. [email protected] group rate and other benefits. For further informa- tion, see http://tug.org/instmem.html or contact Telephone Submissions to TUGboat, the TUG office. +1 503 223-9994 letters to the Editor: [email protected] Trademarks Fax Technical support for Many trademarked names appear in the pages of +1 815 301-3568 TEX users: TUGboat. If there is any question about whether [email protected] a name is or is not a trademark, prudence dictates Web Contact the Board that it should be treated as if it is. The following list http://tug.org/ of Directors: of trademarks which commonly appear in TUGboat http://tug.org/TUGboat/ [email protected] should not be considered complete. TEX is a trademark of American Mathematical Society. is a trademark of Addison-Wesley Inc. Copyright 2014 TEX Users Group. PostScript is a trademark of Adobe Systems, Inc. Copyright to individual articles within this publication remains with their authors, so the articles may not [printing date: April 2014] be reproduced, distributed or translated without the authors’ permission. Printed in U.S.A. For the editorial and other material not ascribed to a particular author, permission is granted to make and distribute verbatim copies without royalty, in any medium, provided the copyright notice and this permission notice are preserved. Permission is also granted to make, copy and distribute translations of such editorial material into another language, except that the TEX Users Group must approve translations of this permission notice itself. Lacking such approval, the original English permission notice must be included. Trying to do everything inside TeX may be a good sport and educational activity but very often far from being the best solution for real needs. :-) Michal Jaegermann comp.text., 26 October 1996

COMMUNICATIONS OF THE TEX USERS GROUP EDITOR BARBARA BEETON

VOLUME 35, NUMBER 1 • 2014 PORTLAND • OREGON • U.S.A. 2 TUGboat, Volume 35 (2014), No. 1

A TUGboat editorial information and L TEX, are available from CTAN and the TUGboat This regular issue (Vol. 35, No. 1) is the first issue of the web site. We also accept submissions using ConTEXt. 2014 volume year. Deadlines, tips for authors, and other information: TUGboat is distributed as a benefit of member- http://tug.org/TUGboat/location.html ship to all current TUG members. It is also available Effective with the 2005 volume year, submission of to non-members in printed form through the TUG store a new manuscript implies permission to publish the ar- TUGboat (http://tug.org/store), and online at the TUGboat ticle, if accepted, on the web site, as well as web site, http://tug.org/TUGboat. Online publication in print. Thus, the physical address you provide in the to non-members is delayed up to one year after print manuscript will also be available online. If you have any publication, to give members the benefit of early access. reservations about posting online, please notify the edi- Submissions to TUGboat are reviewed by volun- tors at the time of submission and we will be happy to teers and checked by the Editor before publication. How- make special arrangements. ever, the authors are still assumed to be the experts. TUGboat editorial board Questions regarding content or accuracy should there- Barbara Beeton, Editor-in-Chief fore be directed to the authors, with an information copy Karl Berry, Production Manager to the Editor. Boris Veytsman, Associate Editor, Book Reviews Submitting items for publication Production team Proposals and requests for TUGboat articles are grate- William Adams, Barbara Beeton, Karl Berry, fully accepted. Please submit contributions by electronic Kaja Christiansen, Robin Fairbairns, Robin Laakso, mail to [email protected]. Steve Peter, Michael Sofka, Christina Thiele The second issue for this year is expected to be the TUG 2014 proceedings (http://tug.org/tug2014). TUGboat advertising The deadline for receipt of final papers for that issue is For advertising rates and information, including consul- August 11. The third issue deadline is October 3. tant listings, contact the TUG office, or see: The TUGboat style files, for use with plain TEX http://tug.org/TUGboat/advertising.html

Ab Epistulis For years, we’ve been growing by word of mouth among colleagues in the academic disciplines. One Steve Peter tells another about TEX’s abilities in Not long ago, I saw a post on one of the discussion handling all sorts of complex equations; a historian Bib lists asking how to convert a TEX document into tells another how TEX or JabRef can handle the Word, “because that’s what publishers want.” I’ve complexities of managing a bibliographic database recently been working with a group at work to revise (no math here, just academic writing); and so on. our production guidelines for TEX manuscripts, and Now, we need to engage another sub-community I began to ponder the question in more depth. Why of the (academic) publishing world: the freelancers. do publishers want Word (if indeed they do), and Copyeditors and indexers need to know that they what can we do as a community to change that? can gain a competitive advantage by learning at least For math-heavy books, whether math, physics, enough TEX to be able to work directly in source files. or economics, we not infrequently work with TEX (Speaking from personal experience, more and more files throughout the production process, whether the of my own freelance work has gone over to TEX-based book is ultimately provided in camera-ready copy copyediting, away from TEX programming.) by the author, or is produced by a TEX-enabled How will this engagement happen? The key is compositor. Without a doubt, the biggest pain going to be education, especially in a casual way. If in the process is with copyediting (for all books) and you do encounter a freelancer curious about TEX, indexing (for books where the author does not supply show them the simple stuff to dispel the fear and camera-ready copy). As the publishing industry uncertainty. We don’t need to turn freelance copy- moved to outsource these two processes, a vast army editors or indexers into hardcore TEX experts who of freelance and independent contractors arose, but shun any trace of commercial software. We do need very few saw fit to gain expertise in TEX. In fact, to show them just enough to be able to do their expertise per se isn’t even required, just enough specialized jobs as part of a TEX-based workflow. knowledge to be able to work directly in the files If we can enable a painless workflow, the pub- without breaking too much. lishers will come. In essence, it isn’t necessarily that publishers ⋄ Steve Peter are demanding Word, it’s the freelance community Press that is requiring it, and the publishers lack a pool of president (at) tug dot org TEX-savvy talent to draw from to be able to break http://tug.org/TUGboat/Pres that dependency. It seems to me that this represents an opportunity to expand our community. TUGboat, Volume 35 (2014), No. 1 3

Editorial comments 2011/05/Charles-Snell-1714.jpg, and Carter’s re-imagined version at Barbara Beeton http://luc.devroye.org/MatthewCarter- Updike prize for student type design RoundhandBT-afterCharlesSnell.gif. Andrea Mantegna, a 15th century Italian artist An annual prize for student type designers has been and student of Roman archaeology, was the inspi- announced by the Daniel Berkeley Updike Collection ration for the Mantinia . The foundation post at the Providence Public Library. The basis of the of Mantegna’s house in Mantua (shown in one of Collection is the archives of the eponymous 20th Carter’s slides) prominently displays the carved name century printer and proprietor of the Merrymount “Mantinia” (the Latin form of Mantegna) as well as Press in Boston. other inscriptions in the style typical of Roman monu- The prize requires that the student make at ments and gravestones. All the letters are uppercase. least one visit to the Updike Collection within 18 Whether to save space, balance the shape of the in- months of their application, and must be enrolled in scriptional lines, or for some other reason, two or an undergraduate or graduate program during the three adjacent letters are sometimes combined in time of their visit. unusual ways. Some smaller letters occur as well — The first prize includes $250 and complimentary inscriptional “small caps” — but, rather than being admission to the 2015 TypeCon, organized by SOTA, aligned at the baseline with the larger letters, they the Society of Typographic Aficionados. are aligned at the top. Additional information, and the application A more recent use of this style of carving can form, can be obtained from this web site: be found on the panels above the windows of the http://www.provlib.org/updikeprize Boston Public Library (see Fig. 1). These variations gave Carter a model on which to base the many Talk by Matthew Carter unusual and playful ligatures and top-aligned “low- An exhibition celebrating the 200th anniversary of ercase” found in the finished Mantinia font. the death of typographer Giambattista Bodoni, and The third font was the one designed for Yale the launch of the Updike Prize was held on Febru- University. Inspired by , this font serves all ary 27. The speaker was Matthew Carter; his subject was “Genuine Imitations: A Type Designer’s View of Revivals”. Carter took three of his typeface designs as the material for his talk: Snell Roundhand, Mantinia, and the design he created for Yale University. The first two designs were based on non-typo- graphic sources; the first was inspired by the work of an English writing master, Charles Snell, and the second, by stone carving in a more-or-less traditional Roman style. In 1965, when Linotype was converting its fonts from metal to images on film, Carter was invited to design a new script font. It’s possible to do many things with film that can’t be done with metal. For example, character widths aren’t limited to what can fit in a rather narrow rectangle cast as part of a unitary “line of type”. (That process doesn’t even permit the kerns that can be carved into the corners of hand-set type.) Charles Snell, a 17th century English writing master, published manuals that included detailed Figure 1: A panel from the Boston Public Library diagrams and instructions for writing script with showing the names of noted astronomers. a quill pen. The flowing fs and the tails on the Photo by Jascin L. Finger, curator of the Maria Mitchell g and y are representative of the style. 
An image House on Nantucket; used with permission. Maria Mitchell was an American astronomer, only the second woman to be of Snell’s lowercase script can be seen at http://www. recognized as the discoverer of a comet; the first was Caroline paulshawletterdesign.com/wp-content/uploads/ Herschel, sister of William Herschel (discoverer of Uranus). 4 TUGboat, Volume 35 (2014), No. 1

typographic functions of the University, from let- Although Don’s web page (http://www-cs-faculty. terhead to cast bronze letters on building fa¸cades stanford.edu/~uno/abcde.html) explains that the to signage on recycling and rubbish baskets on the “spiffy new versions” of the C&T volumes were “pro- campus. Images of the Yale font and notes on its duced entirely with technology that can be expected history and development can be seen at http://www. to last for many generations”, there is no hint that yale.edu/printer/typeface/typeface.html. they might sometime be available in electronic form.

R.I.P. Mike Parker (1929–2014) Other items worth a look Just a few days before Matthew Carter’s talk, Mike As a follow-up to last year’s interview of Chuck Parker died. Parker, as director of type development Bigelow, here is an essay in which he shares some ear- at Linotype, invited Carter to design the script font lier history, written for the centennial of Reed College: that became Snell Roundhand, persuaded Carter to “Rescued from a Life of Crime” (http://www.reed. join Linotype as chief designer, and later, in 1981, edu/reed_magazine/september2011/articles/ left Linotype with Carter to found Bitstream. features/bigelow/bigelow.html). Who knew? Parker was largely responsible for bringing font The Hamilton Wood Type Museum has been production from the world of metal to film, and mentioned before in this column. Evicted from its from film to digital formats. He was an enthusiastic original site, it is now open in a new location in and influential proponent of Helvetica, and put forth Two Rivers, : http://woodtype.org.A the proposition that Times New Roman had in fact documentary, “Typeface”, telling the story of this been designed by Starling Burgess, not by Stanley museum, was presented on the Sundance channel in Morison. the U.S.; information about the video can be found A sympathetic obituary appeared in the March at http://typeface.kartemquin.com/about. 8th edition of The Economist, and can be accessed The debate about whether or not there should be on the web. a wider space after a period goes on: “Space Invaders, Turing Award for Leslie Lamport Why you should never, ever use two spaces after a pe- riod.” in Slate (slate.com/articles/technology/ The Association for Computing Machinery will pre- technology/2011/01/space_invaders.html). Ac- sent the 2013 A.M. Turing Award to Leslie Lamport tually, this isn’t new, but it’s just been called to my for “advances in reliability and consistency of com- attention. As a TEX user, I’m spoiled: I can type puting systems”. The Turing Award is given for two spaces after the periods that end sentences in major contributions of lasting importance to com- my (monospace) emacs window, and TEX will do puting. The citation can be read here: my bidding, whether using the default U.S. style or http://techpolicy.acm.org/blog/?p=3641. \frenchspacing to treat all spaces alike. Using a Not a single word recognizes his creation of monospace font while editing makes sense, since it A LTEX, the accomplishment for which this community makes a file easier to read (at least for these old knows him best. With this award, Leslie joins Don eyes), and permits an author to align or indent to Knuth, who received the 1974 Turing Award. And illuminate structure. As I see it, there are times all for achievements not related to TEX. when the wider spaces are valuable typographically. TAOCP volume 1 issued as an ebook Maybe not after every sentence, but if your text says “. . . etc. W. H. Auden says . . . ”, where is the end of The InformIT arm of Pearson Education (parent of that sentence? The real culprit (which the author the group that includes Addison-Wesley) has an- of this screed doesn’t even mention) is software that nounced that The Art of Programming sets multiple spaces in text as multiple spaces, not Volume 1 ebook is now available for sale. Their news singles. Software is supposed to make life easier for release further states that the user, not harder, but some things seem to be Only InformIT provides this eBook in three going backwards these days. 
formats — EPUB, MOBI,& PDF — together To end this installment, here’s an oldie, but for one price. So you can buy it once and get a real goodie: setting 24pt type in an ogee curve it on any device including your PC or eReader (flickr.com/photos/sos222/12391791743/). The of choice. old guys really did know what they were doing. We will be releasing the other volume eBooks throughout the year and currently ⋄ Barbara Beeton hope to have Vol. 2 for you in just a couple http://tug.org/TUGboat of months. tugboat (at) tug dot org TUGboat, Volume 35 (2014), No. 1 5

The TEX tuneup of 2014 from [4], because it reflects my unwavering philoso- phy (see [3]): Donald Knuth The index to Digital lists eleven If you ask the Wayback Machine to take you back pages where the importance of stability is to the home page stressed, and I urge all maintainers of TEX http://www-cs-faculty. and METAFONT to read them again every few stanford.edu/~knuth/abcde.html years. Any object of nontrivial complexity is non-optimum, in the sense that it can be of The T Xbook and my other books on E improved in some way (while still remaining & , as that page existed on 16 January non-optimum); therefore there’s always a rea- 1999, you’ll find the following remarks: son to change anything that isn’t trivial. But I still take full responsibility for the master one of TEX’s principal advantages is the fact sources of TEX, METAFONT, and Computer that it does not change — except for serious Modern. Therefore I periodically take a few flaws whose correction is unlikely to affect days off from my current projects and look at more than a very tiny number of archival doc- all of the accumulated bug reports. This hap- uments. pened most recently in 1992, 1993, 1995, and Users can rest assured that I haven’t “broken” any- 1998; following this pattern, I intend to check thing in this round of improvements. Everyone can on purported bugs again in the years 2002, upgrade at their convenience. 2007, 2013, 2020, etc. The intervals between such maintenance periods are increasing, be- TEX Version 3.14159265 cause the systems have been converging to an Let’s get down to specifics. The new version of TEX error-free state. differs from the old only with respect to the “null con- And if you fast-forward nine more years, you can trol sequence” \csname\endcsname, which has been a legal construct since version 0.8 (November 1982) al- find a TUGboat article called “The TEX tuneup of 2008” [4], which describes the changes that were though almost nobody uses it. Oleg Bulatov noticed in September 2008 that T X’s \message operation made to TEX and its companion systems based on E the comments from users that were received during has curiously inconsistent behavior: Suppose you say the years 2003, 2004, 2005, 2006, and 2007. That \def\\#1{\message{#1bar}} article ended as follows: \def\surprise{wunder} \let\foo=! So now I send best wishes to the whole T X E (for example). Then community, as I leave for vacation to the land \\\surprise gives wunderbar of TAOCP — until 31 December 2013. Au \\\over gives \over bar revoir! \\\foo gives \foo bar Hello again, dear friends, allˆo ! Here is the sequel. \\{\csname 6\endcsname} gives \6bar On 31 December 2013, Barbara Beeton duly \\{\csname fu\endcsname} gives \fu bar forwarded to me a well-organized collection of ma- as messages on your terminal and in your log file. terials covering more than two dozen potentially But ‘\\{\csname\endcsname}’ unfortunately gives troublesome topics that had been submitted for con- \csname\endcsnamebar sideration during the years 2008, 2009, 2010, 2011, because I forgot to insert a space when I coded this 2012, and 2013. This was the residue of hundreds part of the print cs routine (see [B], §262). So Oleg of items that had been carefully filtered by a team has won a check for $327.68 [1]. Of course I hope of expert volunteers, who had worked hard to mini- that this turns out to be the “historic” final bug mize the effort that I would need to devote to this in T X. (It’s the 947th; see [3], page 662.) project. 
(I can’t possibly thank all the volunteers in- E \\{\csname\endcsname} dividually; but Donald Arseneau, Karl Berry, Peter Henceforth ‘ ’ will give Breitenlohner, and Boguslaw Jackowski deserve par- \csname\endcsname bar ticular commendation.) and everybody will be happy. This corrected behav- As in 2008, both TEX and METAFONT have ior does not simply affect TEX’s messages; the name changed slightly and gained new digits in their ver- of a control sequence can also get into documents, for sion numbers. But again, the changes are essentially example via \write or \meaning. But the change invisible. I can’t resist quoting another paragraph surely won’t ruin your archived works.

The TEX tuneup of 2014 6 TUGboat, Volume 35 (2014), No. 1

METAFONT Version 2.7182818 editions of TEX Live; now they are in some sense The historic final (I hope) bug in METAFONT was “official.”) Here is a current list of all the web files for discovered during June 2008 by the longstanding which I have traditionally been responsible: TEX contributor Eberhard Mattes. The error that name current version date he brought to light is easier to describe than the TEX dvitype.web 3.6 December 1995 error discussed above, but it was much more subtle to gftodvi.web 3.0 October 1989 detect: Whenever previous versions of METAFONT gftopk.web 2.4 January 2014 have transformed pencircle into an axially symmet- gftype.web 3.1 March 1991 ric pen whose polygon has no point on the x-axis, mf.web 2.7182818 January 2014 the in §536 of [D] has “leaked memory,” mft.web 2.0 October 1989 by forgetting to reclaim seven words that had been pltotf.web 3.6 January 2014 allocated for the omitted point. This happened, for pooltype.web 3.0 September 1989 instance, with one of the pens in exercise 16.2 of [C], tangle.web 4.5 December 2002 and in my original TRAP test [2] for METAFONT; so I tex.web 3.14159265 January 2014 should have discovered the problem long ago. Eber- tftopl.web 3.3 January 2014 hard noticed that the METAFONT program vftovp.web 1.4 January 2014 pen p; vptovf.web 1.6 January 2014 forever: showstats; weave.web 4.4 January 1992 p := pencircle scaled 1.4; endfor would abort with METAFONT’s capacity exceeded — Typographic errors and other blunders although it did take quite awhile to overflow 3 million So far I’ve only been discussing potential anomalies in words of memory on my current home system — and the software. But of course people have also reported he also figured out how to cure the problem. For problematic aspects of the documentation — which this he amply deserves his new reward in [1]. may actually be the hardest thing to get right. Even The T Xbook [A], which has been under intense E scrutiny for more than thirty years, was not free of No changes have been made to the Computer Modern hitherto-unperceived defects. fonts of 2008, although I did delete a few of Altogether I made corrections to each of [A], redundant and alter two names. [B], [C], [D], and [E], enough to represent $23.68 in John Bowman noticed a tiny bump that appears eleven new reward checks. The most significant of near the top right serif when an italic ‘K ’ is greatly these changes can be seen from the home page cited magnified, and Jacko discovered the underlying rea- above, if you click to get the PDF errata file and scan son: Part of the stroke of this slanted letter is drawn for corrections dated in 2014. with a circular pen, but it joins up with outlines that are slanted (hence not true circles). The same The master sources tiny bumps can therefore by observed also in various The backbone of the TEX system, for the past 25 other italic and slanted letters, such as A, V, W, X, years or so, has been a collection of 178 files, mostly Y, when enlarged. with names of the forms *.web, *.tex, and *.mf. But those bumps are even less visible than the These files contain almost exactly 7 megabytes al- mispositioned bulbs that I discussed in [4]. And in together; and the new changes have altered about fact I’ve even become somewhat fond of such little 3500 of those bytes. Thus it appears that the TEX glitches, now that I’ve been learning to appreciate system was 99.95% correct in 2008, if it is 100% the Japanese concept of wabi-sabi. correct today. 
Thus I’ve decided that the Computer Modern The master files, together with a bunch of errata fonts are to be forever frozen in their present form, files that document past history, can be downloaded especially now that the definitive description in the from the ftp server cs.stanford.edu, which accepts latest printing of [E] has become available. ‘anonymous’ as a login name. They’re collected to- gether in a single compressed file TEXware and METAFONTware I made minor updates to the master web files for five pub/tex/tex14.tar.gz , other programs, namely gftopk, pltotf, tftopl, which you can compare if you like to the older files vftovp, and vptovf, in order to make them more pub/tex/tex08.tar.gz , pub/tex/tex03.tar.gz. robust in the presence of weird input files. (These The latest versions of individual files can of course changes had in fact already been made in recent also be found in the CTAN archive.

Donald Knuth TUGboat, Volume 35 (2014), No. 1 7

As I did in [4], I’ll mention here the names of On the other hand, I myself have to generate all files that have changed in some way during the new printings every now and then; and I have a latest go-round: favorite way to get around the booby trap by first tex/texbook.tex % source file for [A] typing ‘19’ and then typing some other special codes. tex/tex.web % master file for TEX in Pascal (I also realize that unscrupulous people might even tex/trip.fot % torture test terminal output try to change texbook.tex, although that is strictly tex/tripin.log % torture test first log file forbidden. The source code is intended to be exam- ined, if desired, but not executed or modified except tex/trip.log % torture test second log file by its author.) tex/trip.typ % torture test output of DVItype Unfortunately I don’t think I ever noted down texware/pltotf.web % PLTOTF master file for the running time in the 80s, so I can’t give a definitive TFTOPL texware/tftopl.web % master file for answer to the question. My recollection is that the mf/mfbook.tex % source file for [C] entire book took maybe 20 minutes on Stanford’s mf/mf.web % master file for METAFONT in Pascal PDP10 mainframe (shared with other users). There mf/trap.fot % torture test terminal output was a noticeable slowdown on certain pages — such mf/trapin.log % torture test first log file as page 218, when prime numbers are computed the mf/trap.log % torture test second log file hard way. mf/trap.typ % torture test output of DVItype My colleague David Fuchs used The TEXbook mfware/gftopk.web % master file for GFTOPK as a benchmark in 1986, when he was developing cm/romanu.mf % master file for Computer Modern MicroTEX (the first version of TEX to run on an IBM PC Roman uppercase ). A few days ago I asked him if he could cm/symbol.mf % master file for Computer Modern remember its speed. He replied that, like me, he had Roman symbols no firm memory of those days, except that MicroTEX could do several pages per minute; and he guessed etc/vftovp.web % master file for VFTOVP that it had taken roughly an hour to complete the etc/vptovf.web % master file for VPTOVF whole TEXbook. His estimate seems right, because lib/manmac.tex % macros for [A] and [C] The TEXbook has nearly 500 pages. errata/errata.nine % changes to [A] between Today, on my home computer (a 3.6 GHz Xeon 1992 and 1996 with 10 MB cache), TEX transforms texbook.tex to errata/errata.tex % changes to [A]–[E] since texbook.dvi in 0.3 seconds. 2001 (2) If you were designing T X today, would you still errata/tex82.bug % tex.web E changes to use \over and friends, rather than something like errata/errorlog.tex % one-per-line annotated \frac{...}{...}, when the latter would avoid the summaries of those changes necessity of \mathchoice and \mathpalette? errata/mf84.bug % changes to mf.web This question, from tex.stackexchange.com, (Notice that the basic macro files for plain vanilla also quoted from page 151 of [A]: TEX and plain vanilla METAFONT, lib/plain.tex \mathchoice is somewhat expensive in terms and lib/plain.mf, remain unchanged.) of time and space, and you should use it only when you’re willing to pay the price. Questions and answers And well, I guess that quote implies my answer. For Barbara also asked me to answer three questions, I was clearly willing to pay the price in 1982, so I’m which she said “keep coming up in various forums,” certainly willing to pay zero today! 
so that she could point people to the answers if those I suppose there are some people in the world questions come up again. who prefer expressions like ‘sum(2, 3)’ to ‘2 + 3’; but I’m certainly not among them. Ever since TEX was (1) How long did it take to typeset The TEXbook born, I’ve been enormously pleased by the ability in the 80s, and how long does it take today? to write ‘2\over3’ or ‘n\choose k’ or ‘p\atop q’ This question is a bit strange, because anybody or ··· , instead of being forced to write something who tries to apply TEX to the file texbook.tex im- like ‘frac{2}{3}’ that would have distracted my mediately gets the message ‘ ~\.{This manual is attention from the task at hand. copyrighted and should not be TeXed ’, repeated The questioner seems to want to place burdens endlessly. Therefore the running time to typeset The on all users, rather than on the backs of a few macro- TEXbook has always been infinite. developers.

The TEX tuneup of 2014 8 TUGboat, Volume 35 (2014), No. 1

(3) Why is the default rule thickness 0.4 points? [3] Donald E. Knuth, Digital Typography One of the very first things I did when designing (Stanford, : Center for the Study TEX was to choose several publications that repre- of Language and Information, 1999), sented the highest standards of excellence in mathe- xvi + 685 pages. CSLI Lecture Notes, no. 78. matical typesetting, and to “reverse engineer” them The second printing (2012) contains numerous by making careful measurements of those fine works. corrections. (See [3], page 620.) The thickness of rules in The [4] Donald Knuth, “The TEX tuneup of Art of Computer Programming was definitive for me. 2008,” TUGboat 29 (2008), 233–238. http: I also knew that Belfast Universities Press was using //tug.org/TUGboat/tb29-2/tb92knut.pdf. that value in its typesetting of mathematical journals [5] Donald E. Knuth, “An earthshaking in 1977. announcement.” TUGboat 31 (2010), This question, however, is related to the one sore 121–124. http://tug.org/TUGboat/tb33-3/ point with respect to which I wish that I could turn tb105knut.pdf. back the clock and redesign TEX from scratch: The [A] Donald E. Knuth, The T Xbook (Reading, actual default rule thickness in TEX is not exactly E 0.4 printer’s points; it is exactly 26214 scaled points, Mass.: Addison–Wesley, 1984), x + 483 pages. where there are 65536 scaled points to every printer’s Also published as Computers & Typesetting, point. Thus the default rule thickness is actually Volume A. Currently in its 34th printing 0.399993896484375 points. (paperback) and 19th printing (hardcover). I made the foolish mistake of using binary frac- [B] Donald E. Knuth, Computers & Typesetting, tions internally, while providing approximate decimal Volume B, TEX: The Program (Reading, equivalents in the user interface. I should have de- Mass.: Addison–Wesley, 1986), xvi + 594 pages. fined a scaled point to be 1/100000 of a printer’s Currently in its 9th printing (hardcover). point, thereby making internal and external repre- [C] Donald E. Knuth, The 89:;<=>:book sentations coincide. This anomaly, which is discussed (Reading, Mass.: Addison–Wesley, 1986), further in [5], is the only real regret that I have today xii + 361 pages. Also published as Computers about TEX’s original design. & Typesetting, Volume C. Currently in its 12th printing (paperback) and 8th printing Conclusion (hardcover). The T X family of programs seems to be healthy E [D] Donald E. Knuth, Computers & Typesetting, as it continues to approach perfection. Volunteers Volume D, 89:;<=>:: The Program have been stalwart contributors to this success in (Reading, Mass.: Addison–Wesley, 1986), optimum ways. Stay tuned for The TEX Tuneup xvi + 560 pages. Currently in its 6th printing of 2021! (hardcover). References [E] Donald E. Knuth, Computers & Typesetting, Volume E, Computer Modern Typefaces [1] The Bank of San Serriffe, account balances. (Reading, Mass.: Addison–Wesley, 1986), See http://www-cs-faculty.stanford.edu/ xvi + 588 pages. Currently in its 7th printing ~knuth/boss.html (accessed January 2014). (hardcover). [2] Donald E. Knuth, A torture test for 89:;<=>:. Stanford ⋄ Donald Knuth Report 1095 (Stanford, California: Stanford http://www-cs-faculty.stanford. University Computer Science Department, edu/~knuth January 1986), 78 pages.

Donald Knuth TUGboat, Volume 35 (2014), No. 1 9

Making Lists: A Journey into decorate the text of the work in any way that we con- Unknown Grammar sider necessary: bold, italic, indenting, white space, bullets, numbers, table rules, illustrations, and so on. James R. Hunt Navigational devices, such as headings and sub- Abstract headings, page numbers, captions for figures and tables, tables of contents, lists of figures and tables, Textbooks on technical writing, and academic, cor- and headers and footers may be applied to the text porate and other style guides, often prescribe rules as the writer considers necessary. These navigational for lists that result in basic grammatical errors. Item- devices may be, but are not necessarily, made from ised and enumerated lists are grammatically different, text elements, and may be, but are not necessarily, and errors arise when the rules for one type of list complete sentences. No matter what they are, text are used in constructing the other type. Examples decorations are designed to assist comprehension of of correct and incorrect usage are given, and ways of the text, and navigational devices are designed to avoiding errors are described. The strict application assist in finding specific material in the text. Neither of grammatical rules to list construction reveals some text decorations nor navigational devices are part of interesting limitations of the list form. the text itself. 1 Introduction We are all familiar with numbered and bulleted lists, 1.1.3 Ellipsis and use them often. After ordinary paragraphs, lists Ellipsis is the omission of elements recoverable from are perhaps the commonest devices used by tech- the context, and can involve punctuation as well nical and scientific writers for arranging text on a as words. Restoring the punctuation and missing page. It is an unfortunate fact that textbooks on words (usually conjunctions like and) will produce a technical writing, corporate style guides, and even complete, grammatically correct sentence. style guides promulgated by learned societies, often Elliptical sentences should be constructed care- prescribe rules for lists that result in grammatical fully, in such a way that the reader of the material errors. Some of the errors produced by these rules, can without effort reconstruct the full version, and such as prescribing sentences without correct termin- not gain any conscious impression that the material ating punctuation, are basic indeed. The purpose of is grammatically incorrect. this article is to examine some of these errors, and Elliptical sentences are often used to reduce the try to find ways to avoid them. apparent complexity of sentences in technical works. The approach adopted here is an axiomatic one: In particular, elliptical sentences are commonly used a number of assumptions about the desirable prop- in list constructions. erties of a technical text are made, and the con- sequences of those assumptions are examined. 1.1.4 There is No Such Thing as a 1.1 Preliminaries and Basic Assumptions Sentence Fragment First, some preliminaries and plausible basic rules or Some writers refer to incomplete or otherwise un- axioms. grammatical sentences as sentence fragments. Now sentence fragment is not really a useful concept: in- 1.1.1 Writing in a Formal Register complete or ungrammatical sentences are not accept- Technical works are usually written in a formal re- able in the body of a technical work, which must be gister. 
It will be assumed here that a formal docu- written in complete sentences or easily-reconstructed ment is one that is written in grammatically correct, elliptical sentences. complete sentences, and is correctly punctuated. The There is no requirement for the words in chapter actual quality and style of the language used is not and section headings and captions to constitute com- under consideration. plete sentences, and they usually don’t. Such head- ings are only navigation devices, designed to help 1.1.2 Improving Comprehension readers to find their way around a complicated doc- A technical work must be written in complete, gram- ument. Such navigational devices could be called matically correct sentences. These sentences can be sentence fragments, but since sentence fragments are typographically arranged on a page or screen in any limited to navigation devices, there seems to be no way that helps the reader to understand the material. real need for a separate name: we could simply refer We can, in the interests of clarity and comprehension, to headings and captions.

Making Lists: A Journey into Unknown Grammar 10 TUGboat, Volume 35 (2014), No. 1

1.1.5 Sentences Cannot be Nested If the brackets and letters are left out, the sentence In the English language, sentences cannot be nested, is grammatically correct, and a little easier to read. that is, a complete sentence cannot be placed inside The bracketed letters add nothing to our understand- another sentence as a standalone entity. (It is of ing, and are merely extraneous decoration. course possible to construct elaborate sentences that 2.2 Identifying a List contain parts that would be complete sentences if they were written out separately, but that is not the The items of a list must have some common property: same as a nested sentence.) for example, they must all relate to a single idea or task; and the bullets or numbers must in some way 1.1.6 A Basic Rule for Writing improve the clarity of the material being presented. This idea will be progressively refined here. Any writing rule that results in grammatical error is useless, and must be discarded. The application 3 Terminology of this rule to the construction of numbered and LAT X recognises three types of lists: enumerated bulleted lists leads to some surprising conclusions. E lists, described in Section 3.2, itemised lists, de- definition 2 How Do You Know It’s a List? scribed in Section 4, and lists. The definition list is something of an oddity, Bullets or numbers do not make a list. Word pro- because it appears to be nothing more than a table cessors and text formatters can generate tagging in disguise. The definition list is considered further symbols or sequential numbering for any type of text in Section 8. item that you can specify, and you can place those symbols anywhere in a text. However, these symbols 3.1 Other Names for Itemised Lists and numbers may not serve the purpose of assisting Itemised lists are sometimes referred to as bulleted in comprehending the ideas being presented. lists or unordered lists. The term bulleted list is common but not all that useful, because bulleted 2.1 False Lists lists do not necessarily have text bullets to mark their There are lists, and there are false lists. A false items: items may be marked by dashes, graphics, or list is a collection of items that do not really belong not marked at all but merely set out on separate together, but have bullets or numbers at the left on lines. The term unordered list is often used as a more their first lines. In the Age of PowerPoint, false lists general term than bulleted list. are very common, because it is so easy to add symbols The term unordered list is not particularly use- to text. However, the bullet symbols, pointing hands, ful either, since the items of the list are arranged on marching ants, or numbers quite often add nothing the page in some order that the author considered to the presentation: plain, undecorated text would useful. The important point is that the order does have been more informative. not matter, in the sense that the items can be re- Bad Example 1 shows a false list, of a kind arranged without changing the meaning of the text, commonly found in PowerPoint presentations. even if the clarity of the presentation is reduced as a consequence. Bad Example 1: Vertical False List 3.2 Enumerated Lists Topics for today An enumerated list is used to set out sequential • What is the Automatic Manual Writer? instructions, or to list components or cases, or to • Five easy steps to success indicate the order of importance of cases. 
The items • How it works in enumerated lists are marked by numbers or letters, or other obviously sequential symbols. Setting out sequential instructions is fundament- False lists are not always set out in a neat ver- ally different from the other two uses: if the enumer- tical alignment, as Bad Example 2 shows. ated list comprises components or cases, then the order of the items in the list does not matter, because Bad Example 2: Horizontal False List the items can be rearranged without changing the meaning of the presentation. Sometimes the items in There are three options available, namely, (a) running an enumerated list, such as a list of component parts away; (b) staying and hiding; and (c) staying and being of an assembly, can be changed at random without brave. reducing the clarity of the presentation.

James R. Hunt TUGboat, Volume 35 (2014), No. 1 11

4 Itemised Lists Bad Example 3: Incorrect Use of Initial Capitals Itemised and enumerated lists appear to be similar, but there is a fundamental difference between the This section gives guidelines for: two types, and each has its own rules of construction. • Creating lists; Itemised and enumerated lists are often confused by • Punctuating lists; and technical writers, in the sense that the rules for one • Creating embedded lists. type are often applied to the other. An itemised list is simply a visual device for displaying a single, usually complex, sentence on a page. The bullets and other typographical devices, 4.2 Omitting Punctuation such as indents and line spacing, are not part of the Itemised lists are often written in unpunctuated, or syntactical structure of the sentence itself: they serve elliptical, form. Consider the itemised list shown in only to assist comprehension. It follows from the Example 3. (Some textbook writers insist that an basic premise stated in Section 1.1.2 that a sentence elliptical list like this should have a final . written as an itemised list should still be grammat- But why would you bother?) ically correct when the visual devices are removed. Consider the itemised list shown in Example 1. Example 3: Itemised List, Unpunctuated (Elliptical) Example 1: Simple Itemised List Three colours are available: Three colours are available: • red • red, • green • green, and • blue • blue.

4.2.1 Ellipsis This itemised list is only a typographical re- The unpunctuated list in Example 3 is derived from arrangement of the following sentence. the fully punctuated version by the process of ellipsis. Three colours are available: red, green, and blue. Some textbooks on technical writing refer to an itemised list formed by ellipsis as a list of sen- 4.1 Commas and Semicolons tence fragments, but, as was pointed out in Subsec- In order to avoid ambiguities arising from the re- tion 1.1.4, the concept of a sentence fragment is not peated use of the word and in more complicated ex- useful. amples, we could adopt a convention that sentences If a grammatically correct version cannot be are to be subdivided by semicolons, not commas, and constructed, then we are dealing with a false list. thus write out itemised lists in the form shown in Many corporate style guides actually specify the Example 2. that the elliptic form of an itemised list must be used in place of the fully punctuated original version. Example 2: Itemised List with Semicolons There is of course nothing wrong with this specifica- tion, as long as we are aware that the elliptical form Three colour combinations are available: is a derivative and not the full version. • red, white, and blue; • blue, green, and yellow; and 4.2.2 Ellipsis and Parallel Construction • green, orange, and white. Many style guides specify that the items in an item- ised list should show parallel construction, that is, the items should be syntactically similar. Parallel This is an example of the classic, fully punctu- construction is useful because it makes it easier to ated version of the itemised list. understand the material presented, and easier to Since an itemised list comprises only one sen- recover the full form of a list from the elliptical form. tence, the word immediately following an item deco- ration must not be capitalised unless it is a proper 5 Enumerated Lists noun of some form. Bad Example 3 illustrates a An itemised list is a visual device for displaying a common but incorrect usage. single sentence. In contrast, an enumerated list, in

Making Lists: A Journey into Unknown Grammar 12 TUGboat, Volume 35 (2014), No. 1

numbered or lettered form, is a visual device for but never with a colon. This remarkably common displaying a passage of text comprising a number of error arises from a confusion of the layout rules for sentences. The sentences in the text may, but do an enumerated list with those for an itemised list. not necessarily, collectively describe a sequence of events in time or space. Sometimes the form of a 5.2 Decorating the Numbers enumerated list is used to indicate the number of The numbers beside the items in an enumerated components in a collection, or cases under considera- list are decorations — visual devices that assist the tion. This is usually clear from the context. reader in understanding the material — and have no If you intend to construct cross-references to syntactical meaning at all. Any full stops, brackets, individual list items, then those items should be part bolding, animation or other devices added to the of an enumerated list. numbers are, from a syntactical point of view, also The commonest example of an enumerated list decorative. The order of the actions is determined is a set of instructions that must be performed in a by the order of the sentences, not by the numbers fixed time order. Consider the set of instructions in on the page: the numbers serve only to reinforce the Example 4. sequence in the reader’s mind. The use of full stops or brackets after the num- Example 4: Instructions in Narrative Form bers or letters could suggest to a reader a syntactical The instructions for servicing the device are as follows. structure that does not actually exist, and this is one Open the top panel of the veeblefetzer. Insert the screw- reason why such decorations of the basic numbers driver into the slot at the left. Turn the screw clockwise or letters should perhaps be avoided. This is more until the pressure is released. Close the top panel. easily said than done: most writing software will insert these decorations (for example, full stops after item numbers) automatically. We usually present a set of instructions like this Sometimes, you may need to make the numbers as a numbered or lettered list with an introductory quite prominent: for example, lists of instructions in sentence, or heading, or both. The heading is only a user guide may be embedded in masses of explan- text decoration, and as such need not be either gram- atory material. matically correct or punctuated. The introductory sentence must of course be a complete sentence, end- 6 Complications and Restrictions ing with a full stop (or question mark, or exclamation Strict application of the rules produces some inter- mark, if appropriate). An example of an enumerated esting results. These results are described in more list with a heading and an introductory sentence is detail in following sections. shown in Example 5. An individual item in an itemised list may have Example 5: Enumerated List with Heading another, subsidiary itemised list attached to it. (See and Introduction Subsection 6.1.) Putting an explanatory paragraph after an item Servicing the Device in an itemised list is a grammatical error. (See Sub- To perform a routine service, carry out the following section 6.2.) steps. An individual item in an enumerated list may 1. Open the top panel of the veeblefetzer. have an itemised list attached to it. (See Subsec- 2. Insert the screwdriver into the slot at the left. tion 6.3.) An individual item in an itemised list cannot 3. Turn the screw clockwise until the pressure is re- leased. 
have a subsidiary, enumerated list attached to it. (See Subsection 6.4.) 4. Close the top panel. 6.1 Itemised Lists within an Itemised List It is possible, and quite common, to insert secondary 5.1 No Colon Introducing an itemised lists under the individual items in another Enumerated List higher level list. There are even widely accepted Many writers end the introductory sentence before an rules about the selection of bullet points: round enumerated list with a colon. This is a grammatical bullets at the first level, dashes at the second level, error: every sentence in a technical work must be and so on. Recall that an itemised list comprises complete, and every sentence must end with either a only one sentence: it follows that all of the items in full stop, a question mark, or an exclamation mark, the list, considered together, must constitute only

James R. Hunt TUGboat, Volume 35 (2014), No. 1 13

one sentence. Attempting to follow this rule could easily result in complicated constructions that lack clarity. Example 6 shows a two-level itemised list that follows the rule and is still clear. (A sentence that can be displayed as a three-level itemised list would be rather complicated, at best.)

Example 6: Itemised Lists within an Itemised List
The colour of the body of the device may be:
• a primary colour, which may be one of:
  – red;
  – yellow; or
  – blue; or
• a pastel colour, which may be one of:
  – pink;
  – pale yellow; or
  – azure; or
• a neutral colour, which may be one of:
  – beige;
  – bone; or
  – ecru.

6.2 Explanatory Paragraphs in Itemised Lists Not Possible

It is common practice to insert explanatory paragraphs after items in itemised lists. This is a grammatical error: an itemised list comprises only one sentence, and it is not possible to nest other sentences within that sentence.

6.3 Itemised List After Numbered Item

In an enumerated list, it is possible to have two or more sentences after each number, if the writer thinks this necessary — and of course any of those sentences may be displayed as an itemised list if appropriate (that is, bullet points may follow a numbered list item).
Some of those following sentences may be displayed with their own ordering symbols (that is, a subsidiary enumerated list may follow a numbered list item). This is common, and presents no conceptual problems.

6.4 No Numbered Lists After a Bullet Item

We can place an itemised list after a numbered item in an enumerated list. Can we, conversely, place an enumerated list after a bulleted item in an itemised list? As was the case with the legendary ski resort full of girls looking for husbands and husbands looking for girls, the situation is not as symmetrical as it may at first appear.
Recall that the bulleted items form a single sentence, and each enumerated item contains one or more complete sentences. Sentences cannot reasonably be nested within other sentences, and so an enumerated list cannot be placed after a bulleted item without violating grammatical rules.
It is common in technical works to use bulleted items as a variety of unnumbered heading, with a bulleted item, often in bold type, followed by explanatory paragraphs. On reflection, it is hard to think of any reason why you would want to do this — why would you lay out sets of instructions on one or more pages, and imply that the order in which those sets are presented to the reader does not matter? The best solution to this problem is to avoid it: possibly by rewriting the bulleted items as headings followed by text, with any enumerated lists in the text left to stand alone.

7 Pathologies

Technical works often contain strange list constructions: enumerated lists disguised as itemised lists, itemised lists disguised as enumerated lists, and hybrid constructions and monsters. These errors often proceed from a failure to understand the basic difference between enumerated and itemised lists.

7.1 Enumerated List Disguised as an Itemised List

The most common error appears to be displaying an enumerated list in the guise of an itemised list.
Bad Example 4, taken from a corporate style guide, appears to be an itemised list, but it actually contains four sentences, one of which is not correctly terminated.

Bad Example 4: Enumerated List Disguised as an Itemised List
Use the following guidelines for creating appendices:
• List the appendices in the table of contents.
• Refer to the appendices in the preface.
• For each appendix, provide an introductory paragraph.

The list shown in Bad Example 4 should be presented as an enumerated list with an introductory paragraph, as shown in Example 7. The numbers in Example 7 do not necessarily specify a sequence of actions: they may merely clarify the number of rules to be followed.


Example 7: Improved Version Bad Example 6: An Overloaded List of Bad Example 4 [The] eight main varieties of speech in [...] Use the following guidelines for creating appendices. • Cantonese (Y´ue) Spoken in the south, mainly 1. List the appendices in the table of contents. Guangdong, southern Guangxi, Macau, Hong Kong. 2. Refer to the appendices in the preface. (46 million) • 3. For each appendix, provide an introductory Gan Spoken in Shanxi and south-west Hebei. paragraph. (21 million) • Hakka Widespread, especially between Fujian and Guangxi. (26 million) • Mandarin A wide range of dialects in the northern, 7.2 Itemised List Disguised as an central, and western regions. North Mandarin, as Enumerated List found in Beijing, is the basis of the modern standard Bad Example 5 appears to be an enumerated list, but language. (720 million) the numbers serve no purpose: they do not indicate • Northern Min Spoken in north-west Fujian. (10 mil- steps to be taken, or a sequence in which items may lion) be used, or a number of items (how many items are • Southern Min Spoken in the south-east, mainly included in “its usual accoutrements”?), or an order in parts of Zhejiang, Fujian, Hainan Island, and of importance. Taiwan. (26 million) • Wu Spoken in parts of Anhui, Zhejiang, and Jiangsu. Bad Example 5: Itemised List Disguised as an (77 million) Enumerated List • Xian (Hunan) Spoken in the south-central region, in Hunan. (36 million) The items enclosed in the Vampire Protection Kit are as follows. 1. An efficient pistol with its usual accoutrements. 2. Silver bullets. The material in Bad Example 6 would be better 3. An ivory crucifix. displayed in a three-column table—possibly with 4. Powdered flowers of garlic. itemised lists in some of the table cells. 5. A wooden stake. 6. Professor Blomberg’s new serum. 8 Definition Lists The definition list, mentioned in Section 3, is a differ- The information in Bad Example 5 is better ent kind of list, because the elements of the list are presented in an itemised list, as shown in Example 8. neither itemised nor enumerated. Instead, an item Example 8: Improved Version in a definition list comprises two parts: a term and of Bad Example 5 an explanation of the term. These two parts of the item are usually distinguished typographically, and The items enclosed in the Vampire Protection Kit may, but do not necessarily, appear on the same line. are as follows: Most word processors do not offer the definition • an efficient pistol with its usual accoutrements; list as an option on their drop-down menus, because a • silver bullets; two-column table without a caption, rules, or headers • an ivory crucifix; (that is, an informal table) does much the same job. • powdered flowers of garlic; The definition list may be used for setting out • a wooden stake; and glossary items, as shown in Example 9. (This mater- • Professor Blomberg’s new serum. ial was derived from [1].)
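Returning to Example 8 for a moment: for TUGboat readers, the corrected list translates directly into standard LaTeX markup. A minimal sketch (shortened from Example 8): the itemize environment supplies only the bullets, so the semicolons, the final "and", and the closing full stop that make the list one sentence are typed as part of the items.

The items enclosed in the Vampire Protection Kit are as follows:
\begin{itemize}
  \item an efficient pistol with its usual accoutrements;
  \item silver bullets;
  \item an ivory crucifix; and
  \item Professor Blomberg's new serum.
\end{itemize}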

8.1 Uses of the Definition List 7.3 Sometimes, a Table is a Better Idea Definition lists may be used to set out definitions and It is possible to put too much material into a list, and construct glossaries. Other possible uses of definition sometimes the material would be clearer if it were set lists are not so obvious: for example, a definition out in a table. Bad Example 6, which displays a list list could be used as a way of setting out otherwise with too much material, was taken from a popular awkward one-bullet lists, which are banned by some work on linguistics [2]. style guides.
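As an aside for LaTeX users: the standard description environment gives a reasonable approximation of a definition list (the exact wraparound layout of Example 9 needs extra setup, and the terms below are placeholders). A minimal sketch:

\begin{description}
  \item[Itemised list] A displayed list whose items are marked with bullets rather than numbers.
  \item[Enumerated list] A displayed list whose items are numbered, so the numbers can carry meaning.
\end{description}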


Example 9: A Definition List Scribe could not handle tables, but LATEX could, to a limited extent. The only tables that LAT X could Klingon This is perhaps the most fully realised science E handle at first were less than one page in length, and fiction language. Klingon has a complete grammar floats and vocabulary, and countless nerds have learned it those tables had to be , where the position of like high-school French or German. the table in the final print version of the document Qwghlmian From Neal Stephenson’s Cryptonomicon was determined by an algorithm that positioned the novel and Baroque Cycle trilogy, this fictional lan- table so that it did not extend over a page break. guage is allegedly spoken on obscure British islands. While that one-page limitation existed, the definition The language has sixteen consonants and no vowels, list was still useful, because it could be used to lay out and is thus ideal for representing binary informa- tabular material that occupied more than one page. tion—and nearly impossible to pronounce. R’lyehian This other-worldly, barely speakable lan- 9 Conclusions guage is part of the Cthulhu mythos (introduced Applying a few simple rules to itemised and enumer- in the classic Lovecraft short story The Call of ated lists leads to some unexpected conclusions. In Cthulhu). particular, itemised lists are much more limited in Sindarin While Tolkien created several languages for scope than they at first appear. In many instances, his various Lord of the Rings books, Sindarin, the enumerated lists are much more useful. language of the elves, is not only his most beautiful but also his most fully realised invented language. Definition lists are not used by technical writers as much as they could be.

References 8.2 No Definition List in Common Word Processors [1] John Baichtel. Top ten geekiest constructed languages. 2009. http://www.wired.com/ Constructing a definition list with the wraparound A geekdad/2009/08/top-ten-geekiest- layout shown in Example 9 is simple enough in LTEX, constructe-languages/. but common word processors do not offer menu items How Language Works relating to definition lists. Two-column informal [2] David Crystal. . Avery, tables will usually be a good approximation. New York, 2007. [3] Microsoft Corporation. Microsoft Manual of 8.3 History of the Definition List Style for Technical Publications, Third Edition. The definition list is defined in HTML, in the DITA Microsoft Press, Sebastopol, Calif., 2004. (Darwin Information Typing Architecture: an XML data model for authoring) specification of 2005, in A ⋄ the DocBook specification of 1990, and in the LTEX James R.Hunt specification of 1986; it appears to have originated P. O. Box 580 in the now almost-forgotten Scribe text formatter, Mt Gravatt ca. 1978. The aim of the author of Scribe was to Queensland 4122 Australia provide a simple way of coding command descriptions writerlyjames (at) gmail dot com in programming manuals.


In memoriam Jean-Pierre Drucbert Letters Jean-Michel Hufflen Jean-Pierre Drucbert passed away in March 2009 . . . quietly . . . as he lived . . . He was discreet yet ef- Does not suffice to run latex a finite number ficient. . . in his own way. . . I personally worked in of times to get cross-references right 1 the same building as him, for two years, from 1989 Jaime Gaspar to 1991. At that time, I got by using LAT X, but had E A only a little experience, compared with my present Abstract. We present a LTEX file such that a ability. So several times I hesitated to start in im- cross-reference is wrong no matter how many times plementing some functions and asked him for some we run latex. − − ∗ − − advice. At the time, he told me: ‘That’s not obvi- It is well-known that we need to run latex several ous.’ Or: ‘That’s difficult.’ But a little while after times to get cross-references right. This raises a that, he waved discreetly to me to join him. . . and natural question for : for any LAT X he implemented the commands. E file, does it suffice to run latex a finite number of I had met him after hearing about his transla- times? We show that the answer is negative, by a tion of Leslie Lamport’s original LAT X book, incor- E counterexample. The LAT X file porating much additional information. As far as I E 2 A \documentclass{article} know, this work was one of the first complete LTEX manuals in French, if not the first. It has only been \usepackage{forloop} \begin{document} distributed privately; I personally thought that was \newcounter{n} a pity. Several times I suggested to Jean-Pierre that \forloop{n}{0} he should submit it to a publisher. But he was not {\value{n} < \pageref{l}}{~\newpage} interested in the limelight, it was not in his char- Last-page label here\label{l}. acter. He was solitary, as if he bore some secret Label value: \pageref{l}. and deep-rooted pain, some dead-end despair. But \end{document} he was always ready to help other people within his 3 is such that the cross-reference \pageref{l} is wrong activities’ scope. He left many packages as a re- no matter how many times we run latex. This markable testimony of that. file uses a little diabolical trick: a label l is cre- ated in the last page (line 7) and there are cre- ⋄ Jean-Michel Hufflen ated (resorting to a for loop) \pageref{l} many FEMTO-ST & University of Franche-Comté, new pages (lines 5 and 6), causing the document to 16 route de Gray, have \pageref{l} + 1 pages, so the cross-reference 25030 Besançon Cedex, \pageref{l} to the last page is wrong. (An even France more diabolical counterexample that avoids a for loop is shown at http://tex.stackexchange.com/ questions/30674.) 1 At CERT (‘Centre d’Études et de Recherches de Tou- louse Acknowledgement. At the time of writing: ’, that is, ‘Toulouse study and research centre’). 2 2 πr Jean-Pierre Drucbert : Utilisation de LATEXetBIBTEX INRIA Paris-Rocquencourt, , Univ Paris Diderot, sur Multics au CERT. August 1988. CERT, IT-service group. Sorbonne Paris Cit´e, F-78153 Le Chesnay, France; 3 See http://ctan.org/author/drucbert for a complete financially supported by the French Fondation list including ‘original’ implementations and adaptations to French of existing ones. Sciences Math´ematiques de Paris. ⋄ Jaime Gaspar Universitat Rovira i Virgili Department of Computer Engineering and Av. Pa¨ısos Catalans 26 E-43007 Tarragona, Catalonia jaime.gaspar (at) urv dot cat; Centro de Matem´atica e Aplica¸c˜oes (CMA), FCT, UNL TUGboat, Volume 35 (2014), No. 1 17

Fetamont: An extended logo typeface well-known logos of METAFONT, METAPOST and METATYPE1. Linus Romer 3 The many faces of fetamont Abstract It is clear that thanks to the power of METAFONT METAFONT The logo font, known from logos like or the number of possible faces is endless. However, METAPOST , has been very limited in its collection Fetamont comes in a mere 36 predefined faces. The of glyphs. The new typeface Fetamont extends the suffixes of every face are schematically listed in the logo typeface in two ways: following table: • Fetamont consists of 256 glyphs, such that the Upright Oblique T1 (a.k.a. EC, a.k.a. Cork) encoding table is r8 b8 h8 o8 bo8 ho8 complete now. r9 b9 h9 o9 bo9 ho9 • Fetamont has additional faces like “light ultra- l10 r10 b10 h10 lo10 o10 bo10 ho10 condensed” or “script”. Condensed Upright Condensed Oblique lc10 c10 lco10 co10 The fetamont package provides LATEX support for the Fetamont typeface. Both the package and the bc40 bco40 typeface are distributed on CTAN under the terms Ultracond. Upright Ultracond. Oblique of the LATEX Project Public License (LPPL). lq10 lqo10 The following article presents some facets of the Script Upright Script Oblique Fetamont typeface, explains important techniques lw10 w10 bw10 hw10 lwo10 wo10 bwo10 hwo10 and shows the history of the logo typeface. Anyone wishing to design a new face for Fetamont can do so by just redefining the parameters of ffmr10.mf 1 Comparison with existing logos and saving the file under a new name. METAPOST The following picture shows the and the 3.1 Script faces METAFONT logos written in Fetamont (gray) and Taco Hoekwater’s Type 1 version of the logo font The Fetamont script faces make use of randomized (outlined). paths that are drawn by a rotated ellipse pen to make it look more handwritten. They may be used METAFONT for comics or children’s texts:

METAPOST tIMe mAcHinE There are hardly any differences; only the “S” is significantly different, because its shape was changed by D. E. Knuth in 1997 (see section 6). The other Since version 1.3, the OpenType versions of the script faces of Hoekwater’s Logo are also very similar to faces even support the “Randomize” feature. their corresponding Fetamont faces. Widths and 3.2 Condensed faces kernings may rarely differ by one unit (except for the “A” in Logo 9, which has a strange width). The titles in Knuth’s books show a variant of the A comparison with the METATYPE1 logo from logo font that blends with Computer Modern Sans Jackowski, Nowacki, and Strzelczyk (2001) shows Serif Demibold Condensed 40. So I decided to add virtually no differences as well.1 this variant as Fetamont Bold Condensed 40 and let also a light and medium variant benefit from the condensation because it looked so good. METATYPE1 Light Condensed 10 2 Comparison with the mflogo package Medium Condensed 10 The control sequences defined in the fetamont pack- age are analogous to those from the mflogo pack- Bold Condensed 40 age (Vieth, 1999). \MF, \MP and \MT produce the 3.3 Ultracondensed Face 1 I have never seen the original sources of the “Y” and the “1” but I think that my imitated “Y” and “1” are extremely I went even a step further and created an ultracon- close to the original. densed face for Fetamont. The credits written on


movie posters are often typeset in such ultracon- stores anchors as ordinary points in arrays. Fur- densed faces: thermore, the character picture is stored in another theMOSTIMPORTANTTHINGinthePROGRAMMINGLANGUAGEistheNAME.a array at the end of each character. For combined LANGUAGE WILL not SUCCEED WITHOUT a GOOD NAME. i HAVE RECENTLY IN- characters, METAFONT looks at the different arrays VENTEDaVERYGOODNAMEandNOWiamLOOKINGFORaSUITABLELANGUAGE. and then overlays the base and the accent character using the macro addto currentpicture. (This is said to be a quotation from D. E. Knuth.) 5.2 Kerning classes with METAFONT 4 Spacing problems with the “S” Like anchor pairing, the concept of kerning classes The original spacing of the letter “S” fits perfectly for is widely known but not frequently used in META- METAPOST the combination “OST” as in . However, FONT. The reason for this is that METAFONT cannot in combination with normal letters like “N”, the “S” natively write kernings for multiple characters at is positioned too much to the right (see the left part once. Hence, multiple kerning information has to be of the following picture). cached in arrays. Let me illustrate the general idea of this caching NSN NSN with a fictional example: Thus, I shifted the “S” a bit to the left (see the right part of the upper picture) and added corresponding 5.2.1 Define kerning classes kerning instructions for “OS” and “ST” to keep the It is clear that “OV” needs the same kerning as “DV”. spacing of the original logo intact. But beware, “VO” needs a different kerning than “VD”! So generally there are two kinds of kerning 5 Special techniques classes: 5.1 Anchor pairing with METAFONT • The first kerning class groups glyphs together In order to draw accented and other combined char- that share the same shape to the right, like “D” acters, it is helpful to use anchors. The concept and “O” of anchors is common in type design outside of the • The second kerning class groups glyphs together METAFONT world. However, anchors rarely have that share the same shape to the left, like “C” been seen in METAFONT up to now. and “O” The idea is easy: Put an anchor at a given point in a base glyph and in the accent glyph; then The first kerning class is stored in an array as follows:

overlay the two glyphs such that the anchors coincide, row 0 2

producing the pre-composed accented character. row 1 3 68 (D) 79 (O) 214 (O)¨

anchor top (base) row 2 2 86 (V) 87 (W) The 0th column is reserved for the number of columns th anchor top (accent) (0 row) and the number of items in the correspond- + ing row, because there is no straightforward way to determine the length of arrays or subarrays in META- FONT. Each row forms a first kerning class. The characters are stored as codes (for the sake of clarity, they are shown here as letters also). ⇓ The storage of the second kerning classes works analogously, in another array:

row 0 2

row 1 4 67 (C) 79 (O) 80 (Q) 214 (O)¨

row 2 2 86 (V) 87 (W)

5.2.2 Kern the kerning classes The classes are then kerned with the following com- mands: addclasskern("D","C",2u#) Normally several kinds of anchors are needed. E.g. addclasskern("D","V",-u#) “C”´ and “C¸ ” need two different anchors. Fetamont addclasskern("V","C",-u#)


This kerning information is stored in a large three- sources (Knuth, 1979b), which were written in dimensional array, which has as many columns as the obsolete METAFONT78. characters: METAFONT 67 79 80 214 86 87 row 68 6 u  u  u  u  −u  −u  2 # 2 # 2 # 2 # # # Knuth used quite a thick circular pen. This 67 79 80 214 86 87 row 79 6 2u# 2u# 2u# 2u# −u# −u# “manfnt” also contains 8 pt, 9 pt and title ver- 67 79 80 214 sions of the logo typeface. However, the title row 86 4 −u# −u# −u# −u# 67 79 80 214 font is just a magnified version of the 10 pt font. row 87 4 −  −  −  −  u# u# u# u# • 1984/05/27: The sources of the logo font are 67 79 80 214 86 87 row 214 6 2u# 2u# 2u# 2u# −u# −u# rewritten for the new METAFONT84 (Knuth, 1984a). The pen has become elliptic and thinner. 5.2.3 Write the kerning information • 1984/09/09: Second try for the new META- FONT At the very end, the macro writeligtable writes all 84 (Knuth, 1984b). The characters “E” kerning information from the large three-dimensional and “F” have rounded vertices. array, row-wise, in a METAFONT-friendly way. METAFONT

5.3 Producing outlines • 1985/09/03: Knuth (1985e) defines some “crazy shapes”. He then (Knuth, 1986) uses them to METAFONT The sources have been converted to out- demonstrate randomized typefaces. line font formats like Type 1 or OpenType with a Python script. The script calls METAPOST to pro- VWXYZ[\] ffiffl !"#$% duce PostScript files for each glyph. These glyphs are • 1985/09/22: The logo typeface gets a slanted imported by the module. Hosny (2011) variant (Knuth, 1985b), a backslanted skinny previously used this technique to produce the out- bold variant (Knuth, 1985f) and an ultrawide lines of Punk Nova. Because the glyph widths are light variant (Knuth, 1985d). lost in the import, the tfm module from the mftrace

project is also needed (Nienhuys, 2006). BCDGWvwU The shapes of the letters A, E, F, M, N, O, T *.mf have reached their final state (Knuth, 1985c and Knuth, 1985a). METAPOST METAFONT

❄ ❄ • 1985/10/06: Knuth (1985g) uses a logo variant for titles along with Computer Modern Sans *.eps *.tfm Serif Demibold Condensed 40 for the title pages in Knuth (1986). fontforge module tfm module METAFONT ❄ • 1986/01/07: Knuth (1986) specifies a boldface *.sfd variant of the METAFONT logo. This variant goes well with Computer Modern Sans Serif fontforge module Bold: SansMETA ❄ ❄ ❄ • 1989/04/22: Knuth (1989) shows a new special *.otf *.pfb *.afm weight to go with Pandora (see Billawala, 1989). Pandora METAFONT 6 History of the logo typeface • 1989/06/24: Cugley (1989) extends the logo font • 1979/07/12: Knuth (1979a) shows the first ver- to cover the whole uppercase alphabet and a sions of the “Computer Modern” typeface and couple of punctuation characters in the so-called the METAFONT logo. The graphic below is the “mf” font. result of a conversion from the original “manfnt” DAMIAN CUGLEY


• 1992/01/16: Knuth (1992a) adds a demibold Nowacki, and Strzelczyk (2001). The newly variant. This variant is needed for the title “Ex- designed characters “Y” and “1” are different cerpts from the Programs for TEX and META- to the “Y” and “1” in Elogo. FONT” (Knuth, 1992b), which combines Com- METATYPE1 puter Modern Bold Extended with the logo font. METAFONT Funnily, the EuroTEX 2001 conference logo also TEX and embeds an extended logo font (Pepping, 2001). • 1993/03/23: Knuth (1993a) adds the characters “P” and “S” to the logo font such that one can typeset “METAPOST” with it. Hobby (2009) states that both of the following logos are okay: • 2011/06/02: For the documentation of another METAPOST MetaPost typeface, I wanted to write “mf2pt1” in the logo font, because I was heavily relying on this pro- • 1997/04/20: Knuth (1993b) changes the shape gram back then. Having realised that there were of the “S”. “It now sort of assumes that a ‘T’ other tools related to METAFONT that could not will follow” (Beeton, 1998). be written in the logo font, I started extending the logo font. MF2PT1 MFLua MetaFog

References • 1997/09/30: The American Mathematical So- AMS. “Computer Modern PostScript Fonts”. ciety provides Type 1 versions of the logo font www.ams.org/amsfonts, 2009. in the following styles: logo8, logo9, logo10, Beeton, Barbara. “Editorial comments”. logobf10, logosl10 (AMS, 2009). The “P” TUGboat 19(4), 1998. tug.org/TUGboat/ and the “S” are not included and the widths of tb19-4/tb61beet.pdf. the supplied glyphs are often wrongly rounded. Billawala, Nazneen N. “Metamarks: Preliminary Studies for a Pandora’s Box of Shapes”. Technical Report STAN-CS 1256, Stanford 78 77 University, 1989. Cugley, Damian. “A character font more than a little reminiscient of the METAFONT logo font by D. E. Knuth”. mirror.ctan.org/fonts/ • 1999/6/03: Taco Hoekwater releases new Type 1 utilities/mff-29/mf.mf, 1989. versions of the logo font (Hoekwater, 1999). The METAPOST conversion has been done in MetaFog, which is Hobby, John. “The page”. ect. bell-labs.com/who/hobby/MetaPost.html, only available with TrueTEX. The “S” still has its old shape. The round endings are sometimes 2009. suboptimal (e.g. the “F” from Logo 10 ) but all Hoekwater, Taco. mirror.ctan.org/fonts/ in all, the conversion is very clean. mflogo/ps-type1/hoekwater, 1999. Hoekwater, Taco. “A new metafont (beta test)”. comments.gmane.org/gmane.comp.tex. context/4259, 2001. Hoekwater, Taco. “METAPOST Developments”. • 2001/03/04: Taco Hoekwater presents Elogo, an www.luatex.org/talks/ extended version of the logo font, which covers tug2007-taco-.pdf, 2007. the 7-bit T X text encoding (Hoekwater, 2001). E Hosny, Khaled. https://github.com/ The font has been used in some of Hoekwater’s khaledhosny/punk-otf/blob/master/ presentation slides (e.g. the following image is tools/build.py, 2011. extracted from Hoekwater, 2007). The “S” still has the old shape. Jackowski, Bogus law, J. M. Nowacki, and P. Strzelczyk. “METATYPE1: A METAPOST-based engine for generating Type 1 • 2001/09/26: At EuroTEX 2001 the METATYPE1 fonts”. www.ntg.nl/eurotex/JackowskiMT. logo is shown in the presentation of Jackowski, pdf, 2001.


Knuth, Donald E. TEX and METAFONT: Knuth, Donald E. The METAFONTbook. New directions in typesetting. American Addison-Wesley, 1986. Mathematical Society, 1979a. Knuth, Donald E. “10-point METAFONT logo, Knuth, Donald E. “Special font for the special weight to go with Pandora text”. METAFONT manual”. www.saildart. www.saildart.org/LOGN10.MF[NB,DEK], 1989. org/MANFNT.MF[MF,DEK]1, 1979b. Knuth, Donald E. “10-point demibold METAFONT Knuth, Donald E. “Letters for the METAFONT logo”. mirror.ctan.org/systems/knuth/ logo; first try with the new MF”. www. local/lib/logod10.mf, 1992a. saildart.org/LOGO.MF[FNT,DEK]1, 1984a. Knuth, Donald E. . CSLI, Knuth, Donald E. “Letters for the METAFONT 1992b. logo; second try with the new MF”. www. Knuth, Donald E. “Routines for the METAFONT saildart.org/LOGO.MF[FNT,DEK], 1984b. logo, as found in The METAFONTbook”. Knuth, Donald E. www.saildart.org/METAFO. ftp://cs.stanford.edu/pub/concretemath. MF[MF,SYS], 1985a. errata/fonts/logo.mf, 1993a. Knuth, Donald E. “10-point slanted METAFONT Knuth, Donald E. “Routines for the METAFONT logo”. www.saildart.org/LOGL10.MF[MF,SYS] logo, as found in The METAFONTbook”. 1, 1985b. ftp://cs.stanford.edu/pub/tex/dist/lib/ Knuth, Donald E. “Driver file for the METAFONT logo.mf, 1993b. logo”. www.saildart.org/NLOGO.MF[MF,SYS], Nienhuys, Han-Wen. https://github.com/ 1985c. hanwen/mftrace/blob/master/tfm.py, 2006. Knuth, Donald E. “Fat version of METAFONT Pepping, Simon, editor. EuroTEX 2001 proceedings. logo”. www.saildart.org/FLOGO.MF[MF,SYS], NTG, 2001. 1985d. Vieth, Ulrik. “The mflogo package”. mirror.ctan. Knuth, Donald E. “Font for examples in org/macros/latex/contrib/mflogo/mflogo. Chapter 21 of The METAFONTbook”. pdf, 1999. www.saildart.org/RANDOM.MF[FNT,DEK]1, 1985e. ⋄ Linus Romer Knuth, Donald E. “Skinny variant of METAFONT Oberseestrasse 7 logo”. www.saildart.org/SKLOGO.MF[MF,SYS], Schmerikon, 8716 1985f. Switzerland linus.romer (at) gmx dot ch Knuth, Donald E. “Special font for the TEX and METAFONT manuals”. www.saildart.org/ LOGO.MF[MF,SYS]1, 1985g.


LATEX3 News Issue 9, March 2014

Contents the audience that it’s not so weird after all. In our ex- Hiatus? 1 perience, anyone that’s been exposed to some of the expl3 in the community 1 more awkward expansion aspects of TEX programming appreciates how expl3 makes life much easier for us. A Logo for the LTEX3 2 expl3 in the community Recent activity 2 While things have been slightly quieter for the team, more and more people are adopting expl3 for their own Work in progress 2 use. A search on the TEX Stack Exchange website for Uppercasing and lowercasing ...... 2 either “expl3” or “latex3” at time of writing yield Space-skipping in xparse ...... 3 around one thousand results each. In order to help standardise the prefixes used in expl3 . . . and for 2014 onwards 3 modules, we have developed a registration procedure for package authors (which amounts to little more than What can you do for the LATEX3project? 4 notifying us that their package uses a specific prefix, Programming Layer ...... 4 which will often be the name of the package itself). DesignLayer ...... 4 Please contact us via the latex-l mailing list to reg- DocumentInterfaceLayer ...... 5 ister your module prefixes and package names; we ask InSummary...... 5 that you avoid using package names that begin with Andsomethingelse...... 5 l3... since expl3 packages use this internally. Some Hiatus? authors have started using the package prefix lt3... as a way of indicating their package builds on expl3 in Well, it’s been a busy couple of years. Work has slowed some way but is not maintained by the LAT X3 team. on the LAT X3 codebase as all active members of the E E In the prefix database at present, some thirty pack- team have been—shall we say—busily occupied with age prefixes are registered by fifteen separate individ- more pressing concerns in their day-to-day activities. uals (unrelated to the LAT X3 project — the number Nonetheless, Joseph and Bruno have continued E of course grows if you include packages by members to fine-tune the LAT X3 kernel and add-on packages. E of the team). These packages cover a broad range of Browsing through the commit history shows bug fixes functionality: and improvements to documentation, test files, and internal code across the entire breadth of the codebase. acro Interface for creating (classes of) acronyms Members of the team have presented at two TUG A hobby Hobby’s algorithm in PGF/TiKZ for drawing conferences since the last LTEX3 news. (Has it really optimally smooth curves. been so long?) In July 2012, Frank and Will travelled to Boston; Frank discussed the challenges faced in the chemmacros Typesetting in the field of chemistry. past and continuing to the present day due to the limits classics Traditional-style citations for the classics. of the various T X engines; and, Frank and Will to- E conteq Continued (in)equalities in mathematics. gether covered a brief history and recent developments of the expl3 code. ctex A collection of macro packages and document In 2013, Joseph and Frank wrote a talk on complex classes for Chinese typesetting. A layouts, and the “layers” ideas discussed in LTEX3; endiagram Draw potential energy curve diagrams. Frank went to Tokyo in October to present the work. enotez Support for end-notes. Slides of and recordings from these talks are available A on the LTEX3 website. exsheets Question sheets and exams with metadata. These conferences are good opportunities to intro- lt3graph A graph . 
duce the expl3 language to a wider group of people; in many cases, explaining the rationale behind why newlfm The venerable class for memos and letters. expl3 looks a little strange at first helps to convince fnpct Interaction between footnotes and punctuation.

LATEX3 News, and the LATEX software, are brought to you by the LATEX3 Project Team; Copyright 2014, all rights reserved.


GS1 Barcodes and so forth.
hobete Beamer theme for the Univ. of Hohenheim.
kantlipsum Generate sentences in Kant's style.
lualatex-math Extended support for mathematics in LuaLaTeX.
media9 Multimedia inclusion for Adobe Reader.
pkgloader Managing the options and loading order of other packages.
substances Lists of chemicals, etc., in a document.
withargs Ephemeral macro use.
xecjk Support for CJK documents in XeLaTeX.
xpatch, regexpatch Patch command definitions.
xpeek Commands that peek ahead in the input stream.
xpinjin Automatically add pinyin to Chinese characters.
zhnumber Typeset Chinese representations of numbers.
zxjatype Standards-conforming typesetting of Japanese for XeLaTeX.

Some of these packages are marked by their authors as experimental, but it is clear that these packages have been developed to solve specific needs for typesetting and document production. The expl3 language has well and truly gained traction after many years of waiting patiently.

A logo for the LaTeX3 Programming Language

To show that expl3 is ready for general use, Paulo Cereda drew up a nice logo for us, showing a hummingbird (agile and fast — but needs huge amounts of energy) picking at "l3". Big thanks to Paulo!

Recent activity

LaTeX3 work has only slowed, not ground to a halt. While changes have tended to be minor in recent times, there are a number of improvements worth discussing explicitly.

1. Bruno has extended the floating point code to cover additional functions such as inverse trigonometric functions. These additions round out the functionality well and make it viable for use in most cases needing floating point mathematics.

2. Joseph's refinement of the experimental galley code now allows separation of paragraph shapes from margins/cutouts. This still needs some testing!

3. For some time now expl3 has provided "native" drivers, although they have not been selected by default in most cases. These have been revised to improve robustness, which makes them probably ready to enable by default. The improvements made to the drivers have also fed back to more "general" LaTeX code.

Work in progress

We're still actively discussing a variety of areas to tackle next. We are aware of various "odds and ends" in expl3 that still need sorting out. In particular, some experimental functions have been working quite well and it's time to assess moving them into the "stable" modules, in particular the l3str module for dealing with catcode-twelve token lists, more commonly known in expl3 as strings.

Areas of active discussion include issues around uppercasing and lowercasing (and the esoteric ways that this can be achieved in TeX) and space skipping (or not) in commands and environments with optional arguments. These two issues are discussed next.

Uppercasing and lowercasing

The commands \tl_to_lowercase:n and \tl_to_uppercase:n have long been overdue for a good hard look. From a traditional TeX viewpoint, these commands are simply the primitive \lowercase and \uppercase, and in practice it's well known that there are various limitations and peculiarities associated with them. We know these commands are good, to one extent or another, for three things:

1. Uppercasing text for typesetting purposes such as all-uppercase titles.
2. Lowercasing text for normalisation in sorting and other applications such as filename comparisons.
3. Achieving special effects, in concert with manipulating \uccode and the like, such as defining commands that contain characters with different catcodes than usual.

We are working on providing a set of commands to achieve all three of these functions in a more direct and easy-to-use fashion, including support for Unicode in LuaLaTeX and XeLaTeX.
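As a rough sketch of the first two uses with the existing commands (the scratch variable and file name below are made up for illustration; both commands simply wrap the corresponding primitives, so this is not Unicode-aware case changing):

\ExplSyntaxOn
% Use 1: uppercase a fixed title for display.
\tl_to_uppercase:n { A~short~title }   % typesets "A SHORT TITLE"
% Use 2: normalise a file name to lowercase before comparing it.
\tl_to_lowercase:n { \tl_set:Nn \l_tmpa_tl { Chapter01.TEX } }
\tl_show:N \l_tmpa_tl                  % shows "chapter01.tex" in the log
\ExplSyntaxOff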


Space-skipping in xparse ...and for 2014 onwards We have also re-considered the behaviour of space- There is one (understandable) misconception that skipping in xparse. Consider the following examples: shows up once in a while with people claiming that \begin{dmath} \begin{dmath}[label=foo] expl3 = LATEX3. [x y z] = [1 2 3] x^2 + y^2 = z^2 \end{dmath} \end{dmath} However, the correct relation would be a subset, expl3 ⊂ A , In the first case, we are typesetting some mathematics LTEX3 that contains square brackets. In the second, we are with expl3 forming the Core Language Layer on which assigning a label to the equation using an optional ar- there will eventually be several other layers on top that gument, which also uses brackets. The fact that both provide work correctly is due to behaviour that is specifically • higher-level concepts for typesetting (Typesetting programmed into the workings of the dmath environ- Foundation Layer), ment of breqn: spaces before an optional argument are • explicitly forbidden. At present, this is also how com- a designer interface for specifying document struc- mands and environments defined using xparse behave. tures and layouts (Designer Layer), But consider a pgfplots environment: • and finally a Document Representation Layer that implements document level syntax. \begin{pgfplot} [ Of those four layers, the lowest one — expl3 — is avail- % plot options able for use and with xparse we have an instance of ] the Document Representation Layer modeled largely \begin{axis} after LATEX 2ε syntax (there could be others). Both [ can be successfully used within the current LATEX 2ε % axis options framework and as mentioned above this is increasingly ] happening. ... The middle layers, however, where the rubber meets \end{axis} the road, are still at the level of prototypes and ideas \end{pgfplot} (templates, ldb, galley, xor and all the good stuff) that need to be revised and further developed to arrive at a This would seem like quite a natural way to write such LAT X3 environment that can stand on its own and that xparse E environments, but with the current state of this is to where we want to return in 2014. syntax would be incorrect. One would have to write An overview on this can be found in the answer to either of these instead: “What can *I* do to help the LATEX3 project?” on 1 \begin{pgfplot}% \begin{pgfplot}[ Stack Exchange, which is reproduced below in slightly [ % plot options abridged form. This is of course not the first time that % plot options ] we have discussed such matters, and you can find sim- ] ilar material in other publications such as those at http://latex-project.org; e.g., the architecture talk Is this an acceptable compromise? We’re not entirely given at the TUG 2011 conference. sure here—we’re in a corner because the humble [ has ended up being part of both the syntax and semantics of a LATEX document. Despite the current design covering most regular use- cases, we have considered adding a further option to xparse to define the space-skipping behaviour as desired by a package author. But at this very moment we’ve rejected adding this additional complexity, because en- vironments that change their parsing behaviour based on their intended use make a LATEX-based language more difficult to predict; one could imagine such be- haviour causing difficulties down the road for automatic syntax checkers and so forth. However, we don’t make such decisions in a vacuum and we’re always happy to continue to discuss such matters. 
1http://tex.stackexchange.com/questions/45838
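For readers who have not used xparse, here is a sketch of the kind of declaration under discussion; the environment name and body are hypothetical, and whether a space before the [ should suppress the optional argument is exactly the behaviour question raised above.

\usepackage{xparse}
% An environment with one optional, bracket-delimited argument.
\NewDocumentEnvironment{demo}{ o }
  {\begin{quote}\IfValueT{#1}{\textbf{#1:}~}}
  {\end{quote}}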


What can you do for the LATEX3 project? design, i.e., instantiating the templates with values By Frank Mittelbach so that the desired layout for “itemize” lists, etc., is created. My vision of LATEX3 is really a system with multiple layers that provide interfaces for different kinds of roles. In real life a single person may end up playing more These layers are than one role, but it is important to recognise that all of them come with different requirements with respect • the underlying engine (some TEX variant) to interfaces and functionality. • the programming layer (the core language, i.e., Programming Layer expl3) • the typesetting foundation layer (providing higher- The programming layer consists of a core language expl3 level concepts for typesetting) layer (called (EXP erimental L aTeX 3) for his- torical reasons and now we are stuck with it :-)) • the typesetting element layer (templates for all and two more components: the “Typesetting Founda- types of document elements) tion Layer” that we are currently working on and the • the designer interface foundation layer “Typesetting Element Layer” that is going to provide customizable objects for the design layer. While expl3 • the class designer layer (where instances of docu- is in many parts already fairly complete and usable the ment elements with specific settings are defined) other two are under construction. • document representation layer (that provides the Help is needed for the programming layer in input syntax, i.e., how the author uses elements) • helping by extending and completing the regression If you look at it from the perspective of user roles test suite for expl3 then there are at least three or four roles that you can clearly distinguish: • helping with providing good or better documenta- tion, including tutorials • The Programmer (template and functionality • provider) possibly helping in coding additional core function- ality — but that requires, in contrast to the first • The Document Type Designer (defines which two points, a good amount of commitment and elements are available; abstract syntax and seman- experience with the core language as otherwise the tics) danger is too high that the final results will end up • The Designer (typography and layout) being inconsistent • The Author (content) Once we are a bit further along with the “Typeset- As a consequence the LATEX3 Project needs different ting Foundation Layer” we would need help in pro- kinds of help depending on what layer or role we are viding higher-level functionality, perhaps rewriting looking at. existing packages/code for elements making use of ex- The “Author” is using, say, list structures by spec- tended possibilities. Two steps down the road (once the ifying something like \begin{itemize} \item in his “Designer Layer” is closer to being finalized) we would documents. Or perhaps by writing

    ...
or need help with developing templates for all kinds of whatever the UI representation offers to him. elements. The “Document Type Designer” defines what kind In summary for this part, we need help from people of abstract document elements are available, and what interested in programming in TEX and expl3 and/or attributes or arguments those elements provide at the interested in providing documentation (but for this a author level. E.g., he may specify that a certain class thorough understanding of the programming concepts of documents provides the display lists “enumerate”, is necessary too). “itemize” and “description”. Design Layer The “Programmer” on the other hand implements templates (that offer customizations) for such docu- The intention of the design layer is to provide interfaces ment elements, e.g., for display lists. What kind of that allow specifying layout and typography styles in customization possibilities should be provided by the a declarative way. On the implementation side there “Programmer” is the domain of the “Document De- are a number of prototypes (most notably xtemplate signer”; he drives what kind of flexibility he needs for and the recent reimplementation of ldb). These need to the design. In most cases the “Document Designer” be unified into a common model which requires some should be able to simply select templates (already writ- more experimentation and probably also some further ten) from a template library and only focus on the thoughts.


But the real importance of this layer is not the im- spective, i.e., the syntax and structure for auto- plementation of its interfaces but the conceptual view mated typesetting where the content is prepared of it: provisioning a rich declarative method (or meth- by a system/application rather than by a human ods) to describe design and layout. I.e., enabling a The two most likely need strict structure (as automa- designer to think not in programs but in visual repre- tion works much better with structures that do not sentations and relationships. have a lot of alternative possibilities and shortcuts, So here is the big area where people who do not feel etc.) and even when just looking at the human author they can or want to program T X’s bowels can help. E a lot of open questions need answering. And these What would be extremely helpful (and in fact not just answers may or may not be to painfully stick with for LATEX3) would be existing LATEX 2ε conventions in all cases (or perhaps • collecting and classifying a huge set of layouts and with any?). designs None of this requires coding or expl3 experience. – designs for individual document elements What it requires is familiarity with existing input con- (such as headings, TOCs, etc) cepts, a feel for where the pain points currently are and – document designs that include relationships the willingness to think and discuss what alternatives between document elements and extensions could look like. • thinking about good, declarative ways to specify In Summary such designs Basically help is possible on any level and it doesn’t – what needs to be specified need to involve programming. Thoughts are sprin- – to what extent and with what flexibility kled throughout this article, but here are a few more highlights: I believe that this is a huge task (but rewarding in it- • self) and already the first part of collecting existing help with developing/improving the core program- design specifications will be a major undertaking and ming layer by will need coordination and probably a lot of work. But – joining the effort to improve the test suite it will be a huge asset towards testing any implementa- – help improving the existing (or not existing) tions and interfaces for this layer later on. documentation – joining the effort to produce core or auxiliary Document Interface Layer code modules If we get the separation done correctly, then this layer • help on the design layer by should effectively offer nothing more than a front end for parsing the document syntax and transforming it – collecting and classifying design tasks into an internal standardised form. This means that on – thinking and suggesting ways to describe this layer one should not see any (or not much) coding layout requirements in a declarative manner or computation. • help on shaping the document interface layer It is envisioned that alternative document syntax models can be provided. At the moment we have a These concepts, as well as their implementation, are 2 draft solution in xparse. This package offers a document under discussion on the list latex-l. The list has only a fairly low level of traffic right now as actual syntax in the style of LATEX 2ε, that is with *-forms, optional arguments in brackets, etc., but with a few implementation and development tasks are typically more bells and whistles such as a more generalized con- discussed directly among the few active implementors. cept of default values, support for additional delimiters But this might change if more people join. 
for arguments, verbatim-style arguments, and so on. And something else . . . It is fairly conventional though. In addition when it The people on the LAT X3 team are also committed was written the clear separation of layers wasn’t well- E to keeping LAT X 2ε stable and even while there isn’t defined and so the package also contains components E that much to do these days there remains the need to for conditional programming that I no longer think resolve bug reports (if they concern the 2e core), pro- should be there. vide new distributions once in a while, etc. All this is Bottom line on what is needed for this layer is to work that takes effort or remains undone or incomplete. A • think about good syntax for providing document Thus here too, it helps the LTEX3 efforts if we get help content from “the author” perspective to free up resources.

• think about good syntax for providing document 2Instructions for joining and browsing archives at: content from an “application to typesetting” per- http://latex-project.org/code.html

LATEX3 News #9 TUGboat, Volume 35 (2014), No. 1 27

Introduction to presentations with beamer

Thomas Thurnherr

Abstract

The document class beamer provides flexible commands to prepare presentations with an appealing look. Here I introduce the basics of the beamer class intended for new users with a basic knowledge of LaTeX. I cover a range of topics from how to create a first slide to dynamic content and citations.

1 Introduction

The LaTeX document class beamer [1] was written to help with the creation of presentations held using a projector. In German, the English word Beamer describes a projector, which is likely the reason for Till Tantau, the package author, to choose this particular name. Many macros available in the standard LaTeX document classes are used in beamer, although sometimes the result might look different. The beamer package comes with extensive documentation, which is included in most TeX distributions (such as TeX Live) and available online on CTAN. As with other document classes, the output document is likely in portable document format (PDF). This imposes certain limitations on animations well-known from commercial software. However, it has always been a strength of (La)TeX to let the author focus on the content, and beamer extends this concept to slide presentations.

2 The very basics

To create a presentation, we set the document class to beamer:

\documentclass{beamer}

The main difference between standard LaTeX document classes and beamer is that content does not continuously "flow" across multiple pages, but is limited to a single slide. The environment name frame is used for slides, usually to produce a single slide (sometimes several, but we will get to that later). A frame contains a title and a body. Furthermore, at the bottom of every frame, beamer automatically adds a navigation menu.

\begin{frame}
\frametitle{Slide title}
%Slide body
\end{frame}

3 In the preamble

As with the standard LaTeX document classes, the preamble serves to load packages, define the content of the title slide, and alter the appearance of the presentation.

3.1 Presentation title

Beamer reuses the standard LaTeX macros to create the title page: \title, \author, and \date.

\title{Beamer presentation title}
\author{Presenter's name}
\date{\today}

We use these further below to create a title page frame.

3.2 Presentation appearance

In beamer, "themes" change the appearance of a presentation. Themes define the style and the color of a presentation. By default, beamer loads the rather bland default theme. To change the theme to something more appealing, we can use the following command in the preamble with a theme name we like:

\usetheme{default} % default theme

There are a great number of themes distributed with LaTeX. They are often named after cities. Try for example: Berkeley, Madrid, or Singapore, to name a few (figure 1). Also, look for theme galleries online.
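Putting sections 2 and 3 together, a complete minimal document might look as follows (the theme, title text and author are placeholders to adapt; the \titlepage frame is explained in section 4.2 below):

\documentclass{beamer}
\usetheme{Madrid}                % any installed theme name works here
\title{Beamer presentation title}
\author{Presenter's name}
\date{\today}
\begin{document}
\begin{frame}
  \titlepage
\end{frame}
\begin{frame}
  \frametitle{First slide}
  Some content.
\end{frame}
\end{document}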


Figure 1: Title page frames for beamer themes: default, Singapore, and Berkeley.

4 Presentation slides

4.1 Creating a basic frame

A frame may contain a number of different things, including simple text, formulas, figures, tables, etc. Most often, however, a frame contains numbered or bulleted lists. To create lists, we use the standard list environments: enumerate and itemize. An example of a bulleted list is shown below:

\begin{frame}
\frametitle{List types in \LaTeX}
\begin{itemize}
\item Bulleted list: itemize
\item Numbered list: enumerate
\item Labeled list: description
\end{itemize}
\end{frame}

Although the result looks different, lists work in the same way as in other document classes; they can be nested and customized to your needs.

4.2 Title page frame

We have already seen how to define a title in the preamble. Now we want to use this title to create a title page frame.

\begin{frame}
\titlepage % alternatively \maketitle can be used
\end{frame}

4.3 Table of contents

To add structure to a presentation and create an outline, we can use \section and \subsection, together with \tableofcontents. These commands are used outside of frames. To create an outline at the beginning of a presentation, we use:

\section{Presentation Outline}
\begin{frame}
\frametitle{Outline}
\tableofcontents
\end{frame}

For long presentations, it may make sense to show the outline again at the beginning of a new \section. We use the option currentsection to \tableofcontents to highlight the current section.

\section{New Section Title}
\begin{frame}
\frametitle{Outline}
\tableofcontents[currentsection]
\end{frame}

4.4 Adding figures

Frames are single entities and therefore figures and tables do not need to be wrapped in their respective floating environments. Also, we probably do not wish to add a caption. Therefore, to show a figure, \includegraphics with appropriate alignment and scaling (see the graphicx package [3]) is sufficient:

\begin{frame}
\frametitle{Adding a figure to a frame}
\centering
\includegraphics[width=0.8\linewidth]{some-figure-file}
\end{frame}

Similarly with tables, we omit the table environment and directly use tabular.

4.5 Defining multiple columns

By default, content is stacked vertically. Therefore, if you have a list, then a figure, then a text paragraph, first the list is produced, with the figure below and lastly the text. Often, it's desirable to position content next to each other. There are two (identical) methods to split a slide horizontally into two or more columns: the minipage environment and the beamer-specific columns environment.

\begin{frame}
\frametitle{Two column example: minipage}
\begin{minipage}{0.48\linewidth}
% content column 1
\end{minipage}
\quad % adds some whitespace
\begin{minipage}{0.48\linewidth}
% content column 2
\end{minipage}
\end{frame}

\begin{frame}
\frametitle{Two column example: columns}
\begin{columns}
\begin{column}{0.48\linewidth}
% content column 1
\end{column}
\quad
\begin{column}{0.48\linewidth}
% content column 2
\end{column}
\end{columns}
\end{frame}
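The techniques from sections 4.4 and 4.5 combine naturally; as a sketch (the image file name is a placeholder), a figure can sit next to a list:

\begin{frame}
\frametitle{Figure next to a list}
\begin{columns}
\begin{column}{0.48\linewidth}
\includegraphics[width=\linewidth]{some-figure-file}
\end{column}
\begin{column}{0.48\linewidth}
\begin{itemize}
\item First observation
\item Second observation
\end{itemize}
\end{column}
\end{columns}
\end{frame}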


5 Simple animations

It may be an exaggeration to use the word "animations". What I will show is merely how to add, remove and replace parts of the content, primarily text. However, I believe this is good enough to keep the audience interested; everything else is just a distraction.

5.1 Add items dynamically

The beamer command \pause adds content gradually, by pausing and waiting for the presenter to press a button. For example, we can use \pause in a list to reveal one item after another. You might wonder how this is possibly translated into a PDF. There is really no magic to it; LaTeX just produces three slides with the same page number, adding an extra item on each subsequent slide. Try for yourself:

\begin{frame}
\frametitle{Usage of pause}
\begin{itemize}
\item First item, shown with the slide
\pause
\item Next item, revealed after pressing a button
\pause
\item Last item, revealed after pressing a button again
\end{itemize}
\end{frame}

5.2 Hide and show content

"Overlays" is a slightly more sophisticated concept. Overlays use pointed brackets to hide, reveal and overwrite content. For example, the specification \item<1-> means: "from slide 1 on" (see figure 2).

\begin{frame}
\frametitle{Hide and show list items}
\begin{itemize}
\item<1-> First item, shown with the slide
\item<2-> Next item, revealed after pressing some button
\item<3-> Last item, revealed again after pressing some button
\item<1-> Show this item with the first
\end{itemize}
\end{frame}

We can also combine ranges of numbers. Assuming more than 7 overlays, to show an item on all but slides 3 and 6, we use: \item<-2,4-5,7->. Items always occupy their space, even if they are not shown. Joseph Wright's article elsewhere in this issue provides more examples [5].

This syntax works with other content too, as implemented in the commands \uncover and \only. The difference between them is that \uncover occupies space when hidden, whereas \only does not, and can therefore be used to overwrite previous content.

Figure 2: Hide and show list items.

\begin{frame}
\frametitle{Hide and show content}
\uncover<1>{
% adds content
}
\uncover<2>{
% add additional content
}
\end{frame}

\begin{frame}
\frametitle{Hide and overwrite content}
\only<1>{
% adds content
}
\only<2>{
% replaces previous content
}
\end{frame}

Similar to the itemize example above, much more sophisticated overlays can be created using \uncover and \only.

5.3 Highlighting items

Besides hiding and revealing, we can also highlight text upon a button press. In beamer, this is called an alert (see figure 3):

\begin{frame}
\frametitle{Highlight items of a list}
\begin{itemize}
\item Highlight first item
\item Highlight second item
\item Highlight third item
\item<4- | alert@4> Combine reveal and highlight
\end{itemize}
\end{frame}

Figure 3: Highlight list items.


6 Citations and bibliography

You can generate a bibliography the same way as with a standard LaTeX document class. Personally, I prefer to show the bibliography entries on the same slide where they are cited. I use the biblatex package [2] for this, which provides a variety of methods. The commands \footcite and \footfullcite produce references according to the style (e.g. authoryear), and full references respectively. I use an external BibTeX file to store references. Here is an example.

% Preamble
\usepackage[backend=biber, maxnames=2,
  firstinits=true,style=authoryear]{biblatex}
\bibliography{path/to/references.bib}

% Citations
\begin{frame}
\frametitle{Citing other people's work}
\begin{itemize}
\item Full citation
\footfullcite{knuth86}
\item Author-year citation
\footcite{knuth86}
\end{itemize}
\end{frame}

If your preferred style is not available, you can define your own citation style using the biblatex command \DeclareCiteCommand. The invocation below prints references as footnotes, showing the author, year, and journal title.

% Preamble
\usepackage[backend=biber, maxnames=2,
  firstinits=true]{biblatex}
\bibliography{path/to/references.bib}
\DeclareNameAlias{sortname}{first-last}
\DeclareCiteCommand{\footcustomcite}{}{%
  \footnote{\printnames[author]{author},
  \printfield{year},
  \printfield{journaltitle}
  \printfield{booktitle}}}{;}{}

% Citations
\footcustomcite{knuth86}

Figure 4 shows a citation (for The TeXbook) in the full, authoryear, and above custom styles.

Figure 4: Full, authoryear-style, and custom citation of The TeXbook. (The three renderings read: "D. E. Knuth and D. Bibby (1986). The TeXbook. Vol. A. Addison-Wesley, Reading, MA, USA."; "Knuth and Bibby 1986."; "D. E. Knuth and D. Bibby, 1986, The TeXbook.")

7 Creating handouts

If you teach a class or give a talk it might be appropriate to provide handouts. The beamer document class option handout reduces overlays to a single slide and removes the navigation bar at the bottom. In addition, we can combine multiple slides to a single physical page, using the pgfpages package [4]:

\documentclass[handout]{beamer}
\usepackage{pgfpages}
\pgfpagesuselayout{4 on 1}[a4paper,
  border shrink=5mm, landscape]

References

[1] beamer — A LaTeX class for producing presentations and slides. http://www.ctan.org/pkg/beamer. Accessed: 2014-02-19.
[2] biblatex — Bibliographies in LaTeX using BibTeX for sorting only. http://www.ctan.org/pkg/biblatex. Accessed: 2014-02-19.
[3] graphicx — Enhanced support for graphics. http://www.ctan.org/pkg/graphicx. Accessed: 2014-02-19.
[4] pgf — Create PostScript and PDF graphics in TeX. http://www.ctan.org/pkg/pgf. Accessed: 2014-02-19.
[5] Joseph Wright. The beamer class: Controlling overlays. TUGboat, 35(1):31–33, 2014.

⋄ Thomas Thurnherr
  texblog (at) gmail dot com
  http://texblog.org

Thomas Thurnherr TUGboat, Volume 35 (2014), No. 1 31

The beamer class: Controlling overlays We can make things more specific by giving only a single slide number, giving an ending slide number Joseph Wright and so on. There was a question recently on the TEX StackEx- \begin{frame} change site (Gil, 2014) about the details of how slide \begin{itemize} overlays work in the beamer class (Tantau, Wright, \item<1> Visible on the 1st only and Miletić, 2013). The question itself was about \item<-3> Visible on the a particular input syntax, but it prompted me to 1st to 3rd slides think that a slightly more general examination of \item<2-4,6> Visible on the how overlays would be helpful to beamer users. 2nd to 4th slides, and the 6th slide A word of warning before I start: don’t overdo \end{itemize} overlays! Having text or graphics appear or disappear \end{frame} on a slide can be useful but is easy to over-use. I’m The syntax is quite powerful, but there are at going to focus on the mechanics here, but that doesn’t least a couple of issues. First, the slide numbers beamer mean that they should be used in every frame are hard-coded. That means that if I want to add you create. something else in before the first item I’ve got to 1 Overlay basics renumber everything. Secondly, I’m having to repeat myself. Luckily, beamer offers a way to address both beamer Before we get into the detail of how deals of these concerns. with overlays, I’ll first give a bit of background to what they are. The beamer class is built around the 2 Auto-incrementing the overlay idea of frames: The first tool beamer offers is the special symbol + \begin{frame} in overlay specifications. This is used as a place \frametitle{A title} holder for the “current overlay”, and is automatically % Frame content incremented by the class. To see it in action, I’ll \end{frame} rewrite the first overlay example without any fixed which can produce one or more slides: individual numbers. pages of output that will appear on the screen. These \begin{frame} separate slides within a frame are created using over- \begin{itemize} lays, which is the way the beamer manual describes \item<+-> Visible from the 1st slide the idea of having the content of individual slides \item<+-> Visible from the 2nd slide varying. Overlays are “contained” within a single \item<+-> Visible from the 3rd slide frame: when we start a new frame, any overlays ... from the previous one stop applying. \end{itemize} The most basic way to create overlays is to \end{frame} explicitly set up individual items to appear on a par- What’s happening here? Each time beamer finds ticular slide within the frame. That’s done using the an overlay specification, it automatically replaces all (optional) overlay argument that beamer enables for of the + symbols with the current overlay number. many document components; this overlay specifica- It then advances the overlay number by 1. So in the tion is given in angle brackets. The classic example above example, the first + is replaced by a 1, the is a list, where the items can be made to appear one second by a 2 and the third by a 3. So we get the at a time. same behaviour as in the hard-coded case, but this \begin{frame} time if I add another item at the start of the list I \begin{itemize} don’t have to renumber everything. \item<1-> Visible from the 1st slide There are of course a few things to notice. The \item<2-> Visible from the 2nd slide first overlay in a frame is number 1, and that’s what \item<3-> Visible from the 3rd slide beamer sets the counter to at the start of each frame. ... 
To get the second item in the list to appear on slide 2, \end{itemize} we still require an overlay specification for the first \end{frame} item: I could have skipped the <1-> in the hard- As you can see, the overlay specification here is coded example and nothing would have changed. The simply the first slide number we want the item to be second point is that every + in an overlay specification on followed by a - to indicate “and following slides”. gets replaced by a given value. We’ll see later there

The beamer class: Controlling overlays 32 TUGboat, Volume 35 (2014), No. 1

are places you might accidentally add a + to mean \item<+-> Visible from the 1st slide “advance by 1”: don’t do that! \item<.-> Visible from the 1st slide \item<+-> Visible from the 2nd slide 3 Reducing redundancy \item<.-> Visible from the 2nd slide Using the + approach has made our overlays flexible, ... but I’ve still had to be repetitive. Handily, beamer \end{itemize} helps out there too by adding an optional argument \end{frame} to the list which inserts an overlay specification for What happens here is that . can be read as each line: “repeat the overlay number of the last +”. So the \begin{frame} two + overlay specifications create one slide each, \begin{itemize}[<+->] while the two lines using . in the specification ’pick \item Visible from the 1st slide up’ the overlay number of the preceding +. (The \item Visible from the 2nd slide beamer manual describes the way this is actually \item Visible from the 3rd slide done, but I suspect that’s less clear than thinking of ... this as a repetition!) \end{itemize} Depending on the exact use case, you might \end{frame} want to combine this with the “reducing repeated Notice that this is needs to be inside the “nor- code” optional argument, with <.-> as an override. mal” [ ... ] set up for an optional argument. Ap- \begin{frame} plying an overlay to every item might not be exactly \begin{itemize}[<+->] what you want: you can still override individual lines \item Visible from the 1st slide in the standard way. \item<.-> Visible from the 1st slide \begin{frame} \item Visible from the 2nd slide \begin{itemize}[<+->] \item<.-> Visible from the 2nd slide \item Visible from the 1st slide ... \item Visible from the 2nd slide \end{itemize} \item Visible from the 3rd slide \end{frame} \item<1-> Visible from the 1st slide 5 Offsets ... \end{itemize} A combination of + and . can be used to convert \end{frame} many “hard-coded” overlay set ups into “relative” ones, where the slide numbers are generated by Remember not to overdo this effect: just because beamer without you having to work them out in it’s easy to reveal every list line by line doesn’t mean advance. However, there are still cases it does not you should! cover. To allow even more flexibility, beamer has 4 Repeating the overlay number the concept of an “offset”: an adjustment to the number that is automatically inserted. Offset values The + syntax is powerful, but as it always increments are given in parentheses after the + or . symbol they the overlay number it doesn’t allow us to remove the apply to, for example: hard-coded numbers from a case such as \begin{frame} \begin{frame} \begin{itemize} \begin{itemize} \item<+(1)-> Visible from the 2nd slide \item<1-> Visible from the 1st slide \item<+(1)-> Visible from the 3rd slide \item<1-> Visible from the 1st slide \item<+-> Visible from the 3rd slide \item<2-> Visible from the 2nd slide \end{itemize} \item<2-> Visible from the 2nd slide \end{frame} ... \end{itemize} Notice that this adjustment only applies to the \end{frame} substitution, so both the second and third lines above end up as <3-> after the automatic replacement. If beamer For this case, offers another special symbol, you try the demo, you’ll also notice that none of the a single period ‘.’, as in: items appear on the first slide! \begin{frame} Perhaps a more realistic example for where an \begin{itemize} offset is useful is the case of revealing items “out of

Joseph Wright TUGboat, Volume 35 (2014), No. 1 33

order”, where the full list makes sense in some other \begin{frame} way. With hard-coded numbers this might read \begin{itemize} \begin{frame} \item<+-> Visible from the 1st slide \begin{itemize} \item<+-> Visible from the 2nd slide \item<1-> Visible from the 1st slide \end{itemize} \item<2-> Visible from the 2nd slide \pause \item<1-> Visible from the 1st slide Text after the list \item<2-> Visible from the 2nd slide \end{frame} ... If you read the beamer manual carefully, this is what \end{itemize} is supposed to happen here, but the more important \end{frame} question is how to get what you (probably) want: which can be made “flexible” with a set up such as three slides. \begin{frame} The answer is to use \onslide: the \pause \begin{itemize} command is by far the most basic way of making \item<+-> Visible from the 1st slide overlays, and simply doesn’t “know” how to work \item<+-> Visible from the 2nd slide with +-. In contrast, \onslide uses exactly the same \item<.(-1)-> Visible from the syntax we’ve already seen for overlays: 1st slide \begin{frame} \item<.-> Visible from the 2nd slide \begin{itemize} ... \item<+-> Visible from the 1st slide \end{itemize} \item<+-> Visible from the 2nd slide \end{frame} \end{itemize} or the equivalent \onslide<+-> Text after the list \begin{frame} \end{frame} \begin{itemize} \item<+-> Visible from the 1st slide As we are then using the special + syntax for all of \item<.(1)-> Visible from the 2nd slide the overlays, everything is properly tied together and \item<.-> Visible from the 1st slide gives the expected result: three slides. \item<+-> Visible from the 2nd slide 7 Summary ... \end{itemize} The beamer overlay feature can help you set up \end{frame} complex and flexible overlays to generate slides with dynamic content. By using the tools carefully, you As shown, we can use both positive and negative can make your input easier to read and maintain. offsets, and these work equally well for + and . auto- generated values. You have to be slightly careful with References negative offsets; while beamer will add additional Gil, Yossi. “Relative overlay specification in slides for positive offsets, if you offset to below a beamer?” http://tex.stackexchange.com/q/ final value of 0 then errors will crop up. With this 154521, 2014. rather advanced setup, which version is easiest for you to follow will be down to personal preference. Tantau, Till, J. Wright, and V. Miletić. “The beamer Notice that positive offsets do not include a + class”. Available from CTAN, sign, but are just given as an unsigned integer: re- macros/latex/contrib/beamer, 2013. member what I said earlier about all + symbols being ⋄ replaced. If you try something like <+(+1)>, your Joseph Wright 2, Dowthorpe End presentation will compile but you’ll have a lot of Earls Barton slides! Northampton 6 Pausing general text NN6 0NH United Kingdom beamer The class offers a very simple \pause com- joseph.wright (at) morningstar2 mand to split general material into overlays. A classic dot co dot uk problem that people run into is combining that idea with the + approach to making overlays. For example, the following creates four slides:

The beamer class: Controlling overlays

34 TUGboat, Volume 35 (2014), No. 1 Boxes and more boxes width Ivan R. Pagnossin Abstract height Boxes are a key concept in (LA)TEX as well as an useful baseline tool, though not a very well known one. I believe this depth A g is due to the success of LTEX in providing a high level interface to TEX, which allows the user to produce beautiful documents without getting acquainted with reference point the concept of boxes. But it is useful to know the box concept because it enhances our understanding Figure 1: The dimensions of a box. Everything that A of TEX, helps us to avoid and solve problems and can is visible in a document produced by LTEX is in one or even propose solutions for unusual circumstances. more boxes, which is why they are so fundamental. In this paper we reproduce, step by step, the reflection effect in the title of this article in order to 2 produces Boxes , a box in wide. That is, a box gain some knowledge of the box concept. 3 which occupies more space than its contents. You 1 Box basics can also define the alignment of the content with the box, to the left, center or right, as shown below: Among the fundamental ingredients of a LAT X doc- E • Boxes ument, there are boxes: imaginary rectangles which \framebox[.67in][l]{Boxes} • Boxes enclose letters, lines, pages, paragraphs, figures, ta- \framebox[.67in][c]{Boxes} • \framebox[.67in][r]{Boxes} Boxes bles etc. Indeed, TEX does not typeset letters or characters, lines . . . but boxes! In a similar fashion, we may create a box which You may have already used boxes to avoid break- occupies less space than its contents; a zero-width ing some word or maybe to frame an equation. But box, for instance: they are far more useful than this. A simple but \framebox[0in][l]{Boxes} instructive example is the reflection effect in the title Try it yourself and see that Boxes overlaps the of this paper, which requires one customized box. text at its right. This happens because, as we have Let’s learn something about it. seen, LAT X uses the dimensions of the letters (boxes) Figure 1 shows the box associated with the let- E to place them side by side in a row. However, the box ter g. It has three dimensions: height, width and we have just created occupies no (horizontal) space. depth. It is these dimensions that LAT X actually E It looks like it is not there, though its contents are cares about, not the content of the box. In other (try also changing the alignment parameter). words, LATEX does not “see” the letter g, only its box. This is the important fact to understand. 2 Reflecting, scaling, coloring We can ask LAT X to show us this box: just write E Another command we will need to produce the re- \framebox{g}. Well, first we must set \fboxsep to flection effect is \scalebox, defined in the package zero, in order to remove the extra space around g: graphicx. Its syntax is \scalebox{f }[f ]{a box}. \setlength{\fboxsep}{0in}. x y It changes the dimensions of the box and its contents Much more important than just showing the according to the horizontal and vertical scaling fac- box, \framebox builds a box with whatever you tors f and f (respectively) and creates a new box want inside it. For instance, \framebox{Boxes}. x y that encloses this new content. Thus, as a first step

Now, Boxes is a single box, as unbreakable as the toward our goal, to get BoxesBoxes we write

letter g itself. By the way, this is another property to keep in mind: boxes cannot be broken. \scalebox{1}[-1]{Boxes}Boxes BoxesBoxes Another interesting property is this: the con- A This means we instruct LTEX to multiply the tents of a box need not lie inside it. You may have width by fx = 1 (hence, nothing changes on the noticed that, given the contents as an argument, the horizontal) and the height (> 0 in our case) and the \framebox command sets the dimensions of the box depth (= 0) by fy = −1. As a result, we have a to those of the contents (in reality, to the “sub-boxes” vertical reflection of Boxes. A hint: the commands that compose the contents). But you can define the defined by graphicx receive boxes as arguments, not dimensions explicitly as well. For example, just figures (which are boxes too). Since the package \framebox[.67in]{Boxes} was designed to handle figures, it is common to think

Ivan R. Pagnossin TUGboat, Volume 35 (2014), No. 1 35

that its commands apply only to figures, but this is zero here). Finally, the matrix has unequal height

not true. and depth (it’s centered on the math axis; the LATEX Let’s proceed. How can we place Boxes below code that produces this line is at the end of this Boxes? Answer: by constructing it in a way that paper). The important thing here is to notice that occupies no horizontal space, that is, by putting it in the placement of all these boxes follow a zero-width box (new code in black, previous code exactly the same rules as those followed in gray): by letters, since they are all in boxes.

So, whenever you insert a figure or table in your Boxes \makebox[0in][l]{\scalebox{1}[-1]{Boxes}}Boxes document, see them as “big g letters”. In other We have also swapped \framebox for \makebox. words, see the boxes! These two commands are equivalent, except that the former draws a frame around the box, while the 4 Further reading latter does not. To learn more about this subject, you can study Finally, we choose the reflection color (use the the environment minipage, the commands \parbox, package xcolor): \raisebox and \rule in section 4.7 of [1] and/or

\makebox[0in][l]{\scalebox}{1}[-1]{% appendix A.2 of [2]. Chapter 11 of [3] is mandatory. Boxes \textcolor{red!80}{Boxes}}Boxes Moreover, have a look at the commands defined by graphicx The sequence below illustrates the changing of the package . the box width, from its natural width (defined by the Appendix: Our unusual line of text content) down to zero. Notice that the non-inverted The code used to produce the unusual text line dis-

Boxes always starts immediately after the frame Boxes cussed above in this paper is shown below. You will (the box), not after the content ( ).

need the packages graphicx and amsmath to run it.

Boxes Boxes Boxes BoxesBoxes Boxes Boxes Boxes \setlength\fboxsep{0pt}\noindent\hrulefill \fbox{\begin{minipage}[t]{0.16\textwidth} \setlength{\parindent}{1em} 3 Boxes upon boxes \tiny Just a short example paragraph. There are boxes far more complex than these. As I A line or two or three. That is all. mentioned, everything that is visible in a document \end{minipage}}% produced by LATEX is in one or more boxes. There \hrulefill are even invisible elements in your document that are \fbox{% boxes too. An example is the paragraph indentation, \includegraphics[width=0.05\textwidth]{img}}% \hrulefill which is nothing more than an empty box of width \fbox{$\left(\begin{matrix} \parindent. In all cases, all of the concepts we have \cos\varphi & -\sin\varphi & 0 \\ just seen remain valid. \sin\varphi & \cos\varphi & 0 \\ Let’s see an example. The line below contains 0 & 0 & 1 three boxes plus some filling lines (which are boxes \end{matrix}\right)$}% too): the first is an entire paragraph, the second is a \hrulefill figure and the third a mathematical table (or matrix, if you prefer). All of them have height, width and Acknowledgments depth, are unbreakable, and are placed side by side The author thanks Karl Berry and Barbara Beeton with their reference points aligned (fig. 1) over an for invaluable help and improvements to this article. imaginary line called the baseline, represented by the horizontal filling lines. References [1] H. Kopka and P. W. Daly. A Guide to LATEX. Addison-Wesley, 3rd edition, 2004. cos ϕ − sin ϕ 0   [2] F. Mittelbach and M. Goossens. The LATEX baseline → Just a short example sin ϕ cos ϕ 0 nd paragraph. A line or two Companion. Addison-Wesley, 2 edition, 2004. or three. That is all.  0 0 1 [3] D. E. Knuth. The TEXbook. Addison-Wesley, The paragraph box is aligned on the baseline 1984. of its first line. The figure, on the other hand, in- serted with the \includegraphics command (pack- ⋄ Ivan R. Pagnossin age graphicx), has depth zero (but since we’ve en- ivan dot pagnossin (at) gmail closed it in an \fbox for the example, it is not quite dot com

Boxes and more boxes 36 TUGboat, Volume 35 (2014), No. 1

Glisterings: Glyphs, long labels \catcode‘\:=\active \let~\textcoloncentered Peter Wilson \ignorespaces }{\ifhmode\unskip\fi}} Ek gret effect men write in place lite; \newcommand{\textcoloncentered}{} Th’ entente is al, and nat the lettres space. \DeclareRobustCommand*{\textcoloncentered}{% \begingroup Troilus and Criseyde, Geoffrey Chaucer \sbox0{T}% The aim of this column is to provide odd hints or \sbox2{:}% small pieces of code that might help in solving a \dimen0=\ht0 % problem or two while hopefully not making things \advance\dimen0 by -\ht2 % \dimen0=.5\dimen0 % worse through any errors of mine. This installment \raisebox{\dimen0}{:}% presents some items about glyphs. \endgroup} 1 Asterism \begin{document} A symbol called an asterism is a simple kind of or- \begin{vccolon} nament consisting of three asterisks, and is supplied GALL : REG : IACO : MAG : BRITA : REG as a glyph in some fonts. It is typically used as an \end{vccolon} anonymous division, looking like this: \end{document} Following on from this Dan Luecking suggested *** Stephen Moye [8] posted a macro to make one using a \valign: if it was not available otherwise. This was based on \def\textcoloncentered{% some earlier code from Peter Flynn [4]. The code \valign{&##\cr\vphantom{T}\cr\vfil\hbox{:}% below is my version. \vfil\cr}} \newcommand*{\asterism}{% also remarking that perhaps a simple box raised by \raisebox{-.3em}[1em][0em]{% OK for 10-12pt some multiple of ex would do as well. \setlength{\tabcolsep}{0.05em}% I tried all three suggestions and decided that \begin{tabular}{@{}cc@{}}% \DeclareRobustCommand*{\textcoloncentered}{% \multicolumn{2}{c}*\\[-.75em]% \raisebox{.2ex}{:}} *&*% \end{tabular}% gave a satisfying result, also enabling the height to }} be adjusted to optically center the colon if necessary.. Heiko’s result: The asterism above was printed by: \par{\centering \asterism\par} GALL : REG : IACO : MAG : BRITA : REG 2 Raising a character Dan’s result: ‘Maximus_Rumpas’ wrote to ctt along the following GALL : REG : IACO : MAG : BRITA : REG lines: My result: I am writing some Latin text within a document: GALL : REG : IACO : MAG : BRITA : REG GALL : REG : IACO : MAG : BRITA : REG I need to raise the colon between the abbreviated text to the centre of the text line rather than, as 3 Boxing a glyph normal, aligned at the bottom of the text. I use Paul Kaletta wanted to be able to draw a box around \textperiodcentered for a single period but I can’t a glyph similar to the example in chapter 11 of The find anything similar for colons. TEXbook. Herbert Voß responded with [11] (slightly Heiko Oberdiek [9] responded with: edited): The following solution centers the colon using an \documentclass{article} uppercase letter for comparison. Also it makes the \usepackage[T1]{fontenc} colon active inside an environment for easier writing. \usepackage{lmodern} \documentclass{article} \newsavebox\CBox \begingroup \makeatletter \lccode‘\~=‘\:% \def\Gbox#1{\begingroup \lowercase{\endgroup \unitlength=1pt \newenvironment{vccolon}{% \fboxsep=0pt\sbox\CBox{#1}%

Peter Wilson TUGboat, Volume 35 (2014), No. 1 37

\leavevmode text of which I want to know \\ \put(0,0){\line(1,0){\strip@pt\wd\CBox}}% the width. \\ \fbox{#1}% There can be \\ \put(-\strip@pt\wd\CBox,0){\circle*{4}} only one paragraph.} \endgroup} \fbox{% \makeatother \begin{minipage}{\testboxwidth} \begin{document} \usebox\testbox \begingroup \end{minipage}} \fontsize{2cm}{2.2cm}\selectfont results in: \Gbox{g}\Gbox{r}\Gbox{f}% \Gbox{’}\Gbox{,}\Gbox{T} Some \endgroup text of which I want to know \end{document} the width. Herbert’s demonstration code results in: There can be only one paragraph.

5 Font size s s s s s s grf’,T Gonzalo Medina Arellano asked on ctt: Let’s say I use 12pt as a class option. How 4 Glyph widths can I find the exact values of the size obtained with the standard commands \tiny, \scriptsize, ..., ‘PmI’ wrote to ctt: \huge, and \Huge? I’m trying to put some text in a box, so that I Several respondents, Bob Tennent [10] among can calculate the box dimensions, but neither hbox them, suggested looking at the appropriate .clo file, nor mbox seem to want to perform linebreaks, and such as size12.clo for the article or report classes’ parbox needs a width argument, which is actually 12pt option or bk12.clo for the book class, which what I want to compute so it’s useless . . . example lists the font and baseline sizes for the several size follows: commands. \filltestbox{I need \\ the width \\ Dan Luecking [6] added to this, saying (edited of this text} a bit): \begin{minipage}{\testboxwidth} You can determine the current font size within Several respondents mentioned that the varwidth a document without knowing what size command was package [1] does this, but two other solutions were last issued. The macro \f@size holds the font size provided for when the text was simple [3]. [as a number (10, 12, 14. 4, . . . )] and is updated Ulrike Fischer proposed using a tabular and with each size change. For convenience we can define measuring its width. another macro without an @ sign to access it, as in: \newsavebox\testbox \makeatletter \newcommand{\filltestbox}[1]{% \newcommand*{\currentfontsize}{\f@size} \savebox\testbox{% \makeatother \begin{tabular}{@{}l@{}}#1\end{tabular}}} \newcommand*{\testboxwidth}{\the\wd\testbox} Then \currentfontsize would print it and \typeout{\currentfontsize} Alternatively, as an exercise Donald Arseneau, would display it on the terminal screen and in the the author of the varwidth package, proposed this: log file. \newcommand{\filltestbox}[1]{% It turns out that the font sizes are the same in \setbox\testbox\vbox{% size*.clo and book*clo (the differences are in var- \def\\{\unskip\egroup\hbox\bgroup% A \ignorespaces}% ious margin settings). The unofficial LTEX reference \hbox\bgroup\ignorespaces #1\unskip\egroup}} manual provides a table of the font sizes at [5]. which makes a vbox (called \testbox) containing a I don’t believe in labels. I want to do the best I list of hboxes whose contents are the pieces of text can, all the time. I want to be progressive without between \\. getting both feet off the ground at the same time. In my limited testing, the two approaches yield the same final result. For instance: Television and radio interview, March 15, 1964, Lyndon B. Johnson \filltestbox{Some \\

Glisterings: Glyphs, long labels 38 TUGboat, Volume 35 (2014), No. 1

6 Long labels Short A short label. Ernest posted to comp.text.tex saying [2]: Longer label A longer label. I’m trying to change the description environ- A very long label exceeding the available ment, so that when labels exceed a certain length, the width text following the label starts in the next line instead A long label with the text also longer than a of starting in the same line. The LATEX Companion single line. book explains how to do this, however, when I’ve tried Medium label The more typical length of a label I’ve found that the lines that contain a long label are and some text. typeset a little bit too close to the previous line, not with the usual baselineskip . . . References The standard description environment uses [1] Donald Arseneau. The varwidth package, \descriptionlabel for setting the contents of the March 2009. http://mirror.ctan.org/ \item macro. The Companion [7, §3.3] describes macros/latex/contrib/varwidth. various methods of modifying the standard layout by using a different definition for \descriptionlabel [2] Ernest. Description environment. Post to and/or creating a new kind of description list. comp.text.tex newsgroup, 17 June 2010. Ernest’s requirement can be met by a modified ver- [3] Ulrike Fischer and Donald Arseneau. sion of \descriptionlabel, the default definition Re: line breaks in boxes (or ‘how do i get of which is: paragraph parsing in hbox/mbox?’). \newcommand*{\descriptionlabel}[1]{% Post to comp.text.tex newsgroup, \hspace{\labelsep}% 26–27 November 2009. \normalfont\bfseries #1} [4] Peter Flynn. Re: Uncommon typography. Post The following is an example of the default ap- to comp.text.tex newsgroup, 4 June 2007. pearance of a description list: [5] George Greenwade et al. LATEX: An unofficial reference manual. http://svn.gna.org/ Short A short label. viewcvs/*checkout*/latexrefman/trunk/ Longer label A longer label. latex2e.html#Font-styles. Package home A very long label exceeding the available width page: http://home.gna.org/latexrefman. A long label with the text also longer than a [6] Dan Luecking. Re: How to find the fontsize? single line. Post to comp.text.tex newsgroup, Medium label The more typical length of a label 24 August 2010. and some text. [7] Frank Mittelbach and Michel Goossens. A As you can easily see, it does not handle long The LTEX Companion. Addison Wesley, \item labels in a graceful manner. second edition, 2004. ISBN 0-201-36299-6. This version of the \descriptionlabel meets [8] Stephen Moye. Re: asterism. Post to xetex Ernest’s requirements: mailing list, 11 January 2010. \usepackage{calc} % or xparse [9] Heiko Oberdiek. Re: Raising a colon to \newlength{\dlabwidth} center. Post to comp.text.tex newsgroup, \newcommand*{\widedesclabel}[1]{% 23 November 2009. \settowidth{\dlabwidth}{\textbf{#1}}% [10] Bob Tennent. Re: How to find the fontsize? \hspace{\labelsep}% Post to comp.text.tex newsgroup, \ifdim\dlabwidth>\columnwidth 24 August 2010. \parbox{\columnwidth-\labelsep}% [11] Herbert Voß. Re: How to draw a box {\textbf{#1}\strut}% around the boundaries of a glyph? Post to \else comp.text.tex newsgroup, 27 March 2009. \textbf{#1} \fi} ⋄ Peter Wilson \let\descriptionlabel\widedesclabel 12 Sovereign Close which, when applied to the previous example yields: Kenilworth, CV8 1SQ UK herries dot press (at) earthlink dot net

Peter Wilson TUGboat, Volume 35 (2014), No. 1 39

The pkgloader and lt3graph packages: a systematic approach to resolve package conflicts. Toward simple and powerful package This is where the power of TEX comes in handy. A A package can be written to oversee the package loading management for LTEX process: a LAT X package manager. It should be easy Michiel Helvensteijn E enough to use for the casual document author, yet Abstract powerful enough to allow package authors to hook into it to simplify their development process. This article introduces the pkgloader package. I re- Section 2 introduces pkgloader, which I wrote cently wrote this package to address one of the to fulfill this rˆole. Reading this section should be major frustrations of LAT X: package conflicts. It E enough to give you a general idea of how to use it. also introduces lt3graph, a LAT X3 library used by E Section 3 describes lt3graph, a utility library using pkgloader to do most of the heavy lifting. A the LTEX3 programming layer which does most of 1 Introduction the heavy lifting for pkgloader. Finally, Section 4 A goes through some of the more advanced and planned LTEX package conflicts are a common source of frus- features of pkgloader. tration. If you are reading this article, you’re proba- pkgloader A Here is a quick glimpse of in use: bly experienced enough with LTEX to have encoun- tered them more than once. I cannot improve upon \RequirePackage{pkgloader} ... the words of Freek Dijkstra [3] on the subject: \documentclass{article} “Package conflicts are a hell.” ... Package conflicts can exist because of the sheer power \usepackage{algorithm} A \usepackage{hyperref} any order of TEX[4], the language on which LTEX is based. float ) Not only is it Turing complete [8], but most of the \usepackage{ } ... language can be redefined from within the language \begin{document} itself. This was famously demonstrated with the TEX ... script xii.tex, written by David Carlisle [1] (if you \end{document} haven’t seen it yet, download and compile it now; A it’s awesome). LTEX packages can not only add new Warning: This package is still under development. definitions, but also remove and modify existing ones. Some of the features described in this article may not They can offer domain specific languages [6], monkey- yet be fully implemented, and the presented syntax patch the core language to hook into existing com- may still change in the coming months. However, its mands [7], and even change the meaning of individual main purpose and the fundamental ideas underlying symbols by altering their category code [5]. Put sim- its implementation are here to stay. A ply, LTEX packages have free rein. This power can be Community collaboration: I intend for the de- quite useful, but makes it too easy for independent velopment and maintenance of this package to be as package authors to step on each others’ toes. open as possible to community collaboration. This A different type of conflict concerns package op- package has a very wide scope, and is rather invasive. tions. If a package is requested more than once with If done right, it has the potential to become widely A A different options, LTEX bails out with an error mes- used and improve the LTEX experience for document sage. This is an understandable precaution. Because authors and package authors alike. If done wrong, it A of the way package loading works, LTEX has no way will break things and annoy many people. to apply the second set of options. 
The package will I hope to bring some useful domain knowledge A have already been loaded with the first set. to the development effort, but there are many LTEX Most package authors are well aware of these gurus out there who have more experience and insight problems. Document authors are told to avoid cer- than I do. If you like the idea of pkgloader and tain package combinations, or to load packages in would like to contribute in any way, I encourage you some specific order. Some of the larger packages are to contact me personally, or to file issues or pull designed to test for the presence of other packages in requests through the pkgloader Github page: order to circumvent known conflicts. Unfortunately, github.com/mhelvens/latex•pkgloader this is all done in an ad hoc fashion. Solving these problems on a case-by-case basis 2 The pkgloader package (Part 1) takes time and effort for both document and package This package was inspired by my PhD research [2], authors. It pollutes the code, makes maintenance which happens to be all about conflicts between inde- more difficult, and confuses new users. We need pendently developed modules. I also took cues from

A The pkgloader and lt3graph packages: Toward simple and powerful package management for LTEX 40 TUGboat, Volume 35 (2014), No. 1

similar libraries and standards for other languages, ensures that hyperref is loaded before algorithm. such as JavaScript, which is surprisingly similar to These are the rules that would allow the code on A LTEX in many ways. page 39 to compile without problems. Note that neither rule actually loads any packages. They simply 2.1 How to use the package manager tell the package manager how to treat certain pairs A LTEX packages are generally loaded with one of of packages, should they ever be requested together the commands \usepackage, \RequirePackage, or in a single document. \RequirePackageWithOptions. In a similar way, The third rule states that fixltx2e must al- document classes are loaded with \documentclass, ways be loaded, and must be loaded early. The fourth \LoadClass or \LoadClassWithOptions. Normally, rule states that the and pseudocode when such a command is reached, the relevant class packages should never be loaded together. They both or package is loaded on the spot. The idea behind also include a textual reason, which documents the pkgloader is to make it the very first file you load: rule, and is included in certain error messages. before the document class, and before any other pack- To better understand how these rules work, let’s A age. Thus, the main file for a LTEX document using dive into their underlying model: a directed graph. pkgloader would be structured like this: 3 The lt3graph Package \RequirePackage{pkgloader} The pkgloader package is written in the experimen- hdocument class and packages in any orderi A \LoadPackagesNow tal LTEX3 programming layer expl3, which gives us ... optional something akin to a traditional imperative program- \begin{document} o ming language, with data structures, while loops, ... and so on. Let’s face it, TEX is no one’s first choice \end{document} for a programming language. But expl3 makes it A bearable. So kudos to the LTEX3 team! The area between \RequirePackage{pkgloader} To represent graphs, I wrote a data-structure and \LoadPackagesNow is called the pkgloader area. package called lt3graph.1 This library was born as Inside this area, the loading of all classes and pack- a means to an end, but has grown into a full-fledged ages is postponed. It also ends automatically upon general-purpose data structure for representing and reaching the end of the preamble. The package man- analyzing directed graphs. ager analyzes the intercepted loading requests and A directed graph contains vertices (nodes) and executes them in the proper order and with the edges (arrows). Using lt3graph, an example graph proper options, lifting this burden from the user. may be defined as follows: 2.2 A conflict resolution database \ExplSyntaxOn The pkgloader package does not analyze the actual \graph_new:N \l_my_graph code of each package in order to detect conflicts. In \graph_put_vertex:Nn \l_my_graph {v} \graph_put_vertex:Nn \l_my_graph w fact, because T X is Turing complete, this would be { } E \graph_put_vertex:Nn \l_my_graph {x} mathematically impossible. The package manager \graph_put_vertex:Nn \l_my_graph {y} is backed by a database of rules for recognizing and \graph_put_vertex:Nn \l_my_graph {z} resolving known conflicts, as well as performing other \graph_put_edge:Nnn \l_my_graph {v}{w} neat tricks. 
The following shows some examples: \graph_put_edge:Nnn \l_my_graph {w}{x} \graph_put_edge:Nnn \l_my_graph {w}{y} \Load {float} before {hyperref} \graph_put_edge:Nnn \l_my_graph {w}{z} \Load {algorithm} after {hyperref} \graph_put_edge:Nnn \l_my_graph {y}{z} \Load {fixltx2e} always early \ExplSyntaxOff because {it fixes some imperfections in LaTeX2e} Each vertex is identified by a key, which, to this \Load error if {algorithms && pseudocode} library, is a string: a list of characters with category because {they provide the same functionality and conflict code 12 and spaces with category code 10. An edge on many command names} is then declared between two vertices by referring to their keys. By supplying an additional argument to The first two rules encode some workarounds for the the functions above, you can store arbitrary data in hyperref package, which is notorious for causing 1 Bearing the prefix lt3 rather than the more common conflicts. The first one says that float must be prefix l3 indicates that the package is not officially supported loaded before hyperref. Similarly, the second rule by the LATEX3 team.

Michiel Helvensteijn TUGboat, Volume 35 (2014), No. 1 41

a vertex or edge for later retrieval. Let’s use TikZ to A topological order is not uniquely determined. The visualize this graph: important thing is that the constraints imposed by the graph are respected. \newcommand{\vrt}[1] {\node(#1){\ttfamily\vphantom{Iy}#1};} 4 The pkgloader package (Part 2) \begin{tikzpicture}[every path/.style= {line width=1pt,->}] The pkgloader package uses graph vertices to rep- A \matrix[nodes={circle,draw}, resent LTEX packages and arrows to encode package row sep=1cm, column sep=1cm, ordering rules. At the end, selected packages are execute at begin cell=\vrt] loaded in topological order. { v & w & x \\ Let’s take a more detailed look at some of the & y & z \\ }; features of pkgloader. \ExplSyntaxOn \graph_map_edges_inline:Nn 4.1 Rules \l_my_graph Each \Load rule can contain a number of different { \draw (#1) to (#2); } \ExplSyntaxOff clauses. We look at them one by one. \end{tikzpicture} It usually contains a package description, con- sisting of a name, a set of options and a minimal version, just like the \usepackage command: v w x \Load [options] {package-name} [version]

It can contain a condition clause, indicating y z when the rule should be applied. This takes the form of a Boolean formula in expl3 style, in which the Just to be clear, this library does not, inherently, atomic propositions are package names: understand any TikZ. What it does is help you to \Load {pkg1} if {pkg2 || pkg3 && !pkg4} analyze the structure of your graph. For example, does it contain a cycle? This rule would load pkg1 if either pkg2 will be loaded too, or if pkg3 will be loaded but pkg4 \ExplSyntaxOn will not. Alternatively, the condition clause can \graph_if_cyclic:NTF be always, indicating that the rule should be ap- \l_my_graph {Yep}{Nope} \ExplSyntaxOff plied under any conditions. Finally, the keywords if loaded can be used to apply the rule only if Nope the package named in the package description is re- Indeed, there are no cycles in this graph. While we’re quested anyway. This is the default behavior, but at it, is vertex w reachable from vertex y? the keywords can be included to make it explicit. There is one exception to the structure described \ExplSyntaxOn above. Instead of a package description, a rule can \graph_acyclic_if_path_exist:NnnTF contain the error keyword, followed by a condition \l_my_graph {y}{w}{Yep}{Nope} \ExplSyntaxOff clause, to describe conditions that should never oc- cur — usually invalid package combinations: Nope \Load error if {pkgX && pkgY} Quite true. Finally, and most importantly, you can interpret the graph as a dependency graph and list When multiple condition clauses are present in a its vertices in topological order: single rule, their disjunction is used. In other words, \ExplSyntaxOn the rule is applied if any of its conditions is satisfied. \clist_new:N \LinearClist A non-error rule may contain an order clause, \graph_map_topological_order_inline:Nn to ensure that the package described by the rule \l_my_graph is loaded in a specific order with regard to other { \clist_put_right:Nn packages: \LinearClist {\texttt{#1}} } \ExplSyntaxOff \Load {A} after {B,C} before {D} $ \LinearClist $ This rule ensures that if package A is ever loaded, it v, w, x, y, z is never loaded before B or C, and never after D. This

A The pkgloader and lt3graph packages: Toward simple and powerful package management for LTEX 42 TUGboat, Volume 35 (2014), No. 1

is where the graph representation comes in. The loads a recommended set of rules, allowing the aver- above rule would yield a graph like this: age user to get started without any hassle. But this behavior can be overwritten using package options: B A D \RequirePackage[recommended=false, my-better-rules] C {pkgloader} ... This can take care of specific known package \LoadPackagesNow ordering conflicts. But some packages should, as a This means: the pkgloader•recommended.sty rule of thumb, be loaded before all other packages, or file, which is usually preloaded by default, should after all others, unless specified otherwise. A typical not be loaded for this document. Instead, load the example is the hyperref package, which should pkgloader•my•better•rules.sty file. almost always be loaded late in the run. For this, Take note of the following: every ruleset should the early and late clauses may be used: be bundled in a pkgloader•hsomethingi.sty file, \Load {hyperref} late and can then be loaded by specifying hsomethingi as a package option. The early and late clauses work by ordering the So basically, any user can create rules for their package relative to one of two placeholder nodes: own documents, or distribute custom rulesets, e.g., through CTAN. But primarily, I expect two groups of people to author pkgloader rules: 1 2 hyperref A The LTEX community: The recommended rule- set would, ideally, be populated further through early normal late the efforts of anyone who diagnoses and solves package conflicts. Perhaps through websites like These two nodes are always present in the graph. tex.stackexchange.com, or by filing issues Ordering a package early is intuitively the same as or pull-requests to the pkgloader Github page. ordering it ‘before {1}’. And ordering it late is Package authors: pkgloader will eventually be the same as ordering it ‘after {2}’. All packages that are, after considering all rules, not (indirectly) directly usable for package authors just as for document authors, to include their own rules ordered ‘before {1}’ or ‘after {2}’ are automati- from right inside their packages. Rather than cally ordered ‘after {1} before {2}’. A rule can have any number of order clauses, and all are taken manually scanning for and fixing potential con- pkgloader into account when one of the conditions of the rule flicts, they could leverage , as in: is satisfied. \RequirePackage{pkgloader} Finally, a rule can be annotated with a reason, \Load me before {some-pkg} explaining why it was created: \Load me after {some-other-pkg} \ProcessRulesNow \Load {comicsans} always because {that font is awesome!} It may be possible to apply such rules in the same A LTEX run in which they are encountered. But if This text does not have any effect on the behavior of not, the package manager will know what to do in the rule. It is meant for human consumption, though the next run through the use of auxiliary files. This should not be formatted in any way. It should be functionality has not yet been implemented. semantically and grammatically correct when follow- ing the words “This rule was created because . . . ”. 4.3 Error messages It can also be used for citing relevant sources. It is There are two types of error messages that may be used in certain pkgloader error messages and may generated by pkgloader. eventually be used to generate documentation. The first type of error message happens when an error rule is triggered. It looks like this: 4.2 Rulesets You may be wondering: who makes up these rules? 
A combination of packages fitting Short answer: Anyone. Rules can be placed di- the following condition was requested: rectly inside the pkgloader area, but they can also hconditioni be bundled in a .sty file. By default, pkgloader This is an error because hreasoni.

Michiel Helvensteijn TUGboat, Volume 35 (2014), No. 1 43

The second type of error message is a bit more 5 Conclusion interesting. Since rules can effectively come from any I hope this article and the packages described therein source, it is possible to apply rules that contradict have been useful and/or inspiring. And I hope to each other. To give an (unrealistic) example: A have convinced you that the idea of a LTEX package \Load {pkgX} always before {pkgY} manager is worth pursuing. because {pkgX is better} Both packages are available on CTAN: \Load {pkgY} always before {pkgX} www.ctan.org/pkg/pkgloader because {pkgY is better} www.ctan.org/pkg/lt3graph I love feedback, and I love questions. I can be reached through the e-mail address and website pkgX pkgY specified in the signature block below the references. Any kind of feedback or patches regarding one of the A potential circular ordering is not necessarily a packages should go through their Github pages: problem, so long as both rules are never applied github.com/mhelvens/latex-pkgloader in the same run. But in this exact example, the github.com/mhelvens/latex-lt3graph following error message will be generated: Happy TEXing!

There is a cycle in the requested References package loading order: [1] David Carlisle. xii.tex, December 1998. pkgX http://www.ctan.org/pkg/xii. --1--> pkgY [2] Dave Clarke, Michiel Helvensteijn, and --2--> pkgX Ina Schaefer. Abstract delta modeling. The circular reasoning is as follows: SIGPLAN Notices, 46(2):13–22, February (1) ’pkgX’ is to be loaded before 2011. Proceedings of GPCE’10, October 10–13, ’pkgY’ because pkgX is better. 2010, Eindhoven, The Netherlands. http: (2) ’pkgY’ is to be loaded before //doi.acm.org/10.1145/1942788.1868298. ’pkgX’ because pkgY is better. [3] Freek Dijkstra. LaTeX package conflicts, Whenever this happens, the user may want to re- June 2012. http://www.macfreek.nl/ consider one of their included rulesets, or file a bug- memory/LaTeX_package_conflicts. report to the responsible party — especially if the [4] Donald E. Knuth. The TEXbook. circularity comes from the recommended ruleset. Addison-Wesley, Reading, MA, USA, 1986. 4.4 Options and versions [5] Seamus. What are category codes?, The package manager need not be confined to playing April 2011. http://tex.stackexchange.com/ with the package loading order. While intercepting questions/16410/what-are-category-codes. package loading requests, it will be able to accu- [6] Wikipedia. Domain-specific language, 2014. mulate package options and versions as well, and http://en.wikipedia.org/wiki/ then combine them in a multitude of flexible ways. Domain-specific_language. Possible ways of combining option-lists include: [7] Wikipedia. Monkey patch, January 2014. • Concatenate all option-lists for the same package http://en.wikipedia.org/w/index.php? into a single list, in any arbitrary order. title=Monkey_patch&oldid=586056263. • Interpret key=value options, and generate an [8] Wikipedia. Turing completeness, 2014. error message when two different instances call http://en.wikipedia.org/wiki/Turing_ for two different values for the same key. completeness. • Provide a tailor-made function for combining option lists for a specific package. ⋄ Michiel Helvensteijn Leiden University, As for package versions, it would make sense to Niels Bohrweg 1, take the ‘maximum’ version-string that is encoun- 2333 CA, Leiden, tered, and use that to load the actual package. the Netherlands As of this writing, neither of these features has mhelvens+latex (at) gmail dot com been implemented. http://mhelvens.net

A The pkgloader and lt3graph packages: Toward simple and powerful package management for LTEX 44 TUGboat, Volume 35 (2014), No. 1

An overview of Pandoc the (few) conventions that the program follows in order to format the document. Massimiliano Dominici These languages are mainly used in two fields: Abstract documentation of code (reStructuredText, AsciiDoc, etc.) and management of contents for the web (Mark- This paper is a short overview of Pandoc, a utility down, Textile, etc.). In the case of code documenta- for the conversion of Markdown-formatted texts to tion, the use of an LML is a good choice, because the many output formats, including LAT X and HTML. E documentation is interspersed in the code itself, so it 1 Introduction should be easy to read by a developer perusing the code; but at the same time it should be able to be Pandoc is software, written in Haskell, whose aim converted to presentation formats (PDF and HTML, is to facilitate conversion between some lightweight traditionally, but today many IDEs include some markup languages and the most widespread ‘final’ form of visualization for the internal documentation). document formats.1 On the program’s website [3], In the case of web content, the emphasis is placed Pandoc is described as a sort of ‘swiss army knife’ for on the ease of writing for the user. Many content converting between different formats, and in fact it management systems already provide plugins for one is able to read simple files written in LATEX or HTML; A or more of those languages and the same is true for but it is of lesser use when trying to translate LT X 2 E static site generators that are usually built around documents with non-trivial constructs such as com- one of them and often provide support for others. mands defined by the user or by a dedicated package. The various wiki dialects can be considered another Pandoc shows its real utility, in my opinion, instance of LML. when what is needed is to obtain several output The actual ‘lightness’ of an LML depends greatly formats from a single source, as in the case of a docu- on its ultimate purpose. In general, an LML con- ment distributed online (HTML), in print form (PDF ceived for code documentation will be more complex via LAT X) and for viewing on tablets or ebook read- E and less readable than one conceived for web content ers (EPUB). In such cases one may find that writing management, which in turn will often not be capable the document in a rich format (e.g. LAT X) and con- E of general semantic markup. A paradigmatic exam- verting later to other markup languages often poses ple of this second category is Markdown that, in its significant problems because of the different ‘philoso- original version, stays rigorously close to the mini- phies’ that underlie each language. It is advisable, in- malistic approach of the first LMLs. The following stead, to choose as a starting point a language that is citation from its author, John Gruber, explains his ‘neutral’ by design. A good candidate for this role is a intentions in designing Markdown: lightweight , and in particular Mark- down, of which Pandoc is an excellent . Markdown is intended to be as easy-to-read In this article we will briefly discuss the concept and easy-to-write as is feasible. Readabil- of a ‘lightweight markup language’ with particular ity, however, is emphasized above all else. reference to Markdown (§ 2), and then we will re- A Markdown-formatted document should be view Pandoc in more details (§ 3) before drawing our publishable as-is, as plain text, without look- conclusions (§ 5). 
ing like it’s been marked up with tags or for- matting instructions.3 2 Lightweight markup languages: The only output format targeted by the refer- Markdown ence implementation of Markdown is HTML; indeed, Before getting to the heart of the matter, it is advis- Markdown also allows raw HTML code. Gruber has able to say a few words about lightweight markup languages (LML) in general. They are designed with 2 Static site generators are a category of programs that the explicit goal of minimizing the impact of the build a website in HTML starting from source files written in markup instructions within the document, with a a different format. The HTML pages are produced beforehand, usually on a local computer, and then loaded on the server. particular emphasis on the readability of the text by Websites built this way share a great resemblance with old a human being, even when the latter does not know websites written directly in HTML, but unlike those, in the building process it is possible to use templates, share metadata Translation by the author from his original in ArsTEXnica across pages, and create structure and content programmati- #15, April 2013, “Una panoramica su Pandoc”, pp. 31–38. cally. Static site generators constitute an alternative to the 1 Strictly speaking, LATEX isn’t a ‘final’ document format more popular dynamic server applications. in the same way PDF, ODF, DOC, EPUB, etc. are. But, from 3 [1], http://daringfireball.net/projects/markdown/ the point of view of a Pandoc user, LATEX is a ‘final’—or syntax#philosophy. A significant contribution to the intermediate, at least — product. design of Markdown was made by Aaron Swartz.

Massimiliano Dominici TUGboat, Volume 35 (2014), No. 1 45

Table 1: Markdown syntax: inline elements. Element Markdown LATEX HTML Links [link](http://example.net) \href{link}{% http://example.net} link Emphasis _emphasis_ \emph{emphasis} emphasis *emphasis* \emph{emphasis} emphasis Strong emphasis __strong__ \textbf{strong} strong **strong** \textbf{strong} strong Verbatim ‘printf()‘ \verb|printf()| printf() Images ![Alt](/path/to/img.jpg) \includegraphics{img} Alt

Table 2: Markdown syntax: block elements. Element Markdown LATEX HTML Sections # Title # \section{Title}

Title

## Title ## \subsection{Title}

Title

...... Quotation > This paragraph \begin{quote}

> will show This paragraph This paragraph > as quote. will show will show as quote. as quote. \end{quote}

Itemize * First item \begin{itemize}
    * Second item \item First item
  • First item
  • * Third item \item Second item
  • Second item
  • \item Third item
  • Third item
  • \end{itemize}
Enumeration 1. First item \begin{enumerate}
    2. Second item \item First item
  1. First item
  2. 3. Third item \item Second item
  3. Second item
  4. \item Third item
  5. Third item
  6. \end{enumerate}
Verbatim Text paragraph. Text paragraph.

Text paragraph.

grep -i ’\$’ grep -i ’\$’ always adhered to these initial premises and has Of course, in the reference implementation there is consistently refused to extend the language beyond no LATEX output, so I have provided the most logical the original specifications. This stance has caused a translation. In the following sections we will see how proliferation of variants, so that every single imple- Pandoc works in practice. mentation constitutes an ‘enhanced’ version. Famous websites like GitHub, reddit and Stack Overflow, all 3 An overview of Pandoc support their own Markdown flavour; and the same As mentioned in the introduction, Pandoc is pri- is true for conversion programs like MultiMarkdown marily a Markdown interpreter with several output or Pandoc itself, which also introduce new output formats: HTML,LATEX, ConTEXt, DocBook, ODF, formats. It’s not necessary, here, to examine the OOXML, other LMLs such as AsciiDoc, reStructured- details of the different flavours; the reader can get an Text and Textile (a complete list can be found in [3]). idea of the basic formatting rules from tables 1 and 2. Pandoc can also convert, with severe restrictions,

a source file in LaTeX, HTML, DocBook, Textile or reStructuredText to one of the aforementioned output formats. Moreover it extends the syntax of Markdown, introducing new elements and providing customization for the elements already available in the reference implementation.

3.1 Markdown syntax extensions

Markdown provides, by design, a very limited set of elements. Tables, footnotes, formulas, and bibliographic references have no specific markup in Markdown. The author's intent is that all markup exceeding the limits of the language should be expressed in HTML. Pandoc maintains this approach (and, for LaTeX or ConTeXt output, allows the use of raw TeX code) but makes it unnecessary, since it introduces many extensions, giving the user proper markup for each of the elements mentioned above. In the following paragraphs we'll take a look at these extensions.

Metadata  Metadata for title, author and date can be included at the beginning of the file, in a text block, each preceded by the character %, as in the following example.

  % Title
  % First Author; Second Author
  % 17/02/2013

The content for any of these elements can be omitted, but then the respective line must be left blank (unless it is the last element, i.e. the date).

  % Title
  %
  % 17/02/2013

  %
  % First Author; Second Author
  % 17/02/2013

  % Title
  % First Author; Second Author

Since version 1.12 metadata support has been substantially extended. Now Pandoc accepts multiple metadata blocks in YAML format (YAML Ain't Markup Language, http://www.yaml.org/), delimited by a line of three hyphens (---) at the top and a line of three hyphens (---) or three dots (...) at the bottom. This gives the user a high level of flexibility in setting and using variables for templates (see section 3.1).

YAML structures metadata in arrays, thus allowing for a finer granularity. The user may specify in his source file the following code:

  ---
  author:
  - name: First Author
    affiliation: First Affiliation
  - name: Second Author
    affiliation: Second Affiliation
  ---

and then, in the template

  $for(author)$
  $if(author.name)$
  $author.name$
  $if(author.affiliation)$ ($author.affiliation$)
  $endif$
  $else$
  $author$
  $endif$
  $endfor$

to get a list of authors with (if present) affiliations. As we will see in section 3.1, a YAML block can also be used to build a bibliographic database.

Footnotes  Since the main purpose of Markdown and its derivatives is readability, the mark and the text of a footnote should usually be split. It is recommended to write the footnote text just below the paragraph containing the mark, but this is not strictly required: the footnotes could be collected at the beginning or at the end of the document, for instance. The mark is an arbitrary label enclosed in the following characters: [^...]. The same label, followed by ':', must precede the footnote text. When the footnote text is short, it is possible to write it directly inside the text, without the label. The output from all the footnotes is collected at the end of the document, numbered sequentially. Here is an example of the input:

  Paragraph containing[^longnote] a
  footnote too long to be written directly
  inside the text.

  [^longnote]: A footnote too long to be
  written inside the text without causing
  confusion.

  New paragraph.^[A short note.]

Tables  Again, syntax for tables is based on considerations of readability of the source. The alignment of the cells composing the table is immediately visible in the alignment of the text with respect to the dashed line that divides the header from the rest of the table; this line must always be present, even when the header is void. (In fact, it is the header, if present, or the first line of the table that sets the alignment of the columns. Aligning the remaining cells is not needed, but it is recommended as an aid for the reader.) When the table includes cells with more than one line of text, it is mandatory


to enclose it between two dashed lines. In this case divided by blank lines from the remaining text, it the width of each column in the source file is used to will be interpreted as a floating object with its own compute the width of the equivalent column in the caption taken from the ‘Alternative text’. output table. Multicolumn or multirow cells are not Listings In standard Markdown, verbatim text is supported. The caption can precede or follow the ta- marked by being indented by four spaces or one tab. ble; it is introduced by the markup ‘:’ (alternatively: To that, Pandoc adds the ability to specify identifiers, Table:) and must be divided from the table itself classes and attributes for a given block of ‘verbatim’ by a blank line: material. Pandoc will treat them in different ways, ------depending on the output format and the command Right Centered Left line options; in some circumstances, they will sim------Text Text Text ply be ignored. To achieve this, Pandoc introduces aligned aligned aligned an alternative syntax for listings of code: instead right center left of indented blocks, they are represented by blocks delimited by sequences of three or more tildes (~~~) New cell New cell New cell or backticks (‘‘‘); identifiers, classes and attributes ------must follow that initial ‘rule’, enclosed in braces. In the following example6 we can see a listing of Python Table: Alignment code with, in this order: an identifier, the class that There’s an alternative syntax to specify the marks it as Python code, another class that speci- alignment of the individual columns: divide columns fies line numbering and an attribute that marks the with the character ‘|’ and use the character ‘:’ in starting point of the numbering. the dashed line below the header to specify, through ~~~ {#bank .python .numberLines startFrom="5"} its placement, the column’s alignment, as shown in class BankAccount(object): the following example: def __init__(self, initial_balance=0): Right | Centered | Left self.balance = initial_balance ------:|:------:|:------def deposit(self, amount): Text | Text | Text self.balance += amount aligned | aligned | aligned def withdraw(self, amount): right | center | left self.balance -= amount | | def overdrawn(self): New cell | New cell | New cell return self.balance < 0 my_account = BankAccount(15) : Alignment by ‘:’ my_account.withdraw(5) In the examples above, the cells cannot contain print my_account.balance ‘vertical’ material (multiple paragraphs, verbatim ~~~ blocks, lists). ‘Grid’ tables (see the example below) It is possible to use identifiers, class and at- allow this, at the cost of not being to specify the tributes for inline code, too: column alignments. The return value of the ‘printf‘{.C} function +------+------+------+ is of type ‘int‘. | Text | Lists | Code | +======+======+======+ By default, Pandoc uses a simple verbatim en- | Paragraph. | * Item 1 | ~~~ | vironment for code that doesn’t need highlighting | | * Item 2 | \def\PD{% | and the Highlighting environment, defined in the | Paragraph. | | \emph{Pandoc} | template’s preamble (see 3.1) and based on Verbatim | | * Item 3 | ~~~ | from fancyvrb, when highlighting is needed. If the +------+------+------+ option --listings is given on the command line, | New cell | New cell | New cell | Pandoc uses the lstlistings environment from list- +------+------+------+ ings every time a code block is encountered. 
Figures As shown in table 1, Markdown allows for Formulas Pandoc supports mathematical formu- the use of inline images with the following syntax: las quite well, using the usual TEX syntax. Expres- ![Alternative text](/path/image) sions enclosed in dollar signs will be interpreted as where ‘Alternative text’ is the description that inline formulas; expressions in double dollar signs HTML uses when the image cannot be viewed. Pan- doc adds to that one more feature: if the image is 6 From http://wiki.python.org/moin/SimplePrograms.


--- will be interpreted as displayed formulas. This is all references: comfortably familiar to a TEX user. - author: The way these expressions will be treated de- family: Gruber pends on the output format. For T X (LAT X/Con- given: E E - John TEXt) output, the expressions are passed without id: gruber13:_markd modifications, except for the substitution of the issued: delimiters for display math: \[...\] instead of year: 2013 $$...$$. When HTML (or similar) output is re- title: Markdown type: no-type quired, the behavior is controlled by command line publisher: the formulas by means of Unicode characters. Other - volume: 32 options allow for the use of some of the most com- page: 272-277 container-title: TUGboat mon JavaScript libraries for visualizing math on the author: web: MathJax, LaTeXMathML and jsMath. It is family: Kielhorn also possible, always by means of a command line given: switch, to render formulas as images or to encode - Axel 7 id: kielhorn11:_multi them as MathML. issued: Pandoc is also able to parse simple macros and year: 2011 expand them in output formats different from the title: Multi-target publishing type: article-journal supported TEX dialects. This feature, though, is only issue: 3 available in the context of math rendering. ...

Figure 1: A YAML bibliographic database (line break in the url is editorial).

Citations  Pandoc can build a bibliography (and manage citations inside the text) using a database in any of several common formats (BibTeX, EndNote, ISI, etc.). The database file must always be specified as the argument of the option --bibliography. Without other options on the command line, Pandoc will include citations and bibliographic entries as plain text, formatted following the bibliographic style 'Chicago author–date'. The user may specify a different style by means of the option --csl, whose argument is the name of a CSL style file,⁸ and may also specify that the bibliographic apparatus will be managed by natbib or biblatex. In this case Pandoc will not include in the LaTeX output citations and entries in extended form, but only the required commands. The options to get this behavior are, respectively, --natbib and --biblatex.

The user must type citations in the form [@key1; @key2;...] or @key1 if the citation should not be enclosed in round brackets. A dash preceding the label suppresses the author's name (when supported by the citation format). Bibliographic references are always placed at the end of the document.

Since version 1.12 native support for citations has been split from the core functions of Pandoc. In order to activate this feature, one must now use an external filter (--filter pandoc-citeproc, to be installed separately⁹).

A new feature is that bibliographic databases can now be built using the references field inside a YAML block. Finding the correct encoding for a YAML bibliographic database can be a little tricky, so it is recommended, if possible, to convert from an existing database in one of the formats recognized by Pandoc (among them BibTeX), using the biblio2yaml utility, provided together with the pandoc-citeproc filter. The YAML code for the first two items in this article's bibliography, created by converting the .bib file, is shown in figure 1.

A YAML field can be used also for specifying the CSL style for citations (csl field), or the external bibliography file, if required (bibliography field).

Raw code (HTML or TeX)  All implementations of Markdown, of whichever flavour, allow for the use of raw HTML code, written without modifications in the output, as we mentioned in section 2. Pandoc extends this feature, allowing for the use of raw TeX code, too. Of course, this works only for LaTeX/ConTeXt output.

Footnotes:
⁷ The web pages for these different rendering engines for math on the web are http://www.mathjax.org, http://math.etsu.edu/LaTeXMathML, http://www.math.union.edu/~dpvc/jsmath and http://www.w3.org/Math.
⁸ CSL, the Citation Style Language (http://citationstyles.org), is an open, XML-based language to describe the formatting of citations and bibliographies. It is used in several citation managers, such as Zotero, Mendeley, and Papers. A detailed list of the available styles can be found in http://zotero.org/styles.
⁹ The filter is not needed when using natbib or biblatex directly instead of the native support.
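Putting the options just described together, a typical command line for a Markdown article with citations could look as follows. This is only a sketch: the file names (paper.md, mybib.bib) and the style file name are illustrative, not taken from the article.

  pandoc -s --filter pandoc-citeproc --bibliography mybib.bib \
         --csl mystyle.csl -o paper.tex paper.md

  pandoc -s --natbib --bibliography mybib.bib -o paper.tex paper.md

In the first form pandoc-citeproc formats the citations and the bibliography itself, following the given CSL style; in the second, as described above, the LaTeX output contains only the natbib citation commands and the bibliography is left to natbib and BibTeX.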


1 \documentclass$if(fontsize)$[$fontsize$]$endif$ Pandoc; to obtain a complete document, including a 2 {article} preamble, the command line option --standalone 3 \usepackage{amssymb,amsmath} (or its equivalent -s) is used. 4 \usepackage{ifxetex} It is also possible to build more flexible tem- 5 \ifxetex plates, useful for different projects with different 6 \usepackage{fontspec,xltxtra,xunicode} features, providing for a moderate level of customiza- 7 \defaultfontfeatures{Mapping=tex-text, 8 Scale=MatchLowercase} tion. As the reader can see in figure 2, a template is 9 \else substantially a file in the desired output format (in A 10 \usepackage[utf8]{inputenc} this case LTEX) interspersed with variables and con- 11 \fi trol flow statements introduced by a dollar sign. The 12 $if(natbib)$ expressions will be evaluated during the compilation 13 \usepackage{natbib} and replaced by the resulting text. For instance, at 14 \bibliographystyle{plainnat} line 113 of the listing in figure 2 we find the expres- 15 $endif$ 16 $if(biblatex)$ sion $body$, which will be replaced by the document 17 \usepackage{biblatex} body. Above, at lines 12–21, we can find the se- 18 $if(biblio-files)$ quence of commands that will include in the final 19 \bibliography{$biblio-files$} output all the resources needed to generate a bibli- 20 $endif$ ography by means of natbib or biblatex. This code 21 $endif$ will be activated only if the user has specified either ... --natbib or --biblatex on the command line. The code to print the bibliography can be found at the 113 $body$ end of the listing, at lines 124–130. 114 115 $if(natbib)$ In this way it is possible to define all the desired 116 $if(biblio-files)$ variables and the respective options. The 117 $if(biblio-title)$ user can thus change the default template to specify, 118 $if(book-class)$ e.g., among the options that may be passed to the 119 \renewcommand\bibname{$biblio-title$} class, not only the body font, but a generic string 120 $else$ containing more options.10 We would replace the 121 \renewcommand\refname{$biblio-title$} first line in the listing of figure 2 with the following: 122 $endif$ 123 $endif$ \documentclass$if(clsopts)$[$clsopts$]$endif$ 124 \bibliography{$biblio-files$} Then we can compile with the following options: 125 $endif$ pandoc -s -t latex --template=mydefault \ 126 $endif$ -V clsopts=a4paper,12pt -o test.tex test.md 127 $if(biblatex)$ 128 \printbibliography given that we saved the modified template in the 129 $if(biblio-title)$[title=$biblio-title$]$endif$ current directory by the name mydefault.latex. 130 $endif$ Since version 1.12, ‘variables’ can be replaced by 131 $for(include-after)$ YAML ‘metadata’, specified either inside the source 132 $include-after$ file or on the command line using the -M option. 133 $endfor$ 134 \end{document} 4 Problems and limitations

Figure 2: Fragments of the default Pandoc v1.11 We’ve seen many nice features of Pandoc. Not sur- A prisingly, Pandoc also has some limitations and short- template for LTEX. comings. Some of these shortcomings are tied to the particular LML used by Pandoc. For instance, Mark- down doesn’t allow semantic markup.11 This kind of Templates One of the most interesting features limitation can be addressed using an additional level of Pandoc is the use of customized templates for 10 the different output formats. For HTML-derived and This can be also be achieved via the variable $fontsize. (Since Pandoc 1.12, the default LATEX template includes sepa- TEX-derived output formats this can be achieved in rate variables for body font size, paper size and language, and two ways. First of all, the user may generate only a generic $classoption variable for other parameters.) 11 the document ‘body’ and then include it inside a This is not an issue necessarily pertinent to all LMLs since some of them provide methods to define objects that be- ‘master’ (for TEX output, with \input or \include). A have like L TEX macros, either through pre- or post-processing In this way, an ad hoc preamble can be built be- (txt2tags) or by taking advantage of conceptually close struc- forehand. This is in fact the default behaviour for tures (the ‘class’ of a span in HTML, in Textile). In any case,


of abstraction, using preprocessors like gpp or m4, as scripts (a drawback for me, at least . . . ). The current illustrated by Aditya Mahajan in [4]. Of course, this version also allows Python, thus making easier the clashes with the initial purpose of readability and task of creating such scripts.12 introduces further complexity, though the use of m4 need not significantly increase the amount of extra 5 Conclusion markup. To conclude this overview, I consider Pandoc to be Other problems, though, unexpectedly arise in the best choice for a project requiring multiple out- A the process of conversion to LTEX output. For in- put formats. The use of a ‘neutral’ language in stance, the cross reference mechanism is calibrated the source file makes it easier to avoid the quirks to HTML and shows all its shortcomings with regard of a specific language and the related problems of A to the LTEX output. The cross reference is in fact translation to other languages. For a LATEX user in generated by means of a hypertext anchor and not particular, being able to type mathematical expres- \label \ref A by the normal use of and . As a typical sions ‘as in LTEX’ and to use a BibTEX database for example, let’s consider a labelled section referenced bibliographic references are also two strong points. later in the text, as in the following: One should not expect to find in Pandoc an easy ## Basic elements ## {#basic} solution for every difficulty. Limitations of LMLs [...] in general, and some flaws specific to the program, As we have explained in entail the need for workarounds, making the process [Basic elements](#basic) less immediate. This doesn’t change the fact that, if we get this result: the user is aware of such limitations and the project \hyperdef{}{basic}{% can bear them, Pandoc makes obtaining multiple \subsection{Basic elements}\label{basic} output formats from a single source extremely easy. } [...] References As we have explained in \hyperref[basic]{Basic elements} [1] John Gruber. Markdown. http://daringfireball.net/projects/ which is not exactly what a LAT X user would expect E markdown/, 2013. . . . Of course one could directly use \label and \ref, but they will be ignored in all non-TEX output [2] Axel Kielhorn. Multi-target publishing. formats. Or, we could use a preprocessor to get TUGboat, 32(3):272–277, 2011. http://tug. two different intermediate source files, one for HTML org/TUGboat/tb32-3/tb102kielhorn.pdf. and for LATEX (and maybe a third for ODF/OOXML, [3] John MacFarlane. Pandoc: a universal etc.), but by now the original point of using Pandoc document converter. http://johnmacfarlane. is being lost. net/pandoc/, 2013. Formulas, too, may cause some problems. Pan- [4] Aditya Mahajan. How I stopped worrying and doc recognizes only inline and display expressions. started using Markdown like TEX. http:// The latter are always translated as displaymath en- randomdeterminism.wordpress.com/2012/06/ vironments. It is not possible to specify a different 01/how-i-stopped-worring-and-started- kind of environment (equation, gather, etc.) unless using-markdown-like-tex/, 2012. one of the workarounds discussed above is employed, [5] Wikipedia. Lightweight markup language. with all the consequent drawbacks also noted. http://en.wikipedia.org/wiki/ It should be stressed, however, that Pandoc is a Lightweight_markup_language, 2013. program in active development and that several fea- tures present in the current version were not available a short time ago. 
So it is certainly possible that all or ⋄ Massimiliano Dominici A Pisa, Italy some of the shortcomings that a LTEX user finds in the current version of Pandoc will be addressed in the mlgdominici (at) gmail dot com near or mid-term. It is also possible, to some extent, to extend or modify Pandoc’s behaviour by means of scripts, as noted at http://johnmacfarlane.net/ 12 Another option is writing one’s own custom ‘writer’ in pandoc/scripting.html. One major drawback, un- Lua. A writer is essentially a program that translates the data til recently, was the mandatory use of Haskell for such structure, collected by the ‘reader’, in the format specified by the user. Having Lua installed on the system is not required, though, the philosophy behind LMLs doesn’t support such since a Lua interpreter is embedded in Pandoc. See http:// markup methods. johnmacfarlane.net/pandoc/README.html#custom-writers.


Numerical methods with LuaLATEX was undertaken a few years ago. (The leaders of the project and main developers are Taco Hoekwater, Juan I. Montijano, Mario P´erez, Luis R´andez Hartmut Henkel and Hans Hagen.) The idea was and Juan Luis Varona to enhance TEX with a previously existing general Abstract purpose programming language. After careful eval- uation of possible candidates, the language chosen An extension of T X known as LuaT X has been in E E was Lua (see http://www.lua.org), a powerful, fast, development for the past few years. Its purpose is to lightweight, embeddable scripting language that has, allow T X to execute scripts written in the general E of course, a free software license suitable to be used purpose programming language Lua. There is also with TEX. Moreover, Lua is easy to learn and use, LuaLAT X, which is the corresponding extension for E and anyone with basic programming skills can use it LAT X. E without difficulty. (Many examples of Lua code can In this paper, we show how LuaLAT X can be E be found later in this article, and also, for example, used to perform tasks that require a large amount at http://rosettacode.org/wiki/Category:Lua, of mathematical computation. With LuaLAT X in- E and http://lua-users.org/.) stead of LATEX, we achieve important improvements: LuaTEX is not TEX, but an extension of TEX, in since Lua is a general purpose language, rendering the same way that pdfTEX or X TE EX are extensions. documents that include evaluation of mathematical In fact, LuaTEX includes pdfTEX (it is an extension algorithms is much easier, and generating the PDF of pdfTEX, and offers backward compatibility), and file becomes much faster. also has many of the features of X TE EX. LuaTEX is still in a beta stage, but the current Introduction versions are usable (the first public beta was launched TEX (and LATEX) is a document markup language in 2007, and when this paper was written in January used to typeset beautiful papers and books. Al- 2013, the release used was version 0.74). though it can also do programming commands such It has many new features useful for typographic as conditional execution, it is not a general purpose composition; and examples can be seen at the project programming language. Thus there are many tasks web site http://www.luatex.org, and some papers that are easily done with other programming lan- using development versions have been published in guages, but are very complicated or very slow when TUGboat, among them [3, 2, 4, 5, 7, 11]. Most done with TEX. Due to this limitation, auxiliary of those articles are devoted to the internals and programs have been developed to assist TEX with are very technical, only for true TEX wizards; we common tasks related to document preparation. For do not deal with this in this paper. Instead, our instance, bibtex or biber to build bibliographies, and goal is to show how the mathematical power of the makeindex or xindy to generate indexes. In both embedded language Lua can be used in LuaTEX. Of cases, sorting a list alphabetically is a relatively sim- course, when we build LATEX over LuaTEX, we get ple task for most programming languages, but it is so-called LuaLATEX, which will be familiar to regular A very complicated to do with TEX, hence the desire LTEX users. for auxiliary applications. All the examples in this paper are done with A Another shortcoming of TEX is the computa- LuaLTEX. It is important to note that the current tion of mathematical expressions. 
One of the most version of LuaTEX is not meant for production and common uses of TEX is to compose mathematical beta users are warned of possible future changes in formulas, and it does this extremely well. However the syntax. However, the examples in this article TEX is not good at computing mathematics. For use only a few general Lua-specific commands, so it instance, TEX itself does not have built-in functions is likely these examples will continue to work with to compute a square root or a sine. Although it is future versions. possible to compute mathematical functions with the To process a LuaLATEX document we perform help of auxiliary packages written in TEX, internally the following steps: First, we must compile with these packages must compute functions using only LuaLATEX, not with LATEX; how to do this depends on addition, subtraction, multiplication and division op- the editor being used. Second, we must load the pack- erations — and a very large number of them. This age luacode with \usepackage{luacode}. Then, in- is difficult to program (for package developers) and side LuaLATEX, we can jump into Lua mode with the slow in execution. command \directlua; moreover, we can define Lua To address the need to do more complex func- routines in a \begin{luacode}... \end{luacode} tions within TEX, an extension of TEX called LuaTEX environment (also {luacode*} instead of {luacode}


can be used); the precise syntax can be found in the manual "The luacode package" (by Manuel Pégourié-Gonnard) [10]. In the examples, we do not explain all the details of the code; they are left to the reader's intuition.

In this paper we present four examples. The first is very simple: the computation of a trigonometric table. In the other examples we use the LaTeX packages tikz and pgfplots to show Lua's ability to produce graphical output. Some mathematical skill may be necessary to fully understand the examples, but the reader can nevertheless see how Lua is able to manage the computation-intensive job. In any case, we do not explore the more complex possibilities, which involve writing Lua programs that load existing Lua modules or libraries to perform a wide range of functions and specialized tasks.

1 First example: a trigonometric table

To show how to use Lua, let us begin with a simple but complete example. Observe the following document, which embeds some Lua source code. Typesetting it with LuaLaTeX, we get the trigonometric table shown in Figure 1.

  \documentclass{article}
  \usepackage{luacode}

  \begin{luacode*}
  function trigtable ()
    for t=0, 45, 3 do
      x=math.rad(t)
      tex.print(string.format(
        '%2d$^{\\circ}$ & %1.5f & %1.5f & %1.5f '
        .. '& %1.5f \\\\',
        t, x, math.sin(x), math.cos(x),
        math.tan(x)))
    end
  end
  \end{luacode*}
  \newcommand{\trigtable}
    {\luadirect{trigtable()}}

  \begin{document}
  \begin{tabular}{rcccc}
  \hline
   & $x$ & $\sin(x)$ & $\cos(x)$ & $\tan(x)$ \\
  \hline
  \trigtable
  \hline
  \end{tabular}
  \end{document}

Figure 1: A trigonometric table. [It lists, for angles from 0° to 45° in steps of 3°, the angle x in radians and sin(x), cos(x), tan(x).]

The luacode* environment contains a small Lua program with a function named trigtable (with no arguments). This function consists of a loop with a variable t representing degrees. Lua converts t to radians with x=math.rad(t); then, Lua computes the sine, the cosine and the tangent. Inside Lua mode, it "exports" to LaTeX with tex.print; note that we escape any backslash by doubling it. Moreover, we have taken into account the following notation to give format to numbers:

  • %2d indicates that an integer number must be displayed with 2 digits.
  • %1.5f indicates that a floating point number must be displayed with 1 digit before the decimal point and 5 digits after it.

The LaTeX part has the skeleton of a tabular built with the data exported by Lua.

2 Second example: Gibbs phenomenon

Now and in what follows, we will use graphics to show the output of some mathematical routines. A very convenient way to do it is by means of the PGF/TikZ package (TikZ is a high-level interface to PGF) by Till Tantau (the huge manual [12] of the current version 2.10 has more than 700 pages of documentation and examples); two short introductory papers are [8, 9]. Based on PGF/TikZ, the package pgfplots (by Christian Feuersänger [1]) has additional facilities to plot mathematical functions like y = f(x) (or a parametric function x = f(t), y = g(t)) or visualize data in two or three dimensions. For instance, pgfplots can draw the axis automatically, as usual in any graphic software.

For completeness, let us start showing the syntax of pgfplots by means of a data plot; this is an example extracted from its very complete manual (more than 400 pages in the present version 1.7). After loading \usepackage{pgfplots}, the code

  \begin{tikzpicture}
  \begin{axis}[xlabel=Cost, ylabel=Error]


  \addplot[color=red,mark=x] coordinates {
    (2,-2.8559703) (3,-3.5301677) (4,-4.3050655)
    (5,-5.1413136) (6,-6.0322865) (7,-6.9675052)
    (8,-7.9377747)
  };
  \end{axis}
  \end{tikzpicture}

generates the plot in Figure 2. Before going on, note that in future versions the packages PGF/TikZ and pgfplots could, internally, use LuaLaTeX themselves in a way transparent to the user. This would allow extra power, calculating speed, and simplicity, but this is not yet available and we will not worry about it in this paper.

Figure 2: Plotting of a data table with pgfplots. [The plot shows Error against Cost for the coordinates above.]

In the next example we consider the Gibbs phenomenon. Using LuaLaTeX, the idea is to compute a data table with Lua (easy to program, powerful and fast in the execution), and plot it with pgfplots.

The Gibbs phenomenon is the peculiar way in which the Fourier series of a piecewise continuously differentiable periodic function behaves at a jump discontinuity, where the n-th partial sum of the Fourier series has large oscillations near the jump. It is explained in many harmonic analysis texts, but for the purpose of this paper the reader can refer to [13].

In our case we consider the function f(x) = (π − x)/2 in the interval (0, 2π) extended by periodicity to the whole real line (it has discontinuity jumps at 2jπ for every integer j). Its Fourier series is

  f(x) = ∑_{k=1}^{∞} sin(kx)/k.

To show the Gibbs phenomenon, we evaluate the partial sum ∑_{k=1}^{n} sin(kx)/k (for n = 30) with Lua to generate a table of data, and we plot it with pgfplots.

In the .tex file, we include the following Lua to compute the partial sum (function partial_sum) and to export the data with the syntax required by pgfplots (function print_partial_sum):

  \begin{luacode*}
  -- Fourier series
  function partial_sum(n,x)
    partial = 0;
    for k = 1, n, 1 do
      partial = partial + math.sin(k*x)/k
    end;
    return partial
  end

  -- Code to write PGFplots data as coordinates
  function print_partial_sum(n,xMin,xMax,npoints,
                             option)
    local delta = (xMax-xMin)/(npoints-1)
    local x = xMin
    if option~=[[]] then
      tex.sprint("\\addplot[" .. option
                 .. "] coordinates{")
    else
      tex.sprint("\\addplot coordinates{")
    end
    for i=1, npoints do
      y = partial_sum(n,x)
      tex.sprint("("..x..","..y..")")
      x = x+delta
    end
    tex.sprint("}")
  end
  \end{luacode*}

Then, we also define the command

  \newcommand\addLUADEDplot[5][]{%
    \directlua{print_partial_sum(#2,#3,#4,#5,
               [[#1]])}%
  }

which will be used to call the data from pgfplots. Here, the parameters have the following meaning: #2 indicates the number of terms to be added (n = 30 in our case); the plot will be done in the interval [#3, #4] (from x = 0 to 10π) sampled with #5 points (to get a very smooth graphic and to show the power of the method we use 1000 points); finally, the optional argument #1 is used to manage optional arguments in the \addplot environment (for instance color [grayscaled for TUGboat], width of the line, ...).

Now, the plot is generated by

  \pgfplotsset{width=.9\hsize}
  \begin{tikzpicture}\small
  \begin{axis}[
    xmin=-0.2, xmax=31.6,
    ymin=-1.85, ymax=1.85,
    xtick={0,5,10,15,20,25,30},
    ytick={-1.5,-1.0,-0.5,0.5,1.0,1.5},
    minor x tick num=4,
    minor y tick num=4,


    axis lines=middle,
    axis line style={-}
    ]
  % SYNTAX: Partial sum 30, from x = 0 to 10*pi,
  % sampled in 1000 points.
  \addLUADEDplot[color=blue,smooth]{30}
    {0}{10*math.pi}{1000};
  \end{axis}
  \end{tikzpicture}

See the output in Figure 3.

Figure 3: The partial sum ∑_{k=1}^{30} sin(kx)/k of the Fourier series of f(x) = (π − x)/2 illustrating the Gibbs phenomenon.

3 Third example: Runge-Kutta method

A differential equation is an equation that links an unknown function and its derivatives, and these equations play a prominent role in engineering, physics, economics, and other disciplines. When the value of the function at an initial point is fixed, a differential equation is known as an initial value problem. The mathematical theory of differential equations shows that, under very general conditions, an initial value problem has a unique solution. Usually, it is not possible to find the exact solution in an explicit form, and it is necessary to approximate it by means of numerical methods.

One of the most popular methods to integrate numerically an initial value problem

  y'(t) = f(t, y(t)),    y(t_0) = y_0

is the classical Runge-Kutta method of order 4. With it, we compute in an approximate way the values y_i ≃ y(t_i) at a set of points {t_i}, starting from i = 0 and with t_{i+1} = t_i + h for every i, by the algorithm

  y_{i+1} = y_i + (1/6)(k_1 + 2k_2 + 2k_3 + k_4),

where

  k_1 = h f(t_i, y_i),
  k_2 = h f(t_i + h/2, y_i + k_1/2),
  k_3 = h f(t_i + h/2, y_i + k_2/2),
  k_4 = h f(t_i + h, y_i + k_3).

See, for instance, [15].

Here, we consider the initial value problem

  y'(t) = y(t) cos(t + √(1 + y(t))),    y(0) = 1.

In the Lua part of our .tex file, we compute the values {(t_i, y_i)} and export them with pgfplots syntax by means of

  \begin{luacode*}
  -- Differential equation y'(t) = f(t,y)
  -- with f(t,y) = y * cos(t+sqrt(1+y)).
  -- Initial condition: y(0) = 1
  function f(t,y)
    return y * math.cos(t+math.sqrt(1+y))
  end

  -- Code to write PGFplots data as coordinates
  function print_RKfour(tMax,npoints,option)
    local t0 = 0.0
    local y0 = 1.0
    local h = (tMax-t0)/(npoints-1)
    local t = t0
    local y = y0
    if option~=[[]] then
      tex.sprint("\\addplot[" .. option
                 .. "] coordinates{")
    else
      tex.sprint("\\addplot coordinates{")
    end
    tex.sprint("("..t0..","..y0..")")
    for i=1, npoints do
      k1 = h * f(t,y)
      k2 = h * f(t+h/2,y+k1/2)
      k3 = h * f(t+h/2,y+k2/2)
      k4 = h * f(t+h,y+k3)
      y = y + (k1+2*k2+2*k3+k4)/6
      t = t + h
      tex.sprint("("..t..","..y..")")
    end
    tex.sprint("}")
  end
  \end{luacode*}

Also, we define the command

  \newcommand\addLUADEDplot[3][]{%
    \directlua{print_RKfour(#2,#3,[[#1]])}%
  }

to call the Lua routine from the LaTeX part (the parameter #2 indicates the final value of t, and #3 is the number of sampled points).
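For readers who have not used the LuaTeX–Lua bridge before, the following is a minimal, self-contained sketch of the \directlua mechanism on which commands such as \addLUADEDplot rely; it is not part of the original article and only assumes compilation with LuaLaTeX.

  \documentclass{article}
  \begin{document}
  % \directlua is a LuaTeX primitive, so no extra package is needed
  % for this one-liner; the luacode package used in the article is
  % only required for the \begin{luacode*} blocks.
  The square root of 2 is approximately
  \directlua{tex.print(tostring(math.sqrt(2)))}.
  \end{document}

Compiled with lualatex, the \directlua call runs the Lua chunk and tex.print hands the result back to TeX, which typesets it as ordinary text; the larger examples in this article do exactly this, only with more elaborate Lua code.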



Figure 4: Solution of the differential equation y'(t) = y(t) cos(t + √(1 + y(t))) with initial condition y(0) = 1.

Then, the graphic of Figure 4, which shows the solution of our initial value problem, is created by means of

  \pgfplotsset{width=0.9\hsize}
  \begin{tikzpicture}
  \begin{axis}[xmin=-0.5, xmax=30.5,
    ymin=-0.02, ymax=1.03,
    xtick={0,5,...,30}, ytick={0,0.2,...,1.0},
    enlarge x limits=true,
    minor x tick num=4, minor y tick num=4,
    axis lines=middle, axis line style={-}
    ]
  % SYNTAX: Solution of the initial value problem
  % in the interval [0,30] sampled at 200 points
  \addLUADEDplot[color=blue,smooth]{30}{200};
  \end{axis}
  \end{tikzpicture}

4 Fourth example: Lorenz attractor

The Lorenz attractor is a strange attractor that arises in a system of equations describing the 2-dimensional flow of a fluid of uniform depth, with an imposed vertical temperature difference. In the early 1960s, Lorenz [6] discovered the chaotic behavior of a simplified 3-dimensional system of this problem, now known as the Lorenz equations:

  x'(t) = σ(y(t) − x(t)),
  y'(t) = −x(t)z(t) + ρx(t) − y(t),
  z'(t) = x(t)y(t) − βz(t).

The parameters σ, ρ, and β are usually assumed to be positive. Lorenz used the values σ = 10, ρ = 28 and β = 8/3. The system exhibits a chaotic behavior for these values; in fact, it became the first example of a chaotic system. A more complete description can be found in [14].

Figure 5 shows the numerical solution of the Lorenz equations calculated with σ = 3, ρ = 26.5 and β = 1. Six orbits starting at several initial points close to (0, 1, 0) are plotted in different colors; all of them converge to the 3-dimensional chaotic attractor known as the Lorenz attractor.

Figure 5: The Lorenz attractor (six orbits starting at several initial points).

The Lua part of the program uses a discretization of the Lorenz equations (technically, this is the explicit Euler method y_{i+1} = y_i + h f(t_i, y_i), which is less precise than the Runge-Kutta method of the previous section, but enough to find the attractor):

  \begin{luacode*}
  -- Differential equation of Lorenz attractor
  function f(x,y,z)
    local sigma = 3
    local rho = 26.5
    local beta = 1
    return {sigma*(y-x),
            -x*z + rho*x - y,
            x*y - beta*z}
  end

  -- Code to write PGFplots data as coordinates
  function print_LorAttrWithEulerMethod(h,npoints,
                                        option)
    -- The initial point (x0,y0,z0)
    local x0 = 0.0
    local y0 = 1.0
    local z0 = 0.0
    -- add random number between -0.25 and 0.25
    local x = x0 + (math.random()-0.5)/2
    local y = y0 + (math.random()-0.5)/2
    local z = z0 + (math.random()-0.5)/2
    if option ~= [[]] then
      tex.sprint("\\addplot3[" .. option
                 .. "] coordinates{")
    else
      tex.sprint("\\addplot3 coordinates{")
    end
    -- dismiss the first 100 points
    -- to go into the attractor
    for i=1, 100 do


      m = f(x,y,z)
      x = x + h * m[1]
      y = y + h * m[2]
      z = z + h * m[3]
    end
    for i=1, npoints do
      m = f(x,y,z)
      x = x + h * m[1]
      y = y + h * m[2]
      z = z + h * m[3]
      tex.sprint("("..x..","..y..","..z..")")
    end
    tex.sprint("}")
  end
  \end{luacode*}

The function which calls the Lua part from the LaTeX part is

  \newcommand\addLUADEDplot[3][]{%
    \directlua{print_LorAttrWithEulerMethod
      (#2,#3,[[#1]])}%
  }

Here, the parameter #2 gives the step of the discretization, and #3 is the number of points.

The LaTeX part is the following. In it, we call the Lua function six times with different colors:

  \pgfplotsset{width=.9\hsize}
  \begin{tikzpicture}
  \begin{axis}
  % SYNTAX: Solution of the Lorenz system
  % with step h=0.02 sampled at 1000 points.
  \addLUADEDplot[color=red,smooth]{0.02}{1000};
  \addLUADEDplot[color=green,smooth]{0.02}{1000};
  \addLUADEDplot[color=blue,smooth]{0.02}{1000};
  \addLUADEDplot[color=cyan,smooth]{0.02}{1000};
  \addLUADEDplot[color=magenta,smooth]{0.02}{1000};
  \addLUADEDplot[color=yellow,smooth]{0.02}{1000};
  \end{axis}
  \end{tikzpicture}

References

[1] C. Feuersänger, Manual for Package PGFPLOTS. http://mirror.ctan.org/graphics/pgf/contrib/pgfplots/doc/pgfplots.pdf
[2] H. Hagen, LuaTeX: Halfway to version 1, TUGboat, Volume 30 (2009), No. 2, 183–186. http://tug.org/TUGboat/tb30-2/tb95hagen-luatex.pdf
[3] T. Hoekwater and H. Henkel, LuaTeX 0.60: An overview of changes, TUGboat, Volume 31 (2010), No. 2, 174–177. http://tug.org/TUGboat/tb31-2/tb98hoekwater.pdf
[4] P. Isambert, Three things you can do with LuaTeX that would be extremely painful otherwise, TUGboat, Volume 31 (2010), No. 3, 184–190. http://tug.org/TUGboat/tb31-3/tb99isambert.pdf
[5] P. Isambert, OpenType fonts in LuaTeX, TUGboat, Volume 33 (2012), No. 1, 59–85. http://tug.org/TUGboat/tb33-1/tb103isambert.pdf
[6] E. N. Lorenz, Deterministic Nonperiodic Flow, J. Atmospheric Sci., Volume 20 (1963), No. 2, 130–141. http://dx.doi.org/10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2
[7] A. Mahajan, LuaTeX: A user's perspective, TUGboat, Volume 30 (2009), No. 2, 247–251. http://tug.org/TUGboat/tb30-2/tb95mahajan-luatex.pdf
[8] A. Mertz and W. Slough, Graphics with PGF and TikZ, TUGboat, Volume 28 (2007), No. 1, 91–99. http://tug.org/TUGboat/tb28-1/tb88mertz.pdf
[9] A. Mertz and W. Slough, A TikZ tutorial: Generating graphics in the spirit of TeX, TUGboat, Volume 30 (2009), No. 2, 214–226. http://tug.org/TUGboat/tb30-2/tb95mertz.pdf
[10] M. Pégourié-Gonnard, The luacode package. http://mirror.ctan.org/macros/luatex/latex/luacode/luacode.pdf
[11] A. Reutenauer, LuaTeX for the LaTeX user: An introduction, TUGboat, Volume 30 (2009), No. 2, 169. http://tug.org/TUGboat/tb30-2/tb95reutenauer.pdf
[12] T. Tantau, TikZ & PGF. http://mirror.ctan.org/graphics/pgf/base/doc/generic/pgf/pgfmanual.pdf
[13] Wikipedia, Gibbs phenomenon. http://en.wikipedia.org/wiki/Gibbs_phenomenon
[14] Wikipedia, Lorenz system. http://en.wikipedia.org/wiki/Lorenz_system
[15] Wikipedia, Runge-Kutta methods. http://en.wikipedia.org/wiki/Runge-Kutta_methods

⋄ Juan I. Montijano, Mario Pérez, Luis Rández and Juan Luis Varona
  Universidad de Zaragoza (Zaragoza, Spain) and Universidad de La Rioja (Logroño, Spain)
  {monti,mperez,randez} (at) unizar dot es and jvarona (at) unirioja dot es


Parsing PDF content streams with LuaTEX ing that new item have to be updated because their placement is off. Taco Hoekwater The full update process is quite complicated; the Abstract issue this paper deals with is that the server software needs to know what words are on any PDF page, as The new pdfparser library in LuaTEX allows parsing of external PDF content streams directly from within well as their location on that page, and therefore its text extraction process has to agree with the same a LuaTEX document. This paper explains its origin and usage. process on the iPad. 3 PDF text extraction 1 Background Text extraction is a two-step process. The actual Docwolves’ main product is an infrastructure to fa- drawing of a PDF page is handled by PostScript-style cilitate paperless meetings. One part of the func- postfix operators. These are contained in objects tionality is handling meeting documents, and to do that are called page content streams. so it offers the meeting participants a method to After decompression of the PDF, the beginning distribute, share, and comment on such documents of a content stream might look like this: by means of an intranet application, as well as an 59 0 obj iPad app. << /Length 4013 >> Meeting documents typically consist of a meet- stream ing agenda, followed by included appendices, com- 0 g 0 G bined into a single PDF file. Such documents can 1 g 1 G have various revisions, for example if a change has q been made to the agenda or if an appendix has to 0 0 597.7584 448.3188 re f be added or removed. After such a change, a newly Q combined PDF document is re-distributed. 0 g 0 G Annotations can be made on these documents 1 0 0 1 54.7979 44.8344 cm and these can then be shared with other meeting ... participants, or just communicated to the server for Here g, G, q, re, f, Q, and cm are all (postfix) safekeeping. Like documents, annotations can be operators, and the numeric values are all arguments. updated as well. As you see, not all operators take the same number All annotations are made on the iPad, with an of arguments (g takes one, q zero, and re four). (implied) author and an intended audience. Anno- Other operators may take, for instance, string-valued tations apply to a specific part of the source text, arguments instead of numeric ones. There are a bit and come in a few types (highlight, sticky note, free- more than half a dozen different types. hand drawing). The iPad app communicates with To process such a stream easily, it is best to a network server to synchronize shared annotations separate the task (at least conceptually) into two between meeting participants. separate tasks. First there is a lexing stage, which entails converting the raw bytes into combinations 2 The annotation update problem of values and types (tokens) that can be acted upon. The server–client protocol aims to be as efficient as Separate from that, there is the interpretation possible, especially in the case of communication stage, where the operators are actually executed with with the iPad app, since bandwidth and connection the tokenized arguments that have preceded it. costs can be an issue. 
3.1 PDF text extraction on the iPad This means that for any annotation on a refer- enced document, only the document’s internal identi- It is very easy on an iPad to display a representation PDF fication, the (PDF) page number, and the beginning of a page, and Apple also provides a convenient PDF and end word indices on the page are communicated interface to do the lexing of content streams back and forth. This is quite efficient, but gives rise that is the first step in getting the text from the page. PDF to the following problem: But to find out where the objects are, one has to interpret the PDF document stream oneself, and that When a document changes, e.g. if an extra is the harder part of the text extraction operation. meeting item is added, all annotations follow- 3.2 PDF text extraction on the server Editor’s note: Originally published in ConTEXt Group: Proceedings, 6th meeting, pp.19–23. Reprinted with On the server side, there is a similar problem at a dif- permission. ferent stage: displaying a PDF is easy, and even literal


text extraction is easy (with tools like pdftotext). However, that does not give you the location of the text on the page. On the server, Apple's lexing interface is not available, and the available PDF library (libpoppler) does not offer similar functionality.

4 Our solution

We needed to write text extraction software that can be used on both platforms, to ensure that the same releases of server and iPad software always agreed perfectly on the what and where of the text on the page.

Both platforms use a stream interpreter written by ourselves in C, with the iPad software starting from the Apple lexer, and the server software starting from a new lexer written from scratch.

The prototype and first version of the newly created stream interpreter as well as the server-side lexer were written in Lua. LuaTeX's epdf library — libpoppler bindings for Lua, see below — were a very handy tool at that stage. The code was later converted back to C for compilation into a server-side helper application as well as the iPad App, but originally it was written as a texlua script.

A side effect of this development process is that the lexer could be offered as a new LuaTeX extension, and so that is exactly what we have done.

5 About the 'epdf' library

This library is written by Hartmut Henkel, and it provides Lua access to the poppler library included in LuaTeX. For instance, it is used by ConTeXt to preserve links in external PDF figures.

The library is fairly extensive, but a bit low-level, because it closely mimics the libpoppler interface. It is fully documented in the LuaTeX reference manual, but here is a small example that extracts the page cropbox information from a PDF document:

  local function run (filename)
    local doc = epdf.open(filename)
    local cat = doc:getCatalog()
    local numpages = doc:getNumPages()
    local pagenum = 1
    print ('Pages: ' .. numpages)
    while pagenum <= numpages do
      local page = cat:getPage(pagenum)
      local cbox = page:getCropBox()
      print (string.format(
        'Page %d: [%g %g %g %g]',
        pagenum, cbox.x1, cbox.y1,
        cbox.x2, cbox.y2))
      pagenum = pagenum + 1
    end
  end
  run(arg[1])

6 Lexing via poppler

As said above, a lexer converts bytes in the input stream into tokens, and these tokens have types and values. libpoppler provides a way to get one from a stream using the getChar() method, and it also applies any stream filters beforehand, but it does not create such tokens.

6.1 Poppler limitations

There is no way to get the full text of a stream as a whole; it has to be read byte by byte.

Also, if the page content consists of an array of content streams instead of a single entry, the separate content streams have to be manually concatenated.

And content streams have to be 'reset' before the first use.

Here is some example code for reading a stream, using the epdf library:

  function parsestream(stream)
    local self = { streams = {} }
    local thetype = type(stream)
    if thetype == 'userdata' then
      self.stream = stream:getStream()
    elseif thetype == 'table' then
      for i,v in ipairs(stream) do
        self.streams[i] = v:getStream()
      end
      self.stream = table.remove(
                      self.streams,1)
    end
    self.stream:reset()
    local byte = getChar(self)
    while byte >= 0 do
      ...
      byte = getChar(self)
    end
    if self.stream then
      self.stream:close()
    end
  end

In the code above, any interesting things you want to do are inserted at the ... spot. The example makes use of one helper function, getChar, which looks like this:

  local function getChar(self)
    local i = self.stream:getChar()
    if (i<0) and (#self.streams>0) then
      self.stream:close()
      self.stream = table.remove(
                      self.streams, 1)
      self.stream:reset()
      i = getChar(self)
    end
    return i
  end


7 Our own lexer: ‘pdfscanner’ In its simplest form, the creation of this page The new lexer we wrote does create tokens. Its Lua state looks like this: interface accepts either a poppler stream, or an array function createParserState() of such streams. It puts PDF operands on an internal local stack = {} stack and then executes user-selected operators. stack[1] = {} stack[1].ctm = The library pdfscanner has only one function, AffineTransformIdentity() scan(). Usage looks like this: return stack require ’pdfscanner’ end function scanPage(page) Internally, pdfscanner.scan() loops over the local stream = page:getContents() input stream content bytes, creating tokens and col- local ops = createOperatorTable() lecting operands on an internal stack until it finds local info = createParserState() PDF if stream then a operator. If that operator’s name exists in if stream:isStream() the given operator table, then the associated Lua or stream:isArray() then function is executed. After that function has run (or pdfscanner.scan(stream, ops, info) when there is no function to execute) the internal end operand stack is cleared in preparation for the next end operator, and processing continues. end The processing functions are called with two The above functions createOperatorTable() arguments: the scanner object itself, and the info and createParserState() are helper functions that table that was passed as the third argument to create arguments of the proper types. pdfscanner.scan. The scanner argument to the processing func- 7.1 The scan() function tions is needed because it offers various methods to get the actual operands from the internal operand As you can see, the scan() function takes three stack. arguments, which we describe here. The first argument should be either a PDF 7.2 Extracting tokens from the scanner stream object or a PDF array of PDF stream objects The lowest-level function available in scanner is (those options comprise the possible return values of scanner:pop() which pops the top operand of the :getContents() and :getStream() internal stack, and returns a Lua table where the in the epdf library). object at index one is a string representing the type The second argument should be a Lua table of the operand, and object two is its value. where the keys are PDF operator name strings and The list of possible operand types and associated the values are Lua functions (defined by you) that Lua value types is: are used to process those operators. The functions are called whenever the scanner finds one of these integer hnumberi PDF operators in the content stream(s). real hnumberi Here is a possible definition of the helper func- boolean hbooleani tion createOperatorTable(): name hstringi operator hstringi function createOperatorTable() string hstringi local ops = {} array htablei -- handlecm is defined below dict table ops[’cm’] = handlecm h i return ops In the cases of integer or real, the value is end always a Lua (floating point) number. In the case of name, the leading slash is always The third argument is a Lua variable that is stripped. passed on to provide context for the processing func- In the case of string, please bear in mind that tions. This is needed to keep track of the state of PDF supports different types of strings (with different the PDF page since PDF operators, and especially 1 encodings) in different parts of the PDF document, those that change the graphics state, can be nested. 
so you may need to reencode some of the results; 1 pdfscanner always outputs the byte stream without In Lua this could have been handled by upvalues or global variables. This third argument was a concession made reencoding anything. pdfscanner does not differen- to the planned conversion to C. tiate between literal strings and strings


(the hexadecimal values are decoded), and it treats Finally, there is also the scanner:done() func- the stream data for inline images as a string that is tion which allows you to quit the processing of a the single operand for EI. stream once you have learned everything you want In the case of array, the table content is a list to learn. Specifically, this comes in handy while of pop return values. parsing /ToUnicode, because there usually is trail- In the case of dict, the table keys are PDF ing garbage that you are not interested in. Without name strings and the values are pop return values. done, processing only ends at the end of the stream, While parsing a PDF document that is known wasting CPU cycles. to be valid, one usually knows beforehand what the types of the arguments will be. For that reason, 8 Summary there are a few more scanner methods defined: The new pdfparser library in LuaTEX allows parsing • popNumber() takes a number object off of the of external PDF content streams directly from within operand stack. a LuaTEX document. While this paper explained its • popString() takes a string object off . . . usage, the formal documentation of the new library is the LuaT X reference manual. Happy LuaT X-ing! • popName() takes a name object off . . . E E • popArray() takes an array object off . . . ⋄ Taco Hoekwater • popDict() takes a dictionary object off . . . Docwolves B.V. • popBool() takes a boolean object off . . . http://luatex.org A simple operator function could therefore look like this. handlecm was used in an example above; the PDF cm operator “concatenates” onto the current transformation matrix. (The Affine... functions used here are left as an exercise to the reader). function handlecm (scanner, info) local ty = scanner:popNumber() local tx = scanner:popNumber() local d = scanner:popNumber() local c = scanner:popNumber() local b = scanner:popNumber() local a = scanner:popNumber() local t = AffineTransformMake(a,b,c,d,tx,ty) local stack = info.stack local state = stack[#stack] state.ctm = AffineTransformConcat(state.ctm,t) end


ModernDvi: A high quality rendering and modern DVI viewer

Antoine Bossard and Takeyuki Nagao

Abstract

TEX users have long relied on the device independent file format (DVI) to preview their documents while editing. However, innovation has been scarce in this area, and users have to rely on years-old, or even decades-old software, facing increasing compatibility issues with modern systems. In this paper, we describe ModernDvi, a new DVI viewer Windows Store application, offering high quality and fast, wait-free rendering, outperforming existing solutions in these areas. Additionally, ModernDvi has been built around today’s usability standards and expectations: tablets, touch-friendliness and high-resolution output are examples of addressed issues.

1 Introduction: The DVI file format

DVI is a file format, namely the “DeVice Independent” file format. DVI documents are typically produced by the TEX and LATEX typesetting programmes. TEX and its high-level abstraction LATEX, which is written in the TEX macro language, were introduced by Donald E. Knuth [7] and Leslie Lamport [10], respectively, as solutions for producing high-quality documents dealing with mathematics and science in general, and especially their complex formulæ and notations. Hàn Thế Thành later created an extension of TEX called pdfTEX [5] enabling direct output of the portable document format (PDF) in addition to the traditional DVI format. There are thus two different kinds of output by TEX and LATEX, viz. DVI and PDF. Although the PDF format has become more popular, some people still need and depend on the classical DVI format. In fact, an experiment with a total of 5814 LATEX documents collected from the preprints of arXiv [6] in the single month of January 2012 shows that 4168 items (approx. 72%) can be compiled by both LATEX and pdfLATEX, and that 540 items (approx. 9%) work with LATEX but not with pdfLATEX [13].

The DVI format is minimalistic. Roughly speaking, it is a binary format consisting of commands to (i) define or select a font to utilize, (ii) draw a single character or a filled rectangle at the current reference point, (iii) manipulate internal integer registers (including the current reference point and the identifier of the current font), (iv) include binary data (called DVI specials) for various purposes, and (v) mark the beginning and ending of pages/documents.

The simplicity of the DVI format facilitates creation of tools and applications to help authors prepare manuscripts and post-process existing articles. Such tools include DVI viewers which render and display the contents of a DVI file on screen, and also converters to various formats including PostScript (dvips), PDF (dvipdfm), bitmap images (dvipng), etc. It is much harder to create such tools for the PDF format, since that format is more complicated and it is thus more difficult to parse and analyse the encoded data.

A major drawback of the DVI format is that it requires external files and/or tools to completely render its contents in some use cases. For example, if an author includes a figure in his paper (e.g. using the Encapsulated PostScript format), then the generated DVI file contains a DVI special that consists of a code fragment in the PostScript language to include the specified image file. This means that the DVI file is not self-contained, and one needs a rasterizer of PostScript, e.g. Ghostscript, to render its contents.

Another common issue is the lack of the feature to embed fonts. A DVI file actually contains only the name (such as cmr10) and the size of the utilized fonts. There is no standardized way to embed raster or vector fonts in a DVI file. This difficulty can be overcome by using pdf(LA)TEX, which provides the features of including external PDF files and embedding TrueType and Type 1 fonts.

2 Previous works

A handful of DVI viewers exist; however, many are not updated any more: mdvi [2] and windvi [15] are examples. Well-known viewers for Unix-based platforms include xdvi [17]. Still usable alternatives are even harder to find on the MS Windows operating system. YAP [16] is the DVI viewer of choice on Windows as it is bundled with the MiKTEX distribution [16]. One can also cite dviout [14] as another DVI viewer on Windows.

A user looking for a viable solution to work with DVI files will face several issues. First, all existing viewers are now considered legacy software: they have been designed for decades-old operating systems and do not meet modern requirements regarding software, hardware, interface, and usability in general. The first problem a user may encounter is software compatibility: for example, as of the Mountain Lion release of Mac OS X, X11 is not included any more, which will thus hamper the installation and usage of the usual viewer on Unix, xdvi. Even more radical changes to an OS (e.g. at the device driver layer) may completely break compatibility and make a legacy viewer unusable. Additionally, classic workstations are now on their way out, and mobile devices, touch screens and other advanced interface mechanisms are in full swing.


[Figure 1 diagram: the classes App, MainPage (LoadFileAsync(): Task), DviFileView (zoom_factor: float, controls: Control[], status: int[], RenderAsync(): Task), DviFile (dvifile: StorageFile) and FileMonitor (Start(), Cancel()), and their relations within the Windows Runtime framework.]

Figure 1: Simplified UML class diagram.

An unadapted user interface such as that of a legacy viewer will thus severely harm usability, or even make it completely unusable.

Hardware evolution affects more than user interfaces. Processors have also seen important changes in their architectures. Now, every device uses a multi-core CPU, thus allowing faster or more calculations at the same time: this is parallel processing. Legacy DVI software has usually not been designed to utilise such increased computing power and will thus perform poorly compared to modern applications in general, such as word processors.

Finally, rendering quality of DVI viewers is usually not on par with modern displays and their high resolutions. Again, when they were introduced, these systems had to cope with much more restrictive hardware and software environments than we have now. Nowadays, users’ expectations regarding display quality are very high, and legacy viewers are lacking in this area.

We could continue enumerating shortcomings in legacy DVI viewers (e.g. font generation before rendering, slow page scrolling, etc.), but the point is clear. By proposing our new DVI viewer, named ModernDvi, we are first aiming at filling the gaps of legacy DVI viewers, gaps which have been increasing over time due to constant technology evolution. The introduction of original features as detailed in Section 4 is also an important part of our work.

3 ModernDvi: System overview

In this section, we give some insight into ModernDvi’s structure by first addressing software engineering considerations, then rendering techniques, and finally parallel processing.

3.1 Software engineering

ModernDvi is a Windows Store application [11]. It makes use of the Windows Runtime API and is thus compatible with x86 and ARM processor architectures. Windows Store applications can be deployed to any Windows 8 device, including PCs and tablets. Porting ModernDvi to the Windows Phone platform is also possible, but this remains work in progress.

This app has been developed using the C# programming language, and thus features an object-oriented architecture. Let us give an overview of the main objects defined in ModernDvi. Common to every Windows Store app, the entry point of the application is located inside the App class. The app itself is built around the MainPage class which defines the application view port, displaying controls and visual elements in general. Each DVI document is associated with an instance of the DviFile class which, amongst others, importantly stores the system handle to the DVI file, a critical app issue detailed further in Section 5. The DviFile class also holds a reference to a FileMonitor object which is in charge of monitoring file changes made to the DVI document. Then, in order to display the content of a DVI document (i.e. an instance of DviFile), the class DviFileView is instantiated. Such an object is in charge of rendering each page of the DVI document, and thus contains several view settings such as page dimension and zooming information. Changing parameters like the zoom factor will automatically create a new DviFileView object. Figure 1 shows a simplified UML class diagram of ModernDvi.

An important point is that DviFileView objects are completely isolated, especially from the view port objects of MainPage, so as to facilitate the multi-thread approach detailed in Section 3.3 below. The sole relation between these two classes is a reference to a DviFileView instance inside MainPage; this reference is used to add (i.e. display) the view inside the view port. The DVI document is loaded with the LoadFileAsync() method of MainPage, and rendered with the RenderAsync() method of DviFileView.
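As a rough illustration of this architecture, the classes of Figure 1 could be sketched as below. The skeleton follows the names given in the diagram, but the bodies and signatures are assumptions for illustration only, not ModernDvi’s actual source.

    // Hypothetical skeleton mirroring the classes of Figure 1.
    using System.Threading.Tasks;
    using Windows.Storage;

    sealed class FileMonitor
    {
        public void Start() { /* poll the document, see Section 4.2 */ }
        public void Cancel() { }
    }

    sealed class DviFile
    {
        public StorageFile dvifile;                  // system handle to the DVI file
        public FileMonitor Monitor = new FileMonitor();
    }

    sealed class DviFileView
    {
        float zoom_factor;                           // view settings
        int[] status;                                // per-page rendering state (Section 3.3)
        public Task RenderAsync() { return Task.FromResult(0); }
    }

    sealed class MainPage                            // the application view port
    {
        DviFileView view;                            // sole link to the current view
        public Task LoadFileAsync(StorageFile f) { return Task.FromResult(0); }
    }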


3.2 Rendering

The rendering of a DVI file is performed in two stages. In the first stage, the content of the DVI file is parsed and the result stored in an object, which contains a font table (i.e. a mapping from font numbers to font names) and the DVI commands for each page. All the required TEX font metrics and other files (such as PK fonts and virtual fonts) are loaded into memory for later reference. The bounding box of every page is computed at this point, in parallel using multiple threads.

In the second stage, the contents of each page are rendered lazily. More precisely, it is only at the moment when a page is about to appear in the view that the rendering of the page is undertaken. The engine prepares an off-screen buffer for the page, and rasterizes the glyphs onto this buffer according to the DVI commands corresponding to the page. The off-screen buffer is then converted to an image file (using the Portable Network Graphics format) and stored in memory. Compression is utilized here in order to save memory space.

3.3 Parallel processing

As recalled in Section 2, parallel processing is now a common feature of our modern devices, from PCs to smartphones via tablets. Modern applications are thus expected to make use of this increased computational power, and this is what we achieved with ModernDvi.

ModernDvi uses parallelisation for two distinct tasks. First, DVI rendering, as briefly detailed in Section 3.2, needs to compute the bounding box of each page of the DVI document. This is a time consuming task. So, by executing these calculations in parallel, we were able to significantly speed up this process. In practice, we relied on the Task.Run() method of the Task class in the System.Threading.Tasks namespace of the API, which queues its parameter to run on the thread pool managed by the framework.

Then, once pages have been prepared in memory, comes the display phase: bringing each of the (visible) pages inside the view port. This task is also time consuming since it involves I/O stream operations, UI element instantiation and display, and of course placement routines for correct positioning of the pages and their corresponding visual elements in the view port. So again, instead of performing these tasks sequentially, one after the other, we have devised a parallel solution for this lengthy process and observed significant time gains.

Because of the strongly asynchronous nature of C# for apps, user actions in the UI are often partly postponed after their start via the keyword combination async/await, and the program returns to the UI thread. The merit of this approach is that the UI is always responsive, that is non-blocking, lock-free and wait-free. However, this can also be problematic as it means, in our case, that a user can request a document load (i.e. view refresh) several times while most of these requests are still being processed, that is, not yet completed. To address this issue, ModernDvi tracks the rendering state of each page of the DVI document through a status[] array indexed on document pages, whose values are as follows:

0: page not displayed (and not pending); this is the default status of each page.
1: page pending; the program is preparing the document page.
2: page ready to be displayed; the program has finished preparing the page.
3: page displayed; the program has finished adding the page into the view.

Each view refresh task is thus accessing this array, which is a class member of DviFileView. So, we have to regulate the accesses of these asynchronous tasks to this array to avoid race conditions and concurrent access problems. This problem is solved by using the compare-and-swap (CAS) mechanism, which is implemented in C# by the CompareExchange method of the System.Threading.Interlocked class. Using this, we are able to atomically check and update the rendering status of a document page. The practical result is that the Interlocked functionality allows us to (1) avoid performing the rendering of one page several times, and (2) avoid displaying the same rendered page several times.

So, thanks to this approach, our DVI document rendering process is (1) performed asynchronously, thus always retaining the UI responsiveness with, for example, very smooth scrolling, and (2) handling multiple page preparations and displays in parallel, thanks to multiple threads. The number of threads is managed by the framework thread pool and is thus transparent to the developer. Obviously, the more cores in your device, the more threads can be running concurrently. An excerpt of the corresponding source code is given in Figure 2.
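To illustrate the same guard on the display side (a sketch only; member names and the helper are assumptions, not ModernDvi’s actual code), a view-refresh task can claim a prepared page by atomically moving its status from 2 (ready) to 3 (displayed) before adding it to the view port:

    // Sketch: display phase guarded by compare-and-swap, so a page prepared
    // by the rendering phase (status 2) is added to the view only once.
    // In a real app the UI work itself would be dispatched to the UI thread.
    using System.Threading;
    using System.Threading.Tasks;

    sealed class PageDisplayer
    {
        private readonly int[] status;               // per-page state 0..3, as above

        public PageDisplayer(int pageCount) { status = new int[pageCount]; }

        public Task DisplayVisiblePagesAsync(int firstPage, int visiblePages)
        {
            var tasks = new Task[visiblePages];
            for (int i = 0; i < visiblePages; i++)
            {
                int page = firstPage + i;
                tasks[i] = Task.Run(() =>
                {
                    // Claim the page only if it is ready (2) and not yet displayed.
                    if (Interlocked.CompareExchange(ref status[page], 3, 2) == 2)
                        AddPageToViewPort(page);     // hypothetical helper
                });
            }
            return Task.WhenAll(tasks);
        }

        private void AddPageToViewPort(int page) { /* add the page image to the view */ }
    }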


    IEnumerable tasks = Enumerable.Range(0, visible_pages).Select(i => {
        return Task.Run( () => {
            int current_page = first_page + i;
            int original_status = System.Threading.Interlocked
                .CompareExchange(ref status[current_page], 1, 0);
            if (original_status == 0) {
                pages[current_page] = DviFile.ctx
                    .CreateRenderablePage(dvifile.Document
                    .GetPage(current_page));
                status[current_page] = 2;
            }
        });
    });
    await Task.WhenAll(tasks);

Figure 2: Parallel execution with Task.Run().

Table 1: Most-used fonts in 2012 arXiv.org preprints.

    Rank  Font       Freq.      %      Cumul. %
     1    cmsy10       64 830    4.69     4.69
     2    cmr10        60 321    4.36     9.05
     3    cmmi10       52 705    3.81    12.87
     4    cmbx12       48 774    3.53    16.39
     5    ptmr8r       44 209    3.20    19.59
     6    cmex10       41 872    3.03    22.62
     7    cmr8         39 221    2.84    25.46
     8    cmbx10       36 725    2.66    28.12
     9    cmr6         33 900    2.45    30.57
    10    cmmi8        30 895    2.23    32.80
    11    (others)    928 916   67.20   100.00
          Total     1 382 368

4 Notable features

We present in this section several notable features of ModernDvi.

4.1 No font generation needed

Existing DVI viewers generate on-demand PK font files from, for instance, METAFONT source files [8, 9] or PostScript font files [1]. The main problem with this approach is the rendering time delay faced by the user upon the DVI document loading.

ModernDvi has adopted a different approach: PK font files are generated in advance and bundled inside the application so that no font generation phase is required at any time. Thus, we can achieve a significant speed-up of the initial loading phase compared to legacy DVI viewers.

Of course, to maximise usability we need to include with ModernDvi the fonts needed by users. Many hundreds of fonts are available for TEX usage; looking at the CTAN font area [4] gives a good overview of the situation. So, we conducted an experiment to measure the popularity of these fonts, and thus obtained a list of the most-used fonts. Specifically, we collected preprints published on arXiv [6] for the year of 2012 and analysed the resulting 48 772 samples of LATEX source files to see which fonts they used, i.e. which font files are needed for their rendering. Table 1 shows the results with, not surprisingly, Computer Modern leading the list.

We observed that about 1,000 fonts sufficed to render all the gathered DVI files, which we took as representative of most documents, given the broad area covered by arXiv. We thus gathered the corresponding METAFONT source files and generated PK files for each font using the mktexpk utility [3]. Particular care has been taken to avoid any licensing violations for the fonts bundled in ModernDvi, with problematic fonts replaced by freely available ones; for instance, Linotype Palatino has been replaced by URW Palladio. Out of the ≈ 1,000 fonts identified, 769 have been included in ModernDvi. Excluded fonts are either rarely used or for exotic languages.

4.2 File change monitoring

ModernDvi includes a file monitoring system which triggers a new document rendering (refresh) upon any file change. Concretely, if this feature is enabled, ModernDvi will check at a specific time interval whether changes have occurred to the loaded document. Such changes are detected as follows.

1. Initialise a LastModified object of type DateTimeOffset with the DateModified value of the BasicProperties instance returned by a first call to GetBasicPropertiesAsync on the current document.

2. Start the timer (DispatcherTimer object).

3. On timer tick, retrieve a new instance of BasicProperties via a call to GetBasicPropertiesAsync on the current document. Then compare the stored LastModified value with the DateModified value of the BasicProperties instance just retrieved. If LastModified is smaller (i.e. older) than DateModified, request a new rendering of the DVI document. Finally, set LastModified to DateModified.

Additional care needed to be taken regarding GetBasicPropertiesAsync. Even though it is not mentioned in the documentation [12], multiple calls to GetBasicPropertiesAsync on the same file will throw an exception. Only one call to GetBasicPropertiesAsync is allowed at a time: one has to wait for the previous call to return before making another call. And because this method is asynchronous (i.e. awaited, see Section 3.3), we have to enforce a guard to avoid doing so. This is again achieved by using the CAS mechanism, i.e. the CompareExchange method of the System.Threading.Interlocked class.

Lastly, we must note that this monitoring system does not work for DVI files that are contained inside an archive (see Section 4.5). This is due to the limitations imposed by the Windows Runtime API, limitations induced by security concerns. See Section 5 for additional details.
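As an illustration of these three steps, a polling monitor along these lines could look as follows. This is a minimal sketch assuming a plain WinRT StorageFile obtained earlier (e.g. from a file picker); the class and member names are illustrative and not ModernDvi’s actual ones.

    // Sketch of a polling file monitor built on DispatcherTimer and
    // GetBasicPropertiesAsync, with a CAS guard against overlapping calls.
    using System;
    using System.Threading;
    using Windows.Storage;
    using Windows.UI.Xaml;

    public sealed class PollingFileMonitor
    {
        private readonly StorageFile file;
        private readonly DispatcherTimer timer = new DispatcherTimer();
        private DateTimeOffset lastModified;
        private int querying;                        // 0 = idle, 1 = a query is in flight

        public event EventHandler FileChanged;

        public PollingFileMonitor(StorageFile file, TimeSpan interval)
        {
            this.file = file;
            timer.Interval = interval;
            timer.Tick += async (s, e) =>
            {
                // Guard: only one GetBasicPropertiesAsync call at a time.
                if (Interlocked.CompareExchange(ref querying, 1, 0) != 0)
                    return;
                try
                {
                    var props = await file.GetBasicPropertiesAsync();
                    if (lastModified < props.DateModified)
                    {
                        lastModified = props.DateModified;       // remember the new time stamp
                        if (FileChanged != null)
                            FileChanged(this, EventArgs.Empty);  // request a re-rendering
                    }
                }
                finally
                {
                    Interlocked.Exchange(ref querying, 0);
                }
            };
        }

        public async void Start()
        {
            var props = await file.GetBasicPropertiesAsync();
            lastModified = props.DateModified;       // step 1: initialise LastModified
            timer.Start();                           // step 2: start the timer
        }

        public void Cancel() { timer.Stop(); }
    }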


Figure 3: Horizontal display mode.

Figure 4: Vertical modes of (a) window fit, (b) page fit and (c) thumbnails zoom levels.

4.3 Vertical and horizontal display modes

It is a steady trend: screens are getting wider and wider. In order to take full advantage of such hardware configurations, we have implemented two different page flows in ModernDvi: vertical and horizontal display modes.

The vertical display mode is the classic top-down document flow that can be found in almost all viewers or WYSIWYG editors. The horizontal display mode is an original left-right document flow that proves comfortable when working on wide displays. Effectively, multiple pages can be displayed side-by-side on a wide screen at the same time, enabling a seamless reading experience. An illustration is given in Figure 3.

In addition to a global setting defining the default display mode of ModernDvi, the user interface contains an easily accessible “Switch flow” button that enables the user to quickly (instantly) switch the document display mode, without having to reload or render anything.

4.4 Automatic zooming

In the current iteration of ModernDvi, we have implemented three zoom levels: window fit, page fit and thumbnails. The user can choose between these modes to render the DVI document by automatically adapting to the screen resolution. Because the document rendering process is repeated when changing the zoom level, the rendering quality remains crisp at any time. An illustration of these three zoom levels is given in Figure 4.

To adapt to the user’s screen, our DviFileView class (see Figure 1) registers the Loaded event of the view port. When fired, this event signals the availability of the ActualWidth and ActualHeight properties of the view port, giving the user screen resolution in pixels (precisely the screen area allocated to the view port). Because of the asynchronous nature of applications and especially their user interface, failing to use the event model will most likely result in null values for these two properties. In practice, the UI thread may not have completed the interface setup work when reading these two properties.

4.5 Archive formats and file types

DVI documents are often distributed as archives. For instance, most documents in the arXiv preprint repository are stored as tarballs. To facilitate display of such documents, we implemented routines in charge of uncompressing and extracting archives. Files are stored in the temporary folder of the app. Table 2 summarises the file formats ModernDvi supports.

Additionally, so as to correctly handle these different file formats, we implemented an accurate file type detection routine that can recognise each of the supported file formats given a file, regardless of its extension. In practice, one cannot be sure that a file will have an extension, or that it is accurate. And this is without mentioning that there often exist many different extensions for a single file format. Thus, we analyse a file’s header data to determine its type.

Table 2: Supported file formats.

    Format (MIME type)                                  Usual extension
    application/x-dvi                                   .dvi
    application/x-tar                                   .tar
    application/gzip, application/x-gzip-compressed     .gz
    application/zip, application/x-zip-compressed       .zip
    application/x-compressed                            .tar.gz, .tgz
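The header-based detection mentioned above can be illustrated by the following sketch, which tests the standard “magic numbers” of these formats. The method name and enumeration are hypothetical, not ModernDvi’s actual API.

    // Sketch of magic-number file type detection for the formats of Table 2.
    using System.IO;

    public enum DocumentKind { Unknown, Dvi, GZip, Zip, Tar }

    public static class FileTypeSniffer
    {
        public static DocumentKind Detect(Stream s)
        {
            var header = new byte[262];
            int n = s.Read(header, 0, header.Length);

            if (n >= 2 && header[0] == 0xF7 && header[1] == 0x02)
                return DocumentKind.Dvi;     // DVI preamble: opcode 247, format 2
            if (n >= 2 && header[0] == 0x1F && header[1] == 0x8B)
                return DocumentKind.GZip;    // gzip magic bytes
            if (n >= 2 && header[0] == (byte)'P' && header[1] == (byte)'K')
                return DocumentKind.Zip;     // zip local file header "PK"
            if (n >= 262 && header[257] == (byte)'u' && header[258] == (byte)'s'
                && header[259] == (byte)'t' && header[260] == (byte)'a'
                && header[261] == (byte)'r')
                return DocumentKind.Tar;     // POSIX tar: "ustar" at offset 257
            return DocumentKind.Unknown;
        }
    }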


Figure 5: Running ModernDvi on a tablet: full touch and rotation support.

Figure 6: A file selection dialog: the only way to access a user’s files.

4.6 Modernity

One of the key aspects of ModernDvi is its recognition of new technologies, devices and interfaces. In recent years, touch screens have heavily changed our habits, and applications had to be significantly overhauled to adapt to such new interfaces. ModernDvi embraces this evolution by providing an innovative, touch-friendly user interface such that it can be used similarly with either a classic mouse-keyboard setup or a touch-enabled device such as a tablet. It is also worth mentioning that ModernDvi is by design compatible with multiple CPU architectures: x86, x64 and ARM. So, computers equipped with (at least) Windows 8 and devices running Windows RT, such as the Microsoft Surface, are all capable of natively running the application. Smartphones running the Windows Phone 8+ operating system can be expected to follow soon due to the common architecture with Windows 8 (NT kernel).

In addition to touch support, ModernDvi has full rotation support: no matter how the user holds the device, the application will update its layout so as to present correctly-oriented content. Figure 5 shows ModernDvi running on a tablet (emulation of an x86 tablet environment). By combining these two features (touch and rotation), the user can conveniently and naturally go through the document as if turning the pages of a book, by selecting the horizontal display mode and holding his tablet vertically; a simple finger sweep will then display the next page.

Finally, ModernDvi has docking support: the user can move ModernDvi onto the side of the screen to interact with another application. This is highly useful for the “Edit-and-preview” use case as detailed in Section 6.1.

5 “App” considerations

Developing a Windows Store application has many advantages compared to a classic, legacy program. First, its distribution, deployment and promotion are greatly facilitated since everything is handled by the system through the official Windows Store application (installation, removal, etc.). Also, installing software via the Windows Store is a security guarantee for the user: applications are reviewed before they are added to the Store, and importantly, they run inside a protected environment, ensuring a minimum footprint on the operating system as detailed below.

By design, Windows Store applications (the same is true for Apple Store applications) run in a restricted (sandboxed) environment for security reasons. Thanks to this feature, the user need not worry about system modifications by the app: they are simply not allowed. The application footprint on the system is thus minimal, unlike legacy programs.

One of the main implications of this design is that applications have no direct access to the file system, except for the application’s own data folder. To be precise, for an application to perform I/O operations on a file outside the application’s data folder, the user must first manually select the file through a file selection dialog control such as a FileOpenPicker instance (see Figure 6).

This limitation has a major impact on an application such as ModernDvi. For example, there is no possibility of accessing external files, such as images, which may be referred to in the DVI document. And there is no possibility of running an external program, such as Ghostscript, to delegate PostScript work. A Store application is required to be completely independent. It has thus been a significant challenge to produce a fully functional DVI viewer application.
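The FileOpenPicker-based access mentioned above can be sketched as follows; this is a minimal example of the standard WinRT picker, and the LoadFileAsync call on the last line is a hypothetical ModernDvi-style method, not necessarily its real signature.

    // Sketch: letting the user pick a document is the only way for a sandboxed
    // Store app to reach a file outside its own data folder.
    // using directives at the top of the file:
    using Windows.Storage;
    using Windows.Storage.Pickers;

    // ... inside an async method:
    var picker = new FileOpenPicker();
    picker.SuggestedStartLocation = PickerLocationId.DocumentsLibrary;
    picker.FileTypeFilter.Add(".dvi");
    picker.FileTypeFilter.Add(".gz");
    picker.FileTypeFilter.Add(".zip");
    picker.FileTypeFilter.Add(".tar");

    StorageFile file = await picker.PickSingleFileAsync();
    if (file != null)
    {
        // The returned StorageFile grants access to this one file only; the
        // containing folder (StorageFolder) remains inaccessible.
        await mainPage.LoadFileAsync(file);          // hypothetical call
    }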



Figure 7: Editing (left) and preview (right) of LATEX source via the Windows 8 screen splitting feature.

Figure 8: Reading a preprint downloaded from arXiv.org.

As an example, let us consider the file monitoring feature of ModernDvi (see Section 4.2). The Windows Runtime does have a folder monitoring capability, namely the ContentsChanged event of the StorageFolderQueryResult class, but because accessing a file object (StorageFile), say via a file picker, does not grant permission to access the containing folder (StorageFolder), this is unusable for us. So, we implemented a file monitor by using a timer object (DispatcherTimer) to check, on each clock tick, the value of the DateModified property of the BasicProperties class instance returned by a call to the GetBasicPropertiesAsync method on the file. Because of the high timer resolution, this file monitoring solution is fully satisfactory.

6 Use cases

This section presents two different use case examples for ModernDvi: edit-and-preview, and reading mode.

6.1 Edit-and-preview mode

The user is preparing an article to be submitted to a symposium, using the LATEX typesetting system. The user opens his LATEX source file in his favourite editor, say, TEXworks. A first compilation by latex is triggered by the user, generating a DVI file. The user double clicks this DVI file to start ModernDvi and load the DVI file into view. As the user wants to continue editing, s/he docks ModernDvi onto the side of the screen, and puts TEXworks on the other side. The screen is thus split into two areas as shown in Figure 7. The user continues to make changes in the LATEX source file and recompiles the document, still using latex. The DVI file generated by the previous run is overwritten by the new one. ModernDvi automatically detects that changes have been made to the DVI file and refreshes its view. The user thus has an instant preview of the changed version.

6.2 Reading mode

The user finds a preprint on the online repository arXiv.org, and presses the link to request the file, which is a .dvi.gz compressed DVI document. The browser silently decompresses this gzip compressed content and serves the DVI file to the user. The user’s system asks what should be done with the file: open, save, etc.; the user presses the “Open” button, which automatically loads the document into ModernDvi. See Figure 8. The user selects the “window fit” zoom mode for improved readability and can start going through the document.

7 Comparing rendering quality

In this section, we compare ModernDvi with other DVI viewers regarding rendering quality. It is difficult to compare usability, including speed, since ModernDvi is targeting a different platform and interface. For each of the comparisons, default settings were used.

7.1 vs. YAP

First, let us compare the rendering quality with that of YAP [16]. In both cases, the zoom level is set to match the screen width. We can see in Figure 9 that the rendering quality of ModernDvi is better than that of YAP.

7.2 vs. dviout

Then, we compare the rendering quality with that of dviout [14]. In both cases, the zoom level is set to match the screen width. Again, we can see in Figure 9 that the rendering quality of ModernDvi is better than that of dviout.



Figure 9: Rendering by (a) ModernDvi, (b) YAP (PostScript mode), (c) YAP (PK mode), (d) dviout and (e) Microsoft Reader (PDF).

7.3 vs. Microsoft Reader

Lastly, let us compare the rendering quality of ModernDvi with that of the PDF viewer included with Windows 8.1 (Microsoft Reader 6.3.9431.0), after converting the DVI to PDF format with the DVIPDFMx utility. In both cases, the zoom level is set to match the screen width. As illustrated by Figure 9, the rendering quality of ModernDvi is significantly better.

8 Conclusions

We have presented in this paper ModernDvi, a new DVI document viewer targeting modern platforms and interfaces, such as tablets. After describing the software architecture, design and features, we compared its rendering quality with that of other viewers; ModernDvi is leading on that point too. Windows Store applications are sandboxed, and thus severely restricted regarding file system access. We have circumvented these challenges to produce a fully functional DVI viewer application.

Future work includes refining our rendering technique by going down to sub-pixels, as well as improving the rendering speed by continuing the work on glyph caching. Lastly, user interface improvements, such as being able to handle multiple documents at once, are also planned.

A ModernDvi in the Windows Store

ModernDvi can be found in the Windows Store, or directly at http://apps.microsoft.com/windows/app/moderndvi/bfa836ff-4eb0-454b-ad9e-a2405197f23b.

References

[1] Adobe Systems Incorporated. Adobe Type 1 Font Format. Reading, Massachusetts: Addison-Wesley, 1990.
[2] Matias Atria. MDVI — A DVI previewer. http://mdvi.sourceforge.net/. Accessed August 2013.
[3] Edward Barrett. Porting TEX Live to OpenBSD. TUGboat, 29(2):303–304, 2008.
[4] CTAN: The Comprehensive TEX Archive Network. Available fonts. http://www.ctan.org/tex-archive/fonts. Accessed August 2013.
[5] Hàn Thế Thành. Micro-typographic extensions to the TEX typesetting system. PhD thesis, Masaryk University Brno, October 2000.
[6] Allyn Jackson. From preprints to e-prints: The rise of electronic preprint servers in mathematics. Notices of the American Mathematical Society, 49(1):23–32, 2002.
[7] Donald E. Knuth. The TEXbook. Reading, Massachusetts: Addison-Wesley, 1984.
[8] Donald E. Knuth. Metafont: The Program. Reading, Massachusetts: Addison-Wesley, 1986.
[9] Donald E. Knuth. The Metafontbook. Reading, Massachusetts: Addison-Wesley, 1986.
[10] Leslie Lamport. LATEX: A document preparation system: User’s guide and reference. Reading, Massachusetts: Addison-Wesley Professional, 1994.
[11] Microsoft. Windows Store. http://windows.microsoft.com/en-us/windows-8/apps. Accessed October 2013.
[12] MSDN. StorageFile.GetBasicPropertiesAsync. http://msdn.microsoft.com/en-us/library/windows/apps/windows.storage.storagefile.getbasicpropertiesasync.aspx. Accessed August 2013.
[13] Take-Yuki Nagao. Automatic recognition of theorem environments of mathematical papers in LATEX format. Bulletin of Advanced Institute of Industrial Technology, 7:81–87, 2013.
[14] Toshio Oshima, Yoshiki Otobe, and Kazunori Asayama. dviout — a DVI previewer for Windows. http://www.ctan.org/pkg/dviout. Accessed October 2013.
[15] Fabrice Popineau. windvi — MS-Windows DVI viewer. http://www.ctan.org/pkg/windvi. Accessed August 2013.
[16] Christian Schenk. Yet Another Previewer. http://www.miktex.org. Accessed October 2013.
[17] Paul Vojta. xdvi — A DVI previewer for the X Window System. http://math.berkeley.edu/~vojta/xdvi.html. Accessed August 2013.

⋄ Antoine Bossard and Takeyuki Nagao
  Big Data Laboratory
  Advanced Institute of Industrial Technology
  1-10-40 Higashiooi
  Shinagawa-ku, 140-0011
  Japan
  abossard (at) aiit dot ac dot jp
  nagao-takeyuki (at) aiit dot ac dot jp


Selection in PDF viewers and a LuaTEX bug

Hans Hagen

In January 2014 a message was posted to the ConTEXt mailing list asking for clarification about the way PDF viewers select text. Let me give an example of that (inside a convenient ConTEXt wrapper):

    \startTEXpage[offset=2cm]
      \hbox{$ x+y $}
    \stopTEXpage

In figure 1 you can see how for instance Acrobat (which I use for proofing) and SumatraPDF (which I use in my edit–preview cycle) select this text. As reported in the mail, other viewers behave like SumatraPDF does, with excessive vertical space included in the selection.

Acrobat    SumatraPDF
Figure 1: Some math selected in PDF viewers.

Part of the question was why wrapping a display formula in an \hbox doesn’t have this effect on viewers. Think of:

    \startTEXpage[offset=2cm]
      \hbox{\startformula x+y \stopformula}
    \stopTEXpage

This can be reduced to the following primitive construct (which is rendered in figure 2):

    \startTEXpage[offset=2cm]
      \hbox{$$ x+y $$}
    \stopTEXpage

Acrobat    SumatraPDF
Figure 2: Some text selected in PDF viewers.

If you look closely you will see that we have text and not math. This is because in restricted horizontal mode (inside \hbox) TEX sees the $$ as a begin and an immediate end of math mode, so in fact we have here some text surrounded by empty math. When I realized that using ConTEXt’s \startformula could have this side effect in some situations I decided to catch this, but sometimes TEX can give surprises.

Already a while ago Taco and I decided that it would be handy to have primitives in LuaTEX for special characters (or more precisely: characters with certain catcodes) as used in math and alignments.

Table 1: Some of the extra LuaTEX primitives.

    token   primitive(s)
    #       \alignmark
    &       \aligntab
    $       \Ustartmath \Ustopmath
    $$      \Ustartdisplaymath \Ustopdisplaymath
    ^       \Usuperscript
    _       \Usubscript

As you can see in table 1, the dollars are somewhat special as in fact we don’t alias characters (tokens) but have introduced primitives that change the mode from text to inline or display math and back. So, we can say:

    \startTEXpage[offset=2cm]
      \hbox{\Ustartdisplaymath x+y \Ustopdisplaymath}
    \stopTEXpage

This renders okay in LuaTEX 0.78.1, but when we tinker a bit like this (with an invalid \par inside the \hbox):

    \startTEXpage[offset=2cm]
      \hbox{\Ustartdisplaymath x+y \Ustopdisplaymath \par}
    \stopTEXpage

we get this:

    Assertion failed!
    Program: c:\tex\texmf-win64\bin\luatex.exe
    File: web2c/luatexdir/tex/texnodes.w, Line 830
    Expression: p > my_prealloc

Oops. Furthermore, if we add some text after the invalid \par:

    \startTEXpage[offset=2cm]
      \hbox{\Ustartdisplaymath x+y \Ustopdisplaymath \par x}
    \stopTEXpage

we get:

    ! This can't happen (vpack).


As an excursion from working on the Critical Editions project, Luigi Scarso and I immediately started debugging this at the engine level, and after some tracing we saw that it had to do with packaging. Taco joined in, and we decided that it made no sense at all to try to deal with this at that level, simply because we ourselves had bypassed a natural boundary of TEX: catching the start of display math by seeing two successive $ as an inline formula. So one solution is to make \Ustartdisplaymath more clever, but it is better to simply issue an error message when this state is entered. That way we stick to the original TEX point of view, an approach that has never failed us so far. The chosen solution is to issue error messages (broken onto two lines for TUGboat):

    ! You can't use '\Ustartdisplaymath'
      in restricted horizontal mode

    ! You can't use '\Ustopdisplaymath'
      in restricted horizontal mode

If you really want this you can redefine the primitive:

    \let\normalUstartmath\Ustartmath
    \let\normalUstopmath \Ustopmath

    \let\normalUstartdisplaymath\Ustartdisplaymath
    \let\normalUstopdisplaymath \Ustopdisplaymath

    \unexpanded\def\Ustartdisplaymath % context way
      {\ifinner
         \ifhmode
           \normalUstartmath
           \let\Ustopdisplaymath
               \normalUstopmath
         \else
           \normalUstartdisplaymath
           \let\Ustopdisplaymath
               \normalUstopdisplaymath
         \fi
       \else
         \normalUstartdisplaymath
         \let\Ustopdisplaymath
             \normalUstopdisplaymath
       \fi}

As with many things in TEX there is often a way out, as long as things are open and accessible enough. Currently in ConTEXt we do something like the above for cases where confusion can happen.

With that fixed, it was time to return to the original question. Why do math selections have such large bounding boxes in some viewers? The answer to that is in the PDF file. Let’s look at the font properties of a math font, Latin Modern Math here:

    24 0 obj
    <<
    /Type /FontDescriptor
    /FontName /CXDZIF+LatinModernMath-Regular
    /Flags 4
    /FontBBox [-1042 -3060 4082 3560]
    /Ascent 3560
    /CapHeight 683
    /Descent -3060
    /ItalicAngle 0
    /StemV 93
    /XHeight 431
    /FontFile3 23 0 R
    /CIDSet 22 0 R
    >>
    endobj

Compare these rather large values for FontBBox, Ascent, etc., with a text font (Latin Modern Roman):

    24 0 obj
    <<
    /Type /FontDescriptor
    /FontName /TEKCPF+LMRoman12-Regular
    /Flags 4
    /FontBBox [-422 -280 1394 1127]
    /Ascent 1127
    /CapHeight 683
    /Descent -280
    /ItalicAngle 0
    /StemV 91
    /XHeight 431
    /FontFile3 23 0 R
    /CIDSet 22 0 R
    >>
    endobj

So, it looks like Acrobat is using the actual heights and depths of glyphs (probably with some slack) while other viewers use the font’s ascender and descender values. So in the end the answer is: there is nothing the user, ConTEXt or LuaTEX can do about it, apart from messing with the values above, which is probably not a good idea.

But trying to answer the question (by stripping down, etc.) had the side effect of identifying a bug in LuaTEX. A lesson learned is thus that even adding simple primitives like the ones above needs some studying of the source code in order to identify side effects. We should have known!

⋄ Hans Hagen
  Pragma ADE
  http://pragma-ade.com


Parsers in TEX and using CWEB for general pretty-printing

Alexander Shibakov

In this article I describe a collection of TEX macros and a few simple C programs called SPLinT that enable the use of the standard parser and scanner generator tools, bison and flex, to produce very general parsers and scanners coded as TEX macros. SPLinT is freely available from http://ctan.org/pkg/splint and http://math.tntech.edu/alex.

Introduction

The need to process formally structured languages inside TEX documents is neither new nor uncommon. Several graphics extensions for TEX (and LATEX) have introduced a variety of small specialized languages for their purposes that depend on simple (and not so simple) interpreters coded as TEX macros. A number of pretty-printing macros take advantage of different parsing techniques to achieve their goals (see [Go], [Do], and [Wo]).

Efforts to create general and robust parsing frameworks inside TEX go back to the origins of TEX itself. A well-known BASIC subset interpreter, BASIX (see [Gr]), was written as a demonstration of the flexibility of TEX as a programming language and a showcase of TEX’s ability to handle a variety of abstract data structures. On the other hand, a relatively recent addition to the LATEX toolbox, l3regex (see [La]), provides a powerful and very general way to perform regular expression matching in LATEX, which can be used (among other things) to design parsers and scanners.

Paper [Go] contains a very good overview of several approaches to parsing and tokenizing in TEX and outlines a universal framework for parser design using TEX macros. In an earlier article (see [Wo]), Marcin Woliński describes a parser creation suite paralleling the technique used by CWEB (CWEB’s ‘grammar’ is hard-coded into CWEAVE, whereas Woliński’s approach is more general). One commonality between these two methods is a highly customized tokenizer (or scanner) used as the input to the parser proper. Woliński’s design uses a finite automaton as the scanner engine with a ‘manually’ designed set of states. No backing up mechanism was provided, so matching, say, the longest input would require some custom coding (it is, perhaps, worth mentioning here that a backup mechanism is all one needs to turn any regular language scanner into a general CWEB-type parser). The scanner in [Go] was designed mainly with efficiency in mind and thus relies on a number of very clever techniques that are highly language-specific (out of necessity).

Since TEX is a system well-suited for typesetting technical documents, pretty-printing texts written in formal languages is a common task and is also one of the primary reasons to consider a parser written in TEX.

The author’s initial motivation for writing the software described in this article grew out of the desire to fully document a few embedded microcontroller projects that contain a mix of C code, Makefiles, linker scripts, etc. While the majority of code for such projects is written in C, superbly processed by CWEB itself, some crucial information resides in the kinds of files mentioned above and can only be handled by CWEB’s verbatim output (with some minimal postprocessing, mainly to remove the #line directives left by CTANGLE).

Parsing with TEX vs. others

Naturally, using TEX in isolation is not the only way to produce pretty-printed output. The CWEB system for writing structured documentation uses TEX merely as a typesetting engine, while handing over the job of parsing and preprocessing the user’s input to a program built specifically for that purpose. Sometimes, however, a paper or a book written in TEX contains a few short examples of programs written in another programming language. Using a system such as CWEB to process these fragments is certainly possible (although it may become somewhat involved) but a more natural approach would be to create a parser that can process such texts (with some minimal help from the user) entirely inside TEX itself. As an example, pascal (see [Go]) was created to pretty-print Pascal programs using TEX. It used a custom scanner for a subset of standard Pascal and a parser, generated from an LL(1) Pascal grammar by a parser generator, called parTEX.

Even if CWEB or a similar tool is used, there may still be a need to parse a formal language inside TEX. One example would be the use of CWEB to handle a language other than C.

Before I proceed with the description of the tool that is the main subject of this paper, allow me to pause for just a few moments to discuss the wisdom (or lack thereof) of laying the task of parsing formal texts entirely on TEX’s shoulders. In addition to using an external program to preprocess a TEX document, some recent developments allow one to implement a parser in a language ‘meant for such tasks’ inside an extension of TEX.

We are speaking of course about LuaTEX (see [Ha]) that essentially implements an entirely separate interface to TEX’s typesetting mechanisms and data structures in Lua (see [Lu]), ‘grafted’ onto a TEX extension.

Although I feel nothing but admiration for the LuaTEX developers, and completely share their desire to empower TEX by providing a general purpose programming language on top of its internal mechanisms, I would like to present three reasons to avoid taking advantage of LuaTEX’s impressive capabilities for this particular task.

First, I am unaware of any standard tools for generating parsers and scanners in Lua (of course, it would be just as easy to use the approach described here to create such tools). At this point in time, it is just as easy to coax standard parser generators into outputting parsers in TEX as it is to make them output Lua.

Second, I am a great believer in generating ‘archival quality’ documents: standard TEX has been around for almost three decades in a practically unchanged form, an eternity in the software world. The parser produced using the methods outlined in this paper uses standard (plain) TEX exclusively. Moreover, if the grammar is unchanged, the parser code itself (i.e. its semantic actions) is very readable, and can be easily modified without going through the whole pipeline (bison, flex, etc.) again. A full record of the grammar is left with the generated parser and scanner so even if the more ‘volatile’ tools, such as bison and flex, become incompatible with the package, the parser can still be utilized with TEX alone. Perhaps the following quote by D. Knuth (see [DEK2]) would help to reinforce this point of view: “Of course I do not claim to have found the best solution to every problem. I simply claim that it is a great advantage to have a fixed point as a building block.”

Finally, the idea that TEX is somehow unsuitable for such tasks may have been overstated. While it is true that TEX’s macro language lacks some of the expressive ability of its ‘general purpose’ brethren, it does possess a few qualities that make it quite adept at processing text (it is a typesetting language after all!). Among these features are: a built-in hashing mechanism (accessible through \csname...\endcsname and \string primitives) for storing and accessing control sequence names and creating associative arrays, a number of tools and data structures for comparing and manipulating strings (token registers, the \ifx primitive, various expansion primitives: \edef, \expandafter and the like), and even string matching and replacement (using delimited parameters in macros). TEX notoriously lacks a good (i.e. efficient and easy to use) framework for storing and manipulating arrays and lists (see the discussion of list macros in Appendix D of The TEXbook and in [Gr]) but this limitation is readily overcome by putting some extra care into one’s macros.

Languages, grammars, parsers, and TEX ...

Or

    Tokens and tables keep macros in check.
    Make ’em with bison, use WEAVE as a tool.
    Add TEX and CTANGLE, and C to the pool.
    Reduce ’em with actions, look forward, not back.
    Macros, productions, and stack!

        Computer generated (most likely)

The goal of the software described in this article, SPLinT (Simple Parsing and Lexing in TEX, or, in the tradition of GNU, SPLinT Parses Languages in TEX), is to give a macro writer the ability to use standard parser/scanner generator tools to produce TEX macros capable of parsing formal languages.

Let me begin by presenting a ‘bird’s eye view’ of the setup and the workflow one would follow to create a new parser with this package. To take full advantage of this software, two external programs (three if one counts a C compiler) are required: bison and flex (see [Bi] and [Pa]), the parser and scanner generators, respectively. Both are freely available under the terms of the General Public License version 3 or higher and are standard tools included in practically every modern GNU/Linux distribution. Versions that run under a number of other operating systems exist as well.

While the software allows the creation of both parsers and scanners in TEX, the steps involved in making a scanner are very similar to those required to generate a parser, so only the parser generation will be described below.

Setting the semantic actions aside for the moment, one begins by preparing a generic bison input file, following some simple guidelines. Not all bison options are supported (the most glaring omission is the ability to generate a general LR (glr) parser but this may be added in the future) but in every other respect it is an ordinary bison grammar. In some cases, a bison grammar may already exist and can be turned into a TEX parser with just a few (or none!) modifications and a new set of semantic actions (written in TEX of course). As a matter of example, the grammar used to pretty-print bison grammars in CWEB that comes with this package was adopted (with very minor modifications, mainly to create a more logical presentation in CWEB) from the original grammar used by bison itself.


Once the grammar has been debugged (using a combination of bison’s own impressive debugging facilities and the debugging features supported by the macros in the package), it is time to write the semantic actions for the syntax-directed translation (see [Ah]). These are ordinary TEX macros written using a few simple conventions listed below. First, the actions themselves will be executed inside a large \ifcase statement (this is not always the case, see the discussion of ‘optimization’ below, but it would be better to assume that it is); thus, care must be taken to write the macros so that they can be ‘skipped’ by TEX’s scanning mechanism. Second, instead of using bison’s $n syntax to access the value stack, a variety of \yy p macros are provided. Finally, the ‘driver’ (a small C program, see below) provided with the package merely cycles through the actions to output TEX macros, so one has to use one of the C macros provided with the package to output TEX in a proper form. One such macro is TeX_, used as TeX_("{TEX tokens}");.

The next step is the most technical, and the one most in need of automation. A Makefile provided with the package shows how such automation can be achieved. The newly generated parser (the ‘.c-file’ produced by bison) is #include’d in (yes, included, not merely linked to) a special ‘driver’ file. No modifications to the driver file or the bison-produced parser are necessary; all one has to do is call a C compiler with an appropriately defined macro (see the Makefile for details). The resulting executable is then run, which produces a .tex file that contains the macros necessary to use the freshly-minted parser in TEX. This short brush with a C compiler is the only time one ventures outside of the world of pure TEX to build a parser with this software (not counting the one needed to create the accompanying scanner if one is desired). It is possible to add a ‘plugin’ to bison to create a ‘TEX output mode’ but at the moment the ‘lazy’ version seems to be sufficient.

Now \input this file into your TEX document along with the macros that come with the package and voilà! You have a brand new parser in TEX! A full-featured parser for the bison input file format is included, and can be used as a template. For smaller projects, it might help to take a look at the examples portion of the package.

The discussion above glosses over a few important details that anybody who has experience writing ‘ordinary’ (i.e. non-TEX) parsers in bison would be eager to find out. Let us now discuss some of these details.

Data structures for parsing

A surprisingly modest amount of machinery is required to make a bison-generated parser ‘tick’. In addition to the standard arithmetic ‘bag of tricks’ (integer addition, multiplication and conditionals), some basic integer and string array (or closely related list and stack) manipulation is all that is needed.

Parser tables and stack access ‘in the raw’ are normally hidden from the parser designer but creating lists and arrays is standard fare for most semantic actions. The bison parser supplied with the package does not use any techniques that are more sophisticated than simple token register operations. Representing and accessing arrays this way (see Appendix D of The TEXbook or the \concat macro in the package) is simple and intuitive but computationally expensive. The computational costs are not prohibitive though, as long as the arrays are kept short. In the case of large arrays that are read often, it pays to use a different mechanism. One such technique (used also in [Go], [Gr], and [Wo]) is to ‘split’ the array into a number of control sequences (creating an ‘array’ of token sequences called, for example, \array[n], where n is an index value). This approach is used with the parser and scanner tables (which tend to be quite large) when the parser is ‘optimized’ (more about this later). Once again, it is possible to write the parser semantic actions without this (slightly unintuitive and cumbersome to implement) machinery.

This covers most of the routine computations inside semantic actions; all that is left is a way to ‘tap’ into the stack automaton built by bison using an interface similar to the special $n variables utilized by the ‘genuine’ bison parsers (i.e. written in C or any other target language supported by bison).

This role is played by the several varieties of \yy p command sequences (for the sake of completeness, p stands for one of (n), [name], ]name[ or n; here n is a string of digits, and a ‘name’ is any name acceptable as a symbolic name for the term in bison). Instead of going into the minutiae of various flavors of \yy-macros, let me just mention that one can get by with only two ‘idioms’ and still be able to write parsers of arbitrary sophistication: \yy(n) can be treated as a token register containing the value of the n-th term of the rule’s right hand side, n > 0. The left hand side of a production is accessed through \yyval. A convenient shortcut is \yy0{⟨TEX material⟩} which will expand the ⟨TEX material⟩ inside the braces.

Thus, a simple way to concatenate the values of the first two production terms is \yy0{\the\yy(1)\the\yy(2)}. The included bison parser can also be used to provide support for ‘symbolic names’, analogous to bison’s $[name] syntax, but this requires a bit more effort on the user’s part in order to initialize such support. It could make the parser more readable and maintainable, however.

Naturally, a parser writer may need a number of other data abstractions to complete the task. Since these are highly dependent on the nature of the processing the parser is supposed to provide, we refer the interested reader to the parsers included in the package as a source of examples of such specialized data structures.

Pretty-printing support with short formatting hints

The scanner ‘engine’ is propelled by the same set of data structures and operations that drive the parser automaton: stacks, lists and the like. Table manipulation happens ‘behind the scenes’ just as in the case of the parser. There is also a stack of ‘states’ (more properly called subautomata) that is manipulated by the user directly, where the access to the stack is coded as a set of macros very similar to the corresponding C functions in the ‘real’ flex scanners. The ‘handoff’ from the scanner to the parser is implemented through a pair of registers: \yylval, a token register containing the value of the returned token, and \yychar, a \count register that contains the numerical value of the token to be returned.

Upon matching a token, the scanner passes one crucial piece of information to its user: the character sequence representing the token just matched (\yytext). This is not the whole story, though: three more token sequences are made available to the parser writer whenever a token is matched.

The first of these is simply a ‘normalized’ version of \yytext (called \yytextpure). In most cases it is a sequence of TEX tokens with the same character codes as the one in \yytext but with their category codes set to 11. In cases when the tokens in \yytext are not (character code, category code) pairs, a few simple conventions are followed, explained elsewhere. This sequence is provided merely for convenience and its typical use is to generate a key for an associative array.

The other two sequences are special ‘stream pointers’ that provide access to the extended scanner mechanism in order to implement passing of ‘formatting hints’ to the parser without introducing any changes to the original grammar, as explained below.

Unlike strict parsers employed by most compilers, a parser designed for pretty-printing cannot afford being too picky about the structure of its input ([Go] calls such parsers ‘loose’). As a way of simple illustration, an isolated identifier, such as ‘lg_integer’, can be a type name, a variable name, or a structure tag (in a language like C for example). If one expects the pretty-printer to typeset this identifier in a correct style, some context must be supplied, as well. There are several strategies a pretty-printer can employ to get hold of the necessary context. Perhaps the simplest way to handle this, and to reduce the complexity of the pretty-printing algorithm, is to insist on the user providing enough context for the parser to do its job. For examples like the one above, this is an acceptable strategy. Unfortunately, it is easy to come up with longer snippets of grammatically deficient text that a pretty-printer should be expected to handle. Some pretty-printers, such as the one employed by CWEB and its ilk (WEB, FWEB), use a very flexible bottom-up technique that tries to make sense of as large a portion of the text as it can before outputting the result (see also [Wo], which implements a similar algorithm in LATEX).

The expectation is that this algorithm will handle the majority of the cases with the remaining few left for the author to correct. The question is, how can such a correction be applied?

CWEB itself provides two rather different mechanisms for handling these exceptions. The first uses direct typesetting commands (for example, @+ and @* for cancelling and introducing a line break, resp.) to change the typographic output.

The second (preferred) way is to supply hidden context to the pretty-printer. Two commands, @; and @[...@], are used for this purpose. The former introduces a ‘virtual semicolon’ that acts in every way like a real one except it is not typeset (it is not output in the source file generated by CTANGLE, either, but this has nothing to do with pretty-printing, so I will not mention CTANGLE anymore). For instance, from the parser’s point of view, if the preceding text was parsed as a ‘scrap’ of type exp, the addition of @; will make it into a ‘scrap’ of type stmt in CWEB’s parlance. The latter construct, @[...@], is used to create an exp scrap out of whatever happens to be inside the brackets.

This is a powerful tool at one’s disposal. Stylistically, this is the right way to handle exceptions as it forces the writer to emphasize the logical structure of the formal text.

If the pretty-printing style is changed extensively later, the texts with such hidden contexts should be able to survive intact in the final document (as an example, using a break after every statement in C may no longer be considered appropriate, so any forced break introduced to support this convention would now have to be removed, whereas @;’s would simply quietly disappear into the background).

The same hidden context idea has another important advantage: with careful grammar fragmenting (facilitated by CWEB’s or any other literate programming tool’s ‘hypertext’ structure) and a more diverse hidden context (or even arbitrary hidden text) mechanism, it is possible to use a strict parser to parse incomplete language fragments. For example, the productions that are needed to parse C’s expressions form a complete subset of the parser. If the grammar’s ‘start’ symbol is changed to expression (instead of the translation-unit as it is in the full C grammar), a variety of incomplete C fragments can now be parsed and pretty-printed. Whenever such granularity is still too ‘coarse’, carefully supplied hidden context will give the pretty-printer enough information to adequately process each fragment. A number of such sub-parsers can be tried on each fragment (this may sound computationally expensive; however, in practice, a carefully chosen hierarchy of parsers will finish the job rather quickly) until a correct parser produces the desired output.

This somewhat lengthy discussion brings us to the question directly related to the tools described in this article: how does one provide typographical hints or hidden context to the parser?

One obvious solution is to build such hints directly into the grammar. The parser designer can, for instance, add new tokens (terminals, say, BREAK_LINE) to the grammar and extend the production set to incorporate the new additions. The risk of introducing new conflicts into the grammar is low (although not entirely non-existent, due to the lookahead limitations of LR(1) grammars) and the changes required are easy, although very tedious, to incorporate.

In addition to being labor intensive, this solution has two other significant shortcomings: it alters the original grammar and hides its logical structure, and it ‘bakes in’ the pretty-printing conventions into the language structure (making ‘hidden’ context much less ‘stealthy’).

A much better approach involves inserting the hints at the lexing stage and passing this information to the parser as part of the token ‘values’. The hints themselves can masquerade as characters ignored by the scanner (white space, for example) and preprocessed by a specially designed input routine. The scanner then simply passes on the values to the parser.

The difficulty lies in synchronizing the token production with the parser. This subtle complication is very familiar to anyone who has designed TEX’s output routines: the parser and the lexer are not synchronous, in the sense that the scanner might be reading several (in the case of the general LR(n) parsers) tokens ahead of the parser before deciding on how to proceed (the same way TEX can consume a whole paragraph’s worth of text before exercising its page builder).

If we simple-mindedly let the scanner return every hint it has encountered so far, we may end up feeding the parser the hints meant for the token that appears after the fragment the parser is currently working on. In other words, when the scanner ‘backs up’ it must correctly back up the hints as well.

This is exactly what the scanner produced by the tools in this package does: along with the main stream of tokens meant for the parser, it produces two hidden streams (called the \format stream and the \stash stream) and provides the parser with two strings (currently only strings of digits are used although arbitrary sequences of TEX tokens can be used as pointers) with the promise that all the ‘hints’ between the beginning of the corresponding stream and the point labelled by the current stream pointer appeared among the characters up to and, possibly, including the ones matched as the current token. The macros to extract the relevant parts of the streams (\yyreadfifo and its cousins) are provided for the convenience of the parser designer. The interested reader can consult the input routine macros for the details of the internal representation of the streams.

In the interest of full disclosure, let me point out that this simple technique introduces a significant strain on TEX’s computational resources: the lowest level macros, the ones that handle character input and are thus executed (sometimes multiple times) for every character in the input stream, are rather complicated and therefore slow. Whenever the use of such streams is not desired, a simpler input routine can be written to speed up the process (see \yyinputtrivial for a working example of such a macro).

The parser function

To achieve such a tight integration with bison, its parser template, yyparse(), was simply translated into TEX using the following well known method.

Parsers in TEX and using CWEB for general pretty-printing 76 TUGboat, Volume 35 (2014), No. 1

Given the code (where ’s are the only the generated parser is done in several stages, the means of branching but can appear inside condi- debugging may become rather involved. tionals): All the debugging features are activated by using various \iftrace... conditionals, as well as label A: ... \ifyyinputdebug and \ifyyflexdebug (for ex- [more code ...] ample, to look at the parser stack, one would set goto C; \tracestackstrue). When all of the conditionals [more code ...] are activated, a lot of output is produced. At this label B: ... point it is important to narrow down the prob- [more code ...] lem area and only activate the debugging features goto A; relevant to any errant behaviour exhibited by the [more code ...] parser. Most of the debugging features built into ordinary bison parsers (and flex scanners) are label C: ... available. [more code ...] In general, debugging parsers and scanners (and one way to translate it into TEX is to define a set debugging in general) is a very deep topic that may of macros (call them \labelA, \labelAtail and require a separate paper (or maybe a book!) all by so forth for clarity) that end in \next (a common itself, so I will simply leave it here and encourage name for this purpose). Now, \labelA will imple- the reader to experiment with the included parsers ment the code that comes between label A: and to learn the general operational principles behind goto C;, whereas \labelAtail is responsible for the parsing automaton. One needs to be aware that, the code after goto C; and before label B: (pro- unlike the ‘real’ C parsers, the TEX parser has to deal vided no other goto’s intervene which can always be with more than simply straight text. So if it looks arranged). The conditional preceding goto C; can like the parser (or the scanner) absolutely has to now be written in TEX as accept the (rejected) input displayed on the screen, just remember that an ‘a’ with a category code 11 \if(condition) and an ‘a’ with a category code 12 look the same \let\next=\labelC on the terminal while TEX and the parser/scanner \else may treat them as completely different characters \let\next=\labelAtail (this behavior itself can be fine tuned by changing where (condition) is an appropriate translation of \yyinput). the corresponding condition in the code being trans- lated (usually, one of ‘=’ or ‘6=’). Further details can Speeding up the parser be extracted from the T X code that implements E By default, the generated parser and scanner keep these functions where the corresponding C code is all of their tables in separate token registers. Each presented alongside the macros that mimic its func- stack is kept in a single macro. Thus, every time a tionality. table is accessed, it has to be expanded making the table access latency linear in the size of the table. Debugging The same holds for stacks and the action ‘switches’, If the tools in the package are used to create medium of course. While keeping the parser tables (that are to high complexity parsers, the question of debug- constant) in token registers does not have any better ging will come up sooner or later. The grammar de- rationale than saving control sequence memory (the sign stage of this process can utilize all the excellent most abundant memory in TEX), this way of storing debugging facilities provided by bison and flex (re- stacks does have an advantage when multiple parsers porting of conflicts, output of the automaton, etc.). come into play simultaneously. 
All one has to do to The Makefiles supplied with the package will auto- switch from one parser to another is to save the state matically output all the debugging information the by renaming the stack control sequences accordingly. corresponding tool can provide. Eventually, when When the parser and scanner are ‘optimized’ all the conflicts are ironed out and the parser begins (by saying \def\optimization{5}, for example), to process input without performing any actions, it all these control sequences are ‘spread over’ the ap- becomes important to have a way to see ‘inside’ the propriate associative arrays (by creating a number parsing process. Since the processing performed by of control sequences that look like \array[n], where

Alexander Shibakov TUGboat, Volume 35 (2014), No. 1 77 n is the index, as explained above). While it is cer- is therefore assumed to be C code and is carefully tainly possible to optimize only some of the parsers collected and ‘cleaned up’ by the macros included in (if your document uses multiple) or even only some the package. parts of a given parser (or scanner), the details of For the purposes of documenting the TEX how to do this are rather technical and are left for parser, one additional feature of CWEAVE is taken the reader to discover by reading the examples sup- advantage of: the text inside double quotes, "..." plied with the package. At least at the beginning it is treated similarly to the verbatim portion of the is easier to simply set the highest optimization level input (this can be viewed as a ‘local’ version of the and use it consistently throughout the document. verbatim sections). Moreover, CWEAVE allows one to treat a function name (or nearly any identifier) Use with CWEB as a TEX macro. These two features are enough to implement pretty-printing of semantic actions in Since the macros in the package were designed to TEX. The macros will turn an input string such support pretty-printing of languages other than C in as, e.g. ‘TeX_( "\\relax" );’ into ‘◦’ (for the sake CWEB it makes sense to spend a few paragraphs on of convenience, the string above would actually be this particular application. The CWEB system con- written as ‘TeX_( "/relax" );’ as explained in sists of two weakly related programs: CWEAVE and the manual for the package). See the documenta- CTANGLE. The latter extracts the C portion of the tion that comes with the package and the bison users input, and outputs a C file after an appro- language pretty-printer implementation for any ad- priate rearrangement of the various sections of the ditional details. code. The task of CWEAVE is very different and ar- guably more complex: not only does it have to be An example aware of the general ‘hierarchy’ of various subsec- tions of the program to create cross references, an As an example, let us walk through the development index, etc., it also has to understand enough of the C process of a simple parser. Since the language itself code in order to pretty-print it. Whereas CTANGLE is not of any particular importance, a simple gram- simply separates the program code from the pro- mar for additive expressions was chosen. The exam- grammer’s documentation, rearranges it and out- ple, with a detailed description, and all the neces- puts the original program text (with added #line sary files, is included in the examples directory. The directives and simple C comments that can be eas- Makefile there allows one to type, say, make step1 ily removed in postprocessing if necessary), the out- to produce all the files needed in the first step of this put of CWEAVE bears very little resemblance to the short walk-through. Finally, make docs will pro- original program. It might sound a bit exaggerated duce a pretty-printed version of the grammar, the but CWEAVE’s processing is ‘one-way’: it would be regular expressions, and the test TEX file along with difficult or even impossible to write software that detailed explanations of every stage. ‘undoes’ the pretty-printing performed by CWEAVE. As the first step, one creates a bison input file There is, however, a loophole that allows one to (expp.y) and a similar input for flex (expl.l). 
A use CWEB with practically any language, and pretty- typical fragment of expp.y looks like the following: print the results, if an appropriate ‘filter’ is available. value: The saving grace comes in the form of CWEB’s ver- expression {TeX_("/yy0{/the/yy(1)}");} batim output: any text inside @= and @> undergoes ; some minimal processing (mainly to ‘escape’ dan- gerous TEX characters such as ‘$’) and is put inside The scanner’s regular expression section, in its en- \vb{...} by CWEAVE. tirety is: The macros in the package take advantage of [ \f\n\t\v] {TeX_("/yylexnext");} this feature by collecting all the text inside \vb {id} { groups and trying to parse it. If the parsing pass TeX_("/yylexreturnval{IDENTIFIER}");} is successful, pretty-printed output is produced, if {int} { not, the text is output in ‘typewriter’ style. TeX_("/yylexreturnval{INTEGER}");} With languages such as bison’s input script, an [+*()] {TeX_("/yylexreturnchar");} additional complication has to be dealt with: most .{ of the time the actions are written in C so it makes TeX_("/iftracebadchars"); sense to use CWEAVE’s C pretty-printer to typeset the TeX_(" /yycomplain{%%"); action code. Most material outside of \vb groups TeX_(" invalid character(s): %%");

Parsers in TEX and using CWEB for general pretty-printing 78 TUGboat, Volume 35 (2014), No. 1

TeX_(" /the/yytext}"); [Do] Jean-luc Doumont, Pascal pretty-printing: TeX_("/fi"); An example of “preprocessing with TEX”, TeX_("/yylexreturn{$undefined}"); TUGboat 15 (3), 1994 — Proceedings } of the 1994 TUG Annual Meeting. http://tug.org/TUGboat/tb15-3/ Once the files have been prepared and debugged, the tb44doumont.pdf ptabout next step is to generate the ‘driver’ files, [Er] Sebastian Thore Erdweg and Klaus and ltabout. For the parser ‘driver’, this is done Ostermann, Featherweight TEX and Parser with Correctness, Proceedings of the Third bison expp.y -o expp.c International Conference on Software gcc -DPARSER_FILE=\ Language Engineering, pp. 397–416, \"examples/expression/expp.c\" \ Springer-Verlag Berlin, Heidelberg, 2011. -o ptabout ../../mkeparser.c [Fi] Jonathan Fine, The \CASE and \FIND macros, TUGboat 14 (1), pp. 35–39, The first line generates the parser from the bison 1993. http://tug.org/TUGboat/tb14-1/ input file that was prepared in the first step and the tb38fine.pdf next line uses this file to produce a ‘driver’. If the [Go] Pedro Palao Gostanza, Fast scanners and included Makefile is used, the file mkeparser.c is self-parsing in TEX, TUGboat 21 (3), generated automatically, otherwise one has to make 2000—Proceedings of the 2000 Annual sure that it exists and resides in the appropriate Meeting. http://tug.org/TUGboat/tb21-3/ directory first. It has no external dependencies and tb68gost.pdf can be freely moved to any place that is convenient. [Gr] Andrew Marc Greene, BASIX—An interpreter Next, run ptabout and ltabout to produce the written in TEX, TUGboat 11 (3), 1990 — automata tables: Proceedings of the 1990 TUG Annual ptabout --optimize-actions ptab.tex Meeting. http://tug.org/TUGboat/tb11-3/ tb29greene.pdf ltabout --optimize-actions ltab.tex [Ha] Hans Hagen, LuaTEX: Halfway to version 1, Now, look inside expression.sty for a way to TUGboat 30 (2), pp. 183–186, 2009. include the parser in your own documents, or sim- http://tug.org/TUGboat/tb30-2/ ply \input it in your own TEX file. Executing tb95hagen-luatex.pdf make test.tex will produce a test file for the new [Ie] R. Ierusalimschy et al., Lua 5.1 parser. This is it! Reference Manual, Lua.org, August 2006. http://www.lua.org/manual/5.1/ Acknowledgment [La] The l3regex package: Regular A The author would like to thank the editors, Barbara expressions in TEX, The L TEX3 Project. http://www.ctan.org/pkg/l3regex Beeton and Karl Berry, for a number of valuable suggestions and improvements to this article. [Pa] et al., Lexical Analysis With Flex, for Flex 2.5.37, July 2012. References http://flex.sourceforge.net/manual/ [Wo] Marcin Woli´nski, Pretprin — ALATEX2ε [Ah] Alfred V. Aho et al., : Principles, package for pretty-printing texts in formal Techniques, and Tools, Pearson Education, languages, TUGboat 19 (3), 1998 — 2006. Proceedings of the 1998 TUG Annual [Bi] Charles Donnelly and , Meeting. http://tug.org/TUGboat/tb19-3/ Bison, The -compatible Parser Generator, tb60wolin.pdf The Free Software Foundation, 2013. http://www.gnu.org/software/bison/ ⋄ Alexander Shibakov [DEK1] Donald E. Knuth, The TEXbook, Dept. of Mathematics Addison-Wesley Reading, Massachusetts, Tennessee Tech. University 1984. Cookeville, TN [DEK2] Donald E. Knuth The future of TEX and http://math.tntech.edu/alex METAFONT, TUGboat 11 (4), p. 489, 1990. http://tug.org/TUGboat/tb11-4/ tb30futuretex.pdf

Alexander Shibakov TUGboat, Volume 35 (2014), No. 1 79

Entry-level MetaPost 4: Artful lines below, you can see that the butted line is shorter than the one with squared caps. Mari Voipio For basic information on running MetaPost, either standalone or within a ConTEXt document, see http: mitered mitered mitered //tug.org/metapost/runningmp.html. For previ- butt rounded squared ous installments of this tutorial series, see http: //tug.org/TUGboat/intromp. Thus far, we’ve done very little with lines except rounded rounded rounded change their color. Other than that, we have used butt rounded squared the built-in default settings for things like line width and line joins. In this tutorial we learn to tweak lines beveled beveled beveled and use some special effects to put lines to work. butt rounded squared 1 Line width Linejoins and linecaps Lines are drawn with a pen. The default pen is How a linejoin looks depends also on the angle pencircle with a line similar to what e.g. a ball- of the corner: point pen produces. When we want to change line pickup pencircle scaled .2cm; % pen width width, we always need to specify both the pen “nib” and the width desired: path rectangle; rectangle := (0,0) -- (2cm,0) -- (2cm,2cm) -- (0,2cm) -- cycle; numeric u; u := 4mm; % measurement unit path diamond; diamond := (1cm,0) -- (2cm,1.5cm) -- (1cm,3cm) -- (0,1.5cm) -- cycle; path heart; heart := (4u,0u) .. (0u,5u) .. (0u,6u) linejoin := mitered; .. (2u,8u) .. (4u,6u) -- (4u,6u) .. (6u,8u) draw rectangle shifted (0,3.5cm); .. (8u,6u) .. (8u,5u) .. (4u,0u) -- cycle; draw diamond;

draw heart withpen pencircle scaled .8u linejoin := rounded; withcolor red; draw rectangle shifted (2.5cm,3.5cm); draw heart shifted (10u,0) draw diamond shifted (2.5cm,0); withpen pensquare scaled .8u withcolor blue; (Grayscaled for printed TUGboat output.) linejoin := beveled; draw rectangle shifted (5cm,3.5cm); draw diamond shifted (5cm,0);

For more information about pens, e.g. calli- graphic pens, see the MetaFun manual [3, p. 36–38].

2 Line joins and line caps Where lines are joined together, the corner can be turned in several different ways: “sharp” (mitered), rounded (rounded) or “cut off” (beveled). When a line is not cycled into a closed object, it also has two ends that can be “capped” in different ways: butt 3 Dashed lines means straight end without any line cap, rounded In addition to a solid line, we can create dotted and adds a rounded line cap and squared adds a square dashed lines. The distance between the dots/dashes cap to the end. If you look carefully at the example is adjusted with the setting scaled; the bigger the

Entry-level MetaPost 4: Artful lines 80 TUGboat, Volume 35 (2014), No. 1

number, the more space there is between the dots or 5 Applying settings for multiple paths dashes. In the examples above we have already used pickup pickup pencircle scaled .5mm; % pen width to set the pen width for multiple paths. I think of this as having a bunch of pens on the table and picking % dotted lines up one after another to draw with; the drawn lines draw (0,20mm) -- (70mm,20mm) dashed withdots; have the same width until I switch to a different pen— draw (0,15mm) -- (70mm,15mm) dashed withdots except that we can override the pickup settings for an scaled 2; individual path by using the withpen command, as % dashed lines in the arrow example above. The following example draw (0,5mm) -- (70mm,5mm) dashed evenly; picks up two pens in turn: draw (0,0) -- (70mm,0) dashed evenly scaled 2; numeric u; u := 1.5mm; heart := (4u,0u) .. (0u,5u) .. (0u,6u) .. (2u,8u) .. (4u,6u) -- (4u,6u) .. (6u,8u) .. (8u,6u) .. (8u,5u) .. (4u,0u) -- cycle;

pickup pencircle scaled u; draw heart withcolor red; draw heart shifted (10u,0) withcolor blue; It is not possible to fill paths that have a dashed pickup pencircle scaled .5u; pencircle (out)line. It is also advisable to use only draw heart shifted (20u,0) withcolor red; in combination with dashed lines. draw heart shifted (30u,0) withcolor blue; For more information on adjusting dashed lines, see the MetaPost manual [2, pp. 37–40] and the Meta- Fun manual [3, pp. 40–41].

4 Arrows There are separate commands for drawing arrows, A more versatile command is drawoptions(pen, but they have the same settings as the plain draw color, withcolor) which allows us to set default line command: pen, dashing and color. Arrows go from properties: pen withpen), color (withcolor) and left to right by default; an arrow with the arrow- dashing (dashed). This command can be used to set head on the left can be created either by giving the defaults for the whole drawing or just part of it, and is coordinates right-to-left or by using the drawarrow valid until the next drawoptions() command. You reverse() command. can also reset everything by giving the drawoptions pickup pencircle scaled .1cm; command with nothing within the parentheses. The drawoptions are overridden by setting pen drawarrow (0,3.5cm) -- (7cm,3.5cm); and/or color and/or dashing individually for a path, drawarrow (7cm,3cm) -- (0,3cm) withcolor blue; as usual: drawarrow reverse((0,1.5cm) .. (3.5cm,2.5cm) .. (7cm,1.5cm)); numeric u; u := 1.5mm; drawdblarrow (0,1cm) -- (7cm,1cm); heart := (4u,0u) .. (0u,5u) .. (0u,6u) .. (2u,8u) .. (4u,6u) -- (4u,6u) .. (6u,8u) drawarrow (0,0.5cm) -- (7cm,0.5cm) .. (8u,6u) .. (8u,5u) .. (4u,0u) -- cycle; withpen pencircle scaled .05cm dashed evenly scaled 2 drawoptions(withpen pensquare scaled .8u withcolor red; withcolor red);

draw heart; % default settings from drawoptions draw heart shifted (10u,0) withpen pencircle scaled .8u; % pen override draw heart shifted (20u,0) withcolor blue; % color override draw heart shifted (30u,0) dashed withdots withpen pencircle scaled .5u withcolor black; % dash, pen, color override

Mari Voipio TUGboat, Volume 35 (2014), No. 1 81

We haven’t seen the unit assignment (in the first line) before: in MetaFun, it applies to numbers without any other unit. See the MetaFun manual [3, pp. 213–215] for 6 MetaFun bonus: Grids further examples. With the MetaFun package we can easily create evenly spaced and logarithmic grids. The syntax is: % horizontal/vertical linear: hlingrid (Min, Max, Step, Length, Width) vlingrid (Min, Max, Step, Length, Height)

% horizontal/vertical logarithmic: hloggrid (Min, Max, Step, Length, Width) vloggrid (Min, Max, Step, Length, Height) The grid settings are used in combination with the draw command: pickup pencircle scaled .2mm; 7 References draw hlingrid (0, 10, 1, 3cm, 3cm); [1] Running MetaPost and Metafun. draw vloggrid (0, 10, 1, 3cm, 3cm) http://tug.org/metapost/runningmp.html withcolor red; [2] MetaPost manual. http://tug.org/docs/metapost/mpman.pdf [3] MetaFun manual. http://www.pragma-ade.com/general/ manuals/metafun-p.pdf

⋄ Mari Voipio mari dot voipio (at) lucet dot fi http://www.lucet.fi We can create gridded paper with the right grid settings: width := 5cm; height := 5cm; unit := cm; drawoptions(withpen pencircle scaled .2pt withcolor .8white); draw vlingrid(0, width /unit, 1/10, width, height); draw hlingrid(0, height/unit, 1/10, height, width ); drawoptions(withpen pencircle scaled .5pt withcolor red); draw vlingrid(0, width /unit, 1, width, height); draw hlingrid(0, height/unit, 1, height, width );

Entry-level MetaPost 4: Artful lines 82 TUGboat, Volume 35 (2014), No. 1 drawdot in MetaPost: A bug, a fix def drawdot expr p = % new definition if pair p : Hans Hagen addto currentpicture doublepath p withpen currentpen _op_ It is no secret that Don Knuth uses MetaPost for else : graphics in his books nowadays. This has the nice errmessage("drawdot needs a pair ...") side effect of large-scale testing of MetaPost stabil- fi ity. Recently he uncovered a bug in the drawdot enddef; macro, which plain MetaPost has always defined like this: drawdot origin withpen pencircle scaled 2cm; def drawdot expr z = % original definition pickup pencircle scaled 2cm; addto currentpicture drawdot origin shifted (3cm,0); contour makepath currentpen shifted z _op_ Now our simplified example comes out the same: enddef; The submitted test was this: for j = 1 upto 9 : pickup pencircle scaled .4; drawdot (10j,0) withpen pencircle scaled .5j; pickup pencircle scaled .5j; drawdot (10j,10); endfor; which visualizes as: This definition is more or less the same as: let drawdot = draw; But our more extensive variant has the advantage that it behaves a bit like a primitive operation: a Let’s simplify and exaggerate this a bit: dot is supposed to be a pair and if not, we get an error. drawdot origin withpen pencircle scaled 2cm; We believe that most users will not notice this pickup pencircle scaled 2cm; drawdot origin shifted (3cm,0); change. First of all we have never received a com- plaint before, which might be an indication that The left-hand variant demonstrates that the old users already used draw instead of drawdot. Sec- definition of the macro uses the current pen (which 1 ond, dots are normally drawn small, so users might by default is one base unit, 65536 of a pixel) to cal- never have noticed such artifacts. culate a contour (a.k.a. outline) that then is drawn Of course the MetaPost team is curious about with a larger pen. The opened up dot is a side ef- what bug Don will come up with next, especially fect of the exported PostScript code. The right-hand when he needs very large graphics that rely on the version shows that picking up the larger pen first and new double (floating-point) mode of MetaPost. then drawing has a different (and correct) effect. In the original message, available at tug.org/ pipermail/tex-k/2014-January.txt, a few more observations were made and testing revealed that there is room for improvement for paths that consist of a point cycling to itself. We will look into these some time in the future.

⋄ Hans Hagen Pragma ADE http://metapost.org The two formulations should be equivalent. So the version of MetaPost that will ship with T X E Editor’s note: The figures here are bitmaps, extracted Live 2014 has a new definition of drawdot. While from screenshots, because the conversion of EPS to PDF METAFONT the original definitions followed a ap- for production also filled in the open dots! Clearly the proach, the new definition relies on PostScript doing effects are dependent on the particular software involved. the work:

Hans Hagen TUGboat, Volume 35 (2014), No. 1 83

HTML to LATEX transformation html2 Frederik R. N. Schlupkothen texml Abstract HTML XSLT TEXML (LA)TEX PDF (LA)TEX was created as an authoring language that enables authors to keep typesetting quality standards Figure 1: HTML to PDF processing while preparing their printed matter, assuming that the output context is known from the very begin- to describe web pages (see [7]), electronic books (see ning of the writing process. As a counter-concept, [4]), and printed matter (see e.g. [9, 13]). LATEX, as XML is focused on keeping output flexible, providing the transformation’s output, is the abstraction of mechanisms to manage and control the logical struc- TEX’s typesetting commands to logical markup. ture of documents. Combining the strengths of both Figure 1 shows the XML-based production work- ecosystems has been discussed frequently in the past. flow that produces the Portable Document Format This article aims to contribute to this discussion by (PDF) from HTML by the use of the TEX typesetting introducing a mapping from HTML to LATEX, the two engine. The particular processing steps are described most widespread document description languages in via an example of emphasized text: their respective fields of application. 1. The HTML source document describes the em- 1 Introduction phasized text “dolor” as: dolor 2. The XSLT processor queries the stylesheet “html The Extensible Markup Language (XML) has be- 2 texml” for a matching transformation tem- come the native markup language for structured plate. The matching template defines the trans- information in (distributed) document processing ar- formation of the HTML’s “em” element to the chitectures. It provides a core syntax that has been corresponding LAT X command in its T XML adopted by many document description languages. E E representation: A broad software ecosystem and further standards have emerged to realize and facilitate the processing of XML-based documents. Among them, the Exten- sible Stylesheet Language Transformations (XSLT) described in [8] offers a standardized mechanism to translate different XML-based languages into one an- other. This has made XML the language of choice for cross-media publishing workflows. 3. The result of the XSLT transformation is the However, while the XML processing is fully cov- following TEXML element: ered by viable tools, the final document production of printed matter from XML may still be a problem. A dolor potential solution is to integrate the TEX typesetting engine (described in [10]) in XML-based processing 4. The TEXML processor converts the TEXML ele- chains. For this task, TEXML, an XML representa- ment to a LATEX command: \emph{dolor} tion of TEX commands, was introduced in [12] and 5. The LATEX processor typesets: dolor its “production proof” implementation discussed in [14]. As any other XML language, TEXML documents The prior example shows that defining the trans- can be produced via XSLT. So the last gap to fill in formation between two languages needs insight into TEXML-based workflows is to define XSL stylesheets the differences in conceptual syntax and semantic that realize the transformation between specific XML coverage of source and destination language. While and TEX document description languages (see [14]). HTML is native to the XML syntax, TEXML is re- Here we introduce a mapping between two doc- producing the syntax logic inherited from (LA)TEX. 
ument description languages that are well known in While LATEX is native to printed matter, HTML was their respective fields of application: The HyperText initially designed for electronic resources. Section 2 Markup Language (HTML), as the transformation’s introduces the underlying concepts of the HTML and source language, is the core language of the World LATEX markup syntax. Section 3 introduces the map- Wide Web and has a history that is closely tied to ping between HTML and LATEX markup semantics. XML. It offers layout-oriented markup semantics pri- Finally, section 4 shows a complete document with its marily for textual content and is used amongst others corresponding representations in HTML and LATEX.

HTML to LATEX transformation 84 TUGboat, Volume 35 (2014), No. 1

2 Markup syntax 2.2 Basic syntax of LATEX The following two subsections give an overview of ALATEX document consists of commands that de- the very basic HTML and LATEX syntax. They do not scribe either output characters (i.e. characters to introduce the full syntax but focus on the aspects typeset), special characters (e.g. the ~ character for needed within this article. Full descriptions can a non-breaking space), or control sequences. There be found in [3] for XML, the underlying markup are two types of control sequences: control words language of HTML as described in [6], and in [10, 11] and control symbols. A control word starts with a \ for LATEX. character followed by its name that consists of one or more letters (i.e. lower- or uppercase letters ‘a’ to ‘z’) 2.1 Basic syntax of HTML and is terminated by either a space or another non- An HTML document consists of elements that are letter. A control symbol starts with a \ character either empty or non-empty. The boundaries of a followed by one non-letter. A command can possess non-empty element is marked by a start-tag and optional and required parameters that are set by a end-tag. Tag delimiters are the < and > charac- arguments. Optional parameter arguments are noted ters. The element type is defined in the start-tag after the command name between square brackets, by its name. The end-tag repeats the element name and required parameter arguments between curly preceded by a / character. The element’s content braces. The following example shows a ‘usepackage’- is enclosed between the start- and end-tag. The command with an optional parameter set to ‘utf8’ content consists of character data (i.e. text), subor- and a required parameter set to ‘inputenc’: dinated child elements, or both. An element without parameters content is called empty and is either described by z }| { a start-tag that is directly followed by its end-tag or by an empty-element-tag. An empty-element-tag \usepackage [utf8] {inputenc} | {z } | {z } has the same form as a start-tag but ends with a / optional required character. The following example shows a non-empty ‘p’-element with mixed content: Furthermore there are two special types of com- child element mands: environments and declarations. Environ- z }| { ments are pairs of ‘begin’- and ‘end’-commands that

Lorem ipsum dolor sit amet.

enclose the environment’s content. The environment |{z} | {z } | {z } name is provided as the first required argument of start-tag content end-tag the corresponding ‘begin’- and ‘end’-commands. The An element can possess attributes. Attributes arguments of the environment are noted as further are noted in the start-tag or empty-element-tag be- arguments of the ‘begin’-command. Declarations hind the element name. Attributes consist of a name- influence the behavior of following commands. The value-pair. The attribute-value is given between two scope (i.e. range of effect) of most declarations is ' or " characters and is assigned to its name by a limited to its enclosing environment or group. The preceding = character. The following example shows group delimiters are the { and } characters. The an empty ‘img’ element with a ‘src’ attribute of the placement of group delimiters and environment com- value ‘uri’: mands follows the rules of mathematical brackets. attribute The examples below show possible placements by z }| { example of a group and an ‘x’-environment: |{z} | {z } sequence { } \begin{x} \end{x} name value subordination { \begin{x} \end{x} } There is exactly one root element that includes syntax error { \begin{x} } \end{x} all the document’s content. The tag placement with- in the document follows the rules of mathematical 3 Markup correspondence brackets. The examples below show possible tag The following sections introduce a possible mapping placements by means of ‘a’ and ‘b’ elements: between HTML elements and LATEX commands in the order of the HTML module descriptions in [1]. sequence For an XSLT implementation transforming HTML subordination to TEXML, the following mapping tables show the syntax error A resulting LTEX commands for expository purposes.

Frederik R. N. Schlupkothen TUGboat, Volume 35 (2014), No. 1 85

3.1 Core Modules groups (div), and preformatted text (pre). Table 3 A The HTML Core Modules assemble the markup that shows the corresponding LTEX commands (using the A is common to all HTML dialects that are derived LTEX core package alltt for preformatted text). from module-based HTML. This core markup for Table 3: HTML to LATEX block mapping high level structures, basic text, hyperlinks, and lists HTML A of HTML documents and its corresponding LATEX LTEX commands are described in the following subsections.

h... i h... i \par

h... i 7→ italic 3.1.1 Structure Module
h... i \begin{quote}h... i The HTML Structure Module defines the high level
h... i h... i markup of a document. The html-element is the
h... i \begin{alltt}h... i document’s root containing the meta-information (head) and the actual content (body) of a document. Inlines The HTML Text Module defines markup LAT X follows a similar separation with its preamble E for text fragments. This includes abbreviations (abbr) and document-environment. Table 1 below shows and acronyms (acronym), citations (cite), quota- the corresponding commands. tions (q), and definitions (dfn), program code (code), Table 1: HTML to LATEX structure mapping sample output (samp), arguments (var), and input (kbd), regular (em) and strong (strong) emphases, HTML A LTEX generic fragments (span), and forced line breaks (br). h... i \documentclass{report}h... i A Table 4 below shows the corresponding LTEX com- h... i h... i mands (using the glossaries package for abbreviations h... i \title{h... i} and acronyms, the csquotes package for quotations, <body>h... i \begin{document}h... i and the listings package for code). Table 4: HTML to LATEX inline mapping 3.1.2 Text Module HTML LATEX The HTML Text Module defines the basic text mark- <abbr>h... i \acrshort{h... i} up to describe heading, block, and inline elements. <acronym>h... i \ac{h... i} Most of these elements have equivalent commands <cite>h... i \cite{hgen-idi} A 7→ in LTEX, but not all. In these cases the ‘ ’ sym- \bibitem{hgen-idi}h... i bol indicates the default formatting in HTML where <q>h... i \enquote{h... i} the Presentation Module described in section 3.2.1 <dfn>h... i 7→ italic might be used for an alternative, not corresponding semantically, mapping. <code>h... i \lstinline|h... i| <samp>h... i 7→ teletype Headings The HTML Text Module defines six lev- <var>h... i 7→ italic els of headings (h1 to h6). LAT X offers a specific E <kbd>h... i 7→ teletype heading hierarchy that depends on the given docu- ment class. Table 2 below shows the corresponding <em>h... i \emph{h... i} heading commands for the report document class. <strong>h... i 7→ bold <span>h... i h... i Table 2: HTML to LAT X heading mapping E <br /> \newline HTML LATEX <h1> ... \chapter{ ... } h i h i 3.1.3 Hypertext Module <h2>h... i \section{h... i} The HTML Hypertext Module defines markup to <h3>h... i \subsection{h... i} describe hyperlinks. They are described by source <h4>h... i \subsubsection{h... i} anchors (a) that reference to contents inside or out- <h5>h... i \paragraph{h... i} side of the document via Unified Resource Identi- <h6>h... i \subparagraph{h... i} fiers (URIs). Referenceable document fragments are marked by common ‘id’ attributes that can be ap- Blocks The HTML Text Module defines elements plied to all elements. The use of traversable hy- to mark text groups as paragraphs (p), contact infor- perlinks is an adequate solution in the context of mation (address), quotations (blockquote), generic electronic documents; its mapping to corresponding</p><p>HTML to LATEX transformation 86 TUGboat, Volume 35 (2014), No. 1</p><p>A LATEX commands by means of the hyperref package is Table 7: HTML to LTEX presentation mapping shown in Table 5. However, in the context of printed HTML LAT X matter a solution with references by e.g. visual key E ...... or page numbers might be more appropriate. <b>h i \textbf{h i} <i>h... i \textit{h... i} Table 5: HTML to LAT X hypertext mapping E <tt>h... i \texttt{h... i} HTML LATEX <sup>h... i \textsuperscript{h... i} <a href="hurii">h... i \href{hurii}{h... i} <sub>h... i \textsubscript{h... i} <a href="hidi">h... i \hyperref[hidi]{h... i} <big>h... i {\larger h... 
i} id="hidi" \label{hidi} <small>h... i {\smaller h... i} <hr /> \hrulefill 3.1.4 List Module (del) or inserted (ins). The changes package of- The HTML List Module defines markup to describe fers semantically corresponding LATEX commands as ordered (ol) and unordered (ul) lists as sequences of shown in Table 8 below. However, if LATEX is used list items (li) and furthermore markup to describe as final output format only, a more stable solution definition lists (dl) that are composed of sequences might be to simply output contents of ‘ins’-elements, of term (dt) and description (dd) pairs. Table 6 but not those of ‘del’-elements. below shows corresponding LATEX commands. Table 8: HTML to LATEX edit mapping Table 6: HTML to LATEX list mapping HTML LATEX HTML A LTEX <del>h... i \deleted{h... i} <ol>h... i \begin{enumerate}h... i <ins>h... i \added{h... i} <ul>h... i \begin{itemize}h... i <li>h... i \item h... i 3.2.3 Bi-directional Text Module <dl>h... i \begin{description}h... i <dt>h... i \item[h... i] The HTML Bi-directional Text Module defines mark- <dd>h... i h... i\\ up to declare text direction changes. It provides an attribute to control the direction of text (dir) that can be applied to all elements including a spe- 3.2 Text Extension Modules cial element (bdo) to override the current text direc- The HTML Text Extension Modules assemble addi- tion. The bidi package offers corresponding LATEX tional text markup to control text rendering, main- commands. Table 9 below shows the corresponding tenance, and direction for HTML documents. These commands for inline text. However, the bidi pack- and the corresponding LATEX commands are described age defines a set of new environments which replace in the following subsections. common LATEX commands (e.g. lists and footnotes) which makes the general mapping between elements 3.2.1 Presentation Module and commands more complex. Furthermore the com- The HTML Presentation Module defines markup to bination with other common packages (e.g. hyperref control the text rendering. It provides elements or longtable) remains problematic. So a more stable to render text in/as bold (b) and italic (i) style, solution might be to omit bi-directional text controls typewriter (tt), super- (sup) or subscripted (sub), during the transformation process and to apply such larger (big) or smaller font (small). Additionally changes manually in the LATEX document. the module provides an element to render horizontal A Table 9: HTML to LTEX bidi mapping rules (hr). LATEX offers corresponding commands with the exception of ‘textsubscript’ that relies on HTML LATEX the subscript package. The relsize package offers <bdo dir="ltr">h... i \LR{h... i} commands to realize relative font sizes (as intended <bdo dir="rtl">h... i \RL{h... i} by the ‘big’ and ‘small’ elements in HTML). Table 7 shows a possible mapping. 3.3 Forms Modules 3.2.2 Edit Module The HTML Forms Modules define markup to describe The HTML Edit Module defines editing-related mark- interactive forms that can define, organize, and re- up. It provides elements to mark content as deleted ceive (textual) input and selections. The hyperref</p><p>Frederik R. N. Schlupkothen TUGboat, Volume 35 (2014), No. 1 87</p><p> package implements most HTML form elements for table = {rowi | rowi = {cellij }} LAT X. 
As with hyperlinks, the use of interactive E grid = {sloti | sloti = hx, yi} forms is adequate for electronic documents; their procedure table(table) mapping by means of the hyperref package is shown grid ← {∅} in Table 10 below. However, in the context of printed for all rowi | i = 1 .. n do matter, an alternative solution as given e.g. by the y ← i formular package might be more suitable. x ← 1 A Table 10: HTML to LTEX forms mapping for all cellij | j = 1 .. m do while grid ∋ hx, yi do HTML LAT X E empty cell(x, y) <form>h... i \begin{Form}h... i x ← x + 1 <input /> \TextField{hlabeli} end while label type="password" \TextField[password]{h i} for ycell ← 0 .. rowspan(cellij ) − 1 do type="checkbox" \CheckBox{hlabeli} for xcell ← 0 .. colspan(cellij ) − 1 do type="button" \PushButton{hlabeli} grid ← grid ∪ h x + xcell , y + ycell i type="radio" \ChoiceMenu[radio]{hlabeli}{=} end for end for type="submit" \Submit{hlabeli} type="reset" \Reset{hlabeli} x ← x + colspan(cellij ) end for type="file" \TextField[fileselect]{hlabeli} end for type="hidden" \TextField[hidden]{hlabeli} end procedure type="image" \Submit[submitcoordinates]{himgi} <select>h... i \ChoiceMenu{hlabeli}{hoptionsi} Figure 2: HTML table cell positioning algorithm <option>h... i h... i <textarea>h... i \TextField[multiline]{hlabeli} engine automatically shifts the cells of the following <button>h... i \Submit{h... i} rows according to the reading direction. Hence for the transformation of HTML tables type="button" \PushButton{h... i} to LAT X this information (total number of table col- type="reset" \Reset{h... i} E umns and position of additional empty cells) need <fieldset>h... i h... i to be precalculated. Therefore the transformation <label>h... i h... i process has to include parts of the HTML table pro- <legend>h... i h... i cessing model described in [2]. This model describes <optgroup>h... i h... i an HTML table as a set of cells that are positioned on a two-dimensional grid of slots. The algorithm shown in Figure 2 calculates the cell positioning and illus- 3.4 Table Modules trates how the additional empty cells are inserted; The HTML Table Modules define markup to describe hence the total number of table columns is given tables (table) by organizing their data (td) and by the maximum x-coordinate within the final grid. header (th) cells in rows (tr). These rows can be Table 11 shows the mapping of HTML table elements grouped into table headers (thead), footers (tfoot), to corresponding LATEX commands by means of the and bodies (tbody). Column-based markup is real- longtable package. ized by standoff elements (col and colgroup). A table caption (caption) can provide a short descrip- 3.5 Image Module tion of the table contents. The HTML Image Module defines markup to em- LATEX table definitions differ in two essential bed external images. The graphicx package offers a aspects from HTML: (i) The total number of table corresponding LATEX command as shown in Table 12. columns has to be given explicitly to a LATEX table environment. This is not necessary in HTML but 3.6 Further Modules calculated continuously by the rendering engine at The HTML specification describes further modules processing time. 
(ii) LATEX table cells that span sev- that define markup to realize dynamic and interactive eral rows (by means of the multirow package) cover document content, mechanisms to control layout, and the adjacent cells in the following rows; therefore deprecated markup for backwards compatibility with empty cells need to be inserted in the following rows. legacy HTML. Due to the focus of this article on the This is not necessary in HTML but the rendering transfer of the logical structure of HTML documents</p><p>HTML to LATEX transformation 88 TUGboat, Volume 35 (2014), No. 1</p><p>A Table 11: HTML to LTEX table mapping coding of this page using HTML, and Figure 5 its A A corresponding representation in LTEX. HTML LTEX <caption>h... i \caption{h... i} <table>h... i \begin{longtable}{hcoli}h... i <td>h... i h... i& <th>h... i \bf{}h... i& colspan="hspani" \multicolumn{hspani}{l}{h... i} rowspan="hspani" \multirow{hspani}{*}{h... i} <tr>h... i h... i\\ <col /> <colgroup>h... i <tbody>h... i h... i <thead>h... i h... i\endhead <tfoot>h... i h... i\endfoot</p><p>Table 12: HTML to LATEX image mapping</p><p>HTML LATEX <img src="hurii"/> \includegraphics{hurii}</p><p> to LATEX, the mapping of these specialized modules is not described in detail. However, in specific use cases the support of these modules might be desired. The following hints might serve as a starting point to implement a transformation of these modules’ features to LATEX. The HTML Applet, Object, Scripting, and In- trinsic Events modules define markup that intro- Figure 3: An example document, derived from [5] duces scripting facilities to manipulate dynamically the document content. At present this is notably realized through the JavaScript programming lan- 5 Conclusion guage, which is partially integrated with LAT X by E While HTML is increasingly becoming the common means of the insdljs package. document description language for different output The HTML Client- and Server-side Image Map media (web, print, e-books, ...), the problem of modules define markup for interactive and hyper- creating well-typeset documents from HTML is not linked images. This functionality can be potentially yet fully solved within the XML ecosystem. The realized in LAT X by means of the TikZ package. E article at hand has introduced a mapping from HTML The HTML Frames and Iframe modules define elements to corresponding LAT X commands, in order markup to insert one document into another. This E to use the T X typesetting engine for this task. can be realized in LAT X with the \input and/or E E With the multitude of existing LAT X extensions \include commands. E released as packages, almost any HTML description The HTML Style Sheet and Style Attribute mod- can be ported to LAT X and typeset according to its ules define markup to integrate layout definitions E original logic. Unfortunately, the use of LAT X pack- realized through Cascading Style Sheets (CSS). CSS E ages often comes with a catch: while many HTML has its own syntax and description logic — its trans- structures can be used recursively (e.g. nested lists formation to LAT X is a topic all its own, which has E or tables), LAT X packages tend to override existing been outlined e.g. in [15]. E commands giving them a new meaning (e.g. the new- line command is redefined in table environments to 4 An example end a row). These context-dependent syntax-changes Figure 3 shows an example page taken from a bird can make a mapping potentially error-prone for deep guide. 
On the next page, Figure 4 shows a possible document structures.</p><p>Frederik R. N. Schlupkothen TUGboat, Volume 35 (2014), No. 1 89</p><p><?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head><title>Gannet  

Gannet

Birds of the open ocean, Gannets breed on small islands off the NW coast of Europe. They move away from land after nesting to winter at sea. The young migrate south as far as W Africa. Gannets feed on fish by plunge-diving from 25m. They nest in large, noisy colonies. The nest is a pile of seaweed. A single egg is incubated for 44 days. The young bird is fed by both parents and flies after 90 days.

Gannet
SizeLarger than any gull
AdultWhite, black wing-tips, yellow nape
JuvenileGrey, gradually becoming white over 5 years
BillDagger-like
In flightCigar-shaped with long, narrow, black-tipped wings
VoiceUsually silent, growling urr when nesting
LookalikesSkuas, Gulls and Terns

Figure 4: HTML source, describing the document shown in Figure 3

\documentclass{report} % preamble ...

\begin{document} \chapter{Gannet}

Birds of the open ocean, Gannets breed on small islands off the \acrshort{NW}coast of Europe. They move away from land after nesting to winter at sea. The young migrate south as far as \acrshort{W} Africa. Gannets feed on fish by plunge-diving from 25\acrshort{m}. They nest in large, noisy colonies. The nest is a pile of seaweed. A single egg is incubated for 44 days. The young bird is fed by both parents and flies after 90 days.\par

\includegraphics{gannet.jpg}

\begin{longtable}{ll} \toprule \bf{}Size & Larger than any gull \\ \bf{}Adult & White, black wing-tips, yellow nape \\ \bf{}Juvenile & Grey, gradually becoming white over 5 years \\ \bf{}Bill & Dagger-like \\ \bf{}In flight & Cigar-shaped with long, narrow, black-tipped wings \\ \bf{}Voice & Usually silent, growling \enquote{urr} when nesting \\ \bf{}Lookalikes & Skuas, Gulls and Terns \\ \bottomrule \end{longtable}

\end{document}

Figure 5:LATEX output, exported from HTML shown in Figure 4

HTML to LATEX transformation 90 TUGboat, Volume 35 (2014), No. 1

The mappings introduced in this article have [6] Masayasu Ishikawa and Shane McCarron. been developed in the context of an XSLT implemen- XHTML 1.1 — module-based XHTML — tation within a TEXML-based workflow, but do not second edition. W3C recommendation, W3C, rely on it and can be implemented through other November 2010. http://www.w3.org/TR/ approaches as well. However, the principle of TEXML, 2010/REC-xhtml11-20101123/. to provide a processor that transforms specific TEX [7] Ian Jacobs, David Raggett, and Arnaud commands from a generic XML representation to the Le Hors. HTML 4.01 specification. TEX format, realizes a separation between the task W3C recommendation, W3C, December of format transformation and the task of defining 1999. http://www.w3.org/TR/1999/ appropriate mappings. This facilitates the definition REC-html401-19991224/. and adaption of markup correspondences as has e.g. been done by extending the HTML mapping with a [8] Michael Kay. XSL transformations (XSLT) third party stylesheet that defines the transformation version 2.0. W3C recommendation, W3C, from MathML to LATEX. January 2007. http://www.w3.org/TR/2007/ REC-xslt20-20070123/. Acknowledgment [9] Sanders Kleinfeld. The case for authoring and The author would like to thank Prof. Dr. Karl-Heinrich producing books in (X)HTML5. In Proceedings Schmidt for giving valuable advice and Gilles B¨ulow of Balisage: The Markup Conference 2013, for integrating this document processing workflow volume 10 of Balisage Series on Markup into our daily tasks for testing purposes. Technologies, Montr´eal, August 2013. References [10] Donald Ervin Knuth. The TEXbook, volume A of Computers & Typesetting. Addison-Wesley, [1] Daniel Austin, Shane McCarron, Subramanian March 1986. Peruvemba, Masayasu Ishikawa, and Mark Birbeck. XHTML modularization 1.1 — [11] Leslie Lamport. LATEX: A Document second edition. W3C recommendation, W3C, Preparation System. Addison-Wesley, nd July 2010. http://www.w3.org/TR/2010/ 2 edition, November 1994. REC-xhtml-modularization-20100729/. [12] Douglas Lovell. TEXML: Typesetting XML [2] Robin Berjon, Steve Faulkner, Travis with TEX. TUGboat, 20(3):176–183, September Leithead, Erika Doyle Navara, Edward 1999. http://tug.org/TUGboat/tb20-3/ O’Connor, Silvia Pfeiffer, and Ian Hickson. tb64love.pdf. HTML 5. W3C candidate recommendation, [13] Shane McCarron. XHTML-print — second W3C, August 2013. http://www.w3.org/TR/ edition. W3C recommendation, W3C, 2013/CR-html5-20130806/. November 2010. http://www.w3.org/TR/ [3] Tim Bray, Fran¸cois Yergeau, C. M. 2010/REC-xhtml-print-20101123/. Sperberg-McQueen, Jean Paoli, and Eve [14] Oleg Parashchenko. TEXML: Resurrecting Maler. Extensible markup language (XML) 1.0 TEX in the XML world. TUGboat, 28(1):5–10, (fourth edition). W3C recommendation, W3C, March 2007. http://tug.org/TUGboat/ August 2006. http://www.w3.org/TR/2006/ tb28-1/tb88parashchenko.pdf. REC-xml-20060816/. [15] S. Sankar, S. Mahalakshmi, and [4] Markus Gylling, William McCoy, Elika J. L. Ganesh. An XML model of CSS3 as Etemad, and Matt Garrish. EPUB content an XLATEX-TEXML-HTML5 stylesheet documents 3.0. IDPF recommended language. TUGboat, 32(3):281–284, December specification, IDPF, October 2011. 2011. http://tug.org/TUGboat/tb32-3/ http://www.idpf.org/epub/30/spec/ tb102sankar.pdf. epub30-contentdocs-20111011.html.

[5] Renate Henschel, John Bateman, and Judy ⋄ Frederik R. N. Schlupkothen Delin. Automatic genre-driven layout University of Wuppertal generation. In Proceedings of the 6th Rainer-Gruenter-Str. 21 “Konferenz zur Verarbeitung nat¨urlicher D-42119 Wuppertal Sprache” (KONVENS) Conference, Germany Saarbr¨ucken, September 2002. schlupko (at) uni-wuppertal dot de

Frederik R. N. Schlupkothen TUGboat, Volume 35 (2014), No. 1 91

Scientific documents written by novice researchers: A personal experience in Latin America

Ludger O. Suarez–Burgoa

Abstract

This article presents 20 years of the author's personal experience — described as a particular Latin American experience — about the elaboration of scientific documents created by novice researchers, from the use of the typewriter to prepare a school scientific report up to the conception of a TEX-family class file (i.e. the unbDscThesisEng class of the Universidade de Brasília Doctoral Thesis, English Version) to prepare his theses. He also gives opinions about a possible Universal Editable Format for scientific documents made by novice researchers, to allow such documents to persist over time without losing information due to changing encoding formats of proprietary software.

1 Introduction

Processing scientific documentation is an essential part of publishing results. It is a concern in scientific institutions and academia, because it requires text, tables, equations, figures, and references in a relational, structured and compact manner; and all these in conjunction reflect the quality of the message desired to be presented.

Nowadays, researchers individually or in small groups (called in this article novice researchers, NRs) have also become actors in this important process, because they can disseminate their results in various effective media now available: indexed journals, both printed and electronic, conference articles, Internet links, and blogs, for example.

With the general availability of free and open source software (FOS) and with increasingly abundant information on the Internet with details and recommendations for proper use of these FOS tools, NRs can now present fully-developed, high-quality, and well-formatted scientific documents. They can be part of the development of scientific documentation by creating support for particular institutions (e.g. TEX-family class files for university theses).

In the following sections the particular experience of an NR of Latin America will be discussed, who has prepared scientific documents and technical reports for around 20 years in Bolivia, Colombia, Brazil, Argentina, Perú, and Chile; and who now feels comfortable using: TEX for text, table, and equation creation, editing and management; SVG to create and edit graphics and plots; and BibTEX to store and manage references of scientific documents.

2 Some particular past experiences

Most of the Latin American NRs who nowadays are writing doctoral theses have suffered in the transition from the typewriter to computer software when dealing with scientific and technical documents.

For example, for a technical note at a school science fair, it was common to see in the 1980s the text and some lines of the document prepared with a mechanical typewriter (e.g. an Olympia AG coming from Wilhelmshaven, Germany, with a two-color ink tape: black and red), and graphics, sketches and formulas executed by hand. In special cases, one might have the privilege of using an electronic typewriter (e.g. a Brother CE-30, made in Taiwan), which offered the possibility of correcting some mistakes before typing on the paper. At this stage, one used photocopies for mass reproduction; therefore, graphics should be conceived only in black and white (B&W) and should be drawn on good paper with ink. The artistic part of this was to use different widths and types for lines, different textures for fills, and different font sizes. Figure 1 shows a part of the author's first scientific document, made in 1990 [6]. Observe that the black color of the letters is not uniform, being dependent on the force with which one's finger triggered the letter key and the ink tape quality or usage. Also observe the equation with superscripts, subscripts, and Greek letters done by hand; the graphic was also done by hand.

Figure 1: Part of a document presented at a science fair, done with typewriter and by hand [6]

A few novice researchers had first contact with a computer word processor in the late 1980s, e.g. WordStar 4.0 under the CP/M operating system. But, in the early 1990s — as the present author was finishing secondary school and in the initial stages of a university bachelor's course — some NRs had the opportunity to use friendly word processors on personal computers (e.g. Word Perfect 5.1 under the Disk Operating System (DOS) version 4.0).


Figure 2: Part of the document presented to another science fair, done with Word Perfect 5.1 under DOS 4.0 and by hand [7]

The problems of this system (both hardware and software) were that in the region only English-language keyboards were available, and the text processor was initially available only for English. Therefore, in order to use the system in the Spanish language, the ñ and vowels with accent (e.g. á, ó, í) had to be introduced by a combination of the Alt key plus the appropriate three numbers taken from the American Standard Code for Information Interchange (ASCII). So, it was common to see a laminated cheat sheet above the function keys on a keyboard.

Formulas and sketches were normally a combined use of the word processor application tools — which were not good for complex sketches and formulas — and handwriting; and graphics could be prepared with other software (e.g. Harvard Graphics, which was not the proper tool, but it did manipulate vector graphics) or by hand. At this point, one began to use dot matrix printers, but they were very slow. For example, in order to print a 32-page document on a Wang Labs PM 016/160 dot matrix printer, one consumed five hours listening to the particular sound of those printers and taking care that no paper problems arose. Later, this printer type became faster in the region, e.g. the EPSON LQ-300.

Figure 2 shows an example from a document created in 1992 in the environment described above. Observe that even though the document was made with a word processor, the quality remains close to the document prepared two years earlier (Figure 1) because it was hardware-dependent (i.e. on the printer type, which was a dot matrix printer that also uses an ink tape).

Also, observe in Figure 2 that some special characters in the Spanish language — for example the í letter — are printed with a different quality (e.g. the last but one word partícula of the paragraph after the equations). Also, the equations

  ${}^{238}_{92}\mathrm{U} \rightarrow {}^{234}_{90}\mathrm{Th} + \alpha$   (1a)
  ${}^{238}_{92}\mathrm{U} \rightarrow {}^{234}_{90}\mathrm{Th} + {}^{4}_{2}\mathrm{He}$   (1b)

and in-text variable symbols (e.g. ...un positrón (+β) se transmuta...) still needed human intervention.

At the final stages of the university bachelor course, at the end of the 1990s, things went better: the Windows 3.1 (or higher) Operating System (OS) in Spanish came with the not-well-known — at that time — MS Word text processor, and now keyboards were available for Spanish. Also, this text processor included programs to prepare sketches and formulas, which were more flexible, allowed more complex cases, and, best of all: they supported full color. Also, because color bubble jet printers were accessible to an NR, one started to conceive of full-color figures in scientific and technical documents.

Figure 3: Part of the BSc thesis, done with MS Word under Windows 3.1 and graphics with CorelDraw 9 exported to a medium- or low-resolution raster JPG format [8]

Figure 3 shows the full-color graphics (grayscaled for TUGboat hardcopy, though) made in a vector program, which decreased in quality when exporting to a raster format; but this was necessary because the text processor did not import properly and exactly the given graphic, even when trying to use its own EPS vector format. The equation editor of the word processor allowed these simple equations to be executed well.
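For comparison with the hand-written and equation-editor workflows just described, the same two decay equations are short to write in (LA)TEX source. The following lines are only an illustration of the notation (plain LaTeX mathematics); they are not taken from any of the documents shown in the figures:

  \[ {}^{238}_{92}\mathrm{U} \rightarrow {}^{234}_{90}\mathrm{Th} + \alpha \]
  \[ {}^{238}_{92}\mathrm{U} \rightarrow {}^{234}_{90}\mathrm{Th} + {}^{4}_{2}\mathrm{He} \]

An in-text symbol such as (+β) is likewise typed as $(+\beta)$.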


For the first decade of this new century, use of these tools has been accepted by some scientific researchers and most industrial professionals, with the difference being that there was more diversity of fonts, document templates, and printing resources. This happened more due to hardware improvements (rather than improvements in the commercial software), which now allowed storing huge amounts of data in memory during editing; therefore, one could insert in a What You See Is What You Get (WYSIWYG) document file — for example — high-resolution raster figures and be unlikely to suffer a program crash and consequent file corruption.

Nowadays, the quality of hardcopy documents has been improved, because laser color printers are more accessible for NRs than in the past; therefore, more color texts have been produced and they can also be reproduced economically in small quantities.

For example, most Master's degree dissertations in civil engineering in Latin American universities have been prepared with the combination of proprietary WYSIWYG software (i.e. word processor, spreadsheet, and vector graphics editor), known as office programs. Figure 4 shows part of an MSc thesis prepared with these programs: reference citations, reference lists, and lists of variables, abbreviations, and acronyms were handled manually; on the other hand, figure and table referencing, and equation numbering, were created automatically.

Figure 4: Part of an MSc thesis, done with MS Office under Windows XP and graphics with CorelDraw X4 exported in high-resolution raster (JPG) format [9]

But the biggest drawback of using WYSIWYG programs was that users did not learn about document structuring. They used a text processor as a simple electronic typewriter with improved graphical capabilities. This situation is becoming less common in recent years; now, users are considering the importance of hierarchically structuring documents, with the goals of facilitating exporting to XML and of disseminating documents via the Internet.

2.1 Digital Dark Age also touches NRs

Past documentation stored in an encrypted coding and unable nowadays to be reopened, because there is no access to the hardware and software with which it was created, has suffered a permanent loss — even when the stored data is well preserved. If this situation occurs in a period between the time when the first electronic document was stored without preventing this situation, to the time when the last electronic document was stored before this situation is solved, then that period so determined can be considered a Digital Dark Age (DDA), because nobody in the future will be able to know what humans had documented in that period. Because humans have not in fact resolved this situation, we are now inside a DDA.

On a small scale and for particular usage, NRs are also suffering the consequences of DDA. For example, some plots of the scientific document shown in Figure 2 [7] were made using proprietary software (Harvard Graphics, as mentioned above), which stored the files with the extension CHT. To date, no emulator can visualize these graphics, even though the file is stored perfectly safely on a hard disk.

NRs being in a DDA also causes the loss of many of their old documents, which are frequently more vulnerable than most others. This is because most documents developed by NRs are not of global interest; therefore, those documents are not stored in data centers but instead on a home system. In 20 years of managing personal scientific documentation, the present author — an NR — has suffered two important losses of documentation: first in the year 2000, when the entire computer was stolen; the other in 2007, when the author's first old PC computer and the programs' floppy disks — preserved in order to minimize personal DDA — were erroneously sold by the author's father.

Nowadays the situation is becoming less problematic for NRs, since large external companies offer well-implemented digital data storage alternatives (e.g. Dropbox).


2.2 The experience with the TEX family

TEX is a computer typography program, free and open source in today's terminology, created in 1978 by Donald E. Knuth of Stanford University, which has revolutionized digital typesetting for scientific publishing and transformed the process of putting mathematical ideas on paper [2, 3].

Since then, many improvements and proposals have followed from it, creating the so-called TEX family (also denoted (LA)TEX). In general, this family has shown in its time that it was designed with two main goals in mind: to allow anybody to produce high quality documents; and to provide a system that would give exactly the same results on all computers, now and in the future [4].

One advantage of (LA)TEX, among many others, is that one can use escape sequences for characters beyond basic ASCII; therefore, one can cover all possible special characters with only the 128 characters defined in the OS. Even though this limit was superseded many years ago with improved OSs, the use of only ASCII characters can still be very useful when one wants to put document information in structured databases: for a word-searching process, it is better to use the least possible number of different characters, as a broader range can easily introduce more errors. (A few such escape sequences are shown at the end of this section.)

The personal experience of the author with TEX began in 1996, when a friend seriously illuminated the benefits of the new "free" OS Linux — erroneously conceived by the author as free of charge — because the text processor software it offered was LATEX-based. In those years, another friend who came from Switzerland to visit his native country (i.e. Bolivia) gave the author three or five 3½ inch floppy disks with a program that dealt well with scientific documents (i.e. Scientific Workplace under the Windows 3.0 OS). The present author tried to make use of this, but because tutorials were lacking, finally he abandoned the program.

Unfortunately, the author ignored this clever advice, and 14 years passed before he recognized that the pair GNU/Linux and TEX had an important ideology behind them (see for example [5]) and improved advantages for NRs.
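As a small illustration of the escape sequences mentioned above, the special Spanish characters discussed in Section 2 can be written in LaTeX using only basic ASCII; these are standard LaTeX accent commands (with a modern engine, or with the inputenc package, the accented letters can of course also be typed directly):

  \'a  \'o  \'{\i}  \~n          % a', o', dotless i with acute, n with tilde
  part\'{\i}cula                 % particula with accented i
  positr\'on y espa\~nol         % positron and espanol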


documentation. The interchange of documents nowa- 10

days in this region is handled well via this format. 8 3 And, since 2008 this format has become even more 1 2 6 R Dis. 2 popular, after it was liberated by Adobe, made an D Q Dis.2 Rock Material ISO standard and in general available for any individ- Dis. 3 4 ual to make, use, sell and distribute PDF-compliant Dis.3 Dis. 1 2 FOS MPa in Stress Shear implementations. Since then, software has 2 been developed with considerable added value, for Dis.1 A C example, PDF editing and electronic signatures. 0 -5 0 5 10 15 The PDF format can be a good option to pre- 3.38 4.35 14.5 serve a document as it was conceived by its author(s) Normal Stress in MPa and for electronic libraries (i.e. static preservation); Figure 5: A graphic from the PhD thesis, created but perhaps is not the proper one to be adjustable to with Inkscape 0.47 under i486-PC-Linux-GNU (Debian the flexibility of web pages, and for the more volatile Squeeze 4.4.5-8) and exported to PDF format [10] and interactive areas of the web which require dy- namic document preservation and/or continuing edit- ing, as for example wiki pages (e.g. Wikipedia) [2]. One principal limitation found by NRs working Also, browsers cannot display PostScript or PDF files this way was that they became dependent on propri- without the aid of extra software (i.e. add-ons, many etary file formats, which were — and still are — diffi- of which are proprietary), which in turn causes many cult to convert to a hypothetical universal editable readers to choose to download such files and view format for 2D and 3D vector graphics. This might be them offline or print them. the reason why some CAD proprietary file formats In the near future — not to say in the present — became so important for interchange of graphics (e.g. portable devices (e.g. tablets, cell phones, smart the AutoCAD DWG format); at the same time, they phones) will be more popular, and programs in the are a poor choice for interchange, because the struc- cloud will be used more; therefore, it is possibly ture of the DWG format regularly changes with new necessary to adopt another format for document releases, and has never been made publicly available. interchange for this new technological tendency. This situation continues today, but with fewer With various scripts or programs, documents devotees. Nowadays, NRs more often use less com- created by TEX can be transformed to XML which plicated and costly proprietary software, or have can be a first choice candidate for an NR to have decided on well-developed FOS alternatives. a good interchange document format meeting such With the advent of the new century, a new vec- Internet-related requirements. tor graphics specification was developed for public use: Scalable Vector Graphics (SVG), based on XML, 3.2 The case of graphics which nowadays may be considered a universal vec- In the 1990s, one typically used so-called vector tor editable format, being widely supported on the graphic editor programs (e.g. CorelDraw and Adobe Internet and in applications. Unfortunately, SVG Illustrator for Windows, which were and are com- still supports only 2D graphics, but efforts are being mercial programs) to execute figures for documents. made to cover 3D as well in the future. Unfortunately, in the early years of the Internet, SVG figures can be created and edited with any information about the existence of FOS programs text editor, but it is often more convenient to create was lacking. 
Therefore, piracy of the most famous and edit them with a natural human interface for graphic programs was the common “solution” for graphics. In the area of FOS software, one interesting NRs in Latin America during those years. program that deals with this format is the excellent Other NRs erroneously used other sorts of propri- Inkscape, which is available for all major OSs. Figure etary programs to make graphics for their documents, 5 shows a scientific figure developed entirely with for example presentation programs (e.g. PowerPoint) this program using the SVG format; observe that and two-dimensional computer aided design (CAD) TEX equations are included. programs (e.g. AutoCAD). This was typically be- A cause they had no choice other than to use the soft- 3.3 (L )TEX classes as document managers ware they had available, legal or otherwise. Returning to discussion of the TEX family, for many A 2 years the class concept has been used for (L )T X- But this donation by Adobe did not follow the free (as in E freedom) or open software concepts, because the source code based documentation, by creating a class file (CLS was not liberated. extension by default). Although most commonly
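One convenient way to combine Inkscape drawings with (LA)TEX text, available in Inkscape releases somewhat newer than the 0.47 version mentioned above, is the "PDF + LaTeX" export: Inkscape writes a graphics file plus a small wrapper file with the extension .pdf_tex, in which all the text, including mathematics, is typeset by LaTeX when the document is compiled. A minimal sketch of its use follows; the file name and caption are illustrative only, and it is assumed that graphicx and xcolor are loaded in the preamble:

  \begin{figure}
    \centering
    \def\svgwidth{0.8\columnwidth}   % width at which the exported drawing is placed
    \input{FIGURES/envelope.pdf_tex} % wrapper produced by Inkscape's PDF+LaTeX export
    \caption{Strength envelope drawn in Inkscape; text typeset by \LaTeX.}
  \end{figure}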


3.3 (LA)TEX classes as document managers

Returning to discussion of the TEX family, for many years the class concept has been used for (LA)TEX-based documentation, by creating a class file (CLS extension by default). Although most commonly known for LATEX, the concept can be applied anywhere in the TEX family.

This file — if properly designed — permits an NR to create a document structure and formatting rules once (i.e. a document template), and then re-use this many times. This makes it possible to create uniform documents with respect to format and structuring, broadly used for book collections, journal articles, institutional reports, university dissertations and theses, and by extension any serialized document.

A CLS file is used under the TEX family concepts, formats and engines. Thus, its use can result in high quality document creation and a suitable document manager, since in a class file one can, for example:
• restrict the document formatting and the document structure;
• define the text and mathematics font styles;
• define table formatting;
• limit the number of chapters and appendixes;
• use a specific format for bibliographic references;
• define proper format for the frontmatter, main text, and backmatter; and
• define the page size, margins, headings and footers, and numbering.

The elaboration of a CLS file for (LA)TEX is not terribly difficult, but nor is it an easy task; by personal experience, it can be done by people who have used (LA)TEX for at least one year, with the help of the plethora of information on the Internet (a situation that was not possible 15 years ago). On the other hand, the use of an existing CLS file within (LA)TEX is a very easy procedure, which any NR beginner can do.

Therefore, as this is an excellent tool, the creation of a CLS file for any commonly used document at public institutions can be useful for others. For this reason, over the last few years (perhaps five years), the availability of CLS files for universities' documentation (i.e. dissertations and theses) has been increasing.

An observation about this last statement: such dissemination is not usually driven by the libraries or other university offices; instead, it usually comes from the student community, with or without the aid of a professor (i.e. an NR). Regarding this, it is the mathematical departments of the universities that have typically promoted the use of (LA)TEX and who have created and disseminated CLS files.

Following this last-mentioned tendency, the following section briefly presents how a CLS script file can be structured and the minimum environments it might provide. This CLS file was developed by the author for his doctoral thesis, and his general conclusion — among others mentioned in the next section — that emerged when working with the TEX family was that finally, after 20 years, he has found the proper tool for scientific documentation.

4 The unbDscThesisEng TEX family class

The unbDscThesisEng TEX family class is an unofficial template (i.e. not approved by the university) for the English language version of theses of the Geotechnical Postgraduate Program of the Universidade de Brasília, Brazil.³ It is composed of a main CLS script file which requires other secondary files and a particular directory structure.

³ This class is at Version v1.0 as of 2012/15/08.

This class was based on the book class with modifications to accomplish local formatting requirements. It could also be properly used for the Portuguese language by doing some modifications not yet included in this version. The CLS file is free software that is released under the terms of the GNU General Public License Version 3, as published by the Free Software Foundation.

The class permits the student to use the following environments:
• princover for making the main cover of the thesis by attaching a background.pdf file;
• maketitle, a redefinition of the original in the book class, that permits making the title page according to the university rules and format;
• approbationpage for the page with the jury members' names and signatures;
• catalogingpage for the page of the thesis cataloging and copyright information;
• dedicatory for the dedicatory text;
• acknowledgements and agradecimentos for the acknowledgments in English and Portuguese;
• abstract and resumo for the abstract in English and Portuguese;
• tableofcontents for the TOC;
• listoffigures and listoftables for the lists of figures and tables;
• listofabbrevsymbs for the list of abbreviations and symbols (separated by Latin and Greek characters);
• listofreferences for the references after the main matter;
• invitationpage for a page with the date and time of the thesis presentation;
• reportpage for a page with general report information of the document.
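Before turning to how the class is used, here is a rough, hypothetical sketch of how a thesis class of this kind can be built on top of the standard book class, as described in Section 3.3 and at the start of this section. The class and environment names below are purely illustrative; this is not the actual unbDscThesisEng code:

  % myunivthesis.cls -- illustrative skeleton only
  \NeedsTeXFormat{LaTeX2e}
  \ProvidesClass{myunivthesis}[2014/04/01 v0.1 illustrative thesis class]
  \DeclareOption*{\PassOptionsToClass{\CurrentOption}{book}} % hand unknown options to book
  \ProcessOptions\relax
  \LoadClass[11pt]{book}            % built on the book class
  \RequirePackage{geometry}         % page size and margins
  \geometry{a4paper,margin=30mm}
  \RequirePackage{nomencl}          % list of variables and abbreviations
  % one extra environment, in the spirit of those listed above
  \newenvironment{dedicatory}
    {\cleardoublepage\thispagestyle{empty}\itshape\raggedleft}
    {\par}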

In order to use any of the environments, the unbDscThesisEng.cls file should reside in the document's main directory, and 48 variables are available for the user to fill. These variables are filled by the user in a separate TEX file (i.e. the initials.tex file), in order to avoid editing the CLS file.

Also, a main TEX file was designed (aaaThesis.tex), in which the user can: define the page size and its margins; define the text size, from 10 pt to 12 pt; define whether the document should be printed two-sided or single-sided; introduce other (LA)TEX packages; redefine some commands; and insert the document text. All the figures should be placed in a directory called FIGURES; and the front-, main-, and backmatter should be inside the directories FRONT_MTR, MAIN_MTR, and BACK_MTR, respectively. The bibliographic references should be written in BibTEX format and be named bibliography.bib. The bibliographic style used for this class is chicnarm.bst, recommended to reside in the same document main directory. (A schematic example of such a main file is shown at the end of this section.)

Because this class uses the nomenclature package to manage the list of variables, a nomencl.cfg file should also be in the document main directory.

The following listing shows the directory structure for the unbDscThesisEng class and the minimal required files. The user interface used in this example listing is the KDE Integrated LATEX Environment (KILE) under KDE Platform Version 4.4.5 for GNU/Linux; the unbDscThesisEng.kilepr file also defines the complete document structure.

  drwx BACK_MTR
  drwx FIGURES
  drwx FRONT_MTR
  drwx MAIN_MTR
  -rw- aaaThesis.nlo
  -rw- aaaThesis.nls
  -rw- aaaThesis.pdf
  -rw- aaaThesis.tex
  -rw- background.pdf
  -rw- bibliography.bib
  -rw- chicnarm.bst
  -rw- initials.tex
  -rw- invitationBackground.png
  -rw- nomencl.cfg
  -rw- nomencl.dtx
  -rw- nomencl.ins
  -rw- nomencl.ist
  -rw- unbDscThesisEng.cls
  -rw- unbDscThesisEng.kilepr

Experiences using this class for the doctoral thesis brought about improvements in: text formatting; correct use of SI units according to the proper rules; good mathematical symbolization of variables; robust index generation — especially with the variables index; and robust reference structuring and full hypertextual citations. All of these in general improved the document in quality and also reduced the time needed for its elaboration.

Experiences using the SVG format for graphics in the thesis brought about: homogeneous text formatting in all figures; retaining in the graphics the same mathematical variables defined in the documents (by using the LATEX equation rendering extension of Inkscape); easy editing independent from the text of the document; and high quality output, being vector graphics and not bitmaps.
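The following is a schematic reconstruction of how the main file described above might look, based only on the conventions just listed (the directory names, initials.tex, chicnarm.bst, and bibliography.bib). The chapter file names and the class options are hypothetical, and it is assumed that the class itself loads graphicx and nomencl; this is a sketch, not the author's actual file:

  % aaaThesis.tex -- schematic sketch only
  \documentclass[12pt,twoside]{unbDscThesisEng}
  \input{initials.tex}              % the user-filled variables, kept out of the CLS file
  \graphicspath{{FIGURES/}}         % all figures live in FIGURES/
  \makenomenclature                 % list of variables via the nomenclature package
  \begin{document}
    \include{FRONT_MTR/frontmatter} % hypothetical file names
    \include{MAIN_MTR/introduction}
    \include{BACK_MTR/appendixA}
    \bibliographystyle{chicnarm}    % chicnarm.bst in the main directory
    \bibliography{bibliography}     % references in bibliography.bib
  \end{document}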


5 Final comments

Comments given here are based on the author's personal experience in his region and language, having had the opportunity to deal with scientific and technical document creation for some 20 years, and knowing the popular text processors and typesetting programs.

In the historical narration of the author, it appears that milestones of that theme (i.e. use of computer typesetting, use of DOS, Windows OS, bubble and laser jet printers, and other issues mentioned in Section 2) in Latin America came later than in other regions; but this is not accurate. The true reason is that the abovementioned milestones came relatively late to NRs, who are ordinary people: university students and teachers. It is likely that these milestones came sooner to the region in more specialized areas, for example, high-level research institutions.

The TEX family tries to reduce Digital Dark Age (DDA) hazards for the NR. Preventing the DDA for FOS formats could be less costly and time-consuming than doing the same for proprietary formats. The latter requires signing agreements with the format owners, who try to market their work by praising it as social labor when preventing DDA, while in fact they are part of the DDA problem; proprietary formats are by nature against DDA prevention.

What seems to be true is that the correct way forward in scientific document preparation for NRs is through the FOS concepts and typesetting programs such as the TEX family. When mentioning this duality, FOS concepts & TEX family, it is inevitable to hear about the robust ideology they have behind them; and probably from this duality can emerge the proper UEF sought by the NR.

Perhaps the tendency of individuals in using the SVG format for graphics, and TEX for text, tables, equations and references, could mark a future tendency in using XML formats for scientific documents (e.g. DocBook). But TEX is evidently easier to learn than XML, and at any rate, there exist proper ways to translate TEX to XML.

Also, in the near future, it is possible that the ePub format — a structured, compressed, XML-based format broadly used for mobile devices — can be another possible interchange document format for the NR, rather than PDF. This can work because the Internet is a channel for distributing publications and preprints in many disciplines, as well as becoming a venue for less formal jottings and conversations using mobile technology.

At this moment, TEX has become the de facto standard text processing system in many academic high-level scientific and research institutions, while — in parallel — it is increasingly the choice of NRs in developing countries (e.g. Latin America). The example of usage of the unbDscThesisEng class is one of many found in the literature which improved scientific document conception, elaboration, and competitive dissemination, and which has a good opportunity to prevail for many years without being caught by DDA. But TEX family classes should be better promoted by universities, who should also define more specific and rigorous document elaboration policies.

Finally, digital storage is easy but digital preservation is not. Preservation means keeping the stored information cataloged, accessible, and usable on current systems, which requires constant effort and expense. One cannot reverse the digitization of everything; what one has to do is convert the design of software from brittle to resilient, from heedlessly headlong to responsible, and from time-corrupted to time-embracing [1].

References

[1] S. Brand. Escaping the digital dark age. Library Journal, 124(2):46–49, February 1999.
[2] B. Hayes. Writing math on the web. American Scientist, 97(2):98–102, March–April 2009.
[3] S. Ditlea. Rewriting the Bible in 0's and 1's. Technology Review, 102(5):66–70, September–October 1999.
[4] A. Gaudeul. Do open source developers respond to competition?: The (LA)TEX case study. Social Science Research Network, (908946), March 2006.
[5] L.E. Rosen. Open Source Licensing: Software Freedom and Intellectual Property Law. Prentice Hall, Upper Saddle River, NJ, 1st edition, 2004.
[6] L.O. Suarez-Burgoa. Estudio aerodinámico, fuerza de sustentación y resistencia en un avión. Technical report, Colegio Alemán Mariscal Braun, La Paz, Bolivia, October 1990.
[7] L.O. Suarez-Burgoa. Estudio nuclear de átomos radiactivos y efectos de la radiactividad. Technical report, Colegio Alemán Mariscal Braun, La Paz, Bolivia, October 1992.
[8] L.O. Suarez-Burgoa. Consideraciones para estabilizar taludes por medio de un sistema de tablestacado y enrejado vegetado. BSc thesis, Universidad Mayor de San Andrés, La Paz, Bolivia, June 2001.
[9] L.O. Suarez-Burgoa. Rock mass mechanical behavior assessment at the Porce III underground hydropower central, Colombia, South America. MSc thesis, Facultad de Minas, Universidad Nacional de Colombia, Medellín, Colombia, February 2008.
[10] L.O. Suarez-Burgoa. A qualitative physical modeling approach of rock mass strength. PhD thesis, Departamento de Engenharia Civil e Ambiental, Universidade de Brasília, Brasília DF, August 2012. Publication G.TD-079/12.

⋄ Ludger O. Suarez–Burgoa
Universidad Nacional de Colombia
Fac. Min., Esc. Ing. Civ.
Cl. 65 #78–28, Bl. M1 Of. 320
Medellín, Colombia
losuarezb (at) unal dot edu dot co
http://geomecanica.org/


Macro memories, 1964–2013 I assume most readers know what macros are, but just in case: Typically one gives a name to a string David Walden of text, e.g., “\define\Macroname{Textstring}”; Contents then each time “\Macroname” is found (“called”) in the input text stream, it is replaced by “Textstring”. 1 McIlroy’s 1960 ACM paper 99 The macro definition can involve the additional 2 Some prior history of macros 100 substitution of text specified when the macro defini- 3 Strachey’s General Purpose Macrogenerator 101 tion is called. For example, 4 Midas macro processor 101 \define\Name#1{His name is #1} 5 More study and use of macro processors defines a macro named “Name” where “#1” indicates (and language extension capabilities) 102 a parameter to be substituted for when the macro is 6 Midas, macros, and the called. The macro might be called with “John” as ARPANET IMP program 103 the substitution text, as in the following 7 Ratfor and Infomail 104 \Name{John} 8 TEX, macros for typesetting 104 which would result in 9 M4 106 His name is John 10 Reflections 107 In addition to the “#1” indicating where a substitu- Introduction tion is to take place in the macro definition when In the summer of 2013, I was looking at a 1973 listing the macro is called, in this example the “#1” imme- of the ARPANET IMP (Interface Message Processor) diately after the macro name indicates there is one program1 which makes extensive use of macros. This such substitutable parameter. If there were more caused me to muse about the various macro proces- than one such parameter, a list such as “#1#2#3” sors I have used over the past 50 years which, in turn, would appear after the macro name in the definition, led to this note. specifying three such parameters. This note is not a thorough study, extensive Donald Knuth in the index to Volume 1 of The Art of tutorial, or comprehensive bibliography about macro Computer Programming gives this succinct definition processors. Such descriptions have already been pro- relating to macros: “Macro instruction: Specification vided by, for instance, Peter Brown, Martin Campbell- of a pattern of instructions and/or pseudo-operators Kelly, John Metzner, and Peter Wegner in their that may be used repeatedly within a program.” longer presentations of the topic.2,3,4,5 Instead, the 1 McIlroy’s 1960 ACM paper macro technology thread I follow herein is guided by the order in which I used or studied the various macro I’m pretty sure that while I was still in college at San processors. I hope this is usefully representative of Francisco State (1962–64) and using an IBM 1620 the scope of macro processor technology. computer, I had no concept of macros. The IBM I have three reasons for writing this note. (1) I 7094 at MIT Lincoln Laboratory, my first employer haven’t seen much new written about macro pro- after college (starting in June 1964), may have had cessors in recent years (other than what is on the a macro assembly capability, but I don’t think I ever web); thus, it is perhaps time for a new paper on used it. this sometimes under-appreciated topic. (2) Com- Probably my first contact with the concept of puter professionals and computing historians who macros was when an older colleague at Lincoln Lab have come to their fields only in the last decade or gave me his back issues of the Communications of two may not know much about macros, and this the ACM (and I joined the ACM myself to get future is a chance to share my fondness for and perspec- issues of the CACM). 
In one of these back issues, tive on macros. The citations in the endnotes also I read the article by Doug McIlroy on macros for may be a useful starting point for further study of compiler languages.6 macros, and maybe these notes will rekindle mem- McIlroy has the following example of defining a ories for other long-time computing people like me macro (although I am using an equal sign where he about some of their own experiences. (3) For (LA)TEX used an identity symbol): users who may not be computer programmers or fa- ADD, A, B, C = FETCH, A miliar with other macro processor systems and who ADD, B accomplish impressive things using TEX macros, this STORE, C 7 note sketches the long technical history of which TEX where ADD is the macro name, and A, B, and C are macros are a part. the names of macro arguments to be filled in at the


time of the macro call, and the three lines of code Apparently Bell Labs was a particular hotbed are what is substituted. Thus, McIlroy shows this of macro activity in those early days. In a memorial macro being called with the sequence note for Douglas Eastwood,8 Doug McIlroy recounts: ADD, X, Y, Z On joining the Bell Labs math department, I resulting in the following: was given an office next to Doug Eastwood’s. FETCH, X Soon after, George Mealy . . . suggested to a ADD, Y small group of us that a macro-instruction STORE, Z facility be added to our assembler . . . This idea caught the fancy of us two Dougs, and This style of macro definition uses symbolic names set the course of our research for some time for the substitutable parameters, which can be useful to come. We split the job in half: Eastwood in remembering what one is doing with long macro took care of defining macros; McIlroy handled definitions. However, it is also a bit more complicated the expansion of macro calls. to implement such symbolic macro parameter names The macro system we built enabled truly compared with using special codes such as “#1”. astonishing applications. Macros took hold McIlroy’s 1960 paper goes on to show examples in the Labs’ most important project, elec- of macros in an ALGOL-like language and to explain tronic switching systems, in an elaborated the benefits of various features of macro processors. form that served as their primary program- For instance, ming language for a couple of decades. macro exchange(x,y;z) := Once macros had been incorporated, the begin assembler was processing code written whole- begin integer x,y,z; z:=y; sale by machine (i.e., by the assembler itself) y:=x; rather than retail by people. This stressed the x:=z; assembler in ways that had never been seen be- end exchange x and y fore. The size of its vocabulary jumped from end exchange about 100 different instructions to that plus defines a macro which, if called with an unlimited number of newly defined ones. The real size of programs jumped because one exchange(r1,ss3) human-written line of code was often short- results in hand for many, many machine-written lines. begin integer r1,ss3,.gen001 And the number of symbolic names in a pro- .gen001:=r1; gram jumped, because macros could invent ss3:=r1; new names like crazy. r1:=.gen001; By the way, Rosen’s book7 also had a paper (pp. 535– end exchange r1 and ss3 559) by George Mealy that touched on macros: “A Note that a special temporary register, .gen001, was Generalized Assembly System (Excerpts)”. created to replace z which was defined following the semicolon in the parameter list. 2 Some prior history of macros McIlroy’s paper also has a summary list of “sa- McIlroy’s paper also hints at some of the history of lient features” [the comments below in square brack- macro processors including half a dozen references9 ets are my notes on McIlroy’s list]: to prior macro processors; they were becoming fairly 1. definitions may contain macro calls widespread by the early 1960s. Lots of people were 2. parenthetical notation for compounding calls thinking about macros and macro processing by 1960. [e.g., so arguments to macro calls can include In section 6 of his paper, McIlroy says, multiple items separated by commas] . . . Conditional macros were devised indepen- 3. conditional assembly dently by several persons beside the author in the past year. In particular, A. Perlis pointed 4. 
created symbols [e.g., so labels or local variable out that algorithms for algebraic translation names in the body of a macro definition are could be expressed in terms of conditional unique for each call of the macro] macros. Some uses of nested definitions were 5. definitions may contain definition schemata discovered by the author; their first imple- 6. repetition over a list [see the example in the mentation was by J. Benett also of Bell Tele- discussion of Midas in section 4] phone Laboratories. Repetition over a list is


due to V. Vyssotsky. Perlis also noted that possible preprocessor for any other language or as a macro compiling may be done by routines to a stand-alone string processor. large degree independent of ground language. Here is a simple definition in GPM: One existing macro compiler, MICA (Haigh), §DEF,REFORMATNAME,; though working in only one ground language is It could be called like this: physically separated from its ground-language §REFORMATNAME,David,Walden; compiler. An analyzer of variable-style source languages exists in the SHADOW routine of which would produce the output: M. Barnett, but lacks an associated mecha- LAST=Walden, FIRST=David nism for incorporating extensions. Created Note that the GPM approach to macro definitions symbols and parenthetical notation are obvi- does not specify how many substitutable parameters ous loans from the well-known art of algebraic there are. Note also that bracketing macro defini- translation. tions and macro calls with special symbols (“§”, “;”) Donald Knuth and Luis Trabb Pardo also touch makes it a bit simpler for definitions and calls to on the history of macros in their paper “The Early occur anywhere in the input stream. Development of Programming Languages”.10 Early Strachey’s paper is a wonderful and now clas- in the paper,11 they note that Turing’s 1936 paper sic article (it’s a shame it resides behind an overly on a universal computing machine used a notation expensive paywall for the journal). By introducing for programs which amounted to being “macroex- the macro processor and its uses, describing its im- pansions or open ”. Later in the paper,12 plementation, and then providing the code for its they say that in 1951 came up with implementation, Strachey’s paper is a superb model the “idea that pseudocodes need not be interpreted; for presenting a programming language. This was pseudocodes could also be expanded out into direct perhaps possible because Strachey, purportedly a machine language instructions.” Later on the page genius programmer, had managed a very general and they note, “M.V. Wilkes came up with a very similar beautiful implementation. The CPL code was only idea and called it the method of ‘synthetic orders’; two double-column journal pages long; and, accord- we now call this macroexpansion.”13 ing to the history of the m4 macro processor,15 fit into 250 words of machine memory. 3 Strachey’s General Purpose The just-mentioned m4 history also touches on Macrogenerator the influence of GPM on later macro processors. Also, At the time I joined the ACM to receive the CACM, GPM was used by later authors as an illustrative I also subscribed to The Computer Journal. In this example of a macro processor.2,16 I studied Christopher Strachey’s GPM (General Pur- Some readers by this time may be asking, “But pose Macrogenerator).14 The paper presents Stra- how is a macro processor implemented?” One can chey’s macro processor and its possible uses. Then sketch this intuitively. Text in the input stream to the paper explains how it is implemented. Finally, it the macro processor that is not a macro definition has the code for the CPL language (sort of ALGOL- or macro call is just passed on to the output stream. like) which can be transliterated to implement GPM Macro definitions in the input stream are saved in a in any other computing language. software data structure with their names and associ- GPM was a change in the way macro definitions ated definitions. 
When a macro call is spotted in the and calls were formatted from the series of macro input stream, the definition is pulled out of storage processors originally developed at Bell Labs in what to replace the macro call in the output stream with I will call the McIlroy style. These early assemblers the call parameters being substituted at the proper and the macro processors tended to be shown in places in the definition. If all this is done using a columns with keywords (DEFINE, a defined macro first-in-last-out stack in the proper way, definitions name, IRP, etc.) being recognized by the processor within definitions, recursive calls, and so forth are as a special symbol and the other parts of the defini- possible. For a detailed description of a macro pro- tion or call being detected by their separation with cessor implementation, access Strachey’s GPM paper spaces or commas, or perhaps bracketing parenthe- in its journal archive or find a used or library copy ses. In fact, many of the early macro processors were of Peter Wegner’s book4 (pp. 134–144). embedded parts of an assembler or language com- piler. GPM indicated its macro definitions and calls 4 Midas macro processor and their arguments with unusual characters, and The next macro processor I came across (and the first it was independent of any particular language — a I actually used) was the macro processor that was


part of the Midas assembler for the PDP-1. PDP-1 arguments — the Midas commands IRP (indefinite Midas had its origins in MIT’s TX-0 computer, all repeat over a list of arguments) and IRPC (indefinite the way back to MACRO on the TX-0. repeat over a string of characters). In our version of MACRO was an assembler with a macro proces- Midas, the IRP command might have been used as sor capability written by Jack Dennis for the TX-0. follows: I don’t know of a manual for TX-0 macros; however, IRP A, (W1, W2, W3) MACRO was later released by DEC with its PDP-1 (ADD A computer, and Jack Dennis states17 that he wrote ) 18 the manual for that. expanding to When I asked Jack Dennis about predecessor ADD W1 MACRO technology to , he mentioned McIlroy’s pa- ADD W2 per (which was published after MACRO was available ADD W3 on the TX-0 in 1959, so perhaps Jack saw a draft and in another example or preprint). Of his MACRO, Jack said,19 “Doug’s macro processor was of the string substitution sort IRP X,Y,(A,Q,B,R,C,T) ... Mine was different: it permitted a user to give a (CLA X name to a sequence of assembly instructions, with in- STO Y ) teger parameters that would be added to instructions to create modified addresses. (Thus the essential expanding to mechanism was one’s complement binary arithmetic CLA A instead of string concatenation.)” STO Q Next on the TX-0 came Midas, which was de- CLA B rived by Robert Saunders from TX-0 MACRO. Then, STO R TX-0 Midas was moved to the PDP-1.20 The TX-0 CLA C STO T Midas memo21 is dated November 1962 which sug- gests that the PDP-1 Midas was up and running The same thing could have been accomplished sometime in 1963, as the Midas manual for the PDP- using the command to repeat for each character in a 1 says it was ported from the TX-0 where it had been string of characters: running for about a year. IRPC X,Y,(AQBRCT) In any case, the PDP-1 editing, assembling, and (CLA X debugging set of programs was probably the best set STO Y of interactive program development and debugging ) tools that were available for a mini-computer in the I’m pretty sure that the following also worked mid-1960s. Therefore, four of us using a Univac 1219 in our version of Midas:23 computer at Lincoln Lab decided to reimplement MACRO ADDTHEM X,Y,Z 22 these PDP-1 tools for our 1219. (ENTAL X Midas for the PDP-1 and our version for the IRP W,(Z) Univac 1219 had macro processor definition and call (ADD W formats that were similar to those in the tutorial part ) of McIlroy’s paper, e.g., “MACRO NAME ARG1, ARG2 STRAL Y (string)” to define a macro and “NAME X,Y” to call ) it with X and Y to be substituted for ARG1 and ARG2 when called with ADDTHEM A,B,(C,D,E) resulted in in the macro definition. For example, ENTAL A MACRO MOVE X,Y ADD C (ENTAL X ADD D STRAL Y) ADD E STRAL B The above when called with MOVE K,L All in all, this macro effort was a significant piece of computing and programming education for me.24 resulted in ENTAL K 5 More study and use of macro processors STRAL L (and language extension capabilities) Midas also had what I now think of as map In September 1967, I moved to Bolt Beranek and commands, i.e., apply some function over a list of Newman Inc. in Cambridge, MA, where I had access


to the company’s PDP-1d time sharing system. I COMPLEXADD, A, B, C, D, E, F = FETCH, A immediately began extensive use of the macro facility ADD, C built into the editing program TECO.25 STORE, E TECO22 was an early, very powerful, text edi- FETCH B tor with a macro capability using the same type of ADD D keystrokes one used for editing. One could type a list STORE F of keystrokes to do some complex editing function with the obvious substitution when called with but delay evaluation and instead save the sequence COMPLEXADD, U, V, W, X, Y, Z of keystrokes (i.e., defining a macro) and then later give a few keystrokes to execute the saved string From macros as a way of extending languages it of keystrokes (i.e., calling or executing the macro). was a short step to the idea of extensible high-level TECO macros typically looked very cryptic. People languages. From 1966–1968 I took computer sci- also played games with what complicated things they ence courses at MIT as a part-time graduate student could do with TECO macros, e.g., calculating digits and eventually did Master’s thesis work (never com- of pi, implementing Lisp, etc.26 pleted) on a capability for extending a high level Also early on at BBN, as a weekend hack, I language. Unfortunately, I have lost the complete transcribed the Algol-like listing of Strachey’s GPM draft of the thesis report (and I never finished the system from his 1965 paper into PDP-1 Midas assem- accompanying program). However, I am sure that bly language and made it run. This was easy to do my thesis literature research and thinking influenced given Strachey’s complete description of the system. the next project I had at BBN. In 1967 to 1968, with the assignment to think I also investigated the TRAC language.27,28 TRAC about a programming language for what became was presented by Calvin Mooers as a text processing BBN’s PROPHET system (a tool to help medicinal language; but to my mind, TRAC was not so different chemists and research pharmacologists), I looked from a macro processor in the way it defines and deeply into extensible languages. I studied and re- manipulates strings.29 ported on McIlroy’s ideas6 and the ideas and imple- The basic TRAC operation is a call to a built-in mentations of several other researchers.30 Eventually, function introduced by a pound sign, e.g., as a proof of concept, I translated James Bell’s Pro- #(function-name,arg1,arg2,...) teus from the Fortran implementation in his thesis Two of the built-in functions are define string (ds) into PDP-10 assembly language. I turned this effort and call (cl), as follows: over to Fred Webb, who eventually replaced what I #(ds,greeting,Hello World) had done with a fresh implementation of an extensi- 31 and ble language named PARSEC. PARSEC was used #(cl,greeting) in various versions (and later as the basis for RPL for the RS/1 system) by a multitude of people for resulting in replacement of the call by the string many years. “Hello World”. This note on macros is not the place to go more Another built-in TRAC function, ss, specifies deeply into extensible languages. The references in the strings for which a substitution is to be made at note 30 are a decent introduction to the state of the call time. Thus, art circa 1968. 
For the state of the art a decade #(ds,greeting,Hello, name) later, John Metzner’s graded bibliography on macro #(ss,greeting,name) systems and extensible languages is relevant.5 creates a macro with one call-time argument, such that 6 Midas, macros, and the #(cl,greeting,Hello Dave) ARPANET IMP program results in the text string “Hello, Dave”. At BBN I was part of the small team that in 1969 There are other built-in functions for string com- developed the ARPANET packet switch.1,32 We de- parison, and so on. veloped the packet switch software for a modified By this time, I was pretty fascinated by program- Honeywell 516 computer using the PDP-1d Midas ming languages and macros, in particular the idea assembler, using lots of macros, etc., to adapt Midas of extensions to programming languages. to know about the 516’s instruction set and paged Macros have often been used as a form of lan- memory environment. Bernie Cosell primarily con- guage extension. For instance, complex add might structed this hairy set of conversation macros. Our be defined, using McIlroy’s notation, as three person software team (Cosell, Will Crowther,


and me) also used Midas macros to facilitate devel- If this is called with \Greeting{Dave}, it results in opment of the packet switching algorithms to run Hi Dave! I hope you’re well. on the 516. All this may be seen in a listing of the 33 IMP program available on the web. We also used The text “#1” tells the macro definition processor Midas macros to generate a concordance for the IMP (a) that there is one argument, and (b) where the 34 program as well as to reduce the probability of call-time argument is to be substituted in the body of 35 writing time-sharing bugs. A contemporary ver- the macro. If the macro definition allowed three ar- sion of this Midas macro assembler written in Perl guments, for instance, the sequence “#1#2#3” would 36 by James Markevitch is also available for study. appear after the macro name and before the open curly bracket of the definition. 7 Ratfor and Infomail The TEX macro processor is enormously pow- The next project on which I saw something like a erful and flexible, in its unique way, and a compre- macro processor was an effort in the early 1980s to hensively documented piece of software.39,40,41,42 develop a commercial email system that would run Massive programs have been (and continue to be) on a variety of vendor platforms, e.g., DEC VAX, written in its macro language. For example, LATEX is IBM CICS, IBM 360, and Unix on the BBN C/70. For implemented entirely with TEX macros, as are other portability we decided to implement our email sys- variants or supersets of TEX (called “formats” in tem (called InfoMail) in Fortran, for which compilers TEX jargon) as well as thousands of LATEX “packages” already existed for the target platforms. However, to which extend or modify the capabilities of LATEX. avoid having to actually write Fortran code, we devel- One modest-size example of the use of T X macros 37 E oped the system using Ratfor (Rational Fortran). to extend LATEX can be found on pages 6–10 of “The Ratfor was not actually a macro processor but rather bibtext Style Option”.43 a programming language that acted as a preprocessor Personally, I tend to do my LATEX extensions us- to emit a Fortran program that could be compiled ing packages that other people have already written, by a standard Fortran compiler. Nonetheless, that although sometimes I make minor modifications to seems to me to be quite a lot like what a macro pro- existing packages. More commonly I use TEX macros cessor does. Also, Chapter 8 of the Software Tools (or a LATEX variation) to replicate small snippets of book (see previous note) describes the implementa- LATEX code which are used repeatedly — for consis- tion in Ratfor of a macro processor (based on the tency, and also to easily allow later changes of such macro processor for the programming language C). snippets as I figure out what I actually want them to do. In 2004 I published an example of one such use Also during my BBN years between 1967 and 1995, I 44 briefly used the C language, which includes a macro of a TEX macro; I encourage readers to take a look processor,38 which optionally can be used indepen- at it as it describes (far from completely) various dently of the rest of C. But from 1982 on I did no real ways TEX macros are defined and can be called. computer programming and thus didn’t track what Historically, there has been an interesting set of was happening in the world of macro processors. pressures around TEX’s macro capability. 
Originally Donald Knuth included only enough macro capability 8 TEX, macros for typesetting to implement his typesetting interface. However, he was persuaded by early users to expand that macro People who use (LA)TEX macros may already know much of what is in this section. However, some of it capability. That expanded capability allowed users to may be new to some readers. construct pretty much any logic they wanted on top of TEX (although often such add-on logic was awk- After retirement from BBN in 1995, I had started ward to code using macro-type string manipulations). A using (L )TEX in place of a word processor such as Mi- On the one hand, TEX and its macro-implemented crosoft Word, particularly for documents that were derivatives have always been very popular and there more than a one-page, one-time letter. This brought have been non-stop macro-based additions for over me in contact with the very sophisticated and com- 30 years. On the other hand, users despair at how plex macro processor embedded in the TEX type- annoying coding using macros is, moan about “why setting system. I have now been using this system Knuth couldn’t have included a real programming and its macro processor for typesetting for nearly 20 language within TEX”, and otherwise cast aspersions years, the longest in my life I have used any single on TEX’s macro capability. macro processor. Over the years various attempts have been made Here is an example of a TEX macro. to link TEX to a “real” programming language, typi- \def\Greeting#1{Hi #1! I hope you’re well.} cally invoking the programming language from TEX


or the reverse; however, none of these efforts have come into widespread use. Over the past few years, however, a small group of TEX hackers have accomplished an apparently successful merger of the Lua programming language and TEX, called LuaTEX.45 LuaTEX maintains TEX's macro processor (there is really no way, and no reason, to get rid of it while keeping TEX). Thus, the full power of the TEX macro processor is available for the many situations in which it is the best tool, and the Lua language is available for things which can be done much more easily in a procedural language.

The history of the TEX macro processor partially explains the above. Knuth has made the point that he was designing a typesetting system that he didn't want to make too fancy, i.e., by including a high level language. He has also noted that when he was designing TEX he created some primitive typesetting operations and then created a set of macros for the more complete typesetting environment he wanted. He expanded the original macro capability when fellow Stanford professor Terry Winograd wanted to do fancier things with macros. Knuth's idea was that TEX and its macro capability provided a facility with which different people could develop their own typesetting user interfaces, and this has happened to some extent, e.g., LATEX, ConTEXt, etc.

It is perhaps worth discussing a few of the things that make the TEX macro capability different from, for example, the capability of GPM.

GPM has a simple and unchanging definition and calling syntax, as was described in Section 3. Macro definitions can include other macro definitions, and macros can have recursive calls (and without going back to study the GPM paper carefully, I assume that the scope of macro definitions happens in the natural and obvious way). The definition associated with a macro name can be "looked at" without evaluating the definition; the definition associated with a macro name can be assigned to a different macro name; and there is a capability for converting numbers between binary and decimal formats and for doing binary arithmetic. No limit is specified for the number of arguments a macro definition and call can have. Finally, GPM is a stand-alone program; it processes its input, and it is up to the user what happens next with its output.

The TEX macro capability can do all those things; but it cannot be used independent of TEX, and in its straightforward form its macro definitions and calls are limited to nine arguments. TEX also has a way of defining a macro call to have a much more free-form format, and with some programming (or use of an appropriate TEX macro package) macro call arguments can be specified with attribute-name/value pairs. There are explicit commands in TEX creating local or global definitions, as well as various other definition variations, such as delayed definition and delayed execution of macro calls. TEX has a rich rather than minimal set of conditional and arithmetic capabilities (some related only to position in typesetting a page). There are also ways to pass information between macros and, more generally, to hold things to be used later during long, complicated sequences of evaluation and computation. These capabilities allow big programs to be written in the macro language, and thus TEX also has a capability to trace the flow of macro definition and execution.
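As one sketch of that more free-form calling style (the macro name and its parameter text below are invented for illustration), TEX's delimited parameters let a call spell out its arguments with keyword-like labels:

\def\PageSpec[width=#1, height=#2]{a #1 by #2 text block}
\PageSpec[width=6.5in, height=9in]
% expands to: a 6.5in by 9in text block

Packages such as keyval build full attribute-name/value interfaces on top of this kind of argument parsing.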
TEX has another unusual capability that is sometimes used with macros, although it is a capability more closely related to lexical analysis than to macro definitions and calls; this is the TEX "category code" feature. TEX turns its input sequence of characters into a list of tokens. A token is either a single character with a category code (catcode) or a control sequence. For instance (using an example from Knuth's The TEXbook), the input "{\hskip 36 pt}" is tokenized into

{ hskip 3 6 p t }

where the opening brace is given the catcode of 1 (begin group), hskip gets no catcode because it is a control sequence, 3 and 6 get catcodes of 12 (other), the space after 36 gets catcode 10 (space), p and t get catcodes of 11 (letter), and the closing brace gets catcode 2 (end group). (There are 16 such category codes in all.)

The next step in the TEX engine decides what to do with these different tokens, e.g., put the numbers together into a numeric value and the letters together into a unit of measurement, execute the primitive hskip command, and so on. The backslash has a catcode of zero indicating an escape character, thus telling TEX that the following letters (five in this case) form a control sequence; the space after the command name delimits the end of the name and doesn't otherwise count as a space.

TEX also has the capability of changing which character(s) have which catcode(s). For instance, the dollar sign (by default having catcode 3, indicating a shift to math mode) could be given the escape character catcode, and the backslash could be given the math shift catcode. This capability is routinely used in TEX program libraries to define macro names internal to a program in the library which users of the library cannot use when (normally) defining their own macro names. Basically, the entire input language of TEX can be changed.
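A minimal sketch of that library-internal-name idiom (the macro name here is made up): give "@" the letter catcode, define a macro whose name contains "@", and then give "@" back its ordinary catcode so the name cannot be typed in a normal document.

\catcode`\@=11   % 11 = letter, so @ may appear in control sequence names
\def\my@internal{private expansion}
\catcode`\@=12   % 12 = other; \my@internal can no longer be named directly

This is essentially what LATEX's \makeatletter and \makeatother commands do.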


The typeface design system, METAFONT, that Knuth developed in parallel with TEX has a more powerful macro capability (my memory is that he says somewhere that this was because he assumed more sophisticated people would be using METAFONT than TEX).

When Knuth rewrote TEX and METAFONT using literate programming techniques46 and targeting the Pascal programming language, he included a macro capability in his literate programming WEB system for two reasons: simple numeric and string macro definitions were used to simplify coding given the limitations of Pascal; another kind of macro supported literate programming by allowing source material to be presented in an order suitable for human readers, which was then reordered by the system into the order needed for compiler processing.47 (When literate programming systems targeting C were later developed, WEB's numeric and string macros were no longer strictly needed since C has its own macro processor, as noted above. The macros supporting literate programming continued to be an essential part of the system.)

I got to wondering when and where Donald Knuth got his own introduction to macro processors. He said:48

I worked with punched cards until the 70s. My first "macro processor" was a plugboard for the keypunch, setting up things like tab stops! The assembler that I wrote as an undergrad, SuperSoap for the IBM 650, had a primitive way for users to define their own "pseudo-operations"; but it was extremely limited. For example, the parameters basically had to be in fixed-format positions.

I learned about more extensible user-definability with the first so-called "compiler-compilers", notably D. Val Schorre's "META II" (1964)49 and E. T. Irons's syntax-directed compiler written at Yale about the same time.50 Later I knew about the sort-of macros in other compilers, e.g., PL/I.

But the first really decent work on what we now call macro expansion was done I think by Peter Brown . . . it was his book Macro Processors (Wiley, 1974)2 that was my main source for macros in TEX, as far as I can remember now.

9 M4

In about 2001, I was looking for a macro processor to help me format an HTML list of the articles in the Journal of the Center for Quality of Management that were relevant to the chapters of a book I co-authored.51 I found the m4 macro processor.52

The following first entry in an HTML table:53

[image of the rendered table entry]

was created with the following m4 macro definition that outputs HTML markup:

define(`_te',`
$1
$2
$3 $4
$5
')

which was called as follows (except all on one line):

_te(28.2,_url(00100),_i(1,1,1992),
Tom Lee and David Walden,What...)

where "_url" and "_i" are other m4 macro calls. I gave the m4 macros names beginning with underscores so the names would be recognizable in the m4 source file which didn't otherwise use underscores.

Use of m4 brought me full circle, in a sense, as the m4 macro processor is somewhat derived from GPM.54 It was also used for and with Ratfor. One incarnation of m4 is a freely available program under the GNU GPL, and thus that implementation is also available for study and other use.55

My writing and publishing work using a LATEX engine involved me in publishing work flows, particularly the need to have one source file be processable into multiple publication formats. For example, in the case of the TEX Users Group Interview Corner,56 we use m4 macros to define a new markup language which can, with different sets of definitions, be targeted to output markup language code for either LATEX for paper printing or HTML for web posting. In our interview source files, we might write the macro call "_i(Book Name)". With our HTML set of m4 definitions, that would convert to "<i>Book Name</i>". With our LATEX set of m4 definitions, that would convert to "\textit{Book Name}". We have hundreds of m4 definitions in each of the LATEX and HTML m4 definition files that implement this ad hoc markup language that targets the two methods of output.
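A sketch of how the two definition sets can differ (these two lines are my reconstruction for illustration, not copied from the actual Interview Corner definition files):

define(`_i', `\textit{$1}')dnl used when targeting LATEX
define(`_i', `<i>$1</i>')dnl used when targeting HTML

Only one of the definition files is read for a given run, so the same call _i(Book Name) in the source produces whichever markup the chosen output format needs.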
A language-independent macro processor such as GPM or m4 can be used at any point in a workflow:

• as a preprocessor for a programming language (as in many of the examples in this note)
• as a post processor (Martin Richards has noted that a GPM-like macro processor was used to


convert the O-code output of his BCPL compiler into machine language)57
• as an intermediate step in the work flow (for instance, a Perl program processes an input driver file to generate m4 macro calls, the macro calls generate LATEX markup, and the LATEX engine generates PDF files from the LATEX markup).

10 Reflections

General reflections. A computing historian who is a potential reader of this note might ask, "Why should I care about macro processors?" There are a couple of answers to this. First, macro processors played an important role in the history of programming languages. They were used with early assemblers, before higher level languages became widespread. In fact, they allowed assembly language users to create higher level language constructs by grouping together useful sequences of assembly language instructions under a single name, for example, complex-add. As higher level languages came into use (e.g., Fortran, ALGOL), the usefulness of macro processors was carried over into higher level languages (e.g., to create yet higher level linguistic constructs). Soon stand-alone macro processors were created that could act as a preprocessor for any programming language rather than being embedded within a particular programming language. Second, both embedded and stand-alone macro processors continue in widespread use today. We also find macro processors embedded in the user interface of many computer tools other than programming languages, for instance, as part of text editors, operating system shells, makefiles, and other applications and software packages (one example is Actions in Photoshop — repeatable sequences of Photoshop commands).

The reader might next ask, "If macro processors have been important in the history of programming, why isn't more written about them?" I think the answer to that question may have to do with the relative simplicity of many macro processors, which substitute one character string for another. That is not an area that requires much research, and generally useful implementation methods have been well known since the 1960s or earlier; also, it is not too hard to do an ad hoc implementation of a macro processor as part of some piece of software one is writing. Being mostly implemented as string manipulators, macro processors don't have to have all of the debatable research topics that a full programming language has (side effects, call by name, scope, etc. — although the sophisticated macro processors such as m4 and TEX have parallels for these sorts of programming-language-philosophy topics). Also, as macro processors tend to operate at compile time, sometimes in ways that increase runtime efficiency, there probably hasn't been as much interest in optimizing macro processor performance as there has been with programming languages.

From the point of view of the professional programmer (versus historian), macro processors are just another development tool, and they are available as preprocessors or embedded in whatever programming language or other development tools the programmer is using. Details about how a macro processor works may annoy but probably won't deter such working programmers. They just use what is available as best they can, and don't write the theoretical articles. Practical use of macros also tends to be awkward, and describing their use can be messy. Thus, it is difficult in a short paper to describe serious uses of macros.

On the other hand, people who might be reluctant to use a "real" programming language may use macros (e.g., editor macros) in quite sophisticated ways. In the multi-chapter interview of Knuth in his Companion to the Selected Papers of Donald Knuth58 (page 161), Knuth says of his wife, "I don't think she will ever enjoy programming. She is good at creating macros for a text editor, sometimes impressing me with subtle tricks that I didn't think of, but macro writing is quite different from creation of what we call 'recursive procedures' or even 'while loops.'"

Macro processors are also not as tractable a topic for historical or theoretical writing as high level languages perhaps are. While many macro processors may be for relatively straightforwardly extending a programming language or aiding in a development task, some macro processors (such as m4 and the macro capability in TEX) have a big set of capabilities for dealing with sophisticated programming situations — they are set up to write big complicated programs, just as regular higher level languages are, but often in more clumsy-to-use ways. There is a 160-page manual to describe all the capabilities of the stand-alone m4 macro processor. In addition to Knuth's own writings,39,41 Eijkhout's book40 has more than 50 pages on the extensive macro capabilities embedded in TEX, and Stephan von Bechtolsheim wrote (perhaps excessively) a 650-page book59 on using TEX macros. (There is probably an interesting paper to be written comparing the issues inherent in these powerful macro processors with similar issues in regular procedural programming languages.)

The immediately preceding discussion brings to mind the question, "Where on the spectrum of programming languages do we put macro processors?"


Early on they were used as ways to extend assemblers and compilers. Quickly they were used to implement programming languages (e.g., WISP and SNOBOL). They were used to simulate different "computers" (e.g., BCPL's compiler of O-code, the assembly language of the Honeywell 516 IMP computer). I personally have used them to define and process little special purpose markup languages (as described in the m4 and TEX sections above).

Many macro processors are Turing-complete and in this sense can compute anything a typical higher level language can compute. However, they are distinct from many higher level language compilers (discounting macro capabilities they might have, or other compile-time evaluation capabilities, as languages which execute interpretively often have, e.g., Lisp, Perl) in that much computation in macro processors inherently goes on at "compile" time. This can make their syntax and semantics harder to specify.

In Knuth's paper "The Genesis of Attribute Grammars", reprinted in his Selected Papers on Computer Languages,60 Knuth explains (pages 434–435) that he didn't use an attribute grammar to specify the semantics of TEX because he hadn't been able to think of any good way to define TEX precisely except through the implementation using Pascal. He also couldn't think of how "to define any other language for general-purpose typesetting that would have an easily defined semantics, without making it a lot less useful than TEX. The same remarks also apply to METAFONT. 'Macros' are the main difficulty. As far as I know, macros are indispensable but inherently difficult to define. And when we also add a capability to change the interpretation of input characters, dynamically, we get into a realm for which clean formalisms seem impossible."

Personal reflections. Macro overview books, such as those by Brown and Campbell-Kelly, try to categorize the uses of macro processors, for example, to extend a programming language, to allow late binding of variables to values, to communicate with the operating system, to implement desirable-but-nonexistent machine instructions, to insert debugging code into a program, and to simply abbreviate long strings of text. Below is a short list of the ways I feel I have made use of the macro processors described in this note.

• The macro processor described in McIlroy's paper, Strachey's GPM, and Mooers' TRAC were for the study of programming languages, understanding of how macro processors work, and for coding practice.
• Midas was about implementing a macro processor and using one for production work including retargeting Midas to a new computer and putting hints in a real-time program to reduce the probability of time-sharing bugs.
• TECO and other editor macros were to reduce editing keystrokes, especially when regular expression searches and replacements are needed.
• Ratfor and C were for programming efficiency.
• TEX macros are for production typesetting, creating extensions to LATEX (e.g., the style for a particular book), and as another sophisticated macro processor to study.
• m4 use has been to create new markup languages with which to target document source files to different output media.

In addition to assembly languages for different computers, Fortran for various mathematical projects, and the languages mentioned above, I briefly programmed in Lisp and BCPL years ago. Lisp has a macro capability and I may have used it, but, if so, I don't remember anything about it.61 While writing these notes, I have learned that a version of Strachey's GPM existed for BCPL, known as BGPM. In recent years I have used Perl for several large and many small projects. Apparently Perl has a capability for redefining parts of the Perl language (the equivalent of macros and more), but it confused me when I tried to read about it and am not likely to try it. I suppose I can just use m4, which I already know, as a preprocessor for Perl; but that doesn't stop me from wishing that a conventional macro capability was built into Perl.

Acknowledgments

Chuck Niessen helped me remember the capabilities of the Univac 1219 macro assembler we wrote. Ralph Alter and Will Crowther responded to queries relating to macro capabilities on Lincoln Laboratory computers. Stan Mazor confirmed my memory about whether or not we used macros in the early 1960s on the IBM 1620 at San Francisco State College. Tom Van Vleck reminded me of what languages with macro processors were available at MIT in the late CTSS and early Multics era. Jim Wood reminded me of some things about PARSEC. Bill Aspray and Martin Campbell-Kelly provided guidance on how this paper might be developed into a more traditional academic computing history paper. Bernie Cosell pointed out half a dozen subtle issues or possibilities in the paper. Ralph Muha gave me his copy of Gimpel's SNOBOL4 algorithms book. Most helpfully, Karl Berry read drafts of this note, noted typos, made suggestions for improvement, and provided additional insight about and references for a number


of macro processors, particularly TEX. Karl, along for the Study of Language and Information, Stanford with Barbara Beeton, also edited the final draft of University, 2003. the paper to ready it for publication. 11 Page 6 in the volume of Knuth’s selected papers. 12 Page 42 of the selected papers volume. Notes 13 Martin Campbell-Kelly told me [email of 2013-11-23], 1 http://walden-family.com/impcode/imp-code.pdf “Maurice Wilkes developed a macro-based system called WISP 2 [Macro overview] P. J. Brown, Macro Processors and for list processing (http://ai.eecs.umich.edu/ people / conway / CSE / M.V.Wilkes / M.V.Wilkes-Tech. Techniques For Portable Software, John Wiley & Sons, 2 1974. Memo.63.5.pdf) . . . Peter Brown was Wilkes’ PhD stu- dent.” 3 [Macro overview] Martin Campbell-Kelly, An Introduc- 14 [GPM ] C. Strachey, A General Purpose Macrogenera- tion to Macros (Computer monographs, 21), MacDonald tor, The Computer Journal, vol. 8 no. 3, 1965, pp. 225– & Co., London, 1974. 4 241. [Macro overview] Peter Wegner, Programming Lan- 15 See Section 1.2, Historical References, at http://www. guages, Information Structures, and Machine Organiza- gnu.org/software/m4/manual/ tion, McGraw-Hill Book Company, 1968, section 2.6 and 16 GPM GPM section 3.1–3.4, pp. 130–180. [ ] Another implementation of can be found in Section 8.8 of James F. Gimpel’s book Algorithms 5 [Macro overview] John R. Metzner, A graded bibliog- in SNOBOL4, John Wiley & Sons, 1976. (SNOBOL4 it- raphy on macro systems and extensible languages, ACM self was implemented using a collection of macros which SIGPLAN Notices, Volume 14 Issue 1, January 1979, create a virtual machine for the purpose of language pp. 57–64. portability: Ralph E. Griswold, The Macro Implemen- 6 [McIlroy’s model] M.D. McIlroy, Macroinstruction Ex- tation of SNOBOL4, W.H. Freeman and Company, San tensions for Compiler Languages, Communications of the Francisco, 1972. SNOBOL4 is sometimes known as “Macro ACM, vol. 3 no. 4, 1960, pp. 214–220. SNOBOL”.) 7 In 1967 McIlroy’s paper was reprinted in the now classic 17 Email of November 16, 2013. book Programming Systems and Languages, edited by 18 [Midas] MACRO Assembly Program for Programmed Saul Rosen, McGraw-Hill, New York, 1967. For this paper Data Processor-1 (PDP-1), Digital Equipment Corpora- I was looking at the reprint of Rosen’s book. In the reprint tion, Maynard, MA, 1963, http://bitsavers.informatik. McIlroy doesn’t say anything about the macro name in uni-stuttgart.de/pdf/dec/pdp1/PDP-1_Macro.pdf this example also being used within the definition but 19 Email of October 28, 2013. not apparently as a recursive call of the macro, i.e., the 3- 20 argument call can be distinguished from the 1-argument [Midas] http://ia601609.us.archive.org/9/items/ instruction. bitsavers_mitrlepdp1_1535627/PDP-1_MIDAS.pdf 21 8 [Midas] http://ia601601.us.archive.org/11/items/ [Macros at Bell Labs] May 4, 2009, bitsavers_mittx0memo_2363951/ http://deememorial.blogspot.com/2009/05/ M-5001-39_MIDAS_Nov62.pdf doug-mcilroy-recalls-bell-labs.html 22 9 Ralph Alter wrote a version of TECO, Dan Murphy’s [Other macro systems] M. Barnett, Macro-directive paper Tape Editing and Correcting Program: Dan Mur- Approach to High Speed Computing, Solid State Physics phy, The Beginnings of TECO, IEEE Annals of the History Research Group, MIT, Cambridge, MA, 1959; D.E. East- of Computing, 31(4), October–December 2009, pp. 225– wood and M.D. McIlroy, Macro Compiler Modifications 241. 
Will Crowther wrote a version of Alan Kotok’s DDT of SAP, Bell Telephone Laboratories Computation Cen- debugging program. Chuck Niessen and I wrote a version ter, 1959; I.D. Greenwald, Handling of Macro Instruc- of Robert Saunder’s macro assembler: Chuck wrote the tions, Communications of the ACM, vol. 2 no. 11, 1959, MICA SHARE basic assembler and I wrote the macro-processor. pp. 21–22; M. Haigh, Users Specification of , 23 User’s Organization for IBM 709 Electronic Data Pro- I have the line printer assembly listing and my hand- cessing Machine, SHARE Secretary Distribution SSD-61, written flow chart for this macro processor. It is tempting C-1462, 1959, pp. 16-63; A.J. Perlis, Official Notice on to try to make this macro processor run again on a 1219 ALGOL Language, Communications of the ACM, vol. 1 emulator. 24 no. 12, 1958, pp. 8–33; A.J. Perlis, Quarterly Report of This was the first program in which I used stacks the Computation Center, Carnegie Institute of Technol- and for which I thought about recursion in a profound ogy, October. 1969; Remington-Rand Univac Division, way. I think this was also the first time I wrote code for Univac Generalized Programming, Philadelphia, 1957. managing a symbol table. 25 10 [Other macro systems] This article was originally pub- I don’t remember using this capability in TECO at lished in Volume 7 of the Encyclopedia and Computer Lincoln Lab. Science and Technology, published by Michel Dekker in 26 I won’t bother with further mention of macros in other 1977. The article is reprinted as chapter 1 of Donald E. editors (or macros in various job control languages, shells, Knuth, Selected Papers on Computer Languages, Center or makefiles) in the rest of this note.


27 43 [TRAC ] Calvin Mooers and Peter Deutsch, TRAC:A [TEX ] http://mirror.ctan.org/macros/latex209/ Text Handling Language, Proceedings of the 20th ACM contrib/biblist/biblist.pdf 44 National Conference, 1965, pp. 229-246. [TEX ] http://tug.org/TUGboat/tb25-2/ 28 As part of my investigation, I visited TRAC creators tb81walden.pdf 45 Calvin Mooers at his Cambridge office (of the Rockford [TEX ] http://www.luatex.org/ Research Institute) and Peter Deutsch at the Cambridge 46 [TEX ] Donald E. Knuth, Literate Programming, home of his parents. http://literateprogramming.com/knuthweb.pdf 29 2 TRAC 47 Brown’s book in fact classifies as a macro Email of December 5, 2013, from Karl Berry. processor. 48 30 Private communication, January 11, 2014; the foot- [Extensible languages] , On Certain Basic notes in the quotation are from the author of the present Concepts of Programming Languages, Technical Report paper, not from Knuth. No. CS 65, Computer Science Department, Stanford Uni- 49 D. V. Schorre, META-II: A Syntax-oriented Compiler versity, May 1, 1967; B.A. Galler and A.J. Perlis, A Writing Language, D. V. Schorre, Proceedings of the ACM, Proposal for Definitions in ALGOL, Communications of 19th ACM National Conference, ACM, New York, 1964, the ACM, vol. 10 no. 4, April 1967, pp. 204–219; T.E. http://ibm-1401.info/Meta-II-schorre.pdf Cheatham, Jr., A. Fischer, and P. Jorrand, On the Basis 50 ALGOL for ELF —An extensible language facility, AFIPS Fall Edgar T. Irons, A syntax-directed compiler for ACM Joint Computer Conference, 1968, pp. 937–948; J.V. Gar- 60, Communications of the , vol. 4, 1961, pp. 51– wick, J.R. Bell, and L.D. Krider, The GPL Language, 55; Edgar T. Irons, The structure and use of a syntax- Programming Technology Report TER-05, Control Data directed compiler, Annual Review of Automatic Program- Corporation, Palo Alto, CA; Thomas A. Standish, A Data ming 3, 1962, pp. 207–227. 51 Definition Facility for Programming Languages, PhD the- http://walden-family.com/4prim/ sis, Carnegie Institute of Technology, 1967; James Rich- 52 [M4] http://www.gnu.org/software/m4/manual/ ard Bell, The Design of a Minimal Expandable Program- 53 The full table is at http://walden-family.com/ ming Language, PhD thesis, Stanford University, 1968. 4prim/archive/issues-list.htm 31 [Extensible languages] F. Webb, The PROPHET Sys- 54 [M4] http://en.wikipedia.org/wiki/ tem: An Overview of the PARSEC Implementation, BBN M4_(computer_language) Report 2319, September 1, 1972; PARSEC User’s Manual, 55 While m4 was originally developed by Brian Kernighan Bolt Beranek & Newman, Cambridge, MA, December and in 1977 and released as part of AT&T 1972. Unix, the GNU version mentioned here was a complete 32 F.E. Heart et al., The Interface Message Processor rewrite by Ren´eSeindal with continuing updates by many for the ARPA Computer Network, AFIPS Conference others. The GNU m4 manual’s history section15 has a good Proceedings 36, June 1970, pp. 551–567. bit of additional history. 33 http://walden-family.com/impcode/ 56 http://tug.org/interviews/ c-listing-ps.txt 57 Martin Richards, Christopher Strachey and the Cam- 34 http://walden-family.com/impcode/ bridge CPL Compiler, Higher-Order and Symbolic Com- d-concordance.pdf putation (a special Christopher Strachey memorial issue), 35 http://walden-family.com/impcode/ 13, 2000, pp. 85–88. detect-interrupt-bugs.pdf 58 Donald E. 
Knuth, Companion to the Selected Papers 36 [Midas] http://walden-family.com/impcode/ of Donald Knuth, Center for the Study of Language and midas516.txt Information, Stanford University, 2012. 59 37 [Ratfor] Kernighan, B. and Plauger, P., Software Tools, [TEX ] Stephan von Bechtolsheim, TEX in Practice, III Addison-Wesley, 1976. Volume : Tokens, Macros, Springer-Verlag, 1993. 60 38 Selected Papers on Computer Lan- [C preprocessor] http:/ /gcc.gnu.org/onlinedocs/ Donald E. Knuth, cpp/ guages, Center for the Study of Language and Informa- 39 tion, Stanford University, 2003. [TEX ] Donald Knuth, The TEXbook, Addison-Wesley, 61 Tim Hart at MIT added a macro capability to Lisp in 1986, particularly chapter 20. BBN 40 1963. Macros in Warren Teitelman’s Lisp, which I [TEX ] Victor Eijkhout, TEX by Topic, Addison-Wesley, used, were perhaps more well developed. In some sense 1991, particularly chapters 11–14, http://mirror.ctan. the original Lisp interpreter functioned a bit like a macro org/info/texbytopic processor. 41 [TEX ] Donald Knuth, Computers & Typesetting, Vol- ume B, TEX: The Program, Addison-Wesley, 1986. ⋄ David Walden 42 http://walden-family.com [TEX ] Donald E. Knuth, Digital Typography, Center for the Study of Language and Information, Stanford University, 1999.


Eutypon 30–31, October 2013 Dimitrios Filippou, Book presentations; pp.63–64 A short appraisal of two books: (i) M.R.C. van A Eutypon is the journal of the Greek TEX Friends Dongen, LTEX and Friends, Springer, Berlin Heidelberg (http://www.eutypon.gr). 2012; and (ii) Victor Eijkhout, The Computer Science of TEX and LATEX, Lulu (Victor Eijkhout) 2012. (Article in Yiannis Mama¨ıs , Books made with love and pride; Greek.) pp. 1–9 The author is one of the few former craftsmen print- [Received from Dimitrios Filippou ers of Greece who did not quit the book world after and Apostolos Syropoulos.] the abrupt introduction of photocomposition in the early 1980s. On the contrary, he continued to work as an editor, trying to teach typographic æsthetics to younger genera- tions. He has more than 1,600 books to his credit, most of which came from the publisher Gutenberg. This arti- cle is a confessional outpouring by him, with comments on the past and current situation in Greek typography. Die TEXnische Komödie 1/2014 (Article in Greek with English abstract.) Die T Xnische Komödie Philip Taylor, Cataloguing the Greek manuscripts E is the journal of DANTE e.V., of the Lambeth Palace Library: An exercise in the German-language TEX user group (http://www. transforming Excel into PDF via XML using (Plain) dante.de). [Non-technical items are omitted.] X T X; pp. 11–28 E E Dominik Waßenhofen, Explizite Positionierung in Work is in progress to prepare an analytical cat- LAT X [Explicit positioning in LAT X]; pp.17–19 alogue of the Greek manuscripts held in the Lambeth E E Absolute text positioning with the textpos and Palace Library. The catalogue will be published online in grid-system packages. downloadable PDF format and (at a later stage) in print, using a single set of source documents marked up in the Uwe Ziegenhagen, Schneller »TEXen« mit extensible markup language XML. This paper discusses Tastenkürzeln [Faster “TEXing” using shortcut the various stages through which the documents pass, expanders]; pp.20–25 starting as Excel spreadsheets and ending up both in “Shortcut Expanders”, sometimes also called “Text Adobe Portable Document Format (PDF) and as Text Expanders”, are software tools that allow the user to Encoding Initiative (TEI)-compliant XML.(Article in define keyboard shortcuts for often-used pieces of text or English.) source code. Apostolos Syropoulos, A short introduction to Thomas Hilarius Meyer, Honigetiketten für die MathML; pp. 29–49 Schulimkerei [Creating honey jar labels for a school’s MathML is now the de facto standard for the pre- beekeeping]; pp. 26–30 A sentation of mathematical formulas on the Internet. In- This article shows how LTEX can be used to create deed, with the introduction of HTML5, which integrates bilingual labels. MathML while replacing HTML4 and XHTML, it becomes Heiko Oberdiek, Durchstreichen einer Tabellenzelle obvious that some knowledge of MathML is almost es- [Striking through table cells]; pp.31–32 sential for anyone interested in presenting mathematical This article shows an example that uses the zref content online. (Article in Greek with English abstract.) and savepos packages to strike through table cells and Ioannis Dimakos, A brief introduction to ShareLATEX; to keep the position of the cell. pp. 51–58 Herbert Voß, Rechenoperationen mit LuaT X A brief introduction to sharelatex.com, a site offer- E [Calculations with LuaT X]; p.33 ing online editing and compilation of LAT X files. 
Share- E E Usually LAT X is not the tool of choice when it comes LAT X allows users to access and process their LAT X files E E E to doing more complex calculations. With LuaT X this via their browser without the necessity of having a local E has fundamentally changed. LATEX installation. It also allows the online sharing and cooperation between users of the same file(s). (Article Herbert Voß, LuaTEX und pgfplots [LuaTEX and in Greek with English abstract.) pgfplots]; pp.34–35 LAT X’s computing functions cannot dramatically Dimitrios Filippou,T Xniques: Hanging characters; E E improve with the upcoming l3fp package from LAT X3. pp. 59–62 E With LuaT X the whole computing effort can be “out- As explained in this short note, character protru- E sourced”, with LAT X only being used to display the sion — particularly protrusion of accents in front of Greek E internal or external data. capital letters—can be done via \llap, microtype or with the use of appropriately designed fonts. (Article in Greek.) [Received from Herbert Voß.] 112 TUGboat, Volume 35 (2014), No. 1

The Treasure Chest macros/latex/contrib cnbwp in macros/latex/contrib Format working papers of the Czech National Bank. cv4tw in macros/latex/contrib This is a list of selected new packages posted to CTAN CV class supporting assets, social networks, etc. (http://ctan.org) from December 2013 through dccpaper in macros/latex/contrib March 2014, with descriptions based on the an- Typeset papers for the International Journal of nouncements and edited for extreme brevity. Digital Curation. Entries are listed alphabetically within CTAN graphicxbox in macros/latex/contrib directories. A few entries which the editors subjec- Use a graphic as a box background. tively believe to be of especially wide interest or koma-script-obsolete in macros/latex/contrib otherwise notable are starred; of course, this is not Obsolete and deprecated packages that are no longer KOMA-S intended to slight the other contributions. part of cript. perfectcut in macros/latex/contrib We hope this column and its companions will Brackets with size determined by nesting. help to make CTAN a more accessible resource to the * pkgloader in macros/latex/contrib TEX community. Comments are welcome, as always. Address package conflicts in a general way. (See article in this issue.) ⋄ Karl Berry refenums in macros/latex/contrib http://tug.org/ctan.html Referenceable enumerated items. rubik in macros/latex/contrib Typeset Rubik cube notation via TikZ. fonts scanpages in macros/latex/contrib lobster2 in fonts Import and embellish scanned documents. Family of script fonts. sr-vorl in macros/latex/contrib * zlmtt in fonts Template for books at Springer Gabler, Vieweg, and Latin Modern Typewriter with all features. Springer Research. tabstackengine in macros/latex/contrib Allowing tabbed stacking with stackengine. graphics vgrid in macros/latex/contrib aobs-tikz in graphics/pgf/contrib Overlays a vertical grid on the page. Overlay picture elements in Beamer. pgf-pie in graphics/pgf/contrib Draw pie charts with TikZ. macros/latex/contrib/beamer-contrib pgf-umlcd in graphics/pgf/contrib themes/detlev-cm in m/l/c/beamer-contrib Draw UML class diagrams with TikZ. Originally for the University of Leeds. pst-intersection in graphics/pstricks/contrib Compute intersections between PostScript paths or macros/latex/contrib/biblatex-contrib B´ezier curves using the B´ezier clipping algorithm. pst-ovl in graphics/pstricks/contrib biblatex-manuscript-philology in m/l/c/b-c Bib A Overlay macros for PSTricks. Manage classical manuscripts with LTEX. repere in graphics/metapost/contrib/macros MetaPost macros for drawing grids, vectors, support functions, statistical diagrams, and more. convbkmk in support rulercompass in graphics/pgf/contrib Correct (u)pLAT X PDF bookmarks. k E Draw straight-edge and compass diagrams in Ti Z. copac-clean in support tikz-opm in graphics/pgf/contrib Snobol program to edit Copac/BibT X records. OPM E Draw Object–Process Methodology ( ) diagrams. splint in support Write LALR(1) parsers in TEX using bison and flex. (See article in this issue.) indexing texfot in support zhmakeindex in macros/latex/contrib A Attempt to reduce online output from (L )TEX to Enhanced makeindex for Unicode and Chinese sorting. interesting messages.

info systems Practical_LaTeX in info/examples yandy in systems/win32 A Example files for the book Practical LTEX. Y&Y TEX, released under the GNU GPL.


Book reviews: LATEX for complete novices output format; the author merely mentions it in and Using LATEX to Write a PhD Thesis, a short paragraph basically saying, “people do not by Nicola Talbot do this anymore” and then discusses only the PDF output. I think this is the right decision, at least Boris Veytsman for “complete novices”. On the other hand, the book Nicola L. C. Talbot, LATEX for complete novices. includes some topics traditionally considered out of Dickimaw Books, 2012, x+279 pp. Paperback, scope for beginners, such as the use of fonts other GB£12.99. ISBN 978-1-909440-00-5. than Computer Modern. Modern TEX installations have a large number of high quality fonts with very Nicola L. C. Talbot, Using LAT X to write a E good transparent T X support, so this topic may PhD thesis. Dickimaw Books, 2013, x+148 pp. E now be considered appropriate for novices. On the Paperback, GB£9.99. ISBN 978-1-909440-02-9. other hand, the author wisely does not discuss the further possibilities in this area provided by X TE EX or LuaTEX. The tradition of active use of TEX has resulted in the fact that almost any book for beginners teaches how to define and redefine commands and environ- ments; any TEX user is seen as an (at least budding) TEX programmer. Talbot’s book is not an exception: it has two good sections on TEX programming with the proper warning about the danger of redefining macros without understanding them. The author does not try to cover the multitude of TEX-aware editors and integrated environments. Instead she introduces TEXworks as the preferred These two books by Nicola Talbot are actually vol- work environment and latexmk as the compilation umes 1 and 2 in her Dickimaw LATEX Series; however, wrapper. Again, this seems to be a very good deci- the other volumes are being revised at the time of sion: too many choices are good for an experienced writing, so this review deals only with these two. user, but can be intimidating for a novice. Dickimaw Books is an independent publishing Another decision is slightly unusual. While most house created and owned by Nicola Talbot herself; books for beginners start with the standard classes A letter besides this LTEX series it has published crime fiction (article, book, report, — it seems nobody uses and children’s books (including a fairy tale based slides, however), this book is based on the KOMA- on Vladimir Propp’s structural theory). Both LATEX script bundle. I agree that the base classes are now books are available as paperbacks (with a discount for rather long in the tooth, so it makes sense to teach TUG and UK-TUG members), or can be downloaded novices some newer and arguably typographically for free from the Dickimaw Books web site (http:// better stuff. Thus the author’s experiment seems to www.dickimaw-books.com), licensed under the GNU be a very promising one. On the other hand, unlike A FDL. The electronic versions provide helpful internal base LTEX, KOMA is not frozen, which may require and external hyperlinks. updates of the book after changes in the package. As its title suggests, the first volume is intended The book is well written and has many carefully for beginners. There are many books for new LATEX chosen exercises. It is a very good candidate for a users today. The number of free (as in free speech) first LATEX book. ones is smaller: I can recall only Oetiker’s great Of course, this does not mean the book has no “The not so short introduction to LATEX 2ε”, now flaws. 
I am not comfortable with the fact that the included in the major TEX distributions as the lshort user’s first TEX document appears only on page 32; package. What distinguishes Talbot’s book is its firm the previous chapters contain definitions of terms grasp of modern TEX customs including the modes of and installation instructions. Unfortunately students communication between TEX users. The book sends today have become used to instant gratification, so the reader to the relevant sections of the UK FAQ some of them may just stop reading much earlier for detailed information on the topics discussed, it than this if not provided with a hands-on experi- explains how to ask questions on Stack Exchange and ence. I would suggest starting with the “Hello, world” LATEX community forums. Another sign of modernity document and put the definitions and installation is that the book effectively drops the venerable DVI instructions later — or in an appendix.

Book reviews: LATEX for complete novices and Using LATEX to Write a PhD Thesis, by Nicola Talbot 114 TUGboat, Volume 35 (2014), No. 1

Also, independent publishing means the absence as they go through the manuscript during your of a professional editor, who might question or correct viva, they can easily see the comments, ques- an author’s (rare) mistakes. For example, TEX glue tions or criticisms they wish to raise alongside is not called such because in the days of manual the corresponding text. If you present them typesetting the blocks were glued in place; rather, with a single-spaced document with narrow it is a metaphor invented by DEK. (I doubt it is margins, you are effectively telling them that practical to glue reusable type, and anyway, TEX you don’t want them to criticise your work! glue is between the boxes rather than under them.) To tell the truth, I miss the exercises that were The author’s advice to substitute displaymath for an important part of the first volume. On the other \[... \] is strange: displaymath is defined by the hand, one may argue that a stressed graduate student kernel basically as \[... \]. On the other hand, the may not find the time to do the exercises. advice not to use the ugly eqnarray is right on the A rather unfortunate omission in this book is a point. Mistakes are evidently rare and don’t touch guide of the thesis classes available on CTAN. Grad- on essential points, so the book can be recommended uate school administrations are (in)famous for strict A for studying LTEX. (and often insensible) requirements for the way a The second volume is intended for more ad- thesis should be typeset, so KOMA classes might not vanced users. It covers advanced formatting (listings, do the job. The author mentions that some universi- verbatim environments), splitting the document into ties provide LATEX packages and templates for their separate files, theorem-like environments, creation students. Such packages, of course, should be used if of bibliography, indices and glossaries, etc. Besides available. However, other universities do not publish latexmk, described in the first volume, this book in- “official” LATEX styles. Nevertheless in many cases troduces arara as an alternative compilation wrapper. one can find a CTAN package that gives a passable A In addition to LTEX and typographic information, it alternative. Thus an annotated list of such packages contains some general advice for graduate students, would help a struggling student (cf. “A university which I find very good and well thought out. For thesis class” by Peter Flynn, TUGboat 33:2.) example, the author notes that a skeleton thesis with Both books are nicely typeset with Antykwa the chapters in place is a great help against writer’s Toru´nska as the main serif font and Libris ADF as block, giving one a sense of achievement early in the the sans serif one. This is again a pleasant diversion process. The book provides sound advice on many from the too-uniform look and feel of many LATEX other topics, from the way to format listings of com- books. I am not too happy with the way the footnotes puter code to the proper style of reacting to your are numbered, with the section number used as a advisor’s comments. Here is the discussion of the prefix (e.g.4.5). Unlike equations, footnotes are not well-known requirement to double space the thesis: usually referenced from other parts of the book, so Many universities still require double-spaced, this prefix looks a bit mannered. However, this is single-sided documents with wide margins. just about my only issue with the typesetting. 
The Double-spacing is by and large looked down books have functional margins with the notes and on in the world of typesetting, but this re- signs; the text is pleasant to read. There are detailed quirement for a PhD thesis has nothing to bibliographies and well constructed indices. do with æsthetics or readability. In England These books are a good addition to any TEX the purpose of the PhD viva is to defend your user’s library; I wonder whether the author would be work (I gather this is not the case in some willing to put them on CTAN and to include them other countries, where the viva is more infor- in the TEX distributions. mal, and the decision to pass or fail you has It will be interesting to see the next volumes in already been made before your viva.). Before the series published. If they are written as well as your viva, paper copies of your thesis are sent these two, the world of TEX textbooks and reference to your examiners. The double spacing and books will have more welcome additions. wide margins provide the examiners room to ⋄ Boris Veytsman write the comments and criticisms they wish Systems Biology School and to raise during the viva, as well as any typo- Computational Materials graphical corrections. Whilst they could write Science Center, MS 6A2, these comments on a separate piece of paper, George Mason University, cross-referencing the page in the thesis, it is Fairfax, VA 22030 more efficient for the comments to actually be borisv (at) lk dot net on the relevant page of the thesis. That way, http://borisv.lk.net

Boris Veytsman TUGboat, Volume 35 (2014), No. 1 115

Book review: Dynamic Documents with R and knitr, by Yihui Xie

Boris Veytsman

Yihui Xie, Dynamic Documents with R and knitr. Chapman & Hall/CRC Press, 2013, 190+xxvi pp. Paperback, US$59.95. ISBN 978-1482203530.

There are several reasons why this book might be of interest to a TEX user. First, LATEX has a prominent place in the book. Second, the book describes a very interesting offshoot of literate programming, a topic traditionally popular in the TEX community. Third, since a number of TEX users work with data analysis and statistics, R could be a useful tool for them.

Since some TUGboat readers are likely not familiar with R, I would like to start this review with a short description of the software. R [1] is a free implementation of the S language (sometimes R is called GNU S). The latter is a language for statistical computations created at Bell Labs during its "Golden Age" of computing, when C, awk, Unix, et al. were developed at this famous institution. S is very convenient for data exploration. For example, consider the dataset iris (included in the base R distribution) containing measurements of 150 flowers of Iris setosa, Iris versicolor and Iris virginica [2]. We may inquire whether petal length and petal width of irises are related. To this end we can plot the data:

plot(Petal.Length ~ Petal.Width,
     data = iris, xlab = "Petal Width, cm",
     ylab = "Petal Length, cm")

[Scatter plot of petal length against petal width for the iris data.]

This plot shows an almost linear dependence between the parameters. We can try a linear fit for these data:

model <- lm(Petal.Length ~ Petal.Width, data = iris)
model$coefficients
## (Intercept) Petal.Width
##       1.084       2.230
summary(model)$r.squared
## [1] 0.9271

The large value of R² = 0.9271 indicates the good quality of the fit. Of course we can replot the data together with the prediction of the linear model:

plot(Petal.Length ~ Petal.Width,
     data = iris, xlab = "Petal Width, cm",
     ylab = "Petal Length, cm")
abline(model)

[The same scatter plot, now with the fitted regression line added.]


We can also study how petal length depends on the species of iris:

boxplot(Petal.Length ~ Species,
        data = iris, xlab = "Species",
        ylab = "Petal Length, cm")

[Box plots of petal length for the species setosa, versicolor and virginica.]

This plot shows a significant difference between the petal lengths of the different species of iris. We could further investigate this difference using appropriate statistical tests, but this is out of scope for this very short introduction. Instead we refer the reader to the many books on S and R (for example, [3, 4]).

While interactive computations and data exploration are indispensable in such research, one often needs a permanent record that can be stored and shared with other people. A saved transcript of a computer session (or output of a batch job) provides such a record, but in a rather imperfect way. From the transcript one can see what we asked the computer and what the computer replied, but the most important parts of the exploration, namely, why did we ask these questions, which hypotheses were tested, and what our conclusions were, remain outside this record. We can add such information in comments to the R code, but it is awkward to express a complex discussion, often heavy with mathematics, using only an equivalent of an old-fashioned typewriter. This is why many commercial interactive computation systems offer so-called "notebook" interfaces, where calculations are interlaced with the more or less well typeset text and figures. Unfortunately these interfaces, being proprietary, cannot be easily integrated with scientific publishing software, and it takes a considerable effort to translate these "notebooks" into papers and reports. The power of free software is the possibility to combine different building blocks into something new, often not foreseen by their original authors. Thus an idea to combine a free statistical engine and a free typesetting engine is a natural one.

This idea is close to that of literate programming [5]. We have a master document which describes our investigation and contains blocks of text, possibly with equations and figures, and blocks of computational code, that can also generate text, equations and figures. As in the conventional literate programming paradigm, this document can be weaved or tangled. Weaving creates a typeset document, while tangling extracts the program (almost) free of comments. However, there is an important difference between the conventional literate programming and the literate programming of data exploration, where the input data are at least as important as the program. In most cases we want to run the program just once and show the results. Therefore weaving becomes a more complex process, involving running the tangled program and inserting the results in the appropriate places of the typeset document. On the other hand, tangling by itself is used more rarely (but is still useful in some cases, for example, to typeset this review, as described below).

The first (and a very successful) attempt to apply the ideas of literate programming to S was the package Sweave [6]. Uwe Ziegenhagen introduced the package to the TEX community in his brilliant talk at TUG 2010 [7]. In Sweave we write a file source.rnw, which looks like a TEX file, but includes special fragments between the markers <<...>> and @ (this notation is borrowed from the Noweb literate programming system [8]). For example, the box plot above can be produced by the following fragment of an .rnw file:

% Boxplot in Sweave syntax
<<fig=TRUE>>=
boxplot(Petal.Length ~ Species,
        data=iris, xlab="Species",
        ylab="Petal Length, cm")
@

We also can "inline" R code using the command \Sexpr, for example,

The large value of
$R^2=\Sexpr{summary(model)$r.squared}$
indicates the good quality of the fit.

Note the different usage of $ inside \Sexpr (a part of the R code) and outside it (TEX math mode delimiter).
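For orientation, a complete if minimal source.rnw might look like the following; this skeleton is only a sketch built around this review's running example (the chunk label and document class are made up, not taken from the book):

\documentclass{article}
\begin{document}
We fit petal length against petal width.
<<fit>>=
model <- lm(Petal.Length ~ Petal.Width, data = iris)
@
The slope of the fit is \Sexpr{round(model$coefficients[2], 2)}.
\end{document}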

When we feed the file source.rnw to Sweave, it runs the R code in the chunks, and replaces these


chunks and \Sexpr macros by the properly formatted results and/or the code itself. This produces a "weaved" file source.tex. We can control the process by the options inside <<...>>. For example, setting there echo=FALSE, we suppress the "echoing" of the source code. We can wrap a fragment into a figure environment and get a result like the one in Figure 1. Alternatively we can use R functions that output LATEX code and, setting the proper options, get a table like Table 1.

[Scatter plot of petal length against petal width, with separate symbols for setosa, versicolor and virginica.]
Figure 1: Petal widths and petal lengths for irises of different species

Table 1: Linear model for iris petal length and width

Parameter   Est.   σ       t      p-value
Intercept   1.08   0.0730  14.8   4.04 × 10^−31
Slope       2.23   0.0514  43.4   4.68 × 10^−86

These figures and tables, as well as inline expressions, can be configured to hide the code, for example for a formal publication. One can write a paper as an .rnw file and submit a PDF or TEX file to a journal without copying and pasting the results of the calculations, eliminating the risk of introducing new errors and typos. The versatility of this approach allows one to create quite varied publications, from a conference poster to a book. On the other hand, if the user does not suppress the code, a nicely formatted and reproducible lab report is produced.

Since Sweave was written, many packages have been devised to extend its capabilities and to add new features to it. At some point the user community felt the need for a refactoring of Sweave with better integration of the new features and a more modular design. The package knitr [9], which lists Yihui Xie as one of its principal authors, provides such refactoring. The name "knitr" is a play on "Sweave". The knitr functions performing weaving and tangling are called "knit" and "purl" correspondingly.

An important feature of knitr is the closeness of its syntax to that of Sweave. Up to version 1.0 knitr supported full compatibility with Sweave, but even today most Sweave code runs without problems in knitr (the function Sweave2knitr can help in the remaining cases). For example, the code producing the iris boxplot above becomes the following in knitr:

% Boxplot in knitr syntax
<<>>=
boxplot(Petal.Length ~ Species,
        data=iris, xlab="Species",
        ylab="Petal Length, cm")
@

The only difference between this code and Sweave code is the absence of the fig=TRUE option, which is not needed for the new package. However, the Sweave code above still works in knitr. This makes the switch to the new package rather easy.

It should be noted that some of the features of knitr discussed below are also available in Sweave when using add-on packages; knitr offers them "out of the box" and better integrates them.

For example, knitr has two dozen or so different graphics formats (or "devices") to save the graphics. One very interesting device is tikz, which can be used to add TEX annotations to the plots.
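For instance, asking for the tikz device is just a chunk option; the chunk below is an illustrative sketch (the label is made up), with a TEX math fragment passed straight into the plot title:

<<scatter, dev='tikz'>>=
plot(Petal.Length ~ Petal.Width, data = iris,
     xlab = "Petal width, cm", ylab = "Petal length, cm",
     main = "$\\mathit{Iris}$ petal dimensions")
@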

Another useful feature of knitr is the option of caching the computation results. R calculations can often take a significant time. The full recompilation of all chunks after a mere wording change in the TEX part may be too slow for a user. This problem is addressed by the options cache and dependson in knitr. They instruct R to recalculate only the modified chunks and the chunks that depend on them (a clever hashing of the code is used to determine whether a chunk was modified between the runs).
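In chunk headers this looks roughly like the following (the chunk labels are invented for illustration; cache and dependson are the documented option names):

<<model, cache=TRUE>>=
model <- lm(Petal.Length ~ Petal.Width, data = iris)
@
<<goodness, cache=TRUE, dependson='model'>>=
summary(model)$r.squared
@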

The typesetting of the input code in knitr is closer to the requirements of literate programming than the simple Sweave output: knitr can recognize R language elements like keywords, comments, variables, strings etc., and highlight them according to user's specifications.

While these features are nice, the real selling points of knitr are its flexibility and modularization. The package has many options for fine-tuning the output. The modular design of the package


The modular design of the package makes adding new options just a matter of writing a new "hook" (an R function called when processing a chunk). Since the chunk headers in knitr, unlike Sweave, are evaluated as R expressions, one can write quite sophisticated hooks.

The modularity of knitr allowed the authors to introduce new typesetting engines. Besides LATEX, the package can work with other markup languages, e.g. HTML, Markdown and reStructuredText.

The flexibility and ease of customization of knitr are especially useful for book publishing. Yihui Xie lists several books created with knitr. Barbara Beeton [10] reports a positive experience with AMS publishing such books.

This review turned out to be a test of the flexibility of knitr. At this point an astute reader may have already guessed that it was written as an .rnw file. However, that is not the full story. TUGboat reviews are published in the journal, and the HTML versions are posted on the web at http://www.tug.org/books. As a matter of policy, the HTML version is automatically generated from the same source as the hard copy. This created a certain challenge for this review. First, the hard copy uses plots in PDF format, while the Web version uses them in PNG format. Second, due to the differences in column widths, R code should be formatted differently for the print and the Web versions. Third, code highlighting of R chunks was done using font weights for the print version and using colors for the Web version. The following workflow was used: (1) The review was written as an .rnw file. (2) A .tex file was knit and an .R file was purled from the .rnw source. (3) The .tex file was copy-edited by the TUGboat editors. (4) A .pdf output for the journal was produced from this .tex file. (5) A special .Rhtml file was produced from the same .tex file with tex4ht; this file included commands that read the .R program and inserted the code chunks in the proper positions. (6) This .Rhtml file was knit again to produce the HTML output and the images for the Web. All this but the actual writing and copy-editing was done by scripts without human intervention.

The package knitr is being actively developed, and many new features are being added. I would like to mention a feature that I miss in the package. The default graphics device, PDF, uses fonts different from the fonts of the main document, and does not allow TEX on the plots. While the tikz device is free of this limitation, it is very slow and strains TEX memory capacity when producing large plots. I think the trick used by the Gnuplot pslatex terminal [11] might be very useful. This terminal creates two files: a TEX file with the textual material put in the proper places using the picture environment, and a graphics file with the graphical material, included through the \includegraphics command. This terminal is much faster than tikz and more flexible than pdf. Moreover, since the TEX file is evaluated in the context of the main document, one can include \ref and other LATEX commands in the textual part.

To use knitr on the most basic level it is enough to know two simple rules and one option: (1) put R code between <<...>> and @; (2) use \Sexpr{code} for inline R code; (3) use echo=FALSE to suppress echoing the input if necessary. However, since knitr has many options and is highly customizable, one might want to learn more about it to use it efficiently. This justifies the existence of books like the one by Yihui Xie. While there are many free sources of information about knitr, including its manual and the author's site (http://yihui.name/knitr), there are important reasons why a user would consider investing about $60 in the book.

The book provides a systematic description of the package, including its concepts, design principles, and philosophy. It also has many examples, well-thought-out advice, and useful tips and tricks.

Here is just one of the techniques I learned from the book. There are two ways of presenting calculation results: using the option echo=TRUE (the default) we typeset the R code, while using the option echo=FALSE we suppress it. The inclusion of the code has both advantages and disadvantages: we tell the reader more with the code included, but we risk overwhelming and confusing the audience with too much detail. One way to solve this problem is to put only the results in the main text, and show the code itself in an appendix. Of course we do not want to copy and paste the same code twice; a computer can take care of this much better. The book describes how to make this automatic by putting in the appendix these lines:

  % List the code of all chunks
  <<...>>=
  @

There are many other useful tips on the pages of this book. Some of them can be found at http://yihui.name/knitr/ and http://tex.stackexchange.com/. Having all these tips collected in a book saves time, however, and reading the book helps to learn knitr and the general interaction of TEX and R.
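One plausible way to spell out such an appendix chunk is sketched below. This assumes knitr's ref.label chunk option and its all_labels() helper; it is my reconstruction of the idea, not necessarily the exact lines from the book.

  \appendix
  \section*{Code listing}
  % Re-typeset the code of every chunk in the document
  % without evaluating it a second time
  <<all-code, ref.label=all_labels(), echo=TRUE, eval=FALSE>>=
  @

Because eval=FALSE is set, the chunks are printed once more in the appendix but not executed again.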

The book is well written. It has introductory material useful for novices as well as advice for more seasoned users, all explained in conversational English without unnecessary technical jargon. The book describes several integrated work environments (RStudio, LYX, Emacs/ESS) and the interaction of knitr with popular LATEX packages such as beamer and listings. It discusses in depth package options for formatting input and output, graphics, caching, code reuse, cross-referencing and other purposes. Besides "traditional" applications such as publishing books and reports, the author describes the use of knitr for writing blog entries, student homework, etc. The book covers inclusion of fragments of code written in languages other than R (Python, Perl, C/C++, etc.), dynamic plots with the animate LATEX package, macro preprocessing and many other topics.

The book itself is written, of course, in knitr and LATEX (with LYX as the integrated environment). It is well typeset and nicely printed. Regrettably, I find its index rather inadequate: it has only two pages and omits many key concepts. A book like this should at a minimum have an alphabetic list of all package options. However, other than this, I like the way the book is written and published.

While I have been using Sweave and then knitr for several years, I still learned many new useful things from the book. Thus I think it is worth investing money and reading time.

I think the book deserves a place on the bookshelves of both new and experienced R and TEX users.

References

[1] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2013. ISBN 3-900051-07-0.

[2] Edgar Anderson. The irises of the Gaspé Peninsula. Bulletin of the American Iris Society, 59:2–5, 1935.

[3] Peter Dalgaard. Introductory Statistics with R. Statistics and Computing. Springer, New York, second edition, 2008.

[4] W. N. Venables and B. D. Ripley. Modern Applied Statistics with S. Statistics and Computing. Springer, New York, fourth edition, 2010.

[5] Donald Ervin Knuth. Literate Programming. Number 27 in CSLI Lecture Notes. Center for the Study of Language and Information, Stanford, CA, 1992.

[6] Friedrich Leisch. Sweave: Dynamic generation of statistical reports using literate data analysis. In Wolfgang Härdle and Bernd Rönz, editors, Compstat 2002 — Proceedings in Computational Statistics, pages 575–580. Physica Verlag, Heidelberg, 2002. ISBN 3-7908-1517-9.

[7] Uwe Ziegenhagen. Dynamic reporting with R/Sweave and LATEX. TUGboat, 31(2):189–192, 2010. http://tug.org/TUGboat/tb31-2/tb98ziegenhagen.pdf.

[8] Norman Ramsey. Literate programming simplified. IEEE Software, 11(5):97–105, 1994. http://www.cs.tufts.edu/~nr/noweb/.

[9] Alastair Andrew, Alex Zvoleff, Brian Diggs, Cassio Pereira, Hadley Wickham, Heewon Jeon, Jeff Arnold, Jeremy Stephens, Jim Hester, Joe Cheng, Jonathan Keane, J.J. Allaire, Johan Toloe, Kohske Takahashi, Michel Kuhlmann, Nacho Caballero, Nick Salkowski, Noam Ross, Ramnath Vaidyanathan, Richard Cotton, Romain François, Sietse Brouwer, Simon de Bernard, Taiyun Wei, Thibaut Lamadon, Tom Torsney-Weir, Trevor Davis, Weicheng Zhu, Wush Wu, and Yihui Xie. knitr: A general-purpose package for dynamic report generation in R, 2013. R package version 1.5.

[10] Barbara Beeton. Private communication, 2014.

[11] Philipp K. Janert. Gnuplot in Action. Understanding Data with Graphs. Manning Publications Co., 2009.

⋄ Boris Veytsman
  Systems Biology School and Computational Materials Science Center, MS 6A2
  George Mason University
  Fairfax, VA 22030
  borisv (at) lk dot net
  http://borisv.lk.net


TUG financial statements for 2013

Karl Berry, TUG treasurer

The financial statements for 2013 have been reviewed by the board but have not been audited. As a US tax-exempt organization, TUG's annual information returns are publicly available on our web site: http://tug.org/tax-exempt.

Revenue (income) highlights

Membership dues revenue was down about 4% in 2013 compared to 2012; product sales were also down, while contributions were substantially up. Services income is a new category, reflecting TUG's accepting payments on behalf of font workshops and other TEX-related projects. Interest and advertising income were slightly down. Overall, 2013 income was down 3%.

Cost of Goods Sold and Expenses highlights, and the bottom line

Payroll, TUGboat, DVD production, postage, and other office overhead continue to be the major expense items. All were nearly as budgeted; overall, 2013 expenses were up about 6% from 2012.

The "prior year adjustment" compensates for estimates made in closing the books for the prior year; in 2013 the total adjustment was positive: $194.

The bottom line for 2013 was positive: approximately $2,000.

Balance sheet highlights

TUG's end-of-year asset total is down around $3,000 (2%) in 2013 compared to 2012.

The Committed Funds are administered by TUG specifically for designated projects: LATEX, CTAN, the TEX development fund, and so forth. Incoming donations have been allocated accordingly and are disbursed as the projects progress (disbursements in 2013 substantially account for the overall lower asset balance). TUG charges no overhead for administering these funds.

The Prepaid Member Income category is member dues that were paid in earlier years for the current year (and beyond). Most of this liability (the 2013 portion) was converted into regular Membership Dues in January of 2013.

The payroll liabilities are for 2013 state and federal taxes due January 15, 2014.

Summary

The membership fees remained unchanged in 2013; the last increase was in 2010. TUG remains financially solid as we enter another year.

TUG 12/31/2013 (vs 2012) Revenue, Expense

                                   Jan-Dec 13   Jan-Dec 12
  Ordinary Income/Expense
    Income
      Membership Dues                  94,800       98,725
      Product Sales                     7,498       11,351
      Contributions Income              9,126        6,821
      Annual Conference                              1,222
      Interest Income                     625          832
      Advertising Income                  410          490
      Services Income                   3,493
    Total Income                      115,952      119,441
    Cost of Goods Sold
      TUGboat Prod/Mailing             24,850       21,674
      Software Production/Mailing       3,038        2,685
      Postage/Delivery - Members        2,923        2,566
      Lucida Sales Accrual B&H          2,875        4,835
      Member Renewal                      417          444
    Total COGS                         34,103       32,204
    Gross Profit                       81,849       87,237
    Expense
      Contributions made by TUG         3,324        2,000
      Office Overhead                  12,121       12,804
      Payroll Exp                      64,486       65,375
      Interest Expense                     94
    Total Expense                      80,025       80,179
    Net Ordinary Income                 1,824        7,058
    Other Income
      Prior year adjust                   194          222
    Total Other Income                    194          222
  Net Income                            2,018        7,280

TUG 12/31/2013 (vs 2012) Balance Sheet

                                   Dec 31, 13   Dec 31, 12
  ASSETS
    Current Assets
      Total Checking/Savings          186,696      187,506
      Accounts Receivable                 180        2,496
    Total Current Assets              186,876      190,002
  TOTAL ASSETS                        186,876      190,002
  LIABILITIES & EQUITY
    Liabilities
      Committed Funds                  27,712       31,384
      TUG conference                                  -250
      Administrative Services           4,879
      Deferred contributions               90
      Prepaid member income             4,550       11,315
      Payroll Liabilities               1,100        1,024
    Total Current Liabilities          38,330       43,473
  TOTAL LIABILITIES                    38,330       43,473
    Equity
      Unrestricted                    146,529      139,249
      Net Income                        2,017        7,280
    Total Equity                      148,546      146,529
  TOTAL LIABILITIES & EQUITY          186,876      190,002

TUG Institutional Members

American Mathematical Society, Providence, Rhode Island
Aware Software, Inc., Midland Park, New Jersey
Center for Computing Sciences, Bowie, Maryland
CSTUG, Praha, Czech Republic
IBM Corporation, T J Watson Research Center, Yorktown, New York
Institute for Defense Analyses, Center for Communications Research, Princeton, New Jersey
Marquette University, Department of Mathematics, Statistics and Computer Science, Milwaukee, Wisconsin
Masaryk University, Faculty of Informatics, Brno, Czech Republic
MOSEK ApS, Copenhagen, Denmark
New York University, Academic Computing Facility, New York, New York
Springer-Verlag Heidelberg, Heidelberg, Germany
StackExchange, New York City, New York
Stanford University, Computer Science Department, Stanford, California
Stockholm University, Department of Mathematics, Stockholm, Sweden
University College, Cork, Computer Centre, Cork, Ireland
Université Laval, Ste-Foy, Québec, Canada
University of Ontario, Institute of Technology, Oshawa, Ontario, Canada
University of Oslo, Institute of Informatics, Blindern, Oslo, Norway
University of Wisconsin, Biostatistics & Medical Informatics, Madison, Wisconsin
VTEX UAB, Vilnius, Lithuania

TEX Consultants

The information here comes from the consultants themselves. We do not include information we know to be false, but we cannot check out any of the information; we are transmitting it to you as it was given to us and do not promise it is correct. Also, this is not an official endorsement of the people listed here. We provide this list to enable you to contact service providers and decide for yourself whether to hire one. TUG also provides an online list of consultants at http://tug.org/consultants.html. If you'd like to be listed, please see that web page.

Aicart Martinez, Mercè
Tarragona 102 4o 2a
08015 Barcelona, Spain
+34 932267827
Email: m.aicart (at) ono.com
Web: http://www.edilatex.com
We provide, at reasonable low cost, LATEX or TEX page layout and typesetting services to authors or publishers world-wide. We have been in business since the beginning of 1990. For more information visit our web site.

Dangerous Curve
PO Box 532281
Los Angeles, CA 90053
+1 213-617-8483
Email: typesetting (at) dangerouscurve.org
Web: http://dangerouscurve.org/tex.html
We are your macro specialists for TEX or LATEX fine typography specs beyond those of the average LATEX macro package. If you use XeTEX, we are your microtypography specialists. We take special care to typeset mathematics well.
Not that picky? We also handle most of your typical TEX and LATEX typesetting needs.
We have been typesetting in the commercial and academic worlds since 1979.
Our team includes Masters-level computer scientists, journeyman typographers, graphic designers, letterform/font designers, artists, and a co-author of a TEX book.

Latchman, David
4113 Planz Road Apt. C
Bakersfield, CA 93309-5935
+1 518-951-8786
Email: david.latchman (at) texnical-designs.com
Web: http://www.texnical-designs.com
LATEX consultant specializing in: the typesetting of books, manuscripts, articles, Word document conversions, as well as creating the customized packages to meet your needs. Call or email to discuss your project or visit my website for further details.

Peter, Steve
295 N Bridge St.
Somerville, NJ 08876
+1 732 306-6309
Email: speter (at) mac.com
Specializing in foreign language, multilingual, linguistic, and technical typesetting using most flavors of TEX, I have typeset books for Pragmatic Programmers, Oxford University Press, Routledge, and Kluwer, among others, and have helped numerous authors turn rough manuscripts, some with dozens of languages, into beautiful camera-ready copy. In addition, I've helped publishers write, maintain, and streamline TEX-based publishing systems. I have an MA in Linguistics from Harvard University and live in the New York metro area.

Shanmugam, R.
No. 38/1 (New No. 65), Veerapandian Nagar, Ist St.
Choolaimedu, Chennai-600094, Tamilnadu, India
+91 9841061058
Email: rshanmugam92 (at) gmail.com
As a Consultant, I provide consultation, training, and full service support to individuals, authors, typesetters, publishers, organizations, institutions, etc. I support leading BPO/KPO/ITES/Publishing companies in implementing latest technologies with high level automation in the field of Typesetting/Prepress, ePublishing, XML2PAGE, WEBTechnology, DataConversion, Digitization, Cross-media publishing, etc., with highly competitive prices. I provide consultation in building business models & technology to develop your customer base and community, streamlining processes in getting ROI on our workflow, New business opportunities through improved workflow, Developing eMarketing/E-Business Strategy, etc. I have been in the field BPO/KPO/ITES, Typesetting, and ePublishing for nearly 20 years, and handled various projects. I am a software consultant with Master's Degree. I have sound knowledge in TEX, LATEX2ε, XMLTEX, Quark, InDesign, XML, MathML, eBooks, ePub, Mobi, iBooks, DTD, XSLT, XSL-FO, Schema, ebooks, OeB, etc.

Sievers, Martin
Klaus-Kordel-Str. 8, 54296 Trier, Germany
+49 651 4936567-0
Email: info (at) schoenerpublizieren.com
Web: http://www.schoenerpublizieren.com
As a mathematician with ten years of typesetting experience I offer TEX and LATEX services and consulting for the whole academic sector (individuals, universities, publishers) and everybody looking for a high-quality output of his documents. From setting up entire book projects to last-minute help, from creating individual templates, packages and citation styles (BibTEX, biblatex) to typesetting your math, tables or graphics: just contact me with information on your project.

Sofka, Michael
8 Providence St.
Albany, NY 12203
+1 518 331-3457
Email: michael.sofka (at) gmail.com
Skilled, personalized TEX and LATEX consulting and programming services.
I offer over 25 years of experience in programming, macro writing, and typesetting books, articles, newsletters, and theses in TEX and LATEX: Automated document conversion; Programming in Perl, C, C++ and other languages; Writing and customizing macro packages in TEX or LATEX; Generating custom output in PDF, HTML and XML; Data format conversion; Databases.
If you have a specialized TEX or LATEX need, or if you are looking for the solution to your typographic problems, contact me. I will be happy to discuss your project.

Veytsman, Boris
46871 Antioch Pl.
Sterling, VA 20164
+1 703 915-2406
Email: borisv (at) lk.net
Web: http://www.borisv.lk.net
TEX and LATEX consulting, training and seminars. Integration with databases, automated document preparation, custom LATEX packages, conversions and much more. I have about eighteen years of experience in TEX and three decades of experience in teaching & training. I have authored several packages on CTAN, published papers in TEX related journals, and conducted several workshops on TEX and related subjects.

Young, Lee A.
127 Kingfisher Lane
Mills River, NC 28759
+1 828 435-0525
Email: leeayoung (at) morrisbb.net
Web: http://www.thesiseditor.net
Copyediting your .tex manuscript for readability and mathematical style by a Harvard Ph.D. Your .tex file won't compile? Send it to me for repair. Experience: edited hundreds of ESL journal articles, economics and physics textbooks, scholarly monographs, LATEX manuscripts for the Physical Review; career as professional, published physicist.

Calendar

2014

Apr 11 – 14    DANTE Frühjahrstagung 2014 (25th anniversary of DANTE e.V.) and 50th meeting, Universität Heidelberg, Germany. www.dante.de/events/dante2014.html

Apr 30 – May 4 BachoTEX 2014: 22nd BachoTEX Conference, "What can typography gain from electronic media?", Bachotek, Poland. www.gust.org.pl/bachotex/2014

Jun 9 – Aug 1  Rare Book School, University of Virginia, Charlottesville, Virginia. Many one-week courses on type, bookmaking, printing, and related topics. www.rarebookschool.org/schedule

Jun 23 – 26    Book history workshop, École de l'institut d'histoire du livre, Lyon, France. ihl.enssib.fr

Jul 2 – 4      International Society for the History and Theory of Intellectual Property (ISHTIP), 6th Annual Workshop, "The Instability of Intellectual Property", Uppsala, Sweden. www.ishtip.org/?p=596

Jul 8 – 12     Digital Humanities 2014, Alliance of Digital Humanities Organizations, Lausanne, Switzerland. dh2014.org, adho.org/conference

TUG 2014, Mark Spencer Hotel, Portland, Oregon
  Jul 27       Evening reception
  Jul 28 – 30  The 35th annual meeting of the TEX Users Group. Presentations covering the TEX world. tug.org/tug2014
  Jul 28       LATEX workshop (concurrent)

Jul 30 – Aug 3 TypeCon 2014: "Capitolized", Washington, DC. typecon.com

Aug 4 – 8      Balisage: The Markup Conference, Washington, DC. www.balisage.net

Aug 10 – 14    SIGGRAPH 2014, Vancouver, British Columbia. s2014.siggraph.org

Aug 11         TUGboat 35:2, submission deadline (TUG 2014 proceedings)

Sep 8 – 9      "Forms and formats: Experimenting with print, 1695-1815", Centre for the Study of the Book, Bodleian Library, University of Oxford, UK. www.bodleian.ox.ac.uk/csb/community

Sep 8 – 13     8th International ConTEXt Meeting, Bassenge, Belgium. meeting.contextgarden.net/2014

Sep 14 – 19    XML Summer School, St Edmund Hall, Oxford University, Oxford, UK. xmlsummerschool.com

Sep 16 – 19    ACM Symposium on Document Engineering, Fort Collins, Colorado. www.doceng2014.org

Sep 17 – 21    Association Typographique Internationale (ATypI) annual conference. Theme: "Point Counter Point", Barcelona, Spain. www.atypi.org

Sep 17 – 21    SHARP 2014, "Religions of the Book", Society for the History of Authorship, Reading & Publishing, Antwerp, Belgium. www.sharpweb.org

Oct 3          TUGboat 35:3, submission deadline.

Oct 3 – 5      Oak Knoll Fest XVIII, and Fine Press Book Association annual meeting, New Castle, Delaware. www.oakknoll.com/fest

Nov 8 – 9      The Twelfth International Conference on Books, Publishing, and Libraries, "Disruptive Technologies and the Evolution of Book Publishing and Library Development", Simmons College, Boston, Massachusetts. booksandpublishing.com/the-conference

Status as of 25 March 2014

For additional information on TUG-sponsored events listed here, contact the TUG office (+1 503 223-9994, fax: +1 815 301-3568, e-mail: [email protected]). For events sponsored by other organizations, please use the contact address provided.

A combined calendar for all user groups is online at texcalendar.dante.de. Other calendars of typographic interest are linked from tug.org/calendar.html.

The 35th Annual Meeting of the TEX Users Group
July 28–30, 2014
Mark Spencer Hotel, Portland, Oregon, USA
http://tug.org/tug2014
[email protected]

April 15 — bursary application deadline
May 16 — presentation proposal deadline
May 23 — early bird registration deadline
June 6 — preprint submission deadline
June 30 — hotel reservation discount deadline (but book earlier!)
July 28–30 — conference
July 28 — concurrent LATEX workshop
August 11 — deadline for final papers for proceedings

Sponsored by the TEX Users Group and DANTE e.V.

TUGBOAT Volume 35 (2014), No. 1

Introductory
  3  Barbara Beeton / Editorial comments
       • typography and TUGboat news
 16  Jaime Gaspar / Does not suffice to run latex a finite number of times to get cross-references right
       • simple counterexample
  9  James Hunt / Making lists: A journey into unknown grammar
       • prescriptions for correct usage of itemized, enumerated, definition lists
 34  Ivan Pagnossin / Boxes and more boxes
       • introduction to box concepts and usage in LATEX
  2  Steve Peter / Ab epistulis
       • what publishers want
 91  Ludger Suarez–Burgoa / Scientific documents written by novice researchers
       • 20 years of writing scientific documents, from the typewriter to a new LATEX class
 79  Mari Voipio / Entry-level MetaPost 4: Artful lines
       • line widths, joins, caps, dashes, arrows, default options, grids

Intermediate
112  Karl Berry / The treasure chest
       • new CTAN packages, December 2013–March 2014
 44  Massimiliano Dominici / An overview of Pandoc
       • conversion from Markdown to many output formats, including LATEX
 22  LATEX Project Team / LATEX3 news, issue 9
       • Hiatus?; expl3 in the community; logo; recent activity; what can you do for LATEX3?
 17  Linus Romer / Fetamont: An extended logo typeface
       • the METAFONT logo in many variants, anchors and kerning classes in METAFONT, history
 83  Frederik Schlupkothen / HTML to LATEX transformation
       • introduces a mapping from HTML commands to LATEX, using several packages
 27  Thomas Thurnherr / Introduction to presentations with beamer
       • sequence of examples for making slides for presentations
 36  Peter Wilson / Glisterings
       • glyphs; long labels
 31  Joseph Wright / The beamer class: Controlling overlays
       • practical examples for revealing slides step by step

Intermediate Plus
 61  Antoine Bossard and Takeyuki Nagao / ModernDvi: A high quality rendering and modern DVI viewer
       • a multi-threading application available at the Windows Store
 82  Hans Hagen / drawdot in MetaPost: A bug, a fix
       • bug report from Knuth: drawdot should not draw an open dot
 39  Michiel Helvensteijn / The pkgloader and lt3graph packages: Toward simple and powerful package management for LATEX
       • addressing LATEX package conflicts in a general way, and general graph representation
  5  Donald Knuth / The TEX tuneup of 2014
       • the 2014 updates to TEX, METAFONT, etc.
 51  Juan Montijano, Mario Pérez, Luis Rández and Juan Luis Varona / Numerical methods with LuaLATEX
       • examples of mathematical computations and graphics eased by LuaTEX

Advanced
 69  Hans Hagen / Selection in PDF viewers and a LuaTEX bug
       • selection of text vs. display math leads to uncovering a bug in new primitives
 57  Taco Hoekwater / Parsing PDF content streams with LuaTEX
       • overview of the new pdfparser library in LuaTEX
 71  Alexander Shibakov / Parsers in TEX and using CWEB for general pretty-printing
       • using bison and flex to implement parsers and scanners in TEX
 99  David Walden / Macro memories, 1964–2013
       • informal musings about macros in TEX, M4, the early ARPANET, etc.
Reports and notices
 16  Jean-Michel Hufflen / In memoriam Jean-Pierre Drucbert
111  Abstracts: Eutypon 30–31 (October 2013); Die TEXnische Komödie 1/2014
113  Boris Veytsman / Book reviews: LATEX for Complete Novices and Using LATEX to Write a PhD Thesis, by Nicola Talbot
115  Boris Veytsman / Book review: Dynamic Documents with R and knitr, by Yihui Xie
       • including some tutorial material on data exploration and plotting
120  Karl Berry / TUG financial statements for 2013
121  Institutional members
121  TEX consulting and production services
123  Calendar
124  TUG 2014 announcement