
TUGBOAT

Volume 35, Number 3 / 2014

General Delivery
  Ab epistulis / Steve Peter
  Editorial comments / Barbara Beeton
    TeX entomology; An alternative to tangle and weave; More Lucida fonts; More from Chuck Bigelow about Lucida; Erratum: “Online Publishing via pdf2htmlEX”, TUGboat 34:3; Peter Flynn’s Formatting Information updated; Klaus Peters, 1937–2014; Other items worth a look: bibliographies; Geographical trivia: Kolophon
  A footnote about ‘Oh, oh, zero’ / Don Knuth
  Twenty Questions for Donald Knuth (on the occasion of the ePublication of TAOCP)

Letters
  A letter on the persistence of (e)books / Charles Bigelow

LaTeX
  LaTeX document class options / Thomas Thurnherr
  How to influence the position of float environments like figure and table in LaTeX? / Frank Mittelbach
  Placing a full-width insert at the bottom of two columns / Barbara Beeton
  biblatex variations / Ulrike Fischer
  Every LaTeX document brings new programming issues / David Walden
  Glisterings: Lining up / Peter Wilson

Resources
  CTAN goes multi-lingual: Additional language support for the portal / Gerd Neugebauer

Fonts
  Obyknovennaya Novaya (Ordinary New Face) in / Basil Solomykov

Multilingual Document Processing
  A simple Arabic typesetting system for mixed Latin/Arabic documents: ḍād / Yannis Haralambous

Software & Tools
  Visual editing (in a specialized case): prerex / Bob Tennent
  l3build: A modern Lua test suite for TeX programming / Frank Mittelbach, Will Robertson, LaTeX3 team
  MetaPost path resolution isolated / Taco Hoekwater

Macros
  Typeset MMIX programs with TeX / Udo Wermuth

Bibliographies
  A Citation Style Language (CSL) workshop / Daniel Stender

Hints & Tricks
  The treasure chest / Karl Berry

Book Reviews
  Book review: Practical LaTeX, by George Grätzer / William Adams
  Book review: Apprendre à programmer en TeX, by Christian Tellechea / Jacques André
  Book review: The Imitation Game, by Jim Ottaviani and Leland Purvis / Michael Berry
  Book review: Let’s Learn LaTeX, by S. Parthasarathy / Nicola Talbot

Abstracts
  Die TeXnische Komödie: Contents of issues 2–3/2014
  Les Cahiers GUTenberg: Contents of issue 57 (2012)

TUG Business
  TUGboat editorial information
  TeX Development Fund 2013 report
  TUG 2013 election
  TUG membership form
  TUG institutional members

Advertisements
  TeX consulting and production services

News
  TUG 2015 announcement
  Calendar

TeX Users Group

TUGboat (ISSN 0896-3207) is published by the TeX Users Group.

Memberships and Subscriptions

2015 dues for individual members will be as follows:
  Regular members: $105.
  Special rate: $75.

The special rate is available to students, seniors, and citizens of countries with modest economies, as detailed on our web site. Also, anyone joining or renewing before March 31 receives a $20 discount.

Membership in the TeX Users Group is for the calendar year, and includes all issues of TUGboat for the year in which membership begins or is renewed, as well as software distributions and other benefits. Individual membership is open only to named individuals, and carries with it such rights and responsibilities as voting in TUG elections. For membership information, visit the TUG web site.

Also, (non-voting) TUGboat subscriptions are available to organizations and others wishing to receive TUGboat in a name other than that of an individual. The subscription rate is $110 per year, including air mail delivery.

Institutional Membership

Institutional membership is primarily a means of showing continuing interest in and support for both TeX and the TeX Users Group. It also provides a discounted membership rate, site-wide electronic access, and other benefits. For further information, see http://tug.org/instmem.html or contact the TUG office.

Trademarks

Many trademarked names appear in the pages of TUGboat. If there is any question about whether a name is or is not a trademark, prudence dictates that it should be treated as if it is. The following list of trademarks which commonly appear in TUGboat should not be considered complete.
  TeX is a trademark of American Mathematical Society.
  METAFONT is a trademark of Addison-Wesley Inc.
  PostScript is a trademark of Adobe Systems, Inc.

Board of Directors

  Donald Knuth, Grand Wizard of TeX-arcana†
  Steve Peter, President∗
  Jim Hefferon∗, Vice President
  Karl Berry∗, Treasurer
  Susan DeMeritt∗, Secretary
  Barbara Beeton
  Kaja Christiansen
  Michael Doob
  Steve Grathwohl
  Taco Hoekwater
  Klaus Höppner
  Ross Moore
  Cheryl Ponchin
  Arthur Reutenauer
  Philip Taylor
  Boris Veytsman
  David Walden
  Raymond Goucher, Founding Executive Director†
  Hermann Zapf, Wizard of Fonts†

  ∗ member of executive committee
  † honorary

See http://tug.org/board.html for a roster of all past and present board members, and other official positions.

Addresses
  TeX Users Group
  P.O. Box 2311
  Portland, OR 97208-2311
  U.S.A.

Telephone
  +1 503 223-9994

Fax
  +1 815 301-3568

Web
  http://tug.org/
  http://tug.org/TUGboat/

Electronic Mail (Internet)
  General correspondence, membership, subscriptions: [email protected]
  Submissions to TUGboat, letters to the Editor: [email protected]
  Technical support for TeX users: [email protected]
  Contact the Board of Directors: [email protected]

Copyright 2014 TeX Users Group.

Copyright to individual articles within this publication remains with their authors, so the articles may not be reproduced, distributed or translated without the authors’ permission.

For the editorial and other material not ascribed to a particular author, permission is granted to make and distribute verbatim copies without royalty, in any medium, provided the copyright notice and this permission notice are preserved.

Permission is also granted to make, copy and distribute translations of such editorial material into another language, except that the TeX Users Group must approve translations of this permission notice itself. Lacking such approval, the original English permission notice must be included.

[printing date: October 2014]
Printed in U.S.A.

    The first book ever printed in Europe–heavy, luxurious, pungent and creaky–does not read particularly well on an iPhone.
        Simon Garfield
        Just My Type: A book about fonts

COMMUNICATIONS OF THE TEX USERS GROUP EDITOR BARBARA BEETON

VOLUME 35, NUMBER 3 • 2014 PORTLAND • OREGON • U.S.A. 230 TUGboat, Volume 35 (2014), No. 3

TUGboat editorial information

This regular issue (Vol. 35, No. 3) is the third and last issue of the 2014 volume year.

TUGboat is distributed as a benefit of membership to all current TUG members. It is also available to non-members in printed form through the TUG store (http://tug.org/store), and online at the TUGboat web site, http://tug.org/TUGboat. Online publication to non-members is delayed up to one year after print publication, to give members the benefit of early access.

Submissions to TUGboat are reviewed by volunteers and checked by the Editor before publication. However, the authors are still assumed to be the experts. Questions regarding content or accuracy should therefore be directed to the authors, with an information copy to the Editor.

Submitting items for publication

Proposals and requests for TUGboat articles are gratefully accepted. Please submit contributions by electronic mail to [email protected].

The first issue for 2015 will be a regular issue, with a deadline of March 6. The second 2015 issue will be the proceedings of TUG’15 (http://tug.org/tug2015); the deadline for receipt of final papers is July 31. The third issue deadline is September 25.

The TUGboat style files, for use with plain TeX and LaTeX, are available from CTAN and the TUGboat web site. We also accept submissions using ConTeXt. Deadlines, tips for authors, and other information:
  http://tug.org/TUGboat/location.html

Effective with the 2005 volume year, submission of a new manuscript implies permission to publish the article, if accepted, on the TUGboat web site, as well as in print. Thus, the physical address you provide in the manuscript will also be available online. If you have any reservations about posting online, please notify the editors at the time of submission and we will be happy to make special arrangements.

TUGboat editorial board
  Barbara Beeton, Editor-in-Chief
  Karl Berry, Production Manager
  Boris Veytsman, Associate Editor, Book Reviews

Production team
  William Adams, Barbara Beeton, Karl Berry, Kaja Christiansen, Robin Fairbairns, Robin Laakso, Steve Peter, Michael Sofka, Christina Thiele

TUGboat advertising

For advertising rates and information, including consultant listings, contact the TUG office, or see:
  http://tug.org/TUGboat/advertising.html

Ab Epistulis

Steve Peter

As summer fades to fall here in the northern hemisphere, contemplation strikes. It seems to be a time of looking forward as we prepare for the long winter ahead, and as it is once again election season in the US.

2015 is an election year for TUG. Several director seats will be up for election, as well as the office of president. Jim Hefferon, long-time TUG board member and current vice-president, has expressed his intent to run for president, which I am happy to support. Of course, anyone interested in serving is welcome to run for a board position or president.

The previous issue of TUGboat contained the proceedings of the Portland conference of 2014. The innovations coming from TUG members continue to amaze me. TUG 2015 will be held in Darmstadt, Germany, July 20–22, 2015. Keep tuned to this space and the upcoming electronic newsletters as we begin planning for this exciting meeting.

TUGboat 34:2, which includes two notable articles by Chuck Bigelow, is now publicly available on the TUG website (at http://tug.org/TUGboat/Contents/contents34-2.html). Early online access to TUGboat issues is one of the benefits of being a TUG member. Printed copies are available in the TUG store at http://tug.org/store/#tugboat.

Board member Boris Veytsman continues to write prolifically. Now online at the TUG website are two new book reviews, covering Design Museum Fifty Typefaces That Changed The World, by John Walters, and The Imitation Game, by Jim Ottaviani & Leland Purvis. For these and many more reviews, see http://tug.org/books/#reviews.

Until next time. Happy TeXing!

  ⋄ Steve Peter
    Press
    president (at) tug dot org
    http://tug.org/TUGboat/Pres

Editorial comments

Barbara Beeton

TeX entomology

For the past many years, I have been listed on Don Knuth’s TeX web page as his official collector of bugs (see the “Errata” section of http://www-cs-faculty.stanford.edu/~uno/abcde.html). This is about to change.

The next review is scheduled for 2020, and it’s prudent for someone younger to be the bearer of this responsibility. By unanimous consent, my successor will be Karl Berry (already the bearer of many TeX-related responsibilities), [email protected]; he will officially take up the butterfly net on 1 January 2015. Although in practice we will continue to share information and consult on matters involving the history of this function, as of January 1, Karl will be the person responding to inquiries and rendering decisions on whether a report is or is not a bug. As in the past, this decision will not be reached by just one person; a few “trusted experts” (trusted and approved by Don, that is) will continue to provide advice backing up responses to reports.

Bonne chance, Karl! May you find this exercise as interesting as I have.

An alternative to tangle and weave

In addition to his presentation at TUG 2014 on a new, fully functional, TeX-language interpreter, Doug McKenna unveiled a command-line program, literac, that converts source code written in languages that use C-style commenting syntax into a LaTeX document. The result, if the author has been diligent, is a literate exposition of the program under consideration. Only one file and one step are involved, unlike the dual-process tangle and weave.

The slides from the talk are posted at http://tug.org/tug2014/slides/mckenna-literac.pdf, though sadly without the (blindingly fast) demonstration that accompanied the presentation. We look forward to articles from Doug on this and other topics in a future TUGboat issue.

More Lucida fonts

The complete (albeit growing) selection of Lucida fonts has a new venue at the Lucida fonts store, http://lucidafonts.com. As announced on the Bigelow & Holmes site:

    We have opened a store to sell downloadable Lucida Fonts. We offer 310 fonts, most of them never before released and available only from The Lucida Fonts Store.

The letter “a” is shown in all its variety at http://bigelowandholmes.typepad.com/. (This page, the B&H blog, also contains a remembrance of Hans Eduard Meier, Swiss lettering artist and creator of the Syntax typeface, who died on 15 July 2014, age 91. Other interesting items appear in the blog as well; by the time you read this, there should be an item “How and Why We Designed Lucida”.)

A “Math” page at http://lucidafonts.com/pages/lucida-math points to the Lucida fonts page at the TUG store (http://tug.org/store).

More from Chuck Bigelow about Lucida

Lucida fonts spotted “in the wild”: http://www.pinterest.com/lucidaf/lucida-on-location/

Chuck asks, “If you run across uses of Lucida elsewhere that are photogenic enough to be legible, let me know.”

He also notes that the Louvre example is out of date; the photo was taken in 1997, but by 2012, the interior signage had been switched to use other fonts (unspecified).

Erratum: “Online Publishing via pdf2htmlEX”, TUGboat 34:3

In their Acknowledgement on page 323, the authors, Lu Wang and Wanmin Liu, misspelled the name of Professor Masakata Kaneko.

Peter Flynn’s Formatting Information updated

An early version of this manual was published in TUGboat 23:2. It has undergone some revisions since its original appearance in 1999. The latest (HTML) version has undergone several major changes: it is now mobile-friendly, it has a new search engine and a new index, and the chapter pages (previously quite large) have been cut into files per section so that they load faster on marginal connections.

Peter says, regarding the new release,

    Some things like lines of code examples won’t fit happily on very small screens. I don’t think anyone has a real solution to this yet.

    The examples have all been reworked, and all the package links updated (and several obsolescent packages replaced by newer ones).

    The PDF and eBook will follow in due course. Please email me with all corrections, suggestions, gripes, flames, etc.

Peter’s contact information is available on the web site for the manual: http://latex.silmaril.ie/formattinginformation/. The manual is also posted on CTAN.

Klaus Peters, 1937–2014

Klaus Peters was a mathematician who, instead of “practicing” mathematics, preferred to use his knowledge to ensure that mathematics and other scientific literature was presented in its best, most readable form for a wide audience. His publishing ventures began with the founding of Birkhäuser, Boston, in 1979, with his wife, Alice, and continued through several other publishing houses, some of which he founded (including A K Peters Ltd.), others where he worked as an editor or consultant. His expertise and friendship were greatly valued by scores of .

His philosophy was laid out in an article in the AMS Notices, “Why publish mathematics?” (http://www.ams.org/notices/200907/rtx090700819p.pdf). It is well worth reading, as is a shorter opinion piece on the obligations of a responsible publishing house: “PV : The value of publishing” (http://www.ams.org/notices/201206/rtx120600741p.pdf).

The high standards he professed are a worthy goal for any author or publisher.

Other items worth a look: bibliographies

The web-based service http://www.doi2bib.org/ will accept a DOI (digital object identifier) and return a BibTeX entry for use in your bibliography. A similar facility, based on author names and titles, is offered by http://www.ams.org/mathscinet/, but is available only to subscribers (including many academic libraries). This by way of a reminder that Nelson Beebe has amassed an amazing collection of scientific bibliographies and tools for handling them, at http://ftp.math.utah.edu/pub/bibnet/.

Geographical trivia: Kolophon

Not long ago, I attended a presentation entitled “Field Dirt”, in which were reported the projects undertaken during the summer vacation by the archaeological faculty of Brown University. One of the projects covered several sites in Turkey, which were duly displayed on a map. But wait: what’s that name “Kolophon” doing there?

This city was founded by the Greeks around the turn of the first millennium B.C. as “Κολοφών”. (One of the most renowned Ionian cities until its conquest and decline in the 7th century B.C., it has been cited as a possible home or birthplace for Homer.) The origin of the name is the word κολοφών, meaning “summit”, on account of its location, built on three hills. The bibliographic “colophon” is from the same source, with the metaphorical sense of “crowning touch”, a feature nowadays too often missed.

  ⋄ Barbara Beeton
    http://tug.org/TUGboat

A footnote about ‘Oh, oh, zero’

Don Knuth

I can’t resist adding a few comments to Chuck Bigelow’s wonderful essay about the history of zero-versus-oh in TUGboat 34:2 (2013), 168–181.

As an associate editor of ACM’s Journal and Communications during the 1960s, and as a prospective author, I’d been giving some thought to the fact that new concepts arising in computer science were calling for new typographic conventions. In particular, I corresponded with Myrtle Kellington, who was responsible for in all of ACM’s publications. (Computer scientists have Myrtle to thank for the now-universal style in which computer programs have long been typeset with a pleasant mixture of roman, bold, and italic. She introduced this style when she masterminded the publication of the Algol 60 Report, first with roman and bold only [1] and later with italic too [2].)

I applauded her work on formatted , but I wasn’t happy with the appearance of various papers about machine input and output, in cases where a fixed-width typeface would have improved the exposition. When she learned of my concerns she wrote to me on 10 February 1966:

    Whenever the vertical alignment is a requirement the printer uses the only typeface he has where this applies. It is called “Typewriter,” is machine-set, is available in four sizes, and looks like the old-fashioned typewriter style. A sample is enclosed.

Actually the sample she sent had three different fonts, and it didn’t exhibit the full character sets. Those fonts (all to be used on Monotype machines) were: Typewriter No. 74 (eight point size); Remington No. 70 (ten point and twelve point sizes); and Remington No. 17 (eleven point size), which was somewhat darker. Only No. 17 was available for machine setting; the other styles needed to be inserted by hand.

I replied on 14 February 1966 (evidently the U.S. Postal service was quite efficient in those days!) as follows:

    Regarding my request for a special type-font, I believe the 8 point “Typewriter No. 74” will do very nicely (assuming there are commas, parentheses, and the usual other special characters found on a typewriter). The other styles are also adequate but not as good. If possible it would be preferable to have a more squarish capital letter O so it can’t be mistaken for zero. I don’t know how expensive it is to make up new characters one at a time; I realize a whole new font can be very costly. The special characters will no doubt be the major problem.

    As examples of printed material using this style, I can only say unfortunately I don’t know of any, except Addison-Wesley is doing it for me in the forthcoming set of books I’m writing. I think ACM should “pioneer” in this. The best example I can give you is to point out sections of the last Communications of the ACM issue (January ’66) which would have been much improved if set in a “typewriter” style:

        Page 5, right column, the “all caps” words.
        Page 6, “all caps” words and displayed programs.
        Page 9, Appendix C. Possibly appendix B also.
        Page 30, the tables, if type set, would be candidates for fixed width, although in this case the line engravings were quite adequate as a substitute.
        Page 31, the words in all caps.
        Page 32 ff. The FORTRAN and COBOL program example.
        Pages 36–37, the capitalized “machine response” sentences.
        Page 41 ff. All-caps words except perhaps ELIZA.

    In general, FORTRAN and COBOL and assembly language programs and references to symbols within such programs would look much better in a fixed-width style. Even an 8 point type would be satisfactory here for appearance sake in the midst of the normal 12 point type (it would look like “”), although perhaps there would be some trouble from the monotype side in such mixing of sizes, I don’t know. You already have good formats for printing ALGOL programs, and that needs no change; but these others look quite unreadable by comparison, particularly things like Appendix C on page 9.

    The distinction between “oh” and “zero” is reasonably important. On page 6 I see the word “TO” which should have really been tee-zero.

    I would like to continue discussing this with you by letter. Can you tell me what special characters are available, how difficult it is to mix 8 point in with 12 point text, etc.? I will be very glad to mark up all papers that go through my desk in a special way to indicate what parts should be in this fixed-width style.

Unfortunately, Myrtle’s reply (dated 17 February) shot all these ideas down:

    Dear Don: I am grateful for your interest in the printing aspects . . . However, I had not realized from your earlier letter that this new or special typeface with fixed-width characters would have to be used for words run in the text (not displayed, that is). What I am trying to say is that I had assumed that use of the Typewriter typeface would be only for certain special displayed sequences or programs. Incorporating the Typewriter, or any other typeface, for isolated words or groups of words into the customary text would lead to prohibitive costs.

    Let me tell you what can be done staying within machine-set composition, which is the most economical form of typesetting in letter press composition, and perhaps we can evolve a plan that would achieve what you are after, or at least partially.

    The typeface used for the text in CACM is Modern No. 8, referred to as #8. This is available in all sizes both in roman and italics. . . . Along with #8 we can have, for machine setting that is, one other typeface. The one we use is boldface, called #275. . . . To summarize, we can intermix in one keyboard operation for machine setting the following typefaces and sizes:

        In size 10 on 12,
            #8, in roman: all caps; csc (caps and small caps); sc (all small caps); and clc (caps and lowercase).
            #8, in italics: all caps, and clc.
            #275, in roman: all caps, and clc.
            #275, in ital: all caps, and clc.
        In 8 on 10, the same range.

    But the two sizes, 8 pt and 10 pt, cannot be intermixed on the same line, I am sure you know, unless of course hand work is involved.


    Thus for all the examples you mentioned in your letter, one could not have the Typewriter typeface without an entire special hand operation to drop all those words in after the regular text had been set by machine in #8 and #275.

    . . . Actually we are giving considerable attention to changing printers, originally motivated by saving money, but now by many other considerations such as automated typesetting or being prepared for a fully automated operation all the way along the line, as well as quality of the printing. With certain of these new cold type and automated composition processes, one can intersperse several typefaces and adjust the spacings almost at whim. And your request would be no problem.

    I will bear all this in mind as we carry on our interviews and observe demonstrations.

(Indeed, ACM did eventually change to “cold type” printing, and it turned out to be a mistake, although I’m sure they tried valiantly to work with the available vendors during those years. Decent mathematical typesetting was becoming a lost art; and that, of course, is why I was motivated to develop TeX some years later. A downward spiral of decreasing typographic quality in Communications of the ACM began with their issue of March 1971; various examples of fixed-width type can incidentally be found in that issue, all of which were poorly reproduced from line-printer output. The Journal of the ACM began to suffer the same fate in October 1976.)

Meanwhile, as indicated in my letter to Myrtle, I had been having much better responses from Addison-Wesley, as they were preparing to publish The Art of Computer Programming. Addison-Wesley had been founded by two printers who were interested in producing good textbooks; and they became the only scientific publisher with their own in-house composition facility, at least in America. All of their typesetting was done under the direction of an old-timer named Hans Wolf. At my request, Hans had figured out a clever way to adapt his Monotype keyboards and casters so that machine setting with Remington #17 could actually be mixed together with the normal roman, italic, bold, bold italic, and math symbols. (Indeed, I’m pretty sure that this had never been done before, because Hans had originally told me, as Myrtle was to do later, that such a thing would be impossible.) This mixing could be done either in 10-point type on a 12-point base, or 9-point type on an 11-point base.

Furthermore, Hans and Addison-Wesley agreed to have a special glyph cut for me, a “squarish” version of the uppercase O, compatible with the existing Remington font. The font also included a new special character like ‘␣’ to indicate a blank space.

Therefore the publication of Volume 1 of TAOCP in January 1968 was actually the debut of a brand-new typographic style for , featuring typewriter style blended freely with ordinary text in appropriate places. The new ‘O’ didn’t quite align properly with the other letters at the baseline; but I didn’t actually notice that glitch at the time, because I was so happy to have ‘O’ instead of ‘0’.

Why did I ask Addison-Wesley for a squarish Oh? I was almost surely influenced primarily by the dot-matrix font used by keypunch machines in those days. Look, for example, at the illustration of a punched card on page 148 of the original 1968 edition of my book, or on page 152 of the current edition. Bob Bemer’s article [3, page 516] also shows it as IBM’s recommended corporate practice for distinguishing Oh from zero on keypunches, as of 1964. On typewriters, IBM recommended a wide Oh and a narrow zero, “except for the stylized fonts for OCR and MICR.”

When Volume 2 of The Art of Computer Programming came out in 1981, with glyphs now drawn by METAFONT, of course I retained my beloved ‘O’. And the ‘Q’ became squarish too at this time (although with a loop at the bottom instead of a crossbar).

Alas, however, Chuck’s essay demonstrates that I’m still standing alone in this respect: None of the nine monospaced typefaces in his Fig. 9 have anything like an Oh that I would want to use. (Nowhere did I see a really satisfactory Oh in Chuck’s discussion, until I came to Karl Berry’s production notes at the end, and Karl’s reference to ZeroFontOT.otf.) I herewith submit a humble request to have squarish O and Q available as alternates in the next edition of Lucida Console.

References

[1] Peter Naur et al., “Report on the Algorithmic Language ALGOL 60,” Communications of the ACM 3 (1960), 299–314.
[2] Peter Naur et al., “Revised report on the Algorithmic Language ALGOL 60,” Communications of the ACM 6 (1963), 1–17.
[3] R. W. Bemer, “Toward standards for handwritten zero and oh,” Communications of the ACM 10 (1967), 513–518.

  ⋄ Don Knuth
    http://www-cs-staff.stanford.edu/~uno


Twenty Questions for Donald Knuth, to celebrate the ePublication of TAOCP

To celebrate the publication of the eBooks of The Art of Computer Programming (TAOCP), Pearson asked several computer scientists, contemporaries, colleagues, and well-wishers to pose one question each to author Donald E. Knuth. Here are his answers. (Reprinted in TUGboat by kind permission of Pearson, from www.informit.com/promotions/[…]-of-the-art-of-computer-programming-139881.)

1. Jon Bentley, researcher: What a treat! The last time I had an opportunity like this was at the end of your data structures class at Stanford in June, 1974. On the final day, you opened the floor so that we could ask any question on any topic, barring only politics and religion. I still vividly remember one question that was asked on that day: “Among all the programs you’ve written, of which one are you most proud?”

Your answer (as I approximately recall it, four decades later) described a […] that you wrote for a minicomputer with 1024 available bytes of memory. Your first draft was 1029 bytes long, but you eventually had it up and running and debugged at 1023 bytes. You said that you were particularly proud of cramming so much functionality into so little memory.

My query today is a slight variant on that venerable question. Of all the programs that you’ve written, what are some of which you are most proud, and why?

Don Knuth: I’d like to ask you the same! But that’s something like asking parents to name their favorite children.

Of course I’m proud of TeX and METAFONT, because they seem to have helped to change the world, and because they led to many friendships. Furthermore they’ve made these eBooks possible: I’m enormously happy that the work I did more than 30 years ago has miraculously survived many changes of technology, and that the 3,000 pages of TAOCP now look so great on a little tablet — even after zooming.

While I was preparing for Volume 4 of TAOCP in the 90s, I wrote several dozen short routines using what you and I know as “literate programming.” Those little essays have been packaged into The Stanford GraphBase (1994), and I still enjoy using and modifying them. My favorite is the implementation of Tarjan’s beautiful algorithm for strong components, which appears on pages 512–519 of that book.

I have to admit some pride also in the implementation of IEEE floating-point arithmetic that appears in my book MMIXware (1999), as well as that book’s metasimulator for MMIX, in which I explain many principles of advanced pipelined […] from the ground up.

[…] continues to be one of the greatest joys of my life. In fact, I find myself writing roughly two programs per week, on average, both large and small, as I draft new material for the next volumes of TAOCP.

2. Dave Walden, TeX Users Group: Might you publish the original 3,000-page version of TAOCP (before the decision to change it into seven volumes), as a historical artifact of your view of the state of the art of algorithms and their analysis circa 1965? I think lots of people would like to see this.

Don Knuth: Scholars can look at the handwritten pages that led to Volumes 1–3 by going to the Stanford Archives, and all of the remaining pages will be deposited there eventually. I see little value in making those drafts more generally available — although some of the material about baseball that I decided not to use is pretty cool. Archives from the real pioneers of computer science, who wrote in the 40s and 50s, should be published first.

I do try to retain the youthful style of the original, in the pages that I write today, except where my first draft was embarrassingly naive or corny. I’ve also learned when to say “that” instead of “which,” thanks in part to Guy Steele’s tutelage.

3. Charles Leiserson, MIT: TAOCP shows a great love for computer science, and in particular, for algorithms and data structures. But love is not always easy. When writing this series, when did you find yourself reaching deepest into your emotional reservoir to overcome a difficult challenge to your vision?

Don Knuth: Again, Charles, I’d like to ask you exactly the same question!

For me, I guess, the hardest thing has always been to figure out what to cut. And I obviously haven’t been very successful at that, in spite of much rewriting.

The most difficult technical challenge was to write the metasimulator for MMIX. I needed to do that behind the scenes, in order to shape what actually appears in the books, and it was surely the toughest programming task that I’ve ever faced. Without the methodology of literate programming, I don’t think I could have finished that job successfully.

Many of the “starred” mathematical sections also stretched me pretty far. Overall, however, after working on TAOCP for more than fifty years, I can’t think of any aspect of the activity where the effort

Twenty Questions for Donald Knuth 236 TUGboat, Volume 35 (2014), No. 3

of writing wasn’t amply repaid by what I learned while doing it.

4. Dennis Shasha, NYU: How does a beautiful algorithm compare to a beautiful theorem? In other words, what would be your criteria of beauty for each?

Don Knuth: Beauty has many aspects, of course, and is in the eye of the beholder. Some theorems and algorithms are beautiful to me because they have many different applications; some because they do powerful things with severely limited resources; some because they involve aesthetically pleasing patterns; some because they have a poetic purity of concept.

For example, I mentioned Tarjan’s algorithm for strong components. The data structures that he devised for this problem fit together in an amazingly beautiful way, so that the quantities you need to look at while exploring a directed graph are always magically at your fingertips. And his algorithm also does topological sorting as a byproduct.

It’s even possible sometimes to prove a beautiful theorem by exhibiting a beautiful algorithm. Look, for instance, at Theorem 5.1.4D and/or Corollary 7H in TAOCP.

5. Mark Taub, Pearson: Does the emergence of “apps” (small, single-function, networked programs) as the dominant programming paradigm today impact your plans in any way for future material in TAOCP?

Don Knuth: People who write apps use the ideas and paradigms that are already present in the first volumes. And apps make use of ever-growing program libraries, which are intimately related to TAOCP. Users of those libraries ought to know something about what goes on inside.

Future volumes will probably be even more “app-likable,” because I’ve been collecting tons of fascinating games and puzzles that illustrate programming techniques in especially instructive and appealing ways.

6. Radia Perlman, Intel: (1) What is not in the books that you wish you’d included? (2) If you’d been born 200 years ago, what kind of career might you imagine you’d have had?

Don Knuth: (1) Essentially everything that I want to include is either already in the existing volumes or planned for the future ones. Volume 4B will begin with a few dozen pages that introduce certain newfangled mathematical techniques, which I didn’t know about when I wrote the corresponding parts of Volume 1. (Those pages are now viewable from my website in beta-test form, under the name “mathematical preliminaries redux.”) I plan to issue similar gap-filling “fascicles” when future volumes need to refer to recently invented material that ultimately belongs in Volume 3, say.

(2) Hey, what a fascinating question — I don’t think anybody else has ever asked me that before! If I’d been born in 1814, the truth is that I would almost certainly have had a very limited education, coupled with hardly any access to knowledge. My own male ancestors from that era were all employed as laborers, on farms that they didn’t own, in what is now called northern Germany.

But I suppose you have a different question in mind. What if I had been one of the few people with a chance to get an advanced education, and who also had some flexibility to choose a career?

All my life I’ve wanted to be a teacher. In fact, when I was in first grade, I wanted to teach first grade; in second grade, I wanted to teach second; and so on. I ended up as a college teacher. Thus I suppose that I’d have been a teacher, if possible.

To continue this speculation, I have to explain about being a geek. Fred Gruenberger told me long ago that about 2% of all college students, in his experience, really resonated with computers in the way that he and I did. That number stuck in my mind, and over the years I was repeatedly able to confirm his empirical observations. For instance, I learned in 1977 that the University of Illinois had 11,000 grad students, of whom 220 were CS majors!

Thus I came to believe that a small percentage of the world’s population has somehow acquired a peculiar way of thinking, which I happen to share, and that such people happened to discover each other’s existence after computer science had acquired its name.

For simplicity, let me say that people like me are “geeks,” and that geeks comprise about 2% of the world’s population. I know of no explanation for the rapid rise of academic computer science departments — which went from zero to one at virtually every college and university between 1965 and 1975 — except that they provided a long-needed home where geeks could work together. Similarly, I know of no good explanation for the failure of many unsuccessful software projects that I’ve witnessed over the years, except for the hypothesis that they were not entrusted to geeks.

So who were the geeks of the early 19th century? Beginning a little earlier than 1814, I’d maybe like to start with Abel (1802); but he’s been pretty much claimed by the mathematicians. Jacobi (1804), Hamilton (1805), Kirkman (1806), De Morgan (1806), Liouville (1809), Kummer (1810), and China’s Li Shanlan (1811) are next; I’m listing “mathematicians” whose writings speak rather directly to the


geek in me. Then we get precisely to your time period, with Catalan (1814) and Sylvester (1814), Boole (1815), Weierstrass (1815), and Borchardt (1817). I would have enjoyed the company of all these people, and with luck I might have done similar things.

By the way, the first person in history whom I’d classify as “100% geek” was […]. Many of his predecessors had strong symptoms of our disease, but he was totally infected.

7. Tony Gaddis, author: Do you remember a specific moment when you discovered the joy of programming, and decided to make it your life’s work?

Don Knuth: During the summer of 1957, between my freshman and sophomore years at Case Tech in Cleveland, I was allowed to spend all night with an IBM 650, and I was totally hooked.

But there was no question of viewing that as a “life’s work,” because I knew of nobody with such a career. Indeed, as mentioned above, my life’s work was to be a teacher. I did write a compiler manual in 1958, which by chance was actually used as the textbook for one of my classes in 1959(!). Still, programming was for me primarily a hobby at first, after which it became a way to support myself while in grad school.

I saw no connection between computer programming and my intended career as a math professor until I met Bob Floyd late in 1962. I didn’t foresee that computer science would ever be an academic discipline until I met […] in 1964.

8. Robert Sedgewick, Princeton: Don, I remember some years ago that you took the position that you weren’t trying to reach everyone with your books — knowing that they would be particularly beneficial to people with a certain interest and aptitude who enjoy programming and exploring its relationship to mathematics. But lately I’ve been wondering about your current thoughts on this issue. It took a long time for society to realize the benefits of teaching everyone to read; now the question before us is whether everyone should learn to program. What do you think?

Don Knuth: I suppose all college professors think that their subject ought to be taught to everybody in the world. In this regard I can’t help quoting from a wonderful paper that John Hammersley wrote in 1968:

Just for the fun of getting his reactions, I asked an eminent scholar of English Literature what educational benefits might lie in the study of goliardic verse, Erse curses, and runic erotica. ‘A working background of goliardic verse would be more than helpful to anyone hoping to have some modest facility in his own mother tongue’, he declared; and with that he warmed to his subject and to the poverties of unlettered science, so that it was some minutes before I could steer him back to the Erse curses, about which he seemed a good deal less enthusiastic. ‘Really’, he said, ‘that sort of thing isn’t my subject at all. Of course, I applaud breadth of vocabulary; and you never know when some seemingly useless piece of knowledge may not turn out to be of cardinal practical importance. I could certainly envisage a situation in which they might come in very handy indeed’. ‘And runic erotica?’ ‘Not extant’. (Was it only my fancy that heard a note of faint regret in his reply?) Certainly the higher flights of scholarship can add savour; but does the man-in-the-street have the time and the pertinacity and the intellectual digestion for them?

Programming, of course, is not just an ordinary subject. It is intrinsically empowering, and applicable to many different kinds of knowledge. And I also know that you’ve been having enormous successes, at Princeton and online, teaching advanced concepts of programming to students from every discipline.

But your question asks about everybody. I still think many years will have to go by before I would recommend that my own highly intelligent wife, son, and daughter should learn to program, much less that everybody else I know should do so.

Nick Trefethen told me a few years back that he had just visited his son’s high school in Oxford, which is one of the best anywhere, and learned that not a single student knew how to program! Britain is now beginning to change that, indeed at a more rapid pace than in America. Yet such a revolution almost surely needs to take place over a generation or more. Where are the teachers going to come from?

My own experience is with the subset of college students who are sufficiently interested in programming that they expect it to become an integral part of their life. TAOCP is essentially for specialists. I’ve primarily been writing it for geeks, not for a general audience, because somebody has to write books that aren’t for dummies. (By a “dummy” I mean a smart non-geek. That’s a much larger market, and very important; but it’s not my target audience, and general education is not my forte.)

On the other hand, believe it or not, I try to explain everything in my books by imagining a non-specialist reader. My goal is to be jargon-free whenever possible; I especially try to avoid terms from higher mathematics that tend to frighten the programmer-on-the-street. Whenever possible I try to translate results from the theoretical literature into a language that high-school students could understand.

I know that my books still aren’t terribly easy to fathom, even for geeks. But I could have made them much, much harder.


9. Barbara Steele: What was the conversion process, and what tools did you use, to convert your print books to eBooks?

Don Knuth: I knew that these volumes would not work especially well as eBooks unless they were converted by experts. Fortunately I received some prize money in 2011, which could be used to pay for professional help. Therefore I was able to achieve the kind of quality that I envisioned, without delaying my work on future volumes, by letting the staff at Mathematical Sciences Publishers in Berkeley (MSP) handle all of the difficult stuff.

My principal goal was to make the books easily searchable — and that’s a much more challenging problem than it seems, if you want to do it right. Secondarily, I wanted to let readers easily click on the number of any exercise or equation or illustration or table or algorithm, etc., and to jump to that exercise; also to jump readily between an exercise and its answer.

The people at MSP wrote special software that converts my source text into suitable input to other software that creates PDF files. I don’t know the details, except that they use “change files” analogous to those used in WEB and CWEB. I’ve checked the results pretty carefully, and I couldn’t be more pleased. Moreover, they’ve designed things so that it won’t be hard for me to make changes next year, as readers discover bugs in the present editions.

(My style of writing tends to maximize the number of opportunities to make mistakes, hence I would be fooling myself if I thought that the books were now perfect. Therefore it has always been important to keep future errata in mind. The production staff at Addison-Wesley has been consistently wonderful in the way they allow me to correct about fifty pages every year in each volume.)

10. Silvio Levy, MSP: Could you comment on the differences between the print, PDF, EPUB, etc., editions of TAOCP? What would you say is gained or lost with each?

Don Knuth: The printed versions weigh a lot more, but they don’t need battery power or a tether to electricity. They are always there; I don’t have to turn them on, and I can have them all open at once.

I can scribble in the margins (and elsewhere) of the print versions, and I can highlight text in different colors. Ten years from now I expect analogous features will be commonly available for eBooks.

I’m used to flipping pages and finding my way around a regular book, much more so than in an eBook; but my grandchildren might have the opposite reaction.

The great advantage of an eBook is the reader’s ability to search exhaustively. What fun it is to look for all occurrences of a random word like ‘game’, or for a random word fragment like ‘gam’ or ‘ame’, and find lots of cool material that I don’t recall having written. The search feature on these books works even better than I had a right to hope for.

The index in a printed book has the advantage of being more focused. But that index also appears in the eBook, and in the eBook you can even click in the index to get to the cited pages.

Today’s eBook readers are often inconvenient for setting bookmarks and going back to where you were a couple of minutes ago, especially after you click on an Internet link and then want to go back to reading. But that software will surely improve, and so will today’s electronic devices.

In the future I look forward to curated eBooks that have additional notes by experts — and possibly even graffiti in the style of Concrete Mathematics — somewhat analogous to the “director’s comments” and other extras found on the DVDs for films. One could select different subsets of these comments when reading.

11. Peter Gordon, Addison-Wesley (retired): If the full range of today’s eBook features and functionalities had been available when TAOCP was first published, would you have written those volumes very differently?

Don Knuth: Well, I don’t think I would have gotten very far at all. I would have had to think about doing everything in color, and with interactive figures, tables, equations, and exercises. A single person cannot use the “full range” of features that eBooks potentially have.

But by limiting myself to what can be presented well in black-and-white type, on printed pages of a fixed size, I was fortunately able to complete 3,000 pages over a period of 50 years.

12. Udi Manber, Google: The early volumes of TAOCP established computer programming as computer science. They introduced the necessary rigor. This was at the time when computers were used mostly for numerical applications. Today, most applications are related to people — social interaction, search, entertainment, and so on. Rigor is rarely used in the development of these applications. Speed is not always the most important factor, and “correctness” is rarely even defined. Do you have any advice on how to develop a new computer science that can introduce rigor to these new applications?

Don Knuth: The numerical computations that were somewhat central when computer science was


born are by no means gone; they continue to grow, year by year. Of course, they now represent a much smaller piece of the pie, but I don’t believe in concentrating too much on the big pieces.

My work on METAFONT introduced me to applications where “correctness” cannot be defined. How do I know, for example, that my program for the letter A produces a correct image? I never will; and I’ve learned to live with that uncertainty. On the other hand, when I implemented the routines that interpret specifications and draw the associated bitmaps, there was plenty of room for rigor. The […] that go into font rendering are among the most interesting I’ve ever seen.

As a user of products from Google and Adobe and other corporations, I know that a tremendous amount of rigor goes into the manipulation of map data, transportation data, pixel data, linguistic data, metadata, and so on. Furthermore, much of that processing is done with distributed and decentralized algorithms that require more rigor than anybody ever thought of in the 60s.

So I can’t say that rigor has disappeared from the computer science scene. I do wish, however, that Google’s and Adobe’s and Apple’s programmers would learn rigorously how to keep their systems from crashing my home computers, when I’m not using Linux.

In general I agree with you that there’s no decrease in the need for rigor, rather an increase in the number of kinds of rigor that are important. The fact that correctness can’t be defined on the “bottom line” should not lull people into thinking that there aren’t intermediate levels within every nontrivial system where correctness is crucial. Robustness and quality are compromised by every weak link.

On the other hand, I certainly don’t think that everything should be mathematized, nor that everything that involves computers is properly a subdiscipline of computer science. Many parts of important software systems do not require the special talents of geeks; quite the contrary. Ideally, many disciplines collaborate, because a wide variety of orthogonal skill sets is a principal reason why life is such a joy. Vive la difference.

Indeed, I myself follow the path of rigor only partway: Rarely do I ever give a formal proof that any of my programs are correct, once I’ve constructed an informal proof that convinces me. I have no real interest, for example, in defining exactly what it would mean for TeX to be correct, or for verifying formally that my implementation of that 550-page program is free of bugs. I know that anomalous results are possible when users try to specify pages that are a mile wide, or constants that involve a trillion zeros, etc. I’ve taken care to avoid catastrophic crashes, but I don’t check every addition operation for possible overflow.

There’s even a fundamental gap in the foundations of my main mathematical specialty, the analysis of algorithms. Consider, for example, a computer program that sorts a list of numbers into order. Thanks to the work of Floyd, Hoare, and others, we have formal definitions of semantics, and tools by which we can verify that sorting is indeed always achieved. My job is to go beyond correctness, to an analysis of such things as the program’s running time: I write down a recurrence, say, which is supposed to represent the average number of comparisons made by that program on random input data. I’m 100% sure that my recurrence correctly describes the program’s performance, and all of my colleagues agree with me that the recurrence is “obviously” valid. Yet I have no formal tools by which I can prove that my recurrence is right. I don’t really understand my reasoning processes at all! My student Lyle Ramshaw began to create suitable foundations in his thesis (1979), but the problem seems inherently difficult. Nevertheless, I don’t lose any sleep over this situation.

13. Al Aho, Columbia: We all know that the Turing machine is a universal model for sequential computation.

But let’s consider reactive distributed systems that maintain an ongoing interaction with their environment — systems like the Internet, cloud computing, or even the human brain. Is there a universal model of computation for these kinds of systems?

Don Knuth: I’m not strong on logic, so TAOCP treads lightly on this sort of thing. The TAOCP model of computation, discussed on pages 4–8 of Volume 1, considers “reactive processes,” a.k.a. “computational methods,” which correspond to single processors. I’ve long planned to discuss recursive coroutines and other cooperative processes in Chapter 8, after I finish Chapter 7. The beautiful model of […]-free parsing via semiautonomous agents, in Floyd’s great survey paper of 1964, has strongly influenced my thinking in this regard.

I’d like to see extensions of the set-theoretic model of computation at the beginning of Volume 1 to the things you mention. They might well shed light on the subject.

But fully distributed processes are well beyond the scope of my books and my own ability to comprehend them. For a long time I’ve thought that an understanding of the way ant colonies are able to perform incredibly organized tasks might well be the


key to an understanding of human cognition. Yet the ants that invade my house continually baffle me.

14. Guy Steele, Oracle Labs: Don, you and I are both interested in program analysis: What can one know about an algorithm without actually executing it? Type theory and Hoare logic are two formalisms for that sort of reasoning, and you have made great contributions to using mathematical tools to analyze the execution time of algorithms. What do you think are interesting currently open problems in program analysis?

Don Knuth: Guy, I’m sure you aren’t really against the idea of program execution. You and I both like to know things about programs and to execute them. Often the execution contradicts our supposed knowledge.

The quest for better ways to verify programs is one of the famous grand challenges of computer science. And as I said to Udi, I’m particularly rooting for better techniques that will avoid crashes.

Just now I’m writing the part of Volume 4B that discusses algorithms for satisfiability, a problem of great industrial importance. Almost nothing is known about why the heuristics in modern solvers work as well as they do, or why they fail when they do. Most of the techniques that have turned out to be important were originally introduced for the wrong reasons!

If I had my druthers, I wish people like you would put a lot of effort into a problem of which I’ve only recently become aware: The programmers of today’s multithreaded machines need new kinds of tools that will make linked data structures much more cache-friendly. One can in many cases start up auxiliary parallel threads whose sole purpose is to anticipate the memory accesses that the main computational threads will soon be needing, and to preload such data into the cache. However, the task of setting this up is much too daunting, at present, for an ordinary programmer like me.

15. Robert Tarjan, Princeton: What do you see as the most promising directions for future work in algorithm design and analysis? What interesting and important open problems do you see?

Don Knuth: My current draft about satisfiability already mentions 25 research problems, most of which are not yet well known to the theory community. Hence many of them might well be answered before Volume 4B is ready. Open problems pop up everywhere and often. But your question is, of course, really intended to be much more general.

In general I’m looking for more focus on algorithms that work fast with respect to problems whose size, n, is feasible. Most of today’s literature is devoted to algorithms that are asymptotically great, but they are helpful only when n exceeds the size of the universe.

In one sense such literature makes my life easier, because I don’t have to discuss those methods in TAOCP. I’m emphatically not against pure research, which significantly sharpens our abilities to deal with practical problems and which is interesting in its own right. So I sometimes play asymptotic games. But I sure wouldn’t mind seeing a lot more algorithms that I could also use.

For instance, I’ve been reading about algorithms that decide whether or not a given graph G belongs to a certain class. Is G, say, chordal? You and others discovered some great algorithms for the chordality and minimum fill-in problems, early on, and an enormous number of extremely ingenious procedures have subsequently been developed for characterizing the graphs of other classes. But I’ve been surprised to discover that very few of these newer algorithms have actually been implemented. They exist only on paper, and often with details only sketched.

Two years ago I needed an algorithm to decide whether G is a so-called comparability graph, and was disappointed by what had been published. I believe that all of the supposedly “most efficient” algorithms for that problem are too complicated to be trustworthy, even if I had a year to implement one of them.

Thus I think the present state of research in algorithm design misunderstands the true nature of efficiency. The literature exhibits a dangerous trend in contemporary views of what deserves to be published.

Another issue, when we come down to earth, is the efficiency of algorithms on real computers. As part of the Stanford GraphBase project I implemented four algorithms to compute minimum spanning trees of graphs, one of which was the very pretty method that you developed with Cheriton and Karp. Although I was expecting your method to be the winner, because it examines much of the data only half as often as the others, it actually came out two to three times worse than Kruskal’s venerable method. Part of the reason was poor cache interaction, but the main cause was a large constant factor hidden by O notation.

16. Frank Ruskey, University of Victoria: Could you comment on the importance of working on unimportant problems? My sense is that computer science research, funding, and academic hiring is becoming more and more focused on short-term problems that have at their heart an economic motivation. Do you agree with this assessment, is it a bad


trend, and do you see a way to mitigate it?

Similarly, could you comment on the demise of the individual researcher? So many papers that I see published these days have multiple authors. Five-author papers are routine. But when I dig into the details it seems that often only one or two have contributed the fresh ideas; the others are there because they are supervisors, or financial contributors, or whatever. I’m pretty sure that Euler didn’t publish any papers with five co-authors. What is the reason for this trend, how does it interfere with trying to establish a history of ideas, and what can be done to reverse it?

Don Knuth: I was afraid somebody was going to ask a question related to economics. I’ve never understood anything about that subject. I don’t know why people spend money to buy things. I’m willing to believe that some economists have enough wisdom to keep the world running some of the time, but their reasons are beyond me.

I just write books. I try to tell stories that seem to be important, at least for geeks. I’ve never bothered to think about marketing, or about what might sell, except when my publishers ask me to answer questions as I’m doing now!

Three years ago I published Selected Papers on Fun and Games, a 750-page book that is entirely devoted to unimportant problems. In many ways the fact that I was able to live during a time in the history of the world when such a book could be written has given me even more satisfaction than I get when seeing the currently healthy state of TAOCP.

I’ve reached an age where I can fairly be described as a “grumpy old man,” and perhaps that is why I strongly share your concern for the alarming trends that you bring up. I’m profoundly upset when people rate the quality of my work by measuring the extent to which it affects Wall Street.

Everybody seems to understand that astronomers do astronomy because astronomy is interesting. Why don’t they understand that I do computer science because computer science is interesting? And that I’d do it regardless of whether or not it made money for anybody? The reason is probably that not everybody is a geek.

Regarding joint authorship, you are surely right about Euler in the 18th century. In fact I can’t think of any two-author papers in mathematics, until Hardy and Littlewood began working together at the beginning of the 20th century.

In my own case, two of my earliest papers were joint because the other authors did the theory and I wrote computer programs to validate it. Two other papers were related to the ALGOL language, and done together with ACM committees. In a number of others, written while I was at Caltech, I did the theory and my student co-authors wrote computer programs to validate it. There was one paper with Mike Garey, Ron Graham, and David Johnson, in which they did the theory and my role was to explain what they did. You and I wrote a joint paper in 2004, related to recursive coroutines, in which we shared equally.

The phenomenon of hyperauthorship still hasn’t infected computer science as much as it has hit physics and biology, where I’ve read that Thomson-Reuters indexed more than 200 papers having 1,000 authors or more, in a single recent year! When I cite a paper in TAOCP, I like to mention all of the authors, and to give their full names in the index. That policy will become impossible if CS publication practices follow in the footsteps of those fields.

Collaborative work is exhilarating, and it’s wonderful when new results are obtained that wouldn’t have been discovered by individuals working alone. But as you say, authors should be authors, not hangers-on.

You mention the history of ideas. To me the method of discovery tends to be more important than the identification of the discoverers. Still, credit should be given where credit is due; conversely, credit shouldn’t be given where credit isn’t due.

I suppose the multiple-author anomalies are largely due to poor policies related to financial rewards. Unenlightened administrators seem to base salaries and promotions on publication counts.

What can we do? As I say, I’m incompetent to deal with economics. I’ve gone through life refusing to go along with a crowd, and bucking trends with which I disagree. I’ve often declined to have my name added to a paper. But I suppose I’ve had a sheltered existence; young people may be forced to bow to peer pressure.

17. Andrew Binstock, Dr. Dobb’s: At the ACM Turing Centennial in 2012, you stated that you were becoming convinced that P = NP. Would you be kind enough to explain your current thinking on this question, how you came to it, and whether this growing conviction came as a surprise to you?

Don Knuth: As you say, I’ve come to believe that P = NP, namely that there does exist an integer M and an algorithm that will solve every n-bit problem belonging to the class NP in n^M elementary steps.

Some of my reasoning is admittedly naive: It’s hard to believe that P ≠ NP and that so many brilliant people have failed to discover why. On the other hand if you imagine a number M that’s finite

Twenty Questions for Donald Knuth 242 TUGboat, Volume 35 (2014), No. 3

but incredibly large (like say the number $10\uparrow\uparrow\uparrow\uparrow 3$ discussed in my paper on "coping with finiteness"), then there's a humongous number of possible algorithms that do $n^M$ bitwise or addition or shift operations on n given bits, and it's really hard to believe that all of those algorithms fail.

My main point, however, is that I don't believe that the equality P = NP will turn out to be helpful even if it is proved, because such a proof will almost surely be nonconstructive. Although I think M probably exists, I also think human beings will never know such a value. I even suspect that nobody will even know an upper bound on M.

Mathematics is full of examples where something is proved to exist, yet the proof tells us nothing about how to find it. Knowledge of the mere existence of an algorithm is completely different from the knowledge of an actual algorithm.

For example, RSA relies on the fact that one party knows the factors of a number, but the other party knows only that factors exist. Another example is that the game of $N\times N$ Hex has a winning strategy for the first player, for all N. John Nash found a beautiful and extremely simple proof of this theorem in 1952. But Wikipedia tells me that such a strategy is still unknown when N = 9, despite many attempts. I can't believe anyone will ever know it when N is 100.

More to the point, Robertson and Seymour have proved a famous theorem in graph theory: Any class c of graphs that is closed under taking minors has a finite number of minor-minimal graphs. (A minor of a graph is any graph obtainable by deleting vertices, deleting edges, or shrinking edges to a point. A minor-minimal graph H for c is a graph whose smaller minors all belong to c although H itself doesn't.) Therefore there exists a polynomial-time algorithm to decide whether or not a given graph belongs to c: The algorithm checks that G doesn't contain any of c's minor-minimal graphs as a minor.

But we don't know what that algorithm is, except for a few special classes c, because the set of minor-minimal graphs is often unknown. The algorithm exists, but it's not known to be discoverable in finite time.

This consequence of Robertson and Seymour's theorem definitely surprised me, when I learned about it while reading a paper by Lovász. And it tipped the balance, in my mind, toward the hypothesis that P = NP.

The moral is that people should distinguish between known (or knowable) polynomial-time algorithms and arbitrary polynomial-time algorithms. People might never be able to implement a polynomial-time-worst-case algorithm for satisfiability, even though P happens to equal NP.

18. Jeffrey O. Shallit, University of Waterloo: Decision methods, automated theorem-proving, and proof assistants have been successful in a number of different areas: the Wilf-Zeilberger method for combinatorial identities and the Robbins conjecture, to name two. What do you think theorem discovery and proof will look like in 100 years? Rather like today, or much more automated?

Don Knuth: Besides economics, I was also afraid that somebody would ask me about the future, because I'm a notoriously bad prophet. I'll take a shot at your question anyway.

Assuming 100 years of sustainable civilization, I'm fairly sure that a large percentage of theorems (maybe even 38.1966%) will be discovered with computer aid, and that a nontrivial percentage (maybe 0.7297%) will have computer-verified proofs that cannot be understood by mortals.

In my Ph.D. thesis (1963), I looked at computer-generated examples of small finite projective planes, and used that data to construct infinitely many planes of a kind never before known. Ten years later, I discovered the so-called Knuth-Morris-Pratt algorithm by studying the way one of Steve Cook's automata was able to recognize concatenated palindromes in linear time. Such investigations are fun.

A few months ago, however, I tried unsuccessfully to do a similar thing. I had a 5,000-step mechanically discovered proof that the edges of a smallish flower snark graph cannot be 3-colored, and I wanted to psych out how the machine had come up with it. Although I gave up after a couple of days, I do think it would be possible to devise new tools for the study of computer proofs in order to identify the "aha moments" therein.

In February of this year I noticed that the calculation of an Erdős-discrepancy constant (made famous by Tim Gowers' Polymath project, in which many mathematicians collaborated via the Internet) makes an instructive benchmark for satisfiability-testing algorithms. My first attempt to compute it needed 49 hours of computer time. Two weeks later I'd cut that down to less than 2 hours, but there still were 20 million steps in the proof. I see no way at present for human beings to understand more than the first few thousand of those steps.

19. Scott Aaronson, MIT: Would you recommend to other scientists to abandon the use of email, as you have done?

Don Knuth: My own situation is unusual, because I do my best work when I'm not interrupted. I eat,


sleep, and write content, more-or-less as a recluse who spends considerable time reading archives and other people's code. As I say on my home page (http://www-cs-faculty.stanford.edu/~uno), most people need to keep on top of things, but my role is to get to the bottom of things.

So I don't recommend a no-email policy to people who thrive on communication. And I actually take advantage of others in this respect (either shamelessly or shamefully, I'm not sure which), by pestering them with random questions, even though I don't want anybody to pester me, except about the one topic that I happen to be zooming in on at any particular time.

I do welcome email that reports bugs in TAOCP, because I always try to correct them as soon as possible.

Other unsolicited messages go to the bit bucket in the sky, otherwise known as /dev/null.

20. J. H. Quick, blogger: Why is this multi-interview called "twenty questions," when only 19 questions were asked?

Don Knuth: I'm stumped. No, wait: Radia asked two.

Incidentally, the eVolumes of TAOCP contain some 4,500 questions, and almost as many answers.

--*--

The panel

1. Jon Bentley, author of "Programming Pearls" in Communications of the ACM.
2. Dave Walden, TUG board member and coordinator of the TUG Interview Corner.
3. Charles Leiserson, MIT; theory of parallel computing and distributed computing, and the practical applications thereof.
4. Dennis Shasha, NYU; biological computing, pattern recognition, and machine learning.
5. Mark Taub, Pearson.
6. Radia Perlman, Intel; software designer and network engineer.
7. Tony Gaddis, author of computer science books.
8. Robert Sedgewick, Princeton; analysis of algorithms; one of Don Knuth's Ph.D. students.
9. Barbara Steele, contributor to Common Lisp: The Language.
10. Silvio Levy, co-author with Don Knuth of The CWEB System of Literate Programming, and with Raymond Seroul of A Beginner's Book of TeX; professional goal: to further the communication of mathematics.
11. Peter Gordon, Don Knuth's editor at Addison-Wesley from the early 1980s until his retirement in 2014; see his TUG interview at http://tug.org/interviews/gordon.html.
12. Udi Manber, a vice president of engineering at Google, responsible for search products.
13. Al Aho, Columbia University; programming languages, [...] and related algorithms, and prolific author of textbooks on the art and science of computer programming; co-author of the AWK [...].
14. Guy Steele, designer and writer of numerous programming language specifications, including the original command set of Emacs; the first person to port TeX (from WAITS to ITS).
15. Robert Tarjan, Princeton; known for his pioneering work on graph theory algorithms and data structures; his dissertation was supervised by Don Knuth.
16. Frank Ruskey, University of Victoria; research includes algorithms for exhaustively listing discrete structures, and various combinatorial topics.
17. Andrew Binstock, Editor-in-Chief, Dr. Dobb's Journal.
18. Jeffrey Outlaw Shallit, University of Waterloo; combinatorics on words, formal languages, automata theory, and algorithmic [...]; also Vice President of Electronic Frontier Canada.
19. Scott Aaronson, MIT; theory of computational complexity and quantum computing.
20. J. H. Quick, blogger.

--*--

We conclude with another quote from Radia Perlman, regarding how TAOCP affected her, which also nicely expresses how many of us TeX users feel about Computers & Typesetting:

Having the books on my bookshelf gave me a sense of security ... that pretty much anything I'd wonder about would be explained there. Today Wikipedia serves some of that purpose. It would have been nice 20 years ago to have had a (more) portable version of Knuth so that I could know, wherever I was, that I could quickly look something up. But 20 years ago there was nothing else, so I'd have to wait until I was back at home to consult the copy in my bedroom bookshelves, or wander the halls at work to find someone who had a copy in their office. I did actually have a 2nd copy that was supposed to be at work, but it was always being "borrowed," so I could never find my own copy at work when I needed it.


A letter on the persistence of (e)books Charles Bigelow

[Originally sent fall 2011; slightly edited for TUGboat, summer 2014.]

Dear Colleagues,

I have an original Amazon Kindle, bought around 2008. I also have a Barnes & Noble Nook, from around 2010, and an early Sony Reader a bit older than the Kindle. My graduate student was using them for his research. He gave them back to me this past summer after he finished his thesis (Congeniality of Reading on Digital Devices, RIT 2011). I set them on a shelf and didn't think about them until last week, when I got them out to show to my book design class in the fall (2011).

The Kindle was totally dead. Wouldn't light up, wouldn't charge, wouldn't run even when plugged in. The Nook wouldn't charge but would work when plugged in, until yesterday, when it wouldn't work at all. I took it to Barnes & Noble, where a staff person changed the battery, to no avail, and then called B&N tech support, who advised removing the battery, letting the thing sit for half an hour, and then recharging for four hours, to see if that would reboot the battery.

I took it home and plugged it in, with the power cord I almost left in the store, forgetting that an e-book has to have cords and connectors. The Nook connector doesn't fit my Kindle, so I have to keep track of which e-reader has which connector. But, that's not a problem now, because the Kindle is beyond recharge anyway, and so is the Nook, which never did revive.

The Sony Reader was hopeless whether or not it took a charge, because it required hooking up to a Windows PC to download books, but I use a Macintosh. And anyway, I'd misplaced the cables and connectors. Hard to keep track of those things.

At home, I went over to my bookshelf and took down a beautiful little book printed in Paris in 1764–1766 by Pierre-Simon Fournier.

It was still in excellent working condition, even though it hadn't been charged in 247 years. I opened it, and immediately it started up! I could read everything, without connectors or cables.

[Figure: Fournier's Manuel Typographique, hand-held. Built-in bookmark tape visible along gutter. St. Augustin size (14 point modern measure, about the size of menus on Macs). Page L (50) uses St. Augustin in the Dutch style (big x-height, narrow letters, light weight); page LI (51) also St. Augustin.]

It still works today. The print on the pages is a rich, dark black with good contrast against a pale creamy background. The type is elegant, cut by P.-S. Fournier himself, the greatest type designer of his era. The pages have a slight texture with a good feel, making them easy to turn with a simple swipe of the fingers. It has generous margins so you can hold the book open without covering the text with your thumbs, a brilliant feature of the user interface. It has many illustrations in vivid black and white, including fold-out pages with landscape format images, and other fold-outs with music. Its casing is a rich, dark red leather with gold tooling and ornamentation and colorful marbling, also on the inside covers. The page edges are gilded, so the whole book glows with a faint aura even before it is opened. It is small enough to be held with one hand, but works with two hands just as well.

I won't strain credulity by pretending the book is perfect. There are two volumes, so you have to keep track of both. The search function is primitive: you must browse or skim; and if you want to capture text for later reference, you have to remember it or copy it by hand, and if you want to link to a page, you have to use a bookmark (this book comes with its own bound-in, bookmark tape movable to the page of your choice).

These reading exertions put a severe strain on my brain, to be sure. Sometimes I have to eat some chocolate to re-charge, but I have to be careful not to get chocolate on the pages, because it won't wash off. Some of the pages are slightly spotted by mold or foxing (not from chocolate but from centuries of humidity and slow chemical changes in the paper), yet the text is still readable. The type can't be resized, a problem for many of us over the age of 40, so to read the smallest size specimen, which Fournier called "Parisienne", a gem-like cutting at about 5 point body size in modern type measure, I have to use bifocals or a reading glass.

Oh, and this particular book is in French, which slows down my reading, but well, when I'm reading this book, I'm not in much of a hurry anyway.

⋄ Charles Bigelow
http://www.lucidafonts.com

LaTeX document class options

Thomas Thurnherr

Abstract

The standard document classes article, report, book, and letter accept a number of class options which allow high-level customization of a document. In this article, available options are introduced, the default for each document class is highlighted, and alternative, more flexible customizations are given.

1 Setting document class options

Options that differ from the default are passed to the document class through its optional argument field. Multiple options have to be separated by a comma. If contradictory options are set, only one of them takes effect; the standard classes process options in the order in which the class declares them, so the winner is not necessarily the option given last. Moreover, if a non-existent option is set, LaTeX ignores it and generates a warning in the log.

    \documentclass[⟨option1⟩,⟨option2⟩,...]{article}

2 Default options

Most default options are the same between different document classes, with a few exceptions. An overview of all the defaults is given in table 1 (below). As the letter class is fairly specific, several options don't apply and are therefore not implemented.

Table 1: Default document class options for the standard document classes.

    Option                         article                  report                book                  letter
    Paper size (system specific)   a4paper/letterpaper      a4paper/letterpaper   a4paper/letterpaper   a4paper/letterpaper
    Font size                      10pt                     10pt                  10pt                  10pt
    Number of columns              onecolumn                onecolumn             onecolumn             onecolumn
    Margins                        oneside                  oneside               twoside               oneside
    Title page                     notitlepage              titlepage             titlepage             -
    Chapter start page             -                        openany               openright             -
    Orientation                    portrait                 portrait              portrait              portrait
    Formula options                (center; right label)    -                     -                     -
    Draft or final                 final                    final                 final                 final

3 Paper size

LaTeX provides several predefined paper (page) sizes. The supported options: a4paper, a5paper, b5paper, letterpaper, legalpaper, and executivepaper.

The width and height for each of these page sizes are listed in table 2. The default depends on the TeX distribution and/or system used. It is either a4paper or letterpaper.

Table 2: Measures of predefined page formats.

    Option           width     height
    a4paper          210 mm    297 mm
    a5paper          148 mm    210 mm
    b5paper          176 mm    250 mm
    letterpaper      8.5 in    11 in
    legalpaper       8.5 in    14 in
    executivepaper   7.25 in   10.5 in

The geometry package [3] implements additional page sizes. For example, with this package, all ISO standard formats are available, including ISO A0–A6, B0–B6, and C0–C6, specified as a0paper–a6paper, and so on. To use a geometry page format, the option is passed to the package directly, rather than to the document class. Any page format set in the document class is ignored. Besides these additional predefined formats, the package allows the user to define an arbitrary page size. Here is an example:

    % A0 size:
    \usepackage[a0paper]{geometry}

    % Arbitrary page size:
    \usepackage[paperwidth=5cm,paperheight=5cm]{geometry}

4 Font size

Throughout the entire document, LaTeX uses the same font size, except for headings or if the font is changed locally through a macro, such as \small or \large. 10pt is the default for all classes. Three options are available: 10pt, 11pt, and 12pt. If the default font size is changed, headings and macros change accordingly. Margins are also changed according to the font size.

For a larger range, the extsizes package [2] provides additional classes that support font sizes from 8pt to 20pt.

5 One or two columns

All document classes use a single column layout by default (onecolumn). With the twocolumn option,


the page is horizontally divided, a layout frequently used by scientific journals. The \linewidth macro flexibly adapts to the new layout and is automatically set to the width of a single column. Therefore, \linewidth is convenient to make optimal use of the available space, for example when adding figures. \textwidth, on the other hand, remains unchanged and is equal to the total width of the text area.

In two-column mode, the figure* environment inserts a figure that spans both columns, and similarly table* for a full-width table. Consequently, \linewidth and \textwidth are identical within these starred environments. An example:

    \documentclass[twocolumn]{article}
    \usepackage{graphicx}
    \begin{document}

    \begin{figure*}[ht]
    \includegraphics[width=\linewidth]{myFigure}
    \caption{Figure spanning two columns.}
    \end{figure*}

    ... text of document ...
    \end{document}

The multicol package [9] provides support for two or more columns. With this package, it is also possible to mix different layouts within the same document.

6 Margins

The options oneside and twoside affect the width of the side margins. With oneside, which is the default for article, report, and letter, the margins on both sides of every page are equally wide. With twoside, LaTeX distinguishes between an inner and outer margin. The outer margin is substantially wider and switches between left and right. Even pages have their outer margin on the left, odd pages on the right. Most books follow this structure and so it should not come as a surprise that the book class default is twoside.

7 Title page

The titlepage option prints the title on a separate page. This is the default for report and book. On the other hand, article has notitlepage as its default, with the main text starting directly after the title. The letter class doesn't implement title page commands and therefore these options are altogether unavailable.

8 Page orientation

All of the standard document classes produce documents in portrait orientation, by default. The option portrait doesn't explicitly exist. However, there is a landscape option, which rotates the page by 90°, but keeps the dimensions of the text area and the margins, which is often undesired. The geometry package [3] provides a more convenient landscape option, where text area and margins are adapted accordingly.

    \usepackage[landscape]{geometry}

The lscape [7] and pdflscape [10] packages implement the landscape environment, which changes the orientation locally, for one or several pages in an otherwise portrait document. In contrast to the geometry package, with these packages only the orientation of the text area is changed, while the margins and with them the header and footer remain in portrait mode. This environment is particularly useful for adding extra-wide figures or tables to a document. If pdfTeX is used for processing, pdflscape physically rotates any landscape oriented page, which makes it easier to read on screen. For example:

    \documentclass{article}
    \usepackage{pdflscape}
    \begin{document}
    \begin{landscape}
    % landscape oriented content
    \end{landscape}
    \end{document}

9 Chapter starting page

Chapters and other chapter-level headings are only available in the report and book classes. By default, a new chapter starts on the next page in report (openany), but always on an odd page in book (openright). As a consequence, in a book there might be a blank page between two consecutive chapters (if the previous chapter ended with an odd page number). openany and openright do not apply to article or letter.

10 Formula options fleqn and leqno

The fleqn and leqno options define how formulas are displayed. They are independent and so can be used together. The names are not especially self-explanatory: fleqn aligns formulas on the left, instead of the default centering; leqno prints the equation number on the left side instead of the (default) right.

For instance, consider the Cauchy-Schwarz inequality printed with the defaults: the formula is centered, with the equation number on the right.

              $|\langle x, y \rangle|^2 \le \langle x, x \rangle \cdot \langle y, y \rangle$              (1)
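The display above, and its fleqn and leqno variants, might be produced from a source file along the following lines. This is only a minimal sketch under the assumption that a plain equation environment is used; it is not the source of the original article, and the commented-out \documentclass lines merely indicate where the two options would go.

```latex
% Minimal sketch (assumed source, not taken from the article).
% Exactly one \documentclass line should be active at a time.
\documentclass{article}                % defaults: centered formula, number on the right
%\documentclass[fleqn]{article}        % formula aligned on the left
%\documentclass[leqno]{article}        % equation number on the left
%\documentclass[fleqn,leqno]{article}  % both options together
\begin{document}
\begin{equation}
  |\langle x, y \rangle|^2 \le \langle x, x \rangle \cdot \langle y, y \rangle
\end{equation}
\end{document}
```

Because both options are class options, they act on every numbered display in the document; a single equation cannot be switched this way.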


With fleqn, the equation is left-aligned:

    $|\langle x, y \rangle|^2 \le \langle x, x \rangle \cdot \langle y, y \rangle$              (2)

And with leqno, the equation number is placed left of the equation instead of right:

    (3)              $|\langle x, y \rangle|^2 \le \langle x, x \rangle \cdot \langle y, y \rangle$

The amsmath package [1] provides more flexibility for equations. For example, it implements the flalign environment which was used here to illustrate left-alignment (equation 2).

11 Draft or final

All document classes have the final option preset. With draft, text or environments that reach into the margins are highlighted with a black square or bar. With that, it becomes easy to spot Overfull \hbox warnings in the document output.

Other packages also make use of these options and implement macros that behave differently in draft mode. For example, the graphicx bundle [4] replaces figures with a box that shows the file name instead of the figure. Document processing time can be drastically reduced when figures are not loaded.

Two other examples: the hyperref [5] package removes all linking features from a document in draft mode, and microtype [8] disables its features altogether.

When draft is used for the overall document, specific packages can still be set to final mode by loading the package with the final option. This might sometimes be helpful to examine the package's "final" behavior. An example:

    \documentclass[draft]{article}
    \usepackage[final]{graphicx}

The ifdraft package [6] implements commands to flexibly customize the behavior of draft and/or final. For example, in a thesis the author might like to omit the title page and content lists while still working on the document. This is straightforward using the \ifdraft macro provided by the package:

    \documentclass[draft]{report}
    \usepackage{ifdraft}
    \title{...}
    \author{...}
    \begin{document}
    \ifdraft{% Draft: omit title/toc/lof/lot
    }{%
    \maketitle
    \tableofcontents\clearpage
    \listoffigures\clearpage
    \listoftables\clearpage
    }
    \end{document}

References

[1] amsmath: AMS mathematical facilities for LaTeX. http://www.ctan.org/pkg/amsmath. Accessed: 2014-09-22.
[2] extsizes: extend the standard classes' size options. http://www.ctan.org/pkg/extsizes. Accessed: 2014-09-30.
[3] geometry: flexible and complete interface to document dimensions. http://www.ctan.org/pkg/geometry. Accessed: 2014-09-22.
[4] graphicx: enhanced support for graphics. http://www.ctan.org/pkg/graphicx. Accessed: 2014-09-28.
[5] hyperref: extensive support for hypertext in LaTeX. http://www.ctan.org/pkg/hyperref. Accessed: 2014-09-28.
[6] ifdraft: detect "draft" and "final" class options. http://www.ctan.org/pkg/ifdraft. Accessed: 2014-09-28.
[7] lscape: place selected parts of a document in landscape. http://www.ctan.org/pkg/lscape. Accessed: 2014-09-30.
[8] microtype: subliminal refinements towards typographical perfection. http://www.ctan.org/pkg/microtype. Accessed: 2014-09-28.
[9] multicol: intermix single and multiple columns. http://www.ctan.org/pkg/multicol. Accessed: 2014-09-28.
[10] pdflscape: make landscape pages display as landscape. http://www.ctan.org/pkg/pdflscape. Accessed: 2014-09-30.

⋄ Thomas Thurnherr
texblog (at) gmail dot com
http://texblog.org


How to influence the position of float environments like figure and table in LaTeX?

Frank Mittelbach

Abstract

In 2012, a question "How to influence the float placement in LaTeX" was asked on TeX.stackexchange [3] and as there had been many earlier questions around this topic I decided to treat the topic in some depth and explain most of the mysteries that the underlying mechanism poses to people trying to use it successfully.

Once my answer appeared on the web, people asked to see this converted into an article and I foolishly replied "only if this answer ends up becoming a 'great' answer" (gets 100 votes). At the time of writing this article, the answer stands at 222 votes, so I had better make good on that promise.

Contents

1 Introduction
2 LaTeX floats terminology
  2.1 Float classes
  2.2 Float areas
  2.3 Float placement specifiers
  2.4 Float algorithm parameters
  2.5 Float reference point
3 Basic behavioral rules of LaTeX's float mechanism
  3.1 The basic sequence
  3.2 Detailed placement rules
  3.3 Emptying the holding queue at the column or page boundary
  3.4 Parameters influencing the placement
4 Consequences of the algorithm
  4.1 A float may appear in the document earlier than its location in the source
  4.2 Double-column floats are always deferred first
  4.3 There is no bottom float area for double-column floats
  4.4 All float parameters (normally) restrict the placement possibilities
  4.5 "Here" just means "here if it fits"
  4.6 Float specifiers do not define an order of preference
  4.7 Relation of floats and footnotes
5 Documentation of the algorithm
6 How to address specific issues
  6.1 Ensure that floats appear "here"
  6.2 Provide a bottom float area for two-column floats
  6.3 Ensure that floats are always placed after their call-out
  6.4 Prevent floats on certain pages
  6.5 Implement float barriers
  6.6 Overwrite placement restrictions
  6.7 Final tuning advice

1 Introduction

To answer this question one first has to understand the basic rules that govern LaTeX's standard placement of floats. Once these are understood, adjustments can be made, for example, by modifying float parameters, or by adding certain packages that modify or extend the basic functionality.

2 LaTeX floats terminology

2.1 Float classes

Each float in LaTeX belongs to a class. By default, LaTeX knows about two classes, viz., figure and table. Further classes can be added by a document class or by packages. The class a float belongs to influences certain aspects of the float positioning, such as its default placement specification (if not overridden on the float itself).

One important property of the float placement algorithm is that LaTeX never violates the order of placement within a class of floats. E.g., if you have figure 1, table 1, figure 2 in a document, then figure 1 will always be placed before figure 2. However, table 1 (belonging to a different float class) will be placed independently and hence can appear before, after, or between the figures.

2.2 Float areas

LaTeX knows about two float areas within a column where it can place floats: the top area and the bottom area of the column. In two-column layout, it also knows about a top area spanning the two columns. There is no bottom area for page-wide floats in two-column mode.

In addition, LaTeX can make float columns and float pages, i.e., columns or pages which contain only floats. Finally, LaTeX can place floats in-line into the text (but only if so directed on the individual float).

2.3 Float placement specifiers

To direct a float to be placed into one of these areas, a float placement specifier can be provided as an optional argument to the float. If no such optional argument is given then a default placement specifier is used (which depends on the float class as mentioned


above but usually allows the float to be placed in all areas if not subject to other restrictions).

A float placement specifier can consist of the following characters in any order:

  !  indicates that some of the restrictions that normally apply should be ignored (discussed later)
  h  indicates that the float is allowed to be placed in-line ("here")
  t  indicates that the float is allowed to go into a top area
  b  indicates that the float is allowed to go into a bottom area
  p  indicates that the float is allowed to go on a float page or column area

The order in which these characters are put in the optional argument does not influence how the algorithm tries to place the float! The precise order is discussed in section 3.2. This is one of the common misunderstandings, for instance when people think that bt means that the bottom area should be tried first.

However, if a letter is not present then the corresponding area will not be tried at all.

2.4 Float algorithm parameters

There are about 20 parameters that influence the placement. Basically they define

• how many floats can go into a certain area,
• how big a float area can become,
• how much text there has to be on a page (in other words, how much the top and bottom float areas can occupy), and
• how much space will be inserted
  - between consecutive floats in an area and
  - between the float area and the text above or below it.

2.5 Float reference point

A point in the document that references the float (e.g., "see figure X") is called a "call-out" and the float body should be placed close to the (main) call-out, as its placement in the document affects the placement of the float in the output, because it determines when LaTeX sees the float for the first time. It's important to understand that if a float is placed in the middle of a paragraph, the reference point for the algorithm is the next line break, or page break, in the paragraph that follows the actual placement in the source.

For technical and practical reasons it is usually best to place all floats between paragraphs (i.e., after the paragraph with the call-out), even if that makes the call-out and reference point slightly disagree.

3 Basic behavioral rules of LaTeX's float mechanism

With this knowledge, we are now ready to delve into the algorithm's behavior.

First we have to understand that all of LaTeX's typesetting algorithms are designed to avoid any sort of backtracking. This means that LaTeX reads through the document source, formats what it finds and (more or less) immediately typesets it. The reasons for this design choice were to limit complexity (which is still quite high) and also to maintain reasonable speed (remember that this is from the early eighties).

For floats, this means that the algorithm is greedy, i.e., the moment it encounters a float it will immediately try to place it and, if it succeeds, it will never change its decision. This means that it may choose a solution that could be deemed inferior in light of data received later on.

For example, if a figure is allowed to go to the top or bottom area, LaTeX may decide to place this figure in the top area. If this figure is followed by two tables which are only allowed to go to the top, these tables may not fit anymore. A solution that could have worked in this case (but wasn't tried) would have been to place the figure in the bottom area and the two tables in the top area.

3.1 The basic sequence

So here is the basic sequence the algorithm runs through:

• If a float is encountered, LaTeX attempts to place it immediately according to its rules (detailed later);
• if this succeeds, the float is placed and that decision is never changed;
• if this does not succeed, then LaTeX places the float into a holding queue to be reconsidered when the next page is started (but not earlier).
• Once a page has finished, LaTeX examines this holding queue and tries to empty it as best as possible. For this it will first try to generate as many float pages as possible (in the hope of getting floats off the queue). Once this possibility is exhausted, it will next try to place the remaining floats into top and bottom areas. It looks at all the remaining floats and either places them or defers them to a later page (i.e., adding them once more to the holding queue).
• After that, it starts processing document material for this page. In the process, it may encounter further floats.
• If the end of the document has been reached or if a \clearpage is encountered, LaTeX starts a

new page, relaxes all restrictive float conditions, and outputs all floats in the holding queue by placing them on float page(s).

In two-column mode the same algorithm is used, except that it works on the level of columns, e.g., when a column has finished LaTeX will look at the holding queue and generate float columns, etc.

3.2 Detailed placement rules

Whenever LaTeX encounters a float environment in the source, it will first look at the holding queue to check if there is already a float of the same class in the queue. If that happens to be the case, no placement is allowed and the float immediately goes into the holding queue.

If not, LaTeX looks at the float placement specifier for this float, either the explicit one in the optional argument or the default one from the float class. The default per float class is set in the document class file (e.g., article.cls) and very often resolves to tbp, but this is not guaranteed.
• If the specifier contains a !, the algorithm will ignore any restrictions related either to the number of floats that can be put into an area or the maximum size an area can occupy. Otherwise the restrictions defined by the parameters apply.
• As a next step it will check if h has been specified.
• If so, it will try to place the float right where it was encountered. If this works, i.e., if there is enough space, then it will be placed and processing of that float ends.
• If not, it will look next for t and if that has been specified the algorithm will try to place the float in the top area. If there is no other restriction that prevents this, then the float is placed there and float processing stops.
• If not, it will finally check if b is present and, if so, it will try to place the float into the bottom area (again obeying any restrictions that apply if ! wasn’t given).
• If that doesn’t work either or is not permitted because the specifier wasn’t given, the float is added to the holding queue.
• A p specifier (if present) is not used during the above process. It will only be looked at when the holding queue is being emptied at the next page or column boundary.

This ends the processing when encountering a float in the document.

3.3 Emptying the holding queue at the column or page boundary

After a column or page has been finished, LaTeX looks at the holding queue and attempts to empty it out as best as possible. For this it will first try to build float pages.¹

¹ In two-column mode LaTeX will build float columns (when finishing a column) and also attempt to generate float pages when finishing a page. In the remainder of the article “float page” will denote either depending on the context.

Any float participating in a float page (or column) must have a p in its float placement specification. If not, the float cannot go on a float page and, in addition, will also prevent any further deferred float of the same class from being placed onto the float page!

If the float can go there, it will be marked for inclusion on the float page, but the processor may still abort the attempt if the float page will not get filled “enough” (depending on the parameter settings for float pages). Only at the very end of the document, or when a \clearpage has been issued, are these restrictions lifted, and a float will then be placed on a float page even if it has no p and would be the only float on that page.

Creation of float pages continues until the algorithm has no further floats to place or when it fails to produce a float page due to parameter settings. In the latter case, all floats that have not been placed so far are then considered for inclusion in the top and bottom areas of the next page (or column).

The process there is the same as the one described above, except that
• the h specifier no longer has any meaning (as we are, by now, far away from the original “here”) and is therefore ignored,
• and the floats at this time are not coming from the source document but are taken one after the other from the holding queue.

Any float that couldn’t be placed is then put back into the holding queue, so that when LaTeX is ready to look at further textual input from the document the holding queue may already contain floats. A consequence of this is that a float encountered in the document may immediately get deferred just because an earlier float of the same float class is already on hold.

3.4 Parameters influencing the placement

There are four counters that control how many floats can go into areas:

totalnumber (default 3) is the maximum number of floats on a text column; it is not used for float pages;
topnumber (default 2) is the maximum number of floats in the top area;
bottomnumber (default 1) is the maximum number of floats in the bottom area;
dbltopnumber (default 2) is the maximum number of full-width floats in two-column mode going above the text columns.

The size of the areas is controlled through parameters (to be changed with \renewcommand) that define the maximum (or minimum) size of the area, expressed as a fraction of the page height:

\topfraction (default 0.7) maximum size of the top area
\bottomfraction (default 0.3) maximum size of the bottom area
\dbltopfraction (default 0.7) maximum size of the top area for double-column floats
\textfraction (default 0.2) minimum size of the text area, i.e., the area that must not be occupied by floats

The space that separates floats within an area, as well as between float areas and text areas, is defined through the following parameters (all of which are rubber lengths, i.e., can contain some stretch or shrink components). Their defaults depend on the document font size and change when class options like 11pt or 12pt are used. We show only the 10pt defaults:

\floatsep (default 12pt plus 2pt minus 2pt) the separation between floats in top or bottom areas
\dblfloatsep (default 12pt plus 2pt minus 2pt) the separation between double-column floats on two-column pages
\textfloatsep (default 20pt plus 2pt minus 4pt) the separation between top or bottom float area and the text area
\dbltextfloatsep (default 20pt plus 2pt minus 4pt) the analog of \textfloatsep for two-column floats

For in-line floats (that have been placed “here”) the separation to the surrounding text is controlled by

\intextsep (default 12pt plus 2pt minus 2pt)

In the case of float pages or float columns (i.e., a page or a column of a page containing only floats) parameters like \topfraction etc. do not apply. Instead the creation of them is controlled through

\floatpagefraction (default 0.5) minimum part of the page (or column) that needs to be occupied by floats to be allowed to form a float page (or column).

4 Consequences of the algorithm

4.1 A float may appear in the document earlier than its location in the source

The placement of the float environment in the source determines the earliest point where it can appear in the final document. It may move visually backward to some degree as it may be placed in the top area on the current page; see section 6.3 on how to change this. It can, however, not end up on an earlier page than the surrounding text due to the fact that LaTeX does no backtracking and the earlier pages have already been typeset.

Thus normally a float is placed in the source near its first call-out (i.e., text like “see figure 5”) because this will ensure that the float appears either on the same page as this text or on a later page. However, in some situations you may want to place a float on the preceding page (if that page is still visible from the call-out). This is possible only by moving the float to an earlier position in the source.

4.2 Double-column floats are always deferred first

When LaTeX encounters a page-wide float environment (indicated by a * at the end of the environment name, e.g., figure*) in two-column mode, it immediately moves that float to the deferred queue. The reason for this behavior again lies in the “greedy” behavior of its algorithm: if LaTeX is currently assembling the second column of that page, the first column has already been assembled and stored away; recall that because LaTeX does not backtrack there is no way to fit the float on the current page. To keep the algorithm simple, it does the same even if working on the first column (where it could in theory do better even without backtracking).

Thus, in order to place such a float onto the current page, one has to manually move it to an earlier place in the source, before the start of the current page. If this is done, obviously any further change in the document could make this adjustment obsolete; hence, such adjustments are best done (if at all) only at the very last stage of document production, when all material has been written and the focus is on fine-tuning the visual appearance.

Also note that the base algorithm has a bug² in this area: it maintains two independent holding queues: one for single-column and one for double-column floats. As a result the float order is not necessarily preserved and floats may get typeset out of sequence. If this happens one either has to manually move the double-column float to an earlier (or later) place in the document or load the fixltx2e package that implements a correction for this issue.

² As this is the documented behavior in the LaTeX manual [1] it is perhaps more correctly called an undesired feature than a bug.
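The parameters listed in section 3.4 can be made concrete with a short preamble sketch. It is only an illustration, assuming the article class at 10pt: every counter, fraction and separation is set explicitly to the default value quoted in this article, so the sketch mainly shows which kind of command each kind of parameter needs. The body text is a placeholder.

```latex
\documentclass{article}

% Counters: changed with \setcounter (values are the defaults quoted above).
\setcounter{totalnumber}{3}
\setcounter{topnumber}{2}
\setcounter{bottomnumber}{1}
\setcounter{dbltopnumber}{2}

% Area fractions are macros: changed with \renewcommand.
\renewcommand{\topfraction}{0.7}
\renewcommand{\bottomfraction}{0.3}
\renewcommand{\dbltopfraction}{0.7}
\renewcommand{\textfraction}{0.2}
\renewcommand{\floatpagefraction}{0.5}

% Separations are rubber lengths: changed with \setlength.
\setlength{\floatsep}{12pt plus 2pt minus 2pt}
\setlength{\dblfloatsep}{12pt plus 2pt minus 2pt}
\setlength{\textfloatsep}{20pt plus 2pt minus 4pt}
\setlength{\dbltextfloatsep}{20pt plus 2pt minus 4pt}
\setlength{\intextsep}{12pt plus 2pt minus 2pt}

\begin{document}
Placeholder text of the document.
\end{document}
```

Because the values above only restate the defaults, any real adjustment would replace individual numbers; the sketch makes no recommendation about which values to choose.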


4.3 There is no bottom float area for double-column floats

This isn’t so much a consequence of the algorithm but rather a fact about its implementation. For double-column floats the only possible placements offered are the top area or a float page. Thus if somebody adds an h or a b float placement specifier to such a float it simply gets ignored. As an important special case, {figure*}[b] implies that this float will not get typeset at all until either a \clearpage is encountered or the end of the document is reached.

4.4 All float parameters (normally) restrict the placement possibilities

This may be obvious but it is worth repeating: any float parameter defines a restriction on LaTeX’s ability to place the floats. How much of a restriction depends on the setting: there is always a way to set a parameter in such a way that it does not affect the placement at all. Unfortunately, in doing so one invites rather poor-looking placements.

By default LaTeX has settings that are fairly liberal. For example, for a float page to be accepted the float(s) must occupy at least half of the available page. Expressed differently, this means that such a page is allowed to be half empty (which is certainly not the best possible placement in most cases).

What often happens is that users try to improve such settings and then get surprised when suddenly all floats pile up at the end of the document. To stay with this example: if one changes the parameter \floatpagefraction to require, say, 0.8 of the float page, a float that occupies about 0.75 of the page will not be allowed to form a float page on its own. Thus, if there isn’t another float that could be added and actually fits in the remaining space, the float will get deferred and with it all other floats of the same class. But, even worse, this specific float is too big to go into the next top area as well because there the default maximum permissible area is 0.7 (from \topfraction). As a result all your floats stay deferred until the next \clearpage.

For this reason it is best not to meddle with the parameters while writing a document or at least not to do so in a way that makes it more difficult for the algorithm to place a float close to its call-out. For proof-reading it is far more important to have a figure next to the place it is referenced than to avoid half-empty pages. Possibilities for fine-tuning an otherwise finished document are discussed below.

Another conclusion to draw here is that there are dependencies between some of the float parameters; it is important to take these dependencies into account when changing their values.

4.5 “Here” just means “here if it fits”

. . . and often it doesn’t fit. This is somewhat surprising for many people, but the way the algorithm has been designed the h specifier is not an unconditional command. If an unconditional command is needed, extension packages such as the float package offer H as an alternative specifier that really means “here” (and starts a new page first if necessary).

4.6 Float specifiers do not define an order of preference

As mentioned above, the algorithm tries to place floats into available float areas in a well-defined order that is hard-wired into the algorithm: “here”, “top”, “bottom” and, on page boundaries, first “page” and only if that is no longer possible, “top” followed by “bottom” for the next page.

Thus specifying [bt] does not mean try bottom first and only then top. It simply means allow this float to go into top or bottom area (but not onto a float page) just like [tb] would.

4.7 Relation of floats and footnotes

This is not exactly a consequence of the algorithm but one of its implementation: Whenever LaTeX tries to decide on a placement for a float (or a \marginpar!) it has to trigger the output routine to do this. And as part of this process all footnotes on the page are removed from their current place in the galley and are collected together in the \footins box as part of TeX’s preparation for page production.

But after placing the float (or deferring it) LaTeX then returns the page material to the galley, and because of TeX’s output routine behavior the galley has now changed: all the footnotes have been taken out from their original places. So LaTeX has to put the footnotes back, but it can only place them in a single place (not knowing the origin anymore). What it does is reinsert the footnotes (the footnote text to be precise) at the end of the galley. There are some good reasons for doing this, one of which is that LaTeX expects that all of the returned material still fits on the current page.

However, if for some reason a page break is finally taken at an earlier point then the footnotes will show up on the wrong page or column. This is a fairly unlikely scenario and LaTeX works hard at making it a near-impossibility, but if it happens check if there is a float near the chosen page break and either move the float or guide the algorithm by using explicit page breaks. An example of this behavior can be found in another question on TeX.stackexchange [4]. In fact the particular case discussed in the question is worth highlighting: Do not place a float directly
after a heading, unless it is a heading that always starts a page. The reason is that headings normally form very large objects (as a heading prevents a page break directly after it). However, placing a float in the middle of this means that the output routine gets triggered before LaTeX makes its decision where to break and any footnotes get moved into the wrong place.

5 Documentation of the algorithm

As requested, here is some information on existing documentation. The algorithm and its implementation are documented in the file ltoutput.dtx as part of the LaTeX kernel source. This can be typeset standalone or as part of the whole kernel (i.e., by typesetting source2e.tex; ignore the checksum error³ if it is still there, sorry).

³ But this also means you are running an older release of LaTeX.

This documentation is an interesting historical artifact. Parts of it show semi-formatted pseudo-code which dates back to LaTeX 2.09; in other words it is from the original documentation by Leslie Lamport. The actual code is documented using doc style and in parts is more or less properly documented (from scratch) and dates back to 1994 or thereabouts when Chris Rowley and I adjusted and extended the original algorithm for LaTeX 2ε (the current version). It also fairly openly documents the various issues with the algorithm and/or its implementation; in many cases we didn’t dare to alter it because of the many dependencies and, of course, because of the danger to screw up too many existing documents that implicitly rely on the current behavior for good or ill.⁴ Near the end you’ll find a list of comments compiled on the algorithm back then, but there are also comments, questions, and tasks (?:-) sprinkled throughout the documentation of the code.

⁴ This is, for example, the reason that the correction of the issue discussed in section 4.2 was placed into the fixltx2e package and not made part of the kernel algorithm.

One interesting aspect of this file (that I forgot all about) is that it contains all the code necessary to trace the behavior of the algorithm in real life. It is fairly raw and detailed output and probably for that reason I didn’t make this publicly available back then. But even in its current form it does give some interesting insight into the behavior of the algorithm and how certain decisions come about.

Thus while writing this article I had second thoughts and now the most recent distribution of LaTeX (May 2014)⁵ offers the package fltrace that you can load to trace some strange float placement decisions, or simply to understand the algorithm a bit better. It offers the commands \tracefloats and \tracefloatsoff to start or stop tracing the algorithm and \tracefloatvals to display the current values of various float parameters that are discussed in this article.

⁵ If you have an earlier version of LaTeX installed, you can still extract this code yourself, by writing a short installation file fltrace.ins with the following content:

    \input docstrip
    \generateFile{fltrace.sty}{t}{%
      \from{ltoutput.dtx}{fltrace,trace}}
    \endbatchfile

and run this through LaTeX.

As the package is identical to the kernel code with tracing added, it may or may not work if you load any other package that manipulates that part of the kernel code. In such a case your best bet is to load fltrace first.

6 How to address specific issues

In the final section we discuss a few strategies to circumvent or resolve common issues. It is by no means comprehensive and you may find further information in other publications, e.g., The LaTeX Companion [2] that devotes a whole chapter to the topic of floats.

6.1 Ensure that floats appear “here”

Sometimes it is necessary to ensure that floats appear in-line at certain points in the document text even if that results in some partially empty pages. As discussed above the h specifier doesn’t provide this functionality but there are extensions that do, such as the float package which offers an H specifier for this purpose.

An alternative is the \captionof command from the caption package that generates a normal float caption (including its entry in the list of figures or tables, etc.) but without the need for a surrounding float environment.

6.2 Provide a bottom float area for two-column floats

As discussed above, the standard algorithm doesn’t support double-column floats at the bottom of pages. This missing functionality is added, except for the first page⁶, if you load the stfloats package.

⁶ See [5] in this issue to manually lift even this restriction.

6.3 Ensure that floats are always placed after their call-out

By default the LaTeX float algorithm allows for floats to move before their call-out as long as float and call-out are on the same page; more precisely, it allows floats to appear in the top area of the column in which the float has been encountered.
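As a brief illustration of the two options named in section 6.1, the following sketch uses the H specifier from the float package and \captionof from the caption package. It assumes both packages are installed; the labels, the placeholder rule and the small table are made up for the example and are not taken from the article.

```latex
\documentclass{article}
\usepackage{float}   % provides the H specifier
\usepackage{caption} % provides \captionof

\begin{document}
Some text with a call-out to figure~\ref{fig:here}.

% Option 1: a float that is forced to stay exactly here.
\begin{figure}[H]
  \centering
  \rule{4cm}{3cm}% placeholder for the real figure content
  \caption{A figure that is placed exactly here}\label{fig:here}
\end{figure}

More text with a call-out to table~\ref{tab:inline}.

% Option 2: no float environment at all, only a float-style caption.
\begin{center}
  \begin{tabular}{ll}
    a & b \\
    c & d
  \end{tabular}
  \captionof{table}{A table caption without a float environment}\label{tab:inline}
\end{center}
\end{document}
```

In both cases the material can no longer move, so a partially empty page may result if it does not fit where it is placed.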


This practice offers a better chance that the float is visible from the call-out position and doesn’t end up on a later page. For some journals, however, this is too liberal and they require that floats are strictly placed after their call-out, i.e., that in the call-out column only the bottom area forms a valid placement option. To accommodate this requirement, this strategy is implemented by the flafter package.

This may work well if your document has only a few floats. For documents with lots of floats, placement obviously becomes much more difficult, and you may find that all your floats appear together at the end of the document or chapter, or you may receive a “Too many unprocessed floats” error.

6.4 Prevent floats on certain pages

Sometimes it is helpful to prevent floats from appearing on a certain page, for example, to prevent a float in a new section from moving into the top area on the current page without disallowing a placement in the top area of a later page. For this type of fine-tuning LaTeX offers the command \suppressfloats[placement]. The optional argument can be either t or b and prevents any further placement into the respective area(s) on the current page. Without an argument, all remaining floats on the current page are deferred.

6.5 Implement float barriers

Standard LaTeX already implements a float barrier called \clearpage. Floats on either side will never appear on the other. It works by outputting all deferred floats, if necessary by generating float pages, and then starting a new page. While this is suitable to keep floats within one chapter (as chapters typically start on a new page) there are cases where one would wish for a less intrusive barrier, i.e., one that works without forcing a new page or is partially porous.

This functionality is offered by the placeins package, which implements a \FloatBarrier command that doesn’t introduce a page break. Through package options you can alter the behavior to allow for floats to migrate from one side to the other as long as they still appear on the same page.

6.6 Overwrite placement restrictions

If a given float is (slightly) too large to fit into a certain area or if an area already contains the maximum number of floats but you nevertheless want to force the current float into this place then adding ! to the optional argument of the float is a good choice. It results in ignoring all restrictions implemented through parameters for this particular float, so that it will always be placed unless there are already deferred floats with the same float class or the allowed areas get bigger than the available space when adding the float.

As the order of attempts is still the same (first top then bottom), you may have to use [!b] to force a float into the bottom area as [!tb] would normally already succeed in placing it into the top area. The downside is of course that if the float doesn’t fit, it will only appear in the bottom area of a following page. Thus any later text change may create havoc on your placement decisions.

6.7 Final tuning advice

There are many ways to fine-tune the behavior of the float placement algorithm; most of them have been discussed in this article. However, there is one more “tuning” possibility and in fact the biggest of all: changes in your document text.

Therefore, as final advice: do not start manipulating parameters, changing placement specifiers, or moving floats within your document until after you have fully written your text and your document is close to completion. It is a waste of effort and it may even result in inferior placements as your initially provided restrictions may no longer be adequate after a text change.

References

[1] Leslie Lamport. LaTeX: A Document Preparation System: User’s Guide and Reference Manual. Addison-Wesley, Reading, MA, USA, second edition, 1994. Reprinted with corrections in 1996.
[2] Frank Mittelbach, Michel Goossens, Johannes Braams, David Carlisle, and Chris Rowley. The LaTeX Companion. Tools and Techniques for Computer Typesetting. Addison-Wesley, Reading, MA, USA, second edition, 2004. Also available as an eBook, see http://www.latex-project.org/site-news.html#2013-11-02.
[3] Marco Daniel. How to influence the position of float environments like figure or table in LaTeX?, 2012. http://tex.stackexchange.com/q/39020.
[4] Martin Hermann. “thanks” note (footnote) placed below right column even though there is enough space on the left, 2012. http://tex.stackexchange.com/q/43294.
[5] Barbara Beeton. Placing a full-width insert at the bottom of two columns. TUGboat, 35(3):255, 2014. http://tug.org/TUGboat/35-3/tb111beet-banner.pdf.

⋄ Frank Mittelbach
  LaTeX3 Project
  http://www.latex-project.org


Placing a full-width insert at the bottom of two columns

Barbara Beeton

There’s one location on a two-column page where a full-width \begin{figure*} can’t be placed under ordinary circumstances: the bottom. The package stfloats lifts that restriction, except for the first page. The purpose of the present exercise is to demonstrate that this can in fact be done, using only basic LaTeX tools.

Why might one want (or need) to do this? Consider a project for which an interim report is best expressed as a table or diagram, with very little prose, but the required report format specifies two columns. The impact of the data is lost if the illustration must be deferred to another page, while the first page is nearly empty. In fact, the meat of the report could even be lost, when the impatient recipient fails to turn the page over.

At the 2010 TUG annual meeting, Frank Mittelbach presented a talk entitled “Exhuming coffins from the last century” that dealt with the problems of positioning boxes on a page. The talk didn’t make it into print in TUGboat, but Kaveh Bazargan was there with his recording equipment, and produced a video that can be viewed at river-valley.zeeba.tv/exhuming-coffins-from-the-last-century/. The techniques proposed there won’t solve this problem any time soon, but they show promise for the future.

At TUG 2014 in Portland, Boris Veytsman gave a talk¹ on composing a book in which the illustrations were more important (and occupied more space) than the text, and indeed, there were pages with two columns of text at the top and a single wide illustration at the bottom. However, the nature of the material allowed all pages to be divided into four quadrants which could be managed individually or as horizontal or vertical pairs. That doesn’t help in solving the more general problem.

¹ “An output routine for an illustrated book: Making the FAO Statistical Yearbook”, TUGboat 35:2, pages 202–204.

So what can be done today? A LaTeX-flavored kludge that will produce a one-page document with two columns at the top and a full-width insertion at the bottom is shown in fig. 1.² Of course, this also works for a longer document, but for this demonstration, one TUGboat page is sufficient. It is also evident that the method works with footnotes (and other such insertions), and that cross-references work normally.

² This technique was presented in TeX.stackexchange.com/q/107270.

    \documentclass{ltugboat}
    \title{Placing a full-width insert at the bottom of two columns}
    \author{Barbara Beeton}
    \begin{document}
    \maketitle
    There's one location on a two-column page where a full-width ...
    \begin{figure}[b]\setlength{\hfuzz}{1.1\columnwidth}
    \begin{minipage}{\textwidth}
    \ttfamily ... code for the insertion ...
    \end{minipage}
    \end{figure}
    At the 2010 \tug\ annual meeting, Frank Mittelbach presented ...
    \enlargethispage{-16.5\baselineskip}
    Of course, this is entirely manual, and requires intervention ...
    \end{document}

Figure 1: A full-width figure at the bottom of the first page!

The figure is given as an overwide single-column [b] figure, in the first column. The page must contain enough text to continue into the second column. Once there is enough text, the trick is to issue a negative \enlargethispage command that will leave the bottom part of the second column blank, allowing the full-width figure to overflow into the empty area. (On a two-column page, \enlargethispage is equivalent to the (nonexistent) \enlargethiscolumn.)

Of course, this is entirely manual, and requires intervention and iteration, preferably after the text is final. Tweaking of LaTeX’s float parameters, such as \bottomfraction, is likely. Captions may require still more effort. Nevertheless, there are situations in which it makes possible a desirable effect that cannot otherwise be accomplished. Enjoy!

⋄ Barbara Beeton
  http://tug.org/TUGboat
  tugboat (at) tug dot org

biblatex variations

Ulrike Fischer

Abstract

I show three small examples of using the biblatex package for more than printing bibliographies: we will redefine bibliography drivers, define new cite commands and declare new entry types to create qrcodes, insert PDF files and manage addresses.

Remark

In April I gave a talk at the DANTE e.V. meeting about biblatex variations and later wrote an article for the proceedings in Die TeXnische Komödie. This is more or less a translation of that article. I didn’t adapt the examples, so they are still in German.

ad hoc small-scale databases

I have always been interested in the handling of small databases. I wrote my first articles in DTK ([1, 2]) about a mail merging system that I developed to handle around 50 addresses. And later on I regularly had to find ways to automatically process small numbers of data records without too much fuss.

For such small scale databases, the bib format is an interesting option, at least on the input side. Looking at it without the prejudice “that is something for bibliographies only” (Listing 1 shows such a typical bib file, which I will use in this article), one can see that it has several features making it suitable for ad hoc small scale databases:
• One can mix different datatypes in one file.
• One can easily create new datatypes.
• One can easily add new fields.
• Fields without values can be (should be) omitted. This saves space.
• The records can be created, changed and read with any editor, but there are also good GUIs (e.g. JabRef), so the database can be edited by people who don’t know LaTeX.
• With @string one can define variables.
• It is possible to define relations between records, e.g. with crossref.

So, it is easy to create a small database but . . . how should one process and output the records? Before biblatex this was not usually feasible. Creating a suitable bst file was a difficult and time-consuming task. But with biblatex this has completely changed. Now it is possible, with very little effort, to output the records in various ways. The following examples are meant to demonstrate this. They will show various methods one can use. The examples are kept as simple as possible.

Listing 1: The example bib file vortrag.bib

    @termin{dante2014,
      title={Biblatex-Variationen},
      date ={2014-04-11},
      time ={15.15},
      location={Heidelberg}}

    @online{dante,
      title={Internetseite dante e.V.},
      url ={http://www.dante.de}}

    @online{heidelberg,
      title={Stadt Heidelberg},
      url ={http://www.heidelberg.de}}

    @adresse{max,
      name ={Muster, Max},
      strasse ={Im Versuchsweg 10},
      ort ={Testgelände},
      plz ={X01234},
      gender ={sm}}

    @adresse{eva,
      name ={Muster, Eva},
      strasse ={Im Versuchsweg 10},
      ort ={Testgelände},
      plz ={X01234},
      gender ={sf}}

    @article{input1,
      author={Fischer, Ulrike},
      title ={Erster Text},
      journal={Beispiele},
      date ={2012-04-08},
      url ={inputtext1.pdf}}

    @article{input2,
      author={Fischer, Ulrike},
      title ={Zweiter Text},
      journal={Beispiele},
      date ={2013-02-07},
      url ={inputtext2.pdf}}

    @book{gambol,
      author={Gambolputty de von Ausfern-schplenden-schlitter-crasscrenbon, Johann},
      title={Titel},
      year={1970}}

    @book{dante2007,
      author = {Dante Alighieri},
      title = {Die Göttliche Kommödie},
      gender = {sm},
      location = {Stuttgart},
      year = {2007},
      translator = {Hermann Gmelin}}

Listing 2: The creation of QR-codes

     1 % Compile with XeLaTeX
     2 \documentclass{article}
     3 \usepackage[margin=0.05in,textwidth=1.9in,textheight=1.9in,paperwidth=2in,paperheight=2in]{geometry}
     4 \usepackage{xcolor}
     5 \usepackage{pst-barcode}
     6 \usepackage{fontspec}
     7 \usepackage{biblatex}
     8 \addbibresource{vortrag.bib}
     9
    10 \defbibenvironment{qrcode}{\centering}{}{}
    11
    12 \DeclareBibliographyDriver{online}{%
    13 \begin{minipage}[c][1.9in]{1.9in}
    14 \centering
    15 \printtext{\thefield{entrykey}}\\[2ex]
    16 \printfield{title}\\[2ex]
    17 \printfield{url}
    18 \end{minipage}
    19 \newpage
    20 \begin{pspicture}(1.9in,1.9in)
    21 \label{\thefield{entrykey}}%
    22 \psbarcode[linecolor=red]{\thefield{url}}{width=1.9 height=1.9}{qrcode}%
    23 \end{pspicture}%
    24 \newpage}
    25
    26 \begin{document}
    27 \nocite{*}
    28 \printbibliography[env=qrcode,type=online,heading=none]
    29 \end{document}

Figure 1: The PDF output pages with QR-codes (four pages; the text pages show the entry key, title and url of the entries dante and heidelberg)

Listing 2 shows how one can create QR-codes from the url field of a bib file. The actual document body is very short (lines 26–29): all entries are cited with \nocite{*} and then the bibliography is printed with \printbibliography, which is given three options: heading=none suppresses the heading of the bibliography; type=online only outputs entries of type @online; and env=qrcode ensures that the bibliography is not a rather complicated list but the simple environment defined in line 10.

The start of the listing (lines 3–8) is basic: the page size is set to 1.9 in × 1.9 in, needed packages are loaded and the name of the bib file is declared.

Line 10 defines a simple qrcode environment, as the standard list environment would only insert unwanted spaces.

The core of the approach is in lines 12–24. They define how a @online entry is formatted in the bibliography. The code creates two pages for each entry.
In larger The first page shows some information about databases one would likely need to add some security the following QR-codes. This page is not required; I precautions, such as tests for empty fields. produce it only because it looks nice. The code use the biblatex commands \printtext, \printfield and \thefield to output fields from the bib record. 1 Example 1: QR-codes Lines 20–24 create the second page with the In this example, we first create a PDF file which con- QR-code. A useful addition is the label, specified in tains the QR-code of the url field of every @online line 21: an external document is then able to find entry in a bib file. The QR-codes can then be in- the page with the QR-code of a specific bib key (the serted with \includegraphics in another document. “entrykey”). In line 22 the QR-code is created. The As methodology, the redefinition of a bibliography url is inserted with \thefield{url}. driver is shown. Compiling with xelatex–biber–xelatex results in a PDF file with four pages, shown in figure 1. 1.1 The creation of the QR-codes Herbert Voß has shown how QR-codes can be created 1.2 Using the QR-codes generally in a DTK article [3]. That method uses Listing 3 shows how one can insert the QR-codes the package pst-barcode, meaning that one needs a in other documents. The code uses the package TEX compiler which can handle PostScript; I usually refcount to convert a label to a number which can use xelatex. be used with the option page of \includegraphics.


It also uses the package xr to access the labels of the document with the QR-codes. The code must be able to find the PDF and aux files of the external document with the QR-codes.

Listing 3: Inserting the QR-codes

    \documentclass[parskip=half-]{scrartcl}
    \usepackage{graphicx,refcount,xr}
    \externaldocument[qrcode-]
      {qrcodes-bib-erzeugen}
    \begin{document}
    \section*{QR-Codes laden}
    \includegraphics[page=
      \getpagerefnumber{qrcode-dante}]
      {qrcodes-bib-erzeugen-dtk}
    \quad
    \includegraphics[page=
      \getpagerefnumber{qrcode-heidelberg}]
      {qrcodes-bib-erzeugen-dtk}
    \end{document}

2 Example 2: Inserting PDF attachments

The second example inserts a PDF file in a document and writes information about this file to the table of contents. The method used is the definition of a new cite command. The example also shows that things do not always work as smoothly as one would wish.

Listing 4: Inserting PDF attachments

     1  \documentclass[parskip=half-,toc=flat]{scrartcl}
     2  \usepackage[utf8]{inputenc}
     3  \usepackage[T1]{fontenc}
     4  \usepackage[ngerman]{babel}
     5  \usepackage[autostyle]{csquotes}
     6  \usepackage{pdfpages}
     7  \usepackage[style=authoryear]{biblatex}
     8  \addbibresource{vortrag.bib}
     9
    10  \newcounter{anlage}
    11
    12  \DeclareCiteCommand{\citeanlageX}
    13   {}
    14   {\clearpage\refstepcounter{anlage}%
    15    \addtocontents{toc}
    16     {\protect\contentsline
    17       {section}
    18       {\protect\numberline{Anlage~\theanlage}%
    19        \protect\fullcite{\thefield{entrykey}}%
    20       }%
    21       {\thepage}
    22       {}}%
    23    \IfFileExists{\thefield{url}}
    24     {\includepdf[pages=-,fitpaper]{\thefield{url}}}
    25     {\par\textbf{Datei zu \thefield{entrykey} wurde nicht gefunden!}%
    26      \clearpage}}
    27   {}{}
    28
    29  \makeatletter
    30  \newcommand\citeanlage[1]{%
    31   \begingroup
    32   \let\blx@leavevmode\relax
    33   \let\blx@leavevmode@cite\relax
    34   \citeanlageX{#1}%
    35   \endgroup}
    36  \makeatother
    37
    38  \begin{document}
    39  \tableofcontents
    40
    41  \citeanlage{input1} \citeanlage{input2}
    42  \end{document}

Listing 4 shows the core idea: in lines 12–27 a new cite command with the name \citeanlageX is defined. Defining a cite command is a bit more complicated than defining a driver, as a cite command can have lists of entries as an argument.

In the so-called loopcode argument (lines 14–26) a new page is started and the counter for the attachment is advanced (line 14). Then in lines 15–22 an entry for the toc file is written. The content of this entry is the word “Anlage” (attachment) followed by the number and a \fullcite.

In lines 23–26 the file from the url field is included with \includepdf. The existence of the file is checked first with \IfFileExists.

\citeanlageX does everything that is needed, but it has a flaw: it can lead to unwanted empty pages. The problem is that every biblatex cite command internally executes \leavevmode, and so can start a page. Together with the \clearpage there is then one page too many.

So I had to dig around a bit in the code to find the source of the \leavevmode. Happily it is easy to deactivate it locally. This is done in lines 29–36.

    Inhaltsverzeichnis
    Anlage 1  Ulrike Fischer (2012). „Erster Text“. In: Beispiele. URL: inputtext1.pdf   2
    Anlage 2  Ulrike Fischer (2013). „Zweiter Text“. In: Beispiele. URL: inputtext2.pdf   4

Figure 2: The table of contents created by Listing 4
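The article notes that larger databases would need safeguards such as tests for empty fields. As a minimal sketch of such a test (not taken from the original article), the loop code of a cite command like \citeanlageX could first check that the url field is defined, using biblatex's \iffieldundef, before it looks for the file. The command name \citeanlageY and the English messages are assumptions made for illustration, and the counter and table-of-contents code of Listing 4 is left out to keep the sketch short.

```latex
% Sketch only: belongs in the preamble of a document like Listing 4
% (biblatex and pdfpages must be loaded).
% \citeanlageY is a hypothetical name, not a command from the article.
\DeclareCiteCommand{\citeanlageY}
  {}
  {\clearpage
   \iffieldundef{url}
     {% the entry has no url field at all
      \textbf{Entry \thefield{entrykey} has no url field!}}
     {% the field exists; now test whether the file can be found
      \IfFileExists{\thefield{url}}
        {\includepdf[pages=-,fitpaper]{\thefield{url}}}
        {\textbf{File for \thefield{entrykey} not found!}}}%
   \clearpage}
  {}
  {}
```

Like \citeanlageX, such a command would presumably still need the wrapper from lines 29–36 of Listing 4 to avoid the extra page caused by \leavevmode.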


3 Example 3: Managing addresses

The third example is a bit more complicated. It shows how to manage addresses in a bib file. The new method shown this time is how the datamodel can be extended.

Listing 5: The datamodel file ufischer.dbx

     1  \DeclareDatamodelEntrytypes{adresse}
     2
     3  \DeclareDatamodelFields[type=list,datatype=name]
     4    {name}
     5
     6  \DeclareDatamodelFields[type=field,datatype=literal]
     7    {strasse,ort,plz}
     8
     9  \DeclareDatamodelEntryfields[adresse]{%
    10    name,strasse,ort,plz,gender}

New entry types and fields should be declared in a dbx file, as shown in Listing 5. In line 1 a new entry type adresse is added and in lines 3–7 new fields for this type. One should also declare which fields can be used by an entry type. A new entry type can use known fields, such as the field gender on the last line.

Listing 6: How to use the type @adresse

     1  \documentclass[parskip=half-,toc=flat,fontsize=9pt,DIV = 9,
     2    paper=a5,pagesize,headings=normal]{scrartcl}
     3  \usepackage[utf8]{inputenc}
     4  \usepackage[T1]{fontenc}
     5  \usepackage[ngerman]{babel}
     6  \usepackage[autostyle]{csquotes}
     7  \usepackage[datamodel=ufischer,defernumbers]{biblatex}
     8  \addbibresource{vortrag.bib}
     9
    10  \DeclareBibliographyDriver{adresse}{%
    11    \printnames{name}\setunit{\addcomma\addspace}%
    12    \printfield{strasse}\setunit{\addcomma\addspace}%
    13    \printfield{plz}\setunit{\addspace}\printfield{ort}%
    14    \usebibmacro{finentry}}
    15
    16  \DeclareNameFormat[adresse]{anrede}{#1}
    17
    18  \DeclareCiteCommand{\citeanrede}{}{%
    19    \iffieldequalstr{gender}{sm}
    20      {\printtext{Herr}}{\printtext{Frau}}%
    21    \setunit{\addspace}\printnames[anrede]{name}}
    22    {}{}
    23
    24  \DeclareCiteCommand{\citeadresse}{}{%
    25    \printtext{\par\noindent}%
    26    \iffieldequalstr{gender}{sm}{\printtext{Herrn}}{\printtext{Frau}}%
    27    \setunit{\\}\printnames{name}%
    28    \setunit{\\}\printfield{strasse}%
    29    \setunit{\\}\printfield{plz}\setunit{\addspace}\printfield{ort}}
    30    {}{}
    31
    32  \begin{document}
    33  \citeadresse{max}
    34  \citeadresse{eva}
    35
    36  \bigskip
    37  Lieber \citeanrede{max}, liebe \citeanrede{eva},
    38
    39  schaut euch doch mal \cite{dante} an und lest \cite{input1}
    40
    41  \printbibliography[type=adresse,title=Verteilerliste]
    42  \printbibliography[nottype=adresse,resetnumbers]
    43  \end{document}

Listing 6 shows how to use the new entry type. First, the new datamodel is loaded in line 7 with datamodel=ufischer.

Lines 10–14 declare a new bibliography driver adresse for the distribution list; it creates a simple, comma-separated list of the data.

Line 16 declares a name format for the greeting in the letter. It prints only the last name from a name.

In lines 18–22 a cite command for the greeting is defined: \citeanrede. It uses the gender field to decide if “Frau” or “Herr” should be printed before the name.

In lines 24–30 another cite command is defined: \citeadresse. This command is meant for the address window and prints the data line-by-line.¹

Lines 32–43 show how the various commands can be used. The output can be seen in figure 3.

¹ Many of the comment signs in the code are not needed but they do no harm, either.

    Herrn
    Max Muster
    Im Versuchsweg 10
    X01234 Testgelände

    Frau
    Eva Muster
    Im Versuchsweg 10
    X01234 Testgelände

    Lieber Herr Muster, liebe Frau Muster,
    schaut euch doch mal [2] an und lest [1]

    Verteilerliste
    [1] Max Muster, Im Versuchsweg 10, X01234 Testgelände.
    [2] Eva Muster, Im Versuchsweg 10, X01234 Testgelände.

    Literatur
    [1] Ulrike Fischer. „Erster Text“. In: Beispiele (8. Apr. 2012). url: inputtext1.pdf.
    [2] Internetseite dante e.V. url: http://www.dante.de.

Figure 3: The output of Listing 6

4 Finally: How to control the content of the bibliography?

When one starts to “misuse” bib entries for things other than standard citations, one quickly finds the need to prevent such entries from finding their way into the “normal” bibliography. The previous examples have already shown some possibilities: biblatex has excellent filter features. E.g., with nottype one can exclude a given type from a bibliography.

But this doesn't help when we need to avoid specific cite commands leading to an entry in the bibliography. Listing 7 demonstrates the problem. It “mis-”uses the standard \citeauthor to facilitate the writing of a complicated name (here, from a Monty Python sketch). Neither \citeauthor command should trigger entries in the bibliography. But dante2007 is cited normally later on. So it is not possible to exclude (e.g., with the keyword skipbib) both entries completely from the bibliography.

Listing 7: Unwanted cite commands

    ... Namen \citeauthor{gambol} und
    \citeauthor{dante2007} ist nicht leicht ...
    Die göttliche Komödie... \cite{dante2007}

Listing 8 gives one possible solution to this dilemma: we introduce a new category inbib (line 1) and a new boolean variable citeinbib (lines 2–3). Using \AtEveryCitekey, a cited entry is added to the category if the variable is true (lines 4–6). If a cite command should not create an entry in the bibliography, citeinbib is set locally to false (lines 9–10), so the entry is not added to the category. Finally, \printbibliography can then filter using the category (line 13).

Listing 8: One solution for cite commands which should not lead to bibliography entries

     1  \DeclareBibliographyCategory{inbib}
     2  \newboolean{citeinbib}
     3  \booltrue{citeinbib}
     4  \AtEveryCitekey{%
     5    \ifbool{citeinbib}{%
     6      \addtocategory{inbib}{\thefield{entrykey}}}{}}
     7
     8  \begin{document}
     9  Die Schreibweise der Namen {\boolfalse{citeinbib}\citeauthor{gambol} und
    10  \citeauthor{dante2007}} ist nicht leicht zu merken.
    11  Die göttliche Komödie... \cite{dante2007}
    12
    13  \printbibliography[category=inbib]
    14  \end{document}

    Die Schreibweise der Namen Gambolputty de von Ausfern-schplenden-schlitter-crasscrenbon und Alighieri ist nicht leicht zu merken. Die göttliche Komödie ... [1]

    Literatur
    [1] Dante Alighieri. Die Göttliche Kommödie. Übers. von Hermann Gmelin. Stuttgart, 2007.

Figure 4: The output of Listing 8

5 Summary

I hope the three examples have shown that biblatex offers much more than a way to print bibliographies.

References

[1] Ulrike Fischer. Eine Schnittstelle zwischen Datenbanken und LaTeX. Die TeXnische Komödie, 4/98:28–33, Dec. 1998.
[2] Ulrike Fischer. Serienbriefe. Die TeXnische Komödie, 2/99:38–44, May 1999.
[3] Herbert Voß. QR-Codes im Rand ausgeben. Die TeXnische Komödie, 4/13:34–37, Nov. 2013.

⋄ Ulrike Fischer
  Bismarckstr. 91
  41061 Mönchengladbach
  fischer (at) troubleshooting-tex dot de


Every LaTeX document brings new programming issues

David Walden

TUGboat, Volume 35 (2014), No. 3

Background: For several years I wrote a column, called Travels in TeXland, for The PracTeX Journal about the way I used LaTeX. Although I am not a LaTeX expert, I was once a full-time computer programmer and am not afraid to bash around trying to find some (perhaps ad hoc) way to accomplish some typesetting goal. After I stopped writing the column, I still sometimes wrote up something I had figured out for the purpose of reflecting on what was done. The present article pulls together three such reflections.

1 Introduction

There is seldom a time I am not composing a document drafted in LaTeX. Each document brings its own style and efficiency issues and, thus, each time I seem to have to solve some new little LaTeX programming problem.

A natural question might be, “Why not find an existing package that does what you need rather than coding your own thing?” Of course, sometimes what I need may well be provided by an existing package. However, mostly I am too lazy to search out an existing package if I can't immediately find one that meets my needs; or I find one but it doesn't install and work without difficulty and without much study. I'd rather code my own little thing than struggle with package installation issues or inter-package interface issues. I'd rather do my own little thing (even if it is less efficient than what exists) to avoid having to understand complex documentation.¹

LaTeX is swell because it is programmable, such that I can create little “tools” that help me do what I want to do. Also, like any experienced programmer, I collect these little solutions for reuse in future documents by copying rather than new thinking.

In this note I give three examples of such little programming problems and the (perhaps quick-and-dirty) solutions at which I arrived:

• Issues and ways for typesetting ellipses
• Blank verso sides without using the book class's twoside option
• Flexible layout of a photo album

All three examples here derive from my work on self- or private publishing, about which I have written before.²

2 Typesetting ellipses with LaTeX

There are many situations and approaches for using ellipses in LaTeX. After sketching some of the situations and a few of the approaches, I describe what I do.

2.1 Diversity of situations

I am mainly concerned with the use of ellipses in American English, non-mathematical writing. In this context, ellipses seem to have two purposes: (1) to indicate where something has been left out of quoted text; (2) to indicate a pause or something never stated, such as an unfinished thought or an implicit thought (“and so on”).

Here are three examples (in which I have not tried to perfect the typesetting of the ellipses):

1. “Four score and seven years ago our fathers brought forth . . . a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.”
   The words “on this continent” have been left out.
2. My wife may think that I am fussy about little things . . .
   The ellipsis here might indicate that I have a lot more to say about this subject that will go unspoken.
3. “Leave me alone . . . I'm too tired to talk about it,” he said.
   If the quoted words are in dialog, the ellipsis might indicate a pause in the speech.

Ellipses are complicated for at least three reasons:³ (1) they can have many different uses; (2) there are different sets of conventions for how to indicate the elision, for example, always using three dots or sometimes using three and sometimes using four dots depending on context within sentences; (3) there are different approaches for typesetting ellipses.

Here are some of the possible contexts for use of ellipses:

• at the end of a sentence
• within a sentence
• at the beginning of a line
• at the end of a line
• within a line
• with, or without, other punctuation before, or after, the ellipsis

Many combinations of these and other situations can also occur.

2.2 Different conventions

On pages 82–83 of his The Elements of Typographic Style,⁴ Robert Bringhurst gives a sketch covering some of the conventions for using ellipses. In the following quotation, the instances of ellipses in the first paragraph and the very last instance are part of Bringhurst's text (and are typeset as specified). The rest of the ellipses are by me to indicate my elisions from the Bringhurst quote.

    Most digital fonts now include, among other things, a prefabricated ellipsis (row of three baseline dots). Many typographers nevertheless prefer to make their own. Some prefer to see the three dots flush ... with a normal word space before and after. Others prefer . . . to add thin spaces between the dots. Thick spaces (M/3) are prescribed by the Chicago Manual of Style, . . . In most cases the Chicago ellipsis is much too wide. Flush set ellipses work well with some fonts and faces but not with all. . . . At small text sizes . . . it is generally best to add space . . . between the dots. Extra space may also look best in the midst of light, open letterforms, . . . , and less space in the company of a dark font, . . . , or when setting in bold face. . . .

    In English ..., when the ellipsis occurs at the end of a sentence, a fourth dot, the period, is added and the space beginning the ellipsis disappears. . . . When the ellipsis combines with a comma, exclamation mark or question mark, the same typographic principle applies. Otherwise a word space is required fore and aft.

However, the Bringhurst summary leaves out other common conventions. For instance, pages 292–296 of my copy of the Chicago Manual of Style⁵ start by giving two main conventions for the form of ellipses: (1) always only three dots, or (2) four dots at the end of sentences (or three dots and another punctuation mark, as described by Bringhurst) and three dots elsewhere. The latter is the manual's preferred convention. Ellipses in block quotes also provide additional circumstances beyond those mentioned in Bringhurst's sketch.

There are a number of useful on-line discussions of the use of ellipses, for example, in the Wikipedia⁶ and in Doc Scribe's Guide to research styles, where you can look up the approaches recommended in five well-known style guides (AMA, APA, ASA, Chicago, and MLA).⁷

2.3 Standard tools versus hand crafting

It seems to me that merely using \dots or \ldots in LaTeX is often not enough to address some of the potential needs mentioned in the previous sections. (See also section 2.5.)

Also, naive use of \dots apparently has a problem that Peter Heslin's ellipsis style⁸ works on fixing. As Heslin says,

    There is a problem in the way LaTeX handles ellipses: it always puts a tiny bit more space after \dots in text mode than before it, which often results in the ellipsis being off-center when set between two other things.

It is worth reading the documentation of Heslin's package, which also describes some of the issues relating to using ellipses. The package also allows one to specify the Chicago or MLA style and to specify the spacing between dots (e.g., in terms of an em) in an ellipsis.

Another package is lips.sty,⁹ which Heslin suggests using if one wants the full Chicago style.

With so many possibilities and needs, it is not surprising that, as Bringhurst says, “Many typographers nevertheless prefer to make their own.” It is hard for me to imagine a package with sufficient capabilities and options for everyone. (But maybe I'm wrong.)

2.4 My approach

My approach has been to define a few macros to handle common situations for using ellipses in the writing I do. These also implement my own preferences, such as for inter-dot spacing and spacing before and after an ellipsis. I have used a couple of versions of these macros.

Version 1 of ellipses

The following definitions were sufficient for the Breakthrough Management book (walden-family.com/breakthrough) which I co-authored and typeset. I used the Minion typeface for this book. The comments in the following code provide the relevant explanations.

    % dots for main text
    \def\bigdotsspace{3pt}

    % three dots
    % I like the same size space on each side
    % of the ellipsis as between its dots.
    \def\mydots{\hbox{\hspace{\bigdotsspace}%
    .\hspace{\bigdotsspace}%
    .\hspace{\bigdotsspace}%
    .\hspace{\bigdotsspace}}}

    % period and three dots = four altogether
    % and the same size space after a period
    % and before 3 dots
    \def\fmydots{\hskip0pt{}%
    \hbox{.\hspace{\bigdotsspace}%
    .\hspace{\bigdotsspace}%
    .\hspace{\bigdotsspace}%
    .\hspace{\bigdotsspace}}}


    % dots with only beginning space,
    % no following space.
    % I use this with an ellipsis and
    % following comma, etc.
    \def\mydotsnfs{\hbox{\hspace{\bigdotsspace}%
    .\hspace{\bigdotsspace}%
    .\hspace{\bigdotsspace}%
    .}}

    % dots for block quote text,
    % which is smaller than main text
    \def\smalldotsspace{2pt}

    % small dots without end spaces
    \def\minsmalldots{\hbox{%
    .\hspace{\smalldotsspace}%
    .\hspace{\smalldotsspace}%
    .}}

    % small dots with end spaces
    \def\smydots{\hbox{\hspace{\smalldotsspace}%
    \minsmalldots\hspace{\smalldotsspace}}}

    % period + three small dots = four altogether
    \def\fsmydots{\hskip0pt{}%
    \hbox{\hspace{.3pt}.\hspace{\smalldotsspace}%
    \minsmalldots\hspace{\smalldotsspace}}}

Version 2 of ellipses

I used the following set of definitions with the book I compiled and typeset about the technology history of the company Bolt Beranek and Newman.¹⁰ This book uses the Lucida Bright typeface. For this second book, I had learned about doing things in terms which varied with font size, e.g., among main, footnote, and block quote text. My decisions in the following definitions are only about what looks good to me, not about the conventions of a particular style manual. (The comments in the following code provide the relevant explanations.)

    % with an end-of-sentence ellipsis I use
    % a following thin, not word, space
    \def\fourdots{\hbox{.\hspace{.33em}%
    .\hspace{.33em}.\hspace{.33em}.\,}}

    % sometimes I don't even want the
    % trailing thin space (\,), e.g., at end of line
    \def\fourdotstightright{\hbox{.\hspace{.33em}%
    .\hspace{.33em}.\hspace{.33em}.}}

    % for a non-end-of-sentence ellipsis
    \def\threedots{\hbox{\,.\hspace{.33em}%
    .\hspace{.33em}.\,}}

    % for beginning of line, e.g., block quote
    \def\threedotstightleft{\hbox{.\hspace{.33em}%
    .\hspace{.33em}.\,}}

    % for end of line, e.g., in a block quote
    \def\threedotstightright{\hbox{\,.\hspace{.33em}%
    .\hspace{.33em}.}}

    % tried this but didn't use it
    %\def\sentencespace{\unkern\spacefactor=3000
    % \space\ignorespaces}

    % also tried this but didn't use it
    %\def\fourdots{\unskip\kern\fontdimen3\font
    % .\kern.1667em\ldots\sentencespace{}}

    % also tried this but didn't use it
    %\def\threedots{\unskip\ \ldots\unkern{}}

    % for use in a footnote; I originally thought
    % I might need a different definition, but
    % then I was happy with the main text ratios
    \def\fnfourdots{\fourdots{}}

    % ditto
    \def\fnthreedots{\threedots{}}

Ellipses summary

It is easy to see how my set of definitions could be adapted to use word or sentence spaces before and after an ellipsis together with some other appropriate inter-dot spacing, or adapted to other traditional or personal conventions. For instance,

    % sentence space after four dots:
    \def\fourdots{\hbox{.\hspace{.33em}%
    .\hspace{.33em}.\hspace{.33em}}. }

I am also sure that there are better approaches than mine for handling a variety of ellipsis situations in LaTeX, or at least better ways to do what I am doing (perhaps automatically detecting whether an ellipsis is at the beginning or end of a line and thus eliminating the need for those definitions).

2.5 Ellipses: appendix

By the way, in the file latex.ltx, I found the following definitions, presented without comment.

    \DeclareTextCommandDefault{\textellipsis}{%
      .\kern\fontdimen3\font
      .\kern\fontdimen3\font
      .\kern\fontdimen3\font}

    \DeclareRobustCommand{\dots}{%
      \ifmmode\mathellipsis\else\textellipsis\fi}

    \let\ldots\dots
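The \textellipsis definition quoted above places a kern after each of the three dots, including the last one, but nothing before the first. That matches Heslin's remark that \dots has slightly more space after it than before it. To compare the results on paper, a small test file along the following lines could be used. It is only a sketch: the two macro definitions are copied from Version 2 above, while the sample sentences and the choice of the article class are assumptions made for illustration.

```latex
\documentclass{article}

% Version 2 definitions, copied from the article
\def\fourdots{\hbox{.\hspace{.33em}%
  .\hspace{.33em}.\hspace{.33em}.\,}}
\def\threedots{\hbox{\,.\hspace{.33em}%
  .\hspace{.33em}.\,}}

\begin{document}
% Kernel ellipsis: a kern follows each dot, including the last one
brought forth \dots\ a new nation

% Hand-made ellipsis: thin space on both sides, .33em between dots
brought forth\threedots a new nation

% End-of-sentence form: period plus three dots, then a thin space
fussy about little things\fourdots The next sentence starts here.
\end{document}
```

Because the hand-made macros bring their own surrounding space, they are written directly against the preceding word, and the space that TeX skips after a control word does no harm.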


3 Blank verso sides without using the book \include{title-pages} class twoside option \include{preface} \mainmatter Several years ago I was involved in creating a small 11 \include{history} book (approximately 100 pages ), and a year later I \EOC did the LATEXing of a small pamphlet (approximately 12 \include{toms-webpage-r1} 60 pages ). In both cases, the document needed to \EOC look like a book, but using all the built-in capabilities \include{uses-r} of the book class wasn’t necessary. Therefore, I \include{views} drafted my own class file (which, in the first case, \EOC Karl Berry significantly improved) and loaded that \include{other} on top of the standard book class, e.g., \backmatter \include{biblio} \documentclass{book} \EOC \usepackage{ctssbook} \input{colophon} Naturally, the added style file included a macro, \EOC \beginnewchapter, which reset the various counters \end{document} (such as footnote and figure numbers), formatted the In the pamphlet shown in this example, some chap- chapter title, changed the running headings, and put ters already end on verso sides and calls to \EOC are the chapter title in the table of contents using the not needed. Also, the command \tableofcontents command is included in the title-pages.tex file; and, since \addcontentsline{toc}{chapter} the table of contents in this case is only one page {\protect\fmttocnumber{\thechapter}#1} long, it also includes a call of \EOC. As I was finishing this pamphlet, I needed PDFs where #1 is the chapter title passed to the macro both for sending to the printer and for posting on via the macro call (and \fmttocnumber is a macro the web. For the printer, there needed to be two that formats a right-justified chapter number in a PDFs: one for the color cover (i.e., a single file of properly sized field). 
the back cover, spine, and front cover), and one Because the command \chapter is never given for the grayscale interior of the pamphlet including A in the LTEX for these two books, the two-side and blank pages at the end of chapters as needed to one-side capabilities of the book style aren’t available. start chapters on recto sides. For the web, I needed This is not a problem. For a document that will a single PDF with the front and back covers at the be printed and needs to start a new chapter on a beginning and end of the interior pages, and I decided recto side, it is easy enough (in the last stages of I wanted to leave out the blank verso sides from the typesetting) to, first, perfect the page breaks: for interior, but keep the same page numbers as in the this I typically use calls to macros such as print version. \newcommand{\Lpushlines}[1] Thus, I created a macro to conditionally add {\enlargethispage{-#1\baselineskip}} the covers to the interior and augmented the \EOC \newcommand{\Lpulllines}[1] macro to add blank verso sides only when needed for {\enlargethispage{#1\baselineskip}} the print version. And then, second, to go through the root file of \def\Forweb{0} %0 = print the document and add a macro call (after the com- %\def\Forweb{1} %1 = web mands to input the content of chapter, frontmatter and backmatter files) to create the necessary blank \RequirePackage[final]{pdfpages} verso sides where needed. This end-of-chapter macro \def\Covers#1{% definition is something like \ifodd\Forweb % #1 is cover filenames \def\EOC{% \includepdf[pages=1-1]{#1.pdf}% \newpage\null\thispagestyle{empty}\newpage \fi} } \RequirePackage{ifthen,changepage} and the resulting root file looked like this: % if for web, increment page counter \documentclass{book} % if for print, output blank page \usepackage{ctssbook} \def\EOC{\newpage\checkoddpage \begin{document} \ifthenelse{\boolean{oddpage}}% \frontmatter {} {\ifodd\Forweb\stepcounter{page}%

David Walden TUGboat, Volume 35 (2014), No. 3 265

    \else\null\thispagestyle{empty}%
    \newpage\fi}}

Thus, I added \Covers macro calls bracketing the rest of the document as follows:

    \begin{document}
    \Covers{front-cover}
    ...
    \Covers{back-cover}
    \end{document}

I also then included a call to the revised \EOC macro after including each chapter, frontmatter, and backmatter file (title-pages.tex already contained a call to \EOC).

4 Flexible layout of a photo album

This past year I decided to print a dozen or so copies of an album of old photos to distribute to family members. The photos had been pulled from several photo albums of a deceased parent; the albums were broken up and various photos of individuals were sent to the individual or a family member of the individual. However, some photos needed to go to more than one person; hence, I wanted to create an album of these remaining photos which could be distributed to multiple family members.

The photos came in a variety of sizes ranging from 8×10 inches to smaller than 3.5×5 inches, many of the sizes being non-standard for today's typical digital printing businesses that serve amateur photographers (these businesses tend to assume 8×10, 5×7, 4×6, and 3.5×5). The photos also came in a variety of conditions from quite good to quite bad (faded or otherwise discolored, or never high quality in the first place). I scanned all of these photos at either 600 pixels per inch or 300 ppi, cropped off the borders on the digital version, and then did lots of Photoshopping to bring as much quality back to the images as I could manage. Because I thought it might be useful, I put the images into six separate directories for 8×10 (the few instances of this only had a vertical orientation), 5×7 (the instances were also all vertically oriented), 4×6 tall orientation, 4×6 wide orientation, 3.5×5 tall orientation, and 3.5×5 wide orientation.

I decided that I wanted the photos in the album I was creating to be the exact size of the originals in inches on the printed page. Consequently, I decided the album's trim size when perfect bound would be 10×11.5 inches. The next step was to figure out how to lay out the photos on printed pages (note 13).

4.1 Photo album: First effort

The first macro I wrote, \image, took two arguments: an image directory/filename and a draft caption (the file name without its directory), and displayed the image with the caption beneath it at the current location. Then I wrote three other macros:

1. \oneperpage, which called \image once and centered the specified image and its caption on a page;
2. \sidebyside, which called \image twice and placed the two images side by side, centered on a page;
3. \overunder, which called \image twice and placed the two images (and their captions) on the page centered horizontally and spaced out equally in the vertical direction.

I wrote a Perl program to generate calls (in alphabetical order by file name) to the three page-layout macros for all the image files in each of the six directories, with the 8×10 and 5×7 images being placed alone on pages, the 4×6 and 3.5×5 tall images placed side by side on a page, and the 4×6 and 3.5×5 wide images placed in an over-under position on the page. With a little manual text editing, the macro definitions and the Perl-generated calls to the page-layout macros became the LaTeX program to generate a first draft album of all of the images.

However, there was a problem. My intention was to print the images at the actual size of the scanned photographs, counting on \includegraphics to read the metadata in the image file to specify the print size. This worked well for most of the images. But for some of the images in the 3.5×5 (tall) directory, the images printed at much too big a size. Rather than sort out the reason, I chose the brute force course of temporarily changing the definition of \image so \includegraphics used a width of 3.5 inches in calls by the \sidebyside macro.

With the printout of this first draft in hand, I could begin to improve the captions and think seriously about layout issues. With actual captions written, many were wider than their image, which didn't look good on horizontal images and which was completely broken on side-by-side images. So I redefined \image, as follows, to measure the width of the image and make the caption be that width:

    \def\image#1#2{
      \centerline{\includegraphics{#1}}
      \smallskip
      \settowidth\imagewidth{\includegraphics{#1}}
      \begin{minipage}[b]{\imagewidth}
      \centering
      \large#2
      \end{minipage}
    }
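The width-measuring idea can also be written so that each image is loaded only once. The following is a minimal sketch, not the definition used for the album: it assumes graphicx is loaded, declares \imagewidth itself, and introduces a hypothetical save box \imagebox. The file name example-image-a is a placeholder that assumes the mwe package's sample images are installed.

```latex
\documentclass{article}
\usepackage{graphicx}

\newlength\imagewidth   % the length the caption is matched to
\newsavebox\imagebox    % hypothetical helper, not from the article

% \image{<file>}{<caption>}: an image with a caption of the same width
\newcommand\image[2]{%
  \sbox\imagebox{\includegraphics{#1}}% typeset and measure the image once
  \setlength\imagewidth{\wd\imagebox}%
  \centerline{\usebox\imagebox}%
  \smallskip
  \centerline{%
    \begin{minipage}[t]{\imagewidth}
      \centering\large #2
    \end{minipage}}%
}

\begin{document}
% placeholder image; substitute a real directory/filename
\image{example-image-a}{A deliberately long draft caption that should
  wrap within the width of the image rather than run past its edges}
\end{document}
```

Saving the image in a box means the width comes from the very material that is printed, and the caption is centered under it instead of starting an indented paragraph.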

Every LATEX document brings new programming issues 266 TUGboat, Volume 35 (2014), No. 3

4.2 Photo album: Next approach

With the new (to me) concept of measuring the image width and adjusting the caption width accordingly, I redid the page layout macros.

The \overunder macro could use the new definition of \image directly as shown in the following definition and example call:

    \def\overunder#1#2{
      \clearpage \vspace*{\fill}
      #1\vfill
      #2\vfill
      \clearpage
    }

called like:

    \overunder
      {\image{<filename1>}{<caption1>}}
      {\image{<filename2>}{<caption2>}}

I redid the \oneperpage and \sidebyside macros to use the \imagewidth approach without calling the \image macro, i.e.,

    \def\oneperpage#1#2{
      \clearpage \vspace*{\fill}
      \centering
      \includegraphics{#1}

      \medskip
      \settowidth\imagewidth{\includegraphics{#1}}%
      \begin{minipage}[b]{\imagewidth}
      \centering
      \large#2
      \end{minipage}
      \vfill
      \clearpage }

    \def\sidebyside#1#2#3#4{
      \clearpage \vspace*{\fill}
      \centerline{\includegraphics{#1}%
        \quad\includegraphics{#3}}

      \settowidth\imagewidth{\includegraphics{#1}%
        \quad\includegraphics{#3}}
      \smallskip
      \begin{minipage}[b]{\imagewidth}
      \centering Left: #2\\Right: #4
      \end{minipage}
      %\centerline{Left: #2; right: #4}
      \vfill
      \clearpage
    }

Notice that by this time I had created inconsistency in how I included images: sometimes

    \macrocall
      {filename1}{caption1}
      {filename2}{caption2}

and sometimes

    \macrocall
      {\macrocall{filename1}{caption1}}
      {\macrocall{filename2}{caption2}}

4.3 Photo album: Final approach

At this point, I concluded I needed to do several things differently:

1. try top justifying side-by-side images rather than bottom justifying them, which is what happened without doing anything special;
2. fix the page layout macros so they called images in a consistent way;
3. make it easy to move around the calls of image-caption pairs in the LaTeX source file.

Regarding the first point above, top justification of side-by-side images, I looked at or tried four different methods, three from a question and answer on tex.stackexchange.com (note 14) and one I made up myself. One of the tex.stackexchange.com suggestions didn't seem quite relevant and I couldn't manage to install and use the suggested packages in the other two suggestions there.

The method I tried to develop myself was to measure the height of the images, find the difference in heights as a positive number, and insert vertical space of that difference under the shorter of the images; unfortunately, I couldn't get the units of the various parts of these calculations to match well enough to make the method work. After several hours of trying things spread over a couple of days, the bottom-justified approach began to look better and better, and I gave up trying for top justification.

It came to me that dealing with the third point above (moving around image-caption pairs) would naturally address the second point (consistent calling sequences).

Regarding the third point above, it seemed to me that the best approach was to separate specifying images and their captions from the page layout macros, that is, to not have the page layout macros call the image-caption specification macros. My idea was to allow something like the following:

    \specifyimage
    \specifyimage
    \layoutpagewithsidebysideimages
    \specifyimage
    \layoutpagewithsingleimage


    \specifyimage
    \specifyimage
    \layoutpagewithoverunderimages

Then I could simply drag the \specifyimage macro calls around to the places I wanted them to be in my LaTeX source file for the album, although I would still have to be aware of what size images could fit within the bounds of a page.

For the \specifyimage macro I developed the following macro (Karl Berry pointed out the \ifcase TeX language construct to me):

    \newcounter{savedphotocount} % define counter
    \setcounter{savedphotocount}{0}% clear counter
    \def\savephoto#1#2{%
      \stepcounter{savedphotocount}%
      \ifcase\value{savedphotocount}
        \errmessage{case zero should never happen}%
      \or \gdef\photoa{#1}\gdef\captiona{#2}% case 1
      \or \gdef\photob{#1}\gdef\captionb{#2}% case 2
      \or \gdef\photoc{#1}\gdef\captionc{#2}% case 3
      \or \gdef\photod{#1}\gdef\captiond{#2}% case 4
      \else \errmessage{more photos than expected!}
      \fi}

This macro saves up to four images in well-known places from which the page layout macros can use them; obviously, this macro could have been extended to save more images between instances of zeroing the savedphotocount counter, which was done at the end of each page layout macro.

Below is an example definition of one of the page layout macros that used \savephoto.

    \def\oneperpage{
      \clearpage \vspace*{\fill}
      \centering
      \includegraphics{\photoa}

      \medskip
      \settowidth\imagewidth
        {\includegraphics{\photoa}}%
      \begin{minipage}[b]{\imagewidth}
      \centering
      \large\captiona
      \end{minipage}
      \vfill
      \clearpage
      \setcounter{savedphotocount}{0}
    }

Other page layout macros I needed to define for the album were:

\overunder: two images centered horizontally, with equal top, between, and below spacing (see the top right example in Fig. 1).

\sidebyside: two images side by side, with the pair centered horizontally, bottom justified, with equal top and bottom spacing (top left example).

\oneovertwo: three images with the top one over the bottom pair, with equal top, between, and bottom spacing among the two rows of images, and the image and image-pair centered horizontally (bottom left example).

\twooverone: reverse of the above.

\twoovertwo: a side-by-side pair over another side-by-side pair with equal top, between, and bottom spacing among the rows, and the rows centered horizontally (bottom right example).

Figure 1: Some examples of page layouts; example images from the actual album are available at walden-family.com/texland/photo-album.pdf

This may not have been the optimal approach, since it required writing a bunch of different page layout macros, but it was very useful in terms of flexibly moving images around within the LaTeX file and experimenting with image ordering and page layouts until a satisfactory overall album layout was determined. If I were going to do lots of such albums, I might have wanted to include more calculations in the macros and let LaTeX figure out how to lay out pages, but I didn't need that for this one case (but I do have a good starting point if I ever want to create something more automatic).
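To show how a two-image layout might consume the saved values, here is a hedged sketch of a \sidebyside macro built on \savephoto. It is not the album's actual definition, which the article does not show: the save box \pairbox is a hypothetical helper, the \savephoto copy is trimmed to two cases to keep the example short, and the example-image-a and example-image-b files assume the mwe package's sample images are installed.

```latex
\documentclass{article}
\usepackage{graphicx}

\newlength\imagewidth
\newsavebox\pairbox            % hypothetical helper, not from the article
\newcounter{savedphotocount}

% Trimmed copy of the article's \savephoto (two cases only)
\def\savephoto#1#2{%
  \stepcounter{savedphotocount}%
  \ifcase\value{savedphotocount}
    \errmessage{case zero should never happen}%
  \or \gdef\photoa{#1}\gdef\captiona{#2}% case 1
  \or \gdef\photob{#1}\gdef\captionb{#2}% case 2
  \else \errmessage{more photos than expected!}%
  \fi}

% Sketch: lay out the two saved photos side by side, bottom justified
\def\sidebyside{%
  \clearpage \vspace*{\fill}
  % both images sit on one baseline, so their bottom edges line up
  \sbox\pairbox{\includegraphics{\photoa}\quad\includegraphics{\photob}}%
  \setlength\imagewidth{\wd\pairbox}%
  \centerline{\usebox\pairbox}
  \smallskip
  \centerline{%
    \begin{minipage}[t]{\imagewidth}
      \centering Left: \captiona\\Right: \captionb
    \end{minipage}}
  \vfill
  \clearpage
  \setcounter{savedphotocount}{0}% ready for the next page
}

\begin{document}
\savephoto{example-image-a}{First placeholder caption}
\savephoto{example-image-b}{Second placeholder caption}
\sidebyside
\end{document}
```

Because the layout macro takes no arguments, the two \savephoto lines can be cut and pasted in front of a different layout macro without any other change, which is the flexibility the section describes.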


4.4 Incorrectly sized JPGs

Through all of the above, I had to maintain some special versions of the macro for the 3.5×5 inch images with the tall orientation that \includegraphics was not displaying at the correct size in the PDF. This mostly resulted in these images being bigger than real size and thus having fewer pixels per inch than the desirable 300 pixel minimum.

Thus, I embarked on trying to figure out why \includegraphics wasn't correctly reading (or processing) the image size metadata from the image files. I could see no reason why not, and I asked on tex.stackexchange.com (note 15) if someone knew how \includegraphics read and processed JPG image metadata. The list tried to help but had no definitive and workable answers.

Eventually, I converted the problem JPGs to be PNGs, and then \includegraphics correctly sized the images. I don't know why this worked for PNGs and not for JPGs (the problem JPGs may have been originally scanned at 600 pixels per inch rather than 300 pixels per inch as was done for most of the images); however, now I have a work-around that I will try immediately if I see this problem again sometime.

4.5 Finishing the album

With the work-around found for the problem noted in the prior subsection, I could now do a final compilation of the PDF for delivery to the print shop. There, although almost all the images were grayscale originally (with only a few color images), I had the print shop print them all in color, which made the old brownish grayscale images look better than they would have using a black-and-white based grayscale.

I made the cover (front, spine, and back) of the photo album with Adobe Illustrator rather than LaTeX. It might have been easier to do the whole layout with a graphical-user-interface (GUI) document layout tool. Then again it might have been more work with a GUI to match the caption widths to the images and to drag whole photo-caption pairs around (rather than dragging calls to \savephoto around in my text editor). Who knows? I have LaTeX, and I don't have the alternative tool.

5 Reflection

In this paper I have discussed three examples of little programming problems that I was led to by my work, and by the fact I was using TeX and had been a programmer by trade. TeX suits me particularly well as I dislike learning the ins-and-outs of user interfaces (especially user interfaces that change with product updates); and I often want something a little different than is built into the user interface of a less programmable tool (or maybe what I want is built into a part of the tool I have not bothered to learn about). With (La)TeX I can get a surprising amount done with the relatively modest amount of (La)TeX programming I know (and packages that "just work" without much study), and over time I have built quite a library of ad hoc (La)TeX tools. The library is not very organized. I am careful to keep the sources for all my LaTeX-based documents, and when I need a tool I try to remember for which document I developed that tool, and then I go and copy it.

Notes

1 Perhaps I am a bad guy, but I lack motivation for developing my little tools into packages that might help others (although I certainly appreciate that others develop packages that help me).
2 Self-publishing: Experiences and opinions, tug.org/TUGboat/tb30-2/tb95walden.pdf
3 While in this note I only discuss non-math use in American English, it provides enough variety of situations and problems to perhaps suggest things to think about when using and typesetting ellipses in another language.
4 Version 3.1, Hartley & Marks, Publishers, Vancouver, BC, 2005
5 I'm looking at the 13th edition and not a later edition.
6 en.wikipedia.org/wiki/Ellipsis
7 www.docstyles.com
8 ctan.org/pkg/ellipsis
9 ctan.org/pkg/lips
10 A Culture of Innovation: Insider Accounts of Computing and Life at BBN, David Walden and Raymond Nickerson, editors, Waterside Publishing, 2011, walden-family.com/bbn/bbn-print2.pdf
11 Karl Berry and David Walden, editors, TeX's 2^5 Anniversary: A Commemorative Collection, TeX Users Group, Portland, OR, 2010, tug.org/store/tug10
12 David Walden and Tom Van Vleck, editors, Compatible Time-Sharing System (1961–1973): Fiftieth Anniversary Commemorative Overview, IEEE Computer Society, Washington, DC, 2011, walden-family.com/ieee/ctss.pdf
13 The reader may be interested in Boris Veytsman's approach for a somewhat similar project: "An output routine for an illustrated book", TUGboat 35:2, pp. 202–204, tug.org/TUGboat/tb35-2/tb110veytsman.pdf.
14 tex.stackexchange.com/questions/101858/make-two-figures-aligned-at-top
15 tex.stackexchange.com/questions/171906/how-does-includegraphics-from-the-graphicx-get-the-size-of-a-jpg-image

⋄ David Walden
  walden-family.com


Glisterings: Lining up

Peter Wilson

  The lines are fallen unto me in pleasant places; yea, I have a goodly heritage.
  The Bible, Psalm 16, v. 6

The aim of this column is to provide odd hints or small pieces of code that might help in solving a problem or two while hopefully not making things worse through any errors of mine. This installment presents some items about lining things up.

1 Ruling off

Pujo wrote that he wanted to create a box with a line at the top and bottom but found that the fancybox package [10] only supplied boxes with all four sides enclosed. Peter Flynn [5] responded with the following, based on fancybox:

    \documentclass{article}
    \usepackage{fancybox,lipsum}
    \newenvironment{ruledbox}{%
      \begin{Sbox}
      \begin{minipage}{\columnwidth}}{%
      \end{minipage}\end{Sbox}%
      \centering\medskip
      \vbox{\hrule height1pt
        \par\medskip
        \TheSbox
        \medskip\hrule height1pt}\par\medskip}
    \begin{document}
    \lipsum[1]
    \begin{ruledbox}
    \lipsum[2]
    \end{ruledbox}
    \lipsum[1]
    \end{document}

Sometime later I wondered if a box was really needed; wouldn't just drawing a couple of rules do as well? I came up with the ruled environment which lets you change the width of the ruled contents.

    \newdimen\narrowsize
    \newenvironment{ruled}[1][0pt]{%
      \par
      \narrowsize\hsize
      \advance\leftskip#1\advance\rightskip#1
      \advance\narrowsize-2\leftskip
      \noindent%
      \rule{\narrowsize}{3pt}\par
    }{%
      \par\noindent
      \rule{\narrowsize}{1pt}
      \par}

The optional length argument to the environment is the distance the left and right margins should be increased, thus temporarily reducing the apparent width of the textblock. The next paragraph is set within \begin{ruled}[1pc] ... \end{ruled}

  The ruled environment produces a result that might be a little too fancy for your taste, in which case change the thickness of the rules.

On the other hand, a box will not break across a page boundary which may be an advantage, but on the whole I think not.

2 Marginal rules

David Arnold posed the following problem on ctt.

  I'd like to adjust my example environment in the code below so that each example is bracketed between two horizontal rules. The first rule should be placed above the example, align with the inner edge of the text and flow to the outer edge of the text, add a couple of spaces in the outer margin, typeset 'You Try It!', then continue to flow to within 1cm of the page edge.

  Similarly, for the rule at the bottom of the example, I'd like to start it at the inner edge of the text, flow into the outer margin, then typeset the square that is flush right within 1cm of the paper edge.

I'm not showing David's code here. Instead, the following is effectively the code that I responded with [9]. Drawing the rules across the textblock is no problem. Also typesetting in the margins can be catered for by using the \rlap and \llap macros, thus avoiding LaTeX getting huffy about overlong lines. The only tedious part of the code is calculating the length of the two rules in the margin area. Hopefully the comments in the code explain sufficiently what is done in this regard.

    \documentclass[twoside]{report}
    \usepackage{lipsum}
    \usepackage{amssymb}

    \newcounter{example}[section]
    \renewcommand{\theexample}{\arabic{example}}

    %% insert lengths
    \newdimen\uwidth
    \newcommand*{\Utryit}{%
      \space\space You Try It!\space}
    \settowidth{\uwidth}{\Utryit}
    \newdimen\sqwidth
    \newcommand*{\Usq}{{\Large$\square$}}
    \settowidth{\sqwidth}{\Usq}


    %% rule length in the oddpage margins =
    %% paperwidth - textwidth - 1cm - 1in
    %% - oddmargin - insert
    % odd page rule lengths
    \newdimen\uxtra % Utryit
    \newdimen\sqxtra % Square
    \uxtra=\paperwidth
    \advance\uxtra-\textwidth
    \advance\uxtra-1cm
    \advance\uxtra-1in
    \advance\uxtra-\oddsidemargin
    \sqxtra=\uxtra
    \advance\uxtra-\uwidth
    \advance\sqxtra-\sqwidth

    %% rule length in the evenpage margins =
    %% 1in + evenmargin - 1cm - insert
    \newdimen\uxtrav % Utryit
    \newdimen\sqxtrav % Square
    \uxtrav=\evensidemargin
    \advance\uxtrav 1in
    \advance\uxtrav-1cm
    \sqxtrav=\uxtrav
    \advance\uxtrav-\uwidth
    \advance\sqxtrav-\sqwidth

    \makeatletter
    \newenvironment{example}{%
      \medskip\refstepcounter{example}%
      \ifodd\c@page% odd page
        \noindent\rule{\hsize}{3pt}%
        \rlap{\Utryit\rule{\uxtra}{3pt}}
      \else
        \noindent\llap{\rule{\uxtrav}{3pt}\Utryit}%
        \rule{\hsize}{3pt}
      \fi
      \par\noindent\textbf{Example \theexample.}}%
    {%
      \ifodd\c@page
        \par\noindent\rule{\hsize}{1pt}%
        \rlap{\rule{\sqxtra}{1pt}
          \Usq}
      \else
        \par\noindent\llap{\Usq\rule{\sqxtrav}{1pt}}%
        \rule{\hsize}{1pt}
      \fi
      \par\medskip}
    \makeatother

    \begin{document}
    \lipsum[1]
    \begin{example}
    \marginpar{Simplify: $33+28$}
    \lipsum[2]
    \end{example}
    \lipsum[1]
    \end{document}

The code just shown is intended for use in single column documents, and as TUGboat uses two columns it will not work here (account must be taken of which column the example is in). Extending it to cater for two columns is left as an exercise.

3 Preventing an awkward page break

Szabolcs Horvát requested help on ctt:

  I would like to have an environment that starts and ends with a horizontal line (\hrule), with text in smaller type in between. The text may run several pages long. How can it be prevented that the page be broken right after the first \hrule or right before the last one?

As so often happens Donald Arseneau came up with an answer [1].

  The answer to the question is easy: insert \par and \nobreak and \@nobreaktrue.

  The tricky problem is getting the spacing right! \hrule causes normal \baselineskip to be omitted, but \rule takes a full baseline which leaves too much whitespace. Try this:

    \makeatletter
    \newenvironment{aside}{%
      \list{}{\leftmargin 5ex
        \rightmargin\leftmargin}
      \vtop{\hrule width\columnwidth}%
      \nobreak\@nobreaktrue
      \vspace{0.5ex}%
      \item\relax\small
    }{%
      \par\nobreak\@nobreaktrue
      \advance\baselineskip -0.7ex
      \vtop{\hrule width\columnwidth}%
      \endlist}
    \makeatother

I tried the aside environment and it worked even better than requested as it kept a rule and at least two lines of text together.

I haven't tried to combine the aside and ruled environments, which I leave as an interesting exercise.

From a slightly different viewpoint, Nick Urbanik posted to ctt that he wanted to keep a list of items always on a single page [8]. In particular, to keep a question and its suggested answers together, where there was a list of questions each with its list of answers. There were some six respondents to Nick's request for help but the discussion for some reason veered from the enumitem package to the titlesec package that had no relevance to the initial posting. Donald Arseneau again provided a simple solution to the original problem [3], resulting in questions and answers for a possible accountant's interview being coded like:


    \textbf{Accountancy test}
    \begin{questions}
    \Qitem What is $2+2$?
      \begin{enumerate}
      \item 3
      \item 4
      \item Whatever you want it to be.
      \end{enumerate}
    \Qitem What is the essence of double-entry
      bookkeeping?
      \begin{enumerate}
      \item Each transaction recorded twice,
        in the credit and debit ledgers.
      \item Two sets of books, the real ones and
        the ones for the tax inspectors.
      \item Don't know.
      \end{enumerate}
    \Qitem What ...
    \end{questions}

When processed this will result in a question and all its potential answers being kept together on a page and, depending on their length, there may be several sets of questions and answers on a page.

  Accountancy test
  1. What is 2 + 2?
     (a) 3
     (b) 4
     (c) Whatever you want it to be.
  2. What is the essence of double-entry bookkeeping?
     (a) Each transaction recorded twice, in both the credit and debit ledgers.
     (b) Two sets of books, the real ones and the ones for the tax inspectors.
     (c) Don't know.
  3. What ...

Donald's method to make this happen is:

    \newcommand*{\Qitem}{\pagebreak[0]\item}
    \newenvironment{questions}%
      {\enumerate\samepage}%
      {\endenumerate}

which says that the questions environment should be all on one page except that a pagebreak is allowed just before a question's \Qitem.

4 Not at a page break

Sometimes it may be desirable to have a divisional marker of some kind disappear at a page break. I have forgotten the details but someone once had a supplement (with a title such as 'Notes') at the end of each chapter in the document and wanted to have a rule before the supplement unless the supplement started a new page.

A TeX leader is not a permissible breakpoint and may vanish at a page break and so provides a potential means of meeting such a requirement. Just before this paragraph I specified:

    \newskip\rulebreakskip
    \rulebreakskip=\baselineskip
    \newcommand*{\filler}{\hbox to \hsize{%
      \hss \rule{0.7\hsize}{1pt} \hss}\vskip 1pt}
    \newcommand*{\rulebreak}{%
      \vskip\rulebreakskip
      \cleaders\filler
      \vskip\rulebreakskip}
    \rulebreak

which resulted in either a centered rule or, if at the bottom of the column, nothing.

  * * *

Just before this paragraph I specified:

    \renewcommand*{\filler}{%
      \hbox to \hsize{\hss * * * \hss}}
    \rulebreak

which resulted in either three centered asterisks or, if at the bottom of the column, nothing.

You can put different elements in the \filler box, such as an \asterism or a moustachio but you might have to adjust the value of \rulebreakskip for the best optical effect.

5 Line backing

'talazem' presented ctt with a problem that has never been completely solved in LaTeX, namely typesetting to a grid. TeX was not designed with this in mind. Slightly edited, his presentation was:

  I am typesetting a book in Memoir and want to ensure that the lines register well to avoid shine through. The book is mainly in English with a font size 10/12.5. However there are some paragraphs that are causing alignment problems.

  There are some paragraphs that have to be set to a 0.8 ratio of the primary face with a 0.5 ratio of line spacing. There are others in a non-English typeface where the font is about 1.4 times bigger than the Roman font for the English text.

  Paragraphs of this kind throw off the alignment of text lines on adjacent pages, causing shine through on the recto and verso sides of a page.

The basic requirement here is that these irregular paragraphs should take up a space that is an integral number of the normal \baselineskip.
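The requirement can be written as a little arithmetic. The following is an illustration only: the 85pt block height is a made-up value (only the 12.5pt leading comes from the post), and splitting the extra space equally above and below the block is one possible choice, assumed here for the example.

```latex
% b: the normal \baselineskip
% h: natural vertical extent of the irregular block (hypothetical value)
\[
  n = \left\lceil \frac{h}{b} \right\rceil, \qquad
  e = n\,b - h, \qquad
  \mbox{space above} = \mbox{space below} = \frac{e}{2}.
\]
% Worked example: b = 12.5pt, h = 85pt (say, five lines on 17pt leading)
\[
  n = \left\lceil \frac{85}{12.5} \right\rceil = 7, \qquad
  e = 7 \times 12.5\,\mathrm{pt} - 85\,\mathrm{pt} = 2.5\,\mathrm{pt}, \qquad
  \frac{e}{2} = 1.25\,\mathrm{pt}.
\]
```

With these numbers the block is padded to seven normal lines, so the text that follows it falls back onto the grid.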


The one potential solution provided came from an exchange of views between Donald Arseneau and Dan Luecking [4], as follows, where the environment will occupy an integral number of the normal lines.

    \makeatletter
    \@ifundefined{@tempdimc}{\newdimen\@tempdimc}{}
    \newenvironment{gridblock}{\par
      \setbox\@tempboxa\vtop\bgroup
    }{\par\egroup
      % measurement of top
      \@tempdima=\ht\@tempboxa
      \@tempdimc=\dp\@tempboxa
      \ifdim\@tempdima>\ht\strutbox
        \advance\@tempdimc\@tempdima
        \@tempdima=\ht\strutbox
        % \@tempdima is the top height.
        \advance\@tempdimc-\@tempdima
      \fi
      % measurement of bottom
      \setbox\@tempboxa\vbox{\unvbox\@tempboxa}%
      \ifdim\dp\@tempboxa>\dp\strutbox
        \@tempdimb=\dp\strutbox
      \else
        \@tempdimb=\dp\@tempboxa
      \fi
      % \@tempdimb is the bottom depth.
      \advance\@tempdimc-\@tempdimb
      % \@tempdimc is distance between the top
      % and bottom baselines.
      % The excess, \@tempcnta, is the number
      % of baselines.
      \@tempcnta=\@tempdimc
      \divide\@tempcnta\baselineskip
      \advance\@tempdimc -\@tempcnta\baselineskip
      \ifdim\@tempdimc >2\vfuzz
        \advance\@tempdimc-\baselineskip \fi
      \divide\@tempdimc\tw@
      \vbox to\@tempdima{}%
      \nobreak \nointerlineskip
      \kern-\@tempdima \kern-\@tempdimc \nobreak
      \box\@tempboxa
      \nobreak \nointerlineskip
      \kern-\@tempdimb \kern-\@tempdimc \nobreak
      \hbox{\vrule height \z@ width \z@
        depth \@tempdimb}}
    \makeatother

The gridblock environment doesn't cater for footnotes, floats, or really anything other than plain text. It certainly does not handle page breaks.

  This is \tiny text in the gridblock environment. I'm not sure how well the effect will be demonstrated as the adjacent column may, or may not, be evenly spaced vertically.

Did that work out? Are the lines in this paragraph aligned with those on the adjacent columns, or pages? If not it may be because the adjacent columns are not set on a grid. Incidentally, the relatively recent package grid may be of interest, though it is not a complete solution either.

6 Linespacing

Pander wrote [7]:

  I have some questions on line spacing (leading) that should respect font size. It mainly concerns non-uniform line spacing that doesn't reserve space for ascenders and descenders and line spacing that is too big or too small for small and large font sizes.

  Please see the following TeX [code] for the exact questions. I know this is tricky in TeX, but have to ask any way.

    \noindent
    {\tiny aeou\\aeou\\}%too much leading
    {\normalsize aeou\\aeou\\}
    {\Huge aeou\\aeou\\}%not enough leading
    {\tiny gpqy\\gpqy\\}%too much leading
    {\normalsize gpqy\\gpqy\\}
    {\Huge gpqy\\gpqy\\}%no space for descenders
    {\tiny bdfhkl\\bdfhkl\\}%too much leading
    {\normalsize bdfhkl\\bdfhkl\\}
    {\Huge bdfhkl\\bdfhkl\\}%no space for ascenders
    {\tiny gpqybdfhkl\\gpqybdfhkl\\}%too much leading
    {\normalsize gpqybdfhkl\\gpqybdfhkl\\}
    {\Huge gpqybdfhkl\\gpqybdfhkl\\}

The result of processing this is shown in the left side of Figure 1. Pander also noted a similar problem when using different fonts in a tabular.

There were several respondents all of whom noted that Pander's example consisted of a single paragraph within which the several font size changes were closed within groups. Further, that TeX takes the font size in effect at the end of a paragraph as applying throughout the paragraph, and hence that the leading is constant.

Donald Arseneau [2] replied with:

  Set \baselineskip=0pt or some small value
  Set \lineskip=\lineskiplimit= desired space

    \baselineskip=8pt
    \lineskip=4pt
    \lineskiplimit=\lineskip

  Then be aware that font-change commands reset \baselineskip, so that font changes that span the end of a paragraph will go back to some larger \baselineskip.

  In tabular put \strut in with all your variant fonts.

Roughly speaking, the normal spacing between the baselines of text is \baselineskip but if the 'top' of a line is closer than \lineskiplimit to the bottom of the previous line then the spacing will be increased so that the top to bottom space is \lineskip [6, Ch. 12]. The results of applying Donald's settings are shown at the right of Figure 1.
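As a hedged sketch of how these settings might be tried out, the small document below applies the three values from Donald's reply to a shortened version of Pander's paragraph. Placing the assignments after the font-size groups and just before \par is my choice for the example, not something stated in the posts; the point is only that the values in force when the paragraph ends are the ones used for the whole paragraph.

```latex
\documentclass{article}
\begin{document}
\noindent
{\tiny aeou\\aeou\\}%
{\normalsize gpqy\\gpqy\\}%
{\Huge bdfhkl\\bdfhkl}%
% Each size command changed \baselineskip only inside its own group.
% The values below are set outside the groups, so they are the ones
% in effect when the paragraph is broken into lines.
\baselineskip=8pt
\lineskip=4pt
\lineskiplimit=\lineskip
\par
\end{document}
```

With a small \baselineskip, lines in the larger sizes would come closer than \lineskiplimit, so the gap between them is set by \lineskip rather than by the baseline distance.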



Figure 1: Different font sizes in a paragraph: (left) Pander’s problem; (right) following Donald Arseneau

References

[1] Donald Arseneau. Re: Preventing page breaks at certain positions. Post to comp.text.tex newsgroup, 17 November 2009.
[2] Donald Arseneau. Re: Line spacing respecting space for ascenders / descenders and fontsize. Post to comp.text.tex newsgroup, 11 April 2011.
[3] Donald Arseneau. Re: List items always on the same page. Post to comp.text.tex newsgroup, 15 March 2011.
[4] Donald Arseneau and Dan Luecking. Re: vertical height of boxes by multiple of baselineskip. Post to comp.text.tex newsgroup, 9–10 December 2009.
[5] Peter Flynn. Re: Fancybox alternatives. Post to comp.text.tex newsgroup, 11 September 2009.
[6] Donald E. Knuth. The TeXbook. Addison-Wesley, 1984. ISBN 0-201-13448-9.
[7] Pander. Line spacing respecting space for ascenders / descenders and fontsize. Post to comp.text.tex newsgroup, 11 April 2011.
[8] Nick Urbanik. List items always on the same page. Post to comp.text.tex newsgroup, 13 March 2011.
[9] Peter Wilson. Re: Marginpar in memoir. Post to comp.text.tex newsgroup, 27 December 2009.
[10] Timothy Van Zandt. fancybox.sty: Box tips and tricks for LaTeX, September 2000. http://ctan.org/pkg/fancybox.

⋄ Peter Wilson
  12 Sovereign Close
  Kenilworth, CV8 1SQ
  UK
  herries dot press (at) earthlink dot net


CTAN goes multi-lingual: Additional language support for the Web portal

Gerd Neugebauer

Abstract

TeX is used for many languages: support in the TeX engines and macro packages is abundant, but the CTAN portal has been available in English only. Now we are conducting an experiment to provide the Web presentation in German as well.

1 Introduction

Modern Web frameworks are equipped with means for internationalizing the presentation. We are currently attempting to make use of these means for the CTAN portal, https://www.ctan.org.

Figure 1: The cover page in German

Whenever you visit the portal, the language is automatically selected for you. This magic happens in a negotiation between the browser and the server in the background. In the browser you can configure your preferred languages (cf. figures 2–4).

Figure 2: Language configuration in Firefox

Figure 3: Language configuration in Chrome

Figure 4: Language configuration in IE

When you request a page, the browser passes these language preferences to the server, where the list is compared with the list of supported languages. The best fit (or the default fallback) is then chosen.

Currently, the CTAN portal supports English and German. Thus it is usually sufficient for you to navigate to the CTAN portal to see your preferred language from these alternatives.

You can also explicitly choose the language on the settings page by clicking on the appropriate flag (figure 5). This selection is valid only for the current session. When you return later, this setting is forgotten.

2 Features

Internationalization includes the localized presentation of several types of content:

• The page frame which includes header and footer as well as the static parts of the dynamic pages
• The page content for the "static" pages
• The dynamic content from the TeX archive


• The dynamic content from the Catalogue
• The dynamic content from the search index

Figure 5: The language selection on the settings page

For the current internationalization experiment, only the frame and some of the static pages are offered in German, as a starting point. It should primarily help to make it more comfortable for visitors in their first steps.

Some pages have intentionally not been translated. For instance, the upload page (https://www.ctan.org/upload) is provided in English only since the communication language with the CTAN team is English. Thus, an uploader should not be encouraged to try another language since this might not be understood on our side.

3 Limitations

The major limitations are derived from missing data in the back-end. The files in the TeX archive are provided by the numerous authors. The language is whatever the author provides, for instance in the package documentation.

The TeX catalogue is mainly in English. It is partially prepared to carry texts in different languages. For instance the descriptions are to a small degree already present in languages other than English. The CTAN team is presently not able to provide translations, due to limited resources and skills.

If no appropriate localized text is available the English version is used. This can be seen in figure 1. Here the text for the topic teaser is in English while the other parts of the page are in German.

Another limitation applies to the search. The search is currently language-agnostic. It should in the future be based on the language settings, in that entries with the proper language should be ranked higher than non-matching entries. Maybe entries for languages not understood by the user should be suppressed completely. But this already belongs in the next section on "visions".

4 Visions

In the future we could consider developing the internationalization of the CTAN portal in several directions. First, of course there are more languages to be supported. Right now the language-specific texts are contained in 31 files. These files are either mapping files which map keys to language-specific texts, or complete pages which are present in separate incarnations.

Supporting a new language is initially a matter of providing these 31 files and adding a configuration option for the new language. However, this isn't enough. The portal is not static. The content changes over time. Thus we would need a commitment that we have volunteers for the newly supported language to guarantee continuity.

For the support of even more languages the existing administration interface could be extended. Then maintenance of the different languages could be performed via a Web interface by different persons.

The other side has already been mentioned. The TeX catalogue has to be augmented with new language-specific texts. The same applies here. The support has to be guaranteed for future changes. Thus volunteers for a long-term engagement would be required.

We must also reconsider the processes in the CTAN team. They are currently not designed for parallel processing of a change for a single package. These processes work well for the small team which is currently active. The more languages that are supported, the more people have to work on one change. Unfortunately this seems to be beyond the capabilities of the CTAN team right now.

5 Epilogue

I hope that the experiment with the German language for the CTAN portal succeeds. Feedback in any direction is welcome. For discussions, please consider using the mailing list [email protected]. You can subscribe via https://lists.dante.de/mailman/listinfo/ctanweb.

Keep on TeXing, in many languages.

⋄ Gerd Neugebauer
  Im Lerchelsböhl 5
  64521 Groß-Gerau (Germany)
  [email protected]
  www.gerd-neugebauer.de


Obyknovennaya Novaya (Ordinary New Face) in METAFONT

Basil Solomykov

The Obyknovennaya Novaya (Ordinary New Face) typeface was widely used in the USSR for scientific and technical publications, as well as textbooks. My “Obyknovennaya Novaya” is a revival of that typeface, and though it is not the first one, I believe it is the most complete. The Obyknovennaya Novaya family currently includes regular, bold, italic, bold italic, slanted and small capitals shapes. Obyknovennaya Novaya is free software, available under the terms of the LPPL. The story of the METAFONT version of this font follows . . .

In the beginning of 2008 I was a student and needed to choose the theme for my qualifying work. My scientific supervisor was Vladimir Lidovski, and he advised me to learn METAFONT and make the font Obyknovennaya Novaya in METAFONT, to expand the variety of available Cyrillic fonts in TeX. I began to read Donald Knuth’s The METAFONTbook and made my first steps in drawing and font making. After approximately a month, I made the first letter; at that moment it was a big success for me. Over the next month I learned about the main parameters, such as stems, curves, bars and others. The “army” of my letters was growing, and it inspired me to continue my work. By the time the number of letters reached 50, I had learned about some new typographic features.

I read Knuth’s Volume E of Computers & Typesetting, which contains precise definitions of about 500 letters, numerals, and other symbols of the Computer Modern typefaces, all described with METAFONT. I realized that my letters had different parameters and that each letter was described in its own file, for example LetterA.mf, without any unification. So I decided to combine them by making one file for all letters of one size, and began a large amount of work to restructure my font.

By the end of May 2008 I had 66 Cyrillic letters (33 capitals and 33 small), 52 Latin letters, numerals and some punctuation marks, all in one shape (regular), at several point sizes: 7, 10, 12, 17 pt. I met with Alexander Shen, who supported me and pointed out new directions to improve my typeface. I successfully graduated from the university and then it was time to decide what to do next with my project.

At that time the development of the Obyknovennaya Novaya typeface became supported by the TUG development fund, so I began the long journey of making a high-quality font.

There were big plans for the future: to make italic, bold, bold italic and slanted shapes, and to work on rounding errors, kerning and other details. Although I was no longer officially his student, Vladimir Lidovski continued testing my font, made suggestions and gave advice. It was partly this support that kept me from giving everything up. During this time I also met Alexander Tarbeev, who gave me some valuable advice regarding the overall design and the relationship between the different parameters of the typeface.

By the spring of 2011 I had made regular, bold, italic, bold italic, slanted and small capital shapes. Some work to optimize font rendering was done, and new kerning pairs had been added.

At that point, I stopped working with the font until 2014, and began to learn about other font formats, such as TrueType, OpenType and PostScript Type 1.

I read a lot about font smoothing and rendering on digital devices. Beat Stamm’s site (http://www.rastertragedy.com) was especially useful for me; it gives a detailed description of how fonts are rendered on the screen. I am trying to understand how to apply this method in METAFONT, and hope to implement it in the future.

In general the work on this font has been a fruitful experience for me. It is my first contribution to CTAN and the LaTeX community. I hope the font will be useful and I am glad to be able to help people. I also hope that the publication of my font will bring more feedback and suggestions from experienced TeX users on how to improve it.

⋄ Basil Solomykov
bs44550 (at) gmail dot com
http://ctan.org/pkg/obnov

Normal shape, (10pt): АБВГД...ЭЮЯ абвгд...ыьэюя 0123456789 ABCD...WXYZ abcd...wxyz ?@&*є
Bold shape, (10pt): АБВГД...ЭЮЯ абвгд...ыьэюя 0123456789 ABCD...WXYZ abcd...wxyz ?@&*є
Italic shape, (10pt): АБВГД...ЭЮЯ абвгд...ыьэюя 0123456789 ABCD...WXYZ abcd...wxyz ?@&*є
Bold italic, (10pt): АБВГД...ЭЮЯ абвгд...ыьэюя 123456789 ABCD...XYZ abcd...wxyz ?@&*є
Slanted shape, (10pt): АБВГД...ЭЮЯ абвгд...ыьэюя 0123456789 ABCD...WXYZ abcd...wxyz ?@&*є
Small Capitals, (10pt): АБВГД...ЭЮЯ абвгд...ьэюя 0123456789 ABCD...WXYZ abcd...wxyz ?@&*є


A simple Arabic typesetting system for mixed Latin/Arabic documents: ḍād

Yannis Haralambous

Abstract

We describe ḍād, a package allowing simple typesetting in Arabic script, intended for mixed Latin-Arabic script usage, in situations where heavy-duty solutions are discouraged. The ḍād system operates with both Unicode and transliterated input, allowing the user to choose the most appropriate approach.

1 Introduction

As with many TeX projects, this one was started to fulfill an immediate need: the author was writing a paper for an Arabic language Natural Language Processing conference [7] and hence was in need of a straightforward way to introduce Arabic text into his document. In this context, “straightforward” can be subdivided into the following five requirements:

1. it should be compatible both with the IEEE LaTeX style [19] (required by the conference) and with the WriteLaTeX platform [3], of which the author is an enthusiastic user;
2. it should allow user-friendly and robust input of Arabic text, including when placed inside TeX command arguments;
3. it should typeset in an optimal way all combinations of letters and diacritics that may appear in scholarly text;
4. it should provide some extra features: being able to easily change the letter form, as well as to colorize specific letters without breaking contextual analysis;
5. the font should be easily readable in a context of mixed Latin-Arabic script.

In the following discussion we will see why existing systems did not fulfill the requirements, and how the author solved the problem.

1.1 Existing systems and their pitfalls

As we live in the 21st century, an obvious choice for typesetting Arabic in TeX is XeTeX [12], a TeX avatar (in)famous for typesetting in non-Latin scripts by taking advantage of resources.

Requirement 1. XeTeX indeed is provided on the WriteLaTeX platform. But the LaTeX overhead for typesetting Arabic in XeTeX is quite heavy (packages fontspec, xunicode, arabxetex, etc.) and hence, not surprisingly, XeTeX is incompatible with the IEEE style.

Requirement 2. XeTeX is unable to use a transliteration system and hence requires the Arabic text to be input in Arabic script and only in Arabic script. Using a transliteration to input Arabic may seem terribly old-fashioned to the reader, but there are cases where it is the best solution. One of these cases is the context of this paper: a mixture of Latin script, Arabic script, and TeX commands.

Indeed, at the GUI level, there are at least two drawbacks in combining Arabic and Latin script in the same paragraph:

1. the use of the cursor and of left and right arrow keys is very cumbersome: when you select a location with the cursor you don’t know whether you are in right-to-left or left-to-right mode and hence you don’t know in which direction to advance, or how to select a given character string;
2. the situation is made even worse by the fact that some punctuation marks (period, exclamation mark, dashes, parentheses, braces, brackets, etc.) are common to the two scripts and hence the (quite sophisticated) bidirectional algorithm is used to determine whether a punctuation mark is to be placed on the left or on the right of an Arabic word. The bidirectional algorithm (or ‘bidi’ for the insiders) is both a blessing and a curse. It is a blessing because it puts some order in the rendering of mixed right-to-left and left-to-right texts (cf. [6, p. 133–146]), but this works well only if RLE and LRE are used to indicate the embedding level. Otherwise, in everyday use, bidi is a curse. Fig. 1 shows how the TeX code for writing the word كتب with the middle letter colorized in red is displayed by various programs under various operating systems; not a single one of them really makes sense.

Requirement 4. Sophisticated OpenType fonts [6, § D.9.4] handle relatively well the many letter + diacritic combinations, and most systems (including XeTeX) can colorize word parts without breaking contextual analysis through the use of the zero-width joiner character [6, p. 104]. But not a single system is able to colorize single letters inside the lām-alif ligature لا, since this has always been considered as a single glyph by font designers.

Requirement 5. The best way to match Latin and Arabic script is to choose an Arabic font with relatively small differences in height between letters. A quite common choice is the font Geezah by Diwan Software Ltd (developed for Apple WorldScript in the early nineties, and still included in Mac OS X, through today). Geezah is a nice font but its diacritics are placed rather suboptimally, and modifying their positions requires a high amount of competence in fiddling around with OpenType features.



Figure 1: To obtain كتب with the letter ت in red, the author has typed the Unicode string “kāf, zwj (zero-width joiner), \red{, zwj, tāʾ, zwj, }, zwj, bāʾ”. The figure shows the result in the following environments: (a) Chrome under Mac OS X, (b) Safari under Mac OS X, (c) Firefox under Mac OS X, (d) Internet Explorer under Windows 8, (e) Iceweasel under Debian, (f) Mellel under Mac OS X, (g) Nisus under Mac OS X, (h) BBEdit under Mac OS X. The reader can see the variety of contextual forms displayed in these examples. It is obviously not straightforward to understand the meaning of the code, and using the cursor to edit it is no less than a Lovecraftian nightmare. In transliteration this code snippet is simply written k-\textcolor{red}{-t-}-b.
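The transliterated snippet from the caption can be placed in a complete document. The following is a minimal, untested sketch: the \arab macro, the dad package name and the LuaTeX requirement come from the article, while loading xcolor to supply \textcolor is an assumption (the package may already provide a color mechanism of its own).

```latex
% Compile with LuaLaTeX: the article states that the package requires LuaTeX.
\documentclass{article}
\usepackage{xcolor} % assumption: provides \textcolor; possibly redundant
\usepackage{dad}
\begin{document}
% The dashes act as joiners: k stays in initial form, t in medial form and
% b in final form, although the \textcolor group splits the word in three.
\arab{k-\textcolor{red}{-t-}-b}
\end{document}
```

Because the input is plain ASCII, the cursor and selection problems described for the Unicode version do not arise in the editor.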

1.2 The long and winding road towards a solution

Obviously, Requirement 1 and WriteLaTeX compatibility ruled out Ω [8, 9, 10], despite its powerful machinery for defining transliterations and applying contextual analysis. XeTeX was ruled out since it satisfies none of the requirements (1), (2), (4) or (5). The two remaining choices are pdf(e)TeX [2] (with bidirectional typesetting support) and LuaTeX [11].

Let us make a two-decade jump back in time. On April 12, 1993, the author organized a workshop entitled “TeX and the Arabic Script”, at the National Institute of Oriental Languages and Civilizations (INALCO) in Paris. One of his contributions to this workshop was a public domain Arabic TeX font in Geezah design, in which the contextual analysis was entirely done via TeX’s smart ligatures and boundary characters [13]. This is indeed the simplest approach: no extra technology was required other than TeX--XeT bidirectional typesetting and TeX 3 smart ligatures. Nevertheless that font had a serious pitfall: because the number of glyphs was limited to 256, it was impossible to handle automatically quadriform letters followed by a vowel. In that case, it was necessary to introduce a vertical bar between the letter and the vowel, so as to obtain its final (or isolated) form (as, for example, in كتابٌ = ktAbuN, which had to be written ktAb|uN). The proceedings of this workshop were planned to be published in the Cahiers GUTenberg, but this never happened, which is a pity since among the speakers was also the legendary creator of the Moroccan simplified Arabic writing system [15, 17], Ahmed Lakhdar-Ghazal († 2008).

As building a font based entirely on smart ligatures was an interesting approach, the author tried to investigate ways of succeeding now where he had failed in 1993, by the use of modern technologies.

A first idea was to create ligatures between letters and digits 0–3 to obtain contextual forms of the letters, and to have some other mechanism insert the digits. Indeed, LuaTeX provides callbacks for both the token and the node list, which would be natural choices for such a mechanism. Traversal of the node list by LuaTeX is quite efficient, but unfortunately unsuitable for this particular project since at that stage, ligatures have already been applied, so it is too late to insert digits among the nodes. After some attempts, the token list callback was abandoned since it is still in a very rudimentary state. Also, unfortunately, LuaTeX has not yet implemented Lua patterns, which are a kind of regular expressions, and would make programming the contextual analysis much easier.

A second idea was to use large virtual fonts to encode all letter + diacritic combinations, so that the 1993 approach would be applied not to letters alone, but to letter + diacritic combinations. But the attempt to generalize the 1993 approach to large virtual fonts (OVF) has failed as well, because OVFs do not provide support for Knuth’s boundary character (indeed, in Ω, ΩTPs provide a much more elegant solution to the word boundary problem, and therefore the boundary character was left out of the Ω system, but again LuaTeX does not provide ΩTPs). In this approach, the default form of a letter would be the medial one, and boundary characters would turn a letter into initial, final or isolated form.

The third attempt proved successful: instead of using boundary characters, the author used smart


ligatures between characters. In this case, the default letter form is the isolated one. The presence of a second letter changes the form of the first from isolated to initial, and the one of the second from isolated to final. A third letter will turn the second letter into middle form and the third letter into final (provided, of course, the second letter is quadriform), and so on. More details are given in § 4.

2 The name

The next question was what to name the package. Thanks to the Internet, search engines, social media, and the like, people are becoming more and more aware of other languages and writing systems. Why not give this package an Arabic name, be it a single letter?

The author has chosen the letter ض, called ḍād, because Arabic is traditionally called the “language of the ḍād”, since this sound was historically considered as being unique to Arabic.

The reader is probably wondering how to pronounce this letter, technically a “voiced velarized alveolar stop” [18, p. 16]. Here is how [20, p. 10] describes its pronunciation:

Pronounce the regular sound ‘d’ and you will find that the tip of your tongue will touch in the region of the upper front teeth/gum. Now pronounce the sound again and at the same time depress the middle of the tongue. This has the effect of creating a larger space between the tongue and the roof of the mouth and gives the sound produced a distinctive ‘hollow’ characteristic, which also affects the surrounding vowels. It is difficult to find a parallel in English, but the difference between ‘Sam’ and ‘psalm’ (standard English pronunciation) gives a clue. Tense the tongue muscles in pronouncing ‘psalm’ and you are nearly there. Now pronounce the a-vowel of ‘psalm’ before and after ‘ḍ’, saying ‘aḍa’, keeping the tongue tense, and that’s as near as we can get to describing it in print.

3 How to use ḍād

The package provides three PostScript Type 1 fonts (plain, bold and typewriter), “real” fonts (regular TFM) and large virtual fonts (OVF and OFM files). There are also rudimentary FD and STY files, a MAP file, Perl scripts for conversion to (and from) UTF-8, the Perl script which builds the font and finally adjustment files, in case the user wants to change kerning and diacritic placement.

It requires LuaTeX for change of direction and OVF/OFM compliance.

To typeset in Arabic, one need only load the dad package and use the macro \arab, which is a \long macro: its argument may have multiple paragraphs.

Arabic text can be input in transliteration as described in Table 1 or in UTF-8. To obtain, for example, الكِتاب one would write \arab{AlkitAb} or \arab{الكِتاب}. By writing \arabtt{AlkitAb} one obtains the typewriter version of الكِتاب (which is less appealing, but fits quite nicely with the Computer Modern Typewriter font).

3.1 Rationale of the transliteration

Here are the rules of the proposed transliteration (see Table 1):

1. pharyngeal ح = H, emphatic ص = S, ض = D, ط = T, ظ = Z and velar غ = R are uppercased; do not confuse them with glottal ه = h, non-emphatic س = s, د = d, ت = t, ز = z, and alveolar ر = r;
2. long vowels (ا = A, و = U, ي = Y) and alif maqṣūra (ى = I) are also uppercased;
3. some consonants are modified by adding a character h (ذ = dh, ث = th, ش = sh);
4. the stand-alone hamza is obtained by a vertical bar | and letter ʿayn by a grave accent (which, in legacy TeX, produces an inverted curly apostrophe, which is sometimes used to transliterate this letter);
5. to avoid confusion between pairs of letters and letters obtained by digraphs, one has to use a dash to separate characters: compare سه = s-h and ش = sh, or ته = t-h and ث = th;
6. more generally, the dash plays the rôle of zero-width joiner (see footnote 1): when writing ـب = -b, the letter bāʾ will be in final form; بـ = b- and ـبـ = -b- will produce initial and middle letters, provided of course the letter is quadriform (as is letter bāʾ in this example). This is very useful when describing grammar rules, to signify that a letter (or letter group) is an affix;
7. the dash can also be used to reestablish contextual forms when combined with TeX commands, for example, to colorize letters as in Fig. 1. There is only one special case: when we want to colorize a letter of an isolated ligature لا, we add a digit 4 in front of the dash. For the final ligature ـلا it will be a digit 5. Example: to colorize the lāms of تلالا, write
\arab{t-\textcolor{red}{-l5-}-A5%
\textcolor{red}{l4-}-A4}

Footnote 1: Except for the case of letter ذ = dh which is biform and hence is not connected with the following letter. By writing ده = d-h one obtains letters dāl and hāʾ, but the hāʾ is not in medial form, as it would be in any other case when preceded by a dash.


Table 1: Transliteration of the ḍād system (each Arabic glyph is followed by its transliteration)

ء |    آ 'A    أ 'a    ؤ 'u    إ 'i    ئ 'I    ا A    ب b    ة t*    ت t    ث th    ج j

ح H    خ x    د d    ذ dh    ر r    ز z    س s    ش sh    ص S    ض D    ط T    ظ Z    ع `    غ R    ف f    ق q    ك k    ل l

م m    ن n    ه h    و U    ى I    ي Y

ٱ A*    ◌ْ o    ◌َ a    ◌ِ i    ◌ُ u    ◌ً aN    ◌ٍ iN    ◌ٌ uN    ◌ّ +    ◌َّ +a    ◌ِّ +i    ◌ُّ +u    ◌ًّ +aN    ◌ٍّ +iN    ◌ٌّ +uN    ◌ٰ a*    ◌ّٰ +a*    ﷲ LLh

p p    g g    C C    J J    e e    v v

ٮ 'b    ں 'n    ڡ 'f    ٯ 'q    ◌ٓ a**    ◌ّٓ +a**

8. finally, there is yet another use of the dash: when doubled, it produces a kashida stroke: compare ليل = lYl and لـيـل = l--Y--l. There is also a \kesh command for extensible kashida (equivalent to a \hrulefill using the default rule thickness font dimension \fontdimen8): l--\kesh--Y--\kesh--l produces the same word with kashidas stretched to fill the line;
9. some digraphs start with an apostrophe: the hamza-carriers أ = 'a, إ = 'i, ؤ = 'u, ئ = 'I, آ = 'A and also undotted letters bāʾ ٮ = 'b, nūn ں = 'n, fāʾ ڡ = 'f and qāf ٯ = 'q;
10. other digraphs end with one or more asterisks: the most frequent one is the tāʾ marbūṭa ة = t* (which can be used also in initial and medial forms, and then becomes a regular tāʾ). The asterisk is also used for the waṣla (which is only placed on the alif) ٱ = A* as well as for the vertical fatḥa (as in هٰذا = ha*dhA) and the madda. The latter is normally used only on the alif (آ = 'A) but can be found also in the notorious muqaṭṭaʿāt in the Koran, as in Koran 42:2 or Koran 19:1; sometimes it is even combined with a šadda (as in Koran 7:1; see [21, p. 111] for the šadda);
11. a special transcription is provided for the ligature ﷲ = LLh used for the اسم الجلالة “noun of majesty”, which is the name of God الله: in this case, and in this case only, an uppercase L is used. The reason is that we wish to avoid ambiguity with other uses of the trigram lām-lām-hāʾ, for example in Koran 6:39, where we encounter the letters لله but not with the meaning “God”. In contrast to other systems, the ﷲ ligature is available also in final form, which occurs six times in the Koran (for example Koran 6:149), and it is possible to add diacritics to its first glyph (as in Koran 2:115 or Koran 2:165).

3.2 Unicode input

Input can be transliterated or provided directly in Unicode Arabic: \arab{YAnis} or \arab{يانِس} or even \arab{YAنِس} will produce the same result: يانِس.

All cells of Table 1 can be obtained by the corresponding Unicode characters (mostly via a single character, except for šadda + vowel combinations which require two characters). There is a special case, though: the ﷲ ligature (see next section).

For the convenience of the user who wants to write kashida (so that Arabic input is not disrupted) we have defined a command (in Arabic characters) \تط (تط are the first two letters of تطويل = taṭwyl, the Arabic name of kashida) which is exactly equivalent to \kesh and has to be placed between Unicode U+0640 arabic tatweel characters.
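The transliteration rules above can be tried out together in one small file. This is a hedged, untested sketch: every \arab argument is taken from the rules themselves, while the surrounding document and the accompanying English words are illustrative assumptions.

```latex
% Compile with LuaLaTeX: the article states that the package requires LuaTeX.
\documentclass{article}
\usepackage{dad}
\begin{document}

% Plain transliteration and its typewriter variant (Section 3).
\arab{AlkitAb} \quad \arabtt{AlkitAb}

% Rule 5: a dash separates two letters that would otherwise form a digraph.
\arab{s-h} versus \arab{sh}

% Rule 6: a dash acts as a joiner and forces a contextual form.
\arab{-b} (final), \arab{b-} (initial), \arab{-b-} (medial)

% Rule 8: a doubled dash gives a kashida; \kesh gives an extensible one.
\arab{lYl} versus \arab{l--Y--l}

\noindent\arab{l--\kesh--Y--\kesh--l}

% Rule 10: the vertical fatha is written with an asterisk.
\arab{ha*dhA}

% Rule 11: uppercase L is reserved for the ligature of the name of God.
\arab{LLh}

\end{document}
```

Since \kesh is described as equivalent to \hrulefill, the line containing it is set on its own here so that the kashidas have room to stretch.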


\documentclass{article}
\usepackage{dad}
\begin{document}

\textbf{Weak}. Weak verbs are those with one
or more weak letters (\arab{U} or \arab{Y})
as radicals. There are four sub-classes:
\begin{itemize}
\item\textbf{Assimilated.} Assimilated verbs
have \emph{initial} \arab{U} or (much more
rarely) \arab{Y}, and two sound radicals or
middle \arab{|} and a sound final radical.
Typical doubled roots are \arab{Ybs},
\arab{USl}.

\item\textbf{Hollow.} Hollow verbs have
\emph{middle} \arab{U} or \arab{Y} and two
sound radicals or initial \arab{|} and a
sound final radical. Typical hollow roots
are \arab{qUl}, \arab{SYr}.

\item\textbf{Defective.} Defective verbs have
\emph{final} \arab{U} or \arab{Y} and two
sound radicals. Typical roots are
\arab{rjU}, \arab{rmY}.

\item\textbf{Doubly weak.} Doubly weak verbs
have two weak radicals, \arab{U}, \arab{Y}
or \arab{|}. Typical doubly weak roots are
\arab{UlY}, \arab{sUI}, \arab{'atY},
\arab{r'aY}, \arab{sU|}.

\end{itemize}

In the middle of a verse \emph{hamzah} is
merged with the final vowel of the preceding
word, e.g., \arab{khalaqa Alo'iinosaAna},
He created man. Note that the \arab{qa}
or \arab{khalaqa} has joined the \arab{lo}
or \arab{'aalo'iinosaAna} and while the
\emph{hamzah} sign has disappeared from the
text, \emph{'alif} is retained but it is not
pronounced.\\[6pt]
\textbf{Practice text 11}\\
1. Man says --- \arab{qaAla Alo'iinosaAnu}\\
2. Does man think? --- \arab{'aaYaHosabu
Alo'iinosaAnu}\\
3. He said: How long hast thou tarried? ---
\arab{qaAla kamo labithota}\\
4. The truth is out --- \arab{HaSoHaSa
AloHaq+u}\\
5. He created man from dry clay ---
\arab{khalaqanaA Alo'iinosaAna mino
SaloSaAliN}

\end{document}

Figure 2: Sample LaTeX document using ḍād ([16, p. 5] and [21, p. 54–55]); the figure pairs this source with its typeset output.


3.2.1 The ﷲ ligature and Unicode

The ligature ﷲ is traditionally used for writing the name of God: الله. It can be found in religious texts, but also in expressions (for example, إن شاء الله, which means “hopefully”, appears even in the French language as inchallah and in Portuguese as oxalá) and in the very common surname Abdullah (عبد الله).

The problem with this ligature is that it contains a rather rare diacritic (a šadda combined with a vertical fatḥa; the latter is available in the Apple Arabic keyboard layout but not the Microsoft one) and, as a convenience, most standard fonts will replace the character string lām-lām-hāʾ (which would normally look like لله) by the complete ligature ﷲ; in other words: the font not only changes the glyphs but, at the same time, also adds the diacritics. This behavior is barely legitimate: a ligature (as in ‘fi’ or ‘لا’) is normally limited to a change of glyphs, and should not add new characters (in this case, characters U+0651 arabic shadda and U+0670 arabic letter superscript alef) since this means that what is rendered no longer corresponds to the underlying Unicode character string.

Nevertheless, for the user’s convenience, we have adopted that behavior also in ḍād, but only in the case of Unicode input. Therefore when the user types Unicode lām-lām-hāʾ (the first lām must not be preceded by a quadriform letter), the system will produce the ﷲ ligature.

This method will not work if a diacritic is inserted between the two lāms, or if the first lām follows a quadriform letter and hence will be medial. For that case, we have defined a macro \لله (the macro name is in Arabic script so that right-to-left direction is not disrupted) which takes an argument: the vowel between the two lāms. Hence, to obtain فَلِلّٰهِ the user can choose between the Unicode form, in which \لله takes the kasra as its argument, and the transliteration faLiLhi.

The dotted circle, used to show the combining nature of short vowels and other diacritics, can be obtained by the macro \arabdottedcircle or by an equivalent macro whose name is in Arabic script.

Figure 3: Finite state automaton starting with an isolated lām (A stands for the set of alif letters, A = { A, 'a, 'i, 'A, A* }; k stands for any Arabic letter besides h and the set A).

4 TeXnicalities

The ḍād font is a tour de force of smart ligature use: for example, to obtain k1t2b3 (كتب) out of ktb = k0t0b0 (or "0643"062A"0628) input, one needs the following smart ligatures:

k0 LIG/ t0 → k1 t0
k1 /LIG t0 → k1 t3
t3 LIG/ b0 → t2 b0
t2 /LIG b0 → t2 b3

as well as the following four for Unicode input:

"0643 LIG/ "062A → k1 t0
k1 /LIG "062A → k1 t3
t3 LIG/ "0628 → t2 b0
t2 /LIG "0628 → t2 b3

The first ligature of each group leaves t/"062A unchanged (isolated) and turns k/"0643 into initial form. Then the second ligature takes k1t0/k1"062A, leaves k unchanged (initial) and turns t/"062A into final form. But, because of the following b/"0628, the third ligature will turn t into medial form, leaving b/"0628 unchanged. And, finally, the fourth ligature will leave t unchanged and turn b/"0628 into final form. It is noteworthy that t passes through three forms: from isolated (default) it turns into final and then into medial form.

All basic Arabic glyphs are placed into the first 8-bit table. Then one 8-bit table (except for table "06xx which is used for Unicode input) is added for every letter + diacritic(s) combination, so that we have, in total, 20 tables. The complete font contains 3,514 virtual glyphs, 403,913 ligatures (321,935 of which are smart) and 7,810 kerns.

The most challenging letter is lām: the font contains 3 initial lāms, 4 medial ones as well as 3 “fake” ligatures lām-lām (“fake” in the sense that they are only needed because of TeX’s approach of building ligatures stepwise and hence needing intermediate steps for all ligatures of length three and more): to obtain the lām-lām-hāʾ ligature (see § 3.2.1) one needs an intermediate lām-lām, even though this pair of letters does not take any special form. In Fig. 3 the reader can see the finite state automaton starting with an isolated lām.

The virtual OVP font is built from the metrics of the PostScript Type 1 font by a Perl script. This script also reads configuration files specifying all kern pairs as well as all horizontal and vertical adjustments of diacritics. By this method, every letter has its diacritics placed at optimal positions. To compile the OVP file produced by the script into


OVF, it is mandatory to use the wovp2ovf tool in a version higher than “1.13 (build 34787)”, which will be included in TeX Live 2015.

Names of PostScript glyphs are standard,² so that copy-paste from a PDF file results in almost perfect Unicode strings.

² These names are either taken from the Adobe Glyph List [1], or follow the standard convention uniXXXX.var, where XXXX is the Unicode position of the character and var ∈ {ini, med, fin, iso}. Ligatures are named by the names of their components, concatenated using underscores.

4.1 Conversion to and from UTF-8

As a tool for users, we provide two Perl scripts allowing conversion from UTF-8 to our transliteration scheme and back. These scripts can be applied selectively using, for example, the feature of many advanced text editors of applying text filters to selected text areas.

5 Conclusion

There was a period (in the early days of non-Latin-alphabet TeX [4, 5, 14]) where transliteration of input text was the only available method. Then, when Unicode was sufficiently widespread, TeX switched to tools allowing direct non-Latin input. In the case of Arabic, because of the particular characteristics of this script, this is, even today, not always the optimal solution, especially when we are dealing with short extracts of Arabic text combined with Latin-alphabet text and TeX commands. Maybe now is the time to return to methods based on transliteration, as an alternative to direct-input methods. We have implemented this approach, using only smart ligatures, as defined by Donald Knuth in 1990 [13], and the large virtual font format introduced by Ω and taken over by LuaTeX.

We hope that this package will be useful to users seeking a straightforward method to introduce short Arabic extracts into Latin-alphabet documents.

References

[1] Adobe Systems. Adobe glyph list. http://partners.adobe.com/public/developer/en//glyphlist.txt, 2002.
[2] Hàn Thế Thành. Micro-typographic extensions to the TeX typesetting system. TUGboat, 21(4):317–434, 2000. http://pdftex.org.
[3] John Hammersley, John Lees-Miller, et al. The writeLaTeX online collaborative LaTeX editor. http://www.writelatex.com.
[4] Yannis Haralambous. Arabic, Persian and Ottoman TeX for Mac and PC. TUGboat, 11:520–522, 1990.
[5] Yannis Haralambous. Towards the revival of traditional Arabic typography through TeX. In Proceedings of EuroTeX’92, pages 293–305. CSTUG, 1992.
[6] Yannis Haralambous. Fonts & Encodings. O’Reilly, 2007.
[7] Yannis Haralambous, Yassir Elidrissi, and Philippe Lenca. Arabic language text classification using dependency syntax-based feature selection. Submitted to CITALA 2014.
[8] Yannis Haralambous and John Plaice. First applications of Ω: Adobe Poetica, Arabic, Greek, Khmer. TUGboat, 15:344–352, 1994.
[9] Yannis Haralambous and John Plaice. Multilingual typesetting with Ω, a case study: Arabic. In Proceedings of the International Symposium on Multilingual Information Processing ’97, pages 137–154. ETL, Tsukuba, Japan, 1997.
[10] Yannis Haralambous and John Plaice. The design and use of a multiple-alphabet font with Ω. In Electronic Publishing, Artistic Imaging, and Digital Typography, volume 1375 of LNCS. Springer, 1998.
[11] Taco Hoekwater. LuaTeX. TUGboat, 28:312–313, 2007. http://luatex.org.
[12] Jonathan Kew. XeTeX, the multilingual lion: TeX meets Unicode and smart font technologies. TUGboat, 26:115–124, 2005. http://tug.org/ .
[13] Donald E. Knuth. The new versions of TeX and METAFONT. TUGboat, 10:325–328, 1989.
[14] Klaus Lagally. ArabTeX: Typesetting Arabic with vowels and ligatures. In Proceedings of EuroTeX’92, pages 153–172. CSTUG, 1992.
[15] Ahmed Lakhdar-Ghazal. Pour apprendre et maîtriser la langue arabe. Institut d’études et de recherches pour l’arabisation, Rabat, Morocco, 1991.
[16] John Mace. Arabic verbs and essential grammar. Teach yourself books, 1999.
[17] Nicole Richert. Arabisation et technologie. Institut d’études et de recherches pour l’arabisation, Rabat, Morocco, 1987.
[18] Karin C. Ryding. Arabic. A linguistic introduction. Cambridge University Press, Cambridge, 2014.
[19] Michael Shell. IEEEtran LaTeX class. http://ctan.org/pkg/ieeetran, 2007.
[20] John R. Smart. Arabic. Teach yourself books, 1986.
[21] Barakat Ahmad Syed. Introduction to Quranic script. Curzon Press, 1984.

⋄ Yannis Haralambous
Institut Mines Télécom, Télécom Bretagne, UMR CNRS 6285 Lab-STICC
Technopôle Brest Iroise CS 83818, 29238 Brest Cedex 3, France
yannis.haralambous (at) telecom-bretagne dot eu
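The four smart-ligature rules quoted in Section 4 for the input k0t0b0 can be traced by hand, and the following Python sketch does the same mechanically. It is only a model of the rewriting behaviour, not the font's actual ligature table: the names RULES and apply_smart_ligatures are invented here, and the loop assumes that after each replacement the same pair position is examined again, which is how the plain LIG/ and /LIG forms behave in TeX.

```python
# Each rule maps a (left, right) glyph pair to (operation, replacement glyph).
# "LIG/" replaces the left glyph and keeps the right one;
# "/LIG" keeps the left glyph and replaces the right one.
RULES = {
    ("k0", "t0"): ("LIG/", "k1"),
    ("k1", "t0"): ("/LIG", "t3"),
    ("t3", "b0"): ("LIG/", "t2"),
    ("t2", "b0"): ("/LIG", "b3"),
}


def apply_smart_ligatures(glyphs):
    """Apply the rules left to right, re-examining a pair after each change."""
    out = list(glyphs)
    i = 0
    while i < len(out) - 1:
        rule = RULES.get((out[i], out[i + 1]))
        if rule is None:
            i += 1  # no rule for this pair: move the cursor one glyph right
            continue
        operation, replacement = rule
        if operation == "LIG/":
            out[i] = replacement
        else:
            out[i + 1] = replacement
    return out


result = apply_smart_ligatures(["k0", "t0", "b0"])
print(" ".join(result))
```

Running the sketch on k0 t0 b0 applies the four rules in the order listed in the article and ends with k1 t2 b3, the glyph sequence given there.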


Visual editing (in a specialized case): prerex

Bob Tennent

Abstract

It is sometimes desirable and straightforward to support visual editing for LaTeX; this article describes one such case: course prerequisite charts, supported by the (v)prerex programs.

1 Introduction

One of the most frequently asked questions by LaTeX beginners is whether a graphical interface “like a word processor” is available. Most readers of this article know how to respond: we emphasize logical structure and we point out the availability of LaTeX-friendly text editors, LaTeX development environments, preview-latex¹ and, in recalcitrant cases, Scientific Word,² or the LyX³ or TeXmacs⁴ “document processors”. Not as often mentioned is the difficulty of parsing arbitrary TeX documents; it’s been said that only TeX can process TeX.

In this article, I discuss the design of a system that allows (but doesn’t require) a form of visual editing of a specialized LaTeX environment for “prerequisite charts”.

2 Prerequisite charts

A prerequisite chart gives an attractive graphical presentation of courses in a program (or set of related programs), organized by terms or years, linked by pre- and co-requisite arrows (directed edges), and, when possible, supplemented by timetable information; Figure 1 on page 286 is a small example. Realistic examples may be found at http://www.cs.queensu.ca/students/undergraduate/prerequisites.

Some notable properties of these charts:

• Each course box is sized to just enclose the text within it, with uniform standard margins.
• Each arrow between courses is oriented from box centre to box centre, rather than from/to standard “connection points” on the box outlines.
• The arrows are “clipped” by the course boxes, but the arrowheads abut the target box exactly.

These desirable properties are not easily achieved using conventional drawing software, no matter how “user-friendly” it purports to be.

In contrast, the use of LaTeX to produce these charts provides complete flexibility as well as professional quality.

• Text within a course box may be partitioned into regions with varying characteristics. For example, the course code and the timetable information on the first line of course boxes are in a smaller font than the course name. The latter is centered and the former are left- and right-justified, respectively. Arbitrary LaTeX formatting can be used for the text.
• Any available fonts may be used. The TeX typesetting engine takes advantage of kerns and ligatures in the fonts.
• Line thickness for boxes may be varied; in the example diagram, heavier boxes (and bold-face text) are used to indicate that a course is “required” in the program, rather than an option.
• Different styles of connectors can be used, for example to distinguish prerequisites, co-requisites, and recommended prerequisites.
• Various sizes or shapes of course boxes may be used, for example to distinguish between half and full courses.
• Graphic images such as logos can be imported.
• Colours and hyperlinks to on-line course descriptions or calendars are possible.

As an example, the chart in Figure 1 is produced by the LaTeX code in Figure 2. A conventional two-dimensional Cartesian coordinate system is used to specify the locations of diagram elements. The origin (where x = 0 and y = 0) is at the lower-left corner of the diagram. The coordinates of a box are those of its centre point; an arrow is described by the coordinates of the centre points of its source and target boxes. The order of commands is not significant except that the commands for the source and target boxes of an arrow should precede the command for the arrow.

The prerex package⁵ currently uses pgf⁶ and other standard packages to implement the chart environment and a specified set of commands within that environment. Some implementation details:

• The half-course boxes are assigned a minimum height to give a more uniform appearance to horizontal rows of such boxes.
• Arrows with a small height are always drawn straight (using a specialized and simpler macro) unless a non-zero curvature is explicitly requested.
• A wider white edge is drawn below every arrow to improve the appearance of crossing arrows.

¹ http://preview-latex.sourceforge.net/
² http://www.sciword.demon.co.uk
³ http://www.lyx.org/
⁴ http://www.texmacs.org/
⁵ http://www.ctan.org/pkg/prerex
⁶ http://www.ctan.org/pkg/pgf


3 The prerex editor

It is certainly possible, though rather tedious and error-prone, to use a conventional text editor to create and revise such descriptions, with some operations, such as global or partial “shifts” of chart elements, being particularly problematic.

The prerex editor is a C program that allows chart descriptions to be edited interactively. The editor supports add, remove, cut-and-paste, and edit operations on diagram elements, and vertical or horizontal shifts of: a list of specified elements, all the elements in a rectangular region, or the entire diagram. The edited diagram may be saved, re-processed, and viewed in any PDF viewer, without exiting the editor.

The program reads the source file (possibly for an initial “blank” chart), saving text until the chart environment is found, then parses the chart commands into an internal representation of the diagram. The rest of the source file is saved as text. Macro definitions and calls are not processed or expanded.

When the internal representation of the chart (linked lists of node and arrow data) is available, it is routine to implement editing operations. At any time, the user can ask for a source file to be re-created (using the saved texts and the possibly revised internal representation) and then re-processed. The revised output is available in about four seconds and can be observed in any PDF viewer.

Also observable in most PDF viewers are the coordinates of nodes, or the coordinates of the initial and terminal nodes of arrows, when the cursor is moved over the node or arrow; this is because, during editing, the usual URL associated with nodes is replaced by a special URI containing these coordinates. On initialization, a coordinate grid is generated for the background of the chart in order to facilitate determination of coordinates. If necessary, the user can “escape” from the editor to a shell, for example, to edit the source file with a conventional text editor. A multi-level “undo” command is available. If a box is “cut” and then “pasted” elsewhere, the target or source coordinates of arrows into or out of the box are adjusted automatically, and similarly if nodes are shifted or raised.

The source code for the prerex editor is available in the documentation directory of the prerex package. The only non-standard library dependence is readline (or libedit).

4 The vprerex interface to prerex

Of course, interactive editing with revisions monitored in a PDF viewer is not what is usually meant by visual editing. But most of the convenience of visual diagram editing can be obtained by simply allowing coordinates of “selected” diagram elements or background points to be conveyed via the “clipboard” from the PDF viewer to the prerex editor command line.

To implement this, a bare-bones open-source PDF viewer was hacked so that, when nodes, arrows or background points are mouse-clicked, the relevant chart coordinates are loaded into the clipboard. For nodes and arrows, the relevant chart coordinates are already available in the special URIs. For background points, it is necessary to transform PDF coordinates into chart coordinates. To allow this transformation, two “anchors” (i.e., virtual nodes) are inserted at the southwest and northeast corners of the chart; from their PDF coordinates and their known chart coordinates, it is possible to compute the chart coordinates of arbitrary clicked points on the chart.

All the vprerex application does is to start up the prerex editor in an xterm and the prerex-enabled PDF viewer. This naive approach to visual editing is relatively simple to implement and quite pleasant to use.

The source code for the vprerex application is available in the documentation directory of the prerex package. It depends on the poppler-qt4 and other Qt-4 libraries.

5 Discussion

An interactive editor like prerex is feasible because it only has to deal with a single very specialized environment and specialized commands within that environment. The “visual editor” vprerex is notable for being unambitious: it simply puts coordinates into the clipboard for the user to convey to the interactive editor, whereas very ambitious projects such as VorTeX (Visually-oriented TeX) [1] and TeXLite [2] have apparently foundered. Perhaps the approach described here might be applicable to other projects which could benefit from a form of visual editing.

References

[1] Pehong Chen et al. The VorTeX document preparation environment. In TeX for Scientific Documentation, volume 236 of Lecture Notes in Computer Science, pages 45–54. Springer, 1986.
[2] Igor I. Strokov. A WYSIWYG TeX implementation. TUGboat, 20(4):356–359, December 1999. http://tug.org/TUGboat/tb20-4/tb65strok.pdf.

⋄ Bob Tennent
School of Computing, Queen’s University
Kingston, Ontario K7L 3N6 Canada
rdtennent (at) gmail dot com
http://www.ctan.org/pkg/prerex
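The two-anchor transformation described in Section 4 amounts to linear interpolation along each axis. The Python sketch below illustrates the arithmetic only; it is not taken from the vprerex source (which is built on Qt and poppler), and the function name make_pdf_to_chart, the anchor positions, and the 100 by 100 chart size are all hypothetical. It assumes the chart is neither rotated nor sheared on the page.

```python
def make_pdf_to_chart(sw_pdf, ne_pdf, sw_chart, ne_chart):
    """Build a converter from PDF coordinates to chart coordinates.

    sw_* and ne_* are (x, y) pairs for the southwest and northeast anchors,
    given once in PDF coordinates and once in chart coordinates.
    """
    pdf_width = ne_pdf[0] - sw_pdf[0]
    pdf_height = ne_pdf[1] - sw_pdf[1]
    chart_width = ne_chart[0] - sw_chart[0]
    chart_height = ne_chart[1] - sw_chart[1]

    def pdf_to_chart(x, y):
        # Interpolate each axis independently between the two anchors.
        chart_x = sw_chart[0] + (x - sw_pdf[0]) * chart_width / pdf_width
        chart_y = sw_chart[1] + (y - sw_pdf[1]) * chart_height / pdf_height
        return (chart_x, chart_y)

    return pdf_to_chart


# Hypothetical anchors: the chart spans (0, 0) to (100, 100) in chart units
# and occupies the PDF rectangle from (72, 100) to (572, 600).
to_chart = make_pdf_to_chart((72.0, 100.0), (572.0, 600.0), (0.0, 0.0), (100.0, 100.0))
print(to_chart(322.0, 350.0))
```

Because the scale factors are derived from the anchors themselves, the same code works if the viewer reports y coordinates growing downwards: the PDF height then comes out negative and the sign cancels.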


[Chart: Computer Science course boxes 1083 through 3503, each showing course code, timetable slot and course name, linked by prerequisite arrows]

• A solid arrow indicates a required prerequisite, a dotted arrow indicates a corequisite (to be taken before or concurrently), and a dashed arrow indicates a recommended prerequisite. Core courses are in bold boxes; other courses (i.e., options or prerequisites) are in light boxes.

• Timetabling abbreviations: M, T, W, Th, F=Mon, Tue, Wed, Thur, Fri, resp.; eve=7:00–9:50 pm; no=not offered.

Figure 1: A prerex-formatted prerequisite chart

\begin{chart}
\text 10,50:{\Large Computer\\\Large Science}
\reqfullcourse 50,45:{1083}{Comput.\,Sci.\\Concepts}{TTh 10:00}
\reqhalfcourse 25,40:{1303}{Discrete\\Structures}{MWF 9:30}
\reqhalfcourse 30,30:{2813}{Computer\\Organiz.\,I}{MWF 8:30}
  \prereq 50,45,30,30:
  \prereq 25,40,30,30:
\reqhalfcourse 45,30:{2023}{Procedural\\Prog.\,Devel.}{MWF 2:30}
  \prereq 50,45,45,30:
\reqhalfcourse 65,30:{2513}{Informat.\\Systems}{TTh 1:00}
  \coreq 50,45,65,30:
\mini 10,26:{1083}
\reqhalfcourse 10,20:{2333}{Computab.\,\&\\Formal\,Lang.}{TTh 11:30}
  \prereq 25,40,10,20:
  \prereq 10,26,10,20:
\reqhalfcourse 45,20:{2013}{Software\\Engineer.\,I}{MWF 11:30}
  \prereq 45,30,45,20:
\halfcourse 55,20:{2685}{\texttt{C++}\\Program.}{no}
  \prereq 45,30,55,20:
\mini 21,16:{2013}
\reqhalfcourse 15,10:{3323}{Data\\Structures}{MWF 10:30}
  \prereq 25,40,15,10:
  \prereq 21,16,15,10:
\reqhalfcourse 25,10:{3813}{Comput.\\Organiz.\,II}{TTh 8:30}
  \prereq 30,30,25,10:
\reqhalfcourse 35,10:{3413}{Operating\\Systems\,I}{MWF 9:30}
  \prereq 30,30,35,10:
  \recomm 45,20,35,10:
\halfcourse 45,10:{3013}{Software\\Engineer.\,II}{MWF 11:30}
  \prereq 45,20,45,10:
\halfcourse 58,10:{3513}{Database\\Mngt.\,Sys.\,I}{MWF 8:30 pm}
  \prereq 65,30,58,10:
  \prereq 45,20,58,10:
\reqhalfcourse 70,10:{3503}{Sys.\,Anal.\\\&\,Design}{TTh 10:00}
  \prereq 65,30,70,10:
\end{chart}

Figure 2: LaTeX source for the prerequisite chart in Fig. 1
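The one ordering rule stated in Section 2, that the boxes at both ends of an arrow must be declared before the arrow, can be checked mechanically. The Python sketch below is an illustration under stated assumptions and is not part of prerex: the helper misplaced_arrows and the regular expression are invented here, and only the command names that appear in Figure 2 are recognised (\text is deliberately not treated as a possible arrow endpoint).

```python
import re

# Commands from Figure 2 that create a box an arrow can start or end at.
NODE_COMMANDS = {"reqfullcourse", "reqhalfcourse", "halfcourse", "mini"}
# Commands from Figure 2 that draw an arrow between two box centres.
ARROW_COMMANDS = {"prereq", "coreq", "recomm"}

# A command name, whitespace, comma-separated numbers, then a colon.
COMMAND = re.compile(r"\\([A-Za-z]+)\s+([-\d.,]+):")


def misplaced_arrows(source):
    """Return the arrow commands whose source or target box is not yet declared."""
    seen = set()
    problems = []
    for match in COMMAND.finditer(source):
        name = match.group(1)
        numbers = tuple(float(n) for n in match.group(2).split(","))
        if name in NODE_COMMANDS and len(numbers) == 2:
            seen.add(numbers)
        elif name in ARROW_COMMANDS and len(numbers) == 4:
            if numbers[:2] not in seen or numbers[2:] not in seen:
                problems.append(match.group(0))
    return problems


CHART_OK = r"""
\reqfullcourse 50,45:{1083}{Comput.\,Sci.\\Concepts}{TTh 10:00}
\reqhalfcourse 25,40:{1303}{Discrete\\Structures}{MWF 9:30}
\reqhalfcourse 30,30:{2813}{Computer\\Organiz.\,I}{MWF 8:30}
  \prereq 50,45,30,30:
  \prereq 25,40,30,30:
"""

# Same commands, but the first arrow now precedes its source box.
CHART_BAD = r"""
\reqhalfcourse 30,30:{2813}{Computer\\Organiz.\,I}{MWF 8:30}
  \prereq 50,45,30,30:
\reqfullcourse 50,45:{1083}{Comput.\,Sci.\\Concepts}{TTh 10:00}
"""

print(misplaced_arrows(CHART_OK))
print(misplaced_arrows(CHART_BAD))
```

The check matches arrow endpoints to boxes by exact coordinates, which mirrors how the chart source identifies them: an arrow names the centre points of its source and target boxes.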


l3build: A modern Lua test suite for TeX programming

Frank Mittelbach, Will Robertson and The LaTeX3 team

Contents

1 Introduction
2 History
  2.1 The needs in the ’90s
  2.2 The general approach
  2.3 The new needs (in the new century)
3 Overview of the new system
  3.1 Modes of testing
4 Setting up the regression test system
  4.1 Creating and checking test output
  4.2 An example driver file
  4.3 The structure of test files
  4.4 Options
5 Operating the system
6 Acknowledgements

1 Introduction

Regression tests are an important tool in any moderately complex programming environment. They allow the programmer to make extensive changes to their code while providing confidence that something that used to work still does. Extensive regression test suites have been an essential component of the maintenance and development of LaTeX 2ε and LaTeX3.

A regression test suite is typically composed of a number of individual files that contain one or more testable units of the code being tested. A testable unit might be either a certain computation with an expected outcome, a series of logic tests, or (in particular for TeX-based code) material that is typeset and intended to achieve some particular formatting.

During code development and before any new code is released to the public, this test suite can be compiled to ensure that any changes to the code have not introduced bugs or changed the behaviour compared to previous versions. As bugs in the code are reported, minimal examples demonstrating the bug often form test files of their own, showing that the bug has been fixed and won’t re-occur.

As TeX-based code operates in at least three different ‘modes’ (mouth, stomach, and output), regression testing is more complex than simply asserting the outcome of certain programming logic. As part of the work of the LaTeX3 project, a new Lua-based testing environment has been written to support ongoing development. This testing environment, presented at the 2014 TUG conference in Portland [3], is suitable for use by the general TeX community.

2 History

The ideas for a regression test suite for LaTeX date back to the early nineties¹ when LaTeX 2.09 existed in various incompatible flavours around the world due to its limitations in properly supporting font selection, complex mathematics, and languages other than English. Because of that situation LaTeX 2ε was designed and implemented to reunite the different formats and to provide a stable platform for future LaTeX development.

¹ As with many ideas in the TeX world, this one too can be partly traced back to Don Knuth, who already provided his own regression test for TeX a decade earlier [1].

However, to successfully introduce LaTeX 2ε as an accepted successor of LaTeX 2.09 it was essential to win over the huge LaTeX user base and provide them with a system that was as stable and upward compatible as possible. Thus existing user interfaces should be preserved and typesetting should provide identical output except in those cases where bug fixes or deliberate design decisions resulted in changes.

To achieve this we devised a validation mechanism that could be used to ensure that interfaces behave as expected and typesetting results do not change even though the underlying code gets modified. With this in place the LaTeX3 Project Team together with additional volunteers set out to create a large number of test files and verify them against the current LaTeX 2.09 implementation. Figure 1 shows the original request for volunteers (exhibiting a severe underestimation of the amount of work involved); see also [2] for a more extensive description of this endeavor.

This effort resulted in something like 200 test files that were then used to assure ourselves that the new LaTeX 2ε implementation was faithfully supporting all interfaces; it was one of the key factors that ensured the new system became an accepted replacement for LaTeX 2.09 within a reasonably short period.

Once in place this regression test suite was augmented over time and now contains roughly 350 test files altogether. Whenever a bug was found and fixed we added a new test file that would exhibit the undesired behavior if that bug would somehow resurface through later changes.

Though not perfect (after all we introduced a number of bugs that initially were not caught by the


regression test suite), the approach served us very well and prevented a number of horrible mistakes that would otherwise have made it into public releases of LaTeX.

  Validating LaTeX 2.09

  Writing test files for regression testing: checking bug fixes and improvements to verify that they don’t have undesirable side effects; making sure that bug fixes really correct the problem they were intended to correct; testing interaction with various document styles, style options, and environments.

  We would like three kinds of validation files:
  1. General documents.
  2. Exhaustive tests of special environments/modules such as tables, displayed equations, theorems, floating figures, pictures, etc.
  3. Bug files containing tests of all bugs that are supposed to be fixed (as well as those that are not fixed, with comments about their status).

  A procedure for processing validation files has been devised; details will be furnished to anyone interested in this task. Estimated time required: 2 to 3 weeks, could be divided up.

Figure 1: Original request for volunteers

2.1 The needs in the ’90s

With the initial regression test suite we solved a number of burning problems. First of all we wanted to be confident that the code and the documented user interfaces worked as expected. Whenever we recoded an internal function the test suite would automatically alert us if that resulted in any noticeable changes at the user level or in downright bugs.

Furthermore LaTeX 2ε came with much more documentation and the tests included compiling and checking the documentation files for errors and missing references.

In addition the Makefiles that ran the tests also included goals to build the distribution automatically. Compared to LaTeX 2.09, which consisted of very few files, the format for LaTeX 2ε was generated from many source .dtx files, so the housekeeping complexity was greatly increased.

Another issue we had to tackle was that the code was no longer maintained by a single person but by developers living in different places around the world and using different operating systems and installations. So the regression suite had to function with different installations without creating spurious differences.

Finally all tasks had to work without user intervention or manual work because only in that case will such a system be used on a regular basis and thus benefits be realized.

2.2 The general approach

Designing a test system for verifying TeX’s typesetting behavior is not easy: how do you test for correctness and how do you ensure that the tests are repeatable over time and in different places?

The approach we came up with was to build test files that generate suitable data in their .log files. Suitable data would be, for example, the state of counters or dimensions produced with \showthe, data written with \typeout, and box content shown with \showbox. Some of the tracing parameters of TeX could be used to verify paragraph building or page breaking decisions, but something like \tracingall would be inadvisable, as that would show the internal coding and not the expected functionality.

The result of running such a test file would then be manually verified and stored away as a certified result. However, as many readers will already be aware, LaTeX’s .log files contain a lot of irrelevant data, some of which differs from run to run and some of which differs when running on different installations. So to make this approach workable we introduced a cleanup step in which we modified the result files, removing irrelevant material and normalizing some of the remaining parts. Of course one has to be careful not to sanitize too far, but we found a number of things necessary or at least advisable, including

• shortening file path info to avoid differences between installations
• dropping empty lines (different TeX implementations put in different numbers of these)
• dropping line numbers in ‘on line ⟨number⟩’ to avoid differences just because extra lines got introduced in a test file.

Putting it all together we ended up with a system consisting of test files (with the extension .lvt), certified result files to compare against (extension .tlg) and a fairly complex Makefile and a number of Perl scripts used to run the different tasks. These tasks included running the test suite, producing the documentation and generating the distribution (ready to be shipped to CTAN). It also contained a number of special functions such as unpacking and locally installing the code, cleaning up the source directories,


checking individual test files, and producing a new .tlg file for a given test file.

2.3 The new needs (in the new century)

As mentioned above, the initial system served us well, when moving from LaTeX 2.09 to LaTeX 2ε and then throughout the ’90s, which had very active LaTeX 2ε development with releases produced at half-year intervals.

In this century, development of the core of the LaTeX 2ε kernel has slowed to a minimum (releases are now only every couple of years and the changes are small) while it has intensified in other areas such as actively progressing the development of the LaTeX3 programming language expl3. With this new focus, newly important requirements for a regression test system became apparent.

Instead of a single distribution we now had to deal with a growing number of distributions: core LaTeX 2ε and its packages, Babel (with a different release cycle), expl3 and possibly smaller and larger distributions of third party code that also wanted to benefit from a functional regression test system.

Windows and Mac OS X became the operating systems of choice for several developers and the Makefile approach of the original test suite did not work on Windows and only with modifications on Mac OS X.

Last but not least, a number of new TeX-based engines matured and people now wanted to use LaTeX and friends not only on pdfTeX but also on these new engines, all of which provided additional capabilities. These new engines showed a number of subtle differences when adding data to the .log file, or due to extended capabilities showed additional data (such as extra nodes in listings). Furthermore the new engines still have bugs and a number of them showed up when we initially ran test files and compared their output with the certified .tlg data.

Thus testing became a multi-dimensional problem: one had to verify test results with several engines and it had to work on multiple operating systems. Furthermore new code sources posed new or different requirements for building a distribution or doing the testing and we soon found that the original approach made a number of hardwired decisions that were no longer applicable if the system was used with a distribution different from LaTeX 2ε.

For a short while we tried to accommodate the need for Windows support by using a set of .bat files in parallel with the Makefile approach but obviously that was doomed to failure, being impractical to maintain. Another avenue we explored was switching to a fully Perl-based approach (using Cons) but that again didn’t work well with Windows and furthermore it would have been a solution not available out of the box on any TeX installation.

Eventually, we decided to apply the same principle used long ago with docstrip.tex: use the scripting language with some operating system capabilities that is available out of the box on all TeX installations. Back then the answer was that only TeX itself fit that bill and so TeX became the tool to build style files, etc., from .dtx sources. However, while TeX as such is too limited to be used for scripting a regression test system, we now had LuaTeX as an engine that offers a full-fledged Lua interpreter, and these days LuaTeX is part of all modern TeX installations.

Moving to Lua (or texlua to be precise) means that the test and distribution system is now not tied to either the operating system (as the script runs on Windows and Unix variants) or to third-party tools (as Lua is available as part of a modern TeX system).

− − ∗ − −

In the remaining sections of this article we describe the new system and how it can be applied to support arbitrary code within the TeX world.

3 Overview of the new system

To illustrate, a hypothetical package will be described that uses the new system: consider a package abc with a collection of source files in the following layout.

abc/
  abc.dtx
  abc.ins
  build.lua
  README
  testfiles/
    test1.lvt
    test1.tlg
    ...
  support/
    abc-test.cls

What is added in addition to the normal source files is a short Lua script, normally called build.lua. Test files and their certified results are located in the folder ‘testfiles/’ with extensions .lvt and .tlg, respectively. The files in support/ (if any) are used when running the test files.

Upon running the test suite, a new folder ‘build’ is created in which the package is unpacked, support files are copied across, and each test file is run in turn and compared to its original .tlg file. Directories and file names are adjustable and other setups are possible; the above structure is simply the default.
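As a rough illustration of the cleanup step from Section 2.2 and of the comparison against a certified .tlg file, here is a minimal Python sketch. It is not the l3build implementation (which is written in Lua) and not the original Perl scripts; the function names normalize_log and check, the regular expressions, and the "on line ..." placeholder are all assumptions made for this example, and the sample log lines are invented.

```python
import re


def normalize_log(text):
    """Apply the three cleanup rules listed in Section 2.2 to a log excerpt."""
    kept = []
    for line in text.splitlines():
        if not line.strip():
            continue  # drop empty lines
        # Shorten file paths to their last component (forward slashes only).
        line = re.sub(r"[^\s()]*/([^/\s()]+)", r"\1", line)
        # Drop the number in 'on line <number>'.
        line = re.sub(r"on line \d+", "on line ...", line)
        kept.append(line)
    return "\n".join(kept)


def check(log_text, certified_text):
    """Return True when the normalized log equals the certified result."""
    return normalize_log(log_text) == certified_text.strip()


# Two invented logs of the same test from different installations.
log_a = (
    "(/usr/local/texmf/tex/latex/base/article.cls\n"
    "\n"
    "3-0-4\n"
    "Package abc Warning: oops on line 17\n"
)
log_b = (
    "(C:/texlive/2014/texmf-dist/tex/latex/base/article.cls\n"
    "3-0-4\n"
    "\n\n"
    "Package abc Warning: oops on line 19\n"
)

certified = normalize_log(log_a)  # what a 'save' step would store
print(certified)
print(check(log_b, certified))
```

The path rule is deliberately restricted to forward slashes: treating backslashes as separators would also rewrite every control sequence that appears in the log.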


3.1 Modes of testing

The best way to perform regression tests for TeX programming is to use the .log file; only here can box content be tested, not just logical and programmatic constructs. Box content is essential for checking from the very highest level that code changes do not result in different typeset output.

TeX programming can be either expandable or not. Code that is expected to be expandable should be tested as such. This can be done by evaluating it within something like \typeout (in the case of LaTeX). For non-expandable tests one should output their results to the .log once they have been evaluated. As mentioned earlier there are also a number of TeX tracing parameters and commands like \showbox, \showlists, or \showthe that can be used to generate relevant test data in the .log file.

To aid in producing a structured test suite we provide a number of commands for use in the test files. The \TYPE command is used to write material to the .log file; it works like \typeout, but it allows 'long' input. The following commands then use \TYPE to output strings to the .log file.

• \SEPARATOR inserts a long line of = symbols to break up the output.
• \TRUE, \FALSE, \YES, \NO insert text strings for standardized comparison.
• \ERROR is not defined but is commonly used to indicate a code path that should never be reached.

To produce individual tests we offer the commands \TEST and \TESTEXP. These commands take two arguments: a title and the actual test body. \TESTEXP executes the body within a \TYPE command to test expandability, but with \TEST you are responsible for generating test output using \TYPE, \TRUE, etc., as it is intended to be used for non-expandable tests. Both commands surround the generated output with \SEPARATORs and display the title and a test number. Here is an example:

    \TEST{stepping counters}
    { \setcounter{chapter}{2}
      \setcounter{section}{5}
      \setcounter{subsection}{4}
      \stepcounter{chapter}%
      \TYPE{\arabic{chapter}-%
        \arabic{section}-\arabic{subsection}}
      \SEPARATOR
      \refstepcounter{section}
      \TYPE{\arabic{chapter}-%
        \arabic{section}-\arabic{subsection}}
    }

This test will then produce the following output, as in standard LaTeX only a counter directly "within" is reset to zero (e.g., the subsection counter is not touched when chapter is stepped):

    ========================================
    TEST 8: stepping counters
    ========================================
    3-0-4
    ========================================
    3-1-0
    ========================================

(Assuming it's the eighth test in the file.)

4 Setting up the regression test system

Consider the case that a LaTeX package consists of one or more .dtx files in a flat directory structure. By default, to set up a regression test suite, you would create a driver file named 'build.lua' and a sub-folder named 'testfiles/' to contain the test files. An example driver file is shown in Section 4.2.

The test files can be called basically anything (but should be logical in some way), and by default have the extension .lvt. These are accompanied by a pre-saved .tlg file which contains the 'results' of the test file to be checked against subsequent compilation of that test. If a test file has different results for different engines it is possible to "certify" .tlg files for each engine; those then have extensions such as .⟨engine⟩.tlg.

4.1 Creating and checking test output

The first time a .lvt test file is written, it will need to be compiled to obtain the necessary .tlg output for future tests. This is performed with:

    texlua build.lua save ⟨test name⟩

(To produce an engine-specific .tlg file an additional ⟨engine⟩ argument can be given.) This task can be re-run as many times as necessary until the test file demonstrates the necessary behaviour being tested. At this point,

    texlua build.lua check ⟨test name⟩

will then re-run the .lvt file and compare the result to the original .tlg output. If no ⟨test name⟩ is specified all tests in the test directory are run. Presuming no code has changed to affect the output of the tests, the console output of this task will show the name of the test files being processed followed by the line:

    All checks passed

If only one test file is run the usual console output from the TeX compilation is also shown; otherwise it is suppressed.
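The two result lines of the stepping-counters test (3-0-4 and 3-1-0) follow from the reset rule mentioned with the test output. The following C fragment is only a toy model of that rule, not LaTeX code: the array layout and the names step_counter and type_counters are made up for this illustration, and the cross-reference side effect of \refstepcounter is ignored.

```c
#include <stdio.h>

/* Toy model of three LaTeX counters (names are hypothetical). */
enum { CHAPTER, SECTION, SUBSECTION, NCOUNTERS };

/* Step one counter and reset only the counter directly "within" it. */
static void step_counter(int counters[NCOUNTERS], int which)
{
    counters[which] += 1;
    if (which + 1 < NCOUNTERS)
        counters[which + 1] = 0;   /* direct child only, nothing deeper */
}

/* Format "chapter-section-subsection", like the \TYPE lines in the test. */
static void type_counters(const int counters[NCOUNTERS],
                          char *buf, size_t size)
{
    snprintf(buf, size, "%d-%d-%d",
             counters[CHAPTER], counters[SECTION], counters[SUBSECTION]);
}
```

Starting from the values 2, 5, 4 set in the test, stepping CHAPTER leaves the subsection value 4 untouched, and a subsequent step of SECTION is what finally resets it.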

Frank Mittelbach, Will Robertson and The LATEX3 team TUGboat, Volume 35 (2014), No. 3 291

These compilations take place in the subdirectory 'build/test', and if a test fails, a diff file is deposited there with the information about what has changed in the output of the test file. Also deposited there are the full .log files for each ⟨engine⟩ (i.e., without modifications from the cleanup step) which can be helpful to debug complex issues.

4.2 An example driver file

For a simple setup such as shown in the overview in Section 3, the driver file (build.lua) is quite simple. An example of such a driver file is shown in Figure 2; it need do little more than inform the build system of the name of the package and perhaps set some flags or change some defaults if they are not adequate.

    #!/usr/bin/env texlua

    -- Build script for abc package

    module = "abc"

    -- variable overwrites (if needed)

    -- call standard script
    kpse.set_program_name ("kpsewhich")
    dofile (kpse.lookup ("l3build.lua"))

Figure 2: Driver file for a hypothetical abc package

The main script is l3build.lua, which is automatically found in the texmf tree (via kpsewhich) and then loaded. Thus, there is no need to hardwire locations in the driver file and it will work on different installations.

4.3 The structure of test files

As mentioned previously, the method of using the .log file allows various types of tests to be conducted. The most simple test might load a package and execute some commands to produce a small amount of typeset output. A complete example of such a test is shown in Figure 3.

    \documentclass{breqn-test}
    \input{regression-test}
    \usepackage{breqn}
    \begin{document}
    \START
    \AUTHOR{Will Robertson}
    \begin{dmath}
    a+b+c+d+e+f+g+h+i+j+k+l+m+
    n+o+p+q+r+s+t+u+v+w+x+y+z
    \end{dmath}
    \showoutput
    \end{document}

Figure 3: Example test from breqn

Some points to note:

1. The first line, \input{regression-test} loads the necessary settings and commands to format the .log file properly for testing.
2. It is not necessary to load a special document class (most tests use article or minimal), but a package author may wish to adjust page margins, etc., without repeating the commands for each test. Such a special test class or package could then be kept in the support/ directory.
3. The test begins proper at \START: everything before that point in the .log file will be ignored. This prevents, for example, package version numbers displayed while the preamble is processed from becoming part of the test. The \AUTHOR declaration is an optional way of indicating who might know how to fix the problem should the test begin failing.
4. In this example \showoutput generates the actual test data by generating a symbolic representation of the page content in the .log file.
5. A slightly modified version of \end{document} finishes the test document. Alternatively, one can end the test file with \END which avoids the final processing done by \end{document} and thus prevents unwanted material from becoming part of the test data. In this example \END cannot be used as that would stop the run immediately without producing a page, which is our goal here.

Not shown is the \OMIT ... \TIMO construction, which puts flags into the .log file between which no test comparisons will be made. This can be used around code that generates variable log information that is known to be irrelevant for the test. For example, statements like \newlength or \newcounter write some tracing information into the .log that shows the allocated register number. If the code gets revised these numbers might change and thereby unnecessarily invalidate the test result.

\OMIT can also be used before \end{document} if you need the final processing to happen, but want to ensure that nothing written at that time becomes part of the test.

An example of a more structured test from the LaTeX3 test suite is shown in Figure 4. Here, a number of different tests are contained within a single file, and a few of these are included in the example. The content of the test is not really important here (it is testing aspects of the integer module from expl3) but it does show a few best practices.
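The combined effect of \START and of an \OMIT ... \TIMO pair on what takes part in the comparison can be pictured as a simple line filter. The C fragment below is only a sketch of that idea and not how l3build itself is implemented (l3build is written in Lua); in particular, the marker lines START, OMIT and TIMO and the function name filter_log are assumptions made for this illustration, since the exact flags written to the .log are not shown in this article.

```c
#include <stdio.h>
#include <string.h>

/* Copy to `out` only the log lines that would be compared: everything
   after a line reading START, minus the lines between OMIT and TIMO.
   The marker strings are hypothetical stand-ins for the real flags. */
static void filter_log(const char *log, char *out, size_t size)
{
    int started = 0, omitting = 0;
    size_t used = 0;

    if (size == 0)
        return;
    out[0] = '\0';
    while (*log != '\0') {
        size_t len = strcspn(log, "\n");      /* length of current line */
        if (len == 5 && strncmp(log, "START", 5) == 0) {
            started = 1;
        } else if (len == 4 && strncmp(log, "OMIT", 4) == 0) {
            omitting = 1;
        } else if (len == 4 && strncmp(log, "TIMO", 4) == 0) {
            omitting = 0;
        } else if (started && !omitting && used + len + 2 <= size) {
            memcpy(out + used, log, len);     /* keep this line */
            used += len;
            out[used++] = '\n';
            out[used] = '\0';
        }
        log += len;
        if (*log == '\n')
            log++;                            /* skip the line break */
    }
}
```

With such a filter, a register number logged between the two markers can change from run to run without affecting the text that is compared against the saved result.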

l3build — A modern Lua test suite for TEX programming 292 TUGboat, Volume 35 (2014), No. 3

\OMIT/\TIMO is used to hide the register allocation numbers from \int_new:N. The first test then exercises integer addition and subtraction which is not expandable (therefore \TEST together with \TYPE is used) and it consists in fact of several small tests. The expected results are written as comments into the test file which is helpful in case it ever fails.

Converting between bases is supposed to be expandable so \TESTEXP is used for the second test. The same is true for the case selection commands. Here the test output is generated by \YES or \NO.

Can you guess the test results, even if you are not familiar with the expl3 language? They are shown in Figure 5.

    \documentclass{minimal}
    \input{regression-test}
    \RequirePackage{expl3}
    \begin{document}
    \START
    \AUTHOR{Frank Mittelbach, LaTeX3 Project}
    \ExplSyntaxOn

    \OMIT
    \int_new:N \l_testa_int
    \int_new:N \g_testa_int
    \TIMO

    \TEST { adding~and~subtracting }
    {
      \int_zero:N \l_testa_int
      \int_add:Nn \l_testa_int { 5 * 7 }
      \int_add:Nn \l_testa_int { 15 }
      % we hope for a value of 50
      \TYPE { \int_use:N \l_testa_int }
      \int_sub:Nn \l_testa_int { 3 * 5 }
      % we hope for a value of 35
      \TYPE { \int_use:N \l_testa_int }
      \int_gzero:N \g_testa_int
      {
        \int_gadd:Nn \g_testa_int
          { (2 + 13) / (2 * 3) }
        \int_gadd:Nn \g_testa_int { 3 }
        % we hope for a value of 6
        \TYPE { \int_use:N \g_testa_int }
        \int_gsub:Nn \g_testa_int { 5 * 5 }
      }
      % we hope for a value of -19
      \TYPE { \int_use:N \g_testa_int }
    }

    \TESTEXP { converting~from~and~to~base }
    {
      \int_to_base:nn { 17 } { 8 } ~
      \int_from_base:nn { 21 } { 8 }
    }

    \TESTEXP{ Case~statements }
    {
      \int_case:nnn
        { -1 + 1 }
        {
          { -1 } { \NO }
          { 3 - 3 } { \YES }
        }
        { \NO }
      \NEWLINE
      \int_case:nnn
        { 7 - 2 }
        { { -1 + 3 } { \NO } }
        { \YES }
    }
    % more tests here omitted
    \END

Figure 4: Expandable and non-expandable tests

    ========================================
    TEST 1: adding and subtracting
    ========================================
    50
    35
    6
    -19
    ========================================
    ========================================
    TEST 2: converting from and to base
    ========================================
    21 17
    ========================================
    ========================================
    TEST 3: Case statements
    ========================================
    YES
    YES
    ========================================

Figure 5: Test results

4.4 Options

While the examples shown previously demonstrate the behaviour in the standard setup, the new build system provides significantly greater flexibility. This is achieved by providing a large number of variables that can be (re)set as necessary in the driver file. For example, the new system supports building complex distributions consisting of several modules in different directories with dependencies between them. You can also control if the processing should happen in a sandbox or if it is allowed to draw any support files needed for the tests (e.g., extra packages or classes) from the TDS tree. The latter is the default as this is better for most distributions. For details consult the documentation in [4].

There is one option that one may have to modify even for simple setups: checkruns. This controls the number of times each test file is run; to speed up processing it defaults to 1. If, however, the code requires multiple runs to function (e.g., if you test material that is passed through the .aux file) you have to set this variable to 2 or higher to ensure that your tests actually work correctly.

5 Operating the system

As indicated earlier the system does a bit more than managing a set of test files, so here is a short description of the main tasks that can be executed once the setup is in place. Each task takes zero or more arguments as described below and is executed by running the driver file (default build.lua) through a Lua interpreter (texlua) and passing it the task name and any further argument as necessary, e.g.,

    texlua build.lua check ⟨test name⟩

would run the check on ⟨test name⟩ using all engines. So here is the list of available tasks:

check ⟨name⟩ ⟨engine⟩
    Without arguments, runs all test files found in the directory that contains the .lvt files. It reports progress by displaying each test file name currently processed (but otherwise hides any TeX output to avoid cluttering the screen) and at the end displays a summary indicating success or failure.
    If ⟨name⟩ is specified, it will run only the tests for that .lvt file, and if additionally given an ⟨engine⟩ name will run only the test for that specific engine. In either case it will show everything on the screen, which is helpful if the run shows abnormal behaviour (especially if it ends up in an endless loop and never returns for some reason).

clean
    Cleans up the source tree, removing temporary files and directories.

ctan
    Runs all tests, typesets all documentation and if there are no errors, generates a .zip file suitable for uploading to CTAN.

doc
    Typesets all documentation (by default .dtx files), thus checking them for trivial processing errors.

install
    This installs the distribution in the local texmf tree of the user.

save ⟨name⟩ ⟨engine⟩
    This generates (or regenerates) the .tlg file for ⟨name⟩. If additionally supplied an ⟨engine⟩ argument it generates a specific .tlg as discussed above.
    It is the responsibility of the developer to verify that the data placed into the .tlg produces the desired result, i.e., is actually correct. Once produced or updated with save, the output is considered certified and will be used to verify future check runs!

6 Acknowledgements

The original test suite system was a joint effort by the whole LaTeX project team at that time, i.e., Frank Mittelbach, Rainer Schöpf, David Carlisle, Michael Downes, Alan Jeffrey, and Chris Rowley. We also had significant help when writing the initial set of test files from a number of volunteers, in particular Daniel Flipo and Chris Martin.

Around 2008 Rainer replaced the Makefile approach used for LaTeX2ε by Cons (a Perl-based solution) as the Makefile got so complex over time that it was difficult to manage.

For the LaTeX3 development we stayed with Make as the requirements of the expl3 distribution were initially much simpler.

Joseph Wright wrote a first set of .bat files for expl3, as by then many developers worked on Windows. Modelled after this, Frank replaced the Cons solution for LaTeX2ε in 2013. Finally in 2014 Joseph then implemented most of the new Lua-based system and it is now successfully used to manage the LaTeX3 (expl3) distribution as well as several smaller package distributions. The LaTeX2ε distribution will follow shortly.

References

[1] Donald E. Knuth. A torture test for TeX. Report STAN-CS-84-1027, 1984.
[2] Frank Mittelbach. A regression test suite for LaTeX2ε. TUGboat, 18(4):309–311, December 1997. http://tug.org/TUGboat/tb18-4/tb57mitt.pdf
[3] Frank Mittelbach. A modern regression test suite for TeX programming, July 2014. Talk given at TUG conference in Portland. Video and slide material available at http://www.latex-project.org/papers.
[4] LaTeX3 Project. The l3build package: Checking and building packages, September 2014. http://ctan.org/pkg/l3build

⋄ Frank Mittelbach
  Mainz, Germany
⋄ Will Robertson
  School of Mechanical Engineering, The University of Adelaide, Australia
⋄ The LaTeX3 team
  http://www.latex-project.org
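For readers who tried to guess the results in Figure 5 before looking: the only step in Figure 4 that is not plain integer arithmetic is the division (2 + 13) / (2 * 3). The value 6 expected after adding 3 implies that 15/6 evaluates to 3, i.e., that the quotient is rounded to the nearest integer rather than truncated. The C fragment below merely re-computes the figures under that assumption (ties rounded away from zero); the helper names div_round and to_base are made up for this illustration and are not expl3 or l3build functions.

```c
#include <stdlib.h>

/* Integer division rounding to nearest, ties away from zero (b != 0). */
static long div_round(long a, long b)
{
    long q = a / b;                 /* C truncates toward zero */
    long r = a % b;
    if (2 * labs(r) >= labs(b))     /* remainder is at least half of b */
        q += ((a < 0) != (b < 0)) ? -1 : 1;
    return q;
}

/* Write n in the given base (2..36) into buf, as in TEST 2. */
static void to_base(unsigned long n, unsigned base, char *buf, size_t size)
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    char tmp[64];                   /* enough for 64 binary digits */
    size_t i = 0, j = 0;

    if (size == 0 || base < 2 || base > 36)
        return;
    do {                            /* least significant digit first */
        tmp[i++] = digits[n % base];
        n /= base;
    } while (n != 0 && i < sizeof tmp);
    while (i > 0 && j + 1 < size)   /* reverse into the caller's buffer */
        buf[j++] = tmp[--i];
    buf[j] = '\0';
}
```

With truncating division the global integer in the first test would reach 5 and then -20 instead of the 6 and -19 shown in Figure 5, which is exactly the kind of silent change in behaviour a saved .tlg file is meant to catch.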


MetaPost path resolution isolated

Taco Hoekwater

Abstract

A new interface in MPlib version 1.800 allows one to resolve path choices programmatically, without the need to go through the MetaPost input language.

Editor's note: Originally published in ConTeXt Group: Proceedings, 6th meeting, pp. 13–18. Reprinted with permission.

1 MetaPost path solving

As readers may agree, MetaPost is pretty good at finding pleasing control points for paths. What may be less commonly known is that besides drawing on a picture, MetaPost can also display the found control points in the log file.

An initial illustration at this point is useful. Here is the MetaPost path input source of a very simple path (as well as a visualisation of the path):

    tracingchoices := 1;
    path p;
    p := (0,0) ..(10,10) ..(10,-5) ..cycle;

[Illustration: the resulting path]

And here is what MetaPost outputs in the log file (with some editorial line breaks):

    Path at line 5, before choices:
    (0,0)
    ..(10,10)
    ..(10,-5)
    ..cycle

    Path at line 5, after choices:
    (0,0)
    ..controls (-1.8685,6.35925) and (4.02429,12.14362)
    ..(10,10)
    ..controls (16.85191,7.54208) and (16.9642,-2.22969)
    ..(10,-5)
    ..controls (5.87875,-6.6394) and (1.26079,-4.29094)
    ..cycle

A more complex path of course creates more output, as in:

    p := (0,0)..(2,20){curl1}..{curl1}(10, 5)
      ..controls (2,2) and (9,4.5)
      ..(3,10)..tension 3 and atleast 4..(1,14){2,0}
      ..{0,1}(5,-4);

[Illustration: the resulting path]

which produces:

    Path at line 7, before choices:
    (0,0){curl 1}
    ..{curl 1}(2,20){curl 1}
    ..{curl 1}(10,5)..controls (2,2) and (9,4.5)
    ..(3,10)..tension 3 and atleast4.5
    ..{4096,0}(1,14){4096,0}
    ..{0,4096}(5,-4)

    . . .

    Path at line 7, after choices:
    (0,0)
    ..controls (0.66667,6.66667) and (1.33333,13.33333)
    ..(2,20)
    ..controls (4.66667,15) and (7.33333,10)
    ..(10,5)
    ..controls (2,2) and (9,4.5)
    ..(3,10)
    ..controls (2.34547,10.59998) and (0.48712,14)
    ..(1,14)
    ..controls (13.40117,14) and (5,-35.58354)
    ..(5,-4)

2 . . . outside MetaPost?

But what if you want to use that functionality outside of MetaPost, for instance in a C program? Before MPlib 1.800, you would have to

• compile MPlib into your program;
• create a MetaPost language input string;
• execute it; and
• parse the log result.

All of that is not very appealing. It would be much better if you could

• compile MPlib into your program;
• create a path programmatically;
• run the MetaPost path solver directly, automatically updating the original path.

And that is what the current version of MPlib allows you to do.

3 How it works

Once again, it is easiest to show how it works by using a source code example:

    #include <stdio.h>    /* printf, used by the dump routine below */
    #include <stdlib.h>   /* free, exit */
    #include "mplib.h"

    void my_try_path (MP mp);   /* defined below */

    int main (int argc, char ** argv) {
      MP mp;
      MP_options * opt = mp_options ();
      opt -> command_line = NULL;
      opt -> noninteractive = 1;
      mp = mp_initialize (opt);
      my_try_path (mp); /* the crux */
      mp_finish (mp);
      free (opt);
      return 0;
    }

Most of the example code above is just what one would need to do anything with MPlib programmatically. The new line for our purpose here calls my_try_path(mp):

    void my_try_path(MP mp) {
      mp_knot first, p, q;
      first = p = mp_append_knot (mp, NULL, 0, 0);
      q = mp_append_knot (mp, p, 10, 10);
      p = mp_append_knot (mp, q, 10, -5);
      mp_close_path_cycle (mp, p, first);
      if (mp_solve_path (mp, first)) {
        mp_dump_solved_path (mp, first);
      }
      mp_free_path (mp, first);
    }

This function uses a new type, mp_knot, as well as several new library functions in MPlib available as of version 1.800.

• mp_append_knot() creates a new knot, appends it to the path that is being built, and returns it as the new tail of the path.
• mp_close_path_cycle() is like cycle in the MetaPost language.
• mp_solve_path() finds the control points of the path. mp_solve_path() does not alter the state of the given MPlib instance in any way, it only modifies its argument path.
• mp_dump_solved_path() is a user-defined function; see below for its definition.
• mp_free_path() releases the used memory.

Our user-defined mp_dump_solved_path routine uses even more new functions. First let us look at its definition:

    #define SHOW(a,b) mp_number_as_double \
                        (mp,mp_knot_##b(mp,a))

    void mp_dump_solved_path (MP mp, mp_knot h) {
      mp_knot p, q;
      p = h;
      do {
        q = mp_knot_next(mp, p);
        printf("(%g,%g)\n "
               "..controls (%g,%g) and (%g,%g)",
               SHOW(p,x_coord), SHOW(p,y_coord),
               SHOW(p,right_x), SHOW(p,right_y),
               SHOW(q,left_x), SHOW(q,left_y));
        p = q;
        if (p != h
            || mp_knot_left_type(mp, h) != mp_endpoint)
          printf ("\n ..");
      } while (p != h);
      if (mp_knot_left_type(mp, h) != mp_endpoint)
        printf ("cycle");
      printf ("\n");
    }

Somewhat hidden in the source above is the existence of another new type, mp_number, which represents a numerical value inside MPlib.

The MPlib library functions used in our routine mp_dump_solved_path are as follows:

• mp_knot_next() moves to the next knot in the path.
• mp_knot_x_coord(), mp_knot_y_coord(), mp_knot_right_x(), mp_knot_right_y(), mp_knot_left_x(), mp_knot_left_y() all return the value of a knot field, as an mp_number object (the calls to these functions are hidden inside the definition of the SHOW macro).
• mp_knot_left_type() returns the type of a knot, normally either mp_endpoint or mp_open.
• mp_number_as_double() converts an mp_number to double.

To satisfy our curiosity, here is the actual output of the example program listed above:

    (0,0)
    ..controls (-1.8685,6.35925) and (4.02429,12.1436)
    ..(10,10)
    ..controls (16.8519,7.54208) and (16.9642,-2.22969)
    ..(10,-5)
    ..controls (5.87875,-6.6394) and (1.26079,-4.29094)
    ..cycle

which is almost exactly the same as in the log file (except we've altered the line breaks for this article):

    (0,0)
    ..controls (-1.8685,6.35925) and (4.02429,12.14362)
    ..(10,10)
    ..controls (16.85191,7.54208) and (16.9642,-2.22969)
    ..(10,-5)
    ..controls (5.87875,-6.6394) and (1.26079,-4.29094)
    ..cycle

The numerical output is not exactly the same because MetaPost itself does not use mp_number_as_double and printf's %g for printing the scaled values that are (by default) used to represent numerical values.

This difference is not really relevant, since any programmatic use of the path solver should not have to be 100% compatible with the MetaPost programming language.

4 More complex paths

Of course there are also new functions to create the more complex paths that make use of curl, tension and/or direction specifiers.

Here is how to encode the second MetaPost path in the earlier example:

    first = p = mp_append_knot (mp, NULL, 0, 0);
    q = mp_append_knot (mp, p, 2, 20);
    p = mp_append_knot (mp, q, 10, 5);
    if (!mp_set_knotpair_curls (mp, q, p, 1.0, 1.0))
      exit (EXIT_FAILURE);
    q = mp_append_knot(mp, p, 3, 10);
    if (!mp_set_knotpair_controls (mp, p, q,
                                   2.0, 2.0, 9.0, 4.5))
      exit (EXIT_FAILURE);
    p = mp_append_knot (mp, q, 1, 14);
    if (!mp_set_knotpair_tensions (mp, q, p, 3.0, -4.0))
      exit (EXIT_FAILURE);
    q = mp_append_knot (mp, p, 5, -4);
    if (!mp_set_knotpair_directions (mp, p, q,
                                     2.0, 0.0, 0.0, 1.0))
      exit (EXIT_FAILURE);
    mp_close_path (mp, q, first);

Elaborate documentation for these extra functions (and a few more) is in the file mplibapi.tex, included in the MetaPost distribution.

5 Lua interface

There is also a Lua interface for use in LuaTeX, which is a bit higher-level:

    success = mp.solve_path(knots, cyclic)

This modifies the knots table, which should contain an array of points in a path, with the substructure explained below, by filling in the control points. The boolean cyclic is used to determine whether the path should be the equivalent of --cycle. If the return value is false, there is an extra return argument containing the error string.

On entry, the individual knot tables can contain the six knot field values mentioned above (but typically the left_x, left_y and right_x, right_y will be missing). x_coord and y_coord are both required. Also, some extra values are allowed, all numbers:

    left_tension    A tension specifier
    right_tension   Like left_tension
    left_curl       A curl specifier
    right_curl      Like left_curl
    direction_x     x displacement of a direction specifier
    direction_y     Like direction_x

6 Issues to watch out for

All the 'normal' requirements for MetaPost paths still apply using this new interface. In particular:

• A knot has either a direction specifier, or a curl specifier, or a tension specification, or explicit control points, with the additional note that tensions, curls and control points are split in a left and a right side (directions apply to both sides equally).
• The absolute value of a tension specifier should be more than 0.75 and less than 4096.0, with negative values indicating 'atleast'.
• The absolute value of a direction or curl should be less than 4096.0.
• If a tension, curl, or direction is specified, any existing control points will be replaced by the newly computed value.

⋄ Taco Hoekwater
  Docwolves B.V.
  http://metapost.org
  http://luatex.org
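The numeric limits just listed are easy to check before values are handed to the path solver. The helpers below are a minimal sketch of such a check; they are hypothetical and not part of the MPlib API, and they simply restate the ranges as worded above (so a tension of exactly 0.75 is rejected here).

```c
#include <stdbool.h>

/* Absolute value without pulling in the math library. */
static double magnitude(double v)
{
    return v < 0.0 ? -v : v;
}

/* Tension: absolute value more than 0.75 and less than 4096.0. */
static bool tension_in_range(double t)
{
    double a = magnitude(t);
    return a > 0.75 && a < 4096.0;
}

/* A negative tension stands for 'atleast', e.g. -4.0 for "atleast 4". */
static bool tension_is_atleast(double t)
{
    return t < 0.0;
}

/* Curl or direction component: absolute value less than 4096.0. */
static bool curl_or_direction_in_range(double v)
{
    return magnitude(v) < 4096.0;
}
```

For the path encoded in section 4, the tension pair 3.0 and -4.0 passes this check, and the second value is recognised as the 'atleast' form.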


Typeset MMIX programs with TeX

Udo Wermuth

Abstract

A TeX macro package is presented as a literate program. It can be included in programs written in the languages MMIX or MMIXAL without affecting the assembler. Such an instrumented file can be processed by TeX to get nicely formatted output. Only a new first line and a new last line must be entered. And for each end-of-line comment a flag is set to indicate that the comment is written in TeX.

How to read the following program

The text that starts in the next chapter is a literate program [2, 1] written in a style similar to noweb [7]. Readers who are not familiar with literate programming might find the following remarks useful.

The program is divided into sections. Each section has a number that is written in bold at the beginning of the section. A section contains two parts and at least one of them must be present: (1) a documentation part with one or more paragraphs, and (2) a code part starting with a headline that is followed either by ≡ or +≡ and the replacement code. The headline has the format "⟨ Name Number ⟩".

Example: Sections 9 and 10 of the program have the respective headlines "⟨ List symbols that are special in TeX 9 ⟩ ≡" and "⟨ List symbols that are special in TeX 9 ⟩ +≡". Section 9 has five lines of replacement code and section 10 three.

The Name is the name under which the replacement code can be called by other sections. The Number is either the number of the section (the case with ≡) or a smaller number when a previously defined replacement code is extended with the new code lines (case +≡). In the second case the code part of the previous section that owns the smaller section number is followed by a line "See also sections ... " and the current section number is somewhere listed in the "... ". Also, in the first case the code part is often followed by a line "This code is used in sections ... ". Then the headline of this section is used inside the replacement code of other sections (listed in the "... "), in which the final output of the complete replacement code with all extensions must be inserted. The number in the headline states the first section that contains code for it. So a reader sees in a call of a section where it starts and under this section he finds the other sections that add replacement code.

Example: In section 9 the lines "See also section 10." and "This code is used in section 24." are given. No such line appears in section 10 as it only extends the replacement code of section 9. (Note that section 10 has in its headline the number 9.) In section 24 the reference to section 9 stands for all of the eight code lines stated in sections 9 and 10.

If a section is not used in any other section then it is a root and during the extraction of the code a file is created that has the name of the root. This file collects all the code in the sequence of the referenced sections from the code part. The collection process for all root sections is called tangle. A second process is called weave. It outputs the documentation and the code parts as a TeX document.

Example: The following program has only one root that is defined in section 4 with the headline "⟨ mmix.tex 4 ⟩ ≡". The file that is created by the tangle process is therefore called "mmix.tex".

The tangled output in the original WEB system for literate programming is intended to be read only by computers (see [2], p. 116). In the present system output is created that is readable by humans. But changes to the program should only be made in the original source of the literate program.

The following text is the output that TeX has produced from the woven document. (A few edits have been made to follow the style of this journal.)

Contents

    Introduction                  § 1
    Format of the output          § 6
    Preparation                   § 8
    Line numbers                  § 15
    Times column                  § 22
    Setting the output format     § 25
    Input format                  § 29
    Activation                    § 33
    Last line                     § 40
    Shortcuts                     § 44
    More shortcuts                § 50
    Final remarks                 § 52
    Index and List of sections    § 55

Introduction

1. Algorithms in The Art of Computer Programming (TAOCP) [3] by Donald E. Knuth are stated in plain English. But every time an implementation is needed a machine language or assembler language is used. In the first three volumes the language is MIX; in Volume 4a the algorithms are implemented in the language of a new computer called MMIX [4]. For the next editions of the TAOCP volumes all the MIX programs of the first three volumes must be rewritten either in MMIX or in the new assembler language MMIXAL. On his web page http://www-cs-staff.stanford.edu/~uno/mmix.html, Knuth asks volunteers to start the conversion of the MIX programs before he has finished Volume 5 and new editions of Vols. 1–3 are created.

2. I decided to be one of the volunteers to do the conversion (although I'm not an MMIXmaster; see [6]). I asked myself the question: How to present the result? The MMIX programs are stored in mms files, which allow a very flexible input format. For example, a line that starts with a backslash will be treated as a comment and is ignored during the assembly. So my idea is to write the mms files in a way that they can be processed not only by the assembler but also by TeX. The output of TeX shall reflect the style that is used in the TAOCP volumes to present the MIX and MMIX programs. And TeX can be used to implement a second idea: Not only shall the conversion be done but an analysis of the new implementation shall be added.

A macro package for TeX is developed that is included in the mms files. Then TeX is able to process and pretty-print such mms files.

3. Other volunteers for the conversion from MIX to MMIX have obviously felt the same need for TeX output. The solution on the MMIX home page [6] is a lex script mmixtotex.l to create a program that reads the mms file and outputs a TeX file that can be typeset with a macro package mmstotex.sty.

4. Here is the plan for the macro package.

    ⟨ mmix.tex 4 ⟩ ≡
      ⟨ Initialization 5 ⟩
      ⟨ Definitions 24 ⟩
      ⟨ Useful commands and shortcuts 44 ⟩
      ⟨ Add an analysis of the algorithm 40 ⟩
      ⟨ Format the mms file 19 ⟩
      ⟨ Take off 33 ⟩
    This is a root.

5. The file mmix.tex might be shipped with an mms file without this description. Therefore I will be adding plenty of comments to the TeX code to help others to read and understand the macro package.

    ⟨ Initialization 5 ⟩ ≡
      % Package to format MMIX programs with TeX
      % (and some useful commands to document them)
      % Author: Udo Wermuth
      % %%%
      ⟨ Description 14 ⟩
      % %%%
    See also sections 11, 12, and 13.
    This code is used in section 4.

Format of the output

6. Most of the time programs are not written in MMIX but in the MMIX Assembly Language, called MMIXAL. MMIXAL allows labels, alphabetic names, etc. and introduces new operations that are called pseudo-ops. As we are only interested in formatting the source lines of a program the details of the extensions provided by MMIXAL are not discussed here. The reference [4] defines not only MMIX but gives all the information about MMIXAL too. A source line of a MMIXAL program has up to five elements. Three elements are of principal interest for the program behavior: (1) an optional label, (2) the operation (or short: the op-code), and (3) an expression field. The other two elements are not needed for the execution of the program but for the analysis and the comprehension: (4) optional timing information, i.e., the number of times the statement is executed in a run, and (5) an optional comment.

For the presentation of the program one more element is printed: an optional line number. The line number is printed in italics, the elements 1–3 are output verbatim in a monospaced font, element 4 is written in math mode, and element 5 is formatted by TeX as normal text. Therefore the output shall look like this:

    line  label    op-code  expression  time  comment
    07    Maximum  SL       kk,$0,3     1     M1. Initialize.

7. Following this example, let us state the complete requirements for the output format:

R1 The line number is either empty or has two or three digits; leading zeros are printed. It is written left-aligned in 9 pt italics.
R2 The label is optional. If it is present it is written verbatim in a 10 pt monospaced font.
R3 The op-code is written verbatim in the monospaced font.
R4 The expression field may contain one or more items but it does not contain a blank (except in a string). Like the label and the op-code it is printed verbatim in the monospaced font.
R5 The timing information is optional. If present, it is printed in 9 pt as a math expression centered in its column.
R6 The optional comment is written in a 9 pt roman font. It is written in TeX.
R7 Lines that contain only a comment written in the monospaced font are allowed. Lines with more than one source statement are allowed.
R8 The program source ends with a thick vertical bar in the comment area.
R9 It is possible to add a runtime analysis. The text uses a 10 pt roman font.


R10 The output shall show the name and the source 12. The printed document shall not only show the of the MIX program, the name of the author, program but also provide the possibility of including who programmed the MMIX source, and the an analysis of the algorithm (R9). The analysis is date of the conversion. placed in a second file (a plain TEX file) and it is included with an \input statement. The name of Preparation the file is created from the name of the program extended by the suffix aoa and, of course, with file 8. The monospaced font is a 10 pt font (see R2), the extension .tex. other fonts have size 9 pt (R1, R5, and R6). The 9 (and the 6 pt fonts for subscripts) are not h Initialization 5 i +≡ % file name for the ‘‘Analysis of Algorithm’’ activated in plain TEX. Let’s give them names. \def\AoAfile{\pgmnameH_aoa.tex} h Fonts 8 i ≡ % name 9pt and 6pt fonts 13. The name of the program is by default the name \font\ninerm=cmr9 \font\sixrm=cmr6 of the MMIXAL file — but the user has the ability \font\ninesy=cmsy9 \font\sixsy=cmsy6 to override that name. The name of the author, \font\ninei=cmmi9 \font\sixi=cmmi6 \font\nineit=cmti9 \font\ninesl=cmsl9 the source location and the date are initialized with \font\ninett=cmtt9 some text, but it is expected that the user speci- \font\ninebf=cmbx9 \font\sixbf=cmbx6 fies them before mmix.tex is loaded. The external This code is used in section 24. control sequences are copied and made \undefined. h Initialization 5 i +≡ 9. The example in section 6 shows that MMIXAL \def\checkextdata{% uses characters that have a special meaning in TEX, \ifundef pgmname \def\pgmnameH{\jobname}% for example, the dollar sign and hash mark are im- \else\let\pgmnameH=\pgmname portant symbols in MMIXAL. Therefore the output \let\pgmname=\undefined must be filtered and the functions assigned to the \fi \ifundef author \def\authorH{Unknown}% special characters in TEX have to be deactivated. 
\else\let\authorH=\author Plain TEX provides a \dospecials command, \let\author=\undefined but let us separate the MMIXAL and TEX special \fi characters. \ifundef source \def\sourceH{TAOCP}% \else\let\sourceH=\source h List symbols that are special in TEX 9 i ≡ % special in TeX but common in MMIX \let\source=\undefined \def\mmixdospecials{\do\ \do\$\do\&\do\#% \fi \do\^\do\_\do\%\do\~} \ifundef date \def\dateF{\number\year}% % remaining special characters in TeX \else\let\dateF=\date \def\texdospecials{\do\\\do\{\do\}} \let\date=\undefined \fi} See also section 10. This code is used in section 24. 14. The values for date, author and source must be 10. To switch off the special meaning of the above declared outside of the package. Let us document this at the beginning of the package. listed characters a command from The TEXbook [5], p. 380, is used. h Description 14 i ≡ % before the macro package is loaded the h List symbols that are special in T X 9 i +≡ E % following must be \def’ed \def\uncatcodespecials{% redef special chars % required: \date, \author, \source \def\do##1{\catcode‘##1=12 }% % the program name is taken from the mms-file \mmixdospecials\texdospecials} % but it can be overwritten % optional: \pgmname 11. In the header the author, the name of the pro- gram and the original source are listed. The footer See also sections 26, 41, and 53. This code is used in section 5. states the date and provides a page number. This fulfills requirement R10. Line numbers h Initialization 5 i +≡ % Header and Footer 15. The output format states all the information \headline={\sevenrm Author: \authorH\hfill about line numbers, as they are not part of the input Program: \pgmnameH.mms (\sourceH)}% file. Of course a counter for the numbers is needed. \footline={\sevenrm Date: \dateF\hfill \sevenbf\folio}%

Typeset MMIX programs with TEX 300 TUGboat, Volume 35 (2014), No. 3

⟨Counters 15⟩ ≡
    % count registers
    \newcount\lnocnt % counter for line numbers
See also section 25.
This code is used in section 24.

16. Next the width of the column for the line numbers has to be defined. Two cases are stated in the requirements: 2 and 3 digits (see R1).

⟨Dimensions 16⟩ ≡
    % dimen registers
    \newdimen\lnotwodigitswidth % 2 digits col
    \newdimen\lnothreedigitswidth % or 3 digits
See also section 22.
This code is used in section 24.

17. Here are the default widths of the columns.

⟨Set values of dimen-registers 17⟩ ≡
    {\setbox0=\hbox{\nineit 00\tentt\quad}%
     \global\lnotwodigitswidth=\wd0
     \global\lnothreedigitswidth=1.25\wd0 }%
See also section 23.
This code is used in section 35.

18. Lines can only be numbered if the space is reserved for the column of numbers: \colforlnotrue must be set. Then a number is printed if the flag \ifnumberlines is true. A third flag is needed to set the number of digits for line numbers.

⟨Flags 18⟩ ≡
    % if flags
    \newif\ifcolforlno % true: add col for lno
    \newif\ifnumberlines % true: number the lines
    \newif\ifthreedigitlno % true: use 001..999
See also section 43.
This code is used in section 24.

19. The output routine for line numbers prints leading zeros (R1).

⟨Format the mms file 19⟩ ≡
    \def\printlinenumber{% with leading 0s
      \ifthreedigitlno % how many digits?
        \hbox to \lnothreedigitswidth{\it
          \ifnum\lnocnt<100 0\fi
          \ifnum\lnocnt<10 0\fi \number\lnocnt
          \hss}%
      \else\hbox to \lnotwodigitswidth{\it
          \ifnum\lnocnt<10 0\fi \number\lnocnt
          \hss}%
      \fi}
See also sections 20, 21, 27, 28, 29, 30, 31, 32, 34, 35, and 38.
This code is used in section 4.

20. But before the output routine can be called it must be checked that line numbers shall be printed at all. Therefore the following macro is called to output the line number.

⟨Format the mms file 19⟩ +≡
    \def\numbermmixline{% shall no. be printed?
      \ifcolforlno
        \ifnumberlines
          \global\advance\lnocnt by 1
          \printlinenumber % yes
        \else\phantom{\printlinenumber}% no
      \fi\fi}

21. To have the style of line numbers available if a comment or the text of the analysis of the algorithm needs to reference a line number, one more control sequence is provided. The command is used in text printed in roman type. It gets either a single line number or a range of line numbers and prints this in italics. So a simple solution is implemented which doesn't force the user to type in the italic correction.

⟨Format the mms file 19⟩ +≡
    \def\pgmline#1{% print #1 as line number
      \gdef\argpgmline{#1}% store #1 and look ahead
      \futurelet\next\pgmlinex}
    \def\pgmlinex{% check if \next is . or ,
      \if.\next {\it\argpgmline}% no \/
      \else\if,\next {\it\argpgmline}% no \/
      \else {\it\argpgmline\/}% add \/
      \fi\fi}

Times column

22. The optional column for the timing information (R5) gets its own dimen register.

⟨Dimensions 16⟩ +≡
    \newdimen\timecolumnwidth % column for time

23. To allow entries like "A − 1" the column must be wider than three symbols.

⟨Set values of dimen-registers 17⟩ +≡
    {\setbox0=\hbox{$2M+M$}% 10pt gives white space
     \global\timecolumnwidth=\wd0 }%

24. Of course, as the column is optional one more flag needs to be declared.
It is time to collect all definitions in a sorted list. Such a list might be easier to understand if the mmix.tex file comes without this documentation, so some sub-entries are created.

⟨Definitions 24⟩ ≡
    % %%% Definitions
    ⟨Counters 15⟩
    ⟨Dimensions 16⟩
    ⟨Flags 18⟩
    \newif\iftimeinfostated % true: add time col
    ⟨Fonts 8⟩
    ⟨List symbols that are special in TeX 9⟩
See also section 39.
This code is used in section 4.
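The zero-padding test of section 19 can be tried in isolation. The following is only a minimal sketch in plain TeX, assuming nothing beyond plain.tex: the helper \padded is a hypothetical name that is not part of mmix.tex, and the number is written to the terminal instead of being typeset in an \hbox.

```tex
% Minimal sketch (plain TeX): the zero-padding logic of section 19,
% written to the terminal instead of being typeset.
\newcount\lnocnt
% \padded is a hypothetical helper, not part of mmix.tex
\def\padded{\ifnum\lnocnt<100 0\fi
  \ifnum\lnocnt<10 0\fi
  \number\lnocnt}
\lnocnt=7   \immediate\write16{\padded}% writes 007
\lnocnt=42  \immediate\write16{\padded}% writes 042
\lnocnt=123 \immediate\write16{\padded}% writes 123
\bye
```

The space after 100 and after 10 ends the number that \ifnum scans, so the following 0 is the text of the true branch and not another digit of the comparison value.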

Udo Wermuth TUGboat, Volume 35 (2014), No. 3 301

Setting the output format

25. To identify the size of the field for the line numbers a hint must be given by the author of the program. This hint is a counter called \mmixtype. Three cases must be considered according to R1: The values 0 and 1 mean no line numbers are used, 2 and 3 stand for two-digit line numbers, and 4 and 5 for three-digit numbers.
And requirement R5 is also covered: If the value is odd the timing information is present: 0, 2, and 4 format the program without timing information, but 1, 3, and 5 have such information.
And a third bit of information is included: if the number is positive the line numbering starts immediately. Otherwise a command must be given to start the numbering.

⟨Counters 15⟩ +≡
    \newcount\mmixtype % a value between -5 and 5
    % -1,0,1: no line numbers, no space reserved
    % absolute value 2,3: add column for 2 digits
    % absolute value 4,5: add column for 3 digits
    % > 1: start line numbering directly
    % <-1: a user command starts numbering
    % write a line (`!' is the commchar) with
    % even value: label op expr ! comment
    % odd value: label op expr ! time ! comment

26. I decided to set this counter by an "assignment" to the macro package. Therefore, a typical first line looks like the following lines in the comment.

⟨Description 14⟩ +≡
    % start a program with all \def's in one line
    % (n is the \mmixtype explained elsewhere):
    % \def\date{}\def\author{}
    % \def\source{}\input mmix =n

27. The value of \mmixtype determines the values of all flags. They are set even when \mmixtype has a value outside the defined range from −5 to +5. Such an error situation is tested and reported later.

⟨Format the mms file 19⟩ +≡
    % %%% Format
    \def\setflagsformmixtype{% analyse \mmixtype
      \ifnum\mmixtype<0
        \mmixtype=-\mmixtype % wait with numbering
      \else
        \numberlinestrue % prep. to number 1st line
      \fi
      \ifnum\mmixtype>1 % activate numbering
        \colforlnotrue
        \ifnum\mmixtype>3 % use 3 digits
          \threedigitlnotrue
      \fi\fi
      \ifodd\mmixtype % timing info is present
        \timeinfostatedtrue
      \fi}

28. In the case that \mmixtype < −1 the numbering of lines is activated by user commands. They are placed in the comment of a source line.

⟨Format the mms file 19⟩ +≡
    % start and stop line numbering
    \let\startnumbering=\numberlinestrue
    \let\stopnumbering=\numberlinesfalse

Input format

29. How shall the MMIXAL program line of section 6 be entered into the input file? Requirements R2–R4 state that a monospaced font is used for the above defined elements 1–3. As special symbols of TeX might be present the best way to typeset them is to use verbatim mode. My idea is to use a special character that ends the verbatim mode, which is automatically started in every line, and then to format the rest of the line in TeX. Such a flag makes it possible to fulfill the requirements R6 and R7. I call this special character the commchar and by default the exclamation mark is used for it.
A single commchar is required if no timing information is present and two are used to identify the timing information. The above stated program line is therefore coded like this:

    Maximum SL kk,$0,3 !1! \step M1. Initialize.

The label, the op-code, and the expression must always start at the same column to be properly aligned in the output. The macros shouldn't destroy other input styles, for example, several MMIXAL statements might be written in one line (see R7).
Note: The control sequence \step is one of the useful macros defined later in this package.

⟨Format the mms file 19⟩ +≡
    \def\setcommchar#1{% boundary for verbatim
      \vskip-\baselineskip% for first end of line
      \gdef\commchar{#1}%
      \def\par{\endgraf\verbatim#1}}

30. The verbatim mode is defined in a standard way (see The TeXbook [5], pp. 380–382).

⟨Format the mms file 19⟩ +≡
    % Verbatim macros
    \def\verbatim{\begingroup % ends in \doverbatim
      \setupverbatim
      \doverbatim}
    \def\setupverbatim{%
      \def\par{\leavevmode\endgraf\noindent}%
      \catcode`\`=\active
      \obeylines
      \uncatcodespecials
      \obeyspaces}
    % now make a blank a control space
    {\obeyspaces\global\let =\ }%
    % and avoid ligatures of ? and ! with `
    {\catcode`\`=\active \gdef`{\relax\lq}}%
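The way a single \mmixtype value sets the four flags (sections 25 and 27) can be explored with a small stand-alone file. This is a minimal sketch in plain TeX under stated assumptions: \setflagsformmixtype is copied from section 27, the flags are declared as in sections 18 and 24, and \report is a hypothetical helper that is not part of mmix.tex.

```tex
% Minimal sketch (plain TeX): decode some \mmixtype values.
\newcount\mmixtype
\newif\ifcolforlno      \newif\ifnumberlines
\newif\ifthreedigitlno  \newif\iftimeinfostated
% copied from section 27
\def\setflagsformmixtype{%
  \ifnum\mmixtype<0 \mmixtype=-\mmixtype
  \else \numberlinestrue
  \fi
  \ifnum\mmixtype>1 \colforlnotrue
    \ifnum\mmixtype>3 \threedigitlnotrue
  \fi\fi
  \ifodd\mmixtype \timeinfostatedtrue
  \fi}
% \report is a hypothetical helper, not part of mmix.tex
\def\report#1{\begingroup % keep the flag changes local
  \mmixtype=#1\relax \setflagsformmixtype
  \immediate\write16{mmixtype #1:
    colforlno=\ifcolforlno true\else false\fi,
    threedigitlno=\ifthreedigitlno true\else false\fi,
    numberlines=\ifnumberlines true\else false\fi,
    timeinfostated=\iftimeinfostated true\else false\fi}%
  \endgroup}
\report{-3}\report{0}\report{2}\report{5}
\bye
```

For the value −3, which the appendix uses, the sketch should report that the number column and the time column are present, that two digits are used, and that numbering waits for \startnumbering.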


31. When the verbatim mode is executed the tests for the line numbering and the timing information are made. The change of \everypar and \par is reverted in the comment field to allow, for example, a command like \smallskip.
Note that a missing second commchar with an odd \mmixtype results in a couple of errors (runaway argument) in \printtimeinfo, but forgetting the second commchar seems unlikely.

⟨Format the mms file 19⟩ +≡
    \long\def\doverbatim#1{% #1 is the commchar
      \everypar{\numbermmixline}%
      \def\nextmmixline##1#1{\noindent
        \tentt##1%
        \endgroup % opened in \verbatim
        \printtimeinfo}%
      \iftimeinfostated
        \gdef\printtimeinfo##1#1{% #1

• an exclamation mark is given as an argument for \setcommchar;
• \obeylines is called;
• and \par starts the verbatim mode.

d) Finally, \global\mmixtype is the left side of the assignment statement (see a)).

And here is the place where we call the test for the value of \mmixtype. If an error would be reported in the macro \setflagsformmixtype the user would see a bunch of tokens that have to be read again. Therefore the test for an error is made just before \par at the end of part c) is executed and the verbatim mode starts.

⟨Take off 33⟩ ≡
    % %%% Start
    ⟨Last-minute procedures 36⟩

⟨Format the mms file 19⟩ +≡
    % set the variables to their default values
    \def\setvariables{\mmixtype=0 \lnocnt=0
      ⟨Set values of dimen-registers 17⟩
      \colforlnofalse \numberlinesfalse
      \threedigitlnofalse \timeinfostatedfalse}

36. One problem remains: If a TeX comment is placed at the end of the comment field the end-of-line information is not available and the verbatim mode isn't restarted. So the % is made active. It gobbles the comment and behaves like the current definition of \par.

⟨Last-minute procedures 36⟩ ≡
    {\obeylines \catcode`\%=\active
     \gdef\handleTeXcomments{\catcode`\%=\active
      {\obeylines \gdef%##1
       {\endgraf\expandafter\verbatim\commchar}}}}%
    \def\resetTeXcomment{\catcode`\%=14 }
This code is used in section 33.

37. Now we collect the pieces together to prepare the environment.

⟨Prepare the environment 37⟩ ≡
    \checkextdata \setupfonts
    \setvariables \handleTeXcomments
This code is used in section 33.

38. One task is open: To give a warning message if \mmixtype is out of range. I prefer to issue an error message. The following macro is called after \mmixtype was made positive.

⟨Format the mms file 19⟩ +≡
    \def\testvalueofmmixtype{% value must be < 6
      \ifnum\mmixtype>5 \errhelp\mmixtypeerror
        \errmessage{The number \string\mmixtype
          \space must be between -5 and 5}%
      \fi}

39. We append the help message to the definitions.

⟨Definitions 24⟩ +≡
    % help messages
    \newlinechar=`\^^J
    \newhelp\mmixtypeerror{%
    mmixtype is the number
    stated after \string\input\space mmix.^^J%
    It must be between -5 and 5.
    Three aspects are coded into it:^^J%
    if it is odd
    time information is given (use two !);^^J%
    if it is -1, 0, 1
    no line numbers are present;^^J%
    if it is -3, -2, 2, 3
    line numbers have two digits;^^J%
    if it is -5, -4, 4, 5
    line numbers have three digits;^^J%
    if it is >1
    immediate numbering of lines is started.}%

Last line

40. At the end of the program source the possibility of including a separate file with the analysis of the algorithm shall be given (see R9). The name of the file was already defined above. The text shall be printed in a roman font of size 10 pt. Therefore we must switch back to 10 pt before that section can start.

⟨Add an analysis of the algorithm 40⟩ ≡
    % %%% Macros for the last line(s)
    % typeset the Analysis of the Algorithm
    \def\Analysis{\medbreak
      \def\rm{\fam0\tenrm}% back to 10pt
      \textfont0=\tenrm \scriptfont0=\sevenrm
      \textfont1=\teni \scriptfont1=\seveni
      \textfont2=\tensy \scriptfont2=\sevensy
      % fam3 was not changed
      \def\it{\fam\itfam\tenit}%
      \textfont\itfam=\tenit
      \def\sl{\fam\slfam\tensl}%
      \textfont\slfam=\tensl
      \def\tt{\fam\ttfam\tentt}%
      \textfont\ttfam=\tentt
      \def\bf{\fam\bffam\tenbf}%
      \textfont\bffam=\tenbf
      \scriptfont\bffam=\sevenbf
      \def\oldstyle{\mit\teni}%
      \rm % activate \tenrm
      \noindent{\tenbf Analysis}\par% the headline
      \nobreak\smallskip\noindent}
See also section 42.
This code is used in section 4.

41. The section with the analysis is started with the last line of the file. Similar to the first line it follows a special convention.
All programs have to use the control sequence \eop. It typesets a thick vertical rule as it is stated in requirement R8. It also stops the line numbering.

⟨Description 14⟩ +≡
    % use \eop in the comment of the last
    % source line
    % end the input file with a line that
    % contains either (! is the commchar)
    % ``!\endprogram\bye'' (even \mmixtype)
    % or ``!!\endprogram\bye'' (odd \mmixtype)
    % or use \endwAoA instead of \endprogram
    % to input a file with an analysis

42. At the end of the input file a group is still open that must be closed. And % gets back its default meaning. But first, up to two empty hboxes are deleted (that might have been created on the current horizontal line) to avoid a line break in front of this "empty line".

⟨Add an analysis of the algorithm 40⟩ +≡
    \def\eop{% end of program
      \qquad\vrule height 7pt depth 1pt width 3pt
      \eopusedtrue\stopnumbering}


    \def\clearline{% remove 0--2 empty hboxes
      {\setbox0=\lastbox \setbox0=\lastbox}}
    \def\endprogram{\clearline
      \ifeopused\eopusedfalse
      \else\message{^^JWarning: end the program
        with \string\eop^^J}%
      \fi
      \endgroup% opened in \getmmixtype
      \resetTeXcomment}
    \def\endwAoA{\endprogram\bigskip
      \Analysis \input\AoAfile}

43. ⟨Flags 18⟩ +≡
    \newif\ifeopused % true: ``\eop'' was used

Shortcuts

44. When the analysis is written some commands for often-used idioms reduce the amount of typing.

⟨Useful commands and shortcuts 44⟩ ≡
    % %%% Useful commands
    \def\MIX{{\ninett MIX}}
    \def\MMIX{{\ninett MMIX}}
    \def\MMIXAL{{\ninett MMIXAL}}
    \let\NULL\Lambda % the null link
    \def\AVAIL{\hbox{\ninett AVAIL}}% free space
    \let\Gets\Leftarrow % get space from AVAIL
    \let\implies\Rightarrow % more ``logical''
    % units for the analysis: oops and mems
    \def\oops{\hbox{$\upsilon$}}\let\oop=\oops
    \def\mems{\hbox{$\mu$}} \let\mem=\mems
    % reference to equation numbers of TAOCP
    \def\numeq(#1){\hbox{$({\oldstyle#1})$}}
    \def\eq(#1){% outputs Eq. (...)
      \hbox{Eq.\thinspace\numeq(#1)}}
    \def\Eq(#1){% outputs Equation (...)
      \hbox{Equation \numeq(#1)}}
See also sections 45, 46, 48, 49, 50, 51, and 52.
This code is used in section 4.

45. Some commands and shortcuts are needed in the comments to a program. For example, the steps of an algorithm are labeled with the identifying letter of the algorithm, a number, and a phrase. This information is often stated in the comment to a program. (The phrase might be omitted; for example, see [3], Vol. 1, p. 236.)

⟨Useful commands and shortcuts 44⟩ +≡
    % steps in algorithms
    \def\algidphrase#1#2#3.#4.{% #1 #arguments;
      % #2 phrase delimiter; #3 step id; #4 phrase
      $\underline{\hbox{\sl \vphantom{y}#3.%
        \ifnum #1>1 \enspace #4#2\fi}}%
      $\space}
    \def\step#1. #2.{\algidphrase2.#1.#2.}
    \def\steq#1. #2?{\algidphrase2?#1.#2.}
    \def\stepid#1.{\algidphrase1-#1.-.}

46. Sometimes several lines get a single comment: In [3], Vol. 1, p. 258 and 278 (and in [4], p. 107) a right brace is used to collect the statements for a comment. Place the command \mlsc (multiple lines, single comment) in the middle or just above the middle of the lines that get one comment.

⟨Useful commands and shortcuts 44⟩ +≡
    % place the command in (odd number) or
    % just above (even number) the middle
    \def\mlsc#1:#2{% #1 #lines; #2 comment
      \smash{\ifodd#1\else\lower.45\baselineskip\fi
        \hbox{$% next line: see \TeX book, p.194
          \openup-1\jot % cancel for \eqalign
          ⟨Compute \dimen255 from #1 (i.e., #lines) 47⟩
          \left.\kern-.5em % empty left brace
          \eqalign{\vrule height\dimen255
            width 0pt depth 0pt }%
          \right\}% visible right brace
        $\thinspace#2}}}

47. The height of the brace is of course roughly the number of lines, which are combined by the brace, multiplied by the \baselineskip. I use the formula: number of lines × (\baselineskip + 3 pt) − 8 pt.

⟨Compute \dimen255 from #1 (i.e., #lines) 47⟩ ≡
    % compute height of brace
    \dimen255=\baselineskip
    \advance\dimen255 by 3pt
    \multiply\dimen255 by #1\relax
    \advance\dimen255 by -8pt
This code is used in section 46.

48. Next some special constructions: The dot minus (monus operation or saturating subtraction) is a binary operation defined by a ∸ b = max(0, a − b). It isn't coded like \doteq as it is not a relation and some care must be taken with the position of the dot. The notation of the conditional expression is changed in TAOCP, Vol. 4a.

⟨Useful commands and shortcuts 44⟩ +≡
    % special operations
    \def\dm{% dot minus: saturating subtraction
      \mathbin{\mathop{\kern0pt \smash{-}}%
        \limits^{\raise.55ex\hbox{$\textstyle.$}}}}
    \def\ite(#1?#2:#3){% if-then-else; Vol.4a, p.96
      (#1\,{\rm?\ }#2{\rm:}\enspace#3)}

49. The following shortcuts make special symbols of TeX available for comments.
The plain TeX command for \l is redefined here. The original definition is stored in \lstroke.

⟨Useful commands and shortcuts 44⟩ +≡
    % symbols of \TeX
    \def\vs{{\tt\char32 }}% visible space
    \def\bs{{\tt\char92 }}% backslash
    \def\bo{{\tt\char123 }}% open brace
    \def\bc{{\tt\char125 }}% close brace
    \let\lstroke\l
    \def\l\_{{\tt\char95 }}% long underline
    \def\h\#{\hbox{${}^\#$}}% high # (hex no.)
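The brace-height formula of section 47 can be checked numerically. This is a minimal sketch in plain TeX, assuming the plain TeX default \baselineskip of 12pt; the counter \nlines is a hypothetical name that stands in for the argument #1 of \mlsc and is not part of mmix.tex.

```tex
% Minimal sketch (plain TeX): evaluate the brace-height formula of
% section 47 for a comment that spans two lines.
\newcount\nlines  % hypothetical name, not part of mmix.tex
\nlines=2
\dimen255=\baselineskip   % 12pt with the plain TeX defaults
\advance\dimen255 by 3pt
\multiply\dimen255 by \nlines
\advance\dimen255 by -8pt
\immediate\write16{brace height: \the\dimen255}% 22.0pt for these values
\bye
```

With two lines the formula gives 2 × (12pt + 3pt) − 8pt = 22pt. A different \baselineskip, as it may be in effect inside the package, changes the result accordingly.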


More shortcuts

50. Here are some shortcuts that I find useful. In a comment short words must be processed in math mode either as roman text or monospaced text. So I define a couple of commands for that.
Often text must be placed in an hbox. And a shortcut for an array with a roman or monospaced name and a math mode index is quite useful.

⟨Useful commands and shortcuts 44⟩ +≡
    % %%% my shortcuts
    % output rm or tt in math with 1 to 3 chars
    \def\r#1{{\rm #1}}
    \def\rr#1#2{{\rm #1#2}}
    \def\rrr#1#2#3{{\rm #1#2#3}}
    \def\m#1{{\tt #1}}
    \def\mm#1#2{{\tt #1#2}}
    \def\mmm#1#2#3{{\tt #1#2#3}}
    % output of rm or tt text in boxes or arrays
    \def\rb#1{\hbox{\rm #1}}% rm box
    \def\mb#1{\hbox{\tt #1}}
    \def\ra#1[#2]{\hbox{\rm #1[$#2$]}}% rm array
    \def\ma#1[#2]{\hbox{\tt #1[$#2$]}}

51. I add a comment in front of a subroutine or procedure. A few lines describe the calling sequence, the entry and exit conditions, and changed special or global registers. This is described on page 55 of [4].
I start such comments indented at the column of the op-code and with a > that sticks out to the left. To get this alignment in the case when timing information is present some care must be taken.

⟨Useful commands and shortcuts 44⟩ +≡
    \def\gts{% align `g' with the op-code col
      \iftimeinfostated
        {% omit time column if \mmixtype is odd
        \ninerm\hskip-\timecolumnwidth
        {\tentt\ }}% add space for 2nd commchar
      \fi
      {\tentt>\space }}

Final remarks

52. The following command doesn't produce any output. Nevertheless I find it useful in the analysis of the algorithm. The timing information states how often a source line is executed but for a line with a branch instruction it is also useful to know how often a bad branching decision was made.
Therefore I place the following command directly after the second commchar and state the number of bad decisions.

⟨Useful commands and shortcuts 44⟩ +≡
    % used to state number of bad decisions
    \def\bad#1\bad{\ignorespaces}% no output

53. The macros have been presented for a single mms file. But the conversion project needs to rewrite many MIX programs. So, to create a book of converted programs, for example, for a complete chapter of TAOCP, the individual files can be \input in a main file which is then processed by TeX. (Of course the main file must include a definition like, for example, \let\goodbye=\bye and then the redefinition of \bye: \outer\def\bye{\par\vfill\supereject\endinput}.)
To avoid reloading this package a test is added that determines if the package is already known. And all the counters, fonts etc. are reset to their initial value.
Note that \endinput and \fi must appear in the same line (see The TeXbook [5], p. 214).

⟨Description 14⟩ +≡
    % %%%
    % don't load the file several times
    % but reset variables, fonts etc.
    \def\ifundef #1 {% see \TeX book, ex. 7.7
      \expandafter\ifx\csname #1\endcsname\relax}
    \ifundef mmixisloaded \def\mmixisloaded{true}%
    \else\getmmixtype\endinput\fi

54. To test the scripts and to give an example of how to use the macro package a small example is shown in the appendix.

Index and List of sections

55. A literate program usually comes with an index of the names of used identifiers: variables, types, functions, procedures, or whatever the used programming language offers. It also includes certain aspects of the program that might be of interest to users or developers who want to change the code. For example, error messages are listed.
The index lists the section numbers in which the entry appears. The section number in which an identifier is defined is written in slanted digits.

(space): 30
%: 36
`: 30
\algidphrase: 45
\Analysis: 40, 42
\AoAfile: 12, 42
\argpgmline: 21
\author: 13, 14, 26
\authorH: 11, 13
\AVAIL: 44
\bad: 52
\bc: 49
\bo: 49
\bs: 49
changed plain TeX commands: 30, 36, 49
\checkextdata: 13, 37
\clearline: 42
\colforlnofalse: 35
\colforlnotrue: 27
\commchar: 29, 32, 33, 36
conditional expression: 48
\date: 13, 14, 26
\dateF: 11, 13
default values: 12, 13, 17, 23, 29, 33
\dm: 48
\do: 9, 10
\doverbatim: 30, 31
\endprogram: 41, 42
\endwAoA: 41, 42
\eop: 41, 42
\eopusedfalse: 42
\eopusedtrue: 42
\Eq: 44
\eq: 44
\everypar: 31
\footline: 11
\getmmixtype: 33, 42, 53
\Gets: 44


\gts: 51
\h: 49
\handleTeXcomments: 36, 37
\headline: 11
\ifcolforlno: 18, 20
\ifeopused: 42, 43
\ifnumberlines: 18, 20
\ifthreedigitlno: 18, 19
\iftimeinfostated: 24, 31, 51
\ifundef: 13, 53
\implies: 44
\ite: 48
Knuth, Donald Ervin: 1
\l: 49
\lnocnt: 15, 19, 20, 35
\lnothreedigitswidth: 16, 17, 19
\lnotwodigitswidth: 16, 17, 19
\lstroke: 49
\m: 50
\ma: 50
\mb: 50
\mem: 44
\mems: 44
MIX: 1
\MIX: 44
\mlsc: 46
\mm: 50
MMIX: 1
MMIX home page: 3
\MMIX: 44
mmix.tex: 4, 5, 24
MMIXAL: 6, 9, 29
\MMIXAL: 44
\mmixdospecials: 9, 10
\mmixisloaded: 53
\mmixtype: 25, 26, 27, 33, 35, 38, 41, 51
\mmixtypeerror: 38, 39
\mmm: 50
mms files: 2, 53
\newcommchar: 32
\newlinechar: 39
\next: 21
\nextmmixline: 31
\ninebf: 8, 34
\ninei: 8, 34
\nineit: 8, 17, 34
\ninerm: 8, 34, 51
\ninesl: 8, 34
\ninesy: 8, 34
\ninett: 8, 34, 44
\NULL: 44
\numberlinesfalse: 28, 35
\numberlinestrue: 27, 28
\numbermmixline: 20, 31
\numeq: 44
\oop: 44
\oops: 44
\par: 29, 30, 31, 32, 33
\pgmline: 21
\pgmlinex: 21
\pgmname: 13, 14
\pgmnameH: 11, 12, 13
\printlinenumber: 19, 20
\printtimeinfo: 31
\r: 50
\ra: 50
\rb: 50
\resetpar: 31
\resetTeXcomment: 36, 42
\rr: 50
\rrr: 50
Runaway argument: 31
saturating subtraction: 48
\setcommchar: 29, 33
\setflagsformmixtype: 27, 33
\setupfonts: 34, 37
\setupverbatim: 30
\setvariables: 35, 37
\sixbf: 8, 34
\sixi: 8, 34
\sixrm: 8, 34
\sixsy: 8, 34
\source: 13, 14, 26
\sourceH: 11, 13
\startnumbering: 28
\step: 45
\stepid: 45
\steq: 45
\stopnumbering: 28, 42
TAOCP: 1, 2, 6, 45, 46, 48
\testvalueofmmixtype: 33, 38
\texdospecials: 9, 10
The number \mmixtype ...: 38
\threedigitlnofalse: 35
\threedigitlnotrue: 27
\timecolumnwidth: 22, 23, 31, 51
\timeinfostatedfalse: 35
\timeinfostatedtrue: 27
\uncatcodespecials: 10, 30
usage, first line: 14, 26
  last line: 41
  program lines: 25, 29, 41
  value of \mmixtype: 25, 39
user commands: 14, 21, 28, 32, 42, 44, 45, 46, 48, 49
  my collection: 50, 51, 52
\verbatim: 29, 30, 31, 32, 36
\vs: 49
Warning: end the ...: 42
Wermuth, Udo: 5

56. The second index collects all headlines of the code parts. Here the headlines contain all section numbers that define the replacement code for the section name.

⟨Add an analysis of the algorithm 40, 42⟩ Used in 4.
⟨Compute \dimen255 from #1 (i.e., #lines) 47⟩ Used in 46.
⟨Counters 15, 25⟩ Used in 24.
⟨Definitions 24, 39⟩ Used in 4.
⟨Description 14, 26, 41, 53⟩ Used in 5.
⟨Dimensions 16, 22⟩ Used in 24.
⟨Flags 18, 43⟩ Used in 24.
⟨Fonts 8⟩ Used in 24.
⟨Format the mms file 19, 20, 21, 27, 28, 29, 30, 31, 32, 34, 35, 38⟩ Used in 4.
⟨Initialization 5, 11, 12, 13⟩ Used in 4.
⟨Last-minute procedures 36⟩ Used in 33.
⟨List symbols that are special in TeX 9, 10⟩ Used in 24.
⟨mmix.tex 4⟩ Root.
⟨Prepare the environment 37⟩ Used in 33.
⟨Set values of dimen-registers 17, 23⟩ Used in 35.
⟨Take off 33⟩ Used in 4.
⟨Useful commands and shortcuts 44, 45, 46, 48, 49, 50, 51, 52⟩ Used in 4.

References

[1] Bart Childs, "Thirty years of literate programming and more?" TUGboat 31 (2010), 183–188. http://tug.org/TUGboat/tb31-2/tb98childs.pdf (accessed: August 4, 2014)
[2] Donald E. Knuth, Literate Programming, CSLI Lecture Note No. 27, 1992. http://www-cs-staff.stanford.edu/~uno/lp.html (accessed: August 4, 2014)
[3] Donald E. Knuth, The Art of Computer Programming, Addison-Wesley, Vol. 1 (3rd ed.), 1997; Vol. 2 (3rd ed.), 1998; Vol. 3 (2nd ed.), 1998; Vol. 4a (1st ed.), 2011. http://www-cs-staff.stanford.edu/~uno/taocp.html (accessed: August 4, 2014)
[4] Donald E. Knuth, The Art of Computer Programming, MMIX: A RISC Computer for the new Millennium, Vol. 1, Fascicle 1, Addison-Wesley, 2005. http://www-cs-staff.stanford.edu/~uno/mmix.html (accessed: August 4, 2014)
[5] Donald E. Knuth, The TeXbook, Volume A of Computers & Typesetting, Addison-Wesley, 1984.
[6] MMIX home page, hosted by: The MMIX Group at Munich University of Applied Sciences. http://mmix.cs.hm.edu (accessed: August 4, 2014)
[7] Norman Ramsey, "Literate programming simplified", IEEE Software 11 (1994), 97–105. http://www.cs.tufts.edu/~nr/noweb/ (accessed: August 4, 2014)

⋄ Udo Wermuth
  Babenhäuser Straße 6
  63128 Dietzenbach
  Germany
  u dot wermuth (at) icloud dot com


Appendix: An Example

First the input (file 1-3-3I.mms) is shown. Note that commchars are used only in lines that contain a comment. Line numbers and timing information are given only for the lines that belong to the subroutine. The value of \mmixtype is −3 meaning that (a) the first line is not numbered (the value is negative), (b) line numbers need only two digits (i.e., value is −2 or −3), and (c) timing information is given (so value must be odd).

\def\date{04 Aug 2014}\def\source{V1, p.\ 177}\def\author{Udo Wermuth}\input mmix =-3
!!\clearline{\tenbf Program I} ({\tenit Inverse in place\/})% use a lot of ``features''
!!\clearline\smallskip\timecolumnwidth=2.5em % (some are not necessary in this conversion)
n       GREG 6           !! Number of elements in the permutation
j       IS   $0          !! Variables of the algorithm
i       IS   $1
mm      IS   $2          !! $\mm mm = 8m$
        LOC  Data_Segment
X       GREG @
        OCTA 0           !! $X[0]$ is not used
        OCTA 6,2,1,5,4,3 !! The data of Table 1.3.3--3
        LOC  #100
!!\gts Inverse a permutation in place
!!\gts Entry condition: $X[1]\,\ldots\,X[n]$ is a permutation of $\{1,\ldots,n\}$
!!\gts Exit condition: array $X$ contains inverted permutation \startnumbering
:Invert SL   mm,n,3      !1! \step I1. Initialize. $m\gets n$.
        NEG  j,1         !1! $j\gets-1$.
2H      LDO  i,X,mm      !N! \step I2. Next element. $i\gets X[m]$.
        PBN  i,5F        !N!\bad C\bad To I5 if $i<0$.
3H      STO  j,X,mm      !N! \step I3. Invert one. $X[m]\gets j$.
        SR   j,mm,3      !N! \mlsc 2:{$j\gets-m$.} % multi-line comment
        NEG  j,j         !N!
        SL   mm,i,3      !N! $m\gets i$.
        LDO  i,X,mm      !N! $i\gets X[m]$.
4H      PBP  i,3B        !N!\bad C\bad \steq I4. End of cycle? To I3 if $i>0$.
        SET  i,j         !C! Otherwise set $i\gets j$.
5H      NEG  i,i         !N! \step I5. Store final value.
        STO  i,X,mm      !N! $X[m]\gets-i$. \newcommchar. % change the commchar
6H      SUB  mm,mm,8     .N. \step I6. Loop on $m$.
        PBP  mm,2B       .N.\bad 1\bad To I2 if $m>0$. \stopnumbering
* inspect memory locations of array X for the result
        TRAP 0,Halt,0
Main    IS   :Invert     .. \eop
..\endwAoA\bye

The last line of the input file ends the source with \endwAoA. So a second file with the analysis of the algorithm is needed; it is the file 1-3-3I_aoa.tex:

In step~I3 each slot of the array~$X$ once receives a negative value and in step~I5 it is filled with a positive number. Using Kirchhoff’s law the number of times step~I6 is executed is equal to the number of times steps~I5 and~I2 are executed; that is steps~I2 and~I6 have count $N$. Step~I5 is entered from I4 $C$~times, so I2 goes $N-C$~times to step~I5 and $C$~times to~I3. And step~I3 goes $N$~times to step~I4, which must return $N-C$~times to~I3.

Of course, $N$ is the number of elements in the permutation and~$C$ is the number of its cycles. The \mb{PB..}~instructions in lines~\pgmline{04} and~\pgmline{10} are based on the assumption that in most cases $C\leq N/2$. An analysis of~$C$ shows that its average value is the harmonic number~$H_n$. So the assumption is correct.

The program needs $4N\mems + (12N+5C+4)\oops$. The execution with the test data, the permutation $(4 5)(2)(1 6 3)$, gives the statistic for \mb{Invert}: {\tt 78~instructions, 24~mems, 91~oops; 11~good guesses, 7~bad}. (The total run time is 96\oops\ as the \mb{TRAP} instruction needs 5\oops.) As in this case $N=6$ and $C=3$ the above formula calculates $(4\times6)\mems=24\mems$ and $(12\times6+5\times3+4)\oops=(72+15+4)\oops=91\oops$ in agreement with the measured data.


And this shows the final output (with simulated headline and footline). I assume that in a TAOCP volume only the numbered lines appear.

Author: Udo Wermuth Program: 1-3-3I.mms (V1, p. 177)

Program I (Inverse in place)

    n       GREG 6                Number of elements in the permutation
    j       IS   $0               Variables of the algorithm
    i       IS   $1
    mm      IS   $2               mm = 8m
            LOC  Data_Segment
    X       GREG @
            OCTA 0                X[0] is not used
            OCTA 6,2,1,5,4,3      The data of Table 1.3.3–3
            LOC  #100
          > Inverse a permutation in place
          > Entry condition: X[1] ... X[n] is a permutation of {1, ..., n}
          > Exit condition: array X contains inverted permutation
01  :Invert SL   mm,n,3      1    I1. Initialize. m ← n.
02          NEG  j,1         1    j ← −1.
03  2H      LDO  i,X,mm      N    I2. Next element. i ← X[m].
04          PBN  i,5F        N    To I5 if i < 0.
05  3H      STO  j,X,mm      N    I3. Invert one. X[m] ← j.
06          SR   j,mm,3      N    } j ← −m.
07          NEG  j,j         N    }
08          SL   mm,i,3      N    m ← i.
09          LDO  i,X,mm      N    i ← X[m].
10  4H      PBP  i,3B        N    I4. End of cycle? To I3 if i > 0.
11          SET  i,j         C    Otherwise set i ← j.
12  5H      NEG  i,i         N    I5. Store final value.
13          STO  i,X,mm      N    X[m] ← −i.
14  6H      SUB  mm,mm,8     N    I6. Loop on m.
15          PBP  mm,2B       N    To I2 if m > 0.
    * inspect memory locations of array X for the result
            TRAP 0,Halt,0
    Main    IS   :Invert

Analysis

In step I3 each slot of the array X once receives a negative value and in step I5 it is filled with a positive number. Using Kirchhoff's law the number of times step I6 is executed is equal to the number of times steps I5 and I2 are executed; that is steps I2 and I6 have count N. Step I5 is entered from I4 C times, so I2 goes N − C times to step I5 and C times to I3. And step I3 goes N times to step I4, which must return N − C times to I3.

Of course, N is the number of elements in the permutation and C is the number of its cycles. The PB.. instructions in lines 04 and 10 are based on the assumption that in most cases C ≤ N/2. An analysis of C shows that its average value is the harmonic number H_n. So the assumption is correct.

The program needs 4Nµ + (12N + 5C + 4)υ. The execution with the test data, the permutation (4 5)(2)(1 6 3), gives the statistic for Invert: 78 instructions, 24 mems, 91 oops; 11 good guesses, 7 bad. (The total run time is 96υ as the TRAP instruction needs 5υ.) As in this case N = 6 and C = 3 the above formula calculates (4 × 6)µ = 24µ and (12 × 6 + 5 × 3 + 4)υ = (72 + 15 + 4)υ = 91υ in agreement with the measured data.

Date: 04 Aug 2014 1


A Citation Style Language (CSL) workshop

Daniel Stender

Abstract

CSL is a free and open XML-based language for the programming of citation styles. With these styles, bibliographical references can be printed out in different ways from several database formats, including BibTeX. The more than 7000 CSL styles which are currently available can be used with several popular applications like Zotero, Mendeley, or Pandoc. This article is an introduction to the programming of citation styles with CSL, based on a few example BibTeX bibliographic database records.

1 Introduction

Bibliographical references, as used in scientific publications, are pointers to cited or regarded literature. Regularly, they consist of two standardized components: an in-line citation (the "cite") refers to an entry in the publication's bibliography. Despite the common concepts, there is no uniform outline for references; rather, each scientific discipline and every publishing house has its own traditional set of conventions, which also might change between series.

In electronic typesetting, bibliographical information is often gathered in comprehensive, reusable data files. CSL (http://citationstyles.org/) is a programming language for citation styles, with which differently formatted references can be generated from the same bibliographic databases. CSL (current version: 1.0.1) is XML-based, open and free, and was substantially developed for the all-around reference manager Zotero (http://www.zotero.org/).

This article demonstrates how a rudimentary example citation style could be implemented with CSL, with reference to the BibTeX data format. The usage of several programs refers to a Debian GNU/Linux based system (such as Linux Mint), but CSL styles could also be easily developed on other operating systems. Some basic knowledge of XML and the BibTeX data format is definitely needed to follow every detail.

2 CSL styles

The citation styles which have already been implemented in CSL (file extension: .csl) are collected by the CSL developers in the official style repository (http://github.com/citation-style-language/styles/) and in the Zotero style repository (http://zotero.org/styles/). The more than 7000 styles available so far are distributed under the Attribution-ShareAlike 3.0 Unported license by Creative Commons (http://creativecommons.org/licenses/by-sa/3.0/). For the search of specific citation styles the online CSL style editor (http://editor.citationstyles.org/about/) is quite useful because it provides, in addition to other features, a search by example.

3 Pandoc-citeproc

Among the several current applications which already know how to use CSL styles for the automatic generation of references, there is the popular universal markup converter Pandoc [1] (http://johnmacfarlane.net/pandoc/). With short citation keys like @doe2014 [p. 40-42] for its extended Markdown lightweight markup, Pandoc can query bibliographical data files and recognize CSL styles to put out variously formatted references for documents either in HTML, LaTeX or ConTeXt markup, along with several others [3]. Pandoc is a command line interface application; thus, the CSL style which is going to be processed and the bibliographical database(s) are given as arguments in the program call. Here's an example (some line breaks are editorial) for Pandoc's LaTeX output of a random BibTeX database record (see below for details), formatted using chicago-author-date.csl. (Although given explicitly here, this CSL style is the default in Pandoc.)

    $ echo "On this, see @reference2 [p. 127]." \
      | pandoc --to=latex \
        --csl=chicago-author-date.csl \
        --bibliography=references.bib

    On this, see Flom (2007, 127).

    Flom, Peter. 2007. ``LaTeX for Academics and
    Researchers Who (Think They) Don't Need It.''
    \emph{TUGboat} 28 (1): 126--128.

As shown in this example, the bibliography is printed at the end of a document. Incidentally, in the input BibTeX (shown later), the title is given in lowercase; the titlecasing done here is automatic, a feature of this style [8, chp. 14].

Processors which produce formatted citations out of bibliographic databases according to CSL styles are called CiteProcs (http://en.wikipedia.org/wiki/CiteProc/). CiteProcs are being developed in several programming languages. The one which is used normally by Pandoc is pandoc-citeproc (http://github.com/jgm/pandoc-citeproc/), written in the same functional programming language Haskell as Pandoc itself, and developed closely together with it. This CiteProc (currently: 0.3.0.1) already deals with a number of different database


formats, but it's said that it works best with BibTeX or BibLaTeX [4] databases so far.

4 Developing CSL styles

CSL styles are XML, and therefore the whole related XML tool chain can also be used with them. For somebody who deals with XML regularly, a specialized editor is useful, but fundamentally CSL styles can be created and modified with any text editor. CSL is described in detail in the specification [10], and the primer which has been written by the developers [9] is a good starting point for beginners.

CSL is standardized as an XML grammar in the schema language RELAX NG, and in principle any CSL style file can be checked with an XML validator against the CSL schema to determine if it is correct (valid). Unfortunately, some validators, for example xmllint (Debian package: libxml2-utils), cannot cope with XML schemas in RELAX NG compact syntax like the one shipped by the CSL developers (file extension: .rnc); the schema has to be converted (e.g. with Trang) into the regular RELAX NG syntax (file extension: .rng) before a CSL style can be validated against it:

    $ git clone https://github.com/citation-style-language/schema.git
    Cloning into 'schema' [...]
    $ trang schema/csl.rnc schema/csl.rng
    $ xmllint --noout -relaxng schema/csl.rng \
      chicago-author-date.csl
    chicago-author-date.csl validates

5 XML declaration and CSL header

The standard XML declaration commonly starts a CSL style file:

    <?xml version="1.0" encoding="utf-8"?>

Incidentally, although it is often suggested to be included, specifying the UTF-8 encoding like this could be omitted because UTF-8 is the XML default [2, p. 28].

The next thing which is needed is a well-formed CSL header to specify that the XML file is a CSL style. A standard CSL header goes like this:

    [...]

[...] by version, while the class attribute determines that this style provides cites in the running text by default, rather than as footnotes or end notes (which would be "note").

6 Info block

The next mandatory unit of a CSL style is an <info> block, which provides metadata for labeling and identification. A typical info block looks like this:

    <info>
      <title>An example CSL style</title>
      <id>http://www.danielstender.com/csldemo</id>
      <updated>2014-09-18T23:53:00+02:00</updated>
    </info>

Even if it is not intended to publish the style, there are at least three mandatory child elements that are meant for this purpose:

<title> is the title of the CSL style as it is going to be displayed to users,

<id> contains, like xmlns in the CSL header (see above), a random URI, which may be real or fictitious, and which is solely for identification purposes,

<updated> carries an xsd:dateTime compliant time stamp of the last modification.

7 Example BibTeX records

Before getting any deeper into CSL style programming, here are a few sample BibTeX records to be referred to hereafter to demonstrate how CSL works. A typical @Book entry type [5, chp. 13.2] goes like this:

    @Book{reference1,
      author    = {Kopka, Helmut and Daly, Patrick W.},
      title     = {A Guide to LaTeX and Electronic Publishing},
      publisher = {Addison-Wesley},
      year      = 2004,
      address   = {Boston},
      edition   = {Fourth}}

The next one is an @Article data set of a (well-known) journal whose issues are counted as volume numbers:

record types of different natures, such as "Second", "second", "2nd", "2", etc. Therefore, if a CSL style needs to be robust, and requires an exact format for edition information, type queries and conversion routines may be needed especially for this field. For this purpose, CSL provides a number of different tests for complex, conditional processing of data fields; for example, is-numeric returns a (Boolean) "false" if a variable has the form "Second", "second", etc.

9.8 Publication details

The rendering of the publication details of books can be implemented like this:

    [...]

This macro first checks whether the variable publisher is defined (which is not the case with @Article), and, if this is true, renders it together with publisher-place (which adopts the BibTeX field address) and again the macro year in the desired way for this citation style. This macro could be employed similarly to the others, with:

    [...]

9.9 Sorting key

With an author–date citation style like this it's useful to install a sort order, or the records are going to appear in the order of occurrence of the corresponding cites, which is typically not wanted. The following sort key puts the entries of the bibliography into the alphabetical order of the authors' surnames:

    [...]

10 Result

With these features and the closing </style> root element, the very basic citation style which we intended to implement is completed. Like the others, this CSL style could be used to produce complete formatted citations out of the example BibTeX data.

The LaTeX formatted Pandoc output of the example references looks like this (with some editorial line breaks):

    {[}Flom 2007{]} Peter Flom: ``LaTeX for
    academics and researchers who (think they)
    don't need it''. In: \emph{TUGboat} 28,1
    (2007), p. 126--128.

    {[}Kopka \& Daly 2004{]} Helmut Kopka,


    Patrick W. Daly: ``A Guide to LaTeX and
    Electronic Publishing''. Fourth edition.
    Boston: Addison-Wesley 2004.

    {[}Sharma 2014{]} Tushar Sharma: ``Why I never
    close Emacs''. In: \emph{Open Source For You}
    1/2014, p. 53--55.

To be sure, what has been set up here is far from robust and is just for demonstration purposes. The experienced bibliography writer knows that even with only the basic publication types which have been discussed, plenty of open questions remain which would go beyond our scope here. A more refined style would need additional features such as book titles set in italics, using the prefix "pp." for page ranges, using "et al." for multiple authors if required by the style, etc. These topics and several others are planned to be the subject of a follow-up article.

In general, CSL offers features for every last detail of bibliographical typesetting; the styles which are actually used in production are much more complex than what has been demonstrated here.

11 Conclusion

CSL provides a sophisticated and versatile tool (e.g. it also supports localization) for the programming of citation styles. It has already become widespread, for good reason.

In my opinion, CSL responds to the natural complexity of the subject "citation" with a very elegant, intuitive and simple XML-based user interface. This distinguishes CSL from, for example, the difficult-to-penetrate stack-based BibTeX language for .bst styles [6].

Although a CSL preprocessor for LaTeX, to the best of my knowledge, still remains a desideratum, it is still highly recommended to become familiar with CSL when dealing with bibliographical typesetting. Finally, until a CSL capable replacement for the BibTeX preprocessor becomes available, Pandoc's LaTeX output is useful.

References

[1] Massimiliano Dominici. An overview of pandoc. TUGboat, 35(1):44–50, 2014. URL: http://tug.org/TUGboat/tb35-1/tb109dominici.pdf.

[2] Joe Fawcett, Liam R.E. Quin, and Danny Ayers. Beginning XML. John Wiley & Sons Inc., Indianapolis, fifth edition, 2012.

[3] Axel Kielhorn. Multi-target publishing. TUGboat, 32(3):272–277, 2011. URL: http://tug.org/TUGboat/tb32-3/tb102kielhorn.pdf.

[4] Philipp Lehman, Philip Kime, Audrey Boruvka, and Joseph Wright. The BibLaTeX package: Programmable bibliographies and citations. Version 2.9a, 24/06/2014. URL: http://ctan.org/pkg/biblatex.

[5] Frank Mittelbach, Michel Goossens, et al. The LaTeX Companion. Addison-Wesley Series on Tools and Techniques for Computer Typesetting. Addison-Wesley, Boston, second edition, 2004.

[6] Oren Patashnik. Designing BibTeX styles, 1988. URL: http://mirror.ctan.org/biblio/bibtex/base/btxhak.pdf.

[7] Michael Shell and David Hoadley. BibTeX tips and FAQ. Version 1.1, 2007. URL: http://mirror.ctan.org/biblio/bibtex/contrib/doc/btxFAQ.pdf.

[8] University of Chicago Press staff, editor. The Chicago Manual of Style. University of Chicago Press, Chicago, Ill., sixteenth edition, 2010.

[9] Rintze M. Zelle. Citation style language 1.0: Primer, 2011. URL: http://citationstyles.org/downloads/primer.html.

[10] Rintze M. Zelle, Frank G. Bennett, Jr., and Bruce D'Arcus. Citation style language 1.0.1: Language specification, 2012. URL: http://citationstyles.org/downloads/specification.html.

⋄ Daniel Stender Hamburg, Germany daniel (at) danielstender.com http://www.danielstender.com/


The Treasure Chest

This is a list of selected new packages posted to CTAN (http://ctan.org) from March through September 2014, with descriptions based on the announcements and edited for extreme brevity.

Entries are listed alphabetically within CTAN directories. A few entries which the editors subjectively believe to be of especially wide interest or otherwise notable are starred; of course, this is not intended to slight the other contributions.

We'd especially like to point out the welcome proliferation of font packages. A wide variety of fonts are available to (La)TeX users nowadays, almost all usable with any TeX engine. We recommend the DK-TUG Font Catalogue (http://www.tug.dk/FontCatalogue) for exploration.

We hope this column and its companions will help to make CTAN a more accessible resource to the TeX community. Comments are welcome, as always.

⋄ Karl Berry
  http://tug.org/ctan.html

fonts

almfixed in fonts
    Arabic Unicode extending Latin Modern Mono.
* baskervaldx in fonts
    Greatly extended and modified BaskervaldADF.
caladea in fonts
    Caladea fonts.
calibri in fonts
    Carlito sans fonts.
cinzel in fonts
    Cinzel and Cinzel Decorative fonts.
clearsans in fonts
    Clear Sans fonts.
dantelogo in fonts
    Using the DANTE e.V. logo.
drm in fonts
    Revised modern meta-font.
ebgaramond-maths in fonts
    LaTeX support for using EBGaramond in math.
erewhon in fonts
    Extends Heuristica which extends Utopia.
[...] in fonts
    Fira fonts, designed for Firefox.
heuristica in fonts
    Heuristica fonts, extending Utopia with Cyrillic.
newtxtt in fonts
    Enhancement of typewriter fonts from newtx.
obnov in fonts
    Obyknovennaya Novaya Cyrillic font. (See article in this issue.)
parisa in fonts
    Persian fonts derived from FarsiTeX et al.
playfair in fonts
    Playfair Display fonts.
ptmsc in fonts
    Use proprietary Adobe TimesSC with newtx.
[...] in fonts
    Roboto fonts.
* universalis in fonts
    Universalis fonts, alternatives to Univers and Frutiger.

graphics

asypictureb in graphics
    User-friendly integration of Asymptote into LaTeX.
blox in graphics/pgf/contrib
    Draw block diagrams.
dsptricks in graphics//contrib
    Digital signal processing plots.
interactiveplot in graphics
    Creating interactive 2D/3D functions inside a PDF.
pst-spirograph in graphics/pstricks/contrib
    Simulate operation of a spirograph.
qcircuit in graphics
    Macros to generate quantum circuits.

info

latexsource-ng in info
    Introduction to LaTeX, with setup information.

language

dad in language/arabic
    Typesetting Arabic and mixed Arabic/Latin. (See article in this issue.)

macros/generic

bagpipe in macros/generic
    Typesetting bagpipe music.
docbytex in macros/generic
    Creating documentation from source code.
lpform in macros/generic
    [...] formulations.
tracklang in macros/generic
    Determining user-requested languages.

macros/latex/contrib

afparticle in macros/latex/contrib
    Typeset articles for the open access journal Archives of Forensic Psychology.
assoccnt in macros/latex/contrib
    Advancing many counters simultaneously.
bangorcsthesis in macros/latex/contrib
    Thesis class for Bangor University.
bnumexpr in macros/latex/contrib
    Extends eTeX's \numexpr to big integers.
clrscode3e in macros/latex/contrib
    Typeset pseudo-code as in Introduction to Algorithms.


dithesis in macros/latex/contrib
    Undergraduate theses at the University of Athens.
doctools in macros/latex/contrib
    Tools for documentation of LaTeX code.
efbox in macros/latex/contrib
    Enhanced inline box with optional frames and colors.
environ in macros/latex/contrib
    New interface for LaTeX environments.
fifo-stack in macros/latex/contrib
    FIFO and stack implementations.
fullminipage in macros/latex/contrib
    Minipage spanning a complete page.
getmap in macros/latex/contrib
    Downloading OpenStreetMap maps.
gitinfo2 in macros/latex/contrib
    Use metadata from git repositories in LaTeX.
graphbox in macros/latex/contrib
    Provide more options for placement of graphics.
grundgesetze in macros/latex/contrib
    Typeset Frege's Grundgesetze der Arithmetik.
handout in macros/latex/contrib
    Handout for audiences at a talk.
komacv in macros/latex/contrib
    Typeset CV with various style options.
* l3build in macros/latex/contrib
    Test and build system for (La)TeX.
labyrinth in macros/latex/contrib
    Drawing labyrinths and related.
lastpackage in macros/latex/contrib
    Defines last point where packages can be loaded.
* latexdemo in macros/latex/contrib
    Demonstrate LaTeX code with resulting output.
logicproof in macros/latex/contrib
    Box proofs for propositional and predicate logic.
longfigure in macros/latex/contrib
    Figure-like environment that breaks over pages.
matlab-prettifier in macros/latex/contrib
    Pretty-print Matlab source code.
mugsthesis in macros/latex/contrib
    Marquette University Graduate School theses.
listlbls in macros/latex/contrib
    List of all labels used in a document.
pressrelease in macros/latex/contrib
    Class for typesetting press releases.
pygmentex in macros/latex/contrib
    Typeset code listings using Pygments.
qrcode in macros/latex/contrib
    Generate QR codes.
repltext in macros/latex/contrib
    Control text copied from a PDF.
sclang-prettifier in macros/latex/contrib
    Pretty-print SuperCollider source code.
sphdthesis in macros/latex/contrib
    Theses at National University of Singapore.
sympytexpackage in macros/latex/contrib
    Support for sympy (Symbolic Python) expressions.
tablestyles in macros/latex/contrib
    Separation of text and style in tables.
templatetools in macros/latex/contrib
    Conditionals helpful in templates.
testhyphens in macros/latex/contrib
    Testing hyphenation patterns.
tudscr in macros/latex/contrib
    Technische Universität Dresden documents.
ucbthesis in macros/latex/contrib
    UC Berkeley thesis class, based on memoir.
yathesis in macros/latex/contrib
    Writing a thesis following French rules.

macros/latex/contrib/babel-contrib

latvian in m/l/c/babel-contrib
    Babel support for Latvian.

macros/latex/contrib/beamer-contrib

themes/beamerdarkthemes in m/l/c/beamer-contrib
    Bundle of dark color (black background) themes.

macros/latex/contrib/biblatex-contrib

biblatex-anonymous in m/l/c/b-c
    Managing anonymous works.
biblatex-bookinarticle in m/l/c/b-c
    New entry type @bookinarticle.
biblatex-multiple-dm in m/l/c/b-c
    Load multiple datamodels in biblatex.
biblatex-realauthor in m/l/c/b-c
    Indicate real author of a work.
biblatex-true-citepages-omit in m/l/c/b-c
    Avoid limitations of standard citepages=omit option.

macros/luatex

luatodonotes in macros/luatex/latex
    Add editing annotations in margins.
placeat in macros/luatex/latex
    Absolute content positioning for LuaLaTeX.

macros/xetex

bidi-atbegshi in macros/xetex/latex
    Bidi-aware shipout macros.
bidicontour in macros/xetex/latex
    Bidi-aware version of contour.
bidipagegrid in macros/xetex/latex
    Bidi-aware version of pagegrid.
bidipresentation in macros/xetex/latex
    Bidi-aware presentations.
bidishadowtext in macros/xetex/latex
    Typesetting bidi-aware shadow text.

support

texlive-dummy in support
    Dummy RPM to satisfy package requirements.

systems

hktex in systems/android
    TeXish formula parsing software for Android.


Book review: Practical LaTeX, by George Grätzer

William Adams

George Grätzer, Practical LaTeX, http://www.springer.com/new+%26+forthcoming+titles+(default)/book/978-3-319-06424-6. Paperback, 216 pp., Springer, 2014.

For those interested, Grätzer is also the author of Math Into LaTeX, More Math Into LaTeX, several other LaTeX books, and many non-LaTeX books also.

Practical LaTeX is a slim, well-named volume, since it is eminently practical. It is an excellent introductory text, covering contemporary markup, macros and packages, eschewing obsolete material. Covering the essentials of document production including niceties such as BibTeX and TikZ, it is an excellent, up-to-date introduction which one can hand to any potential user with confidence that it will help them to rapidly achieve a basic proficiency allowing for the production of documents with an efficiency which would be the envy of other tools.

Nicely designed and typeset, it covers document elements including text, environments, formulas, illustrations, symbols and bibliographies as well as presentations and the customization of LaTeX. The coverage of illustrations is especially nice, including both the basics of placing image files using the graphicx package and creating illustrations using the code-oriented tool TikZ (this chapter is based on Jacques Crémer's publicly available A very minimal introduction to TikZ).

There is an excellent index as well as several very useful appendices for symbols and commands which also make the book a useful quick reference.

If the book has a flaw, it is the rather inexplicable appendix "LaTeX on the iPad" which unnecessarily limits itself to a specific platform and swerves into a political screed which at once complains that the GNU Public License (GPL) "stops you from having it used on the fastest growing platform of all time" while at the same time discussing several different LaTeX apps for the iPad which are capable of typesetting LaTeX documents.

As an historical footnote and commentary: Duncan P. Steele of Valletta Ventures had initially authored a blog post complaining of the LaTeX codebase and noting that the GPL licensing of the mainstream TeX distributions made producing a TeX editor for the iPad impossible due to the interactions of Apple's licensing and the GPL, http://vallettaventures.com/2011/12/10/messy-latex/. However, after being informed of the existence of KerTeX (http://www.kergis.com/en/kertex.html), which is available under a permissive license, and being convinced that TeX itself was available in the public domain, Steele was able to produce a version of TeXpad which runs on the iPad: http://vallettaventures.com/2012/09/07/texpad-ipad-v1.1/.

Highly recommended for beginners, in particular those who might wish to make use of LaTeX on their iPad, as well as the occasional user who needs a reference. An updated version which eschewed the political commentary and included coverage of at least one of the many LaTeX editors for Android tablets would be especially welcome.

⋄ William Adams
  608 Wayne Drive
  Mechanicsburg, PA
  willadams (at) aol dot com

Book review: Apprendre à programmer en TeX, by Christian Tellechea

Jacques André

Christian Tellechea, Apprendre à programmer en TeX (Learning programming in TeX), http://www.lulu.com/us/en/shop/christian-tellechea/apprendre-%C3%A0-programmer-en-tex/paperback/product-21816783.html. Paperback, 580 pp., Lulu, 2014. Revision of 21/9/2014.

This book is in French and let me say right out that it deserves an English translation.

There are many books on TeX as a typesetting tool (e.g., see list at [1]). Very few are dedicated to TeX as a programming language. Not a functional or general-purpose language, rather the kind of "programming language with a documentation language, thereby making programs more robust, more portable, more easily maintained, and arguably more fun to write than programs that are written only in a high-level language" (preface of [2]).

Tellechea's book could be seen as a rewrite of a subset of The TeXbook [3]. However it avoids everything concerning, e.g., mathematical formula setting and focuses only on matters that are used today, e.g., to write (La)TeX macros. Furthermore it uses a new framework, since it is a tutorial manual (even if, in the end, it can also be considered a reference manual, since it offers complete coverage of its chosen topics).

In the first hundred or so pages, the author explains the very low-level concepts such as catcodes, commands, active characters, arguments, developments, expansions, ... At first glance, this part appears a bit verbose or slow, and you can't see the forest for the trees. Nevertheless, this is probably the best way to make sure these concepts are fully understood by people not familiar with such a language. The odds are that even experienced macro users will learn something!

The second part describes numbers and control structures. Exercises allow readers used to conventional programming to write macros that simulate their conventional structures such as for...do... or various forms of if then else fi.

Boxes, dimensions and input/output are studied in the following chapters, and exercises answer everyday needs (lists, stacks, grids, etc., not to mention curve drawing without waste of memory or time!), as well as classical algorithms.

A final part revisits all the material in the context of long and thoroughly-commented examples such as the layout of paragraphs.

This book is easy to read and progressive. It leaves no question unanswered, and the exercises are useful both to understand the underlying concepts and to be reused in our programs.

As I said at first: a most worthwhile book to be translated into English.

References

[1] Books about TeX and Friends, http://tug.org/books

[2] Donald E. Knuth, Literate Programming, CSLI Lecture Notes, no. 27, Stanford, 1992.

[3] Donald E. Knuth, The TeXbook, Reading, Massachusetts: Addison-Wesley, 1984.

⋄ Jacques André
  jacques dot andre35 (at) gmail dot com
  http://jacques-andre.fr

Book review: The Imitation Game, by Jim Ottaviani & Leland Purvis

Michael S. Berry

Jim Ottaviani & Leland Purvis, The Imitation Game. http://www.tor.com/stories/2014/06/the-imitation-game-jim-ottaviani-leland-purvis, 2014.

From the editors. While Alan Turing of course died long before TeX existed, there are several reasons why we think his biography as a graphic novel might be interesting for TUGboat readers. First, many of us are involved with computer science, and Turing was one of the founding fathers of this fascinating field. Second, both DEK and (more recently) Leslie Lamport are recipients of the Turing Award, the highest distinction for computer scientists. Third, many TUGboat readers are interested in modern book design, and therefore might appreciate an online graphic novel. Last but not least, this is an interesting and unusual book.

− − ∗ − −

Alan Turing was a young man in search of himself as well as universal truths, or so he is cast by Ottaviani and Purvis in The Imitation Game. Socially awkward and eccentric, he nonetheless managed to gain success through a combination of innate genius and happenstance, the latter due to Great Britain's involvement in World War II. The authors take us through Turing's life from childhood, advanced mathematical and philosophical education, his lead role in breaking the German Enigma code, subsequent lack of recognition for theoretical accomplishments, and, finally, his persecution and premature death. The story is told by Turing himself, his mother and a host of associates from various periods of his life. The dialog is, of course, conjectural reconstruction, but the authors do a convincing job of portraying the "essential" Turing and, for the most part, the narrative flows well.

The novel is decidedly not a primer on the Turing Machine or its underlying philosophical issues. It traverses critical points in the evolution of Turing's thought processes in a manner assuming a fairly sophisticated knowledge on the part of the reader, knowledge this reviewer did not possess. The salutary effect (in my case and I suspect the same will be true for other readers) is to stimulate sufficient curiosity to research the critical issues. The Internet is replete with references to the Entscheidungsproblem (Church–Turing thesis), working code for various iterations of Turing Machines and detailed analyses of the Enigma code and its decipherment (some heavy lifting involved but it fleshes out an understanding of the novel).

As noted earlier, Turing was not fully appreciated in his time beyond a small circle of colleagues. His brilliant, leading effort to break the Enigma code was largely masked due to national security issues. Then there was the familiar academic practice of extensive "borrowing" (some would say plagiarism) of ideas on the part of established, senior scholars (e.g. von Neumann) at the expense of lesser known contributors.

But such slights paled in comparison to the tragic events of his final years. The authors walk us through Turing's arrest and trial for homosexuality (a fact he did not challenge), the subsequent conviction and his untimely death (the authors accept the designation of suicide whereas others opt for accidental ingestion of cyanide). I believe that the authors have here missed an opportunity to place these events in broader context so as to clearly convey the ludicrous irony involved.

The issue involved the academic imprimatur afforded eugenics at the time. Led by Charles Davenport in the United States, positive (encouraging reproduction by superior lineages) and negative (elimination of inferior lineages) eugenics was a widely accepted, if terribly wrongheaded, derivation of Darwinian natural selection. Nazi Germany appropriated Davenport's eugenics theory in toto and extended it to its logical ends: the enormity of genocide and the Holocaust. Turing was awarded the OBE (Most Excellent Order of the British Empire) for his role in defeating Hitler and the Nazi regime. Yet he was convicted and chemically castrated for a crime, the rationale of which was based in the assumed validity of the theory of eugenics (i.e., the elimination of the "unfit" from the gene pool), and suffered the consequent physical impairment and ignominy. It should not be lost to history, or the readers of this novel, that Winston Churchill, who lauded Turing's World War II contributions, was a vocal advocate of eugenics for the improvement of the British people.

In sum, with the above exception noted, I found the novel both illuminating for those unfamiliar with Turing's too-brief life history and stimulative of additional research for those interested in the more arcane aspects addressed. A good read.

⋄ Michael S. Berry
  Dominguez Archaeological Research Group
  msberry49 (at) gmail dot com

A Book review: Let’s Learn LATEX, deprecated in LTEX 2ε and should be avoided [5]. by S. Parthasarathy There are also some instances where a font chang- ing declaration, such as \Huge, is followed by an un- Nicola L. C. Talbot necessary group, which may confuse the reader into S. Parthasarathy, Let’s Learn LATEX, http:// thinking the command requires an argument. For www.freewebs.com/profpartha/teachlatex.htm, example, on line 50 of certif0.tex there are two 2014. 24 pp., free ebook. Version 201408e. sets of redundant braces in

{\bf{\Huge{‘‘\LaTeX\ hands-on’’}}}

since neither \bf nor \Huge has an argument. Only the outermost set of braces scopes the effects of the declarations.

[Figure: cover of the ebook. Let's Learn LaTeX (Version: 201409c), by S. Parthasarathy, Algologic Research and Solutions, Secunderabad, India.]

This is a free ebook licensed under the Creative Commons Attribution-ShareAlike 4.0 Unported license. The preface starts with a tribute to Richard Stallman and brief information about FOSS (free and open source software; the acronym could do with an expansion in the book). The preface then states that the purpose of the book is to encourage hacking as a method of learning LaTeX. The book, rather than being a reference text containing instructions and definitions, provides a list of example documents that the reader can try out and modify as a learning tool.

The idea of learning by hacking example code is a useful concept, and one that I often employ when investigating a new programming language. However, I'm concerned that the sample documents provided with this book use obsolete code and deprecated practices. Some illustrations follow below.

Most of the sample documents use the obsolete LaTeX 2.09 font commands, such as \bf. These are

Many of the sample documents load the epsfig package [7]. The original epsfig style that was provided with LaTeX 2.09 is now obsolete and should not be used. Current TeX distributions provide a newer epsfig package that is just a wrapper package that loads the graphicx package [2]. The recommended practice is to use graphicx directly and not specify the image file extensions [4]. Incidentally, there is also no longer any need to specify the dvips package option, which occurs in many of the sample documents. The only time a driver option is needed is in the cases where it can't be determined, such as dvipdfm [10]. Omitting the driver and the file extensions helps to make the document more portable.

Curiously, some of the documents, such as the file spacing.tex, load both epsfig and graphicx, which is redundant. There are other instances of unnecessary repetition where there are multiple attempts to load the same package. For example, in the file torture1.tex not only are both epsfig and graphicx loaded on line 5, but there is also an attempt to load epsfig on line 11 and graphicx on line 13. Similarly, there are two attempts to load amssymb (on lines 4 and 10), amsmath (on lines 4 and 8) and amsfonts (on lines 4 and 9). Removing this duplication would provide a more streamlined example.

On the subject of images, the image files aren't actually provided with the sample documents for copyright and licensing issues, so I think it would be useful if the author could mention the use of the demo option provided by the graphicx package, which would enable the documents to be compiled without error. Alternatively, perhaps mention the image files provided with the mwe package [9].

There is prolific use of \\ within paragraphs in the sample documents, which is generally best avoided [1]. I think using paragraph breaks instead of \\ would be more appropriate in most of these cases, and blank lines would additionally help readability of the code.
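As a hedged illustration of the practices recommended above, the following minimal document is a sketch only; it is not taken from the book or its sample files. It loads graphicx directly and only once, omits the driver option and the image file extension, uses the demo option so that compilation does not depend on the missing image files, and separates paragraphs with blank lines rather than \\. The file name sample-image is hypothetical.

```latex
\documentclass[12pt]{article}
% Each package is loaded exactly once: no epsfig, no dvips driver option.
% The demo option makes \includegraphics draw a placeholder box,
% so the file compiles even when the image files are not available.
\usepackage[demo]{graphicx}
\usepackage{amsmath,amssymb}% amssymb already loads amsfonts

\begin{document}

This is the first paragraph. It ends with a blank line in the source,
not with a forced line break.

This is the second paragraph. It uses \textbf{bold text} through a
LaTeX2e text-block command instead of the obsolete two-letter command.

\begin{figure}[htbp]
  \centering
  % No file extension is given; "sample-image" is a hypothetical name.
  \includegraphics[width=0.5\linewidth]{sample-image}
  \caption{A placeholder box drawn because of the demo option.}
\end{figure}

\end{document}
```

If the mwe package [9] is installed, dropping the demo option and using example-image as the file name should show a real test image instead of the placeholder box.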

Nicola L. C. Talbot TUGboat, Volume 35 (2014), No. 3 321

More surprising is the use of \\ immediately before paragraph breaks. The LaTeX user guide [6, p. 213] warns against this as it produces underfull \hbox warnings and extra vertical space. (If vertical spacing is required between paragraphs, there are more appropriate methods of achieving this [3].)

I was somewhat bemused by the line

%\documentstyle[epsfig, picinpar, 12pt]{article}

in the file latexography.tex. Even though the line is commented, \documentstyle should not appear in any modern LaTeX tutorial, except where it is being pointed out as obsolete. There's a danger here that curious new users may uncomment the line and try it out.

I was interested to see that the sample document kuralengtam3a.tex loads the fontspec package [8], which is a XeLaTeX and LuaLaTeX package. It's not often that a LaTeX tutorial uses a different engine. I think this is a good idea, but it would help if the author pointed out to the reader in a comment at the start of the file that XeLaTeX or LuaLaTeX is required, otherwise users eager to compile the sample document before reading it may not realise they need a different engine.

This sample document requires the Arial font. As a GNU/Linux user, I don't have any commercial fonts installed, and it wasn't immediately obvious at what point the document was switching to Arial, but the command-line invocation of grep Arial * tracked it down to eight of the accompanying files. After I had replaced all instances of Arial with Liberation Sans in those files, I was able to get the document to compile without error. Windows users won't have this problem, but I was puzzled by the author's choice of Arial rather than a free font given the book's FOSS ethos.

If the author updates the sample documents so that the redundancy, obsolescence and deprecated practices are removed, this book could be a useful tool in learning LaTeX and introducing the reader to XeLaTeX.

References

[1] David Carlisle. Answer to: When to use \par and when \\. TeX – LaTeX Stack Exchange, 2012. URL: http://tex.stackexchange.com/a/82666 (version: 2012-11-14).

[2] David Carlisle. The graphicx package, April 2014. Available from CTAN, macros/latex/required/graphics (version: 1.0g, 2014-04-25).

[3] TeX FAQ. Zero paragraph indent. URL: http://www.tex.ac.uk/cgi-bin/texfaq2html?label=parskip (version: 3.28, 2014-06-10).

[4] TeX FAQ. Portable imported graphics, 2014. URL: http://www.tex.ac.uk/cgi-bin/texfaq2html?label=graph-pspdf (version: 3.28, 2014-06-10).

[5] TeX FAQ. What's wrong with \bf, \it, etc.?, June 2014. URL: http://www.tex.ac.uk/cgi-bin/texfaq2html?label=2letterfontcmd (version: 3.28, 2014-06-10).

[6] Leslie Lamport. LaTeX: A document preparation system. Addison-Wesley, 1994.

[7] Sebastian Rahtz and David Carlisle. The epsfig package, February 1999. Available from CTAN, macros/latex/required/graphics (version: 1.7a, 1999-02-16).

[8] Will Robertson and Khaled Hosny. The fontspec package: Font selection for XeLaTeX and LuaLaTeX, June 2014. Available from CTAN, macros/latex/contrib/fontspec (version: 2.4a, 2014-06-21).

[9] Martin Scharrer. The mwe package, May 2012. Available from CTAN, macros/latex/contrib/mwe (version: 0.3, 2012-05-15).

[10] Joseph Wright. Answer to: Driver specification for hyperref and graphicx. TeX – LaTeX Stack Exchange, 2010. URL: http://tex.stackexchange.com/a/6949 (version: 2010-12-12).

⋄ Nicola L. C. Talbot
School of Computing Sciences
University of East Anglia
Norwich Research Park
Norwich NR4 7TJ
United Kingdom
N.Talbot (at) uea dot ac dot uk
http://theoval.cmp.uea.ac.uk/~nlct/
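The engine note suggested in the review above could look like the following sketch. It is an illustration only and does not reproduce kuralengtam3a.tex; the body text is made up, and Liberation Sans is assumed to be installed as a system font.

```latex
% NOTE: compile this file with XeLaTeX or LuaLaTeX, not with pdfLaTeX.
% The fontspec package used below needs one of those two engines.
\documentclass{article}
\usepackage{fontspec}
% A freely available font, assumed to be installed on the system.
\setmainfont{Liberation Sans}

\begin{document}
Sample text set in the selected system font.
\end{document}
```

Putting the note on the first line means that a reader who opens the file to see why compilation failed meets the explanation before any code.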

Book review: Let’s Learn LATEX, by S. Parthasarathy 322 TUGboat, Volume 35 (2014), No. 3

Die TeXnische Komödie 2–3/2014

Die TeXnische Komödie is the journal of DANTE e.V., the German-language TeX user group (http://www.dante.de). [Non-technical items are omitted.]

Die TeXnische Komödie 2/2014

Walter Entenmann and Andreas Entenmann, Zum Entwurf von Postern [On the creation of posters]; pp. 37–51
Thanks to the a0poster package by Kettl and Weiser one can create posters for scientific meetings using LaTeX's standard formatting commands. Conventions of corporate design can be incorporated without problems, if the logo files and colors are available. After a short introduction to the a0poster class this article provides some general insights on the design of posters for scientific conferences. We also present some little tricks to, e.g., create A4 test prints, convert the output format or slice the poster into printable pieces. Based on a specific example the different steps are described and bundled in a package.

Dominik Wagenführ, Registerhaltiger Satz mit LaTeX [Grid typesetting with LaTeX]; pp. 52–64
If one looks at a modern newspaper, one will likely see that in multi-column typesetting the adjacent lines are always at the same height. This property is called grid typesetting. While LaTeX does not offer this functionality out-of-the-box one may achieve good results with some manual interventions.

Ulrike Fischer, BibLaTeX-Variationen [BibLaTeX variations]; pp. 65–75
[Translation published in this issue of TUGboat.]

Uwe Ziegenhagen, Spendenbescheinigungen [Creating donation receipts using LaTeX, SQL and Python]; pp. 76–82
In my capacity as treasurer for the Cologne-based Dingfabrik "fab lab" one of my tasks is to create the annual donation receipts for all donors. The process in place until recently involved manual aggregation in Excel and manual creation of the receipts in MS Word, not a desirable way to go for a TeXie. This article describes how the forms were created from scratch with LaTeX and filled using an intelligent combination of Quicken, Excel, MySQL and Python.

Axel Kielhorn, Präsentationen mit Beamer [Presentations with Beamer]; pp. 83–93
The beamer document class offers an easy way to create presentations. Due to the numerous options and templates, getting started with Beamer may not seem that easy. A presentation example shows many of the available options and presents some of the challenges (and their solutions) a new Beamer user might face.

Axel Kielhorn, LaTeX für Nichtlateiner [LaTeX for non-Latinates]; pp. 94–98
For historical reasons, different operating systems use different character encodings. Windows uses CP-1252, Mac OS X MacRoman, the various Unix derivatives offer HPRoman8, CP-850 and ISO Latin 1, among others. But these are only sufficient for Western Europe and parts of the Americas; for central Europe one needs additional encodings.

Günter Partosch, Anforderungen an wissenschaftliche Abschlussarbeiten [Requirements for scientific theses]; pp. 94–98
The usual way to finish a course of studies in Germany is to write a thesis. Form, length and other parameters are usually defined not just by the university but also by the thesis supervisor. Additional requirements are introduced when the thesis is to fulfill good scientific work or to be published on the Internet. In this article it is shown how LaTeX can be successfully applied.

Die TeXnische Komödie 3/2014

Jacob Wiersma, Mehr Möglichkeiten mit Fußnoten [More options with footnotes]; pp. 6–13
For some time there have been packages that extend the limits of LaTeX's standard footnote algorithms. This article presents the bigfoot, manyfoot and footmisc packages and discusses a few suggestions for improvements.

Steve Zakrzowsky, Paket skmath für mathematische Formeln [The skmath package for mathematical formulas]; pp. 14–18
The skmath package was developed by Simon Sigurdhsson, who has also created a few special document classes. Writing mathematical expressions and equations can easily make documents confusing. The skmath package offers some extensions for the simple and intuitive entry of mathematical expressions.

Idris Samawi Hamid, DANTE summary report: Introducing Arabic-Latin Modern; pp. 19–46
The Oriental TeX project was initiated in 2006 to facilitate the development of high quality typography and typesetting of academic and scholarly texts that require the Arabic script, such as critical editions and monographs. Although support for the Arabic script in modern typesetting software has been slowly improving over the past decade or so, the situation is still very far behind the Latin script in terms of features, available high-quality typefaces, and layout-processing software. For academic and scholarly work, it's still very much a wilderness out there. A full solution to the problems of advanced Arabic-script typography and typesetting, particularly one based on OpenType and Unicode standards, is still some ways off.

Christine Römer, Mit etoc Inhaltsverzeichnisse anpassen [Adjusting tables of contents with etoc]; pp. 47–54
The new etoc package extends LaTeX's capabilities to create individual tables of contents. It is especially useful to create local tables of contents.

[Received from Herbert Voß.]

Les Cahiers GUTenberg issue 57 (2012)

Les Cahiers GUTenberg is the journal of GUTenberg, the French-language TeX user group (www.gutenberg.eu.org).

Thierry Bouche, Éditorial; pp. 3–4

Charles Bigelow, Histoire d'O, d'o et de 0 [Oh, oh, zero!]; pp. 5–53
Published in TUGboat 34:2.

La liste Typographie, Microtypographie digitale [Digital microtypography]; pp. 55–63
Since its inception, the French typographie mailing list has always devoted a large part of its discussions to microtypography, with special emphasis on non-alphabetic glyphs and complex constructs. It is thus no surprise that the design of digits has been regularly discussed, as well as the problem of having them cohabit with letters within typeset pages. Here, we extract from the list archives (sympa.inria.fr/sympa/arc/typographie) three discussion threads dealing with issues such as the shape or the width of digits, especially oldstyle figures.

[Received from Thierry Bouche.]

TeX Development Fund 2013–2014 report
TeX Development Fund committee

MetaPost 2: Numerical engines
Applicant: Taco Hoekwater, The Netherlands, http://tug.org/metapost.
Amount: US$2000; acceptance date: 2 Dec 2009 (completed 24 May 2011).
Implement better numerical handling in MetaPost, among other enhancements. An article about the initial MetaPost 2 project goals, by Hans Hagen and Taco Hoekwater, was published in TUGboat 30:3. MetaPost 1.802, included in TeX Live 2013, has support for several numeric representations, for example via the -numbersystem option.

Lineno and related updates
Applicant: Uwe Lueck, Germany, http://www.ctan.org/pkg/lineno.
Amount: US$1000; acceptance date: 17 Sep 2011.
For updates to the complex lineno package, and related efforts, such as factoring out functionality into separate packages.

XeTeX math and other updates
Applicant: Khaled Hosny, Egypt, http://www.ctan.org/pkg/xetex.
Amount: US$4000; acceptance date: 24 Apr 2012 (completed 25 Jul 2013).
For updates to the XeTeX engine, especially relating to OpenType math typesetting, and including updates as needed to LuaTeX to keep the engines in sync. Several important external libraries had been deprecated and needed to be replaced. Other areas of work include finding fonts and syncing xdvipdfmx with dvipdfmx, as well as handling general bug reports. A report on the completed work was given in TUGboat 34:2.

Dynamic library support in LuaTeX
Applicant: Luigi Scarso, Italy, http://www.luatex.org/swiglib.html
Amount: US$2000; acceptance date: 31 May 2013.
Support shared libraries in LuaTeX using SWIG (http://www.swig.org). Some libraries are already supported, e.g., mysql and graphicsmagick.

Metaflop: METAFONT via the web
Applicant: Marco Müller, Switzerland, http://www.metaflop.com.
Amount: US$1000; acceptance date: 20 Jun 2013 (completed 10 Aug 2014).
Enhance the Metaflop web application, which provides a graphical interface for adjusting Metafont parameters, with improvements to the underlying fonts, the preview mechanism, and the generation.

TeX Live for Android
Applicant: Clerk Ma, China, http://code.google.com/p/texlive-for-android.
Amount: US$2000; acceptance date: 26 Jun 2013.
Add a native editor and package manager GUI to the TeX Live for Android project. http://tug.org/tug2013/abstracts/ma.txt has more background.

Project Fandol: Free Chinese fonts and Russian-style math fonts
Applicants: Clerk Ma and Jie Su, China, http://code.google.com/p/fandol-font.
Amount: US$1000; acceptance date: 9 Aug 2013.
(Information below is from the applicants.) Most math books in China are produced by Founder Bookmaker. This system has used a set of Russian style math fonts for more than 30 years. These commercial fonts are designed with a unique encoding by Founder, and they cannot work in TeX or other programs.
We have a set of metal types which contain two Russian style fonts (serif and sans serif). By analyzing these metal types, we find Founder's fonts are derived from these fonts, and Founder only provided a serif version (we will provide these math fonts in both serif and sans serif). These metal types were imported from the U.S.S.R. in 1953.
We will trace the metal fonts to outlines (initially in EPS format). For more detailed adjusting, we will be using FontForge. Parts of our Chinese fonts are already processed in this workflow. For these Russian style fonts, we will also work in this way.

⋄ TeX Development Fund committee
http://tug.org/tc/devfund

2015 TeX Users Group election
Kaja Christiansen
for the Elections Committee

The positions of TUG President and nine members of the Board of Directors will be open as of the 2015 Annual Meeting, which will be held in July 2015 in Darmstadt, Germany.

The current President, Steve Peter, has stated his intention to step down, and the current Vice-President, Jim Hefferon, has stated his intention to run for President.

The directors whose terms will expire in 2015: Barbara Beeton, Karl Berry, Susan DeMeritt, Michael Doob, Taco Hoekwater, Ross Moore, Cheryl Ponchin, Philip Taylor, and Boris Veytsman.

Continuing directors, with terms ending in 2017: Kaja Christiansen, Steve Grathwohl, Jim Hefferon, Klaus Höppner, Arthur Reutenauer, David Walden.

The election to choose the new President and Board members will be held in Spring of 2015. Nominations for these openings are now invited.

The Bylaws provide that "Any member may be nominated for election to the office of TUG President/to the Board by submitting a nomination petition in accordance with the TUG Election Procedures. Election . . . shall be by written mail ballot of the entire membership, carried out in accordance with those same Procedures." The term of President is two years.

The name of any member may be placed in nomination for election to one of the open offices by submission of a petition, signed by two other members in good standing, to the TUG office at least two weeks (14 days) prior to the mailing of ballots. (A candidate's membership dues for 2015 will be expected to be paid by the nomination deadline.) The term of a member of the TUG Board is four years.

A nomination form follows this announcement; forms may also be obtained from the TUG office, or via http://tug.org/election.

Along with a nomination form, each candidate must supply a passport-size photograph, a short biography, and a statement of intent to be included with the ballot; the biography and statement of intent together may not exceed 400 words. The deadline for receipt of nomination forms and ballot information at the TUG office is 1 February 2015. Forms may be submitted by FAX, or scanned and submitted by e-mail to [email protected].

Ballots will be mailed to all members within 30 days after the close of nominations. Marked ballots must be returned no more than six (6) weeks following the mailing; the exact dates will be noted on the ballots.

Ballots will be counted by a disinterested party not affiliated with the TUG organization. The results of the election should be available by early June, and will be announced in a future issue of TUGboat as well as through various TeX-related electronic lists.

2015 TUG Election: Nomination Form

Only TUG members whose dues have been paid for 2015 will be eligible to participate in the election. The signatures of two (2) members in good standing at the time they sign the nomination form are required in addition to that of the nominee. Type or print names clearly, using the name by which you are known to TUG. Names that cannot be identified from the TUG membership records will not be accepted as valid.

The undersigned TUG members propose the nomination of:

Name of Nominee:
Signature:
Date:

for the position of (check one):
[ ] TUG President
[ ] Member of the TUG Board of Directors
for a term beginning with the 2015 Annual Meeting, July 2015.

1. (please print) / (signature) / (date)
2. (please print) / (signature) / (date)

Return this nomination form to the TUG office (forms submitted by FAX or scanned and submitted by e-mail will be accepted). Nomination forms and all required supplementary material (photograph, biography and personal statement for inclusion on the ballot) must be received¹ in the TUG office no later than 1 February 2015. It is the responsibility of the candidate to ensure that this deadline is met. Under no circumstances will incomplete applications be accepted.

[ ] nomination form
[ ] photograph
[ ] biography/personal statement

TeX Users Group
Nominations for 2015 Election
P. O. Box 2311
Portland, OR 97208-2311
U.S.A.
FAX: +1 815 301-3568

¹ Supplementary material may be sent separately from the form, and supporting signatures need not all appear on the same form.

TeX Users Group
Membership Form 2015
Promoting the use of TeX throughout the world.

address: P.O. Box 2311, Portland, OR 97208-2311 USA
phone: +1 503-223-9994
fax: +1 815-301-3568
email: [email protected]
web: http://www.tug.org

President: Steve Peter
Vice-President: Jim Hefferon
Treasurer: Karl Berry
Secretary: Susan DeMeritt
Executive Director: Robin Laakso

TUG membership rates are listed below. Please check the appropriate boxes and mail the completed form with payment (in US dollars) to the mailing address given above. If paying by credit/debit card, you may alternatively fax the form to the fax number given above or join online at http://tug.org/join.html. The web page also provides more information than we have room for here.

Status (check one): [ ] New member  [ ] Renewing member
[ ] Automatic membership renewal in future years. Will use given payment information; contact office to change/cancel.

Rate and amount:
[ ] Early bird membership for 2015: $85. After March 31, dues are $105.
[ ] Special membership for 2015: $55. You may join at this special rate ($75 after March 31) if you are a senior (62+), student, new graduate, or from a country with a modest economy. Please circle accordingly.
[ ] If financially feasible for you, please consider checking here to donate $30 (the difference from the regular membership rate): $30
[ ] Subscription for 2015 (non-voting): $110
[ ] Institutional membership for 2015: $500. Includes up to eight individual memberships and site-wide electronic access.
[ ] Don't ship any physical benefits (TUGboat, software): deduct $20. TUGboat and software are available electronically.

Purchase last year's materials:
[ ] TUGboat volume for 2014 (3 issues): $20
[ ] TeX Collection 2014: $10. DVD with proTeXt, MacTeX, TeX Live, CTAN.

Voluntary donations (more info at https://www.tug.org/donate.html):
[ ] General TUG contribution
[ ] Bursary Fund contribution
[ ] TeX Development Fund contribution
[ ] CTAN contribution
[ ] LaTeX contribution
[ ] LuaTeX contribution
[ ] MacTeX contribution
[ ] TeX Gyre fonts contribution

Total $

Tax deduction: The membership fee less $40 is generally deductible, at least in the US.
Multi-year orders: To join for more than one year at this year's rate (up to ten years, non-refundable), please multiply by the number of years desired.

Payment (check one): [ ] Payment enclosed  [ ] Visa  [ ] MasterCard  [ ] AmEx
Account Number:
Exp. date:
Signature:

Privacy: TUG uses your personal information only to send products, publications, notices, and (for voting members) official ballots. TUG does not sell or otherwise provide its membership list to anyone.

Name:
Department:
Institution:
Address:
City:
State/Province:
Postal code:
Country:
Email address:
Phone:
Fax:
Position:
Affiliation:

TUG Institutional Members

American Mathematical Society, Providence, Rhode Island
Aware Software, Inc., Midland Park, New Jersey
Center for Computing Sciences, Bowie, Maryland
CSTUG, Praha, Czech Republic
diacriTech, Chennai, India
Fermilab, Batavia, Illinois
Google, San Francisco,
IBM Corporation, T J Watson Research Center, Yorktown,
Institute for Defense Analyses, Center for Communications Research, Princeton, New Jersey
Marquette University, Department of Mathematics, Statistics and Computer Science,
Masaryk University, Faculty of Informatics, Brno, Czech Republic
MOSEK ApS, Copenhagen, Denmark
New York University, Academic Computing Facility, New York, New York
Springer-Verlag Heidelberg, Heidelberg, Germany
StackExchange, New York City, New York
, Computer Science Department, Stanford, California
Stockholm University, Department of Mathematics, Stockholm,
University College, Cork, Computer Centre, Cork, Ireland
Université Laval, Ste-Foy, Québec, Canada
University of Ontario, Institute of Technology, Oshawa, Ontario, Canada
, Institute of Informatics, Blindern, Oslo, Norway
University of Wisconsin, Biostatistics & Medical Informatics, Madison, Wisconsin
VTeX UAB, Vilnius, Lithuania

TeX Consultants

The information here comes from the consultants themselves. We do not include information we know to be false, but we cannot check out any of the information; we are transmitting it to you as it was given to us and do not promise it is correct. Also, this is not an official endorsement of the people listed here. We provide this list to enable you to contact service providers and decide for yourself whether to hire one.

TUG also provides an online list of consultants at http://tug.org/consultants.html. If you'd like to be listed, please see that web page.

Aicart Martinez, Mercè
Tarragona 102 4º 2ª
08015 Barcelona,
+34 932267827
Email: m.aicart (at) ono.com
Web: http://www.edilatex.com
We provide, at reasonable low cost, LaTeX or TeX page layout and typesetting services to authors or publishers world-wide. We have been in business since the beginning of 1990. For more information visit our web site.

Dangerous Curve
PO Box 532281
Los Angeles, CA 90053
+1 213-617-8483
Email: typesetting (at) dangerouscurve.org
Web: http://dangerouscurve.org/tex.html
We are your macro specialists for TeX or LaTeX fine typography specs beyond those of the average LaTeX macro package. If you use XeTeX, we are your microtypography specialists.
We take special care to typeset mathematics well.
Not that picky? We also handle most of your typical TeX and LaTeX typesetting needs.
We have been typesetting in the commercial and academic worlds since 1979.
Our team includes Masters-level computer scientists, journeyman typographers, graphic designers, letterform/font designers, artists, and a co-author of a TeX book.

Latchman, David
4113 Planz Road Apt. C
Bakersfield, CA 93309-5935
+1 518-951-8786
Email: david.latchman (at) texnical-designs.com
Web: http://www.texnical-designs.com
LaTeX consultant specializing in: the typesetting of books, manuscripts, articles, Word document conversions as well as creating the customized packages to meet your needs. Call or email to discuss your project or visit my website for further details.

Peter, Steve
+1 732 306-6309
Email: speter (at) mac.com
Specializing in foreign language, multilingual, linguistic, and technical typesetting using most flavors of TeX, I have typeset books for Pragmatic Programmers, Oxford University Press, Routledge, and Kluwer, among others, and have helped numerous authors turn rough manuscripts, some with dozens of languages, into beautiful camera-ready copy. In addition, I've helped publishers write, maintain, and streamline TeX-based publishing systems. I have an MA in Linguistics from Harvard University and live in the New York metro area.

Sievers, Martin
Klaus-Kordel-Str. 8, 54296 Trier, Germany
+49 651 4936567-0
Email: info (at) schoenerpublizieren.com
Web: http://www.schoenerpublizieren.com
As a mathematician with ten years of typesetting experience I offer TeX and LaTeX services and consulting for the whole academic sector (individuals, universities, publishers) and everybody looking for a high-quality output of his documents. From setting up entire book projects to last-minute help, from creating individual templates, packages and citation styles (BibTeX, biblatex) to typesetting your math, tables or graphics: just contact me with information on your project.

Sofka, Michael
8 Providence St.
Albany, NY 12203
+1 518 331-3457
Email: michael.sofka (at) gmail.com
Skilled, personalized TeX and LaTeX consulting and programming services.
I offer over 25 years of experience in programming, macro writing, and typesetting books, articles, newsletters, and theses in TeX and LaTeX: Automated document conversion; Programming in Perl, C, C++ and other languages; Writing and customizing macro packages in TeX or LaTeX; Generating custom output in PDF, HTML and XML; Data format conversion; Databases.
If you have a specialized TeX or LaTeX need, or if you are looking for the solution to your typographic problems, contact me. I will be happy to discuss your project.

Veytsman, Boris
46871 Antioch Pl.
Sterling, VA 20164
+1 703 915-2406
Email: borisv (at) lk.net
Web: http://www.borisv.lk.net
TeX and LaTeX consulting, training and seminars. Integration with databases, automated document preparation, custom LaTeX packages, conversions and much more. I have about eighteen years of experience in TeX and three decades of experience in teaching & training. I have authored several packages on CTAN, published papers in TeX related journals, and conducted several workshops on TeX and related subjects.

Young, Lee A.
127 Kingfisher Lane
Mills River, NC 28759
+1 828 435-0525
Email: leeayoung (at) morrisbb.net
Web: http://www.thesiseditor.net
Copyediting your .tex manuscript for readability and mathematical style by a Harvard Ph.D. Your .tex file won't compile? Send it to me for repair. Experience: edited hundreds of ESL journal articles, economics and physics textbooks, scholarly monographs, LaTeX manuscripts for the Physical Review; career as professional, published physicist.

TUG 2015
Darmstadt, Germany
July 20–22, 2015
http://tug.org/tug2015

Calendar

2014

Nov 8–9: The Twelfth International Conference on Books, Publishing, and Libraries, "Disruptive Technologies and the Evolution of Book Publishing and Library Development", Simmons College, Boston, Massachusetts. booksandpublishing.com/the-conference

Nov 13–14: The Printing Historical Society's 50th Anniversary, "Landmarks in Printing: from origins to the digital age", St Bride Institute, London, UK. printinghistoricalsociety.org.uk/forthcoming_phs_events/#144

Nov 14: TYPO Day, "Business Typography Talks", München, Germany. typotalks.com/day/muenchen-2014

Nov 28–29: 5th Meeting of Typography, "Ubiquitous", Escola Superior de Tecnologia – IPCA, Barcelos, Portugal. www.atypi.org/events/5th-meeting-of-typography

Dec 13: Day, Museum of Printing, North Andover, Massachusetts. www.museumofprinting.org

2015

Feb 1: TUG election: nominations due. tug.org/election

Mar 6: TUGboat 36:1, submission deadline.

Mar 7–9: Typography Day 2015, "Typography, Sensitivity and Fineness", Industrial Design Center, Indian Institute of Technology, Bombay, India. www.typoday.in

Mar 9: TUGboat 36:1, submission deadline (regular issue).

Mar 19–21: "Publish or Perish? Scientific periodicals from 1665 to the present". The Royal Society, London, UK. royalsociety.org/events/

Apr 16–19: DANTE Frühjahrstagung and 52nd meeting, Stralsund, Germany. www.dante.de/events.html

Apr 29 – May 3: BachoTeX 2015: 23rd BachoTeX Conference, Bachotek, Poland. www.gust.org.pl/bachotex

Apr 30 – May 1: TYPO San Francisco, Yerba Buena Center for the Arts, San Francisco, California. typotalks.com/sanfrancisco

May 21–23: TYPO Berlin 2015, "Character", Berlin, Germany. typotalks.com/berlin

Jun 29 – Jul 3: Digital Humanities 2015, Alliance of Digital Humanities Organizations, "Global Digital Humanities", Sydney, Australia. dh2015.org

Jul 7–10: SHARP 2015, "The Generation and Regeneration of Books". Society for the History of Authorship, Reading & Publishing, Longueuil/Montreal, Canada. www.sharpweb.org

TUG 2015, Darmstadt, Germany.
Jul 20–22: The 36th annual meeting of the TeX Users Group. tug.org/tug2015

Jul 31: TUGboat 36:2, submission deadline (proceedings issue).

Aug 9–13: SIGGRAPH 2015, "Xroads of Discovery", Los Angeles, California. s2015.siggraph.org

Aug 24–28: SHARP 2015, Society for the History of Authorship, Reading & Publishing, Jinan, Shandong Province, China. www.sharpweb.org

Oct 19–20: The Thirteenth International Conference on Books, Publishing, and Libraries, University of British Columbia, Vancouver, Canada. booksandpublishing.com/the-conference-2015

Status as of 20 October 2014

For additional information on TUG-sponsored events listed here, contact the TUG office (+1 503 223-9994, fax: +1 815 301-3568, e-mail: [email protected]). For events sponsored by other organizations, please use the contact address provided.

A combined calendar for all user groups is online at texcalendar.dante.de. Other calendars of typographic interest are linked from tug.org/calendar.html.

TUGBOAT Volume 35 (2014), No. 3

Introductory

231 Barbara Beeton / Editorial comments
    • typography and TUGboat news
244 Charles Bigelow / A letter on the persistence of (e)books
    • the Kindle, Nook, Sony Reader, and Pierre-Simon Fournier
232 Donald Knuth / A footnote about 'Oh, oh, zero'
    • notes on early typesetting of computer programs by ACM and Addison-Wesley
274 Gerd Neugebauer / CTAN goes multi-lingual: Additional language support for the Web portal
    • making ctan.org available in German, as an experiment
230 Steve Peter / Ab epistulis
    • upcoming election, conferences, TUGboat, book reviews
276 Basil Solomykov / Obyknovennaya Novaya (Ordinary New Face) in METAFONT
    • a reworking of a famous Cyrillic typeface, in several sizes and shapes

Intermediate

315 Karl Berry / The treasure chest
    • new CTAN packages, March–September 2014
256 Ulrike Fischer / biblatex variations
    • biblatex as a database: QR codes, PDF attachments, address lists
235 Twenty Questions for Donald Knuth
    • to celebrate the publication of TAOCP as eBooks
284 Bob Tennent / Visual editing (in a specialized case): prerex
    • a useful application for visual editing: charts of course prerequisites
245 Thomas Thurnherr / LaTeX document class options
    • options for the standard classes, and packages extending similar functionality
269 Peter Wilson / Glisterings: Lining up
    • ruling off; marginal rules; preventing an awkward page break; not at a page break; line backing; linespacing

Intermediate Plus

248 Frank Mittelbach / How to influence the position of float environments like figure and table in LaTeX?
    • explaining and working with the LaTeX float placement algorithm
287 Frank Mittelbach, Will Robertson, LaTeX3 team / l3build: A modern Lua test suite for TeX programming
    • regression testing for LaTeX, including typeset output
309 Daniel Stender / A Citation Style Language (CSL) workshop
    • introduction to this XML-based language for programming citation and bibliography styles
261 David Walden / Every LaTeX document brings new programming issues
    • practical approaches for ellipses, blank verso pages, and photo album layout

Advanced

255 Barbara Beeton / Placing a full-width insert at the bottom of two columns
    • even on the first page of an article
277 Yannis Haralambous / A simple Arabic typesetting system for mixed Latin/Arabic documents: ḍād
    • supporting both transliteration and direct Unicode input of Arabic, using ligatures
294 Taco Hoekwater / MetaPost path resolution isolated
    • new interface in MPlib 1.800 for resolving paths from external programs
297 Udo Wermuth / Typeset MMIX programs with TeX
    • a TeX macro package to typeset (M)MIX(AL) programs

Contents of other TeX journals

322 Die TeXnische Komödie 2–3/2014; Les Cahiers GUTenberg 56 (2012)

Reports and notices

317 William Adams / Book review: Practical LaTeX, by George Grätzer
    • review of this introductory text on document production with LaTeX
318 Jacques André / Book review: Apprendre à programmer en TeX, by Christian Tellechea
    • review of this book in French on TeX as a programming language
319 Michael Berry / Book review: The Imitation Game, by Jim Ottaviani and Leland Purvis
    • review of this graphic novel about the life of Alan Turing
320 Nicola Talbot / Book review: Let's Learn LaTeX, by S. Parthasarathy
    • review of this free ebook intended to assist learning LaTeX by example
323 TeX Development Fund committee / TeX Development Fund 2013 report
324 TUG Election committee / TUG 2015 election
325 TUG membership form
326 Institutional members
326 TeX consulting and production services
327 TUG 2015 announcement
328 Calendar