Mlbibtex: Reporting the Experience∗
Total Page:16
File Type:pdf, Size:1020Kb
MlBibTEX: Reporting the experience∗ Jean-Michel Hufflen LIFC (EA CNRS 4157) University of Franche-Comté 16, route de Gray 25030 BESANÇON CEDEX France hufflen (at) lifc dot univ-fcomte dot fr http://lifc.univ-fcomte.fr/~hufflen Abstract This article reports how the different steps of the MlBibTEX project were con- ducted until the first public release. We particularly focus on the problems raised by reimplementing a program (BibTEX) that came out in the 1980’s. Since that time, implementation techniques have evolved and new requirements have ap- peared, as well as new programs within TEX’s galaxy. Our choices are explained and discussed. Keywords TEX, LATEX, BibTEX, reimplementation, reverse engineering, im- plementation language, program update. Streszczenie Artykuł omawia realizację poszczególnych kroków przedsięwzięcia MlBibTEX, w czasie do przedstawienia pierwszej publicznej wersji. W szczególności skupiamy się na problemach powstałych przy reimplementacji programu (BibTEX), powsta- łego w latach 80 zeszłego wieku. Od tego czasu rozwinęły się techniki implemen- tacyjne, powstały nowe wymagania oraz nowe programy w świecie TEX-owym. Przedstawiamy i dyskutujemy dokonane wybory. Słowa kluczowe TEX, LATEX, BibTEX, reimplementacja, reverse engineering, język implementacji, aktualizacja programu. 0 Introduction aim to provide new programs, based on TEX & Co.’s ideas.2 A first representative example is the LAT X 3 In 2003, TEX’s 25th anniversary was celebrated at E 1 project [32], a second is N S [27]. the TUG conference, held in Hawaii [1]. LATEX T MlBibT X — for ‘MultiLingual BibT X’ — be- [28] and BibTEX [35] — the bibliography processor E E longs to the class of such projects. Let us recall usually associated with the LATEX word processor — are more recent, since they came out in the 1980’s, that this program aims to be a ‘better BibTEX’, especially regarding multilingual features. For an shortly after TEX. All are still widely used, such longevity being exceptional for software. However, end-user, MlBibTEX behaves exactly like ‘classical’ these programs are aging. Of course, recent ver- BibTEX: it searches bibliography data base (.bib) sions have incorporated many features absent from files for citation keys used in a document and then the first versions, which proves the robustness of arranges the references found, writing them to a .bbl A these systems. Nevertheless, they present some lim- file suitable for LTEX, w.r.t. a bibliography style. 3 4 itations due to the original conception, and a major MlBibTEX is written in Scheme, it uses XML as a reimplementation may be needed to integrate some modern requirements. In addition, interactive word 2 Concerning TEX, an additional point is that TEX’s de- processors have made important progress and are se- velopment has been frozen by its author, Donald E. Knuth [26]. If incorporating new ideas to a ‘new TEX’ leads to a ma- rious rivals, even if they do not yield typesetting of jor reimplementation, the resulting program must be named such professional quality. That is why some projects differently. 3 The version used is described in [24]. ∗ Title in Polish: MlBIBTEX: raport z doświadczeń. 4 EXtensible Markup Language. Readers interested in 1 TEX Users Group. an introductory book to this formalism can consult [37]. TUGboat, Volume 29, No. 1 — XVII European TEX Conference, 2007 157 Jean-Michel Hufflen central format: when entries of .bib files are parsed, ming language, higher than C. So we consider they result in an XML tree. Bibliography styles tak- again the prototype in Scheme that we sketched ing advantage as far as possible of MlBibTEX’s new in 2002. SXML7 [25] is chosen as the represen- features are written using nbst,5 a variant of XSLT6 tation of XML texts in Scheme. Some parts of described in [15]. The stack-based bst language [34] MlBibTEX are directly reprogrammed from C used for writing bibliography styles of BibTEX can to Scheme. As for the other parts, this proto- be used in a compatibility mode [20]. type is a good basis for much experiment [16]. We think that the experience we have gained Nov. 2004 The version written in C is definitely in developing MlBibTEX may be useful for other, dropped, whereas the version in Scheme is mod- analogous, projects. To begin, we briefly review the ified to improve efficiency; it becomes the ‘offi- chronology of this development. As will be seen, cial’ MlBibTEX [18]. this development has not been linear, and the two following sections focus on the problems we had to Sep. 2005 We decided to freeze MlBibTEX’s design face. We explain how we have determined which and concentrate only on finishing programming. Many Scheme functions are rewritten in confor- criteria are accurate when a programming language 8 is to be chosen for such an application. Then we mity to SRFIs [39]. show how the compatibility with ‘old’ data and the May 2006 A working version is almost finished, integration of modern features should be managed. except for the interface with the kpathsea li- brary. 1 MlBibTEX’s chronology May 2007 Public availability of MlBibTEX’s Ver- Oct. 2000 MlBibTEX’s design begins: the syntax sion 1.3. of .bib files is enriched with multilingual anno- Let us also explain that MlBibT X is not our tations. Version 1.1’s prototype is written using E only task. As an Assistant Professor in our univer- the C programming language and tries to reuse sity, we teach computer science, and participate in parts of ‘old’ BibT X’s program as far as pos- E other projects. As a consequence, MlBibT X’s de- sible. E velopment has been somewhat anarchic: we hardly May 2001 The first article about Ml T X is [9]. Bib E worked on it for two or three months, put it aside Later, the experience of developing Ml T X’s Bib E for one or two months, and so on. Last, we have su- Version 1.1 is described in [10]. pervised some student projects regarding graphical May 2002 After discussions with participants at tools around MlBibTEX [2, 8], programmed using the EuroBachoTEX conference, we realise that Ruby [31], but concerning the development of the the conventions for bibliography styles are too MlBibTEX program itself, we have done it alone. diverse, even if we consider only those of Eu- ropean countries. We realise that this first ap- 2 Choice of an implementation language proach is quite unsuitable, without defining a There are several programming paradigms: impera- new version of the bst language. So we decide tive, functional, and logic programming. There are to explore two directions. First, we develop a also several ways to implement a programming lan- questionnaire about problems and conventions guage: interpretation and compilation. Some par- concerning bibliography styles used within Eu- adigms are more appropriate, according to the do- ropean countries. Second, we begin a prototype main of interest. Likewise, some interpreted lan- in Scheme implementing the bst language [11]. guages are more appropriate if you want to pro- Initially, this prototype is devoted to experi- gram a prototype quickly and are just interested in ments about improving bst in a second version, performing some experiment.9 But compiled lan- 1.2. guages are often preferable if a program’s efficiency Jan. 2003 Version 1.2 is stalled. The new version is crucial. In addition, the level of a programming (1.3) is based on XML formats. The nbst lan- language has some influence on development: in a guage is designed and presented at [12, 13]. We high-level language, low-level details of structures’ explain in [14] how the results of our question- implementation do not have to be made explicit, so naire have influenced this new direction. Feb. 2004 It appears to us that MlBibTEX should 7 Scheme implementation of XML. be developed using a very high-level program- 8 Scheme Requests for Implementation, an effort to coor- dinate libraries and other additions to the Scheme language 5 New Bibliography STyles. between implementations. 6 eXtensible Stylesheet Language Transformations, the 9 Such is the case for the two graphical tools around language of transformations used for XML documents [44]. MlBibTEX programmed in Ruby by our students [2, 8]. 158 TUGboat, Volume 29, No. 1 — XVII European TEX Conference, 2007 MlBibTEX: Reporting the experience development is quicker, and the resulting programs implementation of low-level functionalities. are more concise, nearer to a mathematical model. Finally, our choice was Scheme, the modern di- In addition to these general considerations, let alect of Lisp. We confess that we are personally us recall that we aim to replace an existing program attracted by functional programming languages, be- by a new one. This new program is supposed to cause they can abstract procedures as well as data: do better than the ‘old’ one. ‘To do better’ may in this sense, they are very high-level programming mean ‘to have more functionalities, more expressive languages. Concerning Scheme, it seems to us to be power’, but for sake of credibility, it is desirable for undebatable that it has very good expressive power, the new program to be as efficient as the ‘old’ one. and takes as much advantage as possible of lexical Let us not forget that TEX and BibTEX are written scoping. In addition, it allows some operations to be using an old style of programming — more precisely, programmed ‘impurely’, by side effects, as in imper- a monolithic style used in the 1970’s–1980’s — based ative programming, in order to increase efficiency. mainly on global variables, without abstract data However, we use this feature parsimoniously, on lo- types. Choosing a language implemented efficiently cal variables, since it breaks the principles of func- is crucial: as a counter-example, NT S, written us- tional programming.