Bijlage U The LATEX2HTML translator: An Overview 61

The LATEX2HTML translator: An Overview

Herbert W. Swan [email protected]

LATEX2HTML is, as its name implies, a translator which If you are using SDBM or GDBM, use must use Perl converts a standard LATEX document into Hypertext version 5. Markup Language (HTML), for incorporation into the LATEX2HTML can translate both LATEX2ε documents, • World-Wide Web. Like LATEX, it is freely available soft- as well as the older LATEX 2.09 documents. However, if ware, supported by highly dedicated volunteers. Unlike you wish to use all of the features of the LATEX2HTML LATEX, it is currently available only on platforms. translator, you should upgrade your LATEX program it- Complete documentation concerning this software may self to the newer version 2e. be browsed from . • tions, ®gures and arbitrary environments into GIF im- A European mirror is reported at .From UNIX utilities available and working: this online document, a PostScript version of the complete dvips version 5.516 or later or dvipsk. It is also rec- document may easily be obtained for hardcopy viewing. • ommended that your installation of dvips permits This article (the one you are reading) represents a conden- users to generate their own fonts through Metafont sation of that more complete documentation. and the MakeTeXPK script. This will cause equa- tions and other GIF text to be much more screen- LAT X2HTML replicates the basic structure of a LAT X E E readable. document as a set of interconnected HTML ®les which can Ghostscript version 2.6.1 or later, with the ppmraw be explored using automatically generated navigation pan- • device driver built in. (To see whether your version els. The cross-references, citations, footnotes, the table-of- of Ghostscript contains this driver, type contents and the lists of ®gures and tables, are also trans- gs lated into hypertext . Formatting information which devicenames == has equivalent ªtagsº in HTML (lists, quotes, paragraph- quit breaks, type-styles, etc.) is also converted appropriately. The pbmplus or netpbm library. Some of the im- The remaining heavily formatted items such as mathemat- • age ®lters in these libraries are used during the Post- ical equations, pictures or tables are converted to images Script to GIF image conversion (described below). which are placed automatically at the correct positions in If you dislike the white background color of the the ®nal HTML document. • generated inlined images then you should get ei- LATEX2HTML also extends LATEX by supporting arbi- ther the netpbm library (instead of the older pbm- trary hypertext links and symbolic cross-references be- plus) OR install the giftrans ®lter by Andreas tween evolving remote documents. It allows the speci®- Ley . Ver- cation of conditional text and the inclusion of raw HTML sion 1.10.2 is known to work without problems, but commands. These hyper-media extensions to LATEXare later versions should also be OK. available as new commands and environments from within aLATEX document. Obtaining LATEX2HTML System Requirements One way LATEX2HTML may be obtained is through one Before you consider obtaining LATEX2HTML, you should of the three primary Comprehensive TEX Archive Network ®rst ensure that your UNIX system includes the following (CTAN) sites nearest you. They are located in the United utilities: States , the United Kingdom , and Germany .It • written in the Perl language, this utility is essential. can be found under the tex-archive/support/ Since Perl version 5 makes more ef®cient use of dy- latex2html directory. namic memory, it is recommended over version 4. The CTAN version will always be the latest stable revi- Your UNIX system must support some form of keyed sion of LAT X2HTML, currently V96.1-b. However, it • database management system, such as DBM, NDBM, E will not always contain the latest patches submitted by the SDBM, or GDBM. To ®nd out whether your system user community. If you wish to be on the leading edge, supports one of these standards, do a ªman dbm,º etc. Bijlage U The LATEX2HTML translator: An Overview 62 you are better off obtaining the program from (European mirror . This directory will contain all of the recent process ordinary text adequately, it fails to ª®ndº LATEX2HTML releases, including the latest major release, the images to process. One common cause of this 96.1, dated January of 1996. After downloading the ®le problem is that the ppmraw device is not built latex2html-96.1.tar.gz, you will have to un- into Ghostscript. Another cause involves the TEX- zip it with ªgzip -d,º and extract the ®les with ªtar xvf INPUTS environment variable. During normal op- latex2html-96.1.tar.º You should then proceed to down- eration, LATEX2HTML creates a subdirectory un- load the latest revision (currently Rev. e, dated April 8, der the current one to contain the translated HTML 1996), latex2html-96.1.reve.tar.gz. Simply unzip and ex- ®les. It is from this subdirectory that LATEXand tract this ®le on top of the original release ®le for the latest dvips are run during image conversion. If the patches. It will include .diff ®les which will show you TEXINPUTS environment variable that is visible what patches were made since the previous major release. to these programs does not include ª..º, then im- age conversion will fail. At some sites, ªlatexº and ªdvipsº are really scripts which set TEXINPUTS Installation and some other environment variables and call te Once the LATEX2HTML source ®les are obtained, installa- actual programs. If this is the case, the the user tion consists of: including ª..º in his or her TEXINPUTS will not 1. Editing the Perl scripts, to tell the system where the perl solve the problem. translator resides; Database problems Sometimes LATEX2HTML will sim- 2. Editing the latex2html.config script to tell ply die if the Perl interface to the UNIX database LATEX2HTML where the necessary UNIX utility pro- routines is not working. In Linux systems, this grams reside; problem is typically solved by uncommenting the 3. Running the install-test script, which mainly veri®es ªuse GDBM Fileº line in the LATEX2HTML and the previous two steps; install-test scripts. On other systems it may be nec- 4. Making local copies of the LATEX2HTML icons so that essary to recompile Perl with the proper database li- they are accessible to your WWW server. The instal- braries. lation variable $ICONSERVER in latex2html.- File globbing failure This problem manifests itself by config must then be set to this location. LATEX2HTML creating only some temporary ®les, but no valid HTML ®les. This problem can be Warnings: If you cannot do that bear in traced to the inability of Perl to locate the csh pro- mind that these icons will have to travel gram. Look for a line similar to ªcsh='csh'ºin from Livermore, California!!! Also note the Perl Config.pm ®le. If it is absent, recon®g- that several more icons were added that ure Perl . were not present in previous versions of LATEX2HTML. Normal Operation 5. Customizing the installation, as necessary. This is ac- A complished by setting various ªinstallation variablesº To use L TEX2HTML simply type: ªlatex2html (described later) in the ®le latex2html.config <®le>.tex.º The ª.texº suf®x is optional and will be sup- Individual users may also override the system-wide plied by the program if omitted by the user. This will cre- con®guration, either globally for all their documents, ate a new directory called <®le> which will contain the for separately for individual documents. generated HTML ®les, some log ®les and possibly some For a ªper userº initialization ®le, copy the ®le images. To view the result use an HTML viewer such as dot.latex2html-init in the home directory of NCSA or Navigator on the main HTML file/file. any user that wants it, modify it according to her pref- ®le which is . This ®le will contain erences and rename it as .latex2html-init.At navigation links to the other parts of the generated docu- runtime, both the latex2html.config ®le and ment. $HOME/.latex2html-init ®le will be loaded, The LATEX2HTML script includes a short manual which but the latter will take precedence. can be viewed by typing ªnroff -man latex2html.º You can also set up a ªper directoryº initialization ®le If a GIF image (of an equation or a ®gure, etc.) is used by copying a version of .latex2html-init in more than once in the document, it will be generated only each directory you would like it to be effective. An ini- once. Furthermore, on subsequent calls to LAT X2HTML tialization ®le /X/Y/Z/.latex2html-init will E on the same document, most images need not be regener- take precedence over all other initialization ®les if /X/ ated at all, but are recycled from one run to the . Their Y/Z is the ªcurrent directoryº when LATEX2HTML is names might change, but their contents do not. The only invoked. exception are images which are order-sensitive, meaning Bijlage U The LATEX2HTML translator: An Overview 63 that their content depends upon their location. Examples -up url Speci®es a universal resource locator of order-sensitive images are equation and eqnarray (URL) to associate with the ªUPº button in the nav- environments. This is because their ®gure numbers are part igation panel(s). of the image. Figures and tables with captions, on the other -up title Speci®es a title associated with this hand, are order-insensitive because the ®gure numbers are URL. handled by HTML and are not part of the image itself. -prev url Speci®es a universal resource locator (URL) to associate with the ªPREVIOUSº button For the most part, LAT X2HTML acts as expected, process- E in the navigation panel(s). ing most of the commands standard to LAT X. \ref and E -prev title Speci®es a title associated with this \label pairs are translated as as anchor/hyper-references URL. in the HTML document. LAT Xlists are converted to E -down url Speci®es a URL for the ªNEXTº HTML lists. Footnotes are collected onto one page and button in the navigation panel(s). displayed as hypertext. Tables and certain inline equations -down title Speci®es a title associated with are translated to their HTML equivalents, providing the se- this URL. lected HTML version supports these constructs. -contents Speci®es a URL for the ªTABLE OF CONTENTSº button, for document segments that Command-line options would not otherwise have one. It is possible to customize the output from LATEX2HTML -index Speci®es a URL for the ªINDEXº but- using a number of command-line options with which you ton, for document segments that otherwise would can specify how to break up the document, where to put not have an index. the generated ®les, what the title is, what the signature -external ®le <®lename-pre®x> Speci®es the pre®x of at the end of each page is, what to put in the navigation the .aux ®le that this document segment should panel, what kind of extra information to include about the read. The .aux extension will be appended to this document, whether to retain the original LATEX section- pre®x. This ®le is necessary for document segments numbering scheme, the version (below) of HTML to gen- to obtain ®gure and section numbers from LATEX. erate, etc. -dir Redirect the output to this di- The command-line options described below can be used to rectory. -no subdir Place the generated HTML ®les in the cur- change the default behaviour of LATEX2HTML. Alterna- tively the corresponding environment variables in the ini- rent directory. The default behaviour is to create (or tialization ®le .latex2html-init may be changed, in reuse) another ®le directory. order to achieve the same result. -pre®x <®lename-pre®x> The <®lename-pre®x> will -split Stop splitting sections into separate ®les at be prepended to all .gif, .pl and .html ®les this depth. A value of 0 will put the document into produced, except for the top-level .html ®le. This A a single HTML ®le. The default is 8. will enable multiple products of L TEX2HTML to -link Stop revealing child nodes at each peacefully coexist in the same directory. However, node at this depth. (A node is a \part or do not attempt to simultaneously run multiple in- A \chapter or \section or \subsection or stances of L TEX2HTML using the same output di- \subsubsection etc.). A value of 0 will show rectory, else various temporary ®les will overwrite no links to child nodes, a value of 1 will show only each other. the immediate descendents, etc. A value at least as -font size This option provides better control big as that of the -split option will produce a ta- over the font size of environments processed by A ble of contents for the tree structure, rooted at each L TEX. must be one of the font sizes that A given node. The default is 4. L TEX recognizes; i.e. 10pt, 11pt, 12pt,etc. A -nolatex Disable the mechanism for passing unknown The default size is the same as the L TEX document. Whatever size is selected, it will be magni®ed by environments to LATEX for processing. This can be thought of as ªdraft modeº which allows faster the installation variables $MATH SCALE FACTOR translation of the basic document structure and text, or $FIGURE SCALE FACTOR. without fancy ®gures, equations or tables. -no tex defs If speci®ed, the translator will make no at- This option has been superseded by the tempt to interpret raw TEX commands. This feature -no images option (see below). will enable sophisticated authors the ability to in- -external images Instead of including any generated im- sert arbitrary TEX commands in environments that A ages inside the document, leave them outside the are destined to be processed by L TEXanyway;e.g. document and provide hypertext links to them. ®gures, theorems, etc. - mode Use only ascii characters and do not include -ps images Use links to external PostScript images rather any images in the ®nal output. In ascii mode the out- than inlined GIF images. put of the translator can be used on character-based -address Sign each page with this browsers which do not support inlined images (the address. HTMLIMG tag). -no navigation Disable the mechanism for putting navi- -t Name the document using this title. gation links in each page. Bijlage U The LATEX2HTML translator: An Overview 64

-top navigation Put navigation links at the top of each tles must be unique and must not contain inlined page. graphics. -bottom navigation Put navigation links at the bottom of -html version This speci®es the HTML each page as well as the top. version (See below). to generate. Currently, ver- -auto navigation Put navigation links at the top of each sions 2.0, 2.1, 2.2, 3.0 and 3.1 are available. The page. Also put one at the bottom of the page if the default version is 2.0. page exceeds $WORDS IN PAGE number of words -vs Print the current version of LATEX2HTML. (default = 450). -debug Run in debug mode, using the Perl debugger, if -index in navigation Put a link to the index-page in the it is installed. navigation panel if there is an index. -h Print out the list of options. -contents in navigation Putalinktothetable-of- contents in the navigation panel if there is one. -next page in navigation Putalinktothenext logical page in the navigation panel. -previous page in navigation Putalinktotheprevious logical page in the navigation panel. HTML version -info Generate a new section About this docu- ment... containing information about the document The Hypertext Mark-up Language is an evolving standard, being translated. The default is to generate such a with different versions supporting different features. In or- section with information on the original document, der to make your documents viewable by the widest pos- the date, the user and the translator. An empty string sible audience, you should use the most basic HTML ver- (or the value 0) disables the creation of this extra sion in common usage. This is currently version 2.0 which A section. If a non-empty string is given, it will be is the default version for L TEX2HTML. It supports im- placed in the contents of the About this document... ages, server-side image maps, interactive forms, and min- section instead of the default information. imal typographic elements (bold, italic and teletype). This -reuse The speci®es version adheres to the ISO±Latin±1 (ISO±8879) character A how or whether image ®les are to be shared or re- set. Other HTML versions supported by L TEX2HTML cycled, and accepts three valid options: are: 0 Do not ever share or recycle image ®les. Version 2.1 HTML version 2.1 is essentially identical to This choice also invokes an interactive ses- HTML version 2.0, with some extensions for inter- sion prompting the user about what to do nationalization. Most importantly, the default char- about a pre-existing HTML directory, if it acter set is no longer ISO±8859±1 but ISO±10646, exists. commonly known as Unicode. This is a 16-bit char- 1 Recycle image ®les from a previous run if acter set and can thus display a much larger set of they are available, but do not share identical characters. images that must be created in this run. Version 2.2 This version supports all the features of ver- 2 Recycle image ®les from a previous run and sion 2.1, plus the HTML3 Table Model. This fea- share identical images from this run. (This is ture is already available in many HTML browsers, the default). including V1.2 and later. Version 3.0 This version adds to version 2.2 some of the -no reuse Do not share or recycle images generated dur- HTML 3.0 textual formatting features, including ing previous translations. This is equivalent to centering, ¯ush right, ¯ush left and underlining. -reuse 0. (This will enable the initial interactive Version 3.1 LATEX2HTML revision ªdº and later support session during which the user is asked whether to HTML subscripts and superscripts, which may be reuse the old directory, delete its contents or quit.) browsed by Navigator V2 and later. -init ®le <®le> Load the speci®ed ®le. This Version 3.2 In addition to all of the features of ver- Perl ®le will be loaded after loading sion 3.1, this version adds support for math mark- $HOME/.latex2html-init, if one exists. It up. Currently the only browsers which can dis- can be used to change the default options. play this mark-up are (http://www.w3. -no images Do not attempt to produce any inlined im- org/hypertext/WWW/Arena)andUdiWWW ages. The missing images can be generated ªoff- for Windows 95 (http://www.uni-ulm.de/ lineº by restarting LATEX2HTML with the option Ärichter/udiwww.) -images only. -images only Try and convert any inlined images that Very few native TEX commands are supported. The ones that are include: \newdimen, \newbox, \vskip, were left over from previous runs of LATEX2HTML. -show section numbers Show section numbers. By de- \hskip, \char, and some limited forms of \def.Fur- fault the section numbers are not shown in order to thermore, \def and \renewcommand may not behave encourage the use of particular sections as stand- as expected, if they de®ne the same entity more than once alone documents. In order to be shown, section ti- in the same document. Furthermore, macro de®nitions cannot contain any verbatim-like environments. Bijlage U The LATEX2HTML translator: An Overview 65

Internationalization 1. Substituting the \externalref command for the A special variable $LANGUAGE TITLES in the initializa- \ref command, and tion or con®guration ®les determines the language in which 2. Including an \externallabels command some- some section titles will appear. For example setting it to where in the document to may a correspondence be- $LANGUAGE_TITLES = ``french''; tween the URL of the target document and its LATEX2- will cause LATEX2HTML to produce ªTable des matiÁeresº HTML table of symbols. Such a symbol table is auto- instead of ªTable of Contentsº. Furthermore, the value of matically generated by LATEX2HTML for every docu- \3rd June 1996 is presented in a format which is cus- ment. tomary in that language. Still another way of cross referencing text, either within The only languages currently supported are french, or between documents, is with the \htmlref command. english and german, but it is trivial to add support for This macro is similar to \htmladdnormallink, except latex2html.config another language in the ®le .As A a guide here is the entry for the French titles: that the second argument is replaced by a symbolic L TEX \label sub french_titles { label, de®ned by . If the label not found within the $toc_title = "Table des mati\\`eres"; current document, it may instead appear in any of the ex- $lof_title = "Liste des figures"; ternal LATEX documents listed in an \externallabels $lot_title = "Liste des tableaux"; command. There is also a \hyperref construction $idx_title = "Index"; which produces a different wording of the link in the $bib_title = "R\\'ef\\'erences"; printed version of the document than in the electronic one. $info_title = "\\`A propos de ce document..."; If a portion of the text is to appear only in the HTML $abs_title = "Résumé"; version of the document, it may be enclosed within the $pre_title = "Préface"; htmlonly environment. Short text segments destined $app_title = "Annexe"; only for HTML may alternately be placed in the \html @Month = ('','janvier','février', command. Conversely, portions of the document intended 'mars', 'avril', 'mai', only for the printed version may be enclosed within the 'juin', 'juillet', 'août', latexonly environment, or the \latex command. 'septembre', 'octobre', 'novembre', 'décembre'); Any other HTML construct may be inserted by way of the } the rawhtml environment. Any text contained within this In order to provide full support for another language you environment is passed directly to the HTML ®le without may also want to replace the navigation buttons which modi®cation. This text could include forms, frames, or any come with LATEX2HTML (which are by default in English) other HTML object. with your own. As long as the new buttons have the same ®lenames as the old ones, there should not be a problem. Navigation panels The navigation panels are the strip containing ªbuttonsº Hypertext Extensions to LATEX and text that appears at the top and perhaps at the bottom of each generated page and provides hypertext links to other One of the greatest appeals of the World-Wide Web is its sections of a document. Some of the options and variables high connectivity through hyperlinks. LAT X2HTML im- E that control whether and where it should appear have al- plements several LAT X extensions designed to rapidly ex- E ready been mentioned. ploit this connectivity. These extensions are made known to LATEXbyusingthehtml package (style ®le) included A simple mechanism for appending customized buttons to with the LATEX2HTML distribution. the navigation panel is provided by the command \html- addtonavigation. This takes one argument which The simplest (but most tedious) way to insert a hy- LAT X2HTML appends to the navigation panel. For exam- pertext link into your document is by use of the E ple, \htmladdnormallink command. This command takes two arguments: the text to highlighted as hyper- \htmladdtonavigation{\htmladdnormallink text, and the URL that that hypertext will take you if {\htmladdimg{http://server/URL}}{% selected. Images may be inserted in your document via http://server/link}} \htmladdimg, which takes as its single argument the URL of the image to be inserted. These commands are will add an active button mybutton.gif pointing to the time-consuming for the author, because exact URLs must speci®ed location. be supplied, and because they must be manually updated if Apart from these facilities it is also possible to spec- the destination URLs are ever changed. ify completely what appears in the navigation pan- Part of the author's burden of maintaining hyperlinks is re- els and in what order. As each section is processed, lieved by the use of symbolic links. The LATEX \label LATEX2HTML assigns relevant information to a num- and \ref mechanism is one way to providing such links. ber of global variables. These variables are used If the target link is another LATEX2HTML-processed doc- by the subroutines top navigation panel and ument, this mechanism may be extended by: bottom navigation panel, where the navigation Bijlage U The LATEX2HTML translator: An Overview 66 panel is constructed as a string consisting of these vari- Figure 1 shows an example of a navigation panel speci®- ables and some formatting information. cation. (The ª.º is the Perl string concatenation operator and ª#º signi®es a comment). These subroutines can be rede®ned in a system or user con®guration ®le (respectively, LATEX2HTMLDIR/ Converted images latex2html.config and \$HOME/.latex2html- LAT X2HTML copies the contents certain commands and init). Any combination of text, HTML tags, and the E and all environments that it was not speci®cally pro- variables mentioned below is acceptable. grammed to handle into the ®le images.tex for GIF ®le conversion. This conversion is done by way of LATEX, The control-panel variables are: dvips, Ghostscript, and the portable bitmap library. The contents of these commands and environments are not al- Iconic links (buttons): teredbyLATEX2HTML, except for macro expansion, but PREVIOUS Points to the previous section; this can be defeated if necessary. The complete LATEX UP Points up to the ªparentº section; preamble from the user's document is also passed to NEXT Points to the next section; images.tex, so that user macros, packages and graph- NEXT GROUP Points to the next ªgroupº section; ics options (like \graphicspath) are made available to PREVIOUS GROUP Points to the previous ªgroupº LATEX processing the images. The environments which are section; processed as images include: CONTENTS Points to the contents page if there is one; Inline equations, excepts those which can be processed • INDEX Points to the index page if there is one. by LATEX2HTML according to its HTML version, speci®ed as above; equation and eqnarray, except when running un- Textual links (section titles): • der HTML version 3.2. PREVIOUS TITLE Points to the previous section; table and tabular, except under HTML version UP TITLE Points up to the ªparentº section; • 2.2 or higher; NEXT TITLE Points to the next section; figure, floatfig, wrapfig, and other ¯oating- NEXT GROUP TITLE Points to the next ªgroupº sec- • ®gure environments; tion; theorem and minipage environments; PREVIOUS GROUP TITLE Points to the previous • Any user-de®ned environment (as described below) ªgroupº section. • speci®edtoLATEX2HTML that should be passed to If the corresponding section exists, each iconic button will LATEX. contain an active link to that section. If the corresponding The standard LAT X commands which are con- section does not exist, the button will be inactive. If the E verted to images include \fbox, \framebox, section corresponding to a textual link does not exist then \parbox, \dag, \ddag, \oe, \OE, and certain the link will be empty. special accents not de®ned in HTML. Also con- The number of words that appears in each textual link verted to images are the graphics package com- is controlled by the variable $WORDS IN NAVIGA- mands, \rotatebox, \scalebox, \reflectbox, TION PANEL TITLES which may also be changed in \resizebox, \includegraphics, \epsfig, the con®guration ®les. \psfig, \epsffile,and\epsfbox.

sub top_navigation_panel { # Start with a horizontal rule (3-d dividing line) "


". # Now add few buttons with a space between them "$NEXT $UP $PREVIOUS $CONTENTS $INDEX $CUSTOM_BUTTONS" . # Line break "
\n" . # If ``next'' section exists, add its title to the navigation panel ($NEXT_TITLE ? " Next: $NEXT_TITLE\n" : undef) . # Similarly with the ``up'' title ... ($UP_TITLE ? "Up: $UP_TITLE\n" : undef) . # ... and the ``previous'' title ($PREVIOUS_TITLE ? " Previous: $PREVIOUS_TITLE\n" : undef) . # Horizontal rule (3-d dividing line) and new paragraph "

\n" }

Figure 1: Sample Perl subroutine to be inserted in the LATEX2HTML con®guration or startup ®le to customize the docu- ment's top navigation panel Bijlage U The LATEX2HTML translator: An Overview 67

On the other hand, the following text color commands For ®ner control, several parameters affecting the conver- are processed as ordinary text by LATEX2HTML, if they sion of a single ª®gureº can be controlled with the com- occur outside of a command or environment that would mand \htmlimage, which is de®ned in html.sty.The otherwise be passed to LATEX for image conversion: one argument of \htmlimage is a string of options sep- \color, \textcolor, \pagecolor, \colorbox, arated by commas. The options are: \fcolorbox. scale= • external Macro interactions • thumbnail= • If a LAT X graphics package requires one or more T X def- map= E E • initions to made within each graphics environment (e.g. usemap= • eepic), there are currently several ways to proceed: flip= <¯ip option> • 1. If it is possible to de®ne the required macros only once, align= • do that at the top of the document. LAT X2HTML will E scale= expand each reference to the macros, before passing The option allows control over the size of the ®- nal image. It overrides the global FIGURE SCALE FAC- them on to LATEX for image conversion. for this one ®gure. 2. If the TEX macros need to change with each ®gure, it is not possible to make them within each ®gure. LATEX2- The external option will cause the image not to be in- HTML will expand all references to the macros accord- lined. (Images are inlined by default.) External images will ing to their last de®nition in the document. Instead, be accessible via a hypertext link. place the entire ®gure, macros and all, into a separate The thumbnail= option will cause a small inlined im- ®le, say chart.eepic. Then in your main docu- age to be placed in the caption. The size of the thumbnail ment, insert depends on the .Useofthethumbnail= ª\fbox{\InputIfFileExists{chart.- option implies the external option. eepic}}.º The command \InpuIfFileExists is not reqognized by LATEX2HTML, and macros con- The map= and usemap= options specify that the image tained within it will be left untouched. It is recognized is to be made into an active image map (See below). by LAT X, however, and will be processed by it cor- E The flip= option speci®es a change of orientation of rectly. the electronic image relative to the printed version. The 3. Another method of having multiple, variable macro <¯ip option> is any single command recognized by pn- de®nitions is to make them with native T X \def com- E m¯ip. The most useful of these include: mands, and to use the -no tex defs command-line rotate90 or r90 This will rotate the image clockwise by switch. With this switch, T X \def's are not recog- E 90 . nized by LAT X2HTML, and will be left unexpanded as ◦ E rotate270 or r270 This will rotate the image counter- they are copied to images.tex. clockwise by 90◦. leftright This will ¯ip the image around a vertical axis of Scaling and orientation rotation. Images created by this mechanism can be divided into two topbottom This will ¯ip the image around a horizontal categories: ªsmallº and ªlarge.º For purposes of discus- axis of rotation. sion ... The align= option speci®es how the figure will be ªsmall imagesº refers to equations, special accents and aligned. The choices are: top, bottom, middle, left, A any other image generating L TEX commands; right and center.Themiddle option speci®es that while ... the image is to be left-justi®ed in the line, but centred ver- A ª®guresº applies to any image-generating L TEX envi- tically. The center option speci®es that it should also be figure table minipage ronments (e.g. , , etc.) centred horizontally. This option is valid only if the HTML version is 3 or higher, or if the NET SCAPE HTML con®g- The size of all ªsmall imagesº depends on a con®gura- uration variable is set. The default value is bottom. tion variable MATH SCALE FACTOR which speci®es how In order to be effective the command\htmlimage and its much to enlarge or reduce them in relation to their orig- options must be placed inside the environment on which it inal size in the PostScript version of the document. For will operate. example a scale factor of 0.5 will make all images half as big while a scale factor of 2 will make them twice as big. Active Image Maps Larger scale factors result in longer processing times and larger intermediate image ®les. A scale factor will only Image maps are images with active regions in which a be effective if it is greater than 0. The con®guration vari- Web surfer can click, to send him off to another sector able FIGURE SCALE FACTOR performs a similar func- of cyberspace. LATEX2HTML can design either inline tion for ª®guresº. Both of these con®guration variables are ª®guresº or external ones (with or without a thumbnail initially set to 1.6. version) to be image maps. However, HTML requires aURLofanHTML map-®le which de®nes the coor- Bijlage U The LATEX2HTML translator: An Overview 68 dinates of each active region in the map with a desti- coordinates can be obtained with the aid of a program nation URL. Usually this map ®le is kept on the server such as xv. If the user clicks in the ®rst rectangle, it will machine, however HTML version 3 also allows it to re- cause a branch to the URL associated with symbolic label side on the client side for faster response. (See Both con®gurations are supported by LATEX2- point (200, 100). Clicking in this region will cause a branch HTML through the \htmlimage options ªmap= ºand to symbolic label label3. Labels label2 and label4 ªusemap= ,º respectively. will be visited if the user clicks anywhere outside of the ex- plicit regions. If any labels are not de®ned in any of the Keeping such a map ®le up to date manually can be tedious, labels.pl ®les mentioned, they will be interpreted as especially with dynamic documents under revision. An ex- URLs without translation. perimental program makemap can help automate this pro- cess. This program (which is really a Perl script) takes The HTML image maps are generated and placed in direc- one mandatory argument and one optional argument. The tory report/ by invoking the command: ªmakemap mandatory argument is the name of a user-map ®le, de®ned report.map report.º below. The optional argument is the name of the directory where the HTML map ®le(s) are to be placed. Fancy lists The best way of describing how this works is by example. htmllist.sty Suppose that a document has two ®gures designated to be- An optional style ®le has been pro- come active image maps. The ®rst ®gure included a state- vided which produces fancier lists in the electronic ver- A ment like, sion of the document. This ®le de®nes a new L TEX environment, htmllist, which causes a user-de®ned \begin{figure} item-mark to be placed at each new item of the list, and \htmlimage{map=/cgi-bin/imagemap/Blk.map} which causes the optional description to be displayed in ... bold letters. The mark is determined by the \html- \end{figure} itemmark { } command. This com- The second ®gure had a line like, mand accepts either a mnemonic name for the , from a list of icons established at installation, or \begin{figure} \htmlimage{map=/cgi-bin/imagemap/.map} the URL of a mark not in the installation list. The com- ... mand \htmlitemmark must be used inside the html- \end{figure} list environment in order to be effective, but it may be used more than once to change the mark within the A typical user map ®le, named report.map, might con- list. The item-marks supplied with LATEX2HTML are tain the following information1: BlueBall, RedBall, OrangeBall, GreenBall, +report/ URL PinkBall, PurpleBall, WhiteBall and Yellow- # Ball.Thehtmllist environment is identical to the # Define map #1: description environment in the printed version. # An example of its usage is: Blk.map: label1 rect 288,145 397,189 label2 rect 307,225 377,252 \begin{htmllist} label2 default \htmlitemmark{WhiteBall}% # \item[Item 1:] This will have a # Define map #2 white ball. # \item[Item 2:] This will also have a Flow.map: white ball. label3 circle 150,100 200,100 \htmlitemmark{RedBall}% label4 default \item[Item 3:] This will have a red ball. \end{htmllist} In this ®le, comments are denoted by a #-sign in column 1. The line beginning with +report states that the sym- bolic labels are to be found in the labels.pl contained in the directory report/, and that its associated URL Change bars is as stated. Any number of external labels.pl ®les LATEX2HTML supports the LATEX2ε changebar.sty may be so speci®ed. The block diagram image has two ac- package, written by Johannes Braams , for inserting change-bars in a document (288, 145) and (397, 189), while the second is a rectan- in order to indicate differences from previous versions. gle bounded by corners (307, 225) and (377, 252). These Changed text is surrounded by bracket GIF icons. 1This ®le is designed for an NCSA server. CERN servers use ªrectº instead of ªrectangle,º specify a radius instead of an outer point in the circle, and enclose point coordinates by parentheses. Bijlage U The LATEX2HTML translator: An Overview 69

Document Segmentation the installation variable $AUTO PREFIX.Theinfor- As we have seen, the LATEX author can provide these links mation must be one of the following: either manually or symbolically. Manual links are more te- internals This is the default type, which need not be dious because a URL must be provided by the author for given. It speci®es that the internal labels from every link, and updated every time the target documents the designated segment are to be input and made change. Symbolic links are more convenient, because available to the current segment. the translator keeps track of the URLs. Earlier releases contents The table of contents information from des- of LATEX2HTML required the entire document to be pro- ignated segment are to be made available to the cessed together if it was to be linked symbolically. How- current segment. ever, it was easy for large documents to overwhelm the sections Sectioning information is to be read in. Note memory capacities of moderate sized computers. Further- that the segment containing the table of contents more, processing time could become prohibitively high, if requires both contents and sections information even a small change required the entire document to be re- from all other program segments. processed. ®gure Lists of ®gures from other segments are to be read. For these reasons, program segmentation was developed. table Lists of tables from other segments are to be This feature enables the author to subdivide his document read. into multiple segments. Each segment can be processed index Index information from other segments are to independently by LATEX2HTML. Hypertext links between be read. segments can be made symbolically, with references shared through auxiliary ®les. If a single segment changes, only \startdocument The \begin{document} and that segment needs to be reprocessed (unless a label is \end{document} statements are contained in the changed that another segment requires). Furthermore, the parent segment only. It follows that the child seg- entire document can be processed without modi®cation by ments cannot be processed separately by LATEX, with- LATEX to obtain the printed version. The top level segment out modi®cation. However they can be processed sep- that LATEX reads is called the parent segment. The others arately by LATEX2HTML, provided it is told where the are called child segments. end of the LATEX preamble is; this is the function of the \startdocument directive. It substitutes for Program segmentation does require a little more work on \begin{document} in child segments, but is oth- the part of the author, who will now have to undertake erwise ignored by both LAT XandLATX2HTML. some of bookkeeping formerly performed exclusively by E E \htmlhead{}{} This com- LAT X2HTML. The following four LAT X extensions carry E E mand is not normally placed in the document at all. out segmentation: It is automatically passed from parent to child via \segment{<®le>}{}{} .ptr. It identi®es to LAT X2HTML that the This command indicates the start of a new program seg- E current segment is a LAT X sectional unit of type .tex, repre- E type>, with the speci®ed heading. This command is sents the start of a new LAT X sectional unit of type E ignored by LAT X itself. (e.g., \section, \chapter, etc.) and E has a heading of . A variation of this com- mand, \segment*, is provided for segments that are A segmentation example not to appear in the table of contents. These commands The best way to illustrate document segmentation is perform the following operations in LATEX: through a simple example. Suppose that a document is 1. The speci®ed sectioning command is executed. to be segmented into one parent and two child segments. 2. LATEX will write its section and equation counters Let the parent segment be report.tex, and the the two into an auxiliary ®le, named .ptr. It will child segments be sec1.tex,andsec2.tex. The latter also write an \htmlhead command to this ®le. are translated with ®lename pre®xes of s1 and s2, respec- This information will tell LATEX2HTML how to ini- tively. This example is included with the version 96.1 dis- tialize itself for the new document segment. tribution of LATEX2HTML, with more proli® comments 3. LATEX will then proceed to input and process the ®le than are shown here. .tex. The \segment and \segment* commands are ig- The text of report.tex is given below: nored by LAT X2HTML. E \documentclass{article} \internal[<®le>]{} \usepackage{html,makeidx,color} This command directs LATEX2HTML to load inter- segment information of type from the ®le \internal[figure]{s1} .pl. Each program segment \internal[figure]{s2} must be associated with a unique ®lename-pre®x, spec- \internal[sections]{s1} i®ed either through a command-line option, or through \internal[sections]{s2} \internal[contents]{s1} \internal[contents]{s2} Bijlage U The LATEX2HTML translator: An Overview 70

\internal[index]{s1} This segment needs to load internal labels from the ®rst \internal[index]{s2} one, because of the reference to first. These circular de- pendencies (two segments referencing each other) are ei- \begin{document} ther not allowed or handled incorrectly by the Unix utility \title{A Segmentation Example} make, without resorting to time stamps and some trickery. \date{\today} A time-stamp is a zero-length ®le whose only purpose is to \maketitle \tableofcontents record its creation time. Besides evaluating segment inter- \listoffigures dependence, another function of make is to provide inter- segment navigation information. % Process the child segments:

\segment{sec1}{section}{Section 1 title} \segment{sec2}{section}{Section 2 title} \printindex \end{document} A sample Makefile is included in the distribution. This correctly generates the fully-linked document. The ®rst This ®le obtains the information necessary to build an in- time it is invoked, it runs: dex, a table of contents and a list of ®gures from the child LATEXonreport.tex twice; segments. It then proceeds and typesets these. • dvips to generate report.ps; • LATEX2HTML on sec1.tex; • The ®rst child segment, sec1.tex, is shown below: LATEX2HTML on sec2.tex. At this point, • \begin{htmlonly} sec2.html is completely linked, since the labels \documentclass{article} from the sec1 were available; \usepackage{html,color,makeidx} LATEX2HTML on sec1.tex to pick up the labels \input{sec1.ptr} • from sec2; \end{htmlonly} LATEX2HTML on report.tex. \internal{s2} • \startdocument Proper operation of make depends on the fact that LATEX2- Here is some text. HTML updates its own internal label ®le only if something \subsection{First subsection} in its current program segment causes the labels to change Here is subsection from the previous run. This ensures that LAT X2HTML is 1\label{first}. E not run unnecessarily. It is also usual for the information \begin{figure} -info \colorbox{red}{Some red page to be suppressed by specifying on all but text\index{Color text}} the top-level program segment. \caption[List of figure caption]{Figure 1 caption} Extending LATEX2HTML \end{figure} A Reference\index{Reference} to As the translator covers only partially the set of L TEX A \ref{second}. commands and because new L TEX commands can be de- ®ned arbitrarily using low level TeX commands, the trans- The ®rst thing this child segment does is establish the lator should be ¯exible enough to allow end-users to spec- A L TEX packages it requires, then loads the counter informa- ify how they want particular commands to be translated. tion that was written by the \segment command that in- voked it. Since this segment contains a symbolic reference (second) to the second segment, it must load the internal labels from that segment. Adding Support for Speci®c Style Files The ®nal segment, sec2.tex, is shown below: LATEX2HTML provides a mechanism where code to trans- \begin{htmlonly} late speci®c style ®les is automatically loaded, if such code \documentclass{article} is available. When use of a style, such as german.sty,is \usepackage{html,makeidx} detected in a LAT X source document, the translator looks \input{sec2.ptr} E \end{htmlonly} for a ®le LATEX2HTMLDIR/styles/german.perl. \internal{s1} If one is found, then it will be loaded into the main script. \startdocument This mechanism will help to keep the core script smaller as Here is another section\label{second}. well as make it easier for others to contribute and share so- Plus another\index{Reference, another} lutions on how to translate speci®c style ®les. The current reference\ref{first}. \begin{figure} distribution includes the ®les listed in Table 1. These will \fbox{The figure} provide good examples of how you can create your own ex- \caption{The caption} tensions to LATEX2HTML. \end{figure} Bijlage U The LATEX2HTML translator: An Overview 71

.perl ®le Description {}'s mark compulsory arguments and []'s optional ones. alltt Supports the LATEX2ε's alltt package. Some commands may have arguments which should be left changebar Provides rudimentary as text even though the command should be ignored (e.g. \mbox \ change-bar support. , , color Causes colored text to be etc.). In these cases arguments should be left unspeci®ed. processed as ordinary text Here is an example of how this mechanism may be used: by LATEX2HTML. french Special support for the &ignore_commands( <<_IGNORED_CMDS_); documentstyle # [] # {} French language. linebreak# [] eps®g Processes embedded ®g- center ures not enclosed in a figure environment. _IGNORED_CMDS_ ¯oat®g Processes ¯oating ®gures. german Special support for the Ger- Passing Commands to LATEX man language. Commands that should be passed on to LATEX for process- graphics Supports commands in the ing because there is no direct translation to HTML may be graphics package. speci®edinthe.latex2html-init ®le as input to the graphicx Supports the alternate syn- process_commands_in_tex subroutine. The format tax of graphics commands. is the same as that for specifying commands to be ignored. heqn Alters the way displayed Here is an example: equations are processed. &process_commands_in_tex htmllist Provides support for fancy (<<_RAW_ARG_CMDS_); lists. fbox # {} makeidx Generates the index. framebox # [] # [] # {} texdefs Supports some raw TEX commands. _RAW_ARG_CMDS_ wrap®g Supports wrapped ®gures. New Features Tabel 1: Supported LATEX2HTML style ®les. A number of new features were introduced in the 96.1 ver- sion of LATEX2HTML. This changes were provided by the The problem however, is that writing such extensions re- users of this program. Some of these have already been quires an understanding of Perl and of the way LATEX2- discussed. The more signi®cant ones include the follow- HTML is organised. Interfaces that are more ªuser- ing: friendlyº will be investigated. Font generation: The quality of equation bitmaps is At the moment a rudimentary mechanism is provided so greatly improved by enabling the PK GENERA- that a user can ask for particular commands and their ar- TION con®guration variable. When this is done Metafont will be invoked through dvips to gener- guments either to be ignored or passed on to LATEXfor processing (the default behaviour for unrecognized com- ate fonts more suitable to screen viewing. Ideally, mands is for their arguments to remain in the HTML text). this should be done by setting the mode= switch to dvips, but unfortunately not all versions of CommandsthatarepassedontoLATEX are converted to images which are either ªinlinedº in the main document dvips support this option. For those that don't, a .dvipsrc or become accessible via hypertext links. Simple exten- ®le is supplied with this distribution. sions using the commands above may be included in the Active image-maps: Both server and client-side maps $LATEX2HTMLDIR/latex2html.config ®le or in are supported. Image-maps can either be inline each personal $HOME/.latex2html-init initializa- or external. External maps can be associated with tion ®le. a thumbnail image. A separate script makemap helps resolve external URLs in image-maps. Adding ignored Commands Improved image recycling: The older version of image Commands that should be ignored may be speci- recycling often caused images to overwrite each ®edinthe.latex2html-init ®le as input to the other due to confusion in the bookkeeping. This has ignore_commands subroutine. Each command which been ®xed. It is also no longer necessary to keep is to be ignored should be on a separate line followed by generating the same image which is being used re- compulsory or optional argument markers separated by #'s peatedly in a document. Furthermore, images with e.g.2: thumbnails can now be recycled, as well as active image-maps. Only images of the correct size are re- #{}# []# {}# [] ... cycled. 2It is possible to add arbitrary Perl code between any of the argument markers which will be executed when the command is pro- cessed. For this however a basic understanding of how the translator works and of course Perl is required. Bijlage U The LATEX2HTML translator: An Overview 72

Document segmentation: Large documents can now be Null images: If for some reason an image produced a divided into independently processed segments. null GIF ®le, then no reference is made to that ®le Additional command-line switches provide inter- and the program proceeds gracefully. Furthermore, segment navigation information, while automati- such null images do not need to be regenerated. cally generated Perl ®les pass symbolic references This would occur in HTML3.0 images whose cap- and counter information between segments. tion is enclosed in a \parbox, for example. Command parsing: Constructions such as \hello2 Navigation panels: There are now separate subroutines are now treated as macro \hello followed by to control the top and bottom navigation panels. argument `2', rather than as macro `\hello2'. Section headings: Labels, equations and images are Added \makeatletter and \makeatother now permitted inside a section heading, and in the commands. \title command. However, the equations may Graceful termination: When LATEX2HTML is inter- not look very good because of the size difference. rupted by a termination signal, all child tasks are also terminated. This provides a more orderly shut- down than that offered by previous versions. Future directions More inline-math: Small inline equations which can be Work is currently underway on a complete rewrite of typeset in HTML are typeset in HTML. This fur- LATEX2HTML. The new version will be based much more ther reduces the number of GIF ®les that need to be on TEX's grammar and parsing. According to Marcus generated. E. Hennecke, doctoral student at Stanford University, the External hypertext: Commands \html\-ref and rewrite (probably to be named V96.3) ªdoes away with any \hyper\-ref now accept labels de®ned in an kind of preprocessing, thus eliminating texexpand, DBM external document, if the internal reference is not ®les, etc. and instead works directly on the raw LaTeX found. code. I would imagine that it should be possible to run that Additional style ®les: version on a PC except for the image generation, which will htmllist De®nes a fancier list environment which be the only thing requiring external helper programs.º One embellishes each item of a description list minor drawback of V 96.3 is that it will require users to up- with a user-selected icon. (It is the same as grade to Perl 5. For this reason, versions based on V96.1 the description environment in the pa- will continued to be supported in parallel for some time to per version.) come. heqn Rede®nes the equation environment so that equation numbers are handled in Hennecke emphasizes that LATEX2HTML V96.3 is not, HTML. This causes equations in this en- by itself, a port to non-UNIX platforms. However, it vironment to be recyclable. It also causes should make such a port easier to achieve, since it will equation arrays to be recyclable if their equa- store all system dependencies in variables, rather hard- tion numbers do not change from the previ- coded into the source. Anyone contemplating working on ous run. a port from UNIX should corroborate with him directly at ¯oat®g Provides support for this environment by . making it look like an ordinary ®gure in the electronic version. User Support wrap®g (Same comment as for floatfig). To keep in touch with other users of LATEX2HTML, to get graphics De®nes elements of the standard additional assistance, or to suggest or provide bug ®xes, LATEX2ε graphics.sty package. you can join the online mailing list, provided through the Argonne National Labs. To subscribe to this list, LAT X2ε support: Provided support for the \ensuremath E with the contents: subscribe .Tobere- age is interpreted as the LATEX2HTML language style-®le to load. User-de®ned commands and en- moved from the list send a message to the same address vironments can now have an optional argument. with the contents: unsubscribe . All recent postings to Stubs have been provided for \enlargethis- this discussion group are archived in a web-browsable at ages alltt, graphics, graphicx, color, . You may not get changebar and epsfig were provided. immediate answers to all your question, but most inquiries Figure orientation: The flip= option of \html- are answered eventually. Furthermore, comments, sugges- image causes the program pnmflip to be called tions and critiques are all helpful for the the continuing prior to the generation of the GIF ®le. This allows development of this evolving program. the electronic version of the image to be oriented differently than the paper version.

3 Bijlage U The LATEX2HTML translator: An Overview 73

Acknowledgements tion, as revised most recently by myself, Ross Moore4 and 5 6 LATEX2HTML was originally written by Nikos Drakos3 Michel Goossens . Thanks also go to Donald Arseneau while he was with the Computer Based Learning Unit of for his url.sty LATEX package, which was instrumental the University of Leeds. The material in this article was in formatting this article. drawn heavily on the LATEX2HTML user's documenta-

4email: 5email: 6email: