<<

TUGb oat, 17 1996, No. 3 255

characters are not usually intended to be printed

on the page, rather their purp ose is to control the

T Xtensions

E

pro cess in some way.

This mechanism can be used to embed some

text into the .dvi le, but one should The

Editing .dvi Files, or Visual T X

E

T Xbook, page 228 \b e careful not to make the

E

Jonathan Fine

list [of characters, i.e., the text] to o long, or you

mightover owT X's string memory." The

E

do es not know if this will be a danger for the

Abstract

constructions ab out to b e describ ed. Using emT X

E

This note outlines the sp eci cation of a T X format,

E

he has created a .dvi le with 500,000 di erent

that will allow the resulting .dvi le to be edited

sp ecials, whose content is the numb ers from 1 to

via a suitable previewer and a .dvi le editor. Such

500,000, as digit strings.

close linking of and typ esetting app ears to

One solution to the added value problem is to

b e within the present capabilities of T X.

E

place the entire text of the input le myfile.tex

as sp ecial in the do cument. Although this satis es

Value Added Typ esetting

the formal requirement, it is a little coarse. To

edit the le myfile.dvi consists of editing the

Typ esetting can be thought of as a pro cess which

of myfile.tex which is emb edded as a sp ecial in

adds value to the do cument b eing pro cessed. This

myfile.dvi. It would not be dicult to adapt a

may not b e true for works typ eset from the author's

text editing program, so that it op erated on this

original and corrected pro ofs, for such

emb edded sp ecial, rather than a self-contained le.

physical do cuments reveal change of mind, history

T X can then b e run, without an error arising one

E

of comp osition and other details which are lost in

hop es, to refresh myfile.dvi.

the printed version of the do cument. But here we

Although coarse, this illustrates the essence of

consider the typ esetting of, say, a suitably tagged

the metho d by which .dvi les maybe edited via

ASCI I le.

the previewer. What is required is that the pro cess

Throughout this do cument we will use the

b e re ned.

language and conventions of T X, but most of the

E

So far as I know, pro ducts such as Lightning

issues involved are of a more general nature, and

Textures continually refresh the previewed .dvi le

apply to any computer typ esetting system.

as the the user changes the .tex le, but

Supp ose throughout that myfile.tex is typ eset

do not asso ciate the individual characters, words or

to pro duce myfile.dvi. If the latter le is the

markup in the underlying .tex le to the content

former, together with some added value, then it

of the displayed .dvi le. Thus, the user cannot

should be p ossible to recover the former from

edit the .tex le solely by interacting with the

the latter. Oddly enough, a recent p osting to

.dvi le. This is p ossible with Scienti c Word,

an electronic discussion list raised precisely this

which should b e thought ofasaWSYIWYG or more

problem. An author had in error deleted the

exactly visual editor, whose underlying le format

original .tex le, and wished to recover its content,

A

is L T X. I b elieve that it is precisely b ecause T X

E E

as b est as was p ossible, from the .dvi le. This

as usually used do es not allow the solution of the

then is the de nition of value added |

problems describ ed here, that Scienti c Word do es

from the typ eset le it must b e p ossible to extract

not use tex to format les for the editor to display.

the source le.

Popp elier 1991 also contains a discussion of

Smaller Sp ecials

the pro cess by whichtyp esetting adds value to the

do cument, but from a di erent p oint of view.

The text of a do cument, say as an ASCI I le,

is naturally broken down into paragraphs, words,

Sp ecials

characters, and spaces. It seems natural to break

a do cument down into words. They are the

T X has a pro cess by which sp ecial instructions can

E

smallest units of meaning. This is re ected in the

be transmitted to printing devices The T Xbook,

E

very name of the to ol used by to prepare

page 226, and that is the \special mechanism.

do cuments, the word pro cessor. Programmers are

Each \special that makes it to the .dvi le

more accustomed to using the le editor.

will pro duce in it a string of characters, attached

to some sp eci c lo cation on the page. These

256 TUGb oat, Volume 17 1996, No. 3

For the moment, we shall assume that the \if 2 \sentinel

do cument is very plain, with no changes of font \let \next \relax

or other control sequences. Supp ose that we have \else

a T X format that will, b esides typ esetting the \let \next \dopar

E

do cument, place b efore eachword in the do cument \fi

a \special, whose content is the following word, as \next

represented in the source le. Because of ligatures }

and hyphenation, this may not b e the same as the

The macro \doword similarly go es through the

characters which follow the \special in the .dvi

paragraph word byword until \sentinel is reached.

le.

\def \doword 1~2

Supp ose that myfile.dvi is created from my-

{

file.tex by using this format. It will not be

\special { 1 } 1~

dicult, by extracting the text of the sp ecials, to

\ifx 2 \sentinel

recreate myfile.tex from myfile.dvi. This is not

\let \next \gobble

strictly correct. Assuming the usual category co des,

\else

additional spaces between words, additional lines

\let \next \doword

between paragraphs, and the lo cation of line breaks

\fi

within paragraphs, will all be lost when passing

\next 2

from myfile.tex to myfile.dvi. This is probably

}

no great loss. Contrarywise, typ eset paragraph line

This sample co de is not intended to be the

breaks have b een intro duced. It may even be an

basis for a practical implementation of a format

advantage to have a source le whose line breaks

that will create value-added .dvi les. Rather, its

agree with those of the .dvi le.

purp ose is to show that such a format is p ossible,

and to draw attention to some of the diculties

A Sp ecial Format

which may be encountered when creating such an

It is not so dicult to create a format that will

ob ject.

read the input le word by word, and place the

words as it reads them into \specials. The

Editing via a Previewer

co de b elow, which is intended to be read in an

Supp ose now that myfile.dvi has b een created

environment where white space is ignored, and ~ is

by a format le as ab ove. The viewer notices a

a space character, shows the basic features of such

missp elt wro d. Within a sp ecial in the .dvi le, it

an environment.

is easy to change the letters wrod into word. It will

The macro \sentinel is used simply to indicate

b e harder to add or delete letters within the sp ecial,

the end of a paragraph, or the end of the le.

b ecause there would not be ro om for the addition

\def \sentinel { \noexpand \sentinel }

at the correct p ointinthe .dvi le, or a hole would

The idea now is to de ne \dopar so that

b e left in it. But this is the sort of problem which

\dopar editing programs are accustomed to dealing with.

The first paragraph is not very Now that the copy of myfile.tex which is

long at all. within myfile.dvi has b een changed, one would

like the rest of myfile.dvi to b e brought up to date.

For simplicity,we shall assume that myfile.dvi is The second paragraph is even

simply one page, a galley that is long enough to shorter.

accommo date all that is placed on it. Changing

\sentinel wrod to word will change the paragraph in which

it is placed. The change will in general be more

will result in appropiate typ esetting and sp ecials.

complicated than replacing wrod by word. Even

Here is a simple to o simple implementation.

this simple change may change the line breaks in

The macro \dopar will read text paragraph by

the paragraph. Hyphenation may change, as may

paragraph, until the \sentinel follows a blank line

ligatures and kerning. Correcting a simple letter

or explicit \par.

transp osition error will require resetting the whole

\def\dopar 1 \par 2

paragraph.

{

\doword 1 ~\sentinel \par

TUGb oat, Volume 17 1996, No. 3 257

In most T X formats, the size and content pages, and p erhaps the previous page, will have

E

of one paragraph do es not in uence the setting of to be reconsidered. Assume T X is b eing used

E

the others. All then that needs to be reset is in a normal manner, so that parameters such as

the paragraph in which the change o ccurred. If \linepenalty and \brokenpenalty do not change

the do cumentwere set into pages rather than just from page to page. Were T X to repro cess the

E

a galley, then page breaks would also need to be whole of the changed myfile.tex, all paragraphs

reconsidered. from the previous version would b e broken exactly

This discussion has fo cussed on changing the as b efore, and so there would b e no need for them

letters in a single word in a paragraph. Adding to b e reset.

or deleting whole words will go the same way, as The previewer and editor combination can ask

will addition or deletion of paragraphs, provided T X to break the new do cument by passing it a

E

the format do es not do something like numb ering suitably co ded sequence of boxes, p enalties, skips

paragraphs. In any case, T X will be required to and kerns to break, for this is all the page breaking

E

pro cess some text when a change is made to the mechanism The T Xbook, 15 op erates

E

.tex le emb edded in the .dvi le, to bring the on. Alternatively, some other program could be

.dvi le up to date. asked to do the breaking. This is the approach

taken by Typ e & Set Asher, 1992.

Calling T X from the Previewer

E

Control Words

When the user has nished making changes to the

paragraph, or earlier if wished, the previewer must The asp ect which presents the most diculties is

call up on T X to repro cess the changed paragraph. now to be discussed, and brie y at that. Most

E

As b efore, we assume that the diculty faced by all do cuments contain control words, such as \TeX

editing programs, of deleting or adding material in and \section and \eqalign. These are part

the middle of a le, has b een solved. of the source myfile.tex and so they must go

T X turns a text le into a .dvi le. Ordinarily, into myfile.dvi as sp ecials. When it comes to

E

this .dvi le is not accessible until T X has come the editing pro cess, even though the addition or

E

to an \end. By writing a suitable device or virtual deletion of a control word such as \TeX is fairly

le, which will dep end on the op erating system, it inno cuous, to add or remove an \equalign can

should be p ossible to use the .dvi output of T X have a drastic e ect.

E

b efore the \end. Thus, a command suchas The braces {}, and the mathematics shift char-

acter $, will also have a drastic e ect when added or

\shipout\vbox{\input tempfile}

removed from myfile.tex. The same is true when

will cause T X to pro duce a page which contains

E

the delimiter required by some macro is omitted.

the revised and reset paragraph, which the editing

For this approachtohave all the typ esetting p ower

functions attached to the previewer can now paste

and programmability which T X provides, access

E

into place, replacing the old version. Incidentally,if

to lo cal change of font etc., parameter delimiters,

the output of T X is a virtual .dvi le, then there

E

and mathematics shift must b e provided. This has

should be no reason why the input tempfile.tex

to be done in a manner which is consistent with

should not also b e a virtual device.

the editing requirements imp osed by the emb edded

The foregoing discussion is intended to demon-

\special approach.

strate that by combining T X as it is, together

E

Unbalanced braces, missing or extra mathe-

with a suitable format, a suitable previewer, editors

matics shift characters, and missing delimiters all

for text and .dvi les, and a bit of op erating

provide diculties for users of T X. A format

E

system virtual le magic, it is p ossible to pro duce a

which detected such input errors early, b efore they

WSYIWYG variantofT X. Users of this comp osite

E

gave rise to an error message from the stomachof

program will be able to edit their do cuments via

T X, could make T X easier for many users. A

E E

the previewer. The format as describ ed is limited

format which allows the user to edit the copy of

to a single page, a single font, and no mathematics,

myfile.tex emb edded within myfile.dvi would

but it do es havemultiple paragraphs.

similarly have to detect input errors b efore they

reachT X's stomach.

E

Breaking Pages

Supp ose now that the do cument is broken into

pages, and a paragraph is added. All subsequent

258 TUGb oat, Volume 17 1996, No. 3

Performance most e ectively,itmay b e necessary to extend the

.dvi le sp eci cations.

It is quite p ossible that such a format, which

Also required is a format le, which satis es

repro duces the input text as sp ecials, and detects

requirements far more exacting than met at present.

all errors b efore T X do es, will run at p erhaps a

E

This also would b e a ma jor piece of work.

tenth of the sp eed of a regular format. However, the

usual approach requires the do cumenttobetyp eset

Acknowledgements

as a whole, and so an unchanged paragraph maybe

typ eset a dozen times or more during the revisions. The author thanks Mimi Burbank for correcting

A format which sets whole do cuments slower, but many errors in a previous version of this article,

which is able to reset the do cument paragraph by and Alan Ho enig for some encouraging remarks.

paragraph may very well consume fewer machine

cycles over the life of a do cument. Postscript

Moreover, the paragraphs can b e set or reset as

For various reasons the publication of this article

completed, rather than le by le. If the computer is

has b een long delayed. It was widely circulated in

suciently rapid, and many are to day, this machine

preprint form at TUG'93 in Aston, England, and

work can be done as required. This will result in

was submitted later in that year. Not all of the delay

the user b eing lo cked out for a short p erio d at the

has b een due to the author. Since then, the author

end of each paragraph, while pro cessing takes place,

has solved all the fundamental problems involved in

just as an editing program may deny access to the

creating a T X format le that will create a .dvi le

E

user while a le is b eing written.

that can b e visually pro cessed. Also since then the

World Wide Web has `taken o ' and this provides

Thinking alike

an additional reason for pro ducing .dvi les which

Since writing this article, I found the following supp ort rich visual interaction. The article \T X

E

statement put forward to motivate the CONCUR innovations at the Louis-Jean printing house", by

feature provided by SGML. Laugier and Haralamb ous, describ es an interesting

It is sometimes useful to maintain informa- program that allows certain changes to be made

tion ab out a source and a result do cument interactively to a .dvi le. Also relevant is the

simultaneously in the same do cument, as author's own article, \Do cuments, compuscripts,

in \what you see is what you get" WSYI- programs and macros".

WYG word pro cessors. There, the user There is some overlap between this article

A

app ears to interact with the formatted and Rahtz, \Another lo ok at L T X to SGML

E

output, but the editorial changes are ac- conversion". In b oth cases, the placing of text from

tually made in the source, which is then the do cument into the .dvi le as \specialsis a

reformatted for display. crucial technique. In this article the purp ose is to

This quotation comes from Annex C.3.1 of ISO 8879 store the original text of the article. For Rahtz,

the SGML standard and is also repro duced as is the purp ose is to create a transformed form of the

the whole of ISO 8879 in Goldfarb page 88. article.

Also relevant is the article of Kawaguti and Ki-

Conclusions ta jima, which describ es a di erent approach. They

have created a lo osely coupled comp osite from adap-

More can be done with T X as it is, than is

E

tions of two existing to ols, namely emacs and xdvi.

commonly realised. Some of its limitations exist

Sp ecial tags are added to a traditional LaT X

E

in the imagination of the critics rather than in the

source le, which pro duce \specials containing le

program itself. I hop e that all those who say that

name and line numb ers in the .dvi le. These are

T X cannot do such-and-such think carefully as to

E

used to supp ort new emacs and xdvi commands,

why it is that T X cannot do what they wish.

E

which together allow the user to move the emacs

For a p ortable WSYIWYG variantofT Xtobe

E

cursor by clicking on the previewed .dvi le. This

pro duced, the various additional comp onents, which

is, as the authors recognise, not the same as what

are previewer, .dvi le editor, text le editor, and

this article calls visual typ esetting.

op erating system magic, must also be p ortable or

Would those who are interested in following

p orted. An editor for .dvi les is probably the

the path outlined in the present article, particularly

most imp ortant new program. For this to work

on the viewing and editing side, please contact me.

For such a system to provide an op en architecture,

TUGb oat, Volume 17 1996, No. 3 259

standards for the use of sp ecials, going far b eyond

Rokicki's \A prop osed standard for sp ecials", will

b e required.

Bibliography

Asher, Graham. \Inside Typ e & Set", TUGb oat 13

1, pages 13{22, 1992.

Fine, Jonathan. \Do cuments, compuscripts, pro-

grams and macros", TUGb oat 15 3, pages

381{385, 1994.

Goldfarb, Charles F. The SGML Handbook, Oxford

University Press, 1990

Kawaguti, Minato and Kita jima, Norio. \Concur-

rent use of an interactive T X previewer with

E

an Emacs-typ e editor", TUGb oat 15 3, pages

293{300, 1994.

Laugier, Maurice, and Yannis Haralamb ous, \T X

E

innovations at the Louis-Jean printing house",

TUGb oat 15 4, pages 438{443, 1994.

Lavagnino, John. \Simultaneous electronic and pa-

per publication", TUGb oat 12 3, pages 401{

405, 1991.

Mittelbach, Frank. \E-T X: Guidelines for future

E

T X", TUGb oat 11 3, pages 337{345, 1990.

E

Popp elier, N. A. F. M. \SGML and T X in Scienti c

E

Publishing", TUGb oat 12 1, pages 105{109,

1991.

A

Rahtz, Sebastian. \Another lo ok at L T XtoSGML

E

conversion". TUGb oat 16 3, pages 315{324,

1995.

Rokicki, Tomas G. \A prop osed standard for sp e-

cials", TUGb oat 16 4, pages 395{401, 1995.

Taylor, Philip. \The future of T X", TUGb oat 13

E

4, pages 433{442, 1992.

Sieb enmann, Laurent. \The Lion and the Mouse",

EuroT X '92 Pro ceedings, edited byJir Zlatuska,

E

pages 43{52, 1992.

 Jonathan Fine

203 Coldhams Lane

Cambridge, U.K.

CB1 3HY [email protected]