TUGb oat, Volume 17 1996, No. 3 255
characters are not usually intended to be printed
on the page, rather their purp ose is to control the
T Xtensions
E
printing pro cess in some way.
This mechanism can be used to embed some
text into the .dvi le, but one should The
Editing .dvi Files, or Visual T X
E
T Xbook, page 228 \b e careful not to make the
E
Jonathan Fine
list [of characters, i.e., the text] to o long, or you
mightover owT X's string memory." The author
E
do es not know if this will be a danger for the
Abstract
constructions ab out to b e describ ed. Using emT X
E
This note outlines the sp eci cation of a T X format,
E
he has created a .dvi le with 500,000 di erent
that will allow the resulting .dvi le to be edited
sp ecials, whose content is the numb ers from 1 to
via a suitable previewer and a .dvi le editor. Such
500,000, as digit strings.
close linking of editing and typ esetting app ears to
One solution to the added value problem is to
b e within the present capabilities of T X.
E
place the entire text of the input le myfile.tex
as sp ecial in the do cument. Although this satis es
Value Added Typ esetting
the formal requirement, it is a little coarse. To
edit the le myfile.dvi consists of editing the copy
Typ esetting can be thought of as a pro cess which
of myfile.tex which is emb edded as a sp ecial in
adds value to the do cument b eing pro cessed. This
myfile.dvi. It would not be dicult to adapt a
may not b e true for works typ eset from the author's
text editing program, so that it op erated on this
original manuscript and corrected pro ofs, for such
emb edded sp ecial, rather than a self-contained le.
physical do cuments reveal change of mind, history
T X can then b e run, without an error arising one
E
of comp osition and other details which are lost in
hop es, to refresh myfile.dvi.
the printed version of the do cument. But here we
Although coarse, this illustrates the essence of
consider the typ esetting of, say, a suitably tagged
the metho d by which .dvi les maybe edited via
ASCI I le.
the previewer. What is required is that the pro cess
Throughout this do cument we will use the
b e re ned.
language and conventions of T X, but most of the
E
So far as I know, pro ducts such as Lightning
issues involved are of a more general nature, and
Textures continually refresh the previewed .dvi le
apply to any computer typ esetting system.
as the the user changes the source .tex le, but
Supp ose throughout that myfile.tex is typ eset
do not asso ciate the individual characters, words or
to pro duce myfile.dvi. If the latter le is the
markup in the underlying .tex le to the content
former, together with some added value, then it
of the displayed .dvi le. Thus, the user cannot
should be p ossible to recover the former from
edit the .tex le solely by interacting with the
the latter. Oddly enough, a recent p osting to
.dvi le. This is p ossible with Scienti c Word,
an electronic discussion list raised precisely this
which should b e thought ofasaWSYIWYG or more
problem. An author had in error deleted the
exactly visual editor, whose underlying le format
original .tex le, and wished to recover its content,
A
is L T X. I b elieve that it is precisely b ecause T X
E E
as b est as was p ossible, from the .dvi le. This
as usually used do es not allow the solution of the
then is the de nition of value added typesetting |
problems describ ed here, that Scienti c Word do es
from the typ eset le it must b e p ossible to extract
not use tex to format les for the editor to display.
the source le.
Popp elier 1991 also contains a discussion of
Smaller Sp ecials
the pro cess by whichtyp esetting adds value to the
do cument, but from a di erent p oint of view.
The text of a do cument, say as an ASCI I le,
is naturally broken down into paragraphs, words,
Sp ecials
characters, and spaces. It seems natural to break
a do cument down into words. They are the
T X has a pro cess by which sp ecial instructions can
E
smallest units of meaning. This is re ected in the
be transmitted to printing devices The T Xbook,
E
very name of the to ol used by authors to prepare
page 226, and that is the \special mechanism.
do cuments, the word pro cessor. Programmers are
Each \special that makes it to the .dvi le
more accustomed to using the le editor.
will pro duce in it a string of characters, attached
to some sp eci c lo cation on the page. These
256 TUGb oat, Volume 17 1996, No. 3
For the moment, we shall assume that the \if 2 \sentinel
do cument is very plain, with no changes of font \let \next \relax
or other control sequences. Supp ose that we have \else
a T X format that will, b esides typ esetting the \let \next \dopar
E
do cument, place b efore eachword in the do cument \fi
a \special, whose content is the following word, as \next
represented in the source le. Because of ligatures }
and hyphenation, this may not b e the same as the
The macro \doword similarly go es through the
characters which follow the \special in the .dvi
paragraph word byword until \sentinel is reached.
le.
\def \doword 1~2
Supp ose that myfile.dvi is created from my-
{
file.tex by using this format. It will not be
\special { 1 } 1~
dicult, by extracting the text of the sp ecials, to
\ifx 2 \sentinel
recreate myfile.tex from myfile.dvi. This is not
\let \next \gobble
strictly correct. Assuming the usual category co des,
\else
additional spaces between words, additional lines
\let \next \doword
between paragraphs, and the lo cation of line breaks
\fi
within paragraphs, will all be lost when passing
\next 2
from myfile.tex to myfile.dvi. This is probably
}
no great loss. Contrarywise, typ eset paragraph line
This sample co de is not intended to be the
breaks have b een intro duced. It may even be an
basis for a practical implementation of a format
advantage to have a source le whose line breaks
that will create value-added .dvi les. Rather, its
agree with those of the .dvi le.
purp ose is to show that such a format is p ossible,
and to draw attention to some of the diculties
A Sp ecial Format
which may be encountered when creating such an
It is not so dicult to create a format that will
ob ject.
read the input le word by word, and place the
words as it reads them into \specials. The
Editing via a Previewer
co de b elow, which is intended to be read in an
Supp ose now that myfile.dvi has b een created
environment where white space is ignored, and ~ is
by a format le as ab ove. The viewer notices a
a space character, shows the basic features of such
missp elt wro d. Within a sp ecial in the .dvi le, it
an environment.
is easy to change the letters wrod into word. It will
The macro \sentinel is used simply to indicate
b e harder to add or delete letters within the sp ecial,
the end of a paragraph, or the end of the le.
b ecause there would not be ro om for the addition
\def \sentinel { \noexpand \sentinel }
at the correct p ointinthe .dvi le, or a hole would
The idea now is to de ne \dopar so that
b e left in it. But this is the sort of problem which
\dopar editing programs are accustomed to dealing with.
The first paragraph is not very Now that the copy of myfile.tex which is
long at all. within myfile.dvi has b een changed, one would
like the rest of myfile.dvi to b e brought up to date.
For simplicity,we shall assume that myfile.dvi is The second paragraph is even
simply one page, a galley that is long enough to shorter.
accommo date all that is placed on it. Changing
\sentinel wrod to word will change the paragraph in which
it is placed. The change will in general be more
will result in appropiate typ esetting and sp ecials.
complicated than replacing wrod by word. Even
Here is a simple to o simple implementation.
this simple change may change the line breaks in
The macro \dopar will read text paragraph by
the paragraph. Hyphenation may change, as may
paragraph, until the \sentinel follows a blank line
ligatures and kerning. Correcting a simple letter
or explicit \par.
transp osition error will require resetting the whole
\def\dopar 1 \par 2
paragraph.
{
\doword 1 ~\sentinel \par
TUGb oat, Volume 17 1996, No. 3 257
In most T X formats, the size and content pages, and p erhaps the previous page, will have
E
of one paragraph do es not in uence the setting of to be reconsidered. Assume T X is b eing used
E
the others. All then that needs to be reset is in a normal manner, so that parameters such as
the paragraph in which the change o ccurred. If \linepenalty and \brokenpenalty do not change
the do cumentwere set into pages rather than just from page to page. Were T X to repro cess the
E
a galley, then page breaks would also need to be whole of the changed myfile.tex, all paragraphs
reconsidered. from the previous version would b e broken exactly
This discussion has fo cussed on changing the as b efore, and so there would b e no need for them
letters in a single word in a paragraph. Adding to b e reset.
or deleting whole words will go the same way, as The previewer and editor combination can ask
will addition or deletion of paragraphs, provided T X to break the new do cument by passing it a
E
the format do es not do something like numb ering suitably co ded sequence of boxes, p enalties, skips
paragraphs. In any case, T X will be required to and kerns to break, for this is all the page breaking
E
pro cess some text when a change is made to the mechanism The T Xbook, Chapter 15 op erates
E
.tex le emb edded in the .dvi le, to bring the on. Alternatively, some other program could be
.dvi le up to date. asked to do the breaking. This is the approach
taken by Typ e & Set Asher, 1992.
Calling T X from the Previewer
E
Control Words
When the user has nished making changes to the
paragraph, or earlier if wished, the previewer must The asp ect which presents the most diculties is
call up on T X to repro cess the changed paragraph. now to be discussed, and brie y at that. Most
E
As b efore, we assume that the diculty faced by all do cuments contain control words, such as \TeX
editing programs, of deleting or adding material in and \section and \eqalign. These are part
the middle of a le, has b een solved. of the source myfile.tex and so they must go
T X turns a text le into a .dvi le. Ordinarily, into myfile.dvi as sp ecials. When it comes to
E
this .dvi le is not accessible until T X has come the editing pro cess, even though the addition or
E
to an \end. By writing a suitable device or virtual deletion of a control word such as \TeX is fairly
le, which will dep end on the op erating system, it inno cuous, to add or remove an \equalign can
should be p ossible to use the .dvi output of T X have a drastic e ect.
E
b efore the \end. Thus, a command suchas The braces {}, and the mathematics shift char-
acter $, will also have a drastic e ect when added or
\shipout\vbox{\input tempfile}
removed from myfile.tex. The same is true when
will cause T X to pro duce a page which contains
E
the delimiter required by some macro is omitted.
the revised and reset paragraph, which the editing
For this approachtohave all the typ esetting p ower
functions attached to the previewer can now paste
and programmability which T X provides, access
E
into place, replacing the old version. Incidentally,if
to lo cal change of font etc., parameter delimiters,
the output of T X is a virtual .dvi le, then there
E
and mathematics shift must b e provided. This has
should be no reason why the input tempfile.tex
to be done in a manner which is consistent with
should not also b e a virtual device.
the editing requirements imp osed by the emb edded
The foregoing discussion is intended to demon-
\special approach.
strate that by combining T X as it is, together
E
Unbalanced braces, missing or extra mathe-
with a suitable format, a suitable previewer, editors
matics shift characters, and missing delimiters all
for text and .dvi les, and a bit of op erating
provide diculties for users of T X. A format
E
system virtual le magic, it is p ossible to pro duce a
which detected such input errors early, b efore they
WSYIWYG variantofT X. Users of this comp osite
E
gave rise to an error message from the stomachof
program will be able to edit their do cuments via
T X, could make T X easier for many users. A
E E
the previewer. The format as describ ed is limited
format which allows the user to edit the copy of
to a single page, a single font, and no mathematics,
myfile.tex emb edded within myfile.dvi would
but it do es havemultiple paragraphs.
similarly have to detect input errors b efore they
reachT X's stomach.
E
Breaking Pages
Supp ose now that the do cument is broken into
pages, and a paragraph is added. All subsequent
258 TUGb oat, Volume 17 1996, No. 3
Performance most e ectively,itmay b e necessary to extend the
.dvi le sp eci cations.
It is quite p ossible that such a format, which
Also required is a format le, which satis es
repro duces the input text as sp ecials, and detects
requirements far more exacting than met at present.
all errors b efore T X do es, will run at p erhaps a
E
This also would b e a ma jor piece of work.
tenth of the sp eed of a regular format. However, the
usual approach requires the do cumenttobetyp eset
Acknowledgements
as a whole, and so an unchanged paragraph maybe
typ eset a dozen times or more during the revisions. The author thanks Mimi Burbank for correcting
A format which sets whole do cuments slower, but many errors in a previous version of this article,
which is able to reset the do cument paragraph by and Alan Ho enig for some encouraging remarks.
paragraph may very well consume fewer machine
cycles over the life of a do cument. Postscript
Moreover, the paragraphs can b e set or reset as
For various reasons the publication of this article
completed, rather than le by le. If the computer is
has b een long delayed. It was widely circulated in
suciently rapid, and many are to day, this machine
preprint form at TUG'93 in Aston, England, and
work can be done as required. This will result in
was submitted later in that year. Not all of the delay
the user b eing lo cked out for a short p erio d at the
has b een due to the author. Since then, the author
end of each paragraph, while pro cessing takes place,
has solved all the fundamental problems involved in
just as an editing program may deny access to the
creating a T X format le that will create a .dvi le
E
user while a le is b eing written.
that can b e visually pro cessed. Also since then the
World Wide Web has `taken o ' and this provides
Thinking alike
an additional reason for pro ducing .dvi les which
Since writing this article, I found the following supp ort rich visual interaction. The article \T X
E
statement put forward to motivate the CONCUR innovations at the Louis-Jean printing house", by
feature provided by SGML. Laugier and Haralamb ous, describ es an interesting
It is sometimes useful to maintain informa- program that allows certain changes to be made
tion ab out a source and a result do cument interactively to a .dvi le. Also relevant is the
simultaneously in the same do cument, as author's own article, \Do cuments, compuscripts,
in \what you see is what you get" WSYI- programs and macros".
WYG word pro cessors. There, the user There is some overlap between this article
A
app ears to interact with the formatted and Rahtz, \Another lo ok at L T X to SGML
E
output, but the editorial changes are ac- conversion". In b oth cases, the placing of text from
tually made in the source, which is then the do cument into the .dvi le as \specialsis a
reformatted for display. crucial technique. In this article the purp ose is to
This quotation comes from Annex C.3.1 of ISO 8879 store the original text of the article. For Rahtz,
the SGML standard and is also repro duced as is the purp ose is to create a transformed form of the
the whole of ISO 8879 in Goldfarb page 88. article.
Also relevant is the article of Kawaguti and Ki-
Conclusions ta jima, which describ es a di erent approach. They
have created a lo osely coupled comp osite from adap-
More can be done with T X as it is, than is
E
tions of two existing to ols, namely emacs and xdvi.
commonly realised. Some of its limitations exist
Sp ecial tags are added to a traditional LaT X
E
in the imagination of the critics rather than in the
source le, which pro duce \specials containing le
program itself. I hop e that all those who say that
name and line numb ers in the .dvi le. These are
T X cannot do such-and-such think carefully as to
E
used to supp ort new emacs and xdvi commands,
why it is that T X cannot do what they wish.
E
which together allow the user to move the emacs
For a p ortable WSYIWYG variantofT Xtobe
E
cursor by clicking on the previewed .dvi le. This
pro duced, the various additional comp onents, which
is, as the authors recognise, not the same as what
are previewer, .dvi le editor, text le editor, and
this article calls visual typ esetting.
op erating system magic, must also be p ortable or
Would those who are interested in following
p orted. An editor for .dvi les is probably the
the path outlined in the present article, particularly
most imp ortant new program. For this to work
on the viewing and editing side, please contact me.
For such a system to provide an op en architecture,
TUGb oat, Volume 17 1996, No. 3 259
standards for the use of sp ecials, going far b eyond
Rokicki's \A prop osed standard for sp ecials", will
b e required.
Bibliography
Asher, Graham. \Inside Typ e & Set", TUGb oat 13
1, pages 13{22, 1992.
Fine, Jonathan. \Do cuments, compuscripts, pro-
grams and macros", TUGb oat 15 3, pages
381{385, 1994.
Goldfarb, Charles F. The SGML Handbook, Oxford
University Press, 1990
Kawaguti, Minato and Kita jima, Norio. \Concur-
rent use of an interactive T X previewer with
E
an Emacs-typ e editor", TUGb oat 15 3, pages
293{300, 1994.
Laugier, Maurice, and Yannis Haralamb ous, \T X
E
innovations at the Louis-Jean printing house",
TUGb oat 15 4, pages 438{443, 1994.
Lavagnino, John. \Simultaneous electronic and pa-
per publication", TUGb oat 12 3, pages 401{
405, 1991.
Mittelbach, Frank. \E-T X: Guidelines for future
E
T X", TUGb oat 11 3, pages 337{345, 1990.
E
Popp elier, N. A. F. M. \SGML and T X in Scienti c
E
Publishing", TUGb oat 12 1, pages 105{109,
1991.
A
Rahtz, Sebastian. \Another lo ok at L T XtoSGML
E
conversion". TUGb oat 16 3, pages 315{324,
1995.
Rokicki, Tomas G. \A prop osed standard for sp e-
cials", TUGb oat 16 4, pages 395{401, 1995.
Taylor, Philip. \The future of T X", TUGb oat 13
E
4, pages 433{442, 1992.
Sieb enmann, Laurent. \The Lion and the Mouse",
EuroT X '92 Pro ceedings, edited byJir Zlatuska,
E
pages 43{52, 1992.
Jonathan Fine
203 Coldhams Lane
Cambridge, U.K.
CB1 3HY [email protected]