On-Line Computer Text Processing: a Tutorial* RICHARD C
Total Page:16
File Type:pdf, Size:1020Kb
Behavior Research Methods &Instrumentation 1974, Vol. 6, No.2, 159-166 SESSION VI TUTORIAL JOSEPH B. SIDOWSKI, University ofSouth Florida, Presider On-line computer text processing: A tutorial* RICHARD C. ROIST ACHER Center for Advanced Computation and College of Commerce University ofIllinois at Urbana-Champaign, Urbana, Illinois 61801 The use of an on-line computer system for preparing manuscripts affords the scholar many advantages. This tutorial discusses text editing, file and output processing systems, and how they may be used for producing and disseminating scientific documents. Text processing is a powerful computing utility whose necessary resources, and because there were not so many user community is rapidly growing. Text processing as text users among the insiders as to overload the available used here refers to the storage and editing of resources. However, the potential market for computer manuscripts maintained as computer files of text and the text processing was so wide as to make university use of computer programs to format those manuscript computing administrators fear for their ability to files into documents. Text processing as used in this support a drastically increased community of users, most tutorial does not include either information retrieval, in of whom would have to operate on internal funds. which a file of information is scanned for a desired Text processing systems have now evolved to the entry, or such areas as computer-based analysis of point where they should find more general use, literary text. especially among those who use on-line systems in the The aim of this tutorial is to introduce computer text course of their other work. Text processing systems can processing to a reader who has had some experience with be found on many time-sharing systems, including IBM's on-line computer systems, but who is not necessarily a Time Sharing Option for OS/360, Digital Equipment programmer. While this paper reviews several systems Corporation's TOPS, and Bolt, Beranek, and Newman's and machines, it is not meant to be a complete review of TENEX for the PDP-lO, the Burroughs B-6700 MCP, the the field. The particular machines, systems, and University of Michigan's Michigan Terminal System, and criticisms presented here are drawn primarily from the the Massachusetts Institute of Technology's Multics. author's experience, rather than from an exhaustive literature search. TEXT PROCESSING SYSTEM Text processing services are what might be called COMPONENTS AND CHARACTERISTICS external computing utilities, in that like BASIC or teaching programs, they provide sophisticated A computer text processing system has three major computing services to clients who are not necessarily components: a file system, an editor, and an output programmers or even members of the usual computer processor. The file system is the computer resource for user community. Text processing systems have been in cataloging and storing information which is not in the the libraries of several operating systems for many years. machine's memory. Information may be stored on a Until recently, text processing has not been widely used variety of devices, including magnetic disks, drums, and because the programs required considerable computing tapes. Most text files are stored on a rapid access device sophistication to use, and because they were quite such as a disk while they are being edited and processed, expensive to run. In academia, it was possible for and are then archived onto magnetic tape. interested computing center "insiders" to use text An editor is a program for the insertion, scanning, and processing systems because they had access to the deletion of text in computer files. Most editors are used primarily for editing programs maintained as disk files, *This work was supported in part by the Advanced Research rather than as decks of punched cards. Editors have Projects Agency of the Department of Defense and was monitored by the U.S. Army Research Office-Durham under evolved from primitive programs, which allowed only Contract No. DAHC04-72-C-0001, and in part by the College of the replacement of specified lines in a file into Commerce and Business Administration. University of Illinois at Urbana-Champaign. programming languages in their own right, capable of 159 160 ROISTACHER complex sequences of operations on text files. that "If anything can go wrong, it will" and will tend to Once text has been input to the file system and has be least available when they are most needed. Since text been corrected with an editor, it must be arranged into a processing tends to be addictive, an unreliable system proper output format through the use of an output can be a major handicap. A third disadvantage arises processor, a program for formatting, justifying, and directly from one of the advantages of text processing, printing text. Output processors are extremely the ability to make instant backup copies of document complicated programs, approaching compilers in files. Unless the user exercises a good deal of complexity. An output processor uses commands self-discipline, it is possible to become lost in a welter of inserted in the stream of text to control the formatting different versions and copies of a document, leading to of the finished document. confusion, delay, expense, and sometimes loss of a Text processing offers the scholar a number of' document. powerful ad van tages in the preparation and dissemination of scientific papers. The most obvious FILE SYSTEMS advantage is that error correction tends to be cumulative. Once corrections are made, they stay in the File Operations and Capabilities file, and thus typographic errors are not reintroduced The power and flexibility of a computer's file system into corrected copy, as is the case when drafts must be is of great importance to the text processing user, who retyped by hand. It is also possible to find and correct will often want to do rather complex copying and all errors of the same kind in a single editing operation. concatenation operations when such capabilities are The ability to insert, delete, and rearrange text without available. It is often best practice to keep a long regard to page boundaries and numbering allows the document in several relatively short files, which are scholar to revise his paper far more easily than is the case concatenated (attached end to end) at the time they are when he must physically cut, paste, and retype his fed to the output processor. Sometimes it is possible to manuscript. concatenate a file to the interior of another at run time, Bibliographic work is especially facilitated by the a useful feature for inserting tables and diagrams. computer, even when no formal information retrieval or indexing procedures are used. It is possible to add to a Numbered-Line Files file of bibliographic references as each ,new reference is Direct access files can be loosely classified into found, making sure that each new entry in the file is numbered-line and stream formats. A numbered-line file, correct. Once a paper has been completed, it may be as its name indicates, has a number assigned to each of scanned for all bibliographic citations, e.g., by printing its lines. It is possible to insert and delete lines from such all lines containing the character string "(19", or ",19", a file, but each line is a separate entity. A numbered-line and the relevant references extracted from the me need not have its numbers displayed when it is bibliographic file. printed, but it is still stored internally as separate, One advantage of computer text processing, which numbered lines. In a numbered-line file, the line becomes apparent only after some experience, is the numbers are stored in a directory and do not change freedom from paper copy itself. There is no longer any when a line is inserted or deleted. Instead they are "original" which must be preserved from harm. explicitly reordered at the user's command. Duplication of an original becomes a relatively simple task, as a tape copy of a severalhundred page document Stream Files stored on magnetic tape can be made in a few minutes, A stream file is stored as a continuous stream of and far more cheaply than such a document could be characters. Lines of the file are delimited by a carriage Xeroxed. ' return and linefeed character sequence or by a special Dissemination of a document via the computer may end-of-line character defined by the operating system. not be important when all of the users of a time-sharing The major advantage of such a me is that it is extremely system are in the same institution. However, when users easy to edit across lines, since all that is required is the are spread over a wide area, as in the case of a regional or insertion or deletion of a line delimiting character. The national computer network, the system itself provides major disadvantage of stream organization is that there is the fastest possible way of transmitting a manuscript to no fixed way to refer to parts of the file. It is possible an interested reader. for the system to count the number of line delimiters Computer text processing, however, cannot be said to between the head of the file and a given point, but any be an unmixed blessing. The obvious disadvantage is the change in the number of such delimiters will change the additional cost to the user, who must invest in a terminal "line number" of a given piece of text. and extra computer time. The faithful Model 33 Teletype, while technically usable as an input to a text EDITORS processing system, is too slow and fatiguing to fill the bill and must be replaced by a more expensive Basic Characteristics device.