Image Processing, Video Terminals, and Printer Techuologies

Digital Technical Journal Digital Equipment Corporation

Volume 3 Number 4

Fall 1991 Editorial Jane . Blake, Editor Helen L. Patterson, Associate Editor Kathleen M. Stetson, Associate Editor Leon Descoteaux, Associate Editor Circulation Catherine M. Phillips, Administrator Sherry L. Gonzalez Production Mildred R. Rosenzweig, Production Editor Margaret L. Burdine, Typographer Peter R. Woodbury, Illustrator Advisory Board Samuel H. Fuller, Chairman Richard W Beane Robert M. Glorioso Richard]. Hollingsworth John W McCredie Alan G. Nemeth Mahendra R. Patel F Grant Saviers Victor A. Vyssotsky Gayn B. Winters

The Digital Technicaljournal is pub I ished quarterly by Digital Equipment Corporation, 146 Main Street ML01-3/B68, Maynard, Massachusetts 01754-2571. Subscriptions to the journal are $40.00 for four issues and must be prepaid in U.S. funds. University and college professors and Ph.D. students in the electrical engineering and computer science fields receive complimentary subscriptions upon request. Orders, inquiries, and address changes should be sent to the Digital Technicaljournal at the published-by address. Inquiries can also be sent electronically to DTJ@CRLDECCOM. Single copies and back issues are available for $16.00 each from Digital Press of Digital Equipment Corporation, 1 Burlington 0 80 -45 Woods Drive, Burlington, MA 1 3 39.

Digital employees may send subscription orders on the ENET to RDVAX::)OURNAL or by interoffice mail to mailstop MLOI-3/B68. Orders should include badge number, site location code, and address. All employees must advise of changes of address.

Comments on the content of any paper are welcomed and may be sent to the editor at the published-by or network address.

Copyright© 1991 Digital Equipment Corporation. Copying without fee is permitted provided that such copies are made for use in educational institutions by faculty members and are not distributed for commercial advantage. Abstracting with credit of Digital Equipment Corporation's authorship is permitted. All rights reserved.

The information in the journal is subject to change without

notice and should nor be construed as a commitment by Digital Equipment Corporation. Digital Equipmem Corporation assumes no responsibility for any errors that may appear in thejournal.

lSSN 0898-901X

Documentation Number EY-H889E-DP

The following are trademarks of Digital Equipment Corporation: ALL-IN-1, DECimage, DECnet, DECprint, DECserver, DECstarion, DECwindows, Digital, the Digital logo, LAT, LN03, MicroVAX, PrintServer, Q-bus, ReGIS, rtVAX, ULTRIX, VA,'\, VAXELN, VAXstation, VMS, VT1000, VTI200, VT1300, and VXT 2000.

Apple DeskTop Bus is a trademark and LocalTal k is a registered trademark of Apple Com purer, Inc.

Motorola and 68000 are registered trademarks of Motorola, Inc.

Open Software Foundation is a trademark and OSF and OSF/1 are CoverDesign registered trademarks of Open Software Foundation, Inc. High-performance screen display of bitonal images is one PostScript is a registered trademark of Adobe Systems, Inc.

of the topics in this issue. The handwriting and manually Texas Instruments is a trademark of Texas Instruments, Inc. produced technical drawings on our cover are types of images is a registered trademark of UNIXSystem Laboratories, Inc. that can be scanned, st01·ed electronically, ana then displayed is a trademar.k of the Massachusetts Institute on an screen; portions of an image can be enlarged of Technology. or rotated on screen. Book production was done by Digital's Database Publishing Group

The cover was designed trySandra Calef of Calef Associates. in Northboro, MA. I Contents

7 Foreword Larry Cabrinety

Image Processing, Video Terminals, and Printer Technologies

9 Hardware Accelerators for Bitonal Imag e Processing Christopher ). Payson, Christopher ]. Cianciolo, Robert N. Crouse, and Catherine F. Winsor

26 X Window Terminals Bjorn Engberg and Thomas Porcher

36 ACCESS. bus, an Open Desktop Bus Peter A. Sichel

43 Design of the DECprint Comm on Printer Supervisor for VMS Systems Richard Landau and Alan Guenther

55 The Comm on Printer Access Protocol James D. Jones, A jay P Kachrani, and Thomas E. Powers

61 Design of the Turbo PrintServer 20 Controller Guido Simone, Jeffrey A. Metzger, and Gary Va illette

1 I Editor's Introduction

boards, mice, and tablets, all of which may not be made by the same company. A new open desktop bus, described by Peter Sichel, is a simple means to connect as many as 14 low-speed devices to a desk­ top system. In his paper, Peter presents the project background, reviews the I'C technology on which the bus is based, and describes the protocol and the configuration process. Hard-copy presentation of data and recent devel­ opments in printer tec hnologies are the topics of the next three papers. Rick Landau and Alan Guenther review the DECprint Printing Services, which is jane C. Blake software that controls numerous printer fe atures Editor for a wide range of printers. Also called a common print symbiont, this component of the VMS print­ Products designed for quality, high-performance ing system supports several page description lan­ presentation of data in both video and hard-copy guages, handles multiple media simultaneously, form are the topics of papers in this issue of the and uses different I/0 interconnections and com­ Digital Te chnical jo urnal. The design challenges nmnication protocols. range from managing the huge storage require­ Both DECprint Printing Services and the subject ments of images for display on X terminals to ensur­ of the next paper, the common printer access pro­ ing high-performance in a feature-rich printer tocol, are part of the DECprint architecture. The environment. CPAP provides the fundamental services necessary Image processing is the subject of the opening for the presentation of data at the printer. Jim paper by Chris Payson, Chris Cianciolo, Bob Crouse, Jones, Ajay Kachrani, and To m Powers describe the and Cathy Winsor. The authors note that one advan­ challenges of developing a protocol that operates tage of scanning images for screen display is the in a he terogeneous, internetworking environment input time saved; however, the scanned images and that also ensures backward compatibility with and data can consume significant amounts of stor­ older products. Their success in developing a high­ age space. They then review the development of an performance protocol is evidenced by OSF accep­ image accelerator board that not only helps solve tance of CPAP for inclusion in a future release of the problem of storage but also addresses the need OSF/1. for high-performance display-view and manipula­ As was the case with the CPAP, performance tion-of bitonal images. In addition to specifics of was also key in the development of the turbo the board implementation, the authors offe r an PrintServer 20 controller. Guido Simone ,Jeff Me tzger, overview of imaging concepts, terms, and future and Gary Vaillette explain that the req uirements of directions for image accelerators. complex documents demanded turbo controller The terminal on which the image accelerator performance that was five to eighttimes that of the board resides is DECimage 1200, an X terminal. current controller. To aiel them in making design X terminals development in general, including a decisions, a performance analysis tool, RETrACE, discussion of the VT1200, is the subject of a paper was created and is described here. Authors also by Bjorn Engberg and To m Porcher. Bjorn and To m relate how they us ed existing chips in order to keep fo cus their discussion on a comparison of the development costs low and still deliver a high­ X terminal and X environments, and perfo rmance controller. explain why X terminals are a low-cost alternative. The editors thank Liz Griego-Powell of the Video, The authors present the design choices debated by Image and Print Systems Gro up for her he lp in the engineers during the development of Digital's preparing this issue. X terminals, including the selection of a hardware platform, terminal and window management, X server, com munications protocols, and fo nt file systems. Video terminal and workstation users need the assistance of a number of 1/0 devices, such as key-

2 Biographies I

Christopher J. Cianciolo As a hardware design engineer in the Video, Image and Print Systems Group, Chris Cianciolo is currently working on the design for the group's latest imaging product. Chris jo ined Digital in 1985 after par­ ticipating in a co-op session in the Power Supply Engineering Group. He also participated in co-op sessions for Charles Stark Draper La boratory, Inc. on a fib er-optic missile guidance system project. He received his B.S.E.E. from No rtheastern University in 1988 and is currently pursuing an M.S.E.E., also from Northeastern.

Robert N. Crouse Senior engineer Bob Crouse is a member of the Video, Image and Print Systems Group. He is currently working on the advanced devel­ opment of new imaging technology for X window terminals. Bo b was project engineer for the development of a bitonal imaging accelerator for a low-end VAXstation workstation. As a member of the Electronic Storage Development Group , he designed a double-bit error detection and correction circuit fo r a VAX mainframe. Bob received his B.S.E.E. from NortheasternUniversity and holds one patent.

Bjorn Engberg As a principal software engineer in the Video, Image and Print Systems Group, Bj orn Engberg was the main architect and software project leader for the VTIOOO and VT1200 X window terminals. He joined Digital in 1978 and worked as a development engineer at CSS in Sweden, where he modified Digital's terminals for the European market. He relocated to the United States in 1982 to work on the VT240, the VT320, the LJ250, and several advanced devel­ opment projects. Bj orn received an M.S.E.E. (honors) from the Royal In stitute of Te chnology in Stockholm.

3 Biographies

A. Alan Guenther As a member of the technical staff in the DECprint System Software Group, Alan Guenther is involved in the ongoing design and implemen· tat of the DECprint common print symbiont. Prior to this, he was the primary designer and implementor of the distributed qu euing services. Alan has worked at Digital since 1973, both as a fu ll-time employee and as an independent consul­ tant (from 1982 to 1990). After receiving a B.S. 01onors, 1970) from the University of Moman a, he worked at the university until he joined Digital.

James D. jones Jim Jones is a principal engineer in the Hardcopy Systems Engineering Group. He jo ined Digital in 1974 and was part of a team developing diagnostic programs fo r the DECsystem-10 and DECSYSTEM-20 systems. After run­ ning his own software business for five years, Jim rejoined Digi tal to design printer controllers and software. Most recently, he provided software for the PrintServer products, au thored the Common Printer Access Protocol specifica­ tion, and is helping to define the next generation of network printers. Jim is a member of IEEE and ACM an d participates in the JETF.

Ajay P. Kachrani Principal software engineer Ajay Kachrani currently works on the OSF/1 socket and XTI kernel interfaces an d security project. Previously, he led the development of the overal l PrintServer software version 4.0 with dual network protocol support (DECnet and TCP/IP), from inception through field test. Ajay presented the CPAP protocol as an Internet standard to the IETF and ad ded PrintServer support in version 1.0 of the Palladium Print System at MIT/Project Athena. Aj ay holds a B.S.E.E. (honors) from the University of Mysore, India, and an M.S.C.S. from the University of Lowell.

Richard B. Landau Richard Landau is the DECprint program manager fo r the Video, Image and Print Systems Group. Working to improve the interaction of printing software and hardware, he initiated the DECprint, Font, and PostScript programs. Prior to this, Rick was the program and development manager for the VA X DBMS, DATATRlEVE, COD, an d Rdb/VMS products and for the rel ational database architecture. Before joining Digital in 1974, Rick was an independent consultant and was also employed by Applied Data Research, In c. He holds A.B. (cum laude, 1969) and M.A. (1973) degrees in statistics from Princeton University.

Jeffrey A. Metzger Presently a senior engineer, Jeff came to Digital as a co-op student in 1983, working firstin the Semiconductor Engineering Group and then in Hardcopy En gi neering. He became a fu ll-time employee after gradu ating from Cornell University in 1985. He introduced Hardcopy to system-level logic simu­ lation, contributed to the hardware, software, and firmware development of the PrintServer 20, an d developed RETrACE, which is used to characterize the exe­ cution behavior of PrintServer systems. Jeff is currently working in the Entry Systems Business Group on a next-generation processor product.

4 I

Christopher J. Payson Chris Payson joined Digital as a hardware design engineer in 1989 after five co-op terms. He is currently working on XIE software and image hardware accelerators. Chris previously worked on performance testing, diagnostics, logic design, and demonstration software, all associatt:d with imaging. He is coapplicant for a patent related to an image clipping algo­ rithm and hardware logic. Chris received a B.S.C.E. from Rochester Institute of Technology with highest honors and is currently pursuing an M.S.C.E. from Northeastern University.

Thomas C. Porcher Principal engineer Tom Porcher is a member of the Video, Image and Print Systems Group. He provided technical leadership in the development of the VXT 2000 X terminal. Previously he was a technical leader for the VT240 terminal, VAX Session Support Utility, and the DEctt:rm . To m holds five patents for work on the VT240 terminal and on the multi­ session protocol used in the VT340 and VT400 series terminals. To m received his B.S. in mathematics from Stevens Institute of Technology (1975). Ht: is a member of the ACM.

Thomas E. Powers As a consultant engineer in the Hardcopy Engineering Firmware/Software Group, To m Powers is a vendor liaison for desktop PostScript printer products. He chairs the DECprint PAP Architecture Te am and was a contributor to the PrintServer 40 internal hardware/firmware archi­ tecture. To m represented Digital on American and international standards com­ mittees on from 1979 to 1989. He led several firmware teams and is coinventor of the ReGIS Graphics Protocol. To m has a B.S. E. E. from Tufts University and an M.S.E.C.E. from the University of Massachusetts at Amherst.

Peter A. Sichel As a principal software engineer in the Video Terminals Archi­ tecture Group, Peter Sichel led the development of the ACCESS. bus architecture and device protocol specifications, in addition to writing the initial ACCESS. bus device firmware. He worked on the VT420 video terminal and the DECterm DECwindows terminal emulator, and helps maintain Digital standards for video terminals and keyboards. Peter joined Digital in 1981 after receiving B.S. and M.S. degrees in computer engineering from the University of Michigan.

Guido R. Simone Guido Simone is a principal engineer in the Print Systems Engineering Group and was the project leader and architect for the turbo PrintServer 20 controller. He is currently working on the development of a new print system architecture to be used with advanced printing technologies. In previous work, Guido was the project leader and architect for an rtVAX 78R32 CPU chip-based laser printer controller. Before joining Digital in 1980, he received a B.S. in electrical engineering from Rensselaer Polytechnic Institute.

5 Biographies

Gary P. Vaillette Senior hardware engineer Gary Va illette has heen involved in the design and implementation of printing system hardware since joining Digital in 1983. His current work includes performance characterization of PostScript printers and PrintServer products, and hardware implementation of cern decompress ion in the turbo PrintScrver 20 product. Previously, Gary worked at Data General Corporation and helped to develop their token bus network product. He holds an A.A.E.E (1974) from Quinsigamond Community College and expects to receive a B.S.C.S. (May 1992) from Boston University.

Catherine F. Winsor As a senior engineer in the Video, Image and Print Sys­ tems Group, Cathy Winsor has worked on image accelerators. As the project leader for the DEC:image 1200 hardware and the image utility library software, Cathy wa s involved in the planning and development of an image-capable VT1200. She is currently leading the project to support imaging on the next generation of Digital's X terminals. The project inclucles an image accelerator board and XIE software. Cathy received an A.B. in engineering sciences from Dartmouth Co llege and a B.S.E. E. from the Thayer School of Engineering.

6 I Foreword

processing. The X terminal user can now benefit from the graphical user interface, sophisticated applications, and standards of perfo rmance previ­ ously available only on . X terminals run Xll server code which is independent and ideally suited for heterogeneous, network-based computing environments. In this issue you will read about the engineering decisions made as the X terminals were developed. There is a growing need in the industry to have imaging applications run alongside conventional text and graphics applications. Technical docu­ Larry Cabrinety mentation is an example of this. Imaging applica­ Vice President, tions, however, have special requirements to achieve Video, Image and Print acceptable end-user performance. Although the Xll Sy stems Group software can handle images as bit-map data, soft­ ware and hardware assistance is required to achieve acceptable performance. Digital has designed For the millions of people worldwide who use DECimage hardware accelerators for rapid process­ Digital's computer equipment, the computer is not ing of image data. This technology is included in the sophisticated system in the back room, or the the DECimage 1200 and will be incorporated in complex network. It is the equipment they use following generations of X terminals. To make this each day-the terminal or monitor, keyboard and possible, Digital developed extensions to the mouse, desktop printer or network printer system. X server software that support the high-speed To day's users demand products with high levels transport and display of image data. To assure open of usability and superior ergonomic fe atures. standards, the extensions have been proposed to Digital's products set worldwide standards for the MIT for incorporation into releases of the Xll user interface to computer systems. In the 90s our server software. focus is to offe r products that operate in multi­ In November 1990, Digital announced its next vendor environments with the goal of delivering a generation of X terminals. The VXT2000 terminal complete computing solution. In this issue you will provides virtual memory and supports both a tra­ read about some of the Video, Image and Print Sys­ ditional host-based model with software down­ tems (VIPS) Group's products and technologies that loaded to the terminals as well as the server style of support network computing and standards-based X terminal computing. environments. The VXT2000 terminal was designed to support Digital entered the video terminal market in 1975 TCP/IP and LAT protocols,and further demonstrates with the VT52 for its time-sharing users. Its replace­ our commitment to openness and support for cus­ ment, the VT100, embodied two important princi­ tomers' multivendor environments. This same phi­ ples-the use of standards in data communications losophy is seen in our printer products and our open interchange and the protection of customer invest­ desktop bus. ments through backwardcompa tibility of new gen­ Digital pioneered the distributed printing busi­ erations of products. The VT220, introduced in ness with networked laser printers. This prod­ 1983, and the cost-effective VT320 terminals saw uct area began when we combined two concepts the addition of functionality and ergonomic fea­ which had not been combined before-mid-range tures which established Digital as a leader in the laser printers and networks. In the mid-1980s most commodities market. large-scale computing was done on mainframe In March 1990, Digital entered the X terminal computers with large printers attached directly market with the introduction of the VTlOOO, fol­ to these systems. Typically these dedicated print­ lowed by the VT1200 and VT1300 terminals later ers were only accessible to users on that particular that year. The emergence of MIT's X Window system. Digital's distributed computing provided Systems as the accepted industry standard for an alternative to the mainframe. By combining the windowing systems provided a standards-based power of multiple systems in clusters or on net­ environment for distributed applications display works, a new distributed large system was created.

7 Fo reword

A printing solution was needed to effectively work original PrintServer 40, along with cu stom hard· in this new distributed computing environment. ware acceleration hoards developed by the liard­ The PrintSe rver se rie s addn:ssed this need. copy Group to enable printing at 40 pagt:s per PrintServer products en abled printing resources minute. In this issue you will read about th<.:single· to be directly connected to networks for the first board controller that replaces the MicroVAX 11 and time, and since they were on the network and not offers far more processing power. Us ing the latest tied to any one system, they were accessible hy all system-on-a-chip technology, our new turbo board systems on those networks. They enabled the com· provides leadership performance for our printers. plex printer fu nctionality previously found only in The CCITIimag e decompression chip enables us to dedicated mainframe printers to be distributed provide full-speed image printing to our customers throughout end-user environments. as the image market develops. As these mid-range printers migrated out of the The first PrintServersystems su pported printing computer room and into the office, new demands from VMS hosts ovt:r DECnet networks. Since then for functionality were created. La rge groups of users the breadth of platform support has increased to brought many different requirements for printing, include firstULTRJX systems and then UNIX operat· and our goal was to satisfy as many as possible in a ing systems. A software kit for Sun systems will be single PrintServer. For example, some people need available soon. In expanding PrintServer connectiv· "A" size paper for office correspondence, while ity to include UNIX systems and TCP/11' network s, others may need "B" size paper for CAD/CAl\1 or we again faced the problem that no network print· accounting work, and still others need transparen­ ing protocol existed for TCP/IP. With the help of cies for presentations. The PrintServer is flex ible Digital's experts at the We stern Re search Labora· enough to have all of these different types of tory, we were able to devdop a solution. In this media available and offer both simplex and duplex issue, we discuss the creation of a network printer printing. access protocol for TCP/ IP. To day this network pro­ In 1985 when Digital was first developing the tocol is a proposed standard at the Internet Engi· PrintServer, there was no industry standard way neering Task Force, the body controll ing the TCP/IP of describing the contents of a page to a printer. protocol. Each major vendor had its proprietary language, The development of the ACCESS. bus product has and none offered the compatibility necessary to brought an easy, standard way to link a desktop achieve our print system vision. Our goal was to computer to many interactive user interfaces. This create a fam ily of products, from large to small, open desktop bus is currently implemented on the that offered compatibility for all applications. To Personal DECstation 5000 workstation, and imple· achieve this goal we had to select a protocol mentations on future RISC workstations and video that would en able us to print any file on any terminals is underway. Developers of Digital's prod· printer. At that time Adobe Systems, the developer ucts will continue to place a high priority on open of PostScript, was a small start-up company in standards. The papers included in this issu t: of the Silicon Va lley. PostScript was not a standard, and in Digital Te chnical jo urnal will providt:insight into fa ct, only a single PostScript laser printer model the key areas of technology ust:d in the design and had been shipped, the original Apple LaserWriter. development of VJPS products. Our technical community fe lt PostScript was the best solution to ou r net:ds, and at that point Digital committed to adopting PostScript as our strategic page description language. PostScript printers and PostScript application support are now pervasive throughout the industry and standard printing protocols enable interactive communication with hosts on the network. Significant advances have taken place in the PrintServer series over the past seven years. An entire :VlicroVAX II system was housed within the

8 Christopher]. Payson Christopher]. Cianciolo Robert N. Crouse Catherine E Winsor

Hardware Acceleratorsfor Bitonal Image Processing

Electronic imaging systems transfer views of real-world scenes or objects into digital bits for storage, manipulation, and viewing. In the area of bitonal images, a large market exists in document management, which consists of scanning vol­ umes of papers for storage and retrieval. However, high scan densities produce huge volumes of data, requiring compression and decompression techniques to pre­ serve system memory and improve system througbput. These tecbniques, as well as general image processing algorithms, are compute-intensive and require high memory bandwidth. To address tbe memory issues, and to acbieve interactive image display petformance, Digital bas designed a series of bitonal image hard­ ware accelerators. Tbe intent was to create interactive media view stations, with imaging applications alongside other applications. In addition to achieuirzg mem­ ory, performance, and versatility goals, the hardware accelerators have signifi­ cantly improved finalimage legibility

Bitonal image technology, which can be viewed as should come of age in the 1990s. Digital imaging the electronic version of today's microfilm method, is already in use in many areas and new applica­ is experiencing a high rate of growth. However, the tions are being created for both commercial and electronic image data objects generated and manip­ scientific markets. The emergence of digital images ulated in this technology are very large and require as standard data types supported by the majority intensive processing. In a generic system, these of systems (like text and graphics of today) seems requirements can result in poor image processing assured. For a greater understanding of specific performance or reduced application performance. imaging applications, this section presents general To address these needs, Digital has designed a series imaging concepts and terms used throughout the of imaging hardware accelerators for use in the doc­ paper. ument management market. This paper provides a tutorial on electronic Concepts and Terms imaging. It begins with a general description of the In its simplest form, imaging is the digital repre­ imaging data type and compares this type to the sentation of real-world scenes or objects. Just as a standard text and graphics data types. It continues camera transfers a view of the real world onto a with a discussion of specific issuesin bitonal imag­ chemical film, an electronic imaging system trans­ ing, such as image data size, network transport fers the same view into digital bits for storage, method, rendering speed, and end-user legibility. manipulation, and viewing. In this paper, the term The paper then focuses on Digital's DECimage 1200 image refers to the digital bits and bytes that repre­ hardware accelerator for the VT 1200 X window sent the real-world view. terminal developed by the Video, Image and Print The process of digitizing the view may be done Systems Group. It concludes with fu ture image through various methods, e.g., an image scanner accelerator demands for the processing of multi­ or image camera. A scanner is the conceptual media applications and continuous-tone images. inverse of a normal printer. A printer accepts an electronic stream of bits that describe how to Introduction to Imaging place the ink on the paper to create the desired Just as graphics technology blossomed in the 1980s, picture. Conversely, optical sensors in the scanner electronic imaging and its associated technologies transform light intensity values reflected from a

Digital Te chnical journal Vol. 3 No. 4 Fa/11991 9 Image Processing, Video Terminals, and Printer Technologies

sheet of paper and create a stream of electronic bits media density. Scalingan image may be as simple as to describe the picture. Similar sensors in the focal replicating and dropping pixels, or it may involve plane of a camera produce the other common digi­ interpolation and other algorithms that take neigh­ tization method, the electronic image camera. boring pixels into account. Generally, the more The format of a digitized image has many param­ complex scaling algorithms require more process­ eters. A pixel is the common name for a group of ing power but yield higher-quality images, where digitized image bits that all correspond to the same quality refers to how wel I the original scene is rep­ location in the image. This pixel contains informa­ resented in the resulting image. tion about the intensity and color of the image at Before an image can be displayed, its pixel values one location, in a format that can be interpreted often require conversion to account for the charac­ and transformed into a visible dot on a display teristics of the display device. As a simple example, device such as a printer or screen. The amount of a color image cannot retain its color when output information in the pixel classifies the image into to a black-and-white video monitor or printer. In one of three basic types. general, when a device can display fewer colors than an imagecontains, the image pixel values must A bitonal image has only one bit in each pixel; • be quantized. Simple quantizing, or thresholding, the bit is either a one or a zero, representing one can be used to reduce the number of image colors of two possible colors (usually black and white). to the number of display colors, but can result in

• A gray-scale image has multiple bits in each pixel, loss of image quality. Dithering is a more sophisti­ where each pixel represents an intensity value cated method of quantizing, which produces the between one color (all zeros) and another color illusion of true gray scale or color. Although dither­ (all ones). Since the two colors are usually black ing need use no more colors than simple quantiz­ and white, they produce a range of gray-scale ing, it results in displayed images of much higher values to represent the image. quality. Image compression is a transformation process A color image has multiple components per • used to reduce the amount of memory required to pixel, where each component is a group of store the information that represents the image. bits representing a value within a given range. Different compression methods are used for bitonal Each component of a color image corresponds images than those used for gray-scale and color to a part of the color space in which it is repre­ images. These methods are standardized to specify sented. Color spaces may be thought of as dif­ exactly how to compress and decompress each fe rent ways of representing the analog, visible type of image. For bitonal images, the most com­ range of colors in a digitized, numeric form. The mon standards are the ones used in facsimile most popular color spaces are television's YlN machines, i.e., Recommendations T.4 and T.6 of the format (one gray-scale and two color compo­ Comite Consultatif Internationale de Telegraphique nents) and the bit-mapped computer display's et Telephonique (CCJTT).1·2 Commonly known as RGB format (red, green, and blue components). the Group 3 and Group 4 standards, the desig­ The resolution of an image is simply the density nations are often shortened to G3-ID, G3-2D, and of pixels per unit distance; the most common den­ G4-2D, referring to the particular standard group sities are measured in dots per inch (dpi), where aod to the coding method, which may be either a pixel is called a dot. For example, a facsimile one- or two-dimensional. For gray-scale and color machine (which is nothing more than a scanner, images, the joint Photographic Experts Group printer, and phone modem in the same unit) typi­ OPEG) standard is now emerging as a joint effort of cally scans and prints at 100 dpi, although newer the International Standards Organization (ISO) and models are capable of up to 400 dpi. As another CCITI.0 Whichever format or process is used, com­ example, most workstation display monitors are pression is a compute-intensive task that involves capable of 75- to 100-dpi resolution, and some high­ mathematically removing redundancy from the end monitors achieve up to 300-dpi resolution. pixel data. To display an image at a density diffe rent from A typical compression method creates an its scanned density, without altering the image's encoded bit stream which cannot be displayed original size, requires the image to be scaled, so directly; the compressed bits must be decom­ that the new image density matches the output pressed before anything recognizable may be

10 Vol. 3 No. 4 Fu /119')1 Digital Te chuicaljournal Ha rdware Accelerators fo r Bitonal Image Processing

displayed. The term compression ratio represents The major drawback in the imaging process is the size of the original image divided by the size of increased data size, which results in storage mem­ the compressed form. For bitonal images using ory and network transport problems. High scan the CCITI standards, the ratio is commonly 20:1 on densities and color information components create normal paper documents, but can vary widely with large volumes of data for each image; a bitonal the actual content of the image. The CCITI stan­ image scanned at 300 dpi from an 8.5-by-11-inch dards are also "lossless" methods, which means that sheet of paper requires over 1 megabyte of mem­ the decompressed image is guaranteed to be iden­ ory in its original pixel fo rm. Therefore, compres­ tical to the original image (not one bit different). sion and decompression are integral parts of any In contrast, many "lossy" compression methods imaging system. Even in compressed form, a bitonal allow the user to vary the compression ratio such image of a text page requires about 50 kilobyt es that a low ratio yields a nearly perfect image repro­ of storage, whereas its American standard code duction and a high ratio yields a visible degradation for information interchange (ASCII) text equivalent in image quality. This trade-off between compres­ requires only 4 to 5 kilobytes. Similarly, a graphics sion and image quality is very useful because of the file to describe a simple block diagram is much wide range of applications in imaging. An applica­ smaller than its scanned image equivalent. tion need pay no more in memory space and band­ Based on these advantages and limitations, sev­ width than necessary to meet image quality eral applications have emerged as perfect matches requirements. for imaging technology. Bitonal images are used in the expanding market of document manage­ A Ne w Data Typ e and Its Features ment, which consists of scanning volumes of The image data type is fundamentally different papers into images. These images are stored and from text and graphics. When a user views charac­ indexed for later searching and viewing. Basically ters or pictures on a display device, the source of an electronic file cabinet, this system results in that view is usually not important. A sheet of text large savings in physical cabinet space, extremely from a printer may have come from either a text file fast document access, and the ability for multiple where the printer's own fonts were used, a graph­ users to access the same document simultaneously. ics filewhere the characters were drawn with line Gray-scale imaging is often used in medical appli­ primitives, or an image file where the original text cations. Electronic versions of X rays can be sent document was scanned into the system. In any instantly to any specialist in the world for diagno­ case, the same letters and words present the user sis, and the ordering of sequential computer-aided with approximately the same information; the dif­ testing (CAT)-scan images into a "volume" can pro­ fe rences are mostly in character quality and format. vide valuable three-dimensional views. The appli­ In spite of their large storage space require­ cations for color imaging are relatively new and ments, images have several advantages over graph­ still emerging, but some are already in use commer­ ics or text. First, consider the process of getting cially, e.g., license and conference registration pho­ the information into the computer. With the imag­ tographs. A fu rther extension to still imaging is ing process, documents may be scanned automati­ digital video, which can be considered as a stream cally in a few seconds or less, compared to the time of still images. In conjunction with audio, digital required for someone to type the information cor­ video is commonly known as multimedia, applica­ rectly (absolutely no errors) into a text file. Also, tions for which range from promotional presenta­ even though the software exists to convert elec­ tions to a manufacturing assembly process tutorial. tronic raster images into graphic primitive files,the In this paper, we focus on the static bitonal imag­ process loses detail from the original image and is ing method of representing real-world data inside relatively slow. Next, consider the variety of infor­ computers. Static imaging is a simpler method of mation possible on a sheet of paper: a user can­ representing a broader range of information than not easily reproduce a diagram or a signature on a the text and graphics media types, but it carries document. A scanned image preserves not only the a greater requirement for processing power and characters, but their fo nt, size, boldness, relative memory space. In addition, static imaging can be position, any pictures on the page, and even viewed as one part of true multimedia, as can text, smudges or tears depending on the quality of the graphics, audio, video, and any other media for­ image scan. mats. Ye t static imaging does not have the system

Digital Te chnical journal Vol. 3 No. 4 Fa ll 1991 11 Image Processing, Video Te rminals, and Printer Te chnologies

speed requirements of a motion video and audio mately 3 kilobytes of memory. If the same sheet of system, which must present data at real-time rates. paper is digitized by scanning at various dot den­ As long as the user can tlcal with static imagcs at sities, the resulting data tiles are huge, as shown an interactive rate, i.e., being ablc to view the by the decompressed bitonal image sizes in Ta ble !. images in the fo rmat of choice as fast as the user Note that Ta ble 2 includes the size of the scanned can sdect them, then static imaging is a powerful image if scanned in gray-scale and color modes, media presentation tool. The next section presents although using these modes would not make sense the important issues concerning bitonal imaging in on a black-and-whit�: sheet of paper. The image a tlocumcnt managcment env ironment. sizes arc included for comparison and are discussed in the section Future Image Accelerator Require­ Bitonal Imaging Issues ments. The clara presented in Ta bles 1 and 2 illus­ As previously mentioned, bitonal electronic imag­ trates that the size of the original ASCII file is much ing as an alternative to paper documcnts offers smaller than any of the scanned versions. The data many benehts, such as reducctl physical storage also gives evidence that scanned images, in general, space, instant and simultaneous access of scanned require considerable memory. images, and in general a more accessible media. Since the typical use for bitonal images is for Serious issues need to be resolvetl before a produc­ volume document archival, an imaging application tive imaging operation can be implemented. The must include a compression process to reduce mem­ chief issues are the image data size, transport ory usage. This process must transform the original method, perceivetl rendering speed, and fi nal legi­ scanned image file to a much smaller file without bility. In the following sections, we examine each losing the content of the original scanned data. issue and present solutions. Compression algorithms may take different paths to achieve the same re sult, but they share one basic Digitized Image Data Size process, the removal of redundant information to The most important issue concerns image data size. reduce the object size. A common compression Images are typically documents, drawings, or pic­ routine searches the pixel data for groupings, or tures that have been digitized into a computer­ "run lengths," of black or white pixels. Each run readable form for storage antl retrieval. Depending length is assigned a code significantly shorter than on the dot density of the scanner, a single image the ru n length itself The codes are assigned by can be I to 30 megabytes or more in size. However, statistics, where the most frequent run lengths storing a single image in its scanned form is not the are assigned the shortest codes; statistics have been typical usage model. Instead, a company may have amassed on a variety of document types for diffe r­ tens of thousands of scanned documents. Clearly, ent scan densities and document sizes. A compres­ with today's storage technologies, a company can­ sion process parses through the original image not afford to store such a large volume of images file,ge nerating another tilethat contains the codes in that format. representing the original image. Figure I, a sample A typical ASCII file representing the text on an bitonal image co mpression, illustrates these com­ 8.5-by-1 1-inch sheet of paper requires approxi- pressed codes in a serial bit stream.

Ta ble 1 Sample Bitonal Image Sizes

Scan Kilol'ytes of Data Document Ty pe Density Pixel Form Ty pical (Paper Size) (dpi) (Decompressed) Compressed

A size 100 114 46 (8.5 X 11 inch) 200 457 47 300 1027 50 Esize 100 1826 106 (44 x 34 inch) 200 7305 114 300 16436 127

12 Vo l. 3 No. 4 Fa ll 1991 Digital Te chnical jo urnal Ha rdware Acceleratorsfo r Bitonal Image Processing

Ta ble 2 Sample Gray-scale and Color Ne twork Transport Co nstraints Image Sizes The network transport performance for an image is important, because images arc most often stored Kilobytes of Data Document Ty pe in Pixel Form on a remote system and viewed on a widespread and Size (Decompressed) group of display stations. For example, one group in an insurance company receives and scans claim 128 x 128 pixel, 12 bits per pixel 24 papers to create a centralized image database, gray-scale image while users in another group acn:ss the documents 51 2 X 512 pixel, 8 bits per pixel 256 color image simultaneously to process claims. For the imaging system to be productive. this image data needs 51 2 x 51 2 pixel, 24 bits per pixel 768 color image to be transported quickly from one group to the other: telephone attendants answering calls must 8.5 X 11 inch, 100 dpi, 24 bits 2740 per pixel, color image have immediate access to the data. Scanned image documents take a long time to transport between systems, simply because they are Several algorithms for bitonal compression are so large. When compression techniques are used, a widdy used today. As mentioned in the previous typical uncompressed image stored in 1 megabyte section, the most common for bitonal images are can be reduced to approximately 50 kilobytes. the CCllT standards G3-l D, G3-2D, and G4-2D, which Since transport time is proportional to the number all use the approach just described. For the one­ of packets that must be sent across the network, dimensional method, the algorithm creates run reducing the data size to .:; percent of its original lengths from all pixels on the same scan line. In the size also reduces the transport time to 5 percent two-dimensional methods, the algorithm some­ of the original time. Therefore, you can now send times creates run lengths the same way, but the twenty compressed images in the same time previ­ previous scan line is also examined. Some codes ously spent sending one uncompressed image. represent run lengths and even whole scan lines Even with compression techniques, the image as "the same as the one in the previous scan line, files are still larger than their text file equivalents. except offset by .v pixels," where N is a small inte­ Moreover, most network protocols limit their ger. The two-dimensional method takes advan­ packet size to a maximum number of bytes, i.e., an tage of most of the redundancy in an image and image file larger than the maximum packet size returns the smallest compressed file. In addition to gets divided over multiple packets. If the protocol preserving system memory, these compression requires an acknowledgmenr between packets, then methods significantly improve network transport the transport of a large file over a busy network performance. becomes a lengthy operation.

IMAGE PIXELS

o AN LINE o F ' Ls -_ - t?t1 1r---- _�r __,...,. --, �600 WHITE 250 WHITE U 845 WHITE ACK

... 01 1010000101000 000001 1 01000 0101 110101 101 1 000001 1 01000 01 101001 000001 1 ..

COMPRESSED CODES IN A CONTINUOUS BIT STREAM

Figure I Bitonal Image Co mp ression

Digital Te chllicalJo ur11al VlJI. 3 No. 4 Fa ll 1991 13 Image Processing, Video Te rminals, and Printer Te chnologies

The platform for our most recent ac celerator is them. The disk ac cess time data from the DECimage the VT1200 X window terminal, which uses the project analysis shown in Table 3 demonstrates that local area transport (I.AT) network protocol. We the disk file read time is a significant portion of the soon realized that the X server packet size was overall response time. Thus, the display station limited to 16 kilobytes and the typical A-size render time is the only are a of the display response compressed document was approximately 50 kilo­ time which can be clearly intluenced and is, there­ bytes. With this arrangement, each image transport fore, the main focus of our image accelerators. The would have required fo ur large data packets and local processing that must occur at the display sta­ four acknowledgment packets. Working with the tion is not a trivial task; an image must be decom­ X Window Te rminal Base System Software Group, pressed, scaled, and clipped to fit the user's current we were able to rai se the packet size limit to window size, and optionally rotated. 64 kilobytes. The base system group also imple­ The decompression procedure inverts the com­ mented a delayed acknowledgment scheme, which pression process; both are computationally com­ eliminates the need for the client to wait for an plex. Input to the procedure is compressed data, acknowledgment packet before sending the next and output is the original scan line pixel data, data packet. Ta ble 3 shows compressed image data which can be written to a display device. Scal ing taken during the DECimage 1200 development the data to fit the current window or fill a region cycle. Notice that the network transport times of interest is not trivial either: a huge input data for Digital document interchange fo rmat (DDIF) stream must be processed (the decompressed, orig­ decrease sharply aft er the packet changes. inal file), and a moderate output data stream must be created (the viewable image to be displayed). Pe rceived Rendering Sp eed While simple pixel replicate and drop algorithms Because the image scanning and compression may be used to scale the data, a more sophisticated operations occur only once, they are not as scaling algorithm has been shown to greatly perform ance-critical as the decompression and enhance the output image quality. rendering for display operations, which are done In add ition to scaling an d clipping, the orthogo­ many times. Decompression and rendering are part nal rotation of image s (in 90-degree increments) of the system's display response time, which is a is a useful function on a display station. Some docu­ critical factor in a system designed fo r high-volume ments may have words running in one direction applications that access thousands of images daily. while pictures are oriented another way, or the user This time is measured from the inst ant the user may wish to view a portrait-mode image in land­ presses the ke y to select an image to view, to the scape mode. In either case, orthogonal rotation moment the image is displayed completely on the can help the user understand the inform ation; i.e., screen. The display response time is a fu nction of the increased time to ro tate the view is warranted. the disk read time, network transport time, an d dis­ When an image is scanned, particularly with a play station render time. hand-held scanner, the paper is never perfectly Although network transport time and disk file al igned. Thus, the image often requires a rotation of read time have a direct effect on the response time, I to 10 degrees to make the view appear straight accelerator developers rarely have any control over in the image file. However, multiple users want the

Ta ble 3 DD IF Image File Read Ti me and File Tra nsport Performance

Network Tra nsport Time (milliseconds) Disk Read Time After Before Image Size (milliseconds) Packet Packet (kilobytes) MicroVAX II VAX 8800 VA X 6440 Change Change

19 1223 480 281 325 960 41 1534 655 332 61 4 1792 99 2351 1035 598 1351 3928

157 3288 1380 71 6 2283 6430

14 Vol. 3 No. 4 Fa ll 1991 Digital Te clmicaljourual Ha rdwa re Accelerators fo r Bitonal Image Processing

information from the document as quickly as pos­ General Design Stra tegy sible, and should not have to rotate the image by The number of applications using bitonal image a few degrees to make it perfectly straight on the data continues to increase. In general, these appli­ screen. Therefore, this minimal ro tation should cations attempt to offer low cost while achieving be done after the initial scanning process; i.e., an interactive level of performance, defined as only once, prior to indexing the material into no more than 1 second from point of request to the database, and not by every user in a distributed complete image display. Ultimately, software may environment. Because any form of rotation is provide this fu nctionality without hardware accel­ compute-intensive, allowing the user to perfo rm eration, but today's software cannot. Moreover, the minimal ro tations at a high-volume view station parameters of image systems are not static; scan wo uld reduce the application's perceived ren­ densities, overall image size, and the number of dering speed and add little value to the station's images per database will all increase. These function. increases will provide the most incentive for hard­ ware assist at the low end of the X window ter­ Final Legibility minals market, because software alone cannot While the primary issue facing imaging applica­ perform the amount of processing that users will tions is data size, image viewing issues must also be expect for their investment. addressed. In short, an effective bitonal imaging The Us er Model Although a single model cannot display system must be responsive to overall image suit every application, imaging is centered on cer­ display performance and the resulting quality of tain functions. Therefore, a user model built on the image displayed. To enhance our products, we these fu nctions would be very useful in mapping optimized the display performance parameters as individual steps to the hardware: hardware versus best we could, given that some parameters are not software performance, the fu nction's frequency of under our control. Improvements to monitor reso­ use, and the cost of implementation. lution and scanner densities continue to increase The general user model for bitonal imaging sys­ the legibility of images. An affordable image system tems is relatively simple. A small market exists fo r should increase the image legibility by rendering image entry stations, in which documents are a bitonal image into a gray-scale image using stan­ scanned, edited, and indexed into a database. While dard image processing techniques. We discuss the a high throughput rate is important at these sta­ method used in our accelerators, i.e., an intelligent tions, a general-purpose image accelerator is not scale operation in the hardware pipeline, in the the solution-dedicated entry stations already next section. exist in the market. Instead, we designed a general­ purpose platform, or versatile media view station, Ha rdware AcceleratorDesi gn to be used for imaging applications alongside other As explained in the previous section, transforming applications. The user model fo r this larger market documents into a stream of electronic bits is not is a set of operations for viewing and manipulating the demanding part of a bitonal imaging process images already entered into a database. The most for document management. Also, scanners and common operations in this model are decompres­ dedicated image data-entry stations abound in the sion, scaling, clipping, orthogonal ro tation, and marketplace already. Instead, the challenge lies in: region-of-interest zooming. (1) managing the image data size to control memory costs and reduce network slowdown; Display Performance and Quality Optimization (2) increasing the image rendering speed, i.e., The main thrust of the DECimage accelerator is to decompress the image , scale it, and clip it to fit achieve interactive perfo rmance for the operations the window size with optional rotation; and defined in the user model. A secondary goal is to (3) increasing the quality of the displayed images. bring added value to the system by increasing the This section describes the way our strategy quality of the displayed image compared to the influenced the design of DECimage products. We quality of the scanned image. A side effect of maxi­ also discuss the chips used for decompression and mizing performance in hardware is that the main scaling, and how Digital's existing client-server pro­ system processor has work off-loaded fro m it, free­ tocols support these imaging hardware accelerators. ing it fo r other tasks.

Digital Te chnical ]o un1al Vol. 3 No. 4 Fa ll 1991 15 Image Processing, Video Te rminals, and Printer Te chnologies

The general design of the accelerator uses a The most straightforward method of integra tion is pipelined approach. Since maximum performance to relocate the hardware from the present option is desired and a large amount of data must be pro­ to the main system processor board; successive cessed by the accelerator board, multiple passes steps of integration would consolidate mapped through the board arc not feasible. Similarly, the tar­ hardware to fe wer total devices. The most cost­ geted low cost does not allow a whole image buffer effective integration will be the inclusion of the on the board. With one exception (rotation), all mapped hardware in the processor in a way similar board processing should be dune in one pipeline, to a floating-point unit (FI'l"). Just as gra phics accel­ with the system processor simply feeding the input eration is now being incl uded in system processor end of the pipe and draining the output encl. design, images will eventually achieve the status of lkcause of the large amount of data to be read from a required data type and thus be supported in the the board and displayed on the screen, the proces­ base system processor. sor should only have to move that data, not do any further operations on it. lo this end, any logic Product Definition-What Doesthe Us er required to fo rmat the pixels for the display bitmap Wa nt? should be included in the pipeline. The previously described strategy was used in the design of the image accell"rator board for the Cost Reduction through Less E\pensive Sy stem DECimage 1200 system. The product requirements Components The net cost of a bitonal imaging called for a low-cost, high-performance document system is influenced by the capability of the assist image view station. These requirements evolved hardware. The capability of the hardware implies from the belief that most users currently investi­ Oexibility in the choice of other system hardware. gating imaging systems are interested in applica­ In this regard, the most significant impact on cost tions and hardware that will enable them to quickly occurs in the memory and the display. A system that and simultaneously view document images and run makes use of fa st decompression and scaling hard­ their existing non imaging appl ications. These users ware can quickly display compressed images from are involved with commercial and business appli­ memory. This means either more images can be cations, rather than scientific plicap ations. The maintained in the same memory, or the system can DECimage 1200 system was planned for the manage­ operate with less memory than it wou ld without ment of insurance claims processing, hospital the assist hardware; less memory means lower cost. patient medical records, bank records, and manu­ A mon.: dramatic effect on system cost is in the facturing documents. As previously stated, the display. Imaging systems generally need higher­ imaging fu nctions required for these view-oriented density displays than nonimaging systems, but the appl ications are high-speed decompression, scal­ cost of a 150-dpi display is approximately twice the ing, rotation, zooming, and clipping. cost of a 100-dpi display of the same dimensions. However, we fo und that we could increase legibil­ General Product Design ity, i.e., expand a bitonal image to a gray-scale repre­ In defining the image capable system, the key sentation, by using an intell igent scale operation points in the product requirements list were in the hardware pipeline. For ex ample, a bitonal image rendered to a 100-dpi display using the intel­ • High-perfo rmance image display ligent scale process gives the perceived legibility • Low cost of the same image rendered to a 150-dpi display with a simple scaling method. That is, hy adding the • Bitonal images only (not gray-scale or color) intell igent scale, a 100-dpi display can be used • View-only functions where previously only a 150-dpi display would be adequate. The need fo r high-performance display influ­ enced the project team to design the hardware Cost Reduction through In tegration Presently, as accelerator board to handle image decompression, in the DECimage 1200, hardware-assisted image scaling, and ro tation. Previous performance test­ manipulation exists as a board-ll"vel option. Higher ing on a 3-VUP (VAX-1 1/780 units of performance) levels of integration with the base platform will CPU had yie·lded image software display times from provide lower overa ll cost for an imaging system. 5 to 19 seconds. These images were compressed

16 lk)f. 3 No. 4 rail 19'JI Digital Te cht�ical jo urnal Ha rdware Accelerators fo r Bitonal Image Processing

according to the cenT Group 4 standard (300 dpi, Group 3 and Group 4 bitonal compression algo­ 8.5-by-1 1 inches), and ranged from 20 to 100 kilo­ rithms. This specialization allowed the use of a bytes in size. In addition, the software display times Digital application-specificintegrated circuit (ASIC) were highly dependent on the image data content. decompression chip. Finally, the view-only require­ The more complex image files, which had lower ment limited the scope and complexity of the compression ratios, took significantly longer to design by eliminating the need for extra hardware decompress, scale, and display than the simpler to handle the compression of images after they image files. For example, an A-size, 300-dpi, cenT have been scanned and edited. Group 4 compressed image with a compression ratio of 10:1 took approximately 18 seconds to Sp ecificProdu ct Design display, while another with a ratio of 33:1 took approximately 7 seconds. The decisions described in the previous section led The other three requirements Jed to decisions to our design of an image accelerator board that about the specific design of the image accelerator supports: CCI1TGroup 3 and Group 4 image decom­ board. The need for low cost meant designing an pression using an ASIC decompression chip; integer option for an existing low-cost platform, which led scaling using an ASIC scaling chip; orthogonal rota­ us to Digital's VT1200 X window terminal. This tion; and image display. Figure 3 shows a general requirement also led to our support of the pro­ block diagram of the board and how it fits into posed X Image Extension (XIE) protocol.4 The XIE Digital's VT1200 system architecture. The accelera­ protocol extends the X ll core protocol to enable tor board is attached to the system address/data the transfer of compressed images across the wire bus, and its registers, data input port, and data and to enable interactive image rendition and dis­ output port are mapped into the CPU's 110 space. play at the server. In the X windowing client-server The accelerator board is accessed by reading and environment, image applications and compressed writing specific addresses like any other system image files exist on the client host machine, as memory space. Note that the image accelerator depicted in Figure 2. In addition, the XIE protocol logic is separate from the video terminal logic. standardizes the interface-to-image functions in Decompressed images are read from the image the X windowing environment and enables the board and written to the base system video mem­ development of a common application that can be ory for display. used on any XIE-capable station. The client applica­ The main operation consists of the following tion issues commands to the X server display sub­ steps: compressed image data is read from system system and the XIE specialized image subsystem. memory and written to the ASIC decompression When a user selects an image to view, the com­ buffe r by the processor; the data is then decom­ pressed image file is transported from the client­ pressed, scaled by the ASIC sealing chip, packed side storage device to the X server memory. into words, and written to the output buffer. Because the proposed accelerator would han­ Figure 4 shows a detailed block diagram of the dle only bitonal images, we could specialize our image accelerator board logic. The scaling chip out­ board to decompress only the standard CCilT puts pixels of data (1 bit per pixel in this case) which are packed into words using shift registers. As soon as a word of data is available, the scaling chip output halts. Control signals generated in pro­ X/X IE CLIENT X/X IE grammable array logic (PAL) write the packed word APPLICATION SERVER X1 1 AND XIE into the output buffer and tell the scaling chip to WIRE PROTOCOL begin outputting pixels again. When the output buffe r is full, the processor reads the rendered image data from the buffer. If rotation is required,

COMPRESSED the processor writes the data to the rotation IMAGE STORAGEI IMAGE DISPLAY matrix; otherwise, the data is clipped and written • DISK HARDWARE • CD ROM to the bit map. The image driver software, after set­ ting up the board, alternates between checking whether the input buffer is empty and whether the Figure 2 X Client-server Architecture output buffer is fu ll.

Digital Te chllicaljournal Vo l. 3 No. 4 Fa ll 1991 17 Image Processing, Video Te rminals, and Printer Te chnologies

NETWORK SYSTEM IMAGE c:J INTERFACE MEMORY HARDWARE t ! t ! < SYSTEM ADDRESS/DATA BUS > f ! ! DIAGNOSTIC OPTION VIDEO DISPLAY MEMORY ROM MEMORY f-

F(!{ure3 VT/200Sy stem Architecture

The rotation circuit handles 90- and 270-degree to be decompressed before it can begin scaling; rotation, whereas 180-clegree rotation is hand led insteacl, scaling begins as soon as the first byte of in the data packing shift registers by changing the data is output from the decompressor. Thus dif­ shift direction. The circuit rotates an 8-by-8-bit ferent pieces of the image file are being decom­ block of data at a time. The first byte of eight con­ pressed, scaled, and rotated simultaneously. The secutive scan lines is written into eight individual hardware pipeline also eliminates the need to byte-wide registers. The most significant bit (MSB) store the fully uncompressed image (approximately of each of these registers is connected to the byte­ 1 megabyte of data for A-size 300-dpi images) in wide rotation output port latch. A processor read memory. The compressed image is written from of this port triggers a simultaneous shift in all of the system memory to the accelerator board and a rotation data registers so that the next bit of each decompressed, scaled, and clipped image is read register is now latched at the rotation output port from the board. Because of the speed of the hard­ for the next read. Figure 5 diagrams the rotation ware, the software can redisplay an image with dif­ circuitry just described. ferent sealing, cJ ipping, or ro tation parameters; it lb achieve the best performance, we pipelined merely changes the hardware setup for the diffe r­ the functional blocks in the hardware. The scaling ent parameters and sends the compressed image engine docs not need to wait for the entire image tile back through the accelerator board pipel in e.

INPUT SCAN LINE BUFFER RAM BUFFER ! ! DECOMPRESSOR SCALING DATA PACKING FIFO OUTPUT CHIP - CHIP r-- CIRCUIT - BUFFER

! SYSTEM ADDRESS/DATA BUS t I t COMMAND 8x8 AND STATUS ROTATION REGISTER MATRIX

Figure 4 Block Diagram of DECimage 1200 Accelemtor Ha rdware

18 Vo l. .i No. 4 Fa ll 1991 Digital Te chnical jo urnal Hardware Accelerators j0 1· Bitonal /mage Processing

the word contain the number of consecutive white or black pixels (called the run length). The upper 8 SCAN LINES OF IMAGE DATA #1 #2 #3 bit of the word contains the run-length color code 8-BYTE 18-BYTE 1 8-BYTE BLOCKI BLOCK BLOCK I (0 for a white run and 1 for a black run). An FLC is read from the FIFO buffe r and decoded into one of eight routine types. Each routine is made up of sev­ eral states that control the color code toggling, run­ 8x8 ROTATION length adder, and accumulator circuits. At the end 01ROTATION234567 OUTPUT MATRIX REGISTER of each routine, a new word containing the run­ 0 t------1 length and color information is written into a FIFO 1 SCAN LINE 2 buffer for the finalstage. t------1 SCAN LINE 3 1------i - The final stage of the decompressor chip con­ SCAN LINE 4 1------t verts the run-length and color information to black SCAN LINE 5 1------i or white pixels. This stage outputs these pixels in SCAN LINE 6 1------i 16-bit chunks when the scaling chip sends a signal SCAN LINE 7 '------' SCAN LINE indicating a readiness to accept more data. SCAN LINE 01 234567 Scaling Chip The primary purpose of the scaling chip is to input high-resolution document images (300 dpi) and scale them for display on a medium­ DISPLAY BYTE density monitor (100 dpi). The chip offers inde­ pendent scaling in the horizontal and vertical Figure 5 Rotation Ma trix directions. The scaling design implemented in the chip is a patented algorithm that maps the input ASIC Design Description image space to the output image space. General The ASIC design consists of a decompressor chip, M-to-N pixel scaling is provided where M and N are which decodes the compressed image data to pixel integers between 1 and 127, with the delta between image data, and a scaling chip, which converts the them less than 65. M represents the number of pix­ image from the input size to the desired display size. els in and N represents the number of pixels out (in the approximated scale factor). Decompressor Ch ip The decompressor chip acts Given an image input size and a desired display as a CCITT binary image decoder. The chip contains size, we must find the M and N scale factors that three distinct stages, which are pipelined for the best approximate the desired scale factor, within most efficient data processing. Double buffe ring the range limits of M and N as previously stated. of compressed input data is implemented to enable Thus an input width of 3300 and a desired out­ simultaneous input data loading and image decod­ put width of 550 are represented by an 111 of 6 and ing to occur. Compressed data is loaded into the an N of 1. The approximated M and N values are input buffer by the processor through a 16- or 32-bit loaded into the chip scale registers for downscaling port. Handshaking controls the transfer of decom­ or upscaling. pressed data from the decompressor's 8-bit-wide The chip scaling logic uses the scale register val­ output bus to the scaling chip. ues to increment the input pointer position and The first stage of the decompressor chip con­ generate output pixels. A latched increment deci­ verts CCITT-standard Huffman codes, which are of sion term is updated every clock cycle, based on variable-length, to 8-bit, fixed-lengthcodes (FLCs) 5 the previous term and the scale register values. A sequential tree fo llower circuit is implemented When scaling down (where fewer pixels are output to handle this conversion. Every Huffman code cor­ than are input), the logic increments the input responds to a unique path through the tree, which pointer position every clock cycle, but only out­ ends at a leaf indicating the FLC. The 8-bit FLC is puts a pixel when the increment decision term is sent to a first-in, first-out (FIFO) buffer, which holds greater than or equal to zero. Figure 6a illustrates the data fo r the second stage. how this algorithm maps input pixels to output pix­ The second stage of the chip generates a 16-bit, els for a sample reduction. When scaling up (where run-length value from the FLC. The lower 15 bits of every input pixel represents at least one output

Digital Te chnical journal Vo l. 3 No. 4 Fa ll 1991 19 Image Processing, Video Te rminals, and Printer Tec hnologies

SCALE DOWN FROM 10 INPUT PIXELS TO 5 OUTPUT PIXELS

(M = 2 AND N = 1 )

I NIT = N = 1

DELTA1 = 2N = 2

DELTA2 = 2N -2M =-2

2 3 4 5 6 7 8 9 10 INPUT • • 0 0 • • 0 • • • INCREMENT

DECISION D = 1 D =-1 D = 1 D =-1 D = 1 D =-1 D = 1 D =-1 D = 1 D =-1 REGISTER

OUTPUT • 0 • 0 • 1 2 3 4 5

(a) Downscaling

SCALE UP FROM 3 INPUT PIXELS TO 9 OUTPUT PIXELS

(M = 1 AND N = 3)

INIT = 2M - N =-1 DELTA1 =2M=2

DELTA2 = 2M - 2N = -4

2 3 INPUT • • • INCREMENT

DECISION D =-1 D = 1 D = -3 D = -1 D = 1 D = -3 D =-1 D = 1 D = -3 REGISTER

OUTPUT • • • 0 • 0 0 0 • 2 3 4 5 6 7 8 9

(b) Up scaling

Figure 6 Chip Scaling Examples pixel), the.: logic outputs a pixel every clock cycle, tion and display at the server using the hardware but only increments the input pointer position accelerator board. Like the Xll protocol, the XIE when the increment decision term is greater than protocol consists of a client-side library called or equal to zero. Figure 6b illustrates how this algo­ XIElib, which provides client applications access rithm maps input pixels to output pixels for a sam­ to image routines, and a server-side piece, which ple magni.fication. For both cases, the value of the executes the client requests. The XIE server imple­ pixel (black or white) being output is the value of ments support at two levels: device-independent the input pixel pointed at during that clock cycle. and device-dependent. The device-dependent level In this dc.:scription, simply substitute rows for pix­ supports the fu nctions that benefit from optimi­ els to repn:sent the vertical scaling process. zation for a particular platform, or functions that are implemented in hardware accelerators. The Software Supportfo r the Ha rdware device-independent level enables quick porting of Software support is needed to enhance the func­ fu nctionality from platform to platform. Figure 7 tions of the hardware accelerator in our image view illustrates the X/XIE client-server architecture. station. As mentioned in the section General The client-side XIEiib offers the minimum Product Design, the XIE protocol extends the Xll functions necessary for image re ndition and dis­ core protocol to enable the transfer of compressed play. The tool kit level offe rs higher-level routines images across the wire and to enable image rendi- that assist with windows application development.

20 11JI. 3 No. 4 Fa ll 19';)1 Digital Te chnical jo urnal Ha rdware Accelerators fo r Bitonal /mage Processing

which is essentially as fast as the user can ask for the displays. This speed is possible because the image already resides in compressed form in the APPLICATIONS server memory. Thus, the image does not have to be read from the disk or transported across the network.

Future Image Accelerator Requirements I Hardware accelerators will continue to be required for bitonal imaging until software can provide the + XIELIB same functionality at the same performance level. This section discusses the more complex image schemes that are used for gray-scale imaging and multimedia applications. In contrast to bitonal -- DEVICE INDEPENDENT (DIX) imaging, these applications will require the use of hardware accelerators well into the future. XIE SERVER X SERVER Other applications will require richer user inter­ l DEVICE DEPENDENT+ (DDX) faces utilizing continuous-tone images, video, and audio. All of these new data types are generally I I data-intensive, and compression or decompression IMAGING DISPLAY I I of any one of them is a significant processing bur­ HARDWARE den. Handling them in combination indicates that the need for specialized hardware assistance will Figure 7 X/XIEArchitecture persist for the foreseeable fu ture.

An example of a routine at this level might be Continuous-tone Images lmageDisplay, which displays an image in a previ­ Bitonal images are either black or white at each ously created window. lmageDisplay parameters point, but some applications require smoothly might include and y scaling values, the rotation x shaded or colored images. These images are typi­ angle, and region-of-interest coordinates. Whether cally referred to as continuous-tone images, a term programming with the XIE protocol at the library that denotes either color or gray-scale, e.g., photo­ or toolkit level, applications developers benefit graphs, X rays, and still video. The representation from the platform interoperability of the standard and required processing of this image format is interface. Image accelerator hardware and opti­ significantly different from that of bitonal images. mized device-dependent XIE code changes the Continuous-tone images are represented by mul­ application's image display performance, but an tiple bits per pixel. This format allows a greater application developed using the XIE protocol can range of values for each pixel, which yields greater run on any XJE-capable server. accuracy in the representation of the original object. Additionally, each pixel can consist of mul­ Accelerator Performance Results tiple components, as in the case of color. The num­ With the DECimage I200 X terminal, we have ber of bits used to represent a continuous-tone achieved interactive performance rates, reduced image is chosen according to the nature of the memory usage, and increased final image legibility. image. We achieved these rates by transporting com­ For example, medical X rays require a high pressed filesinstead of huge pixel files and by imple­ degree of accuracy. Consequently, 12 bits are gener­ menting specialized image processing hardware. ally regarded as the minimum acceptable for the The DECimage 1200 can read, transport, decom­ rendering of this class of image. Color images typi­ press, scale, and display an 8.5-by-1 1-inch bitonal cally require 8 bits per pixel for each component document in I to 2 seconds. Successive displays, (YUV or RGB format) for a total of 24 bits per pixel. i.e., rotating, region-of-interest zooming, panning Table 2 shows the relative size of samples of each around the image, all occur in less than I second, image. The need to express these images in a

Digital Te chtzical]ournal Vo l. 3 No. 4 Fall 1991 21 Image Processing, Video Te rminals, and Printer Te chnologies

compressed format is obvious from the storage because any change to either the evolving standard space requirements and the current storage media or a standard extension could be easily introduced limits. to the firmware. The fastest implementations arc The compression of continuous-tone images can achieved by special-purpose hardware accelerators. be accomplished in several ways. However, most The ]PEG implementation docs not require hard­ imaging applications are not closed systems; ware, i.e., the algorithm can be performed com­ inevitably, each system needs to manipulate images pletely in software. The case for hardware assist that are not of its own making. for this reason we is made in performance. Ta ble 5 describes the acloptccl the ]PEG standard, which specifics an algo­ reduced instruction set computer (!USC) processor rithm for the compression of gray-scale and color performance, in millions of operations per seconcl images. Specifically, the ]PEG compression method (mops), needed to provide the specified operation is based on the two-dimensional (20) discrete at a motion video rate of 30 frames per second 7 cosine transform (OCT). The OCT decomposes an However, generic IUSC processors of those speeds 8-by-H rectangle of pixels into its 64 20 spatial­ are not available today Therefore, dedicated, ctts­ frequency components. The sum of these 64 20 tom very large-scale integration (VLSI) devices sinusoids exactly reconstructs the 8-by-8 rectangle. (such as the CL550-10 from C-Cube Microsystems) However, the rectangle is approximated-and com­ must be used to perform the operations 8 Even pression is achieved-by discarding most of the if the motion video rate is not required , the ASIC 64 components. Typically adjacent pixel values devices offe r the simplest hardware solution. vary slowly, thus there is little energy in most of the discarded high-frequency components. Live Vi deo and Vi deo Compression The edges of objects generally contribute to the Video captures the natural progression of events in high-frequency components of an image, whereas an environment, and is therefore a natural and the low-frequency components are made up of efficientway to communicate. Consider, for exam­ intensities that vary more gradually The more ple, the assembly of a set of components. One way frequency components included in the approxi­ to express the assembly process is to show a series mation, the more accurate the approximation of photographs of the assembly at successive steps becomes. Ta ble 4 shows some sample ]PEG image of completion. As an alternative, video can show compression ratios -" the actual assembly process from start to finish. The most popular part of the ]PEG stanclarcl, Subtle details of the process such as part rotations the "'baseline" method, was defined to be easily and movements can be clearly conveyed , with the mapped into software, firmware, or hardware. added dimension of time. Straightforward OCT algorithms can be efficiently Obviously, information expressed in video form implemented in firmware for programmable DSP can be valuable; however, significantproblems arise chips, clue to their pipelined architecture. The first in adapting video for use in computer systems. systems to embody the standard did so using DSPs, First, the huge data size of video applications can strain the system's storage capability. Video can

Ta ble 4 Ty pical Compression Parameters be characterized as a stream of continuous-tone for JPEG images. Each of these images consists of pixel val­ ues with individual components making up each Compression Compression Rendered Image pixel. For video to have fu ll effectiveness, the still Ratio Method Integrity images must be presented at video rates. In many

2:1 Lossless Highest quality- ctses the rate to faithfully reproduce motion is no data loss 30 frames per second, which means that one 12:1 Lossy Excellent quality- minute of uncompressed video (512-by-480 pixels indistinguishable at 24 bits per pixel) would consume over 1 gigabyte from the original of storage. In addition to storage demands, large 32:1 Lossy Good quality- vo lumes of data cause bandwidth problems. satisfactory for Presenting 30 frames per second to the video out­ most applications put with the above parameters would require a 100:1 Lossy Low quality- transfer rate of more than 22 megabytes per sec­ recognizable ond from the storage device to the video output.

22 Vol. 3 No. 4 Fa ll l991 Digital Te chuicaljournal Ha rdware Acceleratorsfo r Bitonal/mage Processing

Ta ble 5 Processing Requirements for Imaging Functions

Processor Imaging Processor Operations per Pixel* Operations Functions Read Write ALUt Multiply Total at 30 fps (mops)

Pixel move .25 .25 0 0 .5 15 Point operation 2 1 0 4 120 3 X 3 convolve 9 8 9 27 81 0 8 X 8 DCT 24 14 16 65 1950 8 x 8 block 128 191 0 320 9600 matching

'RISC processor, 1M pixels, 30 frames per second (Ips), 8 bits. tALU = arithmetic logic unit

Thus, reducing the amount of data used to repre­ Audio and Audio Compression sent the video stream would alleviate both storage Video is usually accompanied by audio. The audio and bandwidth concerns. can be reproduced as it was recorded (with the The starting point for the compression of video video), or it can be mixed with the video from is with still images and, as previously mentioned, a separate source (such as a compact elise (CD) the JPEG algorithm can be used to compress still player). The audio data is defined by application continuous-tone images. Because video can be rep­ requirements. If the application allows lower resented as a sequence of still images, the algorithm quality, the audio can be sampled at lower rates could be applied to each still. This procedure with fewer bits per sample, such as telephony rates, would produce a sequence of compressed video which are sampled at 8 kilohertz and 8 bits per frames, each frame independent of the other sample. For applications requiring high-quality frames in the sequence. (CD) audio, samples are usually taken at 44 kilo­ The evolving Motion Picture Experts Group hertz and 16 bits per sample. (MPEG) standard takes advantage of frame-to-frame Integrating audio data into an application creates similarities in a video sequence, thereby enabling special problems. The major characteristic that more efficient compression than the application differentiates audio from the other data formats of the ]PEG algorithm alone.9 In most situations, presented here is its continuous nature. Audio video sequences contain high degrees of similar­ must flow uninterrupted for it to convey any mean­ ity between adjacent frames. The compression of ing. In video systems, the flow of frames may slow video can be increased by encoding a frame using down under heavy system loading. The user may only the diffe rences from the previous frame. The never notice it, or may not be annoyed by it. Audio, majority of scenes can be greatly compressed; how­ however, cannot slow or stop. For this reason, large ever, scene transitions, lighting changes, or condi­ buffers are used to allow for load variations that tions of extreme motion need to be compressed as may affect audio reproduction. independent frames. A more subtle problem in creating applications The need for hardware assist in this area is com­ using audio is in synchronization. Audio data is pelling. Table 5 shows that to sustain a .JPEG decom­ usually included to add another dimension of infor­ pression at 30 frames per second would require a mation to the application (such as speech). 1950-mops processor. The same result can be Without a method of synchronizing the video and obtained using the CL550-10 ]PEG Image Compres­ audio, one data stream will drift out of phase with sion Processor.9 Although this device does not the other. One way to include synchronization is to make use of interframe similarities to increase com­ use time stamps on the audio and video. This is par­ pression efficiency, a device implementing the ticularly useful because standard time codes are MPEG standard would exploit these similarities. used in most production machines. Table 5 shows that motion compensation, to be The compression of audio data is not as efficient supported at 30 frames per second, requires a as that of the other data formats. Since a statistical 9600-mops processor. approach to coding audio is highly dependent on

Digital Te chnical journal Vol. 3 No. 4 Fa ll 1991 23 Image Processing, Video Te rminals, and Printer Te chnologies

the type of input (i.e., voice, musical instrument), referenced. As the data is incorporated, it is com­ another method is required for generalized inputs. pressed and stored. Authors require the capability Differential pulse code modulation (DPC.VI) is often to edit and mix video and audio passages to get the used to encode audio data. DPCM codes only the dif desired result. Moreover, the video and audio may ference between adjacent sample values. Since the originate from different devices and may.even be in difference in value between samples is usually less different formats. than the magnitude of the sample, modest compres­ As definedabove, "user systems" do not require sion can be achieved (4 :1). The limitation using this all of the fu nctions that authoring systems need: technique is in the coding of high-frequency data. only. decompression is required in a typical user Hardware assist for the audio data fo rmat will system. Most ex isting user systems require an ana­ probably come in the fo rm of hardware to perform log video source (videodisk), which is purchased functions other than compression. For instance, as part of the application. The device control is per­ DSP algorithms can perform equalization, noise fo rmed by the application, i.e., when a user selects reduction, ami special effects. a passage to be replayed, the application sends commands to the videodisk. Figure 8 depicts an Multimedia authoring system and a user system, along with As the term implies, multimedia may integrate all suggested I/O capability. of the previously. mentioned image fo rmats. The Next-generation multimedia platforms will make word "may." is important in this context. This area fu ll use of digital video and audio. This implies that has been mainly technology-driven, due to such systems will be able to receive and transmit multi­ factors as lack of standards, developing 1/0 devices, media applications and data over networks. This insufficient system bandwidth, differing clara for­ interactive capability will improve the efficiencyof mats, and a vast amount of software integration. many mundane appl ications and devices. For exam­ It is currently a topic of debate whether typical ple, electronic mail can be extended with video and users will require the ability to create. as opposed audio annotations, or meetings can be transformed to only access, multimedia source material. How­ into video teleconferencing. The adoption of com­ ever, for discussion purposes, multimedia plat­ pletely digital data for multimedia also implies that fo rms can be classified into two categories: the platform I/O will change. Some user systems authoring and user. Authoring refers to creation wil I not require analog device interfaces or control: of multimedia source material and requires diffe r­ the user will load the application over the network ent capabilities than user platforms. In the creation or from an optical disk. of a multimedia application, data from many differ­ Each of the image formats described in this ent devices may need to be digitized and cross- section has diffe rent characteristics, and each wil l

AUTHORING USER

110 AND DEVICE 110 AND DEVICE CONTROL CONTROL COMPRESSION/ DECOMPRESSION DECOMPRESSION

NETWORK NETWORK

Figure 8 Sample Multimedia Platforms

24 Fall 1991 Vol. .3 To. 'I Digital Te chnical jo urnal Ha t"dwareAccelera tors fo r Bitonal image Processing

be presented in the embodiment of multimedia. References andNo te Given the size, processing requirements (compres­ 1. Standardization of Group 3 Fa csimile Appa­ sion and decompression), and real-time demands of ratus fo r Document Tra nsmission, cenT applications, hardware assist will be a necessity. Recommendations, Vo lume VI-Fascicle VII.3, Summary Recommendation T.4 (1980). Imaging is a unique data type with special sys­ 2. Fa csimile Coding Schemes and Coding Control tem requirements. To achieve interactive rates of Functions fo r Group 4 Fa csimile Apparatus, bitonal image display performance today, hardware CCITI Recommendations, Vo lume Vl l-Fascicle accelerators are needed; that has been the primary VIJ .3, Recommendation T.6 (1984). fo cus of this paper. In the future, a general-purpose 3. Digital Compression andCoding of Co ntinuous­ processor should be able to handle the imaging pro­ To ne Still Images, Pa rt I, Requirements and cess at the necessary speed, and beyond that, the Guidelines, ISO!IEC JTC 1 Draft International processor should be affordable in a low-cost bitonal Standard 10918-1 (November 1991). imaging system. However, the bitonal document processing market will not wait; it is in a high state 4. J. Mauro, X Image Extension Co ncepts, Ve rsion of growth and requires that products like accelera­ 2. 4 (Cambridge: MIT X Consortium, June 1988). tors be developed for at least a fe w years. 5. D. A. Huffman, "A Method for the Construction Continuous-tone documents and multimedia of Minimum Redundancy Codes," Proceedings applications will place an even heavier processing IRE, vol. 40 (1962): 1098-1101. load on an imaging system. These areas will require accelerators for several years. As imaging applica­ 6. G. K. Wa llace, "The .JPEG Still Picture Compres­ tions, including bitonal, expand to cover more mar­ sion Algorithm," Communications of the ACIVI, kets, the quality enhancements and performance vol. 34, no. 4 (April 1991): 30-44. benchmarks met by accelerators today will set 7. Table 5 is adapted from Y. Kim, "Image Com­ customer expectations. Consequently, our future puting Requirements for the 1990's: From Multi­ imaging products must be designed to meet these media to Medicine," Proceedings of Electronic expectations. Imaging We st (April 1991).

Acknowledgments 8. CL550 ]PEG Image Compression Processor, Pre­ The authors wish to express thanks to the liminary Data Book (San Jose, CA: C-Cube X Window Te rminal Hardware and Software Design Microsystems Inc., November 1990). Groups for their support in developing the 9. Coding of Moving Pictures and Associated DECimage 1200 option. The two major ASICs used Audio, Committee Draft of Standard ISO 11172: in the design were developed for previous pro­ ISO/MPEG 90/ 176 (December 1990). jects, and those two design teams are also offered our thanks. Special thanks to Frank Glazer and Tim Hel lman for their insightful research on the image rendering process.

Digital Te ch-nicaljo urnal Vo l. 3 No. 4 f'i:lll 1991 25 Bj orn Engberg Thomas Porcher

X Wi ndow Te rminals

X window terminals occupy a nicbe between X window workstations and grapbics terminals. Tbe pU!pose of terminals in general is to provide low-cost user access to bost computers or smaller dedicated systems. X window terminals furtber tbe advance in grapbics terminals and prm•ide new and interesting wczys to utilize bost systems. Etbernet cable provides for grapbics performance previous�}' not seen in terminals. Tbe X Wi ndow 5)!stem developed by MIT allows multiple applications to be displayed and contml/ed from tbe user's workstation. Now, with X window terminals, tbe same poweJful user inte1face is available on bost and other non­ workstation computers.

In mid 1987, the Video, Image and Print Systems graphics, but their performance was far below the (VIPS) Group began the design of Digital's first expectations of a workstation user. Few people X window terminal, the VTIOOO terminal and its have the patience to run, for example, a computer­ code upgrade, the VT 1200 terminal. Our goal was to aided design appl ication on a VT240 terminal, assum­ design and implement an X window terminal that ing such a version of the application is available. would allow the use of windowing capabilities on Al though a workstation offers fast graphics capa­ large computer systems. In 1989, Digital developed bilities, its applications sometimes need more CPU the VT l )OO X terminal and in 1991 the VXT 2000 power or more disk space to do calculations in a X terminal. The designs of these X window termi­ timely fashion. Graphics applications written for nals are all quite different. Our design approach workstations could not run on faster host comput­ changed as the underlying technology changed. ers, which did not provide a display. Nor was there This paper first compares host -system comput­ a standard way to get data from the host to display ing with applications that run on workstations. on a workstation. Each application required a It summarizes the significance of the X Window unique solution to this problem. System developed by MIT and discusses the client­ Since the introduction of the new client-server server model. The paper then presents the need for model of computing and modern networks, many X window terminals and fo llows their development tasks can be divided into subtasks that can run stages. It compares and contrasts Digital's diffe r­ on the most suitable processor. The X Window Sys­ ent design strategies for the VTlOOO, VT I200, and tem uses the cl ient-server approach, as shown in VT I300 X terminals. The paper concluues with a Figure 1. The application is viewed as an X client, summary of the recently announced VXT 2000 and a workstation or a terminal can run an X server X terminal. that controls the display.The X server also controls input from the keyboard and mouse or other point­ Background ing devices. Before the development of the X Window Sys­ tem, there was very little overlap in fu nctionality between workstations and other kinds of comput­ ers. Wo rkstations had stunning and fast graphics, and many powerful applications were available on them. Those applications were not available to users of basic 80-by-24 character-cell text display termi­ nals connected to a host system located in a clean room. Graphics t<.:rminals,of course, allowed the use of ReGIS or another protocol for math and busin<.:ss Fig ure 1 Client-seruer Model

26 Vr>l. 3 No. 4 /-(i/1 /'}')/ Digital Te clmical jo urnal X Window Te rminals

An X client and an X server use an X wire to CPU as the X server, or it may execute on a host, communicate, as shown in Figure 2. The X wire another workstation, or a compute server. The is simply a two-way error-free byte stream, which X server can be connected to several X clients can be implemented in many different ways. The simultaneously, with any combination of local X Window System architecture does not stipu­ (running on the same CPU) or remote (running on late how the X wire should be implemented, but another CPU) X clients. The X server treats local and several de facto standards have emerged. Manu­ remote clients equally. facturers have designed X wires usually based on the clata transport mechanisms that were available Wo rkstation Environment and convenient when the X Window System was Figure 3 compares a traditional non-X windowing implemented. The X wires use transmission control workstation with an X windowing workstation. In protocol/internet protocol (TCP/IP), DECnet, Local both workstations the application must use a Area Transport (LAT), and other protocols, and even graphics library to communicate with the display shared memory buffe rs as a transport to avoid hardware and software. protocol overhead. A single implementation often In an X windowing cl ient environment, the supports several transport mechanisms. library of routines is called Xlib. An application The X server typically executes on a processor designer can choose from a wide variety of toolkits, with display hardware. The X client can execute on which are essentially a level of additional library almost any processor. It may execute on the same routines between the application and Xlib. The use of a toolkit can significantly reduce the amount of work an application has to do. The application software, Xlib, optional toolkit, and other libraries compose the X client, as shown in Figure 4. Wi th few exceptions, the X server comes with the display hardware and input devices (keyboard and pointer) indicated in Figure 5. The X Window System with its flexibility neatly solves the problems of CPU power and disk space versus display availability. Applications written for X can execute on a wide variety of computers, and the resu lts can be displayed on any of a multitude Figure 2 X Wires of devices, even on a workstation that would not

X

TRADITIONAL WINDOWING WORKSTATION I WORKSTATION

APPLICATION APPLICATION

GRAPHICS LIBRARY

WINDOWING I SOFTWARE

DISPLI AY HARDWAREI I I KEYBOARD SCREEN I I MOUSE

Figure 3 Inside the Wo rkstation

Digital Te clmicnl Jo urnal Vo l. 3 No. 4 Fall 1991 27 Image Processing, Video Te rminals, and Printer Te chnologies

stations. In a software development or document APPLICATION editing environment, the users often set up their

OPTIONAL TOOL·KIT X CLIENT workstat·ions as terminals. They usually created a AND OTHER LIBRARIES few termina1l emulation windows and used SET HOST or RLOGIN commands to connect to a host X LIB system on which they stored their working envi­

ronment and fiks. Only two features of a work­ station were frequently used. Users kept several Figure 4 1he X Client terminal emulators on their screens at the same time, and set the terminal emulator windows to be larger than 80 by 24 characters. Only rare.ly did the X PROTOCOL HANDLER X SERVER average workstation user take advantage of the full DISPLAY AND INPUT HARDWARE power of graphics applications. The results of our study indicated a need for a cost-effective alternative to a workstation that Figure 5 Th e X Server would provide the featuresdesi red by a large num­ ber of users. We envisioned a new kind of termi­ have the capacity to run the application locally. nal, one that would allow people to have multiple Figure 6 shows how the X Window System fitsinto windows of arbitrary size, to connect with mul­ a network environment. tiple hosts, ami, since the X architecture allowed it, The X Window System has already generated to be able to use the same kind of graphics as a many useful applications, and its widespread popu­ workstation. larity ensures that many more applications will be From an X architecture standpoint, X terminals made available in the fu ture. and X workstations are quite similar. They can in fact use the same hardware. For example, Digital's Need fo r X Te rminals VT1300 terminal runs on the same hardware as the In a study to determine how workstations are used, VA.'Cstation 3100 workstation. X terminal software the VIPS Group found that many users did not take can also be made to run well on hardware plat­ advantage of the fu ll potential of their work- forms that are not suitable for workstations.

HOST CPU 1 HOST CPU 2

CLIENT CLIENT CLIENT I X CLIENT I 1 x 1 1 x 1 r x l I NETWORK l l 1 I INTERFACE: I I NETWORK INTERFACE I I I ETHERNET I X WINDOWING WORKSTATION NETWORK I INTERFACE: I I X CLIENT I I X CLIENTl I I I X SERVER I I l I KEYBOARD II MOUSE I I SCREEN I

Figure 6 X Wi ndow Ne twork Environment

28 Vo l. .3 Nu. 4 Fat/ 1991 Digital Te chrtical]ournal X Window Te rminals

The main architectural difference between host, via the Ethernet or the serial lines as shown the X terminal and X workstation software is that in Figure 7. This capability makes the VT1200 ter­ X terminals are closed systems that do not sup­ minal useful in an environment that does not have port local user applications. Although this may X window support. seem to be an unnecessary restriction, it does allow Although any X server can run windows soft­ X terminals to be made for less money. An open sys­ ware, it does not provide a user interface. To manip­ tem that allows any user application to run locally ulate the windows, the user needs a window must have an established CPU architecture, a sup­ manager. The window manager creates window ported operating system, such as the VMS, UNIX, frames that allow the user to invoke functions to or ULTRIX system, and, subsequently, sufficient move windows, resize windows, change stacking memory and/or disk space to support such an envi­ order, and use icons. This capability also makes the ronment. A closed system, on the other hand, can VT 1200 terminal useful when no host is available to be designed with simpler hardware, a smaller oper­ run a remote window manager. A terminal with a ating system, less memory, and thus lower cost. local window manager generates less network The absence of the ability to run user applications traffic, and window management is not slowed locally does not impact usability significantlysince by host congestion or network round-trip delays. the user can run any desired application on another The VT1200 X terminal allows use of a remote win­ CPU. Digital's VTlOOO and VT1200 X terminals were dow manager, if the user prefers a different style of designed based on this approach. window management. The local terminal manager provides the user X Te rminal Environment interface to initiate connections to host systems. X terminals often have local applications, but they It is also responsible for the terminal customization must be built into the terminal by the designers. interface. The VT1200 terminal hasa video terminal emulator All clients communicate with the X server using (VTE), a window manager, and a terminal manager standard X wire commands only. Any window man­ as the local applications. The VTE all ows the VT1200 ager, remote or local, can manage all the windows terminal to make American National Standards on the screen, regardless of whether the clients are Institute (ANSI) character-cel l connections to a remote or local.

CPU 1 CPU 2

HOST HOST

SERIAL LINES

Figure 7 Th e X Te rminal Environment

Digital Te chnical journal Vo l. 3 No. 4 Fall 1991 29 Image Processing, Video Te rminals, and Printer Te chnologies

Development of X Wi ndow Te rminals col without some compression of the wire protocol The development process of the VT 1000 and itself. We had to build Digital's firstX terminal with VT1200 X terminals has important lessons to teach an Ethernet interface. us. The knowledge we gained in 1987 has helped us We needed to determine if this hardware platform develop future generations of X terminals. could give us sufficientperf ormance. We made sev· When we designed the VTlOOO X terminal and eral performance estimates, based on what we its code upgrade, the VTl200, we held many discus­ knew then about the X server and other software sions within the group and with people from other components. We went through each step in as much groups. We planned many iterations before we detail as we could (before anything was built). We arrived at the finalarchitecture. It was by no means calculated how many instructions were necessary the only way to design an X terminal, and in 19H9 to perform each task in the chain of receiving a we tried a different approach with the design of the command and displaying it on the screen. By .know­ VT1300 terminal. We knew that the best decision at ing the speed of the CPU, we could estimate per· a particular time might be very different from the fo rmane<.: in characters or vectors per second. best decision one year later, since the technical and Our estimates showed that the VTIOOO X terminal marketing environment is constantly changing. would not be exceedingly fast, but the perfor· New tools, standards, and practices enter the field mance would most probably be sufficient, de.fi· while others become obsolete. Newer products nitely fa ster than a VA.'I(station 2000 in most cases. must always have new features to meet changing In retrospect, actual performance of the VT1000 technology requirements. terminal and the later software upgrade, the VT1200, was close to our estimates, but it took sev­ Ha rdware Pla tform eral passes of code optimization to ach ieve such performance. Our .first step was to discuss the hardware plat­ We also discussed alternate hardware designs fo rm and select the kind of CPU to use, memory for performance improvements. One solution pro· size, 110 considerations, type of display, etc. We posed two CPUs, the TMS34010 microprocessor to studied many diffe rent CPUs to determine which handle the display and a 68000 microprocessor to one would provide the most capabilities for the handle I/0 and other tasks. Unfortunately, we found lowest cost. A VAXchip was rejected because, at the no easy way to balance the workload between the time, it was far too expensive for the required price two crus. We estimated that the diffe rent software range of the VTIOOO terminal. The Motorola 68000 components would have the fo llowing relative CPU series CPUs are quite powerful, but we had to con· demands: sicler other factors such as availability of software and hardware tools, cross compilers and linkers • Interrupts, 5 percent that could run on the VMS system, and hardware • Communications, lO percent debugging facilities of sufficient power. We fmally selected Texas Instruments' TMS34010 micro· • Operating system, 5 percent processor with video support and several built-in graphics instructions that made it a cost-effective • X server (minus display routines), 60 percent solution. It also came with VMS development tools, • Display routines, 20 percent a c compiler, an assembler and linker, a single-step, hardware trace buffe r with disassembler, and a To equalize the load between the CPUs, we would powerful in-circuit emulator that made it possible have had to split the X server in two, a solution that to control execution in detail, inspect registers and was not fe asible. Any other split of tasks would memory, and set break points and hardware watch cause one CPU to spend most of its time waiting points (for example, break when writing value .x fo r the other, and the overall performance gain into locationy). would be minimal. Communication between mul· We further discussed the kind of l/0 to use. A tiple CPCs is complex and is very difficult to debug. sample implementation of the MIT X server on a Therefore, we decided that two CPUs were not VAXstation 2000 workstation and a primitive serial worth the trouble or the cost. The best way to line protocol showed, as expected, that serial lines double performance is to install a single CPU that were clearly insufficientto carry the X wire proto· is twice as fa st. At that time, the TMS34020 was

30 Vo l. _) No. 4 Fa ll 19')1 Digital Te clmical]ou>·naf X Window Terminals

already being mentioned as a follow-up micro­ uncovered little timing windows and race condi­ processor. Since its software would be compatible tions that had not been handled properly. Even with the TMS34010, we decided to keep it in mind with in-circuit emulators, such bugs could take for possible use in a future terminal. weeks to track clown. In the VT1300 we decided to use the VAX ELN Code Selection operating system. We wanted to avoid the possibil­ ity of time wasted on findingand patching holes in The use of read-only memory (ROM)-based code the design of a new operating system. versus downloaded code has been debated for some time. ROM-based code starts up faster and incurs less network traffic at startup time (espe­ Local Te rminal Ma nager cially on a site with many X terminals), but is not The VTlOOO X terminal is self-starting at power-up, flexible when software is upgraded. On the other but without a host system, it needs a local user hand, downloaded code can be easily distributed. interface. We decided that this interface should An entire site can be upgraded with one or a fe w resemble a workstation session manager and thus installations by a system manager as opposed to called it the local terminal manager. Although it changing ROMs in a large number of terminals. covers a different set of functions, we wanted the (With the VT1200 X terminal, customers can change local terminal manager to implement a similar set of ROM boards.) From the point of view of terminal objects and operations (the "look and feel" or style) business, it made sense to use ROM-based code in of a workstation session manager. The style of the 1987. We reasoned that not all sites would have DECwindows session manager was chosen to make Ethernet, but with ROMs the X terminal would it easier for a user to switch between an X terminal still be useful as a multiwindow terminal emula­ and a DECwindows workstation. We wrote a subset tor. We realized that such concerns would change toolkit for all the "customize" screens and ensured with time, and on the whole, downloaded code that the VTE could use the same subset toolkit for would become the better approach. The only its "customize" screens. As DECwindows has pro­ exceptions would be in the home or small office gressed, subsequent X terminals have adapted the markets where a boot host or an Ethernet mightnot new user interface preferences, in this case Motif. be available. Subsequent X terminals are being made in both downloaded (for example, in the Local Te rminal Emulator VT1300 terminal) and ROM versions. We considered a local terminal emulator to be an important component. We knew that X-based ter­ Op erating System Selection minal emulators could run on the host, but in 1987 Next we considered which operating system to hosts with X windowing support were rare. Since use. We looked at other vendors' operating sys­ we were in the terminal group, a terminal that tems, but found they were either too complex and could not manipulate ordinary text by itself was big or inadequate. One of our coworkers had writ­ considered unsellable. We wanted the ability to teo a very compact operating system for a VAX access both X and non-X hosts and we wanted system used on another project. We used it in our to support multiple text windows. Therefore we prototype and then adapted it for the TMS34010 defined the terminal emulator as an X client so that processor. We implemented additional functions text windows could coexist with X client windows. to run the rest of the software with minimum This feature has proved to be exceptionally popu­ changes. lar. A large number of users use nothing but video There are many advantages to working with terminal emulator windows. They are not inter­ "your own" operating system. It is easy to make ested in X windowing graphics, but do want mul­ changes, to work around trick )' problems, and to tiple and/or larger text windows on a large screen. make special enhancements. But operating system code is difficult to debug. Timing is very critical, Local Window Manager and throughout the project, we found strange bugs We debated whether or not to implement a local in code that had initially appeared to be all right window manager. The DECwindows window man­ to everyone involved . We found bugs under heavy ager was under development and was constantly load conditions after a rare sequence of events changing. The DECwindows window manager

Digital Te chnical journal Vo l. 3 No. 4 Fa ll 1991 31 Image Processing, Video Te rminals, and Printer Technologies

contained far too many VMS dependencies to be Communications Protocol ported easily. Also the X terminal did not have Many communications protocols were available, enough memory to run the DECwindows wolkit but our choice was dictated by market pressures locally. We could have ported other window man­ rather than technical reasons. The market demanded agers, but they lacked the essential characteristics TCP/IP. DECnet would have been acceptable, but of the DECwindows window manager. For a while it was running out of available addresses, at least we considered letting the local clients have a primi­ within Digital. DECnet address space supports tive way £0 manage their own windows, until a fu ll­ only 64,000 nodes and requires manual address featured window manager could be started on a and name assignments. After waiting weeks to get

host. Again, this alternative lacked the DI:Cwindows addresses for a few workstations, we realized that system's qualities. We eventually decided to write adding thousands of X terminals into Digital's inter­ a window manager based only on Xlib and our nal network wou ld not be possible. DECnet Phase v subset toolkit calls. It has the essential characteris­ software has solved this problem. tics of the DECwindows product. Also, since the Next we looked at the LAT protocol used by DECwindows window manager of necessity wou ld Digital terminal servers and fou nd that it bad sev­ keep changing, we wrote the local window man­ eral advantages. first, the VMS operating system ager in such a way that it could relinquish control supports the LAT protocol. LAT uses unique 48 -bit to a remote window manager. This solution gave us Ethernet addresses to identify each node, which the most flexibility for this hardware platform. The allows a large node address space. LAT also does not recently announced VXT 2000 X terminal has been require any system management to add another ter­ designed with virtual memory to accommodate a minal. A user can connect a terminal to a power well-establ is heel unmodified window manager, the source, and the terminal automatically becomes . part of the network. Our performance evaluations found that the LAT interface on the host could be X Server written to incur Jess host overhead than DECnet, which is important when many X terminals are con­ We al so needed to choose an X server. We cou ld nected to hosts. have based our code on the distribution tape from Changes were needed in the VMS LAT driver to MIT, but at the time the X Window System was not accommodate X wire and fo nt service connections. yet a mature product. Every implementor had to The VMS Software Engineering Group worked with spend considerable time stabilizing the implemen­ us to ensure that we would have those changes ta tion enough to yield a product and improve per­ on schedule and in the appropriate VMS releases. fo rmance. Since the VMS DECwindows Group had As a result, we chose the LAT protocol for the VMS been writing code for the server, we decided to use community and TCP/IP fo r users of ULTRIX and UNIX DECwindows code. Once the porting effort started, systems. we fo und that most of the performance had been improved by V;'w'{ MACRO code. Consequently, we had to re-engineer all the modules or adapt new Fo nt File Sys tem ones from the MIT tape. As we kept porting and Storing fonts and changing fo nt file fo rmats were enhancing performance, our code changed more major problems. Since the VTIOOO X terminal did and more until it became extremely difficult to not have a local file system, some fo nts had to be track bug fixes made by the DECwindows Group. stored in RO M to allow the VT lOOO terminal to func­ The MIT patches were also nearly impossible to use tion in standalone mode. A quick review of the because of code changes and because our starting available DECwinJows fonts showed that not all of code was one step removed from the tape. them fit in the ROM space allowed for the terminal. 1b

32 Vo l. .:i No 4 Fa ii i'J'JI Digital Te chuicaljounwl X Wi ndow Te rminals

cess could deliver fo nt data on request to one or nate fo rmat, the VT1200 terminal does not have to several VTIOOO terminals. The VMS system's fo nt open and read the font file until a client actually daemon uses the LAT protocol to deliver the fo nts intends to use it. and protects somewhat against fo nt file format changes. In many ways, the design of the font Comparison of X Te rminals daemon makes it a precursor to a general fo nt The VT1200 and VT1300 X window terminals server, and it is very similar to the X Font Server were built using different approaches to solve being delivered by MIT in the latest release of the the problems encountered during development. X Window System. The X terminal is a new and flexible concept; there To use the fo nt service, the terminal user must is no single "best" design. Ta ble 1 compares the specify a fo nt path in the VT1200 local terminal most important differences between the two termi­ manager. Specifying a host name is sufficient to nals. We also include the specificsfor the VXT 2000 access the default font path, although users with X terminal. their own fo nt files can optionally search other The VT1200 is ROM-based; all its software is per­ directories. At startup, the VT1200 terminal makes a manently resident in the terminal. The VT1300 soft­ font connection to the host's font service and deliv­ ware is downloaded, so a host or bootserver on the ers the fo nt path specification to the fo nt service. same network must supply the terminal with a load The fo nt service sends font names and other basic image at power-up. font information about all the fonts in the selected Since downloaded terminals are dependent on path. When the VT1200 X server needs a fo nt, the the existence of at least one working host system, VT1200 first searches the ROM-based fo nts; if it is the user interface can be designed differently. not there, a request to read the fo nt is sent to the While the VT1200 X terminal has a built-in user font daemon. The daemon sends the required infor­ interface, the VT1300 does not need it. The VT1300 mation to the VT1200, and the X server can display terminal automatically makes an X connection to a characters from that fo nt. Since memory is limited, host at power-up, and the user is presented with the VT1200 has font caching, a mechanism to dis­ the same DECwindows login box as on a work­ card fo nts no longer used or to discard the least station. The VT1300 has no local clients; all clients used fo nts. Our current X terminals increase the run on the host system. robustness of the font mechanism; for example, The VT1200 terminal uses the LAT protocol for they provide recovery should the font service or its its ease of use and minimal network management host become unavailable. demands. The VT1300 terminal uses the DECnet The special LAT code that we used on VMS sys­ software already implemented in the VAXELN oper­ tems for the font service was not available on ating system used internally. Both terminals sup­ UNIX and ULTRJX operating systems. Since inter­ port TCP/IP. net protocol (IP) was available, we could use the trivial file transfer protocol (TFTP) to read a file from a host system, if the system manager set the VXT2000 X Te rminal proper protections. We chose TFTP for its ease One problem that has plagued all X terminals is of implementation and its wide availability on limited memory space. Workstations usually have a UNIX and ULTRJX systems. The TFTP fo nt path in a virtual memory system, which provides large pag­ VT1200 terminal specifies a host IP address and a ing and swap areas on a disk, and applications complete path to a file (usually named fo nt.paths) and X servers can use more memory space than that contains the complete path to all the font the hardware has. Until now X terminals have not files that the VT1200 can use. The terminal can had virtual memory systems. If too many applica­ then access all those font files, again through TFTP, tions made excessive demands, or if a client created to obtain font names and other basic information large off-screen images (called "pixmaps" in the about each fo nt. When a client wishes to use a fo nt, X Window System) the terminals quickly used all the proper font file can be read again, this time to memory space. If the X server implementation load the complete font. Since this process is time­ was correct, an error was reported and a client consuming, the fo nt path pointing to the file has might try a less demanding approach. In other an alternate format in which the font name fol­ cases, the terminal or client might simply crash. lows the complete path to each file. Using this alter- One alternative was to install more memory in the

Digital Te clmicaljournal \1()/. 3 No. 4 Fa ll 199l 33 Image Processing, Video Te rminals, and Printer Tec hnologies

Ta ble 1 Comparison of X Window Te rminals

VT1200 Te rminal VT11300 Te rminal VXT 2000 Te rminal

Monochrome only Color only Monochrome and color

1 bit plane 4 or 8 bit planes 1 or 8 bit planes

Code in ROM Code downloaded Code downloaded

No virtual memory No virtual memory Virtual memory 2-4MB RAM 8-32MB RAM 4-16MB RAM

TMS34010 CPU VA X CPU VA X CPU

Special operating system VA XELN operating system Special operating system

Local clients: No local clients Local clients: Te rminal manager Te rminal manager Window manager Motif window manager Video terminal emulator DECterm term inal emulator

Local customization Customized on host Local customization just as a workstation Centralized customization

Choice of host (LAT only) Automatic X window Choice of host login to boot host (LAT and TCP/IP using XDMCP)

LAT protocol DECnet protocol LAT protocol TCP/1 P protocol TCP/I P protocol TCP/I P protocol

Special hardware Available on several Uses standard hardware workstation platforms

X terminal, although this can be costly and offers no customization of all X termin;�ls on the network. guarantees. Figure 8 shows the configuration for the VXT2000 In the next generation of Digital's X terminals, X terminal. the VXT 2000, this problem has fo und a cost­ effective solution. Based on the VA.,'( architecture, Co nclusion the VXT 2000 terminal uses virtual memory and X terminals are not intended to replace work­ downloaded code. The Digital In.foScrver, an St:ltions. Nor will workstations replace host sys­ Ethernet storage server, provides the load image, tems or completely displace X terminals in the virtual memory paging space, fo nts, and customiza­ foreseeable future. It is likely that host computers tion storage. The same Info Server also solves will always be faster and have more memory and another problem: now the X terminal has access to disk space than reasonably priced workstations a file system. This allows more extensive customi­ of the same era. It is also likely that terminals can zation, as well as centralized management of the be built cheaper than workstations of reasonable

Figure 8 The VXT2000 Network Environment

34 Vo l. j No. 4 Fa /1 1991 Digital Te chnical jo urnal X Window Te rminals

performance fo r some time to come. As long as that is the case, there will be a market for X terminals and host systems. Future X terminals will be faster, and have more built-in functionality, more local applications, X extensions, and most likely, addi­ tional hardware fe atures. X terminals will be the networked terminals of the 1990s.

Acknowledgments We wish to thank the members of the VT1200 devel­ opment team who worked many long hours on this project. Thanks to everyone inside and outside the Video, Image and Print Systems Group who contributed helpful suggestions, constructive criti­ cism, and important hours using and testing the products. Thanks to the LAT and VMS Software Engineering Groups for incorporating the changes needed for the VT1200 X terminal to be useful. Thanks to the VIPSQuali ty Group for ensuring that as few bugs as possible remained in the product when shipped.

Digital Te ch11icaljournal Vo l. 3 No. 4 Fa ll /991 35 Peter A. Sichel I

ACCESS. bus, an Op en Desktop Bus

Wi th the recent introduction of the ACCESS. bus product, Digital has affi rmed its commitment to open sy stems and thus to fa cilitating better solutions fo r inter­ active computing. Th is op en desktop bus protJides a simple, unifo rm way to link a desktop computer to as many as 14 low-speed 1/0 devices such as a keyboard, mouse, tablet, or three-dimensional tracker ACCESS.bus fe atures a 100-kilobit-p er­ second ma.:'

As the cost of personal interactive computing describes the process we went through to convert decreases, the range of applications and the need to a new bus architecture, and summarizes the key for specialized I/0 devices is growing dramatically. advantages of the chosen design. Traditional personal computers were designed to The basic desktOp bus concept is illustrated in accept only a small number of standard devices; Figure 1. The bus allows multiple, low-speed l/0 adding devices beyond those originally envisioned devices to be interconnected and thus interfaced usually requires specialized hardware or software. through a single host port. Desktop bus devices Custom interfacing is expensive for vendors and such as a keyboard or a tablet, which are not hand­ users and thus limits the availability of new devices. held, provide two connectors and allow another ACCESS.bus provides a simple, uniform way tO device to be da isychained. A hand-held device link a desktop computer to a number of low-speed such as a mouse can be placed at the end of the l/0 devices such as a keyboard, a mouse, a tablet, or daisychain, or a connector expansion box can be a three-dimensional (3-D) tracker. Designed from attached to accommodate additional devices that the beginning as an open desktop bus, ACCESS.bus do not provide two connectors. facilitates cooperative solutions using equipment from diffe rent vendors. This paper describes the ACCESS. bus design and gives some insight intO how the idea was adopted at Digital.

Design Goal, Process, and Advantages The design goal fo r the desktop bus fo llows from our experience within the Video, Image and Print Systems (VIPS) Input Device Group with trying to support new devices on Digital terminals and workstations. While various new devices have been CONNECTOR successfully prototyped over the years, the need EXPANSION BOX fo r nonstandard hardware and custom software drivers was always an expensive, time-consuming obstacle. Even after successful prototyping, these devices could not be read ily adapted tO our stan­ dard systems, limiting their use to custom appl ica­ tions. In designing the desktop bus, our goal was to make it as easy as possible to interface previously unavailable 110 devices to our systems in a way that was both practical and marketable. This sec­ tion explains the benefits of using a desktop bus, Figure 1 Basic Desktop Bus

36 Vo l. 3 No. 'I Fa /1 /'J'J/ Digital Te chnical jo urnal ACCESS. bus, an Op en Desktop Bus

The desktop bus has the fo llowing benefits: By making the bus truly open, we encourage third parties to add value to our systems. Enables greater flexibility and variety of use • The benefits of a desktop bus are significant. But converting to a new architecture, especially one • Reduces the cost of connecting multiple devices that is not back-ward compatible, is expensive in

• Expedites bringing new technology to market terms of the time and effort required. How does a large corporation build agreement to make such Helps leverage third-party devices • an investment decision' The desktop bus project The firstbenefit, greater flexibility, can be simply started as a grass roots engineering effort and grad­ achieved by allowing additional devices and more ually built momentum. The process was one of modular solutions. We further extended this bene­ dialogue to attract partners. Initially, three groups fitby designing a way for devices to be added at run with slightly different objectives worked together time without disrupting system operation. Con­ to develop the bus. The visibility of separate groups figuration should be automatic; connecting stan­ jointly supporting the bus concept was essential to dard devices should not require powering down or transform the idea into action. People are more rebooting the system before a new device can be will ing to accept an idea that others around them used. The desktop bus supports multiple like have already adopted. devices without switches or jumpers. The three groups that initiated the desktop The second benefit, reduced cost, was crucial to bus project were our VIPS Input Device Group in having the bus accepted as a solution across a wide Westford, MA, mentioned previously; the Wo rk­ range of products from low-end video terminals station Systems Engineering (WSE) Group, located to high-end workstations. We recognized that con­ in Palo Alto, CA; and the Video Advanced Develop­ temporary electrical techniques could eliminate ment (A/D) Group in Al buquerque, Nl\1. Our Input the need fo r level translation circuits, -12 volt (V) Device Group was looking for ways to simplifythe power sup pi ies, and perhaps some of the protec­ process of prototyping specialized input devices tive components used with RS-232 interfacing. and of getting related software support fo r our Although many devices would now require two video terminals and workstations. WSE was devel­ connectors, system cost would decrease because oping a low-cost, personal workstation and needed we would need to supply only as many connectors a flexible way to support multiple input devices as the number of devices to be attached, or possibly without greatly increasing the cost of the base one more. workstation. The Albuquerque A/D Group had been The third benefit, expediting the time to market experimenting with next generation 1/0 devices, for new technology, allows us to better satisfy i.e., force-feedback joystick, 3-D tracker, and real­ changing requ irements. Key to this benefit is hav­ time audio and video, and was interested in having ing the means to connect new devices without these technologies adopted by other Digital groups. changing the system hardware or software. Based This AID Group had used PC technology success­ on our experience with input devices, we devel­ fu lly in one of its previous video projects. oped the concept of device capability reporting In january of 1990, engineers from each group and generic device protocols. Standard devices realized they were working on similar problems like keyboards and locators, e.g., mice, tablets, and and began to collaborate. The WSE Group was to trackballs, all work in similar ways. For this class build the desktop bus host interface and software of device, we define a simple device protocol and drivers into their workstation; the VlPS Group was a way to parameterize and report device unique to help define the device protocols and supply characteristics. A single generic driver can adapt desktop bus keyboards and mice; and the Albu­ itself to work with a class of similar devices so querque AID Group was to support bus devel­ that no custom software is required fo r basic opera­ opment and prototype additional devices. Within tion of standard devices. four months, VIPS had defined the basic protocols Leveraging third-party devices, the fourth and could demonstrate a working FC keyboard benefit, is aimed at satisfying diverse customer and mouse. These early prototypes helped per­ requ irements. Because the use of computers con­ suade WSE to support the project and, in turn, tinues to proliferate, the range of applications far helped reinforce the importance of the project to exceeds that which any one vendor can master. the VIPS Group.

Digital Te cbnicaf jo unral Vo l. 3 No. 4 Fa ll 1991 37 Image Processing, Video Te rminals, and Printer Te chnologies

We began presenting the desktop bus idea to • Dynamic recontiguration. The hardware and interested groups within Digital and received many software allow bus devices to be "hot-plugged " useful suggestions including and used immediately, without restarting the system. The devices are recognized automati­ • Use the same keycodes as on the LK201 keyboard cally and assigned unique addresses. This advan­ to eliminate the need to rewrite k<:yboard tage results in a plug-and-play user interface. lookup tables. • A mature capabilities grammar to support generic • Store the country keyboard variation inside device drivers. An extensible free-form grammar the keyboard so users will not need to enter it allows devices to describe their characteristics manually. to a generic driver. Most common devices can

• Keep the devices simple, without modes. work with standard drivers.

In addition, third-party input device vendors Bus or network interconnection has become made the fo.llowing suggestions. widely accepted as a means of providing fl exible open solutions. To appreciate ACCESS. bus, it is help­ • Usc a modular connector that is easy to plug and ful to position its performance capabilities with unplug correctly. respect to those of other network interconnect technologies, as shown in Table 1. • Provide enough power for several additional devices.

• Al low vendors to supply their own device Ta ble 1 Network Interconnects drivers; tuning their own device drivers is part Order of Magnitude of the value added by the vendor. Performance The bus idea was elegant and generally well Bus Ty pe (kilobits per second) received . Most of the reservations centered around Apple DeskTo p Bus, 10-100 the likely impact on existing system components, ACCESS.bus the current problems, and whether conversion to LocaiTalk 100-1,000 the bus was fe asible. Because we recognized that Ethernet 1,000-10,000 other groups were facing tight dcvclopmcnt schecl­ FDDI 10,000-1 00,000 ulcs, we did not pressure these groups to support our desktop bus work. We presented the desktop bus as a possible solution to interface problems, At first glance, the 100-kb/s speed of the made our design information available, and worked ACCESS.bus may seem adequate for large desktop to incorporate suggestions. But as the development devices like printers and modems. But these work progressed, more partners supported our devices can transmit long data streams indepen­ effort. dent of any user activity and, ifnot restricted, could Once we cleciclecl to use a desktop bus, we compromise the interactive performance of the looked at available designs, including the Apple bus. Thus, ACCESS.bus is intended for low-speed DeskTop Bus, the Musical Instrument Digital activities that people perform with their hands Interface (MIDI), and serial buses offered by other and is fast enough to hand le multiple interactive semiconductor vendors, and evaluated these alter­ devices like a keyboard, mouse, or 3-D tracker. natives with respect to our design goal. Key advan­ tagcs of the design chosen, i.e., the ACCESS. bus, are Hardware Description Before discussing the ACCESS.bus design, we pre­ • Off-the-shclfinterintegrated circuit (J2C) micro­ sent a description of the Philips FC technology controller technology with maximum data rate upon which the design is based. Details of the of 100 kilobits per second (kb/s) . This technol­ specificACCESS. bus implementation fo llow. ogy is low-cost, yet fast enough fo r sophisticated input devices like a 3-D tracker. Jnterintegrated Circuit Fu ndamentals

• Built-in hardware arbitration, which simplifies ACCESS.bus extends the Phil ips PC bus to operate the software and allows reliable communication off-board and, thus, connect desktop devices. The without inventing a new protocol. FC is a two-wire serial clock and serial data

38 Vo l. 3 No. 1 Fall I'J'JI Digital Te chnicaljmwnal ACCESS.bus, an Op en Desktop Bus

open-collector bus. An open-collector design means maximum capacitance not to exceed 700 pico­ that the clock and data lines are normally in a high­ farads (pF). impedance floating state and are pulled up to a log­ Number of devices. The maximum number of ical high state. • ACCESS.bus devices allowed on the bus is 14. A device that wants to send a message waits for Limiting factors are the device addressing range any message frame in progress to complete, then and the power distribution (a total of 500 rnA for asserts a START signal to become bus master and all devices). begins to generate data and clock signals. The bus clock is synchronized among all devices by its • Hardware interfaces. ACCESS.bus hardware inter­ wired AND connection. Each device, whether faces are implemented using standard PC micro­ transmitting or receiving, stretches the low period controllers developed by the Signetics Company of the clock until ready for the next bit to be trans­ or under license from Philips Corporation. (Sig­ fe rred. When the last device is ready, the bus clock netics Company is a division of North American is allowed to go high, generating a rising edge on Philips Corporation.) the serial clock. At this time, all active devices sense the state of the bus data I ine. For a receiving ACCESS. bus Protocol device, the state represents the received data bit. Every device on the bus is a microcontroller with For a transmitting device, the state determines an FC interface and behaves as either a master whether the device has successfully asserted its transmitter or a slave receiver, exclusively, as data on the bus. A transmitter that is sending a logi­ definedby the FC Bus Specification. cal high state and detects that the data line is being held low by another sender, recognizes that it has Message Fo rmat lost arbitration and must try again later. When a "collision" or arbitration occurs, no data is lost, one A message transmits information between a device and the computer or between the computer and one message is transmitted and received, and the remaining messages must be sent again. or more devices. There is one exception: a device FC data messages are transmitted as 8-bit bytes, may attempt to reset other devices assigned to the with each byte being acknowledged by a ninth same address by sending a Reset message to itself. ACCESS.bus messages have the following format: ACKNO\V LEDGE bit from the receiver. PC technol­ ogy also defines unique START and STOP signals to Byte Bit Number delimit message frames. The first byte of any mes­ Number 1 2 3 4 5 6 7 8 l sage frame is always the destination address. destaddr 1 0 l Destination ACCESS. bus Physical Implementation address

Details of the physical implementation of ACCESS. bus 2 srcaddr 1 0 l Source are as follows: address Basic electrical configuration. ACCESS.bus uses • 3 [ P I length Protocol four-pin, shielded, modular-type connectors that flag, length feature positive orientation and locking tabs. (the number Data and power for the bus are transmitted over of data bytes low-capacitance, four-wire, shielded cable. The from 0 to 127) four conductors are used for ground, serial data, serial clock, and + 12 v. 4 through body Consists (length + 3) of 0 to 127 Available power. The maximum available power • data bytes for all devices is 12 vat 500 milliamperes (rnA). ACCESS. bus devices may supply their own power length + 4 checksum from a separate source, if needed. A power-up Initially, devices respond to a default power-up reset circuit must still be provided to reset the address. During the configuration process, the com­ device when bus power is applied. puter assigns a unique address to every device on

• Cable length. The maximum cable length fo r the bus. Messages are either device data stream the entire bus is 8 meters. The limiting factor is a (P=O) or control/status (P=l), as indicated by the

Digital Te clmicaljour11al Vo l. 3 No. 4 Fa ll 1991 39 Image Processing, Video Te rminals, and Printer Te chnologies

protocol flag. The minimum length of a message is unique address, and connect devices to the appropri­ 4 bytes; the maximum length is 131 bytes (127 data ate software driver. Configuration norma'llyoccurs bytes and 4 bytes for overhead). The message at system start-up, or at any time when the com­ checksum is computed as the logical XOR of all prc­ puter detects the addition or removal of a device. vic)lls bytes, including the message address. Power-up/Reset Phase Standard Messages When reset or powered up, a device always reverts The ACCESS.bus protocol defines the seven stan­ to the default address and sends an Attention dard interface messages summarized in Table 2. message to alert the computer to its presence. At Parameters defined within the body of the message system start-up or reinitialization, the computer are listed in parentheses. sends a Reset message to all PC addresses in the ACCESS.bus device address range (14 messages) to Ide ntification ensure that all devices on the bus respond at the Since the ACCESS.bus is a bus-topology network, power-up default address. unique identificationstrings arc used to distinguish devices. These strings arc structured as follows: Ide ntification Phase protocol revision: 1 byte (e.g., "A'') To begin address assignment, the computer sends an Identification message at the device default module revision: 7 bytes (e.g. . ''Xl.3 " ) address. Every device at this address must then vendor name: 8 bytes (e.g., "DEC " ) respond with an Identification Reply message. As module name: 8 bytes (e.g., "LK501 " ) device number: 32-bit signed integer each device sends its message, the JlC arbitration mechanism automatically separates the messages The module revision, vendor name, and module based on the identification strings. The computer name strings an: left-justified ASCII character can then assign an address to each device by includ­ strings padded with spaces. The device number ing the matching identification string in the Assign string is a 32-bit two's complement signed integer Address message. When a device receives this mes­ and may be either a random number (ifnegat ive) or sage and finds a complete match with the identifi­ a unique serial number (if positive). cation string, it moves its device address to the new assigned value. As soon as a device has a unique Configuration Process address, it is allowed to send data to the computer. The configuration process is used to detect what The FC physical bus protocol allows multiple devices are present on the bus, assign each device a devices on the bus at the same time ifthose devices

Ta ble 2 Standard ACCESS.bus Protocol Messages

Computer-to-device Messages Purpose

Reset () Force device to power-up state and default 12C address. Identification Request () Ask device for its "identification string." Assign Address (identification string, Te ll device with matching "identification string" to change its new address) address to "new address." Capabilities Request (offset) Ask device to send the fragment of its capabilities information that starts at "offset."

Device-to-computer Messages

Attention (status) Inform computer that a device has finished its power-up/reset test and needs to be configured; "status" is the test result.

Identification Reply (identification string) Reply to Identification Request with device's unique "identification string." Capabilities Reply (offset, data fragment) Reply to Capabilities Request with "data fragment," a fragment of the device's capabilities string; the computer uses "offset" to reassemble fragments.

40 v'<:!l. .3 No. 4 Fa ll /':)':)1 Digital Te chnical]ottrnal ACCESS.bus, an Op en Desktop Bus

are transmitting exactly the same message. In the puter then parses the capabilities string to choose rare event that two like devices report the same the appropriate application driver fo r the device. random number or are mistakenly assigned to the The parsed string is also made available to applica­ same address, each interactive device transmits a tion programs using the device. Reset message to its assigned address prior to send­ ing its firstdata message after being assigned a new No rmal Op eration address. The self-addressed Reset message forces During normal operation, the computer periodi­ other devices at the same address back to the cally requests inactive devices to identify them­ power-up default address, as if they had just been selves. If a device is found to be missing, or a new hot-plugged. The message guarantees that each device appears by sending an Attention message at device has a unique address, but not until the the default address, the computer sends an Identi­ device is actually used. The pseudo-random number fication Request message to each device address (or serial number, ifavailable) distinguishes devices previously recorded asin use (up to 14 messages) to at identificationtime before they are used, allowing confirm which devices are still present. The com­ the host to inventory which devices are present. puter also sends a Reset message to each device address previously recorded as not in use. The com­ Capabilities Phase puter then begins the address assignment process Device capabilities is the set of information that by sending an Identificationmessage to the default describes the functional characteristics of an address and assigning each device that responds to ACCESS. bus peripheral. The purpose of capabilities an unused device address. information is to allow software to recognize and use the features of bus devices without prior Generic Device Concepts knowledge of their particular implementation. By ACCESS.bus uses the concept of generic device having locator devices report their resolution, for drivers to support familiar 1/0 devices using only a example, generic software can be written to sup­ few drivers. Generic specifications for keyboards, port a range of device resolutions. Capabilities locators, and text devices have been developed. information provides a level of device indepen­ dence and modularity. Key boards The structure of capabilities information is The keyboard device protocol attempts to define designed to be simple and compact for efficiency, the simplest set of fu nctions from which a Digital but also extensible to support new devices without LK201 or a common personal computer keyboard requiring changes to existing software or periph­ user interface can be built. A generic keyboard con­ erals. These objectives are supported by making sists of an array of key stations assigned numbers the structure hierarchical and representing capabil­ between 8 and 255. When any key station transi­ ities information in a form that applications (and tions between open and closed, the entire list of humans) can use directly. The capabilities informa­ key stations currently closed or depressed is trans­ tion is an ASCII string constructed from a simple, mitted to the host. This reporting scheme is func­ readable grammar. The grammar allows text strings tionally complete; the host can detect every key to be formed into lists, nested lists, and lists with transition, and the scheme provides information tagged elements. The capabilities string for a loca­ about the full state of the keyboard on each report. tor might read as follows: No special resynchronization reports are required. (prot( Locator) In addition to reporting key stations, the generic type(mouse) keyboard device can support simple feedback buttons( 1(L) 2(R) 3(M) ) dim(2) rel res(200 inch> range(-1 2 7 127) mechanisms such as keyclicks, bells, and light­ dQ(dname(X)) emitting diodes. These mechanisms are controlled d1 (dname(Y)) explicitly from the host so that minimal keyboard ) state modeling is required. The capabilities infor­ After assigning a unique address to a device, the mation is used to identify the keyboard mapping computer retrieves the device's capabilities string table and the feedback mechanisms available. The as a series of fragments using the Capabilities keyboard mapping table can also be stored in the Request and Capabilities Reply messages. The com- keyboard itself as part of the capabilities string.

Digital Te chnicalJourna.l Vo l. 3 No. 4 Fa /11991 4l Image Processing, Video Te rminals, and Printer Te chnologies

Locators applications. As the advantages of this open desk­ The locator device protocol is designed to accom­ top bus design become well known, we expect motla te a range of basic locator devices such as wider adoption of this product. The ACCESS.bus a mouse or tablet. More complex devices can be is currently implemented on Digital's Personal modeled as a combination of basic devices or can OECstation 5000 workstation, with implementa­ provide their own device driver, thus minimizing tions underway for the next generation of RISC the burden on the protocol. workstations and video terminals. A generic locator consists of one or more dimen­ sions described by numeric values and, optionally, Acknowledgments a small number of key swi rches. The standard driver Many people contributed to the design ancl devel­ requires the locator device to identify the type of opment of ACCESS.bus. I would especially like to data it will report from a small list of options and acknowledge To m Stockcbrand and To m Furlong adjusts to handle this data type. These options are for their vision and early support; Chris Cued, Mark Shepard, and Ernie Souliere fo r their contributions Number of dimensions, e.g., two, for a mouse or • to the ACCESS.bus electrical design and protocols; a tablet and Robert Clemens for the excellent demonstra­

• Dimension type: absolute, i.e., referenced to tion hardware and firmware development support. some fixed origin, like a tablet: or relative. i.e., changed since last report, like a mouse GeneralRej(erences

• Resolution in divisions per unit, e.g., counts per D. Lieberman, "Desktop Bus Is Born Free," Elec­ inch or counts per revolution tronic Engineering Tintes (September 2, 1991): 16.

• Dynamic range of values that can be reported, ACCESS. bus Developer's Kit (Palo Alto, CA: Digital i.e., the minimum and maximum values Equipment Corporation, Workstation Systems 15 Engineering TRJ/ADD Program, 1991). • Number of key switches, from 0 to The assignment of scalar-value dimensions Signetics flC Bus Sp ecification (Sunnyvale, CA: returned from one or more devices to the user Signetics Company, a Division of North American 1987). interface fu nctions is left to the application. How­ Philips Corporation, February ever, to accommodate most conventions. the scalar dimensions and the key switches can be labeled in the capabilities string.

Te xt Devices The text device protocol is intended to provide a simple way to transmit character data to and from character devices such as a bar code reader or a small character display. A generic text device trans­ mits a stream of 8-bit bytes from a character set. Simple control messages are defined to support flow control and to select communication parame­ t<.:rs that might be used to interface with a modem. The capabilities string contains information that identifies the specific character set and communi­ cation parameters used.

Summary The ACC.ESS.bus network interconnect offers the possibility of a standardized, low-speed, plug-and­ play serial communications channel that can untan­ gle peripheral interfacing and open the way to new

42 Vo l. ."> No. 4 t�{t/1 J')')J Digital Te clmicaljournal Ri chard Landau AlanGue nther

Design of the DECprint Common Printer Supervisor for VMS Sy stems

DECprint Printing Services software controls a variety of printerfeatures fo r a wide range of printers. It supports several different page description languages, handles multiple media simultaneously, and uses different I/0 interconnections and commu­ nication protocols. Operating within the VMS printing environment, it imple­ ments a large number of user-specified options to the PRINT command. DECprint Printing Services fu nctions as the supervisor in the VMS printing system fo r all PostScript printers supplied byDigital. The common printer supervisor bas an espe­ cially flexible internal structure and processing method to serve complex printing environments.

The increasing variety and complexity of printing the design of CPS, focusing on its capabilities within devices in the last decade have strained the abili­ the VMS system. The paper then discusses the oper­ ties of operating systems to support them. Users ation of the VMS printing system and the enhanced demand access to, and control over, the increas­ printing environment made possible by CPS. ingly sophisticated fe atures of their printers. At the same time, application programming resources are Printing Sy stem Dirnensions stretched by the requirement to support various A printing system is the set of software and hard­ devices and features. Modern operating systems ware components through which print requests include printing systems that support printers and pass from the time the user decides to print a docu­ insulate applications from many details of printing. ment until the appropriate hard copy arrives. DECprint Printing Services software was designed The variety of printing devices in use is a chal­ to handle a wide variety of printers, with a range lenge for the printing system and for applica­ of I/O connections, media hand ling capabilities, tion . We use the word "printer" in finishing equipment, data syntaxes, and so forth. this article to imply the full range of output devices It provides the controlling software that supports that are attached to systems and networks. A sys­ the fu ll range of Digital printers capable of printing tem today must support a wide number of dimen­ PostScript documents. sions: marking technologies, media, medium sizes, DECprint Printing Services fu nctions as a compo­ speeds, transmission rates, and interconnects. nent of the VMS printing system at the level of printer supervisor, called symbiont in VMS termi­ The DECprint Model of Printing nology. The supervisor is known within Digital as The DEC print model of printing is composed of sev­ the DECprint common printer supervisor or com­ eral layers. Each layer has definedfu nctions and 1/0 mon print symbiont (CPS). It is called common interfaces. The layers of the DEC print model and their because it replaces a number of diffe rent symbionts relationships to VMS and CPS are shown in Figure 1. and is common to a range of printers. CPS is a com­ This model of printing describes a useful structure pletely new program developed by the Video, with consistent fu nctions and responsibilities. Image and Print Systems Group.

This paper explores the environment in which • Application. An application program creates printing systems now reside. It describes the struc­ information that the user may want to print. Al l ture and functions of DECprint Printing Services and types of applications fit into the model at this

Digital Te chu'icaljourual Vo l. 3 No. 4 Fa ll /991 43 Image Processing, Video Te rminals, and Printer Te chnologies

DECPRINT ARCHITECTURE VMS CPS

APPLICATION

USER DIGITAL JOB INTERFACE COMMAND SUBMISSION LANGUAGE INTERFACE ---1------+-- PRINT (NETWORK) CLIENT SYS$SNDJBC PRINTING INTERFACE -----1 ------+-- PRINT QUEUE SPOOLER MANAGER PRIN T I SE AVICE I PRINT PRINT COMMON PRINTER SUPERVISOR SYMBIONT PRINT ACCESS SYMBIONT INTERFACE ----1------+--

MARKING PRINTERS ENGINE I PRINTER FINISHING EQUIPMENT

Figure 1 Relationships of the Vl l1S Printing Sy stem Components to the DECprint Mo del

level, from data processing programs and simple • Print client. The client accepts requests through text editors to high-quality document formatting its API, performs defaulting for the user, assists in

and publ ishing applications. The application may selecting the correct print service, ga thers the present a printing interface directly to the user, print instructions and document files, and sub­ or may create a final fo rm document from which mits the job to the print service. The protocol the user can access other printing interfaces. used to submit the job may be operating system­ specific or may be based on emerging standards • User printing interface. A user expresses the for network printing. The print service may be desire to print through a user interface to the local to the print cl ient (and the user), or it may printing system. The interface may be oriented be located elsewhere in the network. to written commands, to user selection of

menu choices, or to a point-and-sdcct graphical • Print service. The print service is a convenient interface . abstraction that includes the print spooler and aU subsequent layers in the execution of the print • ]ob submission interface. User interface pro­ job, for some set of physical printers. Printers grams communicate with the lower kv<.:b of the are often grouped together based on their static printing system through an application program­ characteristics, such as type of printer, printer ming interface (API) to the print client. The API data syntax, ancl default media. contains fu ll capabili ties for creating, destroying, and managing print jobs of all types. The job sub­ • Print spooler. The print spooler accepts the print mission interface may be operating svstcm­ job from the client, spools the files and queues specific or may he based on emerging standards the job for later execution if necessary, ancl fo r network printing. then schedules the job for execution. If the job

44 Vo l. 3 No. 4 Fa ll 1991 Digital Te chnical jou rnal Design of the DECprint Co mmon Printer Supervisorfor VMS Sy stems

requires resources that are not immediately avail­ Implementations of the model in various operat­ able, human intervention may be necessary. For ing systems and printers may express the layers example, ifa job requires a special print medium, differently, sometimes skipping certain layers. The then an operator or other printer attendant must VMS printing system contains components at most provide the medium for the printer. If the job levels of the DECprint model. The DECprint com­ requires a special fo nt, the spooler may be able to mon printer supervisor (CPS) operates within the obtain the font from a library without human VMS system, as indicated in Figure 1. We designed intervention. CPS to satisfy the requirements and projected needs of users, system managers, and programmers. In the Printer supervisor. The supervisor directly con­ • next section we discuss the design of CPS. trols the printer. It interprets the print instruc­ tions for the job, manages the printer and its fin­ Sharing Devices ishing equipment, and writes the document data Printers are often shared, especially high-end or to the page description language (PDL) inter­ specialized, expensive devices. Since shared print­ preter. It also monitors the status of the printer, ers are not always immediately available to the supplies some resources on demand, and responds user or application program, the printing system to error conditions. On the VMS operating sys­ is required to hold jobs for printing later. The sys­ tem, the printer supervisor is called a symbiont; tem must be able to store the user's instructions for on ULTRIX and UNIXsystems, a daemon. printing, along with the contents of the document, until they are needed. • PDL interpreter. Generally, finalform document data is written in a data syntax intended for print­ ing, but it is not in the native form required Insulating the App lication fr om Details by the marking engine. A PDL interpreter trans­ A printing system insulates appl ications from the forms the printer language into the lower-level details of printing devices. For example, DECprint form for the marking engine. For example, in a typ­ Printing Services provides communications mecha­ ical laser printer, a PostScript interpreter trans­ nisms and protocols, determines whether a shared forms the PostScript language into a device-level device is currently busy, and sometimes translates bit map and media control instructions for printer data syntax. the print engine. In a simpler impact printer, Application programmers generally prefer to the controller turns characters and control deal with as few external interfaces as needed to sequences into pin timing and paper movement perform the task. Thus it is desirable to minimize instructions. the number of different classes of printing devices while maximizing the variety and flexibility of Marking engine. The marking engine consists of • printing devices. The DECprint architecture speci­ the media transport and printing mechanisms, fiesthat the printing system take responsibility for generally controlled at a low level. Marking may matching the needs of the application to the capa­ be done by a wide spectrum of technologies, and bilities of the output device, whenever possible. the media used may also vary widely. For the For example, a printing system might need the abil­ most part, descriptions in this paper use raster ity to transform the printer data stream from a devices such as laser printers as examples. data syntax used by the application to a data syntax used by the printer. Hidden transformation makes • Finishing equipment. The overall printing sys­ tem includes finishing options that are not often the system easier for applications to use. DECprint considered part of the (largely electronic) print­ Printing Services provides a certain number of ing system. Currently affordable components of printer data syntax transformations of this type, the printing system are typically automated. For from languages such as DEC PPL3 (which is com­ example, several years ago duplex (two-sided) monly referred to as "ANSI" within Digital) and printing was not economical for most office ReGIS to PostScript, and from PostScript to printer applications; today it is, and many officeprinters bit maps. include this finishing feature. Stapling, on the other hand, is still not economical for most office InternalSt ructure of CPS applications, though it is implemented in many In designing CPS, our primary goal was to create a high-end production printers. flexible system that would handle all the printer

Digital Te chnical journal Vo l. 3 No. 4 Fa ll 1991 45 Image Processing, Video Te rminals, and Printer Te chnologies

fe atures we could fo resee ancll many that we could as simple as "create a separator page·· or as complex not foresee, a system that could be modified as as the sequence "read a file, perform carriage con­ needed to handle not just new printers but new trol convt:rsion, add /HEADER, translate from A:'-ISI classes of printers. CPS is capable of managing a data syntax to PostScript, and write the ITsult to wide variety of character, line, page, and document the printer." A print job is a set of job steps that per­ printers. forms all fu nctions the user requests explici tly or lb create a flexible printing system, we needed to implicitly. The CPS fa cility that translates sekcted design a highly modular internal structure. This inter­ printer data syntaxes into the PostScript language is nal structure combines modules into sequences at discussed in the section Data Syntax Tra nslation. several levels to provide a general framework for I/O controlling and manipulating devices. Printjo b Analyzers At the bottom level of the structure are filter To simpl ify the addition of new printers and new modules, which are lightweight, independently classes of printers, CPS contains a software struc­ schedulable subprocesses within a VMS process. Filter modules commlmicate with each other by ture that corresponds to the hardware mechanisms of a printer. means of 1/0 routines and a shared data structure (PJA) containing joh information. Pointers to the 110 rou­ A print job analyzer determines which CPS tines and shared data are supplied in the invoca­ job steps are required to process a job. includes a separate print job analyzer for each major class tion of the filter module. The effect of the stream of printer that it supports: serial PostScript, 1/0 routines is much like that of pipes in the UNIX PrintServer, and LN03 Image printer devices. When operating systems. the symbiont begins execution, a PJA is chosen based At the next higher level is a set of communicating on the type of device associated with the queue. filter modules; each stream of filter modules is This PJ A is used until the symbiont is stopped. If a called a job step. Finally, a module called the print terminal device, such as a TI or TX or LT device, is job analyzer combines a sequence of job steps to associated with the queue, then the PJ A for a serial handle a complete print job. device is invoked. If an LD device is used, then the Filter Modules andjo b Steps PJ A for an LN03Q printer is chosen. Otherwise, the Filter modules can read input from a preceding filter PJ A associated with PrintServer devices is used. module and write data to a succeeding filter mod­ Each PJA contains a list of all job steps required to ule. hlter modules may perform fu nctions such as execute a job on the class of printers it supports. reading a file,converting carriage control, translat­ The PJA selects the job steps it needs from this list, ing data syntax, or writing data to the printer. A depending upon the instructions received from the filtermodu le receives as arguments an input stream queue manager. and an output stream, like a UNfX process, and a Job steps are I inked together. The first job step shared data structure, unlike a Ul\TJX process. A sim­ chosen by the PJ A is linked to the termination of the ple filter moclulc reads data from the input stream, P.JA itself; when the PJA finishes compiling the job, processes data, and writes data to the output stream. it terminates, thus starting the execution of the job. A filter module may condition its operation based At the beginning of each job step, each filter mod­ on information from the shared data structure or ule is assigned stack space and a stack fra me. Its ini­ the contents of the data stream. For example, a tial program counter address and arguments are translator filtermodu le might format data based on stored in its saved registers for process activation. the page size, margins, and aspect ratio specified CPS uses a piped stream 1/0 mechanism similar in in the shared data structure, or based on control fu nction to a UNIX stream; a filter module's input sequences in the data stream, or both. comes from the output of the previous module, and Not all filter modules use the input or output its output becomes input to the fo llowing module. streams. The filereader filtermodule reads from the I3yconv ention, the firstfilter mo dule of the job step file instead of the input stream. Similarly, the device is activated first in the job step; when a filter blocks output module writes to the printer instead of the fo r output, the next filter module is activated . That output stream. filter module then runs until it blocks for input or A job step is a set offilter modu les piped together output, at which point the previous or fo llowing to perform one complete subtask. A subt:tsk may be filter module is activated.

46 Vr>l. 3 Nu. 4 Fall 1991 Digital Te chnical journal Design of the DECprint Common Printer Supervisorfo r VMS Sy stems

Ta ble 1 Simplified Job-step Sequence Table 1 shows a simplified listing of the job steps compiled by the serial PJA to process a simple job: Job Step Function one fileto be printed in ANSI mode. Each of the job init_ps_device Ensure the device is "fixed up." steps shown contains one or more filter modules check_prologues Ensure that persistent piped together. For example the job-burst job step prologues are loaded. has two modules piped together: the job-burst mod­ 2 sheet_count Get the beginning page count. ule and the write-to-printer module. Figure shows job_burst Print job burst page. several job steps with several filter modules each. If an error occurs at any point in the processing sheet_size Get the cu rrent sheet_size. of a job, CPS skips job steps until it reaches the wait_sheet_size Wait for the sheet_size before 1, continuing. identified error job step set by the PJA. In Table the error job step points to the sync job step that file_setup Send any file /SETUP modules. precedes the job-trailer job step. In this case, CPS get_vmbytes Get the amount of local printer memory available on the resynchronizes with the printer and prints the job­ printer. trailer page, including the error message. wait_vmbytes Wait for the local printer memory message from the Event Ha ndling printer. In addition to the output side of processing a job, file_out Read the file to print and send there is a corresponding input side. The input side it to the DECansi translator. reads messages from the printer, parses them, and sync Wait for the printer to finish all notifies the appropriate handler of the event. The pages. handler is chosen based on the type of message sent. init_ps_device Ensure the device is "fixed up." CPS internal messages are dispatched to the sheet_count Get the ending page count. • appropriate symbiont routines. For instance, wait_sheet_count Wait for the page count to come back. printer resource messages contain information job_trailer Print the job trailer page. that affect CPS internal operations: paper size is stored for later use by layup (the general map­ sync Wait for the printer to finish the job-trailer page. ping of page images to sheets) and translators; disconnect Release the printer. virtual memory size is stored for translators; and page count is stored for later use in accounting.

JOB STEPS

FILTER MODULES

READ GET VMBYTES J MODULE WAIT FOR VMBYTES

Note that data flows from top to bottom and job steps progress from left to right.

Figure 2 jo b Steps and Filter Modules

Digital Te chnical Jo urnal Vo l. 3 No. 4 Fa ll 1991 47 Image Processing, Video Te rmjoals, and Printer Te chnologies

• Printer status messages arc dispatched to the The VM S Printing SJ 'StemEnvi ronment operator and, in some cases, to the current user. CPS functions as a component of the VMS printing CPS uses the normal V.\·IS OI'CO:vl notification system at the level of printer supervisor. As such, it mechanism to send messages to the system opn­ interacts with, and is shaped by, the other compo­ ator. If the user specified /NOTIFY in the print nents of the VMS system. The term printer super­ instructions, then CPS us<.:s the V.\IS SllRKTI IRt : visor is used in this paper to be consistent with the syst<.:m service to send the message to the user terminology of the emerging International Stan­ also. dards Organization (ISO) Document Printing Appli­ cation draft standard, ISO/IEC DIS 10175. In some cases, printer status messages require additional processing. for example. paper jams Components requ ire spe<.:ialhand ling on some print<.:rs: since VMS CPS cannot determine how many pages were lost The Batch/Print system is a general queue man­ in th<.:jam, it invok<.:s human intervention by plac­ agement service, capable of queuing, scheduling, ing the job on hold. The operator or user can and executing jobs in response to a variety of user­ specified instructions.1 On the VMS system, the determine what parts oft he job, ifany , to reprint. printing instructions arc stored in a print job • Program status messages and user data messages object, which is placed in a queue of jobs fo r a arc dispatched to the job log. If the user specified printer. Modern print jobs often resemble batch /NOTIFY. then they are also displayed with the jobs, due to complex stored processing instruc­ $llRKTHRU system s<.:rvice. These messages may tions and the heavy computing load placed on be printed or logg<.:d. graphics printer control lers. The V.\15 printing system contains components at The input and output sides of the symbiont run most levels of the DECprint architectural model. asynchronously most of the time, hut occasionally it

is necessary for the output sid<.: towa it for a mes­ • llser printing interface. The VMS system includes sage from the printer. This synchronization between interactive Digital Command Language (DCL) the input side and output side of the symbiont is interfaces for printing and managing print jobs, accomplished by an internal event-signaling facil­ printers, and the printing system itself.2 For ity. When synchronization is r<.:quired, the output DECwindows applications. the DECwindows Print side waits for a specific named <.:ventand the input Widget provides a graphical interface that per­ side signals that event when it is detected. For mits users to specify all the options for printing, example, at the end of a job, CPS needs the final and the ALL-IN-1 application provides character­ printer sbeet_count in order to calculate the cel l menus for choosing print options, including sheet_ count for the job; this count is printed on the the enhanced options offered by CPS. trailer page and stored in the VMS accounting • job submission interface. The VMS system records. When CPS ne<.:ds the sheet_count, the out­ includes program call interfaces that give the put side waits for an event named sheet_ count. The program all the capabilities of the DCL user input side parses the incoming sheet_count mes­ interface. 1 sage, stores the returned value in the shared data structure, and signals the sheet_count event. The • Print client and service for remote printing. The processing of this event is asynchronous: at the distributed queuing services product currently time the message comes in, the output side may or provides transparent remote printing in net­ may not have stalled while waiting for the works using a proprietary network protocol. sheet_count even£. If the output side was waiting • Print spooler. The VMS job Controller, recently for that event, it is scheduled for fu rther process­ replaced by the VMS Queue Manager, fu nctions as ing; if the output side was not waiting, the event is queue manager and scheduler. (The function of remembered, in case the output side attempts to spooling printer data to temporary files is per­ wa it fo r this condition in the near hnw-e. formed by the VMS file system and is transparent In the next section we describe the ways CPS is to most components of the printing system.) controlled and managed in the VMS printing system and how it expands printing capabilities in the VMS • Printer supervisors. The VMS system provides environment. two standard symbionts to support most line

48 \kJI. j No. 4 Fa ii i'.J'Jt Digital Te chllicaljourtwl Design of the DECprint Common Printer Supervisorfo r VMS Sy stems

printers and serial printers. PRTSMB supports job, but cannot specify certain other qualifiers such printers attached directly to communication as DATA_ TYPE. It can also permit users of old appli­ ports on the CPU, e.g., the printer port on a VAX cations to access new features of the printing workstation. LATSYM provides support for print­ system. ers attached to the serial or parallel ports of DECserver network communications servers. For VMS Print Commands and Interfaces PostScript printers, CPS is used instead of these The VMS printing system is manipulated through standard symbionts. DCL commands and qualifiers. Many of the The VMS printing system also contains compo­ qualifiers are handled by the queue manager and nents that affect CPS processing. have no impact on the operation of print sym­ bionts; others directly affect the operation of CPS.2

• Device control libraries are collections of small The VMS system also supplies a call interface to text sequences that can be inserted into the data these fu nctions.·' stream from the symbiont to the printer. The sequences are ideally organized into text libraries VMS Interfaces to Sy mbionts containing named modules, with a separate The VMS Job Controller/Queue Manager provides library for each type of output device. Device two interfaces fo r customizing print symbionts: the control modules can be associated with a printer PSM module-replacement interface, and the SMB queue by the system manager as part of a FORM server symbiont interface. CPS is currently imple­ definition or a job reset function, or accessed mented as a single-stream symbiont through the directly by the user with the /SETUP qualifier. SMB interface. Device control libraries frequently contain The SMB interface permits a user to replace the device-specific control sequences that alter the flow of control of the symbiont with a separate pro­ format of the text and pages, for example, setting cess. The process may be written in any style and printer paper margins, setting character pitch, or structure suitable to the task at hand, and need fol­ enabling landscape printing. They may also con­ low only certain minor guidelines with respect to tain downloadable fo nt data or preprinted data the operating system environment. To use the SMB for each page. interface, we replaced the entire symbiont process. The result was much greater flexibility, but we VMS form definitions contain page size and mar­ • were required to write more program code. gin specificationsthat guide the print formatting The SMB interface provides services to the sym­ process for a print job. The user can also specify biont process through subroutine entry points and page setup strings and can prohibit the symbiont callbacks that passmessages between the symbiont from wrapping lines during processing. and the VMS queue manager. Messages from the VMS Print Queues system to the symbiont specify fu nctions such as start up, shut down, begin job, pause, resume, and VMS has several distinctly different types of queues. interrupt. Messages from the symbiont to the Execution queues process jobs through a symbiont, system return information such as job status, job and generic queues transfer jobs to other queues. completed, device status and error information, Often generic queues are used for load balancing: and checkpoint and accounting data. one generic queue may feed several printers of sim­ ilar capability and location. CPS also uses generic queues in an unusual way. Range of Printers Supp orted Default attributes can be specified for generic CPS currently supports the fullrange of PostScript queues that cause all jobs submitted through the printers supplied by Digital, from a low-speed queues to inherit certain default print instructions. color printer up to a 40-page-per-minute laser For example, a queue can be established that, by printer that can hand.le 11 different paper sizes. defau lt, assumes that jobs are PostScript docu­ ments, or assumes that jobs should be printed in Sp ecial I/ 0 Processing landscape orientation. This ability to set default CPS supports several different means of communi­ queue attributes is essential for supporting applica­ cation with the printer: serial, Ethernet, and a spe­ tions that can specify the queue name for a print cial high-speed video connection.

Digital Te chnical jo urnal V!Jl. 3 No. 4 Fa l/ 1991 49 Image Processing, Video Te rminals, and Printer Technologies

The serial connection may be either a direct con­ video connection or data bus. For such a controller­ nection between the compmer and the printer or less printer to be generally useful, the printing a local area transport (LAT) connection by which system must emulate an existing class of printer. printer is attached to a serial port of a DECscrver The LN03 Image printer (LN03Q) is a bit-map terminal server. The two methods diffe r only in printer of this type. It uses a special high-speed the way johs are started and terminated . For DMA bit-map interface that plugs into a Q-bus and LAT-connccted printers, CPS must establish and dis­ provides the speed required for printing scanned miss the LAT connection at the start and end of images. The protocol between this interface and each job. the printer consists of bit maps and a small amount Once the connection is established with the of status and synchronization information. serial printer (via IAT or direct connect), CPS begins The engine itself includes only the laser imaging a dialogue with the printer using an asynchronous and paper ha ndling equipment. CPS handles the serial line protocol and PostScript programs. The rest of the controller fu nctions in the host com­ asynchronous serial I ine protocol, defined by puter. Because of the level of support and emula­ Adobe Systems Inc., consists of five control charac­ tion provided, the LN03Q printer appears to be an ters that alter or query the state of the printer. ordinary PostScript job printer with some special The symbiont fo rces the printer into an idle state image capabilities. by a series of controi/T, control/C, and controi/D For a given print job, CPS performs the normal characters. When a controi/T results in an IDLE processing up to the point at which the PostScript message from the printer, the symbiont and printer languag�.: data stream would normally be sent to the are ready to process a job. printer. At this point, CPS directs the data stream to PrintServer printers on Ethernet networks are a special PostScript interpreter subroutine that pro­ DECoct nodes. To write to a PrintServer printer, CPS duces a bit-map image of the printed page in mem­ establishes a DECoct task-to-task session at the ory The bit-map image is then sent to the printer beginning of the job. The dialogue required for syn­ through a special LNV21 direct memory access 1/0 chronizing serial printers is not necessary fo r the interface on the Q-bus. Ethernet printers; the PrintServer protocols pro­ The software for the LN03Q printer also has one vide synchronization ancl device control opera­ special processing path. The l.N03Q printer is tions through separate control channels. intended as an image printer for bit-map images. Printers connected through Ethernet use several CPS supports image files containing page images protocols, which are layered on DECnet task-to-task that are scanned or precomputed at device resolu­ communications. The protocol used depends upon tion (300 cl ots per inch) and optionally compr<.:ssed the version of the PrintServer code. with Comite Consultatif Internationale de Tek­ The local area print service (LAPS) protocol was graphique et Tetephonique (CCJTT) Group 3 (lD) or developed fo r the PrintServer fa mily and is still in Group 4 (20) compression methods. Image files can use. The Common Printer Access Protocol (CPAP) be transmitted directly to the printer without con­ wi ll replace LAPS in all PrintServer printers. ; PAP is verting to PostScript. Image files can only be sent based on the earlier Reid-Kent protocol, Internet directly to the printer if they are printed one page Socket 170, and is being discussed as a possible new per sheet; if the user requests printing mu ltiple pages Internetstand ard.' per sheet, or other layup fu nctions, then the image is processed through the PostScript interpreter. Sp ecial Processing fo r "D umb" Printers Image files are structured in Digital document In some printer configurations, it is economical to interchange fo rmat (DDIF), which expresses text, use the wo rkstation or CPU as the printer con­ graphics, and images together. Files intended for the troller. In this case, the printer includes only the LN03Q printer must contain only image bit maps. print engine and media handling and fi nishing If the print job specifies DATA_TYVE= DDIF or the equipment, ancl none of the electronics, comput­ fileis a DDIF file, then CPS examines the file in a spe­ ers, and interpreter programs that render the cial mode. If the file correctly contains only image graphics language into the elements required by the bit maps. CPS decompresses the images in memory print engine (usually an array of pixels). Such a if necessary, using the DECimage Image Support "dumb" printer is physically connected to the com­ Libra11' routines. and then sends the uncompressed puter by a very high-speed link such as a direct bit map directly to the LN03Q print engine. Thus

50 !4JI. 3 Nu. 4 l-it// 1991 Digital Te chnical jou r, at Design of the DECprint Common Printer Supervisorfo r VMS Sy stems

the image goes directly to the printer without pass­ Data Sy ntax Tra nslation ing through the PostScript interpreter. CPS provides a facil ity that translates selected printer data syntaxes into the PostScript language. Sp ecialProcessing in CPS The translating programs are subroutines, some CPS includes a number of special features and func­ quite large and complex, that accept a data stream tions to satisfy the requirements of the DECprint in one format and produce a data stream in another architecture and the VMS printing system. In this format. The translators are responsible for all for­ section, we discuss the features that extend the matting, including sheet size, page orientation, process of standard print symbionts or are com­ aspect ratio, and type sizes; CPS is responsible for pletely new. all I/0 and coordination with the printer. The trans­ lation facility currently supports the following Reading Print Instructions printer data syntaxes: DEC PPL3, ReGIS, Te ktronix CPS reads the print instructions for a job from the 4010/4014, and PCL Level 4. VMS queue manager through the SMB$READ_ The translation facility has several restrictions. A MESSAGE and SMB$READ_MESSAGE_ITEM functions file may consist of only one data syntax, and all files of the SMB interface. Print instructions are in a job must be of the same data syntax. expressed as attributes with values. Each attribute In general, CPS performs the translation from has an associated numeric code and symbol, called one data syntax to another on the host computer. an item code, and a value of a specific data type. In this way, simple printers that support only the The symbiont reads each item code and value, and PostScript language internally can be extended stores the information in a static data structure. to support a number of printer languages. This The information is used later to determine the pro­ reduces the requirement for a complex printer con­ cessing sequence for the job, special information to troller that supports multiple data syntaxes inter­ be displayed on separator pages, and so forth. nally. Host translation can guarantee consistent use across jobs of the printer's internal fonts, page ori­ Bidirectional Communication with entation, finishingequipment, and page layup The Po stScript Printers general mapping of page images to sheets supplied CPS requires a fu ll duplex communications path to as part of CPS requires that the printer operate in PostScript printers since they report many condi­ PostScript mode. To ensure consistent use of fonts tions by sending messages to the host computer. and consistent positioning of pages with respect to These messages include device status messages, finishing such as duplexing and stapling, all lan­ program status and error messages, user data mes­ guage translation must be done by the symbiont. sages, and replies to CPS inquiries. CPS also requests information from the printer Page Layup Multiple Pages per Sheet for synchronization, formatting, and accounting Page layup is the process of printing more than one purposes. For instance, to determine how to for­ page image on a sheet of paper. When more than mat ANSI text, the symbiont needs to know what one page image is placed on a sheet of paper, the paper is loaded in the printer. images are rotated and scaled to fiton the page, but CPS receives the messages from the printer and are altered in no other way. The layup facility works parses them to determine what it should do with with all data types, including PostScript and PCL the message. If the message is device status, then data syntaxes. Layup also permits formatting for CPS routes the message to the operator and/or the larger paper sizes and then printing on smaller user whose job is being printed. If the message is an sheets. internal CPS communication, then CPS processes it. Layup is invoked explicitly with one or both Otherwise, the message is either a program status of the extended qualifiersNUMBER_ UP and LAYUP_ message or a user data message. In either case it is DEFINITION. NUMBER_UP specifies the maximum Jogged for the user. number of page images that will be printed on a Allmessages are parsed except user data mes­ single side of a sheet; for example, two-up printing sages. Messages from the printer's interpreter are is specifiedby the "NUMBER_UP=2" option. Two or converted to a standard format that would, if fo ur page images per side may save significantquan­ desired, permit the message to be translated into tities of paper for draft printing, handouts, and the the user's native language. like. Up to 100 page images may be placed on a

Digital Te chnical jounwl Vo l. 3 No. 4 Fa ll 1991 51 Image Processing, Video Te rminals, and Printer Te chnologies

single sheet of paper for thumbnail draft printing to PostScript environment, including the alt<.:rctl p:1ge review the overall layout of a document. geometry, e.g., layup established by the print job. Layup may also he invoked through a combina· CPS defines two new separator pages. The file tion of PAGE_SIZE and SHEET_SIZE with NUMHER_UP. error page is printed when a tilecannot be opcn�.:d for example, the combination of PAGE_SIZE=E, or an error occurs while r�.:ading the file. The file SHEET_SIZE=A,NUMBER_UP= 1 permits printing error page informs the usn of the error condition draft copies of large-format documents on small which caused it to be print<.:d . The job log page con· paper. Conversely, the combination of PAGE_ tains up to 40 lines of the job log file. The job log file SIZE=A,SHEET_SIZE=B,NU.\1BER_U P= 1 magnifies the contains job events such as job start and job com· smaller page to tit the larger sheet. pletion as well as programstatus messages and user data returned from the printer. Duplex Printing Printing on both sides of the paper introduces a Ma naging Printer Resources numbcr of new options and interactions that Once communication is established with the require special processing in CPS. CPS begins each serial printer, the symbiont must establish what document on the first siclc of a new sheet, so that resources are available on the printer. These recto and vcrso (right-hand and left-hand) pages resources include prologues, which are commonly and alternating margins arc al igned with the cor· used PostScript routines, the amount of available rect sides of sheets as they arc stacked by the virtual memory, and the med ium in the default printer. This function :1lso interacts with the dir<.:c· paper tray. For example, CPS persistently loads the tion in which the medium is physically loaded into PostScript prologue fo r the output of the ANSI text the printer if the medium is not symmetric left-to· translator into the PostScript interpreter. This right, top-to-bottom , or front·tO·back, such as pre· resource might be lost to the printer because of a drilled paper. power failure or might become obsolete clue to a The interactions of POL coordinate systems, page software upgrade. CPS interrogates the printer at layup, media selection, asymmetric media, duplex the beginning of any job requiring the translator printing, and binding are the most elusive engineer· prologue and loads a new prologue, if necessary. ing problems in the printing :1pplication space. No CPS also performs similar processing for the gennal model of these interactions has been devel· PostScript prologue that is used to generate the oped, despite considerable effort in standards com· separator pages. mittees. It appears that it is necessaq• to implement For traditional resources such as paper, CPS relies every possible option. on status messages from the printer to indicate that the printer is stopped because paper supply is Separator Pages empty or jammed. These conditions are relayed to the operator and to the current user by standard CPS prints all the separator pages defined by the VMS mechanisms. VMS queuing system as well as some generated by CPS. Flag, burst, and trailer pages for job and file lev­ els are available as defined by VMS, and contain LibrarySearch Lists the same information presented in a high ly lcgible In the standard VMS print symbiont, only one format. In addition to the standard V:'v!S infor· device control library may be associated with a mation, the job trailer page also contains the first queue. This is not a probl.em since the standard VMS two PostScript language errors returned from the print symbiont deals with only one data syntax. printer. This often makes it unnecessary tO use (Recall that clevice control libraries are often writ· MESSAGES=PRJNT to see simple errors. ten in device-dependent data syntax.) CPS, on the To ensure that the job separator pages can always other hand, uses more than one data syntax when be printed correctly, CPS r<.:s<.:tsthe POL interpreter printing a non-PostScript job: the data stream to the in the printer befor<.: printing these pages. The CPS· printer is PostScript, but the data stream ro tbe generated scparator pages do not alter the coordi· translator is in another data syntax. nate system of the interpreter; the user's document Early versions of symbionts that supported starts printing with the default PostScript state. File PostScript suffered from the same restriction: only separator pag�.:s, in contrast, print in the current one device control library was available, and its

52 Vo l. 3 No. 4 Fa ll 1')')1 Digital Te cbnicaf jou ,..,wl Design of the DECprint Co mmon Printer Supervisorfo r VM S Sys tems

modules were expre�seu in PostScript. This made it on this article. Finally, we thank the many sites and impossible for users to share device control people who have tested the DECprint Printing libraries with their standard VMS print symbiont Services software. ami their non-PostScript printers. To solve the problem of multiple data �yntaxes References in a job. CPS introduced device control library search lists. The system manager, rather than speci­ 1. VMS Utility Routines Ma nual (Maynard: Digital fying a single file specification in the IN!TlALIZE/ Equipment Corporation, Order No. AA-LA67B-TE, QUEUE/LIBRARY command. creates a logical name 1990). instead. CPS translates that specific logical name 2. VMS DCL Dictionary, 2 vols. (Maynard: Digital and uses each element of the result as a device con­ Equipment Corporation, Order Nos. AA-PBK5A-TE trol library. Each library in the search list can have a and AA-PBK6A-TE, 1991 ). data syntax associated with it by adding the qualifier, /DATA_ TYPE= . 3. Vil,.JSSy stem Services Reference (Maynard: Digital CPS supplies a device control library, Equipment Corporation, Order No. AA-LA69A-TE, CPS$DEVCTL, which must be included in the search 1991 ). list, usually as the first, or only, element in the 4. ]. Jones, A. Kachrani, and T. Powers, "The search I ist. Common Printer Access Protocol," Digital Te chnical jo urnal, val. 3, no. 4 (Fall 1991 , this Summary issue): 55-60. The DECprint model of printing describes a useful 5. B. Reid and C. Kent, "TCP/IP PrintServer Print structure with consistent functions and responsi­ Server Protocol," We stern Research Lab Te chnical bilities. CPS is an advanced print symbiont that runs No te TN-4 (Maynard: Digital Equipment in the VMS printing system. It includes many spe­ Corporation, 1988). cialized functions to support the features of a wide range of modern printing devices. It provides, we fe el, an extraordinary level of support. It was General References designed with a highly modular and flexible inter­ nal structure to permit enhancements to be engi­ Guide to Ma intaining a VJ11S Sy stem (Maynard : neered with minimal interactions with current Digital Equipment Corporation, Order No. operations. AA-LA34A-TE, 1990). CPS is currently shipping its fourth version. This DECprint Printing Services Us er's Guide (Maynard: version fu lly supports the ten different PostScript Digital Equipment Corporation, Order No. printers supplied by Digital. which range from a AA-PBZGA-TE, 1991). low-speed color printer to a high-speed laser printer. It also supports fivediff erent data syntaxes DECprint Printing Services Sy stem !Hanager's in which applications can write documents. We Guide (Maynard : Digital Equipment Corporation, expect that more printers and more capabilities Order No. AA-PBZFA-TE, 1990). will be added in future versions, and that CPS will Digital ANSI-Compliant Printing Protocol Level 3 require a minimum ofadd itional engineering effort Programming Reference Ma nual (Maynard: Digital due to its very general internal structure. Equipment Corporation, Order No. EK-PPLV3-PM, 1991). Acknowledgments Digital ANSI-Compliant Printing Protocol Level 3 We would like to thank Peter Conklin for actively Programming Supp lement (Maynard: Digital initiating CPS and Gary L. Brown for even more Equipment Corporation, Order No. EK-PPLV3-PS, actively expanding it. We would also like to thank 1991). past and current CPS developers: Ned Batchelder, Cathy Callahan, Mark DeVries, Rich Emmel, Dave PostScript Translator'sRe ference Manualfo r ReGIS Gabbe, David Larrick, Klara Levin. Mary Marotta, and Te ktronix 4010/4014 (Maynard: Digital Doug Stcf:tnd li, and Charlotte Timlege. We would Equipment Corporation, Order No. AA-PBWFA-TE, like to thank Bill Fisher fo r his extensive comments 1991).

Digital Te chuicaf ]o uruaf Vol. _; No. 4 Pall 1991 53 Image Processing, Video Te rminals, and Printer Te chnologies

PostScript Printers Programmer's Supplement (Maynard: Digital Equipment Corporation, Order No. EK-POSTP-PS, 1991).

PostScript Language Reference Manual, 2nd ed., Adobe Systems Incorporated, ISBN 0-201-18127-4 (Reading, MA: Addison-Wesley, 1990).

Information Te chnology-Te xt and Office Sys­ tems-Document Printing Application (DPA), ISO/IEC JTC1/SC 18 N, Draft International Standards 10175-1 and 10175-2 (September 1991).

CDA Base Services Te chnical Overview (Maynard: Digital Equipment Corporation, Order No. AA-PHJYA-TE, 1991 ).

Creating Compound Do cuments Us ing CDA Base Services (Maynard: Digital Equipment Corporation, Order No. AA-PHK2A-TE, Il)89).

Wr iting Converters Us ing CDA Base Services (Maynard: Digital Equipment Corporation, Order No. AA-PHK1A-TE, 1991).

CDA Base Services Reference Manual, 2 vols. (Maynard: Digital Equipment Corporation, Order Nos. AA-PHJZA-TE and AA-PHKOA-TE, 1991).

CDA : DDIF Te ch nical Sp ecification (Maynard: Digital Equipment Corporation, Order No. AA-PHK3A-TE, 1991).

('DA : DTIF Te chnical Sp ecification (Maynard: Digital Equipment Corporation, Order :-.Jo. AA-PHK4A-TE, 1991).

DDIS .�vntax Sp ec�fication (Maynard: Digital Equipment Corporation, Order No. EL-00081-00-1 , 1987).

54 Vo l . .3 No. 4 Fa /1 1')91 Digital Te cbnical]ournal ja mes D. jo nes Ajay P.Ka chrani Thomas E. Powers

The Common Printer Access Protocol

The DEC PrintServer Supporting Ho st Software version 4.0 incotporates Digital's first implementation of the new common printer access protocol (CPA P) . Th is pro­ tocol is compatible with the local area print server (LAPS) protocol, which was op timized fo r VMS access and DECnet transport, and with tbe Reid-Kent proto­ col, a PostScript-based, TCP/!P- connected print server fo r a client-server environ­ ment. Th e CPA P protocol supports a variety of data presentation protocols and allows printers to be connected to driving applications by various communica­ tions and process-to-process intetfaces. Tbe protocol also couples entities running different op erating systems across disparate networks. Because of its superior performance, the new CPA P protocol has been accepted by the Op en Software Fo undation fo r inclusion in a fu ture release of OSF/ 1.

The presentation of computerized data has become cal printing device. The common printer access a remarkably sophisticated and subtle operation. protocol (CPAP) described in this paper provides Video displays now support windows with com­ the fu ndamental services required by a printer plex al locations of display space, variable fo nts, and supervisor for the presentation of data and collec­ real-time user input operations. Printing devices tion of accounting information. In addition, the now offer support for publication-quality fo nts, CPAP supplies easier network access between line art, and images. These devices can present printer supervisors and printers, as well as ancil­ visual objects on a variety of media, from many lary control of printers for network management sources, and in variable orientations and presenta­ and device configuration. The CPAP also provides tion modes. In addition, both video and printing services to distribute the processing requirements devices are now decoupled from dedicated com· of the printer itself, most notably a mechanism for puting environments, and are shareable from many delivery of network fo nt services. This last capabil­ hosts and by many users or programs. ity allows a printer to offer what amounts to vir­ Now, only the simplest printing devices are lim­ tual services, i.e., the ability to configure itself ited to presenting just characters, and many users dynamically to the demands of a print job without are finding such restricted capabilities inadequate. the involvement of the printer supervisor. Also, most printing devices still require dedicated This paper begins with a discussion of the connections to single computers. However, more influence of existing protocols and the DECprint printers now offer fu ll network accessibility; i.e., architecture on our CPAP design goals. The sections network printers are capable of offering sophisti­ that follow present the printer session concepts cated services to a wide variety of users and their and the functional interface between the protocol applications. and applications. We then describe the implemen­ The paper entitled "Design of the DECprint tation of the new protocol in a server environment, Common Printer Supervisor for VMS Systems" including interoperability, compatibility, and the in this issue of the Digital Te chnical jo urnal translation of the older PrintServer protocol. At the describes access methods and interrelations close of the paper, we discuss ongoing standard­ among services that provide for these increasingly ization issues. sophisticated data presentation capabilities.1 The printer access protocol (PAP), a service interface in History the DECprint architecture, couples the printer The PrintServer 40, Digital's firstfu lly networked supervisor component to the logical printer for printer, was firstshipped in 1986. Its local area print presenting data and otherwise controlling a physi- server (LAPS) protocol was analogous to later

Digital Te chnical jo urnal Vo l. 3 No. 4 Fa /1 1991 55 Image Processing, Video Te rminals, and Printer Te chnologies

printer access protocols. The PrintServer 40 was a problem was a major conceptual test in the first ground-breaking product for Digital, and the LAPS implementation of a CPAP server. This is discussed protocol was a major aspect of the PrintServer in more detail in the section The CPAP Server development effort, portions of which date back to Implementation. 198:1. The LAPS protocol was designed and devel­ The Reid-Kent protocol met many of the techni­ oped with particular product-oriented deliverables cal design requirements fo r a new PAP. It was built in mind, and was optimized for VMS access and on industry-standard components, and contained DECnet transport. While this protocol predates no proprietary technology that would prevent its much of the architectural work now being imple­ publication. mented in Digital's printing products, it was (and However, certain PAP design goals were not cov­ still is) a signi.ficant element of PrintServer archi­ ered by the Reid-Kent protocol in its 1988 version. tecture and implementation. • There was no facility to select a specific page Wo rk began on more general PAPs in 1987 as description language (PDL) for printers support­ part of the early work on the DECprint architec­ ing multiple interpreters. ture (known at the time as the Printing Systems Model). The specifics of what would become the • There was no method for soliciting the capabili­ CPAP emerged in late 1988 in two internal papers ties and media available on the printer. by Brian Re id and Chris Kent of Digital's We stern • The only language supported was English Research Laboratory. These papers presented the (contrary to the corporate guidelines for initial design concepts for a PostScript-based, internationalization). TCP/IP-connected (transmission control protocol/ Data sent from the printer was not categorized; internet protocol) print server in a clearly defined • user-specific information was mixed with opera­ client-server environment. This print server proto­ tor and service data. col came to be known as the Re id-Kent protocol.

• No means was provided to solicit the status of the printer. Design Rationale and Goals There was no encoding to discriminate between By early 1988, design goals for (and constraints on) a • PAP were well understood, and had been collected binary and text files. and published as part of Digital's Printing Systems However, these flaws were largely omiSSIOns Model. Chief among these goals and constraints was from the design goals, not fundamental conflicts the need to support a variety of data presentation with them. The architecture team decided that the protocols, and to allow printers to be connected to Reid-Kent protocol could be extended to address driving applications by a variety of communica­ these omissions without serious conflict. In fact, tions and process-to-process interfaces. the necessary extensions were designed to allow The increasing corporate commitment to open clients and servers conforming to the original Reid­ systems made it clear that a PAP would also have to Kent protocol to remain in conformity with the full couple entities running various operating systems C:PAP specification. across different networks. Thus, the DF.Cprint PAP architecture team decided early in the design pro­ Architecture cess that a PAP should be designed fo r public The CPAP is primarily a communication-oriented access: that is, the specification for the protocol protocol, i.e., the presentation of its function is should be put into the public domain and submit­ closely coupled with its encoding. The major syn­ ted for industry standardization. tactic features of the <:PAP derived from the Reid­ Jnteroperability is a most serious constraint. Kent protocol are the fo llowing. Digital has a strong tradition of maintaining back­ All encoclings are ASCII strings. This eases the ward compatibility within and among its product • generation of protocol streams and ensures inde­ families. In a distributed processing environment, pendence from the underlying communications however, backward compatibil ity takes on the channels. added burden of interoperability. Multiple clients must communicate with multiple servers. any of • No data fields are fixed length. This provides for which can be upgraded to new versions of sup­ extensibility of the protocol and eases the gener­ ported pwtocols asynchronously. Addressing this ation of a protocol stream.

56 Vo l. 3 No. 4 Fa ll I'J'J/ Digital Te chnical journal Th e Common Printer Access Protocol

will use for each document and which virtual cir­ • Multiple channels of communication use the same basic format. Common parsing of separate cuit will be used for its transmission. channels simplifiesimplementations. Selection of the proper virtual circuit for trans­ mission of documents to the printer is performed Simple numeric tokens define theoperators. • by passing tokens from the printer to the printing service. The tokens are then mapped to whichever Session Concepts virtual-circuit service is being used by both the The CPAP architecture definesseparate contexts for printing service and the print server. This map­ each type of work the CPAP can perform. Each con­ ping approach avoids passing network-specific text requires that a separate session be established information within the protocol. Not only does the for its own tasks, and each session involves the cre­ approach make the CPAP independent of the net­ ation and use of a separate network connection works on which it might run, it ensures that the between the controlling cl ient and the server. Each network services need no knowledge of CPAP connection identifies the type of session the initia­ encodings. Such virtual-circuit mapping is criti­ tor requires. The CPAP defines three different ses­ cal to allow CPAP client-server processing to be sion types: print, management, and console. implemented in a heterogeneous, internetworking The set of CPAP operators allowed for a session is environment. restricted to those needed to support that type of During the printing of the document, some data session. All session types have access to printer presentation interpreters (PostScript, for example) status and configuration information. In addition, send data back to the user or print service. In addi­ multiple concurrent sessions are permitted. Print tion, the printer may run out of paper or toner, sessions and management sessions may have one or may have a full output tray, or may encounter other more virtual circuits active to a printer at a time. exception conditions not directly related to the The use of multiple circuits permits the streaming interpretation of page description data. The CPAP of data to the printer over logically separate chan­ categorizes such conditions and delivers relevant nels, thereby eliminating application protocol over­ messages to the user, the operator, or the event logs. head for the most frequent operations. In contrast, Upon completion of the job, the printing service console sessions use a single virtual circuit for is notified of the media used, the number of pages exchange of data with remote terminals. printed, and the printer processing time required to complete the job. The protocol also includes a Print Sessions Print sessions usually consist of a provision to abort jobs, e.g., an improperly formed series of documents printed for a user on a given document that might otherwise hang the printer. host by a printing service (a "printer supervisor" as defined by the DECprint architecture). With the Management Sessions The CPAP supports certain operators provided by the CPAP, the printing ser­ printer services through management hosts. A man­ vice can determine the language interpreters, agement host is a network entity (not necessarily printer options, fonts, prologues, and media that the same entity as the printing service) with which are currently installed at the server. These opera­ the printer can exchange information or request tors also provide the current operational state, services. Such services include number of jobs queued to the printer, and the cur­ Time service rent job status. These features permit the printing •

service to select the printer (server) that can satisfy • Centralized event logging the user's request and to determine a method for Centralized accounting submitting the job to the printer. •

Once the printing service has begun a session • Program loading and configuration and identified itself, it identifies the user and the Font services user's job code to the printer. This information may • be used by the printer to provide usage information An important aspect of the CPAP is that the to a centralized accounting service. The printing printer is always passive with regard to initiating service can then present documents to the printer. management services. A candidate management A transaction between the printing service and the host advertises that it has services to offer, and a printer establishes which interpreter the printer print server accepts or rejects the offer. Once a

Digital Te cbnicaljourual Vo l. 3 No. 4 Fa ll 1')91 57 Image Processing, Video Te rminals, and Printer Te chnologies

connection with one or more management hosts programming interface (API) that aiiO\:vs access to is established, the printer may use such hosts as all CPAP facilities is included in the protocol's servers for time synchronization, configuration file specification. access, and font lookup. Additional functions for A connection block, which is passed as a parame­ these hosts may be loading program images, event ter to all functions, provides support fo r vari­ logging, accounting, and general fileaccess. ous printer types, their device identifications, and File naming to access general file services is a descriptors for command and data channels. This problem that needs special attention if the server support includes separate command and data and the protocol are to maintain independence channels for printers supporting multiple virtual from the host operating systems. Commonly used circuits or channels. Just as in the case of the data­

filesan : identified in the CPAP by reserved tokens, stream form of the protocol, the API form allows such as $CONFIG, $DEFAULTS, $RESOURCES, and separate channels for data and commands. $SETUP. Arbitrary path names are allowed, but can A separate command channel allows ease of con­ access only a limited domain (from a known root trol flow between client and server. This may directory) to preserve file system independence include the client receiving the server's status or and to maintain security. events, or the client sending aborts to the server. Translation to the host's services is provided For devices that support only a single channel, the by the host itself. This permits the printer to be generic printer driver can set both command and served by different hosts using a wide variety of data channels to the same value. For supporting operating systems (and their implicitly diffe rent multiple jobs active at the same time (job overlap), file-naming conventions and syntaxes) without any a job identification (ID) parameter is passed with awareness of a management host's implementation all functions. by the server. To support various message types, the address of a read-callback routine is passed to the open Co nsole Sessions A console session is a form of printer function along with a pointer to read-call­ printer management. The content of the data back arguments. These arguments may signal vari­ exchanged during a console session is specific to ous events, or may consist of messages for the user, the printer, and is not specified by the CPAP. operator, accounting, or resources available in the Services performed within a console session might printer. include An early version of the generic fu nctional inter­ face was part of MIT Project Athena's Palladium Operator services, such as telling a printer what • Print System. The printer supervisor in Digital's media have been loaded (e.g., by color, weight, LN03R ScriptPrinter product was modified to cre­ or transparency), or setting physical printer ate a generic printer interface for both the defaults (e.g., duplex versus simplex, or default ScriptPrinter device and the PrintServn family. medium selection) This conversion from an API-accessible base took

• Network management configuration services, one week to execute, whereas it typically takes such as control I ing domain access to or from six months of effo rt to develop a new printer the printer supervisor for a device as complex as the PrintServer product. • Troubleshooting or debugging services Digital's implementation of console services on The CPAP Se rver Implementation current PrintServer products conforms to the The implementation of a protocol gives rise to Enterprise Management Architecture. problems different from those related to its design. When defining the architecture, one strives to pro­ Application Program In terfa ce vide an ideal that includes all of the desired features The functional interface to any protocoJ provides in an elegant manner. When performing an imple­ an additional abstraction between an application mentation, one finds that elegance often has to take

and a protocol. This abstraction answers many of a back seat to pragmatics. This is especially true today's software application needs, including inter­ when the new protocol is intended to replace two operability, portability. modularity, and reusabil­ different protocols in a new version of an existing ity across multiple architectures. An application product. Merely implementing the new protocol

58 V!J!. 3 No. 4 Fa ll 1991 Digital Te e/mica/ journal Th e Common Printer Access Protocol

is not enough-the implementation must some­ biont (CPS), several versions of the ULTRIX oper­ how coexist with the protocols being replaced. ating system, and a source kit version running on Digital's first production implementation of the a Sun Microsystems workstation. CPAP was targeted for the DEC PrintServer Sup­ Unfortunately, this testing uncovered latent porting Host software version 4.0, which loads and defects in the implementation of the existing prod­ drives the PrintServer family of printers. For the ucts. We had to analyze each of these defects and rest of this paper, we refer to this software by the plan corrective action. Since updating the existing PrintServer product designation of LPS version 4.0. products in the field is a difficult process (both We started the implementation by modifying technically and procedurally), we corrected most Digital's ULTRJX PrintServer client, which already of the defects by altering the server to deal with the used the Reid-Kent subset of the CPAP, to use problems. Retesting was performed over several DECnet network transport and run on the VMS oper­ baselevels to ensure that our changes caused no ating system. We then updated the LPS server regression. code to permit either DECnet or TCP/IP transport. At one of the early baselevels, the interface This was accomplished by using the direct-to-port between the network distribution software and the communication features of the VAXELN operating server's PostScript interpreter was updated to use a system. The server establishes a circuit using the stream-based connection in place of the previous appropriate transport and then spawns a process packet protocol. This update permitted the new for dealing with each incoming connection. Thus, CPAP data channel to be mapped by reference to the same code can service print sessions, manage­ the input of the PostScript POL or any other POL ment sessions, and console sessions without con­ supported by the printer. This change alone per­ cern for the type of network transport. mitted the performance of the server to be main­ The CPAP was, by design, directly upward­ tained even when the server was translating from compatible with the Reid-Kent protocol subset. the old LAPS protocol to the CPAP. However, Digital's PrintServer offerings prior to In general, development proceeded incremen­ LPS version 4.0 were LAPS-based, and LAPS was not tally, i.e., key features were identified and added CPAP-compatible. To permit users of existing with each baselevel. While this technique limits the PrintServer printers to continue to use these complexity of producing the product, it raises an products, we had to find a way for the new CPAP important business issue. Specifically, the provi­ implementation to coexist with the older LAPS sion of enhanced services in a client-server envi­ application protocol. We achieved this coexistence ronment often exposes aspects of the proverbial by having the server perform translations from the "chicken-and-egg" situation. There is little call to older protocol to the new one in the server itself. offer enhanced features in a server if clients have When the client establishes the initial connection, not been programmed to solicit the features. How­ the server senses which protocol is being used by ever, clients are not readily upgraded to solicit the client system. If the initial message indicates features that might not be widely available. the use of LAPS, the server spawns incoming and The LPS version 4.0 project team met its backward outgoing filtersto deal with the incoming connec­ compatibility design goals by including the LAPS-to­ tion, and a new internal circuit replaces the CPAP filters. In doing so, they undercut the need network connection to handle the interpretation to provide the enhanced feature support that the of the CPAP. CPAP was designed to deliver, since existing clients The coding of the LAPS filters was the last step (earlier versions) could not avail themselves of in implementation before testing began. The the added features. In addition, the risks of includ­ PrintServer 20, PrintServer 40, PrintServer 40 plus, ing fu ll CPAP support in LPS version 4.0 (possible and the new turbo PrintServer 20 all had to be increase in time to market, and the creation or expo­ tested using both LAPS and the Reid-Kent subset of sure of more latent defects in all supported environ­ the CPAP. In addition, the new implementations of ments) seemed to outweigh the benefits. However, the management cl ient and the console client on a last-minute change to use the new protocol's data the VMS system required verification. This verifi­ channel for loading fo nts yielded such a large per­ cation entailed a multitude of tests using the LPS formance advantage that resistance to using the symbiont running on older versions of the VMS new features crumbled, and the project team was operating system, the newer common print sym- allowed to submit the full protocol to field test.

Digital Te chnical journal Vo l. 3 No. 4 Fa ll 1991 59 Image Processing, Video Te rminals, and Printer Te chnologies

Standardization architecture and created the firstprototype imple­ Network printing became widely available in the mentations. Jim jones championed the proto­ mid-1980s, but products from different vendors col in the DECprint PAP archjtecture team (Alan were not compatible. Network printing protocols Guenther, To m Hastings, Jim Jones, Tom Powers, were largely proprietary efforts by vendors who and Eric Rosen) and coded the LPS version 4.0 had developed them fo r their own printer prod­ server. Carol Gallagher wrote the LAPS filters to ucts. Digital's PrintServer 40 and its LAPS protocol translate from the old protocol to the new. Mike were typical in this regard . By the late 1980s, Augeri and John McLain ported the management network printing was an established and competi­ and console clients to the VMS system from the tive technology, but there was still little inter­ ULTRIX system. ]. K. Martin rewrote the Berkeley operability among the various vendors' producb. Software Development (BSD) source kit to use the In the absence of printing protocol standards, the new protocol. Ajay Kachrani developed our ULTRIX Internet Engineering Task Force (IETF) fo rmed a and MIT Athena clients and represented the proto­ Network Printing Protocol working group in col during early phases of the IETF standardization early 1990. This group's charter was to examine effort. Many others supported these efforts, and printing protocols then in existence or under devel­ others are yet beginning to develop new CPAP opment, assess their applicability to Internet-wide clients. We thank them all for their efforts. use, and suggest changes. Digital's representatives to the IETF working group on the Palladium Reference Printing Systems standardization reported the inter­ 1. R. Landau and A. Guenther, "Design of the est shown in Digital's Reid-Kent protocol. Thus, in DECprint Common Printer Supervisor for VMS July of 1990, Digital submitted a version of the PAP Systems," Digital Te chnical jo urnal, vol. 3, no. 4 that was under consideration by the DECprint PAP (Fall 1991 , this issue): 43-54. architecture team. Early consideration of this PAP by IETF and the LPS version 4.0 implementation effort ran concur­ rently. This provided a unique opportunity fo r Digital's implementers to obtain fe edback from a very knowledgeable architectural community. In turn, they could report implementation experi­ ences that affected the review and progress of the specification towards standardization. Implemen­ tations of CPAP clients and servers by companies other than Digital are in progress. As part of Project Athena's Palladium Printing System, the CPAP has been accepted by the Open Software Foundation for inclusion in a future release of OSF/1. A draft of the CPAl' is being circulated among Internet members for comment. Meanwhile, work on future enhancements continues. Wo rk is now in progress to specify a superset of the existing pro­ tocol that deals with authentication and encryp­ tion to strengthen security. This work is being done in the spirit of the original migration from the Reid-Kent protocol to the CPAP; i.e., the security fe atures being added will not adversely impact users who do not need the new features.

Acknowledgments The CPAP effort has been the work of many devel­ opers. Chris Kent and Brian Reid drafted the base

60 VIii. 3 No. 4 Fa ll 1991 Digitttl Te chnical jo urnal Guido Sim one jeffre y A. Metzg er Gary Va illette

Design of the Tu rbo PrintServer 20 Controller

The turbo PrintServer 20 controller is a pe1formance enhancement of the original PrintServer 20 system controlle�: The turbo controller was developed to enable PostScript code to execute faster and thus improve page throughput for complex documents. The RETrACE analysis system was designed to analyze tbe peiformance of the original PrintServer 20 system and estimate expected performance future systems. Tbe turbo controller's processor and its three subsystems for memory, write buffer; and bit-map data transfer were selected based on the analysis results. Performance tests conducted on both the original and the turbo PrintServer 20 indicate the enhanced processing performance of the turbo controller.

In 1988 the turbo controller project was conceived We were to use existing, qualified chips wherever as a means of extending the life of the PrintServer 20 possible in order to avoid new part qualification platform by introducing a performance-enhanced costs and application-specific integrated circuit system controller. The system controller in the (ASIC) development costs. PrintServer 20 is housed within and powered by Early investigations indicated that the perfor­ the printer or "print engine"; it is a concise imple­ mance target was indeed achievable with existing mentation of a single-board computer containing a inexpensive ruse processors, as well as a high­ CPU, a memory subsystem, an Ethernet interface, speed Digital proprietary VAX processor. A ruse and a printer interface. It supplies an environment processor would require porting a 2.5-megabyte in which a multitasking software system manages (MB) software system, which was far beyond the communications with remote clients and with the scope of the project. The highest performance print engine, performs data conversion from the VAX processor and the associated support chips, page description language (PostScript) to bit-map which would not cause a problem with the soft­ images, and provides management of physical print ware system, were far too expensive to be consid­ engine resources. ered. Alternatives were therefore limited to less The original controller provided a maximum expensive, lower speed VAX processors: the low­ print speed of 20 pages per minute, but this perfor­ risk, 60-nanosecond (ns) CMOS VAX or CVAX pro­ mance could not be maintained when the docu­ cessor was proven, ancl the higher speed and more ment included complex text, graphics, or images. To cost-effective "system on a chip" or SOC processor improve page throughput for complex documents, was under development. Either choice would have a controller was needed on which PostScript code a minimal impact on the software system and could execute faster. To enhance performance, the would provide a cost-effective solution. competition was moving toward controllers based The original performance estimates for the CVAX on new industry-standard reduced instruction set and the soc processors in general-purpose process­ computer (RISC) processors. Therefore, to be com­ ing environments were below the lower bound of petitive, Digital's new controller was required to the performance target. The design team was also improve performance by five to eight times that of uncertain of the actual execution characteristics of the original controller, which hacl been based on the PrintServer software. For these reasons, it was the rtVA.,'C microprocessor. decided to begin the project with a performance As challenging as the performance improve­ analysis of the original controller to determine the ment would be to achieve, budgetary pressures expected performance of a design based on either fo rced restrictions on the implementation strategy. processor.

Digitttl Te clmicaljourmrl Vo l. 3 No. 4 Fa ll /991 61 Image Processing, Video Te rminals, and Printer Te chnologies

This paper discusses the problems encountered The RETrACE hardware consists of the following: t.luring our analysis and the solutions d<:vised by the • Two imerconnect boards boot and operate a Hart.lcopy Systems Engineering Group to overcome system controller on a table top. Developed as them. The RETrACE tool suite, a performance analy­ part of the original PrintServer 20, the boards sis system, is described and the analysis results are connect the controiJer to a print engine and an provided. The paper then discusses the hart.lware Ethernet. architecture of the turbo controller and ends with a presentation of the performance resu lts obtained • The PrintServer 20 server controller was modi­ for standard PostScript benchmarks. fied for use as an intelligent trace buffe r system.

• The PrintServer 20 server controller's memory Pe rformance Analysis of the Original capacity (12MB) was extended using the standard Co ntroller 4MB memory module used on the Kanji version The PrintServcr 20 software system consists of a of the PrintServer 20. VA-'\ ELN operating system, an Adobe Systems, Inc. PostScript interpreter, and a substantial amount of • The RETrAC E mother board was developed specif­ software to manage communications and resources. ically for this tool suite. It contains a 32-bit wide, (FifO) The task of analyzing its performance was compli­ first-in, first-out buffer and two loosely cated by two additional factors. First, the software coupled state machines. system's behavior depended on the characteristics • A standard PrintServer 20 system controller and of the user's PostScript document. PostScript is print engine were used as the "system under an interpreted programming language. Thus, like observation." any computer program, low-level machine perfor­ mance can be dramatically affected by the program • The console terminal was selected from the stan­ being executed. Second, ami more painful, the dard VT series of terminals. proprietary nature of the PostScript interpreter A diagram of the RETrACE hardware system is prohibited us from obtaining code sources, and dis­ shown in Figure 1. cussing its internal architecture with engineers The RETrACE mother board performed the data from Adobe Systems. capture, using the modi.fiedcontr oller's memory as While the characterization of a complex, par­ a large buffer. The board monitored the processor tially proprietary, real-time software system is bus of the system under observation by copying difficult, it is not impossible. Programmer counter alI addresses and communications between the address (PC) traces have offered many systems rtVA-'\ processor and its external floating-point designers very detailed insight into the execution unit. This copied data was placed into a FIFO buffe r performance and characteristics of systems. PC that in turn was written into the memory of the traces provide a means to observe a system at a modified controller using a direct memory access macroscopic level, allowing a view of the complex (Oi'vlA)device. Since a standard PrimServcr 20 con­ interactions between the hardware and software troller and its optional memory expansion provide systems. System designers can use captured address 16.VIB of storage, approximately 3 seconds of real­ trace; from current machine performance to extra­ time execution address traces could be caprured. polate expected performance of futuresystems and The data capture continued until the trace buffe r help them make architectural trade-offs. memory was exhausted, at which point the data was uploaded over a network connection to a VAX Th e RETrA CE Analysis Sy stem VMS computer for analysis. The HETrACE tool suite was created to provide Due to the design of the original PrintServer 20 a nonintrusive means of capturing real-time PC system, many large data areas and code sections traces and analyzing the captured addresses. The were mapped into different explicit memory spaces. tool suite consists of both hardware and software This subdivision provided a means of determining components. which code function was executing in any given In order to keep expenses at a minimum, existing segment of the address trace. With a simple statisti­ hardware was used wherever possible. Only one cal study it was possible to generate software exe­ small module had to be developed to complete the cution histograms and to determine many of the RETrACE hardware platform. characteristics of the system, including translation

62 Vr>l. 3 No. 4 Fa ll 1991 Digital Te chnical jo urnal Design of the Tu rbo PrintServer 20 Co ntroller

RETRACE CONSOLE TERMINAL

PRINTSERVER 20 SYSTEM CONTROLLER

SYSTEM UNDER OBSERVATION

Figure 1 RETrA CE Analysis Sy stem Ha rdware buffe r, floating point, instruction stream (!-stream) fou nd, then a miss was signaled and the simulator versus data stream (D-stream), read versus write, would proceed to the next level of memory in the and interrupt performance. Hit rates for fully asso­ hierarchy. ciative caches of separate !-stream and D-stream, Whenever a hit occurred at a given level, it as well as a combined I- and D-stream cache, were was logged and all levels of memory in the hier­ also provided. These hit rates were determined for archy above it would allocate entries based on first-level write-through caches from 128 bytes up their definedal location rules. While this procedure to 256 kilobytes (KB). Thus an upper bound for an indicated the memory system performance for optimum-performance cached memory system a proposed architecture, the overall system per­ was determined. formance was still unknown. Using a simple rule Both processors under consideration possessed based on the average execution time per address the ability to access a memory subsystem at speeds for the existing controller, and scaling that time greater than that achievable with existing low-cost based on the clock speed increase of proposed pro­ dynamic random-access memory (DRAM) technol­ cessors, an overall performance number was esti­ ogy. The performance numbers predicted by the mated for a system based on either processor with processor groups indicated that cached memory any arbitrary memory architecture. subsystems were required. Because these sub­ systems can be expensive and their performance is Benchmark Selection subject to the peculiarities of the software that The RETrACE tools suite provided the components executes on them, a multilevel memory simulator required to study the execution characteristics of was developed to allow accurate studies to be per­ the PrintServer system without changing the char­ formed on proposed cache architectures. acteristics of its normal operation. The only diffi­ The simulator was configured atru n-time to sim­ culty was to narrow the focus of the benchmark list ulate an arbitrary hierarchical memory system that to provide a representative sample of PostScript was N levels deep, with an arbitrary size, associa­ documents to print. Due to time constraints, the tivity, perfo rmance, and behavior at each level. list was limited to five benchmarks. The memory level nearest the processor was definedas the first level, and the last as main mem­ BM 1 The BM 1 benchmark stresses those aspects of ory. The simulator processed a trace file by walk­ the system that convert the mathematical represen­ ing each address in the file through the memory tations of characters to bit-map representations, hierarchy starting nearest the processor at the first which comprise the form that is printed. This level. If a copy of the address was found at a given benchmark uses several fo nts in standard character memory level, then a hit was signaled and the next orientations, stressing both very large and small address was processed. If that address was not character sizes.

Digital Te cbnicaljour11al Vo l. 3 No. 4 Fall 1991 63 Image Processing, Video Te rminals, and Printer Te chnologies

R.\12 Of the same type as B.\1 1, this benchmark the overall hit rate to approximately 81 to 84 per­ stn::sses the transforms from mathematical to hit­ cent. As the analysis of the data progressed, it was mapped character representations; howcvt.:r, the understood that the write data must be studied characters printed are at arbitrary orientations very closely since it could have a dramatic impact with sizes ranging from typical to very small. on the overall cache miss rate. During the cache model simulations, the hit rates of the 1-stream BM3 The B�1.1 benchmark is one of the standard were between 85 to 90 percent. However, the benchmarks for PostScript performance qualifica­ D-stream hit rates were between 35 to 45 percent, tion. It is a simpk: 41-page document that contains with writes accounting for 60 to 90 percent of the st.:veral different fo nts. The benchmark is designed total D-stream misses. To achieve the greatest posi­ to characterize the standard text-handling per­ tive effect on the hit rate of the system, enhance­ fo rmance of a printer. This benchmark is printed ment of write-miss performance was the most n.vice to ensure that all characters to be primed advantageous. The two options to improve this per­ have been converted from mathematical outlines formann: were either to implement a write-back to bit-map representations of the characters. Thus cache or to add a write buffe r to the system. Further the focus of the benchmark is to move the text cache simulations showed that a write buffer would data through the system, to copy the character bit provide an 8 to 16 percent overall system perfor­ maps to the 1MB region in memory that contains mance improvement, which was equal to that of a the image to be printed, and to print the image. It write-back cache. The write buffe r, however, was should be noted that this is the only benchmark the more straightforward solution to implement. that printed at engine speed on the PrintScrver 20 Cache analysis revealed that the processors system controller powered by the rtVAX system. required different memory architectures. The CVA.X had an internal lKB, two-way set associative cache. HOUSE A binary image file, the HOUSE benchmark This was to be configured as a mixed 1- and D-stream was used to stress the communications aspects of cache. An additionai 32KH ro 64KB, two-cycle write­ the PrintServcr system. through cache was to be added external ly. This would also be configured as a mixed I- and D-stream SCHEll'! The SCHEM benchmark was a vector repre­ cache. A singl.e-longword, two-cycle write buffe r sentation of a logic schematic. This benchmark was would provide enough buffe ring to reduce the used to stress the PostScript interpreter's ability to dramatic impact of write misses. The soc was interpret nonnative PostScript code and to exhibit proposed to have an internal write-back cache the characteristics of drawing vectors. between 5KB and 8KB, with each lKB region mak­ ing up a single set. Cache simulations indicated Analysis Results that with a minimum internalmixed 1- and 0-stream The thrust of the analysis was to provide credible cache of 5KB, five-way set associative, an external evidence to support architectural and implemen­ data cache would have to be over 64KB to have even tation trade-offs. The major areas of fo cus were a negligible effect on ovnall system performance. Therefore no external cache was recommended. To • Memory system organization mitigate the write-miss penalty, a two-cycle write

• Printer intnface performance buffe r of 4 to 6 longwords was recommended. As an acceleration technique, the original • Main memory bandwidth PrintServer 20 controller contained a memory

• Overall system performance access capability that allowed data written to mem­ ory to be logical.ly ORed with data that was already Memory Sy ste-m Orga niza tion The statistical anal­ stored. This technique was particularly useful when ysis of the trace information provided many clues the software system was wri ting the image that was to direct our investigation toward the optimum ultimately printed. As part of the process of gener­ mcmorv system architecture. The overall read-to­ ating an image to print, the individual characters write ratio for the observed benchmarks ranged appearing on a page must be copied from a region from as low as 4.5: 1 up to '5.'5:1, which me:1ns for of memory ca lled the fo nt cache to another region a write-through cache system with a theoretical called the frame buffe r. The frame buffe r contains 100 percent read hit rate, the writes would degrade the actual data that is sent to the print engine.

64 Vo l. 3 No. 4 Fa ll I')<)I Digital Te chnical ]o urual Design of the Tu rbo PrintServer 20 Controller

To complicate things, the data written to the frame Overall System Performance The execution char­ buffer must be able to overlay data that may already acteristics of the original PrintServer 20 provided be there, thus requiring a logical OR function. some interesting surprises. Most floating-point When a document was printing at or near the calculations were performed in double precision; maximum engine speed of 20 pages per minute, and even more interesting, for each floating-point analysis showed this low-level copying fu nction operation, there was a floating-point conversion consumed approximately 20 percent of the total from single to double precision, and then back system time allotted to generate and print one page. again. Since the precise operations were not Thus a logical OR fu nction in the memory system required, a simple compiler switch removed the would reduce the number of memory data cycles conversions and provided a 3 percent overall sys­ from "2 reads 1 write" to "1 read 1 write," and tem performance improvement for floating-point­ reduce the impact from a second read occupying a intensive PostScript documents. A second surprise useful cache location. Without this capability, the came from the results of the BM3 benchmark, degradation would be between 5 and 10 percent of which indicated a translation buffe r hit rate of overall system performance when printing at or 85 percent. At the time of the discovery, the near 20 pages per minute. Therefore memory capa­ PrintServer 20 was configured with a standard bility with a logical OR function was recommended. MicroVA..'( processor; however, by substituting an rtVAX, which uses one less memory access to refer­ Printer interfa ce Performance Whena PrintServer ence its page tables, an 11 percent system per­ 20 is printing, every page that ex its the printer formance improvement was achieved. With this requires the 1MB frame buffe r to be copied from improvement, the rtVA.c'C processor provided memory to the print engine interface. Changing a enough power to al low the original PrintServer 20 program-controlled printer interface to one driven to ship with its 20-page-per-minute designation. by a DMA device provided two significant advan­ This information led the turbo controller designers tages. The first was to reduce the real-time require­ to determine that the translation buffer of the SOC ments on the PrintServer software system, and the would be large enough for all the entries required. second was to allow for a limitecl degree of paral­ lelism on the controller. The parallelism was due to Results the ability of the processor to continue to execute The finalanalysis revealed that the expected perfor­ from its cache memory system while the DMA mance of a CVAX or SOC processor would place device accessed memory.The processor only stops either design on the low side of the performance executing when a cache miss occurs. requirement. Therefore close attention to detail would be required during the implementation Main Memory Bandwidth With a CVAX processor phase of the project as every ounce of performance configuredas recommended in the section Memory mattered. The expectation was to have a choice System Organization, the main memory system between an SOC processor with a 40-ns cycle time bandwidth requirement of the processor was and a CVAX processor with a 60-ns cycle time. The 60 percent. For the soc, it was 70 percent when performance improvements of the two processors an existing DRAl\1 controller was used. A Dl\'lA­ are compared in Ta ble 1. driven printer interface required 15 percent, and an Ethernet interface required nominally 4 percent with bursts up to 20 percent. Each subsystem was Ta ble 1 Performance Improvement Relative scrutinized to reduce its required memory band­ to Original PrintServer 20 Controller width. The resulting recommendation was to add a 32-bit bus to the memory subsystem to provide a soc CVAX Benchmark Processor Processor dedicated channel for all data being sent to the printer interface. This provision would reduce 8M1 4.7 3.7 required memory bandwidth for the printer inter­ 8M2 4.9 4.0 15 face from percent to about 7 percent. The sys­ 8M3 4.3 3.3 tem would then have a nominal memo1-y bandwidth HOUSE 4.9 4.2 requirement of 71 percent for a CVAX system and SCHEM 4.7 3.7 81 percent fo r an soc.

Digital Te cbtticaljournal Vo l. 3 No. 4 Fa /1 1')')1 65 Image Processing, Video Te rminals, and Printer Te chnologies

As the project schedule progressed, the risk asso­ In each case existing chips satisfied some of the ciated with the new soc processor decreased. As requirements for the subsystem. In the end these this risk window collapsed, it was understood that chips met all our requirements, but only because a turbo controller based on the soc processor tht.:y wt.:reused in ways not originally intt.:ndcd by would not only perform better, but would also cost the chip designers. less as it would not require an external cache. Ma in Me mory Tu rbo Controller Hardware Design Since the SOC has a bus interface that is compati­ The turbo controller was destined for a relatively ble with the CVAX chip, the most obvious chip to high-end printer. Therefore the hardware architec­ use as a main memory control ler was the CVAX ture had to provide maximum performance, even memory controller (CMCTL) chip.1 It responds to all though this implementation would increase costs. bus cycles generated by the soc, anct since it was Based on the results obtained during RETrACE analy­ already used on a munber of platforms supported sis, the hardware design had the fo llowing imple­ by the VAXELN operating system, its use greatly mentation goals: simplified porting VAXELN to the turbo controller. • The SOC would provide the CPU, the floating­ However, the turbo controller requires two special point accelerator (FPA), and the cache subsystem. memory modes that are not provided directly by No second-level cache would be implemented. the CMCTL, namely OR mode and scan-erase mode. It was essential to devise a way to include these • A four- to six-entry write buffer would be implemented. two modes if the CMCrL were to be used. OR-mode memory is a technique used to improve • The transfer of bit-map data to the print engine performance during the writing of the page bit would require a 32 -bit DMA subsystem with scan­ map into memory (scan conversion). During nor­ erase capability. mal memory operation (called replace mode), the

• The memory subsystem would support OR-mode destination operand in memory is replaced by the memory access by the CPU and scan-erase access source operand. During an OR-mode write cycle, by the DMAcontrol ler. the destination operand is modifiedas fo llows:

Although both the SOC and rtVAX chips comply • For each logicaJ zero in the source data being with the VAX architecture standard and both are written, the corresponding destination bit in conceptually very similar, they have significantdif ­ memory remains unchanged . fe rences in the bus interface. For example, the soc uses a quadword cycle (one 32-bit address fol­ • For each logicaJ one in the source data being lowed by two 32-bit data reads) to fill one internal written, the corresponding destination bi t in cache block, while the rtV�'\ processor, which does memory is written with the corresponding bit in not support caching, does not usc this type of the pattern register. cycle. AJso, the clocking system on the SOC was • The pattern register is a 32-bit register which enhanced, and the timing relationships between determines the "color" pattern of the "ink" being signals were modified to improve performance. written on the pagt.:. The changes to the SOC bus in terfact:, plus the required functional changes revealed by RETrACE Figure 2 shows a portion of the logic between analysis, meant that very little of the original the CMCTL and the memory array that implements PrintServer 20 controller design could be applied the OR-mode fu nction in hardware. The OR-mode to the new controller. One of the first questions to operation is accomplished by inverting the source be answered before the design of the turbo con­ data and connecting it to 32 independent write troller could begin, was whether or not one or enables of the memory array. When a zero is writ­ more AS!Cs wou ld be required for the design. This ten, it is inverted and the write cycle for that bit question had to be answered for three subsystems: becomes a read cycle, thus preventing any change to the memory contents. When a one is written, it • Main memory is inverted and the write is aiJowed to occur, but • Write buffe r the data actually written depends on the value pre­

• Bit-map data transfer subsystem viously written into the pattern register.

66 vbl. 3 No. 4 Fa ii i')'JI Digital Te ciJIIicaljournal Design of the Tu rbo PrintServer 20 Controller

MEMORY DATA BUS FROM CMCTL 32 Three operations occur durin g a single scan­ erase cycle. Refer to the circuit drawing in Figure 3.

1. The bus master asserts the signal's "bit-map 32 DATA DATA load" and "bit-map erase" and requests a masked IN OUT � I-- D � [> write. The C1viCTL performs a re ad, and the bit map is read ontO the memory data bus. PATTERN READ REGISTER DATA PATH 2. Bit-map data is automatically transferred from the memory data bus intO the FIFO buffe r. 3. The CMCTL performs a write. However, since the bit-map erase signal has disabled the data WRITE '----- ENABLE path and the pull-down resistors have set the [>� data-in lines to all zeros, the write cycle, which was intended by the designers of the chip as a OR-MODE MAIN MEMORY masked write, has in fact become a memory WRITE ARRAY DATA PATH clear operation.

Wr ite Buffer Figure 2 OR-mode Circuit The LR3220 chip was chosen as the base for the write buffe r subsystem. It provides a six-entry FIFO Two featu res of the CMCTL chip make it possible buffe r for address, data, and byte mask and detects to implement OR-mode memory. First, its 64MB whether the processor has requested a read at a address space is divided into 4 arrays of 4 banks memory location for which a write is stil l pending. (16 banks total). Second, the CMCTL chip can selec­ It also supports two operating modes: LR3000 tively disable parity checking on an array. mode and Harvard mode. The large address space of the CMCTL allows the If it were not fo r the Harvard-mode fe ature, it use of 2 arrays for replace mode and 2 arrays fo r would have been more difficult to inclucle the OR mode, since the turbo controller supports up to LR3220 chip into the turbo controller. The LR3000 32MB of memory. The control signals of the two processor, for which this chip was designed, has sets of arrays are combined such that OR mode and staggered address timing. Some of the address and replace mode access the same physical memory, byte-mask bits are asserted on the fall ing edge of though in diffe rent ways. Parity error detection the clock, and the remaining bits are asserted on is disabled on the OR-mode arrays; thus a read­ the rising edge of the clock. When the LR3220 chip through OR-mode address space cannot cause a par­ is configured in LR3000 mode, the processor sub­ ity error. This is necessary because OR-mode write system must meet these timing requirements. cycles may corrupt parity. Normally any bit map However, when the LR3220 chip is configured in created using OR-mode write cycles is read using Harvard mode, all address, data, and byte-mask OR-mode read cycles. information is read at the same rising clock edge. The other special mode required for the main The basic strategy for including the write buffer memory system is called scan-erase mode. It is an into the turbo controller was to insert the write buf­ operating mode designed to improve bus utiliza­ fe r between the soc ancl the rest of the system as tion during the transfer of the bit map from main shown in Figure 4. The SOC would issue read and memory to a FIFO buffer connected to the printer write requests to the write buffe r, ancl the write data lines. This mode is made possible by a side buffe r would issue read and write requests to the effect of the error-correcting code (ECC)/parity rest of the system. During CPU cycles the soc and generation logic in the CMCTL. Any time a masked the wri te buffe r have a master-slave re lationship write occurs (any write other than an aligned long­ in which the SOC is the master. The relationship word , such as a byte write), the destination long­ between the write buffe r and the rest of the system word must first be read by the CMCTL, then is also a master-slave relationship; however. the combined with the bytes tO be written in order write buffe r is the master. In fact, the write-bu ffer to generate the parity or ECC check bits for that output interface must look almost identical to the longword. soc.

Digital Te chnicaljoumal Vo l. 3 No. 4 Pa ll 1991 67 Image Processing, Video Te rminals, and Printer Te chnologies

FIFO BUFFER

DATA TO [[IlJ PRINT ENGINE BIT-MAP LOAD I MEMORY DATA BUS FROM CMCTL 32

REPLACE-MODE WRITE DATA PATH

32 DATA DATA 32 '-- [> i IN OUT r-+- [> '-- READ BIT-MAP ERASE I DATA PATH

WRITE ENABLE FROM CMCTL 1 COMMON ---.::.__c__::_c__::_c__::______-f� WRITE ENABLE

MAIN MEMORY ARRAY

Figure 3 Scan-erase Circuit

The structure of the write-buffe r subsystem is entries in the LR3220 chip have data, the bus cycle shown in Figure 5. The bus interface unit responds generator (BCG) removes the next entry ancl issues to read or write requests from the soc. During a write request to the appropriate subsystem. write cycles, the bus interface writes the data into The write-buffe r subsystem allows the SOC to the LR3220 chip and immediately alerts the SOC to "read around" the write buffe r, provided rhe address terminate the cycle quickly. Whenever om� or more being read does not have a pending write in the

TO OTHER SYSTEM WRITE-BUFFER SUBSYSTEMS ON A CHIP 1--- SUBSYSTEM

CVAX UNIVERSAL MEMORY DIRECT MEMORY CONTROLLER ACCESS (CMCTL) CONTROLLER

I I PRINTER MAIN DATA FIFO MEMORY BUFFER ARRAY

TO PRINTI ENGINE

Figure 4 Interconnection of Tu rbo ControllerSubs ystems

68 Vo l. 3 No. 4 l'i:t /1 1')91 Digital Te ch11icaljo urt�al Design of the Tu rbo PrintServer 20 Controller

BUS CYCLE BUS LOCAL GENERATOR SYSTEM CONTROL SOC CONTROL INTERFACE CONTROL SIGNALS AND UNIT ARBITRATOR I-- r-- r-- ADDRESS IN r-I>

....._ r---. LATCH L_ ADDRESS OUT SYSTEM DATA/ADDRESS SOC DATA/ADDRESS DATA IN DATA OUT SOC BYTE MASK v MULTIPLEXER LR3220 WRITE BUFFER SYSTEM BYTE MASK

Fig ure 5 Write Buffe r

LR3220. To handle this, the BCG includes an arbitra­ Bit-map Tra nsfe r Subsystem tion circuit. When the SOC requests a read cycle, The bit-map transfer subsystem transfers bit-map the bus interface unit of the write buffe r passes data, created by the PostScript interpreter, to the the request to the BCG. The BCG responds once print engine. It is composed of the 32 -bit DMA con­ it has completed any write cycle currently in troller, a FIFO subsystem, and scan-erase logic in progress, provided that the address to be read does main memory as described in the section Main not have a pending write in the write buffer. When Memory. the slave device being read acknowledges the BCG, The main requ irements for the 32 -bit DMA con­ the acknowledgment is passed back to the bus troller were interface ancl finally to the soc to terminate the • 32 MB address range cycle. The BCG then resumes its task of removing entries from the LR3220 chip and issuing writes to • Ability to transfer 32 bits at a time the rest of the system. • Ability to transfer the frame buffe r fo rward In order to maintain data coherency, the write­ (incrementing the source address) or backward buffe r subsystem enforces some additional (decrementing the source address) protocols. None of the available DMA controller chips met • All writes to any location other than main all our requirements, but the AMD 9516 universal memoryrequire a write-flush cycle; that is, the DMA controller (UDC) met some of them. The uoc bus interface must wait until the LR3220 chip is a 16-bit DMA controller with a 16MB address is empty before writing the data to it. Further­ range and the ability to increment or decrement the more, the bus interface must wait until the BCG source address. There were two drawbacks to the has finished the cycle before it acknowledges the use of this chip. The software would have to ensure SOC and allows it to perform the next cycle. that the frame buffer was always within the lower 16MB of memory, and the UDC would use twice as • All reads to any location other than main mem­ much bus bandwidth since it could transfer only ory require a read tl ush, which has the same 16 bits at a time. restrictions as a write flush. These restrictions It was proposed that the UDC could be used as a are required to avoid the possibility of reading fu ll 32-bit DMA controller ifit was connected to the around a pending 110 space write, which often bus "incorrectly" by shifting the data/address lines has side effects to other addresses. to the left by one bit. That is, data/address line 0 on • The write-buffe r subsystem must pass all DMA the UDC would be connected to data/address line 1 bus transactions to the soc to ensure that all on the bus; data/address line 1 of the UDC would be cached memory locations that are modified by connected to data/address line 2 on the bus; etc. DMA cycles have their corresponding cache This type of connection doubles the address range entry invalidated. of the chip and causes the source address on the

Digital Te chnical journal 14">1. 3 Nu. 4 Fa /1 1991 69 Image Processing, Video Te rminals, and Printer Te chnologies

bus to increment by 4 bytes (32 bits) instead of Ta ble 3 Benchmark File Attributes 2 bytes (16 bits). This decision had a few impl<.:mcntation impacts. File Name General Attributes of File

For example, the register definitions were now BM1.PS Contains 39 pages of text with incorrect, since all the bits in all the registers were 13 fonts of various sizes. Some shifted one bit to the left. However, once the soft­ text strings are at varying angles. ware was modified to compensate for this, the UDC BM2.PS Contains 1 page of spiral text functioned properly as a 32-bit DMA controller. of various point sizes. When combined with the scan-erase fe ature of BM3.PS Contains 41 pages of text with 5 fonts. main memory, it allowed us to achieve our bit-map transfer goal of reading 32 hits from memory, load­ HOUSE.PS Contains a 1-page bitonal image of 3000 blocks (DECnet limited). ing it into the FII'O subsystem, and clearing the SCHEM.PS memory location, all in a single DMAcycle. Contains a 65-page schematic of graphics (vectors) and text. Pe rformance In this section, the performance of the original Ma th Op erators Performance of the PrintServer 20 is compared to the enhanced perfor­ Po stScriptjo b mance of the turbo PrintServer 20. Figure 6 illustrates the controllers' performance Except for performance, the original PrintServer results in math operations per second. The test 20 and the turbo PrintServer 20 have identical func­ determines the time needed to perform 50,000 tional capabilities. Ta ble 2 lists the five fu nctional primitive math operators (e.g., adding two mJm­ subsets that were characterized for performance bers 50,000 times) during a PostScript test docu­ on both printers. The firstfo ur fu nctional subsets ment. The real-time operator reads the current were rated using the PostScript real-time operator; time, and the repeat construct repeats the math they measure the elapsed CPU time needed to operator. This test measures the performance of complete a test. The last functional subset was the CPU only. rated according to the rate of pages exiting the printer. The term "DECnet/DPS" refers to the 14000 DECnet job (a job is one of several multiprocessing 0 z tasks running on the controller) and the '"dis­ 8 12000 tributed PrintServer software'' job. The term w (/) 10000 "printer system" refers to the complete printer a: w Cl.. system, including the PostScript job and the print­ 8000 z(/) ing overhead jobs. The printer system was rated 0 6000 according to the rate of pages exiting the printer. � II Table 3 reports the general attributes of the five � 4000 0 files that were run with the RETrACE system and I 200 chara<:terizedfo r performance. � : I I _[ I I I I ADD DIV MUL SORT COS EXP LOG

Ta ble 2 Functional Subsets of the Printers KEY:

0 ORIGINAL CONTROLLER a ac iza i n Functional Subsets Ch r ter t o • TURBO CONTROLLER TURBO = 6.7 x ORIGINAL CONTROLLER PostScript job Math operations per second PostScript job Te xt: characters per second Figure 6 PostScriptjo b Performance with Math PostScript job Graphics: vector inches per second

DECnet/DPS jobs DECnet/DPS: kilobytes per Te xt Performance of the Po stScriptjo b second Figure 7 compares the text performance of the Printer system Image printing: square PostScript job on the original controller and the inches per second excluding turbo controller. The test determines bow long it DECnet/DPS takes the PostScript job to compose 250,000 equally

70 14>1. 3 No. 4 Fa ll 1991 Digital Te chnical jo urnal Design of the Tu rbo PrintServer 20 Controller

20000 Image Performance The image test characterized the complete printer 0 z system, including the PostScript job and the print­ 0 15000 () w ing overhead jobs, but excluding the DECnet/ DPS (f) a: time required to transfer an image file to a printer. w (l_ 10000 Three one-square-inch bitonal images at device (f) a: resolution were placed into the user dictionary w f- () and were used repeated ly during the performance <>: a: 5000 measurement. The result of using these precached <>: I () images was to eliminate the DECnet and DPS soft­ ware time that would be required to transfer a full­ 0 I I I I 2 4 I, _I, 10 12 page image from a host to the printer. Performance FONT POINT SIZE was measured by printing 10 pages of 80 square KEY: inches of image per page.

0 ORIGINAL CONTROLLER The pages were printed landscape and portrait • TURBO CONTROLLER to measure the image performance both on axis and TURBO = 5.2 ORIGINAL CONTROLLER x off axis. (On axis means that the printer sequen­

Figure 7 PostScriptjo b Performance with Text tially prints all bits of a word from the image on a single scan line. Off axis by 90 degrees means that sized characters to the page buffer in memory, the printer prints one bit from each word and does which eventually is sent to the print engine to be not print the next bit in the word until it is at printed. the same position on the next scan line.) Figure 9 shows the results of the image performance test in Graphics Performance of PostScriptjo b square inches per second. An important means of characterizing graphics per­ formance is in vector inches per second. Figure 8 22.0 shows the results obtained by running a PostScript 0 20.0 z vector program in which all vectors are at 0 18.0 () 45 degrees and vector lengths are from 0.1 inch to w 16.0 (f) a: 14.0 3 inches. w (l_ 12.0 (f) w 10.0 ()I 3000 ?; 8.0 w 6.0 0 a: 25oo 15 <>: 4.0 () ::::J a w (f) 2.0 � 2000 0 w (l_ � 1500 KEY: ()I ?; 1000 0 ORIGINAL CONTROLLER a: • TURBO CONTROLLER 0 TURBO = 3.0 x ORIGINAL CONTROLLER ()f- 500 w > Figure 9 Image Performance Measurement of the Printing Sy stem

KEY:

0 ORIGINAL CONTROLLER DECnet/DPS jo bs Performance • TURBO CONTROLLER DECnet/DPS transfer rates can be ignored for text 4.6 TURBO = x ORIGINAL CONTROLLER and graphics files, but these rates can consume most of the time needed to print large image files. Figure 8 PostScriptjob Performance For example, a single, letter-size page of image with Graphics contains more than 1MB of image data, but the

Digital Te chnical journal Vo l. j No. 4 Fa ll 1991 71 Image Processing, Video Te rminals, and Printer Te chnologies

corresponding PostScript file contains morT than to fileprinted and by the amount of CPU time used 2MB. Because the image data is represented in Amer­ to print the job. For example, in the B.\1 :) bench­ ican standard code for information interchange mark, the speed is limited by the 20-page­ (ASCII) hexadecimal characters in PostScript, S bits per-minute print engine, but the CPU time needed of the PostScript file are needed to represent 4 hits to print the file can be used as a performance of image data. measurement. 'I(> measure DECnct/DPS, a PostScript file of I MR ofcomments was sent to the printer. The clock was Summary started ·when the beginning of the file was tTccived The turbo controller enhanced the performance of by the PostScript interpreter and stopped when the the PrintScrver 20 printer system. Its design was end of the file was received. The assumption of this prompted by the need to maintain print speed test method was that the PostScript interpreter can performance for complex documents containing parse comment I ines much faster than DECnet/ DPS text, graphics, and images. The RETrACE system was can transferthem. designed to analyze the PrintServer 20 system to The DECnet/DPS transferrate is basically propor­ determine which architectural changes would pro­ tional to the slower of the host and printer proces­ vide the greatest improvement in PostScript perfor­ sors. Figure 10 shows the DECnct/DPS results. mance. By optimizing hardware only in areas where it was truly worthwhile, we were able to use exist­ RETrA CE Benchmark Files ing chips and reduce development costs. The sub­ The benchmark files listed in Ta ble 4 arc charac­ systems of the turbo controller hardware that terized both by the elapsed time from file arrival were optimized as a result of this analysis were the processor (SOC which provided CPU, fl oating­ point accelerator, and cache subsystem), a memory 90 subsystem with OR-mode and scan-erase access, 80 a write-buffe r subsystem, and a 32-bit DMA sub­ 0 z 70 0 system. Results of the performance tests for five ()w 60 benchmarks, including PostScript jobs, indicate the (/) cc levels of enhanced performance. w 50 a_ (/) 40 w f-- Acknawledgments >- 30 C)0 Chris Mayer designed and implemented the ...J 20 52 RETrACE multilevel cache simulator. He developed 10 a ticketing algorithm that simplified the manage­ 0 ment of delayed behavior memory constructs such

KEY: as write buffe rs.

0 ORIGINAL CONTROLLER • TURBO CONTROLLER Refe rence TURBO = 4.9 " ORIGINAL CONTROLLER 1. D. K. Morgan, "The CVAX CMCTL-A CMOS Memory Controller Chip," Digital Te chnical Figure 10 DECnet/DPS}o bs Perjonnance jo urnal, vol. 1, no. 7 (August 1988): 139-143.

Ta ble 4 Benchmark Files Characterized by Elapsed Time and CPU Time (Seconds)

Benchmark Original Tu rbo Original Tu rbo delta delta File CPU CPU Elapsed Elapsed CPU Elapsed

BM1 7585 1707 7735 2050 4.4 3.8 8M2 238 51 241 51 4.7 4.7

8M3 56 15 128 120 3.7 1.1 * HOUSE 67 15 106 31 4.5 3.4 SCHEM 2802 625 3073 675 4.5 4.6 "Limited by engine.

3 Digital Te chnical journal 72 Vo l. No. i Fu ll 1991 I Further Readings

Th e Digital Te chnical Journal DECwindows Program publishes papers that explore Vo l. 2, No. 3. Summer 1990 the technological fo unda tions An overview and descriptions of the enhance­ of Digital's majo r products. Each ments Digital's engineers have made to MIT's X Window System in such areas as the server, tool­ Journalfixuses on at least one product area and presents a kit, interface language, and graphics, as we ll as cornpilation of papers written contributions made to related industry standards by the engineers who developed VAX 6000 Model 400 System the product. The contentfo r Vo l. 2, No. 2, Sp ring 1990 the Journal is selected by the The highly expandable and con.figurablemidr ange jo urnalAdv isory Board. family of VAX systems that includes a vector proc­ Digital engineers who would essor, a high-performance scalar processor, and like to contribute a paper advances in chip design and physical technology to the Journalshou/. d contact the editor at RDVAX.::BLAKE. Compound Document Architecture Vo l. 2, No. 1, Wi nter 1990 To pics covered in previous issues of the Digital The CDA family of architectures and services that Te chnical jo urnal are as fo llows: support the creation, interchange, and processing of compound documents in a heterogeneous net­ Availability in VA Xcluster Systems/ work environment Network Performance and Adapters Distributed Systems Vo l. 3, No. 3, Summer 1991 Vo l. 1, No. 9,june 1989 Discussions of VMS volume shadowing, VAXcluster Products that allow system resource sharing application design, and new availability featuresof throughout a network, the methods and tools local area VAXcluster systems, together with details to evaluate product and system performance of high-performance Ethernet and FDDI adapters, and an analysis of FDDI LAN performance Storage Technology Vo l. 1, No. 8, February 1989 Fiber Distributed Data Interface Engineering technologies used in the design, man­ Vo l. 3, No. 2, Sp ring 1991 ufacture, and maintenance of Digital's storage and FDDI IAN The system and Digital's products that informationmanagement products support this technology, with an overview and papers on the physical and data link layers, CVAX-based Systems Common Node Software, bridge and concentrator Vo l. 1, No. 7, August 1988 devices and related management software, and an CVAX chip set design and multiprocessing archi­ ULTR.IX network adapter tecture of the midrange VA.,'\ 6200 family of systems and the MicroVA.,'< 3500/3600 systems Transaction Processing, Databases, and Software Productivity Tools Fault-tolerant Systems 6, Vo l. 3, No. 1, Winter 1991 Vo l. 1, No. Fe bruary 1988 To ols that assist programmers in the development The architecture and products of Digital's dis­ tributed transaction processing systems, with of high-quality, rel iable software information on monitors, performance measure­ VA Xcluster Systems ment, system sizing, database availability, commit Vo l. 1, No. 5, September 1987 processing, and fault tolerance System communication architecture, design and implementation of a distributed lock manager, and VAX 9000 Series performance measurements Vo l. 2, No. 4, Fa ll 1990 The technologies ami processes used to build VAX 8800 Family Digital's firstmain frame computer, including Vo l. 1, No . 4, February 1987 papers on the architecture, microarchitecture, The microarchitecture, internalboxes, VAXBI bus, chip set, vector processor, and power system, and VMS support for the VAX 8800 high-end multi­ as well as CAD and test methodologies processor, simulation, ancl CAD methodology

Digital Te cbnical ]ou mal Vo l. 3 No. 4 Fa ll /')91 73 Further Readings

Networking Products J Delahunty and T. Kielty, "Automated Pareto Vo l. 1, No. 3, September 1986 Analysis for Continuously Improving a VLSI The Digital Network Architecture (DNA), network Fabrication Area's Process Stability," Advanced performance, LANbridge 100, DECnet-liLTRIX and Semiconductor A1anufacturing Co nference DECnet-DOS, monitor design (September 1990).

MicroVAX II System S. Dell, "Promoting Equality of the Sexes through lf)/. 1, No. 2, Ma rch 1986 Technical Writing," Society.fo r Technical Commu­ The implementation of the microprocessor and nication (August 1990). floating point chips, CAD suite, MicroVAX work­ B. Doyll' and K. Mistry, "A Lifetime Prediction station, disk controllers, and TK';O tape drive Method for Hot-carrier Degradation in Surface­ VA X 8600 Processor channel P-MOS Devices," IEI:.'F [i-ansactions on Vo l. 1, No. 1, August 1985 Electron Devices (May 1990). The system design with pipelinecl architecture, the 1-box, F-box, packaging considerations, signal E. Freedman and Z. Cvctanovic, "Efficient integrity, and design for reliability Decomposition and Performance of Parallel POE, FFT, Monte Carlo Simulations Simplex Subscriptions to the Digital 1'echnical.fournal are ami Sparse Solvers," IEEE Supercomputing '90 available on a yearly, prepaid basis. The subscrip­ (November 1990). tion rate is 540.00 per year (four issues). Requests should be sent to Cathy Phillips, Digital Equipment A. Garde! and P Deosthali, "Hub-centered Pro­ Corporation, MLOI-3/B68, 146 Main Street, Maynard, duction Control of Wafer Fabrication," Advanced MA 01754, U.S.A. Subscriptions must be paid in U.S. Semiconductor Manufacturing Conference dollars, and checks should be made payable to (September 1990). Digital Equipment Corporation. A. Hartzell, "Introduction of Argon as a Heat Single copies and past issues of the Digital Transfer Gas in a Single Wa fe r RJESyste m," Te chnical.fournalcan be ordered from Digital Jntenwtional Sy mposium fo r Te sting and Press at a cost of $16.00 per copy. Fa ilure Analysis (October 1990).

Technical Papers by Digital Authors A. Heyman and J Thottuvelil, "Linear Averaged and Sampled Data Models fo r Large Signal Control of R. Al-Jarr, ''A Methodology fo r Evaluating Decision High Power Factor AC-DC Converters," IEEE Po wer Making Architectmes for Automated Manufactur­ Electronics Sp ecialists Oune 1990). ing Systems,'' Eleventh JF-A C Conference (August 1990). L. Hill, "Video Signal Analysis for EMI Control," IEEE Electromag '91 (1991). S. Angebranndt, R. Hyde, D Luong, and N. Siravara, "Integrating Audio and Te l<.:phony in a Distributed L. Hill and A. Metsler, "Video Subsystem Design EMI IEEE Wo rkstation Environment," Proceedings of the for Control," Electrmnag '91 (1991). Summer 1991 T!SLNIX Conference Oune 1991). S. Kasturi, "Forced Convection: The Key to the Ve rsatile Reflow Process," NEPCON East '90 S. Batra, M. Mallary, and A. To rabi, "Frequency Oune 1990) Response of Thin-lilm Heads with Longitudinal and Transverse Anisotropy," !J:'I:'f:' lntermag '90 D. Mirchandani and P Biswas, "Characterization (April 1990). and Modeling Ethernet Performance of Distributed DECwinclows Applications," R. Csencsits, N. Riel, ]. Dion and S. Arsenault, ACM Sigmetrics (May 1990) "Interfacial Structure and Adhesion of Metal-on­ polyamide," InternationalSy mposium fo r Te sting W Metz, "Automated On-line Optimization of an and Fa ilure Analysis (October 1990). Epitaxial Process," In ternational Semiconductor Ma nufacturing Science Sy mposium (May 1990). R. Csencsits, J. Rose, R. St. Amand, L. Ell iott, A. Hartzell, L. Kisselgof, and]. Lloyd, "Aluminum K. Mistry, B. Doyle, and D. Krakauer, "Imract of Interconnect Microstructure and Its Role in Snapback Induced Hole Injection on Gate Oxide Electromigration," Jntentational Sy mjJ Osium Reliability in N-MOSFET's," //.;/:1:.' Electron Device fo r Te sting and Fa ilure Ana�ysis (October 1990). Letters (October 1990).

74 Vo l. .) No. .:j Fa ll I'J'JI Digital Te chnical jo urnal I

C. Pan, "Gas Lubrication," ASME/STLE Tribology VAX/VMS: Writing Real Programs in DCL Conference (October 1990). Paul C. Anagnostopoulos, 1989, softbound, 409 pages, Order No. EY-C168E-DP-EEB ($29.95) A. Philipossian, "Fluid Dynamics Analysis of Thermal Oxidation Systems via Residence Time This book contains information that can help the Distribution (ROT)," Electromechanical Society reader Jearn to write powerful and well-organized (October 1990). programs in DCL, the command language for the VAX/VMS operating system. The text includes a M. Sidman, "Convergence Properties of an review of the syntax and semantics of DCL and a Adaptive Runout Correction System," ASME discussion of significantissues in the development Winter Meeting (November 1990). of serious DCL software. Programming paradigms M. Sidman, "Parametic System Identification are presented, as well as the correct way to on Logarithmic Frequency Response Data," implement them. The book presents good pro­ IEEE Tra nsactions on Automatic Control gramming techniques and helps the student to (September 1991 ). make effective use of the VMS operating system.

D. Skendzic, "Two Transistor Flyback Converter X WINDOW SYSTEM TOOLKIT: Design for EM! Control," IEEE Sy mposium on The Complete Programmer's Guide Electromagnetic Compatibility (August 1990). and Specification A. Smith and W Gol ler, "New Domain Configura­ Paul). Asente and Ralph R. Swick, 1990, softbound, tion in Thin-filmHeads," Intermag '90 (April 1990). 1000 pages, Order No. EY-E757E-DP-EEB ($44.95)

H. Smith and). Beagle, "SIMS for Accurate This book consists of two parts, "Programmer's Process Monitoring in CoSi2-on-Si MOSFET Guide" and "Specification." "Programmer's Guide" Technology," Secondary Jon Mass Sp ectrometry describes how to use the X To olkit to write (September 1989). applications and widgets, and includes many examples. Each chapter in this part contains an ). Thottuvelil, "Using SPICE to Model the Dynamic application writer's section and a widget writer's Behavior of OC-to-DC Converters Employing Mag­ section. Application programmers need to read the netic Amplifiers," IEEE Applied Power Electronics widget writer's sections only ifthey are curious Conference (March 1990). about what is going on behind the scenes; R. Ulichney, "Frequency Analysis of Ordered widget programmers should read both sections. Dither," Hard-copy OutputOE! LASE 89 SPIE '89 "Specification" provides a complete and concise Proceedings (1991). description of every component of the X To olkit Intrinsics, as standardized by the MIT X Consor­ R. Ulichney, "Challenges in Device Independent tium. The level of detail in this part is sufficient Image Rendering," Applied Vision Op tical Society to enable a programmer to create a new imple­ of America Te chnical Digest Series '89 (1991). mentation of the X To olkit. E. Zimran, "Performance EfficientMapping of Applications to Paral lel and Distributed Archi­ PRODUCTION SOFTWARE TIIAT WORKS: tectures," International Conference on Pa rallel A Guide to the Concurrent Development Processing (August 1990). of Realtime Manufacturing Systems john A. Behuniak, Iftikhar Ahmad, and Digital Press Ann l\1.Courtright, 1992, softbound, 204 pages, Digital Press is the book publishing group of Order No. EY-H895E-DP-EEB ($24.95) Digital Equipment Corporation. The Press is an This is a practical guidebook for manufacturing international publisher of computer books and managers and process engineers who must develop journals on new technologies and products fo r better process methodologies to stay competitive users, system and network managers, program­ and for developers of realtime manufacturing mers, and other professionals. Proposals and ideas software who need to cut time and costs from their for books in these and related areas are welcomed. work. The presentation, which provides useful The fo llowing book descriptions represent a advice ancl easy-to-follow procedures, addresses sample of the books available from Digital Press. three basic tasks of realtime software development

Digital Te chnical jo urnal Vo l. 3 No. 4 Fu ll 1991 75 Fu rther Readings

in a manufacturing plant: (1) managing the design VAX Pascal programming language, this is the first of the system; (2) setting up and managing a book to actual ly discuss the construction of real development organization; and (3)implementing VMS applications. It sets forth a methodology for tools for successfu! completion and managem<..:nt. producing high-quality, professional VMS appli­ ca tions by fo cusing on the aspects of the VMS UNIX FOR VMS USERS operating system crucial to every well-written Philip E. Bourne, 1990, softbound, 36R pages, application. Order No. EY-C 177E-DP-EEU (S28.95)

This book emphasizes the practical aspects of THE DIGITAL GUIDE TO SOFTWARE making the transition from the ViviS to the UNIX DEVELOPMENT operating system. Every concept pr<..:sented is The staff of the Corporate User Information (CUIP/ASG), il lustrated with one or more examples, comparing Products Digital Equipment 1989, 239 how to perform a particular task in each of the Corporation, softbound, pages, EY-Cl78E-DP-EEB ($27.95) two operating systems. The book is organized in a Order No. logical order and covers the following topics: fu n­ THI:' DIGITA L GUIDE TO SOF1mlRE DEVELOPNIENT damental concepts to be grasped before touching is the tirst published description of the method­ the keyboard, the firstterminal sessions, the first ology that Digital uses to design and develop its commands, editing, communicating with users, software. For the engineer ancl other professionals resource utilization, using devices, more advanced associated with the creation ami marketing of commands, using high-level languages, program­ software applications, this book gives a rare look at ming the operating system, text processing, and the practices of an industry leader and provides a networking. Appendixes provide extensive cross­ model for others who wish to introduce software reference tables to make this a valuable reference enginet:ring methods and tools into their own UNIX tool for even the experienced user. companies. Also discussed are the use of selected VMS case tools to expedite the process; the roles of LOGISTICAL EXCELLENCE: teams and team leaders; the use of review meetings It's Not Business as Usual and documents; and formal procedures for testing Donald). Bowersox, Pa triciaJ Daugherty, and ma intenance. The guide includes numerous Cornelia L. Drogue, Richard N. Germain, and diagrams and tables, clear gu idelines for the coding 1992, 300 Da le S. Rodgers, pages, and documentation of software modules, a listing EY-1 1953E-DP-EEB Order No. of related VMS documentation, and coding gu ide­ This book focuses on the interpretation of research lines fo r VAX C. findings that have been compiled to help managers who seek to improve logistical competency within DIGITAL GUIDE TO DEVELOPING their organization. It provides a sequential model, INTERNATIONAL SOFTWARE the best practices of "excellent" logistics managers The staff of the Corporate User Information with supportive statistical evidence, and extensive Products (O JIP/A S< i), Digital Equipment coverage of Electronic Data Interchange in the Corporation, 1991, softbound, 381 pages, logistics process. It also includes a brief overview Order No. EY-F577E-DP-EEB ($28 95) of the expanding role that logistics has recently This book introduces the ground-breaking pack­ played in the overall corporate strategy of increas­ aging and design gu idelines recommended by ing speed and qual ity. lb facilitate interest and ease Digital fo r products destined for overseas markets. of reading, an action-oriented case dialogue ru ns Al ready used by more..: than 400 independent soft­ throughout the eight chapters. wa re vendors and development groups, as well as by Digital engineers, this book offersan approach WRITINGVAX/ VMS APPLICATIONS USING that greatly simplifiesthe steps required to adapt PASCAL software to local markets once the parent product Theo de Klerk, 1991, hardbound, 748 pages, has been released. The book fea tures a description Order No. EY-F592E-DP-EEB ($39.9';) of Digital's international product model, a scheme Written for the professional appl ication program­ for separating the core functions of a product from mer on the VAX/VMS operating system using the those that require translation or modificationfo r

76 1-h/. J No. 4 l'ali i'J'JI Digital Te chnical]ou.-nal I specificmarkets. Al so included are guidelines for Quality. The Epilogue, Part IV, concludes with developers working in DECWindows,VMS, and three appendices detailing Benchmarking, Build­ ULTRIX environments; special considerations ing Networks, and Networking Capabilities. involved in preparing a product fo r multibyte Asian languages or for multilanguage environments; and THE ART OF TECHNICAL DOCUMENTATION appendixes with information on the systems issues Katherine Haramundanis, 1992, softbound, in computer architecture. 267 pages, Order No. EY-H892E-DP-EEB ($28.95)

Written primarily for novice and aspiring technical USING MS-DOS KERMIT:Connecting your writers within the computer industry, Th e Art PC to the Electronic Wo rld, Second Edition of Te chnical Documentation has unique fe atures, Christine M. Gianone, 1991, softbound, including its advice on planning and process, 344 pages with software disk included, research techniques, use of graphics, audience Order No. EY-H893E-DP-EEB ($34.95) analysis, definitionof quality, standards, and As in the firstedition, this software package leads careers that are valuable to experienced technical the novice step by step through installation, com­ writers as well. Haramundanis views the practice munication setup, terminal emulation, file transfer, of technical writing as being different from that and script programming, and also serves as a com­ of scientific writing, and closer to investigative plete reference work for the experienced user. reporting. In keeping with this premise, this book Complete with 5Y.-inch diskette containing the is not a style guide that deals with all aspects of official M5-DOS KERMITVe rsion 3.11 program from typography and copy editing, but instead presents Columbia University, this revision includes a new the distilled knowledge of the author's many years section on local area networks, additional material experience. on running Kermit in windowed environments such as and Quarterdeck A COMPREHENSIVE GUIDE TO Rdb/VMS DesqView, a new appendix containing tables of Lilian Hobbs and Kenneth England, 1991, the escape sequences used by Kermit's text and softbound, 352 pages, Order No. EY-H873E-DP-EEB graphics terminal emulators, and expanded ($34.95) descriptions of many of Kermit's features. The Rdb/VMS relational database system was ENTERPRISE NETWORKING: developed by Digital Equipment Corporation for Wo rking Together Apart VAX computers using the VMS operating system. Raymond H. Grenier and George S. Metes, 1991, This system is one of a number of information hardbound, 260 pages, Order No. EY-H878E-DP-EEB management products that work together to ($29.95) facilitate the sharing of information. The Rdb/VMS system is used, for example, in high-performance To successfully compete in the next century, com­ transaction processing systems. This book is based panies must recognize and adapt to exponential on Rdb!VMSVe rsion 4.0, which Digital made avail­ changes, including the dispersion of markets and able to customers at the end of 1990, and thus resources and acceleration in market demands. includes the latest fu nctionality. ENTERPRISE NETWORKJNG: Wo rking Together Apart, describes how management can support DIGITAL GUIDE TO DEVELOPING this distributed electronic information environ­ INTERNATIONAL USER INFORMATION ment and move through planned transitions to Scott Jones, Cynthia Kennelly, Claudia Mueller, a new organization, confident they will prosper. Marcia Sweezey, Bill Thomas, and Lydia Ve lez, 1992, Intended for individuals in charge of directing softbound, 214 pages, Order No. EY-H894E-DP-EEB transition of information-focused groups that ($24.95) extend across geographies, this book is segmented into fo ur parts. The Introduction, Part I, defines Designed for the busy professional, this book the assumptions and realities. Part II focuses on presents models that extend beyond Digital and Capability Based Environments. Part III discusses English speaking countries in a quick read/ Simultaneous Distributed Wo rk, both Goals and reference fo rmat. Nine chapters and four appen­ Processes, ancl Continuous Design and Quest for dices outline methods for creating written, visual,

Digital Te chnicaljottrlla/ Vo l. 3 No. 4 Fa t/ 1991 77 Further Readings

and verbal information for cost-e!Ie ctive trans­ VMS FILE SYSTEM INTERNALS lation. Primarily fo r information specialists, Kirby McCoy, 1990, softbound, 460 pages, including writers, editors, illustrators, course Order No. EY-F575E-0P-EEB ($49.95) developers, and their managers, this book will also VIIIJS FILE SYSTEM INTE RNALS, based on VMS Ve rsion help software developers and students enhance 5.2, is a book for system programmers, software their background in technical communication. specialists, system managers, applications design­ VAX/VMS PRACTICAL KNOWLEDGE ENGINEERING: ers, and other users who need to under­ Creating SuccessfulCommercial Expert stand the interfaces to and the data structures, Systems algorithms, and basic synchronization mechanisms of the v,'viS file system. Th is system is the part of Richard V Kelly, Jr., softbound, 212 pages, Order No. EY-F591 E-OP-EEB ($28.95) the VA X/VMS operating system responsible fo r storing and managing filesand info rmation in This book is a concise guide to practical methods memory and on secondary storage. The book is fo r initiating, designing, building, managing, also intended as a case study of the V.\'fS implemen­ and demonstrating commercial expert systems. tation of a tile system for graduate and advanced It is a front-line report of what works (and what umkrgraduate courses in operating systems. does not) in the construction of expert systems, drawn from the author's decade of experience DECNET PHASE V: An OSI Implementation gained working on such projects in all major James ,vlartin and Joe Leben, 1992, hardbound, areas of application fo r American, European, and 572 pages, Orcler No. EY-H882E-DP-EEB ($49.95) Japanese organizations. It also brieflyreviews This book provides a firstin-depth look at DECnet the knowledge representation, programming, Phase V and the important issues that must be and management techniques commonly used to resolved in the design and implementation of very implement expert systems today, and describes the large networks. It presents key Open Systems In ter­ intellectual, organizational, financial, and manage­ connection (OSI) concepts and shows how DECnet rial issues that knowledge engineers face daily in Phase V hardware and software products imple­ performing their jobs. Among the topics covered ment international standards associated with the are: prospecting for "legitimate" problems; fore­ OSI model. casting costs, establishing project metrics and writing specifications; preparing fo r system VAX/VMS OPERATING SYSTEM CONCEPTS "demos''; interviewing and selecting engineering David Miller, 1991, hardbound, 512 pages, team members; and solving common difficulties Order No. EY-F590E-DP·EEB ($44.95) in design and implementation. This book begins with an overview that centers COMPUTER PROGRAMMING AND on one visible aspect of an operating system, ARCHITECTURE: The VA X, Second Edition terminal input and output; it proceeds into well­ Henry M. Levy and Richard H. Eckhouse,Jr., 1989, organized chapters on process definition, paging hardbound, 444 pages, Order No. EY-6740E-DP-EEB and memory management, security, protection ($3800) and privacy; and it concludes with a chapter on orerating systems at Digital Equipment This book is both a reference for computer profes­ Corporation. Each chapter provides an intro­ sionals and a text for students. A systems approach duction, theoretical discussion, generally recog­ helps the reader understand the issues crucial to nized solutions, algorithms and data structures, the comprehension, design, and use of modern and questions to encourage review of the central computer systems. Using the VA)( computer as an concept presented. example, the first half of the book is a text suitable for a complete course in assembly language pro­ THE VMS USER'S GUIDE gramming. The second half of the book describes James F. Peters, III and Patrick). Holmay, 1990, higher-level systems issues in computer architec­ softbound, 304 pages, Order No. EY-6739E-DP·EEB ture, namely, support for operating systems and ($28.95) operating systems structures, virtual memory, parallel processing, microprogramming, caches, This up-to-elate guide for new VMS users provides and translation buffers. a sequence of steps fo r learning the VMS operating

78 1-b/. 3 No . .; Fa ll 1')91 Digital Te e/mica/ jo urnal I

system and includes hands-on experiments with "X Protocol," " XLib," "X To ol kit Intrinsics," "Motif," step-by-step instructions. The book also can be and "General X"; all routines and data structures used as a reference fo r commands and utilities. are organized alphabetically within each of these THE V1VIS USER'S GUIDE, reflecting VMS Version 5, sections. provides complete VMS coverage-from Jogging in to creating command procedures; contains FIFfH GENERATION MANAGEMENT: a thorough discussion of filesand directories; IntegratingEnterprises through Human covers both the EDT and the EVE editors in detail; Networking 1990, 267 and introduces programming with VA,'

Digital Te chnical jo urttal Vo l. 3 Nu. 4 Fa ll 1991 79 Fu rther Readings

To receive a copy of our latest catalog or further inJormation on these or other publications from Digital Press, please write:

Digital Press Department EEB I Burlington Wo ods Drive Burlington, MA 01803-4539

Or, you can order by call ing DECdirect at 800-DIGITAL (800-344-4825).

When ordering be sure to refer to Catalog Code EEB.

80 Vol. 3 Nu. 4 Fa ll 1991 Digital Te chnical journal ISSN 0898-90IX

Primed in U.S. A. EY-H889E-DP/91 12 02 18.0 DBP/NRO Copyrighr © Digital Equipment Corporation. All Rights Reserved.