Understanding Line Data Formats

BY DAVID STAAL

Since a large amount of production printing is still based on line data and its hybrids, technical analysts in the print industry need to understand these line data formats, especially when trying to integrate mainframe print data with PC-based solutions.

ALTHOUGH laser printers with full graphic capabilities have been available for more than a decade, a large amount of production printing is still based on line data (or hybrids, such as Xerox LCDS). For this reason, technical analysts in the print industry need to thoroughly understand line data formats, especially when trying to integrate mainframe print data with PC-based solutions. As an instructor, I have found that many analysts lack this basic knowledge, largely because it is not available in any simple form.

This article is intended to provide a basic and thorough description of the most common line data formats, including mainframe Fixed Block (FBA or FBM), Variable Block (VBA or VBM), and ASCII line data. This article will also explain the basic varieties of carriage control and provide examples of these formats.

Line data, or line printer data, can be defined simply as data formatted for a printer that prints one line at a time. As printers have evolved, more advanced printers (those that could support fonts and graphics) continued to support the data intended for the older line printers, as a backward compatibility feature. Eventually, the expression “line data” was coined to refer to this older data stream. For a thorough understanding of line data, it is necessary to review the original target of line data, which is, of course, a line printer.

LINE PRINTERS

Line printers, sometimes called impact printers, have the following characteristics:

• print one line or less (one character) at a time
• print text only, no graphics
• use a fixed font because the characters are physically engraved on the print heads or print band. The font is non-proportional, that is, the characters are all the same width.
• use continuous feed paper (with holes on the sides), which can include blank paper, green bar paper and special forms, such as labels and pre-printed checks
• move down the page in only one direction. The paper is moved with a tractor feed and can be advanced, but not backed up.

Line data is data that is formatted for line printers. It generally has the following characteristics:

• is text only
• is stored as records or lines of data
• has some kind of “carriage control” commands that instruct the printer on how to move down the paper, such as Line Feed or Form Feed

THREE ELEMENTS OF LINE DATA

Over the years, and on different platforms, many formats evolved for storing line data. The most common of these formats (e.g., ASCII text, EBCDIC VBA or VBM and EBCDIC FBA or FBM) will be discussed in detail later in this article. To interpret any line data format, it is necessary to identify the following three basic elements that make up the format:

• Record Separation: How are the records separated or delimited?
• Carriage Control: What type of carriage control is used?
• Character Set: Is the data EBCDIC, ASCII or something else?
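One way to keep the three elements straight is to record them as a small data structure. The following Python sketch is only an illustration: the class and field names are hypothetical, and the four entries simply restate the element tables given in this article's figures for FBA, VBM, ASCII, and ASCII with ASA.

```python
from dataclasses import dataclass
from enum import Enum


class RecordSeparation(Enum):
    FIXED_LENGTH = "fixed length records"
    DELIMITERS = "record delimiters"
    LENGTH_FIELD = "record length field"


class CharacterSet(Enum):
    EBCDIC = "EBCDIC"
    ASCII = "ASCII"


@dataclass(frozen=True)
class LineDataFormat:
    """The three elements that identify a line data format (hypothetical names)."""
    name: str
    separation: RecordSeparation
    carriage_control: str
    charset: CharacterSet


# Entries restate the "line data elements" figures in this article.
FORMATS = {
    "FBA": LineDataFormat("FBA", RecordSeparation.FIXED_LENGTH, "ASA", CharacterSet.EBCDIC),
    "VBM": LineDataFormat("VBM", RecordSeparation.LENGTH_FIELD, "3211 Machine", CharacterSet.EBCDIC),
    "ASCII": LineDataFormat("ASCII", RecordSeparation.DELIMITERS, "ASCII CR, LF, FF", CharacterSet.ASCII),
    "ASCII with ASA": LineDataFormat("ASCII with ASA", RecordSeparation.DELIMITERS, "ANSI (ASA)", CharacterSet.ASCII),
}

for fmt in FORMATS.values():
    print(f"{fmt.name}: {fmt.separation.value}, {fmt.carriage_control}, {fmt.charset.value}")
```

Answering these three questions for an unfamiliar file is usually the first step before writing any program to read it.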

TECHNICAL SUPPORT • FEBRUARY 2001 ©2001 Technical Enterprises, Inc. Reproduction of this document without permission is prohibited.

Record Separation

Because line data is fundamentally record-oriented, methods of storing line data must allow for the separating or delimiting of records. This ensures that the program reading the data doesn’t accidentally string together two or more records or break a record into pieces.

The following three basic schemes are used by different line data formats for record separation:

• Fixed Length Records: This is the most basic scheme where you arbitrarily decide on a length, such as 80 or 133¹, and all records in the file are made to be the same length. A program processing a fixed-length file of 133-byte records, for example, can then just read 133 bytes into its buffer at a time and assume that it has read in one record. This scheme is used primarily for mainframes, and the most common format for this is called FBA (Fixed Block with ASA/ANSI carriage control) or FBM (Fixed Block with Machine carriage control).

• Record Delimiters: The record delimiter scheme relies on special characters that are different from normal text to mark the end of a record. The most familiar example of this is ASCII text, which uses the Carriage Return (0Dh), Line Feed (0Ah), and Form Feed (0Ch) as record delimiters. A program processing this data would read one byte at a time, and when it came to one of these special characters, it would assume it had reached the end of the record. One flaw in this approach is that something unusual in the data portion of the record (such as an escape code string) might get confused with a record delimiter.

• Length Byte or Word: The most advanced and flexible scheme for separating records is to use a byte or two (a word) at the beginning of each record to store the length of the following record. This approach provides the benefit of storing variable length records, and allows you to store any type of data, not just text. For example, applications used with Xerox printers can store Xerox Metacode inside of a line data format using this scheme. Two common examples of line data formats using Length Words are VBA (Variable Blocked with ASA carriage control) or VBM (Variable Blocked with Machine carriage control) and Barr S/370 format.

FIGURE 1: CARRIAGE CONTROL

Carriage Control      Machine       Machine      ASA       SCS            ASCII
                      w/out data    with data    (ANSI)
Do not space          03 (no op)    01           +         0D             0D
Space one line        0B            09           (space)   0D 15          0D 0A
Space two lines       13            11           0         0D 15 15
Space three lines     1B            19           -         0D 15 15 15
Skip to Channel 1     8B            89           1         0D 0C          0C
Skip to Channel 2     93            91           2         0D 04 82
Skip to Channel 3     9B            99           3         0D 04 83
Skip to Channel 4     A3            A1           4         0D 04 84
Skip to Channel 5     AB            A9           5         0D 04 85
Skip to Channel 6     B3            B1           6         0D 04 86
Skip to Channel 7     BB            B9           7         0D 04 87
Skip to Channel 8     C3            C1           8         0D 04 88
Skip to Channel 9     CB            C9           9         0D 04 89
Skip to Channel 10    D3            D1           A         0D 04 7A
Skip to Channel 11    DB            D9           B         0D 04 7B
Skip to Channel 12    E3            E1           C         0D 04 7C

FIGURE 2: SAMPLE LINE DATA

Page 1:

A TEST OF CARRIAGE CONTROL.
Single Space.

Double Space.

Page 2:

New Page
A TEST OF CARRIAGE CONTROL.


Triple Space.

Double Space.
Single Space.
Overprint (printed twice on the same line)

FIGURE 3: LINE DATA ELEMENTS FOR FBA

Record Separation    Fixed Length Records
Carriage Control     ASA
Character Set        EBCDIC

FIGURE 4: RECORD STRUCTURE FOR FBA

Record 1      Record 2
C Data        C Data

C = Carriage Control byte

Carriage Control

Line data needs to have some kind of carriage control, which are the instructions to the printer on how to move vertically down the page. At a minimum, any type of line printer carriage control must have the following two basic commands:

• move down a line (called Line Feed, New Line or Space One Line)
• move to top of next page (called Form Feed, Page Break or Skip to Channel 1)

Following are the four most common types of carriage control used in the production printing business:

¹ The number 133 is derived from the old line printers that could print 132 columns wide. The record would store 1 byte of ASA carriage control followed by 132 bytes of text.

• ASCII: used in the PC and Internet arenas
• SCS: used in RJE, 3270, and 5250 print streams
• ASA or ANSI: used on mainframes and AS/400s, as well as adopted for other platforms
• 1403/3211 Machine: native carriage control used on mainframe channel-attached line printers

The table in Figure 1 shows the hexadecimal values for the standard carriage control commands in each type of carriage control. In the case of ASA, it shows the text representation instead, because the hexadecimal value of ASA differs when it is used with ASCII or EBCDIC text.

The “Skip to Channel” or “Skip to Stop” commands are an old line printer standard that instructs the printer to move down the page to a predetermined line on the page. The location of each stop can vary for each print job. For example, on one print job Stop 3 might be defined as line 15 on the page. So, whenever the printer receives the “Skip to Channel 3” command, it moves forward to line 15, even if that means it has to move to the next page to get there. On a different print job, Stop 3 might be defined as line 30 or might not be defined at all.

The Stops and the lines they represent are coded in a Forms Control Buffer (FCB), which is stored in a library or directory on the host computer and sent to the line printer when needed. Incidentally, the FCB also codes two other pieces of information: the lines-per-inch setting (either 6 or 8), and the form length in number of lines-per-form or page.

The important difference between ASA and Machine carriage control is that, although both are based on records, they skip at different times. ASA carriage control is executed before the line of data is written. Machine carriage control is executed after the data is written. These two techniques are referred to as Skip Before and Skip After. For example, if the ASA Skip to Channel 1 command (page break) is at the beginning of a record, the printer will issue a page break and print the record on the next page. In contrast, if the Machine Skip to Channel 1 command is at the beginning of a record, the record will be printed at the current location on the current page, and then the printer will move to the next page. This makes Machine carriage control a little harder to read because it seems to occur out of sequence.

FIGURE 5: HEX DUMP OF FBA

00000000 4E C1 40 E3 C5 E2 E3 40-D6 C6 40 C3 C1 D9 D9 C9 +A TEST OF CARRI
00000010 C1 C7 C5 40 C3 D6 D5 E3-D9 D6 D3 4B 40 40 40 40 AGE CONTROL.
00000020 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000030 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000040 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000050 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000060 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000070 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000080 40 40 40 40 40 40 E2 89-95 87 93 85 40 E2 97 81      _Single Spa
00000090 83 85 4B 40 40 40 40 40-40 40 40 40 40 40 40 40 ce.
000000A0 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
000000B0 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
000000C0 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
000000D0 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
000000E0 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
000000F0 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000100 40 40 40 40 40 40 40 40-40 40 F0 C4 96 A4 82 93           0Doubl
00000110 85 40 E2 97 81 83 85 4B-40 40 40 40 40 40 40 40 e Space.
00000120 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000130 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000140 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000150 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000160 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000170 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000180 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 F1                1
00000190 D5 85 A6 40 D7 81 87 85-40 40 40 40 40 40 40 40 New Page
000001A0 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
000001B0 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
000001C0 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
000001D0 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
000001E0 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
000001F0 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40
00000200 40 40 40 40 40 40 40 40-40 40 40 40 40 40 40 40

FIGURE 6: LINE DATA ELEMENTS FOR VBM

Record Separation    Record Length Field
Carriage Control     3211 Machine
Character Set        EBCDIC (or hidden binary)

FIGURE 7: RECORD STRUCTURE OF VBM

Formatted without blocking:

Record 1          Record 2
RL 00 C Data      RL 00 C Data

RL = Record Length (two bytes)
00 = Nulls (two bytes)
C  = Carriage Control byte

Formatted with blocking:

BDW      Record 1          Record 2
BL 00    RL 00 C Data      RL 00 C Data

BL = Block Length (two bytes)
00 = Nulls (two bytes)
RL = Record Length (two bytes)
00 = Nulls (two bytes)
C  = Carriage Control byte

Character Set (EBCDIC or ASCII)

Technically, a true line data format should contain only text in one of the two basic character sets, EBCDIC or ASCII. In practice, however, line data formats often contain something besides just text, such as LCDS, Metacode, PCL escape codes, or even raw binary data. Most of the cases that include some kind of binary data are only possible when using the Length Word method of record separation.
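As noted earlier, the hexadecimal value of an ASA carriage control character depends on the character set of the file. The short Python sketch below illustrates the point by encoding the basic ASA characters both ways. It assumes that Python's built-in `cp037` codec is an acceptable stand-in for EBCDIC (real files may use a different EBCDIC code page), and the helper name `asa_byte_values` is hypothetical.

```python
# ASA carriage control characters and their meaning (text column of Figure 1).
ASA_COMMANDS = {
    "+": "do not space (overprint)",
    " ": "space one line",
    "0": "space two lines",
    "-": "space three lines",
    "1": "skip to channel 1 (new page)",
}


def asa_byte_values(codec):
    """Return the byte value of each ASA character in the given character set."""
    return {ch: ch.encode(codec)[0] for ch in ASA_COMMANDS}


ascii_values = asa_byte_values("ascii")
ebcdic_values = asa_byte_values("cp037")  # assumption: code page 037 as "EBCDIC"

for ch, meaning in ASA_COMMANDS.items():
    print(f"{ch!r:5} {meaning:30} ASCII {ascii_values[ch]:02X}h  EBCDIC {ebcdic_values[ch]:02X}h")
```

The same ASA "space one line" command is therefore 20h in an ASCII file but 40h in an EBCDIC file, which is why a table of ASA codes is easier to read as text than as hexadecimal.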

There are two basic standards for encoding text characters:

• EBCDIC: used on mainframes and AS/400s
• ASCII: used by the rest of the computing industry

Both EBCDIC and ASCII use a single byte of data to represent a character. In fact, ASCII originally only used the lower 7 bits of the byte. In the days of simple line printers, line data would have only straight EBCDIC or ASCII with almost no variations. Sometimes a printer would have a few special characters, such as the cent sign, which had to be accommodated, but this was easily handled.

FIGURE 8: HEX DUMP OF VBM (WITHOUT BLOCKING)

00000000 00 20 00 00 09 C1 40 E3-C5 E2 E3 40 D6 C6 40 C3 .....A TEST OF C
00000010 C1 D9 D9 C9 C1 C7 C5 40-C3 D6 D5 E3 D9 D6 D3 4B ARRIAGE CONTROL.
00000020 00 12 00 00 11 E2 89 95-87 93 85 40 E2 97 81 83 .....Single Spac
00000030 85 4B 00 12 00 00 89 C4-96 A4 82 93 85 40 E2 97 e.....iDouble Sp
00000040 81 83 85 4B 00 0D 00 00-09 D5 85 A6 40 D7 81 87 ace......New Pag
00000050 85 00 20 00 00 19 C1 40-E3 C5 E2 E3 40 D6 C6 40 e.....A TEST OF
00000060 C3 C1 D9 D9 C9 C1 C7 C5-40 C3 D6 D5 E3 D9 D6 D3 CARRIAGE CONTROL
00000070 4B 00 12 00 00 11 E3 99-89 97 93 85 40 E2 97 81 ......Triple Spa
00000080 83 85 4B 00 12 00 00 09-C4 96 A4 82 93 85 40 E2 ce......Double S
00000090 97 81 83 85 4B 00 12 00-00 09 E2 89 95 87 93 85 pace......Single
000000A0 40 E2 97 81 83 85 4B 00-0E 00 00 01 D6 A5 85 99  Space......Over
000000B0 97 99 89 95 A3 00 0E 00-00 09 D6 A5 85 99 97 99 print.....Overpr
000000C0 89 95 A3                                        int

FIGURE 9: LINE DATA ELEMENTS FOR ASCII

Record Separation    Delimiters (CR, LF, FF)
Carriage Control     ASCII CR, LF, FF
Character Set        ASCII

FIGURE 10: RECORD STRUCTURE OF ASCII

Record 1         Record 2
Data CR LF       Data CR FF

CR LF, CR FF = Delimiter/CC

FIGURE 11: HEX DUMP OF ASCII

00000000 41 20 54 45 53 54 20 4F-46 20 43 41 52 52 49 41 A TEST OF CARRIA
00000010 47 45 20 43 4F 4E 54 52-4F 4C 2E 0D 0A 53 69 6E GE CONTROL...Sin
00000020 67 6C 65 20 53 70 61 63-65 2E 0D 0A 0D 0A 44 6F gle Space.....Do
00000030 75 62 6C 65 20 53 70 61-63 65 2E 0D 0C 4E 65 77 uble Space...New
00000040 20 50 61 67 65 0D 0A 41-20 54 45 53 54 20 4F 46  Page..A TEST OF
00000050 20 43 41 52 52 49 41 47-45 20 43 4F 4E 54 52 4F  CARRIAGE CONTRO
00000060 4C 2E 0D 0A 0D 0A 0D 0A-54 72 69 70 6C 65 20 53 L.......Triple S
00000070 70 61 63 65 2E 0D 0A 0D-0A 44 6F 75 62 6C 65 20 pace.....Double
00000080 53 70 61 63 65 2E 0D 0A-53 69 6E 67 6C 65 20 53 Space...Single S
00000090 70 61 63 65 2E 0D 0A 4F-76 65 72 70 72 69 6E 74 pace...Overprint
000000A0 0D 4F 76 65 72 70 72 69-6E 74 0D 0A             .Overprint..

EXAMPLES OF LINE DATA FORMATS

The sample line data shown in Figure 2, with two simple pages of text, will be used in all of the following examples. That way, you can see how different line data formats represent the same printable text.

The examples in Figures 3 and 4 (FBA and VBM) originally came from the IBM mainframe environment. The names FBA and VBM are derived from record format designations used on the mainframe, i.e., RECFM=FBA and RECFM=VBM. There are several possible variations of these, so I have chosen the two that I have found to be the most common, FBA and VBM. Once you learn the Fixed and Variable record structures, you will see that they can each be used with ASA or Machine carriage control. The three elements of line data are identified for each example.

FBA (Fixed Block with ASA Carriage Control)

FBA can be described simply as fixed length records in which the first byte or character of each record is an ASA carriage control character, and the rest of the record is the printable EBCDIC data. In order to read or write fixed length records, you need to know or document the record length for the FBA file, since it can be any arbitrary length. In the mainframe world, a length of 133 is commonly used, because most line printers can print 132 characters across (plus one byte for the ASA carriage control character).

Figure 5 shows a hex dump of a typical FBA print file with an EBCDIC interpretation at the right. You will notice a lot of blanks or spaces in the data represented by 40s in the hexadecimal portion of the dump. The hexadecimal value for an EBCDIC space character is 40. Spaces are required in a fixed length format in order to pad out every record to its full length. Clearly, this can result in a lot of wasted storage space, which is a disadvantage of the fixed length record format.

The same Fixed Block format can be combined with Machine carriage control, in which case it is called FBM. The dump would look almost the same, with many spaces padding out each record to the same length. Only the carriage control byte at the beginning of each record would look different, containing Machine carriage control commands instead of ASA. One way to recognize Machine carriage control is to look for values of 09h, which is the commonly used “write and space one line” command.

VBM (Variable Block with 3211 Machine Carriage Control)

The VBM format uses a record length field at the beginning of each record to indicate the length of the following record. The length is coded in the field in binary, and it includes itself in the count. See Figure 6.

As Figure 7 shows, the IBM variable length record format can be either with or

without blocking (V or VB). In the PC environment, there is no need for blocking, so many software packages prefer to use the format without blocking.

The block length plus the two trailing nulls are called a BDW or Block Descriptor Word. The record length plus the two trailing nulls are called a RDW or Record Descriptor Word. If you’re wondering why there are two null bytes after the record or block length, IBM documentation says they are “reserved for possible future system use.” The nulls actually turn out to be helpful, because a person or a program can more easily detect this format by looking for the pattern of nulls.

Figure 8 shows a hexadecimal dump of a VBM file without blocking, which you could technically call a VM file. Note that you can locate the beginning of each record by just finding two nulls (00h) in a row.

As with the Fixed length format in the previous example, the IBM Variable format can be used with either ASA or Machine carriage control (Variable Block with ASA is called VBA). The hex dump of a VBA file would look the same as that for a VBM file, except for the carriage control byte. If you see many values of 40h in the carriage control position, it’s ASA, but if you see many values of 09h, it’s Machine carriage control.

FIGURE 12: LINE DATA ELEMENTS FOR ASCII WITH ASA

Record Separation    Delimiters (CR, LF)
Carriage Control     ANSI (ASA)
Character Set        ASCII

FIGURE 13: RECORD STRUCTURE OF ASCII WITH ASA

Record 1           Record 2
C Data CR LF       C Data CR LF

C     = ASA Carriage Control Character
CR LF = Delimiter

FIGURE 14: HEX DUMP OF ASCII WITH ASA

00000000 2B 41 20 54 45 53 54 20-4F 46 20 43 41 52 52 49 +A TEST OF CARRI
00000010 41 47 45 20 43 4F 4E 54-52 4F 4C 2E 0D 0A 20 53 AGE CONTROL..._S
00000020 69 6E 67 6C 65 20 53 70-61 63 65 2E 0D 0A 30 44 ingle Space...0D
00000030 6F 75 62 6C 65 20 53 70-61 63 65 2E 0D 0A 31 4E ouble Space...1N
00000040 65 77 20 50 61 67 65 0D-0A 20 41 20 54 45 53 54 ew Page.. A TEST
00000050 20 4F 46 20 43 41 52 52-49 41 47 45 20 43 4F 4E  OF CARRIAGE CON
00000060 54 52 4F 4C 2E 0D 0A 2D-54 72 69 70 6C 65 20 53 TROL...-Triple S
00000070 70 61 63 65 2E 0D 0A 30-44 6F 75 62 6C 65 20 53 pace...0Double S
00000080 70 61 63 65 2E 0D 0A 20-53 69 6E 67 6C 65 20 53 pace... Single S
00000090 70 61 63 65 2E 0D 0A 20-4F 76 65 72 70 72 69 6E pace... Overprin
000000A0 74 0D 0A 2B 4F 76 65 72-70 72 69 6E 74 0D 0A    t..+Overprint..

FIGURE 15: TEXT VIEW OF ASCII WITH ASA

+A TEST OF CARRIAGE CONTROL.
 Single Space.
0Double Space.
1New Page
 A TEST OF CARRIAGE CONTROL.
-Triple Space.
0Double Space.
 Single Space.
 Overprint
+Overprint
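The record layout and the Skip After behavior described above can be combined in a short Python sketch. It walks the unblocked VBM bytes from Figure 8, splits them on the record length field, and then works out where each line would print. Several things here are assumptions rather than part of any standard tool: the function names are hypothetical, the length is read as a big-endian 16-bit value (as the dump suggests), Python's `cp037` codec stands in for EBCDIC, Channel 1 is taken to be line 1 of the page (a real FCB could define it differently), and only the Machine codes that appear in this sample are handled.

```python
import struct

# Machine "write, then space" codes from the "Machine with data" column of Figure 1.
SPACE_AFTER = {0x01: 0, 0x09: 1, 0x11: 2, 0x19: 3}
SKIP_TO_CHANNEL_1_AFTER = 0x89

# The bytes of Figure 8 (VBM without blocking).
FIGURE_8 = bytes.fromhex("""
    00 20 00 00 09 C1 40 E3 C5 E2 E3 40 D6 C6 40 C3
    C1 D9 D9 C9 C1 C7 C5 40 C3 D6 D5 E3 D9 D6 D3 4B
    00 12 00 00 11 E2 89 95 87 93 85 40 E2 97 81 83
    85 4B 00 12 00 00 89 C4 96 A4 82 93 85 40 E2 97
    81 83 85 4B 00 0D 00 00 09 D5 85 A6 40 D7 81 87
    85 00 20 00 00 19 C1 40 E3 C5 E2 E3 40 D6 C6 40
    C3 C1 D9 D9 C9 C1 C7 C5 40 C3 D6 D5 E3 D9 D6 D3
    4B 00 12 00 00 11 E3 99 89 97 93 85 40 E2 97 81
    83 85 4B 00 12 00 00 09 C4 96 A4 82 93 85 40 E2
    97 81 83 85 4B 00 12 00 00 09 E2 89 95 87 93 85
    40 E2 97 81 83 85 4B 00 0E 00 00 01 D6 A5 85 99
    97 99 89 95 A3 00 0E 00 00 09 D6 A5 85 99 97 99
    89 95 A3
""")


def read_vbm_records(data):
    """Yield (carriage_control_byte, text) for each unblocked variable-length record."""
    pos = 0
    while pos < len(data):
        if pos + 4 > len(data):
            raise ValueError(f"truncated record descriptor at offset {pos}")
        # RDW: two-byte length (which counts the RDW itself) plus two reserved nulls.
        length, _reserved = struct.unpack_from(">HH", data, pos)
        if length < 5 or pos + length > len(data):
            raise ValueError(f"bad record length {length} at offset {pos}")
        carriage_control = data[pos + 4]
        text = data[pos + 5 : pos + length].decode("cp037")  # assumed code page
        yield carriage_control, text
        pos += length


def place_lines(records):
    """Apply Skip After: print at the current position, then move the paper."""
    page, line = 1, 1
    placed = []
    for carriage_control, text in records:
        placed.append((page, line, text))
        if carriage_control == SKIP_TO_CHANNEL_1_AFTER:
            page, line = page + 1, 1  # assumption: Channel 1 is line 1
        else:
            line += SPACE_AFTER[carriage_control]
    return placed


for page, line, text in place_lines(read_vbm_records(FIGURE_8)):
    print(f"page {page} line {line:2}: {text}")
```

Note that the page break code (89h) sits on the "Double Space." record, not on "New Page". That is the out-of-sequence look that Skip After gives Machine carriage control.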

ASCII

ASCII is the standard used on the PC platform and on many midrange systems, including Unix systems. ASCII uses a very simple form of carriage control, which is made up of just three control characters: carriage return (CR or 0Dh), line feed (LF or 0Ah), and form feed (FF or 0Ch). See Figure 9. The Unix flavor of ASCII doesn’t even use the carriage return, just the line feed and form feed.

The control characters also do double duty as delimiters to indicate the end of the record. So, if you are reading an ASCII record, and you encounter any one of these three control characters, it means you have reached the end of that record. See Figure 10.

Figure 11 shows a hexadecimal dump of an ASCII file using the same sample data as all the other cases. In this hexadecimal dump and the one shown in Figure 14, the text representation on the right is based on an ASCII interpretation.

A plain ASCII file will contain nothing but printable text, spaces, and the three control characters. Notice that double space and triple space are represented simply by repeated use of carriage return and line feed.

ASCII with ANSI (ASA) Carriage Control

ASCII with ASA is a hybrid format where the ASA carriage control from the mainframe world has been combined with simple ASCII text. The ASCII delimiters (CR and LF) are still used, but the form feed is dropped. More importantly, the ASCII delimiters are not interpreted as carriage control, but simply as record separators. The carriage control is represented by the ASA character at the beginning of each record. See Figures 12 and 13.

At first glance, the data in Figure 14 looks similar to ordinary ASCII data. You can see the ASCII text and the ASCII delimiters (0D 0A), and it’s hard to tell whether it is ordinary ASCII or something more. To make sure, try to view the file as a regular text file using any editor or word processing program. If it is regular ASCII, it will look like the original data at the beginning of these examples. If it is ASCII with ANSI carriage control, it will look like the sample shown in Figure 15. You should notice that the first character on every line will be a space, a number, or a plus (+) or minus (-) sign.

CONCLUSION

The examples presented here cover the most common types of line data. Once you have become familiar with these formats and learn to recognize their patterns, you should be able to recognize other variations of line data as well.

David Staal is a professional services supervisor for Barr Systems, Inc., in Gainesville, FL. He can be contacted via email at [email protected].

TECHNICAL SUPPORT • FEBRUARY 2001 WWW.NASPA.COM