The Evolution of Character Codes, 1874-1968
Total Page:16
File Type:pdf, Size:1020Kb
The Evolution of Character Codes, 1874-1968 Eric Fischer [email protected] Abstract Émile Baudot’sprinting telegraph was the first widely adopted device to encode letters, numbers, and symbols as uniform-length binary sequences. Donald Murray introduced a second successful code of this type, the details of which continued to evolveuntil versions of Baudot’sand Murray’scodes were standardized as International Tele- graph Alphabets No. 1 and No. 2, respectively.These codes were used for decades before the appearance of com- puters and the changing needs of communications required the design and standardization of a newcode. Years of debate and compromise resulted in the ECMA-6 standard in Europe, the ASCII standard in the United States, and the ISO 646 and International Alphabet No. 5 standards internationally. This work has been submitted to the IEEE for possible publication. Paper copies: Copyright may be transferred without notice,after whichthis version will be superseded. Electronic copies: Copyright may be transferred without notice,after whichthis version may no longer be accessible. Introduction were received, would print them in ordinary alpha- Today we takeitfor granted that a ‘‘plain text’’ betic characters on a strip of paper.Hereceiveda file on a computer can be read by nearly anyprogram, patent for such a system on June 17, 1874.3, 4, 5 printed on anyprinter,displayed on anyscreen, trans- Baudot’swas not the first printing telegraph, but it mitted overany network, and understood equally eas- made considerably more efficient use of communica- ily by anyother makeormodel of computer.Plain tions lines than an earlier system invented by David E. text is plain, though, only because of a near-universal Hughes. Hughes’sprinter contained a continually agreement about what symbols and actions corre- rotating wheel with characters engravedonitinthe spond to what arbitrary arrangement of bits, an agree- order shown in Figure 2. Acharacter could be printed ment that was reached only after manyyears of design by sending a single pulse overthe telegraph line, but work, experimentation, and compromise. depending on the current position of the wheel it The first portion of the paper will coverthe origins might takenearly a complete rotation before the cor- of International Telegraph Alphabet No. 2 (often rect character would be ready to print.6 Instead of a called ‘‘Baudot’’), the five-unit code standardized in variable delay followed by a single-unit pulse, Bau- the 1930s. The second portion will coverthe design dot’ssystem used a uniform six time units to transmit and standardization of its successor,the seven-bit each character.Ihave not been able to obtain a copy international standard code nowused by the majority of Baudot’s1874 patent, but his early telegraph proba- of the world’scomputers and networks. This second bly used the six-unit code (Figure 3) that he attributes topic has previously been addressed from different to Davy in an 1877 article.7 perspectivesinapaper by Robert W.Bemer1 and a (In Figure 3, and in other figures to follow, each book by Charles E. Mackenzie.2 printable character is shown next to the pattern of impulses that is transmitted on a telegraph line to rep- Émile Baudot resent it. In this figure, dots ( )specifically represent On July 16, 1870, twenty-four-year-old Jean-Mau- the positive voltage of an idle telegraph line and cir- rice-Émile Baudot (Figure 1) left his parents’ farm cles ( )the negative voltage of an active line. In and begananewcareer in France’sAdministration related systems using punched paper tape, circles rep- des Postes et des Télégraphes. He had receivedonly resent a hole punched in the tape and dots the absence an elementary school education, but beganstudying of a hole.) electricity and mechanics in his spare time. In 1872, It may seem surprising that Hughes and Baudot he started research toward a telegraph system that invented their own telegraph codes rather than design- would allowmultiple operators to transmit simultane- ing printers that could work with the already-standard ously overasingle wire and, as the transmissions Morse code. Morse code, though, is extremely -2- A N B O C P D Q E R F S G T H U I V J W K X L Y M Z Figure3. Six-unit code (alphabet only) from an 1877 article by Émile Baudot.7 characters had varying patterns but were always trans- mitted in the same amount of time. Asix-unit code can encode 64 (26)different char- acters, far more than the twenty-six letters and space that are needed, at a minimum, for alphabetic mes- sages. This smaller set of characters can be encoded more efficiently with a five-unit code, which allows 32 (25)combinations, so in 1876 Baudot redesigned 3 Figure1. Émile Baudot (1845-1903). his equipment to use a five-unit code. Punctuation and digits were still sometimes needed, though, so he adopted from Hughes the use of twospecial letter letters letters letters letters space and figurespace characters that would cause the 1 A 1 8 H 8 15 O ? 22 V = printer to shift between cases at the same time as it 2 B 2 9 I 9 16 P ! 23 fig spc advanced the paper without printing. 3 C 3 10 J 0 17 Q ’ 24 W ( The five-unit code he beganusing at this time 4 D 4 11 K . 18 R + 25 X ) (Figure 4)9 wasstructured to suit his keyboard (Figure 5 E 5 12 L , 19 S − 26 Y & 5), which controlled twounits of each character with × 6 F 6 13 M ; 20 T 27 Z " switches operated by the left hand and the other three 7 G 7 14 N : 21 U / 28 let spc units with the right hand.10 Such ‘‘chorded’’key- figures figures figures figures boards have from time to time been reintroduced.11, 12 The Hughes system had used a piano-likekeyboard Figure2. Order of characters on Hughes printing (Figure 6). The typewriter was still too newaninv en- telegraph typewheel.6 Some equipment replaced the tion to have any impact on the design of telegraph letter W by the accented letter É and the multiplica- equipment. tion sign (×)byasection sign (§). Donald Murray difficult to decode mechanically because its characters By 1898, though, typewriters had become much vary both in their length and in their pattern. It was more common. In that year,15 Donald Murray (Figure not until the beginning of the twentieth century that 7), ‘‘an Australian journalist, without prior practical F. G.Creed was able to develop a successful Morse experience in telegraph work,’’16 invented a device printer,and evenhis invention could not print mes- which operated the keysofatypewriter or typesetting sages immediately as theywere received, but instead machine according to patterns of holes punched in a required that theyfirst be punched onto paper tape.8 strip of paper tape. In 1899 he receivedaUnited Hughes simplified the task by adopting a code in States patent for this invention17 and came to New which characters varied only with time, not in their York, where he worked to develop a complete tele- pattern. Baudot chose the opposite simplification: his graph system around it for the Postal Telegraph-Cable -3- let fig let fig 1 2 3 4 5 6 7 8 9 0 . , ; : unused * * A B C D E F G H I J K L M N A 1 K ( É & L = " & ) ( = / § − + ’ ! ? E 2 M ) Z Y X É V U T S R Q P O I ° N No O 5 P % 14 U 4 Q / Figure6. Hughes printing telegraph keyboard. Y 3 R − B 8 S ; C 9 T ! D 0 V ’ F f W ? G 7 X , H h Z : J 6 t . figure space letter space Figure4. Émile Baudot’sfive-unit code.9, 10, 13 Figure7. Donald Murray (1866-1945). Photo pro- Figure5. Baudot’sfive-key keyboard.10 vided by and reproduced courtesy of Bob Mackay. Company.15, 18, 19 assigned to the letters, control characters, and comma Murray’sprinter,likeBaudot’stelegraph, repre- and period. His patent unfortunately givesnoindica- sented each character as a sequence of fiveunits and tion of what characters were available in the figures employed special shift characters to switch between case or in what order theywere arranged. cases. Baudot’ssystem had only letter and figure On January 25, 1901, William B. Vansize (identi- cases, but Murray’sfirst printer had three: figures, fied as Murray’sattorneyinsev eral of his patents)21 capitals, and miniscules (‘‘release’’). Tomaximize described Murray’sinv ention to the American Insti- the structural stability of the tape,20 Murray arranged tute of Electrical Engineers, and Murray demonstrated the characters in his code so that the most frequently the printer in operation.16 By this time, his equipment used letters were represented by the fewest number of used a code (Figure 9a) that was almost identical to holes in the tape. Figure 8 shows the codes he the one from 1899, except that the codes for the space -4- rls cap fig cap fig let fig let fig e E − E 3 E 3 E 3 t T unk T 5 T 5 T 5 a A / A & A cor A : i I unk I 8 I 8 I 8 n N unk N £ N − N − o O unk O 9 O 9 O 9 s S ( S : S £ S 1 r R % R 4 R 4 R 4 h H unk H ; H ; H 5/ d D ) D − D 1 D 2 l L unk L % L prf L / u U unk U 7 U 7 U 7 c C unk C ( C ’ C ( m M unk M ? M ? M , f F unk F " F " F 1/ w W £ W 2 W 2 W 2 y Y unk Y 6 Y 6 Y 6 p P unk P 0 P 0 P 0 b B unk B / B / B ? g G unk G ’ G 3/ G 3/ v V unk V ) V () V ) k K unk K ½ K 9/ K 9/ q Q " Q 1 Q 1 Q 1 x X 3 X ¾ X % X £ 17 Figure8.