The Evolution of Character Codes, 1874-1968

Total Page:16

File Type:pdf, Size:1020Kb

The Evolution of Character Codes, 1874-1968 The Evolution of Character Codes, 1874-1968 Eric Fischer [email protected] Abstract Émile Baudot’sprinting telegraph was the first widely adopted device to encode letters, numbers, and symbols as uniform-length binary sequences. Donald Murray introduced a second successful code of this type, the details of which continued to evolveuntil versions of Baudot’sand Murray’scodes were standardized as International Tele- graph Alphabets No. 1 and No. 2, respectively.These codes were used for decades before the appearance of com- puters and the changing needs of communications required the design and standardization of a newcode. Years of debate and compromise resulted in the ECMA-6 standard in Europe, the ASCII standard in the United States, and the ISO 646 and International Alphabet No. 5 standards internationally. This work has been submitted to the IEEE for possible publication. Paper copies: Copyright may be transferred without notice,after whichthis version will be superseded. Electronic copies: Copyright may be transferred without notice,after whichthis version may no longer be accessible. Introduction were received, would print them in ordinary alpha- Today we takeitfor granted that a ‘‘plain text’’ betic characters on a strip of paper.Hereceiveda file on a computer can be read by nearly anyprogram, patent for such a system on June 17, 1874.3, 4, 5 printed on anyprinter,displayed on anyscreen, trans- Baudot’swas not the first printing telegraph, but it mitted overany network, and understood equally eas- made considerably more efficient use of communica- ily by anyother makeormodel of computer.Plain tions lines than an earlier system invented by David E. text is plain, though, only because of a near-universal Hughes. Hughes’sprinter contained a continually agreement about what symbols and actions corre- rotating wheel with characters engravedonitinthe spond to what arbitrary arrangement of bits, an agree- order shown in Figure 2. Acharacter could be printed ment that was reached only after manyyears of design by sending a single pulse overthe telegraph line, but work, experimentation, and compromise. depending on the current position of the wheel it The first portion of the paper will coverthe origins might takenearly a complete rotation before the cor- of International Telegraph Alphabet No. 2 (often rect character would be ready to print.6 Instead of a called ‘‘Baudot’’), the five-unit code standardized in variable delay followed by a single-unit pulse, Bau- the 1930s. The second portion will coverthe design dot’ssystem used a uniform six time units to transmit and standardization of its successor,the seven-bit each character.Ihave not been able to obtain a copy international standard code nowused by the majority of Baudot’s1874 patent, but his early telegraph proba- of the world’scomputers and networks. This second bly used the six-unit code (Figure 3) that he attributes topic has previously been addressed from different to Davy in an 1877 article.7 perspectivesinapaper by Robert W.Bemer1 and a (In Figure 3, and in other figures to follow, each book by Charles E. Mackenzie.2 printable character is shown next to the pattern of impulses that is transmitted on a telegraph line to rep- Émile Baudot resent it. In this figure, dots ( )specifically represent On July 16, 1870, twenty-four-year-old Jean-Mau- the positive voltage of an idle telegraph line and cir- rice-Émile Baudot (Figure 1) left his parents’ farm cles ( )the negative voltage of an active line. In and begananewcareer in France’sAdministration related systems using punched paper tape, circles rep- des Postes et des Télégraphes. He had receivedonly resent a hole punched in the tape and dots the absence an elementary school education, but beganstudying of a hole.) electricity and mechanics in his spare time. In 1872, It may seem surprising that Hughes and Baudot he started research toward a telegraph system that invented their own telegraph codes rather than design- would allowmultiple operators to transmit simultane- ing printers that could work with the already-standard ously overasingle wire and, as the transmissions Morse code. Morse code, though, is extremely -2- A N B O C P D Q E R F S G T H U I V J W K X L Y M Z Figure3. Six-unit code (alphabet only) from an 1877 article by Émile Baudot.7 characters had varying patterns but were always trans- mitted in the same amount of time. Asix-unit code can encode 64 (26)different char- acters, far more than the twenty-six letters and space that are needed, at a minimum, for alphabetic mes- sages. This smaller set of characters can be encoded more efficiently with a five-unit code, which allows 32 (25)combinations, so in 1876 Baudot redesigned 3 Figure1. Émile Baudot (1845-1903). his equipment to use a five-unit code. Punctuation and digits were still sometimes needed, though, so he adopted from Hughes the use of twospecial letter letters letters letters letters space and figurespace characters that would cause the 1 A 1 8 H 8 15 O ? 22 V = printer to shift between cases at the same time as it 2 B 2 9 I 9 16 P ! 23 fig spc advanced the paper without printing. 3 C 3 10 J 0 17 Q ’ 24 W ( The five-unit code he beganusing at this time 4 D 4 11 K . 18 R + 25 X ) (Figure 4)9 wasstructured to suit his keyboard (Figure 5 E 5 12 L , 19 S − 26 Y & 5), which controlled twounits of each character with × 6 F 6 13 M ; 20 T 27 Z " switches operated by the left hand and the other three 7 G 7 14 N : 21 U / 28 let spc units with the right hand.10 Such ‘‘chorded’’key- figures figures figures figures boards have from time to time been reintroduced.11, 12 The Hughes system had used a piano-likekeyboard Figure2. Order of characters on Hughes printing (Figure 6). The typewriter was still too newaninv en- telegraph typewheel.6 Some equipment replaced the tion to have any impact on the design of telegraph letter W by the accented letter É and the multiplica- equipment. tion sign (×)byasection sign (§). Donald Murray difficult to decode mechanically because its characters By 1898, though, typewriters had become much vary both in their length and in their pattern. It was more common. In that year,15 Donald Murray (Figure not until the beginning of the twentieth century that 7), ‘‘an Australian journalist, without prior practical F. G.Creed was able to develop a successful Morse experience in telegraph work,’’16 invented a device printer,and evenhis invention could not print mes- which operated the keysofatypewriter or typesetting sages immediately as theywere received, but instead machine according to patterns of holes punched in a required that theyfirst be punched onto paper tape.8 strip of paper tape. In 1899 he receivedaUnited Hughes simplified the task by adopting a code in States patent for this invention17 and came to New which characters varied only with time, not in their York, where he worked to develop a complete tele- pattern. Baudot chose the opposite simplification: his graph system around it for the Postal Telegraph-Cable -3- let fig let fig 1 2 3 4 5 6 7 8 9 0 . , ; : unused * * A B C D E F G H I J K L M N A 1 K ( É & L = " & ) ( = / § − + ’ ! ? E 2 M ) Z Y X É V U T S R Q P O I ° N No O 5 P % 14 U 4 Q / Figure6. Hughes printing telegraph keyboard. Y 3 R − B 8 S ; C 9 T ! D 0 V ’ F f W ? G 7 X , H h Z : J 6 t . figure space letter space Figure4. Émile Baudot’sfive-unit code.9, 10, 13 Figure7. Donald Murray (1866-1945). Photo pro- Figure5. Baudot’sfive-key keyboard.10 vided by and reproduced courtesy of Bob Mackay. Company.15, 18, 19 assigned to the letters, control characters, and comma Murray’sprinter,likeBaudot’stelegraph, repre- and period. His patent unfortunately givesnoindica- sented each character as a sequence of fiveunits and tion of what characters were available in the figures employed special shift characters to switch between case or in what order theywere arranged. cases. Baudot’ssystem had only letter and figure On January 25, 1901, William B. Vansize (identi- cases, but Murray’sfirst printer had three: figures, fied as Murray’sattorneyinsev eral of his patents)21 capitals, and miniscules (‘‘release’’). Tomaximize described Murray’sinv ention to the American Insti- the structural stability of the tape,20 Murray arranged tute of Electrical Engineers, and Murray demonstrated the characters in his code so that the most frequently the printer in operation.16 By this time, his equipment used letters were represented by the fewest number of used a code (Figure 9a) that was almost identical to holes in the tape. Figure 8 shows the codes he the one from 1899, except that the codes for the space -4- rls cap fig cap fig let fig let fig e E − E 3 E 3 E 3 t T unk T 5 T 5 T 5 a A / A & A cor A : i I unk I 8 I 8 I 8 n N unk N £ N − N − o O unk O 9 O 9 O 9 s S ( S : S £ S 1 r R % R 4 R 4 R 4 h H unk H ; H ; H 5/ d D ) D − D 1 D 2 l L unk L % L prf L / u U unk U 7 U 7 U 7 c C unk C ( C ’ C ( m M unk M ? M ? M , f F unk F " F " F 1/ w W £ W 2 W 2 W 2 y Y unk Y 6 Y 6 Y 6 p P unk P 0 P 0 P 0 b B unk B / B / B ? g G unk G ’ G 3/ G 3/ v V unk V ) V () V ) k K unk K ½ K 9/ K 9/ q Q " Q 1 Q 1 Q 1 x X 3 X ¾ X % X £ 17 Figure8.
Recommended publications
  • Typology of Programming Languages E Early Languages E
    Typology of programming languages e Early Languages E Typology of programming languages Early Languages 1 / 71 The Tower of Babel Typology of programming languages Early Languages 2 / 71 Table of Contents 1 Fortran 2 ALGOL 3 COBOL 4 The second wave 5 The finale Typology of programming languages Early Languages 3 / 71 IBM Mathematical Formula Translator system Fortran I, 1954-1956, IBM 704, a team led by John Backus. Typology of programming languages Early Languages 4 / 71 IBM 704 (1956) Typology of programming languages Early Languages 5 / 71 IBM Mathematical Formula Translator system The main goal is user satisfaction (economical interest) rather than academic. Compiled language. a single data structure : arrays comments arithmetics expressions DO loops subprograms and functions I/O machine independence Typology of programming languages Early Languages 6 / 71 FORTRAN’s success Because: programmers productivity easy to learn by IBM the audience was mainly scientific simplifications (e.g., I/O) Typology of programming languages Early Languages 7 / 71 FORTRAN I C FIND THE MEAN OF N NUMBERS AND THE NUMBER OF C VALUES GREATER THAN IT DIMENSION A(99) REAL MEAN READ(1,5)N 5 FORMAT(I2) READ(1,10)(A(I),I=1,N) 10 FORMAT(6F10.5) SUM=0.0 DO 15 I=1,N 15 SUM=SUM+A(I) MEAN=SUM/FLOAT(N) NUMBER=0 DO 20 I=1,N IF (A(I) .LE. MEAN) GOTO 20 NUMBER=NUMBER+1 20 CONTINUE WRITE (2,25) MEAN,NUMBER 25 FORMAT(11H MEAN = ,F10.5,5X,21H NUMBER SUP = ,I5) STOP TypologyEND of programming languages Early Languages 8 / 71 Fortran on Cards Typology of programming languages Early Languages 9 / 71 Fortrans Typology of programming languages Early Languages 10 / 71 Table of Contents 1 Fortran 2 ALGOL 3 COBOL 4 The second wave 5 The finale Typology of programming languages Early Languages 11 / 71 ALGOL, Demon Star, Beta Persei, 26 Persei Typology of programming languages Early Languages 12 / 71 ALGOL 58 Originally, IAL, International Algebraic Language.
    [Show full text]
  • Mobile Digital Computer Program. Mobidic D
    UNCLASSIFIED AD 4 7_070 DEFENSE DOCUMEI'TATION CENTER FOR SCIENTIFIC AND TECHNIA!. INFO'UMATION CAMERON STATION, ALEXANDRW , VIFGINI, UNCLASSIFIED NOTICE: When government or other drawings, speci- fications or other data are used for any purpose other than in connection with a definitely related government procurement operation, the U. S. Government thereby incurs no responsibility, nor any obligation whatsoever; and the fact that the Govern- ment may have formulated, furnished, or in any way supplied the said drawings, specifications, or other data is not to be regarded by implication or other- wise as in any manner licensing the holder or any other person or corporation, or conveying any rights or permission to manufacture, use or sell any patented invention that may in any way be related thereto. FINAL REPORT 1 FEBRUARY 1963 J a I MOBILE DIGITAL COMPUTER PROGRAM MOBIDIC D FINAL REPORT 1 July 1958 to 1 February 1963 I Signal Corps Technical Requirements I SCL 1959 SCL 4328 Contract No. DA 3 6 -039-sc-781 6 4 I DA Project No. 3-28-02-201 I Submitted by: _, _ _ _ E. W. Jer'7is, Manage'r MOBIDIC Projects February 1963 S SYLVANIA ELECTRONIC SYSTEMS-EAST SYLVANIA ELECTRONIC SYSTEMS A Division of Sylvania Electric Products Inc. 189 B Street-Needham Heights 94, Massachusetts ~• I 3 TABLE OF CONTENTS I Section Page LIST OF ILLUSTRATIONS v ILIST OF TABLES vii I PURPOSE 1-1 1U1.1 MOBIDIC D General Purpose High-Speed Computer 1-1 1.2 MOBIDIC D Program 1-1 11.2. 1 Phase I -Preliminary Design 1-1 1.2.2 Phase II-Design 1-1 1.2.3 Phase III-Construction and Test 1-2 1.2.4 Phase IV-Update MOBIDIC D to MOBIDIC 7A 1-2 I 1.2.5 Phase V-Van Installation and Test 1,-2 II ABSTRACT 2-1 III PUBLICATIONS, LECTURES, CONFERENCES & TERMINOLOGY 3-1 3.1 Publications 3-1 T3.2 Lectures 3-1 3.3 Conferences 3-2 3.4 Terminology and Abbreviations 3-10 S3.4.1 Logical and Mechanization Designations: 3-13 Central Machine and Converter S3.4.2 Logical and Mechanization Designations: - 3-45 Card Reader and Punch Buffer 3.4.
    [Show full text]
  • A Politico-Social History of Algolt (With a Chronology in the Form of a Log Book)
    A Politico-Social History of Algolt (With a Chronology in the Form of a Log Book) R. w. BEMER Introduction This is an admittedly fragmentary chronicle of events in the develop­ ment of the algorithmic language ALGOL. Nevertheless, it seems perti­ nent, while we await the advent of a technical and conceptual history, to outline the matrix of forces which shaped that history in a political and social sense. Perhaps the author's role is only that of recorder of visible events, rather than the complex interplay of ideas which have made ALGOL the force it is in the computational world. It is true, as Professor Ershov stated in his review of a draft of the present work, that "the reading of this history, rich in curious details, nevertheless does not enable the beginner to understand why ALGOL, with a history that would seem more disappointing than triumphant, changed the face of current programming". I can only state that the time scale and my own lesser competence do not allow the tracing of conceptual development in requisite detail. Books are sure to follow in this area, particularly one by Knuth. A further defect in the present work is the relatively lesser availability of European input to the log, although I could claim better access than many in the U.S.A. This is regrettable in view of the relatively stronger support given to ALGOL in Europe. Perhaps this calmer acceptance had the effect of reducing the number of significant entries for a log such as this. Following a brief view of the pattern of events come the entries of the chronology, or log, numbered for reference in the text.
    [Show full text]
  • Unicode/IDN Security Mark Davis President, Unicode Consortium Chief SW Globalization Arch., IBM the Unicode Consortium
    Unicode/IDN Security Mark Davis President, Unicode Consortium Chief SW Globalization Arch., IBM The Unicode Consortium Software globalization standards: define properties and behavior for every character in every script Unicode Standard: a unique code for every character Common Locale Data Repository: LDML format plus repository for required locale data Collation, line breaking, regex, charset mapping, … Used by every major modern operating system, browser, office software, email client,… Core of XML, HTML, Java, C#, C (with ICU), Javascript, … Security ~ Identity System A X = x System B X ≠ x IDN You get an email about your paypal.com account, click on the link… You carefully examine your browser's address box to make sure that it is actually going to http://paypal.com/ … But actually it is going to a spoof site: “paypal.com” with the Cyrillic letter “p”. You (System A) think that they are the same DNS (System B) thinks they are different Examples: Letters Cross-Script p in Latin vs p in Cyrillic In-Script Sequences rn may appear at display sizes like m ! + ा typically looks identical to # so̷s looks like søs Rendering Support ä with two umlauts may look the same as ä with one el is actually e + l Examples: Numbers Western 0 1 2 3 4 5 6 7 8 9 Bengali ৺৺৺৺৺৺৺৺৺৺ Oriya ୱୱୱୱୱୱୱୱୱୱ Thus ৺ୱ = 42 Syntax Spoofing http://example.org/1234/not.mydomain.com http://example.org/1234/not.mydomain.com / = fraction-slash Also possible without Unicode: http://example.org--long-and-obscure-list-of- characters.mydomain.com UTR
    [Show full text]
  • Network Working Group D. Goldsmith Request for Comments: 1641 M
    Network Working Group D. Goldsmith Request for Comments: 1641 M. Davis Category: Experimental Taligent, Inc. July 1994 Using Unicode with MIME Status of this Memo This memo defines an Experimental Protocol for the Internet community. This memo does not specify an Internet standard of any kind. Distribution of this memo is unlimited. Abstract The Unicode Standard, version 1.1, and ISO/IEC 10646-1:1993(E) jointly define a 16 bit character set (hereafter referred to as Unicode) which encompasses most of the world's writing systems. However, Internet mail (STD 11, RFC 822) currently supports only 7- bit US ASCII as a character set. MIME (RFC 1521 and RFC 1522) extends Internet mail to support different media types and character sets, and thus could support Unicode in mail messages. MIME neither defines Unicode as a permitted character set nor specifies how it would be encoded, although it does provide for the registration of additional character sets over time. This document specifies the usage of Unicode within MIME. Motivation Since Unicode is starting to see widespread commercial adoption, users will want a way to transmit information in this character set in mail messages and other Internet media. Since MIME was expressly designed to allow such extensions and is on the standards track for the Internet, it is the most appropriate means for encoding Unicode. RFC 1521 and RFC 1522 do not define Unicode as an allowed character set, but allow registration of additional character sets. In addition to allowing use of Unicode within MIME bodies, another goal is to specify a way of using Unicode that allows text which consists largely, but not entirely, of US-ASCII characters to be represented in a way that can be read by mail clients who do not understand Unicode.
    [Show full text]
  • Standards for Computer Aided Manufacturing
    //? VCr ~ / Ct & AFML-TR-77-145 )R^ yc ' )f f.3 Standards for Computer Aided Manufacturing Office of Developmental Automation and Control Technology Institute for Computer Sciences and Technology National Bureau of Standards Washington, D.C. 20234 January 1977 Final Technical Report, March— December 1977 Distribution limited to U.S. Government agencies only; Test and Evaluation Data; Statement applied November 1976. Other requests for this document must be referred to AFML/LTC, Wright-Patterson AFB, Ohio 45433 Manufacturing Technology Division Air Force Materials Laboratory Wright-Patterson Air Force Base, Ohio 45433 . NOTICES When Government drawings, specifications, or other data are used for any purpose other than in connection with a definitely related Government procurement opera- tion, the United States Government thereby incurs no responsibility nor any obligation whatsoever; and the fact that the Government may have formulated, furnished, or in any way supplied the said drawing, specification, or other data, is not to be regarded by implication or otherwise as in any manner licensing the holder or any person or corporation, or conveying any rights or permission to manufacture, use, or sell any patented invention that may in any way be related thereto Copies of this report should not be returned unless return is required by security considerations, contractual obligations, or notice on a specified document This final report was submitted by the National Bureau of Standards under military interdepartmental procurement request FY1457-76 -00369 , "Manufacturing Methods Project on Standards for Computer Aided Manufacturing." This technical report has been reviewed and is approved for publication. FOR THE COMMANDER: DtiWJNlb L.
    [Show full text]
  • Unicode and Code Page Support
    Natural for Mainframes Unicode and Code Page Support Version 4.2.6 for Mainframes October 2009 This document applies to Natural Version 4.2.6 for Mainframes and to all subsequent releases. Specifications contained herein are subject to change and these changes will be reported in subsequent release notes or new editions. Copyright © Software AG 1979-2009. All rights reserved. The name Software AG, webMethods and all Software AG product names are either trademarks or registered trademarks of Software AG and/or Software AG USA, Inc. Other company and product names mentioned herein may be trademarks of their respective owners. Table of Contents 1 Unicode and Code Page Support .................................................................................... 1 2 Introduction ..................................................................................................................... 3 About Code Pages and Unicode ................................................................................ 4 About Unicode and Code Page Support in Natural .................................................. 5 ICU on Mainframe Platforms ..................................................................................... 6 3 Unicode and Code Page Support in the Natural Programming Language .................... 7 Natural Data Format U for Unicode-Based Data ....................................................... 8 Statements .................................................................................................................. 9 Logical
    [Show full text]
  • Ill[S ADJUSTMENTS
    B249 DATA TRANSMISSION CONTROL UNIT INTRODUCTION AND OPERATION Burroughs FUNCTIONAL DETAIL FIELD ENGINEERING CIRCUIT DETAIL lJrn~[}{] ~ D~ill[s ADJUSTMENTS MAINTENANCE ~ ill ~ OD ill [S PROCEDURES INST ALLAT ION PROCEDURES RELIABILITY IMPROVEMENT NOTICES OPTIONAL FEATURES MODIFICATIONS (BRANCH LIBRARIES) Printed in U.S. America 9-15-66 Form 1026259 Burroughs - B249 Data Transmission ~echnical Manual I N D E X INTRODUCTION & OPERATION - SECTION I Page No. B249 Data Transmission Control Unit - General Descript ion . 1 Glossary - Data Transmission Terminal Unit and MCU. 5 Glossary - DTCU & System . 3 Physical De$cr1ption . 2 FUNCTIONAL DETAIL - SECTION II B300 Active Interrogate. · . · . · · 57 B300 Data Communications Read. · · . · · · · 62 B300 Data Communications Write · · . 78 B300 Passive Interrogate · · 49 VB300 Passive Interrogate - ITU (B486) Mode · . 92 v13300 Read - ITU (B486) Mode. · · . · · . · . 95 V'B300 Write - ITU (13486) Mode · · . · · 108 B5500 Data Communications Interrogate. · . 1 B5500 Data Communications Read . · 9 B5500 Data Communications Write. · 21 CIRCUIT DETAIL - SECTION III "AU Register Load - (Normal & Reverse) 1 Clock Control. ... 11 Scan . 3 Translator . 4 ADJUSTMENTS - SECTION IV Clock Adjustments ..... 1 Variable Bias Adjustment . 1 MAINTENANCE PROCEDURES - SECTION V Maintenance Panel ... 1 INSTALLATION PROCEDURES - SECTION VI DTCU Installation. 1 Pluggable Options. 2 Power ON . 3 Special Inquiry Terminal Connection .. 2 NOTE: Pages for Sections VII, VIII and IX will be furnished when applicable. Printed in U. S. America Revised 4/1/67 For Form 1026259 Burroughs - B249 Data Transmission Technical Manual Sec, I Page 1 Introduction & Operation B249 DATA TRANSMISSION CONTROL UNIT - GENERAL DESCRIPTION The B249 DTCU is required when: 1. More than one B487 DTTU is used on a single Processing System.
    [Show full text]
  • Relink Eo 3-11-97
    CHAPTER 1 COMMUNICATIONS As an Aviation Electronics Technician, you will be communication circuits. These operations are accom- tasked to operate and maintain many different types of plished through the use of compatible and flexible airborne communications equipment. These systems communication systems. may differ in some respects, but they are similar in many Radio is the most important means of com- ways. As an example, there are various models of AM municating in the Navy today. There are many methods radios, yet they all serve the same function and operate of transmitting in use throughout the world. This manual on the same basic principles. It is beyond the scope of will discuss three types. They are radiotelegraph, this manual to discuss each and every model of radiotelephone, and teletypewriter. communication equipment used on naval aircraft; therefore, only representative systems will be discussed. Every effort has been made to use not only systems that Radiotelegraph are common to many of the different platforms, but also have not been used in the other training manuals. It is Radiotelegraph is commonly called CW (con- the intent of this manual to have systems from each and tinuous wave) telegraphy. Telegraphy is accomplished every type of aircraft in use today. by opening and closing a switch to separate a continuously transmitted wave. The resulting “dots” and “dashes” are based on the Morse code. The major RADIO COMMUNICATIONS disadvantage of this type of communication is the relatively slow speed and the need for experienced Learning Objective: Recognize the various operators at both ends. types of radio communications.
    [Show full text]
  • Printf and Primitive Types
    CS 241 Data Organization using C printf and Primitive Types Instructor: Joel Castellanos e-mail: [email protected] Web: http://cs.unm.edu/~joel/ Office: Farris Engineering Center Room 2110 8/29/2019 1 Read: Kernighan & Ritchie Read ◼ Due Thursday, Aug 22 1.1: Getting Started 1.2: Variables and Arithmetic Expressions 1.3: The For Statement 1.4: Symbolic Constants ◼ Due Tuesday, Aug 27 (first iclicker quiz) 1.5: Character Input and Output ◼ Due Thursday, Aug 29 1.6: Arrays 2 2 1 Quiz: Basic Syntax One of these things is not like the others. One of these things does not belong. Can you tell which thing is not like the others before the end of this quiz? a) int a = 40 * 2*(1 + 3); b) int b = (10 * 10 * 10) + 2 c) int c = (2 + 3) * (2 + 3); d) int d = 1/2 + 1/3 + 1/4 + 1/5 + 1/6; e) int e = 1/2 - 1/4 + 1/8 - 1/16; 3 3 printf function printf("Name %s, Num=%d, pi %10.2f", "bob", 123, 3.14 ); Output: Name bob, Num=123, pi 3.14 printf format specifiers: %s string (null terminated char array) %c single char %d signed decimal int %f float %10.2f float with at least 10 spaces, 2 decimal places. %lf double 4 4 2 printf function: %d //%d: format placeholder that prints an int as a // signed decimal number. #include <stdio.h> void main(void) { int x = 512; Output: printf("x=%d\n", x); x=512 printf("[%2d]\n", x); [512] printf("[%6d]\n", x); [ 512] printf("[%-6d]\n", x); [512 ] printf("[-%6d]\n", x); [- 512] } 5 5 printf function: %f #include <stdio.h> void main(void) { float x = 3.141592653589793238; double z = 3.141592653589793238; printf("x=%f\n", x); printf("z=%f\n", z); printf("x=%20.18f\n", x); printf("z=%20.18f\n", z); } x=3.141593 Output: z=3.141593 x=3.141592741012573242 6 z=3.141592653589793116 6 3 Significant Figures Using /usr/bin/gcc on moons.unm.edu, a float has 7 significant figures.
    [Show full text]
  • The Compression Problem Computational Thinking in This Discussion, We Will Look at the Compression Problem
    Squeezing Ten Pounds of Data in a Five Pound Sack - or - The Compression Problem Computational Thinking In this discussion, we will look at the compression problem. Along the way, we will see several examples of computational thinking: • representations allow the computer to store and work with letters, images, or music by representing them with a numeric code; • patterns or repeated strings in text can be used by a computer for compres- sion; • index tables allow us to replace a long idea with a short word or number; Reading assignment Read Chapter 7, pages 105{121, \9 Algorithms that Changed the Future". The Compression Problem for Traveling When you pack for a trip, you want to minimize the number of suitcases and bags you take, but maximize the things you can transport. Sometimes it takes several repackings, rolling and squashing and rearranging, before you are satisfied. The Compression Problem for File Storage Your computer hard disk has a limited amount of storage. The information in a single movie or song could easily fill up your entire computer memory. This doesn't happen because the information is compressed in several ways. The Compression Problem for Data Transmission Data must be transferred from one place to another, over networks with limited capacity. In North America, 70% of Internet traffic involves services that are streaming data, such as movies or music. When the local network is overloaded, a movie becomes unwatchable, music becomes noise, users are dissatisfied. Solving the compression problem with more capacity One solution to these problems is to increase capacity: Buy more suitcases or bigger ones.
    [Show full text]
  • Data Encoding: All Characters for All Countries
    PhUSE 2015 Paper DH03 Data Encoding: All Characters for All Countries Donna Dutton, SAS Institute Inc., Cary, NC, USA ABSTRACT The internal representation of data, formats, indexes, programs and catalogs collected for clinical studies is typically defined by the country(ies) in which the study is conducted. When data from multiple languages must be combined, the internal encoding must support all characters present, to prevent truncation and misinterpretation of characters. Migration to different operating systems with or without an encoding change make it impossible to use catalogs and existing data features such as indexes and constraints, unless the new operating system’s internal representation is adopted. UTF-8 encoding on the Linux OS is used by SAS® Drug Development and SAS® Clinical Trial Data Transparency Solutions to support these requirements. UTF-8 encoding correctly represents all characters found in all languages by using multiple bytes to represent some characters. This paper explains transcoding data into UTF-8 without data loss, writing programs which work correctly with MBCS data, and handling format catalogs and programs. INTRODUCTION The internal representation of data, formats, indexes, programs and catalogs collected for clinical studies is typically defined by the country or countries in which the study is conducted. The operating system on which the data was entered defines the internal data representation, such as Windows_64, HP_UX_64, Solaris_64, and Linux_X86_64.1 Although SAS Software can read SAS data sets copied from one operating system to another, it impossible to use existing data features such as indexes and constraints, or SAS catalogs without converting them to the new operating system’s data representation.
    [Show full text]