The ASCII Character Set
Appendix: The ASCII Character Set

The various sorting, text processing, and character handling operations for which Unix utilities are available assume that characters are ordered in the sequence prescribed by the ASCII character set. An ASCII (American Standard Code for Information Interchange) character is defined as a string of 7 bits which represents a printable character or a nonprintable control function. For example, 1 101 000 represents the printable character h, while 0 001 000 represents the backspace function. For ease in reading, the binary digits are often written in groups of three, and even more frequently each group of three is given its natural numerical interpretation, that is, an octal representation is used. Thus 1 101 000 is normally written as 150, the groups 101 and 000 having been interpreted as the octal numbers 5 and 0, respectively. Decimal and hexadecimal representations, in which 1 101 000 appears as 104 or 68, are also commonly used.

Since an ASCII character is exactly seven bits long, 128 distinct characters can be formed. These are given in Table A.1, The ASCII Character Set. Perhaps surprisingly, there are some installation-dependent differences between printed symbols and the corresponding ASCII bit configurations. For example, 043 (0 100 011) is rendered as the crosshatched "pounds" sign in American practice and as the "pounds sterling" symbol on many printers in Britain. Some other British printers and terminals substitute the "pounds sterling" sign for the dollar sign. There is no ambiguity, however, about the alphabetic and numeric characters, nor about the mathematical operators.

TABLE A.1.
The ASCII Character Set

Nonprinting        Printable characters
000  nul  ^@       040         100  @      140  `
001  soh  ^A       041  !      101  A      141  a
002  stx  ^B       042  "      102  B      142  b
003  etx  ^C       043  #      103  C      143  c
004  eot  ^D       044  $      104  D      144  d
005  enq  ^E       045  %      105  E      145  e
006  ack  ^F       046  &      106  F      146  f
007  bel  ^G       047  '      107  G      147  g
010  bs   ^H       050  (      110  H      150  h
011  ht   ^I       051  )      111  I      151  i
012  nl   ^J       052  *      112  J      152  j
013  vt   ^K       053  +      113  K      153  k
014  np   ^L       054  ,      114  L      154  l
015  cr   ^M       055  -      115  M      155  m
016  so   ^N       056  .      116  N      156  n
017  si   ^O       057  /      117  O      157  o
020  dle  ^P       060  0      120  P      160  p
021  dc1  ^Q       061  1      121  Q      161  q
022  dc2  ^R       062  2      122  R      162  r
023  dc3  ^S       063  3      123  S      163  s
024  dc4  ^T       064  4      124  T      164  t
025  nak  ^U       065  5      125  U      165  u
026  syn  ^V       066  6      126  V      166  v
027  etb  ^W       067  7      127  W      167  w
030  can  ^X       070  8      130  X      170  x
031  em   ^Y       071  9      131  Y      171  y
032  sub  ^Z       072  :      132  Z      172  z
033  esc  ^[       073  ;      133  [      173  {
034  fs   ^\       074  <      134  \      174  |
035  gs   ^]       075  =      135  ]      175  }
036  rs   ^^       076  >      136  ^      176  ~
037  us   ^_       077  ?      137  _      177  del

Character Names

Perhaps surprisingly, the names of all the ASCII characters are not well known. The letters and numerals are called by their natural names, of course; but that only accounts for 62 of the 128 characters. Of the remainder, about half are not printable in the sense that they do not correspond to any prescribed pattern of ink on paper. The first 32 characters (octal 000 to 037) and the last entry in the table (octal 177) are nonprinting control functions; the remainder are special characters (punctuation marks and the like).

The punctuation marks and other special characters in the ASCII set have clearly defined names (as laid out in the relevant standard) and are popularly called by various additional names. Names of the special characters, officially correct as well as popular, are shown in Table A.2, Special Characters.

TABLE A.2. Special characters in the ASCII set

Oct  Mark  Name                  Alternative or popular names
040        space                 blank
041  !     exclamation point     exclamation mark
042  "     quotation mark        double quote
043  #     number sign           pound sign
044  $     dollar sign           currency symbol
045  %     percent sign
046  &     ampersand
047  '     apostrophe            closing single quote, solidus
050  (     opening parenthesis   left parenthesis
051  )     closing parenthesis   right parenthesis
052  *     asterisk              star
053  +     plus
054  ,     comma
055  -     hyphen                minus, dash
056  .     period                decimal point, dot
057  /     slant                 slash, virgule, oblique stroke
072  :     colon
073  ;     semicolon
074  <     less than             left-arrow, left angle bracket
075  =     equals                equal sign
076  >     greater than          right-arrow, right angle bracket
077  ?     question mark
100  @     commercial at         at
133  [     opening bracket       left [square] bracket
134  \     reverse slant         backslash, reverse slash
135  ]     closing bracket       right [square] bracket
136  ^     circumflex            caret, up-arrow
137  _     underline             underscore
140  `     opening single quote  [single] back quote
173  {     opening brace         opening (left) curly bracket
174  |     vertical line         logical or, vertical rule
175  }     closing brace         closing (right) curly bracket
176  ~     tilde

The nonprinting characters have standard names also, but they are frequently called control characters, because they are formed by pressing a key while the CONTROL key is held down. For example, the backspace character (octal 010) is called backspace (bs) or control-H.
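The notations and character classes described in this appendix can be checked with a short Python sketch. The function names (representations, control_code, classify) are illustrative assumptions, not part of any Unix utility. The control-key rule used here, keeping only the low five bits of the key's code, is the conventional terminal behavior and is assumed rather than stated in the text.

```python
def representations(ch):
    """Return the 7-bit pattern (grouped 1+3+3), octal, decimal and hex forms."""
    code = ord(ch)
    if code > 0o177:
        raise ValueError("not a 7-bit ASCII character")
    bits = format(code, "07b")
    grouped = f"{bits[0]} {bits[1:4]} {bits[4:7]}"
    return grouped, format(code, "03o"), str(code), format(code, "02x")


def control_code(key):
    """Code sent by CONTROL plus a key (assumed rule: keep the low five bits)."""
    return ord(key.upper()) & 0o37


def classify(code):
    """Place a 7-bit code in one of the three classes used in this appendix."""
    if not 0 <= code <= 0o177:
        raise ValueError("not a 7-bit ASCII code")
    if code < 0o40 or code == 0o177:
        return "control"
    if chr(code).isalnum():
        return "alphanumeric"
    return "special"


if __name__ == "__main__":
    print(representations("h"))              # ('1 101 000', '150', '104', '68')
    print(format(control_code("H"), "03o"))  # 010, the backspace character
    counts = {}
    for code in range(128):
        kind = classify(code)
        counts[kind] = counts.get(kind, 0) + 1
    print(counts)
```

Counting the classes gives 33 control functions, 62 letters and numerals, and 33 special characters, which matches the statement that about half of the non-alphanumeric characters are nonprinting.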
In the relevant ANSI standards documents, the control characters are given the following names:

^@  nul  null                     ^P  dle  data link escape
^A  soh  start of heading         ^Q  dc1  device control 1
^B  stx  start of text            ^R  dc2  device control 2
^C  etx  end of text              ^S  dc3  device control 3
^D  eot  end of transmission      ^T  dc4  device control 4
^E  enq  enquiry                  ^U  nak  negative acknowledge
^F  ack  acknowledge              ^V  syn  synchronous idle
^G  bel  bell                     ^W  etb  end transmission block
^H  bs   backspace                ^X  can  cancel
^I  ht   horizontal tabulation    ^Y  em   end of medium
^J  lf   line feed                ^Z  sub  substitute
^K  vt   vertical tabulation      ^[  esc  escape
^L  ff   form feed                ^\  fs   file separator
^M  cr   carriage return          ^]  gs   group separator
^N  so   shift out                ^^  rs   record separator
^O  si   shift in                 ^_  us   unit separator

The names given here are in accordance with the ANSI standard. They differ from names as used by the Unix community in four cases. The control-J character is usually called newline (abbreviated nl) by Unix programmers; the control-L character is sometimes called newpage (np). Control-Q and control-S (dc1 and dc3) are often called X-on and X-off, respectively, because they control transmission of characters to the terminal screen.
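As a minimal sketch of how the control-character names might be looked up in a program, the following Python fragment maps a control code to its control-key notation, its ANSI abbreviation, and the Unix name mentioned above where one exists. The names ANSI_NAMES, UNIX_ALIASES, caret, and describe are hypothetical, chosen only for this illustration. The control-key form is computed by adding octal 100 to the code, which reproduces the ^@ to ^_ sequence.

```python
# ANSI abbreviations for codes 000 to 037 (octal), in code order.
ANSI_NAMES = [
    "nul", "soh", "stx", "etx", "eot", "enq", "ack", "bel",
    "bs", "ht", "lf", "vt", "ff", "cr", "so", "si",
    "dle", "dc1", "dc2", "dc3", "dc4", "nak", "syn", "etb",
    "can", "em", "sub", "esc", "fs", "gs", "rs", "us",
]

# The four cases in which Unix usage differs from the ANSI names.
UNIX_ALIASES = {
    0o12: "newline",
    0o14: "newpage",
    0o21: "X-on",
    0o23: "X-off",
}


def caret(code):
    """Return the control-key notation (for example ^H) for codes 000 to 037."""
    if not 0 <= code < 0o40:
        raise ValueError("not one of the first 32 control codes")
    return "^" + chr(code + 0o100)


def describe(code):
    """Return a one-line description: octal code, control key, ANSI name, Unix name."""
    base = f"{code:03o}  {caret(code)}  {ANSI_NAMES[code]}"
    alias = UNIX_ALIASES.get(code)
    return f"{base}  (Unix: {alias})" if alias else base


if __name__ == "__main__":
    for code in (0o10, 0o12, 0o14, 0o21, 0o23):
        print(describe(code))
```

Octal 177 (del) is deliberately left out of this sketch, since the list of names above covers only the first 32 codes.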