Other Encodings

11/13/2019

ASCII, Unicode, BCD and EBCDIC

ASCII

  • American Standard Code for Information Interchange
  • Representation of printable (and related) characters as bit patterns
  • Basic ASCII is a 7-bit code
    - 8th bit used for parity (primitive error checking)
  • Extended ASCII: ISO8859-1, CP437, etc.
    - extensions of 7-bit ASCII to 8 bits
    - include graphics symbols, European characters
    - not consistent with one another
    - all include 7-bit ASCII as the first 128 characters

ASCII tables – see montcs.bloomu.edu/Information/Encoding/ASCII-EBCDIC.html

Range                              Hexadecimal range   Usage                                                 Examples
First 32 values                    0x00 - 0x1f         Control characters                                    Ctrl-D, '\n', Escape
Second 32 values                   0x20 - 0x3f         Punctuation, digits                                   (, ), 0..9, =
Third 32 values                    0x40 - 0x5f         Uppercase letters                                     A..Z, [, ], @
Fourth 32 values                   0x60 - 0x7f         Lowercase letters                                     a..z, {, }, ~
Last 128 values (Extended ASCII)   0x80 - 0xff         Various non-English characters, "ASCII graphics"      ¶, ü, ┝, ┤

[Figure: ISO8859-1, a.k.a. Latin-1]

[Figure: CodePage 437 – the IBM Character Set]

Unicode family

  • Multi-byte successor to ASCII
    - Represents alphabets by "code points"
    - UTF-8, other encodings of code points
      » 1 byte for ASCII codes
      » expands to 2, 3, or 4 bytes for other character sets
  • Support for many languages
    - Greek
    - Cyrillic
    - Arabic
    - Mandarin
    - Sanskrit
    - Kanji

[Figure: partial Unicode code table – see montcs.bloomu.edu/Information/Encodings/unicode.html]

Unicode and Emojis

  • Different representations for code points
  • More than 1700 emojis currently defined

UTF-8 Encodings

  • Unicode currently defines code points U+0000 through U+10ffff
    - somewhat over 1 million code points in 17 planes
  • UTF-8 uses up to four bytes to represent these code points
    - 5- and 6-byte encodings unneeded

Unicode and UTF-8 – A Few Example Alphabets

Each row gives the first and last byte sequence of the block, so the byte columns are read together (for example, Latin Extended-B runs from 0xc6 0x80 to 0xc9 0x8f).

Character Set         Code Points         First Byte    Second Byte?   Third Byte?   Fourth Byte?
Basic Latin (ASCII)   U+0000 - U+007f     0x00 - 0x7f
Latin-1               U+0080 - U+00ff     0xc2 - 0xc3   0x80 - 0xbf
Latin Extended-A      U+0100 - U+017f     0xc4 - 0xc5   0x80 - 0xbf
Latin Extended-B      U+0180 - U+024f     0xc6 - 0xc9   0x80 - 0x8f
…                     U+0250 - U+036f     0xc9 - 0xcd   0x90 - 0xaf
Greek and Coptic      U+0370 - U+03ff     0xcd - 0xcf   0xb0 - 0xbf
…                     U+0400 - U+07ff     0xd0 - 0xdf   0x80 - 0xbf
Samaritan             U+0800 - U+083f     0xe0          0xa0           0x80 - 0xbf
…                     U+0840 - U+10ffff   0xe0 - …      0xa1 - …       0x80 - …      ???

BCD

  • BCD (Binary Coded Decimal): scheme for encoding the 10 decimal digits
    - Bit patterns 1010 - 1111 unused
  • Two BCD digits per byte
    - examples: 13 = 0001 0011, 64 = 0110 0100
    - 00-99 range is smaller than the unsigned binary range of 0-255
  • Hardware support is more complicated
    - Addition and subtraction are full of "special cases" requiring additional circuitry

Bit Pattern   BCD   Unsigned Binary
0000          0     0
0001          1     1
0010          2     2
0011          3     3
0100          4     4
0101          5     5
0110          6     6
0111          7     7
1000          8     8
1001          9     9
1010          -     10
1011          -     11
1100          -     12
1101          -     13
1110          -     14
1111          -     15

[Figure: Using BCD numbers: a 1979 HP-9845 interfaces with a 1967 HP voltmeter]

EBCDIC

  • Extended Binary Coded Decimal Interchange Code
    - BCD is embedded within it
  • Alternative to ASCII
    - 8 bits
  • Created for IBM mainframes
    - suited to 80-column punched cards
    - support for business applications
  • Country-specific versions were not mutually consistent
  • Little used today

EBCDIC table – see montcs.bloomu.edu/Information/Encoding/ASCII-EBCDIC.html

[Figure: a punched card showing EBCDIC]
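Two of the schemes in these slides are easy to check by hand: the UTF-8 byte layout and the two-digits-per-byte BCD packing. The Python sketch below is only an illustration and not part of the original slides. The function names utf8_encode, to_packed_bcd and from_packed_bcd are hypothetical, and the UTF-8 routine follows the standard 1- to 4-byte bit layout rather than any method the slides specify.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 by hand (1 to 4 bytes)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside U+0000..U+10FFFF")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    if code_point <= 0x7F:
        # 1 byte: 0xxxxxxx (identical to 7-bit ASCII)
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def to_packed_bcd(n: int) -> int:
    """Pack a value 0..99 into one byte, one decimal digit per 4-bit nibble."""
    if not 0 <= n <= 99:
        raise ValueError("one BCD byte holds only 00..99")
    tens, ones = divmod(n, 10)
    return (tens << 4) | ones


def from_packed_bcd(b: int) -> int:
    """Unpack one BCD byte; nibbles 1010..1111 are rejected as unused."""
    if not 0 <= b <= 0xFF:
        raise ValueError("expected a single byte")
    tens, ones = b >> 4, b & 0x0F
    if tens > 9 or ones > 9:
        raise ValueError("nibble outside 0..9 is not valid BCD")
    return tens * 10 + ones


if __name__ == "__main__":
    print(utf8_encode(0x0370).hex(" "))      # cd b0  (first Greek and Coptic code point)
    print(utf8_encode(0x0800).hex(" "))      # e0 a0 80  (first Samaritan code point)
    print(format(to_packed_bcd(13), "08b"))  # 00010011
    print(format(to_packed_bcd(64), "08b"))  # 01100100
```

The hand-written encoder agrees with Python's built-in str.encode("utf-8") for the rows of the example-alphabets table, which makes it a convenient way to verify the first-byte and second-byte boundaries listed there.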
