Bits, Bytes, and Integers Binary Representations Encoding Byte

Binary Representations Bits, Bytes, and Integers 0 1 0 3.3V CSci 2021: Machine Architecture and Organization Lectures #2-4, January 23rd-28th, 2015 2.8V Your instructor: Stephen McCamant 0.5V 0.0V Based on slides originally by: Randy Bryant, Dave O’Hallaron, Antonia Zhai 1 2 Encoding Byte Values Byte-Oriented Memory Organization Byte = 8 bits . Binary 000000002 to 111111112 0 0 0000 • • • . Decimal: 010 to 25510 1 1 0001 2 2 0010 . Hexadecimal 0016 to FF16 3 3 0011 4 4 0100 . Base 16 number representation Programs Refer to Virtual Addresses 5 5 0101 . Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’ 6 6 0110 . Conceptually very large array of bytes 7 7 0111 . Write FA1D37B16 in C as 8 8 1000 . Actually implemented with hierarchy of different memory types – 0xFA1D37B 9 9 1001 . System provides address space private to particular “process” A 10 1010 – 0xfa1d37b B 11 1011 . Program being executed C 12 1100 . Program can clobber its own data, but not that of others D 13 1101 E 14 1110 Compiler + Run-Time System Control Allocation F 15 1111 . Where different program objects should be stored . All allocation within single virtual address space 3 4 Machine Words Word-Oriented Memory Organization 32-bit 64-bit Bytes Addr. Machine Has “Word Size” Addresses Specify Byte Words Words . Nominal size of integer-valued data Locations 0000 Addr 0001 . Including addresses . Address of first byte in word = 0000?? 0002 . Most current machines use 32 bits (4 bytes) words . Addresses of successive words differ Addr 0003 . by 4 (32-bit) or 8 (64-bit) = Limits addresses to 4GB 0000?? 0004 . Becoming too small for memory-intensive applications Addr 0005 = . High-end systems use 64 bits (8 bytes) words 0004?? 0006 . Potential address space ≈ 1.8 X 1019 bytes 0007 0008 . x86-64 machines support 48-bit addresses: 256 Terabytes Addr 0009 . = Machines support multiple data formats 0008?? 0010 Addr . Fractions or multiples of word size = 0011 . Always integral number of bytes 0008?? 0012 Addr 0013 = 0012?? 0014 0015 5 6 1 Data Representations Byte Ordering How should bytes within a multi-byte word be ordered in C Data Type Typical 32-bit Intel IA32 x86-64 memory? char 1 1 1 Conventions short 2 2 2 . Big Endian: Sun, PPC Mac, Internet convention int 4 4 4 . Least significant byte has highest address long 4 4 8 . Little Endian: x86, VAX . Least significant byte has lowest address long long 8 8 8 float 4 4 4 double 8 8 8 long double 8 10/12 10/16 pointer 4 4 8 7 8 Byte Ordering Example Reading Byte-Reversed Listings Big Endian Disassembly . Least significant byte has highest address . Text representation of binary machine code Little Endian . Generated by program that reads the machine code . Least significant byte has lowest address Example Fragment Example Address Instruction Code Assembly Rendition . Variable x has 4-byte representation 0x01234567 8048365: 5b pop %ebx 8048366: 81 c3 ab 12 00 00 add $0x12ab,%ebx . Address given by &x is 0x100 804836c: 83 bb 28 00 00 00 00 cmpl $0x0,0x28(%ebx) Deciphering Numbers Big Endian 0x100 0x101 0x102 0x103 . Value: 0x12ab 01 23 45 67 . Pad to 32 bits: 0x000012ab Little Endian 0x100 0x101 0x102 0x103 . Split into bytes: 00 00 12 ab 67 45 23 01 . Reverse: ab 12 00 00 9 10 Examining Data Representations show_bytes Execution Example Code to Print Byte Representation of Data int a = 15213; . Casting pointer to unsigned char * creates byte array printf("int a = 15213;\n"); show_bytes((unsigned char *) &a, sizeof(int)); void show_bytes(unsigned char *start, int len){ int i; for (i = 0; i < len; i++) printf(”%p\t0x%.2x\n",start+i, start[i]); Result (Linux): printf("\n"); } int a = 15213; 0xbffffcb8 0x6d 0xbffffcb9 0x3b 0xbffffcba 0x00 Printf directives: %p: Print pointer 0xbffffcbb 0x00 %x: Print Hexadecimal 11 12 2 Decimal: 15213 Representing Integers Binary: 0011 1011 0110 1101 Representing Pointers Hex: 3 B 6 D int B = -15213; int *P = &B; int A = 15213; long int C = 15213; IA32, x86-64 Sun Sun IA32 x86-64 IA32 x86-64 Sun 6D 00 EF D4 0C 6D 6D 00 3B 00 FF F8 89 3B 3B 00 00 3B FB FF EC 00 00 3B 00 6D 00 00 6D 2C BF FF 00 FF int B = -15213; 00 7F 00 IA32, x86-64 Sun 00 00 93 FF 00 C4 FF FF C4 FF 93 Two’s complement representation Different compilers & machines assign different locations to objects (Covered later) 13 14 Representing Strings Aside: ASCII table char S[6] = "18243"; Strings in C . Represented by array of characters 0 1 2 3 4 5 6 7 8 9 a b c d e f . Each character encoded in ASCII format Linux/Alpha Sun 0x0_ \0 Â ^B ^C ^D Ê ^F ^G ^H \t \n ^K ^L ^M ^N Ô . Standard 7-bit encoding of character set 31 31 0x1_ ^P ^Q ^R ^S ^T Û ^V ^W ^X ^Y ^Z ESC FS GS RS US . Character “0” has code 0x30 38 38 0x2_ SPC ! " # $ % & ' ( ) * + , - . / – Digit i has code 0x30+i 32 32 0x3_ 0 1 2 3 4 5 6 7 8 9 : ; < = > ? . String should be null-terminated 34 34 0x4_ @ A B C D E F G H I J K L M N O . Final character = 0 33 33 0x5_ P Q R S T U V W X Y Z [ \ ] ^ _ Compatibility 00 00 0x6_ ` a b c d e f g h I j k l m n o . Byte ordering not an issue 0x7_ p q r s t u v w x y z { | } ~ DEL 15 16 Today: Bits, Bytes, and Integers Homework turn-in process Representing information as bits For full credit: turn in at the beginning of class on the due (Logistics interlude) date . On-time = 3:35pm, or when I start lecturing, whichever is later Bit-level manipulations . Yes, this means you have to come to class on time (that day) Integers We strongly recommend typing your assignments on a . Representation: unsigned and signed computer, not hand-writing . Conversion, casting . Expanding, truncating Late submissions only will be online using the Moodle . Addition, negation, multiplication, shifting Do not turn in paper assignments at other times Summary . This helps us stay organized 17 18 3 2021-dedicated VMs now available Boolean Algebra SSH into: xA-B.cselabs.umn.edu Developed by George Boole in 19th Century . Where A is 21, 22, or 23 . Algebraic representation of logic . And B is 01, 02, 03, 04, or 05 . Encode “True” as 1 and “False” as 0 . E.g., x22-02.cselabs.umn.edu And (math: ∧) Or (math: ∨) 32-bit version of Ubuntu Linux version 14.04 A&B = 1 when both A=1 and B=1 A|B = 1 when either A=1 or B=1 Do not run graphical programs (Firefox, etc.) on these machines (it would be slow anyway) If you prefer to use other CSE Labs Linux machines, give the -m32 option to GCC to get 32-bit binaries Not (math: ¬) Exclusive-Or “xor” (math: ⊕) ~A = 1 when A=0 A^B = 1 when either A=1 or B=1, but not both 19 20 Application of Boolean Algebra General Boolean Algebras Applied to Digital Systems by Claude Shannon Operate on Bit Vectors . 1937 MIT Master’s Thesis . Operations applied bitwise . Reason about networks of relay switches 01101001 01101001 01101001 . Encode closed switch as 1, open switch as 0 & 01010101 | 01010101 ^ 01010101 ~ 01010101 01000001 01111101 00111100 10101010 A&~B Connection when All of the Properties of Boolean Algebra Apply A ~B A&~B | ~A&B ~A B ~A&B = A^B 21 22 Representing & Manipulating Sets Bit-Level Operations in C Representation Operations &, |, ~, ^ Available in C . Width w bit vector represents subsets of {0, …, w–1} . Apply to any “integral” data type . aj = 1 if j ∈ A long, int, short, char, unsigned . View arguments as bit vectors . 01101001 { 0, 3, 5, 6 } . Arguments applied bit-wise . 76543210 Examples (Char data type) . ~0x41 → 0xBE . 01010101 { 0, 2, 4, 6 } . ~010000012 → 101111102 . 76543210 . ~0x00 → 0xFF . ~000000002 → 111111112 Operations . 0x69 & 0x55 → 0x41 . & Intersection 01000001 { 0, 6 } . 011010012 & 010101012 → 010000012 . | Union 01111101 { 0, 2, 3, 4, 5, 6 } . 0x69 | 0x55 → 0x7D . ^ Symmetric difference 00111100 { 2, 3, 4, 5 } . 011010012 | 010101012 → 011111012 . ~ Complement 10101010 { 1, 3, 5, 7 } 23 24 4 Contrast: Logic Operations in C Shift Operations Contrast to Logical Operators Left Shift: x << y Argument x 01100010 . &&, ||, ! . Shift bit-vector x left y positions << 3 00010000 . View 0 as “False” – Throw away extra bits on left Log. >> 2 00011000 . Anything nonzero as “True” . Fill with 0’s on right Arith. >> 2 00011000 . Always return 0 or 1 Right Shift: x >> y . Early termination (AKA “short-circuit evaluation”) . Shift bit-vector x right y positions Examples (char data type) . Throw away extra bits on right Argument x 10100010 . !0x41 → 0x00 . Logical shift << 3 00010000 . !0x00 → 0x01 . Fill with 0’s on left Log. >> 2 00101000 . !!0x41 → 0x01 . Arithmetic shift Arith. >> 2 11101000 . 0x69 && 0x55 → 0x01 . Replicate most significant bit on right . 0x69 || 0x55 → 0x01 Undefined Behavior . p && *p (avoids null pointer access) . Shift amount < 0 or ≥ word size 25 26 Exercise break: flip case Today: Bits, Bytes, and Integers /* Convert lowercase to uppercase and vice-versa, Representing information as bits return any other characters unchanged */ char flip_case(char c){ Bit-level manipulations if (c >= 'A' && c <= 'Z') { /* 0x41 through 0x5A */ Integers return _______________;c | 0x20 } else if (c >= 'a' && c <= 'z') { . Representation: unsigned and signed /* 0x61 through 0x7A */ .

Load more