CSC 2400: Computer Systems Data Representation Computers and Programs
• A computer is basically a processor (CPU) interacting with memory
• Your program (executable) must first be loaded into memory before it can start executing

[Figure: CPU and Memory connected by a BUS (Control, Data); Your program is loaded from Disk into Memory]

Memory: Array of Bytes

• Memory is basically an array of bytes, each with its own address
• Memory addresses are defined using unsigned binary integers

Memory: Array of Words

• A word is a group of bytes handled as a unit by the CPU
  - tied to the CPU architecture
  - natural storage size for numbers
• Word address
  - address of first byte in word
  - addresses of successive words differ by 4 (32-bit) or 8 (64-bit)

[Figure: bytes at addresses 0000 through 0015, grouped into 32-bit words (Addr = 0000, 0004, 0008, 0012) and 64-bit words (Addr = 0000, 0008)]

Memory and Variables

• What happens when you declare a variable?
  - The compiler allocates a memory box for that variable
  - How big a box?
    o Depends on the type of the variable

    char c = 'A';

    Memory Address    Memory Value
    0016              01000001

One Annoying Thing: Byte Order

• Hosts differ in how they store data
  - E.g., four-byte number (byte3, byte2, byte1, byte0)
• Little endian ("little end comes first") ← Intel PCs!!!
  - Low-order byte stored at the lowest memory location
  - Byte0, byte1, byte2, byte3
• Big endian ("big end comes first")
  - High-order byte stored at the lowest memory location
  - Byte3, byte2, byte1, byte0
• Makes it more difficult to write portable code
  - Client may be big or little endian machine
  - Server may be big or little endian machine

Memory and Variables (contd.)

    int i = 258;        00000000 00000000 00000001 00000010

Memory view:

    Memory Address    Memory Value (BIG ENDIAN)    Memory Value (LITTLE ENDIAN)
    0020              00000000                     00000010
    0021              00000000                     00000001
    0022              00000001                     00000000
    0023              00000010                     00000000

BIG ENDIAN: least significant byte at higher address
LITTLE ENDIAN: least significant byte at lower address

Memory and Variables (contd.)
    float f = 0.1;      00111101 11001100 11001100 11001101

Memory view:

    Address    Value (BIG ENDIAN)    Value (LITTLE ENDIAN)
    0020       00111101              11001101
    0021       11001100              11001100
    0022       11001100              11001100
    0023       11001101              00111101

BIG ENDIAN: least significant byte at higher address
LITTLE ENDIAN: least significant byte at lower address

Data Representations

• Sizes of C Data Types (in bytes)

    C Data Type    Sparc    Typical 32-bit    Intel IA32
    int            4        4                 4
    long int       8        4                 4
    char           1        1                 1
    short          2        2                 2
    float          4        4                 4
    double         8        8                 8
    long double    8        8                 10/12
    void *         8        4                 4

The sizeof Operator

    Category    Operators
    sizeof      sizeof(type)  sizeof(expr)

• Unique among operators: evaluated at compile-time
• Evaluates to type size_t; on many 32-bit systems, same as unsigned int
• Examples

    int i = 10;
    double d = 100.0;
    …
    … sizeof(int) …         /* On matrix, evaluates to 4 */
    … sizeof(i) …           /* On matrix, evaluates to 4 */
    … sizeof(double) …      /* On matrix, evaluates to 8 */
    … sizeof(d) …           /* On matrix, evaluates to 8 */
    … sizeof(d + 200.0) …   /* On matrix, evaluates to 8 */

Determining Data Sizes

• Program to determine data sizes on your computer

    #include <stdio.h>
    int main() {
        printf("char: %d\n", (int)sizeof(char));
        printf("short: %d\n", (int)sizeof(short));
        printf("int: %d\n", (int)sizeof(int));
        printf("long: %d\n", (int)sizeof(long));
        printf("float: %d\n", _________________);
        printf("double: %d\n", _________________);
        printf("long double: %d\n", _________________);
        return 0;
    }

• Output on matrix

    char: 1
    short: 2
    int: 4
    long: 4
    float: 4
    double: 8
    long double: 16

Limits of the Machine: Overflow

Overflow: Running Out of Room

• Adding two large integers together
  - Sum might be too large to store in available bits
  - What happens?
      01000  (8)          11000  (-8)
    + 01001  (9)        + 10111  (-9)
      -----               -----
      10001  (-15)        01111  (+15)

    Assuming 5-bit 2’s complement numbers

• We have overflow if:
  - signs of both operands are the same, and
  - sign of sum is different

Overflow

• Unsigned integers
  - All arithmetic is "modulo" arithmetic
  - Sum would just wrap around
• Signed integers
  - Can get nonsense values
  - Example with 16-bit integers (short datatype)
    o Sum: 10000+20000+30000
    o Result: -5536

Try It Out

• Write a program that computes the sum 10000+20000+30000
  Use only short variables in your code:

    short a = 10000;
    short b = 20000;
    short c = 30000;
    short sum = a + b + c;
    printf("%d, %d, %d, sum = %d\n", a, b, c, sum);

Exercise

• Assume a 4-bit two’s complement representation for integer variables
• Compute the value of the expression 7 + 7

Casting Signed to Unsigned

• C allows conversions from signed to unsigned

    short x = 5;
    unsigned short ux = (unsigned short) x;
    short y = -5;
    unsigned short uy = (unsigned short) y;

• Memory allocation:

    x     00000000 00000101
    ux    00000000 00000101
    y     11111111 11111011
    uy    11111111 11111011

• Resulting value
  - No change in bit representation
  - Nonnegative values unchanged (ux = 5)
  - Negative values change into (large) positive values (uy = 65531)

Exercise

• Assume a 5-bit two’s complement representation for int variables
• What is the output of the following piece of code?

    int x = 8;
    unsigned int ux = (unsigned int) x;
    int y = -8;
    unsigned int uy = (unsigned int) y;
    printf("%d %d %d %d\n", x, ux, y, uy);

Try It Out

• C code:

    char a = 0xFF;
    unsigned char b = 0xFF;
    printf("a = %d\n", a);
    printf("b = %d\n", b);

Int to Char? Try It Out …

    #include <stdio.h>
    int main() {
        char c = 0x81;
        int i;
        i = c;
        printf(" integer = %x\n character = %x\n", i, c);
        i = 0x87654321;
        c = i;
        printf(" integer = %d\n character code = %d\n", i, c);
        return 0;
    }

[Figure: memory boxes for c (10000001) and i (blank)]

C vs. Java: Cast Conversions

• Java: demotions are not automatic
  C: demotions are automatic

    int i;
    char c;
    …
    i = c;          /* Implicit promotion */
                    /* Sign extension in Java and C */
    c = i;          /* Implicit demotion */
                    /* Java: Compile-time error */
                    /* C: OK; truncation */
    c = (char)i;    /* Explicit demotion */
                    /* Truncation in Java and C */

Floating-Point to Int

• C guarantees two levels

    float     single precision (32 bits)
    double    double precision (64 bits)

• Conversions
  - Casting between int, float, and double changes bit values

    int i = 0x800002F1;       i    10000000 00000000 00000010 11110001
    float f = (float) i;      f    01001111 00000000 00000000 00000011

  - int to float
    o Round according to rounding mode
  - int to double
    o Exact conversion, as long as int has ≤ 53 bit word size

Int to Float Rounding: Try It Out

    int i = 0x800002F1;
    float f = (float)i;
    printf("%x\n", i);
    i = (int)f;
    printf("%x\n", i);

Why is this important?

Ariane 5!
  - June 4, 1996
  - Exploded 37 seconds after liftoff
  - Cargo worth $500 million

• Why
  - Computed horizontal velocity as 64-bit floating point number
  - Converted to 16-bit integer
  - Worked OK for Ariane 4
  - Overflowed for Ariane 5
    o Used same software

What did we learn?

• Memory is bytes, words
• Datatype size (in bytes) is machine-dependent
• Byte ordering (little endian, big endian)
• Limits of machine, overflow
• Cast conversions in C.