Bits and Bit Sequences Integers
Total Page:16
File Type:pdf, Size:1020Kb
Data ● (Some repeating CS1083, ECE course) ● bits and bit sequences ● integers (signed and unsigned) ● bit vectors ● strings and characters ● floating point numbers ● hexadecimal and octal notations Bits and Bit Sequences ● Fundamentally, we have the binary digit, 0 or 1. ● More interesting forms of data can be encoded into a bit sequence. ● 00100 = “drop the secret package by the park entrance” 00111 = “Keel Meester Bond” ● A given bit sequence has no meaning unless you know how it has been encoded. ● Common things to encode: integers, doubles, chars. And machine instructions. Encoding things in bit sequences (From textbook) ● Floats ● Machine Instructions How Many Bit Patterns? ● With k bits, you can have 2k different patterns ● 00..00, 00..01, 00..10, … , 11..10, 11..11 ● Remember this! It explains much... ● E.g., if you represent numbers with 8 bits, you can represent only 256 different numbers. Names for Groups of Bits ● nibble or nybble: 4 bits ● octet: 8 bits, always. Seems pedantic. ● byte: 8 bits except with some legacy systems. In this course, byte == octet. ● after that, it gets fuzzy (platform dependent). For 32-bit ARM, – halfword: 16 bits – word: 32 bits Unsigned Binary (review) ● We can encode non-negative integers in unsigned binary. (base 2) ● 10110 = 1*24 + 0*23 + 1*22 + 1*21 +1*20 represents the mathematical concept of “twenty-two”. In decimal, this same concept is written as 22 = 2*101 + 2*100. ● Converting binary to decimal is just a matter of adding up powers of 2, and writing the result in decimal. ● Going from decimal to binary is trickier. Division Method (decimal → binary) ● Repeatedly divide by 2. Record remainders as you do this. ● Stop when you hit zero. ● Write down the remainders (left to right), starting with the most recent remainder. Subtract-powers method ● Find the largest power of 2, say 2p, that is not larger than N (your number). The binary number has a 1 in the 2p's position. ● Then similarly encode N-2p. ● Eg, 22 has a 1 in the 16's position 22-16=6, which has a 1 in the 4's position 6-4 = 2, which has a 1 in the 2's position 2-2=0, so we can stop.... Adding in Unsigned Binary ● Just like grade school, except your addition table is really easy: ● No carry in: 0+0=0 (no carry out) 0+1= 1+0 = 1 (no carry out) 1+1= 0 (with carry out) ● Have carry in: 0+0=1 (no carry out) 0+1 = 1+0 = 0 (with carry out) 1+1 = 1 (with carry out) Fixed-Width Binary Integers ● Inside the computer, we work with fixed-width values. ● Eg, an instruction might add together two 16-bit unsigned binary values and compute a 16-bit result. ● Hope your result doesn't exceed 65535 = 216-1. Otherwise, you have overflow. Can be detected by a carry from the leftmost stage. ● If a result would really doesn't need all 16 bits, a process called zero-extension just prepends the required number of zeros. ● 10111 becomes 0000000000010111. ● Mathematically, these bit strings both represent the same number. Signed Numbers ● A signed number can be positive, negative or zero. ● An unsigned number can be positive or zero. ● Note: “signed number” does NOT necessarily mean “negative number”. In Java, ints are signed numbers. Can they be positive?? Can they be negative?? Some Ways to Encode Signed Numbers ● All assume fixed width; examples below for 4 bits ● Sign/magnitude: first bit = 1 iff number -ve Remaining bits are the magnitude, unsigned binary Ex: 1010 means -2 ● Biased: Store X+bias in unsigned binary Ex: 0110 means -2 if the bias is 8. (8+(-2) = 6) ● Two's complement: Sign bit's weight is the negative of what it'd be for an unsigned number Ex: 1110 means -2: -8+4+2 = -2 ● You can generally assume 2's complement... Why 2's Complement? ● There is only one representation of 0. (Other representations have -0 and +0.) ● To add 2's complement numbers, you use exactly the same steps as unsigned binary. ● There is still a “sign bit” - easy to spot negative numbers ● You get one more number (but it's -ve) Range of N bits 2's complement: -2N-1 to +2N-1-1 2's Complement Tricks ● +ve numbers are exactly as in unsigned binary ● Given a 2's complement number X (where X may be -ve, +ve or zero), compute -X using the twos complementation algorithm (“flip and increment”) ● Flip all bits (0s become 1s, 1s become zeros) ● Add 1, using the unsigned binary algorithm ● Ex: 00101 = +5 In 5 bit 2's complement 11010 + 1 → 11011 is -5 in 2's complement ● And Flip(-5)=00100. 00100+1 back to +5 Converting a 2's complement number X to decimal ● Determine whether X is -ve (inspect sign bit) ● If so, use the flip-and-increment to compute -X Pretend you have unsigned binary. Slap a negative sign in front. ● If number is +ve, just treat it like unsigned binary. Sign extension ● Recall zero-extension is to slap extra leading zeros onto a number. ● Eg: 5 bit 2's compl. to 7 bit: 10101 → 0010101 Oops: -11 turns into +21. Zero extension didn't preserve numeric value. ● The sign-extension operation is to slap extra copies of the leading bit onto a number ● +ve numbers are just zero extended ● But for -11: 10101 → 1110101 (stays -11) Overflow for 2's complement ● Although addition algorithm is same for fixed-width unsigned, the conditions under which overflow occurs are different. ● If A and B are both same sign (eg, both +ve), then if A+B is the opposite sign, something bad happened (overflow) ● Overflow always causes this. And if this does not happen, there is no overflow. ● Eg, 1001 + 1001 →0010 but -7 + -7 isn't +2. Note that -14 cannot be encoded in 4-bit 2's complement. Numbering Bits ● On paper, we often write bit positions above the actual data bits. ● 543210 ← normally in a smaller font than this 001010 bits 3 and 1 are ones. ● Sometimes we like to write bits left to right, and other times, right to left (which is more number-ish). We usually start numbering at zero. ● Inside computer, how we choose to draw on paper is irrelevant. ● Computer architecture defines the word size (usually 32 or 64). Usually viewed as the largest fixed-width size that the computer can handle, at maximum speed, with most math operations. ● So bit positions would be numbered 0 to 31 for a 32-bit architecture. More Arithmetic in 2's complement ● Subtract: To calculate A-B, you can use A + (-B) Most CPUs have a subtract operation to do this for you. ● Multiplication: easiest in unsigned. (Most CPUs have instr.) ● D.I.Y. unsigned multiplication is like Grade 3: But your times table is the Boolean AND !! The product of 2 N-bit numbers may need 2N bits ● For 2's complement, the 2 inputs' signs determine the product's sign. eg, -ve * -ve → +ve ● And you can multiply the positive versions of the two numbers. Finally, correct for sign. Bit Vectors (aka Bitvectors) ● Sometimes we like to view a sequence of bits as an array (of Booleans) ● Eg hasOfficeHours[x] for 1 <= x <= 31 says whether I hold office hours on the xth of this month. ● And isTuesday[x] says whether the xth is a Tuesday. ● So what if you want to find a Tuesday when I hold office hours? Bitwise Operations for Bit Vectors ● Bitwise AND of B1 and B2: Bit k of the result is 1 iff bit k of both B1 and B2 is 1. ● Java supports bitwise operations on longs, ints int b1 = 6, b2 = 12; // 0b110, 0b1100 int result = b1 & b2; // = 4 or 0b100 ● Bitwise NOT (~ in Java) ● Bitwise OR ( | in Java) ● Bitwise Exclusive Or ( ^ in Java) Also write “XOR” or “EOR”. ● Pretty well every ISA will support these operations directly. Find First Set ● Some ISAs have a Find First Set instruction. (You've got a bitvector marking the Tuesdays when I have office hours – but now you want to find the first such day.) ● Integer.numberOfTrailingZeros() in Java achieves this. ● So use Integer.numberOfTrailingZeros(hasOfficeHours & isTuesday) Bit Masking ● Think about painting and masking tape. You can put a piece of tape on an object, paint it, then peel off the tape. Area under the tape has been protected from painting. ● We can do the same when we want to “paint” a bit vector with zeros, except in certain positions. ● Eg, I decide to cancel my office hours except for the first 10 days of the month. ● Or we can protect positions against painting with ones. ● Details next... Bit Masking with AND ● AND(x,0) = 0 for both Boolean values of x ● AND(x,1) = x for both Boolean values of x ● bitwise AND wants to paint bits 0, except where the mask protects (1 protects) ● hasOfficeHours & 0b1111111111 is a bitvector that keeps my office hours for the first 10 days (only). Later in month, all days are painted false. ● hasOfficeHours &= 0b1111111111 modifies hasOfficeHours. By analogy to the += operator you may already love. ● The value 0b111111111 is being used as a mask. ● Quiz: what does hasOfficeHours & ~0b1111111111 do? Bit Masking with OR ● OR(x,1) = 1 for both Boolean values x ● OR(x,0) = x for both Boolean values x ● bitwise OR wants to paint bits with 1s, except where the mask prevents it. A 0 prevents painting. ● hasOfficeHours | 0b1111111111 is a bitvector where I have made sure to hold office hours on each of the first 10 days (and left things alone for the rest of the month) ● hasOfficeHours |= 0b1111111111 makes it permanent.