General-Purpose Registers

• Eight 32-bit general-purpose registers (e.g., EAX) • Each lower-half can be addressed as a 16-bit register (e.g., AX) • Each 16-bit register can be addressed as two 8-bit registers (e.g., AH and AL) IA32 Instructions and 31 16 15 8 7 0 AH AL AX EAX: Accumulator for operands, results Assembler Directives BH BL BX EBX: Pointer to data in the DS segment CH CL CX ECX: Counter for string, loop operations CS 217 DH DL DX EDX: I/O pointer SI ESI: Pointer to DS data, string source DI EDI: Pointer to ES data, string destination BP EBP: Pointer to data on the stack SP ESP: Stack pointer (in the SS segment)

1 2

Segment Registers EIP Register

• IA32 memory is divided into segments, pointed by segment registers • Instruction Pointer or “Program Counter” • Modern operating systems and applications use the unsegmented • Software changes it by using memory mode: all the segment registers are loaded with the same ο segment selector so that all memory references a program makes are Unconditional jump ο to a single linear-address space. Conditional jump ο 232-1 Procedure call ο Return 15 0 CS: code segment register SS: stack segment register 4Gbytes DS: data segment register address space ES: data segment register FS: data segment register GS: data segment register 0

3 4

1 EFLAG Register The Six Flags

31 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 V V IO • ZF (): 1 if result is zero. 0 otherwise. Reserved (set to 0) I I I A V R 0 N P O D I T S Z 0 A 0 P 1 C D P F C M F T L F F F F F F F F F • SF (Sign Flag): Set equal to the most-significant bit of the Identification flag result. Sign bit for a signed integer. Virtual interrupt pending Virtual • CF (): Overflow condition for unsigned Alignment check arithmetic. Virtual 8086 mode Resume flag • OF (): Overflow condition for signed Nested task flag I/O privilege level arithmetic. Overflow flag • PF (): 1 if the least-significant byte of the result Interrupt enable flag contains an even number of ones. 0 otherwise. Sign flag • AF (Adjust Flag): 1 if the arithmetic operation generated a Zero flag Auxiliary carry flag or adjust flag carry/borrow out of bit 3 of the result. Used in Binary-Coded Parity flag Decimal (BCD) arithmetic. Carry flag 5 6

Other Registers Instruction

• Floating Point Unit (FPU) (x87) • Opcode ο Eight 80-bit registers (ST0, …, ST7) ο 16-bit control, status, tag registers ο What to do ο 11-bit opcode register ο 48-bit FPU instruction pointer, data pointer registers • Source operands • MMX ο Immediate: a constant embedded in the instruction itself ο Eight 64-bit registers ο Register • SSE and SSE2 ο Memory ο Eight 128-bit registers ο I/O port ο 32-bit MXCRS register • System • Destination operand ο I/O ports ο Register ο Control registers (CR0, …, CR4) ο ο Memory management registers (GDTR, IDTR, LDTR) Memory ο Debug registers (DR0, …, DR7) ο I/O port ο Machine specific registers ο Machine check registers ο Performance monitor registers

7 8

2 GCC Assembly Examples Accessing Memory

Syntax: Direct Addressing Mode Opcode Source, Destination movl 500, %ecx # Copy long-word at memory address 500 into register ECX movl $5, %eax # Move 32-bit “long-word” # “Little Endian:” byte 500 is least significant, byte 503 is # Copy constant value 5 into register EAX most significant # “Immediate” mode Indirect Addressing Mode movl %eax, %ebx movl (%eax), %edx # Copy contents of register EAX into register EBX # Copy long-word at memory pointed to by register EAX into # register EDX 9 10

Address Computation Examples

General form of memory addressing DISPLACEMENT (%Base, %Index, SCALE)

DISPLACEMENT (%Base, %Index, SCALE) • DISPLACEMENT (Direct Addressing Mode) ο movl foo, %ebx Constant Register Register {1, 2, 4 or 8} • Base (Indirect Addressing Mode) ο movl (%eax), %ebx

Address = DISPLACEMENT + %Base + (%Index * SCALE) • DISPLACEMENT + Base (Base Pointer Addressing Mode) ο movl 20(%ebp), %ebx ο movl foo(%eax), %ebx • DISPLACEMENT + (Index * SCALE) (Indexed Addressing Mode) • Not all four components are necessary ο movl foo(, %eax, 2), %ebx • DISPLACEMENT + Base + (Index * SCALE) ο movl foo(%eax, %ebx, 2), %ecx 11 12

3 Types of Instructions Data Transfer Instructions • Data transfer: copy data from source to destination • mov{b,w,l} source, dest 0 • Arithmetic: arithmetic on integers ο General move instruction • Floating point: x87 FPU move, arithmetic • push{w,l} source • Logic: bitwise logic operations pushl %ebx • Control transfer: conditional and unconditional jumps, procedure calls # equivalent instructions # subl $4, %esp • String: move, compare, input and output # movl %ebx, (%esp) • Flag control: Control fields in EFLAGS • pop{w,l} dest %ESP TOP • Segment register: Load far pointers for segment registers popl %ebx • SIMD # equivalent instructions ο MMX: integer SIMD instructions # movl (%esp), %ebx ο SSE: 32-bit and 64-bit floating point SIMD instructions ο SSE2: 128-bit integer and float point SIMD instructions # addl $4, %esp Stack • System • Many more in Intel manual (volume 2) BOTTOM ο Load special registers and set control registers (including halt) ο Type conversion, conditional move, exchange, compare and exchange, I/O port, string move, etc. 13 14

Bitwise Logic Instructions Arithmetic Instructions

• Simple instructions • Simple instructions ο and{b,w,l} source, dest dest = source & dest add{b,w,l} source, dest dest = source + dest ο sub{b,w,l} source, dest dest = dest – source or{b,w,l} source, dest dest = source | dest ο inc(b,w,l} dest dest = dest + 1 xor{b,w,l} source, dest dest = source ^ dest ο dec{b,w,l} dest dest = dest – 1 not{b,w,l} dest dest = ^dest ο neg(b,w,l} dest dest = ^dest ο sal{b,w,l} source, dest (arithmetic) dest = dest << source cmp{b,w,l} source1, source2 source2 – source1 sar{b,w,l} source, dest (arithmetic) dest = dest >> source • Multiply ο mul (unsigned) or imul (signed) • Many more in Intel Manual (volume 2) mul %ebx # edx:eax = eax * ebx ο Logic shift • Divide ο rotation shift ο div (unsigned) or idiv (signed) ο Bit scan idiv %ebx # eax = edx:eax / ebx ο Bit test # edx = edx:eax % ebx ο Byte set on conditions • Many more in Intel manual (volume 2) ο adc, sbb, decimal arithmetic instructions

15 16

4 Unsigned Integers: 4-bit Example Signed Integers 4-bit patterns Unsigned Ints Signed-Magnitude 0000 0 • Designate the left-most bit as the “sign-bit” 0001 1 1001 9 0010 2 0011 3 +1011 +11 • Interpret the rest of the bits as an unsigned number 0100 4 ------0101 5 • Example 10100 20 0110 6 • 0101 is +5 0111 7 1000 8 • 1101 is -5 1001 9 Overflow: when result does not fit 1010 10 in the given number of bits • Addition and subtraction complicated when signs differ 1011 11 1100 12 • Two representations of 0: 0000 and 1000 1101 13 1110 14 • Rarely used in practice 1111 15 17 18

Signed Integers Signed Integers 4-bit patterns Unsigned Ints Signed Magnitude One’s Complement 0000 0 +0 • For positive integer k, -k has representation (2n – 1) – k 0001 1 +1 0010 2 +2 • Example: 0101 is +5, 1010 is -5 0011 3 +3 0100 4 +4 • Negating k is same as flipping all the bits in k 0101 5 +5 0110 6 +6 • Left-most bit is the sign bit 0111 7 +7 • Addition and subtraction are easy 1000 8 -0 1001 9 -1 • For positive a and b, 1010 10 -2 n 1011 11 -3 a – b = a – b + 2 – 1 + 1 1100 12 -4 1101 13 -5 = a + b1C + 1 1110 14 -6 • Two representations of 0: 0000 and 1111 1111 15 -7 19 20

5 Signed Integers Signed Integers 4-bit patterns Unsigned Ints Signed One’s Magnitude Complement Two’s Complement 0000 0 +0 +0 • For positive integer k, -k has representation 2n – k 0001 1 +1 +1 0010 2 +2 +2 • Example: 0101 is +5, 1011 is -5 0011 3 +3 +3 0100 4 +4 +4 • Negating k is same as flipping all the bits in k and then adding 1 0101 5 +5 +5 0110 6 +6 +6 • Left-most bit is the sign bit 0111 7 +7 +7 • Addition and subtraction are easy 1000 8 -0 -7 1001 9 -1 -6 • For positive a and b, 1010 10 -2 -5 n 1011 11 -3 -4 a – b = a – b + 2 1100 12 -4 -3 1101 13 -5 -2 = a + b2C 1110 14 -6 -1 • Same hardware used for adding signed and unsigned bit- 1111 15 -7 -0 21 strings 22

Signed Integers Signed Integers 4-bit patterns Unsigned Ints Signed One’s Two’s 4-bit patterns Unsigned Ints Signed One’s Two’s Magnitude Complement Complement Magnitude Complement Complement

0000 0 +0 +0 +0 0000 0 +0 +0 +0 0001 1 +1 +1 +1 0001 1 +1 +1 +1 0010 2 +2 +2 +2 0010 2 +2 +2 +2 0011 3 +3 +3 +3 0011 3 +3 +3 +3 0100 4 +4 +4 +4 0100 4 +4 +4 +4 0101 5 +5 +5 +5 0101 5 +5 +5 +5 0110 6 +6 +6 +6 0110 6 +6 +6 +6 0111 7 +7 +7 +7 0111 7 +7 +7 +7 1000 8 -0 -7 -8 1000 8 -0 -7 -8 1001 9 -1 -6 -7 1001 9 -1 -6 -7 1010 10 -2 -5 -6 1010 10 -2 -5 -6 1011 11 -3 -4 -5 1011 11 -3 -4 -5 1100 12 -4 -3 -4 1100 + 12 -4 -3 -4 1101 13 -5 -2 -3 1101 13 -5 -2 -3 1110 14 -6 -1 -2 1110 14 -6 -1 -2 1111 15 -7 -0 -1 1111 15 -7 -0 -1 23 24

6 Branching Instructions The Six Flags

• Unconditional branch • ZF (Zero Flag): 1 if result is zero. 0 otherwise. jmp addr • SF (Sign Flag): Set equal to the most-significant bit of the result. Sign bit for a signed integer.

• Conditional branch • CF (Carry Flag): Overflow condition for unsigned arithmetic. ο Perform arithmetic operation ο Test flags in the EFLAGS register and jump • OF (Overflow Flag): Overflow condition for signed arithmetic. cmpl %ebx, %eax # eax - ebx je L1 • PF (Parity Flag): 1 if the least-significant byte of the result ... # ebx != eax contains an even number of ones. 0 otherwise. L1: • AF (Adjust Flag): 1 if the arithmetic operation generated a ... # ebx == eax carry/borrow out of bit 3 of the result. Used in Binary-Coded Decimal (BCD) arithmetic. 25 26

Conditional Branch Instructions Branching Example: if-then-else

• For both signed and unsigned integers je (ZF = 1) Equal or zero C program Assembly program jne (ZF = 0) Not equal or not zero movl a, %eax if (a > b) • For signed integers cmpl b, %eax # a-b jl (SF ^ OF = 1) Less than jle L1 # jump if a <= b jle ((SF ^ OF) || ZF) = 1) Less or equal c = a; movl a, %eax # a > b branch jg ((SF ^ OF) || ZF) = 0) Greater than movl %eax, c jge (SF ^ OF = 0) Greater or equal jmp L2 • For unsigned integers jb (CF = 1) Below L1: # a <= b branch else jbe (CF = 1 || ZF = 1) Below or equal movl b, %eax c = b; movl %eax, c ja (CF = 0 && ZF = 0) Above jae (CF = 0) Above or equal L2: # finish • For AF and PF conditions, FPU, MMX, SSE and SSE2 ο See the Intel manual (volume 1 and 2) 27 28

7 Branching Example: for-loop Assembler Directives

C program • Identify code and data sections for (i = 0; i < 100; i++) { • Allocate/initialize memory ... • Make symbols externally visible or invisible }

Assembly program movl $0, %edx # i = 0 loop_begin: cmpl $100, %edx # i – 100 jge loop_end # end if i >= 100 ... incl %edx # i++ jmp loop_begin loop_end:

29 30

Identifying Sections Initializing Data

.section .data • Text (.section .text) 0 OS ο Contains code (instructions) p: .byte 215 # one byte initialized to 215 ο Contains the _start label Text q: .word 150, 999 # two words initialized r: .long 1, 1000, 1000000 # three long-words • Read-Only Data (.section .rodata) Data ο Contains pre-initialized constants BSS • Read-Write Data (.section .data) Heap Can specify alignment constraints ο Contains pre-initialized variables .align 4 • BSS (.section .bss) ο Contains zero-initialized variables s: .long 555

Stack 0xffffffff 31 32

8 Initializing ASCII Data Allocating Memory in BSS

• Several ways for ASCII data • For global data .comm symbol, nbytes [,desired-alignment] .byte 150,145,154,154,157,0 # a sequence of bytes • For local data .ascii “hello” # ascii without null char .lcomm symbol, nbytes [,desired-alignment] .byte 0 # add \0 to the end • Example .ascii “hello\0” .section .bss # or just .bss .equ BUFSIZE 512 # define a constant .asciz “hello” # ASCII with \0 .lcomm BUF, BUFSIZE # allocate 512 bytes .string “hello” # same as .asciz # local memory for BUF .comm x, 4, 4 # allocate 4 bytes for x # with 4-byte alignment • BSS does not consume space in the object file

33 34

Making Symbols Externally Visible Summary • Default is local • Instructions manipulate registers and memory • Specify globally visible ο .globl symbol Memory addressing modes ο Unsigned and signed integers • Example: int x = 1; .section .data • Branch instructions .globl x # declare externally visible ο The six flags in the EFLAGS register .align 4 ο Conditional branching x: .long 2 • Assembly language directives • Example: foo(void){...} ο Define sections .text ο Allocate memory .globl foo ο Initialize values foo: ο Make labels externally visible ... leave return 35 36

9