CS356 Unit 4 X86 Instruction Set 4.2 Why Learn Assembly

4.1 CS356 Unit 4 x86 Instruction Set 4.2 Why Learn Assembly • Understand hardware limitations • Understand performance • Use HW options that high-level languages don't allow (e.g., operating systems, utilizing special HW features, etc.) • Understand security vulnerabilities • Can help debugging 4.3 Compiling and Disassembling void abs_value (int x, int *res) { if(x < 0) { • From C to assembly code *res = -x; } else { $ gcc -Og -c ___ file1.c *res = x; } } • Looking at binary files Original Code $ gcc -Og -c file1.c Disassembly of section .text: $ ________ -C file1.o 0000000000000000 <abs_value>: 0: 85 ff test %edi,%edi 2: 78 03 js 7 “if(x<0) goto 7” • From binary to assembly 4: 89 3e mov %edi,(%rsi) 6: c3 retq $ gcc -Og -c file1.c 7: f7 df neg %edi 9: 89 3e mov %edi,(%rsi) $ ________ -d file1.o b: c3 retq Compiler Output (Machine code & Assembly) Notice how each instruction is turned CS:APP 3.2.2 into binary (shown in hex) 4.4 Basic Computer Organization Check the recorded lecture 4.5 x86-64 Memory Organization Recall variables live in memory & need to int x,y=5;z=8; be loaded into the • Because each byte of memory has its x = y+z; processor to be used own address we can picture memory A as one column of bytes (Fig. 2) Proc. 40 Mem. D • With 64-bit logical data bus we can 64 access up to 8-bytes of data at a time Fig. 2 … • We will usually show memory arranged in rows of 4 bytes (Fig. 3) or F8 0x000002 8 bytes 13 0x000001 5A 0x000000 – Still with separate addresses for each byte Logical Byte-Oriented View of Mem. … Fig. 3 b 8E a AD 9 33 8 29 0x000008 7 8E 6 AD 5 33 4 29 0x000004 3 7C 2 F8 1 13 0 5A 0x000000 Logical DWord-Oriented View 4.6 Memory & Word Size CS:APP 3.9.3 Double Word 4 • To refer to a chunk of memory we Word 6 Word 4 must provide: • The starting address Byte 7 Byte 6 Byte 5 Byte 4 • The size: B, W, D, L Byte 3 Byte 2 Byte 1 Byte 0 • There are rules for valid starting Quad Word 0 WordQuad Word 2 Word 0 addresses • A valid starting address should be a Double Word 0 multiple of the data size Byte • Words (2-byte chunks) must start on an Address Word even (divisible by 2) address … 0x4007 4006 • Double words (4-byte chunks) must start DWord 0x4006 Word on an address that is a multiple of 0x4004 0x4005 4004 (divisible by) 4 0x4004 Word … 0x4003 4002 • DWord Quad words (8-byte chunks) must start on 0x4002 an address that is a multiple of (divisible Word 0x4000 4000 QWord 4000 QWord 0x4001 by) 8 0x4000 4.7 Endian-ness CS:APP 2.1.3 • Endian-ness refers to the two alternate methods of ordering the The DWORD value: bytes in a larger unit (2, 4, 8 bytes) 0 x 12 34 56 78 – Big-Endian can be stored differently • PPC, Sparc, TCP/IP • MS byte is put at the starting address – Little-Endian 0x00 12 0x00 78 • used by Intel processors / original PCI 0x01 34 0x01 56 bus 0x02 56 0x02 34 • LS byte is put at the starting address 0x03 78 0x03 12 • Some processors (like ARM) and Big-Endian Little-Endian busses can be configured for either big- or little-endian 4.8 Big-endian vs. Little-endian • Big-endian • Little-endian – makes sense if you view your – makes sense if you view your memory as starting at the memory as starting at the top-left and addresses bottom-right and addresses increasing as you go down increasing as you go up 0 1 2 3 Addresses increasing Addresses … 000000 12345678 000004 000014 000008 000010 upward 00000C 00000C downward 000010 000008 000014 000004 … 12345678 000000 increasing Addresses 3 2 1 0 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 Byte 0 Byte 1 Byte 2 Byte 3 Byte 3 Byte 2 Byte 1 Byte 0 4.9 Big-endian vs. Little-endian Issues • Issues arise when transferring data between different systems – Byte-wise copy of data from big-endian system to little-endian system – Major issue in networks (little-endian computer => big-endian computer) and even within a single computer (system memory => I/O device) Intel is Big-Endian Little-Endian 0 1 2 3 LITTLE-ENDIAN Addresses increasing Addresses Copy byte 0 to byte 0, … 000000 12345678 byte 1 to byte 1, etc. 000004 000014 000008 000010 upward 00000C 00000C downward 000010 DWORD @ 0 in big-endian 000008 system is now different than 000014 000004 DWORD @ 0 in little-endian … system 78563412 000000 increasing Addresses 3 2 1 0 1 2 3 4 5 6 7 8 DWORD @ addr. 0 7 8 5 6 3 4 1 2 wrong! Byte 0 Byte 1 Byte 2 Byte 3 Byte 3 Byte 2 Byte 1 Byte 0 4.10 x86-64 ASSEMBLY 4.11 x86-64 Data Sizes CS:APP 3.3 Integer Floating Point • 4 sizes • 2 sizes – __________ – Single (S) • 8-bits = 1 byte • 32-bits = 4 bytes – __________ – Double (D) • 16-bits = 2 bytes • 64-bits = 8 bytes – __________ • (For a 32-bit data bus, a • 32-bits = 4 bytes double would be accessed from memory in 2 reads) – __________ • 64-bits = 8 bytes In x86-64, instructions generally specify what size data to access from memory and then operate upon. 4.12 x86-64 Register Names CS:APP 3.4 b (1 byte) w (2 bytes) l (4 bytes) q (8 bytes) %rax %eax %ax accumulate %rbx %ebx %bx base %rcx %ecx %cx counter %rdx %edx %dx data %rsi %esi %si source index %rdi %edi %di destination index %rsp %esp %sp stack pointer %rbp %ebp %bp base pointer • In addition: %al, %bl, %cl, %dl, %sil, %dil, %spl, %bpl for least significant byte • In addition: %r8 to %r15 (%r8d / %r8w / %r8b for lower 4 / 2 / 1 bytes) 4.13 x86-64 Instruction Classes • _____________________ (mov instruction) – Moves data between registers, or between registers and memory (One operand must be a processor register.) – Specifies size via a suffix on the instruction (movb, movw, movl, movq) • ___________ Operations – One operand must be a processor register – Size and operation specified by instruction (addl, orq, andb, subw) • ___________________ Flow – Unconditional/Conditional Branch (cmpq, jmp, je, jne, jl, jge) – Subroutine Calls (call, ret) • _____________________ Instructions – Instructions that can only be used by OS or other “supervisor” software (e.g. int to access certain OS capabilities, etc.) 4.14 Operand Locations • Source operands must be in one of the following 3 locations: – A ___________ value (e.g. %rax) Proc. Mem. – A value in a __________ location 400 Inst. Reg. A (e.g. value at address 0x0200e8) 401 Inst. Reg. – A __________ stored in the D ... Data ALU instruction itself (known as Data “________________ value”) ... add $1,d0 – The $ indicates the constant/immediate • Destination operands must be – A register – A memory location (specified by its address 0x0200e8) 4.15 Intel x86 Register Set • 8-bit processors in late 1970s – 4 registers for integer data: ________________ – 4 registers for address/pointers: SP (stack pointer), BP (base pointer), SI (source index), DI (dest. index) • 16-bit processors extended registers to 16-bits but continued to support 8-bit access! – Use prefix/suffix to indicate size: ___ referenced the lower 8-bits of register A ___ the higher 8-bits of register A ___ referenced the full 16-bit value • 32-/64-bit processors (see next slide) 4.16 DATA TRANSFER INSTRUCTIONS 4.17 mov Instruction & Data Size CS:APP 3.4.2 • Moves data between memory and processor register • Always provide the LS-Byte address (little-endian) of the desired data • Size is explicitly defined by the instruction ________ ('mov[_____]') used • Recall: Start address should be divisible by size of access (Assume start address = A) Processor Register Memory / RAM 63 7 0 7654 3210 A+4 Byte operations only access the 1-byte at the Byte movb fedc ba98 A specified address movb leaves upper bits unaffected 63 15 0 7654 3210 A+4 Word operations access the 2-bytes starting at the Word movw fedc ba98 A specified address movw leaves upper bits unaffected 63 31 0 7654 3210 A+4 Word operations access the 4-bytes starting at the 0000 0000 Double Word movl fedc ba98 A specified address movl zeros the upper bits 63 0 7654 3210 A+4 Word operations access the 8-bytes starting at the Quad Word movq fedc ba98 A specified address 4.18 Mem/Register Transfer Examples Memory / RAM • mov[b,w,l,q] src, dst 7654 3210 0x00204 fedc ba98 0x00200 • Initial Conditions: Processor Register ffff ffff 1234 5678 rax – movq 0x200, %rax rax movl zeros the upper rax – movl 0x204, %eax bits of dest. reg – movw 0x202, %ax rax – movb 0x207, %al rax 0x004e4 – movb %al, 0x4e5 0x004e0 0x004e4 – movl %eax, 0x4e0 movl changes only 4 bytes here 0x004e0 Treat these instructions as a sequence where one affects the next. 4.19 Immediate Examples Memory / RAM • Immediate Examples 7654 3210 0x00204 fedc ba98 0x00200 Processor Register ffff ffff 1234 5678 rax – movl $0xfe1234, %eax rax – movw $0xaa55, %ax rax – movb $20, %al rax – movq $-1, %rax rax – movabsq $0x123456789ab, %rax rax – movq $-1, 0x4e0 0x004e4 0x004e0 Rules: • Immediates must be source operand • Indicate with '$' and can be specified in decimal (default) or hex (start with 0x) • movq can only support a 32-bit immediate (and will then sign-extend that value to fill the upper 32-bits) • Use movabsq for a full 64-bit immediate value 4.20 Variations: Zero / Sign Extension • There are several variations with register destination – Used to zero-extend or sign-extend the source • Normal mov does ____________ upper portions of registers (with exception of movl) • movzxy will _______________ the upper portion – movzbw (move a byte from the source but zero-extend it to a word in the destination register) – movzbw, movzbl, movzbq, movzwl, movzwq (but no movzlq!) • movsxy will _______________ the upper portion – movsbw (move a byte from the source but sign-extend it to a word in the destination register) – movsbw, movsbl, movsbq, movswl, movswq, movslq – cltq is equivalent to movslq %eax,%rax (but shorter encoding) 4.21 Zero/Signed Move Variations Memory / RAM • Initial Conditions: 7654 3210 0x00204 fedc ba98 0x00200 Processor Register 0123 4567 89ab cdef rdx – movl 0x200, %eax rax – movslq 0x200, %rax rax – movzwl 0x202, %eax rax – movsbw 0x201, %ax rax – movsbl 0x206, %eax rax – movzbq %dl, %rax rax Treat these instructions as a sequence where one affects the next.

CS356 Unit 4 X86 Instruction Set 4.2 Why Learn Assembly

Computer Organization and Architecture Designing for Performance Ninth Edition

Introduction to the Intel® Nios® II Soft Processor

Computer Organization & Architecture Eie

A Modular Soft Processor Core in VHDL

Assignment Solutions

Register Are Used to Quickly Accept, Store, and Transfer Data And

IAR C/C++ Compiler Reference Guide for V850

Computer Organization and Architecture Structure

Processor-Organaization.Pdf

Computer Architectures an Overview

Understanding the Linux Kernel, 3Rd Edition by Daniel P

The Central Processing Unit (CPU)