CS356 Unit 4: x86 Instruction Set

4.2 Why Learn Assembly
• Understand hardware limitations
• Understand performance
• Use HW options that high-level languages don't allow (e.g., operating systems, utilizing special HW features, etc.)
• Understand security vulnerabilities
• Can help debugging

4.3 Compiling and Disassembling (CS:APP 3.2.2)
Original code:

    void abs_value(int x, int *res)
    {
        if (x < 0) {
            *res = -x;
        } else {
            *res = x;
        }
    }

• From C to assembly code
    $ gcc -Og -c -S file1.c
• Looking at binary files
    $ gcc -Og -c file1.c
    $ hexdump -C file1.o
• From binary to assembly
    $ gcc -Og -c file1.c
    $ objdump -d file1.o

Compiler output (machine code & assembly):

    Disassembly of section .text:
    0000000000000000 <abs_value>:
       0: 85 ff    test %edi,%edi
       2: 78 03    js   7              “if(x<0) goto 7”
       4: 89 3e    mov  %edi,(%rsi)
       6: c3       retq
       7: f7 df    neg  %edi
       9: 89 3e    mov  %edi,(%rsi)
       b: c3       retq

Notice how each instruction is turned into binary (shown in hex).

4.4 Basic Computer Organization
Check the recorded lecture.

4.5 x86-64 Memory Organization
• Recall variables live in memory & need to be loaded into the processor to be used
    int x, y=5, z=8;
    x = y+z;
• Because each byte of memory has its own address we can picture memory as one column of bytes (Fig. 2)
• With a 64-bit logical data bus we can access up to 8 bytes of data at a time
• We will usually show memory arranged in rows of 4 bytes (Fig. 3) or 8 bytes
  – Still with separate addresses for each byte

[Figure: processor (Proc.) and memory (Mem.) connected by an address bus (A, labeled 40) and a data bus (D, labeled 64)]

Fig. 2: Logical Byte-Oriented View of Mem.
    Address    Byte
    …
    0x000002   F8
    0x000001   13
    0x000000   5A

Fig. 3: Logical DWord-Oriented View
    Row address   Byte addresses   Contents
    0x000008      b a 9 8          8E AD 33 29
    0x000004      7 6 5 4          8E AD 33 29
    0x000000      3 2 1 0          7C F8 13 5A

4.6 Memory & Word Size (CS:APP 3.9.3)
• To refer to a chunk of memory we must provide:
  – The starting address
  – The size: B, W, D, L
• There are rules for valid starting addresses
  – A valid starting address should be a multiple of the data size
  – Words (2-byte chunks) must start on an even (divisible by 2) address
  – Double words (4-byte chunks) must start on an address that is a multiple of (divisible by) 4
  – Quad words (8-byte chunks) must start on an address that is a multiple of (divisible by) 8

Valid starting addresses within the 8 bytes at 0x4000 to 0x4007:
    Size          Valid starting addresses
    Byte          0x4000, 0x4001, …, 0x4007
    Word          0x4000, 0x4002, 0x4004, 0x4006
    DWord         0x4000, 0x4004
    QWord         0x4000

4.7 Endian-ness (CS:APP 2.1.3)
• Endian-ness refers to the two alternate methods of ordering the bytes in a larger unit (2, 4, 8 bytes)
  – Big-Endian
    • PPC, Sparc, TCP/IP
    • MS byte is put at the starting address
  – Little-Endian
    • Used by Intel processors / original PCI bus
    • LS byte is put at the starting address
• Some processors (like ARM) and busses can be configured for either big- or little-endian

The DWORD value 0x12345678 can be stored differently:
    Address   Big-Endian   Little-Endian
    0x00      12           78
    0x01      34           56
    0x02      56           34
    0x03      78           12

4.8 Big-endian vs. Little-endian
• Big-endian
  – Makes sense if you view your memory as starting at the top-left and addresses increasing as you go down
• Little-endian
  – Makes sense if you view your memory as starting at the bottom-right and addresses increasing as you go up

[Figure: the DWORD 12345678 at address 000000 in both views. Big-endian: byte columns 0 1 2 3 from left to right, row addresses 000000 to 000014 increasing downward, Byte 0 = 12, Byte 1 = 34, Byte 2 = 56, Byte 3 = 78. Little-endian: byte columns 3 2 1 0 from left to right, row addresses increasing upward, Byte 3 = 12, Byte 2 = 34, Byte 1 = 56, Byte 0 = 78.]

4.9 Big-endian vs. Little-endian Issues
• Issues arise when transferring data between different systems
  – Byte-wise copy of data from big-endian system to little-endian system
  – Major issue in networks (little-endian computer => big-endian computer) and even within a single computer (system memory => I/O device)
• Intel is LITTLE-ENDIAN

[Figure: copy byte 0 to byte 0, byte 1 to byte 1, etc. The DWORD @ 0 in the big-endian system is now different than the DWORD @ 0 in the little-endian system: the big-endian system holds 12345678 at addr. 0, but after the byte-wise copy the little-endian system reads 78563412 at addr. 0 (wrong!).]

4.10 x86-64 ASSEMBLY

4.11 x86-64 Data Sizes (CS:APP 3.3)
• Integer: 4 sizes
  – Byte (b): 8 bits = 1 byte
  – Word (w): 16 bits = 2 bytes
  – Double word (l): 32 bits = 4 bytes
  – Quad word (q): 64 bits = 8 bytes
• Floating Point: 2 sizes
  – Single (S): 32 bits = 4 bytes
  – Double (D): 64 bits = 8 bytes
    • (For a 32-bit data bus, a double would be accessed from memory in 2 reads)
• In x86-64, instructions generally specify what size data to access from memory and then operate upon.
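The byte orderings of slides 4.7 to 4.9 and the alignment rule of slide 4.6 can be tried out in C. The following is only a minimal sketch: the helper names (store_be32, store_le32, load_le32, swap32, is_aligned, check_endian_slides) are illustrative and do not come from the lecture. It stores the DWORD 0x12345678 in both byte orders, shows what a little-endian reader sees after a byte-wise copy of big-endian data, and tests the "multiple of the data size" rule on the 0x4000 addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value with the MS byte at the starting address. */
void store_be32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Store a 32-bit value with the LS byte at the starting address. */
void store_le32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Read 4 bytes the way a little-endian machine interprets them. */
uint32_t load_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Nonzero if addr is a multiple of size (size is 1, 2, 4 or 8). */
int is_aligned(uint64_t addr, uint64_t size) {
    return addr % size == 0;
}

/* Replays the values shown on slides 4.6, 4.7 and 4.9. */
void check_endian_slides(void) {
    uint8_t be[4], le[4];

    store_be32(be, 0x12345678u);            /* bytes 12 34 56 78 */
    store_le32(le, 0x12345678u);            /* bytes 78 56 34 12 */
    assert(be[0] == 0x12 && be[1] == 0x34 && be[2] == 0x56 && be[3] == 0x78);
    assert(le[0] == 0x78 && le[1] == 0x56 && le[2] == 0x34 && le[3] == 0x12);

    /* Byte-wise copy of big-endian data, read as little-endian: wrong value. */
    assert(load_le32(be) == 0x78563412u);
    /* Swapping the bytes recovers the intended value. */
    assert(swap32(load_le32(be)) == 0x12345678u);

    assert(is_aligned(0x4006, 2) && !is_aligned(0x4006, 4));
    assert(is_aligned(0x4004, 4) && !is_aligned(0x4004, 8));
    assert(is_aligned(0x4000, 8));
}
```

Calling check_endian_slides() from a main function runs the checks. The loads and stores work byte by byte on purpose, so the sketch behaves the same on a big-endian or little-endian host.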
4.12 x86-64 Register Names (CS:APP 3.4)
    q (8 bytes)   l (4 bytes)   w (2 bytes)   b (1 byte)   Name
    %rax          %eax          %ax           %al          accumulate
    %rbx          %ebx          %bx           %bl          base
    %rcx          %ecx          %cx           %cl          counter
    %rdx          %edx          %dx           %dl          data
    %rsi          %esi          %si           %sil         source index
    %rdi          %edi          %di           %dil         destination index
    %rsp          %esp          %sp           %spl         stack pointer
    %rbp          %ebp          %bp           %bpl         base pointer
• In addition: %al, %bl, %cl, %dl, %sil, %dil, %spl, %bpl for least significant byte
• In addition: %r8 to %r15 (%r8d / %r8w / %r8b for lower 4 / 2 / 1 bytes)

4.13 x86-64 Instruction Classes
• Data Transfer (mov instruction)
  – Moves data between registers, or between registers and memory (There are no memory-to-memory moves: at most one operand may be a memory location.)
  – Specifies size via a suffix on the instruction (movb, movw, movl, movq)
• ALU Operations
  – At most one operand may be a memory location
  – Size and operation specified by instruction (addl, orq, andb, subw)
• Control Flow
  – Unconditional/Conditional Branch (cmpq, jmp, je, jne, jl, jge)
  – Subroutine Calls (call, ret)
• Privileged (System) Instructions
  – Instructions that can only be used by OS or other “supervisor” software (e.g. int to access certain OS capabilities, etc.)

4.14 Operand Locations
• Source operands must be in one of the following 3 locations:
  – A register value (e.g. %rax)
  – A value in a memory location (e.g. value at address 0x0200e8)
  – A constant stored in the instruction itself (known as an “immediate value”), e.g. add $1,d0
    • The $ indicates the constant/immediate
• Destination operands must be
  – A register
  – A memory location (specified by its address 0x0200e8)

[Figure: processor (Proc.) with registers (Reg.) and ALU, connected over address (A) and data (D) buses to memory (Mem.), which holds instructions (Inst.) at 400 and 401 and data]

4.15 Intel x86 Register Set
• 8-bit processors in late 1970s
  – 4 registers for integer data: A, B, C, D
  – 4 registers for address/pointers: SP (stack pointer), BP (base pointer), SI (source index), DI (dest. index)
• 16-bit processors extended registers to 16 bits but continued to support 8-bit access!
  – Use prefix/suffix to indicate size:
    • AL referenced the lower 8 bits of register A
    • AH the higher 8 bits of register A
    • AX referenced the full 16-bit value
• 32-/64-bit processors (see next slide)

4.16 DATA TRANSFER INSTRUCTIONS

4.17 mov Instruction & Data Size (CS:APP 3.4.2)
• Moves data between memory and processor register
• Always provide the LS-Byte address (little-endian) of the desired data
• Size is explicitly defined by the instruction suffix ('mov[bwlq]') used
• Recall: Start address should be divisible by size of access

Assume start address = A, with memory holding fedc ba98 at A and 7654 3210 at A+4:
    Size          Instruction   Register bits written   Effect
    Byte          movb          7..0                    Byte operations only access the 1 byte at the specified address; movb leaves upper bits unaffected
    Word          movw          15..0                   Word operations access the 2 bytes starting at the specified address; movw leaves upper bits unaffected
    Double Word   movl          31..0                   Double word operations access the 4 bytes starting at the specified address; movl zeros the upper bits (63..32)
    Quad Word     movq          63..0                   Quad word operations access the 8 bytes starting at the specified address

4.18 Mem/Register Transfer Examples
• mov[b,w,l,q] src, dst
• Initial Conditions:
    Memory / RAM         0x00204: 7654 3210
                         0x00200: fedc ba98
    Processor Register   rax: ffff ffff 1234 5678
• Treat these instructions as a sequence where one affects the next.
    Instruction            Location to fill in
    movq 0x200, %rax       rax
    movl 0x204, %eax       rax   (movl zeros the upper bits of dest. reg)
    movw 0x202, %ax        rax
    movb 0x207, %al        rax
    movb %al, 0x4e5        memory at 0x004e4 and 0x004e0
    movl %eax, 0x4e0       memory at 0x004e4 and 0x004e0   (movl changes only 4 bytes here)
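One way to check an answer to the slide 4.18 exercise is to model the register and memory in C. The code below is a rough sketch under stated assumptions, not how the processor is implemented: mem_read, mem_write, reg_write and check_slide_4_18 are hypothetical helper names, memory is a flat byte array, and the bytes at 0x4e0 to 0x4e7 are assumed to start as zero because the slide does not give their initial contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read `size` bytes (1, 2, 4 or 8) starting at p, LS byte first. */
uint64_t mem_read(const uint8_t *p, size_t size) {
    uint64_t v = 0;
    for (size_t i = 0; i < size; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Write the low `size` bytes of v starting at p, LS byte first. */
void mem_write(uint8_t *p, uint64_t v, size_t size) {
    for (size_t i = 0; i < size; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Value of a 64-bit register after a mov of `size` bytes into it:
   movb and movw keep the upper bits, movl zeros bits 63..32,
   movq replaces the whole register. */
uint64_t reg_write(uint64_t old, uint64_t v, size_t size) {
    switch (size) {
    case 1:  return (old & ~UINT64_C(0xff)) | (v & 0xff);
    case 2:  return (old & ~UINT64_C(0xffff)) | (v & 0xffff);
    case 4:  return v & UINT64_C(0xffffffff);
    default: return v;
    }
}

/* Replays the slide 4.18 sequence, one instruction per step. */
void check_slide_4_18(void) {
    static uint8_t mem[0x500];   /* model memory, assumed zero elsewhere */
    uint64_t rax = UINT64_C(0xffffffff12345678);

    /* Initial RAM: fedc ba98 at 0x200, 7654 3210 at 0x204. */
    mem_write(&mem[0x200], UINT64_C(0x76543210fedcba98), 8);

    rax = reg_write(rax, mem_read(&mem[0x200], 8), 8);  /* movq 0x200, %rax */
    assert(rax == UINT64_C(0x76543210fedcba98));
    rax = reg_write(rax, mem_read(&mem[0x204], 4), 4);  /* movl 0x204, %eax */
    assert(rax == UINT64_C(0x0000000076543210));
    rax = reg_write(rax, mem_read(&mem[0x202], 2), 2);  /* movw 0x202, %ax  */
    assert(rax == UINT64_C(0x000000007654fedc));
    rax = reg_write(rax, mem_read(&mem[0x207], 1), 1);  /* movb 0x207, %al  */
    assert(rax == UINT64_C(0x000000007654fe76));

    mem_write(&mem[0x4e5], rax, 1);                     /* movb %al, 0x4e5  */
    assert(mem[0x4e5] == 0x76);
    mem_write(&mem[0x4e0], rax, 4);                     /* movl %eax, 0x4e0 */
    assert(mem_read(&mem[0x4e0], 4) == 0x7654fe76u);
    assert(mem[0x4e5] == 0x76);     /* bytes at 0x4e4 and above untouched */
}
```

Calling check_slide_4_18() from a main function replays the sequence; each assert records the register or memory value that the model produces after that step.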
4.19 Immediate Examples
• Initial Conditions:
    Memory / RAM         0x00204: 7654 3210
                         0x00200: fedc ba98
    Processor Register   rax: ffff ffff 1234 5678
• Immediate Examples
    Instruction                      Location to fill in
    movl $0xfe1234, %eax             rax
    movw $0xaa55, %ax                rax
    movb $20, %al                    rax
    movq $-1, %rax                   rax
    movabsq $0x123456789ab, %rax     rax
    movq $-1, 0x4e0                  memory at 0x004e4 and 0x004e0
Rules:
• Immediates must be the source operand
• Indicate with '$'; can be specified in decimal (default) or hex (start with 0x)
• movq can only support a 32-bit immediate (and will then sign-extend that value to fill the upper 32 bits)
• Use movabsq for a full 64-bit immediate value

4.20 Variations: Zero / Sign Extension
• There are several variations with register destination
  – Used to zero-extend or sign-extend the source
• Normal mov does not affect upper portions of registers (with exception of movl)
• movzxy will zero-extend the upper portion
  – movzbw (move a byte from the source but zero-extend it to a word in the destination register)
  – movzbw, movzbl, movzbq, movzwl, movzwq (but no movzlq!)
• movsxy will sign-extend the upper portion
  – movsbw (move a byte from the source but sign-extend it to a word in the destination register)
  – movsbw, movsbl, movsbq, movswl, movswq, movslq
  – cltq is equivalent to movslq %eax,%rax (but shorter encoding)

4.21 Zero/Signed Move Variations
• Initial Conditions:
    Memory / RAM         0x00204: 7654 3210
                         0x00200: fedc ba98
    Processor Register   rdx: 0123 4567 89ab cdef
• Treat these instructions as a sequence where one affects the next.
    Instruction            Location to fill in
    movl 0x200, %eax       rax
    movslq 0x200, %rax     rax
    movzwl 0x202, %eax     rax
    movsbw 0x201, %ax      rax
    movsbl 0x206, %eax     rax
    movzbq %dl, %rax       rax
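The zero- and sign-extension rules of slides 4.19 to 4.21 come down to a little bit arithmetic, which can be sketched in C. This is only an illustration under assumptions: zero_extend, sign_extend and check_extension_slides are made-up names, and the source operands are written as constants taken from the little-endian memory contents on the slide (word 0xfedc at 0x202, byte 0xba at 0x201, byte 0x54 at 0x206) rather than loaded from a memory model.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low `bits` bits of v (bits is 8, 16, 32 or 64). */
uint64_t zero_extend(uint64_t v, unsigned bits) {
    return bits >= 64 ? v : v & ((UINT64_C(1) << bits) - 1);
}

/* Copy bit (bits - 1) of v into every higher bit position. */
uint64_t sign_extend(uint64_t v, unsigned bits) {
    uint64_t sign = UINT64_C(1) << (bits - 1);
    v = zero_extend(v, bits);
    return (v ^ sign) - sign;
}

/* Replays slide 4.21, plus the movq immediate rule from slide 4.19. */
void check_extension_slides(void) {
    uint64_t rdx = UINT64_C(0x0123456789abcdef);
    uint64_t rax;

    rax = zero_extend(0xfedcba98u, 32);        /* movl 0x200, %eax   */
    assert(rax == UINT64_C(0x00000000fedcba98));
    rax = sign_extend(0xfedcba98u, 32);        /* movslq 0x200, %rax */
    assert(rax == UINT64_C(0xfffffffffedcba98));
    rax = zero_extend(0xfedc, 16);             /* movzwl 0x202, %eax */
    assert(rax == 0xfedc);

    /* movsbw 0x201, %ax: a 16-bit destination, so only bits 15..0 change. */
    rax = (rax & ~UINT64_C(0xffff)) | (sign_extend(0xba, 8) & 0xffff);
    assert(rax == 0xffba);

    /* movsbl 0x206, %eax: a 32-bit destination, so bits 63..32 are zeroed. */
    rax = sign_extend(0x54, 8) & UINT64_C(0xffffffff);
    assert(rax == 0x54);

    rax = zero_extend(rdx, 8);                 /* movzbq %dl, %rax   */
    assert(rax == 0xef);

    /* movq $-1, %rax: the 32-bit immediate is sign-extended to 64 bits. */
    assert(sign_extend(0xffffffffu, 32) == UINT64_MAX);
}
```

Calling check_extension_slides() from a main function runs the checks. The XOR-and-subtract form of sign_extend is used instead of casts through signed types so that the result does not depend on implementation-defined conversions.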
