Advanced Systems Programming Assembly

Maksym Planeta, Björn Döbel, Tobias Stumpf
24.09.2021

What the hell - Why should I learn assembly?

Understanding debugger output:

    400d4e: 55                push   %rbp
    400d4f: 48 89 e5          mov    %rsp,%rbp
    400d52: bf 84 79 48 00    mov    $0x487984,%edi
    400d57: e8 54 6b 00 00    callq  4078b0 <_IO_puts>
    400d5c: 5d                pop    %rbp
    400d5d: c3                retq

Getting full control over your hardware (using specific instructions)

System programming (e.g. kernel entry/exit)

We need to go deeper: Fibonacci

    #include <stdio.h>   /* printf */
    #include <stdlib.h>  /* atoi */

    int fib(int n)
    {
        int fcur = 0, fnext = 1, tmp;
        /* Pre-decrement: the loop body runs n-1 times. */
        while (--n > 0) {
            tmp = fcur + fnext;
            fcur = fnext;
            fnext = tmp;
        }
        return fnext;
    }

    int main(int argc, char **argv)
    {
        printf("Fib: %d\n", fib(atoi(argv[1])));
    }

Fibonacci

fib.c holds the fib function shown above. Compile it to an object file:

    $ gcc -Wall -O2 -march=x86-64 -c -o fib.o fib.c

Sections of object file

    $ objdump -h fib.o

    fib.o:     file format elf64-x86-64

    Sections:
    Idx Name   Size      ...  File off  Algn
      0 .text  00000023  ...  00000040  2**4
               CONTENTS, ALLOC, LOAD, READONLY, CODE
      1 .data  00000000  ...  00000063  2**0
               CONTENTS, ALLOC, LOAD, DATA
    ...

Looking into text section

The .text section is 0x23 bytes long and starts at file offset 0x40, so copy exactly those bytes:

    $ dd if=fib.o of=fib.o.hex bs=1 count=$((0x23)) skip=$((0x40))
    35+0 records in
    35+0 records out
    35 bytes copied, 0.000799485 s, 43.8 kB/s
    $ xxd fib.o.hex
    00000000: 83ef 0185 ff7e 16ba 0100 0000 31c9 6690  .....~......1.f.
    00000010: 8d04 1189 d189 c283 ef01 75f4 c3b8 0100  ..........u.....
    00000020: 0000 c3

What a processor sees

    83ef0185ff7e16ba0100000031c966908d041189d189c283ef0175f4c3b801000000c3

What a human sees

    fib:
            sub    $0x1,%edi
            test   %edi,%edi
            jle    1d <fib+0x1d>
            mov    $0x1,%edx
            xor    %ecx,%ecx
            xchg   %ax,%ax
            lea    (%rcx,%rdx,1),%eax
            mov    %edx,%ecx
            mov    %eax,%edx
            sub    $0x1,%edi
            jne    10 <fib+0x10>
            retq
            mov    $0x1,%eax
            retq

What a processor sees

A processor opens the "Intel Software Developer's Manual", Volume 2C, Appendix A, Table A-2.

    $ wget http://svn.inf.tu-dresden.de/repos/advsysprog/asm/opcodes.pdf

The first byte is 0x83:

    Table[0x8; 0x3] = Immediate Grp 1: Ev, Ib

What does 0x83 stand for?

Ev, Ib:

    E  A ModR/M byte follows the opcode. The operand is either a GPR or an address.
    v  Word, doubleword or quadword
    I  Immediate data
    b  Byte

Need to look into the next byte.

ModR/M

The byte after 0x83 is 0xef:

    Bit     7 6 | 5 4 3      | 2 1 0
    Field   Mod | Reg/Opcode | R/M
    0xef    1 1 | 1 0 1      | 1 1 1

    Mod(11) + R/M(111) -> edi
    Opcode(101)        -> sub
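The field split on the ModR/M slide can be sketched in a few lines of C. This is only an illustration, not part of the original slides: the helper names `modrm_mod`, `modrm_reg` and `modrm_rm` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (names are not from the slides): split a ModR/M
 * byte into its three bit fields. */
unsigned modrm_mod(uint8_t b) { return (b >> 6) & 0x3; } /* bits 7-6 */
unsigned modrm_reg(uint8_t b) { return (b >> 3) & 0x7; } /* bits 5-3 */
unsigned modrm_rm(uint8_t b)  { return b & 0x7; }        /* bits 2-0 */

/* Self-check against the byte 0xef that follows opcode 0x83. */
void check_modrm(void)
{
    assert(modrm_mod(0xef) == 0x3); /* 11:  operand is a register */
    assert(modrm_reg(0xef) == 0x5); /* 101: opcode extension, sub */
    assert(modrm_rm(0xef)  == 0x7); /* 111: edi */
}
```

Calling `check_modrm()` from a test program should pass silently if the split matches the slide.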
Look into next byte

    83ef0185ff7e16ba0100000031c966908d041189d189c283ef0175f4c3b801000000c3

Immediate data: the third byte, 0x01, is the imm8 operand.

    sub imm8, %edi
    sub $0x1, %edi

Next instruction

    85ff7e16ba0100000031c966908d041189d189c283ef0175f4c3b801000000c3

    Table[8; 5] = Test: Ev, Gv

    G  ModR/M byte selects a general register

Next instruction (ModR/M)

The byte after 0x85 is 0xff:

    Bit     7 6 | 5 4 3      | 2 1 0
    Field   Mod | Reg/Opcode | R/M
    0xff    1 1 | 1 1 1      | 1 1 1

    Mod(11) + R/M(111) -> edi
    Reg(111)           -> edi

    test %edi,%edi
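The Table[row; column] lookups used so far index the one-byte opcode map by the two hex digits of the opcode byte: the high nibble selects the row, the low nibble the column. A minimal sketch, with the hypothetical helper names `opcode_row` and `opcode_col`:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: position of a one-byte opcode in the opcode map. */
unsigned opcode_row(uint8_t opcode) { return opcode >> 4; }  /* high nibble */
unsigned opcode_col(uint8_t opcode) { return opcode & 0xf; } /* low nibble  */

/* Self-check against the two opcodes decoded so far. */
void check_opcode_map_index(void)
{
    /* 0x83 -> Table[0x8; 0x3] = Immediate Grp 1: Ev, Ib */
    assert(opcode_row(0x83) == 0x8 && opcode_col(0x83) == 0x3);
    /* 0x85 -> Table[8; 5] = Test: Ev, Gv */
    assert(opcode_row(0x85) == 0x8 && opcode_col(0x85) == 0x5);
}
```

The helpers only compute the table position; what the cell means still has to be read from the opcode map itself.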
Continue decoding

    7e16ba0100000031c966908d041189d189c283ef0175f4c3b801000000c3

    Table[7; e] = jle

A short jump is followed by a single-byte immediate offset. A near jump has the prefix 0x0f, e.g. 0x0f 0x8e is a near jle.

    jle $0x16

The jump is relative to the address of the next instruction. Here the next instruction starts at offset 0x7, so the target is 0x7 + 0x16 = 0x1d.
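The target computation for a short jump can be written out as follows. This is a sketch under the assumption that the jump is a two-byte short conditional jump (one opcode byte plus one offset byte); the function name `short_jump_target` is invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: target of a two-byte short jump (opcode + rel8).
 * insn_addr is the address of the opcode byte. */
uint64_t short_jump_target(uint64_t insn_addr, uint8_t rel8)
{
    uint64_t next = insn_addr + 2; /* address of the next instruction */
    /* Sign-extend the 8-bit offset without relying on narrowing casts. */
    int64_t off = (rel8 >= 0x80) ? (int64_t)rel8 - 0x100 : (int64_t)rel8;
    return next + (uint64_t)off;   /* unsigned wrap-around is well defined */
}

/* Self-check against the two short jumps in fib. */
void check_short_jump_target(void)
{
    assert(short_jump_target(0x05, 0x16) == 0x1d); /* 7e 16: jle 1d */
    assert(short_jump_target(0x1a, 0xf4) == 0x10); /* 75 f4: jne 10 */
}
```

The second case shows why the offset has to be treated as signed: 0xf4 is -12, which jumps backwards to the top of the loop.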
Check

    0000000000000000 <fib>:
       0:   83 ef 01          sub    $0x1,%edi
       3:   85 ff             test   %edi,%edi
       5:   7e 16             jle    1d <fib+0x1d>
      ...
      1c:   c3                retq
      1d:   b8 01 00 00 00    mov    $0x1,%eax
      22:   c3                retq

CHAPTER 2
INSTRUCTION FORMAT

This chapter describes the instruction format for all Intel 64 and IA-32 processors. The instruction format for protected mode, real-address mode and virtual-8086 mode is described in Section 2.1. Increments provided for IA-32e mode and its sub-modes are described in Section 2.2.

2.1 INSTRUCTION FORMAT FOR PROTECTED MODE, REAL-ADDRESS MODE, AND VIRTUAL-8086 MODE

The Intel 64 and IA-32 architectures instruction encodings are subsets of the format shown in Figure 2-1. Instructions consist of optional instruction prefixes (in any order), primary opcode bytes (up to three bytes), an addressing-form specifier (if required) consisting of the ModR/M byte and sometimes the SIB (Scale-Index-Base) byte, a displacement (if required), and an immediate data field (if required).

Figure 2-1 (generic instruction format), fields in encoding order:

    Field                 Size
    Instruction prefixes  1 byte each (optional) [1, 2]
    Opcode                1, 2, or 3 bytes
    ModR/M                1 byte (if required)
    SIB                   1 byte (if required)
    Displacement          1, 2, or 4 bytes, or none [3]
    Immediate             1, 2, or 4 bytes, or none [3]

    ModR/M byte: Mod (bits 7-6), Reg/Opcode (bits 5-3), R/M (bits 2-0)
    SIB byte:    Scale (bits 7-6), Index (bits 5-3), Base (bits 2-0)

1. The REX prefix is optional, but if used must be immediately before the opcode; see Section 2.2.1, "REX Prefixes", for additional information.
2. For VEX encoding information, see Section 2.3, "Intel® Advanced Vector Extensions (Intel® AVX)".
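To connect the generic format with the fib bytes, the field sizes can be collected in a small record and summed to get the instruction length. This is a rough sketch for illustration only: the struct `insn_layout` and the function `insn_length` are hypothetical names, and the record stores sizes only, not the field values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of the field sizes named in Figure 2-1. */
struct insn_layout {
    uint8_t n_prefixes; /* optional prefixes, 1 byte each */
    uint8_t n_opcode;   /* 1, 2 or 3 opcode bytes         */
    uint8_t has_modrm;  /* 0 or 1                         */
    uint8_t has_sib;    /* 0 or 1                         */
    uint8_t n_disp;     /* 0, 1, 2 or 4                   */
    uint8_t n_imm;      /* 0, 1, 2 or 4                   */
};

/* Total encoded length in bytes: the sum of all present fields. */
size_t insn_length(const struct insn_layout *l)
{
    return (size_t)l->n_prefixes + l->n_opcode + l->has_modrm +
           l->has_sib + l->n_disp + l->n_imm;
}

/* Self-check against four instructions from the fib disassembly. */
void check_insn_length(void)
{
    /* 83 ef 01: opcode + ModR/M + imm8 (sub $0x1,%edi) */
    const struct insn_layout sub_imm8 = { .n_opcode = 1, .has_modrm = 1, .n_imm = 1 };
    /* b8 01 00 00 00: opcode + imm32 (mov $0x1,%eax) */
    const struct insn_layout mov_imm32 = { .n_opcode = 1, .n_imm = 4 };
    /* 66 90: operand-size prefix + opcode (xchg %ax,%ax) */
    const struct insn_layout xchg_ax = { .n_prefixes = 1, .n_opcode = 1 };
    /* 8d 04 11: opcode + ModR/M + SIB (lea (%rcx,%rdx,1),%eax) */
    const struct insn_layout lea_sib = { .n_opcode = 1, .has_modrm = 1, .has_sib = 1 };

    assert(insn_length(&sub_imm8) == 3);
    assert(insn_length(&mov_imm32) == 5);
    assert(insn_length(&xchg_ax) == 2);
    assert(insn_length(&lea_sib) == 3);
}
```

The lengths agree with the address gaps in the disassembly, for example `sub` at offset 0 followed by `test` at offset 3.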
