Inline Assembly
Florian “Florob” Zeitz
2017-06-07

1 Why?
2 Assembler
3 Design Goals
4 Prior Art
5 Rust (now)
6 Rust’s Future?

Why?
low-level control
  config registers
  hardware not otherwise accessible
performance
predictable timing
  embedded
  cryptography (side-channel)
convenience
  no separate assembly file
  no need for a build.rs
Rust philosophy: safe with unsafe {} escape hatches where needed

Examples

x86 (AT&T)
    jmp foo
    mov $5, %eax
    foo:
    mov 5(%eax, %ebx, 2), %edx

x86 (Intel)
    jmp foo
    mov eax, 5
    foo:
    mov edx, [eax + ebx*2 + 5]

ARM
    ldmiaeq sp!, {r4-r7, pc}
    ldrb r0, [sp, #32]

PowerPC
    lwz r0, 8(r1)
    rlwimi r11, r1, 0, 30, 28

Properties
textual representation of the instructions a CPU executes
mnemonics
operands
  register
  immediate
  label
  combinations thereof

Design Goals
easy to use
portable
  between platforms (x86, AMD64, ARM, AVR, PowerPC, RISC-V, …)
  between backends (LLVM, Cretonne, gcc, …)
support all instructions? (with all operand types?)

D
    asm { mov RAX, RDX; }

implemented as DSL inside asm {}
standardized per CPU family
can be marked pure and nothrow
only x86/AMD64 are currently available
Docs: https://dlang.org/spec/iasm.html

D’s x86/AMD64 syntax
close to regular Intel syntax
registers, always uppercase
mnemonics, always lowercase
local variables accessed as var[EBP] (just var when not in naked code)
many D expressions allowed
$ represents the PC, but can’t be used for PIC
pseudo-ops for:
  alignment
  data definition
  prefixes (lock, rep, …)

D Examples: Add 5 to variable
    int var = 0;
    asm {
        add var, 3 + 2;
    }

D Examples: Get L1 cache size
    int ebx, ecx;
    asm {
        mov EAX, 4;
        xor ECX, ECX;
        cpuid;
        mov ebx, EBX;
        mov ecx, ECX;
    }
    writefln("L1 Cache: %s",
        ((ebx >> 22) + 1) * (((ebx >> 12) & 0x3ff) + 1)
        * ((ebx & 0xfff) + 1) * (ecx + 1));
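
The product in the final statement follows from how CPUID leaf 4 packs the cache geometry: EBX holds ways − 1 in bits 31:22, partitions − 1 in bits 21:12 and line size − 1 in bits 11:0, while ECX holds sets − 1. A minimal Rust sketch of just that arithmetic, with the register values passed in as plain integers; the helper name `cache_size_bytes` and the sample register values are illustrative assumptions, not part of the talk:

```rust
/// Decodes the cache size in bytes from the EBX and ECX values returned
/// by CPUID leaf 4. `cache_size_bytes` is a hypothetical helper name.
fn cache_size_bytes(ebx: u32, ecx: u32) -> u64 {
    let ways = u64::from(ebx >> 22) + 1; // bits 31:22
    let partitions = u64::from((ebx >> 12) & 0x3ff) + 1; // bits 21:12
    let line_size = u64::from(ebx & 0xfff) + 1; // bits 11:0
    let sets = u64::from(ecx) + 1;
    ways * partitions * line_size * sets
}

fn main() {
    // Assumed sample values: 8 ways, 1 partition, 64-byte lines, 64 sets.
    let ebx: u32 = (7 << 22) | 63;
    let ecx: u32 = 63;
    println!("L1 Cache: {}", cache_size_bytes(ebx, ecx));
}
```

With the assumed sample values this works out to 8 × 1 × 64 × 64 = 32768 bytes.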

MSVC
    __asm { mov eax, edx }

DSL after __asm
only available for x86 (Pentium 4 and AMD Athlon opcodes)
uses MASM (Microsoft Macro Assembler) expressions and C/C++ elements
allows jumps to C labels from ASM and vice versa
allows calls to C functions
C/C++ operators cannot be used
Docs: https://docs.microsoft.com/en-us/cpp/assembler/inline/inline-assembler

MSVC, supported MASM
not supported:
  data definitions
  macros
supported:
  alignment
  comments

MSVC, supported C/C++ elements
symbols (labels, function names, variable names)
  MASM reserved words take precedence over symbols
constants
comments
macros and preprocessor directives
type names where a MASM type would be legal

MSVC Examples: Add 5 to variable
    int var = 0;
    __asm {
        add var, 5
    }

MSVC Examples: Get L1 cache size
    int ebx_v, ecx_v;
    __asm {
        mov eax, 4
        xor ecx, ecx
        cpuid
        mov ebx_v, ebx
        mov ecx_v, ecx
    }
    std::cout << "L1 Cache: "
              << ((ebx_v >> 22) + 1) * (((ebx_v >> 12) & 0x3ff) + 1)
                 * ((ebx_v & 0xfff) + 1) * (ecx_v + 1)
              << '\n';

gcc
    asm [volatile] ("template" : outs : ins : clobber)

special syntax within asm ()
uses template strings
template interpreted by the assembler (compiler doesn’t have to know instructions)
compiler fills template based on constraints
Docs: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

gcc Template
%0, or %[name] as placeholders
supports modifiers, e.g. %w0 to print a HImode register name (%ax)
multiple dialects as {mov %%eax, %%ebx | mov ebx, eax}
  specify one or all available dialects
  purely defined by order
escapes: %%, %{, %}, %|
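
To make the placeholder, escape and dialect rules concrete, here is a rough Rust sketch of how such a template could be expanded. It is a toy model under stated assumptions, not gcc's implementation: the function name `expand` is hypothetical, only single-digit `%N` placeholders are handled, and `%[name]` as well as modifiers such as `%w0` are left out.

```rust
/// Expands a gcc-style template (hypothetical, simplified model).
/// Handles `%N` (single digit), the escapes `%%`, `%{`, `%}`, `%|`,
/// and dialect alternatives `{a|b}`, selected by position.
fn expand(template: &str, operands: &[&str], dialect: usize) -> String {
    let mut out = String::new();
    let mut chars = template.chars();
    // Index of the alternative we are in, or None outside of `{...}`.
    let mut alternative: Option<usize> = None;
    while let Some(c) = chars.next() {
        // Text is kept outside braces, or inside the selected alternative.
        let emit = alternative.map_or(true, |a| a == dialect);
        match c {
            '%' => match chars.next() {
                // `%N` is replaced by the N-th operand.
                Some(d) if d.is_ascii_digit() => {
                    if emit {
                        out.push_str(operands[d.to_digit(10).unwrap() as usize]);
                    }
                }
                // `%%`, `%{`, `%}`, `%|` yield the literal second character.
                Some(e) => {
                    if emit {
                        out.push(e);
                    }
                }
                None => {}
            },
            '{' if alternative.is_none() => alternative = Some(0),
            '|' if alternative.is_some() => alternative = alternative.map(|a| a + 1),
            '}' if alternative.is_some() => alternative = None,
            _ => {
                if emit {
                    out.push(c);
                }
            }
        }
    }
    out
}

fn main() {
    let template = "{add $5, %0 | add %0, 5}";
    // Dialect 0 is the first alternative (AT&T), dialect 1 the second (Intel).
    println!("{}", expand(template, &["%eax"], 0).trim());
    println!("{}", expand(template, &["eax"], 1).trim());
}
```

The sketch shows why the dialects are "purely defined by order": nothing in the template names a dialect, the expander simply keeps the alternative at the requested position.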

gcc Arguments
specify outputs, inputs, and clobbers
constraints determine how to reference arguments (register, address, immediate)
all inputs have to be read before any output is written
clobbers specify touched state that cannot be inferred from inputs and outputs
compiler moves arguments into registers, saves clobbered registers, knows to reload changed memory

gcc Constraints
constraint codes are usually a single letter
architecture specific
e.g. r for register, m for memory, i for immediate
multiple alternative constraints are allowed, allowing the compiler to choose the most efficient

gcc Constraints
output
  start with = or + (input and output)
  can be modified with & to allow write before all inputs are read (early-clobber)
input
  can be tied to outputs by specifying the operand number as constraint
clobbers
  used registers
  modified flags
  if memory was modified

gcc volatile
asm statements can be declared volatile
disables code motion and dead code elimination
may be required when side-effects are not otherwise visible, e.g. asm ("wfi");

gcc Examples: Add 5 to variable
    int var = 0;
    asm ("add $5, %0" : "+r"(var));
    asm ("add $5, %0" : "=r"(var) : "0"(var));

gcc Examples: Get L1 cache size
    int ebx, ecx;
    asm (
        "mov $4, %%eax;"
        "xor %%ecx, %%ecx;"
        "cpuid;"
        "mov %%ebx, %0;"
        : "=r"(ebx), "=c"(ecx)
        :
        : "eax", "ebx", "edx"
    );
    printf("L1 Cache: %i\n", ((ebx >> 22) + 1)
        * (((ebx >> 12) & 0x3ff) + 1)
        * ((ebx & 0xfff) + 1)
        * (ecx + 1));

gcc Examples: Increment
    int in = 3, out;
    asm (
        "mov $1, %0\r\n"
        "add %1, %0\r\n"
        : "=r"(out)
        : "r"(in)
    );

Is this correct?
No, in and out could be in the same register.
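
In gcc the fix is to mark the output early-clobber (`"=&r"`), so that it cannot share a register with an input. As an illustration of the same situation in Rust, the sketch below assumes a recent stable toolchain on x86-64, where `std::arch::asm!` is available and uses Intel syntax by default. There, `out(reg)` is never allocated to a register that holds an input, while `lateout(reg)` may be. The function name `increment` and the fallback for other architectures are additions made only so the sketch builds everywhere.

```rust
/// Computes `inp + 1` the way the slide does: write 1 to the output
/// first, then add the input. `increment` is a hypothetical name.
#[cfg(target_arch = "x86_64")]
fn increment(inp: u64) -> u64 {
    use std::arch::asm;
    let result: u64;
    unsafe {
        asm!(
            "mov {dst}, 1",
            "add {dst}, {src}",
            // `out(reg)` never shares a register with an input, so writing
            // `dst` before `src` is read is fine. `lateout(reg)` would allow
            // sharing and reintroduce the bug from the slide.
            dst = out(reg) result,
            src = in(reg) inp,
            options(pure, nomem, nostack),
        );
    }
    result
}

/// Plain-Rust fallback so the sketch still builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn increment(inp: u64) -> u64 {
    inp + 1
}

fn main() {
    println!("{}", increment(3));
}
```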

LLVM IR
    %out = call i32 asm "template", "constraints"(i32 %in)

used to implement gcc-style inline assembly
outputs as return values, inputs as call arguments
can be qualified with sideeffect, alignstack, inteldialect
Docs: http://llvm.org/docs/LangRef.html#inline-assembler-expressions

LLVM Template
$0 as placeholder
supports modifiers, e.g. ${0:w}
escape is $$
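
Compared with the gcc templates above, the placeholder sigil moves from `%` to `$`. In AT&T syntax a literal `$` therefore has to be doubled, while `%%eax` becomes a plain `%eax`. The following Rust sketch rewrites a gcc-style template accordingly. It is a simplified illustration, not what any compiler actually runs: the name `gcc_to_llvm` is hypothetical, and named operands, modifiers and dialect alternatives are ignored.

```rust
/// Rewrites a gcc-style (AT&T) template into LLVM's placeholder syntax.
/// Hypothetical, simplified helper: handles `%%`, `%N` and `$` only.
fn gcc_to_llvm(template: &str) -> String {
    let mut out = String::new();
    let mut chars = template.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            // `$` is LLVM's placeholder sigil, so a literal one is doubled.
            '$' => out.push_str("$$"),
            '%' => match chars.peek().copied() {
                // gcc's `%%` escape is a plain `%` for LLVM.
                Some('%') => {
                    chars.next();
                    out.push('%');
                }
                // A `%N` operand reference becomes `$N`; the digit itself
                // is copied by the next loop iteration.
                Some(d) if d.is_ascii_digit() => out.push('$'),
                // Anything else is passed through unchanged.
                _ => out.push('%'),
            },
            _ => out.push(c),
        }
    }
    out
}

fn main() {
    println!("{}", gcc_to_llvm("add $5, %0"));
    println!("{}", gcc_to_llvm("mov %%ebx, %0;"));
}
```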

LLVM Constraints
constraint codes are a single letter, ^ followed by two letters, or {reg}
architecture specific
e.g. r for register, m for memory, i for immediate
multiple alternative constraints are allowed, allowing the compiler to choose the most efficient
indirect outputs/inputs via the * modifier
  argument is an address that will be read from or written to, typically =*m

LLVM Constraints
output
  start with = (no +)
  can be modified with & to allow write before all inputs are read (early-clobber)
input
  can be tied to outputs by specifying the operand number as constraint
clobbers
  start with ~
  used registers
  modified flags
  if memory was modified

LLVM Examples: Add 5 to variable
    %1 = alloca i32, align 4
    store i32 0, i32* %1, align 4
    %2 = load i32, i32* %1, align 4
    %3 = call i32 asm "add $$5, $0", "=r,0"(i32 %2)
    store i32 %3, i32* %1, align 4

LLVM Examples: Get L1 cache size
    %1 = alloca i32, align 4
    %2 = alloca i32, align 4
    %3 = call { i32, i32 } asm
        "mov $$4, %eax;xor %ecx, %ecx;cpuid;mov %ebx, $0;",
        "=r,={ecx},~{eax},~{ebx},~{edx}"()
    %4 = extractvalue { i32, i32 } %3, 0
    %5 = extractvalue { i32, i32 } %3, 1
    store i32 %4, i32* %1, align 4
    store i32 %5, i32* %2, align 4

Rust global_asm!()
    global_asm!(r"
    add5:
        mov %rdi, %rax
        add $5, %rax
        ret
    ");

    extern {
        fn add5(i: i64) -> i64;
    }

RFC 1548
insert assembly verbatim into the module
not yet implemented

Rust asm!()
    unsafe {
        asm!("template"
             : outs
             : ins
             : clobbers
             : options
        );
    }

straight binding to LLVM IR, but supports +
options are volatile, alignstack, and intel
Docs: https://doc.rust-lang.org/unstable-book/asm.html

Rust Examples: Add 5 to variable
    let mut var = 0;
    unsafe {
        asm!("add $$5, $0" : "+r"(var));
    }

Rust Examples: Get L1 cache size
    let ebx: i32;
    let ecx: i32;
    unsafe {
        asm!(r"
            mov $$4, %eax;
            xor %ecx, %ecx;
            cpuid;
            mov %ebx, $0;"
            : "=r"(ebx), "={ecx}"(ecx) :: "eax", "ebx", "edx"
        );
    }
    println!("L1 Cache: {}", ((ebx >> 22) + 1)
        * (((ebx >> 12) & 0x3ff) + 1)
        * ((ebx & 0xfff) + 1) * (ecx + 1));

RFC 129
    asm!("assembly template",
         positional parameters,
         named parameters,
         clobbers and options
    );

positional parameters:
  expr1, expr2_in -> expr2_out, "eax" = expr3_in -> expr3_out, …
named parameters:
  name1 = expr_in_out, name2 = expr_in -> expr_out, …
clobbers and options:
  "eax", "ebx", "memory", "volatile", "intel", …

RFC 129 Example
    fn addsub(a: int, b: int) -> (int, int) {
        let mut c = 0;
        let mut d = 0;
        unsafe {
            asm!("add {2:r}, {:=r}\n\t\
                  sub {2:r}, {:=r}",
                 a -> c, a -> d, b);
        }
        (c, d)
    }

Some Questions
Should we go the DSL, or template route?
  Does it make sense to support both?
What to do about early-clobber?
What placeholder should we use?
Should we copy gcc since it’s familiar?
  Are single character constraints sensible? Alternatives?
  Are sigil heavy constraints sensible? Alternatives?

My Thoughts
Templates work better for portability
We should use the same placeholder as everywhere else {}
Use words over sigils for constraints
Allow inputs and outputs in any order
Either do early-clobber by default, or be explicit (early_out/late_out)

Example: Add 5 to variable
    let mut var = 0;
    unsafe {
        asm!("add $5, {}", inout(reg) var);
    }
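
For comparison, a runnable sketch of the same operation, assuming a recent stable Rust toolchain on x86-64 where `std::arch::asm!` accepts `{}` placeholders and `inout(reg)` operands. Intel syntax is the default there, so the immediate is written as a bare `5` after the destination. The function name `add_five` and the fallback for other architectures are additions made only so the sketch builds everywhere.

```rust
/// Adds 5 to `var` with a single `add` instruction.
/// `add_five` is a hypothetical name used for this sketch.
#[cfg(target_arch = "x86_64")]
fn add_five(mut var: u64) -> u64 {
    use std::arch::asm;
    unsafe {
        // `inout(reg)` passes `var` in a register and reads the result
        // back from the same register.
        asm!("add {0}, 5", inout(reg) var, options(pure, nomem, nostack));
    }
    var
}

/// Plain-Rust fallback so the sketch still builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn add_five(var: u64) -> u64 {
    var + 5
}

fn main() {
    println!("{}", add_five(0));
}
```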

Example: Get L1 cache size
    let ebx: i32;
    let ecx: i32;
    unsafe {
        asm!(r"
            mov $4, %eax;
            xor %ecx, %ecx;
            cpuid;
            mov %ebx, {};",
            out(reg) ebx, out(ecx) ecx, clobber(eax, ebx, edx)
        );
    }
    println!("L1 Cache: {}", ((ebx >> 22) + 1)
        * (((ebx >> 12) & 0x3ff) + 1)
        * ((ebx & 0xfff) + 1) * (ecx + 1));

Thank you for your attention.
Any questions?
https://babelmonkeys.de/~florob/talks/RC-2017-06-07-inline-assembly.pdf