Quiz 1: 39 Avg., 19 Std. Dev.
[Figure: histogram of Quiz 1 scores, 0 to 100]

Linking
CS 140
Feb. 11, 2015
Ali Jose Mashtizadeh

Outline
• Overview
• Detailed Example
• Shared Libraries
• Optimizations
• Security
• Summary

Compiler Toolchain
    foo.c -> cc -> foo.S -> as -> foo.o \
    bar.c -> cc -> bar.S -> as -> bar.o  -> ld -> a.out
                   baz.S -> as -> baz.o /

• How to reference functions across files?
• Dynamic libraries
• File Formats
• Security Considerations

Perspective on Memory
• Programming Language: (x += 1; add $1, %eax)
  • Instructions: Operations to perform
  • Variables: Mutable operands
  • Constants: Immutable operands
• Hardware:
  • Executable: Binary code, Just-in-Time compiled code
  • Read-Only: Constants
  • Read-Write: Variables, Stack, Heap
• Hardware accesses variables/code by address
• Linkers, Binary loader, Runtime determine this

Process Specification
• Executable file formats: ELF, aout, COFF, PE, MachO
  • Specification between linker/loader/OS
  • Explains how and where to load code and data
• Linker builds executable from the object files:

    Header:            code length, ...
    Object code        main:  Instructions
    (ELF: .text)              call XXXX
                              ret
                       bar:   ret
    Exported Symbols   main: 0
    (ELF: .symtab)     bar:  40
    Relocations        4: foo
    (external refs,
     ELF: .rel.text)

Program Loading

    prog.o -> ld -> a.out -> loader -> In-Memory Process
    [  Compile Time  ]       [     Load/Runtime        ]

• Most POSIX systems use a loader
• Loader maps code and data into memory
• Linker for dynamic libraries (later)
• ELF lets you suggest a loader to the kernel
• Optimizations:
  • Zero-initialized data is not stored in the file, so it is never written or read
  • Demand load: wait until first use to load/link
  • Copy-on-write/Sharing: read-only data & code

What does a process look like?
• Process address space segments:
  • Code (.text)
  • Data (.data)
  • Zeroed Data (.bss)
  • Read-only Data (.rodata)
  • Heap (Not in binary)
  • Stack (Not in binary)
• ELF has simple loader table

    Address space, top to bottom:
    Kernel
    ----------
    Stack       \
    Heap         | Userspace
    Data         |
    Code        /

Who sets up what?
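A small C sketch may make the segments concrete. Each object below is annotated with the segment it typically lands in and who sets it up. The names (initialized_global, zeroed_global, banner, copy_banner) are made up for illustration, and exact placement depends on the compiler and platform.

```c
#include <stdlib.h>
#include <string.h>

int initialized_global = 42;       /* .data: bytes stored in the binary, mapped by the loader */
int zeroed_global;                 /* .bss: only the size is recorded; zero-filled at load time */
const char banner[] = "hello!\n";  /* .rodata: mapped read-only by the loader */

/* Returns a heap copy of banner, or NULL if malloc fails. The caller frees it. */
char *copy_banner(void)
{
    char local[sizeof banner];          /* stack: reserved on function entry, released on return */
    memcpy(local, banner, sizeof banner);

    char *heap = malloc(sizeof local);  /* heap: handed out at run time by malloc */
    if (heap != NULL)
        memcpy(heap, local, sizeof local);
    return heap;
}
```

The code of copy_banner itself would sit in .text. Running nm or readelf -S on the compiled object is one way to check where a given toolchain actually put each symbol.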
• Global code/data:
  • Generated by the compiler/linker
  • Loaded into memory by loader
• Read-only Data:
  • Space is mmap’ed by loader
• Stack:
  • mmap’ed by the kernel for the initial thread, by user code or the kernel for additional threads
  • Stack space is allocated/freed by procedures
  • Compiler determines per-function stack usage statically
• Heap:
  • Runtime allocation managed by malloc

Process Creation on Windows
• The spawning application calls into ntdll.dll
• Ntdll.dll determines application type:
  • POSIX, Command Line, OS/2, DOS, Win32
• Runtime may be spawned if different than current:
  • posix.exe, cmd.exe, ntvdm.exe
• Load memory into process space
• Tells kernel to enter initialization routine in process

Detailed Example

Example: Compiler
• Simple hello world program
• Compile with: % cc -S hello.c
• -S compiles but does not assemble file
• Output hello.s has symbolic reference to printf

    int main(...)
    {
        printf("hello!\n");
    }

            .section .rodata
    .LC0:   .string "hello!\n"
            .text
            .globl main
    main:   enter $4, $0
            movl $.LC0, (%esp)
            call printf
            leave
            ret

Example: Assembler
• call printf is assembled as call $0
• Assembler assembles each file at address 0x0
• Outputs symbol and relocation tables
• % as hello.s -o hello.o
• % objdump -d hello.o

    0x0000 main:
    0x0000   enter $4, $0
    0x0003   movl $.LC0, (%esp)
    0x000A   call $0
    0x000F   leave
    0x0010   ret

Example: Linker
• Linker must fix the reference to printf
• % ld hello.o -o hello
• % objdump -d hello

    0x00400100 main:
    0x00400100   enter $4, $0
    0x00400103   movl $0x00600000, (%esp)
    0x0040010A   call $0x0043FC84
    0x0040010F   leave
    0x00400110   ret
    0x00600000 LC0: "hello!\n"

Simple Linker
• Pass 1:
  • Coalesce sections with same name
  • Arrange in memory
  • Read symbol tables (maintain a global symbol table)
  • Compute virtual address of sections (start and offset)
• Pass 2:
  • Patch references using global symbol table
  • Emit result into a new object or binary
  • Emit loader table for loader (simplified view of sections)
  • Optionally: Symbol tables may be discarded

Linker Scripts
• Tells linker how and where to load
• Link with script: % ld -T linker.script foo.o
• Output default script: % ld --verbose

    ENTRY(_init)
    OUTPUT_FORMAT(elf32-i386)
    SECTIONS {
        .text : ALIGN(0x1000) {
            *(.text)
        }
        /* Other sections */
    }

Linker Scripts Uses
• Custom linker scripts for kernels
• Used by kernels and apps to have special sections
• Collect data structures into a specified section

    /* gcc/clang */
    int inside_section __attribute__((section(".data.special")));

• Per-thread data

    /* C11/C++11 */
    thread_local int per_thread_global;

Compiler and Linker Interaction
• Code Model: Specifies where code will run
  • Compiler must choose the right assembly instructions
  • These may be relative to where the code is located
  • Negative vs Positive addresses
  • Small code may use shorter relative addresses
  • The linker usually can’t modify instructions
• Architecture specific
  • small (code+data < 2GB), medium (code < 2GB), large (no restrictions), kernel (code in the top 2GB of the address space)
• % cc -mcmodel=kernel -c pmap.c -o pmap.o

ELF: File Header
• Print header: % readelf -h a.out

    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 09 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - FreeBSD
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x400680
      Start of program headers:          64 (bytes into file)
      Start of section headers:          3856 (bytes into file)
      Flags:                             0x0
      Size of this header:               64 (bytes)
      Size of program headers:           56 (bytes)
      Number of program headers:         8
      Size of section headers:           64 (bytes)
      Number of section headers:         28
      Section header string table index: 25

ELF: Sections
• Print sections: % readelf -S a.out

    Section Headers:
      [Nr] Name      Type      Address           Offset
           Size              EntSize           Flags  Link  Info  Align
      [ 0]           NULL      0000000000000000  00000000
           0000000000000000  0000000000000000            0     0     0
      ...
      [12] .text     PROGBITS  0000000000400680  00000680
           0000000000000328  0000000000000000  AX        0     0    16
      ...
      [22] .data     PROGBITS  0000000000600c78  00000c78
           000000000000001c  0000000000000000  WA        0     0     8
      [23] .bss      NOBITS    0000000000600c98  00000c94
           0000000000000008  0000000000000000  WA        0     0     8
      ...
      [26] .symtab   SYMTAB    0000000000000000  00001610
           0000000000000678  0000000000000018           27    52     8
      [27] .strtab   STRTAB    0000000000000000  00001c88
           0000000000000313  0000000000000000            0     0     1

ELF: Program Header
• Print program header: % readelf -l a.out

    Program Headers:
      Type    Offset             VirtAddr           PhysAddr
              FileSiz            MemSiz             Flags  Align
      PHDR    0x0000000000000040 0x0000000000400040 0x0000000000400040
              0x00000000000001c0 0x00000000000001c0 R E    8
      INTERP  0x0000000000000200 0x0000000000400200 0x0000000000400200
              0x0000000000000015 0x0000000000000015 R      1
          [Requesting program interpreter: /libexec/ld-elf.so.1]
      LOAD    0x0000000000000000 0x0000000000400000 0x0000000000400000
              0x0000000000000a64 0x0000000000000a64 R E    200000
      LOAD    0x0000000000000a68 0x0000000000600a68 0x0000000000600a68
              0x000000000000022c 0x0000000000000238 RW     200000
      ...
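The FileSiz and MemSiz columns are what let zero-initialized data stay out of the file: a loader brings in FileSiz bytes and zero-fills the rest up to MemSiz. The following is a minimal sketch of that step, assuming a hypothetical struct segment and load_segment helper. A real loader would mmap the file rather than copy it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Subset of one program header entry; fields mirror the readelf columns. */
struct segment {
    uint64_t offset;  /* Offset: where the segment's bytes start in the file */
    uint64_t vaddr;   /* VirtAddr: where the segment should appear in memory */
    uint64_t filesz;  /* FileSiz: bytes present in the file */
    uint64_t memsz;   /* MemSiz: bytes occupied in memory (>= FileSiz) */
};

/* Hypothetical helper: materialize one LOAD segment into mem.
 * Returns the number of zero-filled bytes, or -1 if the entry is inconsistent. */
long load_segment(const struct segment *seg,
                  const uint8_t *file, size_t file_len,
                  uint8_t *mem, size_t mem_cap)
{
    if (seg->filesz > seg->memsz || seg->memsz > mem_cap)
        return -1;
    if (seg->offset > file_len || seg->filesz > file_len - seg->offset)
        return -1;

    /* Bytes that exist in the file (.data and friends). */
    memcpy(mem, file + seg->offset, (size_t)seg->filesz);
    /* Bytes that exist only in memory (.bss and padding). */
    memset(mem + seg->filesz, 0, (size_t)(seg->memsz - seg->filesz));
    return (long)(seg->memsz - seg->filesz);
}
```

Applied to the read-write LOAD entry in the listing (FileSiz 0x22c, MemSiz 0x238), the sketch zero-fills 0xc bytes, which covers the 8-byte .bss plus alignment padding.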
• The OS and loader use the program headers for loading
• Loader uses sections when linking against libraries

C++ & Name mangling
• C++ has overloaded functions:
  • Same name but different parameter types
• Name mangling creates unique symbol per type
• Compiler and/or version specific

    // C++
    int foo(int a)
    { return 0; }
    int foo(int a, int b)
    { return 0; }

    % nm foo.o
    0000 T _Z3fooi
    0008 T _Z3fooii
    % nm foo.o | c++filt
    0000 T foo(int)
    0008 T foo(int, int)

Shared Libraries

Dynamic Linking
• Shared libraries:
  • Make upgrading, bug fixing, and security patches easier
  • Reduces total code size installed
  • Plugins
• ELF: Main binary specifies which loader to use:
  • BSD: /libexec/ld-elf.so.1
• Load-time or Run-time linking

Static Shared Libraries
• Programs often share many libraries like libc
• *.a files are archives of object files created with ar
• % ar -rc libc.a printf.o scanf.o ...

    ls        sh        cc
    libc.a    libc.a    libc.a    (each program carries its own copy)

Dynamic Shared Libraries
• No need to recompile software on libc changes
• Must be compiled with -fpic (more on this next)
• % ld -shared -o libc.so printf.o ...

    ls    sh    cc
      \   |   /
      libc.so    (one copy shared by all programs)

Compiler Flag -fpic
• Compiler generates position-independent (relocatable) code
• Uses PC-relative addressing
• Architecture specific
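PC-relative addressing is what makes the code position independent: an instruction encodes the distance to its target rather than the target's address, so the encoding survives being loaded elsewhere. The sketch below assumes a hypothetical helper pc_relative_disp and x86-64 conventions (32-bit displacement measured from the next instruction). It also relies on the usual two's-complement conversion when the difference is negative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compute the displacement a PC-relative call would encode.
 * next_insn is the address of the instruction following the call.
 * Returns false if the target is out of reach of a signed 32-bit displacement. */
bool pc_relative_disp(uint64_t next_insn, uint64_t target, int32_t *disp)
{
    /* Unsigned subtraction wraps, so the cast yields the signed distance. */
    int64_t d = (int64_t)(target - next_insn);
    if (d < INT32_MIN || d > INT32_MAX)
        return false;
    *disp = (int32_t)d;
    return true;
}
```

Shifting both addresses by the same load offset leaves the displacement unchanged, so nothing needs patching at load time. The plus or minus 2GB reach of the 32-bit field is also the limit the small code model assumes.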
• Linking shared libraries
  • Same procedure as linking a program binary
  • Different types of addressing modes are handled

Position Independent Code
• Without indirection, the loader has to patch every call into a library
  • Very slow loading times!
• Instead we use indirection:

    Program:
    main:   ...
            call printf

    Procedure Linkage Table (PLT):
    printf: call GOT[5]
            ...

    Global Offset Table:
    [5]:    &printf (libc)

Lazy Dynamic Linking
• GOT entry initially points to dlfixup
• Loader patches the GOT entry on first call
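The lazy scheme can be imitated in plain C with a function pointer standing in for a GOT slot. This is only a sketch: real PLT stubs are a few machine instructions, and every name here (got, fixup_stub, plt_double, real_double) is invented for illustration.

```c
typedef int (*fn_t)(int);

static int resolve_count;  /* how many times the slow symbol lookup ran */

/* Stand-in for the library function (printf in the slides). */
static int real_double(int x) { return 2 * x; }

static int fixup_stub(int x);

/* One-entry "GOT": starts out pointing at the fixup routine. */
static fn_t got[1] = { fixup_stub };

/* Plays the role of dlfixup: resolve the symbol, patch the GOT slot,
 * then forward the original call. */
static int fixup_stub(int x)
{
    resolve_count++;
    got[0] = real_double;  /* later calls bypass the stub */
    return got[0](x);
}

/* Plays the role of the PLT entry: always jumps through the GOT. */
int plt_double(int x) { return got[0](x); }

/* Exposes the counter so a caller can observe the one-time fixup. */
int lazy_resolve_count(void) { return resolve_count; }
```

Calling plt_double twice resolves the symbol only once: the first call goes through fixup_stub, the second goes straight to real_double.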
    Program:
    main:   ...
            call printf

    Procedure Linkage Table (PLT):
    printf: call GOT[5]
            ...

    Global Offset Table:
    [5]:    &dlfixup

Explicit Dynamic Linking
• Bind to a symbol at runtime
• Used for loading plugins

    #include <dlfcn.h>

    // Open dynamic library
    void *p = dlopen("foo.so", RTLD_LAZY);
    if (p == NULL) { /* report dlerror() and give up */ }

    // Lookup symbol
    void (*fp)(void) = (void (*)(void))dlsym(p, "foo");

    // Run function pointer
    if (fp != NULL)
        fp();

Optimizations

Link Time Optimization
• Link Time Optimization
  • Compiler optimizations that cross modules
  • Inlining of code
  • Simplification, dead code elimination, etc.
• Requires linking compiler intermediate representation (IR)
• Clang supports this if you link llvm IR

    % clang -emit-llvm -c foo.c -o foo.o
    % clang -emit-llvm -c bar.c -o bar.o
    % clang foo.o bar.o -o a.out

Profile-Guided Optimization
• Collect performance profiling of code usage
• Optimize code for size/performance based on runs

Generate instrumented code
    % clang -fprofile-instr-generate foo.c -o foo

Run and collect data (writes default.profraw)
    % ./foo

Merge the raw profile
    % llvm-profdata merge -output=foo.profdata default.profraw

Giving clang profiling data
    % clang -fprofile-instr-use=foo.profdata foo.c -o foo

Security

Attacks
    void fn()
    {
        char buf[80];
        gets(buf);
        ...
    }

1. Attacker injects code into a buffer:
   • Code usually tries to execute a shell
2. Overwrites return address using buffer
   • Pointer points to code

Linking and Security
• No eXecute (NX):
  • Loader marks code as read-only
  • Stacks/Data marked as non-executable
• Address Space Layout Randomization (ASLR):
  • Relocate executable to a different address on each load
  • Makes it harder for the attacker to determine addresses
  • Attacks usually require an information leak bug
• Compiler Protection:
  • Stack protector (stack cookies): Check for buffer overflows
  • Bounds checking: Hard to enforce system calls
  • Control flow integrity: Verify code pointers

ASLR: Compiling
• Binaries compiled with special flags:
• For shared/dynamic libraries:
  • % cc -fpic -c printf.c -o printf.o
• For static libraries and program binary:
  • % cc -fpie -c main.c -o main.o

ASLR: Loading
• Requires kernel+loader support for ASLR
• Kernel randomizes initial stack
• Loader chooses random address to load at
• Every library and program can be randomized
• Heap should be randomized by libc
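One way to observe these effects is to sample an address from each region and compare the values across runs. The sketch below assumes a hypothetical struct addr_sample and sample_addresses helper. It only collects the addresses and leaves printing them to the caller.

```c
#include <stdint.h>
#include <stdlib.h>

/* One address from each region the slide mentions. */
struct addr_sample {
    uintptr_t code;   /* a function in the program binary */
    uintptr_t stack;  /* a local variable */
    uintptr_t heap;   /* a block returned by malloc */
};

/* Fills *out with the three addresses.
 * Returns 0 on success, -1 if malloc fails. */
int sample_addresses(struct addr_sample *out)
{
    int local = 0;
    void *block = malloc(16);
    if (block == NULL)
        return -1;

    out->code  = (uintptr_t)&sample_addresses;
    out->stack = (uintptr_t)&local;
    out->heap  = (uintptr_t)block;

    free(block);
    return 0;
}
```

Printing the three fields from two separate runs should show different stack and heap values on a system with ASLR enabled. The code address is expected to move only if the program was built as a position-independent executable.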
• Requires exec() to rerandomize!

Blind ROP Attack
• Brute force attack for remote exploits
• Requires approximately 2k-4k requests
• ASLR broken by “stack reading”
• Attack is defeatable if the program always fork-exec’s
• MySQL bug allows attackers to bypass ASLR
• GOT/PLT simplifies searching for useful functions
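The request count follows from guessing one byte at a time instead of a whole word at once. Below is a toy simulation of that counting argument, with nothing being attacked: the oracle survives is a local stand-in for "the forked worker did not crash", and stack_read and WORD_BYTES are illustrative names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORD_BYTES 8

/* Simulated oracle: true when the first n guessed bytes match the secret word.
 * In the scenario on the slide this answer would come from a remote request. */
static int survives(const uint8_t *secret, const uint8_t *guess, size_t n)
{
    return memcmp(secret, guess, n) == 0;
}

/* Recovers the word one byte at a time; returns the number of probes used. */
unsigned stack_read(const uint8_t secret[WORD_BYTES], uint8_t out[WORD_BYTES])
{
    unsigned probes = 0;
    for (size_t i = 0; i < WORD_BYTES; i++) {
        for (unsigned b = 0; b < 256; b++) {
            out[i] = (uint8_t)b;
            probes++;
            if (survives(secret, out, i + 1))
                break;  /* byte i is known; move on to the next one */
        }
    }
    return probes;
}
```

The worst case is 256 probes per byte, or 2048 for an 8-byte word, instead of up to 2^64 for guessing the word in one shot. This only works when the secret stays the same between probes, which is why re-randomizing on every exec defeats it.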
• Code & Paper: http://www.scs.stanford.edu/brop/

Linking Summary
• Compiler/Assembler:
  • Generate one object file for each source file
  • Don’t know about memory layout and external refs
• Linker:
  • Combines all object files into a library or executable
  • Determines memory layout
• OS:
  • Loads loader and initial stack
• Loader:
  • Loads binary and libraries into memory
  • Links shared libraries using GOT/PLT
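To tie the linker's role together, here is a toy version of the two-pass scheme from the Simple Linker slide. It is a sketch under heavy assumptions: only .text exists, every relocation is a 4-byte little-endian absolute address (as in the simplified call $0x0043FC84 example, whereas a real x86 call encodes a PC-relative displacement), and all type and function names are invented.

```c
#include <stdint.h>
#include <string.h>

#define MAX_SYMS 16

struct symbol { const char *name; uint32_t offset; };  /* exported: offset within the object */
struct reloc  { uint32_t offset; const char *name; };  /* 4-byte hole that needs an address */

struct object {
    const uint8_t *text; uint32_t text_len;
    const struct symbol *syms; int nsyms;
    const struct reloc *relocs; int nrelocs;
    uint32_t base;                                      /* assigned in pass 1 */
};

struct global_sym { const char *name; uint32_t addr; };

static int lookup(const struct global_sym *tab, int n, const char *name, uint32_t *addr)
{
    for (int i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0) { *addr = tab[i].addr; return 0; }
    return -1;
}

/* Links objs[0..n) at load_addr into out. Returns the image length, or -1 on error. */
int toy_link(struct object *objs, int n, uint32_t load_addr, uint8_t *out, uint32_t out_cap)
{
    struct global_sym tab[MAX_SYMS];
    int ntab = 0;
    uint32_t cursor = 0;

    /* Pass 1: place each .text back to back and build the global symbol table. */
    for (int i = 0; i < n; i++) {
        objs[i].base = load_addr + cursor;
        for (int s = 0; s < objs[i].nsyms; s++) {
            if (ntab == MAX_SYMS)
                return -1;
            tab[ntab].name = objs[i].syms[s].name;
            tab[ntab].addr = objs[i].base + objs[i].syms[s].offset;
            ntab++;
        }
        cursor += objs[i].text_len;
    }
    if (cursor > out_cap)
        return -1;

    /* Pass 2: copy the code and patch every reference from the symbol table. */
    for (int i = 0; i < n; i++) {
        uint8_t *dst = out + (objs[i].base - load_addr);
        memcpy(dst, objs[i].text, objs[i].text_len);
        for (int r = 0; r < objs[i].nrelocs; r++) {
            const struct reloc *rel = &objs[i].relocs[r];
            uint32_t addr;
            if (rel->offset + 4 > objs[i].text_len)
                return -1;
            if (lookup(tab, ntab, rel->name, &addr) != 0)
                return -1;  /* undefined symbol */
            for (int b = 0; b < 4; b++)
                dst[rel->offset + b] = (uint8_t)(addr >> (8 * b));
        }
    }
    return (int)cursor;
}
```

Linking an object whose main calls bar together with an object that defines bar fills the call's operand with bar's final address. Linking the first object alone fails with an undefined symbol, which is the error a real linker reports in the same situation.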