Memory Corruption: Why Protection is Hard
Mathias Payer, Purdue University
http://hexhive.github.io

Software is unsafe and insecure
● Low-level languages (C/C++) trade type safety and memory safety for performance
  – Programmer responsible for all checks
● Large set of legacy and new applications written in C / C++ prone to memory bugs
● Too many bugs to find and fix manually
  – Protect integrity through safe runtime system

Memory (un-)safety: invalid dereference
Dangling pointer (temporal): free(foo); *foo = 23;
Out-of-bounds pointer (spatial): char foo[40]; foo[42] = 23;
Violation iff: pointer is read, written, or freed
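
To make the two violation classes concrete, here is a minimal sketch of a buffer handle that checks every write. The names checked_buf, cb_alloc, cb_write, and cb_free are hypothetical and not from the talk; real memory-safety runtimes track this metadata very differently. The point is only that the check happens when the pointer is used, not when it is created.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical handle: a base pointer plus the metadata needed for checks. */
struct checked_buf {
    char  *base;  /* start of the allocation, NULL once freed */
    size_t size;  /* number of valid bytes */
    int    live;  /* 1 while allocated, 0 after cb_free() */
};

/* Allocate size bytes; returns 0 on success, -1 on allocation failure. */
int cb_alloc(struct checked_buf *b, size_t size) {
    b->base = malloc(size);
    b->live = (b->base != NULL);
    b->size = b->live ? size : 0;
    return b->live ? 0 : -1;
}

/* Release the allocation and mark the handle as dangling. */
void cb_free(struct checked_buf *b) {
    free(b->base);
    b->base = NULL;
    b->live = 0;
}

/* Checked write: returns 0 if stored, -1 on a temporal or spatial violation. */
int cb_write(struct checked_buf *b, size_t index, char value) {
    if (!b->live)         return -1;  /* temporal: use after free */
    if (index >= b->size) return -1;  /* spatial: out of bounds */
    b->base[index] = value;
    return 0;
}
```

With a 40-byte buffer, cb_write(&b, 42, 23) is rejected as a spatial violation, and any write after cb_free(&b) is rejected as a temporal one.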

Type Confusion
class B { int b; };
class D: B { int c; virtual void d() {} };
…
B *Bptr = new B;
D *Dptr = static_cast
[Figure: object layouts of B (b) and D (vtable*, b, c); seen through Dptr, the vtable* and c fields are marked "?"]

Deployed Defenses

Status of deployed defenses
● Data Execution Prevention (DEP)
● Address Space Layout Randomization (ASLR)
● Stack canaries
● Safe exception handlers
[Figure: memory map of the text (0x400 / 0x4??), data (0x800 / 0x8??), and stack (0xfff / 0xf??) regions with their R/W/X permissions]

Control-flow hijack attack
● Attacker modifies code pointer
  – Information leak: target address
  – Memory safety violation: write
● Control-flow leaves valid graph
  – Reuse existing code
  – Inject/modify code
[Figure: control-flow graph with nodes 1, 2, 3, 4 and 4']

Control-Flow Hijack Attack
int vuln(int usr, int usr2) {
  void (*func_ptr)();
  int *q = buf + usr;   // (1)
  …
  func_ptr = &foo;
  …
  *q = usr2;            // (2)
  …
  (*func_ptr)();        // (3)
}
[Figure: memory containing q, buf, func_ptr, gadget, and code; markers 1 and 2 point at q and func_ptr]
Attack scenario: code reuse
● Find addresses of gadgets
● Force memory corruption to set up attack
● Leverage gadgets for code-reuse attack
● (Fall back to code injection)
[Figure: memory regions Code, Heap, Stack]

Attack: buffer overflow to ROP
void vuln(char *u1) {
  // assert(strlen(u1) < MAX)
  char tmp[MAX];
  strcpy(tmp, u1);
  ...
}
vuln(&exploit);

[Figure: stack layout, original slot and its value after the overflow]
  tmp[MAX]              don't care
  saved base pointer    don't care
  return address        points to &system()
  1st argument: *u1     ebp after system call
  next stack frame      1st argument to system()
[Figure: attack path through the model: memory safety violation, integrity (*C), randomization (&C), flow integrity (*&C), control-flow hijack, attack]

Model for memory attacks
Memory safety:    Memory corruption
Integrity:        C   *C   D   *D
Randomization:    &C   &D
Flow Integrity:   *&C   *&D
Bad things:       Code corruption, control-flow hijack, data-only

Control-Flow Integrity (CFI)
CHECK(fn);
(*fn)(x);

CHECK_RET();
return 7;

Attacker may corrupt memory; code pointers are verified when used
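
A minimal sketch of what a forward-edge CHECK(fn) could look like, assuming the set of valid targets is a small static table. The names handler_fn, valid_targets, cfi_check, and checked_call are hypothetical; real CFI implementations derive target sets from static or dynamic analysis and usually abort on a violation instead of returning an error.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(int);

int twice(int x)        { return 2 * x; }
int negate(int x)       { return -x; }
int not_a_target(int x) { return x + 1000; }  /* exists, but never a valid callee */

/* Assumed target set for this one indirect call site. */
static const handler_fn valid_targets[] = { twice, negate };

/* CHECK(fn): returns 1 if fn is in the target set, 0 otherwise. */
int cfi_check(handler_fn fn) {
    size_t n = sizeof valid_targets / sizeof valid_targets[0];
    for (size_t i = 0; i < n; i++) {
        if (valid_targets[i] == fn)
            return 1;
    }
    return 0;
}

/* Checked indirect call: returns 0 and stores the result in *result,
   or returns -1 without calling fn if the check fails. */
int checked_call(handler_fn fn, int x, int *result) {
    if (!cfi_check(fn))
        return -1;
    *result = fn(x);
    return 0;
}
```

The smaller the table, the less freedom an attacker has after corrupting fn, which is why precision of the target set matters.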

Three directions for CFI
● Goal: minimize target sets, increase precision

Source-based CFI
● Kernel protection crucial for system integrity
● Enforce strict pointer propagation rules
  – Function pointers can only be assigned
  – Data pointers to function pointers are forbidden
● Enforce types to protect C++ virtual calls
* Fine-Grained Control-Flow Integrity for Kernel Software. Xinyang Ge, Nirupama Talele, Mathias Payer, and Trent Jaeger. In EuroS&P'16
* VTrust: Regaining Trust on Your Virtual Calls. Chao Zhang, Scott A. Carr, Tongxin Li, Yu Ding, Chengyu Song, Mathias Payer, and Dawn Song. In NDSS'16

Lockdown*: enforce CFI for binaries
● Fine-grained CFI relies on source code
● Coarse-grained CFI is imprecise
● Goal: enforce fine-grained CFI for binaries
  – Support legacy, binary code and modularity (libraries)
  – Leverage precise, dynamic analysis
  – Enforce stack integrity through shadow stack
  – Low performance overhead
* Fine-Grained Control-Flow Integrity through Binary Hardening. Mathias Payer, Antonio Barresi, and Thomas R. Gross. In DIMVA'15

Dynamic CFI analysis
● Leverage program's modularity through loader
/bin/
[Figure legend: symbol table of ELF DSO; .text section of DSO; allowed control-flow transfer; illegal control-flow transfer]

Necessity of shadow stack*
● Defenses without stack integrity are broken
  – Loop through two calls to the same function
  – Choose any caller as return location
● Shadow stack enforces stack integrity
  – Attacker restricted to arbitrary targets on the stack
  – Each target can only be called once, in sequence
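
The shadow-stack idea can be sketched as a second stack of return addresses that is checked on every return. This is an illustration under simplifying assumptions (a fixed-depth array, hypothetical names shadow_push and shadow_check_pop, addresses passed in explicitly); a real shadow stack lives in protected memory and is maintained by instrumented call and return sequences.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64  /* assumed maximum call depth for this sketch */

static uintptr_t shadow[SHADOW_DEPTH];
static size_t shadow_top = 0;

/* On call: record the return address. Returns 0, or -1 if the stack is full. */
int shadow_push(uintptr_t ret_addr) {
    if (shadow_top == SHADOW_DEPTH)
        return -1;
    shadow[shadow_top++] = ret_addr;
    return 0;
}

/* On return: the address about to be used must equal the most recent
   recorded one. Returns 0 and pops on a match, -1 on a mismatch or
   an empty shadow stack (the entry stays in place on a mismatch). */
int shadow_check_pop(uintptr_t ret_addr) {
    if (shadow_top == 0 || shadow[shadow_top - 1] != ret_addr)
        return -1;
    shadow_top--;
    return 0;
}
```

Because each entry is popped when it is used, a recorded return address can be consumed only once and only in last-in, first-out order, which matches the restriction stated in the bullets above.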
* Control-Flow Bending: On the Effectiveness of Control-Flow Integrity. Nicholas Carlini, Antonio Barresi, Mathias Payer, David Wagner, and Thomas R. Gross. In Usenix SEC'15

printf()-oriented programming*
● Translate program to format string
  – Memory reads: %s
  – Memory writes: %n
  – Conditional: %.*d
● Program counter becomes format string counter
  – Loops? Overwrite the format specifier counter
● Turing-complete domain-specific language
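
The write and conditional primitives can be tried in isolation. The sketch below uses only standard C conversions; the helper names fmt_store and fmt_is_nonzero are hypothetical, and the conditional shown is one simple way %.*d can distinguish zero from nonzero, not necessarily the encoding printbf uses.

```c
#include <assert.h>
#include <stdio.h>

/* Memory write: "%.*d" with precision n and value 0 emits exactly n zeros
   (nothing at all for n == 0), and "%n" then stores that character count.
   Intended for 0 <= n <= 255 so the output fits the local buffer. */
int fmt_store(int n) {
    char buf[256];
    int stored = -1;
    snprintf(buf, sizeof buf, "%.*d%n", n, 0, &stored);
    return stored;
}

/* Conditional: with precision 0, the value 0 prints as the empty string,
   while any nonzero value prints at least one character. The count
   stored by "%n" therefore tells zero from nonzero. */
int fmt_is_nonzero(int v) {
    char buf[32];
    int len = -1;
    snprintf(buf, sizeof buf, "%.*d%n", 0, v, &len);
    return len != 0;
}
```

Here the format strings are literals, so the example is benign; the attack setting differs in that the attacker controls the format string itself.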
* Direct fame to Nicholas Carlini, blame to me

Ever heard of brainfuck*?
● > == dataptr++                         %1$.*1$d %2$hn
● < == dataptr--                         %1$65535d%1$.*1$d%2$hn
● + == (*dataptr)++                      %3$.*3$d %4$hhn
● - == (*dataptr)--                      %3$255d%3$.*3$d%4$hhn
● . == putchar(*dataptr)                 %3$.*3$d%5$hn
● , == *dataptr = getchar()              %13$.*13$d%4$hn
● [ == if (*dataptr == 0) goto ']'       %1$.*1$d%10$.*10$d%2$hn
● ] == if (*dataptr != 0) goto '['       %1$.*1$d%10$.*10$d%2$hn
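
Counting characters shows why two of these strings act as increment and decrement on a 16-bit value: "%1$.*1$d" prints exactly v characters for a value v, one extra space makes v + 1, and a 65535-wide field adds 65535, which is -1 modulo 65536 once %hn truncates the count. The sketch below checks this arithmetic. The helper names are hypothetical, numbered arguments (%1$) are a POSIX extension rather than ISO C, and the truncating behaviour of %hn is assumed to be that of common libc implementations such as glibc.

```c
#include <assert.h>
#include <stdio.h>

/* Large enough for the longest output: 65535 + 65535 characters. */
static char sink[2 * 65536 + 16];

/* Apply a two-argument format: argument 1 is the current 16-bit value
   (used as both value and precision), argument 2 receives the character
   count, truncated to 16 bits by %hn. */
static unsigned short apply16(const char *fmt, unsigned short value) {
    short out = 0;
    snprintf(sink, sizeof sink, fmt, (int)value, &out);
    return (unsigned short)out;
}

/* v characters plus one space: stores v + 1 (mod 65536). */
unsigned short bf_inc16(unsigned short v) {
    return apply16("%1$.*1$d %2$hn", v);
}

/* 65535 padded characters plus v characters: stores v - 1 (mod 65536). */
unsigned short bf_dec16(unsigned short v) {
    return apply16("%1$65535d%1$.*1$d%2$hn", v);
}
```

For example, bf_inc16(41) yields 42 and bf_dec16(0) wraps around to 65535. The 8-bit variants with %hhn and a 255-wide field work the same way modulo 256.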
* https://en.wikipedia.org/wiki/Brainfuck

Exploitable program
void loop() {
  char* last = output;
  int* rpc = &progn[pc];

  while (*rpc != 0) {
    // fetch -- decode next instruction
    sprintf(buf, "%1$.*1$d%1$.*1$d%1$.*1$d%1$.*1$d%1$.*1$d%1$.*1$d%1$.*1$d%1$.*1$d%2$hn",
            *rpc, (short*)(&real_syms));

    // execute -- execute instruction
    sprintf(buf, *real_syms,
            ((long long int)array)&0xFFFF, &array,     // 1, 2
            *array, array, output,                     // 3, 4, 5
            ((long long int)output)&0xFFFF, &output,   // 6, 7
            &cond, &bf_CGOTO_fmt3[0],                  // 8, 9
            rpc[1], &rpc, 0, *input,                   // 10, 11, 12, 13
            ((long long int)input)&0xFFFF, &input      // 14, 15
            );

    // retire -- update PC
    sprintf(buf, "12345678%.*d%hn",
            (int)(((long long int)rpc)&0xFFFF), 0, (short*)&rpc);

    // for debug: do we need to print?
    if (output != last) {
      putchar(output[-1]);
      last = output;
    }
  }
}

Presenting: printbf*
● Turing complete interpreter
● Relies on format strings
● Allows you to execute stuff
http://github.com/HexHive/printbf
* Direct fame to Nicholas Carlini, blame to me

● Purdue's Capture the Flag (CTF) team
  – Compete in international hacking competitions
  – Gain practical security experience
  – Competitive, fun, challenging tasks
  – Open, inclusive environment
● Founded 2014, ~15 active, ~100 interested
● 3rd US academic team, top 50 overall

Conclusion
● ROP/JOP is key to modern exploits
  – Leak addresses, connect gadgets, inject code
● Control-flow hijack protection
  – Shadow stack, precise CFI, and locality
  – High precision is key for effectiveness
● Low overhead, open-source
● Future work
  – Protect context of control-flow
  – Protect data and data-flow, not just control

Thank you!
Questions?