Memory Corruption: Why Protection is Hard

Mathias Payer, Purdue University
http://hexhive.github.io

Software is unsafe and insecure

● Low-level languages (C/C++) trade type safety and memory safety for performance
  – Programmer responsible for all checks

● Large set of legacy and new applications written in C / C++ prone to memory bugs

● Too many bugs to find and fix manually
  – Protect integrity through safe

Memory (un-)safety: invalid deref.

Dangling pointer (temporal): free(foo); *foo = 23;

Out-of-bounds pointer (spatial): char foo[40]; foo[42] = 23;

Violation iff: pointer is read, written, or freed
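The two violation classes above can be made concrete with a small checker. The following is a minimal sketch, not a technique from the slides: `checked_ptr`, `checked_alloc`, `checked_free`, and `checked_store` are hypothetical names, and the sketch assumes that every access goes through the same `checked_ptr` object (copies of the struct would not observe the free).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical "fat pointer": the raw pointer plus the metadata a checker needs. */
typedef struct {
    char  *base;  /* start of the allocation */
    size_t size;  /* size of the allocation in bytes */
    bool   live;  /* false once the allocation has been freed */
} checked_ptr;

checked_ptr checked_alloc(size_t size) {
    checked_ptr p = { malloc(size), size, true };
    assert(p.base != NULL);
    return p;
}

void checked_free(checked_ptr *p) {
    free(p->base);
    p->base = NULL;
    p->live = false;
}

/* Performs the store only if it is temporally and spatially valid. */
bool checked_store(checked_ptr *p, size_t index, char value) {
    if (!p->live)         return false;  /* temporal violation: dangling pointer */
    if (index >= p->size) return false;  /* spatial violation: out of bounds */
    p->base[index] = value;
    return true;
}
```

With a 40-byte allocation, `checked_store(&foo, 42, 23)` is rejected as a spatial violation, and any store after `checked_free(&foo)` is rejected as a temporal one.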

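Type Confusion

    class B { public: int b; };
    class D : public B { public: int c; virtual void d() {} };
    …
    B *Bptr = new B;
    D *Dptr = static_cast<D*>(Bptr);
    Dptr->c = 0x43; // Type confusion!
    Dptr->d();      // Type confusion!

(Figure: object layouts. A B object holds only b; a D object holds vtable*, b, and c. Bptr points at b, so the vtable* and c slots reached through Dptr lie outside the allocated B object.)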

Deployed Defenses

Status of deployed defenses

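● Data Execution Prevention (DEP)
● Address Space Layout Randomization (ASLR)
● Stack canaries
● Safe exception handlers

(Figure: memory map)

    Region   Address (with ASLR)   Permissions (with DEP)
    text     0x400 (0x4??)         RWX becomes R-X
    data     0x800 (0x8??)         RWX becomes RW-
    stack    0xfff (0xf??)         RWX becomes RW-

Control-Flow Hijack Attack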

Control-flow hijack attack

● Attacker modifies code pointer
  – Information leak: target address
  – Memory safety violation: write
● Control-flow leaves valid graph
  – Reuse existing code
  – Inject/modify code

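Control-Flow Hijack Attack

    int vuln(int usr, int usr2) {
      void (*func_ptr)();
      int *q = buf + usr;   // 1: pointer computed from attacker input
      …
      func_ptr = &foo;
      …
      *q = usr2;            // 2: write through q can overwrite func_ptr
      …
      (*func_ptr)();        // 3: indirect call follows the corrupted pointer
    }

(Figure: memory layout showing q, buf, func_ptr, a gadget, and code; the numbers link each step in the code to the memory cell it affects.)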

Attack scenario: code reuse

● Find addresses of gadgets
● Force memory corruption to set up attack
● Leverage gadgets for code-reuse attack
● (Fall back to code injection)

(Figure: memory regions Code, Heap, Stack)

Attack: to ROP

    void vuln(char *u1) {
      // assert(strlen(u1) < MAX)
      char tmp[MAX];
      strcpy(tmp, u1);
      ...
    }
    vuln(&exploit);

Stack contents after the overflow:

    Stack slot            Overwritten with
    tmp[MAX]              don't care
    saved base pointer    don't care
    return address        points to &system()
    1st argument: *u1     ebp after system call
    next stack frame      1st argument to system()

(Sidebar: attack path through the model. Memory safety violation, Integrity (*C), Randomization (&C), Flow Integrity (*&C), control-flow hijack attack.)

Model for memory attacks
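The commented-out assert in vuln() is the missing check. A minimal sketch of enforcing it before the copy follows; `copy_checked` is a hypothetical helper, and `MAX` is set to 16 only for illustration because the slide leaves it unspecified.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX 16  /* assumed buffer size; the slide does not give a value */

/* Copies u1 into tmp only if it fits, including the terminating NUL. */
bool copy_checked(char tmp[MAX], const char *u1) {
    assert(tmp != NULL && u1 != NULL);
    size_t len = strlen(u1);
    if (len >= MAX) {
        return false;          /* would overflow tmp and reach the return address */
    }
    memcpy(tmp, u1, len + 1);  /* len + 1 also copies the NUL terminator */
    return true;
}
```

Rejecting the input keeps the saved base pointer and return address intact, so the attack never gets past the first stage of the model.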

(Figure: model for memory attacks)

    Memory safety     memory corruption
    Integrity         C, *C         D, *D
    Randomization     &C            &D
    Flow Integrity    *&C           *&D

    Attacks: code corruption, control-flow hijack, data-only
    Result: bad things

Control-Flow Integrity

Control-Flow Integrity (CFI)

    CHECK(fn);
    (*fn)(x);

    CHECK_RET();
    return 7;

Attacker may corrupt memory; code pointers are verified when used.
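A minimal sketch of what CHECK(fn) could look like for one indirect call site, assuming the set of valid targets is known ahead of time. The names `valid_targets`, `cfi_check`, and `guarded_call` are hypothetical; a deployed CFI mechanism would typically abort the program on a failed check rather than return false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*handler_fn)(int);

int double_it(int x)   { return 2 * x; }
int negate(int x)      { return -x; }
int not_allowed(int x) { return x + 1000; }  /* stands in for an attacker-chosen target */

/* Hypothetical target set for a single indirect call site. */
static const handler_fn valid_targets[] = { double_it, negate };

static bool cfi_check(handler_fn fn) {
    for (size_t i = 0; i < sizeof valid_targets / sizeof valid_targets[0]; i++) {
        if (valid_targets[i] == fn) {
            return true;
        }
    }
    return false;
}

/* Calls fn only if it is in the target set; stores the result on success. */
bool guarded_call(handler_fn fn, int x, int *result) {
    assert(result != NULL);
    if (!cfi_check(fn)) {
        return false;      /* CHECK(fn) failed: pointer was corrupted */
    }
    *result = (*fn)(x);    /* (*fn)(x) */
    return true;
}
```

The smaller the target set, the less room an attacker has, which is the precision goal on the next slide.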

Three directions for CFI

● Goal: minimize target sets, increase precision

Source-based CFI

● Kernel protection crucial for system integrity
● Enforce strict pointer propagation rules
  – Function pointers can only be assigned
  – Data pointers to function pointers are forbidden

● Enforce types to protect C++ virtual calls

* Fine-Grained Control-Flow Integrity for Kernel Software. Xinyang Ge, Nirupama Talele, Mathias Payer, and Trent Jaeger. In EuroS&P'16

* VTrust: Regaining Trust on Your Virtual Calls. Chao Zhang, Scott A. Carr, Tongxin Li, Yu Ding, Chengyu Song, Mathias Payer, and Dawn Song. In NDSS'16

Lockdown*: enforce CFI for binaries

● Fine-grained CFI relies on

● Coarse-grained CFI is imprecise

● Goal: enforce fine-grained CFI for binaries
  – Support legacy, binary code and modularity (libraries)
  – Leverage precise, dynamic analysis
  – Enforce stack integrity through shadow stack
  – Low performance overhead

* Fine-Grained Control-Flow Integrity through Binary Hardening. Mathias Payer, Antonio Barresi, and Thomas R. Gross. In DIMVA'15

Dynamic CFI analysis

● Leverage program's modularity through loader

(Figure: three ELF DSOs, /bin/, /lib/libc.so.6, and /lib/lib*, each shown with its symbol table of exported and imported symbols (puts, scanf, mprotect, funcA, funcB, _dl*, ifunc*, ...) and its .text section (call puts; lea fptr, %eax; call *%eax; ...). Arrows mark allowed and illegal control-flow transfers between the DSOs.)

Modularity increases precision. No source needed. Leverage context of transfers.
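One plausible reading of the allowed and illegal transfers in the figure is that a call across DSOs is permitted only when the caller imports the symbol and the callee exports it. The sketch below assumes that rule; the `module` struct, the sample symbol lists, and `transfer_allowed` are hypothetical simplifications, not Lockdown's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one DSO's symbol table. */
typedef struct {
    const char  *name;
    const char **exported;  /* NULL-terminated list */
    const char **imported;  /* NULL-terminated list */
} module;

static bool has_symbol(const char **list, const char *sym) {
    for (; *list != NULL; list++) {
        if (strcmp(*list, sym) == 0) {
            return true;
        }
    }
    return false;
}

/* Cross-module transfer: caller must import sym, callee must export it. */
bool transfer_allowed(const module *from, const module *to, const char *sym) {
    assert(from != NULL && to != NULL && sym != NULL);
    return has_symbol(from->imported, sym) && has_symbol(to->exported, sym);
}

/* Sample data loosely following the figure: the binary imports puts and scanf. */
static const char *app_exports[]  = { NULL };
static const char *app_imports[]  = { "puts", "scanf", NULL };
static const char *libc_exports[] = { "puts", "scanf", "mprotect", NULL };
static const char *libc_imports[] = { NULL };

const module app  = { "app",  app_exports,  app_imports };
const module libc = { "libc", libc_exports, libc_imports };
```

Under this rule the binary may call puts, but an indirect call that lands on mprotect is rejected because the binary never imported it.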

Necessity of shadow stack*

● Defenses without stack integrity are broken
  – Loop through two calls to the same function
  – Choose any caller as return location

● Shadow stack enforces stack integrity
  – Attacker restricted to arbitrary targets on the stack
  – Each target can only be called once, in sequence
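A minimal sketch of the shadow stack idea, assuming return addresses can be modeled as plain integers and that the shadow copy itself is out of the attacker's reach. `shadow_push`, `shadow_check_and_pop`, and `SHADOW_DEPTH` are hypothetical names; a real defense would terminate the program on a mismatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64  /* assumed capacity for this sketch */

static uintptr_t shadow[SHADOW_DEPTH];
static size_t    shadow_top = 0;

/* On call: record the return address in the protected copy. */
void shadow_push(uintptr_t return_address) {
    assert(shadow_top < SHADOW_DEPTH);
    shadow[shadow_top++] = return_address;
}

/* On return: the address taken from the regular stack must match
 * the most recent shadow entry, otherwise the return is rejected. */
bool shadow_check_and_pop(uintptr_t return_address) {
    if (shadow_top == 0) {
        return false;
    }
    if (shadow[shadow_top - 1] != return_address) {
        return false;  /* mismatch: the on-stack return address was modified */
    }
    shadow_top--;
    return true;
}
```

Because only the top entry is accepted and each entry is popped on use, returns must occur in call order and each return target is usable once.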

* Control-Flow Bending: On the Effectiveness of Control-Flow Integrity. Nicholas Carlini, Antonio Barresi, Mathias Payer, David Wagner, and Thomas R. Gross. In Usenix SEC'15

printf()-oriented programming*

● Translate program to format string
  – Memory reads: %s
  – Memory writes: %n
  – Conditional: %.*d
● Program counter becomes format string counter
  – Loops? Overwrite the format specific counter
● Turing-complete domain-specific language
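To see why these directives are enough for computation, note that `%.*d` with value 0 emits exactly as many characters as its precision argument, and `%n` stores the number of characters emitted so far. A minimal sketch of addition built from those two facts follows; `printf_add` is a hypothetical helper, and the sketch assumes a libc that honors `%n` (some hardened environments disable it).

```c
#include <assert.h>
#include <stdio.h>

/* Adds two small non-negative integers using only printf formatting:
 * "%.*d" with value 0 prints `a` zeros, then `b` zeros,
 * and "%n" writes the running character count (a + b) to memory. */
int printf_add(int a, int b) {
    char buf[1024];
    int sum = 0;
    assert(a >= 0 && b >= 0 && a + b < (int)sizeof buf);
    snprintf(buf, sizeof buf, "%.*d%.*d%n", a, 0, b, 0, &sum);
    return sum;
}
```

For example, `printf_add(3, 4)` formats "0000000" into the buffer and stores 7 in `sum`. An attacker who controls the format string gets the same arithmetic and memory writes without injecting any code.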

* Direct fame to Nicholas Carlini, blame to me

Ever heard of brainfuck*?

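    Brainfuck   Meaning                        Format string
    >           dataptr++                      %1$.*1$d %2$hn
    <           dataptr--                      %1$65535d%1$.*1$d%2$hn
    +           (*dataptr)++                   %3$.*3$d %4$hhn
    -           (*dataptr)--                   %3$255d%3$.*3$d%4$hhn
    .           putchar(*dataptr)              %3$.*3$d%5$hn
    ,           *dataptr = getchar()           %13$.*13$d%4$hn
    [           if (*dataptr == 0) goto ']'    %1$.*1$d%10$.*10$d%2$hn
    ]           if (*dataptr != 0) goto '['    %1$.*1$d%10$.*10$d%2$hn

The single space in the > and + strings is significant: it adds one to the character count that %hn or %hhn stores. The field widths 65535 and 255 add -1 modulo 2^16 and 2^8, which yields the decrement.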
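The + and - rows can be tried in isolation. The sketch below uses non-positional arguments instead of the `%3$`/`%4$` forms so that it does not depend on the argument layout of the full interpreter; `cell_inc` and `cell_dec` are hypothetical helpers, and the code assumes a libc that honors `%hhn` and the usual wrap-around when the count is stored into a byte.

```c
#include <assert.h>
#include <stdio.h>

/* (*dataptr)++ : print `cell` zeros plus one space, store the count as a byte. */
unsigned char cell_inc(unsigned char cell) {
    char buf[1024];
    signed char out = 0;
    int n = snprintf(buf, sizeof buf, "%.*d %hhn", (int)cell, 0, &out);
    assert(n >= 0 && (size_t)n < sizeof buf);
    return (unsigned char)out;
}

/* (*dataptr)-- : print 255 padding characters plus `cell` zeros;
 * 255 + cell is congruent to cell - 1 modulo 256 once truncated to a byte. */
unsigned char cell_dec(unsigned char cell) {
    char buf[1024];
    signed char out = 0;
    int n = snprintf(buf, sizeof buf, "%255d%.*d%hhn", 0, (int)cell, 0, &out);
    assert(n >= 0 && (size_t)n < sizeof buf);
    return (unsigned char)out;
}
```

Both helpers wrap like a brainfuck cell: incrementing 255 gives 0 and decrementing 0 gives 255.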

* https://en.wikipedia.org/wiki/Brainfuck

Exploitable program

    void loop() {
      char* last = output;
      int* rpc = &progn[pc];

      while (*rpc != 0) {
        // fetch -- decode next instruction
        sprintf(buf, "%1$.*1$d%1$.*1$d%1$.*1$d%1$.*1$d%1$.*1$d%1$.*1$d%1$.*1$d%1$.*1$d%2$hn",
                *rpc, (short*)(&real_syms));

        // execute -- execute instruction
        sprintf(buf, *real_syms,
                ((long long int)array)&0xFFFF, &array,   // 1, 2
                *array, array, output,                   // 3, 4, 5
                ((long long int)output)&0xFFFF, &output, // 6, 7
                &cond, &bf_CGOTO_fmt3[0],                // 8, 9
                rpc[1], &rpc, 0, *input,                 // 10, 11, 12, 13
                ((long long int)input)&0xFFFF, &input    // 14, 15
                );

        // retire -- update PC
        sprintf(buf, "12345678%.*d%hn",
                (int)(((long long int)rpc)&0xFFFF), 0, (short*)&rpc);

        // for debug: do we need to print?
        if (output != last) {
          putchar(output[-1]);
          last = output;
        }
      }
    }

Presenting: printbf*

● Turing complete
● Relies on format strings
● Allows you to execute stuff

http://github.com/HexHive/printbf

* Direct fame to Nicholas Carlini, blame to me

● Purdue's Capture the Flag (CTF) team
  – Compete in international hacking competitions
  – Gain practical security experience
  – Competitive, fun, challenging tasks
  – Open, inclusive environment

● Founded 2014, ~15 active, ~100 interested
● 3rd US academic team, top 50 overall

Conclusion

● ROP/JOP is key to modern exploits
  – Leak addresses, connect gadgets, inject code
● Control-flow hijack protection
  – Shadow stack, precise CFI, and locality
  – High precision is key for effectiveness
● Low overhead, open-source
● Future work
  – Protect context of control-flow
  – Protect data and data-flow, not just control

Thank you!

Questions?
