instead of Motivation Three Decades of Runtime Attacks
return-into- Return-oriented Morris Worm libc programming Continuing Arms 1988 Solar Designer Shacham Race 1997 CCS 2007
…
Borrowed Code Code Chunk Injection Exploitation AlephOne Krahmer 1996 2005 Recent Attacks Attacks on Tor Browser [2013] Stagefright [Drake, BlackHat 2015] FBI Admits It Controlled Tor Servers These issues in Stagefright code critically Behind Mass Malware Attack. expose 95% of Android devices, an estimated 950 million devices
MMS
Adversary
Cisco Router Exploit [2016] The Million Dollar Dissident [2016] Million CISCO ASA Firewalls potentially Government targeted human rights vulnerable to attacks defender with a chain of zero-day exploits to infect his iPhone with spyware.
Adversary Relevance and Impact
High Impact of Attacks
• Web browsers repeatedly exploited in pwn2own contests • Zero-day issues exploited in Stuxnet/Duqu [Microsoft, BH 2012] • iOS jailbreak
Industry Efforts on Defenses • Microsoft EMETCan includes either a ROP be detection bypassed, engine or may not • Microsoft Controlbe Flow sufficiently Guard (CFG) effectivein Windows 10 • Google‘s compiler extension VTV (Virtual Table Verification) [Davi et al, Blackhat2014], [Liebchen et al CCS2015], • Intel’s Hardware[Schuster, Extension et CET al S&P2015] (Control-flow Enforcement Technology)
Hot Topic of Research
• A large body of recent literature on attacks and defenses Runtime Attacks & Defenses: Continuing Arms Race
Defenses Still seeking practicalAttacks and CFI, ROPGuard, BinCFI, ROPecker, secure solutions kBouncer, CCFIR, vTint, vfGuard, SafeDispatch, MoCFI, ROP wo Returns, RockJIT, TVip, Out-of-Control, StackArmor, CPI/CPS, Stitching the Oxymoron, XnR, Gadgets, SROP, JIT- Isomeron, ROP, BlindROP, O-CFI, COOP, StackDefiler, Readactor, "Missing the HAFIX, point(er)" ... The whole story ….. Runtime Attacks Code-Injection Attack Code-Reuse Attack
Basic Blocks (BBL) A Entry: instruction target of a branch A (e.g., first instruction of a function) Exit: Any branch B (e.g., indirect or direct jump/call, return) B corrupt code C D pointer C D
E F E F X DEPX inject malicious corrupt code pointer code Adversary Data flow Data Execution Prevention Program flow Return-oriented Programing (ROP): Prominent Code-Reuse Attack
ROP shown to be Turing-complete ROP: Basic Ideas/Steps Use small instruction sequences instead of whole functions Instruction sequences have length 2 to 5 All sequences end with a return instruction, or an indirect jump/call Instruction sequences chained together as gadgets Gadget perform particular task, e.g., load, store, xor, or branch Attacks launched by combining gadgets Generalization of return-to-libc Threat Model: Code-reuse Attacks
Application
RX ? Code 1 Writable Executable Code? Code? ? 2 ⊕ ? ? Opaque Memory Layout RW Data ? Data? Data ? 3 Disclose readable Memory RX ? ? Code? Data? 4 Manipulate writable Memory RW? 5 Computing Engine Data? Main Defenses against Code Reuse
1. Code Randomization
2. Control-Flow Integrity (CFI) Randomization vs. CFI
Randomization Control-flow Integrity Low Performance Overhead Formal Security (Explicit Control Flow Scales well to complex Checks) Software (OS, browser)
Information Disclosure Tradeoff: hard to prevent Performance & Security Challenging to integrate High entropy required in complex software, coverage EPISODE I Code Randomization Make gadgets locations unpredictable Fine-Grained ASLR Application Run 1 Application Run 2 Library (e.g., libc) Library (e.g., libc) Instruction RET Instruction Sequence 3 RET Sequence 1 Instruction RET Instruction Sequence 1 RET Sequence 2 Instruction RET Instruction Sequence 2 RET Sequence 3
Instruction reordering/substitution within a BBL ORP [Pappas et al., IEEE S&P 2012] Randomizing each instruction‘s location: ILR [Hiser et al., IEEE S&P 2012] Permutation of BBLs: STIR [Wartell et al., CCS 2012] & XIFER [with Davi et al., AsiaCCS 2013] Randomization: Memory Leakage Problem
Direct memory disclosure • Pointer leakage on code pages • e.g., direct call and jump instruction
Indirect memory disclosure • Pointer leakage on data pages such as stack or heap • e.g., return addresses, function pointers, pointers in vTables JIT-ROP: Bypassing Randomization via Direct Memory Disclosure
Just-In-Time Code Reuse: On the Effectiveness of Fine-Grained Address Space Layout Randomization IEEE Security and Privacy 2013, and Blackhat 2013 Kevin Z. Snow, Lucas Davi, Alexandra Dmitrienko, Christopher Liebchen, Fabian Monrose, Ahmad-Reza Sadeghi Just-In-Time ROP: Direct Memory Disclosure
1 Undermines fine-grained ASLR
Shows memory disclosures are far more 2 damaging than believed
3 Can be instantiated with real-world exploit Readactor: Towards Resilience to Memory Disclosure
Readactor: Practical Code Randomization Resilient to Memory Disclosure IEEE Security and Privacy 2015 Stephen Crane, Christopher Liebchen, Andrei Homescu, Lucas Davi, Per Larsen, Ahmad-Reza Sadeghi, Stefan Brunthaler, Michael Franz CodeCode Randomization: Randomization: AttackAttack & & Defense Defense Techniques Techniques
Attack Timeline Static Attack RX Application Morris Worm / (Fine-grained) Return to libc [Solar Function_A Randomization Designer Bugtraq’97] ?Execute? - Function_B? Just-In-Time ROP [Snow et ? al. IEEE S&P’13] Direct code
RX Application Attack Timeline JIT Code JIT Code Attacks Counterfeit Object-oriented Same Protection Programming (COOP) XoM? ? ? as for AOT Code ? ? [Schuster et al. IEEE S&P’15] TrampolineXoM Crash-Resistant Oriented Brute-force X-Virtual Table Programming [Gawlik et al. Attacks on vFunc2 Tramp NDSS’16] Entropy Booby Traps BoobyXoM Trap Terminate Process vFunc1 Tramp
RW Pointers Function Pointer Trampoline Function Virtual Table ? Reuse for Reuse Attacks Ptrvirt.X Function1-virt table Attack Surface Single Function Trampolines & virt. Function2 Large enough? Pointers Booby Traps EPISODE II Control-Flow Integrity (CFI) Restricting indirect targets to a pre-defined control-flow graph Original CFI Label Checking [Abadi et al., CCS 2005 & TISSEC 2009] BBL A label_A ENTRY asm_ins, … EXIT
ATwo Questions CFI CHECK: 1. Benign and correct execution?EXIT(A) -> label_B ? C 2. BRuntimeBBL enforcement? B label_B ENTRY asm_ins, … EXIT CFI: CFG Analysis and Coverage Problem
CFG Analysis • Conservative “points-to” analysis • e.g., over-approximate to avoid breaking the program
CFG Coverage • Precision of CFG analysis determines security of CFI policy • e.g., more precise more secure Which Instructions to Protect?
• Purpose: Return to calling function Returns • CFI Relevance: Return address located on stack
Indirect • Purpose: switch tables, dispatch to library functions • CFI Relevance: Target address taken from either Jumps processor register or memory
Indirect • Purpose: call through function pointer, virtual table calls • CFI Relevance: Target address taken from either processor Calls register or memory Label Granularity: Trade-Offs (1/2) Many CFI checks are required if unique labels are assigned per node
CFI Check Exit(B) == Label_1 [Label_3, Label_4, Label_5] A Basic Block B Label_2 Label
Label_3 C D Label_4
E F Label_5 Label_6 Label Granularity: Trade-Offs (2/2) Optimization step: Merge labels to allow single CFI check However, this allows for unintended control-flow paths
CFI Check Exit(B) == Label_3 A Label_1 Basic Block B Label_2 Label
Label_3 C D Label_3Label_4
Exit(C) == Label_3 E F Label_5Label_3 Label_6 Label Problem for Returns Static CFI label checking Shadow stack allows for leads to coarse-grained fine-grained return protection for returns address protection but incurs higher overhead
Forward- CALL RET Edge CFI Backup State Check Label_1 Label_2 Shadow Stack A‘ A B B‘ Backup storage for CALL return addresses R Return Addr … … RET Backward -Edge CFI Exit(R) == [Label_1, Label_2] Return Addr A’ Forward- vs. Backward-Edge Some CFI schemes consider only forward-edge CFI Google’s VTV and IFCC [Tice et al., USENIX Sec 2015] SAFEDISPATCH [Jang et al., NDSS 2014] And many more: TVIP, VTint, vfguard Assumption: Backward-edge CFI through stack protection Problems of stack protections: Stack Canaries: memory disclosure of canary ASLR (base address randomization of stack): memory disclosure of base address Variable reordering (memory disclosure) Losing Control: On the Effectiveness of Control-Flow Integrity under Stack Attacks ACM CCS 2015 Christopher Liebchen, Marco Negro, Per Larsen, Lucas Davi, Ahmad-Reza Sadeghi, Stephen Crane, Mohaned Qunaibit, Michael Franz, Mauro Conti StackDefiler
• Goal: • Bypass fine-grained Control-Flow Integrity • IFCC & VTV (CFI implementations by Google for GCC and LLVM) • Approach: • Due to optimization by compiler critical CFI pointer is spilled on the stack • StackDefiler discloses the stack address and overwrites the spilled CFI pointer • At restoring of spilled registers a malicious CFI pointer is used for future CFI checks • No stack-based vulnerability needed Bypassing (Coarse-grained) CFI
COOP IEEE S&P 2015 USENIX Security 2014 Felix Schuster, Thomas Tendyck, Lucas Davi, Daniel Lehmann, Christopher Liebchen, Lucas Davi, Ahmad-Reza Sadeghi, Fabian Monrose Ahmad-Reza Sadeghi, Thorsten Holz Coarse-grained CFI: Lessons Learned Control-Flow Integrity
1. Too many callOut sites of control available Control-Flow Bending → Restrict returns[Göktas toet al.,their actual[Carlini caller et al., (shadow stack) IEEE S&P 2014] USENIX Sec. 2015] 2. Heuristics are ad-hoc and ineffective Stitching the gadgets FlowStich → Adjusted sequence[Davi et al., length leads[Hu et to al., high false positive USENIX Sec. 2014] USENIX Sec. 2015] 3. Too many indirect jump and call targets ROP is still dangerous Control Jujutsu Resolving indirect[Carlini et al.,jumps and[Evans calls et alis., non-trivial → Compromise:USENIX Compiler Sec. 2014] supportCCS 2015] Size does matter StackDefiler [Göktas et al., [Conti et al., USENIX Sec. 2014] CCS 2015]
COOP Signal-oriented Programming (SROP) [Schuster et al., [Bosman et al., IEEE S&P 2015] IEEE S&P 2014] Hardware CFI Why Leveraging Hardware for CFI ?
Efficiency Security
Dedicated CFI instructions Isolated CFI storage
CFI_RETURN CFI Memory
CFI_JUMP Branch Targets CFI_CALL Why CFI Processor Support?
CFI Processor Support based on Instruction set architecture (ISA) extensions
Dedicated CFI instructions
Avoids offline training phase
Instant attack detection CFI control state: Binding CFI data to CFI state and instructions HAFIX++
Strategy Without Tactics: Policy-Agnostic Hardware-Enhanced Control-Flow Integrity Design Automation Conference (DAC 2016) Dean Sullivan, Orlando Arias, Lucas Davi, Per Larsen, Ahmad-Reza Sadeghi, Yier Jin Objectives Backward-Edge and Stateful, CFI policy agnostic Forward-Edge CFI No burden on developer No code annotations/changes
Security Hardware protection On-Chip Memory for CFI Data No unintended sequences High performance < 3% overhead
Enabling technology All applications can use CFI features Support of Multitasking Compatibility to legacy code CFI and non-CFI code on same platform HAFIX++ Fine-Grained CFI State Model
Initialization FE Control-Flow BE Control-Flow
CFI disabled insn State 2 • Support for bothpre-call CFI/nonSave FE- CFIcall processes State 1 target pre-call State 0 CFI State 1 Idle execution • Strict enforcementCFI of unique forward- ret execution pre-jump CFI State 2 edge control-flow targetsState 2 Save return enabled Save FE jump target call/jmp target State 1 State 3 CFI • Strict enforcement of unique backward- State 3 CFI Check execution edge controlCFI Check-flowX targets X State 4 State 4 Stop Stop execution execution HAFIX++ ISA Extensions
cfibr Issued at call site setup Backward (BW) Edge
cfiret Issue at return site check BW Edge
cfiprc Issued at call site setup call target
cfiprj Issued at jump site setup jump target
cfichk Issued at call/jmp target check Forward (FW) Edge
Label State • Fine-grained forward edge control-flow policy Stack (LSS) • Separation of call/jump • Unique label per target • Fine-grained backward edge control-flow policy Label State • Return to only most recently issued return label Register (LSR) Indirect Call Policy
Label State Function A Stack (LSS) CFIBR label_A1 State 0 CFI State CFILSR label_B Normal Execution Only CFI instructions CALL *reg allowed label_A1 A1 CFIRET label_A1 Function A Code CALL *reg CFIBR label_A1
Code … CFILSR label_B Function B Function B CFICHK label_B CFICHK label_B B Code Code RET Label State RET CFIRET label_A1 Register (LSR)
label_B Function Return Policy
Function A State 0 CFI State Label State CFIBR label_A1 Normal Execution Only CFI instructions CALL B allowed Stack (LSS) A1 CFIRET label_A1 Function A Code CALL B Code CFIBR label_A1 label_A1 Function B Function B … Code Code CFIRET label_A1 RET RET HAFIX++ Pipeline
Convert CFI to NOP
Fetch Decode Execute Memory Write insn1CFI NOP insn3insn2 insn5insn4 Forward CFI to Control Unit Label does not match CFI Control Unit CFI Label State → Stop Execution Memory CFI label label Forward label to CFI Control Unit to check activity Label access in dedicated memory Function A (25) Function B (31) Call Function B CFICHK CFICHK 25 31 insn Push Label 251 onto LSS insn CFIBR 251 insn CFIPRC 31 RET CALL *reg Return to Function A CFIRET 251 Label 31 valid insn Store Label Pop252 label off stack CFIPRJ 252 and validate JMP *reg CFICHK 252 insn CFICHK Label 252 valid 252 insn Store Label 31 to LSR CFIBR 253CFIPRC 45 Label State Register Label State Stack CALL *reg CFIRET 2525231 253 25112 insn RET Challenges … Architectural Issues • Runtime overhead caused by CFI instrumentation o Initializing and validating the CFI state upon every FW/BW edge o I-cache pressure during instruction fetch o Effective CPI • Runtime overhead and problems caused by hardware o Branch instruction occur about every 3-5 instructions o CFI instructions/operations around every one of them o Memory access for CFI metadata is slow o CFI metadata could be corrupted if considered data (StackDefiler) o CFI metadata could be a bottleneck if placed in code The Multiple Callers Problem
Function A Common Callee Function B CFIPRC 45 CFIPRC 33 CALL *reg CALL *reg • We can not assign both 45 and 33 at the same time. Fn Q • We could assign a common label45 to all targets
Fn M• Introduces erroneousFn P edges in theFn R Control Flow GraphFn U 45 → Call targets must be45 disjointed!33 Use a trampoline! 33 Fn N Fn O Fn S Fn T 45 45 33 33 Label confusion! System Challenges
1 Sharing CFI subsystem resources
2 Separation of process states
3 Handling CFI Module Exceptions
4 Handling of legacy code The Scheduling Issue
Process 1 Process 2
0287 LSSP 3536 7932 LSSP 0452 3589 8459 9265 8182 1415 1618 7182 5772 0003 Label State 0002 Label State Label State Stack Register Label State Stack Register
This is running This is being scheduled The Scheduling Issue
Process 1 Process 2
7932 LSSP 3589 9265 1415 1618 0003 Label State Label State Label State Stack Register Label State Stack Register
This is running This is being scheduled The Stack Issue
2884 We ran out of stack 7950 space! What do we do? 3832 2643 3846 7932 LSSP 3589 9265 1415 0003 Label State Stack The Process Control Block
• Representation of a process to the kernel • In Linux, look for task_struct in include/linux/sched.h • Information contains: • Execution state (runnable, suspended, zombie…) • Virtual memory allocations • Process owner • Process group • Process id • I/O status information • CPU context state Kernel Scheduler Additions read current CFI awareness if CFI is enabled backup CFI state for current read next CFI awareness if CFI is enabled restore CFI state for next else disable CFI subsystem The Scheduling Issue Resolved
CFI Subsystem Process 1 -- PCB
TASK_RUNNING CFI Context CFI_ON
0287 LSSP … 3536 04527932 LSSP Process 2 -- PCB 84593589 81829265 TASK_RUNNING CFI Context 71821415 16185772 CFI_ON 00020003 Label State Label State Stack Register … The Scheduling Issue Resolved
CFI Subsystem Process 1 -- PCB
TASK_RUNNING CFI Context CFI_ON
…
7932 LSSP Process 2 -- PCB 3589 9265 TASK_RUNNING CFI Context 1415 1618 CFI_OFF 0003 Label State Label State Stack Register … Your stack still overflows or underflows for that matter
• We use the PCB already, add things there on overflow: copy bottom half of current’s LSS to PCB move top half of LSS to bottom set LSSP to new location on underflow: get bottom half of current’s LSS from PCB set LSSP to new location The Stack Issue Resolved
CFI Subsystem Process 1 -- PCB
2884 LSSP TASK_RUNNING 7950 CFI Context CFI_ON 3832 2643 … 3846 7932 3589 9265 1415 1618 0003 Label State Label State Stack Register CFI Faults
• The CFI subsystem detected a CFI violation • Add kernel log entry with CFI fault information • Send SIGKILL to offending process • This kills the process with no chance of a signal handler running Related Works
HCFI: New instructions to track control flow Combines and relocates instructions into pipeline bubble slots Single threaded, embedded applications only Intel CET: Shadow stack for return addresses New register ssp for the shadow stack Conventional move instructions cannot be used in shadow stack New instructions to operate on shadow stack New instruction for indirect call/jump targets: branchend Any indirect call/jump can target any valid indirect branch target Control-flow Enforcement Technology [Intel 2016]
Function A Function B A1 call [D1] ✔ B1 ENDBRANCH A2 call [D2] B2 push eax A3 ins ✘ B3 pop eax A4 return B4 return
✘✔ Function C ✔ C1 ENDBRANCH C2 … C3 return
Shadow Stack Stack Data
A2 A23 D1 B1
D2 BCB311 Control-flow Enforcement Technology [Intel 2016]
• Backward edge: • Shadow stack detects return-address manipulation • Shadow stack protected, cannot be accessed by attacker • New register ssp for the shadow stack • Conventional move instructions cannot be used in shadow stack • New instructions to operate on shadow stack
• Forward edge: • New instruction for indirect call/jump targets: branchend • Any indirect call/jump can target any valid indirect branch target • Could be combined with fine-grained compiler-based CFI (LLVM CFI) Comparison with HAFIX++
Shared library BE-Support FE-Support Granularity Overhead & Multitasking
XFI Budiu et al, ASID 2006 Coarse 3.75%
HAFIX Coarse 2% Davi et al., DAC 2015 Architectural dependent optimizations LandHere Coarse N/A http://landhere.galois.com Can branchHCFI to any call/jump target Christoulakiswith et endbranchal., inst. Fine 1% CODASPY 2016 Intel CET https://software.intel.com/site s/default/files/managed/4d/2a Coarse N/A /control-flow-enforcement- technology-preview.pdf
HAFIX++ Fine 1.75% Sullivan et al., DAC 2016