Encrypted Execution and You

1 Assured Information Security (@ainfosec) Encrypted Execution and You @JacobTorrey DayCon 9 { October 2015 @JacobTorrey | HARES Outline 2 Introduction Background HARES Details Results Implications Conclusions @JacobTorrey | HARES Amp Up! 3 I was pretty nervous presenting here in front of all of you, luckily one of my idols gave me the courage to stand up here in front of you all! @JacobTorrey | HARES Who am I? 4 I Advising Research Engineer at Assured Information Security (words are my own) I Site lead for Denver, CO office; provides InfoSec strategy consulting I Leads low-level Computer Architectures research group I Plays in Intel privilege rings ≤ 0 I Ultra-runner, ultra-cyclist & mountaineer @JacobTorrey | HARES Overview 5 I Encrypted (non-reversable) execution is coming I HARES provides the ability to execute fully-encrypted binaries with minimal performance impact (~2%) I Intel Software Guard Extensions (SGX) provides support for encrypted enclaves I How will this impact attackers vs. defenders? @JacobTorrey | HARES Impact 6 I Provides a secure execution capability for running sensitive applications in contested environments I Could integrate with cloud-computing offerings to minimize trust placed in cloud provider I Security of solution dependent on encryption, not security-through-obscurity I Provides technology, does not enforce usage based on \morals" @JacobTorrey | HARES Problem Statement 7 I Algorithms exposed to copying or theft I Application code can be used to develop attacks I Code can be reused for unintended purposes (ROP) I Malware RE can lead to attribution and C&C attack @JacobTorrey | HARES Current State-of-the-Art 8 I Obfuscation and packers | Analysis tools and live debugging can recover instructions I VM-based obfuscation | Can still be mapped to x86 and impacts performance I Encrypting entire OS with trusted boot | Only prevents against offline attacks @JacobTorrey | HARES Von Neumann versus Harvard Architectures 9 Von Neumann systems have a unified memory and buses for data and instructions: Harvard architectures have separate memories for data and instructions: @JacobTorrey | HARES AES-NI 10 I In response to software-based caching attacks on AES, Intel released instruction set to support AES I Hardware logic is faster, and more protected I Supports 128-bit and 256-bit AES I Provides primitives, still requires engineering to make a safe system on top of these @JacobTorrey | HARES TRESOR 11 I Uses AES-NI and CPU debug registers to provide accelerated, cold-RAM resistant AES on Linux I Key loaded early boot I Kernel patch to prevent applications reading debug registers @JacobTorrey | HARES TLB-Splitting 12 I Translation lookaside buffer (TLB) is a cache for virtual −! physical address translations I Used to improve paging performance I Logically treated as single entity, physically multiple components I Switches x86 platform from apparent Von Neumann to Harvard architecture: I Used by PaX/GRSecurity to emulate no-execute bit I Shadow Walker used for memory-hiding root-kit functionality @JacobTorrey | HARES TLB-Splitting (cont.) 13 Figure 1: TLB during normal operation Figure 2: Split TLB @JacobTorrey | HARES TLB-Splitting Impact 14 I TLB-splitting allows, through software, the apparent hardware architecture to be changed, per process I Most end-point protection capabilities (PatchGuard, AV, anti-root-kit tools, etc.) assume a Von Neumann system; TLB-splitting breaks that assumption and can bypass all of them (I bet your AV vendor forgot to mention that...) @JacobTorrey | HARES S-TLB 15 With Nehalem micro-architecture, an L2 cache was introduced, the S-TLB; breaks split-TLB assumptions Figure 3: S-TLB @JacobTorrey | HARES MoRE: Measurement of Running Executables 16 I DARPA Cyber Fast Track program I Explored using TLB-splitting for measurement/integrity verification of interleaved application I Immutable code page (can repeatably measure in real-time) I Mutable data page (for variable isolation) I Used EPT granular permissions to simulate a split-TLB on newer CPUs with S-TLB I Thin-VMM to simulate Harvard architecture on per-process basis @JacobTorrey | HARES Why HARES? 17 I HARES is an example of an encrypted capability (enclave), that allows exploration of the implications of encrypted execution I Intel SGX soon will provide similar capabilities, but through hardware modifications, providing stronger security guarantees I The following deep-dive will provide background of what the current state-of-the-art is @JacobTorrey | HARES VMX Thin-Hypervisor 18 I Loaded as Windows 7 kernel driver I Based on vmcpu root-kit example I No emulation of devices, OS retains direct access to HW I Minimal performance impact I Can use VMCS exit conditions to track certain architectural events @JacobTorrey | HARES On-CPU AES 19 I \Hoists" TRESOR on-CPU AES into VMM I AES-NI key stored in CPU debug registers to protect against cold-RAM attack I Adds VMCS exit condition for debug register accesses (return NULL or silently discard write) I Decrypts executable sections of program into execute-only memory @JacobTorrey | HARES TLB-Splitting 20 I All data-fetch requests, even from application itself, directed via EPT/TLB to encrypted page I All instruction fetches are directed to execute-only, decrypted, memory I Must track Windows application memory management events (COW) and ensure EPT structures correspond with OS-level structures @JacobTorrey | HARES EPT Fault Flowchart 21 @JacobTorrey | HARES Tested Applications 22 I Command-line synthetic test-suite I Microsoft Windows Notepad I Microsoft Windows Paint I Microsoft Windows Calculator Usability of HARES-protected applications was not noticeably impacted to end-user @JacobTorrey | HARES Test Results 23 Overall, average performance impact was ~2% Test Name Average Execution Time (s) Average HARES Execution Time (s) Performance Impact (%) Pi 28.173 28.600 1.515 Randomized Sort1 0.016 0.031 95.83 Coin Flip 1.778 1.809 1.762 Timers 15.013 15.023 0.069 Table 1: Performance Results for CLI Test-Suite 1The randomized sort runs for such a short period that the initialization routine of the Windows process creation monitor almost doubles the execution time. If the TLB-splitting and periodic measurement is disabled, the run-time is still almost double. For this test, it is reasonable to assume that there is a constant increase of ~.015s for each test case run under the VMX hypervisor, and due to this test case's short duration it skewed the percentage. @JacobTorrey | HARES Demonstration System 24 Specifications for the demonstration system to overcome some prototype limitations: I Intel Core-i series processor with AES-NI; Windows 7 32-bit OS I Single processor [numproc=1] (Our thin-VMM only supports single core currently) I 2GiB RAM [truncatememory=0x80000000] (Windows kernel memory layout changes > 2 GiB, and too lazy to update hard-coded values) I No PAE/DEP [nx=AlwaysOff and pae=ForceDisable] (Again, hard-coded memory layout code) @JacobTorrey | HARES Demonstration 25 [ALT-TAB] @JacobTorrey | HARES Major Challenge: Mixed Code & Data 26 Code and string "Statistics" sharing same section (.text): @JacobTorrey | HARES Security Benefits 27 Protects against: I Reverse-engineering and algorithmic IP theft I \Weaponization" of crash case into RCE I Mining application source for ROP gadgets I Harvard architecture resistant to code injection attacks & pivoting @JacobTorrey | HARES Weaknesses 28 I JTAG/ICE/XDM I Memory Dumping I DMA I SMM/AMT I Side-channels I Emulation/VMM I Syscall Inference @JacobTorrey | HARES Overcoming Weaknesses & Future Work 29 Intel SGX will solve many of these, but for HARES: I VT-d/IOMMU (DMA) I Cache-as-RAM (memory dumping) I DRTM launch (VMM) I Combining with unique compilation @JacobTorrey | HARES Offensive Use-Cases 30 I Offensive key management is challenging, needs further research into new models for a PKI-like system I Easier to use when all parties are aware and in agreement @JacobTorrey | HARES Physically Unclonable Functions 31 I Expose manufacturing variance across devices of same type I Return unique (unclonable without FIB) values to given input I Examples: I SRAM initializes to PUF pattern I Flash memory requires different amounts of charge to jump from 1 to 0 I FPGA gate fabric can exhibit PUFs when \racing" ring oscillators I Provides assurances device has not been modified or simulated @JacobTorrey | HARES S-DRTM 32 I Provides method to verify trustworthiness of platform without requiring hardware extensions such as Intel's Trusted Execution Technologies I Typically uses timing-based attestation against remote trusted entity to measure system and loaded code and prevent active RE or emulation/virtualization 2 @JacobTorrey2Martignoni, | HARES L et al. Conqueror: Tamper-Proof Code Execution on Legacy Systems Offensive Key Management 33 I Boot-kit for key loading I Conqueror for VMM/debugger detection I PUFs for machine-unique keys instead TPM (paired with multi-stage dropper) I With protected execution and device-specific keys we can bootstrap a soft-TPM! I Have trusted, opaque implants @JacobTorrey | HARES AV Heuristics 34 I However, notepad.exe (unencrypted) with a 1-bit change to remove from MS white-list also hits 4/57 @JacobTorrey | HARES Attackers vs. Defenders 35 I There is a track record of attackers evolving to take advantage of technology quicker and more effectively than the defenders it was designed and built for: I Blue Pill I SMM root-kits I ARM TrustZone root-kits I Encrypted malware? I Personal incentives are asymmetric between offense and

Encrypted Execution and You

A Complete Bibliography of ACM Transactions on Computer Systems

Low-Cost Mitigation Against Cold Boot Attacks for an Authentication Token

Using Registers As Buffers to Resist Memory Disclosure Attacks Yuan Zhao, Jingqiang Lin, Wuqiong Pan, Cong Xue, Fangyu Zheng, Ziqiang Ma

Lest We Remember: Cold Boot Attacks on Encryption Keys

OS-Level Attacks and Defenses: from Software to Hardware-Based Exploits © December 2018 by David Gens Phd Referees: Prof

Protecting Data In-Use from Firmware and Physical Attacks

Protecting Private Keys Against Memory Disclosure Attacks Using Hardware Transactional Memory

Analysing Android's Full Disk Encryption Feature

OCRAM-Assisted Sensitive Data Protection on ARM-Based Platform

Protecting Data In-Use from Firmware and Physical Attacks

Trevisor OS-Independent Software-Based Full Disk Encryption Secure Against Main Memory Attacks

Trusted Systems in Untrusted Environments: Protecting Against Strong Attackers