Efficient and Secure SMAP-Enabled Intra-Process Memory Isolation
Total Page:16
File Type:pdf, Size:1020Kb
SEIMI: Efficient and Secure SMAP-Enabled Intra-process Memory Isolation Zhe Wang1;2 Chenggang Wu1;2 Mengyao Xie1;2 Yinqian Zhang3 Kangjie Lu4 Xiaofeng Zhang1;2 Yuanming Lai1;2 Yan Kang1 Min Yang5 1 State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, 2 University of Chinese Academy of Sciences, 3 The Ohio State University, 4University of Minnesota, 5Fudan University Abstract—Memory-corruption attacks such as code-reuse at- The effectiveness of all such techniques heavily depends on tacks and data-only attacks have been a key threat to systems the integrity and/or confidentiality of the sensitive data. security. To counter these threats, researchers have proposed a To efficiently protect sensitive data, researchers proposed variety of defenses, including control-flow integrity (CFI), code- pointer integrity (CPI), and code (re-)randomization. All of information hiding (IH) which stores sensitive data in a memory them, to be effective, require a security primitive—intra-process region allocated in a random address and wishes that attackers protection of confidentiality and/or integrity for sensitive data could not know the random address thus could not write or (such as CFI’s shadow stack and CPI’s safe region). read the sensitive data. Unfortunately, recent works show that In this paper, we propose SEIMI, a highly efficient intra- memory disclosures and side channels can be exploited to process memory isolation technique for memory-corruption de- readily reduce the randomization entropy and thus to bypass fenses to protect their sensitive data. The core of SEIMI is to use the efficient Supervisor-mode Access Prevention (SMAP), the information hiding [22–24, 36, 41]. As such, even a robust a hardware feature that is originally used for preventing the IH-based defense can be defeated. kernel from accessing the user space, to achieve intra-process To address this problem, recent research instead opts for memory isolation. To leverage SMAP, SEIMI creatively executes practical memory isolation which provides efficient protection the user code in the privileged mode. In addition to enabling the with a stronger security guarantee. Memory isolation, in general, new design of the SMAP-based memory isolation, we further develop multiple new techniques to ensure secure escalation of can be classified into address-based isolation and domain-based user code, e.g., using the descriptor caches to capture the potential isolation. Address-based isolation checks (e.g., bound-check) segment operations and configuring the Virtual Machine Control each memory access from untrusted code to ensure that it Structure (VMCS) to invalidate the execution result of the control cannot access the sensitive data. The main overhead of this registers related operations. Extensive experimental results show method is brought by the code that performs the checks. The that SEIMI outperforms existing isolation mechanisms, including both the Memory Protection Keys (MPK) based scheme and most efficient address-based isolation is based on Intel Memory the Memory Protection Extensions (MPX) based scheme, while Protection Extensions (MPX), which performs bound-checking providing secure memory isolation. with hardware support [30]. Domain-based isolation instead stores sensitive data in I. INTRODUCTION a protected memory region. The permission to accessing this region is granted when requested by the trusted code, Memory-corruption attacks such as control-flow hijacking and is revoked when the trusted access finished. However, and data-only attacks have been a major threat to systems memory accesses from untrusted code (i.e., the potentially security in the past decades. To defend against such attacks, vulnerable code that can be compromised by attackers) cannot researchers have proposed a variety of advanced mechanisms, enable the permission. The main source of the performance including enhanced control-flow integrity (CFI), code-pointer overhead of domain-based memory isolation is the operations integrity (CPI), fine-grained code (re-)randomization, and data- for enabling and disabling the memory-access permissions. The layout randomization. All these techniques require a security most efficient domain-based isolation is to use Intel Memory primitive—effective intra-process memory protection of the Protection Keys (MPK) [25, 30, 40, 47]. integrity and/or confidentiality of sensitive data from potentially In general, existing address-based isolation and domain- compromised code. The sensitive data includes critical data based isolation both incur non-trivial performance overhead structures that are frequently checked against or used for compared to the IH-based scheme. Worse, the overhead will be protection. For example, O-CFI [39] uses a bounds lookup significantly elevated when the workloads (i.e., the frequency table (BLT), and CCFIR [58] uses a safe SpringBoard to of memory accesses that require bound-checking or permission restrict the control flow; CPI [31] uses a safe region, and switching) increase. For example, when protecting the shadow Shuffler [55] uses a code-pointer table to protect the sensitive stack, the MPK-based scheme (i.e., domain-based) incurs a pointers; Oxymoron [6] maintains a sensitive translation table, runtime overhead of 61.18% [40]. When protecting the safe and Isomeron [19] uses a table to protect randomization secrets. region of CPI using the MPX-based scheme (i.e., address- based), the runtime overhead is 36.86% [21]. Both cases are of untrusted user code with ring-0 privilege will offer valuable discouraging and would prevent practical uses of the defense insights and opportunities for future research. For example, mechanisms. As such, we need a more efficient isolation LBR is a privileged hardware feature used by transparent code- mechanism that can adapt to various workloads. reuse mitigation [42] and context-sensitive CFI [48]. Reading In this paper, we propose SMAP-Enabled Intra-process LBR has to trap into the kernel, which incurs expensive context Memory Isolation (SEIMI), a system for highly efficient switching. With the techniques used in SEIMI, since the user and secure domain-based memory isolation. SEIMI leverages code is running in a privileged mode, it can read LBR efficiently Supervisor-mode Access Prevention (SMAP), a widely used without context switching. and extremely efficient hardware feature for preventing kernel We have implemented SEIMI on the Linux/X86_64 platform. code from accessing user space. SEIMI uses SMAP in a To evaluate and compare the performance overhead, we completely different way. The key idea of SEIMI is to run deployed the MPX-based scheme, the MPK-based scheme, user code in the privileged mode (i.e., ring 0) and to store and SEIMI to protect four defense mechanisms: O-CFI [39], sensitive data in the user space. SEIMI employs SMAP to Shadow Stack [40], CPI [31], and ASLR-Guard [35]. We not prevent memory accesses from the “privileged untrusted user only conduct the experiments on SPEC CPU2006 benchmarks, code” to the “user mode” sensitive data. SMAP is temporarily but also on 12 real-world applications, including web servers, disabled when the trusted code (also in the privileged mode) databases, and JavaScript engines. Compared to the MPK-based accesses the sensitive data, and re-enabled when the trusted scheme, SEIMI is more efficient in almost all test cases; while code finishes the data access. Any memory access to theuser compared with the MPX-based scheme, SEIMI achieves a space will raise a processor exception when SMAP is enabled. lower performance overhead on average. Since SMAP is controlled by the RFLAGS register which is In sum, we make the following contributions in this paper. thread-private, disabling SMAP is only effective in current • A novel domain-based isolation mechanism. We pro- thread. Thus, temporarily enabling SMAP does not allow any pose a novel domain-based memory isolation mechanism concurrent access to sensitive data from other threads. that creatively uses SMAP in a reverse way; it can The new and “reverse” use of SMAP in SEIMI however efficiently protect a variety of software defenses against brings new design challenges: How to prevent the user code in memory-corruption attacks. ring 0 from corrupting the kernel and abusing the privileged • New techniques for isolating user code. We identify hardware resources. To prevent the kernel corruption, we choose new security threats when running untrusted user code to use the hardware-assisted virtualization technique (i.e., Intel in ring 0 and propose new solutions to these threats in VT-x) to run the kernel in a more-privileged mode (i.e., the SEIMI. These techniques are of independent interest and VMX root mode). The user code instead runs in ring 0 of the show that safely running user code in a privileged mode VMX non-root mode. Therefore, the user code is isolated from can be practical. the kernel by virtualization. It is worth noting that Dune [7] • New insights from implementation and evaluation. also uses Intel VT-x to provide user-level code with privileged We implement and evaluate SEIMI, and show that it hardware features. But it requires that the code running in ring outperforms existing approaches. Our study suggests that 0 is secure and trusted. To support untrusted code running in using SMAP for domain-based isolation is not only ring 0, we propose multiple novel techniques to prevent the practical but efficient. The enabling of running the user user code from abusing the following two types of hardware code in a privileged mode will also allow future defenses resources: (1) privileged data structures (e.g., the page tables) to efficiently access privileged hardware. and (2) privileged instructions. First, to prevent the user code from manipulating privileged II. BACKGROUND data structures (e.g., page table), we store the privileged data A. Information Hiding structures in the VMX root mode, and SEIMI leverages Intel Information hiding (IH) protects a memory region by putting VT-x to force all the privileged operations to trigger VM it in a randomized location.