Libmpk: Software Abstraction for Intel Memory Protection Keys

libmpk: Software Abstraction for Intel Memory Protection Keys (Intel MPK) Soyeon Park, Georgia Institute of Technology; Sangho Lee, Microsoft Research; Wen Xu, Georgia Institute of Technology; Hyungon Moon, Ulsan National Institute of Science and Technology; Taesoo Kim, Georgia Institute of Technology https://www.usenix.org/conference/atc19/presentation/park-soyeon This paper is included in the Proceedings of the 2019 USENIX Annual Technical Conference. July 10–12, 2019 • Renton, WA, USA ISBN 978-1-939133-03-8 Open access to the Proceedings of the 2019 USENIX Annual Technical Conference is sponsored by USENIX. libmpk: Software Abstraction for Intel Memory Protection Keys (Intel MPK) Soyeon Park Sangho Lee† Wen Xu Hyungon Moon∗ Taesoo Kim Georgia Institute of Technology †Microsoft Research ∗Ulsan National Institute of Science and Technology Abstract MPK) [20], that allows a userspace process to change the permission of the page groups. MPK has three key bene- Intel Memory Protection Keys (MPK) is a new hardware prim- fits over page-table-based mechanisms: (1) performance, (2) itive to support thread-local permission control on groups of group-wise control, and (3) per-thread view. First, MPK uti- pages without requiring modification of page tables. Unfortu- lizes a protection key rights register (PKRU) to maintain the nately, its current hardware implementation and software sup- access rights of individual keys associated with specific pages: port suffer from security, scalability, and semantic problems: read/write, read-only, or no access. Processes only need to (1) vulnerable to protection-key-use-after-free; (2) providing execute a non-privileged instruction (WRPKRU) to update PKRU, the limited number of protection keys; and (3) incompatible which takes less than 20 cycles (§2.3) and requires no TLB with mprotect()’s process-based permission model. flush and context switching. Note that PKRU and page-table In this paper, we propose libmpk, a software abstraction for permissions cannot override each other, so the effective per- MPK. It virtualizes the hardware protection keys to eliminate mission is the intersection of both. the protection-key-use-after-free problem while providing Second, MPK can change the access rights of up to 16 accesses to an unlimited number of virtualized keys. To sup- different page groups at once, where each group consists of port legacy applications, it also provides a lazy inter-thread pages associated with the same key1. This group-wise control key synchronization. To enhance the security of MPK itself, allows applications to change access rights to page groups libmpk restricts unauthorized writes to its metadata. We apply according to the types and contexts of data stored in them libmpk to three real-world applications: OpenSSL, JavaScript (e.g., per-session data of a web server). JIT compiler, and Memcached for memory protection and Third, MPK allows each thread (i.e., each hyperthread) isolation. Our evaluation shows that it introduces negligi- to have a unique PKRU, realizing per-thread memory view. ble performance overhead (<1%) compared with the origi- Accordingly, even if two threads share the same address space, nal, unprotected versions and improves performance by 8.1× their access rights to the same page can be different. compared with the secure equivalents using mprotect(). The Although MPK is a promising primitive in concept, its source code of libmpk is publicly available and maintained current hardware implementation as well as standard library as an open source project. and kernel support suffer from three problems: (1) security, 1 Introduction (2) scalability, and (3) subtle semantic differences, hindering its broader adoption. First, we found that MPK suffers from Maintaining and enforcing memory access permission is an the protection-key-use-after-free problem. The Linux kernel important duty of OSes and CPUs. Traditionally, they have provides two system calls, pkey_alloc() and pkey_free(), used page tables to specify whether processes have rights to to allocate and de-allocate protection keys, respectively. Dur- read from, write to, or execute specific memory pages. OSes ing key de-allocation (pkey_free()), however, it does not can change access permission by updating page-table en- invalidate pages associated with a de-allocated key, resulting tries (PTEs) and flushing corresponding translation lookaside in ambiguity when the de-allocated key is re-allocated and buffer (TLB) entries to reload them. In addition to page ta- assigned to different pages later. bles, some CPUs, e.g., ARM [5] and IBM Power [18], allow Second, MPK fails in scaling because PKRU can manage OSes to maintain the permission of a page group together by only up to 16 protection keys because of its hardware lim- assigning the same key to the correlated pages and controlling itation. When an application tries to allocate more than 16 the key’s permission. To modify the permission of the page protection keys, pkey_alloc() simply fails, implying that the groups, OSes should, on behalf of the process, change the application itself should implement its own mechanism to permission of the corresponding registers. Recently, Intel deployed a similar key-based permis- 1The default group (0) has a special purpose, so only 15 groups are sion control, called Intel Memory Protection Keys (Intel available for general uses. USENIX Association 2019 USENIX Annual Technical Conference 241 OS-managed Userspace process WRPKRU multiplex these protection keys. RDPKRU DTLB PKRU (corea) PKRU (coreb) Third, the semantic of MPK is different from the page# pkey perm. pkey: 0 1 8 15 0 1 8 15 conventional mprotect(), i.e., thread-view versus process- 120 8 r/w perm: r/w n/a ... r/w ... n/a r/w r/w ... r/o ... n/a 232 1 r/o view, which results in potential security and performance (corea) (coreb) 456 ... 8 r/o problems. For example, the Linux kernel implements an page# effective page# effective ITLB perm. perm. execute-only memory with MPK by disabling read access page# perm. 120 r/w 120 r through PKRU and allowing execution through a page table: 232 x 232 x 232 r/x ... 456 r 456 r mprotect(addr, len, PROT_EXEC). Although this feature is (per-process, red : pkey = 1 (per-core, asynchronous) invoked via mprotect(), it only changes the PKRU’s permis- synchronized) blue: pkey = 8 sion of the calling thread, meaning that other threads sharing Figure 1: An example showing how MPK checks the permission of the same address space can still read the execute-only memory. a logical core (hyperthread) on a specific memory page according In other words, it is non-trivial to apply MPK securely and to PKRU and page permissions. The intersection of the permissions efficiently to legacy applications that rely on a process-level determines whether a data access will be allowed. An instruction memory permission model. fetch is independent of the PKRU. In this paper, we propose libmpk, a secure, scalable, and semantic-compatible software abstraction to fully utilize cations to effectively overcome the three challenges. MPK in a practical manner. In particular, libmpk implements • Case studies. We apply libmpk to OpenSSL library, (1) protection key virtualization to eliminate the protection- JavaScript JIT compiler, and Memcached to show its key-use-after-free problem and to support the unrestricted effectiveness and practicality. libmpk secures them with number of memory page groups, (2) lazy inter-thread key syn- a few modifications and negligible overhead. chronization to selectively ensure per-process semantics with MPK, allowing us to substitute mprotect() in an efficient 2 Intel MPK Explained and compatible manner, and (3) metadata integrity to ensure In this section, we describe the hardware design of Intel MPK the integrity of the mapping information while minimizing and current kernel and library support. Also, we check the the number of system call invocations. libmpk consists of a performance characteristics of MPK to show its efficiency. userspace library mainly for efficient permission change and a kernel module mainly for synchronization and metadata 2.1 Hardware Primitives integrity. Intel MPK updates the permission of a group of pages by To show the effectiveness and practicality, we apply libmpk associating a protection key to the group and changing the ac- to three real-world applications: OpenSSL library, JavaScript cess rights of the protection key instead of individual memory just-in-time (JIT) compiler, and Memcached. First, we mod- pages (Figure 1). ify the OpenSSL library to create secure memory pages for Protection key field in page table entry. MPK assigns a storing cryptographic keys to mitigate information leakage. unique protection key to a memory page group to update its Second, we modify three JavaScript JIT compilers (i.e., Spi- permission at the same time. MPK exploits the previously derMonkey, ChakraCore, and v8) to protect the code cache unused four bits of each page table entry (from 32nd to 35th from memory corruption by enforcing the W⊕X security pol- bits) to store a memory page’s corresponding key value. Thus, icy. Third, we modify Memcached to secure almost all its data, MPK supports up to 16 different page groups. Since only including the slab and hash table, whose size can be several supervised code can access and change PTEs, the Linux ker- gigabytes. The evaluation results show that libmpk and its nel (from version 4.6) starts to provide a new system call, applications have negligible overhead (<1%). Furthermore, pkey_mprotect(), to allow applications to assign or change libmpk is 1.73–3.78× faster than mprotect() when changing the keys of their memory pages (§2.2). the permission of 1–1,000 pages at the view of a process, and, PKRU especially, the throughput of Memcached with libmpk is 8.1× Protection key rights register ( ). MPK uses the value higher than that of Memcached with mprotect().

Load more