Lord of the X86 Rings: a Portable User Mode Privilege Separation Architecture on X86

View metadata, citation and similar papers at core.ac.uk brought to you by CORE Lord of the x86 Rings: A Portable User Mode Privilege Separation Architecture on x86 Hojoon Lee∗ Chihyun Song Brent Byunghoon Kang CISPA Helmholtz Center i.G. GSIS, School of Computing, KAIST GSIS, School of Computing, KAIST [email protected] [email protected] [email protected] ABSTRACT 1 INTRODUCTION Modern applications often involve processing of sensitive informa- User applications today are prone to software attacks, and yet are tion. However, the lack of privilege separation within the user space often monolithically structured or lack privilege separation. As leaves sensitive application secret such as cryptographic keys just a result, adversaries who have successfully exploited a software as unprotected as a "hello world" string. Cutting-edge hardware- vulnerability in an application can access sensitive in-process code supported security features are being introduced. However, the or data that are irrelevant to the exploited module or part of the features are often vendor-specific or lack compatibility with older application. Today’s applications often contain secrets that are too generations of the processors. The situation leaves developers with critical to reside in the memory along with the rest of the application no portable solution to incorporate protection for the sensitive contents, as we have witnessed in the incident of HeartBleed [20, application component. 45]. We propose LOTRx86, a fundamental and portable approach The conventional software privilege model that coarsely divides for user-space privilege separation. Our approach creates a more the system privilege into only two levels (user-level and kernel- privileged user execution layer called PrivUser by harnessing the level) has failed to provide a fundamental solution that can support underused intermediate privilege levels on the x86 architecture. The privilege separation in user applications. As a result, critical appli- PrivUser memory space, a set of pages within process address space cation secrets such as cryptographic information are essentially that are inaccessible to user mode, is a safe place for application treated no differently than a "hello world" string in user memory secrets and routines that access them. We implement the LOTRx86 space. When the control flow of a running user context is compro- ABI that exports the privcall interface to users to invoke secret mised, there is no access control left to prevent the hijacked context handling routines in PrivUser. This way, sensitive application oper- to access arbitrary memory addresses. ations that involve the secrets are performed in a strictly controlled Many approaches have been introduced to mitigate the chal- manner. The memory access control in our architecture is privilege- lenging issue within the boundaries of the existing application based, accessing the protected application secret only requires a memory protection mechanisms provided by the operating sys- change in the privilege, eliminating the need for costly remote pro- tem. A number of works proposed using the process abstraction cedure calls or change in address space. We evaluated our platform as a unit of protection by separating a program into multiple pro- by developing a proof-of-concept LOTRx86-enabled web server that cesses [7, 29, 41]. The fundamental idea is to utilize the process employs our architecture to securely access its private key during separation mechanism provided by the OS; these work achieve an SSL connection. We conducted a set of experiments including a privilege separation by splitting a single program into multiple performance measurement on the PoC on both Intel and AMD PCs, processes. However, this process-level separation incurs a signifi- and confirmed that LOTRx86 incurs only a limited performance cant overhead due to the cost of the inter-process communication overhead. (IPC) between the processes or address space switching that incur TLB flushes. Also, the coarse unit of separation still leaves alarge CCS CONCEPTS attack surface for attackers. The direction has advanced through • Security and privacy → Trusted computing; a plethora of works on the topic. One prominent aspect of the ad- vancements is the granularity of protection. Thread-level protection KEYWORDS schemes [6, 24, 44] have reduced the protection granularity com- pared to the process-level separation schemes while still suffering privilege separation; memory protection; operating system from performance overhead from page table modifications. Shreds presented fine-grained in-process memory protection using amem- ∗ affiliation changed from KAIST to CISPA as of Sept2018 ory partitioning feature that has long been present in ARM called Memory Domains [11]. However, the feature has been deprecated Permission to make digital or hard copies of all or part of this work for personal or in the 64-bit execution mode of the ARM architecture (AARCH64). classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation In the more recent years, a number of processor architecture on the first page. Copyrights for components of this work owned by others than ACM revisions and academic works have taken a more fundamental must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, approach to provide in-process protection; Intel has introduced to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. Software Guard Extensions (SGX) to its new x86 processors to pro- CCS ’18, October 15–19, 2018, Toronto, ON, Canada tect sensitive application and code and data from the rest of the © 2018 Association for Computing Machinery. application as well as the possibly malicious kernel [4, 14]. Intel also ACM ISBN 978-1-4503-5693-0/18/10...$15.00 https://doi.org/10.1145/3243734.3243748 offers hardware-assisted in-process memory safety and protection features [13, 15] and AMD has announced the plans to embed a PrivUser memory space and the use of the key (e.g., sign a mes- similar feature to its future generations of x86 processors [19]. How- sage with the key) is only allowed through our privcall interface. ever, the support for the new processor features are fragmented; In our PoC web server, the in-memory private key is inaccessible not only that the features are not inter-operable across processors outside the privcall routines that are invoked securely, hence ar- from different vendors (Intel, AMD), they are also only available bitrary access to the key is automatically thwarted (i.e., HeartBleed). on the newer processors. Hypervisor-based application memory The evaluation of the PoC and other evaluations are conducted on protection [10, 35] may serve to be a more portable solution, consid- both Intel and AMD PCs. We summarize the contributions of our ering the widespread adaption of hypervisors nowadays. However, LOTRx86 architecture as the following: it is not reasonable for a developer to assume that her users are using a virtual machine. • We propose a portable privileged user mode architecture for The situation presents complications for developers who need to sensitive application code and data protection that does not consider the portability as well as the security of the sensitive data require address switching or run-time page table manipula- her program processes. Therefore, we argue that there is a need tion. for an approach that provides a basis for an privilege in-process • We introduce the privcall ABI that allows user layer to separation based on only the portable features of the processor. invoke the privcall routines in a strictly controlled way. We An memory separation should not require a complete in-process also provide necessary software for building an LOTRx86- address space switching to access the protected memory or costly enabled software. page table modifications. In this paper, we propose a novel x86 user-mode privilege sep- • We developed a PoC LOTRx86-enabled web server to demon- aration architecture called The Lord of the x86 Rings (LOTRx86) strate the protection of in-memory private key during SSL architecture. Our architecture proposes a drastically different, yet connection. portable approach for user privilege separation on x86. While the existing approaches sought to retrofit the memory protection mechanisms within the boundaries of the OS kernel’s support, we pro- 2 BACKGROUND: THE X86 PRIVILEGE pose the creation of a more privileged user layer called PrivUser ARCHITECTURE that protects sensitive application code and data from the normal The LOTRx86 architecture design leverages the x86 privilege struc- user mode. For this objective, LOTRx86 harnesses the underused tures in a unique way. Hence, it is necessary that we explain the x86 intermediate Rings (Ring1 and Ring2) with our unique design x86 privilege system before we go further into the LOTRx86 archi- that satisfies security requirements that define a distinct privilege tecture design. In this section, we briefly describe the x86 privilege layer. The PrivUser memory space is a subset of a process memory concepts focusing on the topics that are closely related to this paper. space that is accessible to when the process context is in PrivUser mode but inaccessible when in user mode. In our architecture, user memory access control is privileged-based. Therefore, the architec- 2.1 The Ring Privileges ture does not require costly run-time page table manipulations nor Modern operating systems on the x86 architecture adapt the two address space switching. privilege level model in which user programs run in Ring3 and We also implement the LOTRx86 ABI that exports the privcall kernel in Ring0. The x86 architecture, in fact, supports four privilege interface that supports PrivUser layer invocation from user layer. layers – Ring0 through Ring3 where Ring0 is the highest privilege To draw an analogy, the syscall interface is a controlled invocation on the system.

Lord of the X86 Rings: a Portable User Mode Privilege Separation Architecture on X86

Deciding Memory Safety for Single-Pass Heap-Manipulating Programs

Intra-Unikernel Isolation with Intel Memory Protection Keys

Nessus 8.3 User Guide

Pwny Documentation Release 0.9.0

I386-Engine™ Technical Manual

Protecting Million-User Ios Apps with Obfuscation: Motivations, Pitfalls, and Experience

“Sok: Eternal War in Memory”

MPTEE: Bringing Flexible and Efficient Memory Protection to Intel

DOS Virtualized in the Linux Kernel

Least Privilege and Privilege Separation

New Concept of the Android Keystore Service with Regard to Security and Portability

Integration of Security Measures and Techniques in an Operating System (Considering Openbsd As an Example)