Lord of the X86 Rings: a Portable User Mode Privilege Separation Architecture on X86

Session 7D: BinDef 1 CCS’18, October 15-19, 2018, Toronto, ON, Canada Lord of the x86 Rings: A Portable User Mode Privilege Separation Architecture on x86 Hojoon Lee∗ Chihyun Song Brent Byunghoon Kang CISPA Helmholtz Center i.G. GSIS, School of Computing, KAIST GSIS, School of Computing, KAIST [email protected] [email protected] [email protected] ABSTRACT ACM Reference Format: Modern applications often involve processing of sensitive informa- Hojoon Lee, Chihyun Song, and Brent Byunghoon Kang. 2018. Lord of the tion. However, the lack of privilege separation within the user space x86 Rings: A Portable User Mode Privilege Separation Architecture on x86. In 2018 ACM SIGSAC Conference on Computer and Communications Security leaves sensitive application secret such as cryptographic keys just (CCS ’18), October 15–19, 2018, Toronto, ON, Canada. ACM, New York, NY, as unprotected as a "hello world" string. Cutting-edge hardware- USA, 14 pages. https://doi.org/10.1145/3243734.3243748 supported security features are being introduced. However, the features are often vendor-specific or lack compatibility with older generations of the processors. The situation leaves developers with 1 INTRODUCTION no portable solution to incorporate protection for the sensitive User applications today are prone to software attacks, and yet are application component. often monolithically structured or lack privilege separation. As We propose LOTRx86, a fundamental and portable approach a result, adversaries who have successfully exploited a software for user-space privilege separation. Our approach creates a more vulnerability in an application can access sensitive in-process code privileged user execution layer called PrivUser by harnessing the or data that are irrelevant to the exploited module or part of the underused intermediate privilege levels on the x86 architecture. The application. Today’s applications often contain secrets that are too PrivUser memory space, a set of pages within process address space critical to reside in the memory along with the rest of the application that are inaccessible to user mode, is a safe place for application contents, as we have witnessed in the incident of HeartBleed [20, secrets and routines that access them. We implement the LOTRx86 45]. ABI that exports the privcall interface to users to invoke secret The conventional software privilege model that coarsely divides handling routines in PrivUser. This way, sensitive application oper- the system privilege into only two levels (user-level and kernel- ations that involve the secrets are performed in a strictly controlled level) has failed to provide a fundamental solution that can support manner. The memory access control in our architecture is privilege- privilege separation in user applications. As a result, critical appli- based, accessing the protected application secret only requires a cation secrets such as cryptographic information are essentially change in the privilege, eliminating the need for costly remote pro- treated no differently than a "hello world" string in user memory cedure calls or change in address space. We evaluated our platform space. When the control flow of a running user context is compro- by developing a proof-of-concept LOTRx86-enabled web server that mised, there is no access control left to prevent the hijacked context employs our architecture to securely access its private key during to access arbitrary memory addresses. an SSL connection. We conducted a set of experiments including a Many approaches have been introduced to mitigate the chal- performance measurement on the PoC on both Intel and AMD PCs, lenging issue within the boundaries of the existing application and confirmed that LOTRx86 incurs only a limited performance memory protection mechanisms provided by the operating sys- overhead. tem. A number of works proposed using the process abstraction as a unit of protection by separating a program into multiple pro- CCS CONCEPTS cesses [7, 29, 41]. The fundamental idea is to utilize the process • Security and privacy → Trusted computing; separation mechanism provided by the OS; these work achieve privilege separation by splitting a single program into multiple KEYWORDS processes. However, this process-level separation incurs a signifi- privilege separation; memory protection; operating system cant overhead due to the cost of the inter-process communication (IPC) between the processes or address space switching that incur TLB flushes. Also, the coarse unit of separation still leaves alarge ∗affiliation changed from KAIST to CISPA as of Sept2018 attack surface for attackers. The direction has advanced through a plethora of works on the topic. One prominent aspect of the ad- Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed vancements is the granularity of protection. Thread-level protection for profit or commercial advantage and that copies bear this notice and the full citation schemes [6, 24, 44] have reduced the protection granularity com- on the first page. Copyrights for components of this work owned by others than ACM pared to the process-level separation schemes while still suffering must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a from performance overhead from page table modifications. Shreds fee. Request permissions from [email protected]. presented fine-grained in-process memory protection using amem- CCS ’18, October 15–19, 2018, Toronto, ON, Canada ory partitioning feature that has long been present in ARM called © 2018 Association for Computing Machinery. ACM ISBN 978-1-4503-5693-0/18/10...$15.00 Memory Domains [11]. However, the feature has been deprecated https://doi.org/10.1145/3243734.3243748 in the 64-bit execution mode of the ARM architecture (AARCH64). 1441 Session 7D: BinDef 1 CCS’18, October 15-19, 2018, Toronto, ON, Canada In the more recent years, a number of processor architecture routines, a modified C library for the building the PrivUser side revisions and academic works have taken a more fundamental (lotr-libc), and a tool for building LOTRx86-enabled program approach to provide in-process protection; Intel has introduced (lotr-build). Software Guard Extensions (SGX) to its new x86 processors to pro- We implemented a prototype of our architecture that is com- tect sensitive application and code and data from the rest of the patible with both Intel and AMD’s x86 processors. Based on our application as well as the possibly malicious kernel [4, 14]. Intel also prototype, we developed a proof-of-concept LOTRx86-enabled web offers hardware-assisted in-process memory safety and protection server. In our PoC, the web server’s private key is protected in the features [13, 15] and AMD has announced the plans to embed a PrivUser memory space and the use of the key (e.g., sign a mes- similar feature to its future generations of x86 processors [19]. How- sage with the key) is only allowed through our privcall interface. ever, the support for the new processor features are fragmented; In our PoC web server, the in-memory private key is inaccessible not only that the features are not inter-operable across processors outside the privcall routines that are invoked securely, hence ar- from different vendors (Intel, AMD), they are also only available bitrary access to the key is automatically thwarted (i.e., HeartBleed). on the newer processors. Hypervisor-based application memory The evaluation of the PoC and other evaluations are conducted on protection [10, 35] may serve to be a more portable solution, consid- both Intel and AMD PCs. We summarize the contributions of our ering the widespread adaption of hypervisors nowadays. However, LOTRx86 architecture as the following: it is not reasonable for a developer to assume that her users are • We propose a portable privileged user mode architecture for using a virtual machine. sensitive application code and data protection that does not The situation presents complications for developers who need to require address switching or run-time page table manipula- consider the portability as well as the security of the sensitive data tion. her program processes. Therefore, we argue that there is a need for an approach that provides a basis for an in-process privilege • We introduce the privcall ABI that allows user layer to separation based on only the portable features of the processor. invoke the privcall routines in a strictly controlled way. We An in-process memory separation should not require a complete also provide necessary software for building an LOTRx86- address space switching to access the protected memory or costly enabled software. page table modifications. • We developed a PoC LOTRx86-enabled web server to demon- In this paper, we propose a novel x86 user-mode privilege sep- strate the protection of in-memory private key during SSL aration architecture called The Lord of the x86 Rings (LOTRx86) connection. architecture. Our architecture proposes a drastically different, yet portable approach for user privilege separation on x86. While the existing approaches sought to retrofit the memory protection mech- 2 BACKGROUND: THE X86 PRIVILEGE anisms within the boundaries of the OS kernel’s support, we pro- ARCHITECTURE pose the creation of a more privileged user layer called PrivUser The LOTRx86 architecture design leverages the x86 privilege struc- that protects sensitive application code and data from the normal tures in a unique way. Hence, it is necessary that we explain the user mode. For this objective, LOTRx86 harnesses the underused x86 privilege system before we go further into the LOTRx86 archi- x86 intermediate Rings (Ring1 and Ring2) with our unique design tecture design. In this section, we briefly describe the x86 privilege that satisfies security requirements that define a distinct privilege concepts focusing on the topics that are closely related to this paper.

Lord of the X86 Rings: a Portable User Mode Privilege Separation Architecture on X86

Openvms Record Management Services Reference Manual

Chapter 1. Origins of Mac OS X

Improving System Security Through TCB Reduction

Least Privilege and Privilege Separation

New Concept of the Android Keystore Service with Regard to Security and Portability

Integration of Security Measures and Techniques in an Operating System (Considering Openbsd As an Example)

The Evolution of the Unix Time-Sharing System*

Vx32: Lightweight User-Level Sandboxing on the X86

VULNERABLE by DESIGN: MITIGATING DESIGN FLAWS in HARDWARE and SOFTWARE Konoth, R.K

Advanced Development of Certified OS Kernels Prof

I.T.S.O. Powerpc an Inside View

Free, Functional, and Secure