KShot: Live Kernel Patching with SMM and SGX

Lei Zhou∗†, Fengwei Zhang∗, Jinghui Liao‡, Zhengyu Ning∗, Jidong Xiao§, Kevin Leach¶, Westley Weimer¶, and Guojun Wang‖

∗Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China, {zhoul2019,zhangfw,ningzy2019}@sustech.edu.cn
†School of Computer Science and Engineering, Central South University, Changsha, China
‡Department of Computer Science, Wayne State University, Detroit, USA, [email protected]
§Department of Computer Science, Boise State University, Boise, USA, [email protected]
¶Department of Computer Science and Engineering, University of Michigan, Ann Arbor, USA, {kjleach,weimerw}@umich.edu
‖School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, China, [email protected]

The work was done while Lei Zhou was visiting the COMPASS lab. Fengwei Zhang is the corresponding author.

Abstract: Live kernel patching is an increasingly common trend in operating system distributions, enabling dynamic updates to include new features or to fix vulnerabilities without having to reboot the system. Patching the kernel at runtime lowers downtime and reduces the loss of useful state from running applications. However, existing kernel live patching techniques (1) rely on specific support from the target operating system, and (2) admit patch failures resulting from kernel faults. We present KShot, a kernel live patching mechanism based on x86 SMM and Intel SGX that focuses on patching Linux kernel security vulnerabilities. Our patching processes are protected by hardware-assisted Trusted Execution Environments. We demonstrate that our technique can successfully patch vulnerable kernel functions at the binary level without support from the underlying OS and regardless of whether the kernel patching mechanism is compromised. We demonstrate the applicability of KShot by successfully patching 30 critical indicative kernel vulnerabilities.

I. INTRODUCTION

The growing complexity and heterogeneity of software has led to a concomitant increase in the pressure to apply patches and updates, including to the operating system itself [1]. Frequently, users that choose to patch their kernels may incur downtime when the patch requires restarting the system. This unavoidable disruption impacts both enterprise and end users. For example, in systems that are performing complex scientific computations or financial transactions, users are unlikely to reboot a system [2], [3]. According to Gartner [4], the average cost of IT downtime is $5,600 per minute. Business downtime can reach $300,000 per hour, on average. Even for general end users, unplanned downtime interrupts running applications and risks the loss of unsaved data. As a result, enterprises and users often delay applying patches to their operating systems, leading to increased risks to their computing resources [1].

Since patches are important to fixing vulnerabilities and adding software features, many prior approaches propose live patching mechanisms that reduce or avoid system reboots or the loss of application or OS state. Early mechanisms focused on live updating applications (e.g., POLUS [5]), but kernel vulnerabilities also merit patching. Organizations often use rolling upgrades [3], [6], in which patches are designed to affect small subsystems that minimize unplanned whole-system downtime, to update and patch whole server systems. However, rolling upgrades do not altogether obviate the need to restart software or reboot systems; instead, dynamic hot patching (live patching) approaches [7]–[9] aim to apply patches to running software without having to restart it.

Several kernel-level live patching tools have been designed previously, including kpatch [10], kGraft [11], Ksplice [12], and the Canonical Livepatch Service [13]. For example, kpatch leverages OS-provided infrastructures such as ftrace to trace a target function, clone and fork to hook the entry instruction in that target function, and then trampolines to a patched version of that target function. Moreover, it can use procfs and ptrace system calls to checkpoint and restore the state of running applications. In addition, all those approaches need to modify the existing kernel code and trust the operating system. In a similar vein, KUP [8] replaces the whole kernel at runtime while retaining state from running applications. However, KUP incurs significant runtime and resource overhead (e.g., more than 30GB of memory space) to support application checkpointing [14], even for very small kernel patches.

Existing patching techniques must trust the OS kernel or cooperative patching applications to deploy patches. However, patching implementations can suffer from numerous bugs [15], which may cause patching failures or interruptions. Moreover, a patch may become compromised if the OS or patching mechanism becomes compromised. For example, an internal OS update can be hijacked [16]–[18] to download and install malicious patches. Such attacks download additional malicious applications while retaining kernel functionality. Further, even after live patching applies kernel patches, kernel attacks may be able to revert the software to a vulnerable version [19]. Such situations are more likely to happen in remote or cloud computing environments [20], [21], where users have less control over a remote computer's patching operations. Thus, there is a need to improve the dependability of live patching techniques.

To summarize, live kernel patching faces three challenges:

1) Downtime. Traditional kernel patching methods require downtime, either from unplanned reboots or from stopping applications to checkpoint states.
2) Overhead. Live kernel patching techniques often incur non-trivial CPU and memory overhead to apply patches and restore previously-checkpointed state.
3) Trust. Live patching software depends on the correctness of the underlying OS, which may suffer from bugs [22] or security vulnerabilities. If the OS-level patching mechanism becomes compromised, then patches applied by that mechanism cannot be trusted.

In this paper, we present KShot, a live kernel patching technique that uses Intel Software Guard eXtensions (SGX) and System Management Mode (SMM) to effectively, efficiently, and reliably patch running, untrusted kernels. We summarize our contributions as follows:

• We develop a reliable architecture for live kernel patching. We leverage Trusted Execution Environments (TEEs) implemented with SGX and SMM to prepare and deploy kernel patches that do not require trusting the kernel patching mechanism.
• We use SMM (i.e., hardware support) to naturally store the runtime state of the target host, which reduces external storage overhead and improves live patching performance. Employing this hardware-assisted mechanism supports faster restoration without requiring external checkpoint and restore solutions (e.g., CRIU [14]). Moreover, we adopt an SMM-based kernel protection approach for secure live patching.
• We use SGX as a trusted environment for patch preparation to provide adequate runtime patch performance. Furthermore, patching in an SGX enclave precludes adversarial tampering, improving patching reliability.
• We evaluate the effectiveness and efficiency of KShot by providing an in-depth analysis on a suite of indicative kernel vulnerabilities. We demonstrate that our approach incurs little overhead while providing trustworthy live kernel patches that mitigate known kernel exploits.

II. BACKGROUND

In this section, we first introduce existing live patching [...]

Fig. 1: Overview of live patching approaches: function-, instruction-, and kernel-level. In function-level, entire kernel functions are replaced with new ones by copying bytes into memory. In instruction-level, single buggy instructions are replaced with trampolines to new instructions. In kernel-level, the entire kernel image is replaced with a new binary image by switching page table entries so that kernel addresses correspond to a new location in physical memory that contains the revised image.

[...] kernel switching. In general, these methods can replace single instructions, vulnerable functions, or even the whole kernel with a patched one to repair bugs or eliminate vulnerabilities. Solutions in this area include industry-deployed mechanisms like Ksplice [12] and kpatch [10] as well as academia-proposed solutions like KUP [8] and KARMA [9].

However, current KLP techniques extend trust to the kernel itself to correctly deploy patches. If the kernel becomes compromised, then any subsequent patches deployed by that kernel are not trustworthy, potentially leading to additional malicious activities [1]. In our work, we implement a trustworthy KLP mechanism by leveraging TEEs that enable live kernel patching even when the underlying kernel patching mechanism is compromised.

B. System Management Mode

System Management Mode (SMM) is a highly-privileged CPU execution mode present in all current x86 machines since the 80386. It is used to handle system-wide functionality such as power management, system hardware control, or OEM-specific code. SMM is used by the system firmware but not by applications or normal system software. The code and data used in SMM are stored in a hardware-protected memory region named System Management RAM (SMRAM), which is inaccessible