Harmonizing Performance and Isolation in Microkernels with Efficient Intra-Kernel Isolation and Communication
Total Page:16
File Type:pdf, Size:1020Kb
Harmonizing Performance and Isolation in Microkernels with Efficient Intra-kernel Isolation and Communication Jinyu Gu, Xinyue Wu, Wentai Li, Nian Liu, Zeyu Mi, Yubin Xia, Haibo Chen Monolithic Kernel and Microkernel 2 Monolithic Kernel and Microkernel Microkernel’s philosophy: Moving most OS components into isolated user processes 3 Benefits and Usages of Microkernel • Achieves good extensibility, security, and fault isolation • Succeeds in safety-critical scenarios (Airplane, Car) • For more general-purpose applications (Google Zircon) 4 Expensive Communication Cost • Tradeoff: Performance and Isolation – Inter-process communication (IPC) overhead File Disk App System Driver Microkernel IPC 5 IPC Overhead is Considerable IPC Cost Real Work in Servers 100% SQLite xv6FS Ramdisk 80% 60% 40% Microkernel 20% Zircon seL4 seL4 Direct cost: privilege switch, process switch, … w/ kpti w/o kpti Indirect cost: CPU internal structures pollution Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU 6 Goal: Both Ends • Harmonize the tension between Performance and Isolation in microkernels – Reducing the IPC overhead – Maintaining the isolation guarantee 7 New Hardware Brings Opportunities • PKU: Protection Key for Userspace (aka. MPK) – Assign each page one PKEY (i.e., memory domain ID) [0:15] – A new register PKRU stores read/write permission 8 Efficient Intra-Process Isolation App Part • ERIM [Security’19] & Hodor [ATC’19] Library-1 – Based on Intel PKU Library-2 – Build isolate domains in the same process efficiently – Domain switch only takes 28 cycles (modify PKRU) 9 Intra-Process Isolation + Microkernel System Servers App App FS MM Net Drv … Intel PKU Microkernel Process IPC Sched Hardware 10 Design Choice #1 Isolate different system servers in a single process. Server-1 Isolated Server-2 App domains Server-3 … Just as traditional IPCs Microkernel 11 Design Choice #2 Let’s get more aggressive! App-1 App-2 Drawbacks Server-1 Server-1 1. Update Server mapping is costly Server-2 Server-2 2. IPC connection is also costly 3. Less flexibility for applications Server-3 Server-3 on address space and using PKU … … Microkernel 12 An Observation on Intel PKU • A misleading name – Protection Key for Userspace • It still takes effect when in kernel (ring-0) – The “Userspace” means user-accessible memory – U/K bit in PTE 13 UnderBridge: Sinking System Servers System Servers App App FS MM Net Drv … Intel PKU Intra-kernel isolation Microkernel Hardware 14 Design Choice #3: UnderBridge • Build execution domains in the kernel page table User App App App Dom-0 Kernel Microkernel Dom-1 Dom-2 Dom-3 Server-1 Server-2 Server-3 15 Execution Domain • Execution domain 0 is for the microkernel – Use memory domain 0 – Can access all the memory • Others own a private memory domain – A private MPK memory domain ID • Shared memory Dom-0 – Allocate a free Microkernel MPK memory domain ID Dom-1 Dom-2 Server-1 Server-2 16 IPC Gate • Connect two servers Dom-1 Dom-2 Server-1 Server-2 – Generated by the microkernel – Resides in memory domain 0 (execute-only for servers) • Transfer control flow during IPC invocations – context switch and domain switch • Connect the microkernel and servers – System calls Dom-0 Dom-2 Microkernel Server-2 17 Server Migration • The number of execution domain is limited – Hardware only provides 16 memory domains – Time-multiplexing is expensive • Move servers between user and kernel space – Disjoint virtual memory regions – Runtime migration 18 Privilege Deprivation • In-kernel servers have supervisor privilege – Can affect the whole system if compromised – CFI (with binary scanning) incurs runtime overhead – Binary rewriting only is infeasible • Prevent servers to execute privilege instructions – Add a tiny secure monitor in hypervisor mode – For instructions rarely execute: VMExits – For instructions that frequently required: Rewriting 19 Other Designs and Implementations • IPC capability authentication • Seamless server migration • Privilege deprivation details 20 Cross-server IPC Round-Trip Latency 8500 8151 8000 7500 5000 4145 4000 3057 3000 Cycles 2035 2000 1450 1000 437 24 109 0 Monolithic ChCore SkyBridge seL4 seL4 Fiasco.OCFiasco.OC Zircon (UnderBridge) -KPTI -KPTI 21 Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU SQLite Throughput under YCSB-A 1×∼8× Native w/ KPTI UnderBridge Native w/o KPTI Monolithic SkyBridge Monolithic w/o KPTI 10 8 6 4 Throughput 2 0 Zircon Fiasco.OC seL4 22 Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU Conclusion & Thanks! • UnderBridge – A redesign of the runtime structure of microkernel OSes for faster OS services – The efficient intra-kernel isolation mechanism may also be used to harden the isolation of monolithic kernels Q&A: [email protected] 23.