Harmonizing Performance and Isolation in Microkernels with Efficient Intra-kernel Isolation and Communication

Jinyu Gu, Xinyue Wu, Wentai Li, Nian Liu, Zeyu Mi, Yubin Xia, Haibo Chen

2 Monolithic Kernel and Microkernel

Microkernel’s philosophy: Moving most OS components into isolated user processes

3 Benefits and Usages of Microkernel

• Achieves good extensibility, security, and fault isolation
• Succeeds in safety-critical scenarios (Airplane, Car)
• For more general-purpose applications (Google Zircon)

4 Expensive Communication Cost

• Tradeoff: Performance and Isolation
– Inter-process communication (IPC) overhead

[Figure: an App, a File System server, and a Disk Driver run on top of the microkernel and communicate via IPC]

5 IPC Overhead is Considerable

[Figure: stacked bars (0% to 100%) splitting execution time into IPC cost and real work in servers, for Zircon, seL4 w/ kpti, and seL4 w/o kpti; the workload stack is SQLite, xv6FS, and Ramdisk on top of the microkernel]

• Direct cost: privilege switch, process switch, …
• Indirect cost: pollution of CPU internal structures

Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU

6 Goal: Both Ends

• Harmonize the tension between Performance and Isolation in microkernels

– Reducing the IPC overhead

– Maintaining the isolation guarantee

7 New Hardware Brings Opportunities

• PKU: Protection Key for Userspace (aka. MPK)
– Assign each page one PKEY (i.e., memory domain ID) in [0:15]
– A new register PKRU stores read/write permission
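To make the PKRU register concrete, here is a minimal Python sketch of its layout: 32 bits, two per PKEY, where bit 2k is the access-disable (AD) bit and bit 2k+1 is the write-disable (WD) bit of key k. The helper names (`make_pkru`, `can_read`, `can_write`) are made up for illustration and are not part of any real API.

```python
# Sketch of the PKRU layout: 32 bits, two per PKEY.
#   bit 2*k     = AD (access disable) for key k
#   bit 2*k + 1 = WD (write disable) for key k
NUM_PKEYS = 16


def make_pkru(read_write=(), read_only=()):
    """Build a PKRU value that denies every key except the listed ones."""
    pkru = 0
    for key in range(NUM_PKEYS):
        if key in read_write:
            bits = 0b00   # AD=0, WD=0: reads and writes allowed
        elif key in read_only:
            bits = 0b10   # AD=0, WD=1: reads only
        else:
            bits = 0b01   # AD=1: no data access
        pkru |= bits << (2 * key)
    return pkru


def can_read(pkru, key):
    """A read is allowed when the AD bit of the key is clear."""
    return ((pkru >> (2 * key)) & 0b01) == 0


def can_write(pkru, key):
    """A write is allowed when both AD and WD of the key are clear."""
    return ((pkru >> (2 * key)) & 0b11) == 0
```

Switching domains then amounts to loading a different PKRU value, which is why the switch can be so cheap.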

8 Efficient Intra-Process Isolation

• ERIM [Security’19] & Hodor [ATC’19]
– Based on Intel PKU
– Build isolated domains in the same process efficiently
– Domain switch only takes 28 cycles (modify PKRU)

[Figure: one process divided into isolated domains: App Part, Library-1, Library-2]

9 Intra-Process Isolation + Microkernel

[Figure: layered stack. Top: App, App, and the System Servers (FS, MM, Net, Drv, …), with the servers labeled "Intel PKU". Middle: Microkernel (Process, IPC, Sched). Bottom: Hardware]

10 Design Choice #1

Isolate different system servers in a single process.

[Figure: Server-1, Server-2, Server-3, … are isolated domains in one process; the App talks to them just as with traditional IPCs, through the Microkernel]

11 Design Choice #2

Let’s get more aggressive!

[Figure: Server-1, Server-2, Server-3, … appear inside both App-1 and App-2, on top of the Microkernel]

Drawbacks
1. Updating server mappings is costly
2. IPC connection is also costly
3. Less flexibility for applications on address space and using PKU

12 An Observation on Intel PKU

• A misleading name
– Protection Key for Userspace
• It still takes effect when in kernel (ring-0)
– The “Userspace” means user-accessible memory
– U/K bit in PTE
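The observation can be sketched as a small Python model of the protection-key check. The function name `pku_allows` and its parameters are invented for this sketch. It models only the PKU part of a data access (not the ordinary page permissions) and assumes write protection is enforced for supervisor code, i.e. CR0.WP is set.

```python
def pku_allows(pte_user_bit, pkey, pkru, is_write):
    """Model of the protection-key check for one data access.

    The current privilege level is deliberately not a parameter: the check
    depends only on whether the page is user-accessible (U/K bit in the PTE),
    so it also applies to accesses made from ring-0.
    """
    if not pte_user_bit:
        return True   # supervisor-only page: PKRU is not consulted
    access_disabled = (pkru >> (2 * pkey)) & 1
    write_disabled = (pkru >> (2 * pkey + 1)) & 1
    if access_disabled:
        return False
    if is_write and write_disabled:
        return False
    return True
```

In this model, kernel-resident code whose pages are marked user-accessible in the PTE is still confined by PKRU, which is the property the intra-kernel design relies on.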

13 UnderBridge: Sinking System Servers

[Figure: layered stack. Top: App, App. Below: the System Servers (FS, MM, Net, Drv, …) sunk to the Microkernel level and separated by Intel PKU intra-kernel isolation. Bottom: Hardware]

14 Design Choice #3: UnderBridge

• Build execution domains in the kernel page table

[Figure: user space holds the Apps; kernel space holds the Microkernel in Dom-0 and Server-1, Server-2, Server-3 in Dom-1, Dom-2, Dom-3]

15 Execution Domain

• Execution domain 0 is for the microkernel
– Use memory domain 0
– Can access all the memory
• Others own a private memory domain
– A private MPK memory domain ID
• Shared memory
– Allocate a free MPK memory domain ID

[Figure: Microkernel in Dom-0; Server-1 in Dom-1; Server-2 in Dom-2]
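The bookkeeping implied by these rules can be sketched in Python. The class `DomainAllocator`, its methods, and the `"microkernel"` name are hypothetical; the sketch only tracks which memory domain IDs each execution domain may access and says nothing about how the real system stores this state.

```python
NUM_MEMORY_DOMAINS = 16   # MPK memory domain IDs 0..15
KERNEL_DOMAIN = 0         # memory domain 0 belongs to the microkernel


class DomainAllocator:
    """Hypothetical bookkeeping for memory domain IDs."""

    def __init__(self):
        self.free_ids = list(range(1, NUM_MEMORY_DOMAINS))
        self.private = {}   # server name -> private memory domain ID
        self.shared = {}    # memory domain ID -> set of server names

    def add_server(self, name):
        """Give a server its own private memory domain ID."""
        self.private[name] = self.free_ids.pop(0)
        return self.private[name]

    def share_memory(self, *servers):
        """Allocate a free memory domain ID for a shared region."""
        dom = self.free_ids.pop(0)
        self.shared[dom] = set(servers)
        return dom

    def accessible(self, name):
        """Memory domain IDs an execution domain may access."""
        if name == "microkernel":
            return set(range(NUM_MEMORY_DOMAINS))   # can access all memory
        ids = {self.private[name]}
        ids |= {d for d, users in self.shared.items() if name in users}
        return ids
```

Once the 15 non-kernel IDs are used up, `pop(0)` raises `IndexError`; that hard limit is the reason a design like this needs a way to move servers out of the kernel.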

16 IPC Gate
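As a rough illustration of the gate this slide describes, the Python sketch below models a gate as a small trampoline that switches the PKRU value, transfers control to the callee, and switches back on return. `Cpu`, `make_ipc_gate`, and the PKRU constants used with them are placeholders for illustration; the real gate is generated by the microkernel, and its capability checks are omitted here.

```python
class Cpu:
    """Tiny stand-in for the hardware state the gate touches."""

    def __init__(self, pkru):
        self.pkru = pkru


def make_ipc_gate(cpu, caller_pkru, callee_pkru, callee_entry):
    """Return a gate connecting one caller domain to one callee entry point."""

    def gate(*args):
        cpu.pkru = callee_pkru           # domain switch into the callee
        try:
            return callee_entry(*args)   # transfer control flow
        finally:
            cpu.pkru = caller_pkru       # domain switch back on return

    return gate


# Usage with placeholder PKRU values for two servers.
SERVER1_PKRU, SERVER2_PKRU = 0x1, 0x2
cpu = Cpu(pkru=SERVER1_PKRU)
observed = []


def server2_handler(x):
    observed.append(cpu.pkru)   # record the domain the handler runs in
    return x + 1


gate = make_ipc_gate(cpu, SERVER1_PKRU, SERVER2_PKRU, server2_handler)
reply = gate(41)
```

No privilege switch or address-space switch appears anywhere in this path, which is the point of placing both servers in the same page table.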

• Connect two servers
– Generated by the microkernel
– Resides in memory domain 0 (execute-only for servers)
• Transfer control flow during IPC invocations
– Also performs the domain switch
• Connect the microkernel and servers
– System calls

[Figure: one IPC gate between Server-1 (Dom-1) and Server-2 (Dom-2); another between the Microkernel (Dom-0) and Server-2 (Dom-2)]

17 Server Migration

• The number of execution domains is limited
– Hardware only provides 16 memory domains
– Time-multiplexing is expensive
• Move servers between user and kernel space
– Disjoint regions
– Runtime migration
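The slides do not say how servers are chosen for kernel space. Purely as an assumed policy for illustration, the sketch below keeps the servers with the most IPC traffic in the limited in-kernel slots and leaves the rest in user space. `place_servers`, `ipc_counts`, and `kernel_slots` are hypothetical names.

```python
def place_servers(ipc_counts, kernel_slots):
    """Split servers into an in-kernel set and a user-space set.

    ipc_counts:   server name -> observed IPC invocations (hypothetical metric)
    kernel_slots: number of memory domains still free for in-kernel servers
    """
    # Assumed "hottest first" policy: the busiest servers go into the kernel.
    ranked = sorted(ipc_counts, key=ipc_counts.get, reverse=True)
    in_kernel = set(ranked[:kernel_slots])
    in_user = set(ranked[kernel_slots:])
    return in_kernel, in_user
```

Re-running such a policy as the workload changes would yield the set of servers to migrate at runtime.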

18 Privilege Deprivation

• In-kernel servers have supervisor privilege
– Can affect the whole system if compromised
– CFI (with binary scanning) incurs runtime overhead
– Binary rewriting alone is infeasible
• Prevent servers from executing privileged instructions
– Add a tiny secure monitor in hypervisor mode
– For instructions that rarely execute: VMExits
– For instructions that are frequently required: rewriting

19 Other Designs and Implementations

• IPC capability authentication

• Seamless server migration

• Privilege deprivation details

20 Cross-server IPC Round-Trip Latency

[Figure: bar chart of cross-server IPC round-trip latency in cycles (y-axis 0 to 8500). Systems shown: Monolithic, ChCore (UnderBridge), SkyBridge, seL4, Fiasco.OC (seL4 and Fiasco.OC each also in a -KPTI variant), and Zircon. Bar labels as extracted: 8151, 4145, 3057, 2035, 1450, 437, 24, 109]

Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU

21 SQLite Throughput under YCSB-A

[Figure: SQLite throughput under YCSB-A on Zircon, Fiasco.OC, and seL4 (y-axis: Throughput, 0 to 10; annotation: 1×∼8×). Legend entries as extracted: Native w/ KPTI, UnderBridge, Native w/o KPTI, Monolithic, SkyBridge, Monolithic w/o KPTI]

Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU

22 Conclusion & Thanks!

• UnderBridge
– A redesign of the runtime structure of microkernel OSes for faster OS services

– The efficient intra-kernel isolation mechanism may also be used to harden the isolation of monolithic kernels

Q&A: [email protected]