Microkernel Virtualization Under One Roof - Dare the Impossible
Total Page:16
File Type:pdf, Size:1020Kb
Microkernel virtualization under one roof - dare the impossible - Alexander Böttcher <[email protected]> Outline 1. Introduction 2. Kernel interfaces 3. VM interface harmonization 4. VMMs harmonized 5. Conclusion Microkernel virtualization under one roof - dare the impossible -2 Outline 1. Introduction 2. Kernel interfaces 3. VM interface harmonization 4. VMMs harmonized 5. Conclusion Microkernel virtualization under one roof - dare the impossible -3 Motivation Off-the-shell virtualization solution ridden with complexity. Application of virtualization call for trustworthy solutions. Complexity defeats trust. Alternative approach → Microkernels with hardware assisted virtualization extensions Microkernel virtualization under one roof - dare the impossible -4 Component based virtualization architecture Guest OS Guest OS Guest OS non-root mode root mode VMM VMM VMM Resource management Apps Drivers 9,000 SLOC kernel NOVA Microhypervisor Microkernel virtualization under one roof - dare the impossible -5 Genode OS framework Microkernel virtualization under one roof - dare the impossible -6 General supported kernels on Genode Microkernel virtualization under one roof - dare the impossible -7 Kernels with hardware assisted virtualization Microkernel virtualization under one roof - dare the impossible -8 VMM inventory of Genode Hardware assisted virtualization/separation support Microkernel Host VMM Guest vCPU hw ARM, 32bit custom 1, 32bit hw/trustzone ARM, 32bit custom 1, 32bit hw with Muen Intel, 64bit VBox 4 1, 32bit Seoul N, 32bit NOVA Intel & AMD VBox 4 N, 32bit, 64 bit 32bit, 64bit VBox 5 N, 32bit, 64 bit Microkernel virtualization under one roof - dare the impossible -9 Focus on x86 microkernels for now → NOVA, seL4, Fiasco.OC, and -hw- Approach: Generalize VM interface as used by -hw- Research challenge Vision: VMMs runnable on all kernels w/o re-compilation Microkernel virtualization under one roof - dare the impossible - 10 Approach: Generalize VM interface as used by -hw- Research challenge Vision: VMMs runnable on all kernels w/o re-compilation Focus on x86 microkernels for now → NOVA, seL4, Fiasco.OC, and -hw- Microkernel virtualization under one roof - dare the impossible - 10 Research challenge Vision: VMMs runnable on all kernels w/o re-compilation Focus on x86 microkernels for now → NOVA, seL4, Fiasco.OC, and -hw- Approach: Generalize VM interface as used by -hw- Microkernel virtualization under one roof - dare the impossible - 10 Outline 1. Introduction 2. Kernel interfaces 3. VM interface harmonization 4. VMMs harmonized 5. Conclusion Microkernel virtualization under one roof - dare the impossible - 11 Flow of a virtualization event User-level VMM Guest OS UTCB VMCS UTCB copy NOVA world switch Microkernel virtualization under one roof - dare the impossible - 12 vCPU state on NOVA UTCB VMM user space kernel space UTCB VMCS/VMCB NOVA microhypervisor Transfer: UTCB, VMCS/VMCB agnostic, partial state support Microkernel virtualization under one roof - dare the impossible - 13 vCPU state on Fiasco.OC UTCB VMM vCPU state user space kernel space UTCB VMCS/VMCB vCPU state Fiasco.OC microkernel Transfer: vCPU state, not VMCS/VMCB agnostic, full state Microkernel virtualization under one roof - dare the impossible - 14 vCPU state on seL4 IPCBuffer VMM user space kernel space IPCBuffer VMCS vCPU state seL4 microkernel Transfer: hybrid - IPCBuffer & syscall per VMCS register IPCBuffer: VM exit - 17 registers, VM enter - 3 registers Microkernel virtualization under one roof - dare the impossible - 15 Control flow on NOVA UTCB VMM thread IPC call user space kernel space IPC reply UTCB vCPU VMCS/VMCB NOVA microhypervisor Microkernel virtualization under one roof - dare the impossible - 16 Control flow on Fiasco.OC UTCB VMM vCPU state thread syscall done user space vmresume kernel space (blocking) UTCB vCPU VMCS/VMCB vCPU state Fiasco.OC microkernel Microkernel virtualization under one roof - dare the impossible - 17 Control flow on seL4 IPCBuffer VMM thread syscall done user space vmenter kernel space (blocking) IPCBuffer vCPU VMCS vCPU state seL4 microkernel Microkernel virtualization under one roof - dare the impossible - 18 Control flow on Genode’s -hw- UTCB VMM vCPU state thread signal user space kernel space run UTCB vCPU vCPU state Genode’s -hw- microkernel (ARM) Microkernel virtualization under one roof - dare the impossible - 19 Outline 1. Introduction 2. Kernel interfaces 3. VM interface harmonization 4. VMMs harmonized 5. Conclusion Microkernel virtualization under one roof - dare the impossible - 20 VM event → just another event source I/O event → just another event source Kernel agnostic ABI Unified vCPU state per platform Design goals VMM → just a component Genode components designed event driven Non-blocking thread (entrypoint) register for event sources Events cause transition in state machine State transition by Genode signal or RPC Microkernel virtualization under one roof - dare the impossible - 21 Design goals VMM → just a component Genode components designed event driven Non-blocking thread (entrypoint) register for event sources Events cause transition in state machine State transition by Genode signal or RPC VM event → just another event source I/O event → just another event source Kernel agnostic ABI Unified vCPU state per platform Microkernel virtualization under one roof - dare the impossible - 21 Envisioned vCPU handling VMM entrypoint timer network signal signal user space VM event kernel space vCPU0 vCPUn kernel Microkernel virtualization under one roof - dare the impossible - 22 Envisioned vCPU handling - multi core VMM entrypoint A entrypoint B user space kernel space vCPU A0 ... vCPU An vCPU B0 ... vCPU Bn kernel Microkernel virtualization under one roof - dare the impossible - 23 VM interface - kernel agnostic VMM Entrypoint VM interface ld.lib.so user space kernel space vCPU0 ... vCPUn kernel Genode -base- library with unified ABI in ld.lib.so Microkernel virtualization under one roof - dare the impossible - 24 VM interface - kernel agnostic VM connection/session → VM address space established create_vcpu() - setup new vCPUs cpu_state() - access to guest state attach/detach() - memory management of VM VM_handler class - registration for VM event handling run/pause() - control execution of vCPUs - non-blocking Microkernel virtualization under one roof - dare the impossible - 25 VM interface - kernel agnostic entrypoint VMM VM interface (client) ld.lib.so connection init VM interface (server) core user space kernel space kernel Microkernel virtualization under one roof - dare the impossible - 26 VM interface - kernel agnostic entrypoint VMM VM interface (client) ld.lib.so VM session init VM interface (server) core user space kernel space vCPU0 ... vCPUn kernel Microkernel virtualization under one roof - dare the impossible - 27 VM interface - kernel agnostic entrypoint VMM VM interface (client) ld.lib.so VM session init VM interface (server) core user space kernel space Server: 200-400 LOC Client: NOVA, seL4:vCPU0 ~500 ... vCPUn - Fiasco.OC: ~1000 - hw: ~30 LOC kernel Microkernel virtualization under one roof - dare the impossible - 28 Control flow on Genode’s -hw- and NOVA UTCB VMM vCPU state thread signalIPC call user space kernel space IPC replyrun UTCB vCPU VMCS/VMCB vCPU state Genode’sNOVA -hw- microhypervisor microkernel (ARM) Microkernel virtualization under one roof - dare the impossible - 29 Control flow on Genode’s -hw- and NOVA Event source VMM VM (timer) kernel vCPU Entrypoint hw/NOVA vCPU0 vCPU1 VM exit signal/IPC call run/IPC reply non-blocking VM resume event (timeout) pause/recall non-blocking VM exit signal/IPC call run/IPC reply inject vIRQ Microkernel virtualization under one roof - dare the impossible - 30 Control flow on seL4 and Fiasco.OC IPCBufferUTCB VMM vCPU state thread syscall done user space vmresumevmenter kernel space (blocking) IPCBufferUTCB vCPU VMCS/VMCBVMCS vCPU state Fiasco.OCseL4 microkernel microkernel Microkernel virtualization under one roof - dare the impossible - 31 Control flow on seL4 and Fiasco.OC Event source VMM kernel VM (timer) Entrypoint seL4/Fiasco.OC vCPU0 vCPU1 vmenter/vmresume blocking syscall VM resume Blocking syscall unfortunate → complicates life Kernels provide mechanism to cancel Avoid special case handling in Genode for first take → Workaround: spawn per vCPU extra thread Microkernel virtualization under one roof - dare the impossible - 32 Control flow on seL4 and Fiasco.OC Event source VMM kernel VM (timer) Entrypoint Handler0 Handler1 vCPU0 vCPU1 run non-blocking vmenter/vmresume VM resume run non-blocking vmenter/vmresume VM resume Microkernel virtualization under one roof - dare the impossible - 33 Control flow on seL4 and Fiasco.OC Event source VMM kernel VM (timer) Entrypoint Handler0 seL4/Fiasco.OC vCPU0 vCPU1 run non-blocking vmenter/vmresume blocking syscall VM resume event (timeout) pause cancel vmenter/vmresume vmenter/vmresume syscall returns signal run inject vIRQ vmenter/vmresume inject vIRQ VM resume Microkernel virtualization under one roof - dare the impossible - 34 Outline 1. Introduction 2. Kernel interfaces 3. VM interface harmonization 4. VMMs harmonized 5. Conclusion Microkernel virtualization under one roof - dare the impossible - 35 sel4 v9.0: Kernel fault on VMEnter by non vCPU thread → patch No unrestricted guest support → patch Scheduling bug if vCPU spins → starvation → patch Kernel denies to boot on non VT-x platforms → patch → Working toy VMM on all 3 kernels → no AMD support by seL4 VMM unit test Control flow