Microkernel OS Evolution
COMP9242 Advanced OS S2/2018 W01: Introduction to seL4
@GernotHeiser

Copyright Notice
These slides are distributed under the Creative Commons Attribution 3.0 License
• You are free:
– to share: to copy, distribute and transmit the work
– to remix: to adapt the work
• under the following conditions:
– Attribution: You must attribute the work (but not in any way that suggests that the author endorses you or your use of the work) as follows: “Courtesy of Gernot Heiser, UNSW Sydney”
The complete license text can be found at http://creativecommons.org/licenses/by/3.0/legalcode

COMP9242 S2/2018 W01 © 2017 Gernot Heiser. Distributed under CC Attribution License

Reducing Trusted Computing Base: Microkernels
• Idea of microkernel:
– Flexible, minimal platform
– Mechanisms, not policies
– Actual OS functionality provided by user-mode servers
– Servers invoked by kernel-provided message-passing mechanism (IPC)
– Goes back to Nucleus [Brinch Hansen’70]
• IPC performance is critical!
[Figure: monolithic system (application in user mode issues syscalls; VFS, IPC, file system, scheduler, virtual memory, device drivers and dispatcher in kernel mode) next to a microkernel system (application, Unix server, file server and device driver in user mode, communicating by IPC; only IPC and virtual memory in kernel mode); both sit on hardware]

Monolithic vs Microkernel OS Evolution
Monolithic OS
• New features add code to the kernel
• New policies add code to the kernel
• Kernel complexity grows
Microkernel OS
• Features add user-mode code
• Policies replace user-mode code
• Kernel complexity is stable
• Adaptable
• Dependable
• Highly optimised
[Figure: same monolithic vs microkernel structure diagram as on the previous slide]

1993 “Microkernel”: IPC Performance
[Figure: IPC cost in µs (axis 0 to 400) against message length in bytes (axis 0 to 6000) on an i486 @ 50 MHz, with curves for Mach, L4 and raw copy. Labelled values: 115 µs and 5 µs. Annotation: “Culprit: cache footprint” [Liedtke’95]]

Microkernel Principle: Minimality
A concept is tolerated inside the microkernel only if moving it outside the kernel, i.e. permitting competing implementations, would prevent the implementation of the system’s required functionality. [SOSP’95]
• Advantages of resulting small kernel:
– Easy to implement, port?
– Easier to optimise
– Hopefully enables a minimal trusted computing base
– Easier to debug, maybe even prove correct?
• Challenges:
– API design: generality despite small code base
– Kernel design and implementation for high performance
Margin notes: limited by arch-specific micro-optimisations; small attack surface, fewer failure modes

Microkernel Evolution
First generation
• Eg Mach [’87] (QNX, Chorus)
• 180 syscalls
• 100 kSLOC
• 100 µs IPC
Second generation
• L4 [’95] (PikeOS, Integrity)
• ~7 syscalls
• ~10 kSLOC
• ~1 µs IPC
Third generation
• seL4 [’09]
• ~3 syscalls
• 9 kSLOC
• 0.1 µs IPC
• capabilities
• design for isolation
[Figure: kernel contents per generation. Scheduling and “IPC, MMU abstr.” appear in all three columns, kernel memory in two. Other labels: memory objects; low-level FS, swapping; devices; memory-management library]

L4: A Family of High-Performance Microkernels
[Figure: L4 family tree on a timeline from 93 to 13, distinguishing API inheritance from code inheritance. Kernels shown: L3 → L4, “X”, Hazelnut, Pistachio, L4/MIPS, L4/Alpha, L4-embed., OKL4 µKernel, OKL4 Microvisor, seL4, Codezero, Fiasco, Fiasco.OC, NOVA, P4 → PikeOS. Groups: GMD/IBM/Karlsruhe, UNSW/NICTA, Dresden, OK Labs, commercial clone. Annotations: first L4 kernel with capabilities; Qualcomm modem chips; iOS security co-processor]

Issues of 2G Microkernels
• L4 solved performance issue [Härtig et al, SOSP’97]
• Left a number of security issues unsolved
• Problem: ad-hoc approach to protection and resource management
– Global thread name space ⇒ covert channels [Shapiro’03]
– Threads as IPC targets ⇒ insufficient encapsulation
– Single kernel memory pool ⇒ DoS attacks
– Insufficient delegation of authority ⇒ limited flexibility, performance issues
– Unprincipled management of time
• Addressed by seL4
– Designed to support safety- and security-critical systems
– Principled time management (MCS branch)

The seL4 Microkernel

Principles
• Single protection mechanism: capabilities
– Now also for time [Lyons et al, EuroSys’18]
• All resource-management policy at user level
– Painful to use
– Need to provide standard memory-management library
o Results in L4-like programming model
• Suitable for formal verification (proof of implementation correctness)
– Attempted since ’70s
– Finally achieved by L4.verified project at NICTA [Klein et al, SOSP’09]

Concepts
• Capabilities (Caps)
– mediate access
• Kernel objects:
– Threads (thread-control blocks: TCBs)
– Address spaces (page table objects: PDs, PTs)
– Endpoints (IPC)
– Notifications
– Capability spaces (CNodes)
– Frames
– Interrupt objects (architecture specific)
– Untyped memory
• System calls
– Send, Wait (and variants)
– Yield

What are (Object) Capabilities?
Cap = Access Token: prima-facie evidence of privilege
[Figure: a cap holds an object reference (to an object, eg. thread, file, …) and access rights (eg. read, write, send, execute, …)]
Cap typically in kernel to protect from forgery
➢ user references cap through handle
• OO API: err = method( cap, args );
• Used in some earlier microkernels:
– KeyKOS [’85], Mach [’87], EROS [’99]
Note: Linux “capabilities” are different

seL4 Capabilities
• Stored in cap space (CSpace)
– Kernel object made up of CNodes
– each an array of cap “slots”
• Inaccessible to userland
– But referred to by pointers into CSpace (slot addresses)
– These CSpace addresses are called CPTRs
• Caps convey specific privilege (access rights)
– Read, Write, Execute, Grant (cap transfer)
• Main operations on caps:
– Invoke: perform operation on object referred to by cap
o Possible operations depend on object type
– Copy/Mint/Grant: create copy of cap with same/lesser privilege
– Move/Mutate: transfer to different address with same/lesser privilege
– Delete: invalidate slot (cleans up object if this is the only cap to it)
– Revoke: delete any derived (eg. copied or minted) caps
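
The Copy/Mint and Revoke operations listed above can be illustrated with a toy model. The sketch below is plain Python, not the seL4 API: the names Cap, mint, copy, revoke and invoke, the object label "frame0" and the right names are all assumptions made up for illustration, and the real kernel tracks derivation inside CSpace rather than in user-visible objects. It only shows the two properties the slide states: a derived cap never carries more rights than its source, and revoking a cap invalidates everything derived from it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare caps by identity, not by field values
class Cap:
    obj: str                           # label of the object the cap refers to
    rights: frozenset                  # access rights conveyed by this cap
    parent: Optional["Cap"] = None     # cap this one was derived from
    children: List["Cap"] = field(default_factory=list)
    valid: bool = True                 # False once the cap has been revoked


def mint(src: Cap, rights) -> Cap:
    """Derive a cap to the same object with the same or fewer rights."""
    rights = frozenset(rights)
    if not src.valid:
        raise ValueError("source cap is no longer valid")
    if not rights <= src.rights:
        raise PermissionError("a derived cap cannot gain rights")
    child = Cap(src.obj, rights, parent=src)
    src.children.append(child)
    return child


def copy(src: Cap) -> Cap:
    """Copy is modelled here as a mint that keeps all rights."""
    return mint(src, src.rights)


def revoke(cap: Cap) -> None:
    """Invalidate every cap derived from `cap`; `cap` itself stays valid."""
    for child in cap.children:
        revoke(child)
        child.valid = False
    cap.children.clear()


def invoke(cap: Cap, right: str) -> bool:
    """Return True if `cap` is valid and conveys `right`."""
    return cap.valid and right in cap.rights


root = Cap("frame0", frozenset({"read", "write", "grant"}))
read_only = mint(root, {"read"})      # lesser privilege than root
read_only_copy = copy(read_only)      # same privilege as its source

print(invoke(read_only, "read"), invoke(read_only, "write"))
revoke(root)                          # invalidates both derived caps
print(invoke(root, "write"), invoke(read_only_copy, "read"))
```

In this model the first line printed reports that the minted cap can read but not write, and the second that the root cap still works after the revoke while the copy of the minted cap does not. Delete and Move/Mutate are left out to keep the sketch short.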

Inter-Process Communication (IPC)
• Fundamental microkernel operation
– Kernel provides no services, only mechanisms
– OS services provided by (protected) user-level server processes
– invoked by IPC
• seL4 IPC uses a handshake through endpoints:
– Transfer points without storage capacity
– Message must be transferred instantly
o Single-copy user ➞ user by kernel
[Figure: client sends and server receives through seL4 IPC]

IPC: Endpoints
[Figure: Thread1 and Thread2 alternate between running and blocked while exchanging messages with Send(ep1_cap, …) / Recv(ep1_cap, …) and Send(ep2_cap, …) / Recv(ep2_cap, …)]
• Threads must rendezvous for message transfer
– One side blocks until the other is ready
– Implicit synchronisation
• Message copied from sender’s to receiver’s message registers
– Message is combination of caps and data words
o Presently max 121 words (484 B, incl. message “tag”)
o Should never use anywhere near that much!

Endpoints are Message Queues
• EP has no sense of direction
• May queue senders or receivers
– never both at the same time!
• Communication needs 2 EPs!
[Figure: Server, Client1 and Client2 above the kernel, which holds an EP with TCB1 and TCB2 queued on it. Annotations: first invocation queues caller; further callers of same direction queue behind]

Client-Server Communication
• Asymmetric relationship:
– Server widely accessible, clients not
– How can server reply back to client (distinguish between them)?
• Client can pass (session) reply cap in first request
– server needs to maintain session state
– forces stateful server design
• seL4 solution: Kernel provides single-use reply cap
– only for Call operation (Send+Recv)
– allows server to reply to client
– cannot be copied/minted/re-used but can be moved
– one-shot (automatically destroyed after first use)
[Figure: Server with Client1 and Client2]

Call RPC Semantics
[Figure: sequence between client, kernel and server. Client does Call(ep,…) while server waits in Recv(ep,&rep); kernel mints rep and delivers to server; server processes and does Send(rep,…); kernel delivers to client and destroys rep; client continues processing]

Stateful Servers: Identifying Clients
Stateful server serving multiple clients
• Must respond to correct client
– Ensured by reply cap
• Must associate request with correct state
• Could use separate EP per client
– endpoints are lightweight (16 B)
– but requires mechanism to wait on a set of
[Figure: Client1 and Client2 send messages to a server holding Client1 state and Client2 state]