K42
CS380L: Mike Dahlin
October 13, 2003

1 Preliminaries

1.1 Review

1.2 Outline
• Overview
• Scalability
  – Clustered objects
  – Existence locks
  – IPC
• Discussion and open issues

1.3 Preview

2 K42 overview

Takes lots of well-received research ideas from last 10+ years and builds an OS with them...
• Exokernel/scheduler activations
  – User-level libraries approach
• Mach-like virtual memory
• Hydra, Mach, L4 microkernel
• Emulate old OS with library + server (Hydra, Mach, exokernel, ...)
• OS-kit device drivers (internal API)
• Device driver virtualization (Disco?)
• Local RPC (LRPC) for interprocess communication
  – LRPC + clustered objects similar to RPC + {default stub, caching, ...}
• ...
• Key ideas in design
  – Structure (from intro)
    ∗ Modular, object oriented
    ∗ Avoid centralized code paths, global data structures, global locks
    ∗ Move system functionality from kernel to server processes and application libraries
  – Claimed benefits
    ∗ Easy prototyping of new OS technologies
    ∗ Scale up (to large multiprocessors and large applications), scale down (to small multiprocessors), support small-scale applications
    ∗ Backwards compatible with Linux but easy to add specialized components
    ∗ Customizable: allow applications to determine how OS manages their resources
    ∗ Flexible: support wide range of applications and problem domains, support wide range of hardware, scale down to embedded HW and up to future enterprise server
• These ideas (largely) taken from past systems (and adapted)
  – Kernel: memory management, process management, IPC, base scheduling
  – User-level libraries for abstractions (Linux API/ABI) (a la exokernel)
    ∗ Customizable
    ∗ Overhead is reduced – avoid crossing address space boundaries (e.g., setting timers that will be cleared before they go off)
    ∗ Space and time are consumed by calling application, not by the kernel or servers
  – Implement key functionality as user-level servers (a la microkernel)
    ∗ Why is “Linux file server” a separate process rather than a
      library?
  – Memory management v. Mach
    ∗ Compare K42 memory management to Mach

3 Admin

Next week:
• Monday 10/20: Guest lecture – Emmett Witchel: ACES 2.402 3:30-4:30
• Wednesday 10/22: Midterm – in class
  – Bring one 8.5x11 inch page of notes (font ≥ 8pt)

4 K42 Scalability

4.1 Clustered objects
• Multiprocessor is like a distributed system
  – Use RPC for communication
    ∗ LRPC – local remote procedure call (Anderson)
  – RPC a good match for interprocess communication (fewer problems than remote RPC – performance, reliability, ...)
  – Optimize RPC for scalability (when needed)
    ∗ Default stub (usually)
    ∗ Cache/replicate (when needed) (see NFS, Globe)
  – Mechanism:
    ∗ Object reference – pointer to pointer to object representative
    ∗ Object representative – interface (global, shared, local) to global object (cache, replicate, or not)
      · Global pointer to same virtual address but (possibly) different physical address on different processors
      · Fill local table on demand with trap handler
      · e.g., to create a new clustered object, bump a pointer in per-process table and use next entry as object reference; set it to NULL on each processor. Then, when a processor first references the object, go to global trap handler, see that it is a clustered object reference, look at global table to see which clustered object and call its handler. It can then decide whether to instantiate a pointer to single global object representative, to object representative shared by 4 processors, or to create new object representative

4.2 Existence locks
• Hairy problem in OO programming + concurrency – how to cleanly shut an object down
  – Need to wait until no one is using the object
  – Can’t put lock inside object (b/c deallocation is a race between deallocators and threads contending for that lock; e.g., what to do about waiting threads on that lock...)
  – Putting “existence lock” for object outside of object is complex
    ∗ Somewhat non-modular
    ∗ Need existence lock for existence lock, so on to global lock
    ∗ “Tends to encourage holding of locks for a long time”
• Solution: semi-automatic garbage collection – 3 phases
  1. Remove all persistent references (e.g., references in global data structures)
    ∗ “just as you normally would when removing object”
    ∗ Now, what about threads that have references on their stacks?
  2. Uniprocessor: Wait until zero threads active in kernel
    ∗ Then, no stack can have reference to object → garbage collect it
    ∗ Not provably live, but intuition is: “system server calls tend to be short...we do not believe this to be a problem in practice”
    ∗ Future work: keep several generations of “active counts”
  3. Multiprocessor: Hand token to each processor and repeat step 2 for all processors

4.3 IPC
• Interprocess communication: Must be fast (servers in different processes)
• Optimize common case: same processor – process trap, kernel reti to callee
  – Callee receives caller’s registers, protected authenticator, (optional) shared memory page
  – Cheaper than context switch: no save registers, no scheduling
  – Maintains locality

5 Discussion and open issues

What were contributions of
• Multics
• THE
• Unix
• Disco
• Exokernel
• Mesa/Threads/Monitors
• Scheduling/Scheduler activations
• RPC
• Active messages
• Virtual memory
• K42

What are the solved problems in OS?
What are the open problems (or at least not addressed so far) in OS?
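
Sketch for 4.1 (clustered objects). The lazy fill of per-processor tables may be easier to follow as a small model. The Python below is a toy illustration, not K42 code: the names System, ClusteredCounter, Rep, handle_miss and cpus_per_rep are all made up for this example, the "trap" is just a check for None, and a single system-wide table of roots stands in for the per-process table mentioned in the notes.

```python
from typing import Dict, List, Optional

NUM_CPUS = 8  # hypothetical machine size, chosen only for the example


class Rep:
    """Object representative: the part of a clustered object a processor calls into."""

    def __init__(self) -> None:
        self.count = 0

    def inc(self) -> None:
        self.count += 1


class ClusteredCounter:
    """Root of one clustered object; its miss handler picks the rep for each processor."""

    def __init__(self, cpus_per_rep: int) -> None:
        self.cpus_per_rep = cpus_per_rep   # 1 = rep per processor, NUM_CPUS = one global rep
        self.reps: Dict[int, Rep] = {}     # one rep per group of processors

    def handle_miss(self, cpu: int) -> Rep:
        group = cpu // self.cpus_per_rep
        if group not in self.reps:         # first processor of the group creates the rep
            self.reps[group] = Rep()
        return self.reps[group]

    def total(self) -> int:
        # A global read has to combine the values held by every rep.
        return sum(rep.count for rep in self.reps.values())


class System:
    def __init__(self, num_cpus: int) -> None:
        self.global_table: List[ClusteredCounter] = []
        # One local table per processor; entries start out empty (None).
        self.local_tables: List[List[Optional[Rep]]] = [[] for _ in range(num_cpus)]

    def create(self, root: ClusteredCounter) -> int:
        ref = len(self.global_table)       # "bump a pointer": take the next free slot
        self.global_table.append(root)
        for table in self.local_tables:
            table.append(None)             # NULL on each processor
        return ref

    def deref(self, cpu: int, ref: int) -> Rep:
        rep = self.local_tables[cpu][ref]
        if rep is None:                    # stands in for the trap on first reference
            rep = self.global_table[ref].handle_miss(cpu)
            self.local_tables[cpu][ref] = rep
        return rep


system = System(NUM_CPUS)
shared = system.create(ClusteredCounter(cpus_per_rep=NUM_CPUS))  # single global rep
clustered = system.create(ClusteredCounter(cpus_per_rep=4))      # rep shared by 4 processors
private = system.create(ClusteredCounter(cpus_per_rep=1))        # new rep per processor

for cpu in range(NUM_CPUS):
    for ref in (shared, clustered, private):
        system.deref(cpu, ref).inc()
```

All three objects are used through the same deref call; only the miss handler's policy differs, which is the sense in which the caller sees a default stub while the object decides whether to share or replicate.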
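
Sketch for 4.2 (existence locks). The three garbage-collection phases can be modeled as a single-threaded simulation. This is only a sketch under simplifying assumptions, not the K42 mechanism itself: Collector, Processor, retire and poll are hypothetical names, threads are assumed never to migrate between processors, and a single active count per processor is used (the notes list several generations of counts as future work).

```python
from typing import Dict, List


class Processor:
    def __init__(self) -> None:
        self.active = 0  # threads currently running inside the server/kernel


class Collector:
    def __init__(self, num_cpus: int) -> None:
        self.cpus = [Processor() for _ in range(num_cpus)]
        self.registry: Dict[str, object] = {}  # global structure holding persistent references
        self.pending: List[List] = []          # [name, next processor the token must visit]
        self.freed: List[str] = []

    def enter(self, cpu: int) -> None:
        self.cpus[cpu].active += 1

    def exit(self, cpu: int) -> None:
        self.cpus[cpu].active -= 1

    def retire(self, name: str) -> None:
        # Phase 1: remove the persistent reference so no new thread can find the object.
        del self.registry[name]
        self.pending.append([name, 0])

    def poll(self) -> None:
        # Phases 2 and 3: the token moves past a processor once that processor has
        # been seen with zero active threads; after the last one, the object is freed.
        still_pending = []
        for name, cpu in self.pending:
            while cpu < len(self.cpus) and self.cpus[cpu].active == 0:
                cpu += 1
            if cpu == len(self.cpus):
                self.freed.append(name)
            else:
                still_pending.append([name, cpu])
        self.pending = still_pending


collector = Collector(num_cpus=2)
collector.registry["file_42"] = object()
collector.enter(1)                  # a thread on CPU 1 may hold a reference on its stack
collector.retire("file_42")
collector.poll()
freed_while_busy = list(collector.freed)
collector.exit(1)                   # that thread leaves
collector.poll()
```

The object is not freed by the first poll because CPU 1 still has an active thread; it is freed by the second. The sketch also shows the liveness caveat from the notes: if some processor never reaches zero active threads, the token never advances.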