Acknowledgments Preface 1 Overview I MICRO ERNEL ABSTRACTIONS

Coyotos Microkernel Specification http://coyotos.org/docs/ukernel/spec.html Coyotos Microkernel Specification Version 0.3+ March 20, 2006 Jonathan S. Shapiro, Ph.D., Eric Northup, M. Scott Doerrie, Swaroop Sridhar Systems Research Laboratory Dept. of Computer Science Johns Hopkins University Neal H. Walfield, Marcus Brinkmann GNU Hurd Project Copyright 2006, Jonathan S. Shapiro. Verbatim copies of this document may be duplicated or distributed in print or electronic form for non-commercial purposes. Legal Notice THIS SPECIFICATION IS PROVIDED ``AS IS'' WITHOUT ANY WARRANTIES, INCLUDING ANY WARRANTY OF MERCHANTABILITY, NON-INFRINGEMENT, FITNESS FOR ANY PARTICULAR PURPOSE, OR ANY WARRANTY OTHERWISE ARISING OF ANY PROPOSAL, SPECIFICATION OR SAMPLE. Table of Contents Acknowledgments Preface The Original Plan Overrun by the Hurd Coyotos: Take II A Brief Word on Terminology 1 Overview 1.1 Microkernel Objects 1.2 Sender Capabilities 1.3 Activations 1.4 Messages 1.5 Naming and Invocation 1.6 Exception and Interrupt Handling 1.7 Protection Model I MICROKERNEL ABSTRACTIONS 2 Capabilities 2.1 Representation 2.1.1 Capabilities to Memory Objects 2.1.2 Message-Related Capabilities 2.1.3 Capabilities to Processes 2.1.4 Miscellaneous Capabilities 2.2 Validity 2.3 Extensibility 3 Processes 3.1 State of a Process 3.2 FCRBs 3.3 Execution Model 3.3.1 Event Arrival and Run-In 3.3.2 Preemption 3.3.3 Exception Handling 1 of 38 04/10/06 11:19 Coyotos Microkernel Specification http://coyotos.org/docs/ukernel/spec.html 3.4 Event Delivery 4 Address Spaces 4.1 Memory Objects and Address Interpretation 4.1.1 Permissions 4.1.2 References and Access Violations 4.2 Pages 4.3 Address Space Composition 4.3.1 Translation Algorithm 4.3.2 Exception Handling 4.3.3 Cycle Detection 4.4 Address Space Splitting 5 Capability Invocation and Messages 5.1 The Invocation Block 5.1.1 Control Word 5.1.2 Message Send Block 5.1.3 Meaning of capitem_t 5.1.4 Other Parameters 5.2 The Receive Block 5.2.1 Message Receive Block 6 Sender's Capabilities 6.1 FCRB Sender's Capability 6.2 Multithreaded Servers 6.3 Selective Revocation 6.4 Rationale 6.4.1 Payload Matching 6.4.2 Idempotent Messages 6.4.3 Stateful Messages 6.5 Protected Payloads Under Composition 7 Schedules 7.1 Scheduling Model 8 Other Kernel Objects II MICROKERNEL INTERFACES coyotos.cap Constants Exceptions Operations coyotos.memory Enumerations Operations coyotos.page Operations coyotos.cappage Operations coyotos.ogpt Enumerations Operations coyotos.gpt Constants Operations 2 of 38 04/10/06 11:19 Coyotos Microkernel Specification http://coyotos.org/docs/ukernel/spec.html coyotos.window Operations coyotos.process Enumerations Structures Operations coyotos.fcrb Operations coyotos.rcvqueue Operations coyotos.wrapper Operations coyotos.fault Enumerations Operations coyotos.null coyotos.keybits Constants Structures Operations coyotos.discrim Enumerations Operations coyotos.irqctl Operations coyotos.schedctl coyotos.ioperm coyotos.sleep Operations coyotos.checkpoint Exceptions Operations coyotos.obstore A Change History A.1 Changes in version 0.1 A.2 Changes in version 0.2 Acknowledgments Many people have assisted us in evaluating and advancing this design: Norm Hardy, Charlie Landau, and Bill Frantz of the KeyKOS project. Charlie also runs the CapROS project, another successor to the EROS system. The members of the coyotos-dev mailing list, notably Bas Wijnen and Tom Bachmann, 3 of 38 04/10/06 11:19 Coyotos Microkernel Specification http://coyotos.org/docs/ukernel/spec.html Christopher Nelson, and Dominique Quatravaux. The members of the L4 community, notably Hermann Härtig, Espen Skoglund, and Kevin Elphinstone. There are surely others that we will come to name as the design stabilizes further, and some that we will inadvertently omit. To the last, please accept our apologies. As is customary, any flaw remaining in this specification is ours. Comments and suggestions concerning this specification are welcome. They should be sent to the coyotos-dev electronic mailing list. In order to send, you must be subscribed to the list. The subscription interface may be found at: http://www.coyotos.org/mailman/listinfo/coyotos-dev. In order to keep the mail archives readable, we ask that you send only ``plain text'' emails. Preface Coyotos is a security microkernel. It is a microkernel in the sense that it is a minimal protected platform on which a complete operating system can be constructed. It is a security microkernel in the sense that it is a minimal protected platform on which higher level security policies can be constructed. The Original Plan As originally conceived, Coyotos was intended to be a relatively minor departure from its predecessor, EROS [8]. EROS [5] was a small, robust microkernel whose central design ideas were pervasive use of capabilities [13] as the fundamental access model, an atomic, blocking capability invocation (therefore atomic and blocking IPC) model , and a persistent single-level store [6]. All of these features were inherited with some revision from the KeyKOS system. [2] Early application-level work on EROS, notably the defensible network system [12] and the secure window system [10] revealed areas where the EROS architecture would clearly benefit from refinement, but did not initially suggest fundamental shortcomings in the architecture. Coyotos was to have been that minor refinement, incorporating a new IPC primitive called ``endpoints'' and a revised memory mapping entity called a PATT. Our main goals were cleanup, consistency, and formalization. For algorithmic reasons, the PATT idea did not survive into the current specification, and has been replaced by guarded page tables [ 4 ,1 ]. Though they were independently invented, guarded page tables may be seen as a generalization of the level skipping techniques of the KeyKOS translation mechanism or the Motorola MC68851 memory management unit [24]. The variant of guarded page tables incorporated here are modified to incorporate the fault handler and background space mechanisms of KeyKOS and EROS. In January 2004, a summit meeting of sorts occurred between the several research groups working on L4 derivatives and Shapiro. The L4 Dresden group, in particular, wanted to get a better understanding of capability-based design and kernel mechanisms, with the intent that these would be adapted into the L4 architecture [11]. The new kernel architecture would come to be known as ``L4.sec''. There was some discussion of merging the two kernels, but no agreement could be reached on the future of L4's map and unmap operation. While the failure to merge the architectures was a disappointment, the idea that there would be a controlled experiment that would allow us to directly evaluate the map/unmap approach againts the EROS node approach was a promising result in its own right. Overrun by the Hurd Events intervened in the form of Neal Walfield and Marcus Brinkmann, the current architects of the GNU Hurd system. The Hurd is a protected, object-based operating system that was initially constructed on top of the Mach microkernel. Mach has a variety of problems that have been thoroughly documented in the research literature. Of particular importance to Hurd are a lack of resource accounting mechanisms and poor performance. As a result of these issues, the Hurd project had provisionally decided to move to L4. Unfortunately, modeling copyable, protected object references using L4's map/grant operations proved unexpectedly challenging. This left the Hurd project temporarily disrupted, leading Brinkmann and Walfield to seek more information about capability-based design. An extended discussion between Shapiro, Walfield, and Brinkmann at the 2005 Libre Software Meeting about capability systems in general and the plans for Coyotos ensued. As more information about the L4.sec design emerged [23], it became clear that copyable protected references might be problematic on the L4.sec interface as well. Walfield and Brinkmann travelled to Baltimore for a month-long set of design discussions in January 2006, leading to the current design for Coyotos. The January discussions exposed a fundamental problem in the originally intended Coyotos interface: 4 of 38 04/10/06 11:19 Coyotos Microkernel Specification http://coyotos.org/docs/ukernel/spec.html kernel threads are not cheap. Over the last 15 years, it has become a tenet of faith among microkernel designers that blocking and unbuffered communication mechanisms are good. Perhaps the most persuasive argument supporting this position was offered by Liedtke in Improving IPC by Kernel Design [14]. The essential argument of this paper is that buffering carries additional copying costs, and that non-synchronizing IPC carries additional context switch overheads, the paper argues that the correct way to eliminate these expenses is by eliminating both features in favor of a blocking and unbuffered communication model. When an application must block on multiple communication channels, it should use multiple kernel threads. Our own research results [15] along with those of Ford et al. [16] and others seemed to confirm the performance argument, and unbuffered blocking IPC systems became more or less the accepted design wisdom in the microkernel world. Looking back, it now seems likely that none of us adeqautely considered the issue of space or thought hard enough about application-level complexity. The space concern is a blatant failing in blocking designs. Suppose there is a server that wishes to wait simultaneously for activity on one of 1024 network connections. In a blocking IPC design you need a thread to wait on each of these connections. The register state of a modern Pentium-family processor occupies nearly four kilobytes, and there is additional state needed within the operating system. Estimate it at two pages, or 8 kilobytes. Ignoring their address space, runtime storage requirements, or any other space that may be needed, the mere existence of these 1024 threads requires about 8 megabytes.

Acknowledgments Preface 1 Overview I MICRO ERNEL ABSTRACTIONS

CHERI Instruction-Set Architecture

Research Purpose Operating Systems – a Wide Survey

Os Verification - a Survey As a Source of Future Challenges

A Look at the EROS Operating System Jonathan S

Analysing the Security Properties of Object-Capability Patterns

Protection Nov. 14, 2008 15-410

CHERI Instruction-Set Architecture

Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture (Version 6)

A Survey of Distributed Capability File Systems and Their Application to Cloud Environments

Navigating the Attack Surface

CHERI Instruction-Set Architecture

Capsicum: Practical Capabilities for UNIX