The Politics of Cultural Programming in Public Spaces

Interface and Execution Models in the Fluke Kernel Bryan Ford Mike Hibler Jay Lepreau Roland McGrath Patrick Tullmann Department o f Computer Science University of Utah Technical Report UUCS-98-013 August, 1998 Abstract of magnitude, average latency varies by a factor of six, and memory use favors the interrupt model as expected, but not We have defined and implemented a new kernel API that by a large amount. We find that the overhead for restarting makes every exported operation either fully interruptible the most costly kernel operation ranges from 2-8%. and restartable, thereby appearing atomic to the user. To achieve interruptibility, all possible states in which a thread may become blocked for a “long” time are completely rep 1 Introduction resentable as valid kernel API calls, without needing to re tain any kernel internal state. This paper attempts to bring to light an important and useful control-flow property of OS kernel interface seman This API provides important functionality. Since all ker tics that has been neglected in prevailing systems, and to nel operations appear atomic, services such as transparent distinguish this interface property from the control-flow checkpointing and process migration that need access to properties of an OS kernel implementation. An essential the complete and consistent state of a process can be im issue of operating system design and implementation is plemented by ordinary user-mode processes. Atomic op when and how one thread can block and relinquish con erations also enable applications to provide reliability in a trol to another, and how the state of a thread suspended more straightforward manner. by blocking or preemption is represented in the system. This API also allows novel kernel implementation tech This crucially affects both the kernel interface that repre niques and evaluation of existing techniques, which we ex sents these states to user code, and the fundamental inter plore in this paper. Our new kernel’s single source im nal organization of the kernel implementation. A central plements either the “process” or the “interrupt” execution aspect of this internal structure is the execution model in model on both uni- and multiprocessors, depending only which the kernel handles processor traps, hardware inter on a configuration option affecting a small amount of code. rupts, and system calls. In the process model, which is Our kernel structure avoids the major complexities of tra used by traditional monolithic kernels such as BSD, Linux, ditional implementations of the interrupt model, neither re and Windows NT, each thread of control in the system has quiring ad hoc saving of state, nor limiting the operations its own kernel stack. In the interrupt model, used by sys (such as demand-paged memory) that can be handled by tems such as V [7], QNX [14], and Aegis [12], the ker the kernel. Finally, our interrupt model configuration can nel uses only one kernel stack per processor—for typi support the process model for selected components, with cal uniprocessor kernels, just one kernel stack, period. A the attendant flexibility benefits. thread in a process-model kernel retains its kernel stack We report preliminary measurements comparing fully, state when it sleeps, whereas in an interrupt-model kernel partially and non-preemptible configurations of both pro threads must manually save any important kernel state be cess and interrupt model implementations. We find that fore sleeping. This saved kernel state is often known as a the interrupt model has a modest speed edge in some continuation [10], since it allows the thread to “continue” benchmarks, maximum latency varies nearly three orders where it left off. In this paper we draw attention to the distinction be This research was supported in part by the Defense Advanced Re search Projects Agency, monitored by the Department of the Army under tween an interrupt-model kernel implementation, which is contract number DABT63-94-C-0058, and the Air Force Research Lab a kernel that uses only one kernel stack per processor by oratory, Rome Research Site, USAF, under agreement number F30602- manually saving implicit kernel state for sleeping threads, 96-2-0269. and an “atomic” kernel API, which is an API designed so Contact information: [email protected]. Dept, of Computer Sci ence, 50 S. Central Campus Drive, Rm. 3190, University of Utah, SLC, that sleeping threads need no such implicit kernel state at UT 84112-9205. http://www.cs.utah.edu/projects/flux/. all. These two kernel properties are related but fall on or- Execution Model interrupt-model kernel by eliminating the need to store im Interrupt Process plicit state in continuations. In addition, our kernel supports both internal execution * Fluke Fluk* • ITS models through a build-time configuration option affect (inierrupt-modcl) (proccss-modcl) . ing only a small fraction of the source enabling the first oB < “apples-to-apples” comparison between them. Our kernel (O v v demonstrates that the two models are not as fundamentally -a (original) (C arter) Mach Mach o different as they have been considered to be in the past; (Dravcs) (original) • • however, they each have strengths and weaknesses. Some Oh processor architectures have an inherent bias towards the < process model— e.g., a 5-10% kernel entry/exit perfor C mance difference on the x86. It is also easier to make o BSD • process-model kernels fully preemptible. Full preemptibil- ity comes at a cost, but this cost is associated with pre- emptibility, not with the process model itself. Process- model kernels tend to use more per-thread kernel mem Figure 1: The kernel implementation and API model continuums. ory, but this is a problem in practice only if the kernel V was originally a pure interrupt-model kernel but was later modified is liberal in its use of stack space and thus requires large to be partly process-model; Mach was a pure process-model kernel later modified to be partly interrupt-model. kernel stacks, or if the system uses a very large number of threads. Thus, we show that although an atomic API is highly beneficial, the kernel’s internal execution model thogonal dimensions, as illustrated in Figure 1. In a purely is less important: the interrupt-based organization has a atomic API, all possible states in which a thread may sleep slight size advantage, whereas the process-based organiza for a noticeable amount of time are cleanly visible and ex tion has somewhat more flexibility. portable to user mode. For example, the state of a thread Finally, contrary to conventional wisdom, our kernel involved in any system call is always well-defined, com demonstrates that it is practical to use legacy process- plete, and immediately available for examination or modi model code even within interrupt-model kernels and even fication by other threads; this is true even if the system call on architectures such as the x86 that make it difficult. The is long-running and consists of many stages. In general, key is to run the legacy code in user mode but in the ker this means that all system calls and exception handling nel’s address space. mechanisms must be cleanly interruptible and restartable, Our key contributions in this work are: in the same way that the instruction sets of modern proces sor architectures are cleanly interruptible and restartable. • To present a kernel supporting a pure atomic API and For purposes of readability, in the rest of this paper we will demonstrate the advantages and drawbacks of this ap refer to API’s with these properties as “atomic,” as well as proach. the properties themselves. We have developed a new kernel called Fluke which ex • To explore the relationship between an “atomic API” ports a purely atomic API. This API allows the complete and the kernel’s execution model. state of any user-mode thread to be examined and modi fied by other user-mode threads without being arbitrarily • To present the first “apples-to-apples” comparison be delayed. In unpublished work that we were not aware of tween the two kernel implementation models using a until very recently, the MIT ITS [11] system implemented kernel that supports both, revealing that the models an API with a similar property, some 30 years ago. Sev are not as different as commonly believed. eral other systems came close to providing this property • To show that it is practical to use process-model but still had a few situations in which thread state was legacy code in an interrupt-model kernel, and to not always extractable. Supporting a purely atomic API present several techniques for doing so. slightly widens the kernel interface due to the need to ex port additional state information. However, such an API nested transactions, but yield significant benefits, (ii) It is well known provides the atomicity property that gives important ro that providing atomicity at lower layers allows higher layers to be written bustness advantages, making it easier to build fault-tolerant more simply, (iii) ITS exploited an atomic API for a number of prop erties; it particularly made it easy to write user-mode schedulers, as one systems [24].' It also simplifies implementation of a pure could set the state of a thread at any time. [4, 3] (iv) Large telecomm ap plications use an “auditor,” a daemon that periodically wakes up and test 'Some examples of this are (i) This property is similar to “chained each of the critical data structures in the system to see if it does not vio transactions” [Gray93] which allow a single transaction to progress late some assertions. For critical OS’es one could think of applying the through intermediate stages while building up state, but is able to roll same concept to the kernel, (v) One could both debug and detect deadlock back only to the last stage.

Load more