<<

Aversion of this pap er app eared in the Pro ceedings of the Fifteenth Symp osium on Op erating Systems Principles

Extensibility Safety and Performance in the

SPIN Op erating System

Brian N Bershad Stefan Savage PrzemyslawPardyak Emin Gun Sirer

Marc E Fiuczynski David Becker Craig Chambers Susan Eggers

Department of Science and Engineering

UniversityofWashington

Seattle WA

paging algorithms found in mo dern op erating systems

Abstract

can b e inappropriate for database applications result

ing in p o or p erformance Stonebraker General pur

This pap er describ es the motivation architecture and

p ose network proto col implementations are frequently

p erformance of SPIN an extensible op erating system

inadequate for supp orting the demands of high p erfor

SPIN provides an extension infrastructure together

mance parallel applications von Eicken et al Other

with a core set of extensible services that allow applica

applications suchasmultimedia clients and servers and

tions to safely change the op erating systems interface

realtime and fault tolerant programs can also present

and implementation Extensions allow an application to

demands that p o orly match op erating system services

sp ecialize the underlying op erating system in order to

Using SPIN an application can extend the op erating

achieve a particular level of p erformance and function

systems interfaces and implementations to provide a

ality SPIN uses language and linktime mechanisms to

b etter matchbetween the needs of the application and

inexp ensively exp ort negrained interfaces to op erat

the p erformance and functional characteristics of the

ing system services Extensions are written in a typ e

system

safe language and are dynamically linked into the op

erating system kernel This approach oers extensions

rapid access to system services while protecting the op

Goals and approach

erating system co de executing within the kernel address

space SPIN and its extensions are written in Mo dula

The goal of our research is to build a general purp ose

and run on DEC Alpha

op erating system that provides extensibility safety and

go o p erformance Extensibilit y is determined bythe

interfaces to services and resources that are exp orted

Intro duction

to applications it dep ends on an infrastructure that

allows negrained access to system services Safetyde

SPIN is an op erating system that can b e dynamically

termines the exp osure of applications to the actions of

sp ecialized to safely meet the p erformance and function

others and requires that access b e controlled at the

ality requirements of applications SPIN is motivated

same granularity at which extensions are dened Fi

by the need to supp ort applications that presentde

nally go o d p erformance requires lowoverhead commu

mands p o orly matched by an op erating systems imple

nication b etween an extension and the system

mentation or interface A p o orly matched implementa

The design of SPIN reects our view that an op erat tion prevents an application from working well while a

p o orly matched interface prevents it from working at all

ing system can b e extensible safe and fast through the

use of language and runtime services that provide low For example the implementations of disk buering and

cost negrained protected access to op erating system

This researchwas sp onsored by the Advanced Research

resources Sp ecicallythe SPIN op erating system re

Pro jects Agency the National Science Foundation Grants no

CDA and CCR and by an equipmentgrant liesonfourtechniques implemented at the level of the

from Digital Equipment Corp oration Bershad was partially sup

language or its runtime

p orted by a National Science Foundation Presidential FacultyFel

lowship Chamb ers was partially sp onsored by a National Science

Foundation Presidential Young Investigator Award Sirer was

Colocation Op erating system extensions are dy

supp orted by an IBM Graduate StudentFellowship Fiuczynski

namically linked into the kernel virtual address

was partially supp orted by a National Science Foundation GEE

space Colo cation enables communication b etween

Fellowship

system and extension co de to havelow cost

to system services is written in the systems safe ex Enforcedmodularity Extensions are written in

tension language For example wehaveused SPIN to Mo dula Nelson a mo dular programming lan

implement a op erating system server The bulk guage for which the compiler enforces interface

of the server is written in C and executes within its own b oundaries b etween mo dules Extensions which

address space as do applications The server consists execute in the kernels can

of a large b o dy of co de that implements the DEC OSF not access memory or execute privileged instruc

SPIN ex interface and a small number of tions unless they have b een given explicit access

tensions that provide the and through an interface Mo dularity enforced bythe

device interfaces required by the server compiler enables mo dules to b e isolated from one

another with low cost

Wehave also used extensions to sp ecialize SPIN to

the needs of individual application programs For ex

Logical protection domains Extensions exist

ample wehave built a clientserver video system that

within logical protection domains which are ker

requires few control and transfers as images move

nel namespaces that contain co de and exp orted in

from the servers disk to the clients screen Using SPIN

terfaces Interfaces which are languagelevel units

the server denes an extension that implements a direct

represent views on system resources that are pro

stream b etween the disk and the network The client

tected by the op erating system An inkernel dy

viewer application installs an extension into the kernel

namic linker resolves co de in separate logical pro

that decompresses incoming network video packets and

tection domains at runtime enabling crossdomain

displays them to the video frame buer

communication to o ccur with the overhead of a pro

cedure call

The rest of this pap er

Dynamic cal l binding Extensions execute in re

The rest of this pap er describ es the motivation design

sp onse to system events An event can describ e

and p erformance of SPIN In the section we moti

any p otential action in the system such as a virtual

vate the need for extensible op erating systems and dis

memory page fault or the scheduling of a thread

cuss related work In Section we describ e the sys

Events are declared within interfaces and can b e

tems architecture in terms of its protection and exten

dispatched with the overhead of a pro cedure call

sion facilities In Section we describ e the core services

provided by the system In Section we discuss the

Colo cation enforced mo dularity logical protection

systems p erformance and compare it against that of

domains and dynamic call binding enable interfaces to

several other op erating systems In Section we discuss

b e dened and safely accessed with lowoverhead How

our exp eriences writing an op erating system in Mo dula

ever these techniques do not guarantee the systems ex

Finally in Section we present our conclusions

tensibility Ultimately extensibilityisachieved through

the system service interfaces themselves which dene

the set of resources and op erations that are exp orted

to applications SPIN provides a set of interfaces to

Motivation

core system services such as memory management and

scheduling that rely on colo cation to eciently exp ort

Most op erating systems are forced to balance gener

negrained op erations enforced mo dularity and logical

ality and sp ecialization A general system runs many

protection domains to manage protection and dynamic

programs but may run few well In contrast a sp e

call binding to dene relationships b etween system com

cialized system may run few programs but runs them

p onents and extensions at runtime

all well In practice most general systems can with

some eort b e sp ecialized to address the p erformance

and functional requiremen ts of a particular applications

System overview

needs suchasinterpro cess communication synchro

The SPIN op erating system consists of a set of extension

nization thread management networking virtual mem

services and core system services that execute within the

ory and managementDraves et al Bershad

kernels virtual address space Extensions can b e loaded

et al b Sto dolsky et al Bershad Yuhara

into the kernel at any time Once loaded they integrate

et al Maeda Bershad Felten Young

themselves into the existing infrastructure and provide

et al Harty Cheriton McNamee Armstrong

system services sp ecic to the applications that require

Anderson et al Fall Pasquale Wheeler

them SPIN is primarily written in Mo dula which

Bershad Romer et al Romer et al Cao

allows extensions to directly use system interfaces with

et al Unfortunately existing system structures

out requiring runtime conversion when communicating

are not wellsuited for sp ecialization often requiring a

with other system co de

substantial programming eort to aect evenasmall

Although SPIN relies on language features to ensure change in system b ehavior Moreover changes intended

safety within the kernel applications can b e written in to improve the p erformance of one class of applications

any language and execute within their own virtual ad can often degrade that of others As a result system

dress space Only co de that requires lowlatency access sp ecialization is a costly and errorprone pro cess

An extensible system is one that can b e changed dy

ton Kougiouris Hildebrand Engler et al

namically to meet the needs of an application The need

it still do es not approach that of a pro cedure call en

for extensibility in op erating systems is shown clearly

couraging the construction of monolithic nonextensible

by systems such as MSDOS Windows or the Macin

systems For example the L even with its

tosh Op erating System Although these systems were

aggressive design has a protected pro cedure call imple

not designed to b e extensible their weak protection

mentation with overhead of nearly pro cedure call

mechanisms have allowed application to

times Liedtke Liedtke Int As a p ointof

directly mo dify op erating system data structures and

comparison the Intwhich provided hard

co de Schulman et al While individual applica

ware supp ort for protected crossdomain transfer had

tions have b eneted from this level of freedom the lack

a crossdomain communication overhead on the order

of safe interfaces to either op erating system services or

of ab out pro cedure call times Colwell and was

op erating system extension services has created system

generally considered unacceptable

conguration chaos Draves

Some systems rely on little languages to safely ex

tend the op erating system interface through the use

of interpreted co de that runs in the kernel Lee et al

Related work

Mogul et al Yuhara et al These systems

suer from three problems First the languages b eing

Previous eorts to build extensible systems have demon

little make the expression of arbitrary control and data

strated the threeway tension b etween extensibility

structures cumb ersome and therefore limit the range

safety and p erformance For example Wulf et al

of p ossible extensions Second the interface b etween

dened an infrastructure that allowed applications

the languages programming environment and the rest

to manage resources through multilevel p olicies The

of the system is generally narrow making system in

kernel dened the mechanism for allo cating resources

tegration dicult Finallyinterpretation overhead can

between pro cesses and the pro cesses themselves im

limit p erformance

plemented the p olicies for managing those resources

Many systems provide interfaces that enable arbitrary

Hydras architecture although highly inuential had

co de to b e installed into the kernelatruntime Heide

high overhead due to its weighty capabilitybased pro

mann Pop ek Rozier et al In these systems

tection mechanism Consequently the system was de

the right to dene extensions is restricted b ecause any

signed with large ob jects as the basic building blo cks

extension can bring down the entire system application

requiring a large programming eort to aect even a

sp ecic extensibility is not p ossible

small extension

Several pro jects Lucco Engler et al Small

Researchers haverecently investigated the use of

Seltzer are exploring the use of software fault isola

as a vehicle for building extensible sys

tion Wahb e et al to safely link application co de

tems Black et al Mullender et al Cheriton

written in any language into the kernels virtual ad

ker et al Zwaenep o el Cheriton Duda Thac

dress space Software fault isolation relies on a binary

A microkernel typically exp orts a small number

rewriting to ol that inserts explicit checks on memory

of abstractions that include threads address spaces

references and branch instructions These checks al

and communication channels These abstractions can

low the system to dene protected memory segments

b e combined to supp ort more conventional op erating

ware without relying on virtual memory hardware Soft

system services implemented as userlevel programs

fault isolation shows promise as a colo cation mecha

Applicationsp ecic extensions in a microkernel o ccur

nism for relatively isolated co de and data segments It

at or ab ove the level of the kernels interfaces Unfortu

is unclear though if the mechanism is appropriate for a

nately applications often require substantial changes to

system with negrained sharing where extensions may

a microkernels implementation to comp ensate for limi

access a large numb er of segments In addition soft

tations in interfaces Lee et al Davis et al Wald

ware fault isolation is only a protection mechanism and

spurger Weihl

do es not dene an extension mo del or the service inter

Although a microkernels communication facilities

faces that determine the degree to which a system can

provide the infrastructure for extending nearly anyker

b e extended

nel service Barrera Abrossimov et al Forin et al

few have b een so extended We b elievethisisbe

Aegis Engler et al is an op erating system that

cause of high communication overhead Bershad et al

relies on ecient trap redirection to exp ort hardware

Draves et al Chen Bershad which lim

services such as exception handling and TLB manage

its extensions mostly to coarsegrained services Golub

ment directly to applications The system itself denes

et al Stevenson Julin Bricker et al

no abstractions b eyond those minimally provided bythe

Otherwise protected interaction b etween system com

hardware Engler Kaasho ek Instead conven

p onents which o ccurs frequently in a system with ne

tional op erating system services such as virtual memory

grained extensions can b e a limiting p erformance fac

and scheduling are implemented as libraries executing

tor

in an applications address space System service co de

executing in a can b e changed by the applica Although the p erformance of crossdomain communi

tion according to its needs SPIN shares manyofthe cation has improved substantially in recentyears Hamil

rst restriction is enforced at compiletime and the sec same goals as Aegis although its approach is quite dif

ond is enforced through a combination of compiletime ferent SPIN uses language facilities to protect the ker

and runtime checks Automatic storage management nel from extensions and implements protected commu

prevents memory used by a live p ointers referent from nication using pro cedure call Using this infrastructure

b eing returned to the heap and reused for an ob ject of SPIN provides an extension mo del and a core set of ex

a dierenttyp e tensible services In contrast Aegis relies on hardware

protected system calls to isolate extensions from the ker

nel and leaves unsp ecied the manner by which those

extensions are dened or applied

The protection mo del

Several systems Co op er et al Redell et al

Mossenbock Organicklike SPINhave re

A protection mo del controls the set of op erations that

lied on language features to extend op erating system

can b e applied to resources For example a protection

services Pilot for instance was a singleaddress space

mo del based on address spaces ensures that a pro cess

system that ran programs written in Mesa Geschke

can only access memory within a particular range of vir

et al an ancestor of Mo dula In general sys

tual addresses Address spaces though are frequently

tems such as Pilot have dep ended on the language for

inadequate for the negrained protection and manage

all protection in the system not just for the protection

ment of resources b eing exp ensive to create and slo w

of the op erating system and its extensions In contrast

to access Lazowska et al

SPINs reliance on language services applies only to ex

tension co de within the kernel Virtual address spaces

are used to otherwise isolate the op erating system and

programs from one another

Capabilities

All kernel resources in SPIN are referenced by capabil

The SPIN Architecture

ities A capability is an unforgeable reference to a re

source which can b e a system ob ject an interface or a

The SPIN architecture provides a software infrastruc

collection of interfaces An example of each of these is a

ture for safely combining system and application co de

physical page a physical page allo cation interface and

The protection mo del supp orts ecient negrained ac

the entire virtual memory system Individual resources

cess control of resources while the extension mo del en

are protected to ensure that extensions reference only

ables extensions to b e dened at the granularityofa

the resources to which they havebeengiven access In

pro cedure call The systems architecture is biased to

terfaces and collections of interfaces are protected to

wards mechanisms that can b e implemented with low

allow dierent extensions to have dierent views on the

cost on conventional pro cessors Consequently SPIN

set of available services

makes few demands of the hardware and instead relies

Unlike other op erating systems based on capabilities

yp echecking on languagelevel services such as static t

which rely on sp ecialpurp ose hardware Carter et al

and dynamic linking

virtual memory mechanisms Wulf et al prob

abilistic protection Engler et al or protected mes

Relevant prop erties of Mo dula

sage channels Black et al SPIN implements ca

pabilities directly using p ointers which are supp orted

SPIN and its extensions are written in Mo dula a

by the language A p ointer is a reference to a blo ckof

general purp ose designed in the

memory whose typ e is declared within an interface Fig

early s The key features of the language include

ure demonstrates the denition and use of interfaces

supp ort for interfaces typ e safety automatic storage

and capabilities p ointers in SPIN

management ob jects generic interfaces threads and

exceptions We rely on the languages supp ort for ob The compiler at compiletimeprevents a p ointer

jects generic interfaces threads and exceptions for aes

from b eing forged or dereferenced in a way inconsis

thetic reasons only we nd that these features simplify

tent with its typ e There is no runtime overhead for

the task of constructing a large system using a p ointer passing it across an interface or deref

The design of SPIN dep ends only on the languages erencing it other than the overhead of going to memory

safety and encapsulation mechanisms sp ecically inter to access the p ointer or its referent A p ointer can b e

faces typ e safety and automatic storage management h passed from the kernel to a userlevel application whic

An interface declares the visible parts of an implemen cannot b e assumed to b e typ e safe as an externalized

tation module which denes the items listed in the in reference An externalized reference is an index into a

terface All other denitions within the implementation p erapplication table that contains typ e safe references

mo dule are hidden The compiler enforces this restric to inkernel data structures The references can later

tion at compiletime Typ e safety prevents co de from b e recovered using the index Kernel services that in

accessing memory arbitrarilyApointer may only re tend to pass a reference out to user level externalize the

fer to ob jects of its referents typ e and array indexing reference through this table and instead pass out the

op erations must b e checked for b ounds violation The index

system Consequently namespace managementmust Protection domains

o ccur at the language level For example if the name

A protection domain denes the set of accessible names

c is an instance of the typ e ConsoleTthenboth c and

available to an execution context In a conventional op

ConsoleT o ccupy a p ortion of some symb olic names

erating system a protection domain is implemented us

pace An extension that redenes the typ e ConsoleT

ing virtual address spaces A name within one domain

creates an instance of the new typ e and passes it to

a virtual address has no relationship to that same name

a mo dule exp ecting a ConsoleT of the original typ e

in another domain Only through explicit mapping and

creates a typ e conict that results in an error The

sharing op erations is it p ossible for names to b ecome

error could b e avoided by placing all extensions into

meaningful b etween protection domains

a global mo dule space but since mo dules pro cedures

and variable names are visible to programmers we felt

that this would intro duce an overly restrictive program

ming mo del for the system Instead SPIN provides fa

INTERFACE Console An interface

cilities for creating co ordinating and linking program

TYPE T REFANY Read as ConsoleT is opaque

level namespaces in the context of protection domains

CONST InterfaceName ConsoleService

A global name

PROCEDURE OpenT

INTERFACE Domain

Open returns a capability for the console

PROCEDURE Writet T msg TEXT

TYPE T REFANY DomainT is opaque

PROCEDURE Readt VAR msg TEXT

PROCEDURE Closet T

PROCEDURE CreatecoffCoffFileTT

END Console

Returns a domain created from the specified object

file coff is a standard object file format

PROCEDURE CreateFromModuleT

MODULE Console An implementation module

Create a domain containing interfaces defined by the

calling module This function allows modules to

The implementation of ConsoleT

name and export themselves at runtime

TYPE Buf ARRAY OF CHAR

REVEAL T BRANDED REF RECORD T is a pointer

PROCEDURE ResolvesourcetargetT

inputQ Buf to a record

Resolve any undefined symbols in the target domain

outputQ Buf

against any exported symbols from the source

device specific info

END

PROCEDURE Combined d TT

Create a new aggregate domain that exports the

Implementations of interface functions

interfaces of the given domains

have direct access to the revealed type

PROCEDURE OpenT

END Domain

END Console

Figure The Domain interface This interface op erates on in

MODULE Gatekeeper A client

stances of typ e DomainT which are describ ed bytyp e safe p oint

IMPORT Console

ers The implementation of the Domain interface is unsafe with

resp ect to Mo dula memory semantics as it must manipulate

VAR c ConsoleT A capability for

linker symb ols and program addresses directly

the console device

PROCEDURE IntruderAlert

A SPIN protection domain denes a set of names or

BEGIN

c ConsoleOpen

program symb ols that can b e referenced by co de with

ConsoleWritec Intruder Alert

access to the domain A domain named by a capability

ConsoleClosec

is used to control dynamic linking and corresp onds to

END IntruderAlert

one or more safe ob ject les with one or more exp orted

BEGIN

interfaces An ob ject le is safe if it is unknown to the

END Gatekeeper

kernel but has b een signed by the Mo dula compiler

or if the kernel can otherwise assert the ob ject le to b e

safe For example SPINs lowest level device interface

is identical to the DEC OSF driver interface Dig

Figure The Gatekeep er mo dule interacts with SPINs Con

sole service through the Console interface Although Gate

allowing us to dynamically link vendor drivers into the

keep erIntruderAlert manipulates ob jects of typ e ConsoleT it

kernel Although the drivers are written in C the kernel

is unable to access the elds within the ob ject even though it

asserts their safety In general we prefer to avoid using

executes within the same virtual address space as the Console

ob ject les that are safe by assertion rather than by

mo dule

compiler verication as they tend to b e the source of

In SPIN the naming and protection interface is at more than their fair share of bugs

the level of the language not of the virtual memory Domains can b e intersecting or disjoint enabling ap

plications to share services or dene new ones A do tem to guide certain op erations such as page replace

main is created using the Create op eration whichini ment In other cases an extension mayentirely replace

tializes a domain with the contents of a safe ob ject le an existing system service suchasascheduler with a

Any symb ols exp orted byinterfaces dened in the ob new one more appropriate to a sp ecic application

ject le are exp orted from the domain and anyim Extensions in SPIN are dened in terms of events

p orted symb ols are left unresolved Unresolved symb ols and handlers An event is a message that announces a

corresp ond to interfaces imp orted by co de within the change in the state of the system or a request for ser

domain for which implementations havenotyet b een t handler is a pro cedure that receives the vice An even

found message An extension installs a handler on an eventby

explicitly registering the handler with the event through

The Resolve op eration serves as the basis for dynamic

a central dispatcher that routes events to handlers

linking It takes a target and a source domain and

Event names are protected by the domain machinery

resolves any unresolved symb ols in the target domain

describ ed in the previous section An event is dened

against symb ols exp orted from the source During reso

as a pro cedure exp orted from an interface and its han

lution text and data symb ols are patched in the target

dlers are dened as pro cedures having the same typ e A

domain ensuring that once resolved domains are able

handler is invoked with the arguments sp ecied bythe

to share resources at memory sp eed Resolution only

event raiser The kernel is preemptive ensuring that a

resolves the target domains undened symb ols it do es

handler cannot takeover the pro cessor

not cause additional symb ols to b e exp orted Cross

The right to call a pro cedure is equivalent to the right

linking a common idiom o ccurs through a pair of Re

to raise the eventnamedby the pro cedure In fact the

solve op erations

two are indistinguishable in SPIN and any pro cedure

The Combine op eration creates linkable namespaces

exp orted byaninterface is also an event The dispatcher

that are the union of existing domains and can b e used

exploits this similarity to optimize event raise as a direct

to bind together collections of related interfaces For

pro cedure call where there is only one handler for a

example the domain SpinPublic combines the systems

given event Otherwise the dispatcher uses dynamic

public interfaces into a single domain available to ex

co de generation Engler Pro ebsting to construct

tensions Figure summarizes the ma jor op erations on

optimized call paths from the raiser to the handlers

domains

The primary right to handle an event is restricted

The domain interface is commonly used to imp ort

to the default implementation mo dule for the event

or exp ort particular named interfaces A mo dule that

which is the mo dule that statically exp orts the pro ce

exp orts an interface explicitly creates a domain for its

dure named bytheevent For example the mo dule

interface and exp orts the domain through an inkernel

Console is the default implementation mo dule for the

nameserver The exp orted name of the interface which

even t ConsoleOpen shown in Figure Other mo d

can b e sp ecied within the interface is used to co or

ules may request that the dispatcher install additional

dinate the exp ort and imp ort as in many RPC sys

handlers or even remove the primary handler For each

tems Schro eder Burrows Bro ckschmidt The

request the dispatcher contacts the primary implemen

constant ConsoleInterfaceName in Figure denes a

tation mo dule passing the eventnameprovided bythe

name that exp orters and imp orters can use to uniquely

installer The implementation mo dule can deny or allow

identify a particular version of a service

the installation If denied the installation fails If al

Some interfaces such as those for devices restrict ac

lowed the implementation mo dule can provide a guard

cess at the time of the imp ort An exp orter can register

to b e asso ciated with the handler The guard denes

an authorization pro cedure with the nameserver that

a predicate expressed as a pro cedure that is evaluated

will b e called with the identity of the imp orter when

by the dispatcher prior to the handlers invo cation If

ever the interface is imp orted This negrained control

the predicate is true when the event is raised then the

has low cost b ecause the imp orter exp orter and autho

handler is invoked otherwise the handler is ignored

rizer interact through direct pro cedure calls

Guards are used to restrict access to events at a gran

ularity ner than the event name allowing events to b e

dispatched on a p erinstance basis For example the

The extension mo del

SPIN extension that implements IP layer pro cessing de

An extension changes the way in which a system pro

nes the event IPPacketArrivedpkt IPPacketwhich

vides service All software is extensible in one way

it raises whenever an IP packet is received The IP

or another but it is the extension mo del that deter

mo dule which denes the default implementationofthe

mines the ease transparency and eciency with which

PacketArrived event up on each installation constructs

an extension can b e applied SPINs extension mo del

a guard that compares the typ e eld in the header of

provides a controlled communication facilitybetween

the incoming packet against the set of IP proto col typ es

extensions and the base system while allowing for a

ayIPdoesnot that the handler may service In this w

varietyofinteraction styles For example the mo del

The dispatcher also allows a handler to sp ecify an additional

allows extensions to passively monitor system activity

closure to b e passed to the handler during event pro cessing The

and provide uptodate p erformance information to ap

closure allows a single handler to b e used within more than one

context plications Other extensions may oer hints to the sys

made it p ossible to manipulate large ob jects suchasen have to exp ort a separate interface for eacheventin

tire address spaces Young et al Khalidi Nelson stance A handler can stack additional guards on an

or to direct exp ensive op erations for example page event further constraining its invo cation

out Harty Cheriton McNamee Armstrong There maybeanynumb er of handlers installed on a

entirely from user level Others hav particular event The default implementation mo dule e enabled control

over relatively small ob jects such as cache pages Romer

may constrain a handler to execute synchronously or

asynchronously in b ounded time or in some arbitrary et al or TLB entries Bala et al entirely from

order with resp ect to other handlers for the same event the kernel None haveallowed for fast negrained con

trol over the physical and virtual memory resources re

Each of these constraints reects a dierent degree of

trust b etween the default implementation and the han quired by applications SPINs virtual memory system

dler For example a handler may b e b ounded by a time provides suchcontrol and is enabled by the systems

lowoverhead invo cation and protection services

quantum so that it is ab orted if it executes to o long A

handler may b e asynchronous which causes it to exe

The SPIN memory managementinterface decomp oses

cute in a separate thread from the raiser isolating the

memory services into three basic comp onents physi

raiser from handler latency When multiple handlers

cal storage naming and translation These corresp ond

execute in resp onse to an event a single result can b e

to the basic memory resources exp orted by pro cessors

communicated back to the raiser by asso ciating with

namely physical addresses virtual addresses and trans

eachevent a pro cedure that ultimately determines the

lations Applicationsp ecic services interact with these

nal result Pardyak Bershad By default the dis

three services to dene higher level virtual memory ab

patcher mimics pro cedure call semantics and executes

stractions such as address spaces

handlers synchronously to completion in undened or

Each of the three basic comp onents of the memory

der and returns the result of the nal handler executed

system is provided by a separate service interface de

scrib ed in Figure The physical address service con

trols the use and allo cation of physical pages Clients

The core services

raise the Allocate event to request physical memory with

a certain size and an optional series of attributes that

The SPIN protection and extension mechanisms de

reect preferences for machine sp ecic parameters such

scrib ed in the previous section provide a framework for

as color or contiguityAphysical page represents a unit

managing interfaces b etween services within the ker

of high sp eed storage It is not for most purp oses

nel Applications though are ultimately concerned

a nameable entityandmay not b e addressed directly

with manipulating resources such as memory and the

from an extension or a user program Instead clients

provides a set of core pro cessor Consequently SPIN

e a capabilityfor of the physical address service receiv

services that manage memory and pro cessor resources

the memory The virtual address service allo cates ca

These services which use events to communicate b e

pabilities for virtual addresses where the capabilitys

tween the system and extensions exp ort interfaces with

referent is comp osed of a virtual address a length

negrained op erations In general the service inter

and an address space identier that makes the address

faces that are exp orted to extensions within the kernel

unique The translation service is used to express the re

are similar to the secondary internal interfaces found

lationship b etween virtual addresses and physical mem

in conventional op erating systems they provide simple

ory This service interprets references to b oth virtual

functionalityover a small set of ob jects In SPIN it

and physical addresses constructs mappings b etween

is straightforward to allo cate a single virtual page a

the two and installs the mappings into the pro cessors

physical page and then create a mapping b etween the

MMU

two Because the overhead of accessing each of these

The translation service raises a set of events that

op erations is low a pro cedure call it is feasible to pro

corresp ond to various exceptional MMU conditions

vide them as interfaces to separate abstractions and to

For example if a user program attempts to access

build up higher level abstractions through direct com

an unallo cated virtual memory address the Transla

p osition By contrast traditional op erating systems ag

tionBadAddress event is raised If it accesses an al

gregate simpler abstractions into more complex ones

lo cated but unmapp ed virtual page then the Transla

b ecause the cost of rep eated access to the simpler ab

tionPageNotPresent event is raised Implementors of

stractions is to o high

higher level memory management abstractions can use

these events to dene services such as demand pag

Extensible memory management

ing copyonwrite Rashid et al distributed shared

memory Carter et al or concurrent garbage col

A memory management system is resp onsible for the

lection App el Li

allo cation of virtual addresses physical addresses and

mappings b etween the two Other systems have demon The physical page service mayatany time re

strated signicant p erformance improvements from sp e claim physical memory by raising the PhysAddrReclaim

cialized or tuned memory management p olicies that event The interface allows the handler for this eventto

are accessible through interfaces exp osed by the mem volunteer an alternative page whichmaybeoflessim

ory management system Some of these interfaces have p ortance than the candidate page The translation ser

INTERFACE PhysAddr INTERFACE Translation

IMPORT PhysAddr VirtAddr

TYPE T REFANY PhysAddrT is opaque

TYPE T REFANY TranslationT is opaque

PROCEDURE Allocatesize Size attrib Attrib T

Allocate some physical memory with particular PROCEDURE Create T

attributes PROCEDURE Destroycontext T

Create or destroy an addressing context

PROCEDURE Deallocatep T

PROCEDURE AddMappingcontext T VirtAddrT

PROCEDURE Reclaimcandidate T T p PhysAddrT prot Protection

Request to reclaim a candidate page Clients Add vp into the named translation context

may handle this event to nominate with the specified protection

alternative candidates

PROCEDURE RemoveMappingcontext T v VirtAddrT

END PhysAddr

PROCEDURE ExamineMappingcontextT

v VirtAddrT Protection

A few events raised during

illegal translations

INTERFACE VirtAddr

PROCEDURE PageNotPresentv T

PROCEDURE BadAddressv T

TYPE T REFANY VirtAddrT is opaque

PROCEDURE ProtectionFaultvT

PROCEDURE Allocatesize Size attrib Attrib T

END Translation

PROCEDURE Deallocatev T

END VirtAddr

Figure The interfaces for managing physical addresses virtual addresses and translations

vice ultimately invalidates any mappings to a reclaimed

vices Anderson et al In contrast scheduler acti

page

vationswhich are integrated with the kernel havehigh

The SPIN core services do not dene an address space communication overhead Davis et al

mo del directly but can b e used to implement a range

In SPIN an application can provide its own thread

of mo dels using a variety of optimization techniques

package and scheduler that executes within the kernel

For example wehave built an extension that imple

The thread package denes the applications execution

ments UNIX address space semantics for applications

mo del and synchronization constructs The scheduler

It exp orts an interface for copying an existing address

controls the multiplexing of the pro cessor across multi

space and for allo cating additional memory within one

ple threads Together these packages allow an applica

For each new address space the extension allo cates a

tion to dene arbitrary thread semantics and to imple

new context from the translation service This context

ment those semantics close to the pro cessor and other

is subsequently lled in with virtual and physical ad

kernel services

dress resources obtained from the memory allo cation

Although SPIN do es not dene a thread mo del for

services Another kernel extension denes a memory

applications it do es dene the structure on whichan

managementinterface supp orting s task abstrac

implementation of a thread mo del rests This structure

tion Young et al Applications may use these in

is dened by a set of events that are raised or handled

terfaces or they may dene their own in terms of the

byschedulers and thread packages A scheduler multi

lowerlevel services

plexes the underlying pro cessing resources among com

p eting contexts called strands A strand is similar to

a thread in traditional op erating systems in that it re

Extensible thread management

ects some pro cessor context Unlike a thread though

a strand has no minimal or requisite kernel state other

An op erating systems thread management system pro

than a name An applicationsp ecic thread package

vides applications with interfaces for scheduling concur

denes an implementation of the strand interface for its

rency and synchronization Applications though can

own threads

require levels of functionality and p erformance that a

thread management system is unable to deliver User Together the thread package and the scheduler im

level thread management systems have addressed this plement the control owmechanisms for userspace con

mismatchWulf et al Co op er Draves Marsh texts Figure describ es this interface The interface

et al Anderson et al but only partially contains twoevents Block and Unblockthatcanbe

For example Machs userlevel CThreads implemen raised to changes in a strands execution state A

tation Co op er Draves can have anomalous b e disk driver can direct a scheduler to blo ck the current

ha vior b ecause it is not wellintegrated with kernel ser strand during an IO op eration and an interrupt han

that an applicationsp ecic p olicy do es not conict with dler can unblo ck a strand to signal the completion of the

the global p olicy While the global scheduling p olicy is IO op eration In resp onse to these events the sched

replaceable it cannot b e replaced by an arbitrary appli uler can communicate with the thread package man

cation and its replacementcanhave global eects In aging the strand using Checkpoint and Resume events

the current implementation the global scheduler imple allowing the package to save and restore execution state

ments a roundrobin preemptive priority p olicy

Wehave used the strand interface to implementas

kernel extensions a variety of thread managementinter

INTERFACE Strand

faces including DEC OSF kernel threads Dig C

Threads Co op er Draves and Mo dula threads

TYPE T REFANY StrandT is opaque

The implementations of these interfaces are built di

PROCEDURE BlocksT rectly from strands and not layered on top of others

Signal to a scheduler that s is not runnable

The interface supp orting DEC OSF kernel threads

allows us to incorp orate the vendors device drivers di

PROCEDURE Unblocks T

rectly into the kernel The CThreads implementation

Signal to a scheduler that s is runnable

supp orts our UNIX server which uses the MachC

PROCEDURE Checkpoints T

Threads interface for concurrency Within the kernel

Signal that s is being descheduled and that it

age and scheduler implements the a trusted thread pack

should save any state required for

Mo dula thread interface Nelson

subsequent rescheduling

PROCEDURE Resumes T

Signal that s is being placed on a processor and

Implications for trusted services

that it should reestablish any state saved during

a prior call to Checkpoint

The pro cessor and memory services are two instances of

SPINs core services whichprovide interfaces to hard

END Strand

ware mechanisms The core services are trustedwhich

means that they must p erform according to their in

terface sp ecication Trust is required b ecause the ser

Figure The Strand Interface This interface describ es the

vices access underlying hardware facilities and at times

scheduling events aecting control ow that can b e raised within

must step outside the protection mo del enforced bythe

the kernel Applicationspecic schedulers and thread packages

language Without trust the protection and extension

install handlers on these events which are raised on b ehalf of

particular strands A trusted thread package and scheduler pro

mechanisms describ ed in the previous section could not

vide default implementations of these op erations and ensure that

function safely as they rely on the prop er management

extensions do not install handlers on strands for whichtheydo

of the hardware Because trusted services mediate ac

not p ossess a capability

cess to physical resources applications and extensions

must trust the services that are trusted bythe SPIN

Applicationsp ecic thread packages only manipulate

kernel

the ow of control for application threads executing out

In designing the interfaces for SPINs trusted services

side of the kernel For safety reasons the resp onsibil

wehaveworked to ensure that an extensions failure to

ity for scheduling and synchronization within the ker

use an interface correctly is isolated to the extension

nel b elongs to the kernel As a thread transfers from

itself and any others that rely on it For example

user mo de to kernel mo de it is checkp ointed and a

the SPIN scheduler raises events that are handled by

Mo dula thread executes in the kernel on its b ehalf

applicationsp ecic thread packages in order to start or

As the Mo dula thread leaves the kernel the blo cked

stop threads Although it is in the handlers b est in

applicationsp ecic thread is resumed

terests to resp ect or at least not interfere with the

A global scheduler implements the primary pro

semantics implied by the event this is not enforced

cessor allo cation p olicy b etween strands Additional

An applicationsp ecic thread package may ignore the

applicationsp ecic schedulers can b e placed on top

event that a particular userlevel thread is runnable

of the global scheduler using Checkpoint and Resume

but only the application using the thread package will

events to relinquish or receivecontrol of the pro cessor

b e aected In this way the failure of an extension is

That is an applicationsp ecic scheduler presents itself

no more catastrophic than the failure of co de executing

to the global scheduler as a thread package The deliv

in the runtime libraries found in conv entional systems

ery of the Resume event indicates that the new sched

uler can sc hedule its own strands while Checkpoint sig

nals that the pro cessor is b eing reclaimed by the global

System p erformance

scheduler

The Block and Unblock events when raised on strands

In this section weshow that SPIN enables applications scheduled by applicationsp ecic schedulers are routed

to comp ose system services in order to dene new kernel by the dispatcher to the appropriate scheduling imple

mentation This allows new scheduling p olicies to b e services that p erform well Sp ecicallyweevaluate the

implemented and integrated into the kernel provided p erformance of SPIN from four p ersp ectives

System size The size of the system in terms of lines

networkbased le system and a netw ork debugger Re

of co de and ob ject size demonstrates that advanced

dell The third comp onent rtcontains a version of

runtime services do not necessarily create an op er

the DEC SRC Mo dula runtime system that supp orts

ating system kernel of excessive size In addition

automatic memory management and exception pro cess

the size of the systems extensions shows that they

ing The fourth comp onent lib includes a subset of the

can b e implemented with reasonable amounts of

standard Mo dula libraries and handles manyofthe

co de

more mundane data structures lists queues hash ta

bles etc generally required byany op erating system

Microbenchmarks Measurements of lowlevel sys

kernel The nal comp onent sal implements a low

tem services such as protected communication

level interface to device drivers and the MMU oering

thread management and virtual memoryshowthat

functionality such as install a page table entry get

SPINs extension architecture enables us to con

acharacter from the console and read blo ck from

struct communicationintensive services with low

SCSI unit We build sal by applying a few dozen le

overhead The measurements also show that con

dis against a small subset of the les from the DEC

ventional system mechanisms such as a system call

OSF kernel source tree This approach while increas

and crossaddress space protected pro cedure call

ing the size of the kernel allows us to track the vendors

haveoverheads that are comparable to those in con

hardware without requiring that weport SPIN to each

ventional systems

new system conguration

Networking Measurements of a suite of network

ing proto cols demonstrate that SPINs extension

Comp onent Source size Text size Data size

lines bytes

architecture enables the implementation of high

sys

p erformance network proto cols

core

rt

Endtoend performance Finallyweshowthat

lib

endtoend application p erformance can b enet

sal

Total kernel

from SPINs architecture by describing two appli

cations that use system extensions

Table This table shows the size of dierent comp onents of the

system The sys core and rt comp onents contain the interfaces

We compare the p erformance of op erations on three

visible to extensions The column lab eled lines do es not include

op erating systems that run on the same platform SPIN

comments We use the DEC SRC Mo dula compiler release

V of August DEC OSF V whichisa

hisa monolithic op erating system and Machwhic

microkernel We collected our measurements on DEC

Alpha MHz AXP workstations whichare

Microb enchmarks

rated at SPECint Each machine has MBs of

Microb enchmarks reveal the overhead of basic system

memory a KB unied external cache an HP C

functions such a protected pro cedure call thread man

GB diskdrive a Mbsec Lance Ethernet inter

agement and virtual memory They dene the b ounds

face and a FORE TCA Mbsec ATM adapter

of system p erformance and provide a framework for

card connected to a FORE ASX The FORE

understanding larger op erations Times presented in

cards use programmed IO and can maximally deliver

this section measured with the Alphas internal cycle

only ab out Mbsec b etween a pair of hosts Brustoloni

are the average of a large numb er of iterations

Bershad Weavoid comparisons with op erating

and may therefore b e overly optimistic regarding cache

systems running on dierent hardwareasbenchmarks

eects Bershad et al a

tend to scale p o orly for a varietyofarchitectural rea

sons Anderson et al All measurements are taken

while the op erating systems run in singleuser mo de

Protected communication

In a conventional op erating system applications ser

System comp onents

vices and extensions communicate using two protected

mechanisms system calls and crossaddress space calls SPIN runs as a standalone kernel on DEC Alpha work

The rst enables applications and kernel services to in stations The system consists of ve main comp onents

teract The second enables interaction b etween appli sys core rt lib and sal that supp ort dierent classes

cations and services that are not part of the kernel of service Table shows the size of each comp onent

The overhead of using either of these mechanisms is the in source lines ob ject bytes and p ercentages The rst

limiting factor in a conventional systems extensibility comp onent sys implements the extensibilitymachin

High overhead discourages frequentinteraction requir ery domains naming linking and dispatching The

ing that a system b e built from coarsegrained interfaces second comp onent core implements the virtual mem

to amortize the cost of communication over large op er ory and scheduling services describ ed in the previous

ations section as well as device management a diskbased and

SPINs inkernel protected pro cedure call time is con SPINs extension mo del oers a third mechanism

servative Our Mo dula compiler generates co de for for protected communication Simple pro cedure calls

whichanintermo dule call is roughly twice as slowasan rather than system calls can b e used for communica

intramo dule call A more recentversion of the Mo dula tion b etween extensions and the core system Similarly

compiler corrects this disparity In addition our com simple pro cedure calls rather than crossaddress pro

piler do es not p erform inlining which can b e an imp or cedure calls can b e used for communication b etween

tant optimization when calling many small pro cedures applications and other services installed into the kernel

These optimizations do not aect the semantics of the

In Tablewe compare the p erformance of the dif

language and will therefore not change the systems pro

ferent protected communication mechanisms when in

tection mo del

voking the null pro cedure call on DEC OSF Mach

and SPIN The null pro cedure call takes no arguments

and returns no results it reects only the cost of con

Thread management

trol transfer The protected inkernel call in SPIN

Thread managementpackages implement concurrency

is implemented as a pro cedure call b etween two do

control op erations using underlying kernel services As

mains that have b een dynamically linked Although

s inkernel threads are im previously mentioned SPIN

this test do es not measure data transfer the overhead

plemented with a trusted thread package exp orting the

of passing arguments b etween domains even large ar

Mo dula thread interface Applicationsp ecic exten

guments is small b ecause they can b e passed byref

sions also rely on threads executing in the kernel to im

erence System call overhead reects the time to cross

plement their own concurrent op erations Atuserlevel

the userkernel b oundary execute a pro cedure and re

thread managementoverhead determines the granular

turn In Mach and DEC OSF system calls ow from

ity with which threads can b e used to control concurrent

the trap handler through to a generic but xed sys

userlevel op erations

tem call dispatcher and from there to the requested

Table shows the overhead of thread management

system call written in C In SPINthekernels trap

op erations for kernel and user threads using the dier

handler raises a TrapSystemCal l event which is dis

ent systems ForkJoin measures the time to create

patched to a Mo dula pro cedure installed as a handler

schedule and terminate a new thread synchronizing

The third line in the table shows the time to p erform

the termination with another thread PingPong reects

a protected crossaddress space pro cedure call DEC

synchronization overhead and measures the time for a

OSF supp orts crossaddress space pro cedure call us

pair of threads to synchronize with one another the rst

ing so ckets and SUN RPC Machprovides an optimized

thread signals the second and blo cks then the second

path for crossaddress space communication using mes

signals the rst and blo cks

sages Draves SPINs crossaddress space pro cedure

We measure kernel thread overheads using the na

call is implemented as an extension that uses system

sleep and tive primitives provided byeachkernel thread

calls to transfer control in and out of the kernel and

thread wakeup in DEC OSF and Mach and lo cks with

crossdomain pro cedure calls within the kernel to trans

condition variables in SPIN At userlevel we measure

fer control b etween address spaces

the p erformance of the same program using CThreads

on MachandSPIN and PThreads a CThreads sup er

set on DEC OSF The table shows measurements for Op eration DEC OSF Mach SPIN

Protected inkernel call na na

two implementations of CThreads on SPIN The rst

System call

implementation lab eled layered is implemented as

Crossaddress space call

a userlevel library layered on a set of kernel extensions

that implement Machs kernel thread interface The sec

Table Protected communicationoverhead in microseconds

ond implementation lab eled integrated is structured

Neither DEC OSF nor Mach supp ort protected inkernel com

munication

asakernel extension that exp orts the CThreads inter

face using system calls The latter version uses SPINs

The table illustrates two p oints ab out communication

strand interface and is integrated with the scheduling

and system structure First the overhead of protected

behavior of the rest of the kernel The table shows

communication in SPIN can b e that of pro cedure call

that SPINs extensible thread implementation do es not

for extensions executing in the kernels address space

incur a p erformance p enalty when compared to non

SPINs protected inkernel calls provide the same func

extensible ones even when integrated with kernel ser

tionality as crossaddress space calls in DEC OSF and

vices

Mach namely the ability to execute arbitrary co de in

resp onse to an applications call Second SPINs ex

Virtual memory

tensible architecture do es not preclude the use of tradi

Applications can exploit the virtual memory fault path tional communication mechanisms having p erformance

to extend system services App el Li For example comparable to that in nonextensible systems However

concurrent and generational garbage collectors can use the disparitybetween the p erformance of a protected in

write faults to maintain invariants or collect reference kernel call and the other mechanisms encourages the use

information A longstanding problem with faultbased of inkernel extensions

memoryandMach requires that they use the exter

DEC OSF Mach SPIN

nal pager interface Young et al Neither signals

kernel user kernel user kernel user

nor external pagers though have esp ecially ecientim

Op eration layered integrated

plementations as the fo cus of each is generalized func

ForkJoin

tionality Thekkath Levy The second reason for

PingPong

SPINs dominance is that each virtual memory event

which requires a series of interactions b etween the ker

Table Thread managementoverhead in microseconds

nel and the application is reected to the application

ernel protected pro cedure call DEC through a fast ink

OSF and Mach though communicate these events

strategies has b een the overhead of handling a page fault

by means of more exp ensive traps or messages

in an application Thekkath Levy Anderson et al

There are two sources of this overhead First han

dling each fault in a user application requires crossing

Op eration DEC OSF Mach SPIN

Dirty na na

the userkernel b oundary several times Second con

Fault

ventional systems provide quite general exception inter

Trap

faces that can p erform many functions at once As a

Prot

result applications requiring only a subset of the inter

Prot

Unprot

faces functionalitymust pay for all of it SPIN allows

App el

applications to dene sp ecialized fault handling exten

App el

sions to avoid userkernel b oundary crossings and im

plement precisely the functionality that is required

Table Virtual memory op eration overheads in microseconds

Table shows the time to execute several commonly

Neither DEC OSF nor Mach provide an interface for querying

the internal state of a page frame

referenced virtual memory b enchmarks App el Li

Engler et al The line lab eled Dirty in the

table measures the time for an application to query the

status of a particular virtual page Neither DEC OSF

Networking

nor Machprovide this facility The time shown in the

Wehave used SPINs extension architecture to imple

table is for an extension to invoke the virtual memory

mentasetofnetwork proto col stacks for Ethernet and

system an additional microseconds system call time

ATM networks Fiuczynski Bershad Figure il

is required to invoke the service from user level Trap

lustrates the structure of the proto col stacks whichare

measures the latency b etween a page fault and the time

similar to the xkernels Hutchinson et al except

when a handler executes Fault is the p erceived latency

that SPIN p ermits user co de to b e dynamically placed

of the access from the standp oint of the faulting thread

within the stack Each incoming packet is pushed

It measures the time to reect a page fault to an appli

through the proto col graph byevents and pulled by

cation enable access to the page within a handler and

handlers The handlers at the top of the graph can pro

resume the faulting thread Prot measures the time

cess the message entirely within the kernel or copyit

to increase the protection of a single page Similarly

out to an application The RPC and AM extensions

Prot and Unprot measure the time to increase

for example implement the network transp ort for a re

and decrease the protection o ver a range of pages

mote pro cedure call package and active messages von

Machs unprotection is faster than protection since the

Eicken et al The video extension provides a di

op eration is p erformed lazily SPINs extension do es not

rect path for video packets from the network to the

lazily evaluate the request but enables the access as re

framebuer The UDP and TCP extensions supp ort

quested Appel and Appel measure a combination of



The Forward extension pro the Internet proto cols

traps and protection changes The Appel b enchmark

vides transparent UDPIP and TCPIP forw arding for

measures the time to fault on a protected page resolve

packets arriving on a sp ecic p ort Finallythe HTTP

the fault in the handler and protect another page in

extension implements the Hyp erText Transp ort Proto

the handler Appel measures the time to protect

col BernersLee et al directly within the kernel

pages and fault on each one resolving the fault in the

enabling a server to resp ond quickly to HTTP requests

handler Appel is shown as the average cost p er page

by splicing together the proto col stack and the lo cal le

SPIN outp erforms the other systems on the virtual

system

memory b enchmarks for two reasons First SPIN uses

kernel extensions to dene applicationsp ecic system

calls for virtual memory management The calls pro

Latency and Bandwidth

vide access to the virtual and physical memory inter

Table shows the round trip latency and reliable band

faces describ ed in the previous section and install han

width b etween two applications using UDPIP on DEC

dlers for TranslationProtectionFault events that o ccur

within the applications virtual address space In con



Wecurrently use the DEC OSF TCP engine as a SPIN

trast DEC OSF requires that applications use the

extension and manually assert that the co de which is written in

C is safe UNIX signal and mprotect interfaces to manage virtual

net driver is optimized for throughput Using dierent

ers weachieve a roundtrip latency of

Ping A.M. RPC Video Forward HTTP device driv

secs on Ethernet and secs on ATM while reli

able ATM bandwidth b etween a pair of hosts rises to

e estimate the minimum round trip time

ICMP.PktArrivedUDP.PktArrived TCP.PktArrived Mbsec W

using our hardware at roughly secs on Ethernet and

secs on ATM The maximum usable Ethernet and

ICMP UDP TCP

ATM bandwidths b etween a pair of hosts are roughly Mbsec and Mbsec

IP.PktArrived

Proto col forwarding

Event

s extension architecture can b e used to provide

Handler IP SPIN

proto col functionality not generally available in con

entional systems For example some TCP redirection

Event Ether.PktArrived ATM.PktArrived v

e otherwise proto cols Balakrishnan et al that hav

ernel mo dications can b e straightforwardly Lance Fore required k

device driver

dened by an application as a SPIN extension A for

warding proto col can also b e used to load balance ser

Figure This gure shows a proto col stack that routes incom

vice requests across multiple servers

ing network packets to applicationspecic endp oints within the

In SPIN an application installs a no de into the pro

kernel Ovals representevents raised to route control to handlers

to col stackwhich redirects all data and control packets

which are represented byboxes Handlers implement the proto col

destined for a particular p ort numb er to a secondary

corresp onding to their lab el

host Wehave implemented a similar service using DEC

OSF with a userlevel pro cess that splices together

OSF and SPINFor DEC OSF the application

an incoming and outgoing so cket The DEC OSF

co de executes at user level and eachpacket sentin

forwarder is not able to forward proto col control pack

volves a trap and several copy op erations as the data

ets b ecause it executes ab ove the transp ort layer As

moves across the userkernel b oundaryFor SPINthe

a result it cannot maintain a proto cols endtoend se

application co de executes as an extension in the kernel

mantics In the case of TCP endtoend connection

where it has lowlatency access to b oth the device and

establishment and termination semantics are violated

data Each incoming packet causes a series of events

A userlevel intermediary also interferes with the proto

to b e generated for eachlayer in the UDPIP proto

cols algorithms for window size negotiation slow start

col stack EthernetATM IP UDP shown in Figure

failure detection and congestion control p ossibly de

For SPIN proto col pro cessing is done by a separately

grading the overall p erformance of connections b etween

scheduled kernel thread outside of the interrupt handler

the hosts Moreover on the userlevel forwarder each

We do not presentnetworking measurements for Mach

packet makes two trips through the proto col stackwhere

as the system neither provides a path to the Ethernet

it is twice copied across the userkernel b oundaryTa

more ecient than DEC OSF nor supp orts our ATM

ble compares the latency for the two implementations

card

and reveals the additional work done by the userlevel

forw arder

Latency Bandwidth

DEC OSF SPIN DEC OSF SPIN

Ethernet

ATM

TCP UDP

DEC OSF SPIN DEC OSF SPIN

Ethernet

Table Network proto col latency in microseconds and receive

ATM

bandwidth in Mbsec We measure latency using small packets

bytes and bandwidth using large packets for Ethernet

and for ATM

Table Round trip latency in microseconds to route

packets through a proto col forwarder

The table shows that pro cessing packets entirely

within the kernel can reduce roundtrip latency when

compared to a system in whichpackets are handled in

Endtoend p erformance

Throughput which tends not to b e latency

Wehave implemented several applications that exploit sensitive is roughly the same on b oth systems

SPINs extensibility One is a networked video system We use the same vendor device drivers for b oth DEC

that consists of a server and a clientviewer The server OSF and SPIN to isolate dierences due to system

is structured as three kernel extensions one that uses architecture from those due to the characteristics of the

the lo cal le system to read video frames from the disk underlying device driver Neither the Lance Ethernet

another that sends the video out over the network and a driver nor the FORE ATM driver are optimized for la

third that registers itself as a handler on the SendPacket tency Thekkath Levy and only the Lance Ether

event transforming the single send into a multicast to

ts The server transmits frames p er

a list of clien 45

h client On the client an extension awaits

second to eac 40

incoming video packets decompresses and writes them wn directly to the frame buer using the structure sho 35 SPIN T3 Driver DEC OSF/1 T3 Driver

in Figure 30

Because each outgoing packet is pushed through the

t proto col graph only once and not once p er clien 25

SPINs server can supp ort a larger number of

stream 20

clients than one that pro cesses eachpacket in isolation

CPU Utilization

o show this we measure pro cessor utilization as a func

T 15

umb er of clients for the SPIN server and for

tion of the n 10

a server that runs on DEC OSF The DEC OSF

er executes in user space and communicates with

serv 5

clients using so ckets each outgoing packet is copied into

0

ernel and is pushed through the kernels proto col

the k 2 4 6 8 10 12 14

kinto the device driver We determine pro cessor

stac Number of Clients

utilization by measuring the progress of a lowpriority

Figure Server utilization as a function of the numb er of client

idle thread that executes on the server

video streams Each stream requires approximately Mbsec

Using the FORE interface we nd that b oth SPIN

and DEC OSF consume roughly the same fraction of

the servers pro cessor for a given numb er of clients Al

tem to nd the le A comparable userlevel web server though the SPIN server do es less work in the proto col

on DEC OSF that relies on the op erating systems stack the ma jority of the servers CPU resources are

caching le system no double buering takes ab out

consumed by the programmed IO that copies data to

the network one word at a time Using a network inter milliseconds p er request for the same cached le

face that supp orts DMA though we nd that the SPIN

servers pro cessor utilization grows less slowly than the

DEC OSF servers Figure shows server pro ces

sor utilization as a function of the numb er of supp orted

Other issues

client streams when the server is congured with a Dig

ital TPKT adapter The T is an exp erimental net

work interface that can send Mbsec using DMA We

Scalability and the dispatcher

use the same device driver in b oth op erating systems

At streams b oth SPIN and DEC OSF saturate

SPINs event dispatcher matches event raisers to han

the network but SPIN consumes only half as muchof

dlers Since every pro cedure in the system is eectively

the pro cessor Compared to DEC OSF SPIN can

an event the latency of the dispatcher is critical As

supp ort more clients on a faster network or as many

mentioned in the case of a single synchronous han

clients on a slower pro cessor

dleranevent raise is implemented as a pro cedure call

Another application that can b enet from SPINs

from the raiser to the handler In other cases suchas

architecture is a web server To service requests

when there are many handlers registered for a particular

quicklyaweb server should cache recently accessed

event the dispatcher takes a more active role in event

ob jects not cache large ob jects that are infrequently

deliveryFor each guardhandler pair installed on an

accessed Chankhuntho d et al and avoid double

event the dispatcher evaluates the guard and if true

buering with other caching agents Stonebraker

invokes the handler Consequently dispatcher latency

A server that do es not itself cache but is built on top

dep ends on the numb er and complexity of the guards

of a conventional caching le system avoids the double

and the number of event handlers ultimately invoked

buering problem but is unable to control the caching

In practice the overhead of an event dispatch is linear

p olicy In contrast a serv er that controls its own cache

with the numb er of guards and handlers installed on

on top of the le systems suers from double buering

the event F or example round trip Ethernet latency

SPIN allows a server to b oth control its cache and whichwe measure at secs rises to ab out secs

avoid the problem of double buering A SPIN web when additional guards and handlers register inter

server implements its own hybrid caching p olicy based est in the arrival of some UDP packet but all guards

on le typ e LRU for small les and nocache for large evaluate to false When all guards evaluate to true

les which tend to b e accessed infrequently The client latency rises to secs Presentlywe p erform no

side latency of an HTTP transaction to a SPIN web guardsp ecic optimizations suchasevaluating common

server running as a kernel extension is milliseconds sub expressions Yuhara et al or representing guard

when the requested le is in the servers cache Oth predicates as decision trees As the system matures we

erwise the server go es through a noncaching le sys plan to apply these optimizations

Comp onent Source size Text size Data size

Impact of automatic storage management

lines bytes bytes

NULL syscal l

An extensible system cannot dep end on the correctness

IPC

of unprivileged clients for its memory integrity As pre

CThreads

viously mentioned memory managementschemes that

DEC OSF threads

allow extensions to return ob jects to the system heap are

VM workload

IP

unsafe b ecause a rogue client can violate the typ e system

UDP

by retaining a reference to a freed ob ject SPIN uses a

TCP

tracebased mostlycopying garbage collector Bartlett

HTTP

to safely reclaim memory resources The collector

TCP Forward

UDP Forward

serves as a safety net for untrusted extensions and en

Video Client

sures that resources released by an extension either

Video Server

through inaction or as a result of premature termina

tion are eventually reclaimed

Table This table shows the size of some dierent system

Clients that allo cate large amounts of memory can

extensions describ ed in this pap er

trigger frequent garbage collections with adverse global

eects In practice this is less of a problem than might

cult issues that typically arise in any language design

b e exp ected b ecause SPIN and its extensions avoid allo

or redesign For each ma jor issue that we considered

cation on fast paths For example none of the measure

in the context of a safe version of C typ e semantics

ments presented in this section change when we disable

ob jects storage management naming etc wefound

the collector during the tests Even in systems with

the issue already satisfactorily addressed by Mo dula

out garbage collection generalized allo cation is avoided

Moreover we understo o d that the denition of our ser

b ecause of its high latency Instead subsystems imple

vice interfaces was more imp ortant than the language

ment their own allo cators optimized for some exp ected

with whichwe implemented them

usage pattern SPIN services do this as well and for the

Ultimatelywe decided to use Mo dula for b oth the

same reason dynamic memory allo cation is relatively

system and its extensions Early on we found evidence

exp ensive As a consequence there is less pressure on

to abandon our two main prejudices ab out the language

the collector and the pressure is least likely to b e ap

that programs written in it are slow and large and that

plied during a critical path

C programmers could not b e eective using another lan

guage In terms of p erformance wehave found nothing

Size of extensions

remarkable ab out the languages co de size or execution

time as shown in the previous section In terms of pro

Table shows the size of some of the extensions de

grammer eectiveness wehave found that it takes less

scrib ed in this section SPIN extensions tend to require

than a day for a comp etent C to learn the

an amount of co de commensurate with their functional

syntax and more obvious semantics of Mo dula and

ityFor example the Nul l syscal l and IPC extensions

another few days to b ecome procient with its more

are conceptually simple and also have simple imple

advanced features Although anecdotal our exp erience

mentations Extensions tend to imp ort relatively few

has b een that the p ortions of the SPIN kernel written

ab out a dozen in terfaces and use the domain and

in Mo dula are much more robust and easier to under

event system in fairly stylized ways As a result we

stand than those p ortions written in C

have not found building extensions to b e exceptionally

dicult In contrast we had more trouble correctly im

plementing a few of our b enchmarks on DEC OSF

Conclusions

or Mach b ecause wewere sometimes forced to follow

circuitous routes to achieve a particular level of func

The SPIN op erating system demonstrates that it is p os tionality Machs external pager interface for instance

sible to achieve go o d p erformance in an extensible sys required us to implement a complete pager in user space

tem without compromising safety The system provides although wewere only interested in discovering write

a set of ecientmechanisms for extending services as protect faults

well as a core set of extensible services Colo cation

enforced mo dularit y logical protection domains and dy

namic call binding allow extensions to b e dynamically

Exp eriences with Mo dula

dened and accessed at the granularity of a pro cedure

call

Our decision to use Mo dula was made with some care

In the past system builders have only relied on Originallywe had intended to dene and implementa

the programming language to translate op erating sys compiler for a safe subset of C All of us b eing C pro

tem p olicies and mechanisms into machine co de Us grammers were certain that it was infeasible to build

ing a programming language with the appropriate fea an ecient op erating system without using a language

tures webelieve that op erating system implementors having the syntax semantics and p erformance of C As

can more heavily rely on compiler and language run the design of our safe subset pro ceeded we faced the dif

Bershad Bershad B N Practical Considerations for Non

time services to construct systems in which structure

Blo cking Concurrent Ob jects In Proceedings of the Thir

and p erformance are complementary

teenth International Conference on

Additional information ab out the SPIN pro ject is

Systems pages Pittsburgh PA May

available at httpwwwspincswashingtonedu an Al

Bershad et al Bershad B N Anderson T E Lazowska

pha running SPIN and the HTTP extension

E D and Levy H M Lightweight Remote Pro cedure

describ ed in this pap er

Call ACM Transactions on Computer Systems

February

and Forin A Bershad et al a Bershad B N Draves R P

Acknowledgements

Using Microb enchmarkstoEvaluate System Performance

In Proceedings of the Third Workshop on Workstation Op

erating Systems pages Key Biscayne FL April

Many p eople have contributed to the SPIN pro ject

David Dion has b een resp onsible for bringing up the

Bershad et al b Bershad B N Redell D D and Ellis J R

systems UNIX server Jan Sanislo made it p ossible for

Fast Mutual Exclusion for Unipro cessors In Proceedings of

us to use the DEC OSF SCSI driver from SPINAn

the Fifth International ConferenceonArchitectural Sup

thony Lamarca Dylan McNamee Geo Voelker and

port for Programming Languages and Operating Systems

ASPLOSV pages Boston MA Octob er

Alec Wolman assisted in understanding system p erfor

mance on DEC OSF and Mach David Nichols Hank

Black et al Black D L et al Microkernel Op erating Sys

tem Architecture and Mach In Proceedings of the USENIX

Levy and Terri Watson provided feedback on earlier

Workshop on MicroKernels and Other Kernel Architec

drafts of this pap er David Boggs provided us with the

tures pages Seattle WA April

T cards that we used in the video server exp eriment

Bricker et al Bricker A Gien M Guillemont M Lip

Sp ecial thanks are due to DEC SRC who provided us

kis J Orr D and Rozier M A New Lo ok at Micro

with much of our compiler infrastructure

kernelbased UNIX Op erating Systems Lessons in Perfor

mance and Compatibility In Proceedings of the EurOpen

Spring ConferenceTromso e NorwayMay

References

Bro ckschmidt Bro c kschmidt K Inside OLE

Press

Abrossimov et al Abrossimov V Rozier M and Shapiro

M Generic Virtual Memory Management for Op erating

Brustoloni Bershad Brustoloni J C and Bershad B N

System Kernels In Proceedings of the Thirteenth ACM

Simple Proto col Pro cessing for HighBandwidth Low

Symposium on Operating Systems Principles pages

Latency Networking Technical Rep ort CMUCS

Litcheld Park AZ Decemb er

Carnegie Mellon UniversityMarch

Anderson et al Anderson T E Levy H M Bershad

Cao et al Cao P Felten E W and Li K Implementation

B N and Lazowska E D The Interaction of Architecture

and Performance of ApplicationControlled File Caching

and Op erating System Design In Pro ceedings of the Fourth

In Proceedings of the First USENIX Symposium on Oper

International ConferenceonArchitectural Support for Pro

ating Systems Design and Implementation OSDIpages

gramming Languages and Operating Systems ASPLOS

MontereyCANovemb er

IV pages Santa Clara CA April

Carter et al Carter J B Bennett J K and Zwaenep o el

Anderson et al Anderson T E Bershad B N Lazowska

W Implementation and Performance of Munin In Pro

E D and LevyHM Scheduler Activations Eec

ceedings of the Thirteenth ACM Symposium on Operating

tive Kernel Supp ort for the UserLevel Managementof

Systems Principles pages Pacic Grove CA Oc

Parallelism ACM Transactions on Computer Systems

tob er

February

Carter et al Carter N P Keckler S W and DallyWJ

App el Li App el W and Li K Virtual Memory Primi

Hardware Supp ort for Fast CapabilityBased Addressing

tives for User Programs In Proceedings of the Fourth In

In Proceedings of the Sixth International Conferenceon

ternational ConferenceonArchitectural Support for Pro

Architectural Support for Programming Languages and Op

gramming Languages and Operating Systems ASPLOS

erating Systems ASPLOSVI pages San Jose

IV pages Santa Clara CA April

CA Octob er

Bala et al Bala K Kaasho ek M F and WeihlWE

Chankhuntho d et al Chankhuntho d A Danzig P Neer

Software Prefetching and Caching for Translation Lo oka

daels C Schwartz M and Worrell K A Hierarchical

side Buers In Proceedings of the First USENIX Sym

Internet Ob ject Cache Technical Rep ort CUCS

posium on Operating Systems Design and Implementation

DCS University of Colorado July

OSDI pages Monterey CA Novemb er

Balakrishnan et al Balakrishnan H Seshan S Amir E

Chen Bershad Chen J B and Bershad B N The Im

ver and Katz R H Improving TCPIP Performance o

pact of Op erating System Structure on Memory System

Wireless Networks In Proceedings of the First ACM Con

PerformanceInProceeding s of the Fourteenth ACM Sym

ference on Mobile Computing and NetworkingNovember

posium on Operating Systems Principles pages

Asheville NC Decemb er

Barrera Barrera J S A Fast MachNetwork IPC Imple

Cheriton Duda Cheriton D R and Duda K J A

mentation In Proceedings of the Second USENIX Mach

Caching Mo del of Op erating System Kernel Functionality

Symposium pages Monterey CA Novemb er

In Proceedings of the First USENIX Symposium on Oper

ating Systems Design and Implementation OSDIpages

Bartlett Bartlett J F Compacting Garbage Collection with

MontereyCANovemb er

Ambiguous Ro ots Technical Rep ort WRLTR Digi

tal Equipment Corp oration Western Research Labs Febru

Cheriton Zwaenep o el Cheriton D R and Zwaenep o el

ary

W The Distributed V Kernel and its Performance for Disk

BernersLee et al BernersLee T Cailliau R Luotonen less Workstations In Proceedings of the Ninth ACM Sym

A Nielsen H F and Secretr A The WorldWide Web posium on Operating Systems Principles pages

Communications of the ACM August Bretton Wo o ds NH Octob er

Colwell Colwell R The Performance Eects of Func Golub et al Golub D Dean R Forin A and Rashid

tional Migration and Architectural ComplexityinObject R Unix as an Application Program In Proceedings of

Oriented Systems Technical Rep ort CMUCS the Summer USENIX Conference pages June

Carnegie Mellon University August

Co op er Draves Co op er E C and Draves R P C

Hamilton Kougiouris Hamilton G and Kougiouris P

Threads Technical Rep ort CMUCS Carnegie Mel

The Spring Nucleus A Microkernel for Ob jects In Pro

lon University June

ceedings of the Summer USENIX Conferencepages

Cincinnati OH June

Co op er et al Co op er E Harp er R and Lee P The

Fox Pro ject Advanced Development of Systems Software

Harty Cheriton Harty

Technical Rep ort CMUCS Carnegie Mellon Uni

K and Cheriton D R ApplicationControlled Physical

versity August

Memory using External PageCache Management In Pro

ceedings of the Fourth International ConferenceonArchi

Davis et al Davis PB McNamee D Vaswani R and

tectural Support for Programming Languages and Operat

Lazowska E Adding Scheduler Activations to Mach

ing Systems ASPLOSIV pages Santa Clara

In Proceedings of the Third USENIX Mach Symposium

CA April

pages Santa Fe NM April

Heidemann Pop ek Heidemann J and Pop ek G File

Dig Digital Equipment Corp oration DEC OSF Writing

System Development with Stackable Layers Communi

Device Drivers AdvancedTopics

cations of the ACM February

Draves Draves R The Case for RunTime Replaceable Ker

Hildebrand Hildebrand D An Architectural Overview of

nel Mo dules In Proceedings of the Fourth Workshop on

QNX In Proceedings of the USENIX Workshop on Micro

Workstation Operating Systems pages Napa CA

Kernels and Other Kernel Architectures pages

Octob er

Seattle WA April

Draves Draves R P Control Transfer in Op erating System

Kernels Technical Rep ort CMUCS Carnegie Mel

Hutchinson et al Hutchinson N C Peterson L Abb ott

M B and OMalley S RPC in xkernel Evaluating New lon UniversityMay

Design Techniques In Proceedings of the Thirteenth ACM

Draves et al Draves R P Bershad B N Rashid R F

Symposium on Operating Systems Principles pages

and Dean R W Using Continuations to Implement

Litcheld Park AZ Decemb er

Thread Management and Communication in Op erating

Systems In Proceedings of the Thirteenth ACM Sympo

IntIntel Corp oration Introduction to the iAPX Archi

sium on Operating Systems Principles pages Pa

tecture

cic Grove CA Octob er

IntIntel Corp oration i Programmers

Engler Kaasho ek Engler D and Kaasho ek M F Exter

Reference Manual

minate All Op erating System Abstractions In Proceedings

Khalidi Nelson Khalidi Y A and Nelson M An Im

of the Fifth Workshop on Hot Topics in Operating Systems

plementation of UNIX on an Ob jectOriented Op erating

pages Orcas Island WA May

System In Proceedings of the Winter USENIX Con

Engler Pro ebsting Engler D R and Pro ebsting T A

ference pages San Diego CA January

DCG An Ecient Retargettable Dynamic Co de Genera

Lazowska et al Lazowska E D Levy H M Almes G T

tion System In Proceedings of the Sixth International Con

Fischer M Fowler R and Vestal S The Architecture

ferenceonArchitectural Support for Programming Lan

of the Eden System In Proceedings of the Eighth ACM

guages and Operating Systems ASPLOSVI pages

Symposium on Operating Systems Principles pages

San Jose CA Octob er

December

Engler et al Engler D Kaasho ek M F and OToole J

The Op erating System Kernel as a Secure Programmable Lee et al Lee C H Chen M C and Chang R C HiPEC

oceedings of the ACM European Machine In Pr High Performance External Virtual Memory Caching In

SIGOPS Workshop Septemb er Proceedings of the First USENIX Symposium on Operating

Systems Design and Implementation OSDI pages

Engler et al Engler D R Kaasho ek M F and Jr

Monterey CA Novemb er

J O An Op erating System Architecture for

ApplicationLevel Resource Management In Proceedings

Liedtke Liedtke J Fast Thread Management and Com

of the Fifteenth ACM Symposium on Operating Systems

munication Without Continuations In Proceedings of the

Principles Copp er Mountain CO December

USENIX Workshop on MicroKernels and Other Kernel

Architectures pages Seattle WA April

Fall Pasquale Fall K and Pasquale J Improving

ContinuousMedia PlaybackPerformance with InKernel

Liedtke Liedtke J Improving IPC by Kernel Design In

Data Paths In Proceedings of the First IEEE Interna

Proceedings of the Fourteenth ACM Symposium on Oper

tional Conference on Multimedia Computing and Systems

ating Systems Principles pages Asheville NC

pages Boston MA May

December

Felten Felten E W The Case for ApplicationSp ecic Com

Lucco Lucco S HighPerformance Microkernel Systems In

munication Proto cols In Intel Supercomputer Systems

Proceedings of the First USENIX Symposium on Operat

Technology Focus Conference pages April

ing Systems Design and Implementation OSDI page

MontereyCANovemb er

Fiuczynski Bershad Fiuczynski M and Bershad B An

Extensible Proto col Architecture for ApplicationSpecic

Maeda Bershad Maeda C and Bershad B N Proto col

Networking In Proceedings of the Winter USENIX

Service Decomp osition for HighPerformance Networking

Conferenc e San Diego CA January

In Proceedings of the Fourteenth ACM Symposium on Op

erating Systems Principles pages Asheville NC

Forin et al Forin A Golub D and Bershad B N An

December

IO System for MachInProceedings of the Second

USENIX Mach Symposium pages MontereyCA

Marsh et al Marsh B Scott M LeBlanc T and

November

Markatos E FirstClass UserLevel Threads In Proceed

Geschke et al Geschke C Morris J and Satterthwaite ings of the Thirteenth ACM Symposium on Operating Sys

E Early Exp eriences with Mesa Communications of the tems Principles pages Pacic Grove CA Octo

ACM August ber

Stevenson Julin Stevenson J M and Julin D PMach McNamee Armstrong McNamee D and Armstrong K

US Unix On Generic OS Ob ject Servers In Proceedings of Extending the Mach External Pager Interface to Accommo

the Winter USENIX Conference New Orleans LA date UserLevel Page ReplacementPolicies In Proceedings

January of the USENIX Mach Symposium pages Burling

ton VT Octob er

Sto dolsky et al Sto dolsky D Bershad B N and Chen B

Fast Interrupt Priorit y Management for Op erating System

Mogul et al Mogul J Rashid R and Accetta M The

Kernels In Proceedings of the Second USENIX Workshop

Packet Filter An EcientMechanism for Userlevel Net

on Microkernels and Other Kernel Architecturespages

work Co de In Proceedings of the Eleventh ACM Sym

San Diego CA Septemb er

posium on Operating Systems Principles pages

Austin TX Novemb er

Stonebraker Stonebraker M Op erating System Supp ort

for Database Management Communications of the ACM

Mossenbock Mossenbock H Extensibility in the Ob eron

July

System Nordic Journal of Computing Febru

ary

Thacker et al Thacker C P Stewart L C and Satterth

waite Jr E H Firey a Multipro cessor Workstation

Mullender et al Mullender S J Rossum G V Tanen

IEEE Transactions on August

baum A S Renesse R V and van Staveren H Amo eba

A Distributed Op erating System for the s IEEE

Computer pages May

Thekkath Levy Thekkath C A and Levy H M Limits

to LowLatency RPC ACM Transactions on Computer

Nelson Nelson G editor System Programming in Modula

Systems May

Prentice Hall

Thekkath Levy Thekkath C A and Levy H M Hard

Organick Organick E editor Computer System Organiza

ware and Software Supp ort for Ecient Exception Han

trion The BB Series Academic Press

dling In Proceedings of the Sixth International Confer

enceonArchitectural Support for Programming Languages

Pardyak Bershad Pardyak P and Bershad B A Group

and Operating Systems ASPLOSVI pages San

Structuring Mechanism for a Distributed Ob ject Oriented

Jose CA Octob er

Language Ob jects In Proce edings of the Fourteenth Inter

national Conference on Distributed Computing Systems

von Eicken et al von Eicken T Culler D E Goldstein

pages Poznan Poland June

S C and Schauser K E Active Messages A Mecha

nism for Integrated Communication and Computation In

Rashid et al Rashid R Tevanian Jr A Young M

Proceedings of the Nineteenth International Symposium on

Golub D Baron R Black D Bolosky W and Chew

Computer Architecture pages Gold Coast Aus

J MachineIndependent Virtual Memory Managementfor

tralia May

Paged Unipro cessor and Multipro cessor Architectures In

Proceedings of the Second International ConferenceonAr

Wahb e et al Wahb e R Lucco S Anderson T E and

chitectural Support for Programming Languages and Oper

Graham S L Ecient SoftwareBased Fault Isolation In

ating Systems ASPLOSII pages Palo Alto CA

Proceedings of the Fourteenth ACM Symposium on Oper

April

ating Systems Principles pages Asheville NC

December

Redell Redell D Exp erience with Topaz TeledebuggingIn

Proceedings of the ACM SIGPLAN and SIGOPS Work

Waldspurger Weihl Waldspurger C A and Weihl W E

shop on Paral lel and Distributed Debugging Octob er

Lottery Scheduling Flexible Prop ortionalShare Resource

Management In Proceedings of the First USENIX Sym

Redell et al Redell D D Dalal Y K HorsleyTR

posium on Operating Systems Design and Implementation

Lauer H C Lynch W C McJones P R MurrayHG

OSDI pages MontereyCANovemb er

and Purcell S C Pilot An Op erating System for a Per

sonal Computer Communications of the ACM

Wheeler Bershad Wheeler B and Bershad B N Consis

February

tency Management for Virtually Indexed Caches In Pro

ceedings of the Fifth International ConferenceonArchitec

Romer et al Romer T H Lee D and Bershad B N Dy

tural Support for Programming Languages and Oper ating

namic Page Mapping Policies for Cache Conict Resolu

Systems ASPLOSV pages Boston MA Octo

tion on Standard Hardware In Proceedings of the First

ber

USENIX Symposium on Operating Systems Design and

Wulf et al Wulf W A Levin R and Harbison S P

Implementation OSDI pages MontereyCA

HydraCmmp AnExperimental Computer System

November

McGrawHill

Romer et al Romer T Ohlrich W Karlin A and Ber

Young et al Young M Tevanian A Rashid R Golub

shad B Reducing TLB and Memory Overhead Using On

D Eppinger J Chew J Bolosky W Black D and

line Sup erpage Promotion In Proceedings of the Twenty

Baron R The Duality of Memory and Communication

Third International Symposium on

in the Implementation of a Multipro cessor Op erating Sys

pages

tem In Proceedings of the Eleventh ACM Symposium on

Rozier et al Rozier M Abrossimov V Armand F Boule

Operating Systems Principles pages Austin TX

I Giend M Guillemont M Herrmann F Leonard

Novemb er

P Langlois S and Neuhauser W The Chorus Dis

Yuhara et al Yuhara M Bershad B N Maeda C and

tributed Op erating System Computing Systems

Moss J E B EcientPacket Demultiplexing for Mul

tiple Endp oints and Large Messages In Proceedings of

Schro eder Burrows Schro eder M D and Burrows M

the Winter USENIX Conference pages San

Performance of Firey RPC ACM Transactions on Com

Francisco CA January

puter Systems February

Schulman et al Schulman A Maxey D and Pietrek M

Undocumented Windows AddisonWesley

Small Seltzer Small C and Seltzer M VINO An Inte

grated Platform for Op erating System and Database Re

search Technical Rep ort TR Harvard University