Efficient Software-Based Fault Isolation

Robert Wahbe    Steven Lucco    Thomas E. Anderson    Susan L. Graham

Computer Science Division
University of California
Berkeley, CA

To appear in the Proceedings of the Symposium on Operating System Principles.
Email: {rwahbe, lucco, tea, graham}@cs.berkeley.edu

This work was supported in part by the National Science Foundation, the Defense Advanced Research Projects Agency (DARPA) under grant and contract awards, the Digital Equipment Corporation (the Systems Research Center and the External Research Program), and the AT&T Foundation. Anderson was also supported by a National Science Foundation Young Investigator Award. The content of the paper does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.

Abstract

One way to provide fault isolation among cooperating software modules is to place each in its own address space. However, for tightly-coupled modules, this solution incurs prohibitive context switch overhead. In this paper, we present a software approach to implementing fault isolation within a single address space. Our approach has two parts. First, we load the code and data for a distrusted module into its own fault domain, a logically separate portion of the application's address space. Second, we modify the object code of a distrusted module to prevent it from writing or jumping to an address outside its fault domain. Both these software operations are portable and programming language independent.

Our approach poses a tradeoff relative to hardware fault isolation: substantially faster communication between fault domains, at a cost of slightly increased execution time for distrusted modules. We demonstrate that for frequently communicating modules, implementing fault isolation in software rather than hardware can substantially improve end-to-end application performance.

1 Introduction

Application programs often achieve extensibility by incorporating independently developed software modules. However, faults in extension code can render a software system unreliable, or even dangerous, since such faults could corrupt permanent data. To increase the reliability of these applications, an operating system can provide services that prevent faults in distrusted modules from corrupting application data. Such fault isolation services also facilitate software development by helping to identify sources of system failure.

For example, the postgres database manager includes an extensible type system [Sto]. Using this facility, postgres queries can refer to general-purpose code that defines constructors, destructors, and predicates for user-defined data types such as geometric objects. Without fault isolation, any query that uses extension code could interfere with an unrelated query or corrupt the database.

Similarly, recent operating system research has focused on making it easier for third party vendors to enhance parts of the operating system. An example is micro-kernel design: parts of the operating system are implemented as user-level servers that can be easily modified or replaced. More generally, several systems have added extension code into the operating system, for example the BSD network packet filter [MRA, MJ], application-specific virtual memory management [HC], and Active Messages [vCGS]. Among industry systems, Microsoft's Object Linking and Embedding system [Cla] can link together independently developed software modules. Also, the Quark Xpress desktop publishing system [Dys] is structured to support incorporation of general-purpose third party code. As with postgres, faults in extension modules can render any of these systems unreliable.

One way to provide fault isolation among cooperating software modules is to place each in its own address space. Using Remote Procedure Call (RPC) [BN], modules in separate address spaces can call into each other through a normal procedure call interface. Hardware page tables prevent the code in one address space from corrupting the contents of another.

Unfortunately, there is a high performance cost to providing fault isolation through separate address spaces. Transferring control across protection boundaries is expensive, and does not necessarily scale with improvements in a processor's integer performance [ALBL]. A cross-address-space RPC requires at least: a trap into the operating system kernel, copying each argument from the caller to the callee, saving and restoring registers, switching hardware address spaces (on many machines, flushing the translation lookaside buffer), and a trap back to user level. These operations must be repeated upon RPC return. The execution time overhead of an RPC, even with a highly optimized implementation, will often be two to three orders of magnitude greater than the execution time overhead of a normal procedure call [BALL, ALBL].

The goal of our work is to make fault isolation cheap enough that system developers can ignore its performance effect in choosing which modules to place in separate fault domains. In many cases where fault isolation would be useful, cross-domain procedure calls are frequent, yet involve only a moderate amount of computation per call. In this situation it is impractical to isolate each logically separate module within its own address space, because of the cost of crossing hardware protection boundaries.

We propose a software approach to implementing fault isolation within a single address space. Our approach has two parts. First, we load the code and data for a distrusted module into its own fault domain, a logically separate portion of the application's address space. A fault domain, in addition to comprising a contiguous region of memory within an address space, has a unique identifier which is used to control its access to process resources such as file descriptors. Second, we modify the object code of a distrusted module to prevent it from writing or jumping to an address outside its fault domain. Program modules isolated in separate software-enforced fault domains can not modify each other's data or execute each other's code except through an explicit cross-fault-domain RPC interface.

We have identified several programming-language-independent transformation strategies that can render object code unable to escape its own code and data segments. In this paper we concentrate on a simple transformation technique, called sandboxing, that only slightly increases the execution time of the modified object code. We also investigate techniques that provide more debugging information, but which incur greater execution time overhead.

Our approach poses a tradeoff relative to hardware-based fault isolation. Because we eliminate the need to cross hardware boundaries, we can offer substantially lower-cost RPC between fault domains. A safe RPC in our prototype implementation, measured on a DECstation and on a DEC Alpha, is more than an order of magnitude faster than any existing RPC system. This reduction in RPC time comes at a cost of slightly increased distrusted module execution time. On a test suite including the C spec benchmarks, sandboxing incurs only a modest average execution time overhead on both the DECstation and the Alpha.

Software-enforced fault isolation may seem counterintuitive: we are slowing down the common case (normal execution) to speed up the uncommon case (cross-domain communication). But for frequently communicating fault domains our approach can offer substantially better end-to-end performance. To demonstrate this, we applied software-enforced fault isolation to the postgres database system running the Sequoia benchmark. The benchmark makes use of the postgres extensible data type system to define geometric operators. For this benchmark, the software approach reduced fault isolation overhead by more than a factor of three on a DECstation.

A software approach also provides a tradeoff between performance and level of distrust. If some modules in a program are trusted while others are distrusted, as may be the case with extension code, only the distrusted modules incur any execution time overhead; code in trusted domains can run at full speed. Similarly, it is possible to use our techniques to implement full security, preventing distrusted code from even reading data outside of its domain, at a cost of higher execution time overhead. We quantify this effect in Section 5.

The remainder of the paper is organized as follows. Section 2 provides some examples of systems that require frequent communication between fault domains. Section 3 outlines how we modify object code to prevent it from generating illegal addresses. Section 4 describes how we implement low latency cross-fault-domain RPC. Section 5 presents performance results for our prototype, and finally Section 6 discusses some related work.

2 Background

In this section we characterize in more detail the type of application that can benefit from software-enforced fault isolation. We defer further description of the postgres extensible type system until Section 5.3, which gives performance measurements for this application.

The operating systems community has focused considerable attention on supporting kernel extensibility. For example, the UNIX vnode interface is designed to make it easy to add a new file system into UNIX [Kle]. Unfortunately, it is too expensive to forward every file system operation to user level, so typically new file system implementations are added directly into the kernel. The Andrew file system is largely implemented at user level, but it maintains a kernel cache for performance [HKM]. Epoch's tertiary storage file system [Web] is one example of operating system kernel code developed by a third party vendor.

Another example is user-programmable high performance I/O systems. If data is arriving on an I/O channel at a high enough rate, performance will be degraded substantially if control has to be transferred to user level to manipulate the incoming data [FP]. Similarly, Active Messages provide high performance message handling in distributed-memory multiprocessors [vCGS]. Typically, the message handlers are application-specific, but unless the network controller can be accessed from user level [Thi], the message handlers must be compiled into the kernel for reasonable performance.

A user-level example is the Quark Xpress desktop publishing system. One can purchase third party software that will extend this system to perform functions unforeseen by its original designers [Dys]. At the same time, this extensibility has caused Quark a number of problems. Because of the lack of efficient fault domains on the personal computers where Quark Xpress runs, extension modules can corrupt Quark's internal data structures. Hence, bugs in third party code can make the Quark system appear unreliable, because end-users do not distinguish among sources of system failure.

All these examples share two characteristics. First, using hardware fault isolation would result in a significant portion of the overall execution time being spent in operating system context switch code. Second, only a small amount of code is distrusted; most of the execution time is spent in trusted code. In this situation, software fault isolation is likely to be more efficient than hardware fault isolation, because it sharply reduces the time spent crossing fault domain boundaries while only slightly increasing the time spent executing the distrusted part of the application. Section 5.4 quantifies this tradeoff between domain-crossing overhead and application execution time overhead, and demonstrates that even if domain-crossing overhead represents a modest proportion of the total application execution time, software-enforced fault isolation is cost effective.

3 Software-Enforced Fault Isolation

In this section we outline several software encapsulation techniques for transforming a distrusted module so that it can not escape its fault domain. We first describe a technique that allows users to pinpoint the location of faults within a software module. Next, we introduce a technique called sandboxing that can isolate a distrusted module while only slightly increasing its execution time; Section 5.1 provides a performance analysis of this technique. Finally, we present a software encapsulation technique that allows cooperating fault domains to share memory. The remainder of this discussion assumes we are operating on a RISC load/store architecture, although our techniques could be extended to handle CISCs. Section 4 describes how we implement safe and efficient cross-fault-domain RPC.

We divide an application's virtual address space into segments, aligned so that all virtual addresses within a segment share a unique pattern of upper bits, called the segment identifier. A fault domain consists of two segments: one for a distrusted module's code, the other for its static data, heap, and stack. The specific segment addresses are determined at load time. (Our system supports dynamic linking through a special interface.)

Software encapsulation transforms a distrusted module's object code so that it can jump only to targets in its code segment, and write only to addresses within its data segment. Hence, all legal jump targets in the distrusted module have the same upper bit pattern (segment identifier); similarly, all legal data addresses generated by the distrusted module share the same segment identifier. Separate code and data segments are necessary to prevent a module from modifying its code segment. It is possible for an address with the correct segment identifier to be illegal, for instance if it refers to an unmapped page. This is caught by the normal operating system page fault mechanism.

3.1 Segment Matching

An unsafe instruction is any instruction that jumps to, or stores to, an address that can not be statically verified to be within the correct segment. Most control transfer instructions, such as program-counter-relative branches, can be statically verified. Stores to static variables often use an immediate addressing mode and can be statically verified. However, jumps through registers, most commonly used to implement procedure returns, and stores that use a register to hold their target address, can not be statically verified.

A straightforward approach to preventing the use of illegal addresses is to insert checking code before every unsafe instruction. The checking code determines whether the unsafe instruction's target address has the correct segment identifier. If the check fails, the inserted code will trap to a system error routine outside the distrusted module's fault domain. We call this software encapsulation technique segment matching.

On typical RISC architectures, segment matching requires four instructions. Figure 1 lists a pseudo-code fragment for segment matching. The first instruction in this fragment moves the store target address into a dedicated register. Dedicated registers are used only by inserted code and are never modified by code in the distrusted module. They are necessary because code elsewhere in the distrusted module may arrange to jump directly to the unsafe store instruction, bypassing the inserted check. Hence, we transform all unsafe store and jump instructions to use a dedicated register.

    dedicated-reg <- target address          ; move target address into dedicated register
    scratch-reg <- dedicated-reg >> shift-reg
                                             ; right-shift address to get segment identifier
                                             ; scratch-reg is not a dedicated register
                                             ; shift-reg is a dedicated register
    compare scratch-reg and segment-reg      ; segment-reg is a dedicated register
    trap if not equal                        ; trap if store address is outside of segment
    store instruction uses dedicated-reg

Figure 1: Assembly pseudo-code for segment matching.
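The check that segment matching inserts can also be written out in C. The sketch below is illustrative only: it assumes 32-bit addresses, an invented segment layout (SEG_SHIFT and the data segment identifier are made-up parameters, not values from the paper), and it treats addresses as word indices into a toy array standing in for the data segment.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Invented segment layout: the upper 16 bits of a 32-bit address are the
     * segment identifier, the lower 16 bits are the offset within the segment. */
    #define SEG_SHIFT       16u
    #define DATA_SEGMENT_ID 0x4242u

    /* Toy data segment: addresses are treated as word indices for simplicity. */
    static uint32_t data_segment[1u << SEG_SHIFT];

    /* Segment matching: before an unsafe store, extract the segment identifier
     * from the target address and trap if it does not match the fault domain's
     * data segment. */
    static void checked_store(uint32_t target_addr, uint32_t value)
    {
        uint32_t seg = target_addr >> SEG_SHIFT;   /* scratch-reg in Figure 1 */
        if (seg != DATA_SEGMENT_ID) {              /* compare and trap        */
            fprintf(stderr, "segment matching trap: illegal store to 0x%08x\n",
                    target_addr);
            exit(1);
        }
        data_segment[target_addr & ((1u << SEG_SHIFT) - 1u)] = value;  /* the store */
    }

    int main(void)
    {
        checked_store((DATA_SEGMENT_ID << SEG_SHIFT) | 0x10u, 7u);  /* legal      */
        checked_store(0x00000010u, 7u);                             /* traps here */
        return 0;
    }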

All the software encapsulation techniques presented in this paper require dedicated registers. Segment matching requires four dedicated registers: one to hold addresses in the code segment, one to hold addresses in the data segment, one to hold the segment shift amount, and one to hold the segment identifier. (For architectures with limited register sets, such as the Intel x86, it is possible to encapsulate a module using no reserved registers by restricting control flow within a fault domain.) Using dedicated registers may have an impact on the execution time of the distrusted module. However, since most modern RISC architectures, including the MIPS and Alpha, have at least 32 registers, we can retarget the compiler to use a smaller register set with minimal performance impact. For example, Section 5.1 shows that on the DECstation, reducing by five registers the register set available to a C compiler (gcc) did not have a significant effect on the average execution time of the spec benchmarks.

3.2 Address Sandboxing

The segment matching technique has the advantage that it can pinpoint the offending instruction. This capability is useful during software development. We can reduce run-time overhead still further, at the cost of providing no information about the source of faults. Before each unsafe instruction we simply insert code that sets the upper bits of the target address to the correct segment identifier. We call this sandboxing the address. Sandboxing does not catch illegal addresses; it merely prevents them from affecting any fault domain other than the one generating the address.

Address sandboxing requires insertion of two arithmetic instructions before each unsafe store or jump instruction. The first inserted instruction clears the segment identifier bits and stores the result in a dedicated register. The second instruction sets the segment identifier to the correct value. Figure 2 lists the pseudo-code to perform this operation. As with segment matching, we modify the unsafe store or jump instruction to use the dedicated register. Since we are using a dedicated register, the distrusted module code can not produce an illegal address even by jumping to the second instruction in the sandboxing sequence: since the upper bits of the dedicated register will already contain the correct segment identifier, this second instruction will have no effect. Section 3.6 presents a simple algorithm that can verify that an object code module has been correctly sandboxed.

Address sandboxing requires five dedicated registers. One register is used to hold the segment mask, two registers are used to hold the code and data segment identifiers, and two are used to hold the sandboxed code and data addresses.

    dedicated-reg <- target-reg & and-mask-reg
                                             ; use dedicated register and-mask-reg
                                             ; to clear segment identifier bits
    dedicated-reg <- dedicated-reg | segment-reg
                                             ; use dedicated register segment-reg
                                             ; to set segment identifier bits
    store instruction uses dedicated-reg

Figure 2: Assembly pseudo-code to sandbox the address in target-reg.
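The two sandboxing operations amount to a mask and an or on the target address. The following minimal C sketch, using the same invented 32-bit segment layout as the earlier example, shows how any address is forced into the fault domain's data segment without any check being made.

    #include <stdint.h>
    #include <stdio.h>

    /* Same invented 32-bit layout as the segment matching sketch. */
    #define SEG_SHIFT       16u
    #define SEG_OFFSET_MASK ((1u << SEG_SHIFT) - 1u)           /* and-mask-reg */
    #define DATA_SEGMENT    (0x4242u << SEG_SHIFT)             /* segment-reg  */

    /* Address sandboxing: force the upper bits of the target address to the
     * data segment identifier.  No check is made; an illegal address is simply
     * redirected to the same offset inside the domain's own segment. */
    static inline uint32_t sandbox(uint32_t target_addr)
    {
        uint32_t a = target_addr & SEG_OFFSET_MASK;   /* clear segment identifier bits */
        return a | DATA_SEGMENT;                      /* set the correct identifier    */
    }

    int main(void)
    {
        uint32_t legal   = DATA_SEGMENT | 0x0010u;
        uint32_t illegal = (0x1111u << SEG_SHIFT) | 0x0010u;

        printf("legal   0x%08x -> 0x%08x\n", legal,   sandbox(legal));
        printf("illegal 0x%08x -> 0x%08x\n", illegal, sandbox(illegal));
        return 0;
    }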

3.3 Optimizations

The overhead of software encapsulation can be reduced by using conventional compiler optimizations. Our current prototype applies loop invariant code motion and instruction scheduling optimizations [ASU, ACD]. In addition to these conventional techniques, we employ a number of optimizations specialized to software encapsulation.

We can reduce the overhead of software encapsulation mechanisms by avoiding arithmetic that computes target addresses. For example, many RISC architectures include a register-plus-offset instruction mode, where the offset is an immediate constant in some limited range; on the MIPS architecture, such offsets are limited to the range of a signed 16-bit immediate. Consider the store instruction store value, offset(reg), whose address reg+offset uses the register-plus-offset addressing mode. Sandboxing this instruction requires three inserted instructions: one to sum reg+offset into the dedicated register, and two sandboxing instructions to set the segment identifier of the dedicated register. Our prototype optimizes this case by sandboxing only the register reg, rather than the actual target address reg+offset, thereby saving an instruction. To support this optimization, the prototype establishes guard zones at the top and bottom of each segment. To create the guard zones, the pages adjacent to the segment are unmapped (see Figure 3).

Figure 3: A segment with guard zones. The size of the guard zones covers the range of possible immediate offsets in register-plus-offset addressing modes.
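A small C sketch of the arithmetic behind this optimization, under assumed parameters (an invented segment base and size, and MIPS-style 16-bit signed immediates): sandboxing only the base register bounds how far a register-plus-offset access can stray from the segment, so unmapped guard zones of that size catch any stray access.

    #include <stdint.h>
    #include <stdio.h>

    /* Invented parameters: a segment of 2^SEG_SHIFT bytes and MIPS-style
     * 16-bit signed immediate offsets. */
    #define SEG_SHIFT  24u
    #define SEG_SIZE   (1u << SEG_SHIFT)
    #define MIN_OFFSET (-32768)
    #define MAX_OFFSET 32767

    int main(void)
    {
        uint32_t seg_base = 0x42u << SEG_SHIFT;   /* hypothetical data segment base */

        /* Sandboxing only the base register keeps it inside the segment, so a
         * register-plus-offset access can reach at most |MIN_OFFSET| bytes below
         * or MAX_OFFSET bytes above the segment.  Unmapped guard zones of that
         * size turn any such stray access into a hardware page fault. */
        uint32_t lowest  = seg_base + (uint32_t)(int32_t)MIN_OFFSET;
        uint32_t highest = seg_base + SEG_SIZE - 1u + (uint32_t)MAX_OFFSET;

        printf("segment:           [0x%08x, 0x%08x)\n", seg_base, seg_base + SEG_SIZE);
        printf("reachable at most: [0x%08x, 0x%08x]\n", lowest, highest);
        printf("guard zone needed: %d bytes on each side\n", -MIN_OFFSET);
        return 0;
    }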

We also reduce run-time overhead by treating the MIPS stack pointer as a dedicated register. We avoid sandboxing the uses of the stack pointer by sandboxing this register whenever it is set. Since uses of the stack pointer to form addresses are much more plentiful than changes to it, this optimization significantly improves performance. Further, we can avoid sandboxing the stack pointer after it is modified by a small constant offset, as long as the modified stack pointer is used as part of a load or store address before the next control transfer instruction. If the modified stack pointer has moved into a guard zone, the load or store instruction using it will cause a hardware address fault. On the DEC Alpha processor, we apply these optimizations to both the frame pointer and the stack pointer.

There are a number of further optimizations that could reduce sandboxing overhead. For example, the transformation tool could remove sandboxing sequences from loops in cases where a store target address changes by only a small constant offset during each loop iteration. Our prototype does not yet implement these optimizations.

3.4 Process Resources

Because multiple fault domains share the same virtual address space, the fault domain implementation must prevent distrusted modules from corrupting resources that are allocated on a per-address-space basis. For example, if a fault domain is allowed to make system calls, it can close or delete files needed by other code executing in the address space, potentially causing the application as a whole to crash.

One solution is to modify the operating system to know about fault domains. On a system call or page fault, the kernel can use the program counter to determine the currently executing fault domain and restrict resources accordingly.

To keep our prototype portable, we implemented an alternative approach. In addition to placing each distrusted module in a separate fault domain, we require distrusted modules to access system resources only through cross-fault-domain RPC. We reserve a fault domain to hold trusted arbitration code that determines whether a particular system call performed by some other fault domain is safe. If a distrusted module's object code performs a direct system call, we transform this call into the appropriate RPC call. In the case of an extensible application, the trusted portion of the application can make system calls directly, and shares a fault domain with the arbitration code.
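The arbitration code itself is not described in detail here; the sketch below is a hypothetical illustration of the kind of per-domain policy check such a trusted fault domain could apply before forwarding a system call (here, close on a file descriptor). The policy table and function names are invented.

    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical per-domain policy: each fault domain may only operate on
     * file descriptors it has been granted. */
    #define MAX_DOMAINS 4
    #define MAX_FDS     16

    static bool fd_granted[MAX_DOMAINS][MAX_FDS];

    static void grant_fd(int domain, int fd)
    {
        fd_granted[domain][fd] = true;
    }

    /* Reached via cross-fault-domain RPC in place of a direct close() call. */
    static int arbitrated_close(int domain, int fd)
    {
        if (domain < 0 || domain >= MAX_DOMAINS || fd < 0 || fd >= MAX_FDS ||
            !fd_granted[domain][fd]) {
            fprintf(stderr, "domain %d: close(%d) denied\n", domain, fd);
            return -1;
        }
        fd_granted[domain][fd] = false;
        return 0;   /* a real arbitrator would now perform close(fd) itself */
    }

    int main(void)
    {
        grant_fd(1, 5);
        arbitrated_close(1, 5);   /* allowed                                  */
        arbitrated_close(2, 5);   /* denied: fd 5 was not granted to domain 2 */
        return 0;
    }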

3.5 Data Sharing

Hardware fault isolation mechanisms can support data sharing among virtual address spaces by manipulating page table entries. Fault domains share an address space, and hence a set of page table entries, so they can not use a standard shared memory implementation. Read-only sharing is straightforward: since our software encapsulation techniques do not alter load instructions, fault domains can read any memory mapped in the application's address space. (We have implemented versions of these techniques that perform general protection by encapsulating load instructions as well as store and jump instructions. We discuss the performance of these variants in Section 5.1.)

If the object code in a particular distrusted module has been sandboxed, then it can share read-write memory with other fault domains through a technique we call lazy pointer swizzling. Lazy pointer swizzling provides a mechanism for fault domains to share arbitrarily many read-write memory regions with no additional run-time overhead. To support this technique, we modify the hardware page tables to map the shared memory region into every address space segment that needs access; the region is mapped at the same offset in each segment. In other words, we alias the shared region into multiple locations in the virtual address space, but each aliased location has exactly the same low-order address bits. As with hardware shared memory schemes, each shared region must have a different segment offset.

To avoid incorrect shared pointer comparisons in sandboxed code, the shared memory creation interface must ensure that each shared object is given a unique address. As the distrusted object code accesses shared memory, the sandboxing code automatically translates shared addresses into the corresponding addresses within the fault domain's data segment. This translation works exactly like hardware translation: the low bits of the address remain the same, and the high bits are set to the data segment identifier.

Under operating systems that do not allow virtual address aliasing, we can implement shared regions by introducing a new software encapsulation technique: shared segment matching. To implement sharing, we use a dedicated register to hold a bitmap. The bitmap indicates which segments the fault domain can access. For each unsafe instruction checked, shared segment matching requires one more instruction than segment matching.
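A user-level approximation of the aliasing idea, assuming a POSIX system: the scheme described above manipulates hardware page tables directly, but mapping one shared object twice with mmap shows the same effect, namely that a write through one alias is visible through the other. A real implementation would additionally place each alias at the same offset within its segment (for example with MAP_FIXED) so that only the upper address bits differ; this sketch lets the operating system choose the addresses.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        size_t len = (size_t)sysconf(_SC_PAGESIZE);

        /* A temporary file stands in for a shared memory object. */
        FILE *f = tmpfile();
        if (f == NULL || ftruncate(fileno(f), (off_t)len) != 0) {
            perror("shared object");
            return 1;
        }

        /* Map the same object twice: two aliases of one shared region. */
        char *alias_a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fileno(f), 0);
        char *alias_b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fileno(f), 0);
        if (alias_a == MAP_FAILED || alias_b == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        strcpy(alias_a, "written through one fault domain's mapping");
        printf("alias_a = %p, alias_b = %p\n", (void *)alias_a, (void *)alias_b);
        printf("read through the other mapping: %s\n", alias_b);
        return 0;
    }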

3.6 Implementation and Verification

We have identified two strategies for implementing software encapsulation. One approach uses a compiler to emit encapsulated object code for a distrusted module; the integrity of this code is then verified when the module is loaded into a fault domain. Alternatively, the system can encapsulate the distrusted module by directly modifying its object code at load time.

Our current prototype uses the first approach. We modified a version of the gcc compiler to perform software encapsulation. Note that while our current implementation is language dependent, our techniques are language independent.

We built a verifier for the MIPS instruction set that works for both sandboxing and segment matching. The main challenge in verification is that, in the presence of indirect jumps, execution may begin on any instruction in the code segment. To address this situation, the verifier uses a property of our software encapsulation techniques: all unsafe stores and jumps use a dedicated register to form their target address. The verifier divides the program into sequences of instructions called unsafe regions. An unsafe store region begins with any modification to a dedicated store register. An unsafe jump region begins with any modification to a dedicated jump register. If the first instruction in an unsafe store or jump region is executed, all subsequent instructions are guaranteed to be executed. An unsafe store region ends when one of the following holds: the next instruction is a store which uses a dedicated register to form its target address; the next instruction is a control transfer instruction; the next instruction is not guaranteed to be executed; or there are no more instructions in the code segment. A similar definition is used for unsafe jump regions. The verifier analyzes each unsafe store or jump region to insure that any dedicated register modified in the region is valid upon exit of the region. For example, a load to a dedicated register begins an unsafe region. If the region appropriately sandboxes the dedicated register, the unsafe region is deemed safe. If an unsafe region can not be verified, the code is rejected.

By incorporating software encapsulation into an existing compiler, we are able to take advantage of compiler infrastructure for code optimization. However, this approach has two disadvantages. First, most modified compilers will support only one programming language (gcc supports C, C++, and Pascal). Second, the compiler and verifier must be synchronized with respect to the particular encapsulation technique being employed.

An alternative, called binary patching, alleviates these problems. When the fault domain is loaded, the system can encapsulate the module by directly modifying the object code. Unfortunately, practical and robust binary patching resulting in efficient code is not currently possible [LB]. Tools which translate one binary format to another have been built, but these tools rely on compiler-specific idioms to distinguish code from data, and use processor emulation to handle unknown indirect jumps [SCK]. For software encapsulation, the main challenge is to transform the code so that it uses a subset of the registers, leaving registers available for dedicated use. To solve this problem, we are working on a binary patching prototype that uses simple extensions to current object file formats. The extensions store control flow and register usage information that is sufficient to support software encapsulation.
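The following toy verifier captures the spirit of the unsafe-region check for a made-up, drastically simplified instruction set: SET_ADDR stands for any instruction that modifies the dedicated store register, SANDBOX for the inserted sandboxing sequence, STORE for a store through that register, and BRANCH for any control transfer. It tracks only the store register and ignores branch targets, so it is a sketch of the idea rather than the MIPS verifier described above.

    #include <stdbool.h>
    #include <stdio.h>

    enum op { SET_ADDR, SANDBOX, STORE, BRANCH, OTHER };

    static bool verify(const enum op *code, int n)
    {
        bool dirty = false;   /* dedicated register modified but not yet sandboxed */
        for (int i = 0; i < n; i++) {
            switch (code[i]) {
            case SET_ADDR: dirty = true;  break;   /* begins an unsafe region  */
            case SANDBOX:  dirty = false; break;   /* makes the register valid */
            case STORE:                             /* region must be closed    */
            case BRANCH:
                if (dirty) return false;
                break;
            case OTHER:    break;
            }
        }
        return !dirty;        /* no unsafe region may remain open at the end */
    }

    int main(void)
    {
        enum op good[] = { SET_ADDR, SANDBOX, STORE, OTHER, BRANCH };
        enum op bad[]  = { SET_ADDR, OTHER, STORE };  /* store reached with a dirty register */
        printf("good module: %s\n", verify(good, 5) ? "accepted" : "rejected");
        printf("bad module:  %s\n", verify(bad, 3)  ? "accepted" : "rejected");
        return 0;
    }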

4 Low Latency Cross Fault Domain Communication

The purpose of this work is to reduce the cost of fault isolation for cooperating but distrustful software modules. In the last section we presented one half of our solution: efficient software encapsulation. In this section we describe the other half: fast communication across fault domains.

Figure 4 illustrates the major components of a cross-fault-domain RPC between a trusted and a distrusted fault domain. This section concentrates on three aspects of fault domain crossing. First, we describe a simple mechanism which allows a fault domain to safely call a trusted stub routine outside its domain; that stub routine then safely calls into the destination domain. Second, we discuss how arguments are efficiently passed among fault domains. Third, we detail how registers and other machine state are managed on cross-fault-domain RPCs to insure fault isolation. The protocol for exporting and naming procedures among fault domains is independent of our techniques.

Figure 4: Major components of a cross-fault-domain RPC.

The only way for control to escape a fault domain is via a jump table. Each jump table entry is a control transfer instruction whose target address is a legal entry point outside the domain. By using instructions whose target address is an immediate encoded in the instruction, the jump table does not rely on the use of a dedicated register. Because the table is kept in the read-only code segment, it can only be modified by a trusted module.
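A rough user-level stand-in for the jump table, written in C: the real table consists of control transfer instructions with immediate targets kept in the read-only code segment, which a const array of function pointers only approximates. The stub names and slot layout below are hypothetical.

    #include <stdio.h>

    static void rpc_stub_read(void)  { puts("trusted stub: read");  }
    static void rpc_stub_write(void) { puts("trusted stub: write"); }

    typedef void (*entry_t)(void);

    /* const: only a trusted module may change the set of legal exits. */
    static const entry_t jump_table[] = {
        rpc_stub_read,
        rpc_stub_write,
    };

    /* Distrusted code escapes its domain only by naming a table slot. */
    static void cross_domain_call(unsigned slot)
    {
        if (slot < sizeof jump_table / sizeof jump_table[0])
            jump_table[slot]();
    }

    int main(void)
    {
        cross_domain_call(0);
        cross_domain_call(1);
        cross_domain_call(7);   /* out of range: refused in this sketch */
        return 0;
    }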

For each pair of fault domains, a customized call and return stub is created for each exported procedure. Currently, the stubs are generated by hand rather than using a stub generator [JRT]. The stubs run unprotected, outside of both the caller and callee domain. The stubs are responsible for copying cross-domain arguments between domains and for managing machine state.

Because the stubs are trusted, we are able to copy call arguments directly to the target domain. Traditional RPC implementations across address spaces typically perform three copies to transfer data: the arguments are marshalled into a message, the kernel copies the message to the target address space, and finally the callee must demarshall the arguments. By having the caller and callee communicate via a shared buffer, LRPC also uses only a single copy to pass data between domains [BALL].

The stubs are also responsible for managing machine state. On each cross-domain call, any registers that are both used in the future by the caller and potentially modified by the callee must be protected. Only registers that are designated by architectural convention to be preserved across procedure calls are saved. As an optimization, if the callee domain contains no instructions that modify a preserved register, we can avoid saving it. Karger uses a trusted linker to perform this kind of optimization between address spaces [Kar]. In addition to saving and restoring registers, the stubs must switch the execution stack, establish the correct register context for the software encapsulation technique being used, and validate all dedicated registers.

Our system must also be robust in the presence of fatal errors, for example an addressing violation while executing in a fault domain. Our current implementation uses the UNIX signal facility to catch these errors; it then terminates the outstanding call and notifies the caller's fault domain. If the application uses the same operating system thread for all fault domains, there must be a way to terminate a call that is taking too long, for example because of an infinite loop. Trusted modules may use a timer facility to interrupt execution periodically and determine if a call needs to be terminated.

5 Performance Results

To evaluate the performance of software-enforced fault domains, we implemented and measured a prototype of our system on a DECstation (dec-mips) and a DEC Alpha (dec-alpha). We consider three questions. First, how much overhead does software encapsulation incur? Second, how fast is a cross-fault-domain RPC? Third, what is the performance impact of using software-enforced fault isolation on an end-user application? We discuss each of these questions in turn.

5.1 Encapsulation Overhead

We measured the execution time overhead of sandboxing a wide range of C programs, including the C spec benchmarks and several of the Splash benchmarks [Ass, SWG]. We treated each benchmark as if it were a distrusted module, sandboxing all of its code. (Loads in the libraries, such as the standard C library, were not sandboxed.) The fault isolation overhead columns of Table 1 report sandboxing overhead on the dec-mips and on the dec-alpha. The protection overhead columns report the overhead of using our technique to provide general protection by sandboxing load instructions as well as store and jump instructions. As detailed in Section 3.2, sandboxing requires five dedicated registers; the reserved register overhead column reports the overhead of removing these registers from possible use by the compiler. All overheads are computed as the additional execution time divided by the original program's execution time.

On the dec-mips, we used the program measurement tools pixie and qpt to calculate the number of additional instructions executed due to sandboxing [Dig, BL]. The instruction count column of Table 1 reports this data as a percentage of original program instruction counts.

The data in Table 1 appears to contain a number of anomalies. For some of the benchmark programs, for example ear on the dec-mips and compress on the dec-alpha, sandboxing reduced execution time. In a number of cases the overhead is surprisingly low.

To identify the source of these variations, we developed an analytical model for execution overhead. The model predicts overhead based on the number of additional instructions executed due to sandboxing (s-instructions) and the number of saved floating point interlock cycles (interlocks). Sandboxing increases the available instruction-level parallelism, allowing the number of floating-point interlocks to be substantially reduced. The integer pipeline does not provide interlocking; instead, delay slots are explicitly filled with nop instructions by the compiler or assembler. Hence, scheduling effects among integer instructions will be accurately reflected by the count of instructions added (s-instructions). The expected overhead is computed as:

    expected overhead = (s-instructions - interlocks) / (cycles-per-second * original-execution-time-seconds)
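The model can be expressed directly as a small function. The inputs in the example are invented solely to exercise the arithmetic; they are not measurements from the paper, and the formula implicitly treats each added instruction as one cycle.

    #include <stdio.h>

    static double predicted_overhead(double s_instructions,    /* instructions added by sandboxing    */
                                     double saved_interlocks,  /* FP interlock cycles no longer spent */
                                     double cycles_per_second,
                                     double original_seconds)
    {
        return (s_instructions - saved_interlocks) /
               (cycles_per_second * original_seconds);
    }

    int main(void)
    {
        /* 40 million added instructions, 5 million saved interlock cycles, a
         * 100 MHz clock and a 10 second original run: 3.5% predicted overhead. */
        double o = predicted_overhead(40e6, 5e6, 100e6, 10.0);
        printf("predicted overhead: %.2f%%\n", o * 100.0);
        return 0;
    }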

The model provides an effective way to separate known sources of overhead from second order effects. The predicted column of Table 1 gives the predicted overheads. As can be seen from Table 1, the model is on average effective at predicting sandboxing overhead. The differences between measured and expected overheads are normally distributed, and the difference between the means of the measured and expected overheads is not statistically significant. This experiment demonstrates that by combining instruction count overhead and floating point interlock measurements, we can accurately predict average execution time overhead. If we assume that the model is also accurate at predicting the overhead of individual benchmarks, we can conclude that there is a second order effect creating the observed anomalies in measured overhead.

We can discount effective instruction cache size and virtual memory paging as sources for the observed execution time variance. Because sandboxing adds instructions, the effective size of the instruction cache is reduced. While this might account for measured overheads higher than predicted, it does not account for the opposite effect. Because all of our benchmarks are compute bound, it is unlikely that the variations are due to virtual memory paging.

The dec-mips has a physically indexed, physically tagged, direct mapped data cache. In our experiments, sandboxing did not affect the size, contents, or starting virtual address of the data segment. For both original and sandboxed versions of the benchmark programs, successive runs showed insignificant variation. Though difficult to quantify, we do not believe that data cache alignment was an important source of variation in our experiments.

We conjecture that the observed variations are caused by instruction cache mapping conflicts. Software encapsulation changes the mapping of instructions to cache lines, hence changing the number of instruction cache conflicts. A number of researchers have investigated minimizing instruction cache conflicts to reduce execution time [McF, PH, Sam]. One researcher reported a performance gain obtained simply by changing the order in which the object files were linked [PH]. Samples and Hilfinger report significantly improved instruction cache miss rates by rearranging only a small percentage of an application's basic blocks [Sam].

Beyond this effect, there were statistically significant differences among programs. On average, programs which contained a significant percentage of floating point operations incurred less overhead. On the dec-mips, the mean overhead for floating point intensive benchmarks is lower than the mean for the remaining benchmarks. All of our benchmarks are compute intensive; programs that perform significant amounts of I/O will incur less overhead.

Table 1: Sandboxing overheads for the dec-mips and dec-alpha platforms. For each benchmark the table reports, on the dec-mips, the fault isolation overhead, the protection overhead, the reserved register overhead, the instruction count overhead, and the predicted fault isolation overhead, and, on the dec-alpha, the fault isolation overhead and the protection overhead. The benchmarks gcc and sc are dependent on a pointer size of 32 bits and do not compile on the dec-alpha. The predicted fault isolation overhead for cholesky is negative due to conservative interlocking on the MIPS floating-point unit.

    Benchmark   Type
    alvinn      FP
    bps         FP
    cholesky    FP
    compress    INT
    ear         FP
    eqntott     INT
    espresso    INT
    gcc         INT
    li          INT
    locus       INT
    mpd         FP
    psgrind     INT
    qcd         FP
    sc          INT
    tracker     INT
    water       FP
    Average

5.2 Fault Domain Crossing

We now turn to the cost of cross-fault-domain RPC. Our RPC mechanism spends most of its time saving and restoring registers. As detailed in Section 4, only registers that are designated by the architecture to be preserved across procedure calls need to be saved. In addition, if no instructions in the callee fault domain modify a preserved register, then it does not need to be saved. Table 2 reports the times for three versions of a NULL cross-fault-domain RPC: the crossing times when all data registers are caller saved, the crossing times when the preserved integer registers are saved, and the times when all preserved floating point registers are saved as well. In many cases, crossing times could be further reduced by statically partitioning the registers between domains.

For comparison, we measured two other calling mechanisms. First, we measured the time to perform a C procedure call that takes no arguments and returns no value. Second, we sent a single byte between two address spaces using the pipe abstraction provided by the native operating system and measured the round trip time. These times are reported in the last two columns of Table 2. On these platforms, the cost of cross-address-space calls is roughly three orders of magnitude more expensive than local procedure calls.

Operating systems with highly optimized RPC implementations have reduced the cost of cross-address-space RPC to within roughly two orders of magnitude of local procedure calls. On Mach, cross-address-space RPC on a DECstation is roughly two orders of magnitude more expensive than a local procedure call [Ber]. The Spring operating system running on a SPARCstation delivers cross-address-space RPC that is roughly two orders of magnitude more expensive than a local leaf procedure call [HK]. Software enforced fault isolation is able to reduce the relative cost of cross-fault-domain RPC by an order of magnitude over these systems.

Table 2: Cross-fault-domain crossing times. For the dec-mips and dec-alpha platforms, the table reports the time for a cross-fault-domain RPC under three register-saving policies (caller-save only, save preserved integer registers, and save preserved integer and floating point registers), together with the time for a C procedure call and for a round trip through a pipe between two address spaces.

Table 3: Fault isolation overhead for postgres running the Sequoia benchmark. For each of the four queries, the table reports the overhead of the untrusted function manager, the overhead of software-enforced fault isolation, the number of cross-domain calls, and the overhead predicted for hardware fault isolation using the dec-mips pipe crossing time.

5.3 Using Fault Domains in postgres

To capture the effect of our system on application performance, we added software enforced fault domains to the postgres database management system and measured postgres running the Sequoia benchmark [SFGM]. The Sequoia benchmark contains queries typical of those used by earth scientists in studying the climate. To support these kinds of non-traditional queries, postgres provides a user-extensible type system. Currently, user-defined types are written in conventional programming languages such as C and dynamically loaded into the database manager. This has long been recognized to be a serious safety problem [Sto].

Four of the eleven queries in the Sequoia benchmark make use of user-defined polygon data types. We measured these four queries using both unprotected dynamic linking and software-enforced fault isolation. Since the postgres code is trusted, we only sandboxed the dynamically loaded user code. For this experiment, our cross-fault-domain RPC mechanism saved the preserved integer registers (the variant corresponding to the middle crossing-time column in Table 2). In addition, we instrumented the code to count the number of cross-fault-domain RPCs, so that we could estimate the performance of fault isolation based on separate address spaces.

Table 3 presents the results. Untrusted user-defined functions in postgres use a separate calling mechanism from built-in functions; the first overhead column of Table 3 lists the overhead of this untrusted function manager without software enforced fault domains. All reported overheads in Table 3 are relative to original postgres using the untrusted function manager. The next column reports the measured overhead of software enforced fault domains. Using the number of cross-domain calls listed in the table and the dec-mips pipe time reported in Table 2, the final column lists the estimated overhead of using conventional hardware address spaces.

5.4 Analysis

For the postgres experiment, software encapsulation provided substantial savings over using native operating system services and hardware address spaces. In general, the savings provided by our techniques over hardware-based mechanisms is a function of the percentage of time spent in distrusted code (t_d), the percentage of time spent crossing among fault domains (t_c), the overhead of encapsulation (h), and the ratio (r) of our fault domain crossing time to the crossing time of the competing hardware-based RPC mechanism:

    savings = (1 - r) t_c - h t_d
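The savings expression and the corresponding break-even crossing-time ratio can be computed directly; the sample values below are invented for illustration only.

    #include <stdio.h>

    /* t_c and t_d are fractions of total execution time, h is the encapsulation
     * overhead, and r is the ratio of software to hardware crossing time. */
    static double savings(double t_c, double t_d, double h, double r)
    {
        return (1.0 - r) * t_c - h * t_d;
    }

    /* Break-even r for given t_c, t_d, and h: software fault isolation wins
     * whenever r is below this value. */
    static double break_even_r(double t_c, double t_d, double h)
    {
        return 1.0 - (h * t_d) / t_c;
    }

    int main(void)
    {
        double t_c = 0.20, t_d = 0.50, h = 0.05, r = 0.10;
        printf("savings      = %.3f of total execution time\n", savings(t_c, t_d, h, r));
        printf("break-even r = %.3f\n", break_even_r(t_c, t_d, h));
        return 0;
    }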

Figure 5 graphically depicts these tradeoffs. The X axis gives the percentage of time an application spends crossing among fault domains. The Y axis reports the relative cost of software enforced fault-domain crossing over hardware address spaces. For a fixed execution time overhead of encapsulated code, the shaded region illustrates when software enforced fault isolation is the better performance alternative.

Software-enforced fault isolation becomes increasingly attractive as applications achieve higher degrees of fault isolation (see Figure 5). If an application spends a substantial portion of its time crossing fault domains, our RPC mechanism need only perform slightly better than its competitor; applications that spend only a small fraction of their time crossing require a correspondingly larger improvement in fault domain crossing time. Given the crossing times for the dec-mips and the dec-alpha reported in Section 5.2, a hardware address space crossing mechanism would have to be far faster than current systems to provide better performance than software fault domains for such applications. As far as we know, no production or experimental system currently provides this level of performance.

100%

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

90% AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

100% AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

80% AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

90% AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

70% AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

80% AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

60% AAAA A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

70% AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

50% AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

60% AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

40% AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

50% AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

Existing RPC

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

30% AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

A AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

40% AAAA

[Figure: two graphs plotting Crossing Time Relative to Existing RPC (Y axis) against Percentage of Execution Time Spent Crossing (X axis).]

Figure: The shaded region represents when software-enforced fault isolation provides the better performance alternative. The X axis represents the percentage of time spent crossing among fault domains, t_c. The Y axis represents the relative RPC crossing speed, r. The curve represents the break-even point r t_c = t_c - h t_d. In this graph, h is the encapsulation overhead on the DEC MIPS and DEC ALPHA.

Figure: The shaded region represents when software-enforced fault isolation provides the better performance alternative. The X axis represents the percentage of time spent crossing among fault domains, t_c. The Y axis represents the relative RPC crossing speed, r. The curve represents the break-even point r t_c = t_c - h t_d. In this graph, h is the encapsulation overhead on the DEC MIPS and DEC ALPHA.
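As a rough sketch of where this break-even condition comes from (a reconstruction from the definitions above, not a formula taken verbatim from the figures): switching crossings from hardware RPC to software fault isolation saves roughly (1 - r) t_c of total execution time, while encapsulating the distrusted code costs an additional h t_d. Software enforcement is therefore the better choice whenever

    (1 - r) t_c >= h t_d,    equivalently    r t_c <= t_c - h t_d,

and equality between the two sides traces out the plotted break-even curve.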

For this latter example, hardware fault isolation would provide better performance than software fault domains only with a substantially faster hardware address space crossing time on the DEC MIPS and the DEC ALPHA. As far as we know, no production or experimental system currently provides this level of performance.

Further, the first graph above assumes that the entire application was encapsulated. For many applications, such as postgres, this assumption is conservative. The second graph transforms the first, assuming that only a fraction of total execution time is spent in distrusted extension code.

Both graphs illustrate that software-enforced fault isolation is the best choice whenever crossing overhead is a significant proportion of an application's execution time. The figure below demonstrates that the overhead due to software-enforced fault isolation remains small regardless of application behavior: it plots overhead as a function of crossing behavior and crossing cost. Crossing times typical of vendor-supplied and of highly optimized hardware-based RPC mechanisms are shown. The graph illustrates the relative performance stability of the software solution. This stability allows system developers to ignore the performance effect of fault isolation in choosing which modules to place in separate fault domains.
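As a minimal sketch of the arithmetic behind such a plot (the per-crossing costs below are hypothetical placeholders, not the measured values reported in this paper), the percentage of execution time spent in crossing code grows linearly with both the crossing rate and the per-crossing cost:

    # Sketch: percentage of execution time spent in crossing code as a function
    # of crossing frequency and per-crossing cost. The costs are hypothetical
    # placeholders, not measurements from this paper.
    def crossing_overhead_percent(crossings_per_ms, cost_us):
        # One millisecond of execution contains 1000 microseconds.
        return 100.0 * (crossings_per_ms * cost_us) / 1000.0

    SOFTWARE_CROSSING_US = 1.0   # hypothetical software fault-domain crossing cost
    HARDWARE_CROSSING_US = 60.0  # hypothetical hardware RPC crossing cost

    for n in (1, 5, 10):
        print(f"{n:3d} crossings/ms: "
              f"software {crossing_overhead_percent(n, SOFTWARE_CROSSING_US):5.1f}%, "
              f"hardware {crossing_overhead_percent(n, HARDWARE_CROSSING_US):5.1f}%")

With an inexpensive software crossing, the overhead curve stays nearly flat as the crossing rate grows, which is the stability property the graph is meant to illustrate.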

[Figure: percentage of execution time spent in crossing code (Y axis) versus # Crossings/Millisecond (X axis), with curves for the DECstation 5000 hardware minimum, the Ultrix context switch, and software crossings.]

Figure: Percentage of time spent in crossing code versus number of fault domain crossings per millisecond on the DEC MIPS. The hardware minimum crossing time is taken from a cross-architectural study of context switch times [ALBL]. The Ultrix context switch time is as reported in the last column of the table presented earlier.

Related Work

Many systems have considered ways of optimizing RPC performance [vvST, TA, Bla, SB, HK, BALL, BALL]. Traditional RPC systems based on hardware fault isolation are ultimately limited by the minimal hardware cost of taking two kernel traps and two hardware context switches. LRPC was one of the first RPC systems to approach this limit, and our prototype uses a number of the techniques found in LRPC and later systems: the same thread runs in both the caller and the callee domain, the stubs are kept as simple as possible, and the crossing code jumps directly to the called procedure, avoiding a dispatch in the callee domain. Unlike these systems, software-based fault isolation avoids hardware context switches, substantially reducing crossing costs.

Address space identifier tags can be used to reduce hardware context switch times. Tags allow more than one address space to share the TLB; otherwise the TLB must be flushed on each context switch. It was estimated that a substantial fraction of the cost of an LRPC on the Firefly, which does not have tags, was due to TLB misses [BALL]. Address space tags do not, however, reduce the cost of register management or system calls, operations which are not scaling with integer performance [ALBL]. An important advantage of software-based fault isolation is that it does not rely on specialized architectural features such as address space tags.

Restrictive programming languages can also be used to provide fault isolation. Pilot requires all kernel, user, and library code to be written in Mesa, a strongly typed language; all code then shares a single address space [RDH]. The main disadvantage of relying on strong typing is that it severely restricts the choice of programming languages, ruling out conventional languages like C, C++, and assembly. Even with strongly typed languages such as Ada and Modula-3, programmers often find they need to use loopholes in the type system, undercutting fault isolation. In contrast, our techniques are language independent.

Deutsch and Grant built a system that allowed user-defined measurement modules to be dynamically loaded into the operating system and executed directly on the processor [DG]. The module format was a stylized native object code, designed to make it easier to statically verify that the code did not violate protection boundaries.

An interpreter can also provide failure isolation. For example, the BSD UNIX network packet filter utility defines a language which is interpreted by the operating system network driver. The interpreter insulates the operating system from possible faults in the customization code. Our approach allows code written in any programming language to be safely encapsulated, or rejected if it is not safe, and then executed at near full speed by the operating system.

Anonymous RPC exploits 64-bit address spaces to provide low latency RPC and probabilistic fault isolation [YBA]. Logically independent domains are placed at random locations in the same hardware address space. Calls between domains are anonymous; that is, they do not reveal the location of the caller or the callee to either side. This provides probabilistic protection: it is unlikely that any domain will be able to discover the location of any other domain by malicious or accidental memory probes. To preserve anonymity, a cross domain call must trap to protected code in the kernel; however, no hardware context switch is needed.

Summary

We have described a software-based mechanism for portable, programming language independent fault isolation among cooperating software modules. By providing fault isolation within a single address space, this approach delivers cross-fault-domain communication that is more than an order of magnitude faster than any RPC mechanism to date.

To prevent distrusted modules from escaping their own fault domain, we use a software encapsulation technique called sandboxing, which incurs only a small execution time overhead. Despite this overhead in executing distrusted code, software-based fault isolation will often yield the best overall application performance. Extensive kernel optimizations can reduce the overhead of hardware-based RPC to within a factor of ten over our software-based alternative. Even in this situation, software-based fault isolation will be the better performance choice whenever the overhead of using hardware-based RPC remains a significant fraction of an application's execution time.

Acknowledgements

We thank Brian Bershad, Mike Burrows, John Hennessy, Peter Kessler, Butler Lampson, Ed Lazowska, Dave Patterson, John Ousterhout, Oliver Sharp, Richard Sites, Alan Smith, and Mike Stonebraker for their helpful comments on the paper. Jim Larus provided us with the profiling tool qpt. We also thank Mike Olson and Paul Aoki for helping us with postgres.

References

[ACD] T. L. Adam, K. M. Chandy, and J. R. Dickson. A comparison of list schedules for parallel processing systems. Communications of the ACM, December.

[ALBL] Thomas Anderson, Henry Levy, Brian Bershad, and Edward Lazowska. The Interaction of Architecture and Operating System Design. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, April.

[Ass] Administrator, National Graphics Association. SPEC Newsletter, December.

[ASU] Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley Publishing Company.

[BALL] Brian Bershad, Thomas Anderson, Edward Lazowska, and Henry Levy. Lightweight Remote Procedure Call. ACM Transactions on Computer Systems, February.

[BALL] Brian Bershad, Thomas Anderson, Edward Lazowska, and Henry Levy. User-Level Interprocess Communication for Shared-Memory Multiprocessors. ACM Transactions on Computer Systems, May.

[Ber] Brian Bershad. Private Communication, August.

[BL] Thomas Ball and James R. Larus. Optimally profiling and tracing. In Proceedings of the Conference on Principles of Programming Languages.

[Bla] David Black. Scheduling Support for Concurrency and Parallelism in the Mach Operating System. IEEE Computer, May.

[BN] Andrew Birrell and Bruce Nelson. Implementing Remote Procedure Calls. ACM Transactions on Computer Systems, February.

[Cla] J. D. Clark. Window Programmer's Guide to OLE/DDE. Prentice-Hall.

[DG] L. P. Deutsch and C. A. Grant. A flexible measurement tool for software systems. In IFIP Congress.

[Dig] Digital Equipment Corporation. Ultrix Pixie Manual Page.

[Dys] Peter Dyson. Xtensions for Xpress: Modular Software for Custom Systems. Seybold Report on Desktop Publishing, June.

[FP] Kevin Fall and Joseph Pasquale. Exploiting in-kernel data paths to improve I/O throughput and CPU availability. In Proceedings of the Winter USENIX Conference, January.

[HC] Keiran Harty and David Cheriton. Application-controlled physical memory using external page-cache management. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, October.

[HK] Graham Hamilton and Panos Kougiouris. The Spring nucleus: A microkernel for objects. In Proceedings of the Summer USENIX Conference, June.

[HKM] J. Howard, M. Kazar, S. Menees, D. Nichols, M. Satyanarayanan, R. Sidebotham, and M. West. Scale and Performance in a Distributed File System. ACM Transactions on Computer Systems, February.

[Int] Intel Corporation, Santa Clara, California. Programmer's Reference Manual.

[JRT] Michael B. Jones, Richard F. Rashid, and Mary R. Thompson. Matchmaker: An interface specification language for distributed processing. In Proceedings of the ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, January.

[Kar] Paul A. Karger. Using Registers to Optimize Cross-Domain Call Performance. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, April.

[Kle] Steven R. Kleiman. Vnodes: An Architecture for Multiple File System Types in SUN UNIX. In Proceedings of the Summer USENIX Conference.

[LB] James R. Larus and Thomas Ball. Rewriting executable files to measure program behavior. Technical Report, University of Wisconsin-Madison, March.

[McF] Scott McFarling. Program optimization for instruction caches. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, April.

[MJ] Steven McCanne and Van Jacobsen. The BSD Packet Filter: A New Architecture for User-Level Packet Capture. In Proceedings of the Winter USENIX Conference, January.

[MRA] J. C. Mogul, R. F. Rashid, and M. J. Accetta. The packet filter: An efficient mechanism for user-level network code. In Proceedings of the Symposium on Operating System Principles, November.

[PH] Karl Pettis and Robert C. Hansen. Profile guided code positioning. In Proceedings of the Conference on Programming Language Design and Implementation, White Plains, New York, June. Appeared as SIGPLAN Notices.

[RDH] David D. Redell, Yogen K. Dalal, Thomas R. Horsley, Hugh C. Lauer, William C. Lynch, Paul R. McJones, Hal G. Murray, and Stephen C. Purcell. Pilot: An Operating System for a Personal Computer. Communications of the ACM, February.

[Sam] A. Dain Samples. Code reorganization for instruction caches. Technical Report UCB/CSD, University of California, Berkeley, October.

[SB] Michael Schroeder and Michael Burrows. Performance of Firefly RPC. ACM Transactions on Computer Systems, February.

[SCK] Richard L. Sites, Anton Chernoff, Matthew B. Kirk, Maurice P. Marks, and Scott G. Robinson. Binary Translation. Communications of the ACM, February.

[SFGM] M. Stonebraker, J. Frew, K. Gardels, and J. Meredith. The Sequoia 2000 Benchmark. In Proceedings of the ACM SIGMOD International Conference on Management of Data, May.

[Sto] Michael Stonebraker. Extensibility in POSTGRES. IEEE Database Engineering, September.

[Sto] Michael Stonebraker. Inclusion of new types in relational data base systems. In Michael Stonebraker, editor, Readings in Database Systems. Morgan Kaufmann Publishers, Inc.

[SWG] J. P. Singh, W. Weber, and A. Gupta. SPLASH: Stanford parallel applications for shared memory. Technical Report CSL-TR, Stanford.

[TA] Shin-Yuan Tzou and David P. Anderson. A Performance Evaluation of the DASH Message Passing System. Technical Report UCB/CSD, Computer Science Division, University of California, Berkeley, October.

[Thi] Thinking Machines Corporation. CM-5 Network Interface Programmer's Guide.

[vCGS] T. von Eicken, D. Culler, S. Goldstein, and K. Schauser. Active Messages: A Mechanism for Integrated Communication and Computation. In Proceedings of the Annual Symposium on Computer Architecture.

[vvST] Robbert van Renesse, Hans van Staveren, and Andrew S. Tanenbaum. Performance of the World's Fastest Distributed Operating System. Operating Systems Review, October.

[Web] Neil Webber. Operating System Support for Portable Filesystem Extensions. In Proceedings of the Winter USENIX Conference, January.

[YBA] Curtis Yarvin, Richard Bukowski, and Thomas Anderson. Anonymous RPC: Low Latency Protection in a 64-Bit Address Space. In Proceedings of the Summer USENIX Conference, June.