A Proportional Share Resource Allocation Algorithm for
Real-Time, Time-Shared Systems

Ion Stoica*   Hussein Abdel-Wahab†   Kevin Jeffay‡   Sanjoy K. Baruah§
Johannes E. Gehrke¶   C. Greg Plaxton‖

* Supported by a GAANN fellowship. Dept. of CS, Old Dominion Univ., Norfolk, VA 23529-0162, [email protected].
† Supported by NSF grant CCR 95-9313857. Dept. of CS, Old Dominion Univ., Norfolk, VA 23529-0162, [email protected].
‡ Supported by a grant from the IBM & Intel corps. and NSF grant CCR 95-10156. Dept. of CS, Univ. of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3175, [email protected].
§ Supported by NSF under Research Initiation Award CCR-9596282. Dept. of CS, Univ. of Vermont, Burlington, VT 05405, [email protected].
¶ Dept. of CS, Univ. of Wisconsin-Madison, Madison, WI 53706-1685, [email protected].
‖ Supported by NSF grant CCR-9504145, and the Texas Advanced Research Program under grant No. ARP-93-00365-461. Dept. of CS, Univ. of Texas at Austin, Austin, TX 78712-1188, [email protected].

Abstract

We propose and analyze a proportional share resource allocation algorithm for realizing real-time performance in time-shared operating systems. Processes are assigned a weight which determines a share (percentage) of the resource they are to receive. The resource is then allocated in discrete-sized time quanta in such a manner that each process makes progress at a precise, uniform rate. Proportional share allocation algorithms are of interest because (1) they provide a natural means of seamlessly integrating real- and non-real-time processing, (2) they are easy to implement, (3) they provide a simple and effective means of precisely controlling the real-time performance of a process, and (4) they provide a natural means of policing so that processes that use more of a resource than they request have no ill effect on well-behaved processes.

We analyze our algorithm in the context of an idealized system in which a resource is assumed to be granted in arbitrarily small intervals of time, and show that our algorithm guarantees that the difference between the service time that a process should receive in the idealized system and the service time it actually receives in the real system is optimally bounded by the size of a time quantum. In addition, the algorithm provides support for dynamic operations, such as processes joining or leaving the competition, and for both fractional and non-uniform time quanta. As a proof of concept we have implemented a prototype of a CPU scheduler under FreeBSD. The experimental results show that our implementation performs within the theoretical bounds and hence supports real-time execution in a general purpose operating system.

1 Introduction

Currently there is great interest in providing real-time execution and communication support in general purpose operating systems. Indeed, applications such as desktop video conferencing, distributed shared virtual environments, and collaboration-support systems require real-time computation and communication services to be effective. At present the dominant approach to providing real-time support in a general purpose operating system is to embed a periodic or aperiodic process model into an existing operating system kernel and to use a real-time scheduling algorithm such as rate-monotonic scheduling to schedule the processes. In such a system, aperiodic and non-real-time activities are typically scheduled either as background processes or through the use of a second-level scheduler that is executed quasi-periodically as a process by the real-time scheduler.

In general, this framework can be quite effective at integrating real-time and non-real-time computing. However, we observe that this approach has yet to be embraced by the larger operating systems community. We believe that this is due in part to the rigid distinctions made between real-time and non-real-time processing. Real-time activities are programmed differently than non-real-time ones (e.g., as periodic tasks), and real-time activities receive hard and fast guarantees of response time (if admission control is performed). Non-real-time

activities are subservient to real-time ones and receive no performance guarantees. While this state of affairs is entirely acceptable for many mixes of real-time and non-real-time activities, for many it is not. Consider the problem of supporting real-time video conferencing on the desktop. This is clearly a real-time application; however, it is not one for which hard and fast guarantees of real-time performance are required. For example, it is easy to imagine situations in which one would like to explicitly degrade the performance of the video-conference (e.g., degrade the rate at which the video is displayed) so that other activities, such as searching a large mail database for a particular message, can complete quicker. Ideally, a general purpose operating system that supports real-time execution should not a priori restrict the basic tenor of performance guarantees that any process is capable of obtaining.

To address this issue we investigate an alternate approach to realizing real-time performance in time-shared operating systems, namely the use of proportional share resource allocation algorithms for processor scheduling. In a proportional share allocation system, every process in the system is guaranteed to make progress at a well-defined, uniform rate. Specifically, each process is assigned a share of the processor: a percentage of the processor's total capacity. If a process's share of the processor is s, then in any interval of length t the process is guaranteed to receive s·t ± ε units of processor time, where 0 ≤ ε ≤ δ for some constant δ. In a proportional share system, resource allocation is flexible and the share received by a process can be changed dynamically. In this manner a process's real-time rate of progress can be explicitly controlled.

Proportional share resource allocation algorithms lie between traditional general purpose and real-time scheduling algorithms. On the one hand, proportional share resource allocation is a variant of the pure processor sharing scheduling discipline, in which during each time unit each process receives 1/n of the processor's capacity, where n is the number of active processes. Thus each process appears as if it is making uniform progress on a virtual processor that has 1/n of the capacity of the physical processor. On the other hand, traditional real-time scheduling disciplines for periodic tasks can be viewed as coarse approximations of proportional share allocation. For example, if a periodic task requires c units of processor time every p time units, then a rate-monotonic scheduler guarantees that for all k ≥ 0, in each interval [kp, (k+1)p], the periodic task will indeed receive a share of the processor equal to c/p. Specifically, in each of the above intervals, the process will receive p(c/p) = c units of processor time.

Our proportional share resource allocation policy differs from traditional methods of integrating real- and non-real-time processes in that here all processes, real- and non-real-time, are treated identically. In a proportional share system, real-time and non-real-time processes can be implemented very much like traditional processes in a time-shared operating system. Thus, in terms of the process model, no "special" support is required to support real-time computing: there is only one type of process. Moreover, like many scheduling algorithms used in time-shared systems, our algorithm allocates resources in discrete units or quanta, which makes it easier to implement than traditional real-time policies, which are typically event-driven and require the ability to preempt processes at potentially arbitrary points. In addition, because resources are allocated in discrete quanta in a proportional share system, one can better control and account for the overhead of the scheduling mechanism, as well as tune the system to trade off fine-grain, real-time control for low scheduling and system overhead. Finally, proportional share algorithms provide a natural means of uniformly degrading system performance in overload situations.

In this paper we present a proportional share scheduling algorithm and demonstrate that it can be used to ensure predictable real-time response to all processes. Section 2 presents our process model and formally introduces the concept of a share and the requirement for predictable execution with respect to a share. Section 3 discusses related work in scheduling. Section 4 presents a deadline-based, virtual-time scheduling algorithm that is used to ensure processes receive their requested share of the processor. Section 5 introduces a key technical problem to be solved in the course of applying our algorithm, namely that of dealing with the dynamic creation and destruction of processes. Section 6 outlines the proof of correctness of our algorithm, and Section 7 presents some experimental results using our proportional share system in the FreeBSD operating system and demonstrates how "traditional" real-time processes such as periodic tasks can be realized in a proportional share system.

2 The Model

We consider an operating system to consist of a set of processes (real-time and non-real-time) that compete for a time-shared resource such as a CPU or a communications channel. To avoid confusion with terminology used in the experimental section of the paper, we use the term client to refer to computational entities (i.e., processes). A client is said to be active while it is competing for the resource, and passive otherwise. We assume that the resource is allocated in time quanta of size at most q. At the beginning of each time quantum a client is selected to use the resource. Once the client acquires the resource, it may use it either for the entire time quantum, or it may release it before the time quantum expires. Although simple, this model captures the basic mechanisms traditionally used for sharing common resources, such as processor and communication bandwidth. For example, in many preemptive operating systems (e.g., UNIX, Windows NT), the CPU scheduler allocates the processing time among competing processes in the same fashion: a process uses the CPU until its time quantum expires or another process with a higher priority becomes active, or it may voluntarily release the CPU while it is waiting for an event to occur (e.g., an I/O operation to complete). As another example, consider a communication switch that multiplexes a set of incoming sessions on a packet-by-packet basis. Since usually the transmission of a packet cannot be preempted, we take a time quantum to be the time required to send a packet on the output link. Thus, in this case, the size q of a time quantum represents the time required to send a packet of maximum length.

Further, we associate a weight with each client that determines the relative share of the resource that it should receive. Let w_i denote the weight associated to client i, and let A(t) be the set of all clients active at time t. We define the instantaneous share f_i(t) of an active client i at time t as

    f_i(t) = w_i / Σ_{j ∈ A(t)} w_j.    (1)

If the client's share remains constant during a time interval [t, t + Δt], then it is entitled to use the resource for f_i(t)Δt time units. In general, when the client's share varies over time, the service time that client i should receive in a perfect fair system, while being active during a time interval [t0, t1], is

    S_i(t0, t1) = ∫_{t0}^{t1} f_i(τ) dτ    (2)

time units. The above equation corresponds to an ideal fluid-flow system in which the resource can be granted in arbitrarily small intervals of time.¹ Unfortunately, in many practical situations this is not possible. One of the reasons is the overhead introduced by the scheduling algorithm itself and the overhead in switching from one client to another: taking time quanta of the same order of magnitude as these overheads could drastically reduce the resource utilization. Another reason is that some operations cannot be interrupted, i.e., once started they must complete in the same time quantum. For example, once a communication switch begins to send a packet of one session, it cannot serve any other session until the entire packet is sent. As another example, a process cannot be preempted while it is in a critical section. Thus, in the first example we can choose the size of a quantum q as being the time required to send a packet of maximum length, while in the second example we can choose q as being the maximum duration of a critical section.

¹ A similar model was used by Demers et al. [4] in studying fair-queuing algorithms in communication networks.

Due to quantization, in a system in which the resource is allocated in discrete time quanta it is not possible for a client to always receive exactly the service time it is entitled to. The difference between the service time that a client should receive at a time t, and the service time it actually receives, is called service time lag (or simply lag). Let t_0^i be the time at which client i becomes active, and let s_i(t_0^i, t) be the service time the client receives in the interval [t_0^i, t] (here, we assume that client i is active in the entire interval [t_0^i, t]). Then the lag of client i at time t is

    lag_i(t) = S_i(t_0^i, t) − s_i(t_0^i, t).    (3)

Since the lag quantifies the allocation accuracy, we use it as the main parameter in characterizing our proportional share algorithm. In particular, in Section 6 we show that our proportional share algorithm (1) provides bounded lag for all clients, and that (2) this bound is optimal in the sense that it is not possible to develop an algorithm that affords better bounds. Together, these properties indicate that our algorithm will provide real-time response guarantees to clients and that, with respect to the class of proportional share algorithms, these guarantees are the best possible.

3 Related Work

Tijdeman was one of the first to formulate and analyze the proportional share allocation problem [15]. The original problem, an abstraction of diplomatic protocols, was stated in terms of selecting a union chairman every year, such that the accumulated number of chairmen from each state of the union is proportional to its weight. As shown in [2], Tijdeman's results can be easily applied to solve the proportional share allocation problem. In the general setting, the resource is allocated in fixed time quanta, while the clients' shares may change at the beginning of every time quantum. In this way dynamic operation can be easily accommodated. Tijdeman proved that if the clients' shares are known in advance, there exists a schedule with a lag bound less than or equal to 1 − 1/(2n − 2), where n represents

the total number of clients. Note that when n → ∞ the lag bound approaches unity. Although he gives an optimal algorithm for the static case (i.e., when the number of clients does not change over time), he does not give any explicit algorithm for the dynamic case. Furthermore, we note that, even in the general setting, the problem formulation does not accommodate fractional or non-uniform quanta.²

² The difference between fractional and non-uniform quanta is that while in the first case the fraction of the time quantum that the client will actually use is assumed to be known in advance, in the non-uniform quanta case this fraction is not known.

Recently, the proportional share allocation problem has received a great deal of attention in the context of operating systems and communication networks. Our algorithm is closely related to weighted fair queueing algorithms previously developed for bandwidth allocation in communication networks [4, 5, 10], and to general purpose proportional share algorithms, such as stride scheduling [17, 18]. Demers, Keshav, and Shenker were the first to apply the notion of fairness to a fluid-flow system that models an idealized communication switch in which sessions are serviced in arbitrarily small increments [4]. Since in practice a packet transmission cannot be preempted, the authors proposed an algorithm, called Packet Fair Queueing (PFQ), in which the packets are serviced in the order in which they would finish in the corresponding fluid-flow system (i.e., in increasing order of their virtual deadlines). By using the concept of virtual time, previously introduced by Zhang [19], Parekh and Gallager have analyzed PFQ when the input traffic stream conforms to leaky-bucket constraints [10, 11]. In particular, they have shown that no packet is serviced T_max later than it would have been serviced in the fluid-flow system, where T_max represents the time to transmit a packet of maximum size. However, as shown in [3, 13, 18], the lag bound can be as large as O(n), where n represents the number of active sessions (clients) in the system. Moreover, in PFQ the virtual time is updated when a client joins or leaves the competition in the ideal system, and not in the real one. This requires one to maintain an additional event queue, which makes the implementation complex and inefficient. As a solution, Golestani has proposed a new algorithm, called Self-Clocked Fair Queueing (SCFQ), in which the virtual time is updated when the client joins/leaves the competition in the real system, and not in the idealized one [5]. Although this scheme can be more efficiently implemented, this does not come for free: the lag bounds increase to within a factor of two of the ones guaranteed by PFQ.

Recently, Waldspurger and Weihl have developed a new proportional share allocation algorithm, called stride scheduling [17, 18], which can be viewed as a cross-application of fair queueing to the domain of processor scheduling. Stride scheduling relies on the concept of global pass (which is similar to virtual time) to measure the work progress in the system. Each client has an associated stride that is inversely proportional to its weight, and a pass that measures the progress of that client. The algorithm allocates a time quantum to the client with the lowest pass, which is similar to the PFQ policy. However, by grouping the clients in a binary tree, and recursively applying the basic stride scheduling algorithm at each level, the lag is reduced to O(log n). Moreover, stride scheduling provides support for both uniform and non-uniform quanta.

Goyal, Guo and Vin have proposed a new algorithm, called Start-time Fair Queueing (SFQ), for hierarchically partitioning a CPU among various application classes [6]. While this algorithm supports both uniform and non-uniform quanta, the delay bound (and implicitly the lag) increases linearly with the number of clients. However, we note that when the number of clients is small, in terms of delay, this algorithm can be superior to classical fair queueing algorithms.

In contrast to the above algorithms, by making use of both virtual eligible times and virtual deadlines, the algorithm we develop herein achieves constant lag bounds, while providing full support for dynamic operations. We note that two similar algorithms were independently developed in parallel to our original work [13]: by Bennett and Zhang in the context of allocating bandwidth in communication networks [3], and by Baruah, Gehrke and Plaxton in the context of processor scheduling for fixed time quanta [2]. In addition to introducing the concept of virtual eligible time (which was also independently introduced in [2] and [3]), our work makes several unique key contributions.

First, by "decoupling" the request size from the size of a time quantum we generalize the previously known theoretical results [10]. Moreover, our analysis can be easily extended to preemptive systems as well. For example, we can derive lag bounds for a fully preemptive system by simply taking time quanta to be arbitrarily small. Similarly, by taking the size of a time quantum to be the maximum duration of a critical region, we can derive lag bounds for a preemptive system with critical regions. Finally, this decoupling gives a client the possibility of trading between allocation accuracy and scheduling overhead (see Section 6).

Second, we address the problem of a client leaving the competition before using the entire service time it has requested. This is an important extension since in an operating system it is typically not possible to predict exactly how much service time a client will use for

the next request. We note that this problem does not occur (and consequently has not been addressed) in the context of network bandwidth allocation; in this case, the length of a message (and therefore its transmission time) is assumed to be known upon its arrival. The only previously known algorithms that address this problem are lottery and stride scheduling [17, 18]. However, the lag bounds guaranteed by stride scheduling are as large as O(n), where n represents the number of active clients; being a randomized algorithm, lottery does not guarantee tight bounds. In comparison, our algorithm (described next) guarantees optimal lag bounds of one time quantum.

Third, we propose a new approximation scheme for maintaining virtual time, in which update operations are performed when the events (e.g., a client leaving or joining) occur in the real system, and not in the ideal one. This simplifies the implementation and eliminates the need to keep an event queue. It is worth mentioning that unlike other previous approximations [5], ours guarantees optimal lag bounds.

Besides the class of fair queuing algorithms, a significant number of other proportional share algorithms have recently been developed [1, 9, 12, 16]. Although none of them guarantees constant lag bounds in a dynamic system, we note that the PD algorithm of Baruah, Gehrke, and Plaxton [1] achieves constant lag bounds in a static system.

The idea of applying fair queueing algorithms to processor scheduling was first suggested by Parekh in [11]. Waldspurger and Weihl were the first to actually develop and implement such an algorithm (stride scheduling) for processor scheduling [17, 18].³ Finally, to our best knowledge we are the first to implement and to test a proportional share scheduler which guarantees constant lag bounds.

³ We note that they have also applied stride scheduling to other shared resources, such as critical section lock accesses.

4 The EEVDF Algorithm

In order to obtain access to the resource, a client must issue a request in which it specifies the duration of the service time it needs. Once a client's request is fulfilled, it may either issue a new request or become passive. For uniformity, throughout this paper we assume that the client is the sole initiator of the requests. For flexibility we allow the requests to have any duration. Note that a client may request the same amount of service time by generating either fewer longer requests, or many shorter ones. For example, a client may ask for one minute of computation time either by issuing 60 requests with a duration of one second each, or by issuing 600 requests with a duration of 100 ms each. As we will show in Section 6, shorter requests guarantee better allocation accuracy, while longer requests decrease system overhead. This affords a client the possibility of trading between allocation accuracy and scheduling overhead.

We formulate our scheduling algorithm in terms of the behavior of an ideal, fluid-flow system that executes clients in a virtual-time domain [19, 10]. Abstractly, the virtual fluid-flow system executes each client for w_i real-time units during each virtual-time unit. More concretely, virtual-time is defined to be the following function of real-time:

    V(t) = ∫_0^t 1 / (Σ_{j ∈ A(τ)} w_j) dτ.    (4)

Note that virtual-time increases at a rate inversely proportional to the sum of the weights of all active clients. That is, when the competition increases, virtual-time slows down, while when the competition decreases, it accelerates. Intuitively, the flow of virtual-time changes to "accommodate" all active clients in one virtual-time unit. That is, the size of a virtual-time unit is modified such that in the corresponding fluid-flow system each active client i receives w_i real-time units during one virtual-time unit. For example, consider two clients with weights w_1 = 2 and w_2 = 3. Then the rate at which virtual-time increases relative to real-time is 1/(w_1 + w_2) = 0.2, and therefore a virtual-time unit equals five real-time units. Thus, in each virtual-time unit the two clients should receive w_1 = 2 and w_2 = 3 time units, respectively.

Ideally we would like for our proportional share algorithm to approach the behavior of the virtual fluid-flow system. Thus, since in the fluid-flow system at all points in time a client is best characterized by the service time it has received up to the current time, to compare our approach with the ideal we must be able to compute the service time that a client should receive in the fluid-flow system. By combining Eq. (1) and (2) we can express the service time that an active client i should receive in the interval [t_1, t_2] as

    S_i(t_1, t_2) = w_i ∫_{t_1}^{t_2} 1 / (Σ_{j ∈ A(τ)} w_j) dτ.    (5)

Once the integral in the above equation is computed, we can easily determine the service time that any client i should receive during the interval [t_1, t_2] by simply multiplying the client's weight by the integral's value. Next, from Eq. (5) and (4) it follows that

    S_i(t_1, t_2) = (V(t_2) − V(t_1)) w_i.    (6)
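Because the integrand in Eq. (4) and (5) is constant between client join/leave events, both V(t) and S_i(t_1, t_2) reduce to finite sums over the intervals between those events. The sketch below illustrates this (it is our illustration, not code from the paper; the segment representation and function names are assumptions):

```python
# Sketch (ours, not from the paper): computing virtual time V(t) (Eq. 4)
# and the ideal service time S_i(t1, t2) (Eq. 6) by piecewise integration.
# `segments` is a sorted list of (start_time, total_active_weight) pairs;
# within one segment, V advances at the constant rate 1 / total_weight.

def virtual_time(t, segments):
    """V(t) per Eq. (4)."""
    v = 0.0
    for i, (start, total_w) in enumerate(segments):
        if t <= start:
            break
        end = segments[i + 1][0] if i + 1 < len(segments) else float("inf")
        v += (min(t, end) - start) / total_w   # dV/dt = 1 / (sum of weights)
    return v

def ideal_service(w_i, t1, t2, segments):
    """S_i(t1, t2) = (V(t2) - V(t1)) * w_i per Eq. (6)."""
    return (virtual_time(t2, segments) - virtual_time(t1, segments)) * w_i

# Two clients of weight 2: client 1 alone on [0, 1), client 2 joins at t = 1,
# so the total active weight is 2 on [0, 1) and 4 afterwards.
segments = [(0.0, 2.0), (1.0, 4.0)]
print(virtual_time(1.0, segments))             # 0.5
print(virtual_time(3.0, segments))             # 1.0
print(ideal_service(2.0, 1.0, 3.0, segments))  # 1.0: each client is owed 1 unit
```

This is the same observation exploited by the approximation scheme mentioned in Section 3: virtual time can be updated incrementally at the real-system events, with no event queue for the ideal system.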

To better interpret the above equation, consider a much simpler model in which the number of active clients is constant and the sum of their weights is one (Σ_{i ∈ A} w_i = 1), i.e., the share of a client i is f_i = w_i. Then, in this model, the service time that client i should receive during an interval [t_1, t_2] is simply S_i(t_1, t_2) = (t_2 − t_1) w_i. Next, notice that by replacing the real times t_1 and t_2 with the corresponding virtual-times V(t_1) and V(t_2) we arrive at Eq. (6). Thus, Eq. (6) can be viewed as a generalization for computing the service time S_i(t_1, t_2) in a dynamic system, one in which clients are dynamically joining and leaving the competition.

Our scheduling algorithm uses measurements made in the virtual-time domain to make scheduling decisions. For each client's request we define an eligible time e and a deadline d, which represent the starting and finishing times, respectively, for the request's service in the corresponding fluid-flow system. Let t_0^i be the time at which client i becomes active, and let t be the time at which it initiates a new request. Then, a request becomes eligible at a time e when the service time that the client should receive in the corresponding fluid-flow system, S_i(t_0^i, e), equals the service time that the client has already received in the real system, s_i(t_0^i, t), i.e., S_i(t_0^i, e) = s_i(t_0^i, t). Note that if at time t client i has received more service time than it was supposed to receive (i.e., lag_i(t) < 0), then it will be the case that e > t and hence the client should wait until time e before the new request becomes eligible. In this way a client that has received more service time than its share is "slowed down", while giving the other active clients the opportunity to "catch up". On the other hand, if at time t client i has received less service time than it was supposed to receive (i.e., its lag is positive), then it will be the case that e ≤ t and the new request is immediately eligible at time t. By using Eq. (6), the virtual eligible time V(e) is

    V(e) = V(t_0^i) + s_i(t_0^i, t) / w_i.    (7)

Similarly, the deadline of the request is chosen such that the service time that the client should receive between the eligible time e and the deadline d equals the service time of the new request, i.e., S_i(e, d) = r, where r represents the length of the new request. By using again Eq. (6), we derive the virtual deadline V(d) as

    V(d) = V(e) + r / w_i.    (8)

Notice that although Eq. (7) and (8) give us the virtual eligible time V(e) and the virtual deadline V(d), they do not necessarily give us the values of the real times e and d! To see why, consider the case in which e is larger than the current time t. Then e cannot be computed exactly from Eq. (4) and (7), since we do not know how the slope of the virtual-time mapping will vary in the future (it changes dynamically while clients join and leave the competition). Therefore we will formulate our algorithm in terms of virtual eligible times and deadlines, and not of the real times. With this, the Earliest Eligible Virtual Deadline First (EEVDF) algorithm can be simply stated as follows:

EEVDF Algorithm. A new quantum is allocated to the client which has the eligible request with the earliest virtual deadline.

Since EEVDF is formulated in terms of virtual-times, in the remainder of this paper we use ve and vd to denote the virtual eligible time and virtual deadline, respectively, whenever the corresponding real eligible time and deadline are not given. Let r^(k) denote the length of the k-th request made by client i, and let ve^(k) and vd^(k) denote the virtual eligible time and the virtual deadline associated to this request. If each client's request uses the entire service time it has requested, then by using Eq. (7) and (8) we obtain the following recurrence, which computes both the virtual eligible time and the virtual deadline of each request:

    ve^(1) = V(t_0^i),    (9)
    vd^(k) = ve^(k) + r^(k) / w_i,    (10)
    ve^(k+1) = vd^(k).    (11)

Next, we consider the more general case in which the client does not use the entire service time it has requested. Since a client never receives more service time than requested, we need to consider only the case when the client uses the resource for less time than requested. Let u^(k) denote the service time that client i actually receives during its k-th request. Then the only change in Eq. (9)–(11) will be in computing the eligible time of a new request. Specifically, Eq. (11) is replaced by

    ve^(k+1) = ve^(k) + u^(k) / w_i.    (12)

Example. To fix the ideas, let us take a simple example (see Figure 1). Consider two clients with weights w_1 = w_2 = 2 that issue requests with lengths r_1 = 2 and r_2 = 1, respectively. We assume that the time quantum is of unit size (q = 1) and that client 1 is the first one which enters the competition, at real time t = 0. Thus, according to Eq. (9) and (10), the virtual eligible time for the first request of client 1 is ve = 0, while its virtual deadline is vd = 1. Being the single client that has an outstanding eligible request, client 1 receives the first quantum. At real time t = 1, client

2 enters the competition. Since during the interval [0, 1) the only active client in the system is client 1, from Eq. (4) the value of virtual-time at real-time 1 is V(1) = ∫_0^1 (1/w_1) dτ = 0.5. Thus, virtual-time increases at half the rate of real-time. In this way, in an ideal system, during every virtual-time unit client 1 receives w_1 = 2 real-time units. Next, after the second client enters the competition, the rate of virtual-time slows down further to 1/(w_1 + w_2) = 0.25. Hence, in the ideal system, during one virtual-time unit each client will receive 2 real-time units (since w_1 = w_2 = 2).

Next, assume that client 2 issues its first request just before the second quantum is allocated. Then at real time t = 1 there are two pending requests: one of client 1 with the virtual deadline 1 (which waits for another time quantum to fulfill its request), and one of client 2 which has the same virtual deadline, i.e., 1. In this situation we arbitrarily break the tie in favor of client 2, which therefore receives the second quantum. Since this quantum fulfills the current request of client 2, client 2 issues immediately, at real time 2 (virtual-time 0.75), a new request. From Eq. (11) and (10) the virtual eligible time and the virtual deadline of the new request are 1 and 1.5, respectively. Thus, at real time t = 2 (virtual-time 0.75) the single eligible request is the one of client 1, which therefore receives the next quantum. Further, at real time t = 3 (virtual-time 1) there are again two eligible requests: the one of client

[Figure 1 appears here: a timeline of the schedule. Client 1's successive requests carry the (virtual eligible time, virtual deadline) pairs (0, 1), (1, 2), (2, 3); client 2's requests carry (0.5, 1), (1, 1.5), (1.5, 2), (2, 2.5); real time runs from 0 to 7 and virtual time from 0 to 2.]

Figure 1. An example of EEVDF scheduling involving two clients with equal weights (w_1 = w_2 = 2). All the requests generated by client 1 have length 2, and all of the requests generated by client 2 have length 1. Client 1 becomes active at time 0 (virtual-time 0), while client 2 becomes active at time 1 (virtual-time 0.5). The vertical arrows represent the times when the requests are initiated; the pair associated to each arrow represents the virtual eligible time and the virtual deadline of the corresponding request. The shaded regions in the background show the durations of servicing successive requests of the same client in the fluid-flow system.

…main issues of implementing dynamic operations, first recall that the client's lag represents (see Eq. (3)) the difference between the service time that the client should receive and the service time it has actually received

2 that has just b ecome eligible, and the new request

ceived. An imp ortant prop erty of the EEVDF algo-

issued by client1. Since the virtual deadline of the

rithm is that at any time the sum of the lags of all active

second client's request 1.5 is earlier than the one of

clients is zero see Lemma 2 in [14]. Thus, if a client

the rst client 2, the fourth quantum is allo cated to

leaves the comp etition with a negative lag i.e., after

client2.Further, Figure 1 shows how the next four

receiving more service time than it was supp osed to,

quanta are allo cated.

the remaining clients should have received less service

Note the uniform progress of the two clients in Fig-

time than they were entitled to. In short, a gain for one

ure 1. Although the uniformity is p erfect in this con-

client translates into a loss for the other active clients.

trived example, we show in Section 6 that in fact the

Similarly, when a client with p ositive lag leaves, this

deviation of a client's progress from the p erfectly uni-

translates into a gain for the remaining clients. The

form rate i.e., its rate of progress in the ideal uid- ow

main question here is how to distribute this loss/gain

system is b ounded and that these b ounds are the b est

among the remaining clients. In [13]we answered this

p ossible. This shows that for a given quanta q , the

question by distributing it in proportion to the clients'

EEVDF algorithm provides the b est p ossible guaran-

weights. In the remaining of this section we show that

tees of real-time progress.

the same answer is obtained by approaching the prob-

lem from a di erent angle.

The basic observation is that this problem do es not

5 Fairness in Dynamic Systems

o ccur as long as a client with zero lag leaves the comp e-

tition, b ecause there is nothing to distribute. Since in

In this section we address the issue of fairness in

the corresp onding uid- ow system the lag of any client

dynamic systems. Throughout this pap er, we assume

is always zero, a simple solution would b e to consider

that a dynamic system provides supp ort for client join-

4

the time when the client leaves to b e the time when

ing and leaving the comp etition .To understand the

it leaves in the corresp onding uid- ow system, and

4

Note that with these two op erations, changing a client's

join op eration [13].

weight can b e easily implemented as a leave followed by a re-

not in the real system. Unfortunately, this solution has two major drawbacks. First, in many situations, such as scheduling incoming packets in a high-speed networking switch, maintaining the events in the fluid-flow system is too expensive in practice [5]. Second, and more important, this solution assumes implicitly that the service time that a client will use is known in advance. While this is generally true in the case of the communication switch, where the length of a message, and consequently its service time, is assumed to be known when the packet arrives, in the processor case this is not always possible. To see why this is a potential problem, consider the previous example (see Figure 1), in which the first client leaves the competition after using only 1.1 time-units of the second request, i.e., at time 6.1 in the real system and the corresponding virtual time 1.775. However, according to Eq. (12), in the ideal system the client should complete its service, and therefore leave the competition, at virtual time 1.55 (= ve^(2) + u^(2)/w_1), which in our example corresponds to the real time 5.2. Unfortunately, since at this point we do not know for how long client 1 will continue to use the resource (we know only that it has made a request for two time-units of execution and has actually executed for only one time unit), we cannot update the virtual time correctly!

Next, we present our solution to this problem for a dynamic system in which the following two reasonable restrictions hold: (1) all the clients that join the competition are assumed to have zero lag, and (2) a client has to leave the competition as soon as it is finished using the resource (i.e., when a client terminates it is not allowed to remain in the system). We consider two cases, depending on whether the client's lag is negative or positive. From Eq. (3), (4), and (6) it follows that the client's lag decreases as long as the client receives service time, and increases otherwise. Thus, when a client with negative lag wants to leave, we can simply delay that client, without allocating any service time to it, until its lag becomes zero. This can be simply accomplished by generating a dummy request of zero length. However, note that since a request cannot be processed before it becomes eligible, and since the virtual eligible time of the dummy request is equal to its deadline (see Eq. (8)), this request cannot be processed earlier than its deadline. In this way, we have reduced the first case to the second one, in which the client leaving the competition has a positive lag. Our solution is based on the same idea as before: the client is delayed until its lag becomes zero.

For clarity, consider the example in Figure 2(a), where three clients become simultaneously active. Next, suppose that at time t_1, client 1 decides to leave the competition while having a positive lag. Then the client will be simply delayed, while continuing to receive service time, until its lag becomes zero, i.e., until time t'_1. If we assume that the slope of virtual-time with respect to real-time does not change between t_1 and t'_1, then from Eq. (5) and (6) we obtain S_1(t_1, t'_1) = (V(t'_1) - V(t_1)) w_1 = w_1 (t'_1 - t_1)/(w_1 + w_2 + w_3). Further, by using Eq. (3) and (5), and the fact that s_1(t_1, t'_1) = t'_1 - t_1, we can compute the virtual-time at t'_1 as

    V(t'_1) = V(t_1) + lag_1(t_1)/(w_2 + w_3).    (13)

The main drawback of this approach is that client 1 continues to receive service time between t_1 and t'_1, although it does not need it, since it has already finished using the resource! Thus, this service time will be wasted, which is unacceptable. Our solution is to simply let any client with positive lag leave immediately, while correctly updating the value of virtual-time to account for the change (see Figure 2(b)). In this way the virtual-times corresponding to the times when a client decides to leave and when it actually leaves are the same in both systems. More precisely, consider a client k leaving the competition at time t_k with a positive lag, i.e., lag_k(t_k) > 0. Then, by generalizing Eq. (13), the value of virtual-time is updated as follows:

    V(t_k) <- V(t_k) + lag_k(t_k) / Σ_{j in A(t_k)\{k}} w_j,    (14)

where A(t_k) represents the set of all active clients just before client k leaves. For example, in Figure 2(b), A(t_1) = {1, 2, 3}. Thus, A(t_k) \ {k} represents the set of all active clients just after client k leaves the competition. Further, note that according to Eq. (3) the lag of any remaining client i in A(t_k) \ {k} changes to

    lag_i(t_k) <- lag_i(t_k) + w_i * lag_k(t_k) / Σ_{j in A(t_k)\{k}} w_j.    (15)

Thus the lag of client k is proportionally distributed among the remaining clients, which is consistent with our interpretation of fairness in dynamic systems, i.e., any gain or loss is proportionally distributed among the remaining clients.

[Figure 2, not reproduced in this text extraction: two panels plotting virtual time versus real time for clients 1, 2, and 3; panel (a) marks the times t_1, t'_1, t_2, t'_2, while panel (b) marks t_1 and t_2 + (t'_1 - t_1).]

Figure 2. Three clients become active at the same time, after which client 1 and client 2, both with positive lags, leave the competition. In (a) clients are allowed to leave only after their lags become zero; in (b) clients are allowed to leave immediately. The shaded regions in (a) represent the time intervals during which the system allocates service time to the clients until their lags become zero. In both cases the virtual-time just before a client wants to leave and just after it has actually left are equal.

Since virtual-time is updated only when the events actually occur in the real system (as opposed to when they occur in the ideal one), the EEVDF algorithm can be easily and efficiently implemented. Even in a system in which the service times are known in advance, it is theoretically possible to update virtual-time as in the ideal system; however, in practice this is hard to achieve. Mainly, this is because we need to implement an event queue which has to balance the trade-off between timer granularity and scheduling overhead.
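The leave operation defined by Eq. (14) and (15) requires only a constant amount of bookkeeping per remaining client. The sketch below is our own Python illustration (the dictionary layout and function name are hypothetical, not the paper's implementation); it also exercises the property quoted above, that the lags of the active clients always sum to zero.

```python
def leave(virtual_time, clients, k):
    """Remove client k (which has positive lag) from `clients`, a dict
    mapping client id -> {'w': weight, 'lag': current lag}; return the
    updated virtual time and the remaining clients."""
    lag_k = clients[k]['lag']
    remaining = {j: c for j, c in clients.items() if j != k}
    total_w = sum(c['w'] for c in remaining.values())  # sum over A(t_k) \ {k}
    virtual_time += lag_k / total_w                    # Eq. (14)
    for c in remaining.values():                       # Eq. (15)
        c['lag'] += c['w'] * lag_k / total_w
    return virtual_time, remaining

# Hypothetical snapshot: client 1 leaves with lag 1 while w_2 + w_3 = 3.
clients = {1: {'w': 2, 'lag': 1.0},
           2: {'w': 2, 'lag': -0.5},
           3: {'w': 1, 'lag': -0.5}}
V, rest = leave(0.0, clients, 1)
# V jumps by 1/3; client 2 absorbs 2/3 and client 3 absorbs 1/3 of the
# departing lag, so the lags of the active clients still sum to zero.
```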

As we have shown in [13], all the basic operations required to implement the EEVDF algorithm, i.e., inserting and deleting a request, and finding the eligible request with the earliest deadline, can be implemented in O(log n), where n represents the number of active clients.

We note that in the worst case it may be possible that all the dummy requests occur at the same time. In this situation, the scheduler should perform O(n) deletions before the next "real" request is serviced. Although this is a potential problem in the case of a communication switch, where the selection of the next packet to be serviced is assumed to be done during the transmission of the current packet, it does not significantly increase the complexity of CPU scheduling. This is mainly because a processor, besides servicing the active clients (processes), also executes the scheduling algorithm, as well as other related operating system functions (e.g., starting a new process, or terminating an existing one). Consequently, in a complete model we need to account for these overheads anyway. A simple solution would be to charge each process for the related overheads. For example, the time to select the next process to receive a time quantum should be charged to that client. In this way, from the processor's perspective, a dummy request is no longer a 0-duration request, since it should account at least for the scheduling overhead and eventually for the process termination. In the current model we ignore these overheads, which, as the experimental results suggest (see Section 7), is an acceptable approximation for many practical situations. However, we plan to address this aspect in the future.

6 Fairness Results

The proportional share scheduling algorithm we have proposed executes clients at a precise rate. One can determine if a client has a desired real-time response time property by simply computing the amount of service time it is to receive during the intervals of time of interest, using either Eq. (5) or (6). However, because service time is allocated in discrete quanta, this computation is off by the client's lag. Thus, in order to use our proportional share algorithm for real-time computing, we must demonstrate that the lag incurred by any client is bounded at all times. This is done next.

The problem is stated as that of demonstrating that the EEVDF algorithm is fair in the sense that all clients make progress according to their weights. By demonstrating that the lag of each client is bounded at all times we conclude that our algorithm is fair. Here we sketch the argument that lags are bounded. The complete proofs of each result are given in the extended version of this paper [14].

Theorem 1 shows that any request is fulfilled no later than q time units after its deadline in the corresponding fluid-flow system, where q represents the maximum size of a time quantum. Theorem 2 gives tight bounds for the lag of any client in a system in which all the clients that join and leave the competition have zero lags. Similarly, Theorem 3 gives tight bounds for the client's lag in a system in which a client with positive lag may leave at any time. Finally, as a corollary we show that in a dynamic system in which no client request is larger than the maximum size q of a time quantum, the lag of any client is bounded by q. Moreover, this result is optimal with respect to any proportional share algorithm. We begin by defining formally the systems we are analyzing.

Definition 1 A steady system (S-system for short) is a system in which the lag of any client that joins or leaves the competition is zero.

The next definition is a formal characterization of the system described in Section 5 (see Figure 2(b)).

Definition 2 A pseudo-steady system (PS-system for short) is a system in which the lag of any client that joins is zero, and the lag of any client that leaves is positive. Moreover, when a client with positive lag leaves, the value of virtual-time is updated according to Eq. (14).

The following theorem gives the upper bound for the maximum delay of fulfilling a request in an S-system. We note that this result generalizes a previous result of Parekh and Gallager [10] which holds for the particular case in which a request is no larger than a time quantum.

Theorem 1 Let d be the deadline of the current request issued by client k in an S-system with quantum q, and let f be the actual time when this request is fulfilled. Then
(1) the request is fulfilled no later than d + q, i.e., f <= d + q;
(2) if f > d, then for any time t in [d, f), lag_k(t) > 0.

The next theorem gives tight bounds for a client's lag in an S-system.

Theorem 2 Let r_k be the size of the current request issued by client k in an S-system with quantum q. Then the lag of client k at any time t while the request is pending is bounded as follows:

    -r_k < lag_k(t) < max(r_k, q).

Moreover, these bounds are asymptotically tight.

Notice that the bounds given by Theorem 2 apply independently to each client and depend only on the lengths of their requests. While shorter requests offer a better allocation accuracy, the longer ones reduce the system overhead, since for the same total service time fewer requests need to be generated. It is therefore possible to trade between accuracy and system overhead, depending on client requirements. For example, for a computationally intensive task it would be acceptable to take the length of the request to be on the order of seconds. On the other hand, in the case of a multimedia application we need to take the length of a request to be no greater than several tens of milliseconds, due to the delay constraints. Theorem 2 shows that EEVDF can accommodate clients with different requirements, while guaranteeing tight bounds for the lag of each client which are independent of the other clients. As the next theorem shows, this is not true for PS-systems. In this case the lag of a client can be as large as the maximum request issued by any client in the system.

Theorem 3 Let r_k be the size of the current request issued by client k in a PS-system with quantum q. Then the lag of client k at any time t while the request is pending is bounded as follows:

    -r_k < lag_k(t) < max(R_max, q),

where R_max represents the maximum duration of any request issued by any client in the system. Moreover, these bounds are asymptotically tight.

The following corollary follows directly from Theorems 2 and 3.

Corollary If no request of client k is larger than a time quantum, then at any time t its lag is bounded as follows:

    -q < lag_k(t) < q.

Finally, we note that, according to the following simple lemma (the proof can be found in [13]), the bounds given in the above corollary are optimal, i.e., they hold for any proportional share algorithm.

Lemma Given any system with time quanta of size q and any proportional share algorithm, the lag of any client is asymptotically bounded by -q and q.

7 Experimental Results

As a proof of concept we have implemented a CPU scheduler prototype based on our EEVDF algorithm under FreeBSD v2.0.5. All the experiments were run on a PC compatible with a 75 MHz Pentium and 16 MB of RAM. The scheduler time slice (quantum) and the duration of any client's request were set to 10 ms. Excepting the CPU scheduler, we did not alter the FreeBSD kernel. (Indeed, the fact that we could perform these experiments on top of a largely unmodified general purpose operating system indicates the good fit between the proportional share resource allocation scheme we advocate and general purpose operating system design.) Our scheduler coexists with the original FreeBSD scheduler [7]. All the processes that request proportional share or reservation services are assigned a reserved user-level priority, and are handled

[Figure 3, not reproduced in this text extraction: four panels; (a) and (c) plot the number of iterations versus time (ms) for clients 1, 2, and 3, while (b) and (d) plot their lags (between -20 and 20 ms) versus time; panel (c) is divided into regions Reg. 1 through Reg. 4.]

Figure 3. Experiment 1: the number of iterations (a) and lags (b) of three clients with weights 3, 2, and 1 over a 1 sec period. Experiment 2: the number of iterations (c) and the lags (d) for the same clients, when client 1 is delayed for 300 ms and each client performs 50 iterations.

by our scheduler. All the other processes are scheduled by the regular FreeBSD scheduler. In this way, the kernel processes are scheduled over any process in the proportional share or the reservation class. Since FreeBSD lacks real-time support, such as a preemptive kernel or page pinning, in our experiments we tried to avoid as much as possible any interaction with the kernel. For example, we make sure that all the measurements are performed after the entire program is loaded into memory. Also, with the exception of the gettimeofday function used for time measurements, we do not use any other system function while running the experiments (all the data are recorded in memory and saved at the end of each experiment). Addressing these issues in a first-class manner would only serve to improve our already good results.

To measure the allocation accuracy we have written a simple iterative application which performs some arithmetic computations. Each iteration takes close to 9 ms. In each experiment we run several copies of the program, assigning to each copy (client) a different weight.

In the first experiment we run three clients (processes), with weights 3, 2, and 1, respectively. All clients are synchronized via shared memory to start the actual computation at the same time. Figure 3(a) shows the number of iterations executed by each client during the first second. The solid lines depict the ideal number of iterations for each client. As can be seen, the number of iterations actually performed by each client is very close to the ideal one. In particular, note that the client with a weight of 1 executes 1/2 the number of iterations of the client with a weight of 2, and proceeds at 1/3 the rate of the client with weight 3. Figure 3(b) depicts the lag of each client over the same interval. Note that the lags are always between -10 and 10 ms, which is consistent with the bounds given by the corollary in Section 6. Thus, each client is executing at a rate which is precise enough to afford one the ability to predict its performance in real-time (modulo 10 ms) over any interval.

In the second experiment we consider again three clients with weights 3, 2, and 1, respectively, but in a more "dynamic" scenario. While clients 2 and 3 begin execution at the same time, client 1 is delayed for 300 ms. Each client performs 50 iterations, after which it leaves the competition. As shown in Figure 3(c), there are four distinct regions. In the first region (i.e., between 0 and 300 ms) there are only two active clients: 2 and 3. Therefore client 2 (having weight 2) receives 66 percent of the CPU, while client 3 (having weight 1) receives 33 percent of the CPU. Consequently, after 300 ms client 2 completes 22 iterations, while client 3 completes only 11 iterations. After 300 ms, client 1 joins the competition, and therefore in the second region (between 300 and 998 ms) all three clients are active. Further, at time t = 998 ms client 2 finishes all its iterations and leaves the competition. Thus, in the next region only clients 1 and 3 remain active. Finally, at time t = 1128 ms, client 1 finishes, and client 3 remains the only active one in region four. Figure 3(d) depicts the clients' lags, which are again between the theoretical bounds, i.e., -10 and 10 ms.

8 Conclusions

We have described a new proportional share resource allocation scheduler that provides flexible control and strong timeliness guarantees for the service time received by a client. In this way we provide a unified approach for scheduling "firm" real-time, interactive, and batch applications. We achieve this by uniformly converting the application requirements, regardless of their type, into a sequence of requests for the resource. Our algorithm guarantees that the difference between the service time that a client should receive in the idealized system and the service time it actually receives in the real system is bounded by one time quantum, and that this bound is optimal. To the best of our knowledge, this is the first algorithm to achieve these bounds in a dynamic system that provides support for both fractional and non-uniform quanta. As a proof of concept we have also implemented a prototype of a CPU scheduler under the FreeBSD operating system. Our experimental results show that our implementation performs within the theoretical bounds.

References

[1] S. K. Baruah, J. E. Gehrke and C. G. Plaxton, "Fast Scheduling of Periodic Tasks on Multiple Resources", Proc. of the 9th Int. Par. Proc. Symp., April 1995, pp. 280-288.

[2] S. K. Baruah, J. E. Gehrke and C. G. Plaxton, "Fair On-Line Scheduling of a Dynamic Set of Tasks on a Single Resource", Technical Report UTCS-TR-96-03, Dept. of CS, Univ. of Texas at Austin, February 1996.

[3] J. C. R. Bennett and H. Zhang, "WF2Q: Worst-case Fair Weighted Fair Queueing", Proc. of INFOCOM'96, San Francisco, March 1996.

[4] A. Demers, S. Keshav and S. Shenker, "Analysis and Simulation of a Fair Queueing Algorithm", Journal of Internetworking Research & Experience, October 1990, pp. 3-12.

[5] S. J. Golestani, "A Self-Clocked Fair Queueing Scheme for Broadband Applications", Proc. of INFOCOM'94, April 1994, pp. 636-646.

[6] P. Goyal, X. Guo and H. M. Vin, "A Hierarchical CPU Scheduler for Multimedia Operating Systems", to appear in Proc. of the 2nd OSDI Symp., October 1996.

[7] S. J. Leffler, M. K. McKusick, M. J. Karels and J. S. Quarterman, "The Design and Implementation of the 4.3BSD UNIX Operating System", Addison-Wesley, 1989.

[8] C. L. Liu and J. W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment", Journal of the ACM, Vol. 20, No. 1, January 1973, pp. 46-61.

[9] U. Maheshwari, "Charge-based Proportional Scheduling", Technical Memorandum MIT/LCS/TM-529, Laboratory for CS, MIT, July 1995.

[10] A. K. Parekh and R. G. Gallager, "A Generalized Processor Sharing Approach to Flow Control in Integrated Services Networks: The Single Node Case", ACM/IEEE Trans. on Networking, Vol. 1, No. 3, 1993, pp. 344-357.

[11] A. K. Parekh, "A Generalized Processor Sharing Approach to Flow Control in Integrated Services Networks", Ph.D. Thesis, Department of EE and CS, MIT, 1992.

[12] I. Stoica and H. Abdel-Wahab, "A New Approach to Implement Proportional Share Resource Allocation", Technical Report TR-95-05, CS Dept., Old Dominion Univ., April 1995.

[13] I. Stoica and H. Abdel-Wahab, "Earliest Eligible Virtual Deadline First: A Flexible and Accurate Mechanism for Proportional Share Resource Allocation", Technical Report TR-95-22, CS Dept., Old Dominion Univ., November 1995.

[14] I. Stoica, H. Abdel-Wahab, K. Jeffay, S. K. Baruah, J. E. Gehrke and C. G. Plaxton, "A Proportional Share Resource Allocation Algorithm for Real-Time, Time-Shared Systems", Technical Report TR-96-38, CS Dept., Univ. of North Carolina, September 1996.

[15] R. Tijdeman, "The Chairman Assignment Problem", Discrete Mathematics, Vol. 32, 1980, pp. 323-330.

[16] C. A. Waldspurger and W. E. Weihl, "Lottery Scheduling: Flexible Proportional-Share Resource Management", Proc. of the 1st OSDI Symp., November 1994, pp. 1-12.

[17] C. A. Waldspurger and W. E. Weihl, "Stride Scheduling: Deterministic Proportional-Share Resource Management", Technical Memorandum MIT/LCS/TM-528, Laboratory for CS, MIT, July 1995.

[18] C. A. Waldspurger, "Lottery and Stride Scheduling: Flexible Proportional-Share Resource Management", Ph.D. Thesis, Technical Report MIT/LCS/TR-667, Laboratory for CS, MIT, September 1995.

[19] L. Zhang, "VirtualClock: A New Traffic Control Algorithm for Packet-Switched Networks", ACM Trans. on Comp. Systems, Vol. 9, No. 2, May 1991, pp. 101-124.