Non-blocking Data Sharing in Multiprocessor Real-Time Systems
Philippas Tsigas and Yi Zhang
Department of Computing Science
Chalmers University of Technology
S-412 96 Göteborg, Sweden
Abstract

A non-blocking protocol that allows real-time tasks to share data in a multiprocessor system is presented in this paper. The protocol gives concurrent real-time tasks the means to read and write shared data; it allows multiple write and multiple read operations to be executed concurrently. Our protocol extends previous results and is optimal with respect to space requirements. Together with the protocol, its schedulability analysis and a set of schedulability tests for a set of random task sets are presented. Both the schedulability analysis and the schedulability experiments show that the algorithm presented in this paper exhibits less overhead than the lock-based protocols.

1. Introduction

In any multiprocessing system cooperating processes share data via shared data objects. In this paper we are interested in designing shared data objects for cooperative tasks in real-time multiprocessor systems. The challenges that have to be faced in the design of inter-task communication protocols for multiprocessor systems become more delicate when these systems have to support real-time computing. In real-time multiprocessor systems, inter-task communication protocols i) have to support sharing of data between different tasks, e.g. in an operational flight program (OFP), tasks like navigation, maintaining of pilot displays, control of and communication with a variety of special-purpose hardware, weapon delivery, and so on; ii) must meet strict time constraints, the HRT deadlines; and iii) have to be efficient in time and in space, since they must perform under tight time and space constraints. A nice description of the fine challenges that inter-task communication protocols for real-time systems have to address can be found in [11]. (This work was partially supported by ARTES, a national Swedish strategic research initiative in Real-Time Systems, and TFR, the Swedish Research Council for Engineering Sciences.)

The classical, well-known and simplest solution enforces mutual exclusion. Mutual exclusion protects the consistency of the shared data by allowing only one process at a time to access the data. Mutual exclusion i) causes large performance degradation, especially in multiprocessor systems [12]; ii) leads to complex scheduling analysis, since tasks can be delayed because they were either preempted by other more urgent tasks, or because they are blocked before a critical section by another process that can in turn be preempted by another more urgent task, and so on (this is also called the convoy effect) [2]; and iii) leads to priority inversion, in which a high-priority task can be blocked for an unbounded time by a lower-priority task [4]. Several synchronisation protocols have been introduced to solve the priority inversion problem for uniprocessor [4] and multiprocessor [3] systems. The solution presented in [4] solves the problem for the uniprocessor case at the cost of limiting the schedulability of task sets and also making the scheduling analysis of real-time systems hard. The situation is much worse in a multiprocessor real-time system, where a task may be blocked by another task running on a different processor [3].

Non-blocking implementation of shared data objects is a new alternative approach to the problem of inter-task communication. Non-blocking mechanisms allow multiple tasks to access a shared object at the same time, but without enforcing mutual exclusion to accomplish this. Non-blocking inter-task communication does not allow one task to block another task, and thus gives significant advantages over lock-based schemes because:

1. it does not give rise to priority inversion, and it avoids lock convoys, which make scheduling analysis hard and delays longer;

2. it provides high fault tolerance (processor failures will never corrupt shared data objects) and eliminates deadlock scenarios arising from two or more tasks waiting for locks held by each other;

3. and, more significantly, it completely eliminates the interference between process scheduling and synchronisation.

Figure 1. Shared Memory Multiprocessor System Structure (diagram: Processor 1 ... Processor n, each with its local memory, connected to the shared memory and I/O via a real-time interconnection network)
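To make the contrast with locking concrete, the sketch below illustrates the non-blocking style in C11 atomics: a shared counter updated through a compare-and-swap retry loop. This is an illustrative example of the general technique only, not the protocol of this paper, and all names in it are our own; the point is that a task preempted in the middle of an update never prevents other tasks from completing theirs.

```c
#include <stdatomic.h>

/* Illustrative only (not the paper's protocol): a non-blocking
   update of a shared counter. No task ever holds a lock, so no
   task can be blocked by another: there is no priority inversion
   and no convoy effect. */
static atomic_int shared_counter = 0;

void nonblocking_increment(void)
{
    int old = atomic_load(&shared_counter);
    /* Retry until our compare-and-swap succeeds; a failed attempt
       means some other task completed its own update meanwhile,
       so the system as a whole always makes progress. */
    while (!atomic_compare_exchange_weak(&shared_counter, &old, old + 1)) {
        /* on failure, 'old' has been reloaded with the value the
           other task installed; recompute and try again */
    }
}
```

A failed compare-and-swap simply reloads the observed value and retries; this retry loop is the non-blocking analogue of waiting for a lock.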
Non-blocking protocols, on the other hand, have to use more delicate strategies to guarantee data consistency than the simple enforcement of mutual exclusion between the readers and the writers of the data object. These new strategies, in order to be useful for real-time systems, should be efficient in time and space, so that they perform under the tight space and time constraints that real-time systems demand.

In this paper, we present an efficient non-blocking solution to the general readers/writers inter-task communication problem; our solution allows an arbitrary number of readers and writers to perform their respective operations. With a simple and efficient memory management scheme in it, our protocol needs n + m + 1 memory slots (buffers) for n readers and m writers and is optimal with respect to space requirements. Sorensen and Hemacher in [8] have proven that n + m + 1 memory slots (buffers) are necessary. Together with the protocol, its schedulability analysis and a set of schedulability tests for a set of random task sets are presented. Both the schedulability analysis and the schedulability experiments show that the algorithm presented in this paper exhibits less overhead than the lock-based protocol. Our protocol extends previous results by allowing an arbitrary number of tasks to perform read or write operations concurrently without trading efficiency. In previous work, Simpson [6] presented a non-blocking asynchronous protocol for task communication between 1 writer and 1 reader which needs 4 buffers. Chen and Burns [7] presented a non-blocking synchronous protocol for n readers and 1 writer that needs n + 1 + 1 buffers. Kopetz and Reisinger [2] also presented a non-blocking synchronous protocol for n readers and 1 writer that contains a mechanism to configure the number of buffers to the application requirements, trading memory space for execution time.

We also believe that the memory management scheme that we introduce in this paper and use in our protocol is of interest and can be used as an independent component with other non-blocking shared data object implementations.

The rest of this paper is organised as follows. In Section 2 we give a description of the basic characteristics of a multiprocessor architecture and describe the formal requirements that any solution to the synchronisation problem that we are addressing must guarantee. Section 3 presents our protocol. In Section 4, we give the proof of correctness of the protocol. Section 5 is devoted to the schedulability analysis and schedulability experiments that compare our non-blocking protocol with the lock-based one. The paper concludes with Section 6.

2. Problem Statement

2.1. Real-time Multiprocessor System Configuration

A typical abstraction of a shared memory multiprocessor real-time system configuration is depicted in Figure 1. Each node of the system contains a processor together with its local memory. All nodes are connected to the shared memory via an interconnection network. A set of cooperating tasks (processes) with timing constraints is running on the system, performing their respective operations. Each task is sequentially executed on one of the processors, while each processor can serve (run) many tasks at a time. The cooperating tasks, possibly running on different processors, use shared data objects built in the shared memory to coordinate and communicate. Every task has a maximum computing time and has to be completed by a time specified by a deadline. Tasks synchronise their operations through read/write operations to shared memory. (Throughout the paper the terms process and task are used interchangeably.)

2.2. General Reader/Writer Problem

In this paper we are interested in the general read/write buffer problem, where several reader-tasks
(the readers) access a buffer maintained by several writer-tasks (the writers). There is no limit on the length of the buffer, which can be increased by the writers on-line depending on the length of the data that they want to write in one atomic operation. This shared data object can be used to increase the word length of the system.

The accessing of the shared object is modelled by a history h. A history h is a finite (or infinite) sequence of operation invocation and response events. Any response event is preceded by the corresponding invocation event. For our case there are two different operations that can be invoked, a read operation or a write operation. An operation is called complete if there is a response event in the same history h; otherwise, it is said to be pending. A history is called complete if all its operations are complete. In a global time model each operation q "occupies" a time interval [s_q, f_q] on one linear time axis, where s_q and f_q are the starting and finishing time instants of q; during this interval the operation is pending.

    Compare_and_Swap(int *mem, register old, new)
    {
        temp = *mem;
        if (temp == old)
        {
            *mem = new;
            new = old;
        }
        else
            new = *mem;
    }

    Figure 2. The Compare_and_Swap atomic primitive
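On systems that lack a native Compare_and_Swap instruction, the specification of Figure 2 can be emulated with the C11 atomics library. The sketch below is one such emulation, not code from the paper; the function name and the boolean result are our choices. On failure, `atomic_compare_exchange_strong` writes the observed memory value back into `old`, which mirrors the `new = *mem` branch of Figure 2.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* C11 emulation of the Compare_and_Swap primitive of Figure 2:
   atomically set *mem to 'new' iff *mem still equals 'old',
   reporting whether the swap took place. */
bool compare_and_swap(atomic_int *mem, int old, int new)
{
    /* On failure, the value actually observed in *mem is stored
       into the local 'old', corresponding to the 'new = *mem'
       branch in Figure 2. */
    return atomic_compare_exchange_strong(mem, &old, new);
}
```
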
There exists a precedence relation on operations in a history, denoted by <_h, which is a strict partial order: q1 <_h q2 means that q1 ends before q2 starts. Operations incomparable under <_h are called overlapping. A complete history h is linearisable if the partial order <_h on its operations can be extended to a total order ->_h that respects the specification of the object [1]. For our object this means that each read operation should return the value written by the write operation that directly precedes the read operation in this total order (->_h).

To sum up: as we are looking for a non-blocking solution to the general Reader/Writer problem for real-time systems, we are looking for a solution that satisfies the following:

- Every read operation guarantees the integrity and coherence of the data it returns.

- The behaviour of each read and write operation is predictable and can be calculated for use in the scheduling analysis.

- Every possible history of our protocol should be linearisable.

We assume that writer-tasks have the highest priority on their host processor and that no two writer-tasks execute on one host processor. Two writer-tasks may overlap, but no write process can be preempted by another process. Reader processes may have different priorities and may be scheduled together with a writer process on the same processor. These assumptions are justified because writer-tasks usually interact with the environment (e.g. samplers for temperature or pressure), and they are of high priority. To the best of our knowledge the above assumption holds in most real-time systems.

3. The Protocol

3.1. Idea description

Our construction divides the memory into memory slots and uses a pointer that points to the slot with the latest written data. Each slot has a flag field used by the special memory management mechanism in the protocol. Through this mechanism, i) writers can find a safe slot to write to, without corrupting the slots from which overlapping readers are getting their values, and ii) readers find the slot with the latest information.

In the worst case, n readers occupy n slots to read and m writers allocate m slots to write, and the pointer points to yet another slot. So, in total we need at most n + m + 1 slots for our protocol. In [8], Sorensen and Hemacher showed that n + m + 1 slots are necessary for this problem.

3.2. Protocol Description

Our protocol uses the instructions Compare_and_Swap and Fetch_and_Add. (The IBM System 370 was the first computer system that introduced Compare_and_Swap.) The respective specifications of these instructions are shown in Figure 2 and Figure 3. Most multiprocessor systems either provide these primitives or provide others that can be used to emulate them.

    int Fetch_and_Add(int *mem, int increment)
    {
        temp = *mem;
        *mem = *mem + increment;
        return temp;
    }

    Figure 3. The Fetch_and_Add atomic primitive

Figure 4 presents commented pseudo-code for our non-blocking protocol. We use n + m + 1 = TASK_NUM + 1 slots. Each slot has a field, called `used', that is maintained by the memory management layer of our protocol. There are a number of values that this field can carry; these values, together with an informal description of the associated information, are as follows:

- k = 0: the writer has finished writing in the respective slot, but no reader has read it yet.

- k > 0: this is the slot with the most recent value, and there are k readers reading this slot at this moment.

- 0 > k > -(n + m + 1): this is not the slot with the most recent value, but there are k + (n + m + 1) readers currently reading it. These readers started their operations long ago, when this slot still had the most recent value.

- k = -(n + m + 1): this is not the slot with the most recent value, and there is no reader reading it; the combination of these two makes it ideal for a writer to allocate this slot for its current operation and use it.

- k = -2(n + m + 1): a writer has allocated this slot and is currently writing in it.

    structure LF_buffer {               /* lock-free buffer */
        data: array[LENGTH_BUF] of Character;
        used: integer;
            /* -2*TASK_NUM        means allocated & writing data
               -TASK_NUM          means the slot is free
               -TASK_NUM+1 .. -1  means the slot is free but
                                  readers are still accessing it
               >= 0               means valid data */
    }

    work_buf: array[0..TASK_NUM] of LF_buffer
    dataptr : pointer to LF_buffer

    function newbufcell(): pointer to LF_buffer
        for i = 0 to TASK_NUM
            if cas(work_buf[i].used, -TASK_NUM, -2*TASK_NUM)
                return &work_buf[i]
        return NULL

    function initdata()
        for i = 0 to TASK_NUM
            work_buf[i].used = -TASK_NUM
        dataptr = &work_buf[0]

    function writebuf(data: pointer to datatype)
        temp = newbufcell()        /* allocate a new buffer cell for the write */
        writingsth(temp, data)     /* write the data into the new buffer cell  */
        temp.used = 0              /* the data in the cell is now valid        */
        old = swap(dataptr, temp)  /* atomically publish the new cell and
                                      obtain the previous one                  */
        faa(old->used, -TASK_NUM)  /* if no reader is left, the old slot
                                      becomes free (-TASK_NUM)                 */

    function readbuf(): pointer to datatype
        loop
            reading = dataptr
        until faa(reading->used, 1) > -TASK_NUM
            /* increase used by 1; if its old value was greater than
               -TASK_NUM the slot cannot be recycled under us        */
        result = readingsth(reading)
        faa(reading->used, -1)
        return result

    Figure 4. The structure and operations description for our non-blocking shared object

The read/write protocol can be described informally as follows. A writer runs the following steps whenever it wants to write data into the object:

1. First, it allocates a free slot from the slots with flag entry k = -(n + m + 1).
The writer uses Compare_and_Swap to read and change the flag from k = -(n + m + 1) to k = -2(n + m + 1) in one atomic operation; in this way, if several writers want to allocate the same slot, the Compare_and_Swap operation, available at hardware level, guarantees that only one of them will succeed.

2. Then, it writes the data into that slot. If a reader tries to access the slot during this writing, the reader will give up and be forced to try again, as we will see in the last step of the reader protocol description.

3. After the writing has finished, the writer changes the flag entry of that slot from k = -2(n + m + 1) to 0.

4. As a next step, the writer changes the data pointer variable to this new slot (so that subsequent reads find a pointer to the fresh value) with the atomic primitive Swap, obtaining the pointer to the old slot at the same time.

5. Finally, the writer changes the flag entry of the old slot from k > 0 to k < 0 by subtracting n + m + 1.

A description of the different states that a slot can be in, together with the actions that cause the changes between these states, is depicted in Figure 5.

Figure 5. Slot state changing graph (diagram: a slot with flag = -(n+m+1) is in the "writable" state; a writer allocates it, flag = -2(n+m+1), writes, and publishes it, flag = 0, putting it in the "readable" state; each reader directed to the slot increments the flag by one on entry and decrements it by one on return; when the data pointer moves to another cell, the writer subtracts n+m+1, and once the last reader leaves, the slot is writable again)
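The five writer steps above can be sketched in C11 atomics. This is our own illustrative reconstruction of Figure 4's writebuf under assumed names (NTASK plays the role of TASK_NUM, and a fixed-size int payload replaces the variable-length data array); it is not the paper's verbatim code, and the initial state of slot 0 (published with a zero value) is our choice so that the state machine starts consistent.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NTASK  4            /* assumed value of TASK_NUM            */
#define NSLOTS (NTASK + 1)  /* slot count, as in Figure 4's array   */

struct slot { atomic_int used; int data; };

static struct slot buf[NSLOTS];
static _Atomic(struct slot *) dataptr;

/* Initialisation: all slots free; slot 0 published with a valid
   (zero) value so the first reader has something to read. */
void initbuf(void)
{
    for (int i = 0; i < NSLOTS; i++)
        atomic_store(&buf[i].used, -NTASK);          /* free       */
    atomic_store(&buf[0].used, 0);                   /* valid data */
    atomic_store(&dataptr, &buf[0]);
}

void writebuf(int value)
{
    struct slot *s = NULL;
    /* step 1: allocate a free slot, CAS flag -NTASK -> -2*NTASK   */
    for (int i = 0; s == NULL; i = (i + 1) % NSLOTS) {
        int expected = -NTASK;
        if (atomic_compare_exchange_strong(&buf[i].used,
                                           &expected, -2 * NTASK))
            s = &buf[i];
    }
    s->data = value;                                 /* step 2: write data */
    atomic_store(&s->used, 0);                       /* step 3: mark valid */
    struct slot *old = atomic_exchange(&dataptr, s); /* step 4: publish    */
    atomic_fetch_add(&old->used, -NTASK);            /* step 5: retire old */
}
```

The single atomic exchange in step 4 is what makes publication non-blocking: readers that loaded the old pointer keep reading a consistent slot, while new readers immediately see the fresh one.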
A reader performs the following steps during each read operation:

1. It reads the data pointer variable to get a pointer to the slot with the most recently written value.

2. It uses the atomic primitive Fetch_and_Add to read the value k of the flag entry of that slot and add 1 to it.

3. There are three possibilities for the value k that the Fetch_and_Add will return to the reader:

(a) k >= 0: the slot has the most recent value written by all writers until now, and the reader can read and return this value.

(b) 0 > k > -(n + m + 1): the slot holds the most recent value with respect to this specific reader from the linearisability point of view, and the reader can return this value.

(c) k <= -(n + m + 1): the slot is being recycled by a writer and no longer holds a value this reader may return; the reader gives up and tries again from step 1.

From the above description, it is easy to check that the protocol: i) guarantees a continuous flow of information from the writers to the readers, without any blocking; ii) makes it possible to add or remove readers and writers on the fly without any change to the protocol of the other tasks; and iii) does not require information about how big the buffer has to be.

4. Correctness

4.1. Model and Definitions

In [9] one can find a formalism for the notion of the atomic buffer and the global time assumption that we adopt. We assume that each operation Op has a time interval [s_Op, f_Op] on a linear time axis. We can think of s_Op and f_Op as the starting and finishing times of Op. Moreover, we assume that there is a precedence relation on operations which is a strict partial order (denoted by `<_h'). Semantically, a <_h b means that operation a ends before operation b starts. If two operations are incomparable under <_h, they are said to overlap.
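The precedence and overlap relations just defined translate directly into code. The helper below is our own illustration (the type and function names are invented, not the paper's); it treats each operation as the interval [s_Op, f_Op] it occupies on the global time axis.

```c
#include <stdbool.h>

/* An operation modelled by its time interval [s, f] on the
   linear global time axis, as in Section 4.1. */
struct op_interval { double s, f; };

/* a <_h b : operation a ends before operation b starts. */
bool precedes(struct op_interval a, struct op_interval b)
{
    return a.f < b.s;
}

/* Two operations overlap iff they are incomparable under <_h,
   i.e. neither one ends before the other starts. */
bool overlaps(struct op_interval a, struct op_interval b)
{
    return !precedes(a, b) && !precedes(b, a);
}
```
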