A simple correctness proof of the MCS contention-free lock

Theodore Johnson

Krishna Harathi

Computer and Information Sciences Department

University of Florida

Abstract

Mellor-Crummey and Scott present a spin lock that avoids network contention by having processors spin on local memory locations. Their algorithm is equivalent to a lock-free queue with a special access pattern. The authors provide a complex and unintuitive proof of the correctness of their algorithm. In this paper, we provide a simple proof that the MCS lock is a correct solution, which provides insight into why the algorithm is correct.

Keywords: mutual exclusion, serializability, parallel processing, spin lock.

Introduction

Mellor-Crummey and Scott present a simple spin lock for mutual exclusion, which they call the MCS lock. Their algorithm has several desirable properties: it is fair, because it is equivalent to a FIFO queue ADT with a special access pattern, and it is contention free, because each processor spins on a local variable. In contrast, test-and-set spin waiting is not fair and will severely limit performance.

The authors of the MCS lock provide a complex proof of the correctness of their algorithm: they show correctness through a series of algorithm transformations. Since this proof provides little insight, they also provide intuitive correctness arguments.

In this paper, we present a simple correctness proof for the MCS lock. We show that the underlying queue is decisive-instruction serializable, and that no operation accesses a garbage address. We conclude that the MCS lock is a correct solution to the critical section problem.

MCS-Lock Algorithm

The code for the MCS lock, which is shown below, relies on two atomic read-modify-write operations. The swap instruction takes two parameters: L, a pointer to a memory location, and I, a value. The execution of swap(L, I) puts I into L and returns the old value of L. The compare_and_swap instruction takes a third parameter, X, a value. The execution of compare_and_swap(L, I, X) puts I into L only if L currently contains the value X. In either case, compare_and_swap returns the old value in L. Note that the listing calls compare_and_swap(L, I, nil) in the form used by Mellor-Crummey and Scott: nil is stored into L only if L currently contains I, and the call reports whether the swap took place.
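
The semantics of the two primitives, as defined in the paragraph above, can be sketched in Python. This is only an illustration: the class name AtomicCell and its methods are hypothetical, and atomicity is simulated with a mutex, whereas real hardware provides these instructions directly.

```python
import threading


class AtomicCell:
    """A memory location with atomic read-modify-write operations.

    Hypothetical helper: atomicity is simulated with a mutex.
    """

    def __init__(self, value=None):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        with self._guard:
            return self._value

    def swap(self, new):
        """swap(L, I): put `new` into the cell and return the old value."""
        with self._guard:
            old, self._value = self._value, new
            return old

    def compare_and_swap(self, new, expected):
        """compare_and_swap(L, I, X): put `new` into the cell only if it
        currently contains `expected`; return the old value either way."""
        with self._guard:
            old = self._value
            if old == expected:
                self._value = new
            return old


L = AtomicCell(1)
old_swap = L.swap(2)                    # cell goes from 1 to 2
old_fail = L.compare_and_swap(3, 9)     # cell does not contain 9: unchanged
old_ok = L.compare_and_swap(3, 2)       # cell contains 2: it becomes 3
```

Both operations return the previous contents, so a caller can tell whether compare_and_swap succeeded by comparing the returned value with the expected one.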

Each processor that sets a lock first allocates a locally-accessible record. If process p inserts record r into the queue, then we say that r is p's record and p is r's process. This record contains a boolean flag for spin waiting and a link for forming a queue. The tail of the queue is pointed to by L, and the head of the queue is implicitly pointed to by the lock holder. To set a lock, a processor executes the acquire_lock procedure and adds its record, pointed to by I, to the end of the queue. The enqueuing is performed by executing the fetch_and_store instruction (the swap instruction described above), which atomically stores I in L and returns L's old value. If the processor has a predecessor in the queue, the processor completes the link; otherwise, the processor is the lock holder.

To release a lock, the processor must reset the busy-waiting bit of its successor. If the processor has no successor, it sets L to nil atomically with the compare_and_swap instruction.

    type qnode = record
        next : ^qnode               // ptr to successor in queue
        locked : Boolean            // busy-waiting necessary
    type lock = ^qnode              // ptr to tail of queue

    // I points to a queue link record allocated (in an enclosing scope)
    // in shared memory locally accessible to the invoking processor

    procedure acquire_lock (L : ^lock, I : ^qnode)
        var pred : ^qnode
        I->next := nil                      // Initially, no successor
        pred := swap (L, I)                 // Queue for lock
        if pred != nil                      // Lock was not free
            I->locked := true               // Prepare to spin
            pred->next := I                 // Link behind predecessor
            repeat while I->locked          // Busy wait for lock

    procedure release_lock (L : ^lock, I : ^qnode)
        if I->next = nil                    // No known successor
            if compare_and_swap (L, I, nil)
                return                      // No successor, lock free
            repeat while I->next = nil      // Wait for successor
        I->next->locked := false            // Pass lock

MCS-Lock Algorithm (from Mellor-Crummey and Scott)
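
The listing can be exercised with a small Python sketch. It is an illustration under stated assumptions, not the authors' implementation: atomic instructions are simulated with a mutex, spinning threads yield with time.sleep(0), a fresh record is used for every acquisition, and the names AtomicCell, QNode, worker, run, request_order and grant_order are hypothetical. The sketch logs the order of the swap instructions and the order of entries into the critical section so that the two can be compared.

```python
import threading
import time


class AtomicCell:
    """Lock word L; `guard` simulates atomic read-modify-write instructions."""
    def __init__(self):
        self.value = None
        self.guard = threading.Lock()


class QNode:
    def __init__(self, owner):
        self.owner = owner      # hypothetical label, used only for logging
        self.next = None        # successor in the queue
        self.locked = False     # True while the owner must spin


request_order = []   # owners, in the order of their swap instructions
grant_order = []     # owners, in the order they entered the critical section


def acquire_lock(L, I):
    I.next = None                        # initially, no successor
    with L.guard:                        # atomic swap(L, I)
        pred, L.value = L.value, I
        request_order.append(I.owner)    # logged atomically with the swap
    if pred is not None:                 # lock was not free
        I.locked = True                  # prepare to spin
        pred.next = I                    # link behind predecessor
        while I.locked:                  # spin on our own record
            time.sleep(0)                # yield so other threads can run


def release_lock(L, I):
    if I.next is None:                   # no known successor
        with L.guard:                    # atomic compare_and_swap(L, I, nil)
            swapped = L.value is I
            if swapped:
                L.value = None
        if swapped:
            return                       # no successor, lock is free
        while I.next is None:            # successor swapped but has not linked
            time.sleep(0)
    I.next.locked = False                # pass the lock


def worker(L, name, rounds, shared):
    for k in range(rounds):
        I = QNode((name, k))             # a record is never reused
        acquire_lock(L, I)
        grant_order.append(I.owner)      # critical section begins
        tmp = shared["count"]
        time.sleep(0)                    # widen the window for a lost update
        shared["count"] = tmp + 1        # critical section ends
        release_lock(L, I)


def run(threads=4, rounds=50):
    L = AtomicCell()
    shared = {"count": 0}
    request_order.clear()
    grant_order.clear()
    pool = [threading.Thread(target=worker, args=(L, t, rounds, shared))
            for t in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return shared["count"], L.value


count, tail = run()
```

If the lock provides mutual exclusion, no increment is lost and count equals threads times rounds. If it is fair, request_order and grant_order are identical, and L is nil again once every thread has released.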

Correctness

We show that the MCS lock is correct by showing that it maintains a queue, and that the head of the queue is the process that holds the lock. The MCS lock is decisive-instruction serializable: each operation has a single decisive instruction, and corresponding to a concurrent execution $C$ of the queue operations there is an equivalent serial execution $S_d$ such that if operation $O_1$ executes its decisive instruction before operation $O_2$ does in $C$, then $O_1 < O_2$ in $S_d$. The decisive-instruction serializability of the MCS lock greatly simplifies its correctness proof, because the equivalent queue is in a single state at any instant. In contrast, a concurrent data structure that is linearizable but not serializable might be in several states simultaneously.

The Queue ADT

A queue is an Abstract Data Type that consists of a finite set $Q$ and two operations, enqueue and dequeue. The elements of $Q$ are totally ordered. We write the state of a queue as $Q = (q_1, q_2, \ldots, q_n)$, where $q_1 <_Q q_2 <_Q \cdots <_Q q_n$.

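The enqueue operation is specified by

$$\mathit{enqueue}((q_1, q_2, \ldots, q_n), q) \rightarrow (q_1, q_2, \ldots, q_n, q)$$

The dequeue operation on a nonempty queue is specified by

$$\mathit{dequeue}((q_1, q_2, \ldots, q_n)) \rightarrow (q_2, \ldots, q_n)$$

where the return value is $q_1$. A dequeue operation on an empty queue is undefined.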

We define two functions on a queue Q, head(Q) and tail(Q). The head function returns the least element of the queue, and the tail function returns the greatest element of the queue.

Corresponding to an actual MCS lock M, there is an abstract queue Q. Initially, both M and Q are empty. When process p with record r performs the decisive instruction for an acquire_lock operation, Q changes state to enqueue(Q, r). When a process performs the decisive instruction for a release_lock operation, Q changes state to dequeue(Q). We will show that the actual MCS lock corresponds to the abstract queue.
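
The abstract queue and its state changes can be modelled in a few lines of Python. This is a minimal sketch: queue states are tuples, and the function replay, together with the ("acquire", r) and ("release", None) encoding of decisive instructions, is a hypothetical convention introduced only for illustration.

```python
def enqueue(Q, q):
    """enqueue((q1, ..., qn), q) -> (q1, ..., qn, q)"""
    return Q + (q,)


def dequeue(Q):
    """dequeue((q1, q2, ..., qn)) -> (q2, ..., qn), returning q1 as well."""
    if not Q:
        raise ValueError("dequeue on an empty queue is undefined")
    return Q[1:], Q[0]


def head(Q):
    return Q[0]     # least element of the queue


def tail(Q):
    return Q[-1]    # greatest element of the queue


def replay(decisive_instructions):
    """Apply decisive instructions, in order, to an initially empty queue.

    Each item is ("acquire", r) for the decisive instruction of an
    acquire_lock by the process with record r, or ("release", None) for
    the decisive instruction of a release_lock. Returns every state of Q.
    """
    Q = ()
    states = []
    for op, r in decisive_instructions:
        if op == "acquire":
            Q = enqueue(Q, r)
        else:
            Q, _ = dequeue(Q)
        states.append(Q)
    return states


states = replay([
    ("acquire", "r1"),
    ("acquire", "r2"),
    ("release", None),
    ("acquire", "r3"),
    ("release", None),
    ("release", None),
])
```

For this sequence the queue passes through the states (r1), (r1, r2), (r2), (r2, r3), (r3) and finally the empty queue. Because each operation contributes exactly one decisive instruction, the abstract queue is in a single well-defined state after every step.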

Execution Sequences

The MCS lock executes correctly because it is presented with a special sequence of concurrent operations. We assume that each processor uses the acquire_lock and release_lock operations to synchronize access to a resource:

    acquire_lock(L, r)
    critical section
    release_lock(L, r)

If the MCS lock is correct, then it is a multiple-enqueue, single-dequeue queue. Any number of acquire_lock operations might execute concurrently, but we are guaranteed that:

1. At most one release_lock operation executes at any given time. Let $D_1$ and $D_2$ be two different release_lock executions, executing in the time intervals $I_1$ and $I_2$, respectively. Then $I_1$ and $I_2$ do not overlap.

2. No release_lock operation executes between the time that a release_lock sets L to nil and the time that the first subsequent acquire_lock operation terminates. Let $D$ be the execution of a release_lock operation that sets L to nil, and let $D$ complete its execution at time $t_d$. Let $E$ be the execution of the acquire_lock operation that performs the first decisive instruction after time $t_d$. If $E$ completes its execution at time $t_e$, then no release_lock operation executes in the time interval $(t_d, t_e)$.

Queue Invariants

The MCS lock maintains three invariant conditions.

Tail Invariant: If the abstract queue is nonempty, then L points to the record at the tail of the abstract queue. If the queue is empty, L is nil.

We define the head process to be the process whose record is at the head of the abstract queue. A process that is waiting for I->locked to become false is blocked.

Head Invariant: If the head process is blocked, it will be unblocked by the time that the preceding release_lock operation terminates.

Blocking Invariant: A non-head process will not exit the acquire_lock procedure.

Decisive Operations

The decisive instruction in the acquire_lock procedure is the fetch_and_store instruction. The decisive instruction in the release_lock procedure is the reading of I->next if the fetch returns a non-nil value, and is the compare_and_swap instruction otherwise.

Theorem: The MCS lock is decisive-instruction serializable.

Proof: To show that the MCS lock is decisive-instruction serializable, we first show that the three invariants are always maintained. To show that the invariants are maintained, we assume that only the head process executes a release_lock operation. We then show that this assumption holds by induction.

Lemma: The tail invariant is always maintained.

Proof: When a process performs the decisive instruction for an acquire_lock operation, it sets the tail pointer L to its record. Therefore, acquire_lock operations maintain the tail invariant. A release_lock operation modifies the tail pointer by the compare_and_swap instruction only. The compare_and_swap will succeed only if the queue contains exactly one element. Therefore, the release_lock operation also maintains the tail invariant.

Lemma: If only the head process executes a release_lock operation, the blocking invariant is always maintained.

Proof: Suppose that the record of a process is in the queue, but isn't the head record. By the tail invariant, the process must have read a non-nil tail pointer when it performed the decisive instruction for the acquire_lock operation. The process will therefore wait until the locked field of its record, which is initialized to true, is set to false. This bit will only be reset when the process of the predecessor record executes a release_lock operation. But after the release_lock decisive instruction, the process becomes the head process.

Lemma: The head invariant is always maintained.

Proof: Suppose that when a head process P performs its decisive instruction on a release_lock operation, dequeue(Q) is nonempty. In this case, the decisive instruction will report that the tail pointer is not the process's record. Then P will wait until the next field of its record R is non-nil. By the tail invariant, this field will eventually point to R's successor in Q, r, because of the execution of r's process, p, in the acquire_lock procedure. After P's decisive instruction, r is the head record and p is the head process. Process P will reset the locked bit of r, unblocking p, before P completes.

Suppose that when P performs its decisive instruction, Q becomes empty. If p is the process that executes the next decisive instruction, which must be for an acquire_lock operation, p will become the head process and will find that L is nil. Since p finds that L is nil, p never blocks.

Lemma: A process executes a release_lock operation only when it is the head process.

Proof: We proceed by induction on the $i$-th decisive instruction. For the base case, consider the first operation that executes a decisive instruction. Since the queue is empty, the operation must be an acquire_lock, and the lemma holds.

Suppose that the lemma holds for the $i$-th decisive instruction. If the $(i+1)$-th decisive instruction is due to an acquire_lock, the lemma holds. Suppose that the $(i+1)$-th decisive instruction is due to a release_lock operation. By the execution sequence assumption, the process that executes this operation must have previously executed an acquire_lock operation. By the blocking invariant, the process must be the head process, so the lemma holds.

Since the head invariant is always maintained, locks are granted in the order requested, where the order of requests is the order of the decisive instructions.

Garbage

We next show that the queue does not access garbage locations. We assume that when a process executes the acquire_lock procedure, it first allocates a record. When a process exits the release_lock procedure, it discards the record that was at the head of the queue (pointed to by I). For simplicity, we assume that a record is never reused. A record is valid during the time between its creation and its deletion.

Theorem: No process accesses an invalid queue record.

Proof: By the blocking invariant, only the head process performs a release_lock operation, and the process dequeues its own record. Therefore, whenever a process accesses its record, it accesses a valid record.

When a process executes the acquire_lock operation, the process writes a pointer to its own record into the next field of its record's predecessor in Q, if one exists. The process of the predecessor record does not exit the release_lock procedure until the next field of its record is modified. Therefore, no process accesses an invalid record when executing the acquire_lock operation.

When a process executes the release_lock operation, it modifies its record's successor in Q. By the execution sequence assumption, this record remains in the queue until the process completes the operation. Therefore, no process accesses an invalid record when executing the release_lock operation.

Critical Section Solution

A critical section solution is correct if it satisfies the following three criteria:

1. At most one process executes in the critical section at any given time.

2. If no process is executing in the critical section and at least one process wishes to enter the critical section, then a process enters the critical section in a finite amount of time.

3. If a process wishes to enter the critical section, then a finite number of processes enter first.

We can see that the MCS lock is a correct solution to the critical section problem. We define the set of processes that are in, or wish to enter, the critical section as the set of processes whose records are in Q. By the blocking invariant, criterion 1 holds. By the head invariant, criterion 2 holds. The MCS lock is a decisive-instruction serializable FIFO queue, so criterion 3 holds.

Conclusion

The MCS lock is a method of implementing mutual exclusion that avoids network contention. Further, the MCS lock is fair, unlike the simple test-and-set solution. We present a simple correctness proof for the MCS lock. We show that the MCS lock is equivalent to a decisive-instruction serializable queue, and show that several queue invariants always hold. Whereas the correctness proof provided by Mellor-Crummey and Scott is indirect and provides little insight, our correctness proof is direct and provides insight into the algorithm's execution.

References

T. E. Anderson. The performance of spin lock alternatives for shared-memory multiprocessors. IEEE Transactions on Parallel and Distributed Systems.

R. R. Glenn, D. V. Pryor, J. M. Conroy, and T. Johnson. Characterizing memory hotspots in a shared memory MIMD machine. In Supercomputing. IEEE and ACM SIGARCH.

M. Herlihy and J. Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems.

J. Mellor-Crummey and M. Scott. Algorithms for scalable synchronization on shared-memory multiprocessors. Technical Report, Rice University, Dept. of CS.

J. M. Mellor-Crummey and M. L. Scott. Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans. Computer Systems.

D. Shasha and N. Goodman. Concurrent search structure algorithms. ACM Transactions on Database Systems.

A. Silberschatz, J. Peterson, and P. Galvin. Operating System Concepts. Addison-Wesley.