Software Transactional Memory

Software Transactional Memory Nir Shavit Dan Touitou TelAviv University TelAviv University Abstract As we learn from the literature exibilityincho osing synchronization op erations greatly sim plies the task of designing highly concurrent programs Unfortunately existing hardware is in LinkedStore Conditional op eration on a single word exible and is at b est on the level of a Load Building on the hardware based transactional synchronization metho dology of Herlihy and Moss we oer softwaretransactional memory STManovel software metho d for supp orting exible transactional programming of synchronization op erations STM is nonblo cking and can b e imple mented on existing machines using only a Load LinkedStore Conditional op eration We use STM to provide a general highly concurrent metho d for translating sequential ob ject implementations to nonblo cking ones based on implementing a k word compareswap STMtransaction Empirical evidence collected on simulated multipro cessor architectures shows that the our metho d always outp erforms all the nonblo cking translation metho ds in the style of Barnes and outp erforms Her lihys translation metho d for suciently large numb ers of pro cessors The key to the eciency of our softwaretransactional approachisthatunlike Barnes style metho ds it is not based on a costly recursive helping p olicy Contact Author Email shanirtheorylcsmitedu y A preliminary version of this pap er app eared in the th ACM Symp osium on the Principles of Distributed Computing Ottowa Ontario Canada Intro duction A ma jor obstacle on the way to making multipro cessor machines widely acceptable is the diculty of programmers in designing highly concurrent programs and data structures Given the growing realization that unpredictable delay is an increasingly serious problem in mo dern multipro cessor architectures we argue that conventional techniques for implementing concurrent ob jects by means of critical sections are unsuitable since they limit parallelism increase contention for memory and interconnect and make the system vulnerable to timing anomalies and pro cessor failures The key to highly concurrent programming is to decrease the numb er and size of critical sections amultipro cessor program uses p ossibly eliminating critical sections altogether by constructing classes of implementations that are nonblocking As we learn from the literature exibilityincho osing the synchronization op erations greatly simplies the task of designing non blo cking concurrent programs Examples are the nonblo cking datastructures of Massalin and Pu which use a CompareSwap on twowords Andersons parallel path compression on lists which uses a sp ecial Splice op eration the counting networks of which use combination of FetchComplement and FetchInc Israeli and Rapp op orts Heap which can b e implemented using a threeword CompareSwap and many more Unfortunately most of the currentorsoon Load LinkedStore Conditional to b e develop ed architectures supp ort op erations on the level of a op eration for a single word making most of these highly concurrent algorithms impractical in the near future Bershad suggested to overcome the problem of providing ecient programming primitives on existing machines by employing op erating system supp ort Herlihy and Moss have prop osed an ingenious hardware solution transactional memory By adding a sp ecialized asso ciative cache and making several minor changes to the cache consistency proto cols they are able to supp ort a exible transactional language for writing synchronization op erations Any synchronization op eration can b e written as a transaction and executed using an optimistic algorithm built into the consistency proto col Unfortunately though this solution is blocking This pap er prop oses to adopt the transactional approach but not its hardware based implemen tation Weintro duce softwaretransactional memory STM a novel design that supp orts exible transactional programming of synchronization op erations in software Though we cannot aim for the same overall p erformance our software transactional memory has clear advantages in terms of applicabili tytotodays machines p ortability among machines and resiliency in the face of timing anomalies and pro cessor failures We fo cus on implementations of a software transactional memory that supp ort static transactions that is transactions which access a predetermined sequence of lo cations This class includes most of the known and prop osed synchronization primitives in the literature STM in a nutshell In a nonfaultyenvironment the way to ensure the atomicity of the op erations is usually based on lo cking or acquiring exclusively ownerships on the memory lo cations accessed by an op eration Op If a transaction cannot capture an ownerships it fails and releases the ownerships already acquired Otherwise it succeeds in executing Op and frees the ownerships acquired To guarantee liveness one must rst eliminate deadlo cks which for static transactions is done by acquiring the ownerships needed in some increasing order In order to continue ensuring liveness in a faultyenvironment wemust make certain that every transaction completes even if the pro cess which executes it has b een delayed swapp ed out or crashed This is achieved by a helping metho dology forcing other transactions which are trying to capture the same lo cation to help the owner of this lo cation to complete its own transaction The key feature in the transactional approach is that in order to free a lo cation one need only help its single owner transaction Moreover one can eectively avoid the overhead of co ordination among several transactions attempting to help release a lo cation by employing a reactive helping p olicy whichwe call nonredundanthelping SequentialtoNonBlo cking Translation One can use STM to provide a general highly concurrent metho d for translating sequential ob ject implementations into nonblo cking ones based on the caching approachof The approachis straightforward use transactional memory to to implementany collection of changes to a shared ob ject p erforming them as an atomic kword CompareSwap transaction see Figure on the desired lo cations The nonblo cking STM implementation guarantees that some transaction will always succeed Herlihyin referred to in the sequel as Herlihys method was the rst to oer a general transformation of sequential ob jects into nonblo cking concurrent ones According to his metho d ology up dating a data structure is done by rst copying it into a new allo cated blo ckofmemory making the changes on the new version and tentatively switching the p ointer to the new data struc LinkedStore Conditional atomic op erations Unfortunately ture all that with the help of Load Herlihys method do es not provide a suitable solution for large data structures and like the stan dard approachoflocking the whole ob ject do es not supp ort concurrent up dating Alemanyand Felten and LaMarca suggested to improve the eciency of this general metho d at the price of lo osing p ortabilityby using op erating system supp ort making a set of strong assumptions on system b ehavior Toovercome the limitations of Herlihys method Barnes in intro duced his caching metho d that avoids copying the whole ob ject and allows concurrent disjoint up dating A similar approach was indep endently prop osed byTurek Shasha and Prakash According to Barnes a pro cess rst simulates the execution of the up dating in its private memory ie reading a lo cation for the rst time is done from the shared memory but writing is done into the private memory Then the pro cess uses an nonblo cking kword ReadModifyWrite atomic op eration whichchecks if the values contained in the memory are equivalent tothethevalue read in the cache up date If this is the case the op eration stores the new values in the memory Otherwise the pro cess restarts from the b eginning Barnes suggested to implement the kword ReadMo difyWrite bylocking in ascending order of their key the lo cations involved in the up date executing the op eration and after executing the op eration needed releasing the lo cks The key to achieving the nonblo cking resilientbehavior in the caching approachof is the cooperative method whenever a pro cess needs a lo cation already lo cked by another pro cess it helps the lo cking pro cess to complete its own op eration and this is done recursively along the dep endency chain Though Barnes and Turek Shasha and Prakash are vague on sp ecic implementation details a recent pap er by Israeli and Rapp op ort gives using the co op erative metho d a clean and streamlined implementation of a nonblo cking kword CompareSwap using Load LinkedStore Conditional However as our empirical results suggest b oth the general metho d and its sp ecic implementation havetwo ma jor drawbacks whichareovercome by our STM based translation metho d The cooperative method has a recursive structure of helping which frequently causes pro cesses to help other pro cesses which access a disjoint part of the data structure Unlike STMs transactional kword CompareSwap op erations which mostly fail on the transaction level and are thus not help ed a high p ercentage of co op erative kword Com pareSwap op erations fail but generate contention since they are nevertheless help ed by other pro cesses Take for example a pro cess P which executes a word CompareSwap on lo cations a and b Assume that some other pro cess Q already owns b According to the co op erative metho d P rst helps Q complete its op eration

Load more