Time Complexity Bounds for Shared-Memory Mutual Exclusion

Total Page:16

File Type:pdf, Size:1020Kb

Time Complexity Bounds for Shared-Memory Mutual Exclusion Time Complexity Bounds for Shared-memory Mutual Exclusion by Yong-Jik Kim A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science. Chapel Hill 2003 Approved by: James H. Anderson, Advisor Sanjoy K. Baruah, Reader Jan F. Prins, Reader Michael Merritt, Reader Lars S. Nyland, Reader Jack Snoeyink, Reader ii c 2003 Yong-Jik Kim ALL RIGHTS RESERVED iii ABSTRACT YONG-JIK KIM: Time Complexity Bounds for Shared-memory Mutual Exclusion. (Under the direction of James H. Anderson.) Mutual exclusion algorithms are used to resolve conflicting accesses to shared re- sources by concurrent processes. The problem of designing such an algorithm is widely regarded as one of the “classic” problems in concurrent programming. Recent work on scalable shared-memory mutual exclusion algorithms has shown that the most crucial factor in determining an algorithm’s performance is the amount of traffic it generates on the processors-to-memory interconnect [23, 38, 61, 84]. In light of this, the RMR (remote-memory-reference) time complexity measure was proposed [84]. Under this measure, an algorithm’s time complexity is defined to be the worst-case number of remote memory references required by one process in order to enter and then exit its critical section. In the study of shared-memory mutual exclusion algorithms, the following funda- mental question arises: for a given system model, what is the most efficient mutual exclusion algorithm that can be designed under the RMR measure? This question is important because its answer enables us to compare the cost of synchronization in dif- ferent systems with greater accuracy. With some primitives, constant-time algorithms are known, so the answer to this question is trivial. However, with other primitives (e.g., under read/write atomicity), there are still gaps between the best known algorithms and lower bounds. In this dissertation, we address this question. The main thesis to be supported by this dissertation can be summarized as follows. The mutual exclusion problem exhibits different time-complexity bounds as a function of both the number of processes (N) and the number of simultaneously active processes (contention, k), depending on the available synchronization primitives and the underlying system model. Moreover, these time-complexity bounds are nontrivial, in that constant-time algorithms are impossible in many cases. In support of this thesis, we present a time-complexity lower bound of Ω(log N/ log log N) for systems using atomic reads, writes, and comparison primitives, such as compare-and-swap. This bound is within a factor of Θ(log log N) of being optimal. iv Given that constant-time algorithms based on fetch-and-φ primitives exist, this lower bound points to an unexpected weakness of compare-and-swap, which is widely regarded as being the most useful of all primitives to provide in hardware. We also present an adaptive algorithm with Θ(min(k, log N)) RMR time complexity under read/write atomicity, where k is contention. In addition, we present another lower bound that precludes the possibility of an o(k) algorithm for such systems, even if comparison primitives are allowed. Regarding nonatomic systems, we present a Θ(log N) nonatomic algorithm, and show that adaptive mutual exclusion is impossible in such systems by proving that any nonatomic algorithm must have a single-process execution that accesses Ω(log N/ log log N)distinctvariables. Finally, we present a generic fetch-and-φ-based local-spin mutual exclusion algo- rithm with Θ(logr N) RMR time complexity. This algorithm is “generic” in the sense that it can be implemented using any fetch-and-φ primitive of rank r,where2≤ r<N. The rank of a fetch-and-φ primitive expresses the extent to which processes may “order themselves” using that primitive. For primitives that meet a certain additional con- dition, we present a Θ(log N/ log log N) algorithm, which is time-optimal for certain primitives of constant rank. v ACKNOWLEDGMENTS First, I want to thank my advisor, Jim Anderson, for his support and enthusiasm. I have learned an enormous amount from working with him, and this dissertation would not have been possible without his far-reaching insight and constant encouragement. I also want to thank Michael Merritt, who took the trouble of reading this large dissertation and flying to attend my defense. Thanks, too, to the rest of my committee: Jan Prins, Jack Snoeyink, Sanjoy Baruah, and Lars Nyland. I greatly appreciate their willingness to bend their schedules to accommodate countless meetings and exams, and also their support and encouragement throughout the process. I would also like to thank Guido Gerig, my ex-advisor, and people in the medical imaging group, for being supportive and friendly throughout my stay at UNC. In addition, I would like to thank people in the distributed algorithms research community for lively discussions and reviewing earlier versions of the results presented in this dissertation. I cannot hope to recall everyone who should be mentioned, but the list would surely include the following people: Hagit Attiya, Faith Fich, Maurice Herlihy, Leslie Lamport, Victor Luchangco, Maged Michael, Mark Moir, Eric Ruppert, Jennifer Welch, and many others. Many thanks to Anand Srinivasan, Phil Holman, and Shelby Funk, for their friend- ship and their help in term projects, writing papers, and other things. Lucia Cevidanes, I enjoyed working with you; I hope you will find a better matlab programmer who can decipher my code! Special thanks to the Korean friends who made my stay in Chapel Hill a lot more homelike: Juhyun, Sungeui, Joohee (of the math department), Joohi, Sang-Uok and Hye-Chung, In-Kyung, Hyemi Choi, Kuan-Hui Lee, and (of course) many, many others. No words will be enough to thank my family for their love and support. Thank you, Mom and Dad, Namhee, Heeyeon, and Yonggon. I love all of you. Finally, I would like to thank Eunjung, my lovely wife. Throughout the years we have been together, she has always been kind, understanding, and inspiring. Without her, I would have missed much of what life is. vi CONTENTS LIST OF TABLES x LIST OF FIGURES xi 1 Introduction 1 1.1KnownLocal-spinAlgorithms....................... 4 1.2KnownAdaptiveAlgorithms........................ 5 1.3KnownNonatomicAlgorithms....................... 6 1.4Contributions................................ 6 1.4.1 GeneralMutualExclusion..................... 8 1.4.2 AdaptiveMutualExclusion.................... 9 1.4.3 NonatomicMutualExclusion................... 9 1.4.4 Mutual Exclusion with fetch-and-φ Primitives . 10 1.5Organization................................ 11 2 Related Work 12 2.1Preliminaries................................ 13 2.2Local-spinAlgorithms........................... 16 2.2.1 Algorithms that use fetch-and-φ Primitives............ 17 2.2.2 Algorithms that use Only Reads and Writes . 23 2.3FastMutualExclusion........................... 26 2.3.1 Algorithm L: Lamport’s Fast Mutual Exclusion Algorithm . 27 2.3.2 Fast Mutual Exclusion with Local Spinning . 29 2.4AdaptiveAlgorithms............................ 30 vii 2.4.1 AdaptiveAlgorithmsusingFilters................. 32 2.4.2 AnAdaptiveBakeryAlgorithm.................. 36 2.4.3 Adaptive Mutual Exclusion with Infinitely Many Processes . 37 2.4.4 Local-spinAdaptiveAlgorithms.................. 38 2.5MutualExclusionwithNonatomicOperations.............. 38 2.6 Time- and Space-complexity Lower Bounds . 41 2.6.1 OverviewofProofMethodologies................. 41 2.6.2 Lower-boundResults........................ 42 3 Space-optimal Mutual Exclusion Under Read/Write Atomicity 47 3.1 Transformation of Algorithm YA-N .................. 48 3.2TimeComplexity.............................. 55 3.3ConcludingRemarks............................ 57 4 Adaptive Mutual Exclusion Under Read/Write Atomicity 59 4.1AdaptiveAlgorithm............................. 61 4.1.1 RelatedAlgorithmsandKeyIdeas................ 61 4.1.2 Algorithm A-U: Unbounded Algorithm . 65 4.1.3 Algorithm A-B: Bounded Algorithm . 73 4.1.4 Algorithm A-LS:Linear-spaceAlgorithm........... 75 4.2ConcludingRemarks............................ 83 5 Time-complexity Lower Bound for General Mutual Exclusion 85 5.1Definitions.................................. 87 5.1.1 AtomicShared-memorySystems.................. 87 5.1.2 Mutual-exclusionSystems..................... 93 5.1.3 Cache-coherentSystems...................... 95 5.2Lower-boundProofStrategy........................ 99 5.2.1 Process Groups and Regular Computations . 99 5.2.2 DetailedProofOverview...................... 103 5.3DetailedLower-boundProof........................ 110 5.4 Constant-time Algorithm for LFCU Systems . 137 viii 5.5ConcludingRemarks............................ 139 6 Time-complexity Lower Bound for Adaptive Mutual Exclusion 142 6.1ProofStrategy................................ 143 6.2DetailedLower-boundProof........................ 148 6.3ConcludingRemarks............................ 167 7 Algorithm and Time-complexity Lower Bound for Nonatomic Sys- tems 169 7.1NonatomicAlgorithm............................ 172 7.2NonatomicShared-memorySystems.................... 175 7.3 Lower Bound: Proof Sketch . 180 7.3.1 BriefOverview........................... 181 7.3.2 FormalDefinitions......................... 190 7.3.3 DetailedProofOverview...................... 198 7.4ConcludingRemarks............................ 208 8 Generic Algorithm for fetch-and-φ Primitives
Recommended publications
  • Synchronization Spinlocks - Semaphores
    CS 4410 Operating Systems Synchronization Spinlocks - Semaphores Summer 2013 Cornell University 1 Today ● How can I synchronize the execution of multiple threads of the same process? ● Example ● Race condition ● Critical-Section Problem ● Spinlocks ● Semaphors ● Usage 2 Problem Context ● Multiple threads of the same process have: ● Private registers and stack memory ● Shared access to the remainder of the process “state” ● Preemptive CPU Scheduling: ● The execution of a thread is interrupted unexpectedly. ● Multiple cores executing multiple threads of the same process. 3 Share Counting ● Mr Skroutz wants to count his $1-bills. ● Initially, he uses one thread that increases a variable bills_counter for every $1-bill. ● Then he thought to accelerate the counting by using two threads and keeping the variable bills_counter shared. 4 Share Counting bills_counter = 0 ● Thread A ● Thread B while (machine_A_has_bills) while (machine_B_has_bills) bills_counter++ bills_counter++ print bills_counter ● What it might go wrong? 5 Share Counting ● Thread A ● Thread B r1 = bills_counter r2 = bills_counter r1 = r1 +1 r2 = r2 +1 bills_counter = r1 bills_counter = r2 ● If bills_counter = 42, what are its possible values after the execution of one A/B loop ? 6 Shared counters ● One possible result: everything works! ● Another possible result: lost update! ● Called a “race condition”. 7 Race conditions ● Def: a timing dependent error involving shared state ● It depends on how threads are scheduled. ● Hard to detect 8 Critical-Section Problem bills_counter = 0 ● Thread A ● Thread B while (my_machine_has_bills) while (my_machine_has_bills) – enter critical section – enter critical section bills_counter++ bills_counter++ – exit critical section – exit critical section print bills_counter 9 Critical-Section Problem ● The solution should ● enter section satisfy: ● critical section ● Mutual exclusion ● exit section ● Progress ● remainder section ● Bounded waiting 10 General Solution ● LOCK ● A process must acquire a lock to enter a critical section.
    [Show full text]
  • The Deadlock Problem
    The deadlock problem More deadlocks mutex_t m1, m2; Same problem with condition variables void p1 (void *ignored) { • lock (m1); - Suppose resource 1 managed by c1, resource 2 by c2 lock (m2); - A has 1, waits on 2, B has 2, waits on 1 /* critical section */ c c unlock (m2); Or have combined mutex/condition variable unlock (m1); • } deadlock: - lock (a); lock (b); while (!ready) wait (b, c); void p2 (void *ignored) { unlock (b); unlock (a); lock (m2); - lock (a); lock (b); ready = true; signal (c); lock (m1); unlock (b); unlock (a); /* critical section */ unlock (m1); One lesson: Dangerous to hold locks when crossing unlock (m2); • } abstraction barriers! This program can cease to make progress – how? - I.e., lock (a) then call function that uses condition variable • Can you have deadlock w/o mutexes? • 1/15 2/15 Deadlocks w/o computers Deadlock conditions 1. Limited access (mutual exclusion): - Resource can only be shared with finite users. 2. No preemption: - once resource granted, cannot be taken away. Real issue is resources & how required • 3. Multiple independent requests (hold and wait): E.g., bridge only allows traffic in one direction - don’t ask all at once (wait for next resource while holding • - Each section of a bridge can be viewed as a resource. current one) - If a deadlock occurs, it can be resolved if one car backs up 4. Circularity in graph of requests (preempt resources and rollback). All of 1–4 necessary for deadlock to occur - Several cars may have to be backed up if a deadlock occurs. • Two approaches to dealing with deadlock: - Starvation is possible.
    [Show full text]
  • Fair K Mutual Exclusion Algorithm for Peer to Peer Systems ∗
    Fair K Mutual Exclusion Algorithm for Peer To Peer Systems ∗ Vijay Anand Reddy, Prateek Mittal, and Indranil Gupta University of Illinois, Urbana Champaign, USA {vkortho2, mittal2, indy}@uiuc.edu Abstract is when multiple clients are simultaneously downloading a large multimedia file, e.g., [3, 12]. Finally applications k-mutual exclusion is an important problem for resource- where a service has limited computational resources will intensive peer-to-peer applications ranging from aggrega- also benefit from a k mutual exclusion primitive, preventing tion to file downloads. In order to be practically useful, scenarios where all the nodes of the p2p system simultane- k-mutual exclusion algorithms not only need to be safe and ously overwhelm the server. live, but they also need to be fair across hosts. We pro- The k mutual exclusion problem involves a group of pro- pose a new solution to the k-mutual exclusion problem that cesses, each of which intermittently requires access to an provides a notion of time-based fairness. Specifically, our identical resource called the critical section (CS). A k mu- algorithm attempts to minimize the spread of access time tual exclusion algorithm must satisfy both safety and live- for the critical resource. While a client’s access time is ness. Safety means that at most k processes, 1 ≤ k ≤ n, the time between it requesting and accessing the resource, may be in the CS at any given time. Liveness means that the spread is defined as a system-wide metric that measures every request for critical section access is satisfied in finite some notion of the variance of access times across a homo- time.
    [Show full text]
  • Download Distributed Systems Free Ebook
    DISTRIBUTED SYSTEMS DOWNLOAD FREE BOOK Maarten van Steen, Andrew S Tanenbaum | 596 pages | 01 Feb 2017 | Createspace Independent Publishing Platform | 9781543057386 | English | United States Distributed Systems - The Complete Guide The hope is that together, the system can maximize resources and information while preventing failures, as if one system fails, it won't affect the availability of the service. Banker's algorithm Dijkstra's algorithm DJP algorithm Prim's algorithm Dijkstra-Scholten algorithm Dekker's algorithm generalization Smoothsort Shunting-yard algorithm Distributed Systems marking algorithm Concurrent algorithms Distributed Systems algorithms Deadlock prevention algorithms Mutual exclusion algorithms Self-stabilizing Distributed Systems. Learn to code for free. For the first time computers would be able to send messages to other systems with a local IP address. The messages passed between machines contain forms of data that the systems want to share like databases, objects, and Distributed Systems. Also known as distributed computing and distributed databases, a distributed system is a collection of independent components located on different machines that share messages with each other in order to achieve common goals. To prevent infinite loops, running the code requires some amount of Ether. As mentioned in many places, one of which this great articleyou cannot have consistency and availability without partition tolerance. Because it works in batches jobs a problem arises where if your job fails — Distributed Systems need to restart the whole thing. While in a voting system an attacker need only add nodes to the network which is Distributed Systems, as free access to the network is a design targetin a CPU power based scheme an attacker faces a physical limitation: getting access to more and more powerful hardware.
    [Show full text]
  • In the Beginning... Example Critical Sections / Mutual Exclusion Critical
    Temporal relations • Instructions executed by a single thread are totally CSE 451: Operating Systems ordered Spring 2013 – A < B < C < … • Absent synchronization, instructions executed by distinct threads must be considered unordered / Module 7 simultaneous Synchronization – Not X < X’, and not X’ < X Ed Lazowska [email protected] Allen Center 570 © 2013 Gribble, Lazowska, Levy, Zahorjan © 2013 Gribble, Lazowska, Levy, Zahorjan 2 Critical Sections / Mutual Exclusion Example: ExampleIn the beginning... • Sequences of instructions that may get incorrect main() Y-axis is “time.” results if executed simultaneously are called critical A sections Could be one CPU, could pthread_create() • (We also use the term race condition to refer to a be multiple CPUs (cores). situation in which the results depend on timing) B foo() A' • Mutual exclusion means “not simultaneous” – A < B or B < A • A < B < C – We don’t care which B' • A' < B' • Forcing mutual exclusion between two critical section C • A < A' executions is sufficient to ensure correct execution – • C == A' guarantees ordering • C == B' • One way to guarantee mutually exclusive execution is using locks © 2013 Gribble, Lazowska, Levy, Zahorjan 3 © 2013 Gribble, Lazowska, Levy, Zahorjan 4 CriticalCritical sections When do critical sections arise? is the “happens-before” relation • One common pattern: T1 T2 T1 T2 T1 T2 – read-modify-write of – a shared value (variable) – in code that can be executed concurrently (Note: There may be only one copy of the code (e.g., a procedure), but it
    [Show full text]
  • The Deadlock Problem
    The deadlock problem n A set of blocked processes each holding a resource and waiting to acquire a resource held by another process. n Example n locks A and B Deadlocks P0 P1 lock (A); lock (B) lock (B); lock (A) Arvind Krishnamurthy Spring 2004 n Example n System has 2 tape drives. n P1 and P2 each hold one tape drive and each needs another one. n Deadlock implies starvation (opposite not true) The dining philosophers problem Deadlock Conditions n Five philosophers around a table --- thinking or eating n Five plates of food + five forks (placed between each n Deadlock can arise if four conditions hold simultaneously: plate) n Each philosopher needs two forks to eat n Mutual exclusion: limited access to limited resources n Hold and wait void philosopher (int i) { while (TRUE) { n No preemption: a resource can be released only voluntarily think(); take_fork (i); n Circular wait take_fork ((i+1) % 5); eat(); put_fork (i); put_fork ((i+1) % 5); } } Resource-allocation graph Resource-Allocation & Waits-For Graphs A set of vertices V and a set of edges E. n V is partitioned into two types: n P = {P1, P2, …, Pn }, the set consisting of all the processes in the system. n R = {R1, R2, …, Rm }, the set consisting of all resource types in the system (CPU cycles, memory space, I/O devices) n Each resource type Ri has Wi instances. n request edge – directed edge P1 ® Rj Resource-Allocation Graph Corresponding wait-for graph n assignment edge – directed edge Rj ® Pi 1 Deadlocks with multiple resources Another example n P1 is waiting for P2 or P3, P3 is waiting for P1 or P4 P1 is waiting for P2, P2 is waiting for P3, P3 is waiting for P1 or P2 n Is there a deadlock situation here? Is this a deadlock scenario? Announcements Methods for handling deadlocks th n Midterm exam on Feb.
    [Show full text]
  • 05 Mutex Algorithms I
    G52CON: Concepts of Concurrency Lecture 5: Algorithms for Mutual Exclusion "I Brian Logan School of Computer Science [email protected] Outline of this lecture" •# mutual exclusion protocols •# criteria for a solution –# safety properties –# liveness properties •# simple spin lock •# spin lock using turns •# spin lock using the Test-and-Set special instruction © Brian Logan 2014 G52CON Lecture 5: Algorithms for Mutual 2 Exclusion I Archetypical mutual exclusion" Any program consisting of n processes for which mutual exclusion is required between critical sections belonging to just one class can be written: // Process 1 // Process 2 ... // Process n init1; init2; initn; while(true) { while(true) { while(true) { crit1; crit2; critn; rem1; rem2; remn; } } } where initi denotes any (non-critical) initialisation, criti denotes a critical section, remi denotes the (non-critical) remainder of the program, and i is 1, 2, … n. © Brian Logan 2014 G52CON Lecture 5: Algorithms for Mutual 3 Exclusion I Archetypical mutual exclusion" We assume that init, crit and rem may be of any size: •#crit must execute in a finite time—process does not terminate in crit •#init and rem may be infinite—process may terminate in init or rem •#crit and rem may vary from one pass through the while loop to the next With these assumptions it is possible to rewrite any process with critical sections into the archetypical form. © Brian Logan 2014 G52CON Lecture 5: Algorithms for Mutual 4 Exclusion I Ornamental Gardens problem" // West turnstile // East turnstile init1; init2; while(true) { while(true) { // wait for turnstile // wait for turnstile < count = count + 1; > < count = count + 1; > // other stuff ..
    [Show full text]
  • The Mutual Exclusion Problem Part II: Statement and Solutions
    56b The Mutual Exclusion Problem Part II: Statement and Solutions L. Lamport1 Digital Equipment Corporation 6 October 1980 Revised: 1 February 1983 1 May 1984 27 February 1985 June 26, 2000 To appear in Journal of the ACM 1Most of this work was performed while the author was at SRI International, where it was supported in part by the National Science Foundation under grant number MCS-7816783. Abstract The theory developed in Part I is used to state the mutual exclusion problem and several additional fairness and failure-tolerance requirements. Four “dis- tributed” N-process solutions are given, ranging from a solution requiring only one communication bit per process that permits individual starvation, to one requiring about N! communication bits per process that satisfies every reasonable fairness and failure-tolerance requirement that we can conceive of. Contents 1 Introduction 3 2TheProblem 4 2.1 Basic Requirements ........................ 4 2.2 Fairness Requirements ...................... 6 2.3 Premature Termination ..................... 8 2.4 Failure ............................... 10 3 The Solutions 14 3.1 The Mutual Exclusion Protocol ................. 15 3.2 The One-Bit Solution ...................... 17 3.3 A Digression ........................... 21 3.4 The Three-Bit Algorithm .................... 22 3.5 FCFS Solutions .......................... 26 4Conclusion 32 1 List of Figures 1 The One-Bit Algorithm: Process i ............... 17 2 The Three-Bit Algorithm: Process i .............. 24 3TheN-Bit FCFS Algorithm: Process i............. 28 4TheN!-Bit FCFS Algorithm: Process i. ............ 31 2 1 Introduction This is the second part of a two-part paper on the mutual exclusion problem. In Part I [9], we described a formal model of concurrent systems and used it to define a primitive interprocess communication mechanism (communica- tion variables) that assumes no underlying mutual exclusion.
    [Show full text]
  • 9 Mutual Exclusion
    System Architecture 9 Mutual Exclusion Critical Section & Critical Region Busy-Waiting versus Blocking Lock, Semaphore, Monitor November 24 2008 Winter Term 2008/09 Gerd Liefländer © 2008 Universität Karlsruhe (TH), System Architecture Group 1 Overview Agenda HW Precondition Mutual Exclusion Problem Critical Regions Critical Sections Requirements for valid solutions Implementation levels User-level approaches HW support Kernel support Semaphores Monitors © 2008 Universität Karlsruhe (TH), System Architecture Group 2 Overview Literature Bacon, J.: Operating Systems (9, 10, 11) Nehmer,J.: Grundlagen moderner BS (6, 7, 8) Silberschatz, A.: Operating System Concepts (4, 6, 7) Stallings, W.: Operating Systems (5, 6) Tanenbaum, A.: Modern Operating Systems (2) Research papers on various locks © 2008 Universität Karlsruhe (TH), System Architecture Group 3 Atomic Instructions To understand concurrency, we need to know what the underlying indivisible HW instructions are Atomic instructions run to completion or not at all It is indivisible: it cannot be stopped in the middle and its state cannot be modified by someone else in the middle Fundamental building block: without atomic instructions we have no way for threads to work together properly load, store of words are usually atomic However, some instructions are not atomic VAX and IBM 360 had an instruction to copy a whole array © 2008 Universität Karlsruhe (TH), System Architecture Group 4 Mutual Exclusion © 2008 Universität Karlsruhe (TH), System Architecture Group 5 Mutual Exclusion Problem Assume at least two concurrent activities 1. Access to a physical or to a logical resource or to shared data has to be done exclusively 2. A program section of an activity, that has to be executed indivisibly and exclusively is called critical section CS1 3.
    [Show full text]
  • Concurrency and Parallelism in Functional Programming Languages
    Concurrency and Parallelism in Functional Programming Languages John Reppy [email protected] University of Chicago April 21 & 24, 2017 Introduction Outline I Programming models I Concurrent I Concurrent ML I Multithreading via continuations (if there is time) CML 2 Introduction Outline I Programming models I Concurrent I Concurrent ML I Multithreading via continuations (if there is time) CML 2 Introduction Outline I Programming models I Concurrent I Concurrent ML I Multithreading via continuations (if there is time) CML 2 Introduction Outline I Programming models I Concurrent I Concurrent ML I Multithreading via continuations (if there is time) CML 2 Introduction Outline I Programming models I Concurrent I Concurrent ML I Multithreading via continuations (if there is time) CML 2 Programming models Different language-design axes I Parallel vs. concurrent vs. distributed. I Implicitly parallel vs. implicitly threaded vs. explicitly threaded. I Deterministic vs. non-deterministic. I Shared state vs. shared-nothing. CML 3 Programming models Different language-design axes I Parallel vs. concurrent vs. distributed. I Implicitly parallel vs. implicitly threaded vs. explicitly threaded. I Deterministic vs. non-deterministic. I Shared state vs. shared-nothing. CML 3 Programming models Different language-design axes I Parallel vs. concurrent vs. distributed. I Implicitly parallel vs. implicitly threaded vs. explicitly threaded. I Deterministic vs. non-deterministic. I Shared state vs. shared-nothing. CML 3 Programming models Different language-design axes I Parallel vs. concurrent vs. distributed. I Implicitly parallel vs. implicitly threaded vs. explicitly threaded. I Deterministic vs. non-deterministic. I Shared state vs. shared-nothing. CML 3 Programming models Different language-design axes I Parallel vs.
    [Show full text]
  • Why Mutual Exclusion? Why Mutual Exclusion?
    Recap: Consensus • On a synchronous system – There’s an algorithm that works. • On an asynchronous system CSE 486/586 Distributed Systems – It’s been shown (FLP) that it’s impossible to guarantee. Mutual Exclusion • Getting around the result – Masking faults – Using failure detectors – Still not perfect Steve Ko • Impossibility Result Computer Sciences and Engineering – Lemma 1: schedules are commutative University at Buffalo – Lemma 2: some initial configuration is bivalent – Lemma 3: from a bivalent configuration, there is always another bivalent configuration that is reachable. CSE 486/586, Spring 2013 CSE 486/586, Spring 2013 2 Why Mutual Exclusion? Why Mutual Exclusion? • Bank’s Servers in the Cloud: Think of two • Bank’s Servers in the Cloud: Think of two simultaneous deposits of $10,000 into your bank simultaneous deposits of $10,000 into your bank account, each from one ATM. account, each from one ATM. – Both ATMs read initial amount of $1000 concurrently from – Both ATMs read initial amount of $1000 concurrently from the bank’s cloud server the bank’s cloud server – Both ATMs add $10,000 to this amount (locally at the ATM) – Both ATMs add $10,000 to this amount (locally at the ATM) – Both write the final amount to the server – Both write the final amount to the server – What’s wrong? – What’s wrong? • The ATMs need mutually exclusive access to your account entry at the server (or, to executing the code that modifies the account entry) CSE 486/586, Spring 2013 3 CSE 486/586, Spring 2013 4 Mutual Exclusion Mutexes • Critical section problem • To synchronize access of multiple threads to – Piece of code (at all clients) for which we need to ensure common data structures there is at most one client executing it at any point of time.
    [Show full text]
  • Lock Implementation Goals • We Evaluate Lock Implementations Along Following Lines
    Fall 2017 :: CSE 306 Implementing Locks Nima Honarmand (Based on slides by Prof. Andrea Arpaci-Dusseau) Fall 2017 :: CSE 306 Lock Implementation Goals • We evaluate lock implementations along following lines • Correctness • Mutual exclusion: only one thread in critical section at a time • Progress (deadlock-free): if several simultaneous requests, must allow one to proceed • Bounded wait (starvation-free): must eventually allow each waiting thread to enter • Fairness: each thread waits for same amount of time • Also, threads acquire locks in the same order as requested • Performance: CPU time is used efficiently Fall 2017 :: CSE 306 Building Locks • Locks are variables in shared memory • Two main operations: acquire() and release() • Also called lock() and unlock() • To check if locked, read variable and check value • To acquire, write “locked” value to variable • Should only do this if already unlocked • If already locked, keep reading value until unlock observed • To release, write “unlocked” value to variable Fall 2017 :: CSE 306 First Implementation Attempt • Using normal load/store instructions Boolean lock = false; // shared variable Void acquire(Boolean *lock) { Final check of condition & write while (*lock) /* wait */ ; while to should happen atomically *lock = true; lock } Void release(Boolean *lock) { *lock = false; } • This does not work. Why? • Checking and writing of the lock value in acquire() need to happen atomically. Fall 2017 :: CSE 306 Solution: Use Atomic RMW Instructions • Atomic Instructions guarantee atomicity •
    [Show full text]