Outline of Lecture-09 Synchronization, Critical ❚ Problems with concurrent access to shared data Sections and Semaphores Ø and Ø General structure for enforce critical section (SGG Chapter 6) ❚ Software based solutions: Ø Simple solution Ø Peterson’s solution Instructor: Dr. Tongping Liu ❚ Hardware based solution Ø Disable Ø TestAndSet Ø Swap ❚ OS solution -- ❚ Using semaphores to solve real issues 1 Department of @ UTSA 2

Objectives Threads Memory Model

❚ To introduce the critical-section problem, whose ❚ Conceptual model: solutions can be used to ensure the consistency of Ø Multiple threads run in the same context of a shared data Ø Each has its own separate thread context ❚ To present both software and hardware solutions of ü Thread ID, stack, stack pointer, PC, and GP registers Ø All threads share the remaining process context the critical-section problem ü Code, data, heap, and shared library segments ❚ To introduce the concept of an atomic transaction ü Open files and installed handlers and describe mechanisms to ensure atomicity ❚ Operationally, this model is not strictly enforced: ❚ Shared Data Ø Register values are truly separate and protected, but… Ø at the same logical address space Ø Any thread can read and write the stack of any other Ø at different address space through messages (later in DS) thread

1 Mapping Variable Instances to Memory Different outputs Global var: 1 instance (ptr [data]) Variable Referenced by Referenced by Referenced by instance Local vars main thread?: 1 instance ( peer thread 0?i.m, msgs.m) peer thread 1? [0]: Hello from foo (svar=1) yes yes yes ptr Local var : 2 instances ( char **ptr; /* global */ cnt myid.t0no [peer thread 0’s stack],yes yes [1]: Hello from bar (svar=2) i.m myid.t1yes [peer thread 1’s stack] no no int main() msgs.m ) yes yes yes { myid.t0 no yes no [1]: Hello from bar (svar=1) int i; myid.t1 no no yes pthread_t tid; /* thread routine */ [0]: Hello from foo (svar=2) char *msgs[2] = { What are possiblevoid *thread(voidoutputs? *vargp) "Hello from foo", { "Hello from bar" int myid = (int)vargp; [1]: Hello from bar (svar=1) }; static int cnt = 0; ptr = msgs; [0]: Hello from foo (svar=1) printf("[%d]: %s (svar=%d)\n", for (i = 0; i < 2; i++) myid, ptr[myid], ++cnt); Pthread_create(&tid, } [0]: Hello from foo (svar=1) NULL, thread, [1]: Hello from bar (svar=1) (void *)i); Local stac var: 1 instance (cnt [data]) ……. Conclusion: different threads may execute in a random order } sharing.c! 6

Concurrent Access to Shared Data What is the problem?

❚ Two threads A and B access a shared variable ❚ Observe: In a time-shared system, the exact “Balance” instruction execution order cannot be predicted § Scenario 1: § Scenario 2: Thread A: Thread B: A1. LOAD R1, BALANCE B1. LOAD R1, BALANCE Balance = Balance + 100 Balance = Balance - 200 A2. ADD R1, 100 B2. SUB R1, 200 A3. STORE BALANCE, R1 ! Context Switch! A1. LOAD R1, BALANCE B1. LOAD R1, BALANCE A2. ADD R1, 100 B2. SUB R1, 200 A3. STORE BALANCE, R1 Context Switch! B1. LOAD R1, BALANCE B3. STORE BALANCE, R1 A1. LOAD R1, BALANCE B3. STORE BALANCE, R1 B2. SUB R1, 200 A2. ADD R1, 100 § Sequential correct execution § Mixed wrong execution A3. STORE BALANCE, R1 B3. STORE BALANCE, R1 § Balance is effectively § Balance is effectively decreased by 100! decreased by 200!!!

Department of Computer Science @ UTSA 7 What are possible results for this program? 8

2 a = 0; b = 0; // Initial state Common Properties of Two Examples

Thread 1 Thread 2 ❚ There are some shared variables Ø Account T1-1: if (b == 0) T2-1: if (a == 0) Ø a, b ❚ At least one operation is a write operation T1-2: a = 1; T2-2: b = 1; ❚ The results of the execution are non-deterministic, depending on the execution order

99.43% 0.56% 0.01%

a = 1 a = 0 a = 1 a = 1 b = 0 b = 1 b = 1 b = 1 Department of Computer Science @ UTSA 10

Race Conditions Definition of Synchronization

❚ Multiple processes/threads write/read shared data and the ❚ Cooperating processes/threads share data & have outcome depends on the particular order to access effects on each other à executions are NOT shared data are called race conditions reproducible with non-deterministic exec. speed Ø A serious problem for concurrent system using shared variables! ❚ Concurrent executions Ø Single processor à achieved by time slicing How do we solve the problem?! Ø Parallel/distributed systems à truly simultaneous ❚ Need to make sure that some high-level code sections are ❚ Synchronization à getting processes/threads to executed atomically work together in a coordinated manner Ø Atomic operation means that it completes in its entirety without worrying about interruption by any other potentially conflict- causing process

Department of Computer Science @ UTSA 11 12

3 Critical-Section (CS) Problem General Structure for Critical Sections

❚ Multiple processes/threads compete to access the do { shared data …… entry section critical section ❚ Critical section/region: a piece of code that exit section accesses a shared resource ( or device) must not be concurrently accessed by more remainder statements than one thread of execution. } while (1); ❚ In the entry section, the process requests ❚ Problem – ensure that only one process/thread is “permission”. allowed to execute in its critical section (for the same shared data) at any time. The execution of critical sections must be mutually exclusive in time.

Department of Computer Science @ UTSA 13 Department of Computer Science @ UTSA 14

Requirements for CS Solutions

❚ Mutual Exclusion Ø At most one can be in its CS at any time ❚ Progress Ø If all other tasks are in their remainder sections, a task is allowed to enter its CS Ø Only those tasks that are not in their remainder section can decide which task can enter its CS next, and this decision cannot be postponed infinitely. ❚ Bounded Waiting Ø Once a task has made a request to enter its CS à other tasks can only enter their CSs with a bounded number of times

15 Department of Computer Science @ UTSA 16

4 Solutions for CS Problem Outline of Lecture-09

❚ Software based solution ❚ Problems with concurrent access to shared data Ø Peterson’s solution Ø Race condition and critical section Ø General structure for enforce critical section ❚ Hardware based solutions ❚ Software based solutions: Ø Disable interrupts Ø Simple solution Ø Atomic instructions: TestAndSet and Swap Ø Peterson’s solution ❚ Hardware based solution ❚ OS solutions: Ø Disable interrupts Ø Semaphore Ø TestAndSet Ø Swap ❚ OS solution -- Semaphore ❚ Using semaphores to solve real issues Department of Computer Science @ UTSA 17 Department of Computer Science @ UTSA 18

A Simple Solution Peterson’s Solution

❚ Bool turn; à indicate whose turn to enter CS ❚ Three shared variables: turn and flag[2] ❚ T0 and T1: alternate between CS and remainder

Pros: : How does this solution satisfy the CS requirements (mutual exclusion, progress, bounded waiting) ? 1. Pi’s critical section is executed 1. Progress is not satisfied since it iff turn = i requires strict alternation 2. Pi is if Pj is in CS 2. A process cannot enter the CS (mutual exclusion) more often than the other 19 20

5 Peterson’s Solution Peterson’s Solution

❚ Guarantee mutual exclusion? ❚ Progress Guarantee? Ø Q: when process 0 is in its critical section, can Ø Q: when P0 is in its remainder section, can P1 get into process1 get into its critical section? its critical section? à flag[0] = 1 and either flag[1]=0 or turn = 0 à flag[0] = 0 à Thus, for process1: under flag[0]=1, if flag[1]=0, P1 is in à Process1: if flag[0]=0, then P1 can enter without its CS reminder section; if turn = 0, it waits before its CS without waiting

21 22

Peterson’s Solution Peterson’s Solution (cont.)

❚ Mutual exclusion ❚ Toggle first two lines for one process ❚ Progress ❚ Bounded waiting: alternation Any issues of Peterson’s solution? ❚ Busy waiting ❚ Each thread should have different code ❚ Can it support more than two threads? How does this solution break the CS requirements ❚ Error-prone (mutual exclusion, progress, bounded waiting) ?

23 24

6 Outline of Lecture-09 Hardware Solution 1: Disable

❚ Problems with concurrent access to shared data ❚ Uniprocessors – could disable interrupts Ø Race condition and critical section Ø Currently running code would execute without Ø General structure for enforce critical section ❚ Software based solutions: do { What are the problems with this solution? Ø Simple solution …… 1. Mutual exclusion is preserved but Ø Peterson’s solution DISABLE INTERRUPT efficiency of execution is degraded ❚ Hardware based solution critical section • while in CS, we cannot Ø Disable interrupts ENABLE INTERRUPT interleave execution with other

Ø TestAndSet processes that are in RS RemainderSection Ø Swap 2. On a multiprocessor, mutual } while (1); ❚ OS solution -- Semaphore exclusion is not preserved ❚ Using semaphores to solve real issues 3. Tracking time is impossible in CS Department of Computer Science @ UTSA 25 Department of Computer Science @ UTSA 26

Hardware Support for Synchronization Hardware Instruction: TestAndSet

❚ Synchronization ❚ The TestAndSet instruction tests and modifies the Ø Need to check and a value atomically content of a word atomically (non-interruptible) ❚ IA32 hardware provides a special instruction: xchg ❚ Keep setting the to 1 and return old value. Ø When the instruction accesses memory, the bus is bool TestAndSet(bool *target){ locked during its execution and no other process/thread boolean m = *target; do { can access memory until it completes!!! *target = true; … …

Ø Other variations: xchgb, xchgw, xchgl return m; while(TestAndSet(&lock)); ❚ Other hardware instructions } critical section Ø TestAndSet (a) What’s the problem? //free the lock lock = false; Ø Swap (a,b) 1. Busy-waiting, waste cpu remainder section 2. Hardware dependent, not } while(true);

27 bounded-waiting Department of Computer Science @ UTSA 28

7 Another Hardware Instruction: Swap Outline of Lecture-09

❚ Swap contents of two memory words ❚ Problems with concurrent access to shared data void Swap (bool *a, bool *b){ Ø Race condition and critical section bool temp = *a; What’s the problem? Ø General structure for enforce critical section *a = *b; *b = temp: 1. Busy-waiting, waste cpu ❚ Software based solutions: } 2. Hardware dependent, not Ø Simple solution Ø Peterson’s solution bool lock = FALSE; bounded-waiting While(true){ ❚ Hardware based solution bool key = TRUE; LOCK == FALSE Ø Disable interrupts while(key == TRUE) { Ø TestAndSet Swap(&key, &lock) ; } Ø Swap critical section; ❚ OS solution -- Semaphore lock = FALSE; //release permission } ❚ Using semaphores to solve real issues Department of Computer Science @ UTSA 29 Department of Computer Science @ UTSA 30

Semaphores Semaphores without Busy Waiting

❚ Synchronization without busy waiting ❚ The idea: Ø Motivation: Avoid busy waiting by blocking a process Ø Once need to wait, remove the process/thread from CPU execution until some condition is satisfied Ø The process/thread goes into a special queue waiting for a semaphore (like an I/O waiting queue) ❚ Semaphore S – integer variable Ø OS/runtime manages this queue (e.g., FIFO manner) and remove a process/thread when a occurs ❚ Two indivisible (atomic) operations: ❚ A semaphore consists of an integer (S.value) and a Ø wait(s) (also called P(s) or down(s) or acquire()); linked list of waiting processes/threads (S.list) Ø signal(s) (also called V(s) or up(s) or release()) Ø If the integer is 0 or negative, its magnitude is the number Ø User-visible operations on a semaphore of processes/threads waiting for this semaphore. Ø Easy to generalize, and less complicated for application ❚ Start with an empty list, and normally initialized to 0 programmers

Department of Computer Science @ UTSA 31 32

8 Implement Semaphores _ _down

Two operations by system calls typedef struct{ block – place the current process to int value; the waiting queue. struct process *list; wakeup – remove one of processes in WaitingQ.ins }semaphore; the waiting queue and place it in the ready queue. wait(semaphore * s){ signal(semaphore * s){ s->value--; s->value++; while(s->value < 0){ if (s->value <= 0){ enlist(s->list); delist(P, s->list); block(); wakeup(p); } } WaitingQ.del } }

34 Is this one without busy waiting? 33

Semaphore Usage Attacking CS Problem with Semaphores S = number of resources ❚ Counting semaphore – while(1){ ❚ Shared data …… Ø Can be used to control access Ø semaphore mutex = 1; /* initially mutex = 1 */ to a given resources with finite wait(S); use one of S resource ❚ For any process/thread number of instances signal(S); remainder section do { } wait(semaphore * s){ ❚ Binary semaphore – integer … … s->value--; mutex = 1 wait(mutex); if (s->value <0){ value can range only while(1){ critical section enlist(s->list); between 0 and 1; Also …… wait(mutex); signal(mutex); block(); known as mutex locks Critical Section } signal(mutex); remainder section remainder section } } } while(1);

Department of Computer Science @ UTSA 36

9 Revisit “Balance Update Problem” Semaphore for General Synchronization

§ Shared data: ❚ Execute code B in Pj after code A was executed in Pi int Balance; ❚ Use semaphore flag : what is the initial value? semaphore mutex = 1; // initially mutex = 1 ❚ Code Semaphore flag = 0;

Pi : Pj : . . § Process A: § Process B: . . ……. ……. . wait(flag) wait (mutex); wait (mutex); A B Balance = Balance – 100; Balance = Balance – 200; signal(flag) signal (mutex); signal (mutex); …… …… What about 2 threads wait for 1 thread? Or 1 thread waits for 2 threads? Department of Computer Science @ UTSA 37 Department of Computer Science @ UTSA 38

Semaphore for General Synchronization Classical Synchronization Problems

❚ A, B, C is for T1, T2, T3 ❚ Producer-Consumer Problem ❚ A à B àC Ø Shared bounded-buffer Ø Producer puts items to the buffer area, wait if buffer is full Semaphore f1 = 0; Semaphore f1 = 0; Ø Consumer consumes items from the buffer, wait if is empty Semaphore f2 = 0; Semaphore f2 = 0; T1 : T2 : T3 : T1 : T2 : T3 : wait(f2); ❚ Readers-Writers Problem A; wait(f1); wait(f2); A; wait(f1); wait(f2); Ø Multiple readers can access concurrently Signal(f1); B; C; Signal(f1); B; C; Ø Writers mutual exclusive with writes/readers Signal(f2); Signal(f2); Signal(f2);

❚ Dining-Philosophers Problem Ø Multiple resources, get one at each time

39 Department of Computer Science @ UTSA 40

10 Producer-Consumer Problem Producer/Consumer Loops

❚ Producer Loop With Bounded-Buffer ......

producer consumer ❚ Need to make sure that Ø The producer and the consumer do not access the ❚ Consumer Loop For the shared variable counter, buffer area and related variables at the same time what may go wrong? Ø No item is available to the consumer if all the buffer slots are empty. Ø No slot in the buffer is available to the producer if all the buffer slots are full

Department of Computer Science @ UTSA 41 42

Potential Solution I Potential Solution II

wait(sem); wait(sem);

signal(sem); signal(sem);

wait(sem); wait(sem);

signal(sem); signal(sem);

Problem: if counter = n now, then the producer is holding the semaphore Problem: if there are multiple threads, and a producer produces one item, then it is possible that they will compete for the item Department of Computer Science @ UTSA 43 Department of Computer Science @ UTSA 44

11 What Semaphores are needed? Producer/Consumer Loops

❚ semaphore mutex, full, empty; ❚ Producer Loop

What are the initial values? Initially:

prodItems = 0 /* The number of produced items */ ❚ Consumer Loop For the shared variable counter, What may go wrong? availSlots=n /* The number of empty slots */

mutex = 1 /* controlling mutual access to the buffer pool */

Department of Computer Science @ UTSA 45 46

Producer-Consumer Codes Dining-Philosophers Problem

Five philosophers share a common Producer Consumer 0 What will 4 0 circular table. There are five do { … do { … happen if chopsticks and a bowl of rice (in the produce an item in nextp wait(prodItems); 4 middle). When a philosopher gets we change 1 hungry, he tries to pick up the … wait(mutex); closest chopsticks. … the order? wait(availSlots); 3 remove an item from buffer to nextc 1 A philosopher may pick up only one wait(mutex); chopstick at a time, and cannot pick … … 3 2 up a chopstick already in use. When add nextp to buffer signal(mutex); 2 done, he puts down both of his … signal(availItems); chopsticks, one after the other. signal(mutex); … ❚ Shared data signal(prodItems); consume the item in nextc semaphore chopstick[5]; } while (1) … Initially all semaphore values are 1 } while (1) A classic example of synchronization problem: allocate several resources Department of Computer Science @ UTSA 47 among several processes Departmentin a -free of Computer Science @ UTSA and starvation-free manner48

12 Problems with Using Semaphores Problems with Using Semaphores (cont.)

❚ Let S and Q be two semaphores initialized to 1 ❚ Strict requirements on the sequence of operations Ø Correct: wait (mutex) …. signal (mutex) T0 T1 wait (S); wait (Q); Ø Incorrect: signal (mutex) …. wait (mutex); wait (mutex) …. wait(mutex); wait (Q); wait (S); ... … signal (S); signal (Q); ❚ Complicated usages signal (Q); signal (S); ❚ Incorrect usage could lead to What is the problem with the above code? Ø deadlock

Department of Computer Science @ UTSA 49 Department of Computer Science @ UTSA 50

Summary

❚ Synchronization for coordinated processes/threads ❚ The produce/consumer and bounded buffer problem ❚ Critical Sections and solutions’ requirements ❚ Peterson’s solution for two processes/threads ❚ Synchronization Hardware: atomic instructions ❚ Semaphores Ø Its two operations wait and signal and standard usages ❚ Bounded buffer problem with semaphores Ø Initial values and order of operations are crucial

51

13