
Operating Systems Sebnem Baydere

Lecture 3: Dispatching, Context Switch, Thread Cooperation

Content: Each thread has the illusion of its own CPU, yet on a uniprocessor all threads (the sequential execution parts of processes) share the same physical CPU. How does this work? Two key concepts: 1. the thread/process control block 2. the dispatching loop

Why do we need to handle cooperating threads or processes? What are atomic operations?

3.1 Per-thread state

Thread control block (in Nachos, the Thread class; one per thread). Execution state: registers, program counter, pointer to stack, scheduling information, etc. (add more information as needed).

3.2 Dispatching Loop

LOOP:

Run thread

Save its state (into its thread control block)

Choose a new thread to run

Load its state (from its thread control block) and loop

3.2.1 Running a thread:

In order to run a thread, load its state (registers, PC, stack pointer) into the CPU, and do a jump.

How does dispatcher get control back? Two ways:

Internal events (the running thread goes to sleep and hopes something will wake it up):

1. Thread blocks on I/O (examples: waiting for disk I/O, or a text editor waiting for you to type at the keyboard)

2. Thread blocks waiting for some other thread to do something

3. Yield -- give up the CPU to someone else who is waiting

But what if a thread never does any I/O, never waits, and never yields control? The dispatcher has to gain control back somehow.

External events

1. Interrupts -- a typed character or a finished disk request wakes up the dispatcher, so it can choose another thread to run

2. Timer -- like an alarm clock

3.2.2 Choosing a thread to run

Dispatcher keeps a list of ready threads -- how does it choose among them?

Zero ready threads -- dispatcher just loops

One ready thread -- easy.

More than one ready thread: (CPU scheduling; we will see this later)

1. LIFO (last in, first out): put ready threads at the front of the list, and the dispatcher also takes threads from the front. Can result in starvation: a thread at the back may never run.

2. FIFO (first in, first out): put ready threads at the back of the list, pull them off from the front (this is what Nachos does).

3. Priority queue -- give some threads a better shot at the CPU: add a priority field to the thread control block and keep the ready list sorted by priority.

3.2.3 Thread states

Each thread can be in one of three states:

1. Running -- has the CPU

2. Blocked -- waiting for I/O or for synchronization with another thread

3. Ready to run -- on the ready list, waiting for the CPU

[Figure: Thread/Process state diagram with transitions among Running, Ready, and Blocked]

3.2.4 Context Switch (saving/restoring state)

What do you need to save/restore when the dispatcher switches to a new thread? Anything the next thread may trash: PC, registers, execution stack.

What if you make a mistake in implementing switch? For instance, suppose you forget to save and restore register 4. You get intermittent failures depending on exactly when the context switch occurred and whether the new thread was using r4. Potentially, the system will give a wrong result without any warning (if the program didn't notice that r4 got trashed).

Can you devise an exhaustive test to guarantee that switch works? No!

3.2.5 Interrupts

Interrupts are a special kind of hardware-invoked context switch: I/O finishes or the timer expires. Hardware causes the CPU to stop what it is doing and start running the interrupt handler.

The handler:

1. Saves the state of the interrupted thread

2. Runs

3. Restores the state of the interrupted thread (if time-slicing, restores the state of a new thread instead)

4. Returns to normal execution in the restored state

3.3 Thread creation

Thread "fork" -- create a new thread. Thread fork implementation:

-- Allocate a new thread control block and execution (call) stack

-- Initialize the thread control block and stack (with initial register values and the address of the first instruction to run)

-- Tell the dispatcher that it can run the thread (put the thread on the ready list)

Thread fork is not the same thing as UNIX "fork". (UNIX fork creates a new process, so it has to create a new address space, in addition to a new thread.)

Thread fork is very much like an asynchronous procedure call -- it means "go do this work", where the calling thread does not wait for the callee to complete.

What if the calling thread needs to wait?

Thread "Join" -- wait for a forked thread to finish.

*A traditional procedure call is logically equivalent to doing a fork and then immediately doing a join. This is a normal procedure call:

A() { B(); }
B() { }

The procedure A can also be implemented as:

A'() {
    Thread t = new Thread;
    t->Fork(B);
    t->Join();
}

3.4 Multiprocessing vs. Multiprogramming

[Figure: Multiprocessing -- threads A, B, and C each run on their own CPU at the same time. Multiprogramming -- one CPU interleaves the threads over time, e.g. A B C A B B.]

Dispatcher can choose to run each thread to completion, or time-slice in big chunks, or time slice so that each thread executes only one instruction at a time (simulating a multiprocessor, where each CPU operates in lockstep).

If the dispatcher can do any of the above, programs must work under all cases, for all interleavings.

So how can you know if your concurrent program works? Whether all interleavings will work?

3.5 Cooperating vs. Independent Threads/Processes

3.5.1 Definitions

Independent threads:

-- No state shared with other threads
-- Deterministic -- input state determines the result
-- Reproducible
-- Scheduling order doesn't matter

Cooperating threads:

-- Shared state
-- Non-deterministic
-- Non-reproducible

Non-reproducibility and non-determinism mean that bugs can be intermittent. This makes debugging really hard!

3.5.2 Why allow cooperating threads?

Computers model people's behavior, and people cooperate, so computers at some level have to cooperate too!

1. Sharing resources/information
2. Speedup
3. Modularity

3.5.3 Some simple concurrent programs

Most of the time, threads are working on separate data, so scheduling order doesn't matter:

Thread A        Thread B
x = 1           y = 2

What about the following, where initially y = 12?

Thread A        Thread B
x = y + 1       y = y * 2

What are the possible values of x after the above? What are the possible values of x below?

Thread A        Thread B
x = 1           x = 2

You can't say anything useful about a concurrent program without knowing what the underlying indivisible operations are!

3.5.4 Atomic operations

Atomic operation: an operation that always runs to completion, or not at all; it is indivisible and can't be stopped in the middle.

On most machines, memory references and assignments (loads and stores) of words are atomic. Many instructions are not atomic. For example, on most 32-bit architectures, a double-precision floating point store is not atomic; it involves two separate memory operations.

3.5.5 A Larger Concurrent Program Example

Two threads, A and B, compete with each other; one tries to increment a shared counter, the other tries to decrement it. For this example, assume that memory load and memory store are atomic, but that incrementing and decrementing are not atomic.

Thread A                 Thread B

i = 0;                   i = 0;
while (i < 10)           while (i > -10)
    i = i + 1;               i = i - 1;
print "A wins"           print "B wins"

Questions: 1. Who wins? Could be either.

2. Is it guaranteed that someone wins? Why not?

3. What if both threads have their own CPU, running in parallel at exactly the same speed? Is it guaranteed that the race goes on forever?

4. Could this happen on a uniprocessor?
